File Management¶
Most of our time is spent in the CLI, so we need to be fluent with it; otherwise we will run into problems.
- pathlib
- fsspec
- upath
- CloudPathLib - nice package for dealing with cloud data.
- toolz
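As a small illustration of the kind of file handling pathlib covers, here is a minimal sketch using only the standard library. The helper name `list_files`, the directory layout, and the file names are hypothetical and exist only for this example.

```python
import tempfile
from pathlib import Path


def list_files(root: Path, pattern: str = "*.csv") -> list[Path]:
    """Return all files under `root` matching `pattern`, sorted by path."""
    # rglob searches recursively; is_file() filters out matching directories.
    return sorted(p for p in root.rglob(pattern) if p.is_file())


# Build a throwaway directory tree so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "raw").mkdir()
    (root / "raw" / "a.csv").write_text("x,y\n1,2\n")
    (root / "raw" / "notes.txt").write_text("not data\n")
    (root / "b.csv").write_text("x,y\n3,4\n")

    # Report paths relative to the root, with forward slashes on every OS.
    found = [p.relative_to(root).as_posix() for p in list_files(root)]
    print(found)
```

The same `/` operator and `rglob` pattern are what fsspec-backed path libraries such as upath and CloudPathLib aim to mirror for remote storage, although their exact behavior is not shown here.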
Code Management¶
- Git
- GitHub / GitLab
Data Management¶
Code is not static and neither is data.
AI Assistance¶
Useful for docstrings, testing, and Stack Overflow-style searches.
- GitHub Copilot - Guides
Deployment¶
- Streamlit
- FastAPI
Environment Management¶
- conda
- pip
- poetry
- Daytona + DevContainer
Configuration Management¶
- Hydra
- Hydra-Zen
- Typer
Code Quality¶
- Ruff
- Pyright
- mypy
- isort
- pre-commit (configured via .pre-commit-config.yaml)
Continuous Integration¶
- GitHub Actions
- DVC
Performance Checks¶
- Scalene
Logging¶
I use logging whenever I am building software and there are key steps that need to be documented. It is usually much better than print statements. I often add log messages when I am not sure that what I just did is correct. Logging is especially important for server computing, where you need a history of what was going on.
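A minimal sketch of replacing print statements with the standard-library `logging` module. The helper `get_logger`, the function `normalize`, and the logger name `demo.normalize` are hypothetical names chosen for illustration.

```python
import logging


def get_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes timestamped messages to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Attach a handler only once, so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s | %(levelname)s | %(name)s | %(message)s")
        )
        logger.addHandler(handler)
    return logger


def normalize(values: list[float]) -> list[float]:
    """Scale values so they sum to one, logging what happened along the way."""
    logger = get_logger("demo.normalize")
    total = sum(values)
    logger.info("normalizing %d values (sum=%.3f)", len(values), total)
    if total == 0:
        # A record of the odd case is exactly what a print statement would lose.
        logger.warning("sum is zero; returning input unchanged")
        return values
    return [v / total for v in values]


result = normalize([1.0, 3.0])
```

For long-running jobs on a remote machine, swapping `logging.StreamHandler()` for `logging.FileHandler("run.log")` (the file name is an assumption) would keep the history on disk.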
Testing¶
Something we all should do but don't always do. Tests feel annoying in the short term, but they pay off in the long run. Overall, you cannot go wrong with tests.
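To make this concrete, here is a minimal sketch of a small function with tests written as plain `assert` statements. The function `moving_average` is a hypothetical example. The `test_` naming follows the convention that runners such as pytest discover, although only the standard library is used here.

```python
def moving_average(values: list[float], window: int) -> list[float]:
    """Return the averages of each consecutive `window`-sized slice."""
    if window <= 0:
        raise ValueError("window must be positive")
    if window > len(values):
        return []
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def test_basic_average():
    assert moving_average([1, 2, 3, 4], 2) == [1.5, 2.5, 3.5]


def test_window_larger_than_input():
    assert moving_average([1, 2], 3) == []


def test_invalid_window_raises():
    try:
        moving_average([1], 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for a non-positive window")


# Run the tests directly so the file works without any test runner.
for test in (test_basic_average, test_window_larger_than_input, test_invalid_window_raises):
    test()
```

Each test covers one behavior (the normal case, an edge case, and an error case), which keeps a failure easy to trace back to its cause.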
Repository Organization¶
Packaging¶
Workflow¶
- JupyterLab - Prototyping, Remote Computing
- VSCode - Package Management, Remote Computing
- Remote Computing - SSH (JLab, VSCode)