File Management¶
Most of the work happens in the command line (CLI), so it pays to be fluent there; otherwise everyday file handling becomes slow and error-prone.
CloudPathLib - a nice package for working with cloud data through a pathlib-style interface.
Code Management¶
Git
GitHub / GitLab
Data Management¶
Code is not static and neither is data.
AI Assistance¶
Docstrings, testing, Stack Overflow searches...
GitHub Copilot - Guides
Deployment¶
Streamlit
FastAPI
Environment Management¶
conda
pip
poetry
Daytona + DevContainer
Configuration Management¶
Hydra
Hydra-Zen
Typer
Code Quality¶
Ruff
Pyright
mypy
isort
pre-commit (.pre-commit-config.yaml)
Continuous Integration¶
GitHub Actions
DVC
Performance Checks¶
Scalene
Logging¶
I lean on logging whenever I am building software, and a few key points are worth documenting. It is usually much better than print statements. I log most when I am not sure that what I just did is correct. Logging matters especially for server computing, where you need a history of what happened.
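To make the contrast with print statements concrete, here is a minimal sketch using only Python's standard `logging` module. The helper `build_logger`, the toy function `normalize`, and the logger name `demo.normalize` are hypothetical names chosen for illustration; the in-memory buffer stands in for a log file on a remote machine.

```python
import io
import logging


def build_logger(name, stream=None, level=logging.INFO):
    """Return a logger that writes timestamped records to `stream` (stderr if None)."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate lines via the root logger
    if not logger.handlers:  # do not stack handlers on repeated calls
        handler = logging.StreamHandler(stream)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


def normalize(values, logger):
    """Scale values so they sum to 1, logging what happens along the way."""
    total = sum(values)
    logger.debug("sum before normalizing: %s", total)  # hidden at INFO level
    if total == 0:
        logger.warning("all-zero input, returning values unchanged")
        return list(values)
    logger.info("normalizing %d values", len(values))
    return [v / total for v in values]


buffer = io.StringIO()  # stand-in for a log file (assumption for this sketch)
log = build_logger("demo.normalize", stream=buffer)
result = normalize([1, 1, 2], log)
normalize([0, 0], log)
history = buffer.getvalue()  # the record of what was going on
```

Unlike print statements, the records carry a timestamp and a severity level, and the debug line can be switched on later by passing `level=logging.DEBUG` without touching `normalize`.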
Testing¶
Something we all should do but don’t always do. Tests pay off in the long run, even though they feel annoying in the short term. Overall, you cannot go wrong with tests.
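As a small illustration of how cheap a first test can be, the sketch below pairs a toy function with three plain-assert tests. The function `moving_average` and the test names are hypothetical, made up for this example; functions named `test_*` follow the convention that a runner such as pytest would collect, but here they are plain functions that can also be called directly.

```python
def moving_average(values, window):
    """Return the averages of each consecutive run of `window` values."""
    if window <= 0:
        raise ValueError("window must be positive")
    if window > len(values):
        return []
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def test_moving_average_basic():
    # Typical case: three overlapping windows of size two.
    assert moving_average([1, 2, 3, 4], 2) == [1.5, 2.5, 3.5]


def test_window_longer_than_input():
    # Edge case: no full window fits, so the result is empty.
    assert moving_average([1, 2], 5) == []


def test_rejects_nonpositive_window():
    # Error case: an invalid window should fail loudly.
    try:
        moving_average([1, 2, 3], 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

One typical case, one edge case, and one error case already cover most of the mistakes that tend to creep in when the function is changed later.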
Repository Organization¶
Packaging¶
Workflow¶
JupyterLab - Prototyping, Remote Computing
VSCode - Package Management, Remote Computing
Remote Computing - SSH (JLab, VSCode)