Best Practices for Python
Using venv to manage dependencies:
Dependency management on the HPC can be very difficult. The venv module
allows you to set up a virtual environment that isolates the dependencies for
each individual project by installing them into a project-specific folder. This
is an integral part of writing reproducible code and is useful for two
reasons:
- Virtual environments allow you to clearly communicate your code's required dependencies to your collaborators
- Virtual environments ensure that updating dependencies for one project will not break dependencies for another project (this is an issue when two projects rely on different versions of the same package)
To create a new virtual environment for your Python project, you can run the following command:

    python3 -m venv ./venv/

which creates a new virtual environment at the specified location. In the
example repositories, the virtual environment is located in ./venv/.
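
The same step can also be scripted from Python, which may be convenient in a project setup script. The following is a minimal sketch, assuming the standard-library venv module is available; the function name create_project_venv is hypothetical and not part of the example repositories.

```python
import venv
from pathlib import Path


def create_project_venv(env_dir="./venv/", with_pip=True):
    """Create a virtual environment at env_dir and return its path.

    create_project_venv is a hypothetical helper name used for illustration.
    """
    env_path = Path(env_dir)
    # venv.create is the standard-library API that `python -m venv` builds on.
    # Its defaults differ slightly from the command line: pip is only
    # installed into the environment when with_pip=True is passed.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

Calling create_project_venv("./venv/") from the project root should leave a ./venv/ folder containing a pyvenv.cfg file, the same layout the command-line form produces.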
To have the virtual environment manage your dependencies, you have to
activate it, which is done with the following command:

    source ./venv/bin/activate

While the virtual environment is activated, you can use pip as you normally
would, but any packages you install will be isolated to this project. When you
are done using the virtual environment, you can either press Ctrl+d to exit the
shell (if working interactively) or run the deactivate command, which will
return your shell to normal. To get a snapshot of the packages used in the venv
(say, to generate a requirements.txt file), you can run pip freeze and redirect
its output to that file.
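
It is easy to forget whether an environment is currently active, in which case pip would install into the wrong place. The sketch below shows one way to check this from inside Python and to save the pip freeze snapshot to a file. The function names in_virtualenv and write_requirements are hypothetical, and the second function assumes pip is installed for the running interpreter.

```python
import subprocess
import sys
from pathlib import Path


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def write_requirements(path="requirements.txt"):
    """Save the output of `pip freeze` for this interpreter to path.

    Assumes pip is installed for sys.executable; raises CalledProcessError
    if the pip command fails.
    """
    result = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        capture_output=True,
        text=True,
        check=True,
    )
    Path(path).write_text(result.stdout)
    return result.stdout.splitlines()
```

A script could call in_virtualenv() first and refuse to continue when it returns False, then call write_requirements() to record the pinned package versions.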
You can find more information about the venv module and virtual environments on the Python language website.