Conda 4 Remote Servers#

It’s advisable to create personal conda environments in your /path/to/workdir directory. Preconfigured environments are great for quick prototyping, but you almost always end up needing finer control over your environment as a project develops. In addition, when too many people share one environment, it grows into a massive environment full of unnecessary packages. Having control over your own environment leads to more reproducible setups, especially if you keep track of what you install.

Look at the table of contents to see what could be of interest. If it’s your first time, I would suggest going through all of it first before you start tinkering.

  • Install Miniconda

    • Testing

    • Install Mamba

  • Creating from environment.yml

    • Updating Env with environment.yml

  • Example Environments

  • Multiple Conda directories

  • Server Installations

    • Gricad

    • JeanZay


Install Miniconda#

Get the appropriate link via this webpage.

wget link_to_miniconda_installer

This will download a bash script. Then you need to run this bash script to install Miniconda.

# change permissions to make it executable
chmod +x path/to/file.sh

# run as bash script
bash path/to/file.sh
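Optionally, before running the installer, you can check that the download is intact. The download page normally lists a SHA-256 hash next to each installer. The helper below is only a sketch: `verify_installer` is a hypothetical name, and the hash you pass in is whatever the download page shows for your file.

```shell
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
verify_installer() {
  local file="$1" expected="$2" actual
  # sha256sum prints "<digest>  <filename>"; keep only the digest
  actual="$(sha256sum "$file" | awk '{ print $1 }')"
  [ "$actual" = "$expected" ]
}

# Usage (both arguments are placeholders):
# verify_installer path/to/file.sh "<sha256 from the download page>" && bash path/to/file.sh
```

The function returns a non-zero status on a mismatch (or a missing file), so chaining it with `&&` keeps a corrupted download from being executed.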

Warning

Make sure you install it in the correct directory. Typically we all have a homedir, a workdir and a scratchdir. The first choice should always be the homedir. However, the homedir on many servers is extremely small, which is a problem because some conda environments can be quite heavy. So the second choice should be the workdir. You need to change this during the installation; there will be a prompt that allows you to choose the directory in which to install conda. The last choice would be the scratchdir, but it is typically erased more frequently, so it makes sense to avoid it if possible.

# some prompt should appear in the installation
/path/to/workdir

Now you should have the conda directory. Continue to do the installation by following the steps given by the prompt and then you should be done.
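If you are unsure whether a directory is big enough, a rough check of the free space can help before installing. This is a minimal sketch: `has_free_space` is a hypothetical helper, the threshold is arbitrary, and `df` reports free space on the filesystem, which may differ from your personal quota on a shared server.

```shell
# Hypothetical helper: succeed if DIR has at least MIN_KB kilobytes available.
has_free_space() {
  local dir="$1" min_kb="$2" avail_kb
  # df -Pk prints one POSIX-format line per filesystem; column 4 is the available space in KB
  avail_kb="$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')"
  [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$min_kb" ]
}

# Usage (the path and the ~5 GB threshold are placeholders):
# if has_free_space /path/to/workdir 5000000; then
#   bash path/to/file.sh -b -p /path/to/workdir/miniconda3
# fi
```

The `-b` (batch mode) and `-p` (prefix) flags in the usage comment are the same ones used in the install script at the end of this page; they skip the interactive prompts.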

Note

Make sure you initialize your miniconda installation so that it adds the required initialization block to your .profile or .bashrc.

conda init

Note

Be sure to restart the terminal (e.g. log out and log back in, or re-source the .profile and/or .bashrc). Your prompt should then show an indicator like so:

(base) user@server:prompt$

The (base) prefix shows that the base environment of your personal conda installation is active. You can also check where it is located:

conda env list

Testing#

If done correctly, you should be able to see the following:

# conda environments:
#
base     *  /path/to/workdir/miniconda3

And now you should be able to create new environments and install new packages.

# Create environment from scratch
conda create --name myenv python=3.9
# create an env from a yaml file
conda env create -f file.yml -n myenv
# activate the created environment
conda activate myenv
# install packages in the environment
conda install numpy scipy matplotlib pandas xarray

And if you check using conda env list, you should see a new environment listed:

# conda environments:
#
base        /path/to/workdir/miniconda3
myenv    *  /path/to/workdir/.conda/envs/myenv
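In scripts it can be handy to extract the active environment's location from this listing. The snippet below is a small sketch that relies on the `*` marker shown above; `active_env_path` is a hypothetical helper name, and it assumes paths without spaces.

```shell
# Hypothetical helper: read `conda env list` output on stdin and print the
# path of the row marked with "*" (the active environment).
active_env_path() {
  # skip comment lines, find the "*" field, print the field that follows it
  awk '!/^#/ { for (i = 1; i < NF; i++) if ($i == "*") print $(i + 1) }'
}

# Usage:
# conda env list | active_env_path
```

Conda also exports the `CONDA_PREFIX` variable when an environment is activated, which carries the same information without any parsing.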

Install Mamba#

Conda (currently) is quite slow to install things. Sometimes it hangs for quite a long time for no apparent reason. So I recommend using mamba. You need to install mamba in the base environment. You can find more instructions here.

conda install mamba --name base --channel conda-forge

Now you should be able to create, install and remove packages via the mamba command.

# Create environment from scratch
- conda create --name myenv python=3.9
+ mamba create --name myenv python=3.9
# use mamba to create an env from a yaml file
- conda env create -f file.yml -n myenv
+ mamba env create -f file.yml -n myenv
# activate the created environment
conda activate myenv
# install packages in the environment
- conda install numpy scipy matplotlib pandas xarray
+ mamba install numpy scipy matplotlib pandas xarray
# deactivate environment
conda deactivate
# remove environment
- conda remove --name myenv --all
+ mamba remove --name myenv --all

Note: you should only change the commands that create or remove environments and install packages within environments. All other commands (e.g. activate and deactivate) should still use conda.
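If the same script has to run on machines with and without mamba, one option is to pick the command at runtime. This is only a sketch; `pkg_manager` is a hypothetical helper name.

```shell
# Hypothetical helper: print "mamba" if it is on the PATH, otherwise "conda".
pkg_manager() {
  if command -v mamba >/dev/null 2>&1; then
    echo mamba
  else
    echo conda
  fi
}

# Usage:
# "$(pkg_manager)" create --name myenv python=3.9
# "$(pkg_manager)" install numpy scipy matplotlib pandas xarray
```

Activation and deactivation are left out on purpose, since those stay with conda as described in the note above.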


Creating from environment.yml#

Often we have a preconfigured environment in the form of an environment.yml file. This allows us to reproduce the conda environment.

mamba env create -f environment.yml --prefix=/path/to/workdir/.conda/envs/env_name
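# activate by path (the name alone only works if the directory is listed in envs_dirs)
conda activate /path/to/workdir/.conda/envs/env_name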

Typically we have the environment.yml


Updating Env with environment.yml#

We can also update an existing environment with the packages listed in the same (or a similar) environment.yml file. This is useful when the environment.yml file has been updated (externally) and we cannot remember which packages we have already installed.

mamba env update -f environment.yml --prefix=/path/to/workdir/.conda/envs/env_name

This will install any extra packages that are listed in the environment.yml. However, it does not remove any packages that are already in env_name. If you wish to update env_name with the packages listed in the environment.yml and also remove any excess packages, use the --prune flag.

mamba env update -f environment.yml --prefix=/path/to/workdir/.conda/envs/env_name --prune
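The choice between create and update can also be made automatically. The sketch below assumes that an existing conda environment can be recognised by its conda-meta subdirectory; `env_action` is a hypothetical helper name.

```shell
# Hypothetical helper: print "update" if PREFIX already holds a conda
# environment (it contains a conda-meta directory), otherwise "create".
env_action() {
  local prefix="$1"
  if [ -d "$prefix/conda-meta" ]; then
    echo update
  else
    echo create
  fi
}

# Usage:
# prefix=/path/to/workdir/.conda/envs/env_name
# mamba env "$(env_action "$prefix")" -f environment.yml --prefix "$prefix"
```

Add `--prune` yourself in the update case if you want excess packages removed.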

Example Environments#

As I mentioned above, it is useful (and advisable) to install your conda environments using .yaml files. This ensures that they are reproducible and it’s also easier to install.

As mentioned before, to install the first time, you can use the following command:

conda env create --file environment.yaml

If you already have an environment but you would like to update the environment, use this command:

conda env update --file environment.yaml

Tip 1: The --prune flag ensures that you remove any packages that aren’t within the .yaml file.

mamba env update --file environment.yaml --prune

Tip 2: The --prefix flag allows you to apply the .yaml file to an environment at a specific location, even if its name differs from the one given in the .yaml file.

mamba env update --file environment.yaml --prefix "path/to/env"

Below I have included a yaml file with general Earth science packages for use with conda.

name: earthsci_py39
channels:
- defaults
- conda-forge
dependencies:
- python=3.9
# Standard Libraries
- numpy             # Numerical Linear Algebra
- scipy             # Scientific Computing
- xarray            # Data structures
- pandas            # Data structure
- scikit-learn      # Machine Learning
- scikit-image      # Image Processing
- statsmodels       # Statistical Learning
- pymc3             # Probabilistic programming library
# Plotting
- matplotlib
- seaborn
- bokeh
- plotly::plotly>=4.6.0
- pyviz::geoviews
- conda-forge::cartopy
- datashader
- conda-forge::cmocean
- pyviz::hvplot
- conda-forge::xmovie
# Geospatial packages
- geopandas
- conda-forge::regionmask
- conda-forge::xesmf
- conda-forge::xcube
- conda-forge::rioxarray
- conda-forge::shapely
- conda-forge::pooch
- conda-forge::cftime
- conda-forge::pyinterp
# Scale
- numba
- dask              # Out-of-Core processing
- dask-ml           # Out-of-Core machine learning
# Storage
- hdf5              # standard large storage h5
- conda-forge::zarr
# GUI
- conda-forge::papermill
- conda-forge::nb_conda_kernels     # Access to other conda kernels
- conda-forge::nodejs               # for extensions in jupyterlab
- ipykernel                         # IMPORTANT: allows other environments to see this environment
- conda-forge::tqdm                 # progress bar
- pip
- pip:
  # Jupyter
  - ipywidgets
  # Formatters
  - black
  - pylint
  - isort
  - flake8
  - mypy
  - pytest
  # Notebook stuff
  - pyprojroot
  # Extra
  - "git+https://github.com/swartn/cmipdata.git"
  - emukit
  - netCDF4
  - shapely
  - affine
  - joblib  # Embarrassingly parallel
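Long environment files tend to collect repeated entries over time. As a rough sanity check, the sketch below lists package names that appear more than once. `duplicate_deps` is a hypothetical helper; it does simple text matching rather than real YAML parsing, and it does not distinguish the conda section from the pip section, so a package listed in both is reported as well.

```shell
# Hypothetical helper: print package names that occur more than once in FILE.
duplicate_deps() {
  # keep list entries, then strip comments, the leading "- ",
  # any "channel::" prefix, version specifiers and trailing spaces
  grep -E '^[[:space:]]*-[[:space:]]' "$1" \
    | sed -E 's/#.*$//; s/^[[:space:]]*-[[:space:]]*//; s/^[^:]+:://; s/[<>=!].*$//; s/[[:space:]]+$//' \
    | sort | uniq -d
}

# Usage:
# duplicate_deps environment.yaml
```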

Multiple Conda directories#

Sometimes environments are available in several directories. We have our primary miniconda3 installation, but we may also want access to external environments created by other people. We simply need to edit the .condarc file to include all of the directories that contain relevant environments.

envs_dirs:
    - /path/to/workdir/.conda/envs
    - /path/to/otherdir/.conda/envs

You can add as many directories as you want. This just ensures that conda can find them. However, the more directories you have, the longer it takes for conda/mamba to search through all of them.
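To avoid listing directories that do not exist on a given server, the block can be generated from a list of candidates. This is a sketch only: `make_envs_dirs` is a hypothetical helper that prints the fragment so you can review it before pasting it into your .condarc.

```shell
# Hypothetical helper: print an envs_dirs block containing only the
# candidate directories that actually exist.
make_envs_dirs() {
  local dir
  echo "envs_dirs:"
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      echo "    - $dir"
    else
      # report skipped entries on stderr so stdout stays valid YAML
      echo "skipping missing directory: $dir" >&2
    fi
  done
}

# Usage:
# make_envs_dirs /path/to/workdir/.conda/envs /path/to/otherdir/.conda/envs
```

If you prefer to let conda edit the file itself, `conda config --append envs_dirs /path/to/otherdir/.conda/envs` should achieve the same for one directory at a time.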


Server Installations#

Often, miniconda/conda is already installed on the server, and you just need to activate it using the command provided by the server. However, the steps above give us our own miniconda installation, which offers fine-grained control. We can still access all of the environments the server provides by simply adding their directories to the .condarc as shown above. Below are a few servers that I personally have access to, along with the relevant directories.


Gricad#

For the gricad server, there are a few preconfigured environments available. Most of them are for GPU computation so it will be useful for the bigfoot cluster. All of the common conda environments are located in the following directory.

/applis/common/miniconda3/envs

So add this to the .condarc file as shown above. We now have access to all of the environments they have already configured! We can use them, but we don’t have to if we don’t want to. :)


JeanZay#

On the jean-zay server, there are quite a lot of preconfigured environments, mainly for GPU computation. They are located in the following directory:

/gpfslocalsup/pub/anaconda-py3/2021.05/envs

So add this to the .condarc file as well.

Again, we now have access to all of the environments they have already configured! We can use them, but we don’t have to if we don’t want to. :D


#!/bin/bash
MINICONDA_URL="https://repo.anaconda.com/miniconda/Miniconda3-py39_4.9.2-Linux-x86_64.sh"
MINICONDA_PREFIX="$SCRATCH/miniconda"

install_miniconda(){
  if [ ! -d "$MINICONDA_PREFIX" ]; then
    echo "Installing Miniconda"
    # make sure the download directory exists before wget writes to it
    mkdir -p "$WORK/downloads"
    wget "$MINICONDA_URL" -O "$WORK/downloads/miniconda.sh"
    # -b: batch (non-interactive) mode, -p: installation prefix
    bash "$WORK/downloads/miniconda.sh" -b -p "$MINICONDA_PREFIX"
    # load conda into the current shell before calling it
    eval "$("$MINICONDA_PREFIX/condabin/conda" shell.bash hook)"
    conda init
    install_mamba
  else
    echo "Miniconda already installed"
  fi
}

install_mamba(){
  eval "$("$MINICONDA_PREFIX/condabin/conda" shell.bash hook)"
  conda install -y mamba -c conda-forge
}

clone_dotfiles(){
  # ${WORK:?} aborts if WORK is unset, so rm -rf never runs on /projects/dot_files
  rm -rf "${WORK:?}/projects/dot_files"
  git clone https://github.com/jejjohnson/dot_files.git "$WORK/projects/dot_files/"
}

install_mamba_jlab(){
  wget https://raw.githubusercontent.com/jejjohnson/dot_files/master/jupyter_scripts/jupyterlab.yml -O "$WORK/downloads/jlab.yaml"
  eval "$("$MINICONDA_PREFIX/condabin/conda" shell.bash hook)"
  mamba env create -f "$WORK/downloads/jlab.yaml"
}

install_mamba_dl(){
  install_mamba_jax
  install_conda_pytorch
  install_conda_tensorflow
}

install_mamba_jax(){
  wget https://raw.githubusercontent.com/jejjohnson/dot_files/main/jupyter_scripts/jupyterlab.yaml -O "$WORK/downloads/jlab.yaml"
  eval "$("$MINICONDA_PREFIX/condabin/conda" shell.bash hook)"
  conda env create -f "$WORK/downloads/jlab.yaml"
}

install_conda_pytorch(){
  wget https://raw.githubusercontent.com/jejjohnson/dot_files/main/jupyter_scripts/jupyterlab.yaml -O "$WORK/downloads/jlab.yaml"
  eval "$("$MINICONDA_PREFIX/condabin/conda" shell.bash hook)"
  conda env create -f "$WORK/downloads/jlab.yaml"
}

install_conda_tensorflow(){
  wget https://raw.githubusercontent.com/jejjohnson/dot_files/main/jupyter_scripts/jupyterlab.yaml -O "$WORK/downloads/jlab.yaml"
  eval "$("$MINICONDA_PREFIX/condabin/conda" shell.bash hook)"
  conda env create -f "$WORK/downloads/jlab.yaml"
}


init_conda_env(){
  wget https://raw.githubusercontent.com/quentinf00/dotfiles/main/conda/base_environment.yaml -O conda_ide.yaml
  eval "$("$MINICONDA_PREFIX/condabin/conda" shell.bash hook)"
  conda install -y mamba -c conda-forge
  mamba env update -f conda_ide.yaml
  # WARNING: this deletes your existing shell rc files; back them up first
  rm -f ~/.zshrc ~/.bashrc
  # let conda init write fresh rc files, then keep them under separate names
  conda init && mv ~/.bashrc ~/.condainitrc
  conda init zsh && mv ~/.zshrc ~/.condainitzshrc
}



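reinstall_everything(){
  # ${MINICONDA_PREFIX:?} aborts if the variable is unset, so rm -rf never runs on an empty path
  rm -rf "${MINICONDA_PREFIX:?}"
  # reuse the installer and jlab.yaml downloaded to $WORK/downloads by the functions above
  bash "$WORK/downloads/miniconda.sh" -b -p "$MINICONDA_PREFIX"
  eval "$("$MINICONDA_PREFIX/condabin/conda" shell.bash hook)"
  conda install -y mamba -c conda-forge
  # expects conda_ide.yaml in the current directory (see init_conda_env)
  mamba env update -f conda_ide.yaml
  mamba env create -f "$WORK/downloads/jlab.yaml"
  conda init
}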
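Note that the script above only defines functions; nothing runs until one of them is called. One possible way to drive it from the command line is a small dispatcher appended to the end of the file. This is a sketch, not part of the original script: `run_step` is a hypothetical name, and the list of allowed steps is just a selection of the functions defined above.

```shell
# Hypothetical dispatcher: run one of the known steps by name and
# pass any remaining arguments through to it.
run_step() {
  case "$1" in
    install_miniconda|install_mamba|clone_dotfiles|install_mamba_jlab|init_conda_env)
      "$@"
      ;;
    *)
      echo "unknown step: $1" >&2
      return 1
      ;;
  esac
}

# Usage (at the very end of the script file):
# run_step "$@"
# and then from the terminal, with a placeholder file name:
# bash path/to/script.sh install_miniconda
```

Restricting the names with a case statement keeps a typo from being executed as an arbitrary command.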