JupyterHub
Introduction
PIC offers a service for running Jupyter notebooks on CPU or GPU resources. This service is intended primarily for code development and prototyping rather than data processing. The usage is similar to running notebooks on your personal computer, but it offers the advantage of developing and testing your code on different hardware configurations, and it makes scaling up easier, since the code is tested in the same environment in which it would later run at large scale.
Since the service is strictly intended for development and small-scale testing tasks, a shutdown policy for sessions has been put in place:
- The maximum duration for a session is 48h.
- After an idle period of 2 hours, the session will be closed.
In practice, this means you should choose a test data volume that can be processed within a session of less than 48 hours.
How to connect to the service
Go to jupyter.pic.es to see your login screen.
Sign in with your PIC user credentials. This will take you to the following screen.
Here you can choose the hardware configuration for your Jupyter session. You also have to choose the experiment (project) you will be working on during the Jupyter session. After choosing a configuration and pressing Start, the next screen shows you the progress of the initialisation process. Keep in mind that the job containing your Jupyter session is actually sent to the HTCondor queuing system and waits for available resources before being started. This usually takes less than a minute but can take up to a few minutes depending on our resource usage.
On the next screen you can choose the tool that you want to use for your work: a Python notebook, a Python console or a plain bash terminal. For the Python environment (either notebook or console) you have two default options:
- the ipykernel version of Python 3
- the XPython version of Python 3.9, which allows you to use the integrated debugging module.
Further, you will see an icon with a "D" (Desktop), which starts a VNC session that allows the use of programs with graphical user interfaces.
More recently, you can also find the icon of Visual Studio, an integrated development environment.
Your Python environments should appear under the Notebook and Console headers. In a later section we will show you how to create a new environment and how to remove an existing one.
Terminate your session and logout
It is important that you terminate your session before you log out. In order to do so, go to the top page menu "File -> Hub Control Panel" and you will see the following screen.
Here, click on the Stop My Server button. After that you can log out by clicking the Logout button in the upper right corner.
Python virtual environments
This section covers the use of Python virtual environments with Jupyter.
Initialize conda (we highly recommend the use of mambaforge)
Before using conda/mamba in your bash session, you have to initialize it. For access to an available conda/mamba installation, please get in contact with your project liaison at PIC, who will give you the actual value for the /path/to/anaconda placeholder.
Log onto Jupyter and start a session. On the homepage of your Jupyter session, click on the terminal button on the session dashboard on the right to open a bash terminal. If no specific version is needed, you can use the path shown in the example below.
First, let's initialize conda for our bash sessions:
[neissner@td110 ~]$ /data/astro/software/centos7/conda/mambaforge_4.14.0/bin/mamba init
This modifies the .bashrc file in your home directory so that the base environment is activated on login. To prevent the base environment from being activated every time you log on to a node, run:
[neissner@td110 ~]$ conda config --set auto_activate_base false
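To verify that the setting took effect, you can print it back; this check is optional and just reads the standard conda configuration option:
[neissner@td110 ~]$ conda config --show auto_activate_base
auto_activate_base: False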
For now you can exit the terminal.
[neissner@td110 ~]$ exit
Link an existing environment to Jupyter
If you want to know which environments are available for your project, please contact your project liaison at PIC, who will give you the values for the placeholders /path/to/predefined/venv/environment or /path/to/predefined/conda/environment. Although you can find instructions on how to create your own environments, e.g. here, we would like to encourage the use of the predeployed environments. The main reason is the sheer size of a virtual environment, which easily reaches several GB. If for any reason you need to use your own environment, make sure that the ipykernel module is installed.
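A quick way to check whether ipykernel is present in an activated environment is, for example:
(...) [neissner@td110 ~]$ python -c "import ipykernel; print(ipykernel.__version__)"
If the import fails, install it with pip install ipykernel (venv) or mamba install ipykernel (conda).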
Log into Jupyter, start a session. From the session dashboard choose the bash terminal.
Inside the terminal, activate your environment.
For conda environments:
[neissner@td110 ~]$ mamba activate /path/to/predefined/conda/environment
(...) [neissner@td110 ~]$
The parentheses (...) in front of your bash prompt show the name of your environment.
For venv environments:
[neissner@td110 ~]$ source /path/to/predefined/venv/environment/bin/activate
(...) [neissner@td110 ~]$
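In either case, a simple way to confirm that the environment is active is to check which Python interpreter is first on your PATH (the output path below is illustrative):
(...) [neissner@td110 ~]$ which python
/path/to/predefined/venv/environment/bin/python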
Link the environment to a Jupyter kernel. For both, conda and venv:
(...) [neissner@td110 ~]$ python -m ipykernel install --user --name=whatever_kernel_name
Installed kernelspec whatever_kernel_name in /nfs/pic.es/user/n/neissner/.local/share/jupyter/kernels/whatever_kernel_name
Deactivate your environment.
For conda:
(...) [neissner@td110 ~]$ mamba deactivate
For venv:
(...) [neissner@td110 ~]$ deactivate
Now you can exit the terminal. After refreshing the Jupyter page, whatever_kernel_name appears in the dashboard. In this example, test has been used as whatever_kernel_name.
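If the new kernel does not show up after refreshing, you can list the kernelspecs that Jupyter knows about directly from the terminal (the listing below is illustrative):
[neissner@td110 ~]$ jupyter kernelspec list
Available kernels:
  whatever_kernel_name    /nfs/pic.es/user/n/neissner/.local/share/jupyter/kernels/whatever_kernel_name
  ...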
Unlink an environment from Jupyter
Log onto Jupyter, start a session and from the session dashboard choose the bash terminal. To remove your environment/kernel from Jupyter run:
[neissner@td110 ~]$ jupyter kernelspec uninstall whatever_kernel_name
Kernel specs to remove:
  whatever_kernel_name    /nfs/pic.es/user/n/neissner/.local/share/jupyter/kernels/whatever_kernel_name
Remove 1 kernel specs [y/N]: y
[RemoveKernelSpec] Removed /nfs/pic.es/user/n/neissner/.local/share/jupyter/kernels/whatever_kernel_name
Keep in mind that, although not available in Jupyter anymore, the environment still exists. Whenever you need it, you can link it again.
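To see which conda environments still exist, for instance to find the one you want to link again, you can list them:
[neissner@td110 ~]$ mamba env list
venv environments are not registered anywhere; they are simply the directories you created, e.g. under ~/env.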
Create virtual environments with venv or conda
Before creating a new environment, please get in contact with your project liaison at PIC, as there may already be a suitable environment for your needs in place.
If none of the existing environments suits your needs, you can create a new one. First, create a directory in a suitable place to store the environment. For single-user environments, place them in your home directory under ~/env. For environments that will be shared with other project users, contact your project liaison and ask for a path on a shared storage volume that is visible to all of them.
Once you have the location (i.e. /path/to/env/folder), create the environment with the following commands:
For venv environments (recommended)
[neissner@td110 ~]$ cd /path/to/env/folder
[neissner@td110 ~]$ python3 -m venv your_env
Now you should be able to activate your environment and install additional modules:
[neissner@td110 ~]$ cd /path/to/env/folder
[neissner@td110 ~]$ . your_env/bin/activate
(...)[neissner@td110 ~]$ pip install additional_module1 additional_module2 ...
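If you want to be able to recreate the environment later, one option (not required by the steps above) is to record the installed packages in a requirements file; the file name requirements.txt is just an example:
(...)[neissner@td110 ~]$ pip freeze > requirements.txt
A fresh venv can then be populated with pip install -r requirements.txt.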
For conda environments
[neissner@td110 ~]$ mamba create --prefix /path/to/env/folder/your_env module1 module2 ...
The list of modules (module1, module2, ...) is optional. For instance, for a python3 environment with scipy you would specify: python=3 scipy
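For example, using the placeholder path from above, a Python 3 environment with scipy would be created with:
[neissner@td110 ~]$ mamba create --prefix /path/to/env/folder/your_env python=3 scipy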
Now you should be able to activate your environment and install additional modules:
[neissner@td110 ~]$ mamba activate /path/to/env/folder/your_env
(...)[neissner@td110 ~]$ mamba install additional_module1 additional_module2 ...
You can use pip install inside a mamba environment; however, resolving dependencies might require installing additional packages manually.
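One way to spot missing or incompatible dependencies after mixing pip and mamba installs is pip's built-in consistency check; this is a suggestion rather than part of the workflow above:
(...)[neissner@td110 ~]$ pip check
Anything it reports as missing or incompatible can then be installed manually with mamba install or pip install.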