.. include:: ../substitutions.txt

*************************
Using HPC Environments
*************************

.. include:: ../site_snippets/apptainer_prereqs.txt

Conducting analyses on high performance computing clusters happens through very different patterns of interaction than running analyses on a VM or on your own laptop.  When you login, you are on a node that is shared with lots of people.  Trying to run jobs on that node is not "high performance" at all.  Those login nodes are just intended to be used for moving files, editing files, and launching jobs.

Most jobs on a HPC cluster are neither interactive, nor realtime.  When you submit a job to the scheduler, you must tell it what resources you need (e.g. how many nodes, what type of nodes) and what you want to run.  Then the scheduler finds resources matching your requirements, and runs the job for you when it can.

For example, if you want to run the command:

.. code-block:: text

  apptainer exec docker://python:latest /usr/local/bin/python --version

On a HPC system, your job submission script would look something like:

.. include:: ../site_snippets/apptainer_batch_job_script.txt

.. Note::

  Every HPC cluster is a little different, but they almost universally have a "User's Guide" that serves both as a quick reference for helpful commands and contains guidelines for how to be a "good citizen" while using the system.  For |site|'s |machine| system, the user guide is at: |sitedocslink|_


How do HPC systems fit into the development workflow?
=====================================================

A couple of things to consider when using HPC systems:

#. Using 'sudo' is not allowed on HPC systems, and building an Apptainer container from scratch requires sudo.  That means you have to build your containers on a different development system, which is why we started this course developing Docker on your own laptop.  You can pull a docker image on HPC systems.
#. If you need to edit text files, command line text editors don't support using a mouse, so working efficiently has a learning curve.  There are text editors that support editing files over SSH.  This lets you use a local text editor and just save the changes to the HPC system.

In general, most developers that work with containers develop their code locally and then deploy their containers to HPC systems to do analyses at scale.  If the containers are written in a way that accommodates the small differences between the Docker and Apptainer runtimes, the transition is fairly seamless.

Differences between Docker and Apptainer
==========================================

Host Directories
^^^^^^^^^^^^^^^^

**Docker:** None by default. Use ``-v <source>:<destination>`` to mount a source host directory to an arbitrary destination within the container.

**Apptainer:** Mounts your current working directory, $HOME directory, and some system directories by default. Other defaults may be set in a system-wide configuration. The ``--bind`` flag is supported but rarely used in practice.

User ID
^^^^^^^

**Docker:** Defined in the Dockerfile, but containers run as root unless a different user is defined or specified on the command line.  This user ID only exists within the container, and care must be taken when working with files on the host filesystem to make sure permissions are set correctly.

**Apptainer:** Containers are run in "userspace", so you are the same user and user ID both inside and outside the container.

Image Format
^^^^^^^^^^^^

**Docker:** Containers are stored in layers and managed in a repository by Docker.  The ``docker images`` command will show you what containers are on your local machine and images are always referenced by their repository and tag name.

**Apptainer:** Containers are files.  Apptainer can build a container on the fly if you specify a repository, but ultimately they are stored as individual files, with all the benefits and dangers inherent to files.


Running a Batch Job on |machine|
================================

Please login to the |machine| system, just like we did at the start of the previous section.  You should be on one of the login nodes of the system.

We will not be editing much text directly on |machine|, but we need to do a little.  If you have a text editor you prefer, use it for this next part.  If not, the ``nano`` text editor is probably the most accessible for those new to Linux.

Create a file called "|pijobscript|":

.. include:: ../site_snippets/apptainer_mkdir.txt

Those commands should open a new file in the nano editor.  Either type in (or copy and paste) the following job script.

.. include:: ../site_snippets/apptainer_pi_job_script.txt

Once you are done, try submitting this file as a job to |scheduler|.

.. include:: ../site_snippets/apptainer_pi_job_sub.txt

You can check the status of your job with the command |jobstatuscmd|.

Once your job has finished, take a look at the output:

.. include:: ../site_snippets/apptainer_pi_job_output.txt

If your containers ran successfully, then congratulations! While this was just a toy example, you have now gone through all the motions of a development lifecycle:

* capturing your code and requirements as a Docker recipe
* deploying your own code to run on your laptop and a HPC system
* using someone else's container both on your laptop and a HPC system
* publishing your code to DockerHub so that it can be shared with others