Cluster access request
To request access to the Thea clusters, fill in the access request form.
Once you have a username and can access the login nodes, follow the example in Getting Started with HPC-AI AC Clusters to allocate GH nodes.
Connect to the lab
Once you have your username, log in to the clusters:
...
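The exact login endpoint is omitted above; as a minimal sketch, assuming a placeholder login-node address supplied with your account details:

# <login-node> is a hypothetical placeholder; use the address provided with your account
ssh <username>@<login-node>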
$ sinfo -p gh
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
gh           up   infinite      8   idle gh[001-008]
...
Running Jobs
Slurm is the system job scheduler. Each job has a maximum walltime of 12 hours, and nodes are allocated in exclusive mode by default (a user always allocates a full node; no sharing). The GPU is visible as soon as a job has allocated a node, so no --gres options are needed.
Please avoid allocating nodes interactively if possible, or set a short time limit, because the resources are shared among multiple users.
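If you do allocate interactively, release the node as soon as you are done; for example:

squeue -u $USER    # find your job ID
scancel <jobid>    # release the allocation early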
...
Allocation Examples
How to allocate one GH200 node:
salloc -n 72 -N 1 -p gh -t 1:00:00
How to allocate two GH200 nodes:
salloc -n 144 -N 2 -p gh -t 1:00:00
How to allocate one specific GH200 node:
salloc -n 72 -N 1 -p gh -w gh004 -t 1:00:00
How to allocate two specific GH200 nodes:
salloc -n 144 -N 2 -p gh -w gh002,gh004 -t 1:00:00
How to allocate four GH200 nodes while excluding a specific one (gh001):
salloc -n 288 -N 4 -p gh -x gh001 -t 1:00:00
How to allocate one Grace-only node:
salloc -n 144 -N 1 -p gg -t 1:00:00
How to allocate four Grace-only nodes:
salloc -n 576 -N 4 -p gg -t 1:00:00
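Inside a salloc session, commands are placed on the allocated nodes with srun; for example, to verify which nodes were granted:

srun hostname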
How to submit a batch job
$ sbatch -N 4 -p gh --time=1:00:00 <slurm script>
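sbatch prints the job ID on submission; progress can then be followed with, for example:

squeue -j <jobid>    # current state of the job
sacct -j <jobid>     # accounting information once it has run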
Batch job
Example of a batch job script running on 2 GH200 nodes with 2 tasks per node via mpirun. The mapping ppr:2:node:PE=36 places 2 ranks per node and binds each rank to 36 cores.
#!/bin/bash -l
#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=36
#SBATCH --nodes=2
#SBATCH --partition=gh
#SBATCH --time=1:00:00
#SBATCH --exclusive

. /global/scratch/groups/gh/bootstrap-gh-env.sh
module purge
module load openmpi/4.1.6-gcc-12.3.0-wftkmyd

mpirun -np 4 --map-by ppr:2:node:PE=36 \
       --report-bindings uname -a
Example of a batch job script running on 2 GH200 nodes with 2 tasks per node via srun
#!/bin/bash -l
#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=36
#SBATCH --nodes=2
#SBATCH --partition=gh
#SBATCH --time=1:00:00
#SBATCH --exclusive

. /global/scratch/groups/gh/bootstrap-gh-env.sh
module purge

srun --mpi=pmi2 uname -a
Example of a batch job script running on 2 Grace-only nodes, MPI-only on all 144 cores per node, via mpirun
#!/bin/bash -l
#SBATCH --ntasks=288
#SBATCH --ntasks-per-node=144
#SBATCH --cpus-per-task=1
#SBATCH --nodes=2
#SBATCH --partition=gg
#SBATCH --time=1:00:00
#SBATCH --exclusive

. /global/scratch/groups/gh/bootstrap-gh-env.sh
module purge
module load openmpi/4.1.6-gcc-12.3.0-wftkmyd

mpirun -np 288 --map-by ppr:144:node:PE=1 \
       --report-bindings uname -a
Example of a batch job script running on 4 Grace-only nodes with an MPI+OpenMP combination via mpirun
#!/bin/bash -l
#SBATCH --ntasks=64
#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=9
#SBATCH --nodes=4
#SBATCH --partition=gg
#SBATCH --time=1:00:00
#SBATCH --exclusive

. /global/scratch/groups/gh/bootstrap-gh-env.sh
module purge
module load openmpi/4.1.6-gcc-12.3.0-wftkmyd

export OMP_NUM_THREADS=9
mpirun -np 64 --map-by ppr:16:node:PE=9 \
       --report-bindings uname -a
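For the hybrid case it can also help to control thread placement; a minimal sketch using standard OpenMP environment variables (not specific to Thea):

export OMP_PROC_BIND=close   # keep each rank's threads close together
export OMP_PLACES=cores      # pin one thread per physical core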
Working with Singularity containers
Singularity is the only container engine available at the moment. Docker and enroot workflows need to be adapted to run (as a regular user) on Thea.
Example 1: Run pre-staged Singularity containers interactively
(1) Allocate an interactive node
salloc -n 1 -N 1 -p gh -t 1:00:00
(2) Select the container and invoke singularity run
export CONT="/global/scratch/groups/gh/sif_images/pytorch-23.12-py3.sif"
singularity run --nv "${CONT}"
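To run a single command instead of dropping into the container's default entrypoint, singularity exec works the same way; for example, a quick check that the GPU is visible to PyTorch inside this image:

singularity exec --nv "${CONT}" \
    python -c "import torch; print(torch.cuda.is_available())"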
NOTE - Accessing a SIF container is usually fast enough even when the file is located on the Lustre filesystem. Copying it to /local only improves the bootstrap time marginally.
Example 2: Run a pre-staged Singularity container non-interactively via srun
export CONT="/global/scratch/groups/gh/sif_images/pytorch-23.12-py3.sif"
srun --mpi=pmi2 -N 1 -n 1 --ntasks-per-node=1 -p gh -t 4:00:00 \
     singularity -v run --nv "${CONT}" python my_benchmark_script.py
NOTE - The current working directory from which srun and singularity are executed is automatically exposed inside the container.
Example 3: How to squash and run a NGC container into a new read-only Singularity image
TIP - Building a container is a very I/O-intensive operation. It is better to leverage /local when possible, but remember to copy your SIF image or sandbox folder back to ${SCRATCH} before the job completes, otherwise all files are lost.
1. Allocate an interactive node
salloc -n 1 -N 1 -p gh -t 1:00:00
2. Set additional environment variables
Make sure singularity pull operates entirely from /local, for performance reasons and capacity constraints:
mkdir /local/tmp_singularity
mkdir /local/tmp_singularity_cache
export APPTAINER_TMPDIR=/local/tmp_singularity
export APPTAINER_CACHEDIR=/local/tmp_singularity_cache
3. Pull the Singularity image locally
singularity pull pytorch-23.12-py3.sif docker://nvcr.io/nvidia/pytorch:23.12-py3
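The pull writes the SIF into the current directory; if that is /local, remember to copy the image back to persistent storage before the allocation ends (see the TIP above), for example:

cp pytorch-23.12-py3.sif /global/scratch/users/$USER/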
Example 4: How to create a Singularity Sandbox and run / repackage a new container image
1. Grab one node in interactive mode
salloc -n 1 -N 1 -p gh -t 2:00:00
2. Identify which container to extend via a sandbox and prep the environment
export CONT_DIR=/global/scratch/groups/gh/sif_images
export CONT_NAME="pytorch-23.12-py3.sif"
mkdir /local/$SLURM_JOBID
export APPTAINER_TMPDIR=/local/$SLURM_JOBID/_tmp_singularity
export APPTAINER_CACHEDIR=/local/$SLURM_JOBID/_cache_singularity
rm -rf ${APPTAINER_TMPDIR} && mkdir -p ${APPTAINER_TMPDIR}
rm -rf ${APPTAINER_CACHEDIR} && mkdir -p ${APPTAINER_CACHEDIR}
3. Make a copy of the base container, since reading and verifying it is faster on local disk
cp ${CONT_DIR}/${CONT_NAME} /local/$SLURM_JOBID/
4. Create a Singularity definition file
Start with the original NGC container as the base image and add extra packages in the %post phase.
cat > custom-pytorch.def << EOF
Bootstrap: localimage
From: /local/${SLURM_JOBID}/${CONT_NAME}

%post
    apt-get update
    apt-get -y install python3-venv
    pip install --upgrade pip
    pip install transformers accelerate huggingface_hub
EOF
After this there are two options:
5A. Create the sandbox on persistent storage
TIP - Use this method if you want to customise your image by building software manually or debugging a failing pip command.
cd /global/scratch/users/$USER
singularity build --sandbox custom-python-sandbox custom-pytorch.def
When completed, open a shell in the sandbox on an interactive node via
singularity shell --nv --writable custom-python-sandbox
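Because the sandbox is a writable directory tree, packages can be added or fixed in place before repackaging; a sketch, with the package name as a placeholder:

singularity exec --writable custom-python-sandbox pip install <some-package>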
5B. Create a new SIF image
TIP - Use this method if you want to create a read-only image to run workloads and you are confident all %post steps can run successfully without manual intervention.
cd /global/scratch/users/$USER
singularity build custom-python.sif custom-pytorch.def
When completed, run it on an interactive node via
singularity run --nv custom-python.sif
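A quick way to confirm the %post steps took effect is to import one of the newly installed packages, for example:

singularity exec custom-python.sif \
    python -c "import transformers; print(transformers.__version__)"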
Storage
When you log in, you start in your $HOME directory. Extra scratch space is also available.
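For example, the personal scratch area used in the container examples above can be reached as:

echo $HOME                        # home directory
ls /global/scratch/users/$USER    # personal scratch space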
...