For the ISC22 Student Cluster Competition, there will be a coding challenge for the participating teams!

...

Video: https://www.youtube.com/watch?v=4YXfH-o57y0

Slides:

MVAPICH2-DPU-Overview.pdf
ISC22-SCC__Coding-Challenge.pdf

...

Code Block
#!/bin/bash -l
#SBATCH -p thor
#SBATCH --nodes=16
#SBATCH -J osu
#SBATCH --time=15:00
#SBATCH --exclusive

module load gcc/8.3.1 mvapich2-dpu/2021.08

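# Host names (one line per MPI task), excluding the BlueField DPUs
srun -l hostname -s | awk '{print $2}' | grep -v bf | sort > hostfile
# Unique DPU names (hostnames containing "bf")
srun -l hostname -s | awk '{print $2}' | grep bf | sort | uniq > dpufile
# One MPI process per hostfile line
NPROC=$(wc -l < hostfile)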

EXE=$MVAPICH2_DPU_DIR/libexec/osu-micro-benchmarks/osu_ialltoall
# No DPU offload
mpirun_rsh -np $NPROC -hostfile hostfile MV2_USE_DPU=0 $EXE
# DPU offload
mpirun_rsh -np $NPROC -hostfile hostfile -dpufile dpufile $EXE

Keep in mind that the processes run on the hosts, not directly on the DPUs; MVAPICH2-DPU takes care of the DPU offloading.
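To see how the batch script above separates hosts from DPUs, the same filters can be tried on a small sample. This is only a sketch: the sample text is a hypothetical stand-in for `srun -l hostname -s` output (a task label followed by the short hostname), with names modelled on the thor0NN / thor-bfNN pattern used in the job submission.

```shell
# Hypothetical sample of `srun -l hostname -s` output; real output depends on the allocation.
sample='0: thor025
1: thor025
2: thor-bf25
3: thor026
4: thor-bf26
5: thor-bf26'

# Hosts: drop every name containing "bf"; duplicates are kept (one entry per task).
hosts=$(printf '%s\n' "$sample" | awk '{print $2}' | grep -v bf | sort)

# DPUs: keep only names containing "bf", one entry per DPU.
dpus=$(printf '%s\n' "$sample" | awk '{print $2}' | grep bf | sort | uniq)

# Number of MPI processes = number of host entries.
nproc=$(printf '%s\n' "$hosts" | wc -l | tr -d ' ')

echo "hosts:"; echo "$hosts"
echo "dpus:";  echo "$dpus"
echo "NPROC=$nproc"
```

Because the host list keeps one entry per task, its line count gives the process count passed to `-np`, while the DPU list is deduplicated so each DPU appears once.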

Job submission:

Code Block
sbatch -N 16 -w thor0[25-32],thor-bf[25-32] --ntasks-per-node=8 RUN-osu.slurm

...

[contribution to total score 30%] – Run the original and the modified xcompact3d application using the cylinder input case (/global/home/groups/isc_scc/coding-challenge/input.i3d). You are not allowed to change the problem size, but you should adjust the “Domain decomposition” settings in the input file. Obtain performance measurements on 8 nodes with and without the DPU adapter, varying PPN (4, 8, 16, 32). Note: Thor servers are equipped with two adapters, ConnectX-6 and BlueField-2; mlx5_2 should be used on the host. Run an MPI profiler (mpiP or IPM) to determine whether MPI overlap is happening and how the parallel behaviour of the application has changed.
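The rank count for each PPN value, and one possible domain decomposition for it, can be worked out with a short script. This is a rough sketch under assumptions: it assumes the “Domain decomposition” entries in input.i3d are the `p_row` and `p_col` parameters and that their product must equal the number of MPI ranks; the helper `decomp` is hypothetical, and the near-square split it prints is only a starting point, not necessarily the fastest layout or one that suits the grid dimensions.

```shell
NODES=8   # node count given in the task description

# Hypothetical helper: print the factor pair of N closest to square, as "p_row p_col".
# The body runs in a subshell so its variables do not leak into the caller.
decomp() (
    n=$1
    r=1
    i=1
    while [ $((i * i)) -le "$n" ]; do
        if [ $((n % i)) -eq 0 ]; then
            r=$i    # largest divisor found so far that is <= sqrt(n)
        fi
        i=$((i + 1))
    done
    echo "$r $((n / r))"
)

for ppn in 4 8 16 32; do
    ranks=$((NODES * ppn))
    set -- $(decomp "$ranks")   # $1 = p_row, $2 = p_col
    echo "PPN=$ppn ranks=$ranks p_row=$1 p_col=$2"
done
```

For the four PPN values this gives 32, 64, 128 and 256 ranks. Other factor pairs of the same rank count are worth measuring as well, since the best split depends on the grid and on the communication pattern.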

...