For the ISC22 Student Cluster Competition, we will have a coding challenge for the participating teams!
...
Slides: see the files attached to this page.
...
```bash
#!/bin/bash -l
#SBATCH -p thor
#SBATCH --nodes=16
#SBATCH -J osu
#SBATCH --time=15:00
#SBATCH --exclusive

module load gcc/8.3.1 mvapich2-dpu/2021.08

srun -l hostname -s | awk '{print $2}' | grep -v bf | sort > hostfile
srun -l hostname -s | awk '{print $2}' | grep bf | sort | uniq > dpufile

NPROC=$(cat hostfile | wc -l)
EXE=$MVAPICH2_DPU_DIR/libexec/osu-micro-benchmarks/osu_ialltoall

# No DPU offload
mpirun_rsh -np $NPROC -hostfile hostfile MV2_USE_DPU=0 $EXE

# DPU offload
mpirun_rsh -np $NPROC -hostfile hostfile -dpufile dpufile $EXE
```
Keep in mind that we are not running processes directly on the DPUs, but on the hosts; MVAPICH2-DPU takes care of the DPU offloading.
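For illustration, with the node set used in the submission command below, the two generated files would look roughly as follows (the exact hostnames are an assumption based on that node list). Every line of hostfile corresponds to one MPI rank on an x86 host, while dpufile only lists the BlueField-2 cards that MVAPICH2-DPU uses for the offload:

```text
# hostfile -- one line per MPI rank, host nodes only (8 entries per node here)
thor025
thor025
...
thor032

# dpufile -- one line per BlueField-2 DPU
thor-bf25
thor-bf26
...
thor-bf32
```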
Job submission:
```bash
sbatch -N 16 -w thor0[25-32],thor-bf[25-32] --ntasks-per-node=8 RUN-osu.slurm
```
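Once the job completes, a quick way to compare the two runs could look like the sketch below. The output file name and the assumption that the overlap percentage is the last column of the osu_ialltoall table should be checked against the actual job output before relying on the field numbers.

```bash
# Post-processing sketch: print message size (first column) and overlap
# percentage (assumed to be the last column) for every numeric data row.
# Both tables -- no offload first, then DPU offload -- land in the same
# Slurm output file, in that order. Replace JOBID with the actual job id.
awk '$1 ~ /^[0-9]+$/ {printf "%-12s %s\n", $1, $NF}' slurm-JOBID.out
```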
...
[contribution to total score 30%] – Run the original and the modified xcompact3d application using the cylinder input case (/global/home/groups/isc_scc/coding-challenge/input.i3d). You are not allowed to change the problem size, but you should adjust the “Domain decomposition” settings in the input file. Obtain performance measurements on 8 nodes with and without the DPU adapter (note: the Thor servers are equipped with two adapters, ConnectX-6 and BlueField-2; mlx5_2 should be used on the host), and make sure to vary the PPN (4, 8, 16, 32). Run an MPI profiler (mpiP or IPM) to understand whether MPI overlap is happening and how the parallel behaviour of the application has changed. A sketch of a run script for this sweep is shown below.
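As a starting point, a run script along the following lines could drive the PPN sweep. This is a minimal sketch: the module names, the xcompact3d binary path and the way the input file is passed on the command line, the mpiP library location, the output file names, and the use of MV2_IBA_HCA to force the host-side ConnectX-6 port (mlx5_2) are all assumptions that must be adapted to the actual build and cluster environment. mpiP is assumed to be used dynamically via LD_PRELOAD of its shared library; IPM could be preloaded the same way.

```bash
#!/bin/bash -l
#SBATCH -p thor
#SBATCH --nodes=16
#SBATCH -J xc3d
#SBATCH --time=01:00:00        # adjust as needed
#SBATCH --exclusive

# Assumed module names -- match them to what is installed on the system.
module load gcc/8.3.1 mvapich2-dpu/2021.08

# Unique host and DPU lists: 8 x86 hosts plus their 8 BlueField-2 cards.
srun -l hostname -s | awk '{print $2}' | grep -v bf | sort | uniq > hosts
srun -l hostname -s | awk '{print $2}' | grep bf    | sort | uniq > dpufile

EXE=$HOME/xcompact3d/bin/xcompact3d                          # assumed install path
INPUT=/global/home/groups/isc_scc/coding-challenge/input.i3d
MPIP=$HOME/mpiP/lib/libmpiP.so                               # assumed mpiP location

for PPN in 4 8 16 32; do
    # mpirun_rsh places one rank per hostfile line, so replicate each
    # host PPN times to get the desired ranks-per-node.
    awk -v n=$PPN '{for (i = 0; i < n; i++) print}' hosts > hostfile.$PPN
    NPROC=$(cat hostfile.$PPN | wc -l)

    # Host-only run (no DPU offload), profiled with mpiP.
    mpirun_rsh -np $NPROC -hostfile hostfile.$PPN \
        MV2_USE_DPU=0 MV2_IBA_HCA=mlx5_2 LD_PRELOAD=$MPIP \
        $EXE $INPUT > xc3d.ppn$PPN.nodpu.log 2>&1

    # Same run with BlueField-2 offload enabled.
    mpirun_rsh -np $NPROC -hostfile hostfile.$PPN -dpufile dpufile \
        MV2_IBA_HCA=mlx5_2 LD_PRELOAD=$MPIP \
        $EXE $INPUT > xc3d.ppn$PPN.dpu.log 2>&1
done
```

Submission would mirror the benchmark job above, e.g. allocating the same thor0[25-32],thor-bf[25-32] node set so that both the hosts and their BlueField-2 cards are available to the job.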
...