...
The MILC Code is a body of high performance research software written in C (with some C++) for doing SU(3) lattice gauge theory on high performance computers as well as single-processor workstations.
Compiling
...
MILC (CPU only)
Code Block |
---|
# fall3d-8.1.2.tar.gz will be provided during competition.
git clone --branch develop https://github.com/milc-qcd/milc_qcd.git
cd milc_qcd/ks_imp_rhmc
cp ../Makefile .
# Edit compile_su3_rhmd_hisq_quda.sh
# Remove QUDA, CUDA, GPU flags and set compilers, arch etc.
# module load intel/2021.42022.3.1 compiler mkl
module load compilerhpcx/20212.414.0
module load mkl/2021.4.0
# Build netcdf and set include and lib paths
./configure -prefix=<install path> --enable-parallel
make clean
make -j 32
make install |
Running FALL3D
A sample case is available in the Example directory.
Code Block |
---|
cd Example
NPROC=160
mpirun -np $NPROC <MPI Flags> Fall3d.r8.x Example.inp 8 5 4 -nens 1
# Make sure the process grid matches NPROC, i.e. 8 * 5 * 4 = 160 |
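The grid check described in the comment above can be automated before launching. The following is a minimal sketch, assuming a POSIX shell; the function name `check_grid` and its argument order are illustrative and not part of FALL3D.

```shell
#!/bin/sh
# Hypothetical helper: verify that the process grid NX x NY x NZ
# multiplies out to the MPI rank count before calling mpirun.
check_grid() {
    nproc=$1; nx=$2; ny=$3; nz=$4
    if [ $((nx * ny * nz)) -eq "$nproc" ]; then
        echo "grid ${nx}x${ny}x${nz} matches NPROC=${nproc}"
    else
        echo "grid ${nx}x${ny}x${nz} gives $((nx * ny * nz)) ranks, expected ${nproc}" >&2
        return 1
    fi
}

NPROC=160
check_grid "$NPROC" 8 5 4
```

A non-zero return value can be used to stop a job script before `mpirun` is reached.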
Tasks and Submissions
The input relates to the Mount St. Helens eruption of May 18, 1980: https://www.usgs.gov/media/videos/mount-st-helens-eruption-may-18-1980.
Code Block |
---|
# Copy input from USB to a folder.
NPROC=160
mpirun -np $NPROC <MPI Flags> Fall3d.r8.x fall3d helens.inp 8 5 4
# Make sure the process grid matches NPROC, i.e. 8 * 5 * 4 = 160 |
Try different processor grids for better performance.
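To see which grids are available for a given rank count, every ordered factorization into three factors can be listed. This is a rough sketch in POSIX shell; the function name `list_grids` is hypothetical, and which grid performs best has to be measured on the actual system.

```shell
#!/bin/sh
# List every ordered triple (nx, ny, nz) with nx * ny * nz = N.
list_grids() {
    n=$1
    nx=1
    while [ "$nx" -le "$n" ]; do
        if [ $((n % nx)) -eq 0 ]; then
            rest=$((n / nx))
            ny=1
            while [ "$ny" -le "$rest" ]; do
                if [ $((rest % ny)) -eq 0 ]; then
                    # nz is fixed once nx and ny are chosen
                    echo "$nx $ny $((rest / ny))"
                fi
                ny=$((ny + 1))
            done
        fi
        nx=$((nx + 1))
    done
}

list_grids 160
```

Each output line can be passed as the three grid arguments of the run command, with NPROC left unchanged.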
For debugging or quick testing, you can shorten the simulation by lowering the parameter below in helens.inp. Make sure to restore the original value for the submission.
Code Block |
---|
RUN_END_(HOURS_AFTER_00) = 55 |
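One way to avoid submitting a shortened run by accident is to edit a copy of the input file rather than the original. The sketch below assumes the parameter sits on a single line in the form shown above; the function name `shorten_run`, the file name `helens_test.inp`, and the value 6 are placeholders chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write a copy of an input file with a shorter run
# duration, leaving the original untouched for the final submission.
# Assumes a single line of the form "RUN_END_(HOURS_AFTER_00) = <value>".
shorten_run() {
    src=$1; dst=$2; hours=$3
    # In a basic regular expression, plain parentheses are literal and
    # \( ... \) captures the part of the line kept before the new value.
    sed "s/^\([[:space:]]*RUN_END_(HOURS_AFTER_00)[[:space:]]*=[[:space:]]*\).*/\1${hours}/" "$src" > "$dst"
}

# Example (file name and value are placeholders for a quick test):
# shorten_run helens.inp helens_test.inp 6
```

The test run then uses the copy, and the submission run uses the unmodified helens.inp.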
...
export OMPI_MPICC=icx
./compile_su3_rhmd_hisq_quda.sh |
Compiling MILC (GPU support)
See the link below for instructions on building MILC with QUDA.
https://github.com/lattice/quda/wiki/MILC-with-QUDA
Running MILC
Input file and instructions are available at https://github.com/lattice/quda/wiki/Running-the-NERSC-MILC-Benchmarks
We will use the medium benchmark, 36x36x36x72.chklat, for the competition.
Code Block |
---|
wget https://portal.nersc.gov/project/m888/apex/MILC_160413.tgz
tar xvzf MILC_160413.tgz
cd MILC-apex/benchmarks/medium
wget https://portal.nersc.gov/project/m888/apex/MILC_lattices/36x36x36x72.chklat
# Edit and execute run_medium.sh |
Sample output:
Code Block |
---|
Running "mpirun -np 1 -x UCX_NET_DEVICES=mlx5_0:1 ./su3_rhmd_hisq"
Ignoring PCI device with non-16bit domain.
Pass --enable-32bits-pci-domain to configure to support such devices
(warning: it would break the library ABI, don't enable unless really needed).
com_qmp: set thread-safety level to 0
SU3 with improved KS action
Microcanonical simulation with refreshing
Rational function hybrid Monte Carlo algorithm
MIMD version 3be2-dirty
Machine = QMP (portable), with 1 nodes
...
Options selected...
Generic double precision
C_GLOBAL_INLINE
FEWSUMS
KS_MULTICG=HYBRID
KS_MULTIFF=FNMAT
VECLENGTH=4
INT_ALG=INT_3G1F
HISQ_REUNIT_ALLOW_SVD
HISQ_REUNIT_SVD_REL_ERROR = 1e-08
HISQ_REUNIT_SVD_ABS_ERROR = 1e-08
HISQ_FORCE_FILTER = 5e-05
HISQ_FF_MULTI_WRAPPER is ON
type 0 for no prompts, 1 for prompts, or 2 for proofreading
nx 36
ny 36
nz 36
nt 72
...
initQuda-endQuda Total time = 2806.835 secs
QUDA Total time = 2458.793 secs
download = 133.950 secs ( 5.448%), with 6136 calls at 2.183e+04 us per call
upload = 143.992 secs ( 5.856%), with 3255 calls at 4.424e+04 us per call
init = 47.168 secs ( 1.918%), with 10661 calls at 4.424e+03 us per call
preamble = 0.252 secs ( 0.010%), with 3174 calls at 7.952e+01 us per call
compute = 2098.426 secs ( 85.344%), with 9173 calls at 2.288e+05 us per call
comms = 1.242 secs ( 0.051%), with 861 calls at 1.443e+03 us per call
epilogue = 20.609 secs ( 0.838%), with 3180 calls at 6.481e+03 us per call
free = 11.711 secs ( 0.476%), with 6848 calls at 1.710e+03 us per call
total accounted = 2457.351 secs ( 99.941%)
total missing = 1.442 secs ( 0.059%)
Device memory used = 30996.2 MiB
Pinned device memory used = 0.0 MiB
Managed memory used = 5689.6 MiB
Shmem memory used = 0.0 MiB
Page-locked host memory used = 20174.4 MiB
Total host memory used >= 26043.5 MiB |
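When comparing runs, the headline timings can be pulled out of the log instead of being read by eye. This is a minimal sketch using awk, assuming the timing lines keep the layout of the sample output above; the function name `quda_times` and the log file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the QUDA total and compute times (in seconds)
# from a MILC log whose timing lines look like the sample output.
quda_times() {
    awk '
        # "QUDA Total time = <secs> secs": the value is the second-to-last field.
        /^[[:space:]]*QUDA Total time/ { print "total", $(NF - 1) }
        # "compute = <secs> secs ( ...": the value is the third field.
        /^[[:space:]]*compute[[:space:]]*=/ { print "compute", $3 }
    ' "$1"
}

# Example (the log file name is a placeholder):
# quda_times milc_medium.out
```

The first pattern is anchored at the start of the line, so the `initQuda-endQuda Total time` line is not counted.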
Submissions
Submit your build & run scripts and the output file.