...

To run GROMACS with GPU acceleration:

Code Block
% export OMP_NUM_THREADS=2
% export KMP_AFFINITY=verbose,compact
% mpirun -np 4 -x UCX_NET_DEVICES=mlx5_0:1 -bind-to none --map-by node:PE=$OMP_NUM_THREADS \
mdrun_mpi -v -s stmv.tpr -nsteps 10000 -noconfout -nb gpu -pin on \
-ntomp $OMP_NUM_THREADS
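
Here OMP_NUM_THREADS sets the OpenMP thread count per MPI rank, and KMP_AFFINITY=verbose,compact makes the Intel OpenMP runtime pin threads compactly and print the resulting placement. On the mpirun side, -x exports UCX_NET_DEVICES to every rank so UCX communicates over port 1 of the mlx5_0 InfiniBand HCA, --map-by node:PE=$OMP_NUM_THREADS places ranks round-robin across nodes with $OMP_NUM_THREADS cores reserved per rank, and -bind-to none leaves thread pinning to mdrun's own -pin on. -noconfout skips writing the final configuration file, which a benchmark does not need.

Under a batch scheduler the same launch might look like the following minimal Slurm sketch (an assumption; the original shows only an interactive run, and the node/rank counts mirror the 4-rank, 2-per-node layout reported in the log below):

Code Block
#!/bin/bash
# sketch: same launch as above under Slurm (scheduler not shown in the original)
#SBATCH -N 2                    # 2 nodes
#SBATCH --ntasks-per-node=2     # one MPI rank per GPU
#SBATCH --cpus-per-task=2       # cores per rank, matches OMP_NUM_THREADS

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export KMP_AFFINITY=verbose,compact

mpirun -np $SLURM_NTASKS -x UCX_NET_DEVICES=mlx5_0:1 -bind-to none \
  --map-by node:PE=$OMP_NUM_THREADS \
  mdrun_mpi -v -s stmv.tpr -nsteps 10000 -noconfout -nb gpu -pin on \
  -ntomp $OMP_NUM_THREADS

mdrun's log then echoes the command line and the run setup: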

Command line:
  mdrun_mpi -v -s stmv.tpr -nsteps 10000 -noconfout -nb gpu -pin on

Reading file stmv.tpr, VERSION 2018.1 (single precision)
Note: file tpx version 112, software tpx version 119
Overriding nsteps with value passed on the command line: 10000 steps, 20 ps
Changing nstlist from 10 to 100, rlist from 1.2 to 1.339
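
The tpx note just means the stmv.tpr input was written by GROMACS 2018.1 and is being read by a newer build; the file format is forward compatible here. Raising nstlist from 10 to 100 (with a correspondingly larger rlist buffer) is mdrun's standard auto-tuning for GPU runs: rebuilding the pair list less often cuts CPU-side list construction cost, which pays off once the nonbonded kernels run on the GPU.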


On host ops003.hpcadvisorycouncil.com 2 GPUs selected for this run.
Mapping of GPU IDs to the 2 GPU tasks in the 2 ranks on this node:
  PP:0,PP:1
PP tasks will do (non-perturbed) short-ranged and most bonded interactions on the GPU
PP task will update and constrain coordinates on the CPU
Using 4 MPI processes
Using 2 OpenMP threads per MPI process
...
imb F  0% step 9900, remaining wall clock time:     3 s
imb F  0% step 10000, remaining wall clock time:     0 s
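
The mapping confirms the intended layout: 2 of the 4 ranks land on each node, each rank's particle-particle (PP) task drives one of the node's 2 GPUs (PP:0, PP:1), short-range nonbonded work runs on the GPU, and integration/constraints stay on the CPU. The "imb F 0%" progress lines already hint at the near-perfect force load balance quantified below. If the automatic GPU assignment ever needs overriding, mdrun accepts an explicit list of usable device IDs; a sketch, assuming the device IDs 0 and 1 reported above:

Code Block
% mpirun -np 4 -x UCX_NET_DEVICES=mlx5_0:1 -bind-to none --map-by node:PE=$OMP_NUM_THREADS \
mdrun_mpi -v -s stmv.tpr -nsteps 10000 -noconfout -nb gpu -pin on \
-ntomp $OMP_NUM_THREADS -gpu_id 01   # sketch: limit mdrun to devices 0 and 1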


Dynamic load balancing report:
 DLB was off during the run due to low measured imbalance.
 Average load imbalance: 0.2%.
 The balanceable part of the MD step is 59%, load imbalance is computed from this.
 Part of the total run time spent waiting due to load imbalance: 0.1%.
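
In short, dynamic load balancing never needed to engage: the measured imbalance (0.2% of the 59% of each step that is balanceable) stayed below the point where redistributing domain-decomposition cells would pay off, and only 0.1% of total run time was lost to waiting.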


               Core t (s)   Wall t (s)        (%)
       Time:     2581.428      322.681      800.0
                 (ns/day)    (hour/ns)
Performance:        5.356        4.481
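
The 800% core utilization checks out: Core t / Wall t = 2581.4 s / 322.7 s ≈ 8 cores busy per wall-clock second, i.e. 4 MPI ranks × 2 OpenMP threads. The two performance figures are reciprocal views of the same rate (24 / 5.356 ns/day ≈ 4.481 hours/ns), and 10000 steps at the 2 fs timestep implied by the "20 ps" note above advanced the system 0.02 ns in 322.7 s of wall time.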

...