...
To run GROMACS with GPU support:

    % export OMP_NUM_THREADS=2
    % export KMP_AFFINITY=verbose,compact
    % mpirun -np 4 -x UCX_NET_DEVICES=mlx5_0:1 -bind-to none --map-by node:PE=$OMP_NUM_THREADS \
        mdrun_mpi -v -s stmv.tpr -nsteps 10000 -noconfout -nb gpu -pin on \
        -ntomp $OMP_NUM_THREADS

    Command line:
      mdrun_mpi -v -s stmv.tpr -nsteps 10000 -noconfout -nb gpu -pin on

    Reading file stmv.tpr, VERSION 2018.1 (single precision)
    Note: file tpx version 112, software tpx version 119
    Overriding nsteps with value passed on the command line: 10000 steps, 20 ps
    Changing nstlist from 10 to 100, rlist from 1.2 to 1.339

    On host ops003.hpcadvisorycouncil.com 2 GPUs selected for this run.
    Mapping of GPU IDs to the 2 GPU tasks in the 2 ranks on this node:
      PP:0,PP:1
    PP tasks will do (non-perturbed) short-ranged and most bonded interactions on the GPU
    PP task will update and constrain coordinates on the CPU
    Using 4 MPI processes
    Using 2 OpenMP threads per MPI process
    ...
    imb F  0% step 9900, remaining wall clock time:     3 s
    imb F  0% step 10000, remaining wall clock time:     0 s

    Dynamic load balancing report:
     DLB was off during the run due to low measured imbalance.
     Average load imbalance: 0.2%.
     The balanceable part of the MD step is 59%, load imbalance is computed from this.
     Part of the total run time spent waiting due to load imbalance: 0.1%.

                   Core t (s)   Wall t (s)        (%)
           Time:     2581.428      322.681      800.0
                     (ns/day)    (hour/ns)
    Performance:        5.356        4.481
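The performance figure at the end of the output can be pulled out of the log and sanity-checked against the wall time. The following is a minimal sketch, not part of the original instructions: the function names `ns_per_day` and `estimate_ns_per_day` are hypothetical helpers, and the log file name `md.log` in the commented usage line is an assumption (it is the usual mdrun default when no `-g` option is given). The estimate uses the 20 ps simulated time and 322.681 s wall time shown in the output above.

```shell
#!/bin/sh

# Hypothetical helper: print the ns/day value from an mdrun log.
# Assumes the log contains a line whose first field is "Performance:".
ns_per_day() {
    awk '$1 == "Performance:" { print $2 }' "$1"
}

# Hypothetical helper: estimate ns/day from simulated time (ps) and wall time (s).
# ns/day = (ps / 1000) * 86400 seconds per day / wall seconds
estimate_ns_per_day() {
    awk -v ps="$1" -v wall="$2" \
        'BEGIN { printf "%.2f\n", (ps / 1000) * 86400 / wall }'
}

# Values taken from the run above: 20 ps simulated in 322.681 s of wall time.
estimate_ns_per_day 20 322.681

# Usage against a real log (file name is an assumption):
# ns_per_day md.log
```

With the values from this run the estimate prints 5.36, which agrees with the reported 5.356 ns/day to two decimal places.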
...