...
The following tests were done with HPC-X version 2.7.0-pre and the gcc/4.8.5 and Intel 2019 compilers
OSU Point to Point Tests
...
This micro-benchmark runs on two cores only (basic latency and bandwidth).
Please use the core local to the adapter, in this case core 80.
HPC-X MPI version 2.7.0 was used
10000 iterations were used per test
OSU 5.6.2
MLNX_OFED 5.0.2
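The adapter-local cores can be read from sysfs via the HCA's `local_cpulist` attribute. A minimal sketch, assuming device `mlx5_2`; the helper `parse_cpulist` and the sample cpulist strings are illustrative, not taken from this cluster:

```python
def parse_cpulist(cpulist: str) -> list[int]:
    """Expand a kernel cpulist string such as '64-79,192-207' into core IDs."""
    cores: list[int] = []
    for part in cpulist.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cores.extend(range(int(lo), int(hi) + 1))
        else:
            cores.append(int(part))
    return cores

# On a live node the string would come from sysfs, e.g.:
#   with open("/sys/class/infiniband/mlx5_2/device/local_cpulist") as f:
#       local_cores = parse_cpulist(f.read())
print(parse_cpulist("64-95")[:3])  # first few adapter-local core IDs
```

Any core in the returned list is a reasonable `-cpu-list` choice for the single-pair tests.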
Command example:
```
mpirun -np 2 -map-by ppr:1:node -rank-by core -bind-to cpu-list:ordered -cpu-list 80 -mca pml ucx -x UCX_NET_DEVICES=mlx5_2:1 osu_latency -i 10000 -x 10000
```
Command output example on Rome (HDR):
```
$ mpirun -np 2 -map-by ppr:1:node -rank-by core -bind-to cpu-list:ordered -cpu-list 80 -mca pml ucx -x UCX_NET_DEVICES=mlx5_2:1 osu_latency -i 10000 -x 10000

# OSU MPI Latency Test v5.6.2
# Size          Latency (us)
0                       1.0907
1                       1.1007
2                       1.0907
4                       1.0907
8                       1.0907
16                      1.1007
32                      1.2817
64                      1.3126
128                     1.4631
256                     1.8570
512                     2.0494
1024                    2.4127
2048                    2.4927
4096                    3.0180
8192                    3.8744
16384                   4.9658
32768                   7.2256
65536                  10.4236
131072                 16.8519
262144                 17.3748
524288                 30.1246
1048576                55.9623
2097152               107.3084
4194304               202.4050
```
osu_bw
This is a point-to-point benchmark.
This micro-benchmark runs on two cores only.
Please use the core local to the adapter, in this case core 80.
HPC-X MPI version 2.7.0 was used
10000 iterations were used per test
OSU 5.6.2
Set NPS=1 (or 2) in the BIOS to reach line rate (more memory channels).
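For context on "line rate": HDR InfiniBand runs at 200 Gb/s, which in OSU's units (1 MB/s = 1e6 bytes/s) puts the ceiling near 25,000 MB/s, so the large-message results should approach that figure. A quick back-of-the-envelope sketch; the 200 Gb/s figure is the HDR signaling rate, and protocol overhead is ignored:

```python
def peak_mb_per_s(link_gbps: float) -> float:
    """Raw link rate expressed in OSU bandwidth units (1 MB = 1e6 bytes),
    ignoring packet headers and other protocol overhead."""
    return link_gbps * 1e9 / 8 / 1e6

hdr_ceiling = peak_mb_per_s(200)   # HDR InfiniBand: 200 Gb/s
print(hdr_ceiling)                 # 25000.0 MB/s
print(24600 / hdr_ceiling)         # a ~24,600 MB/s result is ~98% of line rate
```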
...
```
mpirun -np 2 -map-by ppr:1:node -rank-by core -bind-to cpu-list:ordered -cpu-list 80 -mca pml ucx -x UCX_NET_DEVICES=mlx5_2:1 osu_bw -i 10000 -x 10000
```
...
```
$ mpirun -np 2 -map-by ppr:1:node -rank-by core -bind-to cpu-list:ordered -cpu-list 80 -mca pml ucx -x UCX_NET_DEVICES=mlx5_2:1 osu_bw -i 10000 -x 10000

# OSU MPI Bandwidth Test v5.6.2
# Size      Bandwidth (MB/s)
1                       3.8390
2                       7.6480
4                      15.2553
8                      30.6120
16                     61.1634
32                    121.3911
64                    240.3701
128                   472.1291
256                   893.9270
512                  1506.8369
1024                 2814.1496
2048                 4286.1351
4096                 6231.4663
8192                 8516.1166
16384                9298.8336
32768               17768.4713
65536               19320.0028
131072              21140.3310
262144              23972.8289
524288              24380.9010
1048576             24532.09
2097152             24614.44
4194304             24636.01
```
osu_bibw
This is a point-to-point benchmark.
This micro-benchmark runs on two cores only.
Please use the core local to the adapter, in this case core 80.
HPC-X MPI version 2.7.0 was used
10000 iterations were used per test
OSU 5.6.2
Set NPS=1 (or 2) in the BIOS to reach line rate (more memory channels).
...
```
mpirun -np 2 -map-by ppr:1:node -rank-by core -bind-to cpu-list:ordered -cpu-list 80 -mca pml ucx -x UCX_NET_DEVICES=mlx5_2:1 osu_bibw -i 10000 -x 10000 -W 512
```
Command output example on Rome (HDR):
```
$ mpirun -np 2 -map-by ppr:1:node -rank-by core -bind-to cpu-list:ordered -cpu-list 80 -mca pml ucx -x UCX_NET_DEVICES=mlx5_2:1 osu_bibw -i 10000 -x 10000 -W 512

# OSU MPI Bi-Directional Bandwidth Test v5.6.2
# Size      Bandwidth (MB/s)
1                       6.3782
2                      12.7763
4                      25.6427
8                      51.3262
16                    102.3807
32                    202.3889
64                    306.0426
128                   594.6738
256                  1085.7645
512                  1801.3726
1024                 3183.1442
2048                 5321.6257
4096                 7584.8215
8192                10126.0753
16384               10678.7895
32768               28248.4667
65536               34234.8248
131072              38257.8792
262144              41786.1386
524288              46240.9840
1048576             48464.9397
2097152             48860.1865
4194304             49013.0690
```
osu_mbw_mr
The Multiple Bandwidth / Message Rate test creates multiple pairs of ranks that send traffic to each other. The two ranks of each pair are located on different nodes (otherwise it becomes a shared-memory test).
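Note that the two result columns are tied together: the message rate is the measured bandwidth divided by the message size (OSU uses 1 MB = 1e6 bytes). A small sketch of that relation; the sample figures are illustrative round numbers, not measurements:

```python
def message_rate(bandwidth_mb_s: float, msg_size_bytes: int) -> float:
    """Messages per second implied by a bandwidth given in OSU units (1 MB = 1e6 B)."""
    return bandwidth_mb_s * 1e6 / msg_size_bytes

# e.g. 24500 MB/s carried in 4 MiB messages is roughly 5841 messages per second
print(round(message_rate(24500, 4 * 1024 * 1024)))
```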
...
```
$ mpirun -np 128 -map-by ppr:64:node -rank-by core -bind-to cpu-list:ordered -cpu-list 64-127 -mca pml ucx -x UCX_NET_DEVICES=mlx5_2:1 osu_mbw_mr -W 512

# OSU MPI Multiple Bandwidth / Message Rate Test v5.6.2
# [ pairs: 64 ] [ window size: 512 ]
# Size                  MB/s        Messages/s
1                     130.26      130257378.09
2                     258.83      129414509.59
4                     519.04      129760593.14
8                    1027.17      128395883.76
16                   2055.53      128470323.55
32                   2952.82       92275622.75
64                   5782.98       90359007.38
128                  8340.03       65156459.45
256                 13561.93       52976275.87
512                 17913.93       34988135.77
1024                21370.83       20869955.40
2048                23158.19       11307707.78
4096                23462.98        5728265.86
8192                24260.12        2961439.84
16384               23698.72        1446455.16
32768               23653.46         721846.43
65536               23803.63         363214.64
131072              24523.31         187097.99
262144              24546.99          93639.35
524288              24557.37          46839.47
1048576             24571.09          23432.82
2097152             24579.22          11720.29
4194304             24561.79           5855.99
```
OSU Collectives
osu_barrier
...