Here is a simple example script to test RDMA verbs on two nodes.
1. Before you start, make sure you understand the server's NUMA topology. For best performance, select the NUMA node closest to the adapter.
See example here: HowTo Find the local NUMA node in AMD EPYC Servers
In this example, the NUMA node closest to the adapter is NUMA 2.
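One way to check the adapter's local NUMA node directly is to read it from sysfs. The sketch below assumes a Linux host that exposes the adapter under /sys/class/infiniband/<device> and uses the device name mlx5_0 from the script in this article. The helper name hca_numa_node and its optional second argument (an alternate sysfs root, handy for testing) are illustrative, not part of any tool.

```shell
#!/bin/sh
# hca_numa_node DEVICE [SYSFS_ROOT]
# Print the NUMA node the kernel reports for an RDMA device.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
hca_numa_node() {
    hnn_dev=$1
    hnn_root=${2:-/sys}
    hnn_file="$hnn_root/class/infiniband/$hnn_dev/device/numa_node"
    if [ ! -r "$hnn_file" ]; then
        echo "cannot read $hnn_file" >&2
        return 1
    fi
    cat "$hnn_file"
}

# Example (on a node with the adapter installed):
#   hca_numa_node mlx5_0
```

A value of -1 means the kernel has no NUMA affinity information for the device, in which case the topology has to be determined another way.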
2. Get access to two nodes in the cluster. If you are using Slurm, you can allocate two nodes with salloc (here from the venus partition).
$ salloc -N 2 -p venus
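Once the allocation is granted, the name of the second node can be derived instead of typed by hand. The sketch below is one possible approach: it assumes the shell is running on one of the allocated nodes, that Slurm node names match the short hostnames, and that scontrol is on the PATH. The helper name pick_peer is hypothetical.

```shell
#!/bin/sh
# pick_peer LOCAL_HOST
# Read one hostname per line on stdin and print the first one that
# differs from LOCAL_HOST. Returns non-zero if no other host is found.
pick_peer() {
    while IFS= read -r pp_host; do
        if [ -n "$pp_host" ] && [ "$pp_host" != "$1" ]; then
            printf '%s\n' "$pp_host"
            return 0
        fi
    done
    return 1
}

# Inside the salloc shell, expand the compact node list and pick the peer:
#   DEST=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | pick_peer "$(hostname -s)")
```

If salloc leaves the shell on a login node rather than on an allocated node, the local host is not part of the list and the peer has to be chosen explicitly.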
3. Run the RDMA verbs benchmarks manually or use the following script, which binds the processes on both nodes to the local NUMA node (NUMA 2).
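TAG=$(date +%Y%m%d-%H%M%S)
DEST=venus002                          # remote node, runs the client side
SRC=$HOSTNAME                          # local node, runs the server side
HCA="-d mlx5_0 -i 1"                   # adapter and port
BINDS="numactl --cpunodebind=2 "       # NUMA binding for the server side
BINDC="numactl --cpunodebind=2 "       # NUMA binding for the client side
LTFLAGS="-a $HCA -F "                  # latency tests: all message sizes
BWFLAGS="-a $HCA --report_gbits -F "   # bandwidth tests: report in Gbit/s

# Each test starts the server locally in the background, then runs the
# client on $DEST over ssh. Each test writes to its own log file.
$BINDS ib_write_lat $LTFLAGS &
ssh $DEST "$BINDC ib_write_lat $LTFLAGS $SRC " 2>&1 | tee log-write-lat-$TAG.txt

$BINDS ib_write_bw $BWFLAGS &
ssh $DEST "$BINDC ib_write_bw $BWFLAGS $SRC " 2>&1 | tee log-write-bw-$TAG.txt

$BINDS ib_read_lat $LTFLAGS &
ssh $DEST "$BINDC ib_read_lat $LTFLAGS $SRC " 2>&1 | tee log-read-lat-$TAG.txt

$BINDS ib_read_bw $BWFLAGS &
ssh $DEST "$BINDC ib_read_bw $BWFLAGS $SRC " 2>&1 | tee log-read-bw-$TAG.txt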
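Before running the benchmarks, it can help to review the exact command pairs that will be executed. The following is a minimal dry-run sketch that only prints the server and client commands for the four tests and runs nothing. The function name print_test_plan and the log-<test>-<tag>.txt naming are illustrative choices; the adapter, port, and NUMA node are the values used in this article.

```shell
#!/bin/sh
# print_test_plan SRC DEST TAG
# Print (do not run) the server/client command pair for each benchmark.
print_test_plan() {
    ptp_src=$1
    ptp_dest=$2
    ptp_tag=$3
    ptp_hca="-d mlx5_0 -i 1"
    ptp_bind="numactl --cpunodebind=2"
    for ptp_t in ib_write_lat ib_write_bw ib_read_lat ib_read_bw; do
        # Bandwidth tests additionally report results in Gbit/s.
        case $ptp_t in
            *_bw) ptp_flags="-a $ptp_hca --report_gbits -F" ;;
            *)    ptp_flags="-a $ptp_hca -F" ;;
        esac
        echo "server: $ptp_bind $ptp_t $ptp_flags &"
        echo "client: ssh $ptp_dest \"$ptp_bind $ptp_t $ptp_flags $ptp_src\" 2>&1 | tee log-$ptp_t-$ptp_tag.txt"
    done
}

# Example:
#   print_test_plan "$HOSTNAME" venus002 "$(date +%Y%m%d-%H%M%S)"
```

Printing the plan first makes it easy to confirm the device, port, and NUMA binding on both sides before any traffic is generated.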