Testing RDMA Message Rate
This post describes how to test the RDMA message rate between two nodes.
Setup example: two nodes connected with an InfiniBand HDR/HDR100 link.
For best performance, create a script that runs the test at full PPN (one process per core) or at 50% PPN.
Things to check before running:
Number of cores (PPN) on the node
NUMA locality of the adapter (see the commands below)
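A quick way to check both (a minimal sketch, assuming the adapter shows up as mlx5_0; replace the device name with yours):
# Core count on the node (basis for full or 50% PPN)
nproc
# NUMA node the HCA is attached to
cat /sys/class/infiniband/mlx5_0/device/numa_node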
Server side:
Run the following script, for example on a 32-core node:
Message size of 2 bytes (with WQE inline enabled)
Run an ib_write_bw server instance on each core, using -p to give each instance a unique port
for i in {0..31}
do
numactl --physcpubind=$i ib_write_bw -s 2 -d mlx5_0 -i 1 --report_gbits -F -D 20 --inline_size=2 -c RC -p $((10000+$i)) --output=message_rate &
done
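Optionally, before starting the clients, verify that all 32 server instances are up (assuming no other ib_write_bw processes are running on this node):
pgrep -c ib_write_bw    # expect 32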
Client side:
Run the following script, for example on a 32-core node (replace thor001 with your server's hostname):
Message size of 2 bytes (with WQE inline enabled)
Run an ib_write_bw client instance on each core, using -p to match the corresponding server port
Collect the message rate results in a file and sum them
rm -f file.out
for i in {0..31}
do
numactl --physcpubind=$i ib_write_bw -s 2 -d mlx5_0 -i 1 --report_gbits -F thor001 -D 20 --output=message_rate --inline_size=2 -c RC -p $((10000+$i)) | awk '{ print $1 }' >> file.out &
done
wait
cat file.out | awk '{ SUM += $1} END { print "RDMA Message Rate =" SUM }'
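The values printed with --output=message_rate correspond to perftest's MsgRate column, so the sum is the aggregate message rate in Mpps (millions of messages per second). After the run, any leftover server instances can be cleaned up on the server node, assuming nothing else there uses ib_write_bw:
pkill ib_write_bw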
Note: for HDR Socket Direct, you will need to map each network device to the cores of its local NUMA node.
This is an example from our Helios cluster.
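To check the device-to-NUMA mapping on your own nodes, compare the sysfs entries with the lscpu NUMA layout; this is a sketch assuming the two Socket Direct halves show up as mlx5_0 and mlx5_2, as in the example below:
# NUMA node of each half of the Socket Direct adapter
cat /sys/class/infiniband/mlx5_0/device/numa_node
cat /sys/class/infiniband/mlx5_2/device/numa_node
# Core ranges owned by each NUMA node
lscpu | grep 'NUMA node.* CPU(s)'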
Server side:
# Using mlx5_0 device for the first 20 cores
for i in {0..19}
do
numactl --physcpubind=$i ib_write_bw -s 2 -d mlx5_0 -i 1 --report_gbits -F -D 20 --inline_size=2 -c RC -p $((10000+$i)) --output=message_rate &
done
# Using mlx5_2 device for the second 20 cores
for i in {20..39}
do
numactl --physcpubind=$i ib_write_bw -s 2 -d mlx5_2 -i 1 --report_gbits -F -D 20 --inline_size=2 -c RC -p $((10000+$i)) --output=message_rate &
done
Client side:
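The client side mirrors the server's device/core split, with each instance pointed at the server node. Below is a sketch; <server-hostname> is a placeholder for the Helios server node's hostname, and the ports, devices, and core ranges must match the server script above:
# Set to the hostname of the server node (placeholder)
SERVER=<server-hostname>
rm -f file.out
# Using mlx5_0 device for the first 20 cores
for i in {0..19}
do
numactl --physcpubind=$i ib_write_bw -s 2 -d mlx5_0 -i 1 --report_gbits -F $SERVER -D 20 --output=message_rate --inline_size=2 -c RC -p $((10000+$i)) | awk '{ print $1 }' >> file.out &
done
# Using mlx5_2 device for the second 20 cores
for i in {20..39}
do
numactl --physcpubind=$i ib_write_bw -s 2 -d mlx5_2 -i 1 --report_gbits -F $SERVER -D 20 --output=message_rate --inline_size=2 -c RC -p $((10000+$i)) | awk '{ print $1 }' >> file.out &
done
wait
cat file.out | awk '{ SUM += $1} END { print "RDMA Message Rate =" SUM }'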