HowTo Find the local NUMA node in AMD EPYC Servers
Basic information
1. NUMA
How to check the mapping of the Network adapter to the NUMA node?
Each EPYC CPU in these examples exposes 4 NUMA nodes, so a 2-socket server has 8 NUMA nodes.
Here is an example from our Venus cluster.
$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                64
On-line CPU(s) list:   0-63
Thread(s) per core:    1
Core(s) per socket:    32
Socket(s):             2
NUMA node(s):          8
Vendor ID:             AuthenticAMD
CPU family:            23
Model:                 1
Model name:            AMD EPYC 7551 32-Core Processor
Stepping:              2
CPU MHz:               2000.000
CPU max MHz:           2000.0000
CPU min MHz:           1200.0000
BogoMIPS:              3999.39
Virtualization:        AMD-V
L1d cache:             32K
L1i cache:             64K
L2 cache:              512K
L3 cache:              8192K
NUMA node0 CPU(s):     0-7
NUMA node1 CPU(s):     8-15
NUMA node2 CPU(s):     16-23
NUMA node3 CPU(s):     24-31
NUMA node4 CPU(s):     32-39
NUMA node5 CPU(s):     40-47
NUMA node6 CPU(s):     48-55
NUMA node7 CPU(s):     56-63
Note that other OEMs may map CPU numbers to NUMA nodes differently. Here is another example, from a single-socket Dell R7415 equipped with an EPYC 7551, where the CPU numbers are interleaved across the four nodes.
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
...
Thread(s) per core:    1
Core(s) per socket:    32
Socket(s):             1
NUMA node(s):          4
Vendor ID:             AuthenticAMD
...
Model name:            AMD EPYC 7551 32-Core Processor
...
CPU MHz:               1996.203
...
NUMA node0 CPU(s):     0,4,8,12,16,20,24,28
NUMA node1 CPU(s):     1,5,9,13,17,21,25,29
NUMA node2 CPU(s):     2,6,10,14,18,22,26,30
NUMA node3 CPU(s):     3,7,11,15,19,23,27,31
...
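Because the CPU-to-node mapping differs between the two servers above, it can help to extract it from lscpu rather than assume a layout. The following is a minimal sketch, not part of the original procedure: the function name numa_cpu_map is hypothetical, and it assumes the English "NUMA nodeN CPU(s):" line format shown above.

```shell
#!/usr/bin/env bash
# Print "<node> <cpu list>" for every "NUMA nodeN CPU(s):" line read on stdin.
# The summary line "NUMA node(s): 8" is skipped because it has no node number.
numa_cpu_map() {
    awk '/^NUMA node[0-9]+ CPU\(s\):/ {
        node = $2
        sub(/^node/, "", node)   # "node2" -> "2"
        print node, $NF          # last field is the CPU list
    }'
}

# Usage on a live system (not run here):
#   lscpu | numa_cpu_map
```

Each output line holds a node number followed by its CPU list, which is convenient input for a pinning script.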
2. Check the adapter you have:
$ sudo lspci | grep Mel
21:00.0 Infiniband controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
21:00.1 Infiniband controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
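The first column of the lspci output is the PCI address of each adapter port. As a small sketch (the function name mlx_pci_addrs is hypothetical), the addresses can be pulled out of lspci-style text like this:

```shell
#!/usr/bin/env bash
# Print the PCI address (first column) of every Mellanox device found in
# lspci-style text read on stdin.
mlx_pci_addrs() {
    awk '/Mellanox/ { print $1 }'
}

# Usage on a live system (not run here):
#   lspci | mlx_pci_addrs
#   lspci -d 15b3: | mlx_pci_addrs   # 15b3 is the Mellanox PCI vendor ID
```

Matching on the vendor ID with lspci -d is likely more robust than matching on the vendor name, since it does not depend on how the description string is worded.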
3. Check which NUMA node is connected to the network adapter:
There are several ways to do this.
Using MST tools:
$ ibdev2netdev
mlx5_0 port 1 ==> ib0 (Up)
mlx5_1 port 1 ==> ib1 (Down)
$ sudo mst start
...
$ sudo mst status -v
MST modules:
------------
    MST PCI module is not loaded
    MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE         MST                           PCI       RDMA     NET       NUMA
ConnectX5(rev:0)    /dev/mst/mt4121_pciconf0.1    21:00.1   mlx5_1   net-ib1   2
ConnectX5(rev:0)    /dev/mst/mt4121_pciconf0      21:00.0   mlx5_0   net-ib0   2
Using the net device:
$ cat /sys/class/net/ib0/device/numa_node
2
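The same sysfs directory normally also carries a local_cpulist file listing the CPUs that belong to the device's NUMA node. The sketch below reads both files. The function name netdev_numa_info and the SYSFS_ROOT override are hypothetical, introduced only so the function can be tried against a fake directory tree.

```shell
#!/usr/bin/env bash
# Print the NUMA node and the node-local CPUs of a network device.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
netdev_numa_info() {
    local dev="$1"
    local base="${SYSFS_ROOT:-/sys}/class/net/${dev}/device"
    local node cpus

    if [ ! -r "${base}/numa_node" ]; then
        echo "no NUMA information for ${dev}" >&2
        return 1
    fi
    node="$(cat "${base}/numa_node")"
    # local_cpulist may be absent; fall back to "unknown" in that case.
    cpus="$(cat "${base}/local_cpulist" 2>/dev/null || echo unknown)"

    # The kernel reports -1 when the platform provides no NUMA affinity.
    if [ "${node}" = "-1" ]; then
        echo "${dev}: no NUMA affinity reported"
    else
        echo "${dev}: NUMA node ${node}, local CPUs ${cpus}"
    fi
}

# Usage on a live system (not run here):
#   netdev_numa_info ib0
```

On the 2-socket server shown earlier, node 2 corresponds to CPUs 16-23 according to the lscpu output, so that is the CPU list one would expect for ib0 there.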
Note: Knowing which NUMA node is connected to the network adapter is important for performance tuning, because processes and memory can then be bound to the node that is local to the adapter.
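One common way to apply this is to launch the network application under numactl, bound to the adapter's node. The following is a sketch under stated assumptions rather than a recommended tuning recipe: the function name numa_local_cmd, the SYSFS_ROOT override and the program name ./my_benchmark are all hypothetical, and the function only prints the command line instead of running it.

```shell
#!/usr/bin/env bash
# Print a numactl command line that pins a program's CPUs and memory to the
# NUMA node local to the given network device.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
numa_local_cmd() {
    local dev="$1"
    shift
    local node_file="${SYSFS_ROOT:-/sys}/class/net/${dev}/device/numa_node"
    local node

    node="$(cat "${node_file}" 2>/dev/null)" || return 1
    # -1 means the kernel has no affinity for the device: leave it unpinned.
    if [ "${node}" -lt 0 ]; then
        echo "$*"
    else
        echo "numactl --cpunodebind=${node} --membind=${node} $*"
    fi
}

# Usage on a live system (not run here; ./my_benchmark is a placeholder):
#   numa_local_cmd ib0 ./my_benchmark
```

Printing the command first makes it easy to inspect the binding before running it. Both --cpunodebind and --membind take the node number reported in numa_node.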
Another option is to use lstopo-no-graphics.
In the output, the two-port adapter appears under NUMA node 2.
$ lstopo-no-graphics
Machine (256GB total)
  Package L#0
    NUMANode L#0 (P#0 32GB)
      L3 L#0 (8192KB)
        L2 L#0 (512KB) + L1d L#0 (32KB) + L1i L#0 (64KB) + Core L#0 + PU L#0 (P#0)
        L2 L#1 (512KB) + L1d L#1 (32KB) + L1i L#1 (64KB) + Core L#1 + PU L#1 (P#1)
        L2 L#2 (512KB) + L1d L#2 (32KB) + L1i L#2 (64KB) + Core L#2 + PU L#2 (P#2)
        L2 L#3 (512KB) + L1d L#3 (32KB) + L1i L#3 (64KB) + Core L#3 + PU L#3 (P#3)
      L3 L#1 (8192KB)
        L2 L#4 (512KB) + L1d L#4 (32KB) + L1i L#4 (64KB) + Core L#4 + PU L#4 (P#4)
        L2 L#5 (512KB) + L1d L#5 (32KB) + L1i L#5 (64KB) + Core L#5 + PU L#5 (P#5)
        L2 L#6 (512KB) + L1d L#6 (32KB) + L1i L#6 (64KB) + Core L#6 + PU L#6 (P#6)
        L2 L#7 (512KB) + L1d L#7 (32KB) + L1i L#7 (64KB) + Core L#7 + PU L#7 (P#7)
      HostBridge L#0
        PCIBridge
          PCIBridge
            PCI 1a03:2000
              GPU L#0 "card0"
              GPU L#1 "controlD64"
    NUMANode L#1 (P#1 32GB)
      L3 L#2 (8192KB)
        L2 L#8 (512KB) + L1d L#8 (32KB) + L1i L#8 (64KB) + Core L#8 + PU L#8 (P#8)
        L2 L#9 (512KB) + L1d L#9 (32KB) + L1i L#9 (64KB) + Core L#9 + PU L#9 (P#9)
        L2 L#10 (512KB) + L1d L#10 (32KB) + L1i L#10 (64KB) + Core L#10 + PU L#10 (P#10)
        L2 L#11 (512KB) + L1d L#11 (32KB) + L1i L#11 (64KB) + Core L#11 + PU L#11 (P#11)
      L3 L#3 (8192KB)
        L2 L#12 (512KB) + L1d L#12 (32KB) + L1i L#12 (64KB) + Core L#12 + PU L#12 (P#12)
        L2 L#13 (512KB) + L1d L#13 (32KB) + L1i L#13 (64KB) + Core L#13 + PU L#13 (P#13)
        L2 L#14 (512KB) + L1d L#14 (32KB) + L1i L#14 (64KB) + Core L#14 + PU L#14 (P#14)
        L2 L#15 (512KB) + L1d L#15 (32KB) + L1i L#15 (64KB) + Core L#15 + PU L#15 (P#15)
      HostBridge L#3
        PCIBridge
          PCI 8086:1521
            Net L#2 "eth0"
          PCI 8086:1521
            Net L#3 "eno2"
          PCI 8086:1521
            Net L#4 "eno3"
          PCI 8086:1521
            Net L#5 "eno4"
        PCIBridge
          PCI 1022:7901
            Block(Disk) L#6 "sda"
    NUMANode L#2 (P#2 32GB)
      L3 L#4 (8192KB)
        L2 L#16 (512KB) + L1d L#16 (32KB) + L1i L#16 (64KB) + Core L#16 + PU L#16 (P#16)
        L2 L#17 (512KB) + L1d L#17 (32KB) + L1i L#17 (64KB) + Core L#17 + PU L#17 (P#17)
        L2 L#18 (512KB) + L1d L#18 (32KB) + L1i L#18 (64KB) + Core L#18 + PU L#18 (P#18)
        L2 L#19 (512KB) + L1d L#19 (32KB) + L1i L#19 (64KB) + Core L#19 + PU L#19 (P#19)
      L3 L#5 (8192KB)
        L2 L#20 (512KB) + L1d L#20 (32KB) + L1i L#20 (64KB) + Core L#20 + PU L#20 (P#20)
        L2 L#21 (512KB) + L1d L#21 (32KB) + L1i L#21 (64KB) + Core L#21 + PU L#21 (P#21)
        L2 L#22 (512KB) + L1d L#22 (32KB) + L1i L#22 (64KB) + Core L#22 + PU L#22 (P#22)
        L2 L#23 (512KB) + L1d L#23 (32KB) + L1i L#23 (64KB) + Core L#23 + PU L#23 (P#23)
      HostBridge L#6
        PCIBridge
          PCI 15b3:1019
            Net L#7 "ib0_mlx5"
            OpenFabrics L#8 "mlx5_0"
          PCI 15b3:1019
            Net L#9 "ib0"
            OpenFabrics L#10 "mlx5_1"
    NUMANode L#3 (P#3 32GB)
      L3 L#6 (8192KB)