InfiniBand is one of the most popular interconnect standards in HPC. The InfiniBand standard defines a variety of routing algorithms, which are configured via the Subnet Manager (SM). The InfiniBand architecture supports deterministic routing, which may prevent packets from using alternative paths when the requested output port is busy and may therefore degrade network performance.
Adaptive Routing (AR) algorithms dynamically select the route of a packet based on the availability of the network switches to deliver it. AR is controlled by the Subnet Manager (SM), while the routing decision itself is made by the switch. The goal is the lowest latency and the maximum bandwidth accumulated over all pairs in the network, and therefore the highest possible network efficiency.
This post discusses the need for Adaptive Routing in HPC and provides configuration examples and performance analysis.
Overview
What is Adaptive Routing?
Adaptive Routing is the ability of the network switches to dynamically select the best route for each packet based on queue size, latency, and available bandwidth.
Configuration
Refer to the example here:
OpenSM Configuration for AR
Get the current OpenSM configuration (opensm -c <filename>), and check the following parameters.
# Routing engine
# Multiple routing engines can be specified separated by
# commas so that specific ordering of routing algorithms will
# be tried if earlier routing engines fail.
# Supported engines: minhop, updn, dnup, file, ftree, lash,
#    dor, torus-2QoS, kdor-hc, dfsssp (EXPERIMENTAL),
#    sssp (EXPERIMENTAL), chain,
#    pqft (EXPERIMENTAL),
#    dfp, ar_updn, ar_ftree, ar_torus, ar_dor (AR)
routing_engine (null)

# AR SL mask - 16 bit bitmask indicating which SLs should be configured for AR
ar_sl_mask 0xFFFF

# Enable adaptive routing only to devices that support packet reordering.
# When enabled, state in ARLFT entries for devices which does not support packet
# reordering is set to static.
# When disabled, ARLFT entries remains as determined by routing engine.
enable_ar_by_device_cap TRUE

# Advanced routing - Adaptive routing mode
# Supported values:
# 0 - Adaptive routing disabled.
# 1 - Enable adaptive routing.
# 2 - Enable adaptive routing with notifications.
# 3 - Auto mode in which adaptive routing is determined by routing engine.
ar_mode 3

# Advanced routing - Advanced routing engine
# Supported values:
# none - advanced routing is not enabled.
# ar_lag - Ports groups are created out of "parallel" links. Links that connect the same pair of switches.
# ar_tree - All the ports with minimal hops to destination are in the same group. Must run together with UPDN routing engine.
# auto - the advanced routing engine is selected based on routing engine. Works for ar_updn, ar_ftree, ar_torus, ar_dor engines.
adv_routing_engine auto

# AR Transport mask - indicates which transport types are enabled for AR
# Bit 0 = UD, Bit 1 = RC, Bit 2 = UC, Bit 3 = DCT, Bits 4-7 are reserved.
ar_transport_mask 0x000A
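To make the ar_transport_mask value easier to read, here is a minimal Python sketch that decodes the mask into transport names. It relies only on the bit layout given in the configuration comment (bit 0 = UD, bit 1 = RC, bit 2 = UC, bit 3 = DCT). The function name decode_transport_mask is hypothetical and is not part of OpenSM.

```python
# Bit positions as listed in the opensm.conf comment for ar_transport_mask.
TRANSPORT_BITS = {0: "UD", 1: "RC", 2: "UC", 3: "DCT"}


def decode_transport_mask(mask):
    """Return the transport types whose bit is set in the mask (hypothetical helper)."""
    return [name for bit, name in TRANSPORT_BITS.items() if mask & (1 << bit)]


print(decode_transport_mask(0x000A))  # ['RC', 'DCT']
```

With the value 0x000A shown above, bits 1 and 3 are set, so AR applies to the RC and DCT transports.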
Enable AR
One option to enable AR is to set the routing engine to one that supports AR.
For example, if you have a fat-tree topology, change opensm.conf as follows:
routing_engine ar_ftree
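If you prefer to script this change, the following Python sketch rewrites one option in the text of a configuration file. It assumes the "name value" line layout seen in the configuration excerpt, and the helper set_option is hypothetical. It only transforms text; writing the file back and restarting OpenSM are left out.

```python
import re


def set_option(conf_text, key, value):
    """Set `key` to `value` in opensm.conf-style text (hypothetical helper).

    Replaces an existing "key value" line, or appends one if the key is absent.
    Comment lines and other keys that merely contain `key` are left untouched.
    """
    pattern = re.compile(rf"^{re.escape(key)}\s+.*$", flags=re.MULTILINE)
    line = f"{key} {value}"
    if pattern.search(conf_text):
        # A callable replacement avoids backslash interpretation in `line`.
        return pattern.sub(lambda _match: line, conf_text)
    return conf_text.rstrip("\n") + "\n" + line + "\n"


sample = "# Routing engine\nrouting_engine (null)\nadv_routing_engine auto\n"
print(set_option(sample, "routing_engine", "ar_ftree"))
```

Anchoring the pattern at the start of the line keeps adv_routing_engine unchanged when only routing_engine is being set.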
Enable AR per SL
By default, AR is enabled on all SLs. To enable AR per SL, set the following bitmask parameter:
In the following example AR is disabled on SL0 and SL2.
ar_sl_mask 0xFFFA # 0xA = 1010 binary: SL1 and SL3 enabled with AR, SL0 and SL2 disabled
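The mask arithmetic can be checked with a short Python sketch. It assumes what the configuration comment states: ar_sl_mask is a 16-bit bitmask in which bit N corresponds to SL N. The helper build_ar_sl_mask is hypothetical and not an OpenSM tool.

```python
def build_ar_sl_mask(disabled_sls):
    """Return a 16-bit ar_sl_mask with the given SLs (0-15) cleared."""
    mask = 0xFFFF  # default: AR enabled on all 16 SLs
    for sl in disabled_sls:
        if not 0 <= sl <= 15:
            raise ValueError(f"SL out of range: {sl}")
        mask &= ~(1 << sl)  # clear the bit for this SL
    return mask


print(f"ar_sl_mask 0x{build_ar_sl_mask([0, 2]):04X}")  # ar_sl_mask 0xFFFA
```

Clearing bits 0 and 2 of 0xFFFF gives 0xFFFA, which matches the example where AR is disabled on SL0 and SL2.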
Validation
smparquery
To check that AR is enabled on a switch (identified by the switch LID), use smparquery (supported on OFED 5.x).
Use, for example, ibswitches to get the switch LIDs in your network.
$ sudo ibswitches -C mlx5_1
Switch : 0xb8599f0300fccaac ports 81 "MF0;thor-qm8700-2:MQM8700/U1" enhanced port 0 lid 136 lmc 0
Switch : 0xb8599f0300fcca6c ports 41 "MF0;thor-qm8700-4:MQM8700/U1" enhanced port 0 lid 118 lmc 0
Switch : 0xb8599f0300df8faa ports 41 "Quantum Mellanox Technologies" base port 0 lid 375 lmc 0
Switch : 0xb8599f0300fcca4c ports 81 "MF0;thor-qm8700-3:MQM8700/U1" enhanced port 0 lid 105 lmc 0
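When there are many switches, the LIDs can be pulled out of the ibswitches output with a few lines of Python. This is a sketch that assumes the line layout shown in the sample output (a GUID after "Switch :" and a "lid N" field). The function switch_lids is hypothetical, and other tool versions may print a different layout.

```python
import re

# Two lines copied from the sample output above.
SAMPLE = '''Switch : 0xb8599f0300fccaac ports 81 "MF0;thor-qm8700-2:MQM8700/U1" enhanced port 0 lid 136 lmc 0
Switch : 0xb8599f0300fcca6c ports 41 "MF0;thor-qm8700-4:MQM8700/U1" enhanced port 0 lid 118 lmc 0
'''

SWITCH_LINE = re.compile(r"Switch\s*:\s*(0x[0-9a-fA-F]+)\b.*\blid\s+(\d+)")


def switch_lids(ibswitches_output):
    """Map switch GUID to LID for every 'Switch' line in the output."""
    lids = {}
    for line in ibswitches_output.splitlines():
        match = SWITCH_LINE.match(line)
        if match:
            lids[match.group(1)] = int(match.group(2))
    return lids


print(sorted(switch_lids(SAMPLE).values()))  # [118, 136]
```

Each resulting LID can then be passed to smparquery with the -L option.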
Use smparquery ARInfo to get the AR information for the LID.
In this example AR is enabled on the switch (see the AR Status field).
$ sudo smparquery ARInfo -L 136 -C mlx5_1
op = ARInfo, dest = 136, rest =
-I- Getting ARInfo from lid=136
-I- ARInfo:
AR Status.........................Enabled
Is ARN Supported..................Yes
Is FRN Supported..................Yes
Is FR Supported...................Yes
FR Enabled........................Yes
RN Xmit Enabled...................Yes
AR Sub Groups Active..............0
AR Groups Copy Supported..........15
Direction Num Supported...........4
AR Fallback.......................Enabled
AR IS4 mode.......................No
AR Glb Group......................Yes
AR By SL Cap......................Yes
AR By Transport Cap...............Yes
AR Dynamic Cap Calc...............Yes
AR Group Capability...............1792
AR Group Top......................1
AR Group Table Capability.........1
RN String Width Capability........3
AR Sub Groups Capability..........0x3
AR Version........................2
RN Version........................0
AR By SL Mask Enable..............0 (All SLs enabled)
AR SL Mask........................N/A
AR By TransportDisable............0x5
AR Ageing Time Value..............0
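To check the AR status across several switches, the ARInfo report can be parsed into a dictionary. The Python sketch below assumes the "Name.....Value" layout of the sample output, with one field per line. The function parse_arinfo is hypothetical, and the field names may differ between OFED releases.

```python
import re

# A few lines copied from the sample output above.
SAMPLE = """op = ARInfo, dest = 136, rest =
-I- Getting ARInfo from lid=136
-I- ARInfo:
AR Status.........................Enabled
AR By SL Mask Enable..............0 (All SLs enabled)
AR By TransportDisable............0x5
"""

# Field name, then a run of two or more dots, then the value.
FIELD_LINE = re.compile(r"^\s*(.+?)\.{2,}(.*)$")


def parse_arinfo(output):
    """Parse 'Name.....Value' lines into a dict, skipping all other lines."""
    fields = {}
    for line in output.splitlines():
        match = FIELD_LINE.match(line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


info = parse_arinfo(SAMPLE)
print(info["AR Status"])  # Enabled
```

A check such as info.get("AR Status") == "Enabled" can then be run for each switch LID.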
ibdiagnet
Refer to the example here: