Adaptive Routing
InfiniBand is one of the most popular interconnect standards in HPC. The InfiniBand standard defines a variety of routing algorithms that are configured via the Subnet Manager (SM). The InfiniBand architecture supports deterministic routing, which can prevent packets from taking alternative paths when the requested output port is busy and may therefore degrade network performance.
Adaptive Routing (AR) algorithms dynamically select the route of each packet based on the availability of the network switches to deliver it. AR is configured through the Subnet Manager (SM), while the switches themselves make the routing decisions, aiming for the lowest latency and the maximum aggregate bandwidth over all node pairs, and thus the highest possible network efficiency.
This post discusses the need for Adaptive Routing in HPC and provides configuration examples and a performance analysis.
Overview
What is Adaptive Routing?
Adaptive Routing is the ability of the network switches to dynamically select the best route for each packet based on queue size, latency, and available bandwidth.
Configuration
Refer to the example here:
OpenSM Configuration for AR
Get the current OpenSM configuration (opensm -c <filename>), and check the following parameters.
# Routing engine
# Multiple routing engines can be specified separated by
# commas so that specific ordering of routing algorithms will
# be tried if earlier routing engines fail.
# Supported engines: minhop, updn, dnup, file, ftree, lash,
# dor, torus-2QoS, kdor-hc, dfsssp (EXPERIMENTAL),
# sssp (EXPERIMENTAL), chain,
# pqft (EXPERIMENTAL),
# dfp, ar_updn, ar_ftree, ar_torus, ar_dor (AR)
routing_engine (null)
# AR SL mask - 16 bit bitmask indicating which SLs should be configured for AR
ar_sl_mask 0xFFFF
# Enable adaptive routing only to devices that support packet reordering.
# When enabled, state in ARLFT entries for devices which does not support packet
# reordering is set to static.
# When disabled, ARLFT entries remains as determined by routing engine.
enable_ar_by_device_cap TRUE
# Advanced routing - Adaptive routing mode
# Supported values:
# 0 - Adaptive routing disabled.
# 1 - Enable adaptive routing.
# 2 - Enable adaptive routing with notifications.
# 3 - Auto mode in which adaptive routing is determined by routing engine.
ar_mode 3
# Advanced routing - Advanced routing engine
# Supported values:
# none - advanced routing is not enabled.
# ar_lag - Ports groups are created out of "parallel" links. Links that connect the same pair of switches.
# ar_tree - All the ports with minimal hops to destination are in the same group. Must run together with UPDN routing engine.
# auto - the advanced routing engine is selected based on routing engine. Works for ar_updn, ar_ftree, ar_torus, ar_dor engines.
adv_routing_engine auto
# AR Transport mask - indicates which transport types are enabled for AR
# Bit 0 = UD, Bit 1 = RC, Bit 2 = UC, Bit 3 = DCT, Bits 4-7 are reserved.
ar_transport_mask 0x000A
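As a quick way to read the ar_transport_mask value, the following minimal Python sketch decodes the mask using the bit layout given in the opensm.conf comment above (Bit 0 = UD, Bit 1 = RC, Bit 2 = UC, Bit 3 = DCT). The function name decode_transport_mask is hypothetical and is not part of OpenSM.

```python
# Bit positions copied from the opensm.conf comment; bits 4-7 are reserved.
TRANSPORT_BITS = {0: "UD", 1: "RC", 2: "UC", 3: "DCT"}


def decode_transport_mask(mask):
    """Return the names of the transports whose bit is set in the mask."""
    return [name for bit, name in TRANSPORT_BITS.items() if mask & (1 << bit)]


print(decode_transport_mask(0x000A))  # ['RC', 'DCT']
```

With the value 0x000A shown above, bits 1 and 3 are set, so AR applies to RC and DCT traffic.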
Enable AR
One option to enable AR is to set the routing engine to one that supports AR.
For example, for a fat-tree topology, change opensm.conf as follows:
routing_engine ar_ftree
In addition, set the root GUID file.
For example, set the following in opensm.conf:
root_guid_file /etc/opensm/root_guid.cfg
Create the file and list the GUIDs of the root switches in it, one per line.
Enable AR per SL
By default, AR is enabled on all SLs. To enable AR per SL, set the ar_sl_mask bitmap parameter in opensm.conf.
For example, a mask of 0xFFFA disables AR on SL0 and SL2.
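The mask value for a given set of SLs can be worked out with a few lines of Python. This is a minimal sketch that assumes bit n of ar_sl_mask corresponds to SL n, which is consistent with the 0xFFFA value reported for this setup in the validation section. The helper name ar_sl_mask is hypothetical and simply mirrors the parameter name.

```python
def ar_sl_mask(disabled_sls, num_sls=16):
    """Build the 16-bit mask with AR enabled on every SL except those listed.

    Assumption: bit n of the mask corresponds to SL n.
    """
    mask = (1 << num_sls) - 1           # 0xFFFF: AR enabled on all 16 SLs
    for sl in disabled_sls:
        if not 0 <= sl < num_sls:
            raise ValueError(f"SL out of range: {sl}")
        mask &= ~(1 << sl)              # clear the bit for this SL
    return mask


mask = ar_sl_mask([0, 2])
print(f"ar_sl_mask 0x{mask:04X}")       # ar_sl_mask 0xFFFA
```

The printed line has the same form as the ar_sl_mask entry in opensm.conf.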
UCX Support
Firmware 12.29.0356 added support for a per-vport ooo_sl_mask, which UCX now uses.
With that, UCX can request AR support through UCX_IB_AR_ENABLE={yes | no | try}. The default value is “try”.
UCX_IB_AR_ENABLE=yes strictly selects an SL with AR; if there are no SLs with AR, UCX fails.
UCX_IB_AR_ENABLE=try selects an SL with AR; if there are no SLs with AR, it selects the first available SL (or the one specified by UCX_IB_SL=<sl>).
UCX_IB_AR_ENABLE=no strictly selects an SL without AR; if there are no SLs without AR, UCX fails.
A specific SL can still be requested by the user through UCX_IB_SL=[0..15].
If UCX_IB_AR_ENABLE=yes or UCX_IB_AR_ENABLE=no is requested on older firmware, UCX fails because it cannot detect ooo_sl_mask.
The feature is part of the UCX v1.10 release.
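The selection rules above can be summarized in a small Python model. This is only an illustrative sketch of the behavior described in this section, not UCX source code. The function select_sl and its arguments are hypothetical, "first available SL" is assumed to be SL 0, and the handling of an explicitly requested SL in yes/no mode (fail if it does not match) is also an assumption.

```python
def select_sl(ar_enable, ooo_sl_mask, requested_sl=None):
    """Model of the described UCX_IB_AR_ENABLE behavior (not UCX code).

    ar_enable:    "yes", "no" or "try"
    ooo_sl_mask:  16-bit mask of SLs with AR, or None when the firmware
                  is too old to report it
    requested_sl: value of UCX_IB_SL, if the user set one
    """
    if ar_enable not in ("yes", "no", "try"):
        raise ValueError(f"unknown mode: {ar_enable}")

    if ooo_sl_mask is None:
        # Older firmware: the mask cannot be detected, so yes/no fail.
        if ar_enable != "try":
            raise RuntimeError("ooo_sl_mask not available")
        return requested_sl if requested_sl is not None else 0

    want_ar = ar_enable != "no"          # "yes" and "try" both prefer AR
    candidates = [sl for sl in range(16)
                  if bool(ooo_sl_mask & (1 << sl)) == want_ar]

    if requested_sl is not None:
        # Assumption: an explicit SL is accepted in "try" mode, and in
        # "yes"/"no" mode only if it satisfies the AR requirement.
        if ar_enable == "try" or requested_sl in candidates:
            return requested_sl
        raise RuntimeError("requested SL does not match UCX_IB_AR_ENABLE")

    if candidates:
        return candidates[0]
    if ar_enable == "try":
        return 0                         # assumed "first available" SL
    raise RuntimeError("no SL satisfies UCX_IB_AR_ENABLE")


print(select_sl("yes", 0xFFFA))  # 1
print(select_sl("no", 0xFFFA))   # 0
```

With a mask of 0xFFFA, "yes" picks SL 1 (the lowest SL with AR) and "no" picks SL 0 (the lowest SL without AR).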
Validation
smparquery
To check that AR is enabled on a switch (identified by its LID), use smparquery (supported on OFED 5.x).
Use, for example, ibswitches to get the switch LIDs in your network.
Use smparquery ARInfo to get the AR information for that LID.
In this example AR is disabled on the switch.
Note that the switch SL map is always enabled for all SLs. The SL mask configuration is done on the adapter.
Get the SL Mask Configuration on the hosts
ibdiagnet gathers the SL mask configured on the adapters. Look for the OOOSLMASK column in the ibdiagnet2.db_csv output. In this example, the SL mask is set to 0xFFFA.
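To interpret an OOOSLMASK value by hand, the short Python sketch below lists the SLs whose bit is cleared. It assumes the value is read as a hexadecimal string and that bit n corresponds to SL n. The helper name sls_without_ar is hypothetical and is not part of ibdiagnet.

```python
def sls_without_ar(ooo_sl_mask_hex):
    """Return the SLs (0-15) whose bit is cleared in the given hex mask."""
    mask = int(ooo_sl_mask_hex, 16)      # accepts "0xFFFA" or "FFFA"
    return [sl for sl in range(16) if not mask & (1 << sl)]


print(sls_without_ar("0xFFFA"))  # [0, 2]
```

For 0xFFFA the result is SL0 and SL2, matching the per-SL example in the configuration section.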
ibdiagnet
Refer to the example here.