HowTo Setup Fluent 19.0 with HPC-X 2.2 MPI

This procedure shows how to set up Fluent 19.0 and accelerate it using HPC-X 2.2, which is based on Open MPI 3.0, over an InfiniBand EDR network.

Prerequisites 

1. Get the binaries and install Fluent 19.0 on your login/head nodes. Check the Fluent module:

$ module show cfd/fluent/19.0
-------------------------------------------------------------------
/global/software/centos-7/modfiles/apps/cfd/fluent/19.0:

module-whatis    This module sets up Fluent 19.0 in your environment.
setenv           FLUENT_DIR /global/software/centos-7/modules/apps/cfd/ansys_inc/v190
setenv           FLUENTBENCH /global/software/centos-7/modules/apps/cfd/ansys_inc/v190/fluent/bin/fluentbench.pl
-------------------------------------------------------------------
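
To make Fluent available in your shell, load the module and sanity-check the environment; the echoed path should match the setenv line in the module output above:

$ module load cfd/fluent/19.0
$ echo $FLUENT_DIR
/global/software/centos-7/modules/apps/cfd/ansys_inc/v190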


2. Install the latest MLNX_OFED drivers on the cluster.

$ ofed_info -s
MLNX_OFED_LINUX-4.3-1.0.1.0:
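
To confirm that the EDR link is up, you can check the port state and rate with ibstat (part of the standard InfiniBand tools shipped with MLNX_OFED); an active EDR port reports a rate of 100:

$ ibstat | grep -E "State|Rate"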


3. Install HPC-X 2.2. HPC-X is available compiled with either GCC or the Intel compilers. You can download it here.
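
Installing HPC-X is a matter of extracting the tarball and loading its bundled module; the exact archive name below is illustrative and depends on the GCC/Intel build and the MOFED/OS versions you downloaded. On this cluster the module is exposed as hpcx/2.2.0, as used in the Slurm script below.

$ tar -xjf hpcx-v2.2.0-gcc-MLNX_OFED_LINUX-4.3-1.0.1.0-redhat7.5-x86_64.tbz
$ module use $PWD/hpcx-v2.2.0-gcc-MLNX_OFED_LINUX-4.3-1.0.1.0-redhat7.5-x86_64/modulefiles
$ module load hpcx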

Configuration

As Fluent 19.0 was designed and tested against Open MPI 1.x, running it with HPC-X MPI (based on Open MPI 3.0) requires some changes in the Fluent scripts and libraries.

1. Comment out the my_ic_flag line in /ansys_inc/v190/fluent/fluent19.0.0/multiport/mpi_wrapper/bin/mpirun.fl:


# my_ic_flag="--mca btl self,sm,openib"

As of Open MPI 3.0, the "sm" BTL has been removed (its role is filled by the "vader" shared-memory BTL), so leaving this flag in place causes errors.


... 
case $my_protocol in
        tcp)
            my_ic_flag="--mca btl self,sm,tcp  --mca btl_sm_use_knem 0"
            ;;
        usnic)
            my_ic_flag="--mca btl self,sm,usnic"
            ;;
        gm)
            echo "Error: protocol $my_protocol not implemented for Open MPI"
            exit 1
            my_ic_flag="--mca btl self,sm,gm"
            ;;
        mx)
            my_ic_flag="--mca pml ob1 --mca btl self,sm,mx"
            # put MX libs in lib path
            sys_prepend_ld_library_path "/opt/mx/lib64"
            ;;
        ofed)
#            my_ic_flag="--mca btl self,sm,openib"
## OpenMPI/IB uses rand48() functions. So, any module using rand48() will need to be cautious.
            ;;
...
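
If you prefer to script this change rather than edit the file by hand, a sed one-liner along these lines does it (the path assumes the default install location; -i.bak keeps a backup):

$ FL=/ansys_inc/v190/fluent/fluent19.0.0/multiport/mpi_wrapper/bin/mpirun.fl
$ sed -i.bak 's|^\([[:space:]]*\)my_ic_flag="--mca btl self,sm,openib"|\1# my_ic_flag="--mca btl self,sm,openib"|' $FL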


2. Modify the Fluent benchmark script ~/ansys_inc/v190/fluent/bin/fluentbench.pl.

You need to add single quotation marks ('..') around $PARA_MPIRUN_FLAGS to be able to pass multiple MPI flags.


Change the following line:

{ $solver_options="$solver_options -mpiopt=$PARA_MPIRUN_FLAGS"; }


to this line:

{ $solver_options="$solver_options -mpiopt='$PARA_MPIRUN_FLAGS'"; }


...

#####################
# Setting auto-flush

$old_fh = select(OUT);
$| =1;
select ($old_fh);

 if ($OUT_ONLY == 0)
 {
#  initial cleanup

   unlink $logfile, $outfile, $trnfile, $errfile;
   unlink $casfile, $datfile, $casfile.Z, $datfile.Z, $casfile.gz, $datfile.gz;

   $solver_options="$SOLVER_FLAGS";

   $solver_options="$solver_options -i $JOURNAL";

   if ($FLUENT_INC_PATH ne "")
    { $solver_options="$solver_options -path$FLUENT_INC_PATH"; }

   if ($PARA_CNF ne "")
    { $solver_options="$solver_options -cnf=$PARA_CNF"; }

   if ($PARA_MPIRUN_FLAGS ne "")
    { $solver_options="$solver_options -mpiopt='$PARA_MPIRUN_FLAGS'"; }

   if ($nprocs eq "serial")
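
The same change can be applied with sed (again assuming the default install path; -i.bak keeps a backup of the original script):

$ sed -i.bak "s|-mpiopt=\$PARA_MPIRUN_FLAGS|-mpiopt='\$PARA_MPIRUN_FLAGS'|" /ansys_inc/v190/fluent/bin/fluentbench.pl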


Slurm Script Creation

1. For the script, you need to create a hostfile listing all the hosts you wish to use, for the -cnf=hostfile flag of the Fluent command. You can either create it ahead of time or add its creation to the Slurm script (an scontrol alternative is shown after the example below).

For Example:

$ srun -p helios -l --nodes=2 /bin/hostname | sort -n | awk '{print $2}' > hostfile
$ cat hostfile
helios001.hpcadvisorycouncil.com
helios002.hpcadvisorycouncil.com
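
Alternatively, inside a Slurm allocation the hostfile can be generated without srun by expanding the allocated node list (scontrol show hostnames is a standard Slurm command; depending on your cluster's naming it may print short hostnames rather than the fully qualified names shown above):

$ scontrol show hostnames $SLURM_JOB_NODELIST > hostfile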


2. Add soft links to the Open MPI 3.0 libraries. Fluent's Open MPI build expects the old Open MPI 1.x library sonames (libmpi.so.1, libopen-pal.so.4, libopen-rte.so.4), so the links map those names onto the HPC-X libraries.

$ OPENMPI_ROOT=$OMPI_HOME

$ RUNDIR=$PWD

$ rm -f $RUNDIR/lib*.so*
$ ln -s $OMPI_HOME/lib/libmpi.so $RUNDIR/libmpi.so.1
$ ln -s $OMPI_HOME/lib/libopen-pal.so $RUNDIR/libopen-pal.so.4
$ ln -s $OMPI_HOME/lib/libopen-rte.so $RUNDIR/libopen-rte.so.4
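
Before launching, it is worth checking that the links resolve into the HPC-X tree rather than dangling:

$ ls -l $RUNDIR/lib*.so*           # each link should point into $OMPI_HOME/lib
$ readlink -e $RUNDIR/libmpi.so.1  # prints the resolved path, or nothing if the link is broken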


3. Here is a simple example of a Slurm script:

srun -l /bin/hostname | sort -n | awk '{print $2}' > hostfile

export FLUENT_ARCH=lnamd64
export ANSYSLMD_LICENSE_FILE=1055@<license server>

module load hpcx/2.2.0

OPENMPI_ROOT=$OMPI_HOME

RUNDIR=$PWD

rm -f $RUNDIR/lib*.so*
ln -s $OMPI_HOME/lib/libmpi.so $RUNDIR/libmpi.so.1
ln -s $OMPI_HOME/lib/libopen-pal.so $RUNDIR/libopen-pal.so.4
ln -s $OMPI_HOME/lib/libopen-rte.so $RUNDIR/libopen-rte.so.4

export LD_LIBRARY_PATH=${RUNDIR}:${LD_LIBRARY_PATH}

BENCH=aircraft_wing_2m

/ansys_inc/v190/fluent/bin/fluentbench.pl -path=/ansys_inc/v190/fluent/ -ssh -noloadchk -norm -nosyslog  $BENCH -t1024 -mpi=openmpi -pib -cnf=hostfile -mpiopt="-report-bindings -x UCX_NET_DEVICES=mlx5_0:1 -mca btl_openib_if_include mlx5_0:1 -bind-to core -map-by node -mca ras_base_launch_orted_on_hn true"
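
Assuming the script above is saved as run_fluent.slurm (the name is illustrative), it can be submitted with sbatch; pick node and task counts that multiply out to the 1024 ranks requested by -t1024, e.g. 32 nodes x 32 tasks per node:

$ sbatch -p helios -N 32 --ntasks-per-node=32 run_fluent.slurm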


Note: the flag "-mca ras_base_launch_orted_on_hn true" is needed due to a bug in openmpi-3.1.x and can be removed in a future release.