HowTo Setup Fluent 19.0 with HPC-X 2.2 MPI
This procedure shows how to set up Fluent 19.0 and accelerate it using HPC-X 2.2, which is based on OpenMPI 3.0, over an InfiniBand EDR network.
Prerequisites
1. Get the binaries and install Fluent 19.0 on your login/head nodes. Check the Fluent module:
$ module show cfd/fluent/19.0
-------------------------------------------------------------------
/global/software/centos-7/modfiles/apps/cfd/fluent/19.0:

module-whatis    This module sets up Fluent 19.0 in your environment.
setenv           FLUENT_DIR /global/software/centos-7/modules/apps/cfd/ansys_inc/v190
setenv           FLUENTBENCH /global/software/centos-7/modules/apps/cfd/ansys_inc/v190/fluent/bin/fluentbench.pl
-------------------------------------------------------------------
2. Install the latest MLNX_OFED drivers on the cluster.
$ ofed_info -s
MLNX_OFED_LINUX-4.3-1.0.1.0:
3. Install HPC-X 2.2. HPC-X can be compiled with either gcc or the Intel compiler. You can download it from the Mellanox website.
Configuration
Because Fluent 19.0 was designed and tested for OpenMPI 1.x, running it with HPC-X MPI based on OpenMPI 3.0 requires some changes to the Fluent scripts and libraries.
1. Comment out the my_ic_flag line in the ofed case of /ansys_inc/v190/fluent/fluent19.0.0/multiport/mpi_wrapper/bin/mpirun.fl:
# my_ic_flag="--mca btl self,sm,openib"
As of Open MPI 3.0, the "sm" BTL is no longer available, so requesting it causes errors.
...
case $my_protocol in
  tcp)
    my_ic_flag="--mca btl self,sm,tcp --mca btl_sm_use_knem 0"
    ;;
  usnic)
    my_ic_flag="--mca btl self,sm,usnic"
    ;;
  gm)
    echo "Error: protocol $my_protocol not implemented for Open MPI"
    exit 1
    my_ic_flag="--mca btl self,sm,gm"
    ;;
  mx)
    my_ic_flag="--mca pml ob1 --mca btl self,sm,mx"
    # put MX libs in lib path
    sys_prepend_ld_library_path "/opt/mx/lib64"
    ;;
  ofed)
    # my_ic_flag="--mca btl self,sm,openib"
    ## OpenMPI/IB uses rand48() functions. So, any module using rand48() will need to be cautious.
    ;;
...
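The edit in this step can also be scripted. The following is a minimal sketch, assuming the flag line appears in mpirun.fl exactly as quoted above; comment_out_ofed_flag is a hypothetical helper name, not part of Fluent. It keeps a backup copy with an .orig suffix and leaves the tcp, usnic, gm and mx lines untouched.

```shell
#!/bin/sh
# Sketch: comment out the "self,sm,openib" flag line in mpirun.fl.
# comment_out_ofed_flag is a hypothetical helper, not part of Fluent.
comment_out_ofed_flag() {
    file=$1
    # Keep a backup next to the original before changing anything.
    cp "$file" "$file.orig" || return 1
    # Prefix the matching line with "# ", preserving its indentation.
    # A line that is already commented out does not match and stays as it is.
    sed 's/^\([[:space:]]*\)\(my_ic_flag="--mca btl self,sm,openib"\)/\1# \2/' \
        "$file.orig" > "$file"
}
```

A possible call would be comment_out_ofed_flag /ansys_inc/v190/fluent/fluent19.0.0/multiport/mpi_wrapper/bin/mpirun.fl, run by a user with write access to the Fluent installation.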
2. Modify the Fluent benchmark script, ~/ansys_inc/v190/fluent/bin/fluentbench.pl.
Add single quotation marks ('...') around $PARA_MPIRUN_FLAGS so that multiple MPI flags can be passed.
Change the following line:
{ $solver_options="$solver_options -mpiopt=$PARA_MPIRUN_FLAGS"; }
into this line:
{ $solver_options="$solver_options -mpiopt='$PARA_MPIRUN_FLAGS'"; }
...
#####################
# Setting auto-flush
$old_fh = select(OUT);
$| =1;
select ($old_fh);

if ($OUT_ONLY == 0)
{
  # initial cleanup
  unlink $logfile, $outfile, $trnfile, $errfile;
  unlink $casfile, $datfile, $casfile.Z, $datfile.Z, $casfile.gz, $datfile.gz;

  $solver_options="$SOLVER_FLAGS";
  $solver_options="$solver_options -i $JOURNAL";
  if ($FLUENT_INC_PATH ne "")
  { $solver_options="$solver_options -path$FLUENT_INC_PATH"; }
  if ($PARA_CNF ne "")
  { $solver_options="$solver_options -cnf=$PARA_CNF"; }
  if ($PARA_MPIRUN_FLAGS ne "")
  { $solver_options="$solver_options -mpiopt='$PARA_MPIRUN_FLAGS'"; }
  if ($nprocs eq "serial")
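To see why the quotes matter, the following sketch mimics the effect in plain shell. It assumes that the option string built by fluentbench.pl is eventually parsed by a shell, which is not confirmed here; count_args is a hypothetical stand-in for the program that receives the options.

```shell
#!/bin/sh
# Sketch: effect of the single quotes around the MPI flags.
# count_args is a hypothetical stand-in for the launcher receiving the options.
count_args() { echo "$#"; }

PARA_MPIRUN_FLAGS="-bind-to core -map-by node"

# Without quotes, the shell splits the value at every space.
unquoted="-mpiopt=$PARA_MPIRUN_FLAGS"
# With single quotes embedded in the string, the value stays one argument.
quoted="-mpiopt='$PARA_MPIRUN_FLAGS'"

n_unquoted=$(eval "count_args $unquoted")
n_quoted=$(eval "count_args $quoted")
echo "unquoted: $n_unquoted argument(s), quoted: $n_quoted argument(s)"
```

In the unquoted case only the first flag stays attached to -mpiopt and the rest arrive as separate arguments, which is the situation the change to fluentbench.pl avoids.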
Slurm Script Creation
1. Create a hostfile listing all the hosts you wish to use; it is passed to the Fluent script with the -cnf=hostfile flag. You can either create it separately or generate it inside the Slurm script.
For example:
$ srun -p helios -l --nodes=2 /bin/hostname | sort -n | awk '{print $2}' > hostfile
$ cat hostfile
helios001.hpcadvisorycouncil.com
helios002.hpcadvisorycouncil.com
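The pipeline above relies on srun -l prefixing each output line with the task number ("0: hostname"). A minimal sketch of the same step as reusable functions follows; make_hostfile and check_hostfile are hypothetical helper names introduced for illustration.

```shell
#!/bin/sh
# Sketch: turn "srun -l" output ("<task>: <hostname>") into a hostfile.
# make_hostfile and check_hostfile are hypothetical helper names.
make_hostfile() {
    # Sort by task number, then keep only the hostname column.
    sort -n | awk '{print $2}'
}

# Fail early if the hostfile is missing or empty.
check_hostfile() {
    [ -s "$1" ] || { echo "hostfile '$1' is missing or empty" >&2; return 1; }
}

# Usage on the cluster (not run here):
#   srun -p helios -l --nodes=2 /bin/hostname | make_hostfile > hostfile
#   check_hostfile hostfile
```

Checking the file before launching Fluent avoids starting a run with an empty -cnf file when the srun step fails.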
2. Add soft links to the OpenMPI 3.0 libraries.
$ OPENMPI_ROOT=$OMPI_HOME
$ RUNDIR=$PWD
$ rm -f $RUNDIR/lib*.so*
$ ln -s $OMPI_HOME/lib/libmpi.so $RUNDIR/libmpi.so.1
$ ln -s $OMPI_HOME/lib/libopen-pal.so $RUNDIR/libopen-pal.so.4
$ ln -s $OMPI_HOME/lib/libopen-rte.so $RUNDIR/libopen-rte.so.4
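A slightly more defensive variant of the linking step can be sketched as a function. This assumes the same three library names as above; link_ompi_libs is a hypothetical helper. Unlike the rm -f in the commands above, it only replaces the three links it creates, and it stops if a source library is missing.

```shell
#!/bin/sh
# Sketch: create the three compatibility links and stop on a missing library.
# link_ompi_libs is a hypothetical helper; library names are taken from this step.
link_ompi_libs() {
    ompi_home=$1
    rundir=$2
    # Each pair is "<library in HPC-X>:<link name used in this step>".
    for pair in libmpi.so:libmpi.so.1 \
                libopen-pal.so:libopen-pal.so.4 \
                libopen-rte.so:libopen-rte.so.4; do
        src=$ompi_home/lib/${pair%%:*}
        dst=$rundir/${pair##*:}
        [ -e "$src" ] || { echo "missing library: $src" >&2; return 1; }
        # -f replaces a stale link left over from an earlier run.
        ln -sf "$src" "$dst" || return 1
    done
}

# Usage after "module load hpcx/2.2.0" (not run here):
#   link_ompi_libs "$OMPI_HOME" "$PWD"
```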
3. Here is a simple example of a Slurm script:
srun -l /bin/hostname | sort -n | awk '{print $2}' > hostfile

export FLUENT_ARCH=lnamd64
export ANSYSLMD_LICENSE_FILE=1055@<license server>

module load hpcx/2.2.0

OPENMPI_ROOT=$OMPI_HOME
RUNDIR=$PWD
rm -f $RUNDIR/lib*.so*
ln -s $OMPI_HOME/lib/libmpi.so $RUNDIR/libmpi.so.1
ln -s $OMPI_HOME/lib/libopen-pal.so $RUNDIR/libopen-pal.so.4
ln -s $OMPI_HOME/lib/libopen-rte.so $RUNDIR/libopen-rte.so.4
export LD_LIBRARY_PATH=${RUNDIR}:${LD_LIBRARY_PATH}

BENCH=aircraft_wing_2m

/ansys_inc/v190/fluent/bin/fluentbench.pl -path=/ansys_inc/v190/fluent/ \
    -ssh -noloadchk -norm -nosyslog $BENCH -t1024 -mpi=openmpi -pib -cnf=hostfile \
    -mpiopt="-report-bindings -x UCX_NET_DEVICES=mlx5_0:1 -mca btl_openib_if_include mlx5_0:1 -bind-to core -map-by node -mca ras_base_launch_orted_on_hn true"
Note: the flag "-mca ras_base_launch_orted_on_hn true" is needed because of a bug in openmpi-3.1.x and can be removed in a future release.
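Since the workaround flag is meant to be temporary, one option is to assemble the -mpiopt value in a small function so the flag can be switched off in one place. This is a sketch only; build_mpiopt and ORTED_WORKAROUND are hypothetical names, not Fluent or Open MPI settings, and the flags themselves are the ones used in the script above.

```shell
#!/bin/sh
# Sketch: assemble the -mpiopt value with the workaround flag made optional.
# build_mpiopt and ORTED_WORKAROUND are hypothetical names for illustration.
build_mpiopt() {
    opts="-report-bindings -x UCX_NET_DEVICES=mlx5_0:1"
    opts="$opts -mca btl_openib_if_include mlx5_0:1"
    opts="$opts -bind-to core -map-by node"
    # Enabled by default; set ORTED_WORKAROUND=0 once the bug is fixed.
    if [ "${ORTED_WORKAROUND:-1}" = "1" ]; then
        opts="$opts -mca ras_base_launch_orted_on_hn true"
    fi
    printf '%s\n' "$opts"
}
```

In the Slurm script the result would be passed as -mpiopt="$(build_mpiopt)".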