Nalu is a Low-Mach Number turbulent flow simulator for energy applications.
This post describes the steps followed to build it on Thor, a Linux cluster with Intel Xeon (Broadwell) processors and a Mellanox EDR InfiniBand interconnect.
First, consult the base document, specifically the Linux section.
Note that the two subsections headed "Nalu Releases" aren't especially clear, but the build will work if one follows the directions for "Head Code Base". Many packages need to be built before one has a working Nalu executable.
1. Set up the build environment
a. Load the compiler and HPC-X modules you wish to use
module load intel/compiler/2017.4.196 hpcx-1.9/icc-2017
export CC=icc
export CXX=icpc
export FC=ifort
export F90=ifort
export F77=ifort
export OMPI_CC=icc
export OMPI_CXX=icpc
export OMPI_FC=ifort
export OMPI_F77=ifort
b. Replace the path that follows with whatever is appropriate for you
cd /mnt/beegfs3/gerardo/nrel_esif/Nalu/
mkdir build
cd build
export nalu_build_dir=$PWD
mkdir $nalu_build_dir/packages
mkdir $nalu_build_dir/install
mkdir $nalu_build_dir/install/lib
cd $nalu_build_dir/packages
export nalu_install_dir=$nalu_build_dir/install
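Every later step relies on the variables and directories set up here, so it can be worth confirming them before building anything. The following is a minimal sketch, not part of the base document; the function name check_build_env is hypothetical, and it only checks a subset of the exported variables.

```shell
# Hypothetical helper (not from the base document): check the step 1 setup.
check_build_env() {
    _status=0
    # Variables exported in steps 1a and 1b.
    for _var in CC CXX FC nalu_build_dir nalu_install_dir; do
        eval "_val=\${$_var:-}"
        if [ -z "$_val" ]; then
            echo "missing variable: $_var" >&2
            _status=1
        fi
    done
    # Directories created in step 1b.
    for _dir in "${nalu_build_dir:-}/packages" "${nalu_install_dir:-}/lib"; do
        if [ ! -d "$_dir" ]; then
            echo "missing directory: $_dir" >&2
            _status=1
        fi
    done
    return $_status
}

# Usage: stop early if anything is missing.
# check_build_env || echo "fix the environment before continuing"
```

The function returns non-zero if any variable is empty or any directory is absent, and prints what is missing to standard error.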
2. Install CMake
wget
tar zxvf cmake-3.1.0-rc2.tar.gz
cd cmake-3.1.0-rc2/
./configure --prefix=$nalu_build_dir/install
make
make install
Make sure your newly-built CMake can be found
export PATH=/mnt/beegfs3/gerardo/nrel_esif/Nalu/build/install/bin:$PATH
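If the cluster also provides a system CMake, it is easy to pick up the wrong one. As a rough sketch (the helper name cmake_comes_from is hypothetical), the following checks that the first cmake on PATH lives in the directory you expect. The directory must be written exactly as it appears in PATH, without a trailing slash.

```shell
# Hypothetical helper: succeed only if the first `cmake` found on PATH
# is located in the directory given as the first argument.
cmake_comes_from() {
    _found=$(command -v cmake) || return 1
    [ "$(dirname "$_found")" = "$1" ]
}

# Usage, with the install prefix from step 1:
# cmake_comes_from "$nalu_install_dir/bin" && cmake --version
```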
3. Install SuperLU
cd $nalu_build_dir/packages
wget
tar zxvf superlu_4.3.tar.gz
cd $nalu_build_dir/packages/SuperLU_4.3
cp MAKE_INC/make.linux
Edit so that the following macros are defined as indicated:
# PLAT = _x86_64
# SuperLUroot = /mnt/beegfs3/gerardo/nrel_esif/Nalu/build/packages/SuperLU_4.3
# BLASLIB = -L/opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64 -lmkl_blas95_lp64 -lmkl_core -lmkl_rt -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_avx2 -liomp5
# CC = mpicc
# FORTRAN = mpif77
Create directory and make
mkdir $nalu_build_dir/install/SuperLU_4.3
mkdir $nalu_build_dir/install/SuperLU_4.3/lib
mkdir $nalu_build_dir/install/SuperLU_4.3/include
make
cp SRC/*.h $nalu_build_dir/install/SuperLU_4.3/include
Note: The following copy step is missing from the directions in the base document; it will trip you up if not done.
cp -p lib/libsuperlu_4.3.a $nalu_build_dir/install/SuperLU_4.3/lib
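Since the SuperLU files are copied by hand rather than by a "make install", a quick check that they landed where expected can save time later. This is a minimal sketch; the function name check_superlu_install is hypothetical.

```shell
# Hypothetical helper: confirm the manual SuperLU copy steps were done.
# $1 is the SuperLU install prefix, e.g. $nalu_build_dir/install/SuperLU_4.3
check_superlu_install() {
    _prefix=$1
    if [ ! -f "$_prefix/lib/libsuperlu_4.3.a" ]; then
        echo "library not copied to $_prefix/lib" >&2
        return 1
    fi
    # ls fails if the glob matches nothing, i.e. no headers were copied.
    if ! ls "$_prefix"/include/*.h >/dev/null 2>&1; then
        echo "headers not copied to $_prefix/include" >&2
        return 1
    fi
    return 0
}

# Usage:
# check_superlu_install "$nalu_build_dir/install/SuperLU_4.3"
```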
4. Install libxml2
cd $nalu_build_dir/packages
wget
tar zxvf libxml2-2.9.2.tar.gz
cd $nalu_build_dir/packages/libxml2-2.9.2
CC=mpicc CXX=mpicxx ./configure --without-python --prefix=$nalu_build_dir/install
make -k install
5. Install Boost
cd $nalu_build_dir/packages
wget
tar zxvf boost_1_55_0.tar.gz
cd $nalu_build_dir/packages/boost_1_55_0
echo "using mpi : `which mpicxx` ;" >> ./tools/build/v2/user-config.jam
./ --prefix=$nalu_build_dir/install --with-libraries=signals,regex,filesystem,system,mpi,serialization,thread,program_options,exception
./b2 -j 4 2>&1 | tee boost_build_one
./b2 -j 4 install 2>&1 | tee boost_build_install
6. Install yaml-cpp
cd $nalu_build_dir/packages
git clone
cd $nalu_build_dir/packages/yaml-cpp
mkdir build
cd build
cmake -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_CXX_FLAGS=-std=c++11 -DCMAKE_C_COMPILER=mpicc -DCMAKE_INSTALL_PREFIX=$nalu_build_dir/install ..
make
make install
7. Install zlib
Note that the zlib download called for in the base document is no longer available.
cd $nalu_build_dir/packages
wget
tar zxvf zlib-1.2.11.tar.gz
cd $nalu_build_dir/packages/zlib-1.2.11
## CC=gcc CXX=g++ CFLAGS=-O3 CXXFLAGS=-O3 ./configure --prefix=$nalu_build_dir/install/
CC=icc CXX=icpc CFLAGS=-O3 CXXFLAGS=-O3 ./configure --prefix=$nalu_build_dir/install/
make
make install
8. Install HDF5
Note that the URL provided in the base document has changed.
cd $nalu_build_dir/packages/
wget
tar zxvf hdf5-1.8.12.tar.gz
cd $nalu_build_dir/packages/hdf5-1.8.12
./configure CC=mpicc FC=mpif90 CXX=mpicxx CXXFLAGS="-fPIC -O3" CFLAGS="-fPIC -O3" FCFLAGS="-fPIC -O3" --enable-parallel --with-zlib=$nalu_build_dir/install --prefix=$nalu_build_dir/install
make
make install
make check
9. Install Parallel NetCDF
cd $nalu_build_dir/packages/
wget
tar zxvf parallel-netcdf-1.6.1.tar.gz
cd parallel-netcdf-1.6.1
./configure --prefix=$nalu_install_dir CC=mpicc FC=mpif90 CXX=mpicxx CFLAGS="-I$nalu_install_dir/include -O3" LDFLAGS=-L$nalu_install_dir/lib --disable-fortran
make
make install
10. Install NetCDF
cd $nalu_build_dir/packages/
curl -o netcdf-c-
tar zxvf netcdf-c-
cd netcdf-c-
./configure --prefix=$nalu_install_dir CC=mpicc FC=mpif90 CXX=mpicxx CFLAGS="-I$nalu_install_dir/include -O3" LDFLAGS=-L$nalu_install_dir/lib --enable-pnetcdf --enable-parallel-tests --enable-netcdf-4 --disable-shared --disable-fsync --disable-cdmremote --disable-dap --disable-doxygen --disable-v2
make -j 4
make install
make check
11. Install Trilinos
Note that this is the most time-consuming step.
cd $nalu_build_dir/packages/
git clone
cd Trilinos/
Use the "Head Code Base"
mkdir build
curl -o $nalu_build_dir/packages/Trilinos/build/do-configTrilinos_release
cd build
Edit do-configTrilinos_release to provide the MPI base directory and the Nalu build directory; in this example:
## mpi_base_dir=/opt/hpcx-1.9/ompi-v2.x.i2017
## nalu_build_dir=/mnt/beegfs3/gerardo/nrel_esif/Nalu/build
In addition, fix -DBoost_INCLUDE_DIRS and -DBoost_LIBRARY_DIRS, i.e., add the following lines:
## -DBoost_INCLUDE_DIRS=$boost_dir/include \
## -DBoost_LIBRARY_DIRS=$boost_dir/lib \
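The two directory settings can also be changed without opening an editor. The sketch below assumes the configure script assigns them as plain name=value lines starting in the first column, as the example values above suggest; verify this against your copy of the script first. The helper name set_script_var is hypothetical, and the Boost flags still have to be added by hand.

```shell
# Hypothetical helper: rewrite a "name=value" line in a script.
# $1 = script file, $2 = variable name, $3 = new value.
# '|' is used as the sed delimiter so '/' in paths needs no escaping;
# the value must therefore not contain '|' or '&'.
set_script_var() {
    sed "s|^$2=.*|$2=$3|" "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" &&   # overwrite in place, keeping file permissions
        rm -f "$1.tmp"
}

# Usage, with the values from this example:
# set_script_var do-configTrilinos_release mpi_base_dir /opt/hpcx-1.9/ompi-v2.x.i2017
# set_script_var do-configTrilinos_release nalu_build_dir "$nalu_build_dir"
```

Writing to a temporary file and copying it back avoids relying on "sed -i", whose syntax differs between GNU and BSD sed.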
Change mode and install
chmod +x do-configTrilinos_release
./do-configTrilinos_release
make
Go do something else; the above 'make' takes several hours.
make install
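For a build that runs for hours, such as the Trilinos make above, it helps to keep a log and to know afterwards whether the command actually succeeded. The following is one possible sketch; the wrapper name run_logged and the log file names are hypothetical. It redirects output to a file rather than piping through tee, so the exit status of the command itself is what gets returned.

```shell
# Hypothetical wrapper: run a command, send stdout and stderr to a log
# file, record the exit status in the log, and return that status.
# $1 = log file, remaining arguments = command to run.
run_logged() {
    _log=$1
    shift
    _rc=0
    "$@" > "$_log" 2>&1 || _rc=$?
    echo "exit status: $_rc" >> "$_log"
    return $_rc
}

# Usage: only install if the build succeeded.
# run_logged trilinos_make.log make && run_logged trilinos_install.log make install
```

Progress can be followed from another terminal with "tail -f" on the log file.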
12. Install Nalu
cd $nalu_build_dir/packages/
git clone
cd Nalu/build/
cp do-configNalu_release do-configNaluNonTracked
Edit do-configNaluNonTracked to adjust the nalu_build_dir path.
./do-configNaluNonTracked
make
This make takes another 70 minutes or so. When complete, you will have a naluX executable under $nalu_build_dir/packages/Nalu/build.
Nalu was supplied with two datasets: abl_3km_256 and abl_3km_512. Running Nalu on 768 cores (24 nodes with 32 cores each) takes about 1 hour for the small dataset (abl_3km_256) and about 10 times as long for the large one. Here is the Slurm script used to test Nalu with the small dataset:
#!/bin/bash
#SBATCH -N 24
#SBATCH -o nalu_256-24n-%j.out
#SBATCH --tasks-per-node=32
#SBATCH -J Nalu256
#SBATCH -p pthor
#SBATCH -t 06:00:00
#SBATCH --exclusive
#SBATCH -d singleton

## Load any modules required here
module load intel/compiler/2017.4.196 hpcx-1.9/icc-2017

NODES=$SLURM_NNODES
NPROC=$SLURM_NPROCS
export OMP_NUM_THREADS=1
MPI=ompi
LOG=${NPROC}-${MPI}-plain-mbn-$SLURM_JOB_ID

MPIFLAGS=""
MPIFLAGS+="--map-by node --rank-by core "
MPIFLAGS+="-report-bindings --display-map "
MPIFLAGS+="-mca btl_sm_use_knem 1 "

## ulimit -s 10485760
ulimit -s unlimited

echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo '----------------------------------------------------'
echo ' NODES USED = '$SLURM_NODELIST
echo ' SLURM_JOB_ID = '$SLURM_JOB_ID
echo ' CORES = '$NPROC
echo '----------------------------------------------------'

EXE=/mnt/beegfs3/gerardo/nrel_esif/Nalu/build/install/bin/naluX
echo "ldd $EXE"
ldd $EXE
echo "Launch command: /usr/bin/time -p mpirun -np ${NPROC} $MPIFLAGS $EXE -i abl_3km_256.i -o abl_3km_256_${NPROC}.log"
/usr/bin/time -p mpirun -np ${NPROC} $MPIFLAGS $EXE -i abl_3km_256.i -o abl_3km_256_${NPROC}.log
mv abl_3km_256_${NPROC}.log abl_3km_256_${LOG}
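Because the script wraps mpirun in "/usr/bin/time -p", the job output should contain a line of the form "real <seconds>" once the run finishes. This assumes Slurm's default behaviour of writing standard error to the same file as standard output when only -o is given. A small sketch for pulling that number out (the function name elapsed_seconds is hypothetical):

```shell
# Hypothetical helper: print the wall-clock seconds reported by
# "/usr/bin/time -p" in a job output file. If several "real" lines
# are present, the last one is printed.
elapsed_seconds() {
    awk '$1 == "real" && NF == 2 { t = $2 } END { if (t != "") print t }' "$1"
}

# Usage, substituting the job ID into the -o file name pattern from the script:
# elapsed_seconds nalu_256-24n-<jobid>.out
```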