Nektar++

The version of Nektar++ that will be used for the Student Cluster Competition at ISC18 is available from the project's GitLab repository (https://gitlab.nektar.info/nektar/nektar).

Here is a step-by-step example of building Nektar++ for use on a cluster with x86_64 processors. Replace all absolute paths with whatever is appropriate for your system.

  1. Create a directory where you want to do your build, and change into it:

     mkdir -p /global/home/users/gerardo/benchmarks/isc18scc
     cd /global/home/users/gerardo/benchmarks/isc18scc
  2. Set your environment to use the appropriate compilers and MPI implementation (in this example, Intel compilers and HPC-X):

    module purge
    module load intel/2018.2.199 hpcx/2.1.0 mkl/2018.2.199
    module load cmake
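
    If you want to confirm the environment took effect before building, a quick check of the tools on the PATH is enough (the module names above are specific to this system; substitute your own compiler and MPI modules):

    # Sanity check: the Intel compilers and the HPC-X mpirun should resolve from the loaded modules
    which icc icpc ifort mpirun
    mpirun --version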
    
  3. Get the Nektar++ tarball from GitLab, extract its contents, and change into the nektar-master directory:

    wget https://gitlab.nektar.info/nektar/nektar/-/archive/master/nektar-master.tar.gz
    tar zxvf nektar-master.tar.gz
    cd nektar-master
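
    Alternatively, if git is available and outbound HTTPS access is allowed, the same repository can be cloned directly (a hedged alternative; the clone target name below simply mirrors the tarball's directory):

    git clone https://gitlab.nektar.info/nektar/nektar.git nektar-master
    cd nektar-master
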
  4. Make a build directory, change into it, and configure the build by running cmake (this will take some time):

    mkdir build_i18h21; cd build_i18h21
    cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc \
     -DCMAKE_Fortran_COMPILER=ifort \
     -DCMAKE_INSTALL_PREFIX=/global/home/groups/hpcperf/centos-7/modules/nektar++/master-hpcx-2.1.0-intel-2018.2.199 -DNEKTAR_USE_MPI=ON \
     -DNEKTAR_USE_MKL=ON -DNEKTAR_USE_METIS=ON  .. 2>&1 | tee ../cmake_i18h21.log
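
    After configuration completes, it is worth confirming that the MPI, MKL and METIS options were actually enabled; the values passed on the command line are recorded in the CMake cache (a simple check, no Nektar++-specific tooling assumed):

    # The requested options should all show up as ON in the cache
    grep -E "NEKTAR_USE_(MPI|MKL|METIS)" CMakeCache.txt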
  5. Run make to actually build the application:

    make -j20 2>&1 | tee ../make_i18h21.log
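
    Because CMAKE_INSTALL_PREFIX was set at configure time, you can also install the freshly built binaries into that location (only needed if you plan to run from the install tree rather than the build tree):

    make install 2>&1 | tee ../install_i18h21.log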
  6. Create a directory in which to run the benchmark, get the data tarball (T106AP3.tgz), and extract it there:

    cd ..
    mkdir isc_scc ; cd isc_scc
    # get the tarball
    tar zxvf T106AP3.tgz
    cd p3

    The subdirectory p3 contains the test case data and a sample batch script, submit.slurm.  Edit the script for your paths and system, then submit it to your batch scheduler or run it directly (a minimal sketch of such a script is included after the note below).  Here is a sample run command:

    mpirun -np 320 --display-map --report-bindings --map-by node --bind-to core -mca pml ucx -x UCX_NET_DEVICES=mlx5_0:1 -mca coll_fca_enable 0 -mca coll_hcoll_enable 0 IncNavierStokesSolver --npz 8 --use-metis t106a.xml

    The '--use-metis' option is important, because the default mesh partitioner, Scotch, fails at larger core counts.
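
    For reference, here is a minimal sketch of what submit.slurm might look like, built around the run command above; the job name, walltime, module names and install path are assumptions to replace with your own:

    #!/bin/bash
    # Minimal batch script sketch (assumed values: job name, walltime, modules, install prefix)
    #SBATCH --job-name=t106a
    #SBATCH --nodes=8
    #SBATCH --ntasks-per-node=40
    #SBATCH --time=02:00:00

    module purge
    module load intel/2018.2.199 hpcx/2.1.0 mkl/2018.2.199

    # Assumption: IncNavierStokesSolver was installed under the prefix given to cmake
    export PATH=/global/home/groups/hpcperf/centos-7/modules/nektar++/master-hpcx-2.1.0-intel-2018.2.199/bin:$PATH

    mpirun -np 320 --display-map --report-bindings --map-by node --bind-to core \
        -mca pml ucx -x UCX_NET_DEVICES=mlx5_0:1 \
        -mca coll_fca_enable 0 -mca coll_hcoll_enable 0 \
        IncNavierStokesSolver --npz 8 --use-metis t106a.xml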

  7. The measure of performance is the "Total Computation Time" reported by the application near the end of the run.  For example, on eight nodes with 40 Skylake (SKL) 2.0 GHz cores each:

    Total Computation Time = 1337s
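
    To pull that figure out of the job output without reading the whole log, a simple grep is enough (the file name below assumes Slurm's default slurm-<jobid>.out; adjust for however you capture stdout):

    grep "Total Computation Time" slurm-*.out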