Getting Started with SWIFT

SWIFT is an acronym for SPH WIth Fine-grained inter-dependent Tasking, where SPH means Smoothed Particle Hydrodynamics. It is a gravity and SPH solver designed to run cosmological simulations. For the competition, we have settled on a specific commit known to work well with the target dataset.

Follow this procedure for the ISC19 SCC SWIFT task:

Preparations

1. Clone SWIFT from the GitLab site and check out the specific commit shown below.

$ git clone https://gitlab.cosma.dur.ac.uk/swift/swiftsim.git && cd swiftsim
$ git checkout 3d44fb65ea39b9f7a2a99525f15c4cd464045c38


2. In the swiftsim directory you will find an INSTALL.swift file with instructions for building.


3. Disable snapshot dumps:  Edit src/engine.c and examples/main.c to comment out the calls to engine_dump_snapshot() (two lines in engine.c, one line in main.c). This will disable two large snapshot dumps (one at the beginning of the run and another at the end), as well as any others in between.

This is done because, for the purposes of the competition, we are interested in the computational portion of the code, not in the large file output it would otherwise produce.
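To locate the calls before editing, a search along the following lines may help. It is a minimal sketch, assuming it is run from the top of the swiftsim checkout; find_snapshot_calls is a hypothetical helper name, not part of SWIFT. It only lists matching lines and changes nothing. A definition or declaration of the function may also match; only the call sites should be commented out.

```shell
# List every line that mentions engine_dump_snapshot( with its line number.
# find_snapshot_calls is a hypothetical helper; pass it the files to search.
find_snapshot_calls() {
  grep -n 'engine_dump_snapshot(' "$@"
}

# Assumes the current directory is the top of the swiftsim checkout.
if [ -f src/engine.c ] && [ -f examples/main.c ]; then
  find_snapshot_calls src/engine.c examples/main.c
fi
```

After commenting out the calls, running the same search again is a quick way to confirm that no active call remains.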


4. Before attempting to build SWIFT, you will need to have the following software installed:

  • Compiler
  • MPI
  • HDF5 library (to read the large input file; version 1.8.x with x ≥ 17 is fine)
  • GSL (GNU Scientific Library, without which SWIFT will not be able to perform cosmological time integration)
  • FFTW (3.3.x, x ≥ 6 is fine)
  • Metis or ParMetis (to optimize load between MPI tasks)
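A quick way to see whether some of these prerequisites are visible in your environment is to check for their usual command-line tools. The sketch below is only illustrative: missing_tools is a hypothetical helper, and the tool names mpicc, h5cc and gsl-config are assumptions about how MPI, HDF5 and GSL are exposed on your system (sites using environment modules may differ). FFTW and Metis are libraries without a single standard command, so they are not checked here.

```shell
# Report which of the given commands are missing from PATH.
# Returns non-zero if at least one is missing.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      status=1
    fi
  done
  return "$status"
}

# Example tool names (assumptions; adjust to your site's wrappers or modules).
missing_tools mpicc h5cc gsl-config || echo "install or load the tools listed above"
```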

Configure and Build SWIFT


1. Run ./autogen.sh (only the first time)

2. Configure. The --with-tbbmalloc option is recommended on Xeon-based clusters.

./configure --with-metis --with-tbbmalloc --with-fftw=/path/to/fftw


3. make

If make succeeds, there will be two binaries:

  • examples/swift - for single-node runs
  • examples/swift_mpi - for multi-node runs


Running SWIFT


1. Change to the examples/EAGLE_low_z/EAGLE_50 subdirectory.

Get eagle_50 here.

2. Get the initial conditions, a ~30 GB file named EAGLE_ICs_50.hdf5, by running ./getICs.sh (this will only need to be done once).

3. Edit eagle_50.yml to change the value of dt_max from 1.e-2 to 1.e-5. This is done to increase the computational load and provide better scaling.
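The edit can also be scripted. The following is a minimal sketch, assuming the parameter appears on a single line of the form "dt_max: <value>", optionally followed by a comment; set_dt_max is a hypothetical helper name. It keeps a backup copy with a .bak suffix.

```shell
# Rewrite the dt_max value in a SWIFT parameter file, keeping a .bak backup.
# Assumes a line of the form "dt_max: <value>" with an optional trailing comment.
set_dt_max() {
  file=$1
  value=$2
  sed -i.bak -E "s/^([[:space:]]*dt_max:)[[:space:]]*[^[:space:]#]+/\\1 ${value}/" "$file"
}

# From examples/EAGLE_low_z/EAGLE_50:
if [ -f eagle_50.yml ]; then
  set_dt_max eagle_50.yml 1.e-5
  grep -n 'dt_max' eagle_50.yml   # confirm the new value
fi
```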

4. Run SWIFT.

Note that SWIFT runs best with only a few MPI tasks per node. You may test a few different numbers of tasks per node, such as 1, 2 and 4. Performance will vary.


5. The threading model used by SWIFT is not OpenMP. You will need to tell it explicitly how many threads to use per MPI task, through the --threads=N option, where N is the number of desired threads.

6. Your command line must include the --cosmology, --hydro, --self-gravity, and --stars options, all of which relate to the physics aspects of the simulation.

7. You should run for exactly 64 time steps, which means that your command line must also include “-n 64”.

For example, a minimal command line to run SWIFT on eight nodes with two tasks per node and 16 threads per task would be:

mpirun -np 16 ../../swift_mpi --cosmology --hydro --self-gravity --stars --threads=16 -n 64 eagle_50.yml

(The above line assumes a job scheduler and an MPI build that is integrated with it, so that only two MPI tasks are spawned per node. Otherwise, a host file must be provided.)
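The arithmetic behind the example can be sketched as follows. The variable names are illustrative only; the layout values are those of the eight-node example, and the script merely prints the command line rather than launching anything.

```shell
# Derive the MPI task count and total core count from the job layout.
nodes=8            # number of nodes
tasks_per_node=2   # MPI tasks per node
threads=16         # threads per MPI task

np=$((nodes * tasks_per_node))   # total MPI tasks, passed to mpirun -np
cores=$((np * threads))          # total cores (MPI tasks times threads)

echo "mpirun -np $np ../../swift_mpi --cosmology --hydro --self-gravity --stars --threads=$threads -n 64 eagle_50.yml"
echo "total cores: $cores"
```

Changing tasks_per_node to 1 or 4 while adjusting threads so that the total core count stays the same is one way to compare the layouts suggested above.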


8. There are other SWIFT options you may find useful. Run examples/swift --help to find out what other options are available.


Results 

1. Each SWIFT run will produce a timesteps_XXX.txt file, where ‘XXX’ is the number of cores used by the run (i.e., the total number of MPI tasks times the number of threads per task).

The wall clock time of interest, which excludes initialization, is obtained by adding up the next to last column of that file and dividing the total by 1000 to obtain the wall clock time in seconds.

This is easily done with a command such as the following:

awk 'BEGIN{tot=0} {tot += $11} END {print tot/1000}' < timesteps_XXX.txt


The performance, in steps/(wall clock hour), is then computed as 64*3600/(wall clock in s), in other words:

awk 'BEGIN{tot=0} {tot += $11} END {print 64*3600000/tot}' < timesteps_XXX.txt
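Both numbers can be obtained in one pass. The sketch below wraps the same calculation in a hypothetical helper, swift_perf, which takes the file name and, optionally, the number of steps (default 64). It assumes, as the commands above do, that column 11 holds the per-step wall clock time in milliseconds; skipping lines that start with '#' is an additional assumption about how header lines look in the file.

```shell
# Print wall clock seconds and steps per hour for a timesteps file.
# Usage: swift_perf timesteps_XXX.txt [steps]
swift_perf() {
  awk -v steps="${2:-64}" '
    /^#/ { next }                 # skip header lines (assumed to start with #)
    NF >= 11 { tot += $11 }       # accumulate per-step wall clock time in ms
    END {
      if (tot > 0)
        printf "wall_clock_s=%.3f steps_per_hour=%.2f\n", tot / 1000, steps * 3600000 / tot
      else
        print "no timing data found"
    }' "$1"
}
```

For example, swift_perf timesteps_256.txt would report both values for a 256-core run.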


Competition

At the start of the actual competition, you will be provided with an alternate set of input files (initial conditions and parameter file), which are not publicly available.  The EAGLE_50 test case above is just for practice.