Slides:
GPAW is a density-functional theory (DFT) code based on the projector-augmented wave (PAW) method and various basis sets. The wave functions can be described with:
Uniform real-space grids
Plane waves
Localized atomic orbitals
GPAW is implemented in the Python and C programming languages and relies on high-performance libraries for linear algebra operations, FFTs, etc.
Parallelization is primarily with MPI; complementary OpenMP parallelization can improve performance in some cases.
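For orientation, a GPAW calculation is driven by a short Python script through ASE. The sketch below is only an illustration (the bulk-Cu system, plane-wave cutoff, and k-point grid are arbitrary example values, not taken from the competition inputs); the mode argument selects the wave-function representation (PW(...) for plane waves, 'fd' for the real-space grid, 'lcao' for localized atomic orbitals).

# minimal example GPAW input script (illustrative values only)
from ase.build import bulk
from gpaw import GPAW, PW

atoms = bulk('Cu', 'fcc', a=3.6)           # example system
atoms.calc = GPAW(mode=PW(400),            # plane-wave basis, 400 eV cutoff
                  xc='PBE',
                  kpts=(8, 8, 8),
                  txt='cu.txt')            # text output file
energy = atoms.get_potential_energy()      # triggers the SCF calculation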
Requirements:
Python 3.6 or later
NumPy 1.9 or later (base N-dimensional array package)
SciPy 0.14 or later (library for scientific computing)
ASE 3.18.0 or later (atomic simulation environment)
a C-compiler
LibXC 3.x or 4.x
BLAS library
Optional, but highly recommended:
an MPI library (required for parallel calculations)
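A quick way to confirm that the Python-side dependencies meet these minimum versions is a small check along the lines of the sketch below (run it with the same python3 that will be used to build GPAW):

# print the versions of the Python dependencies listed above
import sys
import numpy
import scipy
import ase

print('Python :', sys.version.split()[0])
print('NumPy  :', numpy.__version__)
print('SciPy  :', scipy.__version__)
print('ASE    :', ase.__version__)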
The required Python packages (NumPy, SciPy, ASE) are downloaded and installed automatically by GPAW's installation process if they are not already present in the system, so they do not normally need to be installed separately. Typically, LibXC is the only dependency that has to be installed by hand (for BLAS, MPI, etc. one should rely on the system's high-performance libraries). Installing LibXC from source (package management tools such as yum or apt can also be used when appropriate):
libxc_version=4.3.4
wget --content-disposition http://www.tddft.org/programs/libxc/down.php?file=$libxc_version/libxc-$libxc_version.tar.gz
tar xzf libxc-$libxc_version.tar.gz
cd libxc-$libxc_version
export CFLAGS="-O3 -fPIC"  # -fPIC is needed as libxc will be used from a shared library
./configure --prefix=$HOME/libxc/$libxc_version
make
make install
export CPATH=$CPATH:$HOME/libxc/$libxc_version/include
export LIBRARY_PATH=$LIBRARY_PATH:$HOME/libxc/$libxc_version/lib
There are several ways to install GPAW; refer to https://wiki.fysik.dtu.dk/gpaw/install.html for all options. Here we recommend installing GPAW from source; version 21.1.0 will be used for the competition.
# download from GitLab
git clone -b 21.1.0 https://gitlab.com/gpaw/gpaw.git
cd gpaw
By default, GPAW looks for the BLAS library libblas in the default locations, and if it is found the simplest way to install is:
export PYTHONUSERBASE=some_installation_root
pip3 install --user --verbose .
For testing one should also install the optional pytest package:
pip3 install --user pytest
By default, the same compiler and compiler options that were used to build the Python interpreter are used. On HPC systems it is, however, often recommended to customize the installation in order to use other libraries, compilers, or compiler options. This can be accomplished via a siteconfig.py file; a sample is provided in siteconfig_example.py. As an example, in order to use the icc compiler with extra optimization flags and the MKL BLAS library, one should add the following to siteconfig.py:
# compiler
compiler = 'icc'
extra_compile_args = ['-O2', '-xHost', '-qopenmp']
extra_link_args = ['-qopenmp']

# MKL
libraries += ['mkl_rt']
One should also set the GPAW_CONFIG environment variable to point to this siteconfig.py file:
export GPAW_CONFIG=$PWD/siteconfig.py
and then proceed with pip3 install … as above. (Note: the siteconfig file can also be named differently when experimenting with different installation options, e.g. siteconfig_intel.py, siteconfig_gcc.py, etc.)
More information about customizing the installation can be found in the GPAW wiki.
Once the installation is complete, the PATH variable needs to include the directory containing GPAW's executables:
export PATH=$PATH:$PYTHONUSERBASE/bin
Finally, one needs to install the PAW datasets:
gpaw install-data $HOME/gpaw-setups
Once the installation is complete and PYTHONUSERBASE and PATH are set, one can check the installation with:
gpaw info
--------------------------------------------------------------------------------------------------------------
| python-3.8.5      /.../python/3.8.5/bin/python3.8                                                           |
| gpaw-21.1.0       /.../lib/python3.8/site-packages/gpaw/                                                    |
| ase-3.21.1        /.../lib/python3.8/site-packages/ase/                                                     |
| numpy-1.19.2      /.../python/3.8.5/lib/python3.8/site-packages/numpy-1.19.2-py3.8-linux-x86_64.egg/numpy/  |
| scipy-1.5.2       /.../python/3.8.5/lib/python3.8/site-packages/scipy-1.5.2-py3.8-linux-x86_64.egg/scipy/   |
| libxc-5.1.2       yes                                                                                       |
| _gpaw-c5b9c0c91b  /.../lib/python3.8/site-packages/_gpaw.cpython-38-x86_64-linux-gnu.so                     |
| MPI enabled       yes                                                                                       |
| OpenMP enabled    yes                                                                                       |
| scalapack         yes                                                                                       |
| Elpa              no                                                                                        |
| FFTW              no                                                                                        |
| libvdwxc          no                                                                                        |
| PAW-datasets (1)  /home/.../gpaw-setups/gpaw-setups-0.9.20000                                               |
--------------------------------------------------------------------------------------------------------------
Next, one should run a short serial test calculation:
gpaw test
followed by a short parallel calculation:
mpirun -np 4 gpaw test
GPAW also contains an extensive test suite which should be run when developing the code; more details can be found in the GPAW wiki.
A parallel calculation is then launched as:
# load necessary modules, set PYTHONUSERBASE and PATH
mpirun -np <# proc> gpaw python <input>.py
  ___ ___ ___ _ _ _
 |   |   |_  | | | |
 | | | | | . | | | |
 |__ |  _|___|_____|  21.1.0
 |___|_|

User:   ...
Date:   Fri Nov 6 10:11:47 2020
Arch:   x86_64
Pid:    205184
Python: 3.8.5
libxc:  4.3.4
units:  Angstrom and eV
cores:  320
OMP_NUM_THREADS: 1

Input parameters:
  eigensolver: {name: rmm-diis, niter: 3}
  h: 0.22
  kpts: [1 1 8]
  maxiter: 1
  mixer: {backend: pulay, beta: 0.1, method: separate, nmaxold: 5, weight: 100}
  nbands: -20
  occupations: {fixmagmom: False, name: fermi-dirac, width: 0.2}
  xc: PBE

System changes: positions, numbers, cell, pbc, initial_charges, initial_magmoms

Initialize ...
...

Timing:              incl.     excl.
-----------------------------------------------------------
Hamiltonian:         0.289     0.000   0.0% |
 Atomic:             0.000     0.000   0.0% |
...
SCF-cycle:         351.855     5.648   1.6% ||
...
Other:               1.100     1.100   6.0% |-|
-----------------------------------------------------------
Total:                       362.262 100.0%

Memory usage: 548.92 MiB
In the short benchmarks we will be looking at the time for the SCF-cycle
(neglecting the initialization of the calculation).
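For convenience, the inclusive SCF-cycle time can be picked out of the text output with a small helper like the sketch below (it assumes the timing-table layout shown above; the filename output.txt is only a placeholder):

# sketch: extract the inclusive SCF-cycle time from a GPAW text output
import re

scf_time = None
with open('output.txt') as f:          # placeholder filename
    for line in f:
        # timing lines look like: "SCF-cycle:   351.855   5.648   1.6% ||"
        m = re.match(r'\s*SCF-cycle:\s+([0-9.]+)', line)
        if m:
            scf_time = float(m.group(1))

print('SCF-cycle (incl.):', scf_time, 's')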
There are several ways to install GPAW; refer to https://wiki.fysik.dtu.dk/gpaw/install.html for all options.
Here is a basic example with CentOS 7/8:
# download
wget https://pypi.org/packages/source/g/gpaw/gpaw-20.10.0.tar.gz
tar xf gpaw-20.10.0.tar.gz
cd gpaw-20.10.0

# install libxc
wget -O libxc-4.3.4.tar.gz http://www.tddft.org/programs/libxc/down.php?file=4.3.4/libxc-4.3.4.tar.gz
tar xfp libxc-4.3.4.tar.gz
cd libxc-4.3.4
module load gcc
export CFLAGS="-fPIC"
export CXXFLAGS="-fPIC"
export FCFLAGS="-fPIC"
./configure --prefix=$PWD/install
make
make install
Note: In case you get the following error during installation, export the variables below and rebuild GPAW.
# UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3215: ordinal not in range(128)
export LC_ALL="en_US.UTF-8"
export LC_CTYPE="en_US.UTF-8"
Building GPAW with Intel compilers:
Create siteconfig.py with the lines below.
compiler = 'icc'
mpicompiler = 'mpicc'
mpilinker = 'mpicc'

# static linking of libxc:
xc = '<path>/gpaw-20.10.0/libxc-4.3.4/install/'
include_dirs += [xc + 'include']
extra_link_args += [xc + 'lib/libxc.a']
if 'xc' in libraries:
    libraries.remove('xc')

parallel_python_interpreter = True
scalapack = True

library_dirs += ['$MKLROOT']
libraries = ['mkl_intel_lp64', 'mkl_sequential', 'mkl_core',
             'mkl_lapack95_lp64', 'mkl_scalapack_lp64', 'mkl_blacs_openmpi_lp64']
Install GPAW
module load intel/2020u4
module load hpcx
module load python/3.8.5
export GPAW_CONFIG=$PWD/siteconfig.py
pip3 install -v gpaw --user
Download HPC-X from https://developer.nvidia.com/networking/hpc-x and uncompress hpcx-<version>.tar.gz:
module use <path to hpcx>/modulefiles
module load hpcx
HPC-X is compiled with GNU compilers by default; you may recompile Open MPI using the Intel compilers.
Check the bin files:
$ ls $HOME/.local/bin
gpaw  gpaw-analyse-basis  gpaw-basis  gpaw-plot-parallel-timings  gpaw-python  gpaw-runscript  gpaw-setup  gpaw-upfplot
Download and unpack the PAW datasets:
wget https://wiki.fysik.dtu.dk/gpaw-files/gpaw-setups-0.9.20000.tar.gz
tar xfp gpaw-setups-0.9.20000.tar.gz
To run a calculation:
module load intel/2020u4
module load hpcx
# set the dataset path
export GPAW_SETUP_PATH=$PWD/gpaw-setups-0.9.20000
mpirun -np <# proc> $MPI_FLAGS $PWD/install/bin/gpaw-python <input>.py
  ___ ___ ___ _ _ _
 |   |   |_  | | | |
 | | | | | . | | | |
 |__ |  _|___|_____|  21.1.0
 |___|_|

User:
Date:   Sat Feb 27 21:45:18 2021
Arch:   x86_64
Pid:    100932
Python: 3.6.8
gpaw:   /ISC21/gpaw-20.10.0/install-hpcx-2.7.0/lib/python3.6/site-packages/gpaw
_gpaw:  /ISC21/gpaw-20.10.0/install-hpcx-2.7.0/bin/gpaw-python
ase:    /ISC21/gpaw-20.10.0/install-hpcx-2.7.0/lib/python3.6/site-packages/ase (version 3.21.1)
numpy:  /scinet/niagara/software/2019b/opt/base/python/3.6.8/lib/python3.6/site-packages/numpy (version 1.15.1)
scipy:  /scinet/niagara/software/2019b/opt/base/python/3.6.8/lib/python3.6/site-packages/scipy (version 1.1.0)
libxc:  4.3.4
units:  Angstrom and eV
cores:  320
OpenMP: False
OMP_NUM_THREADS: 1

Input parameters:
  eigensolver: {name: rmm-diis, niter: 3}
  h: 0.22
  kpts: [1 1 8]
  maxiter: 15
  mixer: {backend: pulay, beta: 0.1, method: separate, nmaxold: 5, weight: 100}
  nbands: -20
  occupations: {fixmagmom: False, name: fermi-dirac, width: 0.2}
  xc: PBE

System changes: positions, numbers, cell, pbc, initial_charges, initial_magmoms

Initialize ...
...

Timing:                               incl.     excl.
-----------------------------------------------------------
Hamiltonian:                         89.152     0.000   0.0% |
 Atomic:                              0.000     0.000   0.0% |
  XC Correction:                      0.000     0.000   0.0% |
 Calculate atomic Hamiltonians:       0.001     0.001   0.0% |
 Communicate:                         0.092     0.092   0.0% |
 Hartree integrate/restrict:          0.003     0.003   0.0% |
 Initialize Hamiltonian:              0.000     0.000   0.0% |
 Poisson:                            79.466     0.002   0.0% |
  Communicate from 1D:                5.253     5.253   2.1% ||
  Communicate from 2D:               33.278    33.278  13.0% |----|
  Communicate to 1D:                  6.447     6.447   2.5% ||
  Communicate to 2D:                 34.416    34.416  13.5% |----|
  FFT 1D:                             0.008     0.008   0.0% |
  FFT 2D:                             0.063     0.063   0.0% |
 XC 3D grid:                          9.589     9.589   3.8% |-|
 vbar:                                0.001     0.001   0.0% |
...
SCF-cycle:                          146.913     1.021   0.4% |
-----------------------------------------------------------
Total:                                        255.165 100.0%

Memory usage: 580.22 MiB
Note: You will need to provide runs on both clusters, Niagara and Aspire-1 (4 CPU nodes only).
The benchmark input files can be downloaded from: https://github.com/jussienko/gpaw-isc-2021
Build the code on both clusters (Aspire-1 and Niagara). Run the provided input copper.py on one, two, and four nodes using pure MPI parallelization. The text output is in the file output_M_xxx.txt (xxx = number of MPI tasks). The input file has a correctness check, so an error message is printed if the results are wrong. The runtime should be less than 45 minutes on a single node.
Discuss the differences in performance and scalability between the two clusters. No modifications to the input file or source code are allowed.
Submit the output files to your shared folder in OneDrive.
Run the provided input nanoribbon.py on either of the clusters (it is recommended to use one node; the running time should be at most a few minutes). As a result you will obtain a file named elf_ribbon.cube, which contains the electron localization function of the simulated system. Make a visualization (picture or animation) of the electron localization function. See e.g.
https://wiki.fysik.dtu.dk/gpaw/tutorials/plotting/plot_wave_functions.html
for hints on visualizing .cube files.
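As one possible starting point, the cube file can be read with ASE and a 2D slice plotted with Matplotlib; the sketch below is only an illustration (the slice index and colormap are arbitrary choices), not the approach required by the task.

# sketch: plot one slice of the ELF from the cube file produced by nanoribbon.py
import matplotlib.pyplot as plt
from ase.io.cube import read_cube_data

elf, atoms = read_cube_data('elf_ribbon.cube')   # 3D data array + Atoms object
k = elf.shape[2] // 2                            # arbitrary slice through the cell
plt.imshow(elf[:, :, k].T, origin='lower', cmap='viridis')
plt.colorbar(label='ELF')
plt.savefig('elf_slice.png', dpi=200)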
Submit the figure or animation to your team's shared folder.
Use the IPM profiler to profile the copper.py calculation on a 4-node run.
Submit the profiler output as text or PDF file.
Try to optimize the performance of the copper.py input on one of the clusters, using the result from task 1 as the baseline. You can try different compilers, compiler options, and high-performance libraries. You can also try hybrid OpenMP/MPI parallelization with different numbers of OpenMP threads. You are allowed to modify the parallelization options in copper.py (see https://wiki.fysik.dtu.dk/gpaw/documentation/parallel_runs/parallel_runs.html#manual-parallelization-types),
and you may also modify the source code as long as the correctness check in the input passes. Any modifications to the source code have to be made available, e.g. on GitHub. Submit a report on the steps you took for performance tuning and on the performance improvements you achieved.
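For reference, the manual parallelization types described in the linked documentation are set through the calculator's parallel keyword. The sketch below only illustrates the syntax; the specific values are examples, not tuned settings for copper.py.

# sketch of GPAW's parallel keyword; values are illustrative only
from gpaw import GPAW

calc = GPAW(mode='fd',
            xc='PBE',
            kpts=(1, 1, 8),
            parallel={'domain': 4,       # MPI tasks used for domain decomposition
                      'band': 2,         # MPI tasks used for band parallelization
                      'sl_auto': True},  # let GPAW pick a ScaLAPACK layout
            txt='copper_tuned.txt')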
The scalapack parallelization in GPAW fails with certain input:
https://gitlab.com/gpaw/gpaw/-/issues/269
Try to fix the bug and, if successful, make a merge request to GPAW's master branch.
Note that you need to build GPAW with Scalapack support for this task.