...

  • Go to /path/to/icon-scc/build/experiments/exp_scc2850/scripts

  • You can remove “#SBATCH --account=…” from all Slurm scripts.

  • Modify the exp_scc2850.run_start script:

    • Add or change the Slurm directives.

    • Check the number of nodes and the nproc and mpi_procs_pernode values.

    • Add the directory that contains the CDO executable to PATH, for example "export PATH=$BUILD_DIR/cdo-1.9.10/bin:$PATH"

    • If you get the error below, add the "-L" option to the cdo commands.
      Ocdo settaxis (Warning): Use a thread-safe NetCDF4/HDF5 library or the CDO option -L to avoid such errors.

    • If you want to replace srun with mpirun, set the START variable to an mpirun command:
      export START="mpirun -np $SLURM_NPROCS <mpi flags>"

    • Submitting Slurm jobs from within a compute node is not allowed on the Niagara cluster, so change the sbatch command to run the script directly:
      sbatch exp_scc2850.post $start_date ==> ./exp_scc2850.post $start_date
      
  • Change the Slurm partition name in exp_scc2850.post and comment out its last line: subprocess.check_call(['sbatch', 'exp_scc2850.mon', str(start_date)])

  • Finally, submit the job:
    $ sbatch exp_scc2850.run_start
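As an alternative to commenting out the last line of exp_scc2850.post, the sbatch call could be guarded so that it only runs when explicitly enabled. The following is a minimal sketch, not the script's actual code: the function name submit_monitoring, the runner parameter, and the environment variable SCC_SUBMIT_MON are hypothetical names introduced here for illustration.

```python
import os
import subprocess


def submit_monitoring(start_date, runner=subprocess.check_call):
    """Submit exp_scc2850.mon only if SCC_SUBMIT_MON=1 is set.

    SCC_SUBMIT_MON is a hypothetical switch, not part of ICON.
    The runner argument exists only so the call can be replaced in tests.
    """
    if os.environ.get("SCC_SUBMIT_MON", "0") != "1":
        # Default: skip the submission, e.g. on clusters such as Niagara
        # where sbatch cannot be called from a compute node.
        return False
    runner(["sbatch", "exp_scc2850.mon", str(start_date)])
    return True
```

With the variable unset, the function returns False and does not call sbatch, which matches the effect of commenting the line out.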

...

Attached files (page: Getting started with ICON for ISC22 SCC):
  • slurm_process_binding.pdf
  • icon_grid.pdf

Think about Load Balancing:

Upon successful completion, the log file contains two “Timer reports”, one for the atmosphere and one for the ocean. Inspect these reports to identify the parts that consume the most CPU time. The timer coupling_1stget indicates the extent of load imbalance between atmosphere and ocean. Can you think of a better load balancing scheme?
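One possible starting point is a back-of-the-envelope estimate of how to redistribute MPI ranks between the two components. The sketch below assumes ideal strong scaling (compute time inversely proportional to the rank count), which real runs will only approximate. The function name balanced_split and all numbers are hypothetical; the per-component compute times are assumed to be read by hand from the timer reports, excluding time spent waiting in the coupling.

```python
def balanced_split(total_ranks, atm_ranks, atm_time, oce_time):
    """Suggest an atmosphere/ocean rank split under ideal strong scaling.

    atm_time and oce_time are the compute times (seconds) measured with the
    current split, where the ocean uses total_ranks - atm_ranks ranks.
    """
    oce_ranks = total_ranks - atm_ranks
    # Under the ideal-scaling assumption, work = time * ranks is constant.
    atm_work = atm_time * atm_ranks
    oce_work = oce_time * oce_ranks
    # Balance condition: atm_work / a == oce_work / (total_ranks - a).
    new_atm = round(total_ranks * atm_work / (atm_work + oce_work))
    # Keep at least one rank for each component.
    new_atm = min(max(new_atm, 1), total_ranks - 1)
    return new_atm, total_ranks - new_atm


# Illustrative numbers only, not measured values.
print(balanced_split(total_ranks=160, atm_ranks=120, atm_time=1500, oce_time=900))
```

The suggested split is only a first guess; it should be checked by rerunning the experiment and comparing the coupling_1stget timer again.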

Tasks and Submissions

Run ICON with the coupled atmosphere-ocean experiment as described in the section “Running ICON for the competition” and submit the results. The simulated time is 1 model year, and the reference wall-clock time is around 30 minutes on 4 nodes.

  1. Porting and tuning for a specific processor architecture. For single node/processor performance optimization, the parameter nproma can be used to tune the model towards better vectorization (AVX2/AVX512).

  2. Your optimizations need to yield acceptable results. Check the correctness of your results using the Jupyter notebook scc_plots.ipynb, as described above in the section “Postprocessing”.

  3. Submit all log files and scc_plots.ipynb from your best run.
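When experimenting with nproma for item 1, it can help to see how a candidate value divides the cells owned by one MPI rank. The sketch below assumes a blocked layout in which the cells are split into blocks of length nproma and only the last block may be partially filled. The function name block_layout and the cell count of 5000 are hypothetical; the actual number of cells per rank depends on the grid and the domain decomposition.

```python
def block_layout(n_cells, nproma):
    """Return (number of blocks, fill of the last block, padded entries)."""
    n_blocks = (n_cells + nproma - 1) // nproma       # ceiling division
    last_fill = n_cells - (n_blocks - 1) * nproma     # cells in the last block
    padding = n_blocks * nproma - n_cells             # unused entries
    return n_blocks, last_fill, padding


# Compare a few candidate values for a hypothetical 5000 cells per rank.
# An AVX2 register holds 4 doubles and an AVX-512 register holds 8.
for nproma in (8, 16, 32, 64):
    print(nproma, block_layout(5000, nproma))
```

This only describes the memory layout; which value actually runs fastest has to be determined by timing the model.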