...
Optional: HPC-X (https://developer.nvidia.com/networking/hpc-x) or another MPI. HPC-X will be provided on a USB drive together with the input files.
Check this link to learn how to install/use HPC-X, Profiling using IPM and HPC-X
...
The ICON executable will be created in /path/to/icon-scc/build/bin.
OpenMPI/HPC-X is available from Nvidia at https://developer.nvidia.com/networking/hpc-x. It was more stable than Intel MPI.
Testing ICON
Interactive Mode Test Example
...
Running ICON for the competition
Download the input data from: TBD. Input will be handed out on a USB drive.
The input data consists of three files: restart.tar.gz, hd.tar.gz and grids.tar.gz
...
Tasks and Submissions
Note: The following tests should be done on both Niagara and Bridges-2 clusters (using 4 nodes). The submission should be done to a OneDrive directory that will be shared with each team.
Input files
On the Bridges-2 cluster, copy the input files from the following location:
/ocean/projects/cis210088p/davidcho/Icon-input
to your home directory.
    [maor@bridges2-login011 ~]$ ll /ocean/projects/cis210088p/davidcho/Icon-input
    total 8458004
    -rw-r--r-- 1 davidcho cis210088p 8356853152 Mar 1 13:06 grids.tar.gz
    -rw-r--r-- 1 davidcho cis210088p 451058 Mar 1 13:06 hd.tar.gz
    -rw-r--r-- 1 davidcho cis210088p 303674079 Mar 1 13:06 restart.tar.gz
    [maor@bridges2-login011 ~]$ cp -r /ocean/projects/cis210088p/davidcho/Icon-input .
On the Niagara cluster, copy the input files from the following location:
/scratch/i/iscscc-scinet/davidcho01/Icon-input
to your home directory.
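Once the input directory has been copied, it can help to confirm that all three archives arrived before unpacking them. The following is a minimal sketch, not part of the ICON tooling: the archive names come from the listing above, while the function names and the target directory are hypothetical, and the directory layout your experiment scripts expect may differ.

```python
import tarfile
from pathlib import Path

# Archive names as listed in the input directory.
EXPECTED_ARCHIVES = ("restart.tar.gz", "hd.tar.gz", "grids.tar.gz")


def missing_archives(input_dir):
    """Return the expected archive names that are not present in input_dir."""
    input_dir = Path(input_dir)
    return [name for name in EXPECTED_ARCHIVES if not (input_dir / name).is_file()]


def unpack_inputs(input_dir, target_dir):
    """Extract every expected archive from input_dir into target_dir.

    target_dir is a hypothetical location chosen by the caller.
    """
    missing = missing_archives(input_dir)
    if missing:
        raise FileNotFoundError(f"missing input archives: {missing}")
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    for name in EXPECTED_ARCHIVES:
        with tarfile.open(Path(input_dir) / name, "r:gz") as archive:
            archive.extractall(target_dir)
```

A call such as `unpack_inputs(Path.home() / "Icon-input", Path.home() / "icon-data")` would unpack everything into one place. grids.tar.gz is roughly 8 GB, so check that the target file system has enough space first.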
Goal: speed up the execution time of this experiment setup by porting and tuning ICON to the newer architectures of the Niagara and Bridges-2 clusters. Run ICON with the coupled atmosphere-ocean experiment as described in section “Running ICON for the competition” and submit the results, assuming a 4-node cluster. The simulated time is 1 model year and the reference wallclock time is around 30 minutes. Your main task is to beat the reference time, i.e. make ICON run 1 simulated year in less than 30 minutes with 4 nodes.
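To track progress against the reference, the arithmetic can be wrapped in a few helper functions. This is only an illustrative sketch: the 30-minute reference comes from the task description, while the function names are hypothetical and the measured wallclock time has to be read from your own run.

```python
# Reference wallclock time from the task description: about 30 minutes.
REFERENCE_WALLCLOCK_S = 30 * 60


def speedup(measured_wallclock_s, reference_s=REFERENCE_WALLCLOCK_S):
    """Return reference time divided by measured time (values above 1 are faster)."""
    if measured_wallclock_s <= 0:
        raise ValueError("measured wallclock time must be positive")
    return reference_s / measured_wallclock_s


def beats_reference(measured_wallclock_s, reference_s=REFERENCE_WALLCLOCK_S):
    """True if the run finished in strictly less time than the reference."""
    return measured_wallclock_s < reference_s


def simulated_years_per_day(measured_wallclock_s, simulated_years=1.0):
    """Throughput in simulated years per wallclock day."""
    if measured_wallclock_s <= 0:
        raise ValueError("measured wallclock time must be positive")
    return simulated_years * 86400 / measured_wallclock_s
```

For example, a run that completes the model year in 24 minutes (1440 s) gives `speedup(1440) == 1.25` and a throughput of 60 simulated years per day.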
Porting and tuning for a specific processor architecture: for single node/processor performance optimization, the parameter nproma can be used to tune the model towards better vectorization (AVX2/AVX512).
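One way to organize an nproma sweep is to start from multiples of the SIMD lane count. The sketch below assumes double-precision (64-bit) values, so a 256-bit AVX2 register holds 4 lanes and a 512-bit AVX512 register holds 8. The helper names and the upper bound are hypothetical, and the best value can only be found by timing actual runs.

```python
def simd_lanes(vector_bits, element_bits=64):
    """Number of elements that fit in one SIMD register."""
    if vector_bits % element_bits != 0:
        raise ValueError("vector width must be a multiple of the element width")
    return vector_bits // element_bits


def nproma_candidates(vector_bits, max_nproma=64, element_bits=64):
    """Candidate nproma values: multiples of the lane count up to max_nproma.

    max_nproma is an arbitrary illustrative bound, not an ICON limit.
    """
    lanes = simd_lanes(vector_bits, element_bits)
    return list(range(lanes, max_nproma + 1, lanes))
```

For instance, `nproma_candidates(512, max_nproma=32)` returns `[8, 16, 24, 32]`. Each candidate would then be set in the experiment configuration and timed separately.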
Load balancing: upon successful completion you will see two “Timer reports” in the log file, one for the atmosphere and one for the ocean. By inspecting these reports, identify the most CPU-time-consuming parts. The timer coupling_1stget gives an indication of the extent of load imbalance between atmosphere and ocean. Can you think of a better load balancing scheme?
Run an IPM profile for the application and submit the results as a PDF. Based on this profile, analyze the MPI communication pattern. What are the main bottlenecks?
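For the load-balancing question, a first estimate of a better split of MPI ranks between the two components can be derived from the measured compute times. The sketch below is not an ICON feature: it assumes ideal strong scaling (compute time inversely proportional to the number of ranks), and the function name and inputs are hypothetical. The times would have to be taken by hand from the two timer reports.

```python
def rebalance_ranks(atm_ranks, oce_ranks, atm_time_s, oce_time_s):
    """Suggest a new (atmosphere, ocean) rank split with the same total.

    Assumes ideal scaling, so work = time * ranks stays constant per component.
    Each component keeps at least one rank.
    """
    if min(atm_ranks, oce_ranks) < 1:
        raise ValueError("each component needs at least one rank")
    if min(atm_time_s, oce_time_s) <= 0:
        raise ValueError("times must be positive")
    total = atm_ranks + oce_ranks
    atm_work = atm_time_s * atm_ranks
    oce_work = oce_time_s * oce_ranks
    new_atm = round(total * atm_work / (atm_work + oce_work))
    new_atm = max(1, min(total - 1, new_atm))
    return new_atm, total - new_atm
```

As an illustration with made-up numbers: with 120 atmosphere ranks taking 10 s and 40 ocean ranks taking 20 s per coupling interval, `rebalance_ranks(120, 40, 10.0, 20.0)` suggests a 96/64 split, where both components would take about 12.5 s under the ideal-scaling assumption. Real scaling is not ideal, so the suggestion should be checked against the coupling_1stget timer after rerunning.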
Your optimizations need to yield acceptable results. Check the correctness of your results using the Python notebook scc_plots.ipynb, as described above in the section “Postprocessing”.
Files to be returned: the results in the subdirectory outdata of your experiment, all log files, your scc_plots.ipynb, the profiles you have produced, and your analysis in a PDF document.
If your team has a Twitter account, publish the figure (or video) with the hashtags #ISC22 and #ISC22_SCC (mark/tag the figure or video with your team name/university).