...

MILC profile based on 32 nodes of the Helios cluster.

MPI Communication

...

MPI Message Sizes

91% of the MPI communication time is spent in MPI_Wait, while 5% is spent in 8-byte MPI_Allreduce calls. In addition, we see asynchronous send and receive MPI communication.
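This combination of asynchronous sends/receives and dominant MPI_Wait time is consistent with a nonblocking neighbor (halo) exchange, and an 8-byte MPI_Allreduce is a single double, as used for global sums. The sketch below illustrates that generic pattern in MPI C; it is not taken from the MILC source, and the neighbor ranks, counts, and tags are placeholder values.

    #include <mpi.h>

    /* Sketch of a nonblocking neighbor exchange: the waiting time
     * accumulates in MPI_Waitall, which profilers report as MPI_Wait.
     * send_buf and recv_buf each hold 2*count doubles (one block per
     * direction); "up" and "down" are the neighbor ranks. */
    void halo_exchange(double *send_buf, double *recv_buf, int count,
                       int up, int down, MPI_Comm comm)
    {
        MPI_Request reqs[4];

        MPI_Irecv(recv_buf,         count, MPI_DOUBLE, up,   0, comm, &reqs[0]);
        MPI_Irecv(recv_buf + count, count, MPI_DOUBLE, down, 0, comm, &reqs[1]);
        MPI_Isend(send_buf,         count, MPI_DOUBLE, down, 0, comm, &reqs[2]);
        MPI_Isend(send_buf + count, count, MPI_DOUBLE, up,   0, comm, &reqs[3]);

        /* Overlappable computation would go here; the profile suggests most
         * of the communication time is spent waiting on these requests. */
        MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);
    }

    /* An 8-byte MPI_Allreduce corresponds to a global sum of one double,
     * typical of dot products and residual norms in iterative solvers. */
    double global_sum(double local, MPI_Comm comm)
    {
        double global = 0.0;
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, comm);
        return global;
    }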

...

MPI time

A load imbalance of about 20% can be seen in the application.

...

MPI time among the 256 ranks, sorted by time spent in MPI, shows that there is a load imbalance of about 20% between the ranks that spend the most and the least time in MPI:
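An imbalance figure of this kind can be computed from per-rank MPI times. The sketch below shows one way to do it; how the per-rank MPI time is collected (here assumed to be passed in as mpi_time) is up to the profiling tool, and the 20% value in this report comes from the profile above, not from this code.

    #include <mpi.h>
    #include <stdio.h>

    /* Sketch: given the MPI time accumulated by this rank (mpi_time),
     * report the spread between the most- and least-loaded ranks. */
    void report_mpi_imbalance(double mpi_time, MPI_Comm comm)
    {
        double min_t, max_t, sum_t;
        int rank, size;

        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);

        MPI_Reduce(&mpi_time, &min_t, 1, MPI_DOUBLE, MPI_MIN, 0, comm);
        MPI_Reduce(&mpi_time, &max_t, 1, MPI_DOUBLE, MPI_MAX, 0, comm);
        MPI_Reduce(&mpi_time, &sum_t, 1, MPI_DOUBLE, MPI_SUM, 0, comm);

        if (rank == 0) {
            double avg_t = sum_t / size;
            /* Relative spread between the ranks spending the most and the
             * least time in MPI; about 0.20 would match this profile. */
            printf("MPI time min/avg/max = %.3f/%.3f/%.3f s, spread = %.1f%%\n",
                   min_t, avg_t, max_t, 100.0 * (max_t - min_t) / max_t);
        }
    }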

...

MPI time ordered by rank shows an imbalance between sockets (4 MPI ranks per socket, 5 OpenMP threads per MPI rank).

...

Communication Matrix

...

Memory Footprint

...

Summary

95% scaling was achieved from 16 to 32 nodes for the medium benchmark on the Helios cluster, using the HDR InfiniBand network. A difference of 3% was seen when comparing 5 OpenMP threads to 10 OpenMP threads per MPI rank on Helios.
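For reference, assuming the 95% figure is a strong-scaling efficiency computed from wall-clock times, it would correspond to

    efficiency(16 -> 32 nodes) = T_16 / (2 * T_32) ~= 0.95

i.e., doubling the node count reduces the runtime to slightly more than half of the 16-node time.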

...