
This year's ISC21 SCC includes a coding challenge.
The main goal is to analyze MPI_alltoallv patterns within an application.

MPI_alltoallv is an MPI collective call in which each rank sends up to one message to every other rank, similar to alltoall, except that each message can have a different size. More info about MPI_alltoallv can be found here.

In this task, you will use an alltoallv collective profiler to generate MPI traces while running applications. After the run, you will use a WebUI to access the results and present them.

Coding Challenge - Overview presentation

by HPC Architect Geoffroy Vallee, Nvidia:

Download the slides here:

Tasks

Task 1: Understand MPI_alltoallv calls

Write a program with an input flag that selects the pattern, and run it on the Niagara cluster using 4 nodes with 40 processes per node (full ppn), for a total of 160 ranks.

The program should run 1000 iterations of MPI_alltoallv using the following characteristics.

Assuming the ranks mapping is as follows:

ranks 0-39 → node 1

ranks 40-79 → node 2

ranks 80-119 → node 3

ranks 120-159 → node 4

MPI_alltoallv unbalanced Pattern:

  • ranks 0-39 on node 1 are sending 1MB to ranks 40-79 on node 2, and 0B to the rest (40*40MB)

  • ranks 40-79 on node 2 are sending 1MB to ranks 0-39 on node 1, and 0B to the rest (40*40MB)

  • ranks 80-159 on nodes 3,4 are sending 0B to all ranks.

MPI_alltoallv balanced Pattern:

  • ranks 0-19 on node 1 are sending 1MB to ranks 40-59 and to ranks 80-99 (20*40MB)

  • ranks 40-59 on node 2 are sending 1MB to ranks 100-119 and to ranks 120-139 (20*40MB)

  • ranks 80-99 on node 3 are sending 1MB to ranks 140-159 and to ranks 0-19 (20*40MB)

  • ranks 120-139 on node 4 are sending 1MB to ranks 20-39 and to ranks 60-79 (20*40MB)
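The two patterns above can be expressed as per-rank send-count arrays. The following is a minimal sketch in Python, not a reference solution: it only builds the counts and displacements that would be passed to MPI_alltoallv, and it assumes that 1MB means 2^20 bytes and that the datatype is one byte wide. The names `PATTERNS`, `send_counts` and `displacements` are hypothetical.

```python
from itertools import chain

MB = 1024 * 1024  # assumption: 1MB means 2**20 bytes
NRANKS = 160      # 4 nodes x 40 ranks per node

# Sender range -> destination ranges, copied from the two pattern lists.
PATTERNS = {
    "unbalanced": {
        range(0, 40): [range(40, 80)],
        range(40, 80): [range(0, 40)],
    },
    "balanced": {
        range(0, 20): [range(40, 60), range(80, 100)],
        range(40, 60): [range(100, 120), range(120, 140)],
        range(80, 100): [range(140, 160), range(0, 20)],
        range(120, 140): [range(20, 40), range(60, 80)],
    },
}


def send_counts(pattern, rank, nranks=NRANKS, msg_bytes=MB):
    """Bytes that `rank` sends to each destination rank for `pattern`."""
    counts = [0] * nranks
    for senders, dest_ranges in PATTERNS[pattern].items():
        if rank in senders:
            for dest in chain(*dest_ranges):
                counts[dest] = msg_bytes
    return counts


def displacements(counts):
    """Offsets of each block in a contiguous buffer (prefix sums)."""
    displs, offset = [], 0
    for count in counts:
        displs.append(offset)
        offset += count
    return displs
```

The receive counts follow from the same table: the amount rank r receives from rank s is `send_counts(pattern, s)[r]`. In this sketch both patterns move the same total volume (80 sending ranks, 40MB each), which suggests that any timing difference comes from how the traffic is spread over the nodes rather than from the amount of data.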

Compare the time taken by the two patterns. What are the differences, and why?

Submit the code and results (zip file)

Task 2: Getting to know the MPI_alltoallv profiler

Download the profiler from here: https://github.com/gvallee/collective_profiler/releases/tag/ISC21_cc_v1

Use the ISC21_cc_v1 tag for the source code, follow README.md and Install.md to get started.

Run the profiler on GPAW and WRF applications, get the MPI traces for MPI_alltoallv calls. Run the WebUI and load the traces and view the results.

Note: the WebUI can be used on any Linux machine or laptop.

Nothing to submit here. You will need to present the results at the interview.

Task 3: Pattern display

Applications may have many MPI_alltoallv calls, but only a few patterns (or templates) of those calls.

A pattern for n ranks can be defined as the set of all <Xi,Yi,Mi> tuples, where rank Xi sends a message of length Mi to rank Yi. Each MPI_alltoallv call can have a different pattern; in many cases, however, there is a very large number of MPI_alltoallv calls but only a few distinct patterns.

Pattern example:

  • 3 ranks

  • 4 MPI_alltoallv calls (in bytes)

| MPI_alltoallv call number | Rank 1 → Rank 2 | Rank 1 → Rank 3 | Rank 2 → Rank 1 | Rank 2 → Rank 3 | Rank 3 → Rank 1 | Rank 3 → Rank 2 |
|---|---|---|---|---|---|---|
| 1 | 10B | 10B | 0B | 5B | 7B | 100B |
| 2 | 4B | 4B | 3B | 3B | 10B | 10B |
| 3 | 10B | 10B | 0B | 5B | 7B | 100B |
| 4 | 10B | 10B | 0B | 5B | 7B | 100B |

In this example, there are 4 MPI_alltoallv calls but only 2 patterns (calls #1, #3, and #4 share the same pattern).

A pattern used by more calls may have a higher impact on the run time.

We define the pattern weight as the number of calls that use that pattern.
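Grouping calls by pattern and counting them gives the weights directly. The sketch below is one possible way to do this, assuming each call is available as a square matrix of message sizes (row = sender, column = receiver); the self-send entries on the diagonal are set to 0 here because the example does not list them. The name `pattern_weights` is hypothetical and is not part of the profiler.

```python
from collections import Counter


def pattern_weights(calls, top=10):
    """Return up to `top` (pattern, weight) pairs, heaviest first.

    `calls` is a list of matrices (lists of lists) of message sizes in bytes.
    Two calls belong to the same pattern when their matrices are identical.
    """
    keys = [tuple(tuple(row) for row in matrix) for matrix in calls]
    return Counter(keys).most_common(top)


# The 3-rank example: calls 1, 3 and 4 are identical, call 2 differs.
call_a = [[0, 10, 10], [0, 0, 5], [7, 100, 0]]
call_b = [[0, 4, 4], [3, 0, 3], [10, 10, 0]]
ranked = pattern_weights([call_a, call_b, call_a, call_a])
weights = [weight for _, weight in ranked]
```

Here `weights` holds one entry per distinct pattern, so the example yields two patterns with weights 3 and 1.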

The Task

Display a list of up to the 10 heaviest patterns (highest weight), sorted from high to low.

In this task you will need to get those patterns from the profiler and develop the code to present each of them as a heatmap in the WebUI.

For each pattern, show a heatmap with the following axes:

X axis -> rank 0..n (sender)

Y axis -> rank 0..n (receiver)

Each cell is a 1x1 square whose color encodes the message size in bytes, using an 8-color histogram:

 

0 -> white

1-10 -> yellow

11-100 -> orange

101-1,000 -> green

1,001-10,000 -> red

10,001-100,000 -> purple

100,001-1,000,000 -> brown

1,000,001-max -> black
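The 8-color histogram above amounts to a lookup on the upper bound of each bin. A minimal sketch, assuming message sizes are non-negative integers in bytes; `BINS` and `color_for` are hypothetical names, and the color names would still need to be mapped to whatever the WebUI plotting code expects.

```python
# (inclusive upper bound in bytes, color) for every bin except the last one.
BINS = [
    (0, "white"),
    (10, "yellow"),
    (100, "orange"),
    (1_000, "green"),
    (10_000, "red"),
    (100_000, "purple"),
    (1_000_000, "brown"),
]


def color_for(size_bytes):
    """Return the histogram color for one heatmap cell."""
    for upper, color in BINS:
        if size_bytes <= upper:
            return color
    return "black"  # 1,000,001 bytes and above
```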

 

 

For example, for a 4-rank MPI_alltoallv pattern (one specific pattern), we can get a map similar to this one (the colors may differ).

(need to add X,Y rank labels)

For your test, use 4 nodes with full ppn, from the Niagara cluster.

Submit the code and demonstrate the results in your report and ppt to the judges.

Note: This code should be added to the profiler in a clone from GitHub in your own private GitHub location (details below).

Task 4

On a separate tab of the WebUI, present a single map that is the weighted sum of all patterns.

The color of cell (x_j, y_k) is based on the weighted sum, over all patterns i = 1..P, of the heatmap tables T_i, where W_i is the weight of pattern i and j, k = 1..N with N the number of ranks (= 160):

$$\text{cell}(x_j, y_k) \leftarrow \sum_{i=1}^{P} W_i \, T_i(x_j, y_k)$$
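The weighted sum can be computed cell by cell. The sketch below uses plain Python lists to stay dependency-free and assumes each pattern is given as a (weight, matrix) pair with all matrices of the same size; `weighted_sum` is a hypothetical helper name.

```python
def weighted_sum(patterns):
    """Sum W_i * T_i over all patterns, cell by cell.

    `patterns` is a non-empty list of (weight, matrix) pairs, where every
    matrix is a square list of lists of message sizes in bytes.
    """
    n = len(patterns[0][1])
    total = [[0] * n for _ in range(n)]
    for weight, table in patterns:
        for j in range(n):
            for k in range(n):
                total[j][k] += weight * table[j][k]
    return total


# Reusing the 3-rank example: weights 3 and 1.
t1 = [[0, 10, 10], [0, 0, 5], [7, 100, 0]]
t2 = [[0, 4, 4], [3, 0, 3], [10, 10, 0]]
summed = weighted_sum([(3, t1), (1, t2)])
```

For the 3-rank example, the cell for rank 1 sending to rank 2 becomes 3*10 + 1*4 = 34 bytes.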

Task 5

Fix the number of bins/colors in the GUI to 256, keeping a logarithmic scale. The 8-color scheme above is only an example.

Use 3 different color schemes

  1. Linear distribution: e.g. the cell gets the color min { message length / 4KB, 255 }.

  2. Logarithmic distribution. A possible example (it might need adjustment at the edges):

X = number of bytes in a cell // 0 to max integer

n = number of colors // e.g. 256, indexed 0 to 255

The cell gets the color i (0 <= i <= n-1):

i = 1.6 * log2(X) * log2(n) - 1

For example, for X = 1MB and 256 colors, this gives 1.6*20*8 - 1 = 255, the last color (black).

Message sizes above 1MB can be black.

Note: The formula may not reach all colors in some cases, and a message size of 0 needs special care.

  3. Come up with your own distribution.
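The first two schemes can be sketched as small functions returning a color index. This is only an illustration under stated assumptions: 4KB is taken as 4096 bytes, a size of 0 always maps to index 0, and non-zero sizes are clamped to at least index 1 so that they stay distinguishable from empty cells. That clamping is a choice made here, not something the task prescribes. `linear_color` and `log_color` are hypothetical names.

```python
import math


def linear_color(size_bytes, n_colors=256, step=4096):
    """Linear scheme: one color per `step` bytes, capped at the last color."""
    return min(size_bytes // step, n_colors - 1)


def log_color(size_bytes, n_colors=256):
    """Logarithmic scheme: i = 1.6 * log2(X) * log2(n) - 1, clamped."""
    if size_bytes <= 0:
        return 0  # log2 is undefined for 0, so handle it separately
    index = int(1.6 * math.log2(size_bytes) * math.log2(n_colors) - 1)
    # Assumption: keep non-zero sizes in 1..n-1; sizes above 1MB saturate.
    return max(1, min(index, n_colors - 1))
```

With 256 colors, `log_color(2**20)` reproduces the worked example and returns 255. Because the constant 1.6 is tuned for n = 256 and 1MB, other color counts would need a different scale factor.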

Submit the code and demonstrate the results in your ppt to the judges.

Task 6

Find a way to map ranks to cores in a more balanced way (for example, using an MPI rankfile) for the WRF application, based on the profiler output you obtained, so that each node sends/receives a similar amount of bytes on average for MPI_alltoallv calls. Try to make this as generic as possible.
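One simple starting point is a greedy heuristic: place the heaviest ranks first, always on the node that currently has the smallest total. The sketch below is not taken from the profiler or from the task; it assumes the input is a per-rank byte total (for instance as reported in heat-map-send.md) and it ignores which ranks talk to each other, so traffic that stays inside a node is still counted. `balance_ranks` is a hypothetical name.

```python
def balance_ranks(bytes_per_rank, n_nodes, ppn):
    """Greedily assign ranks to nodes so that per-node byte totals are similar.

    `bytes_per_rank` maps rank -> total bytes; each node holds at most `ppn`
    ranks. Returns (members, load): ranks per node and bytes per node.
    """
    if len(bytes_per_rank) > n_nodes * ppn:
        raise ValueError("not enough slots for all ranks")
    load = [0] * n_nodes
    members = [[] for _ in range(n_nodes)]
    # Heaviest ranks first; ties keep their original order.
    for rank in sorted(bytes_per_rank, key=bytes_per_rank.get, reverse=True):
        free = [node for node in range(n_nodes) if len(members[node]) < ppn]
        node = min(free, key=lambda candidate: load[candidate])
        members[node].append(rank)
        load[node] += bytes_per_rank[rank]
    return members, load


members, load = balance_ranks({0: 100, 1: 100, 2: 1, 3: 1}, n_nodes=2, ppn=2)
```

The resulting groups could then be written out in the rankfile format of the MPI implementation in use. A more complete approach would work on the full sender/receiver matrix rather than on per-rank totals.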

Task 7 (Bonus)

Open task: Find ways to reduce the running time of the profiler, show the running time with and without the enhancement and show the code.

Submit the code and demonstrate the results in your ppt to the judges.

Task 8 (Bonus)

Find a way to reduce the disk space used for the alltoallv_profiler. Show your work via graph and code changes.

Submit the code and demonstrate the results in your ppt to the judges.

Work environment

Once you get the code, clone it to your own GitHub environment and give our judge(s) access to your private area so they can review your work.

Submissions

Submit your code via a GitHub pull request in your private area, and create a ppt report for this task to send and present to the judges.

Profiler output files

The profiler generates several different output files. For the heatmap, look at the following files:

$ ls
alltoallv_backtrace_rank0_trace0.md  alltoallv_backtrace_rank0_trace7.md                alltoallv_heat-map.rank0-send.md                      data                             profiler
alltoallv_backtrace_rank0_trace1.md  alltoallv_backtrace_rank0_trace8.md                alltoallv_hosts-heat-map.rank0-recv.md                heat-map-recv.md                 rankfile.txt
alltoallv_backtrace_rank0_trace2.md  alltoallv_backtrace_rank0_trace9.md                alltoallv_hosts-heat-map.rank0-send.md                heat-map-send.md                 ranks_map
alltoallv_backtrace_rank0_trace3.md  alltoallv_comm_data_rank0.md                       alltoallv_late_arrival_times.rank0_comm0_job96634.md  helios                           recv-counters.job0.rank0.txt
alltoallv_backtrace_rank0_trace4.md  alltoallv_execution_times.rank0_comm0_job96634.md  alltoallv_late_arrival_times.rank0_comm1_job96634.md  patterns-job0-rank0.md           send-counters.job0.rank0.txt
alltoallv_backtrace_rank0_trace5.md  alltoallv_execution_times.rank0_comm1_job96634.md  alltoallv_locations_comm0_rank0.md                    patterns-summary-job0-rank0.md   stats-job0-rank0.md
alltoallv_backtrace_rank0_trace6.md  alltoallv_heat-map.rank0-recv.md                   alltoallv_locations_comm1_rank0.md                    profile_alltoallv_job0.rank0.md

Open the files:

  • heat-map-send.md

  • send-counters.job0.rank0.txt

  • alltoallv_heat-map.rank0-recv.md

The counters files give detailed counts, per call and per rank, while the other files give a summary of the counts.

  • Note that the message size is the count * datatype size.

send-counters.job0.rank0.txt

Example file output. In this example, rank 430 is sending some data, but not to all ranks.

$ more send-counters.job0.rank0.txt                                                                                                                                                                             
# Raw counters

Number of ranks: 1280
Datatype size: 1
Alltoallv calls 0-483
Count: 61 calls - 0, 2, 12, 20, 28, 36, 44, 52, 60, 68, 76, 84, 92, 100, 108, 116, 124, 132, 140, 148, 156, 164, 172, 180, 188, 196, 204, 212, 220, 228, 236, 244, 252, 260, 268, 276, 284, 292, 300, 308, 316
, 324, 332, 340, 348, 356, 364, 372, 380, 388, 396, 404, 412, 420, 428, 436, 444, 452, 460, 468, 476


BEGINNING DATA
Rank(s) 0-425, 438-457, 470-489, 502-521, 534-553, 566-585, 598-617, 630-649, 662-681, 694-713, 726-745, 758-777, 790-809, 822-841, 854-1279: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
...
Rank(s) 430: 0 0 0 0 0 0 0 0 0 92400 123200 138600 15400 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
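Reading this file back into per-rank rows is a matter of expanding the `Rank(s)` list and multiplying each count by the datatype size. The sketch below is a hedged example rather than the profiler's own parser: it assumes the layout shown above, treats purely numeric lines as continuations of the previous record (the wrapping in the listing may simply come from the terminal), and takes the datatype size as a parameter instead of reading the header. `expand_ranks` and `parse_counters` are hypothetical names.

```python
def expand_ranks(spec):
    """Expand a list such as '0-2, 5' into [0, 1, 2, 5]."""
    ranks = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-")
            ranks.extend(range(int(low), int(high) + 1))
        else:
            ranks.append(int(part))
    return ranks


def parse_counters(lines, datatype_size=1):
    """Return {rank: [bytes sent to rank 0, rank 1, ...]} for one record set."""
    sizes = {}
    current = None  # ranks of the record currently being read
    for raw in lines:
        line = raw.strip()
        if line.startswith("Rank(s)"):
            spec, _, data = line[len("Rank(s)"):].partition(":")
            current = expand_ranks(spec)
            for rank in current:
                sizes[rank] = []
        elif current is not None and line.replace(" ", "").isdigit():
            data = line  # continuation of the previous record
        else:
            current = None  # header, blank or '...' line
            continue
        values = [int(v) * datatype_size for v in data.split()]
        for rank in current:
            sizes[rank].extend(values)
    return sizes


# Small made-up input in the same layout (4 ranks, not real profiler output).
sample = ["# Raw counters", "Datatype size: 1", "BEGINNING DATA",
          "Rank(s) 0-1, 3: 0 0", "0 0", "Rank(s) 2: 0 5 7 0"]
rows = parse_counters(sample)
```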

heat-map-send.md

output example:

This file shows the total bytes sent from each rank for the entire run.

more heat-map-send.md           
FORMAT_VERSION: 9

Rank 551: 926315632 bytes
Rank 638: 231851708 bytes
Rank 129: 213282784 bytes
Rank 655: 1146861584 bytes
Rank 390: 895268704 bytes
Rank 1002: 913167068 bytes
Rank 345: 910132784 bytes
Rank 586: 964314604 bytes
Rank 1090: 215668860 bytes
Rank 114: 213590124 bytes
Rank 293: 340279896 bytes
Rank 135: 213282784 bytes
Rank 1050: 236980524 bytes
Rank 26: 240010188 bytes
Rank 331: 912518860 bytes
Rank 0: 288625788 bytes
Rank 324: 201145648 bytes
Rank 762: 274451144 bytes
Rank 720: 1134724448 bytes
Rank 928: 262920988 bytes
Rank 349: 215668860 bytes
Rank 83: 213282784 bytes
Rank 1034: 619255648 bytes
Rank 1028: 201145648 bytes
Rank 461: 1146861584 bytes
....

alltoallv_heat-map.rank0-recv.md

This file shows the number of bytes sent per MPI_alltoallv call, per rank.

$ more alltoallv_heat-map.rank0-recv.md
FORMAT_VERSION: 9

# Call 0:
Rank 0: 762300 bytes
Rank 1: 554400 bytes
Rank 2: 623700 bytes
Rank 3: 554400 bytes
Rank 4: 554400 bytes
Rank 5: 623700 bytes
Rank 6: 554400 bytes
Rank 7: 554400 bytes
Rank 8: 623700 bytes
Rank 9: 554400 bytes
Rank 10: 554400 bytes
Rank 11: 623700 bytes
Rank 12: 554400 bytes
Rank 13: 554400 bytes
Rank 14: 623700 bytes
Rank 15: 554400 bytes
Rank 16: 554400 bytes
Rank 17: 623700 bytes
Rank 18: 554400 bytes
Rank 19: 554400 bytes
Rank 20: 623700 bytes
Rank 21: 554400 bytes
Rank 22: 554400 bytes
Rank 23: 623700 bytes
Rank 24: 554400 bytes
Rank 25: 554400 bytes
Rank 26: 623700 bytes
Rank 27: 554400 bytes
Rank 28: 554400 bytes
Rank 29: 623700 bytes
Rank 30: 623700 bytes
Rank 31: 831600 bytes
Rank 32: 592900 bytes
Rank 33: 431200 bytes
Rank 34: 485100 bytes
Rank 35: 431200 bytes
Rank 36: 431200 bytes
...

# Call 1:
...

# Call 2:
...

patterns-summary-job0-rank0.md

$ cat patterns-summary-job0-rank0.md                                                                                                                                                

# N to 1 patterns

## Pattern #0 (61/484 alltoallv calls)

Alltoallv calls: 1,11,19,27,35,43,51,59,67,75,83,91,99,107,115,123,131,139,147,155,163,171,179,187,195,203,211,219,227,235,243,251,259,267,275,283,291,299,307,315,323,331,339,347,355,363,371,379,387,395,403,411,419,427,435,443,451,459,467,475,483                                                                                                                                                                      

110 ranks sent to 4 other ranks

532 ranks sent to 2 other ranks

638 ranks sent to 1 other ranks

1 ranks recv'd from 1 other ranks

5 ranks recv'd from 3 other ranks

6 ranks recv'd from 9 other ranks

19 ranks recv'd from 4 other ranks

49 ranks recv'd from 12 other ranks

88 ranks recv'd from 16 other ranks


## Pattern #0 (181/484 alltoallv calls)

Alltoallv calls: 4,6,8,10,14,16,18,22,24,26,30,32,34,38,40,42,46,48,50,54,56,58,62,64,66,70,72,74,78,80,82,86,88,90,94,96,98,102,104,106,110,112,114,118,120,122,126,128,130,134,136,138,142,144,146,150,152,154,158,160,162,166,168,170,174,176,178,182,184,186,190,192,194,198,200,202,206,208,210,214,216,218,222,224,226,230,232,234,238,240,242,246,248,250,254,256,258,262,264,266,270,272,274,278,280,282,286,288,290,294,296,298,302,304,306,310,312,314,318,320,322,326,328,330,334,336,338,342,344,346,350,352,354,358,360,362,366,368,370,374,376,378,382,384,386,390,392,394,398,400,402,406,408,410,414,416,418,422,424,426,430,432,434,438,440,442,446,448,450,454,456,458,462,464,466,470,472,474,478,480,482                                                                                                                            

260 ranks sent to 1 other ranks

380 ranks sent to 4 other ranks

640 ranks sent to 2 other ranks

4 ranks recv'd from 1 other ranks

38 ranks recv'd from 3 other ranks

50 ranks recv'd from 2 other ranks

90 ranks recv'd from 9 other ranks

154 ranks recv'd from 4 other ranks

236 ranks recv'd from 6 other ranks


## Pattern #0 (181/484 alltoallv calls)

Alltoallv calls: 4,6,8,10,14,16,18,22,24,26,30,32,34,38,40,42,46,48,50,54,56,58,62,64,66,70,72,74,78,80,82,86,88,90,94,96,98,102,104,106,110,112,114,118,120,122,126,128,130,134,136,138,142,144,146,150,152,154,158,160,162,166,168,170,174,176,178,182,184,186,190,192,194,198,200,202,206,208,210,214,216,218,222,224,226,230,232,234,238,240,242,246,248,250,254,256,258,262,264,266,270,272,274,278,280,282,286,288,290,294,296,298,302,304,306,310,312,314,318,320,322,326,328,330,334,336,338,342,344,346,350,352,354,358,360,362,366,368,370,374,376,378,382,384,386,390,392,394,398,400,402,406,408,410,414,416,418,422,424,426,430,432,434,438,440,442,446,448,450,454,456,458,462,464,466,470,472,474,478,480,482

260 ranks sent to 1 other ranks

380 ranks sent to 4 other ranks

640 ranks sent to 2 other ranks

4 ranks recv'd from 1 other ranks

38 ranks recv'd from 3 other ranks

50 ranks recv'd from 2 other ranks

90 ranks recv'd from 9 other ranks

154 ranks recv'd from 4 other ranks

236 ranks recv'd from 6 other ranks

# N to n patterns

## Pattern #0 (61/484 alltoallv calls)

Alltoallv calls: 0,2,12,20,28,36,44,52,60,68,76,84,92,100,108,116,124,132,140,148,156,164,172,180,188,196,204,212,220,228,236,244,252,260,268,276,284,292,300,308,316,324,332,340,348,356,364,372,380,388,396,404,412,420,428,436,444,452,460,468,476

1 ranks sent to 1 other ranks

5 ranks sent to 3 other ranks

6 ranks sent to 9 other ranks

19 ranks sent to 4 other ranks

49 ranks sent to 12 other ranks

88 ranks sent to 16 other ranks

110 ranks recv'd from 4 other ranks

532 ranks recv'd from 2 other ranks

638 ranks recv'd from 1 other ranks

alltoallv_hosts-heat-map.rank0-send.md

$ cat alltoallv_hosts-heat-map.rank0-send.md
FORMAT_VERSION: 9

Host helios009.hpcadvisorycouncil.com: 24252271076 bytes
Host helios010.hpcadvisorycouncil.com: 28475053332 bytes
Host helios012.hpcadvisorycouncil.com: 26930246156 bytes
Host helios007.hpcadvisorycouncil.com: 26288719908 bytes
Host helios015.hpcadvisorycouncil.com: 32469552104 bytes
Host helios003.hpcadvisorycouncil.com: 8771148320 bytes
Host helios020.hpcadvisorycouncil.com: 26836848324 bytes
Host helios019.hpcadvisorycouncil.com: 32662142524 bytes
Host helios030.hpcadvisorycouncil.com: 8645049512 bytes
Host helios026.hpcadvisorycouncil.com: 25689412980 bytes
Host helios005.hpcadvisorycouncil.com: 9001630968 bytes
Host helios027.hpcadvisorycouncil.com: 12985697120 bytes
Host helios021.hpcadvisorycouncil.com: 27337244660 bytes
Host helios001.hpcadvisorycouncil.com: 9853191876 bytes
Host helios024.hpcadvisorycouncil.com: 24607889284 bytes
Host helios014.hpcadvisorycouncil.com: 32359178852 bytes
Host helios016.hpcadvisorycouncil.com: 27286341456 bytes
Host helios006.hpcadvisorycouncil.com: 13601046712 bytes
Host helios008.hpcadvisorycouncil.com: 24514491452 bytes
Host helios017.hpcadvisorycouncil.com: 26931404984 bytes
Host helios023.hpcadvisorycouncil.com: 28348954524 bytes
Host helios031.hpcadvisorycouncil.com: 8716950308 bytes
Host helios018.hpcadvisorycouncil.com: 32416740840 bytes
Host helios029.hpcadvisorycouncil.com: 9001630968 bytes
Host helios028.hpcadvisorycouncil.com: 8649698728 bytes
Host helios025.hpcadvisorycouncil.com: 24158873244 bytes
Host helios032.hpcadvisorycouncil.com: 11474918884 bytes
Host helios011.hpcadvisorycouncil.com: 28729744340 bytes
Host helios004.hpcadvisorycouncil.com: 8649698728 bytes
Host helios013.hpcadvisorycouncil.com: 27243846828 bytes
Host helios022.hpcadvisorycouncil.com: 30921569404 bytes
Host helios002.hpcadvisorycouncil.com: 8842367380 bytes
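The per-host totals above can be turned into a single imbalance number, which is handy when comparing rank mappings. The sketch below assumes the `Host <name>: <n> bytes` line format shown in the listing (the same regular expression also accepts the `Rank <id>: <n> bytes` lines of heat-map-send.md) and defines imbalance as the largest total divided by the mean, which is one common choice and not something the profiler reports. `parse_heat_map` and `imbalance` are hypothetical names.

```python
import re

# Matches both 'Host <name>: <n> bytes' and 'Rank <id>: <n> bytes'.
LINE_RE = re.compile(r"^(?:Host|Rank)\s+(\S+):\s+(\d+)\s+bytes$")


def parse_heat_map(lines):
    """Return {host or rank name: total bytes}, skipping non-matching lines."""
    totals = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            totals[match.group(1)] = int(match.group(2))
    return totals


def imbalance(totals):
    """Largest total divided by the mean; 1.0 means perfectly balanced."""
    values = list(totals.values())
    return max(values) / (sum(values) / len(values))


# Two lines copied from the listing above.
hosts = parse_heat_map([
    "FORMAT_VERSION: 9",
    "",
    "Host helios009.hpcadvisorycouncil.com: 24252271076 bytes",
    "Host helios003.hpcadvisorycouncil.com: 8771148320 bytes",
])
```

A value close to 1.0 would indicate that every host sends about the same amount of data.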