HPL Calculator - Estimate Your Supercomputer's Rank Among the TOP500 - Mohamad Sindi - 2009
System Performance and Ranking Estimation
Below are more details on how to run HPL on your system, along with suggested tuned parameters for the HPL input file:
Total number of processors in your grid: 0
Note: The grid should be close to square, so P and Q should be approximately equal, with Q slightly larger than P. Both must be whole numbers (i.e. no decimals), and P x Q must equal the total number of processors.
Possible combinations of P and Q for the HPL grid:
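As a rough illustration of how such combinations can be enumerated, the sketch below lists every whole-number pair (P, Q) with P x Q equal to the processor count and P <= Q, most square first. The function name `grid_shapes` is hypothetical and is not taken from the calculator itself.

```python
import math


def grid_shapes(total_procs):
    """Return all (P, Q) pairs with P * Q == total_procs and P <= Q,
    ordered from the most square shape to the least square one."""
    if total_procs < 1:
        return []
    shapes = []
    # Only divisors up to sqrt(total_procs) are needed; Q is the cofactor.
    for p in range(1, math.isqrt(total_procs) + 1):
        if total_procs % p == 0:
            shapes.append((p, total_procs // p))
    # The last pair found has P closest to sqrt(total_procs), so reverse.
    shapes.reverse()
    return shapes


if __name__ == "__main__":
    for p, q in grid_shapes(512):
        print(f"P={p:4d}  Q={q:4d}")
```

For 512 processors the first pair returned is P=16, Q=32, which matches the "Q slightly larger than P" guidance. A processor count of 0 yields an empty list.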
Note: Sizing N so that the matrix fills about 90% of memory is usually a decent choice, and NB=224 was a good block size on our Infiniband clusters of 128 and 512 nodes. If N is too small, each CPU gets too little work, which gives poor results and low efficiency. If N exceeds your memory size, or does not leave enough memory for operating system processes to run, swapping will occur and performance will drop.
Possible memory percentages (i.e. N values) aligned with various NB values:
Memory % \ NB | 96 | 104 | 112 | 120 | 128 | 136 | 144 | 152 | 160 | 168 | 176 | 184 | 192 | 200 | 208 | 216 | 224 | 232 | 240 | 248 | 256 |
80% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
82% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
84% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
86% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
88% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
90% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
92% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
94% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
96% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
98% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
100% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
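One common way to fill in a table like this is to take the chosen share of total memory, divide by 8 bytes per double-precision element, take the square root to get the largest N, and round down to a multiple of NB. The sketch below follows that approach. It is an assumption about how such values can be derived, not the calculator's actual code, and the names `problem_size` and `size_table` are hypothetical.

```python
import math

BYTES_PER_DOUBLE = 8  # HPL stores the N x N matrix in double precision


def problem_size(total_mem_bytes, mem_percent, nb):
    """Largest N that is a multiple of nb and whose N x N matrix of doubles
    fits in mem_percent percent of total_mem_bytes."""
    usable_bytes = total_mem_bytes * mem_percent // 100
    usable_elems = usable_bytes // BYTES_PER_DOUBLE
    n_max = math.isqrt(usable_elems)
    # Round down so that N is an exact multiple of the block size NB.
    return (n_max // nb) * nb


def size_table(total_mem_bytes,
               percents=range(80, 101, 2),
               nbs=range(96, 257, 8)):
    """Map each memory percentage to the list of N values, one per NB."""
    return {
        pct: [problem_size(total_mem_bytes, pct, nb) for nb in nbs]
        for pct in percents
    }


if __name__ == "__main__":
    total_mem = 128 * 16 * 2**30  # assumed example: 128 nodes x 16 GiB each
    table = size_table(total_mem)
    print("N at 90% memory, NB=224:", problem_size(total_mem, 90, 224))
    print("rows:", len(table), "columns:", len(table[80]))
```

Integer arithmetic is used throughout so that the result never exceeds the memory budget because of floating-point rounding.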
HPL Over MVAPICH HowTo (running on a 512-node Infiniband Linux cluster)
HPL Over Intel-MPI HowTo (running on a 128-node Infiniband Linux cluster)
HPL Using NVIDIA GPUs
Developed by: Mohamad Sindi
The efficiency of a system depends on several factors (e.g. type of interconnect, memory size). The Rpeak value is your system's theoretical peak performance. More memory and a better interconnect can help your system's actual performance (Rmax) approach its Rpeak value, improving its efficiency (Rmax divided by Rpeak). We chose a default efficiency of 84% (an estimate based on Infiniband 4x and memory >= 4GB), and we also calculate other common efficiency ranges below into which your system might fall.
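The relationship between Rpeak, Rmax, and efficiency can be sketched as follows. The Rpeak formula used here (cores x clock rate x floating-point operations per cycle) is the conventional definition and is an assumption about the inputs, since the page does not spell it out. The function names and the example hardware figures are hypothetical.

```python
def rpeak_gflops(total_cores, clock_ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS: cores x GHz x FLOPs per cycle per core."""
    return total_cores * clock_ghz * flops_per_cycle


def rmax_estimate(rpeak, efficiency=0.84):
    """Estimated sustained HPL performance for a given efficiency (0 to 1).
    The 0.84 default mirrors the 84% default mentioned in the text."""
    return rpeak * efficiency


if __name__ == "__main__":
    # Assumed example system: 4096 cores at 2.5 GHz, 4 FLOPs per cycle.
    peak = rpeak_gflops(4096, 2.5, 4)
    for eff in (0.70, 0.84, 0.90):
        print(f"efficiency {eff:.0%}: Rmax ~ {rmax_estimate(peak, eff):.1f} GFLOPS")
```

Evaluating several efficiency values side by side, as in the loop above, gives a range of plausible Rmax figures rather than a single number.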
The future rank estimate is based on average drop rates in historical data from the TOP500 lists. It is only an estimate and carries no guarantee.