Graph500
The Graph500 is a rating of supercomputer systems focused on data-intensive workloads. The project was announced at the International Supercomputing Conference in June 2010, and the first list was published at the ACM/IEEE Supercomputing Conference in November 2010. New versions of the list are published twice a year. The main performance metric used to rank the supercomputers is GTEPS (giga traversed edges per second, i.e., billions of graph edges traversed per second).
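Stated as a formula (a restatement of the metric's definition, with m and t_BFS used here only as illustrative symbols): if a benchmark run traverses m edges of the input graph in t_BFS seconds, the reported figures are

\[
\mathrm{TEPS} = \frac{m}{t_{\mathrm{BFS}}}, \qquad \mathrm{GTEPS} = \frac{\mathrm{TEPS}}{10^{9}}.
\]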
Richard Murphy of Sandia National Laboratories says that "The Graph500's goal is to promote awareness of complex data problems", rather than focusing on computer benchmarks like HPL (High Performance Linpack), on which the TOP500 is based.[1]
Despite its name, the list contains far fewer than 500 systems; it had grown to 174 entries by June 2014.[2]
The algorithm and implementation that took first place are described in the paper "Extreme scale breadth-first search on supercomputers".[3]
There is also a companion list, the Green Graph 500, which uses the same performance metric but ranks systems by performance per watt, mirroring the relationship of the Green500 to the HPL-based TOP500.
Benchmark
The benchmark used in Graph500 stresses the communication subsystem of the machine, rather than its double-precision floating-point performance.[1] It is based on a breadth-first search in a large undirected graph (a Kronecker graph with an average degree of 16). The benchmark comprises three computation kernels: the first kernel generates the graph and compresses it into a sparse structure such as CSR or CSC (Compressed Sparse Row/Column); the second kernel performs parallel breadth-first searches from 64 randomly chosen source vertices per run; the third kernel runs a single-source shortest paths (SSSP) computation. Six problem sizes (scales) of graph are defined: toy (2^26 vertices; 17 GB of RAM), mini (2^29; 137 GB), small (2^32; 1.1 TB), medium (2^36; 17.6 TB), large (2^39; 140 TB), and huge (2^42; 1.1 PB of RAM).[4]
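To make the second kernel concrete, the sketch below shows a minimal serial, level-synchronous BFS over a graph stored in CSR form, written in C. It is an illustration only, not the Graph500 reference code: the toy graph, the array names, and the absence of parallelism and result validation are simplifications.

```c
/* Minimal sketch of a BFS kernel over a CSR graph, in the spirit of what
 * Graph500's second kernel times.  Illustrative serial code, not the
 * reference implementation. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* Level-synchronous BFS from `root`; writes the parent of every reached
 * vertex into `parent` (-1 for unreached vertices). */
static void bfs_csr(int64_t nverts, const int64_t *row_off,
                    const int64_t *cols, int64_t root, int64_t *parent)
{
    int64_t *queue = malloc(nverts * sizeof *queue);
    int64_t head = 0, tail = 0;

    for (int64_t v = 0; v < nverts; ++v) parent[v] = -1;
    parent[root] = root;
    queue[tail++] = root;

    while (head < tail) {
        int64_t u = queue[head++];
        /* Scan u's adjacency list, stored contiguously in CSR. */
        for (int64_t e = row_off[u]; e < row_off[u + 1]; ++e) {
            int64_t w = cols[e];
            if (parent[w] == -1) {      /* first visit: record parent */
                parent[w] = u;
                queue[tail++] = w;
            }
        }
    }
    free(queue);
}

int main(void)
{
    /* Toy undirected graph 0-1, 0-2, 1-3 stored in CSR (both directions). */
    int64_t row_off[] = {0, 2, 4, 5, 6};
    int64_t cols[]    = {1, 2, 0, 3, 0, 1};
    int64_t parent[4];

    bfs_csr(4, row_off, cols, 0, parent);
    for (int64_t v = 0; v < 4; ++v)
        printf("parent[%lld] = %lld\n", (long long)v, (long long)parent[v]);
    return 0;
}
```

In the actual benchmark, the time of each such search (over a much larger Kronecker graph) is what feeds the TEPS figure described above.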
The reference implementation of the benchmark contains several versions:[5]
- a serial high-level version in GNU Octave
- a serial low-level version in C
- a parallel C version using OpenMP (a minimal sketch of this style appears after the list)
- two versions for the Cray XMT
- a basic MPI version (using MPI-1 functions)
- an optimized MPI version (using MPI-2 one-sided communications)
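The following is a hedged sketch of how the level-synchronous BFS above can be parallelised with OpenMP, in the spirit of the reference code's OpenMP version; it is not the reference implementation itself, and it reuses the same toy graph for brevity. Compile with, e.g., `gcc -fopenmp`.

```c
#include <stdio.h>
#include <stdint.h>
#include <omp.h>

int main(void)
{
    /* Same toy undirected graph as above (0-1, 0-2, 1-3) in CSR form. */
    const int64_t nverts    = 4;
    const int64_t row_off[] = {0, 2, 4, 5, 6};
    const int64_t cols[]    = {1, 2, 0, 3, 0, 1};

    int64_t parent[4], frontier[4], next[4];
    for (int64_t v = 0; v < nverts; ++v) parent[v] = -1;

    int64_t root = 0, flen = 1, nlen;
    parent[root] = root;
    frontier[0]  = root;

    while (flen > 0) {
        nlen = 0;
        /* Threads expand disjoint chunks of the current frontier; unvisited
         * neighbours are claimed with an atomic compare-and-swap on parent. */
        #pragma omp parallel for schedule(dynamic, 64)
        for (int64_t i = 0; i < flen; ++i) {
            int64_t u = frontier[i];
            for (int64_t e = row_off[u]; e < row_off[u + 1]; ++e) {
                int64_t w = cols[e], unvisited = -1;
                if (__atomic_compare_exchange_n(&parent[w], &unvisited, u, 0,
                                                __ATOMIC_RELAXED,
                                                __ATOMIC_RELAXED)) {
                    int64_t pos;
                    #pragma omp atomic capture
                    pos = nlen++;          /* reserve a slot in next frontier */
                    next[pos] = w;
                }
            }
        }
        /* Swap frontiers for the next level. */
        for (int64_t i = 0; i < nlen; ++i) frontier[i] = next[i];
        flen = nlen;
    }

    for (int64_t v = 0; v < nverts; ++v)
        printf("parent[%lld] = %lld\n", (long long)v, (long long)parent[v]);
    return 0;
}
```

The distributed MPI versions follow the same level-synchronous structure but partition the vertices across ranks and exchange frontier updates over the network, which is why the benchmark stresses the communication subsystem.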
The implementation strategy that took first place on the Japanese K computer is described in the corresponding paper.[6]
Top 10 ranking
2016
According to the June 2016 release of the list:[8]
Rank | Site | Machine (Architecture) | Number of nodes | Number of cores | Problem scale | GTEPS |
---|---|---|---|---|---|---|
1 | RIKEN Advanced Institute for Computational Science | K computer (Fujitsu custom) | 82944 | 663552 | 40 | 38621.4 |
2 | National Supercomputing Center in Wuxi | Sunway TaihuLight (NRCPC - Sunway MPP) | 40768 | 10599680 | 40 | 23755.7 |
3 | Lawrence Livermore National Laboratory | IBM Sequoia (Blue Gene/Q) | 98304 | 1572864 | 41 | 23751 |
4 | Argonne National Laboratory | IBM Mira (Blue Gene/Q) | 49152 | 786432 | 40 | 14982 |
5 | Forschungszentrum Jülich | JUQUEEN (Blue Gene/Q) | 16384 | 262144 | 38 | 5848 |
6 | CINECA | Fermi (Blue Gene/Q) | 8192 | 131072 | 37 | 2567 |
7 | Changsha, China | Tianhe-2 (NUDT custom) | 8192 | 196608 | 36 | 2061.48 |
8 | CNRS/IDRIS-GENCI | Turing (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
8 | Science and Technology Facilities Council – Daresbury Laboratory | Blue Joule (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
8 | University of Edinburgh | DIRAC (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
8 | EDF R&D | Zumbrota (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
8 | Victorian Life Sciences Computation Initiative | Avoca (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
2014
According to the June 2014 release of the list:[2]
Rank | Site | Machine (Architecture) | Number of nodes | Number of cores | Problem scale | GTEPS |
---|---|---|---|---|---|---|
1 | RIKEN Advanced Institute for Computational Science | K computer (Fujitsu custom) | 65536 | 524288 | 40 | 17977.1 |
2 | Lawrence Livermore National Laboratory | IBM Sequoia (Blue Gene/Q) | 65536 | 1048576 | 40 | 16599 |
3 | Argonne National Laboratory | IBM Mira (Blue Gene/Q) | 49152 | 786432 | 40 | 14328 |
4 | Forschungszentrum Jülich | JUQUEEN (Blue Gene/Q) | 16384 | 262144 | 38 | 5848 |
5 | CINECA | Fermi (Blue Gene/Q) | 8192 | 131072 | 37 | 2567 |
6 | Changsha, China | Tianhe-2 (NUDT custom) | 8192 | 196608 | 36 | 2061.48 |
7 | CNRS/IDRIS-GENCI | Turing (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
7 | Science and Technology Facilities Council - Daresbury Laboratory | Blue Joule (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
7 | University of Edinburgh | DIRAC (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
7 | EDF R&D | Zumbrota (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
7 | Victorian Life Sciences Computation Initiative | Avoca (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
2013
According to the June 2013 release of the list:[9]
Rank | Site | Machine (Architecture) | Number of nodes | Number of cores | Problem scale | GTEPS |
---|---|---|---|---|---|---|
1 | Lawrence Livermore National Laboratory | IBM Sequoia (Blue Gene/Q) | 65536 | 1048576 | 40 | 15363 |
2 | Argonne National Laboratory | IBM Mira (Blue Gene/Q) | 49152 | 786432 | 40 | 14328 |
3 | Forschungszentrum Jülich | JUQUEEN (Blue Gene/Q) | 16384 | 262144 | 38 | 5848 |
4 | RIKEN Advanced Institute for Computational Science | K computer (Fujitsu custom) | 65536 | 524288 | 40 | 5524.12 |
5 | CINECA | Fermi (Blue Gene/Q) | 8192 | 131072 | 37 | 2567 |
6 | Changsha, China | Tianhe-2 (NUDT custom) | 8192 | 196608 | 36 | 2061.48 |
7 | CNRS/IDRIS-GENCI | Turing (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
7 | Science and Technology Facilities Council - Daresbury Laboratory | Blue Joule (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
7 | University of Edinburgh | DIRAC (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
7 | EDF R&D | Zumbrota (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
7 | Victorian Life Sciences Computation Initiative | Avoca (Blue Gene/Q) | 4096 | 65536 | 36 | 1427 |
References
- The Exascale Report (March 15, 2012). "The Case for the Graph 500 – Really Fast or Really Productive? Pick One". Inside HPC.
- "Archived copy". Archived from the original on June 28, 2014. Retrieved June 26, 2014.CS1 maint: archived copy as title (link)
- Ueno, Koji; Suzumura, Toyotaro; Maruyama, Naoya; Fujisawa, Katsuki; Matsuoka, Satoshi (2016). "Extreme scale breadth-first search on supercomputers". 2016 IEEE International Conference on Big Data (Big Data). pp. 1040–1047. doi:10.1109/BigData.2016.7840705. ISBN 978-1-4673-9005-7.
- "Performance Evaluation of Graph500 on Large-Scale Distributed Environment". IEEE IISWC 2011, Austin, TX; presentation.
- "Graph500: адекватный рейтинг" (in Russian). Open Systems #1 2011.
- Ueno, K.; Suzumura, T.; Maruyama, N.; Fujisawa, K.; Matsuoka, S. (December 1, 2016). "Extreme scale breadth-first search on supercomputers". 2016 IEEE International Conference on Big Data (Big Data): 1040–1047. doi:10.1109/BigData.2016.7840705. ISBN 978-1-4673-9005-7.
- "Fujitsu and RIKEN Take First Place in Graph500 Ranking with Supercomputer Fugaku". HPCwire. June 23, 2020. Retrieved August 8, 2020.
- "Archived copy". Archived from the original on June 24, 2016. Retrieved July 6, 2016.CS1 maint: archived copy as title (link)
- "Archived copy". Archived from the original on June 21, 2013. Retrieved June 19, 2013.CS1 maint: archived copy as title (link)