Rules and Metrics

The DCL list covers many components of data centers. Understanding the balance between FLOPs, storage, and archival capacity is an important aspect tracked by the list. A storage system (or tape library) can be an integral part of a supercomputer and, for example, procured together with it, or it can be procured separately and installed on site to serve multiple supercomputers. This aspect must therefore be described further under the characteristics and should also be reflected in an architectural picture covering the topology. The data can be aggregated and filtered in multiple ways.

The following list is outdated; it needs to be updated to the current flexible data model.

Metrics

Generally speaking, a metric is defined by its measurement procedure and a unit ¹⁾.

Most metrics can be determined without measurement and describe hardware and software characteristics that should be well known to the site and vendor. A few metrics cover actually observed metadata and I/O performance; in this case, the measurement procedure must be clear.

The following list describes each metric.

Institution

institution: The abbreviation of the institution. Note that systems are linked together based on year and institution.
year: The year for which the data is valid.
nationality: The international abbreviation for the nationality of the institution.
web_page: The web page of the institution.
energy_consumption: The overall energy consumption of the building containing the data center.
power_usage_effectiveness: The PUE of the data center.
initial_facility_costs: The costs to construct the overall data center building.
annual_staff_costs: The annual costs for the personnel in $.
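
The note that systems are linked together based on year and institution can be read as a composite key. The following is a minimal sketch in Python, assuming records are plain dictionaries; the record layout, the sample values and the `link_systems` helper are hypothetical and not part of the list's software.

```python
from collections import defaultdict

def link_systems(institutions, systems):
    """Attach each system record to the institution record with the same (institution, year)."""
    by_key = defaultdict(list)
    for system in systems:
        by_key[(system["institution"], system["year"])].append(system)

    linked = {}
    for site in institutions:
        key = (site["institution"], site["year"])
        linked[key] = {"site": site, "systems": by_key.get(key, [])}
    return linked

# Hypothetical sample data for illustration only.
institutions = [{"institution": "EXAMPLE", "year": 2017}]
systems = [
    {"institution": "EXAMPLE", "year": 2017, "node_count": 100},
    {"institution": "EXAMPLE", "year": 2016, "node_count": 80},
]
linked = link_systems(institutions, systems)
```

Only the 2017 system is attached to the 2017 institution entry; the 2016 record would belong to the 2016 entry of the same institution.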

Supercomputer

institution: The abbreviation of the institution. Note that systems are linked together based on year and institution.
year: The year for which the data is valid.
vendor: The vendor from which the system has been procured.
software: A list ²⁾ of keywords with relevant software components, e.g., which file system, parallelization software.
installation: This is the date when the supercomputer has been installed. Multi-phase installations should appear with their last upgrade date.
compute_peak: The theoretical peak performance in FLOP/s.
node_count: The number of nodes.
total_cores: The total number of available cores.
memory_capacity: The total available memory capacity in Bytes.
memory_bandwidth: The sum of the theoretical memory bandwidth available in B/s.
memory_per_node: The memory capacity per node.
application_domain: A list of the main (scientific) domains that use this supercomputer.
applications: A list of the main applications (if known).
energy_consumption: The energy consumption of the supercomputer (without storage) in Watts – this does not take the PUE into account.
interconnect: A list of keywords about the interconnect.
processor: A list of keywords specifying the processor.
graph500: The achieved performance according to the Graph 500 list. This is not the position in the list, as this may change over time.
graph500_problem_scale: The problem scale according to the Graph 500 list.
top500: The achieved performance according to the TOP500 list.
green500: The achieved efficiency according to the Green500 list.
architecture: A list of keywords covering the system architecture, e.g., i386_64, GPGPU
life_time: The expected life time of the system, e.g., 5 years.
annual_procurement_costs: The procurement costs divided by the life time, e.g., 5 M$ over 5 years corresponds to 1 M$ per year.
annual_facility_update_costs: The costs to initially modify the facility for the system divided by the system's life time.
annual_tco: The TCO divided by the system life time. This explicitly includes projected staff and energy costs.
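
The annual cost fields all follow the same arithmetic: a total cost divided by the expected life time. A minimal sketch in Python, using the example from the annual_procurement_costs description; the function name `annual_cost` is an assumption made for illustration.

```python
def annual_cost(total_cost, life_time_years):
    """Spread a total cost evenly over the expected life time in years."""
    if life_time_years <= 0:
        raise ValueError("life_time_years must be positive")
    return total_cost / life_time_years

# Example from the text: 5 M$ procurement costs over a life time of 5 years.
annual_procurement_costs = annual_cost(5_000_000, 5)
```

The same helper would apply to annual_facility_update_costs and annual_tco, given the respective totals.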

Storage

There are two types of storage supported: tape archive and shared storage, e.g., file systems / object storage.

institution: The abbreviation of the institution. Note that systems are linked together based on year and institution.
year: The year for which the data is valid.
type: The type of the storage, i.e., tape archive or shared storage
installation: This is the date when the storage has been installed. Multi-phase installations should appear with their last upgrade date.
energy_consumption: The energy consumption of the storage part in Watts – this does not take the PUE into account.
capacity: The effective capacity that is available to users. It includes overhead of erasure (RAID) coding and potential hot/cold spares. This value can be easily derived from the number of available storage devices that support the listed file system.
interconnect: A list of keywords about the interconnect.
drives: The total number of tape drives for a nearline tape/MAID archive.
cache_size: The amount of storage cache in a nearline HSM system.
slots: The number of slots in a nearline tape/MAID archive to hold media.
vendor: The vendor of the storage hardware.
software: A list of keywords specifying the software further.
hardware: A list of keywords specifying the hardware further.
peak: The theoretical peak performance of the storage system. The value is the performance that could theoretically be achieved when transferring data between clients and storage. It is limited by 1) the aggregated network throughput between client and servers, 2) the aggregated (RAID) controller throughput, 3) the network topology.
metadata_rate: Metadata throughput. The value can be determined using any I/O benchmark of choice that ensures that client-side and server-side caches are overwhelmed.
sustained_write: Best I/O throughput ever measured when accessing files. The read and write values can be determined using any I/O benchmark of choice that ensures that client-side and server-side caches are overwhelmed.
sustained_read: (see the description for write)
servers: The number of storage servers of the storage system.
hdds: The number of HDDs that belong to the storage system.
ssds: The number of SSDs that belong to the storage system.
life_time: See above
annual_procurement_costs: See above
annual_facility_update_costs: See above
annual_tco: See above
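
The description of peak names three limits. One simple reading is that the theoretical peak is the smallest of these bounds. The sketch below assumes that reading; how the network topology enters is site-specific, so it is modeled here as an optional third bound. The function name and all numbers are hypothetical.

```python
def theoretical_peak(network_throughput, controller_throughput, topology_limit=None):
    """Return the smallest of the given throughput bounds.

    All arguments must use the same unit, e.g. B/s.
    """
    bounds = [network_throughput, controller_throughput]
    if topology_limit is not None:
        bounds.append(topology_limit)
    return min(bounds)

# Hypothetical values in GB/s: the (RAID) controllers are the bottleneck here.
peak = theoretical_peak(network_throughput=200, controller_throughput=150, topology_limit=180)
```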

Measurement procedure

Compared to other lists (TOP500, Green500) that have a clear measurement process, the rules for determining performance values for the HPSL are often relaxed due to the complexity of I/O benchmarks. However, it must be clarified how the measurement has been conducted.

FAQ

What happens if the hardware is upgraded?
- It is quite common that a supercomputer, tape library, or storage system is upgraded. For example, some procurements involve several phases in which the system is upgraded.
- The list should be able to track such changes, but it tracks them on an annual basis, so changes to pages with a past installation date must be avoided. The list is maintained in folders for each installation year, and at the beginning of each new year the old systems (with sufficient storage capacity) are copied over and the field “year” is adjusted. This leads to two rules:
  - If you upgrade your system in a new year, i.e., the existing installation/upgrade date is from a previous year, then edit the page that has been copied over.
  - If you upgrade your system in the same year, you can update the month in the “installation” date and all the metrics/descriptions you need.
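
The yearly copy and the two rules above can be sketched in Python as follows. This is only an illustration under assumptions: records are plain dictionaries, the date format and the helper names `roll_over` and `apply_upgrade` are hypothetical, and the filter for "sufficient storage capacity" is omitted because the text does not state a threshold.

```python
import copy

def roll_over(records, new_year):
    """Copy last year's records into the new year and adjust the 'year' field."""
    copies = []
    for record in records:
        new_record = copy.deepcopy(record)
        new_record["year"] = new_year
        copies.append(new_record)
    return copies

def apply_upgrade(record, current_year, installation, **changes):
    """Apply an upgrade to a record, refusing to modify pages from past years."""
    if record["year"] != current_year:
        raise ValueError("edit the copy in the current year's folder instead")
    record["installation"] = installation
    record.update(changes)
    return record

# Hypothetical sample data for illustration only.
records_2016 = [{"institution": "EXAMPLE", "year": 2016, "installation": "2016-03", "capacity": 10}]
records_2017 = roll_over(records_2016, 2017)
apply_upgrade(records_2017[0], 2017, installation="2017-06", capacity=20)
```

The 2016 record stays untouched, so the history of the system remains visible year by year.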

¹⁾

The system allows users to enter values with symbolic SI prefixes, e.g., 10 kW and 0.01 MW are the same.
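
How such input might be normalized can be sketched in Python. This is not the system's actual parser; the prefix table, the accepted spellings and the function name `parse_quantity` are assumptions made for illustration.

```python
import re

# Decimal SI prefixes; an upper-case "K" is also accepted as kilo (an assumption).
SI_PREFIXES = {"": 1.0, "k": 1e3, "K": 1e3, "M": 1e6, "G": 1e9, "T": 1e12, "P": 1e15, "E": 1e18}

def parse_quantity(text, unit):
    """Parse a string such as '0.01 MW' into a value in the base unit (here 'W')."""
    pattern = r"\s*([0-9]*\.?[0-9]+)\s*([kKMGTPE]?)" + re.escape(unit) + r"\s*"
    match = re.fullmatch(pattern, text)
    if match is None:
        raise ValueError(f"cannot parse {text!r} as a quantity in {unit}")
    value, prefix = match.groups()
    return float(value) * SI_PREFIXES[prefix]

ten_kilowatt = parse_quantity("10 kW", "W")
hundredth_megawatt = parse_quantity("0.01 MW", "W")
```

Both inputs normalize to the same value in Watts, up to floating-point rounding.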

²⁾

Lists are always comma-separated.
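
A list field can therefore be split on commas. A minimal Python sketch; the helper name `parse_list` and the sample keywords are illustrative assumptions.

```python
def parse_list(field):
    """Split a comma-separated field into trimmed, non-empty keywords."""
    return [item.strip() for item in field.split(",") if item.strip()]

# Hypothetical value of a "software" field.
software = parse_list("Lustre, MPI, Slurm")
```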

Virtual Institute for I/O