====== IO-500 ======

===== Goals =====

The creation of a suite of I/O benchmarks to compare facilities and storage systems. Subgoals of the benchmark are:
  * Capture user-experienced performance
  * Reported performance is representative for:
    * applications with well-optimized I/O patterns
    * applications with random-like workloads
    * workloads involving metadata / small objects

The list will be curated and linked to the [[hpsl:start|Data Center List]], allowing users to compare system information and theoretical performance with actually measured performance.

===== Reasons for Participation =====

  * The reputation that comes from a *500 ranking
  * Ease the comparison of I/O characteristics of your site with other sites!
  * Document relevant best practices for your site

===== Running the IO-500 benchmark =====

The benchmark consists of multiple subcomponents: bandwidth subcomponents, metadata subcomponents, and namespace-searching subcomponents. The bandwidth and metadata subcomponents use the IOR and mdtest benchmarks, respectively. Both require users to submit a 'hero' number, for which users may configure and tune the system and the command-line arguments to maximize performance. They also require users to measure with a more challenging set of command-line arguments. The namespace traversal and search can use a supplied MPI-based namespace traversal tool or custom tools.

Information about how the ranking is computed from the various subcomponent benchmark measurements can be found at the [[https://www.vi4io.org/io500/start|IO500 List]].

Running the benchmark suite is comparatively easy; it consists of scripts to prepare and set up an initial run.
  * Please clone the GitHub repository here: https://github.com/IO500/io500
  * It contains a description of how to run the benchmark.
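The steps above can be sketched as a short shell session. This assumes an MPI environment is available on the system; the script and configuration file names follow the repository's README and may differ between benchmark versions:

```shell
# Clone the benchmark suite
git clone https://github.com/IO500/io500
cd io500

# Build IOR, mdtest, and the io500 driver
./prepare.sh

# Run a small trial with the minimal example configuration
# (a real submission requires a tuned configuration and full-scale run)
mpiexec -np 2 ./io500 config-minimal.ini
```

For an official submission, the configuration must satisfy the run rules described in the repository (e.g. minimum runtimes for the write phases).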
===== Communication & Contribution =====

If you are interested in the status of the benchmark or would like to contribute, use our communication channels:
  * [[https://www.vi4io.org/listinfo/IO-500|Mailing list]]
  * Twitter: @IO500benchmark
  * [[https://join.slack.com/t/vi4io/shared_invite/enQtMjMyOTgxMDg0OTQ1LTcyYWJkYzJiMDUzMDU2YjE1NjFjMGNjZWEwYTM2NzQxNzcxMDExYmFmMjJjMDY3NjBiYTRjYTM1M2I3ZGE3NmM|Slack]]
  * The current [[https://github.com/IO500/io500|IO500 benchmark source]]

===== Approach =====

Our strategy is as follows:
  * Build on existing benchmarks
  * Plugin systems should allow for alternative storage technologies
  * Start by reporting a single metric per benchmark; later, decide about a single IO500 number

===== Timeline =====

Our timeline and history for establishing the benchmark are as follows:
  * Nov 2016, Birds-of-a-feather during the Supercomputing conference; discussion
  * March 2017, Discussion of the IO-500 during the [[https://wr.informatik.uni-hamburg.de/events/2017/uiop|UIOP workshop]]
  * May 2017, Discussion of the IO-500 during the Dagstuhl Seminar
  * May 2017, Initial draft of the IO-500 benchmarks
  * ISC HPC 2017, [[http://www.isc-hpc.com/isc17_ap/sessiondetails.htm?t=session&o=557&a=select&ra=index|Birds-of-a-feather session]]; presentation of the proposal and of results for some data centers

===== Publications =====

  * {{:std:io500:poster-vi4io.pdf|Poster ISC}}
  * {{::17-isc-bof-io-500-proposal.pdf|ISC-17 BoF Presentation IO-500}}
  * WiP at PDSW: //Establishing the IO-500 Benchmark//
    * [[http://www.pdsw.org/pdsw-discs17/wips/slides/kunkel-wipslides-pdsw-discs17.pdf|Slides]]
    * [[http://www.pdsw.org/pdsw-discs17/wips/kunkel-wip-pdsw-discs17.pdf|Abstract]]
  * See also details of our [[events/2017/bof-sc-bof|BoF at SC]]