
SC'19

This page is preliminary.

The IO-500 and the Virtual Institute of I/O

Date: November 19th, 12:15-13:15
Venue: Room 205-207, Denver, Colorado

Link to the official announcement.

Abstract

The IO500 is quickly becoming the de facto benchmarking standard for HPC storage. Developed two years ago, the IO500 has released four official lists so far. A BoF highlight is the presentation of the fifth IO-500 list.

The general purpose of this BoF is to foster the IO500 and VI4IO communities and to ensure forward progress towards the common goals of creating, sharing, and benefiting from a large corpus of shared storage data. The community also serves as a repository of detailed information about production storage system architectures over time, providing a knowledge base for other researchers and system designers to use.

Goals of the BoF are to 1) reveal the current IO-500 list and provide highlights and insights; 2) advertise the community hub and discuss and steer the direction of the community effort; and 3) discuss the benefit and direction of the efforts within the community.

The IO-500 benchmark consists of data and metadata benchmarks that identify performance boundaries for optimized and suboptimal applications. Together with comprehensive data about sites, supercomputers, and storage systems, in-depth system characteristics are tracked by the list and can be analyzed. In contrast to other lists, the IO-500 also collects the execution scripts, which provide a means of result verification and of sharing best practices among data centers.
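For illustration, the following minimal C sketch shows how the list's single score is typically derived from the individual phase results: a geometric mean of the bandwidth phases (GiB/s), a geometric mean of the metadata phases (kIOPS), and a final geometric mean of the two sub-scores. All phase values below are invented placeholders, not measurements from any system; the official io500 repository contains the authoritative scoring code.

  #include <math.h>
  #include <stdio.h>

  /* Geometric mean of n positive values. */
  static double geomean(const double *v, int n) {
      double log_sum = 0.0;
      for (int i = 0; i < n; i++)
          log_sum += log(v[i]);
      return exp(log_sum / n);
  }

  int main(void) {
      /* Placeholder results: bandwidth phases in GiB/s (ior easy/hard,
         write/read) and metadata phases in kIOPS (mdtest easy/hard phases
         and find). */
      double bw_gib_s[] = { 12.0, 10.5, 1.2, 2.4 };
      double md_kiops[] = { 55.0, 120.0, 40.0, 9.0, 30.0, 80.0, 25.0, 150.0 };

      double bw_score = geomean(bw_gib_s, 4);
      double md_score = geomean(md_kiops, 8);

      /* The overall score again combines the two sub-scores geometrically. */
      double io500_score = sqrt(bw_score * md_score);

      printf("BW %.2f GiB/s, MD %.2f kIOPS, IO500 score %.2f\n",
             bw_score, md_score, io500_score);
      return 0;
  }

Compile with, e.g., cc io500_score.c -lm. The geometric mean is used so that no single phase can dominate the overall score.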

Goals of the Virtual Institute for I/O are:

The expected HPC audience comprises 1) I/O experts from data centers and industry, 2) researchers and engineers working on high-performance I/O for data centers, and 3) domain scientists and computer scientists interested in discussing I/O issues.

The outcome of this BoF will steer the direction of the community efforts.

Agenda

We have a series of interactive talks and discussions.

1)
Abstract: Accelerators like GPUs are now commonly used in modern HPC systems to relieve computational performance bottlenecks. As whole workflows are migrated to GPUs, the new bottleneck is I/O between storage and GPU memory; the CPU need not be part of the data path. NVIDIA is introducing GPUDirect Storage to enable data to move directly between storage and a node's GPUs. To test this capability, the IO500 benchmark suite is being used. This talk will introduce the idea, share some unofficial (or official, if completed in time) results, discuss the challenges with this technology, and give a timeline for when it will be available as a product.
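For context, here is a minimal C sketch of what a direct storage-to-GPU read looks like with the cuFile API that NVIDIA ships as part of GPUDirect Storage. The file path and transfer size are placeholders, error handling is omitted, and this is illustrative only; it is not the code presented in the talk nor part of the IO500 benchmark suite.

  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <unistd.h>
  #include <cuda_runtime.h>
  #include <cufile.h>

  int main(void) {
      cuFileDriverOpen();                           /* initialize the GPUDirect Storage driver */

      int fd = open("/mnt/fs/testfile", O_RDONLY | O_DIRECT);   /* placeholder path */

      CUfileDescr_t descr = {0};
      descr.handle.fd = fd;
      descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;

      CUfileHandle_t fh;
      cuFileHandleRegister(&fh, &descr);            /* register the file with cuFile */

      const size_t size = 1 << 20;                  /* 1 MiB placeholder transfer */
      void *devPtr;
      cudaMalloc(&devPtr, size);
      cuFileBufRegister(devPtr, size, 0);           /* pin the GPU buffer for DMA */

      /* Read directly from storage into GPU memory; no CPU bounce buffer. */
      ssize_t n = cuFileRead(fh, devPtr, size, 0, 0);

      cuFileBufDeregister(devPtr);
      cuFileHandleDeregister(fh);
      close(fd);
      cudaFree(devPtr);
      cuFileDriverClose();
      return n < 0;
  }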