hpsl:2021:deu:dkrz:lustre01 [2021/06/05 13:06] (current)
====== FS: Lustre01 ======
===== Characteristics =====

<code>
name: Lustre Phase1
</code>

===== Description =====

The DKRZ system was procured in two phases of roughly the same size.
The storage of the first phase consisted of [[http://

Both systems are configured in Scalable System Units (SSUs): pairs of servers in an active/active failover configuration.

===== Measurement protocols =====

==== Peak performance ====

The peak performance is derived from the maximum performance possible on a ClusterStor 9000, that is 5.4 GiB/s, multiplied by the number of servers in the SSUs.
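The derivation is simple arithmetic; a minimal sketch, assuming 21 SSUs for phase 1 (the count listed in the sustained-performance configuration below; the actual multiplier is truncated in the text above, so treat it as an illustration):

```python
# Illustrative aggregate-peak calculation.
# Assumption: 21 SSUs (taken from the phase 1 striping configuration);
# the per-SSU peak of 5.4 GiB/s is the ClusterStor 9000 figure from the text.
PEAK_PER_SSU_GIB_S = 5.4
NUM_SSUS = 21  # assumed count for phase 1

aggregate_peak = PEAK_PER_SSU_GIB_S * NUM_SSUS
print(f"Aggregate peak: {aggregate_peak:.1f} GiB/s")  # Aggregate peak: 113.4 GiB/s
```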

==== Sustained metadata performance ====

Performance has been measured using [[tools:
With Lustre DNE phase 1, DKRZ distributes the load among 5 metadata servers.
The observed performance was roughly 25 kOP/s on the root MDS and 15 kOP/s on each DNE MDS, resulting in roughly 80 kOP/s in aggregate.
Five folders were pre-created and distributed across the metadata servers.
Then five Parabench instances were started at the same time to create the typical workload behavior.
The benchmark runs for a considerable time on 16 nodes with 16 processes per node but does not explicitly synchronize between the individual Parabench runs.
Theoretically,
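The aggregate rate can be sketched from the per-server observations above; assuming the root MDS is one of the five metadata servers (so four DNE MDS), the sum lands slightly above the quoted rough figure of 80 kOP/s:

```python
# Aggregate metadata rate from the per-server observations.
# Assumption: the five metadata servers are 1 root MDS + 4 DNE MDS.
ROOT_MDS_KOPS = 25   # observed on the root MDS
DNE_MDS_KOPS = 15    # observed on each DNE MDS
NUM_DNE_MDS = 4      # assumed: five servers minus the root

aggregate_kops = ROOT_MDS_KOPS + NUM_DNE_MDS * DNE_MDS_KOPS
print(f"Aggregate: {aggregate_kops} kOP/s")  # Aggregate: 85 kOP/s
```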

==== Sustained performance ====

Performance of the phase 1 system has been measured with [[tools:
The configuration was as follows:

  * Striping across 84 OSTs = 21 SSUs + 21 extension units
  * 168 compute nodes, 6 IOR procs per node
  * Arguments to IOR: -b 2000000 -t 2000000
  * The amount of data was about 3x the main memory of the used nodes
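A few figures derived from this configuration (only the numbers listed above are used):

```python
# Derived quantities from the IOR run configuration.
NODES = 168          # compute nodes
PROCS_PER_NODE = 6   # IOR processes per node
OSTS = 84            # OSTs striped across

total_procs = NODES * PROCS_PER_NODE
procs_per_ost = total_procs / OSTS
print(f"{total_procs} processes, {procs_per_ost:.0f} per OST")  # 1008 processes, 12 per OST
```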