b_eff_io

b_eff_io is short for Effective I/O Bandwidth Benchmark. It aims to measure the I/O performance that parallel applications can expect, and it also evaluates the speed of different access patterns.

Usage

After compiling the benchmark with mpicc, run it with mpirun. The only mandatory parameters are -MB and -MT, which set the memory per processor (reported as MEMORY_PER_PROCESSOR in the output) and the total memory, respectively, both in MBytes.
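When the benchmark is run repeatedly with different settings, it can help to assemble the command line in a small script. The following is a minimal sketch, not part of b_eff_io: the helper name `b_eff_io_command` and its parameters are hypothetical, and it only uses the flags that appear in the example on this page. It assumes that -N is set to the same value as -np, as in that example, and that -T is given in seconds.

```python
import shlex


def b_eff_io_command(nprocs, mem_per_pe_mb, mem_total_mb, *, seconds=None,
                     path=None, prefix=None, extra_flags=()):
    """Return the mpirun argument list for one b_eff_io run (hypothetical helper)."""
    cmd = ["mpirun", "-np", str(nprocs), "./b_eff_io",
           "-MB", str(mem_per_pe_mb),   # mandatory: memory per processor in MBytes
           "-MT", str(mem_total_mb)]    # mandatory: total memory in MBytes
    cmd += list(extra_flags)            # e.g. "-noshared", "-rewrite"
    cmd += ["-N", str(nprocs)]          # assumption: same value as -np
    if seconds is not None:
        cmd += ["-T", str(seconds)]     # scheduled time
    if path is not None:
        cmd += ["-p", path]             # directory used for the test files
    if prefix is not None:
        cmd += ["-f", prefix]           # prefix of the result files
    return cmd


cmd = b_eff_io_command(32, 2048, 81920, seconds=60,
                       path="/MY_fast_filesystem/MY_directory",
                       prefix="MYSYS_32pe_0060sec_noshared",
                       extra_flags=("-noshared", "-rewrite"))
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems with unusual paths.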

Example Output

Running

mpirun -np 32 ./b_eff_io -MB 2048 -MT 81920 -noshared -rewrite -N 32 -T 60 -p /MY_fast_filesystem/MY_directory -f MYSYS_32pe_0060sec_noshared

results in the following output on our test machine (saved in MYSYS_32pe_0060sec_noshared.sum):

b_eff_io.c, Revision 2.1 from Dec. 12, 2001

MEMORY_PER_PROCESSOR = 2048 MBytes  [1MBytes = 1024*1024 bytes, 1MB = 1e6 bytes]
Maximum chunk size   =   16.000 MBytes

-N  32 T=60, MT=81920 MBytes, -noshared, -rewrite
PATH=/home/gresens/test, PREFIX=MYSYS_32pe_0060sec_noshared
       system name : Linux
       hostname    : cluster
       OS release  : 3.13.0-76-generic
       OS version  : #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016
       machine     : x86_64

Date of measurement: Sat Mar 26 17:42:34 2016



Summary of file I/O bandwidth accumulated on  32 processes with 2048 MByte/PE
-----------------------------------------------------------------------------

 number pos chunk-   access   type=0   type=1   type=2   type=3   type=4
 of PEs     size (l) methode scatter  shared  separate segmened seg-coll
            [bytes]  methode  [MB/s]   [MB/s]   [MB/s]   [MB/s]   [MB/s]
 -----------------------------------------------------------------------

  32 PEs 1     1024 write     50.373    0.621    1.462    2.136    0.814
  32 PEs 2     1032 write     60.067    0.662    1.909    1.582    0.695
  32 PEs 3    32768 write     58.940   11.607   32.038   19.695   17.600
  32 PEs 4    32776 write     61.503   16.008   34.560   21.868   13.998
  32 PEs 5  1048576 write     82.490   89.902  113.474   92.407   83.098
  32 PEs 6  1048584 write     82.964   84.907  105.175   80.906   82.537
  32 PEs 7 16777216 write    109.741  111.965  113.723  109.800  109.211
  32 PEs      total-write     86.998   88.472   95.186   83.414   71.959

  32 PEs 1     1024 rewrite   71.952    0.532    4.815    1.552    1.129
  32 PEs 2     1032 rewrite   30.374    0.571    2.961    1.132    0.610
  32 PEs 3    32768 rewrite   78.283    6.473   18.009    7.372    7.933
  32 PEs 4    32776 rewrite   81.708    9.767   10.925   11.109    8.962
  32 PEs 5  1048576 rewrite   51.444   27.174  110.536   44.736   48.128
  32 PEs 6  1048584 rewrite   42.578   52.136   60.421   57.537   60.699
  32 PEs 7 16777216 rewrite   76.624   63.918  107.528   78.295   85.087
  32 PEs      total-rewrite   64.469   49.024   70.673   50.648   53.845

  32 PEs 1     1024 read      38.914    3.526    3.072    2.419    0.818
  32 PEs 2     1032 read      37.997    1.154    4.093    2.984    0.953
  32 PEs 3    32768 read      31.802   23.553   11.717    2.973    2.758
  32 PEs 4    32776 read      39.174   28.625   14.161   11.857    9.168
  32 PEs 5  1048576 read      70.816   49.608   40.960   31.331   31.057
  32 PEs 6  1048584 read      77.031   77.107   77.181   25.288   27.258
  32 PEs 7 16777216 read      50.218   41.240   84.615   71.739   78.096
  32 PEs      total-read      47.801   40.171   46.944   40.404   40.009

This table shows all results, except pattern 2 (scatter, l=1MBytes, L=2MBytes): 
  bw_pat2=   59.459 MB/s write,   76.267 MB/s rewrite,   37.815 MB/s read

(For gnuplot:)
  set xtics ( '1k' 1, '+8' 2, '32k' 4, '+8' 5, '1M' 7, '+8' 8, '16M' 10)
  set title 'Linux cluster 3.13.0-76-generic #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016 x86_64' -4
  set label 1 'b_eff_io'  at 10,50000 right
  set label 2 'rel. 2.1'  at 10,25000 right
  set label 3 'T=1.0min' at 10,12500 right
  set label 4 'n=32'     at 10,6250  right
  set label 5 'workaround for type 1:'  at 10,0.50 right
  set label 6 'individual file pointer' at 10,0.25 right

weighted average bandwidth for write   :   85.504 MB/s on 32 processes
weighted average bandwidth for rewrite :   58.855 MB/s on 32 processes
weighted average bandwidth for read    :   43.855 MB/s on 32 processes
(type=0 is weighted double)

Total amount of data written/read with each access method: 3516.800 MBytes
  =  4.3 percent of the total memory (81920 MBytes)

b_eff_io of these measurements =   59.876 MB/s on 32 processes with 2048 MByte/PE and scheduled time=1.0 min

NOT VALID for comparison of different systems
  criterion 1: scheduled time 1.0 min >= 30 min -- NOT reached
  criterion 2: shared file pointers must be used for pattern type 1 -- NOT reached
  criterion 3: error count (0) == 0 -- reached

Maximum over all number of PEs
------------------------------


 b_eff_io =   59.876 MB/s on 32 processes with 2048 MByte/PE, scheduled time=1.0 Min, on Linux cluster 3.13.0-76-generic #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016 x86_64, NOT VALID (see above)

Output has been created using version 2.1.
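The "weighted average bandwidth" lines in the output above can be recomputed from the total rows of the summary table. The sketch below assumes that the note "(type=0 is weighted double)" means weights of 2, 1, 1, 1, 1 for types 0 to 4. The variable and function names are illustrative, and the numbers are copied from the output on this page. It also checks the reported percentage of the total memory.

```python
# Per-type totals in MB/s, copied from the "total-write", "total-rewrite"
# and "total-read" rows of the summary table (type=0 ... type=4).
totals = {
    "write":   [86.998, 88.472, 95.186, 83.414, 71.959],
    "rewrite": [64.469, 49.024, 70.673, 50.648, 53.845],
    "read":    [47.801, 40.171, 46.944, 40.404, 40.009],
}

# Assumption: type 0 (scatter) counts twice, all other types count once.
WEIGHTS = [2, 1, 1, 1, 1]


def weighted_average(per_type):
    """Weighted mean of the per-type bandwidths."""
    return sum(w * bw for w, bw in zip(WEIGHTS, per_type)) / sum(WEIGHTS)


for method, per_type in totals.items():
    print(f"weighted average for {method:8s}: {weighted_average(per_type):.3f} MB/s")

# Data moved per access method relative to the total memory given with -MT.
percent = 100 * 3516.800 / 81920
print(f"{percent:.1f} percent of the total memory")
```

Because the table entries are themselves rounded to three decimals, the recomputed averages match the reported 85.504, 58.855 and 43.855 MB/s only to within rounding.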