====== b_eff_io ======
---- dataentry benchmark ----
name : b_eff_io #
layers_ttags : MPI-IO #
features_ttags : data #
parallel_ttags : MPI #
type_ttags : synthetic #
webpage_url : https://fs.hlrs.de/projects/par/mpi//b_eff_io/ #
license_ttag : tbd #
----
b_eff_io is short for //Effective I/O Bandwidth Benchmark//. It aims to measure the I/O bandwidth that parallel applications can expect in practice, while also evaluating how fast different access patterns are handled.
===== Usage =====
After compiling the benchmark with ''mpicc'', simply start the resulting executable with ''mpirun''.
The only mandatory parameters are ''-MB'' and ''-MT'', which set the memory per process (PE) and the total memory of the system, respectively.
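A minimal build-and-run sketch (the process count, memory sizes, directory, and file prefix below are placeholders; ''-p'' selects the directory to write to and ''-f'' the prefix of the result files, as in the example in the next section):

  # compile the benchmark with the MPI compiler wrapper
  # (add any flags your MPI installation requires)
  mpicc -o b_eff_io b_eff_io.c
  # small trial run: 8 processes, 512 MByte per PE, 4096 MByte total memory
  mpirun -np 8 ./b_eff_io -MB 512 -MT 4096 -p /path/to/fast_filesystem -f trial_8pe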
===== Example Output =====
Running

  mpirun -np 32 ./b_eff_io -MB 2048 -MT 81920 -noshared -rewrite -N 32 -T 60 -p /MY_fast_filesystem/MY_directory -f MYSYS_32pe_0060sec_noshared

results in the following output on our test machine (saved in ''MYSYS_32pe_0060sec_noshared.sum''):

  b_eff_io.c, Revision 2.1 from Dec. 12, 2001
  MEMORY_PER_PROCESSOR = 2048 MBytes [1MBytes = 1024*1024 bytes, 1MB = 1e6 bytes]
  Maximum chunk size = 16.000 MBytes
  -N 32 T=60, MT=81920 MBytes, -noshared, -rewrite
  PATH=/home/gresens/test, PREFIX=MYSYS_32pe_0060sec_noshared
  system name : Linux
  hostname    : cluster
  OS release  : 3.13.0-76-generic
  OS version  : #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016
  machine     : x86_64
  Date of measurement: Sat Mar 26 17:42:34 2016
  Summary of file I/O bandwidth accumulated on 32 processes with 2048 MByte/PE
  --------------------------------------------------------------------------------------
  number pos    chunk-  access            type=0    type=1    type=2    type=3    type=4
  of PEs      size (l)  methode          scatter    shared  separate  segmened  seg-coll
               [bytes]                    [MB/s]    [MB/s]    [MB/s]    [MB/s]    [MB/s]
  --------------------------------------------------------------------------------------
  32 PEs   1      1024  write             50.373     0.621     1.462     2.136     0.814
  32 PEs   2      1032  write             60.067     0.662     1.909     1.582     0.695
  32 PEs   3     32768  write             58.940    11.607    32.038    19.695    17.600
  32 PEs   4     32776  write             61.503    16.008    34.560    21.868    13.998
  32 PEs   5   1048576  write             82.490    89.902   113.474    92.407    83.098
  32 PEs   6   1048584  write             82.964    84.907   105.175    80.906    82.537
  32 PEs   7  16777216  write            109.741   111.965   113.723   109.800   109.211
  32 PEs                total-write       86.998    88.472    95.186    83.414    71.959
  32 PEs   1      1024  rewrite           71.952     0.532     4.815     1.552     1.129
  32 PEs   2      1032  rewrite           30.374     0.571     2.961     1.132     0.610
  32 PEs   3     32768  rewrite           78.283     6.473    18.009     7.372     7.933
  32 PEs   4     32776  rewrite           81.708     9.767    10.925    11.109     8.962
  32 PEs   5   1048576  rewrite           51.444    27.174   110.536    44.736    48.128
  32 PEs   6   1048584  rewrite           42.578    52.136    60.421    57.537    60.699
  32 PEs   7  16777216  rewrite           76.624    63.918   107.528    78.295    85.087
  32 PEs                total-rewrite     64.469    49.024    70.673    50.648    53.845
  32 PEs   1      1024  read              38.914     3.526     3.072     2.419     0.818
  32 PEs   2      1032  read              37.997     1.154     4.093     2.984     0.953
  32 PEs   3     32768  read              31.802    23.553    11.717     2.973     2.758
  32 PEs   4     32776  read              39.174    28.625    14.161    11.857     9.168
  32 PEs   5   1048576  read              70.816    49.608    40.960    31.331    31.057
  32 PEs   6   1048584  read              77.031    77.107    77.181    25.288    27.258
  32 PEs   7  16777216  read              50.218    41.240    84.615    71.739    78.096
  32 PEs                total-read        47.801    40.171    46.944    40.404    40.009
  This table shows all results, except pattern 2 (scatter, l=1MBytes, L=2MBytes):
    bw_pat2= 59.459 MB/s write, 76.267 MB/s rewrite, 37.815 MB/s read
  (For gnuplot:)
  set xtics ( '1k' 1, '+8' 2, '32k' 4, '+8' 5, '1M' 7, '+8' 8, '16M' 10)
  set title 'Linux cluster 3.13.0-76-generic #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016 x86_64' -4
  set label 1 'b_eff_io' at 10,50000 right
  set label 2 'rel. 2.1' at 10,25000 right
  set label 3 'T=1.0min' at 10,12500 right
  set label 4 'n=32' at 10,6250 right
  set label 5 'workaround for type 1:' at 10,0.50 right
  set label 6 'individual file pointer' at 10,0.25 right
  weighted average bandwidth for write   :  85.504 MB/s on 32 processes
  weighted average bandwidth for rewrite :  58.855 MB/s on 32 processes
  weighted average bandwidth for read    :  43.855 MB/s on 32 processes
  (type=0 is weighted double)
  Total amount of data written/read with each access method: 3516.800 MBytes
                       = 4.3 percent of the total memory (81920 MBytes)
  b_eff_io of these measurements = 59.876 MB/s on 32 processes with 2048 MByte/PE and scheduled time=1.0 min
  NOT VALID for comparison of different systems
  criterion 1: scheduled time 1.0 min >= 30 min -- NOT reached
  criterion 2: shared file pointers must be used for pattern type 1 -- NOT reached
  criterion 3: error count (0) == 0 -- reached
  Maximum over all number of PEs
  ------------------------------
  b_eff_io = 59.876 MB/s on 32 processes with 2048 MByte/PE, scheduled time=1.0 Min, on Linux cluster 3.13.0-76-generic #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016 x86_64, NOT VALID (see above)
The output above was created with version 2.1 of the benchmark.
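The weighted averages in the summary follow from the ''total-write'', ''total-rewrite'', and ''total-read'' rows of the table, with the scatter pattern (type=0) counted twice, e.g. for write:

  (2*86.998 + 88.472 + 95.186 + 83.414 + 71.959) / 6  ≈  85.504 MB/s

Note that this run is flagged ''NOT VALID'' for cross-system comparison: the scheduled time of 1.0 min (''-T 60'') is far below the required 30 minutes, and ''-noshared'' replaces the shared file pointers of pattern type 1 with individual file pointers, so criteria 1 and 2 are not met.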