====== b_eff_io ======

---- dataentry benchmark ----
name           : b_eff_io #
layers_ttags   : MPI-IO #
features_ttags : data
parallel_ttags : MPI
type_ttags     : synthetic #
webpage_url    : https://fs.hlrs.de/projects/par/mpi//b_eff_io/ #
license_ttag   : tbd #
----

b_eff_io is short for //Effective I/O Bandwidth Benchmark//. It aims to measure the I/O performance that parallel applications can expect, while also evaluating the speed of a number of different access patterns (a minimal MPI-IO sketch of one such pattern is shown at the bottom of this page).

===== Usage =====

After compiling the benchmark with ''mpicc'', simply start the tool with ''mpirun''. The only mandatory parameters are ''-MB'' and ''-MT'', which set the memory per node and the total memory used for the benchmark, respectively.

===== Example Output =====

Running

<code>
mpirun -np 32 ./b_eff_io -MB 2048 -MT 81920 -noshared -rewrite -N 32 -T 60 -p /MY_fast_filesystem/MY_directory -f MYSYS_32pe_0060sec_noshared
</code>

results in the following output on our test machine (saved in ''MYSYS_32pe_0060sec_noshared.sum''):

<code>
b_eff_io.c, Revision 2.1 from Dec. 12, 2001

MEMORY_PER_PROCESSOR = 2048 MBytes  [1MBytes = 1024*1024 bytes, 1MB = 1e6 bytes]
Maximum chunk size   = 16.000 MBytes
-N 32  T=60, MT=81920 MBytes, -noshared, -rewrite
PATH=/home/gresens/test, PREFIX=MYSYS_32pe_0060sec_noshared

system name : Linux
hostname    : cluster
OS release  : 3.13.0-76-generic
OS version  : #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016
machine     : x86_64
Date of measurement: Sat Mar 26 17:42:34 2016

Summary of file I/O bandwidth accumulated on 32 processes with 2048 MByte/PE
-----------------------------------------------------------------------------

 number  pos  chunk-     access   type=0   type=1   type=2   type=3   type=4
 of PEs       size (l)   methode  scatter  shared   separate segmened seg-coll
              [bytes]    methode  [MB/s]   [MB/s]   [MB/s]   [MB/s]   [MB/s]
 -----------------------------------------------------------------------
 32 PEs   1       1024   write     50.373    0.621    1.462    2.136    0.814
 32 PEs   2       1032   write     60.067    0.662    1.909    1.582    0.695
 32 PEs   3      32768   write     58.940   11.607   32.038   19.695   17.600
 32 PEs   4      32776   write     61.503   16.008   34.560   21.868   13.998
 32 PEs   5    1048576   write     82.490   89.902  113.474   92.407   83.098
 32 PEs   6    1048584   write     82.964   84.907  105.175   80.906   82.537
 32 PEs   7   16777216   write    109.741  111.965  113.723  109.800  109.211
 32 PEs        total-write         86.998   88.472   95.186   83.414   71.959
 32 PEs   1       1024   rewrite   71.952    0.532    4.815    1.552    1.129
 32 PEs   2       1032   rewrite   30.374    0.571    2.961    1.132    0.610
 32 PEs   3      32768   rewrite   78.283    6.473   18.009    7.372    7.933
 32 PEs   4      32776   rewrite   81.708    9.767   10.925   11.109    8.962
 32 PEs   5    1048576   rewrite   51.444   27.174  110.536   44.736   48.128
 32 PEs   6    1048584   rewrite   42.578   52.136   60.421   57.537   60.699
 32 PEs   7   16777216   rewrite   76.624   63.918  107.528   78.295   85.087
 32 PEs        total-rewrite       64.469   49.024   70.673   50.648   53.845
 32 PEs   1       1024   read      38.914    3.526    3.072    2.419    0.818
 32 PEs   2       1032   read      37.997    1.154    4.093    2.984    0.953
 32 PEs   3      32768   read      31.802   23.553   11.717    2.973    2.758
 32 PEs   4      32776   read      39.174   28.625   14.161   11.857    9.168
 32 PEs   5    1048576   read      70.816   49.608   40.960   31.331   31.057
 32 PEs   6    1048584   read      77.031   77.107   77.181   25.288   27.258
 32 PEs   7   16777216   read      50.218   41.240   84.615   71.739   78.096
 32 PEs        total-read          47.801   40.171   46.944   40.404   40.009

This table shows all results, except pattern 2 (scatter, l=1MBytes, L=2MBytes):
bw_pat2= 59.459 MB/s write, 76.267 MB/s rewrite, 37.815 MB/s read

(For gnuplot:)
set xtics ( '1k' 1, '+8' 2, '32k' 4, '+8' 5, '1M' 7, '+8' 8, '16M' 10)
set title 'Linux cluster 3.13.0-76-generic #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016 x86_64' -4
set label 1 'b_eff_io'  at 10,50000 right
set label 2 'rel. 2.1'  at 10,25000 right
set label 3 'T=1.0min'  at 10,12500 right
set label 4 'n=32'      at 10,6250 right
set label 5 'workaround for type 1:'   at 10,0.50 right
set label 6 'individual file pointer'  at 10,0.25 right

weighted average bandwidth for write   : 85.504 MB/s on 32 processes
weighted average bandwidth for rewrite : 58.855 MB/s on 32 processes
weighted average bandwidth for read    : 43.855 MB/s on 32 processes
(type=0 is weighted double)

Total amount of data written/read with each access method:
3516.800 MBytes = 4.3 percent of the total memory (81920 MBytes)

b_eff_io of these measurements = 59.876 MB/s on 32 processes
with 2048 MByte/PE and scheduled time=1.0 min
NOT VALID for comparison of different systems

criterion 1: scheduled time 1.0 min >= 30 min -- NOT reached
criterion 2: shared file pointers must be used for pattern type 1 -- NOT reached
criterion 3: error count (0) == 0 -- reached

Maximum over all number of PEs
------------------------------

b_eff_io = 59.876 MB/s on 32 processes with 2048 MByte/PE,
scheduled time=1.0 Min,
on Linux cluster 3.13.0-76-generic #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016 x86_64,
NOT VALID (see above)

Output has been created using version 2.1.
</code>
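The pattern columns in the summary above (type=0 to type=4: scatter, shared, separate, segmented and segmented-collective) correspond to different MPI-IO access styles. As a rough illustration only (this sketch is not taken from b_eff_io; the file name ''testfile'', the chunk size and the choice of pattern are assumptions), the following program lets every process write one contiguous segment of a common file, which is the basic idea behind a segmented pattern:

<code c>
/* Minimal sketch, NOT part of b_eff_io: every process writes one
 * contiguous 1 KiB segment of a common file ("testfile" is a made-up
 * name), illustrating a segmented access pattern. */
#include <string.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File   fh;
    int        rank;
    char       buf[1024];                  /* one chunk per process  */
    MPI_Offset segment = sizeof(buf);      /* length of each segment */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    memset(buf, rank, sizeof(buf));        /* fill with rank-specific data */

    /* All processes open the same file; each writes at its own offset. */
    MPI_File_open(MPI_COMM_WORLD, "testfile",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_at(fh, (MPI_Offset)rank * segment, buf,
                      (int)sizeof(buf), MPI_BYTE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}
</code>

Like the benchmark itself, such a sketch is compiled with ''mpicc'' and started with ''mpirun''.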