Differences

This shows you the differences between two versions of the page.

std:compression:start [2017/05/16 13:31]
kunkel
std:compression:start [2017/05/18 12:05] (current)
novikova
Line 4: Line 4:
  
  
-The goal of this effort are to establish conventions about quantities for lossless and primarily lossy compression algorithms that are useful for users to define and set.
+The goal of this effort is to establish conventions about quantities for lossless and primarily lossy compression algorithms that are useful for users to define and set.
  
 {{ :std:compression:example-data-simplex-206-sigbits-3bits.png?200|}}
 In detail:
   * Identification of all quantities that are, from the user's perspective, useful to set on a compression algorithm, i.e., they help users to control the compression rate and performance.
-  * Define these quantities properly and assign understandable abbreviation for the quantities
+  * Define these quantities properly and assign an understandable abbreviation for the quantities
   * Foster development of APIs and tools that use these quantities
  
Line 29: Line 29:
 Our strategy and timeline for establishing these conventions are as follows:

-  * Identification first set of user-defined quantities that shall be allowed to set for the compression. (Please see the current list of quantities below)
+  * Identification of the first set of user-defined quantities that users shall be allowed to set for the compression. (Please see the current list of quantities below)
   * Invite international experts to this effort
   * Identify relevant quantities
Line 36: Line 36:
 ===== Quantities =====

-The following list of quantities contains candidates for the standardization.
+The following list of quantities contains candidates for the standardisation.
 They can be classified into: 1) accuracy/precision loss bounding quantities for lossy algorithms; 2) performance-related quantities; 3) other quantities
  
Line 42: Line 42:
  
 These quantities define the tolerable error on individual values or multidimensional fields of data of a given datatype.
-The definition is mostly based on the notion of the term error, which is the residual when subtracting the (lossy) compressed value (c) from the true value (v).
+The definition is mostly based on the notion of error, which is the residual obtained by subtracting the (lossy) compressed value (d) from the true value (v).

-  * **Absolute error tolerance**: is the maximum amount of the residual error in the calculations; abs(v-c) < absolute error
+  * **Absolute error tolerance** is the maximum allowed residual error: abs(v-d) < absolute error tolerance.
+  * **Relative error tolerance** bounds the absolute error relative to the magnitude of the true value: abs(v-d)/abs(v) < relative error tolerance.
+  * **Relative error with finest absolute tolerance** is a combination of two quantities. With a relative tolerance alone, small numbers around 0 are problematic for compressors, e.g., a 1% relative error for the data value 0.01 requires a compressed accuracy of 0.01±0.0001. The finest absolute tolerance puts a floor under the error bound derived from the relative tolerance. In our example, setting a finest absolute tolerance of 0.01 would result in an error of ±0.01 for small numbers, while for large numbers the relative error applies. Thus, it is the lower bound on the guaranteed error for relative error bounds, whereas the absolute tolerance is the guaranteed resolution for all data points.
+  * **Precision bits and precision digits** indicate how many bits or decimal digits are required to represent the array values.
+  * **Mean squared error (MSE)** is the arithmetic mean of the squared errors between decompressed and original values.
+  * **Standard deviation** is the square root of the mean squared error.
+  * **Average absolute deviation** summarises the statistical dispersion or variability.
+  * **Peak signal-to-noise ratio (PSNR)** is the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation.
+  * **Preserved values** are values that must be preserved literally, i.e., they cannot be changed, so only lossless compression can be applied to them.
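
The error quantities listed above can be illustrated with a small sketch. It is not part of the proposed conventions: the function names ''error_metrics'' and ''within_tolerance'' are hypothetical, the PSNR peak is assumed to be the value range of the original data, and the combined bound is read as "relative tolerance, but never tighter than the finest absolute tolerance".

```python
import math


def error_metrics(original, decompressed):
    """Error measures between true values v and decompressed values d.

    Hypothetical helper; the PSNR peak is ASSUMED to be max(v) - min(v).
    """
    residuals = [v - d for v, d in zip(original, decompressed)]
    n = len(residuals)
    max_abs = max(abs(r) for r in residuals)
    # Relative error is undefined for v == 0, so those points are skipped.
    max_rel = max(
        (abs(r) / abs(v) for r, v in zip(residuals, original) if v != 0),
        default=0.0,
    )
    mse = sum(r * r for r in residuals) / n
    value_range = max(original) - min(original)
    if mse == 0:
        psnr_db = math.inf
    else:
        psnr_db = 10 * math.log10(value_range ** 2 / mse)
    return {
        "max_abs": max_abs,
        "max_rel": max_rel,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "psnr_db": psnr_db,
    }


def within_tolerance(v, d, rel_tol, finest_abs_tol):
    """Relative bound with a floor given by the finest absolute tolerance."""
    return abs(v - d) <= max(rel_tol * abs(v), finest_abs_tol)


if __name__ == "__main__":
    v = [1.0, 2.0, 3.0, 4.0]
    d = [1.1, 1.9, 3.0, 4.2]
    print(error_metrics(v, d))
    # Small value: the finest absolute tolerance of 0.01 dominates the 1% bound.
    print(within_tolerance(0.01, 0.015, rel_tol=0.01, finest_abs_tol=0.01))
    # Large value: the 1% relative bound dominates.
    print(within_tolerance(1000.0, 1005.0, rel_tol=0.01, finest_abs_tol=0.01))
```

With a finest absolute tolerance of 0, the same check degenerates to a pure relative bound, which is the case the text describes as problematic for values near 0.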
  
 ==== Performance quantities ====
+  * **Compression/decompression speed** sets a throughput limit. If it is not set, a default will be used to achieve the maximum error tolerance.
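
To make the throughput quantity concrete, the following sketch measures compression and decompression speed for a byte buffer. It uses Python's standard ''zlib'' module purely as a stand-in lossless compressor; the function name ''measure_throughput'' and the choice of MiB/s as the unit are assumptions, not part of the conventions.

```python
import time
import zlib


def measure_throughput(data: bytes, level: int = 6):
    """Measure compression/decompression throughput in MiB/s (hypothetical helper)."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    compress_seconds = time.perf_counter() - start

    start = time.perf_counter()
    restored = zlib.decompress(compressed)
    decompress_seconds = time.perf_counter() - start

    if restored != data:
        raise ValueError("round trip failed")

    mib = len(data) / 2 ** 20
    # Guard against a zero interval on very coarse timers.
    return {
        "compress_mib_s": mib / max(compress_seconds, 1e-12),
        "decompress_mib_s": mib / max(decompress_seconds, 1e-12),
        "ratio": len(data) / len(compressed),
    }


if __name__ == "__main__":
    print(measure_throughput(b"abc" * 100_000))
```

A tool honouring a user-defined speed limit could compare such measurements against the requested throughput and pick a faster or slower algorithm setting accordingly.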
  
 ==== Other quantities ====

-  * **Rate limitation**: Defines the mean number of bits to be used for compression. Based on the entropy and, thus, compressibility of information, the precision of data is reduced to meet the overall mean rate.
+  * **Rate limitation** defines the mean number of bits to be used for compression. Based on the entropy and, thus, the compressibility of information, the precision of data is reduced to meet the overall mean rate.
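
The interplay between precision and rate can be sketched as follows. This is an illustration under assumptions: the rate is interpreted as mean bits per value after compression, precision is reduced by zeroing low mantissa bits of IEEE 754 doubles, and ''zlib'' stands in for the lossless back end. The names ''keep_mantissa_bits'' and ''mean_bits_per_value'' are hypothetical.

```python
import math
import struct
import zlib


def keep_mantissa_bits(x: float, bits: int) -> float:
    """Zero all but the top `bits` mantissa bits of a double (0 <= bits <= 52)."""
    (raw,) = struct.unpack("<Q", struct.pack("<d", x))
    mask = ~((1 << (52 - bits)) - 1) & 0xFFFFFFFFFFFFFFFF
    (truncated,) = struct.unpack("<d", struct.pack("<Q", raw & mask))
    return truncated


def mean_bits_per_value(values) -> float:
    """Mean number of bits per value after zlib compression (assumed rate measure)."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return 8 * len(zlib.compress(raw)) / len(values)


if __name__ == "__main__":
    data = [math.sin(i * 0.01) for i in range(2000)]
    reduced = [keep_mantissa_bits(x, 3) for x in data]
    print("full precision:", mean_bits_per_value(data))
    print("3 mantissa bits:", mean_bits_per_value(reduced))
```

A rate-limited compressor could, in this reading, lower the number of retained bits until the measured mean rate falls below the user-defined limit.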
  
  
Line 61: Line 70:
  
  
 +