API Snapshots: Java Core, C++ Core, Python, Memory, Pig, Hive,

Introduction to the 3 Quantiles Sketches

Quantile-type sketches in the library

This is an overview of the three types of quantiles sketches in the library. Each of these quantile types may have one or more specific implementtaions.

The mathematical error bounds of all the quantile sketches is specified with respect to rank and not with respect to quantiles. In other words, the difference between the rank upper bound and the rank lower bound is the confidence interval and can be expressed as a percent of the overall rank distribution (which is 1.0) and is the mathematically derived error for a specific configuration of the sketch.

Although the quantile upper bound and quantile lower bounds can be approximately computed from the rank upper bound and rank lower bound, and the difference between the quantile bounds is also an approximate confidence interval, the size of the quantile confidence interval may not be meaningful and is not constrained by the defined error of the sketch.

These sketches have many parallel methods. Please refer to the individual Javadocs for more information.

The REQ Sketch

  • Java
  • C++, Python
  • Key Features (both Java & C++)
    • Accuracy %: a function of K and relative with respect to rank. The user can select either High Rank Accuracy (HRA) or Low Rank Accuracy (LRA). This enables extremely high accuracy for the ends of the rank domain. E.g., 99.999%ile quantiles.
    • User selectable comparison QuantileSearchCriteria:
      • Exclusive, which is compatible with the KLL and older Quantiles Sketch
      • Inclusive, a common definition in some of the theoretical literature.
    • Java: Dedicated float implementation.
    • C++: Template implementation for arbitrary comparible types.
    • Python: Dedicated float and integer implementations.

The KLL Sketch

  • Java
  • C++, Python
  • Key Features (both Java & C++)
    • User selectable comparison QuantileSearchCriteria:
      • Exclusive, which is compatible with the KLL and older Quantiles Sketch
      • Inclusive, a common definition in some of the theoretical literature.
    • Accuracy %: a function of K and independent of rank.
    • Near optimal accuracy per sketch size compared to other constant accuracy quantiles sketches.
    • Java: Dedicated float and double implementations.
    • C++: Template implementation for arbitrary comparible types.
    • Python: Dedicated float and integer implementations

The Quantiles Sketch

  • Java
  • C++, Python
  • Key Features (both Java & C++)
    • User selectable comparison QuantileSearchCriteria:
      • Exclusive, which is compatible with the KLL and older Quantiles Sketch
      • Inclusive, a common definition in some of the theoretical literature.
    • Accuracy %: a function of K and independent of rank.
    • Dedicated double implentation.
    • java generic implementation for arbitrary comparible types.
    • The dedicated double implementation can be configured for off-heap operation.