Skip to main content

Python library implementing a subset of streaming algorithms. Includes variations of these algorithms (e.g. adversarially robust), as well as support for multiple data types.

Project description

sublinear

Python library implementing a subset of streaming algorithms. Includes variations of these algorithms (e.g. adversarially robust), as well as support for multiple data types.

:zap: Algorithms

Here is the list of the currently implemented streaming algorithms.

F0 Estimation (Count of Distinct Elements)

  • BJKST Sketch (basic, plus, adversarially robust) [1]

  • HyperLogLog [2]

F1 Estimation (Length of Stream)

  • Morris (basic, plus, plus plus) [3][4]

F2 Estimation (Estimate of Second Moment)

  • AMS Sketch (basic, plus, plus plus) [5]

Frequency Table Estimation

  • Count Min Sketch [6]

Heavy Hitters

  • Misra-Gries Sketch [7]

Other

  • K Independent Hash Function [8]

:book: Bibliography

[1] Bar-Yossef, Ziv, et al. "Counting distinct elements in a data stream." Randomization and Approximation Techniques in Computer Science. Springer Berlin Heidelberg, 2002.

[2] Flajolet, Philippe, et al. "HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm." Conference on Analysis of Algorithms. Springer Berlin Heidelberg, 2007.

[3] Morris, R. "Counting large numbers of events in small registers". Communications of the ACM 21, 10, 1978.

[4] Flajolet, P. "Approximate Counting: A Detailed Analysis". BIT 25, 1985.

[5] Noga Alon, Yossi Matias, Mario Szegedy, "The Space Complexity of Approximating the Frequency Moments". Journal of Computer and System Sciences, Volume 58, Issue 1, 1999.

[6] Cormode, Graham; S. Muthukrishnan. "An Improved Data Stream Summary: The Count-Min Sketch and its Applications". 2005.

[7] Misra, J.; Gries, David. "Finding repeated elements". Science of Computer Programming. 1982

[8] Wegman, Mark N., et al. "New Hash Functions and Their Use in Authentication and Set Equality". Journal of Computer and System Sciences. 1981.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sublinear-0.1.0.tar.gz (16.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sublinear-0.1.0-py3-none-any.whl (28.2 kB view details)

Uploaded Python 3

File details

Details for the file sublinear-0.1.0.tar.gz.

File metadata

  • Download URL: sublinear-0.1.0.tar.gz
  • Upload date:
  • Size: 16.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.8

File hashes

Hashes for sublinear-0.1.0.tar.gz
Algorithm Hash digest
SHA256 cb783a736fea5323ba21d0bf15f23d3ec48006071a8fe5883b3f5ee943455cea
MD5 96a6a642bbdf1006c81681f34ba1992e
BLAKE2b-256 c4ce0a6dcf90beb76bd5edb02ff9e1a4777d841ec44bbca1e06855c163a1357c

See more details on using hashes here.

File details

Details for the file sublinear-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: sublinear-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 28.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.8

File hashes

Hashes for sublinear-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fded7c711ee6d05985b7217611bf2fed756244f3b577adcadf2de1bcb5eea322
MD5 0a8a0b41fd03ebcf3341ec93017367b2
BLAKE2b-256 6695a6a4b727194072e6dc2be9bd2fb1bc8fb5262e0c6ff140a3680ef25e5657

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page