Python library implementing a subset of streaming algorithms. Includes variations of these algorithms (e.g. adversarially robust), as well as support for multiple data types.
Project description
sublinear
Python library implementing a subset of streaming algorithms. Includes variations of these algorithms (e.g. adversarially robust), as well as support for multiple data types.
:zap: Algorithms
Here is the list of the currently implemented streaming algorithms.
F0 Estimation (Count of Distinct Elements)
-
BJKST Sketch (basic, plus, adversarially robust) [1]
-
HyperLogLog [2]
F1 Estimation (Length of Stream)
- Morris (basic, plus, plus plus) [3][4]
F2 Estimation (Estimate of Second Moment)
- AMS Sketch (basic, plus, plus plus) [5]
Frequency Table Estimation
- Count Min Sketch [6]
Heavy Hitters
- Misra-Gries Sketch [7]
Other
- K Independent Hash Function [8]
:book: Bibliography
[1] Bar-Yossef, Ziv, et al. "Counting distinct elements in a data stream." Randomization and Approximation Techniques in Computer Science. Springer Berlin Heidelberg, 2002.
[2] Flajolet, Philippe, et al. "HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm." Conference on Analysis of Algorithms. Springer Berlin Heidelberg, 2007.
[3] Morris, R. "Counting large numbers of events in small registers". Communications of the ACM 21, 10, 1978.
[4] Flajolet, P. "Approximate Counting: A Detailed Analysis". BIT 25, 1985.
[5] Noga Alon, Yossi Matias, Mario Szegedy, "The Space Complexity of Approximating the Frequency Moments". Journal of Computer and System Sciences, Volume 58, Issue 1, 1999.
[6] Cormode, Graham; S. Muthukrishnan. "An Improved Data Stream Summary: The Count-Min Sketch and its Applications". 2005.
[7] Misra, J.; Gries, David. "Finding repeated elements". Science of Computer Programming. 1982
[8] Wegman, Mark N., et al. "New Hash Functions and Their Use in Authentication and Set Equality". Journal of Computer and System Sciences. 1981.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file sublinear-0.1.0.tar.gz.
File metadata
- Download URL: sublinear-0.1.0.tar.gz
- Upload date:
- Size: 16.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
cb783a736fea5323ba21d0bf15f23d3ec48006071a8fe5883b3f5ee943455cea
|
|
| MD5 |
96a6a642bbdf1006c81681f34ba1992e
|
|
| BLAKE2b-256 |
c4ce0a6dcf90beb76bd5edb02ff9e1a4777d841ec44bbca1e06855c163a1357c
|
File details
Details for the file sublinear-0.1.0-py3-none-any.whl.
File metadata
- Download URL: sublinear-0.1.0-py3-none-any.whl
- Upload date:
- Size: 28.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
fded7c711ee6d05985b7217611bf2fed756244f3b577adcadf2de1bcb5eea322
|
|
| MD5 |
0a8a0b41fd03ebcf3341ec93017367b2
|
|
| BLAKE2b-256 |
6695a6a4b727194072e6dc2be9bd2fb1bc8fb5262e0c6ff140a3680ef25e5657
|