Skip to main content

No project description provided

Project description

Finding Frequent Items 频繁集挖掘

Install

apt-get install -y libboost-python-dev
pip install lossycount

if cannot find -lboost_python3

occurred.

Then I went to

/usr/lib/x86_64-linux-gnu

search and found that the library file is in different name as

libboost_python-py35.so

so I made a link by following command

sudo ln -s libboost_python-py35.so libboost_python3.so which solved my problem.

Use

from lossycount import LossyCount

# 0.001 是要统计的频率下限
lc = LossyCount(0.001)

for i in range(200):
  for j in range(100):
    for k in range(j):
      lc.incr(j)
      # lc.incr(j, 1)

for i in range(1, 100, 30):
  print(i)
  print("出现的次数(估计值)", lc.est(i))
  print(
    "estimate the worst case error in the estimate of a) particular item :",
    lc.err(i)
  )
  print("---" * 20)

result = lc.output(1000)
result.sort(key=lambda x: -x[1])
print(result)

print(lc.capacity())

Thanks

hadjieleftheriou.com/frequent-items

version 1.0

Description

This package provides implementations of various one pass algorithms for finding frequent items in data streams. In particular it contains the following:

  • Frequent Algorithm
  • Lossy Counting, and variations
  • Space Saving
  • Greewald & Khanna
  • Quantile Digest
  • Count Sketch
  • Hierarchical Count-Min Sketch
  • Combinatorial Group Testing

The code is an extension of the MassDAL library. Implementations are by Graham Cormode.

Citations

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

lossycount-2.0.tar.gz (31.2 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page