Algorithm to compress sparse binary data

Compression Techniques for sparse binary data

In Development

Prerequisites

Usage

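from BinHash import hasher

# Folder of documents the hasher is built from
corpus = 'path_to_the_folder_containing_documents'
# Hasher parameters; assumed here to be the original and compressed dimensions
d = 10000
k = 500
myhasher = hasher(corpus, d, k)

# Hash a single piece of text
sample_text = "this is a sample text"
sample_hash = myhasher.hash_text(sample_text)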
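
For intuition about what compressing sparse binary data can look like, here is a minimal sketch of one common approach: each of the d input positions is assigned to one of k buckets at random, and a bucket's output bit is the OR of the input bits that land in it. This is an illustration under stated assumptions, not BinHash's actual implementation. The helper names make_bucket_map and compress are hypothetical, and reading d as the input dimension and k as the compressed dimension is an assumption.

```python
import random


def make_bucket_map(d, k, seed=0):
    """Assign each of the d input positions to one of k buckets at random."""
    rng = random.Random(seed)
    return [rng.randrange(k) for _ in range(d)]


def compress(active_indices, bucket_map, k):
    """Compress a sparse binary vector, given as the positions of its 1-bits.

    A bucket's output bit is 1 if any active position maps to it (bitwise OR).
    Returns a list of k bits.
    """
    sketch = [0] * k
    for i in active_indices:
        sketch[bucket_map[i]] = 1
    return sketch


# Hypothetical sizes, mirroring the values in the usage example above.
d, k = 10000, 500
bucket_map = make_bucket_map(d, k)

# Two sparse vectors that share three of their four 1-bits.
a = compress({3, 57, 4096, 9999}, bucket_map, k)
b = compress({3, 57, 4096, 1234}, bucket_map, k)

# Hamming distance between the two compressed vectors.
hamming = sum(x != y for x, y in zip(a, b))
```

Because the input is sparse, few positions collide in the same bucket, so vectors that overlap heavily before compression tend to stay close afterwards. The same bucket map must be reused for every vector that will be compared.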

Citation

Please cite these papers in your publications if they help your research:

@inproceedings{DBLP:conf/pakdd/PratapSK18,
  author    = {Rameshwar Pratap and
               Ishan Sohony and
               Raghav Kulkarni},
  title     = {Efficient Compression Technique for Sparse Sets},
  booktitle = {Advances in Knowledge Discovery and Data Mining - 22nd Pacific-Asia
               Conference, {PAKDD} 2018, Melbourne, VIC, Australia, June 3-6, 2018,
               Proceedings, Part {III}},
  pages     = {164--176},
  year      = {2018},
  crossref  = {DBLP:conf/pakdd/2018-3},
  url       = {https://doi.org/10.1007/978-3-319-93040-4\_14},
  doi       = {10.1007/978-3-319-93040-4\_14},
  timestamp = {Tue, 19 Jun 2018 09:13:55 +0200},
  biburl    = {https://dblp.org/rec/bib/conf/pakdd/PratapSK18},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}


@inproceedings{compression,
  author    = {Rameshwar Pratap and
               Raghav Kulkarni and
               Ishan Sohony},
  title     = {Efficient Dimensionality Reduction for Sparse Binary Data},
  booktitle = {IEEE International Conference on BIG DATA, Accepted},
  year      = {2018}
}
