Algorithm to compress sparse binary data
Compression Techniques for sparse binary data
Status: in development
Prerequisites
- Python 2.7 or higher
- NumPy
- scikit-learn
- Standard-library modules: pickle, random, re
Usage
from BinHash import hasher
corpus = 'path_to_the_folder_containing_documents'
d = 10000
k = 500
myhasher = hasher(corpus, d, k)
sample_text = "this is a sample text"
sample_hash = myhasher.hash_text(sample_text)
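To give a feel for what compressing a d-dimensional sparse binary vector down to k dimensions can look like, here is a minimal sketch of one bucket-based scheme: each input dimension is randomly assigned to one of k buckets, and each output bit is the parity of the input bits that land in its bucket. This is an illustration under stated assumptions, not the package's actual implementation. The names `make_bucket_map` and `compress` are hypothetical and are not part of the BinHash API.

```python
import numpy as np


def make_bucket_map(d, k, seed=0):
    """Hypothetical helper: randomly assign each of d dimensions to one of k buckets."""
    rng = np.random.RandomState(seed)
    return rng.randint(0, k, size=d)


def compress(active_indices, bucket_map, k):
    """Hypothetical helper: compress a sparse binary vector to k bits.

    The vector is given as the indices of its 1-bits. Each output bit is the
    parity (sum mod 2) of the 1-bits mapped to that bucket. Whether BinHash
    uses parity internally is an assumption made for this sketch.
    """
    sketch = np.zeros(k, dtype=np.uint8)
    for i in active_indices:
        sketch[bucket_map[i]] ^= 1  # flip the bucket's bit for every 1-bit that lands in it
    return sketch


d, k = 10000, 500  # same sizes as in the usage example above
bucket_map = make_bucket_map(d, k)

# Two sparse vectors that share two of their three active dimensions.
a = compress([3, 42, 977], bucket_map, k)
b = compress([3, 42, 5000], bucket_map, k)

# Shared dimensions contribute identically to both sketches, so only the
# dimensions where the vectors differ can change the Hamming distance.
hamming = int(np.count_nonzero(a != b))
print(hamming)
```

Because the same bucket map is reused for every vector, the compressed vectors stay comparable: dimensions that two inputs share cancel out when their sketches are compared, which is what makes distances in the compressed space informative about distances in the original space.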
Citation
Please cite these papers in your publications if they help your research:
@inproceedings{DBLP:conf/pakdd/PratapSK18,
author = {Rameshwar Pratap and
Ishan Sohony and
Raghav Kulkarni},
title = {Efficient Compression Technique for Sparse Sets},
booktitle = {Advances in Knowledge Discovery and Data Mining - 22nd Pacific-Asia
Conference, {PAKDD} 2018, Melbourne, VIC, Australia, June 3-6, 2018,
Proceedings, Part {III}},
pages = {164--176},
year = {2018},
crossref = {DBLP:conf/pakdd/2018-3},
url = {https://doi.org/10.1007/978-3-319-93040-4\_14},
doi = {10.1007/978-3-319-93040-4\_14},
timestamp = {Tue, 19 Jun 2018 09:13:55 +0200},
biburl = {https://dblp.org/rec/bib/conf/pakdd/PratapSK18},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
@inproceedings{compression,
author = {Rameshwar Pratap and
Raghav Kulkarni and
Ishan Sohony},
title = {Efficient Dimensionality Reduction for Sparse Binary Data},
booktitle = {IEEE International Conference on BIG DATA, Accepted},
year = {2018}
}