Python library for performing string similarity joins.
Project description
py_stringsimjoin
This project seeks to build a Python software package that provides scalable implementations of string similarity joins over two tables, for commonly used similarity measures such as Jaccard, Dice, cosine, overlap, overlap coefficient, and edit distance. The package is free, open-source, and BSD-licensed.
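To make the term concrete, here is a minimal sketch of what a Jaccard similarity join computes: every pair of records, one from each table, whose token sets have a Jaccard score at or above a threshold. This is an illustration only. The function names, the whitespace tokenizer, the dictionary-based "tables", and the convention that two empty token sets score 1.0 are all assumptions made for the example. It is a naive nested loop, not the package's API, and it says nothing about how the package implements its joins.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity of two token sets: |a & b| / |a | b|."""
    if not a and not b:
        # Assumed convention for this sketch: two empty sets are identical.
        return 1.0
    return len(a & b) / len(a | b)


def naive_jaccard_join(left, right, threshold):
    """Return (left_id, right_id, score) for each pair scoring >= threshold.

    `left` and `right` are hypothetical stand-ins for tables: dicts that map
    a record id to the string being joined on.
    """
    # Tokenize once per record: lowercase, split on whitespace, deduplicate.
    left_tokens = {key: set(text.lower().split()) for key, text in left.items()}
    right_tokens = {key: set(text.lower().split()) for key, text in right.items()}

    matches = []
    # Compare every left record with every right record (quadratic cost).
    for (l_id, l_set), (r_id, r_set) in product(left_tokens.items(),
                                                right_tokens.items()):
        score = jaccard(l_set, r_set)
        if score >= threshold:
            matches.append((l_id, r_id, score))
    return matches


left = {1: "data science group", 2: "string similarity join"}
right = {10: "science data group", 11: "similarity join", 12: "unrelated text"}
result = naive_jaccard_join(left, right, threshold=0.6)
```

With these inputs, record 1 pairs with record 10 (same tokens in a different order) and record 2 pairs with record 11 (two shared tokens out of three distinct ones). The all-pairs loop is only practical for small inputs, which is the scalability problem a dedicated join library sets out to address.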
Important links
Project Homepage: https://sites.google.com/site/anhaidgroup/projects/magellan/py_stringsimjoin
Code repository: https://github.com/anhaidgroup/py_stringsimjoin
User Manual: http://anhaidgroup.github.io/py_stringsimjoin/v0.3.2/index.html
Overview: https://anhaidgroup.github.io/py_stringsimjoin/v0.3.2/overview.html
How to Contribute: https://anhaidgroup.github.io/py_stringsimjoin/v0.3.2/contributing.html
Issue Tracker: https://github.com/anhaidgroup/py_stringsimjoin/issues
Mailing List: https://groups.google.com/forum/#!forum/py_stringsimjoin
Dependencies
py_stringsimjoin has been tested on each Python version between 3.7 and 3.11, inclusive.
The required dependencies to build the package are pandas 0.16.0 or higher, py_stringmatching 0.2.1 or higher, joblib, pyprind, six and a C++ compiler. For the development version, you will also need Cython.
Platforms
py_stringsimjoin has been tested on Linux, OS X and Windows.
Download files
Download the file for your platform.
File details
Details for the file py_stringsimjoin_temp-0.3.3.tar.gz.
File metadata
- Download URL: py_stringsimjoin_temp-0.3.3.tar.gz
- Upload date:
- Size: 1.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6a5060d3ad5b875f1e7253d132ac063ee61a9085950cea5cedb96d884ed6660e |
| MD5 | 55c43d0e671e1728e4be4e2050594204 |
| BLAKE2b-256 | b0e2f8ab533af8a238dd7cc5319d11c022e33a517d318cfa4e62ea052f261da4 |
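A published digest is only useful if it is compared against the downloaded file. The following is a minimal sketch of that check using Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_published_digest` are hypothetical and chosen for this example, and the sketch assumes the file has already been downloaded to a local path.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`.

    The file is read in chunks so large archives need not fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare the file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, the call would be `matches_published_digest("py_stringsimjoin_temp-0.3.3.tar.gz", "6a5060d3ad5b875f1e7253d132ac063ee61a9085950cea5cedb96d884ed6660e")`, which should return `True` for an intact download.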
File details
Details for the file py_stringsimjoin_temp-0.3.3-cp311-cp311-macosx_11_0_arm64.whl.
File metadata
- Download URL: py_stringsimjoin_temp-0.3.3-cp311-cp311-macosx_11_0_arm64.whl
- Upload date:
- Size: 2.0 MB
- Tags: CPython 3.11, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 37144717c02e3fb597d1af8bd3f8a5fcb58370b638f8eecbac9e94de0ca36bfa |
| MD5 | 2d5f7fef4f983737a2459a5e89cd7d9f |
| BLAKE2b-256 | 7f6c230066ba68e41144dbde57d152c862dc428d11e846c9a4c8bc470807ecfa |
File details
Details for the file py_stringsimjoin_temp-0.3.3-cp39-cp39-macosx_10_9_universal2.whl.
File metadata
- Download URL: py_stringsimjoin_temp-0.3.3-cp39-cp39-macosx_10_9_universal2.whl
- Upload date:
- Size: 2.6 MB
- Tags: CPython 3.9, macOS 10.9+ universal2 (ARM64, x86-64)
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 66c016dc3d811c54c0a9fc29c197e711d55e84299de897cb659f4602d89836a5 |
| MD5 | dfd207f1e48e41ec3d3b49c4a0861c50 |
| BLAKE2b-256 | fb1d2a2e5de1dbf56cd814e2ba66694f1a60dc0b48ac3cdc6b02369c4eac9eb7 |