Python library for string matching.
py_stringmatching
This project seeks to build a Python software package that consists of a comprehensive and scalable set of string tokenizers (such as alphabetical tokenizers, whitespace tokenizers) and string similarity measures (such as edit distance, Jaccard, TF/IDF). The package is free, open-source, and BSD-licensed.
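To make the terms above concrete, here is a minimal plain-Python sketch of what a whitespace tokenizer, a Jaccard measure, and an edit distance compute. It is not the py_stringmatching API: the function names `whitespace_tokenize`, `jaccard`, and `edit_distance` are hypothetical helpers written for illustration only, and the convention of scoring two empty token sets as 1.0 is an assumption of this sketch. See the User Manual linked below for the package's actual classes and methods.

```python
# Illustrative only: these helpers are hypothetical and are NOT part of
# the py_stringmatching API.

def whitespace_tokenize(text):
    """Split a string into tokens on runs of whitespace."""
    return text.split()


def jaccard(tokens_a, tokens_b):
    """Jaccard similarity: |A intersect B| / |A union B| over token sets."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        # Assumed convention for this sketch: two empty sets are identical.
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def edit_distance(s, t):
    """Levenshtein distance with unit cost for insert, delete, substitute."""
    # previous[j] holds the distance between the processed prefix of s
    # and the first j characters of t.
    previous = list(range(len(t) + 1))
    for i, char_s in enumerate(s, start=1):
        current = [i]
        for j, char_t in enumerate(t, start=1):
            substitution_cost = 0 if char_s == char_t else 1
            current.append(min(
                previous[j] + 1,                      # delete from s
                current[j - 1] + 1,                   # insert into s
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    left = whitespace_tokenize("data science")
    right = whitespace_tokenize("data mining")
    print(jaccard(left, right))
    print(edit_distance("kitten", "sitting"))
```

Jaccard works on token sets, so it depends on the tokenizer chosen, while edit distance works directly on the character sequences.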
Important links
Project Homepage: https://sites.google.com/site/anhaidgroup/projects/magellan/py_stringmatching
Code repository: https://github.com/anhaidgroup/py_stringmatching
User Manual: https://anhaidgroup.github.io/py_stringmatching/v0.4.2/index.html
Tutorial: https://anhaidgroup.github.io/py_stringmatching/v0.4.2/Tutorial.html
How to Contribute: https://anhaidgroup.github.io/py_stringmatching/v0.4.2/Contributing.html
Developer Manual: http://pages.cs.wisc.edu/~anhai/py_stringmatching/v0.2.0/dev-manual-v0.2.0.pdf
Issue Tracker: https://github.com/anhaidgroup/py_stringmatching/issues
Mailing List: https://groups.google.com/forum/#!forum/py_stringmatching
Dependencies
py_stringmatching has been tested on every Python version from 3.7 through 3.12.
The required dependencies to build the package are NumPy 1.7.0 or higher and a C or C++ compiler. For the development version, you will also need Cython.
Platforms
py_stringmatching has been tested on Linux, OS X, and Windows. So far it has been tested only on the x86 architecture.