GISMO
GISMO is an NLP tool to rank and organize a corpus of documents according to a query.
Gismo stands for Generic Information Search… with a Mind of its Own.
Free software: GNU General Public License v3
GitHub: https://github.com/balouf/gismo.
Documentation: https://gismo.readthedocs.io.
Features
Gismo combines three main ideas:
TF-IDTF: a symmetric version of the TF-IDF embedding.
DI-Iteration: a fast, push-based variant of the PageRank algorithm.
Fuzzy dendrogram: a variant of the Louvain clustering algorithm.
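To give a feel for what "push-based" means in the second idea, here is a minimal sketch of a generic push-style personalized PageRank in plain Python. It is not Gismo's DI-Iteration and does not use the gismo API; the function name push_pagerank, its parameters, and the toy graph are all hypothetical and chosen for illustration only. Each node holds some residual "fluid"; the node with the most fluid keeps a fraction of it as its score and pushes the rest to its out-neighbors, until the remaining fluid is negligible.

```python
def push_pagerank(graph, seed, damping=0.85, tol=1e-8):
    """Approximate personalized PageRank by pushing residual mass.

    graph: dict mapping each node to the list of its out-neighbors
           (every neighbor must also be a key of the dict).
    seed:  dict mapping nodes to their initial mass (should sum to 1).
    NOTE: illustrative sketch only, not Gismo's implementation.
    """
    fluid = dict(seed)                      # mass not yet processed
    score = {node: 0.0 for node in graph}   # mass already absorbed

    while fluid:
        # Greedy choice: process the node holding the most residual mass.
        node = max(fluid, key=fluid.get)
        if fluid[node] < tol:
            break
        amount = fluid.pop(node)

        # The node keeps a (1 - damping) share as its score...
        score[node] += (1 - damping) * amount

        # ...and pushes the rest evenly to its out-neighbors.
        neighbors = graph[node]
        if neighbors:
            share = damping * amount / len(neighbors)
            for nb in neighbors:
                fluid[nb] = fluid.get(nb, 0.0) + share
        # A node without out-neighbors simply drops the pushed mass.

    return score


# Toy graph (hypothetical): a -> b, a -> c, b -> c, c -> a
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores = push_pagerank(graph, seed={"a": 1.0})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because only nodes that actually hold fluid are touched, the work stays local to the neighborhood of the seed, which is the usual motivation for push-based schemes over full power iteration.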
Quickstart
Install gismo:
$ pip install gismo
Import gismo in a Python project:
import gismo as gs
Credits
Thomas Bonald, Anne Bouillard, Marc-Olivier Buob, Dohy Hong.
This package was created with Cookiecutter and the francois-durand/package_helper project template.
History
0.2.3 (2020-05-04)
ACM and DBLP dataset creation added.
0.2.2 (2020-05-04)
Notebook tutorials added (early version)
0.2.1 (2020-05-03)
Actual code
Coverage badge
0.1.0 (2020-04-30)
First release on PyPI.