GISMO
GISMO is an NLP tool that ranks and organizes a corpus of documents according to a query.
Gismo stands for Generic Information Search… with a Mind of its Own.
Free software: GNU General Public License v3
GitHub: https://github.com/balouf/gismo.
Documentation: https://gismo.readthedocs.io.
Features
Gismo combines three main ideas:
TF-IDTF: a symmetric version of the TF-IDF embedding.
DI-Iteration: a fast, push-based variant of the PageRank algorithm.
Fuzzy dendrogram: a variant of the Louvain clustering algorithm.
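To give a feel for what "push-based" means in the second item, here is a minimal sketch of a generic push-style personalized PageRank in plain Python. It is not Gismo's implementation: the function name `push_pagerank`, its parameters, and the toy graph are assumptions made for illustration. Each node holds some residual mass; the node with the largest residual keeps its mass as score and pushes a damped share to its out-neighbors, until every residual is negligible.

```python
def push_pagerank(graph, source, damping=0.85, tol=1e-9):
    """Approximate personalized PageRank from `source` by pushing residual mass.

    `graph` maps each node to the list of its out-neighbors; every neighbor
    must also be a key of `graph`. Mass pushed from a node without
    out-neighbors is simply dropped in this sketch.
    """
    score = {node: 0.0 for node in graph}
    residual = {node: 0.0 for node in graph}
    residual[source] = 1.0 - damping  # all initial mass sits on the source
    while True:
        # Greedy choice: process the node holding the largest residual.
        node = max(residual, key=residual.get)
        mass = residual[node]
        if mass < tol:
            break
        residual[node] = 0.0
        score[node] += mass  # the node keeps what it received
        neighbors = graph[node]
        for neighbor in neighbors:
            # A damped, equal share is pushed along each out-link.
            residual[neighbor] += damping * mass / len(neighbors)
    return score


# Hypothetical toy graph: "d" links to "c" but is unreachable from "a".
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = push_pagerank(graph, source="a")
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Only nodes that actually receive mass are ever processed, which is why push-style schemes tend to suit query-specific ranking: work stays concentrated around the source instead of sweeping the whole graph at every iteration.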
Quickstart
Install gismo:
$ pip install gismo
Import gismo in a Python project:
import gismo as gs
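To check that the installation succeeded before going further, a small helper based on the standard library can report the installed version. This is only a sketch: the helper name `installed_version` is an assumption, and it relies on `importlib.metadata`, which requires Python 3.8 or later.

```python
from importlib import metadata, util


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    # find_spec returns None when a top-level module cannot be located.
    if util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata is available.
        return None


print(installed_version("gismo"))
```

If this prints `None`, the package is not visible from the current Python environment, which usually means `pip` installed it for a different interpreter.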
Credits
Thomas Bonald, Anne Bouillard, Marc-Olivier Buob, Dohy Hong.
This package was created with Cookiecutter and the francois-durand/package_helper project template.
History
0.2.2 (2020-05-04)
Notebook tutorials added (early version)
0.2.1 (2020-05-03)
Actual code
Coverage badge
0.1.0 (2020-04-30)
First release on PyPI.