Structured and Interactive Summarization
Performs extractive, hierarchical summarization of a corpus of documents.
Free software: BSD license
Documentation: https://balouf.github.io/sisu/.
Features
Preprocessing tools.
Credits
This package was created with Cookiecutter and the francois-durand/package_helper_2 project template.
History
0.X.X (2021-XX-XX): TODO
Run experiments and comparison on the flat summarizer
Start converting the Wikipedia Animals dataset
Start converting hierarchical summary notebook
0.2.0 (2021-03-14): Flat summarizer
Main update: fully-functional flat summarizer!
Fully customisable;
Fully documented.
Two tutorials
Building a Gismo for the Covid dataset;
Flat summarizer on the Covid dataset.
Change of paradigm: start from the notebook and build the module cell by cell.
Consequences: all non-converted modules from 0.1.2 are moved to the pit. They will be restored during the notebook transformation.
Sentence splitter optimized big time using hidden NLTK features!
0.1.2 (2021-03-03): the pit
Batch import of remaining modules in a temporary submodule (the pit). The pit will be dispatched afterwards.
Fix import issues (e.g. spacy neuralcoref version incompatibility, Qt5, sknetwork…)
Submodule gismo_wrapper on death row (may never leave the pit).
Embedding_idf OK
Building summary: summarize and make_tree have been updated to work.
Lots of cleaning remains (separating covid/generic, unified pre-proc and source convention, …)
Take down neuralcoref for the moment: it does not build on GitHub.
0.1.1 (2021-02-23): data_loader
Finish import / transformation of the data_loader module.
0.1.0 (2021-02-23): First release
First release on PyPI.
Preprocessing submodule deployed