
Performs extractive, hierarchical summarization of a corpus of documents.

Project description

Structured and Interactive Summarization



Features

  • Preprocessing tools.
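As a rough illustration of what "extractive" summarization means (selecting existing sentences rather than writing new ones), here is a minimal, standard-library-only sketch. It is not this package's API and not its method: the function names `split_sentences`, `tokenize` and `extractive_summary`, the regex-based splitter and the IDF-weighted overlap score are all assumptions made for the example, and the hierarchical part is not covered.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Return the lowercase alphanumeric tokens of a sentence."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def extractive_summary(documents, query, k=2):
    """Return the k corpus sentences that best match the query.

    Each sentence is scored by the sum of the IDF weights of the
    query words it contains; selected sentences keep corpus order.
    """
    sentences = [s for doc in documents for s in split_sentences(doc)]
    tokens = [set(tokenize(s)) for s in sentences]
    n = len(sentences)
    # Number of sentences in which each word appears.
    df = Counter(word for words in tokens for word in words)
    idf = {word: math.log(1 + n / count) for word, count in df.items()}
    query_words = set(tokenize(query))
    scores = [sum(idf[w] for w in words & query_words) for words in tokens]
    # Best scores first; ties are broken by position in the corpus.
    best = sorted(range(n), key=lambda i: (-scores[i], i))[:k]
    return [sentences[i] for i in sorted(best)]


if __name__ == "__main__":
    corpus = [
        "Cats are small mammals. They sleep a lot.",
        "Covid vaccines reduce severe illness. Vaccines are tested in trials.",
    ]
    for sentence in extractive_summary(corpus, "covid vaccines", k=1):
        print(sentence)
```

The summary is made only of sentences copied verbatim from the corpus, which is the defining property of the extractive approach; a real system would use a proper sentence splitter and a richer relevance model.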

Credits

This package was created with Cookiecutter and the francois-durand/package_helper_2 project template.

History

0.X.X (2021-XX-XX): TODO

  • Run experiments and comparison on the flat summarizer

  • Start converting the Wikipedia Animals dataset

  • Start converting hierarchical summary notebook

0.2.0 (2021-03-14): Flat summarizer

  • Main update: fully-functional flat summarizer!

    • Fully customisable;

    • Fully documented.

  • Two tutorials

    • Building a Gismo for the Covid dataset;

    • Flat summarizer on the Covid dataset.

  • Change of paradigm: start from the notebook and build the module cell by cell.

  • Consequence: all modules from 0.1.2 that have not been converted yet are moved to the pit. They will be restored as the notebooks are converted.

  • Sentence splitter heavily optimized using lesser-known nltk features.

0.1.2 (2021-03-03): the pit

  • Batch import of remaining modules in a temporary submodule (the pit). The pit will be dispatched afterwards.

  • Fix import issues (e.g. spacy neuralcoref version incompatibility, Qt5, sknetwork…)

  • Submodule gismo_wrapper is slated for removal (it may never leave the pit).

  • Embedding_idf OK

  • Building summary: summarize and make_tree have been updated to work.

  • Lots of cleaning remains (separating Covid-specific from generic code, unifying preprocessing and source conventions, …).

  • neuralcoref is disabled for the moment: it does not build on GitHub.

0.1.1 (2021-02-23): data_loader

  • Finish import / transformation of the data_loader module.

0.1.0 (2021-02-23): First release

  • First release on PyPI.

  • Preprocessing submodule deployed.

