An Extractive and Abstractive Summarization Library Powered with Artificial Intelligence

Project description

pyAutoSummarizer

Introduction

pyAutoSummarizer is a Python library for text summarization, a core task in NLP (Natural Language Processing). It implements both extractive and abstractive summarization algorithms. Extractive summarization identifies key sentences or phrases in the original text and assembles them into the summary; the extractive techniques in pyAutoSummarizer include TextRank, LexRank, LSA (Latent Semantic Analysis), and KL-Sum. Abstractive summarization instead generates new sentences, so the summary keeps the essence of the original text but may not use its exact wording. For abstractive summarization, pyAutoSummarizer uses the deep learning models BART (Bidirectional and Auto-Regressive Transformers), T5 (Text-to-Text Transfer Transformer), which handles a range of language tasks including summarization, PEGASUS (Pre-training with Extracted Gap-sentences for Abstractive Summarization), and OpenAI's GPT (Generative Pre-trained Transformer), specifically the ChatGPT model.
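To make the extractive idea concrete, here is a minimal sketch in the style of TextRank, using only the standard library. It is not pyAutoSummarizer's API or implementation: the function names (`split_sentences`, `overlap_similarity`, `textrank_summary`) and the parameter defaults are assumptions chosen for illustration. Sentences are nodes in a graph, edge weights come from word overlap, and a PageRank-style iteration scores each sentence.

```python
import math
import re


def split_sentences(text):
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def overlap_similarity(a, b):
    """Shared words divided by the sum of the log sentence sizes."""
    words_a, words_b = set(a), set(b)
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # log(1) == 0 could make the denominator zero
    shared = len(words_a & words_b)
    return shared / (math.log(len(words_a)) + math.log(len(words_b)))


def textrank_summary(text, k=2, damping=0.85, iterations=50):
    """Return the k highest-scoring sentences, kept in original order."""
    sentences = split_sentences(text)
    tokens = [re.findall(r"\w+", s.lower()) for s in sentences]
    n = len(sentences)
    if n <= k:
        return sentences
    # Weighted similarity graph between sentences (no self-loops).
    weights = [
        [overlap_similarity(tokens[i], tokens[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    # PageRank-style power iteration over the weighted graph.
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

A sentence that shares no words with the others receives only the baseline score, so it is the first to be dropped from the summary.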

pyAutoSummarizer also provides preprocessing features for text normalization, which prepare the input for summarization. It can convert text to lowercase for uniformity, and it can remove accents, special characters, and numbers to reduce noise in the text. It can also remove custom words, so users can tailor preprocessing to their needs. pyAutoSummarizer supports stopword removal in the following languages: Arabic, Bengali, Bulgarian, Chinese, Czech, English, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Italian, Japanese, Korean, Marathi, Persian, Polish, Portuguese-br, Romanian, Russian, Slovak, Spanish, Swedish, Thai, and Ukrainian. Sentence segmentation is flexible: sentences can be split by punctuation, character count, or word count.
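The normalization steps described above can be sketched with the standard library alone. This is an illustrative sketch, not the library's own code: `normalize` and `split_by_word_count` are hypothetical names, and the accent-stripping step assumes Latin-script input (it is not appropriate for every language in the list above).

```python
import re
import unicodedata


def normalize(text, stopwords=(), custom_words=()):
    """Lowercase, strip accents, numbers, and special characters, then drop words."""
    text = text.lower()
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"\d+", " ", text)      # remove numbers
    text = re.sub(r"[^\w\s]", " ", text)  # remove special characters
    drop = set(stopwords) | set(custom_words)
    return " ".join(w for w in text.split() if w not in drop)


def split_by_word_count(text, n_words):
    """Segment text into chunks of at most n_words words."""
    words = text.split()
    return [" ".join(words[i:i + n_words]) for i in range(0, len(words), n_words)]
```

For example, `normalize("Café No. 42 is the BEST!", stopwords={"is", "the"})` yields `"cafe no best"`.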

To evaluate the quality of the generated summaries, pyAutoSummarizer integrates the metrics ROUGE-N, ROUGE-L, and ROUGE-S, which measure the overlap of n-grams, the longest common subsequence, and skip-bigrams, respectively, between the generated summary and a reference summary. It also provides BLEU (Bilingual Evaluation Understudy) and METEOR (Metric for Evaluation of Translation with Explicit ORdering).
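As a rough illustration of what two of these metrics compute, the sketch below implements ROUGE-N recall and a ROUGE-L F1 score on whitespace-tokenized text. The function names are assumptions made for this example, and the library's own implementation may differ in tokenization and in how precision and recall are combined.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also appear in the candidate."""
    cand, ref = ngrams(candidate.split(), n), ngrams(reference.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum((cand & ref).values())  # counts clipped to the minimum
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """F1 score built from LCS-based precision and recall."""
    c, r = candidate.split(), reference.split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)
```

With the candidate "the cat sat on the mat" and the reference "the cat lay on the mat", five of the six reference unigrams and three of the five reference bigrams are matched, and the longest common subsequence has five words.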

Usage

  1. Install
pip install pyAutoSummarizer
  2. Try it in Colab:

Extractive Summarization

Abstractive Summarization

Others

  • pyBibX - A Bibliometric and Scientometric Python Library Powered with Artificial Intelligence Tools

Download files

Source Distribution

pyAutoSummarizer-1.1.8.tar.gz (50.4 kB view details)

Uploaded Source

Built Distribution

pyAutoSummarizer-1.1.8-py3-none-any.whl (50.6 kB view details)

Uploaded Python 3

File details

Details for the file pyAutoSummarizer-1.1.8.tar.gz.

File metadata

  • Download URL: pyAutoSummarizer-1.1.8.tar.gz
  • Size: 50.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.28.1 requests-toolbelt/0.9.1 urllib3/1.25.11 tqdm/4.64.1 importlib-metadata/4.11.3 keyring/23.4.0 rfc3986/2.0.0 colorama/0.4.6 CPython/3.7.6

File hashes

Hashes for pyAutoSummarizer-1.1.8.tar.gz
Algorithm Hash digest
SHA256 b88e6878fd084659d1e1ffd437efe5fe31eb39cf761a92e78add737c0c40c781
MD5 67feebe2292dcf3c4b2fe1f7dfe1da00
BLAKE2b-256 6060c2649940805a774ffbf7dfda871d697f03f97072bd215d2cbdcffe643f76

File details

Details for the file pyAutoSummarizer-1.1.8-py3-none-any.whl.

File metadata

  • Download URL: pyAutoSummarizer-1.1.8-py3-none-any.whl
  • Size: 50.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.28.1 requests-toolbelt/0.9.1 urllib3/1.25.11 tqdm/4.64.1 importlib-metadata/4.11.3 keyring/23.4.0 rfc3986/2.0.0 colorama/0.4.6 CPython/3.7.6

File hashes

Hashes for pyAutoSummarizer-1.1.8-py3-none-any.whl
Algorithm Hash digest
SHA256 f8b4424da6bcb7da177b8d89187d4859e268df631b6b282c5dcf137268f75501
MD5 5b0c711617849e1a6a93995d82915fdd
BLAKE2b-256 538fd5bdef951867010dc5deb457c030affd286d1c53d5f821dab691e9ff7c4f
