An Extractive and Abstractive Summarization Library Powered with Artificial Intelligence


pyAutoSummarizer

pyAutoSummarizer: An Extractive and Abstractive Summarization Library Powered with Artificial Intelligence.

Citation

PEREIRA, V., DE LIMA PORTO, R.C., FIGUEIRA, L.A.A., FERREIRA, R.A.C.A. (2026). Unveiling pyAutoSummarizer: An Extractive and Abstractive Summarization Library Powered with Artificial Intelligence. In: DA HORA, H., PORTER, A.L., CHIAVETTA, D., ZHANG, Y. (eds) Technology Mining. Springer, Cham. https://doi.org/10.1007/978-3-032-10849-4_2

Introduction

pyAutoSummarizer is a Python library for text summarization. It covers both extractive and abstractive approaches and provides a suite of evaluation metrics, from classic n-gram overlap to modern semantic and faithfulness measures.

Summarization Methods

Extractive methods identify and return the most important sentences from the original text:

Method	Description
TextRank	Graph-based ranking using sentence embeddings and cosine similarity
LexRank	Graph-based ranking using TF-IDF cosine similarity
LSA	Latent Semantic Analysis via SVD on embeddings or TF-IDF matrix
KL-Sum	Selects sentences that minimise KL-divergence from the full document distribution
BART	`facebook/bart-large-cnn` abstractive model (deep learning)
T5	`t5-base` abstractive model (deep learning)
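The graph-based rows in the table above can be illustrated with a minimal, dependency-free sketch in the spirit of LexRank: TF-IDF vectors per sentence, cosine similarity as edge weights, and PageRank-style power iteration for the scores. This is an illustration under stated assumptions, not the library's implementation; the function names `tfidf_vectors`, `cosine`, and `rank_sentences`, the smoothed IDF formula, and the damping value are all choices made for this example.

```python
import math
import re
from collections import Counter


def tfidf_vectors(sentences):
    """Build one sparse TF-IDF vector (a dict) per sentence."""
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    n = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF keeps weights positive even for terms found in every sentence.
        vectors.append({
            term: count * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences by power iteration over the sentence-similarity graph."""
    vectors = tfidf_vectors(sentences)
    n = len(sentences)
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    row_sums = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            incoming = sum(scores[i] * sim[i][j] / row_sums[i]
                           for i in range(n) if row_sums[i] > 0)
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


sentences = [
    "The cat sat on the mat.",
    "The cat lay on the mat all day.",
    "A cat on a mat is a happy cat.",
    "Stock markets fell sharply on Monday.",
]
scores = rank_sentences(sentences)
# Keep the two best-scoring sentences, restored to document order.
top = sorted(range(len(sentences)), key=scores.__getitem__, reverse=True)[:2]
summary = [sentences[i] for i in sorted(top)]
print(summary)
```

The off-topic sentence shares almost no vocabulary with the rest, so it receives the lowest score and is left out of the two-sentence summary. Swapping the TF-IDF vectors for sentence embeddings would give a TextRank-style variant of the same idea.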

Abstractive methods generate new text that captures the meaning of the source:

Method	Description
PEGASUS	`google/pegasus-xsum` model fine-tuned for abstractive summarization
chatGPT	OpenAI `gpt-4o-mini` (or any chat model) via the OpenAI API

Text Pre-processing

The library provides a flexible pre-processing pipeline:

Lowercasing, accent removal, special character removal, number removal
Custom word removal
Stopword removal across 27 languages: Arabic, Bengali, Bulgarian, Chinese, Czech, English, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Italian, Japanese, Korean, Marathi, Persian, Polish, Portuguese-br, Romanian, Russian, Slovak, Spanish, Swedish, Thai, and Ukrainian
Sentence segmentation by punctuation, word count, or character count
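The steps listed above can be sketched with the standard library alone. The flag names mirror the Quick Start arguments (`lowercase`, `rmv_accents`, `rmv_special_chars`, `rmv_numbers`); everything else here, including `split_sentences`, `clean_sentence`, the `stopwords` and `custom_words` parameters, and the tiny stopword set, is hypothetical and only illustrates the kind of processing involved, not how the library does it internally.

```python
import re
import unicodedata

# Tiny illustrative stopword set; a real pipeline would use a full per-language list.
STOPWORDS_EN = {"a", "an", "and", "is", "it", "of", "the", "to"}


def split_sentences(text):
    """Segment by punctuation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def clean_sentence(sentence, lowercase=True, rmv_accents=True,
                   rmv_special_chars=True, rmv_numbers=True,
                   stopwords=STOPWORDS_EN, custom_words=()):
    """Apply the optional cleaning steps in a fixed order and return the cleaned string."""
    if lowercase:
        sentence = sentence.lower()
    if rmv_accents:
        # Decompose accented characters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFKD", sentence)
        sentence = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if rmv_special_chars:
        sentence = re.sub(r"[^\w\s]", " ", sentence)
    if rmv_numbers:
        sentence = re.sub(r"\d+", " ", sentence)
    blocked = set(stopwords) | {w.lower() for w in custom_words}
    tokens = [t for t in sentence.split() if t.lower() not in blocked]
    return " ".join(tokens)


text = "The café opened in 2019. It is the BEST café of Lisbon!"
cleaned = [clean_sentence(s) for s in split_sentences(text)]
print(cleaned)  # ['cafe opened in', 'best cafe lisbon']
```

Segmenting before cleaning matters in this sketch: removing special characters first would also remove the punctuation that the sentence splitter relies on.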

Evaluation Metrics

Classic Metrics (reference-based, lexical)

Metric	Method	Returns
ROUGE-N	`rouge_N(generated, reference, n=1)`	F1, Precision, Recall
ROUGE-L	`rouge_L(generated, reference)`	F1, Precision, Recall
ROUGE-S	`rouge_S(generated, reference, skip_distance=4)`	F1, Precision, Recall
BLEU	`bleu(generated, reference, n=4)`	Score
METEOR	`meteor(generated, reference)`	Score
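To make the lexical metrics concrete, here is a minimal sketch of ROUGE-N computed from clipped n-gram overlap. It returns values in the same order as the table above (F1, Precision, Recall), but it is a from-scratch illustration with whitespace tokenization assumed; the helper names `ngrams` and `rouge_n` are not the library's, and its own tokenization may differ.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(generated, reference, n=1):
    """Return (f1, precision, recall) from clipped n-gram overlap."""
    gen = ngrams(generated.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((gen & ref).values())
    precision = overlap / sum(gen.values()) if gen else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return f1, precision, recall


generated = "the cat sat on the mat"
reference = "the cat lay on the mat"
print(rouge_n(generated, reference, n=1))  # 5 of 6 unigrams match
print(rouge_n(generated, reference, n=2))  # 3 of 5 bigrams match
```

Changing a single word costs one unigram but two bigrams, which is why higher-order ROUGE scores drop faster than ROUGE-1 for near-identical texts.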

Semantic Metric (reference-based)

Metric	Method	Returns	Notes
BERTScore	`bert_score(generated, reference, model_type='roberta-large')`	F1, Precision, Recall	Requires `pip install bert-score`. Captures paraphrasing that ROUGE misses by comparing contextualised token embeddings.
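The idea behind comparing token embeddings can be sketched with greedy matching: each token is paired with its most similar counterpart in the other text, and the similarities are averaged. The 2-dimensional vectors below are made up for illustration and stand in for contextual embeddings from a real model; `greedy_match_score` is a hypothetical helper, not the `bert-score` package or the library's wrapper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def greedy_match_score(gen_vecs, ref_vecs):
    """Return (f1, precision, recall) using BERTScore-style greedy matching."""
    # Precision: match each generated token to its most similar reference token.
    precision = sum(max(cosine(g, r) for r in ref_vecs) for g in gen_vecs) / len(gen_vecs)
    # Recall: match each reference token to its most similar generated token.
    recall = sum(max(cosine(r, g) for g in gen_vecs) for r in ref_vecs) / len(ref_vecs)
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return f1, precision, recall


# Toy vectors (assumed values): "car" and "automobile" point in similar directions.
toy = {
    "red": (0.0, 1.0),
    "car": (1.0, 0.0),
    "automobile": (0.8, 0.6),
}
gen_vecs = [toy["red"], toy["car"]]          # "red car"
ref_vecs = [toy["red"], toy["automobile"]]   # "red automobile"
f1, precision, recall = greedy_match_score(gen_vecs, ref_vecs)
print(round(f1, 3), round(precision, 3), round(recall, 3))
```

With these toy vectors the paraphrase scores about 0.9, whereas exact unigram overlap between "red car" and "red automobile" would credit only one of the two words.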

Faithfulness / Factual Consistency Metrics (source-based, no reference needed)

These metrics check whether the summary is factually consistent with the source document, detecting hallucinations that lexical metrics cannot see.

Metric	Method	Returns	Notes
SummaC	`summa_c(generated, nli_model='cross-encoder/nli-deberta-v3-small')`	Score ∈ [0, 1]	Self-contained NLI-based faithfulness scorer using HuggingFace transformers. No extra install needed.
AlignScore	`align_score(generated, model='AlignScore-base')`	Score ∈ [0, 1]	Requires `pip install pyAutoSummarizer[faithfulness]` and `python -m spacy download en_core_web_sm`. Based on Zha et al., ACL 2023.
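A source-based score of this kind can be sketched as follows: split the source and the summary into sentences, score each summary sentence against every source sentence with an entailment function, keep the best match, and average. This sentence-level max-then-mean aggregation is an assumption about how an NLI-based scorer may work, not a description of the library's `summa_c`. The `toy_entailment` function is a crude word-overlap stand-in so the example runs without a model; a real scorer would call an NLI model instead.

```python
import re


def split_sentences(text):
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def toy_entailment(premise, hypothesis):
    """Stand-in for an NLI model: fraction of hypothesis words found in the premise."""
    premise_words = set(re.findall(r"\w+", premise.lower()))
    hypothesis_words = re.findall(r"\w+", hypothesis.lower())
    if not hypothesis_words:
        return 0.0
    return sum(w in premise_words for w in hypothesis_words) / len(hypothesis_words)


def faithfulness_score(source, summary, entail=toy_entailment):
    """Mean over summary sentences of the best support from any source sentence."""
    source_sents = split_sentences(source)
    summary_sents = split_sentences(summary)
    if not source_sents or not summary_sents:
        return 0.0
    best = [max(entail(src, hyp) for src in source_sents) for hyp in summary_sents]
    return sum(best) / len(best)


source = ("The company reported record profits in 2023. "
          "Its CEO announced a new factory in Ohio.")
faithful = "The company reported record profits."
hallucinated = "The company reported record losses in Texas."
print(faithfulness_score(source, faithful))
print(faithfulness_score(source, hallucinated))
```

The unsupported summary scores lower because "losses" and "Texas" find no support in any source sentence. No reference summary is involved at any point, which is what distinguishes these metrics from ROUGE or BERTScore.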

LLM-as-Judge Metric

Metric	Method	Returns	Notes
G-Eval	`g_eval(generated, api_key, model='gpt-4o-mini', dimensions=['coherence','consistency','fluency','relevance'])`	`dict {dimension: int 1–5}`	Uses an OpenAI chat model to score the summary across four quality dimensions. Based on Liu et al., 2023. Requires an OpenAI API key.
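The two ends of an LLM-as-judge call, building the prompt and turning the reply into a `{dimension: int}` dictionary, can be sketched without touching any API. The prompt wording, the `dimension: score` reply format, and the helper names `build_prompt` and `parse_scores` are assumptions made for this illustration; the source does not show the prompt or the parsing that the library's `g_eval` uses.

```python
import re

DIMENSIONS = ["coherence", "consistency", "fluency", "relevance"]


def build_prompt(source, summary, dimensions=DIMENSIONS):
    """Hypothetical judge prompt; the wording is illustrative only."""
    lines = [
        "Rate the summary of the source text on each dimension below.",
        "Use an integer from 1 (worst) to 5 (best), one 'dimension: score' pair per line.",
        "Dimensions: " + ", ".join(dimensions),
        "",
        "Source:",
        source,
        "",
        "Summary:",
        summary,
    ]
    return "\n".join(lines)


def parse_scores(reply, dimensions=DIMENSIONS):
    """Extract {dimension: int}, skipping missing or out-of-range scores."""
    scores = {}
    for dim in dimensions:
        pattern = rf"{re.escape(dim)}\s*[:=]\s*([1-5])\b"
        match = re.search(pattern, reply, flags=re.IGNORECASE)
        if match:
            scores[dim] = int(match.group(1))
    return scores


# A made-up reply in the assumed format, standing in for a chat model's answer.
reply = "Coherence: 4\nConsistency: 5\nFluency: 5\nRelevance: 4"
print(parse_scores(reply))
# {'coherence': 4, 'consistency': 5, 'fluency': 5, 'relevance': 4}
```

Parsing defensively is worthwhile because a chat model may omit a dimension or return a value outside the scale; this sketch simply leaves such dimensions out of the result.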

Installation

Core install (extractive/abstractive methods + lexical/BERTScore metrics)

pip install pyAutoSummarizer

With faithfulness metrics (AlignScore)

pip install "pyAutoSummarizer[faithfulness]"
python -m spacy download en_core_web_sm

Requirements: Python ≥ 3.9
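Before running the optional metrics, it can help to confirm that the interpreter and the extra packages are in place. The sketch below uses only the standard library; `check_environment` is a hypothetical helper, and the module names checked (`bert_score` for BERTScore, `spacy` for the AlignScore extra) are assumed from the install notes above.

```python
import importlib.util
import sys


def check_environment(optional_modules=("bert_score", "spacy")):
    """Report whether the Python version and the optional modules are available."""
    report = {"python>=3.9": sys.version_info >= (3, 9)}
    for name in optional_modules:
        # find_spec returns None when a top-level module is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for item, ok in check_environment().items():
    print(f"{item}: {'ok' if ok else 'missing'}")
```

A missing entry here only affects the metrics that depend on it; the core install does not need the optional modules.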

Quick Start

from pyAutoSummarizer.base import psr

text = """
Your long text goes here. It can be multiple paragraphs.
The library will pre-process it, split it into sentences,
and summarize it using any of the available methods.
"""

# Initialise: pre-processes the text
s = psr.summarization(text, stop_words=['en'], lowercase=True,
                      rmv_accents=True, rmv_special_chars=True, rmv_numbers=True)

# --- Extractive summarization ---
rank    = s.summ_text_rank()          # TextRank
summary = s.show_summary(rank, n=3)   # top-3 sentences
print(summary)

# --- Abstractive summarization ---
summary = s.summ_abst_chatgpt(api_key='YOUR_KEY', model='gpt-4o-mini')

# --- Evaluation (classic) ---
reference = "A human-written reference summary goes here."  # placeholder reference text
f1, p, r = s.rouge_N(summary, reference, n=1)
bleu_s   = s.bleu(summary, reference)

# --- Evaluation (semantic) ---
f1, p, r = s.bert_score(summary, reference)

# --- Evaluation (faithfulness, no reference needed) ---
faith_sc = s.summa_c(summary)    # SummaC (built-in NLI)
align_sc = s.align_score(summary) # AlignScore (requires [faithfulness] extra)

# --- Evaluation (LLM-as-judge) ---
scores   = s.g_eval(summary, api_key='YOUR_KEY')
# {'coherence': 4, 'consistency': 5, 'fluency': 5, 'relevance': 4}

Colab Demos

Extractive Summarization

Abstractive Summarization

chatGPT (requires an OpenAI API key)
PEGASUS

Related Projects

pyBibX: A Bibliometric and Scientometric Python Library Powered with Artificial Intelligence Tools

Project details


This version

1.2.0

Apr 17, 2026


File details

Details for the file pyautosummarizer-1.2.0.tar.gz.

File metadata

Download URL: pyautosummarizer-1.2.0.tar.gz
Upload date: Apr 17, 2026
Size: 55.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.10.9

File hashes

Hashes for pyautosummarizer-1.2.0.tar.gz
Algorithm	Hash digest
SHA256	`3751d6cba51b35b69f14b1aa98cc416f261467aa4adbda8198ae6a97ec42e1a2`
MD5	`1c25f81cb3a8e4d00260ae411def1e17`
BLAKE2b-256	`1101f5d84471c74f1d0cf62bf2dbd20d6c9653f02d933ca207c847e60755c50b`
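A downloaded archive can be checked against the SHA256 digest listed above with the standard `hashlib` module. This is a minimal sketch: the helper names are hypothetical, and it assumes the source distribution was saved under its original file name in the current directory.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare the file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


ARCHIVE = "pyautosummarizer-1.2.0.tar.gz"  # assumed local download location
EXPECTED = "3751d6cba51b35b69f14b1aa98cc416f261467aa4adbda8198ae6a97ec42e1a2"

if os.path.exists(ARCHIVE):
    print("hash ok" if matches_published_hash(ARCHIVE, EXPECTED) else "hash MISMATCH")
else:
    print(f"{ARCHIVE} not found; download it first")
```

The same check applies to the wheel by substituting its file name and SHA256 digest.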


File details

Details for the file pyautosummarizer-1.2.0-py3-none-any.whl.

File metadata

Download URL: pyautosummarizer-1.2.0-py3-none-any.whl
Upload date: Apr 17, 2026
Size: 53.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.10.9

File hashes

Hashes for pyautosummarizer-1.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1dc731bf31a0d37c3f6b39a05c8fc63acb316a313f7f6fa8f8d91709818b8832`
MD5	`e42b44b1c769ec15bc6da12334a8b0cc`
BLAKE2b-256	`b7724c01b45fac7a11ecb963bf1ac9c12b15e6381ac8e0f9d92ca960b7951ee3`

