A text summarization tool using GloVe embeddings and the PageRank algorithm

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 4 - Beta
Intended Audience
- Developers
Operating System
- OS Independent
Programming Language

Project description

Text Summarizer

A Python-based text summarization tool that uses GloVe word embeddings and the PageRank algorithm to generate extractive summaries of documents.

Features

Extractive Summarization: Uses sentence similarity and PageRank to identify the most important sentences
GloVe Embeddings: Leverages pre-trained GloVe word vectors for semantic similarity calculation
Multiple Input Methods: Supports single documents, CSV files, or interactive creation
Graphical Interface: User-friendly Tkinter-based GUI
Command Line Interface: Scriptable command-line tool for automation
Batch Processing: Process multiple documents at once

Installation

Prerequisites

Python 3.8 or higher
Required packages (automatically installed): pandas, numpy, nltk, scikit-learn, networkx

Install from PyPI

pip install text-summarizer-aweebtaku

Install from Source

Clone the repository:

git clone https://github.com/AWeebTaku/Summarizer.git
cd Summarizer

Install the package:

pip install -e .

Download GloVe Embeddings

No manual download is required. The package automatically downloads the GloVe embeddings (100d, ~400 MB) on first use and caches them in your home directory (~/.text_summarizer/).

If you prefer to use your own GloVe file, you can specify the path:

from text_summarizer import TextSummarizer
summarizer = TextSummarizer(glove_path='path/to/your/glove.6B.100d.txt')
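For reference, a GloVe text file stores one token per line, followed by its vector components separated by spaces. The sketch below shows how such a file could be parsed into a dictionary. `load_glove` is a hypothetical helper written for illustration, not a function exposed by this package, and the three-dimensional sample vectors are made up.

```python
import io
from typing import Dict, Iterable, List


def load_glove(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse GloVe-format lines ("word v1 v2 ... vn") into a word -> vector dict."""
    embeddings: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        embeddings[word] = [float(v) for v in values]
    return embeddings


# Tiny in-memory stand-in for a real file such as glove.6B.100d.txt
sample = io.StringIO("tennis 0.1 0.2 0.3\nmatch 0.4 0.5 0.6\n")
vectors = load_glove(sample)
print(len(vectors), len(vectors["tennis"]))
```

With a real file, the same helper could be called as `load_glove(open(path, encoding="utf-8"))`, ideally inside a `with` block so the file is closed afterwards.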

Usage

Command Line Interface

# Summarize a CSV file
text-summarizer-aweebtaku --csv-file data/tennis.csv --article-id 1

# Interactive mode
text-summarizer-aweebtaku
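Since the CLI takes a CSV file and an article ID, it can be driven from a script to cover several articles. The following is a minimal sketch that uses only the two flags shown above and assumes the command is installed on your PATH. `build_command` is a hypothetical helper, and the commands are printed rather than executed.

```python
import shlex
from typing import List


def build_command(csv_file: str, article_id: int) -> List[str]:
    """Build the argv list for one CLI invocation (flags taken from the examples above)."""
    return [
        "text-summarizer-aweebtaku",
        "--csv-file", csv_file,
        "--article-id", str(article_id),
    ]


commands = [build_command("data/tennis.csv", article_id) for article_id in (1, 2, 3)]
for argv in commands:
    # To actually run a command: subprocess.run(argv, check=True)
    print(shlex.join(argv))
```

Passing the arguments as a list avoids shell quoting problems if a file path contains spaces.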

Graphical User Interface

# Launch GUI (easiest way)
text-summarizer-aweebtaku --gui

# Or use the dedicated GUI command
text-summarizer-gui

Python API

import pandas as pd
from text_summarizer import TextSummarizer

# Initialize summarizer (GloVe embeddings are downloaded automatically on first use)
summarizer = TextSummarizer(num_sentences=3)

# Simple text summarization
text = "Your long text here..."
summary = summarizer.summarize_text(text)
print(summary)

# Advanced usage with a DataFrame (columns as described under Data Format)
df = pd.DataFrame([{'article_id': 1, 'article_text': text}])
scored_sentences = summarizer.run_summarization(df)
# Retrieve the original text and the summary for article_id 1
article_text, summary = summarizer.summarize_article(scored_sentences, 1, df)

Data Format

Input data should be in CSV format with columns:

article_id: Unique identifier for each document
article_text: The full text of the document

Example:

article_id,article_text
1,"This is the first article. It contains multiple sentences..."
2,"This is the second article. It also has several sentences..."
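Before handing a CSV file to the tool, it can help to confirm that both required columns are present. The sketch below does this with the standard library only. `read_articles` and `REQUIRED_COLUMNS` are hypothetical names used for illustration, and the package itself may validate its input differently.

```python
import csv
import io
from typing import Dict, List

REQUIRED_COLUMNS = ("article_id", "article_text")


def read_articles(handle) -> List[Dict[str, str]]:
    """Read article rows and check that the required columns are present."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required column(s): {', '.join(missing)}")
    return list(reader)


# In-memory copy of the example above; a real file would be opened with
# open(path, newline="", encoding="utf-8").
sample = io.StringIO(
    'article_id,article_text\n'
    '1,"This is the first article. It contains multiple sentences..."\n'
    '2,"This is the second article. It also has several sentences..."\n'
)
articles = read_articles(sample)
print([row["article_id"] for row in articles])
```

Quoting the `article_text` field, as in the example, keeps commas inside the text from being read as column separators.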

Algorithm

The summarization process follows these steps:

1. Sentence Tokenization: Split documents into individual sentences
2. Text Cleaning: Remove punctuation, convert to lowercase, remove stopwords
3. Sentence Vectorization: Convert sentences to vectors using GloVe embeddings
4. Similarity Calculation: Compute cosine similarity between all sentence pairs
5. PageRank Scoring: Apply PageRank algorithm to identify important sentences
6. Summary Extraction: Select top-ranked sentences in original order
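The steps above can be sketched end to end with the standard library alone. This is an illustration under stated assumptions, not the package's implementation: the sentence splitter is a naive regex, the stopword list and two-dimensional embeddings are toy values, PageRank is a hand-written power iteration, and every function name here is hypothetical.

```python
import math
import re
from typing import Dict, List

STOPWORDS = {"the", "a", "an", "is", "was", "and", "of", "in", "to"}  # toy list


def split_sentences(text: str) -> List[str]:
    """Naive split after '.', '!' or '?'; a real tokenizer handles more cases."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def clean(sentence: str) -> List[str]:
    """Lowercase, replace non-letters with spaces, drop stopwords."""
    words = re.sub(r"[^a-z\s]", " ", sentence.lower()).split()
    return [w for w in words if w not in STOPWORDS]


def sentence_vector(words: List[str], emb: Dict[str, List[float]], dim: int) -> List[float]:
    """Average the vectors of known words; zero vector if none are known."""
    known = [emb[w] for w in words if w in emb]
    if not known:
        return [0.0] * dim
    return [sum(col) / len(known) for col in zip(*known)]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v))
    return sum(x * y for x, y in zip(u, v)) / norm if norm else 0.0


def pagerank(weights: List[List[float]], damping: float = 0.85, iterations: int = 50) -> List[float]:
    """Power-iteration PageRank on a non-negative weighted adjacency matrix."""
    n = len(weights)
    out = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Nodes without outgoing weight spread their score uniformly.
        dangling = sum(scores[j] for j in range(n) if out[j] == 0) / n
        scores = [
            (1 - damping) / n
            + damping * (dangling + sum(scores[j] * weights[j][i] / out[j]
                                        for j in range(n) if out[j] > 0))
            for i in range(n)
        ]
    return scores


def summarize(text: str, emb: Dict[str, List[float]], dim: int, num_sentences: int = 2) -> str:
    sentences = split_sentences(text)
    if not sentences:
        return ""
    vectors = [sentence_vector(clean(s), emb, dim) for s in sentences]
    n = len(sentences)
    # Negative similarities are clamped to 0 so the edge weights stay valid.
    sim = [[max(0.0, cosine(vectors[i], vectors[j])) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    scores = pagerank(sim)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    # Re-sort the selected indices so sentences appear in their original order.
    return " ".join(sentences[i] for i in sorted(top))


toy_embeddings = {
    "tennis": [1.0, 0.0], "match": [0.9, 0.1],
    "player": [0.8, 0.2], "weather": [0.0, 1.0],
}
text = "The tennis match was long. The player won the match. Weather was mild."
summary = summarize(text, toy_embeddings, dim=2, num_sentences=2)
print(summary)
```

In this toy input the two tennis sentences are close to each other in embedding space, so they reinforce each other's scores, while the weather sentence has only weak links to either and ranks last.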

Configuration

glove_path: Path to GloVe embeddings file (default: 'glove.6B.100d.txt/glove.6B.100d.txt')
num_sentences: Number of sentences in summary (default: 5)

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Citation

If you use this tool in your research, please cite:

@software{text_summarizer,
  title = {Text Summarizer},
  author = {Aditya Chaurasiya},
  url = {https://github.com/AWeebTaku/Summarizer},
  year = {2026}
}

Release history

Version	Release date
1.3.2	Feb 16, 2026
1.3.1	Feb 16, 2026
1.3.0	Feb 9, 2026
1.2.9	Feb 2, 2026
1.2.8.post2	Feb 2, 2026
1.2.8.post1	Feb 2, 2026
1.2.7	Feb 2, 2026
1.2.6	Feb 2, 2026
1.2.5	Feb 2, 2026
1.2.4	Feb 2, 2026
1.2.3 (this version)	Feb 1, 2026
1.2.2	Feb 1, 2026
1.2.1	Feb 1, 2026
1.2.0	Feb 1, 2026
1.1.0	Feb 1, 2026
1.0.2	Feb 1, 2026
1.0.1	Feb 1, 2026
1.0.0	Feb 1, 2026

Download files

Download the file for your platform.

Source Distribution

text_summarizer_aweebtaku-1.2.3.tar.gz (19.6 kB)

Uploaded Feb 1, 2026 Source

Built Distribution

text_summarizer_aweebtaku-1.2.3-py3-none-any.whl (18.8 kB)

Uploaded Feb 1, 2026 Python 3

File details

Details for the file text_summarizer_aweebtaku-1.2.3.tar.gz.

File metadata

Download URL: text_summarizer_aweebtaku-1.2.3.tar.gz
Upload date: Feb 1, 2026
Size: 19.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for text_summarizer_aweebtaku-1.2.3.tar.gz
Algorithm	Hash digest
SHA256	`ef480f17ef94bae0193e69f5a5ff1a08b4cfc3cc00dbd7e34366151393cd65c3`
MD5	`d892f3282d1155a2aade8f118e04271d`
BLAKE2b-256	`b0bcf226edc9f89b49271c1d99918c7d60ed8aa732cb498120ad9963bf39e514`
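
A downloaded file can be checked against the published SHA256 digest with Python's `hashlib`. The sketch below is a generic illustration: `sha256_of` is a hypothetical helper, and the demonstration hashes a throwaway file rather than the real archive.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a throwaway file; for a real check, point `path` at the
# downloaded archive and compare against the SHA256 value listed above.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.bin"
    path.write_bytes(b"hello")
    actual = sha256_of(path)
    expected = hashlib.sha256(b"hello").hexdigest()
    print("match" if actual == expected else "MISMATCH")
```

Reading in chunks keeps memory use constant regardless of file size.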

File details

Details for the file text_summarizer_aweebtaku-1.2.3-py3-none-any.whl.

File metadata

Download URL: text_summarizer_aweebtaku-1.2.3-py3-none-any.whl
Upload date: Feb 1, 2026
Size: 18.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for text_summarizer_aweebtaku-1.2.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`70328df152e9afb5930bbf6e844635e27e5a1c8e47cb92a670066d1819159373`
MD5	`621e75c8226319b76864b0408ca4ad87`
BLAKE2b-256	`fc5f4093c98628fc24f6c4efc171e914f39799db56b9621650e04f161ec8b75c`

text-summarizer-aweebtaku 1.2.3
