A flexible text summarization library supporting extractive and abstractive methods.

pysuma

🚀 PySummarizer - The Ultimate Text Summarization Library

PySummarizer is a text summarization library that supports both extractive and abstractive summarization. It extracts text from PDFs and writes the summary as structured bullet points, which suits academic research, content summarization, and automation pipelines.


🌟 Features

Supports extractive and abstractive summarization
Extracts text from PDFs and generates structured summaries
Uses TextRank, LSA, or LexRank for extractive summarization
Adjusts the number of bullet points to the text length
Customizable summary length
Command-line interface (CLI) for quick summarization
Easy to use and integrates into existing Python projects
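
For readers unfamiliar with TextRank, here is a minimal pure-Python sketch of the idea: sentences are graph nodes, word overlap gives the edge weights, and a PageRank-style iteration scores each sentence. This is an illustration only, not PySummarizer's implementation; the helper names (`split_sentences`, `similarity`, `textrank`, `summarize`) and the naive sentence splitter are assumptions made for this example.

```python
import math
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def similarity(a, b):
    """Word-overlap similarity, normalized by the log of each sentence's word count."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # log(1) == 0 would make the denominator zero
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank over the similarity graph."""
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, num_sentences=3):
    """Return the top-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(top[:num_sentences])]
```

A sentence that shares no words with the rest of the text receives only the baseline score `1 - damping`, so it is the first to be dropped from the summary.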


📌 Installation

Get started in seconds! Install PySummarizer via pip:

pip install pysuma

🚀 Quick Start Guide

Transform long PDFs into structured summaries in just a few lines of code.

1️⃣ Import the Library

import pysuma as pyss

2️⃣ Define Input and Output Paths

pdf_path = "sample.pdf"      # Input PDF file
output_file = "summary.txt"  # Output summary file

3️⃣ Generate a Summarized Report

pyss.summarize_pdf(
    pdf_path,
    output_file,
    method="textrank",            # Choose from "textrank", "lsa", "lexrank"
    summary_type="extractive"      # Select either "extractive" or "abstractive"
)
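
To compare the three extractive methods on the same PDF, the call above can be wrapped in a small loop. The sketch below assumes `summarize_pdf` accepts exactly the arguments shown in this guide; the wrapper name `summarize_with_each_method` and the output naming scheme are hypothetical, chosen for this example.

```python
from pathlib import Path

# Method names as listed in this README.
METHODS = ("textrank", "lsa", "lexrank")


def summarize_with_each_method(pdf_path, out_dir, summarize_pdf=None):
    """Write one summary file per extractive method and return their paths."""
    if summarize_pdf is None:
        # Imported lazily so the sketch can be loaded without pysuma installed.
        import pysuma
        summarize_pdf = pysuma.summarize_pdf

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    outputs = {}
    for method in METHODS:
        # Hypothetical naming scheme: <pdf name>_<method>.txt
        output_file = out_dir / f"{Path(pdf_path).stem}_{method}.txt"
        summarize_pdf(
            str(pdf_path),
            str(output_file),
            method=method,
            summary_type="extractive",
        )
        outputs[method] = output_file
    return outputs
```

Calling `summarize_with_each_method("sample.pdf", "summaries")` would then produce `sample_textrank.txt`, `sample_lsa.txt`, and `sample_lexrank.txt` for side-by-side reading.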

4️⃣ Run & Retrieve Your Summary

Save the code above as script.py (any filename works), then run it:

python script.py

✅ The summary will be saved in summary.txt, formatted as bullet points.


📄 Summary Output Example

PySummarizer automatically structures summaries into bullet points for better readability:

• Learning enables acquiring new skills and knowledge.
• Supervised learning requires labeled datasets for training.
• Decision trees classify data using entropy and information gain.
• Reinforcement learning optimizes decision-making using rewards and penalties.
...
(Total bullet points depend on text length)
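
If you want to post-process the summary in Python, the bullet lines can be read back into a list. This is a minimal sketch that assumes each bullet starts with the "•" character, as in the example above; the helper name `read_bullets` is hypothetical and not part of the library.

```python
def read_bullets(text):
    """Return the text of every line that starts with a '•' bullet."""
    bullets = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("•"):
            # Drop the bullet marker and surrounding whitespace.
            bullets.append(line.lstrip("•").strip())
    return bullets
```

For a saved summary, something like `read_bullets(Path("summary.txt").read_text(encoding="utf-8"))` (with `Path` from `pathlib`) should work, assuming the file is written as UTF-8.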

📊 Adaptive Bullet Point Summarization

PySummarizer dynamically adjusts the number of bullet points based on text length:

Text Length (Characters)    Number of Bullet Points
0 - 2500                    10
2501 - 5000                 20
5001 - 7500                 30
More than 7500              50

🔹 Longer documents get more bullet points, so the summary scales with the source instead of being cut to a fixed size.
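
The thresholds in the table can be written as a small lookup function. This is a sketch of the documented mapping only, not the library's internal code; the function name `bullet_count` is hypothetical, and it assumes the length is measured in characters of the extracted text.

```python
def bullet_count(text_length):
    """Map a character count to the number of bullet points from the table."""
    if text_length < 0:
        raise ValueError("text_length must be non-negative")
    if text_length <= 2500:
        return 10
    if text_length <= 5000:
        return 20
    if text_length <= 7500:
        return 30
    return 50
```

For example, `bullet_count(2500)` gives 10 while `bullet_count(2501)` gives 20, matching the range boundaries above.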


💻 CLI Usage (Command Line)

Use PySummarizer directly from the terminal for quick PDF summarization:

pysuma sample.pdf summary.txt --method textrank --summary_type extractive

✔️ Perfect for automation & large-scale text processing.
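
For batch jobs, the CLI can be driven from Python with `subprocess`. The sketch below builds the same command line shown above for every PDF in a folder; the helper names `build_command` and `summarize_folder` are hypothetical, and it assumes the `pysuma` executable is on your PATH and exits with a non-zero status on failure.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path, output_path, method="textrank", summary_type="extractive"):
    """Build the argument list for the CLI invocation shown in this README."""
    return [
        "pysuma",
        str(pdf_path),
        str(output_path),
        "--method", method,
        "--summary_type", summary_type,
    ]


def summarize_folder(folder, dry_run=False):
    """Summarize every *.pdf in `folder`, writing <name>.txt next to each PDF."""
    commands = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        cmd = build_command(pdf, pdf.with_suffix(".txt"))
        commands.append(cmd)
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(cmd, check=True)
    return commands
```

Passing `dry_run=True` returns the commands without executing them, which is handy for checking paths before a long run.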


📜 License

PySummarizer is released under the MIT License, making it open-source and free for personal and commercial use.


🎯 Transform lengthy PDFs into structured insights with PySummarizer today!
🔗 Contribute or explore more: GitHub Repository

