Generate abstractive summaries of Project Gutenberg texts

Summarize: generate abstractive summaries from Project Gutenberg books

About the app

Summarize is a CLI app written in Python 3.12. The CLI uses Typer and Rich for rich-text output. The app uses the Gutendex API to retrieve books from Project Gutenberg and pszemraj's pegasus-x-large-book-summary LLM to generate abstractive summaries. The user can save summaries to a .txt file or print them to stdout for further piping.
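As a rough illustration of the summarization step, here is a minimal sketch. It is not Summarize's actual code. It assumes the model is loaded from the Hugging Face Hub under the id pszemraj/pegasus-x-large-book-summary with the transformers library (plus PyTorch). The function names, the max_new_tokens value, and the choice to join chunk summaries with a space are all hypothetical.

```python
from typing import Callable, Iterable

# Assumed Hugging Face Hub id for the model named above.
MODEL_ID = "pszemraj/pegasus-x-large-book-summary"


def load_summarizer(model_id: str = MODEL_ID, max_new_tokens: int = 256) -> Callable[[str], str]:
    """Return a function that summarizes one piece of text.

    The imports are done lazily so the rest of this sketch can be used
    without transformers installed. Loading downloads a large model.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    def summarize(text: str) -> str:
        # Truncate inputs that exceed the model's maximum length.
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)

    return summarize


def summarize_chunks(chunks: Iterable[str], summarizer: Callable[[str], str]) -> str:
    """Summarize each non-empty chunk and join the results into one summary."""
    parts = [summarizer(chunk).strip() for chunk in chunks if chunk.strip()]
    return " ".join(parts)
```

Because summarize_chunks accepts any callable, it can be tried with a stand-in function before downloading the model, for example summarize_chunks(chunks, str.upper).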

Installation

Summarize is available as a package on PyPI! You'll need Python 3.12 or later installed.

  1. Navigate to the directory where you want to install Summarize.
  2. Create a virtual environment: python -m venv venv (Windows/Linux) or python3 -m venv venv (Mac).
  3. Activate the virtual environment: .\venv\Scripts\activate (Windows) or source venv/bin/activate (Linux/Mac).
  4. Install summarize: pip install summarize-gutenberg

Running the app

At present the app retrieves the top 32 books from Project Gutenberg and offers the user a choice among them. Here's how to use it:

  1. Make sure your virtual environment is activated.
  2. Type summarize
  3. Follow the onscreen prompts to create your summary!

A note on chunks

Summarize works by breaking the source text into chunks of a given number of lines; you can specify how many lines per chunk the program works on. In theory the LLM should work with very large chunks or even entire texts, but experience has shown that a range of 200-800 lines per chunk works best. Please note that a smaller chunk size means more chunks to summarize, so the program will take longer to run!
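
The chunking idea can be sketched in a few lines. This is an illustration under assumptions, not the app's actual code: the function name chunk_lines and the default of 400 lines are hypothetical.

```python
def chunk_lines(text: str, lines_per_chunk: int = 400) -> list[str]:
    """Split text into consecutive chunks of at most lines_per_chunk lines."""
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    lines = text.splitlines()
    return [
        "\n".join(lines[start:start + lines_per_chunk])
        for start in range(0, len(lines), lines_per_chunk)
    ]
```

Each chunk is summarized separately, so a 2,000-line text needs 10 summarization passes at 200 lines per chunk but only 3 at 800, which is why smaller chunks take longer overall.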

Sample output

Kafka's The Metamorphosis, 400 lines per chunk:

One morning, Samsa wakes from a nightmare and finds himself transformed into a "horrid vermin" in his bed. He is a traveling salesman, and his room is small, but it is comfortable. A picture hangs above the table, showing a woman wearing a hat and a boa. Samsa thinks about how hard it is to be a salesman, how he has to travel all over the world, and how he can never be friendly with anyone. He feels a slight itch on his belly, and pushes himself up onto the bed to lift his head. When he touches the bed, he is overcome by a "cold shudder". He thinks about getting up early, but he knows that other salesmen live a "life of luxury" and that he would get kicked out of his job if he did not have his parents to worry about. He decides that he will pay off his parents' debt, and then he will make the big decision to quit his job. He looks over at the clock, which is ticking past six, and wonders if he could have slept peacefully through the furniture-ratting noise.The next morning, the chief clerk tells Gregor that he has to leave immediately. He tells him that he is in debt to his employer and must look after his parents and sister.The chapter opens with a loud "No" from the family. It's five years later, and the family is still broke. They've lost everything, but they're still able to scrape together enough money to send their sister to the conservatory. The narrator tells us that this is the first time that the family has heard anything positive about their financial situation since the business collapsed five years ago. The family is overjoyed, and they've even gotten used to the idea of having to pay for the expenses of the house. But now that the business is gone, the family can't afford to go back to the good old days. So, they'll have to work.The chapter opens with a description of the situation in which the family finds itself. The family is in the middle of a heated argument when an apple flies through the air and lands on the floor. 
It is an apple that has been thrown at the family, and it is the apple that is lodged in the flesh of Gregor. The apple is a reminder of the family's reaction to the accident, and the family does not treat the apple as an enemy, but rather as a reminder that the family is patient with its son. The chapter ends with a comparison between the family and a hotel room. The hotel room was a place where the family would gather and talk, but now the family has moved to the bedroom, where they can only talk in the dark.The monster is back, and it's not going to stop until they get rid of him. It's going to take a long time, and they're going to have to live with it for the rest of their lives.

Planned additional features

There are two major features still to be implemented. The first is a search function allowing you to retrieve any book from Project Gutenberg. The second is a full set of command-line options to control program flow.

License

Summarize is free software, released under version 3.0 of the GPL. Everyone has the right to use, modify, and distribute Summarize subject to the stipulations of that license. Contributions are welcome!

Acknowledgments

The overall structure of the app is inspired by Brian Okken's cards. The UI and (eventual) search functionality are inspired by pybites-search. The tone of user messages is doubtless inspired by countless hours of Dungeon Crawl Stone Soup over the years.
