A package for summarizing text using OpenAI

OpenAI Summarize

OpenAI Summarize is a Python package that generates summaries of text using OpenAI's text-davinci-003 model. It chunks the input text into smaller pieces and generates summaries for each chunk separately using the OpenAI API. The generated summaries are then combined into a final summary. If the final summary is too long, it is recursively summarized until it reaches the desired length.
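
The chunk, summarize, combine, and re-summarize loop described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: `split_words`, `recursive_summarize`, and the `summarize_chunk` callback are hypothetical names, sizes are measured in whitespace-separated words rather than model tokens, and the callback stands in for the OpenAI API call.

```python
from typing import Callable, List


def split_words(text: str, max_words: int) -> List[str]:
    """Split text into chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def recursive_summarize(
    text: str,
    summarize_chunk: Callable[[str], str],  # hypothetical stand-in for the API call
    max_chunk_words: int = 500,
    max_summary_words: int = 4000,
) -> str:
    """Summarize each chunk, join the results, and recurse while the result is too long."""
    chunks = split_words(text, max_chunk_words)
    combined = " ".join(summarize_chunk(chunk) for chunk in chunks)

    too_long = len(combined.split()) > max_summary_words
    # Only recurse if this pass actually shortened the text; otherwise stop
    # to avoid looping forever on a summarizer that does not compress.
    made_progress = len(combined.split()) < len(text.split())
    if too_long and made_progress:
        return recursive_summarize(
            combined, summarize_chunk, max_chunk_words, max_summary_words
        )
    return combined
```

Passing the summarizer in as a callback keeps the control flow testable without network access; in real use it would wrap a request to the model.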

Installation

OpenAI Summarize can be installed from PyPI using pip. Simply run:

pip install openai-summarize

Alternatively, you can install OpenAI Summarize from Git by cloning the repository and running setup.py:

git clone https://github.com/kixpanganiban/openai_summarize.git
cd openai_summarize
python setup.py install

Usage

import openai_summarize

openai_summarizer = openai_summarize.OpenAISummarize("your_openai_token")

text = "This is a long piece of text that needs to be summarized."
summary = openai_summarizer.summarize_text(text)

print(summary)

Examples

Here's an example of how to use OpenAI Summarize to summarize a long piece of text:

import openai_summarize

openai_summarizer = openai_summarize.OpenAISummarize("your_openai_token")

text = """The COVID-19 pandemic, also known as the coronavirus pandemic, is an ongoing pandemic of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The virus was first identified in December 2019 in Wuhan, China. The World Health Organization declared a Public Health Emergency of International Concern regarding COVID-19 on 30 January 2020, and later declared a pandemic on 11 March 2020. As of 18 March 2023, more than 472 million cases have been confirmed, with more than 6.5 million deaths attributed to COVID-19, making it one of the deadliest pandemics in history.

Efforts to prevent the spread of COVID-19 include vaccination programs, lockdowns, travel restrictions, and the use of masks and other protective equipment. Vaccines have been developed and authorized for emergency use, with the Pfizer-BioNTech vaccine being the first to receive emergency use authorization in December 2020.

The pandemic has had significant social, economic, and political impacts. Many businesses have closed, and unemployment rates have risen in many countries. The pandemic has also highlighted disparities in access to healthcare and education, and has led to an increase in domestic violence and mental health issues."""

summary = openai_summarizer.summarize_text(text)
print(summary)

This generates the following summary:

The COVID-19 pandemic is caused by the SARS-CoV-2 virus and has resulted in over 6.5 million deaths worldwide. Efforts to prevent its spread include vaccination programs, lockdowns, travel restrictions, and the use of masks and protective equipment. The pandemic has had significant social, economic, and political impacts, including business closures and rising unemployment rates.

Here's another example of how to use OpenAI Summarize to summarize a news article:

# Installed with `pip install newspaper3k`, but imported as `newspaper`
from newspaper import Article
import openai_summarize

openai_summarizer = openai_summarize.OpenAISummarize("your_openai_token")

article = Article("https://www.nytimes.com/2023/03/18/world/europe/russia-nato-ukraine.html")
article.download()
article.parse()
summary = openai_summarizer.summarize_text(article.text)
print(summary)

API Reference

OpenAISummarize class

__init__(self, openai_token)

Creates an instance of the OpenAISummarize class.

Arguments
  • openai_token (str): Your OpenAI API token.

count_tokens(self, text)

Counts the number of tokens in a given text.

Arguments
  • text (str): The text to count the tokens of.
Returns
  • int: The number of tokens in the text.

chunk_text(self, text, max_tokens=500)

Breaks up a given text into chunks of at most max_tokens tokens.

Arguments
  • text (str): The text to chunk.
  • max_tokens (int): The maximum number of tokens allowed in each chunk. Defaults to 500.
Returns
  • list of str: The chunks of text.
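
One way to chunk text under a token budget is to pack whole sentences greedily, as in the sketch below. It is an assumption-laden illustration rather than the package's own logic: `approx_count_tokens` and `chunk_by_sentences` are hypothetical helpers, and a "token" here is approximated as a whitespace-separated word, which will differ from the counts a real model tokenizer produces.

```python
import re
from typing import List


def approx_count_tokens(text: str) -> int:
    """Hypothetical stand-in for count_tokens: counts whitespace-separated words."""
    return len(text.split())


def chunk_by_sentences(text: str, max_tokens: int = 500) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_tokens words.

    A single sentence longer than max_tokens is split on word boundaries.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current: List[str] = []  # words of the chunk being built

    for sentence in sentences:
        words = sentence.split()
        # Start a new chunk if this sentence would overflow the current one.
        if current and len(current) + len(words) > max_tokens:
            chunks.append(" ".join(current))
            current = []
        # An oversized sentence is emitted in full-size pieces.
        while len(words) > max_tokens:
            chunks.append(" ".join(words[:max_tokens]))
            words = words[max_tokens:]
        current.extend(words)

    if current:
        chunks.append(" ".join(current))
    return chunks
```

Splitting on sentence boundaries tends to give each chunk a self-contained meaning, which may help the per-chunk summaries stay coherent.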

summarize_text(self, text, max_chunk_size=500, max_combined_summary_size=4000)

Generates a summary of a given text using OpenAI's text-davinci-003 model.

Arguments
  • text (str): The text to summarize.
  • max_chunk_size (int, optional): The size of each chunk of text to summarize. Defaults to 500.
  • max_combined_summary_size (int, optional): The maximum size of the combined summary. Defaults to 4000.
Returns
  • str: The generated summary of the text.
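
Because each chunk is summarized separately, `max_chunk_size` also affects how many API requests a call makes. The helper below is a rough estimate only, assuming one request per chunk in the first pass and ignoring any recursive passes; `first_pass_requests` is a hypothetical name and not part of the package.

```python
import math


def first_pass_requests(total_tokens: int, max_chunk_size: int = 500) -> int:
    """Estimate the number of per-chunk requests in the first summarization pass."""
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    # Each chunk holds at most max_chunk_size tokens, so round up.
    return math.ceil(total_tokens / max_chunk_size)
```

Under these assumptions, a 1,200-token text with the default chunk size would need three requests before the summaries are combined.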

extract_article_text(self, url)

Extracts the main text content of an article from a given URL.

Arguments
  • url (str): The URL of the article to extract the text from.
Returns
  • str: The main text content of the article.
