
thom

Summarize research papers from arXiv using LLMs.

Named after René Thom (1923-2002), the French mathematician who founded catastrophe theory and won the Fields Medal in 1958.

Installation

pip install thom

Or install from source:

git clone https://github.com/thom-project/thom.git
cd thom
pip install -e .

Quick Start

Python Library

import thom

# Fetch a paper from arXiv
paper = thom.fetch_paper("2301.00001")
print(paper.title)
print(paper.abstract)

# Summarize the paper
summary = thom.summarize(paper)
print(summary.summary)
print(summary.key_points)

# Search for papers
papers = thom.search("transformer attention mechanism", max_results=5)
for p in papers:
    print(f"{p.arxiv_id}: {p.title}")

# Compare multiple papers
papers = [thom.fetch_paper(id) for id in ["2301.00001", "2301.00002"]]
analysis = thom.compare(papers)
print(analysis)

Command Line

# Summarize a paper
thom summarize 2301.00001

# Or use a URL
thom summarize https://arxiv.org/abs/2301.00001

# Fetch paper metadata only
thom fetch 2301.00001

# Search for papers
thom search "machine learning"

# Compare papers
thom compare 2301.00001 2301.00002

# Use different models
thom summarize 2301.00001 --model gpt-4o
thom summarize 2301.00001 --model claude-3-5-sonnet-20241022
thom summarize 2301.00001 --model ollama/llama3

# Adjust detail level
thom summarize 2301.00001 --detail brief
thom summarize 2301.00001 --detail detailed

# Output in different languages
thom summarize 2301.00001 --language french

# JSON output
thom summarize 2301.00001 --json

# List supported models
thom models

Configuration

API Keys

Set your API key via environment variables:

# OpenAI
export OPENAI_API_KEY="sk-..."

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

Or set a key programmatically from Python:

import thom
thom.set_api_key("openai", "sk-...")

Using Local Models with Ollama

# First, install and run Ollama
ollama run llama3

# Then use with thom from the command line
thom summarize 2301.00001 --model ollama/llama3

Or from Python:

import thom

paper = thom.fetch_paper("2301.00001")
summary = thom.summarize(paper, model="ollama/llama3")

API Reference

Core Functions

thom.fetch_paper(identifier)

Fetch a paper from arXiv by ID or URL.

paper = thom.fetch_paper("2301.00001")
paper = thom.fetch_paper("https://arxiv.org/abs/2301.00001")
paper = thom.fetch_paper("https://arxiv.org/pdf/2301.00001.pdf")

Returns an ArxivPaper object with:

  • arxiv_id: The arXiv identifier
  • title: Paper title
  • authors: List of author names
  • abstract: Paper abstract
  • categories: arXiv categories
  • published: Publication date
  • pdf_url: URL to PDF
  • arxiv_url: URL to arXiv page
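All three identifier forms above presumably normalize to the same bare arXiv ID. A hedged sketch of that normalization — the `normalize_arxiv_id` helper is illustrative, not part of thom's public API:

```python
import re

def normalize_arxiv_id(identifier: str) -> str:
    """Extract a bare arXiv ID from an ID string, an abs URL, or a PDF URL."""
    # Accept an optional arxiv.org/abs/ or arxiv.org/pdf/ prefix, an optional
    # version suffix (v2, v3, ...), and an optional trailing .pdf.
    m = re.search(
        r"(?:arxiv\.org/(?:abs|pdf)/)?(\d{4}\.\d{4,5})(?:v\d+)?(?:\.pdf)?$",
        identifier,
    )
    if not m:
        raise ValueError(f"unrecognized arXiv identifier: {identifier!r}")
    return m.group(1)

assert normalize_arxiv_id("2301.00001") == "2301.00001"
assert normalize_arxiv_id("https://arxiv.org/abs/2301.00001") == "2301.00001"
assert normalize_arxiv_id("https://arxiv.org/pdf/2301.00001.pdf") == "2301.00001"
```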

thom.summarize(paper, model="gpt-4o-mini", detail_level="medium", language="english")

Generate a summary of a paper.

summary = thom.summarize(paper)
summary = thom.summarize(paper, model="gpt-4o", detail_level="detailed")
summary = thom.summarize(paper, language="spanish")

Returns a Summary object with:

  • paper: The original ArxivPaper
  • summary: Generated summary text
  • key_points: List of key points
  • model: Model used for summarization
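The fields above map naturally onto a small dataclass. This is only a sketch of the shape — attribute names are taken from the list above, but the class definition itself is illustrative, not thom's source:

```python
from dataclasses import dataclass, field

@dataclass
class Summary:
    """Illustrative shape of the object thom.summarize returns."""
    paper: object                                   # the original ArxivPaper
    summary: str                                    # generated summary text
    key_points: list[str] = field(default_factory=list)
    model: str = "gpt-4o-mini"                      # assumed default, per the signature above

s = Summary(paper=None, summary="A study of attention.", key_points=["scales well"])
assert s.model == "gpt-4o-mini"
assert s.key_points == ["scales well"]
```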

thom.search(query, max_results=10, sort_by="relevance")

Search for papers on arXiv.

papers = thom.search("machine learning")
papers = thom.search("au:hinton", max_results=20)
papers = thom.search("cat:cs.LG", sort_by="submittedDate")
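The `au:` and `cat:` prefixes above are arXiv's own query-field syntax, and fields can be combined with AND/OR. A small helper sketch for composing such queries — the `arxiv_query` function is illustrative, not part of thom's API:

```python
def arxiv_query(**fields: str) -> str:
    """Compose an arXiv field query, e.g. au:..., cat:..., joined with AND."""
    return " AND ".join(f"{name}:{value}" for name, value in fields.items())

# Papers by Hinton in the machine-learning category:
assert arxiv_query(au="hinton", cat="cs.LG") == "au:hinton AND cat:cs.LG"
```

The composed string can then be passed straight to search, e.g. `thom.search(arxiv_query(au="hinton", cat="cs.LG"))`.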

thom.compare(papers, model="gpt-4o-mini")

Generate a comparative analysis of multiple papers.

papers = [thom.fetch_paper(id) for id in ids]
analysis = thom.compare(papers)

Supported Models

thom uses LiteLLM for model access, so any model LiteLLM supports will work, including:

  • OpenAI: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo
  • Anthropic: claude-3-5-sonnet-20241022, claude-3-opus-20240229, claude-3-sonnet-20240229
  • Google: gemini/gemini-1.5-pro, gemini/gemini-1.5-flash
  • Cohere: command-r-plus, command-r
  • Ollama (local): ollama/llama3, ollama/mistral, ollama/mixtral
  • Together AI: together_ai/meta-llama/Llama-3-70b-chat-hf
  • And many more
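LiteLLM-style model strings route by an optional provider prefix (`ollama/llama3`, `gemini/gemini-1.5-pro`), while bare names like `gpt-4o` carry no prefix. A sketch of how such a string splits — the `split_model` helper is illustrative:

```python
def split_model(model: str) -> tuple[str, str]:
    """Split an optional 'provider/' prefix off a model string."""
    provider, sep, name = model.partition("/")
    if not sep:
        # No prefix: LiteLLM infers the provider from the bare model name.
        return ("", model)
    return (provider, name)

assert split_model("ollama/llama3") == ("ollama", "llama3")
assert split_model("gpt-4o-mini") == ("", "gpt-4o-mini")
# Only the first slash separates provider from model:
assert split_model("together_ai/meta-llama/Llama-3-70b-chat-hf") == (
    "together_ai", "meta-llama/Llama-3-70b-chat-hf")
```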

List available models:

models = thom.list_supported_models()

License

MIT License

Download files


Source Distribution

thom-0.1.0.tar.gz (10.4 kB)

Uploaded Source

Built Distribution


thom-0.1.0-py3-none-any.whl (12.9 kB)

Uploaded Python 3

File details

Details for the file thom-0.1.0.tar.gz.

File metadata

  • Download URL: thom-0.1.0.tar.gz
  • Upload date:
  • Size: 10.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.15

File hashes

Hashes for thom-0.1.0.tar.gz

  • SHA256: 8d8b09c01a90e836d945e5916bfcb733f52cc4d05f32a1dbb6ac46a6c102e16e
  • MD5: 16a55416bd78f7252180f9434977dfac
  • BLAKE2b-256: 211009e1f63826bd80994dbecb24483ef84a48f4b0c324db85f1965d80768e0b


File details

Details for the file thom-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: thom-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 12.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.15

File hashes

Hashes for thom-0.1.0-py3-none-any.whl

  • SHA256: b18ef9d88aca57aab1ed7cd4aa15f3423c2e1de878c3b2c2b51a8508749f1412
  • MD5: e1855f432ff274ca84825637608ca226
  • BLAKE2b-256: 7ad1f3535b3bd56e456b0838ea750163d1d316f851c67be14376f9159cdfe855

