
litresearch


CLI tool that automates literature research from research questions to curated, ranked, and exported paper sets with structured reports.

Overview

  • Generates search facets and academic queries from one or more research questions
  • Discovers candidates from Semantic Scholar and OpenAlex
  • Screens and analyzes papers with an LLM through LiteLLM
  • Supports citation graph expansion for frequently referenced works
  • Ranks papers and exports reports, references, JSON data, PDFs, and metrics
  • Supports robust resume via a saved state.json

What's New in v1.0.0

Multi-source discovery (S2 + OpenAlex)

  • Use discovery_sources = ["s2", "openalex"] for broader coverage.
  • Candidates are deduplicated across sources and source provenance is tracked.
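The cross-source deduplication idea can be sketched as follows. This is an illustrative assumption about the approach, not litresearch's internal schema: the field names (`doi`, `title`, `source`) and the DOI-or-normalized-title key are hypothetical.

```python
# Sketch: merge candidates from multiple sources, keyed by DOI when
# present, otherwise by normalized title, while recording which
# sources each paper came from (provenance).

def dedupe_candidates(candidates):
    merged = {}
    for paper in candidates:
        key = paper.get("doi") or paper["title"].casefold().strip()
        if key in merged:
            merged[key]["sources"].add(paper["source"])  # track provenance
        else:
            entry = dict(paper)
            entry["sources"] = {paper["source"]}
            merged[key] = entry
    return list(merged.values())
```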

Citation graph expansion

  • Optional expansion stage adds highly cross-referenced papers after ranking.
  • Configure with expand_citations and min_cross_refs.
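The `min_cross_refs` threshold suggests a simple counting scheme: a paper referenced by at least that many of the already-ranked papers becomes a new candidate. A minimal sketch under that assumption (the data shape is hypothetical):

```python
# Sketch: count how many ranked papers cite each reference, then keep
# references that cross the min_cross_refs threshold and are not
# already in the result set.
from collections import Counter

def expand_by_cross_refs(ranked_refs, min_cross_refs=3, already_have=()):
    """ranked_refs: one list of reference IDs per ranked paper."""
    counts = Counter(ref for refs in ranked_refs for ref in set(refs))
    have = set(already_have)
    return [ref for ref, n in counts.items()
            if n >= min_cross_refs and ref not in have]
```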

Zotero export

  • Export top papers to Zotero user or group libraries.
  • Supports collection assignment, tags, and PDF attachment when available.

PDF injection

  • Bring your own PDFs with --inject-pdfs or inject_pdf_dir.
  • Match files by {paper_id}.pdf or DOI-based filenames.
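The matching rule can be sketched like this. The `{paper_id}.pdf` convention comes from the docs above; the DOI sanitization shown here (replacing `/` with `_`) is an assumption, and litresearch's actual DOI-based filename scheme may differ.

```python
# Sketch: look up an injected PDF for a paper, first by paper ID,
# then by a DOI-derived filename (sanitization rule assumed).
from pathlib import Path

def find_injected_pdf(pdf_dir, paper_id, doi=None):
    candidates = [f"{paper_id}.pdf"]
    if doi:
        candidates.append(doi.replace("/", "_") + ".pdf")
    for name in candidates:
        path = Path(pdf_dir) / name
        if path.is_file():
            return path
    return None
```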

Run metrics and telemetry

  • Every run writes metrics.json with stage timings and aggregate counts.
  • Includes source breakdown plus PDF availability and usage metrics.

Resume behavior improvements

  • Improved resume reliability from state.json checkpoints.
  • Safer state persistence with atomic writes.
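The atomic-write pattern referenced here is standard: write to a temporary file in the same directory, then rename over the target, so a crash never leaves a half-written `state.json`. A minimal sketch (litresearch's actual implementation may differ):

```python
# Sketch: atomic JSON persistence via write-temp-then-rename.
import json
import os
import tempfile

def save_state_atomic(state, path):
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # ensure data hits disk before rename
        os.replace(tmp, path)     # atomic on POSIX and Windows
    except BaseException:
        os.unlink(tmp)
        raise
```

Writing the temp file in the same directory matters: `os.replace` is only atomic within a single filesystem.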

Token-budgeted PDF extraction

  • Configurable extraction strategy supports token budgets for LLM context limits.
  • Falls back gracefully when PDFs are unavailable or extraction is limited.
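Given the `pdf_first_pages`, `pdf_last_pages`, and `pdf_token_budget` options below, the strategy can be sketched as: keep the opening and closing pages, then trim to the budget. The 4-characters-per-token estimate is a rough heuristic of this sketch, not litresearch's actual tokenizer.

```python
# Sketch: budget-capped extraction keeping first/last pages, then
# trimming to an approximate token budget.

def extract_within_budget(pages, first_pages=4, last_pages=2,
                          token_budget=4000):
    if len(pages) <= first_pages + last_pages:
        selected = pages
    else:
        selected = pages[:first_pages] + pages[-last_pages:]
    text = "\n\n".join(selected)
    max_chars = token_budget * 4  # ~4 chars per token (heuristic)
    return text[:max_chars]
```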

Installation

uv pip install litresearch

For local development:

uv sync
uv run nox

Quickstart

  1. Set an LLM API key for a LiteLLM-supported provider:

export OPENAI_API_KEY=your_key_here
# or
export ANTHROPIC_API_KEY=your_key_here

  2. Optionally set a Semantic Scholar key for better rate limits:

export S2_API_KEY=your_key_here

  3. Copy the example config and tune defaults:

cp litresearch.toml.example litresearch.toml

  4. Run the pipeline:

litresearch run "What is the impact of large language models on software engineering?"

  5. Inspect the output directory:

output/
  report.md
  paper_analyses.md
  references.bib
  references.ris
  data.json
  metrics.json
  papers/
  state.json

Usage

Run one or more research questions:

litresearch run \
  "How do large language models affect developer productivity?" \
  "What evidence exists about code quality impacts?"

Override settings from the CLI:

litresearch run \
  "How do LLMs affect software engineering?" \
  --model anthropic/claude-sonnet-4-20250514 \
  --top-n 10 \
  --threshold 50 \
  --output-dir runs/llm-se \
  --overwrite

Resume an interrupted run:

litresearch resume output/state.json

Inject local PDFs for papers you already have:

litresearch run "Your research question" --inject-pdfs /path/to/pdfs

Inspect current configuration:

litresearch config

Configuration

Settings are resolved in this precedence order, with earlier sources overriding later ones:

  1. CLI flags
  2. Environment variables
  3. litresearch.toml
  4. Built-in defaults
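The precedence chain amounts to "first defined source wins". A minimal sketch of that resolution logic (the function and its signature are illustrative, not litresearch's API):

```python
# Sketch: resolve one setting from CLI flag, environment variable,
# TOML file value, then built-in default, in that order.
import os

def resolve_setting(name, cli_value=None, toml_values=None, default=None):
    if cli_value is not None:
        return cli_value
    env_value = os.environ.get(name.upper())
    if env_value is not None:
        return env_value
    if toml_values and name in toml_values:
        return toml_values[name]
    return default
```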

Supported environment variables:

  • OPENAI_API_KEY
  • ANTHROPIC_API_KEY
  • OPENROUTER_API_KEY
  • S2_API_KEY
  • ZOTERO_API_KEY
  • S2_TIMEOUT
  • S2_REQUESTS_PER_SECOND
  • SCREENING_SELECTION_MODE
  • SCREENING_TOP_PERCENT
  • SCREENING_TOP_K
  • SCREENING_THRESHOLD

Start from the full example config:

cp litresearch.toml.example litresearch.toml

Key options include:

default_model = "openai/gpt-4o-mini"
llm_timeout = 120
max_retries = 3
retry_base_delay = 1.0
discovery_sources = ["s2"]
screening_selection_mode = "top_percent"
screening_top_percent = 0.3
screening_threshold = 60
top_n = 20
max_results_per_query = 20
expand_citations = false
min_cross_refs = 3
zotero_export = false
s2_timeout = 10
s2_requests_per_second = 1.0
pdf_extraction_mode = "budget"
pdf_token_budget = 4000
pdf_first_pages = 4
pdf_last_pages = 2
abstract_fallback = true
# inject_pdf_dir = "/path/to/pdfs"
output_dir = "output"

Screening selection modes:

  • top_percent (default): deep-analyze the top share of screened papers globally
  • top_k: deep-analyze the top K screened papers globally
  • threshold: deep-analyze papers scoring >= screening_threshold
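The three modes above can be sketched as one selection function over screened `(paper_id, score)` pairs. The defaults mirror the config values shown earlier; the function itself is an illustration, not litresearch's internal API.

```python
# Sketch: pick papers for deep analysis under the three selection modes.
import math

def select_for_analysis(scored, mode="top_percent",
                        top_percent=0.3, top_k=20, threshold=60):
    ranked = sorted(scored, key=lambda p: p[1], reverse=True)
    if mode == "top_percent":
        n = max(1, math.ceil(len(ranked) * top_percent))
        return ranked[:n]
    if mode == "top_k":
        return ranked[:top_k]
    if mode == "threshold":
        return [p for p in ranked if p[1] >= threshold]
    raise ValueError(f"unknown selection mode: {mode}")
```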

Semantic Scholar tuning:

  • s2_timeout: request timeout in seconds
  • s2_requests_per_second: global request rate cap across S2 endpoints
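A global requests-per-second cap like `s2_requests_per_second` is typically a shared minimum-interval limiter. A minimal thread-safe sketch of that pattern (not litresearch's actual implementation):

```python
# Sketch: global rate limiter enforcing a minimum interval between
# requests, shared across all endpoints that call wait().
import threading
import time

class RateLimiter:
    def __init__(self, requests_per_second=1.0):
        self.min_interval = 1.0 / requests_per_second
        self.lock = threading.Lock()
        self.last = 0.0

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        with self.lock:
            now = time.monotonic()
            delay = self.last + self.min_interval - now
            if delay > 0:
                time.sleep(delay)
            self.last = time.monotonic()
```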

Discovery tuning:

  • discovery_sources: choose s2, openalex, or both
  • openalex_email: optional email for OpenAlex polite pool rate limits

Citation expansion tuning:

  • expand_citations: enable or disable expansion stage
  • min_cross_refs: minimum citation graph references to include

Zotero export tuning:

  • zotero_export: enable export integration
  • zotero_library_id, zotero_library_type, zotero_collection_key, zotero_tag

Output Files

  • report.md: main literature review report with research questions, search summary, top papers, and synthesis
  • paper_analyses.md: detailed per-paper analysis for all analyzed papers
  • references.bib: BibTeX for ranked papers when citation data is available
  • references.ris: RIS export for citation managers
  • data.json: machine-readable export of the pipeline state
  • metrics.json: per-stage timings and aggregate run metrics
  • papers/: downloaded open-access PDFs for ranked papers
  • state.json: resumable pipeline checkpoint

Development

uv run nox
uv run litresearch --help

Status

v1.0.0 delivers a production-ready core workflow for automated literature research, including multi-source discovery, ranking, export, and operational telemetry.

