
CLI tool for automated literature research workflows.

Project description

litresearch


CLI tool that automates literature research from research questions to curated, ranked, and exported paper sets with structured reports.

Overview

  • Generates search facets and academic queries from one or more research questions
  • Searches Semantic Scholar for candidate papers
  • Screens and analyzes papers with an LLM through LiteLLM
  • Ranks papers and exports reports, references, JSON data, and PDFs
  • Supports resume via a saved state.json

Installation

uv pip install litresearch

For local development:

uv sync
uv run nox

Quickstart

  1. Set an LLM API key for a LiteLLM-supported provider:

     export OPENAI_API_KEY=your_key_here
     # or
     export ANTHROPIC_API_KEY=your_key_here

  2. Optionally set a Semantic Scholar key for better rate limits:

     export S2_API_KEY=your_key_here

  3. Copy the example config and tune defaults:

     cp litresearch.toml.example litresearch.toml

  4. Run the pipeline:

     litresearch run "What is the impact of large language models on software engineering?"

  5. Inspect the output directory:
output/
  report.md
  paper_analyses.md
  references.bib
  references.ris
  data.json
  papers/
  state.json

Usage

Run one or more research questions:

litresearch run \
  "How do large language models affect developer productivity?" \
  "What evidence exists about code quality impacts?"

Override settings from the CLI:

litresearch run \
  "How do LLMs affect software engineering?" \
  --model anthropic/claude-sonnet-4-20250514 \
  --top-n 10 \
  --threshold 50 \
  --output-dir runs/llm-se \
  --overwrite

Resume an interrupted run:

litresearch resume output/state.json
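The exact checkpoint schema is not documented here, but the resume mechanism can be sketched as plain JSON persistence. This is an illustrative sketch only; `save_state`, `load_state`, and the `"stage"`/`"completed_queries"` keys are hypothetical, not litresearch's actual format:

```python
import json
import tempfile
from pathlib import Path

def save_state(path, state):
    """Persist a pipeline checkpoint as JSON (illustrative sketch)."""
    Path(path).write_text(json.dumps(state, indent=2))

def load_state(path):
    """Reload a checkpoint so a run can pick up after its last completed stage."""
    return json.loads(Path(path).read_text())

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "state.json"
    save_state(path, {"stage": "screening", "completed_queries": 4})
    print(load_state(path)["stage"])  # screening
```

Because the checkpoint is written after each stage, resuming skips work that already completed rather than re-querying or re-screening from scratch.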

Inspect current configuration:

litresearch config

Configuration

Settings load in this order:

  1. CLI flags
  2. Environment variables
  3. litresearch.toml
  4. Built-in defaults
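The precedence above amounts to a layered lookup, which can be sketched as follows (function and variable names are hypothetical, not litresearch's internal API):

```python
def resolve_setting(name, cli_args, env, toml_config, defaults):
    """Return the highest-priority value for `name`:
    CLI flag > environment variable > litresearch.toml > built-in default."""
    if name in cli_args:
        return cli_args[name]
    if name.upper() in env:
        return env[name.upper()]  # env values arrive as strings
    if name in toml_config:
        return toml_config[name]
    return defaults[name]

defaults = {"top_n": 20}
toml_config = {"top_n": 15}
env = {"TOP_N": "10"}
cli_args = {}

# No CLI flag given, so the environment variable wins over the TOML file.
print(resolve_setting("top_n", cli_args, env, toml_config, defaults))  # "10"
```

Note that environment values are strings and would still need type coercion before use, which is one reason the TOML file is the more convenient place for numeric defaults.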

Supported environment variables:

  • OPENAI_API_KEY
  • ANTHROPIC_API_KEY
  • OPENROUTER_API_KEY
  • S2_API_KEY
  • S2_TIMEOUT
  • S2_REQUESTS_PER_SECOND
  • SCREENING_SELECTION_MODE
  • SCREENING_TOP_PERCENT
  • SCREENING_TOP_K
  • SCREENING_THRESHOLD

Example litresearch.toml:

default_model = "openai/gpt-4o-mini"
screening_selection_mode = "top_percent"
screening_top_percent = 0.3
screening_threshold = 60
top_n = 20
max_results_per_query = 20
s2_timeout = 10
s2_requests_per_second = 1.0
pdf_first_pages = 4
pdf_last_pages = 2
output_dir = "output"

Screening selection modes:

  • top_percent (default): deep-analyze the top screening_top_percent fraction of screened papers, pooled across all queries
  • top_k: deep-analyze the top K screened papers, pooled across all queries
  • threshold: deep-analyze every paper scoring >= screening_threshold
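The three modes can be sketched as selections over a globally ranked list of (paper, score) pairs. This is an illustrative sketch under assumed names; `select_for_analysis` is not litresearch's actual function:

```python
import math

def select_for_analysis(scored, mode="top_percent",
                        top_percent=0.3, top_k=10, threshold=60):
    """Pick which screened papers get deep analysis.

    `scored` is a list of (paper_id, score) pairs pooled across all queries.
    """
    ranked = sorted(scored, key=lambda p: p[1], reverse=True)
    if mode == "top_percent":
        k = math.ceil(len(ranked) * top_percent)  # round up so at least one survives
        return ranked[:k]
    if mode == "top_k":
        return ranked[:top_k]
    if mode == "threshold":
        return [p for p in ranked if p[1] >= threshold]
    raise ValueError(f"unknown screening_selection_mode: {mode}")

papers = [("a", 90), ("b", 70), ("c", 55), ("d", 40)]
print(select_for_analysis(papers, "top_percent", top_percent=0.5))
# [('a', 90), ('b', 70)]
```

In practice, top_percent and top_k bound LLM analysis cost regardless of how many papers the searches return, while threshold bounds it by quality instead, so the number of analyzed papers can vary widely between runs.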

Semantic Scholar tuning:

  • s2_timeout: request timeout in seconds
  • s2_requests_per_second: global request rate cap across S2 endpoints
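A global requests-per-second cap is typically enforced by spacing calls a minimum interval apart. A minimal sketch of that idea (the `RateLimiter` class is hypothetical, not litresearch's implementation):

```python
import time

class RateLimiter:
    """Allow at most `rps` calls per second, shared by every caller
    that holds this instance (illustrative sketch)."""

    def __init__(self, rps):
        self.min_interval = 1.0 / rps
        self.last_call = 0.0

    def wait(self):
        """Block until at least `min_interval` has passed since the last call."""
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()

limiter = RateLimiter(rps=5.0)
start = time.monotonic()
for _ in range(3):
    limiter.wait()  # calls after the first are spaced at least 0.2 s apart
print(f"elapsed: {time.monotonic() - start:.2f} s")
```

Sharing one limiter across all S2 endpoints, rather than one per endpoint, is what makes the cap "global": mixing search and detail requests still cannot exceed the configured rate.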

Output Files

  • report.md: main literature review report with research questions, search summary, top papers, and synthesis
  • paper_analyses.md: detailed per-paper analysis for all analyzed papers
  • references.bib: BibTeX for ranked papers when citation data is available
  • references.ris: RIS export for citation managers
  • data.json: machine-readable export of the pipeline state
  • papers/: downloaded open-access PDFs for ranked papers
  • state.json: resumable pipeline checkpoint
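RIS is a simple tagged line format, so the references.ris export amounts to emitting tag/value pairs per record. A minimal sketch for one journal article (the `to_ris` helper and its field choices are assumptions for illustration, not litresearch's actual writer):

```python
def to_ris(paper):
    """Render one journal-article record in RIS format (illustrative sketch)."""
    lines = ["TY  - JOUR"]            # record type must come first
    for author in paper["authors"]:
        lines.append(f"AU  - {author}")  # one AU line per author
    lines.append(f"TI  - {paper['title']}")
    lines.append(f"PY  - {paper['year']}")
    lines.append("ER  - ")            # end-of-record marker
    return "\n".join(lines)

print(to_ris({"title": "Example Paper", "year": 2024, "authors": ["Doe, J."]}))
```

Citation managers such as Zotero and EndNote import this format directly, which is why both BibTeX and RIS are exported.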

Development

uv run nox
uv run litresearch --help

Status

This is an MVP-oriented proof of concept intended to answer one question clearly: is the end-to-end literature research workflow useful enough to keep investing in?

Download files

Download the file for your platform.

Source Distribution

litresearch-0.2.0.tar.gz (156.9 kB)

Uploaded Source

Built Distribution


litresearch-0.2.0-py3-none-any.whl (21.5 kB)

Uploaded Python 3

File details

Details for the file litresearch-0.2.0.tar.gz.

File metadata

  • Download URL: litresearch-0.2.0.tar.gz
  • Upload date:
  • Size: 156.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for litresearch-0.2.0.tar.gz
Algorithm Hash digest
SHA256 5ebb215a3fcfe471de21f7a0beaf87255aef3dabd5da769b843569830acef2df
MD5 173cb9601f76e7c8451ed60dc1f74378
BLAKE2b-256 b5695609c063646acdba0f92613eb2ffb51cecf51eda8d43baba5a996ecc155c


Provenance

The following attestation bundles were made for litresearch-0.2.0.tar.gz:

Publisher: release.yml on spignotti/litresearch

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file litresearch-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: litresearch-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 21.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for litresearch-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2e7b04f5d852d7b12b6bd2aed2041005e0cf8c5ae4b8c2cbd1cf6dab00a1bded
MD5 d232b9b28a13faa97d89914f31ff9ed0
BLAKE2b-256 2839ed4390e235b222dfb2eb9336161b989172efa911614ed4c7650b336dd1f7


Provenance

The following attestation bundles were made for litresearch-0.2.0-py3-none-any.whl:

Publisher: release.yml on spignotti/litresearch

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
