Automated literature search, screening, and prioritization pipeline powered by LLMs.

litscout

Requires Python 3.11+.

What It Does

litscout is an automated literature discovery and screening pipeline for academic researchers. It uses an LLM to generate queries and screen papers, and automates the steps in between:

Generate smart search queries based on your research angle
Search academic databases (OpenAlex, Semantic Scholar, arXiv, PubMed, CORE) for candidate papers
Download PDFs (with Elsevier ScienceDirect API fallback for paywalled papers)
Screen papers using an LLM for relevance to your research angle
Keep medium/high relevance papers, discard the rest
Repeat until sufficient coverage is achieved
Generate a final Markdown report summarizing everything found
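
The loop described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not litscout's actual implementation: `Paper`, `run_pipeline`, and the three callables are hypothetical names, and the PDF download and report steps are left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Paper:
    """Hypothetical stand-in for a candidate paper."""
    title: str
    relevance: str = "unscreened"  # set to "high", "medium", or "low" by screening


def run_pipeline(
    generate_queries: Callable[[int], List[str]],
    search: Callable[[str], List[Paper]],
    screen: Callable[[Paper], str],
    target_papers: int = 20,
    max_iterations: int = 0,  # 0 = unlimited, mirroring input/settings.yaml
) -> List[Paper]:
    """Search and screen repeatedly until enough papers are kept."""
    kept: List[Paper] = []
    seen: set = set()
    iteration = 0
    while len(kept) < target_papers:
        if max_iterations and iteration >= max_iterations:
            break
        iteration += 1
        new_candidates = 0
        for query in generate_queries(iteration):
            for paper in search(query):
                if paper.title in seen:
                    continue  # skip papers already screened in an earlier round
                seen.add(paper.title)
                new_candidates += 1
                paper.relevance = screen(paper)
                if paper.relevance in ("high", "medium"):
                    kept.append(paper)  # low-relevance papers are discarded
        if new_candidates == 0:
            break  # nothing new was found, so further rounds would not help
    return kept
```

In the real tool the query generation and screening callables would be backed by LLM calls; here they are plain functions so the loop can be exercised offline.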

Installation

From PyPI

pip install litscout

From Source

git clone https://github.com/your-username/litscout.git
cd litscout
pip install -e .

Quick Start

1. Initialize a Project

litscout init

This scaffolds a new litscout project in your current directory with all the necessary config files and directories.

2. Configure API Keys

# Edit .env with your API keys
# At minimum: LLM_BASE_URL, LLM_API_KEY, LLM_MODEL

3. Configure Sources

# Edit input/settings.yaml to enable your sources
# At least one source with role 'search_and_pdf' must be enabled

4. Write Research Angle

# Edit input/research.md with your research focus

5. Run the Pipeline

litscout run

6. Check Results

ls output/kept_papers/   # Downloaded PDFs
ls output/reports/       # Final Markdown reports
cat output/manifest.json # Full log of all papers
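
The same check can be done from Python. The sketch below only relies on the paths listed above; the `.pdf` and `.md` file extensions are assumptions, and because the manifest schema is not documented here, the function reports nothing beyond its top-level JSON type. `summarize_output` is a hypothetical helper, not part of litscout.

```python
import json
from pathlib import Path
from typing import List


def _names(directory: Path, pattern: str) -> List[str]:
    """Return sorted file names matching pattern, or [] if the directory is missing."""
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob(pattern))


def summarize_output(output_dir: str = "output") -> dict:
    """Collect a small summary of a litscout output directory."""
    root = Path(output_dir)
    manifest_path = root / "manifest.json"
    manifest_type = None
    if manifest_path.is_file():
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        manifest_type = type(manifest).__name__  # e.g. "dict" or "list"
    return {
        "pdfs": _names(root / "kept_papers", "*.pdf"),   # assumed extension
        "reports": _names(root / "reports", "*.md"),     # assumed extension
        "manifest_type": manifest_type,
    }
```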

CLI Commands

Command	Description
`litscout init`	Scaffold a new litscout project directory
`litscout run`	Run the literature search and screening pipeline
`litscout report`	Regenerate the markdown report from existing manifest.json
`litscout clean`	Clean output and temp directories
`litscout status`	Show quick summary of current project state
`litscout --help`	Show help message and exit

`litscout init`

Scaffolds a new litscout project in the current directory (or specified path):

litscout init              # Use current directory
litscout init ./my-project # Use specified directory

Creates:

input/research.md — Your research angle
input/settings.yaml — Source and target settings
.env — API keys
config.yaml — Advanced technical settings
output/, temp/ — Output and temp directories

`litscout run`

Runs the full literature search and screening pipeline:

litscout run                    # Use default config.yaml
litscout run --config my.yaml   # Use custom config path

`litscout report`

Regenerates the markdown report from an existing manifest.json without re-running the pipeline:

litscout report
litscout report --config my.yaml

`litscout clean`

Cleans output and temp directories:

litscout clean           # Ask for confirmation
litscout clean --confirm # Skip confirmation

`litscout status`

Shows a quick summary of the current project state:

litscout status
litscout status --config my.yaml

Output includes:

Active sources and their roles
Target papers
Iterations run
Papers kept (high/medium breakdown)
Papers discarded
Last updated timestamp

Configuring Search Sources

Edit input/settings.yaml to enable the sources you have access to:

Source	Role	Key Required?	Coverage	Best For
OpenAlex	Search + PDF	No (email optional)	250M+ works, all disciplines	General academic research
Semantic Scholar	Search + PDF	No (optional for speed)	200M+ papers, AI-ranked	CS, biomedical, broad coverage
Elsevier	PDF only	Yes (institutional)	Paywalled Elsevier journals	University-subscribed content
arXiv	Search + PDF	No	2.4M+ preprints	Physics, math, CS, quantitative biology
PubMed	Search + PDF	No (optional for speed)	36M+ citations	Biomedical and life sciences
CORE	Search + PDF	Yes (free)	300M+ metadata, 40M+ full texts	Open access aggregation

Sources enabled by default: OpenAlex (`search_and_pdf`) and Elsevier (`pdf_only`).

Configuration Files

`.env` (API Keys)

Required:

LLM_BASE_URL — OpenAI-compatible endpoint URL
LLM_API_KEY — Your LLM API key
LLM_MODEL — Model name (e.g., qwen3.6-plus)

Optional (enable sources as needed):

OPENALEX_EMAIL — For faster OpenAlex rate limits
S2_API_KEY — Semantic Scholar (optional, for guaranteed 1 req/sec)
ELSEVIER_API_KEY — Elsevier ScienceDirect (required for pdf_only role)
ELSEVIER_INST_TOKEN — Elsevier institutional token (for off-campus access)
PUBMED_API_KEY — PubMed (optional, for 10 req/sec vs 3 req/sec)
CORE_API_KEY — CORE (required if enabled)
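
Before a run, it can help to confirm that the three required keys are set. A minimal sketch, assuming the `.env` values have already been loaded into the process environment (how litscout itself loads them is not specified here); `missing_required_keys` is a hypothetical helper.

```python
import os
from typing import List, Mapping, Optional

# The three keys listed as required above.
REQUIRED_KEYS = ("LLM_BASE_URL", "LLM_API_KEY", "LLM_MODEL")


def missing_required_keys(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required keys that are unset or blank."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
```

Passing a plain dict instead of `os.environ` makes the check easy to run without touching the real environment.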

`input/settings.yaml` (User Settings)

target_papers: 20          # Stop when this many papers are kept
max_iterations: 0          # 0 = unlimited
auto_stop: false           # true = stop automatically; false = ask user

sources:
  openalex:
    enabled: true
    role: search_and_pdf
  semantic_scholar:
    enabled: false
    role: search_and_pdf
  elsevier:
    enabled: true
    role: pdf_only
  arxiv:
    enabled: false
    role: search_and_pdf
  pubmed:
    enabled: false
    role: search_and_pdf
  core:
    enabled: false
    role: search_and_pdf
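
The rule that at least one enabled source must have the `search_and_pdf` role can be checked with a few lines of Python. This sketch works on a dict shaped like the YAML above (shortened to three sources); the function names and the extra numeric checks are illustrative assumptions, not litscout's own validation.

```python
from typing import List

# Mirrors part of the settings.yaml example above, as a plain dict.
SETTINGS = {
    "target_papers": 20,
    "max_iterations": 0,
    "auto_stop": False,
    "sources": {
        "openalex": {"enabled": True, "role": "search_and_pdf"},
        "semantic_scholar": {"enabled": False, "role": "search_and_pdf"},
        "elsevier": {"enabled": True, "role": "pdf_only"},
    },
}


def searchable_sources(settings: dict) -> List[str]:
    """Names of enabled sources that can run searches."""
    return [
        name
        for name, cfg in settings.get("sources", {}).items()
        if cfg.get("enabled") and cfg.get("role") == "search_and_pdf"
    ]


def validate_settings(settings: dict) -> List[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    errors = []
    if not searchable_sources(settings):
        errors.append("enable at least one source with role 'search_and_pdf'")
    if settings.get("target_papers", 0) < 1:
        errors.append("target_papers must be at least 1")
    if settings.get("max_iterations", 0) < 0:
        errors.append("max_iterations must be 0 (unlimited) or positive")
    return errors
```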

`config.yaml` (Technical Settings — rarely needs editing)

Setting	Description	Default
`api.max_tokens`	Max tokens for LLM responses	16384
`api.temperature`	LLM temperature (0.0-1.0)	0.3
`api.max_concurrent_requests`	Concurrent LLM requests	3
`search.queries_per_iteration`	Queries per round	5
`search.results_per_query`	Max results per query	20
`search.year_range`	Papers from last N years	5
`download.concurrency`	Max simultaneous downloads	5
`download.timeout`	Download timeout (seconds)	60
`download.max_pdf_size_mb`	Skip PDFs larger than this	50
`screening.batch_size`	Papers per LLM screening call	10
`screening.max_tokens_per_batch`	Token budget per batch	200000
`sufficiency.min_high_relevance`	Min high-relevance papers	5
`sufficiency.min_medium_relevance`	Min medium-relevance papers	8
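
To make two of these settings concrete, here is a sketch of how `screening.batch_size` and the `sufficiency.*` thresholds might be applied. It assumes both thresholds must be met at the same time, which the table does not state; `batches` and `is_sufficient` are hypothetical names.

```python
from collections import Counter
from typing import List, Sequence


def batches(items: Sequence, batch_size: int = 10) -> List[Sequence]:
    """Split items into consecutive chunks of at most batch_size."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def is_sufficient(relevances: Sequence[str], min_high: int = 5, min_medium: int = 8) -> bool:
    """True when both relevance thresholds are met (assumed AND semantics)."""
    counts = Counter(relevances)
    return counts["high"] >= min_high and counts["medium"] >= min_medium
```

With the defaults, 25 candidate papers would be screened in three LLM calls (10, 10, and 5 papers).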

Adding New Sources

litscout uses a plugin-based source architecture. To add a new source:

Create a new file in litscout/sources/ (e.g., my_source.py)
Subclass ScholarSource from litscout.sources.base
Implement the required methods:
- name() — Return the source identifier
- search(query, limit, year_min, credentials) — Search for papers
- fetch_pdf(paper, credentials, session) — Fetch PDF content
Register it in litscout/sources/__init__.py

Example:

from litscout.sources.base import PaperMetadata, ScholarSource

class MySource(ScholarSource):
    @classmethod
    def name(cls) -> str:
        return "my_source"

    async def search(self, query, limit, year_min, credentials):
        # Implement search logic
        return []

    async def fetch_pdf(self, paper, credentials, session):
        # Implement PDF fetch logic
        return None
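
To try the interface without installing litscout, the following self-contained sketch mirrors the three methods with a stand-in base class and an in-memory source. `SourceStandIn` and `InMemorySource` are hypothetical, and papers are plain dicts here because the fields of `PaperMetadata` are not described in this README.

```python
import asyncio
from abc import ABC, abstractmethod


class SourceStandIn(ABC):
    """Stand-in with the same three methods as the interface listed above."""

    @classmethod
    @abstractmethod
    def name(cls) -> str: ...

    @abstractmethod
    async def search(self, query, limit, year_min, credentials): ...

    @abstractmethod
    async def fetch_pdf(self, paper, credentials, session): ...


class InMemorySource(SourceStandIn):
    """Searches a fixed list of records instead of calling a remote API."""

    def __init__(self, records):
        self._records = list(records)

    @classmethod
    def name(cls) -> str:
        return "in_memory"

    async def search(self, query, limit, year_min, credentials):
        needle = query.lower()
        hits = [
            r for r in self._records
            if needle in r["title"].lower() and r["year"] >= year_min
        ]
        return hits[:limit]  # respect the caller's result cap

    async def fetch_pdf(self, paper, credentials, session):
        return None  # this toy source has no PDFs to offer


if __name__ == "__main__":
    source = InMemorySource([{"title": "Graph neural networks", "year": 2023}])
    print(asyncio.run(source.search("graph", limit=5, year_min=2020, credentials={})))
```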

Output Format

The final report is a Markdown file with:

Header: Generation timestamp, iteration count, paper statistics
Research Angle: Your original research prompt
Summary Table: All kept papers with relevance and brief descriptions
Detailed Evaluations: Full analysis for each kept paper
Coverage Analysis: Gaps identified by the LLM
Search Queries Used: All queries across all iterations

Graceful Shutdown

Press Ctrl+C to stop the pipeline gracefully. Before exiting, it will:

Finish the current iteration
Save the manifest
Generate a final report
Clean up temporary files
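
One common way to get this behavior in Python is to have the SIGINT handler set a flag that the loop checks between iterations. The sketch below shows that pattern under stated assumptions; it is not taken from litscout's source, and `StopRequest`, `run_iterations`, and `install_sigint_handler` are hypothetical names.

```python
import signal


class StopRequest:
    """Flag set by the signal handler and polled by the main loop."""

    def __init__(self):
        self.requested = False

    def handle(self, signum=None, frame=None):
        self.requested = True  # do not raise; let the current iteration finish


def install_sigint_handler(stop: StopRequest):
    """Route Ctrl+C to the flag. Must be called from the main thread."""
    return signal.signal(signal.SIGINT, stop.handle)


def run_iterations(run_one, stop: StopRequest, max_iterations: int, on_finish) -> int:
    """Run iterations until the limit is reached or a stop is requested."""
    completed = 0
    try:
        while completed < max_iterations and not stop.requested:
            run_one(completed)  # a stop requested mid-iteration takes effect afterwards
            completed += 1
    finally:
        on_finish(completed)  # place to save the manifest, write the report, clean up
    return completed
```

Because the handler only sets a flag, an iteration that is already running completes before the loop exits and the cleanup callback runs.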

Note on Language Support

Currently supports English-language papers only. Japanese language support is planned for a future release.

License

MIT License — see LICENSE for details.

Acknowledgments

Semantic Scholar (Allen Institute for AI) — Free academic paper search API
OpenAlex — Open bibliographic database
Elsevier — ScienceDirect API for paywalled paper access
arXiv — Open-access preprint repository
PubMed / NCBI — Biomedical literature database
CORE — Open access aggregator

Contributing

Contributions are welcome! See CONTRIBUTING.md for details.

Project details

This version

0.2.0

Apr 15, 2026

File details

Details for the file litscout-0.2.0.tar.gz.

File metadata

Download URL: litscout-0.2.0.tar.gz
Upload date: Apr 15, 2026
Size: 54.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for litscout-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`f8e72acc3d543cb6e49ab61e45d06c80ba31959707410ca38237bf26b8959493`
MD5	`80337274de7e919f932884616375f84e`
BLAKE2b-256	`aa227003d5c97c1d79557818cc04369635c25665d63dfc1dd5aa1ffa4ddf4010`

File details

Details for the file litscout-0.2.0-py3-none-any.whl.

File metadata

Download URL: litscout-0.2.0-py3-none-any.whl
Upload date: Apr 15, 2026
Size: 61.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for litscout-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a80f830eaa107f48a0d17a47c1bfd14ca7691641a092226d3df5a5061d08b2c3`
MD5	`e1fbf96b472dbd78ac82801b291825dc`
BLAKE2b-256	`a34ea82fc36d432a64eb58b8733bd4decb3219bd6e8b21157bfe4cf77bdd73e1`
