OpenAlex-based deep research agent with an OpenAI-compatible LLM interface.

Open Deep Research

Open Deep Research is a small, practical repo for building a scholarly "deep research" workflow on top of OpenAlex and an OpenAI-compatible LLM.

It does four things:

plans search queries from a research question
searches and expands papers through OpenAlex references and citations
fetches open-access text when available
writes a Markdown literature review with explicit paper citations

The project is intentionally simple enough to teach in an Information Retrieval course and strong enough to serve as a working baseline for assignments.

Why this stack

OpenAlex is the discovery graph and metadata backbone.
OpenAI-compatible chat models handle planning, reranking, and synthesis.
Local scoring and trace logging keep the retrieval decisions inspectable.
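The README does not spell out the scoring function, so the following is only a hypothetical sketch of what "inspectable" local scoring could look like: each paper gets a score and a small trace record explaining it. The weighting (title term overlap plus a damped citation bonus) and the function names are assumptions for illustration, not the code in research.py.

```python
import math
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_paper(question, paper):
    """Return (score, trace) for one paper dict.

    Hypothetical heuristic: share of question terms found in the title,
    plus a log-damped citation bonus. The repo's real scoring may differ.
    """
    q_terms = set(tokenize(question))
    title_terms = set(tokenize(paper.get("title") or ""))
    overlap = len(q_terms & title_terms) / max(len(q_terms), 1)
    citation_bonus = math.log1p(paper.get("cited_by_count") or 0) / 10.0
    score = overlap + citation_bonus
    trace = {
        "id": paper.get("id"),
        "overlap": round(overlap, 3),
        "citation_bonus": round(citation_bonus, 3),
        "score": round(score, 3),
    }
    return score, trace


def rank_papers(question, papers):
    """Sort papers by descending score and keep one trace entry per paper."""
    scored = [score_paper(question, p) for p in papers]
    order = sorted(range(len(papers)), key=lambda i: scored[i][0], reverse=True)
    ranked = [papers[i] for i in order]
    traces = [scored[i][1] for i in order]
    return ranked, traces
```

Because every component of the score is written to the trace, a student can see why one paper outranked another without rerunning the pipeline.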

Repository layout

open_deep_research/
  src/open_deep_research/
    api.py
    cli.py
    config.py
    fetchers.py
    llm.py
    models.py
    openalex.py
    planner.py
    reporting.py
    research.py
  tests/
  .env.example
  pyproject.toml

Quickstart

Create a virtual environment.
Install the package.
Set your API keys.
Run a research job.

cd open_deep_research  # path to your local clone of the repository
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
cp .env.example .env
open-deep-research research "How do retrieval-augmented generation systems reduce hallucinations?" --output-dir outputs/rag

If you also want PDF extraction support:

pip install -e '.[pdf]'

Install directly from GitHub without cloning:

pip install "open-deep-research-cli @ git+https://github.com/BirgerMoell/open-deep-research.git"

Install from PyPI:

pip install open-deep-research-cli

Environment variables

OPENALEX_MAILTO: recommended for OpenAlex polite-pool access
OPENALEX_API_KEY: optional OpenAlex premium key
OPENAI_BASE_URL: defaults to https://api.openai.com/v1
OPENAI_API_KEY: required for hosted OpenAI, often omitted for local OpenAI-compatible servers
OPENAI_MODEL: defaults to gpt-4o-mini
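As a rough illustration of how these variables and their documented defaults fit together, here is a minimal sketch of a settings loader. The Settings class and load_settings function are hypothetical names chosen for this example; the package's own config.py may be organised differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openalex_mailto: Optional[str]
    openalex_api_key: Optional[str]
    openai_base_url: str
    openai_api_key: Optional[str]
    openai_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, applying the documented defaults.

    Empty strings are treated like unset variables.
    """
    return Settings(
        openalex_mailto=env.get("OPENALEX_MAILTO") or None,
        openalex_api_key=env.get("OPENALEX_API_KEY") or None,
        openai_base_url=env.get("OPENAI_BASE_URL") or "https://api.openai.com/v1",
        # May legitimately stay None for local OpenAI-compatible servers.
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        openai_model=env.get("OPENAI_MODEL") or "gpt-4o-mini",
    )
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in tests without touching the real environment.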

Commands

Research and write a report:

open-deep-research research "What are the main evaluation methods for neural information retrieval?" --final-papers 8

Read the question from stdin and print only the report body, which is the most convenient mode for agent skills:

printf '%s' "How are citation graphs used in scientific literature retrieval?" | \
  open-deep-research research --stdin --format report

Disable the LLM and run the retrieval-only pipeline:

open-deep-research research "What are the main evaluation methods for neural information retrieval?" --no-llm

Inspect the query plan only:

open-deep-research plan "How do agentic retrieval systems differ from standard RAG?"

Print only the planned queries:

open-deep-research plan "How do agentic retrieval systems differ from standard RAG?" --format queries

Run the local JSON API:

open-deep-research serve --host 127.0.0.1 --port 8080

Example request:

curl -X POST http://127.0.0.1:8080/research \
  -H 'Content-Type: application/json' \
  -d '{"question": "What are the main design patterns in deep research systems?", "final_papers": 6}'
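The same request can be sent from Python using only the standard library. This is a sketch under two assumptions: the server started by the serve command above is listening on 127.0.0.1:8080, and the response body is JSON (its exact schema is not documented here). The helper names are illustrative.

```python
import json
import urllib.request


def build_research_request(question, final_papers=6, base_url="http://127.0.0.1:8080"):
    """Build the POST request without sending it."""
    payload = json.dumps({"question": question, "final_papers": final_papers})
    return urllib.request.Request(
        base_url.rstrip("/") + "/research",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_research(question, final_papers=6, base_url="http://127.0.0.1:8080", timeout=600):
    """Send the request and decode the response, assumed to be JSON."""
    request = build_research_request(question, final_papers, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A research job can take a while, so the timeout is deliberately generous; adjust it to your model and network.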

Outputs

Each run writes:

report.md: literature review in Markdown
papers.json: normalized paper metadata and scores
trace.json: planned queries, expansion edges, and selection decisions

The research command also supports skill-friendly stdout modes:

--format json: full structured result
--format paths: just the output file locations
--format report: print report.md
--format papers: print papers.json
--format trace: print trace.json
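For post-processing a finished run, the three files can be read back with a few lines of Python. This sketch assumes the files sit directly inside the directory passed to --output-dir and makes no assumption about the internal structure of papers.json or trace.json; load_run is a hypothetical helper name.

```python
import json
from pathlib import Path


def load_run(output_dir):
    """Load report.md, papers.json and trace.json from one run directory."""
    root = Path(output_dir)
    return {
        "report": (root / "report.md").read_text(encoding="utf-8"),
        "papers": json.loads((root / "papers.json").read_text(encoding="utf-8")),
        "trace": json.loads((root / "trace.json").read_text(encoding="utf-8")),
    }
```

For example, load_run("outputs/rag")["report"] would return the Markdown review produced by the Quickstart command.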

Deep research workflow

question
  -> query plan
  -> OpenAlex search
  -> reference/citation expansion
  -> heuristic scoring
  -> optional LLM reranking
  -> OA text fetch
  -> report synthesis
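The search and expansion steps map onto a small number of OpenAlex API calls. The sketch below only builds the URLs and reads reference IDs from a work record; it relies on the public OpenAlex works endpoint (the search, filter=cites:, per-page and mailto parameters, and the referenced_works field). The function names are illustrative and are not taken from openalex.py.

```python
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"


def short_id(work_id):
    """Turn 'https://openalex.org/W123' into 'W123'."""
    return work_id.rstrip("/").rsplit("/", 1)[-1]


def _works_url(params, mailto=None):
    """Attach the optional polite-pool address and encode the query string."""
    if mailto:
        params = {**params, "mailto": mailto}
    return OPENALEX_WORKS + "?" + urlencode(params)


def search_url(query, per_page=25, mailto=None):
    """URL for a keyword search over works."""
    return _works_url({"search": query, "per-page": per_page}, mailto)


def citing_works_url(work_id, per_page=25, mailto=None):
    """URL listing works that cite the given work (citation expansion)."""
    return _works_url({"filter": "cites:" + short_id(work_id), "per-page": per_page}, mailto)


def reference_ids(work):
    """Outgoing references are listed on the work record itself."""
    return [short_id(ref) for ref in work.get("referenced_works") or []]
```

Reference expansion needs no extra listing call because the IDs are already on the record, while citation expansion requires one filtered query per seed paper.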

Notes

This repo is designed for open scholarly discovery, not closed publisher access.
OpenAlex does not contain all full texts. The pipeline therefore falls back to abstracts when open text cannot be fetched.
For large-scale ingestion, OpenAlex also provides data snapshots and an official command-line tool, OpenAlex CLI.
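The abstract fallback mentioned above has one practical wrinkle: OpenAlex serves abstracts as an inverted index (word to list of positions) in the abstract_inverted_index field rather than as plain text. A minimal sketch of rebuilding the text and falling back to it follows; the function names are illustrative, and fetchers.py may handle this differently.

```python
def abstract_from_inverted_index(inverted_index):
    """Rebuild plain text from an OpenAlex abstract_inverted_index mapping."""
    if not inverted_index:
        return ""
    positioned = []
    for word, positions in inverted_index.items():
        for position in positions:
            positioned.append((position, word))
    positioned.sort()  # order words by their position in the abstract
    return " ".join(word for _, word in positioned)


def best_available_text(work, full_text=None):
    """Prefer fetched open-access text; otherwise fall back to the abstract."""
    if full_text and full_text.strip():
        return full_text
    return abstract_from_inverted_index(work.get("abstract_inverted_index"))
```

A work with neither open text nor an abstract yields an empty string, which the caller can use to skip the paper or cite it from metadata only.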

Codex skill use

This repo now includes a minimal skill template at codex_skill/open-deep-research/SKILL.md.

That template assumes the CLI is installed and then uses stdin plus explicit output modes, which is the cleanest way for an agent to call the tool:

printf '%s' "$QUESTION" | open-deep-research research --stdin --format report
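An agent written in Python can make the same call through subprocess, passing the question on stdin so that no shell quoting is involved. This is a sketch that assumes the open-deep-research executable is on PATH; build_command and ask are hypothetical helper names, and only flags shown in this README are used.

```python
import subprocess


def build_command(output_format="report"):
    """Argument list for a stdin-driven research run."""
    return ["open-deep-research", "research", "--stdin", "--format", output_format]


def ask(question, output_format="report", timeout=900):
    """Run the CLI with the question on stdin and return its stdout.

    Raises subprocess.CalledProcessError on a non-zero exit code.
    """
    completed = subprocess.run(
        build_command(output_format),
        input=question,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return completed.stdout
```

Passing the arguments as a list avoids shell interpretation of the question text, which matters when questions contain quotes or dollar signs.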

This version

0.1.1

Mar 10, 2026

Source Distribution

open_deep_research_cli-0.1.1.tar.gz (18.8 kB view details)

Uploaded Mar 10, 2026 Source

Built Distribution

open_deep_research_cli-0.1.1-py3-none-any.whl (19.6 kB view details)

Uploaded Mar 10, 2026 Python 3

File details

Details for the file open_deep_research_cli-0.1.1.tar.gz.

File metadata

Download URL: open_deep_research_cli-0.1.1.tar.gz
Upload date: Mar 10, 2026
Size: 18.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for open_deep_research_cli-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`21154993e0cd2102eef6d55b1188ada14b71c693b4e4cbbf51adbe3e5e37c690`
MD5	`cfa3c8ce2fff8b06db9ca1c3647318a3`
BLAKE2b-256	`67a084e8bfad10b07c56aa2e8a0cddf18fb93e25e0a93b13e7631397d2b9ae22`

File details

Details for the file open_deep_research_cli-0.1.1-py3-none-any.whl.

File metadata

Download URL: open_deep_research_cli-0.1.1-py3-none-any.whl
Upload date: Mar 10, 2026
Size: 19.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for open_deep_research_cli-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bdb2fa6bf425d3c41c42ce9f4e7225b8208356844b19c61ce9c08fe9d46d99c6`
MD5	`301333fec6cb06f00470318cf14478ba`
BLAKE2b-256	`b081f98a4a3dc2f8f1295cb2aafe1299642bb371de681ecca408cabca353ecc4`
