
Scholar All-In-One — local academic literature explorer powered by AI


ScholarAIO

Scholar All-In-One — a research knowledge infrastructure for AI agents.


License: MIT · Python 3.10+ · Claude Code Skills


Your coding agent already reads code, writes code, and runs experiments. ScholarAIO adds a structured research workspace on top, so the same agent can search literature, cross-check results against papers, use scientific software more accurately, and carry the whole research workflow from one terminal.

  • Your paper library becomes a reusable knowledge base for the same agent.
  • When scientific software questions come up, the agent can consult official documentation at runtime instead of guessing from prompts.
  • The system is built to keep expanding as new tools and workflows become worth supporting.

[Figure: ScholarAIO natural-language research workflow]

ScholarAIO offers more than search. It gives an AI coding agent a research workspace that supports natural-language interaction, papers and notes, more reliable use of scientific software, writing and running code, checking results against the literature, and structured academic writing.

[Figure: ScholarAIO architecture: human, agent, scientific context, tool layer, and compute/outputs]

Quick Start

The default and recommended way to use ScholarAIO is simple: install it, configure it once, and open this repository directly with your coding agent.

git clone https://github.com/ZimoLiao/scholaraio.git
cd scholaraio
pip install -e ".[full]"
scholaraio setup

Then open the repository in Codex, Claude Code, or another supported agent. In this setup, the agent gets the fullest experience: bundled instructions, local skills, the CLI, and the complete codebase context are all available directly. For Claude Code plugins, Codex/OpenClaw skill registration, and other setup paths, see docs/getting-started/agent-setup.md.

What It Does

| Feature | Details |
| --- | --- |
| PDF Parsing | Deep structure extraction: Convert PDFs into structured Markdown while preserving formulas, figures, and layout as much as possible |
| Not Just Papers | More than papers: Journal articles, theses, patents, technical reports, standards, and lecture notes, with four inbox categories and tailored metadata handling |
| Hybrid Search | Keyword + semantic fusion: Combine full-text and vector retrieval for stronger search results |
| Topic Discovery | See what your library is about: Automatically group papers into research themes and use interactive views to grasp the overall structure quickly |
| Literature Exploration | Multi-dimensional discovery: Explore a research direction through journal, topic, author, institution, keyword, year, citation impact, and more |
| Citation Graph | References & impact: Forward citations, backward citations, and shared-reference analysis |
| Layered Reading | Read on demand: Start with metadata or the abstract, then move into conclusions or full text only when you need to |
| Multi-Source Import | Connect your existing library: Import directly from reference managers, PDFs, and Markdown without rebuilding your library from scratch |
| Workspaces | Organize by project: Manage paper subsets with scoped search and BibTeX export |
| Multi-Format Export | BibTeX, RIS, Markdown, DOCX: Export your full library or a workspace for Zotero, Endnote, submission, or sharing |
| Persistent Notes | Cross-session memory: Keep analysis notes for each paper so future sessions can reuse them instead of starting over |
| Research Insights | Reading behavior analytics: Frequently searched keywords, most-read papers, reading trends, and semantic-neighbor recommendations for papers you haven't read yet |
| Federated Discovery | Cross-library search: Search your main library, exploration libraries, and arXiv from one entry point instead of hopping across tools |
| AI-for-Science Runtime | Use scientific software more accurately: Consult official documentation at runtime instead of guessing commands and parameters |
| Extensible Tool Onboarding | Keep adding the tools that matter: As new scientific tools and workflows become important, the system can keep expanding |
| Academic Writing | AI-assisted writing: Literature review, paper sections, citation check, rebuttal, and gap analysis, with every citation traceable to your own library |
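
To make the "keyword + semantic fusion" idea concrete, here is a minimal sketch of one common way to merge two ranked result lists: reciprocal rank fusion (RRF). This section does not say which fusion method ScholarAIO actually uses, so treat RRF, the function name `reciprocal_rank_fusion`, the constant `k=60`, and the paper IDs as illustrative assumptions rather than the project's implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists (not real ScholarAIO output).
keyword_hits = ["paper_a", "paper_b", "paper_c"]   # full-text ranking
semantic_hits = ["paper_b", "paper_d", "paper_a"]  # vector ranking

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Papers that appear in both lists (`paper_a`, `paper_b`) outrank papers found by only one retriever, which is the behavior a hybrid search is meant to provide.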

Works With Your Agent

ScholarAIO is designed to be agent-agnostic, but different agents expose different integration paths. Some work best when you open this repository directly; others are easier to use through plugins.

| Agent / IDE | Open this repo directly | Reuse from another project |
| --- | --- | --- |
| Claude Code | CLAUDE.md + .claude/skills/ | Claude plugin marketplace |
| Codex / OpenClaw | AGENTS.md + .agents/skills/ | Symlink skills into ~/.agents/skills/ |
| Cline | .clinerules + .claude/skills/ | CLI + skills |
| Cursor | .cursorrules | CLI + skills |
| Windsurf | .windsurfrules | CLI + skills |
| GitHub Copilot | .github/copilot-instructions.md | CLI + skills |

Skills follow the open AgentSkills.io standard, and .agents/skills/ is a symlink to .claude/skills/ so different agents can discover and reuse the same skills.
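
For the "symlink skills into ~/.agents/skills/" path, the following sketch shows one way such linking could be scripted. It assumes each skill lives in its own subdirectory of the skills folder; the helper name `link_skills` and the checkout path in the usage note are hypothetical, and docs/getting-started/agent-setup.md remains the authoritative procedure.

```python
from pathlib import Path


def link_skills(source_dir: Path, target_dir: Path) -> list[Path]:
    """Symlink every skill directory in source_dir into target_dir.

    Existing entries in target_dir are left untouched, so the function
    is safe to re-run. Returns the links that were newly created.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for skill in sorted(source_dir.iterdir()):
        if not skill.is_dir():
            continue  # assumption: one directory per skill; skip loose files
        link = target_dir / skill.name
        if link.exists() or link.is_symlink():
            continue  # never overwrite an existing skill or link
        link.symlink_to(skill.resolve(), target_is_directory=True)
        created.append(link)
    return created
```

A call might look like `link_skills(Path("scholaraio/.claude/skills"), Path.home() / ".agents" / "skills")`, where the first argument points at your local checkout.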

Migrating from existing tools? Import directly from Endnote (XML/RIS) and Zotero (Web API or local SQLite), with PDFs, metadata, and references brought over together. More import sources are on the roadmap.
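
If you want to inspect an Endnote RIS export before importing it, a small parser is enough to see what it contains. The sketch below is not ScholarAIO's importer; it only assumes the usual RIS line layout (a two-character tag, two spaces, a hyphen, then the value, with `TY` opening a record and `ER` closing it). The sample record is made up for illustration.

```python
import re

# Standard RIS line shape: "TAG  - value".
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")


def parse_ris(text):
    """Parse RIS text into a list of records, each a dict of tag -> list of values."""
    records, current = [], None
    for line in text.splitlines():
        match = RIS_LINE.match(line.rstrip())
        if not match:
            continue  # this sketch ignores blank and continuation lines
        tag, value = match.groups()
        if tag == "TY":
            current = {"TY": [value]}      # start a new record
        elif tag == "ER":
            if current is not None:
                records.append(current)    # close the open record
            current = None
        elif current is not None:
            current.setdefault(tag, []).append(value)
    return records


# Made-up sample record, not taken from any real library.
sample = "\n".join([
    "TY  - JOUR",
    "AU  - Doe, Jane",
    "AU  - Roe, Richard",
    "TI  - An Example Title",
    "PY  - 2020",
    "ER  - ",
])

records = parse_ris(sample)
print(len(records), records[0]["AU"])
```

Repeated tags such as `AU` are collected into lists, which keeps multi-author entries intact.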

Configuration

Start by opening scholaraio with your agent and let it walk you through the setup. The notes below are only a basic overview.

ScholarAIO works with a minimal setup and can be expanded as needed.

  • scholaraio setup walks you through the basics.
  • An LLM API key is optional but recommended for more robust metadata extraction and content completion.
  • A MinerU token is free; it is optional but recommended. You can also deploy MinerU or Docling locally for PDF parsing.
  • scholaraio setup check shows what is installed, what is optional, and what is missing.
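
For scripted environments, the `scholaraio setup check` command named above can be wrapped in a few lines of Python. This is a sketch under assumptions: the helper name `run_setup_check` and the timeout are hypothetical, and because this section does not document the command's output format or exit codes, the code only captures them without interpreting them.

```python
import shutil
import subprocess


def run_setup_check(timeout=60):
    """Run `scholaraio setup check`; return None if the CLI is not on PATH."""
    exe = shutil.which("scholaraio")
    if exe is None:
        return None
    # Capture output as text; its structure is not assumed here.
    return subprocess.run(
        [exe, "setup", "check"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


result = run_setup_check()
if result is None:
    print("scholaraio is not installed in this environment")
else:
    print(result.stdout)
```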

Full setup and configuration details → docs/getting-started/agent-setup.md, config.yaml

Agent First, CLI Available

ScholarAIO works best through an AI coding agent, but it also provides a CLI for scripting, debugging, and quick queries. For a current command reference aligned with the code, see docs/guide/cli-reference.md.

Project Structure

scholaraio/             # Python package — CLI and all core modules
  ingest/               #   PDF parsing + metadata extraction pipeline
  sources/              #   External source adapters (arXiv / Endnote / Zotero)

.claude/skills/         # Agent skills (AgentSkills.io format)
.agents/skills/         # ↑ symlink for cross-agent discovery
data/papers/            # Your paper library (gitignored)
data/proceedings/       # Proceedings library (gitignored)
data/inbox/             # Drop PDFs here for ingestion
data/inbox-proceedings/ # Drop proceedings volumes here for dedicated ingest

Full module reference → CLAUDE.md or AGENTS.md
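
Since `data/inbox/` is the drop location for PDFs, staging files there can be automated. The sketch below only copies files; it assumes a flat inbox with one file per PDF, the helper name `stage_pdfs` and the example source folder are hypothetical, and the ingestion step itself is left to ScholarAIO.

```python
import shutil
from pathlib import Path


def stage_pdfs(source_dir: Path, inbox: Path) -> list[Path]:
    """Copy PDFs from source_dir into inbox, skipping names already present."""
    inbox.mkdir(parents=True, exist_ok=True)
    staged = []
    for pdf in sorted(source_dir.glob("*.pdf")):
        dest = inbox / pdf.name
        if dest.exists():
            continue  # do not overwrite a file that is already waiting
        shutil.copy2(pdf, dest)  # copy2 also preserves file timestamps
        staged.append(dest)
    return staged
```

A call such as `stage_pdfs(Path.home() / "Downloads", Path("data/inbox"))`, run from the repository root, would queue downloaded PDFs for the next ingestion run.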

Citation

If you use ScholarAIO in your research, please cite:

@software{scholaraio,
  author = {Liao, Zi-Mo},
  title = {ScholarAIO: AI-Native Research Terminal},
  year = {2026},
  url = {https://github.com/ZimoLiao/scholaraio},
  license = {MIT}
}

License

MIT © 2026 Zi-Mo Liao

