Installable RAG + MCP skills framework with a reliability-loop workflow.
rag-ai-scientist
Installable toolkit for local RAG indexing + MCP serving in scientific workflows.
rag-ai-scientist gives you:
- a CLI to initialize and build a local vector database from your papers and notes,
- an MCP server entrypoint for Cursor / agent integrations,
- packaged skills under rag_ai_scientist/skills/ (workflow checklists; no Git clone needed).
On PyPI, only this README is shown; detailed guides live on GitHub (absolute links below).
End-user workflow (pip only — no clone required)
Install from PyPI, create a project folder anywhere, put your research materials in it, index once, then connect Cursor.
Full step-by-step: Getting started — install → references/ → init-references → setup-rag → MCP → update notes and rebuild.
Minimal command sequence (after pip install rag-ai-scientist):
mkdir -p ~/my-ai-scientist/references
cd ~/my-ai-scientist
# Add your own .md / .pdf files under references/
rag-ai-scientist init-references --project-root . --references-dir ./references
rag-ai-scientist setup-rag --project-root . --force
rag-ai-scientist mcp --project-root . # usually configured once inside Cursor — see Getting started link above
- query_analysis_knowledge answers from your indexed files.
- get_skill loads packaged skills (e.g. cms-higgs-opendata) without indexing anything extra.
To update your AI scientist, edit files under references/ (and configs/references.yaml if paths change), then run setup-rag --force again.
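The command sequence above can also be driven from a script. The following is a minimal sketch, assuming the rag-ai-scientist executable is on PATH and using only the subcommands and flags shown in this README; the helper names build_commands and rebuild_index are hypothetical and not part of the package.

```python
import subprocess
from typing import List


def build_commands(project_root: str, references_dir: str) -> List[List[str]]:
    """Return the documented init + index commands as argument lists."""
    return [
        ["rag-ai-scientist", "init-references",
         "--project-root", project_root,
         "--references-dir", references_dir],
        ["rag-ai-scientist", "setup-rag",
         "--project-root", project_root, "--force"],
    ]


def rebuild_index(project_root: str = ".",
                  references_dir: str = "./references") -> None:
    """Run each command in order; check=True stops at the first failure."""
    for cmd in build_commands(project_root, references_dir):
        subprocess.run(cmd, check=True)
```

Calling rebuild_index() from inside the project folder mirrors the shell steps. Building the argument lists separately from running them makes it easy to print or log the exact commands first.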
Installation
From PyPI (recommended)
python -m pip install rag-ai-scientist
Pinned example:
python -m pip install rag-ai-scientist==0.1.3
PyPI project page: rag-ai-scientist
Verify
rag-ai-scientist --help
python -c "import rag_ai_scientist; print(rag_ai_scientist.__version__)"
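If you prefer to check the installation without importing the package, the standard library can read the installed distribution metadata instead. This is a small sketch; the function name installed_version is hypothetical, and the distribution name is the one used in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "rag-ai-scientist") -> Optional[str]:
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


print(installed_version() or "rag-ai-scientist is not installed in this environment")
```

This is useful in a dedicated venv to confirm which interpreter actually has the package.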
From source (maintainers / contributors only)
git clone https://github.com/uzzielperez/rag-ai-scientist.git
cd rag-ai-scientist
git checkout dev # or your working branch
python3 -m venv .venv && source .venv/bin/activate
python -m pip install -e .
Isolation tip: use a dedicated venv (e.g. ~/venvs/rag-ai-scientist) instead of mixing with heavy analysis stacks.
CLI commands
| Command | Purpose |
|---|---|
| init-references | Writes configs/references.yaml pointing at your references directory. |
| setup-rag | Indexes sources into .cursor/rag_db. |
| mcp | Starts the stdio MCP server; point --project-root at the same folder you indexed. |
Common flags: --project-root, --force (rebuild index), --references-dir (with init-references).
Cursor MCP configuration
Register the server so Cursor runs it with your project path:
{
"mcpServers": {
"rag-ai-scientist": {
"command": "rag-ai-scientist",
"args": ["mcp", "--project-root", "/absolute/path/to/my-ai-scientist"]
}
}
}
See Getting started for optional .cursor/.env (LLM keys).
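Because the configuration needs an absolute project path, it can be convenient to generate the JSON rather than type it by hand. The sketch below only builds and prints the structure shown above; the helper name cursor_mcp_config is hypothetical, and where Cursor expects the file to live is not stated in this README, so the snippet does not write anything to disk.

```python
import json
from pathlib import Path


def cursor_mcp_config(project_root: str) -> dict:
    """Build the mcpServers entry with project_root resolved to an absolute path."""
    root = str(Path(project_root).expanduser().resolve())
    return {
        "mcpServers": {
            "rag-ai-scientist": {
                "command": "rag-ai-scientist",
                "args": ["mcp", "--project-root", root],
            }
        }
    }


def render(project_root: str) -> str:
    """Return the config as pretty-printed JSON ready to paste."""
    return json.dumps(cursor_mcp_config(project_root), indent=2)
```

For example, print(render("~/my-ai-scientist")) emits the block above with your home directory expanded.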
Packaged skills and examples
- Skills ship inside the installed package. Access via MCP get_skill (e.g. cms-higgs-opendata). No clone required.
- Examples / MCP access explains get_skill, Cursor wiring, and optional curated markdown for maintainers who ship a full docs tree. End users normally only need their own files under references/.
Running agents beside a separate lab environment
If training runs use a different conda/venv than rag-ai-scientist:
- Install rag-ai-scientist in its own small venv.
- Keep --project-root pointed at your research folder.
- Run heavy jobs via explicit wrappers (conda run, scripts) from the agent; see Runbook for patterns.
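One way to picture an explicit wrapper is a helper that launches a script inside the lab environment through conda run, leaving the rag-ai-scientist venv untouched. This is an illustrative sketch, not the pattern from the Runbook; the environment name lab-env and script name train.py used below are hypothetical.

```python
import subprocess
from typing import List


def conda_run_command(env_name: str, script: str, *script_args: str) -> List[str]:
    """Build a `conda run` invocation that executes a Python script in env_name."""
    return ["conda", "run", "-n", env_name, "python", script, *script_args]


def run_heavy_job(env_name: str, script: str, *script_args: str) -> int:
    """Run the job in the other environment and return its exit code."""
    completed = subprocess.run(conda_run_command(env_name, script, *script_args))
    return completed.returncode
```

A call such as run_heavy_job("lab-env", "train.py", "--epochs", "5") keeps the heavy dependencies in their own environment while the agent only sees the exit code.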
Repository layout (when developing from source)
rag_ai_scientist/
  cli.py # CLI entrypoint
  mcp_server.py # MCP server
  skills/ # Packaged skills (ship in wheel)
rag/
  index_documents.py # Indexer used by setup-rag
configs/
  references.example.yaml # Example only; users run init-references instead
docs/
  GETTING_STARTED.md # Primary user guide (pip-only path)
  examples/ # Maintainer docs / optional narratives
Browse on GitHub: docs/.
Development & PyPI releases
Contributor workflow and release steps: DEV_README.md.
License
- Open-source: AGPL-3.0-or-later (LICENSE)
- Commercial: see LICENSE-COMMERCIAL.md
Security notes
- Never commit secrets (.env, API keys).
- Treat .cursor/rag_db as sensitive if your indexed PDFs are sensitive.
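If your project folder is a Git repository, a quick check that these paths are ignored can catch accidental commits. The sketch below is deliberately naive: it only looks for exact entries in the .gitignore text and does not interpret glob patterns or parent-directory rules (a broader entry such as .cursor/ would still be reported as missing). The function name missing_ignores is hypothetical.

```python
from typing import Iterable, List

SENSITIVE_PATHS = (".env", ".cursor/rag_db")


def missing_ignores(gitignore_text: str,
                    patterns: Iterable[str] = SENSITIVE_PATHS) -> List[str]:
    """Return the patterns that have no exact matching line in .gitignore."""
    entries = {
        line.strip().strip("/")
        for line in gitignore_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    return [p for p in patterns if p.strip("/") not in entries]
```

Reading the file with Path(".gitignore").read_text() and passing the result in gives a list of entries you may want to add.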
File details
Details for the file rag_ai_scientist-0.1.3.tar.gz.
File metadata
- Download URL: rag_ai_scientist-0.1.3.tar.gz
- Upload date:
- Size: 23.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6b3781dab87f81a679a6f7697d108f3967efd9809bdd89087a7d455af74530c6 |
| MD5 | 287e5a5c51996c835a518a7d063e2633 |
| BLAKE2b-256 | ebbeacd75d2c0063b6671ea5ee9aebecd02ae13953b7b88e16f48b297ae47563 |
File details
Details for the file rag_ai_scientist-0.1.3-py3-none-any.whl.
File metadata
- Download URL: rag_ai_scientist-0.1.3-py3-none-any.whl
- Upload date:
- Size: 28.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 981d7f85d32ae1a0b2a65f0bf049aee4463f96defdd2d39b7b904d6ca8ea4080 |
| MD5 | 2ee033007cfcefb52f6a8dbd24727875 |
| BLAKE2b-256 | 159e3384a52a4c8cd5afa537388263b90ca7940e11f29529de6544bc572c1b6c |
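If you download one of these files manually, you can compare its SHA256 digest against the value listed for that file. The following is a minimal sketch using only the standard library; the function names sha256_of and matches_published are hypothetical, and the local file path is whatever you saved the download as.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, expected_hex: str) -> bool:
    """Compare the local digest with the published hex digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Pass the path of the downloaded file and the SHA256 value from the matching table; a False result means the file differs from what was published.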