Installable RAG + MCP skills framework with a reliability-loop workflow.

rag-ai-scientist

Installable toolkit for local RAG indexing + MCP serving in scientific workflows.

rag-ai-scientist gives you:

  • a CLI to initialize and build a local vector database from your papers and notes,
  • an MCP server entrypoint for Cursor / agent integrations,
  • packaged skills under rag_ai_scientist/skills/ (workflow checklists—no Git clone needed).

On PyPI, only this README is shown; detailed guides live on GitHub (absolute links below).


End-user workflow (pip only — no clone required)

You install from PyPI, create any folder for your project, put your research materials there, index once, then connect Cursor.

Full step-by-step guide: Getting started (install → references/ → init-references → setup-rag → MCP → update notes and rebuild).

Minimal command sequence (after pip install rag-ai-scientist):

mkdir -p ~/my-ai-scientist/references
cd ~/my-ai-scientist
# Add your own .md / .pdf files under references/

rag-ai-scientist init-references --project-root . --references-dir ./references
rag-ai-scientist setup-rag --project-root . --force
rag-ai-scientist mcp --project-root .    # usually configured once inside Cursor — see Getting started link above

Once the server is connected in Cursor:

  • query_analysis_knowledge answers from your indexed files.
  • get_skill loads packaged skills (e.g. cms-higgs-opendata) without indexing anything extra.
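
If you would rather script the indexing steps than type them, the sequence above can be wrapped in a few lines of Python. This is a minimal sketch, not part of the package: the helper names index_commands and run_all are hypothetical, and only the subcommands and flags already shown above are used.

```python
import shlex
import subprocess
from pathlib import Path


def index_commands(project_root: str, references_dir: str = "references") -> list[list[str]]:
    """Build argv lists for the init-references and setup-rag steps.

    Hypothetical helper: it only assembles the commands shown in the README.
    """
    root = Path(project_root)
    refs = root / references_dir
    return [
        ["rag-ai-scientist", "init-references",
         "--project-root", str(root), "--references-dir", str(refs)],
        ["rag-ai-scientist", "setup-rag", "--project-root", str(root), "--force"],
    ]


def run_all(project_root: str) -> None:
    """Echo each command, then run it; stop at the first failure."""
    for argv in index_commands(project_root):
        print("$", shlex.join(argv))
        subprocess.run(argv, check=True)  # requires rag-ai-scientist on PATH
```

Calling run_all(".") from the project folder would run both steps in order. The mcp step is left out on purpose, since Cursor normally launches it.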

To update your AI scientist, edit files under references/ (and configs/references.yaml if paths change), then run setup-rag --force again.
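
One way to decide when a rebuild is due is to compare modification times. The sketch below assumes that files under .cursor/rag_db are rewritten whenever setup-rag runs, which this README does not confirm. The function names are hypothetical, and changes to configs/references.yaml are not checked.

```python
from pathlib import Path


def newest_mtime(folder: Path) -> float:
    """Latest modification time of any file under folder, or 0.0 if there is none."""
    if not folder.is_dir():
        return 0.0
    times = [p.stat().st_mtime for p in folder.rglob("*") if p.is_file()]
    return max(times, default=0.0)


def needs_reindex(project_root: Path) -> bool:
    """True when some reference file is newer than every file in the index folder.

    Assumption: the index lives in .cursor/rag_db and its files are touched
    on each rebuild.
    """
    refs = project_root / "references"
    db = project_root / ".cursor" / "rag_db"
    return newest_mtime(refs) > newest_mtime(db)
```

A missing index folder counts as older than any reference file, so a project that has never been indexed reports True as soon as references/ contains a file.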


Installation

From PyPI (recommended)

python -m pip install rag-ai-scientist

Pinned example:

python -m pip install rag-ai-scientist==0.1.3

PyPI project page: rag-ai-scientist

Verify

rag-ai-scientist --help
python -c "import rag_ai_scientist; print(rag_ai_scientist.__version__)"
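
The same check can be done without importing the package, which helps when you only want to confirm what pip installed. This is a sketch using the standard library's importlib.metadata; the helper names installed_version and is_pinned are illustrative.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "rag-ai-scientist") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def is_pinned(expected: str, dist_name: str = "rag-ai-scientist") -> bool:
    """True when the installed version matches the expected pin exactly."""
    return installed_version(dist_name) == expected
```

For example, is_pinned("0.1.3") would confirm that the pinned install above took effect in the active environment.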

From source (maintainers / contributors only)

git clone https://github.com/uzzielperez/rag-ai-scientist.git
cd rag-ai-scientist
git checkout dev   # or your working branch
python3 -m venv .venv && source .venv/bin/activate
python -m pip install -e .

Isolation tip: use a dedicated venv (e.g. ~/venvs/rag-ai-scientist) instead of mixing with heavy analysis stacks.


CLI commands

Command          Purpose
init-references  Writes configs/references.yaml pointing at your references directory.
setup-rag        Indexes sources into .cursor/rag_db.
mcp              Starts the stdio MCP server; point --project-root at the same folder you indexed.

Common flags: --project-root, --force (rebuild index), --references-dir (with init-references).


Cursor MCP configuration

Register the server so Cursor runs it with your project path:

{
  "mcpServers": {
    "rag-ai-scientist": {
      "command": "rag-ai-scientist",
      "args": ["mcp", "--project-root", "/absolute/path/to/my-ai-scientist"]
    }
  }
}

See Getting started for optional .cursor/.env (LLM keys).
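
Because the configuration needs an absolute path, it can be convenient to generate the JSON rather than edit it by hand. The sketch below builds the same structure as the snippet above; the function name cursor_mcp_config is hypothetical, and where you paste or save the output depends on your Cursor setup.

```python
import json
from pathlib import Path


def cursor_mcp_config(project_root: str) -> dict:
    """Return the MCP server entry with project_root expanded to an absolute path."""
    root = Path(project_root).expanduser().resolve()
    return {
        "mcpServers": {
            "rag-ai-scientist": {
                "command": "rag-ai-scientist",
                "args": ["mcp", "--project-root", str(root)],
            }
        }
    }


# Print the JSON for the example project folder used earlier in this README.
print(json.dumps(cursor_mcp_config("~/my-ai-scientist"), indent=2))
```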


Packaged skills and examples

  • Skills ship inside the installed package. Access via MCP get_skill (e.g. cms-higgs-opendata). No clone required.
  • Examples / MCP access explains get_skill, Cursor wiring, and optional curated markdown for maintainers who ship a full docs tree. End users normally only need their own files under references/.

Running agents beside a separate lab environment

If your training runs use a different conda environment or venv than rag-ai-scientist:

  1. Install rag-ai-scientist in its own small venv.
  2. Keep --project-root pointed at your research folder.
  3. Run heavy jobs via explicit wrappers (conda run, scripts) from the agent — see Runbook for patterns.
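
As an illustration of step 3, an explicit wrapper can be as small as a function that prefixes a command with conda run. This is a sketch only: the environment name lab-env and the script train.py are made up, and the Runbook may recommend a different pattern.

```python
import shlex


def conda_run_argv(env_name: str, command: list[str]) -> list[str]:
    """Prefix a command so it executes inside the named conda environment."""
    if not command:
        raise ValueError("command must not be empty")
    return ["conda", "run", "-n", env_name, *command]


# Hypothetical environment and script names, for illustration only.
argv = conda_run_argv("lab-env", ["python", "train.py", "--epochs", "5"])
print(shlex.join(argv))  # prints: conda run -n lab-env python train.py --epochs 5
```

Passing the resulting list to subprocess.run keeps the heavy job in its own environment while the agent and rag-ai-scientist stay in the small venv.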

Repository layout (when developing from source)

rag_ai_scientist/
  cli.py                  # CLI entrypoint
  mcp_server.py           # MCP server
  skills/                 # Packaged skills (ship in wheel)
rag/
  index_documents.py      # Indexer used by setup-rag
configs/
  references.example.yaml # Example only — users run init-references instead
docs/
  GETTING_STARTED.md      # Primary user guide (pip-only path)
  examples/               # Maintainer docs / optional narratives

Browse on GitHub: docs/.


Development & PyPI releases

Contributor workflow and release steps: DEV_README.md.


License


Security notes

  • Never commit secrets (.env, API keys).
  • Treat .cursor/rag_db as sensitive if your indexed PDFs are sensitive.
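
If the project folder is a Git repository, a quick check of .gitignore can catch an unprotected .env or index folder. The sketch below is deliberately naive: it compares literal lines only and does not implement Git's pattern matching, so a broader rule such as .cursor/ would still be reported as missing. The function name missing_ignores is hypothetical.

```python
def missing_ignores(gitignore_text: str,
                    required: tuple = (".env", ".cursor/rag_db")) -> list:
    """Return the required entries that do not appear literally in .gitignore.

    Naive comparison: comments and blank lines are skipped, and leading or
    trailing slashes are ignored. Wildcards and parent-directory rules are
    NOT evaluated.
    """
    entries = set()
    for line in gitignore_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.add(line.strip("/"))
    return [item for item in required if item.strip("/") not in entries]
```

An empty result means both paths are listed explicitly. A non-empty result is a prompt to inspect the file by hand, not proof that something is exposed.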
