index1

AI-native project knowledge base. BM25 + vector hybrid search, < 200ms response.

index1 vs grep real-world comparison

index1 has been tested alongside Claude's built-in grep in real-world use. Comparison of index1 + Claude grep versus Claude grep alone:

https://github.com/user-attachments/assets/b689b0bb-b767-4fc8-9055-cc3ae872559e

Install

One-click (recommended):

# macOS / Linux
curl -sSL https://raw.githubusercontent.com/gladego/index1/main/scripts/install.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/gladego/index1/main/scripts/install.ps1 | iex

The script auto-detects Python, installs index1 via pipx, sets up Ollama, and creates a default config.

Manual install:

pipx install index1    # recommended
# or: pip install index1

Note: macOS blocks global pip install by default. Use pipx instead:

  • macOS: brew install pipx
  • Linux: pip install --user pipx && pipx ensurepath
  • Windows: scoop install pipx or pip install --user pipx

Quick Start

ollama pull nomic-embed-text      # optional, for semantic search
index1 index ./docs ./src
index1 search "how to use the liquidation API"

Ollama is optional. Without it, index1 falls back to BM25-only search.

AI Tool Integration

Claude Code

Add .mcp.json to your project root:

{
  "mcpServers": {
    "index1": {
      "type": "stdio",
      "command": "index1",
      "args": ["serve"]
    }
  }
}

Restart Claude Code — five docs_* tools will be available (docs_search, docs_get, docs_status, docs_reindex, docs_config).
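If the project root already has a .mcp.json that registers other servers, the index1 entry can be merged in rather than overwriting the file. The sketch below works on an already-parsed JSON mapping; `add_index1_server` and the `other` server entry are hypothetical names used only for illustration, not part of index1.

```python
import copy
import json

# The "index1" entry is copied from the .mcp.json shown above.
INDEX1_SERVER = {"type": "stdio", "command": "index1", "args": ["serve"]}


def add_index1_server(config: dict) -> dict:
    """Return a copy of an .mcp.json mapping with the index1 server added.

    Hypothetical helper: servers that are already registered are kept as-is.
    """
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["index1"] = copy.deepcopy(INDEX1_SERVER)
    return merged


# Example input: a config that already registers some other (made-up) server.
existing = {"mcpServers": {"other": {"type": "stdio", "command": "other-tool"}}}
updated = add_index1_server(existing)
print(json.dumps(updated, indent=2))
```

Writing `updated` back with `json.dump` would produce a file equivalent to the one above, plus whatever servers were already configured.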

Full setup guide: Claude Code integration — MCP config, search strategy, CLAUDE.md setup, context-saving tips

Other AI Tools (OpenClaw, Cursor, Windsurf, Cline...)

MCP-compatible tools: Add the same config above to your tool's MCP settings.

CLI mode (works with any tool):

index1 search "how does authentication work"
index1 get <chunk_id>
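For tools that can only shell out, the CLI can be wrapped in a few lines of Python. This is a minimal sketch: `build_search_command` and `run_search` are hypothetical helper names, the `-c` flag is the collection option shown in the FAQ, and the output is returned as raw text because this README does not specify the CLI's output format.

```python
import subprocess
from typing import List, Optional


def build_search_command(query: str, collection: Optional[str] = None) -> List[str]:
    """Build the argv for `index1 search`, optionally scoped to a collection."""
    argv = ["index1", "search", query]
    if collection is not None:
        argv += ["-c", collection]
    return argv


def run_search(query: str, collection: Optional[str] = None) -> str:
    """Run the search and return stdout unparsed (requires index1 on PATH)."""
    result = subprocess.run(
        build_search_command(query, collection),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Only the argv is built here, so this line runs without index1 installed.
print(build_search_command("how does authentication work"))
```

Passing the query as a list element rather than a shell string avoids quoting problems with spaces or special characters in the query.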

Full setup guide: Other AI agents integration — per-tool config, CLI usage, Web UI

Ollama (recommended)

# macOS
brew install ollama && ollama pull nomic-embed-text

# Linux
curl -fsSL https://ollama.ai/install.sh | sh && ollama pull nomic-embed-text

# Windows — download from https://ollama.ai/download, then:
ollama pull nomic-embed-text

| Model | Dim | Disk | RAM | Best for |
|---|---|---|---|---|
| all-minilm | 384 | ~45 MB | ~250 MB | English, low-resource machines |
| nomic-embed-text (default) | 768 | ~270 MB | ~500 MB | English + Chinese, general use |
| bge-m3 | 1024 | ~1.2 GB | ~1.2 GB | Chinese-optimized, 100+ languages |

Without Ollama, index1 falls back to BM25-only search (no semantic/cross-language support).

CLI Commands

index1 index <paths...>          # Index files/directories
index1 search <query>            # Hybrid search
index1 status                    # View index statistics
index1 config [key] [value]      # View/modify configuration
index1 serve                     # Start MCP Server (stdio)
index1 web                       # Start Web UI (port 6888)

Supported File Types

.md .markdown .py .rs .js .ts .jsx .tsx .txt

Each type uses structure-aware chunking: headings for Markdown, AST for Python, regex patterns for Rust/JS/TS.
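To illustrate what AST-based chunking for Python could look like, here is a minimal sketch using the standard `ast` module. It is not index1's actual chunker: the function name `chunk_python_source` and the choice to emit one chunk per top-level function or class are assumptions made for the example.

```python
import ast
from typing import List, Tuple


def chunk_python_source(source: str) -> List[Tuple[str, int, int]]:
    """Return (name, start_line, end_line) for each top-level def or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno / end_lineno are 1-based and inclusive
            chunks.append((node.name, node.lineno, node.end_lineno))
    return chunks


SAMPLE = '''import os

def load(path):
    return open(path).read()

class Index:
    def add(self, doc):
        pass
'''

for name, start, end in chunk_python_source(SAMPLE):
    print(f"{name}: lines {start}-{end}")
# prints:
# load: lines 3-4
# Index: lines 6-8
```

Splitting on syntactic boundaries keeps each function or class intact in a single chunk, which a fixed-size window cannot guarantee.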

Configuration

Config file: ~/.claude-index1/config.yaml

embedding_model: nomic-embed-text   # Ollama model
embedding_dim: 768
ollama_url: http://localhost:11434
top_k: 10                           # Results per query
collection: default                 # Namespace isolation

Project-level override: .index1.yaml in project root.
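How the two files combine is not spelled out here. The sketch below assumes the simplest behaviour, where each key in the project file replaces the same key from the global file; that merge rule, the function name `effective_config`, and the sample override values are all assumptions for illustration. Both files are represented as already-parsed dictionaries.

```python
from typing import Any, Dict

# Values copied from the config.yaml example above.
GLOBAL_CONFIG: Dict[str, Any] = {
    "embedding_model": "nomic-embed-text",
    "embedding_dim": 768,
    "ollama_url": "http://localhost:11434",
    "top_k": 10,
    "collection": "default",
}


def effective_config(global_cfg: Dict[str, Any], project_cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Assumed semantics: project keys override global keys one by one."""
    merged = dict(global_cfg)
    merged.update(project_cfg)
    return merged


# Hypothetical contents of a project's .index1.yaml, already parsed to a dict.
project_override = {"collection": "proj_a", "top_k": 5}

config = effective_config(GLOBAL_CONFIG, project_override)
print(config["collection"], config["top_k"], config["embedding_model"])
# prints: proj_a 5 nomic-embed-text
```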

Architecture

Claude Code ──► MCP Server (stdio)
                    │
CLI ────────────► Query Engine ──► SQLite
                    │               ├── FTS5 (BM25)
Web UI ─────────┘   │               └── sqlite-vec (vector)
                    │
              Ollama Embedding

  • Storage: Single SQLite file (~/.claude-index1/knowledge.db)
  • Search: BM25 + vector with Reciprocal Rank Fusion (k=60)
  • Chunking: Structure-aware splitting by file type
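Reciprocal Rank Fusion combines the BM25 and vector rankings using only each result's position, so the two scoring scales never need to be normalised against each other. The following is a sketch of the standard formula, score(d) = Σ 1 / (k + rank), with k = 60 as stated above; the function name and the chunk IDs are illustrative and not taken from index1's source.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked ID lists; each list contributes 1 / (k + rank), rank from 1."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["c1", "c2", "c3"]    # hypothetical chunk IDs from the FTS5 side
vector_hits = ["c3", "c1", "c4"]  # hypothetical chunk IDs from the vector side

print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
# prints: ['c1', 'c3', 'c2', 'c4']
```

Chunks that appear in both lists (c1, c3) rise above chunks found by only one retriever, which is the intended effect of hybrid search.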

Performance

| Mode | Cold | Hot (cached) |
|---|---|---|
| Hybrid (BM25 + Vector) | 40–180 ms | < 1 ms |
| BM25-only (no Ollama) | ~35 ms* | < 1 ms |
| Grep/Glob (native) | 4 ms | N/A |

* Measured after the first query. The first cold query without Ollama takes ~1 s because of the connection timeout; the result is then cached for 60 s.

Without Ollama: 6–8x slower cold start, Chinese semantic search returns 0 results, no cross-language support.
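One way to read the footnote is that the slow part, the failed connection attempt to Ollama, is paid once and its outcome is remembered for 60 seconds. The sketch below shows that pattern under this assumption; it is not index1's code, and `CachedProbe`, `fake_probe`, and the injected clock are hypothetical names used so the example runs without a network.

```python
import time
from typing import Callable, Optional


class CachedProbe:
    """Cache the boolean result of a health check for `ttl` seconds."""

    def __init__(
        self,
        probe: Callable[[], bool],
        ttl: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._probe = probe
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[bool] = None
        self._expires_at = 0.0

    def available(self) -> bool:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._probe()       # the slow call happens only here
            self._expires_at = now + self._ttl
        return self._value


calls = []


def fake_probe() -> bool:
    calls.append(1)   # stands in for a connection attempt that times out
    return False


fake_now = [0.0]      # controllable clock so the example needs no sleeping
ollama = CachedProbe(fake_probe, ttl=60.0, clock=lambda: fake_now[0])

ollama.available()    # first query: probe runs
ollama.available()    # within 60 s: cached answer, no probe
fake_now[0] = 61.0
ollama.available()    # after expiry: probe runs again

mode = "hybrid" if ollama.available() else "bm25"
print(len(calls), mode)
# prints: 2 bm25
```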

Context savings: index1 returns top-k ranked results (~400–500 tokens) vs Grep returning all matches (~5,000–35,000 tokens for common keywords). Saves 90–99% of LLM context window on broad queries.
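The 90–99% figure follows directly from the token ranges quoted above. A short check of the arithmetic, using only those numbers:

```python
def context_saving(index1_tokens: int, grep_tokens: int) -> float:
    """Fraction of context saved by returning index1_tokens instead of grep_tokens."""
    return 1.0 - index1_tokens / grep_tokens


# Least favourable case: largest index1 result vs smallest Grep result.
low = context_saving(500, 5_000)
# Most favourable case: smallest index1 result vs largest Grep result.
high = context_saving(400, 35_000)

print(f"{low:.1%} to {high:.1%}")
# prints: 90.0% to 98.9%
```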

Full benchmark and integration guides:

FAQ

What if Ollama is not running or not installed?

index1 automatically falls back to BM25-only keyword search. However, this comes with significant penalties:

  • 6–8x slower cold queries (connection timeout overhead)
  • 0 results for Chinese/Japanese/Korean semantic queries
  • No cross-language search (Chinese query → English code)

We strongly recommend installing Ollama:

# macOS
brew install ollama
ollama pull nomic-embed-text

# Linux
curl -fsSL https://ollama.ai/install.sh | sh
ollama pull nomic-embed-text

# Windows
# Download from https://ollama.ai/download, then:
ollama pull nomic-embed-text

Ollama runs locally on port 11434 (configurable). All data stays on your machine.

Resource comparison — with vs without Ollama:

| | Without Ollama | With Ollama (nomic-embed-text) |
|---|---|---|
| Disk | 0 | ~270 MB (model file) |
| RAM | 0 | ~500 MB (while running) |
| Cold query | ~1s (timeout) → ~35ms (cached) | 40–180 ms |
| CJK search | 0 results | Full semantic search |
| Cross-language | Not supported | Supported |
| Search mode | BM25 keyword only | BM25 + vector hybrid |

Ollama only uses RAM while running. If you stop ollama serve, RAM is fully released. Disk usage depends on the model — all-minilm is only ~45 MB for machines with limited storage.

How to switch embedding models?

index1 config embedding_model <model-name>
index1 index --force ./docs ./src   # Rebuild index with new model

Can I use multiple projects?

Yes. Use --collection (short form -c) to isolate namespaces:

index1 index ./project-a -c proj_a
index1 index ./project-b -c proj_b
index1 search "query" -c proj_a

Where is the database stored?

Default: ~/.claude-index1/knowledge.db. Override it with index1 config db_path /custom/path.db, or set the INDEX1_HOME environment variable.

Migrating from older versions?

mv ~/.index1 ~/.claude-index1

How to rebuild the index?

index1 index --force ./docs ./src

How to monitor file changes?

index1 watch ./docs ./src

Contributing

git clone https://github.com/gladego/index1.git
cd index1
pip install -e ".[dev]"
pytest

PRs welcome. Please ensure pytest passes before submitting.

Changelog

v0.1.0

  • BM25 + vector hybrid search with RRF fusion
  • Structure-aware chunking (Markdown, Python, Rust, JS/TS)
  • MCP Server with 5 tools for Claude Code integration
  • Web UI with Atom Core animated logo
  • L1/L2 query cache (10min TTL)
  • File watcher for auto-reindex
  • Optional rerank with cosine similarity
  • One-click install script

Requirements

  • Python >= 3.10
  • macOS / Linux / Windows
  • Ollama (optional, for semantic search)

License

Apache 2.0
