
Filesystem tools for AI agents with optional RAG capabilities


Contextinator

Turn any codebase into semantically-aware, searchable knowledge for AI-powered workflows.

Key Features

  • AST-Powered Chunking - Extract functions, classes, and methods from 23+ programming languages
  • Parent-Child Relationships - Maintain hierarchical chunk-context for complete understanding
  • Semantic Search - Find relevant code using natural language queries
  • Multiple Search Modes - Semantic, symbol-based, pattern matching, and hybrid search
  • Smart Deduplication - Hash-based detection of duplicate code
  • TOON Format Export - Token-efficient output format for LLM prompts (40-60% token savings)
  • Full Pipeline Automation - One command to chunk, embed, and store
  • Docker-Ready - ChromaDB server included
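To make the chunking, parent-child, and deduplication ideas concrete, here is a minimal sketch. It is not Contextinator's implementation: it swaps tree-sitter for Python's standard-library `ast` module, so it only handles Python source. The names `Chunk`, `extract_chunks`, and `deduplicate` are hypothetical and chosen for illustration only.

```python
import ast
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    name: str               # qualified name, e.g. "Auth.login"
    kind: str               # "class" or "function"
    parent: Optional[str]   # qualified name of the enclosing chunk, if any
    source: str             # exact source text of the definition
    content_hash: str       # SHA-256 of the source, used for deduplication


def extract_chunks(code: str) -> list[Chunk]:
    """Walk the AST and emit one chunk per class, function, or method."""
    tree = ast.parse(code)
    chunks: list[Chunk] = []

    def visit(node: ast.AST, parent: Optional[str]) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                kind = "class" if isinstance(child, ast.ClassDef) else "function"
                qualified = f"{parent}.{child.name}" if parent else child.name
                source = ast.get_source_segment(code, child) or ""
                digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
                chunks.append(Chunk(qualified, kind, parent, source, digest))
                # Nested definitions record this chunk as their parent.
                visit(child, qualified)
            else:
                visit(child, parent)

    visit(tree, None)
    return chunks


def deduplicate(chunks: list[Chunk]) -> list[Chunk]:
    """Keep the first chunk for each distinct content hash."""
    seen: set[str] = set()
    unique: list[Chunk] = []
    for chunk in chunks:
        if chunk.content_hash not in seen:
            seen.add(chunk.content_hash)
            unique.append(chunk)
    return unique


SAMPLE = '''
class Auth:
    def login(self, user):
        return user is not None

def helper():
    return 1

def helper():
    return 1
'''

chunks = extract_chunks(SAMPLE)
unique = deduplicate(chunks)
```

In this sketch the method `Auth.login` carries `parent == "Auth"`, so a retriever could pull in the enclosing class when it returns the method. The two byte-identical `helper` definitions share one hash, so only the first survives deduplication.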

Use Cases

Agentic AI Systems:

  • Dynamic code retrieval for autonomous coding agents
  • Context provision for code generation
  • Multi-step reasoning over large codebases
  • Tool integration for agent frameworks

RAG Applications:

  • High-precision code retrieval for question answering
  • Context injection for code explanation and documentation
  • Semantic code search across repositories
  • Parent-child relationship tracking for complete context

Code Intelligence:

  • Cross-repository code search and discovery
  • Duplicate and similar code detection
  • Legacy codebase analysis and understanding
  • MCP-compliant async architecture

Getting Started

Prerequisites

  • Python 3.11 or higher
  • Docker (for ChromaDB)
  • OpenAI API key (for embeddings)
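The prerequisites above can be checked before installing with a small preflight script. This is a sketch under stated assumptions: the function name `missing_prerequisites` is hypothetical, and `OPENAI_API_KEY` is the environment variable conventionally read by the OpenAI client library. Whether Contextinator reads the key from that variable is an assumption; USAGE.md is the authority on configuration.

```python
import os
import shutil
import sys


def missing_prerequisites(version=sys.version_info[:2], env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = []
    if tuple(version) < (3, 11):
        problems.append("Python 3.11 or higher is required")
    if which("docker") is None:
        problems.append("docker executable not found on PATH")
    # Assumption: the API key is supplied through OPENAI_API_KEY.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(f"missing: {problem}")
```

The version, environment, and lookup function are parameters so the check can be exercised without changing the real environment.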

Installation

pip install contextinator

Verify the installation (requires ChromaDB and an OpenAI API key to be set up):

contextinator --help

For detailed setup and configuration, see USAGE.md

Quick Start

  1. Index a repository:
contextinator chunk-embed-store-embeddings \
  --repo-url https://github.com/user/repo \
  --save \
  --collection-name MyRepo
  2. Search your codebase:
# Natural language semantic search
contextinator search "authentication logic" -c MyRepo

# Find specific functions
contextinator symbol authenticate_user -c MyRepo

# Export results in TOON format for LLM consumption
contextinator search "error handling" -c MyRepo --toon results.json
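When the CLI is driven from a script or an agent tool, the search command shown above can be assembled as an argument list and run with `subprocess`. The sketch below uses only the subcommand and flags that appear in the examples above (`search`, `-c`, `--toon`). The helper names `build_search_command` and `run_search` are hypothetical and are not part of Contextinator's Python API, which is documented in USAGE.md.

```python
import subprocess
from typing import Optional


def build_search_command(query: str, collection: str,
                         toon_path: Optional[str] = None) -> list[str]:
    """Build the argv list for `contextinator search`, mirroring the CLI examples."""
    cmd = ["contextinator", "search", query, "-c", collection]
    if toon_path is not None:
        cmd += ["--toon", toon_path]
    return cmd


def run_search(query: str, collection: str,
               toon_path: Optional[str] = None) -> str:
    """Run the search and return whatever the CLI prints to stdout."""
    result = subprocess.run(
        build_search_command(query, collection, toon_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the query as a single list element avoids shell quoting issues with multi-word queries such as "error handling".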

For comprehensive CLI and Python API documentation, see USAGE.md

Acknowledgements

Built with and inspired by amazing open-source projects:

Core Technologies

  • tree-sitter - Incremental parsing system for AST generation
  • ChromaDB - AI-native embedding database
  • OpenAI - Embedding generation API

Inspired By

License

Licensed under the Apache License, Version 2.0. See LICENSE for details.

TL;DR Contextinator

Contextinator is a code intelligence tool that uses Abstract Syntax Tree (AST) parsing to extract semantic code chunks, generates embeddings, and stores them in a vector database. This enables AI systems to understand, navigate, and reason about codebases with precision.
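The embed, store, and search stages described above can be illustrated with a toy, dependency-free sketch. Everything here is a stand-in: `toy_embed` uses word counts instead of OpenAI embeddings, and `InMemoryStore` is a plain list instead of ChromaDB. The names are hypothetical, and the sketch only shows the shape of the pipeline, not how Contextinator implements it.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Stand-in for a vector database: keeps (id, text, vector) tuples in a list."""

    def __init__(self) -> None:
        self._items: list[tuple[str, str, Counter]] = []

    def add(self, chunk_id: str, text: str) -> None:
        self._items.append((chunk_id, text, toy_embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        q = toy_embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[2]), reverse=True)
        return [chunk_id for chunk_id, _, _ in ranked[:k]]


store = InMemoryStore()
store.add("auth.login", "def login(user, password): check the user password and start a session")
store.add("db.connect", "def connect(url): open a database connection pool")
top = store.search("user password check")
```

A real embedding model would also match queries that share meaning but no words with the chunk, which word counts cannot do.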

