Lightweight Python library for tag-based context injection with anti-hallucination safeguards. Zero dependencies, 100% type-safe, <25ms performance.
AI Context Injector
A lightweight Python library for tag-based context injection into LLM prompts with built-in anti-hallucination safeguards.
Extract relevant context from your codebase, memory systems, or custom data sources using simple tags like @memory, @code, or @session in user queries.
Key Features
- Tag-based retrieval: @memory dark keyboard, @code UserService, @session last decision
- Anti-hallucination safeguards: 5 critical rules + automatic citations to prevent LLM fabrication
- Hard project isolation: Never mix contexts from different projects without explicit opt-in
- Plugin architecture: Write custom providers for any data source
- Blazing fast: <25ms total pipeline (proven at 17.7ms in production)
- Zero dependencies: Core library uses only Python stdlib
- 100% type-safe: Full type hints with py.typed marker
- Battle-tested: 219 tests including real-world integration scenarios
Installation
pip install ai-context-injector
Quick Start (5 lines)
from ai_context_injector import ContextInjector
injector = ContextInjector(current_project="my-app")
injector.register_provider("@memory", my_memory_provider)
context = injector.inject("@memory user authentication flow")
# Ready to inject into LLM prompt!
Why Context Injection?
LLMs need relevant context to provide accurate answers. Instead of manually copying code snippets or documentation, use tags to automatically retrieve and format context:
Without context injection:
User: "How does our authentication work?"
LLM: *hallucinates based on general knowledge*
With context injection:
User: "@memory authentication @code AuthService"
System: *retrieves actual decisions + code*
LLM: *answers based on YOUR actual implementation*
Complete Example
from ai_context_injector import ContextInjector, ContextItem, IContextProvider
from datetime import datetime
# 1. Create a custom provider
class MyMemoryProvider(IContextProvider):
    @property
    def name(self) -> str:
        return "MyMemory"

    @property
    def source_type(self) -> str:
        return "memory"

    def is_available(self) -> bool:
        return True

    def retrieve(self, request):
        # Your retrieval logic here (database, files, API, etc.)
        results = search_my_database(request.query, request.project)
        return [
            ContextItem(
                content=result.text,
                source="memory",
                project=request.project,
                metadata={"id": result.id},
                relevance_score=result.score,
                timestamp=result.created_at
            )
            for result in results
        ]
# 2. Initialize injector
injector = ContextInjector(current_project="my-app")
injector.register_provider("@memory", MyMemoryProvider())
# 3. Inject context from user queries
user_query = "Tell me @memory what we decided about the database"
if injector.has_tags(user_query):
    context = injector.inject(user_query)
    # Inject into LLM prompt
    llm_prompt = f"""
{context}
User Question: {injector.extract_query_only(user_query)}
Answer based ONLY on the context above.
"""
    response = llm.generate(llm_prompt)
Anti-Hallucination Safeguards
Every injected context includes 5 critical rules to prevent LLM fabrication:
=== BEGIN CONTEXT ===
CRITICAL RULES FOR USING THIS CONTEXT:
1. ONLY cite information that appears in the context sections below
2. If context is from a different project, CLEARLY state which project
3. Include source citations [memory:project:date] or [code:file:line]
4. If unsure or context missing, say "I don't have information about this"
5. NEVER mix information from different projects without explicit warning
Retrieved Context for: my-app
Found 2 relevant item(s)
--- Context Item 1/2 ---
Source: memory | Project: my-app | Relevance: 0.95 | Date: 2026-05-30
Citation: [memory:my-app:2026-05-30]
Decision: Use PostgreSQL with WAL mode for better performance
--- Context Item 2/2 ---
Source: code | Project: my-app | Relevance: 0.88 | Date: 2026-05-30
Citation: [code:src/db/connection.py:15-23]
def create_connection():
    return psycopg2.connect(
        host="localhost",
        database="myapp",
        options="-c wal_level=replica"
    )
=== END CONTEXT ===
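The two citation formats named in rule 3 can be illustrated with a small stand-alone sketch. `SketchItem` and `make_citation` are hypothetical names used only here; they are not the library's API (the library exposes `ContextItem.citation()`), and the exact formatting logic inside the library may differ.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class SketchItem:
    """Hypothetical stand-in for a context item; not the library's ContextItem."""
    source: str
    project: str
    day: date
    file_path: Optional[str] = None
    line_range: Optional[Tuple[int, int]] = None


def make_citation(item: SketchItem) -> str:
    """Build a citation in the [memory:project:date] or [code:file:line] style."""
    if item.source == "code" and item.file_path and item.line_range:
        start, end = item.line_range
        return f"[code:{item.file_path}:{start}-{end}]"
    return f"[{item.source}:{item.project}:{item.day.isoformat()}]"


memory_item = SketchItem("memory", "my-app", date(2026, 5, 30))
code_item = SketchItem(
    "code", "my-app", date(2026, 5, 30),
    file_path="src/db/connection.py", line_range=(15, 23),
)
print(make_citation(memory_item))  # [memory:my-app:2026-05-30]
print(make_citation(code_item))    # [code:src/db/connection.py:15-23]
```

Keeping the citation next to each item gives the LLM a literal string to copy, which makes it easier to check an answer against its source.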
Tag Syntax
Basic Tags
"@memory user authentication" # Search memory for "user authentication"
"@code AuthService class" # Search code for "AuthService class"
"@session last decision" # Search current session
Modifiers
"@memory:all database design" # Search across ALL projects (with warning)
Multiple Tags
"@memory architecture and @code UserService" # Aggregates from both sources
Project Isolation (Key Differentiator)
By default, context is strictly isolated to the current project:
injector = ContextInjector(current_project="frontend")
# Only retrieves from "frontend" project
context = injector.inject("@memory React components")
# To search across projects, use :all modifier (generates warning)
context = injector.inject("@memory:all authentication patterns")
Why this matters: Prevents LLMs from mixing context across different projects, which is a major source of hallucinations and incorrect answers.
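The isolation rule can be pictured as a filter applied before anything reaches the prompt. The sketch below is an illustration under assumptions, not the library's implementation: `select_for_project` is a hypothetical helper, items are plain dicts, and the warning text is invented for the example.

```python
import warnings
from typing import Dict, List


def select_for_project(
    items: List[Dict[str, str]],
    current_project: str,
    search_all: bool = False,
) -> List[Dict[str, str]]:
    """Keep only the current project's items unless search_all is requested."""
    if not search_all:
        return [i for i in items if i["project"] == current_project]
    # Cross-project search is allowed, but never silently
    foreign = sorted({i["project"] for i in items if i["project"] != current_project})
    if foreign:
        warnings.warn(f"Cross-project context included from: {', '.join(foreign)}")
    return list(items)


items = [
    {"project": "frontend", "content": "React components use hooks"},
    {"project": "backend", "content": "Auth tokens are issued by the API"},
]

isolated = select_for_project(items, "frontend")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    everything = select_for_project(items, "frontend", search_all=True)

print(len(isolated), len(everything), len(caught))
```

Making the default path the restrictive one means a forgotten flag narrows results rather than leaking another project's context.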
Advanced Usage
With Performance Metrics
response = injector.inject_with_metrics("@memory query")
print(f"Found: {response.total_found} items")
print(f"Filtered: {response.filtered_count} items")
print(f"Performance: {response.performance_ms:.2f}ms")
print(f"Filter ratio: {response.filter_ratio:.1%}")
Custom Parser and Formatter
from ai_context_injector import TagParser, ContextFormatter
# Custom parser with additional tags
parser = TagParser(custom_tags={"@docs", "@tickets"})
# Compact formatter for tight token budgets
formatter = ContextFormatter(
    include_metadata=False,
    include_anti_hallucination_rules=False
)
injector = ContextInjector(
    current_project="my-app",
    parser=parser,
    formatter=formatter
)
Relevance Filtering
# Only return items with relevance >= 0.80
context = injector.inject(
    "@memory database optimization",
    min_relevance=0.80,
    max_items=5
)
Custom Provider Examples
Simple In-Memory Provider
from ai_context_injector import IContextProvider, ContextItem
from datetime import datetime
class SimpleProvider(IContextProvider):
    def __init__(self, data):
        self.data = data  # Dict[str, List[tuple]]

    @property
    def name(self) -> str:
        return "Simple"

    @property
    def source_type(self) -> str:
        return "memory"

    def is_available(self) -> bool:
        return True

    def retrieve(self, request):
        items = self.data.get(request.project, [])
        return [
            ContextItem(
                content=content,
                source="memory",
                project=request.project,
                metadata={},
                relevance_score=score,
                timestamp=datetime.now()
            )
            for content, score in items
            if score >= request.min_relevance
        ][:request.max_items]

# Usage
data = {
    "my-app": [
        ("Decision: Use Redis for caching", 0.95),
        ("Learning: Connection pooling improved latency", 0.90),
    ]
}
provider = SimpleProvider(data)
injector.register_provider("@memory", provider)
SQLite Provider
import sqlite3
from ai_context_injector import IContextProvider, ContextItem
from datetime import datetime
class SQLiteProvider(IContextProvider):
    def __init__(self, db_path):
        self.db_path = db_path

    @property
    def name(self) -> str:
        return "SQLite"

    @property
    def source_type(self) -> str:
        return "memory"

    def is_available(self) -> bool:
        try:
            conn = sqlite3.connect(self.db_path)
            conn.close()
            return True
        except sqlite3.Error:
            return False

    def retrieve(self, request):
        conn = sqlite3.connect(self.db_path)
        try:
            cursor = conn.cursor()
            # Simple substring match (LIKE); use SQLite FTS for true full-text search
            cursor.execute("""
                SELECT content, relevance, timestamp
                FROM memories
                WHERE project = ?
                  AND content LIKE ?
                  AND relevance >= ?
                ORDER BY relevance DESC
                LIMIT ?
            """, (
                request.project,
                f"%{request.query}%",
                request.min_relevance,
                request.max_items
            ))
            return [
                ContextItem(
                    content=row[0],
                    source="memory",
                    project=request.project,
                    metadata={},
                    relevance_score=row[1],
                    timestamp=datetime.fromisoformat(row[2])
                )
                for row in cursor.fetchall()
            ]
        finally:
            # Always release the connection, even if the query fails
            conn.close()
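The SQLite provider queries a `memories` table whose schema is not given here. The sketch below shows one schema that would satisfy that query, using an in-memory database and only the stdlib. The column types and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE memories (
        project   TEXT NOT NULL,
        content   TEXT NOT NULL,
        relevance REAL NOT NULL,
        timestamp TEXT NOT NULL  -- ISO 8601 text, parseable by datetime.fromisoformat()
    )
""")
conn.executemany(
    "INSERT INTO memories VALUES (?, ?, ?, ?)",
    [
        ("my-app", "Decision: Use Redis for caching", 0.95, "2026-05-30T10:00:00"),
        ("my-app", "Learning: Connection pooling improved latency", 0.90, "2026-05-30T11:00:00"),
        ("other-app", "Decision: Use Redis for sessions", 0.99, "2026-05-30T12:00:00"),
    ],
)

# Same shape of query as the provider: project filter, substring match,
# relevance threshold, best matches first, capped result count
rows = conn.execute(
    """
    SELECT content, relevance, timestamp
    FROM memories
    WHERE project = ? AND content LIKE ? AND relevance >= ?
    ORDER BY relevance DESC
    LIMIT ?
    """,
    ("my-app", "%Redis%", 0.70, 5),
).fetchall()
conn.close()

print(rows)
```

The `other-app` row scores highest but is excluded by the `project = ?` condition, so isolation is enforced in the query itself rather than after retrieval.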
Filesystem Code Provider
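import json
import subprocess
from pathlib import Path
from ai_context_injector import IContextProvider, ContextItem
from datetime import datetime

class FilesystemCodeProvider(IContextProvider):
    def __init__(self, repo_path):
        self.repo_path = Path(repo_path)

    @property
    def name(self) -> str:
        return "Filesystem"

    @property
    def source_type(self) -> str:
        return "code"

    def is_available(self) -> bool:
        return self.repo_path.exists()

    def retrieve(self, request):
        # Use ripgrep for fast search; "--" stops the query being read as a flag
        result = subprocess.run(
            ["rg", "--json", "--", request.query, str(self.repo_path)],
            capture_output=True,
            text=True
        )
        # ripgrep emits one JSON object per line; objects of type "match"
        # carry the file path, line number, and matched line text
        items = []
        for line in result.stdout.splitlines():
            event = json.loads(line)
            if event.get("type") != "match":
                continue
            data = event["data"]
            line_number = data["line_number"]
            items.append(ContextItem(
                content=data["lines"].get("text", ""),
                source="code",
                project=request.project,
                metadata={},
                relevance_score=1.0,  # Assumption: ripgrep does not score matches
                timestamp=datetime.now(),
                file_path=data["path"].get("text"),
                line_range=(line_number, line_number)
            ))
        return items[:request.max_items]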
API Reference
Core Classes
ContextInjector
Main orchestrator for the entire pipeline.
injector = ContextInjector(
    current_project="my-app",  # Project name (auto-detected from cwd if None)
    parser=None,               # Custom TagParser instance
    formatter=None             # Custom ContextFormatter instance
)
# Register providers
injector.register_provider(tag, provider)
# Inject context
context = injector.inject(
    user_input,          # String with tags
    max_items=10,        # Max total items
    min_relevance=0.70   # Min relevance score (0.0-1.0)
)
# With metrics
response = injector.inject_with_metrics(user_input, max_items=10, min_relevance=0.70)
# Utility methods
injector.has_tags(user_input) # bool
injector.extract_query_only(user_input) # str (tags removed)
IContextProvider
Abstract base class for custom providers.
class MyProvider(IContextProvider):
    @property
    def name(self) -> str:
        """Provider name for logging/debugging."""
        return "MyProvider"

    @property
    def source_type(self) -> str:
        """Source type: 'memory', 'code', 'session', 'custom', etc."""
        return "memory"

    def is_available(self) -> bool:
        """Check if provider is available (DB connected, files exist, etc.)."""
        return True

    def retrieve(self, request: ContextRequest) -> List[ContextItem]:
        """Retrieve context items matching the request."""
        # Your logic here
        return items
ContextItem
Represents a single piece of context.
item = ContextItem(
    content="Decision: Use PostgreSQL",  # Required
    source="memory",                     # Required
    project="my-app",                    # Required
    metadata={},                         # Required (can be empty)
    relevance_score=0.95,                # Required (0.0-1.0)
    timestamp=datetime.now(),            # Required
    file_path=None,                      # Optional (for code)
    line_range=None                      # Optional (for code, tuple)
)
# Generate citation
citation = item.citation() # "[memory:my-app:2026-05-30]"
TagParser
Parses tags from user input.
parser = TagParser(custom_tags={"@docs", "@tickets"})
tags = parser.parse("@memory query") # List[ParsedTag]
has_tags = parser.has_tags("@memory query") # bool
clean = parser.remove_tags("@memory query") # str
parser.register_tag("@custom") # None
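To make the parsing step concrete, here is a stdlib-only regex sketch that splits input into tag, optional modifier, and query. It is not the library's `TagParser`: `SketchTag`, `TAG_RE`, and `parse_tags_sketch` are hypothetical names, and how the real parser treats connecting words such as "and" is not specified here, so this sketch leaves them in the query.

```python
import re
from typing import List, NamedTuple, Optional


class SketchTag(NamedTuple):
    """Hypothetical parsed-tag record; not the library's ParsedTag."""
    tag: str
    modifier: Optional[str]
    query: str


# "@name", an optional ":modifier", then the query text, which runs
# until the next "@name" or the end of the input
TAG_RE = re.compile(r"@(\w+)(?::(\w+))?\s+(.*?)(?=\s+@\w+|$)", re.DOTALL)


def parse_tags_sketch(text: str) -> List[SketchTag]:
    """Return one SketchTag per tag found in the input, in order."""
    return [
        SketchTag(f"@{name}", modifier or None, query.strip())
        for name, modifier, query in TAG_RE.findall(text)
    ]


print(parse_tags_sketch("@memory:all database design"))
print(parse_tags_sketch("@memory architecture and @code UserService"))
```

A tag with no query text after it does not match this pattern, so a bare "@memory" yields no result in the sketch.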
ContextFormatter
Formats context items for LLM injection.
formatter = ContextFormatter(
    include_metadata=True,
    include_citations=True,
    include_anti_hallucination_rules=True
)
response = formatter.format(items, current_project)
compact = formatter.format_compact(items)
single = formatter.format_single(item, include_delimiters=True)
Convenience Functions
from ai_context_injector import inject_context, parse_tags, format_context
# Quick injection
context = inject_context(
    user_input="@memory query",
    providers={"@memory": my_provider},
    project="my-app",
    max_items=10,
    min_relevance=0.70
)
# Quick parsing
tags = parse_tags("@memory query", custom_tags={"@docs"})
# Quick formatting
response = format_context(items, current_project="my-app")
Architecture
User Input: "@memory dark keyboard @code KeyboardView"
│
├─> TagParser
│ ├─> ParsedTag(tag="@memory", query="dark keyboard")
│ └─> ParsedTag(tag="@code", query="KeyboardView")
│
├─> ContextInjector
│ ├─> Route to providers
│ │ ├─> MemoryProvider.retrieve() → [ContextItem, ...]
│ │ └─> CodeProvider.retrieve() → [ContextItem, ...]
│ │
│ ├─> Aggregate results
│ ├─> Deduplicate (first 100 chars hash)
│ ├─> Sort by relevance
│ └─> Limit to max_items
│
└─> ContextFormatter
├─> Add delimiters
├─> Add anti-hallucination rules
├─> Add metadata + citations
├─> Check cross-project warnings
└─> Format items
│
└─> Formatted context string (ready for LLM)
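The aggregation stage in the diagram (deduplicate on the first 100 characters, sort by relevance, limit to max_items) can be sketched in a few lines. This is an illustration under assumptions: items are simplified to `(content, relevance_score)` tuples, `aggregate` is a hypothetical function, and SHA-256 is an arbitrary choice of hash since the diagram does not name one.

```python
import hashlib
from typing import List, Set, Tuple

Item = Tuple[str, float]  # (content, relevance_score); simplified stand-in


def aggregate(results: List[List[Item]], max_items: int = 10) -> List[Item]:
    """Merge per-provider results, drop near-duplicates, rank, and truncate."""
    seen: Set[str] = set()
    unique: List[Item] = []
    for provider_items in results:
        for content, score in provider_items:
            # Two items count as duplicates when their first 100 characters match
            key = hashlib.sha256(content[:100].encode("utf-8")).hexdigest()
            if key in seen:
                continue  # the first occurrence wins, whatever its score
            seen.add(key)
            unique.append((content, score))
    unique.sort(key=lambda item: item[1], reverse=True)
    return unique[:max_items]


memory_items = [
    ("Decision: Use Redis for caching", 0.95),
    ("Learning: Connection pooling improved latency", 0.90),
]
code_items = [
    ("Decision: Use Redis for caching", 0.80),  # duplicate of the memory item
    ("class KeyboardView: ...", 0.97),
]
top = aggregate([memory_items, code_items], max_items=2)
print(top)
```

Hashing only a prefix keeps deduplication cheap for long items, at the cost of treating two items that share an opening but diverge later as the same.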
Performance
Target: <25ms total pipeline
Proven: 17.7ms in production (memory-system)
Breakdown:
- Parsing: <1ms
- Provider retrieval: ~10ms (depends on your provider)
- Aggregation/dedup: ~2ms
- Sorting: ~2ms
- Formatting: ~2ms
Tip: Provider performance is usually the bottleneck. Use indexes, caching, and efficient queries.
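One way to check where time goes in your own integration is to time each stage separately. The sketch below uses only `time.perf_counter`; the `timed` helper and the stage names are hypothetical, and the lambdas stand in for real provider and sorting work.

```python
import time
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


def timed(stage: str, timings: Dict[str, float], func: Callable[[], T]) -> T:
    """Run func and record its wall-clock duration in milliseconds."""
    start = time.perf_counter()
    try:
        return func()
    finally:
        timings[stage] = (time.perf_counter() - start) * 1000.0


timings: Dict[str, float] = {}
items = timed("retrieval", timings, lambda: [("b", 0.80), ("a", 0.95)])
ranked = timed(
    "sorting", timings,
    lambda: sorted(items, key=lambda i: i[1], reverse=True),
)
slowest = max(timings, key=timings.get)

for stage, ms in timings.items():
    print(f"{stage}: {ms:.3f}ms")
print(f"slowest stage: {slowest}")
```

If retrieval dominates, as the tip suggests it usually will, the fix belongs in the provider (an index or a cache) rather than in the injector.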
Testing
# Run all tests
pytest
# Run with coverage
pytest --cov=ai_context_injector
# Run only unit tests
pytest tests/core/
# Run only integration tests
pytest tests/test_integration.py
# Run specific test
pytest tests/core/test_injector.py::TestBasicInjection::test_inject_with_single_tag
Design Principles
- Explicit over implicit: Empty providers dict by default, users must register
- Project isolation by default: Never auto-mix contexts across projects
- Zero magic: No auto-discovery, no global state, no hidden dependencies
- Library not framework: Integrate into your app, don't build around it
- Performance matters: <25ms target, every millisecond counts
- Type safety: Full type hints, no Any types in public API
- Fail gracefully: Missing/unavailable providers return None, not exceptions
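The "fail gracefully" principle can be sketched as a lookup that checks registration and availability before calling a provider. Everything here is hypothetical and simplified for illustration: `SketchProvider`, `StaticProvider`, `DownProvider`, and `safe_retrieve` are not part of the library, and `retrieve` takes a plain string instead of a request object.

```python
from typing import Dict, List, Optional, Protocol


class SketchProvider(Protocol):
    """Minimal provider shape assumed for this sketch."""
    def is_available(self) -> bool: ...
    def retrieve(self, query: str) -> List[str]: ...


class StaticProvider:
    def __init__(self, items: List[str]) -> None:
        self.items = items

    def is_available(self) -> bool:
        return True

    def retrieve(self, query: str) -> List[str]:
        return [i for i in self.items if query.lower() in i.lower()]


class DownProvider:
    """Simulates a provider whose backing store is unreachable."""
    def is_available(self) -> bool:
        return False

    def retrieve(self, query: str) -> List[str]:
        raise RuntimeError("must not be called while unavailable")


def safe_retrieve(
    providers: Dict[str, SketchProvider], tag: str, query: str
) -> Optional[List[str]]:
    """Return None for unregistered or unavailable providers instead of raising."""
    provider = providers.get(tag)
    if provider is None or not provider.is_available():
        return None
    return provider.retrieve(query)


providers: Dict[str, SketchProvider] = {
    "@memory": StaticProvider(["Decision: Use Redis for caching"]),
    "@session": DownProvider(),
}
print(safe_retrieve(providers, "@memory", "redis"))
print(safe_retrieve(providers, "@session", "redis"))
print(safe_retrieve(providers, "@docs", "redis"))
```

Returning None lets the caller distinguish "no provider answered" from "the provider answered with an empty list".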
FAQ
Why not just use RAG?
RAG is great for semantic search, but context injection solves a different problem:
- Structured retrieval: Tag-based routing to different data sources
- Multi-source aggregation: Combine memory + code + session in one query
- Project isolation: Hard boundaries prevent context mixing
- Anti-hallucination: Built-in safeguards with citations
You can use RAG as a provider in this system!
Why project isolation by default?
In multi-project environments, mixing contexts causes LLMs to:
- Suggest code patterns from the wrong project
- Reference APIs that don't exist in the current project
- Mix architectural decisions across boundaries
Hard isolation by default prevents these errors. Use the :all modifier when you actually want cross-project search.
Can I use this with any LLM?
Yes! This library just generates formatted context strings. Use them with:
- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude)
- Local models (Ollama, LM Studio)
- Any LLM API that accepts text prompts
How do I handle large codebases?
- Smart providers: Use search indexes (ripgrep, SQLite FTS, Elasticsearch)
- Relevance filtering: Set min_relevance=0.80 to reduce noise
- Limit results: Use max_items=5 for tight token budgets
- Compact format: Disable metadata with ContextFormatter(include_metadata=False)
Can I use this in production?
Yes! The core library has:
- 219 tests (100% passing)
- No external dependencies
- Battle-tested in the memory-system project
- Measured pipeline time of 17.7ms in production (under the 25ms target)
Just write solid providers and you're good to go.
Contributing
Contributions welcome! Please:
- Add tests for new features
- Follow existing code style
- Update documentation
- Run pytest before submitting
License
MIT License - see LICENSE file for details
Acknowledgments
Ported from the context injection system in the memory-system project, which has been battle-tested in production for multi-project development workflows.