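A very powerful tool that turns any IT project into a single AI-context Markdown document that you can hand to an LLM or a knowledge base, greatly reducing LLM hallucinations. Python projects additionally get dedicated AST (abstract syntax tree) parsing.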
nb_ai_context
🚀 An extremely powerful AI context generator - Merge any IT project into a single structured Markdown document for AI LLMs or RAG knowledge bases.
What is nb_ai_context?
nb_ai_context does not simply merge a project's files. It is a context optimization tool designed specifically for AI code interaction, with the core goal of reducing AI hallucinations.
- nb_ai_context packages any IT project into a single Markdown file for AI to learn and understand.
- You can search for "repomix" to understand its purpose. nb_ai_context generates documents that are far better suited for AI learning, especially for Python projects. repomix simply merges the contents of multiple files, while nb_ai_context does much more than merging.
- Why do you need nb_ai_context? Because third-party packages like google-genai, langchain, and pydantic have APIs that change too quickly. If you don't provide the latest documentation, AI will write outdated code against old package versions, and sometimes even the imports won't work. You need to give the AI up-to-date tutorial documents so it can write correct code. Users shouldn't have to settle for old, outdated Python package versions just to use AI.
✨ Core Features
- ✅ AI Reading Guide - Adds instructions to help AI models understand document structure and reduce hallucinations
- ✅ File Dependencies Analysis - Analyzes import relationships, identifies entry points and core modules
- ✅ AST Metadata Extraction - Extracts class/function signatures from Python files without full source code
- ✅ Smart File Merging - Supports .gitignore, file filtering, directory exclusion
- ✅ Clear File Boundaries - Each file marked with project name and path for easy AI identification
- ✅ GitHub Project Support - Generate docs directly from GitHub zip URLs
- ✅ Chainable API - Elegant fluent interface for building context
📦 Installation
pip install nb_ai_context
Requirements:
- Python >= 3.7
- nb_path
- nb_log
🚀 Quick Start
Basic Usage (from examples/AiMdGenerator_example.py)
from nb_ai_context import AiMdGenerator
project_name = "nb_ai_context"
project_root = rf"D:\codes\{project_name}"
project_summary = f"""
- `{project_name}` is a powerful ai llm context generator library, it is used for ai llm and rag
- `AiMdGenerator(...)` is the main class to create ai context for llm.
"""
(
    AiMdGenerator(
        rf"D:\codes\nb_ai_context\ai_md_files_demo\{project_name}_all_docs_and_codes.md"
    )
    .set_project_propery(project_name=project_name, project_root=project_root)
    .ensure_parent()
    .clear_text()
    .add_ai_reading_guide()  # Add AI reading guide to help AI understand document structure
    .add_project_summary(
        project_summary=project_summary,
        most_core_source_code_file_list=[
            "nb_ai_context/__init__.py",
            "nb_ai_context/ai_md_generator.py",
            "nb_ai_context/contrib/gen_github_proj_ai_md.py",
        ],
    )
    .auto_merge_from_python_project_some_files()
    .show_textfile_info()
    .merge_from_dir(
        relative_dir_name='examples',
        use_gitignore=True,
        as_title=f"{project_name} examples",
        should_include_suffixes=[".py", ".md"],
        excluded_dir_name_list=[],
        include_ast_metadata=True,
    )
    .merge_from_dir(
        relative_dir_name=project_name,
        use_gitignore=True,
        as_title=f"{project_name} codes",
        should_include_suffixes=[".py", ".md"],
        excluded_dir_name_list=[],
        include_ast_metadata=True,
    )
    .show_textfile_info()
)
From GitHub Projects
from nb_ai_context import gen_github_proj_docs_and_codes_ai_md
gen_github_proj_docs_and_codes_ai_md(
    github_zip_url="https://codeload.github.com/fastapi/sqlmodel/zip/refs/heads/main",
    output_md_path=r"D:\ai_docs\sqlmodel_all_docs_and_codes.md",
    readme_file="README.md",
    docs_dir_name="docs",
    codes_dir_name="sqlmodel",
    should_include_suffixes=[".py", ".md"],
    excluded_dir_name_list=["tests", "__pycache__"],
)
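For intuition about the first step of this workflow, fetching and unpacking a GitHub zip archive needs only the standard library. The sketch below is not nb_ai_context's implementation: `download_zip`, `extract_zip_bytes`, and `make_demo_zip` are hypothetical helper names, and the demo builds a tiny archive in memory so it runs without network access.

```python
import io
import tempfile
import urllib.request
import zipfile
from pathlib import Path


def download_zip(url: str) -> bytes:
    """Fetch the raw bytes of a zip archive (requires network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def extract_zip_bytes(data: bytes, target_dir: str) -> list:
    """Unpack an in-memory zip archive and return its sorted member names."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        archive.extractall(target_dir)
        return sorted(archive.namelist())


def make_demo_zip() -> bytes:
    """Build a tiny archive in memory so the sketch can run offline."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("demo-main/README.md", "# demo\n")
        archive.writestr("demo-main/demo/__init__.py", "VERSION = '0.1'\n")
    return buffer.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(extract_zip_bytes(make_demo_zip(), tmp))
        print((Path(tmp) / "demo-main" / "README.md").read_text())
```

With a real repository you would pass the result of `download_zip(github_zip_url)` to `extract_zip_bytes` and then point the generator at the extracted directory.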
📖 API Reference
AiMdGenerator Class
The core class for generating AI context. Inherits from NbPath and supports chainable calls.
Methods
| Method | Description |
|---|---|
| `set_project_propery(project_name, project_root)` | Required first. Set project name and root directory |
| `add_ai_reading_guide()` | Add AI reading instructions to reduce hallucinations |
| `add_project_summary(project_summary, most_core_source_code_file_list)` | Add project summary with core file AST metadata |
| `add_file_dependencies(file_list)` | Analyze and add file dependency graph |
| `auto_merge_from_python_project_some_files()` | Auto-merge README.md, setup.py, pyproject.toml |
| `merge_from_files(file_list, as_title)` | Merge specific files |
| `merge_from_dir(relative_dir_name, as_title, ...)` | Merge entire directory with filters |
| `merge_from_files_with_metadata(...)` | Advanced merge with metadata control |
| `show_textfile_info()` | Display generated file statistics |
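Chainable calls work because each method returns the object itself. The following is a minimal sketch of that pattern only; `MarkdownBuilder` and its methods are hypothetical stand-ins and say nothing about how `AiMdGenerator` is implemented internally.

```python
class MarkdownBuilder:
    """Hypothetical chainable builder; not the real AiMdGenerator."""

    def __init__(self):
        self.sections = []

    def add_heading(self, title: str) -> "MarkdownBuilder":
        self.sections.append(f"# {title}")
        return self  # returning self is what makes chaining possible

    def add_text(self, text: str) -> "MarkdownBuilder":
        self.sections.append(text)
        return self

    def build(self) -> str:
        """Join the collected sections with blank lines between them."""
        return "\n\n".join(self.sections)


document = (
    MarkdownBuilder()
    .add_heading("my_project project summary")
    .add_text("A short description.")
    .build()
)
print(document)
```

Wrapping the chain in parentheses, as the Quick Start example does, lets each call sit on its own line without backslashes.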
merge_from_dir Parameters
.merge_from_dir(
    relative_dir_name="src",          # Directory relative to project_root
    as_title="Source Code",           # Section title in markdown
    project_root=None,                # Override project root (optional)
    should_include_suffixes=[".py"],  # File extensions to include
    excluded_dir_name_list=[],        # Directories to exclude
    excluded_file_name_list=[],       # Files to exclude
    use_gitignore=True,               # Respect .gitignore rules
    dry_run=False,                    # Preview mode (no actual generation)
    include_ast_metadata=True,        # Include Python AST metadata
)
GitHub Helper Functions
| Function | Description |
|---|---|
| `gen_github_proj_docs_and_codes_ai_md(...)` | Generate docs from GitHub repo with separate docs/codes directories |
| `gen_github_proj_all_dirs_ai_md(...)` | Generate docs from entire GitHub repo |
🎨 Generated Markdown Structure
# 🤖 AI Reading Guide for Project: my_project
(Instructions for AI models)
# markdown content namespace: my_project project summary
(Project description)
## 📋 my_project most core source files metadata
(AST metadata for core files - no source code)
## 🔗 my_project File Dependencies Analysis
(Import relationships and dependency graph)
# markdown content namespace: my_project Source Code
## my_project File Tree (relative dir: `src`)
(Directory tree)
## my_project Included Files (total: X files)
(File list)
--- **start of file: src/main.py** (project: my_project) ---
### 📄 Python File Metadata: `src/main.py`
(AST metadata)
```python
(Full source code)
```
--- **end of file: src/main.py** (project: my_project) ---
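The start/end markers in this structure are easy to reproduce by hand. As a rough sketch, assuming the marker format shown above, a hypothetical `wrap_file` helper (not part of the nb_ai_context API) could wrap one file like this; it omits the AST metadata section for brevity.

```python
def wrap_file(project: str, rel_path: str, source: str, lang: str = "python") -> str:
    """Wrap one file's source in start/end markers that repeat project and path."""
    fence = "`" * 3  # built at runtime to avoid nesting literal fences here
    return "\n".join([
        f"--- **start of file: {rel_path}** (project: {project}) ---",
        f"{fence}{lang}",
        source.rstrip("\n"),
        fence,
        f"--- **end of file: {rel_path}** (project: {project}) ---",
    ])


if __name__ == "__main__":
    print(wrap_file("my_project", "src/main.py", "print('hi')\n"))
```

Repeating the project name and path in both markers means a model that lands in the middle of a long document can still tell which file it is reading.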
🐍 Python AST Metadata Extraction
For Python files, automatically extracts:
- Module docstrings
- Import statements
- Class definitions (name, bases, decorators, docstring, methods, properties, class variables)
- Function definitions (name, parameters with types/defaults, return type, decorators, docstring)
- Constructor (`__init__`) details
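To illustrate the idea, here is a small sketch that pulls similar metadata out of a source string with Python's standard `ast` module. It is a simplified assumption of how such extraction can work, not the library's code; `extract_metadata`, `describe_function`, and `SAMPLE` are hypothetical names, and it covers only names, parameters, and docstrings.

```python
import ast

SAMPLE = '''
"""Demo module."""
import os

class Greeter:
    """Says hello."""

    def __init__(self, name):
        self.name = name

    def greet(self, punctuation="!"):
        """Return a greeting."""
        return "Hello " + self.name + punctuation

def helper(x, y=1):
    return x + y
'''

FUNCTION_NODES = (ast.FunctionDef, ast.AsyncFunctionDef)


def describe_function(node):
    """Summarize a function node: name, positional parameter names, docstring."""
    return {
        "name": node.name,
        "params": [arg.arg for arg in node.args.args],
        "docstring": ast.get_docstring(node),
    }


def extract_metadata(source):
    """Collect module-level metadata without executing the code."""
    tree = ast.parse(source)
    metadata = {
        "module_docstring": ast.get_docstring(tree),
        "imports": [],
        "classes": [],
        "functions": [],
    }
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.Import):
            metadata["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            metadata["imports"].append(node.module or ".")
        elif isinstance(node, ast.ClassDef):
            methods = [describe_function(n) for n in node.body
                       if isinstance(n, FUNCTION_NODES)]
            metadata["classes"].append({
                "name": node.name,
                "docstring": ast.get_docstring(node),
                "methods": methods,
            })
        elif isinstance(node, FUNCTION_NODES):
            metadata["functions"].append(describe_function(node))
    return metadata


if __name__ == "__main__":
    print(extract_metadata(SAMPLE))
```

Because `ast.parse` only parses the text, this works on files whose imports are not installed, which is what makes signature-only summaries cheap to produce.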
🔒 Security
- Automatically respects `.gitignore` rules when `use_gitignore=True`
- Excludes hidden directories (starting with `.`)
- Supports manual exclusion of sensitive directories/files
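The directory rules above can be approximated with `os.walk`. The sketch below is an assumption about the general technique, not the library's code: `collect_files` and `build_demo_tree` are hypothetical helpers, and `.gitignore` matching is deliberately left out.

```python
import os
import tempfile
from pathlib import Path


def collect_files(root, include_suffixes, excluded_dir_names=()):
    """Return relative paths under root, skipping hidden and excluded directories."""
    root = Path(root)
    selected = []
    for current, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(
            d for d in dirnames
            if not d.startswith(".") and d not in excluded_dir_names
        )
        for filename in sorted(filenames):
            path = Path(current) / filename
            if path.suffix in include_suffixes:
                selected.append(path.relative_to(root).as_posix())
    return selected


def build_demo_tree(root):
    """Create a small throwaway project layout for the demo."""
    for rel in ["README.md", "src/main.py", "src/notes.txt",
                ".git/hook.py", "tests/test_main.py"]:
        target = Path(root) / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("# placeholder\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_tree(tmp)
        print(collect_files(tmp, {".py", ".md"}, {"tests"}))
```

Pruning `dirnames` in place, rather than filtering results afterwards, keeps the walk from ever reading inside `.git` or other excluded directories.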
🎯 Use Cases
- AI Code Review - Let AI analyze entire project for quality, security, performance
- RAG Knowledge Base - Import structured project docs into vector databases
- Project Documentation - Generate comprehensive project overview for new team members
- Learning Open Source - Quickly understand GitHub project architecture with AI assistance
🔗 Links
- GitHub: https://github.com/ydf0509/nb_ai_context
- PyPI: https://pypi.org/project/nb_ai_context/
- Issues: https://github.com/ydf0509/nb_ai_context/issues
📄 License
MIT License
nb_ai_context vs repomix: Professional Analysis on Reducing AI Hallucinations
- `nb_ai_context` is a byproduct of `nb_path`. `AiMdGenerator` inherits from `NbPath`, so it also supports unlimited method chaining, making it easy to merge multiple source folders into one Markdown file in a single chain. `nb_ai_context` has since been split out into its own package because generating AI context is harder, more complex, and requires more skill than file path operations.
- `repomix` is the top-tier third-party library for packaging IT project code into a single file, but `nb_ai_context` surpasses `repomix` in almost every aspect.
- `nb_ai_context` is driven from Python code with unlimited method chaining and a variety of methods, which is much more flexible than repomix's command-line approach. For example, it supports custom prompt engineering. Users can specify the most important core files via `most_core_source_code_file_list`, helping AI clearly understand the core APIs of a third-party package or of your own project, and can add custom AI prompts through the `project_summary` parameter.
- Users can verify whether `nb_ai_context` is really powerful or the author is just bragging. The file `ai_md_files_demo/nb_ai_context_all_docs_and_codes.md` in this project was generated by `nb_ai_context`. You can upload `nb_ai_context_all_docs_and_codes.md` to Google AI Studio and let AI help you master `nb_ai_context`, to see whether AI can learn to use an obscure third-party package it was never trained on.
Core Design Philosophy Comparison
nb_ai_context
Designed specifically to reduce AI hallucinations, with multiple targeted features explicitly mentioned in the documentation:
- Detailed AI reading guide (explicitly tells AI how to understand document structure)
- Strict file boundary markers (clearly identifies start/end of each file)
- AST metadata extraction (lets AI understand code structure before seeing source code)
- Project dependency analysis (helps AI understand inter-module relationships)
- Forced path verification (requires AI to verify file paths exist when suggesting code changes)
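As an illustration of the dependency analysis idea, the sketch below builds a tiny import graph with the standard `ast` module and treats modules that nothing else imports as entry points. The names `imported_modules`, `dependency_graph`, `entry_points`, and `DEMO_SOURCES` are hypothetical, and the logic is a simplified assumption rather than the library's algorithm (relative imports are ignored).

```python
import ast


def imported_modules(source):
    """Return the set of absolute module names imported anywhere in source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module)
    return names


def dependency_graph(sources):
    """Map each project module to the sorted project modules it imports."""
    project = set(sources)
    return {
        module: sorted((imported_modules(code) & project) - {module})
        for module, code in sources.items()
    }


def entry_points(graph):
    """Modules that no other project module imports."""
    imported = {dep for deps in graph.values() for dep in deps}
    return sorted(module for module in graph if module not in imported)


DEMO_SOURCES = {
    "app": "import core\nimport os\n",
    "core": "from utils import helper\n",
    "utils": "import json\n",
}

if __name__ == "__main__":
    graph = dependency_graph(DEMO_SOURCES)
    print(graph)                # {'app': ['core'], 'core': ['utils'], 'utils': []}
    print(entry_points(graph))  # ['app']
```

Standard-library and third-party imports such as `os` and `json` drop out because only names present in the project's own module set are kept.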
repomix
Mainly focused on codebase aggregation, with the design goal of converting codebases into a single text file:
- Simple file separation markers
- Basic file filtering capability
- Preserves original code structure
- Lacks deep design specifically for AI understanding and reducing hallucinations
Key Feature Comparison for Reducing AI Hallucinations
| Feature | nb_ai_context | repomix |
|---|---|---|
| AI Reading Guide | ✅ Detailed guide explicitly telling AI how to understand document structure | ❌ Basically none |
| File Boundary Identification | ✅ Strict project name + path identification to prevent file confusion | ⚠️ Simple file separators |
| Code Structure Preview | ✅ AST metadata extraction (class/function signatures, docstrings) | ❌ None, shows source code directly |
| Dependency Analysis | ✅ Visualizes inter-module dependencies, helps AI understand architecture | ❌ None |
| Core Entry Point Identification | ✅ Clearly identifies core files and entry points | ❌ None |
| Path Verification Requirements | ✅ Explicit instructions requiring AI to verify file paths | ❌ No explicit guidance |
| Project Summary | ✅ Structured project overview helps AI quickly grasp key points | ⚠️ Limited description capability |
| Hidden/Sensitive File Handling | ✅ Supports .gitignore and manual exclusion of sensitive content | ⚠️ Basic filtering |
Practical Effect Comparison
When providing context generated by these tools to AI models:
nb_ai_context Advantages
- Reduces file path hallucinations: By forcing AI to "check file paths" and "verify file paths exist in the File Tree", it nearly eliminates the problem of AI fabricating non-existent files
- Reduces architectural misunderstanding: Through dependency graphs and AST metadata, AI more easily understands overall project architecture and won't incorrectly assume inter-module relationships
- Precise code references: Strictly marked file boundaries enable AI to accurately reference specific files and line numbers when answering
- More comprehensive context understanding: Project summaries and core file analysis help AI quickly grasp project focus instead of getting lost in details
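The same path check that the reading guide asks the AI to perform can also be run on the AI's answer afterwards. This is a hedged sketch of that idea, assuming you already have the list of paths from the File Tree; `normalize` and `missing_paths` are hypothetical helpers, not functions provided by nb_ai_context.

```python
from pathlib import PurePosixPath


def normalize(path):
    """Use forward slashes and drop redundant './' components."""
    return PurePosixPath(path.replace("\\", "/")).as_posix()


def missing_paths(suggested_paths, file_tree_paths):
    """Return the suggested paths that do not appear in the file tree."""
    known = {normalize(p) for p in file_tree_paths}
    return [normalize(p) for p in suggested_paths if normalize(p) not in known]


if __name__ == "__main__":
    tree = ["src/main.py", "src/utils.py"]
    suggested = ["./src/main.py", "src\\utils.py", "src/ghost.py"]
    print(missing_paths(suggested, tree))  # ['src/ghost.py']
```

Any path this returns is one the model referenced without it existing in the project, which is a cheap signal that part of the answer may be fabricated.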
repomix Limitations
- Blurred boundaries: Simple file separators may cause AI to confuse content from different files
- Lack of guidance: No explicit instructions on how AI should interpret document structure, increasing hallucination risk
- Insufficient deep understanding: Directly exposes complete source code without providing code structure preview, making it difficult for AI to quickly grasp project architecture
Conclusion: nb_ai_context is Significantly Stronger at Reducing AI Hallucinations
nb_ai_context is not just a code aggregation tool, but a context optimization system specifically designed for AI-code interaction. It explicitly targets "reducing hallucinations" as a core goal, repeatedly emphasizing in the documentation:
⚠️ Important Notes
- Do NOT hallucinate: Only reference code, classes, functions, and APIs that actually exist in this document
- Check file paths: When suggesting code changes, always verify the file path exists in the File Tree
- Respect the project structure: The File Tree shows the actual directory layout
repomix, by contrast, is a general code aggregation tool without a design that specifically targets AI hallucinations. For scenarios requiring high-quality AI code understanding, review, or generation, nb_ai_context provides a more professional solution.
If you're preparing code context for AI systems, especially in enterprise applications or security-sensitive scenarios, nb_ai_context's professional design will significantly reduce the risk of AI producing dangerous hallucinations.
nb_ai_context - Let AI truly understand your code 🚀