Deterministic codebase context for AI coding agents
sourcecode
Compressed AI-ready context for Java/Spring enterprise codebases.
What is it?
sourcecode analyzes a repository and produces structured JSON or YAML designed to be fed directly to AI agents or language models. It solves the "stuff the whole repo into the prompt" problem by extracting a deterministic, high-signal summary: stack detection, entry points, dependencies, git hotspots, inline annotations, and confidence metadata.
Optimized for Java/Spring Boot monorepos. Works on any codebase.
Installation
Homebrew (macOS / Linux)
brew tap haroundominique/sourcecode
brew install sourcecode
pip / pipx
pip install sourcecode
# or with isolation:
pipx install sourcecode
Verify
sourcecode version
# sourcecode 1.27.0
Quickstart
# High-signal summary (1000–3000 tokens depending on repo size) — recommended starting point
sourcecode --compact
# Add git hotspots and uncommitted file count
sourcecode --compact --git-context
# Analyze a specific path
sourcecode /path/to/repo --compact
# Copy result to clipboard
sourcecode --compact --copy
# Structured output for AI agents (identity, entry points, dependencies, confidence)
sourcecode --agent
# Only process git-modified files (forces compact output)
sourcecode --changed-only
Example output for a Spring Boot project (--compact):
{
  "project_type": "api",
  "stacks": [
    {
      "stack": "java",
      "detection_method": "manifest",
      "confidence": "high",
      "primary": true,
      "frameworks": ["Spring Boot", "MyBatis"]
    }
  ],
  "entry_points": {
    "bootstrap": ["src/main/java/io/spring/RealWorldApplication.java"],
    "security": ["src/main/java/io/spring/api/security/WebSecurityConfig.java"],
    "controllers": { "count": 8, "sample": ["src/main/java/io/spring/api/ArticleApi.java"] }
  },
  "key_dependencies": [
    {
      "name": "org.mybatis.spring.boot:mybatis-spring-boot-starter",
      "version": "2.2.2",
      "risk_flags": ["spring-boot-2.x-eol"]
    }
  ],
  "language_version": "11",
  "deployment": { "spring_boot_version": "2.6.3", "packaging": "jar" },
  "mybatis": { "mapper_interfaces": 4, "xml_files": 4 },
  "confidence_summary": { "overall": "high", "stack": "high", "entry_points": "high" }
}
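As a minimal sketch of consuming this output, the snippet below parses a trimmed copy of the example and collects the dependencies that carry risk flags. The helper name `flagged_dependencies` is hypothetical; the field names are taken from the example, and real output may contain fields not shown here.

```python
import json

# Trimmed copy of the example --compact output shown above.
SAMPLE = """
{
  "project_type": "api",
  "key_dependencies": [
    {"name": "org.mybatis.spring.boot:mybatis-spring-boot-starter",
     "version": "2.2.2", "risk_flags": ["spring-boot-2.x-eol"]}
  ],
  "confidence_summary": {"overall": "high", "stack": "high", "entry_points": "high"}
}
"""


def flagged_dependencies(report: dict) -> dict[str, list[str]]:
    """Map dependency name -> risk flags, skipping dependencies with none."""
    flagged = {}
    for dep in report.get("key_dependencies", []):
        flags = dep.get("risk_flags") or []
        if flags:
            flagged[dep["name"]] = list(flags)
    return flagged


report = json.loads(SAMPLE)
print(flagged_dependencies(report))
```

Using `.get` with defaults keeps the helper tolerant of reports where `key_dependencies` or `risk_flags` is absent.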
Flags reference
| Flag | Alias | Default | Description |
|---|---|---|---|
| --compact | | off | High-signal summary (1000–3000 tokens): stacks, entry points, dependencies, risk flags, confidence, gaps. Includes security_surface, mybatis, and transactional_boundaries for Java projects. |
| --agent | | off | Structured noise-free JSON for AI agents: identity, entry points, dependencies, confidence, gaps. Auto-enables dependency, env-var, and code-notes analysis. |
| --full | | off | Remove truncation limits on transactional_boundaries, mybatis.dto_mappers, and other capped lists. |
| --git-context | -g | off | Include git activity: recent commits, change hotspots, and uncommitted changes. |
| --changed-only | | off | Limit output to git-modified files (staged, unstaged, untracked). Forces compact output. |
| --depth | | 4 | File tree traversal depth (1–20). Java/Maven projects auto-adjust to 12. |
| --format | -f | json | Output format: json or yaml. |
| --output | -o | stdout | Write output to a file instead of stdout. |
| --copy | -c | off | Copy output to clipboard after a successful run. No-op when --output is set or clipboard is unavailable. |
| --no-redact | | off | Disable automatic secret redaction. Output may contain sensitive values. |
| --version | -v | n/a | Show version and exit. |
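For scripted use, one option is to call the CLI from Python and parse its JSON. The sketch below is an illustration only: `build_command` and `run_compact` are hypothetical helper names, only flags from the table above are used, and it assumes stdout carries nothing but the JSON document.

```python
import json
import subprocess


def build_command(path=".", compact=True, git_context=False, fmt="json"):
    """Assemble a sourcecode argv list from flags in the table above."""
    if fmt not in ("json", "yaml"):
        raise ValueError("format must be 'json' or 'yaml'")
    cmd = ["sourcecode", path]
    if compact:
        cmd.append("--compact")
    if git_context:
        cmd.append("--git-context")
    cmd += ["--format", fmt]
    return cmd


def run_compact(path="."):
    """Run the CLI and parse stdout (assumption: stdout is pure JSON)."""
    result = subprocess.run(
        build_command(path), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces, and `check=True` raises `CalledProcessError` if the command exits non-zero.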
prepare-context — task-specific context
Generates a focused context bundle for a specific AI coding task. More targeted than --compact: each task re-ranks files according to its own signal priorities.
sourcecode prepare-context TASK [PATH] [OPTIONS]
Tasks
| Task | What it surfaces | Primary use |
|---|---|---|
| explain | Architecture, entry points, key dependencies | Onboarding an LLM to a new project |
| onboard | Full structural context: entry points, architecture, key files, dependencies | New developer or agent joining the codebase |
| fix-bug | Files ranked by risk (annotations, churn, uncommitted changes), suspected areas | Debugging session |
| refactor | Structural problems, improvement opportunities, high-annotation files | Code quality review |
| generate-tests | Source files without test pairs, coverage gap analysis | Writing missing tests |
| review-pr | Uncommitted/changed files + architectural impact | Pre-merge review |
| delta | Changed files with multi-hop impact analysis, structural import graph, system-level impact summary | Incremental CI/review context |
Options
| Option | Description |
|---|---|
| --since REF | Git ref for delta task (e.g. HEAD~3, main, v1.2.0). Required for delta; ignored for other tasks. |
| --symptom TEXT | (fix-bug only) Keyword hint for the bug; boosts matching files and surfaces related code notes. |
| --llm-prompt | Append a ready-to-use LLM prompt to the output. |
| --dry-run | Show what would be analyzed without running it. |
| --copy / -c | Copy output to clipboard after a successful run. |
| --output / -o | Write output to a file. |
| --task-help | List all tasks with descriptions and exit. |
Examples
# Explain the current repo
sourcecode prepare-context explain
# Focus on bug-prone files, with a symptom hint
sourcecode prepare-context fix-bug --symptom "NullPointerException in OrderService"
# Incremental context: files changed since branch diverged from main
sourcecode prepare-context delta . --since main
# Onboard with a ready-to-paste LLM prompt
sourcecode prepare-context onboard --llm-prompt
# List all tasks
sourcecode prepare-context --task-help
delta — incremental impact analysis
The delta task is the recommended mode for CI pipelines and PR reviews. It goes beyond listing changed files: it builds a structural import graph and propagates impact transitively up to 3 hops.
sourcecode prepare-context delta [PATH] --since REF
Output fields:
| Field | Description |
|---|---|
| changed_files | Files modified in the git range |
| relevant_files | Changed files + files pulled in by the import graph (scored by artifact type and hop distance) |
| impact_summary | Human-readable summary: artifact types changed and active risk areas |
| affected_modules | DDD domain modules touched by the change |
| risk_areas | Per-area severity breakdown (security, api, persistence, etc.) |
| change_type | Closed taxonomy: behavioral_change, structural_change, configuration_change, dependency_change, security_change |
| system_impact | Subsystems affected, behavioral changes, runtime impact notes |
| dependency_graph_summary | Verified structural import edges (hop 1–3) and propagation_depth. Only real imports: no heuristics, no test files. |
| impact_score_per_file | Per-file numeric impact score (0–1) |
| since | The git ref used |
| gaps | What the analysis could not determine |
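A reviewer script might use impact_score_per_file to decide which files to read first. The sketch below assumes that field is a JSON object mapping file path to score; the table does not specify its shape, so treat that as an assumption. The helper name `files_above`, the sample paths, and the 0.5 threshold are all hypothetical.

```python
def files_above(report, threshold=0.5):
    """Return file paths whose impact score is >= threshold, highest first.

    Assumes report["impact_score_per_file"] maps path -> score in [0, 1].
    """
    scores = report.get("impact_score_per_file", {})
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [path for path, score in ranked if score >= threshold]


# Hypothetical delta output fragment, for illustration only.
sample = {
    "impact_score_per_file": {
        "OrderController.java": 0.44,
        "OrderService.java": 0.90,
        "OrderRepository.java": 0.63,
    }
}
print(files_above(sample))
```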
How the import graph works:
- Each changed file is classified by artifact type (controller, service, repository, security, spring_config, etc.).
- A BFS traversal walks the import graph repo-wide (not restricted to the same module), up to 3 hops deep.
- dependency_graph_summary.edges only contains verified import / @Autowired / constructor-injection relationships. Test files and heuristic proximity matches are excluded from edges (they appear in relevant_files only if they have real imports of changed files).
- Score decays 30% per hop: a directly-changed SecurityConfig.java scores 0.90; its direct importer scores 0.63; a transitive importer scores 0.44.
# Changed service → controller → facade (3 hops)
sourcecode prepare-context delta . --since main
# Output includes:
# dependency_graph_summary.edges:
# hop-1: OrderService.java → OrderRepository.java
# hop-2: OrderRepository.java → OrderController.java
# hop-3: OrderController.java → OrderFacade.java
# propagation_depth: 3
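The traversal and score decay described above can be illustrated with a small breadth-first search. This is a sketch under stated assumptions, not the tool's implementation: the `importers` mapping, the function name `propagate_impact`, and the file names `AuthFilter.java` and `ApiGateway.java` are hypothetical, and the base score is fixed at 0.90 here, whereas the tool scores by artifact type as well as hop distance.

```python
from collections import deque


def propagate_impact(importers, changed, base_score=0.90, decay=0.30, max_hops=3):
    """Breadth-first walk from changed files to the files that import them.

    importers: dict mapping a file to the files that import it (hypothetical shape).
    Returns {file: (hop, score)}; the score drops by `decay` at each hop.
    """
    impact = {f: (0, base_score) for f in changed}
    queue = deque((f, 0) for f in changed)
    while queue:
        current, hop = queue.popleft()
        if hop == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in importers.get(current, []):
            if nxt in impact:
                continue  # BFS already reached this file by a shortest path
            score = round(base_score * (1 - decay) ** (hop + 1), 2)
            impact[nxt] = (hop + 1, score)
            queue.append((nxt, hop + 1))
    return impact


graph = {
    "SecurityConfig.java": ["AuthFilter.java"],
    "AuthFilter.java": ["ApiGateway.java"],
}
result = propagate_impact(graph, ["SecurityConfig.java"])
print(result)
```

With a 30% decay the scores come out as 0.90, 0.63, and 0.44 for hops 0, 1, and 2, matching the figures quoted in the list above.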
Output schema
All outputs include a confidence_summary block with overall, stack, and entry_points confidence levels (high / medium / low), plus an analysis_gaps list describing what could not be analyzed and why.
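A consumer might check the confidence levels before handing the output to an agent. The sketch below is one way to do that; the function name `meets_confidence` and the low < medium < high ordering used for comparison are assumptions made for illustration, while the three keys come from the description above.

```python
# Ordering of the confidence levels named above, lowest to highest.
LEVELS = {"low": 0, "medium": 1, "high": 2}


def meets_confidence(report, minimum="medium"):
    """True when overall, stack, and entry_points all reach `minimum`.

    A missing or unrecognized level counts as below "low".
    """
    summary = report.get("confidence_summary", {})
    floor = LEVELS[minimum]
    return all(
        LEVELS.get(summary.get(key), -1) >= floor
        for key in ("overall", "stack", "entry_points")
    )


sample = {
    "confidence_summary": {"overall": "high", "stack": "medium", "entry_points": "high"}
}
print(meets_confidence(sample), meets_confidence(sample, minimum="high"))
```

When the check fails, the analysis_gaps list is the natural place to look for the reason.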
Java/Spring-specific fields
When a Java manifest (pom.xml or build.gradle) is detected, the output includes additional fields:
| Field | Description |
|---|---|
| language_version | Java version from maven.compiler.source or equivalent |
| deployment.spring_boot_version | Spring Boot version |
| deployment.packaging | jar or war |
| deployment.app_server_hint | weblogic, wildfly, etc. (when detectable) |
| security_surface.resource_names | Values of @M3FiltroSeguridad(nombreRecurso=...) annotations across all controllers |
| mybatis | Mapper interface / XML file pairing summary |
| transactional_boundaries | Classes annotated with @Transactional |
| deployment_risks | Static risk flags: spring-boot-2.x-eol, legacy-java-runtime, legacy-app-server-deployment |
Telemetry
Anonymous, opt-in telemetry collects: version, OS, commands used, flags, duration, repo size range, and errors. No source code, paths, secrets, or output content is ever collected.
sourcecode telemetry status # current setting
sourcecode telemetry enable # opt in
sourcecode telemetry disable # opt out (permanent)
Alternatively, set the environment variable:
export SOURCECODE_TELEMETRY=0
Configuration
sourcecode config # show version, config file path, telemetry status