Deterministic codebase context for AI coding agents

These details have not been verified by PyPI

Project description

sourcecode

Deterministic, behavior-aware codebase context for AI agents and PR review.

What is it?

sourcecode analyzes a repository and produces structured JSON or YAML designed to be fed directly to AI agents or language models. It solves the "stuff the whole repo into the prompt" problem by extracting a deterministic, high-signal summary: stack detection, entry points, dependencies, git hotspots, inline annotations, and confidence metadata.

For PR review specifically, sourcecode extracts execution paths: ordered chains from entry point through service to data access, with runtime signals (auth guards, cache short-circuits, async execution) anchored to the specific step where they affect behavior. A reviewer sees what the system does under this change, not just which files changed.

Optimized for Java/Spring Boot monorepos. Works on any codebase.

Installation

Homebrew (macOS / Linux)

brew tap haroundominique/sourcecode
brew install sourcecode

pip / pipx

pip install sourcecode
# or with isolation:
pipx install sourcecode

Verify

sourcecode version
# sourcecode 1.31.13

Quickstart

# High-signal summary (1000–3000 tokens depending on repo size) — recommended starting point
sourcecode --compact

# Add git hotspots and uncommitted file count
sourcecode --compact --git-context

# Analyze a specific path
sourcecode /path/to/repo --compact

# Copy result to clipboard
sourcecode --compact --copy

# Structured output for AI agents (identity, entry points, dependencies, confidence)
sourcecode --agent

# Only process git-modified files (forces compact output)
sourcecode --changed-only

Example output for a Spring Boot project (--compact):

{
  "project_type": "api",
  "stacks": [{ "stack": "java", "detection_method": "manifest", "confidence": "high",
               "primary": true, "frameworks": ["Spring Boot", "MyBatis"] }],
  "entry_points": {
    "bootstrap": ["src/main/java/io/spring/RealWorldApplication.java"],
    "security":  ["src/main/java/io/spring/api/security/WebSecurityConfig.java"],
    "controllers": { "count": 8, "sample": ["src/main/java/io/spring/api/ArticleApi.java"] }
  },
  "key_dependencies": [
    { "name": "org.mybatis.spring.boot:mybatis-spring-boot-starter",
      "version": "2.2.2", "risk_flags": ["spring-boot-2.x-eol"] }
  ],
  "language_version": "11",
  "deployment": { "spring_boot_version": "2.6.3", "packaging": "jar" },
  "mybatis": { "mapper_interfaces": 4, "xml_files": 4 },
  "confidence_summary": { "overall": "high", "stack": "high", "entry_points": "high" }
}
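As a rough illustration of consuming this output downstream, the sketch below loads a trimmed copy of the example above and collects the dependencies that carry risk flags. The helper name `flagged_dependencies` is hypothetical, and the sketch assumes the field names shown in the example (`key_dependencies`, `name`, `risk_flags`); real output may contain more fields.

```python
import json

# Trimmed copy of the example --compact output shown above.
COMPACT_JSON = """
{
  "project_type": "api",
  "key_dependencies": [
    { "name": "org.mybatis.spring.boot:mybatis-spring-boot-starter",
      "version": "2.2.2", "risk_flags": ["spring-boot-2.x-eol"] }
  ],
  "confidence_summary": { "overall": "high", "stack": "high", "entry_points": "high" }
}
"""


def flagged_dependencies(summary: dict) -> dict[str, list[str]]:
    """Map dependency name -> risk flags, skipping dependencies without flags.

    Hypothetical helper: assumes each entry has a "name" and an optional
    "risk_flags" list, as in the example output.
    """
    flagged = {}
    for dep in summary.get("key_dependencies", []):
        flags = dep.get("risk_flags", [])
        if flags:
            flagged[dep["name"]] = list(flags)
    return flagged


summary = json.loads(COMPACT_JSON)
for name, flags in flagged_dependencies(summary).items():
    print(f"{name}: {', '.join(flags)}")
```

A check like this could run before the summary is handed to an agent, so that end-of-life dependencies are called out explicitly in the prompt.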

Flags reference

Flag	Alias	Default	Description
`--compact`		off	High-signal summary (1000–3000 tokens): stacks, entry points, dependencies, risk flags, confidence, gaps. Includes `security_surface`, `mybatis`, and `transactional_boundaries` for Java projects.
`--agent`		off	Structured noise-free JSON for AI agents: identity, entry points, dependencies, confidence, gaps. Auto-enables dependency, env-var, and code-notes analysis.
`--full`		off	Remove truncation limits on `transactional_boundaries`, `mybatis.dto_mappers`, and other capped lists.
`--git-context`	`-g`	off	Include git activity: recent commits, change hotspots, and uncommitted changes.
`--changed-only`		off	Limit output to git-modified files (staged, unstaged, untracked). Forces compact output.
`--depth`		`4`	File tree traversal depth (1–20). Java/Maven projects auto-adjust to 12.
`--format`	`-f`	`json`	Output format: `json` or `yaml`.
`--output`	`-o`	stdout	Write output to a file instead of stdout.
`--copy`	`-c`	off	Copy output to clipboard after a successful run. No-op when `--output` is set or clipboard is unavailable.
`--no-redact`		off	Disable automatic secret redaction. Output may contain sensitive values.
`--version`	`-v`	—	Show version and exit.
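The flags above can also be combined from a script. The following is a minimal sketch, assuming `sourcecode` is on `PATH`, that it writes JSON to stdout, and that it exits non-zero on failure (the last point is an assumption, not something stated here). The function names `build_args` and `run_compact` are hypothetical; only flags listed in the table are used.

```python
import json
import subprocess
from typing import Callable, Optional


def build_args(path: Optional[str] = None, git_context: bool = False) -> list[str]:
    """Build an argv list for a compact JSON run using documented flags only."""
    args = ["sourcecode"]
    if path:
        args.append(path)  # same order as: sourcecode /path/to/repo --compact
    args += ["--compact", "--format", "json"]
    if git_context:
        args.append("--git-context")
    return args


def run_compact(path: Optional[str] = None, git_context: bool = False,
                runner: Callable = subprocess.run) -> dict:
    """Run the CLI and parse its stdout as JSON.

    `runner` is injectable so the function can be exercised without the
    CLI installed. With the default runner, check=True raises
    CalledProcessError if the process exits non-zero.
    """
    result = runner(build_args(path, git_context),
                    capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

Calling `run_compact(".", git_context=True)` would then return the parsed summary as a dictionary.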

`prepare-context` — task-specific context

Generates a focused context bundle for a specific AI coding task. More targeted than --compact: each task re-ranks files according to its own signal priorities.

sourcecode prepare-context TASK [PATH] [OPTIONS]

Tasks

Task	What it surfaces	Primary use
`explain`	Architecture, entry points, key dependencies	Onboarding an LLM to a new project
`onboard`	Full structural context: entry points, architecture, key files, dependencies	New developer or agent joining the codebase
`fix-bug`	Files ranked by risk (annotations, churn, uncommitted changes), suspected areas	Debugging session
`refactor`	Structural problems, improvement opportunities, high-annotation files	Code quality review
`generate-tests`	Source files without test pairs, coverage gap analysis	Writing missing tests
`review-pr`	Execution paths with per-step runtime signals, security/transactional impact, test coverage gaps	Pre-merge behavior review
`delta`	Changed files with multi-hop impact analysis, structural import graph, system-level impact summary	Incremental CI/review context

Options

Option	Description
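`--since REF`	Git ref to compare against (e.g. `HEAD~3`, `main`, `v1.2.0`). Required for `delta`; optional for `review-pr`, which falls back to uncommitted working-tree changes when it is omitted; ignored for other tasks.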
`--symptom TEXT`	(fix-bug only) Keyword hint for the bug — boosts matching files and surfaces related code notes.
`--format TEXT`	Output format: `json` (default) or `github-comment` (Markdown PR comment, `review-pr` only).
`--llm-prompt`	Append a ready-to-use LLM prompt to the output.
`--dry-run`	Show what would be analyzed without running it.
`--copy` / `-c`	Copy output to clipboard after a successful run.
`--output` / `-o`	Write output to a file.
`--task-help`	List all tasks with descriptions and exit.

Examples

# Explain the current repo
sourcecode prepare-context explain

# Focus on bug-prone files, with a symptom hint
sourcecode prepare-context fix-bug --symptom "NullPointerException in OrderService"

# Incremental context: files changed since branch diverged from main
sourcecode prepare-context delta . --since main

# Onboard with a ready-to-paste LLM prompt
sourcecode prepare-context onboard --llm-prompt

# PR analysis as a GitHub Markdown comment (paste directly into PR)
sourcecode prepare-context review-pr --since main --format github-comment

# List all tasks
sourcecode prepare-context --task-help
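The constraints in the Tasks and Options tables can be checked before a command is launched. This is a sketch only: `prepare_context_args` is a hypothetical wrapper, not part of sourcecode, and the validation mirrors the documented rules rather than the CLI's own error handling.

```python
from typing import Optional

# Task names from the Tasks table.
TASKS = {"explain", "onboard", "fix-bug", "refactor",
         "generate-tests", "review-pr", "delta"}


def prepare_context_args(task: str, path: Optional[str] = None,
                         since: Optional[str] = None,
                         symptom: Optional[str] = None,
                         fmt: Optional[str] = None) -> list[str]:
    """Build argv for `sourcecode prepare-context`, enforcing documented rules."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    if task == "delta" and not since:
        raise ValueError("delta requires --since REF")
    if symptom and task != "fix-bug":
        raise ValueError("--symptom applies to fix-bug only")
    if fmt == "github-comment" and task != "review-pr":
        raise ValueError("github-comment format applies to review-pr only")

    args = ["sourcecode", "prepare-context", task]
    if path:
        args.append(path)
    if since:
        args += ["--since", since]
    if symptom:
        args += ["--symptom", symptom]
    if fmt:
        args += ["--format", fmt]
    return args


print(" ".join(prepare_context_args("delta", ".", since="main")))
# prints: sourcecode prepare-context delta . --since main
```

Failing early in the wrapper keeps a CI job from spending time on a run that the documented rules already exclude.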

`delta` — incremental impact analysis

The delta task is the recommended mode for CI pipelines and PR reviews. It goes beyond listing changed files: it builds a structural import graph and propagates impact transitively up to 3 hops.

sourcecode prepare-context delta [PATH] --since REF

Output fields:

Field	Description
`changed_files`	Files modified in the git range
`relevant_files`	Changed files + files pulled in by the import graph (scored by artifact type and hop distance)
`impact_summary`	Human-readable summary: artifact types changed and active risk areas
`affected_modules`	DDD domain modules touched by the change
`risk_areas`	Per-area severity breakdown (`security`, `api`, `persistence`, etc.)
`change_type`	Closed taxonomy: `behavioral_change`, `structural_change`, `configuration_change`, `dependency_change`, `security_change`
`system_impact`	Subsystems affected, behavioral changes, runtime impact notes
`dependency_graph_summary`	Verified structural import edges (hop 1–3) and `propagation_depth`. Only real imports — no heuristics, no test files.
`impact_score_per_file`	Per-file numeric impact score (0–1)
`since`	The git ref used
`gaps`	What the analysis could not determine
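One way a CI step might use these fields is to list the files whose impact score crosses a threshold. The sketch below assumes `impact_score_per_file` is a mapping from file path to score; the table only says the scores are per-file numbers between 0 and 1, so the exact shape is an assumption. The sample data and the helper `files_above` are hypothetical.

```python
def files_above(delta: dict, threshold: float) -> list[tuple[str, float]]:
    """Return (path, score) pairs at or above the threshold, highest first.

    Assumes "impact_score_per_file" maps file path -> score in [0, 1].
    """
    scores = delta.get("impact_score_per_file", {})
    hits = [(path, score) for path, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical delta output, reduced to the fields used here.
delta = {
    "since": "main",
    "changed_files": ["src/main/java/app/SecurityConfig.java"],
    "impact_score_per_file": {
        "src/main/java/app/SecurityConfig.java": 0.90,
        "src/main/java/app/AuthFilter.java": 0.63,
        "src/main/java/app/ReportJob.java": 0.44,
    },
}

for path, score in files_above(delta, 0.6):
    print(f"{score:.2f}  {path}")
```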

How the import graph works:

Each changed file is classified by artifact type (controller, service, repository, security, spring_config, etc.).
A BFS traversal walks the import graph repo-wide (not restricted to the same module), up to 3 hops deep.
dependency_graph_summary.edges only contains verified import / @Autowired / constructor-injection relationships. Test files and heuristic proximity matches are excluded from edges (they appear in relevant_files only if they have real imports of changed files).
Score decays 30% per hop: a directly-changed SecurityConfig.java scores 0.90; its direct importer scores 0.63; a transitive importer scores 0.44.

# Changed service → repository → controller → facade (3 hops)
sourcecode prepare-context delta . --since main

# Output includes:
# dependency_graph_summary.edges:
#   hop-1: OrderService.java → OrderRepository.java
#   hop-2: OrderRepository.java → OrderController.java
#   hop-3: OrderController.java → OrderFacade.java
# propagation_depth: 3
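The traversal and score decay described above can be sketched in a few lines. This is an illustration of the stated rules (breadth-first, at most 3 hops, 30% decay per hop), not the tool's implementation. It assumes the graph is available as a reverse import map (file to the files that import it) and takes the 0.90 base score from the `SecurityConfig.java` example; every other file name is hypothetical.

```python
from collections import deque

DECAY = 0.7    # "decays 30% per hop": multiply by 0.7 for each hop
MAX_HOPS = 3   # propagation stops after 3 hops


def propagate(importers: dict[str, list[str]], changed: str,
              base_score: float) -> dict[str, tuple[int, float]]:
    """Breadth-first walk from a changed file to the files that import it.

    Returns {file: (hop, score)}; hop 0 is the changed file itself.
    """
    hops = {changed: 0}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        if hops[current] == MAX_HOPS:
            continue  # do not expand beyond the hop limit
        for importer in importers.get(current, []):
            if importer not in hops:  # first visit in BFS is the shortest distance
                hops[importer] = hops[current] + 1
                queue.append(importer)
    return {name: (hop, round(base_score * DECAY ** hop, 2))
            for name, hop in hops.items()}


# Hypothetical reverse import graph: key is imported by each file in the list.
importers = {
    "SecurityConfig.java": ["AuthFilter.java"],
    "AuthFilter.java": ["ApiGateway.java"],
    "ApiGateway.java": ["App.java"],
    "App.java": ["Launcher.java"],  # would be hop 4, so it is never reached
}

for name, (hop, score) in propagate(importers, "SecurityConfig.java", 0.90).items():
    print(f"hop {hop}: {name} -> {score}")
```

The hop 0 to hop 2 scores reproduce the 0.90, 0.63, and 0.44 figures quoted above. The hop-3 value (0.31) follows from applying the same rule once more and is not stated in the text.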

`review-pr` — behavior-aware PR analysis

Extracts execution paths: ordered chains from entry point through service to data access layer, with runtime signals anchored to the specific step where they affect behavior.

sourcecode prepare-context review-pr [PATH] --since REF
# or against uncommitted working-tree changes:
sourcecode prepare-context review-pr

execution_paths schema:

{
  "execution_paths": [
    {
      "name": "Order",
      "entry_point": {
        "step": "OrderController.createOrder",
        "notes": [
          { "note": "condition: authorization check present (@PreAuthorize / @Secured)",
            "epistemic_level": "STRUCTURAL SIGNAL" }
        ]
      },
      "path": [
        {
          "step": "ShippingService.process",
          "notes": [
            { "note": "branch: Spring cache annotation present — downstream call may be short-circuited",
              "epistemic_level": "STRUCTURAL SIGNAL" },
            { "note": "async: @Async annotation present — runs in separate thread",
              "epistemic_level": "STRUCTURAL SIGNAL" }
          ]
        },
        { "step": "OrderRepository.save", "notes": [] }
      ],
      "end_state": "DB write",
      "end_state_epistemic_level": "INFERRED (LOW CONFIDENCE)"
    }
  ]
}
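To show how this schema reads in practice, the sketch below turns one `execution_paths` entry into a numbered, human-readable chain. The function `describe_path` is hypothetical, and the sample is a trimmed copy of the example above with the notes on `ShippingService.process` left out for brevity.

```python
def describe_path(path_entry: dict) -> str:
    """Render an execution path as numbered steps with their notes."""
    steps = [path_entry["entry_point"]] + path_entry.get("path", [])
    lines = []
    for index, step in enumerate(steps, start=1):
        lines.append(f"{index}. {step['step']}")
        for note in step.get("notes", []):
            lines.append(f"   [{note['epistemic_level']}] {note['note']}")
    end_state = path_entry.get("end_state")
    if end_state:
        level = path_entry.get("end_state_epistemic_level", "unlabeled")
        lines.append(f"=> {end_state} [{level}]")
    return "\n".join(lines)


# Trimmed copy of the schema example (ShippingService notes omitted).
EXAMPLE = {
    "name": "Order",
    "entry_point": {
        "step": "OrderController.createOrder",
        "notes": [
            {"note": "condition: authorization check present (@PreAuthorize / @Secured)",
             "epistemic_level": "STRUCTURAL SIGNAL"}
        ],
    },
    "path": [
        {"step": "ShippingService.process", "notes": []},
        {"step": "OrderRepository.save", "notes": []},
    ],
    "end_state": "DB write",
    "end_state_epistemic_level": "INFERRED (LOW CONFIDENCE)",
}

print(describe_path(EXAMPLE))
```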

Path rules:

One path per changed entry point — most-evident downstream call, not all branches
Each step requires direct code evidence: field injection, constructor param, method call, or type annotation
notes are scanned from that step's own source file — no cross-contamination between steps
Path terminates where evidence ends; no gap-filling by naming convention or module similarity

Runtime signals detected per step:

Signal	Example code	Note emitted	Epistemic level
Auth guard	`@PreAuthorize`, `@Secured`	`condition: authorization check present (@PreAuthorize / @Secured)`	`STRUCTURAL SIGNAL`
Auth context read	`isAuthenticated()`, `SecurityContextHolder`	`condition: reads authentication context`	`STRUCTURAL SIGNAL`
Feature flag	`featureFlag.isEnabled()`, `FeatureToggle`	`condition: feature flag gates execution`	`INFERRED (LOW CONFIDENCE)`
Null/empty guard	`if (x == null) return`	`condition: null/empty guard with early return`	`STRUCTURAL SIGNAL`
Spring cache	`@Cacheable`, `@CacheEvict`	`branch: Spring cache annotation present — downstream call may be short-circuited`	`STRUCTURAL SIGNAL`
Manual cache	`cache.get()`, `cacheManager.`	`branch: manual cache lookup detected — downstream call may be short-circuited`	`INFERRED (LOW CONFIDENCE)`
Optional absence	`Optional<>`, `.orElseThrow()`	`branch: Optional type in use — result may be absent`	`STRUCTURAL SIGNAL`
Async thread	`@Async`	`async: @Async annotation present — runs in separate thread`	`STRUCTURAL SIGNAL`
CompletableFuture	`CompletableFuture`, `.supplyAsync()`	`async: CompletableFuture detected — non-blocking execution`	`STRUCTURAL SIGNAL`
Event publishing	`publishEvent()`, `applicationEventPublisher`	`async: Spring application event emitted`	`STRUCTURAL SIGNAL`
Kafka	`kafkaTemplate.`, `KafkaProducer`	`async: Kafka producer detected`	`STRUCTURAL SIGNAL`
RabbitMQ	`rabbitTemplate.`, `amqpTemplate.`	`async: RabbitMQ producer detected`	`STRUCTURAL SIGNAL`
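To make the table concrete, here is a rough illustration of per-step signal matching using regular expressions over a single step's source text. It is not the tool's detection logic, which is not described here. The patterns are derived from the "Example code" column, only four rows are covered, and the Java snippet and the name `scan_step` are hypothetical.

```python
import re

# (pattern, note emitted, epistemic level) for a few rows of the table above.
SIGNALS = [
    (re.compile(r"@PreAuthorize|@Secured"),
     "condition: authorization check present (@PreAuthorize / @Secured)",
     "STRUCTURAL SIGNAL"),
    (re.compile(r"isAuthenticated\(\)|SecurityContextHolder"),
     "condition: reads authentication context",
     "STRUCTURAL SIGNAL"),
    (re.compile(r"publishEvent\(|applicationEventPublisher"),
     "async: Spring application event emitted",
     "STRUCTURAL SIGNAL"),
    (re.compile(r"kafkaTemplate\.|KafkaProducer"),
     "async: Kafka producer detected",
     "STRUCTURAL SIGNAL"),
]


def scan_step(source: str) -> list[dict]:
    """Return the notes whose pattern occurs in this step's own source text."""
    return [{"note": note, "epistemic_level": level}
            for pattern, note, level in SIGNALS
            if pattern.search(source)]


# Hypothetical controller method used as input.
JAVA_SOURCE = '''
@PreAuthorize("hasRole('ADMIN')")
public Order createOrder(OrderRequest request) {
    kafkaTemplate.send("orders", request.getId());
    return orderService.create(request);
}
'''

for item in scan_step(JAVA_SOURCE):
    print(f"[{item['epistemic_level']}] {item['note']}")
```

Scanning each step's text separately mirrors the path rule that notes come from that step's own source file.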

Epistemic contract:

Every output field in review-pr carries an explicit epistemic_level:

Level	Meaning
`FACT`	Directly observed in diff (file present, config changed)
`STRUCTURAL SIGNAL`	Annotation or type-system evidence in source (`@Service`, `@Transactional`, injection)
`INFERRED (LOW CONFIDENCE)`	Heuristic pattern match — no full structural proof
`OMITTED`	Insufficient evidence — field not emitted

No field blends certainty levels without labeling. end_state (e.g. "DB write") is always accompanied by end_state_epistemic_level: "INFERRED (LOW CONFIDENCE)" — it is a keyword-match heuristic, not an AST-verified fact.
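A consumer can use these labels to separate claims that need human verification from those backed by structural evidence. The sketch below collects every `INFERRED (LOW CONFIDENCE)` item from one execution path. The helper `low_confidence_claims` and the sample path (including the step name `CheckoutService.submit`) are hypothetical; the note text and levels are taken from the tables above.

```python
INFERRED = "INFERRED (LOW CONFIDENCE)"


def low_confidence_claims(path_entry: dict) -> list[tuple[str, str]]:
    """Collect (location, claim) pairs labeled as inferred for one path."""
    claims = []
    steps = [path_entry["entry_point"]] + path_entry.get("path", [])
    for step in steps:
        for note in step.get("notes", []):
            if note["epistemic_level"] == INFERRED:
                claims.append((step["step"], note["note"]))
    # end_state is documented as a keyword-match heuristic, so it is
    # expected to show up here whenever it is present.
    if path_entry.get("end_state_epistemic_level") == INFERRED:
        claims.append(("end_state", path_entry["end_state"]))
    return claims


# Hypothetical path entry following the execution_paths schema.
sample_path = {
    "name": "Checkout",
    "entry_point": {"step": "CheckoutController.submit", "notes": []},
    "path": [
        {"step": "CheckoutService.submit",
         "notes": [{"note": "condition: feature flag gates execution",
                    "epistemic_level": INFERRED}]},
    ],
    "end_state": "DB write",
    "end_state_epistemic_level": INFERRED,
}

for location, claim in low_confidence_claims(sample_path):
    print(f"verify manually: {location}: {claim}")
```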

Other review-pr output fields:

Field	Description
`review_hotspots`	Top changed files ranked by impact score
`suggested_review_order`	Security → API → Service → Persistence → Config
`security_impact`	Changed security-classified files (`epistemic_level: STRUCTURAL SIGNAL`) + risk note (`INFERRED (LOW CONFIDENCE)`)
`transactional_impact`	Changed service/business-logic files with possible transaction boundary effect
`test_coverage_risk`	Changed source files with no corresponding test (`epistemic_level: INFERRED (LOW CONFIDENCE)`)
`affected_modules`	DDD domain modules touched by the change

Output schema

All outputs include a confidence_summary block with overall, stack, and entry_points confidence levels (high / medium / low), plus an analysis_gaps list describing what could not be analyzed and why.
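A caller might gate downstream use on these confidence levels. The following is a minimal sketch, assuming the three keys and the `high` / `medium` / `low` values described above; the function `meets_confidence` is hypothetical, and missing or unrecognized values are treated as `low`.

```python
LEVELS = {"low": 0, "medium": 1, "high": 2}


def meets_confidence(output: dict, minimum: str = "medium") -> bool:
    """True if overall, stack, and entry_points all reach the minimum level."""
    summary = output.get("confidence_summary", {})
    required = LEVELS[minimum]
    return all(
        LEVELS.get(summary.get(key, "low"), 0) >= required
        for key in ("overall", "stack", "entry_points")
    )


# confidence_summary copied from the Spring Boot example output.
output = {
    "confidence_summary": {"overall": "high", "stack": "high", "entry_points": "high"}
}

if meets_confidence(output, "high"):
    print("summary is high confidence")
else:
    print("review analysis gaps before relying on this summary")
```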

Java/Spring-specific fields

When a Java manifest (pom.xml or build.gradle) is detected, the output includes additional fields:

Field	Description
`language_version`	Java version from `maven.compiler.source` or equivalent
`deployment.spring_boot_version`	Spring Boot version
`deployment.packaging`	`jar` or `war`
`deployment.app_server_hint`	`weblogic`, `wildfly`, etc. (when detectable)
`security_surface.resource_names`	Values of `@M3FiltroSeguridad(nombreRecurso=...)` annotations across all controllers
`mybatis`	Mapper interface / XML file pairing summary
`transactional_boundaries`	Classes annotated with `@Transactional`
`deployment_risks`	Static risk flags: `spring-boot-2.x-eol`, `legacy-java-runtime`, `legacy-app-server-deployment`

Telemetry

Anonymous, opt-in telemetry collects: version, OS, commands used, flags, duration, repo size range, and errors. No source code, paths, secrets, or output content is ever collected.

sourcecode telemetry status    # current setting
sourcecode telemetry enable    # opt in
sourcecode telemetry disable   # opt out (permanent)

Alternatively, disable telemetry by setting the environment variable:

export SOURCECODE_TELEMETRY=0

Configuration

sourcecode config    # show version, config file path, telemetry status

Download files

Source Distribution

sourcecode-1.31.13.tar.gz (543.1 kB)

Uploaded May 24, 2026 Source

Built Distribution

sourcecode-1.31.13-py3-none-any.whl (389.1 kB)

Uploaded May 24, 2026 Python 3

File details

Details for the file sourcecode-1.31.13.tar.gz.

File metadata

Download URL: sourcecode-1.31.13.tar.gz
Upload date: May 24, 2026
Size: 543.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for sourcecode-1.31.13.tar.gz
Algorithm	Hash digest
SHA256	`6a02dd6a3f7016097fbe0705498c0df16e7a5d9df29bd935151b817b0df0973a`
MD5	`3dc91f451d59664e02fa518fc4ba60a1`
BLAKE2b-256	`5e145178d18eaf5b5e90f9d07ff3fdf317f7cefa2bbfec22b226606e7308c4fb`

File details

Details for the file sourcecode-1.31.13-py3-none-any.whl.

File metadata

Download URL: sourcecode-1.31.13-py3-none-any.whl
Upload date: May 24, 2026
Size: 389.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for sourcecode-1.31.13-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9ea5ef399e719a00e4b3dd41c96a94fca6365b68f2c0a830dc444c11ac999509`
MD5	`50fcc23f90d82a3cbea5671b6cf7195d`
BLAKE2b-256	`88e3bc078d4ac43e5a371e3aad1829ac7bd13b5ac58427419b29496998b99890`
