ICSF – Intelligent Code Security & Fixing Platform (CLI)

These details have not been verified by PyPI

Project links

Project description

ICSF – Intelligent Code Security & Fixing Platform

ICSF is a full-stack, AI-powered platform that automates the discovery, analysis, and remediation of security vulnerabilities in Java/Maven codebases. It combines a multi-agent cognitive fixing pipeline with an autonomous self-healing testing framework (Atlas) to deliver a closed-loop system: from vulnerability report → verified, PR-ready fix.

Architecture Overview
End-to-End Application Flow
Backend Deep Dive
Frontend Deep Dive
RAG (Retrieval-Augmented Generation)
AI / LLM Integration
Input Requirements
Technical Stack
Getting Started
Project Structure
API Reference

🏗️ Architecture Overview

ICSF follows a layered, modular architecture:

┌─────────────────────────────────────────────────────────────────────┐
│                    Frontend (Streamlit – 5676 lines)                │
│   Premium dark-mode dashboard · Real-time progress · Lineage graph │
├─────────────────────────────────────────────────────────────────────┤
│                   Backend (FastAPI – 1636 lines)                    │
│         REST API · WebSocket/SSE · Request ID middleware            │
├────────────┬──────────────────────┬─────────────────────────────────┤
│  Services  │       Agents         │      Atlas Subsystem            │
│ (14 files) │  (Cognitive Loop)    │  (Self-Healing Testing)         │
│            │  5 agents + helpers  │  14 sub-packages                │
├────────────┼──────────────────────┼─────────────────────────────────┤
│            │   AWS Bedrock (LLM)  │  SQLite RAG Store               │
│            │  Claude 3.5 Sonnet   │  Titan Embeddings               │
│            │  Titan Embeddings    │  Cosine Similarity Search       │
└────────────┴──────────────────────┴─────────────────────────────────┘

Key Design Principles

Principle	Implementation
Single AI Provider	All LLM/embedding calls route through AWS Bedrock only
Multi-Agent Pipeline	5 specialized agents, each with a single responsibility
Self-Healing	Atlas auto-repairs build failures and test regressions
Cross-Repo Awareness	Dependency analysis spans across multiple repositories
Cost Control	`CostGuardService` enforces per-run budget limits
Resilience	Retry with exponential backoff + circuit breakers

🔄 End-to-End Application Flow

flowchart TD
    %% Entry & Configuration
    U((User)) -->|1. Setup| UI[Streamlit Dashboard]
    UI -->|2. Upload CSV| BE[FastAPI Backend]

    subgraph "Phase I: Discovery & Mapping"
        direction TB
        BE -->|3. Fetch Projects| GH[GitHub API]
        GH -->|4. List Repos| REPO[(Repository Store)]
        VS[Vulnerability Mapper] -->|5. Map Assets| FT[(Local Workspace)]
    end

    BE --> VS

    subgraph "Phase II: Quality Baseline – Atlas"
        direction LR
        BASE[Lightweight Baseline] -->|6. Verify Build| COV[Capture Coverage]
    end

    VS --> BASE

    subgraph "Phase III: Impact Analysis"
        direction TB
        DS[Dependency Service] -->|7. Map Blast Radius| MAP[Call Tree & Usage]
    end

    COV --> DS

    subgraph "Phase IV: Cognitive Fixing Loop"
        direction TB
        CC[Code Context Agent] --> FS[Fix Strategy Agent]
        FS --> CF[Code Fixer Agent]
        CF --> SV[Safety Validator Agent]
    end

    MAP -->|8. Start Fix| CC
    CC <-->|Rich Context| DS
    CF -->|9. Apply Fix| FT

    subgraph "Phase V: Self-Healing Pipeline – Atlas"
        direction TB
        BM[BuildMechanic] --> TH[TestHealer]
        TH --> TG[AI Test Generator]
        TG --> VR[Validation Report]
    end

    SV -->|10. Final Verify| BM
    BM -->|Self-Heal Build| FT
    FT --> TH

    subgraph "Phase VI: Delivery & Sync"
        direction TB
        VR --> PR[Batch PR Manager]
        PR -->|11. Create Sync PR| GH
        PR -->|12. Update UI| UI
    end

    %% RAG Knowledge Loop
    TG -.->|Save Success Patterns| RAG[(SQLite RAG Store)]
    RAG -.->|Context Enrichment| CC

Phase-by-Phase Walkthrough

Phase	What Happens	Key Service
I. Discovery	Upload CSV → fetch repos from GitHub API → match vulnerability file paths to repositories using intelligent path normalization	`VulnerabilityService`, `GitHubService`
II. Baseline	Run `mvn compile test` on the unmodified code to establish Ground Truth coverage & build health	`AtlasService.run_baseline_only()`
III. Impact Analysis	Parse Java files, build global dependency graph, find cross-repo callers of the vulnerable method	`DependencyService`
IV. Cognitive Fixing	4-agent pipeline: Analyze context → Plan strategy → Generate fix → Validate safety	`FixOrchestrator` + 4 Agents
V. Self-Healing	BuildMechanic auto-repairs compilation; TestHealer fixes broken tests; AI generates new security-targeted tests	Atlas pipeline
VI. Delivery	Aggregate all fixes into a single PR with rich markdown body, push to GitHub	`BatchPRService`, `PRManagerService`

🔧 Backend Deep Dive

1. FastAPI Main (`main.py`) — 1636 lines

The central orchestration hub. Defines the REST API, middleware, and all endpoint routes.

Startup & Middleware

Component	Purpose
`_startup_validation()`	Smoke-checks AWS + GitHub credentials on boot
`RequestIDMiddleware`	Injects a UUID `X-Request-ID` header into every request for log correlation
CORS middleware	Configurable via `ALLOWED_ORIGINS` env var

Pydantic Request/Response Models (inline)

Model	Fields	Used By
`GitHubRepoRequest`	`username`, `email`, `token`	`POST /api/github/repos`
`Repository`	`id`, `name`, `full_name`, `clone_url`, `language`, etc.	All repo endpoints
`RepositoriesResponse`	`username`, `total_repos`, `repositories[]`	Repo listing
`TestingRequest`	`repo_url`, `repo_path`, `fixed_files`, `create_pr`, `vulnerability`, etc.	Testing pipeline
`Vulnerability`	`file_name`, `line_no`	Vulnerability mapping
`MappedVulnerability`	`repo: Repository`, `vulnerabilities[]`	Mapping results

API Endpoints

Method	Route	Description
`GET`	`/`	Root welcome
`GET`	`/api/health`	Health check for Docker/LB probes
`GET`	`/api/credentials/github`	Retrieve loaded GitHub credentials
`GET`	`/api/credentials/verify`	Debug credential loading
`POST`	`/api/github/repos`	Fetch repos (POST with body)
`GET`	`/api/github/repos`	Fetch repos (GET with query params)
`POST`	`/api/vulnerabilities/map`	Upload CSV + map vulnerabilities to repos
`POST`	`/api/dependencies/analyze`	Analyze dependencies for a single vulnerability
`POST`	`/api/dependencies/batch-analyze`	Batch dependency analysis for multiple vulnerabilities
`POST`	`/api/fix/orchestrate`	Run the full multi-agent fixing pipeline
`POST`	`/api/pr/create`	Create a single PR with fixed code
`POST`	`/api/testing/start`	Start async testing pipeline job
`GET`	`/api/testing/job/{job_id}`	Poll job status
`GET`	`/api/testing/stream/{job_id}`	SSE event stream for real-time progress
`GET`	`/api/testing/runs`	List recent pipeline runs
`POST`	`/api/testing/run`	Legacy sync testing endpoint
`POST`	`/api/fix/batch`	Batch fix multiple vulnerabilities
`POST`	`/api/pr/merge`	Merge PR with conflict resolution
`POST`	`/api/pr/check-mergeability`	Check PR mergeability
`POST`	`/api/pr/create-batch`	Create single aggregated PR for all fixes

2. Configuration & Credentials

`config.py` — The Config Class

Attribute	Source	Default
`AWS_ACCESS_KEY_ID`	`.env`	—
`AWS_SECRET_ACCESS_KEY`	`.env`	—
`AWS_REGION`	`.env`	`us-east-1`
`AWS_SESSION_TOKEN`	`.env`	`None`
`BEDROCK_MODEL_ID`	`.env`	`anthropic.claude-3-5-sonnet-20240620-v1:0`
`BEDROCK_EMBED_MODEL_ID`	`.env`	`amazon.titan-embed-text-v1`

Key Methods:

get_github_credentials(force_reload=False) — Reads credentials.yaml for GitHub PAT, username, email
validate_bedrock_credentials() — Returns (is_valid, error_msg) tuple
get_bedrock_config() — Returns dict with access_key, secret_key, region

`credentials.yaml`

github:
  token: ghp_xxxxx
  username: your-username
  email: your-email@example.com

3. Pydantic Data Models (`models/agent_models.py`)

These 10 models define the complete data flow through the multi-agent pipeline:

flowchart LR
    VFR[VulnerabilityFixRequest] --> VA[VulnerabilityAnalysis]
    VA --> CC[CodeContext]
    CC --> FS[FixStrategy]
    FS --> CF[CodeFix]
    CF --> SV[SafetyValidation]
    SV --> FE[FixExplanation]
    VFR --> FOR[FixOrchestrationResult]
    VA --> FOR
    CC --> FOR
    FS --> FOR
    CF --> FOR
    SV --> FOR
    FE --> FOR

Model	Role	Key Fields
`VulnerabilityFixRequest`	Input to pipeline	`vulnerability_type`, `file_path`, `line_number`, `repo_path`
`VulnerabilityAnalysis`	Agent 1 output	`severity`, `security_impact`, `root_causes`, `fix_category`
`CodeContext`	Agent 2 output	`code_snippet`, `class_name`, `dependent_files_intra/inter`, `data_flow`
`FixStrategy`	Agent 3 output	`fix_approach`, `code_changes_plan`, `files_to_modify_primary/secondary`
`CodeFix`	Agent 4 output	`fixed_code` (Dict[path→code]), `diff`, `change_summary`, `reasoning`
`SafetyValidation`	Agent 5 output	`validation_status`, `correctness_score`, `breaking_changes`, `issues_found`
`FixExplanation`	Agent 6 output	`vulnerability_summary`, `fix_explanation`, `markdown_report`
`FixOrchestrationResult`	Complete result	Aggregates all agent outputs + `overall_status`, `errors`
`VulnerabilitySeverity`	Enum	`CRITICAL`, `HIGH`, `MEDIUM`, `LOW`, `INFO`
`ValidationResult`	Enum	`APPROVED`, `REJECTED`, `NEEDS_REVIEW`

4. Services Layer (`services/`)

The services layer contains 14 files providing the core business logic.

4.1 `bedrock_service.py` — AWS Bedrock LLM Wrapper (439 lines)

The primary AI gateway used by the Agents layer.

Method	Description
`invoke_claude(prompt, model_id, max_tokens, temperature, system_prompt)`	Synchronous Claude invocation via Bedrock `invoke_model` API
`ainvoke_claude(...)`	Async wrapper using `asyncio.to_thread`
`invoke_llama(prompt, ...)`	Llama 3 70B invocation (different payload format)
`ainvoke_llama(...)`	Async Llama wrapper
`embed_text(text, embed_model_id)`	Generate embeddings via Amazon Titan Embed
`invoke_model(model_id, prompt, ...)`	Generic dispatcher — auto-selects Claude/Llama based on model ID
`test_connection()`	Smoke test with simple prompt

Supported Model Constants:

Constant	Model ID
`CLAUDE_3_5_SONNET`	`anthropic.claude-3-5-sonnet-20240620-v1:0`
`CLAUDE_3_SONNET`	`anthropic.claude-3-sonnet-20240229-v1:0`
`LLAMA_3_70B`	`meta.llama3-70b-instruct-v1:0`

4.2 `github_service.py` — GitHub API Client (455 lines)

Method	Description
`verify_token_and_get_user(username)`	Validate PAT + retrieve user info
`get_user_by_username(username)`	Public API user lookup
`get_username_from_email(email)`	Reverse email → username lookup
`get_user_organizations()`	List authenticated user's orgs
`get_organization_repositories(org_name)`	List all repos in an org (paginated)
`get_all_repositories(username, include_private, include_orgs)`	Aggregated repo fetch (user + org repos)
`get_repository_details(owner, repo_name)`	Single repo metadata
`get_repository_file_tree(owner, repo_name, branch)`	Recursive file tree via Git Tree API

4.3 `vulnerability_service.py` — CSV Parser & Repo Mapper (833 lines)

Parses vulnerability reports from Fortify, Checkmarx, SonarQube, Snyk, etc.

Method	Description
`parse_csv_file(file_content, filename)`	Parse CSV into DataFrame; auto-detects column names
`extract_repo_name_from_url(url)`	Handles HTTPS, SSH, `.git` suffix URLs
`normalize_repo_identifier(repo_name, repo_url)`	Lowercase normalization for matching
`normalize_file_path(file_path)`	Cross-platform path normalization
`get_path_variations(file_path)`	Generates multiple path format variations for fuzzy matching
`match_file_in_repo(file_name, repo_files)`	Intelligent file matching with early-exit optimization
`clone_repository_and_get_files(repo_url, clone_dir)`	Git clone + file tree extraction
`map_vulnerabilities_to_repos(df, repositories, repo_files_map, clone_repos)`	Core mapping: CSV rows → repository + file matches

4.4 `dependency_service.py` — Java Dependency Graph Engine (2037 lines)

The largest service file. Performs static analysis of Java source code and Maven POM files.

Method	Description
`parse_java_file(file_path)`	Extracts package, imports, classes, methods, interfaces, method calls via regex/AST parsing
`parse_pom_xml(pom_path)`	Extracts `groupId`, `artifactId`, `version`, dependencies, parent POM
`find_java_files(repo_path)`	Recursive `.java` file discovery
`find_pom_files(repo_path)`	Recursive `pom.xml` file discovery
`build_global_dependency_graph(all_repos, artifact_index)`	Builds both intra-repo and inter-repo dependency edges. Node identity = `(repo_name, file_path)`
`build_intra_repo_dependencies(repo_path)`	File-to-file dependencies within a single repo (import-graph)
`find_maven_artifact_for_file(file_path, repo_path)`	Map a `.java` file to its Maven artifact coordinates
`find_cross_repo_dependent_files(...)`	Inter-repo blast radius: finds files in other repos that depend on the vulnerable file
`_build_cross_repo_dependency_chains(...)`	Transitive dependency chain traversal across repos (up to `max_depth=5`)
`build_maven_artifact_index(all_repos)`	Maps `(groupId, artifactId)` → repository metadata

4.5 `fix_orchestrator.py` — Multi-Agent Pipeline Controller (564 lines)

Coordinates the sequential agent execution:

Agent 2 (Code Context) → Agent 3 (Fix Strategy) → Agent 4 (Code Fix) → Agent 5 (Safety Validator)

Method	Description
`orchestrate_fix(request, stop_at_agent, validate_fix, max_validation_retries, all_repositories_info)`	Main entry point. Runs agents 2→5 sequentially, with optional validation loop
`get_orchestration_status(result)`	Human-readable status summary
`_create_skeleton_analysis(request)`	Generates a default `VulnerabilityAnalysis` from request data

Supports stop_at_agent for incremental testing (e.g., run only agents 2-3).

4.6 `batch_fix_service.py` — Batch Vulnerability Processing (753 lines)

Processes multiple vulnerabilities in sequence or with controlled concurrency.

Method	Description
`_process_vulnerability_fix(current_idx, vuln_idx, vuln, ...)`	Process a single vulnerability with logging
`_run_testing_agent(repo_path, repo_name, phase, fixed_files)`	Run Atlas testing (baseline or validation phase)
`fix_single_vulnerability(vulnerability, repo_path, ...)`	Single fix with full orchestration
`fix_batch_vulnerabilities(vulnerabilities, repo_path, ..., max_concurrent, auto_create_pr, run_tests_after_fix)`	Main batch entry point. Runs baseline → sequential fixes → validation → optional PR

Workflow: Baseline → Fix each vulnerability → Run Atlas validation → Create aggregated PR

4.7 `pr_manager_service.py` — Git & PR Operations (1359 lines)

Complete Git workflow management.

Method	Description
`_run_git_command(repo_path, command, timeout)`	Safe subprocess wrapper for git commands
`create_branch(repo_path, branch_name, base_branch)`	Create and checkout new branch
`commit_changes(repo_path, files_to_commit, commit_message, author_name, author_email)`	Stage + commit with configurable author
`push_branch(repo_path, branch_name, remote)`	Git push to remote
`create_pull_request(owner, repo, title, body, head_branch, base_branch)`	GitHub API PR creation
`_validate_compilation(repo_path, files_modified)`	Best-effort Maven/Gradle compilation check
`_clean_code_before_validation(code)`	Removes markdown artifacts, separator lines from LLM output
`_validate_java_code(code, file_path)`	Basic Java structure validation (package, class, brace matching)
`apply_fixed_code(repo_path, files_modified, fixed_code_map)`	Write fixed code to files with validation
`create_pr_for_fix(repo_path, repo_owner, ..., include_all_repo_changes)`	Complete workflow: apply → branch → commit → push → PR

4.8 `batch_pr_service.py` — Aggregated PR Creation (429 lines)

Method	Description
`_extract_files_and_code(fix_result)`	Parse fix result into `files_modified` + `fixed_code_map`
`create_single_pr(fix_result, ...)`	Create PR for one vulnerability
`create_batch_prs(successful_fixes, ...)`	One PR per vulnerability
`create_single_batch_pr(successful_fixes, ..., test_results)`	Single aggregated PR combining all fixes + test results

4.9 `atlas_service.py` — Testing Pipeline Façade (430 lines)

Bridges the backend API to the Atlas subsystem.

Method	Description
`_check_required_tools()`	Validates `git`, `mvn`, `java` are on PATH
`run_testing_pipeline(repo_url, create_pr, job_id)`	Full pipeline on remote repo (clone → test → coverage → PR)
`run_testing_pipeline_local(repo_path, repo_url, fixed_files, ...)`	Full pipeline on already-cloned local repo
`run_baseline_only(repo_path, repo_url)`	Lightweight: build + existing tests + coverage — NO AI

4.10 `fix_validator_service.py` — Post-Fix Validation (277 lines)

Method	Description
`validate_fix(repo_path, files_modified)`	Run Maven build + tests on the fixed repo. Uses `BuildMechanic` for auto-repair
`get_validation_feedback(validation_result)`	Generate feedback string for retry loop

4.11 `job_manager.py` — Async Job & SSE Streaming (86 lines)

Method	Description
`create_job()`	Create UUID-identified job with `asyncio.Queue`
`update_job(job_id, status, message, progress)`	Update status + push to SSE queue
`end_job(job_id)`	Signal `[DONE]` to SSE stream
`stream_job_events(job_id)`	Async generator for `StreamingResponse`

4.12 `run_history.py` — SQLite Run Persistence (115 lines)

Method	Description
`create_run(repo_url, repo_path)`	Insert new run record
`update_run(run_id, status, result_data, error_msg, cost)`	Update with test/coverage/regression/quality gate reports
`get_recent_runs(limit)`	Fetch recent runs with JSON report parsing

Schema: pipeline_runs(run_id, repo_url, repo_path, status, start_time, end_time, total_cost, test_report, coverage_report, regression_report, quality_gate_report, error_message)

4.13 `cost_guard.py` — LLM Cost Limiter (50 lines)

Method	Description
`start_run(run_id)`	Initialize per-run cost tracking
`add_cost(run_id, prompt_tokens, completion_tokens, model_id)`	Accumulate cost; returns `False` if budget exceeded
`get_run_cost(run_id)`	Query accumulated cost

Pricing: Claude 3.5 Sonnet — $0.003/1K prompt tokens, $0.015/1K completion tokens. Default budget: $5.00/run.

5. Agents Layer (`agents/`) — Cognitive Fixing Loop

5.1 `code_context_agent.py` — Blast Radius Mapper (642 lines)

Purpose: Understand the full context around a vulnerability — local code, dependent files, data flow.

Method	Logic
`_read_file_with_context(file_path, line_number, context_lines)`	Extract code snippet + surrounding context. Includes class/method even if vulnerability is in imports
`_extract_class_and_method(file_path, line_number, code_content)`	Regex-based Java class/method extraction
`_analyze_data_flow_and_usage(code_snippet, vulnerability_type, ...)`	LLM call: Analyze how user input flows from source → vulnerable sink
`_analyze_method_usage_in_dependents(vulnerable_class, ..., dependent_files)`	Check if the fix will break dependent files by analyzing their imports and usage
`_discover_other_repositories(current_repo_path)`	Scan `temp_cloned_repos/` directory for cross-repo analysis
`analyze(request, vulnerability_analysis, all_repositories_info)`	Main entry: reads file, finds dependents (intra + inter-repo), runs LLM data flow analysis

5.2 `fix_strategy_agent.py` — Surgical Planner (633 lines)

Purpose: Design a backward-compatible fix plan.

Method	Logic
`_get_available_java_files(repo_path, max_files)`	Inventory of Java files for validation
`_analyze_file_imports_and_usage(repo_path, file_path)`	Static import analysis to find related files
`_build_strategy_prompt(request, analysis, context)`	Constructs a detailed LLM prompt with vulnerability info, dependents, constraint rules
`_parse_strategy_response(response_content)`	JSON extraction from LLM response
`analyze(request, vulnerability_analysis, code_context)`	LLM call: Generate fix strategy; categorizes files as Primary (logic change) or Secondary (impacted usage)

Key Decision Logic:

If a method is called by 50+ files → force backward-compatible fix (overloaded method, not breaking change)
Uses FrameworkDetector for framework-specific recommendations (Spring Security, Jakarta, etc.)

5.3 `code_fix_agent.py` — Multi-File Code Generator (1071 lines)

Purpose: Generate actual fixed Java code across multiple files.

Method	Logic
`_read_file(file_path)`	Read source file
`_find_nearest_pom_xml(repo_path, file_rel_path)`	Walk up directories to find `pom.xml`
`_project_allows_spring_security(repo_path, file_rel_path, original_code)`	Check if Spring Security dependencies exist before generating SS code
`_dependency_constraints_text(repo_path, ...)`	Generate constraint text for LLM prompt
`_postprocess_for_project_dependencies(code, ...)`	Deterministic safety net: strip Spring Security constructs if project doesn't include it
`_generate_diff(original, fixed, file_path)`	Unified diff generation
`_clean_generated_code(code, file_path)`	Aggressive cleanup: removes markdown, `<thinking>` blocks, ensures valid Java
`_generate_fixed_code(original_code, request, ...)`	LLM call: Generate complete fixed file with prompt-chain reasoning
`fix_code(request, ..., fix_strategy)`	Main entry: fixes ALL files in `files_to_modify_primary`, runs post-processing

Uses ImportManager.add_missing_imports() and SyntaxValidator.validate() for post-processing.

5.4 `safety_validator_agent.py` — Logic Gate (371 lines)

Purpose: Verify the fix is correct, introduces no regressions.

Method	Logic
`_format_fixed_code(fixed_code)`	Format code dict for display
`_format_dependent_files_for_validation(code_context)`	Format dependent files context
`_build_validation_prompt(request, ..., code_fix)`	Comprehensive validation prompt
`_parse_validation_response(response_content)`	Extract structured validation data
`_normalize_validation_data(parsed)`	Ensure correct types for downstream consumption
`validate(request, ..., code_fix)`	LLM call: Returns `APPROVED`/`REJECTED`/`NEEDS_REVIEW` with `correctness_score` (0-1)

5.5 `codebase_analysis_agent.py` — Repository Intelligence (594 lines)

Purpose: Deep structural analysis of the codebase (similar to AI coding assistants).

Method	Logic
`analyze_codebase_structure(repo_path, focus_file)`	Full repo analysis with in-memory cache (TTL=300s)
`find_dependent_files(repo_path, target_file, max_depth)`	Find all files depending on target file
`analyze_code_flow(repo_path, file_path, line_number)`	Data flow analysis around a specific line
`_analyze_architecture(repo_path, java_files)`	Detect project layers (Controller, Service, DAO, etc.)
`_build_dependency_graph(repo_path, java_files)`	Build import-based dependency graph
`_detect_patterns(repo_path, java_files)`	Detect design patterns (Singleton, Factory, Builder, Observer)
`_parse_java_file(file_path)`	Extract package, imports, classes (regex-based)

5.6 `agent_improvements.py` — Helper Utilities (368 lines)

Four static helper classes:

Class	Purpose
`ImportManager`	Auto-detect and add missing Java imports (maps common security classes to their import statements)
`SyntaxValidator`	Basic Java syntax validation (brace matching, package declaration, class structure)
`FrameworkDetector`	Detect frameworks in `pom.xml` (Spring Boot, Spring Security, JPA, Jackson, etc.) with framework-specific fix recommendations
`ContextEnhancer`	Extract full method/class definitions from source code for enhanced prompt context

6. Atlas Subsystem (`atlas/`) — Self-Healing Testing Framework

Atlas is a comprehensive, autonomous testing and quality assurance pipeline with 14 sub-packages.

6.1 `orchestrator/run_pipeline.py` — Pipeline Core (1412 lines)

The brain of Atlas. Orchestrates the entire testing lifecycle.

Function	Description
`run_full_pipeline(repo_url, ...)`	Clone remote repo → full pipeline
`run_full_pipeline_local(repo_path, ...)`	Full pipeline on local repo
`run_baseline_only(repo_path, ...)`	Lightweight: build + test + coverage only
`_run_baseline_phase(repo_path, ...)`	Build (with restricted auto-fix) + existing tests + JaCoCo coverage
`_run_validation_phase(repo_path, ...)`	Diff-aware test generation, healing, regression detection (630+ lines)
`_run_full_pipeline_core(...)`	Core pipeline: baseline → validation → quality gate → PR
`evaluate_quality_gate(coverage, unit, min_coverage_pct, max_failures)`	Pass/fail decision on release readiness
`calculate_regression_report(state_mgr, ...)`	Compare current vs baseline to detect regressions/improvements
`_calculate_usage(llm)`	Compute estimated cost from Bedrock token metrics
`run_organization_pipeline(org_url, ...)`	Scan entire GitHub org: run pipeline on each Java/Maven repo

6.2 `agents/build_mechanic.py` — Build Failure Auto-Repair (1133 lines)

The SRE agent. Diagnoses and fixes compilation failures.

Method	Description
`analyze(stdout, stderr)`	Parse Maven build output → `BuildDiagnosis` (root cause, confidence, hints)
`generate_fix(diagnosis, workspace_path, ...)`	LLM call: Generate concrete fix (file patches, POM changes, config files)

Domain Expertise:

Spring Security 6 migration patterns (WebSecurityConfigurerAdapter → lambda DSL)
Deprecated API detection and deletion
Missing dependency resolution (maps class names → Maven coordinates)
COMMON_TEST_HINTS dictionary: 30+ patterns mapping class names to imports
Test assertion guidelines (status codes, JSON paths, mock strategies)

6.3 `agents/test_healer.py` — Test Failure Doctor (151 lines)

Method	Description
`heal(failed_tests, workspace_path)`	Group failures by class → LLM call: generate fixed test file → `AgentFix`
`_find_test_file(root, classname)`	Locate `.java` test file by class name

Processes top 10 failures, max 3 classes, max 5 failures per class.

6.4 `rag/store.py` — SQLite Vector RAG Store (210 lines)

Lightweight persistent RAG store for test pattern learning.

Method	Description
`upsert(id, kind, embedding, text, metadata)`	Insert/update with normalized float32 embedding blob
`query(embedding, top_k, kind, kinds, score_threshold, include_expired)`	Cosine similarity search via dot product
`get_by_id(id)`	Direct ID lookup
`count(kind)`	Count entries by kind
`evict_expired()`	TTL-based cleanup (default 30 days)

Schema: rag_items(id TEXT PK, kind TEXT, created_at INT, embedding BLOB, metadata_json TEXT, text TEXT) Indexes: kind, created_at

6.5 `llm/bedrock.py` — Atlas Bedrock Client (163 lines)

Dedicated Bedrock client for the Atlas subsystem.

Method	Description
`embed_text(text)`	Titan Embeddings: `inputText` → embedding vector
`generate_text(system, user, max_tokens)`	Claude Messages API via Bedrock `invoke_model`

Tracks total_input_tokens, total_output_tokens, total_embedding_tokens for cost calculation. Security: Permanent credentials (AKIA*) do NOT use session tokens; temporary (ASIA*) require them.

6.6 `generation/java_unit_test_generator.py` — RAG-Enhanced Test Gen (441 lines)

Method	Description
`generate_minimal_tests_for_repo(target_count, preferred_classes, ...)`	Discover main classes → prioritize by scoring → generate tests
`_generate_single_test(src, repo_path, ...)`	LLM + RAG call: Check fingerprint → query RAG for similar patterns → generate JUnit 5 test
`_set_fingerprint(class_key, sha, test_path)`	Store source hash in RAG for idempotent re-runs

Scoring heuristic for class prioritization:

+10 if in preferred classes list
+5 for service/controller/repository classes
+3 for @RestController/@Service/@Repository annotations
−2 for test/config/model classes

Uses RepoContractRegistry for constructor/method signature validation in generated tests.

6.7 `build/` — Build Infrastructure (5 files)

File	Purpose
`maven.py`	Maven command runner (`mvn compile`, `mvn test`, etc.) with subprocess management
`jacoco_injector.py`	Inject JaCoCo Maven plugin into `pom.xml` for code coverage
`spring_test_injector.py`	Inject `spring-boot-starter-test` dependency
`failsafe_injector.py`	Inject Maven Failsafe plugin for integration tests
`dependency_governance.py`	Enforce dependency version governance (BOM alignment, conflict resolution)

6.8 `core/` — Core Infrastructure (5 files)

File	Purpose
`config.py`	Atlas-specific configuration (data dirs, model IDs, etc.)
`logging.py`	`RunLogger` class for structured pipeline logging
`state.py`	`PipelineStateManager` — manages baseline/validation state persistence
`shell.py`	Safe shell command execution with timeout
`resilience.py`	Retry with exponential backoff (configurable attempts, jitter) + Circuit Breaker pattern (CLOSED/OPEN/HALF-OPEN states) + Rate Limiter

6.9 `analysis/` — Code Analysis (3 files)

File	Purpose
`java_maven.py`	Java project analysis: `detect_repo_facts()`, `count_existing_tests()`, `find_domain_models()`
`contract_service.py`	`RepoContractRegistry`: extract class constructors, method signatures for test generation validation
`diff_analyzer.py`	Analyze git diffs to identify functional changes for targeted test generation

6.10 `reporting/` — Test Reporting (2 files)

File	Purpose
`models.py`	Report dataclasses: `TestReport`, `CoverageReport`, `BreakageReport`, `GenerationReport`, `RegressionReport`, `QualityGateReport`, `UsageReport`, `FullRunReport`
`parsers.py`	Parse Surefire XML reports, JaCoCo CSV coverage data, classify test failures

6.11 `gitops/` — GitHub Integration (3 files)

File	Purpose
`github_pr.py`	Create PRs for Atlas-generated tests
`github_issues.py`	Create GitHub issues for persistent test failures
`github_org.py`	List repos in a GitHub organization for org-wide scanning

6.12 `repo/` — Repository Management (2 files)

File	Purpose
`cloner.py`	`RepoCloner`: Clone repos with token authentication
`history.py`	Run history tracking for the Atlas pipeline

🎨 Frontend Deep Dive

Technology: Streamlit (5676 lines, single app.py + utility modules)

UI Components

The frontend is a premium dark-mode dashboard with glassmorphism styling, gradient headers, and micro-animations. Key CSS tokens:

Background: #0f172a (dark slate), Secondary: #1e293b
Accent: linear-gradient(135deg, #3b82f6, #2dd4bf) (blue → teal)
Font: Inter (body), JetBrains Mono (code)

Core Functions

Function	Lines	Purpose
`main()`	45	Entry point: mode selector (Vulnerability Workflow vs Repository Explorer)
`display_vulnerability_workflow(api_url)`	~600	Streamlined flow: Upload → Map → Test → Fix → Verify
`display_repositories(data)`	~2400	Full repository explorer with vulnerability cards, dep trees, fix controls
`process_active_batch_fix(selected_repo_id, ...)`	~1040	Real-time batch fix processing with progress bars
`display_lineage_graph(result, repo_name, vuln_idx)`	~275	NetworkX-based dependency graph visualization
`fetch_repositories(api_base_url, ...)`	30	Call backend to fetch GitHub repos
`map_vulnerabilities(api_url, repositories_data, csv_file)`	28	Upload CSV and map vulnerabilities
`run_testing_agent(api_url, repo_url)`	70	SSE streaming of Atlas pipeline progress
`batch_fix_vulnerabilities(api_url, vulnerabilities, ...)`	~210	Call batch fix endpoint with progress callbacks
`display_lineage_graph._extract_paths(items)`	10	Extract file paths from dependent files list
`display_setup_progress(current_step)`	~160	Animated 3-step progress tracker (Upload → Fetch → Map)
`display_run_history(api_url)`	70	Fetch and display pipeline run history table

Frontend Utility Modules

File	Purpose
`src/vulnerability_ui.py` (52KB)	Advanced vulnerability display: cards, severity badges, fix result rendering
`src/lineage.py` (10KB)	Lineage graph data transformations
`utils/atlas_report_comprehensive.py` (21KB)	Comprehensive Atlas report rendering
`utils/integrate_render.py` (4KB)	Report integration helpers

🧠 RAG (Retrieval-Augmented Generation)

ICSF uses a custom RAG implementation for test pattern learning:

Architecture

                    ┌──────────────────┐
                    │   Titan Embed    │
                    │   (Bedrock)      │
                    └────────┬─────────┘
                             │ embedding vector
                    ┌────────▼─────────┐
                    │  SqliteVectorRag │
                    │     Store        │
                    │  (cosine search) │
                    └────────┬─────────┘
                             │ similar patterns
                    ┌────────▼─────────┐
                    │  Test Generator  │
                    │  (LLM prompt)    │
                    └──────────────────┘

How RAG is Used

Fingerprint Check: Before generating a test, hash the source file → query RAG for existing fingerprint → skip if unchanged
Pattern Retrieval: Query RAG store for similar test patterns (kind=test_pattern) with cosine similarity ≥ 0.25
Context Injection: Retrieved patterns are injected into the LLM prompt as examples
Pattern Storage: After successful test generation, store the pattern in RAG for future use
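The fingerprint check in the first step can be sketched as below. This is a minimal illustration, not the Atlas implementation: the `FingerprintIndex` class, its in-memory dictionary, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib


def fingerprint(source: str) -> str:
    """Return a stable content hash for a source file's text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


class FingerprintIndex:
    """Hypothetical stand-in for the fingerprint records kept in the RAG store."""

    def __init__(self) -> None:
        self._seen: dict[str, str] = {}  # file path -> last recorded fingerprint

    def needs_generation(self, path: str, source: str) -> bool:
        """True when the file is new or its content changed since the last run."""
        return self._seen.get(path) != fingerprint(source)

    def record(self, path: str, source: str) -> None:
        """Remember the fingerprint after a test was generated successfully."""
        self._seen[path] = fingerprint(source)


index = FingerprintIndex()
code = "class Foo { int bar() { return 1; } }"
first = index.needs_generation("Foo.java", code)        # new file: generate
index.record("Foo.java", code)
second = index.needs_generation("Foo.java", code)       # unchanged: skip
third = index.needs_generation("Foo.java", code + " ")  # changed: regenerate
```

Hashing the file content rather than its modification time keeps the check stable across fresh clones of the same repository.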

RAG Store Configuration

Setting	Value
Database	SQLite (`data/atlas_rag.db`)
Embedding Model	Amazon Titan Embed Text v1
Embedding Dimension	1536 (float32)
Similarity Metric	Cosine (via dot product on normalized vectors)
TTL	30 days (auto-eviction of stale entries)
Score Threshold	0.25 minimum cosine similarity
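The settings above can be illustrated with a small SQLite-backed search. This is a sketch under stated assumptions: the `items` table schema and the helper names are hypothetical, and 3-dimensional toy vectors stand in for the 1536-dimensional Titan embeddings. Only the float32 storage, the dot product on normalized vectors, the 0.25 threshold, and the 30-day TTL come from the table.

```python
import math
import sqlite3
import struct
import time

THRESHOLD = 0.25              # minimum cosine similarity
TTL_SECONDS = 30 * 24 * 3600  # 30-day TTL


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def pack(vec):
    """Serialize a vector as float32 bytes for a BLOB column."""
    return struct.pack(f"{len(vec)}f", *vec)


def unpack(blob):
    """Deserialize float32 bytes back into a list of floats."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))


def add(conn, kind, text, vec, now=None):
    """Store a normalized vector with its creation timestamp."""
    created = now if now is not None else time.time()
    conn.execute(
        "INSERT INTO items(kind, text, vec, created) VALUES (?, ?, ?, ?)",
        (kind, text, pack(normalize(vec)), created),
    )


def search(conn, kind, query_vec, now=None, top_k=3):
    """Evict stale rows, then return (score, text) pairs at or above THRESHOLD."""
    now = now if now is not None else time.time()
    conn.execute("DELETE FROM items WHERE created < ?", (now - TTL_SECONDS,))
    q = normalize(query_vec)
    scored = []
    rows = conn.execute("SELECT text, vec FROM items WHERE kind = ?", (kind,))
    for text, blob in rows:
        score = sum(a * b for a, b in zip(q, unpack(blob)))
        if score >= THRESHOLD:
            scored.append((score, text))
    return sorted(scored, reverse=True)[:top_k]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items(kind TEXT, text TEXT, vec BLOB, created REAL)")
add(conn, "test_pattern", "mockito service test", [1.0, 0.0, 0.0])
add(conn, "test_pattern", "unrelated pattern", [0.0, 1.0, 0.0])
add(conn, "test_pattern", "expired pattern", [1.0, 0.0, 0.0], now=0.0)
hits = search(conn, "test_pattern", [0.9, 0.1, 0.0])
```

In this toy run only the first pattern is returned: the second scores below the threshold and the third is evicted by the TTL before scoring.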

🤖 AI / LLM Integration

Models Used

Model	Use Case	Provider
Claude 3.5 Sonnet	All reasoning: code analysis, fix generation, strategy planning, safety validation, build repair, test healing	AWS Bedrock
Amazon Titan Embed Text v1	Text embeddings for RAG store	AWS Bedrock
Llama 3 70B (optional)	Alternative generation model	AWS Bedrock

LLM Call Sites

Component	# of LLM Calls	Purpose
`CodeContextAgent`	1	Data flow analysis
`FixStrategyAgent`	1	Fix strategy planning
`CodeFixAgent`	1 per file	Code generation
`SafetyValidatorAgent`	1	Fix validation
`BuildMechanic`	1–3 per build failure	Build error diagnosis + fix
`TestHealer`	1 per test class	Test repair
`JavaUnitTestGenerator`	1 per source class	Test generation
Total per vulnerability	~6–12	Depending on file count and failure iterations
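As a worked example of how the per-vulnerability total arises, the function below adds up the rows of the table. The function name and parameters are hypothetical; the per-component counts are the ones listed above, with `BuildMechanic` contributing between 1 and 3 calls per build failure.

```python
def estimate_llm_calls(files_fixed, build_failures, test_classes_healed, source_classes_tested):
    """Return (min, max) LLM calls for one vulnerability, following the table above."""
    fixed = 3  # CodeContextAgent + FixStrategyAgent + SafetyValidatorAgent, one call each
    base = fixed + files_fixed + test_classes_healed + source_classes_tested
    # BuildMechanic makes 1 to 3 calls per build failure.
    return base + build_failures * 1, base + build_failures * 3


simple = estimate_llm_calls(files_fixed=1, build_failures=0,
                            test_classes_healed=1, source_classes_tested=1)
harder = estimate_llm_calls(files_fixed=2, build_failures=1,
                            test_classes_healed=2, source_classes_tested=2)
```

A single-file fix with one generated and one healed test class lands at 6 calls; a two-file fix with one build failure lands between 10 and 12.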

Cost Management

CostGuardService tracks cost per run against a default budget of $5.00
Pricing model: Claude 3.5 Sonnet @ $0.003/1K input, $0.015/1K output
_calculate_usage() in the pipeline reports total tokens + estimated cost
BedrockClient tracks total_input_tokens, total_output_tokens, total_embedding_tokens
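The pricing and budget above translate into simple arithmetic. The sketch below is illustrative only: `RunBudget` and `BudgetExceeded` are hypothetical names and do not mirror the actual `CostGuardService` interface.

```python
INPUT_PRICE_PER_1K = 0.003   # USD per 1K input tokens (Claude 3.5 Sonnet)
OUTPUT_PRICE_PER_1K = 0.015  # USD per 1K output tokens (Claude 3.5 Sonnet)
DEFAULT_BUDGET_USD = 5.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for the given token counts."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
        + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K


class BudgetExceeded(RuntimeError):
    """Raised when a run spends more than its budget."""


class RunBudget:
    """Hypothetical per-run tracker that accumulates estimated cost."""

    def __init__(self, budget_usd: float = DEFAULT_BUDGET_USD) -> None:
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def charge(self, input_tokens: int, output_tokens: int) -> float:
        """Add the cost of one LLM call and fail once the budget is exceeded."""
        self.spent_usd += estimate_cost(input_tokens, output_tokens)
        if self.spent_usd > self.budget_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.budget_usd:.2f} budget"
            )
        return self.spent_usd


budget = RunBudget()
# 200K input tokens cost $0.60 and 40K output tokens cost $0.60.
spent = budget.charge(input_tokens=200_000, output_tokens=40_000)
```

At these rates output tokens cost five times as much as input tokens, so long generated fixes dominate the bill more quickly than large prompts do.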

📥 Input Requirements

1. Security Vulnerability Report (CSV)

Supported scanners: Fortify, Checkmarx, SonarQube, Snyk

Required Column	Example
`vulnerability_type` or `category`	Cross-Site Scripting
`file_name` or `file_path`	`src/main/java/com/example/Controller.java`
`line_no` or `line_number`	`42`
`severity`	Critical / High / Medium / Low
`description`	User input is rendered without encoding
`recommendation`	Use OWASP encoder for output encoding
`repo_name` or `link`	`my-app` or `https://github.com/org/my-app`
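Because several columns accept two names, a parser has to fold the aliases into one set of fields. The sketch below shows one way to do that with the standard library. The canonical field names on the left of `ALIASES` and the `normalize_rows` helper are assumptions for illustration, not the behavior of `vulnerability_service.py`.

```python
import csv
import io

# Canonical field (assumed name) -> accepted column names from the table above.
ALIASES = {
    "vulnerability_type": ("vulnerability_type", "category"),
    "file_path": ("file_path", "file_name"),
    "line_number": ("line_number", "line_no"),
    "severity": ("severity",),
    "description": ("description",),
    "recommendation": ("recommendation",),
    "repo": ("repo_name", "link"),
}


def normalize_rows(csv_text: str) -> list[dict]:
    """Parse a scanner CSV and return rows keyed by canonical field names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = set(reader.fieldnames or [])
    missing = [field for field, names in ALIASES.items()
               if not headers.intersection(names)]
    if missing:
        raise ValueError(f"missing required columns: {', '.join(missing)}")
    rows = []
    for raw in reader:
        row = {}
        for field, names in ALIASES.items():
            # Take the first alias present in this file's header.
            value = next(raw[name] for name in names if name in raw) or ""
            row[field] = value.strip()
        row["line_number"] = int(row["line_number"])
        rows.append(row)
    return rows


sample = (
    "category,file_name,line_no,severity,description,recommendation,repo_name\n"
    "Cross-Site Scripting,src/main/java/com/example/Controller.java,42,High,"
    "User input is rendered without encoding,"
    "Use OWASP encoder for output encoding,my-app\n"
)
rows = normalize_rows(sample)
```

Validating the header before reading any rows means a report exported with the wrong template fails immediately with the list of missing columns.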

2. Version Control Credentials

GitHub PAT: Requires repo and read:user scopes
Stored in backend/credentials.yaml

3. AI Model Access (AWS Bedrock)

AWS credentials: AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY in .env
Region: us-east-1 (default) or any Bedrock-enabled region
Model access: Must have Claude 3.5 Sonnet + Titan Embeddings enabled in your AWS account

4. Build Environment

Java JDK 17+ on PATH
Maven on PATH
Git on PATH
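A quick preflight check for these tools might look like the following. It is a sketch, not part of ICSF: `missing_tools` and `java_major` are hypothetical helpers, and the version parser assumes the usual `java -version` banner, which the JDK writes to stderr.

```python
import re
import shutil

REQUIRED_TOOLS = ("java", "mvn", "git")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the required executables that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def java_major(version_output: str):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not match:
        return None
    major = int(match.group(1))
    # Legacy scheme: "1.8.0_292" means Java 8.
    if major == 1 and match.group(2):
        return int(match.group(2))
    return major


modern = java_major('openjdk version "17.0.9" 2023-10-17')
legacy = java_major('java version "1.8.0_292"')
```

To apply it, run `java -version` with `subprocess.run(..., capture_output=True, text=True)`, pass the captured stderr to `java_major`, and compare the result against 17.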

🛠️ Technical Stack

Layer	Technology	Version
Language	Python	3.10+
Backend Framework	FastAPI	≥0.104
Frontend Framework	Streamlit	≥1.28
LLM Provider	AWS Bedrock (Boto3)	≥1.34
Embedding Model	Amazon Titan Embed Text v1	—
Reasoning Model	Claude 3.5 Sonnet	—
Database	SQLite	(stdlib)
HTTP Client	httpx	≥0.28
Data Processing	pandas	≥2.0
Version Control	GitPython + GitHub API	≥3.1
Graph Analysis	NetworkX	≥3.0
Validation	Pydantic	≥2.10
Containerization	Docker Compose	3.8 (Compose file format)
Build Tools	Maven, JDK 17+	—

🚀 Getting Started

Prerequisites

Git, Java JDK 17+, and Maven installed and on PATH
Python 3.10+
AWS credentials with Bedrock access (Claude 3.5 Sonnet + Titan Embeddings enabled)
GitHub PAT with repo and read:user scopes

Environment Setup

Create .env in backend/:

AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_REGION=us-east-1
BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20240620-v1:0
BEDROCK_EMBED_MODEL_ID=amazon.titan-embed-text-v1

Create credentials.yaml in backend/:

github:
  token: ghp_your_personal_access_token
  username: your-github-username
  email: your-email@example.com
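Before starting the backend, the `.env` file can be checked for missing keys. The parser below is a deliberately small sketch: `parse_env` and `missing_keys` are hypothetical helpers, and treating all five keys as required is an assumption, since the region has a documented default.

```python
REQUIRED_KEYS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "BEDROCK_MODEL_ID",
    "BEDROCK_EMBED_MODEL_ID",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


example = """\
# AWS credentials
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_REGION=us-east-1
BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20240620-v1:0
"""
env = parse_env(example)
problems = missing_keys(env)
```

In this example `problems` names the one key that was left out of the file, the embedding model ID.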

Run with Docker (Recommended)

docker-compose up --build

Backend: http://localhost:8000
Frontend: http://localhost:8501
The backend has a 4 GB memory limit; the frontend has a 1 GB limit
Health checks are configured for both services
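After `docker-compose up`, a script can wait for the backend to come up by polling the `/api/health` endpoint. The sketch below assumes the endpoint answers a plain GET with HTTP 200 when healthy; `http_ok` and `wait_until_healthy` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True when a GET on the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(url, attempts=30, delay=2.0, probe=http_ok, sleep=time.sleep):
    """Poll until the probe succeeds; return the attempt number, or None on timeout."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        sleep(delay)
    return None
```

For example, `wait_until_healthy("http://localhost:8000/api/health")` returns the number of attempts it took, or `None` if the backend never answered. The `probe` and `sleep` parameters exist so the loop can be exercised without a running server.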

Manual Installation

Backend:

cd backend
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt
uvicorn main:app --reload --host 0.0.0.0 --port 8000

Frontend:

cd frontend
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt
streamlit run app.py --server.port 8501

📂 Project Structure

ICSF/
├── backend/
│   ├── main.py                          # FastAPI entrypoint (1636 lines, 20+ endpoints)
│   ├── config.py                        # Config class (AWS, Bedrock, GitHub credentials)
│   ├── credentials.yaml                 # GitHub PAT + user info
│   ├── logging_config.py               # Global logging configuration
│   ├── .env                             # AWS credentials (not committed)
│   │
│   ├── models/
│   │   └── agent_models.py              # 10 Pydantic models for pipeline data flow
│   │
│   ├── services/                        # 14 service files
│   │   ├── bedrock_service.py           # AWS Bedrock LLM wrapper (Claude, Llama, Titan)
│   │   ├── github_service.py            # GitHub API client (repos, orgs, file trees)
│   │   ├── vulnerability_service.py     # CSV parsing & repo mapping (833 lines)
│   │   ├── dependency_service.py        # Java dependency graph engine (2037 lines)
│   │   ├── fix_orchestrator.py          # Multi-agent pipeline controller
│   │   ├── batch_fix_service.py         # Batch vulnerability processing
│   │   ├── pr_manager_service.py        # Git operations & PR creation (1359 lines)
│   │   ├── batch_pr_service.py          # Aggregated PR creation
│   │   ├── atlas_service.py             # Testing pipeline façade
│   │   ├── fix_validator_service.py     # Post-fix build/test validation
│   │   ├── job_manager.py               # Async job & SSE streaming
│   │   ├── run_history.py               # SQLite run persistence
│   │   └── cost_guard.py                # LLM cost limiter ($5/run default)
│   │
│   ├── agents/                          # 7 agent files (Cognitive Fixing Loop)
│   │   ├── code_context_agent.py        # Blast radius mapper (642 lines)
│   │   ├── fix_strategy_agent.py        # Surgical planner (633 lines)
│   │   ├── code_fix_agent.py            # Multi-file code generator (1071 lines)
│   │   ├── safety_validator_agent.py    # Logic gate validator (371 lines)
│   │   ├── codebase_analysis_agent.py   # Repository intelligence (594 lines)
│   │   └── agent_improvements.py        # Helpers: ImportManager, SyntaxValidator, etc.
│   │
│   ├── atlas/                           # Self-Healing Testing Framework
│   │   ├── orchestrator/
│   │   │   └── run_pipeline.py          # Pipeline core (1412 lines)
│   │   ├── agents/
│   │   │   ├── build_mechanic.py        # Build failure auto-repair (1133 lines)
│   │   │   ├── test_healer.py           # Test failure doctor (151 lines)
│   │   │   └── models.py               # Agent data models
│   │   ├── rag/
│   │   │   └── store.py                 # SQLite vector RAG store (210 lines)
│   │   ├── llm/
│   │   │   └── bedrock.py               # Atlas Bedrock client (163 lines)
│   │   ├── generation/
│   │   │   └── java_unit_test_generator.py  # RAG-enhanced test gen (441 lines)
│   │   ├── build/
│   │   │   ├── maven.py                 # Maven command runner
│   │   │   ├── jacoco_injector.py       # JaCoCo coverage plugin injection
│   │   │   ├── spring_test_injector.py  # Spring test dependency injection
│   │   │   ├── failsafe_injector.py     # Failsafe plugin injection
│   │   │   └── dependency_governance.py # Dependency version governance
│   │   ├── core/
│   │   │   ├── config.py                # Atlas configuration
│   │   │   ├── logging.py              # RunLogger
│   │   │   ├── state.py                # Pipeline state manager
│   │   │   ├── shell.py                # Safe shell execution
│   │   │   └── resilience.py           # Retry, circuit breaker, rate limiter
│   │   ├── analysis/
│   │   │   ├── java_maven.py           # Java project fact detection
│   │   │   ├── contract_service.py     # Constructor/method signature registry
│   │   │   └── diff_analyzer.py        # Git diff → functional change detection
│   │   ├── reporting/
│   │   │   ├── models.py               # Report dataclasses
│   │   │   └── parsers.py              # Surefire XML & JaCoCo CSV parsers
│   │   ├── gitops/
│   │   │   ├── github_pr.py            # PR creation for generated tests
│   │   │   ├── github_issues.py        # Issue creation for failures
│   │   │   └── github_org.py           # Organization repo listing
│   │   └── repo/
│   │       ├── cloner.py               # Repository cloning
│   │       └── history.py              # Run history tracking
│   │
│   ├── scripts/                         # Utility & test scripts
│   │   ├── test_bedrock_connection.py
│   │   ├── test_cross_repo_dependencies.py
│   │   ├── test_dependency_analysis.py
│   │   ├── test_orchestrator.py
│   │   ├── analyze_all_matched_files.py
│   │   └── visualize_dependency_mapping.py
│   │
│   └── data/                            # SQLite databases & logs
│       ├── runs.db                      # Pipeline run history
│       └── atlas_rag.db                 # RAG vector store
│
├── frontend/
│   ├── app.py                           # Streamlit UI (5676 lines)
│   ├── src/
│   │   ├── vulnerability_ui.py          # Vulnerability display components
│   │   └── lineage.py                   # Lineage graph data transforms
│   ├── utils/
│   │   ├── atlas_report_comprehensive.py # Atlas report rendering
│   │   └── integrate_render.py          # Report integration helpers
│   ├── requirements.txt
│   └── Dockerfile
│
├── docker-compose.yml                   # Multi-container setup
├── start_frontend.bat                   # Windows frontend launcher
└── start_frontend.sh                    # Linux/Mac frontend launcher

📡 API Reference

Health & Credentials

Endpoint	Method	Description
`/api/health`	GET	Health check (Docker/LB probes)
`/api/credentials/github`	GET	Retrieve loaded GitHub credentials
`/api/credentials/verify`	GET	Debug credential loading

Repository Management

Endpoint	Method	Description
`/api/github/repos`	POST/GET	Fetch GitHub repositories

Vulnerability Management

Endpoint	Method	Description
`/api/vulnerabilities/map`	POST	Upload CSV + map vulnerabilities
`/api/dependencies/analyze`	POST	Single vulnerability dependency analysis
`/api/dependencies/batch-analyze`	POST	Batch dependency analysis

Fix Operations

Endpoint	Method	Description
`/api/fix/orchestrate`	POST	Full multi-agent fix pipeline
`/api/fix/batch`	POST	Batch fix multiple vulnerabilities

Testing Pipeline

Endpoint	Method	Description
`/api/testing/start`	POST	Start async testing job
`/api/testing/job/{job_id}`	GET	Poll job status
`/api/testing/stream/{job_id}`	GET	SSE event stream
`/api/testing/runs`	GET	Pipeline run history
`/api/testing/run`	POST	Legacy sync testing
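A client of `/api/testing/stream/{job_id}` has to decode the Server-Sent Events wire format. The parser below handles the standard `event:` and `data:` fields and blank-line event boundaries. The JSON payloads in the sample are hypothetical, since the actual event schema is not documented in this section.

```python
import json


def iter_sse_events(lines):
    """Yield (event, data) tuples from an iterable of decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line ends the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Drop the single optional space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)


# Hypothetical stream content; the field names are illustrative only.
sample = [
    'data: {"stage": "clone", "progress": 10}\n',
    "\n",
    ": keep-alive\n",
    "event: done\n",
    'data: {"status": "completed"}\n',
    "\n",
]
events = list(iter_sse_events(sample))
first_payload = json.loads(events[0][1])
```

With an HTTP client that supports streaming, the same generator can consume the response line by line, while `/api/testing/job/{job_id}` remains available for plain polling.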

Pull Request Management

Endpoint	Method	Description
`/api/pr/create`	POST	Create single PR
`/api/pr/create-batch`	POST	Create aggregated PR
`/api/pr/merge`	POST	Merge with conflict resolution
`/api/pr/check-mergeability`	POST	Check PR mergeability

ICSF — Making code security intelligent, automated, and reliable.


