SCOM-based microservice boundary analysis from Jaeger traces
Project description
Changelog
v0.7.6 (2026-06-19)
Bug fix
`_load_llm_analysis` `NameError`: `REPORT_FILE` was imported as `_REPORT_FILE` but used without the underscore. Fixed by using `_REPORT_FILE`.
v0.7.5 (2026-06-19)
Dashboard & SCOM classification
- Dashboard: `_load_service_rank_from`, `_load_endpoint_table_map_from`, and `_get_data_freshness` now fall back to run-registry filenames (`service_rank.csv`, `service_scom.csv`, `meta.json`) when the old pipeline paths (`processed/`, `interim/`) don't exist. Fixes `UPDATED: unknown` on the dashboard.
- SCOM cohesion labels: New `classify_scom()` function in `_utils.py` with thresholds (≥0.8 Très cohésif (very cohesive), ≥0.5 Cohésif (cohesive), ≥0.3 Peu cohésif (weakly cohesive), <0.3 Pas cohésif (not cohesive)). Added "Cohésion" column to both `mba runs show` CLI output and the dashboard table.
- Tests: 567 passed, 0 failed, no regressions.
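The threshold mapping in the cohesion-label entry above can be sketched as follows. This is a minimal illustration, not the actual contents of `_utils.py`: the signature and the behaviour for out-of-range or NaN scores are assumptions; only the four thresholds and labels come from the changelog.

```python
def classify_scom(score: float) -> str:
    """Map a SCOM score to a cohesion label using the changelog thresholds.

    Assumption: scores are floats in [0, 1]; anything below 0.3
    (including NaN, for which every comparison is False) falls through
    to the lowest label.
    """
    if score >= 0.8:
        return "Très cohésif"
    if score >= 0.5:
        return "Cohésif"
    if score >= 0.3:
        return "Peu cohésif"
    return "Pas cohésif"
```

Checking the highest threshold first means each boundary value (0.8, 0.5, 0.3) lands in the upper of its two neighbouring classes, which matches the "≥" wording of the entry.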
v0.7.4 (2026-06-19)
Bug fixes & resilience improvements
- Version sync: `__version__` bumped from 0.6.6 → 0.7.4 to match `pyproject.toml`.
- Jaeger reset (`_ensure_jaeger_ports_free`): Added port-based container lookup (`docker ps --filter publish=<port>`) alongside name-based lookup. Fixes `--reset-jaeger` failing when the Jaeger container has a different name.
- Jaeger reset (`_reset_jaeger_container`): Now accepts `otlp_port`, searches by both name and published port, and passes `otlp_port` to `start_jaeger()`. Also called in the local-process deployment branch.
- Trace isolation (`_export_jaeger_traces`): Added `start_time` parameter. Traces are now filtered client-side by span `startTime`, preventing old traces from polluting SCOM analysis across runs.
- Alpine Dockerfile (`_generate_otel_dockerfile`): Fixed line-index shift logic. It now uses a `num_inserted` counter instead of a hardcoded `+1`, preventing ENTRYPOINT corruption on Alpine images.
- Report path (`orchestrator.py`): `output_dir` is now only deleted if the analysis step failed. The temp dir is cleaned in `cli.py` after `save_run`.
- DNS fallback (`_build_compose_override`): When `include_jaeger=False`, `otel_host` is forced to `host.docker.internal` so services never depend on fragile Docker DNS resolution.
- Java volume quoting: Removed nested double quotes in volume mount string.
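The client-side trace filtering described in the trace-isolation entry could look roughly like this. It is a sketch, not the code of `_export_jaeger_traces`: the function name `filter_traces_since` is hypothetical, and it assumes the Jaeger JSON shape in which each trace carries a `spans` list and each span's `startTime` is expressed in microseconds since the Unix epoch.

```python
from datetime import datetime, timezone


def filter_traces_since(traces: list[dict], start_time: datetime) -> list[dict]:
    """Keep only traces whose earliest span started at or after start_time.

    Assumption: span["startTime"] is an integer number of microseconds
    since the epoch. Pass a timezone-aware datetime; a naive one would be
    interpreted in local time by datetime.timestamp().
    """
    cutoff_us = int(start_time.timestamp() * 1_000_000)
    kept = []
    for trace in traces:
        span_starts = [s["startTime"] for s in trace.get("spans", []) if "startTime" in s]
        # Traces without any timed span cannot be attributed to this run, so drop them.
        if span_starts and min(span_starts) >= cutoff_us:
            kept.append(trace)
    return kept


# Usage: record the run start before generating traffic, then filter the export.
run_start = datetime(2026, 6, 19, 12, 0, tzinfo=timezone.utc)
```

Filtering on the earliest span rather than any span is one possible choice; it discards traces that began before the run even if some of their spans finished during it.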
v0.7.2 (2026-06-19)
Traffic gen for POST endpoints, Jaeger isolation, endpoint count display
- traffic.py: POST/PUT/PATCH without OpenAPI schema now guesses a JSON body from the endpoint path (e.g. `/employees/insert` → `{"name":"...", ...}`, `/delete` → `{"id":1}`). Falls back through multiple body shapes on 4xx.
- orchestrator.py: New `_reset_jaeger_container()` stops/removes the existing Jaeger and starts fresh; activated by the `--reset-jaeger` CLI flag.
- cli.py: Added `--reset-jaeger` flag to `mba full`.
- run_registry.py: `_build_run_meta` falls back to the SCOM CSV endpoint count when `project.services` have empty endpoints (fixes `Endpoints: 0` in `mba runs show`).
v0.7.1 (2026-06-19)
Hotfix — missing import socket in deploy.py
- deploy.py: `_is_port_in_use()` used `socket` without importing it, crashing the pipeline before any deploy could start. Added `import socket`.
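For context, a port check of this kind is typically a few lines around the standard `socket` module. The version below is an assumption about what such a helper might look like (the name `is_port_in_use`, the default host, and the timeout are illustrative), not the body of the function in `deploy.py`.

```python
import socket


def is_port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an errno value otherwise,
        # so no exception handling is needed for a refused connection.
        return sock.connect_ex((host, port)) == 0
```

Without the `import socket` line at the top, the first call fails with `NameError`, which is the crash the hotfix addresses.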
v0.7.0 (2026-06-19)
Full audit — 71 bugs fixed (11 P0, 17 P1, 43 P2)
- 11 P0 fixes: entry_points[0] IndexError, health_endpoint=None→URL, 4xx treated healthy, error msg empty list, temp dirs leak, pandas import order, hardcoded report path, bool(NaN)=True, logging.basicConfig no-op, Windows zombies (`_kill_process_tree`)
- 17 P1 fixes: Docker DNS order, CWD in check_container_alive, host.docker.internal Linux, empty ProjectInfo, lookback CLI arg, dashboard dropdown, TOCTOU runs.json, flush delay, --data-dir ignored, multi-network connect, Alpine build deps, LLM multi-lang, per-service Dockerfile, volume path quoting, fallback CID, LLM numbered backups, zero table falsy, dashboard KeyErrors
- Run comparison: `mba runs compare`, side-by-side SCOM per service with Δ column
- SCOM trend chart: multi-run timeline per service in dashboard
- Process management: cross-platform `_kill_process_tree()` (Unix SIGKILL / Windows taskkill)
- Adaptive polling: hardcoded `time.sleep(3/5)` replaced with adaptive Jaeger API poll and trace wait
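The changelog does not say how the adaptive poll is implemented. One common way to replace a fixed `time.sleep()` is to poll a readiness check with a growing delay until a deadline; the sketch below shows that pattern under that assumption. The name `poll_until` and all parameter defaults are hypothetical.

```python
import time
from typing import Callable


def poll_until(check: Callable[[], bool], timeout: float = 30.0,
               initial_delay: float = 0.2, max_delay: float = 3.0) -> bool:
    """Call check() with growing delays until it returns True or timeout elapses."""
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while True:
        if check():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline; double the delay up to max_delay.
        time.sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

In a pipeline like this one, `check` could be a function that asks the Jaeger API whether the expected services or traces are visible yet, so a fast environment returns almost immediately while a slow one still gets the full timeout.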
v0.6.6 (2026-06-18)
Fix traces never reaching external Jaeger (SCOM=0 root cause)
- deploy.py: `_resolve_external_jaeger_host()` now returns the Jaeger container name instead of its bridge-network IP. After `docker compose up`, `_connect_jaeger_to_compose_network()` attaches the external Jaeger container to the compose project's user-defined network. Services resolve the Jaeger hostname via Docker DNS instead of trying (and failing) to reach an IP on a separate bridge network.

  Before: `OTEL_EXPORTER_OTLP_ENDPOINT=http://172.17.0.2:4318`, unreachable from the compose user-defined network.

  After: `OTEL_EXPORTER_OTLP_ENDPOINT=http://mba-jaeger:4318`, which resolves via DNS on the shared compose network.

  Falls back to bridge gateway IP or `host.docker.internal` if no Jaeger container is found.
v0.6.5 (2026-06-18)
SCOM robustness, Jaeger reachability, container health, new analyze command
- mapping_builder.py: `_normalize_id()` fixes the SCOM=0 root cause: trace_id/span_id are now consistently converted to strings across DataFrames, preventing dict-key lookup failures when pandas reads hex IDs as float from CSV. Added debug logging for chain-walk statistics (found/fallback/no-parent counts).
- deploy.py: `_resolve_external_jaeger_host()` replaces raw `host.docker.internal` with Docker container IP resolution (works in Alpine/musl containers) and a Docker bridge gateway fallback; `host.docker.internal` is the last resort.
- deploy.py: `_check_container_alive()` adds a post-deploy container health check. If a container has exited/crashed, it captures `docker logs --tail 20` and surfaces it in the deployment result with a clear error message.
- cli.py: New `mba analyze <traffic_file>` subcommand runs the SCOM pipeline (steps 2–8) on an existing Jaeger JSON traces file without deployment or trace collection. Supports `--language`, `--skip-no-db`, `--threshold`, `--dashboard`.
- cli.py: `--language` flag added to `mba analyze` and `mba full`; bypasses auto-detection for non-Python projects.
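The ID mismatch behind the `_normalize_id()` entry arises when the same identifier is a string in one table and a number in another, so `"1234"` and `1234.0` never match as dictionary keys. The helper below is a hypothetical sketch of such a normalizer, not the function in `mapping_builder.py`; the lower-casing and the empty-string convention for missing values are assumptions.

```python
import math


def normalize_id(value) -> str:
    """Return a canonical string form of a trace/span ID, or "" if missing."""
    if value is None:
        return ""
    if isinstance(value, float):
        if math.isnan(value):
            return ""  # missing value in a numeric column
        if value.is_integer():
            return str(int(value))  # 1234.0 -> "1234", not "1234.0"
    return str(value).strip().lower()


# Both sides of a lookup go through the same function, so keys agree.
parents = {normalize_id(1234.0): "GET /employees"}
```

A normalizer cannot restore digits or leading zeros already lost to numeric parsing, so reading ID columns as strings in the first place (for example with `dtype=str` in `pandas.read_csv`) remains the more robust option where it is available.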
Tests
- 3 new tests: container alive without Docker, external Jaeger host resolution, trace_id/span_id format mismatch cross-DataFrame mapping
v0.6.4 (2026-06-18)
Fix SCOM = 0 services and missing DB instrumentation
- deploy.py: Added `_OTEL_DB_PACKAGES` (psycopg2, sqlalchemy, dbapi, pymongo, redis, mysql, pymysql). DB instrumentation packages are now installed in the Docker image, fixing `Psycopg2Instrumentor` import failures and missing `db.system` spans
- db_table_extractor.py: Warning logged when 0 DB spans are found among total spans, guiding users to check DB instrumentation
- mapping_builder.py: Warning logged when >50% of endpoint-to-table mappings are `unknown_endpoint`, flagging parent-span chain walking failures
- orchestrator.py: Case-insensitive service name matching between discovered services and Jaeger; warning when discovered service names are absent from Jaeger
v0.6.3 (2026-06-18)
Reuse manually started Jaeger in Docker Compose
- deploy.py: `_resolve_compose_jaeger()`: when Jaeger is already healthy on ports 4318/16686 (e.g. `docker run --name jaeger`), MBA reuses it instead of failing with "port already in use". Compose services reach it via `host.docker.internal:4318` with `extra_hosts: host-gateway`
- deploy.py: `_build_compose_override()` accepts `include_jaeger` and `otel_host` for external Jaeger mode
- deploy.py: clearer error when ports are busy but Jaeger is not healthy
v0.6.0 (2026-06-18)
Port Conflict Detection & Recovery
- deploy.py: `_ensure_jaeger_ports_free()` runs a proactive check before `docker compose up`. Frees ports 4318/16686 by force-removing a zombie `mba-jaeger` container. Clear error if another process holds the port
- deploy.py: `_parse_docker_error()` scans streaming output for 4 known patterns ("port is already allocated", "cannot connect to daemon", "permission denied", "no such image") and produces a specific fix message instead of the generic "check syntax"
- orchestrator.py: `_try_cleanup()` now force-removes the `mba-jaeger` container after every run to prevent zombie containers
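The pattern scan described for `_parse_docker_error()` amounts to a lookup from known substrings to hints. The sketch below uses the four patterns named above; the hint texts, the function name, and the case-insensitive matching are illustrative assumptions rather than the tool's actual messages or logic.

```python
from typing import Optional

# Patterns are the four listed in the changelog; hint texts are hypothetical.
KNOWN_DOCKER_ERRORS = [
    ("port is already allocated", "A port is in use: stop the process or container holding it."),
    ("cannot connect to daemon", "Docker daemon is not reachable: start Docker and retry."),
    ("permission denied", "Permission denied: check Docker socket or file permissions."),
    ("no such image", "Image not found: check the image name or build it first."),
]


def parse_docker_error(output: str) -> Optional[str]:
    """Return a specific hint for the first known pattern found, else None."""
    lowered = output.lower()
    for pattern, hint in KNOWN_DOCKER_ERRORS:
        if pattern in lowered:
            return hint
    return None
```

Returning `None` for unrecognised output lets the caller fall back to its generic message.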
LLM Reliability (OpenRouter + Ollama)
- prompts.py: New rule #8 — LLM is explicitly allowed to add OTel instrumentation around database operations as long as the original query logic is unchanged. Prevents false refusals like "Cannot instrument database queries"
- instrumentation.py: Two-stage retry — if OpenRouter fails (None, refusal, or syntax error), automatically retries with local Ollama before giving up
- orchestrator.py: Clear messages showing what was tried ("OpenRouter API key detected — will fall back to local Ollama if needed") and actionable tips ("Install Ollama (ollama.com) and pull qwen2.5-coder")
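The two-stage retry can be pictured as trying each backend in order and accepting the first usable answer. In this sketch the backends are plain callables standing in for OpenRouter and Ollama clients (both hypothetical here), and "usable" follows the three failure modes listed above: `None`, an `ERROR:` refusal, or code that does not parse.

```python
import ast
from typing import Callable, Optional

Backend = Callable[[str], Optional[str]]


def is_usable(code: Optional[str]) -> bool:
    """Reject None, explicit "ERROR:" refusals, and code that does not parse."""
    if code is None or code.lstrip().startswith("ERROR:"):
        return False
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    return True


def instrument_with_fallback(source: str, primary: Backend, fallback: Backend) -> Optional[str]:
    """Try the primary backend, then the fallback; return None if both fail."""
    for backend in (primary, fallback):
        result = backend(source)
        if is_usable(result):
            return result
    return None
```

Validating with `ast.parse` before accepting a result means a syntactically broken answer from the first backend triggers the fallback instead of being written to disk.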
v0.5.0 (2026-06-18)
Robustness & Performance
- deploy.py: Threaded streaming for
docker compose up— real-time output on stderr with 60-line rotating tail for error diagnostics, 300s timeout,proc.stdout.close()on Windows to unblock reader thread - deploy.py: Platform-aware Docker daemon timeout — 25s on Windows (WSL2/Hyper-V latency), 10s on Linux
- deploy.py: Deduplicated
_find_otel_dockerfiles→find_otel_dockerfiles(public), removed duplicate from orchestrator - orchestrator.py: Proactive Docker check with visible "waiting up to 60s..." feedback before any deploy;
_ensure_docker()with real elapsed time reporting - instrumentation_marker.py:
cleanup_orphans()— scans for orphan.mba_bak,.mba-Dockerfile,.mba-compose-override.ymlwithout marker (pre-v0.4.0 compat). Usesos.walkwith directory pruning to skip.venv/node_modules/__pycache__ - prompts.py: LLM sentinel
jaeger_host="env"now tells the model to reados.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", ...)at runtime instead of generatinghttp://env:4318 - orchestrator.py: LLM instrumentation passes
"env"for Docker Compose projects,"127.0.0.1"for local projects
Tests
- 565 tests (+7 new), 0 regressions
- 7 new tests: `build_instrumentation_prompt(jaeger_host="env")` sentinel, `_extract_host_port()` with all formats including `127.0.0.1:5000:5000`
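The orphan scan listed for v0.5.0 relies on `os.walk` with directory pruning. A minimal sketch of that technique follows; the function name `find_orphans` and the suffix-based matching are assumptions, while the skipped directories and artifact names are the ones given in the `cleanup_orphans()` entry.

```python
import os

SKIP_DIRS = {".venv", "node_modules", "__pycache__"}
ORPHAN_SUFFIXES = (".mba_bak", ".mba-Dockerfile", ".mba-compose-override.yml")


def find_orphans(root: str) -> list[str]:
    """Return paths of leftover MBA artifacts under root, skipping heavy dirs."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assigning to the slice mutates the list os.walk iterates over,
        # so the pruned directories are never descended into.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if name.endswith(ORPHAN_SUFFIXES):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Pruning only works with the default top-down traversal and only when `dirnames` is modified in place; rebinding the name (`dirnames = [...]`) has no effect on the walk.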
v0.4.0 (2026-06-17)
Version-Aware Instrumentation System (new feature)
- NEW: `.mba-instrumented` marker file written after successful deploy, recording version, mode, and all artifacts created (backups, Dockerfile overrides, compose overrides)
- NEW: `check_stale_instrumentation()` detects instrumentation from a different MBA version at the start of `mba full` and automatically cleans up before re-instrumenting
- NEW: `cleanup_instrumentation()` restores backup files (`.mba_bak` → original), deletes generated `.mba-Dockerfile` and `.mba-compose-override.yml` files
- NEW: On each run, if marker exists with a different version, cleanup runs automatically before discovery
Docker Compose Robustness (bug fixes)
- deploy.py: Added `subprocess.TimeoutExpired` handler in `deploy_docker_compose()`. Previously an unhandled crash; now produces a clear `DOCKER_COMPOSE_FAILED` error
- deploy.py: `_generate_otel_dockerfile()` now logs warnings on all 7 silent failure paths instead of returning `(None, None)` with no user feedback
- discover.py: Fixed port extraction from Docker Compose YAML. The old `p.rsplit(":", 1)[0].rsplit(":", 1)[0]` was broken for `host_ip:host_port:container_port` format (e.g., `127.0.0.1:5000:5000`). Now uses a proper `_extract_host_port()` helper
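To see why the old expression failed: for `127.0.0.1:5000:5000` the double `rsplit` strips two fields from the right and leaves the IP address, not the port. The sketch below shows one way a helper could pick the host port instead. It is an assumption about the approach, not the actual `_extract_host_port()`; in particular, returning `None` for port ranges and container-only entries is a choice made here for illustration.

```python
from typing import Optional


def extract_host_port(mapping: str) -> Optional[int]:
    """Return the host port from a Compose short-syntax port string, if any."""
    spec = mapping.split("/", 1)[0]  # drop a trailing "/tcp" or "/udp"
    parts = spec.split(":")
    if len(parts) == 1:
        return None  # container port only; Docker picks the host port
    # The host port is always the field just before the container port,
    # whether or not a host IP precedes it.
    host_port = parts[-2]
    return int(host_port) if host_port.isdigit() else None


# The old expression applied to the three-field form yields the IP:
old_result = "127.0.0.1:5000:5000".rsplit(":", 1)[0].rsplit(":", 1)[0]
```

Here `old_result` is `"127.0.0.1"`, whereas `extract_host_port("127.0.0.1:5000:5000")` gives `5000`.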
LLM Chain Improvements (bug fixes + diagnostics)
- instrumentation.py: Added `logger.warning()` for each reason the LLM returns `None`: API/Ollama failure, `"ERROR:"` refusal (with the actual reason), and `SyntaxError` in generated code. Previously all three were silent
- context.py: Extended `_find_main_file()` to recognize all entry point names from the Python plugin: `run.py`, `manage.py`, `wsgi.py`, `api.py` (in addition to existing `main.py`, `app.py`, `server.py`). Also checks subdirectories (`app/`, `src/`, `application/`) for all these names
- context.py: Added `"language"` key to context dict (value: `"python"`) so the prompt template correctly shows `"Language: python"` instead of duplicating the framework name
- prompts.py: Fixed `"Language:"` label to read `context.get('language', 'python')` instead of `context.get('framework', 'unknown')`
v0.3.11 (2026-06-17)
Fix Docker daemon detection on Windows
v0.3.10 (2026-06-17)
Docker error messages now accurate
- deploy.py: `deploy_docker_compose()` and `start_jaeger()` now distinguish between Docker not installed (`DOCKER_NOT_FOUND`) and Docker daemon not running (`DOCKER_DAEMON_DOWN`). Users with Docker installed but Desktop not launched now see: "Docker is installed but the daemon is not running — Start Docker Desktop and wait for it to be ready." instead of the misleading "Docker is required but was not found."
v0.3.9 (2026-06-17)
Bug fixes and robustness improvements
- orchestrator.py: Fixed `'ServiceInfo' object has no attribute 'root_dir'` crash when LLM instrumentation tries to read the service path. Now uses `entry_points[0].path.parent` instead.
- deploy.py: Replaced `_docker_available()` with three functions: `_docker_installed()`, `_docker_daemon_ready()`, and a retry-based `_docker_available()` (3 attempts × 3s). Uses `docker version --format`, which is 10× faster than `docker info`.
- deploy.py: Added Jaeger health check after `docker compose up`: explicitly waits for port 16686 and verifies the `/api/services` endpoint.
- deploy.py: `cleanup_docker_compose()` now checks Docker availability first and skips cleanly if the daemon is not responding.
- deploy.py: Reduced timeouts: compose up 300s→120s, compose down 60s→15s, docker check 10s→5s.
- orchestrator.py: `_try_cleanup()` is now protected against `KeyboardInterrupt`: clean message instead of traceback.
- cli.py: Top-level `KeyboardInterrupt` handler returns exit code 130 with a clean message.
- deploy.py: `cleanup_docker_compose` no longer raises on failure (`check=True` removed, `subprocess.CalledProcessError` handled gracefully).
- All 561 tests pass with zero regressions.
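A top-level `KeyboardInterrupt` handler of the kind mentioned for `cli.py` usually wraps the real entry point and converts Ctrl+C into an exit status. The wrapper below is a generic sketch (the name `run_cli` and the message text are hypothetical); 130 is the conventional shell status for termination by SIGINT, that is 128 + 2.

```python
import sys
from typing import Callable


def run_cli(main: Callable[[], int]) -> int:
    """Run main() and turn Ctrl+C into exit code 130 instead of a traceback."""
    try:
        return main()
    except KeyboardInterrupt:
        print("Interrupted by user.", file=sys.stderr)
        return 130


# Typical use at the bottom of a CLI module (cli_main is a placeholder):
#     sys.exit(run_cli(cli_main))
```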
v0.3.8 (2026-06-17)
Consolidation — single-service orchestrator
- deploy.py: Python services always use OTLP HTTP/4318 (removed conditional gRPC fallback). Smart Jaeger detection (`_jaeger_alive`, `_docker_container_exists`) with 3-case restart logic. New `DOCKER_START_FAILED` error code.
- discover.py: Service deduplication by `(name, deployment)`. Subdirectory scanning for monorepos (`_is_service_dir`, `_discover_subdirectory_services`).
- orchestrator.py: New `_llm_instrument_services` step called between discovery and deploy, triggered by `--llm` flag + `OPENROUTER_API_KEY`. Falls back silently to Dockerfile patching.
- prompts.py: Universal framework-agnostic prompt replaces FastAPI/Flask-only prompt. Python reference appendix (FastAPI, Flask, Django, SQLAlchemy).
- instrumentation.py: Passes structured `context` dict for richer prompts.
- Tests: All 561 pass with updated env vars and prompt text.
v0.3.7 (2026-06-16)
Bug fixes
- Pipeline crash when no services are flagged suspicious (`EmptyDataError` on empty CSV). Added size check and try/except in `report_builder.py`.
v0.3.6 (2026-06-16)
Features
- ENTRYPOINT injected directly into `.mba-Dockerfile` instead of compose `entrypoint` override (Docker Compose v5 on Windows clears CMD when entrypoint is set in YAML)
- `opentelemetry-distro` added as runtime dependency (provides `OpenTelemetryConfigurator` entry point, needed for SDK config from env vars)
- Windows console encoding fix: `sys.stdout.reconfigure(encoding='utf-8')` in CLI module
v0.3.5 (2026-06-16)
Features
- Build-time OTel install: generates `.mba-Dockerfile` with `RUN pip install opentelemetry-distro opentelemetry-instrumentation-flask` etc. at build time
- Compose override points `build.dockerfile` to `.mba-Dockerfile`
- Cleanup of `.mba-Dockerfile` files after analysis
v1.0.0 (2026-06-11)
Features
- SCOM pipeline: computes Service-COhesion Metric from Jaeger traces (health filtering, endpoint extraction, DB table detection, endpoint-table mapping, threshold analysis, report generation)
- CLI tool: `mba`/`boundary-analyzer` commands (`run`, `setup`, `dashboard`, `teastore`)
- Auto-instrumentation: auto-detects Python microservices (FastAPI, Flask, Django), injects OpenTelemetry, collects traces via Jaeger, runs SCOM analysis
- TeaStore support: Docker Compose deployment with OTel Java agent, traffic generator, trace exporter, full SCOM pipeline
- Dashboard: interactive Dash web UI for SCOM results
- LLM analysis (optional): AI-powered narrative report via OpenRouter (Qwen), disabled by default
Improvements
- Segment-based health matching (`HEALTH_KEYWORDS`) instead of fragile `endswith`: `/health/all`, `/auth/health`, `/ready/isready`, `/metrics` (via `http.target`) correctly filtered
- `--skip-no-db-services` flag to exclude stateless services (proxy, orchestrator, etc.) from SCOM ranking
- `run_teastore()` function extracted for programmatic access
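Segment-based matching, as opposed to `endswith`, splits the path on `/` and tests each segment against a keyword set. The sketch below illustrates the idea; the contents of `HEALTH_KEYWORDS` are inferred from the example paths in the list above and are an assumption, as is the helper name `is_health_path`.

```python
# Assumed keyword set, inferred from the example paths; the real one may differ.
HEALTH_KEYWORDS = {"health", "ready", "isready", "metrics"}


def is_health_path(path: str) -> bool:
    """Return True if any whole path segment is a health keyword."""
    # Ignore the query string, then compare whole segments, not substrings.
    segments = [seg.lower() for seg in path.split("?", 1)[0].split("/") if seg]
    return any(seg in HEALTH_KEYWORDS for seg in segments)
```

An `endswith("/health")` test misses `/health/all`, while a plain substring test would wrongly match `/healthcare/patients`; comparing whole segments avoids both problems.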
Bug fixes
- MissingGreenlet in classroom-repository (added `selectinload`)
- datetime timezone-aware comparison in enrollment-service
- `academic_year` int→str conversion in enrollment-service
- Scope bug in `cleaned_parts` variable in CLI cleanup logic
- SQLAlchemy duplicate instrumentation (event listeners only, no `SQLAlchemyInstrumentor`/`AsyncPGInstrumentor`)
- `[project.scripts]` whitespace in pyproject.toml
Tests
- 74 tests total (58 existing + 16 TeaStore)
- TeaStore synthetic fixtures (persistence-service with 5 tables, auth-service without DB)
- 3 test classes: TeaStorePipelineTest, TeaStoreSkipNoDbTest, TeaStoreNoFilterTest
Infrastructure
- CI via GitHub Actions (`.github/workflows/ci.yml`), Python 3.11 × 3.12
- `mba` CLI alias alongside `boundary-analyzer`
- Version bump to 0.2.0
File details
Details for the file boundary_analyzer-0.7.6.tar.gz.
File metadata
- Download URL: boundary_analyzer-0.7.6.tar.gz
- Upload date:
- Size: 184.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `2f755dab098163cae34ab9426138e596f4f406cc271aace574af06700ca34c95` |
| MD5 | `07c6aad9349f52116a88ff417d008364` |
| BLAKE2b-256 | `b69dd504c9d6e8b596fefcc9801180d65d2eb99688ca64e21aaf71d3fa332954` |
File details
Details for the file boundary_analyzer-0.7.6-py3-none-any.whl.
File metadata
- Download URL: boundary_analyzer-0.7.6-py3-none-any.whl
- Upload date:
- Size: 177.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `2b01ce0cbea15536d48dee1ea446075ca8245eea18093c7bff3e1911bcf6a96b` |
| MD5 | `e36ce3816ec41d68fdaeb445e729eb39` |
| BLAKE2b-256 | `3cd19023dba68f5bcad200448d2b7fb4fda5d3398f2dcce951be9c98183055bd` |