Backend Pro Max — BM25-searchable backend & distributed-systems engineering intelligence as an AI skill / CLI.
🚀 Backend Pro Max
A staff-engineer-in-a-box for your AI coding assistant
Curated, BM25-searchable backend & distributed-systems intelligence across 20 domains and 12 language stacks — drop it into Claude Code, Cursor, Windsurf, GitHub Copilot, Gemini, Continue, or any AI assistant.
Quick Start · Domains · Stacks · Install as a Skill · Examples · Contributing
✨ What is this?
Backend Pro Max grounds your AI coding assistant in opinionated, source-citable, senior-engineer-grade knowledge for backend & distributed-systems work — and forces it to search before answering.
LLMs know surface-level facts about backend tech, but they:
- 🎯 Recommend the trendy pattern instead of the right one for your team / scale.
- ⏱️ Forget timeouts, retries, idempotency, backpressure, and graceful shutdown.
- 🧩 Don't know your stack's idioms — Spring lazy-init pitfalls, FastAPI sync-in-async, Express vs Fastify, sqlx compile-time queries, EF Core change tracking, …
- 🔀 Mix up consistency models, replication modes, and partition strategies.
- 🛡️ Skip the boring-but-critical stuff: SLOs, error budgets, runbooks, PII in logs.
This skill fixes that with a structured, searchable knowledge base that the model is instructed to consult — so its advice cites a row, not a vibe.
🎁 What you get
| Feature | Details |
|---|---|
| 📚 20 domain knowledge bases | Languages · Patterns · Databases · Messaging · Cache · Cloud · IaC · Containers · Observability · API · Auth · Security · CI/CD · Testing · Architecture · Scaling · Consistency · Performance · Reliability · Data |
| 🛠️ 12 stack guidelines | Go · Java/Spring · Python/FastAPI · Node/Express · Rust/Axum · C#/ASP.NET · Kotlin/Spring · Scala/Akka · Elixir/Phoenix · Ruby/Rails · PHP/Laravel · C++ |
| 🔎 Pure-Python BM25 search | No installs, no models, no network — just python3. |
| 🤖 Drop-in skill files | SKILL.md for Claude Code · skill-content.md for Cursor / Windsurf / Copilot / Gemini / Continue |
| 📐 Do / Don't + Code examples | Each row contains good vs bad code, severity, and a docs URL |
| 🧠 Auto domain detection | Skip --domain and the engine picks the right CSV from your query |
| ⚙️ JSON output mode | First-class integration with tool-calling agents and MCP servers |
⚡ Quick start
Option A — install once, type backendpro
# Install from PyPI — pure stdlib, zero dependencies
pip install backendpro
backendpro --list
backendpro "kafka exactly once delivery"
backendpro "circuit breaker" --domain pattern
backendpro "virtual threads" --stack java-spring
backendpro "idempotency" --all
backendpro "redis cluster" --json
💡 pipx install backendpro works too if you prefer an isolated venv. You can also install from source: pip install git+https://github.com/shashankswe2020-ux/backend-pro-max-skill
Option B — run the script directly (no install)
# 0. No install needed — pure Python 3.8+ stdlib
python3 src/backend-pro-max/scripts/search.py --list
# 1. Auto-detect the domain from the query
python3 src/backend-pro-max/scripts/search.py "kafka exactly once delivery"
# 2. Constrain to a specific domain
python3 src/backend-pro-max/scripts/search.py "circuit breaker" --domain pattern
# 3. Stack-specific guidance
python3 src/backend-pro-max/scripts/search.py "virtual threads" --stack java-spring
# 4. Cross-domain search
python3 src/backend-pro-max/scripts/search.py "idempotency" --all
# 5. JSON output (great for agents / MCP)
python3 src/backend-pro-max/scripts/search.py "redis cluster" --json
💡 Tip: the search engine ranks results with BM25 over the search columns of each CSV, with light keyword-based domain auto-detection when --domain is omitted.
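To make the ranking idea concrete, here is a minimal sketch of Okapi BM25 scoring plus keyword-based domain detection. It is not the code in core.py: the tokenizer, the `k1`/`b` defaults, the `DOMAIN_KEYWORDS` map, and the function names are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; the real engine may differ).
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in `docs` against `query` with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs if n_docs else 0.0
    # Document frequency: how many documents contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical keyword map, for illustration only.
DOMAIN_KEYWORDS = {
    "messaging": {"kafka", "rabbitmq", "sqs"},
    "cache": {"redis", "memcached", "cdn"},
}


def detect_domain(query):
    """Return the domain whose keywords overlap the query most, or None."""
    words = set(tokenize(query))
    best = max(DOMAIN_KEYWORDS, key=lambda d: len(words & DOMAIN_KEYWORDS[d]))
    return best if words & DOMAIN_KEYWORDS[best] else None
```

Rows that share no term with the query score zero, and a query with no known keyword yields no detected domain, which is where a cross-domain search such as --all would be the natural fallback.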
🧭 Example queries
A taste of what to ask — these all return ranked, citable rows:
| Domain | Try this |
|---|---|
| 🧬 pattern | "saga vs 2pc for distributed transactions" |
| 🗄️ database | "postgres index on jsonb" · "dynamodb single table design" |
| 📨 messaging | "kafka exactly once" · "sqs vs sns vs eventbridge" |
| ⚡ cache | "thundering herd" · "negative caching with redis" |
| ☁️ cloud | "aws gcp azure equivalent of pubsub" |
| 🛰️ observability | "slo error budget alerting" · "otel trace context propagation" |
| 🔐 security | "ssrf prevention" · "sigstore supply chain" |
| 🧪 testing | "contract testing pact" · "testcontainers postgres" |
| 🏗️ architecture | "modular monolith vs microservices" |
| 📈 scaling | "hedged requests" · "backpressure" |
| ⚖️ consistency | "linearizability vs sequential" · "PACELC" |
| 🛡️ reliability | "graceful shutdown" · "circuit breaker timeouts" |
📚 Domains
| Domain | What's in it |
|---|---|
| 🧠 language | Go, Java, Kotlin, Python, Rust, Node.js/TS, C#, Scala, Elixir, Ruby, PHP, C++ |
| 🧩 pattern | Saga, CQRS, Event Sourcing, Outbox, CDC, Circuit Breaker, Bulkhead, Retry, Idempotency, Leader Election, Sidecar, Strangler Fig, ACL, BFF, API Gateway, Rate Limiting, Sharding, Read Replica, Materialized View, Process Manager, Outbox+Inbox, Fan-out / Scatter-Gather |
| 🗄️ database | Postgres, MySQL/Vitess, CockroachDB, Spanner/TiDB, MongoDB, Cassandra/Scylla, DynamoDB, Redis, Memcached, Elastic/OpenSearch, ClickHouse, DuckDB, Snowflake/BigQuery/Redshift, Neo4j/Memgraph, Influx/Timescale/VictoriaMetrics, vector DBs, S3/GCS/Blob, etcd/ZK/Consul, SQLite |
| 📨 messaging | Kafka, Redpanda, Pulsar, RabbitMQ, NATS/JetStream, MQTT, SQS, SNS/EventBridge, Kinesis, Pub/Sub, Service Bus / Event Grid / Event Hubs, ZeroMQ |
| ⚡ cache | In-process LRU, Redis (single + cluster), Memcached, CDN, HTTP cache, read/write/write-back, materialized views, negative caching, Bloom filters, L1+L2 hybrid |
| ☁️ cloud | AWS / GCP / Azure / Cloudflare service mapping & equivalents |
| 🏗️ iac | Terraform/OpenTofu, Pulumi, AWS CDK, CloudFormation, Bicep, Ansible, Crossplane, Helm, Kustomize, Packer |
| 📦 container | Docker/OCI, Podman, containerd, Kubernetes, EKS/GKE/AKS, Helm, Kustomize, ArgoCD/Flux, Istio/Linkerd/Cilium, Envoy, Karpenter, Nomad, Compose, Testcontainers |
| 📊 observability | Prometheus, Mimir/Cortex/Thanos/VM, Grafana, Loki, ELK/OpenSearch, Tempo/Jaeger/Zipkin, OpenTelemetry, Pyroscope/Parca, Datadog, New Relic / Honeycomb / Dynatrace, Sentry, Fluent Bit / Vector, PagerDuty / Opsgenie, SLO frameworks |
| 🔌 api | REST, GraphQL, gRPC, gRPC-Web/Connect, WebSocket, SSE, HTTP/2, HTTP/3, Webhooks, WebSub/ActivityPub, JSON-RPC, SOAP |
| 🔑 auth | OAuth 2.0 + PKCE, OIDC, JWT, SAML, mTLS, API keys, HMAC signing, sessions, passkeys/WebAuthn, magic links, RBAC/ABAC/ReBAC, SCIM, workload identity (IRSA / WIF) |
| 🛡️ security | OWASP Top 10, CSRF, XSS, SSRF, deserialisation, secrets, supply chain (SLSA, Sigstore), zero-trust, TLS hardening, PII/logging, rate limiting, CORS, SBOM, SAST, DAST/fuzz |
| 🔁 cicd | GitHub Actions, GitLab CI, Jenkins, CircleCI, Buildkite, Drone, Tekton, Argo Workflows, ArgoCD, Flux, Spinnaker, Argo Rollouts, Renovate/Dependabot, SonarQube, GHAS |
| 🧪 testing | Unit, component/slice, integration (Testcontainers), contract (Pact), E2E, property-based, fuzz, snapshot, mutation, load, stress/soak, chaos, smoke / synthetic monitoring |
| 🏛️ architecture | Monolith, modular monolith, microservices, serverless/FaaS, event-driven, hexagonal/ports-and-adapters, clean/onion, DDD, CQRS+ES, service mesh, BFF, lambda/kappa, actor model, cell-based |
| 📈 scaling | Vertical, horizontal, autoscaling (HPA/KEDA/Karpenter), sharding, read replicas, multi-tier caching, connection pooling, backpressure, bulkhead, hedged requests, load balancing, CDN, geo-distribution, async/queue load levelling, indexing, materialized views, partitioning |
| ⚖️ consistency | Linearizability, sequential, causal, read-your-writes, eventual, SEC/CRDTs, CAP, PACELC, Raft, Paxos, 2PC, snapshot isolation/SSI, quorum, Lamport/vector/HLC clocks |
| 🚀 performance | N+1, missing indexes, plan regressions, pool exhaustion, GC pauses, hot keys, tail latency, thundering herd, async-blocking, cold starts, leaks, hot-path allocations, JSON serialisation, chatty interfaces, TLS overhead |
| 🛟 reliability | SLO/SLI/error budget, timeouts, retries+backoff, circuit breaker, bulkhead, idempotency, graceful shutdown, liveness/readiness, capacity & headroom, RPO/RTO, multi-AZ/region, backups + PITR, chaos engineering, runbooks, blue/green & canary, feature flags, per-tenant quotas, postmortems |
| 🧮 data | Spark, Flink, Kafka Streams/ksqlDB, Airbyte/Fivetran/Stitch/Meltano, dbt, Airflow, Dagster, Prefect, Iceberg/Delta/Hudi, ClickHouse/Druid/Pinot, Spark Streaming + Delta, Debezium, Kafka Connect, LakeFS/Nessie, vector DBs, feature stores |
🛠️ Stacks
Each stack file contains tight, opinionated, "what would a staff engineer say in code review" guidelines — categorised by Concurrency, HTTP, Errors, Persistence, Tooling, Observability, Performance, Testing, Build, … — with ✅ Do / ❌ Don't plus good vs bad code examples.
| Stack | Highlights |
|---|---|
| 🐹 go | context.Context, errgroup, http.Client reuse, pgx/sqlc, table-driven tests |
| ☕ java-spring | Virtual threads (Loom), constructor DI, OSIV off, Flyway, Testcontainers, native image |
| 🐍 python-fastapi | async-all-the-way, Pydantic v2, httpx, uv, ruff/mypy, structlog, Testcontainers |
| 🟢 nodejs-express | Fastify > Express, zod at boundaries, Undici pool, pino, OTel, Vitest |
| 🦀 rust-axum | Tokio + Axum + Tower, sqlx compile-time queries, thiserror/anyhow, tracing, tokio-console |
| 🟪 csharp-aspnet | Minimal APIs, async-all-the-way, HttpClientFactory, Polly v8, EF Core AsNoTracking, Native AOT |
| 🟧 kotlin-spring | Coroutines + structured concurrency, Spring Boot Kotlin DSL, Exposed/jOOQ, kotest |
| 🔺 scala-akka | Pekko (Akka fork), Typed actors, Pekko Streams, Cats Effect / ZIO |
| 💧 elixir-phoenix | OTP supervision, GenServer, Task.async_stream, Phoenix LiveView, Broadway, libcluster |
| 💎 ruby-rails | Modular Rails (Packwerk), Sidekiq, Puma tuning, Bullet, Rails 7+ defaults, Solid Queue/Cache |
| 🐘 php-laravel | Octane (Swoole/RoadRunner/FrankenPHP), OPcache+JIT, Horizon, eager loading, PHPStan |
| ➕ cpp | C++20+, RAII, jthread/stop_token, coroutines, sanitizers, CMake presets, Conan/vcpkg, GoogleTest, clang-tidy |
🤖 How AI agents use this
A typical interaction inside Claude Code, Cursor, Copilot, etc.:
👤 "Add retries to our outbound HTTP client without melting the dependency."
🤖 → search.py "retry backoff jitter circuit breaker" --domain reliability
→ search.py "http client retries" --stack <your stack>
→ answers with: exponential backoff + jitter, max attempts, idempotency
key requirement, circuit breaker around it, budgeted timeout, plus
a code snippet using the right library for your stack — and cites
the row(s) it pulled from.
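As an illustration of the kind of answer sketched above, here is a minimal Python version of capped exponential backoff with full jitter and a maximum attempt count. It is not taken from the knowledge base: the function name, defaults, and retryable exception types are assumptions, and a production client would also add a circuit breaker and an overall timeout budget.

```python
import random
import time


def retry_with_backoff(call, max_attempts=4, base_delay=0.1, max_delay=2.0,
                       retry_on=(ConnectionError, TimeoutError),
                       sleep=time.sleep, rng=random.random):
    """Call `call()` until it succeeds or the attempts run out.

    Only retry operations that are idempotent (or carry an idempotency key),
    otherwise a retry can apply the same side effect twice.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Exponential growth, capped so waits never exceed max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: wait a random time in [0, cap) to avoid
            # synchronized retry storms against the dependency.
            sleep(rng() * cap)
```

The `sleep` and `rng` parameters exist only so the behaviour can be tested deterministically; callers would normally leave them at their defaults.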
The skill files (SKILL.md / skill-content.md) instruct the agent to:
- Search first — never guess when a row exists.
- Cite the row (domain + key) so reviewers can verify.
- Prefer stack guidelines for code-shaped answers.
- Combine multiple domains for cross-cutting concerns (e.g. a "saga" answer pulls from pattern + messaging + consistency + reliability).
📁 Repository structure
See CLAUDE.md for the full layout. TL;DR:
src/backend-pro-max/
├── data/ # 20 domain CSVs + stacks/ (12 stack CSVs)
│ ├── languages.csv patterns.csv databases.csv messaging.csv …
│ └── stacks/
│ └── go.csv java-spring.csv python-fastapi.csv …
├── scripts/
│ ├── core.py # BM25 engine + domain auto-detection
│ └── search.py # CLI entry point
└── templates/base/
├── skill-content.md # Drop-in rules for any AI assistant
└── quick-reference.md # Cheatsheet
.claude/skills/backend-pro-max/ # Claude Code skill (SKILL.md)
.claude-plugin/plugin.json # Claude marketplace manifest
docs/ # ARCHITECTURE.md & USAGE.md
Visual architecture
flowchart TD
U["👤 User Query\n'Design a URL shortener with caching'"]:::user
S["📜 SKILL.md / skill-content.md\nInstructs model to search before answering"]:::skill
C["🔎 backendpro CLI\nBM25 search engine · pure Python stdlib"]:::cli
D["📚 20 Domain CSVs\napi · cache · database\nscaling · reliability …"]:::data
K["🛠️ 12 Stack CSVs\ngo · java-spring\npython-fastapi …"]:::data
A["🌐 Auto-detect / --all\nCross-domain search"]:::data
R["📋 Ranked Results\nCited rows · do/don't · code\nseverity · docs URL"]:::result
G["✅ Grounded, Citable Answer"]:::answer
U --> S --> C
C --> D & K & A
D & K & A --> R --> G
classDef user fill:#6366f1,color:#fff,stroke:#4f46e5,stroke-width:2px
classDef skill fill:#8b5cf6,color:#fff,stroke:#7c3aed,stroke-width:2px
classDef cli fill:#0ea5e9,color:#fff,stroke:#0284c7,stroke-width:2px
classDef data fill:#f59e0b,color:#fff,stroke:#d97706,stroke-width:2px
classDef result fill:#10b981,color:#fff,stroke:#059669,stroke-width:2px
classDef answer fill:#22c55e,color:#fff,stroke:#16a34a,stroke-width:2px
🔌 Installation as an AI skill
🟣 Claude Code
Symlink (or copy) the src/backend-pro-max directory into your repo at
.claude/skills/backend-pro-max/ — the SKILL.md already lives there. The
agent will discover it automatically.
mkdir -p .claude/skills
ln -s "$(pwd)/src/backend-pro-max" .claude/skills/backend-pro-max
🟦 Cursor / Windsurf / Continue / GitHub Copilot / Gemini
Copy src/backend-pro-max/templates/base/skill-content.md into your editor's
rules file:
| Tool | Rules file |
|---|---|
| Cursor | .cursor/rules/backend.mdc |
| Windsurf | .windsurfrules |
| Continue | AGENTS.md |
| GitHub Copilot | .github/copilot-instructions.md |
| Gemini Code Assist | GEMINI.md |
Make sure the assistant can run python3 src/backend-pro-max/scripts/search.py …
in your repo.
⚙️ Anywhere else (CLI / scripts / MCP)
The CLI is pure Python 3 standard library. Either install it:
pip install git+https://github.com/shashankswe2020-ux/backend-pro-max-skill
backendpro --list
backendpro "redis cluster" --json
…or just clone and run the script directly:
python3 src/backend-pro-max/scripts/search.py --list
python3 src/backend-pro-max/scripts/search.py "redis cluster" --json
The --json output makes it trivial to wire into an MCP tool, a custom
agent loop, or any CI step.
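A minimal sketch of such a wrapper in Python, assuming the installed backendpro command is on PATH and that --json can be combined with --domain, --stack, and --all. The helper names are hypothetical, and the shape of the returned JSON is deliberately not assumed here; inspect the real output before relying on specific keys.

```python
import json
import subprocess


def build_command(query, domain=None, stack=None, search_all=False):
    """Build the argv for a `backendpro` search that emits JSON."""
    cmd = ["backendpro", query, "--json"]
    if domain:
        cmd += ["--domain", domain]
    if stack:
        cmd += ["--stack", stack]
    if search_all:
        cmd.append("--all")
    return cmd


def search(query, run=subprocess.run, **options):
    """Run the CLI and return its parsed JSON output.

    `run` is injectable so the wrapper can be tested without the CLI installed.
    """
    result = run(build_command(query, **options),
                 capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

With `check=True`, a non-zero exit code raises `subprocess.CalledProcessError`, so a tool-calling agent sees a failed search as an error instead of an empty result.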
✅ Prerequisites
- Python 3.8+ is the only requirement. Running the script directly needs no pip install, no virtualenv, and no models.
- Works on Linux, macOS, Windows (WSL/native), and inside containers.
🧪 Smoke test
# Installed CLI
backendpro --list
backendpro "circuit breaker"
backendpro "virtual threads" --stack java-spring
backendpro "idempotency" --all
# Or, without installing
python3 src/backend-pro-max/scripts/search.py --list
python3 src/backend-pro-max/scripts/search.py "circuit breaker"
🧱 Extending
Adding a new row, a new domain, or a new stack takes ~2 minutes.
| Want to add… | Steps |
|---|---|
| 📝 A new row | Append to the relevant data/<domain>.csv (keep column order). |
| 🆕 A new domain | Add data/<domain>.csv, register in CSV_CONFIG + _DOMAIN_KEYWORDS in core.py. |
| 🧱 A new stack | Add data/stacks/<stack>.csv, register in STACK_CONFIG in core.py. |
Full details in CLAUDE.md ("Adding new content") and
docs/ARCHITECTURE.md.
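When appending rows by script instead of by hand, a small guard can enforce the "keep column order" rule. The sketch below reads the header from the target CSV and refuses rows whose keys do not match it. The helper name is hypothetical, and it assumes the file already ends with a newline.

```python
import csv


def append_row(csv_path, row):
    """Append `row` (a dict) to a CSV, using the file's existing column order."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f))
    missing = [c for c in header if c not in row]
    extra = [c for c in row if c not in header]
    if missing or extra:
        raise ValueError(f"column mismatch: missing={missing}, extra={extra}")
    with open(csv_path, "a", newline="", encoding="utf-8") as f:
        # DictWriter emits values in header order regardless of dict order
        # and quotes fields that contain commas or newlines.
        csv.DictWriter(f, fieldnames=header).writerow(row)
```

Because the header is read from the file itself, the same helper should work for any domain or stack CSV without hard-coding its columns.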
🤝 Contributing
PRs welcome — especially for:
- 🆕 New stacks: Swift on the server, Erlang/OTP, Zig, Crystal, Gleam, Deno, Bun, …
- 🌐 New domains: FinOps, ML platform, edge / WASM, blockchain infra, mobile-backend, …
- 🧠 More rows in existing CSVs (with Do, Don't, code, severity, and a docs URL).
- 🐛 Corrections: if a recommendation is dated or wrong, open a PR with the source.
Please follow the Git workflow in CLAUDE.md:
- Branch from main (feat/... or fix/...).
- Commit with a clear message.
- Open a PR; never push directly to main.
❓ FAQ
Does this need an internet connection?
No. The CLI is offline-first, pure Python stdlib. The only network calls are whatever the AI assistant itself makes.
Why CSV instead of YAML / JSON / SQLite?
CSV diffs cleanly in PRs, opens in any editor / spreadsheet, and is trivial to parse with stdlib. Search is BM25 over the configured columns.
How is this different from a generic "rules" file?
A flat rules file forces the model to keep everything in context. This skill makes the model search a structured KB on demand — so it scales to hundreds of rows across 20+ domains without bloating the prompt.
Can I use it from an MCP server / tool-calling agent?
Yes — use --json and parse the result. Each row includes the citation key,
domain, summary, do/don't, code samples, severity, and docs URL.
📜 License
MIT © 2025 contributors
Built for the engineers who actually ship distributed systems. If this saves you one outage, ⭐ the repo.