A simple CI/CD utility for running LLM tasks with Semantic Kernel
AI-First Toolkit: LLM-Powered Automation
The Future of DevOps is AI-First
This toolkit represents a step toward AI-First DevOps - where intelligent automation handles the entire development lifecycle. Built for teams ready to embrace the exponential productivity gains of AI-powered development. Please read the blog post for more details on the motivation.
TLDR: What This Tool Does
Purpose: Zero-friction LLM integration for pipelines with 100% guaranteed schema compliance. This is your foundation for AI-first integration practices.
Perfect For:
- AI-Generated Code Reviews: Automated PR analysis with structured findings
- Intelligent Documentation: Generate changelogs, release notes, and docs automatically
- Security Analysis: AI-powered vulnerability detection with structured reports
- Quality Gates: Enforce standards through AI-driven validation
- Autonomous Development: Enable AI agents to make decisions in your pipelines
- JIRA Ticket Updates: Update JIRA tickets based on LLM output
- Unlimited Integration Possibilities: Chain it multiple times and use as glue code in your tool stack
Simple structured output example
# Install and use immediately
pip install llm-ci-runner
llm-ci-runner --input-file examples/02-devops/pr-description/input.json --schema-file examples/02-devops/pr-description/schema.json
The AI-First Development Revolution
This toolkit embodies the principles outlined in Building AI-First DevOps:
| Traditional DevOps | AI-First DevOps (This Tool) |
|---|---|
| Manual code reviews | AI-powered reviews with structured findings |
| Human-written documentation | AI-generated docs with guaranteed consistency |
| Reactive security scanning | Proactive AI security analysis |
| Manual quality gates | AI-driven validation with schema enforcement |
| Linear productivity | Exponential gains through intelligent automation |
Features
- 100% Schema Enforcement: Your pipeline never gets invalid data. Token-level schema enforcement with guaranteed compliance.
- Resilient execution: Retries with exponential back-off and jitter plus a clear exception hierarchy keep transient cloud faults from breaking your CI.
- Zero-Friction CLI: Single script, minimal configuration for pipeline integration and automation
- Enterprise Security: Azure RBAC via DefaultAzureCredential with fallback to API Key
- CI-friendly CLI: Stateless command that reads JSON/YAML, writes JSON/YAML, and exits with proper codes
- Beautiful Logging: Rich console output with timestamps and colors
- File-based I/O: CI/CD friendly with JSON/YAML input/output
- Template-Driven Workflows: Handlebars and Jinja2 templates with YAML variables for dynamic prompt generation
- YAML Support: Use YAML for schemas, input files, and output files - more readable than JSON
- Simple & Extensible: Easy to understand and modify for your specific needs
- Semantic Kernel foundation: async, service-oriented design ready for skills, memories, orchestration, and future model upgrades
- Documentation: Comprehensive documentation for all features and usage examples. Use your Semantic Kernel skills to extend the functionality.
- Acceptance Tests: pytest framework with the LLM-as-Judge pattern for quality gates. Test your scripts before you run them in production.
- Coming soon: token usage and cost estimation appended to each result for budgeting and optimisation
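To make the "retries with exponential back-off and jitter" feature concrete, here is a minimal sketch of that general technique. It is not the runner's actual implementation: the function name `retry_with_backoff`, its parameters, and the choice of retryable exceptions are assumptions made for illustration.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func(), retrying transient failures with exponential back-off and full jitter.

    Hypothetical helper for illustration; the real tool may differ.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Upper bound doubles each attempt: base_delay * 2^(attempt - 1), capped at max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: wait a random duration between 0 and the cap.
            sleep(random.uniform(0, cap))
```

The jitter spreads out retries from parallel pipeline jobs so they do not all hit the service again at the same moment. Passing `sleep` as a parameter keeps the helper easy to test without real waiting.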
The Only Enterprise AI DevOps Tool That Delivers RBAC Security, Robustness and Simplicity
LLM-CI-Runner stands alone in the market as the only tool combining 100% schema enforcement, enterprise RBAC authentication, and robust Semantic Kernel integration with templates in a single CLI solution. No other tool delivers all three critical enterprise requirements together.
Installation
pip install llm-ci-runner
That's it! No complex setup, no dependency management - just install and use. Perfect for CI/CD pipelines and local development.
Quick Start
1. Install from PyPI
pip install llm-ci-runner
2. Set Environment Variables
Azure OpenAI (Priority 1):
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_MODEL="gpt-4.1-nano" # or any other GPT deployment name
export AZURE_OPENAI_API_VERSION="2024-12-01-preview" # Optional
OpenAI (Fallback):
export OPENAI_API_KEY="your-very-secret-api-key"
export OPENAI_CHAT_MODEL_ID="gpt-4.1-nano" # or any OpenAI model
export OPENAI_ORG_ID="org-your-org-id" # Optional
Authentication Options:
- Azure RBAC (Recommended): Uses DefaultAzureCredential for Azure RBAC authentication, so no API key is needed. See Microsoft Docs for setup.
- Azure API Key: Set the AZURE_OPENAI_API_KEY environment variable if not using RBAC.
- OpenAI API Key: Required for the OpenAI fallback when Azure is not configured.
Priority: Azure OpenAI takes priority when both Azure and OpenAI environment variables are present.
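The priority rule can be pictured as a small decision function over the environment variables listed above. This is only a sketch of the documented behaviour: the function name `select_provider` and the exact set of variables checked are assumptions, and the runner's own detection logic may differ.

```python
import os


def select_provider(env=None):
    """Return (provider, auth_mode) following the documented Azure-first priority.

    Hypothetical helper; variable names are taken from the Quick Start above.
    """
    env = os.environ if env is None else env
    # Azure OpenAI wins whenever its endpoint and model are configured.
    if env.get("AZURE_OPENAI_ENDPOINT") and env.get("AZURE_OPENAI_MODEL"):
        # The API key is optional: without it, RBAC via DefaultAzureCredential is assumed.
        auth = "api-key" if env.get("AZURE_OPENAI_API_KEY") else "rbac"
        return ("azure", auth)
    # Otherwise fall back to OpenAI, which needs an API key and a model ID.
    if env.get("OPENAI_API_KEY") and env.get("OPENAI_CHAT_MODEL_ID"):
        return ("openai", "api-key")
    raise RuntimeError("No Azure OpenAI or OpenAI configuration found in the environment")
```

Running a check like this at the start of a pipeline job gives an early, readable failure when secrets are missing.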
3a. Basic Usage
# Simple chat example
llm-ci-runner --input-file examples/01-basic/simple-chat/input.json
# With structured output schema
llm-ci-runner \
--input-file examples/01-basic/sentiment-analysis/input.json \
--schema-file examples/01-basic/sentiment-analysis/schema.json
# Custom output file
llm-ci-runner \
--input-file examples/02-devops/pr-description/input.json \
--schema-file examples/02-devops/pr-description/schema.json \
--output-file pr-analysis.json
# YAML input files (alternative to JSON)
llm-ci-runner \
--input-file config.yaml \
--schema-file schema.yaml \
--output-file result.yaml
3b. Template-Based Workflows
Dynamic prompt generation with YAML, Handlebars or Jinja2 templates:
# Handlebars template example
llm-ci-runner \
--template-file examples/05-templates/handlebars-template/template.hbs \
--template-vars examples/05-templates/handlebars-template/template-vars.yaml \
--schema-file examples/05-templates/handlebars-template/schema.yaml \
--output-file handlebars-result.yaml
# Or using Jinja2 templates
llm-ci-runner \
--template-file examples/05-templates/jinja2-template/template.j2 \
--template-vars examples/05-templates/jinja2-template/template-vars.yaml \
--schema-file examples/05-templates/jinja2-template/schema.yaml \
--output-file jinja2-result.yaml
For more examples see the examples directory.
Benefits of Template Approach:
- Reusable Templates: Create once, use across multiple scenarios
- YAML Configuration: More readable than JSON for complex setups
- Dynamic Content: Variables and conditional rendering
- CI/CD Ready: Perfect for parameterized pipeline workflows
4. Development Setup (Optional)
For contributors or advanced users who want to modify the source:
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone and install for development
git clone https://github.com/Nantero1/ai-first-devops-toolkit.git
cd ai-first-devops-toolkit
uv sync
# Run from source
uv run llm-ci-runner --input-file examples/01-basic/simple-chat/input.json
Real-World Examples
You can explore the examples directory for a complete collection of self-contained examples organized by category.
For comprehensive real-world CI/CD scenarios, see examples/uv-usage-example.md. Some possibilities:
- AI-generated PR review: automated pull request analysis with structured review findings
- Commit summarizer: convert commit logs into concise release notes
- Vulnerability scanner: map code vulnerabilities to OWASP standards with actionable remediation
- Quality gate enforcer: validate build artifacts against schema-defined quality criteria
- Loan application analyzer: transform free-text loan applications into Basel-III risk-model inputs
- Consulting report generator: convert meeting notes into itemized Statement of Work deliverables
- Legal contract parser: extract clauses and compute risk scores from contract documents
- Court opinion digest: summarize judicial opinions into structured precedent and citation graphs
- Patient intake processor: build HL7/FHIR-compliant patient records from free-form intake forms
- Earnings call analyzer: convert transcripts into KPI dashboards for financial performance review
- Code-review bot: scan commits and PRs to produce OWASP-mapped vulnerability reports
- Incident post-mortem summarizer: generate structured root cause analysis and corrective action plans
- Regulatory compliance reporter: synthesize regulatory texts into structured compliance checklists
- Financial audit note handler: convert audit commentary into ledger-ready journal entries
- Technical review assistant: output structured code review reports with clear action items
- Doctor dictation converter: transform verbal notes into ICD-10 coded encounter records
- Legal discovery summarizer: extract key issues and risks from large document sets
- Manufacturing defect analyzer: build 8D corrective-action records from production issue notes
- Budget variance analyzer: summarize financial reports into detailed KPI and variance analyses
- Ticket triage assistant: prioritize technical support tickets with automated incident classification
- Compliance transformer: create structured Basel reports from raw regulatory text
- Credit risk evaluator: convert customer feedback into quantifiable risk scores
- Investor memo summarizer: distill strategic memos into pitch-deck bullet points
- Cyber threat mapper: translate security alerts into MITRE ATT&CK mapped incident reports
- Equipment maintenance scheduler: analyze sensor logs to generate predictive maintenance reports
- Health history compiler: produce structured patient histories from narrative medical notes
- Safety inspection checker: transform inspection narratives into OSHA citation checklists
- Radiology result formatter: convert radiology reports into SNOMED-coded JSON outputs
- Insurance claim analyzer: structure claim narratives into automated claim assessments
- Contract review summarizer: extract risk factors and key dates from legal contracts
- Fraud detector: transform analyst notes into SAR (Suspicious Activity Report) JSON objects
- Policy impact assessor: convert policy proposals into stakeholder impact matrices
- Production incident reporter: build actionable recovery plans from factory incident logs
- Documentation updater: generate schema-compliant technical documentation automatically
- API diff analyzer: produce backward-compatibility risk reports from API specification changes
- Financial forecaster: summarize financial reports into structured cash-flow and projection objects
- Deployment log analyzer: convert rollout logs into performance and downtime metrics
- E-commerce sentiment analyzer: tag customer reviews with sentiment and key product features
- Meeting minute extractor: transform recorded meetings into action items and follow-up tasks
- Sprint retrospective summarizer: generate improvement plans from agile team discussions
- Clinical trial data packager: automatically structure clinical notes for FDA submission
- Employee feedback analyzer: convert free-text feedback into HR insights and action checklists
- Process efficiency reporter: convert production logs into structured performance metrics
- Legal bill auditor: transform billing details into itemized expense and compliance reports
- Automated inventory trigger: build reordering reports from warehouse inventory logs
- Receipt processor: convert OCR receipts into ledger-ready accounting entries
- Mortgage eligibility assessor: analyze mortgage applications to generate risk and eligibility scores
- Infrastructure incident analyst: summarize log files into detailed RCAs and incident timelines
- Regulatory update tracker: generate structured compliance action items from updated guidelines
- Board meeting summarizer: extract key decisions and action items from meeting transcripts
- Vulnerability risk assessor: create remediation plans by mapping findings to risk frameworks
- Legal email analyzer: extract key issues and deadlines from email threads for legal review
- Prescription manager: transform handwritten prescription notes into structured medication lists
- Git log analyzer: generate detailed changelogs from version control commit histories
- SOP generator: create standard operating procedures with checklist items from process descriptions
- PR triage tool: score and tag pull requests by urgency and impact automatically
- Audit finding summarizer: convert audit observations into structured compliance and risk reports
- Market trend analyzer: synthesize marketing data into structured trend forecasting objects
- Proposal evaluator: produce structured scoring and evaluation criteria from project proposals
- Operations dashboard creator: translate facility logs into productivity and efficiency metrics
- Lab result organizer: build structured diagnostic tables from laboratory results
- Innovation evaluator: compile ideation logs into cost-benefit structured analyses
- Judicial ruling summarizer: generate concise, structured digests from court rulings
- Commit changelog generator: extract impactful changes from commit logs for release summaries
- Production yield analyzer: produce reports on output statistics and downtime from factory logs
- Fraud alert generator: transform risk signals into automated CVSS-scored alerts
- Regulatory filing assistant: structure raw regulatory data for seamless filing and compliance tracking
- Clinical observation compiler: convert medical research notes into structured clinical data entries
- Deployment success reporter: summarize production rollouts with performance metrics and KPIs
- Mortgage risk evaluator: process mortgage files into detailed risk scoring and eligibility summaries
- Contract amendment monitor: track version changes and compliance updates in amended contracts
- Vital signs monitor: generate alert reports from patient vital signs and anomaly detection
- IT security auditor: convert access logs into structured audit and compliance reports
- Incident ticket classifier: generate detailed RCA reports and automated ticket categorizations
- Governance mapper: produce structured mappings of internal policies to regulatory frameworks
- Onboarding compliance checker: convert training logs into automated compliance and checklist trackers
- Data breach notifier: build structured breach incident reports with remediation plans
- Teller performance analyzer: transform shift logs into performance and error analysis reports
- Contract risk assessor: generate automated legal risk memos from detailed contract reviews
- Bug report classifier: categorize issue reports by severity and produce remediation plans
- Appointment summarizer: convert appointment notes into structured follow-up recommendations
- Data migration manifest: convert ETL mapping details into a structured transformation record
- Post-release analyst: synthesize customer feedback into performance improvement metrics
- Equipment efficiency evaluator: analyze production logs to predict maintenance needs and cost analysis
- Fraud case reporter: compile investigative notes into structured fraud case summaries
- Compliance checklist generator: map internal controls to GDPR or other frameworks in structured reports
- Diff summarizer: automatically generate summaries of code differences for peer review
- Patent claim comparator: produce novelty and prior art comparison tables from patent texts
- Cyber incident analyzer: structure incident narratives into threat intelligence and remediation guides
- Security audit mapper: create control maps aligned with NIST frameworks from audit notes
- Portfolio risk analyzer: transform investment notes into performance and risk metric summaries
- Stress test reporter: compile financial stress test scenarios into structured risk reports
- Meeting action tracker: extract decisions and assign tasks from meeting minutes
- DevOps runbook creator: produce actionable standard operating procedures from runbook logs
- Supply chain optimizer: generate delay forecasts and automated inventory suggestions from logistics notes
- Process improvement recommender: convert operational logs into structured efficiency recommendations
- Compliance reporter: map internal governance policies to GDPR and similar frameworks
- API performance optimizer: analyze API usage logs to generate optimization and performance metrics
- Legacy system analyzer: assess legacy code bases and produce migration impact reports
- Unstructured anything: your bespoke schema-validated JSON
Input Formats
Traditional JSON Input
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Your task description here"
    }
  ],
  "context": {
    "session_id": "optional-session-id",
    "metadata": {
      "any": "additional context"
    }
  }
}
YAML Input
messages:
  - role: system
    content: "You are a helpful assistant."
  - role: user
    content: "Your task description here"
context:
  session_id: "optional-session-id"
  metadata:
    any: "additional context"
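In a pipeline the input file is often generated rather than written by hand. The following sketch builds a payload in the shape shown above; the helper name `build_input` and the example values are hypothetical, and only the `messages` and `context` keys come from the format documented here.

```python
import json
from pathlib import Path


def build_input(system_prompt, user_prompt, context=None):
    """Assemble a payload with the documented messages/context structure."""
    payload = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ]
    }
    if context:
        # The context block is optional, so it is only added when provided.
        payload["context"] = context
    return payload


def write_input_file(path, payload):
    """Write the payload as JSON so it can be passed via --input-file."""
    Path(path).write_text(json.dumps(payload, indent=2), encoding="utf-8")
```

A pipeline step could call `write_input_file("input.json", build_input(...))` and then pass that file to `llm-ci-runner --input-file input.json`.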
Template-Based Input
Handlebars Template (template.hbs):
<message role="system">
You are an expert {{expertise.domain}} engineer.
Focus on {{expertise.focus_areas}}.
</message>
<message role="user">
Analyze this {{task.type}}:
{{#each task.items}}
- {{this}}
{{/each}}
Requirements: {{task.requirements}}
</message>
Jinja2 Template (template.j2):
<message role="system">
You are an expert {{expertise.domain}} engineer.
Focus on {{expertise.focus_areas}}.
</message>
<message role="user">
Analyze this {{task.type}}:
{% for item in task.items %}
- {{item}}
{% endfor %}
Requirements: {{task.requirements}}
</message>
Template Variables (vars.yaml):
expertise:
  domain: "DevOps"
  focus_areas: "security, performance, maintainability"
task:
  type: "pull request"
  items:
    - "Changed authentication logic"
    - "Updated database queries"
    - "Added input validation"
  requirements: "Focus on security vulnerabilities"
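Once a template is rendered with its variables, the `<message role="...">` blocks correspond to the same role/content pairs used in the JSON and YAML input formats. The sketch below illustrates that mapping with a regular expression. It is an assumption-laden illustration, not the parser the runner or Semantic Kernel actually uses, and `parse_messages` is a hypothetical name.

```python
import re

# Matches one <message role="...">...</message> block, trimming surrounding whitespace.
MESSAGE_RE = re.compile(
    r'<message role="(?P<role>[^"]+)">\s*(?P<content>.*?)\s*</message>',
    re.DOTALL,
)


def parse_messages(rendered):
    """Convert rendered template text into a list of role/content dictionaries."""
    return [
        {"role": m.group("role"), "content": m.group("content")}
        for m in MESSAGE_RE.finditer(rendered)
    ]
```

Applied to the templates above after rendering, this would yield one system message and one user message.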
Structured Outputs with 100% Schema Enforcement
When you provide a --schema-file, the runner guarantees perfect schema compliance:
llm-ci-runner \
--input-file examples/01-basic/sentiment-analysis/input.json \
--schema-file examples/01-basic/sentiment-analysis/schema.json
Note: Output defaults to result.json. Use --output-file custom-name.json for custom output files.
Supported Schema Features:
- String constraints (enum, minLength, maxLength, pattern)
- Numeric constraints (minimum, maximum, multipleOf)
- Array constraints (minItems, maxItems, items type)
- Required fields enforced at generation time
- Type validation (string, number, integer, boolean, array)
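As an illustration of how these features combine, here is a hypothetical sentiment schema expressed as a Python dictionary and serialized to JSON. The field names (`sentiment`, `confidence`, `summary`, `key_points`) are invented for this example and are not taken from the project's example files.

```python
import json

# Hypothetical schema; every field name here is illustrative only.
SENTIMENT_SCHEMA = {
    "type": "object",
    "properties": {
        # String constraint: enum restricts the allowed values.
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        # Numeric constraints: minimum and maximum bound the value.
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        # String length constraints.
        "summary": {"type": "string", "minLength": 10, "maxLength": 200},
        # Array constraints: item type plus minItems and maxItems.
        "key_points": {
            "type": "array",
            "items": {"type": "string"},
            "minItems": 1,
            "maxItems": 5,
        },
    },
    "required": ["sentiment", "confidence", "summary", "key_points"],
    "additionalProperties": False,
}


def schema_as_json(schema=SENTIMENT_SCHEMA):
    """Serialize the schema so it can be saved and passed via --schema-file."""
    return json.dumps(schema, indent=2)
```

Saving the output of `schema_as_json()` to a file gives a schema that can be supplied with `--schema-file`.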
CI/CD Integration
GitHub Actions Example
- name: Setup Python
  uses: actions/setup-python@v5
  with:
    python-version: '3.12'
- name: Install LLM CI Runner
  run: pip install llm-ci-runner
- name: Generate PR Review with Templates
  run: |
    llm-ci-runner \
      --template-file .github/templates/pr-review.j2 \
      --template-vars pr-context.yaml \
      --schema-file .github/schemas/pr-review.yaml \
      --output-file pr-analysis.yaml
  env:
    AZURE_OPENAI_ENDPOINT: ${{ secrets.AZURE_OPENAI_ENDPOINT }}
    AZURE_OPENAI_MODEL: ${{ secrets.AZURE_OPENAI_MODEL }}
For complete CI/CD examples, see examples/uv-usage-example.md.
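Because the output is schema-validated, a later pipeline step can act on it directly. The sketch below shows one way a quality gate might look. It assumes the result was written as JSON (for example with `--output-file pr-analysis.json`) and that your schema defines a `findings` array whose items carry a `severity` field. Both are assumptions for illustration, not part of the tool.

```python
import json
from pathlib import Path


def gate(result, max_high_findings=0):
    """Return a process exit code: 0 passes the gate, 1 fails it.

    Assumes a hypothetical result shape: {"findings": [{"severity": "..."}]}.
    """
    findings = result.get("findings", [])
    high = [f for f in findings if f.get("severity") == "high"]
    return 0 if len(high) <= max_high_findings else 1


def main(argv):
    """Load the result file named in argv[1] and evaluate the gate."""
    result = json.loads(Path(argv[1]).read_text(encoding="utf-8"))
    return gate(result)
```

A pipeline step could run this with `sys.exit(main(sys.argv))` so that a non-zero exit code fails the job.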
Authentication
Azure OpenAI: Uses Azure's DefaultAzureCredential supporting:
- Environment variables (local development)
- Managed Identity (recommended for Azure CI/CD)
- Azure CLI (local development)
- Service Principal (non-Azure CI/CD)
OpenAI: Uses API key authentication with optional organization ID.
Testing
We maintain comprehensive test coverage with 100% success rate:
# For package users - install test dependencies
pip install "llm-ci-runner[dev]"  # quoted so shells like zsh do not expand the brackets
# For development - install from source with test dependencies
uv sync --group dev
# Run specific test categories
pytest tests/unit/ -v # 70 unit tests
pytest tests/integration/ -v # End-to-end examples
pytest acceptance/ -v # LLM-as-judge evaluation
# Or with uv for development
uv run pytest tests/unit/ -v
uv run pytest tests/integration/ -v
uv run pytest acceptance/ -v
Architecture
Built on Microsoft Semantic Kernel for:
- Enterprise-ready Azure OpenAI and OpenAI integration
- Future-proof model compatibility
- 100% Schema Enforcement: KernelBaseModel integration with token-level constraints
- Dynamic Model Creation: Runtime conversion of JSON schemas into Pydantic models
- Azure RBAC: Authentication via DefaultAzureCredential
- Automatic Fallback: Azure-first priority with OpenAI fallback
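The idea of building a typed model from a JSON schema at runtime can be sketched with the standard library alone. Note the swap: the toolkit does this with Pydantic, whereas this example uses `dataclasses.make_dataclass` purely to show the principle. The function name `model_from_schema` and the type mapping are assumptions, and constraints such as `enum` or `minimum` are not enforced here.

```python
from dataclasses import field, make_dataclass
from typing import Any, Optional

# Simplified mapping from JSON schema types to Python types.
JSON_TO_PY = {
    "string": str,
    "number": float,
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def model_from_schema(name, schema):
    """Create a dataclass whose fields mirror the schema's top-level properties."""
    required = set(schema.get("required", []))
    props = schema.get("properties", {})
    specs = []
    # Required fields go first: dataclasses forbid non-default fields after defaults.
    for key in sorted(props, key=lambda k: k not in required):
        py_type = JSON_TO_PY.get(props[key].get("type"), Any)
        if key in required:
            specs.append((key, py_type))
        else:
            # Optional properties default to None when the model omits them.
            specs.append((key, Optional[py_type], field(default=None)))
    return make_dataclass(name, specs)
```

A model created this way can be instantiated from a parsed result with `Model(**data)`, which fails if a required field is missing.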
The AI-First Development Journey
This toolkit is your first step toward AI-First DevOps. As you integrate AI into your development workflows, you'll experience:
- Exponential Productivity: AI handles routine tasks while you focus on architecture
- Guaranteed Quality: Schema enforcement eliminates validation errors
- Autonomous Operations: AI agents make decisions in your pipelines
- Continuous Improvement: Every interaction improves your AI system
The future belongs to teams that master AI-first principles. This toolkit gives you the foundation to start that journey today.
License
MIT License - See LICENSE file for details. Copyright (c) 2025, Benjamin Linnik.
Support
Found a bug? Have a question? Need help?
GitHub is your primary destination for all support:
- Issues & Bug Reports: Create an issue
- Documentation: Browse examples
- Source Code: View source
Before opening an issue, please:
- Check the examples directory for solutions
- Review the error logs (beautiful output with Rich!)
- Validate your Azure authentication and permissions
- Ensure your input JSON follows the required format
- Search existing issues for similar problems
Quick Links:
- Getting Started Guide
- Complete Examples
- CI/CD Integration
- Use Cases
Ready to embrace the AI-First future? Start with this toolkit and build your path to exponential productivity. Learn more about the AI-First DevOps revolution in Building AI-First DevOps.