A simple CI/CD utility for running LLM tasks with Semantic Kernel
AI-First Toolkit: LLM-Powered Automation
The Future of DevOps is AI-First
This toolkit represents a step toward AI-First DevOps - where intelligent automation handles the entire development lifecycle. Built for teams ready to embrace the exponential productivity gains of AI-powered development. Please read the blog post for more details on the motivation.
TLDR: What This Tool Does
Purpose: Transform any unstructured business knowledge into reliable, structured data that powers intelligent automation across your entire organization.
Perfect For:
- Financial Operations: Convert loan applications, audits, and regulatory docs into structured compliance data
- Healthcare Systems: Transform patient records, clinical notes, and research data into medical formats
- Legal & Compliance: Process contracts, court docs, and regulatory texts into actionable risk assessments
- Supply Chain: Turn logistics reports, supplier communications, and forecasts into optimization insights
- Human Resources: Convert resumes, performance reviews, and feedback into structured talent analytics
- Security Operations: Transform threat reports, incident logs, and assessments into standard frameworks
- DevOps & Engineering: Use commit logs, deployment reports, and system logs for automated AI actions
- Enterprise Integration: Connect any business process to downstream systems with guaranteed consistency
Simple structured output example
# Install and use immediately
pip install llm-ci-runner
llm-ci-runner --input-file examples/02-devops/pr-description/input.json --schema-file examples/02-devops/pr-description/schema.json
The AI-First Development Revolution
This toolkit embodies the principles outlined in Building AI-First DevOps:
| Traditional DevOps | AI-First DevOps (This Tool) |
|---|---|
| Manual code reviews | AI-powered reviews with structured findings |
| Human-written documentation | AI-generated docs with guaranteed consistency |
| Reactive security scanning | Proactive AI security analysis |
| Manual quality gates | AI-driven validation with schema enforcement |
| Linear productivity | Exponential gains through intelligent automation |
Features
- 100% Schema Enforcement: Your pipeline never gets invalid data. Token-level schema enforcement with guaranteed compliance.
- Resilient execution: Retries with exponential back-off and jitter plus a clear exception hierarchy keep transient cloud faults from breaking your CI.
- Zero-Friction CLI: Single script, minimal configuration for pipeline integration and automation
- Enterprise Security: Azure RBAC via DefaultAzureCredential with fallback to API Key
- CI-friendly CLI: Stateless command that reads JSON/YAML, writes JSON/YAML, and exits with proper codes
- Beautiful Logging: Rich console output with timestamps and colors
- File-based I/O: CI/CD friendly with JSON/YAML input/output
- Template-Driven Workflows: Handlebars and Jinja2 templates with YAML variables for dynamic prompt generation
- YAML Support: Use YAML for schemas, input files, and output files - more readable than JSON
- Simple & Extensible: Easy to understand and modify for your specific needs
- Semantic Kernel foundation: async, service-oriented design ready for skills, memories, orchestration, and future model upgrades
- Documentation: Comprehensive documentation for all features and usage examples. Use your Semantic Kernel skills to extend the functionality.
- Acceptance Tests: pytest framework with the LLM-as-Judge pattern for quality gates. Test your scripts before you run them in production.
- Coming soon: token usage and cost estimation appended to each result for budgeting and optimisation
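The resilient-execution feature above refers to retries with exponential back-off and jitter. As a rough illustration of that general technique (not the runner's actual implementation), the sketch below retries a callable with "full jitter". The function name `retry_with_backoff`, its parameters, and the choice of retryable exception types are assumptions made for this example.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func(), retrying transient failures with exponential back-off and full jitter.

    All names and defaults here are hypothetical; they only illustrate the pattern.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # The delay cap doubles with every attempt, bounded by max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: wait a random time between 0 and the cap so that
            # parallel CI jobs do not retry in lockstep.
            sleep(random.uniform(0, cap))
```

The `sleep` parameter is injectable so the behaviour can be checked in tests without actually waiting.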
The Only Enterprise AI DevOps Tool That Delivers RBAC Security, Robustness and Simplicity
LLM-CI-Runner stands alone in the market as the only tool combining 100% schema enforcement, enterprise RBAC authentication, and robust Semantic Kernel integration with templates in a single CLI solution. No other tool delivers all three critical enterprise requirements together.
Installation
pip install llm-ci-runner
That's it! No complex setup, no dependency management - just install and use. Perfect for CI/CD pipelines and local development.
Quick Start
1. Install from PyPI
pip install llm-ci-runner
2. Set Environment Variables
Azure OpenAI (Priority 1):
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_MODEL="gpt-4.1-nano" # or any other GPT deployment name
export AZURE_OPENAI_API_VERSION="2024-12-01-preview" # Optional
OpenAI (Fallback):
export OPENAI_API_KEY="your-very-secret-api-key"
export OPENAI_CHAT_MODEL_ID="gpt-4.1-nano" # or any OpenAI model
export OPENAI_ORG_ID="org-your-org-id" # Optional
Authentication Options:
- Azure RBAC (Recommended): Uses DefaultAzureCredential for Azure RBAC authentication - no API key needed! See Microsoft Docs for setup.
- Azure API Key: Set the AZURE_OPENAI_API_KEY environment variable if not using RBAC.
- OpenAI API Key: Required for OpenAI fallback when Azure is not configured.
Priority: Azure OpenAI takes priority when both Azure and OpenAI environment variables are present.
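The priority rule can be pictured as a small decision function. The sketch below is only an illustration of the documented behaviour, not the runner's code: the function name `select_provider`, the choice of which variables count as "configured", and the rule that an Azure API key is used when set (RBAC otherwise) are assumptions.

```python
import os


def select_provider(env=None):
    """Return (provider, auth_method) following the documented Azure-first priority.

    Hypothetical helper: the exact variables the runner checks may differ.
    """
    env = os.environ if env is None else env
    if env.get("AZURE_OPENAI_ENDPOINT") and env.get("AZURE_OPENAI_MODEL"):
        # Assumption: an explicit API key is used when set, RBAC otherwise.
        auth = "api_key" if env.get("AZURE_OPENAI_API_KEY") else "rbac"
        return ("azure_openai", auth)
    if env.get("OPENAI_API_KEY") and env.get("OPENAI_CHAT_MODEL_ID"):
        return ("openai", "api_key")
    raise RuntimeError("No Azure OpenAI or OpenAI configuration found")
```

Calling `select_provider()` without arguments inspects the current process environment, which makes it usable as a quick pre-flight check in a pipeline step.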
3a. Basic Usage
# Simple chat example
llm-ci-runner --input-file examples/01-basic/simple-chat/input.json
# With structured output schema
llm-ci-runner \
--input-file examples/01-basic/sentiment-analysis/input.json \
--schema-file examples/01-basic/sentiment-analysis/schema.json
# Custom output file
llm-ci-runner \
--input-file examples/02-devops/pr-description/input.json \
--schema-file examples/02-devops/pr-description/schema.json \
--output-file pr-analysis.json
# YAML input files (alternative to JSON)
llm-ci-runner \
--input-file config.yaml \
--schema-file schema.yaml \
--output-file result.yaml
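When a pipeline step is itself written in Python, the commands above can be wrapped in a small helper. This is a minimal sketch that uses only the three flags shown in this section; the helper names `build_command` and `run_runner` are hypothetical, and it assumes the `llm-ci-runner` executable is on the PATH.

```python
import subprocess


def build_command(input_file, schema_file=None, output_file=None):
    """Assemble the llm-ci-runner argument list from the flags shown above."""
    cmd = ["llm-ci-runner", "--input-file", str(input_file)]
    if schema_file:
        cmd += ["--schema-file", str(schema_file)]
    if output_file:
        cmd += ["--output-file", str(output_file)]
    return cmd


def run_runner(input_file, schema_file=None, output_file=None):
    """Run the CLI; check=True raises CalledProcessError on a non-zero exit code."""
    cmd = build_command(input_file, schema_file, output_file)
    return subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.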
3b. Template-Based Workflows
Dynamic prompt generation with YAML, Handlebars or Jinja2 templates:
# Handlebars template example
llm-ci-runner \
--template-file examples/05-templates/handlebars-template/template.hbs \
--template-vars examples/05-templates/handlebars-template/template-vars.yaml \
--schema-file examples/05-templates/handlebars-template/schema.yaml \
--output-file handlebars-result.yaml
# Or using Jinja2 templates
llm-ci-runner \
--template-file examples/05-templates/jinja2-template/template.j2 \
--template-vars examples/05-templates/jinja2-template/template-vars.yaml \
--schema-file examples/05-templates/jinja2-template/schema.yaml \
--output-file jinja2-result.yaml
For more examples see the examples directory.
Benefits of Template Approach:
- Reusable Templates: Create once, use across multiple scenarios
- YAML Configuration: More readable than JSON for complex setups
- Dynamic Content: Variables and conditional rendering
- CI/CD Ready: Perfect for parameterized pipeline workflows
4. Development Setup (Optional)
For contributors or advanced users who want to modify the source:
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone and install for development
git clone https://github.com/Nantero1/ai-first-devops-toolkit.git
cd ai-first-devops-toolkit
uv sync
# Run from source
uv run llm-ci-runner --input-file examples/01-basic/simple-chat/input.json
The AI-First Transformation: Why Unstructured → Structured Matters
LLMs excel at extracting meaning from messy text, logs, documents, and mixed-format data, then emitting schema-compliant JSON/YAML that downstream systems can trust. This unlocks:
- Straight-Through Processing: Structured payloads feed BI dashboards, RPA robots, and CI/CD gates without human parsing
- Context-Aware Decisions: LLMs fuse domain knowledge with live telemetry to prioritize incidents, forecast demand, and spot security drift
- Auditable Compliance: Formal outputs make it easy to track decisions for regulators and ISO/NIST audits
- Rapid Workflow Automation: Enable automation across customer service, supply-chain planning, HR case handling, and security triage
- Safe Pipeline Composition: Structured contracts let AI-first pipelines remain observable and composable while capitalizing on unstructured enterprise data
Input Formats
Traditional JSON Input
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Your task description here"
    }
  ],
  "context": {
    "session_id": "optional-session-id",
    "metadata": {
      "any": "additional context"
    }
  }
}
YAML Input
messages:
  - role: system
    content: "You are a helpful assistant."
  - role: user
    content: "Your task description here"
context:
  session_id: "optional-session-id"
  metadata:
    any: "additional context"
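In a pipeline, the input file is often generated on the fly from data such as a diff or a log excerpt. The following sketch writes a JSON input file with the `messages` and `context` structure shown above; the helper name `write_input_file` and its parameters are hypothetical.

```python
import json
from pathlib import Path


def write_input_file(path, system_prompt, user_prompt, context=None):
    """Write an input file with the messages/context layout shown above."""
    payload = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ]
    }
    if context is not None:
        # "context" is optional in the examples above, so only add it on request.
        payload["context"] = context
    Path(path).write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return payload
```

The resulting file can then be passed to the CLI with --input-file.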
Template-Based Input
Handlebars Template (template.hbs):
<message role="system">
You are an expert {{expertise.domain}} engineer.
Focus on {{expertise.focus_areas}}.
</message>
<message role="user">
Analyze this {{task.type}}:
{{#each task.items}}
- {{this}}
{{/each}}
Requirements: {{task.requirements}}
</message>
Jinja2 Template (template.j2):
<message role="system">
You are an expert {{expertise.domain}} engineer.
Focus on {{expertise.focus_areas}}.
</message>
<message role="user">
Analyze this {{task.type}}:
{% for item in task["items"] %}
- {{item}}
{% endfor %}
Requirements: {{task.requirements}}
</message>
Template Variables (vars.yaml):
expertise:
  domain: "DevOps"
  focus_areas: "security, performance, maintainability"
task:
  type: "pull request"
  items:
    - "Changed authentication logic"
    - "Updated database queries"
    - "Added input validation"
  requirements: "Focus on security vulnerabilities"
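Once a template has been rendered with its variables, the `<message role="...">` blocks correspond to the same role/content pairs as the JSON and YAML inputs. The sketch below shows that correspondence with a simple regular expression. It is an illustration of the format only, not the parser the runner or Semantic Kernel uses, and the name `parse_messages` is hypothetical.

```python
import re

# Matches one <message role="...">...</message> block, trimming surrounding whitespace.
MESSAGE_RE = re.compile(
    r'<message\s+role="(?P<role>[^"]+)">\s*(?P<content>.*?)\s*</message>',
    re.DOTALL,
)


def parse_messages(rendered):
    """Turn rendered template text into a list of {"role", "content"} dicts."""
    return [
        {"role": match.group("role"), "content": match.group("content")}
        for match in MESSAGE_RE.finditer(rendered)
    ]
```

A quick check like this can help when debugging a template whose output does not reach the model as expected.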
Structured Outputs with 100% Schema Enforcement
When you provide a --schema-file, the runner guarantees perfect schema compliance:
llm-ci-runner \
--input-file examples/01-basic/sentiment-analysis/input.json \
--schema-file examples/01-basic/sentiment-analysis/schema.json
Note: Output defaults to result.json. Use --output-file custom-name.json for custom output files.
Supported Schema Features:
- String constraints (enum, minLength, maxLength, pattern)
- Numeric constraints (minimum, maximum, multipleOf)
- Array constraints (minItems, maxItems, items type)
- Required fields enforced at generation time
- Type validation (string, number, integer, boolean, array)
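To make the constraint list more concrete, here is a small schema that uses several of those keywords, together with a deliberately minimal checker for them. Both are assumptions for illustration: the field names (`sentiment`, `confidence`, `key_points`) are hypothetical, and the `check` function is a local sanity check covering only a subset of keywords, not the token-level enforcement the runner performs.

```python
SCHEMA = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        "key_points": {"type": "array", "items": {"type": "string"},
                       "minItems": 1, "maxItems": 5},
    },
    "required": ["sentiment", "confidence", "key_points"],
}

TYPES = {"string": str, "number": (int, float), "integer": int,
         "boolean": bool, "array": list, "object": dict}


def check(value, schema, path="$"):
    """Return a list of violations for the keyword subset used in SCHEMA."""
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        is_bool = isinstance(value, bool)
        if not isinstance(value, TYPES[expected]) or (is_bool and expected != "boolean"):
            return [f"{path}: expected {expected}"]
    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: {value!r} not in enum")
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{path}: below minimum")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{path}: above maximum")
    if isinstance(value, list):
        if len(value) < schema.get("minItems", 0):
            errors.append(f"{path}: too few items")
        if "maxItems" in schema and len(value) > schema["maxItems"]:
            errors.append(f"{path}: too many items")
        for index, item in enumerate(value):
            errors += check(item, schema.get("items", {}), f"{path}[{index}]")
    if isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}.{name}: required field missing")
        for name, sub_schema in schema.get("properties", {}).items():
            if name in value:
                errors += check(value[name], sub_schema, f"{path}.{name}")
    return errors
```

An empty list means the value satisfies the checked constraints. For full JSON Schema coverage, a dedicated validation library is the better choice.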
CI/CD Integration
GitHub Actions Example
- name: Setup Python
  uses: actions/setup-python@v5
  with:
    python-version: '3.12'
- name: Install LLM CI Runner
  run: pip install llm-ci-runner
- name: Generate PR Review with Templates
  run: |
    llm-ci-runner \
      --template-file .github/templates/pr-review.j2 \
      --template-vars pr-context.yaml \
      --schema-file .github/schemas/pr-review.yaml \
      --output-file pr-analysis.yaml
  env:
    AZURE_OPENAI_ENDPOINT: ${{ secrets.AZURE_OPENAI_ENDPOINT }}
    AZURE_OPENAI_MODEL: ${{ secrets.AZURE_OPENAI_MODEL }}
For complete CI/CD examples, see examples/uv-usage-example.md. This repository also uses the runner itself to generate its release notes.
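A structured result becomes useful in CI once a later step acts on it. The sketch below shows one possible quality gate that turns the runner's output into an exit code. It assumes the output was written as JSON (for example with --output-file pr-analysis.json) and that the schema defines a `risk_level` field with the values low, medium, and high; both the field and the function name `gate` are hypothetical.

```python
import json
from pathlib import Path

RISK_ORDER = ["low", "medium", "high"]


def gate(result_path, max_risk="medium"):
    """Return 0 if the analysis passes the gate, 1 otherwise."""
    result = json.loads(Path(result_path).read_text(encoding="utf-8"))
    # "risk_level" is a hypothetical field; use whatever your schema defines.
    risk = result.get("risk_level")
    if risk not in RISK_ORDER:
        return 1  # fail closed when the field is missing or unexpected
    return 0 if RISK_ORDER.index(risk) <= RISK_ORDER.index(max_risk) else 1
```

A follow-up workflow step could call `sys.exit(gate("pr-analysis.json"))` so that a high-risk review fails the job.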
Authentication
Azure OpenAI: Uses Azure's DefaultAzureCredential supporting:
- Environment variables (local development)
- Managed Identity (recommended for Azure CI/CD)
- Azure CLI (local development)
- Service Principal (non-Azure CI/CD)
OpenAI: Uses API key authentication with optional organization ID.
Testing
We maintain comprehensive test coverage with a 100% success rate:
# For package users - install test dependencies (quoted so shells such as zsh do not expand the brackets)
pip install "llm-ci-runner[dev]"
# For development - install from source with test dependencies
uv sync --group dev
# Run specific test categories
pytest tests/unit/ -v # 70 unit tests
pytest tests/integration/ -v # End-to-end examples
pytest acceptance/ -v # LLM-as-judge evaluation
# Or with uv for development
uv run pytest tests/unit/ -v
uv run pytest tests/integration/ -v
uv run pytest acceptance/ -v
Architecture
Built on Microsoft Semantic Kernel for:
- Enterprise-ready Azure OpenAI and OpenAI integration
- Future-proof model compatibility
- 100% Schema Enforcement: KernelBaseModel integration with token-level constraints
- Dynamic Model Creation: Runtime JSON schema → Pydantic model conversion
- Azure RBAC: Azure RBAC via DefaultAzureCredential
- Automatic Fallback: Azure-first priority with OpenAI fallback
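The idea behind dynamic model creation, building a typed class at runtime from a JSON schema, can be sketched with the standard library alone. Note the substitution: the runner is described as converting to Pydantic models, whereas this example uses `dataclasses.make_dataclass` as a dependency-free stand-in, so it creates the fields but does not validate values. The names `model_from_schema` and `JSON_TO_PY` are hypothetical.

```python
from dataclasses import field, make_dataclass

# Rough mapping from JSON Schema type names to Python types (stand-in only).
JSON_TO_PY = {"string": str, "number": float, "integer": int,
              "boolean": bool, "array": list, "object": dict}


def model_from_schema(name, schema):
    """Build a dataclass whose fields mirror the schema's top-level properties."""
    required = set(schema.get("required", []))
    properties = schema.get("properties", {})
    # Required fields go first: dataclasses forbid fields without defaults
    # after fields with defaults. sorted() is stable, so order is otherwise kept.
    ordered = sorted(properties.items(), key=lambda item: item[0] not in required)
    specs = []
    for prop, sub_schema in ordered:
        py_type = JSON_TO_PY.get(sub_schema.get("type"), object)
        if prop in required:
            specs.append((prop, py_type))
        else:
            specs.append((prop, py_type, field(default=None)))
    return make_dataclass(name, specs)
```

With Pydantic, the same loop would typically feed `pydantic.create_model` instead, which adds runtime validation on top of the generated fields.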
The AI-First Development Journey
This toolkit is your first step toward AI-First DevOps. As you integrate AI into your development workflows, you'll experience:
- Exponential Productivity: AI handles routine tasks while you focus on architecture
- Guaranteed Quality: Schema enforcement eliminates validation errors
- Autonomous Operations: AI agents make decisions in your pipelines
- Continuous Improvement: Every interaction improves your AI system
The future belongs to teams that master AI-first principles. This toolkit gives you the foundation to start that journey today.
Real-World Examples
You can explore the examples directory for a complete collection of self-contained examples organized by category.
For comprehensive real-world CI/CD scenarios, see examples/uv-usage-example.md.
100 Use Cases for AI-First Automation
DevOps & Engineering
- AI-generated PR review: automated pull request analysis with structured review findings
- Release note composer: map commits to semantic-version bump rules and structured changelogs
- Vulnerability scanner: map code vulnerabilities to OWASP standards with actionable remediation
- Kubernetes manifest optimizer: produce risk-scored diffs and security hardening recommendations
- Log anomaly triager: convert system logs into OTEL-formatted events for SIEM ingestion
- Cloud cost explainer: output tagged spend by team in FinOps schema for budget optimization
- API diff analyzer: produce backward-compatibility scorecards from specification changes
- IaC drift detector: turn Terraform plans into CVE-linked security findings
- Dependency license auditor: emit SPDX-compatible reports for compliance tracking
- SLA breach summarizer: file structured JIRA tickets with SMART action items
Governance, Risk & Compliance
- Regulatory delta analyzer: emit change-impact matrices from new compliance requirements
- ESG report synthesizer: map CSR prose to GRI indicators and sustainability metrics
- SOX-404 narrative converter: transform controls descriptions into testable audit checklists
- Basel III stress-test interpreter: output capital risk buckets from regulatory scenarios
- AML SAR formatter: convert investigator notes into Suspicious Activity Report structures
- Privacy policy parser: generate GDPR data-processing-activity logs from legal text
- Internal audit evidence linker: export control traceability graphs for compliance tracking
- Carbon emission disclosure normalizer: structure sustainability data into XBRL taxonomy
- Regulatory update tracker: generate structured compliance action items from guideline changes
- Safety inspection checker: transform narratives into OSHA citation checklists
Financial Services
- Loan application analyzer: transform free-text applications into Basel-III risk-model inputs
- Earnings call sentiment quantifier: output KPI deltas and investor sentiment scores
- Budget variance explainer: produce drill-down pivot JSON for financial analysis
- Portfolio risk dashboard builder: feed VaR models with structured investment analysis
- Fraud alert generator: map investigation notes to CVSS-scored security metrics
- Treasury cash-flow predictor: ingest email forecasts into structured planning models
- Financial forecaster: summarize reports into structured cash-flow and projection objects
- Invoice processor: convert receipts into double-entry ledger posts with GAAP tags
- Stress test scenario packager: structure regulatory submission data for banking compliance
- Insurance claim assessor: return structured claim-decision objects with risk scores
Healthcare & Life Sciences
- Patient intake processor: build HL7/FHIR-compliant patient records from free-form intake forms
- Mental health triage assistant: structure referral notes with priority classifications and care pathways
- Radiology report coder: output SNOMED-coded JSON from diagnostic imaging narratives
- Clinical trial note packager: create FDA eCTD modules from research documentation
- Prescription parser: turn text prescriptions into structured e-Rx objects with dosage validation
- Vital sign anomaly summarizer: generate alert reports with clinical priority rankings
- Lab result organizer: output LOINC-coded tables from diagnostic test narratives
- Medical device log summarizer: generate UDI incident files for regulatory reporting
- Patient feedback sentiment analyzer: feed quality-of-care KPIs from satisfaction surveys
- Clinical observation compiler: convert research notes into structured data for trials
Legal & Compliance
- Legal contract parser: extract clauses and compute risk scores from contract documents
- Court opinion digest: summarize judicial opinions into structured precedent and citation graphs
- Legal discovery summarizer: extract key issues and risks from large document sets
- Contract review summarizer: extract risk factors and key dates from legal contracts
- Policy impact assessor: convert policy proposals into stakeholder impact matrices
- Patent novelty comparator: produce claim-overlap matrices from prior art analysis
- Legal bill auditor: transform billing details into itemized expense and compliance reports
- Case strategy brainstormer: summarize likely arguments from litigation documentation
- Legal email analyzer: extract key issues and deadlines from email threads for review
- Expert witness report normalizer: create citation-linked outlines from testimony records
Customer Experience & Sales
- Tier-1 support chatbot: convert customer queries into tickets with reproducible troubleshooting steps
- Review sentiment miner: produce product-feature tallies from customer feedback analysis
- Churn risk email summarizer: export CRM risk scores from customer communication patterns
- Omnichannel conversation unifier: generate customer journey maps from multi-platform interactions
- Dynamic FAQ builder: structure knowledge base content from community forum discussions
- Proposal auto-grader: output RFP compliance matrices with scoring rubrics
- Upsell opportunity extractor: create lead-scoring JSON from customer interaction analysis
- Social media crisis detector: feed escalation playbooks with brand sentiment monitoring
- Multilingual intent router: tag customer chats to appropriate support queues by language/topic
- Marketing copy generator: create brand-compliant content with tone and messaging constraints
HR & People Operations
- CV-to-JD matcher: rank candidates with explainable competency scores and fit analysis
- Interview transcript summarizer: export structured competency rubrics with evaluation criteria
- Onboarding policy compliance checker: produce new-hire checklist completion tracking
- Performance review sentiment analyzer: create growth-plan JSON with development recommendations
- Payroll inquiry classifier: map employee emails to structured case codes for HR processing
- Benefits Q&A automation: generate eligibility responses from policy documentation
- Exit interview insight extractor: feed retention dashboards with structured departure analytics
- Training content gap mapper: align job roles to skill taxonomies for learning programs
- Workplace incident processor: convert safety reports into OSHA 301 compliance records
- Diversity metric synthesizer: summarize inclusion survey data into actionable insights
Supply Chain & Manufacturing
- Demand forecast summarizer: output SKU-level predictions from market analysis and sales data
- Purchase order processor: convert supplier communications into structured ERP line-items
- Supplier risk scanner: generate ESG compliance scores from vendor assessment reports
- Predictive maintenance log analyst: produce work orders from equipment telemetry narratives
- Logistics delay explainer: return route-change suggestions from transportation disruption reports
- Circular economy return classifier: create refurbishment tags from product return descriptions
- Carbon footprint calculator: map transport legs to CO₂e emissions for sustainability reporting
- Safety stock alert generator: output inventory triggers with lead-time assumptions
- Regulatory import/export harmonizer: produce HS-code sheets from trade documentation
- Production yield analyzer: generate efficiency reports from manufacturing floor logs
Security & Risk Management
- MITRE ATT&CK mapper: translate IDS alerts into tactic-technique JSON for threat intelligence
- Phishing email extractor: produce IOC STIX bundles from security incident reports
- Zero-trust policy generator: convert narrative access requests into structured policy rules
- SOC alert deduplicator: cluster security tickets by kill-chain stage for efficient triage
- Red team debrief summarizer: export OWASP Top-10 gaps from penetration test reports
- Data breach notifier: craft GDPR-compliant disclosure packets with timeline and impact data
- Threat intel feed normalizer: convert mixed security PDFs into MISP threat objects
- Secret leak scanner: output GitHub code-owner mentions from repository security scans
- Vendor risk questionnaire scorer: generate SIG Lite security assessment answers
- Security audit tracker: link ISO-27001 controls to evidence artifacts for compliance
Knowledge & Productivity
- Meeting transcript processor: extract action items with owners and deadlines into project tracking JSON
- Research paper summarizer: export citation graphs and key findings for literature review databases
- SOP generator: convert process narratives into step-by-step validation checklists
- Code diff summarizer: generate reviewer hints and impact analysis from version control changes
- API changelog analyzer: produce backward-compatibility scorecards for development teams
- Mind map creator: structure brainstorming sessions into hierarchical knowledge trees
- Knowledge base gap detector: suggest article stubs from frequently asked questions analysis
- Personal OKR journal parser: output progress dashboards with milestone tracking
- White paper composer: transform technical discussions into structured thought leadership content
- Universal transformer: convert any unstructured domain knowledge into your custom schema-validated JSON
License
MIT License - See LICENSE file for details. Copyright (c) 2025, Benjamin Linnik.
Support
Found a bug? Have a question? Need help?
GitHub is your primary destination for all support:
- Issues & Bug Reports: Create an issue
- Documentation: Browse examples
- Source Code: View source
Before opening an issue, please:
- Check the examples directory for solutions
- Review the error logs (beautiful output with Rich!)
- Validate your Azure authentication and permissions
- Ensure your input JSON follows the required format
- Search existing issues for similar problems
Quick Links:
- Getting Started Guide
- Complete Examples
- CI/CD Integration
- Use Cases
Ready to embrace the AI-First future? Start with this toolkit and build your path to exponential productivity. Learn more about the AI-First DevOps revolution in Building AI-First DevOps.