A Two-Stage LLM Pipeline for generating optimized Dockerfiles

DockAI

The End of Manual Dockerfiles: Automated, Intelligent, Production-Ready.

DockAI is a robust, enterprise-grade Python CLI tool designed to intelligently analyze a software repository and generate a production-ready, optimized Dockerfile. It uses a novel two-stage LLM pipeline to first understand the project structure ("The Brain") and then architect the build environment ("The Architect").

💡 Why DockAI?

Automated Dockerfiles > Human Written > Cloud Native Buildpacks

DockAI represents the next evolution in containerization.

Better than Humans: Humans forget best practices, security patches, and layer optimizations. DockAI applies the same set of DevOps best practices to every build, targeting multi-stage builds, non-root users, and cache-friendly layer ordering each time.
Better than Buildpacks: Cloud Native Buildpacks are opaque "black boxes" that add bloat and are hard to debug. DockAI generates a transparent, standard Dockerfile that you can read, audit, and modify. You get the automation of buildpacks with the control of a handwritten file.

✨ Key Features

Zero-Config Automation: Developers no longer need to write a Dockerfile by hand. The GitHub Action regenerates an up-to-date Dockerfile each time the workflow runs.
Two-Stage Pipeline: Separates analysis (cheap/fast) from generation (smart/expensive) for cost-efficiency.
Intelligent Scanning: Uses pathspec to fully respect .gitignore and .dockerignore patterns (including wildcards like *.log or secret_*.json).
Robust & Reliable: Built-in automatic retries with exponential backoff for all AI API calls to handle network instability.
Observability: Structured logging with a --verbose mode for deep debugging and transparency.
Security First: Generates non-root, multi-stage builds by default.
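
The retry behaviour listed above can be pictured with a minimal sketch. This is not DockAI's actual code: the function name `call_with_retries`, its parameters, and the jitter factor are hypothetical, chosen only to show the shape of retrying with exponential backoff.

```python
import random
import time


def call_with_retries(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt (plus jitter) and retry.

    Hypothetical helper for illustration; DockAI's real retry logic may differ.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            # Out of attempts: re-raise the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            # A little random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, 0.1 * delay))
```

With the defaults, the waits between attempts are roughly 1 s, 2 s, and 4 s. The `sleep` parameter exists only so the delay can be replaced in tests.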

🧠 Architecture

The system operates in three distinct phases:

The Intelligent Scanner (scanner.py):
- Maps the entire repository file tree.
- Automatically filters out files based on .gitignore and .dockerignore using industry-standard wildcard matching.
Stage 1: The Brain (analyzer.py):
- Input: JSON list of file paths.
- Task: Identifies the technology stack (e.g., Python/Flask, Node/Express) and pinpoints the exact files needed for context (e.g., package.json, requirements.txt).
Stage 2: The Architect (generator.py):
- Input: Content of the critical files identified in Stage 1.
- Task: Writes a multi-stage, security-focused Dockerfile with version pinning and cache optimization.
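
The scanner phase described above can be approximated as follows. DockAI itself uses pathspec; this sketch deliberately swaps in the standard-library `fnmatch` module so that it runs without dependencies, which means it does not cover full gitignore semantics (no negation with `!`, no anchored patterns). The function names are hypothetical.

```python
import fnmatch
import os


def load_patterns(root, names=(".gitignore", ".dockerignore")):
    """Collect non-empty, non-comment lines from the ignore files in root."""
    patterns = []
    for name in names:
        path = os.path.join(root, name)
        if os.path.isfile(path):
            with open(path, encoding="utf-8") as fh:
                for line in fh:
                    line = line.strip()
                    if line and not line.startswith("#"):
                        # "node_modules/" and "node_modules" are treated alike here.
                        patterns.append(line.rstrip("/"))
    return patterns


def is_ignored(rel_path, patterns):
    """True if the whole path or any single path component matches a pattern."""
    parts = rel_path.split("/")
    return any(
        fnmatch.fnmatch(rel_path, pat) or any(fnmatch.fnmatch(p, pat) for p in parts)
        for pat in patterns
    )


def scan(root):
    """Return sorted relative file paths under root that are not ignored."""
    patterns = load_patterns(root) + [".git"]
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root).replace(os.sep, "/")
        rel_dir = "" if rel_dir == "." else rel_dir + "/"
        # Prune ignored directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if not is_ignored(rel_dir + d, patterns)]
        for name in filenames:
            rel = rel_dir + name
            if not is_ignored(rel, patterns):
                found.append(rel)
    return sorted(found)
```

The resulting list of relative paths is the kind of input Stage 1 receives: paths only, with no file contents.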

🚀 Getting Started

Prerequisites

Python 3.8+
An OpenAI API Key

Installation

From PyPI (Recommended):

```
pip install dockai-cli
```

From Source (Development):

Clone the repository:

```
git clone https://github.com/itzzjb/dockai.git
cd dockai
```

Install the package: You can install the tool locally using pip. We recommend installing in "editable" mode (-e) if you plan to modify the code.
```
pip install -e .
```
Configure Environment: Create a .env file in the root directory and add your OpenAI API key and model configurations:
```
OPENAI_API_KEY=sk-your-api-key-here
MODEL_ANALYZER=gpt-4o-mini
MODEL_GENERATOR=gpt-4o
```
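
As a rough illustration of how these three settings might be read, here is a dependency-free sketch. It is not taken from DockAI's source: the function names, the rule that process environment variables override the .env file, and the choice to treat all three keys as required are assumptions made for the example.

```python
import os

# The three keys shown in the .env example above.
REQUIRED = ("OPENAI_API_KEY", "MODEL_ANALYZER", "MODEL_GENERATOR")


def parse_env_file(path):
    """Parse KEY=VALUE lines, skipping blanks and # comments; strips matching quotes."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            value = value.strip()
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
                value = value[1:-1]
            values[key.strip()] = value
    return values


def load_settings(env_path=".env", environ=None):
    """Merge .env values with the environment (assumption: the environment wins)."""
    environ = os.environ if environ is None else environ
    merged = parse_env_file(env_path) if os.path.isfile(env_path) else {}
    merged.update({k: v for k, v in environ.items() if k in REQUIRED})
    missing = [k for k in REQUIRED if not merged.get(k)]
    if missing:
        raise RuntimeError("missing settings: " + ", ".join(missing))
    return {k: merged[k] for k in REQUIRED}
```

Failing early with a clear message when a key is missing is usually friendlier than letting the first API call fail.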

🤖 Usage as GitHub Action

You can use DockAI directly in a GitHub Actions workflow to generate a Dockerfile automatically, which keeps it in sync with your code without manual edits. The example below is triggered manually through workflow_dispatch; add a push trigger to the on: block if you want the Dockerfile regenerated on every push.

Example Workflow

Create a file .github/workflows/dockai.yml:

```
name: Generate Dockerfile
on:
  workflow_dispatch: # Allows manual triggering

jobs:
  generate:
    runs-on: ubuntu-latest
    permissions:
      contents: write # Needed to push the generated Dockerfile back
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Run DockAI
        uses: itzzjb/dockai@main
        with:
          openai_api_key: ${{ secrets.OPENAI_API_KEY }}
          model_analyzer: gpt-4o-mini
          model_generator: gpt-4o

      - name: Commit and Push Dockerfile
        run: |
          git config --global user.name "DockAI Bot"
          git config --global user.email "bot@dockai.com"
          git add Dockerfile
          git commit -m "ci: generate optimized Dockerfile via DockAI" || echo "No changes to commit"
          git push
```

💻 CLI Usage

Once installed, the dockai command is available globally in your terminal.

Run the tool by pointing it to the target repository path.

```
dockai /path/to/target/repo
```

Example (Current Directory):

```
dockai .
```

Verbose Mode (for debugging):

```
dockai . --verbose
```

What to Expect

The CLI uses a rich terminal interface to show progress:

Scanning: Locates files, respecting all ignore patterns.
Analyzing: "The Brain" decides what matters.
Reading: Only reads the content of critical files (privacy/token efficient).
Generating: "The Architect" builds the Dockerfile.
Result: A Dockerfile is saved to the target directory.
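
The steps above can be sketched as a single function. This is an illustrative outline rather than DockAI's implementation: `run_pipeline`, the `analyze` and `generate` callables (stand-ins for the two LLM calls), and the plan keys `stack` and `files_to_read` are all hypothetical names.

```python
import json
import os


def run_pipeline(root, file_paths, analyze, generate):
    """Two-stage flow: analyze(paths_json) -> plan, generate(stack, contents) -> text.

    analyze and generate are hypothetical stand-ins for the Stage 1 and
    Stage 2 LLM calls; the plan keys used here are assumptions.
    """
    # Stage 1 sees only the list of paths, never file contents.
    plan = analyze(json.dumps(file_paths))
    allowed = set(file_paths)
    contents = {}
    for rel in plan["files_to_read"]:
        # Skip any path the model returns that the scanner did not report.
        if rel in allowed:
            with open(os.path.join(root, rel), encoding="utf-8") as fh:
                contents[rel] = fh.read()
    # Stage 2 receives only the contents of the selected files.
    dockerfile = generate(plan["stack"], contents)
    with open(os.path.join(root, "Dockerfile"), "w", encoding="utf-8") as fh:
        fh.write(dockerfile)
    return dockerfile
```

Passing the two stages in as callables keeps the flow testable without network access, and reading only the files named by Stage 1 is what keeps token usage low.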

🎨 Custom Instructions

DockAI supports custom instructions to tailor the Dockerfile generation to your specific needs. You can provide instructions in natural language using two methods:

Method 1: Environment Variables

Set environment variables to provide instructions:

```
export DOCKAI_ANALYZER_INSTRUCTIONS="Always include package-lock.json if it exists"
export DOCKAI_GENERATOR_INSTRUCTIONS="Use port 8080 and install ffmpeg"
dockai .
```

Or in your .env file:

```
DOCKAI_ANALYZER_INSTRUCTIONS="Always include package-lock.json if it exists."
DOCKAI_GENERATOR_INSTRUCTIONS="Ensure all images are based on Alpine Linux."
```

Method 2: `.dockai` File

Create a .dockai file in your project root with section-based instructions:

```
# Instructions for the analyzer (file selection stage)
[analyzer]
Always include package-lock.json or yarn.lock if they exist.
Look for any .env.example files to understand environment variables.
Include docker-compose.yml if present.

# Instructions for the generator (Dockerfile creation stage)
[generator]
Ensure the container runs as a non-root user named 'appuser'.
Do not expose any ports other than 8080.
Install 'curl' and 'vim' for debugging purposes.
Set the timezone to 'UTC'.
Define an environment variable 'APP_ENV' with value 'production'.
```
Note: If you don't use sections ([analyzer] and [generator]), the instructions will be applied to both stages.
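
One way to picture the section rules is the small parser below. It is a sketch under stated assumptions, not DockAI's parser: the name `parse_dockai` is hypothetical, lines starting with `#` are assumed to be comments, unknown section names are dropped, and lines that appear before any section header are applied to both stages.

```python
SECTIONS = ("analyzer", "generator")


def parse_dockai(text):
    """Split .dockai text into {'analyzer': str, 'generator': str}."""
    buckets = {name: [] for name in SECTIONS}
    shared = []       # lines seen before any section header
    target = shared
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and # lines carry no instructions.
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1].strip().lower()
            # Assumption: lines under an unknown section go to a throwaway list.
            target = buckets.get(name, [])
            continue
        target.append(line)
    # Unsectioned lines apply to both stages, as the note above describes.
    return {name: "\n".join(shared + buckets[name]) for name in SECTIONS}
```

For a file with no section headers at all, both returned strings are identical, which matches the behaviour described in the note.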

Use Cases for Custom Instructions

Analyzer Instructions:

"Always include lock files (package-lock.json, yarn.lock, poetry.lock)"
"Look for configuration files in the config/ directory"
"Include any .proto files for gRPC services"

Generator Instructions:

"Use Alpine-based images only"
"Install system dependencies: ffmpeg, imagemagick, ghostscript"
"Expose port 3000 instead of the default"
"Add health check using curl to /health endpoint"
"Set NODE_ENV to production"
"Create a non-root user named 'nodeuser'"

GitHub Action with Custom Instructions

```
- name: Run DockAI
  uses: itzzjb/dockai@main
  with:
    openai_api_key: ${{ secrets.OPENAI_API_KEY }}
    model_analyzer: gpt-4o-mini
    model_generator: gpt-4o
    analyzer_instructions: "Always include yarn.lock if present"
    generator_instructions: "Use Alpine Linux and install curl"
```

🛠️ Development

Running Tests

This project uses pytest for testing. To run the test suite:

```
pytest
```

Project Structure

The project follows a modern src-layout:

src/dockai/: Source code package.
- main.py: The CLI orchestrator using typer and rich.
- scanner.py: Directory traversal logic with pathspec.
- analyzer.py: Interface for the Stage 1 LLM call (with retries).
- generator.py: Interface for the Stage 2 LLM call (with retries).
tests/: Unit and integration tests.
pyproject.toml: Build configuration and dependency management.

Project details

These details have not been verified by PyPI

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

This version: dockai-cli 1.0.0, released Nov 23, 2025

