Artisan Engine

A production-grade, OpenAI-compatible API layer for local LLMs with guaranteed structured output.

License: Apache 2.0

Mission

The goal of Artisan Engine is to bridge the last-mile gap between powerful open-source models and the developers who want to use them. It provides the elegant developer experience of a cloud API with the security and control of local infrastructure, making it simple to build production-grade AI applications on your own terms.


Project Status & Roadmap

Artisan Engine is currently in its initial v0.1.0 release. The core focus of this version is to deliver a rock-solid, OpenAI-compatible endpoint for guaranteed structured output.

Our future roadmap is focused on building a complete, stateful application platform:

  • Full Function Calling / Tool Use: Complete orchestration for multi-step agentic workflows.
  • The Assistants API: A stateful, persistent API for managing long-running conversations with memory.
  • Integrated RAG: Seamlessly connect your private documents to your local models.
  • Expanded Backend Support: Official adapters for Ollama, vLLM, and other popular model servers.

We are actively looking for contributors to help us build this future. See the "Contributing" section below!


Key Features

  • Guaranteed Structured Output: Don't just prompt for JSON, enforce it. Artisan uses grammar-based sampling to guarantee that the model's output will always be a syntactically correct JSON object that validates against your Pydantic schema.
  • OpenAI Compatibility: Use the official openai client library you already know. Just change the base_url, and your existing code works.
  • One-Command Deploy: A single docker-compose up command downloads the model and starts the server.
  • Language Agnostic: Any service that can make an HTTP request (Node.js, Go, Rust, Java, etc.) can use Artisan Engine.
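
As a rough illustration of the language-agnostic point, the sketch below builds the request using only the Python standard library, with no OpenAI client involved. The model name local-llm and the response_format shape are copied from the Quick Start script in this README. The helper names build_chat_request and send are hypothetical, and the base URL assumes the default Docker Compose setup.

```python
import json
import urllib.request

# Assumption: the server from the Quick Start is listening on this address.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(prompt: str, schema: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": "local-llm",
        "messages": [{"role": "user", "content": prompt}],
        # Same response_format shape as the Quick Start script.
        "response_format": {"type": "json_object", "json_schema": schema},
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON response body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# A hand-written JSON Schema equivalent to a two-field profile (name, age).
profile_schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}

request = build_chat_request("Extract data for John Doe, who is 42 years old.", profile_schema)
print(request.get_method(), request.full_url)
```

With the server running, send(request)["choices"][0]["message"]["content"] should hold the JSON string, assuming the response follows the usual OpenAI chat completion layout. The same request body can be sent from any other language's HTTP client.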

Quick Start (with Docker Compose)

Get the entire engine running with a single command. This is the easiest and recommended way to get started.

Prerequisites:

  • Docker and Docker Compose installed.
  • Git installed.

1. Clone the repository:

git clone https://github.com/aafre/artisan-engine.git
cd artisan-engine

2. Start the services. This single command takes care of everything:

  • Builds the Artisan Engine image.
  • Downloads a default LLM model (Llama-3.1-8B-Instruct) if you don't already have it.
  • Starts the Artisan API server.

docker-compose up -d

Note: The first run may take several minutes while the multi-gigabyte model file downloads. Subsequent runs start much faster because the model is cached in a Docker volume.

The server will be available at http://localhost:8000.
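
Because the first start can take several minutes, a script that depends on the server may want to wait until it is ready. The following is a minimal sketch of such a wait loop that polls the /health endpoint listed under Endpoints. It assumes that /health answers with HTTP 200 once the service is ready, which this README does not state explicitly; the function names probe_health and wait_until_ready are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def probe_health(url: str = "http://localhost:8000/health", timeout: float = 5.0) -> bool:
    """Return True if a GET on the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready yet.
        return False


def wait_until_ready(
    probe: Callable[[], bool] = probe_health,
    attempts: int = 60,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling wait_until_ready() with the defaults polls for up to roughly ten minutes. The probe and sleep parameters are injectable so the loop can be exercised without a running server.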

3. Test with Python (OpenAI Client)

Once the server is running, you can verify everything is working with this script.

First, install the openai and pydantic libraries: pip install openai pydantic

import openai
from pydantic import BaseModel, Field

# 1. Define your desired Pydantic schema
class UserProfile(BaseModel):
    name: str = Field(description="The user's full name")
    age: int = Field(description="The user's age in years")

# 2. Point the OpenAI client to your local Artisan server
client = openai.OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed"
)

# 3. Make the API call with the schema
response = client.chat.completions.create(
    model="local-llm",
    messages=[
        {"role": "user", "content": "Extract data for John Doe, who is 42 years old."}
    ],
    response_format={
        "type": "json_object",
        "json_schema": UserProfile.model_json_schema()
    }
)

# 4. The result is a guaranteed valid JSON string
json_response = response.choices[0].message.content
print("Raw JSON from server:", json_response)

# 5. You can load it directly into your Pydantic model
user = UserProfile.model_validate_json(json_response)
print(f"\nSuccessfully validated object: {user}")

Usage Examples

The examples/ directory in this repository contains more runnable scripts that demonstrate how to use the Artisan Engine for common tasks.


Configuration

Artisan Engine is configured via environment variables. The easiest way to configure the docker-compose setup is to edit the environment section for the artisan-engine service directly in the docker-compose.yml file.

For a full list of configuration options, please see the .env.example file.


Endpoints

  • /docs: Interactive API documentation (Swagger UI).
  • /health: Health check for the service and model.
  • /models: Lists the available models (OpenAI-compatible).
  • /v1/chat/completions: The OpenAI-compatible endpoint for structured and unstructured chat.
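
As a small sketch of how the /models endpoint might be consumed, the code below extracts model identifiers from a response. It assumes the body follows the usual OpenAI list layout (a "data" array of objects with an "id" field), which is inferred from the "OpenAI-compatible" label rather than confirmed here. The function names and the sample payload are hypothetical.

```python
import json
import urllib.request


def parse_model_ids(payload: dict) -> list[str]:
    """Pull the model identifiers out of an OpenAI-style model list."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(base_url: str = "http://localhost:8000", timeout: float = 10.0) -> list[str]:
    """Fetch /models from a running server and return the model identifiers."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))


# Offline demonstration with a made-up payload in the assumed layout.
sample = {"object": "list", "data": [{"id": "local-llm", "object": "model"}]}
print(parse_model_ids(sample))
```

Against a running server, list_models() performs the actual request.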

Powered By

Artisan Engine stands on the shoulders of giants. Our core functionality is made possible by these fantastic open-source projects:

  • Outlines: For the state-of-the-art, grammar-based generation that guarantees our structured output.
  • llama-cpp-python: For high-performance inference of GGUF models on local hardware.
  • FastAPI: For building our robust and modern API.

Contributing

Contributions are welcome and essential for making Artisan Engine the best tool for local AI development! We have several issues flagged as good first issue that are perfect for getting started. Please see the issues tab to get involved.
