Skip to main content

Generate conversational, tool-calling, structured-output, and preference datasets — easily and at scale

Project description

AfterImage

Tests Ruff format Ruff lint Documentation

AfterImage is a Python library and CLI for generating synthetic conversational datasets with modern LLMs (Gemini, OpenAI-compatible APIs, DeepSeek, and local OpenAI-compatible servers). It is built so you can start with a YAML file and one command, then compose callbacks, document providers, storage, evaluation, and export pipelines as your needs grow—from quick experiments to large, production-style runs.

Two ways to work (same engine)

1. CLI and config — easy to begin
Describe generation in YAML, set your API key in the environment, and run afterimage generate. No boilerplate, no custom harness required to get JSONL on disk. Optional commands cover export to fine-tuning formats and preference (DPO-style) pair generation.

2. Python API — composable and extensible
Use ConversationGenerator, StructuredGenerator, and PersonaGenerator with pluggable instruction generators, respondent prompt modifiers, stopping criteria, storage (JSONL or SQL), quality judges, and monitoring. The same abstractions power the CLI; you swap or combine pieces instead of forking the stack.

That split keeps onboarding shallow while leaving room for scale (concurrency, key pools, SQL storage) and specialized flows (RAG-style context, personas, structured extraction, preference data). Guides and API reference are on afterimage.altai.dev.


Installation

The package can be installed from PyPI as afterimage.

uv add afterimage
pip install afterimage

Optional extras (see pyproject.toml for exact dependency sets):

uv add "afterimage[embeddings-local]"
# or
pip install "afterimage[embeddings-local]"
Extra Purpose
embeddings-local Local embeddings (sentence-transformers) for process-based embedding providers, Qdrant-style workflows, and quality checks that need a local model.
server FastAPI app (afterimage-server entry point).
training Torch / TRL stack, Gradio, and FastMCP for examples/demo_ui and the training scripts under examples/.

Start in minutes (CLI)

Requires Python 3.11+ and an API key (e.g. GEMINI_API_KEY for the sample config).

afterimage generate -c examples/configs/basic.yaml

Dry-run the plan without calling the API:

afterimage generate -c examples/configs/basic.yaml --dry-run

Export a dataset to common fine-tuning formats:

afterimage export -i your_dataset.jsonl -f sharegpt -f messages
afterimage export --list-formats

Generate preference pairs from config:

afterimage preference -c your_config.yaml

More examples live under examples/configs/. In-depth guides (conversations, personas, structured generation, evaluation, export, preference data, local models) are on afterimage.altai.dev.


What you can build

  • Multi-turn synthetic chat for SFT, evaluation sets, or simulation.
  • Document-grounded questions and answers (instruction side + optional respondent context).
  • Persona-driven diversity tied to your corpus.
  • Structured outputs via Pydantic schemas (single-turn extraction or generation).
  • DPO / RLHF-style preference data with multiple variation strategies.
  • Quality loops (async judge, optional auto-improve) and observability (metrics, periodic alert checks, exports to JSON/CSV/Parquet).

Repository layout

Path Contents
docs/ Sphinx sources; mirrors and extends the hosted site.
examples/ YAML configs and demo flows.
DESIGN.md Architecture and design notes for contributors.
afterimage/ Library and CLI implementation.

Source & issues: github.com/altaidevorg/afterimage


License

Apache License 2.0 (see the LICENSE file and PyPI package metadata).

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

afterimage-0.14.3-py3-none-any.whl (168.3 kB view details)

Uploaded Python 3

File details

Details for the file afterimage-0.14.3-py3-none-any.whl.

File metadata

File hashes

Hashes for afterimage-0.14.3-py3-none-any.whl
Algorithm Hash digest
SHA256 89dbd67c7b2b723f14ceb40db9cddf2d65a609a4cce3dc7ee95b56a9d9db9af8
MD5 63f376da56a49816364f8a1d7685dcea
BLAKE2b-256 f2880aa19267f2196d09b81c792caf9802f499ea5a5d61f2b817e8c7dda47495

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page