
TransAI

AI library and helpers (Python/Poetry/Typer - LM Studio & llama.cpp).

  • Primary use case: Python API/interface with local AI models
  • Works with: local AI models via LM Studio or llama.cpp
  • Status: stable
  • License: Apache-2.0

Since version 1.0.0 it has been published as a PyPI package: https://pypi.org/project/transai/

License

Copyright 2025 Daniel Balparda balparda@github.com

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Third-party notices

This project depends on third-party software; see pyproject.toml for the full list of runtime dependencies.

Installation

To use in your project:

pip3 install transai

and then import the library:

from transai.core import ai, lms, llama
from transai.utils import images

For the CLI tool, after installation just run:

transai --help

Supported platforms

  • OS: Linux, macOS, Windows (wherever llama-cpp-python and lmstudio are supported)
  • Architectures: x86_64, arm64
  • Python: 3.12+

Known dependencies (Prerequisites)

What TransAI is

TransAI is a Python library and CLI tool that provides a unified interface for running local AI models through two backends:

  • LM Studio (LMStudioWorker): connects to a running LM Studio server on localhost via the lmstudio client library. This is the recommended and default backend.
  • llama.cpp (LlamaWorker): loads GGUF model files directly into memory using llama-cpp-python. Useful when you want full control without running an LM Studio server.

Both backends share the same abstract interface (AIWorker), so you can swap backends without changing your application code. Models can be queried with plain text prompts or with structured output (Pydantic models), vision models can process images, and tool-capable models can call Python functions.
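Because of the shared interface, application code can be written against AIWorker and remain backend-agnostic. A minimal self-contained sketch of the idea (the names mirror this README; the real AIWorker lives in transai.core.ai, so this is illustrative, not the library's code):

```python
import abc

class AIWorker(abc.ABC):
  """Abstract interface shared by all backends (sketch)."""

  @abc.abstractmethod
  def ModelCall(self, *, model_id: str, system_prompt: str,
                user_prompt: str, output_format: type) -> object:
    """Query the loaded model."""

def capital_of(worker: AIWorker, country: str) -> object:
  # Identical call whether `worker` is an LMStudioWorker or a LlamaWorker.
  return worker.ModelCall(
      model_id='qwen3-8b@Q8_0',
      system_prompt='You are a helpful assistant.',
      user_prompt=f'What is the capital of {country}?',
      output_format=str)
```

Swapping backends then means constructing a different worker; `capital_of()` never changes.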

What TransAI is not

  • Not a cloud AI service — it only works with local models
  • Not a model downloader — you must have models available locally (via LM Studio or as GGUF files)
  • Not a training framework — inference only
  • Not a high-level agent framework — it provides the low-level model interface layer

Key concepts and terminology

  • AIWorker: abstract base class defining the interface for loading and querying AI models
  • LMStudioWorker: concrete worker that connects to a local LM Studio server
  • LlamaWorker: concrete worker that loads GGUF files directly via llama.cpp
  • AIModelConfig: TypedDict with all model loading parameters (context, temperature, GPU, seed, etc.)
  • Model ID: a string identifying the model, typically in the format model-name@quantization (e.g., qwen3-8b@Q8_0); should match what you would use with lms get <model_id> or https://huggingface.co/<model_id>
  • GGUF: the quantized model file format used by llama.cpp
  • CLIP projector: a companion model file enabling vision capabilities in multi-modal models
  • Speculative decoding: a technique for faster inference by generating multiple tokens in parallel

Known limitations

  • LM Studio backend requires a running LM Studio server on localhost (127.0.0.1)
  • llama.cpp backend requires GGUF model files on disk
  • Vision support in llama.cpp depends on CLIP projector file availability and supported architectures (Qwen2-VL, MiniCPM, Llama3-Vision, Moondream, NanoLLava, Obsidian, Llava)
  • No telemetry, no network calls beyond localhost (LM Studio server)

Library API usage

Loading a model

transai.core.ai exposes a convenience constructor MakeAIModelConfig(**overrides) which returns a fully-populated AIModelConfig TypedDict with sensible defaults.

from transai.core import ai, lms, llama

# --- Using LM Studio ---
with lms.LMStudioWorker() as worker:
  config, metadata = worker.LoadModel(ai.MakeAIModelConfig(
    model_id='qwen3-vl-32b-instruct@Q8_0',
    vision=True,
    temperature=0.5,  # only override the ones you care about!
    # all other fields will have sensible defaults; currently also supported are:
    # seed, context, gpu_ratio, gpu_layers, use_mmap, fp16, flash, spec_tokens, kv_cache
  ))
  # ... use worker.ModelCall() ...

# --- Using llama.cpp ---
import pathlib
with llama.LlamaWorker(pathlib.Path('~/.lmstudio/models/')) as worker:
  config, metadata = worker.LoadModel(ai.MakeAIModelConfig(
    model_id='qwen3-8b@Q8_0',
    # ... same config field possibilities ...
  ))
  # ... use worker.ModelCall() ...

Querying a model (text)

response: str = worker.ModelCall(
  model_id='qwen3-8b@Q8_0',
  system_prompt='You are a helpful assistant.',
  user_prompt='What is the capital of France?',
  output_format=str,
)
print(response)  # "The capital of France is Paris."

Querying a model (structured JSON)

To get a structured object back from the model, create a pydantic.BaseModel class as shown below. Make sure to add docstrings and a pydantic.Field description to each field, as all of that information (names, types, descriptions) is sent to the model.

import pydantic

class CityInfo(pydantic.BaseModel):
  """City information"""

  city: str = pydantic.Field(description='city name')
  country: str = pydantic.Field(description='country name')
  population: int = pydantic.Field(description='city population')
  districts: list[str] = pydantic.Field(description='list of city district names')

result: CityInfo = worker.ModelCall(
  model_id='qwen3-8b@Q8_0',
  system_prompt='Extract city information: country, population, and list of districts.',
  user_prompt='Tell me about Paris, France.',
  output_format=CityInfo,
)
print(result.city)        # "Paris"
print(result.population)  # 2161000

Vision models (images)

import pathlib

response: str = worker.ModelCall(
  model_id='qwen3-vl-32b-instruct@Q8_0',
  system_prompt='Describe what you see.',
  user_prompt='What is in this image?',
  output_format=str,
  images=[pathlib.Path('photo.jpg')],  # or raw bytes, or file path string
)

Images are automatically resized to fit within 1024px (longest edge) before being sent to the model.

Tool use (function calling)

Pass Python callables (or fully-qualified dotted names) as tools. The model may invoke them during the conversation and TransAI handles the execution round-trip automatically:

import math

def celsius_to_fahrenheit(celsius: float) -> float:
  """Convert Celsius to Fahrenheit.

  Args:
    celsius: temperature in Celsius

  Returns:
    temperature in Fahrenheit

  """
  return celsius * 9 / 5 + 32

# tools must be a list of callables; the model may call them zero or more times
response: str = worker.ModelCall(
  model_id='qwen3-8b@Q8_0',
  system_prompt='You are a helpful assistant.',
  user_prompt='What is 23°C in Fahrenheit? Also, what is the GCD of 48 and 36?',
  output_format=str,
  tools=[celsius_to_fahrenheit, math.gcd],
)

Image utilities

The transai.utils.images module provides helpers for image preprocessing:

from transai.utils import images

# Resize an image for vision models (max 1024px, returns PNG bytes)
png_bytes: bytes = images.ResizeImageForVision(raw_image_bytes)

# Extract frames from an animated image (GIF, APNG, etc.)
for frame_png in images.AnimationFrames(animated_gif_bytes):
  # each frame is PNG bytes, resized to max 336px
  pass

AI Guide

Model suggestions as of April 2026. Just an opinion, not to be taken too seriously; do your own tests.

Vision Models

These models can process images.

| Model | Flag Value | Size | Type | Tool? | Reason? | Comment |
|---|---|---|---|---|---|---|
| qwen3-vl-32b-instruct@Q8_0 | | 36GB | llm/qwen3vl/GGUF | Y | | Very good, slow. |
| qwen3-vl-32b-instruct@F16 | --fp16 | 67GB | llm/qwen3vl/GGUF | Y | - | Very good, slow. Q8_0 version is faster-ish and still very good. |
| qwen3.5-35b-a3b@Q8_0 * | | 38GB | llm/qwen35moe/GGUF | Y | Y | Decent, slow. |
| zai-org/glm-4.6v-flash@8bit * | | 12GB | llm/glm4v/MLX | Y | Y | Decent, slow. |

Blind Models

These models cannot process images (blind).

| Model | Flag Value | Size | Type | Tool? | Reason? | Comment |
|---|---|---|---|---|---|---|
| qwen3-8b@Q8_0 | | 8.7GB | llm/qwen3/GGUF | Y | | Good, medium-speed. |
| gpt-oss-20b@MXFP4 * | | 12GB | llm/gpt_oss/MLX | Y | Y | Poor, slow. |
| zai-org/glm-4.7-flash@8bit * | | 32GB | llm/glm4v/MLX | Y | Y | Good, inconsistent. |

CLI Interface

Quick start

Query a local AI model via LM Studio (server must be running):

transai query "What is the capital of France?"

Query using the llama.cpp backend (direct GGUF loading, no server needed):

transai --no-lms --root ~/.lmstudio/models/ query "Give me an onion soup recipe."

Query with tool use (pass fully-qualified Python callable names; model calls them automatically):

transai query --tools math.gcd --tools os.getcwd "What is the GCD of 48 and 36? Also what is my current directory?"

Global flags

| Flag | Description | Default |
|---|---|---|
| --help | Show help | off |
| --version | Show version and exit | off |
| -v, -vv, -vvv, --verbose | Verbosity (nothing=ERROR, -v=WARNING, -vv=INFO, -vvv=DEBUG) | ERROR |
| --color/--no-color | Force enable/disable colored output (respects NO_COLOR env var if not provided) | --color |
| -r, --root | Local models root directory (only needed for --no-lms) | LM Studio default if it exists |
| --lms/--no-lms | Use LM Studio backend vs llama.cpp backend | --lms |
| -m, --model | Model to load (e.g., qwen3-8b@Q8_0) | qwen3-8b@Q8_0 |
| -t, --tokens | Speculative decoding tokens (2-200) | disabled |
| -s, --seed | Random seed for reproducibility | random |
| --context | Max context tokens (16-16777216) | 32768 |
| -x, --temperature | Sampling temperature (0.0-2.0) | 0.15 |
| -g, --gpu | GPU ratio (0.1-1.0) | 0.80 |
| --gpu-layers | GPU layers to offload (-1 = as many as possible) | -1 |
| --fp16/--no-fp16 | FP16 precision mode | --no-fp16 |
| --mmap/--no-mmap | Memory-mapped file loading | --mmap |
| --flash/--no-flash | Flash attention | --flash |
| --kv-cache | KV-cache precision type (GGML type, 4-128) | model default |

CLI Commands Documentation

This project auto-generates markdown docs for the CLI (transai.md, produced by make docs or make ci).

Color and formatting

Rich provides color output in logging and CLI output. The app:

  • Respects NO_COLOR environment variable
  • Has --no-color / --color flag: if given, overrides the NO_COLOR environment variable
  • If there is no environment variable and no flag is given, defaults to having color

To control color see Rich's markup conventions.

Project Design

Architecture overview

TransAI uses an abstract base class pattern for backend abstraction:

CLI (transai.py + cli/query.py)
  │
  ├─ LMStudioWorker (core/lms.py)  ──▶  LM Studio server (localhost)
  │
  └─ LlamaWorker (core/llama.py)   ──▶  GGUF files on disk

Both workers implement AIWorker (core/ai.py)
  │
  └─ Image utilities (utils/images.py)

  • AIWorker defines LoadModel() and ModelCall() as the public interface
  • LMStudioWorker and LlamaWorker implement _Load() and _Call() internally
  • The CLI layer (transai.py, cli/query.py) orchestrates configuration and delegates to workers
  • Image preprocessing is handled by utils/images.py
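The public LoadModel()/ModelCall() methods delegating to internal _Load()/_Call() is the classic template-method pattern. An illustrative, self-contained sketch of the assumed shape (the actual code lives in core/ai.py and may differ in detail):

```python
import abc

class AIWorker(abc.ABC):
  """Sketch of the abstract base: shared logic up front, backend hooks below."""

  def LoadModel(self, config: dict) -> tuple:
    # public entry point: shared validation, then backend-specific loading
    if 'model_id' not in config:
      raise KeyError('model_id')
    return self._Load(config)

  @abc.abstractmethod
  def _Load(self, config: dict) -> tuple:
    """Backend-specific loading (LM Studio server vs GGUF file on disk)."""
```

Each concrete worker then only has to supply the private hooks; validation and orchestration stay in one place.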

Modules

| Module | Responsibility |
|---|---|
| transai.py | CLI app definition, global options, TransAIConfig dataclass |
| cli/query.py | query command implementation |
| core/ai.py | AIWorker abstract base class, AIModelConfig, shared constants and types |
| core/lms.py | LMStudioWorker — LM Studio backend implementation |
| core/llama.py | LlamaWorker — llama.cpp backend implementation (GGUF loading, CLIP detection, vision handlers) |
| utils/images.py | Image resizing for vision models, animation frame extraction |

Development Instructions

File structure

.
├── CHANGELOG.md                  ⟸ latest changes/releases
├── LICENSE
├── Makefile
├── transai.md                    ⟸ auto-generated CLI doc (by `make docs` or `make ci`)
├── poetry.lock                   ⟸ maintained by Poetry, do not manually edit
├── pyproject.toml                ⟸ most important configurations live here
├── README.md                     ⟸ this documentation
├── SECURITY.md                   ⟸ security policy
├── requirements.txt
├── .pre-commit-config.yaml       ⟸ pre-submit configs
├── .github/
│   ├── copilot-instructions.md   ⟸ GitHub Copilot project-specific instructions
│   ├── dependabot.yaml           ⟸ Github dependency update pipeline
│   └── workflows/
│       ├── ci.yaml               ⟸ Github CI pipeline
│       └── codeql.yaml           ⟸ Github security scans and code quality pipeline
├── .vscode/
│   └── settings.json             ⟸ VSCode configs
├── scripts/
│   └── make_test_images.py       ⟸ helper script for generating test images
├── src/
│   └── transai/
│       ├── __init__.py           ⟸ version and package metadata
│       ├── __main__.py           ⟸ `python -m transai` entry point
│       ├── transai.py            ⟸ main CLI app entry point (Run(), Main())
│       ├── py.typed              ⟸ PEP 561 marker for type stubs
│       ├── cli/
│       │   └── query.py          ⟸ `transai query` command implementation
│       ├── core/
│       │   ├── ai.py             ⟸ AIWorker abstract base class, AIModelConfig, shared types
│       │   ├── llama.py          ⟸ LlamaWorker (llama.cpp backend)
│       │   └── lms.py            ⟸ LMStudioWorker (LM Studio backend)
│       └── utils/
│           └── images.py         ⟸ image preprocessing for vision models
├── tests/                        ⟸ unit tests
│   ├── transai_test.py
│   ├── cli/
│   │   └── query_test.py
│   ├── core/
│   │   ├── ai_test.py
│   │   ├── llama_test.py
│   │   └── lms_test.py
│   └── utils/
│       └── images_test.py
└── tests_integration/
    └── test_installed_cli.py     ⟸ integration tests (wheel build + install)

Development Setup

Install Python

On Linux:

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install git python3 python3-dev python3-venv build-essential software-properties-common

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python3.12

On Mac:

brew update
brew upgrade
brew cleanup -s

brew install git python@3.12

Install Poetry (recommended: pipx)

Poetry reference.

Install pipx (if you don't have it):

python3 -m pip install --user pipx
python3 -m pipx ensurepath

If you previously had Poetry installed, but not through pipx, remove it first: brew uninstall poetry (Mac) / sudo apt-get remove python3-poetry (Linux). Install Poetry with pipx and configure it to create .venv/ locally; this keeps Poetry isolated from project virtual environments, and the Python used for the environments is isolated from the Python used by Poetry. Do:

pipx install poetry
poetry --version

If you will use PyPI to publish:

poetry config pypi-token.pypi <TOKEN>  # add your personal PyPI project token, if any

Make sure .venv is local

This project expects a project-local virtual environment at ./.venv (VSCode settings assume it).

poetry config virtualenvs.in-project true

Get the repository

git clone https://github.com/balparda/transai.git transai
cd transai

Create environment and install dependencies

From the repository root:

poetry env use python3.12  # creates the .venv with the correct Python version
poetry sync                # sync env to project's poetry.lock file
poetry env info            # no-op: just to check that environment looks good
poetry check               # no-op: make sure all pyproject.toml fields are being used correctly

poetry run transai --help    # simple test if everything loaded OK
make ci                    # should pass OK on clean repo

To activate and use the environment do:

poetry env activate        # (optional) will print activation command for environment, but you can just use:
source .venv/bin/activate  # because .venv SHOULD BE LOCAL
...
pytest -vvv  # for example, or other commands you want to execute in-environment
...
deactivate  # to close environment

Optional: VSCode setup

This repo ships a .vscode/settings.json configured to:

  • use ./.venv/bin/python
  • run pytest
  • use Ruff as formatter
  • disable deprecated pylint/flake8 integrations
  • configure Google-style docstrings via autoDocstring
  • use Code Spell Checker

Recommended VSCode extensions:

  • Python (ms-python.python)
  • Python Environments (ms-python.vscode-python-envs)
  • Python Debugger (ms-python.debugpy)
  • Pylance (ms-python.vscode-pylance)
  • Mypy Type Checker (ms-python.mypy-type-checker)
  • Ruff (charliermarsh.ruff)
  • autoDocstring – Python Docstring Generator (njpwerner.autodocstring)
  • Code Spell Checker (streetsidesoftware.code-spell-checker)
  • markdownlint (davidanson.vscode-markdownlint)
  • Markdown All in One (yzhang.markdown-all-in-one) - helps maintain this README.md table of contents
  • Markdown Preview Enhanced (shd101wyy.markdown-preview-enhanced, optional)
  • GitHub Copilot (github.copilot) - AI assistant; reads .github/copilot-instructions.md for project-specific coding conventions (indentation, naming, workflow)

Testing

Unit tests / Coverage

make test               # plain test run, no integration tests
make integration        # run the integration tests
poetry run pytest -vvv  # verbose test run, includes integration tests

make cov  # coverage run, equivalent to: poetry run pytest --cov=src --cov-report=term-missing

A test can be marked with a "tag" by just adding a decorator:

@pytest.mark.slow
def test_foo_method() -> None:
  """Test."""
  ...

These tags are defined in pyproject.toml, in section [tool.pytest.ini_options.markers]:

| Tag | Meaning |
|---|---|
| slow | test is slow (> 1s) |
| flaky | AVOID! — test is known to be flaky |
| stochastic | test is capable of failing (even if very unlikely) |
| integration | integration test (wheel build + install) |

You can use them to filter tests:

poetry run pytest -vvv -m slow  # run only the slow tests

You can find the slowest tests by running:

poetry run pytest -q --durations=20  # lists the 20 slowest tests

You can search for flaky tests by running make flakes, which runs all tests 100 times.

Instrumenting your code

You can instrument your code to find bottlenecks:

$ source .venv/bin/activate
$ which transai
/path/to/.venv/bin/transai  # <== place this in the command below:
$ pyinstrument -r html -o output1.html -- /path/to/.venv/bin/transai <your-cli-command> <your-cli-flags>
$ deactivate

This will save a file output1.html in the project directory with timings for all method calls. Make sure to clean up these HTML files later.

Integration / e2e tests

Integration tests validate packaging and the installed console script by:

  • building a wheel from the repository
  • installing that wheel into a fresh temporary virtualenv
  • running the installed console script(s) to verify behavior (e.g., --version and basic commands)

The canonical integration test is tests_integration/test_installed_cli.py. Tests in this suite are marked with pytest.mark.integration.

Run the integration tests with:

make integration  # or: poetry run pytest -m integration -q

Linting / formatting / static analysis

make lint  # equivalent to: poetry run ruff check .
make fmt   # equivalent to: poetry run ruff format .

To check formatting without rewriting:

poetry run ruff format --check .

Type checking

make type  # equivalent to: poetry run mypy src tests tests_integration

(Pyright is primarily for editor-time; MyPy is what CI enforces.)

Versioning and releases

Versioning scheme

This project follows a pragmatic versioning approach:

  • Patch: bug fixes / docs / small improvements.
  • Minor: new features or non-breaking changes.
  • Major: breaking API changes.

See: CHANGELOG.md

Updating versions

Bump project version (patch/minor/major)

Poetry can bump versions:

# bump the version!
poetry version minor  # updates 1.0.0 to 1.1.0, for example
# or:
poetry version patch  # updates 1.0.0 to 1.0.1
# or:
poetry version <version-number>
# (also updates `pyproject.toml` and `poetry.lock`)

This updates [project].version in pyproject.toml. Remember to also update src/transai/__init__.py to match (this repo gets/prints __version__ from there)!

Update dependency versions

The project has a Dependabot config file in .github/dependabot.yaml that weekly (defaulting to Tuesdays) scans both GitHub Actions and the project dependencies and creates PRs to update them.

To update the poetry.lock file to more current versions, run poetry update: it will ignore the current lock, resolve newer versions, and rewrite the poetry.lock file. If you have cache problems, poetry cache clear PyPI --all will clean it.

To add a new dependency you should do:

poetry add "pkg>=1.2.3"  # regenerates lock, updates env (adds dep to prod code)
poetry add -G dev "pkg>=1.2.3"  # adds dep to dev code ("group" dev)
# also remember: "pkg@^1.2.3" = latest 1.* ; "pkg@~1.2.3" = latest 1.2.* ; "pkg@1.2.3" exact
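As a rough illustration of what those shorthands mean, here is a hypothetical matcher (not Poetry's actual resolver, and ignoring the 0.x special cases):

```python
def allowed(spec: str, version: str) -> bool:
  """Does `version` satisfy a caret/tilde/exact spec like '^1.2.3'?"""
  v = tuple(int(x) for x in version.split('.'))
  base_str = spec[1:] if spec[0] in '^~' else spec
  base = tuple(int(x) for x in base_str.split('.'))
  if spec.startswith('^'):   # same major, at least the base version
    return v[0] == base[0] and v >= base
  if spec.startswith('~'):   # same major.minor, at least the base version
    return v[:2] == base[:2] and v >= base
  return v == base           # exact match
```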

Keep tool versions aligned. Remember to check your diffs before submitting (especially poetry.lock) to avoid surprises!

Exporting the requirements.txt file

This project does not generate requirements.txt automatically (Poetry uses poetry.lock). If you need a requirements.txt for Docker/legacy tooling, use Poetry's export plugin (poetry-plugin-export) by simply running:

make req  # or: poetry export --format requirements.txt --without-hashes --output requirements.txt

CI and docs

Make sure to run make docs or, even better, make ci. Both will update the CLI markdown docs and requirements.txt automatically.

Git tag and commit

Publish to GIT, including a TAG:

git commit -a -m "release version 1.0.0"
git tag 1.0.0
git push
git push --tags

Publish to PyPI

If you already have your PyPI token registered with Poetry (see Install Poetry) then just:

poetry build
poetry publish

Remember to update CHANGELOG.md.

Security

Please refer to the security policy in SECURITY.md for supported versions and how to report vulnerabilities.

The project has a CodeQL config file in .github/workflows/codeql.yaml that weekly (defaulting to Fridays) scans the project for code quality and security issues; it also runs on all commits. GitHub security issues will be opened in the project if anything is found.
