
LightInfer

LightInfer is a lightweight, high-performance bridge for serving synchronous model inference code (PyTorch, TensorFlow, etc.) via an asynchronous FastAPI server.

It solves the "Blocking Loop" problem by efficiently isolating heavy computation in dedicated worker threads while maintaining a fully asynchronous, high-concurrency web frontend.
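The pattern is easy to see in miniature. This is not LightInfer's internal implementation, just a minimal sketch of the same idea using asyncio's standard thread offloading: the blocking call runs in a worker thread while the event loop stays free to serve other requests.

```python
import asyncio
import time

def blocking_infer(prompt: str) -> str:
    # Stands in for a heavy synchronous model call.
    time.sleep(0.1)
    return f"Hello, {prompt}!"

async def handle_request(prompt: str) -> str:
    # Offload the blocking call to a worker thread so the
    # event loop is never blocked by the computation.
    return await asyncio.to_thread(blocking_infer, prompt)

async def main() -> list[str]:
    # Two concurrent requests overlap their sync work in
    # separate threads instead of queuing behind each other.
    return await asyncio.gather(handle_request("A"), handle_request("B"))

results = asyncio.run(main())
```

LightInfer builds on the same principle but adds dedicated per-model worker threads and a result bridge, rather than a shared ad-hoc thread pool.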

Features

  • Zero-Blocking Architecture: Async Web Frontend + Sync Worker Threads.
  • Efficient Bridge: AsyncResponseBridge lets pending requests await results without each holding a thread.
  • Streaming Support:
    • Native Server-Sent Events (SSE) for text streaming.
    • Binary Streaming for audio/video generation (with chunk buffering).
  • Easy Integration: Wrap any Python class with an infer method.
  • Context Isolation: Each worker runs in its own thread, ensuring safety for libraries like PyTorch.

Installation

pip install lightinfer

Quick Start

1. Define your Model

LightInfer wraps any class with an infer method. The arguments to infer are automatically mapped from the JSON request.

import time

class MyModel:
    def infer(self, prompt: str = "world"):
        # Simulate heavy work
        time.sleep(1)
        return {"message": f"Hello, {prompt}!"}

2. Start the Server

from lightinfer.server import LightServer

# Create your model instance
model = MyModel()

# Start server (you can pass a list of models to run multiple worker threads)
server = LightServer([model])
server.start(port=8000)

3. Make Requests

Standard Request:

import requests

# 'args' in JSON maps to positional arguments of infer()
# 'kwargs' in JSON maps to keyword arguments of infer()
resp = requests.post("http://localhost:8000/api/v1/infer", 
                     json={"args": ["LightInfer"]})
print(resp.json())
# Output: {'message': 'Hello, LightInfer!'}
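The args/kwargs mapping can be illustrated locally. The `dispatch` helper below is hypothetical (a sketch of the JSON contract, not LightInfer's actual dispatcher): `"args"` becomes positional arguments and `"kwargs"` becomes keyword arguments of `infer`.

```python
class MyModel:
    def infer(self, prompt: str = "world"):
        return {"message": f"Hello, {prompt}!"}

def dispatch(model, body: dict):
    # Mirrors the request contract: "args" -> positional,
    # "kwargs" -> keyword arguments of infer().
    args = body.get("args", [])
    kwargs = body.get("kwargs", {})
    return model.infer(*args, **kwargs)

model = MyModel()
r1 = dispatch(model, {"args": ["LightInfer"]})
r2 = dispatch(model, {"kwargs": {"prompt": "world"}})
```

Both calls reach the same `infer` method; only the argument-binding style differs.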

Streaming Request:

If your model returns a generator, you can use streaming:

import time

class StreamingModel:
    def infer(self, prompt: str):
        yield "Part 1"
        time.sleep(0.5)
        yield "Part 2"

Client side:

resp = requests.post("http://localhost:8000/api/v1/infer", 
                     json={"args": ["test"], "stream": True}, stream=True)

for line in resp.iter_lines():
    if line:
        print(line.decode('utf-8'))
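Conceptually, the stream the client sees is just the generator's output: assuming the server forwards each yielded value as one response chunk, you can reason about the stream by iterating the generator directly.

```python
import time

class StreamingModel:
    def infer(self, prompt: str):
        yield "Part 1"
        time.sleep(0.05)  # shortened delay for the sketch
        yield "Part 2"

# Each yield becomes one chunk on the wire, in order.
chunks = list(StreamingModel().infer("test"))
```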

Examples

Check the examples/ directory for ready-to-run scenarios:

  • Simple LLM: Text-to-Text generation with SSE streaming.
  • Streaming TTS: Text-to-Audio generation with binary chunk streaming.

CLI Usage

You can serve any model class directly from the terminal.

Format: lightinfer <module>:<Class>

Given a file my_model.py:

class MyModel:
    def infer(self, prompt: str):
        return f"Echo: {prompt}"

Run:

lightinfer my_model:MyModel --port 8000 --workers 2
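A `<module>:<Class>` spec like this is typically resolved with a dynamic import. The `resolve` helper below is illustrative, not LightInfer's source; the example registers a throwaway `my_model` module so it is self-contained.

```python
import importlib
import sys
import types

def resolve(spec: str):
    # Split "module:Class" and import the class dynamically.
    module_name, _, class_name = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)

# Register a throwaway module standing in for my_model.py.
mod = types.ModuleType("my_model")
exec(
    "class MyModel:\n"
    "    def infer(self, prompt: str):\n"
    "        return f'Echo: {prompt}'",
    mod.__dict__,
)
sys.modules["my_model"] = mod

cls = resolve("my_model:MyModel")
out = cls().infer("hi")
```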

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT
