A super lightweight library for LLM-based applications


A lightweight Python library for building AI-powered applications with clean function calling, vision support, and structured outputs.


TinyLoop is fully built on top of LiteLLM, providing 100% compatibility with the LiteLLM API while adding powerful abstractions and utilities. This means you can use any model, provider, or feature that LiteLLM supports, including:

  • All LLM Providers: OpenAI, Anthropic, Google, Azure, Cohere, and 100+ more
  • All Model Types: Chat, completion, embedding, and vision models
  • Advanced Features: Streaming, function calling, structured outputs, and more
  • Ops Features: Retries, fallbacks, caching, and cost tracking

TinyLoop provides a clean, intuitive interface for working with Large Language Models (LLMs), featuring:

  • 🎯 Clean Function Calling: Convert Python functions to JSON tool definitions automatically
  • 👁️ Vision Support: Handle images and vision models seamlessly
  • 📊 Structured Output: Generate structured data from LLM responses using Pydantic
  • ⚡ Async Support: Full async/await support for all operations
  • 📈 Context Analysis (CTX): Monitor token usage and detect when conversations enter the "dumb zone"

📦 Installation

pip install tinyloop

🚀 Quick Start

Basic LLM Usage

Synchronous Calls

from tinyloop.inference.litellm import LLM

# Initialize the LLM
llm = LLM(model="openai/gpt-3.5-turbo", temperature=0.1)

# Simple text generation
response = llm(prompt="Hello, how are you?")
print(response)

# Get conversation history
history = llm.get_history()

# Access comprehensive response information
print(f"Response: {response}")
print(f"Cost: ${response.cost:.6f}")
print(f"Tool calls: {response.tool_calls}")
print(f"Raw response: {response.raw_response}")
print(f"Message history: {len(response.message_history)} messages")

Asynchronous Calls

from tinyloop.inference.litellm import LLM

llm = LLM(model="openai/gpt-3.5-turbo", temperature=0.1)

# Async text generation
response = await llm.acall(prompt="Hello, how are you?")
print(response)

Supported Features

🎯 Structured Output Generation

Generate structured data using Pydantic models:

from tinyloop.inference.litellm import LLM
from pydantic import BaseModel
from typing import List

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: List[str]

class EventsList(BaseModel):
    events: List[CalendarEvent]

# Initialize LLM with structured output
llm = LLM(
    model="openai/gpt-4.1-nano",
    temperature=0.1,
)

# Generate structured data
response = llm(
    prompt="List 5 important events in the 19th century",
    response_format=EventsList
)

# Access structured data
for event in response.events:
    print(f"{event.name} - {event.date}")
    print(f"Participants: {', '.join(event.participants)}")

๐Ÿ‘๏ธ Vision

Work with images using various input methods:

from tinyloop.inference.litellm import LLM
from tinyloop.features.vision import Image
from PIL import Image as PILImage

llm = LLM(model="openai/gpt-4.1-nano", temperature=0.1)

# From PIL Image
pil_image = PILImage.open("image.jpg")
image = Image.from_PIL(pil_image)

# From file path
image = Image.from_file("image.jpg")

# From URL
image = Image.from_url("https://example.com/image.jpg")

# Analyze image
response = llm(prompt="Describe this image", images=[image])
print(response)

🔧 Function Calling

Convert Python functions to LLM tools with automatic schema generation:

from tinyloop.inference.litellm import LLM
from tinyloop.features.function_calling import Tool
import json

def get_current_weather(location: str, unit: str):
    """Get the current weather in a given location

    Args:
        location: The city and state, e.g. San Francisco, CA
        unit: Temperature unit {'celsius', 'fahrenheit'}

    Returns:
        A sentence indicating the weather
    """
    if location == "Boston, MA":
        return "The weather is 12°F"
    return f"Weather in {location} is sunny"

# Create LLM instance
llm = LLM(model="openai/gpt-4.1-nano", temperature=0.1)

# Create tool from function
weather_tool = Tool(get_current_weather)

# Use function calling
inference = llm(
    prompt="What is the weather in Boston, MA?",
    tools=[weather_tool],
)

# Process tool calls
for tool_call in inference.raw_response.choices[0].message.tool_calls:
    tool_name = tool_call.function.name
    tool_args = json.loads(tool_call.function.arguments)
    print(f"Tool: {tool_name}")
    print(f"Args: {tool_args}")
    print(weather_tool(**tool_args))

# Access comprehensive response information
print(f"Total cost: ${inference.cost:.6f}")
print(f"Tool calls made: {len(inference.tool_calls) if inference.tool_calls else 0}")
print(f"Conversation length: {len(inference.message_history)} messages")
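
The docs don't show the JSON definition that `Tool` produces. For orientation, here is a hand-written sketch of the standard OpenAI function-calling schema that a function like `get_current_weather` typically maps to; the exact structure tinyloop emits may differ:

```python
import json

# Hand-written sketch of an OpenAI-style tool schema for get_current_weather.
# This is illustrative only, not the actual output of tinyloop's Tool wrapper.
weather_schema = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                },
            },
            "required": ["location", "unit"],
        },
    },
}

print(json.dumps(weather_schema, indent=2))
```

The schema is derived mechanically from the signature and docstring: parameter names and type hints become `properties`, and the docstring's first line becomes the `description`.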

๐Ÿ“ Generate Module

Simple text generation with a clean interface:

from tinyloop.modules.generate import Generate

# Synchronous generation
response = Generate.run(
    prompt="Write a haiku about programming",
    model="openai/gpt-3.5-turbo",
    temperature=0.7
)
print(response.response)

# Async generation
response = await Generate.arun(
    prompt="Explain quantum computing",
    model="openai/gpt-4",
    temperature=0.3
)
print(response.response)

# Using the class for multiple calls
generator = Generate(
    model="openai/gpt-3.5-turbo",
    temperature=0.5,
    system_prompt="You are a helpful coding assistant."
)

response1 = generator.call("How do I implement a binary search?")
response2 = generator.call("What's the time complexity?")

🎨 Prompt Rendering

Manage prompts with YAML templates and Jinja2:

from tinyloop.utils.prompt_renderer import PromptRenderer, render_base_prompts

# Using PromptRenderer class
renderer = PromptRenderer("prompts/chat.yaml")
system_prompt = renderer.render("system", user_name="Alice", context="coding")
user_prompt = renderer.render("user", question="How do I debug Python?")

Example YAML prompt file (prompts/chat.yaml):

system: |
  You are {{ user_name }}, a helpful AI assistant specializing in {{ context }}.
  Always provide clear, actionable advice.

user: |
  {{ user_name }}, I have a question: {{ question }}

  Please provide a detailed response with examples if relevant.
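
The templates are standard Jinja2, which also supports filters, conditionals, and loops. As a rough illustration of the core `{{ var }}` substitution, here is a minimal stdlib-only sketch; PromptRenderer itself uses Jinja2 and is not implemented this way:

```python
import re

def render_template(template: str, **variables) -> str:
    """Minimal illustration of {{ var }} substitution.

    Handles only plain variable references; unknown variables are
    left untouched. Real Jinja2 does far more than this.
    """
    def replace(match: re.Match) -> str:
        name = match.group(1)
        return str(variables.get(name, match.group(0)))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", replace, template)

system_template = "You are {{ user_name }}, a helpful AI assistant specializing in {{ context }}."
print(render_template(system_template, user_name="Alice", context="coding"))
```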

🌊 Streaming Responses

Get real-time responses as they're generated:

from tinyloop.inference.litellm import LLM

llm = LLM(model="openai/gpt-3.5-turbo", temperature=0.1)

# Stream responses
for chunk in llm.stream(prompt="Write a story about a robot"):
    print(chunk.response, end="", flush=True)

📈 Context Analysis (CTX)

Monitor token usage and detect when conversations enter the "dumb zone" - a region of the context window where model performance may degrade.

CLI Usage

# Full report with TUI tables
tinyloop ctx conversation.json

# Simple one-line status
tinyloop ctx -s conversation.json

# Pipe from another command
cat conversation.json | tinyloop ctx

# Custom threshold (30%) and context window
tinyloop ctx -t 0.3 -c 200000 conversation.json

# Specify model for accurate tokenization
tinyloop ctx -m anthropic/claude-sonnet-4-20250514 conversation.json
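
The exact file schema the CLI accepts isn't documented here, but a plausible conversation.json is an OpenAI-style list of role/content messages, the same shape used in the programmatic API example. A quick way to produce one:

```python
import json

# Build a minimal OpenAI-style message list and write it out as the
# conversation.json input for `tinyloop ctx`. The CLI's exact accepted
# schema may differ; this mirrors the message format used elsewhere
# in these docs.
conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there! How can I help?"},
]

with open("conversation.json", "w") as f:
    json.dump(conversation, f, indent=2)
```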

Programmatic API

from tinyloop.ctx import CTXAnalyzer, analyze_conversation, get_status

# Quick status check
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there! How can I help?"},
]

status = get_status(messages, model="gpt-4", threshold=0.4)
print(status.message)
# Output: ✓ 45 / 67,200 tokens (0.0%) — 67,155 tokens until dumb zone

# Full analysis
analyzer = CTXAnalyzer(
    model="anthropic/claude-sonnet-4-20250514",
    context_window=168000,
    threshold=0.4
)
result = analyzer.analyze(messages)

print(f"Total tokens: {result.total_tokens}")
print(f"In dumb zone: {result.is_in_dumb_zone}")
print(f"Categories: {result.categories}")
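
The dumb-zone boundary is simple arithmetic on the configured window: with context_window=168000 and threshold=0.4 the zone begins at 67,200 tokens, which is the denominator in the status line above. A sketch of that calculation (not tinyloop's actual implementation):

```python
def dumb_zone_boundary(context_window: int, threshold: float) -> int:
    """Token count at which the dumb zone is assumed to begin."""
    return int(context_window * threshold)

def tokens_until_dumb_zone(total_tokens: int, context_window: int, threshold: float) -> int:
    """Remaining headroom before the conversation crosses the boundary."""
    return max(dumb_zone_boundary(context_window, threshold) - total_tokens, 0)

boundary = dumb_zone_boundary(168000, 0.4)      # 67200
remaining = tokens_until_dumb_zone(45, 168000, 0.4)  # 67155
print(boundary, remaining)
```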

LLM Integration with Middleware

from tinyloop import LLM
from tinyloop.ctx import CTXMiddleware, CTXThresholdExceeded

# Create middleware with warning action
ctx = CTXMiddleware(
    context_window=168000,
    threshold=0.4,
    action="warn"  # or "raise" to throw exception
)

# Use with LLM
llm = LLM(model="anthropic/claude-sonnet-4-20250514")
llm(prompt="Hello!")
llm(prompt="Tell me about Python")

# Check status at any point
status = ctx.check(llm)
print(status.message)

if status.is_in_dumb_zone:
    print("Warning: Consider summarizing the conversation")

# Or use raise action to stop when threshold exceeded
ctx_strict = CTXMiddleware(threshold=0.4, action="raise")
try:
    # ... long conversation ...
    ctx_strict.check(llm)
except CTXThresholdExceeded as e:
    print(f"Threshold exceeded at {e.percentage_used:.1%}")

๐Ÿ›ก๏ธ Error Handling and Retries

Handle errors gracefully with retry patterns:

from tinyloop.inference.litellm import LLM
import time
import random

def robust_llm_call(llm, prompt, max_retries=3, delay=1):
    """Make LLM calls with retry logic"""
    for attempt in range(max_retries):
        try:
            response = llm(prompt=prompt)
            return response
        except Exception as e:
            if attempt == max_retries - 1:
                raise e
            print(f"Attempt {attempt + 1} failed: {e}")
            time.sleep(delay * (2 ** attempt) + random.uniform(0, 1))

    return None

# Usage
llm = LLM(model="openai/gpt-3.5-turbo", temperature=0.1)
response = robust_llm_call(
    llm,
    "Explain the concept of machine learning",
    max_retries=3
)
print(response.response)
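
The same pattern works for any callable. Here is a self-contained demonstration with a stub function that fails twice before succeeding (zero delay so it runs instantly); `retry` mirrors the logic of `robust_llm_call` above:

```python
import random
import time

def retry(func, max_retries=3, delay=0.0):
    """Generic retry with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            print(f"Attempt {attempt + 1} failed: {e}")
            time.sleep(delay * (2 ** attempt) + random.uniform(0, delay))
    return None

calls = {"count": 0}

def flaky():
    """Stub that raises on the first two calls, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = retry(flaky, max_retries=3)
print(result)  # "ok" after two failed attempts
```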

๐Ÿ—๏ธ Project Structure

tinyloop/
├── ctx/
│   ├── analyzer.py          # Core CTX analysis logic
│   ├── categories.py        # Token categorization
│   ├── cli.py               # CLI implementation
│   ├── middleware.py        # LLM integration middleware
│   └── tokenizers.py        # Tokenizer abstractions
├── features/
│   ├── function_calling.py  # Function calling utilities
│   └── vision.py            # Vision model support
├── inference/
│   ├── base.py              # Base inference classes
│   └── litellm.py           # LiteLLM integration
├── modules/
│   ├── base_loop.py         # Base loop implementation
│   └── generate.py          # Generation modules
└── utils/
    └── prompt_renderer.py   # Prompt rendering utilities

🧪 Development

Running Tests

# Run all tests
pytest tests/

# Run specific test file
pytest tests/test_function_calling.py -v

# Run with coverage
pytest tests/ --cov=tinyloop

Examples

Check out the Jupyter notebooks for more detailed examples.

๐Ÿค Contributing

We welcome contributions! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Made with โค๏ธ for the AI community
