Skip to main content

The fastest way to build, chain, and reuse LLM agents and flows

Project description

cruise-llm

Quickly build and reuse LLM workflows/agents with a clean, composable API — inspired by scikit-learn's chainability and litellm's model flexibility.

from cruise_llm import LLM
LLM().user("Explain quantum computing").chat(stream=True)

Multi-turn Prompt Queues

Build complex micro-workflows by queuing prompts that the model will execute sequentially.

# Automatic multi-step processing
news_processor = (
    LLM(model="fast")
    .user(f"Process this article: {raw_text}")
    .queue("Summarize the key points into 3 bullet points for an executive.")
    .queue("Translate those points into Spanish.")
    .queue("Format the Spanish summary as a Slack message with emojis.")
    .chat()
)

# Create reusable bot templates
def style_refiner(style):
    return LLM().sys(f"Rewrite in a {style} tone").queue("Make it half the length")

casual = style_refiner("casual")
formal = style_refiner("formal")

casual.user("We need to discuss Q3 deliverables").res()
formal.user("hey wanna grab coffee and chat about the project?").res()

Easy Tool Calling for Fast Agent Building

Simply define functions, no schema necessary:

def search_docs(query: str):
    """Search internal documentation."""
    return f"Found: '{query}' appears in onboarding.md and api-reference.md"

def create_ticket(title: str, priority: str):
    """Create a support ticket."""
    return f"Created ticket #{hash(title) % 1000}: {title} [{priority}]"

def send_slack(channel: str, message: str):
    """Send a Slack message."""
    return f"Sent to #{channel}: {message[:50]}..."

support_agent = (
    LLM()
    .sys("You are a support agent")
    .tools(fns=[search_docs, create_ticket, send_slack])
)

support_agent.user("User can't log in. Check docs, create a P1 ticket, and alert #incidents").chat()

Image & Audio Support

Attach images and audio to prompts — auto-switches to a capable model if needed:

# Images
LLM().user("What's in this image?", image="photo.jpg").chat()
LLM().user("Compare these", image=["before.png", "after.png"]).chat()

# Audio
LLM().user(audio="meeting.mp3").chat()                          # audio as the prompt
LLM().user("What language is this?", audio="clip.wav").chat()    # audio + text
LLM().user("Compare these", audio=["clip1.wav", "clip2.wav"]).chat()

# Combined
LLM().user("Describe the scene", image="photo.jpg", audio="narration.mp3").chat()

# URLs work for both
LLM().user("Describe", image="https://example.com/img.jpg").chat()
LLM().user("Summarize", audio="https://example.com/podcast.mp3").chat()

# Standalone transcription (uses Whisper)
text = LLM().transcribe("recording.wav")

Evaluate & Compare Outputs

Rank multiple LLM outputs with pairwise comparison, or score a single response:

from cruise_llm import evaluate

# Compare outputs from different models
outputs = [model.run(text=article) for model in models]
result = evaluate(results=outputs)
print(result["rankings"])  # [2, 0, 1] = third output was best
print(result["scores"])    # {0: 0.35, 1: 0.15, 2: 0.50}

# Custom metrics
result = evaluate(
    results=outputs,
    metrics={"How interesting is it?": "1-10", "How easy to understand?": "1-10"},
    weights={"How interesting is it?": 0.3, "How easy to understand?": 0.7}
)

# Score a single response
llm = LLM().user("Explain quantum computing").chat()
score = llm.evaluate_last(metrics={"How clear?": "1-10"})
print(score["score"])  # 0.82

Flexible Conversations

Chat instances with swappable models and minimal verbosity:

chat1 = (
    LLM(model="fast")
    .sys("You are a bitcoin analyst")
    .user("What is proof of work?").chat()
    .user("Steel man the case for bitcoin mining").chat()
    .user("Now steel man the case against").chat()
)

# Replay history with more intelligent yet expensive config
chat2 = chat1.run_history(model="best", reasoning=True, reasoning_effort="high")

# Save chat histories to analyze offline or load later
chat1.save_llm("chats/bitcoin_analysis_fast_model.json")
chat2.save_llm("chats/bitcoin_analysis_best_model.json")

Model Discovery & A/B Testing

Pick specific models or get up-to-date top-10 from category:

LLM(model="gpt-5.2")
LLM(model="best")     # top intelligence rankings
LLM(model="fast")     # optimized for speed
LLM(model="cheap")
LLM(model="open")     # open-source models
LLM(model="optimal")  # balanced best+fast (default)
LLM(model="codex")

# Simple numeric selection (zips optimal and best)
LLM(model=1)          # top optimal (default)
LLM(model=2)          # top best
LLM(model=3)          # second optimal

# Deterministic selection by rank (1-indexed)
LLM(model="best1")    # top model in best category
LLM(model="fast3")    # 3rd fastest model

# Discover and filter what's available
LLM().get_models("claude")
LLM().models_with_vision()
LLM().models_with_audio_input()
LLM().models_with_search()

Generate LLMs from Descriptions

Create configured LLM instances from natural language:

# Generate a specialized LLM
summarizer = LLM().generate("Text summarizer that outputs 3 bullet points")
result = summarizer.run(text="Long article here...")

# Use a powerful model as the generator for better results
analyst = LLM(model="best", reasoning=True).generate(
    "A senior financial analyst for DCF valuations"
)
result = analyst.run(ticker="NVDA")

# Generated LLMs can be saved and reused
analyst.save_llm("agents/dcf_analyst.json")

LLM as Function

Use LLMs as reusable functions with template variables:

# Define with {placeholders}, call with .run()
sentiment = LLM().sys("Classify sentiment").user("Text: {text}")
sentiment.run("I love this product!")  # positional arg when 1 required var
sentiment.run(text="This is terrible")  # or use kwargs

# Optional variables with {var?} syntax
analyzer = LLM().user("Analyze {ticker} focusing on {aspect?}")
analyzer.run(ticker="TSLA")                        # aspect becomes ""
analyzer.run(ticker="TSLA", aspect="growth")       # aspect = "growth"

# JSON output
extractor = LLM().sys("Extract entities as JSON").user("{text}")
entities = extractor.run_json("Apple announced new MacBooks")

Cost Tracking

Track token usage and costs across your session:

llm = LLM(model="best")
llm.user("Explain quantum computing").chat()
llm.user("Summarize in one sentence").chat()

print(f"Last call: ${llm.last_cost():.6f}")
print(f"Session total: ${llm.total_cost():.6f}")
print(f"Breakdown: {llm.all_costs()}")

Save, Load, Export

# Save an agent config
researcher = LLM("claude-sonnet-4-5").tools(search=True)
researcher.save_llm("agents/researcher.json")

# Load
r = LLM.load_llm("agents/researcher.json")
r.user(f"What happened in tech {todays_date}?").chat()

# Export conversation to markdown
r.to_md(f"tech_briefing/{todays_date}.md")

Install

pip install cruise-llm

Your access to models is based on your API keys from the various providers—keys are available for free from most providers. Create a local .env file in your project root with at least one API key. Use litellm-specific variable names:

OPENAI_API_KEY=sk-proj-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=AIza...
GROQ_API_KEY=gsk_...
XAI_API_KEY=xai-...

Caveat: Search, reasoning, and model categories/rankings (best, cheap, fast, open, etc.) has only been tested with the above listed providers. Calling other providers (perplexity, huggingface etc.) is still available with explicit litellm model strings but may require different search/reasoning setup.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cruise_llm-0.6.0.tar.gz (27.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

cruise_llm-0.6.0-py3-none-any.whl (29.6 kB view details)

Uploaded Python 3

File details

Details for the file cruise_llm-0.6.0.tar.gz.

File metadata

  • Download URL: cruise_llm-0.6.0.tar.gz
  • Upload date:
  • Size: 27.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for cruise_llm-0.6.0.tar.gz
Algorithm Hash digest
SHA256 dc0571d2ff32b48c7b35f7b25ae8a9e16305c16e00efec70bac6f1177c6e0f73
MD5 6fda4b58ba718368fa6cb6541c96ae06
BLAKE2b-256 fc027463737520204e1c73e05310bcdc8491f39d6c43a6517065236886f4079b

See more details on using hashes here.

File details

Details for the file cruise_llm-0.6.0-py3-none-any.whl.

File metadata

  • Download URL: cruise_llm-0.6.0-py3-none-any.whl
  • Upload date:
  • Size: 29.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for cruise_llm-0.6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 717a5cc31b50e858e4366daee8144170affee7f1b12961d0294f6ef46f3774f1
MD5 a1ddf61d467e3c9aa01c88fe9a1a908a
BLAKE2b-256 d8bd7bc14a60862b145af07f1954f699e02cd612c9edef0a71c42e9b770e5aac

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page