Variably Python SDK

Official Python SDK for Variably — feature flags, LLM experimentation, and prompt optimization.

Installation

pip install variably-sdk

Quick Start

from variably import VariablyClient

# Initialize the client
client = VariablyClient({
    "api_key": "your-api-key",
    "base_url": "https://api.variably.com",  # optional, defaults to localhost:8080
    "environment": "production"  # optional
})

# Evaluate a boolean feature flag
user_context = {
    "user_id": "user-123",
    "email": "user@example.com",
    "country": "US"
}

is_feature_enabled = client.evaluate_flag_bool(
    "new-checkout-flow",
    False,  # default value
    user_context
)

if is_feature_enabled:
    # Show new checkout flow
    pass

# Evaluate a feature gate
has_access = client.evaluate_gate("premium-features", user_context)

# Track events
client.track({
    "name": "button_clicked",
    "user_id": "user-123",
    "properties": {
        "button_name": "checkout",
        "page": "product-detail"
    }
})

# Clean up resources
client.close()

Prompt Experimentation

Variably provides two modes for LLM prompt experimentation:

BYOR (Bring Your Own Runtime)

You call your own LLM. Variably handles variant allocation and 41-dimensional evaluation.

from variably import VariablyClient
import time

client = VariablyClient({"api_key": "your-api-key"})

user_context = {"user_id": "user-123"}
input_variables = {"query": "What are the symptoms of Type 2 diabetes?"}

# Step 1: Get the allocated variant
variant = client.get_variant("rag-prompt-experiment", user_context, input_variables)
print(f"Variant: {variant.variant_key}, Model: {variant.model}")

# Step 2: Call your LLM with the variant's prompt template
prompt = variant.prompt_template.format(**input_variables)
start = time.time()
llm_response = call_your_llm(prompt, model=variant.model)  # your LLM call
latency = int((time.time() - start) * 1000)

# Step 3: Submit the response for 41-dimensional evaluation
result = client.submit_response(
    experiment_key="rag-prompt-experiment",
    variant_key=variant.variant_key,
    executed_prompt=prompt,
    response=llm_response,
    user_context=user_context,
    input_variables=input_variables,
    provider=variant.provider,
    model=variant.model,
    latency_ms=latency,
)
print(f"Submitted: {result.status}")

Managed Execution

Variably selects the variant, calls the LLM, and evaluates — all in one call.

response = client.evaluate_prompt(
    experiment_key="rag-prompt-experiment",
    user_context={"user_id": "user-123"},
    input_variables={"query": "What are the symptoms of Type 2 diabetes?"},
    evaluation_mode="full",  # "full" | "fast"
)

print(f"Content: {response.content}")
print(f"Model: {response.model}, Latency: {response.latency_ms}ms")
print(f"Tokens: {response.token_usage}")
print(f"Quality Score: {response.quality_score}")

Managed Execution with Streaming (v2.1.0+)

Same as managed execution, but tokens stream in real-time — ideal for chatbot UIs.

from variably import VariablyClient

client = VariablyClient({"api_key": "your-api-key"})

stream = client.evaluate_prompt_stream(
    experiment_key="rag-prompt-experiment",
    user_context={"user_id": "user-123"},
    input_variables={"query": "What are the symptoms of Type 2 diabetes?"},
)

# Tokens arrive one-by-one for real-time display
for token in stream:
    print(token, end="", flush=True)

print()  # newline after stream ends

# After iteration, metadata is available (token usage, latency, quality score)
meta = stream.metadata
if meta:
    print(f"Model: {meta.model}, Latency: {meta.latency_ms}ms")
    print(f"Tokens: {meta.token_usage}")

Context-Aware Evaluation (Better RAG Quality) — v2.2.0+

For RAG chatbots, passing conversation history and retrieved chunks enables groundedness scoring, hallucination detection, and conversational coherence — dimensions that are impossible to evaluate in isolation.

The evaluation_context parameter is not sent to the LLM — it's only used by Variably's evaluator for richer scoring.

# Step 1: Collect conversation history from your session
workflow_history = [
    {"role": "user", "content": "What causes diabetes?"},
    {"role": "assistant", "content": "Key factors include genetics, diet..."},
    {"role": "user", "content": "What about potatoes?"},
]

# Step 2: Collect retrieved RAG chunks (after your retrieval step)
reference_materials = [
    {
        "id": "chunk-001",
        "content": "Unhealthy diets high in refined sugars, fats...",
        "source": "Kenya National Clinical Guidelines",
        "type": "chunk",
        "relevance_score": 0.89,
    },
    {
        "id": "chunk-002",
        "content": "Modifiable risk factors include obesity...",
        "source": "Kenya National Clinical Guidelines",
        "type": "chunk",
        "relevance_score": 0.82,
    },
]

# Build the prompt's {context} input from the retrieved chunk text
context_text = "\n\n".join(chunk["content"] for chunk in reference_materials)

# Step 3: Pass evaluation_context in your evaluate call
response = client.evaluate_prompt(
    experiment_key="rag-prompt-experiment",
    user_context={"user_id": "user-123"},
    input_variables={"query": "What about potatoes?", "context": context_text},
    evaluation_mode="full",
    evaluation_context={
        "reference_materials": reference_materials,
        "workflow_history": workflow_history,
        "retrieval_query": "potato consumption glycemic index diabetes risk",
    },
)

# Same works with streaming
stream = client.evaluate_prompt_stream(
    experiment_key="rag-prompt-experiment",
    user_context={"user_id": "user-123"},
    input_variables={"query": "What about potatoes?", "context": context_text},
    evaluation_context={
        "reference_materials": reference_materials,
        "workflow_history": workflow_history,
    },
)
for token in stream:
    print(token, end="", flush=True)

What this enables:

Dimension                 Description                                   Requires
faithfulness              % of claims grounded in retrieved chunks      reference_materials
hallucination_rate        % of claims with no source in context         reference_materials
context_utilization       % of relevant chunks actually used            reference_materials
attribution_accuracy      Do citations map to correct chunks?           reference_materials
conversation_consistency  No contradictions with prior turns            workflow_history
context_retention         Maintains topic awareness across turns        workflow_history
transparency              Discloses when going beyond source material   reference_materials

BYOR mode also supports evaluation_context — pass it in submit_response():

result = client.submit_response(
    experiment_key="my-experiment",
    variant_key=variant.variant_key,
    executed_prompt=prompt,
    response=llm_response,
    user_context=user_context,
    input_variables=input_variables,
    provider=variant.provider,
    model=variant.model,
    latency_ms=latency,
    evaluation_context={
        "reference_materials": reference_materials,
        "workflow_history": workflow_history,
    },
)

evaluation_context Schema

Field                                  Type        Description
reference_materials                    list[dict]  RAG chunks / source documents for groundedness scoring
reference_materials[].id               str         Unique chunk identifier
reference_materials[].content          str         Chunk text content
reference_materials[].source           str         (optional) Source document URL or name
reference_materials[].type             str         (optional) e.g. "chunk", "document"
reference_materials[].relevance_score  float       (optional) Retriever similarity score
workflow_history                       list[dict]  Conversation turns for coherence scoring
workflow_history[].role                str         "user" or "assistant"
workflow_history[].content             str         Message content
retrieval_query                        str         (optional) The rewritten query sent to the retriever
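If you want editor and type-checker support while building this payload, you can mirror the schema with TypedDicts in your own code. These classes are only a sketch of the table above, not types exported by the SDK (TypedDict needs Python 3.8+, or typing_extensions on 3.7):

from typing import List, TypedDict  # use typing_extensions on Python 3.7

class _ReferenceMaterialRequired(TypedDict):
    id: str       # unique chunk identifier
    content: str  # chunk text content

class ReferenceMaterial(_ReferenceMaterialRequired, total=False):
    source: str             # source document URL or name
    type: str               # e.g. "chunk", "document"
    relevance_score: float  # retriever similarity score

class WorkflowTurn(TypedDict):
    role: str     # "user" or "assistant"
    content: str  # message content

class EvaluationContext(TypedDict, total=False):
    reference_materials: List[ReferenceMaterial]
    workflow_history: List[WorkflowTurn]
    retrieval_query: str  # the rewritten query sent to the retriever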

See Context-Aware RAG Evaluation for the full concept doc with architecture diagrams and integration examples.

Integration with LangGraph / FastAPI streaming

import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

from variably import VariablyClient

app = FastAPI()
client = VariablyClient({"api_key": "your-api-key"})  # or reuse your app-wide client

class ChatRequest(BaseModel):
    message: str
    session_id: str

async def stream_with_variably(query: str, session_id: str):
    """Yield NDJSON events from Variably streaming evaluation."""
    stream = client.evaluate_prompt_stream(
        experiment_key="my-experiment",
        user_context={"user_id": session_id},
        input_variables={"query": query},
    )

    for token in stream:
        yield json.dumps({"type": "token", "content": token}) + "\n"

    # Send final metadata
    if stream.metadata:
        yield json.dumps({
            "type": "stream_end",
            "content": stream.metadata.content,
        }) + "\n"

@app.post("/api/chat")
async def chat(request: ChatRequest):
    return StreamingResponse(
        stream_with_variably(request.message, request.session_id),
        media_type="application/x-ndjson",
    )
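A Python caller can consume that NDJSON endpoint line by line. A minimal sketch with requests (the route, port, and field names follow the hypothetical FastAPI example above; adjust them to your app):

import json
import requests

# Hypothetical local endpoint matching the FastAPI example above
with requests.post(
    "http://localhost:8000/api/chat",
    json={"message": "What about potatoes?", "session_id": "sess-456"},
    stream=True,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:
            continue
        event = json.loads(line)
        if event["type"] == "token":
            print(event["content"], end="", flush=True)
        elif event["type"] == "stream_end":
            print()  # full response text is in event["content"]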

Backend API: SSE Streaming Endpoint

The streaming endpoint uses Server-Sent Events (SSE). Here's the raw API:

Endpoint: POST /api/v1/internal/sdk/prompt-experiments/evaluate-stream

Headers:

X-API-Key: your-api-key
Content-Type: application/json

Request body (same as non-streaming evaluate):

{
  "experiment_key": "rag-prompt-experiment",
  "user_context": {
    "userId": "user-123",
    "sessionId": "sess-456"
  },
  "input_variables": {
    "query": "What are the symptoms of Type 2 diabetes?"
  },
  "evaluation_context": {
    "reference_materials": [{"id": "chunk-1", "content": "...", "source": "...", "type": "chunk"}],
    "workflow_history": [{"role": "user", "content": "..."}],
    "retrieval_query": "diabetes symptoms type 2"
  }
}

Response (SSE stream):

event: token
data: {"content": "Type"}

event: token
data: {"content": " 2"}

event: token
data: {"content": " diabetes"}

event: token
data: {"content": " symptoms"}

event: token
data: {"content": " include..."}

event: metadata
data: {"experiment_id": "exp-123", "variant_id": "variant-a", "execution_id": "eval-789", "provider": "anthropic", "model": "claude-3-5-haiku-20241022", "prompt_tokens": 150, "completion_tokens": 85, "total_tokens": 235, "cost_usd": 0.000425, "latency_ms": 1250}

event: done
data: {}

curl example:

curl -N -X POST http://localhost:8080/api/v1/internal/sdk/prompt-experiments/evaluate-stream \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "experiment_key": "rag-prompt-experiment",
    "user_context": {"userId": "user-123", "sessionId": "sess-456"},
    "input_variables": {"query": "What are the symptoms of Type 2 diabetes?"}
  }'

Error handling: If an error occurs during streaming, an error event is sent:

event: error
data: {"message": "LLM generation failed: rate limit exceeded"}

Configuration

from variably import VariablyConfig, VariablyClient

config = VariablyConfig(
    api_key="your-api-key",
    base_url="https://api.variably.com",  # default: http://localhost:8080
    environment="production",  # default: development
    timeout=5000,  # timeout in milliseconds, default: 5000
    retry_attempts=3,  # default: 3
    enable_analytics=True,  # default: True
    cache={
        "ttl": 300,  # TTL in seconds, default: 300 (5 minutes)
        "max_size": 1000,  # default: 1000
        "enabled": True  # default: True
    },
    log_level="INFO"  # DEBUG, INFO, WARNING, ERROR
)

client = VariablyClient(config)

Advanced Usage

Environment Variables

You can create a client using environment variables:

from variably import create_client_from_env

# Uses these environment variables:
# VARIABLY_API_KEY (required)
# VARIABLY_BASE_URL
# VARIABLY_ENVIRONMENT
# VARIABLY_TIMEOUT
# VARIABLY_RETRY_ATTEMPTS
# VARIABLY_ENABLE_ANALYTICS
# VARIABLY_LOG_LEVEL

client = create_client_from_env()
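For a quick local test you can set these variables from Python before creating the client; in production they would normally come from your deployment environment. Note that every value is read as a string:

import os
from variably import create_client_from_env

os.environ["VARIABLY_API_KEY"] = "your-api-key"
os.environ["VARIABLY_ENVIRONMENT"] = "production"
os.environ["VARIABLY_TIMEOUT"] = "5000"  # milliseconds, as a string

client = create_client_from_env()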

Different Flag Types

# Boolean flags
bool_value = client.evaluate_flag_bool("feature-enabled", False, user_context)

# String flags
string_value = client.evaluate_flag_string("theme", "light", user_context)

# Number flags
number_value = client.evaluate_flag_number("max-items", 10, user_context)

# JSON flags
json_value = client.evaluate_flag_json("config", {"timeout": 5000}, user_context)

# Get full evaluation details
result = client.evaluate_flag("feature-flag", "default", user_context)
print(f"Value: {result.value}, Reason: {result.reason}, Cache Hit: {result.cache_hit}")

Batch Evaluation

flags = client.evaluate_flags([
    "feature-a",
    "feature-b", 
    "feature-c"
], user_context)

print(flags["feature-a"].value)

Event Tracking

from datetime import datetime

# Single event
client.track({
    "name": "purchase_completed",
    "user_id": "user-123",
    "properties": {
        "amount": 99.99,
        "currency": "USD",
        "items": ["item-1", "item-2"]
    },
    "timestamp": datetime.utcnow()  # optional, auto-generated if not provided
})

# Batch events
client.track_batch([
    {"name": "page_view", "user_id": "user-123", "properties": {"page": "/home"}},
    {"name": "button_click", "user_id": "user-123", "properties": {"button": "cta"}}
])

Cache Management

# Clear cache
client.clear_cache()

# Get cache stats
stats = client.cache.get_stats()
print(stats)  # {"size": 10, "max_size": 1000, "enabled": True, "ttl": 300}

Metrics

# Get SDK metrics
metrics = client.get_metrics()
print(metrics)
# {
#     "api_calls": 25,
#     "cache_hits": 15,
#     "cache_misses": 10,
#     "errors": 1,
#     "average_latency": 45.2,
#     "cache_hit_rate": 0.6,
#     "error_rate": 0.04,
#     "flags_evaluated": 20,
#     "gates_evaluated": 5,
#     "events_tracked": 12,
#     "start_time": "2023-10-01T12:00:00Z",
#     "uptime_seconds": 3600
# }

Context Manager

# Use with context manager for automatic cleanup
with VariablyClient({"api_key": "your-api-key"}) as client:
    result = client.evaluate_flag_bool("feature", False, user_context)
    # client.close() is called automatically

Custom Logger

from variably import VariablyClient, create_logger

# Create custom logger
logger = create_logger(
    name="my-app",
    level="DEBUG",
    structured=True,  # JSON logging
    silent=False
)

# Client will use the custom logger
client = VariablyClient({
    "api_key": "your-api-key",
    "log_level": "DEBUG"
})

Error Handling

from variably import (
    VariablyError,
    NetworkError,
    AuthenticationError,
    ValidationError,
    RateLimitError,
    TimeoutError,
    ConfigurationError
)

try:
    result = client.evaluate_flag("my-flag", False, user_context)
except AuthenticationError:
    print("Invalid API key")
except NetworkError as e:
    print(f"Network error: {e.status_code}")
except ValidationError as e:
    print(f"Validation error in field: {e.field}")
except RateLimitError as e:
    print(f"Rate limited, retry after {e.retry_after} seconds")
except TimeoutError:
    print("Request timed out")
except ConfigurationError as e:
    print(f"Configuration error in parameter: {e.parameter}")
except VariablyError as e:
    print(f"Variably SDK error: {e}")

Type Hints

The SDK includes full type hints for better IDE support:

from variably import VariablyClient, UserContext, FlagResult

user_context: UserContext = {
    "user_id": "user-123",
    "email": "user@example.com",
    "attributes": {
        "plan": "premium",
        "signup_date": "2023-01-01"
    }
}

result: FlagResult = client.evaluate_flag("feature", False, user_context)

Async Support

For async applications, you can wrap the synchronous client:

import asyncio
from concurrent.futures import ThreadPoolExecutor
from variably import VariablyClient

class AsyncVariablyClient:
    def __init__(self, config):
        self.client = VariablyClient(config)
        self.executor = ThreadPoolExecutor(max_workers=4)
    
    async def evaluate_flag_bool(self, flag_key, default_value, user_context):
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            self.executor,
            self.client.evaluate_flag_bool,
            flag_key, default_value, user_context
        )
    
    async def close(self):
        self.client.close()
        self.executor.shutdown(wait=True)

# Usage
async def main():
    client = AsyncVariablyClient({"api_key": "your-api-key"})
    
    result = await client.evaluate_flag_bool("feature", False, {
        "user_id": "user-123"
    })
    
    await client.close()

asyncio.run(main())

Development

Setup

# Install development dependencies
pip install -e ".[dev]"

Testing

pytest

Code Quality

# Format code
black src/ tests/

# Sort imports
isort src/ tests/

# Lint
flake8 src/ tests/

# Type check
mypy src/

Publishing to PyPI

Prerequisites

  1. Create a PyPI account at https://pypi.org/account/register/
  2. Generate an API token at https://pypi.org/manage/account/token/
    • Scope: select "Entire account" for first upload, or project-specific after that
  3. Install build tools:
    pip3 install build twine
    

Note: build and twine install to user site-packages and may not be on your PATH. Always use python3 -m build and python3 -m twine instead of bare build/twine.

Configure PyPI credentials

Create ~/.pypirc:

[distutils]
index-servers = pypi

[pypi]
username = __token__
password = pypi-YOUR_API_TOKEN_HERE

Secure the file:

chmod 600 ~/.pypirc

Build and publish

The version in the build output (e.g., variably_sdk-2.0.0-py3-none-any.whl) comes directly from pyproject.toml's version field. PyPI rejects re-uploads of the same version — you must bump the version to publish again.

# 1. Clean previous builds
rm -rf dist/ build/ src/*.egg-info

# 2. Build sdist and wheel
python3 -m build

# 3. Verify the package (optional but recommended)
python3 -m twine check dist/*

# 4. Upload to TestPyPI first (optional, for dry-run)
python3 -m twine upload --repository testpypi dist/*

# 5. Upload to PyPI
python3 -m twine upload dist/*

Verify the published package

pip3 install variably-sdk==2.1.0
python3 -c "from variably import VariablyClient, PromptVariant; print('OK')"

Version bumping checklist

When releasing a new version, update these three files then clean-build-publish:

  1. src/variably/version.py: __version__
  2. pyproject.toml: version
  3. src/variably/http_client.py: User-Agent header string

# Example: bumping from 2.0.0 to 2.0.1
# After updating the 3 files above:
rm -rf dist/ build/ src/*.egg-info
python3 -m build
python3 -m twine upload dist/*

Requirements

  • Python 3.7+
  • requests >= 2.25.0

License

MIT License - see LICENSE file for details.
