
A translation proxy that enables the Claude Code CLI to work with SWE-agent-format models

Project description

Ai2 Soft-Verified Efficient Repository Agents (SERA) Claude Code Proxy

This repo allows Claude Code to be used with the Ai2 Open Coding Agents SERA model.

You will need Claude Code and uv installed to set up the SERA CLI.
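Before continuing, it can help to confirm both tools are on your PATH. A minimal sketch, assuming the Claude Code binary is named claude (the standard install name):

```shell
# Check that the required CLIs are installed and on PATH.
# "claude" is the Claude Code binary; "uv" is the Astral uv tool.
for tool in claude uv; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```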

Quick Start with Modal

The fastest way to try SERA is with Modal, which handles GPU provisioning, vLLM deployment, and model downloads automatically. The first run takes ~10 minutes while ~65 GB of model weights are downloaded; subsequent runs reuse the cached weights and start up faster.

When you exit Claude Code, the Modal app will automatically get cleaned up.

# Install modal and sera globally
uv tool install modal
uv tool install ai2-sera-cli

# Set up Modal (this will prompt you to create an account)
modal setup

# Deploy SERA to Modal and launch Claude Code
sera --modal

Using Existing Endpoints

If you have an existing vLLM endpoint for the SERA model (e.g., from a shared deployment or your own infrastructure):

# Install sera globally
uv tool install ai2-sera-cli

# Set the API key if your endpoint requires authentication
export SERA_API_KEY=<your API key>

# Run sera with your endpoint
sera --endpoint <endpoint URL>

Shared Deployments with deploy-sera

For teams or multi-user setups, you can create a persistent vLLM deployment on Modal using deploy-sera. Unlike sera --modal, which creates an ephemeral deployment that stops when you exit, deploy-sera creates a persistent deployment that stays up until explicitly stopped.

# Deploy a persistent vLLM instance
deploy-sera --model allenai/SERA-32B

# The command outputs an endpoint URL and API key
# Share these with your team members

# Team members can then connect with:
SERA_API_KEY=<api-key> sera --endpoint <endpoint-url>

# Stop the deployment when done
deploy-sera --stop

deploy-sera Options

  • --model MODEL: HuggingFace model ID to deploy (default: allenai/SERA-32B)
  • --num-gpus N: Number of GPUs to use; also sets the tensor parallelism (default: 1)
  • --api-key KEY: API key for authentication (auto-generated if not specified)
  • --hf-secret NAME: Modal secret containing HF_TOKEN for private/gated models
  • --stop: Stop the running deployment

Deploying Private Models

For private models (e.g., fine-tuned on a proprietary codebase), use --hf-secret to authenticate with HuggingFace:

# 1. Create a Modal secret with your HuggingFace token
modal secret create huggingface HF_TOKEN=hf_your_token_here

# 2. Deploy your private model
deploy-sera --model your-org/private-sera-model --hf-secret huggingface

# 3. Users connect with the provided endpoint and API key
SERA_API_KEY=<api-key> sera --endpoint <endpoint-url>

For ephemeral single-user deployments, the same --hf-secret flag works with sera --modal.

Self-Hosted vLLM

You can run SERA directly with vLLM on any cloud GPU provider or on your own hardware.

On the server:

python -m vllm.entrypoints.openai.api_server \
    --model allenai/SERA-32B \
    --host 0.0.0.0 \
    --port 8000 \
    --max-model-len 32768 \
    --tensor-parallel-size 2 \
    --trust-remote-code \
    --enable-auto-tool-choice \
    --tool-call-parser hermes

On your dev machine:

uv tool install ai2-sera-cli
sera --endpoint http://your-server:8000/v1/chat/completions

Configuration

sera CLI Options

  • --endpoint URL: vLLM endpoint URL (required unless --modal is used)
  • --modal: Deploy vLLM to Modal (ephemeral; cleaned up automatically on exit)
  • --port PORT: Proxy server port (default: 8080)
  • --model MODEL: Model name or path
  • --hf-secret NAME: Modal secret name containing HF_TOKEN for private/gated models
  • --proxy-only: Start the proxy only; don't launch Claude Code
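These flags can be combined. A hypothetical example (the server address is a placeholder) that runs only the proxy on a custom port, so other OpenAI-compatible tools can point at it:

```shell
# Start only the translation proxy on port 9090; Claude Code is not launched.
sera --endpoint http://your-server:8000/v1/chat/completions \
     --port 9090 \
     --proxy-only
```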

Environment Variables

  • SERA_API_KEY: API key for vLLM endpoint authentication
  • SERA_MODEL: Default model name (fallback for --model)
  • SERA_HF_SECRET: Default Modal secret name (fallback for --hf-secret)
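For a persistent setup, these variables can live in your shell profile. A sketch with placeholder values (the key, model, and secret name shown are hypothetical):

```shell
# ~/.bashrc or ~/.zshrc fragment (placeholder values)
export SERA_API_KEY="sk-example-123"
export SERA_MODEL="allenai/SERA-32B"
export SERA_HF_SECRET="huggingface"
```

With these set, running sera --endpoint <endpoint URL> needs no extra flags.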

API Key Authentication

The proxy supports API key authentication for vLLM endpoints:

  • sera --modal: API key is auto-generated and managed in the background
  • deploy-sera: API key is auto-generated and printed so it can be shared with team members
  • Existing endpoints: Set SERA_API_KEY environment variable before running sera
  • Self-hosted vLLM: Start vLLM with --api-key YOUR_KEY, then set SERA_API_KEY=YOUR_KEY

The proxy includes the API key in the Authorization: Bearer <api_key> header when making requests.
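Concretely, the header is derived from the key. A minimal sketch with a hypothetical key value:

```shell
# How the upstream auth header is formed from the API key (placeholder value).
SERA_API_KEY="sk-example-123"
auth_header="Authorization: Bearer ${SERA_API_KEY}"
printf '%s\n' "$auth_header"
```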


Download files

Download the file for your platform.

Source Distribution

ai2_sera_cli-0.1.0.tar.gz (20.8 kB)


Built Distribution


ai2_sera_cli-0.1.0-py3-none-any.whl (24.2 kB)


File details

Details for the file ai2_sera_cli-0.1.0.tar.gz.

File metadata

  • Download URL: ai2_sera_cli-0.1.0.tar.gz
  • Size: 20.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.22

File hashes

Hashes for ai2_sera_cli-0.1.0.tar.gz:

  • SHA256: 2fe1940ebccc4a1eea077ce0f888b572da9c35e19d3011d162f1593647f3d6f0
  • MD5: 8884e52d8f3f7c4a50059604f08acb12
  • BLAKE2b-256: f8d6b9f10df2997fa7570fb79c1eb626b0d8b5080dba4c01c637727670d00130
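To check a downloaded file against a published digest, compute its SHA256 hash and compare. A sketch of the pattern, demonstrated on an empty temporary file (whose expected digest is the well-known SHA256 of empty input); substitute the real filename and the digest from the table above:

```shell
# Verify a file against an expected SHA256 digest.
# Here the "download" is an empty temp file, so the expected value
# is the well-known SHA256 of empty input.
expected="e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
tmpfile=$(mktemp)
actual=$(sha256sum "$tmpfile" | awk '{print $1}')
if [ "$actual" = "$expected" ]; then
  echo "hash OK"
else
  echo "hash MISMATCH"
fi
rm -f "$tmpfile"
```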


File details

Details for the file ai2_sera_cli-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: ai2_sera_cli-0.1.0-py3-none-any.whl
  • Size: 24.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.22

File hashes

Hashes for ai2_sera_cli-0.1.0-py3-none-any.whl:

  • SHA256: 7cc63625888675de748544c6a67ae87cd2cc59cf8a59b412d1adc93fe68f12c5
  • MD5: 345e6bda1a7381ad7c5dcbbeb8fb44c5
  • BLAKE2b-256: 68af42e0bd16d9a5fa654a340cf703c03c2d94064b3b164d5890b9f7b97aff7e

