Elastic Models Client from TheStage AI

Elastic Models CLI

A command-line tool for benchmarking and testing inference against elastic AI models.

Installation

pip install thestage-elastic-models-cli

Commands

Client Inference

# Test single inference requests
elastic-models-client client llm --prompt "Hello" --url <endpoint> --model <name>
elastic-models-client client diffusion --prompt "A cat" --url <endpoint> --model <name>
elastic-models-client client vlm --prompt "Describe image" --image <path> --url <endpoint> --model <name>
elastic-models-client client stt --audio <path> --url <endpoint> --model <name>
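Each client subcommand prints a JSON response (see Output below), so the CLI is easy to script. A minimal Python sketch — `build_client_cmd` and `run_client` are hypothetical helpers, assuming the client writes its JSON result to stdout:

```python
import json
import subprocess

def build_client_cmd(task: str, url: str, model: str, **flags) -> list:
    """Assemble an elastic-models-client invocation from the flags shown above."""
    cmd = ["elastic-models-client", "client", task, "--url", url, "--model", model]
    for name, value in flags.items():
        cmd += ["--" + name.replace("_", "-"), str(value)]
    return cmd

def run_client(task: str, url: str, model: str, **flags) -> dict:
    """Run the CLI and parse the JSON response it prints to stdout."""
    out = subprocess.run(
        build_client_cmd(task, url, model, **flags),
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)

# e.g. run_client("llm", "https://api.example.com/v2/models",
#                 "meta-llama/Llama-3.1-8B", prompt="Hello")
```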

Benchmarking

# Run load tests using Locust
elastic-models-client benchmark llm --url <endpoint> --model <name>
elastic-models-client benchmark diffusion --url <endpoint> --model <name>
elastic-models-client benchmark vlm --url <endpoint> --model <name>
elastic-models-client benchmark stt --url <endpoint> --model <name>

# Options: --concurrency, --num-requests, --output-dir, --authorization
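Since benchmarks are driven by Locust, the files written to `--output-dir` can be post-processed with a few lines of Python. A sketch under the assumption that the stats follow Locust's standard `*_stats.csv` layout (a `Name` column with an `Aggregated` summary row and a `95%` percentile column) — the exact filenames and headers produced by this tool may differ:

```python
import csv

def latency_p95(stats_csv_path: str) -> float:
    """Read the aggregated 95th-percentile latency (ms) from a Locust stats CSV."""
    with open(stats_csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if row.get("Name") == "Aggregated":
                return float(row["95%"])
    raise ValueError("no Aggregated row found in " + stats_csv_path)
```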

Requirements

  • Python: >=3.10
  • Dependencies: qlip_serve_client, locust, Pillow, requests, aiohttp
  • NumPy: <2.0 (Triton compatibility)
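The NumPy pin can be checked up front rather than discovered as a Triton failure mid-run; a small sketch (pure string comparison, so it works without importing NumPy):

```python
def satisfies_numpy_pin(version_string: str) -> bool:
    """True if the given NumPy version meets the <2.0 Triton-compatibility pin."""
    major = int(version_string.split(".", 1)[0])
    return major < 2

# At runtime: import numpy; assert satisfies_numpy_pin(numpy.__version__)
```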

⚠️ Important Caveats

  1. Triton Dependency: NumPy must be <2.0 for compatibility with the Triton server
  2. Authorization: pass --authorization when calling authenticated endpoints
  3. Ready Endpoint: server readiness checks may fail behind Nginx or Salad proxies
  4. Metadata: requires a model metadata JSON file, or downloads it automatically from the server
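Caveat 3 can be diagnosed by probing the readiness endpoint yourself. A sketch assuming the server exposes Triton's standard `/v2/health/ready` route off the server root — the exact path behind your proxy may differ:

```python
import urllib.error
import urllib.request

def server_ready(base_url: str, timeout: float = 5.0) -> bool:
    """Return True only if the inference server answers its readiness probe with 200."""
    url = base_url.rstrip("/") + "/v2/health/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, DNS failure, timeouts, and HTTP errors
        # (an Nginx/Salad 404 surfaces here as HTTPError, a URLError subclass).
        return False
```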

Quick Example

# Test single inference
elastic-models-client client llm \
  --prompt "Write a haiku about AI" \
  --url https://api.example.com/v2/models \
  --model meta-llama/Llama-3.1-8B

# Benchmark LLM with 4 concurrent users, 100 requests
elastic-models-client benchmark llm \
  --url https://api.example.com/v2/models \
  --model meta-llama/Llama-3.1-8B \
  --concurrency 4 \
  --num-requests 100 \
  --output-dir ./results

Output

  • Benchmarks: CSV stats, HTML reports, JSONL logs (optional)
  • Client: JSON response with inference results
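When the optional JSONL logging is enabled, each line of the log is one JSON record. The fields inside a record depend on the task, so this hedged sketch only decodes the lines and leaves interpretation to the caller:

```python
import json

def read_jsonl(path: str) -> list:
    """Load one JSON object per non-empty line from a benchmark log file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```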

Download files

Source distribution

No source distribution files are available for this release.

Built distribution

thestage_elastic_models_cli-0.0.18-py3-none-any.whl (40.1 kB, uploaded for Python 3)

File hashes

Hashes for thestage_elastic_models_cli-0.0.18-py3-none-any.whl:

  • SHA256: d9dfe3d501ab1d593a5b6b1d5ca385a30db4495e7e9c72a8e8fa9859c4bcd91a
  • MD5: 82ee907d3efef54f028cb2ba881dfbf4
  • BLAKE2b-256: a44f25f2425ff54d15ba01924508ba6773deb4a94ce61efed9d45934f3e59f16
