
Elastic Models Client from TheStage AI

Project description

Elastic Models CLI

A CLI tool for sending test inference requests to elastic AI model endpoints and benchmarking them under load with Locust.

Installation

pip install thestage-elastic-models-cli

Commands

Client Inference

# Test single inference requests
elastic-models-client client llm --prompt "Hello" --url <endpoint> --model <name>
elastic-models-client client diffusion --prompt "A cat" --url <endpoint> --model <name>
elastic-models-client client vlm --prompt "Describe image" --image <path> --url <endpoint> --model <name>
elastic-models-client client stt --audio <path> --url <endpoint> --model <name>
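
Since the client prints a JSON response (see Output below), single requests are easy to script. This is an illustrative sketch, not part of the package: the helper names are hypothetical, and it assumes elastic-models-client is on PATH and writes its JSON result to stdout.

```python
import json
import subprocess

def build_client_cmd(task: str, url: str, model: str, **flags: str) -> list[str]:
    """Build the elastic-models-client argv for a single inference request."""
    cmd = ["elastic-models-client", "client", task, "--url", url, "--model", model]
    for name, value in flags.items():
        cmd += [f"--{name.replace('_', '-')}", value]
    return cmd

def run_and_parse(cmd: list[str]) -> dict:
    """Invoke the CLI and parse its stdout as JSON (assumption: JSON on stdout)."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)

cmd = build_client_cmd("llm", "https://api.example.com/v2/models",
                       "meta-llama/Llama-3.1-8B", prompt="Hello")
```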

Benchmarking

# Run load tests using Locust
elastic-models-client benchmark llm --url <endpoint> --model <name>
elastic-models-client benchmark diffusion --url <endpoint> --model <name>
elastic-models-client benchmark vlm --url <endpoint> --model <name>
elastic-models-client benchmark stt --url <endpoint> --model <name>

# Options: --concurrency, --num-requests, --output-dir, --authorization

Requirements

  • Python: >=3.10
  • Dependencies: qlip_serve_client, locust, Pillow, requests, aiohttp
  • NumPy: <2.0 (Triton compatibility)

⚠️ Important Caveats

  1. Triton Dependency: NumPy must stay below 2.0 for Triton server compatibility
  2. Authorization: pass --authorization when the endpoint requires authentication
  3. Ready Endpoint: server readiness checks may fail behind Nginx/Salad setups
  4. Metadata: requires a model metadata JSON file, either supplied locally or auto-downloaded from the server
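
Caveat 1 can be enforced with a small startup guard; this is an illustrative check, not part of the CLI:

```python
def numpy_is_compatible(version: str) -> bool:
    """Return True if a NumPy version string satisfies the <2.0 Triton constraint."""
    major = int(version.split(".")[0])
    return major < 2

# Example guard at program start (assumes NumPy is installed):
# import numpy as np
# if not numpy_is_compatible(np.__version__):
#     raise RuntimeError("NumPy >= 2.0 detected; Triton requires numpy<2.0")
```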

Quick Example

# Test single inference
elastic-models-client client llm \
  --prompt "Write a haiku about AI" \
  --url https://api.example.com/v2/models \
  --model meta-llama/Llama-3.1-8B

# Benchmark LLM with 4 concurrent users, 100 requests
elastic-models-client benchmark llm \
  --url https://api.example.com/v2/models \
  --model meta-llama/Llama-3.1-8B \
  --concurrency 4 \
  --num-requests 100 \
  --output-dir ./results

Output

  • Benchmarks: CSV statistics, HTML reports, and optional JSONL logs
  • Client: JSON response containing the inference results

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distribution


thestage_elastic_models_cli-0.0.19-py3-none-any.whl (42.6 kB)

Uploaded: Python 3

File details

Hashes for thestage_elastic_models_cli-0.0.19-py3-none-any.whl:

  Algorithm     Hash digest
  SHA256        95d13ab660b2ae609854801e53dded4c27e9956e20b674ce0a5a3bd0dbfa72c0
  MD5           b7f56eae6d90c0841f5261d0c0e18e21
  BLAKE2b-256   e814ec16705b6f28f64c41e45358f944deb04a2c7e7f37494af597284c8c2305
