A toolkit for managing and testing LM Studio models with automatic context limit discovery
LMStrix
LMStrix is a Python toolkit for working with LM Studio. It provides a command-line interface (CLI) and a Python API for managing, testing, and running local language models. Its standout feature is Adaptive Context Optimization: automatic discovery of the context limit at which a model actually operates.
Key Features
- Automatic Context Discovery: Binary search algorithm to find the true operational context limit of any model
- Beautiful Verbose Logging: Enhanced stats display with emojis showing inference metrics, timing, and token usage
- Smart Model Management: Models persist between calls to reduce loading overhead
- Flexible Inference Engine: Run inference with powerful prompt templating and percentage-based output control
- Comprehensive Model Registry: Track models, their context limits, and test results with JSON persistence
- Safety Controls: Configurable thresholds and fail-safes to prevent system crashes
- Rich CLI Interface: Beautiful terminal output with progress indicators and formatted tables
Installation
# Using pip
pip install lmstrix
# Using uv (recommended)
uv pip install lmstrix
Quick Start
Command-Line Interface
# Scan for available models in LM Studio
lmstrix scan
# List all models with their context limits and test status
lmstrix list
# Test context limit for a specific model
lmstrix test llama-3.2-3b-instruct
# Test all untested models with safety threshold
lmstrix test --all --threshold 102400
# Run inference with enhanced verbose logging
lmstrix infer "What is the capital of Poland?" -m llama-3.2-3b-instruct --verbose
# Run inference with percentage-based output tokens
lmstrix infer "Explain quantum computing" -m llama-3.2-3b-instruct --out_ctx "25%"
# Use file-based prompts with templates
lmstrix infer summary -m llama-3.2-3b-instruct --file_prompt adam.toml --text_file document.txt
# Direct text input for prompts
lmstrix infer "Summarize: {{text}}" -m llama-3.2-3b-instruct --text "Your content here"
Enhanced Verbose Output
When using --verbose, LMStrix provides comprehensive statistics:
══════════════════════════════════════════════════════════
MODEL: llama-3.2-3b-instruct
CONFIG: maxTokens=26214, temperature=0.7
PROMPT (1 lines, 18 chars): Capital of Poland?
══════════════════════════════════════════════════════════
Running inference...
══════════════════════════════════════════════════════════
INFERENCE STATS
Time to first token: 0.82s
Total inference time: 11.66s
Predicted tokens: 338
Prompt tokens: 5
Total tokens: 343
Tokens/second: 32.04
Stop reason: eosFound
══════════════════════════════════════════════════════════
Python API
from lmstrix.loaders.model_loader import load_model_registry
from lmstrix.core.inference_manager import InferenceManager
# Load model registry
registry = load_model_registry()
# List available models
models = registry.list_models()
print(f"Available models: {len(models)}")
# Run inference
manager = InferenceManager(verbose=True)
result = manager.infer(
    model_id="llama-3.2-3b-instruct",
    prompt="What is the meaning of life?",
    out_ctx=100,
    temperature=0.7
)
if result["succeeded"]:
    print(f"Response: {result['response']}")
    print(f"Tokens used: {result['tokens_used']}")
    print(f"Time: {result['inference_time']:.2f}s")
Context Testing & Optimization
LMStrix uses a binary search to discover the context size at which a model actually works, which may be lower than the limit the model declares.
Safety Features
- Threshold Protection: Configurable maximum context size to prevent system crashes
- Progressive Testing: Starts with small contexts and increases safely
- Persistent Results: Saves test results to avoid re-testing
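To make the idea concrete, here is a minimal sketch of a threshold-capped binary search over context sizes. It is an illustration under stated assumptions, not LMStrix's actual implementation: `probe` is a hypothetical callable that loads the model at a given context size and reports whether inference succeeded, and the sketch assumes success is monotonic (if a size works, every smaller size works too). The names `find_context_limit`, `declared_limit`, and `min_ctx` are likewise invented for the example.

```python
from typing import Callable


def find_context_limit(
    probe: Callable[[int], bool],
    declared_limit: int,
    threshold: int = 102400,
    min_ctx: int = 1024,
) -> int:
    """Return the largest context size for which `probe` succeeds, or 0.

    The search never goes above min(declared_limit, threshold), which
    mirrors the "threshold protection" idea: oversized contexts are
    never attempted at all.
    """
    ceiling = min(declared_limit, threshold)
    # Start small: if even the minimum context fails, give up early.
    if ceiling < min_ctx or not probe(min_ctx):
        return 0

    lo, hi = min_ctx, ceiling  # invariant: `lo` is known to work
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always progresses
        if probe(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


if __name__ == "__main__":
    # Fake model that declares 131072 tokens but only works up to 40000.
    print(find_context_limit(lambda ctx: ctx <= 40000, declared_limit=131072))
```

Each probe halves the remaining range, so even a 131072-token window needs only about 17 load attempts rather than a linear sweep.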
Testing Commands
# Test specific model
lmstrix test llama-3.2-3b-instruct
# Test all models with custom threshold
lmstrix test --all --threshold 65536
# Test at specific context size
lmstrix test --all --ctx 32768
# Reset and re-test a model
lmstrix test llama-3.2-3b-instruct --reset
Model Management
Registry Commands
# Scan for new models
lmstrix scan --verbose
# List models with different sorting
lmstrix list --sort size # Sort by size
lmstrix list --sort ctx # Sort by tested context
lmstrix list --show json # Export as JSON
# Check system health
lmstrix health --verbose
Model Persistence
Models stay loaded between inference calls for improved performance:
- When no explicit context is specified, models remain loaded
- Last-used model is remembered for subsequent calls
- Explicit context changes trigger model reloading
Prompt Templating
LMStrix supports flexible prompt templating with TOML files:
# adam.toml
[aps]
prompt = """
You are an AI assistant skilled in Abstractive Proposition Segmentation.
Convert the following text: {{text}}
"""
[summary]
prompt = "Create a comprehensive summary: {{text}}"
Use with CLI:
lmstrix infer aps --file_prompt adam.toml --text "Your text here"
lmstrix infer summary --file_prompt adam.toml --text_file document.txt
Development
# Clone repository
git clone https://github.com/twardoch/lmstrix.git
cd lmstrix
# Install for development
pip install -e ".[dev]"
# Run tests
pytest
# Run linting
hatch run lint:all
Project Structure
src/lmstrix/
├── cli/main.py                  # CLI interface
├── core/
│   ├── inference_manager.py     # Unified inference engine
│   ├── models.py                # Model registry
│   └── context_tester.py        # Context limit testing
├── api/client.py                # LM Studio API client
├── loaders/                     # Data loading utilities
└── utils/                       # Helper utilities
Features in Detail
Adaptive Context Optimizer
- Binary search algorithm for efficient context limit discovery
- Safety thresholds to prevent system crashes
- Automatic persistence of test results
- Resume capability for interrupted tests
Enhanced Logging
- Beautiful emoji-rich output in verbose mode
- Comprehensive inference statistics
- Progress indicators for long operations
- Clear error messages with context
Smart Model Management
- Automatic model discovery from LM Studio
- Persistent registry with JSON storage
- Model state tracking (loaded/unloaded)
- Batch operations for multiple models
Requirements
- Python 3.11+
- LM Studio installed and configured
- Models downloaded in LM Studio
License
MIT License - see LICENSE file for details.
Contributing
Contributions welcome! Please read our contributing guidelines and submit pull requests for any improvements.