cognitor-py
Python SDK to extract relevant metrics from Small Language Model inference calls.
cognitor-py is a Python SDK that wraps transformers inference calls to extract useful metadata and performance metrics.
Features
- Model Information: Automatically captures the model name.
- Performance Metrics: Tracks CPU and RAM usage during inference.
- GPU Monitoring: Captures peak GPU memory usage (if CUDA is available).
- Token Counting: Calculates input and output token counts for common pipeline tasks.
- Latency Tracking: Measures inference duration.
- Error Handling: Captures and reports errors during inference.
- Flexible Logging Targets: Automatically saves all inference logs to either a local PostgreSQL database or a local file (JSON lines).
- Graceful Error Handling: Ensures the program continues to run even if the database is unreachable.
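The latency-tracking feature above can be sketched with a standard-library context manager. This is an illustrative sketch only, not cognitor's actual internals; the `track` name is borrowed from the usage example below, and the `metrics` dict is a stand-in for the SDK's metadata record:

```python
import time
from contextlib import contextmanager

@contextmanager
def track(metrics: dict):
    """Measure wall-clock duration of the enclosed block and store it."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the block raises, mirroring error-tolerant logging.
        metrics["duration"] = time.perf_counter() - start

metrics = {}
with track(metrics):
    time.sleep(0.05)  # stand-in for an inference call
print(metrics["duration"])
```

The `finally` clause ensures a duration is recorded even when inference raises, which is consistent with the error-handling feature above.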
Installation
pip install cognitor-py
Usage
Using the Inference Monitor
from transformers import pipeline, AutoTokenizer
from cognitor import Cognitor
# Initialize your model and tokenizer
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
pipe = pipeline("text-generation", model=model_name, tokenizer=tokenizer)
# Initialize Cognitor with PostgreSQL configuration (default)
# Or use log_type="file" and log_path="logs.jsonl" for file logging
cognitor = Cognitor(
    model_name=model_name,
    tokenizer=tokenizer,
    log_type="database",  # or "file"
    host="localhost",
    port=5432,
    user="postgres",
    password="postgres",
    dbname="cognitor"
)
# Run inference within the monitor context
with cognitor.monitor() as m:
    input_text = "Once upon a time,"
    # Use track() to capture only the inference duration
    with m.track():
        output = pipe(input_text, max_length=50)
    m.capture(input_data=input_text, output=output)
# The metadata is now available via the cognitor instance
metadata = cognitor.get_last_metadata()
print(output)
print(metadata)
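The graceful-error-handling behavior (the program keeps running when the database is unreachable) can be sketched as a try/except fallback to file logging. This is a sketch of the pattern, not cognitor's implementation; the `write_db` stub below is hypothetical and simply simulates an unreachable PostgreSQL server:

```python
import json

def write_db(record: dict) -> None:
    # Hypothetical stub: simulate an unreachable database at localhost:5432.
    raise ConnectionError("could not connect to localhost:5432")

def log_record(record: dict, fallback_path: str = "logs.jsonl") -> str:
    """Try the database first; on failure, append one JSON line locally."""
    try:
        write_db(record)
        return "database"
    except ConnectionError:
        with open(fallback_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
        return "file"

target = log_record({"model_name": "gpt2", "duration": 0.45})
print(target)  # the stub always fails, so logging falls back to "file"
```

Catching the connection error at the logging boundary keeps inference results intact even when the metrics backend is down.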
Metadata Structure
The extracted metadata follows this structure:
{
    "model_name": "gpt2",
    "timestamp": "2026-04-01T14:34:14+0200",
    "input_tokens": 5,
    "output_tokens": 45,
    "cpu_percent": 12.5,
    "ram_usage_percent": 1.2,
    "gpu_usage_percent": 5.5,  # Optional
    "duration": 0.45,  # Inference-only duration
    "input": "Once upon a time,",
    "output": [...],
    "error": None
}
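With file logging (`log_type="file"`), each entry is one JSON object per line, so logs can be read back with the standard library alone. The field names below follow the metadata structure shown above; the sample line is synthetic, written here only so the snippet is self-contained:

```python
import json

# Write one synthetic log line mirroring the metadata structure above.
sample = '{"model_name": "gpt2", "input_tokens": 5, "output_tokens": 45, "duration": 0.45}\n'
with open("logs.jsonl", "w", encoding="utf-8") as f:
    f.write(sample)

# Read the JSON-lines log back, one record per line.
entries = []
with open("logs.jsonl", encoding="utf-8") as f:
    for line in f:
        entries.append(json.loads(line))

# Aggregate a simple statistic across all logged inferences.
total_tokens = sum(e["input_tokens"] + e["output_tokens"] for e in entries)
print(total_tokens)  # → 50
```

The same pattern scales to any per-record aggregation (mean latency, error rate) without needing a database.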
License
MIT
Project details
Download files
Source Distribution
cognitor-0.0.0.tar.gz (9.1 kB)
Built Distribution
cognitor-0.0.0-py3-none-any.whl (8.1 kB)
File details
Details for the file cognitor-0.0.0.tar.gz.
File metadata
- Download URL: cognitor-0.0.0.tar.gz
- Upload date:
- Size: 9.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aba2a9831004faab3ea8cd78953db13b00577c187e64f361e325b2bfda5c984a |
| MD5 | 620f0292742490cd67bef8ee9016e202 |
| BLAKE2b-256 | 9ac303fc29be090837e628ff0a5d385e778ae25b7321f40270001bfed0056feb |
File details
Details for the file cognitor-0.0.0-py3-none-any.whl.
File metadata
- Download URL: cognitor-0.0.0-py3-none-any.whl
- Upload date:
- Size: 8.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 50d6fed631513c69915d5e535c70702a5d6d6cc97617efbed4a87023c9cec5d2 |
| MD5 | c353d754881a2f5aa2919dfb33a3feba |
| BLAKE2b-256 | 673f602d852d57e70986b457c5191e7c0f3217c23d06b57b36e3748691403a82 |