A Python package for tracking Bedrock API usage metrics (cost, latency, tokens) with an automatically launched dashboard
Newberry Metrics
A Python package for tracking and analyzing AWS Bedrock API usage metrics, including costs, latency, and token usage, with an automatically launched dashboard for live visualization.
Features
- Track API call costs, latency, and token usage (input/output).
- Automatic Background Dashboard: A Streamlit dashboard is launched as a background process upon TokenEstimator initialization, providing live visualization.
- Dashboard Features: Displays KPIs (total/average cost & latency), hourly/daily charts, and detailed call logs.
- Persistent Session Storage: Maintains session-based metrics in a JSON file located in ~/.newberry_metrics/sessions/, uniquely identified by AWS credentials.
- Support for multiple Bedrock models.
- Automatic AWS credential handling.
- Console alerts for configurable cost and latency thresholds.
- Static method (TokenEstimator.stop_dashboard()) to manually stop the background dashboard process.
- Duplicate Call Prevention: Ensures metrics for a single Bedrock response are logged only once, even if calculate_prompt_cost is called multiple times with the same response object.
Installation
pip install newberry_metrics
This also installs the required dependencies, including streamlit, pandas, and plotly.
AWS Credential Setup
The package uses the AWS credential chain. Configure your AWS credentials via IAM roles (recommended for EC2), aws configure, or environment variables.
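Before initializing the estimator, it can help to check which credential source is likely to be picked up. The sketch below uses only the standard library and looks at two common sources: environment variables and the shared credentials file written by aws configure. It does not detect IAM roles, and detect_credential_source is a hypothetical helper, not part of newberry_metrics or boto3.

```python
import os
from pathlib import Path


def detect_credential_source(env=None, home=None):
    """Return a label for the first credential source found, or None.

    Hypothetical helper for illustration; boto3 performs its own, more
    complete lookup (including IAM roles), which this does not replicate.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Static keys exported in the environment.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment"

    # Shared credentials file created by `aws configure`.
    if (home / ".aws" / "credentials").is_file():
        return "shared-credentials-file"

    return None


print(detect_credential_source())
```

A result of None does not necessarily mean the estimator will fail, since an IAM role or another provider in the chain may still supply credentials.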
Usage Examples
1. Initialize TokenEstimator & Launch Dashboard
When TokenEstimator is initialized, it automatically launches the Newberry Metrics dashboard in the background if it's not already running. The console will display the dashboard URL (typically http://localhost:8501).
from newberry_metrics import TokenEstimator

model_id = "anthropic.claude-3-5-sonnet-20240620-v1:0"
region = "us-east-1"
cost_alert_threshold = 0.05  # alert when total session cost exceeds this
latency_alert_threshold_ms = 2000  # alert when a single call exceeds this latency (ms)

estimator = TokenEstimator(
    model_id=model_id,
    region=region,
    cost_threshold=cost_alert_threshold,  # Optional
    latency_threshold_ms=latency_alert_threshold_ms,  # Optional
)
The dashboard will continue running even if your script finishes. Open the URL in your browser.
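Because the dashboard runs as a separate process, a script may want to confirm that something is listening on the dashboard port before pointing a browser at it. A minimal sketch using only the standard library follows; is_port_open is a hypothetical helper, and port 8501 is only the typical value mentioned above, so the real port may differ.

```python
import socket


def is_port_open(host="localhost", port=8501, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("Dashboard reachable:", is_port_open())
```

This only shows that some process accepts connections on that port; it does not verify that the process is the Newberry Metrics dashboard.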
2. Get Model Pricing
costs = estimator.get_model_cost_per_million()
print(f"Input cost/1M tokens: ${costs['input']}, Output cost/1M tokens: ${costs['output']}")
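Per-million pricing converts to a per-call cost by scaling each token count by its rate. The sketch below shows that arithmetic with the same 'input' and 'output' keys as the dictionary above. The prices and token counts are made-up placeholders, and estimate_call_cost is a hypothetical helper rather than a package function.

```python
def estimate_call_cost(input_tokens, output_tokens, costs):
    """Cost in dollars for one call, given per-million-token prices."""
    input_cost = input_tokens / 1_000_000 * costs["input"]
    output_cost = output_tokens / 1_000_000 * costs["output"]
    return input_cost + output_cost


# Placeholder prices, not real Bedrock rates.
example_costs = {"input": 2.0, "output": 10.0}
print(f"${estimate_call_cost(1_000, 150, example_costs):.6f}")
```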
3. Making API Calls & Tracking Metrics
First, get the raw response object from Bedrock using get_response(). Then, pass this object to calculate_prompt_cost() to process it, calculate metrics, and update the session file.
prompt = "Explain Large Language Models simply."
max_tokens_to_generate = 150
# Step 1: Get the raw response object
raw_response_object = estimator.get_response(
    prompt=prompt,
    max_tokens=max_tokens_to_generate
)
# Step 2: Calculate cost and update metrics
# This also returns detailed info about the call and session.
call_and_session_info = estimator.calculate_prompt_cost(raw_response_object)
print(f"Answer (truncated): {call_and_session_info.get('answer', 'N/A')[:100]}...")
current_call_metrics = call_and_session_info.get('current_call_metrics', {})
print(f"Cost for this call: ${current_call_metrics.get('cost', 0):.6f}")
print(f"Total session cost: ${call_and_session_info.get('total_cost_session', 0):.6f}")
Refresh your dashboard (using its refresh button 🔄) to see the new data. If calculate_prompt_cost is called again with the same raw_response_object, new metrics will not be logged, preventing duplicates.
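One way such duplicate prevention can work is to remember an identifier for each processed response and skip any repeat. The sketch below illustrates that idea only and is not the package's actual mechanism; the CallLog class and the use of a request-ID string as the key are assumptions.

```python
class CallLog:
    """Record per-call metrics at most once per response identifier."""

    def __init__(self):
        self._seen_ids = set()
        self.calls = []

    def record(self, response_id, cost, latency_ms):
        """Store the metrics and return True, or return False on a repeat."""
        if response_id in self._seen_ids:
            return False
        self._seen_ids.add(response_id)
        self.calls.append({"cost": cost, "latency": latency_ms})
        return True


log = CallLog()
log.record("req-1", 0.002, 850)
log.record("req-1", 0.002, 850)  # same response again: ignored
print(len(log.calls))
```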
4. Using the Dashboard
- Automatic Launch: Starts in the background when TokenEstimator is initialized. The URL and PID are printed to the console.
- Persistence: Runs independently of the launching script.
- Data Source: Reads from ~/.newberry_metrics/sessions/session_metrics_<CREDENTIAL_HASH>.json.
- Refresh Button: Manually click the 🔄 button on the dashboard to load the latest metrics after new calls.
- Shutdown:
  - Programmatically: TokenEstimator.stop_dashboard()
  - Manually: Kill the process using the PID (from the console or ~/.newberry_metrics/sessions/.newberry_dashboard.pid).

To stop the dashboard programmatically:
from newberry_metrics import TokenEstimator
TokenEstimator.stop_dashboard()
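The manual route can also be scripted. The sketch below reads the PID file named above and sends SIGTERM to that process. It assumes the file contains just the PID as plain text, which the source does not confirm, and stop_by_pid_file is a hypothetical helper. Prefer TokenEstimator.stop_dashboard() where possible.

```python
import os
import signal
from pathlib import Path

PID_FILE = Path.home() / ".newberry_metrics" / "sessions" / ".newberry_dashboard.pid"


def stop_by_pid_file(pid_file=PID_FILE):
    """Send SIGTERM to the PID stored in pid_file; return True if signalled."""
    try:
        # Assumption: the file holds only the PID as plain text.
        pid = int(Path(pid_file).read_text().strip())
    except (FileNotFoundError, ValueError):
        return False
    if pid <= 0:
        return False  # never signal a process group by accident
    try:
        os.kill(pid, signal.SIGTERM)
    except (ProcessLookupError, PermissionError):
        return False  # stale PID, or a process owned by someone else
    return True
```

The function is deliberately not called here: a stale PID file could point at an unrelated process, so check what the PID refers to before running it.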
5. Retrieve Current Session Metrics Programmatically
current_session_object = estimator.get_session_metrics()
print(f"Total calls: {current_session_object.total_calls}, Total cost: ${current_session_object.total_cost:.6f}")
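The same totals can be read straight from the session file, for example from a separate process. The sketch below uses the field names listed under Session Metrics & Alerting and assumes they are top-level keys of the JSON document, which the source does not spell out. summarize_session is a hypothetical helper, and the files are found by pattern because the credential hash in the file name is not known in advance.

```python
import json
from pathlib import Path


def summarize_session(path):
    """Return (total_calls, total_cost, slowest_latency) from a session file."""
    data = json.loads(Path(path).read_text())
    calls = data.get("api_calls", [])
    slowest = max((c.get("latency", 0) for c in calls), default=0)
    return data.get("total_calls", len(calls)), data.get("total_cost", 0.0), slowest


# List candidate session files; there is one per credential hash.
sessions_dir = Path.home() / ".newberry_metrics" / "sessions"
for session_file in sorted(sessions_dir.glob("session_metrics_*.json")):
    print(session_file.name, summarize_session(session_file))
```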
6. Reset Session Metrics
Resets metrics in the current session's JSON file to zero.
estimator.reset_session_metrics()
Supported Models
Pricing information is included for (but not limited to):
- amazon.nova-pro-v1:0
- anthropic.claude-3-haiku-20240307-v1:0
- anthropic.claude-3-sonnet-20240229-v1:0
- anthropic.claude-3-opus-20240229-v1:0
- anthropic.claude-3-5-sonnet-20240620-v1:0
- meta.llama2-13b-chat-v1
- meta.llama2-70b-chat-v1
- ai21.jamba-1-5-large-v1:0
- cohere.command-r-v1:0
- cohere.command-r-plus-v1:0
- mistral.mistral-7b-instruct-v0:2
- mistral.mixtral-8x7b-instruct-v0:1
Parsing logic for these and other models is in bedrock_models.py.
Session Metrics & Alerting
- Session File Location: ~/.newberry_metrics/sessions/session_metrics_<CREDENTIAL_HASH>.json. The hash is derived from AWS credentials and region.
- PID File Location: ~/.newberry_metrics/sessions/.newberry_dashboard.pid.
- Dashboard Data: The Streamlit dashboard reads from the session JSON file.
- Metrics Stored: total_cost, average_cost, total_latency, average_latency, total_calls, and a detailed list api_calls (each with timestamp, cost, latency, input_tokens, output_tokens, call_counter).
- Alerting: Console warnings are printed if cost_threshold (total session cost) or latency_threshold_ms (individual call latency) is exceeded.
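The two thresholds apply at different scopes: the cost threshold is compared against the running session total, while the latency threshold is compared against each individual call. A minimal sketch of that logic follows, with a hypothetical check_alerts function and message wording that is not taken from the package.

```python
def check_alerts(total_session_cost, call_latency_ms,
                 cost_threshold=None, latency_threshold_ms=None):
    """Return warning strings for any exceeded threshold (None disables one)."""
    warnings = []
    if cost_threshold is not None and total_session_cost > cost_threshold:
        warnings.append(
            f"Session cost ${total_session_cost:.6f} exceeds ${cost_threshold:.6f}"
        )
    if latency_threshold_ms is not None and call_latency_ms > latency_threshold_ms:
        warnings.append(
            f"Call latency {call_latency_ms:.0f} ms exceeds {latency_threshold_ms:.0f} ms"
        )
    return warnings


for message in check_alerts(0.06, 1500, cost_threshold=0.05, latency_threshold_ms=2000):
    print("WARNING:", message)
```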
Requirements
- Python >= 3.10
- boto3
- streamlit
- pandas
- plotly
Contact & Support
- Developer: Satya-Holbox, Harshika-Holbox
- Email: satyanarayan@holbox.ai
- GitHub: SatyaTheG
License
This project is licensed under the MIT License.
Note: This package is actively maintained. Please ensure you are using the latest version for new features and model support.