A Python package for tracking Bedrock API usage metrics (cost, latency, tokens) with DynamoDB storage and alerting.

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3.10

Project description

Newberry Metrics

A Python package for tracking and analyzing AWS Bedrock API usage metrics, including costs and latency.

Latest Version: 0.1.4

Features

Track API call costs and latency
Monitor token usage (input and output)
Maintain session-based metrics
Support for multiple Bedrock models
Automatic AWS credential handling
Detailed latency tracking and analysis

Installation

pip install newberry_metrics

AWS Credential Setup

The package uses the AWS credential chain to authenticate with AWS services. You can set up credentials in one of the following ways:

1. Using IAM Role (Recommended for EC2)

Attach an IAM role to your EC2 instance with Bedrock permissions
No additional configuration needed
The code will automatically use the instance's IAM role credentials

2. Using AWS CLI

aws configure

This will create a credentials file at ~/.aws/credentials with your access key and secret key.

3. Using Environment Variables

export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_DEFAULT_REGION=your_region
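
As a rough pre-flight check, the sketch below guesses which of the sources above would be picked up on the local machine. It uses only the standard library and mirrors the usual lookup order (environment variables before the shared credentials file); the function name `find_credential_source` is hypothetical and not part of this package. An IAM role cannot be detected this way, so the fallback message only hints at it.

```python
import os
from pathlib import Path


def find_credential_source(env=None, home=None):
    """Best-guess name of the local credential source (illustrative only)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Environment variables take precedence over the shared credentials file.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment variables"

    # `aws configure` writes this file.
    if (home / ".aws" / "credentials").is_file():
        return "shared credentials file"

    return "none found locally (an IAM role may still apply on EC2)"


print(find_credential_source())
```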

Usage Examples

1. Initialize TokenEstimator

Initialize the tracker with a specific Bedrock model ID. You can also optionally provide cost and latency thresholds for automatic alerting.

from newberry_metrics import TokenEstimator
import json # For printing examples

# Initialize with your model ID
model_id = "anthropic.claude-3-sonnet-20240229-v1:0"

# Optional: Define alert thresholds
cost_alert_threshold = 0.05  # Alert if total session cost exceeds $0.05
latency_alert_threshold_ms = 2000 # Alert if any single call takes > 2000ms

estimator = TokenEstimator(
    model_id=model_id,
    cost_threshold=cost_alert_threshold,      # Optional
    latency_threshold_ms=latency_alert_threshold_ms # Optional
)

2. Get Model Pricing

Retrieve the cost per million tokens for the initialized model.

costs = estimator.get_model_cost_per_million()
print(f"Input cost per million: ${costs['input']}")
print(f"Output cost per million: ${costs['output']}")
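
To show how per-million pricing turns into the cost of one call, here is a minimal worked example. It assumes the returned dictionary holds dollar prices per million tokens under the `input` and `output` keys; the helper `estimate_call_cost` and the sample prices are illustrative and not taken from the package.

```python
def estimate_call_cost(input_tokens, output_tokens, costs_per_million):
    """Cost in dollars of one call, given per-million-token prices."""
    input_cost = input_tokens / 1_000_000 * costs_per_million["input"]
    output_cost = output_tokens / 1_000_000 * costs_per_million["output"]
    return input_cost + output_cost


# Assumed sample prices: $3 per million input tokens, $15 per million output tokens.
sample_costs = {"input": 3.0, "output": 15.0}

# 1,000 input tokens cost $0.003 and 500 output tokens cost $0.0075.
call_cost = estimate_call_cost(1_000, 500, sample_costs)
print(f"Estimated cost: ${call_cost:.6f}")
```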

3. Making API Calls & Tracking Metrics

Although it is marked private, the _invoke_bedrock method handles the call to the Bedrock model. After each call it automatically updates and saves the session metrics (cost, latency, token counts).

If the thresholds set during initialization are exceeded (total cost or current call latency), alerts will be printed to the console during this step.

# Make an API call; session metrics are tracked and alerts are checked automatically
prompt = "Explain the concept of Large Language Models."
# Note: Using the private _invoke_bedrock method here as shown in main.py example
# A public wrapper might be added in future versions.
response = estimator._invoke_bedrock(prompt, max_tokens=200)

# The response carries the session metrics, the current call's metrics, and the answer
session_metrics = response['SessionMetrics']
current_call_metrics = session_metrics['current_call']
answer = session_metrics.get('answer', '(Answer not found)')

print("\n--- Single Call Results ---")
print(f"Answer: {answer[:100]}...")  # Print truncated answer
print(f"Cost (This Call): ${current_call_metrics['cost']:.6f}")
print(f"Latency (This Call): {current_call_metrics['latency']:.3f}s")
print(f"Tokens (In/Out): {current_call_metrics['input_tokens']}/{current_call_metrics['output_tokens']}")

# Session totals, already updated by this call
print(f"Total session cost: ${session_metrics['total_cost']:.6f}")
print(f"Average cost: ${session_metrics['average_cost']:.6f}")
print(f"Total latency: {session_metrics['total_latency']:.3f} seconds")
print(f"Average latency: {session_metrics['average_latency']:.3f} seconds")
print(f"Total calls: {session_metrics['total_calls']}")

4. Retrieve Current Session Metrics

You can get the complete metrics object for the current session (reflecting all calls made so far) at any time. This reads the latest state managed by the estimator.

from dataclasses import asdict # For printing example

current_metrics = estimator.get_session_metrics()
print(f"Total calls so far: {current_metrics.total_calls}")
print(f"Average latency: {current_metrics.average_latency:.3f}s")

5. Reset Session Metrics

Reset the tracked metrics for the current session (identified by AWS credentials) back to zero in the persistent store (DynamoDB).

estimator.reset_session_metrics()
print("Session metrics have been reset.")

6. Visualize Session Metrics (Optional)

Generate bar charts summarizing the cost and average latency of the session's API calls, grouped either hourly or daily. Requires matplotlib and pandas to be installed.

# Ensure you have data first by making some calls
# response = estimator._invoke_bedrock("Another prompt...")

try:
    # Show plots grouped by hour (default)
    print("Displaying hourly metrics plot...")
    estimator.visualize_metrics(time_interval='hourly')

    # Or, show plots grouped by day
    print("Displaying daily metrics plot...")
    estimator.visualize_metrics(time_interval='daily')

    # Or, save the plots to files instead of showing them
    # Creates 'session_plots_hourly.png' and 'session_plots_daily.png'
    # print("Saving metrics plots...")
    # estimator.visualize_metrics(time_interval='hourly', save_path='session_plots')
    # estimator.visualize_metrics(time_interval='daily', save_path='session_plots')

except ImportError:
    print("Please install matplotlib and pandas to use visualization: pip install matplotlib pandas")
except Exception as e:
    print(f"An error occurred during visualization: {e}")
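
The grouping behind such charts can be sketched without any plotting library. The example below buckets per-call records by hour or by day and computes the total cost and average latency for each bucket. It is an illustration under assumed record fields (`timestamp`, `cost`, `latency`), not the package's own implementation, which relies on pandas and matplotlib.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_calls(calls, time_interval="hourly"):
    """Group call records into hourly or daily buckets (illustrative only)."""
    formats = {"hourly": "%Y-%m-%d %H:00", "daily": "%Y-%m-%d"}
    if time_interval not in formats:
        raise ValueError("time_interval must be 'hourly' or 'daily'")

    buckets = defaultdict(list)
    for call in calls:
        buckets[call["timestamp"].strftime(formats[time_interval])].append(call)

    summary = {}
    for key in sorted(buckets):
        group = buckets[key]
        summary[key] = {
            "total_cost": sum(c["cost"] for c in group),
            "average_latency": sum(c["latency"] for c in group) / len(group),
        }
    return summary


# Made-up sample records for demonstration.
sample_calls = [
    {"timestamp": datetime(2025, 5, 5, 9, 15), "cost": 0.002, "latency": 1.0},
    {"timestamp": datetime(2025, 5, 5, 9, 45), "cost": 0.004, "latency": 2.0},
    {"timestamp": datetime(2025, 5, 5, 10, 5), "cost": 0.001, "latency": 3.0},
]
hourly = aggregate_calls(sample_calls, "hourly")
daily = aggregate_calls(sample_calls, "daily")
for bucket, stats in hourly.items():
    print(bucket, stats)
```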



Optional: Analyzing a Specific Response (Advanced)

If you have a raw response object from _invoke_bedrock (or elsewhere), you can calculate its specific cost/latency independently using these helper methods. Note that this calculation is already performed internally by _invoke_bedrock during the tracking process.

Supported Models

The package includes pricing information for the following Bedrock models (primarily in us-east-1). Ensure the model ID you use matches one of these or add its pricing to get_model_cost_per_million if needed.

amazon.nova-pro-v1:0 ($0.003/$0.012 per 1K tokens)
amazon.nova-micro-v1:0 ($0.000035/$0.00014 per 1K tokens)
anthropic.claude-3-sonnet-20240229-v1:0 ($0.003/$0.015 per 1K tokens)
anthropic.claude-3-haiku-20240307-v1:0 ($0.00025/$0.00125 per 1K tokens)
anthropic.claude-3-opus-20240229-v1:0 ($0.015/$0.075 per 1K tokens)
meta.llama2-13b-chat-v1 ($0.00075/$0.001 per 1K tokens)
meta.llama2-70b-chat-v1 ($0.00195/$0.00256 per 1K tokens)
ai21.jamba-1-5-large-v1:0 ($0.0125/$0.0125 per 1K tokens)
cohere.command-r-v1:0 ($0.0005/$0.0015 per 1K tokens)
cohere.command-r-plus-v1:0 ($0.003/$0.015 per 1K tokens)
mistral.mistral-7b-instruct-v0:2 ($0.0002/$0.0006 per 1K tokens)
mistral.mixtral-8x7b-instruct-v0:1 ($0.0007/$0.0021 per 1K tokens)

Pricing is based on us-east-1 and may vary in other regions.
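
The list quotes prices per 1K tokens, while get_model_cost_per_million is named for per-million pricing. Assuming the two differ only by a factor of 1,000, the conversion and a guard for unlisted model IDs could look like the sketch below. The table `PRICES_PER_1K` and the function `cost_per_million` are hypothetical names; the two entries are copied from the list above.

```python
# (input, output) dollars per 1K tokens, copied from the list above.
PRICES_PER_1K = {
    "anthropic.claude-3-sonnet-20240229-v1:0": (0.003, 0.015),
    "anthropic.claude-3-haiku-20240307-v1:0": (0.00025, 0.00125),
}


def cost_per_million(model_id, prices_per_1k=PRICES_PER_1K):
    """Convert per-1K prices to per-million prices for one model."""
    if model_id not in prices_per_1k:
        raise ValueError(f"No pricing known for model {model_id!r}")
    input_1k, output_1k = prices_per_1k[model_id]
    return {"input": input_1k * 1000, "output": output_1k * 1000}


sonnet = cost_per_million("anthropic.claude-3-sonnet-20240229-v1:0")
print(f"Input: ${sonnet['input']:.2f}, output: ${sonnet['output']:.2f} per million tokens")
```

Raising an error for an unknown model ID, rather than silently assuming a price of zero, keeps cost totals from being understated.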

Session Metrics & Alerting

The package automatically tracks and persists session metrics using Amazon DynamoDB. A dedicated table (default name: BedrockSessionMetrics) is required in your AWS account in the specified region. Each session's data is stored as an item keyed by a unique hash derived from the AWS credentials and region used.
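
The exact hashing scheme is not documented here, so the following is only an illustration of how a stable session key could be derived from the access key ID and region. The function name `session_key`, the choice of SHA-256, and the inputs are all assumptions.

```python
import hashlib


def session_key(access_key_id, region):
    """Derive a stable, non-reversible key from credentials and region."""
    raw = f"{access_key_id}:{region}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


# The same inputs always map to the same key; a different region gives a different key.
key_east = session_key("AKIAEXAMPLE", "us-east-1")
key_west = session_key("AKIAEXAMPLE", "us-west-2")
print(key_east[:16], key_west[:16])
```

Hashing keeps the raw access key ID out of the table while still letting repeated runs with the same credentials and region land on the same item.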

Metrics stored include:

total_cost: Cumulative cost for the session.
average_cost: Average cost per call in the session.
total_latency: Cumulative latency (in seconds) for the session.
average_latency: Average latency per call in the session.
total_calls: Total number of API calls made in the session.
api_calls: A detailed list (List[APICallMetrics]) of each individual API call containing its timestamp, cost, latency, token counts, and call number within the session.
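
The fields above can be pictured as a pair of dataclasses. The sketch below is an assumption about the layout, not the package's source: the field name `call_counter` and the `record` method are hypothetical, and the real classes may differ. It shows how the totals and averages stay consistent as calls are recorded.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class APICallMetrics:
    timestamp: str
    cost: float
    latency: float  # seconds
    input_tokens: int
    output_tokens: int
    call_counter: int  # position of the call within the session


@dataclass
class SessionMetrics:
    total_cost: float = 0.0
    average_cost: float = 0.0
    total_latency: float = 0.0
    average_latency: float = 0.0
    total_calls: int = 0
    api_calls: List[APICallMetrics] = field(default_factory=list)

    def record(self, timestamp, cost, latency, input_tokens, output_tokens):
        """Add one call and refresh the running totals and averages."""
        self.total_calls += 1
        self.total_cost += cost
        self.total_latency += latency
        self.average_cost = self.total_cost / self.total_calls
        self.average_latency = self.total_latency / self.total_calls
        self.api_calls.append(
            APICallMetrics(timestamp, cost, latency, input_tokens,
                           output_tokens, self.total_calls)
        )


metrics = SessionMetrics()
metrics.record("2025-05-05T09:15:00", 0.002, 1.0, 100, 50)
metrics.record("2025-05-05T09:45:00", 0.004, 2.0, 200, 80)
print(asdict(metrics))
```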

Alerting: If cost_threshold (float, e.g., 0.10 for $0.10) or latency_threshold_ms (float, e.g., 1500.0 for 1500 ms) is provided during TokenEstimator initialization, the package automatically prints a warning to the console when either of the following occurs:

The total cost for the current session exceeds the cost_threshold after an API call.
The latency of any individual API call exceeds the latency_threshold_ms.
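
The two conditions can be sketched as a small check. This is an illustration of the described behavior, not the package's code: the function name `check_alerts` and the message wording are hypothetical. Latency is tracked in seconds but the threshold is given in milliseconds, so the sketch converts before comparing.

```python
def check_alerts(total_cost, call_latency_s, cost_threshold=None,
                 latency_threshold_ms=None):
    """Return warning messages for any exceeded threshold."""
    alerts = []

    # Cost is compared against the running session total.
    if cost_threshold is not None and total_cost > cost_threshold:
        alerts.append(
            f"ALERT: session cost ${total_cost:.4f} exceeds ${cost_threshold:.4f}"
        )

    # Latency is compared per call, after converting seconds to milliseconds.
    latency_ms = call_latency_s * 1000
    if latency_threshold_ms is not None and latency_ms > latency_threshold_ms:
        alerts.append(
            f"ALERT: call latency {latency_ms:.0f} ms exceeds {latency_threshold_ms:.0f} ms"
        )
    return alerts


for message in check_alerts(0.06, 2.5, cost_threshold=0.05, latency_threshold_ms=2000):
    print(message)
```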

Recent Updates

Requirements

Python >= 3.10
boto3 for AWS Bedrock integration
matplotlib for visualization
pandas for visualization

Contact & Support

Developer: Satya-Holbox, Harshika-Holbox
Email: satyanarayan@holbox.ai
GitHub: SatyaTheG

License

This project is licensed under the MIT License.

Note: This package is actively maintained and regularly updated with new features and model support.


Release history

0.2.1: May 16, 2025
0.1.8: May 12, 2025
0.1.4: May 5, 2025 (this version)
0.0.30: Apr 22, 2025
0.0.20: Apr 22, 2025
0.0.10: Apr 22, 2025

Download files

Source Distribution

newberry_metrics-0.1.4.tar.gz (14.4 kB)

Uploaded May 5, 2025 Source

Built Distribution

newberry_metrics-0.1.4-py3-none-any.whl (11.8 kB)

Uploaded May 5, 2025 Python 3

File details

Details for the file newberry_metrics-0.1.4.tar.gz.

File metadata

Download URL: newberry_metrics-0.1.4.tar.gz
Upload date: May 5, 2025
Size: 14.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for newberry_metrics-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`849f186928554854df8fe10a385b1bfbe16c60cc66232c71ca9141797b783be2`
MD5	`64dfc8be2ff275eb3d830881e12d9223`
BLAKE2b-256	`0b48eca2593f2d97fd6af4eb25770f609c37fc8e0435557fb3b6f8779c62364f`


File details

Details for the file newberry_metrics-0.1.4-py3-none-any.whl.

File metadata

Download URL: newberry_metrics-0.1.4-py3-none-any.whl
Upload date: May 5, 2025
Size: 11.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for newberry_metrics-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`148f321d916e48015ab8be9695cd00fb5318dfcea5902cef50f200e3f0ee4f46`
MD5	`1fba587d4cbae6561f41e484f4b5e790`
BLAKE2b-256	`f6d8fa8df55016d97cddb0cdc2c89b6ff4951d94f26772f4dbdabc937d8c0d8d`

