OpenAI API proxy for fine-grained cost tracking & control and caching of responses

OpenAI Proxy [openai-wrapi]

A drop-in wrapper to the openai package that tracks costs per user, project, model and staging account.

Problem statement

OpenAI does not currently provide any way to monitor or limit API usage costs by user*, project or model. In fact, there is no concept of "project", only users (which correspond to email addresses), organizations (which correspond to OpenAI accounts and must be individually funded) and API keys (which can be used interchangeably across any organizations to which a user belongs).

This leads to a proliferation of API keys and users, which widens the attack surface from a security point of view. Furthermore, users cannot be forced to use MFA and may continue to use the API and create API keys even if their email address no longer exists.

Lastly, it is easy to make redundant calls to the API, incurring unnecessary costs, especially when developing in an interactive environment such as a Jupyter notebook.

* The latest version of the OpenAI usage dashboard shows the number of calls per user, but not the cost.

Solution

This repo provides a wrapper that checks usage limits before passing the request on to the OpenAI API and records the usage costs per user, project, model and staging account. It leverages the AWS IAM permission framework to control access to the OpenAI API without exposing the API keys, which are unique to each staging account. Responses from the OpenAI API are cached by default. Infrastructure as Code (IaC) is provided to deploy the solution on a serverless architecture in AWS with minimal extra cost and latency.
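The flow described above can be sketched in a few lines of Python. This is only an illustration, not the code that runs in the Lambda function: the `UsageTracker` class, its in-memory dictionaries (standing in for DynamoDB and the cache) and the `upstream` callable are all hypothetical, and the order in which the cache and the limit are consulted is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

Key = Tuple[str, str, str]  # (user, project, model)


class LimitExceeded(Exception):
    """Raised when recorded spend has reached the configured limit."""


@dataclass
class UsageTracker:
    # In-memory stand-ins for the usage/limits table and the response cache.
    limits: Dict[Key, float] = field(default_factory=dict)
    spent: Dict[Key, float] = field(default_factory=dict)
    cache: Dict[str, str] = field(default_factory=dict)

    def handle(self, key: Key, request: str,
               upstream: Callable[[str], Tuple[str, float]]) -> str:
        # 1. Serve a repeated request from the cache at no extra cost
        #    (assumption: the cache is consulted before the limit).
        if request in self.cache:
            return self.cache[request]
        # 2. Refuse the call if the spend so far has reached the limit.
        limit = self.limits.get(key)
        if limit is not None and self.spent.get(key, 0.0) >= limit:
            raise LimitExceeded(f"limit of {limit} USD reached for {key}")
        # 3. Forward the request, then record its cost and cache the response.
        response, cost = upstream(request)
        self.spent[key] = self.spent.get(key, 0.0) + cost
        self.cache[request] = response
        return response
```

Here `upstream` stands for the call to the OpenAI API and returns the response together with its cost in USD. Because the check happens before the call, a single request can take the spend slightly over the limit; the next one is then refused.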

Install

From PyPI (for users)

pip install openai-wrapi

From source (for admins)

git clone https://github.com/teticio/openai-proxy.git
cd openai-proxy
pip install .

Deploy

Ideally, you should have one OpenAI account per staging account (dev, prod). Create a terraform.tfvars file in the iac directory with the following variables:

profile         = "default"   # AWS profile to use
region          = "eu-west-2" # AWS region to deploy to
num_azs         = 3           # Number of availability zones to deploy to (limited by available Elastic IP addresses)
use_elasticache = true        # Whether to use ElastiCache Memcached

stages = { # Staging accounts
  "dev" = {
    openai_api_key = "sk-XXX"
    openai_org_id  = "org-XXX"
  }
  "prod" = {
    openai_api_key = "sk-YYY"
    openai_org_id  = "org-YYY"
  }
}

To deploy, run:

cd iac
terraform init
terraform apply -auto-approve

This will create:

  • A streaming Lambda function URL to proxy calls to the OpenAI API per staging account (dev, prod).
  • A Lambda function to set usage limits and flush the cache per staging account (dev, prod).
  • A DynamoDB table to store usage and limits.
  • An optional ElastiCache Memcached cluster to cache OpenAI API responses.

Usage

In order to use the proxy in your Python code, provided you have the appropriate IAM permissions, include

import openai_wrapi as openai

You no longer need to set your OpenAI API key or organization ID, as these are securely stored in the corresponding Lambda functions. Instead, you should set your OPENAI_API_KEY environment variable to sk-XXX, where XXX corresponds to the URL https://XXX.lambda-url.region.on.aws/ of your Lambda function, as output by Terraform.
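As a convenience, the sk-XXX value can be derived from the URL rather than copied by hand. The helper below is a hypothetical sketch and not part of openai-wrapi; the URL `https://abc123.lambda-url.eu-west-2.on.aws/` is a placeholder for the one printed by Terraform.

```python
import os
from urllib.parse import urlparse


def api_key_from_lambda_url(url: str) -> str:
    """Return "sk-XXX" for a URL of the form https://XXX.lambda-url.<region>.on.aws/."""
    host = urlparse(url).hostname or ""
    # The first label of the host name is the XXX part.
    url_id, _, rest = host.partition(".")
    if not url_id or not rest.startswith("lambda-url."):
        raise ValueError(f"not a Lambda function URL: {url!r}")
    return f"sk-{url_id}"


# Placeholder URL: substitute the one output by `terraform apply`.
os.environ["OPENAI_API_KEY"] = api_key_from_lambda_url(
    "https://abc123.lambda-url.eu-west-2.on.aws/"
)
```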

If you plan to use packages such as langchain which use the openai package internally, you need only ensure you have previously imported openai_wrapi.

By default, the project associated with any API calls will be N/A. In order to set the project name:

openai.set_project("my-project")

To set the staging account:

openai.set_staging("dev")

If you want to disable caching (enabled by default):

openai.set_caching(False)
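How the proxy recognises a repeated request is not described here. One common approach, assumed purely for illustration, is to hash a canonical form of the request so that identical calls map to the same cache entry. The `cache_key` function below is hypothetical and may differ from what the Lambda function does.

```python
import hashlib
import json


def cache_key(path: str, body: dict) -> str:
    """Hash the endpoint path and request body into a stable cache key."""
    # sort_keys makes the key independent of the order of the dict entries.
    canonical = json.dumps(
        {"path": path, "body": body}, sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under this scheme, re-running a notebook cell with an unchanged prompt and parameters yields the same key, so the response can be served from the cache, whereas changing any parameter (the model or the temperature, say) yields a different key.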

Alternatively, if you are using version 1.x.x of openai, you can create a client using

client = openai.OpenAI(project="my-project", staging="dev", caching=True)

It is also possible to set these parameters using environment variables:

os.environ["OPENAI_DEFAULT_PROJECT"] = "hello"
os.environ["OPENAI_DEFAULT_STAGING"] = "dev"
os.environ["OPENAI_DEFAULT_CACHING"] = "0"

(Note that it is necessary to do this when using langchain with version 1.x.x of openai, which explicitly creates a new OpenAI client.)
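To make the meaning of these variables concrete, here is a hypothetical sketch of how such defaults might be resolved. It is not the package's actual code: the `Defaults` type and the `defaults_from_env` function are invented for illustration, and the rule that only "0" disables caching is an assumption based on the example above.

```python
import os
from typing import Mapping, NamedTuple, Optional


class Defaults(NamedTuple):
    project: str
    staging: Optional[str]
    caching: bool


def defaults_from_env(env: Mapping[str, str] = os.environ) -> Defaults:
    """Read the OPENAI_DEFAULT_* variables, falling back to the documented defaults."""
    return Defaults(
        # The project is N/A unless set.
        project=env.get("OPENAI_DEFAULT_PROJECT", "N/A"),
        # No default staging account is documented, so None is used here.
        staging=env.get("OPENAI_DEFAULT_STAGING"),
        # Assumption: "0" disables caching; any other value leaves it enabled.
        caching=env.get("OPENAI_DEFAULT_CACHING", "1") != "0",
    )
```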

If you want to use the proxy from somewhere other than Python, you can use the URL of the Lambda function in place of the OpenAI endpoint, provided you authenticate with AWS appropriately. In fact, you can even make the Lambda function URL public and restrict access with CORS, so that it can be used directly in a frontend application.

Admins

Provided you have the IAM permissions to invoke the openai-admin-{staging} Lambda function, you can:

  • set the usage limits per user, project and model:
openai.set_limits(
    limit=10,              # 10 USD
    staging="dev",         # Dev account
    project="my-project",  # Project name
    user="me",             # Optional
    model="gpt-4",         # Optional
)
  • flush the cache:
openai.flush_cache(staging="dev")

The prices for the OpenAI models can be set in the iac/openai_proxy/prices.js file.
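The format of prices.js is not shown here. As a rough sketch of how per-model prices translate into a recorded cost, assuming prices are quoted per 1,000 tokens with separate rates for prompt and completion tokens, the calculation might look like the following. The `Price` type, the `PRICES` table and its figures are illustrative only and are not taken from prices.js.

```python
from typing import Dict, NamedTuple


class Price(NamedTuple):
    prompt_per_1k: float      # USD per 1,000 prompt tokens
    completion_per_1k: float  # USD per 1,000 completion tokens


# Illustrative figures only: not taken from prices.js or any price list.
PRICES: Dict[str, Price] = {
    "example-model": Price(prompt_per_1k=0.03, completion_per_1k=0.06),
}


def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the cost in USD of one request, given its token counts."""
    price = PRICES[model]  # raises KeyError for a model with no price entry
    return (prompt_tokens * price.prompt_per_1k
            + completion_tokens * price.completion_per_1k) / 1000
```

With these example rates, a request with 1,000 prompt tokens and 500 completion tokens costs about 0.06 USD, which would count towards a limit such as the 10 USD set above.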

Note that this wrapper currently works for major versions 0 and 1 of the openai package.

To see the usage in a dashboard, run

streamlit run app.py
