
Read-through cache for OpenAI's API.

Project description

llmem

A read-through cache for OpenAI chat completions.

Why:

  • Avoid running up bills when developing/testing by sending the same request multiple times or forgetting to mock things.
  • Fine-tune an open source model on data I've paid for (while adhering to all terms of use).
  • Evaluate future models.
  • Track model changes over time.

use

It currently supports only the chat completions API.

requests

NOTE: These examples run against my public server llmem.com. You might want to run your own server instead.

import requests

api_key = "sk-..."  # your OpenAI API key

headers = {
  "Content-Type": "application/json",
  "Authorization": f"Bearer {api_key}"
}

payload = {
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Hi."
        },
      ]
    }
  ],
  "max_tokens": 10
}

response = requests.post("https://llmem.com/chat/completions", headers=headers, json=payload)

print(response.json())

If you run that twice, you'll see that the first response has an ID starting with chatcmpl, while subsequent (cached) responses have IDs starting with llmem.
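A quick way to see that in action, reusing the headers and payload from above (this assumes the exact request isn't already in the cache):

first = requests.post("https://llmem.com/chat/completions", headers=headers, json=payload).json()
second = requests.post("https://llmem.com/chat/completions", headers=headers, json=payload).json()

# The first call goes through to OpenAI; the second is served from the cache.
print(first["id"])   # starts with "chatcmpl"
print(second["id"])  # starts with "llmem"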

You can also use the OpenAI client, although it will currently only work for /chat requests:

from openai import OpenAI

client = OpenAI(
    api_key=api_key,
    base_url="https://llmem.com"
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Say this is a test",
        }
    ],
    model="gpt-3.5-turbo",
)
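As with the requests example, you can confirm a cache hit by inspecting the completion ID across two identical calls:

print(chat_completion.id)  # "chatcmpl..." on a fresh call, "llmem..." once cached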

private server

docker

This makes the server available on port 8000 and keeps its cache in ~/.llmem:

docker run --rm -p 8000:8000 -v ~/.llmem:/workspace ghcr.io/c0g/llmem:latest
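Once your server is running, you can point the clients above at it instead of the public server; for example, with the OpenAI client (assuming the server is on the same machine):

client = OpenAI(
    api_key=api_key,
    base_url="http://localhost:8000"
)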

python

pip install .
uvicorn llmem:app

By default it will create an llmem.db file in the working directory; you can override this location by setting the LLMEM_DB_FILE environment variable.
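For example (the path here is just illustrative):

LLMEM_DB_FILE=/data/llmem.db uvicorn llmem:app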

You can get the code on GitHub.

public server

I host a public server at llmem.com. If you point your API client to https://llmem.com/chat, you'll hit my cache and share the wealth (and get to share other people's wealth).

  • NOTE: I could look at your OpenAI key if I wanted to. I don't, but I could. You can probably trust me[^criminal].
  • NOTE: I can look at your queries. I probably will (see 'Why' bullets above). Please don't submit anything you wouldn't want your mother seeing, since I might well be your mother.
  • NOTE: If any of the above NOTEs worry you it's probably best not to point your API client to https://llmem.com.

[^criminal]: That's exactly what a criminal would say.

