
Project description

costly

Estimate costs and running times of complex LLM workflows/experiments/pipelines in advance, before spending money, via simulations. Just put @costly() on the load-bearing function, make sure every function that calls it passes **kwargs through to it, and call your top-level function with simulate=True and a cost_log: Costlog object. See examples.ipynb for more details.
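The plumbing described above (a decorator that intercepts simulate=True, plus **kwargs forwarded down the call chain so simulate and cost_log reach the decorated function) can be illustrated with a self-contained toy. fake_costly, expensive_call and pipeline here are illustrative stand-ins, not the real costly API:

```python
# Toy illustration of the simulate-before-spend pattern (not the real
# costly API): when simulate=True, record an estimate instead of
# executing the expensive call.
from functools import wraps

def fake_costly(estimated_cost: float):
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, simulate=False, cost_log=None, **kwargs):
            if simulate:
                if cost_log is not None:
                    cost_log.append({"fn": fn.__name__, "cost": estimated_cost})
                return "<simulated output>"
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@fake_costly(estimated_cost=0.002)
def expensive_call(prompt: str) -> str:
    raise RuntimeError("would hit a paid API")  # never runs in simulation

def pipeline(prompt: str, **kwargs) -> str:
    # every intermediate function forwards **kwargs, so simulate and
    # cost_log flow down to the decorated function
    return expensive_call(prompt, **kwargs)

log = []
result = pipeline("hello", simulate=True, cost_log=log)
```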

https://github.com/abhimanyupallavisudhir/costly

Installation

pip install costly

Usage

See examples.ipynb for a full walkthrough; some examples below.

from costly import Costlog, costly, CostlyResponse
from costly.estimators.llm_api_estimation import LLM_API_Estimation


@costly()
def chatgpt(input_string: str, model: str) -> str:
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": input_string}]
    )
    output_string = response.choices[0].message.content
    return output_string


@costly(
    input_tokens=lambda kwargs: LLM_API_Estimation.messages_to_input_tokens(
        kwargs["messages"], kwargs["model"]
    ),
)
def chatgpt_messages(messages: list[dict[str, str]], model: str) -> str:
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(model=model, messages=messages)
    output_string = response.choices[0].message.content
    return output_string


# Variant that reports exact token usage back to costly via CostlyResponse
@costly()
def chatgpt_usage(input_string: str, model: str) -> str:
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "user", "content": input_string},
        ],
    )

    return CostlyResponse(
        output=response.choices[0].message.content,
        cost_info={
            "input_tokens": response.usage.prompt_tokens,
            "output_tokens": response.usage.completion_tokens,
        },
    )  # callers still receive just the output string, not the whole CostlyResponse object
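Given exact token counts like the ones reported above, turning them into a dollar cost is simple arithmetic: each token type is priced separately per million tokens. A sketch with illustrative placeholder prices (not real OpenAI rates):

```python
# Cost from token counts: input and output tokens are priced separately.
# The prices below are illustrative placeholders, not real rates.
PRICE_PER_1M = {"some-model": {"input": 2.50, "output": 10.00}}  # USD per 1M tokens

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    p = PRICE_PER_1M[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

cost_usd("some-model", input_tokens=1_000, output_tokens=500)
```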

Testing

poetry run pytest -s -m "not slow"
poetry run pytest -s -m "slow"

Tests for instructor currently fail.

TODO

  • Make it work with async
  • Support locally run LLMs -- this would ideally need a cost and time estimator that accounts for your machine details, GPU pricing, etc.
  • Decide and document the best way to "propagate" description (for breakdown purposes) through function calls. Should the user manually write def f(...): ... g(description=kwargs.get("description", []) + ["f"])? Should there be a @description("blabla") decorator, or a @description decorator that automatically appends the function name and arguments to description?
  • Better solution for token counting of chat messages (search for HACK in the repo)
  • Make the instructor tests pass:
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact[gpt-4-turbo-messages0] - AssertionError: ['Time estimate maximum 73.728 is less than truth 74.1186316999956']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[PERSONINFO_gpt-4o] - AssertionError: ['Input tokens estimate 43 not within 20pc of truth 83']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[PERSONINFO_gpt-4o-mini] - AssertionError: ['Input tokens estimate 43 not within 20pc of truth 83']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[PERSONINFO_gpt-4-turbo] - AssertionError: ['Input tokens estimate 43 not within 20pc of truth 85']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[PERSONINFO_gpt-4] - AssertionError: ['Input tokens estimate 43 not within 20pc of truth 76']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[PERSONINFO_gpt-3.5-turbo] - AssertionError: ['Input tokens estimate 43 not within 20pc of truth 85']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[FOOMODEL_gpt-4o] - AssertionError: ['Input tokens estimate 229 not within 20pc of truth 108', 'Cost estimate minimum 0.0011450000000000002 exceeds truth 0.000795']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[FOOMODEL_gpt-4o-mini] - AssertionError: ['Input tokens estimate 230 not within 20pc of truth 108', 'Cost estimate minimum 3.45e-05 exceeds truth 2.8800000000000002e-05']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[FOOMODEL_gpt-4-turbo] - AssertionError: ['Input tokens estimate 231 not within 20pc of truth 113', 'Cost estimate minimum 0.00231 exceeds truth 0.0016400000000000002']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[FOOMODEL_gpt-4] - AssertionError: ['Input tokens estimate 228 not within 20pc of truth 92', 'Cost estimate minimum 0.006840000000000001 exceeds truth 0.00426']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[FOOMODEL_gpt-3.5-turbo] - AssertionError: ['Input tokens estimate 233 not within 20pc of truth 113', 'Cost estimate minimum 0.0001165 exceeds truth 8.05e-05']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[BARMODEL_gpt-4o] - AssertionError: ['Input tokens estimate 321 not within 20pc of truth 168']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[BARMODEL_gpt-4o-mini] - AssertionError: ['Input tokens estimate 322 not within 20pc of truth 168']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[BARMODEL_gpt-4-turbo] - AssertionError: ['Input tokens estimate 323 not within 20pc of truth 178']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[BARMODEL_gpt-4] - AssertionError: ['Input tokens estimate 320 not within 20pc of truth 126']
FAILED tests/test_estimators/test_llm_api_estimation.py::test_estimate_contains_exact_instructor[BARMODEL_gpt-3.5-turbo] - AssertionError: ['Input tokens estimate 325 not within 20pc of truth 178']
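One shape the @description decorator mentioned in the TODO list could take is a wrapper that automatically appends the function's name to a "description" breadcrumb list flowing through **kwargs. This is a hypothetical sketch of that idea, not anything implemented in costly:

```python
# Hypothetical @description decorator: append the wrapped function's
# name to the 'description' breadcrumb list passed through **kwargs,
# so nested calls accumulate a call-path for cost breakdowns.
from functools import wraps

def description(fn):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        kwargs["description"] = kwargs.get("description", []) + [fn.__name__]
        return fn(*args, **kwargs)
    return wrapper

@description
def g(**kwargs):
    return kwargs["description"]

@description
def f(**kwargs):
    return g(**kwargs)  # breadcrumbs propagate via **kwargs
```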

Download files

Download the file for your platform.

Source Distribution

costly-0.1.7.tar.gz (11.5 kB)

Uploaded Source

Built Distribution


costly-0.1.7-py3-none-any.whl (13.5 kB)

Uploaded Python 3

File details

Details for the file costly-0.1.7.tar.gz.

File metadata

  • Download URL: costly-0.1.7.tar.gz
  • Upload date:
  • Size: 11.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.2 CPython/3.11.0 Windows/10

File hashes

Hashes for costly-0.1.7.tar.gz

  • SHA256: 985ff65285f22c4321f98b25843cd24a466f1c52ef25d9bed4b39aba63c6e4aa
  • MD5: bac92281f7dd679f197d22c9ecfaf5df
  • BLAKE2b-256: e29305369208bf7c26b16132f8ac1f4c0aa539e3303c8e1dae830e6e25dcc28d


File details

Details for the file costly-0.1.7-py3-none-any.whl.

File metadata

  • Download URL: costly-0.1.7-py3-none-any.whl
  • Upload date:
  • Size: 13.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.2 CPython/3.11.0 Windows/10

File hashes

Hashes for costly-0.1.7-py3-none-any.whl

  • SHA256: df636113be96881eee2877c639fc2c21368544d7cabc4dbbb0f0d2217432fa37
  • MD5: c83405def931b713128cc2a4eab06ad0
  • BLAKE2b-256: 1d5062efeb7dab1a95591a5e2d44690dc6bd727b0d34b99a845e630d4e42ec21

