Fast Run-Eval-Polish Loop for LLM Applications

Project description

⚡♾️ FastREPL

Fast Run-Eval-Polish Loop for LLM Applications.

This project is still in the early development stage. Have questions? Let's chat!

Quickstart

Let's say we have this existing system:

import openai

context = """
The first step is to decide what to work on. The work you choose needs to have three qualities: it has to be something you have a natural aptitude for, that you have a deep interest in, and that offers scope to do great work.
In practice you don't have to worry much about the third criterion. Ambitious people are if anything already too conservative about it. So all you need to do is find something you have an aptitude for and great interest in.
"""

def run_qa(question: str) -> str:
    # Answer the question in a single short reply, grounded in the fixed context above.
    return openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": f"Answer in less than 30 words. Use the following context if needed: {context}",
            },
            {"role": "user", "content": question},
        ],
    )["choices"][0]["message"]["content"]

We already have a fixed context, so let's ask some questions. local_runner runs the function locally with threads and progress tracking; a remote_runner that runs the same workload in the cloud is planned.

import fastrepl

# https://huggingface.co/datasets/repllabs/questions_how_to_do_great_work
questions = [
    "how to do great work?.",
    "How can curiosity be nurtured and utilized to drive great work?",
    "How does the author suggest finding something to work on?",
    "How did Van Dyck's painting differ from Daniel Mytens' version and what message did it convey?",
]
# Every question shares the same fixed context.
contexts = [[context]] * len(questions)

runner = fastrepl.local_runner(fn=run_qa)
ds = runner.run(args_list=[(q,) for q in questions], output_feature="answer")

ds = ds.add_column("question", questions)
ds = ds.add_column("contexts", contexts)
# fastrepl.Dataset({
#     features: ['answer', 'question', 'contexts'],
#     num_rows: 4
# })
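
Under the hood, a thread-based runner amounts to mapping the function over the argument list with a worker pool. The sketch below is only an illustration of that idea, not fastrepl's actual implementation:

import concurrent.futures

def run_with_threads(fn, args_list, max_workers=8):
    # Illustrative only: call fn on each argument tuple in a thread pool,
    # keeping results in input order.
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda args: fn(*args), args_list))

answers = run_with_threads(run_qa, [(q,) for q in questions])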

Now, let's use one of our evaluators on the dataset. Note that we run the evaluation 5 times per row so we can check that the scores are consistent.

evaluator = fastrepl.RAGEvaluator(node=fastrepl.RAGAS(metric="Faithfulness"))

ds = fastrepl.local_runner(evaluator=evaluator, dataset=ds).run(num=5)
# ds["result"]
# [[0.25, 0.0, 0.25, 0.25, 0.5],
#  [0.5, 0.5, 0.5, 0.75, 0.875],
#  [0.66, 0.66, 0.66, 0.66, 0.66],
#  [1.0, 1.0, 1.0, 1.0, 1.0]]
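
Each row now carries five scores, so a quick way to summarize consistency is the per-row mean and spread. A minimal sketch, assuming ds["result"] holds one list of scores per row as shown above:

import statistics

for scores in ds["result"]:
    print(f"mean={statistics.mean(scores):.2f}  stdev={statistics.pstdev(scores):.2f}")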

The scores look fairly consistent across runs. With a larger number of samples, this gives a reliable evaluation of the entire system. We will keep working on bringing better evaluations.
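
To scale up the sample size, you can pull the full question set referenced above from the Hugging Face Hub. A sketch using the datasets library; the split and column names here are assumptions, so check the dataset card:

from datasets import load_dataset

# Hypothetical usage: the split and column name may differ on the actual dataset card.
full = load_dataset("repllabs/questions_how_to_do_great_work", split="train")
questions = full["question"]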

Detailed documentation is here.

Contributing

Any kind of contribution is welcome.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fastrepl-0.0.24.tar.gz (25.3 kB)

Uploaded Source

Built Distribution

fastrepl-0.0.24-py3-none-any.whl (36.0 kB)

Uploaded Python 3

File details

Details for the file fastrepl-0.0.24.tar.gz.

File metadata

  • Download URL: fastrepl-0.0.24.tar.gz
  • Upload date:
  • Size: 25.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.6.1 CPython/3.11.5 Linux/6.2.0-1012-azure

File hashes

Hashes for fastrepl-0.0.24.tar.gz
  • SHA256: 75fa8a9d17f3d1c11d074c2eaf74ef235436abed0d99f3cca11e73c261eb8831
  • MD5: ec03ecbaf68bdfdd9b97c405aa1f1c06
  • BLAKE2b-256: 0e9e3bad2a0810c40c05990f1305dc46a724f191e72b168925e3825a8f160858

See more details on using hashes here.

File details

Details for the file fastrepl-0.0.24-py3-none-any.whl.

File metadata

  • Download URL: fastrepl-0.0.24-py3-none-any.whl
  • Upload date:
  • Size: 36.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.6.1 CPython/3.11.5 Linux/6.2.0-1012-azure

File hashes

Hashes for fastrepl-0.0.24-py3-none-any.whl
  • SHA256: 50958fc20ee16c8bc389afc7e55ee8f97ebcde7872e17a741f70de88911a6c05
  • MD5: 4c67c4ef031d7ebd6c073467d9118dab
  • BLAKE2b-256: a163a53bfe958eb26690f74da32d4d02bb257b4d7c0f2d1d12d4a11b03d1c03e

See more details on using hashes here.
