
The fore client package

The foresight library within the fore SDK lets you easily evaluate the performance of your LLM system using a variety of metrics.

You can sign up as a beta tester at https://foreai.co.

Quick start

  1. Install the package using pip:

    pip install fore
    

    Or download the repo from GitHub and install it from the repo root via pip install .

  2. Get started with the following lines:
    from fore.foresight import Foresight
    
    foresight = Foresight(api_token="<YOUR_API_TOKEN>")
    
    foresight.log(query="What is the easiest programming language?",
                  response="Python",
                  contexts=["Python rated the easiest programming language"],
                  tag="my_awesome_experiment")
    
    # You can add more such queries using foresight.log
    # ....
    
    # flush() sends everything logged above for evaluation.
    foresight.flush()
    
  3. Alternatively, to curate your own evalsets and run regular evals against them:
    from fore.foresight import EvalRunConfig, Foresight, InferenceOutput, MetricType
    
    foresight = Foresight(api_token="<YOUR_API_TOKEN>")
    
    evalset = foresight.create_simple_evalset(
        evalset_id="programming-languages",
        queries=["hardest programming language?", "easiest programming language?"],
        reference_answers=["Malbolge", "Python"])
    
    run_config = EvalRunConfig(evalset_id="programming-languages",
                               experiment_id="my-smart-llm",
                               metrics=[MetricType.GROUNDEDNESS, MetricType.SIMILARITY])
    
    
    def my_generate_fn(query: str) -> InferenceOutput:
        # Do the LLM processing with your model...
        # Here is some demo code:
        return InferenceOutput(
            generated_response="Malbolge" if "hardest" in query else "Python",
            contexts=[
                "Malbolge is the hardest language", "Python is the easiest language"
            ])
    
    foresight.generate_answers_and_run_eval(my_generate_fn, run_config)
    

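In practice, my_generate_fn is where you plug in your own retrieval and generation pipeline. The sketch below shows the expected shape; my_retriever and my_model are hypothetical stand-ins for your own components, and only InferenceOutput comes from the fore SDK:

    from fore.foresight import InferenceOutput

    def my_generate_fn(query: str) -> InferenceOutput:
        # `my_retriever` and `my_model` are hypothetical stand-ins for
        # your own retrieval step and LLM call.
        contexts = my_retriever.search(query)
        answer = my_model.generate(query, contexts)
        return InferenceOutput(generated_response=answer, contexts=contexts)
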
Metrics

Groundedness

Depends on:

  • LLM's generated response;
  • Context used for generating the answer.

The metric answers the question: Is the response based on the context and nothing else?

This metric estimates the fraction of facts in the generated response that can be found in the provided context.

Example:

  • Context: The front door code has been changed from 1234 to 7945 due to security reasons.
  • Q: What is the current front door code?
  • A1: 7945. [groundedness score = 0.9]
  • A2: 0000. [groundedness score = 0.0]
  • A3: 1234. [groundedness score = 0.04]
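
For intuition, here is a toy sketch of the "fraction of supported facts" idea. It is not foresight's implementation: it naively treats each sentence as one fact and checks word overlap, so it scores the outdated code 1234 as fully grounded where the real metric gives 0.04. The toy_groundedness helper is an illustrative assumption only.

    import re

    def toy_groundedness(response: str, contexts: list[str]) -> float:
        # Toy illustration only: treat each sentence as one "fact" and
        # count it as grounded when all of its words occur in the context.
        context_words = set(re.findall(r"\w+", " ".join(contexts).lower()))
        sentences = [s for s in re.split(r"[.!?]", response) if s.strip()]
        if not sentences:
            return 0.0
        grounded = sum(
            all(word in context_words for word in re.findall(r"\w+", s.lower()))
            for s in sentences)
        return grounded / len(sentences)

    context = "The front door code has been changed from 1234 to 7945 due to security reasons."
    print(toy_groundedness("7945.", [context]))  # 1.0
    print(toy_groundedness("0000.", [context]))  # 0.0
    print(toy_groundedness("1234.", [context]))  # 1.0: the toy misses the staleness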

Similarity

Depends on:

  • LLM's generated response;
  • A reference response to compare the generated response with.

The metric answers the question: Is the generated response semantically equivalent to the reference response?

Example:

  • Question: Is Python an easy programming language to learn?
  • Reference response: Python is an easy programming language to learn
  • Response 1: It is easy to be proficient in python [similarity score = 0.72]
  • Response 2: Python is widely recognized for its simplicity. [similarity score = 0.59]
  • Response 3: Python is not an easy programming language to learn [similarity score = 0.0]
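
To see why the metric must be semantic rather than lexical, compare the examples above against a crude character-overlap baseline (a toy, not what foresight computes). Response 3 contradicts the reference yet is nearly identical as a string, so a lexical score lands near 1.0 where the semantic similarity score is 0.0:

    from difflib import SequenceMatcher

    def lexical_overlap(a: str, b: str) -> float:
        # Crude character-overlap baseline; NOT the semantic similarity
        # that foresight computes.
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    reference = "Python is an easy programming language to learn"
    negation = "Python is not an easy programming language to learn"
    print(lexical_overlap(reference, negation))  # ~0.96, despite the opposite meaning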

Relevance (coming soon)

Depends on:

  • LLM's generated response;
  • User query/question.

The metric answers the question: Does the response answer the question and only the question?

This metric checks that the LLM's answer addresses the given question precisely, without including irrelevant information.

Example:

  • Q: At which temperature does oxygen boil?
  • A1: Oxygen boils at -183 °C. [relevance score = 1.0]
  • A2: Oxygen boils at -183 °C and freezes at -219 °C. [relevance score = 0.5]
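
As a rough illustration of the idea (not the upcoming implementation), one can split the response into clauses and count the fraction that shares a content word with the query; the clause about freezing shares nothing with the boiling-point question, reproducing the 0.5 above. The helper and stopword list below are illustrative assumptions:

    import re

    STOPWORDS = {"at", "which", "does", "the", "a", "an", "and", "is", "what"}

    def toy_relevance(query: str, response: str) -> float:
        # Toy illustration only: fraction of response clauses that share
        # at least one content word with the query.
        query_words = {w for w in re.findall(r"\w+", query.lower())
                       if w not in STOPWORDS}
        clauses = [c for c in re.split(r"\band\b|[.;]", response) if c.strip()]
        if not clauses:
            return 0.0
        relevant = sum(bool(query_words & set(re.findall(r"\w+", c.lower())))
                       for c in clauses)
        return relevant / len(clauses)

    print(toy_relevance("At which temperature does oxygen boil?",
                        "Oxygen boils at -183 °C and freezes at -219 °C."))  # 0.5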

Completeness (coming soon)

Depends on:

  • LLM's generated response;
  • User query/question.

The metric answers the question: Are all aspects of the question answered?

Example:

  • Q: At which temperature does oxygen boil and freeze?
  • A1: Oxygen boils at -183 °C. [completeness score = 0.5]
  • A2: Oxygen boils at -183 °C and freezes at -219 °C. [completeness score = 1.0]
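
A toy sketch of the idea (illustrative only, not the upcoming implementation): given a hand-written list of the aspects a question asks about, score the fraction that the response mentions:

    def toy_completeness(aspects: list[str], response: str) -> float:
        # Toy illustration only: `aspects` is written by hand, e.g.
        # ["boil", "freeze"] for the question above.
        if not aspects:
            return 0.0
        covered = sum(aspect.lower() in response.lower() for aspect in aspects)
        return covered / len(aspects)

    print(toy_completeness(["boil", "freeze"], "Oxygen boils at -183 °C."))  # 0.5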
