Comet tool for logging and evaluating LLM traces

These details have not been verified by PyPI

Project description

Opik
Open source LLM evaluation framework

From RAG chatbots to code assistants to complex agentic pipelines and beyond, build LLM systems that run better, faster, and cheaper with tracing, evaluations, and dashboards.

Website • Slack community • Twitter • Documentation

🚀 What is Opik?

Opik is an open-source platform for evaluating, testing and monitoring LLM applications. Built by Comet.

You can use Opik for:

Development:
- Tracing: Track all LLM calls and traces during development and production (Quickstart, Integrations).
- Annotations: Annotate your LLM calls by logging feedback scores using the Python SDK or the UI.
- Playground: Try out different prompts and models in the prompt playground.
Evaluation: Automate the evaluation process of your LLM application:
- Datasets and Experiments: Store test cases and run experiments (Datasets, Evaluate your LLM Application).
- LLM as a judge metrics: Use Opik's LLM as a judge metrics for complex issues like hallucination detection, moderation and RAG evaluation (Answer Relevance, Context Precision).
- CI/CD integration: Run evaluations as part of your CI/CD pipeline using our PyTest integration.
Production Monitoring:
- Log all your production traces: Opik has been designed to support high volumes of traces, making it easy to monitor your production applications. Even small deployments can ingest more than 40 million traces per day!
- Monitoring dashboards: Review your feedback scores, trace count and tokens over time in the Opik Dashboard.
- Online evaluation metrics: Score all your production traces using LLM as a Judge metrics and identify issues with your production LLM application using Opik's online evaluation metrics.

[!TIP]
If you are looking for features that Opik doesn't have today, please raise a new Feature request 🚀

🛠️ Installation

Opik is available as a fully open-source local installation or as a hosted solution on Comet.com. The easiest way to get started with Opik is by creating a free Comet account at comet.com.

If you'd like to self-host Opik, you can do so by cloning the repository and starting the platform using Docker Compose:

# Clone the Opik repository
git clone https://github.com/comet-ml/opik.git

# Navigate to the opik/deployment/docker-compose directory
cd opik/deployment/docker-compose

# Start the Opik platform
docker compose up --detach

# You can now visit http://localhost:5173 in your browser!
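After the containers start, it can take a moment before the UI responds. A minimal sketch of a reachability check using only the Python standard library is shown here; the helper name `opik_ui_is_reachable` and the two-second timeout are assumptions made for illustration, and only the URL comes from the steps above.

```python
import urllib.error
import urllib.request


def opik_ui_is_reachable(url: str = "http://localhost:5173", timeout: float = 2.0) -> bool:
    """Return True if an HTTP request to `url` receives any response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False


if __name__ == "__main__":
    print("Opik UI reachable:", opik_ui_is_reachable())
```

This only confirms that something answers on that port; it does not verify that every Opik service is healthy.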

For more information about the different deployment options, please see our deployment guides:

Installation methods	Docs link
Local instance
Kubernetes

🏁 Get Started

To get started, you will need to first install the Python SDK:

pip install opik

Once the SDK is installed, you can configure it by running the opik configure command:

opik configure

This lets you configure Opik either for a local installation, by setting the local server address, or for the Cloud platform, by setting the API key.

[!TIP]
You can also call the opik.configure(use_local=True) method from your Python code to configure the SDK to run on the local installation.

You are now ready to start logging traces using the Python SDK.

📝 Logging Traces

The easiest way to get started is to use one of our integrations. Opik supports:

Integration	Description	Documentation
OpenAI	Log traces for all OpenAI LLM calls	Documentation
LiteLLM	Call any LLM model using the OpenAI format	Documentation
LangChain	Log traces for all LangChain LLM calls	Documentation
Haystack	Log traces for all Haystack calls	Documentation
Anthropic	Log traces for all Anthropic LLM calls	Documentation
Bedrock	Log traces for all Bedrock LLM calls	Documentation
CrewAI	Log traces for all CrewAI calls	Documentation
DSPy	Log traces for all DSPy runs	Documentation
Gemini	Log traces for all Gemini LLM calls	Documentation
Groq	Log traces for all Groq LLM calls	Documentation
LangGraph	Log traces for all LangGraph executions	Documentation
LlamaIndex	Log traces for all LlamaIndex LLM calls	Documentation
Ollama	Log traces for all Ollama LLM calls	Documentation
Predibase	Fine-tune and serve open-source Large Language Models	Documentation
Ragas	Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines	Documentation
watsonx	Log traces for all watsonx LLM calls	Documentation

[!TIP]
If the framework you are using is not listed above, feel free to open an issue or submit a PR with the integration.

If you are not using any of the frameworks above, you can also use the track function decorator to log traces:

import opik

opik.configure(use_local=True) # Run locally

@opik.track
def my_llm_function(user_question: str) -> str:
    # Your LLM code here

    return "Hello"

[!TIP]
The track decorator can be used in conjunction with any of our integrations and can also be used to track nested function calls.
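As a rough sketch of nested tracking, the example below decorates three functions, one of which calls the other two. The function names and their placeholder bodies are hypothetical, and the `ImportError` fallback to a no-op decorator is an assumption added only so the sketch runs where the SDK is not installed; it is not part of Opik.

```python
try:
    from opik import track  # the decorator described above
except ImportError:
    # Assumption for this sketch only: fall back to a no-op decorator
    # so the example still runs where the SDK is not installed.
    def track(func):
        return func


@track
def retrieve_context(question: str) -> list[str]:
    # Placeholder for a real retrieval step (hypothetical).
    return ["France is a country in Europe."]


@track
def generate_answer(question: str, context: list[str]) -> str:
    # Placeholder for a real LLM call (hypothetical).
    return f"Answer based on {len(context)} context item(s)"


@track
def answer_question(question: str) -> str:
    # Both helpers are tracked too, so their calls happen inside this tracked call.
    context = retrieve_context(question)
    return generate_answer(question, context)


print(answer_question("What is the capital of France?"))
```

With the SDK installed and configured, the two inner calls should appear nested under the `answer_question` trace.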

🧑‍⚖️ LLM as a Judge metrics

The Opik Python SDK includes a number of LLM as a judge metrics to help you evaluate your LLM application. Learn more about them in the metrics documentation.

To use them, import the relevant metric and call its score method:

from opik.evaluation.metrics import Hallucination

metric = Hallucination()
score = metric.score(
    input="What is the capital of France?",
    output="Paris",
    context=["France is a country in Europe."]
)
print(score)

Opik also includes a number of pre-built heuristic metrics as well as the ability to create your own. Learn more about them in the metrics documentation.
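To illustrate what a heuristic metric is (a deterministic score computed without calling an LLM), here is a standalone sketch. It deliberately does not use Opik's own metric base classes, whose exact interface is covered in the metrics documentation; the `Score` and `ContainsTerms` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Score:
    """Hypothetical result container for this sketch."""
    name: str
    value: float


class ContainsTerms:
    """Heuristic metric: fraction of required terms found in the output."""

    def __init__(self, required_terms: list[str], name: str = "contains_terms"):
        self.name = name
        self.required_terms = [term.lower() for term in required_terms]

    def score(self, output: str) -> Score:
        if not self.required_terms:
            # Nothing is required, so the output trivially passes.
            return Score(self.name, 1.0)
        text = output.lower()
        hits = sum(term in text for term in self.required_terms)
        return Score(self.name, hits / len(self.required_terms))


metric = ContainsTerms(["paris", "france"])
print(metric.score("Paris is the capital of France."))
print(metric.score("I think it is Paris."))
```

The metric mirrors the shape of the example above (construct it, then call `score`), but the scoring itself is a plain, case-insensitive substring check.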

🔍 Evaluating your LLM Application

Opik allows you to evaluate your LLM application during development through Datasets and Experiments.
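Conceptually, an experiment runs the application over every item in a dataset and scores each output. The sketch below shows that loop in plain Python under stated assumptions: the in-memory dataset, the `task` stand-in, and the `exact_match` scorer are all hypothetical and do not use the Opik SDK.

```python
from statistics import mean

# A tiny in-memory "dataset": each item pairs an input with the expected output.
dataset = [
    {"input": "What is the capital of France?", "expected": "Paris"},
    {"input": "What is the capital of Italy?", "expected": "Rome"},
    {"input": "What is the capital of Spain?", "expected": "Madrid"},
]


def task(item: dict) -> str:
    """Stand-in for the LLM application under test (hypothetical)."""
    canned = {
        "What is the capital of France?": "Paris",
        "What is the capital of Italy?": "Rome",
    }
    return canned.get(item["input"], "I don't know")


def exact_match(output: str, expected: str) -> float:
    """Return 1.0 when the output equals the expected answer, ignoring case and surrounding whitespace."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_experiment(items: list[dict]) -> dict:
    """Score every dataset item and aggregate the results."""
    scores = [exact_match(task(item), item["expected"]) for item in items]
    return {"n": len(scores), "mean_exact_match": mean(scores)}


result = run_experiment(dataset)
print(result)
```

In Opik, the dataset, the per-item outputs, and the scores are stored on the platform so that runs can be compared; see the Datasets and Experiments documentation for the actual API.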

You can also run evaluations as part of your CI/CD pipeline using our PyTest integration.

🤝 Contributing

There are many ways to contribute to Opik:

Submit bug reports and feature requests
Review the documentation and submit Pull Requests to improve it
Speak or write about Opik and let us know
Upvote popular feature requests to show your support

To learn more about how to contribute to Opik, please see our contributing guidelines.

Project details

Release history

1.4.1b1 (pre-release, this version): Jan 23, 2025
1.4.1b0 (pre-release): Jan 23, 2025

Download files

Source Distribution

opik_uto-1.4.1b1.tar.gz (179.8 kB)

Uploaded Jan 23, 2025 Source

Built Distribution

opik_uto-1.4.1b1-py3-none-any.whl (342.0 kB)

Uploaded Jan 23, 2025 Python 3

File details

Details for the file opik_uto-1.4.1b1.tar.gz.

File metadata

Download URL: opik_uto-1.4.1b1.tar.gz
Upload date: Jan 23, 2025
Size: 179.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for opik_uto-1.4.1b1.tar.gz
Algorithm	Hash digest
SHA256	`b75ace2465e5c57379f3bc877505c0eca3210628d2abfdc8d7baf3a6b1c84eab`
MD5	`b27b2ea7ba4451d22751580fc64ea18d`
BLAKE2b-256	`47a81f4c1a7f9b5e5176feb5ca231b98a6ef5918f2151e013a283b6fb0032f66`

See more details on using hashes here.

File details

Details for the file opik_uto-1.4.1b1-py3-none-any.whl.

File metadata

Download URL: opik_uto-1.4.1b1-py3-none-any.whl
Upload date: Jan 23, 2025
Size: 342.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for opik_uto-1.4.1b1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5b613ad5817338600e3c92bd63ae2900a4d8969b9dfa1e275a66a94bb57a51ca`
MD5	`bffd9c1e8f8aeb76ea0b9aa703c60613`
BLAKE2b-256	`614006812ff41fb686ce743ecc93d189af4c5ca7819135e53ea63bd0eef45916`

See more details on using hashes here.
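One way to use the published digests is to recompute the SHA256 of a downloaded file and compare it with the value listed above. The following is a minimal sketch with the standard library; the helper names are hypothetical, and the file is assumed to sit in the current working directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # File name and digest as listed above.
    wheel = "opik_uto-1.4.1b1-py3-none-any.whl"
    expected = "5b613ad5817338600e3c92bd63ae2900a4d8969b9dfa1e275a66a94bb57a51ca"
    if Path(wheel).exists():
        print("hash OK" if matches_published_hash(wheel, expected) else "hash MISMATCH")
    else:
        print(f"{wheel} not found; download it first")
```

A mismatch means the file differs from what was uploaded and should not be installed.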

opik-uto 1.4.1b1
