# Not Diamond Python Library
## Getting started with Not Diamond
Not Diamond automatically determines which model is best suited to respond to any query, drastically improving LLM output quality while reducing cost and latency and avoiding vendor lock-in. Unlike other model routers, Not Diamond is 100% privacy-preserving and continuously adapts to your preferences (demo video).
## Installation
Requires Python 3.9+
```shell
pip install notdiamond
```
If your application isn't in Python, you can directly call our REST API endpoint.
## Key features
- Maintain privacy: All inputs are fuzzy hashed before being sent to the Not Diamond API. We return a label for the recommended model and LLM calls go out client-side. This means we never see your raw query strings or your response outputs.
- Maximize performance: Not Diamond outperforms Claude 3 Opus on major evaluation benchmarks. Our cold-start recommendations are based on hundreds of thousands of data points from rigorous evaluation benchmarks and real-world data.
- Continuously improve: By providing feedback on routing decisions, Not Diamond continuously learns a hyper-personalized routing algorithm optimized to your preferences and your application's requirements.
- Reduce cost and latency: Define explicit quality, cost, and latency tradeoffs to cut down your inference costs and achieve blazing fast speeds. Not Diamond determines which model to call in under 40ms—less than the time it takes an LLM to stream a single token.
## Not Diamond vs. Claude 3 Opus
These are preliminary results and require further validation, but initial evaluations show that Not Diamond outperforms Claude 3 Opus on major benchmarks:
| Dataset | Claude 3 Opus | Not Diamond |
|---|---|---|
| MMLU | 85.21 | 88.25 |
| BIG-Bench-Hard | 81.06 | 81.24 |
| ARC-Challenge | 93.56 | 95.59 |
| WinoGrande | 76.80 | 79.60 |
| MBPP | 47.40 | 69.05 |
As the MBPP results show, the biggest gains emerge on benchmarks where Claude 3 Opus scores particularly low. More testing is required, however, and we will release a full technical report soon.
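To make the per-benchmark improvements explicit, the deltas implied by the table above can be computed directly. The scores are copied from the table; the `gains` dict is just illustrative arithmetic, not part of the `notdiamond` API:

```python
# Scores copied from the benchmark table above (preliminary results).
claude_opus = {"MMLU": 85.21, "BIG-Bench-Hard": 81.06, "ARC-Challenge": 93.56,
               "WinoGrande": 76.80, "MBPP": 47.40}
not_diamond = {"MMLU": 88.25, "BIG-Bench-Hard": 81.24, "ARC-Challenge": 95.59,
               "WinoGrande": 79.60, "MBPP": 69.05}

# Per-benchmark improvement of Not Diamond over Claude 3 Opus.
gains = {name: round(not_diamond[name] - claude_opus[name], 2)
         for name in claude_opus}
print(gains)  # MBPP shows by far the largest gain
```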
🚧 Beta testing ahead
Not Diamond is still in beta! Please let us know if you have any feedback or ideas on how we can improve. Tomás is at t5@notdiamond.ai or 917 725 2192.
👍 Free to use!
Not Diamond is 100% free to use during beta ♡
## API keys
👍 Sign up and get a Not Diamond API key.
Create a `.env` file with your Not Diamond API key and the API keys of the models you want to route between:

```
OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
GOOGLE_API_KEY="YOUR_GOOGLE_API_KEY"
ANTHROPIC_API_KEY="YOUR_ANTHROPIC_API_KEY"
MISTRAL_API_KEY="YOUR_MISTRAL_API_KEY"
COHERE_API_KEY="YOUR_COHERE_API_KEY"
NOTDIAMOND_API_KEY="YOUR_NOTDIAMOND_API_KEY"
```
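These keys are read from the process environment. If you are curious what loading a `.env` file amounts to, here is a minimal stdlib-only sketch. Tools like `python-dotenv` do this more robustly, and `load_env_file` is a hypothetical helper for illustration, not part of `notdiamond`:

```python
import os

def load_env_file(path=".env"):
    """Minimal sketch: parse KEY="VALUE" lines into os.environ."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip().strip('"')
```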
Alternatively, you can also set API keys programmatically.
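As a sketch, setting keys programmatically can be as simple as exporting them to the process environment before constructing the router. This uses only standard `os.environ`; consult the docs for any constructor parameters the library may also accept. The values below are placeholders, not real keys:

```python
import os

# Placeholder values -- substitute your real keys.
os.environ["NOTDIAMOND_API_KEY"] = "YOUR_NOTDIAMOND_API_KEY"
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
os.environ["ANTHROPIC_API_KEY"] = "YOUR_ANTHROPIC_API_KEY"
```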
📘 API keys
The
notdiamond
library uses your API keys to call the highest-quality LLM client-side. We never pass your keys to our servers. This meansnotdiamond
will only call models you have access to. You can also use our router to determine the best model to call regardless of whether you have access or not (example). Our router supports most of the popular open and proprietary models (full list).Drop me a line if you have a specific model requirement and we're happy to work with you to support it.
## Example
If you already have existing projects in LangChain, check out our LangChain integration guide. An integration for OpenAI is also coming soon.
Create a `main.py` file in the same folder as the `.env` file you created earlier, or try it in Colab.
```python
from notdiamond.llms.llm import NDLLM
from notdiamond.prompts.prompt import NDPromptTemplate

# Define the template object -> the string that will be routed to the best LLM
prompt_template = NDPromptTemplate("Write a merge sort in Python.")

# Define the available LLMs you'd like to route between
llm_providers = [
    'openai/gpt-3.5-turbo', 'openai/gpt-4', 'openai/gpt-4-1106-preview',
    'openai/gpt-4-turbo-preview', 'anthropic/claude-3-haiku-20240307',
    'anthropic/claude-3-sonnet-20240229', 'anthropic/claude-3-opus-20240229',
]

# Create the NDLLM object -> like a 'meta-LLM' combining all of the specified models
nd_llm = NDLLM(llm_providers=llm_providers)

# After fuzzy hashing the inputs, the best LLM is determined by the ND API
# and the LLM is called client-side
result, session_id, provider = nd_llm.invoke(prompt_template=prompt_template)

print("ND session ID: ", session_id)   # Unique ID of the invoke; important for personalizing ND to your use case
print("LLM called: ", provider.model)  # The LLM routed to
print("LLM output: ", result.content)  # The LLM response
```
👍 Run it!
```shell
python main.py
```