BagelML is a Python library for interacting with the Bagel inference and vector embedding API.
Bagel Python Client 🥯
Welcome to the Bagel Python Client Example! Bagel is your bread-and-butter library for interacting with the Bagel API without breaking a sweat.
One of the perks? No need to call the OpenAI Embeddings method or any other model to generate embeddings! That's right, the Bagel client handles that for you. So, you don't need to spend extra bucks on generating embeddings. Quite a dough-saver, isn't it? 🥯💰
Prerequisites
- Python 3.6+
- pip package manager
- Cluster size limit: 500MB per cluster (open a new issue if you need a higher limit)
Installation
To install the Bagel Python client, run the following command in your terminal:
pip install bagelML
Usage
- Import the necessary modules:
import uuid
import bagel
from bagel.config import Settings
This snippet imports the required modules for using Bagel, including the uuid module for generating unique identifiers.
- Define the Bagel server settings:
server_settings = Settings(
    bagel_api_impl="rest",
    bagel_server_host="api.bageldb.ai"
)
Here, we define the settings for connecting to the Bagel server.
- Create the Bagel client:
client = bagel.Client(server_settings)
Create an instance of the Bagel client using the previously defined server settings.
- Ping the Bagel server:
print(client.ping())
This checks the connectivity to the Bagel server.
- Get the Bagel server version:
print(client.get_version())
Retrieves and prints the version of the Bagel server.
- Create and delete a cluster:
name = str(uuid.uuid4())
client.create_cluster(name)
client.delete_cluster(name)
Generates a unique name for a cluster, creates it, and then deletes it. This demonstrates basic cluster management.
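When a cluster is only needed for a short experiment, the create/delete pair above can be wrapped so the cleanup still runs if something fails in between. This is a minimal sketch that assumes only the create_cluster and delete_cluster calls shown above; the temporary_cluster helper is hypothetical and not part of the Bagel client:

```python
import uuid
from contextlib import contextmanager


@contextmanager
def temporary_cluster(client):
    """Create a uniquely named cluster and delete it on exit.

    `client` is expected to be a bagel.Client. This helper is a
    hypothetical convenience, not something the library ships.
    """
    name = str(uuid.uuid4())
    client.create_cluster(name)
    try:
        yield name
    finally:
        # Runs even if the body of the `with` block raises.
        client.delete_cluster(name)
```

Usage would look like `with temporary_cluster(client) as name:` followed by whatever work needs the cluster.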
- Create, add documents, and query a cluster:
cluster = client.get_or_create_cluster("testing")
cluster.add(
    documents=["This is doc", "This is google doc"],
    metadatas=[{"source": "notion"},
               {"source": "google-doc"}],
    ids=[str(uuid.uuid4()), str(uuid.uuid4())],
)
results = cluster.find(query_texts=["query"], n_results=5)
Creates a cluster or retrieves an existing one, then adds documents with metadata. The ids are unique identifiers, one per document. Bagel generates the embeddings with its own model and then performs a text-based search. n_results limits the number of results returned.
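The documents, metadatas, and ids lists in the call above line up one-to-one, so it can help to build them together. The sketch below assumes the three lists should have the same length (the README implies this but does not state it); build_add_kwargs is a hypothetical helper, not a Bagel function:

```python
import uuid


def build_add_kwargs(documents, sources):
    """Pair each document with a metadata dict and a fresh UUID.

    Hypothetical helper: returns keyword arguments shaped like the
    cluster.add(...) call shown above.
    """
    if len(documents) != len(sources):
        raise ValueError("documents and sources must have the same length")
    return {
        "documents": list(documents),
        "metadatas": [{"source": source} for source in sources],
        "ids": [str(uuid.uuid4()) for _ in documents],
    }


kwargs = build_add_kwargs(
    ["This is doc", "This is google doc"],
    ["notion", "google-doc"],
)
# With a real cluster this would be passed on as: cluster.add(**kwargs)
print(kwargs["metadatas"])
```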
- Add pre-computed embeddings and query by embedding:
cluster = client.get_or_create_cluster("new_testing")
cluster.add(
    embeddings=[[1.1, 2.3], [4.5, 6.9]],
    metadatas=[{"info": "M1"}, {"info": "M1"}],
    documents=["doc1", "doc2"],
    ids=["id1", "id2"],
)
results = cluster.find(query_embeddings=[[1.1, 2.3]], n_results=2)
This is similar to the previous example but uses pre-calculated embeddings for documents and performs a query based on those embeddings.
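To get a feel for what an embedding query returns, here is a purely local sketch that ranks the two example vectors by distance to the query. It is an illustration only: the metric Bagel actually uses is not stated here, squared Euclidean distance is an assumption, and rank_by_distance is a hypothetical function that never touches the server:

```python
def rank_by_distance(query, embeddings, ids, n_results):
    """Return up to n_results ids ordered from nearest to farthest."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    scored = sorted(
        zip(ids, embeddings),
        key=lambda pair: squared_distance(query, pair[1]),
    )
    return [id_ for id_, _ in scored[:n_results]]


nearest = rank_by_distance(
    query=[1.1, 2.3],
    embeddings=[[1.1, 2.3], [4.5, 6.9]],
    ids=["id1", "id2"],
    n_results=2,
)
print(nearest)  # ['id1', 'id2']
```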
- Modify cluster name:
cluster.modify(name="new_name")
Changes the name of the cluster.
- Update document metadata:
cluster.update(ids=["id1"], metadatas=[{"new":"metadata"}])
Updates the metadata of a specific document in the cluster.
- Upsert documents:
cluster.upsert(
    documents=["new doc"],
    metadatas=[{"new": "metadata"}],
    ids=["doc1"],
)
Inserts or updates documents in the cluster based on provided IDs.
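The insert-or-update behaviour can be pictured with a plain dictionary keyed by id. This is a local model, not Bagel's implementation; in particular, it replaces the whole record on update, and whether Bagel merges or replaces metadata is not stated here:

```python
def upsert(store, ids, documents, metadatas):
    """Insert new records and overwrite existing ones, keyed by id."""
    for id_, document, metadata in zip(ids, documents, metadatas):
        store[id_] = {"document": document, "metadata": metadata}
    return store


store = {"doc1": {"document": "old doc", "metadata": {"old": "metadata"}}}
upsert(
    store,
    ids=["doc1", "doc2"],
    documents=["new doc", "another doc"],
    metadatas=[{"new": "metadata"}, {"new": "metadata"}],
)
print(sorted(store))              # ['doc1', 'doc2']
print(store["doc1"]["document"])  # new doc
```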
- Get cluster size:
cluster = client.get_or_create_cluster("new_testing")
print(f"cluster size {cluster.cluster_size} mb")
Gets the size of the cluster in megabytes. Each cluster has a maximum size of 500MB.
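Since each cluster is capped, it can be handy to check the remaining headroom before a large upload. A small sketch, assuming cluster_size is a plain number of megabytes; remaining_mb and MAX_CLUSTER_MB are hypothetical names, and a stand-in object is used so the snippet runs without a server:

```python
from types import SimpleNamespace

MAX_CLUSTER_MB = 500  # per-cluster limit quoted in this README


def remaining_mb(cluster, limit_mb=MAX_CLUSTER_MB):
    """Return how many megabytes are left before the limit is reached."""
    return limit_mb - cluster.cluster_size


# Stand-in for a real cluster; with Bagel you would pass `cluster` itself.
demo = SimpleNamespace(cluster_size=120.5)
print(remaining_mb(demo))  # 379.5
```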
- Add image:
Bagel can store images too. Here is an example of adding an image to a cluster. Almost every image format is supported.
filename = "your_img.png"
resp = cluster.add_image(filename)
- Embedding size:
print(f"Embedding size {cluster.embedding_size}")
If no data has been added to the cluster yet, embedding_size is None. Once data is added, embedding_size is set.
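One way to use this value is to sanity-check vectors before adding them. The sketch below assumes that embedding_size, once set, is an integer vector length (the README does not spell this out); mismatched_indexes is a hypothetical helper, and stand-in objects replace a real cluster so the snippet runs offline:

```python
from types import SimpleNamespace


def mismatched_indexes(cluster, embeddings):
    """Return indexes of vectors whose length differs from embedding_size.

    When embedding_size is None there is nothing to compare against,
    so no vector is reported.
    """
    expected = cluster.embedding_size
    if expected is None:
        return []
    return [i for i, vec in enumerate(embeddings) if len(vec) != expected]


# Stand-ins for an empty cluster and one that already holds 2-d vectors.
empty = SimpleNamespace(embedding_size=None)
filled = SimpleNamespace(embedding_size=2)
vectors = [[1.1, 2.3], [4.5, 6.9, 0.0]]
print(mismatched_indexes(empty, vectors))   # []
print(mismatched_indexes(filled, vectors))  # [1]
```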
- Add images by download URL:
Multiple images can be added to a Bagel cluster using URLs. It's recommended to add fewer than 20 images at a time using this function. Upon execution, the function will return the URLs of successfully added images and those that failed. Here's an example:
cluster = client.get_or_create_cluster("new_testing")
urls = [
"https://bagel-public-models-s3-download.s3.eu-north-1.amazonaws.com/cat/60de145c79609acaba3bbe08974a9ff5.jpg",
"https://bagel-public-models-s3-download.s3.eu-north-1.amazonaws.com/cat/black-white-cat-wallpaper.jpg",
]
ids = [str(uuid.uuid4()) for _ in urls]
resp = cluster.add_image_urls(ids=ids, urls=urls)
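To stay under the recommended 20 images per call with a longer list, the URLs can be sent in slices. A minimal sketch, assuming only the add_image_urls(ids=..., urls=...) call shown above; add_image_urls_in_batches is a hypothetical helper, and because the shape of the response is not documented here, the responses are simply collected as returned:

```python
import uuid


def add_image_urls_in_batches(cluster, urls, batch_size=19):
    """Call cluster.add_image_urls once per batch and collect the responses."""
    responses = []
    for start in range(0, len(urls), batch_size):
        batch = urls[start:start + batch_size]
        # One fresh id per URL, as in the example above.
        ids = [str(uuid.uuid4()) for _ in batch]
        responses.append(cluster.add_image_urls(ids=ids, urls=batch))
    return responses
```

With a real cluster this would be called as `add_image_urls_in_batches(cluster, urls)`, giving one response per batch.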
Tutorials
Explore additional tutorials for more insights.
- Python Client Example
- Using Bagel with Llama Index
- Using Bagel with Langchain
- Build an image search engine in 10 minutes using Bagel
Need more dough-tails? See the example code for a more comprehensive guide on using the Bagel Python client.
Happy coding and enjoy your fresh Bagels! 🥯👩‍💻👨‍💻