BagelDB is a Python library for interacting with the BagelDB API.

BagelDB Python Client 🥯

Welcome to the BagelDB Python Client Example! BagelDB is your bread-and-butter library for interacting with the BagelDB API without breaking a sweat.

One of the perks? No need to call the OpenAI Embeddings method or any other model to generate embeddings! That's right, the BagelDB client handles that for you. So, you don't need to spend extra bucks on generating embeddings. Quite a dough-saver, isn't it? 🥯💰

Prerequisites

  • Python 3.6+
  • pip package manager
  • Cluster size limit: 500 MB (open a new issue if you need a higher limit)

Installation

To install the BagelDB Python client, run the following command in your terminal:

pip install betabageldb

Usage

  1. Import the necessary modules:
import uuid
import bagel
from bagel.config import Settings

This snippet imports the required modules for using BagelDB, including the uuid module for generating unique identifiers.

  2. Define the BagelDB server settings:
server_settings = Settings(
    bagel_api_impl="rest",
    bagel_server_host="api.bageldb.ai"
)

Here, we define the settings for connecting to the BagelDB server.
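
If the host needs to differ between environments, the keyword arguments can be assembled before they are passed to Settings. The following is a minimal sketch, assuming a hypothetical environment variable named BAGEL_SERVER_HOST; the BagelDB client does not read this variable on its own, and build_settings_kwargs is an illustrative helper, not part of the library.

```python
import os


def build_settings_kwargs(default_host="api.bageldb.ai"):
    """Return keyword arguments for bagel.config.Settings.

    BAGEL_SERVER_HOST is a hypothetical variable name chosen for this
    sketch; the BagelDB client does not read it by itself.
    """
    return {
        "bagel_api_impl": "rest",
        "bagel_server_host": os.environ.get("BAGEL_SERVER_HOST", default_host),
    }


settings_kwargs = build_settings_kwargs()
# With the client installed: server_settings = Settings(**settings_kwargs)
```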

  3. Create the BagelDB client:
client = bagel.Client(server_settings)

Create an instance of the BagelDB client using the previously defined server settings.

  4. Ping the BagelDB server:
print(client.ping())

This checks the connectivity to the BagelDB server.
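
A connectivity check is often wrapped in a small retry loop. The sketch below takes any callable, so client.ping can be passed in directly. It assumes that a failed ping raises an exception, which this README does not confirm; wait_until_reachable is a hypothetical helper name.

```python
import time


def wait_until_reachable(ping, attempts=3, delay_seconds=1.0):
    """Call ping() until it stops raising, up to `attempts` times.

    Returns True on the first call that completes without an exception
    and False if every attempt raises.
    """
    for attempt in range(attempts):
        try:
            ping()
            return True
        except Exception:
            # Sleep between attempts, but not after the last one.
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    return False
```

With a client available, this could be called as wait_until_reachable(client.ping).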

  5. Get the BagelDB server version:
print(client.get_version())

Retrieves and prints the version of the BagelDB server.

  6. Create and delete a cluster:
name = str(uuid.uuid4())
client.create_cluster(name)
client.delete_cluster(name)

Generates a unique name for a cluster, creates it, and then deletes it. This demonstrates basic cluster management.
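
For throwaway clusters, the create/delete pair can be tied together so that cleanup also happens when an error occurs in between. This is a minimal sketch that uses only the create_cluster and delete_cluster calls shown above; temporary_cluster is a hypothetical helper, not part of the BagelDB client.

```python
import uuid
from contextlib import contextmanager


@contextmanager
def temporary_cluster(client):
    """Create a uniquely named cluster and delete it on exit."""
    name = str(uuid.uuid4())
    client.create_cluster(name)
    try:
        yield name
    finally:
        # Runs even if the body raises, so the cluster is not left behind.
        client.delete_cluster(name)
```

It would be used as `with temporary_cluster(client) as name: ...`, with the cluster removed once the block ends.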

  7. Create, add documents, and query a cluster:
cluster = client.get_or_create_cluster("testing")

cluster.add(
    documents=["This is doc", "This is google doc"],
    metadatas=[{"source": "notion"},
               {"source": "google-doc"}],
    ids=[str(uuid.uuid4()), str(uuid.uuid4())],
)

results = cluster.find(query_texts=["query"], n_results=5)

Creates a cluster or retrieves an existing one, then adds documents with metadata. Each entry in ids is a unique identifier for the corresponding document. BagelDB generates the embeddings with its own model, and find performs a text-based search. The n_results argument limits the number of results returned.
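
Because documents, metadatas, and ids are parallel lists, it can help to build them from one list of records so they stay the same length. The sketch below is plain Python and does not touch the server; build_add_arguments is a hypothetical helper name.

```python
import uuid


def build_add_arguments(records):
    """Turn (text, metadata) pairs into keyword arguments for cluster.add."""
    documents = [text for text, _ in records]
    metadatas = [metadata for _, metadata in records]
    # One fresh UUID per record, in the same order as the documents.
    ids = [str(uuid.uuid4()) for _ in records]
    return {"documents": documents, "metadatas": metadatas, "ids": ids}


add_arguments = build_add_arguments([
    ("This is doc", {"source": "notion"}),
    ("This is google doc", {"source": "google-doc"}),
])
# With a cluster available: cluster.add(**add_arguments)
```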

  8. Add precomputed embeddings and query by embedding:
cluster = client.get_or_create_cluster("new_testing")

cluster.add(embeddings=[[1.1, 2.3], [4.5, 6.9]],
            metadatas=[{"info": "M1"}, {"info": "M1"}],
            documents=["doc1", "doc2"],
            ids=["id1", "id2"])

results = cluster.find(query_embeddings=[[1.1, 2.3]], n_results=2)

This is similar to the previous example but uses pre-calculated embeddings for documents and performs a query based on those embeddings.
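
To make the idea of an embedding query concrete, here is a small local illustration of nearest-neighbor ranking over the same two vectors. It runs entirely in memory and is not how BagelDB implements find; the squared Euclidean distance is an assumption made for this sketch, since the metric the service uses is not stated here.

```python
def nearest_ids(stored, query, n_results):
    """Return the ids of the n_results vectors closest to query.

    `stored` maps id -> embedding. Squared Euclidean distance is an
    assumption made for this sketch, not BagelDB's documented metric.
    """
    def squared_distance(vector):
        return sum((a - b) ** 2 for a, b in zip(vector, query))

    ranked = sorted(stored, key=lambda item_id: squared_distance(stored[item_id]))
    return ranked[:n_results]


stored = {"id1": [1.1, 2.3], "id2": [4.5, 6.9]}
closest = nearest_ids(stored, query=[1.1, 2.3], n_results=2)
# closest == ["id1", "id2"]: id1 matches the query exactly, so it ranks first
```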

  9. Modify cluster name:
cluster.modify(name="new_name")

Changes the name of the cluster.

  10. Update document metadata:
cluster.update(ids=["id1"], metadatas=[{"new":"metadata"}])

Updates the metadata of a specific document in the cluster.

  11. Upsert documents:
cluster.upsert(documents=["new doc"],
               metadatas=[{"new": "metadata"}],
               ids=["doc1"])

Inserts or updates documents in the cluster based on provided IDs.
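
The general insert-or-update behavior can be pictured with a plain dictionary keyed by ID. This is only an illustration of upsert semantics in memory; whether BagelDB replaces or merges existing metadata on upsert is not stated in this README.

```python
def upsert(store, ids, documents, metadatas):
    """Insert new ids and overwrite existing ones in a plain dict."""
    for item_id, document, metadata in zip(ids, documents, metadatas):
        store[item_id] = {"document": document, "metadata": metadata}
    return store


store = {"doc1": {"document": "old doc", "metadata": {"old": "metadata"}}}
upsert(store, ["doc1", "doc2"], ["new doc", "another doc"],
       [{"new": "metadata"}, {"new": "metadata"}])
# "doc1" is overwritten and "doc2" is inserted; the store now holds two entries
```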

  12. Get cluster size:
cluster = client.get_or_create_cluster("new_testing")
print(f"cluster size {cluster.cluster_size} MB")

Gets the size of the cluster in megabytes. Each cluster has a maximum size of 500 MB.
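
The reported size can be compared against the 500 MB limit before a large batch is added. A minimal sketch, assuming cluster.cluster_size is a number expressed in megabytes; remaining_capacity_mb is a hypothetical helper.

```python
CLUSTER_LIMIT_MB = 500.0  # per-cluster limit stated in this README


def remaining_capacity_mb(cluster_size_mb, limit_mb=CLUSTER_LIMIT_MB):
    """Return how many megabytes are left before the limit (never negative)."""
    return max(limit_mb - cluster_size_mb, 0.0)
```

With a cluster available, this could be called as remaining_capacity_mb(cluster.cluster_size).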

  13. Add image:

BagelDB can also store images. The example below adds an image to a cluster. Almost every image format is supported.

filename = "your_img.png"
resp = cluster.add_image(filename)

  14. Embedding size:
print(f"Embedding size {cluster.embedding_size}")

Before any data is added to the cluster, embedding_size is None. Once data has been added, embedding_size is set.
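
When supplying your own embeddings, a quick local check can catch vectors of the wrong length before they are sent. The sketch below assumes that embedding_size, once set, is the integer dimension of the stored vectors and that all vectors in a cluster must share it; neither point is confirmed by this README, and check_dimensions is a hypothetical helper.

```python
def check_dimensions(embeddings, embedding_size):
    """Raise ValueError if the vectors do not share one dimension.

    `embedding_size` may be None (an empty cluster), in which case the
    first vector's length is used as the reference.
    """
    if not embeddings:
        return embedding_size
    expected = embedding_size if embedding_size is not None else len(embeddings[0])
    for index, vector in enumerate(embeddings):
        if len(vector) != expected:
            raise ValueError(
                f"embedding {index} has {len(vector)} dimensions, expected {expected}"
            )
    return expected
```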

  15. Add images by download URL:

Multiple images can be added to a BagelDB cluster using URLs. It's recommended to add fewer than 20 images at a time using this function. Upon execution, the function will return the URLs of successfully added images and those that failed. Here's an example:

# `client` is the bagel.Client instance created earlier
cluster = client.get_or_create_cluster("new_testing")
urls = [
    "https://bagel-public-models-s3-download.s3.eu-north-1.amazonaws.com/cat/60de145c79609acaba3bbe08974a9ff5.jpg",
    "https://bagel-public-models-s3-download.s3.eu-north-1.amazonaws.com/cat/black-white-cat-wallpaper.jpg",
]
# One unique ID per URL
ids = [str(uuid.uuid4()) for _ in urls]
resp = cluster.add_image_urls(ids=ids, urls=urls)
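
To stay under the recommended 20 images per call, a longer URL list can be split into batches first. This is a minimal sketch in plain Python; batch_image_urls is a hypothetical helper, and the default of 19 simply reflects "fewer than 20".

```python
import uuid


def batch_image_urls(urls, max_batch_size=19):
    """Yield (ids, urls) pairs with at most max_batch_size URLs each."""
    for start in range(0, len(urls), max_batch_size):
        chunk = urls[start:start + max_batch_size]
        # Generate one unique ID per URL in this batch.
        ids = [str(uuid.uuid4()) for _ in chunk]
        yield ids, chunk
```

Each pair could then be passed along as cluster.add_image_urls(ids=ids, urls=chunk), one call per batch.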

Tutorials

Explore additional tutorials for more insights.


Need more dough-tails? See the example code for a more comprehensive guide on using the BagelDB Python client.

Happy coding and enjoy your fresh Bagels! 🥯👩‍💻👨‍💻
