
BagelML is a library for Bagel's finetuning API.

Bagel Python Client 🥯

Welcome to the Bagel Python Client! Bagel is a platform for peer-to-peer machine learning, for fine-tuning open-source models such as Llama or Mistral, and for retrieval-augmented generation (RAG).

One of the perks? No need to manage complex embeddings or model integrations yourself! The Bagel client handles these processes, saving you time and money. 🥯

Table of Contents

  1. Prerequisites
  2. Installation
  3. Import the necessary modules
  4. Define the Bagel server settings
  5. Create the Bagel client
  6. Ping the Bagel server
  7. Get the Bagel server version
  8. Create an Asset
  9. Delete an Asset
  10. Download Model Files
  11. Query Asset
  12. Update Asset
  13. Download file
  14. File Upload
  15. Fine-tune
  16. Get all assets
  17. Get all assets by Id
  18. Get finetuned model
  19. Get job by job id
  20. Get job
  21. List Job
  22. Add data to asset
  23. Download Finetuned Model
  24. Buy Asset

Prerequisites

  • Python 3.6+
  • pip package manager
  • Asset size limit: 500 MB (open a new issue if you need a higher limit)

Installation

To install the Bagel Python client, run the following command in your terminal:

pip install bagelML

Import the necessary modules

import uuid
import bagel
from bagel.config import Settings

This snippet imports the required modules for using Bagel.

Define the Bagel server settings

server_settings = Settings(
    bagel_api_impl="rest",
    bagel_server_host="api.bageldb.ai"
)

Here, we define the settings for connecting to the Bagel server.
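
If the server host differs between environments, the settings above can be assembled from an environment variable rather than hard-coded. The sketch below is a minimal illustration, not part of the Bagel client: the helper settings_kwargs_from_env and the variable name BAGEL_SERVER_HOST are hypothetical, and the only keys it produces are the two used in the example above.

```python
import os

# Host used in the settings example above; serves as the fallback value.
DEFAULT_HOST = "api.bageldb.ai"


def settings_kwargs_from_env(env=None):
    """Build keyword arguments for Settings from environment variables.

    BAGEL_SERVER_HOST is a hypothetical variable name chosen for this
    sketch; the Bagel client is not known to read it on its own.
    """
    env = os.environ if env is None else env
    return {
        "bagel_api_impl": "rest",
        "bagel_server_host": env.get("BAGEL_SERVER_HOST", DEFAULT_HOST),
    }


# Passing an explicit mapping keeps the demo independent of the real environment.
print(settings_kwargs_from_env({"BAGEL_SERVER_HOST": "localhost"}))
```

With the package installed, the result can be passed on as Settings(**settings_kwargs_from_env()).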

Create the Bagel client

client = bagel.Client(server_settings)

Create an instance of the Bagel client using the previously defined server settings.

Ping the Bagel server

print(client.ping())

This checks the connectivity to the Bagel server.
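
A connectivity check is often wrapped in a small retry loop so that a brief network hiccup does not abort a script. The sketch below assumes only that a failed ping raises an exception; the return value of client.ping() is not documented here, so it is ignored. The helper wait_until_reachable is hypothetical and takes any zero-argument callable.

```python
import time


def wait_until_reachable(ping, attempts=3, delay_seconds=1.0):
    """Call ping() up to `attempts` times; return True once a call does not raise."""
    for attempt in range(1, attempts + 1):
        try:
            ping()
            return True
        except Exception as exc:
            # Broad on purpose: the client's exception types are not documented here.
            print(f"ping attempt {attempt} failed: {exc}")
            if attempt < attempts:
                time.sleep(delay_seconds)
    return False
```

Typical use would be wait_until_reachable(client.ping) before issuing any other request.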

Get the Bagel server version

print(client.get_version())

Retrieves and prints the version of the Bagel server.

Create an Asset

Assets in Bagel are containers for large datasets. They hold embeddings: high-dimensional vectors that represent data such as text, images, or audio. Assets enable efficient similarity search, which underpins applications ranging from recommendation systems and search engines to data analytics tools.

api_key = 'insert api key'
payload = {
    "dataset_type": "RAW",
    "title": "",
    "category": "",
    "details": "Testing",
    "tags": ["AI", "DEMO", "TEST"],
    "user_id": 'insert user id'
}

client.create_asset(payload, api_key)
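
When several Assets are created from a script, building the payload in one place avoids typos in the key names. The helper below is a hypothetical sketch that reproduces exactly the keys from the example above. Rejecting an empty user_id is an assumption made for illustration, not a documented server rule.

```python
def build_asset_payload(title, user_id, category="", details="", tags=None,
                        dataset_type="RAW"):
    """Return a create_asset payload with the same keys as the example above."""
    if not user_id:
        # Assumption: an asset needs an owner, so fail early on an empty ID.
        raise ValueError("user_id must not be empty")
    return {
        "dataset_type": dataset_type,
        "title": title,
        "category": category,
        "details": details,
        "tags": list(tags or []),
        "user_id": user_id,
    }


payload = build_asset_payload(
    title="Demo asset",
    user_id="insert user id",
    details="Testing",
    tags=["AI", "DEMO", "TEST"],
)
print(payload)
```

The resulting dictionary can then be passed to client.create_asset(payload, api_key) as shown above.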

Delete an Asset

api_key = 'insert api key'
dataset_id = 'insert dataset/asset id'
client.delete_asset(dataset_id, api_key)

This method deletes a specific Asset.

Download Model Files

api_key = 'insert api key'
asset_id = 'insert dataset/asset id'
file_name = "insert file .txt"

client.download_file(asset_id, file_name, api_key)

Downloads a file associated with a specific Asset.

Query Asset

api_key = ""
asset_id = ""

payload = {
    "where": {
        # "category": "Cat2",
    },
    "where_document": {
        # "is_published": True,
    },
    # "query_embeddings": [em],
    "n_results": 1,
    "include": ["metadatas", "documents", "distances"],
    "query_texts": ["insert query text"],
    "padding": False,
}

client.query_asset(asset_id, payload, api_key)

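Queries a specific Asset. In this payload, query_texts holds the query strings, n_results caps the number of results returned, and include selects which fields come back. The where and where_document entries are optional filters, shown empty here with commented-out examples.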
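
The query payload has a fixed shape, so it can be generated by a small function. This is a sketch under stated assumptions: build_query_payload is a hypothetical helper, it emits only the keys shown in the example above, and the check that n_results is at least 1 is a local sanity check rather than a documented server constraint.

```python
def build_query_payload(query_texts, n_results=1, where=None, where_document=None,
                        include=("metadatas", "documents", "distances")):
    """Return a query_asset payload mirroring the keys of the example above."""
    if isinstance(query_texts, str):
        # Accept a single string for convenience and wrap it in a list.
        query_texts = [query_texts]
    if n_results < 1:
        raise ValueError("n_results must be at least 1")
    return {
        "where": dict(where or {}),
        "where_document": dict(where_document or {}),
        "n_results": n_results,
        "include": list(include),
        "query_texts": list(query_texts),
        "padding": False,
    }


print(build_query_payload("insert query text", where={"category": "Cat2"}))
```

The result would be passed as the payload argument of client.query_asset(asset_id, payload, api_key).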

Update Asset

import bagel
from bagel.config import Settings

api_key = ""
asset_id = ""

payload = {
    "title": "Updated dataset title",
    "category": "Updated category",
    "details": "Updated dataset description.",
    "tags": ["Updated", "Tags"]
}

server_settings = Settings(
    bagel_api_impl="rest",
    bagel_server_host="api.bageldb.ai",
    bagel_server_http_port="80",
)
client = bagel.Client(server_settings)

client.update_asset(asset_id, payload, api_key)

Updates the details of an existing Asset.

Download file

api_key = ""
asset_id = ""
file_name = ""

client.download_file(asset_id, file_name, api_key)

Downloads a specific file from an Asset.

File Upload

api_key = ""
dataset_id = ""
file_path = ""

client.file_upload(file_path, dataset_id, api_key)

Uploads a file to a specific Asset.
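
Because the prerequisites state a 500 MB asset size limit, it can be worth checking a file's size locally before uploading. The sketch below rests on two assumptions: it treats the limit as 500 MiB (500 × 1024 × 1024 bytes) and as applying to a single file. The helper check_upload_size is hypothetical and uses only the standard library.

```python
import os

# Assumption: "500MB" is read as 500 MiB; adjust if the server counts differently.
MAX_ASSET_BYTES = 500 * 1024 * 1024


def check_upload_size(file_path, limit_bytes=MAX_ASSET_BYTES):
    """Return the file size in bytes, or raise ValueError if it exceeds the limit."""
    size = os.path.getsize(file_path)
    if size > limit_bytes:
        raise ValueError(
            f"{file_path} is {size} bytes, which exceeds the limit of {limit_bytes} bytes"
        )
    return size
```

A call such as check_upload_size(file_path) placed before client.file_upload(file_path, dataset_id, api_key) fails fast instead of after a long transfer.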

Fine-tune

# Your Bagel API key
api_key = ""
# Define the payload for the fine-tune function
payload = {
    "dataset_type": "RAW",
    "title": "what!",
    "category": "",
    "details": "",
    "tags": [],
    "user_id": "",
    "fine_tune_payload": {
        "asset_id": "",  # the asset ID goes here, inside fine_tune_payload
        "model_name": "",  # same as the title
        "base_model": "",
        "file_name": "catch.txt",
        "user_id": "",
    },
}

client.fine_tune(payload, api_key)

Fine-tunes a model using a specific Asset and provided parameters.
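
The fine-tune payload repeats two values: user_id appears at both levels, and the example's comment says model_name should match the title. A small builder keeps those copies in sync. This is an illustrative sketch; build_fine_tune_payload is a hypothetical helper that emits only the keys shown in the example above.

```python
def build_fine_tune_payload(asset_id, base_model, file_name, user_id, title,
                            dataset_type="RAW"):
    """Return a fine_tune payload in which the repeated fields cannot drift apart."""
    return {
        "dataset_type": dataset_type,
        "title": title,
        "category": "",
        "details": "",
        "tags": [],
        "user_id": user_id,
        "fine_tune_payload": {
            "asset_id": asset_id,
            "model_name": title,  # kept identical to the title, as the example notes
            "base_model": base_model,
            "file_name": file_name,
            "user_id": user_id,
        },
    }


payload = build_fine_tune_payload(
    asset_id="insert asset id",
    base_model="insert base model",
    file_name="catch.txt",
    user_id="insert user id",
    title="my-model",
)
print(payload["fine_tune_payload"])
```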

Get all assets

user_id = ""
api_key = ""

client.get_all_asset(user_id, api_key)

Retrieves all assets for a specific user.

Get all assets by Id

asset_id = ""
api_key = ""

client.get_asset_by_id(asset_id, api_key)

Retrieves a specific asset by its ID.

Get finetuned model

# Replace these values with actual ones
api_key = ""
asset_id = ""
file_name = "train.txt"

# Call the function
client.download_file_by_asset_and_name(asset_id, file_name)

Downloads a fine-tuned model by asset ID and file name.

Get job by job id

job_id = ""  # Replace with the actual job ID
api_key = ""  # Replace with the actual API key

client.get_job(job_id, api_key)

Retrieves the status of a specific job by job ID.
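
Fine-tuning jobs take time, so a status call like this is usually polled until the job finishes. The sketch below is a generic polling loop, not part of the Bagel client. The shape of the get_job response is not documented here, so the caller supplies an is_finished predicate that decides when to stop. Both wait_for_job and is_finished are hypothetical names.

```python
import time


def wait_for_job(get_job, job_id, api_key, is_finished,
                 interval_seconds=5.0, max_checks=60):
    """Poll get_job(job_id, api_key) until is_finished(response) returns True.

    Returns the last response, or raises TimeoutError after max_checks polls.
    """
    for check in range(max_checks):
        job = get_job(job_id, api_key)
        if is_finished(job):
            return job
        if check < max_checks - 1:
            # Sleep between polls, but not after the final one.
            time.sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_checks} checks")
```

It could be called as wait_for_job(client.get_job, job_id, api_key, is_finished=...), with the predicate written against whatever fields the server actually returns.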

Get job

job_id = ""  # Replace with the actual job ID
api_key = ""  # Replace with the actual API key

client.get_job(job_id, api_key)

Retrieves details of a job.

List Job

# Replace the empty strings with your API key and user ID
api_key = ""
user_id = ""

# Call the function
client.list_jobs(user_id, api_key)

Lists all jobs for a specific user.

Add data to asset

asset_id = ""
api_key = ""

payload = {
    "metadatas": [{"source": "testing"}],
    "documents": ["Hi man"],
    "ids": ["xxxx-xxxx-xxxx-xxxx-xxxxx"],  # unique IDs that you generate yourself, e.g. with uuid.uuid4()
}

client.add_data_to_asset(asset_id, payload, api_key)

Adds data to an existing asset.
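
Since the IDs are generated by the caller, the uuid module imported earlier is a natural fit. The sketch below builds a payload with one random UUID and one metadata entry per document. The helper build_add_payload is hypothetical, and giving every document the same source value is a simplification for illustration.

```python
import uuid


def build_add_payload(documents, source):
    """Return an add_data_to_asset payload with a fresh UUID for each document."""
    documents = list(documents)
    return {
        "metadatas": [{"source": source} for _ in documents],
        "documents": documents,
        "ids": [str(uuid.uuid4()) for _ in documents],
    }


payload = build_add_payload(["Hi man", "Hello again"], source="testing")
print(payload["ids"])
```

The three lists always have the same length, so each document lines up with its ID and metadata entry.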

Download Finetuned Model

api_key = "insert api key"
asset_id = "insert asset id"

response = client.download_model(asset_id, api_key)

Buy Asset

api_key = "insert api key"
asset_id = "insert asset id"
user_id = "insert userid"

client.buy_asset(asset_id, user_id, api_key)
