
A tool for building gRPC-based model backends for LeapfrogAI



Table of Contents

  1. Project Goal
  2. Why Host Your Own LLM?
  3. Features
  4. Architecture
  5. Getting Started
  6. Usage
  7. License
  8. Community

Project Goal

LeapfrogAI is designed to provide AI-as-a-service in egress-limited environments. This project aims to bridge the gap between resource-constrained environments and the growing demand for sophisticated AI solutions by enabling the hosting of APIs that provide AI-related services.

Our services include vector databases, completions from models such as Large Language Models (LLMs), and the creation of embeddings. These AI capabilities can be easily accessed and integrated with your existing infrastructure, ensuring the power of AI can be harnessed irrespective of your environment's limitations.

Why Host Your Own LLM?

Large Language Models (LLMs) are a powerful resource for AI-driven decision making, content generation, and more. However, the use of cloud-based LLMs can introduce limitations such as:

  • Data Privacy and Security: Sending sensitive information to a third-party service may not be suitable or permissible for all types of data or organizations. By hosting your own LLM, you retain full control over your data.

  • Cost: Pay-as-you-go AI services can become expensive, especially when large volumes of data are involved. Running your own LLM can often be a more cost-effective solution in the long run.

  • Customization and Control: By hosting your own LLM, you have the ability to customize the model's parameters, training data, and more, tailoring the AI to your specific needs.

  • Latency: If your application requires real-time or near-real-time responses, hosting the model locally can significantly reduce latency compared to making a round trip to a remote API.

Features

  • OpenAI API Compatibility: LeapfrogAI provides an API that closely matches OpenAI's. This allows tools built with OpenAI/ChatGPT to function seamlessly with LeapfrogAI as a backend, greatly simplifying the transition for developers familiar with OpenAI's API and facilitating easy integration with existing systems.

  • Vector Databases: Our vector database service allows you to perform efficient similarity searches on large scale databases. This feature can be utilized to augment prompts with responses from VectorDBs, enhancing the contextual awareness of the model.

  • Fine-Tuning Models: One of the key strengths of LeapfrogAI is its ability to leverage customer specific data. We provide capabilities to fine-tune models with your data, enabling the AI to better understand your domain and provide more accurate and contextually relevant outputs.

  • Embeddings Creation: Embeddings are fundamental to the working of many AI algorithms. LeapfrogAI provides services to generate embeddings which can be used for a variety of tasks such as semantic similarity, clustering, and more.
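
To make the embeddings bullet concrete, here is a minimal, dependency-free sketch of how embedding vectors enable semantic similarity search. The vectors below are toy three-dimensional values for illustration only; a real model such as all-MiniLM-L6-v2 produces 384-dimensional embeddings, and a vector database like Weaviate performs this nearest-neighbour lookup at scale:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "document embeddings" -- illustrative values, not real model output.
doc_vectors = {
    "ship maintenance manual": [0.9, 0.1, 0.2],
    "recipe for apple pie": [0.1, 0.8, 0.3],
}

# Toy embedding of the query "how do I repair the hull?"
query = [0.85, 0.15, 0.25]

# The document whose embedding is most similar to the query wins.
best = max(doc_vectors, key=lambda d: cosine_similarity(query, doc_vectors[d]))
print(best)  # -> ship maintenance manual
```

This nearest-neighbour step is what "augmenting prompts with responses from VectorDBs" refers to: the best-matching documents are retrieved and prepended to the model's prompt.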

Architecture

LeapfrogAI exposes Weaviate along with LLM completion and embedding-generation capabilities over HTTP. Internally, however, communication is a combination of gRPC and HTTP connections, as described below:

graph LR;
    A[User] --HTTP--> L[LeapfrogAI]
    L ---->|HTTP| B(API)
    B ----> |gRPC| C(StableLM)
    B ----> |gRPC| D(WhisperAI)
    B ----> |gRPC| E(all-MiniLM-L6-v2)

    L ---->|HTTP| W[Weaviate]
    W --/embeddings-->B

Getting Started

Setting up the Kubernetes Cluster

LeapfrogAI's API server and Weaviate's vector database don't require GPUs; however, some models will not function without them. If using a CPU-based platform, see the ctransformers folder for working with GGML architectures.

K3d w/ GPU support

If developing on a node that has at least one GPU, there is a Zarf package that deploys a k3d cluster with GPU support; clone and follow the instructions in that repository.

Initialize Cluster

The supported install method uses zarf to initialize the cluster and then deploy Big Bang on top:

zarf init -a amd64
zarf package deploy oci://ghcr.io/defenseunicorns/packages/dubbd-k3d:0.11.0-amd64 --set APPROVED_REGISTRIES="ghcr.io/runyontr/* | ghcr.io/defenseunicorns/* | nvcr.io/nvidia/k8s/* | semitechnologies/*"

Deploy

To build and deploy LeapfrogAI:

zarf package create .
zarf package deploy zarf-package-leapfrogai-*.zst --confirm

Configure DNS

Ensure that the DNS record for *.bigbang.dev points to the load balancer for Istio. By default this DNS record points at localhost, so for the k3d deployment, this should work out of the box with the load balancers configured.

The OpenAI-compatible API service is hosted in the cluster and watches for new models being installed.

Install a model

$ cd models/test/repeater
$ zarf package create .
$ zarf package deploy zarf-package-*.zst --confirm
$ kubectl get pods -n leapfrogai
NAME                              READY   STATUS    RESTARTS   AGE
api-deployment-65cd6fbf95-l5dzw   2/2     Running   0          5m23s

Usage

Refer to one of the IPython notebooks, which showcase a simple getting-started workflow.

Leapfrog AI

Leapfrog AI is a deployable AI-as-a-service that brings the capabilities of AI models to egress-limited environments by allowing teams to deploy APIs that mirror OpenAI's spec. Teams are able to use tools built around OpenAI's models in their own environments, preventing the release of proprietary and sensitive data to SaaS tools.

In addition, tools like Weaviate are deployed to allow for the creation of content-augmented applications.

Create the API Server

See the Getting Started Notebook for an example of using the API with the OpenAI Python module.
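
As a sketch of what OpenAI compatibility means in practice, the snippet below builds the JSON body an OpenAI client would POST to the `/completions` endpoint. The base URL is an assumption based on the `*.bigbang.dev` DNS setup above (adjust it to wherever your Istio gateway exposes the API), and `repeater` is the test model installed in the Getting Started section; the request is constructed but not sent:

```python
import json
import urllib.request

# Assumed base URL -- adjust to your deployment's Istio gateway.
API_BASE = "https://leapfrogai.bigbang.dev/openai/v1"

def completion_request(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Build an OpenAI-style /completions request body."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}

payload = completion_request("repeater", "Hello, LeapfrogAI!")
req = urllib.request.Request(
    f"{API_BASE}/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send it; equivalently, any OpenAI SDK
# pointed at API_BASE issues the same request shape.
print(payload["model"])  # -> repeater
```

Because the request shape matches OpenAI's, existing OpenAI client libraries only need their base URL redirected to the LeapfrogAI deployment.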

Building leapfrogai and updating PyPI

  1. Change the version in pyproject.toml
  2. python3 -m pip install --upgrade build hatchling twine
  3. python3 -m build
  4. python3 -m twine upload dist/*

Community

Real-time discussions about LeapfrogAI development happen in Discord. Discussions should be civil and focused on the open source development of LeapfrogAI. Distribution of proprietary or non-distributable code or model weights is prohibited, and such content will be removed.

LeapfrogAI is supported by a community of users and contributors, including:

Defense Unicorns, Beast Code, Hypergiant, and Pulze.

Want to add your organization or logo to this list? Open a PR!
