
Canopy is an orchestration engine for integrating LLMs with Pinecone.


Canopy

Canopy is an open-source Retrieval Augmented Generation (RAG) framework and context engine built on top of the Pinecone vector database. Canopy enables you to quickly and easily experiment with and build applications using RAG. Start chatting with your documents or text data with a few simple commands.

Canopy provides a configurable built-in server so you can effortlessly deploy a RAG-powered chat application to your existing chat UI or interface. Or you can build your own, custom RAG application using the Canopy library.

Canopy is designed to be:

  • Easy to implement: Bring your text data in Parquet or JSONL format, and Canopy will handle the rest. Canopy makes it easy to incorporate RAG into your OpenAI chat applications.
  • Reliable at scale: Build fast, highly accurate GenAI applications that are production-ready and backed by Pinecone's vector database. Seamlessly scale to billions of items with transparent, resource-based pricing.
  • Open and flexible: Fully open-source, Canopy is both modular and extensible. You can configure it to use only the components you need, or extend any component with your own custom implementation.
  • Interactive and iterative: Evaluate your RAG workflow with a CLI-based chat tool. With a simple Canopy CLI command, you can interactively chat with your text data and compare RAG vs. non-RAG workflows side-by-side.

RAG with Canopy

Canopy implements the full RAG workflow to prevent hallucinations and augment your LLM with your own text data.

Canopy has two flows: knowledge base creation and chat. In the knowledge base creation flow, users upload their documents and transform them into meaningful representations stored in Pinecone's Vector Database. In the chat flow, incoming queries and chat history are optimized to retrieve the most relevant documents, the knowledge base is queried, and a meaningful context is generated for the LLM to answer.

What's inside the box?

  1. Canopy Core Library - The library has 3 main classes that are responsible for different parts of the RAG workflow:
    • ChatEngine - Exposes a chat interface to interact with your data. Given the history of chat messages, the ChatEngine formulates relevant queries to the ContextEngine, then uses the LLM to generate a knowledgeable response.
    • ContextEngine - Performs the “retrieval” part of RAG. The ContextEngine utilizes the underlying KnowledgeBase to retrieve the most relevant documents, then formulates a coherent textual context to be used as a prompt for the LLM.
    • KnowledgeBase - Manages your data for the RAG workflow. It automatically chunks and transforms your text data into text embeddings, storing them in a Pinecone vector database. Given a text query - the KnowledgeBase will retrieve the most relevant document chunks from the database.

More information about Core Library usage can be found in the Library Documentation. A minimal usage sketch of these classes appears after this list.

  2. Canopy Server - This is a web service that wraps the Canopy Core library and exposes it as a REST API. The server is built on top of FastAPI, Uvicorn, and Gunicorn and can be easily deployed in production. The server also comes with a built-in Swagger UI for easy testing and documentation. After you start the server, you can access the Swagger UI at http://host:port/docs (default: http://localhost:8000/docs).

  3. Canopy CLI - A built-in development tool that allows users to swiftly set up their own Canopy server and test its configuration.
    With just three CLI commands, you can create a new Canopy server, upload your documents to it, and then interact with the Chatbot using a built-in chat application directly from the terminal. The built-in chatbot also enables comparison of RAG-infused responses against a native LLM chatbot.
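
To make the three Core Library classes concrete, here is a minimal sketch of using them directly from Python. It follows the class and method names described in the Library Documentation (Tokenizer.initialize, KnowledgeBase.connect/upsert, ContextEngine, ChatEngine.chat); treat the exact signatures and the example index name as assumptions to verify against the docs for your installed version, and note that the index must already exist (for example, created with canopy new).

# Minimal sketch of the Core Library flow - verify names and signatures
# against the Library Documentation for your installed canopy-sdk version.
from canopy.tokenizer import Tokenizer
from canopy.knowledge_base import KnowledgeBase
from canopy.context_engine import ContextEngine
from canopy.chat_engine import ChatEngine
from canopy.models.data_models import Document, UserMessage

Tokenizer.initialize()                     # one-time global tokenizer setup

kb = KnowledgeBase(index_name="my-index")  # wraps the underlying Pinecone index
kb.connect()                               # assumes the index was already created (e.g. via `canopy new`)
kb.upsert([Document(id="doc1", text="Pinecone is a vector database.")])

context_engine = ContextEngine(kb)         # retrieval + context building
chat_engine = ChatEngine(context_engine)   # query generation + LLM response

response = chat_engine.chat(messages=[UserMessage(content="What is Pinecone?")], stream=False)
print(response.choices[0].message.content)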

Considerations

  • Canopy currently only supports OpenAI as the backend for both the embedding model and the LLM. Rate limits and pricing set by OpenAI will apply.
  • More integrations will be supported in the near future.

Setup

  1. Set up a virtual environment (optional):
python3 -m venv canopy-env
source canopy-env/bin/activate

More about virtual environments here.

  2. Install the package:
pip install canopy-sdk
  3. Set up the environment variables:
export PINECONE_API_KEY="<PINECONE_API_KEY>"
export PINECONE_ENVIRONMENT="<PINECONE_ENVIRONMENT>"
export OPENAI_API_KEY="<OPENAI_API_KEY>"
export INDEX_NAME=<INDEX_NAME>
More information about the environment variables:

  • PINECONE_API_KEY - The API key for Pinecone, used to authenticate to Pinecone services to create indexes and to insert, delete, and search data. Register or log into your Pinecone account in the console; you can access your API key from the "API Keys" section in the sidebar of your dashboard.
  • PINECONE_ENVIRONMENT - Determines the Pinecone service cloud environment of your index, e.g. west1-gcp or us-east-1-aws. You can find the Pinecone environment next to the API key in the console.
  • OPENAI_API_KEY - The API key for OpenAI, used to authenticate to OpenAI's services for the embedding and chat APIs. You can find your OpenAI API key here; you might need to log in or register for OpenAI services.
  • INDEX_NAME - The name of the Pinecone index Canopy will work with. You can choose any name as long as it follows Pinecone's naming restrictions.
  • CANOPY_CONFIG_FILE - The path of a configuration YAML file to be used by the Canopy server. Optional; if not provided, the default configuration will be used.
  4. To check that the installation was successful and the environment is set up, run:
canopy

The output should be similar to this:

Canopy: Ready

Usage: canopy [OPTIONS] COMMAND [ARGS]...
# rest of the help message

Quickstart

In this quickstart, we will show you how to use Canopy to build a simple question-answering system using RAG (retrieval augmented generation).

1. Create a new Canopy Index

As a one-time setup, Canopy needs to create a new Pinecone index that is configured to work with Canopy. Just run:

canopy new

Then follow the CLI instructions. The index that is created will have the prefix canopy--<INDEX_NAME>.
You only need to do this once for each Canopy index you want to create.

To learn more about Pinecone indexes and how to manage them, please refer to the following guide: Understanding indexes

2. Uploading data

You can load data into your Canopy index using the command:

canopy upsert /path/to/data_directory

# or
canopy upsert /path/to/data_directory/file.parquet

# or
canopy upsert /path/to/data_directory/file.jsonl

Canopy supports files in JSONL or Parquet format. The documents should have the following schema:

+----------+--------------+--------------+---------------+
| id(str)  | text(str)    | source       | metadata      |
|          |              | Optional[str]| Optional[dict]|
|----------+--------------+--------------+---------------|
| "id1"    | "some text"  | "some source"| {"key": "val"}|
+----------+--------------+--------------+---------------+

Follow the instructions in the CLI to upload your data.
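
If you need to produce a file in this schema yourself, a small script like the one below (standard-library Python; the file name and record contents are only examples) writes a JSONL file that canopy upsert can consume:

# Write a JSONL file matching the schema above: id and text are required,
# source and metadata are optional. File name and contents are illustrative.
import json

records = [
    {"id": "id1", "text": "some text", "source": "some source", "metadata": {"key": "val"}},
    {"id": "id2", "text": "more text"},  # source and metadata may be omitted
]

with open("data.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

You can then run canopy upsert data.jsonl as shown above.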

3. Start the Canopy server

The Canopy server exposes Canopy's functionality via a REST API. Namely, it allows you to upload documents, retrieve relevant docs for a given query, and chat with your data. The server exposes a /chat.completion endpoint that can be easily integrated with any chat application. To start the server, run:

canopy start

Now you should see the standard Uvicorn startup message:

...

INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

That's it! You can now start using the Canopy server with any chat application that supports a /chat.completion endpoint.
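
As a quick smoke test from Python, you can post an OpenAI-style chat request to the running server. The route used below is an assumption based on the /context API base mentioned under "Advanced usage"; confirm the exact path in the Swagger UI at http://localhost:8000/docs.

# Hypothetical smoke test against a locally running Canopy server.
# The route is assumed to be /context/chat/completions; check the Swagger UI
# at http://localhost:8000/docs for the exact path on your version.
import requests

payload = {"messages": [{"role": "user", "content": "What is in my data?"}]}
resp = requests.post("http://localhost:8000/context/chat/completions", json=payload)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])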

📝 NOTE:

The canopy start command will keep the terminal occupied. If you want to run the server in the background, you can use nohup canopy start &, but this is not recommended.

Stopping the server

To stop the server, simply press CTRL+C in the terminal where you started it.

If you have started the server in the background, you can stop it by running:

canopy stop

Evaluation chat tool

Canopy's CLI comes with a built-in chat app that allows you to interactively chat with your text data and compare RAG vs. non-RAG workflows side-by-side to evaluate the results.

In a new terminal window, set the required environment variables then run:

canopy chat

This will open a chat interface in your terminal. You can ask questions and the RAG-infused chatbot will try to answer them using the data you uploaded.

To compare the chat responses with and without RAG, use the --no-rag flag:

canopy chat --no-rag

This will open a similar chat interface window, but will show both the RAG and non-RAG responses side-by-side.

Advanced usage

Migrating existing OpenAI application to Canopy

If you already have an application that uses the OpenAI API, you can migrate it to Canopy by simply changing the API endpoint to http://host:port/context as follows:

import openai

openai.api_base = "http://host:port/"

# now you can use the OpenAI API as usual

Or, without changing global state:

import openai

openai_response = openai.Completion.create(..., api_base="http://host:port/")

Running Canopy server in production

Canopy uses FastAPI as the web framework and Uvicorn as the ASGI server. It is recommended to use Gunicorn as the production server, mainly because it supports multiple worker processes and can handle multiple requests in parallel; more details can be found here.

To run the Canopy server in production, run:

gunicorn canopy_cli.app:app --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000 --workers <number of desired worker processes>

The server interacts with services like Pinecone and OpenAI using your own authentication credentials. When deploying the server on a public web hosting provider, it is recommended to enable an authentication mechanism so that your server only accepts requests from authenticated users.
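
One simple approach is to put the server behind a reverse proxy that enforces authentication, or to wrap the ASGI app yourself. The sketch below is illustrative only and not a Canopy feature: it reuses the canopy_cli.app:app import path from the Gunicorn command above and introduces a hypothetical CANOPY_SERVER_TOKEN environment variable holding a shared bearer token.

# auth_wrapper.py - illustrative only, not part of Canopy.
# Rejects HTTP requests that lack the expected bearer token, then delegates to
# the Canopy ASGI app. Run with:
#   gunicorn auth_wrapper:app --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
import os

from canopy_cli.app import app as canopy_app  # import path taken from the Gunicorn command above

# CANOPY_SERVER_TOKEN is a hypothetical variable you set yourself.
_EXPECTED = f"Bearer {os.environ['CANOPY_SERVER_TOKEN']}".encode()


async def app(scope, receive, send):
    if scope["type"] == "http":
        headers = dict(scope.get("headers", []))
        if headers.get(b"authorization") != _EXPECTED:
            await send({"type": "http.response.start", "status": 401,
                        "headers": [(b"content-type", b"text/plain")]})
            await send({"type": "http.response.body", "body": b"Unauthorized"})
            return
    await canopy_app(scope, receive, send)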
