Python client for colbertdb

Project description

Quickstart Guide for pycolbertdb

This quickstart guide shows how to use the pycolbertdb package to integrate ColbertDB with LlamaIndex, using OpenAI's GPT-4 Turbo model to answer questions over retrieved documents.

Prerequisites

Ensure you have the following installed and configured:

  • Python 3.x
  • An OpenAI API key
  • Environment variables configured for ColbertDB
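For example, a minimal .env file might look like the following. The variable names match those read later in this guide; the values here are placeholders, and the URL is only an assumption for a local deployment.

```shell
# .env -- placeholder values; replace with your own credentials.
# COLBERTDB_URL assumes a local instance; use your actual endpoint.
COLBERTDB_URL=http://localhost:8080
COLBERTDB_API_KEY=your-colbertdb-api-key
COLBERTDB_STORE_NAME=my-store
OPENAI_API_KEY=sk-your-openai-key
```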

Installation

  1. Install the necessary packages

    pip install pycolbertdb -U
    pip install llama-index
    pip install llama-index-readers-web
    pip install requests
    pip install python-dotenv
    

Code Example

Below is an example of how to use the pycolbertdb package to fetch, process, and query documents.

Import Dependencies

Start by importing the necessary dependencies.

import os
from dotenv import load_dotenv
from llama_index.readers.web import SimpleWebPageReader
from llama_index.core import Document, PromptTemplate
from llama_index.llms.openai import OpenAI

from pycolbertdb.client import Colbertdb
from pycolbertdb.models import CreateCollectionDocument
from pycolbertdb.helpers import from_llama_index_documents

Load Environment Variables

Load your environment variables from a .env file.

load_dotenv()
URL = os.getenv('COLBERTDB_URL')
API_KEY = os.getenv('COLBERTDB_API_KEY')
STORE_NAME = os.getenv('COLBERTDB_STORE_NAME')
OPEN_AI_KEY = os.getenv('OPENAI_API_KEY')

URLS = ['https://en.wikipedia.org/wiki/Onigiri']

Initialize Clients

Initialize the ColbertDB and OpenAI clients.

client = Colbertdb(url=URL, api_key=API_KEY, store_name=STORE_NAME)
open_ai_client = OpenAI(model="gpt-4-turbo", api_key=OPEN_AI_KEY)

qa_prompt_tmpl_str = """\
Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query.
Please write the answer in the style of {tone_name}
Query: {query_str}
Answer: \
"""

prompt_tmpl = PromptTemplate(qa_prompt_tmpl_str)
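PromptTemplate fills the named placeholders ({context_str}, {tone_name}, {query_str}) when formatted. The substitution can be illustrated with plain str.format, as a standalone sketch that needs no LlamaIndex install:

```python
# Plain-Python illustration of the placeholder substitution that
# PromptTemplate performs when formatted with keyword arguments.
template = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Please write the answer in the style of {tone_name}\n"
    "Query: {query_str}\n"
    "Answer: "
)

filled = template.format(
    context_str="Onigiri is a Japanese rice ball.",
    tone_name="shakespeare",
    query_str="What is onigiri?",
)
print(filled)
```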

Fetch and Process Documents

Fetch and process HTML content from the specified URLs.

docs = from_llama_index_documents(SimpleWebPageReader(html_to_text=True).load_data(URLS))
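If your documents do not come from LlamaIndex, you can build the same structure by hand. Each document is a dict with a content string and a metadata dict carrying a source, the same shape used with add_documents later in this guide; the make_document helper below is purely illustrative.

```python
def make_document(text, source):
    """Build a document dict in the shape the collection APIs accept."""
    return {"content": text, "metadata": {"source": source}}

manual_docs = [
    make_document(
        "Onigiri is a Japanese rice ball, often wrapped in nori.",
        "https://en.wikipedia.org/wiki/Onigiri",
    )
]
print(manual_docs[0]["metadata"]["source"])
```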

Create a Collection in ColbertDB

Create a new collection in ColbertDB with the processed documents.

collection = client.create_collection(documents=docs, name='rice_ball_facts', options={"force_create": True})

Search the Collection

Perform a search query on the created collection.

result = collection.search(query="What are some popular fillings for onigiri?", k=3)

Generate a Response Using OpenAI

Format the retrieved documents and generate a response using OpenAI.

context = ''
for document in result.documents:
    print("Source: " + document.metadata['source'] + "\n", document.content)
    context += (document.content + "\n\n")

prompt = prompt_tmpl.format(context_str=context, tone_name="shakespeare", query_str="What are some typical onigiri fillings")
response = open_ai_client.complete(prompt)
print(response)

Add New Documents to the Collection

Fetch additional documents and add them to the existing collection.

new_docs = SimpleWebPageReader(html_to_text=True).load_data(["https://en.wikipedia.org/wiki/Kewpie_(mayonnaise)"])
new_formatted = [{"content": doc.text, "metadata": {"source": doc.id_}} for doc in new_docs]

collection = collection.add_documents(documents=new_formatted)

Search the Updated Collection

Perform a new search query on the updated collection.

new_result = collection.search(query="When was kewpie mayo founded?", k=3)
new_context = ''
for document in new_result.documents:
    print("Source: " + document.metadata['source'] + "\n", document.content)
    new_context += (document.content + "\n\n")

prompt = prompt_tmpl.format(context_str=new_context, tone_name="bruce springsteen", query_str="When and where was kewpie mayo founded")
new_response = open_ai_client.complete(prompt)
print(new_response)

Conclusion

This guide covered the basics of using the pycolbertdb package to ingest, index, and query documents. Customize the prompts and collections as needed for your specific use case.

Project details


Download files

Download the file for your platform.

Source Distribution

pycolbertdb-0.2.8.tar.gz (6.1 kB)

Uploaded Source

Built Distribution


pycolbertdb-0.2.8-py3-none-any.whl (7.3 kB)

Uploaded Python 3

File details

Details for the file pycolbertdb-0.2.8.tar.gz.

File metadata

  • Download URL: pycolbertdb-0.2.8.tar.gz
  • Size: 6.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.1 Linux/6.5.0-1021-azure

File hashes

Hashes for pycolbertdb-0.2.8.tar.gz:

  • SHA256: 70788efef0b5338ec1bb820509349a3ff8fa2691889a163c7398c299d82146d0
  • MD5: 90d8ab811b111fd8d926b3829456f155
  • BLAKE2b-256: fac2c55931801e4a285c44665f36390ec8d8887dae90d018a5e55369ff4d5087


File details

Details for the file pycolbertdb-0.2.8-py3-none-any.whl.

File metadata

  • Download URL: pycolbertdb-0.2.8-py3-none-any.whl
  • Size: 7.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.1 Linux/6.5.0-1021-azure

File hashes

Hashes for pycolbertdb-0.2.8-py3-none-any.whl:

  • SHA256: 3a86d232d629b8b39a0004b3a3799cc54f889eedff38d1eea7175ef94bfe58a8
  • MD5: 49fc5dd6187d6f7c60c3556ac7ed3225
  • BLAKE2b-256: fa50400b4bf6a7d96fa3c6a0fbe63de751bba00a8065ed1171873048a3bbb805

