colivara-py

The official Python SDK for the ColiVara API. ColiVara is a document search and retrieval API that uses advanced machine learning techniques to index and search documents. This SDK allows you to interact with the API to create collections, upload documents, search for documents, and generate embeddings.

Installation

Install colivara-py using pip:

pip install colivara-py

Usage

Refer to the ColiVara API documentation for detailed guidance on how to use this library.

Requirements

  • You need access to the ColiVara API, which you can self-host (see ColiVara API repo) or use the hosted version at colivara.com.
  • Obtain an API key by signing up at ColiVara or from your self-hosted API.
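
Since every call needs a key, it can help to fail early when one is missing. The following is a minimal sketch, assuming the key is stored in the `COLIVARA_API_KEY` environment variable (the same variable the example code reads). `load_api_key` is a hypothetical helper for illustration and is not part of colivara-py.

```python
import os


def load_api_key(env_var: str = "COLIVARA_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message.

    Hypothetical helper for illustration; not part of colivara-py.
    """
    # Treat an unset variable and a whitespace-only value the same way.
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"{env_var} is not set; obtain a key from ColiVara or your self-hosted API."
        )
    return api_key
```

The returned string can then be passed as `api_key=` when constructing the client.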

Example Code

import os
from colivara_py import ColiVara

rag_client = ColiVara(
    api_key=os.environ.get("COLIVARA_API_KEY"),  # Default is `None`
    base_url="https://api.colivara.com"  # Default is `https://api.colivara.com`
)

# Create a new collection (optional)
new_collection = rag_client.create_collection(name="my_collection", metadata={"description": "A sample collection"})
print(f"Created collection: {new_collection.name}")

# Upload a document to the collection
document = rag_client.upsert_document(
    name="sample_document",
    collection_name="my_collection",  # Defaults to "default_collection"
    url="https://example.com/sample.pdf",
    metadata={"author": "John Doe"}
)
print(f"Uploaded document: {document.name}")

# Search for documents
search_results = rag_client.search(
    query="machine learning",
    collection_name="my_collection",
    top_k=3
)
for result in search_results.results:
    print(f"Page {result.page_number} of {result.document_name}: Score {result.normalized_score}")


# Search using images
image_search_results = rag_client.search_image(
    collection_name="my_collection",
    image_path="path/to/image.jpg",  # Alternatively, use image_base64="base64_encoded_string"
    top_k=3
)
for result in image_search_results.results:
    print(f"Page {result.page_number} of {result.document_name}: Score {result.normalized_score}")

# List documents in a collection
documents = rag_client.list_documents(collection_name="my_collection")
for doc in documents:
    print(f"Document: {doc.name}, Pages: {doc.num_pages}")

# Generate embeddings
embeddings = rag_client.create_embedding(
    input_data=["This is a sample text for embedding"],
    task="query"
)
print(f"Generated {len(embeddings.data)} embeddings")

# Delete a document
rag_client.delete_document("sample_document", collection_name="my_collection")
print("Document deleted")
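
The image search above mentions `image_base64` as an alternative to `image_path`. A minimal sketch of producing such a string with the standard library follows. `encode_image_base64` is a hypothetical helper, and it assumes the API expects plain base64 text without a `data:` URI prefix; check the ColiVara API documentation before relying on that.

```python
import base64
from pathlib import Path


def encode_image_base64(image_path: str) -> str:
    """Read an image file and return its bytes as a base64-encoded ASCII string.

    Hypothetical helper for illustration; not part of colivara-py.
    """
    raw_bytes = Path(image_path).read_bytes()
    return base64.b64encode(raw_bytes).decode("ascii")


# Possible usage with the client from the example above (assumed, not run here):
# rag_client.search_image(
#     collection_name="my_collection",
#     image_base64=encode_image_base64("path/to/image.jpg"),
#     top_k=3,
# )
```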

Development

Setting up the Development Environment

  1. After cloning the repository, navigate to the project directory:

    cd colivara-py
    
  2. Create a virtual environment:

    uv venv
    
  3. Activate the virtual environment:

    macOS/Linux:

    source .venv/bin/activate
    

    Windows:

    .venv\Scripts\activate
    
  4. Install the development dependencies:

    uv sync --extra dev-dependencies
    
  5. Run tests:

    pytest
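
For code that builds on the SDK, tests can avoid network calls by substituting a mock for the client. The sketch below is illustrative only and is not taken from this repository's test suite: `top_document_names` is a hypothetical helper, and the stand-in objects mirror just the `results`, `document_name`, `page_number`, and `normalized_score` attributes used in the example code above.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def top_document_names(client, query: str, collection_name: str) -> list[str]:
    """Hypothetical helper: return document names from a search, best score first."""
    response = client.search(query=query, collection_name=collection_name, top_k=3)
    ranked = sorted(response.results, key=lambda r: r.normalized_score, reverse=True)
    return [r.document_name for r in ranked]


def test_top_document_names_orders_by_score():
    # MagicMock stands in for the real client, so no API key or network is needed.
    fake_client = MagicMock()
    fake_client.search.return_value = SimpleNamespace(
        results=[
            SimpleNamespace(document_name="a", page_number=1, normalized_score=0.2),
            SimpleNamespace(document_name="b", page_number=4, normalized_score=0.9),
        ]
    )

    names = top_document_names(fake_client, "machine learning", "my_collection")

    assert names == ["b", "a"]
    fake_client.search.assert_called_once_with(
        query="machine learning", collection_name="my_collection", top_k=3
    )
```

Saved in a file such as `test_helpers.py` (name assumed), this is collected by `pytest` like any other test function.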
    

Regenerating the SDK

If the OpenAPI specification is updated, regenerate the SDK as follows:

  1. Install the OpenAPI generator (on macOS, use Homebrew):

    brew install openapi-generator
    
  2. Verify the installation:

    openapi-generator version
    
  3. Run the OpenAPI generator from the project directory:

    openapi-generator generate -i https://api.colivara.com/v1/openapi.json -g python -c config.yaml --ignore-file-override .openapi-generator-ignore --template-dir ./templates
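
When regenerating against a different spec location (for example a self-hosted instance), the command above can be assembled in a small script. This is a sketch under assumptions: `build_generator_command` and `run_generator` are hypothetical helpers, the flags are copied verbatim from the command above, and whether a self-hosted API serves its spec at the same path is not confirmed here.

```python
import subprocess

# Spec URL used in the command above.
SPEC_URL = "https://api.colivara.com/v1/openapi.json"


def build_generator_command(spec_url: str = SPEC_URL) -> list[str]:
    """Return the openapi-generator invocation as an argument list."""
    return [
        "openapi-generator", "generate",
        "-i", spec_url,
        "-g", "python",
        "-c", "config.yaml",
        "--ignore-file-override", ".openapi-generator-ignore",
        "--template-dir", "./templates",
    ]


def run_generator(spec_url: str = SPEC_URL) -> None:
    """Run the generator from the current directory; raises if it exits non-zero."""
    subprocess.run(build_generator_command(spec_url), check=True)
```

Passing the arguments as a list avoids shell quoting issues; call `run_generator()` from the project directory so that `config.yaml` and `./templates` resolve.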
    

Updating the SDK and Documentation

Follow these steps for major changes to the OpenAPI spec:

  1. Regenerate the SDK using the OpenAPI generator.
  2. Update the client interface in colivara_py/client.py, if needed.
  3. Modify tests in the tests directory to reflect the changes, if needed.
  4. Run tests to ensure functionality.

Building Documentation Locally

Generate and view the SDK documentation:

  1. To serve the documentation locally:

    pdocs server colivara_py
    
  2. To generate documentation as HTML:

    pdocs as_html colivara_py --overwrite
    
  3. To generate documentation as Markdown:

    pdocs as_markdown colivara_py
    

License

This SDK is licensed under the Apache License, Version 2.0. The ColiVara API is licensed under the Functional Source License, Version 1.1, Apache 2.0 Future License. See LICENSE.md for details.

For commercial licensing, contact us via tjmlabs.com. We’re happy to work with you to provide a license tailored to your needs.
