PostgresML Django integration that enables automatic embedding of specified fields.

postgresml-django

postgresml-django is a Python module that integrates PostgresML with the Django ORM, enabling automatic in-database embedding of Django model fields. It simplifies creating and searching vector embeddings for your text data.

Introduction

This module provides a seamless way to:

  • Automatically generate in-database embeddings for specified fields in your Django models
  • Perform vector similarity searches in-database
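To make the second point concrete, here is a minimal pure-Python sketch of what a vector similarity search computes. The module performs this step inside the database; the sketch below only illustrates the ranking idea. The tiny 3-dimensional vectors, the labels, the helper names (cosine_distance, rank_by_similarity), and the choice of cosine distance as the metric are all assumptions made for illustration, not details taken from postgresml-django.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, stored, limit=2):
    """Sort (label, vector) pairs by distance to query_vec and keep the closest."""
    ranked = sorted(stored, key=lambda item: cosine_distance(query_vec, item[1]))
    return [label for label, _ in ranked[:limit]]


# Made-up embeddings; real ones would have 384 or 1024 dimensions.
stored = [
    ("doc about cats", [0.9, 0.1, 0.0]),
    ("doc about dogs", [0.8, 0.3, 0.1]),
    ("doc about taxes", [0.0, 0.2, 0.9]),
]
query = [1.0, 0.0, 0.0]

top = rank_by_similarity(query, stored)
print(top)
```

Running the ranking in the database avoids pulling every stored vector into the application just to sort it.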

Installation

  1. Ensure that pgml is installed and configured in your database. The easiest way to do that is to sign up for a free serverless database at postgresml.org. You can also host it yourself.

  2. Install the package using pip:

    pip install postgresml-django
    

You are ready to go!

Usage Examples

Example 1: Using intfloat/e5-small-v2

This example demonstrates using the intfloat/e5-small-v2 transformer, which has an embedding size of 384.

from django.db import models
from postgresml_django import VectorField, Embed

class Document(Embed):
    text = models.TextField()
    text_embedding = VectorField(
        field_to_embed="text",
        dimensions=384,
        transformer="intfloat/e5-small-v2"
    )

# Searching
results = Document.vector_search("text_embedding", "some query to search against")

Example 2: Using mixedbread-ai/mxbai-embed-large-v1

This example shows how to use the mixedbread-ai/mxbai-embed-large-v1 transformer, which has an embedding size of 1024 and requires specific parameters for recall.

from django.db import models
from postgresml_django import VectorField, Embed

class Article(Embed):
    content = models.TextField()
    content_embedding = VectorField(
        field_to_embed="content",
        dimensions=1024,
        transformer="mixedbread-ai/mxbai-embed-large-v1",
        transformer_recall_parameters={
            "prompt": "Represent this sentence for searching relevant passages: "
        }
    )

# Searching
results = Article.vector_search("content_embedding", "some query to search against")

Note the differences between the two examples:

  1. The dimensions parameter is set to 384 for intfloat/e5-small-v2 and 1024 for mixedbread-ai/mxbai-embed-large-v1.
  2. The mixedbread-ai/mxbai-embed-large-v1 transformer requires additional parameters for recall, which are specified in the transformer_recall_parameters argument.
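Because the dimensions value has to agree with the size of the vectors the chosen transformer produces, a small sanity check can catch a mismatch early. The sketch below is hypothetical: KNOWN_DIMENSIONS and check_dimensions are names invented here, not part of postgresml-django, and the table holds only the two sizes stated in the examples above.

```python
# Embedding sizes quoted in the two examples; extend as needed.
KNOWN_DIMENSIONS = {
    "intfloat/e5-small-v2": 384,
    "mixedbread-ai/mxbai-embed-large-v1": 1024,
}


def check_dimensions(transformer, dimensions):
    """Return False only when a known transformer is paired with the wrong size."""
    expected = KNOWN_DIMENSIONS.get(transformer)
    if expected is None:
        # Unknown model: there is nothing to compare against.
        return True
    return expected == dimensions


print(check_dimensions("intfloat/e5-small-v2", 384))
print(check_dimensions("intfloat/e5-small-v2", 1024))
```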

Both examples will automatically generate embeddings when instances are saved and allow for vector similarity searches using the vector_search method.
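For readers curious about what in-database embedding looks like at the SQL level, the sketch below builds a parameterized call to PostgresML's pgml.embed(transformer, text, kwargs) function. It is only an illustration: build_embed_query is a hypothetical helper, the %s placeholders assume a psycopg-style driver, and whether postgresml-django issues this exact statement internally is an assumption rather than something this README states.

```python
import json


def build_embed_query(transformer, text, kwargs=None):
    """Return (sql, params) for a parameterized pgml.embed call."""
    sql = "SELECT pgml.embed(%s, %s, %s::jsonb) AS embedding"
    # The kwargs argument is sent as JSON text and cast to jsonb by Postgres.
    params = (transformer, text, json.dumps(kwargs or {}))
    return sql, params


# Recall-time call for the second example, with its prompt passed as kwargs.
sql, params = build_embed_query(
    "mixedbread-ai/mxbai-embed-large-v1",
    "some query to search against",
    {"prompt": "Represent this sentence for searching relevant passages: "},
)
print(sql)
print(params)
```

With a configured connection, the pair could be passed to a cursor's execute method, for example cursor.execute(sql, params).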

Contributing

We welcome contributions to postgresml-django! Whether it's bug reports, feature requests, documentation improvements, or code contributions, your input is valuable to us. Feel free to open issues or submit pull requests on our GitHub repository.
