PostgreSQL vector support for asyncpg

Adds PostgreSQL vector support for Python.

Registers the data types vector and halfvec from the PostgreSQL extension vector with the asynchronous PostgreSQL client asyncpg, and marshals vector data to and from PostgreSQL database tables.

Internally, the data is packed into a Python bytes object: single-precision float vectors take 4 bytes per item (class Vector) and half-precision float vectors take 2 bytes per item (class HalfVector). Data is packed and unpacked with struct from the standard library as necessary.
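As a rough illustration of the byte layout described above, the following sketch packs a list of floats with struct, using the format character f (4-byte single precision) and e (2-byte half precision). The helper names pack_floats and unpack_floats and the little-endian byte order are assumptions made for this example; they are not taken from the package.

```python
import struct


def pack_floats(values: list[float], half: bool = False) -> bytes:
    """Pack floats into bytes: 4 bytes per item, or 2 bytes per item if half=True."""
    code = "e" if half else "f"
    # "<" selects little-endian with standard sizes (an assumption for this sketch).
    return struct.pack(f"<{len(values)}{code}", *values)


def unpack_floats(data: bytes, half: bool = False) -> list[float]:
    """Reverse of pack_floats: recover the list of floats from the packed bytes."""
    code, size = ("e", 2) if half else ("f", 4)
    count = len(data) // size
    return list(struct.unpack(f"<{count}{code}", data))


single = pack_floats([1.0, 0.5, -2.0])
half = pack_floats([1.0, 0.5, -2.0], half=True)
print(len(single), len(half))  # 12 6
```

Half precision halves the storage per item at the cost of range and accuracy, so values that are not exactly representable in 16 bits come back slightly rounded.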

This module provides functionality similar to pgvector-python but keeps dependencies to a minimum (e.g. no dependency on numpy).

Setup

Install the package

pip install asyncpg_vector

Initialize

Register vector types with your database connection or connection pool:

Connection:

from asyncpg_vector import register_vector

async def main() -> None:
    ...

    await conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    await register_vector(conn)

Pool:

import asyncpg
from asyncpg_vector import register_vector

async def init_connection(conn: asyncpg.Connection) -> None:
    await conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    await register_vector(conn)

async def main() -> None:
    ...

    pool = await asyncpg.create_pool(..., init=init_connection)
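For intuition about what registering a type with asyncpg involves, here is a minimal sketch that uses asyncpg's set_type_codec hook with the text representation of a vector (for example [1.0,2.5]). This is not how asyncpg_vector works internally (the package packs binary data, as described earlier); the function names and the choice of the public schema are assumptions made for this example.

```python
def encode_vector_text(values: list[float]) -> str:
    """Render floats in the bracketed, comma-separated text form of a vector."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def decode_vector_text(text: str) -> list[float]:
    """Parse the bracketed text form back into a list of floats."""
    inner = text.strip()[1:-1]
    return [float(item) for item in inner.split(",")] if inner.strip() else []


async def register_text_codec(conn) -> None:
    # conn is expected to be an asyncpg.Connection. Assumes the vector
    # extension was created in the "public" schema.
    await conn.set_type_codec(
        "vector",
        schema="public",
        encoder=encode_vector_text,
        decoder=decode_vector_text,
        format="text",
    )
```

A text codec is easy to read and debug, while a binary codec avoids formatting and parsing every number, which matters for embeddings with hundreds or thousands of dimensions.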

Perform similarity search

First, create a table and an index:

import asyncpg

async def create(conn: asyncpg.Connection) -> None:
    await conn.execute("""
        CREATE TABLE IF NOT EXISTS items
        (
            id bigint NOT NULL GENERATED ALWAYS AS IDENTITY,
            content text NOT NULL,
            embedding halfvec(1536) NOT NULL,
            CONSTRAINT pk_items PRIMARY KEY (id)
        );

        CREATE INDEX IF NOT EXISTS embedding_index ON items
        USING hnsw (embedding halfvec_cosine_ops);
    """)

Next, find documents in a knowledge base that match a search phrase using vector similarity with approximate nearest neighbor semantics:

import asyncpg
from asyncpg_vector import HalfVector

async def search(conn: asyncpg.Connection, phrase: str) -> list[str]:
    ...

    embedding_response = await ai_client.embeddings.create(
        input=phrase,
        model="text-embedding-3-small",
        encoding_format="base64"
    )
    embedding = HalfVector.from_float_base64(embedding_response.data[0].embedding)
    query = """
        SELECT
            id,
            content,
            embedding <=> $1 AS distance
        FROM items
        ORDER BY distance
        LIMIT 5
    """
    rows = await conn.fetch(query, embedding)

    ...
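To make the base64 step in the search example more concrete, the sketch below decodes a base64 string into single-precision floats and repacks them as half-precision bytes. It assumes the payload is a sequence of little-endian 32-bit floats, which is the layout expected here for embeddings requested with encoding_format="base64"; the helper names are hypothetical, and this is not the implementation of HalfVector.from_float_base64.

```python
import base64
import struct


def decode_float32_base64(payload: str) -> list[float]:
    """Decode base64 text holding little-endian float32 values (assumed layout)."""
    raw = base64.b64decode(payload)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


def to_half_bytes(values: list[float]) -> bytes:
    """Repack floats as half precision, 2 bytes per item."""
    return struct.pack(f"<{len(values)}e", *values)


# Build a small sample payload locally instead of calling an embeddings API.
sample = base64.b64encode(struct.pack("<3f", 0.25, -1.0, 2.0)).decode("ascii")
values = decode_float32_base64(sample)
packed = to_half_bytes(values)
```

Requesting base64 instead of a JSON list of numbers avoids parsing each float from text, and the decoded bytes can be converted to the target precision in a single struct call.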
