PostgreSQL vector support for asyncpg

Adds PostgreSQL vector support for Python.

Registers the vector and halfvec data types from the PostgreSQL vector extension (pgvector) with the asynchronous PostgreSQL client asyncpg, and marshals vector data to and from PostgreSQL database tables.

Internally, the data is packed into a Python bytes object: single-precision float vectors use 4 bytes per element (class Vector) and half-precision float vectors use 2 bytes per element (class HalfVector). Data is packed and unpacked with struct from the standard library as necessary.
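
To make the storage sizes concrete, here is a minimal sketch of packing floats with struct. It is not the library's actual implementation: the helper names pack_floats and unpack_floats are hypothetical, and the little-endian byte order is an assumption made for illustration.

```python
import struct


def pack_floats(values: list[float], half: bool = False) -> bytes:
    """Pack floats into bytes: 4 bytes each ('f') or 2 bytes each ('e')."""
    code = "e" if half else "f"
    # "<" selects little-endian without padding; the byte order is an assumption.
    return struct.pack(f"<{len(values)}{code}", *values)


def unpack_floats(data: bytes, half: bool = False) -> list[float]:
    """Reverse of pack_floats: recover the list of floats from the bytes."""
    code, size = ("e", 2) if half else ("f", 4)
    count = len(data) // size
    return list(struct.unpack(f"<{count}{code}", data))


single = pack_floats([1.0, 0.5, -2.0])
halved = pack_floats([1.0, 0.5, -2.0], half=True)
print(len(single), len(halved))  # 12 6
print(unpack_floats(halved, half=True))  # [1.0, 0.5, -2.0]
```

Half precision halves the storage at the cost of range and accuracy. The values above survive the round trip only because they are exactly representable in 16 bits.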

This module provides functionality similar to pgvector-python but keeps dependencies to a minimum (e.g. it does not depend on numpy).

Setup

Install the package

pip install asyncpg_vector

Initialize

Register vector types with your database connection or connection pool:

Connection:

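from asyncpg_vector import register_vector

async def main() -> None:
    ...  # obtain `conn`, an open asyncpg connection

    # Make sure the extension exists, then register its types on this connection.
    await conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    await register_vector(conn)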

Pool:

import asyncpg
from asyncpg_vector import register_vector

# Called by the pool to initialize each connection when it is created.
async def init_connection(conn: asyncpg.Connection) -> None:
    await conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    await register_vector(conn)

async def main() -> None:
    ...

    pool = await asyncpg.create_pool(..., init=init_connection)

Perform similarity search

First, create a table and an index:

import asyncpg

async def create(conn: asyncpg.Connection) -> None:
    await conn.execute("""
        CREATE TABLE IF NOT EXISTS items
        (
            id bigint NOT NULL GENERATED ALWAYS AS IDENTITY,
            content text NOT NULL,
            embedding halfvec(1536) NOT NULL,
            CONSTRAINT pk_items PRIMARY KEY (id)
        );

        CREATE INDEX IF NOT EXISTS embedding_index ON items
        USING hnsw (embedding halfvec_cosine_ops);
    """)

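The halfvec_cosine_ops operator class indexes vectors for the cosine distance operator <=>, which is 1 minus the cosine similarity. As a rough illustration of what the database computes per pair of vectors, here is a pure-Python sketch. The function name cosine_distance is hypothetical, and zero-length vectors are not handled.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cos(angle between a and b); 0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Illustration only: an all-zero vector would divide by zero here.
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

The distance ranges from 0 (same direction) to 2 (opposite direction), so ordering by ascending distance returns the most similar rows first.
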
Next, find documents in a knowledge base that match a search phrase using vector similarity with approximate nearest neighbor semantics:

import asyncpg
from asyncpg_vector import HalfVector

async def search(conn: asyncpg.Connection, phrase: str) -> list[str]:
    ...  # `ai_client` is an embeddings API client created elsewhere

    embedding_response = await ai_client.embeddings.create(
        input=phrase,
        model="text-embedding-3-small",
        encoding_format="base64"
    )
    embedding = HalfVector.from_float_base64(embedding_response.data[0].embedding)
    query = """
        SELECT
            id,
            content,
            embedding <=> $1 AS distance
        FROM items
        ORDER BY distance
        LIMIT 5
    """
    rows = await conn.fetch(query, embedding)

    ...
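
For intuition about the conversion step, the following sketch shows one way a base64 embedding payload could be turned into half-precision bytes. It is not the implementation of HalfVector.from_float_base64: the function name float32_base64_to_half_bytes is hypothetical, and the payload is assumed to be a base64-encoded array of little-endian 32-bit floats.

```python
import base64
import struct


def float32_base64_to_half_bytes(payload: str) -> bytes:
    """Decode base64 float32 data and repack it as 2-byte half floats."""
    raw = base64.b64decode(payload)
    count = len(raw) // 4  # 4 bytes per single-precision element
    # Assumption: the payload holds little-endian float32 values.
    values = struct.unpack(f"<{count}f", raw)
    # 'e' is the IEEE 754 half-precision format: 2 bytes per element.
    return struct.pack(f"<{count}e", *values)


# Build a small sample payload instead of calling an embeddings API.
sample = base64.b64encode(struct.pack("<3f", 0.25, -1.0, 2.0)).decode("ascii")
packed = float32_base64_to_half_bytes(sample)
print(len(packed))  # 6
```

Note that struct raises OverflowError for values outside the half-precision range, so a real conversion may need to clamp or validate its input.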
