Skip to main content

SqlAlchemy Postgres tsquery functionality

Project description

SqlAlchemy Postgres Full Text Search

This provides the websearch_to_tsquery functionality that postgres provides out of the box, but it adds support for using * as a wildcard.

Usage

Saving a TSQuery

class Query(Base):
    __tablename__ = 'query'
    query = Column(TSQuery) # english query by default
    spanish_query = Column(TSQuery(language = 'spanish'))
    simple_query = Column(TSQuery(language = None)) # or 'simple'

Saving a TSVector

Note: there isn't much value in doing this as opposed to creating the indexes on a TEXT column unless the original text does not matter.

class Vector(Base):
    __tablename__ = 'vector'
    vector = Column(TSVector) # english by default
    spanish_vector = Column(TSVector(language = 'spanish'))
    simple_vector = Column(TSVector(language = None)) # or 'simple'

Filtering with to_tsquery

from sql_alchemy_fts import to_tsquery

vecs = session.
    query(Vector).
    filter(
        Vector.vector.op("@@")(to_tsquery('english', 'dinosaur & extinction')),
    ).all()

Note: vector is a TSVector. This will also work as text, but it will not give a chance to specify language (so will be 'simple' by default, unless postgres has been configured otherwise).

Filtering with websearch

from sql_alchemy_fts import to_tsquery, websearch
query = websearch('dinosaur "long time" -"jurassic park"')
vecs = session.
    query(Vector).
    filter(
        Vector.vector.op("@@")(to_tsquery('english', query),
    ).all()

Creating a TSVector index

from sql_alchemy_fts import to_tsvector

class Document(Base):
    __tablename__ = 'document'
    body = Column(Text)
    # track language so we can create partial indexes for language matches
    language = Column(Text)

    __tableargs__ = [Index("ix_document_body_english_gin_tsvector", to_tsvector(
        # Create english language index
        Index(
            "ix_document_body_english_gin_tsvec",
            text("to_tsvector('english', body)"),
            postgresql_using="gin",
            postgresql_where=text("language = 'english'"),
        ),
        # Create spanish language index
        Index(
            "ix_document_body_spanish_gin_tsvec",
            text("to_tsvector('spanish', body)"),
            postgresql_using="gin",
            postgresql_where=text("language = 'spanish'"),
        ),
    ]

# querying spanish docs:
session.
    query(Document).
    filter(
        # convert the text type to_tsvector to match the index
        to_tsvector('spanish', Document.body).
            op("@@")(to_tsquery("spanish", "dinosaurios & vivieron"))
    ).
    # required for postgres to match the index to be used
    filter(Document.language == "spanish").
    all()

Note: an index is likely going to be useful even if the TEXT was dropped and only a TSVector was saved on the table.

Websearch Syntax

This is inspired by the websearch_to_tsquery function that is defined in postgres, but it does not allow a "*" to wildcard after a prefix, despite that tsquery supports it. Therefore, to make this work, the entire websearch to tsquery has to be done within python.

  1. websearch("dinosaur stomp"): Dinosaur and stomp both must show up.
  2. websearch("dinosaur or stomp"): Either must show up.
  3. websearch("dinosaur "long time ago"): Dinosaur must show up and the phrase "long time ago" must show up.
  4. websearch("the of a an by dinosaur"): Dinosaur must show up and the other words are of no value so they are filtered (by postgres itself).
  5. websearch("super*"): Word with the prefix of super* is required such as superman, superb, or superior.
  6. websearch("-dinosaur") Anything without the word dinosaur.
  7. websearch("dinosuar -"jurassic park"): The phrase jurassic park must not be present, but dinosaur must show up.

The websearch class only renders out a tsquery text fragment. It is intended to be composed with to_tsquery.

Development

Setup

  1. Install asdf.
  2. Add python plugin: asdf plugin-add python.
    1. Add a $HOME/.default-python-packages containing poetry.
  3. Optionally, add postgres plugin: `asdf plugin-add postgres.
  4. asdf install to install python and postgres.

Tests

Run poetry run tox or poetry run tox --asdf-install if some versions of python are missing.

Note: make sure postgres is running. If installed through asdf, it needs to be started with pg_ctl.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sqlalchemy-pg-fts-0.1.3.tar.gz (5.8 kB view details)

Uploaded Source

Built Distribution

sqlalchemy_pg_fts-0.1.3-py3-none-any.whl (6.3 kB view details)

Uploaded Python 3

File details

Details for the file sqlalchemy-pg-fts-0.1.3.tar.gz.

File metadata

  • Download URL: sqlalchemy-pg-fts-0.1.3.tar.gz
  • Upload date:
  • Size: 5.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.9.12 Linux/5.13.0-39-generic

File hashes

Hashes for sqlalchemy-pg-fts-0.1.3.tar.gz
Algorithm Hash digest
SHA256 7683ff5b19c93c4cf3a013ea18b139e208ae1acac37c4198e772739885916037
MD5 cd65131b572bd3e75ebdc8de2f9d41af
BLAKE2b-256 fcb360934e27b86207406ef82da38103f411e51bd3fe953173312e66a081a389

See more details on using hashes here.

File details

Details for the file sqlalchemy_pg_fts-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: sqlalchemy_pg_fts-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 6.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.9.12 Linux/5.13.0-39-generic

File hashes

Hashes for sqlalchemy_pg_fts-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 461f34d7eb54278d39b1ec63be838813aafe56d2bddca1dd896c63804476f221
MD5 97a5d6516618cbc9c0ccd30a5dee5e33
BLAKE2b-256 5ba1c57b51c7fbd1c42dbdb2c506c7f82d679f27ec21dd77218b0a6eae408c4b

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page