Skip to main content

SqlAlchemy Postgres tsquery functionality

Project description

SqlAlchemy Postgres Full Text Search

This provides the websearch_to_tsquery functionality that postgres provides out of the box, but it adds support for using * as a wildcard.

Usage

Saving a TSQuery

class Query(Base):
    __tablename__ = 'query'
    query = Column(TSQuery) # english query by default
    spanish_query = Column(TSQuery(language = 'spanish'))
    simple_query = Column(TSQuery(language = None)) # or 'simple'

Saving a TSVector

Note: there isn't much value in doing this as opposed to creating the indexes on a TEXT column unless the original text does not matter.

class Vector(Base):
    __tablename__ = 'vector'
    vector = Column(TSVector) # english by default
    spanish_vector = Column(TSVector(language = 'spanish'))
    simple_vector = Column(TSVector(language = None)) # or 'simple'

Filtering with to_tsquery

from sql_alchemy_fts import to_tsquery

vecs = session.
    query(Vector).
    filter(
        Vector.vector.op("@@")(to_tsquery('english', 'dinosaur & extinction')),
    ).all()

Note: vector is a TSVector. This will also work as text, but it will not give a chance to specify language (so will be 'simple' by default, unless postgres has been configured otherwise).

Filtering with websearch

from sql_alchemy_fts import to_tsquery, websearch
query = websearch('dinosaur "long time" -"jurassic park"')
vecs = session.
    query(Vector).
    filter(
        Vector.vector.op("@@")(to_tsquery('english', query),
    ).all()

Creating a TSVector index

from sql_alchemy_fts import to_tsvector

class Document(Base):
    __tablename__ = 'document'
    body = Column(Text)
    # track language so we can create partial indexes for language matches
    language = Column(Text)

    __tableargs__ = [Index("ix_document_body_english_gin_tsvector", to_tsvector(
        # Create english language index
        Index(
            "ix_document_body_english_gin_tsvec",
            text("to_tsvector('english', body)"),
            postgresql_using="gin",
            postgresql_where=text("language = 'english'"),
        ),
        # Create spanish language index
        Index(
            "ix_document_body_spanish_gin_tsvec",
            text("to_tsvector('spanish', body)"),
            postgresql_using="gin",
            postgresql_where=text("language = 'spanish'"),
        ),
    ]

# querying spanish docs:
session.
    query(Document).
    filter(
        # convert the text type to_tsvector to match the index
        to_tsvector('spanish', Document.body).
            op("@@")(to_tsquery("spanish", "dinosaurios & vivieron"))
    ).
    # required for postgres to match the index to be used
    filter(Document.language == "spanish").
    all()

Note: an index is likely going to be useful even if the TEXT was dropped and only a TSVector was saved on the table.

Websearch Syntax

This is inspired by the websearch_to_tsquery function that is defined in postgres, but it does not allow a "*" to wildcard after a prefix, despite that tsquery supports it. Therefore, to make this work, the entire websearch to tsquery has to be done within python.

  1. websearch("dinosaur stomp"): Dinosaur and stomp both must show up.
  2. websearch("dinosaur or stomp"): Either must show up.
  3. websearch("dinosaur "long time ago"): Dinosaur must show up and the phrase "long time ago" must show up.
  4. websearch("the of a an by dinosaur"): Dinosaur must show up and the other words are of no value so they are filtered (by postgres itself).
  5. websearch("super*"): Word with the prefix of super* is required such as superman, superb, or superior.
  6. websearch("-dinosaur") Anything without the word dinosaur.
  7. websearch("dinosuar -"jurassic park"): The phrase jurassic park must not be present, but dinosaur must show up.

The websearch class only renders out a tsquery text fragment. It is intended to be composed with to_tsquery.

Development

Setup

  1. Install asdf.
  2. Add python plugin: asdf plugin-add python.
    1. Add a $HOME/.default-python-packages containing poetry.
  3. Optionally, add postgres plugin: `asdf plugin-add postgres.
  4. asdf install to install python and postgres.

Tests

Run poetry run tox or poetry run tox --asdf-install if some versions of python are missing.

Note: make sure postgres is running. If installed through asdf, it needs to be started with pg_ctl.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sqlalchemy-pg-fts-0.1.3.tar.gz (5.8 kB view hashes)

Uploaded Source

Built Distribution

sqlalchemy_pg_fts-0.1.3-py3-none-any.whl (6.3 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page