SqlAlchemy Postgres tsquery functionality
Project description
SqlAlchemy Postgres Full Text Search
This provides the websearch_to_tsquery
functionality that postgres
provides out of the box, but it adds support for using *
as a wildcard.
Usage
Saving a TSQuery
class Query(Base):
__tablename__ = 'query'
query = Column(TSQuery) # english query by default
spanish_query = Column(TSQuery(language = 'spanish'))
simple_query = Column(TSQuery(language = None)) # or 'simple'
Saving a TSVector
Note: there isn't much value in doing this as opposed to creating the
indexes on a TEXT
column unless the original text does not matter.
class Vector(Base):
__tablename__ = 'vector'
vector = Column(TSVector) # english by default
spanish_vector = Column(TSVector(language = 'spanish'))
simple_vector = Column(TSVector(language = None)) # or 'simple'
Filtering with to_tsquery
from sql_alchemy_fts import to_tsquery
vecs = session.
query(Vector).
filter(
Vector.vector.op("@@")(to_tsquery('english', 'dinosaur & extinction')),
).all()
Note: vector
is a TSVector
. This will also work as text, but it will
not give a chance to specify language (so will be 'simple'
by default,
unless postgres has been configured otherwise).
Filtering with websearch
from sql_alchemy_fts import to_tsquery, websearch
query = websearch('dinosaur "long time" -"jurassic park"')
vecs = session.
query(Vector).
filter(
Vector.vector.op("@@")(to_tsquery('english', query),
).all()
Creating a TSVector index
from sql_alchemy_fts import to_tsvector
class Document(Base):
__tablename__ = 'document'
body = Column(Text)
# track language so we can create partial indexes for language matches
language = Column(Text)
__tableargs__ = [Index("ix_document_body_english_gin_tsvector", to_tsvector(
# Create english language index
Index(
"ix_document_body_english_gin_tsvec",
text("to_tsvector('english', body)"),
postgresql_using="gin",
postgresql_where=text("language = 'english'"),
),
# Create spanish language index
Index(
"ix_document_body_spanish_gin_tsvec",
text("to_tsvector('spanish', body)"),
postgresql_using="gin",
postgresql_where=text("language = 'spanish'"),
),
]
# querying spanish docs:
session.
query(Document).
filter(
# convert the text type to_tsvector to match the index
to_tsvector('spanish', Document.body).
op("@@")(to_tsquery("spanish", "dinosaurios & vivieron"))
).
# required for postgres to match the index to be used
filter(Document.language == "spanish").
all()
Note: an index is likely going to be useful even if the TEXT
was dropped
and only a TSVector
was saved on the table.
Websearch Syntax
This is inspired by the websearch_to_tsquery
function that is defined in
postgres, but it does not allow a "*"
to wildcard after a prefix,
despite that tsquery
supports it. Therefore, to make this work, the
entire websearch to tsquery has to be done within python.
websearch("dinosaur stomp")
: Dinosaur and stomp both must show up.websearch("dinosaur or stomp")
: Either must show up.websearch("dinosaur "long time ago")
: Dinosaur must show up and the phrase "long time ago" must show up.websearch("the of a an by dinosaur")
: Dinosaur must show up and the other words are of no value so they are filtered (by postgres itself).websearch("super*")
: Word with the prefix of super* is required such as superman, superb, or superior.websearch("-dinosaur")
Anything without the word dinosaur.websearch("dinosuar -"jurassic park")
: The phrase jurassic park must not be present, but dinosaur must show up.
The websearch
class only renders out a tsquery text fragment. It is
intended to be composed with to_tsquery
.
Development
Setup
- Install asdf.
- Add python plugin:
asdf plugin-add python
.- Add a
$HOME/.default-python-packages
containingpoetry
.
- Add a
- Optionally, add postgres plugin: `asdf plugin-add postgres.
asdf install
to install python and postgres.
Tests
Run poetry run tox
or poetry run tox --asdf-install
if some versions of python are missing.
Note: make sure postgres is running. If installed through asdf, it needs to be started with pg_ctl
.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for sqlalchemy_pg_fts-0.1.3-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 461f34d7eb54278d39b1ec63be838813aafe56d2bddca1dd896c63804476f221 |
|
MD5 | 97a5d6516618cbc9c0ccd30a5dee5e33 |
|
BLAKE2b-256 | 5ba1c57b51c7fbd1c42dbdb2c506c7f82d679f27ec21dd77218b0a6eae408c4b |