Skip to main content

Apache Solr Dialect for SQLAlchemy

Project description

linting: pylint pre-commit

Apache Solr dialect for SQLAlchemy

A SQLAlchemy dialect for Apache Solr.

Requirements

The dialect is compatible with Solr version 6.0 and higher.

Installation

The package can either be installed through PyPi or from the source code.

Through Python Package Index

pip install sqlalchemy-solr

Latest from Source Code

pip install git+https://github.com/aadel/sqlalchemy-solr

Usage

To connect to Solr with SQLAlchemy, the following URL pattern can be used:

solr://<username>:<password>@<host>:<port>/solr/<collection>[?parameter=value]

Note: port 8983 is used when port in the URL is omitted

Authentication

Basic Authentication

Basic authentication credentials can be supplied in connection URL as shown in the URL pattern above

JWT Authentication

JWT authentication token can be supplied as a token URL parameter, for example:

solr://<host>:<port>/solr/<collection>?token=<token_value>

Additional Parameters:

If HTTPS is enabled, the following parameters can be supplied:

  1. use_ssl: a boolean parameter when set to true an HTTPS connection is used. Default value is false.
  2. verify_ssl: a boolean parameter that controls whether to verify SSL certificate. Default value is true.

Aliases

Aliases are supported as tables where columns are the union of all the underlying collections fields. For example, if an alias collection_alias has linked collection members, the following query is valid:

SELECT f1, f2, f3 FROM collection_alias

where f1, f2, and f3 are defined in the linked collections.

Time Filters

Time filtration predicates are transformed to Solr syntax when ORM mode is used and Solr datasource release is lower than 9. Time filters transformations are handled internally by Solr 9 and above without the driver intervention. Open- and close-ended date ranges are supported.

Multi-valued Fields

Multi-value fields are mapped to SQL arrays of a specific scalar type. For example:

phones = Column('PHONE_ss', ARRAY(String))

SQL Compilation Caching

The dialect supports caching by leveraging SQLAlchemy SQL compilation caching capabilities, which include query caching.

Schema

If the ORM query supplied explicitly refers to a schema, the schema would be filtered out before query execution.

Basic Example

The following example illustrates the basic usage in a Python project:

engine = create_engine('solr://solr:8983/solr/examples_books')

with engine.connect() as connection:
    result = connection.execute(text("SELECT sequence_i, genre_s FROM examples_books"))
    for row in result:
        print("Sequence: {}, Genre: {}".format(row['sequence_i'], row['genre_s']))

ORM Example

Base = declarative_base()

class Book(Base):
    __tablename__ = "books"

    id = Column('index_i', Integer, primary_key=True)
    name = Column('book_name_t', String)
    publishing_year = Column('publishing_year_i', Integer)

    def __repr__(self):
        return f"Book(id={self.id!r}, name={self.name!r}, publishing_year={self.publishing_year!r})"

engine = create_engine('solr://solr:8983/solr/books')

with Session(engine) as session:
    stmt = select(Book).where(Book.publishing_year.in_([2014, 2015])).limit(10)

    for book in session.scalars(stmt):
        print(book)

where index_i, book_name_t, and publishing_year_i are fields of books collection.

Time Filters Example

Base = declarative_base()

class SalesHistory(Base):
    __tablename__ = "sales"

    order_number = Column('order_number_i', Integer, primary_key=True)
    price_each = Column('price_each_f', Float)
    status = Column('status_s', String)
    order_date = Column('order_date_dt', Integer)

    def __repr__(self):
        return f"SalesHistory(order number={self.order_number!r}, status={self.status!r}, price={self.price_each}, order date={self.order_date!r})"

engine = create_engine('solr://solr:8983/solr/sales')

with Session(engine) as session:
    stmt = select(SalesHistory) \
        .where(and_(SalesHistory.order_date >= "2024-01-01 00:00:00", SalesHistory.order_date < "2024-02-01 00:00:00")) \
        .order_by(SalesHistory.price_each.asc()) \
        .limit(10)

    for row in session.scalars(stmt):
        print(row)

where order_number_i, price_each_f, status_s, and order_date_dt fields of sales collection.

In the above example, date predicates are transformed to [2024-01-01T00:00:00Z TO 2024-02-01T00:00:00Z}.

Open-ended date ranges are supported. For example:

.where(SalesHistory.order_date >= "2024-01-01 00:00:00")

translates to [2024-01-01T00:00:00Z TO *]

Operator Translation

ILIKE and NOT ILIKE operators are translated to LIKE amd NOT LIKE operators which are case in-sensetive in Solr.

Miscellaneous

SQL statement semi-colon is removed if present.

Compatibility

Feature 6.0 6.5 6.6 7.x 8.x 9.x
Aliases
Built-in date range compilation
SELECT expression statements
SQL compilation caching

Use Cases

Apache Superset

To connect Apache Superset with Solr datasource, add the package to the requirements then create a database connection using the URL pattern shown above.

Testing with Apache Superset

Requirements

  • A Solr instance with a Parallel SQL supported up and running
  • A Superset instance up and running with this package installed
  • pytest >= 7.4.4 installed on the testing machine

Procedure

  1. Change conftest.py as appropriate
  2. Run pytest

Resources

  1. SQLAlchemy documentation
  2. SQLAlchemy dialects
  3. PEP 249 – Python Database API Specification v2.0

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sqlalchemy_solr-0.2.4.3.tar.gz (23.3 kB view details)

Uploaded Source

Built Distribution

sqlalchemy_solr-0.2.4.3-py3-none-any.whl (22.3 kB view details)

Uploaded Python 3

File details

Details for the file sqlalchemy_solr-0.2.4.3.tar.gz.

File metadata

  • Download URL: sqlalchemy_solr-0.2.4.3.tar.gz
  • Upload date:
  • Size: 23.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.0 CPython/3.12.5

File hashes

Hashes for sqlalchemy_solr-0.2.4.3.tar.gz
Algorithm Hash digest
SHA256 409409f4a4c91cd3a7b5d0cf407d29cb75a0994894ab2467031bdc34fb5cddc8
MD5 cc82a4146948a7790087ec1632d6d6f0
BLAKE2b-256 39000bc20f647ae1b608d4ad309e707b8b18bc179fc43cbf6a031645effd0602

See more details on using hashes here.

File details

Details for the file sqlalchemy_solr-0.2.4.3-py3-none-any.whl.

File metadata

File hashes

Hashes for sqlalchemy_solr-0.2.4.3-py3-none-any.whl
Algorithm Hash digest
SHA256 355a7884a8cb32a77e6e4f2b03584f1e5db1fcb37c88af93989d9987cf3fc46f
MD5 1514a71be1c183071a8701255f8b936f
BLAKE2b-256 fe4242e61dbe76fe7462fcd1e8aab1d68f66352369c46faf268fcb9bea50ecb6

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page