Skip to main content

offset-free paging for sqlalchemy

Project description

Notice: In accordance with Python 2’s rapidly-approaching end-of-life date, we’ve stopped supporting Python 2. If you really need it, the latest version to support Python 2 is 0.1.1559103842, but you’ll miss out on all the latest features and bugfixes from the latest version. You should be upgrading anyway!

https://circleci.com/gh/djrobstep/sqlakeyset.svg?style=svg

This library implements keyset-based paging for SQLAlchemy (both ORM and core).

This library has been tested with PostgreSQL and MariaDB/MySQL. It should work with other SQLAlchemy-supported databases too provided they support row( syntax (see below).

Background

A lot of people use SQL’s OFFSET syntax to implement paging of query results. The trouble with that is, the more pages you get through, the slower your query gets. Also, if the results you’re paging through change frequently, it’s possible to skip over or repeat results between pages. Keyset paging avoids these problems: Selecting even the millionth page is as fast as selecting the first.

Getting Started

Here’s how it works with a typical ORM query:

from sqlakeyset import get_page
from sqlbag import S

from models import Book

with S('postgresql:///books') as s:  # create a session
    q = s.query(Book).order_by(Book.author, Book.title, Book.id)  #

    # gets the first page
    page1 = get_page(q, per_page=20)

    # gets the key for the next page
    next_page = page1.paging.next

    # gets the second page
    page2 = get_page(q, per_page=20, page=next_page)

    # returning to the first page, getting the key
    previous_page = page2.paging.previous

    # the first page again, backwards from the previous page
    page1 = get_page(q, per_page=20, page=previous_page)

    # what if new items were added at the start?
    if page1.paging.has_previous:

        # go back even further
        previous_page = page1.paging.previous
        page1 = get_page(q, per_page=20, page=previous_page)

Under the Hood

sqlakeyset does the following to your query in order to get the paged contents:

  • adds a where clause, to get only rows after the specified row key.

  • if getting the previous page, reverses the order by direction in order the get the rows before the specified bookmark.

  • adds a limit clause, to fetch only enough items to fill the page, plus one additional (this additional row is used only to test for the existence of further pages after the current one, and is discarded from the results).

  • returns the page contents as an ordinary list that has an attached .paging attribute with the paging information for this and related pages.

Page objects

Paged items/rows are returned in a Page object, which is a vanilla python list, except with an attached Paging object with the paging information.

Properties such as next and previous return a 2-tuple containing the ordering key for the row, and a boolean to specify if the direction is forwards or backwards.

In our above example, the 2-tuple specifying the second page might look like:

('Joseph Heller', 'Catch 22', 123), False

The False means the query will fetch the page after the row containing Catch 22. This tuple contains two elements, title and id, to match the order by clause of the query.

The page before this row would be specified as:

('Joseph Heller', 'Catch 22', 123), True

The first and last pages are fetched with None instead of a tuple, so for the first page (this is also the default if the page parameter is not specified):

None, False

And the last page:

None, True

Keyset Serialization

You will probably want to turn these keysets/bookmarks into strings for passing around. sqlakeyset includes code to do this. To get a serialized bookmark, just add bookmark_ to the name of the property that holds the keyset you want.

Most commonly you’ll want next and previous, so:

>>> page.paging.bookmark_previous
<i:1~i:2015~s:Bad Blood~i:34
>>> page.paging.bookmark_next
>i:1~i:2014~s:Shake It Off~i:31

sqlakeyset uses the python csv row serializer to serialize the bookmark values (using ~ instead of a , as the separator). Direction is indicated by > (forwards/next), or < (backwards/previous) at the start of the string.

Limitations

  • Golden Rule: Always ensure your keysets are unique per row. If you violate this condition you risk skipped rows and other nasty problems. The simplest way to do this is to always include your primary key column(s) at the end of your ordering columns.

  • Any rows containing null values in their keysets will be omitted from the results, so your ordering columns should be NOT NULL. (This is a consequence of the fact that comparisons against NULL are always false in SQL.) This may change in the future if we work out an alternative implementation; but for now we recommend using coalesce as a workaround:

from sqlakeyset import get_page
from sqlalchemy import func
from sqlbag import S
from models import Book
with S('postgresql:///books') as s:
    # If Book.cost can be NULL:
    q = s.query(Book).order_by(func.coalesce(Book.cost, 0), Book.id)
    # page1 will start with books where cost is null:
    page1 = get_page(q, per_page=20)
  • If you’re using the in-built keyset serialization, this only handles basic data/column types so far (strings, ints, floats, datetimes, dates, booleans, and a few others). The serialization can be extended to serialize more advanced types as necessary (documentation on this is forthcoming).

Documentation

sqlakeyset is in early alpha and documentation other than this README is scarce so far. We are working on remedying this. Watch this space.

Installation

Assuming you have pip installed, all you need to do is install as follows:

$ pip install sqlakeyset

This will install sqlakeyset and also sqlalchemy if not already installed. Obviously you’ll need the necessary database driver for your chosen database to be installed also.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sqlakeyset-1.0.1616299134.tar.gz (18.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sqlakeyset-1.0.1616299134-py3-none-any.whl (18.4 kB view details)

Uploaded Python 3

File details

Details for the file sqlakeyset-1.0.1616299134.tar.gz.

File metadata

  • Download URL: sqlakeyset-1.0.1616299134.tar.gz
  • Upload date:
  • Size: 18.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.0.5 CPython/3.8.2 Linux/4.15.0-1092-aws

File hashes

Hashes for sqlakeyset-1.0.1616299134.tar.gz
Algorithm Hash digest
SHA256 675f1247b6cd604df6a3363d08555df1fa49b8fe5774aa3bb839b7e614ba4a1b
MD5 2e72e33b3e6f4ed5d9c14230a20498f2
BLAKE2b-256 796d9da38b8e615b8cdd8d8f9899f9afb8f0fbe7107ef74c67f63c0a32a599de

See more details on using hashes here.

File details

Details for the file sqlakeyset-1.0.1616299134-py3-none-any.whl.

File metadata

  • Download URL: sqlakeyset-1.0.1616299134-py3-none-any.whl
  • Upload date:
  • Size: 18.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.0.5 CPython/3.8.2 Linux/4.15.0-1092-aws

File hashes

Hashes for sqlakeyset-1.0.1616299134-py3-none-any.whl
Algorithm Hash digest
SHA256 b568611a10ee28a4683193db820cb31e0a89fd7126b594355adb1d6ccafb902c
MD5 a08b57b36a42b74cde0fcaa56ef8cd9b
BLAKE2b-256 f624b250002adfb09d84f4b63701957701386996a7021cc2f2174967d2fceb91

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page