Python function to extract all the rows from a SQLite database, without loading the entire file into memory or disk

stream-sqlite

Python function to extract all the rows from a SQLite database file while iterating over its bytes. Typically used to extract rows while the file is still downloading, without loading the entire file into memory or onto disk.

Usage

from stream_sqlite import stream_sqlite
import httpx

def sqlite_bytes():
    # Iterable that yields the bytes of a sqlite file
    with httpx.stream('GET', 'https://www.example.com/my.sqlite') as r:
        yield from r.iter_bytes(chunk_size=65536)

# A table is not guaranteed to be contiguous in a sqlite file, so the same
# table can appear multiple times while iterating
for table_name, table_info, rows in stream_sqlite(sqlite_bytes()):
    for row in rows:
        print(row)
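
Because the same table can be yielded more than once, per-table results have to be accumulated across iterations. The following is a minimal sketch of that pattern. The helper name count_rows_by_table is hypothetical and not part of stream_sqlite; it only assumes an iterable of (table_name, table_info, rows) tuples, as in the loop above.

```python
from collections import Counter


def count_rows_by_table(tables):
    """Count rows per table name from (table_name, table_info, rows) tuples.

    Hypothetical helper, not part of stream_sqlite. A table name may occur
    in several tuples, so counts are summed per name.
    """
    counts = Counter()
    for table_name, _table_info, rows in tables:
        # Consume the rows of this chunk before moving on to the next tuple
        counts[table_name] += sum(1 for _ in rows)
    return counts
```

It could be called as count_rows_by_table(stream_sqlite(sqlite_bytes())). Only the counters are kept, so memory use does not grow with the number of rows.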

Limitations and recommendations

The SQLite file format is not designed to be streamed: the data is arranged in pages of a fixed number of bytes, and the information to identify a page may come after the page in the stream. Therefore, pages are buffered in memory by the stream_sqlite function until they can be identified.
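
As a rough illustration of the fixed-size pages described above: the SQLite file format stores the page size as a 2-byte big-endian integer at byte offset 16 of the 100-byte file header, with the value 1 standing for 65536. The helper name sqlite_page_size is hypothetical and not part of stream_sqlite. It only sketches how a streaming reader could learn the page size from the first bytes it receives, and is not a description of how stream_sqlite itself is implemented.

```python
import struct

SQLITE_MAGIC = b"SQLite format 3\x00"


def sqlite_page_size(header):
    """Return the page size in bytes from the first 100 bytes of a SQLite file.

    Hypothetical helper for illustration only.
    """
    if len(header) < 100 or header[:16] != SQLITE_MAGIC:
        raise ValueError("not the start of a SQLite 3 file")
    # Page size is a big-endian unsigned 16-bit integer at offset 16
    (raw,) = struct.unpack(">H", header[16:18])
    # The stored value 1 is a special case meaning 65536 bytes
    return 65536 if raw == 1 else raw
```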

However, if you have control over the SQLite file, VACUUM; should be run on it before streaming. In addition to minimising the size of the file, VACUUM; arranges the pages in a way that often reduces the buffering required when streaming. This is especially true if the file was the target of intermingled INSERTs and/or DELETEs over multiple tables.

Also, indexes are not used for extracting the rows while streaming. If streaming is the only use case of the SQLite file, and you have control over it, remove the indexes and then run VACUUM;.
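
The recommendations above can be sketched with Python's standard sqlite3 module. This is a minimal sketch under stated assumptions: the function name prepare_for_streaming is hypothetical, and it assumes the file is used only for streaming, since it drops every explicitly created index. Indexes that SQLite creates automatically for PRIMARY KEY or UNIQUE constraints cannot be dropped, so they are left in place.

```python
import sqlite3


def prepare_for_streaming(path):
    """Drop explicitly created indexes, then VACUUM the SQLite file at path.

    Hypothetical helper, not part of stream_sqlite. Returns the names of the
    indexes that were dropped.
    """
    # isolation_level=None selects autocommit mode: VACUUM cannot run
    # inside a transaction
    con = sqlite3.connect(path, isolation_level=None)
    try:
        # Automatically created indexes have a NULL sql column and are skipped
        index_names = [
            name
            for (name,) in con.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'index' AND sql IS NOT NULL"
            )
        ]
        for name in index_names:
            # Quote the identifier, doubling any embedded double quotes
            con.execute('DROP INDEX "{}"'.format(name.replace('"', '""')))
        con.execute("VACUUM")
    finally:
        con.close()
    return index_names
```

Running this on a copy of the file, rather than the original, keeps the indexes available to any other consumers.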
