Useful wrapper for SQLite
brunodb
Brunodb is a lightweight but useful Python interface for SQLite and Postgres. It is tailored to data science workflows, which are basically high-throughput streaming computation patterns (rather than transactional patterns).
The idea is to use databases instead of files and to do most of your work in pure Python in a streaming fashion, rather than using batch libraries like pandas and other data frame libraries. Databases allow for operations like joins, ordering and simple aggregations without having to put everything in memory.
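As a rough illustration of that pattern, the sketch below lets the database do a join, an aggregation and an ordering, then consumes the result one row at a time. It uses only the standard library sqlite3 module, not brunodb's own API, and the table names, column names and the `stream_totals` helper are made up for the example.

```python
import sqlite3

# Plain sqlite3 from the standard library; this is NOT brunodb's API.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, country TEXT)")
conn.execute("CREATE TABLE orders (user_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "US"), (2, "FR"), (3, "US")])
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 10.0), (1, 5.0), (2, 7.5), (3, 2.5)])

# The join, aggregation and ordering all happen inside the database.
QUERY = """
    SELECT u.country, SUM(o.amount) AS total
    FROM orders AS o
    JOIN users AS u ON u.user_id = o.user_id
    GROUP BY u.country
    ORDER BY total DESC
"""


def stream_totals(connection):
    """Yield one (country, total) row at a time from the cursor."""
    for row in connection.execute(QUERY):
        yield row


# list() is used only because the demo result is tiny; a real pipeline
# would keep iterating over the generator instead.
totals = list(stream_totals(conn))
print(totals)
```

Python only ever holds the current result row, so the same loop works whether the tables have four rows or many millions.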
This library is part of a strategy to enable very productive proofs of concept on local resources (your laptop) that can migrate naturally and painlessly into production applications without extensive rewrites. Brunodb can be an efficient solution by itself for moderate data sizes, and streaming pipelines can be ported fairly easily to Spark or another distributed compute system.
Brunodb frees you from some of the lower-level details of dealing with these Python database clients. It gives you an easy and natural way to define a schema and load data from either files or streams. It gives you some shortcuts for doing queries while also allowing you full SQL functionality when you need it. It makes working on either SQLite or Postgres the same. And it allows for very fast bulk loads for Postgres by leveraging the dbcrossbar library.
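To make the "schema plus stream" idea concrete, here is a minimal sketch of the kind of lower-level work such a wrapper takes off your hands, written against the standard library sqlite3 module. The helpers `create_table` and `load_stream`, the `{column: type}` schema format and the `sensor_readings` generator are all hypothetical and are not brunodb functions.

```python
import sqlite3


def create_table(conn, table, schema):
    """Create a table from a {column: sql_type} mapping (hypothetical helper)."""
    columns = ", ".join(f"{name} {sql_type}" for name, sql_type in schema.items())
    conn.execute(f"CREATE TABLE {table} ({columns})")


def load_stream(conn, table, schema, rows):
    """Insert an iterable of dict rows without building a list first."""
    names = list(schema)
    placeholders = ", ".join("?" for _ in names)
    sql = f"INSERT INTO {table} ({', '.join(names)}) VALUES ({placeholders})"
    # executemany accepts any iterable, so the generator is consumed lazily.
    conn.executemany(sql, (tuple(row[n] for n in names) for row in rows))
    conn.commit()


def sensor_readings(n):
    """Generate rows lazily, standing in for a file or network stream."""
    for i in range(n):
        yield {"sensor_id": i % 3, "reading": i * 0.5}


conn = sqlite3.connect(":memory:")
schema = {"sensor_id": "INTEGER", "reading": "REAL"}
create_table(conn, "readings", schema)
load_stream(conn, "readings", schema, sensor_readings(1000))

row_count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(row_count)
```

The table and column names are interpolated directly into the SQL here, so this sketch is only suitable for trusted, hard-coded schemas.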
There are no real dependencies besides sqlite3, which is a standard library module, and pytest for running tests. psycopg2 and a postgres database are needed to run the interface on postgres. dbcrossbar (an easy-to-install Rust library) is required for extremely fast bulk loads into postgres.
To install:
pip install brunodb
See here for a demo:
Or to run the demo:
from brunodb import demo
demo()
To run tests:
python -m pytest test
If you have postgres installed, you can test against it as well. You'll need to put the database password in the POSTGRES_PWD environment variable and use the usual defaults: running on localhost, the usual port, user name postgres, etc.
python -m pytest test_postgres
If you install dbcrossbar, you can do much faster postgres loads (around 80x faster).
python -m pytest test_postgres_bulk_load
Or run all tests if you have postgres and dbcrossbar installed:
python -m pytest
DBase is a wrapper class that works with either database backend.
For an in-memory SQLite database:
from brunodb import DBase
config = {'db_type': 'sqlite'}
dbase = DBase(config)
Or with a file:
config = {'db_type': 'sqlite', 'filename': 'path/my_database.db'}
dbase = DBase(config)
Or using postgres:
config = {'db_type': 'postgres'}
dbase = DBase(config)
Or add other config options:
config = {'db_type': 'postgres', 'port': 5555, 'password': 'foo'}
dbase = DBase(config)
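Rather than hard-coding the password, the config dict could be assembled from the environment. The sketch below is one way to do that. POSTGRES_PWD is the variable the tests expect, while the `postgres_config` helper and the POSTGRES_PORT variable are hypothetical. Whether brunodb reads POSTGRES_PWD on its own outside the tests is not stated here, so this passes the value explicitly.

```python
import os


def postgres_config(env=os.environ):
    """Build a DBase-style config dict from environment variables."""
    config = {'db_type': 'postgres'}
    password = env.get('POSTGRES_PWD')
    if password is not None:
        config['password'] = password
    port = env.get('POSTGRES_PORT')  # hypothetical variable name
    if port is not None:
        config['port'] = int(port)
    return config


# A plain dict stands in for os.environ so the example runs anywhere.
config = postgres_config({'POSTGRES_PWD': 'foo', 'POSTGRES_PORT': '5555'})
print(config)
```

The resulting dict has the same shape as the configs shown above and can be passed to DBase(config) when a postgres server is available.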
First version April 30, 2020
Make block=False the default for loading version='0.1.3'
Added Postgres functionality version '0.2.0'
Added dbcrossbar fast bulk loads to postgres, making it much faster than SQLite rather than much slower as before, version '0.3.0'
Demo and some cleanup, some aliases added version '0.3.2'
Download files
File details
Details for the file brunodb-0.3.2.tar.gz.
File metadata
- Download URL: brunodb-0.3.2.tar.gz
- Upload date:
- Size: 18.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.10.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.58.0 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7e8e206323c75cce4ff75b5553dd84de832c68ff8b760f74af747abdfedc6c11
MD5 | 101f68dea70258934f4b9becb712517c
BLAKE2b-256 | 356d42ec7eb50ab1010e389465a335e7226c66d649f7a32dede1560294db43f0
File details
Details for the file brunodb-0.3.2-py3-none-any.whl.
File metadata
- Download URL: brunodb-0.3.2-py3-none-any.whl
- Upload date:
- Size: 20.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.10.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.58.0 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 4dc038718b04aeef10bbfa358d65d6b8bb2d4ad32d0ef3651d25c5d6ab8e9a25
MD5 | a98e2a5207d878a10292fb97ce31dbc3
BLAKE2b-256 | 2738d54ee10e38e63a6511f4bf57654859b71c7ace4a0d221fd61226e0a95a6b