Client for Updating a Simple Data Warehouse on Blob Storage
datablob
design philosophy
- optimize for simplicity and user friendliness
- storage is cheap (compared to compute)
- pre-compute as much as possible
- should work out of the box
- advanced configuration should be opt-in
- explicit is better than implicit
- straightforwardness over magic
install
pip install datablob
supported formats
- csv
- geojson points
- json
- json lines
- parquet, including geoparquet
- xlsx (Microsoft Excel)
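To make the text formats concrete, here is a minimal sketch of how the same rows could look as csv, json, and json lines, using only the Python standard library. It is not datablob's implementation; the function names (`to_csv`, `to_json`, `to_jsonl`) and the assumption that rows are a list of dicts with identical keys are illustrative only.

```python
import csv
import io
import json


def to_csv(rows):
    """Serialize a list of dicts to CSV text, with a header row taken from the first dict."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize all rows as a single JSON array."""
    return json.dumps(rows)


def to_jsonl(rows):
    """Serialize rows as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(row) + "\n" for row in rows)


# hypothetical sample data
sample_rows = [{"id": 1, "name": "truck"}, {"id": 2, "name": "van"}]
csv_text = to_csv(sample_rows)
json_text = to_json(sample_rows)
jsonl_text = to_jsonl(sample_rows)
```

The json file holds one array, while the json lines file holds one object per line, which makes it easier to stream or append to.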
basic usage
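from datablob import DataBlobClient

# example data; assumed here to be a list of dicts, one per row
rows = [{"id": 1, "name": "truck"}, {"id": 2, "name": "van"}]

client = DataBlobClient(bucket_name="example-test-bucket-123", bucket_path="prefix/to/dataportal")
client.update_dataset(name="fleet", version="2", data=rows, xlsx=True)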
# automatically creates the following files
# s3://example-test-bucket-123/prefix/to/dataportal/fleet/v2/meta.json
# s3://example-test-bucket-123/prefix/to/dataportal/fleet/v2/data.csv
# s3://example-test-bucket-123/prefix/to/dataportal/fleet/v2/data.json
# s3://example-test-bucket-123/prefix/to/dataportal/fleet/v2/data.jsonl
# s3://example-test-bucket-123/prefix/to/dataportal/fleet/v2/data.parquet
# s3://example-test-bucket-123/prefix/to/dataportal/fleet/v2/data.xlsx
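The file layout above follows a predictable pattern: bucket, path prefix, dataset name, then a `v`-prefixed version folder. The sketch below rebuilds those URIs, which can help when pointing downstream consumers at a dataset. `expected_object_uris` is a hypothetical helper, not part of datablob, and it assumes that `data.xlsx` is written only when `xlsx=True` is passed.

```python
from typing import List


def expected_object_uris(
    bucket_name: str,
    bucket_path: str,
    name: str,
    version: str,
    xlsx: bool = False,
) -> List[str]:
    """Return the S3 URIs that the listing above shows for one dataset version."""
    # layout inferred from the listing: <bucket>/<path>/<name>/v<version>/<file>
    prefix = f"s3://{bucket_name}/{bucket_path.strip('/')}/{name}/v{version}"
    filenames = ["meta.json", "data.csv", "data.json", "data.jsonl", "data.parquet"]
    if xlsx:
        # assumption: the Excel file is opt-in via xlsx=True
        filenames.append("data.xlsx")
    return [f"{prefix}/{filename}" for filename in filenames]


uris = expected_object_uris(
    bucket_name="example-test-bucket-123",
    bucket_path="prefix/to/dataportal",
    name="fleet",
    version="2",
    xlsx=True,
)
```

Because every version gets its own folder, older versions stay readable after an update, in line with the "storage is cheap" principle.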
examples