Extract data from a bunch of files and load into a table
Project description
Elbow
Elbow is a library for extracting data from a bunch of files and loading it into a table (and that's it).
Examples
import json
from elbow import load_table, load_parquet

# Extract records from JSON-lines
def extract(path):
    with open(path) as f:
        # Parse each line of the file as one JSON record
        for line in f:
            record = json.loads(line)
            yield record

# Load as a pandas dataframe
df = load_table(
    pattern="**/*.json",
    extract=extract,
)

# Load as a parquet dataset (in parallel)
dset = load_parquet(
    pattern="**/*.json",
    extract=extract,
    where="dset.parquet",
    workers=8,
)
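The `extract` function above is an ordinary generator, so it can be tried on its own before it is passed to Elbow. The following is a minimal sketch that uses only the standard library. The `collect` helper is a hypothetical stand-in written for this illustration; it is not part of Elbow's API and makes no claim about how `load_table` works internally.

```python
import json
import tempfile
from pathlib import Path


def extract(path):
    """Yield one record (a dict) per non-empty line of a JSON-lines file."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def collect(root, pattern, extract):
    """Hypothetical helper: gather records from every file matching pattern."""
    records = []
    for path in sorted(Path(root).glob(pattern)):
        records.extend(extract(path))
    return records


# Build a small throwaway directory tree and run the extractor over it.
with tempfile.TemporaryDirectory() as tmp:
    sub = Path(tmp) / "a"
    sub.mkdir()
    (sub / "one.json").write_text('{"x": 1}\n{"x": 2}\n')
    (Path(tmp) / "two.json").write_text('{"x": 3}\n')
    rows = collect(tmp, "**/*.json", extract)
```

Each record is a plain dict, so one record corresponds to one row of the resulting table and its keys correspond to the columns.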
Installation
A pre-release version can be installed with
pip install git+https://github.com/clane9/elbow
Download files
Source Distribution
elbow-0.1.0a0.tar.gz (9.0 kB)
Built Distribution
elbow-0.1.0a0-py3-none-any.whl (3.6 kB)
File details
Details for the file elbow-0.1.0a0.tar.gz.
File metadata
- Download URL: elbow-0.1.0a0.tar.gz
- Upload date:
- Size: 9.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.7.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | fdc8ab5173d92b68e2656133e499bb603e42b4522e60c0b13f425973f7b56d86
MD5 | fe227e9100664fdccda1b8340add1a99
BLAKE2b-256 | 395ffd964dc435a0dba379fff7b509772a42c30bdf5d38e44ac8684d74069d3c
File details
Details for the file elbow-0.1.0a0-py3-none-any.whl.
File metadata
- Download URL: elbow-0.1.0a0-py3-none-any.whl
- Upload date:
- Size: 3.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.7.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | bdc876317847612241222ed96631ce522d6ca7cfe7dc7df73ce596b94299381c
MD5 | 3c9c325d2b505dcd1686bf21c724932b
BLAKE2b-256 | e7ce2358fce7f3e908f0baacd61d4a8404bb40328ef73dff99b1a49f56ebdd09
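A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using Python's standard `hashlib` module. The function names `sha256_of` and `matches` are hypothetical, chosen for this illustration, and the sketch assumes the file has already been downloaded to a local path.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("elbow-0.1.0a0.tar.gz", "fdc8ab5173d92b68e2656133e499bb603e42b4522e60c0b13f425973f7b56d86")` should return `True` for an intact copy of the source distribution.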