Extract data from a bunch of files and load into a table
Elbow
Elbow is a library for extracting data from a bunch of files and loading into a table (and that's it).
Examples
import json

from elbow import load_table, load_parquet

# Extract records from JSON-lines
def extract(path):
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            yield record

# Load as a pandas dataframe
df = load_table(
    pattern="**/*.json",
    extract=extract,
)

# Load as a parquet dataset (in parallel)
dset = load_parquet(
    pattern="**/*.json",
    extract=extract,
    where="dset.parquet",
    workers=8,
)
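To see what an `extract` generator like the one above yields without installing Elbow, the following sketch writes two small JSON-lines files to a temporary directory and runs the extractor over every file matching the same glob pattern. It uses only the standard library. The helper names `collect_records` and `make_demo_files`, the file names, and the record fields are made up for illustration; this is not how Elbow itself matches files or builds tables.

```python
import glob
import json
import os
import tempfile


def extract(path):
    """Yield one dict per non-empty line of a JSON-lines file."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines, which json.loads would reject
                yield json.loads(line)


def collect_records(root, pattern="**/*.json"):
    """Hypothetical stand-in for a loader: apply extract to every matching file."""
    paths = sorted(glob.glob(os.path.join(root, pattern), recursive=True))
    return [record for path in paths for record in extract(path)]


def make_demo_files(root):
    """Write two small JSON-lines files (made-up names and fields)."""
    os.makedirs(os.path.join(root, "sub"), exist_ok=True)
    with open(os.path.join(root, "a.json"), "w") as f:
        f.write('{"id": 1, "name": "x"}\n{"id": 2, "name": "y"}\n')
    with open(os.path.join(root, "sub", "b.json"), "w") as f:
        f.write('{"id": 3, "name": "z"}\n')


with tempfile.TemporaryDirectory() as tmp:
    make_demo_files(tmp)
    records = collect_records(tmp)

print(records)
```

Each record is a plain dict, so a list like `records` can be thought of as the rows of the resulting table, with the dict keys as columns.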
Installation
A pre-release version can be installed with:
pip install elbow
Other (better) projects
Download files
Source Distribution
elbow-0.1.0a1.tar.gz (31.0 kB)
Built Distribution
elbow-0.1.0a1-py3-none-any.whl (26.0 kB)
File details
Details for the file elbow-0.1.0a1.tar.gz.
File metadata
- Download URL: elbow-0.1.0a1.tar.gz
- Upload date:
- Size: 31.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.7.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1b6b6e3880e6e1ab5bac350ba53c908d16e2a26ae7c22608a924b3bd27898efd
MD5 | 9103d7a3d173bc43ffd3e1bff91f80a1
BLAKE2b-256 | 239d6f24e56b4c912bd03b382c463c5507b90cd7c4e1f32030876463445270d3
File details
Details for the file elbow-0.1.0a1-py3-none-any.whl.
File metadata
- Download URL: elbow-0.1.0a1-py3-none-any.whl
- Upload date:
- Size: 26.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.7.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | b07b0b7df7445f77cae0b8511fe900cb9f28005eec88e591354f58062105df46
MD5 | 608bb58bb39470102b9ef45e568e940b
BLAKE2b-256 | 820653acaf0c24943d2f3ad3b068018abc1b00806f73669ca9818aa30f6b2ec1