DE1's curated collection of kedro tools.
de1
Curated collection of DE1's favorite kedro utilities.
EmptyPartitionedDataSet
For when data is not yet available in a particular folder, or when having no data at all is a valid state.
Particularly useful for sub-node parallelization.
empty_json_collection:
  type: de1.empty.EmptyPartitionedDataSet
  path: data/02_intermediate/json_collection
  dataset: json.JSONDataSet
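The idea of treating a missing or empty folder as "no partitions" rather than as an error can be sketched in plain Python. This is only an illustration under stated assumptions, not de1's implementation: `load_partitions` and its `suffix` parameter are hypothetical names, and the sketch uses only the standard library instead of kedro.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict


def load_partitions(path: str, suffix: str = ".json") -> Dict[str, Callable[[], Any]]:
    """Map partition ids to loader callables; return {} when nothing is there.

    Hypothetical helper for illustration only, not part of de1 or kedro.
    """
    root = Path(path)
    if not root.is_dir():
        # A missing folder is treated as "no data" rather than an error.
        return {}
    partitions: Dict[str, Callable[[], Any]] = {}
    for file in sorted(root.rglob(f"*{suffix}")):
        partition_id = file.relative_to(root).as_posix()
        # Bind `file` as a default argument so each loader reads its own file.
        partitions[partition_id] = lambda f=file: json.loads(f.read_text())
    return partitions


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # The folder does not exist yet, so the result is an empty dict.
        print(load_partitions(str(Path(tmp) / "json_collection")))
```

A downstream node can then loop over the returned dict without a special case: with no partitions the loop simply does nothing.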
LazyPartitionedDataSet
For when the data is too big to compute all at once and some clean-up is needed along the way.
lazy_json_collection:
  type: de1.lazy.LazyPartitionedDataSet
  path: data/02_intermediate/json_collection
  dataset: json.JSONDataSet
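One way to picture lazy handling of partitions is to let each partition be a callable that is only evaluated at the moment it is written, so at most one partition's data is held in memory at a time. The sketch below is an assumption about the general technique, not de1's actual code: `save_partitions_lazily` is a hypothetical name and the example writes JSON with the standard library only.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict, List, Union

# A partition is either ready-made data or a zero-argument function producing it.
Partition = Union[Any, Callable[[], Any]]


def save_partitions_lazily(path: str, partitions: Dict[str, Partition]) -> List[str]:
    """Write each partition to `path` as JSON, materialising one at a time.

    Hypothetical helper for illustration only, not part of de1 or kedro.
    """
    root = Path(path)
    root.mkdir(parents=True, exist_ok=True)
    written: List[str] = []
    for partition_id, partition in partitions.items():
        # A callable is only invoked here, so its result is computed
        # just before it is written.
        data = partition() if callable(partition) else partition
        target = root / f"{partition_id}.json"
        target.write_text(json.dumps(data))
        written.append(target.name)
        del data  # drop the reference before computing the next partition
    return written
```

A node would return something like `{"part_0": lambda: expensive_step(0), ...}`, where `expensive_step` stands for whatever computation produces the partition; any clean-up can run inside that callable before it returns.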
PDFDataSet
A dataset that uses pdfplumber to extract text and tables from PDF files.
Data is returned as a PDFPage object.
invoice_pdf:
  type: de1.pdf.PDFDataSet
  filepath: data/01_raw/invoice.pdf
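For a rough idea of what extracting text and tables with pdfplumber looks like, here is a minimal sketch. It assumes pdfplumber is installed and uses a plain dict per page as a stand-in, since the fields of de1's PDFPage object are not described here; `extract_pages` is a hypothetical helper, not part of de1.

```python
from typing import Any, Dict, List


def extract_pages(filepath: str) -> List[Dict[str, Any]]:
    """Return page number, text and tables for every page of a PDF.

    Hypothetical helper for illustration; the dict layout is an assumption
    and does not mirror de1's PDFPage object.
    """
    # Imported here so this module can be loaded without pdfplumber installed.
    import pdfplumber

    pages: List[Dict[str, Any]] = []
    with pdfplumber.open(filepath) as pdf:
        for page in pdf.pages:
            pages.append(
                {
                    "page_number": page.page_number,
                    # extract_text() may yield nothing for image-only pages.
                    "text": page.extract_text() or "",
                    # Each table is a list of rows, each row a list of cells.
                    "tables": page.extract_tables(),
                }
            )
    return pages
```

Calling `extract_pages("data/01_raw/invoice.pdf")` would then give one entry per page of the invoice.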
ZipFileDataSet
A dataset that extracts a single file from a zip archive. By default it returns the file's contents as bytes, but a dataset can be passed in to change how the extracted file is loaded.
invoice_pdf:
  type: de1.zip.ZipFileDataSet
  filepath: data/01_raw/invoice.zip
  filename: invoice.pdf
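The behaviour described above, reading one member of an archive as bytes and optionally handing those bytes to something else, can be sketched with the standard library's `zipfile` module. This is an illustration under assumptions, not de1's implementation: `read_from_zip` and its `loader` argument are hypothetical, with `loader` standing in for the optional dataset.

```python
import zipfile
from typing import Any, BinaryIO, Callable, Optional, Union


def read_from_zip(
    filepath: Union[str, BinaryIO],
    filename: str,
    loader: Optional[Callable[[bytes], Any]] = None,
) -> Any:
    """Read one member of a zip archive; return raw bytes unless a loader is given.

    Hypothetical helper for illustration only, not part of de1.
    """
    with zipfile.ZipFile(filepath) as archive:
        # ZipFile.read raises KeyError if `filename` is not in the archive.
        raw = archive.read(filename)
    # Without a loader the caller gets the bytes; with one, the parsed result.
    return raw if loader is None else loader(raw)
```

For example, `read_from_zip("data/01_raw/invoice.zip", "invoice.pdf")` would return the PDF's bytes, while passing `loader=json.loads` for a JSON member would return the parsed object instead.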