
DE1's curated collection of Kedro tools.


de1

Curated collection of DE1's favorite Kedro utilities.

EmptyPartitionedDataSet

For cases where data is not yet available in a particular folder, or where having no data is a valid state.

Particularly useful when doing sub-node parallelization.

empty_json_collection:
    type: de1.empty.EmptyPartitionedDataSet
    path: data/02_intermediate/json_collection
    dataset: json.JSONDataSet
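
The sketch below shows why an empty result can be convenient downstream. It assumes the usual Kedro convention that a partitioned dataset loads as a dictionary mapping partition ids to load functions, and it assumes (the description above does not confirm this) that EmptyPartitionedDataSet hands the node an empty dictionary where Kedro's built-in PartitionedDataSet would typically raise an error. `load_all` is a hypothetical node function, not part of de1.

```python
from typing import Any, Callable, Dict, List


def load_all(partitions: Dict[str, Callable[[], Any]]) -> List[Any]:
    """Load partitions in sorted id order; an empty mapping yields an empty list."""
    # Each value is a zero-argument load function, called only here.
    return [partitions[partition_id]() for partition_id in sorted(partitions)]


# With no partitions at all, the node still returns a valid (empty) result.
nothing = load_all({})

# With partitions present, they are loaded in a deterministic order.
something = load_all({"b": lambda: {"id": 2}, "a": lambda: {"id": 1}})
```

Written this way, the node needs no special case for a folder that has not been populated yet.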

LazyPartitionedDataSet

For when the data is too large to compute all at once and some clean-up is needed along the way.

lazy_json_collection:
    type: de1.lazy.LazyPartitionedDataSet
    path: data/02_intermediate/json_collection
    dataset: json.JSONDataSet
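
As a rough illustration of the general idea (not de1's implementation, which is not shown here), a node can return callables instead of loaded data, so each partition is read and transformed only when it is actually needed. The names `defer_partitions` and `transform` are hypothetical, and the sketch assumes partitions arrive as a dictionary of partition ids to load functions.

```python
from typing import Any, Callable, Dict

Loader = Callable[[], Any]


def defer_partitions(
    partitions: Dict[str, Loader],
    transform: Callable[[Any], Any],
) -> Dict[str, Loader]:
    """Wrap each partition so loading and transforming happen on demand."""

    def deferred(load: Loader) -> Loader:
        # Bind `load` per partition so every callable reads its own data.
        return lambda: transform(load())

    return {partition_id: deferred(load) for partition_id, load in partitions.items()}
```

Only one partition's data is held in memory while its callable runs, and the intermediate value can be released as soon as the result has been written.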

PDFDataSet

A dataset that uses pdfplumber to extract text and tables from PDF files.

Data is returned as a PDFPage object.

invoice_pdf:
    type: de1.pdf.PDFDataSet
    filepath: data/01_raw/invoice.pdf
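
For orientation, this is roughly what extracting text and tables with pdfplumber looks like when used directly. It is a sketch under assumptions: the structure of de1's PDFPage object is not documented above, so the dictionary returned here is purely illustrative, and `summarize_page` and `read_pdf` are hypothetical helpers. Only `pdfplumber.open`, `pdf.pages`, `page.extract_text()` and `page.extract_tables()` are pdfplumber calls.

```python
from typing import Any, Dict, List


def summarize_page(page: Any) -> Dict[str, Any]:
    """Collect text and tables from a pdfplumber-style page object."""
    # extract_text() may return None for a page without text, so fall back to "".
    text = page.extract_text() or ""
    tables = page.extract_tables()
    return {"text": text, "tables": tables}


def read_pdf(path: str) -> List[Dict[str, Any]]:
    """Open a PDF with pdfplumber and summarize every page."""
    # Third-party dependency, imported here so summarize_page works without it.
    import pdfplumber

    with pdfplumber.open(path) as pdf:
        return [summarize_page(page) for page in pdf.pages]
```

Calling `read_pdf("data/01_raw/invoice.pdf")` would return one dictionary per page, provided pdfplumber is installed and the file exists.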

ZipFileDataSet

A dataset that extracts a single file from a zip file and returns its contents as bytes. By default it returns a byte array, but a dataset can be passed in to change the unzip behavior.

invoice_pdf:
    type: de1.zip.ZipFileDataSet
    filepath: data/01_raw/invoice.zip
    filename: invoice.pdf
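
The default behavior described above, reading one named member of an archive as raw bytes, can be sketched with Python's standard `zipfile` module. This is an illustration rather than de1's own code; `read_zip_member` is a hypothetical helper and the archive contents are made up for the example.

```python
import io
import zipfile
from typing import BinaryIO, Union


def read_zip_member(source: Union[str, BinaryIO], filename: str) -> bytes:
    """Return the raw bytes of one named file inside a zip archive."""
    with zipfile.ZipFile(source) as archive:
        # Raises KeyError if `filename` is not present in the archive.
        return archive.read(filename)


# Build a small archive in memory so the example runs without files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("invoice.pdf", b"%PDF-1.4 example bytes")
buffer.seek(0)

payload = read_zip_member(buffer, "invoice.pdf")
```

The returned bytes could then be handed to another loader, which is presumably the role of the optional dataset argument mentioned above.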


