smallcat

A small, modular data catalog.

Install

pip install smallcat

Quickstart

Create Catalog

Local catalogs can be kept in YAML files.

entries:
    foo:
        file_format: csv
        connection:
            conn_type: fs
            extra:
                base_path: /tmp/smallcat-example/
        location: foo.csv
        load_options:
            header: true
    bar:
        file_format: parquet
        connection:
            conn_type: google_cloud_platform
            extra:
                bucket: my-bucket
        location: bar.csv
        save_options:
            partition_by:
                - year
                - month
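To try the local `foo` entry without touching a shared path, the catalog file can be generated from a script. The following is a minimal sketch using only the standard library: the helper name `write_example_catalog` and the `data` subdirectory are hypothetical, and it assumes `base_path` is the directory that `location` is resolved against.

```python
import tempfile
from pathlib import Path

# Same structure as the `foo` entry above; only base_path is filled in at runtime.
CATALOG_TEMPLATE = """\
entries:
    foo:
        file_format: csv
        connection:
            conn_type: fs
            extra:
                base_path: {base_path}
        location: foo.csv
        load_options:
            header: true
"""


def write_example_catalog(root: Path) -> Path:
    """Create root/data and write a one-entry catalog.yaml into root (hypothetical helper)."""
    data_dir = root / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    catalog_path = root / "catalog.yaml"
    catalog_path.write_text(CATALOG_TEMPLATE.format(base_path=data_dir.as_posix()))
    return catalog_path


with tempfile.TemporaryDirectory() as tmp:
    path = write_example_catalog(Path(tmp))
    print(path.read_text())
    # The resulting file could then be passed to Catalog.from_path(str(path)).
```

Keeping the template as plain text avoids a YAML dependency; a YAML library would be the better choice once entries are built dynamically.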

Standalone

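import pandas as pd
from smallcat import Catalog

df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})  # example data

catalog = Catalog.from_path("catalog.yaml")
catalog.save_pandas("foo", df)  # write df to the "foo" entry
df2 = catalog.load_pandas("foo")  # read it back as a DataFrame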

Filter on load

load_pandas (and the lower-level Arrow loaders) accept an optional where argument: a SQL predicate that is pushed down to DuckDB/Arrow, so rows are filtered while reading rather than after loading:

df = catalog.load_pandas("bar", where="event_date >= '2024-01-01'")
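When the cutoff date comes from code rather than a literal, the predicate string has to be assembled first. A minimal sketch follows; the helper `event_date_since` is hypothetical and not part of smallcat, and the column name `event_date` is taken from the example above.

```python
from datetime import date


def event_date_since(start: date, column: str = "event_date") -> str:
    """Return a SQL predicate selecting rows on or after `start` (hypothetical helper)."""
    # isoformat() yields YYYY-MM-DD, which contains no quote characters,
    # so embedding it in a single-quoted SQL literal is safe.
    return f"{column} >= '{start.isoformat()}'"


predicate = event_date_since(date(2024, 1, 1))
print(predicate)  # event_date >= '2024-01-01'

# The string can then be passed as the where argument shown above:
# df = catalog.load_pandas("bar", where=predicate)
```

Formatting from a `date` object rather than a free-form string keeps arbitrary text out of the predicate.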

With Airflow

from smallcat import Catalog

catalog = Catalog.from_airflow_variable("example_catalog")
df = catalog.load_pandas("bar")
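The README does not show what the `example_catalog` Airflow Variable contains. The sketch below assumes, without confirmation from the source, that the variable holds the same structure as the YAML catalog serialized as JSON; check the official docs before relying on this. It only builds the JSON payload and does not call Airflow.

```python
import json

# Mirrors the `bar` entry from the YAML catalog above.
# ASSUMPTION: the Airflow Variable uses this same schema, encoded as JSON.
catalog_dict = {
    "entries": {
        "bar": {
            "file_format": "parquet",
            "connection": {
                "conn_type": "google_cloud_platform",
                "extra": {"bucket": "my-bucket"},
            },
            "location": "bar.csv",
            "save_options": {"partition_by": ["year", "month"]},
        }
    }
}

payload = json.dumps(catalog_dict, indent=2)
print(payload)

# One way to store the payload, using the Airflow CLI:
#   airflow variables set example_catalog "$(cat catalog.json)"
```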

Docs

Read more at the official docs.
