Skip to main content

Add your description here

Project description

smallcat logo

smallcat

A small, modular data catalog.

PyPI Python versions CI coverage license downloads docs

Install

pip install smallcat

Quickstart

Create Catalog

Local catalogs can be kept in YAML files.

entries:
    foo:
        file_format: csv
        connection:
            conn_type: fs
            extra:
                base_path: /tmp/smallcat-example/
        location: foo.csv
        load_options:
            header: true
    bar:
        file_format: parquet
        connection:
            conn_type: google_cloud_platform
            extra:
                bucket: my-bucket
        location: bar.csv
        save_options:
            partition_by:
                - year
                - month

Standalone

from smallcat import Catalog

catalog = Catalog.from_path("catalog.yaml")
catalog.save_pandas("foo", df)
df2 = catalog.load_pandas("foo")

Filter on load

load_pandas (and the lower-level Arrow loaders) accept an optional where SQL predicate to push filters down to DuckDB/Arrow when reading:

df = catalog.load_pandas("bar", where="event_date >= '2024-01-01'")

With Airflow

from smallcat import Catalog

catalog = Catalog.from_airflow_variable("example_catalog")
df = catalog.load_pandas("bar")

Docs

Read more at the official docs.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

smallcat-0.4.0.tar.gz (20.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

smallcat-0.4.0-py3-none-any.whl (24.8 kB view details)

Uploaded Python 3

File details

Details for the file smallcat-0.4.0.tar.gz.

File metadata

  • Download URL: smallcat-0.4.0.tar.gz
  • Upload date:
  • Size: 20.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.9.17 {"installer":{"name":"uv","version":"0.9.17","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for smallcat-0.4.0.tar.gz
Algorithm Hash digest
SHA256 97fa8e95e7c333614e2b921d14bf55ccf9c7de7f57e38434f27cea0bd5371e53
MD5 dac961e21c066159a7d9266b9d597d2a
BLAKE2b-256 e49c5dae05a05ca4e39fe1db2cb801ce78c2a1332d5b092486cf9912f856ad8d

See more details on using hashes here.

File details

Details for the file smallcat-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: smallcat-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 24.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.9.17 {"installer":{"name":"uv","version":"0.9.17","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for smallcat-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2d61424e8df77f1081781261315796484be6086e733b79b7c1ad5d17e1504a93
MD5 8a5f249953cc6ca4197eaf5bd00f01fc
BLAKE2b-256 b41a9d84f2b132115f40c2938304a1baa515443e099846f46fc466e99313bc87

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page