smallcat

A small, modular data catalog.

Install

pip install smallcat

Quickstart

Create Catalog

Local catalogs can be kept in YAML files, for example catalog.yaml:

entries:
    foo:
        file_format: csv
        connection:
            conn_type: fs
            extra:
                base_path: /tmp/smallcat-example/
        location: foo.csv
        load_options:
            header: true
    bar:
        file_format: parquet
        connection:
            conn_type: google_cloud_platform
            extra:
                bucket: my-bucket
        location: bar.csv
        save_options:
            partition_by:
                - year
                - month
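Before handing a catalog file to smallcat, it can help to check that every entry has the fields used in the example above. The following is a minimal sketch using only the standard library. The helper name check_catalog and the set of required keys are assumptions inferred from this example, not smallcat's own schema or API.

```python
# Hypothetical pre-flight check. The required keys below are inferred from
# the example catalog, not taken from smallcat's own schema.
REQUIRED_ENTRY_KEYS = {"file_format", "connection", "location"}


def check_catalog(catalog: dict) -> list[str]:
    """Return a list of readable problems; an empty list means none were found."""
    entries = catalog.get("entries")
    if not isinstance(entries, dict) or not entries:
        return ["catalog has no 'entries' mapping"]
    problems = []
    for name, entry in entries.items():
        if not isinstance(entry, dict):
            problems.append(f"{name}: entry is not a mapping")
            continue
        # Report each missing top-level key in a stable order.
        for key in sorted(REQUIRED_ENTRY_KEYS - set(entry)):
            problems.append(f"{name}: missing '{key}'")
        connection = entry.get("connection")
        if isinstance(connection, dict) and "conn_type" not in connection:
            problems.append(f"{name}: connection has no 'conn_type'")
    return problems


# The same structure as the YAML example, written as a Python mapping.
example = {
    "entries": {
        "foo": {
            "file_format": "csv",
            "connection": {
                "conn_type": "fs",
                "extra": {"base_path": "/tmp/smallcat-example/"},
            },
            "location": "foo.csv",
            "load_options": {"header": True},
        },
        "bar": {
            "file_format": "parquet",
            "connection": {
                "conn_type": "google_cloud_platform",
                "extra": {"bucket": "my-bucket"},
            },
            "location": "bar.csv",
            "save_options": {"partition_by": ["year", "month"]},
        },
    }
}

print(check_catalog(example))  # prints []
```

The check only looks at structure. It does not verify that a file_format or conn_type value is one smallcat supports.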

Standalone

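import pandas as pd
from smallcat import Catalog

# Load the catalog definition from the YAML file above.
catalog = Catalog.from_path("catalog.yaml")
# Save a small DataFrame under the "foo" entry, then read it back.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
catalog.save_pandas("foo", df)
df2 = catalog.load_pandas("foo")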

Filter on load

load_pandas and the lower-level Arrow loaders accept optional where and columns arguments. These push the row filter and the column projection down to DuckDB/Arrow, so both are applied while reading:

df = catalog.load_pandas(
    "bar",
    where="event_date >= '2024-01-01'",
    columns=["event_date", "user_id"],
)
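When the cutoff date comes from a variable rather than a literal, the where string has to be built at runtime. The sketch below assumes, based on the example above, that where is a SQL-style predicate string. The helper since is hypothetical and not part of smallcat. It accepts only plain column names so that arbitrary text cannot end up in the predicate.

```python
import re
from datetime import date

# Plain identifiers only: letters, digits and underscores, not starting with a digit.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def since(column: str, start: date) -> str:
    """Build a predicate such as "event_date >= '2024-01-01'"."""
    if not _IDENTIFIER.fullmatch(column):
        raise ValueError(f"not a plain column name: {column!r}")
    # date.isoformat() always yields YYYY-MM-DD, so the literal needs no escaping.
    return f"{column} >= '{start.isoformat()}'"


where = since("event_date", date(2024, 1, 1))
print(where)  # prints: event_date >= '2024-01-01'

# The result can then be passed on, for example:
# df = catalog.load_pandas("bar", where=where, columns=["event_date", "user_id"])
```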

With Airflow

from smallcat import Catalog

catalog = Catalog.from_airflow_variable("example_catalog")
df = catalog.load_pandas("bar")
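The format that from_airflow_variable expects for the variable's value is not stated here. As a sketch only, the code below assumes the variable holds the same entries mapping as the YAML file, serialized as JSON. Check the official docs before relying on that assumption.

```python
import json

# Assumption: the Airflow variable stores the catalog mapping as JSON.
# The "bar" entry mirrors the YAML example in the quickstart.
catalog_definition = {
    "entries": {
        "bar": {
            "file_format": "parquet",
            "connection": {
                "conn_type": "google_cloud_platform",
                "extra": {"bucket": "my-bucket"},
            },
            "location": "bar.csv",
            "save_options": {"partition_by": ["year", "month"]},
        }
    }
}

payload = json.dumps(catalog_definition)
print(payload)

# The printed string could then be stored with the Airflow CLI, for example:
#   airflow variables set example_catalog '<payload>'
```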

Docs

Read more at the official docs.
