smallcat

A small, modular data catalog.

Install

pip install smallcat

Quickstart

Create Catalog

Local catalogs can be kept in YAML files.

entries:
    foo:
        file_format: csv
        connection:
            conn_type: fs
            extra:
                base_path: /tmp/smallcat-example/
        location: foo.csv
        load_options:
            header: true
    bar:
        file_format: parquet
        connection:
            conn_type: google_cloud_platform
            extra:
                bucket: my-bucket
        location: bar.parquet
        save_options:
            partition_by:
                - year
                - month
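
For readers who generate or sanity-check catalogs programmatically, here is a minimal sketch that mirrors the foo entry above as a plain Python dict and reports entries that lack the keys used in the example. REQUIRED_KEYS and missing_keys are hypothetical illustrations and not part of smallcat; the actual schema is defined by the library.

```python
# Hypothetical helper, NOT part of smallcat: it only mirrors the YAML layout above.
# Assumption: every entry needs these three keys, as both example entries have them.
REQUIRED_KEYS = {"file_format", "connection", "location"}

catalog_dict = {
    "entries": {
        "foo": {
            "file_format": "csv",
            "connection": {
                "conn_type": "fs",
                "extra": {"base_path": "/tmp/smallcat-example/"},
            },
            "location": "foo.csv",
            "load_options": {"header": True},
        },
    }
}


def missing_keys(catalog: dict) -> dict:
    """Return, per entry name, the assumed-required keys that are absent."""
    problems = {}
    for name, entry in catalog.get("entries", {}).items():
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems[name] = missing
    return problems


print(missing_keys(catalog_dict))
```

An empty result means every entry carries the assumed keys; it says nothing about whether smallcat itself would accept the catalog.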

Standalone

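import pandas as pd
from smallcat import Catalog

# Example data to store under the "foo" entry
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

catalog = Catalog.from_path("catalog.yaml")
catalog.save_pandas("foo", df)    # write df to the location configured for "foo"
df2 = catalog.load_pandas("foo")  # read it back as a DataFrame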

Filter on load

load_pandas (and the lower-level Arrow loaders) accept optional where and columns arguments to push filters and projections down to DuckDB/Arrow when reading:

df = catalog.load_pandas(
    "bar",
    where="event_date >= '2024-01-01'",
    columns=["event_date", "user_id"],
)
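
To make the effect of where and columns concrete, here is a minimal sketch of the equivalent filter and projection written as SQL. It uses Python's built-in sqlite3 as a stand-in engine, not DuckDB or smallcat itself, and the table layout and sample rows are invented for illustration.

```python
import sqlite3

# Stand-in engine: sqlite3 from the standard library, NOT DuckDB or smallcat.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE bar (event_date TEXT, user_id INTEGER, payload TEXT)")
con.executemany(
    "INSERT INTO bar VALUES (?, ?, ?)",
    [
        ("2023-12-31", 1, "old"),
        ("2024-01-01", 2, "new"),
        ("2024-02-15", 3, "newer"),
    ],
)

# columns=[...] corresponds to the SELECT list (projection);
# where="..." corresponds to the WHERE clause (filter).
rows = con.execute(
    "SELECT event_date, user_id FROM bar "
    "WHERE event_date >= '2024-01-01' ORDER BY event_date"
).fetchall()
con.close()

print(rows)
```

Only the two requested columns and the rows matching the predicate come back; the payload column and the 2023 row are never returned to the caller.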

With Airflow

from smallcat import Catalog

catalog = Catalog.from_airflow_variable("example_catalog")
df = catalog.load_pandas("bar")

Docs

Read more at the official docs.
