smallcat

A small, modular data catalog.

Install

pip install smallcat

Quickstart

Create Catalog

A local catalog can be kept in a YAML file. Each key under entries names a dataset and describes its file format, connection, and location:

entries:
    foo:
        file_format: csv
        connection:
            conn_type: fs
            extra:
                base_path: /tmp/smallcat-example/
        location: foo.csv
        load_options:
            header: true
    bar:
        file_format: parquet
        connection:
            conn_type: google_cloud_platform
            extra:
                bucket: my-bucket
        location: bar.parquet
        save_options:
            partition_by:
                - year
                - month
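
To try the quickstart, the catalog has to exist on disk first. The sketch below writes only the foo entry from the example above to a catalog.yaml file using the standard library. The write_catalog helper and the temporary directory are illustrative assumptions, not part of smallcat, and it is not stated here whether base_path must already exist.

```python
import tempfile
import textwrap
from pathlib import Path

# The "foo" entry copied from the example catalog above.
CATALOG_YAML = textwrap.dedent(
    """\
    entries:
        foo:
            file_format: csv
            connection:
                conn_type: fs
                extra:
                    base_path: /tmp/smallcat-example/
            location: foo.csv
            load_options:
                header: true
    """
)


def write_catalog(directory: Path) -> Path:
    """Write the example catalog into `directory` and return its path.

    Hypothetical helper for illustration; smallcat does not provide it.
    """
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "catalog.yaml"
    path.write_text(CATALOG_YAML, encoding="utf-8")
    return path


# A temporary directory keeps the example from touching the working tree.
catalog_path = write_catalog(Path(tempfile.mkdtemp()))
print(catalog_path.read_text(encoding="utf-8"))
```

The resulting path can then be passed to Catalog.from_path in place of "catalog.yaml".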

Standalone

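import pandas as pd
from smallcat import Catalog

# Example data to store under the "foo" entry.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

catalog = Catalog.from_path("catalog.yaml")
catalog.save_pandas("foo", df)    # write df to the location configured for "foo"
df2 = catalog.load_pandas("foo")  # read it back as a DataFrame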

Filter on load

load_pandas (and the lower-level Arrow loaders) accept an optional where argument: a SQL predicate that is pushed down to DuckDB/Arrow, so rows are filtered while reading rather than after loading:

df = catalog.load_pandas("bar", where="event_date >= '2024-01-01'")
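
When the cutoff date comes from code rather than a literal, the predicate string can be built from a date object. This is a minimal sketch: date_predicate is a hypothetical helper, not part of smallcat, and the only assumption about the library is the load_pandas(..., where=...) call shown above.

```python
from datetime import date


def date_predicate(column: str, start: date) -> str:
    """Return a SQL predicate selecting rows on or after `start`.

    Hypothetical helper for illustration; smallcat does not provide it.
    """
    # Reject anything that is not a plain identifier so the column name
    # cannot smuggle extra SQL into the predicate.
    if not column.isidentifier():
        raise ValueError(f"not a plain column name: {column!r}")
    # isoformat() yields the 'YYYY-MM-DD' literal used in the example above.
    return f"{column} >= '{start.isoformat()}'"


where = date_predicate("event_date", date(2024, 1, 1))
print(where)

# Assumed usage, mirroring the call shown above:
# df = catalog.load_pandas("bar", where=where)
```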

With Airflow

from smallcat import Catalog

catalog = Catalog.from_airflow_variable("example_catalog")
df = catalog.load_pandas("bar")

Docs

Read more at the official docs.
