smallcat
A small, modular data catalog.
Install
pip install smallcat
Quickstart
Create Catalog
Local catalogs can be kept in YAML files.
entries:
  foo:
    file_format: csv
    connection:
      conn_type: fs
      extra:
        base_path: /tmp/smallcat-example/
    location: foo.csv
    load_options:
      header: true
  bar:
    file_format: parquet
    connection:
      conn_type: google_cloud_platform
      extra:
        bucket: my-bucket
    location: bar.csv
    save_options:
      partition_by:
        - year
        - month
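To try the foo entry locally, a CSV file with a header row has to exist where the entry points, presumably base_path joined with location. The following is a minimal sketch using only the Python standard library. The helper name write_sample_csv and the columns id and name are hypothetical and not part of smallcat.

```python
import csv
import tempfile
from pathlib import Path


def write_sample_csv(base_path: str, location: str = "foo.csv") -> Path:
    """Write a small CSV with a header row under base_path and return its path."""
    target = Path(base_path) / location
    target.parent.mkdir(parents=True, exist_ok=True)
    # Hypothetical sample rows; any columns would do.
    rows = [
        {"id": 1, "name": "alpha"},
        {"id": 2, "name": "beta"},
    ]
    with target.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["id", "name"])
        writer.writeheader()  # the header row corresponds to load_options: header: true
        writer.writerows(rows)
    return target


# Demo in a temporary directory so nothing outside it is touched.
demo_dir = tempfile.mkdtemp()
sample_path = write_sample_csv(demo_dir)
print(sample_path.read_text())
```

Passing "/tmp/smallcat-example/" instead of the temporary directory would place the file at the path used by the example catalog.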
Standalone
import pandas as pd
from smallcat import Catalog

catalog = Catalog.from_path("catalog.yaml")
df = pd.DataFrame({"id": [1, 2], "name": ["alpha", "beta"]})  # example data
catalog.save_pandas("foo", df)
df2 = catalog.load_pandas("foo")
Filter on load
load_pandas (and the lower-level Arrow loaders) accept optional where and
columns arguments to push filters and projections down to DuckDB/Arrow when reading:
df = catalog.load_pandas(
    "bar",
    where="event_date >= '2024-01-01'",
    columns=["event_date", "user_id"],
)
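The call above is roughly a SQL query with a projection and a predicate, assuming where is treated as a SQL filter expression and columns as the list of selected columns. The sketch below illustrates that reading with the standard-library sqlite3 module as a stand-in for DuckDB. The table contents and the extra action column are made up, and nothing here calls smallcat itself.

```python
import sqlite3

# Hypothetical sample rows standing in for the "bar" dataset.
rows = [
    ("2023-12-31", 1, "click"),
    ("2024-01-01", 2, "view"),
    ("2024-02-15", 3, "click"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (event_date TEXT, user_id INTEGER, action TEXT)")
conn.executemany("INSERT INTO bar VALUES (?, ?, ?)", rows)

# columns=[...] plays the role of the SELECT list,
# where="..." plays the role of the WHERE clause.
columns = ["event_date", "user_id"]
where = "event_date >= '2024-01-01'"
query = f"SELECT {', '.join(columns)} FROM bar WHERE {where} ORDER BY event_date"

result = conn.execute(query).fetchall()
conn.close()
print(result)
```

Only the two requested columns and the rows dated 2024-01-01 or later come back. With pushdown, that filtering happens while reading rather than after the full dataset is loaded into pandas.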
With Airflow
from smallcat import Catalog
catalog = Catalog.from_airflow_variable("example_catalog")
df = catalog.load_pandas("bar")
Docs
Read more at the official docs.