Intake plugin for specifying a file-path pattern which can represent a number of different entries
Intake Pattern Catalog
intake-pattern-catalog is a plugin for Intake which allows you to specify a file-path pattern which can represent a number of different entries.
Note that this is different from the patterns you can write with the csv driver, which get turned into a single entry.
Set driver: pattern_cat on a source to use this driver in your catalogs.
Consider the following list of files in an S3 bucket:
- bucket-name/folder/a_1.csv
- bucket-name/folder/b_1.csv
- bucket-name/folder/c_1.csv
- bucket-name/folder/a_2.csv
- bucket-name/folder/b_2.csv
And the following catalog definition yaml file:
---
metadata:
  version: 1
sources:
  stuff:
    description: Stuff and things
    driver: pattern_cat
    args:
      urlpath: "s3://bucket-name/folder/{foo}_{bar}.csv"
      driver: csv
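To illustrate how one pattern can stand for several entries, here is a minimal sketch that matches the urlpath pattern against the file list above using only the standard library. It is not the plugin's implementation; the helper names `pattern_to_regex` and `kwarg_sets` are hypothetical, and the sketch assumes each `{field}` matches any run of characters that does not contain a slash.

```python
import re

# Pattern and file list taken from the example above (with the s3:// prefix added).
PATTERN = "s3://bucket-name/folder/{foo}_{bar}.csv"
FILES = [
    "s3://bucket-name/folder/a_1.csv",
    "s3://bucket-name/folder/b_1.csv",
    "s3://bucket-name/folder/c_1.csv",
    "s3://bucket-name/folder/a_2.csv",
    "s3://bucket-name/folder/b_2.csv",
]


def pattern_to_regex(pattern):
    """Hypothetical helper: turn a '{name}' path pattern into a regex with named groups."""
    # With one capture group, re.split alternates literal text and field names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)           # literal text between fields
        else:
            regex += f"(?P<{part}>[^/]+?)"     # assumption: a field never spans a '/'
    return re.compile(f"^{regex}$")


def kwarg_sets(pattern, paths):
    """Hypothetical helper: one dict of field values per path that matches the pattern."""
    rx = pattern_to_regex(pattern)
    return [m.groupdict() for m in map(rx.match, paths) if m]


for kwargs in kwarg_sets(PATTERN, FILES):
    print(kwargs)
```

Each matching file yields one combination of `foo` and `bar`, and the extracted values are strings; paths that do not fit the pattern are skipped.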
Catalog API
Access entry by kwargs:
> catalog.stuff.get_entry(foo='a', bar=1)
sources:
  foo_a_bar_1:
    args:
      storage_options:
        use_listings_cache: false
      urlpath: s3://bucket-name/folder/a_1.csv
    description: ''
    driver: intake.source.csv.CSVSource
    metadata:
      catalog_dir: ...
Note that this entry could also be accessed with catalog.stuff.foo_a_bar_1.
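The entry name foo_a_bar_1 appears to be built from the kwargs by joining each key and value with underscores. The sketch below reproduces that naming for the example; the function `entry_name` is hypothetical and the rule is inferred from this single example, not taken from the plugin's source.

```python
def entry_name(**kwargs):
    """Hypothetical helper: join key/value pairs as key_value_key_value..."""
    # Assumption: pairs appear in the order the fields occur in the pattern.
    return "_".join(f"{key}_{value}" for key, value in kwargs.items())


print(entry_name(foo="a", bar=1))
```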
See all valid kwarg combinations:
> catalog.stuff.get_entry_kwarg_sets()
[
{"foo": "a", "bar": "1"},
{"foo": "b", "bar": "1"},
{"foo": "c", "bar": "1"},
{"foo": "a", "bar": "2"},
{"foo": "b", "bar": "2"},
]
Caching
The default way to control caching with a pattern catalog is the ttl (in seconds), an optional value under args. It specifies how long the catalog waits after fetching the list of files that match the pattern before it fetches that list again. The default ttl is 60 seconds.
If you want to force it to always get the latest list of available entries, set the ttl to 0.
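The ttl behaviour described above can be sketched as a small time-based cache around a listing function. This is an illustration under stated assumptions, not the plugin's code: the class `TTLListing`, its `get` method, and the injectable `clock` parameter are all hypothetical names introduced here.

```python
import time


class TTLListing:
    """Hypothetical helper: cache a file listing and refresh it once ttl seconds have passed."""

    def __init__(self, list_files, ttl=60, clock=time.monotonic):
        self._list_files = list_files  # callable that returns the current file list
        self.ttl = ttl                 # seconds; 60 mirrors the documented default
        self._clock = clock            # injectable so the behaviour can be tested
        self._cached = None
        self._loaded_at = None

    def get(self):
        now = self._clock()
        expired = self._loaded_at is None or now - self._loaded_at >= self.ttl
        if expired:
            # With ttl=0 this branch runs on every call, so the list is always fresh.
            self._cached = self._list_files()
            self._loaded_at = now
        return self._cached


listing = TTLListing(lambda: ["bucket-name/folder/a_1.csv"], ttl=60)
print(listing.get())
```

Using a monotonic clock avoids spurious refreshes (or missed ones) when the system wall-clock time is adjusted.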