Data collection manager
aswan
Collect and organize data into a T1 data lake and T2 tables. Named after the Aswan Dam.
Quickstart
Pre v1.0 laundry list
A few things will probably need to be separated from it:
- t2extractor
- unstructured JSON to tabular data automatically
- aswan.t2.extractor
- scheduler
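As a rough illustration of what "unstructured JSON to tabular data" could mean, here is a minimal sketch using only the Python standard library. It is not the aswan.t2.extractor API; the function names `flatten_record` and `to_table`, the dotted column naming, and the sample records are all assumptions made for this example.

```python
from __future__ import annotations

from typing import Any


def flatten_record(record: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into one level, joining keys with dots (hypothetical helper)."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, extending the column prefix.
            flat.update(flatten_record(value, prefix=f"{name}."))
        else:
            # Scalars and lists are kept as-is in a single cell.
            flat[name] = value
    return flat


def to_table(records: list[dict[str, Any]]) -> tuple[list[str], list[list[Any]]]:
    """Turn a list of JSON-like dicts into (columns, rows); missing cells become None."""
    flat_records = [flatten_record(r) for r in records]
    columns: list[str] = []
    for rec in flat_records:
        for col in rec:
            if col not in columns:
                # Preserve first-seen column order across records.
                columns.append(col)
    rows = [[rec.get(col) for col in columns] for rec in flat_records]
    return columns, rows


# Made-up sample records, for illustration only.
sample = [
    {"id": 1, "team": {"name": "A", "rank": 2}},
    {"id": 2, "odds": 1.5},
]
columns, rows = to_table(sample)
```

Records with different shapes end up in one table whose columns are the union of all flattened keys, which is one simple way to handle collected data that has no fixed schema.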
TODO
- dvc integration
- export to dataset template
- maybe part of the dataset
- cleanup requirements
- s3, scp for push/pull
- add verified invalid output that is not parsing error
- selective push / pull
- with possible nuking of remote archive
- cleaning local obj store (when envs blow up, IDE dies)
- parsing/connection error confusion
- also broken session thing
- conn session CPU requirement
- resource limits
- transferring / ignoring cookies
- lots of things with extractors
- template projects
- oddsportal
- updating thingy, based on latest match in season
- footy
- rotten
- boxoffice
- oddsportal
Download files
Source Distribution
aswan-0.2.0.tar.gz (37.9 kB)
Built Distribution
aswan-0.2.0-py3-none-any.whl (41.9 kB)