vestapol
vestapol is a Python package that loads data from the web and deploys a corresponding external table definition, so that the data can be queried using standard SQL.
"Vestapol" is an open D Major tuning for the guitar. It is named after a 19th-century composition distributed in some of the earliest instructional guides for guitar.
Usage
from google.cloud import bigquery

from vestapol.destinations.gcp_destination import GoogleCloudPlatform
from vestapol.web_resources.csv_resource import CSVResource

# Describe the remote CSV file to load.
nyt_covid_data_2022 = CSVResource(
    name="nyt_covid19_us_counties_2022",
    base_url="https://raw.githubusercontent.com/",
    endpoint="nytimes/covid-19-data/master/rolling-averages/us-counties-2022.csv",
    version="v0",
    skip_leading_rows=1,
)

# Load the data to the destination, then create the external table over it.
destination = GoogleCloudPlatform()
nyt_covid_data_2022.load(destination)
tablename = destination.create_table(nyt_covid_data_2022)

# Query the external table, filtering on this load's requested_at value.
client = bigquery.Client()
query = f"""
select date, state, county, cases_avg_per_100k
from `{tablename}`
where requested_at = '{nyt_covid_data_2022.requested_at}'
limit 5
"""
query_job = client.query(query)
for row in query_job.result():
    print(row)
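The `skip_leading_rows=1` argument in the usage example can be illustrated with a small standalone sketch. It assumes the argument has the same meaning as BigQuery's external table option of that name, that is, the first N rows of the file (here the header row) are not treated as data. The `data_rows` helper and the sample text are hypothetical and are not part of vestapol; the sample values are placeholders, not real NYT data.

```python
import csv
import io


def data_rows(csv_text: str, skip_leading_rows: int = 0) -> list[list[str]]:
    """Return the CSV rows that remain after skipping the first N rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    return rows[skip_leading_rows:]


# Made-up sample using the column names from the query above.
# The values are placeholders, not real NYT data.
SAMPLE = (
    "date,state,county,cases_avg_per_100k\n"
    "2022-01-01,StateA,CountyA,1.5\n"
    "2022-01-02,StateA,CountyA,2.0\n"
)

if __name__ == "__main__":
    # With skip_leading_rows=1 the header row is dropped and only data remains.
    for row in data_rows(SAMPLE, skip_leading_rows=1):
        print(row)
```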
Prerequisites
Installation of this project requires Poetry 1.2+ and Python 3.9+.
Older versions of Poetry can be updated by running:
poetry self update
poetry --version
Installation
Install vestapol and its dependencies by running:
poetry install
Testing
Run tests with the following command:
poetry run pytest
Environment Variables
GCS_BUCKET_NAME: the Google Cloud Storage bucket where data is loaded (e.g. inq-warehouse-waligob)
GCS_ROOT_PREFIX: the GCS prefix where data is loaded (e.g. data_catalog)
GBQ_PROJECT_ID: the BigQuery project identifier (e.g. inq-warehouse)
GBQ_DATASET_ID: the BigQuery dataset where external tables will be created (e.g. data_catalog_waligob)
GBQ_DATASET_LOCATION: the BigQuery dataset location (e.g. US)
GOOGLE_APPLICATION_CREDENTIALS: location of the GCS service account keyfile (e.g. ~/inq-warehouse-f0962a57089e-inf.json)
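Before running a load, it can help to confirm that these variables are set. The following is a minimal sketch using only the standard library; the `missing_env_vars` helper is hypothetical and not part of vestapol, and treating all six variables as required is an assumption.

```python
import os

# Variable names taken from the list above. Treating every one of them as
# required is an assumption, not something vestapol is known to enforce.
REQUIRED_ENV_VARS = (
    "GCS_BUCKET_NAME",
    "GCS_ROOT_PREFIX",
    "GBQ_PROJECT_ID",
    "GBQ_DATASET_ID",
    "GBQ_DATASET_LOCATION",
    "GOOGLE_APPLICATION_CREDENTIALS",
)


def missing_env_vars(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All expected environment variables are set.")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in tests.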
Publishing to PyPI
Instructions for pushing new versions of vestapol to PyPI:
- Update CHANGELOG.md. Include Additions, Fixes, and Changes.
- Update project version using either a valid PEP 440 string or a valid bump rule following Semantic Versioning.
poetry version <version string or bump rule>
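To make the bump rules concrete, here is a simplified sketch of how the `major`, `minor`, and `patch` rules change a version under Semantic Versioning. The `bump` function is hypothetical and is not Poetry's implementation; it assumes a plain MAJOR.MINOR.PATCH string and ignores pre-release rules and other PEP 440 forms.

```python
def bump(version: str, rule: str) -> str:
    """Apply a major, minor, or patch bump to a MAJOR.MINOR.PATCH version."""
    major, minor, patch = (int(part) for part in version.split("."))
    if rule == "major":
        # A major bump resets the minor and patch components.
        return f"{major + 1}.0.0"
    if rule == "minor":
        # A minor bump resets the patch component.
        return f"{major}.{minor + 1}.0"
    if rule == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unsupported bump rule: {rule!r}")


if __name__ == "__main__":
    for rule in ("patch", "minor", "major"):
        print(rule, bump("0.0.26", rule))
```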
Hashes for vestapol-0.0.26-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 55cd1e692fe79d5ff37b794996fd60f1a0c52cf58c970b799099b90449ff8c7a
MD5 | 5af6e8cebc97c1b8e1b265e3a390a56d
BLAKE2b-256 | 518b54e7ac3fb959b192b92ec34b462ae1db378232d19551c002a892acc16e6a
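A downloaded wheel can be checked against the SHA256 digest listed above. This is a minimal sketch using the standard library's `hashlib`; the helper names and the local file path in the usage note are assumptions for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("vestapol-0.0.26-py3-none-any.whl", "55cd1e69...")` with the full digest from the table should return True for an intact download, assuming the wheel was saved under that name in the current directory.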