neoval-py-utils

Python Utilities

Development

All development must take place on a feature branch, and a pull request is required; committing directly to main is not allowed. The automated workflow in this repo (using python-semantic-release) relies on Angular-style commit messages to update the package version and CHANGELOG. All commits must follow this format before a PR can be merged; if you prefer to develop without using this format for every commit, squash the non-Angular commits prior to merge. A PR may only be merged with the rebase and merge method, which ensures that only Angular-style commits end up on main.
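As an illustration of the commit format, here is a minimal sketch of a check for an Angular-style header (`type(scope): subject`). The function name `is_angular_commit` and the list of accepted types are assumptions for this example; the types actually accepted depend on the python-semantic-release configuration, which is not shown here.

```python
import re

# Assumed set of commit types; the real list depends on the
# python-semantic-release configuration used by the repo.
ANGULAR_TYPES = (
    "build", "chore", "ci", "docs", "feat",
    "fix", "perf", "refactor", "style", "test",
)

# Header shape: type, optional (scope), then ": " and a non-empty subject.
HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(ANGULAR_TYPES) + r")"
    r"(\((?P<scope>[^()\r\n]+)\))?"
    r": (?P<subject>\S.*)$"
)


def is_angular_commit(message: str) -> bool:
    """Return True if the first line of `message` looks like `type(scope): subject`."""
    lines = message.splitlines()
    return bool(lines) and HEADER_RE.match(lines[0]) is not None
```

A check like this could run locally before pushing, so that only the commits that need squashing are flagged.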

Upon merge to main, the deploy workflow will:

  • bump the version in pyproject.toml
  • update the CHANGELOG using all commits added
  • tag and release, if required
  • publish to PyPI
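The version bump above is derived from the commit messages. The sketch below shows the general idea under the usual Angular convention (a breaking change bumps major, `feat` bumps minor, `fix` or `perf` bumps patch). The function `next_version` is hypothetical, and it ignores configuration options that python-semantic-release may apply, such as special handling of 0.x versions.

```python
import re


def next_version(current: str, commit_messages: list[str]) -> str:
    """Compute the next semantic version from Angular-style commit messages."""
    major, minor, patch = (int(part) for part in current.split("."))
    level = 0  # 0 = no release, 1 = patch, 2 = minor, 3 = major

    for msg in commit_messages:
        header = msg.splitlines()[0] if msg.strip() else ""
        if "BREAKING CHANGE" in msg:
            level = max(level, 3)
        elif re.match(r"^feat(\([^)]*\))?: ", header):
            level = max(level, 2)
        elif re.match(r"^(fix|perf)(\([^)]*\))?: ", header):
            level = max(level, 1)

    if level == 3:
        return f"{major + 1}.0.0"
    if level == 2:
        return f"{major}.{minor + 1}.0"
    if level == 1:
        return f"{major}.{minor}.{patch + 1}"
    return current  # nothing release-worthy, so no tag or release
```

The last branch corresponds to the "if required" wording above: commits such as `docs:` or `chore:` alone would not trigger a release in this sketch.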

Getting Started

Prerequisites

TODO

Tests

For the integration tests to pass, you need to be authenticated with a Google Cloud project and have storage admin and BigQuery job permissions.

You can authenticate by setting the GOOGLE_APPLICATION_CREDENTIALS environment variable or by running gcloud auth application-default login.

Specify the GCP project with gcloud config set project <project-id>.
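Before running the integration tests, it can help to confirm which of the two authentication routes is active. The sketch below uses only the standard library; the helper `adc_source` is hypothetical, and the well-known file location assumes the default gcloud config directory on Linux or macOS (it differs on Windows or when the config directory is overridden).

```python
import os
from pathlib import Path
from typing import Optional


def adc_source(env=None, home=None) -> Optional[str]:
    """Return "env", "gcloud", or None depending on which credentials are found."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Route 1: a key file referenced by the environment variable.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path and Path(key_path).is_file():
        return "env"

    # Route 2: the file written by `gcloud auth application-default login`
    # (assumed default location on Linux/macOS).
    well_known = home / ".config" / "gcloud" / "application_default_credentials.json"
    if well_known.is_file():
        return "gcloud"

    return None
```

This only checks that a credentials file exists; it does not verify that the account has the storage and BigQuery permissions the tests need.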

Run unit and integration tests with poetry run task test.

To run the tests with coverage, use poetry run task test-with-coverage.

Usage

TODO: installation from PyPI

The examples below assume that neoval-py-utils is installed successfully as a dependency and that you have permissions for GCP storage and BigQuery.

Examples of usage

Export BQ datasets or queries to a DataFrame or GCS

from neoval_py_utils.exporter import Exporter
# Query a BigQuery table and return a Polars DataFrame. Results are cached and kept for 12 hours by default.
exporter = Exporter() # To use the cache, pass a path to the constructor, e.g. Exporter(cache_dir="./cache")
pl_df = exporter.export("SELECT word FROM `bigquery-public-data.samples.shakespeare` GROUP BY word ORDER BY word DESC LIMIT 3")

# `export` is aliased by the `<` operator. This gives the same result as above.
pl_df = exporter < "SELECT word FROM `bigquery-public-data.samples.shakespeare` GROUP BY word ORDER BY word DESC LIMIT 3"


# To export a whole table
al_pl_df = exporter.export("bigquery-public-data.samples.shakespeare")


# To export a BigQuery table to a parquet file in a GCP storage bucket. Returns a list of blobs.
blobs = exporter.bq_to_gcs("my-dataset.my-table")
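To illustrate how a method can be aliased by the `<` operator and how results can be cached with a time limit, here is a small self-contained sketch. `ToyExporter` and its `run_query` argument are hypothetical and are not the library's implementation; in particular, this sketch caches in memory, whereas the real `Exporter` takes a `cache_dir` path.

```python
import time
from typing import Callable, Dict, Tuple


class ToyExporter:
    """Toy stand-in showing an operator alias and a time-limited cache."""

    def __init__(self, run_query: Callable[[str], list],
                 ttl_seconds: float = 12 * 60 * 60):
        self._run_query = run_query
        self._ttl = ttl_seconds
        # Maps query text to (time stored, result).
        self._cache: Dict[str, Tuple[float, list]] = {}

    def export(self, query: str) -> list:
        now = time.monotonic()
        hit = self._cache.get(query)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh, skip the query
        result = self._run_query(query)
        self._cache[query] = (now, result)
        return result

    def __lt__(self, query: str) -> list:
        # `exporter < "SELECT ..."` calls this method, which delegates to export().
        return self.export(query)
```

With this shape, `exporter < query` and `exporter.export(query)` go through the same code path, so both benefit from the cache.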

Create In-process (Embedded) Databases

# Python CLI example to build an in-process db.
# The upload bucket is optional; if given, the in-process db is uploaded to the specified bucket.
poetry run python ipdb build <DBT_DATASET> <GCLOUD_PROJECT_ID> <DB_PATH> <CONFIG_PATH> --upload-bucket <UPLOAD_BUCKET>
# To run it locally in this repo, you can run
poetry run python neoval_py_utils/ipdb.py build samples bigquery-public-data tests/artifacts/in_process_db tests/resources/good.config.yaml

# To apply SQL templates after the in-process db is built
poetry run python ipdb prepare <DBT_DATASET> <GCLOUD_PROJECT_ID> <DB_PATH> <TEMPLATES_PATH>
# To run it locally in this repo, you can run
poetry run python neoval_py_utils/ipdb.py prepare samples bigquery-public-data tests/artifacts/in_process_db tests/resources/templates
# For more info you can run
poetry run python neoval_py_utils/ipdb.py --help # which will return
                                                                                                                                     
 Usage: ipdb.py [OPTIONS] COMMAND [ARGS]...                                                                                                                                                               
                                                                                                                                                                                                          
╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --install-completion          Install completion for the current shell.                                                                                                                                │
│ --show-completion             Show completion for the current shell, to copy it or customize the installation.                                                                                         │
│ --help                        Show this message and exit.                                                                                                                                              │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ build                           Build the in process database(s).                                                                                                                                      │
│ make-config                     Prints a default configuration to be used with the build command.                                                                                                      │
│ prepare                         Run scripts to add views/virtual tables/etc. to the database(s).                                                                                                       │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
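The command layout shown in the help output can be sketched with the standard library. The completion options in the help text suggest the real CLI is built with Typer; this example deliberately uses argparse instead so that it runs without extra dependencies. The argument names are taken from the placeholders in the commands above, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the three subcommands listed in the help output."""
    parser = argparse.ArgumentParser(prog="ipdb.py")
    sub = parser.add_subparsers(dest="command", required=True)

    build = sub.add_parser("build", help="Build the in process database(s).")
    for name in ("dbt_dataset", "gcloud_project_id", "db_path", "config_path"):
        build.add_argument(name)
    # Optional: where to upload the built database.
    build.add_argument("--upload-bucket", default=None)

    sub.add_parser(
        "make-config",
        help="Prints a default configuration to be used with the build command.",
    )

    prepare = sub.add_parser(
        "prepare",
        help="Run scripts to add views/virtual tables/etc. to the database(s).",
    )
    for name in ("dbt_dataset", "gcloud_project_id", "db_path", "templates_path"):
        prepare.add_argument(name)

    return parser
```

Parsing `["build", "samples", "bigquery-public-data", "db", "config.yaml"]` with this parser yields `command == "build"` and `upload_bucket is None`, mirroring the optional upload step.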
