Accessor to Google Cloud Storage and BigQuery

Project description

This software is released under the MIT License, see LICENSE.txt.

gcp-accessor is a wrapper around the Google Cloud Storage API and the Google BigQuery API that makes both easier to access.

The google-cloud-storage and google-cloud-bigquery libraries provided by Google have many sophisticated features, but Python application development typically needs only a small, frequently used subset of them.

gcp-accessor focuses on simple usage patterns and includes the selected features listed below.


  • bq_accessor
    • get_dataset
    • get_table_name
    • check_if_dataset_exists
    • check_if_table_exists
    • create_table_from_json
    • load_data_from_gcs
    • execute_query
  • gcs_accessor
    • get_blob
    • get_blob_list
    • get_uris_list
    • upload_csv_gzip
    • download_csv_gzip

Setup

Installation

pip install gcp-accessor

Set GOOGLE_APPLICATION_CREDENTIALS

export GOOGLE_APPLICATION_CREDENTIALS='full path to credential key json file'
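A missing or mistyped credential path is easy to overlook until the first API call fails. The following is a minimal sketch, using only the standard library, of a check that the variable is set and names an existing file. The helper name `credentials_file_configured` is hypothetical and is not part of gcp-accessor.

```python
import os


def credentials_file_configured(environ=os.environ):
    """Return True if GOOGLE_APPLICATION_CREDENTIALS names an existing file.

    Hypothetical helper for illustration; it does not validate the key itself.
    """
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    return bool(path) and os.path.isfile(path)


if __name__ == "__main__":
    if not credentials_file_configured():
        print("GOOGLE_APPLICATION_CREDENTIALS is unset or does not point to a file")
```

Passing a plain dict as `environ` makes the check easy to exercise without touching the real environment.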

Usage of Google Cloud Storage Accessor

Import gcp_accessor and create a Google Cloud Storage client

import gcp_accessor
gcs = gcp_accessor.GoogleCloudStorageAccessor()

Get a Binary Large Object (blob) from Google Cloud Storage (GCS)

gcs.get_blob('bucket_name', 'full path to file on gcs')

Get a list of blobs from GCS as an HTTPIterator object

gcs.get_blob_list('bucket_name', prefix=None, delimiter=None)

Get a list of URIs

gcs.get_uris_list('bucket_name', prefix=None, delimiter=None)
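Cloud Storage objects are addressed by URIs of the form `gs://bucket/object-name`, which is also the form BigQuery load jobs accept. The exact return value of `get_uris_list` is not stated here, so the sketch below only illustrates how such URIs are composed from a bucket name and object names. The helpers `to_gcs_uri` and `to_gcs_uris` are hypothetical and are not part of gcp-accessor.

```python
def to_gcs_uri(bucket_name, blob_name):
    """Compose a gs:// URI from a bucket name and an object name."""
    return f"gs://{bucket_name}/{blob_name}"


def to_gcs_uris(bucket_name, blob_names, prefix=""):
    """Compose URIs for the object names that start with the given prefix."""
    return [to_gcs_uri(bucket_name, name) for name in blob_names if name.startswith(prefix)]


if __name__ == "__main__":
    names = ["logs/2020/a.csv.gz", "logs/2020/b.csv.gz", "tmp/c.csv.gz"]
    for uri in to_gcs_uris("bucket_name", names, prefix="logs/"):
        print(uri)
```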

Upload a gzip object to GCS

gcs.upload_csv_gzip('bucket_name', 'full path to file on gcs', 'texts')

Download a gzip file from GCS and load it in memory

gcs.download_csv_gzip('bucket_name', 'full path to file on gcs')
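To make the idea of a gzip-compressed CSV handled in memory concrete, here is a minimal standard-library sketch of the round trip: CSV text is compressed to bytes, and the bytes are later decompressed and parsed without writing a file to disk. This is an illustration under the assumption that the payload is UTF-8 CSV text; it is not gcp-accessor's implementation, and the function names are hypothetical.

```python
import csv
import gzip
import io


def csv_text_to_gzip_bytes(text, encoding="utf-8"):
    """Compress CSV text into gzip bytes, entirely in memory."""
    return gzip.compress(text.encode(encoding))


def gzip_bytes_to_rows(payload, encoding="utf-8"):
    """Decompress gzip bytes and parse them as CSV rows, entirely in memory."""
    text = gzip.decompress(payload).decode(encoding)
    return list(csv.reader(io.StringIO(text)))


if __name__ == "__main__":
    payload = csv_text_to_gzip_bytes("id,name\n1,alice\n2,bob\n")
    for row in gzip_bytes_to_rows(payload):
        print(row)
```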

Usage of BigQuery Accessor

Import gcp_accessor and create a BigQuery client

import gcp_accessor
bq = gcp_accessor.BigQueryAccessor()

Get the datasets. If no datasets exist, an empty list is returned.

bq.get_dataset()

Get the table names. If no tables exist, an exception error message is returned.

bq.get_table_name('dataset_name')

Check whether a dataset exists in BigQuery and return True or False

bq.check_if_dataset_exists('dataset_name')

Check whether a table exists in BigQuery and return True or False

bq.check_if_table_exists('dataset_name', 'table_name')

Create a table in BigQuery based on a schema JSON file

bq.create_table_from_json('path_schema_file', 'dataset', 'table_name')
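The layout of the schema file is not described here. The sketch below assumes the standard BigQuery JSON schema format, a list of field objects with `name`, `type`, and `mode`, and writes such a file with the standard library. The field names and the helper `write_schema_file` are hypothetical examples.

```python
import json

# Assumed layout: the standard BigQuery JSON schema, one object per column.
EXAMPLE_SCHEMA = [
    {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "name", "type": "STRING", "mode": "NULLABLE"},
    {"name": "created_at", "type": "TIMESTAMP", "mode": "NULLABLE"},
]


def write_schema_file(fields, path):
    """Write a list of schema field dicts to a JSON file and return the path."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(fields, handle, indent=2)
    return path


if __name__ == "__main__":
    print(write_schema_file(EXAMPLE_SCHEMA, "schema.json"))
```

The resulting path would then be the first argument to `create_table_from_json`, assuming the library reads this format.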

Load data from GCS (the file must already be uploaded to GCS).

bq.load_data_from_gcs(
        'dataset_name',
        'uris',
        'table_name',
        location="US",
        skip_leading_rows=0,
        source_format="CSV",
        create_disposition="CREATE_NEVER",
        write_disposition="WRITE_EMPTY",
    )
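The `create_disposition` and `write_disposition` strings follow BigQuery's job configuration values: `CREATE_NEVER` requires the table to exist already, and `WRITE_EMPTY` fails if the table already contains data. A small sketch of validating these strings before submitting a load is shown below; the helper `check_load_options` is hypothetical and is not part of gcp-accessor.

```python
# Disposition values defined by BigQuery job configuration.
CREATE_DISPOSITIONS = {"CREATE_IF_NEEDED", "CREATE_NEVER"}
WRITE_DISPOSITIONS = {"WRITE_EMPTY", "WRITE_APPEND", "WRITE_TRUNCATE"}


def check_load_options(create_disposition="CREATE_NEVER", write_disposition="WRITE_EMPTY"):
    """Return the options as keyword arguments, or raise ValueError on a typo."""
    if create_disposition not in CREATE_DISPOSITIONS:
        raise ValueError(f"unknown create_disposition: {create_disposition!r}")
    if write_disposition not in WRITE_DISPOSITIONS:
        raise ValueError(f"unknown write_disposition: {write_disposition!r}")
    return {
        "create_disposition": create_disposition,
        "write_disposition": write_disposition,
    }


if __name__ == "__main__":
    # Create the table if needed and append to any existing rows.
    print(check_load_options("CREATE_IF_NEEDED", "WRITE_APPEND"))
```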

Execute a simple query, or a query with the options below

bq.execute_query(
        'query',
        location="US",
        timeout=30,
        page_size=0,
        project=None,
        allow_large_results=False,
        destination=None,
        destination_encryption_configuration=None,
        dry_run=False,
        labels=None,
        priority=None,
        query_parameters=None,
        schema_update_options=None,
        table_definitions=None,
        time_partitioning=None,
        udf_resources=None,
        use_legacy_sql=False,
        use_query_cache=False,
        write_disposition=None
    )

Note

Some argument names and their descriptions are taken from the documentation of 'Google Cloud Client Libraries for Python'. The explanation of each argument is written in the code.
