🐷 Multithread your data with Google BigQuery

🐷 PygQuery

Multi-threaded wrapper to read and write Pandas DataFrames with Google BigQuery without the hassle of the heavy BigQuery API.

PygQuery is multi-threaded by design: every SQL request runs in a thread of its own. This lets you send multiple requests in parallel and have the incoming data ready when you need it later.

Install

From the command line, just type:

pip install pygquery

Read Data

Let's import the reader class first:

from pygquery.bigquery import BigQueryReader

The class takes 3 arguments as input:

  1. request : a string containing your query, e.g. """SELECT * FROM myproject.dataset.table"""
  2. project : the name of the project you are reading data from
  3. api_key_path : the path to a Google service account key file; you can create one in the IAM tab of your GCP interface

Let's instantiate our data reader:

reader_dict = {
  'request' : """SELECT * FROM myproject.dataset.table""",
  'project' : 'myproject',
  'api_key_path' : 'folder/key.json'
}

# If there is any error in your query at the instantiation stage, BigQuery will let you know
my_request = BigQueryReader(**reader_dict)

Now you have an object ready to be launched. If the line of code above executes, you know that:

  1. There is no error in the SQL
  2. There is no credentials failure
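Since errors surface at instantiation, one option is to build all your readers up front and collect any failures before launching anything. The snippet below is only a sketch of that idea: `build_readers` and `StubReader` are hypothetical names, not part of PygQuery, and the stub merely imitates a constructor that rejects a bad query. The exception type PygQuery raises is not documented here, so the sketch catches `Exception` broadly.

```python
def build_readers(reader_cls, configs):
    """Instantiate one reader per config; return (readers, errors) keyed by name."""
    readers, errors = {}, {}
    for name, config in configs.items():
        try:
            readers[name] = reader_cls(**config)
        except Exception as exc:  # assumption: the real exception type is unknown
            errors[name] = exc
    return readers, errors


class StubReader:
    """Hypothetical stand-in that rejects an empty query at construction time."""

    def __init__(self, request, project, api_key_path):
        if not request.strip():
            raise ValueError("empty query")
        self.request = request


configs = {
    "good": {"request": "SELECT 1", "project": "myproject", "api_key_path": "folder/key.json"},
    "bad": {"request": "   ", "project": "myproject", "api_key_path": "folder/key.json"},
}
readers, errors = build_readers(StubReader, configs)
```

With the real package you would pass `BigQueryReader` in place of `StubReader`, and only start the readers once `errors` is empty.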

Let's fire up this object:

my_request.start()  # Launch the thread that downloads the data

# ... Do other things while the data is downloading, like launching another request ...

my_request.join()  # Tell Python to wait for the download to complete

my_data = my_request.data  # Get your data

Et voilà! You have your data as a Pandas DataFrame, ready to be crunched.

my_data.info()
my_data.head()
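The same start / join / data steps extend naturally to several requests at once. Here is a minimal sketch of that pattern, assuming each reader exposes `start()`, `join()` and a `data` attribute as shown above. `fetch_all` and `StubReader` are hypothetical names used for illustration; the stub is a plain `threading.Thread` that sleeps instead of querying BigQuery, so the example runs without credentials.

```python
import threading
import time


class StubReader(threading.Thread):
    """Hypothetical stand-in for a reader: a thread that stores its result in .data."""

    def __init__(self, label, delay):
        super().__init__()
        self.label = label
        self.delay = delay
        self.data = None

    def run(self):
        time.sleep(self.delay)  # simulates the download time
        self.data = f"rows for {self.label}"


def fetch_all(readers):
    """Start every reader, wait for all of them, and return their .data in order."""
    for reader in readers:
        reader.start()  # all downloads are now running in parallel
    for reader in readers:
        reader.join()  # block until each download has completed
    return [reader.data for reader in readers]


readers = [StubReader("table_a", 0.2), StubReader("table_b", 0.2)]
results = fetch_all(readers)
```

Starting every reader before joining any of them is what makes the downloads overlap; joining inside the first loop would run them one after another.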
