🐷 Multithread your data with Google BigQuery

Project description

🐷 PygQuery

Multi-threaded wrapper to read and write Pandas DataFrames with Google BigQuery, without the hassle of the heavy BigQuery API.

PygQuery is multi-threaded by design: every SQL request runs in a thread of its own. The advantage is that you can send several requests in parallel and have the incoming data ready for when you need it.
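
The thread-per-request idea can be sketched with the standard library alone. The class below is a hypothetical stand-in, not PygQuery's actual implementation: `SlowQuery`, `query_name` and `delay` are invented for illustration, and `time.sleep` plays the role of the network round trip. It assumes only what is stated above, namely that each request is a thread whose result is available once the thread has finished.

```python
import threading
import time


class SlowQuery(threading.Thread):
    """Hypothetical stand-in for a request that runs in its own thread."""

    def __init__(self, query_name, delay):
        super().__init__()
        self.query_name = query_name
        self.delay = delay
        self.data = None  # filled in by run()

    def run(self):
        time.sleep(self.delay)  # simulates waiting on a remote service
        self.data = f"rows for {self.query_name}"


# Start both requests first, then wait for both.
queries = [SlowQuery("orders", 0.2), SlowQuery("users", 0.2)]
started = time.perf_counter()
for query in queries:
    query.start()
for query in queries:
    query.join()
elapsed = time.perf_counter() - started

results = [query.data for query in queries]
```

Because the two waits overlap, `elapsed` should be close to a single delay rather than the sum of both, which is the benefit the paragraph above describes.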

Install

From the command line, run:

pip install pygquery

Read Data

Let's import the reader class first:

from pygquery.bigquery import BigQueryReader

BigQueryReader takes three arguments:

  1. request : the SQL query, as a string. E.g. """SELECT * FROM myproject.dataset.table"""
  2. project : the project you are reading data from, as a string
  3. api_key_path : the path to the Google service account key file. You can create one in the IAM tab of the GCP console
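
Since all three arguments are needed, it can help to check them before building the reader. The helper below is a minimal sketch and not part of PygQuery: `check_reader_args` and `REQUIRED_KEYS` are hypothetical names, and the only assumption is that the arguments are collected in a plain dict keyed by the three names listed above.

```python
from pathlib import Path

REQUIRED_KEYS = ("request", "project", "api_key_path")


def check_reader_args(args):
    """Return a list of problems found in a reader argument dict."""
    problems = [
        f"missing argument: {key}" for key in REQUIRED_KEYS if key not in args
    ]
    key_path = args.get("api_key_path")
    # Only check the file when a path was actually supplied.
    if key_path is not None and not Path(key_path).is_file():
        problems.append(f"key file not found: {key_path}")
    return problems


problems = check_reader_args({"request": "SELECT 1", "project": "myproject"})
```

An empty list means the dict has every expected key and the key file exists on disk; it says nothing about whether the query or the credentials are valid.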

Let's instantiate our data reader:

reader_dict = {
  'request' : """SELECT * FROM myproject.dataset.table""",
  'project' : 'myproject',
  'api_key_path' : 'folder/key.json'
}

# If there is any error in your query at the instantiation stage, BigQuery will let you know
my_request = BigQueryReader(**reader_dict)

Now you have an object ready to be launched. If the line of code above runs without raising an error, you know that:

  1. There is no error in the SQL
  2. There is no credentials failure
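
Because both kinds of failure surface when the object is created, the constructor call is the natural place to catch them. The sketch below is an illustration only: `try_build` and `AlwaysFails` are hypothetical names, and since the exception types raised by the reader are not documented here, the helper catches `Exception` broadly as an assumption.

```python
def try_build(reader_cls, **kwargs):
    """Return (reader, None) on success or (None, error) on failure."""
    try:
        return reader_cls(**kwargs), None
    except Exception as exc:  # assumption: exact exception types are unknown
        return None, exc


class AlwaysFails:
    """Hypothetical stand-in that fails at construction, like a bad query might."""

    def __init__(self, **kwargs):
        raise ValueError("syntax error in query")


reader, error = try_build(AlwaysFails, request="SELEC 1")
```

With the real class, the call would look like `try_build(BigQueryReader, **reader_dict)`, and a non-`None` error would mean the request should not be started.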

Let's fire up this object:

my_request.start()  # Launch the thread that downloads the data

# ... Do other things while the data is downloading, like launching another request ...

my_request.join()  # Tell Python to wait for the download to complete

my_data = my_request.data  # Get your data

Et voilà! You have your data in Pandas DataFrame format ready to be crunched.

my_data.info()
my_data.head()
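
When several requests are involved, the start, join and data steps can be wrapped in a small helper. This is a sketch under assumptions, not a PygQuery feature: `fetch_all` and `StubReader` are hypothetical names, and the helper relies only on the `start()`, `join()` and `data` members used above.

```python
import threading


def fetch_all(readers):
    """Start every reader, wait for all of them, and return their data by name."""
    for reader in readers.values():
        reader.start()
    for reader in readers.values():
        reader.join()
    return {name: reader.data for name, reader in readers.items()}


class StubReader(threading.Thread):
    """Hypothetical stand-in exposing the same start/join/data interface."""

    def __init__(self, payload):
        super().__init__()
        self.payload = payload
        self.data = None

    def run(self):
        self.data = self.payload


frames = fetch_all({"orders": StubReader([1, 2]), "users": StubReader([3])})
```

Starting every reader before joining any of them is what lets the downloads overlap; joining inside the first loop would run them one after another.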
