🐷 Multithread your data with Google BigQuery

🐷 PygQuery

Multi-threaded wrapper to read and write Pandas dataframes with Google BigQuery without the hassle of the heavy BigQuery API.

By design, PygQuery is multi-threaded: every SQL request runs in its own thread. The advantage is that you can launch multiple requests in parallel and wait for the data only when you need it.
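
The start-now, collect-later pattern described above can be sketched with Python's standard `threading` module. This is only an illustration, not PygQuery's actual implementation: `SlowQuery` and `run_in_parallel` are hypothetical names, the class sleeps instead of calling BigQuery, and its `data` attribute merely mirrors the attribute name used later in this README.

```python
import threading
import time


class SlowQuery(threading.Thread):
    """Hypothetical stand-in for a request object: one thread per query."""

    def __init__(self, label, delay):
        super().__init__()
        self.label = label
        self.delay = delay
        self.data = None  # filled in by run() once the "query" finishes

    def run(self):
        time.sleep(self.delay)  # simulates waiting on a remote query
        self.data = f"rows for {self.label}"


def run_in_parallel(queries):
    """Start every query first, then wait for all of them to finish."""
    for query in queries:
        query.start()
    for query in queries:
        query.join()
    return [query.data for query in queries]


results = run_in_parallel([SlowQuery("a", 0.2), SlowQuery("b", 0.2)])
```

Because both stand-in threads wait at the same time, the total wall time is close to the longest single delay rather than the sum of the delays.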

Install

From the command line, run:

pip install pygquery

Read Data

Let's import the module first:

from pygquery.bigquery import BigQueryReader

The class needs three arguments:

  1. request : Your SQL query as a string, e.g. """SELECT * FROM myproject.dataset.table"""
  2. project : The name of the project you are gathering data from
  3. api_key_path : The path to a Google service account key file; you can create one in the IAM section of the GCP console
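
Before handing these three arguments to the reader, it can help to catch obvious mistakes locally, such as a key path that does not exist. The helper below is a minimal sketch under that assumption; `build_reader_kwargs` is a hypothetical function that is not part of PygQuery, and it only checks the inputs without contacting BigQuery.

```python
from pathlib import Path


def build_reader_kwargs(request, project, api_key_path):
    """Return the three keyword arguments, failing early on obvious mistakes."""
    if not request.strip():
        raise ValueError("request must be a non-empty SQL string")
    if not project:
        raise ValueError("project must be a non-empty string")
    key_file = Path(api_key_path)
    if not key_file.is_file():
        # Only checks that the file exists, not that the key is valid.
        raise FileNotFoundError(f"service account key not found: {key_file}")
    return {
        "request": request,
        "project": project,
        "api_key_path": str(key_file),
    }
```

The returned dictionary has the same keys as the three arguments listed above, so it can be unpacked with `**` when creating the reader.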

Let's instantiate our data reader:

reader_dict = {
  'request' : """SELECT * FROM myproject.dataset.table""",
  'project' : 'myproject',
  'api_key_path' : 'folder/key.json'
}

# If there is any error in your query, BigQuery reports it here, at instantiation
my_request = BigQueryReader(**reader_dict) 

Now you have an object ready to be launched. If the line of code above runs without raising an error, you know that:

  1. There is no error in the SQL
  2. There is no credentials failure

Let's fire up this object:

my_request.start() # Launch the thread that downloads the data

# ... Do other things while the data is downloading, like launching another request ...

my_request.join() # Tell Python to wait for the download to complete

my_data = my_request.data # Get your data

Et voilà! Your data is now in a Pandas DataFrame, ready to be crunched.

my_data.info()
my_data.head()
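
If blocking indefinitely on `join()` is not desirable, the standard `threading.Thread.join` accepts a `timeout`, and `is_alive()` reports whether the thread is still running. The sketch below assumes the request object behaves like a regular `threading.Thread` (the `start()` and `join()` calls above suggest this, but it is not confirmed here). `wait_with_progress` is a hypothetical helper, and the sleeping worker is a stand-in for a download.

```python
import threading
import time


def wait_with_progress(thread, poll_seconds=0.05, max_polls=100):
    """Join a thread in short slices and return the number of polls used.

    Raises TimeoutError if the thread is still alive after max_polls slices.
    """
    for poll in range(1, max_polls + 1):
        thread.join(timeout=poll_seconds)  # returns on completion or after the timeout
        if not thread.is_alive():
            return poll
        # A real script could log progress or do other work here.
    raise TimeoutError("thread did not finish in time")


worker = threading.Thread(target=time.sleep, args=(0.12,))  # stand-in for a download
worker.start()
polls = wait_with_progress(worker)
```

Polling in short slices keeps the main thread responsive, at the cost of a small delay (at most one slice) before completion is noticed.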

Download files

Source Distribution

PygQuery-0.0.4.tar.gz (3.2 kB)

Uploaded Source

Built Distribution

PygQuery-0.0.4-py3-none-any.whl (7.4 kB)

Uploaded Python 3

File details

Details for the file PygQuery-0.0.4.tar.gz.

File metadata

  • Download URL: PygQuery-0.0.4.tar.gz
  • Size: 3.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.9.1

File hashes

Hashes for PygQuery-0.0.4.tar.gz
Algorithm Hash digest
SHA256 c8784d0ec22644cdce78228a2da24d8b9f7e1c738de26f7b83e72152d3596234
MD5 293f043b83eb338cf753652020a0f96e
BLAKE2b-256 0aea486b3b620ebe86f8b689d805c3bb645ee3af5bd71845227a22e950746b87

File details

Details for the file PygQuery-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: PygQuery-0.0.4-py3-none-any.whl
  • Size: 7.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.9.1

File hashes

Hashes for PygQuery-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 853bc538a9dd028870a0d757c2b888456f2b165d8007971beef72921a3a629c3
MD5 02bf5770971a65c1599c2dc5e1b20702
BLAKE2b-256 c6657fbf79b9d0ea751fddee296c9500c92e24d4d8858f6270b868262aa3fa9f
