BigQuery backup scripts - Spartez fork

bqup

bqup is a backup tool for BigQuery projects. It can export a BigQuery project's structure and source code while mimicking the hierarchy of datasets and tables.

How bqup works

For the full story of why we made bqup, check out our blog post!

Installation

bqup can be installed using pip.

$ pip install bqup

Alternatively, clone the repository and install from source.

$ git clone https://github.com/thinkingmachines/bqup.git
$ cd bqup
$ python3 setup.py install

Usage

Command line options

List the available options by running bqup --help.

bqup [-p PROJECT_ID] [-d TARGET_DIR] [-fvx]

Options:
  -p PROJECT_ID, --project PROJECT_ID  Project ID to load. If unspecified,
                                       defaults to current project in
                                       configuration.
  -d TARGET_DIR, --dir TARGET_DIR      The target directory where the project
                                       will be written. Defaults to current
                                       timestamp.
  -f --force                           Overwrite target directory if it exists.
  -v --verbose                         Print a summary of the loaded project.
  -x --schema                          Export table schemata as json.
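
As a rough illustration of how the options combine, the sketch below wraps bqup in a small function that writes each backup to a dated directory. The function name backup_project, the BQUP override variable, and the directory layout are hypothetical and not part of bqup; only the -p, -d, -f, -v and -x flags come from the help text above.

```shell
#!/bin/bash
# Sketch only: backup_project and BQUP are illustrative names, not part of bqup.
# Backs up one project into <base_dir>/<project_id>-<YYYYMMDD>.
backup_project() {
  local project_id="$1" base_dir="$2"
  local target_dir="${base_dir}/${project_id}-$(date +%Y%m%d)"
  # -f overwrites an existing target directory, -v prints a summary,
  # -x also exports table schemata as JSON.
  "${BQUP:-bqup}" -p "$project_id" -d "$target_dir" -fvx
}
```

Calling backup_project my-project /var/backups would run bqup against my-project. Setting BQUP=echo before the call prints the arguments instead of running the tool, which is a cheap way to preview the command.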

Development

  1. Set up gcloud to run with your personal account (aka run with scissors).

  2. Set up application-default credentials.

    $ gcloud auth application-default login
    
  3. Install wheel.

    $ pip3 install wheel
    
  4. Install bqup.

    $ pip3 install -e .

    Alternatively, install it in development mode using:

    $ python3 setup.py develop
    
  5. Run bqup (see Usage).

Production

Note: When deploying for a new GCP project, consider using Cloud Scheduler.

  1. Stop the Google Compute instance that will run bqup.
  2. Enable BigQuery in the instance's Cloud API access scopes.
  3. Start the instance.
  4. SSH into the instance.
  5. Authorize the instance's service account to read from the target BigQuery project.
  6. Install bqup via pip install bqup, optionally inside a virtual environment.
  7. Run bqup.
    • If bqup still doesn't work, check in IAM that the service account you are using has BigQuery read access.
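
Steps 1 to 3 and step 5 can also be done from the gcloud CLI rather than the console. The sketch below is one possible way to do it, not a tested recipe: the function names and the GCLOUD override variable are hypothetical, the bigquery scope alias and the roles/bigquery.dataViewer role are assumptions about what bqup needs, and --scopes sets the instance's scope list, so any other scopes the instance requires would have to be listed too.

```shell
#!/bin/bash
# Sketch only: function names and the GCLOUD override are illustrative.

# Stop the instance, set its access scopes to include BigQuery, start it again.
enable_bigquery_scope() {
  local instance="$1" zone="$2"
  local gcloud="${GCLOUD:-gcloud}"
  "$gcloud" compute instances stop "$instance" --zone="$zone" &&
  "$gcloud" compute instances set-service-account "$instance" --zone="$zone" --scopes=bigquery &&
  "$gcloud" compute instances start "$instance" --zone="$zone"
}

# Grant a service account read access on the target project.
# Assumption: roles/bigquery.dataViewer is sufficient; adjust to your policy.
grant_bigquery_read() {
  local project_id="$1" service_account="$2"
  "${GCLOUD:-gcloud}" projects add-iam-policy-binding "$project_id" \
    --member="serviceAccount:${service_account}" \
    --role=roles/bigquery.dataViewer
}
```

Running either function with GCLOUD=echo prints the gcloud arguments without changing anything, which helps when reviewing the commands before applying them.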

Setting up regular backups

  1. On the machine that will run your backups, set up your git config (username, email, the usual).

  2. Make a directory to use as the Git repository. For this example, let's use repo:

    $ mkdir repo
    $ cd repo
    $ git init
    
  3. Add the remote to the git repository (ideally a GCP repository). For this example, let's use google:

    $ git remote add google <url-to-remote-repository>
    
  4. Create a script called bqup.sh based on the following template. In our example, the repository is dedicated to backups, so we assume that our HEAD is the latest and simply push to master.

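    #!/bin/bash
    # Export the project into the repository and append bqup's output to a log file.
    <absolute-path-to-bqup> -p <project-id> -d <absolute-path-to-repo>/projects -fv >> <absolute-path-to-log-file>
    # Stop here if the repository directory is missing.
    cd <absolute-path-to-repo> || exit 1
    # Record the run time; this also guarantees there is a change to commit.
    date > last-updated.log
    git add .
    git commit -m "Automated bqup"
    git push <remote> <branch>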
    
  5. Add this script to your crontab to run as frequently as your heart desires.
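
For step 5, a crontab entry could look like the output of the sketch below. The helper name bqup_cron_line, the daily 02:00 schedule, and both paths are placeholders chosen for illustration; pick whatever schedule and locations suit your setup.

```shell
#!/bin/bash
# Sketch only: bqup_cron_line is an illustrative helper, not part of bqup.
# Prints a crontab line that runs the given script daily at 02:00 and
# appends both stdout and stderr to the given log file.
bqup_cron_line() {
  local script="$1" log="$2"
  printf '0 2 * * * %s >> %s 2>&1\n' "$script" "$log"
}
```

One way to install the entry is to append it to the existing crontab: (crontab -l 2>/dev/null; bqup_cron_line /home/me/bqup.sh /home/me/bqup-cron.log) | crontab -. Use absolute paths, since cron runs with a minimal environment.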

Distribution

Run make test to try a test upload.

Run make dist to upload a distribution.

Both of these will call make build, which rebuilds the package locally.

Contributing

If you wish to contribute, check out our contributing guide!

A list is maintained for all external contributors who have submitted pull requests that were subsequently approved. Users are allowed and encouraged to fork the project and submit pull requests and issues. We only request that contributions adhere to these guidelines:

The official maintainers in charge of responding to issues and merging pull requests are:

Contributors

Thanks to all these wonderful people who've helped out with bqup:

Jess
Ram
Pepe Berba
Tim Pron
Enzo
Ardie

Disclaimers

bqup is maintained on a best-effort basis:

  • No amount of official time is currently being dedicated to the regular maintenance of this project.
  • Thinking Machines does not make any guarantees about the quality of the software.

Thinking Machines reserves the right to:

  • refuse to resolve issues
  • close issues without resolution
  • request changes to pull requests
  • reject pull requests outright

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

bqup_spartez-0.0.7-py3-none-any.whl (10.5 kB view details)

Uploaded Python 3

File details

Details for the file bqup_spartez-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: bqup_spartez-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 10.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.7.3

File hashes

Hashes for bqup_spartez-0.0.7-py3-none-any.whl
Algorithm    Hash digest
SHA256       aef30c48eef0d2b8cd002e6f9e43c9ed7d4d113b6ce341bae1e0c6ff5e28e85a
MD5          744fce0fbd73aac540a3e455eacf2c24
BLAKE2b-256  cbfcbca2e61754f1f2ed33ae12677428ca91ef1a321bee4ec3fef5cb099d6d3d
