Skip to main content

Sync Airtable bases to a Postgres schemas in real time

Project description

Airtable Postgres Sync

The goal of this library is to provide an out-of-the-box solution for replicating an entire Airtable base in a Postgres schema. There are two modes of operation:

  • One-off-sync: This mode will replicate the Airtable base in the specified Postgres schema and then exit. This is useful for creating snapshots of the base for analysis or for storage as a backup.
  • Perpetual sync: This mode will replicate the Airtable base in the specified Postgres schema and then continue to watch for changes in the base. When a change is detected, the change will be applied to the Postgres schema. This is useful for creating a replica of the base that can be used for analysis in real time.

This library will produce a Postgres table and view for each of the tables in the specified Airtable base. The table will take the Airtable table id for its name and the field ids for its column names. The view will have the same name as the Airtable table and the column names will be the same as the Airtable column names. For most analysis use cases it makes sense to use the view as it is more readable, but for applications requiring robustness with respect to column name changes the table should be used.

Installation

To install the library, run the following command:

pip install airtable-pg-sync

Permissions

To use this library, you will need to create a personal access token in Airtable. This token will need to have the following scopes:

  • data.records:read
  • schema.bases:read
  • webhook:manage

You will also need to give the Postgres user that you are using read and write access to the schema you are syncing to.

Usage

To use the library, you will need to create a config file. The config file defines all the parameters that are needed to connect to Airtable and Postgres, as well as how your program will listen for changes. The file must be in YAML format and must contain the following fields:

AIRTABLE_PG_SYNC:
  REDUCED_MEMORY: # boolean, if true will use less memory but will be slower when initially syncing tables
  DB_HOST: # Postgres host
  DB_PORT: # Postgres port
  DB_USER: # Postgres user
  DB_PASSWORD: # Postgres password
  DB_NAME: # Postgres database name
  AIRTABLE_PAT: # Airtable personal access token
  LISTENER_PORT: # The port to listen for change notifications on
  WEBHOOK_URL: # The url that Airtable will send change notifications to
    REPLICATION_NAME_ONE: # Unique dummy identifier for the replication 
        BASE_ID: # Airtable base id to sync
        SCHEMA_NAME: # Postgres schema name
    REPLICATION_NAME_TWO: # Unique dummy identifier for the replication 
        BASE_ID: # Airtable base id to sync
        SCHEMA_NAME: # Postgres schema name

The library can be used in two ways:

  1. As a command line tool

To trigger a one-time sync, run the following command:

airtable-pg-sync one-time-sync --config /path/to/config.yml

To trigger a perpetual sync, run the following command:

airtable-pg-sync perpetual-sync --config /path/to/config.yml
  1. As a python library

To trigger a sync from within a python program, run the following code:

from airtable_pg_sync import Sync

Sync(config_path="/path/to/config.yml", perpetual=True / False).run()

Testing and Deployment

When testing this library for your use case the ngrok service is very useful. It allows you to listen for requests sent over the internet to your PC (ie the webhook POST requests).

For deployment, it is recommended that you run the library in an AWS EC2 or ECS type service. When using reduced memory mode, an instance with 0.25 vCPU and 0.5 GB of memory will be sufficient. WHen not using reduced memory mode, the instance size will depend on the size of your data set.

Bugs, Feature Requests, and Contributions

If you find a bug or have a feature request, please open an issue on GitHub. Any contributions are welcome and appreciated. If you would like to contribute, please open a pull request on GitHub.

Ideas for contributions:

  • Add support for other databases
  • Add support for Postgres -> Airtable sync

License

This library is licensed under the MIT License. See the LICENSE file

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

airtable_pg_sync-0.0.42.tar.gz (22.5 kB view details)

Uploaded Source

Built Distribution

airtable_pg_sync-0.0.42-py3-none-any.whl (28.8 kB view details)

Uploaded Python 3

File details

Details for the file airtable_pg_sync-0.0.42.tar.gz.

File metadata

  • Download URL: airtable_pg_sync-0.0.42.tar.gz
  • Upload date:
  • Size: 22.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.0

File hashes

Hashes for airtable_pg_sync-0.0.42.tar.gz
Algorithm Hash digest
SHA256 5e500b881b43fb4a98dc5c911d6fdf4fc64a9652a910545a0f94b1a66c3a7118
MD5 9a2c24370e3473bb96516e12660c6034
BLAKE2b-256 e25768f2f0fa4d65eb693288c86f3fc3598715b3e043a17ff6b767dfd6bea2ef

See more details on using hashes here.

File details

Details for the file airtable_pg_sync-0.0.42-py3-none-any.whl.

File metadata

File hashes

Hashes for airtable_pg_sync-0.0.42-py3-none-any.whl
Algorithm Hash digest
SHA256 9ded3d3b67deb637ba14afceab9356ab9f9e43e5f2cc4363a4f6f91b3db31804
MD5 eafef0a2ef1df0f42e72ff98febb5957
BLAKE2b-256 b877f106d6889b343c998edbcd02602ea562267e057a3c982da19f56075ce6eb

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page