PyAirbyte

PyAirbyte brings the power of Airbyte to every Python developer. PyAirbyte provides a set of utilities to use Airbyte connectors in Python.

Getting Started

Watch this Getting Started Loom video or run one of our Quickstart tutorials to see how you can use PyAirbyte in your Python code.

Secrets Management

PyAirbyte can auto-import secrets from the following sources:

  1. Environment variables.
  2. Variables defined in a local .env ("Dotenv") file.
  3. Google Colab secrets.
  4. Manual entry via getpass.

Note: You can also build your own secret manager by subclassing the CustomSecretManager implementation. For more information, see the airbyte.secrets.CustomSecretManager class definition.

Retrieving Secrets

import airbyte as ab

source = ab.get_source("source-github")
# set_config() takes the whole config as a single dictionary.
source.set_config(
   {
      "credentials": {
         "personal_access_token": ab.get_secret("GITHUB_PERSONAL_ACCESS_TOKEN"),
      }
   }
)

By default, PyAirbyte will search all available secrets sources. The get_secret() function also accepts an optional sources argument of specific source names (SecretSourceEnum) and/or secret manager objects to check.

By default, PyAirbyte will prompt the user for any requested secrets that are not provided via other secret managers. You can disable this prompt by passing allow_prompt=False to get_secret().

For more information, see the airbyte.secrets module.
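
The lookup behavior described above can be illustrated with a small standard-library sketch. It is not PyAirbyte's implementation: the function names lookup_secret and read_dotenv are hypothetical, the lookup order (environment, then .env file, then prompt) is an assumption based on the numbered list above, and the Google Colab source is left out.

```python
import getpass
import os
from pathlib import Path
from typing import Dict, Optional


def read_dotenv(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file (hypothetical helper)."""
    values: Dict[str, str] = {}
    if not path.is_file():
        return values
    for raw_line in path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def lookup_secret(
    name: str,
    *,
    dotenv_path: Path = Path(".env"),
    allow_prompt: bool = True,
) -> Optional[str]:
    """Return the first match for `name`, or None if nothing is found."""
    # 1. Environment variables are checked first (assumed order).
    if name in os.environ:
        return os.environ[name]
    # 2. Then a local .env file.
    dotenv_values = read_dotenv(dotenv_path)
    if name in dotenv_values:
        return dotenv_values[name]
    # 3. Finally, prompt the user unless prompting is disabled.
    if allow_prompt:
        return getpass.getpass(f"Enter value for {name}: ")
    return None
```

Calling lookup_secret("GITHUB_PERSONAL_ACCESS_TOKEN", allow_prompt=False) in this sketch returns None instead of prompting when neither source has the value, which mirrors the allow_prompt=False behavior described for get_secret().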

Secrets Auto-Discovery

If you have a secret matching an expected name, PyAirbyte will automatically use it. For example, if you have a secret named GITHUB_PERSONAL_ACCESS_TOKEN, PyAirbyte will automatically use it when configuring the GitHub source.

The naming convention for secrets is {CONNECTOR_NAME}_{PROPERTY_NAME}, for instance SNOWFLAKE_PASSWORD and BIGQUERY_CREDENTIALS_PATH.

PyAirbyte will also auto-discover secrets for interop with hosted Airbyte: AIRBYTE_CLOUD_API_URL, AIRBYTE_CLOUD_API_KEY, etc.
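
As a rough illustration of the naming convention, the sketch below derives the expected secret name from a connector name and a property name. The helper expected_secret_name is hypothetical, and stripping the "source-" or "destination-" prefix is an assumption inferred from the GITHUB_PERSONAL_ACCESS_TOKEN example; PyAirbyte's actual matching logic may differ.

```python
import re


def expected_secret_name(connector_name: str, property_name: str) -> str:
    """Build a {CONNECTOR_NAME}_{PROPERTY_NAME} style secret name (sketch)."""
    # Assumption: the "source-" / "destination-" prefix is not part of the name.
    short_name = re.sub(r"^(source|destination)-", "", connector_name)
    raw = f"{short_name}_{property_name}"
    # Replace anything that is not a letter or digit with "_" and upper-case it.
    return re.sub(r"[^A-Za-z0-9]+", "_", raw).upper()


print(expected_secret_name("source-github", "personal_access_token"))
# prints GITHUB_PERSONAL_ACCESS_TOKEN
print(expected_secret_name("source-snowflake", "password"))
# prints SNOWFLAKE_PASSWORD
```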

Connector compatibility

To make a connector compatible with PyAirbyte, the following requirements must be met:

  • The connector must be a Python package, with a pyproject.toml or a setup.py file.
  • In the package, there must be a run.py file that contains a run method. This method should read arguments from the command line, and run the connector with them, outputting messages to stdout.
  • The pyproject.toml or setup.py file must specify a command line entry point for the run method called source-<connector name>. This is usually done by adding a scripts section (such as [tool.poetry.scripts]) to the pyproject.toml file, or an entry_points section with a console_scripts entry to the setup.py file. For example:
[tool.poetry.scripts]
source-my-connector = "my_connector.run:run"

setup(
    ...
    entry_points={
        'console_scripts': [
            'source-my-connector = my_connector.run:run',
        ],
    },
    ...
)
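
To make the run.py requirement more concrete, here is a heavily simplified sketch of what such a file could look like. It uses only the standard library. The spec contents, the api_key property, and the decision to implement only the spec command are illustrative assumptions. Real Python connectors normally delegate this work to the Airbyte CDK rather than hand-writing it.

```python
import argparse
import json
from typing import Optional, Sequence

# Hypothetical, minimal connector specification used only for this sketch.
SPEC_MESSAGE = {
    "type": "SPEC",
    "spec": {
        "connectionSpecification": {
            "type": "object",
            "properties": {"api_key": {"type": "string"}},
        }
    },
}


def run(argv: Optional[Sequence[str]] = None) -> int:
    """Read command line arguments and write JSON messages to stdout."""
    parser = argparse.ArgumentParser(prog="source-my-connector")
    parser.add_argument("command", choices=["spec", "check", "discover", "read"])
    parser.add_argument("--config")
    parser.add_argument("--catalog")
    parser.add_argument("--state")
    args = parser.parse_args(argv)  # argv=None falls back to sys.argv[1:]

    if args.command == "spec":
        # One JSON document per line on stdout.
        print(json.dumps(SPEC_MESSAGE))
        return 0

    # Other commands are intentionally left out of this sketch.
    log = {
        "type": "LOG",
        "log": {
            "level": "ERROR",
            "message": f"'{args.command}' is not implemented in this sketch",
        },
    }
    print(json.dumps(log))
    return 1
```

With the entry point declared as shown above, running source-my-connector spec would call run() and print the spec message; the integer return value becomes the process exit code.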

To publish a connector to PyPI, specify the pypi section in the metadata.yaml file. For example:

data:
 # ...
 remoteRegistries:
   pypi:
     enabled: true
     packageName: "airbyte-source-my-connector"

Validating source connectors

To validate a source connector for compliance, use the airbyte-lib-validate-source script:

airbyte-lib-validate-source --connector-dir . --sample-config secrets/config.json

The script will install the Python package in the provided directory, and run the connector against the provided config. The config should be a valid JSON file, with the same structure as the one that would be provided to the connector in Airbyte. The script will exit with a non-zero exit code if the connector fails to run.

For a more lightweight check, the --validate-install-only flag can be used. This only checks that the connector can be installed and returns a spec; no sample config is required.
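
If you want to call the validation script from Python, for example in a CI job, a thin wrapper along these lines may help. The helper names build_validate_command and validate_connector are hypothetical. Only the flags mentioned above are used, and passing --validate-install-only together with --connector-dir is an assumption about how the flags combine.

```python
import subprocess
from typing import List, Optional


def build_validate_command(
    connector_dir: str, sample_config: Optional[str] = None
) -> List[str]:
    """Assemble the airbyte-lib-validate-source command line (sketch)."""
    command = ["airbyte-lib-validate-source", "--connector-dir", connector_dir]
    if sample_config is None:
        # No config available: fall back to the lightweight install-only check.
        command.append("--validate-install-only")
    else:
        command += ["--sample-config", sample_config]
    return command


def validate_connector(
    connector_dir: str, sample_config: Optional[str] = None
) -> bool:
    """Run the script and report success based on its exit code."""
    completed = subprocess.run(
        build_validate_command(connector_dir, sample_config), check=False
    )
    # The script exits with a non-zero code if the connector fails to run.
    return completed.returncode == 0
```

For example, validate_connector(".", "secrets/config.json") runs the full check, while validate_connector(".") only verifies installation and the spec.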

Contributing

To learn how you can contribute to PyAirbyte, please see our PyAirbyte Contributors Guide.

Frequently Asked Questions

1. Does PyAirbyte replace Airbyte? No.

2. What is the PyAirbyte cache? Is it a destination? Yes, you can think of it as a built-in destination implementation, but we avoid the word "destination" in our docs to prevent confusion with our certified destinations list.

3. Does PyAirbyte work with data orchestration frameworks like Airflow, Dagster, and Snowpark? Yes, it should. Please give it a try and report any problems you see. Also, drop us a note if it works for you!

4. Can I use PyAirbyte to develop or test when developing Airbyte sources? Yes, you can, but only for Python-based sources.

5. Can I develop traditional ETL pipelines with PyAirbyte? Yes. Just pick the cache type matching the destination - like SnowflakeCache for landing data in Snowflake.

6. Can PyAirbyte import a connector from a local directory that has Python project files, or does it have to be pip installed? Yes, PyAirbyte can use any local install that has a CLI - and will automatically find connectors by name if they are on PATH.

Changelog and Release Notes

For a version history and list of all changes, please see our GitHub Releases page.
