Typo is the intelligent data quality barrier for enterprise information systems. The Typo tap retrieves results and data from the Typo platform.
tap-typo
Singer tap that extracts data from the Typo platform. The tap produces JSON-formatted data output following the Singer spec.
Usage
This section describes the basic usage of tap-typo through an example data extraction from a Typo dataset. It assumes that you already have a Typo account with an existing repository and a dataset. If you do not, see the Typo registration and setup section.
Installation
Python 3 is required. It is recommended to create a separate virtual environment for each tap or target, as there may be incompatibilities between dependency versions.
> pip install tap-typo
Create a configuration file
The config file (usually config.json) is a JSON file describing the tap's settings.
The following sample configuration can be used as a starting point:
{
"api_key": "my_apikey",
"api_secret": "my_apisecret",
"cluster_api_endpoint": "https://cluster.typo.ai/management/api/v1",
"repository": "my_repository",
"dataset": "my_dataset",
"audit_id": "audit_id",
"output_rfc3339_datetime": false
}
Please note: the dataset and audit_id parameters are optional. When they are not specified, tap-typo automatically runs in sync mode using the selected datasets from the catalog.
- api_key, api_secret and cluster_api_endpoint can be obtained by logging into the Typo Console, clicking on your username, and then on My Account.
- repository and dataset are the names of the Typo repository and dataset to read from. audit_id is optional and should only be provided when syncing data from an audit.
- Additionally, a records_per_page parameter can be provided to override the number of records requested at once, and a record_limit parameter can indicate the maximum number of records that will be obtained when the tap is executed.
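The settings above can be sanity-checked before running the tap. The following is a minimal sketch, not part of tap-typo: `check_config` is a hypothetical helper, the list of required keys is inferred from the note that only dataset and audit_id are optional, and the rule that records_per_page and record_limit are positive integers is an assumption.

```python
import json

# Keys treated as required here; "dataset" and "audit_id" are optional
# according to the note above. This list is an inference, not tap-typo's own check.
REQUIRED_KEYS = ("api_key", "api_secret", "cluster_api_endpoint", "repository")


def check_config(config):
    """Return a list of problems found in a tap-typo config dict (hypothetical helper)."""
    problems = [
        f"missing required key: {key}" for key in REQUIRED_KEYS if not config.get(key)
    ]
    # Assumption: the two paging parameters are positive integers when present.
    for key in ("records_per_page", "record_limit"):
        if key not in config:
            continue
        value = config[key]
        if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
            problems.append(f"{key} should be a positive integer")
    return problems


sample = json.loads("""
{
  "api_key": "my_apikey",
  "api_secret": "my_apisecret",
  "cluster_api_endpoint": "https://cluster.typo.ai/management/api/v1",
  "repository": "my_repository",
  "records_per_page": 100
}
""")
print(check_config(sample))  # prints []
```

To check a real file, load it with `json.load(open("config.json"))` and pass the result to `check_config`.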
Discovery mode
In discovery mode, tap-typo will infer the Singer Catalog from the config file and data in Typo. The output can be redirected to a file in order to be modified and used as input to tap-typo (see Catalog file section).
> tap-typo -c config.json -d > catalog.json
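Once catalog.json exists, it can be edited to choose which datasets are synced. The sketch below assumes the catalog follows the common Singer layout (a top-level "streams" list, with each stream carrying a "metadata" list whose empty-breadcrumb entry holds a "selected" flag). Whether tap-typo reads selection exactly this way is not confirmed by this document, and `select_streams` is a hypothetical helper.

```python
def select_streams(catalog, wanted):
    """Mark streams whose name is in `wanted` as selected, all others as not selected."""
    for stream in catalog.get("streams", []):
        name = stream.get("tap_stream_id") or stream.get("stream")
        entries = stream.setdefault("metadata", [])
        # The stream-level metadata entry is the one with an empty breadcrumb.
        top = next((e for e in entries if e.get("breadcrumb") == []), None)
        if top is None:
            top = {"breadcrumb": [], "metadata": {}}
            entries.append(top)
        top.setdefault("metadata", {})["selected"] = name in wanted
    return catalog


# Made-up catalog with two streams, used only to exercise the helper.
catalog = {
    "streams": [
        {"tap_stream_id": "dataset_a", "schema": {},
         "metadata": [{"breadcrumb": [], "metadata": {"selected": True}}]},
        {"tap_stream_id": "dataset_b", "schema": {}},
    ]
}
select_streams(catalog, {"dataset_b"})
for stream in catalog["streams"]:
    print(stream["tap_stream_id"], stream["metadata"][0]["metadata"]["selected"])
```

For a real run, load catalog.json with the `json` module, apply the helper, and write the result back before passing the file to tap-typo.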
Sync mode
Sync mode will fetch data from Typo and output it to stdout. Each record has two additional fields: __typo_result, which can have a value of Error or OK, and __typo_record_id, which indicates the record's internal ID in Typo. Before starting the sync, unless a custom Catalog file is provided, tap-typo will run discovery and build the catalog.
> tap-typo -c config.json
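Because the output is a stream of JSON messages, one per line, it can be inspected with a few lines of Python. The sketch below counts records by their __typo_result value. It assumes the usual Singer message shape (a "type" property and, for RECORD messages, a "record" object); `summarize_results` is a hypothetical helper and the sample lines are made up for illustration.

```python
import json
from collections import Counter


def summarize_results(lines):
    """Count RECORD messages by their __typo_result value (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message.get("type") != "RECORD":
            continue  # skip SCHEMA and STATE messages
        counts[message["record"].get("__typo_result", "missing")] += 1
    return counts


# Made-up sample of tap output, one Singer message per line.
sample_output = [
    '{"type": "SCHEMA", "stream": "s", "schema": {}, "key_properties": []}',
    '{"type": "RECORD", "stream": "s", "record": {"name": "a", "__typo_result": "OK", "__typo_record_id": 1}}',
    '{"type": "RECORD", "stream": "s", "record": {"name": "b", "__typo_result": "Error", "__typo_record_id": 2}}',
    '{"type": "RECORD", "stream": "s", "record": {"name": "c", "__typo_result": "OK", "__typo_record_id": 3}}',
]
print(dict(summarize_results(sample_output)))  # prints {'OK': 2, 'Error': 1}
```

To use it on live output, pass `sys.stdin` instead of the sample list and pipe the tap into the script.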
Saving state and resuming
Saving state messages
When tap-typo runs in Sync mode it will emit one STATE message for every RECORD message emitted. STATE messages contain a value JSON property with the state information.
A Singer target should output the contents of the value JSON property in a STATE message. By redirecting this target output to a file, the value property of each STATE message will be stored per line.
> tap-typo -c config.json | target-google-bigquery > state-history.txt
Creating a State file
To resume from a failed or terminated transfer, you will need to create a STATE file from the last line of the redirected output (state-history.txt in our example). The example command below creates a STATE file, state.json, from state-history.txt. You may edit this STATE file as necessary. The STATE file can then be used as input to tap-typo to resume.
tail -n 1 state-history.txt > state.json
Example STATE file:
{
"bookmarks": {
"tap-typo-repository-repo1-dataset-dataset1": {
"__typo_record_id": 26
}
}
}
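If the transfer was interrupted while the target was writing, the last line of state-history.txt may be empty or truncated. As an alternative to tail, the following sketch walks the history backwards and returns the last line that parses as a JSON object. `last_state` is a hypothetical helper, not part of tap-typo, and the sample history is made up.

```python
import json


def last_state(history_lines):
    """Return the last line that parses as a JSON object, or None if there is none."""
    for line in reversed(list(history_lines)):
        line = line.strip()
        if not line:
            continue
        try:
            state = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a line cut off when the transfer was interrupted
        if isinstance(state, dict):
            return state
    return None


# Made-up history whose final line was truncated mid-write.
history = [
    '{"bookmarks": {"tap-typo-repository-repo1-dataset-dataset1": {"__typo_record_id": 25}}}',
    '{"bookmarks": {"tap-typo-repository-repo1-dataset-dataset1": {"__typo_record_id": 26}}}',
    '{"bookmarks": {"tap-typo-repo',
]
print(json.dumps(last_state(history), indent=2))
```

With a real file, pass `open("state-history.txt")` to `last_state` and write the result to state.json with `json.dump`.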
Resuming with a State file
To resume from a State file, start tap-typo with the -s parameter followed by the path to the STATE file. tap-typo searches the bookmarks property for a key that matches the stream name. If one is found, tap-typo will try to resume from the location defined in the bookmark.
> tap-typo -c config.json -s state.json | target-google-bigquery > state-history.txt
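The bookmark lookup described above can be pictured as a plain dictionary lookup. This is a minimal sketch of the idea only: `resume_position` is a hypothetical helper, the stream name is taken from the example STATE file, and how tap-typo uses the ID internally is not covered by this document.

```python
def resume_position(state, stream_name):
    """Return the bookmarked __typo_record_id for a stream, or None if there is no bookmark."""
    bookmark = state.get("bookmarks", {}).get(stream_name)
    if not bookmark:
        return None
    return bookmark.get("__typo_record_id")


# State taken from the example STATE file in this document.
state = {
    "bookmarks": {
        "tap-typo-repository-repo1-dataset-dataset1": {"__typo_record_id": 26}
    }
}
print(resume_position(state, "tap-typo-repository-repo1-dataset-dataset1"))  # prints 26
print(resume_position(state, "some-other-stream"))  # prints None
```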
Catalog file
A catalog file can be provided by adding the --catalog parameter with a file path. tap-typo then skips the discovery process and uses the catalog from that file.
> tap-typo -c config.json --catalog catalog.json | target-google-bigquery > state-history.txt
Typo registration and setup
In order to create a Typo account, visit https://www.typo.ai/signup and follow the instructions.
Once registered you can log in to the Typo Console (https://console.typo.ai/) and go to the Repositories section to create a new Repository.
Next, you can start uploading data by using target-typo. A new dataset will be created automatically when data is submitted.
Development
To work on development of tap-typo, clone the repository, create and activate a new virtual environment, go into the cloned folder and install tap-typo in editable mode.
git clone https://github.com/typo-ai/tap-typo.git
cd tap-typo
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
Support
You may reach Typo Support at the email address support@ followed by the typo domain or see the full contact information at https://www.typo.ai.
Copyright © 2020 Stitch