Trifacta client

trifacta

A Trifacta client that makes it easy to integrate Trifacta into your production and data science workflows.

Usage Scenarios

  • Jupyter: Invoke Trifacta jobs from a Jupyter notebook and pass data back and forth between Jupyter and Trifacta
  • Other Notebooks: Integrate Trifacta with Azure Databricks, Zeppelin, or any other notebook-style interface that supports Python
  • Scripts: Automate Trifacta jobs and their input/output with Python scripts that can be run from the command line or called from an external scheduler
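
The script scenario above could look roughly like the following sketch. It relies only on the two calls shown later in this description, `trifacta.Client(url, token)` and `run_job(id)`; the option names (`--url`, `--token`, `--dataset-id`) and the `client_factory` parameter are hypothetical and exist only for illustration and testing. It also assumes that `run_job` returns a truthy value on success, as the `True` in the notebook output suggests.

```python
import argparse


def parse_args(argv):
    """Parse the command-line options for a scheduled Trifacta run."""
    parser = argparse.ArgumentParser(description="Run a Trifacta job")
    parser.add_argument("--url", required=True, help="Trifacta instance URL")
    parser.add_argument("--token", required=True, help="API access token")
    parser.add_argument("--dataset-id", type=int, required=True,
                        help="wrangled dataset id to run")
    return parser.parse_args(argv)


def main(argv, client_factory=None):
    """Run one job and return a process exit code (0 on success)."""
    args = parse_args(argv)
    if client_factory is None:
        # Imported lazily so the rest of the script can be exercised
        # without the trifacta package installed.
        import trifacta
        client_factory = trifacta.Client
    client = client_factory(args.url, args.token)
    # Assumption: run_job returns a truthy value when the job completes.
    return 0 if client.run_job(args.dataset_id) else 1


# In a script file, finish with:
#     import sys
#     sys.exit(main(sys.argv[1:]))
```

Returning an exit code lets an external scheduler tell a successful run from a failed one.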

Functionality

This library makes it simple to do the following:

  1. Connect to a Trifacta instance
  2. Run a job
  3. Download the results to a CSV file and view them in a pandas DataFrame

Note that file uploads and downloads go through Amazon S3 via the boto3 API.
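
The download direction is shown in Step 3 of this description; for the upload direction, a minimal sketch might look like this. The helper name `upload_input` and the bucket and key values in the usage comment are hypothetical placeholders. The only boto3 call used is the S3 client's `upload_file(Filename, Bucket, Key)`.

```python
def upload_input(s3_client, local_path, bucket, key):
    """Upload a local file to S3 and return its s3:// URI.

    s3_client is expected to be a boto3 S3 client, for example
    boto3.client('s3', region_name='us-west-2').
    """
    s3_client.upload_file(Filename=local_path, Bucket=bucket, Key=key)
    return f"s3://{bucket}/{key}"


# Usage (placeholder bucket and key, not taken from this project):
#     import boto3
#     s3 = boto3.client('s3', region_name='us-west-2')
#     upload_input(s3, 'input.csv', 'my-bucket', 'uploads/input.csv')
```

Passing the client in as an argument keeps credentials and region selection in one place.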

#!pip install trifacta
import trifacta

If you need an access token, you can generate it as follows:

#Step 1: Connect to Trifacta by providing the URL and API Access Token
t = trifacta.Client('http://partnerdemo.amer.trifacta.net:3005', 'YOUR_ACCESS_TOKEN')
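
To keep the access token out of notebooks and scripts, the URL and token can be read from environment variables before calling `trifacta.Client`. This is only a sketch: the variable names `TRIFACTA_URL` and `TRIFACTA_ACCESS_TOKEN` and the helper `load_trifacta_settings` are hypothetical and are not defined by the library.

```python
import os


def load_trifacta_settings(environ=None):
    """Return (url, token) read from environment variables.

    The variable names are an illustrative convention, not part of
    the trifacta package.
    """
    if environ is None:
        environ = os.environ
    names = ("TRIFACTA_URL", "TRIFACTA_ACCESS_TOKEN")
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return environ["TRIFACTA_URL"], environ["TRIFACTA_ACCESS_TOKEN"]


# Usage:
#     import trifacta
#     url, token = load_trifacta_settings()
#     t = trifacta.Client(url, token)
```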

Get the wrangled dataset id from the URL in the Trifacta UI
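
If you copy the URL from the browser, the id can be extracted with a small helper like the one below. This sketch assumes that the wrangled dataset id is the last numeric segment of the URL path; that layout is an assumption and may differ between Trifacta versions, so check the result against what you see in the UI. The helper name and the example URL are hypothetical.

```python
from urllib.parse import urlparse


def wrangled_dataset_id(url):
    """Return the trailing integer path segment of a URL.

    Assumption: the wrangled dataset id is the last path segment.
    """
    segments = [part for part in urlparse(url).path.split("/") if part]
    if not segments or not segments[-1].isdecimal():
        raise ValueError(f"No trailing numeric id in URL path: {url!r}")
    return int(segments[-1])


# Hypothetical URL layout, for illustration only:
print(wrangled_dataset_id("http://example.invalid:3005/data/7/23"))
```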

Make sure that you have run the job manually at least once.

Note the output path, and set the publishing action to "replace" so that each run writes to the same path.

#Step 2: Run the job
t.run_job(23)
About to run job
{'sessionId': '9d339e65-8898-4165-871b-b9db848dc099', 'reason': 'JobStarted', 'jobGraph': {'vertices': [76, 77], 'edges': [{'source': 76, 'target': 77}]}, 'id': 42, 'jobs': {'data': [{'id': 76}, {'id': 77}]}}
2020-02-25 11:19:58.508231 InProgress
2020-02-25 11:20:03.700189 InProgress
2020-02-25 11:19:58.508231 InProgress
2020-02-25 11:20:03.700189 InProgress
2020-02-25 11:20:08.887794 Complete

True
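
The dictionary printed in the output above is plain Python data, so its structure can be read off directly. The sketch below copies that dictionary verbatim and pulls out the ids and the dependency edges. Whether the library also returns this dictionary to the caller is not stated here (the notebook output shows `run_job` returning `True`), so treat this purely as a guide to reading the log; `summarize_job_response` is a hypothetical helper.

```python
# Response copied from the log output above.
response = {
    'sessionId': '9d339e65-8898-4165-871b-b9db848dc099',
    'reason': 'JobStarted',
    'jobGraph': {'vertices': [76, 77],
                 'edges': [{'source': 76, 'target': 77}]},
    'id': 42,
    'jobs': {'data': [{'id': 76}, {'id': 77}]},
}


def summarize_job_response(resp):
    """Collect the top-level id, the job ids, and the job dependency edges."""
    return {
        'id': resp['id'],
        'reason': resp['reason'],
        'job_ids': [job['id'] for job in resp['jobs']['data']],
        # Each edge means the source job runs before the target job.
        'edges': [(e['source'], e['target']) for e in resp['jobGraph']['edges']],
    }


summary = summarize_job_response(response)
print(summary['job_ids'], summary['edges'])
```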
%env AWS_PROFILE=trifacta_master_trial
env: AWS_PROFILE=trifacta_master_trial
#Step 3: Download results to a csv file and view in pandas dataframe
import boto3
s3 = boto3.client('s3', region_name='us-west-2')
s3.download_file(Bucket='trifacta-partnerdemo-trifactabucket-kkcpnw234feu',
                Key='trifacta/queryResults/admin@trifacta.local/MarketingAnalytics.csv',
                Filename='MarketingAnalytics.csv')
import pandas as pd
df = pd.read_csv('MarketingAnalytics.csv')
df.head()
   user_id            customerkey        event_type  event_subtype  Date        advertiser_id  creative_id
0  1126310400000-424  1126310400000-424  click       click          10-19-2005  164332         543027
1  1229126400000-20   1229126400000-20   click       click          08-17-2009  164332         252030
2  1126828800000-518  1126828800000-518  view        view           04-05-2006  164332         562765
3  1130112000000-336  1130112000000-336  click       click          04-05-2006  164332         466942
4  1121990400000-216  1121990400000-216  view        view           09-27-2005  164332         400316

   url                                                product_id         domain_url    ...
0  http://zdnet.com/praesent/lectus/vestibulum/qu...  1124064000000-475  zdnet         ...
1  http://hostgator.com/a/feugiat.js?pid=12331008...  1233100800000-528  hostgator     ...
2  http://fc2.com/convallis/duis/consequat/dui/ne...  1121904000000-509  fc2           ...
3  http://biblegateway.com/est/phasellus/sit/amet...  1130284800000-343  biblegateway  ...
4  https://zdnet.com/elementum/nullam/varius/null...  1108339200000-416  zdnet         ...

   customeraccount_number  customerphone  customeraddress            cusotmerstate  customerzipcode  customercountry
0  310170445527596         (817)718-7309  156 Cozy Berry Arc         CA             78710            USA
1  310150240507900         (469)201-1812  3641 Euismod Avenue        CA             10769            USA
2  310170133079761         (443)585-1769  Ap #543-7410 Accumsan Rd.  CA             92845            USA
3  310120073380564         (215)669-3055  900-8123 Aliquam Av.       CA             85517            USA
4  310160496868669         301 742 1112   164 Cozy Anchor Rd         CA             60101            USA

   socialmedia      totalsale  Outlier_Identifier  currencykey
0  deneleaf         7004.54    False               1
1  kinphanng        4853.35    False               1
2  waldeelbailarin  6885.15    False               1
3  charlrey         2593.31    False               1
4  scottylago       3958.25    False               1

5 rows × 31 columns
