Trifacta client
trifacta
Trifacta client that makes it easy to integrate Trifacta into your production and data science workflows
Usage Scenarios
- Jupyter: Invoke Trifacta jobs from a Jupyter notebook and pass data back and forth between Jupyter and Trifacta
- Other Notebooks: Integrate Trifacta with Azure Databricks, Zeppelin or any other notebook-style interface that supports Python
- Scripts: Automate Trifacta jobs and input/output using Python scripts that can be easily executed from the command line or called from an external scheduler
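For the script scenario, a minimal command-line wrapper might look like the sketch below. It assumes the `trifacta.Client(url, token)` constructor and the `run_job(dataset_id)` method shown later in this description, and that `run_job` returns a truthy value on success. The `TRIFACTA_TOKEN` environment variable and the argument names are illustrative and not part of the library.

```python
import argparse
import os
import sys


def parse_args(argv=None):
    """Parse the command-line arguments for a single job run."""
    parser = argparse.ArgumentParser(description="Run a Trifacta job from the command line.")
    parser.add_argument("dataset_id", type=int, help="wrangled dataset id to run")
    parser.add_argument("--url", required=True, help="base URL of the Trifacta instance")
    return parser.parse_args(argv)


def main(argv=None):
    """Run one job and return a process exit status (0 on success)."""
    args = parse_args(argv)
    # Hypothetical variable name: keeps the token out of shell history and source control.
    token = os.environ.get("TRIFACTA_TOKEN")
    if not token:
        print("TRIFACTA_TOKEN is not set", file=sys.stderr)
        return 2
    import trifacta  # imported late so argument errors surface without the package installed
    client = trifacta.Client(args.url, token)
    # Assumption: run_job returns a truthy value once the job completes successfully.
    return 0 if client.run_job(args.dataset_id) else 1
```

Saved as, say, `run_trifacta_job.py` with `sys.exit(main())` at the bottom, this could be invoked as `python run_trifacta_job.py 23 --url <instance URL>`. The exit status lets an external scheduler tell a failed run from a successful one.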
Functionality
This library makes it simple to do the following:
- Connect to a Trifacta instance
- Run a job
- Download results to a CSV file and view them in a pandas DataFrame
Note that file uploads and downloads go through Amazon S3, using the boto3 API.
#!pip install trifacta
import trifacta
If you need an access token, you can generate it as follows:
#Step 1: Connect to Trifacta by providing the URL and API Access Token
t = trifacta.Client('http://partnerdemo.amer.trifacta.net:3005', 'YOUR_ACCESS_TOKEN')
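Rather than pasting the token into a notebook or script, it can be read from the environment. The following is a small sketch of that idea; the helper name and the `TRIFACTA_URL` and `TRIFACTA_TOKEN` variable names are assumptions chosen for illustration, not something the library defines.

```python
import os


def load_trifacta_settings(env=None):
    """Return (url, token) read from an environment-style mapping."""
    env = os.environ if env is None else env
    # Hypothetical variable names; pick whatever fits your deployment.
    names = ("TRIFACTA_URL", "TRIFACTA_TOKEN")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["TRIFACTA_URL"], env["TRIFACTA_TOKEN"]
```

With both variables set, the client from Step 1 could then be created as `t = trifacta.Client(*load_trifacta_settings())`.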
Before running the job:
- Get the wrangled dataset id from the URL in the Trifacta UI
- Make sure that you have run the job manually at least once
- Note the output path, and set the output's publishing action to "replace" so that each run overwrites the same file
#Step 2: Run the job
t.run_job(23)
About to run job
{'sessionId': '9d339e65-8898-4165-871b-b9db848dc099', 'reason': 'JobStarted', 'jobGraph': {'vertices': [76, 77], 'edges': [{'source': 76, 'target': 77}]}, 'id': 42, 'jobs': {'data': [{'id': 76}, {'id': 77}]}}
2020-02-25 11:19:58.508231 InProgress
2020-02-25 11:20:03.700189 InProgress
2020-02-25 11:20:08.887794 Complete
True
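The output above suggests that `run_job` polls the job status roughly every five seconds until it reaches `Complete`. The library's internals are not shown here, so the following is only a generic sketch of that polling pattern. The `get_status` callable, the `Failed` and `Canceled` states, and the timeout are assumptions for illustration.

```python
import time
from datetime import datetime


def wait_for_job(get_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll get_status() until the job finishes; return True if it completed.

    get_status is a hypothetical zero-argument callable that returns a status
    string such as 'InProgress' or 'Complete'.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        print(datetime.now(), status)
        if status == "Complete":
            return True
        if status in ("Failed", "Canceled"):  # assumed terminal states
            return False
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        sleep(interval)
```

Passing `sleep` as a parameter keeps the loop easy to exercise in tests without real waiting, and the timeout guards against a job that never leaves `InProgress`.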
%env AWS_PROFILE=trifacta_master_trial
env: AWS_PROFILE=trifacta_master_trial
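`%env` is an IPython magic, so it is only available in Jupyter and IPython. In a plain Python script the same effect can be had by setting the variable through `os.environ` before the boto3 client is created. A small sketch, with a hypothetical helper name:

```python
import os


def use_aws_profile(profile_name):
    """Select the AWS named profile for this process; return the previous value, if any."""
    previous = os.environ.get("AWS_PROFILE")
    os.environ["AWS_PROFILE"] = profile_name
    return previous


# Must run before the first boto3 client or session is created.
use_aws_profile("trifacta_master_trial")
```

boto3 also accepts the profile explicitly through `boto3.Session(profile_name=...)`, which avoids changing process-wide state.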
#Step 3: Download results to a csv file and view in pandas dataframe
import boto3
s3 = boto3.client('s3', region_name='us-west-2')
s3.download_file(Bucket='trifacta-partnerdemo-trifactabucket-kkcpnw234feu',
                 Key='trifacta/queryResults/admin@trifacta.local/MarketingAnalytics.csv',
                 Filename='MarketingAnalytics.csv')
import pandas as pd
df = pd.read_csv('MarketingAnalytics.csv')
df.head()
| | user_id | customerkey | event_type | event_subtype | Date | advertiser_id | creative_id | url | product_id | domain_url | ... | customeraccount_number | customerphone | customeraddress | cusotmerstate | customerzipcode | customercountry | socialmedia | totalsale | Outlier_Identifier | currencykey |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1126310400000-424 | 1126310400000-424 | click | click | 10-19-2005 | 164332 | 543027 | http://zdnet.com/praesent/lectus/vestibulum/qu... | 1124064000000-475 | zdnet | ... | 310170445527596 | (817)718-7309 | 156 Cozy Berry Arc | CA | 78710 | USA | deneleaf | 7004.54 | False | 1 |
1 | 1229126400000-20 | 1229126400000-20 | click | click | 08-17-2009 | 164332 | 252030 | http://hostgator.com/a/feugiat.js?pid=12331008... | 1233100800000-528 | hostgator | ... | 310150240507900 | (469)201-1812 | 3641 Euismod Avenue | CA | 10769 | USA | kinphanng | 4853.35 | False | 1 |
2 | 1126828800000-518 | 1126828800000-518 | view | view | 04-05-2006 | 164332 | 562765 | http://fc2.com/convallis/duis/consequat/dui/ne... | 1121904000000-509 | fc2 | ... | 310170133079761 | (443)585-1769 | Ap #543-7410 Accumsan Rd. | CA | 92845 | USA | waldeelbailarin | 6885.15 | False | 1 |
3 | 1130112000000-336 | 1130112000000-336 | click | click | 04-05-2006 | 164332 | 466942 | http://biblegateway.com/est/phasellus/sit/amet... | 1130284800000-343 | biblegateway | ... | 310120073380564 | (215)669-3055 | 900-8123 Aliquam Av. | CA | 85517 | USA | charlrey | 2593.31 | False | 1 |
4 | 1121990400000-216 | 1121990400000-216 | view | view | 09-27-2005 | 164332 | 400316 | https://zdnet.com/elementum/nullam/varius/null... | 1108339200000-416 | zdnet | ... | 310160496868669 | 301 742 1112 | 164 Cozy Anchor Rd | CA | 60101 | USA | scottylago | 3958.25 | False | 1 |
5 rows × 31 columns
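Step 3 can be wrapped in a small helper for reuse in scripts. The sketch below calls the same `download_file` method used above on whatever S3 client it is given, and checks that a non-empty file arrived before it is handed to pandas. The helper name and the emptiness check are illustrative additions.

```python
import os


def download_result(s3_client, bucket, key, filename=None):
    """Download one S3 object to a local file and return the local path."""
    # Default to the object's base name, e.g. 'MarketingAnalytics.csv'.
    filename = filename or os.path.basename(key)
    s3_client.download_file(Bucket=bucket, Key=key, Filename=filename)
    if os.path.getsize(filename) == 0:
        raise ValueError(f"{key} downloaded as an empty file")
    return filename
```

With the `s3` client from Step 3, the DataFrame could then be loaded as `df = pd.read_csv(download_result(s3, bucket, key))`, where `bucket` and `key` hold the values from your own output path.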