Trifacta client
trifacta
A Trifacta client that makes it easy to integrate Trifacta into your production and data science workflows.
Usage Scenarios
- Jupyter: Invoke Trifacta jobs from a Jupyter notebook and pass data back and forth between Jupyter and Trifacta
- Other Notebooks: Integrate Trifacta with Azure Databricks, Zeppelin, or any other notebook-style interface that supports Python
- Scripts: Automate Trifacta jobs and input/output using Python scripts that can be executed from the command line or called from an external scheduler
Functionality
This library makes it simple to do the following:
- Connect to a Trifacta instance
- Run a job
- Download results to a pandas DataFrame, or download them as text/CSV
- Upload files to Trifacta
Note that file uploads and downloads are performed using HttpFS and require port 14000 to be open on the Trifacta server.
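Because uploads and downloads depend on port 14000, it can help to confirm that the port is reachable before running anything else. The following is a minimal sketch using only the Python standard library; the helper name `port_is_open` is hypothetical and is not part of the trifacta package.

```python
import socket


def port_is_open(host, port=14000, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

For example, `port_is_open('partnerdemo.trifacta.net')` would report whether the demo server used in this walkthrough accepts connections on port 14000. A `True` result only shows that the port accepts TCP connections, not that HttpFS is correctly configured behind it.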
#!pip install trifacta
import trifacta
#Step 1: Connect to Trifacta by providing the URL, username and password
t = trifacta.Client('https://partnerdemo.trifacta.net', 'userid@mydomain.com', 'mypassword')
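In scripts and shared notebooks you may prefer not to hard-code the password. The sketch below reads the same three values from environment variables and passes them to `trifacta.Client` in the order shown in Step 1. The variable names (`TRIFACTA_URL`, `TRIFACTA_USER`, `TRIFACTA_PASSWORD`) and both helper functions are assumptions made for illustration; the library does not define them.

```python
import os


def load_trifacta_settings(env=None):
    """Read connection settings from environment variables.

    The variable names are hypothetical; choose whatever suits your setup.
    """
    env = os.environ if env is None else env
    names = ("TRIFACTA_URL", "TRIFACTA_USER", "TRIFACTA_PASSWORD")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)


def connect(env=None):
    """Build a client using the same three arguments as Step 1."""
    import trifacta  # imported lazily so the helper above works without it

    url, user, password = load_trifacta_settings(env)
    return trifacta.Client(url, user, password)
```

With the variables exported in your shell or scheduler, `t = connect()` takes the place of the hard-coded call.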
Before running a job:
- Get the wrangled dataset id from the URL in the Trifacta UI
- Make sure that you have run the job manually at least once
- Note the output path (be sure to set it to "replace")
#Step 2: Run the job
t.run_job(14478)
About to run job
{'jobgroupId': 3926, 'jobIds': [7513, 7514], 'reason': 'JobStarted', 'sessionId': 'b9d327f0-8e19-11e8-8feb-9fabf204e996'}
2018-07-22 18:43:01.427594 InProgress
2018-07-22 18:43:06.791576 Complete
True
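The output above ends with `True` once the job reports `Complete`. When a script runs unattended, it may be useful to stop as soon as a job does not finish. The wrapper below is a sketch under one assumption that the source does not confirm: that `run_job` returns a falsy value when the job fails, rather than raising an exception itself. The name `run_job_or_raise` is hypothetical.

```python
def run_job_or_raise(client, dataset_id):
    """Run a job and raise if the client does not report success.

    Assumes client.run_job returns True on completion (as in the output
    above) and a falsy value otherwise; verify this for your version.
    """
    succeeded = client.run_job(dataset_id)
    if not succeeded:
        raise RuntimeError(f"Trifacta job for dataset {dataset_id} did not complete")
    return succeeded
```

Calling `run_job_or_raise(t, 14478)` behaves like Step 2, but a failed run would stop the script instead of letting later steps read stale output.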
#Step 3a: Get a pandas dataframe with the results
df = t.get_dataframe('/trifacta/queryResults/demo@trifacta.com/demo_output.csv')
df
| | Neighborhood | HouseStyle | row_count | sum_LotArea |
---|---|---|---|---|
0 | NAmes | 1Story | 159 | 1589811 |
1 | CollgCr | 1Story | 91 | 841644 |
2 | Gilbert | 2Story | 60 | 668112 |
3 | Timber | 1Story | 23 | 554694 |
4 | CollgCr | 2Story | 53 | 546602 |
5 | NridgHt | 1Story | 51 | 537687 |
6 | Sawyer | 1Story | 53 | 528438 |
7 | Edwards | 1Story | 53 | 511296 |
8 | NoRidge | 2Story | 33 | 485691 |
9 | NWAmes | 1Story | 35 | 403813 |
10 | ClearCr | 1Story | 11 | 395797 |
11 | Mitchel | 1Story | 32 | 394436 |
12 | Somerst | 1Story | 37 | 350820 |
13 | NWAmes | 2Story | 29 | 348885 |
14 | Somerst | 2Story | 49 | 323495 |
15 | NridgHt | 2Story | 26 | 300685 |
16 | OldTown | 2Story | 32 | 274465 |
17 | SawyerW | 1Story | 28 | 271008 |
18 | OldTown | 1.5Fin | 33 | 267283 |
19 | ClearCr | 1.5Fin | 6 | 266593 |
20 | Crawfor | 1Story | 19 | 260639 |
21 | SawyerW | 2Story | 25 | 255102 |
22 | NAmes | 2Story | 22 | 249793 |
23 | OldTown | 1Story | 33 | 240257 |
24 | Edwards | 1.5Fin | 22 | 228970 |
25 | Crawfor | 2Story | 20 | 222029 |
26 | NAmes | SLvl | 21 | 221177 |
27 | Edwards | 2Story | 14 | 185799 |
28 | Timber | 1.5Fin | 2 | 178418 |
29 | BrkSide | 1.5Fin | 25 | 172233 |
... | ... | ... | ... | ... |
66 | CollgCr | SLvl | 3 | 30135 |
67 | BrDale | 2Story | 16 | 28816 |
68 | Veenker | SLvl | 2 | 25757 |
69 | NoRidge | 1.5Fin | 2 | 25398 |
70 | SawyerW | SFoyer | 3 | 25267 |
71 | CollgCr | SFoyer | 3 | 24491 |
72 | MeadowV | 2Story | 8 | 19611 |
73 | NPkVill | 1Story | 4 | 17942 |
74 | Veenker | 2Story | 1 | 17542 |
75 | NAmes | 1.5Unf | 2 | 16827 |
76 | SWISU | 1Story | 2 | 14692 |
77 | OldTown | SFoyer | 2 | 14179 |
78 | NWAmes | 1.5Fin | 1 | 13837 |
79 | SawyerW | SLvl | 1 | 12800 |
80 | IDOTRR | 1.5Unf | 2 | 12449 |
81 | SawyerW | 1.5Fin | 1 | 12327 |
82 | Gilbert | 1.5Fin | 1 | 12134 |
83 | BrkSide | 2.5Unf | 1 | 11888 |
84 | Crawfor | 2.5Fin | 1 | 11526 |
85 | NPkVill | 2Story | 5 | 11465 |
86 | NWAmes | SFoyer | 1 | 10625 |
87 | Crawfor | 1.5Unf | 1 | 10594 |
88 | OldTown | 1.5Unf | 2 | 9888 |
89 | MeadowV | SFoyer | 6 | 9853 |
90 | SawyerW | 1.5Unf | 1 | 9000 |
91 | MeadowV | 1Story | 2 | 8448 |
92 | IDOTRR | 2.5Unf | 1 | 7200 |
93 | Crawfor | 2.5Unf | 1 | 7128 |
94 | Blueste | 2Story | 2 | 3250 |
95 | MeadowV | SLvl | 1 | 1596 |
96 rows × 4 columns
#Step 3b: Download results as text/csv
file_contents = t.get_file_contents('/trifacta/queryResults/demo@trifacta.com/demo_output.csv')
with open('demo_output.csv', 'w') as f:
    f.write(file_contents)
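The text from Step 3b can also be processed in memory without pandas. The sketch below uses the standard `csv` module to total `sum_LotArea` per neighborhood. It assumes the contents are a string with a header row, as in this walkthrough's demo output; the sample rows are copied from that output, and the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict


def lot_area_by_neighborhood(csv_text):
    """Sum sum_LotArea per Neighborhood from CSV text with a header row."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Every field arrives as a string, so convert before adding.
        totals[row["Neighborhood"]] += int(row["sum_LotArea"])
    return dict(totals)


# Sample rows copied from the demo output in this walkthrough.
sample = '''"Neighborhood","HouseStyle","row_count","sum_LotArea"
"NAmes","1Story","159","1589811"
"CollgCr","1Story","91","841644"
"Gilbert","2Story","60","668112"
"Timber","1Story","23","554694"
"CollgCr","2Story","53","546602"
'''

print(lot_area_by_neighborhood(sample))
```

In the sample, the two CollgCr rows combine to 1388246. In practice you would pass `file_contents` instead of `sample`.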
#Show the first few rows of the CSV file
!head demo_output.csv
"Neighborhood","HouseStyle","row_count","sum_LotArea"
"NAmes","1Story","159","1589811"
"CollgCr","1Story","91","841644"
"Gilbert","2Story","60","668112"
"Timber","1Story","23","554694"
"CollgCr","2Story","53","546602"
"NridgHt","1Story","51","537687"
"Sawyer","1Story","53","528438"
"Edwards","1Story","53","511296"
"NoRidge","2Story","33","485691"
#Step 4: Upload files to Trifacta
t.put_file_contents('/trifacta/uploads/demo_output.csv', file_contents)
True
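Step 4 uploads a string that is already in memory. To upload a file from local disk, one option is to read it first and pass the text to `put_file_contents`. This is a sketch under two assumptions: that the method accepts text content (as Step 4 suggests) and that it returns a falsy value when the upload fails. The helper name `upload_local_file` is hypothetical.

```python
from pathlib import Path


def upload_local_file(client, local_path, remote_path):
    """Read a local text file and upload it with put_file_contents."""
    contents = Path(local_path).read_text(encoding="utf-8")
    ok = client.put_file_contents(remote_path, contents)
    if not ok:
        # Assumes a falsy return value signals a failed upload.
        raise RuntimeError(f"Upload to {remote_path} was not confirmed")
    return ok
```

For example, `upload_local_file(t, 'demo_output.csv', '/trifacta/uploads/demo_output.csv')` would upload the file written in Step 3b.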