A Python SDK for interacting with datasets created on DataDistillr

DataDistillr Python SDK

This library lets you interact with DataDistillr programmatically, for example to pull data from DataDistillr for use in machine learning.

Installing the SDK

The DataDistillr Python SDK is available on PyPI. You can install it with pip as shown below:

pip install datadistillr

Methods

Datadistillr

  • get_dataframe(url, auth_token): Pulls your data and returns it in a Pandas DataFrame.
  • get_csv_from_api(url, auth_token, filename): Pulls your data and returns it in a CSV file.
  • get_json_from_api(url, auth_token, filename): Pulls your data and returns it in a JSON file.
  • get_parquet_from_api(url, auth_token, filename): Pulls your data and returns it in a parquet file.
  • get_excel_from_api(url, auth_token, filename): Pulls your data and returns it in an Excel file.
  • get_dict_from_api(url, auth_token, filename): Pulls your data and returns it in a Python dictionary.
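
Because the file-based methods above share the same argument order, the right one can be chosen from the output filename. The following is a minimal sketch, not part of the SDK: `EXPORT_METHODS` and `pick_export_method` are hypothetical names, and mapping `.xlsx` to the Excel method is an assumption.

```python
from pathlib import Path

# Maps a file extension to the name of a Datadistillr method listed above.
# The ".xlsx" entry is an assumption about the Excel format that is written.
EXPORT_METHODS = {
    ".csv": "get_csv_from_api",
    ".json": "get_json_from_api",
    ".parquet": "get_parquet_from_api",
    ".xlsx": "get_excel_from_api",
}


def pick_export_method(filename: str) -> str:
    """Return the name of the export method matching the file extension."""
    suffix = Path(filename).suffix.lower()
    try:
        return EXPORT_METHODS[suffix]
    except KeyError:
        raise ValueError(f"No export method for extension {suffix!r}") from None


# Hypothetical usage with the SDK (needs network access and valid credentials):
#   import datadistillr as ddr
#   method = getattr(ddr.Datadistillr, pick_export_method("sales.csv"))
#   method(url, auth_token, "sales.csv")
```

Returning the method name rather than the method itself keeps the helper usable without the SDK installed.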

DatadistillrAccount

  • logout(): Logs you out of your DataDistillr account.
  • get_projects(): Returns all projects in the DataDistillr account as a list of Project objects.
  • get_project_token_dict(): Returns a dictionary with project tokens as keys and project names as values.
  • get_project_token(project_name): Returns the project token that matches project_name.
  • get_project(project_token): Returns the Project object identified by project_token.
  • get_organizations(): Returns a list of the organizations that the DataDistillr account has access to.
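
Since get_project_token_dict() is keyed by token, finding projects by name means inverting the dictionary. The sketch below shows one way to do that in plain Python. `tokens_by_name` is a hypothetical helper, and the sample tokens and names are made up; it keeps a list per name on the assumption that two projects could share a name.

```python
def tokens_by_name(token_dict: dict) -> dict:
    """Invert a {token: name} dictionary into {name: [token, ...]}."""
    lookup = {}
    for token, name in token_dict.items():
        lookup.setdefault(name, []).append(token)
    return lookup


# Made-up sample in the shape described for get_project_token_dict().
sample = {"tok_a1": "Sales", "tok_b2": "Marketing", "tok_c3": "Sales"}
lookup = tokens_by_name(sample)
print(lookup["Sales"])  # prints ['tok_a1', 'tok_c3']
```

The same helper applies to the tab and data source token dictionaries, which are described with the same token-to-name layout.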

Project

Note: A tab in the DataDistillr user interface is equivalent to a query barrel in API routes and responses. All public functions use the term "tab", while all private functions use "query barrel".

  • get_tab_token_dict(): Returns a dictionary with tab tokens as keys and tab names as values.
  • get_tab_token(tab_name): Returns the tab token that matches tab_name.
  • execute_existing_query(tab_token): Executes the most recent query in the tab identified by tab_token.
  • execute_new_query(tab_name, query): Creates a new tab named tab_name and executes query in the new tab.
  • get_data_source_token_dict(): Returns a dictionary with data source tokens as keys and data source names as values.
  • get_data_source_token(data_source_name): Returns the data source token that matches data_source_name.
  • upload_files(data_source_token, file_paths): Uploads files to a data source. file_paths must be a list of absolute file path strings.

Getting your Endpoint URL and Authorization Token

See https://docs.datadistillr.com/ddr/ for complete documentation on obtaining the URL and Auth Token.

Usage

Using the SDK in Python code is quite simple. See the examples below:

Importing SDK

import datadistillr as ddr

Getting data from API Access Clients

url = "<Your URL From DataDistillr>"  # endpoint URL, as a string
auth_token = "<AUTH TOKEN>"  # authorization token, as a string
dataframe = ddr.Datadistillr.get_dataframe(url, auth_token)
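
To keep the URL and token out of source code, they can be read from the environment before calling get_dataframe. This is a minimal sketch under stated assumptions: the variable names `DDR_URL` and `DDR_AUTH_TOKEN` and the helper `load_credentials` are hypothetical and are not defined by the SDK.

```python
import os


def load_credentials(env=os.environ):
    """Read the endpoint URL and auth token from a mapping of environment variables."""
    # DDR_URL and DDR_AUTH_TOKEN are assumed names chosen for this sketch.
    url = env.get("DDR_URL")
    auth_token = env.get("DDR_AUTH_TOKEN")
    missing = [
        name
        for name, value in (("DDR_URL", url), ("DDR_AUTH_TOKEN", auth_token))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return url, auth_token


# Hypothetical usage with the SDK (needs network access and valid credentials):
#   import datadistillr as ddr
#   url, auth_token = load_credentials()
#   dataframe = ddr.Datadistillr.get_dataframe(url, auth_token)
```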

Logging in to a DataDistillr Account

email = "<Your Email linked to DataDistillr Account>"
password = "<Your Password>"

ddr_account = ddr.DatadistillrAccount(email, password)

Getting a project object

project_name = "<Name of project within DataDistillr Account>"

project_token = ddr_account.get_project_token(project_name)
project = ddr_account.get_project(project_token)

Executing an existing query from a tab within a project

tab_name = "<Name of tab within project>"

tab_token = project.get_tab_token(tab_name)
data_frame = project.execute_existing_query(tab_token)
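
The two calls above can be wrapped in a small helper that first checks the tab name against get_tab_token_dict(), so a misspelled name fails with a readable message. This is a sketch, not SDK code: `run_tab` is a hypothetical helper, and `FakeProject` is an offline stand-in that only mimics the three Project methods used here.

```python
def run_tab(project, tab_name):
    """Look up a tab by name and execute its most recent query."""
    known_names = set(project.get_tab_token_dict().values())
    if tab_name not in known_names:
        raise KeyError(f"No tab named {tab_name!r}; available: {sorted(known_names)}")
    tab_token = project.get_tab_token(tab_name)
    return project.execute_existing_query(tab_token)


class FakeProject:
    """Offline stand-in for a Project, used only to exercise run_tab."""

    def __init__(self, tabs):
        self._tabs = tabs  # {tab_token: tab_name}, same layout as get_tab_token_dict()

    def get_tab_token_dict(self):
        return dict(self._tabs)

    def get_tab_token(self, tab_name):
        return next(token for token, name in self._tabs.items() if name == tab_name)

    def execute_existing_query(self, tab_token):
        # The stand-in returns a string instead of query results.
        return f"result for {tab_token}"


fake = FakeProject({"tab_1": "Daily totals"})
print(run_tab(fake, "Daily totals"))  # prints "result for tab_1"
```

With a real Project object from get_project(), the same `run_tab(project, tab_name)` call would go through the SDK methods instead of the stand-in.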

Uploading files to a data source within a project

data_source_name = "<Name of data source within project>"
file_paths = ["<Absolute path of a file that you want to upload>"]  # list of absolute file path strings

data_source_token = project.get_data_source_token(data_source_name)
project.upload_files(data_source_token, file_paths)
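
Since upload_files expects absolute file path strings, relative paths can be converted before the call. The sketch below uses only the standard library; `to_absolute_paths` is a hypothetical helper, not part of the SDK, and it does not check that the files exist.

```python
import os


def to_absolute_paths(paths):
    """Return each path as a normalized absolute path string."""
    return [os.path.abspath(os.path.expanduser(str(p))) for p in paths]


# Relative paths are resolved against the current working directory.
file_paths = to_absolute_paths(["data.csv", "exports/../other.json"])

# Hypothetical usage with the SDK (needs a logged-in account and a project):
#   project.upload_files(data_source_token, file_paths)
```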
