API client for DataRefiner
DataRefiner Client Library is a Python API toolkit that connects your Python code to the DataRefiner platform, so you can upload data, train projects, and retrieve results programmatically.
Website: https://datarefiner.com
What functions does this library support?
- Login using API key
- Upload dataset to the platform
- Configure project settings before training
- Start training (you can track rendering progress)
- Embed the TDA map from DataRefiner directly in your Jupyter Notebook for analysis
- Export result data from the TDA: cluster labels for source data, parameter scores for segmentation, list of the most important features for all clusters, TDA coordinates
- Perform supervised labelling (predict cluster labels and groups from a trained topological project)
Usage example:
import pandas as pd
from datarefiner_client import DataRefinerClient
from datarefiner_client.services.project_settings import ProjectSettingsFactory, ProjectType
from datarefiner_client.exceptions import DatarefinerExploreDownloadsError
from dataclasses import asdict
from pprint import pprint as pp
API_TOKEN = "<api_token>" # get API token from your user profile page
API_BASE_URL = "https://app.datarefiner.com"
# Login using API key
datarefiner_api = DataRefinerClient(
    token=API_TOKEN,
    base_url=API_BASE_URL,
)
datarefiner_api.me()
# Loading new data from CSV file
df = pd.read_csv("./data.csv")
# Upload dataset to the platform
upload, project_settings = datarefiner_api.upload(df=df, title="Data", load_filedetails=True)
# Check the project settings generated automatically
pp(asdict(project_settings))
# Change the field mapping settings: overlay/learn/disabled.
project_settings.fields_config['1'].config = "overlay"
project_settings.fields_config['2'].config = "learn"
project_settings.fields_config['3'].config = "disabled"
# You can change the rest of the project settings; here are some examples:
project_settings.json_params.allow_noise_points = False
project_settings.json_params.beta = [45, 100, 200]
project_settings.json_params.clusterisation_type = 'kMeans'
project_settings.json_params.metric = ['euclidean', 'cosine']
# Perform rendering of the project
project_settings.name = "Create test project from API client"
project = datarefiner_api.create_project(project_settings=project_settings)
# Embed TDA map right in your Jupyter notebook
datarefiner_api.explore(project_id=project.id)
# Get assigned clusters for your source data
cluster_labels_df = datarefiner_api.get_cluster_labels(project_id=project.id)
# Get user-defined labels for your source data (and catch the exception if there are no groups defined for the project)
try:
    group_labels_df = datarefiner_api.get_group_labels(project_id=project.id)
    print(group_labels_df.groupby('GroupID').count())
except DatarefinerExploreDownloadsError as e:
    print(e)
# Get top parameters impacting the segmentation
parameter_scores_df = datarefiner_api.get_parameter_scores_for_segmentation(project_id=project.id)
# Get the list of the most important features for all clusters in one request
most_important_features_df = datarefiner_api.get_most_important_features_for_all_clusters(project_id=project.id)
# Get 2D and 3D TDA coordinates for your source data points (can be used in downstream tasks)
tda_coordinates_df = datarefiner_api.get_tda_coordinates(project_id=project.id)
# Perform prediction for new data (this example reuses the training data; in practice, pass new data in the same format)
clusters_df, groups_df = datarefiner_api.supervised_labeling(project_id=project.id, df=df)
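The example above hardcodes the API token and sets the field modes one attribute at a time. Below is a minimal sketch of two optional helpers for those steps. They are not part of datarefiner_client: load_api_token and set_field_modes are hypothetical names, and the DATAREFINER_API_TOKEN environment variable is an assumed convention. The only things taken from the example are that fields_config is keyed by strings, that each entry has a config attribute, and that the modes are overlay, learn, and disabled.

```python
import os

# The three field mapping modes named in the usage example.
ALLOWED_FIELD_MODES = ("overlay", "learn", "disabled")


def load_api_token(env_var="DATAREFINER_API_TOKEN", environ=None):
    """Read the API token from an environment variable (assumed name).

    `environ` defaults to os.environ; pass a dict to override it.
    """
    environ = os.environ if environ is None else environ
    token = environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Environment variable {env_var} is not set or empty")
    return token


def set_field_modes(fields_config, modes):
    """Validate every requested mode first, then apply them all.

    `fields_config` is assumed to behave like a mapping whose values
    expose a writable `config` attribute, as in the usage example.
    """
    for key, mode in modes.items():
        if key not in fields_config:
            raise KeyError(f"Unknown field key: {key!r}")
        if mode not in ALLOWED_FIELD_MODES:
            raise ValueError(f"Invalid mode {mode!r} for field {key!r}")
    # Only reached when every key and mode passed validation.
    for key, mode in modes.items():
        fields_config[key].config = mode
```

With these helpers, the login step could pass token=load_api_token(), and the three assignments could become set_field_modes(project_settings.fields_config, {"1": "overlay", "2": "learn", "3": "disabled"}). Because validation runs before any assignment, a typo in one mode leaves the settings unchanged.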