Datasets Summary Viewer. Enables the exploration of dataset search results in Jupyter Notebooks

DatasetsSummarizer

Datasets Summarizer is compatible with Jupyter Notebooks. To generate the similarity plot between datasets, it needs x and y values computed from any similarity metric. To generate the Detail View used to explore each dataset, it supports the metadata format produced by the datamart-profiler library.
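As an illustration of where such x and y values could come from, here is a minimal sketch that projects dataset titles to 2D from a word-overlap (Jaccard) similarity. It is not part of DatasetsSummarizer: the helper names (jaccard_similarity, centered, top_eigenpair, project_titles) are hypothetical, the projection is a small classical-MDS-style embedding written with the standard library only, and any other metric or projection method would work just as well.

```python
import math
import random


def jaccard_similarity(a, b):
    """Word-overlap similarity between two titles, in [0, 1]."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 1.0


def centered(matrix):
    """Double-center a symmetric matrix (subtract row/column means, add grand mean)."""
    n = len(matrix)
    row_means = [sum(row) / n for row in matrix]
    grand_mean = sum(row_means) / n
    return [[matrix[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def top_eigenpair(matrix, iterations=300, seed=0):
    """Dominant eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    n = len(matrix)
    rng = random.Random(seed)
    vec = [rng.random() for _ in range(n)]
    for _ in range(iterations):
        product = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(p * p for p in product))
        if norm < 1e-12:  # nothing left to project on
            return 0.0, [0.0] * n
        vec = [p / norm for p in product]
    product = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
    return sum(v * p for v, p in zip(vec, product)), vec


def project_titles(titles):
    """Return one (x, y) pair per title; similar titles land close together."""
    n = len(titles)
    gram = centered([[jaccard_similarity(a, b) for b in titles] for a in titles])
    axes = []
    for _ in range(2):
        value, vec = top_eigenpair(gram)
        scale = math.sqrt(max(value, 0.0))
        axes.append([scale * v for v in vec])
        # Deflate: remove the component just found before extracting the next axis.
        gram = [[gram[i][j] - value * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return list(zip(axes[0], axes[1]))


titles = ["NYC taxi trips 2019", "NYC taxi trips 2020", "Global temperature records"]
for title, (x, y) in zip(titles, project_titles(titles)):
    print(f"{title}: x={x:.3f}, y={y:.3f}")
```

The resulting pairs could then be stored in two dataframe columns (for example title_x and title_y) so that the plot can read them.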

System screen

(Click a dataset in the list of results to open the Detail View.)

Demo

Live demo (Google Colab):

In Jupyter Notebook:

import DatasetsSummarizer
data = DatasetsSummarizer.get_taxi_data()
DatasetsSummarizer.plot_datasets_summary(data)

Install

Install via pip:

pip install datasets-summarizer

Custom similarity metric

Use a subset of the similarity metrics, or add a new entry (x and y values) based on a different similarity metric. For example, here we add x and y values based on a similarity metric that uses a modified version of the titles. Note that modif_title_x and modif_title_y must be included in the dataframe.

new_similarity_metrics = [{'name': 'Title', 'x': 'title_x', 'y': 'title_y'},
                          {'name': 'ModifiedTitle', 'x': 'modif_title_x', 'y': 'modif_title_y'}
                         ]

Then pass the new similarity metrics as a parameter to the visualization:

DatasetsSummarizer.plot_datasets_summary(dataframe, new_similarity_metrics)
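Because every x and y column named in the metrics list must exist in the dataframe, it can help to check for missing columns before plotting. The sketch below is one possible check; missing_metric_columns is a hypothetical helper, not a function provided by DatasetsSummarizer, and the columns list stands in for list(dataframe.columns).

```python
def missing_metric_columns(metrics, columns):
    """Map each metric name to the x/y columns it references that are absent."""
    available = set(columns)
    missing = {}
    for metric in metrics:
        absent = [metric[axis] for axis in ("x", "y") if metric[axis] not in available]
        if absent:
            missing[metric["name"]] = absent
    return missing


new_similarity_metrics = [
    {"name": "Title", "x": "title_x", "y": "title_y"},
    {"name": "ModifiedTitle", "x": "modif_title_x", "y": "modif_title_y"},
]

# Stand-in for list(dataframe.columns); the modified-title columns are absent here.
columns = ["title", "title_x", "title_y"]

print(missing_metric_columns(new_similarity_metrics, columns))
# → {'ModifiedTitle': ['modif_title_x', 'modif_title_y']}
```

An empty result means every metric in the list has both of its columns available.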
