Datasets Summary Viewer: explore dataset search results in Jupyter Notebooks.

Project description

DatasetsSummarizer

DatasetsSummarizer works in Jupyter Notebooks. To generate the similarity plot between datasets, it needs x and y values for each dataset, computed with any similarity metric. To generate the Detail View for exploring each dataset, it supports the metadata format produced by the datamart-profiler library.
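
How the x and y values are computed is left to the user. As a minimal sketch, assuming dataset titles are the only available signal, the code below derives two coordinates per dataset from Jaccard distances to two reference ("pivot") titles. The helper names (tokens, jaccard_distance, two_pivot_coordinates) are hypothetical and not part of DatasetsSummarizer, and this two-pivot projection is a deliberately simple stand-in for a proper 2D projection such as MDS or t-SNE.

```python
def tokens(title):
    """Lowercase word set used as a crude representation of a title."""
    return set(title.lower().split())


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B|: 0.0 for identical sets, 1.0 for disjoint sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def two_pivot_coordinates(titles, pivot_x=0, pivot_y=-1):
    """Return one (x, y) pair per title.

    x is the distance to the title at index pivot_x,
    y is the distance to the title at index pivot_y.
    """
    sets = [tokens(t) for t in titles]
    px, py = sets[pivot_x], sets[pivot_y]
    return [(jaccard_distance(s, px), jaccard_distance(s, py)) for s in sets]


# Illustrative titles only; these are not taken from the package's sample data.
titles = ["NYC taxi trips 2019", "NYC taxi trips 2020", "Chicago weather daily"]
coords = two_pivot_coordinates(titles)
```

The resulting pairs could then be stored in two dataframe columns, for example title_x and title_y, so that similar datasets end up close together in the plot.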

System screen

(Click a dataset in the list of results to open the Detail View.)

Demo

Live demo (Google Colab):

In Jupyter Notebook:

import DatasetsSummarizer
data = DatasetsSummarizer.get_taxi_data()
DatasetsSummarizer.plot_datasets_summary(data)

Install

Install via pip:

pip install datasets-summarizer

Custom similarity metric

Use a subset of the similarity metrics, or add a new entry (x and y values) based on a different similarity metric. For example, here we add x and y values based on a similarity metric computed from a modified version of the titles. Note that the columns modif_title_x and modif_title_y must be present in the dataframe.

new_similarity_metrics = [{'name': 'Title', 'x': 'title_x', 'y': 'title_y'},
                          {'name': 'ModifiedTitle', 'x': 'modif_title_x', 'y': 'modif_title_y'}
                         ]

Then pass the new similarity metrics as a parameter to the visualization:

DatasetsSummarizer.plot_datasets_summary(dataframe, new_similarity_metrics)
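
Because every 'x' and 'y' name in the list must exist as a dataframe column, it may help to check the list before plotting. The following is a small sketch under that assumption: missing_metric_columns is a hypothetical helper, not part of the DatasetsSummarizer API, and the plain list of column names stands in for dataframe.columns.

```python
def missing_metric_columns(similarity_metrics, columns):
    """Return {metric name: [missing column names]} for incomplete entries."""
    available = set(columns)
    missing = {}
    for metric in similarity_metrics:
        # Each entry is expected to carry 'name', 'x' and 'y' keys.
        absent = [metric[key] for key in ("x", "y") if metric[key] not in available]
        if absent:
            missing[metric["name"]] = absent
    return missing


new_similarity_metrics = [
    {"name": "Title", "x": "title_x", "y": "title_y"},
    {"name": "ModifiedTitle", "x": "modif_title_x", "y": "modif_title_y"},
]

# Stand-in for list(dataframe.columns); modif_title_y is deliberately absent.
columns = ["title", "title_x", "title_y", "modif_title_x"]
problems = missing_metric_columns(new_similarity_metrics, columns)
```

An empty result means every metric entry has both of its columns available; otherwise the dictionary names the entries to fix or drop before calling plot_datasets_summary.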

Download files

Source Distribution

datasets-summarizer-0.1.2.tar.gz (2.6 MB, source)

Built Distribution

datasets_summarizer-0.1.2-py3-none-any.whl (2.7 MB, Python 3)

File details

Details for the file datasets-summarizer-0.1.2.tar.gz.

File metadata

  • Download URL: datasets-summarizer-0.1.2.tar.gz
  • Upload date:
  • Size: 2.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.25.1 setuptools/41.4.0 requests-toolbelt/0.8.0 tqdm/4.48.2 CPython/3.7.4

File hashes

Hashes for datasets-summarizer-0.1.2.tar.gz

Algorithm    Hash digest
SHA256       d0cce2b615d235bc33989fe2abb2b16dbdba565f08735bf35d83108a81e5f81a
MD5          a2df634397621a7cd35b0a3ea2d10d02
BLAKE2b-256  1a967ab6390dcb6a4258e4446e7455c8faa3c0759906e2122182507e233466f6

File details

Details for the file datasets_summarizer-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: datasets_summarizer-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 2.7 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.25.1 setuptools/41.4.0 requests-toolbelt/0.8.0 tqdm/4.48.2 CPython/3.7.4

File hashes

Hashes for datasets_summarizer-0.1.2-py3-none-any.whl

Algorithm    Hash digest
SHA256       c3bf08bcce779a26681fbf8cfd4d5bd8656d23c3a7aae61efabe1bff7145b70a
MD5          d4c114fff4e72d415c0f1769b52e2cdb
BLAKE2b-256  575ee52b5a43a37808ad56168386ea357abbd93eb6033de81aeaf32a1fe6f1d1