FEDEx Generator

Introduction

FEDEx Generator is a system that assists with EDA (Exploratory Data Analysis) sessions. Based on the FEDEx work (https://github.com/TAU-DB/FEDEx), it lets the user generate natural-language (NL) explanations and visualizations for the results of their queries (Filter/GroupBy/Join).

How it works

FEDEx Generator is forked from the FEDEx system and offers a new process for obtaining explanations:

  1. The user runs a query (filter/groupby/join) and passes FEDEx the input dataframe, the output dataframe, and the query parameters.
  2. FEDEx calculates an interestingness measure suited to the specific operation (for example, the Exceptionality measure for Filter and Join operations) for every column in the output dataframe (the query result).
  3. FEDEx finds the most interesting columns and partitions each of them into sets of rows.
  4. It then finds the set of rows that affects the interestingness measure result the most (from [2]).
  5. Finally, FEDEx takes the top columns and sets of rows and generates meaningful explanations.
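
Steps 2 to 4 above can be illustrated with a small, dependency-free sketch. This is not the fedex_generator implementation: plain lists of dicts stand in for dataframes, a Kolmogorov-Smirnov-style distance between the column's distribution before and after the query is assumed as a stand-in for the Exceptionality measure, and the function names (ks_statistic, column_scores, most_influential_partition) and the toy rows are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(before, after):
    """Largest gap between the empirical CDFs of two numeric samples."""
    a, b = sorted(before), sorted(after)
    if not a or not b:
        return 0.0
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in set(a) | set(b)
    )


def column_scores(rows_in, rows_out, columns):
    """Step 2: score each column by how far the query shifted its distribution."""
    return {
        col: ks_statistic([r[col] for r in rows_in], [r[col] for r in rows_out])
        for col in columns
    }


def most_influential_partition(rows_in, predicate, column, partition_of):
    """Steps 3-4: split rows into sets, then find the set whose removal
    lowers the column's score the most."""
    def score(rows):
        kept = [r for r in rows if predicate(r)]  # re-run the filter
        return ks_statistic([r[column] for r in rows], [r[column] for r in kept])

    baseline = score(rows_in)
    labels = {partition_of(r) for r in rows_in}
    influence = {
        label: baseline - score([r for r in rows_in if partition_of(r) != label])
        for label in labels
    }
    return max(influence, key=influence.get), influence


# Made-up rows (not real Spotify data).
rows = [
    {"year": 1970, "popularity": 30, "duration": 200},
    {"year": 1975, "popularity": 40, "duration": 220},
    {"year": 1980, "popularity": 50, "duration": 240},
    {"year": 1985, "popularity": 70, "duration": 210},
    {"year": 2000, "popularity": 60, "duration": 260},
    {"year": 2005, "popularity": 80, "duration": 230},
    {"year": 2010, "popularity": 90, "duration": 250},
    {"year": 2015, "popularity": 85, "duration": 270},
]


def is_popular(row):
    """The filter under explanation: popularity > 65."""
    return row["popularity"] > 65


output_rows = [r for r in rows if is_popular(r)]
scores = column_scores(rows, output_rows, ["year", "duration"])
top_column = max(scores, key=scores.get)

# Partition the top column by decade and rank the sets of rows.
best, influence = most_influential_partition(
    rows, is_popular, top_column, lambda r: r["year"] // 10 * 10
)
print(scores, top_column, best)
```

In this toy data the filter shifts the year distribution more than the duration distribution, so year is picked as the most interesting column, and the decade whose removal reduces that shift the most is reported as the influential set of rows.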

For the full details, you can either view the code or read the FEDEx article, which will be referenced here really soon :)

Example

The FEDEx example uses the Spotify dataset from Kaggle. The user's first operation was SELECT * FROM Spotify WHERE popularity > 65;

The raw output (Snip) -

Filter output

The generated explanation -

Filter explanation
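
For readers working in pandas, the filter above could be written roughly as follows. The rows are made up for illustration (they are not taken from the Spotify dataset), and the shape of query_parameters is only a hypothetical way to bundle the filter's attribute, operator, and value; it is not the fedex_generator API.

```python
import pandas as pd

# Made-up rows standing in for the Spotify dataset.
spotify = pd.DataFrame(
    {
        "name": ["track_a", "track_b", "track_c", "track_d"],
        "popularity": [40, 66, 65, 90],
    }
)

# SELECT * FROM Spotify WHERE popularity > 65;
filtered = spotify[spotify["popularity"] > 65]

# The three inputs an explanation needs: input dataframe, output dataframe,
# and the query parameters (hypothetical representation).
query_parameters = {"attribute": "popularity", "operator": ">", "value": 65}

print(filtered)
```

Note that the comparison is strict, so the row with popularity exactly 65 is excluded from the output dataframe.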

The user's second operation was SELECT AVG(danceability), AVG(loudness) FROM (SELECT * FROM Spotify WHERE year >= 1990) GROUP BY year;

The raw output (Snip) -

GroupBy output

The generated explanation -

GroupBy explanation
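
A rough pandas equivalent of this filter-then-group-by query is sketched below. The data is made up for illustration, and the column spelling danceability is an assumption about the dataset's schema.

```python
import pandas as pd

# Made-up rows standing in for the Spotify dataset.
spotify = pd.DataFrame(
    {
        "year": [1985, 1990, 1990, 2000, 2000],
        "danceability": [0.9, 0.4, 0.6, 0.5, 0.7],
        "loudness": [-12.0, -10.0, -8.0, -6.0, -4.0],
    }
)

# Inner query: SELECT * FROM Spotify WHERE year >= 1990
recent = spotify[spotify["year"] >= 1990]

# Outer query: average danceability and loudness per year
by_year = recent.groupby("year")[["danceability", "loudness"]].mean()

print(by_year)
```

Here recent plays the role of the input dataframe for the group-by step and by_year is its output dataframe.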

Usage

Note - This project was tested on Python versions 3.6-3.8.

First, install the requirements - py -3 -m pip install -r requirements.txt

Second, you should install LaTeX on your system (the explanations inside the graphs require it). Things will still work without LaTeX, but the experience might be a bit inferior.
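
A quick way to check both prerequisites is sketched below. It is a generic check using only the standard library, not part of fedex_generator; the function name environment_report is hypothetical, and it assumes the LaTeX executable is named latex and is discoverable on the PATH.

```python
import shutil
import sys


def environment_report():
    """Report whether Python is in the tested range and LaTeX is on the PATH."""
    return {
        # The project states it was tested on Python 3.6 to 3.8.
        "python_tested": (3, 6) <= sys.version_info[:2] <= (3, 8),
        # shutil.which returns None when no matching executable is found.
        "latex_found": shutil.which("latex") is not None,
    }


report = environment_report()
print(report)
```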

This fork was created to work on adjustments for an API that will allow users to use pandas and generate explanations without effort and without using an additional dedicated API.
