FEDEx Generator

Introduction

FEDEx Generator is a system that assists with EDA (Exploratory Data Analysis) sessions. Based on the FEDEx work (https://github.com/TAU-DB/FEDEx), it lets the user generate natural-language explanations and visualizations for the results of their queries (Filter/GroupBy/Join).

How it works

FEDEx Generator is forked from the FEDEx system and offers a new process for getting explanations:

  1. The user runs a query (filter/groupby/join) and passes FEDEx the input dataframe, the output dataframe and the query parameters.
  2. FEDEx calculates an Interestingness Measure (one that works well with the specific operation, for example the Exceptionality measure for Filter and Join operations) for every column in the output dataframe (the query result).
  3. FEDEx finds the most interesting columns and partitions them into sets of rows.
  4. It then finds the set of rows that affects the Interestingness Measure result the most (from [2]).
  5. Finally, FEDEx takes the top columns and sets of rows and generates meaningful explanations.
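
To make steps 2-4 more concrete, here is a minimal sketch in plain Python. It is not the FEDEx implementation: it assumes the Exceptionality measure can be approximated by a two-sample Kolmogorov-Smirnov statistic between a column's input and output values, and that a set of rows "affects the measure" by how much the score drops when those rows are removed from the input. The function names (`ks_statistic`, `column_scores`, `most_influential_group`) and the toy rows are hypothetical and exist only for this illustration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two samples (assumed stand-in for Exceptionality)."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        return 0.0
    gap = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def column_scores(rows_in, rows_out, columns):
    """Step 2 (sketch): score every column by how far its distribution moved."""
    return {
        c: ks_statistic([r[c] for r in rows_in], [r[c] for r in rows_out])
        for c in columns
    }


def most_influential_group(rows_in, predicate, score_col, group_col):
    """Steps 3-4 (sketch): partition rows by group_col and find the group
    whose removal from the input lowers the score the most."""
    def score(rows):
        kept = [r for r in rows if predicate(r)]
        return ks_statistic([r[score_col] for r in rows],
                            [r[score_col] for r in kept])

    base = score(rows_in)
    drops = {}
    for g in sorted({r[group_col] for r in rows_in}):
        remaining = [r for r in rows_in if r[group_col] != g]
        drops[g] = base - score(remaining)
    return max(drops, key=drops.get), drops


# Toy data, invented for this sketch (not taken from the Spotify dataset).
rows = [
    {"popularity": 80, "decade": "2010s", "loudness": -5, "duration": 200},
    {"popularity": 75, "decade": "2010s", "loudness": -6, "duration": 210},
    {"popularity": 70, "decade": "2010s", "loudness": -4, "duration": 220},
    {"popularity": 30, "decade": "1960s", "loudness": -15, "duration": 205},
    {"popularity": 20, "decade": "1960s", "loudness": -14, "duration": 215},
    {"popularity": 40, "decade": "1980s", "loudness": -10, "duration": 225},
    {"popularity": 68, "decade": "1980s", "loudness": -9, "duration": 230},
    {"popularity": 10, "decade": "1960s", "loudness": -16, "duration": 195},
]


def is_popular(row):
    return row["popularity"] > 65


filtered = [r for r in rows if is_popular(r)]
scores = column_scores(rows, filtered, ["loudness", "duration"])
top_column = max(scores, key=scores.get)
group, drops = most_influential_group(rows, is_popular, top_column, "decade")
print(top_column, group)
```

On this toy data the loudness column shifts more than duration, and removing the 1960s rows reduces the loudness score the most, so an explanation would point at that group. The real system may use different measures and partitioning strategies per operation.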

For the full details, you can either view the code or read the FEDEx article which will be referenced here really soon:)

Example

The FEDEx example uses the Spotify dataset from Kaggle. The user's first operation was SELECT * FROM Spotify WHERE popularity > 65;

The raw output (Snip) -

Filter output

The generated explanation -

Filter explanation
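
For readers working in pandas, the filter query above can be written roughly as follows. This is only a sketch: the tiny dataframe is invented for illustration (it is not the Kaggle data), and the `query_params` dictionary is a hypothetical way of recording the query parameters, not the package's API.

```python
import pandas as pd

# Invented stand-in for the Spotify dataframe.
spotify = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "popularity": [80, 40, 66, 65],
    "year": [2015, 1975, 1999, 1988],
})

# SELECT * FROM Spotify WHERE popularity > 65
popular = spotify[spotify["popularity"] > 65]

# The three things the process needs: input dataframe, output dataframe,
# and the query parameters (recorded here as a plain, hypothetical dict).
query_params = {"attribute": "popularity", "operation": ">", "value": 65}
print(popular)
```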

The user's second operation was SELECT AVG(dancability), AVG(loudness) FROM (SELECT * FROM Spotify WHERE year >= 1990) GROUP BY year;

The raw output (Snip) -

GroupBy output

The generated explanation -

GroupBy explanation
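
The group-by query above translates to pandas along these lines. Again, this is a sketch on invented data; the column names are copied from the query as written and may differ from the actual dataset.

```python
import pandas as pd

# Invented stand-in for the Spotify dataframe.
spotify = pd.DataFrame({
    "year": [1985, 1990, 1990, 2000, 2000],
    "dancability": [0.5, 0.6, 0.8, 0.4, 0.6],
    "loudness": [-12.0, -10.0, -8.0, -6.0, -4.0],
})

# Inner query: keep songs from 1990 onwards.
recent = spotify[spotify["year"] >= 1990]

# Outer query: average the two columns per year.
by_year = recent.groupby("year")[["dancability", "loudness"]].mean()
print(by_year)
```

Here `recent` is the input dataframe of the group-by step and `by_year` is its output.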

Usage

Note: this project was tested on Python versions 3.6-3.8.

First, install the requirements: py -3 -m pip install -r requirements.txt

Second, install LaTeX on your system (the explanations inside the graphs require it). Things will still work without LaTeX, but the experience might be somewhat inferior.
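
A quick way to check whether a LaTeX executable is visible on your PATH is sketched below. It assumes the binary is called `latex`, which is common but depends on your distribution; the helper name `latex_available` is hypothetical and not part of this package.

```python
import shutil


def latex_available(executable="latex"):
    """Return True if the given executable can be found on the PATH."""
    return shutil.which(executable) is not None


if latex_available():
    print("LaTeX found: graph explanations can use LaTeX rendering.")
else:
    print("LaTeX not found: things should still work, with plainer text.")
```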

This fork was created to work on adjustments for an API that lets users work with pandas and generate explanations effortlessly, without using an additional dedicated API.
