A friendly toolkit for quick exploratory data analysis (EDA)

🌌 DataNova

DataNova — a toolkit for data exploration in Python with a few lines of code!


🚀 Features

  • Instant profiling: Summarize your data with profile(df)
  • Bar Graph: Shows the top 5 most common values with bar(df, 'Column_Name'); pass top_n to change how many are shown
  • Histogram: Plots the distribution of numerical data hist(df, 'Column_Name')
  • Exploratory Data Analysis: Plots every column in the dataset with eda(df)
  • Simple Linear Regression: lm(df, x_var, y_var)

What's next?

  • Logistic Regression (coming soon!)

🧭 Examples

Examples below use a dataset provided in the package called WINE_DF

from datanova import *
profile( WINE_DF )
  • Most columns have no missing values; only 'price' has blanks (88 rows, about 5%).
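|   | Variable Name | Variable Type | Missing Count | % Blank | Unique Values | Most Frequent Value | Mean | Standard Deviation | Min | 25% | Median | 75% | Max |
|---|---------------|---------------|---------------|---------|---------------|---------------------|------|--------------------|-----|-----|--------|-----|-----|
| 0 | country | object | 0 | 0 | 24 | US | | | | | | | |
| 1 | province | object | 0 | 0 | 120 | California | | | | | | | |
| 2 | points | int64 | 0 | 0 | 13 | 90 | 89.55 | 2.32 | 84.0 | 88.0 | 90.0 | 91.0 | 96.0 |
| 3 | price | float64 | 88 | 5 | 110 | 20.0 | 38.71 | 29.39 | 7.0 | 20.0 | 30.0 | 48.0 | 500.0 |
| 4 | variety | object | 0 | 0 | 161 | Pinot Noir | | | | | | | |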
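
For readers curious how a summary like this can be assembled, here is a minimal sketch in plain pandas. It is not DataNova's implementation: `profile_sketch` and the toy DataFrame are hypothetical names and made-up data used only for illustration, and the sketch covers just a few of the columns shown above.

```python
import pandas as pd


def profile_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Build a small per-column summary (hypothetical helper, not DataNova's code)."""
    rows = []
    for name in df.columns:
        col = df[name]
        row = {
            "Variable Name": name,
            "Variable Type": str(col.dtype),
            "Missing Count": int(col.isna().sum()),
            # Share of missing rows, as a whole-number percentage.
            "% Blank": round(float(100 * col.isna().mean())),
            "Unique Values": int(col.nunique()),
            # mode() ignores missing values; guard against all-missing columns.
            "Most Frequent Value": col.mode().iloc[0] if col.notna().any() else None,
        }
        # Numeric statistics only make sense for numeric columns.
        if pd.api.types.is_numeric_dtype(col):
            row["Mean"] = round(float(col.mean()), 2)
            row["Median"] = float(col.median())
        rows.append(row)
    return pd.DataFrame(rows)


# Made-up data, only to exercise the sketch.
toy = pd.DataFrame({
    "country": ["US", "US", "France", "Italy"],
    "price": [20.0, None, 30.0, 20.0],
})
summary = profile_sketch(toy)
print(summary)
```

Non-numeric columns simply get empty numeric cells, which mirrors the blank cells for 'country', 'province', and 'variety' in the table above.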

bar( WINE_DF , 'province', top_n=7)
  • California accounts for 32% of total wine sales, and the top five regions collectively contribute over half of all sales.

Bar Graph Example
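
The percentages quoted for a bar graph like this one come down to value counts. The sketch below shows one way to compute the share of rows held by the most frequent values using plain pandas. `top_shares` is a hypothetical helper and the province list is made up; this is not how DataNova itself is implemented.

```python
import pandas as pd


def top_shares(series: pd.Series, top_n: int = 5) -> pd.Series:
    """Percent of rows held by each of the top_n most frequent values (hypothetical helper)."""
    # value_counts sorts by frequency, most common first.
    shares = series.value_counts(normalize=True).head(top_n)
    return (shares * 100).round(1)


# Made-up data: 10 rows across 4 provinces.
provinces = pd.Series(
    ["California"] * 4 + ["Tuscany"] * 3 + ["Bordeaux"] * 2 + ["Oregon"]
)
shares = top_shares(provinces, top_n=3)
print(shares)
print(f"Top 3 together: {shares.sum():.1f}%")
```

Note that these are shares of rows in the dataset, so any reading in terms of sales depends on what each row represents.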

hist( WINE_DF , 'price' , xlim = [0,105], n_bins = 25)
  • On average, a bottle of wine costs about $39 (mean $38.71). The middle 50% of prices fall between $20 and $48, while the full range runs from $7 to $500.

Histogram Example
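
To make the binning and the summary numbers concrete, here is a small NumPy sketch. It assumes that xlim and n_bins behave like the `range` and `bins` arguments of `numpy.histogram`, which is a guess about DataNova rather than something the package documents here. The prices are made up.

```python
import numpy as np

# Made-up prices, including one extreme value.
prices = np.array([7.0, 15.0, 20.0, 20.0, 30.0, 48.0, 60.0, 500.0])

# 25 equal-width bins between 0 and 105; values outside the range are not counted.
counts, edges = np.histogram(prices, bins=25, range=(0, 105))

mean = prices.mean()
q25, median, q75 = np.percentile(prices, [25, 50, 75])

print("bin width:", edges[1] - edges[0])
print("values inside the plotted range:", counts.sum(), "of", prices.size)
print("mean:", mean, "median:", median, "middle 50%:", (q25, q75))
```

A single extreme price pulls the mean well above the median, which is why the quartiles are often a better description of a typical bottle than the average alone.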

eda( WINE_DF )
  • Creates bar graphs and histograms for all columns in the dataset.
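
One plausible way to decide which plot each column gets is to branch on the column's dtype. The sketch below illustrates that idea with pandas; `plan_plots` is a hypothetical helper, the data is made up, and DataNova's actual rules may differ.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def plan_plots(df: pd.DataFrame) -> dict:
    """Map each column to a plot type based on its dtype (hypothetical helper)."""
    return {
        name: "histogram" if is_numeric_dtype(df[name]) else "bar graph"
        for name in df.columns
    }


# Made-up data with one text column and two numeric columns.
toy = pd.DataFrame({
    "province": ["California", "Tuscany", "California"],
    "points": [90, 88, 91],
    "price": [20.0, 35.0, 48.0],
})
plan = plan_plots(toy)
print(plan)
```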

Regression Modeling

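# Fit price as a function of points; lm returns the plot and the fitted model
figure, model = lm(WINE_DF, 'points', 'price', ylimit=[0, 150],
                   xtitle='Points', ytitle='Price ($)',
                   show_summary=False, alpha=0.8)

# display() is available in Jupyter/IPython notebooks
display(figure)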
  • Fits a simple linear regression of price on points and returns the figure together with the fitted model.
  • As a wine's quality score (points) increases, its price tends to increase as well.

Linear Model
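
For reference, a simple linear regression fits a straight line, price = slope × points + intercept, by least squares. The sketch below does this with `numpy.polyfit` on made-up numbers. It is independent of DataNova, whose lm function may use a different fitting library.

```python
import numpy as np

# Made-up data that lies exactly on the line price = 3 * points - 240.
points = np.array([84.0, 86.0, 88.0, 90.0, 92.0])
price = np.array([12.0, 18.0, 24.0, 30.0, 36.0])

# A degree-1 polynomial fit is an ordinary least-squares line.
slope, intercept = np.polyfit(points, price, deg=1)

# R-squared: the share of variance in price explained by the line.
predicted = slope * points + intercept
ss_res = ((price - predicted) ** 2).sum()
ss_tot = ((price - price.mean()) ** 2).sum()
r_squared = 1 - ss_res / ss_tot

print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.3f}")
```

A positive slope corresponds to the pattern described above: higher scores go with higher prices. Real data will not sit exactly on a line, so R-squared will be below 1.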


🛠️ Installation

pip install DataNova
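
After installing, one way to confirm that the package is visible to your interpreter is to query its metadata with the standard library. `installed_version` is a hypothetical helper written for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the install succeeded, otherwise None.
print(installed_version("DataNova"))
```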
