Data table overview: the first step in data analysis

Project description

Data overview tool

Currently supported: given a Pandas DataFrame, generate a table-level and field-level overview of that DataFrame in HTML format.

Current features:

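  1. Table overview

    • Number of rows, number of columns, and memory usage of the table
    • Number of fields of each data type
  2. Field overview

    • For each field: type, count, number of distinct values, number of missing values, missing rate, and so on
  3. Field details

    • Detailed information for each field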
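
The table-level and field-level statistics listed above can be approximated with plain pandas. The following is a minimal sketch and not the package's own implementation; the helper names `table_overview` and `field_overview` and the sample data are hypothetical, chosen only for illustration.

```python
import pandas as pd


def table_overview(df: pd.DataFrame) -> dict:
    """Table-level summary: shape, memory footprint, and column count per dtype."""
    return {
        "rows": len(df),
        "columns": df.shape[1],
        # deep=True also counts memory held by Python objects such as strings
        "memory_bytes": int(df.memory_usage(deep=True).sum()),
        "columns_per_dtype": {
            str(dtype): int(n) for dtype, n in df.dtypes.value_counts().items()
        },
    }


def field_overview(df: pd.DataFrame) -> pd.DataFrame:
    """Field-level summary: one row per column of df."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "count": df.count(),              # non-missing values
        "unique": df.nunique(),           # distinct non-missing values
        "missing": df.isna().sum(),       # missing values
        "missing_rate": df.isna().mean(), # share of missing values
    })


# Hypothetical sample data with one missing value per column
sample = pd.DataFrame({
    "city": ["Beijing", "Shanghai", None, "Beijing"],
    "sales": [10.0, 12.5, 8.0, None],
})

info = table_overview(sample)
fields = field_overview(sample)
print(info)
print(fields)
```

Returning the field summary as a DataFrame keeps it easy to render or export later, which is one possible design and not necessarily the one the package uses.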

Features in development:

  1. Export to Excel and CSV
  2. Others
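
Until export is available in the package itself, a summary held in a DataFrame can be written out with pandas directly. This is a minimal sketch under that assumption; the `summary` table is hypothetical, and nothing here is part of the data_overview API.

```python
import io

import pandas as pd

# Hypothetical field-level summary, indexed by field name
summary = pd.DataFrame(
    {"missing": [1, 0], "missing_rate": [0.25, 0.0]},
    index=pd.Index(["city", "sales"], name="field"),
)

# Write CSV to an in-memory buffer; pass a file path instead to write to disk
buffer = io.StringIO()
summary.to_csv(buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

For Excel output, pandas offers `DataFrame.to_excel`, which needs an additional engine such as openpyxl to be installed.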

Usage example:

See the file data_overview/Usage.ipynb.

import pandas as pd
import data_overview as do  # requires installation first: pip install data_overview

# Load the data
df = pd.read_csv("./data/test_data.csv", encoding='gb18030')

# Usage: pass in df to build the data overview, then export it
dfo = do.DataOverview(df)
dfo.to_html("./data/report.html")  # the output path and file name are up to you
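
The example above reads a GB18030-encoded CSV that is not reproduced here. To try the loading step without that file, a small stand-in can be created first. This is only a sketch: the column names and values are made up, and a temporary directory is used in place of ./data.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in data with Chinese column names
df_out = pd.DataFrame({"城市": ["北京", "上海"], "销量": [10, 12]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test_data.csv")
    # Write with the same encoding that the usage example reads
    df_out.to_csv(path, index=False, encoding="gb18030")
    df = pd.read_csv(path, encoding="gb18030")

print(df)
```

The resulting `df` can then be used wherever the usage example expects a DataFrame.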

Project details

This version: 0.1.0

Download files

Files for data-overview, version 0.1.0:

    • data_overview-0.1.0-py3-none-any.whl (50.0 kB): file type Wheel, Python version py3
    • data_overview-0.1.0.tar.gz (42.8 kB): file type Source, Python version None