Data Cleaner for multiple files

Project description

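Table vs. pandas DataFrame

  1. Every Table has a required string name attribute.
  2. Every Table has an optional string description attribute that describes the Table itself.
  3. The row index of a Table is always the default RangeIndex.
  4. The column names of a Table are always strings: unique, single-level, and never missing.
  5. The column index of a Table carries extra labels, column_labels, that describe the columns.
  6. The type conversion system of a Table uses a simplified set of nullable data types.
  7. The HTML display of a Table shows the simplified type of each column.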
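
Rules 3, 4, and 6 can be illustrated with plain pandas. The sketch below is not the tidydata API: check_table_invariants is a hypothetical helper written for this example, and convert_dtypes stands in for the simplified nullable type system, whose exact behaviour is not described here.

```python
import pandas as pd


def check_table_invariants(df: pd.DataFrame) -> None:
    """Hypothetical helper: raise ValueError if df breaks rules 3 and 4."""
    index = df.index
    # Rule 3: the row index must be the default RangeIndex (0, 1, 2, ...).
    if not isinstance(index, pd.RangeIndex) or index.start != 0 or index.step != 1:
        raise ValueError("row index must be the default RangeIndex")
    # Rule 4: column names are single-level, string, non-missing, and unique.
    if isinstance(df.columns, pd.MultiIndex):
        raise ValueError("column index must be single-level")
    if not all(isinstance(name, str) for name in df.columns):
        raise ValueError("column names must be non-missing strings")
    if not df.columns.is_unique:
        raise ValueError("column names must be unique")


# A frame with a custom row index and a missing value.
raw = pd.DataFrame({"id": [1, 2, 3], "income": [10.5, None, 7.0]}, index=[10, 20, 30])

# Reset to the default RangeIndex and switch to pandas nullable dtypes (rule 6).
tidy = raw.reset_index(drop=True).convert_dtypes()
check_table_invariants(tidy)
print(tidy.dtypes)
```

Passing raw itself to the helper raises ValueError, because its row index is not the default RangeIndex.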

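MultiTable vs. DiskCache

  1. The values stored in a MultiTable are always Table objects.

  2. The size_limit of a MultiTable is set to 95% of the free space on the current disk (free disk_usage * 0.95), and its cull_limit is set to 0.

  3. Leaving the with statement of a MultiTable not only closes the cache database but also deletes it.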

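  4. Five data pipeline methods are added: concat, reshape, mutate, aggregate, and format:

    1. concat: combine tables horizontally or vertically (adds no new values but changes the dimensions)
    2. reshape: long to wide or wide to long (changes the shape and the row and column values)
    3. mutate: compute new columns from existing ones (modifies columns based on existing columns without changing the dimensions)
    4. aggregate: summarize column information into columns with fewer rows (creates a new Table from existing columns, for example aggregating individuals to the household level)
    5. format: column order, row order, table name, table description, column value ranges, column value replacement, column data types, column renaming, and column labels (changes neither the row and column values nor the shape)
  5. add and update are distinct methods: add is used only when the key does not exist, whereas update can overwrite an existing key; an option controls whether to lock the source.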

  6. By constructing expressions to

  7. IO methods are added:

    1. from_csv
    2. from_tsv
    3. from_pickle
    4. from_stata
    5. to_csv
    6. to_tsv
    7. to_pickle
    8. to_stata
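
Rules 2, 3, and 5 of the MultiTable list can be sketched with the standard library alone. TempStore below is a hypothetical stand-in, not the tidydata or DiskCache API: it keeps values in a plain dict and uses a temporary directory only to show the size limit and the delete-on-exit behaviour. Letting update also create a missing key is an assumption, since the list only says that update can overwrite an existing key.

```python
import shutil
import tempfile
from pathlib import Path


class TempStore:
    """Hypothetical stand-in for MultiTable; illustrates rules 2, 3, and 5 only."""

    def __init__(self):
        self.directory = Path(tempfile.mkdtemp(prefix="multitable-"))
        # Rule 2: cap the store at 95% of the free space on the disk.
        self.size_limit = int(shutil.disk_usage(self.directory).free * 0.95)
        self._items = {}

    def add(self, key, value):
        # Rule 5: add is only valid for a key that does not exist yet.
        if key in self._items:
            raise KeyError(f"{key!r} already exists; use update() instead")
        self._items[key] = value

    def update(self, key, value):
        # Rule 5: update may overwrite an existing key (assumed to create it too).
        self._items[key] = value

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, traceback):
        # Rule 3: leaving the with block removes the backing storage entirely.
        self._items.clear()
        shutil.rmtree(self.directory, ignore_errors=True)


with TempStore() as store:
    store.add("households", "first version")
    store.update("households", "second version")
    location = store.directory

# The temporary directory no longer exists once the with block has exited.
print(location.exists())
```

Raising on a duplicate add makes accidental overwrites explicit, while update remains the one deliberate way to replace a stored value.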

Download files

Source Distribution

tidydata-0.1.19.tar.gz (21.8 kB view details)

Uploaded Source

Built Distribution

tidydata-0.1.19-py3-none-any.whl (24.0 kB view details)

Uploaded Python 3

File details

Details for the file tidydata-0.1.19.tar.gz.

File metadata

  • Download URL: tidydata-0.1.19.tar.gz
  • Upload date:
  • Size: 21.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.11.6 Linux/6.7.0-arch3-1

File hashes

Hashes for tidydata-0.1.19.tar.gz
Algorithm Hash digest
SHA256 8b087645498dfb2446bb3680a05ba8a1e3e784b35fb10ce7e26764cdee3c287c
MD5 4758ddb438db43f21bb778c4330d132e
BLAKE2b-256 6e3a14d3d980b2004aac9ff1620b1d401dac81f9d9bcbe6ffa292b29af1c92c4

File details

Details for the file tidydata-0.1.19-py3-none-any.whl.

File metadata

  • Download URL: tidydata-0.1.19-py3-none-any.whl
  • Upload date:
  • Size: 24.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.11.6 Linux/6.7.0-arch3-1

File hashes

Hashes for tidydata-0.1.19-py3-none-any.whl
Algorithm Hash digest
SHA256 db7e9dc8cd609b75a73a46e2ddf1e7cca0dd8c4809430613754fbf2c43c68353
MD5 65c72d29f93f9147c3181b52e7eea85b
BLAKE2b-256 c9cf79b57c5aeabe3a3251cee0ec40951cb0691d49b1a7def11217b4375ed568
