A library for data exploration comparable to pandas. No Series, no hierarchical indexing, only one indexer: [ ]

Project description

dexplo

A data analysis library comparable to pandas

Main Goals

  • A very minimal set of features

  • Be as explicit as possible

  • There should be one, and preferably only one, obvious way to do it.

Data Structures

  • Only DataFrames

  • No Series

Data Types

  • Only primitive types - int, float, boolean, numpy.unicode

  • No object data types

Row and Column Labels

  • No index, meaning no row labels

  • No hierarchical index

  • Column names must be strings

  • Column names must be unique

  • Columns stored in a numpy array
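
The column rules above can be checked in a few lines. The sketch below is only an illustration of those rules using plain NumPy; `validate_columns` is a hypothetical helper, not part of dexplo's API.

```python
import numpy as np


def validate_columns(names):
    """Return column names as a NumPy array after enforcing the rules above.

    Hypothetical helper for illustration; not taken from dexplo.
    """
    names = list(names)
    for name in names:
        # Rule: column names must be strings
        if not isinstance(name, str):
            raise TypeError(
                f"Column names must be strings, got {type(name).__name__}"
            )
    # Rule: column names must be unique
    if len(set(names)) != len(names):
        raise ValueError("Column names must be unique")
    # Rule: columns are stored in a NumPy array (unicode dtype here)
    return np.array(names, dtype=str)


columns = validate_columns(["name", "age"])
```

Rejecting bad names at construction time means later lookups by label never have to handle duplicates or mixed types.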

Subset Selection

  • Only one way to select data - [ ]

  • Subset selection will be explicit and necessitate both rows and columns

  • Rows will be selected only by integer location

  • Columns will be selected by either label or integer location. Since column names must be strings, this will not be ambiguous

  • Column names cannot be duplicated
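
The selection rules above can be sketched with a toy class. `TinyFrame` is a hypothetical stand-in written in pure Python, not dexplo's DataFrame. It assumes rows may be an integer, a slice, or a list of integers, and columns may be a label, an integer, or a list of either.

```python
from typing import Dict, List, Union


class TinyFrame:
    """Toy column store; a hypothetical stand-in, not dexplo's DataFrame."""

    def __init__(self, data: Dict[str, list]) -> None:
        self._columns: List[str] = list(data)
        self._data: Dict[str, list] = {n: list(v) for n, v in data.items()}
        self._nrows: int = len(self._data[self._columns[0]]) if self._columns else 0

    def _resolve_column(self, col: Union[int, str]) -> str:
        # Strings are labels and integers are positions, so they never clash
        if isinstance(col, str):
            if col not in self._data:
                raise KeyError(col)
            return col
        return self._columns[col]

    def __getitem__(self, key) -> "TinyFrame":
        # Require an explicit (rows, columns) pair
        if not isinstance(key, tuple) or len(key) != 2:
            raise TypeError("Select with frame[rows, columns]")
        rows, cols = key
        if isinstance(rows, int):
            row_idx = [rows]
        elif isinstance(rows, slice):
            row_idx = list(range(*rows.indices(self._nrows)))
        else:
            row_idx = list(rows)
        col_list = cols if isinstance(cols, list) else [cols]
        names = [self._resolve_column(c) for c in col_list]
        # Build new lists so the result never shares storage with the source
        return TinyFrame({n: [self._data[n][i] for i in row_idx] for n in names})

    def to_dict(self) -> Dict[str, list]:
        return {n: list(v) for n, v in self._data.items()}


frame = TinyFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})
subset = frame[0:2, ["b", 0]]   # first two rows; column "b" and column 0
single = frame[1, "a"]          # still a frame, since there is no Series

try:
    frame["a"]                  # rows missing: rejected
    rows_required = False
except TypeError:
    rows_required = True
```

Because a label is always a string and a position is always an integer, one indexer can accept both without guessing.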

All selections and operations copy

  • All selections and operations provide new copies of the data

  • This will avoid any chained indexing confusion
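
To see what always copying guards against, the sketch below contrasts a view with a copy. It uses plain NumPy for illustration and is not dexplo code.

```python
import numpy as np

original = np.array([10, 20, 30, 40])

# Basic slicing returns a view: writing through it changes the original
view = original[:2]
view[0] = -1
after_view_write = original.tolist()    # [-1, 20, 30, 40]

# An explicit copy owns its data: writing to it leaves the original alone
copied = original[:2].copy()
copied[1] = 999
after_copy_write = original.tolist()    # still [-1, 20, 30, 40]

view_shares = np.shares_memory(original, view)      # True
copy_shares = np.shares_memory(original, copied)    # False
```

When every selection behaves like `copied`, an assignment to a selection can never silently reach, or fail to reach, the source data.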

Development

  • Must use type hints

  • Must use Python 3.6 for f-strings

  • Must have numpy, bottleneck, numexpr

Small feature set

  • Implement as few attributes and methods as possible

  • Focus on good idiomatic cookbook examples for doing more complex tasks

Only Scalar Data Types

No complex Python data types

  • [x] bool - always 8 bits, not-null
  • [x] int - always 64 bits, not-null
  • [x] float - always 64 bits, nulls allowed
  • [x] str - A python unicode object, nulls allowed
  • [ ] categorical
  • [ ] datetime
  • [ ] timedelta
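
The null rules above match how NumPy's fixed-width types behave. The sketch below uses plain NumPy, not dexplo's own API, to show the bit widths and why a float column can hold a missing value (NaN) while an int column cannot. The variable names are illustrative.

```python
import numpy as np

# Widths, in bits, of the NumPy types matching the list above
bool_bits = np.dtype(np.bool_).itemsize * 8
int_bits = np.dtype(np.int64).itemsize * 8
float_bits = np.dtype(np.float64).itemsize * 8

# A float column can represent a missing value with NaN
floats = np.array([1.5, np.nan, 3.0], dtype=np.float64)
n_missing = int(np.isnan(floats).sum())

# An int column has no NaN representation, so construction fails
try:
    np.array([1, np.nan], dtype=np.int64)
    int_rejects_nan = False
except ValueError:
    int_rejects_nan = True
```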

Attributes to implement

  • [x] size

  • [x] shape

  • [x] values

  • [x] dtypes

May not implement any of the binary operators as methods (add, sub, mul, etc.)

Methods

Stats

  • [x] abs
  • [x] all
  • [x] any
  • [x] argmax
  • [x] argmin
  • [x] clip
  • [ ] corr
  • [x] count
  • [ ] cov
  • [x] cummax
  • [x] cummin
  • [ ] cumprod
  • [x] cumsum
  • [ ] describe
  • [x] max
  • [x] min
  • [x] median
  • [x] mean
  • [ ] mode
  • [ ] nlargest
  • [ ] nsmallest
  • [ ] quantile
  • [ ] rank
  • [x] std
  • [x] sum
  • [x] var
  • [ ] unique
  • [ ] nunique

Selection

  • [ ] drop
  • [ ] drop_duplicates
  • [x] head
  • [ ] isin
  • [ ] sample
  • [x] select_dtypes
  • [x] tail
  • [ ] where

Missing Data

  • [ ] isna
  • [ ] dropna
  • [ ] fillna
  • [ ] interpolate

Other

  • [ ] append
  • [ ] apply
  • [ ] assign
  • [x] astype
  • [ ] groupby
  • [ ] info
  • [ ] melt
  • [ ] memory_usage
  • [ ] merge
  • [ ] pivot
  • [ ] replace
  • [ ] rolling
  • [ ] sort_values

Functions

  • [ ] read_csv
  • [ ] read_sql
  • [ ] concat

Download files

Source Distribution

dexplo-0.0.2.tar.gz (333.8 kB view details)

Uploaded Source

Built Distribution

dexplo-0.0.2-cp36-cp36m-macosx_10_7_x86_64.whl (321.1 kB view details)

Uploaded CPython 3.6m macOS 10.7+ x86-64

File details

Details for the file dexplo-0.0.2.tar.gz.

File metadata

  • Download URL: dexplo-0.0.2.tar.gz
  • Upload date:
  • Size: 333.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dexplo-0.0.2.tar.gz
Algorithm Hash digest
SHA256 76b6de3b10db1ca3fdbaa5e5c49a39a383c479ab6f1584d7d83bf3f357e0ba43
MD5 c7a5d5e004ceb2a3f0e893bb7c3a954d
BLAKE2b-256 cceaab99134949cf68307e9600f2ad3c550df66d77f155d8fdabfe5dbeadcac2

File details

Details for the file dexplo-0.0.2-cp36-cp36m-macosx_10_7_x86_64.whl.

File hashes

Hashes for dexplo-0.0.2-cp36-cp36m-macosx_10_7_x86_64.whl
Algorithm Hash digest
SHA256 2cb3b86afe60ad841cb68adf1333aba81849bf9644d67ef25d6c41f5c5226a85
MD5 6c84faf069fbfb5eaf62c2edc2519166
BLAKE2b-256 72c720495a406178c48af059342a9efdf6194249d4e210c0a3449664edeebf9a
