Simple, light-weight data frames for Python

Dataiter's DataFrame is a class for tabular data, similar to R's data.frame, that implements all common operations for manipulating data. Under the hood, it is a dictionary of NumPy arrays and is thus capable of fast vectorized operations. You can consider it a light-weight alternative to Pandas with a simple and consistent API. Performance-wise, Dataiter relies on NumPy and Numba and is likely to be, at best, comparable to Pandas.

Installation

# Latest stable version
pip install -U dataiter

# Latest development version
pip install -U git+https://github.com/otsaloma/dataiter

# Numba (optional)
pip install -U numba

The recommended NumPy version is currently >= 2.4.0, due to various StringDType fixes that landed in NumPy 2.2.1 and 2.4.0.
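
To see whether an environment meets that recommendation, a small check along the following lines may help. This is only a sketch using the standard library: the helper names `release_tuple` and `meets_recommendation` are hypothetical, not part of Dataiter or NumPy, and pre-release suffixes such as `rc1` are simply ignored.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Recommended minimum taken from the note above.
RECOMMENDED_NUMPY = (2, 4, 0)

def release_tuple(version_string):
    """Parse the leading numeric release part of a version string."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"Unrecognized version: {version_string!r}")
    return tuple(int(part) for part in match.group().split("."))

def meets_recommendation(version_string, minimum=RECOMMENDED_NUMPY):
    """Return True if the release is at least the recommended minimum."""
    release = release_tuple(version_string)
    # Pad short versions such as "2.10" with zeros before comparing.
    release += (0,) * (len(minimum) - len(release))
    return release >= minimum

try:
    installed = version("numpy")
except PackageNotFoundError:
    print("NumPy is not installed")
else:
    status = "OK" if meets_recommendation(installed) else "older than recommended"
    print(f"NumPy {installed}: {status}")
```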

Dataiter optionally uses Numba to speed up certain operations. If you have Numba installed, Dataiter will use it automatically. It's currently not a hard dependency, so you need to install it separately.
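
To find out whether Numba is available in the current environment, something like the following can be used. It only checks whether the module can be found; how Dataiter itself detects Numba is not described here, and the helper `is_installed` is a hypothetical name used for illustration.

```python
from importlib.util import find_spec

def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return find_spec(module_name) is not None

if is_installed("numba"):
    print("Numba found")
else:
    print("Numba not found, install it with: pip install -U numba")
```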

Quick Start

>>> import dataiter as di
>>> data = di.read_csv("data/listings.csv")
>>> data.filter(hood="Manhattan", guests=2).sort(price=1).head()
.
        id      hood zipcode guests    sqft price
     int64    string  string  int64 float64 int64
  ──────── ───────── ─────── ────── ─────── ─────
0 42279170 Manhattan   10013      2     nan     0
1 42384530 Manhattan   10036      2     nan     0
2 18835820 Manhattan   10021      2     nan    10
3 20171179 Manhattan   10027      2     nan    10
4 14858544 Manhattan              2     nan    15
5 31397084 Manhattan   10002      2     nan    19
6 22289683 Manhattan   10031      2     nan    20
7  7760204 Manhattan   10040      2     nan    22
8 43292527 Manhattan   10033      2     nan    22
9 43268040 Manhattan   10033      2     nan    23
.
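
Since a data frame is described above as a dictionary of NumPy arrays, the filter, sort and head chain can be pictured in plain NumPy terms. The following is a rough sketch of that idea, not Dataiter's actual implementation: the toy data and the helpers `filter_rows`, `sort_rows` and `head` are made up for illustration.

```python
import numpy as np

# Hypothetical toy data: a plain dict of equal-length NumPy arrays
# stands in for a data frame.
columns = {
    "hood": np.array(["Manhattan", "Brooklyn", "Manhattan", "Manhattan"]),
    "guests": np.array([2, 2, 2, 4]),
    "price": np.array([120, 80, 95, 200]),
}

def filter_rows(columns, mask):
    """Keep the rows where the boolean mask is True."""
    return {name: values[mask] for name, values in columns.items()}

def sort_rows(columns, by):
    """Reorder all columns by ascending values of one column."""
    order = np.argsort(columns[by], kind="stable")
    return {name: values[order] for name, values in columns.items()}

def head(columns, n=10):
    """Keep the first n rows of every column."""
    return {name: values[:n] for name, values in columns.items()}

# Vectorized comparison builds the mask in one pass over each column.
mask = (columns["hood"] == "Manhattan") & (columns["guests"] == 2)
result = head(sort_rows(filter_rows(columns, mask), by="price"))
print(result["price"])
```

Each step is a single vectorized operation per column (boolean indexing, `argsort`, slicing), which is where the speed of an array-backed data frame comes from.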

Documentation

https://dataiter.readthedocs.io/

If you're familiar with either dplyr (R) or Pandas (Python), the comparison table in the documentation will give you a quick overview of the differences and similarities in common operations.

https://dataiter.readthedocs.io/en/stable/comparison.html

Development

To install a virtualenv for development, use

make venv

or, for a specific Python version

make PYTHON=python3.X venv
