Skip to main content

An intuitive, Pythonic way to work with tabular data

Project description

Python DataMatrix

An intuitive, Pythonic way to work with tabular data.

Sebastiaan Mathôt
Copyright 2015-2021
https:/datamatrix.cogsci.nl

Build Status

About

The datamatrix package provides a high-level, intuitive way to work with tabular data, that is, datasets that consist of named columns and numbered rows.

The main advantage of datamatrix over similar libraries is the clean, Pythonic syntax, which makes your code easy to read and understand.

datamatrix is also one of the core libraries of OpenSesame, a graphical experiment builder for the social sciences, and Rapunzel, a modern code editor for numerical computing with Python and R.

Ultra-short cheat sheet

from datamatrix import DataMatrix
# Create a new DataMatrix
dm = DataMatrix(length=5)
# The first two rows
print(dm[:2])
# Create a new column and initialize it with the Fibonacci series
dm.fibonacci = 0, 1, 1, 2, 3
# Remove 0 and 3 with a simple selection
dm = (dm.fibonacci > 0) & (dm.fibonacci < 3)
# Select 1, 1, and 2 by matching any of the values in a set
dm = dm.fibonacci == {1, 2}
# Select all odd numbers with a lambda expression
dm = dm.fibonacci == (lambda x: x % 2)
# Change all 1s to -1
dm.fibonacci[dm.fibonacci == 1] = -1
# The first two cells from the fibonacci column
print(dm.fibonacci[:2])
# Column mean
print('Mean: %s' % dm.fibonacci.mean)
# Multiply all fibonacci cells by 2
dm.fibonacci_times_two = dm.fibonacci * 2
# Loop through all rows
for row in dm:
	print(row.fibonacci) # get the fibonacci cell from the row
# Loop through all columns
for colname, col in dm.columns:
	for cell in col: # Loop through all cells in the column
		print(cell) # do something with the cell
# Or just see which columns exist
print(dm.column_names)

Dependencies

  • Python >= 3.7

Optional:

  • numpy and scipy for using the FloatColumn, IntColumn, and SeriesColumn objects
  • prettytable for creating a text representation of a DataMatrix (e.g. to print it out)
  • openpyxl for reading and writing .xlsx files
  • fastnumbers for improved performance

Installation

PyPi

pip install python-datamatrix

Anaconda

conda install python-datamatrix -c cogsci -c conda-forge

Ubuntu

sudo add-apt-repository ppa:smathot/cogscinl
sudo apt-get update
sudo apt install python3-datamatrix

Documentation

See https://datamatrix.cogsci.nl/

License

python-datamatrix is licensed under the GNU General Public License v3.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for python-datamatrix, version 0.12.0
Filename, size File type Python version Upload date Hashes
Filename, size python_datamatrix-0.12.0-py2.py3-none-any.whl (69.9 kB) File type Wheel Python version py2.py3 Upload date Hashes View
Filename, size python-datamatrix-0.12.0.tar.gz (48.6 kB) File type Source Python version None Upload date Hashes View

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring DigiCert DigiCert EV certificate Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page