An implementation of the DataFrame specification in Python

Project description

This is the official implementation of the DataFrame specification provided by Raven Computing.

Getting Started

Install via:

pip install raven-pydf
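
To confirm that the install succeeded before running the examples, a quick check like the following can help. It is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical, and the top-level package name `raven` is taken from the import statement shown in this README.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    # find_spec returns None when no importable top-level module has this name
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_installed("raven"):
        print("raven-pydf appears to be installed")
    else:
        print("raven not found; try: pip install raven-pydf")
```

The check only looks the package up on the import path; it does not import it, so it has no side effects.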

After installation you can use the entire DataFrame API by importing one class:

from raven.struct.dataframe import DataFrame

# read a DataFrame file into memory
df = DataFrame.read("/path/to/myFile.df")

# show the first 10 rows on stdout
print(df.head(10))

Alternatively, you can import all concrete Column types directly, for example:

from raven.struct.dataframe import (DefaultDataFrame,
                                    IntColumn,
                                    DoubleColumn,
                                    StringColumn)

# create a DataFrame with 3 columns and 3 rows
df = DefaultDataFrame(
        IntColumn("A", [1, 2, 3]),
        DoubleColumn("B", [4.4, 5.5, 6.6]),
        StringColumn("C", ["cat", "dog", "horse"]))

print(df)
# _| A B   C
# 0| 1 4.4 cat
# 1| 2 5.5 dog
# 2| 3 6.6 horse


Compatibility

This library requires Python 3.7 or higher.

Internally, this library uses NumPy for array operations. The minimum required version is v1.19.0.
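
The minimum versions stated above can be verified at runtime. The following is a minimal sketch, not part of the library: the names `version_tuple` and `check_requirements` are hypothetical, and the version parsing is deliberately simple (it reads only the leading digits of the first three components).

```python
import sys

MIN_PYTHON = (3, 7)      # minimum Python version stated in this README
MIN_NUMPY = (1, 19, 0)   # minimum NumPy version stated in this README


def version_tuple(version):
    """Convert a version string such as '1.19.0' into a 3-tuple of ints.

    Non-numeric suffixes (for example 'rc1') are ignored and missing
    components are padded with zeros.
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_requirements():
    """Return a list of human-readable problems; empty if all is fine."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.7 or higher is required")
    try:
        import numpy
    except ImportError:
        problems.append("NumPy is not installed")
    else:
        if version_tuple(numpy.__version__) < MIN_NUMPY:
            problems.append("NumPy v1.19.0 or higher is required")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

Running the script prints one line per unmet requirement and nothing when the environment meets both minimums.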


Documentation

The unified documentation is available here.

Additional features implemented in Python are documented in the Wiki.


License

This library is licensed under the Apache License, Version 2.0. See the LICENSE file for details.

