An implementation of the DataFrame specification in Python

This is the official implementation of the DataFrame specification provided by Raven Computing.

Getting Started

Install via:

pip install raven-pydf

After installation you can use the entire DataFrame API by importing one class:

from raven.struct.dataframe import DataFrame

# read a DataFrame file into memory
df = DataFrame.read("/path/to/myFile.df")

# show the first 10 rows on stdout
print(df.head(10))

Alternatively, you can import the concrete DataFrame and Column types directly, for example:

from raven.struct.dataframe import (DefaultDataFrame,
                                    IntColumn,
                                    DoubleColumn,
                                    StringColumn)

# create a DataFrame with 3 columns and 3 rows
df = DefaultDataFrame(
        IntColumn("A", [1, 2, 3]),
        DoubleColumn("B", [4.4, 5.5, 6.6]),
        StringColumn("C", ["cat", "dog", "horse"]))

print(df)
# _| A B   C
# 0| 1 4.4 cat
# 1| 2 5.5 dog
# 2| 3 6.6 horse

Compatibility

This library requires Python 3.7 or higher.

Internally, this library uses NumPy for array operations. The minimum required version is v1.19.0.
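
The two minimum versions above can be checked before installing. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `meets_requirements` are hypothetical and not part of raven-pydf, and the version parsing is deliberately simplistic (it ignores pre-release suffixes such as `rc1`).

```python
import sys

# Minimum versions as stated in the Compatibility section.
MIN_PYTHON = (3, 7)
MIN_NUMPY = (1, 19, 0)


def parse_version(text):
    """Turn a version string such as '1.19.0' into a 3-tuple of ints.

    Hypothetical helper: non-numeric suffixes (e.g. 'rc1') are ignored
    and missing components are padded with zeros.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def meets_requirements(python_version, numpy_version):
    """Return True if both versions satisfy the documented minimums."""
    return (tuple(python_version[:2]) >= MIN_PYTHON
            and parse_version(numpy_version) >= MIN_NUMPY)


if __name__ == "__main__":
    try:
        import numpy
        numpy_version = numpy.__version__
    except ImportError:
        # Treat a missing NumPy as not meeting the requirement.
        numpy_version = "0"
    print(meets_requirements(sys.version_info, numpy_version))
```

In practice pip resolves the NumPy dependency during installation, so a check like this is mainly useful for diagnosing an existing environment.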

Documentation

The unified documentation is available here.

Additional features implemented in Python are documented in the Wiki.

License

This library is licensed under the Apache License, Version 2.0. See the LICENSE file for details.
