Skip to main content

cuDF - GPU Dataframe

Project description

 cuDF - A GPU-accelerated DataFrame library for tabular data processing

cuDF (pronounced "KOO-dee-eff") is an Apache 2.0 licensed, GPU-accelerated DataFrame library for tabular data processing. The cuDF library is one part of the RAPIDS GPU Accelerated Data Science suite of libraries.

About

cuDF is composed of multiple libraries including:

  • libcudf: A CUDA C++ library with Apache Arrow compliant data structures and fundamental algorithms for tabular data.
  • pylibcudf: A Python library providing Cython bindings for libcudf.
  • cudf: A Python library providing
    • A DataFrame library mirroring the pandas API
    • A zero-code change accelerator, cudf.pandas, for existing pandas code.
  • cudf-polars: A Python library providing a GPU engine for Polars
  • dask-cudf: A Python library providing a GPU backend for Dask DataFrames

Notable projects that use cuDF include:

Installation

System Requirements

Operating System, GPU driver, and supported CUDA version information can be found at the RAPIDS Installation Guide

pip

A stable release of each cudf library is available on PyPI. You will need to match the major version number of your installed CUDA version with a -cu## suffix when installing from PyPI.

A development version of each library is available as a nightly release by including the -i https://pypi.anaconda.org/rapidsai-wheels-nightly/simple index.

# CUDA 13
pip install libcudf-cu13
pip install pylibcudf-cu13
pip install cudf-cu13
pip install cudf-polars-cu13
pip install dask-cudf-cu13

# CUDA 12
pip install libcudf-cu12
pip install pylibcudf-cu12
pip install cudf-cu12
pip install cudf-polars-cu12
pip install dask-cudf-cu12

conda

A stable release of each cudf library is available to be installed with the conda package manager by specifying the -c rapidsai channel.

A development version of each library is available as a nightly release by specifying the -c rapidsai-nightly channel instead.

conda install -c rapidsai libcudf
conda install -c rapidsai pylibcudf
conda install -c rapidsai cudf
conda install -c rapidsai cudf-polars
conda install -c rapidsai dask-cudf

source

To install cuDF from source, please follow the contribution guide detailing how to setup the build environment.

Examples

The following examples showcase reading a parquet file, dropping missing rows with a null value, and performing a groupby aggregation on the data.

cudf

import cudf and the APIs are largely similar to pandas.

import cudf

df = cudf.read_parquet("data.parquet")
df.dropna().groupby(["A", "B"]).mean()

cudf.pandas

With a Python file containing pandas code:

import pandas as pd

df = cudf.read_parquet("data.parquet")
df.dropna().groupby(["A", "B"]).mean()

Use cudf.pandas by invoking python with -m cudf.pandas

$ python -m cudf.pandas script.py

If running the pandas code in an interactive Jupyter environment, call %load_ext cudf.pandas before importing pandas.

In [1]: %load_ext cudf.pandas

In [2]: import pandas as pd

In [3]: df = cudf.read_parquet("data.parquet")

In [4]: df.dropna().groupby(["A", "B"]).mean()

cudf-polars

Using Polars' lazy API, call collect with engine="gpu" to run the operation on the GPU

import polars as pl

lf = pl.scan_parquet("data.parquet")
lf.drop_nulls().group_by(["A", "B"]).mean().collect(engine="gpu")

Questions and Discussion

For bug reports or feature requests, please file an issue on the GitHub issue tracker.

For questions or discussion about cuDF and GPU data processing, feel free to post in the RAPIDS Slack workspace.

Contributing

cuDF is open to contributions from the community! Please see our guide for contributing to cuDF for more information.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

cudf_cu13-26.2.0-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.1 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.24+ x86-64manylinux: glibc 2.28+ x86-64

cudf_cu13-26.2.0-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (2.1 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.24+ ARM64manylinux: glibc 2.28+ ARM64

cudf_cu13-26.2.0-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.1 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.24+ x86-64manylinux: glibc 2.28+ x86-64

cudf_cu13-26.2.0-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (2.1 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.24+ ARM64manylinux: glibc 2.28+ ARM64

cudf_cu13-26.2.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.1 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.24+ x86-64manylinux: glibc 2.28+ x86-64

cudf_cu13-26.2.0-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (2.1 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.24+ ARM64manylinux: glibc 2.28+ ARM64

cudf_cu13-26.2.0-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.1 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.24+ x86-64manylinux: glibc 2.28+ x86-64

cudf_cu13-26.2.0-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (2.1 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.24+ ARM64manylinux: glibc 2.28+ ARM64

File details

Details for the file cudf_cu13-26.2.0-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for cudf_cu13-26.2.0-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 619767b207e80cb16d1da9fc1db769c7ef3ec2b83503240c2fb46f15a0503b2b
MD5 212fa19a0a28c84454d709ecea932ead
BLAKE2b-256 ea240f6b86d26485e8634346866c4c5b4bef01590f6252a0401b0fd60f77d80b

See more details on using hashes here.

File details

Details for the file cudf_cu13-26.2.0-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for cudf_cu13-26.2.0-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 a86d0feb8fa4b40ec167f87fa7002603d6ba8f8d52eaf27381bf2310165e7640
MD5 86c75e087196e311c53d9c10e39fb7fa
BLAKE2b-256 9eaa3dfd9867a9993b8063d7af3761ee7d9cbfe69dc64ea3168799e6420d9a43

See more details on using hashes here.

File details

Details for the file cudf_cu13-26.2.0-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for cudf_cu13-26.2.0-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 1d2237b7a4cce44ce686d1ef36295d8a4785af8aaccc30f5db92f5237e45e93e
MD5 6229e646833a536eb69d381b6ca8fb25
BLAKE2b-256 4abff9c2399b1d39e60c922ff26b73091473b7309fc8fb6f5b07dbdb4f5ec30e

See more details on using hashes here.

File details

Details for the file cudf_cu13-26.2.0-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for cudf_cu13-26.2.0-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 91b86b4b25e46b5ce23a3cfa4c0a880e22f21a6dfeaf879ded8bc5573069fe55
MD5 744e718006c4be66c87f15cb8caac9c3
BLAKE2b-256 03680b3a9ebc88f5e9eaab5087908ef79fc16f20b0dd437308b6e6a3fdf64f17

See more details on using hashes here.

File details

Details for the file cudf_cu13-26.2.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for cudf_cu13-26.2.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 8d7239950f3168761efea9c445d154bd2b88c2175fae3139f0af2e8a901513fe
MD5 c2f1f680e81cd3740b36a686055e5e15
BLAKE2b-256 7e8592b78399d5e4af832db96c44bd58d46761c5440e943da13cc7120f52cda2

See more details on using hashes here.

File details

Details for the file cudf_cu13-26.2.0-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for cudf_cu13-26.2.0-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 158e7c06f98e4dfa03596bd5b29e2b22e7aa03efccf83b31526577419ce88fab
MD5 9aa591aad5a99e25c77643614efcc9ff
BLAKE2b-256 ff16e6349197b1de4f7feef098286ccd5a9d78c871404088c08cd21c98aa46ae

See more details on using hashes here.

File details

Details for the file cudf_cu13-26.2.0-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for cudf_cu13-26.2.0-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 c1b4b84a76a826e003feb3a422dcd1d098975a1f49bb4363402e7ab1e00d3125
MD5 ea2b8c3fed8421355b086aec530225ba
BLAKE2b-256 447e6b09a9622827a7194f7aef24a751b28f81ec2913dbd06e3749ca680f777b

See more details on using hashes here.

File details

Details for the file cudf_cu13-26.2.0-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for cudf_cu13-26.2.0-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 469e05c4552b7b60a3817fac98c1e01c7dcd7ad0c11180167ee4af380f8019ad
MD5 0ecd0d289876cfcf4619fc7b900ffe24
BLAKE2b-256 89b09844d2427bd25649d04f250841467c4dd99a0aaeebfe6a01f5a115266b84

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page