cuDF - GPU Dataframe

 cuDF - A GPU-accelerated DataFrame library for tabular data processing

cuDF (pronounced "KOO-dee-eff") is an Apache 2.0 licensed, GPU-accelerated DataFrame library for tabular data processing. The cuDF library is one part of the RAPIDS GPU Accelerated Data Science suite of libraries.

About

cuDF is composed of multiple libraries including:

  • libcudf: A CUDA C++ library with Apache Arrow compliant data structures and fundamental algorithms for tabular data.
  • pylibcudf: A Python library providing Cython bindings for libcudf.
  • cudf: A Python library providing
    • A DataFrame library mirroring the pandas API
    • A zero-code change accelerator, cudf.pandas, for existing pandas code.
  • cudf-polars: A Python library providing a GPU engine for Polars
  • dask-cudf: A Python library providing a GPU backend for Dask DataFrames

Notable projects that use cuDF include:

Installation

System Requirements

Operating System, GPU driver, and supported CUDA version information can be found at the RAPIDS Installation Guide

pip

A stable release of each cudf library is available on PyPI. You will need to match the major version number of your installed CUDA version with a -cu## suffix when installing from PyPI.

A development version of each library is available as a nightly release by including the -i https://pypi.anaconda.org/rapidsai-wheels-nightly/simple index.

# CUDA 13
pip install libcudf-cu13
pip install pylibcudf-cu13
pip install cudf-cu13
pip install cudf-polars-cu13
pip install dask-cudf-cu13

# CUDA 12
pip install libcudf-cu12
pip install pylibcudf-cu12
pip install cudf-cu12
pip install cudf-polars-cu12
pip install dask-cudf-cu12
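The suffix rule above can be sketched in a few lines of Python. This is only an illustration of the naming convention: the helper name `wheel_name` and the `SUPPORTED_LIBRARIES` tuple are hypothetical and are not part of cuDF or pip.

```python
# Hypothetical helper: derive the PyPI package name from a CUDA version string.
# The library names are the ones listed in the install commands above.
SUPPORTED_LIBRARIES = ("libcudf", "pylibcudf", "cudf", "cudf-polars", "dask-cudf")


def wheel_name(library: str, cuda_version: str) -> str:
    """Return e.g. 'cudf-cu12' for library='cudf' and cuda_version='12.4'."""
    if library not in SUPPORTED_LIBRARIES:
        raise ValueError(f"unknown library: {library!r}")
    # Only the major version is used in the -cu## suffix.
    major = cuda_version.split(".")[0]
    if not major.isdigit():
        raise ValueError(f"cannot parse CUDA version: {cuda_version!r}")
    return f"{library}-cu{major}"


print(wheel_name("cudf", "12.4"))
print(wheel_name("dask-cudf", "13.0"))
```

Whether a given suffix is actually published for a release still has to be checked on PyPI; the helper only encodes the naming pattern.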

conda

A stable release of each cudf library is available to be installed with the conda package manager by specifying the -c rapidsai channel.

A development version of each library is available as a nightly release by specifying the -c rapidsai-nightly channel instead.

conda install -c rapidsai libcudf
conda install -c rapidsai pylibcudf
conda install -c rapidsai cudf
conda install -c rapidsai cudf-polars
conda install -c rapidsai dask-cudf

source

To install cuDF from source, please follow the contribution guide, which details how to set up the build environment.

Examples

The following examples showcase reading a Parquet file, dropping rows that contain a null value, and performing a groupby aggregation on the data.
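To make the semantics of that pipeline concrete before looking at the GPU versions, here is a plain-Python sketch that runs without a GPU or any third-party package. The sample rows, the column C, and the function name `dropna_groupby_mean` are made up for illustration; it is not how cuDF implements the operation.

```python
from collections import defaultdict

# Made-up sample data standing in for the contents of a Parquet file.
rows = [
    {"A": "x", "B": 1, "C": 10.0},
    {"A": "x", "B": 1, "C": 20.0},
    {"A": "x", "B": 2, "C": None},  # contains a null, so it is dropped
    {"A": "y", "B": 1, "C": 5.0},
]


def dropna_groupby_mean(rows, keys):
    """Drop rows with any null, group by `keys`, and average the other columns."""
    complete = [r for r in rows if all(v is not None for v in r.values())]
    groups = defaultdict(list)
    for r in complete:
        groups[tuple(r[k] for k in keys)].append(r)
    result = {}
    for group_key, members in groups.items():
        value_cols = [c for c in members[0] if c not in keys]
        result[group_key] = {
            c: sum(m[c] for m in members) / len(members) for c in value_cols
        }
    return result


result = dropna_groupby_mean(rows, ["A", "B"])
print(result)
```

The group ("x", 2) disappears entirely because its only row contained a null and was dropped before grouping.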

cudf

Import cudf; its APIs are largely similar to those of pandas.

import cudf

df = cudf.read_parquet("data.parquet")
df.dropna().groupby(["A", "B"]).mean()

cudf.pandas

With a Python file containing pandas code:

import pandas as pd

df = pd.read_parquet("data.parquet")
df.dropna().groupby(["A", "B"]).mean()

Use cudf.pandas by invoking Python with -m cudf.pandas:

$ python -m cudf.pandas script.py

If running the pandas code in an interactive Jupyter environment, call %load_ext cudf.pandas before importing pandas.

In [1]: %load_ext cudf.pandas

In [2]: import pandas as pd

In [3]: df = pd.read_parquet("data.parquet")

In [4]: df.dropna().groupby(["A", "B"]).mean()
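Besides the command-line flag and the Jupyter extension, the cuDF documentation also describes enabling the accelerator from inside a script by calling cudf.pandas.install() before pandas is imported. The sketch below wraps that call so the script still runs on a machine without cuDF; the wrapper name `enable_cudf_pandas` and the fallback behaviour are assumptions made for this example, not part of cuDF.

```python
def enable_cudf_pandas() -> bool:
    """Try to switch on the cudf.pandas accelerator; return True on success.

    Hypothetical wrapper: must be called before pandas is imported.
    """
    try:
        import cudf.pandas  # needs a working cuDF installation and a GPU

        cudf.pandas.install()
    except Exception:
        # cuDF missing or unusable here: fall back to regular CPU pandas.
        return False
    return True


accelerated = enable_cudf_pandas()
print("cudf.pandas active:", accelerated)

# Import pandas only after this point, for example:
#   import pandas as pd
#   df = pd.read_parquet("data.parquet")
```

Catching a broad exception is a deliberate simplification here; a real script might prefer to report why acceleration is unavailable.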

cudf-polars

Using Polars' lazy API, call collect with engine="gpu" to run the operation on the GPU.

import polars as pl

lf = pl.scan_parquet("data.parquet")
lf.drop_nulls().group_by(["A", "B"]).mean().collect(engine="gpu")

Questions and Discussion

For bug reports or feature requests, please file an issue on the GitHub issue tracker.

For questions or discussion about cuDF and GPU data processing, feel free to post in the RAPIDS Slack workspace.

Contributing

cuDF is open to contributions from the community! Please see our guide for contributing to cuDF for more information.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

cudf_cu12-26.2.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.7 MB)

Uploaded: CPython 3.13; manylinux: glibc 2.24+ x86-64; manylinux: glibc 2.28+ x86-64

cudf_cu12-26.2.1-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (2.7 MB)

Uploaded: CPython 3.13; manylinux: glibc 2.24+ ARM64; manylinux: glibc 2.28+ ARM64

cudf_cu12-26.2.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.7 MB)

Uploaded: CPython 3.12; manylinux: glibc 2.24+ x86-64; manylinux: glibc 2.28+ x86-64

cudf_cu12-26.2.1-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (2.7 MB)

Uploaded: CPython 3.12; manylinux: glibc 2.24+ ARM64; manylinux: glibc 2.28+ ARM64

cudf_cu12-26.2.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.7 MB)

Uploaded: CPython 3.11; manylinux: glibc 2.24+ x86-64; manylinux: glibc 2.28+ x86-64

cudf_cu12-26.2.1-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (2.7 MB)

Uploaded: CPython 3.11; manylinux: glibc 2.24+ ARM64; manylinux: glibc 2.28+ ARM64

cudf_cu12-26.2.1-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.7 MB)

Uploaded: CPython 3.10; manylinux: glibc 2.24+ x86-64; manylinux: glibc 2.28+ x86-64

cudf_cu12-26.2.1-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (2.7 MB)

Uploaded: CPython 3.10; manylinux: glibc 2.24+ ARM64; manylinux: glibc 2.28+ ARM64

File details

Details for the file cudf_cu12-26.2.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for cudf_cu12-26.2.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 1dacad89771d4dad0f24949659a1d688902f2cbb24d78c61ab46dda4dc6e35b1
MD5 fdf72dee508154506e74a346efa0622f
BLAKE2b-256 b4cd09e77423dee5a6924ca3e149949aa510ef6ac9e6914996eab1dd086d6279

See more details on using hashes here.

File details

Details for the file cudf_cu12-26.2.1-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for cudf_cu12-26.2.1-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 b54fcb609aa549c55f7ecc11c70ab78c734d59e4820257724bd9dd091593cef5
MD5 9bae5710ae0c9aecd481ebc76c3bebaa
BLAKE2b-256 1f26a8181a517f5d65109b77e85fd012d22a8fd28bdea7be09e697fed18ab06f

File details

Details for the file cudf_cu12-26.2.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for cudf_cu12-26.2.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 23c0f10714903d522c8c444107b5239bdbfa6878873e37782cf41d53bcc21a75
MD5 c8155815cf2c67f09bb6abe7deef4fd2
BLAKE2b-256 51d0aa36cc52c357a611d8be3eb48c4d7fa35f8c50b678b3d2f415da24e49ac9

File details

Details for the file cudf_cu12-26.2.1-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for cudf_cu12-26.2.1-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 a458e735dc90297d2de4bd12304337d1b869e3b08abb221f8ef82d974a62f95a
MD5 e85ccbad0bb477ea9e224da8fe6292ab
BLAKE2b-256 bdc0e1c1e6c2f8aba2553ee1f4baffdf109e0ef3a30e06e31b681d3b5b519029

File details

Details for the file cudf_cu12-26.2.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for cudf_cu12-26.2.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 55e838bc8067c40f4590e617694762d4e20ca7d9686fb7a720af694970f75502
MD5 0ce34ebd1edd234d0dd35d524e656c69
BLAKE2b-256 477e0f1decc5b86fe7d82a48d64c4ac379eeb30ade7c41ab85cb01209e0d775d

File details

Details for the file cudf_cu12-26.2.1-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for cudf_cu12-26.2.1-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 86d12dfff0bddad886ef0f8cb12cf4260570978ea5c798d86004c1a08f46cac7
MD5 fc2a669f5daca5585e7bb90eca134e97
BLAKE2b-256 2af3623cf03bbf262228031b3e22f65d1ff8dd89735197287097f39a0e2e113c

File details

Details for the file cudf_cu12-26.2.1-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for cudf_cu12-26.2.1-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 51c2ff6a4f73b6aaa2dfc3877e4b852bc8c929582b4bd4a4739cb1fefeeb94c9
MD5 a1b94db7c3d78ec1b51f42c2897b29fd
BLAKE2b-256 5067c34be38ddd8d546bbb3fbca1b01e1e6cb0b5f14cbef97a1284acfb28e922

File details

Details for the file cudf_cu12-26.2.1-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for cudf_cu12-26.2.1-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 4690d207817124891ba5852e03033cb9c1b413a41c6cc13a25362ea46f0b2dfa
MD5 da1dbb56dbff9f36b22e1997915518de
BLAKE2b-256 94ad97424f5692ac2b38554478dd26c8ef261f2e5aedac0795c2709b17ba2276
