cuDF - GPU Dataframe

Project description

cuDF - A GPU-accelerated DataFrame library for tabular data processing

cuDF (pronounced "KOO-dee-eff") is an Apache 2.0 licensed, GPU-accelerated DataFrame library for tabular data processing. The cuDF library is one part of the RAPIDS GPU Accelerated Data Science suite of libraries.

About

cuDF is composed of multiple libraries including:

  • libcudf: A CUDA C++ library with Apache Arrow compliant data structures and fundamental algorithms for tabular data.
  • pylibcudf: A Python library providing Cython bindings for libcudf.
  • cudf: A Python library providing
    • A DataFrame library mirroring the pandas API
    • A zero-code change accelerator, cudf.pandas, for existing pandas code.
  • cudf-polars: A Python library providing a GPU engine for Polars
  • dask-cudf: A Python library providing a GPU backend for Dask DataFrames

Notable projects that use cuDF include:

Installation

System Requirements

Operating system, GPU driver, and supported CUDA version information can be found in the RAPIDS Installation Guide.

pip

A stable release of each cudf library is available on PyPI. You will need to match the major version number of your installed CUDA version with a -cu## suffix when installing from PyPI.

A development version of each library is available as a nightly release by including the -i https://pypi.anaconda.org/rapidsai-wheels-nightly/simple index.

# CUDA 13
pip install libcudf-cu13
pip install pylibcudf-cu13
pip install cudf-cu13
pip install cudf-polars-cu13
pip install dask-cudf-cu13

# CUDA 12
pip install libcudf-cu12
pip install pylibcudf-cu12
pip install cudf-cu12
pip install cudf-polars-cu12
pip install dask-cudf-cu12
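
The suffix rule above can be sketched in a few lines of Python. This is only an illustration, not part of cuDF: the helper name `pip_install_command` and its parameters are hypothetical, and it merely assembles a command string from the package names and nightly index URL listed in this section.

```python
# Hypothetical helper (not part of cuDF): builds the pip command for one
# cuDF library from the major version of the installed CUDA toolkit.
NIGHTLY_INDEX = "https://pypi.anaconda.org/rapidsai-wheels-nightly/simple"
LIBRARIES = ("libcudf", "pylibcudf", "cudf", "cudf-polars", "dask-cudf")


def pip_install_command(library: str, cuda_major: int, nightly: bool = False) -> str:
    """Return e.g. 'pip install cudf-cu12' for library='cudf', cuda_major=12."""
    if library not in LIBRARIES:
        raise ValueError(f"unknown cuDF library: {library!r}")
    if cuda_major not in (12, 13):
        # Only the -cu12 and -cu13 suffixes appear in this section.
        raise ValueError(f"no -cu{cuda_major} packages are listed here")
    command = f"pip install {library}-cu{cuda_major}"
    if nightly:
        # Nightly wheels are served from the extra index named above.
        command += f" -i {NIGHTLY_INDEX}"
    return command


print(pip_install_command("cudf", 12))
print(pip_install_command("cudf-polars", 13, nightly=True))
```

The CUDA major version still has to be determined separately, for example from the driver or toolkit installed on the machine.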

conda

A stable release of each cudf library is available to be installed with the conda package manager by specifying the -c rapidsai channel.

A development version of each library is available as a nightly release by specifying the -c rapidsai-nightly channel instead.

conda install -c rapidsai libcudf
conda install -c rapidsai pylibcudf
conda install -c rapidsai cudf
conda install -c rapidsai cudf-polars
conda install -c rapidsai dask-cudf

source

To install cuDF from source, please follow the contribution guide, which details how to set up the build environment.

Examples

The following examples show how to read a Parquet file, drop rows that contain null values, and perform a groupby aggregation on the data.

cudf

Import cudf and use it much as you would pandas; the APIs are largely similar.

import cudf

df = cudf.read_parquet("data.parquet")
df.dropna().groupby(["A", "B"]).mean()
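
For readers less familiar with the pandas idiom, the following plain-Python sketch spells out what the dropna, groupby, mean chain computes on a tiny in-memory table. It does not use cuDF or the GPU; the sample rows, the value column C, and the function name `dropna_groupby_mean` are illustrative assumptions.

```python
from collections import defaultdict

# Tiny in-memory stand-in for data.parquet; None marks a null value.
rows = [
    {"A": "x", "B": 1, "C": 10.0},
    {"A": "x", "B": 1, "C": 20.0},
    {"A": "x", "B": 2, "C": None},  # dropped: contains a null
    {"A": "y", "B": 1, "C": 5.0},
]


def dropna_groupby_mean(rows, keys):
    """Drop rows containing any null, group by `keys`, average the other columns."""
    complete = [r for r in rows if all(v is not None for v in r.values())]
    groups = defaultdict(list)
    for r in complete:
        groups[tuple(r[k] for k in keys)].append(r)
    result = {}
    for group_key in sorted(groups):  # group keys come out in sorted order
        members = groups[group_key]
        value_columns = [c for c in members[0] if c not in keys]
        result[group_key] = {
            c: sum(m[c] for m in members) / len(members) for c in value_columns
        }
    return result


print(dropna_groupby_mean(rows, ["A", "B"]))
# {('x', 1): {'C': 15.0}, ('y', 1): {'C': 5.0}}
```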

cudf.pandas

With a Python file containing pandas code:

import pandas as pd

df = pd.read_parquet("data.parquet")
df.dropna().groupby(["A", "B"]).mean()

Run the script under cudf.pandas by invoking python with -m cudf.pandas:

$ python -m cudf.pandas script.py

If running the pandas code in an interactive Jupyter environment, call %load_ext cudf.pandas before importing pandas.

In [1]: %load_ext cudf.pandas

In [2]: import pandas as pd

In [3]: df = pd.read_parquet("data.parquet")

In [4]: df.dropna().groupby(["A", "B"]).mean()

cudf-polars

Using Polars' lazy API, call collect with engine="gpu" to run the operation on the GPU:

import polars as pl

lf = pl.scan_parquet("data.parquet")
lf.drop_nulls().group_by(["A", "B"]).mean().collect(engine="gpu")

Questions and Discussion

For bug reports or feature requests, please file an issue on the GitHub issue tracker.

For questions or discussion about cuDF and GPU data processing, feel free to post in the RAPIDS Slack workspace.

Contributing

cuDF is open to contributions from the community! Please see our guide for contributing to cuDF for more information.

Download files

Download the file for your platform.

Source Distributions

No source distribution files available for this release.

Built Distributions

cudf_cu13-26.2.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.1 MB)

Uploaded: CPython 3.13, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64

cudf_cu13-26.2.1-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (2.1 MB)

Uploaded: CPython 3.13, manylinux: glibc 2.24+ ARM64, manylinux: glibc 2.28+ ARM64

cudf_cu13-26.2.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.1 MB)

Uploaded: CPython 3.12, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64

cudf_cu13-26.2.1-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (2.1 MB)

Uploaded: CPython 3.12, manylinux: glibc 2.24+ ARM64, manylinux: glibc 2.28+ ARM64

cudf_cu13-26.2.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.2 MB)

Uploaded: CPython 3.11, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64

cudf_cu13-26.2.1-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (2.1 MB)

Uploaded: CPython 3.11, manylinux: glibc 2.24+ ARM64, manylinux: glibc 2.28+ ARM64

cudf_cu13-26.2.1-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.1 MB)

Uploaded: CPython 3.10, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64

cudf_cu13-26.2.1-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (2.1 MB)

Uploaded: CPython 3.10, manylinux: glibc 2.24+ ARM64, manylinux: glibc 2.28+ ARM64
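
The file names above follow the standard wheel naming scheme, {distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl, where several platform tags may be joined with dots. As a rough sketch, the fields can be split out as follows. The function name `parse_wheel_name` is hypothetical, and the sketch ignores the optional build tag, which none of these files use.

```python
def parse_wheel_name(filename: str) -> dict:
    """Split a wheel file name (without a build tag) into its five fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name")
    stem = filename[: -len(".whl")]
    parts = stem.split("-")
    if len(parts) != 5:
        raise ValueError("expected five dash-separated fields (no build tag)")
    distribution, version, python_tag, abi_tag, platform_tag = parts
    return {
        "distribution": distribution,
        "version": version,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        # A compressed platform tag lists every platform the wheel supports.
        "platforms": platform_tag.split("."),
    }


info = parse_wheel_name(
    "cudf_cu13-26.2.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl"
)
print(info["distribution"], info["version"], info["python_tag"], info["platforms"])
```

Here the python tag cp313 corresponds to CPython 3.13, and the two platform tags correspond to the glibc 2.24+ and glibc 2.28+ entries shown for each file.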

File details

Details for the file cudf_cu13-26.2.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for cudf_cu13-26.2.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 777e12d75fe55a301e4d1030ba1e68217dfa30c368483ec10062742a23090617
MD5 55677acadb89765332041f3d17e6f6be
BLAKE2b-256 c98ba8f59196a2c81bf9cbf4095abf7a7cda099d2a281acac27cdc5798115ec3

See more details on using hashes here.
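
A downloaded wheel can be checked against the SHA256 digest listed above with Python's standard hashlib module. The sketch below is illustrative only: the function names `sha256_of` and `sha256_matches` are hypothetical, and it assumes the wheel has already been downloaded into the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_matches(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed local path: the wheel saved under its published file name.
    wheel = Path(
        "cudf_cu13-26.2.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl"
    )
    expected = "777e12d75fe55a301e4d1030ba1e68217dfa30c368483ec10062742a23090617"
    if wheel.exists():
        print("hash OK" if sha256_matches(wheel, expected) else "hash MISMATCH")
    else:
        print(f"{wheel} not found; download it first")
```

SHA256 is used here rather than MD5 because MD5 is not collision-resistant and is unsuitable for integrity checks against tampering.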

File details

Details for the file cudf_cu13-26.2.1-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for cudf_cu13-26.2.1-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 fefa974398801e95c42f2d8d232aad1f290f9318e0dd558806164da406b2b15d
MD5 4c9dc1eb926101f256b7eb64961dc7a7
BLAKE2b-256 f146ac2c1d685fcb75163511878ad872dd37095076c85970e7ae0b3ea1f70c5d

File details

Details for the file cudf_cu13-26.2.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for cudf_cu13-26.2.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 a69f8a805cf1dbe027f6a8d5a0679793d9441ff4bcfa31b1c4d3ac2f761c0a54
MD5 d47e2558a7fb2bb5600ce657cae65f31
BLAKE2b-256 c957ae3f76a931dd0360671fda2981b96cbc24dc6f17f34e4cc2f342e72fd13c

File details

Details for the file cudf_cu13-26.2.1-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for cudf_cu13-26.2.1-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 a01b782281bb5fa8046cf11d0d7fceece2ed98de7459cc6ff2006cc7231563e9
MD5 04633ec2479cf112ea29a13a68e4866a
BLAKE2b-256 d1f6e3c96d24bdef63a268906f1e64c1cc3d4abfb1166447d26aa7d6924ff887

File details

Details for the file cudf_cu13-26.2.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for cudf_cu13-26.2.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 f35ca8357b43dc7921acf9c632bc651b96ff775560a28f60c39fe2a2bd43ab6f
MD5 9400ca98f62e0b87e99ebae7fec5aa47
BLAKE2b-256 74db9912b3ebf65b8a3149b84f09ba929692fa33dc5cc361c98652c089712acd

File details

Details for the file cudf_cu13-26.2.1-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for cudf_cu13-26.2.1-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 bef3832feb9b8e97467951f5fe5b8a08d93932046ad48e301fceaf8522a0eb56
MD5 f15baaa0d7069d8e7fbddbe486eb872f
BLAKE2b-256 0730c5806c7a6e28877b86907291755e56f09f3540102dd64cbd7327ec02aeb4

File details

Details for the file cudf_cu13-26.2.1-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for cudf_cu13-26.2.1-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 733fd3d0c494c8a034bc3e297564fc8eff24d88b9989048ec8661943d763f621
MD5 cdc31f9fad8a0ad2308e4ada05a1ce4b
BLAKE2b-256 052d3b542873d3ad6b59df125ea433f49fd50d4e210198ca65cfad110dbda023

File details

Details for the file cudf_cu13-26.2.1-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for cudf_cu13-26.2.1-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 ee7bd46e6b3a48ef67771d483fda967e301425ec443239e4562cadf3c0f19644
MD5 76275dbf14aaea60d233c833f324a6ff
BLAKE2b-256 c471d325d6375169bfd8db580241b0b96f6998f629c6484d725a3004daa2ca99
