Bridge between pandas, cudf, modin, dask, dask-modin, dask-cudf, spark or spark+rapids and between numpy, cupy and dask.array
Project description
Virtual DataFrame
Motivation
With a pandas-like dataframe or a numpy-like array, do you want to write your code once and choose the framework at the end? Do you want to pick the best framework after simply measuring performance? This project unifies multiple pandas-compatible and numpy-compatible components, so that a single codebase works with all of them.
Do you want to use different architectures at different times of the year to be "green" and cheaper? Do you want to use a GPU only for Black Friday?
Synopsis
With a few parameters and virtual classes, you can write the code once and run it:
- With or without multicore
- With or without cluster (multi nodes)
- With or without GPU
To do that, we create some virtual classes, add some methods to other classes, and so on.
It is difficult to combine frameworks that use the same class names with similar semantics. For example, if you want to use Dask, cudf, pandas, modin, pyspark or pyspark+rapids in the same program, you must manage:
- pandas.DataFrame, pandas.Series
- modin.pandas.DataFrame, modin.pandas.Series
- cudf.DataFrame, cudf.Series
- dask.dataframe.DataFrame, dask.dataframe.Series
- pyspark.pandas.DataFrame, pyspark.pandas.Series
With numpy, you must manage:
- numpy.ndarray
- cupy.ndarray
- dask.array.Array
With cudf or cupy, the code must call .to_pandas() or cupy.asnumpy(). With dask, the code must call .compute(), and can use @delayed or dask.distributed.Client, etc.
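To illustrate the bookkeeping described above, here is a minimal sketch of a duck-typed helper that brings a result back to host memory, whichever framework produced it. The name `to_host` and the two stand-in classes are hypothetical and exist only for this illustration; they are not part of this project's API. The sketch only assumes that lazy collections expose `.compute()` and that GPU-backed frames expose `.to_pandas()`.

```python
from typing import Any


def to_host(obj: Any) -> Any:
    """Materialize a lazy or GPU-backed object (hypothetical helper)."""
    # Lazy collections (dask-style) expose .compute(): run the graph first.
    if callable(getattr(obj, "compute", None)):
        obj = obj.compute()
    # GPU-backed frames (cudf-style) expose .to_pandas(): copy to host memory.
    if callable(getattr(obj, "to_pandas", None)):
        obj = obj.to_pandas()
    return obj


# Stand-in classes so the sketch runs without any framework installed.
class FakeGpuFrame:
    def __init__(self, rows):
        self.rows = rows

    def to_pandas(self):
        # Pretend the returned list is a host-side frame.
        return list(self.rows)


class FakeLazyFrame:
    def __init__(self, rows):
        self.rows = rows

    def compute(self):
        # dask-cudf style: computing a lazy frame yields a GPU frame.
        return FakeGpuFrame(self.rows)


host_rows = to_host(FakeLazyFrame([1, 2, 3]))  # compute(), then to_pandas()
plain_rows = to_host([4, 5])                   # already on the host: unchanged
```

Every call site that may receive a result from more than one framework needs this kind of branching, which is the burden a uniform model is meant to remove.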
We propose to replace all these classes and scenarios with a uniform model inspired by dask (the most complex API). It then becomes possible to write the code once and use it in different environments and frameworks.
This project is essentially a back-port of Dask+cudf to the other frameworks. We try to normalize the API of all frameworks. The project weaves your code with the selected framework at runtime.
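As a rough idea of what selecting a framework at runtime can look like, the sketch below resolves a pandas-like module from a mode name. The environment variable `FRAME_BACKEND`, the `BACKENDS` table and both function names are assumptions made for this example; they are not this project's actual configuration or API.

```python
import importlib
import os

# Hypothetical mapping from a mode name to a module exposing a pandas-like API.
BACKENDS = {
    "pandas": "pandas",
    "modin": "modin.pandas",
    "cudf": "cudf",
    "dask": "dask.dataframe",
    "pyspark": "pyspark.pandas",
}


def backend_module_name(mode=None, env=None):
    """Return the module name for `mode`, or for $FRAME_BACKEND (assumed name)."""
    env = os.environ if env is None else env
    # An explicit mode wins; otherwise read the environment, defaulting to pandas.
    mode = (mode or env.get("FRAME_BACKEND", "pandas")).lower()
    if mode not in BACKENDS:
        raise ValueError(
            f"unknown backend {mode!r}; expected one of {sorted(BACKENDS)}"
        )
    return BACKENDS[mode]


def load_backend(mode=None):
    """Import the selected module; this fails if the framework is not installed."""
    return importlib.import_module(backend_module_name(mode))
```

A caller would write something like `pd = load_backend()` once and then code against `pd`. Swapping the module alone is not enough, because the frameworks differ in constructors, laziness and available methods; those differences are what the normalization described above has to absorb.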
File details
Details for the file mx07-0.2.dev0-py3-none-any.whl.
File metadata
- Download URL: mx07-0.2.dev0-py3-none-any.whl
- Upload date:
- Size: 26.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5683c86c093cfa8cd136d40b5db5b0fe93fd91fe793d51b809b43b09d0353187 |
| MD5 | 300bc9fd05cc1a5079e37d34906c36b3 |
| BLAKE2b-256 | 399a27aed6b0ae2d1bb89c03de57ecd4b0f0d5def781d2b8e94682dba04e4a5f |