dataframe operations
FRAMEX
Getting test datasets made easy.
Built on top of polars.
Installation
To get started, install the library with:
pip install framex
Usage
CLI
Get a single dataset:
fx get iris
or get multiple datasets:
fx get iris mpg titanic
Either command downloads the dataset(s) to the current directory.
fx list
This lists all datasets available on the remote server.
Python
import framex as fx
Loading datasets
iris = fx.load("iris")
This returns a polars DataFrame, so you can use all the polars functions and methods on it.
iris.head()
shape: (5, 5)
┌──────────────┬─────────────┬──────────────┬─────────────┬─────────┐
│ sepal_length ┆ sepal_width ┆ petal_length ┆ petal_width ┆ species │
│ ---          ┆ ---         ┆ ---          ┆ ---         ┆ ---     │
│ f32          ┆ f32         ┆ f32          ┆ f32         ┆ str     │
╞══════════════╪═════════════╪══════════════╪═════════════╪═════════╡
│ 5.1          ┆ 3.5         ┆ 1.4          ┆ 0.2         ┆ setosa  │
│ 4.9          ┆ 3.0         ┆ 1.4          ┆ 0.2         ┆ setosa  │
│ 4.7          ┆ 3.2         ┆ 1.3          ┆ 0.2         ┆ setosa  │
│ 4.6          ┆ 3.1         ┆ 1.5          ┆ 0.2         ┆ setosa  │
│ 5.0          ┆ 3.6         ┆ 1.4          ┆ 0.2         ┆ setosa  │
└──────────────┴─────────────┴──────────────┴─────────────┴─────────┘
iris = fx.load("iris", lazy=True)
Passing lazy=True returns a polars LazyFrame instead.
Both operations create local copies of the datasets by default (cache=True).
Available datasets
To see the list of available datasets, run:
fx.available()
{'remote': ['iris', 'mpg', 'netflix', 'starbucks', 'titanic'], 'local': ['titanic']}
which returns a dictionary of both locally and remotely available datasets.
To see only local or remote datasets, run:
fx.available("local")
{'local': ['titanic']}
fx.available("remote")
{'remote': ['iris', 'mpg', 'netflix', 'starbucks', 'titanic']}
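Since the result is a plain dictionary of lists, it can be processed with ordinary Python. As a small sketch, the snippet below works out which remote datasets have no local copy yet. The dictionary is a literal copy of the sample output above rather than a live call, and the variable names are illustrative only.

```python
# Literal copy of the sample fx.available() output shown above
available = {
    "remote": ["iris", "mpg", "netflix", "starbucks", "titanic"],
    "local": ["titanic"],
}

# Datasets that exist remotely but have not been downloaded locally
not_downloaded = sorted(set(available["remote"]) - set(available["local"]))
print(not_downloaded)
```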