dataframe operations

A DataVil project.

FrameX

FrameX is a lightweight dataset-fetching library for fast prototyping, tutorial creation, and experimentation.

Built on top of Polars.

Installation

To get started, install the library with:

pip install framex

Usage

Python

import framex as fx

Loading datasets

iris = fx.load("iris")

This returns a Polars DataFrame, so you can use all Polars functions and methods on it.

iris.head()
shape: (5, 5)
┌──────────────┬─────────────┬──────────────┬─────────────┬─────────┐
│ sepal_length ┆ sepal_width ┆ petal_length ┆ petal_width ┆ species │
│ ---          ┆ ---         ┆ ---          ┆ ---         ┆ ---     │
│ f32          ┆ f32         ┆ f32          ┆ f32         ┆ str     │
╞══════════════╪═════════════╪══════════════╪═════════════╪═════════╡
│ 5.1          ┆ 3.5         ┆ 1.4          ┆ 0.2         ┆ setosa  │
│ 4.9          ┆ 3.0         ┆ 1.4          ┆ 0.2         ┆ setosa  │
│ 4.7          ┆ 3.2         ┆ 1.3          ┆ 0.2         ┆ setosa  │
│ 4.6          ┆ 3.1         ┆ 1.5          ┆ 0.2         ┆ setosa  │
│ 5.0          ┆ 3.6         ┆ 1.4          ┆ 0.2         ┆ setosa  │
└──────────────┴─────────────┴──────────────┴─────────────┴─────────┘

Pass lazy=True to get a lazy version instead:

iris = fx.load("iris", lazy=True)

This returns a Polars LazyFrame.

Both operations create local copies of the datasets by default (cache=True).

Available datasets

To see the list of available datasets, run:

fx.available()
{'remote': ['iris', 'mpg', 'netflix', 'starbucks', 'titanic'], 'local': ['titanic']}

This returns a dictionary of the datasets available locally and remotely.

To see only local or remote datasets, run:

fx.available("local")
fx.available("remote")
{'local': ['titanic']}
{'remote': ['iris', 'mpg', 'netflix', 'starbucks', 'titanic']}
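Because the result is a plain dictionary, it can be processed with ordinary Python. The sketch below uses a value copied from the output above rather than a live call, and the helper name not_cached is hypothetical. It also assumes that the "local" list reflects datasets already stored locally.

```python
# Sample value copied from the output shown above; in practice it would
# come from fx.available().
available = {
    "remote": ["iris", "mpg", "netflix", "starbucks", "titanic"],
    "local": ["titanic"],
}


def not_cached(available):
    """Return remote dataset names that have no local copy yet (hypothetical helper)."""
    local = set(available.get("local", []))
    return [name for name in available.get("remote", []) if name not in local]


missing = not_cached(available)
print(missing)  # ['iris', 'mpg', 'netflix', 'starbucks']
```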

Getting information on datasets

To get information on a dataset, run:

fx.about("mpg") # basically the same as `fx.about("mpg", mode="print")`

This prints the dataset information as follows:

NAME    : mpg
SOURCE  : https://www.kaggle.com/datasets/uciml/autompg-dataset
LICENSE : CC0: Public Domain
ORIGIN  : Kaggle
OG NAME : autompg-dataset

Alternatively, get the information as a single-row polars.DataFrame by running:

row = fx.about("mpg", mode="row")
print(row)

This prints the dataset information as an ASCII table:

shape: (1, 4)
┌──────┬─────────────────────────────────┬────────────────────┬────────┐
│ name ┆ source                          ┆ license            ┆ origin │
│ ---  ┆ ---                             ┆ ---                ┆ ---    │
│ str  ┆ str                             ┆ str                ┆ str    │
╞══════╪═════════════════════════════════╪════════════════════╪════════╡
│ mpg  ┆ https://www.kaggle.com/dataset… ┆ CC0: Public Domain ┆ Kaggle │
└──────┴─────────────────────────────────┴────────────────────┴────────┘

You can also treat row as a regular Polars DataFrame in your code.

Getting dataset URLs

If you need the direct file links, use get_url:

url_pokemon = fx.get_url("pokemon")

By default, the format is "feather".

Optionally, you can specify the format of the dataset.

url_pokemon_csv = fx.get_url("pokemon", format="csv")

CLI

Get a single dataset:

fx get iris

or get multiple datasets:

fx get iris mpg titanic

which will download dataset(s) to the current directory.
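If you want to trigger the same download from a Python script, one option is to call the CLI through subprocess. This is a minimal sketch: the helper names build_get_command and download are hypothetical, and it assumes the fx executable on PATH is the FrameX CLI.

```python
import shutil
import subprocess


def build_get_command(names):
    """Return the argument list for `fx get <names...>` (hypothetical helper)."""
    if not names:
        raise ValueError("at least one dataset name is required")
    return ["fx", "get", *names]


def download(names):
    """Run `fx get` if an fx executable is on PATH; return True on success."""
    if shutil.which("fx") is None:
        return False  # FrameX CLI not installed in this environment
    completed = subprocess.run(build_get_command(names), check=False)
    return completed.returncode == 0


print(build_get_command(["iris", "mpg", "titanic"]))
# ['fx', 'get', 'iris', 'mpg', 'titanic']
```

Calling download(["iris", "mpg"]) would then place the files in the current working directory, as the CLI does.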

For more options, run:

fx get --help

To list the names of all datasets available on the remote server, run:

fx list
