Python DataFrames powered by Julia

grizzlys



grizzlys: User-friendly Python DataFrames powered by Julia

grizzlys is a Python package that provides a native interface on top of Julia's popular DataFrames.jl package.

grizzlys is a user-friendly alternative to existing Python packages such as pandas and Polars. It is designed to be a convenient, easy-to-use DataFrame tool for data analysts, data engineers, and data scientists alike, while still offering good performance and high-level abstractions thanks to Julia's high-performance computing capabilities.

Why you might consider using grizzlys

:white_check_mark: You are transitioning to Python from a Julia or R background

:white_check_mark: You are accustomed to working with Jupyter notebooks (or a REPL) and performing exploratory data analysis (EDA) on-the-fly

:white_check_mark: You need a quick-and-dirty data wrangling tool that provides ready-made macros and convenience functions out of the box

:white_check_mark: You work with statistics or linear algebra often and require a wide range of statistical/algebraic functions to be well-integrated with your DataFrames

What is grizzlys (currently) NOT well-suited for

:x: Larger-than-memory datasets - grizzlys' current implementation relies on data being stored in memory, so it is not a good choice if you work with datasets that don't fit in your machine's RAM.

For such cases, Polars or Dask DataFrames are currently a much better choice.

:x: Lazy Evaluation - grizzlys is currently designed to be fully eager: it executes each operation immediately, as opposed to building a task/computation graph and delaying execution until the result is actually needed.

:x: Backwards compatibility - grizzlys is based on Julia, a relatively new programming language, and is developed against a recent version of Python, with little regard for end-of-life versions or compatibility with Python 2.7, for example.

You should therefore not rely on grizzlys for integrations with very old code or any other legacy/deprecated tools and implementations.

:x: Best-in-class Performance - Julia is widely considered a very high-performance language, which is a major reason why it is used under the hood here. However, grizzlys is still a work-in-progress (WIP) and does not currently aim to compete with, or outperform, other high-performance DataFrame libraries such as Polars (written in Rust) or Modin (a parallelized drop-in replacement for pandas).

This, of course, might no longer be a limitation in the future, as grizzlys matures and is further optimized.
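To make the eager vs. lazy distinction from the list above concrete, here is a minimal sketch in plain Python. It does not use the grizzlys API; `eager_filter`, `LazyPipeline`, and `is_big` are hypothetical names invented purely for illustration. The eager function does its work as soon as it is called, while the lazy pipeline only records the step and runs it when `collect()` is called.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]


def eager_filter(rows: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Eager style: the filtering work happens immediately on call."""
    return [row for row in rows if predicate(row)]


class LazyPipeline:
    """Hypothetical lazy style: steps are recorded and only run on collect()."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows
        self._steps: List[Callable[[List[Row]], List[Row]]] = []

    def filter(self, predicate: Callable[[Row], bool]) -> "LazyPipeline":
        # Record the step; nothing is computed yet.
        self._steps.append(lambda rows: [r for r in rows if predicate(r)])
        return self

    def collect(self) -> List[Row]:
        # Execute every recorded step, in order, only now.
        result = list(self._rows)
        for step in self._steps:
            result = step(result)
        return result


rows = [{"x": 1}, {"x": 5}, {"x": 10}]
calls: List[int] = []  # records each predicate call so execution timing is visible


def is_big(row: Row) -> bool:
    calls.append(row["x"])
    return row["x"] > 3


eager_result = eager_filter(rows, is_big)     # predicate runs right away
calls_after_eager = len(calls)

pipeline = LazyPipeline(rows).filter(is_big)  # only records the step
calls_after_build = len(calls)

lazy_result = pipeline.collect()              # predicate runs here
calls_after_collect = len(calls)
```

Both styles produce the same rows; the difference is when the work happens. A lazy engine can use that delay to reorder or fuse steps before running them, which is the kind of optimization an eager design gives up.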



Download files


Source Distribution: grizzlys-0.0.1.tar.gz (10.1 kB)

Built Distribution: grizzlys-0.0.1-py3-none-any.whl (10.1 kB, Python 3)
