Python DataFrames powered by Julia
Project description
grizzlys: User-friendly Python DataFrames powered by Julia
grizzlys is a Python package that provides a native interface on top of Julia's popular DataFrames.jl package.
As a user-friendly alternative to existing Python packages such as pandas and Polars, it is designed to be a convenient, easy-to-use DataFrame tool for data analysts, data engineers, and data scientists alike, while still offering good performance and high-level abstractions thanks to Julia's high-performance computing capabilities.
Why you might consider using grizzlys
✅ You are transitioning into Python from a Julia or R programming background
✅ You are accustomed to working with Jupyter notebooks (or a REPL) and performing exploratory data analysis (EDA) on-the-fly
✅ You need a quick-and-dirty data wrangling tool that provides readymade macros and convenience functions out of the box
✅ You work with statistics or linear algebra often and require a wide range of statistical/algebraic functions to be well-integrated with your DataFrames
What is grizzlys (currently) NOT well-suited for
❌ Larger-than-memory datasets - grizzlys' current implementation relies on data being stored in memory, so it is not a good choice if you work with datasets that don't fit in your machine's RAM.
For such cases, Polars or Dask DataFrames are a much better choice for now.
❌ Lazy Evaluation - Similar to the above, grizzlys is currently designed to be fully eager: it executes your code immediately, rather than building a task/computation graph and deferring execution until the result is needed.
❌ Backwards compatibility - grizzlys is built on Julia, a relatively new programming language, and is developed against a recent version of Python, with little regard for end-of-life versions or compatibility with, for example, Python 2.7.
You should therefore not rely on grizzlys for integrations with very old code or any other legacy/deprecated tools and implementations.
❌ Best-in-class Performance - Though Julia is widely considered a very high-performance language (a major reason it is used under the hood here), grizzlys is still a work in progress (WIP) and does not currently aim to compete with, or outperform, other high-performance DataFrame libraries such as Polars (written in Rust) or Modin (multi-threaded pandas).
This may, of course, stop being a limitation in the future as grizzlys is optimized and matures.
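To make the eager versus lazy distinction above concrete, here is a minimal pure-Python sketch. It does not use the grizzlys API (nor that of Polars or Dask); the names `eager_filter`, `eager_select` and `LazyPipeline` are hypothetical and exist only for illustration. The eager functions do their work as soon as they are called, while the lazy pipeline only records the steps and runs them when `collect()` is invoked.

```python
from dataclasses import dataclass, field


def eager_filter(rows, predicate):
    """Eager: the filtering happens right now and returns a full list."""
    return [row for row in rows if predicate(row)]


def eager_select(rows, columns):
    """Eager: the projection happens right now and returns a full list."""
    return [{col: row[col] for col in columns} for row in rows]


@dataclass
class LazyPipeline:
    """Hypothetical lazy wrapper: records steps, runs nothing until collect()."""
    rows: list
    steps: list = field(default_factory=list)

    def filter(self, predicate):
        # Only remember the step; no row is touched yet.
        return LazyPipeline(self.rows, self.steps + [("filter", predicate)])

    def select(self, columns):
        # Only remember the step; no row is touched yet.
        return LazyPipeline(self.rows, self.steps + [("select", columns)])

    def collect(self):
        # Execution is deferred until this point.
        out = self.rows
        for kind, arg in self.steps:
            out = eager_filter(out, arg) if kind == "filter" else eager_select(out, arg)
        return out


rows = [{"name": "a", "x": 1}, {"name": "b", "x": 5}]

# Eager style: each call produces its result immediately.
eager_result = eager_select(eager_filter(rows, lambda r: r["x"] > 2), ["name"])

# Lazy style: building the pipeline does no work; collect() triggers it.
lazy_plan = LazyPipeline(rows).filter(lambda r: r["x"] > 2).select(["name"])
lazy_result = lazy_plan.collect()
```

Both styles produce the same rows here. What a lazy design buys in real libraries is the chance to inspect and optimize the recorded plan before running it, which this toy version does not attempt.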
Hashes for grizzlys-0.0.1.dev1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 65048dce2bf48afc91de657b00bd7d50d0c712c0a7e51ddea76f955a6091c81b
MD5 | 336ef67c6deba025953a36b09ad93a3c
BLAKE2b-256 | 40114460bacfb26365f2964602144eedad9af7029aec11462f1f8d9080f1d0dd
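The digests in the table above can be used to check a downloaded copy of the wheel. The following is a minimal sketch using only Python's standard `hashlib`; the helper names `file_digest` and `matches_published` are hypothetical, and it assumes that "BLAKE2b-256" means BLAKE2b with a 32-byte digest.

```python
import hashlib

# Digests copied from the table above for grizzlys-0.0.1.dev1-py3-none-any.whl.
PUBLISHED = {
    "sha256": "65048dce2bf48afc91de657b00bd7d50d0c712c0a7e51ddea76f955a6091c81b",
    "md5": "336ef67c6deba025953a36b09ad93a3c",
    "blake2b-256": "40114460bacfb26365f2964602144eedad9af7029aec11462f1f8d9080f1d0dd",
}


def _new_hasher(algorithm):
    # Assumption: "BLAKE2b-256" is BLAKE2b truncated to a 32-byte (256-bit) digest.
    if algorithm == "blake2b-256":
        return hashlib.blake2b(digest_size=32)
    return hashlib.new(algorithm)


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, read in chunks."""
    hasher = _new_hasher(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published(path, algorithm="sha256"):
    """True if the file's digest equals the published one for `algorithm`."""
    return file_digest(path, algorithm) == PUBLISHED[algorithm]
```

For example, `matches_published("grizzlys-0.0.1.dev1-py3-none-any.whl")` would be expected to return `True` for an intact download, assuming the file sits in the current directory. SHA256 is the sensible default; MD5 is listed only for completeness and should not be relied on for integrity on its own.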