Extremely lightweight compatibility layer between pandas, Polars, cuDF, and Modin
Narwhals
Extremely lightweight compatibility layer between Polars, pandas, and more.
Seamlessly support both, without depending on either!
- ✅ Just use a subset of the Polars API, no need to learn anything new
- ✅ No dependencies (not even Polars), keep your library lightweight
- ✅ Separate Lazy and Eager APIs
- ✅ Use Polars Expressions
Note: this is a work in progress and a bit of an experiment, so don't take it too seriously.
Installation
pip install narwhals
Or just vendor it: it's only a bunch of pure-Python files.
Usage
There are three required steps, plus one optional step, to writing dataframe-agnostic code using Narwhals:
- use narwhals.translate_frame to wrap a pandas or Polars DataFrame in a Narwhals DataFrame
- (optional) use narwhals.get_namespace to get a namespace object
- use the subset of the Polars API defined in https://github.com/MarcoGorelli/narwhals/blob/main/narwhals/spec/__init__.py. Some methods are only available if you called narwhals.translate_frame with is_eager=True
- use narwhals.to_native to return an object to the user in their original dataframe flavour. For example:
  - if you started with pandas, you'll get pandas back
  - if you started with Polars, you'll get Polars back
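The wrap, operate, unwrap pattern behind these steps can be illustrated with a toy sketch that uses only the standard library. Nothing here is Narwhals code: WrappedFrame, wrap, unwrap, keep_rows and heavy_parts are hypothetical names, and the two "backends" are just a list and a tuple of row dicts standing in for pandas and Polars.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class WrappedFrame:
    """Hypothetical wrapper, not a Narwhals class: it just remembers the native object."""

    native: Any

    def keep_rows(self, predicate: Callable[[dict], bool]) -> "WrappedFrame":
        # Dispatch on the native container type, then re-wrap the result
        # so that calls can be chained.
        if isinstance(self.native, list):
            kept: Any = [row for row in self.native if predicate(row)]
        elif isinstance(self.native, tuple):
            kept = tuple(row for row in self.native if predicate(row))
        else:
            raise TypeError(f"unsupported backend: {type(self.native).__name__}")
        return WrappedFrame(kept)


def wrap(native: Any) -> WrappedFrame:
    """First step: wrap the user's object (plays the role of translate_frame)."""
    return WrappedFrame(native)


def unwrap(frame: WrappedFrame) -> Any:
    """Final step: hand back the native object (plays the role of to_native)."""
    return frame.native


def heavy_parts(parts_native: Any) -> Any:
    """Backend-agnostic function: the same code path serves both toy backends."""
    return unwrap(wrap(parts_native).keep_rows(lambda row: row["weight"] > 14))


rows = [{"p": "P1", "weight": 12.0}, {"p": "P2", "weight": 17.0}]
print(heavy_parts(rows))         # a list goes in, a list comes out
print(heavy_parts(tuple(rows)))  # a tuple goes in, a tuple comes out
```

The caller never sees the wrapper: whatever container type is passed in is the type that comes back, which is the same promise the last step makes for pandas and Polars.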
Example
Here's an example of a dataframe-agnostic function:
from typing import TypeVar

import pandas as pd
import polars as pl

from narwhals import translate_frame, get_namespace, to_native

AnyDataFrame = TypeVar("AnyDataFrame")


def my_agnostic_function(
    suppliers_native: AnyDataFrame,
    parts_native: AnyDataFrame,
) -> AnyDataFrame:
    # Wrap the native dataframes in Narwhals DataFrames.
    suppliers = translate_frame(suppliers_native)
    parts = translate_frame(parts_native)
    # Inside this function, `pl` is the Narwhals namespace, not the polars module.
    pl = get_namespace(suppliers)
    result = (
        suppliers.join(parts, left_on="city", right_on="city")
        .filter(
            pl.col("color").is_in(["Red", "Green"]),
            pl.col("weight") > 14,
        )
        .group_by("s", "p")
        .agg(
            weight_mean=pl.col("weight").mean(),
            weight_max=pl.col("weight").max(),
        )
    )
    # Hand back an object in the caller's original dataframe flavour.
    return to_native(result)
You can pass in a pandas or Polars dataframe, and the output will be the same! Let's try it out:
suppliers = {
    "s": ["S1", "S2", "S3", "S4", "S5"],
    "sname": ["Smith", "Jones", "Blake", "Clark", "Adams"],
    "status": [20, 10, 30, 20, 30],
    "city": ["London", "Paris", "Paris", "London", "Athens"],
}
parts = {
    "p": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "pname": ["Nut", "Bolt", "Screw", "Screw", "Cam", "Cog"],
    "color": ["Red", "Green", "Blue", "Red", "Blue", "Red"],
    "weight": [12.0, 17.0, 17.0, 14.0, 12.0, 19.0],
    "city": ["London", "Paris", "Oslo", "London", "Paris", "London"],
}

print("pandas output:")
print(
    my_agnostic_function(
        pd.DataFrame(suppliers),
        pd.DataFrame(parts),
    )
)
print("\nPolars output:")
print(
    my_agnostic_function(
        pl.DataFrame(suppliers),
        pl.DataFrame(parts),
    )
)
print("\nPolars lazy output:")
print(
    my_agnostic_function(
        pl.LazyFrame(suppliers),
        pl.LazyFrame(parts),
    ).collect()
)
pandas output:
    s   p  weight_mean
0  S1  P6         19.0
1  S2  P2         17.0
2  S3  P2         17.0
3  S4  P6         19.0

Polars output:
shape: (4, 3)
┌─────┬─────┬─────────────┐
│ s   ┆ p   ┆ weight_mean │
│ --- ┆ --- ┆ ---         │
│ str ┆ str ┆ f64         │
╞═════╪═════╪═════════════╡
│ S1  ┆ P6  ┆ 19.0        │
│ S3  ┆ P2  ┆ 17.0        │
│ S4  ┆ P6  ┆ 19.0        │
│ S2  ┆ P2  ┆ 17.0        │
└─────┴─────┴─────────────┘

Polars lazy output:
shape: (4, 3)
┌─────┬─────┬─────────────┐
│ s   ┆ p   ┆ weight_mean │
│ --- ┆ --- ┆ ---         │
│ str ┆ str ┆ f64         │
╞═════╪═════╪═════════════╡
│ S1  ┆ P6  ┆ 19.0        │
│ S3  ┆ P2  ┆ 17.0        │
│ S4  ┆ P6  ┆ 19.0        │
│ S2  ┆ P2  ┆ 17.0        │
└─────┴─────┴─────────────┘
Magic! 🪄
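To see where those four rows come from, the same join, filter and aggregation can be traced by hand in plain Python. This is only a cross-check that uses the standard library, not how Narwhals, pandas or Polars compute the result; to_rows and trace_example are hypothetical helper names, and only the columns the query touches are copied from the data above.

```python
from collections import defaultdict
from statistics import mean

# Only the columns used by the query, copied from the example data.
suppliers = {
    "s": ["S1", "S2", "S3", "S4", "S5"],
    "city": ["London", "Paris", "Paris", "London", "Athens"],
}
parts = {
    "p": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "color": ["Red", "Green", "Blue", "Red", "Blue", "Red"],
    "weight": [12.0, 17.0, 17.0, 14.0, 12.0, 19.0],
    "city": ["London", "Paris", "Oslo", "London", "Paris", "London"],
}


def to_rows(columns):
    """Turn a dict of equal-length columns into a list of row dicts."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


def trace_example(suppliers, parts):
    """Inner join on city, filter on color and weight, then group by (s, p)."""
    groups = defaultdict(list)
    for s_row in to_rows(suppliers):
        for p_row in to_rows(parts):
            if s_row["city"] != p_row["city"]:
                continue  # join condition
            if p_row["color"] not in ("Red", "Green"):
                continue  # first filter
            if not p_row["weight"] > 14:
                continue  # second filter
            groups[(s_row["s"], p_row["p"])].append(p_row["weight"])
    # One (mean, max) pair per group, sorted by key for a stable order.
    return {key: (mean(w), max(w)) for key, w in sorted(groups.items())}


result = trace_example(suppliers, parts)
for (s, p), (weight_mean, weight_max) in result.items():
    print(s, p, weight_mean, weight_max)
```

Each surviving group contains a single part, so the mean and the max of weight coincide. The function above also requests weight_max, although the outputs shown list only weight_mean. The row order differs between the pandas and Polars outputs, but the set of rows is the same.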
Scope
- Do you maintain a dataframe-consuming library?
- Is there a Polars function which you'd like Narwhals to have, which would make your job easier?
If so, I'd love to hear from you!
Note: You might suspect that this is a secret ploy to infiltrate the Polars API everywhere. Indeed, you may suspect that.
Why "Narwhals"?
Because they are so awesome.