Polars
Blazingly fast DataFrames in Rust & Python
Polars is a blazingly fast DataFrames library implemented in Rust, using Apache Arrow (arrow2) as its memory model.
- Lazy | eager execution
- Multi-threaded
- SIMD
- Query optimization
- Powerful expression API
- Rust | Python | ...
To learn more, read the User Guide.
>>> import polars as pl
>>> from polars import col, lit
>>> df = pl.DataFrame(
{
"A": [1, 2, 3, 4, 5],
"fruits": ["banana", "banana", "apple", "apple", "banana"],
"B": [5, 4, 3, 2, 1],
"cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
}
)
# embarrassingly parallel execution
# very expressive query language
>>> (df
.sort("fruits")
.select([
"fruits",
"cars",
lit("fruits").alias("literal_string_fruits"),
col("B").filter(col("cars") == "beetle").sum(),
col("A").filter(col("B") > 2).sum().over("cars").alias("sum_A_by_cars"), # groups by "cars"
col("A").sum().over("fruits").alias("sum_A_by_fruits"), # groups by "fruits"
col("A").reverse().over("fruits").flatten().alias("rev_A_by_fruits"), # groups by "fruits"
col("A").sort_by("B").over("fruits").flatten().alias("sort_A_by_B_by_fruits") # groups by "fruits"
]))
shape: (5, 8)
┌──────────┬──────────┬──────────────┬─────┬─────────────┬─────────────┬─────────────┬─────────────┐
│ fruits ┆ cars ┆ literal_stri ┆ B ┆ sum_A_by_ca ┆ sum_A_by_fr ┆ rev_A_by_fr ┆ sort_A_by_B │
│ --- ┆ --- ┆ ng_fruits ┆ --- ┆ rs ┆ uits ┆ uits ┆ _by_fruits │
│ str ┆ str ┆ --- ┆ i64 ┆ --- ┆ --- ┆ --- ┆ --- │
│ ┆ ┆ str ┆ ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞══════════╪══════════╪══════════════╪═════╪═════════════╪═════════════╪═════════════╪═════════════╡
│ "apple" ┆ "beetle" ┆ "fruits" ┆ 11 ┆ 4 ┆ 7 ┆ 4 ┆ 4 │
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ "apple" ┆ "beetle" ┆ "fruits" ┆ 11 ┆ 4 ┆ 7 ┆ 3 ┆ 3 │
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ "banana" ┆ "beetle" ┆ "fruits" ┆ 11 ┆ 4 ┆ 8 ┆ 5 ┆ 5 │
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ "banana" ┆ "audi" ┆ "fruits" ┆ 11 ┆ 2 ┆ 8 ┆ 2 ┆ 2 │
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ "banana" ┆ "beetle" ┆ "fruits" ┆ 11 ┆ 4 ┆ 8 ┆ 1 ┆ 1 │
└──────────┴──────────┴──────────────┴─────┴─────────────┴─────────────┴─────────────┴─────────────┘
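To make the window expressions above concrete, the following plain-Python sketch reproduces what col("A").sum().over("fruits") and the filtered sum_A_by_cars column compute, without using Polars itself. The helper name `sum_over` is hypothetical and exists only for this illustration; it is not part of the Polars API, and it says nothing about how Polars executes the query internally.

```python
from collections import defaultdict

# Same data as the example above, already sorted by "fruits" (stable sort).
rows = [
    {"A": 3, "fruits": "apple", "B": 3, "cars": "beetle"},
    {"A": 4, "fruits": "apple", "B": 2, "cars": "beetle"},
    {"A": 1, "fruits": "banana", "B": 5, "cars": "beetle"},
    {"A": 2, "fruits": "banana", "B": 4, "cars": "audi"},
    {"A": 5, "fruits": "banana", "B": 1, "cars": "beetle"},
]


def sum_over(rows, value, key, keep=lambda row: True):
    """Sum `value` per group of `key`, then broadcast the total to every row.

    `keep` plays the role of a filter applied before the aggregation.
    Hypothetical helper for illustration only, not a Polars function.
    """
    totals = defaultdict(int)
    for row in rows:
        if keep(row):
            totals[row[key]] += row[value]
    # One output value per input row: the row count is unchanged.
    return [totals[row[key]] for row in rows]


sum_a_by_fruits = sum_over(rows, "A", "fruits")
sum_a_by_cars = sum_over(rows, "A", "cars", keep=lambda row: row["B"] > 2)

print(sum_a_by_fruits)  # [7, 7, 8, 8, 8]
print(sum_a_by_cars)    # [4, 4, 4, 2, 4]
```

Unlike a plain group-by, the result keeps one value per input row, which is why these columns can sit next to "fruits" and "cars" in the same output table.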
Performance 🚀🚀
Polars is very fast, and in fact is one of the best performing solutions available. See the results in h2oai's db-benchmark.
Rust setup
You can take the latest release from crates.io, or, if you want the latest features and performance improvements, point to the master branch of this repo.
polars = { git = "https://github.com/pola-rs/polars", rev = "<optional git tag>" }
Rust version
Required Rust version >=1.52
Python users read this!
Polars is currently transitioning from py-polars to polars. Some docs may still refer to the old name.
Install the latest polars version with:
$ pip3 install polars
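After installing, it can be useful to confirm which name is actually present in the current environment, since older installs may still use the previous names. The sketch below relies only on the standard library (`importlib.metadata`, available from Python 3.8). The helper name `installed_polars` is hypothetical, and the old names py-polars and pypolars are taken from the naming note later in this document.

```python
from importlib import metadata, util


def installed_polars():
    """Return (distribution name, version) for the first match, else None.

    Hypothetical helper: checks the current package name first, then the
    previous one.
    """
    for dist in ("polars", "py-polars"):
        try:
            return dist, metadata.version(dist)
        except metadata.PackageNotFoundError:
            continue
    return None


found = installed_polars()
if found is None:
    print("polars is not installed in this environment")
else:
    print(f"{found[0]} {found[1]} is installed")

# Importable module names: "polars" is current, "pypolars" is the old one.
for module in ("polars", "pypolars"):
    print(module, "importable:", util.find_spec(module) is not None)
```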
Documentation
Want to know about all the features Polars supports? Read the docs!
Rust
Python
- installation guide:
$ pip3 install polars
- User Guide
- Reference guide
Contribution
Want to contribute? Read our contribution guideline.
[Python] compile py-polars from source
If you want a bleeding-edge release or maximal performance, you should compile py-polars from source.
This can be done by going through the following steps in sequence:
- Install the latest Rust compiler
- $ pip3 install maturin
- Choose any of:
  - Very long compile times, fastest binary:
    $ cd py-polars && maturin develop --rustc-extra-args="-C target-cpu=native" --release
  - Shorter compile times, fast binary:
    $ cd py-polars && maturin develop --rustc-extra-args="-C codegen-units=16 -C lto=thin -C target-cpu=native" --release
Note that the Rust crate implementing the Python bindings is called py-polars to distinguish it from the wrapped Rust crate polars itself. However, both the Python package and the Python module are named polars, so you can pip install polars and import polars (previously, these were called py-polars and pypolars).
Arrow2
Polars has transitioned to arrow2. Arrow2 is a faster and safer implementation of the Arrow spec.
Arrow2 also has a more granular code base, which helps to reduce compiler bloat.
There is still a maintained arrow-rs branch for users who want to use another backend.
Acknowledgements
Development of Polars is proudly powered by
Sponsors
Built Distributions
Hashes for polars-0.10.15-cp36-abi3-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 67f5454aab7e88242f4886aa3bb2c8608c74433c660f3f693843f2a9c6993a26
MD5 | b2a186679d4d9f455ca0cefbf7282c5d
BLAKE2b-256 | 7dd38201d332d8105558c70f4c0b7776aca72b86f55ff0ee792c93e463661611
Hashes for polars-0.10.15-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 2cd76a106ee119727ec96152d853e4d2e8154545bf037fb6d405c4dc6479d209
MD5 | 85a91fbd4c4ef246192aae04521a5069
BLAKE2b-256 | 1eff957d9ea20ecc10f472a18fc438585a8b7c1c14396ab8c6c954bf34770202
Hashes for polars-0.10.15-cp36-abi3-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl
Algorithm | Hash digest
---|---
SHA256 | 48d1d32e68c6f982bf0ca0ab0e31b5268cd7d5225011051805c89af6c84ed8f7
MD5 | ffc93afdaa1b62c413fe72442e171b80
BLAKE2b-256 | d9e800a12c19a551cec9b09ec6553da71db898dd3afad196de7ef5f0f9db3d21
Hashes for polars-0.10.15-cp36-abi3-macosx_10_7_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 8716fdf3b8aecaf783d83cb9e635589038b571ceec180c55d077beb9d36b1391
MD5 | 5ba2cc56d3aa443048de95907ed5a82f
BLAKE2b-256 | dd37359a25a951618e57b3e371ac0201791e2b3809afbb013ddc2445c96da670
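The digests listed above can be compared against a downloaded file before installing it. The following is a minimal sketch using only `hashlib` from the standard library. The helper name `sha256_of` and the local file path are assumptions for illustration; the expected digest is the SHA256 value listed above for the win_amd64 wheel.

```python
import hashlib
from pathlib import Path

# SHA256 listed above for polars-0.10.15-cp36-abi3-win_amd64.whl
EXPECTED_SHA256 = "67f5454aab7e88242f4886aa3bb2c8608c74433c660f3f693843f2a9c6993a26"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large wheels are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed download location; adjust to wherever the wheel was saved.
wheel = Path("polars-0.10.15-cp36-abi3-win_amd64.whl")
if wheel.exists():
    ok = sha256_of(wheel) == EXPECTED_SHA256
    print("digest matches" if ok else "digest mismatch, do not install")
else:
    print(f"{wheel} not found, nothing to verify")
```

The same pattern applies to the other wheels by swapping in the matching file name and SHA256 value.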