
Blazingly fast DataFrame library

Reason this release was yanked:

regression

Project description


Documentation: Python - Rust - Node.js | StackOverflow: Python - Rust - Node.js | User Guide | Discord

Polars: Blazingly fast DataFrames in Rust, Python & Node.js

Polars is a blazingly fast DataFrames library implemented in Rust, using the Apache Arrow Columnar Format as its memory model.

  • Lazy | eager execution
  • Multi-threaded
  • SIMD
  • Query optimization
  • Powerful expression API
  • Hybrid Streaming (larger than RAM datasets)
  • Rust | Python | NodeJS | ...

To learn more, read the User Guide.

>>> import polars as pl
>>> df = pl.DataFrame(
...     {
...         "A": [1, 2, 3, 4, 5],
...         "fruits": ["banana", "banana", "apple", "apple", "banana"],
...         "B": [5, 4, 3, 2, 1],
...         "cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
...     }
... )

# embarrassingly parallel execution & very expressive query language
>>> df.sort("fruits").select(
...     [
...         "fruits",
...         "cars",
...         pl.lit("fruits").alias("literal_string_fruits"),
...         pl.col("B").filter(pl.col("cars") == "beetle").sum(),
...         pl.col("A").filter(pl.col("B") > 2).sum().over("cars").alias("sum_A_by_cars"),
...         pl.col("A").sum().over("fruits").alias("sum_A_by_fruits"),
...         pl.col("A").reverse().over("fruits").alias("rev_A_by_fruits"),
...         pl.col("A").sort_by("B").over("fruits").alias("sort_A_by_B_by_fruits"),
...     ]
... )
shape: (5, 8)
┌──────────┬──────────┬──────────────┬─────┬─────────────┬─────────────┬─────────────┬─────────────┐
 fruits    cars      literal_stri  B    sum_A_by_ca  sum_A_by_fr  rev_A_by_fr  sort_A_by_B 
 ---       ---       ng_fruits     ---  rs           uits         uits         _by_fruits  
 str       str       ---           i64  ---          ---          ---          ---         
                     str                i64          i64          i64          i64         
╞══════════╪══════════╪══════════════╪═════╪═════════════╪═════════════╪═════════════╪═════════════╡
 "apple"   "beetle"  "fruits"      11   4            7            4            4           
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
 "apple"   "beetle"  "fruits"      11   4            7            3            3           
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
 "banana"  "beetle"  "fruits"      11   4            8            5            5           
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
 "banana"  "audi"    "fruits"      11   2            8            2            2           
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
 "banana"  "beetle"  "fruits"      11   4            8            1            1           
└──────────┴──────────┴──────────────┴─────┴─────────────┴─────────────┴─────────────┴─────────────┘

Performance 🚀🚀

Polars is very fast. In fact, it is one of the best performing solutions available. See the results in h2oai's db-benchmark.

In the TPC-H benchmarks, polars is orders of magnitude faster than pandas, dask, modin and vaex on full queries (including IO).

Besides being fast, polars is also very lightweight. It comes with zero required dependencies, and this shows in the import times:

import time measurements:

  • polars: 70ms
  • numpy: 104ms
  • pandas: 520ms
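
The README does not say how these figures were obtained. As a rough sketch of one way to take such a measurement, the snippet below times an import in a fresh interpreter so that already-cached modules do not skew the result. The helper name `measure_import_ms` is hypothetical, and the standard-library modules in the demo are stand-ins for polars, numpy or pandas.

```python
import subprocess
import sys


def measure_import_ms(module_name: str) -> float:
    """Time `import module_name` in a fresh interpreter, in milliseconds.

    Hypothetical helper for illustration; not part of polars.
    """
    snippet = (
        "import time\n"
        "start = time.perf_counter()\n"
        f"import {module_name}\n"
        "print((time.perf_counter() - start) * 1000)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        check=True,
    )
    # The timing is the last line printed by the child process.
    return float(result.stdout.strip().splitlines()[-1])


if __name__ == "__main__":
    # Stand-in modules; swap in "polars", "numpy" or "pandas" if installed.
    for name in ("json", "decimal"):
        print(f"{name}: {measure_import_ms(name):.1f}ms")
```

Numbers from a single run are noisy, so repeating the measurement a few times and taking the minimum is usually more representative.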

Python setup

Install the latest polars version with:

pip install polars

We also have a conda package (conda install polars); however, pip is the preferred way to install Polars.

Install Polars with all optional dependencies.

pip install 'polars[all]'
pip install 'polars[numpy,pandas,pyarrow]'  # install a subset of all optional dependencies

You can also install the dependencies directly.

Tag         Description
all         Install all optional dependencies (all of the following)
pandas      Install with pandas for converting data to and from pandas DataFrames/Series
numpy       Install with NumPy for converting data to and from NumPy arrays
pyarrow     Reading data formats using PyArrow
fsspec      Support for reading from remote file systems
connectorx  Support for reading from SQL databases
xlsx2csv    Support for reading from Excel files
timezone    Timezone support; only needed if you are on Python < 3.9 and/or on Windows. Otherwise no dependencies will be installed.
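
The condition on the timezone tag can be read as a simple predicate. The sketch below only restates the rule from the table; the function name `timezone_extra_adds_dependencies` is hypothetical, and it does not inspect what pip would actually install.

```python
import sys


def timezone_extra_adds_dependencies(version_info=None, platform=None) -> bool:
    """Return True if the 'timezone' tag would pull in extra packages.

    Restates the rule from the table: Python < 3.9 and/or Windows.
    Hypothetical helper for illustration; not part of polars.
    """
    version_info = sys.version_info if version_info is None else version_info
    platform = sys.platform if platform is None else platform
    older_than_39 = tuple(version_info[:2]) < (3, 9)
    on_windows = platform == "win32"  # value of sys.platform on Windows
    return older_than_39 or on_windows


if __name__ == "__main__":
    print(timezone_extra_adds_dependencies())
```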

Releases happen quite often (weekly / every few days) at the moment, so updating polars regularly to get the latest bugfixes / features might not be a bad idea.

Rust setup

You can take the latest release from crates.io, or, if you want to use the latest features / performance improvements, point to the master branch of this repo.

polars = { git = "https://github.com/pola-rs/polars", rev = "<optional git tag>" }

Rust version

Required Rust version >=1.58

Documentation

Want to know about all the features Polars supports? Read the docs!

Larger than RAM data

If you have data that does not fit into memory, polars lazy is able to process your query (or parts of your query) in a streaming fashion. This drastically reduces memory requirements, so you might be able to process your 250GB dataset on your laptop. Collect with collect(allow_streaming=True) to run the query in streaming mode. (This might be a little slower, but it is still very fast!)

Python

Rust

Node

Contribution

Want to contribute? Read our contribution guideline.

[Python]: compile polars from source

If you want a bleeding edge release or maximal performance you should compile polars from source.

This can be done by going through the following steps in sequence:

  1. Install the latest Rust compiler
  2. Install maturin: pip install maturin
  3. Choose any of:
    • Fastest binary, very long compile times:
      $ cd py-polars && maturin develop --release -- -C target-cpu=native
      
    • Fast binary, shorter compile times:
      $ cd py-polars && maturin develop --release -- -C codegen-units=16 -C lto=thin -C target-cpu=native
      

Note that the Rust crate implementing the Python bindings is called py-polars to distinguish from the wrapped Rust crate polars itself. However, both the Python package and the Python module are named polars, so you can pip install polars and import polars.

Arrow2

Polars has transitioned to arrow2. Arrow2 is a faster and safer implementation of the Apache Arrow Columnar Format. Arrow2 also has a more granular code base, helping to reduce the compiler bloat.

Use custom Rust functions in Python?

See this example.

Going big...

Do you expect more than 2^32 (about 4.2 billion) rows? Compile polars with the bigidx feature flag.

Or, for Python users, install with pip install polars-u64-idx.

Don't use this unless you hit the row boundary as the default polars is faster and consumes less memory.
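
As a quick check of the arithmetic, the sketch below computes the 2^32 limit and compares a row count against it. The helper `needs_big_index` is hypothetical, and it uses the "more than 2^32" wording above as its threshold.

```python
# Row limit quoted above for the default build.
U32_INDEX_LIMIT = 2**32  # 4_294_967_296, i.e. about 4.2 billion


def needs_big_index(n_rows: int) -> bool:
    """Return True if n_rows exceeds the default row limit.

    Hypothetical helper for illustration; not part of polars.
    """
    return n_rows > U32_INDEX_LIMIT


if __name__ == "__main__":
    for rows in (1_000_000, 5_000_000_000):
        print(rows, needs_big_index(rows))
```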

Legacy

Do you want polars to run on an old CPU (e.g. dating from before 2011)? Install it with pip install polars-lts-cpu. This polars project is compiled without AVX target features.
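
If you are unsure whether a machine supports AVX, one rough way to check on x86 Linux is to look for the avx flag in /proc/cpuinfo. The sketch below assumes that file layout; the function names are hypothetical, and other operating systems need a different check.

```python
from pathlib import Path
from typing import Optional


def cpuinfo_has_avx(cpuinfo_text: str) -> bool:
    """Return True if the first 'flags' line of cpuinfo text lists 'avx'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return "avx" in value.split()
    return False


def host_has_avx(path: str = "/proc/cpuinfo") -> Optional[bool]:
    """Check this host; returns None when the file cannot be read.

    Assumes the x86 Linux /proc/cpuinfo format. Hypothetical helper,
    not part of polars.
    """
    try:
        return cpuinfo_has_avx(Path(path).read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(host_has_avx())
```

A result of False suggests the polars-lts-cpu build is the one to try; None means the check could not be performed on this system.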

Acknowledgements

Development of Polars is proudly powered by

Xomnia

Sponsors
