Polars

Blazingly fast DataFrames in Rust & Python

Polars is a blazingly fast DataFrames library implemented in Rust. It uses Apache Arrow as the backend for its memory model.

It currently consists of an eager API similar to pandas and a lazy API that is somewhat similar to Spark. Among other things, Polars provides the functionality listed in the functionality table below.

To learn more about the inner workings of Polars, read the User Guide (work in progress).

Rust users read this!

Polars cannot publish a new version to crates.io until a new Arrow release is issued. Arrow's release cycle takes three to four months, which is a lot slower than I'd like to release. Until then, it is recommended to use the current master branch instead of the version published on crates.io. The current master is considerably more stable than the published version and compiles much faster.

You can depend on the master branch like this:

polars = { version = "0.12.0", git = "https://github.com/ritchie46/polars" }

Or pin to a specific revision:

polars = { version = "0.12.0", git = "https://github.com/ritchie46/polars", rev = "<optional git tag>" }

Python users read this!

Polars is currently transitioning from py-polars to polars. Some docs may still refer to the old name.

Install the latest polars version with: $ pip3 install polars

Functionality Eager Lazy (DataFrame) Lazy (Series)
Filters
Shifts
Joins
GroupBys + aggregations
Comparisons
Arithmetic
Sorting
Reversing
Closure application (User Defined Functions)
SIMD
Pivots
Melts
Filling nulls + fill strategies
Aggregations
Moving Window aggregates
Find unique values
Rust iterators
IO (csv, json, parquet, Arrow IPC)
Query optimization: (predicate pushdown)
Query optimization: (projection pushdown)
Query optimization: (type coercion)
Query optimization: (simplify expressions)
Query optimization: (aggregate pushdown)

Note that almost all operations the eager API supports on Series/ChunkedArrays can be used in the lazy API via UDFs.
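
As a rough illustration of what predicate pushdown in the table above refers to, the following toy sketch uses only the Python standard library. It is not Polars code and does not reflect how Polars implements the optimization; `scan_rows`, `is_adult`, and the sample data are hypothetical names made up for this example. The idea is that applying the filter during the scan gives the same result as filtering afterwards, while never materializing the rejected rows.

```python
import csv
import io

# Hypothetical sample data, standing in for a CSV file on disk.
CSV_TEXT = "name,age\nann,31\nbob,17\ncid,45\n"


def scan_rows(text, predicate=None):
    """Parse CSV rows, optionally dropping rows while scanning."""
    kept = []
    for row in csv.DictReader(io.StringIO(text)):
        if predicate is None or predicate(row):
            kept.append(row)
    return kept


def is_adult(row):
    """Example predicate: keep rows whose age is at least 18."""
    return int(row["age"]) >= 18


# Without pushdown: materialize every row first, then filter.
all_rows = scan_rows(CSV_TEXT)
filtered_late = [row for row in all_rows if is_adult(row)]

# With pushdown: the filter runs inside the scan, so rejected rows
# never reach the intermediate result.
filtered_early = scan_rows(CSV_TEXT, predicate=is_adult)
```

Both paths produce the same rows; only the size of the intermediate result differs.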

Documentation

Want to know about all the features Polars supports? Read the docs!

Rust

Python

Performance

Polars is written to be performant, and it is! But don't take my word for it; take a look at the results in h2oai's db-benchmark.

Cargo Features

Additional cargo features:

  • temporal (default)
    • Conversions between Chrono and Polars for temporal data
  • simd (nightly)
    • SIMD operations
  • parquet
    • Read Apache Parquet format
  • json
    • JSON serialization
  • ipc
    • Arrow's IPC format serialization
  • random
    • Generate arrays with randomly sampled values
  • ndarray
    • Convert from DataFrame to ndarray
  • lazy
    • Lazy API
  • strings
    • String utilities for Utf8Chunked
  • object
    • Support for generic ChunkedArrays called ObjectChunked<T> (generic over T). These can be downcast from Series through the Any trait.
  • parallel
    • ChunkedArrays can be used by rayon::par_iter()
  • [plain_fmt | pretty_fmt] (mutually exclusive)
    • Choose one of them to format DataFrames. pretty_fmt can deal with overflowing cells and looks nicer, but has more dependencies. plain_fmt (default) uses plain formatting.

Contribution

Want to contribute? Read our contribution guidelines.

ENV vars

  • POLARS_PAR_SORT_BOUND -> lower bound on the number of rows at which Polars will use a parallel sorting algorithm. Default is 1M rows.
  • POLARS_FMT_MAX_COLS -> maximum number of columns shown when formatting DataFrames.
  • POLARS_FMT_MAX_ROWS -> maximum number of rows shown when formatting DataFrames.
  • POLARS_TABLE_WIDTH -> width of the tables used during DataFrame formatting.
  • POLARS_MAX_THREADS -> maximum number of threads used in the join algorithm. Default is unbounded.
  • POLARS_VERBOSE -> print logging info to stderr.
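
A minimal sketch of setting some of these variables from Python. It assumes that the variables should be in place before Polars is imported, and the values (including `1` for POLARS_VERBOSE) are illustrative guesses rather than documented defaults; `apply_polars_settings` is a hypothetical helper, not part of Polars.

```python
import os

# Variable names come from the list above; the values are illustrative.
POLARS_SETTINGS = {
    "POLARS_FMT_MAX_COLS": "10",
    "POLARS_FMT_MAX_ROWS": "20",
    "POLARS_VERBOSE": "1",  # assumed value; the accepted values are not documented here
}


def apply_polars_settings(settings, environ=os.environ):
    """Copy settings into the environment, keeping values that are already set."""
    for key, value in settings.items():
        environ.setdefault(key, value)
    return environ


apply_polars_settings(POLARS_SETTINGS)
# import polars  # import only after the environment has been prepared
```

Using `setdefault` means a value exported in the shell takes precedence over the defaults in the script.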

[Python] compile py-polars from source

If you want a bleeding-edge release or maximal performance, you should compile py-polars from source.

This can be done by going through the following steps in sequence:

  1. Install the latest Rust compiler
  2. $ pip3 install maturin
  3. $ cd py-polars && maturin develop --release

Note that the Rust crate implementing the Python bindings is called py-polars to distinguish it from the wrapped Rust crate polars itself. However, both the Python package and the Python module are named polars, so you can pip install polars and import polars (previously, these were called py-polars and pypolars).
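
Because of the rename, an environment may contain the old module, the new one, or neither. The following sketch, using only the standard library, checks which of the two module names mentioned above is importable; `installed_polars_module` is a hypothetical helper, not part of Polars.

```python
import importlib.util


def installed_polars_module():
    """Return the importable module name, preferring the new one, or None."""
    for name in ("polars", "pypolars"):
        if importlib.util.find_spec(name) is not None:
            return name
    return None


found = installed_polars_module()
if found is None:
    print("Polars is not installed; try: pip3 install polars")
elif found == "pypolars":
    print("Only the old 'pypolars' module was found; consider upgrading.")
else:
    print("The 'polars' module is available.")
```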

Acknowledgements

Development of Polars is proudly powered by

Xomnia
