Pandas ExtensionDtype/ExtensionArray implementations backed by Apache Arrow

Project description

fletcher

A library that provides a generic set of Pandas ExtensionDtype/ExtensionArray implementations backed by Apache Arrow. They support a wider range of types than Pandas supports natively, and they bring a different set of constraints and behaviours that are beneficial in many situations.

Usage

To use fletcher in Pandas DataFrames, wrap your data in a FletcherChunkedArray or FletcherContinuousArray object. Your data can be a pyarrow.Array, a pyarrow.ChunkedArray, or any type that can be passed to pyarrow.array(…).

import fletcher as fr
import pandas as pd

df = pd.DataFrame({
    'str_chunked': fr.FletcherChunkedArray(['a', 'b', 'c']),
    'str_continuous': fr.FletcherContinuousArray(['a', 'b', 'c']),
})

df.info()

# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 3 entries, 0 to 2
# Data columns (total 2 columns):
#  #   Column          Non-Null Count  Dtype                      
# ---  ------          --------------  -----                      
#  0   str_chunked     3 non-null      fletcher_chunked[string]   
#  1   str_continuous  3 non-null      fletcher_continuous[string]
# dtypes: fletcher_chunked[string](1), fletcher_continuous[string](1)
# memory usage: 166.0 bytes

Development

While you can use fletcher in pip-based environments, we strongly recommend a conda-based development setup with packages from conda-forge.

# Create the conda environment with all necessary dependencies
conda env create

# Activate the newly created environment
conda activate fletcher

# Install fletcher into the current environment
python -m pip install -e . --no-build-isolation --no-use-pep517

# Run the unit tests (do this regularly during development)
pytest -n auto

# Install pre-commit hooks
# These then run automatically on every commit and ensure that files are
# black-formatted, have no flake8 issues and pass the mypy type checks.
pre-commit install

Code formatting is done using black. This keeps the code style consistent, and the pre-commit hooks apply the formatting automatically.

Using pandas in development mode

To test and develop against pandas' master or your local fixes, you can install a development version of pandas using:

git clone https://github.com/pandas-dev/pandas
cd pandas

# Install additional pandas dependencies
conda install -y cython

# Build and install pandas
python setup.py build_ext --inplace -j 4
python -m pip install -e . --no-build-isolation --no-use-pep517

This links the development version of pandas into your fletcher conda environment. Changes to Python code in pandas take effect in your environment immediately. After changing any Cython code in pandas, you need to re-run python setup.py build_ext --inplace -j 4.

Using (py)arrow nightlies

To test and develop against the latest development version of Apache Arrow (pyarrow), you can install it from the arrow-nightlies conda channel:

conda install -c arrow-nightlies arrow-cpp pyarrow

Benchmarks

In benchmarks/ we provide a set of benchmarks to compare the performance of fletcher against pandas and to ensure that fletcher itself stays performant. The benchmarks are written using airspeed velocity (asv). While developing benchmarks, you can use asv dev to run each of them only once (use -b <pattern> to run only a selection). To get real benchmark values, use asv run --python=same, which runs the benchmarks multiple times and yields meaningful average runtimes.
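
To make the benchmark layout concrete, here is a minimal sketch of what an asv benchmark comparing pandas and fletcher could look like. It is not taken from the benchmarks/ directory: the class name, the choice of isna() as the timed operation, and the row counts are all assumptions made for illustration. It relies only on asv conventions: methods prefixed with time_ are timed, params and param_names parametrize them, and a NotImplementedError raised in setup marks the benchmark as skipped.

```python
class TimeIsNa:
    """Hypothetical benchmark: isna() on a pandas column vs. a fletcher column."""

    # asv runs each time_* method once per parameter value.
    params = [1_000, 100_000]
    param_names = ["n_rows"]

    def setup(self, n_rows):
        # Import lazily so asv can discover this class even when the
        # packages are missing from the benchmark environment.
        try:
            import fletcher as fr
            import pandas as pd
        except ImportError:
            # asv treats NotImplementedError raised in setup as "skip".
            raise NotImplementedError("pandas and fletcher are required")

        values = [f"row-{i}" for i in range(n_rows)]
        self.pd_series = pd.Series(values)
        self.fr_series = pd.Series(fr.FletcherChunkedArray(values))

    def time_pandas_isna(self, n_rows):
        self.pd_series.isna()

    def time_fletcher_isna(self, n_rows):
        self.fr_series.isna()
```

With a file like this in place, asv dev -b TimeIsNa would run just these two timings once, which is enough to check that the benchmark executes before collecting real numbers.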

Release history

0.7.2 (Jan 17, 2021), this version
0.7.1 (Dec 29, 2020)
0.7.0 (Dec 7, 2020)
0.6.2 (Oct 20, 2020)
0.6.1 (Oct 14, 2020)
0.6.0 (Sep 23, 2020)
0.5.2 (Sep 21, 2020)
0.5.1 (Sep 21, 2020)
0.5.0 (Jun 23, 2020)
0.4.0 (Jun 16, 2020)
0.3.1 (Mar 10, 2020)
0.3.0 (Feb 25, 2020)
0.2.0 (Sep 1, 2019)
0.1.2 (Jul 8, 2018)
0.1.1 (Jul 8, 2018)
0.1.0 (Jul 7, 2018)

Download files

Source Distribution

fletcher-0.7.2.tar.gz (72.7 kB), uploaded Jan 17, 2021, Source

Built Distribution

fletcher-0.7.2-py3-none-any.whl (49.8 kB), uploaded Jan 17, 2021, Python 3

File details

Details for the file fletcher-0.7.2.tar.gz.

File metadata

Download URL: fletcher-0.7.2.tar.gz
Upload date: Jan 17, 2021
Size: 72.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/51.1.2 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.8.7

File hashes

Hashes for fletcher-0.7.2.tar.gz
Algorithm	Hash digest
SHA256	`b646d0a69118ea9f95a62423d139afaa0f404da388bf0106735cfe7d4fb876b6`
MD5	`2c93f118e37300f71a94934b7f084cf3`
BLAKE2b-256	`95ceaf8d8b1c6824a68e51c8f920ea491eaf34d1f1e096239130357719a6c78f`

File details

Details for the file fletcher-0.7.2-py3-none-any.whl.

File metadata

Download URL: fletcher-0.7.2-py3-none-any.whl
Upload date: Jan 17, 2021
Size: 49.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/51.1.2 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.8.7

File hashes

Hashes for fletcher-0.7.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6f4227b48004dce8f7edd09f658395b30b066f6c7f48439a78e370b5c3a30a2e`
MD5	`0107e272103f23ff2449c9038b1804af`
BLAKE2b-256	`d8537be7bf3fda0de3d289bdf2283c8a549d6f0ab3c4a758f65b0ac9a8e08aa9`
