Remap and renumber numpy arrays.

[![Build Status](https://travis-ci.org/seung-lab/fastremap.svg?branch=master)](https://travis-ci.org/seung-lab/fastremap) [![PyPI version](https://badge.fury.io/py/fastremap.svg)](https://badge.fury.io/py/fastremap)

# fastremap

Renumber and relabel Numpy arrays at C++ speed and physically convert rectangular Numpy arrays between C and Fortran order using an in-place transposition.

## `pip` Installation

```bash
pip install fastremap
```

*If no prebuilt binary is available for your platform, a C++ compiler is required to build from source.*

```bash
pip install numpy
pip install fastremap --no-binary :all:
```

## Manual Installation

*A C++ compiler is required.*

```bash
sudo apt-get install g++ python3-dev
mkvirtualenv -p python3 fastremap
pip install numpy

# Choose one:
python setup.py develop
python setup.py install
```

## The Problem of Remapping

Python loops are slow, so Numpy is often used to perform remapping on large arrays (hundreds of megabytes or gigabytes). To remap an array efficiently in Numpy, you need a lookup array in which each index is a key and the element stored at that index is its value.

```python
import numpy as np

original = np.array([ 1, 3, 5, 5, 10 ])
remap = np.array([ 0, -5, 0, 6, 0, 2, 0, 0, 0, 0, -100 ])
# Keys: 0 1 2 3 4 5 6 7 8 9 10

remapped = remap[ original ]
# remapped: [ -5, 6, 2, 2, -100 ]
```

If the array holds 32- or 64-bit labels, this becomes impractical because the lookup array must be as long as the largest label, so it can grow larger than RAM. It would therefore be helpful to perform this mapping with a C-speed loop instead. Numba can be used for this in some circumstances; this library provides an alternative.

```python
import numpy as np
import fastremap

mappings = {
    1: 100,
    2: 200,
    -3: 7,
}

arr = np.array([5, 1, 2, -5, -3, 10, 6])
# Custom remapping of 1, 2, and -3, leaving the rest alone
arr = fastremap.remap(arr, mappings, preserve_missing_labels=True)
# result: [ 5, 100, 200, -5, 7, 10, 6 ]
```
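
For comparison, the same result can be reproduced in plain Numpy without allocating a lookup array as long as the largest label. The sketch below is only a reference for the behavior shown above, not the library's implementation: `remap_reference` is a hypothetical helper, and raising `KeyError` for unmapped labels when `preserve_missing_labels=False` is an assumption of this sketch.

```python
import numpy as np

def remap_reference(arr, mapping, preserve_missing_labels=False):
    """NumPy-only stand-in for a dictionary remap (hypothetical helper)."""
    # Work on the distinct labels only, so the table size depends on the
    # number of unique labels rather than on the largest label value.
    uniq, inverse = np.unique(arr, return_inverse=True)
    missing = [int(u) for u in uniq if int(u) not in mapping]
    if missing and not preserve_missing_labels:
        # Assumed behavior for this sketch: refuse to guess unmapped labels.
        raise KeyError(f"labels missing from mapping: {missing}")
    # Unmapped labels map to themselves.
    new_vals = np.array([mapping.get(int(u), int(u)) for u in uniq], dtype=arr.dtype)
    return new_vals[inverse].reshape(arr.shape)

mappings = {1: 100, 2: 200, -3: 7}
arr = np.array([5, 1, 2, -5, -3, 10, 6])
remapped = remap_reference(arr, mappings, preserve_missing_labels=True)
# remapped: [ 5, 100, 200, -5, 7, 10, 6 ]
```

This approach sorts the array inside `np.unique`, so it trades extra CPU time and temporary memory for a small lookup table.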

## The Problem of Renumbering

Sometimes a 64-bit array contains so few distinct values that it could be stored as an 8-bit array. As with the remapping problem, Python loops can be too slow to do this, and Numpy doesn't provide a convenient way to do it either. Therefore this library provides an alternative solution.

```python
import fastremap
import numpy as np

labels = np.array([ 283732875, 439238823, 283732875, 182812404, 0 ], dtype=np.int64)

arr, remapping = fastremap.renumber(labels, preserve_zero=True) # Returns uint8 array
# arr = [ 1, 2, 1, 3, 0 ]
# remapping = { 0: 0, 283732875: 1, 439238823: 2, 182812404: 3 }

arr, remapping = fastremap.renumber(labels, preserve_zero=False) # Returns uint8 array
# arr = [ 1, 2, 1, 3, 4 ]
# remapping = { 0: 4, 283732875: 1, 439238823: 2, 182812404: 3 }

arr, remapping = fastremap.renumber(labels, preserve_zero=False, in_place=True) # Mutate labels to use less memory
# arr = [ 1, 2, 1, 3, 4 ]
# remapping = { 0: 4, 283732875: 1, 439238823: 2, 182812404: 3 }
```
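
To make the renumbering rule concrete, here is a Numpy-only sketch that reproduces the outputs listed above: labels are numbered from 1 in order of first appearance, and zero is either kept as zero or numbered like any other label. `renumber_reference` is a hypothetical name, and choosing the output dtype with `np.min_scalar_type` is an assumption of this sketch rather than a statement about how the library picks its dtype.

```python
import numpy as np

def renumber_reference(arr, preserve_zero=True):
    """Relabel arr to 1..N in first-seen order (hypothetical helper)."""
    uniq, first_idx, inverse = np.unique(
        arr.ravel(), return_index=True, return_inverse=True
    )
    if preserve_zero:
        # Zero keeps the label 0 and is left out of the numbering.
        candidates = np.flatnonzero(uniq != 0)
    else:
        candidates = np.arange(len(uniq))
    # Order the remaining labels by where they first appear in the array.
    ordered = candidates[np.argsort(first_idx[candidates])]
    new_ids = np.zeros(len(uniq), dtype=np.int64)
    new_ids[ordered] = np.arange(1, len(ordered) + 1)
    # Smallest unsigned dtype that can hold the largest new label.
    out_dtype = np.min_scalar_type(len(ordered))
    renumbered = new_ids[inverse].reshape(arr.shape).astype(out_dtype)
    mapping = {int(k): int(v) for k, v in zip(uniq, new_ids)}
    return renumbered, mapping

labels = np.array([283732875, 439238823, 283732875, 182812404, 0], dtype=np.int64)
kept, kept_map = renumber_reference(labels, preserve_zero=True)
# kept: [ 1, 2, 1, 3, 0 ]
shifted, shifted_map = renumber_reference(labels, preserve_zero=False)
# shifted: [ 1, 2, 1, 3, 4 ]
```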

## The Problem of In-Place Transposition

When transitioning between different media, e.g. CPU to GPU, CPU to network, or CPU to disk, it's often necessary to physically transpose multi-dimensional arrays to reformat them as C or Fortran order. Transposing matrices is also a common action in linear algebra, but there you can often get away with just changing the strides.

An out-of-place transposition is easy to write, and often faster, but it will spike peak memory consumption. This library grants the user the option of performing an in-place transposition, which trades CPU time for lower peak memory usage. In the special case of square or cubic arrays, the in-place transposition is both lower memory and faster.
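
The square case is cheap because every element simply swaps places with its mirror across the diagonal, so no scratch buffer is needed. The following is a minimal sketch of that idea on a flat row-major buffer; `transpose_square_in_place` is a hypothetical helper and is not taken from this library's source.

```python
def transpose_square_in_place(buf, n):
    """Transpose an n x n row-major matrix stored in the flat sequence buf."""
    for row in range(n):
        # Only visit the upper triangle; each swap also fixes the lower one.
        for col in range(row + 1, n):
            a = row * n + col
            b = col * n + row
            buf[a], buf[b] = buf[b], buf[a]

# [[0, 1, 2], [3, 4, 5], [6, 7, 8]] flattened in row-major (C) order
buf = list(range(9))
transpose_square_in_place(buf, 3)
# buf: [0, 3, 6, 1, 4, 7, 2, 5, 8]
```

Rectangular arrays are harder because an element's destination is generally not the position of the element that replaces it, so the data has to be moved along longer permutation cycles. That extra bookkeeping is the CPU cost traded for memory.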

- **fastremap.asfortranarray:** Same as np.asfortranarray but will perform the transposition in-place for 1, 2, 3, and 4D arrays. 2D and 3D square matrices are faster to process than with Numpy.
- **fastremap.ascontiguousarray:** Same as np.ascontiguousarray but will perform the transposition in-place for 1, 2, 3, and 4D arrays. 2D and 3D square matrices are faster to process than with Numpy.

```python
import fastremap
import numpy as np

arr = np.ones((512,512,512), dtype=np.float32)
arr = fastremap.asfortranarray(arr)

arr = np.ones((512,512,512), dtype=np.float32, order='F')
arr = fastremap.ascontiguousarray(arr)
```
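
The difference between changing strides and physically reordering memory can be seen with plain Numpy. This sketch uses only standard Numpy calls and a small array chosen for illustration: `arr.T` returns a view over the same buffer, while `np.asfortranarray` allocates a second buffer, which is the out-of-place memory spike described above.

```python
import numpy as np

arr = np.arange(6, dtype=np.int32).reshape(2, 3)  # C (row-major) order

# Stride-only transposition: a view over the same memory, nothing is moved.
view = arr.T
view_shares_memory = np.shares_memory(arr, view)          # True

# Physical reordering: same logical values, new buffer in Fortran order.
physical = np.asfortranarray(arr)
physical_shares_memory = np.shares_memory(arr, physical)  # False
same_values = np.array_equal(arr, physical)                # True

# order='K' reads the elements in the order they sit in memory.
c_layout = arr.ravel(order='K').tolist()       # [0, 1, 2, 3, 4, 5]
f_layout = physical.ravel(order='K').tolist()  # [0, 3, 1, 4, 2, 5]
```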

## All Available Functions

- **renumber:** Relabel array from 1 to N which can often use smaller datatypes.
- **remap:** Custom relabeling of values in an array from a dictionary.
- **mask:** Zero out labels in an array specified by a given list.
- **remap_from_array:** Same as remap, but the map is an array where the key is the array index and the value is the value.
- **remap_from_array_kv:** Same as remap, but the map consists of two equal sized arrays, the first containing keys, the second containing values.
- **asfortranarray:** Perform an in-place matrix transposition for rectangular arrays if memory is contiguous, standard numpy algorithm otherwise.
- **ascontiguousarray:** Perform an in-place matrix transposition for rectangular arrays if memory is contiguous, standard numpy algorithm otherwise.
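
As a rough guide to what the masking and array-based remapping entries in this list describe, the sketch below expresses each one in plain Numpy. It does not call this library, and the exact signatures and defaults of `mask`, `remap_from_array`, and `remap_from_array_kv` should be checked against the installed version. Leaving labels that have no key unchanged in the key-value case is an assumption of this sketch.

```python
import numpy as np

labels = np.array([0, 1, 2, 3, 2, 1], dtype=np.uint32)

# mask: zero out every label that appears in a given list.
masked = np.where(np.isin(labels, [1, 3]), 0, labels)
# masked: [0, 0, 2, 0, 2, 0]

# remap_from_array: the array index is the key, the element is the value.
table = np.array([0, 10, 20, 30], dtype=np.uint32)
from_array = table[labels]
# from_array: [0, 10, 20, 30, 20, 10]

# remap_from_array_kv: two equal-sized arrays of keys and values.
keys = np.array([3, 1, 2], dtype=np.uint32)
vals = np.array([30, 10, 20], dtype=np.uint32)
order = np.argsort(keys)
sorted_keys, sorted_vals = keys[order], vals[order]
# Binary-search each label among the sorted keys.
pos = np.clip(np.searchsorted(sorted_keys, labels), 0, len(keys) - 1)
hit = sorted_keys[pos] == labels
# Assumption of this sketch: labels without a key stay as they are.
from_kv = np.where(hit, sorted_vals[pos], labels)
# from_kv: [0, 10, 20, 30, 20, 10]
```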

## C++ Usage

The in-place matrix transposition is implemented in ipt.hpp. If you're working in C++, you can also use it directly like so:

```cpp
#include "ipt.hpp"

int main() {

  int sx = 128;
  int sy = 124;
  int sz = 103;
  int sw = 3;

  auto* arr = ....;

  // All primitive number types supported
  // The array will be modified in place,
  // so these functions are void type.
  ipt::ipt<int>(arr, sx, sy); // 2D
  ipt::ipt<float>(arr, sx, sy, sz); // 3D
  ipt::ipt<double>(arr, sx, sy, sz, sw); // 4D

  return 0;
}
```

--
Made with <3


