
Write algorithms in PyTorch that adapt to the available (CUDA) memory

Project description

Torch Memory-adaptive Algorithms (TOMA)


A collection of helpers to make it easier to write code that adapts to the available (CUDA) memory. Specifically, it retries code that fails due to OOM (out-of-memory) conditions and lowers batchsizes automatically.

To avoid failing over repeatedly, a simple cache is implemented that memorizes the last successful batchsize for a given call and the available free memory.
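
Conceptually, the retry logic behaves roughly like the following sketch (a simplified illustration only, not toma's actual implementation, which also consults the batchsize cache):

import torch

def run_with_adaptive_batchsize(func, initial_batchsize, *args, **kwargs):
    # Simplified illustration of the retry idea: halve the batchsize on OOM.
    batchsize = initial_batchsize
    while True:
        try:
            return func(batchsize, *args, **kwargs)
        except RuntimeError as exception:
            # Treat CUDA OOM errors as a signal to retry with a smaller batch.
            if "out of memory" not in str(exception) or batchsize <= 1:
                raise
            batchsize //= 2
            torch.cuda.empty_cache()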

Installation

To install using pip, use:

pip install toma

To run the tests, use:

python setup.py test

Example

from toma import toma

@toma.batch(initial_batchsize=512)
def run_inference(batchsize, model, dataset):
    # ...

run_inference(model, dataset)

This will try to execute run_inference with batchsize=512. If a memory error is thrown, it will decrease the batchsize until the call succeeds.

Note: this batchsize can differ from the effective batchsize used for gradient accumulation, which you keep fixed by calling optimizer.step() only every so often.
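
For example, the two can be decoupled as in the following hedged sketch (the model, optimizer, dataset and accumulation_target below are illustrative assumptions, not part of toma's API):

import torch
from toma import toma

# Hypothetical setup, purely for illustration.
model = torch.nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
dataset = torch.utils.data.TensorDataset(
    torch.randn(4096, 128), torch.randint(0, 10, (4096,)))

@toma.batch(initial_batchsize=512)
def train_epoch(batchsize, model, optimizer, dataset, accumulation_target=4096):
    loader = torch.utils.data.DataLoader(dataset, batch_size=batchsize)
    optimizer.zero_grad()
    seen = 0
    for inputs, targets in loader:
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
        loss.backward()
        seen += inputs.shape[0]
        if seen >= accumulation_target:
            # Step only every so often: the effective batchsize stays at
            # accumulation_target even if toma lowers the memory batchsize.
            optimizer.step()
            optimizer.zero_grad()
            seen = 0

train_epoch(model, optimizer, dataset)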

To make it easier to loop over ranges, there are also toma.range and toma.chunked:

@toma.chunked(initial_step=512)
def compute_result(out: torch.Tensor, start: int, end: int):
    # ...

result = torch.empty((8192, ...))
compute_result(result)

This will chunk result and pass the chunks to compute_result one by one. Again, if a chunk fails due to OOM, the step size will be halved, and so on. Compared to toma.batch, this allows the step size to be reduced while looping over the chunks, which can save computation.
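
As a concrete, hedged sketch of such a chunked computation (the model and batched_inputs below are illustrative assumptions):

import torch
from toma import toma

# Hypothetical model and inputs, purely for illustration.
model = torch.nn.Linear(128, 16)
batched_inputs = torch.randn(8192, 128)

@toma.chunked(initial_step=512)
def compute_result(out: torch.Tensor, start: int, end: int):
    # `out` is the chunk result[start:end]; fill it with this slice's outputs.
    with torch.no_grad():
        out[:] = model(batched_inputs[start:end])

result = torch.empty((8192, 16))
compute_result(result)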

@toma.range(initial_step=32)
def reduce_data(start: int, end: int, out: torch.Tensor, dataA: torch.Tensor, dataB: torch.Tensor):
    # ...

reduce_data(0, 1024, result, dataA, dataB)

toma.range iterates over range(start, end, step) with step=initial_step. If it fails due to OOM, it will lower the step size and continue.
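
A hedged sketch of such a reduction, where dataA, dataB and the accumulation into out are illustrative assumptions:

import torch
from toma import toma

# Hypothetical tensors, purely for illustration.
dataA = torch.randn(1024, 64)
dataB = torch.randn(1024, 64)
result = torch.zeros(64)

@toma.range(initial_step=32)
def reduce_data(start: int, end: int, out: torch.Tensor, dataA: torch.Tensor, dataB: torch.Tensor):
    # Process one slice [start, end) per step and accumulate into `out`.
    out += (dataA[start:end] * dataB[start:end]).sum(dim=0)

reduce_data(0, 1024, result, dataA, dataB)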

toma.execute

To make it easier to execute a block without first extracting it into a function and then calling it, we also provide toma.execute.batch, toma.execute.range and toma.execute.chunked. These are somewhat unorthodox in that they call the function passed to them right away (mainly because Python has no support for anonymous functions beyond lambda expressions).

def function():
    # ... other code

    @toma.execute.chunked(batched_data, initial_step=128)
    def compute(chunk, start, end):
        # ...
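
Fleshed out slightly, the pattern looks like the following hedged sketch (the model and what is done with each chunk are illustrative assumptions):

import torch
from toma import toma

# Hypothetical model, purely for illustration.
model = torch.nn.Linear(128, 16)

def function():
    batched_data = torch.randn(8192, 128)
    outputs = []

    @toma.execute.chunked(batched_data, initial_step=128)
    def compute(chunk, start, end):
        # Runs immediately: `chunk` is batched_data[start:end].
        with torch.no_grad():
            outputs.append(model(chunk))

    return torch.cat(outputs)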

Cache

There are currently three available cache types. They can be changed either by setting toma.DEFAULT_CACHE_TYPE or by passing cache_type to the calls.

For example:

@toma.batch(initial_batchsize=512, cache_type=toma.GlobalBatchsizeCache)

or

toma.explicit.batch(..., toma_cache_type=toma.GlobalBatchsizeCache)

StacktraceMemoryBatchsizeCache: Stacktrace & Available Memory (the default)

This memorizes the successful batchsizes for a given call trace and the available memory at that point. For most machine-learning code, this is sufficient to remember the right batchsize without having to inspect the actual arguments or understand more of their semantics.

The implicit assumption is that, after a few iterations, a stable state is reached with regard to GPU and CPU memory usage.
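
In practice this means that the same decorated function called from two different places keeps two independent cached batchsizes. A minimal sketch (the model and datasets below are illustrative assumptions):

import torch
from toma import toma

# Hypothetical model and datasets, purely for illustration.
model = torch.nn.Linear(128, 16)
train_dataset = torch.randn(4096, 128)
test_dataset = torch.randn(1024, 128)

@toma.batch(initial_batchsize=512)
def run_inference(batchsize, model, dataset):
    for start in range(0, len(dataset), batchsize):
        model(dataset[start:start + batchsize])

def evaluate_train():
    # Cached under this call site's stacktrace and the free memory seen here.
    run_inference(model, train_dataset)

def evaluate_test():
    # A different stacktrace, so it gets its own cached batchsize.
    run_inference(model, test_dataset)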

To limit the CPU memory of the process, toma provides:

import toma.cpu_memory

toma.cpu_memory.set_cpu_memory_limit(8)

This can also be useful to avoid accidental swap thrashing.
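
Under the hood, such a limit can be set with Python's standard resource module. A hedged sketch of the general mechanism (not necessarily toma's exact implementation; the helper name limit_cpu_memory is hypothetical):

import resource

def limit_cpu_memory(gigabytes: float) -> None:
    # Cap the process's address space (Unix only); allocations beyond the cap
    # then fail with MemoryError instead of pushing the machine into swap.
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, (int(gigabytes * 2 ** 30), hard))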

GlobalBatchsizeCache: Global per Function

This reuses the last successful batchsize independently of where the call happened.

NoBatchsizeCache: No Caching

Always starts with the suggested batchsize and fails over if necessary.

Benchmark/Overhead

There is some overhead involved, so toma should only be used for operations that are otherwise time- or memory-consuming.

---------------------------------------------------------------------------------- benchmark: 5 tests ----------------------------------------------------------------------------------
Name (time in ms)          Min                Max               Mean            StdDev             Median                IQR            Outliers       OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_native             2.1455 (1.0)       3.7733 (1.0)       2.3037 (1.0)      0.1103 (1.0)       2.2935 (1.0)       0.1302 (1.0)          81;5  434.0822 (1.0)         448           1
test_simple            17.4657 (8.14)     27.0049 (7.16)     21.0453 (9.14)     2.6233 (23.79)    20.4881 (8.93)      3.4384 (26.42)        13;0   47.5165 (0.11)         39           1
test_toma_no_cache     31.4380 (14.65)    40.8567 (10.83)    33.2749 (14.44)    2.2530 (20.43)    32.2698 (14.07)     2.8210 (21.67)         4;1   30.0527 (0.07)         25           1
test_explicit          33.0759 (15.42)    52.1866 (13.83)    39.6956 (17.23)    6.9620 (63.14)    38.4929 (16.78)    11.2344 (86.31)         4;0   25.1917 (0.06)         20           1
test_toma              36.9633 (17.23)    57.0220 (15.11)    43.5201 (18.89)    6.7318 (61.05)    41.6034 (18.14)     7.2173 (55.45)         2;2   22.9779 (0.05)         13           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Thanks

Thanks to @y0ast for feedback and discussion.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

toma-1.1.0.tar.gz (11.6 kB view details)

Uploaded Source

Built Distribution

toma-1.1.0-py3-none-any.whl (9.6 kB view details)

Uploaded Python 3

File details

Details for the file toma-1.1.0.tar.gz.

File metadata

  • Download URL: toma-1.1.0.tar.gz
  • Upload date:
  • Size: 11.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3.post20200330 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.7

File hashes

Hashes for toma-1.1.0.tar.gz
Algorithm Hash digest
SHA256 d4b7d04d3c8a5b4ce4fee30a92282e29e95f4b641db20d1c6f458d13e8793b7a
MD5 036b5801bf5fb7de06c4c94fa4ad7b17
BLAKE2b-256 4fd174aad779150b03b6237de39f9d7a0e48d6a1ef2ff59d7501f683a8aa5a8a


File details

Details for the file toma-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: toma-1.1.0-py3-none-any.whl
  • Upload date:
  • Size: 9.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3.post20200330 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.7

File hashes

Hashes for toma-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e2d678bd286e8a1e8141277f21aa32bb635dc47e58c492d5f60efbf646927318
MD5 1a9b49341a92c349aff33cc4e6f47057
BLAKE2b-256 0e63f0b3f8855591bf9385e2c66f27e99c847e758927f1be72a1aece8be007a1

