Low-impact, task-level memory profiling for Dask.

dask-memusage

If you're using Dask with tasks that use a lot of memory, RAM is your bottleneck for parallelism. That means you want to know how much memory each task uses:

1. So you can set the highest parallelism level (processes or threads) for each machine, given the available RAM.
2. So you know where to focus memory optimization efforts.
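
The first point comes down to simple arithmetic. As a rough sketch (the function name and the `headroom_mb` reserve are illustrative assumptions, not part of dask-memusage), the number of single-threaded workers a machine can host is bounded by its usable RAM divided by the peak memory of the hungriest task:

```python
def max_workers(available_ram_mb: float, peak_task_mb: float,
                headroom_mb: float = 0.0) -> int:
    """Upper bound on concurrent single-threaded workers that fit in RAM.

    headroom_mb is a hypothetical reserve for the OS and other processes.
    """
    if peak_task_mb <= 0:
        raise ValueError("peak_task_mb must be positive")
    usable_mb = available_ram_mb - headroom_mb
    # Floor division: a worker that only partly fits does not count.
    return max(0, int(usable_mb // peak_task_mb))


# A machine with 16 GiB of RAM, 1 GiB held back, tasks peaking near 97 MB.
print(max_workers(16 * 1024, 97.015625, headroom_mb=1024))
```

This treats every task as if it needed the worst-case peak, so it is a conservative bound rather than a tuned setting.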

dask-memusage is an MIT-licensed statistical memory profiler for Dask's Distributed scheduler that can help you with both these problems.

dask-memusage polls your processes for memory usage and records the minimum and maximum usage in a CSV:

task_key,min_memory_mb,max_memory_mb
"('from_sequence-map-sum-part-e15703211a549e75b11c63e0054b53e5', 0)",44.84765625,96.98046875
"('from_sequence-map-sum-part-e15703211a549e75b11c63e0054b53e5', 1)",47.015625,97.015625
"('sum-part-e15703211a549e75b11c63e0054b53e5', 0)",0,0
"('sum-part-e15703211a549e75b11c63e0054b53e5', 1)",0,0
sum-aggregate-apply-no_allocate-4c30eb545d4c778f0320d973d9fc8ea6,0,0
apply-no_allocate-4c30eb545d4c778f0320d973d9fc8ea6,47.265625,47.265625
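
To decide where to focus optimization, it can help to sort this CSV by peak usage. The following is a minimal sketch using only the standard library; `rank_by_peak` is a hypothetical helper name, and the sample rows are copied from the output above so the snippet runs without a real profile file:

```python
import csv
import io

# Rows copied from the sample output; in practice, open the CSV file
# that dask-memusage wrote instead of using this string.
SAMPLE = '''task_key,min_memory_mb,max_memory_mb
"('from_sequence-map-sum-part-e15703211a549e75b11c63e0054b53e5', 0)",44.84765625,96.98046875
"('from_sequence-map-sum-part-e15703211a549e75b11c63e0054b53e5', 1)",47.015625,97.015625
"('sum-part-e15703211a549e75b11c63e0054b53e5', 0)",0,0
apply-no_allocate-4c30eb545d4c778f0320d973d9fc8ea6,47.265625,47.265625
'''


def rank_by_peak(csv_file):
    """Return (task_key, max_memory_mb) pairs, largest peak first."""
    rows = csv.DictReader(csv_file)
    peaks = [(row["task_key"], float(row["max_memory_mb"])) for row in rows]
    return sorted(peaks, key=lambda pair: pair[1], reverse=True)


ranked = rank_by_peak(io.StringIO(SAMPLE))
for key, peak in ranked:
    print(f"{peak:10.2f} MB  {key}")
```

With a real profile, `rank_by_peak(open("/tmp/memusage.csv", newline=""))` would read the file directly.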

Usage

Important: Make sure your workers only have a single thread! Otherwise the results will be wrong.

Installation

On the machine where you are running the Distributed scheduler, run:

$ pip install dask_memusage

Or if you're using Conda:

$ conda install -c conda-forge dask-memusage

API usage

# Add to your Scheduler object, which is e.g. your LocalCluster's scheduler
# attribute:
from dask_memusage import install
install(scheduler, "/tmp/memusage.csv")
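
For context, here is a fuller sketch of how that call might sit in a script. It assumes versions of Dask and Distributed that are compatible with dask-memusage 1.1, uses the `dask_memusage` module name (the same name passed to `--preload` in the CLI usage), and follows the single-thread rule by setting `threads_per_worker=1`. The function name and the toy workload are illustrative only:

```python
def run_with_memusage(csv_path="memusage.csv"):
    """Run a small Dask bag computation with memory profiling enabled."""
    # Imports are local so this sketch can be loaded without Dask installed.
    import dask.bag as db
    from dask.distributed import Client, LocalCluster
    from dask_memusage import install

    # One thread per worker process, as the Usage section requires.
    cluster = LocalCluster(n_workers=2, threads_per_worker=1)
    # Attach the profiler to the cluster's scheduler before submitting work.
    install(cluster.scheduler, csv_path)
    client = Client(cluster)
    try:
        bag = db.from_sequence(range(1000), npartitions=2)
        return bag.map(lambda x: x * 2).sum().compute()
    finally:
        client.close()
        cluster.close()
```

Calling `run_with_memusage("/tmp/memusage.csv")` from under an `if __name__ == "__main__":` guard keeps the worker processes from re-running the script body on platforms that spawn new interpreters.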

CLI usage

$ dask-scheduler --preload dask_memusage --memusage-csv /tmp/memusage.csv

Limitations

Again, make sure you only have one thread per worker process.
This is statistical profiling, running every 10ms. Tasks that take less than that won't have accurate information.
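
The second limitation can be illustrated with a small simulation. This is not dask-memusage's implementation, only a sketch of interval polling under stated assumptions: times are whole milliseconds, `usage_at` is a made-up memory timeline, and reporting `(0, 0)` when no poll lands inside a task is a choice made for this example:

```python
def sampled_min_max(usage_at, start_ms, end_ms, interval_ms=10):
    """Poll usage_at(t) at every tick in [start_ms, end_ms); return (min, max).

    Ticks fall on multiples of interval_ms. If no tick lands inside the
    window, this sketch returns (0, 0).
    """
    # Round start_ms up to the next tick (ceiling division on integers).
    first_tick = -(-start_ms // interval_ms) * interval_ms
    samples = [usage_at(t) for t in range(first_tick, end_ms, interval_ms)]
    if not samples:
        return (0, 0)
    return (min(samples), max(samples))


def usage_at(t_ms):
    # Simulated timeline: 50 MB baseline, 200 MB spike from t=12 ms to t=18 ms.
    return 200.0 if 12 <= t_ms < 18 else 50.0


print(sampled_min_max(usage_at, 0, 100))                 # spike falls between ticks
print(sampled_min_max(usage_at, 12, 18))                 # task shorter than one interval
print(sampled_min_max(usage_at, 0, 100, interval_ms=5))  # finer polling catches the spike
```

At a 10 ms interval the polls land at 0, 10, 20 and so on, so a spike that starts and ends between two ticks never appears in the recorded maximum.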

Help

Need help? File a ticket at https://github.com/itamarst/dask-memusage/issues/new

Release history

1.1 (this version): Jan 18, 2020
1.0: Sep 28, 2018

Download files

Source Distribution

dask_memusage-1.1.tar.gz (7.0 kB), uploaded Jan 18, 2020

Built Distribution

dask_memusage-1.1-py3-none-any.whl (4.4 kB), uploaded Jan 18, 2020, Python 3

File details

Details for the file dask_memusage-1.1.tar.gz.

File metadata

Download URL: dask_memusage-1.1.tar.gz
Upload date: Jan 18, 2020
Size: 7.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: python-requests/2.22.0

File hashes

Hashes for dask_memusage-1.1.tar.gz
Algorithm	Hash digest
SHA256	`29d9f25074fecd7ca249e972cb3ec0b909a1dcefaf037c8d5fca24fadbf66757`
MD5	`94f3882eed9009eee13702c1c6ed2565`
BLAKE2b-256	`b6c473b1021d1a9ea5ed29c079faf23cb62d8c29e8ef5794384f237c8927b918`

File details

Details for the file dask_memusage-1.1-py3-none-any.whl.

File metadata

Download URL: dask_memusage-1.1-py3-none-any.whl
Upload date: Jan 18, 2020
Size: 4.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: python-requests/2.22.0

File hashes

Hashes for dask_memusage-1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3024bcd9189ac611d2576ab8b3941dd41ea466f1933dd131cf4650f81a4677c4`
MD5	`12630a210959fa028c7c04e651b1ee67`
BLAKE2b-256	`e051499c565202a5b892bd9ac5ba98c458d0cf6d1ec9b0b784db20a4e0f5b5cd`
