# GPUParallel
Joblib-like interface for parallel GPU computations (e.g. data preprocessing).
```python
import torch
from gpuparallel import GPUParallel, delayed

def perform(idx, device_id, **kwargs):
    tensor = torch.Tensor([idx]).to(device_id)
    return (tensor * tensor).item()

result = GPUParallel(n_gpu=2)(delayed(perform)(idx) for idx in range(5))
print(list(result))  # [0.0, 1.0, 4.0, 9.0, 16.0], ordered in accordance with the input parameters
```
Features:
- Initialize networks once on worker init
- Reuse initialized workers
- Preserve sample order: `preserve_order` flag, turned on by default
- Result is a generator
- Auto batching
- Simple logging from workers
- Main-process inference mode for debugging tasks (use `debug=True`)
- Progress bar with tqdm: `progressbar` flag
- Optional ignoring of task errors: `ignore_errors` flag (this and the two flags above are combined in the sketch below)
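
The convenience flags from the list above can be combined in a single call. A minimal sketch, assuming `preserve_order`, `progressbar`, `ignore_errors`, and `debug` are keyword arguments of the `GPUParallel` constructor (the flag names come from the feature list; the trivial `perform` body is for illustration only):

```python
from gpuparallel import GPUParallel, delayed

def perform(idx, device_id=None, **kwargs):
    return idx * idx  # trivial stand-in for real GPU work

gp = GPUParallel(
    n_gpu=2,
    preserve_order=True,   # keep results in input order (the default)
    progressbar=True,      # show a tqdm progress bar
    ignore_errors=True,    # don't abort the whole queue on a failed task
    # debug=True,          # run tasks in the main process for debugging
)
results = list(gp(delayed(perform)(idx) for idx in range(10)))
```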
## Install

```bash
python3 -m pip install gpuparallel
# or
python3 -m pip install git+git://github.com/vlivashkin/gpuparallel.git
```
## Examples
### Initialize networks once on worker init

The function passed as `init_fn` is called on the initialization of every worker. All common resources (e.g. networks) can be initialized here.
```python
from gpuparallel import GPUParallel, delayed

def init(device_id=None, **kwargs):
    global model
    model = load_model().to(device_id)  # load_model() stands in for your own loader

def perform(img, device_id=None, **kwargs):
    global model
    return model(img.to(device_id))

gp = GPUParallel(n_gpu=16, n_workers_per_gpu=2, init_fn=init)
results = gp(delayed(perform)(img) for img in images)
```
### Reuse initialized workers

Once workers are initialized, they stay alive as long as the GPUParallel object exists. You can run several queues of tasks without reinitializing worker resources:
```python
gp = GPUParallel(n_gpu=16, n_workers_per_gpu=2, init_fn=init)

overall_results = []
for folder_images in folders:
    folder_results = gp(delayed(perform)(img) for img in folder_images)
    overall_results.extend(folder_results)

del gp  # closes the process pool to free memory
```
### Result is a generator

A GPUParallel call returns a generator, so you can consume results while the computation is still running (e.g. to sequentially save ordered results):
```python
import h5py

gp = GPUParallel(n_gpu=16, n_workers_per_gpu=2, preserve_order=True)
results = gp(delayed(perform)(img) for img in images)

with h5py.File('output.h5', 'w') as f:
    result_dataset = f.create_dataset('result', shape=(300, 224, 224, 3))
    for idx, result in enumerate(results):
        result_dataset[idx] = result
```
### Auto batching

Use the `BatchGPUParallel` class to automatically split an array/tensor into batches for the workers. The `flat_result` flag de-batches the results (it works only if the task returns a single array/tensor):
```python
import numpy as np
from gpuparallel import BatchGPUParallel  # assumed importable from the package root

arr = np.zeros((102, 103))
bgpup = BatchGPUParallel(task_fn=task, batch_size=3, flat_result=True, n_gpu=2)
flat_results = np.array(list(bgpup(arr)))
```
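
The `task` function above is not defined in the original snippet; unlike the per-sample examples, it receives a whole batch. A minimal sketch of what such a batch-level task could look like (the body and the `device_id` keyword are assumptions for illustration, not part of the library):

```python
import numpy as np

def task(batch, device_id=None, **kwargs):
    # `batch` is a chunk of `arr` with up to `batch_size` rows (assumption);
    # summing each row is a toy stand-in for real GPU work
    return np.asarray(batch).sum(axis=1)
```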
### Simple logging from workers

`print()` inside a worker won't be seen in the main process, but you can still log to the stderr of the main process. Call `log_to_stderr()` to initialize logging, and use `log.info(message)` to log messages from workers:
```python
from gpuparallel import GPUParallel, delayed, log_to_stderr, log

log_to_stderr('INFO')

def perform(idx, worker_id=None, device_id=None):
    hi = f'Hello world #{idx} from worker #{worker_id} with {device_id}!'
    log.info(hi)

GPUParallel(n_gpu=2)(delayed(perform)(idx) for idx in range(2))
```
It will print:

```
[INFO/Worker-1(cuda:1)]:Hello world #1 from worker #1 with cuda:1!
[INFO/Worker-0(cuda:0)]:Hello world #0 from worker #0 with cuda:0!
```
## Download files
### gpuparallel-0.2.2.tar.gz (source distribution)

File metadata:
- Download URL: gpuparallel-0.2.2.tar.gz
- Size: 8.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | 0ac9e002a37d86e3aecf4ce363dc65e0627167aabe75a143bf0000677a3243be |
| MD5 | dc2c2cc0d43633627c695c897067177b |
| BLAKE2b-256 | 1eec22ddbe8c2049093d39b3f2d5def58a7ae47f7ac3c90af0cc5726d636714b |
### gpuparallel-0.2.2-py3-none-any.whl (built distribution)

File metadata:
- Download URL: gpuparallel-0.2.2-py3-none-any.whl
- Size: 8.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | 7f3266a7cfc5ff216352a2f61b105635f6a9d4f235f626478e18643a6ab990d6 |
| MD5 | 28ffb259c9cce01ecba1ecf9b861cd46 |
| BLAKE2b-256 | 3bc88603aa05d6be89fd4547ba881cfb7eace2be0cbd1b1fb9850cf3507a2bf7 |