DLSlime Transfer Engine

Flexible & Efficient Heterogeneous Transfer Toolkit

DLSlime is a heterogeneous transfer toolkit for distributed deep learning systems. It provides Python and C++ APIs for peer-to-peer data movement across RDMA, NVLink, Ascend Direct, and torch-distributed-style backends. Higher-level PeerAgent, SlimeRPC, and cache-service utilities are built on top of the same data plane.

Highlights

Area               What DLSlime Provides
P2P transfer       RDMA RC read/write/send-recv, NVLink transfer, Ascend Direct transfer
Python control     RDMAEndpoint, PeerAgent, memory-region registration, async futures
RPC                SlimeRPC service/proxy helpers over PeerAgent mailbox transport
Cache service      dlslime-cache service for assignment-directory backed RDMA cache slabs
Torch integration  Optional torch backend and torchrun examples
Benchmarks         Transfer, endpoint, cache, and RPC microbenchmarks under bench/

Install

From PyPI

pip install dlslime==0.0.3rc2

The PyPI package is built with the default CMake flags. Build from source when you need optional transports or local C++ changes.

From Source

git clone https://github.com/deeplink-org/DLSlime.git
cd DLSlime
pip install -v --no-build-isolation -e .

Pass CMake flags through the environment when enabling optional components:

BUILD_NVLINK=ON BUILD_TORCH_PLUGIN=ON \
  pip install -v --no-build-isolation -e .
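
When the install is driven from a script rather than a shell, the same flags can be placed in the child process environment. The following is a minimal sketch, not part of DLSlime: `build_install_invocation` is a hypothetical helper, and it assumes only what the shell example above shows, namely that a flag is enabled by setting it to `ON` in the environment.

```python
import os
import sys


def build_install_invocation(enabled_flags, base_env=None):
    """Return (command, env) for an editable install with flags set to ON.

    Hypothetical helper: `enabled_flags` is an iterable of flag names such as
    "BUILD_NVLINK". `base_env` defaults to a copy of the current environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    for name in enabled_flags:
        env[name] = "ON"
    command = [
        sys.executable, "-m", "pip", "install",
        "-v", "--no-build-isolation", "-e", ".",
    ]
    return command, env


command, env = build_install_invocation(["BUILD_NVLINK", "BUILD_TORCH_PLUGIN"])
print(" ".join(command[1:]))
print({name: env[name] for name in ("BUILD_NVLINK", "BUILD_TORCH_PLUGIN")})

# To run the install from the repository root:
#   import subprocess
#   subprocess.run(command, env=env, check=True)
```

Using `sys.executable -m pip` keeps the install tied to the interpreter that runs the script, which matters when several Python environments are present.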

For a pure C++ build:

cmake -S . -B build -GNinja -DBUILD_PYTHON=OFF -DBUILD_RDMA=ON
cmake --build build

Build Flags

Flag                 Default                             Description
BUILD_RDMA           ON                                  Build the RDMA transfer engine
BUILD_PYTHON         OFF in CMake, ON in pyproject.toml  Build Python bindings
BUILD_NVLINK         OFF                                 Build the NVLink transfer engine
BUILD_ASCEND_DIRECT  OFF                                 Build Ascend Direct transport
BUILD_TORCH_PLUGIN   OFF                                 Build DLSlime as a torch backend
BUILD_BENCH          OFF                                 Build C++ transfer-engine benchmarks
BUILD_TEST           OFF                                 Build C++ tests
USE_MACA             OFF                                 Enable Metax platform support for torch backend builds

Quick Start

RDMA Endpoint

The low-level endpoint API registers local memory regions, exchanges endpoint metadata out of band, connects peers, and issues RDMA operations.

python examples/python/p2p_rdma_rc_read.py
python examples/python/p2p_rdma_rc_write.py
python examples/python/p2p_rdma_rc_write_with_imm_data.py
python examples/python/p2p_rdma_rc_send_recv_gdr.py

The examples use available_nic() to find RDMA devices and RDMAEndpoint to register local and remote memory regions.
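
The "out of band" step in the flow above means that peers swap their endpoint metadata over an ordinary channel before any RDMA traffic is possible. The sketch below illustrates that step only, using length-prefixed JSON over a standard socket. It does not call DLSlime: the dictionary fields (`peer`, `mr`, `addr`, `length`, `rkey`) are placeholders, not DLSlime's actual metadata format, and the helper names are hypothetical.

```python
import json
import socket
import struct
import threading


def send_json(sock, obj):
    """Send one JSON message prefixed with its 4-byte big-endian length."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv may return fewer."""
    chunks = []
    while n:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_json(sock):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


def exchange_metadata(sock, local_info):
    """Send our metadata, then block until the peer's metadata arrives."""
    send_json(sock, local_info)
    return recv_json(sock)


# Two in-process "peers" joined by a socket pair stand in for two hosts.
a, b = socket.socketpair()
info_a = {"peer": "a", "mr": {"addr": 4096, "length": 1024, "rkey": 11}}
info_b = {"peer": "b", "mr": {"addr": 8192, "length": 1024, "rkey": 22}}

result = {}
peer_b = threading.Thread(
    target=lambda: result.update(b=exchange_metadata(b, info_b))
)
peer_b.start()
result["a"] = exchange_metadata(a, info_a)
peer_b.join()
a.close()
b.close()

print(result["a"]["peer"], result["b"]["peer"])
```

After the exchange each side holds the other's description and could pass it to its endpoint's connect step. The scripts under examples/python/ show how DLSlime itself does this.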

PeerAgent and SlimeRPC

PeerAgent adds control-plane-based discovery and connection management. Start a NanoCtrl instance first, then run the RPC example:

nanoctrl start
python examples/python/rpc_example.py --ctrl http://127.0.0.1:3000

The example defines a Python service, serves it on a worker PeerAgent, and calls it from a driver PeerAgent through the SlimeRPC proxy API.
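
The service/proxy split described above can be pictured with a toy model: a worker drains a mailbox of requests and a proxy turns attribute access into mailbox messages. This sketch is purely illustrative and runs in a single process with `queue.Queue` as the mailbox. `MathService`, `Proxy`, and `serve` are hypothetical names, not the SlimeRPC or PeerAgent API; see examples/python/rpc_example.py for the real one.

```python
import queue
import threading


class MathService:
    """Stand-in for a user-defined Python service."""

    def add(self, x, y):
        return x + y


class Proxy:
    """Turns proxy.method(*args) into a request placed on the mailbox."""

    def __init__(self, mailbox):
        self._mailbox = mailbox

    def __getattr__(self, method):
        if method.startswith("_"):
            raise AttributeError(method)

        def call(*args):
            reply = queue.Queue(maxsize=1)
            self._mailbox.put((method, args, reply))
            ok, value = reply.get()  # block until the worker answers
            if not ok:
                raise RuntimeError(value)
            return value

        return call


def serve(service, mailbox):
    """Worker loop: dispatch requests until a None sentinel arrives."""
    while True:
        item = mailbox.get()
        if item is None:
            break
        method, args, reply = item
        try:
            reply.put((True, getattr(service, method)(*args)))
        except Exception as exc:  # report failures to the caller
            reply.put((False, repr(exc)))


mailbox = queue.Queue()
worker = threading.Thread(target=serve, args=(MathService(), mailbox))
worker.start()

driver = Proxy(mailbox)
total = driver.add(2, 3)

mailbox.put(None)  # ask the worker to stop
worker.join()
print(total)
```

In the real system the driver and worker are separate PeerAgents and the mailbox is a transport between them rather than an in-memory queue.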

DLSlimeCache

DLSlimeCache owns a preallocated memory region and stores assignment manifests so clients can write bytes into cache slabs and read them back through normal RDMA operations.

nanoctrl start
dlslime-cache start --ctrl http://127.0.0.1:3000 \
  --host 127.0.0.1 --port 8765 --memory-size 1G

python examples/python/cache_client_example.py --url http://127.0.0.1:8765

dlslime-cache stop

See docs/design/dlslime-cache.md for the cache service design and API.
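
As a rough mental model of the description above, the cache can be thought of as one preallocated buffer cut into fixed-size slabs, plus a manifest per key that records which slabs hold its bytes. The sketch below is an in-process toy under those assumptions. `SlabCache`, its methods, and the manifest layout are hypothetical, and a `bytearray` stands in for the RDMA-registered region that real clients would read and write remotely.

```python
class SlabCache:
    """Toy slab cache: fixed-size slabs inside one preallocated buffer."""

    def __init__(self, memory_size, slab_size):
        self.slab_size = slab_size
        self.memory = bytearray(memory_size)  # stands in for the registered region
        self.free_slabs = list(range(memory_size // slab_size))
        self.manifests = {}  # key -> (slab indices in order, byte length)

    def put(self, key, data):
        if key in self.manifests:
            raise KeyError(f"{key!r} is already assigned")
        needed = max(1, -(-len(data) // self.slab_size))  # ceiling division
        if needed > len(self.free_slabs):
            raise MemoryError("not enough free slabs")
        slabs = [self.free_slabs.pop() for _ in range(needed)]
        for i, slab in enumerate(slabs):
            chunk = data[i * self.slab_size:(i + 1) * self.slab_size]
            start = slab * self.slab_size
            self.memory[start:start + len(chunk)] = chunk
        self.manifests[key] = (slabs, len(data))

    def get(self, key):
        slabs, length = self.manifests[key]
        out = bytearray()
        for slab in slabs:
            start = slab * self.slab_size
            out += self.memory[start:start + self.slab_size]
        return bytes(out[:length])  # drop padding in the last slab

    def delete(self, key):
        slabs, _ = self.manifests.pop(key)
        self.free_slabs.extend(slabs)


cache = SlabCache(memory_size=64, slab_size=16)
payload = b"x" * 40  # needs three 16-byte slabs
cache.put("kv-0", payload)
roundtrip = cache.get("kv-0")
free_after_put = len(cache.free_slabs)
cache.delete("kv-0")
print(roundtrip == payload, free_after_put, len(cache.free_slabs))
```

Keeping the byte length in the manifest lets a reader trim the unused tail of the final slab, so values need not be multiples of the slab size.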

NVLink and Ascend

torchrun --nproc_per_node=2 examples/python/p2p_nvlink.py
python examples/python/p2p_ascend_read.py

Ascend Direct setup details live in docs/huawei_ascend/README.md.

Benchmarks

Benchmark commands and historical performance tables now live under the benchmark directory, bench/ (see the benchmark README there).

Common entry points:

# Aggregated RDMA transfer benchmark, two nodes
torchrun --master-addr <addr> --master-port 6006 \
  --nnodes 2 --nproc-per-node 8 --node-rank <rank> \
  bench/python/agg_transfer_bench_spmd.py \
  --qp-num 8 --transfer-engine dlslime \
  --batch-size 64 --num-iteration 100 --num-concurrency 8

# SlimeRPC vs Ray local benchmark
bash bench/python/run_rpc_bench.sh

Repository Layout

dlslime/          Python package and C++ sources
examples/python/  Runnable Python examples
bench/            Benchmark scripts, result files, and benchmark README
docs/             Design notes, roadmap, platform docs, and benchmark notes
tests/            Python and C++ tests
cmake/            CMake helper modules
scripts/          Development scripts

Documentation

Design notes, the roadmap, platform docs, and benchmark notes live under docs/.

License

See LICENSE.
