Distributed Coordinated Sequence Sampler
DiscoSeqSampler
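Distributed Coordinated Sequence Sampler: an efficient framework for distributed sequence sampling.
Background
In today's AI landscape, models for audio/speech and for image/video alike rely heavily on the Transformer architecture. The compute cost of these models depends strongly on sequence length, and in large-scale datasets the length distribution is often very wide. Efficient multi-GPU training therefore requires careful, accurate management of the sequence lengths in the training data.
DiscoSeqSampler is a distributed sequence sampling framework designed to address this problem. It coordinates and manages sequences of different lengths to keep training efficient and stable.
Features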
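To make the problem concrete, here is a minimal sketch of length-aware batching under a padded token budget, followed by an even split of batches across ranks. It is not DiscoSeqSampler's API or algorithm: the function names `token_budget_batches` and `shard_for_rank`, the `max_tokens` budget, and the drop-remainder policy are all assumptions made for illustration.

```python
import random
from typing import List, Sequence


def token_budget_batches(lengths: Sequence[int], max_tokens: int) -> List[List[int]]:
    """Group sample indices so that (longest length * batch size) <= max_tokens.

    Hypothetical helper for illustration only, not part of discoss.
    """
    # Sort indices by length so each batch holds similar lengths (little padding).
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    batches: List[List[int]] = []
    current: List[int] = []
    for idx in order:
        n = lengths[idx]
        if n > max_tokens:
            raise ValueError(f"sample {idx} has length {n} > max_tokens={max_tokens}")
        # Lengths are ascending, so n is the longest sequence in the batch so far.
        if current and n * (len(current) + 1) > max_tokens:
            batches.append(current)
            current = []
        current.append(idx)
    if current:
        batches.append(current)
    return batches


def shard_for_rank(
    batches: List[List[int]], rank: int, world_size: int, seed: int = 0
) -> List[List[int]]:
    """Return this rank's batches; every rank gets the same number of batches."""
    rng = random.Random(seed)  # same seed on every rank -> identical shuffle
    shuffled = list(batches)
    rng.shuffle(shuffled)
    # Drop the remainder so that no rank runs an extra step (assumed policy).
    usable = len(shuffled) - len(shuffled) % world_size
    return shuffled[rank:usable:world_size]


lengths = [5, 50, 7, 48, 6, 100, 8, 95]
batches = token_budget_batches(lengths, max_tokens=100)
for rank in range(2):
    print(rank, shard_for_rank(batches, rank, world_size=2))
```

Short sequences end up in large batches and long sequences in small ones, so each batch has a bounded padded size. Sharing the shuffle seed across ranks is what keeps the shards disjoint without any communication.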
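- 🚀 High performance: optimized distributed sampling algorithms
- 🔄 Coordination: intelligent sequence coordination and synchronization
- 📊 Scalable: supports large-scale distributed deployment
- 🛠️ Ease of use: a clean, simple API
- 🔧 Configurable: flexible configuration options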
Installation
- The project is still under development, and its functionality has not been fully verified.
Install from PyPI
```bash
pip install discoss
```
Install from source
```bash
git clone https://github.com/lifeiteng/DiscoSeqSampler.git
cd DiscoSeqSampler
pip install -e .
```
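After either install method, one way to confirm that the package is visible to the current interpreter is to query its installed version through the standard library. This is only a sketch: the helper name `installed_version` is made up here, and the only assumption about the project is the distribution name `discoss` used in the `pip install` command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "discoss") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("discoss is not installed in this environment")
else:
    print(f"discoss {version} is installed")
```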
Quick start
```python
import discoss
# TODO: add a usage example
```
Development
See DEVELOPMENT.md for the detailed development guide.
Quick setup
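```bash
# Clone the repository
git clone https://github.com/lifeiteng/DiscoSeqSampler.git
cd DiscoSeqSampler
# Install the development dependencies (quoted so shells such as zsh do not expand the brackets)
pip install -e ".[dev]"
# Set up the pre-commit hooks
make setup-dev
```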
Run the tests
```bash
make test
```
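Contributing
Contributions are welcome! See DEVELOPMENT.md for how to set up a development environment.
License
This project is released under the MIT License. See the LICENSE file for details.
Citation
If you use DiscoSeqSampler in your research, please cite: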
```bibtex
@software{discoss2024,
  title={DiscoSeqSampler: Distributed Coordinated Sequence Sampler},
  author={Feiteng Li},
  year={2025},
  url={https://github.com/lifeiteng/DiscoSeqSampler}
}
```
File details
Details for the file discoss-0.1.1.tar.gz.
File metadata
- Download URL: discoss-0.1.1.tar.gz
- Upload date:
- Size: 20.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f049c242687fcfb675afbc577d52a06d33c3af76a335c541af597499caa59b98 |
| MD5 | 23a52a6f1609a8a048eb6292eaf60546 |
| BLAKE2b-256 | de8eb2a8cb82fa44f5dbb15230d99337d7ce57e5b4c7304365c559652408b34b |
File details
Details for the file discoss-0.1.1-py3-none-any.whl.
File metadata
- Download URL: discoss-0.1.1-py3-none-any.whl
- Upload date:
- Size: 16.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3b8cd95c48025734ca5d4ae996946ad842b18e18eb300f0a79deed86271caa35 |
| MD5 | b77febc5acaecfdfb728302b63fecbd9 |
| BLAKE2b-256 | ee87ddb3208e6c620532282113a9d1eb1c63125dd46f4eac7c1d9e76852c4bf4 |