Project description

🚧 PipeGoose: Train 🤗 transformers in 3D parallelism - WIP

Honk honk honk! This project is actively under development. Check out my learning progress here.

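# The Hugging Face package is named `transformers` (plural).
from transformers import AutoModel, AutoTokenizer
from pipegoose import Pipeline, ParallelContext

# "bloom" is kept as written; substitute a full Hub model ID
# (for example "bigscience/bloom-560m") when running this.
model = AutoModel.from_pretrained("bloom")
tokenizer = AutoTokenizer.from_pretrained("bloom")

# 2 x 2 x 2 = 8 processes in total, one per GPU.
parallel_context = ParallelContext(
    tensor_parallel_size=2,
    pipeline_parallel_size=2,
    data_parallel_size=2,
)

pipeline = Pipeline(model, tokenizer, parallel_context)

# `dataloader` is your own training data loader; it is not defined here.
pipeline.fit(dataloader)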
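
The three sizes passed to ParallelContext multiply to give the number of processes: 2 x 2 x 2 = 8. As a rough illustration of what "3D" means here, the sketch below maps a global rank to a (data, pipeline, tensor) coordinate. The names `Coord`, `world_size`, and `rank_to_coord` are hypothetical, and the rank ordering (tensor index varying fastest) is an assumption made for the example, not necessarily the layout pipegoose uses.

```python
from typing import NamedTuple


class Coord(NamedTuple):
    data: int      # index of the data-parallel replica
    pipeline: int  # index of the pipeline stage
    tensor: int    # index of the tensor-parallel shard


def world_size(tensor_parallel_size: int,
               pipeline_parallel_size: int,
               data_parallel_size: int) -> int:
    """Total number of processes needed for the 3D grid."""
    return tensor_parallel_size * pipeline_parallel_size * data_parallel_size


def rank_to_coord(rank: int,
                  tensor_parallel_size: int,
                  pipeline_parallel_size: int,
                  data_parallel_size: int) -> Coord:
    """Map a global rank to grid coordinates.

    Assumed ordering (hypothetical): the tensor index varies fastest,
    then the pipeline index, then the data index.
    """
    total = world_size(tensor_parallel_size,
                       pipeline_parallel_size,
                       data_parallel_size)
    if not 0 <= rank < total:
        raise ValueError(f"rank must be in [0, {total}), got {rank}")
    tensor = rank % tensor_parallel_size
    pipeline = (rank // tensor_parallel_size) % pipeline_parallel_size
    data = rank // (tensor_parallel_size * pipeline_parallel_size)
    return Coord(data=data, pipeline=pipeline, tensor=tensor)


if __name__ == "__main__":
    for rank in range(world_size(2, 2, 2)):
        print(rank, rank_to_coord(rank, 2, 2, 2))
```

Under this assumed ordering, ranks that share the same data and pipeline indices form one tensor-parallel group, which keeps the most communication-heavy group on adjacent ranks.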

Implementation Details

  • Supports training 🤗 transformers models.
  • Supports ZeRO-1 and ZeRO-Offload.
  • Overlaps computation with data transfer by running them on separate CUDA streams.
  • Gradient checkpointing will be implemented by enforcing a virtual dependency in the backpropagation graph, so that the activations for each checkpoint are recomputed just in time for each (micro-batch, partition) pair.
  • Custom algorithms for model partitioning, with two default partitioning strategies: one based on elapsed time per layer and one based on GPU memory consumption per layer.
  • Potential support includes:
    • Callbacks within the pipeline: Callback(function, microbatch_idx, partition_idx), invoked before and after the forward, backward, and recompute steps (the last for gradient checkpointing).
    • Mixed precision training.
    • Elastic training.
    • Fault tolerance.
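
To make the partitioning bullet above concrete, here is a minimal sketch of cost-based partitioning: given one cost per layer (elapsed time or memory), it splits the layers into contiguous pipeline partitions so that the most expensive partition is as cheap as possible. The function name `balance_partitions` and the dynamic-programming approach are assumptions for illustration; this is not pipegoose's actual partitioning code.

```python
from typing import List, Sequence


def balance_partitions(costs: Sequence[float], num_partitions: int) -> List[int]:
    """Return the number of layers in each contiguous partition.

    Minimises the cost of the most expensive partition, where `costs[i]`
    is the measured cost of layer i (for example milliseconds per
    forward pass, or bytes of GPU memory).
    """
    n = len(costs)
    if not 1 <= num_partitions <= n:
        raise ValueError("need 1 <= num_partitions <= number of layers")

    # prefix[i] is the total cost of the first i layers.
    prefix = [0.0]
    for cost in costs:
        prefix.append(prefix[-1] + cost)

    inf = float("inf")
    # best[k][i]: smallest possible maximum partition cost when the
    # first i layers are split into k partitions.
    best = [[inf] * (n + 1) for _ in range(num_partitions + 1)]
    # cut[k][i]: where the last of those k partitions starts.
    cut = [[0] * (n + 1) for _ in range(num_partitions + 1)]
    best[0][0] = 0.0

    for k in range(1, num_partitions + 1):
        for i in range(k, n + 1):
            for j in range(k - 1, i):
                candidate = max(best[k - 1][j], prefix[i] - prefix[j])
                if candidate < best[k][i]:
                    best[k][i] = candidate
                    cut[k][i] = j

    # Walk the cut points backwards to recover the partition sizes.
    sizes = []
    i = n
    for k in range(num_partitions, 0, -1):
        j = cut[k][i]
        sizes.append(i - j)
        i = j
    sizes.reverse()
    return sizes


if __name__ == "__main__":
    # Hypothetical per-layer costs; real values would come from profiling.
    print(balance_partitions([1, 2, 3, 4, 5, 6], num_partitions=2))
```

With the hypothetical costs above, the first four layers (total cost 10) go to the first partition and the last two (total cost 11) to the second. The same function works for either default strategy; only the measured costs change.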

Appreciation

Big thanks to 🤗 Hugging Face for sponsoring this project with 8x A100 GPUs for testing! Thanks also to Zach Schrier for monthly Twitch donations.

Download files

Source Distribution

pipegoose-0.1.0.tar.gz (20.4 kB)

Uploaded Source

Built Distribution

pipegoose-0.1.0-py3-none-any.whl (33.2 kB)

Uploaded Python 3

File details

Details for the file pipegoose-0.1.0.tar.gz.

File metadata

  • Download URL: pipegoose-0.1.0.tar.gz
  • Upload date:
  • Size: 20.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.5.1 CPython/3.9.13 Darwin/21.6.0

File hashes

Hashes for pipegoose-0.1.0.tar.gz
Algorithm Hash digest
SHA256 18af0aa5eb99cd320fe485de3fb70f7ea73d5d06bf3f423923042f45203dbf36
MD5 2891f4c54785d68e4eec042d404165db
BLAKE2b-256 b62916c4e05a2a4b610fa3489edba062b9e503e21fe13d0c1460814ac8e44db3

File details

Details for the file pipegoose-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: pipegoose-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 33.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.5.1 CPython/3.9.13 Darwin/21.6.0

File hashes

Hashes for pipegoose-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0dc4f369095f64c4fd4cdf47ef4183e16d5e3b075f23d3dff3f81900c3e2a276
MD5 5cbfffb15f19f86b0e68fb6218be15ee
BLAKE2b-256 b965bcb4d9b9a86ffd68638ae2da7f89e05d84422bf52f02a3305e9a0ef109f9
