DualPipe: A bidirectional pipeline parallelism algorithm

DualPipe is a bidirectional pipeline parallelism algorithm introduced in the DeepSeek-V3 Technical Report. It fully overlaps the computation and communication phases of the forward and backward passes, and it reduces pipeline bubbles. In the scheduling example, micro-batches are fed into the pipeline from both ends, and the micro-batches in the reverse direction are symmetric to those in the forward direction. Compared with other pipeline schedules such as 1F1B and ZB1P, DualPipe has smaller pipeline bubbles, though the report notes that it keeps two copies of the model parameters and slightly raises peak activation memory. DualPipe was created and developed by Jiashi Li, Chengqi Deng, and Wenfeng Liang. It requires PyTorch 2.0 or above, and real-world use requires implementing a custom overlapped_forward_backward method tailored to your module. See the DeepSeek-V3 Technical Report for more details.

https://github.com/deepseek-ai/DualPipe
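
To make the bubble comparison concrete, here is a minimal sketch that evaluates the bubble-time formulas from the comparison table in the DeepSeek-V3 Technical Report for 1F1B, ZB1P, and DualPipe. The function name bubble_times and the timing values are hypothetical and chosen only for illustration; they are not measurements. PP is the number of pipeline stages, F and B are the execution times of a forward and a full backward chunk, W is the time of a "backward for weights" chunk, and F&B is the time of two mutually overlapped forward and backward chunks.

```python
def bubble_times(pp: int, f: float, b: float, w: float, f_and_b: float) -> dict:
    """Pipeline bubble time for three schedules.

    pp       number of pipeline stages (DualPipe assumes an even count)
    f        time of one forward chunk
    b        time of one full backward chunk
    w        time of one "backward for weights" chunk
    f_and_b  time of two mutually overlapped forward and backward chunks
    """
    if pp < 2 or pp % 2 != 0:
        raise ValueError("pp must be an even integer >= 2")
    return {
        "1F1B": (pp - 1) * (f + b),
        "ZB1P": (pp - 1) * (f + b - 2 * w),
        "DualPipe": (pp // 2 - 1) * (f_and_b + b - 3 * w),
    }


if __name__ == "__main__":
    # Hypothetical timings in arbitrary units, not measured values.
    for name, bubble in bubble_times(pp=8, f=1.0, b=2.0, w=1.0, f_and_b=3.0).items():
        print(f"{name}: {bubble}")
```

With these illustrative inputs the DualPipe bubble comes out smaller than the other two mainly because its leading factor is PP/2 - 1 rather than PP - 1. The actual gain depends on how well F&B overlaps on real hardware.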
