Prof. Marco Canini, KAUST


Title: Programmable Networks for Distributed Deep Learning: Advances and Perspectives

Abstract: Training large deep learning models is challenging due to the high communication overheads that distributed training entails. Embracing recent technological developments in programmable network devices, this talk describes our efforts to rein in distributed deep learning's communication bottlenecks and offers an agenda for future work in this area. We demonstrate that an in-network aggregation primitive can accelerate distributed DL workloads and that it can be implemented using modern programmable switch hardware. We discuss various designs for streaming aggregation and in-network data processing that lower memory requirements and exploit sparsity to maximize effective bandwidth use. We also touch on gradient compression methods, which reduce communication volume and adapt to dynamic network conditions. Lastly, we consider what role the rise of programmable NICs may play in this space.
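
To give a flavor of the in-network aggregation idea mentioned in the abstract, here is a minimal, hypothetical sketch (not the specific system discussed in the talk): a switch-like aggregator with a small pool of memory slots sums fixed-size gradient chunks streamed by the workers and forwards a single aggregated result per chunk, so the full gradients never have to traverse the network to a central server.

```python
import numpy as np

# Illustrative sketch of streaming in-network aggregation (assumptions:
# synthetic gradients, a single reusable memory slot per chunk, no packet
# loss or retransmission handling).
NUM_WORKERS = 4
CHUNK_SIZE = 8          # elements aggregated per slot (tiny, for illustration)
GRADIENT_LEN = 32       # total gradient length per worker

rng = np.random.default_rng(0)
gradients = [rng.standard_normal(GRADIENT_LEN) for _ in range(NUM_WORKERS)]

def stream_aggregate(gradients, chunk_size):
    """Sum gradients chunk by chunk, as a switch with limited memory would."""
    total_len = len(gradients[0])
    result = np.empty(total_len)
    for start in range(0, total_len, chunk_size):
        slot = np.zeros(chunk_size)                 # one reusable memory slot
        for g in gradients:                         # each worker's chunk arrives
            slot += g[start:start + chunk_size]
        result[start:start + chunk_size] = slot     # emit the aggregated chunk
    return result

aggregated = stream_aggregate(gradients, CHUNK_SIZE)
assert np.allclose(aggregated, np.sum(gradients, axis=0))
print("aggregated gradient norm:", np.linalg.norm(aggregated))
```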
