Dartmouth Events

Fast and Accurate Deep Neural Network Training

Yang You will introduce the LARS (Layer-wise Adaptive Rate Scaling) and LAMB (Layer-wise Adaptive Moments for Batch training) optimizers, which can find more parallelism for deep learning.

Monday, February 17, 2020
3:30pm – 5:00pm
Kemeny Hall 007
Intended Audience(s): Public
Categories: Lectures & Seminars

Abstract: In the last three years, supercomputers have become increasingly popular among leading AI companies. Amazon built a High Performance Computing (HPC) cloud. Google released its first 100-petaFLOP supercomputer (the TPU Pod). Facebook placed a system on the Top500 supercomputer list. Why do these companies like supercomputers? Because deep learning is computationally expensive: even with 16 TPUs, BERT training takes more than three days, while a supercomputer can process 10^17 floating-point operations per second. So why not simply use supercomputers and finish training deep neural networks in a very short time? The reason is that deep learning does not have enough parallelism to make full use of the thousands or even millions of processors in a typical modern supercomputer. There are two directions for parallelizing deep learning: model parallelism and data parallelism. Model parallelism is very limited. For data parallelism, current optimizers cannot scale to thousands of processors, because large-batch training tends to converge to sharp minima that generalize poorly. In this talk, I will introduce the LARS (Layer-wise Adaptive Rate Scaling) and LAMB (Layer-wise Adaptive Moments for Batch training) optimizers, which can find more parallelism for deep learning. They not only make deep learning systems scale well, but also help real-world applications achieve higher accuracy.
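
To make the layer-wise idea concrete, here is a minimal sketch of the LARS update rule in Python. It is an illustration based on the abstract's description, not the speaker's implementation; the function name lars_step and all hyperparameter values are placeholders, and momentum is omitted for brevity.

    # Minimal sketch of layer-wise adaptive rate scaling (LARS).
    # Illustrative only; lars_step and all hyperparameters are placeholders.
    import numpy as np

    def lars_step(weights, grads, base_lr=0.01, trust_coef=0.001,
                  weight_decay=1e-4, eps=1e-9):
        """One LARS update: rescale each layer's step by ||w|| / ||g||."""
        updated = []
        for w, g in zip(weights, grads):
            g = g + weight_decay * w          # fold L2 regularization into the gradient
            w_norm = np.linalg.norm(w)        # size of this layer's weights
            g_norm = np.linalg.norm(g)        # size of this layer's gradient
            # Layer-wise "trust ratio": layers whose gradients are small
            # relative to their weights still take a meaningful step,
            # which helps keep large-batch training stable.
            trust_ratio = trust_coef * w_norm / (g_norm + eps)
            updated.append(w - base_lr * trust_ratio * g)
        return updated

LAMB follows the same layer-wise trust-ratio idea but applies it to Adam-style moment estimates rather than to the raw gradient.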


Bio: Yang You is a PhD candidate in the Computer Science Division at UC Berkeley, advised by Professor James Demmel. His research interests include parallel/distributed algorithms, high-performance computing (HPC), and machine learning.

For more information, contact:
Susan Cable

Events are free and open to the public unless otherwise noted.