[Event] [2023.04.27 (Thu.)] Artificial Intelligence & AI Convergence Network Colloquium
- College of Software Convergence, Academic Affairs Team
- 김성민
- Create Date 2023-04-19
- Views 494
< Artificial Intelligence & AI Convergence Network Colloquium >
acceleration. While they improve performance, GPUs remain underutilized during training. This paper proposes
out-of-order (ooo) back-prop, an effective scheduling technique for neural network training. By exploiting the
dependencies of gradient computations, ooo backprop makes it possible to reorder their executions to make the
most of the GPU resources. We show that GPU utilization in both single- and multi-GPU training can be
improved by applying ooo backprop and prioritizing critical operations. We propose three scheduling
algorithms based on ooo backprop. For single-GPU training, we schedule with multi-stream ooo computation
to mask the kernel launch overhead. In data-parallel training, we reorder the gradient computations to
maximize the overlap of computation and parameter communication; in pipeline-parallel training, we
prioritize critical gradient computations to reduce pipeline stalls. We evaluate our optimizations with twelve
neural networks and five public datasets. Compared to the respective state-of-the-art training systems, our
algorithms improve training throughput by 1.03--1.58× for single-GPU training, by 1.10--1.27× for
data-parallel training, and by 1.41--1.99× for pipeline-parallel training.
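The key dependency insight behind the abstract can be sketched in plain Python: in backprop, the output gradient of layer *i* depends only on the output gradient of layer *i+1*, while the weight gradient of layer *i* depends only on the output gradient of that same layer. So the output-gradient chain is the critical path, and weight-gradient computations can be deferred and overlapped with, e.g., parameter communication. The task names and toy scheduler below are illustrative assumptions, not the paper's actual implementation.

```python
# Toy model of backprop task scheduling (no GPU): tasks are (kind, layer) pairs.
# "out_grad" = output-gradient computation, "w_grad" = weight-gradient computation.

def default_schedule(num_layers):
    """Ordinary backprop: for each layer (last to first), compute the
    output gradient, then immediately the weight gradient."""
    order = []
    for i in reversed(range(num_layers)):
        order.append(("out_grad", i))
        order.append(("w_grad", i))
    return order

def ooo_schedule(num_layers):
    """Out-of-order backprop (sketch): run the critical out_grad chain
    first, then the deferred w_grad tasks, which a real system could
    overlap with gradient communication or use to fill idle GPU slots."""
    critical = [("out_grad", i) for i in reversed(range(num_layers))]
    deferred = [("w_grad", i) for i in reversed(range(num_layers))]
    return critical + deferred

def check_dependencies(order, num_layers):
    """Verify a schedule respects backprop's true data dependencies:
    out_grad(i) needs out_grad(i+1); w_grad(i) needs out_grad(i)."""
    done = set()
    for kind, i in order:
        if kind == "out_grad" and i + 1 < num_layers:
            assert ("out_grad", i + 1) in done
        if kind == "w_grad":
            assert ("out_grad", i) in done
        done.add((kind, i))
    return True

print(check_dependencies(ooo_schedule(4), 4))  # → True: reordering is legal
```

Both schedules satisfy the same dependencies; the reordered one simply exposes more freedom for the scheduler, which is what the paper's three algorithms exploit in the single-GPU, data-parallel, and pipeline-parallel settings.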