Wednesday, May 13 • 12:45pm - 1:00pm
Designing Balanced System for Small to Large Scale Deeplearning Training

Deep learning training workloads call for a system design that balances compute, acceleration, and memory. In this talk we discuss how we use system design knobs across compute, memory, storage, and networking, together with systems and hardware management tools and software, to develop optimized solutions for customers' training needs, from small-scale workloads running on a few cards to large-scale, complex workloads spanning multiple chassis and racks. We will showcase how the flexibility to use different fabric solutions, along with the Inter Chip Link, enables different topologies and highly efficient training system configurations. We will also share how we design to balance system performance, software complexity, and total cost of ownership.

Speakers
A.K. Roy

Director, AI Training Systems & Solutions Strategy, Training Products Group (TPG), AIPG, Intel
A.K. Roy is the Director of AI Training Systems & Solutions Strategy & Product Management, in the Training Products Group (TPG) at Intel Corporation’s AI Platforms Group (AIPG). He has over 15 years of experience in AI, Cloud and Server Systems & Solutions Strategy, Product management...


Wednesday May 13, 2020 12:45pm - 1:00pm PDT
EW: Servers