Enabling the A.I. Era: From Materials to Systems

Enabling the A.I. Era: From Materials to Systems Sundeep Bajikar Head of Market Intelligence, Applied Materials New Street Research Conference May 30, 2018 External Use

Key Message PART 1 PART 2 A.I. * Requires New Computing Architectures New A.I. Architectures Require Materials Breakthroughs * A.I. refers to Machine Learning and Deep Learning approaches. 2

Key Message PART 1 PART 2 A.I. * Requires New Computing Architectures New A.I. Architectures Require Materials Breakthroughs What is different about A.I. computing? Empirical analysis A.I. / IoT unaffordable with current computing model 3

What s different about A.I. computing? Memory Bound Performance limited by memory latency and memory bandwidth High Data Parallelism Identical operations can be done on multiple chunks of data in parallel Low Precision Training and inference require much lower floating point (16 bit) and integer (8 bit) precision Relative to traditional CPU architectures Traditional multi-level on-die cache architecture is unnecessary Traditional deep instruction pipeline and out-of-order execution are unnecessary Traditional 64 bit or higher precision data paths are unnecessary Deep Learning Enabled by Processing Abundant Data Source: Computer Architecture: A Quantitative Approach, Sixth Edition, John Hennessy and David Patterson, December 2017, Applied Materials internal analysis 4

Data Algorithms ML / DL RELATIONSHIP BETWEEN Complexity of algorithms VERSUS Abundance of (training) data + affordable highperformance computing AI is about relentless classification Data quality and size are making AI a reality Source: Applied Materials internal analysis. 5

ML / DL CASE STUDY: Google Translate Circa 2010 Complex algorithms based on grammar and sematic rules Developed by team of linguists 70% accuracy Source: Applied Materials internal analysis. 6

ML / DL CASE STUDY: Google Translate Today Simple algorithm Trained using every single web page available in English and Chinese No Chinese speakers on team 98% accuracy Source: Applied Materials internal analysis. 7

FPGA From Homogeneous to Heterogeneous Computing Homogeneous / Discrete Compute Era Heterogeneous Compute Era CPU CPU CPU 0 CPU 1 Media CPU GPU ASIC / FPGA GPU ASIC / FPGA CPU 2 CPU 3 CPU 4 GPU ISP Audio Normal Compiled / Managed Code (Office, OS, Enterprise) Imaging, Video Playback AI Workloads: Training, Inference, Analytics, etc. ASIC CPU GPU Client CC, Search, Audio, VOIP CPU GPU Gaming, HPC, Highly Parallel Workloads Exploits data level parallelism Uses the most power efficient compute engine for a given job Source: Applied Materials internal analysis. 8

DRAM Shipments (EB) NAND Shipments (EB) In Current Compute Model 25 DRAM 500 NAND y = 0.0081x 1.3065 R² = 0.9563 y = 0.0022x 2.0399 R² = 0.9569 20 400 15 300 10 200 5 100 - - 100 200 300 400 500 - - 100 200 300 400 500 Incremental Data Generation (EB) Incremental Data Generation (EB) Source: Cisco VNI, Cisco, Gartner, Factset, Applied Materials internal analysis 2006 to 2016 data from Cisco and Gartner. 2017 to 2020 projections from Cisco for data generation (VNI IP traffic), and industry average estimates for DRAM and NAND content shipments 9

DRAM Shipments (EB) NAND Shipments (EB) A.I. / Big Data Era Needs New Compute Models 180 160 140 120 DRAM 1% adoption of L4/L5 Smart Car 14,000 12,000 10,000 NAND 1% adoption of L4/L5 Smart Car 100 8,000 80 1% SMART CAR ADOPTION 6,000 1% SMART CAR ADOPTION 60 40 20 y = 0.0081x 1.3065 R² = 0.9563 5X MORE DATA 8X MORE DRAM IN 2020 4,000 2,000 y = 0.0022x 2.0399 R² = 0.9569 5X MORE DATA 25X MORE NAND IN 2020 - - 500 1,000 1,500 2,000 2,500 - - 500 1,000 1,500 2,000 2,500 Incremental Data Generation (EB) Incremental Data Generation (EB) Source: Cisco VNI, Cisco, Gartner, Factset, Applied Materials internal analysis 2006 to 2016 data from Cisco and Gartner. 2017 to 2020 projections from Cisco for data generation (VNI IP traffic), and industry average estimates for DRAM and NAND content shipments 10

Key Message PART 2 New A.I. Architectures Require Materials Breakthroughs From classic 2D scaling to architecture innovation From traditional materials to materials breakthroughs 11

Classic 2D Scaling to Architecture Innovation ERA ORIGINAL PERFORMANCE PERFORMANCE GAIN HOW PC, Mobile 1x ~2x Classic 2D scaling When Moore s Law scaling was at it s peak. Assumes no major architecture changes A.I. / IoT 1x ~2x to 2,500x Architecture Innovation Small benefit from Moore s Law scaling Source: Applied Materials internal analysis. Performance broadly refers to speed or power efficiency. ~2x refers to doubling of performance every two years consistent with Moore s Law. ISS Blog post #1: Nodes, Inter-nodes, Leading-Nodes and Trailing-Nodes 12

1978 1980 1982 1984 1986 1988 1990 1992 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 2016 2018 Performance (vs. VAX-11/780) Processor Performance Improvements Have Slowed 100,000 End of Moore's Law Limits of parallelism of Amdahl's Law 10,000 1,000 100 10 1 25% per year End of Dennard Scaling 52% per year 23% per year 12% per year 3.5% per year Chart plots performance relative to the VAX-11/780 as measured by the SPEC integer benchmarks Chart methodology and annotations based on Dec 2017 edition of Computer Architecture textbook Source: Computer Architecture: A Quantitative Approach, Sixth Edition, John Hennessy and David Patterson, December 2017, Applied Materials internal analysis Chart is for illustrative purposes only, not to scale. 13

Guidelines for A.I. Architecture Innovation Increase Memory Bandwidth Use dedicated memories to minimize distance over which data is moved Invest resources saved from dropping advanced microarchitectural optimizations into more arithmetic units or bigger memories Reduce data size and type to the simplest needed for the domain Increase Data Parallelism Use the easiest form of parallelism that matches the domain Use domain specific programming language to port code to the DSA Reduce Precision Training and Inference require much lower Floating Point (16 bit) and Integer (8 bit) precision Software Workloads will Redefine Hardware Architecture Source: Computer Architecture: A Quantitative Approach, Sixth Edition, John Hennessy and David Patterson, December 2017, Applied Materials internal analysis 14

A.I. Enabled by Materials Breakthroughs ERA ORIGINAL PERFORMANCE PERFORMANCE GAIN MATERIALS ENGINEERING PC, Mobile 1x ~2x Unit Materials A.I. / IoT 1x ~2x to 2,500x Materials Breakthroughs New devices, 3D techniques, Advanced packaging Source: Applied Materials internal analysis. Performance broadly refers to speed or power efficiency. ~2x refers to doubling of performance every two years consistent with Moore s Law ISS Blog post #2: Nodes, Inter-nodes, Leading-Nodes and Trailing-Nodes. 15

Architecture Innovation High Memory Bandwidth Performance limited by memory latency and memory bandwidth High Data Parallelism Identical operations can be done on multiple chunks of data in parallel Low Precision Training and inference require much lower floating point (16 bit) and integer (8 bit) precision enabled by Materials Engineering New Devices On-die memory e.g. MRAM New memory e.g. Intel 3D XPoint technology 3D Techniques Self-aligned structures e.g. vias, multi-color patterning Side-to-side to top-to-bottom e.g. COAG Advanced Packaging Stacked DRAM e.g. HBM / HBM2 Logic-DRAM Sensor-Logic Intel and 3D XPoint are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. 16

Example: STT-MRAM CAPPING MGO CAP FREE LAYER TUNNEL BARRIER REFERENCE LAYER SAF PINNED LAYER SEED LAYER >15 individual layers of film in MTJ stack Each layer 0.2-to-2nm thin Requires new etching techniques Requires highly complex process equipment Requires intimate collaboration with customers New Memory Device Enabled By Materials Breakthrough Source: Applied Materials internal analysis. 17

Investment in Innovation R&D INVESTMENTS MAYDAN TECHNOLOGY CENTER $1.8B $1.2B World s most advanced 300mm R&D lab $300M invested in past 3 years APPLIED VENTURES FY2012 FY2017 >$250M assets under management >30 active companies 18

Semiconductor Market Evolution A.I. + Big Data Era Mobile + Social Media Era PC + Internet Era 2000 2010 2017 2020 19