A Building Block 3D System with Inductive-Coupling Through Chip Interfaces
Hiroki Matsutani, Keio University, Japan



Outline: 3D Wireless NoC Designs
This part also explores 3D NoC architectures with inductive-coupling wireless links and shows some prototype designs.
- 3D IC technologies: Wired vs. Wireless [5 min]
- Prototype systems: Cube-0 & Cube-1 [5 min] (3D ring network)
- Prototype system: Cube-2 [5 min] (3D linear network)
- Summary and Q&A [5 min]

Design cost of LSI is increasing
- System-on-Chip (SoC)
  - Required components are integrated on a single chip
  - A different LSI must be developed for each application
- System-in-Package (SiP) or 3D IC
  - Required components are stacked for each application
  - By changing the chips in a package, we can provide a wider chip family at modest design cost
- Example chips: Accelerator chip, SRAM chip, DRAM chip, Cache chip
The next slides show techniques for stacking multiple chips.
Apr 24th, 2018, IEEE VLSI Test Symposium (VTS) 2018

3D IC technology for going vertical

                            Wired                 Wireless
Two chips (face-to-face)    Microbump             Capacitive coupling
More than three chips       Through-silicon via   Inductive coupling

Axis labels in the figure: Flexibility, Scalability

Inductive coupling link for 3D ICs
- More than 3 chips can be stacked
- Stacking after chip fabrication
- Only known-good dies are selected
- Bonding wires for power supply
- Inductor for transceiver
  - Implemented as a square coil with metal in common CMOS
- Footprint of inductor
  - Not a serious problem; only metal layers are occupied
We have developed some prototype systems of wireless 3D ICs using inductive coupling.

Stacking method: Staircase stacking
- Each inductor has TX/RX/Idle modes (a mode change takes 1 cycle)
Figure labels: Pillar, Slide & stack, Bonding wire, Inductor (TX), Inductor (RX)
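
The mode behavior on this slide can be illustrated with a toy model. This is only a sketch under stated assumptions: the class and method names (`Mode`, `Inductor`, `request`, `tick`) are hypothetical and not taken from the Cube designs, and the only fact borrowed from the slide is that a mode change costs one cycle.

```python
from enum import Enum


class Mode(Enum):
    IDLE = "idle"
    TX = "tx"
    RX = "rx"


class Inductor:
    """Toy model of one inductor; names are hypothetical, not from the chips."""

    def __init__(self):
        self.mode = Mode.IDLE
        self._pending = None  # mode requested but not yet applied

    def request(self, mode):
        """Ask for a new mode; it takes effect on the next tick."""
        if mode is not self.mode:
            self._pending = mode

    def tick(self):
        """Advance one cycle. Returns True if the inductor can transfer data
        in this cycle, False if it is idle or the cycle was spent switching."""
        if self._pending is not None:
            self.mode = self._pending
            self._pending = None
            return False  # the single cycle consumed by the mode change
        return self.mode is not Mode.IDLE


ind = Inductor()
ind.request(Mode.TX)
switching_cycle_usable = ind.tick()  # cycle spent changing mode
next_cycle_usable = ind.tick()       # inductor is now transmitting
```

In this model a chip that alternates between sending and receiving on the same coil pays one unusable cycle per direction change.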

Stacking method: Staircase stacking (inductive-coupling link)
- Local clock at 4 GHz for serial data
- System clock for NoC: 200 MHz
- 35-bit transfer for each clock
- Local clock is shared by neighboring chips; no global synchronization
- Signals between inductors: TxData, TxClk
We have fabricated some prototype multi-core systems using this wireless technology.
Figure labels: Pillar, Inductor (TX), Inductor (RX)
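
As a rough check on these numbers, the arithmetic can be worked through as follows. The sketch assumes that "35-bit transfer for each clock" means 35 bits per 200 MHz system clock cycle; the slides do not say how the bits are mapped onto the 4 GHz local clock, so the last figure is only the ratio the link would have to sustain.

```python
SYSTEM_CLOCK_HZ = 200_000_000    # NoC system clock, from the slide
LOCAL_CLOCK_HZ = 4_000_000_000   # local serial clock, from the slide
BITS_PER_SYSTEM_CYCLE = 35       # assumed to be per system clock cycle

# Payload throughput of one vertical link.
throughput_bps = BITS_PER_SYSTEM_CYCLE * SYSTEM_CLOCK_HZ

# Local (serial) clock cycles available within one system clock cycle.
serial_cycles = LOCAL_CLOCK_HZ // SYSTEM_CLOCK_HZ

# Bits that must be moved per local clock cycle to keep up.
bits_per_serial_cycle = BITS_PER_SYSTEM_CYCLE / serial_cycles

print(f"{throughput_bps / 1e9:.1f} Gbit/s per link")
print(f"{serial_cycles} serial cycles per system cycle")
print(f"{bits_per_serial_cycle} bits per serial cycle")
```

Under this assumption one link carries 7 Gbit/s, and more than one bit has to move per local clock cycle; how the prototypes achieve that is not stated in the slides.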


Wireless 3D chip stacking for IoT
- System-in-Package (SiP) for IoT devices
  - Required chips are selected and stacked in a package
  - E.g., CPU chip, Accelerator chip, Memory chip
- Wireless inductive coupling for vertical links
  - Chips are not electrically connected
  - Add, remove, and swap chips for given applications
Figure labels: Vertical links, Chip#2, Chip#1, Chip#0, Fujitsu 65nm CMOS

An example: Cube-0 (2010) [Matsutani, NOCS'11]
- Test chip for vertical communication schemes
  - Vertical point-to-point link between adjacent chips
  - Vertical shared bus (broadcast)
- Each chip has 2 cores (packet counter) and 2 routers
- Process: Fujitsu 65nm (CS202SZ)
- Voltage: 1.2 V
- System clock: 200 MHz
- Size: 2.1 mm x 2.1 mm
- Stacking for ring network: slide & stack
Figure labels: Inductors (P2P ring), Inductors (vertical bus), Core 0 & 1, Router 0 & 1, TX, RX
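
One way to picture a ring built from point-to-point links between adjacent chips is sketched below. The wiring is an assumption made for illustration, not a description of the fabricated chip: router 0 of each chip is taken to forward upward and router 1 downward, with the two routers of the top and bottom chips closing the ring. The function names are hypothetical.

```python
def ring_order(num_chips):
    """Return routers as (chip, router) pairs in ring traversal order.

    Assumed wiring: up through router 0 of every chip, then back down
    through router 1, so each hop is on-chip or between adjacent chips.
    """
    up = [(chip, 0) for chip in range(num_chips)]
    down = [(chip, 1) for chip in reversed(range(num_chips))]
    return up + down


def ring_hops(order, src, dst):
    """Hop count from src to dst on a unidirectional ring."""
    return (order.index(dst) - order.index(src)) % len(order)


order = ring_order(3)
hops_up = ring_hops(order, (0, 0), (2, 1))    # bottom chip to top chip
hops_wrap = ring_hops(order, (1, 1), (1, 0))  # must wrap through chip 0
```

With this layout every hop stays within one chip or crosses to a neighboring chip, which is all a point-to-point inductive link between adjacent chips can provide.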

An example: Cube-1 (2012) [Miura, IEEE Micro'13]
- Test chips for building-block 3D systems
  - Two chip types: Host CPU chip and Accelerator chip
  - We can customize the number and types of chips in the SiP
- Cube-1 Host CPU chip
  - Two 3D wireless routers
  - MIPS-like CPU
- Cube-1 Accelerator chip
  - Two 3D wireless routers
  - Processing element array
Figure labels: MIPS R3K CPU Core, 8x8 PE Array, Inductor

An example: Cube-1 (2012) [Miura, IEEE Micro'13]
- Microphotographs of test chips
Figure labels: Host CPU Chip, Accelerator Chip, Host CPU + 3 Accelerators

An example: Cube-1 (2012) [Miura, HotChips'13 Demo]
- Cube-1 demo system
  - PE array chip performs image processing
  - CPU chip for control
Figure labels: Cube-1, Motherboard


Current design: Cube-2 (2016)
- A practical building-block 3D system
  - CPU chip, Accelerator chip, NN chip, and KVS chip
  - Renesas Electronics SOTB 65nm with body biasing
- Cube-2 Host CPU chip: MIPS-like CPU
- Cube-2 Accelerator chip: Processing element array
- Cube-2 NN chip: Neural network prediction
- Cube-2 KVS chip: Key-value store w/ SRAM
Figure labels (first pair of chips): L1 Caches, MIPS Core, 12x8 PE Array + controller, Inductors
Figure labels (second pair of chips): SIMD Core + SRAM for AlexNet, SRAM, KVS Core, Inductors

An example: Cube-2 (2016)
- Simpler vertical network for Cube-2
  - Vertical linear network (8 VCs)
  - Credit signal for flow control is piggybacked on packets on the opposite link
Figure labels: TX, RX, Inductors (Up), Inductors (Down), PE Array, Controller, Downlink, Uplink, Packets, Credit
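
The piggybacked credit scheme can be sketched as a small software model. Only the 8 virtual channels and the idea of returning credits on packets travelling over the opposite link come from the slide; the buffer depth `BUF_DEPTH`, the `Packet` and `Port` names, and the packet layout are hypothetical choices made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field

NUM_VCS = 8    # from the slide
BUF_DEPTH = 2  # hypothetical per-VC receive buffer depth


@dataclass
class Packet:
    vc: int
    payload: str
    credits: list = field(default_factory=list)  # VCs whose slot was freed


class Port:
    """One end of an uplink/downlink pair between two stacked chips."""

    def __init__(self):
        self.credits = [BUF_DEPTH] * NUM_VCS             # free slots at peer
        self.rx_buf = [deque() for _ in range(NUM_VCS)]  # local receive buffers
        self.pending_credits = []                        # credits owed to peer

    def can_send(self, vc):
        return self.credits[vc] > 0

    def send(self, vc, payload):
        """Build a packet, attaching every credit owed to the peer."""
        if not self.can_send(vc):
            raise RuntimeError(f"no credit on VC {vc}")
        self.credits[vc] -= 1
        pkt = Packet(vc, payload, self.pending_credits)
        self.pending_credits = []
        return pkt

    def receive(self, pkt):
        """Buffer the payload and collect the piggybacked credits."""
        self.rx_buf[pkt.vc].append(pkt.payload)
        for vc in pkt.credits:
            self.credits[vc] += 1

    def consume(self, vc):
        """Drain one buffered payload; the freed slot becomes an owed credit."""
        payload = self.rx_buf[vc].popleft()
        self.pending_credits.append(vc)
        return payload


lower, upper = Port(), Port()
upper.receive(lower.send(0, "a"))  # uplink packet
upper.receive(lower.send(0, "b"))  # VC 0 at the upper chip is now full
first = upper.consume(0)           # frees one slot, credit is owed
lower.receive(upper.send(3, "x"))  # downlink packet carries the credit back
```

In this model a credit only returns when there is traffic on the opposite link; how the actual design handles a link with no reverse traffic is not covered in the slides.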

An example: Cube-2 (2016)
- ThruChip Interface
  - 400 um x 500 um
  - 36 bit / 50 MHz
  - Including SER/DES
  - Half-duplex transfer
Figure labels: PE Array, Controller, Inductors
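
The role of the SER/DES can be illustrated with a minimal software sketch that turns a 36-bit word into a bit stream and back. The bit order (least significant bit first) and the function names are assumptions, and the bandwidth figure assumes that "36bit/50MHz" means one 36-bit word per 50 MHz cycle.

```python
WORD_BITS = 36         # from the slide
CLOCK_HZ = 50_000_000  # from the slide


def serialize(word, width=WORD_BITS):
    """Split a word into single bits, LSB first (bit order is an assumption)."""
    if not 0 <= word < (1 << width):
        raise ValueError(f"word does not fit in {width} bits")
    return [(word >> i) & 1 for i in range(width)]


def deserialize(bits):
    """Rebuild the word from bits produced by serialize()."""
    word = 0
    for i, bit in enumerate(bits):
        word |= bit << i
    return word


word = 0xABCDE1234                # a 36-bit example value
bits = serialize(word)
restored = deserialize(bits)

# Raw rate if one word moves per cycle; with half-duplex transfer this
# is shared between the two directions of the link.
raw_bps = WORD_BITS * CLOCK_HZ
```

Under these assumptions the interface moves 1.8 Gbit/s in one direction at a time.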


Summary: 3D Wireless NoC Designs
- Inductive-coupling 3D SiP
  - A low-cost alternative for building low-volume custom systems by stacking off-the-shelf known-good dies
  - No special process technology is required; inductors are implemented with metal layers
- Our history
  - Cube-0 (2010): Test chip (3D Wireless NoC only)
  - Cube-1 (2012): Host CPU chip and Accelerator chip
  - Cube-2 (2016): A practical 3D WiNoC system
    - CPU chip, Accelerator chip, NN chip, and KVS chip
    - We can customize the number and types of chips in the SiP
    - Cube-2 3-chip stack now operational

References (1/2)
- Cube-0: The first real 3D WiNoC
  - H. Matsutani, et al., "A Vertical Bubble Flow Network using Inductive-Coupling for 3-D CMPs", NOCS 2011.
  - Y. Take, et al., "3D NoC with Inductive-Coupling Links for Building-Block SiPs", IEEE Trans. on Computers (2014).
- Cube-1: The heterogeneous 3D WiNoC
  - N. Miura, et al., "A Scalable 3D Heterogeneous Multicore with an Inductive ThruChip Interface", IEEE Micro (2013).
- MuCCRA-Cube: Dynamically reconfigurable processor
  - S. Saito, et al., "MuCCRA-Cube: a 3D Dynamically Reconfigurable Processor with Inductive-Coupling Link", FPL 2009.

References (2/2)
- Vertical bubble flow control on Cube-0
  - H. Matsutani, et al., "A Vertical Bubble Flow Network using Inductive-Coupling for 3-D CMPs", NOCS 2011.
- Spanning trees optimization for 3D WiNoCs
  - H. Matsutani, et al., "A Case for Wireless 3D NoCs for CMPs", ASP-DAC 2013 (Best Paper Award).
- Small-world effect in 3D WiNoCs
  - H. Matsutani, et al., "Low-Latency Wireless 3D NoCs via Randomized Shortcut Chips", DATE 2014.
- CSMA/CD vertical bus using inductive-coupling
  - T. Kagami, et al., "Efficient 3-D Bus Architectures for Inductive-Coupling ThruChip Interfaces", IEEE Trans. on VLSI (2015).