The TAU 2017 Contest

1 The TAU 2017 Contest: Timing Macro Modeling. Song Chen, Synopsys [Speaker]; Akash Khandelwal, Cadence; Xin Zhao, IBM Corp.; Xi Chen, Synopsys. Sponsors: TAU 2017 Workshop, March 16th-17th

2 Why Timing Macro Modeling Again? 1. Very Important

3 Motivation. Integrated circuit chips are growing rapidly in performance and in gate/transistor counts. NVIDIA GP100: process node 16 nm; transistors 15.3 billion (> 2x world population); GPU die size 610 mm²; FP64 compute 5.30 TFLOPs (> fastest supercomputer in 1999). Source: NVIDIA GP100.

4 Motivation. Design, analysis, and verification are done hierarchically, across multiple levels of hierarchy and in parallel. Low-level blocks need an abstraction when working at a higher level. NVIDIA GP100: SMs 56; CUDA cores per SM 64; CUDA cores (total) 3584; FP64 CUDA cores per SM 32; FP64 CUDA cores per GPU 1792; texture units 224; memory size 16 GB.

5 Hierarchical Timing Analysis. Full netlist: includes the top and block netlists. Interface timing model: top-level netlist and netlist of the block interface. Extracted timing model (ETM): top-level netlist and ETM for blocks. [Figure: Apple A10.]

6 Extracted Timing Model (ETM). Benefits: simple, understandable model; fast runtime and small memory footprint; does not expose IP; widely supported by many tools. Shortcomings: accuracy at advanced nodes; does not allow cross-hierarchy timing constraints; the block cannot be optimized from the top level.

7 Why Timing Macro Modeling Again? 2. Performance. 2016 contest: focus on accuracy. 2017 contest: focus on performance in usage.

8 2016 Contest

9 Model Generation Methods. Graph reduction: start from the full graph and eliminate or merge nodes and arcs; easy to achieve accuracy. Point-to-point extraction: directly extract the point-to-point delays; can add internal nodes; minimizes model size.
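The graph-reduction idea on this slide can be illustrated with a small sketch. It is not any contestant's algorithm: it assumes each timing arc carries a single constant delay (real arcs are slew- and load-dependent lookup tables), and the function name `reduce_serial_nodes` and the `keep` set are hypothetical names chosen for illustration. The sketch only merges internal nodes that have exactly one fan-in arc and one fan-out arc.

```python
from collections import defaultdict


def reduce_serial_nodes(arcs, keep):
    """Merge away internal nodes with exactly one fan-in and one fan-out arc.

    arcs: dict mapping (src, dst) -> constant delay (a simplifying assumption).
    keep: set of node names that must survive, e.g. the ports of the block.
    Returns a new dict of arcs; the input is not modified.
    """
    arcs = dict(arcs)
    changed = True
    while changed:
        changed = False
        fanin, fanout = defaultdict(list), defaultdict(list)
        for src, dst in arcs:
            fanout[src].append(dst)
            fanin[dst].append(src)
        for node in list(fanin):
            if node in keep or len(fanin[node]) != 1 or len(fanout[node]) != 1:
                continue
            src, dst = fanin[node][0], fanout[node][0]
            if src == node or dst == node:
                continue  # self-loop, nothing to merge
            if (src, dst) in arcs:
                continue  # a parallel arc would need max/min merging; skipped here
            # Replace src -> node -> dst with one arc whose delay is the sum.
            arcs[(src, dst)] = arcs.pop((src, node)) + arcs.pop((node, dst))
            changed = True
            break  # the graph changed, so rebuild fan-in and fan-out
    return arcs


chain = {("a", "n1"): 1.0, ("n1", "n2"): 2.0, ("n2", "o"): 3.0}
reduced = reduce_serial_nodes(chain, keep={"a", "o"})
```

In this toy chain the two internal nodes disappear and a single arc from `a` to `o` remains, which is the trade described on the slide: fewer nodes and arcs in the model, at the cost of the detail those nodes carried.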

10 Model Size/Performance vs. Accuracy. [Figure: an original design (gates u1, u2, u3; internal nets n1, n2) and two timing models of it, one with pins p1-p8 and one with pins p1-p4.] General trends: a large model gives high accuracy but slower usage; a small model gives faster usage but low accuracy.

11 Accuracy Evaluation. [Figure: the contestant binary takes the original benchmark and the early and late .lib files and produces design_model. Input and output conditions are varied (input slew, output load, arrival time); the original design and design_model are analyzed using OpenTimer; the golden timing report is compared with the contestant timing report.]

12 TAU 2017 Contest Infrastructure.
Detailed documentation provided to contestants: timing and CPPR tutorials, file formats, timing model basics, evaluation rules, etc.
Open source code and binaries (previous contest winners, utilities): 1. PATMOS 2011: NTU-Timer; 2. TAU 2013: IITiMer; 3. TAU 2014: UI-Timer; 4. ISPD 2013: .spef/.lib parsers; 5. TAU 2015 binary: itimerc v; OpenTimer (UI-Timer v2.0).
Benchmarks (based on TAU 2015 benchmarks): design connectivity, Verilog (.v); early and late libraries, Liberty (.lib); design parasitics, SPEF (.spef); assertions (.timing); wrapper file (.tau2016).
Evaluation: block-based post-CPPR timing analysis at primary inputs and primary outputs; accuracy against the golden result* (.output); runtime performance; memory usage.
Time frame: ~4 months. Contest scope: only hold, setup, and RAT tests; no latches (flush segments); single-source clock tree. *Using OpenTimer.

13 Benchmarks: 11 based on TAU 2015 Phase 1 benchmarks (3K-100K gates); 7 based on TAU 2015 Phase 2 benchmarks (1K-150K gates); 7 based on TAU 2015 Evaluation benchmarks (160K-1.6M gates).

14 Evaluation Metrics.
Accuracy (compared to golden results): query slack at PIs and POs in the in-context design, $S_{IC}$; query slack at PIs and POs in the original design, $S_{OoC}$; compute the difference $d_S$ for all PIs and POs, giving the set $D_S$; if optimistic, $d_S = 2 d_S$. Average performance: average $\mathrm{AVG}(D_S)$ and standard deviation $\mathrm{STDEV}(D_S)$. Worst performance: maximum $\mathrm{MAX}(D_S)$.
2016 accuracy score: [0, 5] ps: 100; (5, 10] ps: 80; (10, 15] ps: 50; (15, ∞) ps: 0.
2017 accuracy score: [0, 100] ps: x; (100, ∞) ps: 0.
Runtime factor (relative): $\mathrm{RF}(D) = \frac{\mathrm{MAX\_R}(D) - R(D)}{\mathrm{MAX\_R}(D) - \mathrm{MIN\_R}(D)}$.
Memory factor (relative): $\mathrm{MF}(D) = \frac{\mathrm{MAX\_M}(D) - M(D)}{\mathrm{MAX\_M}(D) - \mathrm{MIN\_M}(D)}$.
Runtime / memory use: 5:1 for model extraction vs. evaluation.
Composite design score: score(D) = A(D) ( RF(D) + 30 MF(D))
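A rough sketch of how the metrics on this slide could be computed, for illustration only. Several points are assumptions rather than contest rules: "optimistic" is taken to mean that the model reports more slack than the golden result; the relative factors are read as (MAX - value) / (MAX - MIN), with MAX and MIN taken over the submissions for a design; and all function names are hypothetical. The bracket scores follow the 2016 accuracy table as listed on the slide.

```python
def penalized_differences(model_slacks, golden_slacks):
    """Absolute slack differences in ps, doubled where the model is optimistic.

    Assumption: optimistic means the model slack exceeds the golden slack.
    """
    diffs = []
    for s_model, s_golden in zip(model_slacks, golden_slacks):
        d = abs(s_model - s_golden)
        if s_model > s_golden:
            d *= 2  # penalty for optimism: d_S = 2 * d_S
        diffs.append(d)
    return diffs


def accuracy_score_2016(error_ps):
    """Map an error in ps to the 2016 bracket score from the slide."""
    if error_ps <= 5:
        return 100
    if error_ps <= 10:
        return 80
    if error_ps <= 15:
        return 50
    return 0


def relative_factor(value, worst, best):
    """(MAX - value) / (MAX - MIN): 1.0 for the best submission, 0.0 for the worst."""
    if worst == best:
        return 1.0  # assumption: all submissions tie, so treat everyone as best
    return (worst - value) / (worst - best)


diffs = penalized_differences([10.0, 4.0], [8.0, 7.0])
worst_case = max(diffs)
score = accuracy_score_2016(worst_case)
rf = relative_factor(30.0, worst=40.0, best=20.0)
```

The same `relative_factor` helper serves for both RF(D) and MF(D), since the two formulas differ only in whether runtime or memory is plugged in.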

15 2017 Contest Results. [Tables: model extraction (memory, runtime*) for Team1 and Team2; model evaluation (memory, runtime*) for Team1 and Team2; model QoR (score) for Team1 and Team2; model size (pins, arcs) for Team1 and Team2.] *Ratio to OpenTimer runtime / memory, large designs.

16 TAU 2017 Contestants (University: Team). Texas A&M University: AggieTimer; National Chiao Tung University: itimerm 2.0; University of Illinois at Urbana-Champaign: LibAbs; International Institute of Information Technology: Alpha; Federal University of Santa Catarina: Ophidian.

17 Acknowledgments. Akash Khandelwal, Contest Committee Member; Xin Zhao, Contest Committee Member; Xi Chen, Contest Committee Member; Qiuyang Wu, Workshop General Chair; Tom Spyrou, Workshop Technical Chair; Tsung-Wei Huang, OpenTimer Support; the TAU 2017 contestants. This contest would not have been successful without your hard work and dedication.

18 TAU 2017 Timing Contest on Macro Modeling: 1st Prize. Presented to Pei-Yu Lee and Iris Hui-Ru Jiang for itimerm 2.0, National Chiao Tung University. Qiuyang Wu, General Chair; Tom Spyrou, Technical Chair; Song Chen, Contest Chair.

19 TAU 2017 Timing Contest on Macro Modeling: 2nd Prize. Presented to Tin-Yin Lai and Martin D. F. Wong for LibAbs, University of Illinois at Urbana-Champaign. Qiuyang Wu, General Chair; Tom Spyrou, Technical Chair; Song Chen, Contest Chair.

LibAbs: An Efficient and Accurate Timing Macro-Modeling Algorithm for Large Hierarchical Designs

LibAbs: An Efficient and Accurate Timing Macro-Modeling Algorithm for Large Hierarchical Designs LibAbs: An Efficient and Accurate Timing Macro-Modeling Algorithm for Large Hierarchical Designs Tin-Yin Lai Dept. of ECE, UIUC, IL, USA tinyinlai@gmail.com Tsung-Wei Huang Dept. of ECE, UIUC, IL, USA

More information

itimerm: Compact and Accurate Timing Macro Modeling for Efficient Hierarchical Timing Analysis

itimerm: Compact and Accurate Timing Macro Modeling for Efficient Hierarchical Timing Analysis itimerm: Compact and ccurate Timing Macro Modeling for Efficient Hierarchical Timing nalysis Pei-u Lee Institute of Electronics National Chiao Tung University Hsinchu 30010, Taiwan palacedeforsaken@gmail.com

More information

Distributed Timing Analysis: Framework and System. Tsung-Wei Huang, PhD candidate University of Illinois at Urbana-Champaign (UIUC), USA, IL

Distributed Timing Analysis: Framework and System. Tsung-Wei Huang, PhD candidate University of Illinois at Urbana-Champaign (UIUC), USA, IL Distributed Timing Analysis: Framework and System Tsung-Wei Huang, PhD candidate University of Illinois at Urbana-Champaign (UIUC), USA, IL Agenda q A distributed timing analysis framework for large designs

More information

OpenTimer: A High-Performance Timing Analysis Tool

OpenTimer: A High-Performance Timing Analysis Tool OpenTimer: A High-Performance Timing Analysis Tool Special Session Paper: Incremental Timing and CPPR Analysis Tsung-Wei Huang and Martin. F. Wong twh760812@gmail.com, mdfwong@illinois.edu epartment of

More information

Static Timing Verification of Custom Blocks Using Synopsys NanoTime Tool

Static Timing Verification of Custom Blocks Using Synopsys NanoTime Tool White Paper Static Timing Verification of Custom Blocks Using Synopsys NanoTime Tool September 2009 Author Dr. Larry G. Jones, Implementation Group, Synopsys, Inc. Introduction With the continued evolution

More information

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller Entertainment Graphics: Virtual Realism for the Masses CSE 591: GPU Programming Introduction Computer games need to have: realistic appearance of characters and objects believable and creative shading,

More information

How Game Engines Can Inspire EDA Tools Development: A use case for an open-source physical design library

How Game Engines Can Inspire EDA Tools Development: A use case for an open-source physical design library How Game Engines Can Inspire EDA Tools Development: A use case for an open-source physical design library Tiago Fontana, Renan Netto, Vinicius Livramento, Chrystian Guth, Sheiny Almeida, Laércio Pilla,

More information

Synthesizable FPGA Fabrics Targetable by the VTR CAD Tool

Synthesizable FPGA Fabrics Targetable by the VTR CAD Tool Synthesizable FPGA Fabrics Targetable by the VTR CAD Tool Jin Hee Kim and Jason Anderson FPL 2015 London, UK September 3, 2015 2 Motivation for Synthesizable FPGA Trend towards ASIC design flow Design

More information

Call for Participation

Call for Participation ACM International Symposium on Physical Design 2015 Blockage-Aware Detailed-Routing-Driven Placement Contest Call for Participation Start date: November 10, 2014 Registration deadline: December 30, 2014

More information

PushPull: Short Path Padding for Timing Error Resilient Circuits YU-MING YANG IRIS HUI-RU JIANG SUNG-TING HO. IRIS Lab National Chiao Tung University

PushPull: Short Path Padding for Timing Error Resilient Circuits YU-MING YANG IRIS HUI-RU JIANG SUNG-TING HO. IRIS Lab National Chiao Tung University PushPull: Short Path Padding for Timing Error Resilient Circuits YU-MING YANG IRIS HUI-RU JIANG SUNG-TING HO IRIS Lab National Chiao Tung University Outline Introduction Problem Formulation Algorithm -

More information

A Hierarchical Bin-Based Legalizer for Standard-Cell Designs with Minimal Disturbance

A Hierarchical Bin-Based Legalizer for Standard-Cell Designs with Minimal Disturbance A Hierarchical Bin-Based Legalizer for Standard- Designs with Minimal Disturbance Yu-Min Lee, Tsung-You Wu, and Po-Yi Chiang Department of Electrical Engineering National Chiao Tung University ASPDAC,

More information

Silicon Virtual Prototyping: The New Cockpit for Nanometer Chip Design

Silicon Virtual Prototyping: The New Cockpit for Nanometer Chip Design Silicon Virtual Prototyping: The New Cockpit for Nanometer Chip Design Wei-Jin Dai, Dennis Huang, Chin-Chih Chang, Michel Courtoy Cadence Design Systems, Inc. Abstract A design methodology for the implementation

More information

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University CSE 591/392: GPU Programming Introduction Klaus Mueller Computer Science Department Stony Brook University First: A Big Word of Thanks! to the millions of computer game enthusiasts worldwide Who demand

More information

For a long time, programming languages such as FORTRAN, PASCAL, and C Were being used to describe computer programs that were

For a long time, programming languages such as FORTRAN, PASCAL, and C Were being used to describe computer programs that were CHAPTER-2 HARDWARE DESCRIPTION LANGUAGES 2.1 Overview of HDLs : For a long time, programming languages such as FORTRAN, PASCAL, and C Were being used to describe computer programs that were sequential

More information

What is GPU? CS 590: High Performance Computing. GPU Architectures and CUDA Concepts/Terms

What is GPU? CS 590: High Performance Computing. GPU Architectures and CUDA Concepts/Terms CS 590: High Performance Computing GPU Architectures and CUDA Concepts/Terms Fengguang Song Department of Computer & Information Science IUPUI What is GPU? Conventional GPUs are used to generate 2D, 3D

More information

Best Practices for Implementing ARM Cortex -A12 Processor and Mali TM -T6XX GPUs for Mid-Range Mobile SoCs.

Best Practices for Implementing ARM Cortex -A12 Processor and Mali TM -T6XX GPUs for Mid-Range Mobile SoCs. Best Practices for Implementing ARM Cortex -A12 Processor and Mali TM -T6XX GPUs for Mid-Range Mobile SoCs. Cortex-A12: ARM-Cadence collaboration Joint team working on ARM Cortex -A12 irm flow irm content:

More information

PrimeTime: Introduction to Static Timing Analysis Workshop

PrimeTime: Introduction to Static Timing Analysis Workshop i-1 PrimeTime: Introduction to Static Timing Analysis Workshop Synopsys Customer Education Services 2002 Synopsys, Inc. All Rights Reserved PrimeTime: Introduction to Static 34000-000-S16 Timing Analysis

More information

101-1 Under-Graduate Project Digital IC Design Flow

101-1 Under-Graduate Project Digital IC Design Flow 101-1 Under-Graduate Project Digital IC Design Flow Speaker: Ming-Chun Hsiao Adviser: Prof. An-Yeu Wu Date: 2012/9/25 ACCESS IC LAB Outline Introduction to Integrated Circuit IC Design Flow Verilog HDL

More information

Department of Electrical and Computer Engineering

Department of Electrical and Computer Engineering LAGRANGIAN RELAXATION FOR GATE IMPLEMENTATION SELECTION Yi-Le Huang, Jiang Hu and Weiping Shi Department of Electrical and Computer Engineering Texas A&M University OUTLINE Introduction and motivation

More information

Outline Marquette University

Outline Marquette University COEN-4710 Computer Hardware Lecture 1 Computer Abstractions and Technology (Ch.1) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations

More information

On Constructing Lower Power and Robust Clock Tree via Slew Budgeting

On Constructing Lower Power and Robust Clock Tree via Slew Budgeting 1 On Constructing Lower Power and Robust Clock Tree via Slew Budgeting Yeh-Chi Chang, Chun-Kai Wang and Hung-Ming Chen Dept. of EE, National Chiao Tung University, Taiwan 2012 年 3 月 29 日 Outline 2 Motivation

More information

LP Based Leakage Power Optimizer User Manual

LP Based Leakage Power Optimizer User Manual LP Based Leakage Power Optimizer User Manual 1. Introduction This manual explains the usage of the linear programming (LP) based leakage power optimizer. The leakage power optimization problem is formulated

More information

An Asynchronous NoC Router in a 14nm FinFET Library: Comparison to an Industrial Synchronous Counterpart

An Asynchronous NoC Router in a 14nm FinFET Library: Comparison to an Industrial Synchronous Counterpart An Asynchronous NoC Router in a 14nm FinFET Library: Comparison to an Industrial Synchronous Counterpart Weiwei Jiang Columbia University, USA Gabriele Miorandi University of Ferrara, Italy Wayne Burleson

More information

Evolution of CAD Tools & Verilog HDL Definition

Evolution of CAD Tools & Verilog HDL Definition Evolution of CAD Tools & Verilog HDL Definition K.Sivasankaran Assistant Professor (Senior) VLSI Division School of Electronics Engineering VIT University Outline Evolution of CAD Different CAD Tools for

More information

ADVANCED FPGA BASED SYSTEM DESIGN. Dr. Tayab Din Memon Lecture 3 & 4

ADVANCED FPGA BASED SYSTEM DESIGN. Dr. Tayab Din Memon Lecture 3 & 4 ADVANCED FPGA BASED SYSTEM DESIGN Dr. Tayab Din Memon tayabuddin.memon@faculty.muet.edu.pk Lecture 3 & 4 Books Recommended Books: Text Book: FPGA Based System Design by Wayne Wolf Overview Why VLSI? Moore

More information

An Overview of Standard Cell Based Digital VLSI Design

An Overview of Standard Cell Based Digital VLSI Design An Overview of Standard Cell Based Digital VLSI Design With examples taken from the implementation of the 36-core AsAP1 chip and the 1000-core KiloCore chip Zhiyi Yu, Tinoosh Mohsenin, Aaron Stillmaker,

More information

Efficient Stimulus Independent Timing Abstraction Model Based on a New Concept of Circuit Block Transparency

Efficient Stimulus Independent Timing Abstraction Model Based on a New Concept of Circuit Block Transparency Efficient Stimulus Independent Timing Abstraction Model Based on a New Concept of Circuit Block Transparency Martin Foltin mxf@fc.hp.com Brian Foutz* Sean Tyler sct@fc.hp.com (970) 898 3755 (970) 898 7286

More information

Design Process. Design : specify and enter the design intent. Verify: Implement: verify the correctness of design and implementation

Design Process. Design : specify and enter the design intent. Verify: Implement: verify the correctness of design and implementation Design Verification 1 Design Process Design : specify and enter the design intent Verify: verify the correctness of design and implementation Implement: refine the design through all phases Kurt Keutzer

More information

Overview. Design flow. Principles of logic synthesis. Logic Synthesis with the common tools. Conclusions

Overview. Design flow. Principles of logic synthesis. Logic Synthesis with the common tools. Conclusions Logic Synthesis Overview Design flow Principles of logic synthesis Logic Synthesis with the common tools Conclusions 2 System Design Flow Electronic System Level (ESL) flow System C TLM, Verification,

More information

Fra superdatamaskiner til grafikkprosessorer og

Fra superdatamaskiner til grafikkprosessorer og Fra superdatamaskiner til grafikkprosessorer og Brødtekst maskinlæring Prof. Anne C. Elster IDI HPC/Lab Parallel Computing: Personal perspective 1980 s: Concurrent and Parallel Pascal 1986: Intel ipsc

More information

GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS

GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS Agenda Forming a GPGPU WG 1 st meeting Future meetings Activities Forming a GPGPU WG To raise needs and enhance information sharing A platform for knowledge

More information

EEC 483 Computer Organization

EEC 483 Computer Organization EEC 483 Computer Organization Chapter 5 Large and Fast: Exploiting Memory Hierarchy Chansu Yu Table of Contents Ch.1 Introduction Ch. 2 Instruction: Machine Language Ch. 3-4 CPU Implementation Ch. 5 Cache

More information

Parallel Implementation of VLSI Gate Placement in CUDA

Parallel Implementation of VLSI Gate Placement in CUDA ME 759: Project Report Parallel Implementation of VLSI Gate Placement in CUDA Movers and Placers Kai Zhao Snehal Mhatre December 21, 2015 1 Table of Contents 1. Introduction...... 3 2. Problem Formulation...

More information

Concurrent, OA-based Mixed-signal Implementation

Concurrent, OA-based Mixed-signal Implementation Concurrent, OA-based Mixed-signal Implementation Mladen Nizic Eng. Director, Mixed-signal Solution 2011, Cadence Design Systems, Inc. All rights reserved worldwide. Mixed-Signal Design Challenges Traditional

More information

SYNTHESIS FOR ADVANCED NODES

SYNTHESIS FOR ADVANCED NODES SYNTHESIS FOR ADVANCED NODES Abhijeet Chakraborty Janet Olson SYNOPSYS, INC ISPD 2012 Synopsys 2012 1 ISPD 2012 Outline Logic Synthesis Evolution Technology and Market Trends The Interconnect Challenge

More information

ECE 637 Integrated VLSI Circuits. Introduction. Introduction EE141

ECE 637 Integrated VLSI Circuits. Introduction. Introduction EE141 ECE 637 Integrated VLSI Circuits Introduction EE141 1 Introduction Course Details Instructor Mohab Anis; manis@vlsi.uwaterloo.ca Text Digital Integrated Circuits, Jan Rabaey, Prentice Hall, 2 nd edition

More information

Parallelizing FPGA Technology Mapping using GPUs. Doris Chen Deshanand Singh Aug 31 st, 2010

Parallelizing FPGA Technology Mapping using GPUs. Doris Chen Deshanand Singh Aug 31 st, 2010 Parallelizing FPGA Technology Mapping using GPUs Doris Chen Deshanand Singh Aug 31 st, 2010 Motivation: Compile Time In last 12 years: 110x increase in FPGA Logic, 23x increase in CPU speed, 4.8x gap Question:

More information

AccuCore Static Timing Analysis

AccuCore Static Timing Analysis AccuCore Static Timing Analysis AccuCore Static Timing Analysis High Performance SoC Timing Solution Levels of Design Abstraction History of Digital Functional Verification Definitions of Key STA Terminology

More information

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Hardware Design Environments Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Outline Welcome to COE 405 Digital System Design Design Domains and Levels of Abstractions Synthesis

More information

DIGITAL DESIGN TECHNOLOGY & TECHNIQUES

DIGITAL DESIGN TECHNOLOGY & TECHNIQUES DIGITAL DESIGN TECHNOLOGY & TECHNIQUES CAD for ASIC Design 1 INTEGRATED CIRCUITS (IC) An integrated circuit (IC) consists complex electronic circuitries and their interconnections. William Shockley et

More information

CAD for VLSI. Debdeep Mukhopadhyay IIT Madras

CAD for VLSI. Debdeep Mukhopadhyay IIT Madras CAD for VLSI Debdeep Mukhopadhyay IIT Madras Tentative Syllabus Overall perspective of VLSI Design MOS switch and CMOS, MOS based logic design, the CMOS logic styles, Pass Transistors Introduction to Verilog

More information

Computer Architecture!

Computer Architecture! Informatics 3 Computer Architecture! Dr. Vijay Nagarajan and Prof. Nigel Topham! Institute for Computing Systems Architecture, School of Informatics! University of Edinburgh! General Information! Instructors

More information

CUDA Experiences: Over-Optimization and Future HPC

CUDA Experiences: Over-Optimization and Future HPC CUDA Experiences: Over-Optimization and Future HPC Carl Pearson 1, Simon Garcia De Gonzalo 2 Ph.D. candidates, Electrical and Computer Engineering 1 / Computer Science 2, University of Illinois Urbana-Champaign

More information

CS250 DISCUSSION #2. Colin Schmidt 9/18/2014 Std. Cell Slides adapted from Ben Keller

CS250 DISCUSSION #2. Colin Schmidt 9/18/2014 Std. Cell Slides adapted from Ben Keller CS250 DISCUSSION #2 Colin Schmidt 9/18/2014 Std. Cell Slides adapted from Ben Keller LAST TIME... Overview of course structure Class tools/unix basics THIS TIME... Synthesis report overview for Lab 2 Lab

More information

P V Sriniwas Shastry et al, Int.J.Computer Technology & Applications,Vol 5 (1),

P V Sriniwas Shastry et al, Int.J.Computer Technology & Applications,Vol 5 (1), On-The-Fly AES Key Expansion For All Key Sizes on ASIC P.V.Sriniwas Shastry 1, M. S. Sutaone 2, 1 Cummins College of Engineering for Women, Pune, 2 College of Engineering, Pune pvs.shastry@cumminscollege.in

More information

Intel Many Integrated Core (MIC) Matt Kelly & Ryan Rawlins

Intel Many Integrated Core (MIC) Matt Kelly & Ryan Rawlins Intel Many Integrated Core (MIC) Matt Kelly & Ryan Rawlins Outline History & Motivation Architecture Core architecture Network Topology Memory hierarchy Brief comparison to GPU & Tilera Programming Applications

More information

FPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011

FPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011 FPGA for Complex System Implementation National Chiao Tung University Chun-Jen Tsai 04/14/2011 About FPGA FPGA was invented by Ross Freeman in 1989 SRAM-based FPGA properties Standard parts Allowing multi-level

More information

Digital System Design Lecture 2: Design. Amir Masoud Gharehbaghi

Digital System Design Lecture 2: Design. Amir Masoud Gharehbaghi Digital System Design Lecture 2: Design Amir Masoud Gharehbaghi amgh@mehr.sharif.edu Table of Contents Design Methodologies Overview of IC Design Flow Hardware Description Languages Brief History of HDLs

More information

NVIDIA Update and Directions on GPU Acceleration for Earth System Models

NVIDIA Update and Directions on GPU Acceleration for Earth System Models NVIDIA Update and Directions on GPU Acceleration for Earth System Models Stan Posey, HPC Program Manager, ESM and CFD, NVIDIA, Santa Clara, CA, USA Carl Ponder, PhD, Applications Software Engineer, NVIDIA,

More information

Lecture #1. Teach you how to make sure your circuit works Do you want your transistor to be the one that screws up a 1 billion transistor chip?

Lecture #1. Teach you how to make sure your circuit works Do you want your transistor to be the one that screws up a 1 billion transistor chip? Instructor: Jan Rabaey EECS141 1 Introduction to digital integrated circuit design engineering Will describe models and key concepts needed to be a good digital IC designer Models allow us to reason about

More information

Computer Architecture

Computer Architecture Informatics 3 Computer Architecture Dr. Vijay Nagarajan Institute for Computing Systems Architecture, School of Informatics University of Edinburgh (thanks to Prof. Nigel Topham) General Information Instructor

More information

High Performance Computing with Accelerators

High Performance Computing with Accelerators High Performance Computing with Accelerators Volodymyr Kindratenko Innovative Systems Laboratory @ NCSA Institute for Advanced Computing Applications and Technologies (IACAT) National Center for Supercomputing

More information

NS115 System Emulation Based on Cadence Palladium XP

NS115 System Emulation Based on Cadence Palladium XP NS115 System Emulation Based on Cadence Palladium XP wangpeng 新岸线 NUFRONT Agenda Background and Challenges Porting ASIC to Palladium XP Software Environment Co Verification and Power Analysis Summary Background

More information

Extending Digital Verification Techniques for Mixed-Signal SoCs with VCS AMS September 2014

Extending Digital Verification Techniques for Mixed-Signal SoCs with VCS AMS September 2014 White Paper Extending Digital Verification Techniques for Mixed-Signal SoCs with VCS AMS September 2014 Author Helene Thibieroz Sr Staff Marketing Manager, Adiel Khan Sr Staff Engineer, Verification Group;

More information

X10 specific Optimization of CPU GPU Data transfer with Pinned Memory Management

X10 specific Optimization of CPU GPU Data transfer with Pinned Memory Management X10 specific Optimization of CPU GPU Data transfer with Pinned Memory Management Hideyuki Shamoto, Tatsuhiro Chiba, Mikio Takeuchi Tokyo Institute of Technology IBM Research Tokyo Programming for large

More information

Comprehensive Place-and-Route Platform Olympus-SoC

Comprehensive Place-and-Route Platform Olympus-SoC Comprehensive Place-and-Route Platform Olympus-SoC Digital IC Design D A T A S H E E T BENEFITS: Olympus-SoC is a comprehensive netlist-to-gdsii physical design implementation platform. Solving Advanced

More information

Agenda. Presentation Team: Agenda: Pascal Bolzhauser, Key Developer, Lothar Linhard, VP Engineering,

Agenda. Presentation Team: Agenda: Pascal Bolzhauser, Key Developer, Lothar Linhard, VP Engineering, Welcome JAN 2009 Agenda Presentation Team: Pascal Bolzhauser, Key Developer, pascal@concept.de Lothar Linhard, VP Engineering, lothar427@concept.de Agenda: Company Overview Products: GateVision RTLVision

More information

An Interconnect-Centric Design Flow for Nanometer Technologies. Outline

An Interconnect-Centric Design Flow for Nanometer Technologies. Outline An Interconnect-Centric Design Flow for Nanometer Technologies Jason Cong UCLA Computer Science Department Email: cong@cs.ucla.edu Tel: 310-206-2775 http://cadlab.cs.ucla.edu/~cong Outline Global interconnects

More information

Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence

Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Chen-Wei Liu 12 and Yao-Wen Chang 2 1 Synopsys Taiwan Limited 2 Department of Electrical Engineering National Taiwan University,

More information

EE-382M VLSI II. Early Design Planning: Front End

EE-382M VLSI II. Early Design Planning: Front End EE-382M VLSI II Early Design Planning: Front End Mark McDermott EE 382M-8 VLSI-2 Page Foil # 1 1 EDP Objectives Get designers thinking about physical implementation while doing the architecture design.

More information

Chip/Package/Board Interface Pathway Design and Optimization. Tom Whipple Product Engineering Architect November 2015

Chip/Package/Board Interface Pathway Design and Optimization. Tom Whipple Product Engineering Architect November 2015 Chip/Package/Board Interface Pathway Design and Optimization Tom Whipple Product Engineering Architect November 2015 Chip/package/board interface pathway design and optimization PCB design with Allegro

More information

Harmony-AMS Analog/Mixed-Signal Simulator

Harmony-AMS Analog/Mixed-Signal Simulator Harmony-AMS Analog/Mixed-Signal Simulator Yokohama, June 2004 Workshop 7/15/04 Challenges for a True Single-Kernel A/MS Simulator Accurate partition of analog and digital circuit blocks Simple communication

More information

More Course Information

More Course Information More Course Information Labs and lectures are both important Labs: cover more on hands-on design/tool/flow issues Lectures: important in terms of basic concepts and fundamentals Do well in labs Do well

More information

AccuCore. Product Overview of Block Characterization, Modeling and STA

AccuCore. Product Overview of Block Characterization, Modeling and STA AccuCore Product Overview of Block Characterization, Modeling and STA What is AccuCore? AccuCore performs timing characterization of multi-million device circuits with SmartSpice accuracy and performs

More information

FABRICATION TECHNOLOGIES

FABRICATION TECHNOLOGIES FABRICATION TECHNOLOGIES DSP Processor Design Approaches Full custom Standard cell** higher performance lower energy (power) lower per-part cost Gate array* FPGA* Programmable DSP Programmable general

More information

ACCELERATING SELECT WHERE AND SELECT JOIN QUERIES ON A GPU

ACCELERATING SELECT WHERE AND SELECT JOIN QUERIES ON A GPU Computer Science 14 (2) 2013 http://dx.doi.org/10.7494/csci.2013.14.2.243 Marcin Pietroń Pawe l Russek Kazimierz Wiatr ACCELERATING SELECT WHERE AND SELECT JOIN QUERIES ON A GPU Abstract This paper presents

More information

ASIC world. Start Specification Design Verification Layout Validation Finish
