Advances in Designing Clockless Digital Systems

Advances in Designing Clockless Digital Systems. Prof. Steven M. Nowick, nowick@cs.columbia.edu, Department of Computer Science (and Elect. Eng.), Columbia University, New York, NY, USA

Introduction: Synchronous vs. Asynchronous Systems? Synchronous systems: use a global clock; the entire system operates at a fixed rate; uses centralized control. [Figure label: clock] #2

Introduction (cont.): Synchronous vs. Asynchronous Systems? (cont.) Asynchronous systems: no global clock; components can operate at varying rates and communicate locally via handshaking; uses distributed control. [Figure label: handshaking interfaces (channels)] #3

Trends and Challenges. Trends in chip design for the next decade: Semiconductor Industry Association (SIA) Roadmap. Unprecedented challenges: complexity and scale (= size of systems); clock speeds; power management; reusability and scalability; reliability; time-to-market. Design is becoming unmanageable using a centralized single-clock (synchronous) approach. #4

Trends and Challenges (cont.) 1. Clock Rate: 1980: several megahertz; 2001: ~750 megahertz to 1+ gigahertz; 2009: 3-6 gigahertz (and sometimes falling!). Design challenge: clock skew, since the clock must be near-simultaneous across the entire chip. #5

Trends and Challenges (cont.) 2. Chip Size and Density: total number of transistors per chip increases 60-80% per year. ~1970: 4 thousand (Intel 4004 microprocessor); today: 50-200+ million; 2010 and beyond: 1 billion+. Design challenges: system complexity, design time, clock distribution; the clock will require 10-20 cycles to reach across the chip. #6

Trends and Challenges (cont.) 3. Power Consumption: ever-increasing demand for low power. Consumer electronics: battery-powered. High-end processors: avoid expensive fans and packaging. Design challenge: the clock inherently consumes power continuously; power-down techniques add complexity and are only partly effective. #7

Trends and Challenges (cont.) 4. Time-to-Market, Design Re-Use, Scalability: increasing pressure for faster time-to-market. Need: reusable components (plug-and-play design); flexible interfacing (under varied conditions, voltage scaling); scalable design (easy system upgrades). Design challenge: mismatch with a central fixed-rate clock. #8

Trends and Challenges (cont.) 5. Future Trends: Mixed Timing Domains. Chips themselves are becoming distributed systems that contain many sub-regions operating at different speeds. Design challenge: breakdown of single centralized clock control. #9

Asynchronous Design: Potential Advantages
Several potential advantages:
Lower Power: no clock, so components use dynamic power only on demand; no global clock distribution; effectively provides automatic clock gating at arbitrary granularity
Robustness, Scalability: no global timing, so variable-speed components can be mixed and matched; supports dynamic voltage scaling; modular, object-oriented design style
Higher Performance: not limited to worst-case clock rate
Demand- (Data-) Driven Operation: instantaneous wake-up from standby mode
#10

Asynchronous Design: Recent Industrial Developments
1. Philips Semiconductors:
Wide commercial use: 700 million async chips for consumer electronics: pagers, cell phones, smart cards, digital passports, automotive
Benefits (vs. sync): 3-4x lower power (and lower energy consumption/ops); much lower electromagnetic interference (EMI); instant startup from stand-by mode (no PLLs)
Complete commercial CAD tool flow:
  Tangram: Philips (mid-90s to early 2000s)
  Haste: Handshake Solutions (incubated spinoff) (early 2000s to present)
Synthesis strategy: syntax-directed compilation
  starting point: concurrent HDL (Tangram, Haste)
  2-step synthesis:
    front-end: HDL spec => intermediate netlist of concurrent components
    back-end: each component => standard cell (then physical design)
  +: fast, transparent, easy-to-use
  -: few optimizations, low/moderate-performance only
#11

Asynchronous Design: Recent Industrial Developments
2. Intel: experimental Pentium instruction-length decoder = RAPPID (1990s): 3-4x faster than synchronous subsystem; ~2x lower power
3. Sun Labs: commercial: high-speed FIFOs in recent Ultras (memory access)
4. IBM Research: experimental: high-speed pipelines, FIR filters, mixed-timing systems
5. Recent Async Startups:
  Fulcrum Microsystems (California): Ethernet routing chips
  Camgian Systems: very low-power/robust designs (sensors, etc.)
  Handshake Solutions (Netherlands): incubated by Philips; tools + design
  Silistrix (UK): interconnect for low-end heterogeneous/mixed-timing systems
  Achronix: high-speed FPGAs
#12

Asynchronous Design: Potential Targets
Large variety of asynchronous design styles, addressing different points in the design-space spectrum.
Example targets:
  extreme timing-robustness: providing near delay-insensitive (DI) operation
  ultra-low power or energy: on-demand operation, instant wakeup
  ease-of-design/moderate performance: e.g. Philips style
  very high-speed: asynchronous pipelines (with localized timing constraints), comparable to high-end synchronous, with added benefits: support for variable-speed I/O rates
  support for heterogeneous systems: integrate different clock domains + async; GALS-style (globally-async/locally-sync)
#13

Asynchronous Design: Challenges. Critical design issues: components must communicate cleanly (hazard-free design); highly-concurrent designs are much harder to verify! Lack of automated computer-aided design tools: most commercial CAD tools are targeted to synchronous design. #14

What Are CAD Tools? Software programs to aid digital designers (= computer-aided design tools): they automatically synthesize and optimize digital circuits. Input: desired circuit specification => CAD tool => Output: optimized circuit implementation. #15

Asynchronous Design Challenge. Lack of existing asynchronous design tools: most commercial CAD tools are targeted to synchronous design. Synchronous CAD tools: major drivers of growth in the microelectronics industry. Asynchronous chicken-and-egg problem: few CAD tools <=> less commercial use of async design. Especially lacking: tools for designing/optimizing large systems. #16

Overview: Asynchronous Communication. Components usually communicate and synchronize on channels. [Figure: Sender and Receiver connected by a channel] #17

Overview: Signalling Protocols. Communication channel: usually instantiated as 2 wires, req and ack. [Figure: Sender and Receiver connected by req and ack] #18

Overview: Signalling Protocols: 4-Phase Handshaking. One transaction (return-to-zero [RZ]): an active (evaluate) phase followed by a return-to-zero (RZ) phase. [Figure: Sender and Receiver connected by req and ack, with req/ack waveforms] #19
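The 4-phase handshake on this slide can be sketched as a sequence of wire states. This is a minimal illustration in Python, not part of the original slides; the function name and the (req, ack) trace format are assumptions made for the example.

```python
def four_phase_transaction():
    """Return the ordered (req, ack) wire states for one 4-phase (RZ) transaction."""
    req, ack = 0, 0
    trace = [(req, ack)]       # idle: both wires low

    # Active (evaluate) phase
    req = 1                    # sender raises req
    trace.append((req, ack))
    ack = 1                    # receiver raises ack
    trace.append((req, ack))

    # Return-to-zero (RZ) phase
    req = 0                    # sender lowers req
    trace.append((req, ack))
    ack = 0                    # receiver lowers ack: channel is idle again
    trace.append((req, ack))
    return trace


trace = four_phase_transaction()
transitions = len(trace) - 1   # wire transitions used by one transaction
```

In this model one transaction uses four wire transitions (two on req, two on ack), which corresponds to two round trips between sender and receiver.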

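Overview: Signalling Protocols: 2-Phase Handshaking = Transition-Signalling. Two transactions (non-return-to-zero [NRZ]): a first communication and a second communication, each using one transition on req and one on ack. [Figure: Sender and Receiver connected by req and ack, with req/ack waveforms] #20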
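The 2-phase (transition-signalling) protocol can be sketched the same way. Again this is only an illustrative Python model with an assumed function name and trace format: each communication is one toggle of req followed by one toggle of ack, with no return-to-zero.

```python
def two_phase_transactions(count):
    """Return the (req, ack) wire states for `count` 2-phase (NRZ) transactions."""
    req, ack = 0, 0
    trace = [(req, ack)]
    for _ in range(count):
        req ^= 1               # sender toggles req: any transition is an event
        trace.append((req, ack))
        ack ^= 1               # receiver toggles ack to complete the transaction
        trace.append((req, ack))
    return trace


trace = two_phase_transactions(2)
transitions_per_transaction = (len(trace) - 1) // 2
```

In this model two complete 2-phase transactions pass through the same wire states that a single 4-phase transaction would use, which is the source of the single-round-trip advantage.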

Overview: How to Communicate Data? Data channel: replace req by (encoded) data bits; still use a 2-phase or 4-phase protocol. [Figure: Sender and Receiver connected by data and ack] #21

Overview: How to Encode Data?
A variety of asynchronous data encoding styles.
Two key classes: (i) DI (delay-insensitive) or (ii) timing-dependent; each can use either a 2-phase or 4-phase protocol.
DI Codes: provide timing-robustness (to arbitrary bit skew, arrival times, etc.)
  4-phase (RZ) protocols:
    dual-rail (1-of-2): widely used!
    1-of-4 (or m-of-n)
  2-phase (NRZ) protocols:
    transition-signaling (1-of-2)
    LEDR (1-of-2) [level-encoded dual-rail] [Dean/Horowitz/Dill, Advanced Research in VLSI 91]
    LETS (1-of-4) [level-encoded transition-signalling] [McGee/Agyekum/Mohamed/Nowick IEEE Async Symp. 08]
Timing-Dependent Codes: use localized timing assumptions
  Single-rail bundled data: widely used! = sync encoding + matched delay
  Other: pulse-mode, etc.
#22

Overview: How to Encode Data? Dual-rail: 4-Phase (RZ)
Dual-rail encoding:
  Bit X     X1  X0
  0         0   1
  1         1   0
  no data   0   0   = NULL (spacer)
[Figure: Sender drives rails X1 and X0 to Receiver; Receiver returns ack] #23
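The dual-rail table above translates directly into a small encoder and decoder. The following Python sketch is illustrative only; the function names are assumptions, and the (1, 1) codeword is treated as illegal because it does not appear in the table.

```python
NULL = (0, 0)  # spacer: no data on the channel


def dual_rail_encode(bit):
    """Encode one bit as the rail pair (X1, X0)."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (1, 0) if bit else (0, 1)


def dual_rail_decode(x1, x0):
    """Decode (X1, X0); returns None for the NULL spacer."""
    if (x1, x0) == NULL:
        return None
    if (x1, x0) == (0, 1):
        return 0
    if (x1, x0) == (1, 0):
        return 1
    raise ValueError("illegal dual-rail codeword: both rails high")


codeword_for_zero = dual_rail_encode(0)
codeword_for_one = dual_rail_encode(1)
```

Because validity is carried by the code itself (exactly one rail high), the receiver can detect data arrival without a separate req wire.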

Overview: How to Encode Data? 1-of-4: 4-Phase (RZ)
1-of-4 encoding:
  Bits A B   X3  X2  X1  X0
  00         0   0   0   1
  01         0   0   1   0
  10         0   1   0   0
  11         1   0   0   0
  no data    0   0   0   0   = NULL (spacer)
[Figure: Sender drives rails X3, X2, X1, X0 to Receiver; Receiver returns ack] #24
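The 1-of-4 table can be sketched in the same style. This Python example is illustrative, with assumed function names; rails are ordered (X3, X2, X1, X0) to match the table.

```python
def one_of_four_encode(a, b):
    """Encode the bit pair (A, B) as rails (X3, X2, X1, X0) with exactly one rail high."""
    if a not in (0, 1) or b not in (0, 1):
        raise ValueError("bits must be 0 or 1")
    value = (a << 1) | b           # 0..3
    rails = [0, 0, 0, 0]           # index 0 is X3, index 3 is X0
    rails[3 - value] = 1
    return tuple(rails)


def one_of_four_decode(rails):
    """Decode (X3, X2, X1, X0); returns None for the all-zero NULL spacer."""
    if len(rails) != 4 or sum(rails) > 1:
        raise ValueError("illegal 1-of-4 codeword")
    if sum(rails) == 0:
        return None
    value = 3 - list(rails).index(1)
    return (value >> 1, value & 1)


codewords = {(a, b): one_of_four_encode(a, b) for a in (0, 1) for b in (0, 1)}
```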

Overview: How to Encode Data? Single-Rail Bundled Data: 4-Phase (RZ). Uses synchronous (single-rail) data + a local worst-case model delay; req is the bundling signal. [Figure: Sender drives data A, B and req to Receiver; Receiver returns ack] #25

Signalling Protocols + Data Encoding: Tradeoffs
DI Codes: provide timing-robustness
  4-phase (RZ) protocols:
    -: poorer system throughput + power (2 roundtrips)
    +: easy function block design
    dual-rail (1-of-2): worse power (# rail transitions)
    1-of-4 (or m-of-n): better power (# rail transitions)
  2-phase (NRZ) protocols:
    +: better system throughput + power (1 roundtrip)
    -: difficult to design function blocks
    transition-signaling (1-of-2): worse power (# rail transitions)
    LEDR (1-of-2): better power (# rail transitions) [Dean/Horowitz/Dill, Advanced Research in VLSI 91]
    LETS (1-of-4): best power (# rail transitions) [McGee/Agyekum/Mohamed/Nowick IEEE Async Symp. 08]
Timing-Dependent Codes: good power + ease of function design / poor robustness
  Single-rail bundled data: widely used! = sync encoding + matched delay
  Other: pulse-mode, etc.
#26
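The "# rail transitions" comparison between the two 4-phase DI codes can be made concrete with a small count. The Python sketch below is an illustration under simplifying assumptions: it counts only data-rail transitions (not ack), treats every transition as equal in cost, and does not model the 2-phase codes. All function names are hypothetical.

```python
def dual_rail_word(bits):
    """Encode a list of bits with two rails (X1, X0) per bit."""
    rails = []
    for bit in bits:
        rails.extend((1, 0) if bit else (0, 1))
    return rails


def one_of_four_word(bits):
    """Encode an even-length list of bits with four rails per bit pair."""
    rails = []
    for i in range(0, len(bits), 2):
        value = (bits[i] << 1) | bits[i + 1]
        group = [0, 0, 0, 0]
        group[3 - value] = 1
        rails.extend(group)
    return rails


def rz_rail_transitions(encode, words):
    """Count data-rail transitions when each word is sent and then reset to NULL."""
    total = 0
    for bits in words:
        code = encode(bits)
        null = [0] * len(code)
        total += sum(a != b for a, b in zip(null, code))  # NULL -> codeword
        total += sum(a != b for a, b in zip(code, null))  # codeword -> NULL
    return total


words = [[0, 1], [1, 1], [0, 0]]                 # three 2-bit data items
dual_rail_count = rz_rail_transitions(dual_rail_word, words)
one_of_four_count = rz_rail_transitions(one_of_four_word, words)
```

For these three 2-bit items the model gives 12 rail transitions for dual-rail and 6 for 1-of-4, since dual-rail raises and lowers one rail per bit while 1-of-4 raises and lowers one rail per bit pair.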

Async Protocols: Evaluation Summary
Robust/High-Throughput Global Communication:
  High throughput + low power: 2-phase (NRZ) protocols (LETS)
Efficient Local Computation (easy-to-design function blocks):
  Ease-of-design + low area + low power:
    Timing-robust (DI): 4-phase (RZ) protocols (dual-rail, 1-of-4)
    Non-DI: single-rail bundled data (2-/4-phase)
Our recent research: efficient protocol converters [McGee/Agyekum/Mohamed/Nowick IEEE Async Symp. 08]
  Global communication: use 2-phase (LEDR, LETS)
  Local computation: use 4-phase (bundled, dual-rail, 1-of-4)
#27

Overview: My Research Areas. CAD Tools/Algorithms for Asynchronous Controllers (FSMs): MINIMALIST Package, for synthesis + optimization. Mixed-Timing Interface Circuits: for interfacing sync/sync and sync/async systems. High-Speed Asynchronous Pipelines: for static or dynamic logic. #28

CAD Tools for Async Controllers
MINIMALIST: developed at Columbia University [1994-]
  extensible CAD package for synthesis of asynchronous controllers
  integrates synthesis, optimization and verification tools
  used in 80+ sites/17+ countries (was taught at IIT Bombay)
  URL: http://www.cs.columbia.edu/~nowick/asynctools
Features:
  automatic design scripts + custom commands
  performance-driven multi-level logic decomposition
  Verilog back-end
  automatic verifier
  graphical interfaces
  many optimization modes
Recent application: laser space measurement chip (joint with NASA Goddard), NASA/Columbia (2006-2007); fabricated experimental chip: taped out (Oct. 06)
Key goal: facilitate design-space exploration
#29

Example: PE-SEND-IFC (HP Labs)
Inputs: req-send, treq, rd-iq, adbld-out, ack-pkt
Outputs: tack, peack, adbld
From HP Labs Mayfly Project: B. Coates, A. Davis, K. Stevens, "The Post Office Experience: Designing a Large Asynchronous Chip," INTEGRATION: the VLSI Journal, vol. 15:3, pp. 341-66 (Oct. 1993)
[Figure: burst-mode state diagram with states 0-10; arcs labeled input burst / output burst, e.g. rd-iq+ / adbld+, adbld-out+ / peack+, ack-pkt+ / peack+]
#30

EXAMPLE (cont.): Design-Space Exploration using MINIMALIST: optimizing for area vs. speed Examples: #31

Mixed-Timing Interfaces: Challenge. Goal: provide low-latency communication between timing domains. Challenge: avoid synchronization errors. [Figure: two asynchronous domains, Synchronous Domain 1 and Synchronous Domain 2] #33

Mixed-Timing Interfaces: Solution. Insert mixed-timing FIFOs: they provide safe data transfer. Developed a complete family of mixed-timing interface circuits. [Figure: asynchronous domains and Synchronous Domains 1 and 2 connected through Async-Sync FIFOs, a Sync-Async FIFO and Mixed-Clock FIFOs] [Chelcea/Nowick, IEEE Design Automation Conf. (2001); IEEE Trans. on VLSI Systems v. 12:8, Aug. 2004] #34

High-Speed Asynchronous Pipelines. NON-PIPELINED COMPUTATION: datapath component = adder, multiplier, etc. [Figure: SYNCHRONOUS datapath component driven by a global clock] #36

High-Speed Asynchronous Pipelines. PIPELINED COMPUTATION: like an assembly line. [Figure: SYNCHRONOUS pipeline with a global clock; ASYNCHRONOUS pipeline with no global clock] #37

High-Speed Asynchronous Pipelines
Goal: fast + flexible async datapath components
  speed: comparable to fastest existing synchronous designs
  additional benefits: dynamically adapt to variable-speed interfaces; handle dynamic voltage scaling; no requirement of equal-delay stages; no high-speed clock distribution
Contributions: 3 New Asynchronous Pipeline Styles [M. Singh/S.M. Nowick]
  (i) MOUSETRAP: static logic [ICCD-01, IEEE Trans. on VLSI Systems 2007]
  (ii) Lookahead (LP): dynamic logic [Async-02, IEEE Trans. on VLSI Systems 2007]
  (iii) High-Capacity (HC): dynamic logic [Async-02, ISSCC-02, IEEE Trans. on VLSI Systems 2007]
Application (IBM Research): experimental FIR filter [ISSCC-02, J. Tierno et al.]
  - async filter in sync wrapper
  - provides adaptive latency = # of clock cycles per operation
  - performance: better than leading comparable commercial synchronous design (from IBM)
#38

MOUSETRAP: A Basic FIFO (no computation). Stages communicate using transition-signaling. [Figure: Stages N-1, N, N+1; each stage has a Data Latch (Data in, Data out) and a Latch Controller driving the enable En; signals req N, done N, req N+1, ack N-1, ack N] [Singh/Nowick, IEEE Int. Conf. on Computer Design (2001), IEEE Transactions on VLSI Systems (2007)] #39
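A behavioral sketch of one FIFO stage may help in reading the figure. The slide text names the latch controller but does not give its logic; the Python model below assumes, following the published MOUSETRAP design, that the controller is an XNOR of done N and ack N, so the latch is transparent when the two agree. The class and method names are hypothetical, and timing is abstracted away entirely.

```python
class MousetrapStage:
    """Abstract model of one MOUSETRAP FIFO stage (transition-signalling)."""

    def __init__(self):
        self.done = 0      # done N, also forwarded as req N+1
        self.ack_in = 0    # ack N, driven by stage N+1
        self.data = None

    @property
    def enabled(self):
        # Assumed latch controller: En = XNOR(done N, ack N)
        return self.done == self.ack_in

    def offer(self, req_in, data):
        """Present a request level and data; return True if the stage captures it."""
        if self.enabled and req_in != self.done:
            self.data = data
            self.done = req_in     # done toggles, so En falls and the latch closes
            return True
        return False               # latch opaque, or no new request transition

    def acknowledge(self):
        """Stage N+1 has taken the item: it toggles ack N, reopening the latch."""
        self.ack_in = self.done


stage = MousetrapStage()
first = stage.offer(1, "A")        # captured; stage becomes opaque
second = stage.offer(0, "B")       # ignored: item A has not been acknowledged
held_before_ack = stage.data
stage.acknowledge()                # latch transparent again
third = stage.offer(0, "B")        # captured
```

In the real circuit the left neighbor would not issue a new request before receiving ack N-1; the early second offer here only demonstrates that an opaque latch protects the stored item.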

MOUSETRAP Pipeline: w/computation. Function blocks: use synchronous single-rail circuits (not hazard-free!). Bundled data requirement: each req must arrive after data inputs are valid and stable. [Figure: Stages N-1, N, N+1; each stage has a Data Latch, a Latch Controller, a logic block, and a matched delay on req; signals req N, done N, req N+1, ack N-1, ack N] #40
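The bundled-data requirement amounts to a per-stage timing inequality: the delay on the req path must be at least the worst-case delay of the logic block it accompanies. The Python sketch below is a simplified check under stated assumptions: the stage names and picosecond values are made up for illustration, and latch setup time and wire delay are folded into an optional margin.

```python
def bundling_violations(stages, margin_ps=0):
    """Return names of stages whose req delay is shorter than logic delay + margin.

    `stages` is a list of (name, worst_case_logic_delay_ps, req_delay_ps) tuples.
    """
    return [
        name
        for name, logic_ps, req_delay_ps in stages
        if req_delay_ps < logic_ps + margin_ps
    ]


# Hypothetical delays, in picoseconds
stages = [
    ("stage N-1", 410, 450),
    ("stage N", 520, 500),      # req would arrive before the data is stable
    ("stage N+1", 380, 400),
]

violations = bundling_violations(stages)
violations_with_margin = bundling_violations(stages, margin_ps=30)
```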

Other Research Projects
1. Asynchronous Interconnection Networks: for Shared-Memory Parallel Processors
  Medium-scale NSF project [2008-12]: with Prof. Uzi Vishkin (University of Maryland)
  Goal: low-power/high-performance async routing network (processors <=> memory)
  GALS-style: globally-asynchronous/locally-synchronous
  [M. Horak, S.M. Nowick, M. Carlberg, U. Vishkin, ACM NOCS-10 Symposium]
2. Continuous-Time DSPs
  Medium-scale NSF project [2010-14]: with Prof. Yannis Tsividis (Columbia EE Dept.)
  Idea: adaptive signal processing, based on signal rate-of-change
  Goal: low-aliasing + low-power; combine analog + async digital
3. Asynchronous Bus Encoding: for Timing-Robust Global Communication
  Goal: low-power, error-correction + timing-robust (delay-insensitive) communication
  [M. Agyekum/S.M. Nowick, DATE-10, IWLS-10, DATE-11]
4. Variable-Latency Functional Units: Speculative Completion
  Goal: high-performance components with data-dependent completion
  [S.M. Nowick et al., IEE Proceedings 96; IEEE Async-97 Symposium]
#41

MOUSETRAP: A Basic FIFO. Stages communicate using transition-signaling: 1 transition per data item! [Figure: Stages N-1, N, N+1; each stage has a Data Latch (Data in, Data out) and a Latch Controller driving the enable En; signals req N, done N, req N+1, ack N-1, ack N; one data item shown] #43

Performance Analysis of Concurrent Systems
Goal: fast analytical techniques + tools to handle large/complex asynchronous + mixed-timing systems
  using stochastic delay models (Markovian): [P. McGee/S.M. Nowick, CODES-05]
  using bounded delay models (min/max): [P. McGee/S.M. Nowick, ICCAD-07]
Applications: analysis + optimization
  Large asynchronous systems:
    Evaluate latency, throughput, critical vs. slack paths, average-case performance
    Drive optimization: pipeline granularity, module selection
  Large heterogeneous (mixed-clock) or GALS systems:
    Evaluate critical vs. slack paths
    Drive optimization: dynamic voltage scaling, load balancing of threads, buffer insertion
#44

Introduction to MLO MLO is an integrated post-processing (i.e. backend) tool for Minimalist. Targeted to multi-level logic. In contrast, Minimalist currently is targeted to two-level logic. Designed to work on combinational hazard-free logic for Burst Mode controllers. Uses hazard-non-increasing transforms. Output of MLO is Verilog. MLO is a standalone tool running from the Linux shell outside of Minimalist. #45

Minimalist: MLO (Multi-Level Optimizer). Accessible on the web from: http://www1.cs.columbia.edu/~nowick/asynctools. Initial release: one version for Linux distributions. Includes: complete tutorial, documentation, examples. Tool requires a Python interpreter to run: http://www.python.org/download/. Consult the README for MLO installation information. #46

CEO Feature - User-Specified Critical Events
User-specified critical arcs are highlighted in red.
[Figure: burst-mode state graph with states 0-5; arcs labeled input burst / output burst, e.g. IntITReq+ / ITEventReq+ and ITEvent2Ticks- / CtrIncReq+]
Case 1: No arc is colorized. The user has specified nothing as critical, so every output defaults to automated mode.
Case 2: Some outputs are colorized, some are not. Both user-specified data and the automated approach are used to determine criticality. ITEventReq will use user-specified data to determine criticality. CtrIncReq will default to automated mode to determine criticality.
Case 3: Every output is colorized. The automated approach is never used. IntITReq- is critical with respect to CtrIncReq-, while ITEvent2Ticks- is NOT critical to CtrIncReq-.
#47

Feature Set - Initial Two-Level Implementation (before applying MLO). The next four slides present different MLO output examples. For each example, the starting circuit (input to MLO) is this circuit: the two-level structure from Minimalist output. #48

Feature Set Example 1 - Gate Fan-in Limitation Result of MLO: Multi-Level circuit with AND gate fan-in limit of 2 #49
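To illustrate what a fan-in limit means, a wide AND gate can be rebuilt as a tree of smaller AND gates. The Python sketch below is not MLO's algorithm, which the slides do not describe; it is a generic decomposition using assumed names and a nested-tuple netlist, and it checks only functional equivalence, not hazard behavior.

```python
from itertools import product


def decompose_and(inputs, fan_in=2):
    """Return a nested ('AND', ...) tree in which no gate exceeds `fan_in` inputs."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    nodes = list(inputs)
    if not nodes:
        raise ValueError("need at least one input")
    while len(nodes) > fan_in:
        grouped = []
        for i in range(0, len(nodes), fan_in):
            chunk = nodes[i:i + fan_in]
            # A leftover single node passes through without a gate
            grouped.append(("AND", *chunk) if len(chunk) > 1 else chunk[0])
        nodes = grouped
    return ("AND", *nodes) if len(nodes) > 1 else nodes[0]


def evaluate(node, env):
    """Evaluate a tree against a dict mapping input names to 0/1."""
    if isinstance(node, str):
        return env[node]
    return int(all(evaluate(child, env) for child in node[1:]))


def max_fan_in(node):
    """Largest number of inputs on any gate in the tree."""
    if isinstance(node, str):
        return 0
    return max(len(node) - 1, *(max_fan_in(child) for child in node[1:]))


names = ["a", "b", "c", "d", "e"]
tree = decompose_and(names, fan_in=2)
equivalent = all(
    evaluate(tree, dict(zip(names, values))) == int(all(values))
    for values in product((0, 1), repeat=len(names))
)
```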

Feature Set Example 2 - Negative Logic. Result of MLO: multi-level circuit using MLO negative logic. This mode applies only safe, hazard-non-increasing transformations (DeMorgan's Law). It also includes optimizations that carefully eliminate extra inverters. #50
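The DeMorgan's Law step behind negative logic can be checked exhaustively on a small example. The Python sketch below uses an assumed two-product function, f = ab + cd, rather than any circuit from the slides, and it verifies only logical equivalence of the AND-OR and NAND-NAND forms.

```python
from itertools import product


def nand(*inputs):
    """NAND gate over 0/1 inputs."""
    return int(not all(inputs))


def and_or(a, b, c, d):
    """Two-level positive-logic form: f = (a AND b) OR (c AND d)."""
    return int((a and b) or (c and d))


def nand_nand(a, b, c, d):
    """Negative-logic form of the same function, obtained via DeMorgan's Law."""
    return nand(nand(a, b), nand(c, d))


equivalent = all(
    and_or(*values) == nand_nand(*values)
    for values in product((0, 1), repeat=4)
)
```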

Feature Set Example 3 - CEO (critical event optimizer). Gate decomposed: input intitreq is more critical to output iteventreq than ctrincack and y0. [Figure label: critical primary input-to-output path] Result of MLO: multi-level circuit after MLO CEO is used. #51

Feature Set Example 4 - Combined: negative logic; gate fan-in limit of 2; CEO optimizes the critical path. Result of MLO: multi-level circuit with negative logic, AND gate fan-in limit of 2, and CEO. #52