Scheduling with Integer Time Budgeting for Low-Power Optimization

Similar documents
Numerical Solution of Deformation Equations. in Homotopy Analysis Method

LECTURE NOTES Duality Theory, Sensitivity Analysis, and Parametric Programming

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Modeling Local Uncertainty accounting for Uncertainty in the Data

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

Support Vector Machines

An Improved Isogeometric Analysis Using the Lagrange Multiplier Method

GSLM Operations Research II Fall 13/14

Support Vector Machines

Scheduling with Integer Time Budgeting for Low-Power Optimization

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints

Smoothing Spline ANOVA for variable screening

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law)

An Optimal Algorithm for Prufer Codes *

5 The Primal-Dual Method

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming

Positive Semi-definite Programming Localization in Wireless Sensor Networks

Optimization Methods: Integer Programming Integer Linear Programming 1. Module 7 Lecture Notes 1. Integer Linear Programming

A General Algorithm for Computing Distance Transforms in Linear Time

Conditional Speculative Decimal Addition*

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Solving two-person zero-sum game by Matlab

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

Flexible ASIC: Shared Masking for Multiple Media Processors

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

11. APPROXIMATION ALGORITHMS

Modeling and Solving Nontraditional Optimization Problems Session 2a: Conic Constraints

A Facet Generation Procedure. for solving 0/1 integer programs

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

Array transposition in CUDA shared memory

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

Efficient Distributed File System (EDFS)

The Codesign Challenge

On Some Entertaining Applications of the Concept of Set in Computer Science Course

Storage Binding in RTL synthesis

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

A New Approach For the Ranking of Fuzzy Sets With Different Heights

How Accurately Can We Model Timing In A Placement Engine?

Discriminative Dictionary Learning with Pairwise Constraints

Behavior-Level Observability Analysis for Operation Gating in Low-Power Behavioral Synthesis

ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE

Module Management Tool in Software Development Organizations

Using SAS/OR for Automated Test Assembly from IRT-Based Item Banks

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Repeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits

OPL: a modelling language

Gradual Relaxation Techniques with Applications to Behavioral Synthesis *

LP Decoding. Martin J. Wainwright. Electrical Engineering and Computer Science UC Berkeley, CA,

SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

A Comparative Study of Constraint-Handling Techniques in Evolutionary Constrained Multiobjective Optimization

NURBS curve method Taylor's launch type of interpolation arithmetic Wan-Jun Zhang1,2,3,a, Shan-Ping Gao1,b Xi-Yan Cheng 1,c &Feng Zhang2,d

Clustering on antimatroids and convex geometries

Efficient Load-Balanced IP Routing Scheme Based on Shortest Paths in Hose Model. Eiji Oki May 28, 2009 The University of Electro-Communications

Polyhedral Compilation Foundations

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Analog amplifier card Type VT-VSPA2-1-2X/V0/T1 Type VT-VSPA2-1-2X/V0/T5

Parallel matrix-vector multiplication

Today Using Fourier-Motzkin elimination for code generation Using Fourier-Motzkin elimination for determining schedule constraints

CHAPTER 3 SEQUENTIAL MINIMAL OPTIMIZATION TRAINED SUPPORT VECTOR CLASSIFIER FOR CANCER PREDICTION

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms

Classification / Regression Support Vector Machines

Configuration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations*

Minimization of the Expected Total Net Loss in a Stationary Multistate Flow Network System

Meta-heuristics for Multidimensional Knapsack Problems

EWA: Exact Wiring-Sizing Algorithm

A Multilevel Analytical Placement for 3D ICs

Intra-Parametric Analysis of a Fuzzy MOLP

Cluster Analysis of Electrical Behavior

INTEGER PROGRAMMING MODELING FOR THE CHINESE POSTMAN PROBLEMS

Interconnect Optimization for High-Level Synthesis of SSA Form Programs

Programming in Fortran 90 : 2017/2018

Constrained Robust Model Predictive Control Based on Polyhedral Invariant Sets by Off-line Optimization

Cache Performance 3/28/17. Agenda. Cache Abstraction and Metrics. Direct-Mapped Cache: Placement and Access

Overview. Basic Setup [9] Motivation and Tasks. Modularization 2008/2/20 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION

Hermite Splines in Lie Groups as Products of Geodesics

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

Design and Analysis of Algorithms

Wishing you all a Total Quality New Year!

NGPM -- A NSGA-II Program in Matlab

ASSERTION SUPPORT IN HIGH-LEVEL SYNTHESIS DESIGN FLOW. 351 crs de la Libération, Talence, France

Ecient Computation of the Most Probable Motion from Fuzzy. Moshe Ben-Ezra Shmuel Peleg Michael Werman. The Hebrew University of Jerusalem

Classifier Selection Based on Data Complexity Measures *

Delay Variation Optimized Traffic Allocation Based on Network Calculus for Multi-path Routing in Wireless Mesh Networks

Towards Semantic Knowledge Propagation from Text to Web Images

Network Coding as a Dynamical System

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments

Improving Low Density Parity Check Codes Over the Erasure Channel. The Nelder Mead Downhill Simplex Method. Scott Stransky

Virtual Machine Migration based on Trust Measurement of Computer Node

A Fuzzy Goal Programming Approach for a Single Machine Scheduling Problem

Optimization of integrated circuits by means of simulated annealing. Jernej Olenšek, Janez Puhan, Árpád Bűrmen, Sašo Tomažič, Tadej Tuma

Motivation. EE 457 Unit 4. Throughput vs. Latency. Performance Depends on View Point?! Computer System Performance. An individual user wants to:

Solving Route Planning Using Euler Path Transform

Fixing Max-Product: Convergent Message Passing Algorithms for MAP LP-Relaxations

Optimization of Critical Paths in Circuits with Level-Sensitive Latches

Design for Reliability: Case Studies in Manufacturing Process Synthesis

MOBILE Cloud Computing (MCC) extends the capabilities

Multi-objective Design Optimization of MCM Placement

OPTIMIZATION OF PROCESS PARAMETERS USING AHP AND TOPSIS WHEN TURNING AISI 1040 STEEL WITH COATED TOOLS

Transcription:

Schedlng wth Integer Tme Bdgetng for Low-Power Optmzaton We Jang, Zhr Zhang, Modrag Potkonjak and Jason Cong Compter Scence Department Unversty of Calforna, Los Angeles Spported by NSF, SRC.

Otlne Introdcton to nteger tme bdgetng (ITB) problem Low-power schedlng Expermental reslts Conclsons and ftre work

Integer Tme Bdgetng (ITB) Problem Defnton Slack: the amont of extra delay that each component (ether a small gate or a large modle) of a desgn can tolerate wthot volatng the gven tmng constrant Tme bdgetng problem The problem of dstrbtng the slacks to dfferent modles of a desgn to optmze some objectves (sch as area, power) Example: redce area/power by sng slower adders Integer tme bdgetng problem The slacks mst be ntegral vales Applcatons: schedlng, gate szng, nterconnect plannng D 1 = 1ns + + D 2 = 1ns + D 3 = 1ns T = 5ns Slacks: S 1 =S 2 =S 3 =3ns

Motvaton Power (mw) 5 3 Maxmzng sm of weghted slack mght be sboptmal for power optmzaton v 1 v 2 v 3 1 2 3 4 Delay (ns) Power-delay tradeoff-crve v 1 v 2 5ns v 1 v 2 v 3 v 3 Total slack = 6ns Total power = 16mW Total slack = 5ns Total power = 11mW

Related Work Network-flow flow-based algorthm A Unfed Theory of Tme Bdget Management, [Ghas[ et al., ICCAD 4] Can solve the ITB problem optmally wth lnear objectve fncton Desgn Closre Drven Delay Relaxaton Based on Convex Cost Network Flow [Ln et al., DATE 7] Can handle convex objectve fncton Mathematcal programmng approach A Mathematcal Formlaton of Integer Tme Bdgetng Problem [Cong et al., TECHCON 7] Handles convex objectve fncton Inttve and easy to be ncorporated nto systems wth applcaton on-specfc desgn constrants

A Mathematcal Formlaton of Tme Bdgetng Problem Power (mw) v 1 v 2 v 3 v 4 v 5 T 1 5 3 v 6 Drected Acyclc Graph: G = (V, E) s : start tme of node v d : mnmm latency of v b : tme bdget at node v Lnearly constraned separable convex optmzaton problem Mn Sbject s b s s 1 2 3 4 Delay (ns) Each f s a sngle-varable convex fncton + b d T V = 1 s f ( b ) to : j v v v e(, V PIs j) POs E

Totally Unmodlar Constrant Matrx v 1 v 2 s 2 + d 2 s 3 s 1 + d 1 s 3 v 3 s 1 s 2 s 3 T s 1 1 1 s 2 1 1 s 3-1 -1 1 J 1 J 1 J 2 J 1 b 1 1 b 2 1 b 3 Theorem 1: The ITB constrant matrx s a TUM

Optmzng Separable Convex Objectve Power (mw) 1 Power (mw) 1 5 3 5 3 1 2 3 4 Delay (ns) 1 2 3 4 Delay (ns)

Otlne Introdcton to nteger tme bdgetng (ITB) problem Low-power schedlng Expermental reslts Conclsons and ftre work

Applcaton to Low-Power Schedlng Motvaton Schedlng and tme bdgetng are hghly correlated Problem Consder the schedlng and bdgetng problem together to mnmze the average power nder tme constrant T Man dea Integrate or ITB problem wth the SDC based schedlng [Cong and Zhang, An effcent and Versatle Schedlng Algorthm Based on SDC Formlaton, DAC 6]

Low-power Schedlng Problem Formlaton Gven: A data flow graph G A latency constrant T A set of optonal schedlng constrants ncldng cycle tme constrant, relatve tmng constrants and resorce constrants. + * v 1 v 2 * v 3 + v 4 A set of power-delay tradeoff crves for each type of operaton sch as addton, mltplcaton, etc. Objectve v 5 DFG Example Get a vald schedlng whch satsfes all the constrants and mnmze the total power

Low-Power Schedlng Each node v V op s assocated wth a node bdgetng varable bv(v ) whch denotes the # of clock cycles that operaton v lasts n the fnal schedle Adjst the followng constrants Data dependence constrant (, v) E d : sv beg beg ()) + bv() sv sv beg (v) Latency constrant T v V op : sv beg ()) + bv(v) T Throghpt constrant wth ntaton nterval II v V op : bv(v) II Optmzng total node power Mn V op = pw 1 op v ) ( ( bv( v )) We can optmally mnmze the total node power n polynomal tme

Consderaton of Resorce Bndng Optmzng total FU power Constrant matrx s no longer totally nmodlar wth the reqrement that: all operatons sharng a same fncton nt mst have same slacks The problem s NP-complete (redcton from 3-SAT) 3 Proposed herstc Mn Frst solve the contnos verson and obtan the optmal fractonal bdget fb(v ) for each node v Perform a global rondng by mnmzng the least-sqares sqares error Objectve fncton s separable convex F = f * j 1 j pw op f j ) ( ( bv ( f j )) Mn V op = 1 ( bv( v ) fb( v )) 2

Low-power Schedlng Flow User Constrants CDFG Target platform modelng (Power-delay crves) Resorce Sharng Informaton Low Power Schedler Constrant generaton ITB constrants Dependency constrants Objectve generaton Mathematcal Programmng Solver STG (State Transton Graph) Resorce Bndng

Otlne Introdcton to nteger tme bdgetng (ITB) problem Low-power schedlng Expermental reslts Conclsons and ftre work

C specfcaton C-to- xplot behavoral synthess LLVM compler SSDM (System-Level Synthess Data Model) SSDM/CDFG Behavoral synthess SSDM/FSMD RTL generaton Platform descrpton & constrants FSM wth Datapath n VHDL Mlt-cycle path constrants Magma RTL-to to-slcon v4.2 (+ TSMC 9nm lbrary) GDSII

Expermental Reslts Comparson wth Max Weghted Slack LPS: Mnmze total node power WMS: Maxmze maxmm weghted slack Power, area, and cycle tme comparsons between WMS and LPS (Latency constrant = 1.2x the longest path length)

Expermental Reslts Consderatons of Resorce Sharng LPS_NRS: Mnmze total node power (wthot consderng resorcee sharng) LPS_RS: Mnmze total fncton nt power (consderng resorce sharng) ILP: ILP-based approach to drectly mnmze total FU power (optmal solton) Actal power consmpton LPS_RS s wthn 6% of the ILP exact approach and otperforms LPS_NRS by 3%

Conclsons and Ftre Work Conclsons A mathematcal programmng formlaton of the nteger tme bdgetng problem wth great flexblty and extensblty Applcaton to low-power schedlng problem Ftre works Apply or ITB formlaton to other problems

Thanks.

Desgn Closre Drven Delay Relaxaton Based on Convex Cost Network Flow [DATE7] Problem formlaton Desgn Closre Drven Delay Relaxaton problem Essentally a ITB problem wth convex objectve fncton Solton Transformaton to a convex cost nteger dal network flow problem Comparson wth mathematcal programmng (MP) approach Network flow based algorthm has a better worst-case complexty MP approach allows the tlzaton of the leadng-edge edge mathematcal programmng solvers MP approach can be easly extended to spport applcaton specfc c constrants