Design Space Exploration SS 2012 Jun.-Prof. Dr. Christian Plessl Custom Computing University of Paderborn Version 1.1.0 2012-06-15
Overview motivation for design space exploration design space exploration using evolutionary algorithms case studies 2
Motivation challenge: many real-world synthesis problems cannot be solved optimally large search space (combinatorial explosion) metrics to be optimized can be estimated only crudely without implementation (e.g. power consumption, performance, chip area, ) no closed form solution available (e.g. because of non-linear constraints or metrics) no general definition what optimal means in the case of multiple, contradicting optimization goals 1 5 3 Allocation, Binding v RISC v bus power consumption? communication bus usage? 7 2 6 v HWM1 v ptp v HWM2 execution time? code size? 4 3
objective instead of finding one optimal solution Design Space Exploration determine many alternative solutions and their characteristics use cases present designer with choices to select from understand design trade-offs create starting points for refined designs iterative stochastic search process initialize candidates generate new solution candidates evaluate candidates update set of solutions e.g. randomly generated chose allocation and binding compute schedule and other metrics select which solution(s) to pursue further 4
Evolutionary Algorithms Basic Principles ➊ selection ➋ crossover ➌ mutation 5
minimize g(x) = x² Evolutionary Algorithms (EA) 0011 = one solution 0011 fitness = 9 fitnesscalculation selection next generation 0011 0000 0100 1011 0011 mutation crossover 6
Evolutionary Algorithms (EA) evolutionary algorithms are randomized search heuristics problem-independent (meta heuristics) population based use variation (crossover, mutation) and selection application domains when optimization problem is complex and 'diffuse' examples: system synthesis, path planning in robotics diffuse + complex multiobjective optimization several conflicting criteria, eg. performance vs. cost vs. power consumption EAs find Pareto fronts (set of Pareto points) 7
Dominance, Pareto Points definition: a (design) point J i is dominated by point J k, if J k is equal or better than J i in all criteria and better in at least one criterion. J J i k definition: a (design) point is Pareto-optimal or a Pareto point, if it is not dominated by any other point. 8
Multiobjective Optimization (1) decision space (x1, x2,, xn) minimize f objective space (y1, y2,, yk) x2 y2 dominated x1 y1 Pareto optimal = not dominated difficulties: large search space multiple optima 9
Multiobjective Optimization (2) classic single-objective methods e.g. simulated annealing, ILP, hierarchical clustering,... decision making before optimization weighted cost function multi-stage optimization e.g. hierarchical clustering with different closeness functions decision making after optimization multiple optimization runs with varying weights population-based methods evolutionary algorithms decision making after optimization the goal is to explore the design space 10
multiple objectives (y1, y2,, yk) parameters transformation Weighted Cost Function single objective y example: weighting approach y2 maximization problem (w1, w2,, wk) y = w1y1 + + wkyk y1 11
EAs for Multiobjective Optimization y2 EA operations selection recombination mutation diversity goals y1 distance 12
Fitness Assessment by Pareto Ranking fitness function: F'( J ) = = i 1.. N, J 0 : J i 1: Ji else J execution time F' (1) = 3 1 2 4 3 6 F F F F '(2) '(3) '(4) '(5) = 1 = 1 = 2 = 1 5 cost F '(6) = 0 13
Fitness Assessment by Non-dominated Sorting 1. start with set of all solutions J 2. determine set of non-dominated solutions (A 0 ) in J, assign fitness value φ 0 to these solutions 3. remove A 0 from J (J 1 = J \ A 0 ), determine set of non-dominated solutions (A 1 ) in A 0, assign fitness value φ 1 to these solutions 4. repeat until J i is empty f 2 ϕ = 3 ϕ = 2 ϕ = 1 ϕ = 0 Teich & Haubelt 2007 f 1 14
Example: Strength Pareto EA (1) save non-dominated solutions (elitism) reduce non-dominated set by means of clustering population non-dominated set assign fitness values (Pareto-based) perform binary tournament selection recombination Mutation 15
Example: Strength Pareto EA (2) clustering: reduce non-dominated set but do not destroy characteristics y2 y2 y2 clustering y1 y1 y1 group select remove 16
Example: Strength Pareto EA (3) y2 5+2 5+7 2 1 5+4 4 fitness assignment scheme: non-dominated solutions: fitness = #dominated solutions dominated solutions : fitness = #non-pareto solutions fitness of + dominators y1 lighter better than darker guidance towards Pareto-optimal set few better than many maintenance of diversity 17
Design Space Exploration with EAs individual EA selection recombination mutation allocation binding decode allocation decode binding scheduling chromosome = encoded allocation + binding design point (implementation) fitness fitness evaluation user constraints 18
Challenges encoding of (allocation+binding) simple encoding e.g. one bit per resource, one variable per binding easy to implement many infeasible partitionings encoding + repair e.g. simple encoding and modify such that for each v p V P there exists at least one v a V A with a β(v p ) = v a reduces number of infeasible partitionings generation of the initial population find a good starting point (seed) to accelerate convergence mutation and recombination operators must be defined that the whole search space can be reached 19
Case Study - Video Coder (1) 20
Case Study - Video Coder (2) 21
Case Study - Video Coder (3) frame memory dual ported frame memory block matching module input module subtract/add module DCT/IDCT module Huffman encoder output module 22
EA Design Space Exploration Tool 1 2 23
Case Study - Solution 1 INM OUTM FM RISC2 SBS 24
Case Study - Solution 2 INM OUTM DPFM HC DCTM BMM SAM SBF 25
Case Study - Code Synthesis (1) synchronous data flow graph (SDF) formalism programming model for streaming signal processing applications allows expressing coarse-grained parallelism suitable as specification for systems with parallel execution application is defined as a network of nodes and arcs nodes (actors) can execute arbitrary functions consume and produce tokens number of consumed/produced tokens is shown as a label arcs (communication channels) actor communicate via FIFO buffers 3 4 A example for an actor, reads 3 tokens on input arc and produces 4 tokens on output 26
Case Study - Code Synthesis (2) synchronous data flow graph software implementation A 1 1 2 3 2 7 8 7 5 1 CD 44.1kHz B C D E sample rate converter F DAT 48 khz decisions schedule ABABABCCABABA code generation model inlining subroutines looping B 2 3 C CODE(A) CODE(B) CODE(A) CALL(A) CALL(B) CALL(A) FOR 1 TO 2 CODE(A) CALL(B) CODE(B) CALL(B) CODE(C) CODE(C) CALL(C) CODE(A) 27
Case Study - Code Synthesis (3) may increase data memory trade-offs saves looping program memory subroutines execution time save increase 28
Case Study - Code Synthesis (4) trade-off surface for TI TMS320C40 digital signal processor 29
TI TMS320C40 DSP Case Study - Code Synthesis (5) Motorola DSP56k ADSP 2106x DSP 30
Changes 2012-06-15 (v1.1.0) updated for SS2012 2011-06-14 (v1.0.0) initial version 31