Improved Symbolic Simulation By Dynamic Functional Space Partitioning


Tao Feng, Li-C. Wang, Kwang-Ting Cheng
Department of ECE, UC-Santa Barbara, U.S.A.
{tfeng, licwang, timcheng}@ece.ucsb.edu

Andy C-C. Lin
Cadence Design Systems, Inc., U.S.A.
cclin@verplex.com

Abstract

In this paper, we provide a flexible and automatic method to partition the functional space for efficient symbolic simulation. We utilize a 2-tuple list representation as the basis for partitioning the functional space. The partitioning is carried out dynamically during the symbolic simulation based on the sizes of OBDDs. We develop heuristics for choosing the optimal partitioning points. These heuristics aim to balance the tradeoff between time and space complexity. We demonstrate the effectiveness of our new symbolic simulation approach through experiments based on a floating point adder and a memory management unit.

1. Introduction

Symbolic simulation based on Ordered Binary Decision Diagrams (OBDDs) [1] has shown various successes in formal verification. However, a traditional symbolic simulation approach may easily suffer from the memory size explosion problem. Even worse, a small modification of the circuit or of the initial variable ordering can result in an order-of-magnitude difference in run time and memory usage. The limited and unstable behavior of a symbolic simulator often leads to tremendous frustration for verification engineers.

Many attempts have been made to reduce or control the OBDD sizes in symbolic simulation [2]. Partitioning the functional space (case splitting) is one promising approach. In [3], the authors split the verification task into subcases based on an understanding of the design, and apply parametric constraints to verify each subcase. The authors of [4] decompose the monolithic OBDD into partitioned OBDDs based on control conditions, which are combinations of primary inputs or internal variables. In this paper, we provide a different way to partition the functional space. The contribution of this paper comes from two aspects.

First, we provide a 2-tuple list symbolic simulation engine which can represent a Boolean function in the control and datapath domains separately. During the course of the simulation, the functional space in the control domain can be partitioned into subspaces, and the corresponding data for each subspace is evaluated in the datapath domain. Second, the paper discusses in detail how to find good points for functional space partitioning. For hard-to-verify circuits, we observe dramatic changes of memory usage during the course of symbolic simulation. These dramatic changes provide hints for finding the key partitioning points where OBDD size reduction methods should be applied.

2. Motivations and the baseline study

Curve 1 of Figure 4 in our experiments shows the level-by-level total OBDD sizes obtained by simulating a floating point adder whose netlist is levelized. The monolithic OBDD is built from input to output, as in ordinary symbolic simulation. We observe the following aspects:

The floating point adder is a typical example that is hard for a traditional symbolic simulator to simulate [9]. Notice that the OBDD sizes do not increase linearly as the circuit level increases. The size may increase sharply at certain levels even though dynamic variable ordering was enabled.

The points where the OBDD size changes dramatically can be anywhere in the circuit, not necessarily only at the place where symbolic simulation aborts. This may explain the unpredictable OBDD performance of symbolic simulation. To find the source of the problem, it would be good to know the earliest point where the sudden change of OBDD sizes occurs.

Past experience indicates that the points where OBDD sizes change significantly are often located along the boundary between the datapath and the control part of a design. For example, a comparator output from the datapath is often considered a control signal. This output represents the result of merging multiple word-level data and can be a place for the OBDD size to blow up.

Large circuits usually have complex control and datapath logic.
When they converge at the control-datapath interface, it often causes OBDD sizes to change dramatically.

2.1. The optimal partition points inside the circuit

The boundary of the datapath and control part can usually be modeled explicitly or implicitly using the MUX primitives. The MUX inputs and outputs stay on the datapath while the MUX select lines stay on the control. It provides a natural partitioning point at which we can separate the logic. A simple heuristic is to choose all MUX primitives as the space partition points. Unfortunately, it may generate too many trivial subspaces and increase the time complexity of the problem. Thus we need heuristics to find the key points for functional space partition.

1530-1591/04 $20.00 (c) 2004 IEEE

Our baseline study on monitoring the OBDD performance during the symbolic simulation above gives some hints to the solution. The points where the OBDD size changes dramatically have a high influence on the simulation performance. The MUX primitives related to these points will have a higher priority to be selected. We call this method a dynamic heuristic for selecting partitioning points because it is based on the OBDD performance during the course of the simulation. In contrast, static heuristics can be based on the circuit topological structure to determine which MUX primitives are used for partitioning (discussed in section 4.2).

The dynamic functional space partitioning concept, in essence, follows the same principle as the one that makes dynamic variable ordering successful. In both approaches, the adjustment of OBDD size occurs only when a problem has been observed. As an analogy, a fixed initial variable ordering is a static ordering chosen before the symbolic simulation; it relies on the circuit topological structure to estimate the best initial variable ordering. Alternatively, dynamic variable ordering could be more effective at reducing the OBDD sizes, but it is quite time consuming. Hence, the heuristic needs to invoke the dynamic ordering as few times as possible while reducing the total OBDD size as much as possible. This is similar to our dynamic heuristic for choosing MUX primitives to partition the functional space. We want to invoke the partitioning only at the key points while reducing the total OBDD size as much as possible.

3. Basic concept of 2-tuple list representation

[Definition 1: 2-tuple] The simulated result on each signal line a is stored as a 2-tuple of the form $(C_a, D_a)$, where the first element $C_a$ is called a control and the second element $D_a$ is called a data. $(C_a, D_a)$ is read as: node a has the data $D_a$ when the control $C_a$ is true; otherwise the value on node a is unknown (X). $C_a$ and $D_a$ could each be a single variable or a Boolean expression. Intuitively, $D_a$ simulates the results for a site on the datapath while the corresponding $C_a$ tells the combinations of control signals for it to happen.

[Definition 2: 2-tuple list] Initially, each input port a of the circuit is assigned the 2-tuple $(1, D_a)$. The data part is assigned a new variable $D_a$, and the control part is denoted as 1, which represents the whole functional space. During the course of symbolic simulation, the whole functional space in the control domain can be split into several subspaces. The internal wire a is then represented as a list of 2-tuples of the form

$$L_a = \{(C_a^1, D_a^1), \ldots, (C_a^i, D_a^i), \ldots, (C_a^n, D_a^n)\} = \bigcup_{i=1}^{n} (C_a^i, D_a^i) \qquad (1)$$

where n is the number of split subspaces for the wire a. If the circuit has no unknown states, then the union of the control parts in the 2-tuple list is the whole functional space:

$$C_a^1 \vee C_a^2 \vee \cdots \vee C_a^n = \bigvee_{i=1}^{n} C_a^i = 1 \qquad (2)$$

Here we use the symbol $\bigcup$ to represent the concatenation of multiple 2-tuples and $\bigvee$ for multiple Boolean OR operations.

[Definition 3: mutually exclusive 2-tuple list] In the 2-tuple list $L = \{(C^1, D^1), \ldots, (C^n, D^n)\} = \bigcup_{i=1}^{n} (C^i, D^i)$, if $C^i \wedge C^j = \emptyset$ for all $i \neq j$, $1 \le i, j \le n$, each 2-tuple in the list is called mutually exclusive with the other 2-tuples.

[Theorem 1: 2-tuple list merge rule] Let the list $L = \{(C^1, D^1), \ldots, (C^n, D^n)\}$ contain n 2-tuples. We can merge multiple 2-tuples (in the same list) into a single 2-tuple. If L is a mutually exclusive 2-tuple list, the following rule can be applied:

$$\{(C^1, D^1), (C^2, D^2), \ldots, (C^n, D^n)\} = (C^1 \vee C^2 \vee \cdots \vee C^n,\; C^1 D^1 \vee C^2 D^2 \vee \cdots \vee C^n D^n) = \Big(\bigvee_{i=1}^{n} C^i,\; \bigvee_{i=1}^{n} C^i D^i\Big) \qquad (3)$$

Here $C^i D^i$ represents $(C^i \wedge D^i)$.

3.1. The 2-tuple list construction rule

When the signals go through gates such as AND or OR, the output result can be obtained by exploring the data values under the intersection of the control domains from the fanin wires. The following is the algorithm for the 2-input OR gate.

Construction Rule 1: for 2-input OR gate
Input: $L_a = \{(C_a^1, D_a^1), \ldots, (C_a^n, D_a^n)\}$, $L_b = \{(C_b^1, D_b^1), \ldots, (C_b^m, D_b^m)\}$
Output: $L_{out} = \bigcup_{i=1..n,\; j=1..m} (C_a^i \wedge C_b^j,\; D_a^i \vee D_b^j)$

[Proof of construction rule 1] We prove that the above 2-tuple list construction rule evaluates the same function as the ordinary symbolic method. The difference between them is that the function space in the control domain is always 1 in the ordinary symbolic method, while in our construction rule the control space of the input signals has been mutually exclusively partitioned and represented in a 2-tuple list. The 2-tuples in a list can be merged with equation 3, which yields the representation of the ordinary method ($\bigvee_{i} C_a^i = 1$, $\bigvee_{j} C_b^j = 1$). The ordinary symbolic simulation evaluates the OR function as:

$$L_a \vee L_b = \Big(1,\; \bigvee_{i} C_a^i D_a^i\Big) \vee \Big(1,\; \bigvee_{j} C_b^j D_b^j\Big) = \Big(1,\; \bigvee_{i} C_a^i D_a^i \vee \bigvee_{j} C_b^j D_b^j\Big) \qquad (4)$$

Our 2-tuple list construction rule 1 evaluates the function as:

$$L_a \vee L_b = \bigcup_{i=1}^{n} (C_a^i, D_a^i) \vee \bigcup_{j=1}^{m} (C_b^j, D_b^j) = \bigcup_{i,j} (C_a^i C_b^j,\; D_a^i \vee D_b^j) = \Big(\bigvee_{i,j} C_a^i C_b^j,\; \bigvee_{i,j} C_a^i C_b^j (D_a^i \vee D_b^j)\Big) = \Big(1,\; \bigvee_{i} C_a^i D_a^i \vee \bigvee_{j} C_b^j D_b^j\Big) \qquad (5)$$

By applying equations 2 and 3, we derive that, given the same inputs, our construction rule evaluates the same function as the ordinary symbolic simulation in equation 4.

The above construction rule can be extended to other gates such as XOR and AND gates. In the above rule, the variables in the control and data domains are handled independently, thus the control variables do not go into the data domain and vice versa. For the MUX primitive, the variables in the control and data domains can be exchanged with the following construction rule.

Construction Rule 2: for 2:1 MUX gate
Input: $L_a = \{(C_a^1, D_a^1), \ldots, (C_a^n, D_a^n)\}$, $L_b = \{(C_b^1, D_b^1), \ldots, (C_b^m, D_b^m)\}$, $L_c = \{(C_c^1, D_c^1), \ldots, (C_c^p, D_c^p)\}$
Output: $L_{out} = \Big\{ \bigcup_{i=1..n,\; j=1..p} (C_c^j D_c^j C_a^i,\; D_a^i),\; \bigcup_{i=1..m,\; j=1..p} (C_c^j \overline{D_c^j} C_b^i,\; D_b^i) \Big\}$

The 2:1 MUX has two data-input signals ($L_a$ and $L_b$) and one select-input signal ($L_c$). The output signal $L_{out}$ has the value of $L_a$ if the value of the select-input signal $L_c$ is true; otherwise, $L_{out}$ has the value of $L_b$. We note that the variables in the data domain of $L_c$ are moved to the control domain in $L_{out}$. For lack of space, we omit the proof of construction rule 2. A demonstration example on the hidden weighted function has been shown in [6].

3.2. The 2-tuple list in the verification flow

Our symbolic simulator consists of the following three steps: (1) extraction of the MUX primitives from a gate-level circuit, (2) symbolic simulation with the 2-tuple list construction rules, and (3) consistency checking of the output result in the 2-tuple list.

A gate-level netlist can usually be synthesized from its high-level (RTL) model. RTL statements such as if and case are the decision points, and are usually synthesized as MUXes in the low-level circuit. In addition to using the high-level information, MUXes can also be extracted from a low-level circuit where a signal and its negated signal have re-convergent fanout. In our symbolic simulator, we distinguish the MUX primitive from other primitives such as AND/OR gates, because at the MUX primitive the variables in the control and data domains can be interchanged and adjusted in our 2-tuple list representation (according to its construction rule).

When the symbolic simulation finishes, an output signal is represented by a 2-tuple list $L_a = \{(C_a^1, D_a^1), \ldots, (C_a^n, D_a^n)\}$. If we want to verify this result by comparing it with the result obtained by simulating another model (another gate-level model or an assertion), we need to perform consistency checking. In consistency checking, we compare $L_a$ to another list $L_b = \{(C_b^1, D_b^1), \ldots, (C_b^m, D_b^m)\}$. We first check whether the union of their control domains covers the whole functional space: $C_a^1 \vee C_a^2 \vee \cdots \vee C_a^n = C_b^1 \vee C_b^2 \vee \cdots \vee C_b^m = 1$. We further check that, under the intersection of control domains between the 2-tuple lists, their values are the same: $D_a^i = D_b^j$ when $C_a^i \wedge C_b^j \neq \emptyset$, $1 \le i \le n$, $1 \le j \le m$.

4. Heuristics for selecting partition point

As the partitioning is based on the input-select signals of the MUXes, different values of the input-select signals will partition the control space in different ways. When the signals go through AND/OR gates, the control space can be further partitioned by the intersection of the subspaces from the fanin wires (by applying the 2-tuple construction rules).
Although the OBDD size for each subspace becomes smaller, it may take more time to handle all of the subcases if the number of partitioned subspaces is very large. To control the size of the 2-tuple lists, we need to carefully select the MUX primitives for partitioning.

4.1. Remodel the MUX primitives

Instead of partitioning at all MUX primitives in a circuit, our method selects some of the MUXes as the partitioning points, with the objective of controlling both the OBDD size and the simulation run time.

For those MUX primitives which are not chosen for partition, we implicitly remodel the MUX primitives using AND/OR gates (see Figure 1). Hence, instead of applying construction rule 2 to interchange the control and data variables, we apply construction rule 1 for the AND/OR gates so that the output of the MUX keeps the original partitioned subspaces as those given at its fanin wires. In this way, the control space will not be partitioned into too many diverse subspaces.

[Figure 1: Remodel the MUX primitives. (a) MUX used as a partitioning point: Lc=(1,c), La=(1,a), Lb=(1,b) give Lout={ (c,a), (c',b) }. (b) Remodeled MUX: the same inputs give Lout={ (1, (ca) ∨ (c'b)) }.]

4.2. Static heuristic to choose the MUX primitives

Our static heuristic is based on the structure of a circuit [6]. If the logic cones of the data-input signals of a MUX overlap significantly with the logic cones of the select-input signal, the OBDD could have trouble in finding a good ordering. We select such a MUX primitive as a partitioning point to separate these variables into the control and data domains.

4.3. Dynamic heuristic to choose the MUX primitives

The dynamic heuristic is embedded in our 2-tuple list symbolic simulation engine. We levelize a given circuit and symbolically simulate the gates level by level. Initially, all MUX primitives are reset with the mux_nosplit flag. This means that they will be evaluated using construction rule 1. We then use the dynamic heuristic to select some MUX primitives as the partitioning points by setting their mux_split flags. These MUX primitives will be evaluated using construction rule 2.

During the course of constructing the OBDD for the gate output, we record the total OBDD size obtained so far. Once we find that the total OBDD size is beyond a given threshold DT_THRESHOLD, we trace back from this point to search for the MUX partitioning points. The procedure trace_back will mark the mux_split flag for the MUX primitives in the search window. The symbolic simulation then goes back to these marked MUX primitives and re-evaluates them using construction rule 2 to partition the functional space.

[Figure 2: Dynamic heuristic to choose MUX primitives. Trace back to search for MUXes within the fan-in cone inside the search window (back-traced levels 1, 2, 3, ... up to TRACE_LEVEL_LIMIT), starting from the point P where the BDD size increase exceeds DT_THRESHOLD.]

The trace_back procedure searches the MUX primitives backward level by level within the fanin cone from the point where the explosion of OBDD size was observed. The search window is restricted by the parameter LEVEL_LIMIT, and the search stops when the backward traced level goes beyond LEVEL_LIMIT. In our experiments, we set the search window range to 4 levels. The MUX primitives in the search window will be marked with the mux_split flag. Meanwhile, all the other MUX primitives which share the same select-input wire with them will also be marked. Usually, these shared input-select signals of the MUX primitives are the global control variables in a circuit. Identifying these variables ensures that the partitioning can be done systematically across the entire circuit. This usually helps to reduce the sizes of the 2-tuple lists in the simulation [6].

5. Applications and experimental results

All experiments were run on a Pentium 4 1.5 GHz machine with 512 MB of memory.

5.1. Experimental Example I: Floating Point Adder

[Figure 3: FADD implementations. Inputs: exponent (e1), mantissa (f1), exponent (e2), mantissa (f2). Blocks: exponent difference e1-e2, right shifter (adjust mantissa), adder (add mantissa), lead-sign detection, left shifter (denormalize, marked as the key point), exponent adjust. Output: sum.]

5.1.1 The procedure of symbolic simulation

The FP adder is described in a hierarchical manner [7] and synthesized into a flattened netlist. Figure 3 shows how two floating point numbers are added together. Each floating point number is represented in the form of an exponent (e1/e2) and a mantissa (f1/f2). At first, the two exponents e1 and e2 are compared; the difference e1 - e2 is the amount by which to right shift (align) the mantissa of the operand with the smaller exponent. After alignment, the two aligned mantissas are added together to form the sum. The sum is then normalized by left shifting.
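The steps just described (compare exponents, align, add, normalize) can be sketched on plain integers as follows. This is only an illustration under assumptions that are not from the paper: a value is a pair (e, f) meaning f times 2 to the power e, mantissas are signed Python integers with a nominal width of M = 8 bits, alignment simply truncates, and the names fadd and M are hypothetical. The sketch makes visible that both shift amounts are computed from the data itself, which is the kind of datapath-derived control signal discussed in section 2.

```python
M = 8  # nominal mantissa width in bits (illustrative; the experiments use 24)

def fadd(x, y):
    """Add two numbers given as (exponent, signed mantissa) pairs."""
    (e1, f1), (e2, f2) = x, y
    if e2 > e1:                       # make x the operand with the larger exponent
        (e1, f1), (e2, f2) = (e2, f2), (e1, f1)
    f2 >>= e1 - e2                    # align: right shift by the exponent difference
    total = f1 + f2                   # add the aligned mantissas
    if total == 0:
        return (0, 0)
    lead = abs(total).bit_length()    # position of the most significant non-sign bit
    if lead > M:                      # carry out: shift right and raise the exponent
        return (e1 + (lead - M), total >> (lead - M))
    shift = M - lead                  # normalize: left shift amount depends on the data
    return (e1 - shift, total * 2 ** shift)
```

For example, adding (0, 128) and (0, -127) leaves a sum of 1 before normalization, so the left shifter has to move it by 7 positions; that data-dependent amount is what the normalization stage must compute.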

In our experiments, we first symbolically simulate the circuit without using the 2-tuple list partition. We levelize the circuit and monitor the total OBDD size at each level. As shown in Figure 4 (curve 1), the total OBDD size increases exponentially even with dynamic ordering enabled. In fact, the symbolic simulation can abort when it reaches the resource limitation at the last level if the number of exponent bits is large enough. At level 142 in the circuit, the significant increase of OBDD size corresponds to the place where normalization of the sum result is done (marked as key point in Figure 3). The amount of left shift in the shifter depends on the most significant non-sign bit of the input data which is to be shifted.

[Figure 4: Total OBDD size for simulating FADD (fadd_e5_m24) with exponent (5 bits) and mantissa (24 bits). Vertical axis: total BDD size (0 to 7 x 10^6); horizontal axis: levels in circuits (0 to 180). Curve 1: ordinary method. Curve 2: 2-tuple list method with dynamic partition. The key point is marked.]

Figure 4 (curve 2) shows the total OBDD size at each level using our 2-tuple list method with the dynamic heuristic for selecting the partition points. We notice that the curve becomes flat as it reaches the primary outputs (large level numbers). The dynamic heuristic successfully found the MUX primitives which influence the key points of Figure 3. The curves indicate that our method has effectively decomposed the functional space to avoid the OBDD blow-up problem.

Table 1 summarizes the run time and OBDD size of each method. We did the experiments on floating point adders with a 24-bit mantissa and exponents of different widths (from 3 to 7 bits).
Table 1: Run time and OBDD size comparison results

Circuits        Run time(s)        Total OBDD size        Max split subspaces
                -ord     -tp       -ord       -tp         -tp(dynamic)   -tp(static)
fadd_e3_m24*    288s     154s      969878     442255      18             137
fadd_e4_m24     5129s    1434s     6079878    2001076     27             185
fadd_e5_m24     7931s    1779s     6358884    2145178     27             257
fadd_e6_m24     abort    1984s     abort      2825484     27             313
fadd_e7_m24     abort    2714s     abort      2994318     27             345

-ord: ordinary method, -tp: our 2-tuple list method with dynamic partition
fadd_e3_m24* is with 3 bits exponent and 24 bits mantissa

Table 2 compares the run time and OBDD size with different partitioning heuristics. The Max OBDD size in the table is the maximum number of OBDD nodes used to represent the function of a signal. The total OBDD size is the total number of OBDD nodes allocated. The all-split heuristic selects all MUXes for partitioning. Figure 5 shows the maximum number of subspaces split by the 2-tuple list at each level during the simulation.

First, with the all-split heuristic, although the OBDD size might be reduced considerably, the 2-tuple list size increases dramatically (the upper curve in Figure 5). In this case, each signal could have a large 2-tuple list to be processed by the simulator. As a result, the run time can be slow. The static heuristic for partitioning, as explained before, is based on the circuit structure. Only the MUX primitives with overlapping logic cones are chosen. The number of partitioned subspaces is reduced but could still grow dramatically. The dynamic heuristic only chooses the MUX primitives which could greatly influence the OBDD performance. As a result, it can limit the number of partition points and, at the same time, reduce the total OBDD size.
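The dynamic selection compared above relies on the trace_back search of section 4.3. The following is a minimal sketch of that search, assuming a hypothetical netlist encoding that is not taken from the paper: a dict mapping each gate name to its type, its fanin list and, for a MUX, its select wire. Names missing from the dict are treated as primary inputs. The depth handling is one reading of the LEVEL_LIMIT window, in which gates up to LEVEL_LIMIT levels behind the blow-up point are examined.

```python
from collections import deque

LEVEL_LIMIT = 4  # search window depth reported for the experiments

def trace_back(netlist, start, level_limit=LEVEL_LIMIT):
    """Return the MUX gates to flag as mux_split, searching backward from `start`."""
    marked = set()
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        gate = netlist.get(node)
        if gate is None:                 # primary input: nothing to expand
            continue
        if gate["type"] == "MUX":
            marked.add(node)
        if level_limit > depth:          # stay inside the search window
            for src in gate["fanin"]:
                if src not in seen:
                    seen.add(src)
                    queue.append((src, depth + 1))
    # Also mark every MUX that shares a select wire with a marked MUX.
    selects = {netlist[m]["select"] for m in marked}
    for name, gate in netlist.items():
        if gate["type"] == "MUX" and gate["select"] in selects:
            marked.add(name)
    return marked
```

A simulator loop would call such a function when the recorded total OBDD size exceeds DT_THRESHOLD and then re-evaluate the returned MUXes with construction rule 2; that outer loop is omitted here.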
Table 2: Comparison of partition heuristics for fadd_e3_m24

Heuristic    Run time(s)   Total OBDD size   Max OBDD size   MUXs for partition
all split    337s          410303            8415            528
static       209s          445441            8415            399
dynamic      154s          442252            48522           97

[Figure 5: The number of subspaces split at each level for fadd_e3_m24 by different heuristics. Vertical axis: max split subspaces (0 to 140); horizontal axis: levels in circuits (0 to 180). Curves: all-split heuristic, static heuristic, dynamic heuristic.]

5.1.2 The procedure of consistency checking

When the symbolic simulation finishes, consistency checking needs to be performed to compare the output signal with that of another model. Due to the different partition strategies and circuit implementations, the 2-tuple list representation of the output signal in each circuit model can be different. One method for equivalence checking is to use the merge rule (Theorem 1) to convert the 2-tuple list into one monolithic OBDD. In some cases, the final merge of the 2-tuple list at the output can avoid the intermediate OBDD peak size in the middle of the simulation, compared with the ordinary symbolic simulation which builds the monolithic OBDD for every internal signal. For complex circuits, the output signal could be too complex to be represented by a monolithic OBDD. Hence, we keep the output signal in 2-tuple list format and use the method proposed in section 3.2 for equivalence checking. In our experiments, the implementation of the circuit has been modified to serve as the revised model. Table 3 shows the results of equivalence checking using our 2-tuple list method and the ordinary method respectively.

Table 3: Run time and OBDD size for equivalence checking

Circuits        -tp vs. -tp                -ord vs. -ord
                Time     Total OBDD size   Time      Total OBDD size
fadd_e2_m24*    47s      80738             78s       219084
fadd_e3_m24     495s     554946            1024s     2391480
fadd_e4_m24     4007s    3199882           13592s    14527730

-ord: ordinary method, -tp: our 2-tuple list method with dynamic partition
fadd_e2_m24* is with 2 bits exponent and 24 bits mantissa

5.2. Experimental Example II: Memory Management Unit

Another application is to verify the memory management unit (MMU) in high-performance microprocessors. The MMU consists of two on-chip content addressable memory blocks (the BAT and TLB blocks) to support virtual memory address translation. The BAT block contains four entries with the tag T[i] and data D[i] (i ∈ [0, 3]) in each entry. If one tag T[i] matches the input effective address ea, then match[i] = 1 and the corresponding data D[i] in this entry is placed at the output of the MMU. The control switches behave similarly to a MUX select line. For the bus structures, the 2-tuple list can be used to partition the functional space based on the control switches. We can see that with our 2-tuple list representation, the partitioning point is not strictly limited to MUX primitives; it can be any place where the concept of a select control signal applies.

The MMU example used in our experiments contains practical custom-designed modules at the transistor level. An ordinary symbolic simulator could not handle the MMU due to the interactions between the TLB and the BAT modules [8].
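To illustrate why the BAT match lines act like MUX selects, the sketch below builds the 2-tuple list {(match[i], D[i])} for a concrete effective address. The entry contents and the names bat_two_tuples and bat_output are hypothetical, and a symbolic simulator would hold symbolic tags and data here rather than constants.

```python
def bat_two_tuples(entries, ea):
    """Pair each entry's match line (the control) with its data D[i]."""
    return [(tag == ea, data) for tag, data in entries]

def bat_output(two_tuples):
    """Data of the matching entry, or None when no tag matches (unknown X)."""
    hits = [data for match, data in two_tuples if match]
    return hits[0] if hits else None

# Four entries (T[i], D[i]), i in 0..3, with made-up contents.
entries = [(0x10, "D0"), (0x20, "D1"), (0x30, "D2"), (0x40, "D3")]
```

With distinct tags, at most one match line is true, so the controls of this list are mutually exclusive in the sense of Definition 3.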
The mixed-level nature of the MMU design (gates and transistors) adds another dimension of difficulty for symbolic simulation. However, a transistor can be modeled as a latch while the latch enable signal serves as a control signal. Thus the 2-tuple list symbolic simulator can be applied smoothly to the mixed-level MMU design. In our experiments, we initialize the values of the memory cells with symbols. Then, symbolic simulation is carried out on the MMU. Table 4 shows the results with our 2-tuple list simulator. Note that an ordinary symbolic simulator could not simulate this design without encountering OBDD size blow-up.

Table 4: Comparison of time and OBDD for MMU blocks

Circuits   Run time(s)   Total OBDD size   Max OBDD size   Split subspaces
MMU        302s          265429            12657           67
BAT        173s          196085            7927            16
TLB        100s          39436             1054            35
SEG        3s            10897             1595            16

There are many other applications to which our proposed 2-tuple list partition method can be applied. In a microprogram controller, the instruction is encoded to generate control signals and the multiplexer selects data from different resources [7]. The greatest common divisor (GCD) is another example in the arithmetic unit [10]. Our next goal is to extend our partition method to sequential circuits.

6. Conclusion

In this paper, we present a functional-space partitioning method based on the construction of 2-tuple lists in the symbolic simulation. The partitioning is done dynamically by selecting MUXes (or control points) using the proposed heuristics. The dynamic heuristic monitors the OBDD performance during the symbolic simulation in order to identify the key points where partitioning of the functional space can greatly improve the global OBDD performance. We demonstrate the effectiveness of our simulator by experiments on two known circuit examples, the floating point adder and the memory management unit, both of which were previously shown to be difficult for an ordinary symbolic simulator to handle.

References

[1] R. E. Bryant. Symbolic Boolean Manipulation with Ordered Binary-Decision Diagrams. ACM Computing Surveys, 24(3):293-318, 1992.
[2] Alan J. Hu. Formal hardware verification with BDDs: An introduction. IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, 1997.
[3] Mark D. Aagaard, Robert B. Jones, Carl-Johan H. Seger. Formal Verification Using Parametric Representations of Boolean Constraints. In 36th ACM/IEEE Design Automation Conference, 1999.
[4] Amit Narayan, Jawahar Jain, Fujita, Sangiovanni. Partitioned ROBDDs - A Compact, Canonical and Efficiently Manipulable Representation for Boolean Functions. In ACM/IEEE Int. Conference on Computer-Aided Design, 1996.
[5] R. E. Bryant. On the Complexity of VLSI implementations and graph representations of Boolean functions with application to integer multiplication. In IEEE Trans. on Computer, 1991.
[6] T. Feng, Li-C. Wang, Kwang-Ting Cheng. Improved Symbolic Simulation By Functional Space Decomposition. In Asia and South Pacific Design Automation Conference, 2004.
[7] K. C. Chang. Digital Systems Design with VHDL and Synthesis, An integrated approach. In IEEE Computer Society Press, 1999.
[8] T. Feng, Li-C. Wang, Kwang-Ting Cheng, etc. Enhanced Symbolic Simulation for Efficient Verification of Embedded Array Systems. In Asia and South Pacific Design Automation Conference, 2003.
[9] Yirng-An Chen, Randal E. Bryant. Verification of Floating Point Adders. In Proceeding of International Conference of Computer Aided Verification, 1998.
[10] D. J. Smith. Practical Modeling Examples - HDL Chip Design. In Doone Publications, Chapter 12, 1996.