Space- and Time-Efficient BDD Construction via Working Set Control


Bwolen Yang, Yirng-An Chen, Randal E. Bryant, David R. O'Hallaron
Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA

Abstract

Binary decision diagrams (BDDs) have been shown to be a powerful tool in formal verification. Efficient BDD construction techniques become more important as the complexity of protocol and circuit designs increases. This paper addresses this issue by introducing three techniques based on working set control. First, we introduce a novel BDD construction algorithm based on partial breadth-first expansion. This approach has the good memory locality of the breadth-first BDD construction while maintaining the low memory overhead of the depth-first approach. Second, we describe how memory management on a per-variable basis can improve spatial locality of BDD construction at all levels, including expansion, reduction, and rehashing. Finally, we introduce a memory compacting garbage collection algorithm to remove unreachable BDD nodes and minimize memory fragmentation. Experimental results show that when the applications fit in physical memory, our approach has speedups of up to 1.6 in comparison to both depth-first (CUDD) and breadth-first (CAL) packages. When the applications do not fit into physical memory, our approach outperforms both CUDD and CAL by up to an order of magnitude. Furthermore, the good memory locality and low memory overhead of this approach have enabled us to be the first to have successfully constructed the entire C6288 multiplication circuit from the ISCAS85 benchmark set using only conventional BDD representations.

I. INTRODUCTION

With the increasing complexity of protocol and circuit designs, formal verification has become an important research area. Binary decision diagrams (BDDs) have been shown to be a powerful tool in formal verification [4]. Even though many functions have compact BDD representations, some functions can have very large BDDs.
For example, BDD representations for integer multiplication have been shown to be exponential in the number of input bits [5]. To address this issue, there are many BDD-related research efforts directed towards reducing the size of the graph with techniques like new compact representations for specific classes of functions (KFDD [9] and *BMD [6]), divide-and-conquer (POBDD [11] and ACV [7]), function abstraction (abdd [12]), and variable reordering [19]. Despite these efforts, large graphs can still naturally arise for more irregular functions or for incorrect implementations of a specification. An incorrect implementation can break the structure of a function and thus can greatly increase the graph size. For example, the *BMD representation for integer multiplication is linear. However, a mistake in the implementation of integer multiplication logic can cause an exponential explosion of the resulting graph. The ability to handle large graphs efficiently can enable us to represent more irregular functions and to provide counterexamples for incorrect implementations.

Footnote: Effort sponsored in part by the Advanced Research Projects Agency and Rome Laboratory, Air Force Materiel Command, USAF, under agreement number F [...], in part by the National Science Foundation under Grant CMS [...], and in part by a grant from the Intel Corporation. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Advanced Research Projects Agency, Rome Laboratory, or the U.S. Government. Supported in part by the Defense Advanced Research Project Agency (DARPA) under contract number DABT63-96-C

Conventional BDD algorithms [2] are based on depth-first traversal of BDD graphs. This approach has small memory overhead, but poor memory locality.
To address the issue of constructing large BDDs efficiently, there have been many implementations [14, 15, 1, 10, 18] based on breadth-first traversal. The breadth-first approach, which exploits its graph traversal pattern by using specialized memory layouts, has better memory access locality and thus often has better performance. However, the breadth-first approach can have a large memory overhead, up to quadratic in the size of BDD operands. This extra memory overhead can result in an increased number of page faults and thus poor performance.

To maintain memory access locality with low memory overhead, we introduce a new algorithm based on partial breadth-first expansion. This algorithm improves locality of reference by controlling the working set size and thus reducing overhead due to page faults. We describe how memory management on a per-variable basis can improve spatial locality of BDD construction at all levels, including expansion, reduction, and rehashing. Finally, we introduce a breadth-first BDD garbage collection algorithm which performs memory compaction without incurring additional memory overhead. All of these techniques work together to control the working set size and have a significant impact on the performance of BDD construction. As these techniques exploit inherent properties of BDD construction, graph reduction techniques (like *BMD, POBDD, and dynamic variable reordering) can be incorporated into our algorithms to further expand the usefulness of these algorithms.

Experimental results show that when the applications fit in physical memory, our approach has speedups of up to 1.6 in comparison to leading depth-first (CUDD) and breadth-first (CAL) packages. When the applications do not fit into physical memory, our algorithm outperforms both CUDD and CAL by up to an order of magnitude. Furthermore, to demonstrate how our techniques can efficiently build very large graphs, we constructed the output BDDs for the C6288 multiplication circuit from the ISCAS85 benchmark. To the best of our knowledge, this has never been done before.

Beyond the sequential world, another advantage of the partial breadth-first algorithm is that it can be parallelized [22]. This approach achieves speedups of up to four on eight processors of a shared memory system.

The rest of this paper is organized as follows. Section II gives an overview of BDDs and how they are constructed. Section III describes the partial breadth-first algorithm and other techniques for controlling the working set size. Section IV presents a performance evaluation of our implementation. Section V demonstrates the usefulness of this implementation by constructing very large BDDs for 16-bit array multipliers. Finally, Section VII summarizes this paper and offers some concluding remarks.

II. BDD OVERVIEW

A boolean expression can be represented by a complete binary tree called a binary decision tree, which is based on the expression's truth table. Fig. 1(a) shows the truth table for a boolean expression and Fig. 1(b) shows the corresponding binary decision tree. Each internal vertex is labeled by a variable and has edges directed toward two children: the 0-branch (shown as a dashed line) corresponds to the case where the variable is assigned 0, and the 1-branch (shown as a solid line) corresponds to the case where the variable is assigned 1. Each leaf node is labeled 0 or 1.
Each path from the root to a leaf node corresponds to a truth table entry where the value of the leaf node is the value of the function and the path corresponds to the assignment of the boolean variables.

Fig. 1. A boolean expression represented with (a) truth table, (b) binary decision tree, (c) binary decision diagram. The dashed edges are 0-branches and the solid edges are 1-branches.

A binary decision diagram (BDD) is a directed acyclic graph (DAG) representation of a binary decision tree where equivalent boolean subexpressions are uniquely represented. Fig. 1(c) shows the BDD representation of the binary decision tree in Fig. 1(b). Since all subexpressions in a BDD are uniquely represented, a BDD can be exponentially more compact than its corresponding truth table or binary decision tree representations. One necessary condition for guaranteeing uniqueness of the BDD representation is that all the BDDs constructed must follow the same variable ordering; i.e., for any two variables $x$ and $y$, if $x$ has higher precedence than $y$ ($x \prec y$), then for any path that contains both $x$ and $y$, $x$ must appear before $y$ on this path. Note that the BDD size can be very sensitive to the variable ordering, where the graph size of one ordering can be exponentially more compact than the graph size of another ordering.

Before describing the basis for BDD construction, we will first introduce some terminology and notation. $f|_{x \leftarrow 0}$ and $f|_{x \leftarrow 1}$ are cofactor functions of the function $f$ with respect to the boolean variable $x$, where $f|_{x \leftarrow 0}$ is equal to $f$ with the value of $x$ set to 0, and $f|_{x \leftarrow 1}$ is equal to $f$ with the value of $x$ set to 1. A reachable subgraph of a node $v$ is defined to be all the nodes that can be reached from $v$ by traversing 0 or more directed edges. BDD nodes are defined to be internal vertices of BDDs. Given a BDD $f$, the function represented by $f$ is recursively defined by

$$f = \bar{x} \cdot f|_{x \leftarrow 0} + x \cdot f|_{x \leftarrow 1} \qquad (1)$$

where $x$ is the variable corresponding to $f$'s root node and the cofactor function $f|_{x \leftarrow 0}$ is recursively defined by the reachable subgraph of $f$'s 0-branch child. Similarly, $f|_{x \leftarrow 1}$ is recursively defined by the reachable subgraph of $f$'s 1-branch child.

A. Basis for BDD Construction

BDD construction is a memoization-based dynamic programming algorithm. Due to the large number of distinct subproblems, instead of a memoization table, a cache known as the computed cache is used to record the result of each subproblem. Given a variable ordering and two BDDs $f$ and $g$, the resulting BDD $r$ of a boolean operation $f \;\mathrm{op}\; g$ is constructed based on the Shannon expansion

$$r = f \;\mathrm{op}\; g = \bar{\tau} \cdot (f|_{\tau \leftarrow 0} \;\mathrm{op}\; g|_{\tau \leftarrow 0}) + \tau \cdot (f|_{\tau \leftarrow 1} \;\mathrm{op}\; g|_{\tau \leftarrow 1}) \qquad (2)$$

where $\tau$ is the variable (top variable) with the highest precedence among all the variables in $f$ and $g$, and $f|_{\tau \leftarrow 0}$, $f|_{\tau \leftarrow 1}$, $g|_{\tau \leftarrow 0}$, and $g|_{\tau \leftarrow 1}$ are the corresponding cofactor functions of $f$ and $g$. In the top-down expansion phase, this Shannon expansion process repeats recursively following the given variable ordering for all the boolean variables in $f$ and $g$. The base case (also called the terminal case) of this recursive process is when the operation can be trivially evaluated. For example, the boolean operation $f \wedge f$ is a terminal case because it can be trivially evaluated to $f$. Similarly, $f \wedge 0$ is also a terminal case. At the end of the expansion phase, there may be unreduced subexpressions like $\bar{x} \cdot h + x \cdot h$. Thus, in order to ensure uniqueness, a bottom-up reduction phase is necessary to reduce expressions like $\bar{x} \cdot h + x \cdot h$ to $h$. This reduction phase also needs to ensure that each BDD node created is unique.

Fig. 2 illustrates the Shannon expansion (Equation 2) for the operation $r = f \;\mathrm{op}\; g$. On the left side of this figure, the operation is represented with an operator node which refers to BDD representations of $f$ and $g$ as operands. The right side of this figure shows the Shannon expansion of this operation with respect to the variable $\tau$. Further expansion of operator nodes can be performed in any order. In particular, the depth-first construction always expands the operator node with the greatest depth. Note that the depth-first algorithm does not explicitly store the operations as operator nodes. Instead, the operation is implicitly stored in the stack as arguments to the recursive calls. In the breadth-first construction, the Shannon expansion is performed top-down from the variables with the highest to the lowest precedence so that operations with the same top variable are expanded together. The reduction phase is performed bottom-up in reverse order. Thus, all operations with the same top variable are reduced at the same time.

Fig. 2. Shannon expansion. The dashed edge represents the 0-branch of a variable and the thick solid edge represents the 1-branch.

For the rest of this paper, we will refer to boolean operations issued by a user of a BDD package as the top level operations to distinguish them from operations generated internally by the Shannon expansion process.

B. Memory Overhead and Access Locality

BDD construction is often memory intensive, especially when large graphs are involved. It not only requires a lot of memory, it also requires frequent accesses to many small data structures (the node size is typically 16 bytes on 32-bit machines). The depth-first BDD construction has poor memory behavior because of irregular control flow and memory access patterns.
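As a rough illustration of the depth-first construction described in Section II.A, the following Python sketch applies the Shannon expansion recursively with a computed cache and a unique table. The names (`bdd_apply`, `make_node`, `variable`), the integer node ids, and the convention that a smaller variable index means higher precedence are assumptions of this sketch, not details taken from any of the packages discussed. For brevity it only treats operations on two terminals as terminal cases.

```python
from typing import Dict, List, Optional, Tuple

INF = float("inf")

# Node ids 0 and 1 are the terminals; every other id indexes (var, low, high).
# A smaller variable index means higher precedence (an assumption of this sketch).
nodes: List[Optional[Tuple[int, int, int]]] = [None, None]
unique_table: Dict[Tuple[int, int, int], int] = {}
computed_cache: Dict[Tuple[str, int, int], int] = {}

OPS = {"and": lambda a, b: a & b, "or": lambda a, b: a | b, "xor": lambda a, b: a ^ b}


def make_node(var: int, low: int, high: int) -> int:
    """Reduction: drop a redundant test and keep each (var, low, high) unique."""
    if low == high:
        return low
    key = (var, low, high)
    if key not in unique_table:
        nodes.append(key)
        unique_table[key] = len(nodes) - 1
    return unique_table[key]


def variable(i: int) -> int:
    """BDD for the single variable with index i."""
    return make_node(i, 0, 1)


def top_var(f: int):
    return nodes[f][0] if f > 1 else INF


def cofactor(f: int, var: int, value: int) -> int:
    """f with `var` set to `value`; f is unchanged if its root tests another variable."""
    if f <= 1 or nodes[f][0] != var:
        return f
    return nodes[f][2] if value else nodes[f][1]


def bdd_apply(op: str, f: int, g: int) -> int:
    if f <= 1 and g <= 1:                      # terminal case
        return OPS[op](f, g)
    key = (op, f, g)
    if key in computed_cache:                  # subproblem already solved
        return computed_cache[key]
    tau = min(top_var(f), top_var(g))          # top variable of f and g
    r0 = bdd_apply(op, cofactor(f, tau, 0), cofactor(g, tau, 0))
    r1 = bdd_apply(op, cofactor(f, tau, 1), cofactor(g, tau, 1))
    result = make_node(tau, r0, r1)            # bottom-up reduction
    computed_cache[key] = result
    return result
```

Because every node is interned through the unique table, equivalent functions end up with the same id, so for example `bdd_apply("or", a, bdd_apply("and", a, b))` returns the id of `a` itself.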
The control flow is irregular because the recursive expansion can terminate at any time when a terminal case is detected or when the operation is cached in the computed cache. The memory access pattern is irregular because a BDD node can be accessed due to expansion on any of its many parents; and, since the BDD is traversed in the depth-first manner, expansions on the parents are scattered in time. The performance impact of the depth-first algorithm's poor memory locality is especially severe for BDDs larger than the physical memory.

Recently, there has been much interest in BDD construction based on breadth-first traversal [14, 15, 1, 10, 18]. In a breadth-first traversal, the expansion phase expands operations one variable at a time with all the operations of the same variable expanded together. Furthermore, during the reduction phase, all the new BDD nodes of the same variable are created together. The breadth-first construction exploits this structured access by clustering nodes (for both BDD and operator nodes) of the same variable together in memory with specialized node managers.

Despite its better memory locality, the breadth-first construction has much larger memory overhead in comparison to the depth-first construction. The number of operations that the depth-first construction keeps track of at any given time is the depth of the recursion, which is at most the number of variables. Since the number of variables is typically small, the depth-first construction does not require much memory to store these operations. In contrast, for each top level operation, the breadth-first construction will keep all operations generated by Shannon expansion of this top level operation until the result for this top level operation is constructed. Since the number of operations can be quadratic in the size of the BDD operands, the breadth-first approach can incur a large memory overhead.
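The pure breadth-first construction described above can be sketched in a few lines of Python for the AND operation. This is a simplified illustration under stated assumptions: the names (`bf_and`, `OpNode`, `make_node`, `variable`), the integer node ids, and the convention that a smaller variable index means higher precedence are all hypothetical, and plain Python lists stand in for the specialized node managers.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Node ids 0 and 1 are the terminals; every other id indexes (var, low, high).
# A smaller variable index means higher precedence (an assumption of this sketch).
nodes: List[Optional[Tuple[int, int, int]]] = [None, None]
unique_table: Dict[Tuple[int, int, int], int] = {}


def make_node(var: int, low: int, high: int) -> int:
    if low == high:
        return low
    key = (var, low, high)
    if key not in unique_table:
        nodes.append(key)
        unique_table[key] = len(nodes) - 1
    return unique_table[key]


def variable(i: int) -> int:
    return make_node(i, 0, 1)


def cofactor(f: int, var: int, value: int) -> int:
    if f <= 1 or nodes[f][0] != var:
        return f
    return nodes[f][2] if value else nodes[f][1]


class OpNode:
    """A pending AND operation; branch0/branch1 hold a BDD id or another OpNode."""
    def __init__(self, f: int, g: int) -> None:
        self.f, self.g = f, g
        self.branch0 = self.branch1 = self.result = None


def bf_and(f: int, g: int) -> int:
    queues = defaultdict(list)   # top variable -> operator nodes to expand/reduce
    cache = {}                   # complete cache: every operator node is kept

    def preprocess(f: int, g: int):
        if f == 0 or g == 0:     # terminal cases of AND
            return 0
        if f == 1:
            return g
        if g == 1 or f == g:
            return f
        if (f, g) not in cache:
            cache[(f, g)] = OpNode(f, g)
            queues[min(nodes[f][0], nodes[g][0])].append(cache[(f, g)])
        return cache[(f, g)]

    root = preprocess(f, g)
    if not isinstance(root, OpNode):
        return root
    num_vars = max(n[0] for n in nodes[2:]) + 1
    for v in range(num_vars):                    # expansion: top-down
        for op in queues[v]:
            op.branch0 = preprocess(cofactor(op.f, v, 0), cofactor(op.g, v, 0))
            op.branch1 = preprocess(cofactor(op.f, v, 1), cofactor(op.g, v, 1))

    def value(x):
        return x.result if isinstance(x, OpNode) else x

    for v in reversed(range(num_vars)):          # reduction: bottom-up
        for op in queues[v]:
            op.result = make_node(v, value(op.branch0), value(op.branch1))
    return root.result
```

Note that `cache` and `queues` retain every operator node until `bf_and` returns, which is the memory overhead the text attributes to the breadth-first approach.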
Thus, on some applications where the depth-first construction fits in physical memory while the breadth-first construction does not, the performance of the breadth-first construction can be significantly worse due to page faults.

III. OUR APPROACH TO BDD CONSTRUCTION

Since BDD construction involves a large number of accesses to many small data structures, localizing the memory access pattern to bound the working set size is critical because good memory access locality results in good hardware cache locality and fewer page faults. This section introduces three techniques to control the working set size by limiting memory overhead and by improving both temporal and spatial locality. These are followed by a brief discussion on how these techniques can work together with variable reordering algorithms.

A. Partial Breadth-First Construction

For the pure breadth-first construction (which normally has good memory locality), if the BDD operands do not fit in physical memory, then the pages of operator nodes swapped in during the expansion phase will be swapped out by the time the reduction phase takes place. Furthermore, as described in Section II.B, breadth-first construction can incur a large memory overhead.

To overcome these drawbacks while bounding the memory overhead, we introduce partial breadth-first expansion based on context switching. Within each evaluation context, the breadth-first expansion is used until a fixed evaluation threshold is reached. Upon reaching this threshold, the current context is pushed onto a context stack and a new child context is started. The remaining operations of the parent context are partitioned into smaller groups and the child context evaluates these operations one group at a time. This process repeats each time the current evaluation context reaches its threshold. By keeping the evaluation threshold to be a small fraction of the available physical memory, we can bound the number of BDD nodes and compute cache nodes created and accessed and thus control the working set size. Note that by setting the evaluation threshold to 1, this algorithm degenerates to depth-first construction. Similarly, by setting the evaluation threshold to $\infty$, this algorithm is identical to pure breadth-first construction.

Fig. 3(a) shows an example of a context switch. In this figure, the top triangle denotes the graph of the initial expansion. Upon reaching the evaluation threshold, the remaining unexpanded operations are divided into two partitions (shown as two dashed rectangles) and the new child context is started. This new child context continues to expand on the first partition. After the child context finishes building BDD results for the first partition, it continues to expand on the second partition as shown in Fig. 3(b). Note that expansion of these two partitions might share some operations in common. For these common operations, the expansion of the second partition can benefit from the results computed from the expansion of the first partition via the compute cache. However, since the compute cache is not a complete cache, some common operations may need to be recomputed. This figure also depicts how the partial breadth-first construction can reduce memory overhead. The operator nodes created from expanding the first partition do not need to be kept during the expansion of the second partition. In comparison, the pure breadth-first construction (shown in Fig. 3(c)) needs to keep all the operator nodes until after the reduction phase.

Fig. 3. A context switch example. (a) Context switch and expanding the first partition: upon reaching the evaluation threshold, current unexpanded operations are divided into two partitions (shown as two dashed rectangles) and the new child context continues to expand on the first partition. (b) Expanding the second partition: after the reduction for the first partition, this child context expands on the second partition. (c) No context switch: pure breadth-first expansion is shown for comparison.

Other than the memory locality and the memory overhead, the evaluation threshold can also impact the effectiveness of the compute cache. In the pure breadth-first traversal, the expanded operator nodes must be kept until after the reduction phase. This feature effectively results in a complete cache within an expansion phase. Similarly for the partial breadth-first approach, expansion within each evaluation context maintains a complete cache. Thus, a larger evaluation threshold results in a larger and more complete cache for the current evaluation context at the cost of higher memory overhead.

The rest of this section formally describes this partial breadth-first algorithm. Fig. 4 shows the top level procedure pbf-op() and a helper function preprocess-op() for this partial breadth-first construction. For each variable, there is an expansion queue and a reduction queue. An expansion queue queues the operations of the same variable to be Shannon expanded during the expansion phase. A reduction queue queues the operations of the same variable to be reduced in the reduction phase. The top level procedure pbf-op() builds the result BDD by repeatedly doing the Shannon expansion (line 3) and reduction (line 4) until there are no more operations in the top context (lines 5 to 8) and until there are no more evaluation contexts on the context stack (lines 9 to 11). Procedure preprocess-op() first determines whether or not the operation is a terminal case or is cached (lines 13 to 15). If not, this operation is added to its top variable's expansion queue (lines 17 and 18) to indicate that further Shannon expansion is necessary for this operation.
This operation is also inserted into the compute cache (line 19) to avoid expanding redundant operations in the future. This procedure returns either the BDD result (for the terminal case and for the case when the cached result is a BDD) or an operator node. If an operator node is returned, this operator node's field opnode.result will contain the result BDD after this operator node is processed in the reduction phase.

    pbf-op(op, f, g)
     1  opnode ← preprocess-op(op, f, g)
     2  if opnode is a BDD node, return opnode.
     3  call expansion()
     4  call reduction()
     5  if top context of the context stack has operations, then
     6      take a group of operations from the top context
     7      add each operation to its top variable's expansion queue
     8      goto line 3 and repeat until top context is empty
     9  if context stack is not empty,
    10      pop the top context and use it as the current context
    11      goto line 3 and repeat until context stack is empty
    12  return opnode.result

    preprocess-op(op, f, g)
    13  if terminal case, return simplified result
    14  if the operation (op, f, g) is in compute cache,
    15      return result found in cache
    16  opnode ← (op, f, g)
    17  τ ← top variable of f and g
    18  add opnode to τ's expansion queue
    19  insert opnode into the compute cache
    20  return opnode

Fig. 4. Partial Breadth-First Construction: top level procedure and a helper function.

Fig. 5 shows the expansion phase. This top-down expansion phase processes operations queued from the variable with the highest to the lowest precedence. Here, all the operations of the same variable are Shannon expanded together (lines 3 to 7). The branch0 and the branch1 fields of an operator node are used to store the results of Shannon expansion, and as described earlier, these results returned by the procedure preprocess-op() can be either a BDD node or an operator node. In the latter case, the procedure preprocess-op() would have queued the new operator nodes to be processed by the expansion phase later. The variable nopsprocessed is used to track the size of the current evaluation context, and when it exceeds a constant evaluation threshold evalthreshold, the current context is pushed onto the context stack and a new child context is started (lines 9 to 13).

    expansion()
     1  nopsprocessed ← 0
     2  for each variable x in the current evaluation context from the highest to lowest precedence
     3      for each node opnode in x's expansion queue
     4          (op, f, g) ← opnode
     5          opnode.branch0 ← preprocess-op(op, f|x←0, g|x←0)
     6          opnode.branch1 ← preprocess-op(op, f|x←1, g|x←1)
     7          add opnode to variable x's reduce queue
     8          nopsprocessed++
     9          if (nopsprocessed > evalthreshold)
    10              partition the remaining operators into small groups.
    11              push current context with these operation groups onto the context stack
    12              start a new evaluation context
    13  return

Fig. 5. Partial Breadth-First Construction: expansion phase.

Fig. 6 shows the reduction phase. This bottom-up reduction algorithm is the same as the pure breadth-first construction's reduction phase, where Shannon expanded operations are processed together one variable at a time, starting from the variable with the lowest precedence and moving upwards to the variables with the highest precedence. The results from the children are obtained in lines 4 to 11. Lines 12 to 19 perform the reduction and ensure the result is unique. The result of a reduction is stored in the opnode.result field of an operator node (lines 13 and 19).

B. Memory Management

As in breadth-first BDD algorithms, specialized node managers are the key factors in exploiting structured access in the partial breadth-first approach. In our implementation, each variable is associated with a BDD-node manager as in the breadth-first algorithm of [18]. Each variable's BDD-node manager clusters BDD nodes of the same variable by allocating memory in terms of blocks and allocates BDD nodes contiguously within each block.
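The clustering performed by a per-variable BDD-node manager can be pictured with the following minimal Python sketch. Everything in it is an assumption made for illustration: the class and function names, the tiny `BLOCK_SIZE`, and the use of Python lists in place of raw memory blocks. The point is only that nodes of one variable are appended contiguously within fixed-size blocks, so that walking the blocks visits all nodes of that variable in allocation order.

```python
from typing import Dict, Iterator, List, Tuple

BLOCK_SIZE = 4  # nodes per block; deliberately tiny so the example spans blocks


class NodeManager:
    """Holds all nodes of one variable in fixed-size, contiguously filled blocks."""

    def __init__(self, var: int) -> None:
        self.var = var
        self.blocks: List[List[Tuple[int, int]]] = []

    def allocate(self, low: int, high: int) -> Tuple[int, int]:
        """Append a node and return its address as (block index, slot index)."""
        if not self.blocks or len(self.blocks[-1]) == BLOCK_SIZE:
            self.blocks.append([])          # current block is full: open a new one
        self.blocks[-1].append((low, high))
        return len(self.blocks) - 1, len(self.blocks[-1]) - 1

    def __iter__(self) -> Iterator[Tuple[int, int]]:
        """Traverse every node of this variable block by block."""
        for block in self.blocks:
            yield from block


managers: Dict[int, NodeManager] = {}


def manager_for(var: int) -> NodeManager:
    """Return the node manager of `var`, creating it on first use."""
    if var not in managers:
        managers[var] = NodeManager(var)
    return managers[var]
```

With this layout, a per-variable pass such as rehashing reduces to `for low, high in manager_for(v): ...`, touching only the blocks that belong to variable `v`.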
We further extend this clustering concept to using one operator-node manager for each variable. With this design, we not only benefit from the good locality of node clustering, we also eliminate the need for having both the expansion and the reduction queues, since we can access all the operator nodes of each variable by simply traversing the memory blocks of each operator-node manager.

Furthermore, we associate one compute cache and one unique table per variable. Thus, cache lookup in the expansion phase and the BDD unique table lookup in the reduction phase will only traverse nodes of the same variable. Since nodes of the same variable are clustered by the node managers, this results in better memory locality.

Combined with per-variable node managers, we can perform rehashing for each variable independently by traversing the memory blocks of the corresponding node manager. Again, this rehashing approach has better memory locality than the traditional approach, which traverses the hash table.

    reduction()
     1  for each variable x in the current evaluation context from the lowest to highest precedence
     2      for each node opnode in x's reduce queue
     3          (op, f, g) ← opnode
     4          if opnode.branch0 is a BDD,
     5              res0 ← opnode.branch0
     6          else
     7              res0 ← opnode.branch0.result
     8          if opnode.branch1 is a BDD,
     9              res1 ← opnode.branch1
    10          else
    11              res1 ← opnode.branch1.result
    12          if (res0 == res1)
    13              opnode.result ← res0
    14          else
    15              b ← BDD node (x, res0, res1)
    16              opnode.result ← lookup(unique table, b)
    17              if BDD node b does not exist in the unique table,
    18                  insert b into the unique table
    19                  opnode.result ← b

Fig. 6. Partial Breadth-First Construction: reduction phase.

C. Garbage Collection

No BDD package is complete without a good garbage collector. External users of a BDD package can free references to exported BDDs, and since BDD construction is a memory intensive application, reusing the space of unreachable BDD nodes is important. Most BDD packages use reference counting and maintain a free list of unreferenced nodes. This approach has several drawbacks.
Most notably, it has poor memory locality because the free-list approach can scatter newly created BDD nodes in memory, thus reversing the clustering effects of specialized node managers.

In our implementation, a mark-and-sweep garbage collector with memory compaction is used. Unlike a copying garbage collector, our garbage collection algorithm performs memory compaction without requiring any additional memory. This compaction algorithm is stable; i.e., the nodes' linear ordering is maintained. This property allows nodes which are allocated nearby in time to stay together. This can help access locality because nodes allocated together are likely to be accessed together in the future.

Our garbage collection algorithm consists of two phases, both of which are breadth-first traversals from the variable with the highest precedence to the variable with the lowest precedence. The first phase marks and compacts all the reachable nodes and the second phase fixes all the references and rehashes these nodes.

Fig. 7 shows the algorithm for the mark-and-compact phase. Line 1 marks all the roots of exported BDDs to indicate that these nodes and their descendants are all the nodes that we need to keep. The top-down breadth-first marking of descendants is performed by traversing BDD nodes in each node manager (lines 2 to 6). In this algorithm, x denotes the marked BDD node that is being processed and new denotes the next target location for compaction. For each marked BDD node x, its children are marked (line 7). Line 8 establishes the new location for node x by setting x's forward field. Lines 9 and 10 copy the relevant information in x to this new target location. Line 11 advances new to the next node in the node manager mgr as the new target location. Line 12 advances x to the next marked node in this node manager. This process repeats until we have processed all the marked nodes in this node manager mgr, after which all the marked nodes are compacted into memory blocks before new, and thus all the blocks after new are marked as free blocks to be freed after the second phase (line 13).

    mark-and-compact()
     1  mark all the root nodes of exported BDDs we need to keep.
     2  for each variable v from the highest to lowest precedence,
     3      mgr ← v's BDD-node manager
     4      x ← first marked node in manager mgr
     5      new ← first node in manager mgr
     6      while x is still in node manager,
     7          mark children x.left and x.right
     8          x.forward ← new
     9          new.left ← x.left
    10          new.right ← x.right
    11          new ← ManagerNextNode(mgr, new)
    12          x ← ManagerNextMarkedNode(mgr, x)
    13      put memory blocks for all the nodes after new into mgr.freeblocks.

Fig. 7. Garbage Collection's Mark and Compact Phase. This phase marks nodes that we want to keep and at the same time compacts the memory to avoid memory fragmentation.

Fig. 8 shows the second phase of the garbage collection algorithm. Initially, all external references are updated (lines 2 and 3).
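As a rough illustration of how the two phases (Fig. 7 and Fig. 8) fit together, the following Python sketch compacts per-variable node arrays in place and then fixes references and rebuilds the unique tables. It is a simplified model under stated assumptions: `Slot`, `collect`, the `(variable, slot index)` reference format, and the rule that a smaller variable index means higher precedence are all hypothetical, a Python list stands in for each node manager's memory blocks, and a separate `forward` field is used rather than reusing a hash-chain field.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

Ref = Union[int, Tuple[int, int]]  # terminal 0/1, or (variable, slot index)


@dataclass
class Slot:
    low: Ref
    high: Ref
    marked: bool = False
    forward: int = -1  # new slot index, filled in by the first phase


def collect(managers: Dict[int, List[Slot]], roots: List[Ref]):
    """Compact every manager in place; return (new_roots, unique_tables)."""
    order = sorted(managers)  # smaller index = higher precedence (assumed)

    def mark(ref: Ref) -> None:
        if not isinstance(ref, int):
            managers[ref[0]][ref[1]].marked = True

    def forwarded(ref: Ref) -> Ref:
        if isinstance(ref, int):
            return ref
        return (ref[0], managers[ref[0]][ref[1]].forward)

    # Phase 1: mark and compact, top-down by variable.
    for ref in roots:
        mark(ref)
    live = {}
    for var in order:
        slots, new = managers[var], 0
        for x in range(len(slots)):
            if not slots[x].marked:
                continue
            mark(slots[x].low)
            mark(slots[x].high)
            slots[x].forward = new           # remembered at the OLD address
            slots[new].low, slots[new].high = slots[x].low, slots[x].high
            new += 1                         # new <= x, so order is preserved
        live[var] = new

    # Phase 2: fix references and rehash, again top-down by variable.
    new_roots = [forwarded(r) for r in roots]
    unique = {var: {} for var in order}
    for var in order:
        slots = managers[var]
        for i in range(live[var]):
            s = slots[i]
            s.low, s.high = forwarded(s.low), forwarded(s.high)
            s.marked = False
            unique[var][(s.low, s.high)] = i
        del slots[live[var]:]                # release the space past `new`
    return new_roots, unique
```

The space of a variable is released only after that variable's own nodes are fixed; by then every parent (which belongs to a higher-precedence variable) and every root has already read the `forward` fields it needs.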
Then it proceeds in a top-down breadth-first manner to fix each BDD node's children references (lines 7 and 8) and reinsert this node back into the unique table (line 9). After all the references of a BDD-node manager are updated, its associated free blocks are freed (line 10).

    fix-and-rehash()
     1  clear all unique tables
     2  for each root node of exported BDDs
     3      update root nodes of exported BDDs to the forwarded location
     4  for each variable v from the highest to lowest precedence,
     5      mgr ← v's BDD-node manager
     6      for each node x in manager mgr
     7          x.left ← x.left.forward
     8          x.right ← x.right.forward
     9          insert x into variable v's unique table
    10      free all memory blocks in mgr.freeblocks.

Fig. 8. Garbage Collection's Fix and Rehash Phase. This phase updates all the children references and reinserts the BDD nodes into unique tables.

For the purpose of explanation, the garbage collection algorithm shown uses an additional field forward for each BDD node. In the actual implementation, each BDD node's hashnext field, used for chained hashing, is also used as the forward field during the garbage collection. This dual use of the same field is only correct if hash insertion of a node does not occur until after all the references to this node are fixed. This condition is guaranteed by first fixing external references (lines 2 and 3 in Fig. 8) and then performing the top-down breadth-first traversal, which updates all the parents' references before inserting a node into the hash table. Thus, this two-phase breadth-first garbage collection algorithm is able to perform memory compaction without requiring any additional memory.

D. Variable Reordering

Dynamic variable reordering is an important part of BDD construction. Even though we have not yet implemented dynamic variable reordering, the following is an outline of potential problems and their solutions.
1. Some variable reordering algorithms require reference counts.
Since garbage collection is generally invoked right before variable reordering, we can compute reference counts during the mark-and-compact phase of garbage collection (line 1 and line 7 of Fig. 7).
2. Dynamic variable reordering can counteract the clustering effects achieved by the per-variable memory managers [16]. The solutions proposed in [16] should be directly applicable to our approach.

IV. PERFORMANCE EVALUATION

In this section, we present a performance evaluation of our approach. The test cases are the ISCAS85 benchmarks [3], a collection of ten circuits used in industry. The variable ordering we used is generated by order dfs in SIS [20]. To get more test cases, we generate array multiplier circuits of different sizes based on carry ripple adders [6]. For the rest of this section, we shall refer to this multiplier circuit as MCRA (Multiplier based on Carry Ripple Adders). For an $n$-bit multiplier with two operands [...], the variable ordering used is [...]. For all the test cases, to minimize memory usage, we freed each intermediate result (those that are neither inputs nor outputs of the circuit) immediately after its last reference.

In this section, we use two leading BDD packages for comparison. The first package is CAL version 2.0 from UC Berkeley, which implements the breadth-first algorithm described in [18]. The second package is CUDD version [...] [21] from the University of Colorado at Boulder, which implements the depth-first algorithm for BDD construction. Both are the latest releases as of November [...]. All packages are compiled with gcc using the optimization flag -O3. In this section, we will refer to our package as PBF.

For both CAL and CUDD, we used all the default settings with the exception of dynamic variable reordering features, which we disabled for two reasons. First, we have not implemented dynamic variable reordering yet. Second, turning off the dynamic reordering features removes the performance impact due to different dynamic reordering algorithms.

For the CAL package, the results we present are without its superscalarity and pipelining features [18] because of adverse performance impact. These features require decomposing all operations into a single operation type. For the multipliers, such decomposition increases the running time by up to 60%, and superscalarity of 10 with automatic pipelining increases the memory usage by 30% with little (< 1%) or no performance improvement. For C2670 and C3540 from the ISCAS85 benchmarks, the results are less clear. Thus, for these two circuits, the results using superscalarity of 10 with automatic pipelining will also be included.

A. Evaluation Threshold

In this section, we examine how different evaluation thresholds impact the memory usage and running time of our approach. The system used for this evaluation is an SGI Power Challenge with 1 GBytes of physical memory. This system has 12 processors running IRIX 6.2 with 32-bit address space. Each processor is a 196MHz MIPS R[...]. We perform our experiments using one processor under light load conditions where our processes are the only active processes. Timing results reported are measured CPU time.
In this study, the evaluation threshold ranges from 8 KBytes to ∞, where the ∞ case corresponds to the pure breadth-first case. The results from very small cases (under 10 seconds of CPU time and under 10 MBytes of memory usage) are omitted. The results in Fig. 9 show that in general, the running time varies by about 10 to 20%, except for C2670. For C2670, there is a speedup of 2 for the ∞ case vs. the cases with smaller evaluation thresholds. This is most likely because a larger evaluation threshold results in a more complete cache (as discussed in Section III.A). This is substantiated by the fact that the ∞ case has a total of 23 million Shannon expansions, while the cases with smaller evaluation thresholds have over 135 million Shannon expansions. The results in Fig. 9 also show that different evaluation thresholds can have an impact on the memory usage; e.g., for C2670, the ratio between maximum and minimum memory usage is [...]. In general, this memory usage difference may be the key factor in whether or not an application fits into physical memory and thus can have a significant effect on the running time.

[Fig. 9 table: CPU time (seconds) / memory usage (MBytes) at each evaluation threshold (KBytes) for C2670, C3540, MCRA14, and MCRA[...]; the individual values are not recoverable.]
Fig. 9. Effects of Evaluation Threshold. The ∞ case corresponds to the case with pure breadth-first.

Note that overall, the evaluation threshold of 4096 KBytes strikes a reasonable balance between memory usage and running time. Since 4096 KBytes is 1/256 of the physical memory size (1 GByte), for the rest of the performance evaluation in this paper, we choose the evaluation threshold for our package to be 1/256 of the physical memory size.

B. Performance Comparison: No Paging

This section compares our approach (PBF) to CAL and CUDD when the test cases fit in physical memory. The system used for evaluation is the same as in the previous section. The memory usage limit is set to 1 GByte. The evaluation threshold chosen for our package is 4 MBytes, which is 1/256 of the physical memory size of 1 GByte. Fig.
10 shows the results of this study. The results for smaller cases are shown in the top half of this table. The results for the C6288 and C7552 cases are not available because they both exceeded the memory limit. Note that for CAL, C2670 and C3540 have better performance using CAL's superscalarity and pipelining features at the cost of 71% to 84% higher memory usage. These results are marked with [...] in Fig. 10.

The results show that for the larger cases, PBF consistently outperforms both CAL and CUDD, with speedups ranging from 1.10 (MCRA15) to 1.60 (C3540) in comparison to the best of CAL and CUDD. For the smaller cases, PBF is slower. However, since these smaller cases take less than 2 seconds to finish, performance differences among the different approaches are less significant.

As for memory usage, PBF's memory usage tracks that of CUDD's depth-first implementation very closely. For small cases (under 10 MBytes), PBF's memory usage is higher due to the memory overhead of the per-variable data structures. However, for large cases like C3540 and the MCRA circuits, PBF's memory usage is actually slightly smaller than CUDD's. In contrast, CAL's memory usage is up to 1.6 times PBF's memory usage (MCRA15).

[Fig. 10 table: CPU time (seconds) and memory (MBytes) for PBF, CAL, and CUDD on the ISCAS85 circuits and two MCRA multipliers; all entries for C6288 and C7552 are n/a, and the remaining values are not recoverable.]
Fig. 10. Performance comparison when the test cases fit in physical memory. Both the C6288 and C7552 cases exceeded the 1 GByte memory limit and thus their results are not available. Numbers marked with [...] are CAL's results using superscalarity of 10 with automatic pipelining.

C. Performance Comparison: Paging

This section compares our approach (PBF) to CAL and CUDD when the test cases do not fit into physical memory. We repeated the experiments on a smaller system: a 200MHz Pentium Pro with 256 KBytes of L2 cache and 128 MBytes of 60ns EDO DRAM. This system runs Linux with a 32-bit address space. All measurements were obtained in single-user mode. Timing results reported are elapsed time, and the time limit is set to 24 hours of elapsed time. For this experiment, we chose the test cases which use more memory than the available physical memory (128 MBytes).

Fig. 11 shows that our approach (PBF) consistently outperforms both CAL and CUDD, with speedups ranging from 1.51 (C2670) to 13.2 (MCRA14) in comparison to the best of CAL and CUDD. The significant speedup for MCRA14 is mainly due to the fact that our approach's memory usage for this case is only slightly more than the available physical memory. This case demonstrates the importance of limiting the memory overhead. Another interesting point to note is that both the PBF (our approach) and the CAL (breadth-first) approaches have much better paging locality than the CUDD (depth-first) approach. For the C3540 circuit, this locality resulted in an order of magnitude difference in performance.

V. ARRAY MULTIPLIERS

In this section, we demonstrate the effectiveness of our techniques by building very large output BDDs of two types of integer multiplication circuits. The first type is based on C6288 from the ISCAS85 benchmark. C6288 is a 16-bit array multiplier using carry save adders. Based on its design, we derived corresponding circuits from 1 to 15 bits. The second type is an array multiplier with carry ripple adders (MCRA), as in Section IV.
In this study, we characterize both multipliers from 1 to 16 bits. The system used for this evaluation is an SGI Power Challenge with 4 GBytes of physical memory. This system has 16 processors running IRIX 6.2 with a 64-bit address space. Each processor is a 194MHz MIPS R[...]. We perform our experiments in dedicated mode using one processor. Note that for BDD applications, memory usage on 64-bit machines is generally twice that of 32-bit machines.

Fig. 12 shows the results for this experiment. Fig. 13 plots the memory usage of the output BDDs and the memory usage for constructing the C6288 and MCRA circuits in a semi-log graph. Note that the output BDD size grows exponentially, by a factor of about 2.87 per bit of word size. Fig. 13 also shows that, other than the initial overhead, which affects the memory usage of smaller circuits, the total memory usage grows at the same rate as the output BDDs' memory usage. The plot is semi-log in order to show the numbers for small cases clearly. However, it is worth noting that the semi-log scale deemphasizes the fact that the total memory usage for the 16-bit multiplier is about a factor of three to four over the size of the output BDDs.

To better understand the memory usage, we analyze the BDD construction for building the C6288 circuit. The maximum memory usage for building this circuit is 3803 MBytes. The maximum number of BDD nodes that exist simultaneously during the BDD construction process is about 110 million (3352 MBytes). To accommodate these BDD nodes, the unique tables have a combined total of 48 million bins (366 MBytes). Thus the memory overhead of the operator nodes, the compute cache, and other auxiliary data structures is 85 MBytes, which is only 2.2% of the total memory usage. This result demonstrates that our approach has very little memory overhead. As far as we know, this is the first time that the entire C6288 circuit has been built using conventional BDD representations.
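The overhead figure can be re-derived from the numbers quoted in the text. The short Python check below is only a sanity calculation: the per-node and per-bin byte estimates are derived here, under the assumptions that 1 MByte means 2^20 bytes and that the rounded counts of 110 million nodes and 48 million bins are close enough, and they are not stated in the paper.

```python
MIB = 2**20  # assumption: "MBytes" in the text means 2**20 bytes

total_mb = 3803   # maximum memory usage while building C6288
node_mb = 3352    # memory held by about 110 million BDD nodes
table_mb = 366    # memory held by about 48 million unique-table bins

# Whatever is neither node storage nor unique-table bins is overhead
# (operator nodes, compute cache, other auxiliary data structures).
overhead_mb = total_mb - node_mb - table_mb
overhead_pct = 100 * overhead_mb / total_mb

# Derived estimates, not figures from the paper.
bytes_per_node = node_mb * MIB / 110e6
bytes_per_bin = table_mb * MIB / 48e6

print(f"overhead: {overhead_mb} MBytes ({overhead_pct:.1f}% of total)")
print(f"about {bytes_per_node:.0f} bytes per node, {bytes_per_bin:.0f} bytes per bin")
```

The subtraction reproduces the 85 MBytes and 2.2% quoted above. The derived estimates of roughly 32 bytes per node and 8 bytes per bin would be consistent with pointer-sized fields on the 64-bit machine used for this study, though that reading is an inference.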
[Fig. 11 table: elapsed time (seconds) and memory (MBytes) for PBF, CAL, and CUDD; all MCRA15 entries are n/a, and the remaining values are not recoverable.]
Fig. 11. Performance comparison when the test cases do not fit into physical memory. The MCRA15 case exceeded the time limit of 24 hours for all three packages. CAL's numbers are measured without its superscalarity and pipelining features to reduce the memory usage and minimize paging.

[Fig. 12 table: number of bits, output size (# of nodes), CPU time (seconds), and memory (MBytes) for C6288 and MCRA; the values are not recoverable.]
Fig. 12. Results for multiplier circuits. Note that since a 64-bit machine is used for this study, the memory usage is roughly twice as large as it would be on a 32-bit machine.

VI. RELATED WORK

There are many research efforts based on breadth-first BDD construction [14, 15, 1, 10, 18]. However, none of these propose how to bound the memory overhead of breadth-first construction. To address this issue, we introduced a hybrid algorithm which performs breadth-first construction to exploit memory locality and switches to depth-first construction when the memory overhead becomes too high [8]. This hybrid approach has the drawback that when a BDD operation is much larger than the switch-over threshold, it will be dominated by the depth-first portion and thus have poor memory behavior. Note that this hybrid is similar to the mixed depth-first and breadth-first approach that prunes unnecessary recursion branches for the quantification and relational product operations [18].

The BDD package of SMV [13] uses a mark-and-sweep garbage collector without memory compaction. In [15, 1, 17], memory compaction is used to avoid memory fragmentation. These three approaches are all based on reference counting. In [15], the compaction algorithm is stable (i.e., the linear ordering of the nodes is maintained) and does not require additional memory. Our approach is quite similar to this. In [1], the garbage collection uses a free-list and, when memory fragmentation becomes high, a separate memory compaction algorithm based on copying is used. In [17], the garbage collection phase is also free-list based, and memory compaction is performed after garbage collection only when memory fragmentation becomes high. This compaction is performed by moving the newest set of live nodes to fill the holes left behind by the oldest set of dead nodes; thus, no additional memory is required. This algorithm has the advantage of moving the minimum number of nodes necessary, but it does not maintain the linear ordering of the live nodes. The performance impact of this tradeoff deserves further study. Our approach combines many attributes of the approaches above by integrating a mark-and-sweep garbage collector with stable memory compaction without any additional memory overhead.

VII. SUMMARY AND CONCLUSIONS

This paper has introduced three techniques to control the working set size by limiting memory overhead and improving both temporal and spatial locality.
First, we have introduced a novel BDD construction algorithm based on partial breadth-first expansion. This approach has the good memory locality of breadth-first BDD construction while maintaining the low memory overhead of the depth-first approach. Second, we have described how memory management on a per-variable basis can improve spatial locality of BDD construction at all levels, including expansion, reduction, and rehashing. Finally, we have introduced a memory compacting garbage collection algorithm to avoid memory fragmentation due to unreachable BDD nodes. These algorithms work together in controlling the working set size to gain better memory access locality with little memory overhead. As these techniques exploit inherent properties of BDD construction, graph reduction techniques (like *BMD, POBDD, and variable reordering) can be incorporated into our algorithms to further expand their usefulness.

[Fig. 13 plot: memory (MBytes, log scale) versus number of bits, with curves for MCRA, C6288, and BDD.]
Fig. 13. Maximum memory usage for both C6288 and MCRA compared with memory usage of output BDDs (labeled as BDD).

Experimental results show that by controlling the evaluation threshold, the partial breadth-first approach can reduce the memory usage by 60% in comparison to our pure breadth-first case (∞ evaluation threshold). In the performance comparison study, the results show that when the applications fit in physical memory, our approach is consistently faster for larger cases (over 2 seconds), with speedups of up to 1.6 in comparison to the leading depth-first (CUDD) and breadth-first (CAL) packages. When the applications do not fit into physical memory, our approach outperforms both CUDD and CAL by up to an order of magnitude. Furthermore, to demonstrate how our techniques can efficiently build very large graphs, we constructed the output BDDs for the C6288 multiplication circuit from the ISCAS85 benchmark and showed that the memory overhead of our approach is 2.2%.
These results show that our techniques have successfully achieved better memory locality while reducing the memory overhead.

Beyond the sequential world, another advantage of the partial breadth-first algorithm is that it can be parallelized by using each processor's context stack as a distributed work queue [22]. This approach achieves speedups of up to four on eight processors of a shared memory system.
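To make the third technique summarized above more concrete, the following Python sketch shows one way a mark phase can be combined with a stable (order-preserving) sliding compaction. It is an illustration under stated assumptions, not the algorithm of this paper: the single `heap` list, the `(var, low, high)` node tuples, the negative indices for terminals, and the `mark` and `compact` functions are all hypothetical, and the sketch uses two side arrays for clarity, whereas the approach described here needs no additional memory. A real package would also have to rehash its unique tables after nodes move.

```python
def mark(heap, roots):
    """Return marked[i] == True for every node reachable from the roots."""
    marked = [False] * len(heap)
    stack = [r for r in roots if r >= 0]   # negative indices are terminals
    while stack:
        i = stack.pop()
        if marked[i]:
            continue
        marked[i] = True
        _, low, high = heap[i]
        stack.extend(c for c in (low, high) if c >= 0)
    return marked


def compact(heap, roots):
    """Slide live nodes toward index 0, preserving their relative order.

    Mutates heap in place and returns the relocated root indices.
    """
    marked = mark(heap, roots)

    # Forwarding address of node i = number of live nodes before it.
    forward = [0] * len(heap)
    live = 0
    for i, is_live in enumerate(marked):
        forward[i] = live
        live += is_live

    def relocate(child):
        return forward[child] if child >= 0 else child

    # forward[i] <= i, so a node is always read before its slot is reused.
    for i, is_live in enumerate(marked):
        if is_live:
            var, low, high = heap[i]
            heap[forward[i]] = (var, relocate(low), relocate(high))

    del heap[live:]                        # release the freed tail
    return [relocate(r) for r in roots]


# Toy heap: indices 1 and 3 are unreachable from the root at index 4.
heap = [(2, -1, -2), (2, -2, -1), (1, 0, -2), (1, -1, 0), (0, 2, 0)]
roots = compact(heap, [4])
print(heap, roots)
```

Because the surviving nodes keep their relative order, nodes that were allocated together stay together after collection, which is the property the text calls stable.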

ACKNOWLEDGEMENT

We thank Claudson F. Bornstein and Henry R. Rowley for numerous discussions on efficient BDD implementations. We also thank Rajeev K. Ranjan for his help in setting up our performance study with the CAL package. This work utilized Silicon Graphics Power Challenge shared memory machines at both the Pittsburgh Supercomputing Center and the National Center for Supercomputing Applications at Urbana-Champaign. We are very grateful to the wonderful support staff at both supercomputing centers.

REFERENCES

[1] R. Ashar and M. Cheong. Efficient breadth-first manipulation of binary decision diagrams. In Proceedings of the International Conference on Computer-Aided Design, pages [...], November [...].
[2] K. Brace, R. Rudell, and R. E. Bryant. Efficient implementation of a BDD package. In Proceedings of the 27th ACM/IEEE Design Automation Conference, pages 40-45, June [...].
[3] F. Brglez and H. Fujiwara. A neutral netlist of 10 combinational benchmark circuits and a target translator in Fortran. In 1985 International Symposium on Circuits and Systems, June 1985. Partially described in F. Brglez, P. Pownall, R. Hum. Accelerated ATPG and Fault Grading via Testability Analysis. In 1985 International Symposium on Circuits and Systems, pages [...], June 1985.
[4] R. E. Bryant. Graph-based algorithms for Boolean function manipulation. IEEE Transactions on Computers, C-35(8):[...], August [...].
[5] R. E. Bryant. On the complexity of VLSI implementations and graph representations of Boolean functions with application to integer multiplication. IEEE Transactions on Computers, 40(2):[...], February [...].
[6] R. E. Bryant and Y.-A. Chen. Verification of arithmetic circuits with binary moment diagrams. In Proceedings of the 32nd ACM/IEEE Design Automation Conference, pages [...], June [...].
[7] Y.-A. Chen and R. E. Bryant. ACV: An arithmetic circuit verifier. In Proceedings of the International Conference on Computer-Aided Design, pages [...], November [...].
[8] Y.-A. Chen, B. Yang, and R. E. Bryant. Breadth-first with depth-first BDD construction: A hybrid approach. Technical Report CMU-CS-[...], School of Computer Science, Carnegie Mellon University, [...].
[9] R. Drechsler, A. Sarabi, M. Theobald, B. Becker, and M. A. Perkowski. Efficient representation and manipulation of switching functions based on ordered Kronecker functional decision diagrams. In Proceedings of the 31st ACM/IEEE Design Automation Conference, pages [...], June [...].
[10] A. Hett, R. Drechsler, and B. Becker. MORE: Alternative implementation of BDD-packages by multi-operand synthesis. In Proceedings of the European Design Automation Conference, pages 16-20, September [...].
[11] J. Jain, J. Bitner, J. A. Abraham, and D. S. Fussell. Functional partitioning for verification and related problems. In Proceedings of the Brown/MIT VLSI Conference, pages [...], March [...].
[12] S. Jha, Y. Lu, M. Minea, and E. M. Clarke. Equivalence checking using abstract BDDs. In 1997 IEEE Proceedings of the International Conference on Computer Design, pages [...], October 1997.
[13] K. L. McMillan. Symbolic Model Checking. Kluwer Academic Publishers, [...].
[14] H. Ochi, N. Ishiura, and S. Yajima. Breadth-first manipulation of SBDD of Boolean functions for vector processing. In Proceedings of the 28th ACM/IEEE Design Automation Conference, pages [...], June [...].
[15] H. Ochi, K. Yasuoka, and S. Yajima. Breadth-first manipulation of very large binary-decision diagrams. In Proceedings of the International Conference on Computer-Aided Design, pages 48-55, November [...].
[16] R. K. Ranjan, W. Gosti, R. K. Brayton, and A. Sangiovanni-Vincentelli. Dynamic reordering in a breadth-first manipulation based BDD package: Challenges and solutions. In 1997 IEEE Proceedings of the International Conference on Computer Design, pages [...], October 1997.
[17] R. K. Ranjan and J. Sanghavi. CAL-2.0: Breadth-first Manipulation Based BDD Library. Public software. University of California, Berkeley, CA, June [...]. [...]bdd/.
[18] R. K. Ranjan, J. V. Sanghavi, R. K. Brayton, and A. Sangiovanni-Vincentelli. High performance BDD package based on exploiting memory hierarchy. In Proceedings of the 33rd ACM/IEEE Design Automation Conference, pages [...], June [...].
[19] R. Rudell. Dynamic variable ordering for ordered binary decision diagrams. In Proceedings of the International Conference on Computer-Aided Design, pages [...], November [...].
[20] E. M. Sentovich, K. J. Singh, L. Lavagno, C. Moon, R. Murgai, A. Saldanha, H. Savoj, P. R. Stephan, R. K. Brayton, and A. L. Sangiovanni-Vincentelli. SIS: A system for sequential circuit synthesis. Technical Report UCB/ERL M92/41, Electronics Research Lab, University of California, May [...].
[21] F. Somenzi. CUDD-2.1.2: CU Decision Diagram Package, April [...]. ftp://vlsi.olorado.edu/pub/udd tar.gz.
[22] B. Yang and D. R. O'Hallaron. Parallel breadth-first BDD construction. In Ninth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages [...], June 1997.


More information

Graph-Based vs Depth-Based Data Representation for Multiview Images

Graph-Based vs Depth-Based Data Representation for Multiview Images Graph-Based vs Depth-Based Data Representation for Multiview Images Thomas Maugey, Antonio Ortega, Pasal Frossard Signal Proessing Laboratory (LTS), Eole Polytehnique Fédérale de Lausanne (EPFL) Email:

More information

8 Instruction Selection

8 Instruction Selection 8 Instrution Seletion The IR ode instrutions were designed to do exatly one operation: load/store, add, subtrat, jump, et. The mahine instrutions of a real CPU often perform several of these primitive

More information

Implementing Load-Balanced Switches With Fat-Tree Networks

Implementing Load-Balanced Switches With Fat-Tree Networks Implementing Load-Balaned Swithes With Fat-Tree Networks Hung-Shih Chueh, Ching-Min Lien, Cheng-Shang Chang, Jay Cheng, and Duan-Shin Lee Department of Eletrial Engineering & Institute of Communiations

More information

Multi-Channel Wireless Networks: Capacity and Protocols

Multi-Channel Wireless Networks: Capacity and Protocols Multi-Channel Wireless Networks: Capaity and Protools Tehnial Report April 2005 Pradeep Kyasanur Dept. of Computer Siene, and Coordinated Siene Laboratory, University of Illinois at Urbana-Champaign Email:

More information

Multi-Piece Mold Design Based on Linear Mixed-Integer Program Toward Guaranteed Optimality

Multi-Piece Mold Design Based on Linear Mixed-Integer Program Toward Guaranteed Optimality INTERNATIONAL CONFERENCE ON MANUFACTURING AUTOMATION (ICMA200) Multi-Piee Mold Design Based on Linear Mixed-Integer Program Toward Guaranteed Optimality Stephen Stoyan, Yong Chen* Epstein Department of

More information

Tackling IPv6 Address Scalability from the Root

Tackling IPv6 Address Scalability from the Root Takling IPv6 Address Salability from the Root Mei Wang Ashish Goel Balaji Prabhakar Stanford University {wmei, ashishg, balaji}@stanford.edu ABSTRACT Internet address alloation shemes have a huge impat

More information

An Optimized Approach on Applying Genetic Algorithm to Adaptive Cluster Validity Index

An Optimized Approach on Applying Genetic Algorithm to Adaptive Cluster Validity Index IJCSES International Journal of Computer Sienes and Engineering Systems, ol., No.4, Otober 2007 CSES International 2007 ISSN 0973-4406 253 An Optimized Approah on Applying Geneti Algorithm to Adaptive

More information

Detecting Moving Targets in Clutter in Airborne SAR via Keystoning and Multiple Phase Center Interferometry

Detecting Moving Targets in Clutter in Airborne SAR via Keystoning and Multiple Phase Center Interferometry Deteting Moving Targets in Clutter in Airborne SAR via Keystoning and Multiple Phase Center Interferometry D. M. Zasada, P. K. Sanyal The MITRE Corp., 6 Eletroni Parkway, Rome, NY 134 (dmzasada, psanyal)@mitre.org

More information

Total 100

Total 100 CS331 SOLUTION Problem # Points 1 10 2 15 3 25 4 20 5 15 6 15 Total 100 1. ssume you are dealing with a ompiler for a Java-like language. For eah of the following errors, irle whih phase would normally

More information

Evaluation of Benchmark Performance Estimation for Parallel. Fortran Programs on Massively Parallel SIMD and MIMD. Computers.

Evaluation of Benchmark Performance Estimation for Parallel. Fortran Programs on Massively Parallel SIMD and MIMD. Computers. Evaluation of Benhmark Performane Estimation for Parallel Fortran Programs on Massively Parallel SIMD and MIMD Computers Thomas Fahringer Dept of Software Tehnology and Parallel Systems University of Vienna

More information

Detection and Recognition of Non-Occluded Objects using Signature Map

Detection and Recognition of Non-Occluded Objects using Signature Map 6th WSEAS International Conferene on CIRCUITS, SYSTEMS, ELECTRONICS,CONTROL & SIGNAL PROCESSING, Cairo, Egypt, De 9-31, 007 65 Detetion and Reognition of Non-Oluded Objets using Signature Map Sangbum Park,

More information

Sparse Certificates for 2-Connectivity in Directed Graphs

Sparse Certificates for 2-Connectivity in Directed Graphs Sparse Certifiates for 2-Connetivity in Direted Graphs Loukas Georgiadis Giuseppe F. Italiano Aikaterini Karanasiou Charis Papadopoulos Nikos Parotsidis Abstrat Motivated by the emergene of large-sale

More information

Detection of RF interference to GPS using day-to-day C/No differences

Detection of RF interference to GPS using day-to-day C/No differences 1 International Symposium on GPS/GSS Otober 6-8, 1. Detetion of RF interferene to GPS using day-to-day /o differenes Ryan J. R. Thompson 1#, Jinghui Wu #, Asghar Tabatabaei Balaei 3^, and Andrew G. Dempster

More information

System-Level Parallelism and Throughput Optimization in Designing Reconfigurable Computing Applications

System-Level Parallelism and Throughput Optimization in Designing Reconfigurable Computing Applications System-Level Parallelism and hroughput Optimization in Designing Reonfigurable Computing Appliations Esam El-Araby 1, Mohamed aher 1, Kris Gaj 2, arek El-Ghazawi 1, David Caliga 3, and Nikitas Alexandridis

More information

A Novel Bit Level Time Series Representation with Implication of Similarity Search and Clustering

A Novel Bit Level Time Series Representation with Implication of Similarity Search and Clustering A Novel Bit Level Time Series Representation with Impliation of Similarity Searh and lustering hotirat Ratanamahatana, Eamonn Keogh, Anthony J. Bagnall 2, and Stefano Lonardi Dept. of omputer Siene & Engineering,

More information

A Dual-Hamiltonian-Path-Based Multicasting Strategy for Wormhole-Routed Star Graph Interconnection Networks

A Dual-Hamiltonian-Path-Based Multicasting Strategy for Wormhole-Routed Star Graph Interconnection Networks A Dual-Hamiltonian-Path-Based Multiasting Strategy for Wormhole-Routed Star Graph Interonnetion Networks Nen-Chung Wang Department of Information and Communiation Engineering Chaoyang University of Tehnology,

More information

Reduced-Complexity Column-Layered Decoding and. Implementation for LDPC Codes

Reduced-Complexity Column-Layered Decoding and. Implementation for LDPC Codes Redued-Complexity Column-Layered Deoding and Implementation for LDPC Codes Zhiqiang Cui 1, Zhongfeng Wang 2, Senior Member, IEEE, and Xinmiao Zhang 3 1 Qualomm In., San Diego, CA 92121, USA 2 Broadom Corp.,

More information

Volume 3, Issue 9, September 2013 International Journal of Advanced Research in Computer Science and Software Engineering

Volume 3, Issue 9, September 2013 International Journal of Advanced Research in Computer Science and Software Engineering Volume 3, Issue 9, September 2013 ISSN: 2277 128X International Journal of Advaned Researh in Computer Siene and Software Engineering Researh Paper Available online at: www.ijarsse.om A New-Fangled Algorithm

More information

3-D IMAGE MODELS AND COMPRESSION - SYNTHETIC HYBRID OR NATURAL FIT?

3-D IMAGE MODELS AND COMPRESSION - SYNTHETIC HYBRID OR NATURAL FIT? 3-D IMAGE MODELS AND COMPRESSION - SYNTHETIC HYBRID OR NATURAL FIT? Bernd Girod, Peter Eisert, Marus Magnor, Ekehard Steinbah, Thomas Wiegand Te {girod eommuniations Laboratory, University of Erlangen-Nuremberg

More information

Architecture and Performance of the Hitachi SR2201 Massively Parallel Processor System

Architecture and Performance of the Hitachi SR2201 Massively Parallel Processor System Arhiteture and Performane of the Hitahi SR221 Massively Parallel Proessor System Hiroaki Fujii, Yoshiko Yasuda, Hideya Akashi, Yasuhiro Inagami, Makoto Koga*, Osamu Ishihara*, Masamori Kashiyama*, Hideo

More information

Cross-layer Resource Allocation on Broadband Power Line Based on Novel QoS-priority Scheduling Function in MAC Layer

Cross-layer Resource Allocation on Broadband Power Line Based on Novel QoS-priority Scheduling Function in MAC Layer Communiations and Networ, 2013, 5, 69-73 http://dx.doi.org/10.4236/n.2013.53b2014 Published Online September 2013 (http://www.sirp.org/journal/n) Cross-layer Resoure Alloation on Broadband Power Line Based

More information

Acoustic Links. Maximizing Channel Utilization for Underwater

Acoustic Links. Maximizing Channel Utilization for Underwater Maximizing Channel Utilization for Underwater Aousti Links Albert F Hairris III Davide G. B. Meneghetti Adihele Zorzi Department of Information Engineering University of Padova, Italy Email: {harris,davide.meneghetti,zorzi}@dei.unipd.it

More information

Definitions Homework. Quine McCluskey Optimal solutions are possible for some large functions Espresso heuristic. Definitions Homework

Definitions Homework. Quine McCluskey Optimal solutions are possible for some large functions Espresso heuristic. Definitions Homework EECS 33 There be Dragons here http://ziyang.ees.northwestern.edu/ees33/ Teaher: Offie: Email: Phone: L477 Teh dikrp@northwestern.edu 847 467 2298 Today s material might at first appear diffiult Perhaps

More information

The AMDREL Project in Retrospective

The AMDREL Project in Retrospective The AMDREL Projet in Retrospetive K. Siozios 1, G. Koutroumpezis 1, K. Tatas 1, N. Vassiliadis 2, V. Kalenteridis 2, H. Pournara 2, I. Pappas 2, D. Soudris 1, S. Nikolaidis 2, S. Siskos 2, and A. Thanailakis

More information

Self-Adaptive Parent to Mean-Centric Recombination for Real-Parameter Optimization

Self-Adaptive Parent to Mean-Centric Recombination for Real-Parameter Optimization Self-Adaptive Parent to Mean-Centri Reombination for Real-Parameter Optimization Kalyanmoy Deb and Himanshu Jain Department of Mehanial Engineering Indian Institute of Tehnology Kanpur Kanpur, PIN 86 {deb,hjain}@iitk.a.in

More information

Design of High Speed Mac Unit

Design of High Speed Mac Unit Design of High Speed Ma Unit 1 Harish Babu N, 2 Rajeev Pankaj N 1 PG Student, 2 Assistant professor Shools of Eletronis Engineering, VIT University, Vellore -632014, TamilNadu, India. 1 harishharsha72@gmail.om,

More information

Path Diversity for Overlay Multicast Streaming

Path Diversity for Overlay Multicast Streaming Path Diversity for Overlay Multiast Streaming Matulya Bansal and Avideh Zakhor Department of Eletrial Engineering and Computer Siene University of California, Berkeley Berkeley, CA 9472 {matulya, avz}@ees.berkeley.edu

More information

A Unified Subdivision Scheme for Polygonal Modeling

A Unified Subdivision Scheme for Polygonal Modeling EUROGRAPHICS 2 / A. Chalmers and T.-M. Rhyne (Guest Editors) Volume 2 (2), Number 3 A Unified Subdivision Sheme for Polygonal Modeling Jérôme Maillot Jos Stam Alias Wavefront Alias Wavefront 2 King St.

More information

splitting tehniques that partition live ranges have been proposed to solve both the spilling problem[5][8] and the assignment problem[8][9]. The parti

splitting tehniques that partition live ranges have been proposed to solve both the spilling problem[5][8] and the assignment problem[8][9]. The parti Load/Store Range Analysis for Global Register Alloation Priyadarshan Kolte and Mary Jean Harrold Department of Computer Siene Clemson University Abstrat Live range splitting tehniques divide the live ranges

More information

Improved Vehicle Classification in Long Traffic Video by Cooperating Tracker and Classifier Modules

Improved Vehicle Classification in Long Traffic Video by Cooperating Tracker and Classifier Modules Improved Vehile Classifiation in Long Traffi Video by Cooperating Traker and Classifier Modules Brendan Morris and Mohan Trivedi University of California, San Diego San Diego, CA 92093 {b1morris, trivedi}@usd.edu

More information

13.1 Numerical Evaluation of Integrals Over One Dimension

13.1 Numerical Evaluation of Integrals Over One Dimension 13.1 Numerial Evaluation of Integrals Over One Dimension A. Purpose This olletion of subprograms estimates the value of the integral b a f(x) dx where the integrand f(x) and the limits a and b are supplied

More information

One Against One or One Against All : Which One is Better for Handwriting Recognition with SVMs?

One Against One or One Against All : Which One is Better for Handwriting Recognition with SVMs? One Against One or One Against All : Whih One is Better for Handwriting Reognition with SVMs? Jonathan Milgram, Mohamed Cheriet, Robert Sabourin To ite this version: Jonathan Milgram, Mohamed Cheriet,

More information

INTERPOLATED AND WARPED 2-D DIGITAL WAVEGUIDE MESH ALGORITHMS

INTERPOLATED AND WARPED 2-D DIGITAL WAVEGUIDE MESH ALGORITHMS Proeedings of the COST G-6 Conferene on Digital Audio Effets (DAFX-), Verona, Italy, Deember 7-9, INTERPOLATED AND WARPED -D DIGITAL WAVEGUIDE MESH ALGORITHMS Vesa Välimäki Lab. of Aoustis and Audio Signal

More information

Exploiting Enriched Contextual Information for Mobile App Classification

Exploiting Enriched Contextual Information for Mobile App Classification Exploiting Enrihed Contextual Information for Mobile App Classifiation Hengshu Zhu 1 Huanhuan Cao 2 Enhong Chen 1 Hui Xiong 3 Jilei Tian 2 1 University of Siene and Tehnology of China 2 Nokia Researh Center

More information

Cell Projection of Convex Polyhedra

Cell Projection of Convex Polyhedra Volume Graphis (2003) I. Fujishiro, K. Mueller, A. Kaufman (Editors) Cell Projetion of Convex Polyhedra Stefan Roettger and Thomas Ertl Visualization and Interative Systems Group University of Stuttgart

More information

The recursive decoupling method for solving tridiagonal linear systems

The recursive decoupling method for solving tridiagonal linear systems Loughborough University Institutional Repository The reursive deoupling method for solving tridiagonal linear systems This item was submitted to Loughborough University's Institutional Repository by the/an

More information

Colouring contact graphs of squares and rectilinear polygons de Berg, M.T.; Markovic, A.; Woeginger, G.

Colouring contact graphs of squares and rectilinear polygons de Berg, M.T.; Markovic, A.; Woeginger, G. Colouring ontat graphs of squares and retilinear polygons de Berg, M.T.; Markovi, A.; Woeginger, G. Published in: nd European Workshop on Computational Geometry (EuroCG 06), 0 Marh - April, Lugano, Switzerland

More information

Capturing Large Intra-class Variations of Biometric Data by Template Co-updating

Capturing Large Intra-class Variations of Biometric Data by Template Co-updating Capturing Large Intra-lass Variations of Biometri Data by Template Co-updating Ajita Rattani University of Cagliari Piazza d'armi, Cagliari, Italy ajita.rattani@diee.unia.it Gian Lua Marialis University

More information

Zippy - A coarse-grained reconfigurable array with support for hardware virtualization

Zippy - A coarse-grained reconfigurable array with support for hardware virtualization Zippy - A oarse-grained reonfigurable array with support for hardware virtualization Christian Plessl Computer Engineering and Networks Lab ETH Zürih, Switzerland plessl@tik.ee.ethz.h Maro Platzner Department

More information

Series/1 GA File No i=:: IBM Series/ Battery Backup Unit Description :::5 ~ ~ >-- ffi B~88 ~0 (] II IIIIII

Series/1 GA File No i=:: IBM Series/ Battery Backup Unit Description :::5 ~ ~ >-- ffi B~88 ~0 (] II IIIIII Series/1 I. (.. GA34-0032-0 File No. 51-10 a i=:: 5 Q 1 IBM Series/1 4999 Battery Bakup Unit Desription B88 0 (] o. :::5 >-- ffi "- I II1111111111IIIIII1111111 ---- - - - - ----- --_.- Series/1 «h: ",

More information

Divide-and-conquer algorithms 1

Divide-and-conquer algorithms 1 * 1 Multipliation Divide-and-onquer algorithms 1 The mathematiian Gauss one notied that although the produt of two omplex numbers seems to! involve four real-number multipliations it an in fat be done

More information

CleanUp: Improving Quadrilateral Finite Element Meshes

CleanUp: Improving Quadrilateral Finite Element Meshes CleanUp: Improving Quadrilateral Finite Element Meshes Paul Kinney MD-10 ECC P.O. Box 203 Ford Motor Company Dearborn, MI. 8121 (313) 28-1228 pkinney@ford.om Abstrat: Unless an all quadrilateral (quad)

More information

Optimizing Sparse Matrix Operations on GPUs using Merge Path

Optimizing Sparse Matrix Operations on GPUs using Merge Path 21 IEEE 29th International Parallel and Distributed Proessing Symposium Optimizing Sparse Matrix Operations on GPUs using Merge Path Steven Dalton, Luke Olson Department of Computer Siene University of

More information

Time delay estimation of reverberant meeting speech: on the use of multichannel linear prediction

Time delay estimation of reverberant meeting speech: on the use of multichannel linear prediction University of Wollongong Researh Online Faulty of Informatis - apers (Arhive) Faulty of Engineering and Information Sienes 7 Time delay estimation of reverberant meeting speeh: on the use of multihannel

More information

Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY Fall Test I Solutions

Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY Fall Test I Solutions Department of Eletrial Engineering and Computer iene MAACHUETT INTITUTE OF TECHNOLOGY 6.035 Fall 2016 Test I olutions 1 I Regular Expressions and Finite-tate Automata For Questions 1, 2, and 3, let the

More information

Staircase Join: Teach a Relational DBMS to Watch its (Axis) Steps

Staircase Join: Teach a Relational DBMS to Watch its (Axis) Steps Stairase Join: Teah a Relational DBMS to Wath its (Axis) Steps Torsten Grust Maurie van Keulen Jens Teubner University of Konstanz Department of Computer and Information Siene P.O. Box D 88, 78457 Konstanz,

More information

PROJECT PERIODIC REPORT

PROJECT PERIODIC REPORT FP7-ICT-2007-1 Contrat no.: 215040 www.ative-projet.eu PROJECT PERIODIC REPORT Publishable Summary Grant Agreement number: ICT-215040 Projet aronym: Projet title: Enabling the Knowledge Powered Enterprise

More information

Multiple-Criteria Decision Analysis: A Novel Rank Aggregation Method

Multiple-Criteria Decision Analysis: A Novel Rank Aggregation Method 3537 Multiple-Criteria Deision Analysis: A Novel Rank Aggregation Method Derya Yiltas-Kaplan Department of Computer Engineering, Istanbul University, 34320, Avilar, Istanbul, Turkey Email: dyiltas@ istanbul.edu.tr

More information

SVC-DASH-M: Scalable Video Coding Dynamic Adaptive Streaming Over HTTP Using Multiple Connections

SVC-DASH-M: Scalable Video Coding Dynamic Adaptive Streaming Over HTTP Using Multiple Connections SVC-DASH-M: Salable Video Coding Dynami Adaptive Streaming Over HTTP Using Multiple Connetions Samar Ibrahim, Ahmed H. Zahran and Mahmoud H. Ismail Department of Eletronis and Eletrial Communiations, Faulty

More information

The Implementation of RRTs for a Remote-Controlled Mobile Robot

The Implementation of RRTs for a Remote-Controlled Mobile Robot ICCAS5 June -5, KINEX, Gyeonggi-Do, Korea he Implementation of RRs for a Remote-Controlled Mobile Robot Chi-Won Roh*, Woo-Sub Lee **, Sung-Chul Kang *** and Kwang-Won Lee **** * Intelligent Robotis Researh

More information

An Efficient and Scalable Approach to CNN Queries in a Road Network

An Efficient and Scalable Approach to CNN Queries in a Road Network An Effiient and Salable Approah to CNN Queries in a Road Network Hyung-Ju Cho Chin-Wan Chung Dept. of Eletrial Engineering & Computer Siene Korea Advaned Institute of Siene and Tehnology 373- Kusong-dong,

More information

An Alternative Approach to the Fuzzifier in Fuzzy Clustering to Obtain Better Clustering Results

An Alternative Approach to the Fuzzifier in Fuzzy Clustering to Obtain Better Clustering Results An Alternative Approah to the Fuzziier in Fuzzy Clustering to Obtain Better Clustering Results Frank Klawonn Department o Computer Siene University o Applied Sienes BS/WF Salzdahlumer Str. 46/48 D-38302

More information

Gradient based progressive probabilistic Hough transform

Gradient based progressive probabilistic Hough transform Gradient based progressive probabilisti Hough transform C.Galambos, J.Kittler and J.Matas Abstrat: The authors look at the benefits of exploiting gradient information to enhane the progressive probabilisti

More information

Unsupervised Stereoscopic Video Object Segmentation Based on Active Contours and Retrainable Neural Networks

Unsupervised Stereoscopic Video Object Segmentation Based on Active Contours and Retrainable Neural Networks Unsupervised Stereosopi Video Objet Segmentation Based on Ative Contours and Retrainable Neural Networks KLIMIS NTALIANIS, ANASTASIOS DOULAMIS, and NIKOLAOS DOULAMIS National Tehnial University of Athens

More information