Flexible ASIC: Shared Masking for Multiple Media Processors

Size: px
Start display at page:

Download "Flexible ASIC: Shared Masking for Multiple Media Processors"

Transcription

1 54.1 Flexble ASIC: Shared Maskng for Multple Meda Processors Jennfer L. Wong Unv. of Calf., Los Angeles Los Angeles, Calforna Farnaz Kourshanfar Unv. of Calf., Berkeley Berkeley, Calforna Modrag Potkonjak Unv. of Calf., Los Angeles Los Angeles, Calforna ABSTRACT ASIC provdes more than an order of magntude advantage n terms of densty, speed, and power requrement per gate. However, economc (cost of masks) and technologcal (deep mcron manufacturablty) trends favor FPGA as an mplementaton platform. In order to combne the advantages of both platforms and allevate ther dsadvantages, recently a number of approaches, such as structured ASIC/regular fabrcs, have been proposed. Our goal s to ntroduce an approach that has the same objectve, but s orthogonal to those already proposed. The dea s to mplement several ASIC desgns n such a way that they share the datapath, memory structure, and several bottom layers of nterconnect, whle each desgn has only a few unque metal layers. We dentfed and addressed two man problems n our quest to develop a CAD flow for realzaton of such desgns. They are: () the creaton of the datapath, and () the dentfcaton of common and unque nterconnects for each desgn. Both problems are solved optmally usng ILP formulatons. We assembled a desgn flow platform usng two new programs and the Trmaran and Shade tools. We quanttatvely analyzed the advantages and dsadvantages of the approach usng the Medabench benchmark sute. Categores and Subject Descrptors: B.5.1 [Desgn Ads]: Optmzaton General Terms: Desgn. Keywords: ASIC, nterconnect, optmzaton. 1. INTRODUCTION The tradtonal dlemma for many desgn teams has been ASIC or FPGA. FPGAs provde sgnfcant advantages over ASIC n terms of turn-around-tme, flexblty, testablty and sutablty for debuggng. However, for a gven technology ASIC provdes over 12 tmes hgher speed, almost 50 tmes hgher densty, and more than 500 tmes lower power per gate [18]. Exponental growth of mask cost and a more demandng manufacturablty process wth each new gener- Permsson to make dgtal or hard copes of all or part of ths work for personal or classroom use s granted wthout fee provded that copes are not made or dstrbuted for proft or commercal advantage and that copes bear ths notce and the full ctaton on the frst page. To copy otherwse, to republsh, to post on servers or to redstrbute to lsts, requres pror specfc permsson and/or a fee. DAC 2005, June 13 17, 2005, Anahem, Calforna, USA. Copyrght 2005 ACM /05/ $5.00. aton of deep submcron technology caused two major ramfcatons on mplementaton platforms. Specfcally, whle n 0.25 mcron the Non-Recurrng Engneerng (NRE) cost (mask set and probe card) was $100,000, t wll approach $2 mllon for 65 nm [18]. At the same tme prntablty and process varaton challenges favor regular desgns [13]. The frst consequence s that regardless of great delay, area, and ASIC advantages, the number of ASIC desgns has to be consstently decreasng. For example, over the last 15 years, the number of ASIC desgns has been reduced 6 tmes (from 15,000 to 2,500). ASIC s currently used only for very large volume desgns. If we consder a relatvely large half of mllon volume, n 65 nm, the NRE cost wll add $4 per desgn, whch s often unacceptable overhead. The second consequence s the emergence of a quest for new desgn mplementaton platforms and desgn methodologes to develop solutons that preserve some of the advantages of ASIC whle reducng NRE costs and ncreasng manufacturablty. A number of mplementaton platforms amed to reduce mask and therefore NRE costs and to brdge the gap between ASICs and FPGAs have recently been proposed. The new ndustral platform s commonly called structured ASIC or hybrd-fpga structures and s pursued by large semconductor companes, CAD houses, and start-ups [4, 5, 7, 9, 12, 15, 17, 18]. Academc researchers often refer to ths mplementaton platform as regular fabrcs and have focused ther efforts manly on ether defnng the structure of the fabrc or explorng how the physcal desgn s mpacted by the emergng mplementaton platforms [2, 6, 7, 8, 11, 13, 14]. Most often the new platform conssts of an array of logc cells bult usng dffuson and a few bottom metal layers. Logc cells contan programmable combnatoral logc and often storage elements. Therefore, the bottom mask layers are common to all desgns that wll be executed on ths platform. Note that n addton to logc cells that can be customzed, the nterconnect network on the top few layers s generated usng custom masks specfcally crafted for a specfc desgn. Therefore the approach sgnfcantly reduces the number of requred custom masks and preserves some of the advantages n terms of delay, power, and area of a standard cell-based ASIC mplementaton platform. It s estmated that performances of ths platform are nferor to fully custom ASIC by a factor of 2-4. However, specalzed structure ASIC CAD tools are requred. Although our approach explores exactly the same dvson of layers n terms of common and customzed, the approach s complementary to the ones already presented. Note that a few bottom layers n our case can be ether fully customzed 909

2 A 1 A 2 B 1 B 2 C 1 C 2 D 1 D 2 A B Fgure 1: Basc archtecture of meda applcatons. and mplemented usng cell-based ASIC, or can leverage on the bottom few layers of a structured ASIC. In the former case, we preserve all ASIC advantages n terms of area, delay, and power. In the later case, the emphass s on enhanced manufacturablty through regularty and reduced testng and debug costs. Furthermore, we focus on the regster transfer and behavoral levels for the development of CAD tools for the effectve use of new ASIC mplementaton platforms wth reduced cost and shared bottom layers. We dentfed two man aspects for developng CAD tools for flexble top layer ASIC mplementaton platforms. The frst problem s how to select functonal unts that wll address the needs of all applcatons. The second problem s how to assgn nterconnect to both the shared and customzed metal layers. We proved that both problems are NP-complete and can be optmally solved usng nteger lnear programmng (ILP) formulatons. Snce the sze of the nstances for realstc desgns are relatvely moderate, the runtme of the CPLEX ILP solver n all cases was less than 0.1 second. 1.1 Motvatonal Example The key dea behnd the flexble ASIC approach s to generate nterconnect assgnments for dfferent applcatons n such a way that all the applcatons have the same basc mask layers. The only dfference would be the nterconnect confguratons for the upper layers. The approach can be ntroduced and llustrated usng the followng example. Assume that we have fve dfferent applcaton-specfc meda processors, namely JPEG encoder, MPEG encoder, GSM encoder, G.721 encoder and Pegwt encoder. For the sake of clarty and smplcty, we show a very smplstc vew of these encoder archtectures. Assume that the basc archtecture shown n Fgure 1 should be used for buldng all the encoders and only the nterconnect requrements dffer. We have four dfferent functonal unts denoted by A, B, C, and D. Each unt has an output wth the same name as ts block. There are two nputs for each blocks denoted as A 1, A 2, B 1, B 2, C 1, C 2, D 1,andD 2 respectvely. Table 1, shows the nterconnect requrements for these fve dfferent meda processors. Each row of the table represents an nterconnect between the output of a block to the nput of another block. Each column of the table represents a desgn. Assume that the chp has nne layers and requres that each layer can only have a maxmum of two nterconnects. If we drectly mplement each desgn, we wll need fve ASICs each wth nne layers of nterconnect, for a total of 45 nterconnect masks. However, by usng the concept of flexble ASIC and analyzng the common nterconnects requred for all desgns, the nterconnect requrements can be organzed as n Table 2. The leftmost column of Table 2 shows the nputs to each block. Each column afterward shows the output that C D Applcaton IC JPEG MPEG GSM P.721 Pegwt AA 1 AA AB AB 2 AC 1 AD BA 2 BB 1 BC BC 2 BD BD CA CB 1 CB 2 CD 1 DA 1 DB 2 DC 1 DC DD 1 DD 2 Table 1: Interconnects requred for each of the meda processors. Input Basc JPEG MPEG GSM P.721 Pegwt A 1 A,D C C - - C A 2 B A A B 1 B,C A A - - A B 2 A,C,D C 1 A,D B B B - - C 2 B - - B B - D 1 C,D B B D 2 D - A A,B B - Table 2: The nput/output relatonshp for the encoder applcatons. enters the correspondng nput. The second column shows the basc archtecture nterconnecton that s the same for all fve dfferent applcatons. Columns 3 through 6 ndcate the outputs per meda applcaton that are the requred nterconnects wth each of the nputs on each row, n addton to the basc nterconnects. Therefore, the frst seven layers are completely dedcated to the 14 nterconnects ncluded n the basc archtecture (column 2 of Table 2). The remanng nterconnects are assgned to the remanng two layers n such a way that we have the mnmum number of varants per layer and completely mplement each of the meda processors. We fnd the followng four varants (nstantaton for the nterconnects on one layer) for our fve encoders: v 1: CA 1, AB 1, v 2: CA 1, AB 1, v 3: BD 2, DC 2, v 4: BD 1, AA 2. By lookng at the nterconnect requrements from Table 2, we see that varants v 1 and v 2 can be used for realzng the MPEG and the JPEG decoders. Varants v 2 and v 3 can realze the GSM encoder. Varants v 3 and v 4 can realze the G.721 encoder, whle varants v 1 and v 4 can realze the Pegwt encoder. Therefore, the addton of only two layers to the basc layers can realze the full archtecture for all fve encoders. For each archtecture the frst seven layers are dentcal n terms of masks, whle the last two layers are realzed by two of the four varants correspondng to each archtecture and are hence dfferent. Therefore, usng flexble ASIC only eleven unque metal layers are necessary to realze all fve meda processors. 910

3 Lbrary Programs CDFG Profler Archtectural Profle Logc Level Defnton DP & Interconnect Archtecture Fgure 2: Global Flow. Interconnect Assgnment Layer Identfcaton & Assgnment In the followng secton we brefly dscuss the related work. In Secton 2 we dentfy the overall flow of the approach for realzaton of multple applcatons on a sngle chp. We ntroduce our ILP approach for determnng the basc layers of the chp n Secton 3. We formulate the nterconnect assgnment problem usng nteger lnear programmng (ILP) for two optmzatons: mnmal varant layers and mnmal desgn and manufacturng cost. Before concludng the work we present the expermental analyss of the technque. 2. DESIGN FLOW Fgure 2 shows the flow of our flexble ASIC approach. The nput to the CAD system s a set of applcatons specfed usng C or VHDL programs. In our expermentaton, we focused on the mplementaton of large meda applcatons that have been specfed and smulated for ther functonalty correctness usng C programs. Specfcally, we consdered all Medabench applcatons [10] and the Vterb convoluton encoder. All programs have been profled for the dentfcaton of the number of operatons and transfers of each type. Proflng s accomplshed through a relatvely smple wrapper around the Trmaran [16] and Shade tools [1]. The output of the CDFG profler s the requred datapath and nterconnect requrements of each applcaton specfed at the behavoral level. Ths profle s an nput, together wth a lbrary of the avalable functonal unts, to a program that dentfes the best unts for ncluson n the datapath n such a way the the lkelhood that all tmng constrants for all applcatons are satsfed and the area of the desgn mnmzed. Note that we do not specfcally optmze the memory structure. In practce we observed that t s suffcent to allocate memory requrements of the most ntensve applcaton for all others. The result of the datapath defnton module has two components. The frst s a set of mult-functonal unts that satsfy the specfed clock cycle tme. The second s the set of requred nterconnects for each applcaton. The second component s the nput nto the layer dentfcaton and assgnment program that mnmzes the total number of unque layers requred for all applcatons. We consdered two varants of the layer dentfcaton assgnment problem: () where the goal s to solely mnmze the number of unque layers, and () where the layers are dentfed and assgned n such a way that the sum of the NRE and manufacturng cost s mnmzed. Both the datapath and layer dentfcaton assgnment programs are solved usng ILP. For ths purpose we used the CPLEX solver. 3. DATAPATH DEFINITION In ths secton, we frst formulate the datapath defnton problem and ts complexty. We present the ILP formulaton of the problem whch can be solved usng any ILP solver. We assume that a set of applcatons s gven. The goal s to allocate a datapath for these applcatons usng the avalable lbrary of functonal unts. The lbrary contans specfcatons for a number of unt types. The goal s to fnd a mappng whch mnmzes the total area used by the logcal level defnton (actual functonal unts) of our applcatons. The problem can be more formally stated as follows. Note that, we defne the problem nstance n such a way that we can drectly use t as the nput to our ILP formulaton n the next subsecton. Problem: Datapath Defnton Problem Instance: A set of applcatons MP k, k =1,..., N k. AnumberofoperatonsOP j, j =1,..., N j. A set of requred operatons RO kj, that defne f the operaton OP j s requred by the applcaton MP k, j =1,..., N j, k =1,..., N k : 1, f operaton OPj s requred by MP RO kj = k, The archtectural profles P kj, that defnes the number of operatons of type OP j n the applcatons MP k, j = 1,..., N j, k =1,..., N k. A number of for functonal unt types (famles) TFU, =1, 2,..., N. A lbrary wth elements L j such that each type of functonal unts TFU can perform the operatons OP j, = 1, 2,..., N, j =1,..., N j: 1, f TFU can perform operatons of type OP L j = j, An actual set of functonal unts FU a, a =1,..., N a; An area cost A, for each functonal unt type TFU, =1, 2,..., N. A throughput constrant T k correspondng to each applcaton MP k, k =1,..., N k. Queston: Is there a subset of functonal unts FU a, a = 1,..., N a, such that the subset has the cardnalty N sub and all applcatons are realzable (functonal unts and transfers) and the total cost s mnmzed? 3.1 NP-Completeness The datapath defnton problem s NP-complete. It s a specal case of the mnmum cover problem. The mnmum cover problem s formulated as followng n Garey-Johnson format [3]. Problem: [SP5] Mnmum cover Instance: Collecton C of subsets of a fnte set S, postve nteger K C. Queston: Does C contan a cover for S of sze K or less,.e., a subset C C wth C K such that every element of S belongs to at least one member of C? We prove that the mnmum cover problem s a specal case of the datapath defnton problem. Each element n the set cover problem corresponds to a type of operaton n the specfcaton of the desgns. Each collecton corresponds to a functonal unt n the lbrary that can execute a set of operatons that correspond to the elements of the set cover problem. The goal s to select at most K collectons, or 911

4 functonal unts, whch mplement all the operatons. Note that the throughput constrant on the functonal unts s not enforced n the case of the mnmum cover problem. 3.2 ILP Formulaton of the Problem In order to solve ths problem usng ILP, a number of varables are defned. The frst set of varables are denoted by x a and relate the actual functonal unts FU a to the type of functonal unts TFU, a =1,..., N a, =1, 2,..., N : 1, f functonal unts FUa s of type TFU x a =, (1) The second set of varables are denoted by t kja and are defned for each applcaton MP k, and determne the number of operatons of type OP j assgned to the functonal unt FU a,,k =1,..., N k, j =1,..., N j, a =1,..., N a. The objectve functon s to mnmze the total cost of the mplementaton area requred for mplementaton of the applcatons (actual functonal unts): mn A x a We ntroduce four sets of constrants for the problem. The frst set of constrants (C 1), ensures that each functonal unt FU a s only of one specfc type TFU. The second set of constrants (C 2), enforces each applcaton to be realzable by the selected functonal unts. Thus, for each applcaton MP k, all operatons of type OP j, must be mplemented by at least one selected functonal unt FU a: C 1 : a, x a =1 C 2 : k, j, a O kj L jx a 1 a Not only must each of the applcatons be realzable, but the throughput constrant for each applcaton should be met. The thrd and fourth sets of constrants enforce the throughput constrants for each applcaton and each type of operaton. The thrd set of constrants (C 3) ensures that all nstances of the operaton are assgned to a selected functonal unt whch can perform the operaton. C 3 conssts of two notons, the frst of whch specfes f the selected functonal unt FU a of type TFU can perform the operaton OP j. If so, then ths functonal unt can be assgned the operaton of type OP j. Secondly, the sum of all operatons of type TFU j assgned to all selected functonal unts must equal the total number of operatons requred for MP k.we mplement these requrements usng the logcal AND ( ) functon. AND s mplemented n terms of ILP constrants by ntroducng a new varable and addng three constrants for each AND operaton. Specfcally, a b = c s mplemented n the followng way where c s a bnary varable. a + b c 1 a c 0 b c 0 The fourth and last set of constrants (C 4), ensures that all the assgned operatons to an actual functonal unt for a gven applcaton do not exceed the throughput lmtatons of that applcaton. For each applcaton MP k,andforeach selected functonal unt FU a, for any type of operaton OP j that the functonal unt of a certan type can perform, the sum must be less than the throughput specfed for the operaton: C 3 : O kj =1, C 4 : k, a, (L jx a t kja )=P kj a (L jx a t kja ) T k 4. INTERCONNECT ASSIGNMENT j Once the datapath for each of the processors s defned (.e. the soluton from the datapath defnton problem), the next step s to determne the nterconnect assgnments for each of the processors. In ths secton, we frst formulate the nterconnect assgnment problem and the complexty of the problem. Next, we present the ILP formulaton of the problem. Fnally, we ntroduce a varant on the orgnal problem formulaton whch addresses the mnmzaton of the total mplementaton cost of all layers. Assume the nterconnect specfcatons for a number of applcaton-specfc meda processors MPs s known. In addton, the specfc nterconnects for each of the meda processors are gven. The fabrcaton process has a constrant on the number of nterconnects per layer. The goal s to assgn nterconnects to the dfferent layers for each applcaton n such a way that the maxmum number of layers can be shared between the applcatons, only a few layers vary for dfferent applcatons, and the constrants on the number of nterconnects on one layer are satsfed. More formally the problem can defned as follows. Problem: Interconnect Assgnment Problem Instance: Applcaton-specfc meda processors (MP k ) wth specfed nterconnects, number of layers (S), number of nterconnects per layer (M), number of allowable varatons for each layer (R). Queston: Is there an assgnment of nterconnects to each layer s.t. the total number of created varatons are for all layers are mnmal, and each MP k s created? 4.1 NP-Completeness The nterconnect assgnment problem s NP-complete. It s a specal case of the Largest Common Subgraph problem. The Largest Common Subgraph problem s formulated n the Garey-Johnson format [3] as: Problem: [GT49] Largest Common Subgraph Instance: Graph G = (V 1,E 1), H = (V 2,E 2), postve nteger K. Queston: Do there exst subsets E 1 E 1 and E 2 E 2 wth E 1 = E 2 K such that the two subgraphs G = (V 1,E 1) and H =(V 2,E 2) are somorphc? Note that the nterconnect requrements for each applcaton form a graph, where the vertces of the graphs are the functonal blocks of the MPs and the edges are the nterconnects between the blocks. When we have only two applcatons, the selected subset of edges s the set of nterconnects whch are shared by the meda processors. In ths case the goal s to fnd the maxmal set. 4.2 ILP Formulaton We now show the notatons and the varables that are used for formulatng the nterconnect assgnment problem n 912

5 ILP. We assume that the total number of layers s specfed s = 1,...,S}, aswellasthemaxmumnumberofvarants per layer r = 1,...,R}. The maxmum number of nterconnects on one layer s denoted by a constant M and s an nput to the problem. The three dmensonal matrx W jk s an nput to the problem and ndcates the data flow edges (nterconnects) for each applcaton k. More specfcally, 1, f the edge (,j) exst n applcaton k, W jk = The soluton to the problem fnds dfferent varants (nstantatons) for each of the layers. The four dmensonal varable x jrs s used to denote the nterconnect assgnment for each layer n each of the varants. More formally, 1, f the edge (,j) exst n varant r on layer s, x jrs = Addtonally, we defne a three dmensonal varable, u krs, that lnks each applcaton to the layers and the varant on that layer on whch ts nterconnect requrements are mplemented. 1, f the applcaton k uses varant r on layer s, u krs = Lastly, we defne a varable to denote f a possble varant s used on each layer. The two dmensonal varable y rs s such a varable and s defned below. 1, f varant r s used on layer s, y rs = Note that the varables x jrs, u krs,andy rs are not ndependent of each other and we wll defne a set of constrants n the problem formulaton that relates these three varables. The objectve functon of the problem s to mnmze the total number of varants on all of the chp layers. Ths objectve functon can be formally wrtten as: mn There are fve constrants to the formulaton whch relate the above varables together and specfy the lmtatons on the varables. The frst constrant of the problem (C 5)ensures that each of the requred nterconnects for all the applcatons are mplemented on at least one varant on any of the layers of the chp. Ths ensure that each of the meda applcatons can be realzed on the chp. More formally, C 5 : k [1,K], (,j) W jk x jrs r s C 6 : s, r, x jrs M C 7 : k [1,K], layers s, r s y rs j u krs 1 The second constrant of the problem (C 6) s to ensure that each layer has at most the maxmum number of M nterconnects at each layer,.e., When realzng each of the applcatons the requred nterconnects must all be mplemented on only one varant per layer. To ensure ths, the thrd constrant (C 7) specfes that each applcaton can only be assgned at most one varant at each layer. In other words, The two fnal constrants (C 8 and C 9)relateallthevarables to each other. In partcular, the constrant C 8 determnes r the relatonshp between the overall used layers, y rs, and the layers used by each specfc applcaton, u krs. The relatonshp s straghtforward from the defnton of the two varables. If an applcaton k uses varant r on layer s (e. u krs = 1), then the varant r on layer s must be used. Ths relatonshp s shown as follows. C 8 : k [1,K] y rs u krs C 9 :, j u krs W jk x jrs Constrant C 9 specfes the relatonshp between the nterconnects on each varant of each layer, x jrs, andthevarant layers whch are used for each applcaton, u krs. Ths relatonshp s also drectly concluded from the defntons. If the edge (,j) exsts n varant r on layer s, and also the edge (,j) s requred for realzaton of applcaton k, thenx jrs and W jk are one. In such a stuaton, the applcaton k can use the varant r on layer s,.e. u krs = Interconnect Assgnment for Cost In ths Subsecton the focus of the optmzaton swtches from mnmzng the number of total varant layers to mnmzng the monetary cost for mplementaton of the layers, both desgn and manufacturng. We assume that the cost of desgnng each of the layers s known along wth the amount of each meda processor to be manufactured. Consderng these new nputs the problem can be formally defned as follows. Problem: Cost Mnmzaton wth Interconnect Assgnment Instance: Applcaton-specfc meda processors (MP k ) wth specfed nterconnects, number of layers (S), number of nterconnects per layer (M), number of allowable varatons for each layer (R), cost for desgn for each layer (D), volume of manufacturng each meda processors (V k ). Queston: Is there an assgnment of nterconnects to each layer s.t. the total cost of desgn and manufacturng all desgns s mnmal and each MP k s created? In order to address ths problem whch s an extenson of the nterconnect assgnment problem, we reformulate the objectve functon to focus on the total cost rather than the number of layers. The total cost s the cost to desgn each of the necessary layers, plus for each layer the volume of manufacturng necessary for each of the meda processors. Therefore, usng the same formulaton varables from the nterconnect assgnment problem we formulate the objectve functon as: mn D ( y rs)+ (V k ( u krs )) r s k r s All constrants of the problem are dentcal to the nterconnect assgnment problem. 5. EXPERIMENTAL RESULTS Flexble ASIC s partcularly well suted for computatonal ntensve applcatons such as meda processors. Therefore, for evaluaton purposes we used 21 applcatons from the Medabench benchmark sute [10] and the Vterb decoder that s wdely used n both wreless and optcal communcatons. The Vterb decoder was obtaned from the LSI Logc corporaton and s denoted n Table 3 by V. The Medabench 913

6 desgns # ofdesgns # ofntal # ofcustom # of fnal reducton ntal fnal cost layers layers layers cost ($M) cost ($M) reduct A,B,C,D,V % % G,H,I,J,V % % A,B,L,M,N % % A,B,C,D,I,J % % G,H,I,J,M,N % % O,P,Q,R,S,T % % A,B,C,D,G,H,I,J % % A,B,E,F,G,H,I,J % % A,B,G,H,I,J,L,M,N,V % % A,B,C,D,G,H,I,J,K,L % % Table 3: Analyss of the new flexble ASCI scheme n terms of technologcal complexty (number of layers) and estmated cost for the case where on manufacturng of each desgn $1M s spent per desgn. desgns have been denoted usng the followng notaton : Cjpeg (A), Djpeg (B), MPEG2encode (C), MPEG2decode (D), toast (E), untoast (F), rawcaudo (G), rawdaudo (H), G.721 encode (I), G.721 decode (J), PGPencode (K), PG- Pdecode (L), PEGWITencode (M), PEGWITdecode (N), Ghostscrpt(O),mpmap(P),osdemo(Q),texgen(R),rasta (S), EPIC (T), and unepic (U). We organzed all the desgns nto ten groups by smlartes of ther applcatons and smlarty n ther type of datapath and nterconnect requrements. The smallest group conssted of fve desgns and the largest contaned ten desgns. For all desgns we assumed the total number of avalable layers was ten and that each metal layer can be used to mplement at most fve global nterconnect between dfferent functonal unts. In all examples, the datapath conssted of eght unts. Table 3 shows characterstcs of the desgns when mplemented usng the tradtonal ASIC where each desgn s mplemented separately and after applcaton of the new flexble ASIC desgn methodology and platform. The frst column ndcates whch desgns were consdered for smultaneous mplementaton on a sngle platform usng the flexble ASIC methodology. The second column shows the total number of layers requred for all desgn when the tradtonal ASIC approach s used. The fourth column ndcates the number of customzed layers used by the flexble ASIC approach. The next two columns show the total number of layers requred by the flexble ASIC approach and the total reducton n percentage of layers requred for mplementaton. The last three columns are used to estmate the economc advantage of flexble ASIC under the assumpton that the cost of a layer s $100,000 and that the manufacturng cost for each type of desgn n $1M. Note that the reducton n number of layers s equvalent to the reducton n NRE cost. The average NRE cost was reduced by 74% and the average overall cost reducton by 34%. Note that n all cases, the cost reducton ncreases as the number of desgns targeted to share the common bottom layers ncreases. 6. CONCLUSION In order to reduce NRE cost and enhance the applcablty of ASIC mplementaton platforms, we ntroduced a common bottom - customzed top layers flexble ASIC approach. The approach provdes a new alternatve between FPGA and ASIC mplementaton platforms: t essentally mantans ASIC advantages n terms of gate speed, densty, and power, whle reducng the NRE cost. We dentfed the two key phases for CAD tools that target the new ASIC platform and solved assocated optmzaton problems optmally usng the CPLEX ILP solver. The expermentaton ndcates that the new platform can be used both to reduce NRE and overall mplementaton costs. 7. REFERENCES [1] R. F. Cmelk and D. Keppel. Shade: A fast nstructon-set smulator for executon proflng. Techncal Report SMLI 93-12, UWCSE , [2] J. Cong, Y. Fan, X. Yang, and Z. Zhang. Archtecture and synthess for mult-cycle communcaton. In Internatonal Symposum on Physcal Desgn, pages , [3] M. Garey and D. Johnson. Computers and Intractablty. W.H. Freeman, [4] [5] platform asc/ndex.html. [6] B. Hu, H. Jang, Q. Lu, and M. Marek-Sadowska. Synthess and placement flow for gan-based programmable regular fabrcs. In Internatonal Symposum on Physcal Desgn, pages , [7] A. Kahng, I. Bolsens, J. Cohn, B. Gupta, C. Hamln, Z. Orbach, and L. Plegg. What s the next mplementaton fabrc? IEEE Desgn and Test of computers, 20(6):86 95, Nov [8] V. Kheterpal, A. J. Strojwas, and L. Plegg. Routng archtecture exploraton for regular fabrcs. In Conference on Desgn Automaton, pages , [9] D. E. Lackey, P. S. Zuchowsk, and J. Koehl. Desgnng mega-asics n nanogate technologes. In Conference on Desgn Automaton, pages , [10] C. Lee, W. Mangone-Smth, and M. Potkonjak. Medabench: A tool for evaluatng multmeda and communcaton systems. In MICRO-30 Conference, pages , Nov [11] F. Mo and R. K. Brayton. Fshbone: a block-level placement and routng scheme. In Internatonal Symposum on Physcal Desgn, pages , [12] T. Okamoto, T. Kmoto, and N. Maeda. Desgn methodology and tools for NEC electroncs structured ASIC ISSP. In Internatonal Symposum on Physcal Desgn, pages 90 96, [13] L. Plegg, et al. Explorng regular fabrcs to optmze the performance-cost trade-off. In Conference on Desgn Automaton, pages , [14] Y. Ran and M. Marek-Sadowska. On desgnng va-confgurable cell blocks for regular fabrcs. In Conference on Desgn Automaton, pages , [15] D. D. Sherlekar. Desgn consderatons for regular fabrcs. In Internatonal Symposum on Physcal Desgn, pages , [16] Trmaran. [17] K. Wu and Y. Tsa. Structured ASIC, evoluton or revoluton? In Internatonal Symposum on Physcal Desgn, pages , [18] P. S. Zuchowsk, C. B. Reynolds, R. J. Grupp, S. G. Davs, B. Cremen, and B. Troxel. A hybrd ASIC and FPGA archtecture. In Internatonal Conference on Computer-aded Desgn, pages ,

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints TPL-ware Dsplacement-drven Detaled Placement Refnement wth Colorng Constrants Tao Ln Iowa State Unversty tln@astate.edu Chrs Chu Iowa State Unversty cnchu@astate.edu BSTRCT To mnmze the effect of process

More information

Scheduling with Integer Time Budgeting for Low-Power Optimization

Scheduling with Integer Time Budgeting for Low-Power Optimization Schedlng wth Integer Tme Bdgetng for Low-Power Optmzaton We Jang, Zhr Zhang, Modrag Potkonjak and Jason Cong Compter Scence Department Unversty of Calforna, Los Angeles Spported by NSF, SRC. Otlne Introdcton

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

CHAPTER 2 DECOMPOSITION OF GRAPHS

CHAPTER 2 DECOMPOSITION OF GRAPHS CHAPTER DECOMPOSITION OF GRAPHS. INTRODUCTION A graph H s called a Supersubdvson of a graph G f H s obtaned from G by replacng every edge uv of G by a bpartte graph,m (m may vary for each edge by dentfyng

More information

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan

More information

GSLM Operations Research II Fall 13/14

GSLM Operations Research II Fall 13/14 GSLM 58 Operatons Research II Fall /4 6. Separable Programmng Consder a general NLP mn f(x) s.t. g j (x) b j j =. m. Defnton 6.. The NLP s a separable program f ts objectve functon and all constrants are

More information

Hierarchical clustering for gene expression data analysis

Hierarchical clustering for gene expression data analysis Herarchcal clusterng for gene expresson data analyss Gorgo Valentn e-mal: valentn@ds.unm.t Clusterng of Mcroarray Data. Clusterng of gene expresson profles (rows) => dscovery of co-regulated and functonally

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Needed Information to do Allocation

Needed Information to do Allocation Complexty n the Database Allocaton Desgn Must tae relatonshp between fragments nto account Cost of ntegrty enforcements Constrants on response-tme, storage, and processng capablty Needed Informaton to

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

Conditional Speculative Decimal Addition*

Conditional Speculative Decimal Addition* Condtonal Speculatve Decmal Addton Alvaro Vazquez and Elsardo Antelo Dep. of Electronc and Computer Engneerng Unv. of Santago de Compostela, Span Ths work was supported n part by Xunta de Galca under grant

More information

Repeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits

Repeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits Repeater Inserton for Two-Termnal Nets n Three-Dmensonal Integrated Crcuts Hu Xu, Vasls F. Pavlds, and Govann De Mchel LSI - EPFL, CH-5, Swtzerland, {hu.xu,vasleos.pavlds,govann.demchel}@epfl.ch Abstract.

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some materal adapted from Mohamed Youns, UMBC CMSC 611 Spr 2003 course sldes Some materal adapted from Hennessy & Patterson / 2003 Elsever Scence Performance = 1 Executon tme Speedup = Performance (B)

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

Data Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach

Data Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach Data Representaton n Dgtal Desgn, a Sngle Converson Equaton and a Formal Languages Approach Hassan Farhat Unversty of Nebraska at Omaha Abstract- In the study of data representaton n dgtal desgn and computer

More information

3. CR parameters and Multi-Objective Fitness Function

3. CR parameters and Multi-Objective Fitness Function 3 CR parameters and Mult-objectve Ftness Functon 41 3. CR parameters and Mult-Objectve Ftness Functon 3.1. Introducton Cogntve rados dynamcally confgure the wreless communcaton system, whch takes beneft

More information

Gradual Relaxation Techniques with Applications to Behavioral Synthesis *

Gradual Relaxation Techniques with Applications to Behavioral Synthesis * Gradual Relaxaton Technques wth Applcatons to Behavoral Synthess * Zhru Zhang, Ypng Fan, Modrag Potkonjak, Jason Cong Computer Scence Department, Unversty of Calforna, Los Angeles Los Angeles, CA 90095,

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

Specifications in 2001

Specifications in 2001 Specfcatons n 200 MISTY (updated : May 3, 2002) September 27, 200 Mtsubsh Electrc Corporaton Block Cpher Algorthm MISTY Ths document shows a complete descrpton of encrypton algorthm MISTY, whch are secret-key

More information

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following. Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

Configuration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations*

Configuration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations* Confguraton Management n Mult-Context Reconfgurable Systems for Smultaneous Performance and Power Optmzatons* Rafael Maestre, Mlagros Fernandez Departamento de Arqutectura de Computadores y Automátca Unversdad

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

Memory Modeling in ESL-RTL Equivalence Checking

Memory Modeling in ESL-RTL Equivalence Checking 11.4 Memory Modelng n ESL-RTL Equvalence Checkng Alfred Koelbl 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 koelbl@synopsys.com Jerry R. Burch 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 burch@synopsys.com

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

Solving two-person zero-sum game by Matlab

Solving two-person zero-sum game by Matlab Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

Routing in Degree-constrained FSO Mesh Networks

Routing in Degree-constrained FSO Mesh Networks Internatonal Journal of Hybrd Informaton Technology Vol., No., Aprl, 009 Routng n Degree-constraned FSO Mesh Networks Zpng Hu, Pramode Verma, and James Sluss Jr. School of Electrcal & Computer Engneerng

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

Test-Cost Modeling and Optimal Test-Flow Selection of 3D-Stacked ICs

Test-Cost Modeling and Optimal Test-Flow Selection of 3D-Stacked ICs Test-Cost Modelng and Optmal Test-Flow Selecton of 3D-Stacked ICs Mukesh Agrawal, Student Member, IEEE, and Krshnendu Chakrabarty, Fellow, IEEE Abstract Three-dmensonal (3D) ntegraton s an attractve technology

More information

Flexible ASIC: Shared Masking for Multiple Media Processors

Flexible ASIC: Shared Masking for Multiple Media Processors 54.7 Flexible ASIC: Shared Masking for Multiple Media Processors Jennifer L. Wong Farinaz Kourshanfar Miodrag Potkonjak Univ. of Calif., Los Angeles Univ. of Calif., Berkeley Univ. of Calif., Los Angeles

More information

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques

More information

An Integer Linear Programming Approach for Identifying Instruction-Set Extensions

An Integer Linear Programming Approach for Identifying Instruction-Set Extensions An Integer Lnear Programmng Approach for Identfyng Instructon-Set Extensons Kublay Atasu Department of Computer Engneerng Bogazc Unversty, Turkey atasu@boun.edu.tr Günhan Dündar Department of Electrcal

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Chapter 6 Programmng the fnte element method Inow turn to the man subject of ths book: The mplementaton of the fnte element algorthm n computer programs. In order to make my dscusson as straghtforward

More information

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming CS 4/560 Desgn and Analyss of Algorthms Kent State Unversty Dept. of Math & Computer Scence LECT-6 Dynamc Programmng 2 Dynamc Programmng Dynamc Programmng, lke the dvde-and-conquer method, solves problems

More information

Maintaining temporal validity of real-time data on non-continuously executing resources

Maintaining temporal validity of real-time data on non-continuously executing resources Mantanng temporal valdty of real-tme data on non-contnuously executng resources Tan Ba, Hong Lu and Juan Yang Hunan Insttute of Scence and Technology, College of Computer Scence, 44, Yueyang, Chna Wuhan

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

A Saturation Binary Neural Network for Crossbar Switching Problem

A Saturation Binary Neural Network for Crossbar Switching Problem A Saturaton Bnary Neural Network for Crossbar Swtchng Problem Cu Zhang 1, L-Qng Zhao 2, and Rong-Long Wang 2 1 Department of Autocontrol, Laonng Insttute of Scence and Technology, Benx, Chna bxlkyzhangcu@163.com

More information

Array transposition in CUDA shared memory

Array transposition in CUDA shared memory Array transposton n CUDA shared memory Mke Gles February 19, 2014 Abstract Ths short note s nspred by some code wrtten by Jeremy Appleyard for the transposton of data through shared memory. I had some

More information

Topology Design using LS-TaSC Version 2 and LS-DYNA

Topology Design using LS-TaSC Version 2 and LS-DYNA Topology Desgn usng LS-TaSC Verson 2 and LS-DYNA Wllem Roux Lvermore Software Technology Corporaton, Lvermore, CA, USA Abstract Ths paper gves an overvew of LS-TaSC verson 2, a topology optmzaton tool

More information

5 The Primal-Dual Method

5 The Primal-Dual Method 5 The Prmal-Dual Method Orgnally desgned as a method for solvng lnear programs, where t reduces weghted optmzaton problems to smpler combnatoral ones, the prmal-dual method (PDM) has receved much attenton

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science EECS 730 Introducton to Bonformatcs Sequence Algnment Luke Huan Electrcal Engneerng and Computer Scence http://people.eecs.ku.edu/~huan/ HMM Π s a set of states Transton Probabltes a kl Pr( l 1 k Probablty

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

Assembler. Building a Modern Computer From First Principles.

Assembler. Building a Modern Computer From First Principles. Assembler Buldng a Modern Computer From Frst Prncples www.nand2tetrs.org Elements of Computng Systems, Nsan & Schocken, MIT Press, www.nand2tetrs.org, Chapter 6: Assembler slde Where we are at: Human Thought

More information

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments Comparson of Heurstcs for Schedulng Independent Tasks on Heterogeneous Dstrbuted Envronments Hesam Izakan¹, Ath Abraham², Senor Member, IEEE, Václav Snášel³ ¹ Islamc Azad Unversty, Ramsar Branch, Ramsar,

More information

Active Contours/Snakes

Active Contours/Snakes Actve Contours/Snakes Erkut Erdem Acknowledgement: The sldes are adapted from the sldes prepared by K. Grauman of Unversty of Texas at Austn Fttng: Edges vs. boundares Edges useful sgnal to ndcate occludng

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

A New Token Allocation Algorithm for TCP Traffic in Diffserv Network

A New Token Allocation Algorithm for TCP Traffic in Diffserv Network A New Token Allocaton Algorthm for TCP Traffc n Dffserv Network A New Token Allocaton Algorthm for TCP Traffc n Dffserv Network S. Sudha and N. Ammasagounden Natonal Insttute of Technology, Truchrappall,

More information

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law)

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law) Machne Learnng Support Vector Machnes (contans materal adapted from talks by Constantn F. Alfers & Ioanns Tsamardnos, and Martn Law) Bryan Pardo, Machne Learnng: EECS 349 Fall 2014 Support Vector Machnes

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Interconnect Optimization for High-Level Synthesis of SSA Form Programs

Interconnect Optimization for High-Level Synthesis of SSA Form Programs Interconnect Optmzaton for Hgh-Level Synthess of SSA Form Programs Phlp Brsk Aay K. Verma Paolo Ienne Processor Archtecture Laboratory Swss Federal Insttute of Technology (EPFL) Lausanne, Swtzerland {phlp.brsk,

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

DESIGNING TRANSMISSION SCHEDULES FOR WIRELESS AD HOC NETWORKS TO MAXIMIZE NETWORK THROUGHPUT

DESIGNING TRANSMISSION SCHEDULES FOR WIRELESS AD HOC NETWORKS TO MAXIMIZE NETWORK THROUGHPUT DESIGNING TRANSMISSION SCHEDULES FOR WIRELESS AD HOC NETWORKS TO MAXIMIZE NETWORK THROUGHPUT Bran J. Wolf, Joseph L. Hammond, and Harlan B. Russell Dept. of Electrcal and Computer Engneerng, Clemson Unversty,

More information

Storage Binding in RTL synthesis

Storage Binding in RTL synthesis Storage Bndng n RTL synthess Pe Zhang Danel D. Gajsk Techncal Report ICS-0-37 August 0th, 200 Center for Embedded Computer Systems Department of Informaton and Computer Scence Unersty of Calforna, Irne

More information

A fault tree analysis strategy using binary decision diagrams

A fault tree analysis strategy using binary decision diagrams Loughborough Unversty Insttutonal Repostory A fault tree analyss strategy usng bnary decson dagrams Ths tem was submtted to Loughborough Unversty's Insttutonal Repostory by the/an author. Addtonal Informaton:

More information

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT 3. - 5. 5., Brno, Czech Republc, EU APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT Abstract Josef TOŠENOVSKÝ ) Lenka MONSPORTOVÁ ) Flp TOŠENOVSKÝ

More information

Routability Driven Modification Method of Monotonic Via Assignment for 2-layer Ball Grid Array Packages

Routability Driven Modification Method of Monotonic Via Assignment for 2-layer Ball Grid Array Packages Routablty Drven Modfcaton Method of Monotonc Va Assgnment for 2-layer Ball Grd Array Pacages Yoch Tomoa Atsush Taahash Department of Communcatons and Integrated Systems, Toyo Insttute of Technology 2 12

More information

A Facet Generation Procedure. for solving 0/1 integer programs

A Facet Generation Procedure. for solving 0/1 integer programs A Facet Generaton Procedure for solvng 0/ nteger programs by Gyana R. Parja IBM Corporaton, Poughkeepse, NY 260 Radu Gaddov Emery Worldwde Arlnes, Vandala, Oho 45377 and Wlbert E. Wlhelm Teas A&M Unversty,

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

Reliability and Energy-aware Cache Reconfiguration for Embedded Systems

Reliability and Energy-aware Cache Reconfiguration for Embedded Systems Relablty and Energy-aware Cache Reconfguraton for Embedded Systems Yuanwen Huang and Prabhat Mshra Department of Computer and Informaton Scence and Engneerng Unversty of Florda, Ganesvlle FL 326-62, USA

More information

Genetic Tuning of Fuzzy Logic Controller for a Flexible-Link Manipulator

Genetic Tuning of Fuzzy Logic Controller for a Flexible-Link Manipulator Genetc Tunng of Fuzzy Logc Controller for a Flexble-Lnk Manpulator Lnda Zhxa Sh Mohamed B. Traba Department of Mechancal Unversty of Nevada, Las Vegas Department of Mechancal Engneerng Las Vegas, NV 89154-407

More information

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009. Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton

More information

UNIT 2 : INEQUALITIES AND CONVEX SETS

UNIT 2 : INEQUALITIES AND CONVEX SETS UNT 2 : NEQUALTES AND CONVEX SETS ' Structure 2. ntroducton Objectves, nequaltes and ther Graphs Convex Sets and ther Geometry Noton of Convex Sets Extreme Ponts of Convex Set Hyper Planes and Half Spaces

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

LOOP ANALYSIS. The second systematic technique to determine all currents and voltages in a circuit

LOOP ANALYSIS. The second systematic technique to determine all currents and voltages in a circuit LOOP ANALYSS The second systematic technique to determine all currents and voltages in a circuit T S DUAL TO NODE ANALYSS - T FRST DETERMNES ALL CURRENTS N A CRCUT AND THEN T USES OHM S LAW TO COMPUTE

More information

ARTICLE IN PRESS. Signal Processing: Image Communication

ARTICLE IN PRESS. Signal Processing: Image Communication Sgnal Processng: Image Communcaton 23 (2008) 754 768 Contents lsts avalable at ScenceDrect Sgnal Processng: Image Communcaton journal homepage: www.elsever.com/locate/mage Dstrbuted meda rate allocaton

More information

Pose, Posture, Formation and Contortion in Kinematic Systems

Pose, Posture, Formation and Contortion in Kinematic Systems Pose, Posture, Formaton and Contorton n Knematc Systems J. Rooney and T. K. Tanev Department of Desgn and Innovaton, Faculty of Technology, The Open Unversty, Unted Kngdom Abstract. The concepts of pose,

More information

Minimization of the Expected Total Net Loss in a Stationary Multistate Flow Network System

Minimization of the Expected Total Net Loss in a Stationary Multistate Flow Network System Appled Mathematcs, 6, 7, 793-87 Publshed Onlne May 6 n ScRes. http://www.scrp.org/journal/am http://dx.do.org/.436/am.6.787 Mnmzaton of the Expected Total Net Loss n a Statonary Multstate Flow Networ System

More information

Minimum Cost Optimization of Multicast Wireless Networks with Network Coding

Minimum Cost Optimization of Multicast Wireless Networks with Network Coding Mnmum Cost Optmzaton of Multcast Wreless Networks wth Network Codng Chengyu Xong and Xaohua L Department of ECE, State Unversty of New York at Bnghamton, Bnghamton, NY 13902 Emal: {cxong1, xl}@bnghamton.edu

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Decision Support for the Dynamic Reconfiguration of Machine Layout and Part Routing in Cellular Manufacturing

Decision Support for the Dynamic Reconfiguration of Machine Layout and Part Routing in Cellular Manufacturing Decson Support for the Dynamc Reconfguraton of Machne Layout and Part Routng n Cellular Manufacturng Hao W. Ln and Tomohro Murata Abstract A mathematcal based approach s presented to evaluate the dynamc

More information

Power-Aware Mapping for Network-on-Chip Architectures under Bandwidth and Latency Constraints

Power-Aware Mapping for Network-on-Chip Architectures under Bandwidth and Latency Constraints Power-Aware Mappng for Network-on-Chp Archtectures under Bandwdth and Latency Constrants Xaohang Wang 1,2, Me Yang 2, Yngtao Jang 2, and Peng Lu 1 1 Department of Informaton Scence and Electronc Engneerng,

More information

Review of approximation techniques

Review of approximation techniques CHAPTER 2 Revew of appromaton technques 2. Introducton Optmzaton problems n engneerng desgn are characterzed by the followng assocated features: the objectve functon and constrants are mplct functons evaluated

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

Outline. Digital Systems. C.2: Gates, Truth Tables and Logic Equations. Truth Tables. Logic Gates 9/8/2011

Outline. Digital Systems. C.2: Gates, Truth Tables and Logic Equations. Truth Tables. Logic Gates 9/8/2011 9/8/2 2 Outlne Appendx C: The Bascs of Logc Desgn TDT4255 Computer Desgn Case Study: TDT4255 Communcaton Module Lecture 2 Magnus Jahre 3 4 Dgtal Systems C.2: Gates, Truth Tables and Logc Equatons All sgnals

More information