AN ANALYTICAL MODEL DESCRIBING THE RELATIONSHIPS BETWEEN LOGIC ARCHITECTURE AND FPGA DENSITY

Andrew Lam (1), Steven J.E. Wilton (1), Philip Leong (2), Wayne Luk (3)

(1) Elec. and Comp. Engineering, University of British Columbia, Vancouver, B.C., Canada. {andrewl,stevew}@ece.ubc.ca
(2) Computer Science and Engineering, Chinese University of Hong Kong, Hong Kong. hwl@cse.cuhk.edu.hk
(3) Dept. of Computing, Imperial College, London, U.K. wl@doc.ic.ac.uk

ABSTRACT

This paper describes an analytical model, based principally on Rent's Rule, that relates logic architectural parameters to the area efficiency of an FPGA. In particular, the model relates the lookup-table size, the cluster size, and the number of inputs per cluster to the amount of logic that can be packed into each lookup-table and cluster, and the number of used inputs per cluster. Comparison to experimental results shows that our models are accurate. This accuracy, combined with the simple form of the equations, makes them a powerful tool for FPGA architects to better understand and guide the development of future FPGA architectures.

1. INTRODUCTION

Field-Programmable Gate Arrays (FPGAs) have evolved considerably since their introduction. A dramatic increase in the number of available transistors has motivated research in both academia and industry, leading to new logic block structures, flexible embedded blocks, and complex interconnect networks.

FPGA architectural enhancements are often developed in a somewhat ad-hoc manner. Expert FPGA architects perform experiments in which benchmark circuits are mapped using representative computer-aided design (CAD) tools, and the resulting density, speed, and/or power dissipation are estimated [1, 2, 3]. Based on the results of these experiments, architects use their intuition and experience to design new architectures, and then evaluate these architectures using another set of experiments. This is repeated numerous times, until a suitable architecture is found.
This process occurs both within FPGA companies, as new generations of devices are developed, and within academia, as new architectural innovations are investigated.

During this process, there is virtually no body of theory that architects can use to guide their investigations. Previous studies have produced empirical rules of thumb (such as the best logic element size); however, these rules of thumb usually provide no insight into the underlying tradeoffs between flexibility and density, speed, and power dissipation. Such insight would be extremely valuable for two reasons. First, understanding the relationships between architectural parameters enables early-stage architecture development [4], in which the design space can be searched quickly using analytical models. Once a promising region of the architecture space has been identified, traditional experimental methods can be used to choose precise architectural parameters. This would significantly accelerate the FPGA architecture design process. It may also allow the study of a wider variety of interesting architectures, since experimental CAD tools need not be developed for each architecture under consideration. Second, the development of such theory will encourage researchers to understand why certain architectures work well, and may eventually provide bounds on the capabilities and efficiencies of programmable logic.

This paper is a step towards such a body of theory. Specifically, this paper presents an analytical model that describes the relationship between logic block and cluster parameters and the area-efficiency of the resulting FPGA. The inputs of the model are the lookup-table size, cluster size, and inputs per cluster. The outputs are (1) the expected number of two-input logic gates that can be packed into each lookup-table, (2) the expected number of lookup-tables that can be packed into each cluster, and (3) the expected number of inputs of each cluster that are used.
The first two outputs can be used to deduce the density of an FPGA implementation, while the third output can be used as an input to the channel width model presented in [4]. Together with [4], our model can be used to quickly evaluate a wide variety of lookup-table/cluster architectures, without requiring time-consuming empirical experiments.

This paper is organized as follows. Related work and preliminary background are described in Sections 2 and 3, respectively. The model itself is described in Section 4, and it is validated against experimental results in Section 5. Section 6 gives an example of how the model can be used as a tool in FPGA architectural investigation.

2. RELATED WORK

Several previous publications have described analytically derived relationships between FPGA architectural parameters. Much of the work focuses on the routing fabric. El Gamal derives a model that relates the area required for routing to the total number of pins in the logic gates [5]. Although this work was originally designed to describe a non-programmable chip, the model has been used in the design of numerous generations of FPGAs [6]. Work by Brown et al. relates various parameters that describe an FPGA routing architecture to the routability of that architecture [7], and more recently Fang and Rose related the same architectural parameters to the channel width of an FPGA [4]. Pistorius and Hutton relate the Rent parameter of a circuit (a measure of the complexity of the interconnect pattern in the circuit [8]) to various architectural parameters [9].

There has also been much work on interconnect prediction. Early work by Donath [10] and others related the area requirements of routing wires to the Rent parameter of a circuit. Later work by Stroobandt refined the models to consider more realistic network topologies and architectural assumptions [11]. More recent work has produced more accurate estimates of routing area using more information from the circuit to be implemented [12].

The previous work closest to ours is by Gao et al., who relate lookup-table size to area in a non-clustered FPGA [13]. Compared to that work, we consider clustered architectures (which are more representative of real FPGAs) and use cluster architectural parameters in our equations.

3. PRELIMINARIES

3.1. Guiding Principles

Three principles guided us in the development of this model. First, we endeavored to develop the model by deriving relations analytically, without relying on curve-fitting or experimental techniques. This ensures that we are capturing the essence of programmable logic, and not creating a model that is limited to a particular CAD flow or tool suite.
As will be discussed in Section 4, all aspects of our model were derived analytically (footnote 1) except for one parameter, γ.

Second, since we wish to derive relations between architectural parameters, we attempted to create a model that is independent of the circuit to be implemented on the FPGA. For example, we would prefer a relation between block size and cluster size that is independent of the circuit to be implemented. This is different from much prior estimation work, in which the goal is to predict the area, speed, or power for a given circuit [12]. That being said, it is impossible to completely ignore the impact of the circuit; we describe our circuit using the Rent parameter and the size of the un-techmapped circuit.

Third, we attempted to balance the complexity of our equations with their accuracy. We feel that simple equations provide significantly more insight into architectural tradeoffs than expressions which are unnecessarily complex.

Footnote 1: Technically, Rent's Rule is an empirical expression, yet we use it as part of our analytical derivation. Rent's Rule is well accepted, so we do not feel it violates the spirit of this guiding principle.

3.2. Parameters

Table 1 summarizes the parameters used to describe the architecture, circuit, and implementation.

Table 1. Parameters

Architectural Parameters:
  K      Number of inputs per lookup table
  N      Number of lookup tables per cluster
  I      Number of inputs per cluster
Circuit Parameters:
  p      Rent parameter of a given circuit
  n_2    Number of 2-LUTs in a given circuit
Implementation Parameters:
  n_k    Number of K-LUTs needed to implement a given circuit
  n_c    Number of clusters needed to implement a given circuit
  c      Average number of LUTs packed into each cluster (c = n_k / n_c)
  i      Average number of inputs used in each cluster
  o      Average number of outputs used in each cluster
  f      Average fanout of all nets in the circuit
  γ      Average number of inputs not used in each LUT
In general, uppercase letters are used for architectural parameters, and lowercase letters are used for circuit parameters and parameters that describe the implementation of the circuit on a given architecture.

4. MODEL DERIVATION

Although our goal is to model architectures, our model mirrors the CAD flow used when mapping circuits to FPGAs. The first part of our model mirrors the process of technology mapping, in which 2-input gates are mapped to K-LUTs. This part of the model can be used to compute the number of K-LUTs needed to implement a circuit as a function of K. The second part of the model mirrors clustering, in which K-LUTs are packed into clusters with a pre-defined capacity and a pre-defined number of unique inputs [1]. The inputs of this part of the model are the number of K-LUTs in a circuit as well as the cluster parameters N (the capacity of each cluster) and I (the maximum number of available input pins on each cluster). The outputs are two quantities that describe the clustering: the expected number of LUTs that can be packed into each cluster, and the expected number of inputs to each cluster that are used. In the following, we focus on each part of the model separately.

4.1. Number of K-LUTs to implement a circuit

We first model the process of technology mapping. Consider an un-techmapped circuit consisting of $n_2$ two-input gates (2-LUTs). During technology mapping, these $n_2$ gates will be mapped into a smaller number of LUTs, which we will denote $n_k$. In this section, we seek a closed-form expression for $n_k$.

Consider a portion of the un-techmapped circuit consisting of $x$ two-input gates ($1 < x \le n_2$). Denote the number of signals that connect across the boundary of this region as $y$. Since each gate has three pins (two inputs and one output), we can use Rent's Rule [8] to write:

$$y = 3x^{p} \tag{1}$$

where $p$ is the Rent parameter of the circuit. Now suppose this same region is mapped to $z$ K-LUTs using a technology mapping algorithm. The number of signals that connect across the boundary of this region is still $y$. The number of pins used in each K-LUT is $1 + K - \gamma$ (the first two terms correspond to the output and $K$ inputs, and the final term, $\gamma$, will be described below). We can then write

$$y = (K + 1 - \gamma)\, z^{p} \tag{2}$$

Since $y$ is the same in both equations, we can eliminate $y$ to get

$$\frac{z}{x} = \left(\frac{3}{K + 1 - \gamma}\right)^{1/p} \tag{3}$$

In other words, every $z$ K-LUTs can implement $x$ 2-LUTs in the original circuit. Intuitively, this ratio is a measure of how much logic can be packed into each lookup-table. Finally, using Equation 3, we can write:

$$n_k = n_2 \left(\frac{3}{K + 1 - \gamma}\right)^{1/p} \tag{4}$$

The term γ in Equations 2 to 4 arises because not all K inputs are always used in a K-LUT. γ is the expected number of inputs to a K-LUT that are not used. We have not found a way to model this accurately with an analytical expression; however, experimentally we have found that γ (as a function of K) is extremely consistent across all benchmark circuits we considered. Table 2 shows our measured values of γ; the derivation of a closed form for this expression is an interesting topic of future work.

[Table 2. γ values from 20 MCNC benchmarks. Columns: K, γ.]

4.2. Number of clusters needed to implement a circuit

We next model the process of clustering. Consider a technology-mapped circuit consisting of $n_k$ K-LUTs. During clustering, these $n_k$ LUTs will be packed into a smaller number of clusters, which we will denote $n_c$. In this subsection, we seek closed-form expressions for $n_c$. In the next subsection, we derive an expression for the expected number of inputs used per cluster, $i$.

Each cluster can contain up to $N$ LUTs and have up to $I$ unique inputs. In an architecture with a large value of $I$ and a small value of $N$, it is likely that most clusters will be completely filled, while in architectures with a small value of $I$, some clusters may not be completely filled, because of the limitation on the number of unique inputs (footnote 2). Since we wish our model to apply to both types of architectures, we consider each case separately below. We also define an equation for the boundary between the two cases.

4.2.1. I-Limited Clustering

We first consider architectures in which $I$ is small, and the expected number of LUTs packed into each cluster is dictated by the number of physical pins on each cluster. In this case, the expected number of LUTs packed into each cluster, $c = n_k / n_c$, will be smaller than the capacity of the cluster, $N$.

To estimate $c$, and hence $n_c$, we employ Rent's Rule as follows. Consider the same region of the technology-mapped circuit from Section 4.1, which contains $z$ K-LUTs and has $y$ signals that connect outside the region. When the same region is mapped to clusters, we can write

$$y = (i + o)\, v^{p} \tag{5}$$

where $v$ is the number of clusters needed for this region ($v \le z$), $i$ is the average number of used inputs per cluster, and $o$ is the average number of used outputs per cluster. The latter quantity can be written as $o = i/f$, where $f$ is the average fanout of the circuit (this term will be computed below). Thus, we can write:

$$y = i \left(1 + \frac{1}{f}\right) v^{p} \tag{6}$$

Eliminating $y$ from Equations 2 and 6 gives the ratio $v/z = n_c/n_k$. Since all cluster inputs are expected to be used in the I-limited case ($i = I$), solving for $n_c$ gives:

$$n_c = n_k \left(\frac{K + 1 - \gamma}{I \left(1 + \frac{1}{f}\right)}\right)^{1/p} \tag{7}$$

and

$$c = \left(\frac{I \left(1 + \frac{1}{f}\right)}{K + 1 - \gamma}\right)^{1/p} \tag{8}$$

Footnote 2: Note that in real FPGA implementations, regardless of N and I, some clusters will be full and some will not be. We could model this by considering a distribution of packing capacities rather than a single expected value. We have chosen to model a single expected value, since we feel it provides simpler equations, and thus more insight into the underlying tradeoffs between the architectural parameters.
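
Equations 4, 7 and 8 are simple enough to evaluate directly. The following Python sketch is illustrative only and is not the authors' C implementation; the function names are hypothetical, and the fanout `f` is passed in as a plain number rather than computed. Any values supplied for `p`, `gamma` and `f` by a caller are assumptions, since the measured γ values of Table 2 are not reproduced here.

```python
def luts_needed(n2, K, p, gamma):
    """Equation 4: expected number of K-LUTs for a circuit of n2 two-input gates.

    p is the Rent parameter and gamma the expected number of unused LUT inputs.
    """
    return n2 * (3.0 / (K + 1 - gamma)) ** (1.0 / p)


def luts_per_cluster_i_limited(K, I, p, gamma, f):
    """Equation 8: expected LUTs per cluster when cluster inputs are the limit.

    f is the average fanout of the circuit (supplied by the caller here).
    """
    return (I * (1.0 + 1.0 / f) / (K + 1 - gamma)) ** (1.0 / p)


def clusters_needed_i_limited(nk, K, I, p, gamma, f):
    """Equation 7: expected number of clusters in the I-limited case."""
    return nk / luts_per_cluster_i_limited(K, I, p, gamma, f)
```

A convenient sanity check: with K = 2 and γ = 0 the ratio in Equation 4 equals 1, so `luts_needed` returns `n2` unchanged, as expected when 2-input gates are "mapped" to 2-input LUTs with no unused inputs.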

The fanout $f$ in Equations 7 and 8 can be calculated using a formula from [14]:

$$f = \frac{1 - (f_{max} + 1)^{(p-1)}}{1 - (f_{max} + 1)^{(p-2)} - \phi(p, f_{max})} - 1 \tag{9}$$

where:

$$\phi(p, f_{max}) = \sum_{n=1}^{f_{max}} \frac{n^{p}}{n^{2}(n + 1)} \tag{10}$$

and $f_{max}$ is the maximum fan-out, written as:

$$f_{max} = \left[(i + N)\, \frac{n_k}{N}\, (1 - p)\right]^{1/(3-p)} \tag{11}$$

In Equation 11 we approximated the number of outputs as $N$ and the number of clusters as $n_k/N$; experimentally, we have found that the fanout is only a weak function of $f_{max}$, and thus these approximations lead to only a small error.

4.2.2. N-Limited Clustering

This case is trivial. Since cluster input pins are plentiful, clusters can be filled to capacity. Hence, $c = N$ and

$$n_c = \frac{n_k}{N} \tag{12}$$

4.2.3. Boundary Condition

For architectures in which $c < N$, clustering is I-limited; otherwise it is N-limited. Using Equation 8, we can write the following condition that indicates that clustering is I-limited:

$$\left(\frac{I \left(1 + \frac{1}{f}\right)}{K + 1 - \gamma}\right)^{1/p} < N \tag{13}$$

This can be rearranged to produce:

$$I < N^{p}\, \frac{K + 1 - \gamma}{1 + \frac{1}{f}} \tag{14}$$

For all values of $I$ for which Inequality 14 holds, clustering is I-limited.

4.3. Average Number of Used Inputs

Again, we consider I-limited and N-limited architectures separately. The boundary condition between the two types of architectures is the same as in the previous section.

4.3.1. I-Limited Clustering

In these architectures, we would expect all cluster input pins to be used. Thus, we can write

$$i = I \tag{15}$$

4.3.2. N-Limited Clustering

By applying Rent's Rule to a single cluster, we obtain:

$$i + o = (K + 1 - \gamma)\, N^{p} \tag{16}$$

since a cluster has $N$ K-LUTs that each have $(K + 1 - \gamma)$ I/O pins, and $i + o$ exterior pins. By substituting $o = i/f$ into (16) and solving for $i$, we obtain

$$i = \frac{(K + 1 - \gamma)\, N^{p}}{1 + \frac{1}{f}} \tag{17}$$

5. MODEL VERIFICATION

To evaluate the accuracy of our model, we compare the model predictions to experimental results. Figure 1(a) illustrates the accuracy of our technology mapping model. The experimental results were obtained by averaging the results for twenty large MCNC benchmark circuits. Results for two different technology mapping algorithms are shown.
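
When reproducing analytical curves of this kind, the fanout and used-input expressions (Equations 9, 10, 14, 15 and 17) can be evaluated with a few lines of code. The Python sketch below is a hedged illustration, not the authors' implementation: the function names are hypothetical, and `f_max` is assumed to be a positive integer (Equation 11 generally yields a non-integer value, so a caller would have to round it, which is an assumption of this sketch).

```python
def phi(p, f_max):
    """Equation 10: sum over n = 1..f_max of n^p / (n^2 (n + 1))."""
    return sum(n ** p / (n ** 2 * (n + 1)) for n in range(1, f_max + 1))


def average_fanout(p, f_max):
    """Equation 9: average fanout for Rent parameter p and maximum fanout f_max.

    f_max is assumed to be a positive integer in this sketch.
    """
    numerator = 1.0 - (f_max + 1) ** (p - 1)
    denominator = 1.0 - (f_max + 1) ** (p - 2) - phi(p, f_max)
    return numerator / denominator - 1.0


def is_i_limited(K, N, I, p, gamma, f):
    """Inequality 14: True when the cluster input count I limits packing."""
    return I < N ** p * (K + 1 - gamma) / (1.0 + 1.0 / f)


def used_inputs(K, N, I, p, gamma, f):
    """Equations 15 and 17: expected number of used inputs per cluster."""
    if is_i_limited(K, N, I, p, gamma, f):
        return float(I)  # Equation 15: every cluster input is used
    return (K + 1 - gamma) * N ** p / (1.0 + 1.0 / f)  # Equation 17
```

With `f_max = 1` every net has exactly one sink, and Equation 9 reduces to an average fanout of 1 for any `p` below 1, which gives a quick check that the reconstruction of the formula is self-consistent.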
The analytical results were obtained from Equation 4 using the average Rent parameter from all benchmark circuits. As the graph shows, the analytical results track the experimental results very closely.

Figures 1(b) through 1(d) illustrate the accuracy of our clustering model. Figure 1(b) shows the ratio $n_2/n_c$ as a function of cluster size. In all cases, we set $K = 4$, and $I$ to be $0.88N + 3.2$ from [4] (this ensures that our clustering is I-limited, which is the interesting case for this graph). The analytical results were obtained using Equations 4 and 7, while the experimental results were obtained using two separate clustering algorithms, T-VPack [1] and our implementation of iRAC [15]. Again, the analytical results are very consistent with both sets of experimental results.

Figure 1(c) shows the same ratio as a function of the number of input pins per cluster, $I$. In all cases, $K = 4$ and $N = 2$. The boundary between I-limited and N-limited architectures is also shown. The graph shows that our model tracks the experimental results well in both regions.

Finally, Figure 1(d) shows the average number of used inputs per cluster as a function of cluster size. Although our model tracks both sets of experimental results, it matches the iRAC results more closely. This is expected, since iRAC explicitly tries to minimize the use of cluster pins.

[Fig. 1. Validation. (a) Logic per LUT ($n_2/n_k$) vs. LUT size (inputs per LUT, K); experimental series: EMap, FlowMap. (b) Logic packed per cluster ($n_2/n_c$) vs. cluster size (N); experimental series: T-VPack, iRAC. (c) Logic packed per cluster ($n_2/n_c$) vs. inputs per cluster (I), with the boundary between I-limited and N-limited clustering marked; experimental series: T-VPack, iRAC. (d) Used inputs per cluster ($i$) vs. cluster size (N); experimental series: T-VPack, iRAC.]

6. EXAMPLE APPLICATION OF OUR MODEL

One of the purposes of our model is to allow for early architectural evaluation. In this section, we show how the model, along with the channel width model from [4], can be used to estimate the number of programming bits in an FPGA as a function of the architectural parameters K, N, and I. We will investigate whether an analytical flow employing our model leads to conclusions similar to those that would be obtained by a more time-consuming experimental methodology. We consider three flows:

1. The first flow is purely analytical. We use the model described in Section 4 to estimate $n_2/n_c$ and $i$. These quantities are then used in conjunction with the channel width model in [4] to determine the amount of routing needed for a given architecture. We then use equations that model the number of programming bits in a clustered logic block, connection block, and switch block. These equations are similar to those used within VPR 5.0, and assume an architecture with uni-directional, single-driver wires. Multiplexers are assumed to be implemented using a two-tiered structure with one-hot select lines (as in VPR 5.0). Finally, these area estimates are used to determine the number of programming bits required for each of twenty large benchmark circuits. Note that this flow is purely analytical and does not require experimental CAD tools.

2. The second flow is purely experimental. We map each of the twenty benchmark circuits to 4-input LUTs using FlowMap, pack them into clusters containing four LUTs and 10 inputs using T-VPack, and then place and route the circuits using VPR 5.0. We use the model within VPR 5.0 to count the number of programming bits for each benchmark circuit.

3. As an intermediate between the first two flows, we use the model in Section 4 to estimate $n_2/n_c$ and $i$, but use VPR 5.0 to determine the actual channel width (instead of using the model from [4]).
This flow is included to provide insight into each portion of the analytical model.

Figure 2 shows the results for an architecture with $F_s = 9$, $F_{cin} = 2$, $F_{cout} = 4$, and $L = 1$. Figure 2(a) shows the number of programming bits as a function of K, Figure 2(b) shows the number of programming bits as a function of N, and Figure 2(c) shows the number of programming bits as a function of I. In all cases, the results from Flow 2 (purely experimental) and Flow 3 (analytical technology mapping and clustering but experimental routing) match closely. This is to be expected, given the accuracy illustrated in Figure 1. When the model from [4] is used (Flow 1), the analytical estimates still track the experimental results, although not as closely. The largest differences arise for architectures with a small number of cluster inputs. These architectures contain far fewer inputs per cluster than the equations in [4] were intended to model. We have found that for such extreme architectures, the average wirelength is 25% larger than in the architectures considered in [4]. This is primarily because as I decreases, the sharing of LUT inputs within a cluster is encouraged. This tends to decrease the average fanout, which increases the average wirelength. This is an important observation: although the model presented in this paper correctly models architectures with small values of I, a complete analytical flow that accurately models these architectures does not yet exist. Addressing this limitation is an interesting area for future research.

[Fig. 2. Routing Bit Verification. Average number of programming bits (x10^5) as a function of K, I, and N, for Flow 1, Flow 2, and Flow 3.]

7. CONCLUSION

In this paper, we have presented an analytical model that relates lookup-table size, cluster size, and the number of inputs per cluster to the amount of logic that can be packed into each lookup-table and cluster, and the number of used inputs per cluster. Comparing the model predictions with experimental results has shown that our model is accurate.

We have shown that the model can be used during early architecture evaluation, in which potential architectures are considered before custom CAD tools are created. However, the model has the potential to be much more powerful than that. By combining our model with that from [4], we have a means of relating routing architecture parameters with lookup-table and cluster size. Recent FPGAs are employing larger lookup-tables, and it seems reasonable to expect that the logic architecture of future FPGAs may change further.
Using a model such as ours provides the ability to understand the impact that potential logic block architectures may have on the routing fabric, and more importantly, why the routing fabric is impacted in a certain way. For FPGA architects, this could potentially be much more useful than experimental results that evaluate a single architecture (or even a sweep of a single architectural parameter).

A C implementation of our model can be downloaded from http:// stevew.

8. REFERENCES

[1] V. Betz, J. Rose, and A. Marquardt, Architecture and CAD for Deep-Submicron FPGAs. Kluwer Academic Publishers.
[2] M. Hutton, "FPGA architecture design methodology," in Int'l Conf. on Field-Programmable Logic and Applications, Aug. 2006.
[3] E. Ahmed and J. Rose, "The effect of LUT and cluster size on deep-submicron FPGA performance and density," IEEE Trans. on VLSI Systems, vol. 12, no. 3, Mar. 2004.
[4] W. Fang and J. Rose, "Modeling FPGA routing demand in early-stage architecture development," in Int'l Symp. on Field-Programmable Gate Arrays, Feb. 2008.
[5] A. A. El Gamal, "Two-dimensional stochastic models for interconnections in master-slice integrated circuits," IEEE Trans. on Circuits and Systems, vol. 26, no. 4, Feb.
[6] J. Rose, "Closing the gap between FPGAs and ASICs," Keynote Address, in Int'l Conference on Field-Programmable Technology, Dec. 2006.
[7] S. Brown, J. Rose, and Z. Vranesic, "A stochastic model to predict the routability of field-programmable gate arrays," IEEE Trans. on Computer-Aided Design of Circuits and Systems, vol. 12, no. 12, Dec.
[8] B. Landman and R. Russo, "On a pin vs. block relationship for partitions of logic graphs," IEEE Trans. on Computers, vol. C-20.
[9] J. Pistorius and M. Hutton, "Placement rent exponent calculation methods, temporal behavior and FPGA architectural evaluation," in SLIP Workshop, Apr. 2003.
[10] W. Donath, "Placement and average interconnect lengths of computer logic," IEEE Trans. on Circuits and Systems, vol. 26, no. 4.
[11] D. Stroobandt, A Priori Wire Length Estimates for Digital Design. Kluwer Academic Publishers, 2001.
[12] S. Balachandran and D. Bhatia, "A priori wirelength estimation and interconnect estimation based on circuit characteristics," IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, July 2005.
[13] H. Gao, Y. Yang, X. Ma, and G. Dong, "Analysis of the effect of LUT size on FPGA area and delay using theoretical derivations," in Int'l Symposium on Quality Electronic Design, Mar. 2005.
[14] P. Zarkesh-Ha, J. A. Davis, W. Loh, and J. D. Meindl, "Prediction of interconnect fan-out distribution using Rent's rule," in Proceedings of the 2000 International Workshop on System-Level Interconnect Prediction. New York, NY, USA: ACM, 2000.
[15] A. Singh and M. Marek-Sadowska, "Efficient circuit clustering for area and power reduction in FPGAs," in Int'l Symposium on FPGAs, Feb. 2002.


More information

Mitigating the Impact of Decompression Latency in L1 Compressed Data Caches via Prefetching

Mitigating the Impact of Decompression Latency in L1 Compressed Data Caches via Prefetching Mitigating the Imact of Decomression Latency in L1 Comressed Data Caches via Prefetching by Sean Rea A thesis resented to Lakehead University in artial fulfillment of the requirement for the degree of

More information

Efficient Parallel Hierarchical Clustering

Efficient Parallel Hierarchical Clustering Efficient Parallel Hierarchical Clustering Manoranjan Dash 1,SimonaPetrutiu, and Peter Scheuermann 1 Deartment of Information Systems, School of Comuter Engineering, Nanyang Technological University, Singaore

More information

An Indexing Framework for Structured P2P Systems

An Indexing Framework for Structured P2P Systems An Indexing Framework for Structured P2P Systems Adina Crainiceanu Prakash Linga Ashwin Machanavajjhala Johannes Gehrke Carl Lagoze Jayavel Shanmugasundaram Deartment of Comuter Science, Cornell University

More information

Wavelet Based Statistical Adapted Local Binary Patterns for Recognizing Avatar Faces

Wavelet Based Statistical Adapted Local Binary Patterns for Recognizing Avatar Faces Wavelet Based Statistical Adated Local Binary atterns for Recognizing Avatar Faces Abdallah A. Mohamed 1, 2 and Roman V. Yamolskiy 1 1 Comuter Engineering and Comuter Science, University of Louisville,

More information

Source Coding and express these numbers in a binary system using M log

Source Coding and express these numbers in a binary system using M log Source Coding 30.1 Source Coding Introduction We have studied how to transmit digital bits over a radio channel. We also saw ways that we could code those bits to achieve error correction. Bandwidth is

More information

MATHEMATICAL MODELING OF COMPLEX MULTI-COMPONENT MOVEMENTS AND OPTICAL METHOD OF MEASUREMENT

MATHEMATICAL MODELING OF COMPLEX MULTI-COMPONENT MOVEMENTS AND OPTICAL METHOD OF MEASUREMENT MATHEMATICAL MODELING OF COMPLE MULTI-COMPONENT MOVEMENTS AND OPTICAL METHOD OF MEASUREMENT V.N. Nesterov JSC Samara Electromechanical Plant, Samara, Russia Abstract. The rovisions of the concet of a multi-comonent

More information

A NOVEL GEOMETRIC ALGORITHM FOR FAST WIRE-OPTIMIZED FLOORPLANNING

A NOVEL GEOMETRIC ALGORITHM FOR FAST WIRE-OPTIMIZED FLOORPLANNING A OVEL GEOMETRIC ALGORITHM FOR FAST WIRE-OPTIMIZED FLOORPLAIG Peter G. Sassone, Sung K. Lim School of Electrical and Comuter Engineering Georgia Institute of Technology Atlanta, Georgia 30332, U ABSTRACT

More information

A Reconfigurable Architecture for Quad MAC VLIW DSP

A Reconfigurable Architecture for Quad MAC VLIW DSP A Reconfigurable Architecture for Quad MAC VLIW DSP Sangwook Kim, Sungchul Yoon, Jaeseuk Oh, Sungho Kang Det. of Electrical & Electronic Engineering, Yonsei University 132 Shinchon-Dong, Seodaemoon-Gu,

More information

Multicast in Wormhole-Switched Torus Networks using Edge-Disjoint Spanning Trees 1

Multicast in Wormhole-Switched Torus Networks using Edge-Disjoint Spanning Trees 1 Multicast in Wormhole-Switched Torus Networks using Edge-Disjoint Sanning Trees 1 Honge Wang y and Douglas M. Blough z y Myricom Inc., 325 N. Santa Anita Ave., Arcadia, CA 916, z School of Electrical and

More information

Generic ILP-Based Approaches for Time-Multiplexed FPGA Partitioning

Generic ILP-Based Approaches for Time-Multiplexed FPGA Partitioning 1266 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 2, NO. 1, OCTOBER 21 Generic ILP-Based Aroaches for Time-Multilexed FPGA Partitioning Guang-Ming Wu, Jai-Ming Lin,

More information

10. Parallel Methods for Data Sorting

10. Parallel Methods for Data Sorting 10. Parallel Methods for Data Sorting 10. Parallel Methods for Data Sorting... 1 10.1. Parallelizing Princiles... 10.. Scaling Parallel Comutations... 10.3. Bubble Sort...3 10.3.1. Sequential Algorithm...3

More information

How Much Logic Should Go in an FPGA Logic Block?

How Much Logic Should Go in an FPGA Logic Block? How Much Logic Should Go in an FPGA Logic Block? Vaughn Betz and Jonathan Rose Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S 3G4 {vaughn, jayar}@eecgutorontoca

More information

Figure 8.1: Home age taken from the examle health education site (htt:// Setember 14, 2001). 201

Figure 8.1: Home age taken from the examle health education site (htt://  Setember 14, 2001). 201 200 Chater 8 Alying the Web Interface Profiles: Examle Web Site Assessment 8.1 Introduction This chater describes the use of the rofiles develoed in Chater 6 to assess and imrove the quality of an examle

More information

Leak Detection Modeling and Simulation for Oil Pipeline with Artificial Intelligence Method

Leak Detection Modeling and Simulation for Oil Pipeline with Artificial Intelligence Method ITB J. Eng. Sci. Vol. 39 B, No. 1, 007, 1-19 1 Leak Detection Modeling and Simulation for Oil Pieline with Artificial Intelligence Method Pudjo Sukarno 1, Kuntjoro Adji Sidarto, Amoranto Trisnobudi 3,

More information

Improve Precategorized Collection Retrieval by Using Supervised Term Weighting Schemes Λ

Improve Precategorized Collection Retrieval by Using Supervised Term Weighting Schemes Λ Imrove Precategorized Collection Retrieval by Using Suervised Term Weighting Schemes Λ Ying Zhao and George Karyis University of Minnesota, Deartment of Comuter Science Minneaolis, MN 55455 Abstract The

More information

Energy consumption model over parallel programs implemented on multicore architectures

Energy consumption model over parallel programs implemented on multicore architectures Energy consumtion model over arallel rograms imlemented on multicore architectures Ricardo Isidro-Ramírez Instituto Politécnico Nacional SEPI-ESCOM M exico, D.F. Amilcar Meneses Viveros Deartamento de

More information

OMNI: An Efficient Overlay Multicast. Infrastructure for Real-time Applications

OMNI: An Efficient Overlay Multicast. Infrastructure for Real-time Applications OMNI: An Efficient Overlay Multicast Infrastructure for Real-time Alications Suman Banerjee, Christoher Kommareddy, Koushik Kar, Bobby Bhattacharjee, Samir Khuller Abstract We consider an overlay architecture

More information

Semi-Markov Process based Model for Performance Analysis of Wireless LANs

Semi-Markov Process based Model for Performance Analysis of Wireless LANs Semi-Markov Process based Model for Performance Analysis of Wireless LANs Murali Krishna Kadiyala, Diti Shikha, Ravi Pendse, and Neeraj Jaggi Deartment of Electrical Engineering and Comuter Science Wichita

More information

Pivot Selection for Dimension Reduction Using Annealing by Increasing Resampling *

Pivot Selection for Dimension Reduction Using Annealing by Increasing Resampling * ivot Selection for Dimension Reduction Using Annealing by Increasing Resamling * Yasunobu Imamura 1, Naoya Higuchi 1, Tetsuji Kuboyama 2, Kouichi Hirata 1 and Takeshi Shinohara 1 1 Kyushu Institute of

More information

An empirical analysis of loopy belief propagation in three topologies: grids, small-world networks and random graphs

An empirical analysis of loopy belief propagation in three topologies: grids, small-world networks and random graphs An emirical analysis of looy belief roagation in three toologies: grids, small-world networks and random grahs R. Santana, A. Mendiburu and J. A. Lozano Intelligent Systems Grou Deartment of Comuter Science

More information

Complexity Issues on Designing Tridiagonal Solvers on 2-Dimensional Mesh Interconnection Networks

Complexity Issues on Designing Tridiagonal Solvers on 2-Dimensional Mesh Interconnection Networks Journal of Comuting and Information Technology - CIT 8, 2000, 1, 1 12 1 Comlexity Issues on Designing Tridiagonal Solvers on 2-Dimensional Mesh Interconnection Networks Eunice E. Santos Deartment of Electrical

More information

Parametric Optimization in WEDM of WC-Co Composite by Neuro-Genetic Technique

Parametric Optimization in WEDM of WC-Co Composite by Neuro-Genetic Technique Parametric Otimization in WEDM of WC-Co Comosite by Neuro-Genetic Technique P. Saha*, P. Saha, and S. K. Pal Abstract The resent work does a multi-objective otimization in wire electro-discharge machining

More information

2-D Fir Filter Design And Its Applications In Removing Impulse Noise In Digital Image

2-D Fir Filter Design And Its Applications In Removing Impulse Noise In Digital Image -D Fir Filter Design And Its Alications In Removing Imulse Noise In Digital Image Nguyen Thi Huyen Linh 1, Luong Ngoc Minh, Tran Dinh Dung 3 1 Faculty of Electronic and Electrical Engineering, Hung Yen

More information

An FPGA Implementation of ( )-Regular Low-Density. Parity-Check Code Decoder

An FPGA Implementation of ( )-Regular Low-Density. Parity-Check Code Decoder An FPGA Imlementation of ( )-Regular Low-Density Parity-Check Code Decoder Tong Zhang ECSE Det., RPI Troy, NY tzhang@ecse.ri.edu Keshab K. Parhi ECE Det., Univ. of Minnesota Minneaolis, MN arhi@ece.umn.edu

More information

The VEGA Moderately Parallel MIMD, Moderately Parallel SIMD, Architecture for High Performance Array Signal Processing

The VEGA Moderately Parallel MIMD, Moderately Parallel SIMD, Architecture for High Performance Array Signal Processing The VEGA Moderately Parallel MIMD, Moderately Parallel SIMD, Architecture for High Performance Array Signal Processing Mikael Taveniku 2,3, Anders Åhlander 1,3, Magnus Jonsson 1 and Bertil Svensson 1,2

More information

Implementing Logic in FPGA Memory Arrays: Heterogeneous Memory Architectures

Implementing Logic in FPGA Memory Arrays: Heterogeneous Memory Architectures Implementing Logic in FPGA Memory Arrays: Heterogeneous Memory Architectures Steven J.E. Wilton Department of Electrical and Computer Engineering University of British Columbia Vancouver, BC, Canada, V6T

More information

A DEA-bases Approach for Multi-objective Design of Attribute Acceptance Sampling Plans

A DEA-bases Approach for Multi-objective Design of Attribute Acceptance Sampling Plans Available online at htt://ijdea.srbiau.ac.ir Int. J. Data Enveloment Analysis (ISSN 2345-458X) Vol.5, No.2, Year 2017 Article ID IJDEA-00422, 12 ages Research Article International Journal of Data Enveloment

More information

A simulated Linear Mixture Model to Improve Classification Accuracy of Satellite. Data Utilizing Degradation of Atmospheric Effect

A simulated Linear Mixture Model to Improve Classification Accuracy of Satellite. Data Utilizing Degradation of Atmospheric Effect A simulated Linear Mixture Model to Imrove Classification Accuracy of Satellite Data Utilizing Degradation of Atmosheric Effect W.M Elmahboub Mathematics Deartment, School of Science, Hamton University

More information

Fast Distributed Process Creation with the XMOS XS1 Architecture

Fast Distributed Process Creation with the XMOS XS1 Architecture Communicating Process Architectures 20 P.H. Welch et al. (Eds.) IOS Press, 20 c 20 The authors and IOS Press. All rights reserved. Fast Distributed Process Creation with the XMOS XS Architecture James

More information

Using Permuted States and Validated Simulation to Analyze Conflict Rates in Optimistic Replication

Using Permuted States and Validated Simulation to Analyze Conflict Rates in Optimistic Replication Using Permuted States and Validated Simulation to Analyze Conflict Rates in Otimistic Relication An-I A. Wang Comuter Science Deartment Florida State University Geoff H. Kuenning Comuter Science Deartment

More information

Relations with Relation Names as Arguments: Algebra and Calculus. Kenneth A. Ross. Columbia University.

Relations with Relation Names as Arguments: Algebra and Calculus. Kenneth A. Ross. Columbia University. Relations with Relation Names as Arguments: Algebra and Calculus Kenneth A. Ross Columbia University kar@cs.columbia.edu Abstract We consider a version of the relational model in which relation names may

More information

J. Parallel Distrib. Comput.

J. Parallel Distrib. Comput. J. Parallel Distrib. Comut. 71 (2011) 288 301 Contents lists available at ScienceDirect J. Parallel Distrib. Comut. journal homeage: www.elsevier.com/locate/jdc Quality of security adatation in arallel

More information

CS 229 Final Project: Single Image Depth Estimation From Predicted Semantic Labels

CS 229 Final Project: Single Image Depth Estimation From Predicted Semantic Labels CS 229 Final Project: Single Image Deth Estimation From Predicted Semantic Labels Beyang Liu beyangl@cs.stanford.edu Stehen Gould sgould@stanford.edu Prof. Dahne Koller koller@cs.stanford.edu December

More information

Statistical Detection for Network Flooding Attacks

Statistical Detection for Network Flooding Attacks Statistical Detection for Network Flooding Attacks C. S. Chao, Y. S. Chen, and A.C. Liu Det. of Information Engineering, Feng Chia Univ., Taiwan 407, OC. Email: cschao@fcu.edu.tw Abstract In order to meet

More information

SEARCH ENGINE MANAGEMENT

SEARCH ENGINE MANAGEMENT e-issn 2455 1392 Volume 2 Issue 5, May 2016. 254 259 Scientific Journal Imact Factor : 3.468 htt://www.ijcter.com SEARCH ENGINE MANAGEMENT Abhinav Sinha Kalinga Institute of Industrial Technology, Bhubaneswar,

More information

RESEARCH ARTICLE. Simple Memory Machine Models for GPUs

RESEARCH ARTICLE. Simple Memory Machine Models for GPUs The International Journal of Parallel, Emergent and Distributed Systems Vol. 00, No. 00, Month 2011, 1 22 RESEARCH ARTICLE Simle Memory Machine Models for GPUs Koji Nakano a a Deartment of Information

More information

A power efficient flit-admission scheme for wormhole-switched networks on chip

A power efficient flit-admission scheme for wormhole-switched networks on chip A ower efficient flit-admission scheme for wormhole-switched networks on chi Zhonghai Lu, Li Tong, Bei Yin and Axel Jantsch Laboratory of Electronics and Comuter Systems Royal Institute of Technology,

More information

CENTRAL AND PARALLEL PROJECTIONS OF REGULAR SURFACES: GEOMETRIC CONSTRUCTIONS USING 3D MODELING SOFTWARE

CENTRAL AND PARALLEL PROJECTIONS OF REGULAR SURFACES: GEOMETRIC CONSTRUCTIONS USING 3D MODELING SOFTWARE CENTRAL AND PARALLEL PROJECTIONS OF REGULAR SURFACES: GEOMETRIC CONSTRUCTIONS USING 3D MODELING SOFTWARE Petra Surynková Charles University in Prague, Faculty of Mathematics and Physics, Sokolovská 83,

More information

A GPU Heterogeneous Cluster Scheduling Model for Preventing Temperature Heat Island

A GPU Heterogeneous Cluster Scheduling Model for Preventing Temperature Heat Island A GPU Heterogeneous Cluster Scheduling Model for Preventing Temerature Heat Island Yun-Peng CAO 1,2,a and Hai-Feng WANG 1,2 1 School of Information Science and Engineering, Linyi University, Linyi Shandong,

More information

Congestion Estimation and Localization in FPGAs: A Visual Tool for Interconnect Prediction

Congestion Estimation and Localization in FPGAs: A Visual Tool for Interconnect Prediction Congestion Estimation and Localization in FPGAs: A Visual Tool for Interconnect Prediction David Yeager Darius Chiu Guy Lemieux Dept of Electrical and Computer Engineering The University of British Columbia

More information

Model-Based Annotation of Online Handwritten Datasets

Model-Based Annotation of Online Handwritten Datasets Model-Based Annotation of Online Handwritten Datasets Anand Kumar, A. Balasubramanian, Anoo Namboodiri and C.V. Jawahar Center for Visual Information Technology, International Institute of Information

More information

Sensitivity of multi-product two-stage economic lotsizing models and their dependency on change-over and product cost ratio s

Sensitivity of multi-product two-stage economic lotsizing models and their dependency on change-over and product cost ratio s Sensitivity two stage EOQ model 1 Sensitivity of multi-roduct two-stage economic lotsizing models and their deendency on change-over and roduct cost ratio s Frank Van den broecke, El-Houssaine Aghezzaf,

More information

Patterned Wafer Segmentation

Patterned Wafer Segmentation atterned Wafer Segmentation ierrick Bourgeat ab, Fabrice Meriaudeau b, Kenneth W. Tobin a, atrick Gorria b a Oak Ridge National Laboratory,.O.Box 2008, Oak Ridge, TN 37831-6011, USA b Le2i Laboratory Univ.of

More information

Layered Switching for Networks on Chip

Layered Switching for Networks on Chip Layered Switching for Networks on Chi Zhonghai Lu zhonghai@kth.se Ming Liu mingliu@kth.se Royal Institute of Technology, Sweden Axel Jantsch axel@kth.se ABSTRACT We resent and evaluate a novel switching

More information

Depth Estimation for 2D to 3D Football Video Conversion

Depth Estimation for 2D to 3D Football Video Conversion International Research Journal of Alied and Basic Sciences 2013 Available online at www.irjabs.com ISSN 2251-838X / Vol, 6 (4): 475-480 Science Exlorer ublications Deth Estimation for 2D to 3D Football

More information

Optimization of Collective Communication Operations in MPICH

Optimization of Collective Communication Operations in MPICH To be ublished in the International Journal of High Performance Comuting Alications, 5. c Sage Publications. Otimization of Collective Communication Oerations in MPICH Rajeev Thakur Rolf Rabenseifner William

More information

Argo Programming Guide

Argo Programming Guide Argo Programming Guide Evangelia Kasaaki, asmus Bo Sørensen February 9, 2015 Coyright 2014 Technical University of Denmark This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International

More information

Congestion-Driven Regional Re-clustering for Low-Cost FPGAs

Congestion-Driven Regional Re-clustering for Low-Cost FPGAs Congestion-Driven Regional Re-clustering for Low-Cost FPGAs Darius Chiu, Guy G.F. Lemieux, Steve Wilton Electrical and Computer Engineering, University of British Columbia British Columbia, Canada dariusc@ece.ubc.ca

More information

Collective communication: theory, practice, and experience

Collective communication: theory, practice, and experience CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Comutat.: Pract. Exer. 2007; 19:1749 1783 Published online 5 July 2007 in Wiley InterScience (www.interscience.wiley.com)..1206 Collective

More information

Improved heuristics for the single machine scheduling problem with linear early and quadratic tardy penalties

Improved heuristics for the single machine scheduling problem with linear early and quadratic tardy penalties Imroved heuristics for the single machine scheduling roblem with linear early and quadratic tardy enalties Jorge M. S. Valente* LIAAD INESC Porto LA, Faculdade de Economia, Universidade do Porto Postal

More information

CASCH - a Scheduling Algorithm for "High Level"-Synthesis

CASCH - a Scheduling Algorithm for High Level-Synthesis CASCH a Scheduling Algorithm for "High Level"Synthesis P. Gutberlet H. Krämer W. Rosenstiel Comuter Science Research Center at the University of Karlsruhe (FZI) HaidundNeuStr. 1014, 7500 Karlsruhe, F.R.G.

More information

Optimizing the Number of Fog Nodes for Cloud-Fog-Thing Networks

Optimizing the Number of Fog Nodes for Cloud-Fog-Thing Networks 1 Otimizing the Number of Fog Nodes for Cloud-Fog-Thing Networks Eren Balevi, and Richard D. Gitlin Deartment of Electrical Engineering, University of South Florida Tama, Florida 336, USA erenbalevi@mail.usf.edu,

More information

Lecture 3: Geometric Algorithms(Convex sets, Divide & Conquer Algo.)

Lecture 3: Geometric Algorithms(Convex sets, Divide & Conquer Algo.) Advanced Algorithms Fall 2015 Lecture 3: Geometric Algorithms(Convex sets, Divide & Conuer Algo.) Faculty: K.R. Chowdhary : Professor of CS Disclaimer: These notes have not been subjected to the usual

More information

Bayesian Oil Spill Segmentation of SAR Images via Graph Cuts 1

Bayesian Oil Spill Segmentation of SAR Images via Graph Cuts 1 Bayesian Oil Sill Segmentation of SAR Images via Grah Cuts 1 Sónia Pelizzari and José M. Bioucas-Dias Instituto de Telecomunicações, I.S.T., TULisbon,Lisboa, Portugal sonia@lx.it.t, bioucas@lx.it.t Abstract.

More information

Introduction to Parallel Algorithms

Introduction to Parallel Algorithms CS 1762 Fall, 2011 1 Introduction to Parallel Algorithms Introduction to Parallel Algorithms ECE 1762 Algorithms and Data Structures Fall Semester, 2011 1 Preliminaries Since the early 1990s, there has

More information

Extracting Optimal Paths from Roadmaps for Motion Planning

Extracting Optimal Paths from Roadmaps for Motion Planning Extracting Otimal Paths from Roadmas for Motion Planning Jinsuck Kim Roger A. Pearce Nancy M. Amato Deartment of Comuter Science Texas A&M University College Station, TX 843 jinsuckk,ra231,amato @cs.tamu.edu

More information

Directed File Transfer Scheduling

Directed File Transfer Scheduling Directed File Transfer Scheduling Weizhen Mao Deartment of Comuter Science The College of William and Mary Williamsburg, Virginia 387-8795 wm@cs.wm.edu Abstract The file transfer scheduling roblem was

More information

Face Recognition Based on Wavelet Transform and Adaptive Local Binary Pattern

Face Recognition Based on Wavelet Transform and Adaptive Local Binary Pattern Face Recognition Based on Wavelet Transform and Adative Local Binary Pattern Abdallah Mohamed 1,2, and Roman Yamolskiy 1 1 Comuter Engineering and Comuter Science, University of Louisville, Louisville,

More information

Source-to-Source Code Generation Based on Pattern Matching and Dynamic Programming

Source-to-Source Code Generation Based on Pattern Matching and Dynamic Programming Source-to-Source Code Generation Based on Pattern Matching and Dynamic Programming Weimin Chen, Volker Turau TR-93-047 August, 1993 Abstract This aer introduces a new technique for source-to-source code

More information

Implementation of Evolvable Fuzzy Hardware for Packet Scheduling Through Online Context Switching

Implementation of Evolvable Fuzzy Hardware for Packet Scheduling Through Online Context Switching Imlementation of Evolvable Fuzzy Hardware for Packet Scheduling Through Online Context Switching Ju Hui Li, eng Hiot Lim and Qi Cao School of EEE, Block S Nanyang Technological University Singaore 639798

More information

Simple Memory Machine Models for GPUs

Simple Memory Machine Models for GPUs 2012 IEEE 2012 26th IEEE International 26th International Parallel Parallel and Distributed and Distributed Processing Processing Symosium Symosium Workshos Workshos & PhD Forum Simle Memory Machine Models

More information

A Yoke of Oxen and a Thousand Chickens for Heavy Lifting Graph Processing

A Yoke of Oxen and a Thousand Chickens for Heavy Lifting Graph Processing A Yoke of Oxen and a Thousand Chickens for Heavy Lifting Grah Processing Abdullah Gharaibeh, Lauro Beltrão Costa, Elizeu Santos-Neto, Matei Rieanu Deartment of Electrical and Comuter Engineering, The University

More information

Memory Footprint Reduction for FPGA Routing Algorithms

Memory Footprint Reduction for FPGA Routing Algorithms Memory Footprint Reduction for FPGA Routing Algorithms Scott Y.L. Chin, and Steven J.E. Wilton Department of Electrical and Computer Engineering University of British Columbia Vancouver, B.C., Canada email:

More information

Single character type identification

Single character type identification Single character tye identification Yefeng Zheng*, Changsong Liu, Xiaoqing Ding Deartment of Electronic Engineering, Tsinghua University Beijing 100084, P.R. China ABSTRACT Different character recognition

More information

Autonomic Physical Database Design - From Indexing to Multidimensional Clustering

Autonomic Physical Database Design - From Indexing to Multidimensional Clustering Autonomic Physical Database Design - From Indexing to Multidimensional Clustering Stehan Baumann, Kai-Uwe Sattler Databases and Information Systems Grou Technische Universität Ilmenau, Ilmenau, Germany

More information

Reducing Structural Bias in Technology Mapping

Reducing Structural Bias in Technology Mapping Reducing Structural Bias in Technology Maing S. Chatterjee A. Mishchenko R. Brayton Deartment of EECS U. C. Berkeley {satrajit, alanmi, brayton}@eecs.berkeley.edu X. Wang T. Kam Strategic CAD Labs Intel

More information

Use of Multivariate Statistical Analysis in the Modelling of Chromatographic Processes

Use of Multivariate Statistical Analysis in the Modelling of Chromatographic Processes Use of Multivariate Statistical Analysis in the Modelling of Chromatograhic Processes Simon Edwards-Parton 1, Nigel itchener-hooker 1, Nina hornhill 2, Daniel Bracewell 1, John Lidell 3 Abstract his aer

More information

Tiling for Performance Tuning on Different Models of GPUs

Tiling for Performance Tuning on Different Models of GPUs Tiling for Performance Tuning on Different Models of GPUs Chang Xu Deartment of Information Engineering Zhejiang Business Technology Institute Ningbo, China colin.xu198@gmail.com Steven R. Kirk, Samantha

More information

A Morphological LiDAR Points Cloud Filtering Method based on GPGPU

A Morphological LiDAR Points Cloud Filtering Method based on GPGPU A Morhological LiDAR Points Cloud Filtering Method based on GPGPU Shuo Li 1, Hui Wang 1, Qiuhe Ma 1 and Xuan Zha 2 1 Zhengzhou Institute of Surveying & Maing, No.66, Longhai Middle Road, Zhengzhou, China

More information

PREDICTING LINKS IN LARGE COAUTHORSHIP NETWORKS

PREDICTING LINKS IN LARGE COAUTHORSHIP NETWORKS PREDICTING LINKS IN LARGE COAUTHORSHIP NETWORKS Kevin Miller, Vivian Lin, and Rui Zhang Grou ID: 5 1. INTRODUCTION The roblem we are trying to solve is redicting future links or recovering missing links

More information

EDGE: A ROUTING ALGORITHM FOR MAXIMIZING THROUGHPUT AND MINIMIZING DELAY IN WIRELESS SENSOR NETWORKS

EDGE: A ROUTING ALGORITHM FOR MAXIMIZING THROUGHPUT AND MINIMIZING DELAY IN WIRELESS SENSOR NETWORKS EDGE: A ROUTING ALGORITHM FOR MAXIMIZING THROUGHPUT AND MINIMIZING DELAY IN WIRELESS SENSOR NETWORKS Shuang Li, Alvin Lim, Santosh Kulkarni and Cong Liu Auburn University, Comuter Science and Software

More information

Privacy Preserving Moving KNN Queries

Privacy Preserving Moving KNN Queries Privacy Preserving Moving KNN Queries arxiv:4.76v [cs.db] 4 Ar Tanzima Hashem Lars Kulik Rui Zhang National ICT Australia, Deartment of Comuter Science and Software Engineering University of Melbourne,

More information

Mining Association rules with Dynamic and Collective Support Thresholds

Mining Association rules with Dynamic and Collective Support Thresholds Mining Association rules with Dynamic and Collective Suort Thresholds C S Kanimozhi Selvi and A Tamilarasi Abstract Mining association rules is an imortant task in data mining. It discovers the hidden,

More information

Chapter 8: Adaptive Networks

Chapter 8: Adaptive Networks Chater : Adative Networks Introduction (.1) Architecture (.2) Backroagation for Feedforward Networks (.3) Jyh-Shing Roger Jang et al., Neuro-Fuzzy and Soft Comuting: A Comutational Aroach to Learning and

More information

FIELD-programmable gate arrays (FPGAs) are quickly

FIELD-programmable gate arrays (FPGAs) are quickly This article has been acceted for inclusion in a future issue of this journal Content is final as resented, with the excetion of agination IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

More information