                                                                                                Previous Algorithm [6]    SPACK Algorithm
Circuit                           Memory Requirements                                           Arrays   Utilization      Arrays   Utilization
Variable Length CODEC             one 768x16, two 32x7, one 512x1                               …        …                …        …
Discrete Cosine Transform Chip    two 16x…                                                      …        …                …        …
Video Compression                 one 24x112 (dual port), one 16x96 (dual port)                 …        …                …        …
Encryption Circuit                one 256x…                                                     …        …                1        100%
Robot Controller                  one 172x…                                                     …        …                2        42%
Filter                            two 8x24 (dual port), one 320x24                              …        …                …        …
Neural Network Chip 1             one 160x8, one 32x…                                           …        …                …        …
Neural Network Chip 2             one 1310x24, one 1024x16                                      …        …                …        …
DMA Chip for LAN                  one 15x24, one 16x4, one 256x32                               …        …                …        …
Translation Look-aside Buffer     two 256x59, one 16x18 (dual port)                             …        …                …        …
Proof-of-Concept Viterbi Decoder  three 128x8, one 28x3 (dual port)                             …        …                …        …
Image Backprojector               two 128x…                                                     …        …                …        …
DSP Control Unit                  one 1024x32, one 128x16, three 64x16, two 24x16               …        …                …        …
Vector Processing Unit            two 256x9, two 256x8, three 128x9, three 128x16 (dual port)   …        …                …        …
Communications Circuit 1          six 88x8, one 64x…                                            …        …                …        …
Communications Circuit 2          three 736x…                                                   …        …                …        …
Communications Circuit 3          four 368x16, one 736x16                                       …        …                …        …
Communications Circuit 4          two 1620x3, two 168x12, two 366x11                            …        …                …        …
Communications Circuit 5          one 192x…                                                     …        …                …        …
Average                                                                                         9.89     35.4%            7.1      51.7%

Table 2. Experimental Results

8. J. Cong and S. Xu, "Technology mapping for FPGAs with embedded memory blocks," in Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 179–187, February.
9. Altera Corporation, "Implementing RAM functions in FLEX 10K devices." Technical Note, Nov.
10. P. K. Jha and N. D. Dutt, "Library mapping for memories," in Proceedings of the 1997 European Design and Test Conference, March.
11. D. Karchmer and J. Rose, "Definition and solution of the memory packing problem for field-programmable systems," in Proceedings of the IEEE International Conference on Computer-Aided Design, pp. 20–26.
12. H. Schmit and D. Thomas Jr., "Address generation for memories containing multiple arrays," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 17, May.
13. P. R. Panda and N. D. Dutt, "Behavioral array mapping into multiport memories targeting low power," in Proceedings of the 10th International Conference on VLSI Design, Jan.
14. M. Balakrishnan, A. Majumdar, D. Banerji, J. Linders, and J. Majithia, "Allocation of multiport memories in datapath synthesis," IEEE Transactions on Computer-Aided Design, vol. 7, April 1988.

wasted (because each physical array could not be completely filled with logical memories). Using the previous algorithm, the utilization is only 35.4% averaged over all circuits. SPACK results in a significantly higher utilization of 51.7%.

5 Conclusions

In this paper, we have presented a new logical-to-physical mapping algorithm that targets FPGAs with dual-port embedded arrays. The purpose of the algorithm is to map the memories required by a circuit to the physical FPGA memory resources. This is an important problem, since an implementation of a user's memory that requires even one more physical array than necessary could very easily cause a circuit to not fit on a given FPGA.

Previous work has studied FPGAs with single-port embedded arrays. Most current FPGAs, however, contain dual-port arrays. We have shown that by explicitly taking advantage of the dual-port nature of these arrays, our algorithm produces considerably more efficient implementations of the memory parts of circuits. Specifically, we have shown that under the right conditions, we can pack two single-port user memories (or parts of two single-port user memories) into a dual-port array. Our algorithm results in memory implementations that use, on average, 28% fewer arrays than an algorithm that does not take advantage of the dual-port arrays in this way.

Acknowledgments

This work was supported by Cypress Semiconductor, the Natural Sciences and Engineering Research Council of Canada, and UBC's Centre for Integrated Computer Systems Research.

References

1. Altera Corporation, Datasheet: FLEX 10K Embedded Programmable Logic Family, May.
2. Altera Corporation, Datasheet: FLEX 10KE Embedded Programmable Logic Family, August.
3. Xilinx, Inc., "Virtex: Our new million-gate 100-MHz FPGA technology." XCell: The Quarterly Journal for Xilinx Programmable Logic Users, First Quarter.
4. Actel Corporation, Datasheet: Integrator Series FPGAs: 40MX and 42MX Families, April.
5. Lattice Semiconductor Corporation, Datasheet: ispLSI and pLSI 6192 High Density Programmable Logic with Dedicated Memory and Register/Counter Modules, July.
6. S. J. E. Wilton, Architectures and Algorithms for Field-Programmable Gate Arrays with Embedded Memory. PhD thesis, University of Toronto.
7. S. J. E. Wilton, "SMAP: heterogeneous technology mapping for FPGAs with embedded memory arrays," in ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 171–178, February 1998.

phase2:
    bin list = ∅
    sort component list (largest component first)
    for each component i {
        for each bin j in the bin list {
            if fits(component i, bin j) {
                add component i to bin j
                go to next component
            }
        }
        b = new bin with component i as its only occupant
        bin list = bin list ∪ b
    }

fits(component i, bin j) {
    k = component already in bin j
    if (P ≥ pc_i + pc_k) and (mc_i = mc_k) and
       [ (W_mc_i ≥ wc_i + wc_k) or
         (B/W_mc_i ≥ dc_i + pow2(dc_k)) or
         (B/W_mc_k ≥ dc_k + pow2(dc_i)) ] {
        return(yes)
    } else {
        return(no)
    }
}

Fig. 5. Summary of Phase 2 of the Algorithm

Note that the purpose of Table 2 is not to compare FPGAs with single-port arrays to those with dual-port arrays. If the FPGA had only single-port arrays, many of the circuits in the table could not even have been implemented (unless the dual-port user memories were implemented by time-multiplexing two ports onto a single physical port). Rather, the purpose of the table is to show that when targeting FPGAs with dual-port arrays, our algorithm performs considerably better than the previous algorithm. Since we expect most future FPGAs to contain dual-port arrays, this is an important result.

The fourth and sixth columns of Table 2 show the utilization of the arrays. The utilization is defined as:

$$ \text{utilization} = 100 \times \frac{\text{number of bits in logical memory configuration}}{(\text{number of bits in each array})(\text{number of arrays used})} $$

A utilization of 100% means that every bit in the physical arrays was used, while a utilization lower than 100% means that some bits in the arrays were

general, the following condition must be true to combine arrays:

$$ P \ge pc_i + pc_j \tag{6} $$

Most FPGAs with dual-port arrays require that each port in a physical array be used in the same mode (e.g., one port cannot be used as a 4Kx1 while the other is used as a 512x8). Thus, one final constraint is that two components i and j can only be packed together if:

$$ mc_i = mc_j \tag{7} $$

With these constraints, we can formulate the packing problem as a multidimensional bin-packing problem. The physical arrays are bins, and the components are the objects to be packed. In order for two components to be packed in the same bin, constraints 6 and 7 as well as either 4 or 5 must be satisfied. Figure 5 summarizes this phase of the algorithm.

3.3 Phase 3: Wire together the Memories

After phase 2, the components implementing each logical memory may be scattered among several physical arrays. The final step in the algorithm is to combine the arrays and connect them to the rest of the circuit. If the "horizontal" partitioning was used in phase 1, the address ports can be simply wired together. If the "vertical" partitioning was used in phase 1, a multiplexor and decoder are needed to connect the components. Both of these techniques are described in [6] and [9] and so will not be discussed further here.

4 Results and Discussion

Our algorithm was implemented in a program called SPACK. To evaluate SPACK, we compare it to results obtained from the algorithm presented in [6]. That algorithm maps single-port logical memories to single-port arrays. Each logical memory is broken into components and each component is implemented by a single physical array (this is the same as our algorithm, without Phase 2). Although all logical memories considered in [6] were single-port, the algorithm will support dual-port memories, as long as the physical arrays are dual-port. Thus, we can compare it directly with our algorithm using benchmark circuits containing both single and dual-port logical memories.
Table 2 shows the results from SPACK and the previous algorithm for 19 benchmark circuits. The circuits and their logical memory configurations are shown in the first two columns of Table 2. Each circuit was mapped to 4Kbit physical arrays, each of which can be used as a 4Kx1, 2Kx2, 1Kx4, or 512x8. The number of arrays required to implement each benchmark using each algorithm is shown in the third and fifth columns of the table. Averaged over all benchmark circuits, the previous algorithm required 9.89 arrays, while SPACK required only 7.1 arrays.
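The effect of Phase 2 on array count and utilization can be illustrated on a small case. The Python below is not the SPACK implementation; it is a minimal sketch of the first-fit loop and the fits test summarized in Figure 5. It assumes P = 2 (so a bin never holds more than two components), 4096-bit arrays, and that "largest component first" means most bits first. The names `Component`, `fits`, `pack` and `utilization` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Component:
    width: int       # wc
    depth: int       # dc
    ports: int       # pc
    mode_width: int  # mc, identified here by the width W of the chosen array mode


def pow2(x):
    """Round x (>= 1) up to the next power of two."""
    return 1 << (x - 1).bit_length()


def fits(new, old, bits_per_array=4096, ports_per_array=2):
    """Can `new` share a physical array with the single occupant `old`?"""
    if new.ports + old.ports > ports_per_array:    # condition (6)
        return False
    if new.mode_width != old.mode_width:           # condition (7)
        return False
    w = new.mode_width
    d = bits_per_array // w
    return (w >= new.width + old.width             # condition (4), horizontal
            or d >= new.depth + pow2(old.depth)    # condition (5), vertical
            or d >= old.depth + pow2(new.depth))   # condition (5), roles swapped


def pack(components, bits_per_array=4096, ports_per_array=2):
    """First-fit packing, largest component first (assumption: by bit count)."""
    bins = []
    for comp in sorted(components, key=lambda c: c.width * c.depth, reverse=True):
        for b in bins:
            # Assumption: with P = 2 a bin with two occupants is already full.
            if len(b) == 1 and fits(comp, b[0], bits_per_array, ports_per_array):
                b.append(comp)
                break
        else:
            bins.append([comp])
    return bins


def utilization(components, num_arrays, bits_per_array=4096):
    """Percentage of physical bits that hold logical-memory bits."""
    used = sum(c.width * c.depth for c in components)
    return 100.0 * used / (bits_per_array * num_arrays)


# Two single-port 192x8 memories, each one component in the 512x8 mode.
a = Component(width=8, depth=192, ports=1, mode_width=8)
b = Component(width=8, depth=192, ports=1, mode_width=8)
packed = pack([a, b])
print(len(packed), utilization([a, b], len(packed)))  # 1 75.0
print(2, utilization([a, b], 2))                      # one array per component: 2 37.5
```

Without packing, each component occupies its own array, which corresponds to the previous algorithm as described above. A component that needs both ports always opens a new bin in this sketch, because condition (6) fails against any occupant.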

[Figure 4: panels (a), (b) and (c), showing components Mem i and Mem j placed in a physical array of width W and depth B/W; the panels are labelled with $wc_i$, $wc_j$, $dc_i$ and $dc_j$]

Fig. 4. Combining Single-Port Memories into Dual-Port Physical Arrays

Figure 4(b) shows how two components i and j can be combined "vertically". In this case, $dc_i$ of the words are used to implement component i and $dc_j$ are used to implement component j. By accessing words 0 to $dc_i - 1$ through one port and words $dc_i$ to $dc_i + dc_j - 1$ through the other port, we can access each component independently. A straightforward implementation of this, however, would require an adder on the path feeding the second address port, since an offset of $dc_i$ must be added to the second component's address. This adder would be on the critical path of the memory access, which could slow down the circuit.

We can eliminate the need for an adder as long as the following condition holds:

$$ B/W \ge dc_i + \mathrm{pow2}(dc_j) \tag{5} $$

where pow2(x) means x is "rounded-up" to the next highest power-of-two. As long as condition 5 holds, we can pack the components as shown in Figure 4(c). Component i is implemented starting at word 0, so no adder is needed on the first address port. Component j is implemented in the top $\mathrm{pow2}(dc_j)$ words. Then, $\log_2(\mathrm{pow2}(dc_j))$ address lines in the second address port can be used to address component j, while the remaining lines in the second address port are set to '1'. In this way, both memories can be accessed independently, and no adder is needed on either address port. The example in Figure 3(b) was implemented in this way.

Note that condition 5 is sufficient but not necessary. By employing the addressing techniques in [12], the constraint could be relaxed somewhat. Experimentation has shown, however, that for our problem, these more complex addressing schemes rarely lead to a final implementation that uses fewer memory arrays.

The above discussion has assumed that both components are single-ported. If either component requires two access ports, then it cannot be packed with any other component, and must be implemented in its own physical array. In

[Figure 3: (a) two dual-port 512x8 arrays, each holding one logical memory addressed by address bits 0–7, with the remaining address and data resources unused; (b) one dual-port 512x8 array holding both logical memories, with the MSB of each address port tied off]

Fig. 3. Two ways of implementing two single-port 192x8 logical memories

192x8 logical memories using an architecture with B = 4096 and P = 2. Phase 1 would create two components, each 192x8. If these components were implemented directly, two memory arrays would be required, as shown in Figure 3(a). Even though the original logical memory configuration only consists of 3072 bits, a total of 8192 bits (two physical arrays) are used to implement it. An alternative implementation is shown in Figure 3(b); in this implementation, each logical memory is mapped to a portion of a single array, and one of the array's two ports is used for each logical memory. The upper order address bit of port 1 is tied to 0, while the upper order address bit of port 2 is tied to 1. This ensures that each port sees a different 256x8 portion of the physical array. Since both ports are independent, both logical memories can be accessed independently.

This example illustrates the goal in phase 2: the components from phase 1 are packed into the available memory arrays such that the total number of required arrays is as small as possible.

Informally, two single-port components can be packed into a dual-port physical array "vertically" or "horizontally". Consider packing two arrays i and j "horizontally" as shown in Figure 4(a). In this case, the physical array is of width W, and the two components are of width $wc_i$ and $wc_j$. Each word in the memory contains $wc_i$ bits for component i and $wc_j$ bits for component j. By supplying an address to the first port's address bus, and accessing data through bits 0 to $wc_i - 1$ of the first port's data bus, the first component can be accessed. The second component can be accessed in the same way using the second port's address bus and bits $wc_i$ to $wc_i + wc_j - 1$ of the second port's data bus. Since the ports are independent, both components can be accessed independently. In order to combine arrays in this way, it is sufficient that:

$$ W \ge wc_i + wc_j \tag{4} $$

where W is the physical array width in the mode chosen to implement the components.
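The wiring implied by the two packing styles can be made concrete with a short sketch. The Python below is illustrative only and assumes nothing beyond the text: for horizontal packing it returns the data-bus bit ranges used by each port when condition (4) holds, and for vertical packing it computes the physical address seen on the second port when the unused upper address lines are tied to 1, as in the 192x8 example. The function names are hypothetical, and the caller is assumed to have checked that the two components fit in the array depth.

```python
def pow2(x):
    """Round x (>= 1) up to the next power of two."""
    return 1 << (x - 1).bit_length()


def horizontal_slices(array_width, wc_i, wc_j):
    """Inclusive data-bus bit ranges for components i and j sharing every word.

    Returns None when condition (4), W >= wc_i + wc_j, does not hold.
    """
    if array_width < wc_i + wc_j:
        return None
    return (0, wc_i - 1), (wc_i, wc_i + wc_j - 1)


def port2_physical_address(array_depth, dc_j, addr):
    """Physical word reached through port 2 for logical address `addr`.

    Component j sits in the top pow2(dc_j) words of the array, so the low
    log2(pow2(dc_j)) address lines carry `addr` and the remaining upper
    lines are tied to 1. No adder is involved, only a bitwise OR.
    """
    assert 0 <= addr < dc_j
    block = pow2(dc_j)
    return (array_depth - block) | addr


# A 3-bit and a 5-bit component side by side in an 8-bit-wide array.
print(horizontal_slices(8, 3, 5))            # ((0, 2), (3, 7))
# Second 192x8 memory in a 512x8 array: it occupies words 256..447.
print(port2_physical_address(512, 192, 0))   # 256
print(port2_physical_address(512, 192, 191)) # 447
```

In the 512x8 case only the most significant address bit of port 2 is tied to 1, which matches the description of Figure 3(b).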

Phase1:
    component list = ∅
    for each logical memory i {
        c = ∞
        for each physical array mode j (starting from widest) {
            c_j = ⌈w_i / W_j⌉ ⌈d_i / (B/W_j)⌉
            if c_j < c then {
                c = c_j
                m = j
            }
        }
        construct c new components for this logical memory
        calculate wc_k, dc_k, pc_k for each new component k
        mc_k = m for each new component k
        component list = component list ∪ new components
    }

Fig. 2. Summary of Phase 1 of the Algorithm

In general, there are many ways to partition each logical memory. To simplify the task, we only consider partitions in which each component has the same $mc_i$ (that is, each component can be implemented by a physical array in the same mode). The partitions in Figures 1(a) and 1(b) would be considered, therefore, while the one in Figure 1(c) would not. Note that this only applies to components that make up a single logical memory. A logical memory configuration typically has several logical memories; components from different logical memories may correspond to different physical array modes.

Given this assumption, the number of components required to implement a logical memory i using physical array mode j is:

$$ c = \left\lceil \frac{w_i}{W_j} \right\rceil \left\lceil \frac{d_i}{B/W_j} \right\rceil \tag{3} $$

To find the partition that results in the smallest c, we cycle through all possible array modes and choose the best result. This partitioning is done independently for each logical memory in the logical memory configuration. This is summarized in Figure 2.

3.2 Phase 2: Bin Packing

Given the list of components found in phase 1, it is possible to implement the logical memory configuration directly by using one physical array for each component. As will be shown in Section 4, this often results in very poor utilization of the memory arrays. As an example, consider implementing two single-port

[Figure 1: panels (a), (b) and (c)]

Fig. 1. Three ways to partition a 3584x3 logical memory

3 Logical-to-Physical Mapping Algorithm

In this section, our new logical-to-physical mapping algorithm is described. The algorithm consists of three phases: during the first phase, the logical memories are broken into components; in the second phase, these components are packed into the physical arrays; and in the third phase, these physical arrays are wired together to implement the original logical memory configuration.

3.1 Phase 1: Break Memories into Components

The first phase of the algorithm partitions each logical memory into several components, each of which is small enough to fit into a single physical array. Each component represents a portion of the bits in the original logical memory, and can be described by its width, $wc_i$, its depth, $dc_i$, and the number of ports required, $pc_i$. In order for the component to fit in a single physical array, $wc_i$ and $dc_i$ must satisfy the following inequalities:

$$ wc_i \le W_j \tag{1} $$
$$ dc_i \le B/W_j \tag{2} $$

for some value of j between 0 and M − 1 (recall that a physical array can be used in one of M modes, each of which has a different width/depth). The number of ports required by each component, $pc_i$, is the same as the number of ports required by the original logical memory. We also define a quantity $mc_i$ for each component which indicates the physical array mode(s) that can be used to implement this component.

As an example, Figure 1 shows three ways in which a 3584x3 logical memory could be broken into components. In each case, it is assumed that each physical array consists of 4096 bits (B = 4096) and can be used as a 4096x1, 2048x2, 1024x4, or a 512x8. In Figure 1(a), $wc_i = 1$ for each component, while in Figure 1(b), all components have $wc_i = 3$. In Figure 1(c), one of the components has $wc_i = 1$, while the others have $wc_i = 2$.
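The choice among array modes for the 3584x3 example can be checked numerically. The following Python is a minimal sketch, not the authors' code: it assumes the Phase 1 component count for a mode of width W is ⌈w/W⌉·⌈d/(B/W)⌉ and that modes are tried widest first with a strict "fewer than" comparison, so ties stay with the wider mode. The function names are hypothetical.

```python
from math import ceil, inf


def components_needed(width, depth, bits_per_array, mode_width):
    """Arrays needed for a width x depth memory using one array mode."""
    mode_depth = bits_per_array // mode_width
    return ceil(width / mode_width) * ceil(depth / mode_depth)


def choose_mode(width, depth, bits_per_array, mode_widths):
    """Try every mode, widest first, and keep the one needing the fewest arrays."""
    best_count, best_width = inf, None
    for mode_width in sorted(mode_widths, reverse=True):
        count = components_needed(width, depth, bits_per_array, mode_width)
        if count < best_count:
            best_count, best_width = count, mode_width
    return best_width, best_count


# 3584x3 logical memory on 4096-bit arrays usable as 512x8, 1024x4, 2048x2, 4096x1.
for w in (8, 4, 2, 1):
    print(w, components_needed(3, 3584, 4096, w))   # 7, 4, 4 and 3 arrays
print(choose_mode(3, 3584, 4096, [1, 2, 4, 8]))     # (1, 3)
```

Under these assumptions the 4096x1 mode wins with three components, which corresponds to the partition with $wc_i = 1$ for every component, while the 1024x4 mode needs four.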

used in many architectures today, we feel that true dual-port memories will be available in future devices. Thus, we focus our efforts on studying algorithms that target true dual-port memories.

2.2 User Circuit Assumptions

It is assumed that the user circuit to be implemented on the FPGA contains both logic and memory portions. In this paper, we are only concerned with the memory portion. We assume that the memory portion of the circuit consists of l independent user memories. We refer to each of these user memories as a logical memory. The set of all logical memories required for a circuit will be referred to as that circuit's logical memory configuration. The depth of logical memory k ($0 \le k \le l - 1$) will be denoted $d_k$, the width of logical memory k will be denoted $w_k$, and the number of ports required by logical memory k (maximum number of simultaneous accesses to memory k) will be denoted $p_k$. Unlike [6], we allow user memories that require either one or two ports. These parameters are summarized in the bottom half of Table 1.

2.3 Problem Statement

The problem studied in this paper can be stated as follows:

Given:
1. An FPGA architecture described by B, M and $W_i$ ($0 \le i < M$) as described in Subsection 2.1 (this paper only considers architectures with P = 2),
2. A Memory Configuration described by l, $d_k$, $w_k$, and $p_k$ ($0 \le k < l$), as described in Subsection 2.2 (in this paper, $1 \le p_k \le 2$ for all k),

Find: An implementation of the logical memory configuration using n embedded memories.

Such that: n is as small as possible.

Note that the goal is to implement the logical memory configuration using as few physical arrays as possible. In an FPGA with N arrays, it may appear that minimizing the number of arrays required to implement the logical memory configuration is immaterial, as long as the implementation requires N or fewer arrays. Minimizing the number of arrays is important, however, since the remaining arrays can be configured as ROMs, and be very efficiently used to implement the logic part of the user circuit [7, 8]. The fewer arrays that are used to implement memory, the more that will be available to implement logic.
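The inputs named in the problem statement map naturally onto two small records. The Python below is a sketch under stated assumptions: the class and function names are hypothetical, and the only facts taken from the paper are the parameters themselves (B, P, the mode widths, and the depth, width and port count of each logical memory), the rule that a mode of width W has depth B/W, and the FLEX10KE values quoted in Subsection 2.1.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Architecture:
    bits_per_array: int            # B
    ports_per_array: int           # P
    mode_widths: Tuple[int, ...]   # W_0 .. W_{M-1}

    def modes(self) -> List[Tuple[int, int]]:
        """(depth, width) of every mode; the depth of mode i is B / W_i."""
        return [(self.bits_per_array // w, w) for w in self.mode_widths]


@dataclass(frozen=True)
class LogicalMemory:
    depth: int   # d_k
    width: int   # w_k
    ports: int   # p_k, restricted to 1 or 2 in the paper


def check_configuration(arch: Architecture, memories: List[LogicalMemory]) -> bool:
    """True when every logical memory needs between 1 and P ports."""
    return all(1 <= m.ports <= arch.ports_per_array for m in memories)


# Parameter values quoted for the Altera FLEX10KE: B = 4096, P = 2, widths 2, 4, 8, 16.
FLEX10KE = Architecture(bits_per_array=4096, ports_per_array=2, mode_widths=(2, 4, 8, 16))
print(FLEX10KE.modes())   # [(2048, 2), (1024, 4), (512, 8), (256, 16)]
print(check_configuration(FLEX10KE, [LogicalMemory(192, 8, 1), LogicalMemory(64, 16, 2)]))
```

Keeping the architecture separate from the memory configuration mirrors the "Given" clauses above: the same list of logical memories can be checked against a different set of array modes without change.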

Parameter   Meaning
N           Number of Arrays
B           Bits per Array
P           Ports per Array
M           Number of Modes for each array
W_i         Data width of array in mode i
l           Number of Memories
d_k         Depth of Memory k
w_k         Width of Memory k
p_k         Ports in Memory k

Table 1. Architectural and Circuit Parameters

those obtained by simply extending a previous algorithm that was developed to target single-port arrays.

2 Problem Definition

In this section, we first describe our assumptions regarding the target FPGA architecture and the user circuit that is to be mapped, and then present a precise definition of the Logical-to-Physical Memory Mapping problem.

2.1 Architectural Framework

The top half of Table 1 summarizes the parameters that define the FPGA embedded memory array architecture. The number of embedded memory arrays is denoted by N, the number of bits in each array is denoted by B, and the number of independent access ports in each array is denoted by P. Each array can be used in one of M different modes; each mode has a different width and depth. The width of each array in mode i is denoted $W_i$; the depth can be calculated as $B/W_i$. In the Altera FLEX10KE, B = 4096 bits, P = 2, M = 4, and $\{W_0, W_1, W_2, W_3\} = \{2, 4, 8, 16\}$, meaning each array is dual-port and can be configured to be one of 2048x2, 1024x4, 512x8, or 256x16.

In this paper, we will only consider dual-port arrays, i.e., P = 2. Note that some FPGA architectures, such as the Altera FLEX10KE, contain two independent ports, but one port is a dedicated read port and one port is a dedicated write port. This works well for many applications (such as a first-in first-out buffer that is used to temporarily hold data in a communication system), but there are many applications for which this is insufficient (a dual-port register file in a processor which must be read by two functional units simultaneously, for example). To implement these sorts of circuits, true dual-port memory arrays are required, in which the two accesses are independent, and either can be a read or write. With the increasing importance of embedded memory in FPGAs, and since true dual-port arrays appear to be a natural evolution from the restricted dual-port model

depth. As an example, the Altera 10KE devices contain between six and twenty 4-Kbit blocks, each of which can be used as a 2Kx2, a 1Kx4, a 512x8, or a 256x16 array (in this paper, an axb memory has a words of b bits each). These arrays can be combined to implement larger user memories.

The task of implementing the memories required by a user circuit using the FPGA embedded arrays is called logical-to-physical mapping [6]. Because of the large number of ways in which arrays can be combined, and because each array can be used in one of several modes (widths/depths), this problem is not trivial. Yet it is vitally important: since each FPGA contains only a few memory arrays, a sub-optimal implementation that wastes even one memory array could very easily cause a circuit to not fit on a given FPGA. Even if the memory configuration does fit on the FPGA, minimizing the number of arrays needed to implement the storage part of the circuit is beneficial because unused memory arrays can be configured as ROM and used to implement logic [7, 8].

In [6, 9], logical-to-physical mapping for FPGAs with single-port embedded arrays (arrays in which only one access can be performed at a time) is discussed. Many recent FPGAs, however, contain dual-port arrays (so that two accesses can be performed by each array concurrently) [2–4]. Many applications require memories that can be accessed simultaneously by two separate subcircuits; these applications can most efficiently be implemented if the FPGA has dual-port arrays.

In this paper, a new logical-to-physical mapping algorithm that targets dual-port arrays is presented. We show that this new algorithm results in much more efficient implementations than if we simply extend the techniques targeting single-port arrays [6, 9]. The user circuits are assumed to consist of both single and dual-port user memories; our improvement is obtained by intelligently packing the single-port user memories into the dual-port physical arrays. Under the right conditions, each dual-port array can implement two single-port memories (or parts of two single-port memories).

Besides [6] and [9], little work has been done in this area. Jha and Dutt describe an algorithm to map logical memories to physical library elements [10], but do not consider the optimizations that are possible when the physical elements are dual-port. Karchmer and Rose show how user memories can be implemented by larger physical memory chips, but only consider single-port physical arrays [11]. Their work is also different in that they consider discrete memory devices, which do not have the variety of modes that FPGA memory arrays have. There has also been considerable work mapping variables to both single and dual port memories during high-level synthesis in an attempt to minimize the execution time of an algorithm [12–14]. None of these papers consider physical memories with the configurability of FPGA arrays, however.

This paper is organized as follows. Section 2 presents our assumptions regarding the FPGA architecture and the application circuits, and then gives a precise definition of the problem solved in this paper. Section 3 then describes our algorithm. Finally, Section 4 compares the results from our algorithm with

Logical-to-Physical Memory Mapping for FPGAs with Dual-Port Embedded Arrays

William K.C. Ho and Steven J.E. Wilton
Department of Electrical and Computer Engineering
University of British Columbia, Vancouver, B.C., Canada
{williamh, stevew}

Abstract. On-chip storage has become critical in large FPGAs. This has led most FPGA vendors to include configurable embedded arrays in their devices. Because of the large number of ways in which the arrays can be combined, and because of the configurability of each array, there are often many ways to implement the memories required by a circuit. Implementing user memories using physical arrays is called logical-to-physical mapping, and has previously been studied for single-port FPGA memory arrays. Most current FPGAs, however, contain dual-port arrays. In this paper, we present a logical-to-physical algorithm that specifically targets dual-port FPGA arrays. We show that this algorithm results in 28% denser memory implementations than the only previously published algorithm.

1 Introduction

It has become clear that on-chip storage is critical in large FPGAs. As FPGAs grow, they are being used to implement entire systems, rather than the small logic subcircuits that have traditionally been targeted to FPGAs. One of the important differences between these large systems and smaller logic subcircuits is that the large systems often require storage. Although this storage could be implemented off-chip, on-chip storage has a number of advantages. Besides the obvious advantages of integration, on-chip storage will often lead to higher clock frequencies, since I/O pins need not be driven with each memory access. In addition, on-chip storage will relax I/O pin requirements, since pins need not be devoted to external memory connections.

These advantages have led most FPGA vendors to produce architectures with significant amounts of on-chip storage. Since the storage requirements of circuits vary widely, the FPGA memory architecture must be flexible enough to implement different numbers of independently addressable memories as well as different memory shapes and sizes. Many recent commercial devices, such as the Altera 10K and 10KE devices [1, 2], the Xilinx Virtex FPGAs [3], the Actel 42MX [4], and the Lattice ispLSI 6192 FPGAs [5], provide several large arrays embedded into the FPGA. Each array can typically be used in one of several modes, each with a different width and


Iterative Single-Image Digital Super-Resolution Using Partial High-Resolution Data Iterative Sinle-Imae Diital Super-Resolution Usin Partial Hih-Resolution Data Eran Gur, Member, IAENG and Zeev Zalevsky Abstract The subject of extractin hih-resolution data from low-resolution imaes is

More information

optimization agents user interface agents database agents program interface agents

optimization agents user interface agents database agents program interface agents A MULTIAGENT SIMULATION OPTIMIZATION SYSTEM Sven Hader Department of Computer Science Chemnitz University of Technoloy D-09107 Chemnitz, Germany E-Mail: sha@informatik.tu-chemnitz.de KEYWORDS simulation

More information

Efficient and Provably Secure Ciphers for Storage Device Block Level Encryption

Efficient and Provably Secure Ciphers for Storage Device Block Level Encryption Efficient and Provably Secure Ciphers for Storae Device Block evel Encryption Yulian Zhen SIS Department, UNC Charlotte yzhen@uncc.edu Yone Wan SIS Department, UNC Charlotte yonwan@uncc.edu ABSTACT Block

More information

Chapter 5 THE MODULE FOR DETERMINING AN OBJECT S TRUE GRAY LEVELS

Chapter 5 THE MODULE FOR DETERMINING AN OBJECT S TRUE GRAY LEVELS Qian u Chapter 5. Determinin an Object s True Gray evels 3 Chapter 5 THE MODUE OR DETERMNNG AN OJECT S TRUE GRAY EVES This chapter discusses the module for determinin an object s true ray levels. To compute

More information

An Intelligent Multi-Port Memory

An Intelligent Multi-Port Memory JOURNAL OF COMPUTERS, VOL. 5, NO. 3, MARCH 2010 471 An Intelligent Multi-Port Memory Zuo Wang School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China wuchenjian.wang@gmail.com

More information

Grooming Multicast Traffic in Unidirectional SONET/WDM Rings

Grooming Multicast Traffic in Unidirectional SONET/WDM Rings 1 Groomin Multicast Traffic in Unidirectional SONET/WDM Rins Anuj Rawat, Richard La, Steven Marcus and Mark Shayman Abstract In this paper we study the problem of efficient roomin of iven non-uniform multicast

More information

WAVELENGTH Division Multiplexing (WDM) significantly

WAVELENGTH Division Multiplexing (WDM) significantly IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 25, NO. 6, AUGUST 27 1 Groomin Multicast Traffic in Unidirectional SONET/WDM Rins Anuj Rawat, Richard La, Steven Marcus, and Mark Shayman Abstract

More information

FPGA Technology Mapping: A Study of Optimality

FPGA Technology Mapping: A Study of Optimality FPGA Technoloy Mappin: A Study o Optimality Andrew Lin Department o Electrical and Computer Enineerin University o Toronto Toronto, Canada alin@eec.toronto.edu Deshanand P. Sinh Altera Corporation Toronto

More information

Survey on Error Control Coding Techniques

Survey on Error Control Coding Techniques Survey on Error Control Codin Techniques Suriya.N 1 SNS Collee of Enineerin, Department of ECE, surikala@mail.com S.Kamalakannan 2 SNS Collee of Enineerin, Department of ECE, kamalakannan.ap@mail.com Abstract

More information

HCE: A New Channel Exchange Scheme for Handovers in Mobile Cellular Systems

HCE: A New Channel Exchange Scheme for Handovers in Mobile Cellular Systems 129 HCE: A New Channel Exchane Scheme for Handovers in Mobile Cellular Systems DNESH K. ANVEKAR* and S. SANDEEP PRADHAN Department of Electrical Communication Enineerin ndian nstitute of Science, Banalore

More information

FPGA Clock Network Architecture: Flexibility vs. Area and Power

FPGA Clock Network Architecture: Flexibility vs. Area and Power FPGA Clock Network Architecture: Flexibility vs. Area and Power Julien Lamoureux and Steven J.E. Wilton Department of Electrical and Computer Engineering University of British Columbia Vancouver, B.C.,

More information

Heuristic Searching: A* Search

Heuristic Searching: A* Search IOSR Journal of Computer Enineerin (IOSRJCE) ISSN: 2278-0661 Volume 4, Issue 2 (Sep.-Oct. 2012), PP 01-05 Nazmul Hasan (Department of Computer Science and Enineerin, Military Institute of Science and Technoloy,

More information

WILTON ET AL: STRUCTURAL ANALYSIS AND GENERATION OF DIGITAL CIRCUITS WITH MEMORY 1. Structural Analysis and Generation of Synthetic

WILTON ET AL: STRUCTURAL ANALYSIS AND GENERATION OF DIGITAL CIRCUITS WITH MEMORY 1. Structural Analysis and Generation of Synthetic WILTON ET AL: STRUCTURAL ANALYSIS AND GENERATION OF DIGITAL CIRCUITS WITH MEMORY 1 Structural Analysis and Generation of Synthetic Digital Circuits with Memory Steven J.E. Wilton Department of Electrical

More information

The Role of Switching in Reducing the Number of Electronic Ports in WDM Networks

The Role of Switching in Reducing the Number of Electronic Ports in WDM Networks 1 The Role of Switchin in Reducin the Number of Electronic Ports in WDM Networks Randall A. Berry and Eytan Modiano Abstract We consider the role of switchin in minimizin the number of electronic ports

More information

A Web Architecture for progressive delivery of 3D content

A Web Architecture for progressive delivery of 3D content A Web Architecture for proressive delivery of 3D content Efi Foel Enbaya, Ltd. Daniel Cohen-Or Λ Tel Aviv University Revital Ironi Enbaya, Ltd. Tali Zvi Enbaya, Ltd. Enbaya, Ltd. y (a) (b) (c) Fiure 1:

More information

Cached. Cached. Cached. Active. Active. Active. Active. Cached. Cached. Cached

Cached. Cached. Cached. Active. Active. Active. Active. Cached. Cached. Cached Manain Pipeline-Reconurable FPGAs Srihari Cadambi, Jerey Weener, Seth Copen Goldstein, Herman Schmit, and Donald E. Thomas Carneie Mellon University Pittsburh, PA 15213-3890 fcadambi,weener,seth,herman,thomas@ece.cmu.edu

More information

FPGA Implementations of the RC6 Block Cipher

FPGA Implementations of the RC6 Block Cipher FGA Implementations of the RC6 lock Cipher JeanLuc euchat Laboratoire de l Informatique du arallélisme Ecole Normale Supérieure de Lyon 46 Allée d Italie F 69364 Lyon Cedex 07 JeanLuceuchat@enslyonfr Abstract

More information

IN SUPERSCALAR PROCESSORS DAVID CHU LIN. B.S., University of Illinois, 1990 THESIS. Submitted in partial fulllment of the requirements

IN SUPERSCALAR PROCESSORS DAVID CHU LIN. B.S., University of Illinois, 1990 THESIS. Submitted in partial fulllment of the requirements COMPILER SUPPORT FOR PREDICTED EXECUTION IN SUPERSCLR PROCESSORS BY DVID CHU LIN B.S., University of Illinois, 1990 THESIS Submitted in partial fulllment of the requirements for the deree of Master of

More information

FPGA Trojans through Detecting and Weakening of Cryptographic Primitives

FPGA Trojans through Detecting and Weakening of Cryptographic Primitives FPGA Trojans throuh Detectin and Weakenin of Cryptoraphic Primitives Pawel Swierczynski, Marc Fyrbiak, Philipp Koppe, and Christof Paar, Fellow, IEEE Horst Görtz Institute for IT Security, Ruhr University

More information

Digital Integrated Circuits

Digital Integrated Circuits Digital Integrated Circuits Lecture 9 Jaeyong Chung Robust Systems Laboratory Incheon National University DIGITAL DESIGN FLOW Chung EPC6055 2 FPGA vs. ASIC FPGA (A programmable Logic Device) Faster time-to-market

More information

Luca Benini Davide Bruni Alberto Macii Enrico Macii. Bologna, ITALY Torino, ITALY 10129

Luca Benini Davide Bruni Alberto Macii Enrico Macii. Bologna, ITALY Torino, ITALY 10129 Hardware-Assisted Data for Enery Minimization in Systems with Embedded Processors Luca Benini Davide Bruni Alberto Macii Enrico Macii Universita di Bolona Politecnico di Torino Bolona, ITALY 40136 Torino,

More information

Calder, Feller, & Eustace break true data-dependencies by predictin the outcome value of instructions before they are executed, and forwardin these sp

Calder, Feller, & Eustace break true data-dependencies by predictin the outcome value of instructions before they are executed, and forwardin these sp Journal of Instruction-Level Parallelism 1 è1999è 1-6 Submitted 6è98; published 3è99 Value Proælin and Optimization Brad Calder Peter Feller Department of Computer Science and Enineerin University of California,

More information

where C is traversed in the clockwise direction, r 5 èuè =h, sin u; cos ui; u ë0; çè; è6è where C is traversed in the counterclockwise direction èhow

where C is traversed in the clockwise direction, r 5 èuè =h, sin u; cos ui; u ë0; çè; è6è where C is traversed in the counterclockwise direction èhow 1 A Note on Parametrization The key to parametrizartion is to realize that the goal of this method is to describe the location of all points on a geometric object, a curve, a surface, or a region. This

More information

Department of Computer Science, Tsing Hua University. Quickturn Design Systems, Inc., 440 Clyde Avenue,

Department of Computer Science, Tsing Hua University. Quickturn Design Systems, Inc., 440 Clyde Avenue, DP Gen: A Datapath Generator for Multiple-FPGA Applications y Wen-Jon Fan 1, Allen C.-H. Wu 1, Ti-Yen Yen 2, and Tsair-Chin Lin 2 1 Department of Computer Science, Tsin Hua University Hsinchu, Taiwan,

More information

IBM Thomas J. Watson Research Center. Yorktown Heights, NY, U.S.A. The major advantage of BDDs is their eciency for a

IBM Thomas J. Watson Research Center. Yorktown Heights, NY, U.S.A. The major advantage of BDDs is their eciency for a Equivalence Checkin Usin Cuts and Heaps Andreas Kuehlmann Florian Krohm IBM Thomas J. Watson Research Center Yorktown Heihts, NY, U.S.A. Abstract This paper presents a verication technique which is specically

More information

General Design of Grid-based Data Replication. Schemes Using Graphs and a Few Rules. availability of read and write operations they oer and

General Design of Grid-based Data Replication. Schemes Using Graphs and a Few Rules. availability of read and write operations they oer and General esin of Grid-based ata Replication Schemes Usin Graphs and a Few Rules Oliver Theel University of California epartment of Computer Science Riverside, C 92521-0304, US bstract Grid-based data replication

More information

f y f x f z exu f xu syu s y s x s zl ezl f zl s zu ezu f zu sign logic significand multiplier exponent adder inc multiplexor multiplexor ty

f y f x f z exu f xu syu s y s x s zl ezl f zl s zu ezu f zu sign logic significand multiplier exponent adder inc multiplexor multiplexor ty A Combined Interval and Floatin Point Multiplier James E. Stine and Michael J. Schulte Computer Architecture and Arithmetic Laboratory Electrical Enineerin and Computer Science Department Lehih University

More information

On Programmable Memory Built-In Self Test Architectures. Kamran Zarrineh and Shambhu J. Upadhyaya

On Programmable Memory Built-In Self Test Architectures. Kamran Zarrineh and Shambhu J. Upadhyaya On Programmable Memory Built-In Self Test Architectures Kamran Zarrineh and Shambhu J. Upadhyaya Department of Electrical and Computer Engineering SUNY at Buæalo, Buæalo, NY 426 Abstract The design and

More information

Protocol Design for Congestion Management in. Narrowband Integrated Networks. B.E., University of Bombay, Bombay, 1992

Protocol Design for Congestion Management in. Narrowband Integrated Networks. B.E., University of Bombay, Bombay, 1992 Protocol Desin for Conestion Manaement in Narrowband Interated Networks by Sona S. Kapadia B.E., University of Bombay, Bombay, 1992 Submitted to the Department of Electrical Enineerin and Computer Science

More information

Improving robotic machining accuracy by real-time compensation

Improving robotic machining accuracy by real-time compensation University of Wollonon Research Online Faculty of Enineerin - Papers (Archive) Faculty of Enineerin and Information Sciences 2009 Improvin robotic machinin accuracy by real-time compensation Zeni Pan University

More information

Probabilistic Gaze Estimation Without Active Personal Calibration

Probabilistic Gaze Estimation Without Active Personal Calibration Probabilistic Gaze Estimation Without Active Personal Calibration Jixu Chen Qian Ji Department of Electrical,Computer and System Enineerin Rensselaer Polytechnic Institute Troy, NY 12180 chenji@e.com qji@ecse.rpi.edu

More information

FPGA: What? Why? Marco D. Santambrogio

FPGA: What? Why? Marco D. Santambrogio FPGA: What? Why? Marco D. Santambrogio marco.santambrogio@polimi.it 2 Reconfigurable Hardware Reconfigurable computing is intended to fill the gap between hardware and software, achieving potentially much

More information

Configurable Embedded Systems: Using Programmable Logic to Compress Embedded System Design Cycles

Configurable Embedded Systems: Using Programmable Logic to Compress Embedded System Design Cycles Class 330 Configurable Embedded Systems: Using Programmable Logic to Compress Embedded System Design Cycles Steven Knapp (sknapp) Arye Ziklik (arye) Triscend Corporation www.triscend.com Copyright 1998,

More information

Register Binding for FPGAs with Embedded Memory

Register Binding for FPGAs with Embedded Memory Register Binding for FPGAs with Embedded Memory Hassan Al Atat and Iyad Ouaiss iyad.ouaiss@lau.edu.lb Department of Computer Engineering Lebanese American University Byblos, Lebanon Abstract * The trend

More information

Reversible Image Merging for Low-level Machine Vision

Reversible Image Merging for Low-level Machine Vision Reversible Imae Merin for Low-level Machine Vision M. Kharinov St. Petersbur Institute for Informatics and Automation of RAS, 14_liniya Vasil evskoo ostrova 39, St. Petersbur, 199178 Russia, khar@iias.spb.su,

More information

Imitation: An Alternative to Generalization in Programming by Demonstration Systems

Imitation: An Alternative to Generalization in Programming by Demonstration Systems Imitation: An Alternative to Generalization in Prorammin by Demonstration Systems Technical Report UW-CSE-98-08-06 Amir Michail University of Washinton amir@cs.washinton.edu http://www.cs.washinton.edu/homes/amir/opsis.html

More information

i=266 to 382 ok Undo

i=266 to 382 ok Undo Softspec: Software-based Speculative Parallelism Derek Bruenin, Srikrishna Devabhaktuni, Saman Amarasinhe Laboratory for Computer Science Massachusetts Institute of Technoloy Cambride, MA 9 iye@mit.edu

More information

2 CHAPTR 1. BOTTOM UP PARSING 1. S ::= 4. T ::= T* F 2. ::= +T 5. j F 3. j T 6. F ::= 7. j Fiure 1.1: Our Sample Grammar for Bottom Up Parsin Our beli

2 CHAPTR 1. BOTTOM UP PARSING 1. S ::= 4. T ::= T* F 2. ::= +T 5. j F 3. j T 6. F ::= 7. j Fiure 1.1: Our Sample Grammar for Bottom Up Parsin Our beli Chapter 1 Bottom Up Parsin The key diæculty with top-down parsin is the requirement that the rammar satisfy the LL1 property. You will recall that this entailed knowin, when you are facin the token that

More information

TETROBOT: A NOVEL MECHANISM FOR RECONFIGURABLE PARALLEL ROBOTICS. Student, Machine Building Faculty, Technical University of Cluj-Napoca

TETROBOT: A NOVEL MECHANISM FOR RECONFIGURABLE PARALLEL ROBOTICS. Student, Machine Building Faculty, Technical University of Cluj-Napoca UNIVERSITATEA TRANSILVANIA DIN BRA0OV Catedra Desin de Produs 1i Robotic2 Simpozionul na8ional cu participare interna8ional9 PRoiectarea ASIstat9 de Calculator P R A S I C ' 02 Vol. I Mecanisme 1i Triboloie

More information

SBG SDG. An Accurate Error Control Mechanism for Simplification Before Generation Algorihms

SBG SDG. An Accurate Error Control Mechanism for Simplification Before Generation Algorihms An Accurate Error Control Mechanism for Simplification Before Generation Alorihms O. Guerra, J. D. Rodríuez-García, E. Roca, F. V. Fernández and A. Rodríuez-Vázquez Instituto de Microelectrónica de Sevilla,

More information

Reduce Your System Power Consumption with Altera FPGAs Altera Corporation Public

Reduce Your System Power Consumption with Altera FPGAs Altera Corporation Public Reduce Your System Power Consumption with Altera FPGAs Agenda Benefits of lower power in systems Stratix III power technology Cyclone III power Quartus II power optimization and estimation tools Summary

More information

Code Optimization Techniques for Embedded DSP Microprocessors

Code Optimization Techniques for Embedded DSP Microprocessors Code Optimization Techniques for Embedded DSP Microprocessors Stan Liao Srinivas Devadas Kurt Keutzer Steve Tjian Albert Wan MIT Department of EECS Synopsys, Inc. Cambride, MA 02139 Mountain View, CA 94043

More information

How Much Logic Should Go in an FPGA Logic Block?

How Much Logic Should Go in an FPGA Logic Block? How Much Logic Should Go in an FPGA Logic Block? Vaughn Betz and Jonathan Rose Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S 3G4 {vaughn, jayar}@eecgutorontoca

More information

Memory Footprint Reduction for FPGA Routing Algorithms

Memory Footprint Reduction for FPGA Routing Algorithms Memory Footprint Reduction for FPGA Routing Algorithms Scott Y.L. Chin, and Steven J.E. Wilton Department of Electrical and Computer Engineering University of British Columbia Vancouver, B.C., Canada email:

More information

COMPLETED LOCAL DERIVATIVE PATTERN FOR ROTATION INVARIANT TEXTURE CLASSIFICATION. Yuting Hu, Zhiling Long, and Ghassan AlRegib

COMPLETED LOCAL DERIVATIVE PATTERN FOR ROTATION INVARIANT TEXTURE CLASSIFICATION. Yuting Hu, Zhiling Long, and Ghassan AlRegib COMPLETED LOCAL DERIVATIVE PATTERN FOR ROTATION INVARIANT TEXTURE CLASSIFICATION Yutin Hu, Zhilin Lon, and Ghassan AlReib Multimedia & Sensors Lab (MSL) Center for Sinal and Information Processin (CSIP)

More information

Multi-Product Floorplan and Uncore Design Framework for Chip Multiprocessors

Multi-Product Floorplan and Uncore Design Framework for Chip Multiprocessors Multi-Product Floorplan and Uncore Desin Framework for hip Multiprocessors Marco scalante, Andrew B Kahn, Michael Kishinevsky, Umit Oras and Kambiz Samadi Intel orp, Hillsboro, OR, and S Departments, University

More information

Wavefront Cache-friendly Algorithm for Compact Numerical Schemes

Wavefront Cache-friendly Algorithm for Compact Numerical Schemes NASA/CR-1999-209708 ICASE Report No. 99-40 Waveront Cache-riendly Alorithm or Compact Numerical Schemes Alex Povitsky ICASE, Hampton, Virinia Institute or Computer Applications in Science and Enineerin

More information

An Internet Collaborative Environment for Sharing Java Applications

An Internet Collaborative Environment for Sharing Java Applications An Internet Collaborative Environment for Sharin Java Applications H. Abdel-Wahab and B. Kvande Department of Computer Science Old Dominion University Norfolk, Va 23529 fwahab,kvande@cs.odu.edu O. Kim

More information

Evolution of Implementation Technologies. ECE 4211/5211 Rapid Prototyping with FPGAs. Gate Array Technology (IBM s) Programmable Logic

Evolution of Implementation Technologies. ECE 4211/5211 Rapid Prototyping with FPGAs. Gate Array Technology (IBM s) Programmable Logic ECE 42/52 Rapid Prototyping with FPGAs Dr. Charlie Wang Department of Electrical and Computer Engineering University of Colorado at Colorado Springs Evolution of Implementation Technologies Discrete devices:

More information

A Recursive Algorithm for Low-Power Memory Partitioning Luca Benini* Alberto Macii** Massimo Poncino**

A Recursive Algorithm for Low-Power Memory Partitioning Luca Benini* Alberto Macii** Massimo Poncino** A Recursive Alorithm for Low-Power ory Partitionin Luca Benini* Alberto Macii** Massimo Poncino** *Universita di Bolona **Politecnico di Torino Bolona, Italy 40136 Torino, Italy 10129 Abstract ory-processor

More information

simply by implementing large parts of the system functionality in software running on application-speciæc instruction set processor èasipè cores. To s

simply by implementing large parts of the system functionality in software running on application-speciæc instruction set processor èasipè cores. To s SYSTEM MODELING AND IMPLEMENTATION OF A GENERIC VIDEO CODEC Jong-il Kim and Brian L. Evans æ Department of Electrical and Computer Engineering, The University of Texas at Austin Austin, TX 78712-1084 fjikim,bevansg@ece.utexas.edu

More information

In Proceedings of ICPP 90, The 1990 International Conference on Parallel Processing, Chicago,Illinois, August 1990.

In Proceedings of ICPP 90, The 1990 International Conference on Parallel Processing, Chicago,Illinois, August 1990. In Proceedins of IPP 9, The 99 International onference on Parallel Processin, hicao,illinois, uust 99. SETH VLSI HIP FOR THE REL-TIME INFORMTION DISPERSL ND RETRIEVL FOR SEURITY ND FULT-TOLERNE zer estavros

More information

On Defending Peer-to-Peer System-based Active Worm Attacks

On Defending Peer-to-Peer System-based Active Worm Attacks This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE GLOBECOM 25 proceedins. On Defendin eer-to-eer System-based Active

More information

Design æy. Ted J. Hubbard z. Erik K. Antonsson x. Engineering Design Research Laboratory. California Institute of Technology.

Design æy. Ted J. Hubbard z. Erik K. Antonsson x. Engineering Design Research Laboratory. California Institute of Technology. Cellular Automata Modeling in MEMS Design æy Ted J. Hubbard z Erik K. Antonsson x Engineering Design Research Laboratory Division of Engineering and Applied Science California Institute of Technology January

More information

Altera FLEX 8000 Block Diagram

Altera FLEX 8000 Block Diagram Altera FLEX 8000 Block Diagram Figure from Altera technical literature FLEX 8000 chip contains 26 162 LABs Each LAB contains 8 Logic Elements (LEs), so a chip contains 208 1296 LEs, totaling 2,500 16,000

More information

CONTRAST COMPENSATION FOR BACK-LIT AND FRONT-LIT COLOR FACE IMAGES VIA FUZZY LOGIC CLASSIFICATION AND IMAGE ILLUMINATION ANALYSIS

CONTRAST COMPENSATION FOR BACK-LIT AND FRONT-LIT COLOR FACE IMAGES VIA FUZZY LOGIC CLASSIFICATION AND IMAGE ILLUMINATION ANALYSIS CONTRAST COMPENSATION FOR BACK-LIT AND FRONT-LIT COLOR FACE IMAGES VIA FUZZY LOGIC CLASSIFICATION AND IMAGE ILLUMINATION ANALYSIS CHUN-MING TSAI 1, ZONG-MU YEH 2, YUAN-FANG WANG 3 1 Department of Computer

More information

SKIPJACK and Key Exchange Algorithm (KEA) By Carlton J. O Riley

SKIPJACK and Key Exchange Algorithm (KEA) By Carlton J. O Riley SKIPJACK and Key Exchane Alorithm (KEA) By Carlton J. O Riley.0 Introduction This report covers my implementation of the Sipjac alorithm and Key Exchane Alorithms in C usin some assembly code routines,

More information

Learning Deep Features for One-Class Classification

Learning Deep Features for One-Class Classification 1 Learnin Deep Features for One-Class Classification Pramuditha Perera, Student Member, IEEE, and Vishal M. Patel, Senior Member, IEEE Abstract We propose a deep learnin-based solution for the problem

More information

The SpecC Methodoloy Technical Report ICS December 29, 1999 Daniel D. Gajski Jianwen Zhu Rainer Doemer Andreas Gerstlauer Shuqin Zhao Department

The SpecC Methodoloy Technical Report ICS December 29, 1999 Daniel D. Gajski Jianwen Zhu Rainer Doemer Andreas Gerstlauer Shuqin Zhao Department The SpecC Methodoloy Technical Report ICS-99-56 December 29, 1999 Daniel D. Gajski Jianwen Zhu Rainer Doemer Andreas Gerstlauer Shuqin Zhao Department of Information and Computer Science University of

More information

Verification of Delta-Sigma Converters Using Adaptive Regression Modeling Λ

Verification of Delta-Sigma Converters Using Adaptive Regression Modeling Λ Verification of DeltaSima Converters Usin Adaptive Reression Modelin Λ Jeonjin Roh, Suresh Seshadri and Jacob A. Abraham Computer Enineerin Research Center The University of Texas at Austin Austin, TX

More information

Steven J.E. Wilton, Jonathan Rose, and Zvonko G. Vranesic. University oftoronto.

Steven J.E. Wilton, Jonathan Rose, and Zvonko G. Vranesic. University oftoronto. Architecture of Centralized Field-Congurable Memory Steven J.E. Wilton, Jonathan Rose, and Zvonko G. Vranesic Department of Electrical and Computer Engineering University oftoronto Toronto, Ontario, Canada,

More information

TCP-FIT: An Improved TCP Congestion Control Algorithm and its Performance

TCP-FIT: An Improved TCP Congestion Control Algorithm and its Performance TCP-FIT: An Improved TCP Conestion Control Alorithm and its Performance Jinyuan Wan, Jiantao Wen, Jun Zhan and Yuxin Han Key Laboratory of Pervasive Computin, Ministry of Education Tsinhua National Laboratory

More information

A Combined Delay and Throughput Proportional Scheduling Scheme for Differentiated Services

A Combined Delay and Throughput Proportional Scheduling Scheme for Differentiated Services A Combined Delay and Throuhput Proportional Schedulin Scheme for Differentiated Services Samyukta Sankaran and Ahmed E.Kamal Department of Electrical and Computer Enineerin Iowa State University Ames,

More information

OPTIMIZING COARSE- GRAINED UNITS IN FLOATING POINT HYBRID FPGA

OPTIMIZING COARSE- GRAINED UNITS IN FLOATING POINT HYBRID FPGA OPTIMIZING COARSE- GRAINED UNITS IN FLOATING POINT HYBRID FPGA Chi Wai Yu 1, Alastair M. Smith 2, Wayne Luk 1, Philip Leong 3, Steven J.E. Wilton 2 1 Dept of Computing Imperial College London, London {cyu,wl}@doc.ic.ac.uk

More information

19.2 View Serializability. Recall our discussion in Section?? of how our true goal in the design of a

19.2 View Serializability. Recall our discussion in Section?? of how our true goal in the design of a 1 19.2 View Serializability Recall our discussion in Section?? of how our true goal in the design of a scheduler is to allow only schedules that are serializable. We also saw how differences in what operations

More information

Development and Verification of an SP 3 Code Using Semi-Analytic Nodal Method for Pin-by-Pin Calculation

Development and Verification of an SP 3 Code Using Semi-Analytic Nodal Method for Pin-by-Pin Calculation Journal of Physical Science and Application 7 () (07) 0-7 doi: 0.765/59-5348/07.0.00 D DAVID PUBLISHIN Development and Verification of an SP 3 Code Usin Semi-Analytic Chuntao Tan Shanhai Nuclear Enineerin

More information

Recovery in Distributed Extended Long-lived Transaction Models æ

Recovery in Distributed Extended Long-lived Transaction Models æ Recovery in Distributed Extended Lon-lived Transaction Models æ M. M. Gore y andr.k.ghosh Department of Computer Science and Enineerin, Indian Institute of Technoloy, Kanpur, India ore, rk@iitk.ernet.in

More information

IN this paper, we establish the computational complexity of optimally solving multi-robot path planning problems

IN this paper, we establish the computational complexity of optimally solving multi-robot path planning problems Intractability of Optimal Multi-Robot Path Plannin on Planar Graphs Jinjin Yu Abstract arxiv:504.007v3 [cs.ro] 6 Dec 05 We study the computational complexity of optimally solvin multi-robot path plannin

More information

Efficient Minimization of Sum and Differential Costs on Machines with Job Placement Constraints

Efficient Minimization of Sum and Differential Costs on Machines with Job Placement Constraints Efficient Minimization of Sum and Differential Costs on Machines with Job Placement Constraints Jaya Prakash Champati and Ben Lian Department of Electrical and Computer Enineerin, University of Toronto

More information

Minimum-Cost Multicast Routing for Multi-Layered Multimedia Distribution

Minimum-Cost Multicast Routing for Multi-Layered Multimedia Distribution Minimum-Cost Multicast Routin for Multi-Layered Multimedia Distribution Hsu-Chen Chen and Frank Yeon-Sun Lin Department of Information Manaement, National Taiwan University 50, Lane 144, Keelun Rd., Sec.4,

More information

æ When a query is presented to the system, it is useful to ænd an eæcient method of ænding the answer,

æ When a query is presented to the system, it is useful to ænd an eæcient method of ænding the answer, CMPT-354-98.2 Lecture Notes July 26, 1998 Chapter 12 Query Processing 12.1 Query Interpretation 1. Why dowe need to optimize? æ A high-level relational query is generally non-procedural in nature. æ It

More information

Electrical Power System Harmonic Analysis Using Adaptive BSS Algorithm

Electrical Power System Harmonic Analysis Using Adaptive BSS Algorithm Sensors & ransducers 2013 by IFSA http://www.sensorsportal.com Electrical Power System Harmonic Analysis Usin Adaptive BSS Alorithm 1,* Chen Yu, 2 Liu Yuelian 1 Zhenzhou Institute of Aeronautical Industry

More information

Using LDAP Directory Caches. Olga Kapitskaia. AT&T Labs{Research. on sample queries from a directory enabled application

Using LDAP Directory Caches. Olga Kapitskaia. AT&T Labs{Research. on sample queries from a directory enabled application Usin LDAP Directory Caches Sophie Cluet INRIA Rocquencourt Sophie.Cluet@inria.fr Ola Kapitskaia AT&T Labs{Research ola@research.att.com Divesh Srivastava AT&T Labs{Research divesh@research.att.com 1 Introduction

More information

Power-aware RAM Mapping for FPGA Embedded Memory Blocks

Power-aware RAM Mapping for FPGA Embedded Memory Blocks Power-aware RAM Mapping for FPGA Embedded Memory Blocks Russell Tessier Department of Electrical and Computer Engineering University of Massachusetts Amherst, MA, USA tessier@ecs.umass.edu Vaughn Betz,

More information

in two important ways. First, because each processor processes lare disk-resident datasets, the volume of the communication durin the lobal reduction

in two important ways. First, because each processor processes lare disk-resident datasets, the volume of the communication durin the lobal reduction Compiler and Runtime Analysis for Ecient Communication in Data Intensive Applications Renato Ferreira Gaan Arawal y Joel Saltz Department of Computer Science University of Maryland, Collee Park MD 20742

More information
