Efficient Designs of Multiported Memory on FPGA
- Tracey Dickerson
Abstract:

The utilization of block RAMs (BRAMs) is a critical performance factor for multiported memory designs on field-programmable gate arrays (FPGAs). Not only does excessive demand for BRAMs keep other parts of a design from using them, but the complex routing between BRAMs and logic also limits the operating frequency. This paper first introduces a new perspective on, and a more efficient way of using, a conventional two-read, one-write (2R1W) memory as a 2R1W/4R memory. Using the 2R1W/4R memory as the building block, this paper introduces a hierarchical design of 4R1W memory that requires 25% fewer BRAMs than the previous approach of duplicating the 2R1W module. Memories with more read/write ports can be built by extending the proposed 2R1W/4R memory and the hierarchical 4R1W memory. Compared with previous XOR-based and live value table (LVT)-based approaches, the proposed designs reduce BRAM usage by up to 53% and 69%, respectively, for 4R2W memory designs with 8K depth. For complex multiported designs, the proposed BRAM-efficient approaches can achieve higher clock frequencies by alleviating the complex routing in an FPGA. For 4R3W memory with 8K depth, the proposed design saves 53% of BRAMs and raises the operating frequency by 20%. The logic size, area, and power consumption of the proposed architecture are analyzed using Xilinx tools.

Existing System:

To implement a multiported memory on an FPGA, two types of design techniques are required: increasing read ports and increasing write ports. Table I lists the techniques proposed by previous works for multiported memories on FPGAs. The replication approach enables multiple read ports by replicating the data on multiple BRAMs. This technique needs only simple control logic but requires excessive usage of BRAMs.
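The replication technique described above can be sketched as a small behavioral model. This is only an illustration under stated assumptions: the class name `ReplicatedMemory` and its methods are hypothetical, each Python list stands in for one BRAM with a single read port, and no timing is modeled.

```python
class ReplicatedMemory:
    """Behavioral sketch of an nR1W memory built by replication.

    Hypothetical model: one full-depth copy (one "BRAM") per read port.
    """

    def __init__(self, depth, read_ports):
        # Every read port gets its own complete copy of the memory space.
        self.banks = [[0] * depth for _ in range(read_ports)]

    def write(self, addr, value):
        # The single write port broadcasts to all copies to keep them coherent.
        for bank in self.banks:
            bank[addr] = value

    def read(self, port, addr):
        # Each read port reads only its private copy, so reads never conflict.
        return self.banks[port][addr]

    def bram_count(self):
        # Storage cost grows linearly with the number of read ports.
        return len(self.banks)


mem = ReplicatedMemory(depth=8, read_ports=4)
mem.write(3, 0xAB)
values = [mem.read(port, 3) for port in range(4)]
```

The control logic is trivial, but the storage cost is one full copy of the memory space per read port, which is the BRAM overhead the proposed designs try to avoid.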
LVT, which is implemented by synthesizing slices on the FPGA, enables multiple write ports by duplicating BRAMs and tracking which BRAM stores the latest value of an address. The other approach to increasing write ports is referred to as XOR-based. Unlike LVT, which uses a table to track the location of the latest value, the XOR-based design duplicates BRAMs and encodes the stored data with XOR operations. The target data can be retrieved by applying the XOR again. In general, the XOR-based approach can achieve a higher operating frequency but requires more BRAMs than the LVT approach.

Note that this paper focuses on architectural solutions that provide multiple accesses for a general memory, one that takes requests in the current cycle and returns results in the next cycle. Users of the multiported memory can be completely ignorant of the details of the memory design. Other works focus on enabling multiple accesses for specific types of storage elements, such as register files. They enable concurrent reads with an approach similar to replication, but avoid write conflicts by renaming the registers in software, for example in the compiler or assembler. These approaches, which tackle specific storage functions and require effort from users, are outside the scope of this paper.

The following sections discuss the implementations and design concerns of these techniques in more depth. To keep the discussion general, the following paragraphs use the term memory bank to refer to a standalone memory module used as a building block of a memory system. A memory system usually consists of multiple banks. The memory space, also referred to as the memory depth, is distributed across the banks. When designing a memory system on FPGAs, a single BRAM can be used to support the complete memory space. BRAMs can also be deployed as banks to provide a larger memory space or higher access bandwidth.

Disadvantages:

- High memory usage
- High slice utilization

Proposed System:

This section proposes efficient solutions for implementing multiported memories on FPGAs. Unlike the replication method, the approach proposed in this paper supports multiple reads with XOR operations, while multiple writes are enabled using additional BRAMs. A remap table is added to track the location of the correct data. The main memory architecture is similar to that of our previous work introduced in [9].
On top of the main architecture, this paper introduces a new perspective: using a 2R1W module as either a 2R1W or a 4R module, denoted as a 2R1W/4R memory. By exploiting the versatile usage modes of the 2R1W/4R memory, this paper proposes a hierarchical XOR-based design of 4R1W memory that requires fewer BRAMs than previous designs. Memories with more read/write ports can be supported by extending the proposed 2R1W/4R memory and the hierarchical 4R1W memory.

Techniques to Increase Read Ports:

1) Bank Division With XOR Design Scheme: Bank Division With XOR (BDX) is an approach proposed to increase read ports. Unlike replication-based methods, BDX avoids replicating the storage elements of the whole memory space. With BDX, multiple reads are supported by using XOR operations. Note that BDX is different from the XOR-based design. The XOR-based approach uses XOR operations to increase write ports: it stores encoded data to maintain data coherence between memory modules. BDX uses XOR operations to increase read ports: it retrieves the target data from the encoded value.

Figure 1: Example of a 2R1W memory implemented by the BDX technique. (a) Supporting multiple reads with XOR operations. (b) Supporting a write request in the two-cycle pipeline architecture.
Fig. 1 illustrates an example of a 2R1W memory implemented with the BDX scheme. As shown in Fig. 1(a), the memory space is distributed across four data banks (banks 0 to 3). One XOR-bank is added to keep the XOR values of the data banks.

2) 2R1W/4R (An Efficient Two-Mode Memory): To implement HBDX efficiently, this paper introduces a new perspective: using a 2R1W module as either a 2R1W or a 4R module. This new way of using the 2R1W module is denoted as 2R1W/4R. This hybrid module can support either two reads and one write (2R1W) or four reads (4R). Note that the 2R1W/4R module uses exactly the same design as the 2R1W module introduced in Fig. 1. Fig. 2 illustrates how the two modes work. Fig. 2(a) shows the 2R1W mode. When there is a write request W0, this design can support up to two conflicting reads. The write request W0 stores the data directly to the target data bank and reads all the data at the same offset from the other data banks (R_update) to update the XOR-bank. Fig. 2(b) shows the 4R mode.

Figure 2: Two modes of the 2R1W/4R module. The module is implemented with four data banks and one XOR-bank. (a) 2R1W mode. (b) 4R mode.
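The BDX read and write behavior described above can be illustrated with a small functional model. It is a sketch under several assumptions that the text does not spell out: the class `BDXMemory` and its method names are hypothetical; addresses are mapped to banks in contiguous blocks; every bank is assumed to offer two ports, one of which is taken by the write path in 2R1W mode; and the two-cycle pipeline is not modeled. A read that finds its bank's ports busy is reconstructed by XOR-ing the same offset of the other data banks with the XOR-bank.

```python
from functools import reduce
from operator import xor


class BDXMemory:
    """Functional sketch of BDX: four data banks plus one XOR-bank."""

    NUM_DATA_BANKS = 4

    def __init__(self, depth):
        if depth % self.NUM_DATA_BANKS:
            raise ValueError("depth must be a multiple of 4")
        self.bank_depth = depth // self.NUM_DATA_BANKS
        self.data = [[0] * self.bank_depth for _ in range(self.NUM_DATA_BANKS)]
        self.xor_bank = [0] * self.bank_depth

    def _locate(self, addr):
        # Assumed mapping: consecutive addresses fill one bank before the next.
        return divmod(addr, self.bank_depth)  # (bank, offset)

    def _others(self, bank, offset):
        # Values stored at the same offset in every other data bank.
        return [self.data[b][offset]
                for b in range(self.NUM_DATA_BANKS) if b != bank]

    def write(self, addr, value):
        bank, offset = self._locate(addr)
        self.data[bank][offset] = value
        # R_update: read the other data banks and refresh the XOR-bank entry.
        self.xor_bank[offset] = reduce(xor, self._others(bank, offset), value)

    def _serve_reads(self, addrs, ports_per_bank):
        used = [0] * (self.NUM_DATA_BANKS + 1)  # last slot is the XOR-bank
        results = []
        for addr in addrs:
            bank, offset = self._locate(addr)
            if used[bank] < ports_per_bank:
                used[bank] += 1  # direct read from the target bank
                results.append(self.data[bank][offset])
            else:
                # Conflict: rebuild the value from all the other banks.
                for b in range(self.NUM_DATA_BANKS + 1):
                    if b != bank:
                        used[b] += 1
                results.append(reduce(xor, self._others(bank, offset),
                                      self.xor_bank[offset]))
        assert max(used) <= ports_per_bank, "port budget exceeded"
        return results

    def read_2r(self, addrs):
        """2R1W mode: one port per bank is assumed left for reads."""
        if len(addrs) > 2:
            raise ValueError("2R1W mode serves at most two reads")
        return self._serve_reads(addrs, ports_per_bank=1)

    def read_4r(self, addrs):
        """4R mode: no write, so both assumed ports of each bank serve reads."""
        if len(addrs) > 4:
            raise ValueError("4R mode serves at most four reads")
        return self._serve_reads(addrs, ports_per_bank=2)


mem = BDXMemory(depth=16)
for a in range(16):
    mem.write(a, a * 7 + 1)
same_bank_2r = mem.read_2r([0, 1])        # both addresses fall in bank 0
same_bank_4r = mem.read_4r([4, 5, 6, 7])  # all four fall in bank 1
```

In this model the same five banks serve both modes; only the number of ports left over for reads changes, which is the idea behind treating one 2R1W module as a 2R1W/4R module.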
3) HBDX Designs With 2R1W/4R Module: Fig. 3 illustrates a design scheme that supports more read ports by replicating the 2R1W module. However, this design scheme can significantly increase the usage of the limited BRAMs on an FPGA. To achieve a more BRAM-efficient design, this paper proposes HBDX, which adopts a hierarchical structure that organizes 2R1W modules to achieve 4R1W without replicating the 2R1W module. To further improve the design, HBDX in this section uses the 2R1W/4R scheme introduced in the previous section as the basic building module of a 4R1W module.

Figure 3: Example of an mR1W memory implemented with multiple 2R1W modules.

Fig. 4 illustrates a 4R1W memory design that uses the HBDX scheme. In this 4R1W design, each basic building block is a 2R1W module of the BDX scheme introduced in Fig. 1.
Figure 4: HBDX 4R1W implemented with 2R1W/4R modules.

Techniques to Increase Write Ports:

Bank division with remap table (BDRT) is an approach proposed to increase write ports. Unlike the LVT design, BDRT avoids replicating the whole memory space and supports multiple writes using additional BRAMs and a remap table that tracks the location of the latest data. Fig. 5 shows an example of the design for a 1R2W memory. This example consists of two data banks (banks 0 and 1), one bank buffer, and a remap table. The null entries in a memory bank are the entries that do not store any valid data. When W0 and W1 are received, these requests first look up the remap table to identify the correct BRAM that stores the latest data. According to the remap table, W0 and W1 are going to address 0 and address 1 in bank 0, respectively.
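The remap-table idea can be sketched as a behavioral model of a 1R2W memory. This is one plausible reading of the description, not the paper's implementation: the class `BDRTMemory`, the contiguous address mapping, and the redirect policy (when both writes target the same physical bank, the second write is steered to whichever bank holds a null entry at that offset) are all assumptions made for illustration.

```python
class BDRTMemory:
    """Behavioral sketch of a 1R2W memory with a remap table.

    Physical banks 0 and 1 are data banks, bank 2 is the bank buffer.
    None marks a null entry (no valid data).
    """

    def __init__(self, bank_depth):
        self.bank_depth = bank_depth
        self.phys = [[0] * bank_depth,
                     [0] * bank_depth,
                     [None] * bank_depth]
        # remap[logical_bank][offset] -> physical bank with the latest value
        self.remap = [[0] * bank_depth, [1] * bank_depth]

    def _locate(self, addr):
        # Assumed mapping: logical bank 0 holds the lower half of the addresses.
        return divmod(addr, self.bank_depth)  # (logical bank, offset)

    def read(self, addr):
        lbank, offset = self._locate(addr)
        return self.phys[self.remap[lbank][offset]][offset]

    def _null_bank(self, offset):
        # Exactly one physical bank holds a null entry at each offset.
        in_use = {self.remap[0][offset], self.remap[1][offset]}
        return ({0, 1, 2} - in_use).pop()

    def write2(self, w0, w1):
        """Perform two writes, given as (addr, value) pairs, in one step."""
        (addr0, val0), (addr1, val1) = w0, w1
        if addr0 == addr1:
            raise ValueError("two writes to one address are not modeled")
        lb0, off0 = self._locate(addr0)
        lb1, off1 = self._locate(addr1)
        p0 = self.remap[lb0][off0]
        p1 = self.remap[lb1][off1]
        self.phys[p0][off0] = val0
        if p1 == p0:
            # Conflict on one physical bank: redirect W1 to the null entry
            # at its offset and record the new location in the remap table.
            target = self._null_bank(off1)
            self.phys[p1][off1] = None
            self.remap[lb1][off1] = target
            p1 = target
        self.phys[p1][off1] = val1


mem = BDRTMemory(bank_depth=4)
mem.write2((0, 11), (1, 22))  # both writes target bank 0, as in the example
after_conflict = (mem.read(0), mem.read(1), mem.remap[0][1])
```

After the conflicting pair, address 1 lives in the bank buffer and its old slot in bank 0 becomes a null entry, so the total storage stays at three banks instead of a full copy of the memory per write port.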
Figure 5: Example of a 1R2W memory implemented with the BDRT technique. (a) According to the remap table, both W0 and W1 are going to bank 0. The null entries in BRAMs are the entries that do not store any valid data. (b) Final state of the multiported memory after completing the two writes W0 and W1.

Advantages:

- Memory reduction
- Low slice utilization

Software implementation:

- ModelSim
- Xilinx ISE
More informationGraduate Institute of Electronics Engineering, NTU FPGA Design with Xilinx ISE
FPGA Design with Xilinx ISE Presenter: Shu-yen Lin Advisor: Prof. An-Yeu Wu 2005/6/6 ACCESS IC LAB Outline Concepts of Xilinx FPGA Xilinx FPGA Architecture Introduction to ISE Code Generator Constraints
More informationImplementation and Comparative Analysis of AES as a Stream Cipher
Implementation and Comparative Analysis of AES as a Stream Cipher Bin ZHOU, Yingning Peng Dept. of Electronic Engineering, Tsinghua University, Beijing, China, 100084 e-mail: zhoubin06@mails.tsinghua.edu.cn
More informationADPCM: Adaptive Differential Pulse Code Modulation
ADPCM: Adaptive Differential Pulse Code Modulation Motivation and introduction This is the final exercise. You have three weeks to complete this exercise, but you will need these three weeks! In this exercise,
More informationBasic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices
3 Digital Systems Implementation Programmable Logic Devices Basic FPGA Architectures Why Programmable Logic Devices (PLDs)? Low cost, low risk way of implementing digital circuits as application specific
More informationHybrid Threading: A New Approach for Performance and Productivity
Hybrid Threading: A New Approach for Performance and Productivity Glen Edwards, Convey Computer Corporation 1302 East Collins Blvd Richardson, TX 75081 (214) 666-6024 gedwards@conveycomputer.com Abstract:
More informationCHAPTER 1 INTRODUCTION
1 CHAPTER 1 INTRODUCTION 1.1 Advance Encryption Standard (AES) Rijndael algorithm is symmetric block cipher that can process data blocks of 128 bits, using cipher keys with lengths of 128, 192, and 256
More informationLab 6 : Introduction to Verilog
Lab 6 : Introduction to Verilog Name: Sign the following statement: On my honor, as an Aggie, I have neither given nor received unauthorized aid on this academic work 1 Objective The main objective of
More informationParallel Reconfigurable Hardware Architectures for Video Processing Applications
Parallel Reconfigurable Hardware Architectures for Video Processing Applications PhD Dissertation Presented by Karim Ali Supervised by Prof. Jean-Luc Dekeyser Dr. HdR. Rabie Ben Atitallah Parallel Reconfigurable
More informationContent Addressable Memory with Efficient Power Consumption and Throughput
International journal of Emerging Trends in Science and Technology Content Addressable Memory with Efficient Power Consumption and Throughput Authors Karthik.M 1, R.R.Jegan 2, Dr.G.K.D.Prasanna Venkatesan
More informationAgenda. Introduction FPGA DSP platforms Design challenges New programming models for FPGAs
New Directions in Programming FPGAs for DSP Dr. Jim Hwang Xilinx, Inc. Agenda Introduction FPGA DSP platforms Design challenges New programming models for FPGAs System Generator Getting your math into
More informationDesign and Implementation of High Performance DDR3 SDRAM controller
Design and Implementation of High Performance DDR3 SDRAM controller Mrs. Komala M 1 Suvarna D 2 Dr K. R. Nataraj 3 Research Scholar PG Student(M.Tech) HOD, Dept. of ECE Jain University, Bangalore SJBIT,Bangalore
More informationUnique Journal of Engineering and Advanced Sciences Available online: Research Article
ISSN 2348-375X Unique Journal of Engineering and Advanced Sciences Available online: www.ujconline.net Research Article A POWER EFFICIENT CAM DESIGN USING MODIFIED PARITY BIT MATCHING TECHNIQUE Karthik
More information