Optimization of Task Scheduling and Memory Partitioning for Multiprocessor System on Chip
- Mervyn Horn
- 5 years ago
Mythili.R (1), Mugilan.D (2)
(1) PG Student, Department of Electronics and Communication, K S Rangasamy College of Technology, TN, India
(2) Assistant Professor, Department of Electronics and Communication, K S Rangasamy College of Technology, TN, India
ISSN: Page 1058

Abstract - Multiprocessor system-on-chip (MPSoC) is an attractive solution for the increasing complexity and size of embedded applications. An MPSoC is an integrated circuit containing multiple instruction-set processors on a single chip that implements most of the functionality of a complex electronic system. As embedded systems become increasingly complex, memory access speed has failed to keep up with processor speed. This makes memory access latency a major issue in scheduling embedded applications on embedded systems. Scheduling the tasks of an embedded application on the processors and partitioning the available scratchpad memory (SPM) budget among those processors are two critical issues in complex embedded systems. This research focuses on task scheduling and SPM partitioning to reduce the execution time of embedded applications. Partitioning the SPM equally among the processors reduces the computation time. To reduce the computation time further, the available SPM can be divided among the processors in any ratio. Pipelined scheduling allows tasks of different embedded application instances to be scheduled at each stage of the pipeline.

Keywords - Memory partitioning, multiprocessor system-on-chip, scratchpad memory, task scheduling.

I. INTRODUCTION

An MPSoC consists of multiple heterogeneous processing elements, an SPM memory hierarchy, and input/output components, linked together by an on-chip interconnect structure. MPSoC models use a memory hierarchy with slow off-chip memory and fast on-chip scratchpad memories. A larger SPM results in less computation time, since an off-chip access costs more clock cycles than an access to the fast on-chip SPM.

SPM has been employed as a partial or entire replacement for the processor's cache memory because of its better energy efficiency. An SPM consists only of decoding circuits, data arrays, and output units. Unlike a cache, it requires no tag comparison. Due to this simplified architecture, SPM is more energy and area efficient than cache. The computation time of a program on a processor depends on how much SPM is allocated to that processor.

Execution time predictability is a critical issue for real-time embedded applications. Data caches are therefore not suitable, since it is hard to model their exact behaviour and to predict the execution time of programs. To alleviate such problems, many modern MPSoC systems use scratchpad memories, which contribute to better timing predictability.

Cellular phones, portable media players, and gaming consoles are examples of complex embedded applications consisting of multiple concurrent real-time tasks. Usually the tasks are scheduled first and the SPM budget is then partitioned among the processors. Such a decoupled technique may miss better schedules in terms of minimizing the computation time of the whole application. Integrating the two steps improves performance.

II. METHODOLOGY

The embedded application is given to an MPSoC that consists of multiple processors. The application is divided into a number of tasks. These tasks are scheduled, and the memory is partitioned among the processors. Finally, the execution time is determined.
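The introduction's point that a larger SPM lowers computation time can be illustrated with a simple cycle count. The sketch below is only an illustration under stated assumptions: the function name `memory_cycles`, the access count, and both latency values are hypothetical and are not taken from the paper.

```python
def memory_cycles(accesses, spm_fraction, spm_latency=1, offchip_latency=100):
    """Estimate memory cycles when a fraction of the accesses is served by the SPM.

    The latencies are hypothetical placeholders, not measured values.
    """
    if not 0.0 <= spm_fraction <= 1.0:
        raise ValueError("spm_fraction must be between 0 and 1")
    spm_accesses = accesses * spm_fraction
    offchip_accesses = accesses - spm_accesses
    return spm_accesses * spm_latency + offchip_accesses * offchip_latency


# The more accesses the SPM can serve, the fewer cycles are spent on memory.
for fraction in (0.0, 0.5, 1.0):
    print(fraction, memory_cycles(10_000, fraction))
```

Under this model the cycle count falls linearly as the SPM serves a larger share of the accesses, which is why the amount of SPM given to a processor affects the computation time of the tasks mapped to it.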
Fig.1. Block Diagram of the Project. (Blocks: Embedded Application; MPSoC; Tasks; TDG; Task Scheduling & Memory Partitioning; Assigning the scheduled tasks with allocated memory to each processor; Execution Prediction.)

III. TASKS & TDG

Embedded applications usually consist of computation blocks, which are treated as tasks. An application program is divided into tasks, which are the various processes in the application. Whenever a program is executed, the operating system creates a new task for it. The task is like an envelope for the program. The state of a task is represented by task states such as idle, running, ready, and blocked. There are usually dependences between tasks that must be respected in the schedule.

The problem formulation is based on a task dependence graph (TDG). A TDG is a directed acyclic graph with weighted edges, where each vertex represents a task in the embedded application. An edge from task Ti to task Tj represents a scheduling order that must be enforced because Tj needs data from Ti after Ti has executed. The weight of the edge is the communication cost. Whenever there is an edge between two tasks Ti and Tj, this communication cost must be accounted for, provided that the two tasks are allocated to different processors. A processor cannot start executing a task until all the necessary data communication has been performed.

Each task can be mapped to any of the available processors. Since the processors in this architectural model can be heterogeneous, the execution time of each task depends on the processor to which the task is mapped, as well as on the SPM memory allocated to that processor.
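The TDG definition above can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the class name `TDG`, its methods, the task and processor labels, and all numeric costs are hypothetical, and processor availability is ignored so that only the communication rule is shown.

```python
from dataclasses import dataclass, field


@dataclass
class TDG:
    """Task dependence graph: vertices are tasks, edge weights are communication costs."""
    edges: dict = field(default_factory=dict)  # (src, dst) -> communication cost

    def add_edge(self, src, dst, cost):
        self.edges[(src, dst)] = cost

    def predecessors(self, task):
        return [src for (src, dst) in self.edges if dst == task]

    def earliest_start(self, task, mapping, finish):
        """Earliest time `task` may start, given where and when its predecessors ran."""
        ready = 0
        for pred in self.predecessors(task):
            # The edge weight is charged only when the two tasks sit on different processors.
            comm = self.edges[(pred, task)] if mapping[pred] != mapping[task] else 0
            ready = max(ready, finish[pred] + comm)
        return ready


# Hypothetical graph: Tj needs data from Ti (cost 3) and Tk (cost 2).
g = TDG()
g.add_edge("Ti", "Tj", 3)
g.add_edge("Tk", "Tj", 2)
finish = {"Ti": 10, "Tk": 12}

same = g.earliest_start("Tj", {"Ti": "P1", "Tk": "P1", "Tj": "P1"}, finish)
split = g.earliest_start("Tj", {"Ti": "P1", "Tk": "P2", "Tj": "P1"}, finish)
print(same, split)
```

With all three tasks on one processor no communication cost is paid, while moving Tk to another processor delays Tj by the weight of the Tk to Tj edge.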
Accessing a data variable from an SPM is usually on the order of 100 times faster than accessing it from off-chip memory.

Consider the example task graph of Fig.2 with six tasks, T1, T2, T3, T4, T5, and T6. Task T4 depends on tasks T1, T2, and T3, and task T6 depends on tasks T4 and T5.

Fig.2. An example TDG.

IV. TASK SCHEDULING & MEMORY PARTITIONING

Four approaches can be implemented to solve the task scheduling and memory allocation problem on MPSoC systems: (1) TSMP EQUAL, decoupled task scheduling and memory partitioning assuming that the SPM is partitioned equally among all available processors; (2) TSMP ANY, decoupled task scheduling and memory partitioning with the SPM partitioned among the processors in any ratio; (3) TSMP INTEG, an integrated task scheduling and memory partitioning heuristic; and (4) TSMP PIPE, the integrated heuristic with pipelining.

Unlike current approaches that treat task scheduling and memory partitioning as two separate problems, these two problems can be solved in an integrated fashion. An effective heuristic was developed for the task scheduling/memory partitioning problem on a multiprocessor system-on-chip where a single application uses the MPSoC at a time. The two steps are performed together: the private on-chip memory budget allocated to a processor is decided as tasks are mapped to that processor. The computation time of a task depends on the processor to which it is mapped, as well as on the SPM memory available for that task. Therefore, task scheduling should take into consideration the varying computation time of a task based on the processor and the SPM budget.

An embedded application is usually executed many times for a stream of input data on an MPSoC. Such multiple executions make embedded applications amenable to pipelined implementation. Pipeline scheduling benefits from allowing tasks of different embedded application instances to be scheduled at each stage of the pipeline. The objective is to decrease the pipeline stage time interval, since, once the pipeline is full, one instance of the application completes every pipeline stage interval. The maximum number of stages is equal to the number of processors in the MPSoC system.

A. Decoupled TSMP using Cache Memory

First, the schedule is produced assuming that no scratchpad memory is available. Tasks T1, T2, T3, and T5 are ready to be scheduled in the example. Task T5 is not scheduled at this point because of its ALAP (as-late-as-possible) value. Thus, tasks T1 and T2 are mapped first to the two available processors, P1 and P2. The scheduling algorithm then maps T3 to P2, which becomes free before P1 because the computation time of T2 is less than that of T1. In a similar fashion, the scheduling algorithm assigns tasks T4 and T6 to processor P1, whereas task T5 is mapped to processor P2.

Fig.3. Schedule Based on no SPM

B. Decoupled TSMP on Equal Partitioned SPM

Fig.4. Schedule on Equal Partitioned SPM

This schedule results from partitioning the available SPM memory equally between the two processors. With such a criterion, the available SPM budget is divided equally between processors P1 and P2, regardless of which tasks are mapped to which processor. The idle time can be reduced, and equally partitioned SPM reduces the computation time of the whole application.

C. Decoupled TSMP on Non equal Partitioned SPM

Fig.5. Schedule Based on Non equal Partitioned SPM

To further reduce the application's computation time, the available SPM can be divided between the two processors in any ratio. From the task schedule, we can see that task T4 can only start after P2 has finished executing task T3. The issue now is to reduce the dead time between tasks T1 and T4 imposed by the computation time of tasks T2 and T3. To minimize this dead time, techniques usually allocate more of the SPM budget to processor P2, which reduces the computation time of tasks T2 and T3.

D. Integrated TSMP

The problem with the previous schedule is that it allocated T3 to the same processor that is scheduled to execute T2. This choice is the reason for the dead time in the schedule, as T2 cannot benefit much from more SPM memory, which is clear from the Min, Avg, and Max values. A good heuristic should take these values into consideration: a better choice is to schedule T3 on P1 with all available SPM memory allocated to this processor, and the result is a schedule with the minimal end time.
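The difference between the decoupled and integrated approaches can be illustrated with a toy model. The sketch below is not the heuristic described in this paper: it replaces the integrated heuristic with an exhaustive search over task mappings and SPM splits, assumes two identical processors, and uses a hypothetical linear model in which each task's time falls from a maximum (no SPM) to a minimum (whole SPM budget). All names, times, and the communication cost are hypothetical; only the graph shape and the decoupled mapping follow the example in the text.

```python
from itertools import product

# Graph shaped like the example: T4 needs T1, T2, T3; T6 needs T4, T5.
PREDS = {"T1": [], "T2": [], "T3": [], "T4": ["T1", "T2", "T3"], "T5": [], "T6": ["T4", "T5"]}
ORDER = ["T1", "T2", "T3", "T5", "T4", "T6"]  # a fixed topological order
# Hypothetical (time with no SPM, time with the whole SPM budget) per task.
TIMES = {"T1": (10, 8), "T2": (6, 5), "T3": (8, 3), "T4": (6, 4), "T5": (5, 4), "T6": (4, 3)}
COMM = 1  # hypothetical uniform communication cost between processors


def exec_time(task, share):
    """Linear interpolation between the no-SPM time and the full-SPM time."""
    t_max, t_min = TIMES[task]
    return t_max - (t_max - t_min) * share


def makespan(mapping, shares):
    """End time of the schedule for a task-to-processor mapping and an SPM split."""
    free = [0.0, 0.0]   # time at which each processor becomes free
    finish = {}
    for task in ORDER:
        p = mapping[task]
        ready = max((finish[q] + (COMM if mapping[q] != p else 0) for q in PREDS[task]),
                    default=0.0)
        start = max(free[p], ready)
        finish[task] = start + exec_time(task, shares[p])
        free[p] = finish[task]
    return max(finish.values())


# Mapping of the decoupled schedule: T1, T4, T6 on one processor, T2, T3, T5 on the other.
DECOUPLED = {"T1": 0, "T4": 0, "T6": 0, "T2": 1, "T3": 1, "T5": 1}
SPLITS = [(k / 10, 1 - k / 10) for k in range(11)]  # candidate SPM ratios

no_spm = makespan(DECOUPLED, (0.0, 0.0))
equal = makespan(DECOUPLED, (0.5, 0.5))
any_ratio = min(makespan(DECOUPLED, s) for s in SPLITS)
integrated = min(makespan(dict(zip(ORDER, m)), s)
                 for m in product((0, 1), repeat=len(ORDER)) for s in SPLITS)
print(no_spm, equal, any_ratio, integrated)
```

Because the mapping and the per-processor task order are fixed in the three decoupled cases, shortening task times can never lengthen the schedule, so each step can only match or improve on the previous one. The joint search additionally lets the mapping change with the SPM split, which is the freedom the integrated approach exploits.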
Fig.6. Schedule Based on Integrated Approach

E. Integrated TSMP with Pipelining

Pipeline scheduling allows tasks of different embedded application instances to be scheduled at each stage of the pipeline. Such a schedule does not necessarily decrease the computation time of one instance of the embedded application; rather, it decreases the time between the start times of two consecutive iterations of the task graph. Here, pipelining is implemented by storing the result of the previous task in memory while the current task is executing. This further reduces the computation time.

V. RESULTS AND DISCUSSION

The task dependency graph shown in Fig.2 is considered, and the implementation was done using the ModelSim software. The tasks are interpolation, Sum of Absolute Differences (SAD), Multiply and Accumulate (MAC), addition, subtraction, and multiplication, taken from the MPEG-4 encoder block.

A. Simulation Result for Decoupled TSMP using Cache Memory

The execution time obtained for the decoupled TSMP approach using cache memory is 800 ns.

B. Simulation Result for Decoupled TSMP on Equal Partitioned SPM

The decoupled TSMP approach using equally partitioned SPM needs 700 ns for execution.

C. Simulation Result for Decoupled TSMP on Non equal Partitioned SPM

The execution time obtained for the decoupled TSMP approach using non-equally partitioned SPM is 600 ns.

D. Simulation Result for Integrated TSMP

The execution time obtained for the integrated TSMP approach using SPM is 500 ns.
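The four execution times reported above can be put side by side with a short calculation. The figures are the ones stated in the text; the variable names and the choice of the cache-based schedule as the baseline are assumptions made for this illustration.

```python
# Execution times in nanoseconds, as reported in the text above.
times_ns = {
    "Decoupled TSMP using cache memory": 800,
    "Decoupled TSMP on equal partitioned SPM": 700,
    "Decoupled TSMP on non equal partitioned SPM": 600,
    "Integrated TSMP": 500,
}

baseline = times_ns["Decoupled TSMP using cache memory"]
# Percentage reduction in execution time relative to the cache-based schedule.
reduction = {name: 100 * (baseline - t) / baseline for name, t in times_ns.items()}

for name, pct in reduction.items():
    print(f"{name}: {times_ns[name]} ns, {pct:.1f}% below the cache-based schedule")
```

Relative to the 800 ns cache-based schedule, the reductions are 12.5% for equal partitioning, 25% for non-equal partitioning, and 37.5% for the integrated approach.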
E. Simulation Result for Integrated TSMP with Pipelining

The integrated TSMP approach with pipelining needs 500 ns to execute the given tasks.

F. Comparison Result

The results obtained for the various approaches are compared in Fig.7. The comparison is done with a 37k memory allocation for the five approaches.

Fig.7. Comparison between Various Approaches (frequency for approaches T0 to T4)
T0 -- Decoupled TSMP using Cache Memory
T1 -- Decoupled TSMP on Equal Partitioned SPM
T2 -- Decoupled TSMP on Non equal Partitioned SPM
T3 -- Integrated TSMP
T4 -- Integrated TSMP with Pipelining

The frequency values increase from one approach to the next, and hence the execution time is reduced in the implemented concept.

VI. CONCLUSION

An effective heuristic was presented that integrates task scheduling and memory partitioning of embedded applications on multiprocessor systems-on-chip with scratchpad memory. Compared to the widely used decoupled approach, the integrated approach significantly improved the results, since the appropriate partitioning of SPM space among the processors depends on the tasks scheduled on each of those processors, and vice versa. The execution time of the tasks scheduled on the processors was reduced step by step using equally partitioned SPM, non-equally partitioned SPM, the integrated approach, and the integrated approach with pipelining. Simulation results were obtained using ModelSim software and the frequency values were obtained using Xilinx software.

AUTHORS PROFILE

Mythili.R received her B.E degree from Anna University, Coimbatore, India. She is currently pursuing her M.E degree at Anna University, Chennai, India. Her research areas include optimization of MPSoC and low power VLSI circuits.
Mugilan.D received his B.E degree from Erode Sengunthar Engineering College, Erode, India, in 2007, and his M.E degree from Kongu Engineering College, Erode, India. He worked as an Assistant Professor at Maharaja Engineering College, Avinashi, India. Since 2010 he has been working as an Assistant Professor at K.S.Rangasamy College of Technology, Tamilnadu, India. His research is in the area of embedded systems and digital image processing. He is a life member of ISTE.
More informationISSN (Online), Volume 1, Special Issue 2(ICITET 15), March 2015 International Journal of Innovative Trends and Emerging Technologies
VLSI IMPLEMENTATION OF HIGH PERFORMANCE DISTRIBUTED ARITHMETIC (DA) BASED ADAPTIVE FILTER WITH FAST CONVERGENCE FACTOR G. PARTHIBAN 1, P.SATHIYA 2 PG Student, VLSI Design, Department of ECE, Surya Group
More informationCo-operative Scheduled Energy Aware Load-Balancing technique for an Efficient Computational Cloud
571 Co-operative Scheduled Energy Aware Load-Balancing technique for an Efficient Computational Cloud T.R.V. Anandharajan 1, Dr. M.A. Bhagyaveni 2 1 Research Scholar, Department of Electronics and Communication,
More informationKeywords: Fast Fourier Transforms (FFT), Multipath Delay Commutator (MDC), Pipelined Architecture, Radix-2 k, VLSI.
ww.semargroup.org www.ijvdcs.org ISSN 2322-0929 Vol.02, Issue.05, August-2014, Pages:0294-0298 Radix-2 k Feed Forward FFT Architectures K.KIRAN KUMAR 1, M.MADHU BABU 2 1 PG Scholar, Dept of VLSI & ES,
More informationINTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume VII /Issue 2 / OCT 2016
NEW VLSI ARCHITECTURE FOR EXPLOITING CARRY- SAVE ARITHMETIC USING VERILOG HDL B.Anusha 1 Ch.Ramesh 2 shivajeehul@gmail.com 1 chintala12271@rediffmail.com 2 1 PG Scholar, Dept of ECE, Ganapathy Engineering
More informationISSN Vol.05, Issue.12, December-2017, Pages:
ISSN 2322-0929 Vol.05, Issue.12, December-2017, Pages:1174-1178 www.ijvdcs.org Design of High Speed DDR3 SDRAM Controller NETHAGANI KAMALAKAR 1, G. RAMESH 2 1 PG Scholar, Khammam Institute of Technology
More informationDESIGN OF PARAMETER EXTRACTOR IN LOW POWER PRECOMPUTATION BASED CONTENT ADDRESSABLE MEMORY
DESIGN OF PARAMETER EXTRACTOR IN LOW POWER PRECOMPUTATION BASED CONTENT ADDRESSABLE MEMORY Saroja pasumarti, Asst.professor, Department Of Electronics and Communication Engineering, Chaitanya Engineering
More informationA NOVEL APPROACH FOR A HIGH PERFORMANCE LOSSLESS CACHE COMPRESSION ALGORITHM
A NOVEL APPROACH FOR A HIGH PERFORMANCE LOSSLESS CACHE COMPRESSION ALGORITHM K. Janaki 1, K. Indhumathi 2, P. Vijayakumar 3 and K. Ashok Kumar 4 1 Department of Electronics and Communication Engineering,
More informationFast FPGA Routing Approach Using Stochestic Architecture
. Fast FPGA Routing Approach Using Stochestic Architecture MITESH GURJAR 1, NAYAN PATEL 2 1 M.E. Student, VLSI and Embedded System Design, GTU PG School, Ahmedabad, Gujarat, India. 2 Professor, Sabar Institute
More informationIMPLEMENTATION OF DISTRIBUTED CANNY EDGE DETECTOR ON FPGA
IMPLEMENTATION OF DISTRIBUTED CANNY EDGE DETECTOR ON FPGA T. Rupalatha 1, Mr.C.Leelamohan 2, Mrs.M.Sreelakshmi 3 P.G. Student, Department of ECE, C R Engineering College, Tirupati, India 1 Associate Professor,
More informationReal Time NoC Based Pipelined Architectonics With Efficient TDM Schema
Real Time NoC Based Pipelined Architectonics With Efficient TDM Schema [1] Laila A, [2] Ajeesh R V [1] PG Student [VLSI & ES] [2] Assistant professor, Department of ECE, TKM Institute of Technology, Kollam
More informationEnhanced Hexagon with Early Termination Algorithm for Motion estimation
Volume No - 5, Issue No - 1, January, 2017 Enhanced Hexagon with Early Termination Algorithm for Motion estimation Neethu Susan Idiculay Assistant Professor, Department of Applied Electronics & Instrumentation,
More informationContent Addressable Memory with Efficient Power Consumption and Throughput
International journal of Emerging Trends in Science and Technology Content Addressable Memory with Efficient Power Consumption and Throughput Authors Karthik.M 1, R.R.Jegan 2, Dr.G.K.D.Prasanna Venkatesan
More informationAnalysis of Radix- SDF Pipeline FFT Architecture in VLSI Using Chip Scope
Analysis of Radix- SDF Pipeline FFT Architecture in VLSI Using Chip Scope G. Mohana Durga 1, D.V.R. Mohan 2 1 M.Tech Student, 2 Professor, Department of ECE, SRKR Engineering College, Bhimavaram, Andhra
More informationFused Floating Point Arithmetic Unit for Radix 2 FFT Implementation
IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 6, Issue 2, Ver. I (Mar. -Apr. 2016), PP 58-65 e-issn: 2319 4200, p-issn No. : 2319 4197 www.iosrjournals.org Fused Floating Point Arithmetic
More informationDYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech)
DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) K.Prasad Babu 2 M.tech (Ph.d) hanumanthurao19@gmail.com 1 kprasadbabuece433@gmail.com 2 1 PG scholar, VLSI, St.JOHNS
More informationMemory Systems IRAM. Principle of IRAM
Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several
More informationINTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017
Design of Low Power Adder in ALU Using Flexible Charge Recycling Dynamic Circuit Pallavi Mamidala 1 K. Anil kumar 2 mamidalapallavi@gmail.com 1 anilkumar10436@gmail.com 2 1 Assistant Professor, Dept of
More informationAn Adaptive and Optimal Distributed Clustering for Wireless Sensor
An Adaptive and Optimal Distributed Clustering for Wireless Sensor M. Senthil Kumaran, R. Haripriya 2, R.Nithya 3, Vijitha ananthi 4 Asst. Professor, Faculty of CSE, SCSVMV University, Kanchipuram. 2,
More informationVLSI Implementation of Daubechies Wavelet Filter for Image Compression
IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 7, Issue 6, Ver. I (Nov.-Dec. 2017), PP 13-17 e-issn: 2319 4200, p-issn No. : 2319 4197 www.iosrjournals.org VLSI Implementation of Daubechies
More informationUnique Journal of Engineering and Advanced Sciences Available online: Research Article
ISSN 2348-375X Unique Journal of Engineering and Advanced Sciences Available online: www.ujconline.net Research Article A POWER EFFICIENT CAM DESIGN USING MODIFIED PARITY BIT MATCHING TECHNIQUE Karthik
More informationA Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter
A Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter A.S. Sneka Priyaa PG Scholar Government College of Technology Coimbatore ABSTRACT The Least Mean Square Adaptive Filter is frequently
More informationVLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila Khan 1 Uma Sharma 2
IJSRD - International Journal for Scientific Research & Development Vol. 3, Issue 05, 2015 ISSN (online): 2321-0613 VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila
More informationDETECTION AND CORRECTION OF CELL UPSETS USING MODIFIED DECIMAL MATRIX
DETECTION AND CORRECTION OF CELL UPSETS USING MODIFIED DECIMAL MATRIX ENDREDDY PRAVEENA 1 M.T.ech Scholar ( VLSID), Universal College Of Engineering & Technology, Guntur, A.P M. VENKATA SREERAJ 2 Associate
More informationA VLSI Architecture for H.264/AVC Variable Block Size Motion Estimation
Journal of Automation and Control Engineering Vol. 3, No. 1, February 20 A VLSI Architecture for H.264/AVC Variable Block Size Motion Estimation Dam. Minh Tung and Tran. Le Thang Dong Center of Electrical
More informationDesign of a High Speed CAVLC Encoder and Decoder with Parallel Data Path
Design of a High Speed CAVLC Encoder and Decoder with Parallel Data Path G Abhilash M.Tech Student, CVSR College of Engineering, Department of Electronics and Communication Engineering, Hyderabad, Andhra
More informationDesign of a Multiplier Architecture Based on LUT and VHBCSE Algorithm For FIR Filter
African Journal of Basic & Applied Sciences 9 (1): 53-58, 2017 ISSN 2079-2034 IDOSI Publications, 2017 DOI: 10.5829/idosi.ajbas.2017.53.58 Design of a Multiplier Architecture Based on LUT and VHBCSE Algorithm
More informationProcessor-Directed Cache Coherence Mechanism A Performance Study
Processor-Directed Cache Coherence Mechanism A Performance Study H. Sarojadevi, dept. of CSE Nitte Meenakshi Institute of Technology (NMIT) Bangalore, India hsarojadevi@gmail.com S. K. Nandy CAD Lab, SERC
More informationDEVELOPMENT AND VERIFICATION OF AHB2APB BRIDGE PROTOCOL USING UVM TECHNIQUE
DEVELOPMENT AND VERIFICATION OF AHB2APB BRIDGE PROTOCOL USING UVM TECHNIQUE N.G.N.PRASAD Assistant Professor K.I.E.T College, Korangi Abstract: The AMBA AHB is for high-performance, high clock frequency
More informationSum to Modified Booth Recoding Techniques For Efficient Design of the Fused Add-Multiply Operator
Sum to Modified Booth Recoding Techniques For Efficient Design of the Fused Add-Multiply Operator D.S. Vanaja 1, S. Sandeep 2 1 M. Tech scholar in VLSI System Design, Department of ECE, Sri VenkatesaPerumal
More informationDetecting and Correcting the Multiple Errors in Video Coding System
International Journal of Research Studies in Science, Engineering and Technology Volume 2, Issue 8, August 2015, PP 99-106 ISSN 2349-4751 (Print) & ISSN 2349-476X (Online) Detecting and Correcting the
More informationINTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume VI /Issue 3 / JUNE 2016
VLSI DESIGN OF HIGH THROUGHPUT FINITE FIELD MULTIPLIER USING REDUNDANT BASIS TECHNIQUE YANATI.BHARGAVI, A.ANASUYAMMA Department of Electronics and communication Engineering Audisankara College of Engineering
More informationVLSI DESIGN OF REDUCED INSTRUCTION SET COMPUTER PROCESSOR CORE USING VHDL
International Journal of Electronics, Communication & Instrumentation Engineering Research and Development (IJECIERD) ISSN 2249-684X Vol.2, Issue 3 (Spl.) Sep 2012 42-47 TJPRC Pvt. Ltd., VLSI DESIGN OF
More informationDetecting and Correcting the Multiple Errors in Video Coding System
International Journal of Emerging Engineering Research and Technology Volume 3, Issue 7, July 2015, PP 92-98 ISSN 2349-4395 (Print) & ISSN 2349-4409 (Online) Detecting and Correcting the Multiple Errors
More informationMapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y.
Mapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y. Published in: Proceedings of the 2010 International Conference on Field-programmable
More informationDesign and Implementation of Signed, Rounded and Truncated Multipliers using Modified Booth Algorithm for Dsp Systems.
Design and Implementation of Signed, Rounded and Truncated Multipliers using Modified Booth Algorithm for Dsp Systems. K. Ram Prakash 1, A.V.Sanju 2 1 Professor, 2 PG scholar, Department of Electronics
More informationMemory Systems and Compiler Support for MPSoC Architectures. Mahmut Kandemir and Nikil Dutt. Cap. 9
Memory Systems and Compiler Support for MPSoC Architectures Mahmut Kandemir and Nikil Dutt Cap. 9 Fernando Moraes 28/maio/2013 1 MPSoC - Vantagens MPSoC architecture has several advantages over a conventional
More informationComparison of Online Record Linkage Techniques
International Research Journal of Engineering and Technology (IRJET) e-issn: 2395-0056 Volume: 02 Issue: 09 Dec-2015 p-issn: 2395-0072 www.irjet.net Comparison of Online Record Linkage Techniques Ms. SRUTHI.
More informationParallel Computing. Slides credit: M. Quinn book (chapter 3 slides), A Grama book (chapter 3 slides)
Parallel Computing 2012 Slides credit: M. Quinn book (chapter 3 slides), A Grama book (chapter 3 slides) Parallel Algorithm Design Outline Computational Model Design Methodology Partitioning Communication
More informationDesign of 2-D DWT VLSI Architecture for Image Processing
Design of 2-D DWT VLSI Architecture for Image Processing Betsy Jose 1 1 ME VLSI Design student Sri Ramakrishna Engineering College, Coimbatore B. Sathish Kumar 2 2 Assistant Professor, ECE Sri Ramakrishna
More information