A hardware/software partitioning and scheduling approach for embedded systems with low-power and high performance requirements
Javier Resano, Daniel Mozos, Elena Pérez, Hortensia Mecha, Julio Septién
Dept. de Arquitectura de Computadores, Facultad de Informática, UCM, Madrid
{javier1, mozos, eperez, horten,

Abstract. Hardware/software (hw/sw) partitioning strongly affects the system cost, performance, and power consumption. Most previous hw/sw partitioning approaches focus on optimising either the hw area or the performance, and thus ignore the influence of the partitioning process on the energy consumption. However, the designer still has maximum flexibility during this process, so it is clearly the best moment to analyse the energy consumption. We have developed a new hw/sw partitioning and scheduling tool that reduces the energy consumption of an embedded system while meeting high-performance constraints. We have applied it to two current multimedia applications, saving up to 30% of the system energy without reducing the performance.

1. Introduction

Low power has become one of the major design concerns. First of all, the designer must guarantee that the design does not exceed the power constraints of the target platform, since otherwise it will cause heating problems. Moreover, due to the proliferation of portable, battery-dependent devices, low energy consumption has become one of the key features for the success of a design. The current trend for portable embedded systems is to create heterogeneous systems, with one or more low-power processors, some additional hardware (hw) logic (ASICs and/or FPGAs), and some memory hierarchy. Current technologies allow the whole system to be built on a single chip (SoC). One of the most important steps in implementing an application on such a system is to partition the application functionality among the different processing elements.
This process drastically influences both the energy consumption and the performance of the system. Figure 1 presents a simple example where the partitioning process can lead to energy savings. If the designer selects the fastest solution (sch1), the execution time is 139 time-units and the energy is 21 energy-units. However, if the deadline for the application is 150 time-units, the designer can look for a slower solution that still meets this constraint while consuming less energy. In this case sch2 would be selected, since its execution time is less than the deadline and its energy consumption is 16 energy-units. Thus, the energy consumption decreases from 21 to 16 energy-units, i.e. by about 24%.

Fig. 1. Partitioning example. Two nodes must be partitioned between two Processing Elements (PE). T means time. E means energy. Sch1 and Sch2 are two selected solutions.

Since our partitioning tool is still under construction, it currently supports just a software (sw) processor, an FPGA, a system bus and one or several memory blocks. However, partitioning an application onto such a system is still an NP-complete problem. Moreover, there are several existing prototype platforms as well as commercial platforms that follow this scheme, providing a sw processor and some reconfigurable hw resources, e.g. Garp [1], Morphosys [2] and the Virtex II-Pro XC2VP4 and VP7 [3].

The system bus and the memory blocks require a careful study, since both elements can significantly affect the system performance and energy consumption, especially because both hw and sw performance are improving much faster than that of communication channels and memories. In order to accurately estimate the impact of the memories and buses on the system performance and energy consumption, their physical features must be taken into account. Ideally the vendor should provide either estimators or at least time and power models, but unfortunately this is not always the case. When they are not available, time and power models must be obtained elsewhere; some useful existing models are [4] for USB and PCI buses (timing considerations only) and [5, 6] for memories. However, even after accurately estimating all the tasks, communications and memory accesses, computing the overall execution time is not trivial, since it involves a scheduling that must take into account data and control dependencies as well as the accesses to the shared resources.
Thus, we have developed a tool that schedules the tasks and the accesses to the system bus and the shared memories during the partitioning process. This scheduling is the only way to accurately evaluate a solution, since otherwise it is impossible to determine the impact of the communications or the delays introduced by conflicts in the accesses to shared resources (this problem is explained in detail in [7]). In addition, this scheduling removes the need for arbitration logic in the bus controllers. Since the scheduler is integrated in a partitioning tool that must evaluate a large number of different partitions, one of our major concerns was to achieve near-optimal scheduling without significantly increasing the execution time of the partitioning tool.

The rest of the paper is structured as follows: Section 2 presents an overview of the related work; Section 3 explains in detail the format of the initial specification for our partitioning tool; Section 4 describes the cost function that steers the design space exploration; Sections 5, 6, and 7 explain how the energy, execution time and hardware area are estimated for a given partitioning; Section 8 presents the experimental results; and finally Section 9 draws some conclusions and outlines future work.

2. Related Work

Hardware/software partitioning is a very well-known problem. Several partitioning tools have been proposed in the literature (e.g. [8, 9]). Most of these previous approaches tackle the partitioning problem at a high abstraction level, adding the low-level details of the platform and scheduling the tasks on the processing elements (PEs) in a subsequent step called co-synthesis. Moreover, even during co-synthesis the communications between different PEs are often neglected, so these communications are included in a following step called communication synthesis. After these three steps the resulting solution is co-simulated and, most likely, the results will not be as expected, so the process will have to start again with another solution. The main problem of this approach is that some of the features neglected during partitioning are critical for the system performance. Thus, it is almost impossible to find near-optimal solutions when communications are neglected during the partitioning process.

Another shortcoming of most existing approaches is that they consider only the minimization of either hardware area or execution time. However, as mentioned in the introduction, minimizing the energy consumption is currently often one of the designer's most important concerns. Recently several scheduling and/or partitioning approaches for multiprocessors have been presented. They attempt to minimize the system consumption either by applying Dynamic Voltage Scheduling (DVS) or by applying different supply voltages to each processor; some of the most relevant are [10, 11, 12]. DVS techniques schedule the voltage supplied to each processor during its execution. This is a powerful way to achieve power savings, since in CMOS technologies the power consumption decreases quadratically with the supply voltage.
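As a reminder of where this quadratic dependence comes from, the usual first-order model of dynamic power in CMOS logic is shown below. The symbols (activity factor \alpha, switched capacitance C, supply voltage V_{dd}, clock frequency f, scaling factor k) are the standard textbook ones and are assumptions of this note, not notation taken from the cited works.

```latex
P_{dyn} \approx \alpha \, C \, V_{dd}^{2} \, f
\qquad\Longrightarrow\qquad
\frac{P_{dyn}(k\,V_{dd})}{P_{dyn}(V_{dd})} = k^{2}
\quad \text{(for fixed } \alpha,\ C,\ f\text{)}
```

For instance, lowering the supply voltage to 70% of its nominal value (k = 0.7) would reduce the dynamic power to roughly 49% at the same clock frequency. In practice the clock frequency usually has to be lowered along with the voltage, which is why DVS interacts with the scheduling.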
However, currently there is no support for DVS in most commercial processors and, to the best of our knowledge, there is no support at all for DVS in FPGA platforms. Hence, nowadays, this is not a feasible approach for hw/sw co-design.

[13] is the first hw/sw partitioning tool for low power that we have found. It starts from a full sw implementation in a microprocessor (µP) and reduces the energy consumption by migrating part of the functionality to hw; the energy savings are achieved by turning off the µP (in addition, clock gating is applied in the hw partition). This approach does not perform a full exploration of the partitioning design space. Moreover, it expects some data from the designer, such as the number of ALUs, multipliers, shifters, etc., based on previous design experience, so the results of the partitioning depend heavily on the designer's skill.

PAP [14] is a recent partitioning tool that attempts to minimize the hardware area while meeting the timing and power constraints. Thus, it does not minimize the overall energy consumption, but it ensures that infeasible solutions (those that consume more power than the platform allows) are not selected.

Finally, [15] presents a scheduling technique for dynamically reconfigurable FPGAs with support for partial reconfiguration. The scheduling process attempts to minimize the energy consumption by optimising the number of partial reconfigurations. However, this scheduling is carried out after the partitioning process, so most of the flexibility is lost, since the partition has already been fixed. According to that paper, dynamic reconfiguration of FPGAs is currently extremely power inefficient, since in its experiments up to 50% of the FPGA energy consumption was due to these reconfigurations.

Although a substantial amount of work has been devoted to partitioning and scheduling for low power, we believe that our approach is the first one that carries out a deep design space exploration of the partitioning and scheduling process for hardware/software low-power embedded systems, attempting to meet the real-time timing constraints while minimising the overall system energy consumption, and including the system bus and the memories in the performance and energy consumption estimations.

3. Initial Specification

The initial specification is described as a Directed Acyclic Graph (DAG), where each node represents a computational task or an access to the shared memory, and the edges correspond to dependencies among the nodes. Three different kinds of dependencies are considered, namely communication, internal, and temporal dependencies. A communication dependency edge (CDE) either connects two nodes allocated to different PEs or corresponds to a memory access; therefore, it represents a data transfer that must be carried out using the system bus. An internal dependency edge (IDE) connects two nodes allocated to the same PE; it also represents a data transfer, but in this case there is no access to the system bus. A temporal dependency edge (TDE) represents a dependency between two nodes in the same PE that has been imposed by the scheduler.

Each node of the graph must be characterized by its execution time, power and area estimations for every possible platform. Each CDE is tagged with the amount of data to be transferred and with its execution time and energy consumption estimations.
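One possible in-memory form of this specification is sketched below. It is only an illustration: the class names, field names and sample numbers (`Estimate`, `Node`, `Edge`, `TaskGraph`, `edge_kind`, the PE labels "sw" and "hw") are hypothetical and are not taken from the tool described here.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class EdgeKind(Enum):
    CDE = "communication"  # data transfer over the system bus (or a memory access)
    IDE = "internal"       # data transfer inside one PE, no bus access
    TDE = "temporal"       # ordering imposed later by the scheduler inside one PE


@dataclass
class Estimate:
    time: int      # execution time, e.g. in cycles
    power: float   # average power
    area: int      # hw area; 0 for a sw implementation


@dataclass
class Node:
    name: str
    estimates: Dict[str, Estimate]  # one entry per candidate PE (hypothetical keys)


@dataclass
class Edge:
    src: str
    dst: str
    data: int = 0        # amount of data to transfer
    time: int = 0        # bus (and memory) time if the edge becomes a CDE
    energy: float = 0.0  # bus (and memory) energy if the edge becomes a CDE


@dataclass
class TaskGraph:
    nodes: Dict[str, Node] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)


def edge_kind(src_pe: str, dst_pe: str, memory_access: bool = False) -> EdgeKind:
    """Classify a data edge once both end nodes have been mapped to a PE."""
    if memory_access or src_pe != dst_pe:
        return EdgeKind.CDE
    return EdgeKind.IDE


# Made-up two-node graph with invented estimates.
graph = TaskGraph()
graph.nodes["n1"] = Node("n1", {"sw": Estimate(40, 0.2, 0), "hw": Estimate(8, 0.5, 120)})
graph.nodes["n2"] = Node("n2", {"sw": Estimate(25, 0.2, 0), "hw": Estimate(5, 0.4, 90)})
graph.edges.append(Edge("n1", "n2", data=64, time=6, energy=0.3))

mapping = {"n1": "sw", "n2": "hw"}  # one candidate partition
kinds = [edge_kind(mapping[e.src], mapping[e.dst]) for e in graph.edges]
```

With this mapping the single edge crosses the hw/sw boundary, so it is classified as a CDE; mapping both nodes to the same PE would turn it into an IDE.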
The CDE estimations must include both the access to the system bus and, when needed, the access to the shared memory.

4. Cost Function

The cost function of a codesign system typically includes different elements such as the hw area, the execution time, the energy consumption, or the amount of communications. One of the more difficult issues when designing a partitioning system is how to combine all these completely different magnitudes into a cost function that can lead the design space exploration in a near-optimal fashion. In the literature, several codesign approaches can be found where the cost function is built as follows:

$$F = c_a \sum_{i=0}^{n} \mathrm{Area}_i + c_t \sum_{i=0}^{n} \mathrm{Time}_i + c_e \sum_{i=0}^{n} \mathrm{Energy}_i \tag{1}$$
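As a small illustration of Equation (1), the sketch below evaluates the weighted sum for two candidate partitions. The function name `weighted_cost`, the partitions and all the numbers are invented for this example; they only show that the ranking produced by such a function can depend on the chosen coefficients.

```python
from typing import Iterable, Tuple

NodeCost = Tuple[float, float, float]  # (area, time, energy) of one node


def weighted_cost(nodes: Iterable[NodeCost], c_a: float, c_t: float, c_e: float) -> float:
    """Equation (1): F = c_a*sum(Area_i) + c_t*sum(Time_i) + c_e*sum(Energy_i)."""
    nodes = list(nodes)
    area = sum(n[0] for n in nodes)
    time = sum(n[1] for n in nodes)
    energy = sum(n[2] for n in nodes)
    return c_a * area + c_t * time + c_e * energy


# Two made-up partitions of the same two-node graph.
partition_x = [(60.0, 4.0, 2.0), (40.0, 6.0, 3.0)]  # more area, faster
partition_y = [(50.0, 5.0, 2.5), (0.0, 9.0, 3.5)]   # less area, slower

equal_weights = (1.0, 1.0, 1.0)
area_discounted = (0.01, 1.0, 1.0)  # area counts 100 times less

x_equal = weighted_cost(partition_x, *equal_weights)
y_equal = weighted_cost(partition_y, *equal_weights)
x_disc = weighted_cost(partition_x, *area_discounted)
y_disc = weighted_cost(partition_y, *area_discounted)

# Lower cost is better: the preferred partition changes with the coefficients.
best_equal = "y" if y_equal < x_equal else "x"
best_disc = "y" if y_disc < x_disc else "x"
```

With equal coefficients the smaller partition wins because its area dominates the sum; once the area coefficient is reduced, the faster partition is preferred instead.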
Thus, for a given partition, each node of the DAG is characterized by a number for every magnitude considered (three in this example). The cost function is then easily computed by adding these numbers and multiplying the sums by some coefficients. Often the user must fix these coefficients, and therefore has to establish the equivalence between a second, a Joule, and a mm². There is no evident criterion for fixing these coefficients, so these heterogeneous cost functions often lead to inefficient design-space explorations.

In order to avoid this problem, our partitioning tool is led by a straightforward cost function that can be identified with either the energy consumption, the hw area or the execution time. Thus, the tool supports three different design-space explorations. The first one attempts to find the solution that consumes the least energy while meeting three restrictions, namely maximum execution time, maximum hardware area and maximum power consumption. The first restriction guarantees that the application meets its real-time deadline; the second guarantees that there are enough hw resources to implement the hw partition; and the third prevents heating problems. If the system is not battery-dependent, the cost function can be identified with either the execution time or the area. When the execution time is selected as the cost function, the tool attempts to find the fastest solution that meets the given area and power restrictions; when the area is selected, the tool tries to find the solution with the smallest hw area that meets the execution time and power restrictions. It is up to the designer to decide which one is the goal of the design-space exploration. Table 1 shows all the possibilities.

Table 1. Cost functions and restrictions that can steer the design space exploration

Available cost functions:  Energy, Time, Area
Available restrictions:    Time, Energy, Area, Power

5. Energy Consumption Estimations

First of all, each node and each edge of the DAG must be characterized with its energy consumption for every possible processing element. These estimations should be carried out using the tools provided by the vendors if possible; otherwise generic power models must be applied. In addition to the energy consumed by the execution of the nodes (including those nodes that represent accesses to the shared memory) and by the communications, we assume that the PEs also consume energy when they are idle. If the PE is a processor, the power consumption in the idle state is commonly provided in the data sheet, and the energy can be computed by multiplying this power by the idle time. The same approach is used for the memory blocks. If the PE is implemented in the FPGA and clock gating is applied to it, the power it consumes when idle is just the device quiescent power. Otherwise, if clock gating is not implemented, the logic dissipates additional power on top of the quiescent power, since the clock signal continues switching. This case is estimated by considering the power consumption of the circuit when the toggle rate of the inputs is set to 0; that is, we assume that all the inputs are fixed while the circuit is idle. If this assumption does not hold, a proper toggle rate should be estimated by profiling the system.

Besides the energy considerations, the partitioning tool must check whether a given partition meets the power dissipation constraints of the platform. To this end, the average power consumption of each node and each communication is included in the DAG.

6. Execution Time Estimations

The execution time estimator receives as input a given partitioning in which the execution time of each node and each access to the system bus has been previously estimated (we assume cycle-accurate estimations). Nodes representing accesses to the shared memory are always assigned an execution time of 0 time-units, since the latency of accessing the shared memory is considered part of the communication delay. With this input the estimator schedules the execution of every node as well as all the accesses to the system bus. This scheduling is an NP-complete problem; however, the estimation must be done as fast as possible, since it has to be computed for every explored partition. Thus, we have developed a fast heuristic, based on list scheduling techniques, which provides a near-optimal scheduling with a low computational complexity (O(N²)). Fig. 2 depicts the scheduling pseudo-code.

A) Assign a weight to each node.
B) Choose the execution order for the SW nodes.
C) Recalculate the weights taking into account the new dependencies.
D) Schedule those nodes that are not waiting for a communication.
E) While there is a communication waiting for execution do:
   E1) Choose one communication and schedule it.
   E2) Schedule those nodes that are not waiting for a communication.

Fig. 2. Scheduling heuristic pseudo-code

Step A: The weights are used to steer the scheduling process so as to minimize the global execution time. The weight of a node is the maximum time-distance from that node to the end of the execution in the initial graph. This distance is computed by carrying out an ALAP scheduling that takes into account all the dependencies. Thus, the nodes on the critical path of the DAG have higher weights.

Steps B and C: The initial DAG allows parallel execution of its nodes, but the nodes assigned to sw must be executed sequentially. The sw execution order is decided by sorting the nodes by their weights. To impose this order, new TDE dependencies are added to the initial DAG. It is easy to prove that, with this sw execution order, the new dependencies cannot create cycles in the graph. Since these new dependencies can significantly affect the system performance, a new weight is assigned to each node. These weights are computed in the same way as in step A, but considering the new dependencies.
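Steps A to C can be sketched as follows. This is a simplified illustration rather than the tool's implementation: the function names, the example graph and its execution times are hypothetical, communication delays on the edges are ignored, and every sw node is assumed to have a positive execution time (so that a node always weighs more than its successors and the added TDEs follow the existing dependencies).

```python
from typing import Dict, List


def node_weights(exec_time: Dict[str, int], succs: Dict[str, List[str]]) -> Dict[str, int]:
    """Step A: weight = longest time-distance from the start of a node to the end of the graph."""
    weights: Dict[str, int] = {}

    def weight(node: str) -> int:
        if node not in weights:
            tail = max((weight(s) for s in succs.get(node, [])), default=0)
            weights[node] = exec_time[node] + tail
        return weights[node]

    for node in exec_time:
        weight(node)
    return weights


def sw_order(sw_nodes: List[str], weights: Dict[str, int]) -> List[str]:
    """Step B: execute the sw nodes by decreasing weight."""
    return sorted(sw_nodes, key=lambda n: weights[n], reverse=True)


def add_temporal_edges(succs: Dict[str, List[str]], order: List[str]) -> Dict[str, List[str]]:
    """Step B: chain consecutive sw nodes with TDEs (returns a new successor map)."""
    chained = {n: list(s) for n, s in succs.items()}
    for first, second in zip(order, order[1:]):
        if second not in chained.setdefault(first, []):
            chained[first].append(second)
    return chained


# Made-up diamond-shaped graph: n1 -> {n2, n3} -> n4.
exec_time = {"n1": 2, "n2": 5, "n3": 3, "n4": 1}
succs = {"n1": ["n2", "n3"], "n2": ["n4"], "n3": ["n4"], "n4": []}
sw_nodes = ["n3", "n2"]  # n1 and n4 are assumed to be mapped to hw

weights_a = node_weights(exec_time, succs)      # step A
order = sw_order(sw_nodes, weights_a)           # step B: n2 before n3
succs_tde = add_temporal_edges(succs, order)    # step B: adds the TDE n2 -> n3
weights_c = node_weights(exec_time, succs_tde)  # step C
```

In this example n2 and n3 could run in parallel in the initial graph; once both are mapped to sw, the TDE from n2 to n3 serializes them, which raises the weights of n2 and n1 in step C.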
Steps D and E: An enhanced list-scheduling heuristic that attempts to minimize the global execution time has been developed for the scheduling process. This heuristic decides when each node and each communication is executed, assigning to each of them a start time t_start and an end time t_end. The motivation of the heuristic is to detect the conflicts in the accesses to the system bus and the delays they create. The scheduling starts by assigning t_start = 0 and t_end = t_ex to the first node, where t_ex is its execution time in the partition to which it has been assigned. Then the algorithm continues by scheduling the successors of the first node. A greedy policy is followed to schedule nodes while there is no need for hw/sw communications. When a scheduled node requests a hw/sw communication with another node, this request is stored in a list. Once all the nodes that do not need a hw/sw communication have been scheduled, one of the requested communications is selected and scheduled. There are two selection criteria (E1): if at a given time t the system bus is not carrying out any communication and there is just one pending request, the communication channel is assigned to this request and the bus is tagged as busy until this communication ends. Otherwise, if there is more than one request, the one with the greatest weight is selected. The weight of a communication is computed as the weight of the destination node plus the time needed to execute the communication. Once the selected communication has been scheduled, the graph is examined (E2) and all the nodes that can start their execution without waiting for another hw/sw communication are also scheduled. The loop continues until all the communications are scheduled.

7. Area Estimation

We apply the following equation to estimate the area needed to implement the nodes assigned to hw in the FPGA:

$$\mathrm{Area} = \sum_{i=0}^{N-1} A_i + A_{driver} + A_{control} + A_{storage} \tag{2}$$

A_i is the area of node i. A_driver is the area needed to implement the communication driver. A_i and A_driver are estimated from a core library. When a new core is added to the library, its area is estimated using a synthesis tool. A_control is the area needed for the control logic that schedules the communications. In this approach the scheduling control is performed by a state machine, so the required area is estimated as a function of the number of communications. A_storage is the area needed for storing the data to be transferred until a communication is executed. This storage space is computed during the communication scheduling. During this process a record keeps the maximum storage space required.
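To illustrate how the storage record and Equation (2) fit together, the sketch below serves a fixed list of bus requests one at a time, always choosing the pending request with the greatest weight (criterion E1 of Section 6), and records the largest amount of data waiting at any decision point. It is a simplification under stated assumptions: the `Request` fields, the fixed ready times (the real heuristic interleaves node scheduling, step E2), the rule that data occupies storage until its transfer starts, and the linear models for A_control and A_storage are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Request:
    name: str
    ready: int     # time at which the data is produced and starts waiting
    duration: int  # bus time needed by the transfer
    weight: int    # weight of the destination node plus the transfer time
    size: int      # amount of data waiting in the buffer


def schedule_bus(requests: Iterable[Request]) -> Tuple[List[Tuple[str, int, int]], int]:
    """Serve one request at a time, greatest weight first; also return the peak buffered data."""
    pending = sorted(requests, key=lambda r: r.ready)
    slots: List[Tuple[str, int, int]] = []
    now = 0
    peak = 0
    while pending:
        waiting = [r for r in pending if r.ready <= now]
        if not waiting:
            now = pending[0].ready  # bus stays idle until the next request arrives
            continue
        peak = max(peak, sum(r.size for r in waiting))  # record of the maximum storage
        chosen = max(waiting, key=lambda r: r.weight)
        slots.append((chosen.name, now, now + chosen.duration))
        now += chosen.duration      # bus is busy until this transfer ends
        pending.remove(chosen)
    return slots, peak


def fpga_area(node_areas: Iterable[int], a_driver: int, a_control: int, a_storage: int) -> int:
    """Equation (2): sum of the hw node areas plus driver, control and storage areas."""
    return sum(node_areas) + a_driver + a_control + a_storage


requests = [
    Request("c1", ready=0, duration=4, weight=10, size=2),
    Request("c2", ready=1, duration=2, weight=15, size=3),
    Request("c3", ready=2, duration=3, weight=12, size=1),
]
slots, peak = schedule_bus(requests)

# Hypothetical linear models: 5 area units per communication, 6 per buffered data unit.
a_control = 5 * len(requests)
a_storage = 6 * peak
total = fpga_area([120, 80], a_driver=30, a_control=a_control, a_storage=a_storage)
```

Here c1 gets the bus first because it is the only pending request at time 0; when the bus is released, c2 and c3 are both waiting and c2 is served first because of its greater weight. That decision point is also where the buffered data peaks.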
8. Results and Analysis

All the estimators have been integrated into a partitioning tool based on genetic algorithms (GA) [16]. This tool creates a random initial population of valid solutions. A solution is valid if it meets the given area, time and power constraints. Invalid solutions are rejected to save computational time, as well as to prevent the algorithm from converging to a non-valid region of the design space. During the design space exploration, solutions evolve by reproducing themselves, generating new offspring. The crossover and mutation operators carry out the reproduction process. The population size is kept constant by deleting the surplus solutions: 80% of the survivors are the best solutions, and the remaining 20% are selected at random in order to prevent premature convergence. The designer can set the population size and the crossover and mutation probabilities. In addition, the designer can select the cost function (time, energy, or area) and fix the area, time and power restrictions.

The partitioning tool allows the designer to select between two different scheduling modes. The first implements our heuristic, while the second carries out a full search of the design space by applying a branch&bound (b&b) algorithm; hence this mode guarantees that the best schedule is always found. As a first experiment, in order to validate our heuristic, we have run the partitioning tool in these two modes for a set of 100 randomly generated DAGs. These DAGs were created using the TGFF tool [17], and their sizes range from 10 to 20 nodes (for greater sizes it is not feasible to apply the b&b algorithm). The results obtained show that the b&b algorithm finds slightly better schedules (on average 10% less execution time), but at the price of increasing 800 times the computational time needed to carry out the partitioning process (which is reasonable, since it performs a full search of the design space).
These results confirm that our scheduling heuristic finds near-optimal schedules with an almost negligible overhead. In this experiment the average time needed to schedule one of the DAGs with our heuristic was less than 2.5 s using a Pentium II running at 350 MHz.

In our second experiment we compare the results obtained when using the energy and the execution time as cost functions. To this end, we have analyzed two current multimedia applications, namely a JPEG decoder and a pattern recognition application that computes the Hough Transform of a matrix of pixels in order to find simple geometric patterns. The Hough Transform is commonly applied in robotics and astronomical data analysis. It is very simple to reduce the energy consumption when it is also possible to reduce the performance. Therefore, in this experiment we check whether it is possible to reduce the energy consumption while keeping almost the highest performance. Hence, we have first run the partitioning tool using the execution time as cost function to find the fastest solution. Then, we have rerun it using the energy instead of the time as cost function, but this time imposing that the solutions must be at most 10% slower than the fastest solution found in the previous step. Therefore, the tool will find the solution that consumes the least energy while keeping almost the highest performance.

For this experiment we have estimated the energy, execution time, area and power consumption of the application using the XILINX Foundation 5.i tool for the FPGA and the system bus, an ARM processor simulator for the sw processor and a 128 MB MICRON SRAM memory datasheet for the shared memory. Each application has been partitioned onto a platform composed of a XILINX Virtex FPGA, an ARM processor running at 233 MHz, a 128 MB memory block and a 16-bit system bus clocked at 33 MHz. The measurements were repeated 5 times for 5 different FPGA sizes. The results are shown in Table 2. It is remarkable that we can decrease the energy consumption by up to 30% (17% on average), whereas the execution time remains almost the same (it increases by less than 3% on average).

Table 2. Results for the Pattern Recognition Application (a) and the JPEG decoder (b). T1 and E1 are the execution time and the energy consumption for the fastest solution, whereas T2 and E2 correspond to the solution found using the energy as cost function.

a) Pat. Rec.    T1    T2    Time %    E1    E2    Energy %
   Average                  +2%                   -15%

b) JPEG         T1    T2    Time %    E1    E2    Energy %
   Average                  +3%                   -19%

9. Conclusions and Future Work

We have presented the first (to the best of our knowledge) hw/sw partitioning tool that can steer the design space exploration of the partitioning process to minimize the energy, the execution time or the area. In addition, this is one of the few tools that carry out a full scheduling during the partitioning process, including the accesses to the system bus and the shared memories. This scheduling is the only way to accurately estimate the quality of a given partition. We believe that this tool can be especially useful for decreasing the energy consumption of a given application while meeting hard real-time constraints. Thus, we have applied our tool to two current multimedia applications, saving up to 30% of the energy consumption, whereas the performance remains almost constant.
Moreover, it should be noted that a slight decrease in performance is unimportant as long as the timing constraints are met.
Although our tool fulfills the requirements to partition an application onto several existing platforms, several extensions are needed to apply it to platforms with multiple processors and more complex interconnection networks.

Acknowledgements

This work has been partially supported by Spanish Government research grant TIC

References

1. J. R. Hauser and J. Wawrzynek, "Garp: A MIPS processor with a reconfigurable coprocessor," in IEEE Workshop on FPGAs for Custom Computing Machines.
2. H. Singh et al., "MorphoSys: An Integrated Reconfigurable System for Data-Parallel and Computation-Intensive Applications," IEEE Trans. on Computers, Vol. 49, No. 5.
4. M. Gasteier, M. Munich, M. Glesner, "Generation of Interconnect Topologies for Communication Synthesis," DATE 98.
5. K. Itoh et al., "Trends in Low-Power RAM Circuit Technologies," Proc. IEEE, 83(4), Apr.
6. M. Kamble and K. Ghose, "Analytical Energy Dissipation Models for Low Power Caches," Proc. Int'l Symp. Low Power Electronics and Design, p. 143, Aug.
7. J. Resano et al., "Analyzing Communication Overheads during Hardware/Software Partitioning," ESCODES 02.
8. R. P. Dick and N. K. Jha, "CORDS: Hardware-Software Co-Synthesis of Reconfigurable Real-Time Distributed Embedded Systems," ICCAD 98.
9. J. Noguera, R. M. Badía, "A HW/SW partitioning algorithm for dynamically reconfigurable architectures," DATE 01.
10. P. Yang et al., "Energy-Aware Runtime Scheduling for Embedded-Multiprocessors SOCs," IEEE Journal on Design & Test of Computers.
11. G. Qu et al., "Power Minimization using System-Level Partitioning of Applications with Quality of Services Requirements," Proc. of Int. Conf. on CAD.
12. I. Hong et al., "Power Optimization of Variable-Voltage Core-Based System," IEEE Trans. on CAD of Integrated Circuits and Systems, vol. 18, no. 12.
13. J. Henkel, "A low power hardware/software partitioning approach for core-based embedded systems," DAC 99.
14. R. Mahapatra and P. Vijay, "PAP: Power Aware Partitioning for Reconfigurable System," to be published in Proc. of HPCA Workshop 2003, Feb.
15. L. Shang et al., "Hw/Sw Co-synthesis of Low Power Real-Time Distributed Embedded Systems with Dynamically Reconfigurable FPGAs," ASP-DAC 02.
16. J. Holland, "Adaptation in Natural and Artificial Systems," MIT Press.
17. R. P. Dick et al., "TGFF: Task Graphs for Free," Int'l Workshop HW/SW Codesign, 1998.
More informationRuntime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays
Runtime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays Éricles Sousa 1, Frank Hannig 1, Jürgen Teich 1, Qingqing Chen 2, and Ulf Schlichtmann
More informationPilot: A Platform-based HW/SW Synthesis System
Pilot: A Platform-based HW/SW Synthesis System SOC Group, VLSI CAD Lab, UCLA Led by Jason Cong Zhong Chen, Yiping Fan, Xun Yang, Zhiru Zhang ICSOC Workshop, Beijing August 20, 2002 Outline Overview The
More informationMapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y.
Mapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y. Published in: Proceedings of the 2010 International Conference on Field-programmable
More informationComputer Systems Colloquium (EE380) Wednesday, 4:15-5:30PM 5:30PM in Gates B01
Adapting Systems by Evolving Hardware Computer Systems Colloquium (EE380) Wednesday, 4:15-5:30PM 5:30PM in Gates B01 Jim Torresen Group Department of Informatics University of Oslo, Norway E-mail: jimtoer@ifi.uio.no
More informationA Novel Deadlock Avoidance Algorithm and Its Hardware Implementation
A ovel Deadlock Avoidance Algorithm and Its Hardware Implementation + Jaehwan Lee and *Vincent* J. Mooney III Hardware/Software RTOS Group Center for Research on Embedded Systems and Technology (CREST)
More informationPower Estimation of System-Level Buses for Microprocessor-Based Architectures: A Case Study
Power Estimation of System-Level Buses for Microprocessor-Based Architectures: A Case Study William Fornaciari Politecnico di Milano, DEI Milano (Italy) fornacia@elet.polimi.it Donatella Sciuto Politecnico
More informationHardware/Software Codesign
Hardware/Software Codesign 3. Partitioning Marco Platzner Lothar Thiele by the authors 1 Overview A Model for System Synthesis The Partitioning Problem General Partitioning Methods HW/SW-Partitioning Methods
More informationSystem Verification of Hardware Optimization Based on Edge Detection
Circuits and Systems, 2013, 4, 293-298 http://dx.doi.org/10.4236/cs.2013.43040 Published Online July 2013 (http://www.scirp.org/journal/cs) System Verification of Hardware Optimization Based on Edge Detection
More informationEffective Memory Access Optimization by Memory Delay Modeling, Memory Allocation, and Slack Time Management
International Journal of Computer Theory and Engineering, Vol., No., December 01 Effective Memory Optimization by Memory Delay Modeling, Memory Allocation, and Slack Time Management Sultan Daud Khan, Member,
More informationA Modified Genetic Algorithm for Process Scheduling in Distributed System
A Modified Genetic Algorithm for Process Scheduling in Distributed System Vinay Harsora B.V.M. Engineering College Charatar Vidya Mandal Vallabh Vidyanagar, India Dr.Apurva Shah G.H.Patel College of Engineering
More informationA Reconfigurable Multifunction Computing Cache Architecture
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 9, NO. 4, AUGUST 2001 509 A Reconfigurable Multifunction Computing Cache Architecture Huesung Kim, Student Member, IEEE, Arun K. Somani,
More informationHardware Software Codesign of Embedded Systems
Hardware Software Codesign of Embedded Systems Rabi Mahapatra Texas A&M University Today s topics Course Organization Introduction to HS-CODES Codesign Motivation Some Issues on Codesign of Embedded System
More informationSoftware Pipelining for Coarse-Grained Reconfigurable Instruction Set Processors
Software Pipelining for Coarse-Grained Reconfigurable Instruction Set Processors Francisco Barat, Murali Jayapala, Pieter Op de Beeck and Geert Deconinck K.U.Leuven, Belgium. {f-barat, j4murali}@ieee.org,
More informationA Methodology and Tool Framework for Supporting Rapid Exploration of Memory Hierarchies in FPGAs
A Methodology and Tool Framework for Supporting Rapid Exploration of Memory Hierarchies in FPGAs Harrys Sidiropoulos, Kostas Siozios and Dimitrios Soudris School of Electrical & Computer Engineering National
More informationReal-Time Dynamic Energy Management on MPSoCs
Real-Time Dynamic Energy Management on MPSoCs Tohru Ishihara Graduate School of Informatics, Kyoto University 2013/03/27 University of Bristol on Energy-Aware COmputing (EACO) Workshop 1 Background Low
More informationScalable and Dynamically Updatable Lookup Engine for Decision-trees on FPGA
Scalable and Dynamically Updatable Lookup Engine for Decision-trees on FPGA Yun R. Qu, Viktor K. Prasanna Ming Hsieh Dept. of Electrical Engineering University of Southern California Los Angeles, CA 90089
More informationMOGAC: A Multiobjective Genetic Algorithm for the Co-Synthesis of Hardware-Software Embedded Systems
MOGAC: A Multiobjective Genetic Algorithm for the Co-Synthesis of Hardware-Software Embedded Systems Robert P. Dick and Niraj K. Jha Department of Electrical Engineering Princeton University Princeton,
More informationAn adaptive genetic algorithm for dynamically reconfigurable modules allocation
An adaptive genetic algorithm for dynamically reconfigurable modules allocation Vincenzo Rana, Chiara Sandionigi, Marco Santambrogio and Donatella Sciuto chiara.sandionigi@dresd.org, {rana, santambr, sciuto}@elet.polimi.it
More informationRECONFIGURABLE computing (RC) [5] is an interesting
730 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 7, JULY 2006 System-Level Power-Performance Tradeoffs for Reconfigurable Computing Juanjo Noguera and Rosa M. Badia Abstract
More informationSystem-on-Chip Architecture for Mobile Applications. Sabyasachi Dey
System-on-Chip Architecture for Mobile Applications Sabyasachi Dey Email: sabyasachi.dey@gmail.com Agenda What is Mobile Application Platform Challenges Key Architecture Focus Areas Conclusion Mobile Revolution
More informationEmbedded Systems. 8. Hardware Components. Lothar Thiele. Computer Engineering and Networks Laboratory
Embedded Systems 8. Hardware Components Lothar Thiele Computer Engineering and Networks Laboratory Do you Remember? 8 2 8 3 High Level Physical View 8 4 High Level Physical View 8 5 Implementation Alternatives
More informationA Methodology for Energy Efficient FPGA Designs Using Malleable Algorithms
A Methodology for Energy Efficient FPGA Designs Using Malleable Algorithms Jingzhao Ou and Viktor K. Prasanna Department of Electrical Engineering, University of Southern California Los Angeles, California,
More informationSAMBA-BUS: A HIGH PERFORMANCE BUS ARCHITECTURE FOR SYSTEM-ON-CHIPS Λ. Ruibing Lu and Cheng-Kok Koh
BUS: A HIGH PERFORMANCE BUS ARCHITECTURE FOR SYSTEM-ON-CHIPS Λ Ruibing Lu and Cheng-Kok Koh School of Electrical and Computer Engineering Purdue University, West Lafayette, IN 797- flur,chengkokg@ecn.purdue.edu
More informationA Complete Data Scheduler for Multi-Context Reconfigurable Architectures
A Complete Data Scheduler for Multi-Context Reconfigurable Architectures M. Sanchez-Elez, M. Fernandez, R. Maestre, R. Hermida, N. Bagherzadeh, F. J. Kurdahi Departamento de Arquitectura de Computadores
More informationScheduling tasks in embedded systems based on NoC architecture
Scheduling tasks in embedded systems based on NoC architecture Dariusz Dorota Faculty of Electrical and Computer Engineering, Cracow University of Technology ddorota@pk.edu.pl Abstract This paper presents
More informationMobile Robot Path Planning Software and Hardware Implementations
Mobile Robot Path Planning Software and Hardware Implementations Lucia Vacariu, Flaviu Roman, Mihai Timar, Tudor Stanciu, Radu Banabic, Octavian Cret Computer Science Department, Technical University of
More informationMULTI-PROCESSOR SYSTEM-LEVEL SYNTHESIS FOR MULTIPLE APPLICATIONS ON PLATFORM FPGA
MULTI-PROCESSOR SYSTEM-LEVEL SYNTHESIS FOR MULTIPLE APPLICATIONS ON PLATFORM FPGA Akash Kumar,, Shakith Fernando, Yajun Ha, Bart Mesman and Henk Corporaal Eindhoven University of Technology, Eindhoven,
More informationHardware Software Codesign of Embedded System
Hardware Software Codesign of Embedded System CPSC489-501 Rabi Mahapatra Mahapatra - Texas A&M - Fall 00 1 Today s topics Course Organization Introduction to HS-CODES Codesign Motivation Some Issues on
More informationA Level-wise Priority Based Task Scheduling for Heterogeneous Systems
International Journal of Information and Education Technology, Vol., No. 5, December A Level-wise Priority Based Task Scheduling for Heterogeneous Systems R. Eswari and S. Nickolas, Member IACSIT Abstract
More informationMapping Multi-Million Gate SoCs on FPGAs: Industrial Methodology and Experience
Mapping Multi-Million Gate SoCs on FPGAs: Industrial Methodology and Experience H. Krupnova CMG/FMVG, ST Microelectronics Grenoble, France Helena.Krupnova@st.com Abstract Today, having a fast hardware
More informationTradeoff Analysis and Architecture Design of a Hybrid Hardware/Software Sorter
Tradeoff Analysis and Architecture Design of a Hybrid Hardware/Software Sorter M. Bednara, O. Beyer, J. Teich, R. Wanka Paderborn University D-33095 Paderborn, Germany bednara,beyer,teich @date.upb.de,
More informationLow-Power Data Address Bus Encoding Method
Low-Power Data Address Bus Encoding Method Tsung-Hsi Weng, Wei-Hao Chiao, Jean Jyh-Jiun Shann, Chung-Ping Chung, and Jimmy Lu Dept. of Computer Science and Information Engineering, National Chao Tung University,
More informationLong Term Trends for Embedded System Design
Long Term Trends for Embedded System Design Ahmed Amine JERRAYA Laboratoire TIMA, 46 Avenue Félix Viallet, 38031 Grenoble CEDEX, France Email: Ahmed.Jerraya@imag.fr Abstract. An embedded system is an application
More informationAdaptive Online Cache Reconfiguration for Low Power Systems
Adaptive Online Cache Reconfiguration for Low Power Systems Andre Costi Nacul and Tony Givargis Department of Computer Science University of California, Irvine Center for Embedded Computer Systems {nacul,
More informationRED: A Reconfigurable Datapath
RED: A Reconfigurable Datapath Fernando Rincón, José M. Moya, Juan Carlos López Universidad de Castilla-La Mancha Departamento de Informática {frincon,fmoya,lopez}@inf-cr.uclm.es Abstract The popularity
More informationHardware-Software Co-Design of Embedded Reconfigurable Architectures
Hardware-Software Co-Design of Embedded Reconfigurable Architectures Yanbing Li, Tim Callahan *, Ervan Darnell **, Randolph Harr, Uday Kurkure, Jon Stockwood Synopsys Inc., 700 East Middlefield Rd. Mountain
More informationIntroduction Warp Processors Dynamic HW/SW Partitioning. Introduction Standard binary - Separating Function and Architecture
Roman Lysecky Department of Electrical and Computer Engineering University of Arizona Dynamic HW/SW Partitioning Initially execute application in software only 5 Partitioned application executes faster
More informationECE 448 Lecture 15. Overview of Embedded SoC Systems
ECE 448 Lecture 15 Overview of Embedded SoC Systems ECE 448 FPGA and ASIC Design with VHDL George Mason University Required Reading P. Chu, FPGA Prototyping by VHDL Examples Chapter 8, Overview of Embedded
More informationFPGA. Agenda 11/05/2016. Scheduling tasks on Reconfigurable FPGA architectures. Definition. Overview. Characteristics of the CLB.
Agenda The topics that will be addressed are: Scheduling tasks on Reconfigurable FPGA architectures Mauro Marinoni ReTiS Lab, TeCIP Institute Scuola superiore Sant Anna - Pisa Overview on basic characteristics
More informationCrew Scheduling Problem: A Column Generation Approach Improved by a Genetic Algorithm. Santos and Mateus (2007)
In the name of God Crew Scheduling Problem: A Column Generation Approach Improved by a Genetic Algorithm Spring 2009 Instructor: Dr. Masoud Yaghini Outlines Problem Definition Modeling As A Set Partitioning
More informationHow Much Logic Should Go in an FPGA Logic Block?
How Much Logic Should Go in an FPGA Logic Block? Vaughn Betz and Jonathan Rose Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S 3G4 {vaughn, jayar}@eecgutorontoca
More informationThe Design and Implementation of a Low-Latency On-Chip Network
The Design and Implementation of a Low-Latency On-Chip Network Robert Mullins 11 th Asia and South Pacific Design Automation Conference (ASP-DAC), Jan 24-27 th, 2006, Yokohama, Japan. Introduction Current
More informationAn Integration of Imprecise Computation Model and Real-Time Voltage and Frequency Scaling
An Integration of Imprecise Computation Model and Real-Time Voltage and Frequency Scaling Keigo Mizotani, Yusuke Hatori, Yusuke Kumura, Masayoshi Takasu, Hiroyuki Chishiro, and Nobuyuki Yamasaki Graduate
More informationFeRAM Circuit Technology for System on a Chip
FeRAM Circuit Technology for System on a Chip K. Asari 1,2,4, Y. Mitsuyama 2, T. Onoye 2, I. Shirakawa 2, H. Hirano 1, T. Honda 1, T. Otsuki 1, T. Baba 3, T. Meng 4 1 Matsushita Electronics Corp., Osaka,
More informationBi-Objective Optimization for Scheduling in Heterogeneous Computing Systems
Bi-Objective Optimization for Scheduling in Heterogeneous Computing Systems Tony Maciejewski, Kyle Tarplee, Ryan Friese, and Howard Jay Siegel Department of Electrical and Computer Engineering Colorado
More informationI. INTRODUCTION DYNAMIC reconfiguration, often referred to as run-time
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 11, NOVEMBER 2006 1189 Integrating Physical Constraints in HW-SW Partitioning for Architectures With Partial Dynamic Reconfiguration
More informationPerformance Improvements of Microprocessor Platforms with a Coarse-Grained Reconfigurable Data-Path
Performance Improvements of Microprocessor Platforms with a Coarse-Grained Reconfigurable Data-Path MICHALIS D. GALANIS 1, GREGORY DIMITROULAKOS 2, COSTAS E. GOUTIS 3 VLSI Design Laboratory, Electrical
More informationOptimal Cache Organization using an Allocation Tree
Optimal Cache Organization using an Allocation Tree Tony Givargis Technical Report CECS-2-22 September 11, 2002 Department of Information and Computer Science Center for Embedded Computer Systems University
More informationHardware Software Partitioning of Multifunction Systems
Hardware Software Partitioning of Multifunction Systems Abhijit Prasad Wangqi Qiu Rabi Mahapatra Department of Computer Science Texas A&M University College Station, TX 77843-3112 Email: {abhijitp,wangqiq,rabi}@cs.tamu.edu
More informationEvaluation of Runtime Task Mapping Heuristics with rsesame - A Case Study
Evaluation of Runtime Task Mapping Heuristics with rsesame - A Case Study Kamana Sigdel Mark Thompson Carlo Galuzzi Andy D. Pimentel Koen Bertels Computer Engineering Laboratory EEMCS, Delft University
More informationMassively Parallel Computing on Silicon: SIMD Implementations. V.M.. Brea Univ. of Santiago de Compostela Spain
Massively Parallel Computing on Silicon: SIMD Implementations V.M.. Brea Univ. of Santiago de Compostela Spain GOAL Give an overview on the state-of of-the- art of Digital on-chip CMOS SIMD Solutions,
More informationMapping a group of jobs in the error recovery of the Grid-based workflow within SLA context
Mapping a group of jobs in the error recovery of the Grid-based workflow within SLA context Dang Minh Quan International University in Germany School of Information Technology Bruchsal 76646, Germany quandm@upb.de
More informationTHIS PAPER describes algorithms to synthesize lowpower
508 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 26, NO. 3, MARCH 2007 SLOPES: Hardware Software Cosynthesis of Low-Power Real-Time Distributed Embedded Systems With
More informationA Low Energy Clustered Instruction Memory Hierarchy for Long Instruction Word Processors
A Low Energy Clustered Instruction Memory Hierarchy for Long Instruction Word Processors Murali Jayapala 1, Francisco Barat 1, Pieter Op de Beeck 1, Francky Catthoor 2, Geert Deconinck 1 and Henk Corporaal
More informationCHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP
133 CHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP 6.1 INTRODUCTION As the era of a billion transistors on a one chip approaches, a lot of Processing Elements (PEs) could be located
More informationQUKU: A Fast Run Time Reconfigurable Platform for Image Edge Detection
QUKU: A Fast Run Time Reconfigurable Platform for Image Edge Detection Sunil Shukla 1,2, Neil W. Bergmann 1, Jürgen Becker 2 1 ITEE, University of Queensland, Brisbane, QLD 4072, Australia {sunil, n.bergmann}@itee.uq.edu.au
More informationIntroduction to Embedded Systems
Introduction to Embedded Systems Outline Embedded systems overview What is embedded system Characteristics Elements of embedded system Trends in embedded system Design cycle 2 Computing Systems Most of
More informationStatic Compaction Techniques to Control Scan Vector Power Dissipation
Static Compaction Techniques to Control Scan Vector Power Dissipation Ranganathan Sankaralingam, Rama Rao Oruganti, and Nur A. Touba Computer Engineering Research Center Department of Electrical and Computer
More informationMemory Systems IRAM. Principle of IRAM
Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several
More informationDesign Space Exploration Using Parameterized Cores
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR Design Space Exploration Using Parameterized Cores Ian D. L. Anderson M.A.Sc. Candidate March 31, 2006 Supervisor: Dr. M. Khalid 1 OUTLINE
More informationCOE 561 Digital System Design & Synthesis Introduction
1 COE 561 Digital System Design & Synthesis Introduction Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals Outline Course Topics Microelectronics Design
More informationHYBRID GENETIC ALGORITHM WITH GREAT DELUGE TO SOLVE CONSTRAINED OPTIMIZATION PROBLEMS
HYBRID GENETIC ALGORITHM WITH GREAT DELUGE TO SOLVE CONSTRAINED OPTIMIZATION PROBLEMS NABEEL AL-MILLI Financial and Business Administration and Computer Science Department Zarqa University College Al-Balqa'
More informationLossless Compression using Efficient Encoding of Bitmasks
Lossless Compression using Efficient Encoding of Bitmasks Chetan Murthy and Prabhat Mishra Department of Computer and Information Science and Engineering University of Florida, Gainesville, FL 326, USA
More informationEnergy-Constrained Scheduling of DAGs on Multi-core Processors
Energy-Constrained Scheduling of DAGs on Multi-core Processors Ishfaq Ahmad 1, Roman Arora 1, Derek White 1, Vangelis Metsis 1, and Rebecca Ingram 2 1 University of Texas at Arlington, Computer Science
More informationVerification and Validation of X-Sim: A Trace-Based Simulator
http://www.cse.wustl.edu/~jain/cse567-06/ftp/xsim/index.html 1 of 11 Verification and Validation of X-Sim: A Trace-Based Simulator Saurabh Gayen, sg3@wustl.edu Abstract X-Sim is a trace-based simulator
More informationSynthetic Benchmark Generator for the MOLEN Processor
Synthetic Benchmark Generator for the MOLEN Processor Stephan Wong, Guanzhou Luo, and Sorin Cotofana Computer Engineering Laboratory, Electrical Engineering Department, Delft University of Technology,
More informationA Study of the Speedups and Competitiveness of FPGA Soft Processor Cores using Dynamic Hardware/Software Partitioning
A Study of the Speedups and Competitiveness of FPGA Soft Processor Cores using Dynamic Hardware/Software Partitioning By: Roman Lysecky and Frank Vahid Presented By: Anton Kiriwas Disclaimer This specific
More informationReal-Time Mixed-Criticality Wormhole Networks
eal-time Mixed-Criticality Wormhole Networks Leandro Soares Indrusiak eal-time Systems Group Department of Computer Science University of York United Kingdom eal-time Systems Group 1 Outline Wormhole Networks
More informationA Novel Design of High Speed and Area Efficient De-Multiplexer. using Pass Transistor Logic
A Novel Design of High Speed and Area Efficient De-Multiplexer Using Pass Transistor Logic K.Ravi PG Scholar(VLSI), P.Vijaya Kumari, M.Tech Assistant Professor T.Ravichandra Babu, Ph.D Associate Professor
More informationDesign of a System-on-Chip Switched Network and its Design Support Λ
Design of a System-on-Chip Switched Network and its Design Support Λ Daniel Wiklund y, Dake Liu Dept. of Electrical Engineering Linköping University S-581 83 Linköping, Sweden Abstract As the degree of
More informationEnergy Aware Optimized Resource Allocation Using Buffer Based Data Flow In MPSOC Architecture
ISSN (Online) : 2319-8753 ISSN (Print) : 2347-6710 International Journal of Innovative Research in Science, Engineering and Technology Volume 3, Special Issue 3, March 2014 2014 International Conference
More informationFPGA: What? Why? Marco D. Santambrogio
FPGA: What? Why? Marco D. Santambrogio marco.santambrogio@polimi.it 2 Reconfigurable Hardware Reconfigurable computing is intended to fill the gap between hardware and software, achieving potentially much
More informationDelay Estimation for Technology Independent Synthesis
Delay Estimation for Technology Independent Synthesis Yutaka TAMIYA FUJITSU LABORATORIES LTD. 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki, JAPAN, 211-88 Tel: +81-44-754-2663 Fax: +81-44-754-2664 E-mail:
More informationA Device-Controlled Dynamic Configuration Framework Supporting Heterogeneous Resource Management
A Device-Controlled Dynamic Configuration Framework Supporting Heterogeneous Resource Management H. Tan and R. F. DeMara Department of Electrical and Computer Engineering University of Central Florida
More informationSystems Development Tools for Embedded Systems and SOC s
Systems Development Tools for Embedded Systems and SOC s Óscar R. Ribeiro Departamento de Informática, Universidade do Minho 4710 057 Braga, Portugal oscar.rafael@di.uminho.pt Abstract. A new approach
More informationECE 4514 Digital Design II. Spring Lecture 22: Design Economics: FPGAs, ASICs, Full Custom
ECE 4514 Digital Design II Lecture 22: Design Economics: FPGAs, ASICs, Full Custom A Tools/Methods Lecture Overview Wows and Woes of scaling The case of the Microprocessor How efficiently does a microprocessor
More informationSystem on Chip (SoC) Design
System on Chip (SoC) Design Moore s Law and Technology Scaling the performance of an IC, including the number components on it, doubles every 18-24 months with the same chip price... - Gordon Moore - 1960
More informationReal-Time Dynamic Voltage Hopping on MPSoCs
Real-Time Dynamic Voltage Hopping on MPSoCs Tohru Ishihara System LSI Research Center, Kyushu University 2009/08/05 The 9 th International Forum on MPSoC and Multicore 1 Background Low Power / Low Energy
More informationAn FPGA Based Adaptive Viterbi Decoder
An FPGA Based Adaptive Viterbi Decoder Sriram Swaminathan Russell Tessier Department of ECE University of Massachusetts Amherst Overview Introduction Objectives Background Adaptive Viterbi Algorithm Architecture
More informationOptimization of Task Scheduling and Memory Partitioning for Multiprocessor System on Chip
Optimization of Task Scheduling and Memory Partitioning for Multiprocessor System on Chip 1 Mythili.R, 2 Mugilan.D 1 PG Student, Department of Electronics and Communication K S Rangasamy College Of Technology,
More information