SYNTHESIZING LINEAR-ARRAY ALGORITHMS FROM NESTED FOR LOOP ALGORITHMS


Chapter 1 : Zvi Kedem - Research Output - NYU Scholars

Excerpt from Synthesizing Linear-Array Algorithms From Nested for Loop Algorithms: We will study linear systolic arrays in this paper, as linear arrays are attractive for their bounded I/O requirements and a simple global clock whose rate is independent of the size of the array.

Kedem - ACM Trans. Programming Languages and Systems: On shared memory parallel computers (SMPCs) it is natural to focus on decomposing the computation mainly by distributing the iterations of the nested Do-loops. In contrast, on distributed memory parallel computers (DMPCs) the decomposition of computation and the distribution of data must both be handled in order to balance the computation load and to minimize the migration of data. We propose and validate experimentally a method for handling computations and data synergistically to optimize the overall execution time. The method relies on a number of novel techniques, also presented in this paper. The intuition is that the dominant arrays are the ones whose migration would be the most expensive. Using the correspondence between iteration space mapping vectors and distributed dimensions of the dominant data array in each nested Do-loop, we are able to design algorithms for determin [...]

Because the number of iterations of a nested Do-loop is in general much larger than the number of PEs, a set of iterations called a tile is assigned to each PE, with the property that they can be execut [...]

In this paper, we generalize the parameter-based approach of Li and Wah [1] to map n-dimensional uniform recurrences to any k-dimensional processor arrays, where k < n. In our approach, operations of the target array are captured by a set of parameters, and constraints are derived to avoid computational conflicts and data collisions. We show that the optimal array for any objective function expressed in terms of these parameters can be found by a systematic enumeration over a polynomial search space. In contrast, previous attempts [2, 3] do not guarantee the optimality of the resulting designs. We illustrate our method with optimal single-pass linear arrays for the re-indexed Warshall-Floyd path-finding algorithm. Finally, we show the application of GPM to practical situations characterized by restriction on resources, such as processors or completion ti [...]

Data parallelism, in which the same operation is performed on many elements of an n-dimensional array, is one of the most powerful methods of extracting parallelism in scientific computation. One form of data parallelism involves defining a sequence of parallel wavefronts of a computation. Different wavefronts result in different performance, so the question arises of how to determine the wavefronts that result in the minimum computation time. Wavefront determination should also define the allocation of wavefront elements to processors. In this paper we present efficient algorithms for determining the optimum wavefront and for partitioning it into sections assigned to individual processors. The presented algorithms are applicable to computations that are defined over two- or higher-dimensional arrays and are executed on distributed memory machines interco [...]

Moldovan et al. [7] considered a linear transformation, T, of the algorithm to map it efficiently on a VLSI processor array. The linear transformation consists of two parts: [...]

Parallel and Distributed Systems: Processor arrays are frequently used to deliver high performance in many applications with computationally intensive operations. This paper presents the General Parameter Method (GPM), a systematic parameter-based approach for synthesizing such algorithm-specific architectures. GPM can synthesize processor arrays of any lower dimension from a uniform-recurrence description of the algorithm. The design objective is a general non-linear and non-monotonic user-specified function, and depends on attributes such as computation time of the recurrence on the processor array, completion time, load time, and drain time. In addition, bounds on some or all of these attributes can be specified. GPM performs an efficient search of polynomial complexity to find the optimal design satisfying the user-specified design constraints. As an illustration, we show how GPM can be used to find optimal linear processor arrays for computing transitive closures. We consider design objectives that minimize co [...]
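The abstracts above describe mapping loop iterations onto a processor array with linear functions: one giving the time step, one giving the processing element. The following is a minimal sketch of that idea and not the method of any of the cited papers. It assumes a 2-deep loop nest (a matrix-vector product over indices `i`, `j`), and the schedule vector `(1, 1)`, the allocation vector `(1, 0)`, and all function names are hypothetical choices made for illustration.

```python
from itertools import product


def map_iterations(n, m, schedule=(1, 1), allocation=(1, 0)):
    """Map each iteration (i, j) of an n-by-m loop nest to (time, processor).

    Both coordinates are linear functions of the loop indices; the
    particular vectors used as defaults are illustrative assumptions.
    """
    mapping = {}
    for i, j in product(range(n), range(m)):
        t = schedule[0] * i + schedule[1] * j      # time step
        p = allocation[0] * i + allocation[1] * j  # processor on the linear array
        mapping[(i, j)] = (t, p)
    return mapping


def is_conflict_free(mapping):
    """True if no two iterations share the same (time, processor) slot."""
    return len(set(mapping.values())) == len(mapping)


def respects_dependences(schedule, dependences):
    """True if every dependence vector d is scheduled strictly forward in time."""
    return all(schedule[0] * d[0] + schedule[1] * d[1] > 0 for d in dependences)


if __name__ == "__main__":
    mapping = map_iterations(3, 4)
    print(mapping[(2, 3)])            # (time, processor) of the last iteration
    print(is_conflict_free(mapping))
    # Assumed dependences: accumulation along j, operand propagation along i.
    print(respects_dependences((1, 1), [(0, 1), (1, 0)]))
```

A mapping that sends two iterations to the same processor at the same time is rejected by `is_conflict_free`; collisions on the data links, which the abstracts also mention, are not modeled in this sketch.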

Chapter 2 : CiteSeerX - Citation Query Synthesizing linear array algorithms from nested for loop algorithms

The mapping of algorithms structured as depth-p nested FOR loops into special-purpose systolic VLSI linear arrays is addressed. The mappings are done by using linear functions to transform the original sequential algorithms into a form suitable for parallel execution on linear arrays.

During the course of the last decade, a mathematical model for the parallelization of FOR-loops has become increasingly popular. In this model, a perfect nest of r FOR-loops is represented by a convex polytope in Z^r. The boundaries of each loop specify the extent of the polytope in a distinct dimension. These transformations have a very intuitive interpretation and can be easily quantified and automated due to their mathematical foundation in linear programming and linear algebra. With the recent availability of massively parallel computers, the idea of loop parallelization is gaining significance, since it promises execution speed-ups of orders of magnitude. The polytope model for loop parallelization has its origin in systolic design, but it applies in more general settings and methods based on it will become a part of futur [...]

A full-dimensional solution offers a maximum speed-up.

Cronquist, Paul Franklin: Configurable computing has captured the imagination of many architects who want the performance of application-specific hardware combined with the reprogrammability of general-purpose computers. Unfortunately, configurable computing has had rather limited success largely because the FPGAs on which they are built are more suited to implementing random logic than computing tasks. This paper presents RaPiD, a new coarse-grained FPGA architecture that is optimized for highly repetitive, computation-intensive tasks. Very deep application-specific computation pipelines can be configured in RaPiD. These pipelines make much more efficient use of silicon than traditional FPGAs and also yield much higher performance for a wide range of applications. RaPiD is not limited to implementing systolic arrays, however. For example, a pipeline can be constructed which comprises different computations at different stages and at different times.

RaPiDs can provide significantly higher performance than general purpose processors on a wide range of applications from the areas of video and signal processing, scientific computing, and communications. A RaPiD architecture is optimized for highly repetitive, computationally-intensive tasks. Very deep application-specific computation pipelines can be configured in RaPiDs that deliver very high performance for a wide range of applications. RaPiDs achieve this using a coarse-grained reconfigurable architecture that mixes the appropriate amount of static configuration with dynamic control. We describe the fundamental features of a RaPiD architecture, including the linear array of functional units, a programmable segmented bus structure, and a programmable control architecture. In addition, we outline the floorplan of the architecture and provide timing data for the most critical paths. We conclude with performance numbers for several applications on an instance of a RaPiD architecture.

The linear structure of the RaPiD datapath was shown in [...]

The goal of the RaPiD (Reconfigurable Pipelined Datapath) architecture is to provide high performance configurable computing for a range of computationally-intensive applications that demand special-purpose hardware.
This is accomplished by mapping the computation into a deep pipeline using a configurable array of coarse-grained computational units. A key feature of RaPiD is the combination of static and dynamic control. While the underlying computational pipelines are configured statically, a limited amount of dynamic control is provided which greatly increases the range and capability of applications that can be mapped to RaPiD. This paper illustrates this mapping and configuration for several important applications including a FIR filter, 2-D DCT, motion estimation, and parametric curve generation; it also shows how static and dynamic control are used to perform complex computations.

This paper presents the New Systolic Language as a general solution to the problem of systolic programming. The language provides a simple programming interface for systolic algorithms suitable for different hardware platforms and software simulators. The New Systolic Language hides the details and potential systolic data streams. Data flows and systolic cell programs for the co-processor are integrated with host functions, enabling a single file to specify a complete systolic program.

Configurable computers have attracted considerable attention recently because they promise to deliver the performance of application-specific hardware along with the flexibility of general-purpose computers. Unfortunately, configurable computing has had rather limited success to date. We believe that the FPGAs currently used to construct configurable computers are too general to achieve good cost-performance on computationally-intensive applications that demand special-purpose hardware. This paper describes a new architecture called RaPiD (Reconfigurable Pipelined Datapaths), which is optimized for highly repetitive, computationally-intensive tasks. Very deep application-specific computation pipelines can be configured in RaPiD that deliver very high performance for a wide range of applications. RaPiD achieves this using a coarse-grained reconfigurable architecture that mixes the appropriate amount of static configuration with dynamic control.

Lee and Kedem [2, 6] gave a set of necessary and sufficient conditions for the feasibility of a design and conditions to avoid data-link collisions when two data tokens contend for the same link si [...]

The set of index points along with the set of dependence vectors d [...]

Important steps towards a formal solution were first made by Lee and Kedem [8]. They presented the concept of data-link collisions (two data tokens contending for the same link simultaneously) and c [...]
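The transitive-closure computation that the GPM abstract uses as its illustration is itself a depth-3 nested loop, which is what makes it a natural candidate for array synthesis. As a hedged reference point, here is the plain sequential Warshall formulation in Python. It is not the re-indexed Warshall-Floyd variant the abstracts refer to, and the function name and the boolean adjacency-matrix representation are assumptions made for this sketch.

```python
def transitive_closure(adj):
    """Return the reachability matrix of a directed graph.

    `adj` is a square list of lists of booleans, where adj[i][j] is True
    when there is an edge from i to j. This is the textbook sequential
    triple loop, not a systolic or re-indexed version.
    """
    n = len(adj)
    reach = [row[:] for row in adj]  # copy so the input is left untouched
    for k in range(n):               # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach


if __name__ == "__main__":
    # A three-vertex chain 0 -> 1 -> 2.
    chain = [
        [False, True, False],
        [False, False, True],
        [False, False, False],
    ]
    for row in transitive_closure(chain):
        print(row)
```

Every iteration `(k, i, j)` reads values written in earlier `k` iterations, and these dependences are what a synthesized array has to respect when the loop nest is scheduled onto processors.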

Chapter 3 : ADVIS - Mathematical software - swmath

Synthesizing linear-array algorithms from nested for loop algorithms [P Lee, Z Kedem]. This is a reproduction of a book published before [...]

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The methodologies adopted for mapping these algorithms onto parallel hardware often use heuristic search that requires a lot of computational effort to obtain near-optimal solutions. The above is used to develop our proposed modified heuristic search to arrive at an optimal design, and the complexity comparisons are given. The MATLAB results of the new search and the design space trade-off analysis using the high-level synthesis tool are presented for two typical computationally intensive nested loop algorithms: the 6D FSBM and the 4D edge detection, alternatively known as the 2D filtering algorithm.

The management of complexity and tapping the full potential of these RSoC architectures present many challenges [1]. A large number of heuristic algorithms have been used in developing many novel scheduling and mapping algorithms [2-5]. However, these approaches face difficulties in dealing with large execution times. The systolic array design style can effectively exploit the parallelism inherent in the nested loop algorithm and, therefore, reduce processing time [2, 3]. Often heuristic procedures are used to search for the mapping transformations that are used to map the nested loop algorithms onto array architectures [4, 5]. Since the effort that goes into heuristic search is large and complex, the challenge lies in improving the process to reduce the computational effort in getting the mapping results.

Our main contribution in this paper is that we propose an augmented approach to the heuristic search. A new method of identifying the subspace to which the PE array is to be assigned is proposed based on the directional index of the computational expression that is explained in Section 2. The new vectors and terminologies used in the procedure are defined and elaborated in Section 2. The complexity analysis is performed by comparing the search space used in our method with the search space in [4]. The high-level synthesis tool GAUT is used to plot the design space trade-off curves to obtain the design space exploration curves.

The paper is organized as follows: The 4D nested loop formulation of the 2D filtering problem is explained in Section 4. The methodology and the implementation of the above approach for the 2D filtering algorithm and the mapping results are presented in Section 4. Section 7 discusses the complexity considerations and comparisons. Section 8 gives the conclusion and future work.
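To make the "4D nested loop formulation of the 2D filtering problem" concrete, the following is a minimal sketch of a 2D filter written as four nested loops: two over output positions and two over kernel taps. It is an illustration only and not the formulation from the paper. The function name, the list-of-lists image representation, and the choice to compute only fully covered ("valid") output positions without kernel flipping are assumptions.

```python
def filter2d(image, kernel):
    """Apply a 2D filter using a 4-deep loop nest.

    Outputs are produced only where the kernel fits entirely inside the
    image; the kernel is applied without flipping. Both choices are
    assumptions made for this sketch.
    """
    height, width = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows, out_cols = height - k_rows + 1, width - k_cols + 1
    out = [[0] * out_cols for _ in range(out_rows)]
    for y in range(out_rows):            # output row
        for x in range(out_cols):        # output column
            for u in range(k_rows):      # kernel row
                for v in range(k_cols):  # kernel column
                    out[y][x] += kernel[u][v] * image[y + u][x + v]
    return out


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernel = [[1, 0], [0, 1]]
    print(filter2d(image, kernel))
```

The four loop indices `(y, x, u, v)` span the 4D iteration space; a mapping procedure of the kind described above would choose which of these dimensions is assigned to the PE array and which are executed over time.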

Chapter 4 : Mapping of recursive algorithms onto multi-rate arrays - CORE

During the course of the last decade, a mathematical model for the parallelization of FOR-loops has become increasingly popular. In this model, a (perfect) nest of r FOR-loops is represented by a convex polytope in Z^r.

Research reported in more than 50 scientific publications. Has served on program committees of scientific conferences and on editorial boards of scientific journals. Guided more than 15 doctoral dissertations. Has a total of more than doctoral descendants. A complete set of PowerPoint presentations for an introductory Database Management Systems class is available at https: They may be used in lectures, in individual study, downloaded, and printed. However, they may not be modified or material extracted from them or incorporated elsewhere without prior written permission.

Optimal surface reconstruction from planar contours. Communications of the ACM, Citations including those in more than distinct journals; majority not in Computer Science: Consistency in hierarchical data base systems. Journal of the ACM, With preliminary version as: Controlling concurrency using locking protocols. On visible surface generation by a-priori tree structures. From exclusive to shared locks. Journal of the ACM, Non-two-phase locking protocols with shared and exclusive locks. Synthesizing linear-array algorithms from nested for loop algorithms. Mapping nested loop algorithms into multi-dimensional systolic arrays. Efficient robust parallel computations. Combining tentative and definite executions for very fast dependable parallel computing. Efficient program transformations for resilient parallel computation via randomization. Parallel processing on networks of workstations: A fault-tolerant high performance approach. Parallel suffix-prefix-matching algorithm and applications. A novel software system for fault-tolerant parallel processing on distributed platforms. An infrastructure for network computing with Java applets. Practice and Experience, An infrastructure for distributed web applications. Metacomputing on the Web. Future Generation Computer Systems, An efficient algorithm for discovering the maximum frequent set. Data image management via emulation of non-volatile storage device.

Chapter 5 : CiteSeerX - Citation Query Mapping nested loop algorithms into multidimensional systolic arrays

The mapping of algorithms structured as depth-p nested FOR loops into special-purpose systolic VLSI linear arrays is addressed. The mappings are done by us [...]

Chapter 6 : Zvi M. Kedem: Brief CV

We will study linear systolic arrays in this paper, as linear arrays are attractive for their bounded I/O requirements and a simple global clock whose rate is independent of the size of the array. We will consider the important class of algorithms structured as (depth) p nested for loops.

Chapter 7 : Synthesizing Linear-Array Algorithms From Nested for Loop Algorithms

The mapping of algorithms structured as depth-p nested FOR loops into special-purpose systolic VLSI linear arrays is addressed. The mappings are done by using linear functions to transform the [...]

Chapter 8 : Mapping Nested Loop Algorithms Into Multi-Dimensional Systolic Arrays

Synthesizing linear-array algorithms from nested for loop algorithms

Rapid: A Configurable Architecture for Compute-Intensive Applications

Rapid: A Configurable Architecture for Compute-Intensive Applications. Carl Ebeling, Dept. of Computer Science and Engineering, University of Washington. Alternatives for High-Performance Systems: ASIC: Use application-specific

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION 1 CHAPTER 1 INTRODUCTION 1.1 Advance Encryption Standard (AES) Rijndael algorithm is symmetric block cipher that can process data blocks of 128 bits, using cipher keys with lengths of 128, 192, and 256

A Study of High Performance Computing and the Cray SV1 Supercomputer. Michael Sullivan TJHSST Class of 2004

A Study of High Performance Computing and the Cray SV1 Supercomputer. Michael Sullivan TJHSST Class of 2004 A Study of High Performance Computing and the Cray SV1 Supercomputer Michael Sullivan TJHSST Class of 2004 June 2004 0.1 Introduction A supercomputer is a device for turning compute-bound problems into

Workloads Programmierung Paralleler und Verteilter Systeme (PPV)

Workloads Programmierung Paralleler und Verteilter Systeme (PPV) Workloads Programmierung Paralleler und Verteilter Systeme (PPV) Sommer 2015 Frank Feinbube, M.Sc., Felix Eberhardt, M.Sc., Prof. Dr. Andreas Polze Workloads 2 Hardware / software execution environment

Mapping Algorithms onto a Multiple-Chip Data-Driven Array

Mapping Algorithms onto a Multiple-Chip Data-Driven Array Mapping Algorithms onto a MultipleChip DataDriven Array Bilha Mendelson IBM Israel Science & Technology Matam Haifa 31905, Israel bilhaovnet.ibm.com Israel Koren Dept. of Electrical and Computer Eng. University

ASSIGNMENT- I Topic: Functional Modeling, System Design, Object Design. Submitted by, Roll Numbers:-49-70

ASSIGNMENT- I Topic: Functional Modeling, System Design, Object Design. Submitted by, Roll Numbers:-49-70 ASSIGNMENT- I Topic: Functional Modeling, System Design, Object Design Submitted by, Roll Numbers:-49-70 Functional Models The functional model specifies the results of a computation without specifying

FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS. Waqas Akram, Cirrus Logic Inc., Austin, Texas

FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS. Waqas Akram, Cirrus Logic Inc., Austin, Texas FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS Waqas Akram, Cirrus Logic Inc., Austin, Texas Abstract: This project is concerned with finding ways to synthesize hardware-efficient digital filters given

DEPARTMENT OF COMPUTER SCIENCE

DEPARTMENT OF COMPUTER SCIENCE Department of Computer Science 1 DEPARTMENT OF COMPUTER SCIENCE Office in Computer Science Building, Room 279 (970) 491-5792 cs.colostate.edu (http://www.cs.colostate.edu) Professor L. Darrell Whitley,

1 Introduction. 1.1 Raster-to-vector conversion

1 Introduction. 1.1 Raster-to-vector conversion 1 Introduction 1.1 Raster-to-vector conversion Vectorization (raster-to-vector conversion) consists of analyzing a raster image to convert its pixel representation to a vector representation The basic

A Novel Design Framework for the Design of Reconfigurable Systems based on NoCs

A Novel Design Framework for the Design of Reconfigurable Systems based on NoCs Politecnico di Milano & EPFL A Novel Design Framework for the Design of Reconfigurable Systems based on NoCs Vincenzo Rana, Ivan Beretta, Donatella Sciuto Donatella Sciuto sciuto@elet.polimi.it Introduction

A CELLULAR, LANGUAGE DIRECTED COMPUTER ARCHITECTURE. (Extended Abstract) Gyula A. Magó. University of North Carolina at Chapel Hill

A CELLULAR, LANGUAGE DIRECTED COMPUTER ARCHITECTURE (Extended Abstract) Gyula A. Magó, University of North Carolina at Chapel Hill. Abstract: If a VLSI computer architecture is to influence the field

Design of Parallel Algorithms. Models of Parallel Computation

Design of Parallel Algorithms. Models of Parallel Computation + Design of Parallel Algorithms Models of Parallel Computation + Chapter Overview: Algorithms and Concurrency n Introduction to Parallel Algorithms n Tasks and Decomposition n Processes and Mapping n Processes

Part IV. Chapter 15 - Introduction to MIMD Architectures

Part IV. Chapter 15 - Introduction to MIMD Architectures D. Sima, T. J. Fountain, P. Kacsuk Advanced Computer Architectures Part IV. Chapter 15 - Introduction to MIMD Architectures Thread and process-level parallel architectures are typically realised by MIMD (Multiple

Novel Lossy Compression Algorithms with Stacked Autoencoders

Novel Lossy Compression Algorithms with Stacked Autoencoders Novel Lossy Compression Algorithms with Stacked Autoencoders Anand Atreya and Daniel O Shea {aatreya, djoshea}@stanford.edu 11 December 2009 1. Introduction 1.1. Lossy compression Lossy compression is

March 10, Distributed Hash-based Lookup. for Peer-to-Peer Systems. Sandeep Shelke Shrirang Shirodkar MTech I CSE

March 10, Distributed Hash-based Lookup. for Peer-to-Peer Systems. Sandeep Shelke Shrirang Shirodkar MTech I CSE for for March 10, 2006 Agenda for Peer-to-Peer Sytems Initial approaches to Their Limitations CAN - Applications of CAN Design Details Benefits for Distributed and a decentralized architecture No centralized

IMAGE PROCESSING USING DISCRETE WAVELET TRANSFORM

IMAGE PROCESSING USING DISCRETE WAVELET TRANSFORM IMAGE PROCESSING USING DISCRETE WAVELET TRANSFORM Prabhjot kour Pursuing M.Tech in vlsi design from Audisankara College of Engineering ABSTRACT The quality and the size of image data is constantly increasing.

Co-synthesis and Accelerator based Embedded System Design

Co-synthesis and Accelerator based Embedded System Design Co-synthesis and Accelerator based Embedded System Design COE838: Embedded Computer System http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer

Contemporary Design. Traditional Hardware Design. Traditional Hardware Design. HDL Based Hardware Design User Inputs. Requirements.

Contemporary Design. Traditional Hardware Design. Traditional Hardware Design. HDL Based Hardware Design User Inputs. Requirements. Contemporary Design We have been talking about design process Let's now take next steps into examining in some detail Increasing complexities of contemporary systems Demand the use of increasingly powerful

AN HIERARCHICAL APPROACH TO HULL FORM DESIGN

AN HIERARCHICAL APPROACH TO HULL FORM DESIGN AN HIERARCHICAL APPROACH TO HULL FORM DESIGN Marcus Bole and B S Lee Department of Naval Architecture and Marine Engineering, Universities of Glasgow and Strathclyde, Glasgow, UK 1 ABSTRACT As ship design

Systolic Arrays for Reconfigurable DSP Systems

Systolic Arrays for Reconfigurable DSP Systems Systolic Arrays for Reconfigurable DSP Systems Rajashree Talatule Department of Electronics and Telecommunication G.H.Raisoni Institute of Engineering & Technology Nagpur, India Contact no.-7709731725

Introduction to Electronic Design Automation. Model of Computation. Model of Computation. Model of Computation

Introduction to Electronic Design Automation. Model of Computation. Model of Computation. Model of Computation Introduction to Electronic Design Automation Model of Computation Jie-Hong Roland Jiang 江介宏 Department of Electrical Engineering National Taiwan University Spring 03 Model of Computation In system design,

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION CHAPTER 1 INTRODUCTION Rapid advances in integrated circuit technology have made it possible to fabricate digital circuits with large number of devices on a single chip. The advantages of integrated circuits

Implementation Techniques

Implementation Techniques V Implementation Techniques 34 Efficient Evaluation of the Valid-Time Natural Join 35 Efficient Differential Timeslice Computation 36 R-Tree Based Indexing of Now-Relative Bitemporal Data 37 Light-Weight

Patterns for! Parallel Programming!

Patterns for! Parallel Programming! Lecture 4! Patterns for! Parallel Programming! John Cavazos! Dept of Computer & Information Sciences! University of Delaware!! www.cis.udel.edu/~cavazos/cisc879! Lecture Overview Writing a Parallel Program

System Verification of Hardware Optimization Based on Edge Detection

System Verification of Hardware Optimization Based on Edge Detection Circuits and Systems, 2013, 4, 293-298 http://dx.doi.org/10.4236/cs.2013.43040 Published Online July 2013 (http://www.scirp.org/journal/cs) System Verification of Hardware Optimization Based on Edge Detection

AN ABSTRACTION-BASED METHODOLOGY FOR MECHANICAL CONFIGURATION DESIGN

AN ABSTRACTION-BASED METHODOLOGY FOR MECHANICAL CONFIGURATION DESIGN AN ABSTRACTION-BASED METHODOLOGY FOR MECHANICAL CONFIGURATION DESIGN by Gary Lee Snavely A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Mechanical

ASIC, Customer-Owned Tooling, and Processor Design

ASIC, Customer-Owned Tooling, and Processor Design ASIC, Customer-Owned Tooling, and Processor Design Design Style Myths That Lead EDA Astray Nancy Nettleton Manager, VLSI ASIC Device Engineering April 2000 Design Style Myths COT is a design style that

Fundamental Concepts of Parallel Programming

Abstract: The concepts behind programming methodologies and techniques are always under development, becoming more complex and flexible to meet changing computing

Spatial Data Structures

CSCI 420 Computer Graphics, Lecture 17: Spatial Data Structures. Jernej Barbic, University of Southern California. Hierarchical Bounding Volumes, Regular Grids, Octrees, BSP Trees [Angel Ch. 8]. Ray Tracing Acceleration

Hardware Modeling using Verilog Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Lecture 01: Introduction. Welcome to the course on Hardware

Unit 2: High-Level Synthesis

Course contents: Hardware modeling, Data flow, Scheduling/allocation/assignment. Reading: Chapter 11. High-Level Synthesis (HLS): Hardware-description language (HDL) synthesis

Dense LU Factorization

Dr. N. Sairam & Dr. R. Seethalakshmi, School of Computing, SASTRA University, Thanjavur-613401. Joint Initiative of IITs and IISc, Funded by MHRD. Contents: 1. Dense LU Factorization...

Spatial Data Structures

CSCI 480 Computer Graphics, Lecture 7: Spatial Data Structures. Hierarchical Bounding Volumes, Regular Grids, BSP Trees [Ch. 0.] March 8, 0 Jernej Barbic, University of Southern California http://www-bcf.usc.edu/~jbarbic/cs480-s/

modefrontier: Successful technologies for PIDO

The acronym PIDO stands for Process Integration and Design Optimization. In a few words, a PIDO can be described as a tool that allows the effective management

Integer Programming ISE 418. Lecture 7. Dr. Ted Ralphs

Reading for This Lecture: Nemhauser and Wolsey Sections II.3.1, II.3.6, II.4.1, II.4.2, II.5.4; Wolsey Chapter 7; CCZ Chapter 1. Constraint

Parallel Query Optimisation

Contents: Objectives of parallel query optimisation; Parallel query optimisation; Two-Phase optimisation; One-Phase optimisation; Inter-operator parallelism oriented optimisation

EMO A Real-World Application of a Many-Objective Optimisation Complexity Reduction Process

EMO 2013. Robert J. Lygoe, Mark Cary, and Peter J. Fleming, 22-March-2013. Contents: Introduction, Background, Process Enhancements

CAD Algorithms. Circuit Partitioning

Mohammad Tehranipoor, ECE Department, 13 October 2008. Circuit Partitioning. Partitioning: The process of decomposing a circuit/system into smaller subcircuits/subsystems, which

Introduction to Formal Methods

2008 Spring, Software Special Development 1. Introduction to Formal Methods, Part I: Formal Specification. JUNBEOM YOO, jbyoo@knokuk.ac.kr. Reference: A Specifier's Introduction to Formal Methods, Jeannette

Optimal Partition with Block-Level Parallelization in C-to-RTL Synthesis for Streaming Applications

Authors: Shuangchen Li, Yongpan Liu, X. Sharon Hu, Xinyu He, Pei Zhang, and Huazhong Yang. 2013/01/23. Outline

Software Architecture

Does software architecture global design?, architect designer? Overview: What is it, why bother? Architecture Design; Viewpoints and view models; Architectural styles; Architecture assessment

Spatial Data Structures

15-462 Computer Graphics I, Lecture 17: Spatial Data Structures. Hierarchical Bounding Volumes, Regular Grids, Octrees, BSP Trees, Constructive Solid Geometry (CSG). April 1, 2003 [Angel 9.10]. Frank Pfenning, Carnegie

Lecture 7: Introduction to Co-synthesis Algorithms

Design & Co-design of Embedded Systems. Sharif University of Technology, Computer Engineering Dept., Winter-Spring 2008. Mehdi Modarressi. Topics for today

Patterns for Parallel Programming II

Lecture 4: Patterns for Parallel Programming II. John Cavazos, Dept of Computer & Information Sciences, University of Delaware (www.cis.udel.edu/~cavazos/cisc879). Task Decomposition: Also known as functional

Hardware/Software Co-design

Zebo Peng, Department of Computer and Information Science (IDA), Linköping University. Course page: http://www.ida.liu.se/~petel/codesign/ Lecture 1/2: Outline : an Introduction

WORKFLOW ENGINE FOR CLOUDS

By SURAJ PANDEY, DILEBAN KARUNAMOORTHY, and RAJKUMAR BUYYA. Prepared by: Dr. Faramarz Safi, Islamic Azad University, Najafabad Branch, Esfahan, Iran. Task Computing: Task computing

FUTURE communication networks are expected to support

IEEE/ACM TRANSACTIONS ON NETWORKING, VOL 13, NO 5, OCTOBER 2005. A Scalable Approach to the Partition of QoS Requirements in Unicast and Multicast. Ariel Orda, Senior Member, IEEE, and Alexander Sprintson,

Performance of Multicore LUP Decomposition

Nathan Beckmann, Silas Boyd-Wickizer. May 3, 00. ABSTRACT: This paper evaluates the performance of four parallel LUP decomposition implementations. The implementations

Spatial Data Structures

15-462 Computer Graphics I, Lecture 17: Spatial Data Structures. Hierarchical Bounding Volumes, Regular Grids, Octrees, BSP Trees, Constructive Solid Geometry (CSG). March 28, 2002 [Angel 8.9]. Frank Pfenning, Carnegie

Addressing Verification Bottlenecks of Fully Synthesized Processor Cores using Equivalence Checkers

Subash Chandar G (g-chandar1@ti.com), Vaideeswaran S (vaidee@ti.com), DSP Design, Texas Instruments India

416 Distributed Systems. Distributed File Systems 4 Jan 23, 2017

Today's Lecture: Wrap up NFS/AFS. This lecture: other types of DFS; Coda disconnected operation. Key Lessons: Distributed filesystems almost

Parallel Programming Patterns Overview and Concepts

Partners. Funding. Reusing this material: This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

COE 561 Digital System Design & Synthesis Introduction

Dr. Aiman H. El-Maleh, Computer Engineering Department, King Fahd University of Petroleum & Minerals. Outline Course Topics Microelectronics Design

Basic Idea. The routing problem is typically solved using a two-step

Global Routing. Basic Idea: The routing problem is typically solved using a two-step approach: Global Routing: Define the routing regions. Generate a tentative route for each net. Each net is assigned to a

Dependence Vectors and Fast Search of Systolic Mapping for Computationally Intensive Image Processing Algorithms

Bala Tripura Sundari B. Abstract: 2-D convolution in image processing and Full Search Block

Hierarchical Intelligent Cuttings: A Dynamic Multi-dimensional Packet Classification Algorithm

CHAPTER 5. 1 Introduction: We saw in the previous chapter that real-life classifiers exhibit structure and

The future is parallel but it may not be easy

Michael J. Flynn, Maxeler and Stanford University. HiPC Dec 07. Outline I: The big technology tradeoffs: area, time, power. HPC: What's new at the

Effective Memory Access Optimization by Memory Delay Modeling, Memory Allocation, and Slack Time Management

International Journal of Computer Theory and Engineering, Vol., No., December 01. Effective Memory Optimization by Memory Delay Modeling, Memory Allocation, and Slack Time Management. Sultan Daud Khan, Member,

Algorithms. Lecture Notes 5

Dynamic Programming for Sequence Comparison. The linear structure of the Sequence Comparison problem immediately suggests a dynamic programming approach. Naturally, our sub-instances

Volume 5, Issue 5 OCT 2016

DESIGN AND IMPLEMENTATION OF REDUNDANT BASIS HIGH SPEED FINITE FIELD MULTIPLIERS. Vakkalakula Bharathsreenivasulu (1), G. Divya Praneetha (2). (1) PG Scholar, Dept of VLSI & ES, G.Pullareddy Eng College, Kurnool

All MSEE students are required to take the following two core courses: Linear systems Probability and Random Processes

MSEE Curriculum. All MSEE students are required to take the following two core courses: 3531-571 Linear systems; 3531-507 Probability and Random Processes. The course requirements for students majoring in

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume VII /Issue 2 / OCT 2016

NEW VLSI ARCHITECTURE FOR EXPLOITING CARRY-SAVE ARITHMETIC USING VERILOG HDL. B. Anusha (1), Ch. Ramesh (2). shivajeehul@gmail.com (1), chintala12271@rediffmail.com (2). (1) PG Scholar, Dept of ECE, Ganapathy Engineering

Leveraging Set Relations in Exact Set Similarity Join

Xubo Wang, Lu Qin, Xuemin Lin, Ying Zhang, and Lijun Chang. University of New South Wales, Australia; University of Technology Sydney, Australia. {xwang,lxue,ljchang}@cse.unsw.edu.au,

Milind Kulkarni Research Statement

With the increasing ubiquity of multicore processors, interest in parallel programming is again on the upswing. Over the past three decades, languages and compilers researchers

Evaluation of Power Consumption of Modified Bubble, Quick and Radix Sort, Algorithm on the Dual Processor

Ahmed M. Aliyu *1, Dr. P. B. Zirra *2. 1 Post Graduate Student *1,2, Computer Science Department, Adamawa State

ESE535: Electronic Design Automation. Today. LUT Mapping. Simplifying Structure. Preclass: Cover in 4-LUT?

Day 7: February, 0. Clustering (LUT Mapping, Delay). Today: How do we map to LUTs; What happens when IO dominates, Delay dominates; Lessons for non-LUTs for delay-oriented

Research Article A Two-Level Cache for Distributed Information Retrieval in Search Engines

The Scientific World Journal, Volume 2013, Article ID 596724, 6 pages, http://dx.doi.org/10.1155/2013/596724 Weizhe

Massively Parallel Computation for Three-Dimensional Monte Carlo Semiconductor Device Simulation

SIMULATION OF SEMICONDUCTOR DEVICES AND PROCESSES Vol. 4. Edited by W. Fichtner, D. Aemmer. Zurich (Switzerland), September 12-14, 1991. Hartung-Gorre

ADAPTIVE AND DYNAMIC LOAD BALANCING METHODOLOGIES FOR DISTRIBUTED ENVIRONMENT

PhD Summary. DOCTORATE OF PHILOSOPHY IN COMPUTER SCIENCE & ENGINEERING. By Sandip Kumar Goyal (09-PhD-052). Under the Supervision

Large-Scale Network Simulation Scalability and an FPGA-based Network Simulator

Stanley Bak. Abstract: Network algorithms are deployed on large networks, and proper algorithm evaluation is necessary to avoid

Behavioral Array Mapping into Multiport Memories Targeting Low Power 3

Preeti Ranjan Panda and Nikil D. Dutt, Department of Information and Computer Science, University of California, Irvine, CA 92697-3425,

A Lost Cycles Analysis for Performance Prediction using High-Level Synthesis

Bruno da Silva, Jan Lemeire, An Braeken, and Abdellah Touhafi. Vrije Universiteit Brussel (VUB), INDI and ETRO department, Brussels,

IN5050: Programming heterogeneous multi-core processors Thinking Parallel

28/8-2018. Designing and Building Parallel Programs: Ian Foster's framework proposal; develop intuition as to what constitutes a good

Global Solution of Mixed-Integer Dynamic Optimization Problems

European Symposium on Computer Aided Process Engineering 15. L. Puigjaner and A. Espuña (Editors). 25 Elsevier Science B.V. All rights reserved.

CSE 190D Spring 2017 Final Exam Answers

Q 1. [20pts] For the following questions, clearly circle True or False. 1. The hash join algorithm always has fewer page I/Os compared to the block nested loop join

Parallel Computing. Slides credit: M. Quinn book (chapter 3 slides), A Grama book (chapter 3 slides)

Parallel Computing 2012. Parallel Algorithm Design. Outline: Computational Model; Design Methodology; Partitioning; Communication

The Encoding Complexity of Network Coding

Michael Langberg, Alexander Sprintson, Jehoshua Bruck. California Institute of Technology. Email: {mikel,spalex,bruck}@caltech.edu. Abstract: In the multicast network

Principles of Data Management. Lecture #14 (Spatial Data Management)

Instructor: Mike Carey, mjcarey@ics.uci.edu. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke. Today's Notable News: Project

Visualization and Analysis of Inverse Kinematics Algorithms Using Performance Metric Maps

Oliver Cardwell, Ramakrishnan Mukundan. Department of Computer Science and Software Engineering, University of Canterbury

4.12 Generalization. In back-propagation learning, as many training examples as possible are typically used.

It is hoped that the network so designed generalizes well. A network generalizes well when

Overview of the Class

Copyright 2015, Pedro C. Diniz, all rights reserved. Students enrolled in the Compilers class at the University of Southern California (USC) have explicit permission to make copies

Copyright 2007 Society of Photo-Optical Instrumentation Engineers. This paper was published in Proceedings of SPIE (Proc. SPIE Vol. 6937, 69370N, DOI: http://dx.doi.org/10.1117/12.784572) and is made

Embedded Systems Design with Platform FPGAs

Spatial Design. Ron Sass and Andrew G. Schmidt, http://www.rcs.uncc.edu/ rsass, University of North Carolina at Charlotte, Spring 2011

Full file at

Chapter 2 Data Warehousing. True-False Questions. 1. A real-time, enterprise-level data warehouse combined with a strategy for its use in decision support can leverage data to provide massive financial benefits

SPARK: A Parallelizing High-Level Synthesis Framework

Sumit Gupta, Rajesh Gupta, Nikil Dutt, Alex Nicolau. Center for Embedded Computer Systems, University of California, Irvine and San Diego. http://www.cecs.uci.edu/~spark

Theorem 2.9: nearest addition algorithm

There are severe limits on our ability to compute near-optimal tours. It is NP-complete to decide whether a given undirected graph G = (V, E) has a Hamiltonian cycle. An approximation algorithm for the TSP can be used

PROBLEM FORMULATION AND RESEARCH METHODOLOGY

ON THE SOFT COMPUTING BASED APPROACHES FOR OBJECT DETECTION AND TRACKING IN VIDEOS. CHAPTER 3. The foregoing chapter

Chapter 2 Overview of the Design Methodology

This chapter presents an overview of the design methodology which is developed in this thesis, by identifying global abstraction levels at which a distributed

Video Alignment. Final Report. Spring 2005 Prof. Brian Evans Multidimensional Digital Signal Processing Project The University of Texas at Austin

Omer Shakil. Abstract: This report describes a method to align two videos.

AOSA - Betriebssystemkomponenten und der Aspektmoderatoransatz

Results obtained by researchers in aspect-oriented programming are promoting the aim to export these ideas to whole software development

YOUR APPLICATION'S JOURNEY TO THE CLOUD. What's the best way to get cloud native capabilities for your existing applications?

Introduction: Moving applications to cloud is a priority for many IT organizations.

CS137: Electronic Design Automation

Day 4: January 16, 2002. Clustering (LUT Mapping, Delay). Today: How do we map to LUTs? What happens when delay dominates? Lessons for non-LUTs for delay-oriented partitioning

Additional Slides to De Micheli Book

Sungho Kang, Yonsei University. Design Style - Decomposition. Behavioral Synthesis: Resource allocation; Pipelining; Control flow parallelization; Communicating

RETRACTED ARTICLE. Web-Based Data Mining in System Design and Implementation. Open Access. Jianhu Gong 1* and Jianzhi Gong 2

The Open Automation and Control Systems Journal, 2014, 6, 1907-1911

Lecture 9: MIMD Architectures

Introduction and classification; Symmetric multiprocessors; NUMA architecture; Clusters. Zebo Peng, IDA, LiTH. Introduction: A set of general purpose processors is connected together.

Mid-Year Report. Discontinuous Galerkin Euler Equation Solver. Friday, December 14, 2012. Andrey Andreyev. Advisor: Dr. James Baeder. Abstract: The focus of this effort is to produce a two dimensional inviscid,

A Novel Approach to Planar Mechanism Synthesis Using HEEDS

A Novel Approach to Planar Mechanism Synthesis Using HEEDS AB-2033 Rev. 04.10 A Novel Approach to Planar Mechanism Synthesis Using HEEDS John Oliva and Erik Goodman Michigan State University Introduction The problem of mechanism synthesis (or design) is deceptively

More information

Design For High Performance Flexray Protocol For Fpga Based System

IOSR Journal of VLSI and Signal Processing (IOSR-JVSP), e-ISSN: 2319-4200, p-ISSN No.: 2319-4197, PP 83-88, www.iosrjournals.org. E. Singaravelan

Math 32, August 20: Review & Parametric Equations

Section 1: Review. This course will continue the development of the Calculus tools started in Math 30 and Math 31. The primary difference between this course

these developments has been in the field of formal methods. Such methods, typically given by a

PCX: A Translation Tool from PROMELA/Spin to the C-Based Stochastic Petri Net Language. Abstract: Stochastic Petri Nets (SPNs) are a graphical tool for the formal description of systems with the features of