A Freely Configurable Audio-Mixing Engine with Automatic Loadbalancing
M. Rosenthal, M. Klebl, A. Gunzinger, G. Tröster
Electronics Laboratory, Swiss Federal Institute of Technology, CH-8092 Zurich, Switzerland
March 7, 1995

Abstract
The most important design issue for digital audio mixing consoles is the communication concept used to interconnect an array of signal processors. This paper demonstrates the implementation of a digital mixing console that is capable of routing up to 100 audio signal-paths. The audio algorithms run on a modular DSP network. Signal-paths can be edited with a graphical user interface and are automatically mapped onto the DSP network with optimal load-balancing.

1 Introduction
The typical architecture of a digital audio mixing console is shown in Figure 1. The most obvious difference from an analog mixing console is the clear separation of mixing-desk and mixing-engine. The mixing-engine performs the digital signal processing and is completely controlled by the mixing-desk.

A digital mixing console can be described as a network of signal-paths in which a large number of digital audio functions are combined. Each function processes some input signals and produces some output signals. The output signals of each function are used as input signals for other functions. This network can be described as a signal flow-graph in which every function has at least one input and one output. The sources and drains of the flow-graph are the audio inputs and outputs of the entire mixing console. The mixing-desk controls the functions directly through a set of parameters. For example, a scale function scales a particular audio signal according to the position of a slider on the mixing-desk.

The huge amount of processing power that a digital mixing console demands can only be delivered by an array of processors. Such an array needs high-bandwidth interprocessor communication to exchange data between audio functions.
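
The flow-graph view described above can be made concrete with a few lines of Python. This is only an illustrative sketch, not the engine's actual data structure: the class `Function`, the helper `run_sample` and the signal names are hypothetical, and the functions are assumed to be listed in dependency order.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Function:
    """One audio function, i.e. one node of the signal flow-graph (hypothetical)."""
    name: str
    inputs: List[str]    # names of the signals it consumes
    outputs: List[str]   # names of the signals it produces
    process: Callable[[List[float], Dict[str, float]], List[float]]
    params: Dict[str, float] = field(default_factory=dict)  # set by the mixing-desk


def scale(ins, params):
    """Scale one signal by the gain parameter (the slider position)."""
    return [ins[0] * params["gain"]]


def mix(ins, params):
    """Sum all input signals into one output signal."""
    return [sum(ins)]


def run_sample(functions, sources):
    """Evaluate the graph for one sample; functions must be in dependency order."""
    signals = dict(sources)  # the sources of the graph are the audio inputs
    for f in functions:
        ins = [signals[name] for name in f.inputs]
        signals.update(zip(f.outputs, f.process(ins, f.params)))
    return signals


graph = [
    Function("scale_a", ["in_a"], ["a_scaled"], scale, {"gain": 0.5}),
    Function("scale_b", ["in_b"], ["b_scaled"], scale, {"gain": 0.25}),
    Function("mix", ["a_scaled", "b_scaled"], ["out"], mix),
]
result = run_sample(graph, {"in_a": 1.0, "in_b": 2.0})
print(result["out"])  # 0.5 * 1.0 + 0.25 * 2.0 = 1.0
```

Here `in_a` and `in_b` play the role of sources and `out` the role of a drain; changing a `gain` entry corresponds to moving a slider on the mixing-desk.
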
The communication network is often designed directly in hardware, because software-controlled communication either does not satisfy the strict speed requirements or uses too much processing power. As a result, the communication network becomes very inflexible, and changes in the digital data flow of audio signals are difficult. To tackle the lack of routability and communication speed of the interprocessor network, this paper presents a new, highly flexible communication structure. It has a bandwidth of 800 MBit/sec and can communicate up to 512 internal audio channels autonomously without decreasing the processing power.

2 System Requirements
As mentioned above, audio functions need to be spread among an array of processors. To provide an efficient distribution, a homogeneous architecture of processing elements (PEs) is required. In other words, all PEs have the same capabilities and there is no master processor. Under this condition a load-balancing analysis can be performed in which the required processing time of each function is measured. The goal is to find the best possible way of spreading the functions among the processing elements while using as few PEs as possible. Such a system works with optimal load-balancing.

The next design issue is to find a way of mapping the signal flow-graph of the functions onto the existing hardware. For a fully routable system, one that allows any interconnection between two or more functions running on any processor in the system, the communication structure of the interprocessor network must be orthogonal. This means that each processor must be able to access all data produced by all other processors.

The last important requirement is that interprocessor communication runs independently and does not consume processing power.

3 System Architecture
Figure 2 shows the system architecture of the mixing-engine.
One major goal of the project was to build a fully scalable system in which the number of PEs can be anything between 1 and 100. This corresponds directly to the fact that the required processing power varies with the number of functions describing a mixing console. For reasons of scalability the chosen network topology is a ring. Buses and crossbars are other possible network topologies. However, a bus can establish only one connection at a time and must be arbitrated. A crossbar of order N can establish N connections at a time and therefore offers the best connectivity, but a scalable system with a crossbar network can hardly be realized. Other systems with a ring architecture have been built, such as the WARP [1], the iWarp and the RAP [2].
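
The placement and mapping requirements of Section 2 can be sketched as follows. The paper does not state which placement algorithm is used, so this example swaps in first-fit decreasing, a common bin-packing heuristic that is not guaranteed to be optimal. The function names, the cost figures and the budget are all made up for illustration.

```python
def place_functions(costs, budget):
    """First-fit decreasing: put each function on the first PE with room.

    costs  -- {function name: measured processing time per sample}
    budget -- processing time available per PE and sample (same unit)
    Returns {function name: PE index}.
    """
    loads = []      # accumulated load of every PE used so far
    placement = {}
    # Largest functions first; ties are broken by name for determinism.
    for name, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        if cost > budget:
            raise ValueError(f"{name} does not fit on a single PE")
        for pe, load in enumerate(loads):
            if load + cost <= budget:
                loads[pe] += cost
                placement[name] = pe
                break
        else:
            # No existing PE has room, so one more PE is added.
            loads.append(cost)
            placement[name] = len(loads) - 1
    return placement


def classify_connections(edges, placement):
    """Split connections into local ones and ones that cross processors."""
    local = [(a, b) for a, b in edges if placement[a] == placement[b]]
    crossing = [(a, b) for a, b in edges if placement[a] != placement[b]]
    return local, crossing


# Hypothetical channel strip with invented costs (arbitrary time units).
costs = {"eq": 6, "comp": 5, "fader": 2, "pan": 2, "sum": 4}
edges = [("eq", "comp"), ("comp", "fader"), ("fader", "pan"), ("pan", "sum")]

placement = place_functions(costs, budget=10)
local, crossing = classify_connections(edges, placement)
print(max(placement.values()) + 1)  # number of PEs used: 2
print(len(local), len(crossing))    # 2 local, 2 interprocessor connections
```

Only the connections in `crossing` would need the interprocessor network; a placement that also tries to minimize that list would be a natural refinement of this sketch.
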

3.1 Communication Concept
Every PE has its own communication controller (CC), which is responsible for the data flow on the ring bus. The data to be transferred passes every CC in a strictly ordered fashion, value by value. Every controller is programmed to insert data from its PE at a certain position in the data stream. It also copies data coming from other PEs out of the data stream and stores it for its own PE. This is a special kind of independent time-division multiplexed (TDM) bus. In order to reduce bandwidth while still meeting the orthogonality requirement of the interprocessor network, every CC communicates only data that is needed by other processors. Before the communication starts, the CCs have to be configured by the processors. After that the network works completely autonomously.

Figure 3 illustrates the overlapping of communication and processing. The CCs synchronize themselves and start the communication as soon as the first data values are available. Data transfer and processing are executed simultaneously without slowing down the processors. After the last value has reached the last CC, the communication cycle is finished and the processing can start immediately. Processing is therefore synchronized with the end of the communication cycle and not with the master sampling clock.

Let S_C be the synchronization time of the CC and S_P the synchronization time of the PE. The synchronization time over the entire system is then the maximum of S_C and S_P. If no CC is involved, as in other implementations, all synchronization is done by the processors, and the synchronization time is the sum of S_C and S_P. A new cycle begins after the next master sampling clock. Because of the independent communication controller, the communication concept is named "Intelligent Communication".

3.2 Global Audio Channels
The data that is communicated on the ring bus can be described as a set of global channels.
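
One way to picture the ring bus as a set of global channels is a slot table: each slot is filled by exactly one controller and copied out by every controller configured to consume it. The model below is a deliberately simplified sketch, not a description of the real hardware. The class `CommController`, its field names and the three-node ring are hypothetical, and timing is ignored entirely.

```python
class CommController:
    """Hypothetical model of one communication controller (CC)."""

    def __init__(self, produce_slots, consume_slots):
        self.produce_slots = set(produce_slots)  # slots this PE fills
        self.consume_slots = set(consume_slots)  # slots this PE reads
        self.producer_mem = {}  # slot -> value written by the local PE
        self.consumer_mem = {}  # slot -> value copied from the ring


def communication_cycle(ring, num_slots):
    """Pass every slot once around the ring, value by value."""
    for slot in range(num_slots):
        owners = [i for i, cc in enumerate(ring) if slot in cc.produce_slots]
        if len(owners) != 1:
            raise ValueError(f"slot {slot} needs exactly one producer")
        start = owners[0]
        value = ring[start].producer_mem[slot]
        # The value travels from its producer past all other controllers;
        # only controllers configured for this slot copy it out.
        for hop in range(1, len(ring)):
            cc = ring[(start + hop) % len(ring)]
            if slot in cc.consume_slots:
                cc.consumer_mem[slot] = value


# Configuration step: done once, before communication starts.
ring = [
    CommController(produce_slots=[0], consume_slots=[2]),
    CommController(produce_slots=[1], consume_slots=[0]),
    CommController(produce_slots=[2], consume_slots=[0, 1]),
]
# Each PE writes its output values for the current sample.
ring[0].producer_mem[0] = 0.1
ring[1].producer_mem[1] = 0.2
ring[2].producer_mem[2] = 0.3

communication_cycle(ring, num_slots=3)
print(ring[2].consumer_mem)  # {0: 0.1, 1: 0.2}
```

After configuration, `communication_cycle` needs no involvement from the PEs, which mirrors the autonomy described in Section 3.1.
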
Each channel is a digital audio connection between audio functions on different processors. Each processor produces a certain part of these channels, depending on which functions are running on it. If two functions on the same processor need a connection, there is no need to use a global audio channel: the audio data can be transferred within the memory of the processor. This is equivalent to a local audio connection.

The amount of communicated data is limited by the highest possible clock frequency on the ring bus, which is currently 25 MWord/sec. At a digital audio sampling rate of 48 kHz it is possible to communicate up to 512 global channels at 32 bits/word. Assuming an optimal signal flow-graph of a digital mixing console, in which most of the audio connections can be kept local and no more than 5 global audio channels are used for a full signal-path, it is possible to route up to 100 audio signal-paths on the system.
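
The figures quoted above can be checked with a short calculation. All inputs are taken from the text; only the variable names are invented.

```python
WORD_RATE = 25_000_000   # words per second on the ring bus
BITS_PER_WORD = 32
SAMPLE_RATE = 48_000     # Hz
GLOBAL_CHANNELS = 512    # channels stated in the text
CHANNELS_PER_PATH = 5    # global channels per full signal-path

# Raw bandwidth of the ring bus in MBit/sec.
bandwidth_mbit = WORD_RATE * BITS_PER_WORD / 1e6

# Words that can pass a controller within one sampling period.
words_per_sample = WORD_RATE // SAMPLE_RATE

# Signal-paths that fit if each one uses 5 global channels.
signal_paths = GLOBAL_CHANNELS // CHANNELS_PER_PATH

print(bandwidth_mbit)    # 800.0
print(words_per_sample)  # 520
print(signal_paths)      # 102
```

The 520 word slots available per sampling period leave room for the 512 global channels, and 512 channels at 5 per signal-path give 102 paths, in line with the stated figure of up to 100.
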

3.3 Data Input Output
Two interface modules exchange raw audio data with external audio resources. One is an AES/EBU interface, the other is a Multi Audio Digital Interface (MADI). With one MADI interface a maximum of 56 digital audio channels can be connected directly to the mixing-engine. To support an efficient data flow it is important to connect the interface modules directly to a CC. This way no processing power is lost at all. Figure 4 shows the topology of the mixing-engine with the I/O features. An interface module can be positioned anywhere between two processors. A system can have several MADI and AES/EBU interfaces. Like the processors, the interface modules produce a certain part of the global channels. The CC of each module works autonomously and is responsible for the data that the module produces and consumes.

3.4 Parameter Processing
The mixing-desk needs to control every function that is currently running on the system with a certain set of parameters. This information also needs to be communicated through the interprocessor network. However, parameters do not change as fast as raw audio data. It is therefore not necessary to use one global audio channel for each parameter. In the current implementation 512 parameters are multiplexed on one global audio channel. This corresponds to an update rate of more than 100 times per second per parameter. Only a few global audio channels are needed for parameters, and no special communication network for parameters has to be implemented. Between two parameter updates the parameters are interpolated to avoid audible discontinuities.

4 Hardware Implementation
The hardware platform of the mixing-engine is the MUSIC parallel computer built at the Electronics Laboratory of the Swiss Federal Institute of Technology [3] [4]. However, the communication concept and the operating system were completely redesigned. Figure 5 shows a processing board.
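
The parameter interpolation mentioned in Section 3.4 could look like the sketch below. The text does not say which interpolation is used, so linear interpolation is an assumption here, and the function name `interpolate_parameter` is hypothetical.

```python
def interpolate_parameter(previous, target, steps):
    """Return `steps` values moving linearly from `previous` to `target`.

    The last returned value equals `target`, so the next update can
    start from exactly the value the mixing-desk sent.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    delta = (target - previous) / steps
    return [previous + delta * (i + 1) for i in range(steps)]


# A fader jumps from 0.0 to 1.0; the gain is spread over four samples
# instead of being applied as one audible step.
ramp = interpolate_parameter(previous=0.0, target=1.0, steps=4)
print(ramp)  # [0.25, 0.5, 0.75, 1.0]

samples = [1.0, 1.0, 1.0, 1.0]
smoothed = [s * g for s, g in zip(samples, ramp)]
print(smoothed)  # [0.25, 0.5, 0.75, 1.0]
```

In a real system `steps` would be the number of samples between two updates of the same parameter, which depends on how the parameters are multiplexed onto the channel.
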
Three PEs fit on one board (22 cm by 23 cm), and up to 63 PEs can be connected together in a standard 19-inch rack. A special I/O board makes it possible to connect a MADI or an AES/EBU module directly to the interprocessor network. The modular design allows the system to be scaled according to individual needs. Only the necessary number of PEs and modules are inserted into the system, so hardware overhead can be substantially reduced.

One PE consists of a Motorola DSP floating-point digital signal processor, 1 MByte of static RAM and 2 MBytes of dual-ported DRAM (Video RAM) organized in two blocks called "producer" memory and "consumer" memory. Each PE has its own communication controller, which is responsible for the data flow between the PE and the interprocessor network. The CC is implemented in a Xilinx XC3190 FPGA. It fetches data through the serial port of the producer VRAM and writes arriving data into the serial port of the consumer VRAM. The serial buffers of the producer and consumer VRAM can each store 512 IEEE floating-point values of 32 bits.

5 Software
The software for the mixing-engine consists of three parts: a signal-flow-graph editor, a configuration software and a runtime kernel. Figure 6 shows the three steps for the reconfiguration of the mixing-engine. Each step corresponds to a separate software module.

5.1 Signal-flow-graph Editor
Audio functions are programmed in optimized assembler code. They appear as icons in the signal-flow-graph editor. Figure 7 demonstrates how functions can be placed and connected together. Subgraphs can be defined for later use. For example, a complete channel structure can be designed and inserted as a block into the total system. The graphical user interface runs on a UNIX workstation.

5.2 Signal-path Router
After the audio functions have been placed and connected with the signal-flow-graph editor, the signal-path router configures the mixing-engine according to the designed signal network. In a load-balancing analysis the functions are placed on the processors. For parallel programs with asynchronous data exchange this is a known problem [5] [6]. However, a synchronous system like a mixing-engine already has well-partitioned functions, and the processing time for each processor is fixed. What matters is the optimal load of the processors.

In the next step the signal-flow-graph is routed and mapped onto the mixing-engine. If a connection between two processors is needed, the interprocessor network is used through one of the global audio channels. If two linked functions run on the same processor, a local connection is established.

5.3 Runtime Kernel
After the system has booted, a runtime kernel runs on each PE. It synchronizes all running functions with the communication controller.
At a master sampling frequency of 48 kHz new audio signals arrive about every 20 µs. This time also corresponds to the processing time available on each processor when no pipelining of functions is involved. The kernel uses less than 5 % of the processing time on each processor. The remaining time is reserved exclusively for the audio functions.
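
The per-sample time budget follows directly from these numbers. The sketch below only restates the arithmetic. The helper `fits` and the example function times are hypothetical, and the kernel share is taken at its stated upper bound of 5 %.

```python
SAMPLE_RATE = 48_000  # Hz, master sampling frequency
KERNEL_SHARE = 0.05   # upper bound for the runtime kernel (from the text)

# Time between two samples in microseconds.
period_us = 1e6 / SAMPLE_RATE

# Time left for audio functions on one PE per sample.
audio_budget_us = period_us * (1 - KERNEL_SHARE)


def fits(function_times_us):
    """True if the listed functions fit into one sample period on one PE."""
    return sum(function_times_us) <= audio_budget_us


print(round(period_us, 2))        # 20.83
print(round(audio_budget_us, 2))  # 19.79
print(fits([10.0, 9.5]))          # True
print(fits([10.0, 10.0]))         # False
```

The exact period is about 20.83 µs, which the text rounds to 20 µs; at least about 19.79 µs of it remain for the audio functions on each PE.
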

5.4 Audio Function Design
New audio functions can be included with a minimum of software effort. Using a well-defined software interface, the user can insert any self-written audio function into the system. The new function can be programmed in C or DSP assembler code. After integration it is visible in the signal-flow-graph editor and can be placed in any audio signal network.

6 Conclusion
This paper describes a communication network that is very flexible and still reaches the speed necessary for multichannel digital audio communication. Signal paths are easily reconfigured with automatic load-balancing on all processors, which guarantees optimal usage of processing resources. Therefore any configuration of a digital mixing console described in a signal-flow-graph can be implemented. The presented implementation is the result of research work and is not a cost-effective solution. However, the system is fully operational and serves as the platform for an industrial product.

References
[1] M. Annaratone, E. Arnould, T. Gross, H. T. Kung, M. Lam, O. Menzilicioglu, J. A. Webb. The WARP Computer: Architecture, Implementation and Performance. IEEE Trans. on Computer, Vol. C-36, No. 12, December 1987, pp
[2] N. Morgan, J. Beck, P. Kohn, J. Bilmes, E. Allman, and J. Beer. The RAP: A Ring Array Processor for Layered Network Calculations. In International Conference On Application Specific Array Processors. IEEE Computer Society Press,
[3] A. Gunzinger, U. A. Müller, W. Scott, B. Bäumle, P. Kohler, W. Guggenbühl. Architecture and Realization of a Multi Signalprocessor System. In International Conference On Application Specific Array Processors. IEEE Computer Society Press,
[4] U. A. Müller, B. Bäumle, P. Kohler, A. Gunzinger, W. Guggenbühl. Achieving Supercomputer Performance for Neural Net Simulation with an Array of Digital Signal Processors. IEEE Micro, October
[5] Ch. W. Kessler (ed). Automatic Parallelization, New Approaches to Code Generation, Data Distribution, and Performance Prediction. Vieweg, Wiesbaden, Germany,
[6] G. Haring (ed), G. Kotsis (ed). Performance Measurement and Visualization of Parallel Systems. North-Holland, Amsterdam, London, New York, Tokyo,

Figure 1: Architecture of a Digital Mixing Console. (Mixing desk sending parameters to the mixing engine; audio raw-data input, processed audio data output.)

Figure 2: Topology of the scalable mixing-engine. (PE 1 to PE n, each attached to its own controller on the ring bus.)

Figure 3: Communication and processing on all PEs run in parallel. After the end of a communication cycle, the processing can start immediately. The synchronization of communication (S_C) and processing (S_P) is done separately and can be pipelined.

Figure 4: Input Output features of the mixing-engine. (AES/EBU and MADI modules attached to controllers on the ring alongside PE 1 to PE n.)

Figure 5: A processing board with 3 PEs.

Figure 6: Reconfiguration of the mixing-engine is done in 3 steps. (Signal flow graph, routing, runtime.)

Figure 7: Signal-flow-graph Editor.


More information

Laboratory Pipeline MIPS CPU Design (2): 16-bits version

Laboratory Pipeline MIPS CPU Design (2): 16-bits version Laboratory 10 10. Pipeline MIPS CPU Design (2): 16-bits version 10.1. Objectives Study, design, implement and test MIPS 16 CPU, pipeline version with the modified program without hazards Familiarize the

More information

Abstract A SCALABLE, PARALLEL, AND RECONFIGURABLE DATAPATH ARCHITECTURE

Abstract A SCALABLE, PARALLEL, AND RECONFIGURABLE DATAPATH ARCHITECTURE A SCALABLE, PARALLEL, AND RECONFIGURABLE DATAPATH ARCHITECTURE Reiner W. Hartenstein, Rainer Kress, Helmut Reinig University of Kaiserslautern Erwin-Schrödinger-Straße, D-67663 Kaiserslautern, Germany

More information

Enhancing Integrated Layer Processing using Common Case. Anticipation and Data Dependence Analysis. Extended Abstract

Enhancing Integrated Layer Processing using Common Case. Anticipation and Data Dependence Analysis. Extended Abstract Enhancing Integrated Layer Processing using Common Case Anticipation and Data Dependence Analysis Extended Abstract Philippe Oechslin Computer Networking Lab Swiss Federal Institute of Technology DI-LTI

More information

Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP

Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Presenter: Course: EEC 289Q: Reconfigurable Computing Course Instructor: Professor Soheil Ghiasi Outline Overview of M.I.T. Raw processor

More information

Patagonia Cluster Project Research Cluster

Patagonia Cluster Project Research Cluster Patagonia Cluster Project Research Cluster Clusters of PCs Multi-Boot and Multi-Purpose? Christian Kurmann, Felix Rauch, Michela Taufer, Prof. Thomas M. Stricker Laboratory for Computer Systems ETHZ -

More information

Techniques. IDSIA, Istituto Dalle Molle di Studi sull'intelligenza Articiale. Phone: Fax:

Techniques. IDSIA, Istituto Dalle Molle di Studi sull'intelligenza Articiale. Phone: Fax: Incorporating Learning in Motion Planning Techniques Luca Maria Gambardella and Marc Haex IDSIA, Istituto Dalle Molle di Studi sull'intelligenza Articiale Corso Elvezia 36 - CH - 6900 Lugano Phone: +41

More information

Dynamic Multi-Path Communication for Video Trac. Hao-hua Chu, Klara Nahrstedt. Department of Computer Science. University of Illinois

Dynamic Multi-Path Communication for Video Trac. Hao-hua Chu, Klara Nahrstedt. Department of Computer Science. University of Illinois Dynamic Multi-Path Communication for Video Trac Hao-hua Chu, Klara Nahrstedt Department of Computer Science University of Illinois h-chu3@cs.uiuc.edu, klara@cs.uiuc.edu Abstract Video-on-Demand applications

More information

Design and Implementation of a. August 4, Christopher Frank Joerg

Design and Implementation of a. August 4, Christopher Frank Joerg Design and Implementation of a Packet Switched Routing Chip MIT / LCS / TR-482 August 4, 1994 Christopher Frank Joerg This report describes research done at the Laboratory of Computer Science of the Massachusetts

More information

Optimal Topology for Distributed Shared-Memory. Multiprocessors: Hypercubes Again? Jose Duato and M.P. Malumbres

Optimal Topology for Distributed Shared-Memory. Multiprocessors: Hypercubes Again? Jose Duato and M.P. Malumbres Optimal Topology for Distributed Shared-Memory Multiprocessors: Hypercubes Again? Jose Duato and M.P. Malumbres Facultad de Informatica, Universidad Politecnica de Valencia P.O.B. 22012, 46071 - Valencia,

More information

Compiler and Runtime Support for Programming in Adaptive. Parallel Environments 1. Guy Edjlali, Gagan Agrawal, and Joel Saltz

Compiler and Runtime Support for Programming in Adaptive. Parallel Environments 1. Guy Edjlali, Gagan Agrawal, and Joel Saltz Compiler and Runtime Support for Programming in Adaptive Parallel Environments 1 Guy Edjlali, Gagan Agrawal, Alan Sussman, Jim Humphries, and Joel Saltz UMIACS and Dept. of Computer Science University

More information

PEPE: A Trace-Driven Simulator to Evaluate. Recongurable Multicomputer Architectures? Campus Universitario, Albacete, Spain

PEPE: A Trace-Driven Simulator to Evaluate. Recongurable Multicomputer Architectures? Campus Universitario, Albacete, Spain PEPE: A Trace-Driven Simulator to Evaluate Recongurable Multicomputer Architectures? Jose M Garca 1, Jose LSanchez 2,Pascual Gonzalez 2 1 Universidad de Murcia, Facultad de Informatica Campus de Espinardo,

More information

As dierent shading methods and visibility calculations have diversied the. image generation, many dierent alternatives have come into existence for

As dierent shading methods and visibility calculations have diversied the. image generation, many dierent alternatives have come into existence for Chapter 8 z-buffer, GOURAUD-SHADING WORKSTATIONS As dierent shading methods and visibility calculations have diversied the image generation, many dierent alternatives have come into existence for their

More information

OMNEO Interface and OMNEO Control Praesideo on IP

OMNEO Interface and OMNEO Control Praesideo on IP Application Note Praesideo on IP OMNEO Interface and OMNEO Control Praesideo on IP Create large Praesideo Public Address and Voice Alarm designs Interconnect multiple Praesideo subsystems via OMNEO IP-technology

More information

Flow simulation. Frank Lohmeyer, Oliver Vornberger. University of Osnabruck, D Osnabruck.

Flow simulation. Frank Lohmeyer, Oliver Vornberger. University of Osnabruck, D Osnabruck. To be published in: Notes on Numerical Fluid Mechanics, Vieweg 1994 Flow simulation with FEM on massively parallel systems Frank Lohmeyer, Oliver Vornberger Department of Mathematics and Computer Science

More information

The Avalanche Myrinet Simulation Package. University of Utah, Salt Lake City, UT Abstract

The Avalanche Myrinet Simulation Package. University of Utah, Salt Lake City, UT Abstract The Avalanche Myrinet Simulation Package User Manual for V. Chen-Chi Kuo, John B. Carter fchenchi, retracg@cs.utah.edu WWW: http://www.cs.utah.edu/projects/avalanche UUCS-96- Department of Computer Science

More information

A Hierarchical Approach to Workload. M. Calzarossa 1, G. Haring 2, G. Kotsis 2,A.Merlo 1,D.Tessera 1

A Hierarchical Approach to Workload. M. Calzarossa 1, G. Haring 2, G. Kotsis 2,A.Merlo 1,D.Tessera 1 A Hierarchical Approach to Workload Characterization for Parallel Systems? M. Calzarossa 1, G. Haring 2, G. Kotsis 2,A.Merlo 1,D.Tessera 1 1 Dipartimento di Informatica e Sistemistica, Universita dipavia,

More information

1 master and 8 independent stereo subgroup Flexible architecture including a modular control surface, outputs

1 master and 8 independent stereo subgroup Flexible architecture including a modular control surface, outputs Digital Audio Console OXF-R3 High-end digital recording and mix-down console 24 cue/auxiliary send buses, which can be linked for Provides exemplary sound quality and greater functionality stereo than

More information

CS250 VLSI Systems Design Lecture 9: Patterns for Processing Units and Communication Links

CS250 VLSI Systems Design Lecture 9: Patterns for Processing Units and Communication Links CS250 VLSI Systems Design Lecture 9: Patterns for Processing Units and Communication Links John Wawrzynek, Krste Asanovic, with John Lazzaro and Yunsup Lee (TA) UC Berkeley Fall 2010 Unit-Transaction Level

More information

Word Clock Select. 188 Digital I/O, Setup, and Utilities. Wordclock. 1. Use the [DIGITAL I/O] button to locate the DIGITAL I/O 1/5 page.

Word Clock Select. 188 Digital I/O, Setup, and Utilities. Wordclock. 1. Use the [DIGITAL I/O] button to locate the DIGITAL I/O 1/5 page. 188 Digital I/O, Setup, and Utilities Word Clock Select 1. Use the [DIGITAL I/O] button to locate the DIGITAL I/O 1/5 page. The 02R processes audio data at 44.1 khz or 48 khz using its internal clock,

More information

suitable for real-time applications. In this paper, we add a layer of Real-Time Communication Control (RTCC) protocol on top of Ethernet. The RTCC pro

suitable for real-time applications. In this paper, we add a layer of Real-Time Communication Control (RTCC) protocol on top of Ethernet. The RTCC pro A Hard Real-Time Communication Control Protocol Based on the Ethernet WANG Zhi-Ping 1, XIONG Guang-Ze 1, LUO Jin 1, LAI Ming-Zhi 1,and Wanlei ZHOU 2 1 Computer Science and Engineering College, University

More information

Dr e v prasad Dt

Dr e v prasad Dt Dr e v prasad Dt. 12.10.17 Contents Characteristics of Multiprocessors Interconnection Structures Inter Processor Arbitration Inter Processor communication and synchronization Cache Coherence Introduction

More information

\Classical" RSVP and IP over ATM. Steven Berson. April 10, Abstract

\Classical RSVP and IP over ATM. Steven Berson. April 10, Abstract \Classical" RSVP and IP over ATM Steven Berson USC Information Sciences Institute April 10, 1996 Abstract Integrated Services in the Internet is rapidly becoming a reality. Meanwhile, ATM technology is

More information

PARALLEL PERFORMANCE DIRECTED TECHNOLOGY MAPPING FOR FPGA. Laurent Lemarchand. Informatique. ea 2215, D pt. ubo University{ bp 809

PARALLEL PERFORMANCE DIRECTED TECHNOLOGY MAPPING FOR FPGA. Laurent Lemarchand. Informatique. ea 2215, D pt. ubo University{ bp 809 PARALLEL PERFORMANCE DIRECTED TECHNOLOGY MAPPING FOR FPGA Laurent Lemarchand Informatique ubo University{ bp 809 f-29285, Brest { France lemarch@univ-brest.fr ea 2215, D pt ABSTRACT An ecient distributed

More information

Real Time Spectrogram

Real Time Spectrogram Real Time Spectrogram EDA385 Final Report Erik Karlsson, dt08ek2@student.lth.se David Winér, ael09dwi@student.lu.se Mattias Olsson, ael09mol@student.lu.se October 31, 2013 Abstract Our project is about

More information

FPGAs: Instant Access

FPGAs: Instant Access FPGAs: Instant Access Clive"Max"Maxfield AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO % ELSEVIER Newnes is an imprint of Elsevier Newnes Contents

More information

What is Parallel Computing?

What is Parallel Computing? What is Parallel Computing? Parallel Computing is several processing elements working simultaneously to solve a problem faster. 1/33 What is Parallel Computing? Parallel Computing is several processing

More information

Mapping Multi-Million Gate SoCs on FPGAs: Industrial Methodology and Experience

Mapping Multi-Million Gate SoCs on FPGAs: Industrial Methodology and Experience Mapping Multi-Million Gate SoCs on FPGAs: Industrial Methodology and Experience H. Krupnova CMG/FMVG, ST Microelectronics Grenoble, France Helena.Krupnova@st.com Abstract Today, having a fast hardware

More information

Interconnect Technology and Computational Speed

Interconnect Technology and Computational Speed Interconnect Technology and Computational Speed From Chapter 1 of B. Wilkinson et al., PARAL- LEL PROGRAMMING. Techniques and Applications Using Networked Workstations and Parallel Computers, augmented

More information

Space Priority Trac. Rajarshi Roy and Shivendra S. Panwar y. for Advanced Technology in Telecommunications, Polytechnic. 6 Metrotech Center

Space Priority Trac. Rajarshi Roy and Shivendra S. Panwar y. for Advanced Technology in Telecommunications, Polytechnic. 6 Metrotech Center Ecient Buer Sharing in Shared Memory ATM Systems With Space Priority Trac Rajarshi Roy and Shivendra S Panwar y Center for Advanced Technology in Telecommunications Polytechnic University 6 Metrotech Center

More information

Overcoming the Memory System Challenge in Dataflow Processing. Darren Jones, Wave Computing Drew Wingard, Sonics

Overcoming the Memory System Challenge in Dataflow Processing. Darren Jones, Wave Computing Drew Wingard, Sonics Overcoming the Memory System Challenge in Dataflow Processing Darren Jones, Wave Computing Drew Wingard, Sonics Current Technology Limits Deep Learning Performance Deep Learning Dataflow Graph Existing

More information

A New Orthogonal Multiprocessor and its Application to Image. Processing. L. A. Sousa M. S. Piedade DEEC IST/INESC. R.

A New Orthogonal Multiprocessor and its Application to Image. Processing. L. A. Sousa M. S. Piedade DEEC IST/INESC. R. A New Orthogonal ultiprocessor and its Application to Image Processing L. A. Sousa. S. Piedade email:las@molly.inesc.pt email: msp@inesc.pt DEEC IST/INESC R. Alves Redol,9 1000 Lisboa, PORTUGAL Abstract

More information

DIGIGRID MGR. Table of Contents

DIGIGRID MGR. Table of Contents 1 1 Table of Contents Introduction 3 About SoundGrid and the DiGiGrid MGR Audio Interface 3 Using DiGiGrid MGR with a Console 6 1. Hardware and Connectors 7 2. Installation and Configuration Overview 9

More information

Advanced Parallel Architecture. Annalisa Massini /2017

Advanced Parallel Architecture. Annalisa Massini /2017 Advanced Parallel Architecture Annalisa Massini - 2016/2017 References Advanced Computer Architecture and Parallel Processing H. El-Rewini, M. Abd-El-Barr, John Wiley and Sons, 2005 Parallel computing

More information

SoC Design Lecture 11: SoC Bus Architectures. Shaahin Hessabi Department of Computer Engineering Sharif University of Technology

SoC Design Lecture 11: SoC Bus Architectures. Shaahin Hessabi Department of Computer Engineering Sharif University of Technology SoC Design Lecture 11: SoC Bus Architectures Shaahin Hessabi Department of Computer Engineering Sharif University of Technology On-Chip bus topologies Shared bus: Several masters and slaves connected to

More information

Elchin Mammadov. Overview of Communication Systems

Elchin Mammadov. Overview of Communication Systems Overview of Communication Systems About Me Studying towards the Masters of Applied Science in Electrical and Computer Engineering. My research area is about implementing a communication framework (software

More information

DSP Development Environment: Introductory Exercise for TI TMS320C55x

DSP Development Environment: Introductory Exercise for TI TMS320C55x Connexions module: m13811 1 DSP Development Environment: Introductory Exercise for TI TMS320C55x Thomas Shen David Jun Based on DSP Development Environment: Introductory Exercise for TI TMS320C54x (ECE

More information

CS 4453 Computer Networks Winter

CS 4453 Computer Networks Winter CS 4453 Computer Networks Chapter 2 OSI Network Model 2015 Winter OSI model defines 7 layers Figure 1: OSI model Computer Networks R. Wei 2 The seven layers are as follows: Application Presentation Session

More information

Zeki Bozkus, Sanjay Ranka and Georey Fox , Center for Science and Technology. Syracuse University

Zeki Bozkus, Sanjay Ranka and Georey Fox , Center for Science and Technology. Syracuse University Modeling the CM-5 multicomputer 1 Zeki Bozkus, Sanjay Ranka and Georey Fox School of Computer Science 4-116, Center for Science and Technology Syracuse University Syracuse, NY 13244-4100 zbozkus@npac.syr.edu

More information

VersaPipe: A Versatile Programming Framework for Pipelined Computing on GPU

VersaPipe: A Versatile Programming Framework for Pipelined Computing on GPU VersaPipe: A Versatile Programming Framework for Pipelined Computing on GPU Zhen Zheng Tsinghua University z-zheng14@mails.tsinghua.edu.cn Chanyoung Oh University of Seoul alspace11@uos.ac.kr Jidong Zhai

More information

Design and Implementation of a FPGA-based Pipelined Microcontroller

Design and Implementation of a FPGA-based Pipelined Microcontroller Design and Implementation of a FPGA-based Pipelined Microcontroller Rainer Bermbach, Martin Kupfer University of Applied Sciences Braunschweig / Wolfenbüttel Germany Embedded World 2009, Nürnberg, 03.03.09

More information

6.1 Multiprocessor Computing Environment

6.1 Multiprocessor Computing Environment 6 Parallel Computing 6.1 Multiprocessor Computing Environment The high-performance computing environment used in this book for optimization of very large building structures is the Origin 2000 multiprocessor,

More information

UNIT I (Two Marks Questions & Answers)

UNIT I (Two Marks Questions & Answers) UNIT I (Two Marks Questions & Answers) Discuss the different ways how instruction set architecture can be classified? Stack Architecture,Accumulator Architecture, Register-Memory Architecture,Register-

More information

A Framework for Building Parallel ATPs. James Cook University. Automated Theorem Proving (ATP) systems attempt to prove proposed theorems from given

A Framework for Building Parallel ATPs. James Cook University. Automated Theorem Proving (ATP) systems attempt to prove proposed theorems from given A Framework for Building Parallel ATPs Geo Sutclie and Kalvinder Singh James Cook University 1 Introduction Automated Theorem Proving (ATP) systems attempt to prove proposed theorems from given sets of

More information

SoC Design. Prof. Dr. Christophe Bobda Institut für Informatik Lehrstuhl für Technische Informatik

SoC Design. Prof. Dr. Christophe Bobda Institut für Informatik Lehrstuhl für Technische Informatik SoC Design Prof. Dr. Christophe Bobda Institut für Informatik Lehrstuhl für Technische Informatik Chapter 5 On-Chip Communication Outline 1. Introduction 2. Shared media 3. Switched media 4. Network on

More information

Introduction to Parallel Programming

Introduction to Parallel Programming Introduction to Parallel Programming David Lifka lifka@cac.cornell.edu May 23, 2011 5/23/2011 www.cac.cornell.edu 1 y What is Parallel Programming? Using more than one processor or computer to complete

More information

Abstract Studying network protocols and distributed applications in real networks can be dicult due to the need for complex topologies, hard to nd phy

Abstract Studying network protocols and distributed applications in real networks can be dicult due to the need for complex topologies, hard to nd phy ONE: The Ohio Network Emulator Mark Allman, Adam Caldwell, Shawn Ostermann mallman@lerc.nasa.gov, adam@eni.net ostermann@cs.ohiou.edu School of Electrical Engineering and Computer Science Ohio University

More information

Real-time communication scheduling in a multicomputer video. server. A. L. Narasimha Reddy Eli Upfal. 214 Zachry 650 Harry Road.

Real-time communication scheduling in a multicomputer video. server. A. L. Narasimha Reddy Eli Upfal. 214 Zachry 650 Harry Road. Real-time communication scheduling in a multicomputer video server A. L. Narasimha Reddy Eli Upfal Texas A & M University IBM Almaden Research Center 214 Zachry 650 Harry Road College Station, TX 77843-3128

More information

Modeling of an MPEG Audio Layer-3 Encoder in Ptolemy

Modeling of an MPEG Audio Layer-3 Encoder in Ptolemy Modeling of an MPEG Audio Layer-3 Encoder in Ptolemy Patrick Brown EE382C Embedded Software Systems May 10, 2000 $EVWUDFW MPEG Audio Layer-3 is a standard for the compression of high-quality digital audio.

More information

An Ultra High Performance Scalable DSP Family for Multimedia. Hot Chips 17 August 2005 Stanford, CA Erik Machnicki

An Ultra High Performance Scalable DSP Family for Multimedia. Hot Chips 17 August 2005 Stanford, CA Erik Machnicki An Ultra High Performance Scalable DSP Family for Multimedia Hot Chips 17 August 2005 Stanford, CA Erik Machnicki Media Processing Challenges Increasing performance requirements Need for flexibility &

More information

Predicting the Worst-Case Execution Time of the Concurrent Execution. of Instructions and Cycle-Stealing DMA I/O Operations

Predicting the Worst-Case Execution Time of the Concurrent Execution. of Instructions and Cycle-Stealing DMA I/O Operations ACM SIGPLAN Workshop on Languages, Compilers and Tools for Real-Time Systems, La Jolla, California, June 1995. Predicting the Worst-Case Execution Time of the Concurrent Execution of Instructions and Cycle-Stealing

More information

Programming Environments for Developing Real Time. Autonomous Agents based on a Functional Module Network Model

Programming Environments for Developing Real Time. Autonomous Agents based on a Functional Module Network Model Programming Environments for Developing Real Time Autonomous Agents based on a Functional Module Network Model T.Oka 3 M.Inaba + H.Inoue + 3 Graduate School of Information Systems + Dpt. of Mechano-Informatics

More information

Client Server & Distributed System. A Basic Introduction

Client Server & Distributed System. A Basic Introduction Client Server & Distributed System A Basic Introduction 1 Client Server Architecture A network architecture in which each computer or process on the network is either a client or a server. Source: http://webopedia.lycos.com

More information

NVIDIA nforce IGP TwinBank Memory Architecture

NVIDIA nforce IGP TwinBank Memory Architecture NVIDIA nforce IGP TwinBank Memory Architecture I. Memory Bandwidth and Capacity There s Never Enough With the recent advances in PC technologies, including high-speed processors, large broadband pipelines,

More information

Process 0 Process 1 MPI_Barrier MPI_Isend. MPI_Barrier. MPI_Recv. MPI_Wait. MPI_Isend message. header. MPI_Recv. buffer. message.

Process 0 Process 1 MPI_Barrier MPI_Isend. MPI_Barrier. MPI_Recv. MPI_Wait. MPI_Isend message. header. MPI_Recv. buffer. message. Where's the Overlap? An Analysis of Popular MPI Implementations J.B. White III and S.W. Bova Abstract The MPI 1:1 denition includes routines for nonblocking point-to-point communication that are intended

More information

The Architecture of a Homogeneous Vector Supercomputer

The Architecture of a Homogeneous Vector Supercomputer The Architecture of a Homogeneous Vector Supercomputer John L. Gustafson, Stuart Hawkinson, and Ken Scott Floating Point Systems, Inc. Beaverton, Oregon 97005 Abstract A new homogeneous computer architecture

More information