DISTRIBUTED HIGH-SPEED COMPUTING OF MULTIMEDIA DATA


M. GAUS, G. R. JOUBERT, O. KAO, S. RIEDEL AND S. STAPEL
Technical University of Clausthal, Department of Computer Science
Julius-Albert-Str. 4, Clausthal-Zellerfeld, Germany

Submitted to World Scientific.

Distributed platforms are not necessarily well-suited for systems which handle large data sets, such as those processed in multimedia applications. In this paper a specialised computation model, based on asynchronous transmission, is presented. As the necessary functions are encapsulated, this system can be used without detailed knowledge of the system architecture. A dynamic strategy of task execution is utilised to adjust the number and size of the distributed data packages according to the computational load of the processing elements (PEs) at transmission time. Thus more powerful PEs, or those whose resources are not fully utilised, will either receive packages more frequently or will be given larger packages. In large networks some nodes can be replaced by others, or only a few data blocks may be sent to particular nodes. The efficiency of the method is evaluated with a variety of practical run time measurements.

1 Introduction

Distributed systems consisting of a network of workstations are increasingly being used for solving compute intensive problems. Distributed platforms are, however, not always well-suited for systems which handle large data sets as can be found in, for example, multimedia applications. The limiting factor for processing of large data sets is usually network bandwidth. Thus, the distribution of huge amounts of data bounds the overall processing speed. This situation is made worse by the fact that data is transmitted only when requested or sent by the parallel processes. In order to reduce this effect, the transmission of data should be separated from the process synchronisation.

Well-known software systems for parallel/distributed processing on existing computer networks are PVM, MPI, PVMPI, Condor, Mosix [1-5] and Treadmarks. An advantage of PVM is its availability on nearly all important architectures and operating systems. On the other hand, its synchronous data transfers and type conversions are time consuming, making it unsuitable for the processing of large multimedia data sets.

1.1 Multimedia data

Multimedia data has become an important component of modern software systems. Static media (images, graphics, text) are combined with dynamic media (audio, video, animations) to obtain realistic representations of natural processes, for the visualisation of complex results or to depict dynamic processes. In spite of the increases in memory sizes, processing and communication speeds, the processing and communication of multimedia data is still time and compute intensive. Some of the initial problems, essentially data compression, could be solved by the development of efficient compression algorithms, e.g. JPEG, MPEG, MP3. Many of these algorithms have been implemented in hardware, offering the possibility of real time encoding.

The next step resulted in parallelising numerous procedures for processing multimedia data. Static media, such as encountered in image processing applications, are usually subdivided into independent data fragments, which are then distributed among a number of processing elements. The results are gathered and combined to form the final result. In dynamic media, interdependencies between the different data blocks must be considered and resolved. An example of this is MPEG compression, which is based on finding and eliminating redundant information in consecutive frames.

Parallelisation by means of data segmentation is well-suited for parallel computers with shared memory, since little or no time is spent on communicating the data. Software for distributed computing in heterogeneous networks will achieve less of a performance gain, because of the slow synchronous transfers and greater variances in client resources. If the operations executed are simple, these delay effects can be seen quite clearly. An example of this is the calculation of correlation coefficients for short term series [8]. Considering all combinations between 100 shares and a time difference of 5 days resulted in correlation terms and 27 megabytes of data. The performance gain from parallelising the algorithm with PVM among 4 DEC Alphas was negated by the resulting administration overhead. As a result, the run on a single workstation was up to 6 times faster than the parallel PVM version. These requirements (large data sets, simple operations) are also found in the management, retrieval and processing of multimedia data.
Current approaches to multimedia databases are based on the extraction and management of specific characteristics. Queries compare the extracted characteristics with all images stored in the database, and return the most similar images. Each archival and retrieval process results in the computation of huge amounts of data. Performance gains through parallelisation are negated by transfer times and administration of the data, as described in the correlation example. This makes a specialised model for parallel processing of huge amounts of data necessary.

2 Processing model for static multimedia data

The proposed processing model aims to make the development of parallel programs easy for inexperienced users, and to minimise the communication and management effort by using TCP/IP sockets directly. Similar to the work pile model [6], this model is based on the creation of pools of tasks, which are controlled by three special processes (distribution manager, collection manager and computation client). The information is divided into sections which are distributed to a number of processing elements (Figure 1).

Figure 1: Schematic representation of the processing model (pool of tasks, distribution manager, processing elements PE 1 to PE n, collection manager, pool of results)

2.1 Distribution manager

The distribution manager is responsible for the division and management of the data packets to be processed. Push technology is used to minimise the transfer cost between server and clients. The responsibilities of the distribution manager include the definition of data packets, management of the data packets in the local pool of tasks, processing of client requests and distribution of the data packets among the processing elements. The distribution strategy is set within this process. Essential requirements are the efficient use of available resources and fault tolerance.

To circumvent problems related to processing element failures, the data packets are subdivided into three groups: the first group consists of packets which have not yet been distributed, the second group comprises transmitted but unprocessed packets, and the third group consists of processed packets. A simple distribution strategy for the available data packets increases computing efficiency. If the first group is empty, but unprocessed data blocks are still in the second group, then these are dispatched to idle clients which have already completed their computation tasks. This can be achieved by generating a list of all available active nodes and of the status of their local pools of tasks.

The number of distributed but not yet processed packets can be calculated from the number of packets sent, but not yet received by the collection manager. This requires a direct connection between the distributor and the collector. The difference is compared to a given threshold value. If it is below the threshold, the distribution manager sends new packets to the client.
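The bookkeeping described above (three packet groups and a threshold on outstanding packets) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `PacketTracker`, its methods and the default threshold are hypothetical, and network transfer is left out entirely.

```python
from collections import deque


class PacketTracker:
    """Hypothetical bookkeeping for the three packet groups of Section 2.1."""

    def __init__(self, packet_ids, threshold=2):
        self.pending = deque(packet_ids)  # group 1: not yet distributed
        self.in_flight = {}               # group 2: packet id -> client holding it
        self.done = set()                 # group 3: processed packets
        self.threshold = threshold        # assumed per-client limit

    def outstanding(self, client):
        """Packets sent to `client` that the collector has not confirmed yet."""
        return sum(1 for holder in self.in_flight.values() if holder == client)

    def next_packet(self, client):
        """Return a packet id for `client`, or None if nothing should be sent."""
        if self.outstanding(client) >= self.threshold:
            return None
        if self.pending:
            packet = self.pending.popleft()
        else:
            # Group 1 is empty: hand an unprocessed packet held by another
            # client to this idle client.
            candidates = [p for p, holder in self.in_flight.items()
                          if holder != client]
            if not candidates:
                return None
            packet = candidates[0]
        self.in_flight[packet] = client
        return packet

    def confirm(self, packet):
        """Record that the collector has received the processed `packet`."""
        if packet in self.in_flight:
            del self.in_flight[packet]
            self.done.add(packet)

    def finished(self):
        return not self.pending and not self.in_flight


tracker = PacketTracker(range(3), threshold=2)
first = tracker.next_packet("A")    # packet 0
second = tracker.next_packet("A")   # packet 1
blocked = tracker.next_packet("A")  # None: client A is at its threshold
third = tracker.next_packet("B")    # packet 2
tracker.confirm(third)
stolen = tracker.next_packet("B")   # packet 0, re-dispatched to idle client B
```

Keying the second group by packet id means a re-dispatched packet is confirmed once, whichever client delivers it first.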
This strategy requires a time and/or workload oriented distribution of the data packets as well, since processing can only occur if the processing element has a low CPU load. A blocked client that does not satisfy this requirement is regarded as a node that has failed. The server will redistribute the data packets sent to this client.

2.2 Computation client

This component performs the computation on each processing element. A simple and compact structure reduces the management overhead and enables a significant performance increase. The computation client consists of a local pool of tasks, a processing object and a local pool of results. In the local pool of results the processed data packets are temporarily stored until a connection for the transfer to the collection manager becomes available.

2.3 Collection manager

This process accepts processed data packets from the computation clients and stores them in the pool of results until all data packets have been received. Once this occurs, it composes the processed original from the received data packets. In JPEG encoding, for example, a picture or a series of pictures would be composed at this point. Furthermore, the collector sends a message giving the number of received data packets to the distributor. From this information the distribution manager determines the current workload of each client and redefines the distribution strategy. The distributor is also notified when all data packets have reached the collector and the processing is completed.

2.4 Arraying in multiple hierarchical levels

The described model consists of two hierarchical levels, containing the distribution and collection processes on one level, and computation clients on the other. This model will quickly reach its capacity limits with a large number of non-local processing elements. An alternative is to arrange servers hierarchically. The lower levels of this hierarchy contain not only clients, but subordinate servers as well, which distribute the data packets to lower level clients. An example of the application of such a model is data distribution in corporate or university networks: a super server sends data packets to subordinate servers in each division. Each of these servers initiates the computation in its own domain. This significantly reduces the communication complexity, or at least confines it locally. The processed packets are still sent to a central collector, making dynamic regrouping possible. The clients of a new group will then receive their packets from the server of the new group.
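The role of the collection manager from Section 2.3 can be illustrated with a small sketch. It assumes that every packet carries its position index and that the processed original is a plain byte string; both are assumptions made for illustration, and the names `Collector`, `receive` and `assemble` are hypothetical.

```python
class Collector:
    """Hypothetical pool of results that reassembles the processed original."""

    def __init__(self, total_packets):
        self.total = total_packets
        self.results = {}  # pool of results: packet index -> processed bytes

    def receive(self, index, payload):
        """Store a processed packet and return the number received so far.

        The returned count stands in for the message that the collector
        sends to the distributor. A packet that arrives twice is stored once.
        """
        self.results[index] = payload
        return len(self.results)

    def complete(self):
        return len(self.results) == self.total

    def assemble(self):
        """Compose the processed original once every packet has arrived."""
        if not self.complete():
            raise RuntimeError("packets still missing")
        return b"".join(self.results[i] for i in range(self.total))


collector = Collector(total_packets=3)
collector.receive(2, b"c")
collector.receive(0, b"a")
still_waiting = not collector.complete()
collector.receive(1, b"b")
count = collector.receive(1, b"b")  # duplicate delivery, count stays at 3
original = collector.assemble()
```

Storing results by index lets packets arrive in any order, which matters when faster clients return their work ahead of slower ones.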
Marking the processed data packets with the id of the group which processed them is mandatory. This allows the collector to find out which group processed each data packet, so that this group is resupplied with data to process once it drops below a given threshold.

3 An adaptive distribution strategy

Heterogeneous networks consist of processing elements with different performance capabilities (CPU, memory etc.). Information about the complexity of the tasks being processed is usually not available. Furthermore, the number of users working on a particular workstation is continuously changing. Thus it is impossible to predict the performance of any particular workstation in a network at a given time. This makes it impossible to schedule task processing a priori. A dynamic distribution strategy for processing tasks is thus needed.

The number and size of the distributed data packages must be adapted to the workload of the processing element at transmission time. Even this strategy may not be near optimal, as additional tasks can be started on the PE between the determination of the current load and the arrival of the data packages. More powerful PEs, or those with low utilisation, will receive packets more frequently or will be allocated larger packages. In large networks some low performance nodes can be skipped and the work distributed to more powerful PEs. If this is not possible, the data blocks sent to the low performance nodes will automatically be adapted.

For the concrete realisation of this method a performance ranking must be generated. This can be done by calculating the difference between sent and processed packages as described above. In the first distribution run each processing element is supplied with n packages. After a certain time interval a performance rank list is created. The number of packets for the respective processing elements is then increased or decreased. This operation is repeated until the collector has received all data.

Alternatively, the packet size can be adapted. Larger packets are sent to the PEs at the top of the performance list. This can minimise the communication and network traffic. However, this is not always possible. For example, an image is usually subdivided into n sections. If all sections are distributed during the first run, a change of the package size is not possible without a loss of already processed data. This performance information can only be used if the image has large dimensions, or if a whole image sequence is to be processed. A disadvantage of this model is that additional logic for the management of dynamic block sizes is necessary in the clients.
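One round of the ranking step described in this section might look like the sketch below. The text only states that packet counts are increased or decreased according to a performance ranking; the concrete rule used here (compare each client's backlog with the mean backlog and change its quota by a fixed step) is an assumption for illustration, as are the function name `rebalance` and its parameters.

```python
def rebalance(quotas, sent, processed, step=1, minimum=1):
    """Return (ranking, new_quotas) for the next distribution run.

    quotas, sent and processed map a client name to a packet count.
    The backlog of a client is the number of packets sent to it that
    have not yet reached the collector.
    """
    backlog = {client: sent[client] - processed[client] for client in quotas}
    # Performance rank list: the client with the smallest backlog comes first.
    ranking = sorted(quotas, key=lambda client: backlog[client])
    mean_backlog = sum(backlog.values()) / len(backlog)

    new_quotas = {}
    for client, quota in quotas.items():
        if backlog[client] < mean_backlog:
            new_quotas[client] = quota + step                # keeping up: more work
        elif backlog[client] > mean_backlog:
            new_quotas[client] = max(minimum, quota - step)  # falling behind: less
        else:
            new_quotas[client] = quota
    return ranking, new_quotas


# First run: every processing element was supplied with n = 4 packets.
quotas = {"fast": 4, "slow": 4, "mid": 4}
sent = {"fast": 4, "slow": 4, "mid": 4}
processed = {"fast": 4, "slow": 0, "mid": 2}
ranking, new_quotas = rebalance(quotas, sent, processed)
```

In this example the backlogs are 0, 4 and 2 packets, so the client that has already returned everything is ranked first and receives a larger quota in the next run.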
Dynamic block sizes also increase the complexity of the model tasks and the demands on the user's knowledge.

4 The usage of the system

The data flow of the proposed model for the parallel processing of multimedia data involves the following steps. The generated data packets are put into the pool of tasks when processing starts and the distribution manager is initialised. The data packets received by the clients are stored in the local pool of tasks, which is essentially a queue. Afterwards the computation starts. Processed data is stored in the local pool of results and sent to the collection manager. The collection manager informs the distribution manager of the receipt of the processed packets. When all data packets have been received, the so-called NULL-packet is distributed. Every processing element which receives a NULL-packet immediately terminates processing.

An object oriented system design helps to make system components reusable and lessens the difficulty of using the distribution model. The most important class is the processing class. It does the actual processing and is the focal point of the model. All other classes support it by managing the administration, reception and distribution of data. The parameter of its run()-method contains the data to be processed. The packet is processed in this method, stored in the local pool of results by means of a return call and is then sent back. Using the system merely requires overloading the run()-method of the processing class, which adjusts the class to the specific problem.

The distribution and collection managers have to be initialised at the beginning of a session. Furthermore, the required processes need to be launched on the processing nodes. These will then contact the distributor and collector on their own. At this stage the system is idle. The pool of tasks is now filled with the required packets. Once this has been done, the distributor is activated and the data is processed. All processed packets are stored in the pool of results. Manipulating the packet size requires overloading the methods that split and merge the packets.

5 Performance measurements

The measurements were performed on a cluster of 300 MHz K6 PCs running Linux, connected over 10 Mbit Ethernet. In a first attempt, different block sizes and numbers of iterations as well as various configurations of the processing model were examined in order to obtain data about the efficiency and the run time behaviour of the proposed system.
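Returning to the usage pattern of Section 4, the sketch below shows what overloading the run()-method and terminating on the NULL-packet could look like. Only the processing class and its run()-method are named in the text; the class names `Processing` and `Invert`, the function `client_loop`, the `(index, data)` packet layout and the use of `None` as the NULL-packet are assumptions made for illustration, and in-memory queues stand in for the socket connections.

```python
import queue

NULL_PACKET = None  # assumed stand-in for the NULL-packet


class Processing:
    """Hypothetical base class; users overload run() for their problem."""

    def run(self, data):
        raise NotImplementedError


class Invert(Processing):
    """Example overload: invert every byte of the packet."""

    def run(self, data):
        return bytes(255 - value for value in data)


def client_loop(tasks, results, processor):
    """Take packets from the local pool of tasks until a NULL-packet arrives."""
    while True:
        packet = tasks.get()
        if packet is NULL_PACKET:
            break  # terminate processing immediately
        index, data = packet
        # The return value of run() goes into the local pool of results.
        results.put((index, processor.run(data)))


tasks = queue.Queue()    # local pool of tasks
results = queue.Queue()  # local pool of results
tasks.put((0, b"\x00\xff"))
tasks.put((1, b"\x10"))
tasks.put(NULL_PACKET)

client_loop(tasks, results, Invert())
first_result = results.get()
second_result = results.get()
```

The loop itself never changes between applications; only the subclass with the overloaded run() does, which is the reuse argument made for the object oriented design.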
Table 1: Measurement results (run times, speedup and efficiency) with the implemented prototype. Of the columns Iterations, Time[s] (1 PE) and Time[s]/Sp/Ep (2, 3 and 4 PEs), only the speedup values Sp are legible; rows are in the original order.

Row   Sp (2 PE)   Sp (3 PE)   Sp (4 PE)
 1    1.407       1.615       1.590
 2    1.586       1.751       1.939
 3    1.637       1.852       2.136
 4    1.758       2.036       2.409
 5    1.768       2.215       2.501
 6    1.805       2.218       2.514
 7    1.829       2.388       2.797
 8    1.855       2.400       2.921
 9    1.846       2.429       2.991
10    1.825       2.511       3.155
11    1.938       2.529       3.301

Table 1 shows the run times for different numbers of iterations of a simple inverting operation performed on a 10 Mbyte block, as well as the speedup factor S_P and the efficiency E_P. The data is subdivided into subsections and distributed to the PE clients according to the strategy described. Speedup values between 1.4 and 3.3 are reached in this simple application. At the beginning, network communication is the dominant factor, resulting in speedups between 1.41 (2 PEs) and 1.59 (4 PEs). With larger numbers of iterations a linear increase of the speedup values can be observed, reaching a top speedup of 3.3 with 4 PEs and 200 iterations. The efficiency decreases only slightly, e.g. there is a difference of 0.24 between the mean values of the two and four PE systems. Thus the scalability of the system model appears to be good.

A clearer description of the results is given in Figure 2. The right hand diagram shows the run times of the different system configurations, the left hand diagram contains the mean speedup and efficiency values for the parallel configurations.

Figure 2: A diagram of the speedup and efficiency values achieved (left); run times for 1-4 PEs (right)

The achieved results are compared to the mean speedup and efficiency values of PVM, which are shown in Figure 3. The measurements were performed on the same configurations (K6 with Linux, block-wise distribution) with type conversion disabled.

Figure 3: A diagram of the PVM average speedup and efficiency values
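The mean values discussed above can be recomputed from the speedup columns of Table 1. The sketch assumes the usual definitions, speedup S_P = T_1 / T_P and efficiency E_P = S_P / P, which the text uses but does not spell out; the function and variable names are illustrative.

```python
# Speedup values S_P from Table 1, one entry per measurement row.
SPEEDUP = {
    2: [1.407, 1.586, 1.637, 1.758, 1.768, 1.805,
        1.829, 1.855, 1.846, 1.825, 1.938],
    3: [1.615, 1.751, 1.852, 2.036, 2.215, 2.218,
        2.388, 2.400, 2.429, 2.511, 2.529],
    4: [1.590, 1.939, 2.136, 2.409, 2.501, 2.514,
        2.797, 2.921, 2.991, 3.155, 3.301],
}


def speedup(time_one_pe, time_p_pes):
    """S_P = T_1 / T_P (assumed standard definition)."""
    return time_one_pe / time_p_pes


def efficiency(speedup_value, pe_count):
    """E_P = S_P / P (assumed standard definition)."""
    return speedup_value / pe_count


def mean(values):
    return sum(values) / len(values)


mean_speedup = {p: mean(values) for p, values in SPEEDUP.items()}
mean_efficiency = {p: efficiency(s, p) for p, s in mean_speedup.items()}
efficiency_gap = mean_efficiency[2] - mean_efficiency[4]

for p in sorted(SPEEDUP):
    print(f"{p} PEs: mean speedup {mean_speedup[p]:.2f}, "
          f"mean efficiency {mean_efficiency[p]:.2f}")
```

With these values the gap between the mean efficiencies for two and four PEs comes out at roughly 0.23, close to the difference of 0.24 quoted in the text.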

An analysis of the PVM results shows slightly better speedup and efficiency values in the case of two processing elements. These decrease when larger numbers of PEs are used. The effort of management and transfer clearly reduces the performance. Thus the proposed system model reaches a five times better speedup and efficiency in configurations with four PEs.

6 Conclusions

In this paper a specialised computation model based on asynchronous transmission is presented, which automatically adapts to the workload of the elements in the parallel environment at transmission time, enables easy development of parallel programs and minimises the communication and management effort by direct use of TCP/IP sockets. It is based on the creation of pools of tasks, which are controlled by three special modules. A simple distribution strategy for the available packages increases the computing efficiency. More powerful processing elements, or those with a small workload, will receive packages more frequently. Additionally, the package size can be adapted. The efficiency of the proposed method is evaluated through a variety of performance measurements. The results are compared with the results of PVM.

Future work includes extensions which primarily concern improving the system's performance. Storing the packets in the local file system, similar to a spool directory, makes it possible to save all packets of the same type that are to be processed in a special directory. Furthermore, comparative benchmarks with other systems are to be performed.

References

1. PVM Home page: Documentation, comparison between various packages.
2. CONDOR Project description, documentation.
3. MPI Project Home page: Documentation, tutorials, etc.
4. Mosix Home page.
5. Information about PVMPI.
6. S. Kleiman, D. Shah, Programming with Threads, Prentice Hall.
7. B. Wilkinson, M. Allen, Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers, Prentice Hall.
8. O. Sachs, Analyse von Aktienreihen mittels paralleler Korrelationsberechnungen, Master thesis, TU Clausthal, 1998.


MEMORY MANAGEMENT/1 CS 409, FALL 2013 MEMORY MANAGEMENT Requirements: Relocation (to different memory areas) Protection (run time, usually implemented together with relocation) Sharing (and also protection) Logical organization Physical organization

More information

PARALLEL XXXXXXXXX IMPLEMENTATION USING MPI

PARALLEL XXXXXXXXX IMPLEMENTATION USING MPI 2008 ECE566_Puneet_Kataria_Project Introduction to Parallel and Distributed Computing PARALLEL XXXXXXXXX IMPLEMENTATION USING MPI This report describes the approach, implementation and experiments done

More information

Fast optimal task graph scheduling by means of an optimized parallel A -Algorithm

Fast optimal task graph scheduling by means of an optimized parallel A -Algorithm Fast optimal task graph scheduling by means of an optimized parallel A -Algorithm Udo Hönig and Wolfram Schiffmann FernUniversität Hagen, Lehrgebiet Rechnerarchitektur, 58084 Hagen, Germany {Udo.Hoenig,

More information

The Peregrine High-performance RPC System

The Peregrine High-performance RPC System SOFIWARE-PRACTICE AND EXPERIENCE, VOL. 23(2), 201-221 (FEBRUARY 1993) The Peregrine High-performance RPC System DAVID B. JOHNSON* AND WILLY ZWAENEPOEL Department of Computer Science, Rice University, P.

More information

Hardware Assisted Recursive Packet Classification Module for IPv6 etworks ABSTRACT

Hardware Assisted Recursive Packet Classification Module for IPv6 etworks ABSTRACT Hardware Assisted Recursive Packet Classification Module for IPv6 etworks Shivvasangari Subramani [shivva1@umbc.edu] Department of Computer Science and Electrical Engineering University of Maryland Baltimore

More information

PART IV. Internetworking Using TCP/IP

PART IV. Internetworking Using TCP/IP PART IV Internetworking Using TCP/IP Internet architecture, addressing, binding, encapsulation, and protocols in the TCP/IP suite Chapters 20 Internetworking: Concepts, Architecture, and Protocols 21 IP:

More information

Ping Driver PTC Inc. All Rights Reserved.

Ping Driver PTC Inc. All Rights Reserved. 2017 PTC Inc. All Rights Reserved. 2 Table of Contents 1 Table of Contents 2 3 Overview 4 Channel Properties General 4 Channel Properties Ethernet Communications 5 Channel Properties Write Optimizations

More information

Using Time Division Multiplexing to support Real-time Networking on Ethernet

Using Time Division Multiplexing to support Real-time Networking on Ethernet Using Time Division Multiplexing to support Real-time Networking on Ethernet Hariprasad Sampathkumar 25 th January 2005 Master s Thesis Defense Committee Dr. Douglas Niehaus, Chair Dr. Jeremiah James,

More information

Chapter 4 Communication

Chapter 4 Communication DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 4 Communication Layered Protocols (1) Figure 4-1. Layers, interfaces, and protocols in the OSI

More information

Shared Address Space I/O: A Novel I/O Approach for System-on-a-Chip Networking

Shared Address Space I/O: A Novel I/O Approach for System-on-a-Chip Networking Shared Address Space I/O: A Novel I/O Approach for System-on-a-Chip Networking Di-Shi Sun and Douglas M. Blough School of Electrical and Computer Engineering Georgia Institute of Technology Atlanta, GA

More information

CS301 - Data Structures Glossary By

CS301 - Data Structures Glossary By CS301 - Data Structures Glossary By Abstract Data Type : A set of data values and associated operations that are precisely specified independent of any particular implementation. Also known as ADT Algorithm

More information

Building Mobile Applications. F. Ricci 2010/2011

Building Mobile Applications. F. Ricci 2010/2011 Building Mobile Applications F. Ricci 2010/2011 Wireless Software Engineering Model Mobile User Analysis Scenario Analysis Architectural Design Planning Navigation & User Interface Design Maintenance Implementation

More information

Symphony: An Integrated Multimedia File System

Symphony: An Integrated Multimedia File System Symphony: An Integrated Multimedia File System Prashant J. Shenoy, Pawan Goyal, Sriram S. Rao, and Harrick M. Vin Distributed Multimedia Computing Laboratory Department of Computer Sciences, University

More information

Receive Livelock. Robert Grimm New York University

Receive Livelock. Robert Grimm New York University Receive Livelock Robert Grimm New York University The Three Questions What is the problem? What is new or different? What are the contributions and limitations? Motivation Interrupts work well when I/O

More information

SMD149 - Operating Systems

SMD149 - Operating Systems SMD149 - Operating Systems Roland Parviainen November 3, 2005 1 / 45 Outline Overview 2 / 45 Process (tasks) are necessary for concurrency Instance of a program in execution Next invocation of the program

More information

Intersection Acceleration

Intersection Acceleration Advanced Computer Graphics Intersection Acceleration Matthias Teschner Computer Science Department University of Freiburg Outline introduction bounding volume hierarchies uniform grids kd-trees octrees

More information

Latency on a Switched Ethernet Network

Latency on a Switched Ethernet Network Page 1 of 6 1 Introduction This document serves to explain the sources of latency on a switched Ethernet network and describe how to calculate cumulative latency as well as provide some real world examples.

More information

Chapter 6: DataLink Layer - Ethernet Olivier Bonaventure (2010)

Chapter 6: DataLink Layer - Ethernet Olivier Bonaventure (2010) Chapter 6: DataLink Layer - Ethernet Olivier Bonaventure (2010) 6.3.2. Ethernet Ethernet was designed in the 1970s at the Palo Alto Research Center [Metcalfe1976]. The first prototype [5] used a coaxial

More information

Multicast can be implemented here

Multicast can be implemented here MPI Collective Operations over IP Multicast? Hsiang Ann Chen, Yvette O. Carrasco, and Amy W. Apon Computer Science and Computer Engineering University of Arkansas Fayetteville, Arkansas, U.S.A fhachen,yochoa,aapong@comp.uark.edu

More information

An Empirical Study of Reliable Multicast Protocols over Ethernet Connected Networks

An Empirical Study of Reliable Multicast Protocols over Ethernet Connected Networks An Empirical Study of Reliable Multicast Protocols over Ethernet Connected Networks Ryan G. Lane Daniels Scott Xin Yuan Department of Computer Science Florida State University Tallahassee, FL 32306 {ryanlane,sdaniels,xyuan}@cs.fsu.edu

More information

different problems from other networks ITU-T specified restricted initial set Limited number of overhead bits ATM forum Traffic Management

different problems from other networks ITU-T specified restricted initial set Limited number of overhead bits ATM forum Traffic Management Traffic and Congestion Management in ATM 3BA33 David Lewis 3BA33 D.Lewis 2007 1 Traffic Control Objectives Optimise usage of network resources Network is a shared resource Over-utilisation -> congestion

More information

Chapter 6: CPU Scheduling. Operating System Concepts 9 th Edition

Chapter 6: CPU Scheduling. Operating System Concepts 9 th Edition Chapter 6: CPU Scheduling Silberschatz, Galvin and Gagne 2013 Chapter 6: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Thread Scheduling Multiple-Processor Scheduling Real-Time

More information

OpenMP: Open Multiprocessing

OpenMP: Open Multiprocessing OpenMP: Open Multiprocessing Erik Schnetter June 7, 2012, IHPC 2012, Iowa City Outline 1. Basic concepts, hardware architectures 2. OpenMP Programming 3. How to parallelise an existing code 4. Advanced

More information

Chapter 7: Main Memory. Operating System Concepts Essentials 8 th Edition

Chapter 7: Main Memory. Operating System Concepts Essentials 8 th Edition Chapter 7: Main Memory Operating System Concepts Essentials 8 th Edition Silberschatz, Galvin and Gagne 2011 Chapter 7: Memory Management Background Swapping Contiguous Memory Allocation Paging Structure

More information

Chapter 8 & Chapter 9 Main Memory & Virtual Memory

Chapter 8 & Chapter 9 Main Memory & Virtual Memory Chapter 8 & Chapter 9 Main Memory & Virtual Memory 1. Various ways of organizing memory hardware. 2. Memory-management techniques: 1. Paging 2. Segmentation. Introduction Memory consists of a large array

More information

Measuring the Processing Performance of NetSniff

Measuring the Processing Performance of NetSniff Measuring the Processing Performance of NetSniff Julie-Anne Bussiere *, Jason But Centre for Advanced Internet Architectures. Technical Report 050823A Swinburne University of Technology Melbourne, Australia

More information

Monte Carlo Method on Parallel Computing. Jongsoon Kim

Monte Carlo Method on Parallel Computing. Jongsoon Kim Monte Carlo Method on Parallel Computing Jongsoon Kim Introduction Monte Carlo methods Utilize random numbers to perform a statistical simulation of a physical problem Extremely time-consuming Inherently

More information

Evaluating external network bandwidth load for Google Apps

Evaluating external network bandwidth load for Google Apps Evaluating external network bandwidth load for Google Apps This document describes how to perform measurements to better understand how much network load will be caused by using a software as a service

More information

AADECA 2004 XIX Congreso Argentino de Control Automático. Ethernet delay evaluation by an embedded real time Simulink model PC in asynchronous mode

AADECA 2004 XIX Congreso Argentino de Control Automático. Ethernet delay evaluation by an embedded real time Simulink model PC in asynchronous mode Ethernet delay evaluation by an embedded real time Simulink model PC in asynchronous mode Mario R. Modesti, Luis R. Canali, Jorge C. Vaschetti Grupo de Investigaciones en Informática para Ingeniería, Universidad

More information

What is a multi-model database and why use it?

What is a multi-model database and why use it? What is a multi-model database and why use it? An When it comes to choosing the right technology for a new project, ongoing development or a full system upgrade, it can often be challenging to define the

More information

Parallel Branch & Bound

Parallel Branch & Bound Parallel Branch & Bound Bernard Gendron Université de Montréal gendron@iro.umontreal.ca Outline Mixed integer programming (MIP) and branch & bound (B&B) Linear programming (LP) based B&B Relaxation and

More information

Splitting Algorithms

Splitting Algorithms Splitting Algorithms We have seen that slotted Aloha has maximal throughput 1/e Now we will look at more sophisticated collision resolution techniques which have higher achievable throughput These techniques

More information

Work Queue + Python. A Framework For Scalable Scientific Ensemble Applications

Work Queue + Python. A Framework For Scalable Scientific Ensemble Applications Work Queue + Python A Framework For Scalable Scientific Ensemble Applications Peter Bui, Dinesh Rajan, Badi Abdul-Wahid, Jesus Izaguirre, Douglas Thain University of Notre Dame Distributed Computing Examples

More information

A Rule Chaining Architecture Using a Correlation Matrix Memory. James Austin, Stephen Hobson, Nathan Burles, and Simon O Keefe

A Rule Chaining Architecture Using a Correlation Matrix Memory. James Austin, Stephen Hobson, Nathan Burles, and Simon O Keefe A Rule Chaining Architecture Using a Correlation Matrix Memory James Austin, Stephen Hobson, Nathan Burles, and Simon O Keefe Advanced Computer Architectures Group, Department of Computer Science, University

More information

In multiprogramming systems, processes share a common store. Processes need space for:

In multiprogramming systems, processes share a common store. Processes need space for: Memory Management In multiprogramming systems, processes share a common store. Processes need space for: code (instructions) static data (compiler initialized variables, strings, etc.) global data (global

More information

Network protocols and. network systems INTRODUCTION CHAPTER

Network protocols and. network systems INTRODUCTION CHAPTER CHAPTER Network protocols and 2 network systems INTRODUCTION The technical area of telecommunications and networking is a mature area of engineering that has experienced significant contributions for more

More information

Part IV. Chapter 15 - Introduction to MIMD Architectures

Part IV. Chapter 15 - Introduction to MIMD Architectures D. Sima, T. J. Fountain, P. Kacsuk dvanced Computer rchitectures Part IV. Chapter 15 - Introduction to MIMD rchitectures Thread and process-level parallel architectures are typically realised by MIMD (Multiple

More information

Message Passing Interface (MPI)

Message Passing Interface (MPI) What the course is: An introduction to parallel systems and their implementation using MPI A presentation of all the basic functions and types you are likely to need in MPI A collection of examples What

More information

Praktikum 2014 Parallele Programmierung Universität Hamburg Dept. Informatics / Scientific Computing. October 23, FluidSim.

Praktikum 2014 Parallele Programmierung Universität Hamburg Dept. Informatics / Scientific Computing. October 23, FluidSim. Praktikum 2014 Parallele Programmierung Universität Hamburg Dept. Informatics / Scientific Computing October 23, 2014 Paul Bienkowski Author 2bienkow@informatik.uni-hamburg.de Dr. Julian Kunkel Supervisor

More information

Benchmarking CPU Performance

Benchmarking CPU Performance Benchmarking CPU Performance Many benchmarks available MHz (cycle speed of processor) MIPS (million instructions per second) Peak FLOPS Whetstone Stresses unoptimized scalar performance, since it is designed

More information

3. Memory Management

3. Memory Management Principles of Operating Systems CS 446/646 3. Memory Management René Doursat Department of Computer Science & Engineering University of Nevada, Reno Spring 2006 Principles of Operating Systems CS 446/646

More information

Mesh-Based Content Routing Using XML

Mesh-Based Content Routing Using XML Outline Mesh-Based Content Routing Using XML Alex C. Snoeren, Kenneth Conley, and David K. Gifford MIT Laboratory for Computer Science Presented by: Jie Mao CS295-1 Fall 2005 2 Outline Motivation Motivation

More information

Multiprocessor scheduling

Multiprocessor scheduling Chapter 10 Multiprocessor scheduling When a computer system contains multiple processors, a few new issues arise. Multiprocessor systems can be categorized into the following: Loosely coupled or distributed.

More information

Measurement-based Analysis of TCP/IP Processing Requirements

Measurement-based Analysis of TCP/IP Processing Requirements Measurement-based Analysis of TCP/IP Processing Requirements Srihari Makineni Ravi Iyer Communications Technology Lab Intel Corporation {srihari.makineni, ravishankar.iyer}@intel.com Abstract With the

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 12: Distributed Information Retrieval CS 347 Notes 12 2 CS 347 Notes 12 3 CS 347 Notes 12 4 CS 347 Notes 12 5 Web Search Engine Crawling

More information

Early Evaluation of the "Infinite Memory Engine" Burst Buffer Solution

Early Evaluation of the Infinite Memory Engine Burst Buffer Solution Early Evaluation of the "Infinite Memory Engine" Burst Buffer Solution Wolfram Schenck Faculty of Engineering and Mathematics, Bielefeld University of Applied Sciences, Bielefeld, Germany Salem El Sayed,

More information

Parallel Computers. c R. Leduc

Parallel Computers. c R. Leduc Parallel Computers Material based on B. Wilkinson et al., PARALLEL PROGRAMMING. Techniques and Applications Using Networked Workstations and Parallel Computers c 2002-2004 R. Leduc Why Parallel Computing?

More information

Internetworking With TCP/IP

Internetworking With TCP/IP Internetworking With TCP/IP Vol II: Design, Implementation, and Internals SECOND EDITION DOUGLAS E. COMER and DAVID L. STEVENS Department of Computer Sciences Purdue University West Lafayette, IN 47907

More information

Data reduction for CORSIKA. Dominik Baack. Technical Report 06/2016. technische universität dortmund

Data reduction for CORSIKA. Dominik Baack. Technical Report 06/2016. technische universität dortmund Data reduction for CORSIKA Technical Report Dominik Baack 06/2016 technische universität dortmund Part of the work on this technical report has been supported by Deutsche Forschungsgemeinschaft (DFG) within

More information

Remote Health Monitoring for an Embedded System

Remote Health Monitoring for an Embedded System July 20, 2012 Remote Health Monitoring for an Embedded System Authors: Puneet Gupta, Kundan Kumar, Vishnu H Prasad 1/22/2014 2 Outline Background Background & Scope Requirements Key Challenges Introduction

More information

Overview of Project's Achievements

Overview of Project's Achievements PalDMC Parallelised Data Mining Components Final Presentation ESRIN, 12/01/2012 Overview of Project's Achievements page 1 Project Outline Project's objectives design and implement performance optimised,

More information

Universal Communication Component on Symbian Series60 Platform

Universal Communication Component on Symbian Series60 Platform Universal Communication Component on Symbian Series60 Platform Róbert Kereskényi, Bertalan Forstner, Hassan Charaf Department of Automation and Applied Informatics Budapest University of Technology and

More information

Chapter 8: Memory- Management Strategies. Operating System Concepts 9 th Edition

Chapter 8: Memory- Management Strategies. Operating System Concepts 9 th Edition Chapter 8: Memory- Management Strategies Operating System Concepts 9 th Edition Silberschatz, Galvin and Gagne 2013 Chapter 8: Memory Management Strategies Background Swapping Contiguous Memory Allocation

More information

Chapter 8: Memory- Management Strategies

Chapter 8: Memory- Management Strategies Chapter 8: Memory Management Strategies Chapter 8: Memory- Management Strategies Background Swapping Contiguous Memory Allocation Segmentation Paging Structure of the Page Table Example: The Intel 32 and

More information

CHAPTER 8: MEMORY MANAGEMENT. By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

CHAPTER 8: MEMORY MANAGEMENT. By I-Chen Lin Textbook: Operating System Concepts 9th Ed. CHAPTER 8: MEMORY MANAGEMENT By I-Chen Lin Textbook: Operating System Concepts 9th Ed. Chapter 8: Memory Management Background Swapping Contiguous Memory Allocation Segmentation Paging Structure of the

More information

Basics of datacommunication

Basics of datacommunication Data communication I Lecture 1 Course Introduction About the course Basics of datacommunication How is information transported between digital devices? Essential data communication protocols Insight into

More information

Chapter 4: Threads. Chapter 4: Threads. Overview Multicore Programming Multithreading Models Thread Libraries Implicit Threading Threading Issues

Chapter 4: Threads. Chapter 4: Threads. Overview Multicore Programming Multithreading Models Thread Libraries Implicit Threading Threading Issues Chapter 4: Threads Silberschatz, Galvin and Gagne 2013 Chapter 4: Threads Overview Multicore Programming Multithreading Models Thread Libraries Implicit Threading Threading Issues 4.2 Silberschatz, Galvin

More information

CONTENT MODEL FOR MOBILE ADAPTATION OF MULTIMEDIA INFORMATION

CONTENT MODEL FOR MOBILE ADAPTATION OF MULTIMEDIA INFORMATION CONTENT MODEL FOR MOBILE ADAPTATION OF MULTIMEDIA INFORMATION Maija Metso, Antti Koivisto and Jaakko Sauvola MediaTeam, MVMP Unit Infotech Oulu, University of Oulu e-mail: {maija.metso, antti.koivisto,

More information

Grid-Based Genetic Algorithm Approach to Colour Image Segmentation

Grid-Based Genetic Algorithm Approach to Colour Image Segmentation Grid-Based Genetic Algorithm Approach to Colour Image Segmentation Marco Gallotta Keri Woods Supervised by Audrey Mbogho Image Segmentation Identifying and extracting distinct, homogeneous regions from

More information

Modeling of an MPEG Audio Layer-3 Encoder in Ptolemy

Modeling of an MPEG Audio Layer-3 Encoder in Ptolemy Modeling of an MPEG Audio Layer-3 Encoder in Ptolemy Patrick Brown EE382C Embedded Software Systems May 10, 2000 $EVWUDFW MPEG Audio Layer-3 is a standard for the compression of high-quality digital audio.

More information