Agent Roles in Snapshot Assembly

Delbert Hart, Dept. of Computer Science, Washington University in St. Louis, St. Louis, MO
Eileen Kraemer, Dept. of Computer Science, University of Georgia, Athens, GA

Abstract

The ability to understand running distributed computations depends on effective monitoring techniques. Monitoring distributed systems entails two primary tasks: collecting data from the application processes and integrating it into comprehensive global views. This paper focuses on the snapshot assembly task of taking process checkpoints and forming global snapshots. The assembly task can be performed in many ways, each having its own set of advantages and disadvantages. We look at some of the different approaches and their associated costs and benefits. Then the roles that agents can play in the assembly process are examined in the context of the PathFinder visualization system.

Keywords: distributed monitoring, consistent snapshots, agents

1 Introduction

Monitoring is an essential function in tools for understanding distributed computations. Debuggers, interactive steering systems, and visualization tools all rely on some form of monitoring to provide the information upon which to base their representation of the execution of an application. The extent to which users can rely on these representations to be accurate depends on the guarantees made by the underlying monitoring system. However, providing such guarantees is non-trivial in distributed systems. The monitoring of a distributed computation may be viewed as the creation of a sequence of global snapshots. Each global snapshot is a set of checkpoints (local snapshots, each representing the state of a single process), with one checkpoint from each process in the computation. A snapshot should be consistent, representing a possible state of the computation from when the data was collected.
The lack of a global clock and uncertainty in message delivery times complicate the task of assembling global snapshots from the streams of local snapshots produced at each process. Different degrees of consistency exist, and the particular type of consistency that is sought affects both the ordering information that must be collected and the criteria to be applied in the assembly of global snapshots from checkpoints. Although important, consistency is not the only criterion by which a monitoring system is judged. Consistency concerns must be balanced against consideration of the lag in presentation, the scalability of the system, and the perturbation induced by monitoring. Latency or lag refers to the elapsed time between the existence of a state in the program's execution and the presentation of that state to the viewer. The scalability of the monitoring system is a measure of how the performance of the monitoring software changes as a function of the number of processes in the computation, the amount of data, and the frequency of data collection. Perturbation refers to the degree to which the underlying computation is slowed down or otherwise affected by external forces, i.e., the monitoring software. In this paper we examine some of the different ways checkpoints can be assembled into
snapshots. In addition, we show how agents can be used to support the assembly task, in the context of the PathFinder[1] exploratory visualization system. An obvious role that agents can take is to instantiate general assembly algorithms. Clearly, these agent-based assembly protocols will be slower than standard compiled protocols. However, the use of agents allows us to examine a variety of algorithms, "tweak" their parameters interactively, and compare the trade-offs of different approaches before committing to further development. In addition, the use of agents to implement assembly algorithms permits the user to easily switch assembly algorithms at runtime. Instead of simply following general assembly algorithms, agents can be designed to take advantage of application-specific information to make the assembly process more efficient. Agents can also be used to support non-agent assembly solutions. The remainder of the paper is organized as follows: Section 2 describes the PathFinder monitoring system. Section 3 looks at different types of snapshot assembly. The roles agents can play are considered in Section 4. Finally, the paper is summarized in Section 5.

2 PathFinder

The purpose of the PathFinder system is to support exploratory visualization[2]. Exploratory visualization is rooted in the realization that it is not feasible to collect and present all of the data in large, long-lived distributed computations, nor is it typically desirable to do so. Rather, a user exploring the execution of a distributed computation through visualization and interaction needs only a subset of the data available. In the interest of good performance and clarity, only this subset should be collected and presented. As the user explores the computation, the particular subset of data that is "interesting" evolves; thus the user is provided with the ability to navigate through the computation, changing what is collected and how it is presented.
Our approach to exploratory visualization is based on viewing the interaction between the user and the computation in terms of streams of information. The user is presented with a stream of globally consistent snapshots representing the computation and can send a stream of steering commands. The PathFinder architecture consists of Interaction Managers (IMs), a Stream Manager (SM), and a User Interface (UI), as shown in Figure 1. The IMs collect data from and allow steering of values in the application processes. The UI presents the collected data to the user and receives user commands to change how the data is collected, the way in which the data is presented, or the computation (by steering its variables). The SM serves as an intermediary between the UI and IMs, ensuring that the information passed from one side of the system to the other is properly correlated, e.g. collating data into snapshots and distributing steering commands to the appropriate IMs.

Figure 1: PathFinder architecture overview. [Diagram labels: Interaction Managers, Stream Manager, User Interface, Steering Stream, Snapshot Stream.]

PathFinder's architecture uses an attribute-event model of the computation. Processes possess attributes that are available for monitoring and/or steering. The Interaction Manager serves as a framework for accessing an application process's attributes and learning of its events. It is implemented as a library of routines installed at the process and provides an interface between the application and the monitoring system. Each IM contains the database of locally available attributes. Events from the application are received by an IM when specified conditions exist, e.g. the execution passes through a particular point in the code. In the current implementation, software annotations indicate the occurrence of events and the availability of process variables for monitoring, steering, or both.
PathFinder is modular: the functionality of the monitoring and steering system is separated into layers. A layer consists of a module installed at the SM and companion modules installed at the IMs that work together to perform a specific function. Layers provide services such as the collection of ordering information for snapshot construction, monitoring, steering, migration, and rollback. This separation of functionality into loosely coupled modules permits PathFinder to be configured in a "plug-and-play" fashion. The set of layers installed determines both the capabilities of the system and the costs in terms of consistency, perturbation, lag, and scalability. One layer that is available for use is an agent layer that was designed to provide monitoring and steering functionality. A layer that provides information for creating and ordering snapshots is referred to as an assembly layer.

3 Snapshot Assembly

To monitor an application, a tool can generate a sequence of global snapshots of the application. The assembly task is to maintain guarantees about the accuracy of the individual snapshots and their sequencing. Performing the assembly task efficiently can be challenging in distributed computations. A distributed computation is a set of processes cooperating to perform a task or service. This suggests that it would be useful to view the state of the distributed system as a unified whole, a single set of attributes. The distributed nature of the computation results in the attributes being partitioned, by the process they reside in, into a set of checkpoints. Hence, a global snapshot is a set of checkpoints, such that there is exactly one checkpoint from each process. In general, processes do not take checkpoints at the same instant. Consequently, any monitoring system for distributed programs must make decisions about how the attributes of the processes should be aggregated for presentation to the tool it serves.
Clearly, attributes in the same checkpoint should be presented together. It is less clear how to correlate attributes from separate checkpoints. There is no global clock available for the processes to reference, yet some consistency criterion must be used to decide how to assemble a set of local snapshots into a global snapshot. An established consistency criterion for global snapshots is the causality relation. The causality relation is a partial ordering of the events of a computation, given by Lamport's happened-before relation[3]. Two events are concurrent if they are not orderable by the causality relation. Similarly, checkpoints can be ordered by associating a checkpoint with the event that immediately preceded it. The choice of a method for aggregation affects the way in which the user of the tool views the computation. Inconsistent aggregations can mislead the viewer, and even logically consistent aggregations can obscure ordering information or fail to emphasize interesting aspects of the computation[4]. Obtaining consistency is neither automatic nor free, though. The cost of obtaining a consistent view needs to be weighed against other monitoring and steering considerations. To illustrate the different ways in which assembly can be performed, four general categories of assembly algorithms and the consistency guarantees they provide are presented.

Physical assembly is based on hardware clocks. A global snapshot is constructed by choosing the latest checkpoint from each process that is before a chosen time. Although physical assembly is adequate for many applications, problems can arise. If the physical clocks are not tightly synchronized and the elapsed time between local snapshots is small, then global snapshots may be created that violate causality, e.g., a receive appears in a global snapshot before the corresponding send has been presented. Such inconsistencies in global snapshots can mislead a viewer or cause errors in analysis tools.
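The physical assembly rule and the causality violation described above can be illustrated with a minimal sketch. It is not PathFinder code: the `Checkpoint` record, the `sent`/`received` message-id sets, and the function names are assumptions introduced only for this example. Process A's clock is assumed to run fast, so its checkpoint taken after sending message `m1` carries a later timestamp than B's checkpoint taken after receiving it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    # Hypothetical record layout, assumed for illustration only.
    process: str
    clock: float                       # local hardware clock reading
    sent: frozenset = frozenset()      # ids of messages sent so far
    received: frozenset = frozenset()  # ids of messages received so far


def physical_assembly(streams, cutoff):
    """Choose, per process, the latest checkpoint with clock < cutoff."""
    snapshot = {}
    for process, checkpoints in streams.items():
        eligible = [c for c in checkpoints if c.clock < cutoff]
        if not eligible:
            return None  # some process has no checkpoint before the cutoff
        snapshot[process] = max(eligible, key=lambda c: c.clock)
    return snapshot


def violates_causality(snapshot):
    """True if some message appears as received but not as sent."""
    sent = set().union(*(c.sent for c in snapshot.values()))
    received = set().union(*(c.received for c in snapshot.values()))
    return not received <= sent


# A's clock runs fast, so its post-send checkpoint is stamped 13.0,
# while B's post-receive checkpoint is stamped 9.8.
streams = {
    "A": [Checkpoint("A", 11.0),
          Checkpoint("A", 13.0, sent=frozenset({"m1"}))],
    "B": [Checkpoint("B", 8.0),
          Checkpoint("B", 9.8, received=frozenset({"m1"}))],
}

for cutoff in (12.0, 14.0):
    snap = physical_assembly(streams, cutoff)
    print(cutoff, violates_causality(snap))
```

With the earlier cutoff the snapshot pairs A's pre-send checkpoint with B's post-receive checkpoint, which is exactly the receive-before-send inconsistency the text warns about; the later cutoff happens to avoid it.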
To prevent possible causality violations a causal assembly algorithm can be used. Several reasonable approaches exist for achieving causal assembly. A straightforward way is by
keeping logical clocks, also known as Lamport clocks[3]. A snapshot is constructed as it was in physical assembly; the only difference is that logical clocks are used instead of physical ones. Causal assembly ensures that the global snapshots created reflect states of the computation that were possible. The method of implementing causal assembly affects how much flexibility exists in choosing global states for presentation. Limited causal assembly refers to a technique, such as logical clocks, in which some sequences of global snapshots that are correct are not possible to obtain. A technique that provides the ability to reach all possible sequences is called full causal assembly. One way of providing full causal assembly is to use vector clocks to track the causality relation among the processes.

Strong consistency[5] refers to snapshots that are consistent and do not have any messages in transit. PathFinder uses transactional assembly to create global snapshots that are strongly consistent. Transactional assembly views the computation as consisting of a set of (possibly nested) logical actions. The user observes these logical actions as occurring atomically, at whatever level of granularity is appropriate. We refer to the logical actions as transactions. In PathFinder, transactions are recognized independently at each local process through code annotations indicating the beginning and end of the process's participation in the transaction. Each process also records information about the other processes it has communicated with during the transaction. Local portions of the transactions are assembled into full multi-process transactions through transitive analysis of the communication events to determine the members of the transaction. Global snapshots are constructed based on the transaction membership and ordering information. We have developed several algorithms that can be used to recognize and order transactions[6].
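The transitive analysis of communication events can be sketched as a connected-components computation. This is a simplified illustration, not one of the algorithms of [6]: it assumes each process reports a single set of peers it communicated with during its local portion of a transaction, and the function name `assemble_transactions` and the input format are hypothetical.

```python
def assemble_transactions(local_records):
    """Group processes into multi-process transactions.

    local_records maps a process name to the set of peers it communicated
    with during its local portion of a transaction (assumed format).
    Returns a sorted list of sorted member lists.
    """
    parent = {}

    def find(p):
        # Union-find lookup with path halving.
        parent.setdefault(p, p)
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for process, peers in local_records.items():
        find(process)  # register processes that communicated with no one
        for peer in peers:
            union(process, peer)

    groups = {}
    for p in parent:
        groups.setdefault(find(p), set()).add(p)
    return sorted(sorted(members) for members in groups.values())


records = {"P1": {"P2"}, "P2": {"P1", "P3"}, "P3": {"P2"}, "P4": set()}
print(assemble_transactions(records))
```

Here P1 and P3 never communicate directly, yet both end up in the same transaction because each communicated with P2; P4 forms a transaction of its own.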
Some of these algorithms have been encoded into assembly layers for PathFinder.

The assembly of checkpoints into global snapshots is an essential task of any monitoring system. The choice of how it is done and what guarantees are made depends on the application, the task that the user wishes to perform, and performance characteristics of the environment in which the monitoring is performed. Physical assembly has low perturbation and lag, but can have consistency problems. Limited causal assembly ensures that the snapshots created are ones that could have happened, but results in additional perturbation. Full causal assembly allows a choice of any possible consistent state for presentation, but at the expense of maintaining a vector clock or other interprocess dependence information. In general, full causal assembly does not scale to very large computations, as the size of the vector clock must be equal to the number of processes[7]. Transactional assembly causes more perturbation still and can cause some additional lag in the presentation, but it provides additional consistency guarantees. It also scales better than full causal assembly because of the user's ability to choose the level of temporal detail at which to view the computation.

4 Agent Roles

As a distributed computation runs, the best assembly strategy may change. One way to cope with this changing environment is to utilize agents in the assembly process. For this paper, it suffices to consider an agent as an encoding of code and data. More comprehensive views on agents can be found in [8, 9]. For an agent to be effective there must be an environment for it to operate within. In PathFinder, agents operate within an environment known as a milieu. The milieu provides basic services to agents, allowing them to execute, interact with, and create agents locally, and to migrate to other milieux.
As seen in Figure 2, the milieu's interface to the outside world is through two queues, an incoming queue and an outgoing queue. The queues contain agents that are either arriving
at or departing from the milieu. An agent interacts only with the milieu and with other local agents. The application process is represented through agents created and/or simulated by the agent module. When an event occurs in the application, the agent module creates an agent to represent that event (an event agent) and adds it to the milieu's incoming queue. The application's attributes are accessed as though they were data elements of these event agents. The agent module in PathFinder currently generates three types of event agents: transaction, message send, and message receive.

Figure 2: Attributes, events, and interprocess communication from the IM are available to the milieu through the agent module. [Diagram labels: Attributes, Events, Interprocess Communication, Agent Module, Milieu, Incoming Queue, Outgoing Queue.]

As described in the previous section, the snapshot assembly task is to order the checkpoints generated by the distributed processes and integrate them into global snapshots. In the IMs this task is handled by the assembly layer. Through the agent module and milieu, agents have access to the same attributes and events that an assembly layer does. This enables the agents to implement general assembly algorithms. As an example (Figure 3), consider how agents could implement the selective transaction assembly algorithm[6]. The selective algorithm determines the members of the transaction at the application processes and then forwards the membership information to the SM, which then orders the received checkpoints.

Figure 3: Selective algorithm implemented using agents: (a) the agent observes which processes its process communicates with during the transaction through arriving event agents; (b) the agent waits for information about other processes in the transaction to be delivered to it; (c) the agent then migrates to another milieu, (d) where it delivers its accumulated information to another agent in the transaction.

Agents are not limited to implementing
standard assembly algorithms. In fact, one of their strengths is their ability to make use of application-specific information to order checkpoints more efficiently. For example, consider a distributed application that primarily consists of a loop and contains a counter of the number of times it has passed through the loop. Agents could read the attribute and use it as a logical clock to order checkpoints. The perturbation, scaling, lag, and consistency in such a case are excellent, since the application is already doing the work necessary to decide on an ordering. Having an attribute that can be used as a logical clock is a very simple case. In addition to published attributes, agents can utilize data from a wide variety of sources such as message passing events, the history of the process, monitoring-specific data, and information received from outside the process. The ordering of snapshots can also be encapsulated by agents. Figure 4 illustrates how an agent could generate logical clock values to produce a particular ordering for the user. Computing these custom assembly tasks could also be done at the SM. The primary advantage of using agents to compute them is that it reduces the load on the SM. When monitoring large computations, the SM can easily become a bottleneck, since all of the information that the user will see passes through it. Using agents reduces both the amount of information that needs to be sent to the SM and the amount of computation done at the SM. In general, agents serve as a convenient means of extracting and analyzing the ordering information from the application. Agents can also be used to aid a non-agent assembly layer. For example, some assembly algorithms rely on an initial synchronization to function properly. This presents an obstacle to starting one of these algorithms after the computation has begun. Agents can be used to provide the initial synchronization needed, without modifying the original algorithm.
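The loop-counter case mentioned earlier in this section can be sketched as follows. The example assumes that every process publishes the same loop counter as an attribute and that equal counter values really do denote checkpoints that belong together (for instance, because the iterations are synchronized); the tuple format and the name `assemble_by_counter` are hypothetical and not part of PathFinder.

```python
from collections import defaultdict


def assemble_by_counter(checkpoints, processes):
    """Build global snapshots keyed by an application loop counter.

    checkpoints: iterable of (process, counter, attributes) tuples, in any
                 arrival order (assumed format).
    processes:   names of all processes in the computation.
    Returns (counter, {process: attributes}) pairs, in counter order, for
    every counter value at which all processes have reported.
    """
    by_counter = defaultdict(dict)
    for process, counter, attributes in checkpoints:
        by_counter[counter][process] = attributes

    expected = set(processes)
    return [
        (counter, dict(by_counter[counter]))
        for counter in sorted(by_counter)
        if set(by_counter[counter]) == expected
    ]


arrivals = [
    ("B", 2, {"x": 20}),
    ("A", 1, {"x": 1}),
    ("A", 2, {"x": 2}),
    ("B", 1, {"x": 10}),
    ("A", 3, {"x": 3}),  # B has not yet reported iteration 3
]
for counter, snapshot in assemble_by_counter(arrivals, ["A", "B"]):
    print(counter, snapshot)
```

No extra ordering information is collected here: the counter the application already maintains serves as the timestamp, which is why this case is cheap in perturbation and lag.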
In general, agents can be used as a means of steering snapshot assembly modules.

Figure 4: A loop counter x is used as the basis for a logical clock. In order to show the transaction in case b as happening at the same time as the else block (across different processes), the agent would label both blocks with the same timestamp. [Diagram labels: x.3, x.5, x.7, x.5; if, switch, case a, case b, case c, else.]

Assembling checkpoints into global snapshots should not always be done with agents, as oftentimes there will be efficient non-agent solutions. However, agents can play valuable roles in the assembly task. They provide an avenue to prototype general assembly algorithms and compare them to each other. Their easy installation and removal from a running computation allow users to try out different algorithms until they find the one that best meets their needs. Another useful role for agents is the realization of custom assembly algorithms based on application-specific information. Taking advantage of ordering knowledge embedded in the application can provide solutions that perform better than generic assembly algorithms. Finally, agents can be used to aid non-agent assembly layers.

5 Summary

Many different approaches to snapshot assembly may be employed, each with different costs and benefits. The choice of assembly method depends on the needs of the user and the application being monitored. Within the assembly task, agents can implement general assembly algorithms, perform snapshot assembly using application-specific information, or support non-agent assembly components. Such agent-based assembly can be useful in rapid prototyping of new assembly algorithms and for rapid deployment of application-specific assembly algorithms.

This paper is based upon work supported in part by the National Science Foundation under Grant No. CDA. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the National Science Foundation.

References

[1] Delbert Hart and Eileen Kraemer. Consistency considerations in the interactive steering of computations. International Journal of Parallel and Distributed Networks and Systems, to appear.
[2] Delbert Hart, Eileen Kraemer, and Gruia-Catalin Roman. Using snapshot streams to support visual exploration. Technical Report WUCS-97-46, Washington University in St. Louis.
[3] Leslie Lamport. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7):558-565, July.
[4] Eileen Kraemer and John T. Stasko. Creating an accurate portrayal of concurrent executions. IEEE Concurrency, 6(1):36-46, January-March.
[5] Jean-Michel Helary, Robert H.B. Netzer, and Michel Raynal. Consistency issues in distributed checkpoints. IEEE Transactions on Software Engineering, 25(2):274-280, March/April.
[6] Delbert Hart, Eileen Kraemer, and Gruia-Catalin Roman. Query-based visualization of distributed computations. In Proceedings of the 11th International Parallel Processing Symposium, Geneva, Switzerland, April 1997.
[7] B. Charron-Bost. Concerning the size of logical clocks in distributed systems. Information Processing Letters, 39:11-16, July.
[8] Jeffrey Bradshaw, editor. Software Agents. MIT Press.
[9] W. Brenner, R. Zarnekow, and H. Witting. Intelligent Software Agents. Springer-Verlag.
More informationBreakpoints and Halting in Distributed Programs
1 Breakpoints and Halting in Distributed Programs Barton P. Miller Jong-Deok Choi Computer Sciences Department University of Wisconsin-Madison 1210 W. Dayton Street Madison, Wisconsin 53706 Abstract Interactive
More informationOn Object Orientation as a Paradigm for General Purpose. Distributed Operating Systems
On Object Orientation as a Paradigm for General Purpose Distributed Operating Systems Vinny Cahill, Sean Baker, Brendan Tangney, Chris Horn and Neville Harris Distributed Systems Group, Dept. of Computer
More informationLanguage-Based Parallel Program Interaction: The Breezy Approach. Darryl I. Brown Allen D. Malony. Bernd Mohr. University of Oregon
Language-Based Parallel Program Interaction: The Breezy Approach Darryl I. Brown Allen D. Malony Bernd Mohr Department of Computer And Information Science University of Oregon Eugene, Oregon 97403 fdarrylb,
More information2 Application Support via Proxies Onion Routing can be used with applications that are proxy-aware, as well as several non-proxy-aware applications, w
Onion Routing for Anonymous and Private Internet Connections David Goldschlag Michael Reed y Paul Syverson y January 28, 1999 1 Introduction Preserving privacy means not only hiding the content of messages,
More informationComparing Gang Scheduling with Dynamic Space Sharing on Symmetric Multiprocessors Using Automatic Self-Allocating Threads (ASAT)
Comparing Scheduling with Dynamic Space Sharing on Symmetric Multiprocessors Using Automatic Self-Allocating Threads (ASAT) Abstract Charles Severance Michigan State University East Lansing, Michigan,
More informationDigital Archives: Extending the 5S model through NESTOR
Digital Archives: Extending the 5S model through NESTOR Nicola Ferro and Gianmaria Silvello Department of Information Engineering, University of Padua, Italy {ferro, silvello}@dei.unipd.it Abstract. Archives
More informationCombining Different Business Rules Technologies:A Rationalization
A research and education initiative at the MIT Sloan School of Management Combining Different Business Rules Technologies:A Rationalization Paper 116 Benjamin Grosof Isabelle Rouvellou Lou Degenaro Hoi
More informationHow to Make a Correct Multiprocess Program Execute Correctly on a Multiprocessor
How to Make a Correct Multiprocess Program Execute Correctly on a Multiprocessor Leslie Lamport 1 Digital Equipment Corporation February 14, 1993 Minor revisions January 18, 1996 and September 14, 1996
More informationPROCESSES AND THREADS
PROCESSES AND THREADS A process is a heavyweight flow that can execute concurrently with other processes. A thread is a lightweight flow that can execute concurrently with other threads within the same
More informationA Study of Query Execution Strategies. for Client-Server Database Systems. Department of Computer Science and UMIACS. University of Maryland
A Study of Query Execution Strategies for Client-Server Database Systems Donald Kossmann Michael J. Franklin Department of Computer Science and UMIACS University of Maryland College Park, MD 20742 f kossmann
More informationChapter 3: Processes. Operating System Concepts 8 th Edition,
Chapter 3: Processes, Silberschatz, Galvin and Gagne 2009 Chapter 3: Processes Process Concept Process Scheduling Operations on Processes Interprocess Communication 3.2 Silberschatz, Galvin and Gagne 2009
More informationJoint Entity Resolution
Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute
More informationSIGNAL PROCESSING TOOLS FOR SPEECH RECOGNITION 1
SIGNAL PROCESSING TOOLS FOR SPEECH RECOGNITION 1 Hualin Gao, Richard Duncan, Julie A. Baca, Joseph Picone Institute for Signal and Information Processing, Mississippi State University {gao, duncan, baca,
More information2. Time and Global States Page 1. University of Freiburg, Germany Department of Computer Science. Distributed Systems
2. Time and Global States Page 1 University of Freiburg, Germany Department of Computer Science Distributed Systems Chapter 3 Time and Global States Christian Schindelhauer 12. May 2014 2. Time and Global
More informationCheckpointing and Rollback Recovery in Distributed Systems: Existing Solutions, Open Issues and Proposed Solutions
Checkpointing and Rollback Recovery in Distributed Systems: Existing Solutions, Open Issues and Proposed Solutions D. Manivannan Department of Computer Science University of Kentucky Lexington, KY 40506
More information. The problem: ynamic ata Warehouse esign Ws are dynamic entities that evolve continuously over time. As time passes, new queries need to be answered
ynamic ata Warehouse esign? imitri Theodoratos Timos Sellis epartment of Electrical and Computer Engineering Computer Science ivision National Technical University of Athens Zographou 57 73, Athens, Greece
More informationPattern Density and Role Modeling of an Object Transport Service
Pattern Density and Role Modeling of an Object Transport Service Dirk Riehle. SKYVA International. 25 First Street, Cambridge, MA 02129, U.S.A. E-mail: driehle@skyva.com or riehle@acm.org Roger Brudermann.
More informationDISTRIBUTED SHARED MEMORY
DISTRIBUTED SHARED MEMORY COMP 512 Spring 2018 Slide material adapted from Distributed Systems (Couloris, et. al), and Distr Op Systems and Algs (Chow and Johnson) 1 Outline What is DSM DSM Design and
More informationParallel and Distributed Systems. Programming Models. Why Parallel or Distributed Computing? What is a parallel computer?
Parallel and Distributed Systems Instructor: Sandhya Dwarkadas Department of Computer Science University of Rochester What is a parallel computer? A collection of processing elements that communicate and
More informationSpemmet - A Tool for Modeling Software Processes with SPEM
Spemmet - A Tool for Modeling Software Processes with SPEM Tuomas Mäkilä tuomas.makila@it.utu.fi Antero Järvi antero.jarvi@it.utu.fi Abstract: The software development process has many unique attributes
More informationDISTRIBUTED SELF-SIMULATION OF HOLONIC MANUFACTURING SYSTEMS
DISTRIBUTED SELF-SIMULATION OF HOLONIC MANUFACTURING SYSTEMS Naoki Imasaki I, Ambalavanar Tharumarajah 2, Shinsuke Tamura 3 J Toshiba Corporation, Japan, naoki.imasaki@toshiba.co.jp 2 CSIRO Manufacturing
More informationVerteilte Systeme/Distributed Systems Ch. 5: Various distributed algorithms
Verteilte Systeme/Distributed Systems Ch. 5: Various distributed algorithms Holger Karl Computer Networks Group Universität Paderborn Goal of this chapter Apart from issues in distributed time and resulting
More informationKhoral Research, Inc. Khoros is a powerful, integrated system which allows users to perform a variety
Data Parallel Programming with the Khoros Data Services Library Steve Kubica, Thomas Robey, Chris Moorman Khoral Research, Inc. 6200 Indian School Rd. NE Suite 200 Albuquerque, NM 87110 USA E-mail: info@khoral.com
More informationAbstract Studying network protocols and distributed applications in real networks can be dicult due to the need for complex topologies, hard to nd phy
ONE: The Ohio Network Emulator Mark Allman, Adam Caldwell, Shawn Ostermann mallman@lerc.nasa.gov, adam@eni.net ostermann@cs.ohiou.edu School of Electrical Engineering and Computer Science Ohio University
More informationTime Synchronization and Logical Clocks
Time Synchronization and Logical Clocks CS 240: Computing Systems and Concurrency Lecture 5 Marco Canini Credits: Michael Freedman and Kyle Jamieson developed much of the original material. Today 1. The
More informationTime. COS 418: Distributed Systems Lecture 3. Wyatt Lloyd
Time COS 418: Distributed Systems Lecture 3 Wyatt Lloyd Today 1. The need for time synchronization 2. Wall clock time synchronization 3. Logical Time: Lamport Clocks 2 A distributed edit-compile workflow
More informationNext-Generation Architecture for Virtual Prototyping
Next-Generation Architecture for Virtual Prototyping Dr. Bipin Chadha John Welsh Principal Member Manager Lockheed Martin ATL Lockheed Martin ATL (609) 338-3865 (609) 338-3865 bchadha@atl.lmco.com jwelsh@atl.lmco.com
More informationPart II. Integration Use Cases
Part II Integration Use Cases Achieving One Version of the Truth requires integration between the data synchronization application environment (especially the local trade item catalog) and enterprise applications
More informationDatabricks Delta: Bringing Unprecedented Reliability and Performance to Cloud Data Lakes
Databricks Delta: Bringing Unprecedented Reliability and Performance to Cloud Data Lakes AN UNDER THE HOOD LOOK Databricks Delta, a component of the Databricks Unified Analytics Platform*, is a unified
More informationThe Architecture of a System for the Indexing of Images by. Content
The Architecture of a System for the Indexing of s by Content S. Kostomanolakis, M. Lourakis, C. Chronaki, Y. Kavaklis, and S. C. Orphanoudakis Computer Vision and Robotics Laboratory Institute of Computer
More informationEvent Ordering. Greg Bilodeau CS 5204 November 3, 2009
Greg Bilodeau CS 5204 November 3, 2009 Fault Tolerance How do we prepare for rollback and recovery in a distributed system? How do we ensure the proper processing order of communications between distributed
More informationThe Google File System
The Google File System Sanjay Ghemawat, Howard Gobioff and Shun Tak Leung Google* Shivesh Kumar Sharma fl4164@wayne.edu Fall 2015 004395771 Overview Google file system is a scalable distributed file system
More informationOPAX - An Open Peer-to-Peer Architecture for XML Message Exchange
OPAX - An Open Peer-to-Peer Architecture for XML Message Exchange Bernhard Schandl, University of Vienna bernhard.schandl@univie.ac.at Users wishing to find multimedia material about interesting events
More informationThe Encoding Complexity of Network Coding
The Encoding Complexity of Network Coding Michael Langberg Alexander Sprintson Jehoshua Bruck California Institute of Technology Email: mikel,spalex,bruck @caltech.edu Abstract In the multicast network
More informationSYNCHRONIZATION. DISTRIBUTED SYSTEMS Principles and Paradigms. Second Edition. Chapter 6 ANDREW S. TANENBAUM MAARTEN VAN STEEN
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN واحد نجف آباد Chapter 6 SYNCHRONIZATION Dr. Rastegari - Email: rastegari@iaun.ac.ir - Tel: +98331-2291111-2488
More informationPerformance and Scalability with Griddable.io
Performance and Scalability with Griddable.io Executive summary Griddable.io is an industry-leading timeline-consistent synchronized data integration grid across a range of source and target data systems.
More informationFDI Field Device Integration Technology
FDI Field Device Integration Technology 1 Table of Contents 1 Introduction... 3 2 FDI Technology... 3 2.1 FDI Package... 3 3 FDI host... 4 3.1 Hierarchical networks nested communication... 6 3.2 Harmonization
More informationUDP Packet Monitoring with Stanford Data Stream Manager
UDP Packet Monitoring with Stanford Data Stream Manager Nadeem Akhtar #1, Faridul Haque Siddiqui #2 # Department of Computer Engineering, Aligarh Muslim University Aligarh, India 1 nadeemalakhtar@gmail.com
More informationRelative Reduced Hops
GreedyDual-Size: A Cost-Aware WWW Proxy Caching Algorithm Pei Cao Sandy Irani y 1 Introduction As the World Wide Web has grown in popularity in recent years, the percentage of network trac due to HTTP
More informationSite 1 Site 2 Site 3. w1[x] pos ack(c1) pos ack(c1) w2[x] neg ack(c2)
Using Broadcast Primitives in Replicated Databases y I. Stanoi D. Agrawal A. El Abbadi Dept. of Computer Science University of California Santa Barbara, CA 93106 E-mail: fioana,agrawal,amrg@cs.ucsb.edu
More informationEcient Redo Processing in. Jun-Lin Lin. Xi Li. Southern Methodist University
Technical Report 96-CSE-13 Ecient Redo Processing in Main Memory Databases by Jun-Lin Lin Margaret H. Dunham Xi Li Department of Computer Science and Engineering Southern Methodist University Dallas, Texas
More informationAssignment 12: Commit Protocols and Replication Solution
Data Modelling and Databases Exercise dates: May 24 / May 25, 2018 Ce Zhang, Gustavo Alonso Last update: June 04, 2018 Spring Semester 2018 Head TA: Ingo Müller Assignment 12: Commit Protocols and Replication
More informationAdapting Commit Protocols for Large-Scale and Dynamic Distributed Applications
Adapting Commit Protocols for Large-Scale and Dynamic Distributed Applications Pawel Jurczyk and Li Xiong Emory University, Atlanta GA 30322, USA {pjurczy,lxiong}@emory.edu Abstract. The continued advances
More informationCooperative Planning of Independent Agents. through Prototype Evaluation. E.-E. Doberkat W. Hasselbring C. Pahl. University ofdortmund
Investigating Strategies for Cooperative Planning of Independent Agents through Prototype Evaluation E.-E. Doberkat W. Hasselbring C. Pahl University ofdortmund Dept. of Computer Science, Informatik 10
More informationTECHNICAL RESEARCH REPORT
TECHNICAL RESEARCH REPORT A Resource Reservation Scheme for Synchronized Distributed Multimedia Sessions by W. Zhao, S.K. Tripathi T.R. 97-14 ISR INSTITUTE FOR SYSTEMS RESEARCH Sponsored by the National
More informationAssignment 4. Overview. Prof. Stewart Weiss. CSci 335 Software Design and Analysis III Assignment 4
Overview This assignment combines several dierent data abstractions and algorithms that we have covered in class, including priority queues, on-line disjoint set operations, hashing, and sorting. The project
More informationImplementation of Clocks and Sensors
Implementation of Clocks and Sensors Term Paper EE 382N Distributed Systems Dr. Garg November 30, 2000 Submitted by: Yousuf Ahmed Chandresh Jain Onur Mutlu Global Predicate Detection in Distributed Systems
More informationMobile Computing Models What is the best way to partition a computation as well as the functionality of a system or application between stationary and
Mobile Computig: Conclusions Evaggelia Pitoura Computer Science Department, University of Ioannina, Ioannina, Greece http://www.cs.uoi.gr/~ pitoura Summer School, Jyvaskyla, August 1998 Mobile Computing
More informationCity Research Online. Permanent City Research Online URL:
Kloukinas, C., Saridakis, T. & Issarny, V. (1999). Fault Tolerant Access to Dynamically Located Services for CORBA Applications. Paper presented at the Computer Applications in Industry and Engineering
More informationSelected Questions. Exam 2 Fall 2006
Selected Questions Exam 2 Fall 2006 Page 1 Question 5 The clock in the clock tower in the town of Chronos broke. It was repaired but now the clock needs to be set. A train leaves for the nearest town,
More informationTECHNICAL RESEARCH REPORT
TECHNICAL RESEARCH REPORT A Scalable Extension of Group Key Management Protocol by R. Poovendran, S. Ahmed, S. Corson, J. Baras CSHCN T.R. 98-5 (ISR T.R. 98-14) The Center for Satellite and Hybrid Communication
More informationChapter 1: Introduction
Chapter 1: Introduction What is an operating system? Simple Batch Systems Multiprogramming Batched Systems Time-Sharing Systems Personal-Computer Systems Parallel Systems Distributed Systems Real -Time
More informationGrowing Agents - An Investigation of Architectural Mechanisms for the Specification of Developing Agent Architectures
Growing Agents - An Investigation of Architectural Mechanisms for the Specification of Developing Agent Architectures Virgil Andronache Matthias Scheutz University of Notre Dame Notre Dame, IN 46556 e-mail:
More informationChapter 8 Fault Tolerance
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 8 Fault Tolerance 1 Fault Tolerance Basic Concepts Being fault tolerant is strongly related to
More informationSAMOS: an Active Object{Oriented Database System. Stella Gatziu, Klaus R. Dittrich. Database Technology Research Group
SAMOS: an Active Object{Oriented Database System Stella Gatziu, Klaus R. Dittrich Database Technology Research Group Institut fur Informatik, Universitat Zurich fgatziu, dittrichg@ifi.unizh.ch to appear
More informationThinAir Server Platform White Paper June 2000
ThinAir Server Platform White Paper June 2000 ThinAirApps, Inc. 1999, 2000. All Rights Reserved Copyright Copyright 1999, 2000 ThinAirApps, Inc. all rights reserved. Neither this publication nor any part
More informationEvent List Management In Distributed Simulation
Event List Management In Distributed Simulation Jörgen Dahl ½, Malolan Chetlur ¾, and Philip A Wilsey ½ ½ Experimental Computing Laboratory, Dept of ECECS, PO Box 20030, Cincinnati, OH 522 0030, philipwilsey@ieeeorg
More informationImproved Database Development using SQL Compare
Improved Database Development using SQL Compare By David Atkinson and Brian Harris, Red Gate Software. October 2007 Introduction This white paper surveys several different methodologies of database development,
More informationControl of Processes in Operating Systems: The Boss-Slave Relation
Control of Processes in Operating Systems: The Boss-Slave Relation R. Stockton Gaines Communications Research Division, Institute for Defense Analyses, Princeton NJ and The RAND Corporation, Santa Monica
More informationStackable Layers: An Object-Oriented Approach to. Distributed File System Architecture. Department of Computer Science
Stackable Layers: An Object-Oriented Approach to Distributed File System Architecture Thomas W. Page Jr., Gerald J. Popek y, Richard G. Guy Department of Computer Science University of California Los Angeles
More informationCSE 5306 Distributed Systems
CSE 5306 Distributed Systems Fault Tolerance Jia Rao http://ranger.uta.edu/~jrao/ 1 Failure in Distributed Systems Partial failure Happens when one component of a distributed system fails Often leaves
More informationSABLE: Agent Support for the Consolidation of Enterprise-Wide Data- Oriented Simulations
: Agent Support for the Consolidation of Enterprise-Wide Data- Oriented Simulations Brian Blake The MITRE Corporation Center for Advanced Aviation System Development 1820 Dolley Madison Blvd. McLean, VA
More information