Fire Phoenix Cluster Operating System Kernel and its Evaluation

Jianfeng Zhan, Ninghui Sun
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
{jfzhan, snh}@ncic.ac.cn

Abstract

The Fire Phoenix cluster operating system kernel (Phoenix kernel) is a minimum set of cluster core functions with scalability and fault-tolerance support. In this paper, we define the components of a cluster operating system kernel and introduce its internal mechanisms for scalability and fault-tolerance support. On top of the Phoenix kernel, user environments can easily be constructed according to users' needs. In addition, we evaluate the Phoenix kernel from four perspectives: fault-tolerance, scalability, performance impact on scientific computing, and ease of constructing user environments. Our design has been proved in practice on the Dawning 4000A super server, the biggest cluster system for scientific computing in China.

1: Introduction

Although cluster systems have been widely used as platforms for scientific and business computing, developing cluster system software remains a challenge. First, the application range of clusters keeps expanding and user needs are always varying, so cluster system software should provide a flexible component framework to adapt to this situation. Second, since more and more cluster systems are adopted as business computing platforms, such as Web hosting environments and digital libraries, cluster system software should provide high-availability support for business computing, which promises 7x24 service delivery [1][2]. Lastly, cluster system software should have a highly scalable architecture that easily extends to increasing system scale.

To deal with these difficulties and challenges, we need to take a global view and settle on a reasonable architecture that hides system complexity and reduces the risks of developing cluster system software. The layered architecture style [4] is the best practice for this need, as proved by the UNIX operating system [5]: the UNIX kernel shields the bottom-layer hardware, utilities communicate with the kernel through a set of documented interfaces, and applications are constructed upon lower-level utilities. In our opinion, these efforts point out a direction for the research and development of cluster system software. We can define and develop a cluster operating system kernel that provides a stable minimum set of core functions with scalability and fault-tolerance support. On the basis of this cluster operating system kernel, we can construct user environments that are easily adapted and extended according to users' needs.

In July 2002, the Institute of Computing Technology, Chinese Academy of Sciences started developing a cluster operating system named Fire Phoenix (Phoenix for short in the remainder of this paper). It provides facilities such as system monitoring, system administration and job management for the Dawning 4000A super server. The Dawning 4000A [6] super server, composed of 640 nodes, ranked No. 10 in the list of the top 500 supercomputers in 2004.

This paper is structured as follows. Section 2 outlines related work; section 3 gives an overview of the Phoenix cluster operating system; section 4 defines the architecture of the Phoenix cluster operating system kernel; section 5 evaluates the Phoenix operating system kernel from four perspectives, including fault-tolerance, performance impact on scientific computing, scalability, and ease of constructing user environments; finally, section 6 concludes the paper.

2: Related Work

In dealing with the difficulties and challenges mentioned in section 1, the research and development of cluster system software has followed different paths and lacked unified effort. Beowulf system software [7] takes full advantage of open-source cluster software that was developed without unified effort, so packaging different pieces of software is the main work, and it cannot achieve the targets of efficiency, common use, ease of use and interoperability. To improve this situation, many research institutes and open-source groups have begun packaging and integrating cluster system software, such as SCE [8] developed by Kasetsart University in Thailand, SCore [9] from Japan's Real World Computing project, and the OSCAR project [10]. Among them, the typical work is the OSCAR project, which focuses on "best cluster practices", taking the best of what is currently available and integrating it into one package.

Several projects have now begun fully integrated design of cluster system software. The SSS project [11] aims at addressing the lack of software for the effective management and utilization of terascale computational resources and at developing an integrated suite of machine-independent, scalable systems software components, with a focus on scientific computing support. The rationale behind the DOE CCA project [12] is to use component frameworks to deal with the complexity of developing interdisciplinary HPC applications by introducing higher-level abstractions and allowing code reuse. The Galaxy cluster management framework [2] focuses on servicing large-scale enterprise clusters through novel, highly scalable communication and management techniques; it is developed for Microsoft's Windows 2000 operating system and integrates tightly with the naming and directory services offered by the host operating system. The Oceano project [13] is a prototype of a scalable management infrastructure for a large server farm, motivated by large-scale Web hosting environments, which increasingly require support for peak loads that are orders of magnitude larger than the normal steady state. The Chiba City project [14] studies a medium-sized testbed cluster dedicated to computer science research.

In these projects, people develop different cluster system software from scratch according to different users' requirements. For example, the SSS project [11] aims at supporting scientific computing, while Galaxy [2] and Oceano [13] develop cluster system software for business computing. Since the application range of clusters is always expanding, if we do not change this situation, the code base of cluster software will keep growing. In several projects, researchers have resorted to component technology to solve this problem [11][12]. Though component technology supports code reuse and component substitution, how to adapt to varying user needs and reduce the risk of developing cluster system software is still a problem.

Phoenix takes a different approach to this challenge. First, we develop the Phoenix kernel, which maintains a stable minimum set of core functions with scalability and fault-tolerance support. Then, on the basis of the Phoenix kernel, we develop different user environments according to user needs.

3: Overview of the Phoenix Cluster OS

Figure 1 describes the layered architecture of the Phoenix OS.
The lowest layer is the heterogeneous resource layer, which shields heterogeneous hardware architectures, host operating systems and communication protocols with middleware. The second layer is the Phoenix cluster operating system kernel, which defines the minimum set of core functions with scalability and fault-tolerance support. The top layer is the user environment, through which users utilize cluster resources to fulfill their goals. In the Phoenix system, we define four user roles: system constructor, system administrator, scientific computing user, and business computing user. For these users, Phoenix provides different user environments:

* The system constructor configures, deploys and boots the cluster system with the system construction tool, which plays a role similar to the BIOS and the kernel booting module of a host operating system.

* System management and monitoring tools assist system administrators in daily system management, real-time system monitoring, performance analysis and fault analysis.

* The job management system is the user environment for scientific computing users; it manages cluster resources, and through it users submit their jobs and complete their computing tasks.

* The business application runtime environment is the core of the business application hosting environment. It manages multi-tier business applications and guarantees their high availability and load balancing.

Figure 1: Architecture of Phoenix.

4: Phoenix Cluster Operating System Kernel

4.1: Rationale behind the Phoenix Kernel

When developing the cluster operating system kernel, we follow three principles:

(1) Users interact with the user environment, and the Phoenix kernel is invisible to them, which decreases the cognitive overhead and management cost for daily users.

(2) By maintaining a stable minimum set of core functions at the cluster operating system kernel level, we can easily construct, adapt and extend user environments on the basis of the Phoenix kernel according to users' needs, which controls the complexity and reduces the risk of developing a cluster operating system.

(3) The Phoenix kernel provides scalable and fault-tolerant support, so the difficulty of developing user environments is decreased.

4.2: Component Framework of the Phoenix Kernel

The Phoenix kernel is the minimum set of core functions of the Phoenix cluster OS and the basic building block for user environments. It provides documented interfaces and parallel command calls for user environments in different forms with uniform semantics (such as Socket, RPC and ORB). The layered architecture of the Phoenix kernel is shown in Figure 2.

Figure 2: Phoenix kernel stack (parallel command and application programming interface; event, checkpoint, data bulletin and group services; detector services for physical resource, node, network and application state; parallel process management).

* Configuration service. It provides cluster-wide configuration information, including information on physical resources, the Phoenix kernel and user environments. The configuration service has a self-introspection mechanism to automatically find and diagnose cluster resources, and provides a documented interface for dynamic reconfiguration.

* Security service. It provides authorization, authentication and encryption functions for users.

* Parallel process management service. It performs efficient remote job loading, deleting and resource cleanup, and is a basic module of the Phoenix kernel.

* Detector services. These include the physical resource detector, application state detector, node state detector and network state detector. The physical resource detector monitors the usage of physical resources such as CPU, memory, swap, disk I/O and network I/O of each node, which is fundamental for the job management system's schedulers. The application state detector monitors application status, such as the physical resources consumed by a specific application, the application's liveness, and the application information related to service level agreements, which is fundamental for the business application runtime environment. The node state detector and network state detector monitor the status of nodes and network connectivity.

* Group service. The group service is the kernel component that addresses scalability and high availability at the same time. Its key functions are guaranteeing the high availability of its meta-group, providing interfaces for creating, joining and leaving upper-layer service groups, and guaranteeing the high availability of upper-layer service groups.

* Checkpoint service. Based on the group service, it provides interfaces for upper-layer services to save system data, which means that upper-layer services themselves are responsible for saving and deleting system state by calling the checkpoint service interfaces.

* Event service. Based on the group service, the event service plays the role of the communication channel of the Phoenix kernel. It provides interfaces for registering an event supplier together with the event types it produces, and for registering an event consumer together with the event types it is interested in; besides these interfaces, the event service also provides event filtering and real-time notification (a usage sketch follows this list).

* Data bulletin service. Based on the group service, the data bulletin service is an in-memory database that stores the cluster-wide physical resource state and application state; it provides interfaces for non-persistent data storage and data query.
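The event service interfaces listed above can be made concrete with a short sketch. This is a minimal, self-contained illustration rather than the actual Phoenix kernel API: the class name EventService, its method names and the event type strings are assumptions for illustration, and a real deployment would expose these calls remotely over Socket, RPC or ORB rather than in-process.

```python
"""Minimal sketch of the event-service interfaces described in Section 4.2.
All names (EventService, register_supplier, ...) are illustrative, not the
real Phoenix kernel API; a real deployment would expose these calls remotely
(Socket, RPC or ORB) rather than in-process."""

from collections import defaultdict


class EventService:
    def __init__(self):
        self._suppliers = {}                 # supplier name -> event types it produces
        self._consumers = defaultdict(list)  # event type -> callbacks interested in it

    def register_supplier(self, name, event_types):
        """Register an event supplier and the event types it produces."""
        self._suppliers[name] = set(event_types)

    def register_consumer(self, event_types, callback):
        """Register a consumer for the event types it is interested in."""
        for etype in event_types:
            self._consumers[etype].append(callback)

    def push(self, supplier, event_type, payload):
        """Filter the event by type and notify the interested consumers."""
        if event_type not in self._suppliers.get(supplier, ()):
            raise ValueError("supplier did not declare this event type")
        for callback in self._consumers[event_type]:
            callback(event_type, payload)


if __name__ == "__main__":
    es = EventService()
    # A GSD acts as a supplier of node failure and recovery events.
    es.register_supplier("gsd-partition-3", ["NODE_FAILURE", "NODE_RECOVERY"])
    # A job-management system (e.g. PWS) subscribes only to node failures.
    es.register_consumer(["NODE_FAILURE"],
                         lambda t, p: print("scheduler notified:", t, p))
    es.push("gsd-partition-3", "NODE_FAILURE", {"node": "cn042"})
```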

4.3: Management Framework of the Phoenix Kernel

Scalability and fault-tolerance issues emerge when large-scale cluster systems are used as production platforms. Designers have used several different structures to deal with high-availability issues, such as the master-slave structure [1][15] and the group structure [2][13]; the group structure comes in two forms, peer-to-peer and leader/member. The master-slave structure is only suitable for small cluster systems because of its scalability limit. In a group structure, group members need to monitor each other's state and maintain consistent state information among themselves. Several projects adopt a group membership protocol in their cluster management frameworks, for example the Galaxy [2] and GulfStream [13] projects. But when the scale of a cluster system reaches a thousand nodes, it is unacceptable for all nodes to join a group managed by a group membership protocol, so we improve the group structure.

In the Phoenix system, the whole cluster is divided into several cluster partitions, each composed of one server node, at least one backup server node, and other computing nodes. To achieve scalability, each partition chooses one server node as its representative to form a group, as shown in Figure 3. Across partitions, the principal part of group management is the GSD (Group Service Daemon); a GSD takes charge of one partition. The group service daemons form a meta-group managed by a membership protocol. The GSD meta-group takes a ring structure. In case of failure of the Leader, the other members of the meta-group select the Princess to take over; if the Princess fails, the member next to the Princess takes over.
More generally, if any member fails, the member next to it in the ring takes over. Within a partition, the daemons responsible for sending heartbeats are the watch daemons (WD), which reside on every node. A WD sends heartbeats to the GSD periodically through all network interfaces of its node. By receiving and analyzing the heartbeats from the WDs, the GSD can monitor the status of the nodes and networks in its partition. Acting as an event supplier, the GSD calls the event service interface to push node and network failure and recovery events to the interested event consumers.
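The heartbeat analysis just described can be sketched in a few lines. The sketch below is illustrative only, not the actual GSD implementation; the 30-second interval matches the setting used in Section 5.1, and the data structures and names are assumptions. The idea it demonstrates is that silence on every interface indicates a node failure, while silence on only some interfaces indicates a network failure.

```python
"""Illustrative sketch (not the actual GSD code) of heartbeat analysis:
a WD on every node sends heartbeats over all network interfaces, and the
GSD diagnoses a node failure when every interface goes silent, but only a
network failure when a single interface does."""

import time

HEARTBEAT_INTERVAL = 30.0  # seconds, a configurable system parameter


class PartitionMonitor:
    def __init__(self, nodes, interfaces):
        self.interfaces = interfaces
        # last heartbeat time per (node, interface)
        self.last_seen = {(n, i): time.time() for n in nodes for i in interfaces}

    def on_heartbeat(self, node, interface, when=None):
        self.last_seen[(node, interface)] = when if when is not None else time.time()

    def diagnose(self, node, now=None):
        """Return 'ok', a network-failure diagnosis or 'node failure' for one node."""
        now = now if now is not None else time.time()
        silent = [i for i in self.interfaces
                  if now - self.last_seen[(node, i)] > HEARTBEAT_INTERVAL]
        if not silent:
            return "ok"
        if len(silent) < len(self.interfaces):
            return "network failure on " + ", ".join(silent)
        return "node failure"


if __name__ == "__main__":
    mon = PartitionMonitor(["cn001"], ["eth0", "eth1", "myrinet0"])
    t0 = time.time()
    # Simulate one interface going silent while the others keep heartbeating.
    mon.on_heartbeat("cn001", "eth1", t0 + 31)
    mon.on_heartbeat("cn001", "myrinet0", t0 + 31)
    print(mon.diagnose("cn001", now=t0 + 35))   # -> network failure on eth0
```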

Figure 3: Meta-group structure with five members.

4.4: Service Federation

The checkpoint service, data bulletin service and event service call the group service interface to create their service groups and to register policies for how to deal with failures. Taking the event service as an example, the group service takes charge of monitoring the event service group, as shown in Figure 4. If one member of the event service group fails, the GSD on the same host notifies all members of the GSD group and then restarts the failed service; the recovered event service daemon retrieves its state data from the checkpoint service. If the node on which an event service daemon runs fails, the GSD member next to it in the ring structure selects a new node, migrates the GSD there and then recovers the event service, so the restarted event service daemon again retrieves its state data from the checkpoint service.

Figure 4: Event service group based on GSD.

A service federation is a group of service entities that provides a single service access point through mutual internal connection and coordination. In the Phoenix kernel, the data bulletin, event and checkpoint service groups each form their own federation. Figure 5 shows the structure of the data bulletin federation, which takes the form of a complete graph. There is one instance of the data bulletin (DB) service in each partition, and the detector services on each node export the physical resource state and application state to the data bulletin service of the same partition. A user can query any data bulletin service to obtain cluster-wide information, so from the user's point of view the data bulletin federation has a single access point. For each kernel service group there is only one access point from the user's point of view, which simplifies the development of user environments. Service federation also improves system reliability: if one data bulletin service fails, only the state of one partition cannot be obtained, and with the support of the GSD, the failed data bulletin service is restarted and returns to work in a short period of time.

Figure 5: Data bulletin service federation.

In the whole system, there is one instance of the configuration service and one instance of the security service; each partition runs one instance of each of the other kernel services on its server node; and only the detector services and the parallel process management service run on each computing node.
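A small sketch can illustrate how a complete-graph federation gives a single access point. It is an assumption-laden toy: the DataBulletin class, its method names and the in-process peer links stand in for what would really be per-partition daemons connected over the network, and none of this is the actual Phoenix code.

```python
"""Illustrative sketch of the data bulletin federation in Section 4.4: one
in-memory data bulletin (DB) instance per partition, fully connected to its
peers, so a query to any single instance yields cluster-wide information.
Class and method names are hypothetical."""


class DataBulletin:
    def __init__(self, partition):
        self.partition = partition
        self.peers = []        # the other DB instances in the federation
        self.store = {}        # node -> latest resource/application state

    def connect(self, others):
        self.peers = [db for db in others if db is not self]

    def export_state(self, node, state):
        """Called by the detector services of this partition."""
        self.store[node] = state

    def query(self):
        """Single access point: merge local state with every reachable peer.
        If one DB instance has failed, only its partition's state is missing."""
        merged = dict(self.store)
        for peer in self.peers:
            merged.update(peer.store)
        return merged


if __name__ == "__main__":
    dbs = [DataBulletin(p) for p in range(3)]
    for db in dbs:
        db.connect(dbs)
    dbs[0].export_state("cn001", {"cpu": 0.42, "mem": 0.31})
    dbs[2].export_state("cn097", {"cpu": 0.88, "mem": 0.67})
    # A user (e.g. GridView) can ask any one instance for cluster-wide data.
    print(dbs[1].query())
```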

5: Evaluation of the Phoenix Kernel

5.1: Evaluation of Fault-Tolerance

The main performance criteria for evaluating fault-tolerant system software are detection overhead and recovery overhead. The testbed is 136 nodes of Dawning 4000A, with 16 computing nodes and 1 server node per partition, so it is divided into 8 partitions. The heartbeat interval can be configured as a system parameter; 30 seconds is used for the tests.

As shown in Figure 3, the group service daemon (GSD) detects failures and recoveries of nodes and networks by receiving and analyzing the heartbeats sent by the watch daemons (WD) within the same partition. The group service daemons monitor each other to detect failures and recover their meta-group, and the event service (ES) acts as the communication channel for sending and receiving failure and recovery events. Since the WD, GSD and ES are the most important components for fault-tolerance, we measure the fault detecting time, fault diagnosing time and recovery time for the WD, GSD and ES in three unhealthy situations. For the WD, for example, these three situations are failure of the WD process, failure of the node on which the WD runs, and failure of one network interface. By means of fault injection, we obtained the data in Tables 1-3.

Table 1: Three unhealthy situations for the WD.
Fault reason      Detecting time   Diagnosing time   Recovery time   Total time
Process failure   30s              0.29s             0.1s            30.39s
Node failure      30s              2s                0s              32s
Network failure   30s              348us             0s              30s

Table 2: Three unhealthy situations for the GSD.
Fault reason      Detecting time   Diagnosing time   Recovery time   Total time
Process failure   30s              0.29s             2.03s           32.32s
Node failure      30s              0.3s              2.95s           33.25s
Network failure   30s              348us             0s              30s

Table 3: Three unhealthy situations for the ES.
Fault reason      Detecting time   Diagnosing time   Recovery time   Total time
Process failure   30s              12us              0.12s           30.12s
Node failure      30s              0.3s              2.95s           33.25s
Network failure   30s              12us              0s              30s

From the data in Tables 1-3, we can conclude that the sum of detecting time, diagnosing time and recovery time is almost equal to the heartbeat interval, and this interval is a configurable system parameter. This shows that the Phoenix kernel performs well in supporting fault-tolerance. In Tables 1-3 the recovery time for network failures is 0 because each node has three networks, so the failure of only one network is not fatal. For the WD, the recovery time in case of node failure is also 0, because each WD is the representative of its hosting node for sending heartbeats and migrating a WD would be meaningless, whereas a GSD or ES can be migrated to another node when its node fails.
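The observation that the total failover time is dominated by the configurable heartbeat interval can be checked directly against the measured rows; the short computation below simply reproduces that arithmetic using the GSD and ES rows from Tables 2 and 3 (times in seconds, nothing new is introduced).

```python
# Total failover time = detection + diagnosis + recovery; with a 30 s
# heartbeat interval, the detection term dominates.  Values are the GSD and
# ES rows of Tables 2 and 3, in seconds.
rows = {
    "GSD process failure": (30.0, 0.29, 2.03),
    "GSD node failure":    (30.0, 0.30, 2.95),
    "ES process failure":  (30.0, 12e-6, 0.12),
}
for name, (detect, diagnose, recover) in rows.items():
    total = detect + diagnose + recover
    print(f"{name}: total {total:.2f}s, of which {100 * detect / total:.0f}% "
          "is the heartbeat-interval-bound detection time")
```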

5.2: Performance Impact of the Phoenix Kernel on Scientific Computing

For a fault-tolerant software system, fault tolerance usually costs some performance. We measured the performance impact of the Phoenix kernel on scientific computing with the Linpack benchmark under 4, 16, 64 and 128 CPUs. The results are shown in Table 4; it is worth pointing out that the data without Phoenix running was obtained with system optimization by the High Performance Computing Lab of the Institute of Computing Technology, Chinese Academy of Sciences, while the data with Phoenix running was obtained by our own tests without that optimization work. From Table 4, it can be concluded that the Phoenix kernel has little impact on scientific computing.

Table 4: Phoenix's impact on Linpack benchmark performance (Linpack efficiency with and without Phoenix running, for 4, 16, 64 and 128 CPUs).

5.3: Evaluating Scalability

Taking the monitoring system of Dawning 4000A [6] as an example, this section demonstrates the high scalability of the Phoenix kernel. The monitoring system of Dawning 4000A involves five Phoenix kernel services (configuration service, detector services, group service, data bulletin service and event service) together with the GridView module [16], which is in charge of graphical display and data analysis. GridView interacts with the Phoenix kernel only through the interfaces of the data bulletin service, the event service and the configuration service. GridView registers the event types it is interested in with the event service, including node and network failures, and receives real-time notifications of these events. It collects cluster-wide performance data by calling the single interface of the data bulletin service federation, and visually displays cluster-wide resource usage at a configurable refresh rate. Figure 6 is a snapshot of Dawning 4000A's monitoring system under common load, displaying cluster-wide average memory, CPU and swap usage (the average swap usage was 0.72 percent). As shown in this figure, the system includes 640 nodes, which demonstrates the high scalability of the Phoenix kernel, since the GridView system is constructed on top of it.

Figure 6: System monitoring based on the Phoenix kernel (GridView view of cluster-wide CPU, memory and swap usage).
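To make the interaction described in Section 5.3 concrete, the sketch below shows how a GridView-style monitor could sit on exactly the two kernel paths mentioned above: event registration for real-time failure/recovery notification, and periodic pulls from the data bulletin federation at a refresh rate. It reuses the illustrative EventService and DataBulletin classes from the earlier sketches and is in no way the actual GridView implementation; it can be exercised as, for example, run_monitor(es, dbs[0]) with the objects built in those sketches.

```python
"""Sketch of a GridView-style monitor built only on the kernel interfaces:
event registration for real-time notification plus periodic pulls from the
data bulletin federation.  Uses the illustrative EventService/DataBulletin
classes from the earlier sketches; not the actual GridView code."""

import time


def run_monitor(event_service, data_bulletin, refresh_rate=5.0, cycles=3):
    # Real-time path: be notified of node/network failures and recoveries.
    event_service.register_consumer(
        ["NODE_FAILURE", "NODE_RECOVERY", "NETWORK_FAILURE"],
        lambda etype, payload: print("ALERT:", etype, payload))

    # Periodic path: pull cluster-wide resource usage through the single
    # access point of the data bulletin federation and display an average.
    for _ in range(cycles):
        snapshot = data_bulletin.query()
        if snapshot:
            avg_cpu = sum(s["cpu"] for s in snapshot.values()) / len(snapshot)
            print(f"{len(snapshot)} nodes, average CPU usage {avg_cpu:.0%}")
        time.sleep(refresh_rate)
```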

5.4: Constructing User Environments on the Phoenix Kernel

Following Figure 1, we build different user environments on the basis of the Phoenix kernel. In this section we discuss how to construct the Phoenix-PWS job management system user environment (Partitioned Workload Solution, PWS for short). PWS is a job management system based on the Phoenix kernel, improved on the basis of PBS (Portable Batch System) [17]. PWS supports multiple pools with customized scheduling policies for different pools and dynamic leasing among pools. As shown in Figure 7, the main modules of PBS include the user interface, scheduling, resource monitoring, configuration, and parallel process management. Figure 8 shows PWS built on the Phoenix kernel.

Figure 7: Main components of PBS.

Figure 8: Main components of PWS based on the Phoenix kernel (user interface and scheduling modules on top of the kernel's event, data bulletin, configuration, group, detector and parallel process management services).

Compared with PBS, PWS has several desirable properties:

(1) The Phoenix kernel provides most of the functions of PBS, so the development of the new PWS system focuses only on the user interface and scheduling modules.

(2) The scalability of PWS is improved on the basis of group management and service federation. The physical resource detectors export physical resource information to the data bulletin federation, from which PWS obtains cluster-wide resource information directly, so the system workload is reduced. By registering node, network and application events with the event service, PWS gets real-time notification of those events, while PBS needs to poll continually and consumes network bandwidth.

(3) The fault-tolerance of PWS is guaranteed on the basis of group management and service federation. If one data bulletin instance fails, only the state of one partition cannot be obtained; with the support of the GSD, the failed data bulletin is restarted and returns to work in a short period of time. The scheduling service group for the different pools is created on the basis of the group service with high availability guaranteed, while PBS does not guarantee this.

(4) PWS supports multiple pools and dynamic leasing among pools (sketched below).

Figure 9 shows a snapshot of the integrated Web GUI for the PWS job management system. Our practice proves the ease of constructing user environments on the basis of the Phoenix kernel.

Figure 9: Integrated Web GUI for Phoenix-PWS: start/shutdown nodes.
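How multiple pools and dynamic leasing (point (4) above) might fit together can be pictured with a small sketch. Everything here (the Pool class, the per-pool policy field and the rule of borrowing idle nodes from other pools) is an assumption for illustration only; the paper does not describe the actual PWS pool and leasing design at this level of detail.

```python
"""Purely illustrative sketch of multi-pool scheduling with dynamic leasing,
the PWS feature mentioned in point (4) above.  Pool layout, policy names and
the leasing rule are assumptions, not the real PWS design."""


class Pool:
    def __init__(self, name, nodes, policy="fifo"):
        self.name = name
        self.free_nodes = list(nodes)   # idle nodes owned by this pool
        self.policy = policy            # per-pool scheduling policy
        self.leased_out = []            # nodes temporarily lent to other pools

    def allocate(self, job_size, pools):
        """Take idle nodes from this pool first, then lease from other pools.
        (A real scheduler would roll back the allocation if it cannot be met.)"""
        grant = [self.free_nodes.pop()
                 for _ in range(min(job_size, len(self.free_nodes)))]
        for other in pools:
            if len(grant) == job_size:
                break
            if other is self:
                continue
            while other.free_nodes and len(grant) < job_size:
                node = other.free_nodes.pop()
                other.leased_out.append(node)
                grant.append(node)
        return grant if len(grant) == job_size else None


if __name__ == "__main__":
    pools = [Pool("chemistry", ["cn%03d" % i for i in range(4)]),
             Pool("physics", ["cn%03d" % i for i in range(4, 10)], policy="backfill")]
    job_nodes = pools[0].allocate(6, pools)   # needs 2 nodes leased from "physics"
    print(job_nodes)
```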

6: Conclusion

Maintaining a stable minimum set of core functions at the cluster operating system kernel level makes Phoenix different from other systems. Based on the Phoenix kernel, we have constructed more complete user environments with ordinary development effort than other projects, including the system construction tool, system management and monitoring tools, the job management system, and the business application runtime environment. Compared with Phoenix, the SSS project [11] focuses only on HPC system software support, Ganglia [18] builds a cluster monitoring system, Rocks [19] provides a cluster building tool, and the Galaxy [2] and Oceano [13] projects take into account the requirements of business applications. In addition, scalable and fault-tolerant support is embedded in the Phoenix kernel, which decreases the difficulty of developing user environments. Though many projects propose high-availability solutions [1][2][13][15], what makes Phoenix different is that it provides a unified scalability and fault-tolerance supporting framework composed of three mechanisms: improved group management, service federation and a single service access point. The development of GridView [16] and Phoenix-PWS proves the correctness of this design decision.

ACKNOWLEDGMENTS

We wish to thank all the members of the Phoenix team at the Institute of Computing Technology, Chinese Academy of Sciences.

REFERENCES

[1] Richard Rabbat, Tom McNeal, Tim Burke, A High-Availability Clustering Architecture with Data Integrity Guarantees, Proceedings of the 2001 IEEE International Conference on Cluster Computing, Newport Beach, CA.
[2] Werner Vogels, Dan Dumitriu, An Overview of the Galaxy Management Framework for Scalable Enterprise Cluster Computing, Proceedings of the 2000 IEEE International Conference on Cluster Computing.
[3] Monika Henzinger, Indexing the Web - A Challenge for Supercomputers, International Supercomputer Conference, Heidelberg, June 19, 2002.
[4] Mary Shaw, David Garlan, Software Architecture: Perspectives on an Emerging Discipline, Prentice Hall, 1996.
[5] Maurice J. Bach, The Design of the UNIX Operating System, Prentice-Hall, 1986.
[6] Dawning 4000A, sublist/System.php?id=7036, 2004.
[7] T. Sterling, D. J. Becker, D. Savarese, J. Dorband, U. A. Ranawake, and C. E. Packer, Beowulf: A Parallel Workstation for Scientific Computation, Proceedings of the International Conference on Parallel Processing, 1995.
[8] P. Uthayopas, S. Phatanapherom, T. Angskun, S. Sriprayoonsakul, SCE: A Fully Integrated Software Tool for Beowulf Cluster System, Proceedings of Linux Clusters: The HPC Revolution, National Center for Supercomputing Applications (NCSA), University of Illinois, Urbana, Illinois, June 25-27.
[9] Atsushi Hori, SCore: An Integrated Cluster System Software Package for High Performance Cluster Computing, Proceedings of the 2000 IEEE International Conference on Cluster Computing.
[10] Stephen L. Scott, OSCAR and the Beowulf Arms Race for the "Cluster Standard", Proceedings of the 2001 IEEE International Conference on Cluster Computing.
[11] Ralph Butler, Narayan Desai, Andrew Lusk, Ewing Lusk, The Process Management Component of a Scalable Systems Software Environment, Proceedings of the IEEE Cluster 2003 Conference, Hong Kong.
[12] Rob Armstrong, Dennis Gannon, Al Geist, Katarzyna Keahey, Scott Kohn, Lois McInnes, Steve Parker, and Brent Smolinski, Toward a Common Component Architecture for High-Performance Scientific Computing, http://www-unix.mcs.anl.gov/~curfman/web/cca_paper.html.
[13] Sameh A. Fakhouri, German Goldszmidt, Michael Kalantar, John A. Pershing, GulfStream - a System for Dynamic Topology Management in Multi-domain Server Farms, Proceedings of the 2001 IEEE International Conference on Cluster Computing, Newport Beach, CA.
[14] Remy Evard, Narayan Desai, John-Paul Navarro, and Daniel Nurmi, Clusters as Large-Scale Development Facilities, Proceedings of the 2002 IEEE International Conference on Cluster Computing.
[15] Chokchai Leangsuksun and Ibrahim Haddad, Building Highly Available HPC Clusters with HA-OSCAR, Proceedings of the 2004 IEEE International Conference on Cluster Computing.
[16] Ni Guangbao, Ma Jie, Li Bo, GridView: A Dynamic and Visual Grid Monitoring System, Proceedings of the 7th International Conference on High Performance Computing and Grid in Asia Pacific Region, Omiya Sonic City, Tokyo Area, Japan, July 20-22, 2004.
[17] Portable Batch System (PBS).
[18] Federico D. Sacerdoti, Mason J. Katz, Matthew L. Massie, David E. Culler, Wide Area Cluster Monitoring with Ganglia, Proceedings of the IEEE Cluster 2003 Conference, Hong Kong.
[19] P. Papadopoulos, M. Katz, and G. Bruno, NPACI Rocks: Tools and Techniques for Easily Deploying Manageable Linux Clusters, Proceedings of the 2001 IEEE International Conference on Cluster Computing, Newport Beach, CA.


Introduction to Virtualization. From NDG In partnership with VMware IT Academy Introduction to Virtualization From NDG In partnership with VMware IT Academy www.vmware.com/go/academy Why learn virtualization? Modern computing is more efficient due to virtualization Virtualization

More information

COMP6511A: Large-Scale Distributed Systems. Windows Azure. Lin Gu. Hong Kong University of Science and Technology Spring, 2014

COMP6511A: Large-Scale Distributed Systems. Windows Azure. Lin Gu. Hong Kong University of Science and Technology Spring, 2014 COMP6511A: Large-Scale Distributed Systems Windows Azure Lin Gu Hong Kong University of Science and Technology Spring, 2014 Cloud Systems Infrastructure as a (IaaS): basic compute and storage resources

More information

ADAPTIVE AND DYNAMIC LOAD BALANCING METHODOLOGIES FOR DISTRIBUTED ENVIRONMENT

ADAPTIVE AND DYNAMIC LOAD BALANCING METHODOLOGIES FOR DISTRIBUTED ENVIRONMENT ADAPTIVE AND DYNAMIC LOAD BALANCING METHODOLOGIES FOR DISTRIBUTED ENVIRONMENT PhD Summary DOCTORATE OF PHILOSOPHY IN COMPUTER SCIENCE & ENGINEERING By Sandip Kumar Goyal (09-PhD-052) Under the Supervision

More information

Gustavo Alonso, ETH Zürich. Web services: Concepts, Architectures and Applications - Chapter 1 2

Gustavo Alonso, ETH Zürich. Web services: Concepts, Architectures and Applications - Chapter 1 2 Chapter 1: Distributed Information Systems Gustavo Alonso Computer Science Department Swiss Federal Institute of Technology (ETHZ) alonso@inf.ethz.ch http://www.iks.inf.ethz.ch/ Contents - Chapter 1 Design

More information

Large Scale Sky Computing Applications with Nimbus

Large Scale Sky Computing Applications with Nimbus Large Scale Sky Computing Applications with Nimbus Pierre Riteau Université de Rennes 1, IRISA INRIA Rennes Bretagne Atlantique Rennes, France Pierre.Riteau@irisa.fr INTRODUCTION TO SKY COMPUTING IaaS

More information

BUILDING A SCALABLE MOBILE GAME BACKEND IN ELIXIR. Petri Kero CTO / Ministry of Games

BUILDING A SCALABLE MOBILE GAME BACKEND IN ELIXIR. Petri Kero CTO / Ministry of Games BUILDING A SCALABLE MOBILE GAME BACKEND IN ELIXIR Petri Kero CTO / Ministry of Games MOBILE GAME BACKEND CHALLENGES Lots of concurrent users Complex interactions between players Persistent world with frequent

More information

A Chord-Based Novel Mobile Peer-to-Peer File Sharing Protocol

A Chord-Based Novel Mobile Peer-to-Peer File Sharing Protocol A Chord-Based Novel Mobile Peer-to-Peer File Sharing Protocol Min Li 1, Enhong Chen 1, and Phillip C-y Sheu 2 1 Department of Computer Science and Technology, University of Science and Technology of China,

More information

A Distance Learning Tool for Teaching Parallel Computing 1

A Distance Learning Tool for Teaching Parallel Computing 1 A Distance Learning Tool for Teaching Parallel Computing 1 RAFAEL TIMÓTEO DE SOUSA JR., ALEXANDRE DE ARAÚJO MARTINS, GUSTAVO LUCHINE ISHIHARA, RICARDO STACIARINI PUTTINI, ROBSON DE OLIVEIRA ALBUQUERQUE

More information

Redundancy for Routers using Enhanced VRRP

Redundancy for Routers using Enhanced VRRP Redundancy for Routers using Enhanced VRRP 1 G.K.Venkatesh, 2 P.V. Rao 1 Asst. Prof, Electronics Engg, Jain University Banglaore, India 2 Prof., Department of Electronics Engg., Rajarajeshwari College

More information

VMware vsphere with ESX 6 and vcenter 6

VMware vsphere with ESX 6 and vcenter 6 VMware vsphere with ESX 6 and vcenter 6 Course VM-06 5 Days Instructor-led, Hands-on Course Description This class is a 5-day intense introduction to virtualization using VMware s immensely popular vsphere

More information

By the end of the class, attendees will have learned the skills, and best practices of virtualization. Attendees

By the end of the class, attendees will have learned the skills, and best practices of virtualization. Attendees Course Name Format Course Books 5-day instructor led training 735 pg Study Guide fully annotated with slide notes 244 pg Lab Guide with detailed steps for completing all labs vsphere Version Covers uses

More information

<Insert Picture Here> Enterprise Data Management using Grid Technology

<Insert Picture Here> Enterprise Data Management using Grid Technology Enterprise Data using Grid Technology Kriangsak Tiawsirisup Sales Consulting Manager Oracle Corporation (Thailand) 3 Related Data Centre Trends. Service Oriented Architecture Flexibility

More information

VMware vsphere with ESX 4 and vcenter

VMware vsphere with ESX 4 and vcenter VMware vsphere with ESX 4 and vcenter This class is a 5-day intense introduction to virtualization using VMware s immensely popular vsphere suite including VMware ESX 4 and vcenter. Assuming no prior virtualization

More information

THE IMPACT OF E-COMMERCE ON DEVELOPING A COURSE IN OPERATING SYSTEMS: AN INTERPRETIVE STUDY

THE IMPACT OF E-COMMERCE ON DEVELOPING A COURSE IN OPERATING SYSTEMS: AN INTERPRETIVE STUDY THE IMPACT OF E-COMMERCE ON DEVELOPING A COURSE IN OPERATING SYSTEMS: AN INTERPRETIVE STUDY Reggie Davidrajuh, Stavanger University College, Norway, reggie.davidrajuh@tn.his.no ABSTRACT This paper presents

More information