GoAhead Software NDIA Systems Engineering 2010

High Availability and Fault Management in Objective Architecture Systems
Steve Mills, Senior Systems Architect

Outline
- Standards/COTS and the Mission-Critical Requirement
- Defining High Availability (HA)
- What is Open Architecture (OA)?
- Introduction to SA Forum Services
- Conceptual Alignment between OA and SA Forum
- Summary

Introduction
- Mission Critical Systems
  - Program/system examples: USN Aegis Weapon System, USA Integrated Battle Command System, USAF Space Fence, USN Air & Missile Defense Radar, USA Ground Combat Vehicle
  - System architecture requires modularity and scalability
  - COTS solutions in general gaining traction
- The Service Availability Forum High Availability standards are gaining momentum for addressing mission critical requirements
  - DOD Information Technology Standards Registry (DISR), DOD-wide mandated standard
  - USN PEO IWS Combat System Product Line Architecture Description Document

Mission Critical Requirements
- Maintain continuous availability
  - Distributed infrastructure to support cooperating redundant applications
  - Infrastructure oversight of compute nodes and health awareness
- Automated, policy-based fault management cycle
  - Configuration of fault detection criteria
  - Deterministic fault recovery modeling
  - Sub-second MTTR recovery for real-time applications
- Distributed system management framing a single coherent system
  - Central configuration and administrative control
  - Centralized alarm, notification, and log support
- Modular Open Systems Approach (MOSA)
  - Open standards, COTS based solutions
  - Reduce costs and risks

The Essence Of The High Availability Requirement
- Goal: Continuous availability (99.999% or better) despite
  - Hardware, software, human, and communication failures
  - Planned maintenance (migration, upgrade)
- Achieved via no single point of failure
  - Redundancy of mission critical applications
  - Failure detection at HW, OS, Node and App layers
- Automated, policy-based real-time recovery and repair:
  - Sub-second, stateful failover
  - Alarms and Notifications
  - Custom policies
- Proactive system monitoring with preventative measures
- Clear integration points and modular design
[Figure labels: Health, Repair, Fault Detection, Recovery, Isolation]
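To make the 99.999% goal concrete, the short sketch below converts an availability target into a yearly downtime budget and shows how many sub-second failovers would fit inside it. The function names are illustrative and do not come from the slides; the steady-state formula availability = MTBF / (MTBF + MTTR) is the standard textbook definition, and a 365.25-day year is an assumption.

```python
# Illustrative arithmetic only; names and the 365.25-day year are assumptions.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def downtime_seconds_per_year(availability: float) -> float:
    """Yearly downtime budget implied by an availability target (0..1)."""
    return (1.0 - availability) * SECONDS_PER_YEAR


def availability_from_mtbf_mttr(mtbf_s: float, mttr_s: float) -> float:
    """Steady-state availability from mean time between failures and
    mean time to recovery, both in seconds."""
    return mtbf_s / (mtbf_s + mttr_s)


def outages_within_budget(availability: float, mttr_s: float) -> int:
    """How many outages of length mttr_s fit in the yearly budget."""
    return int(downtime_seconds_per_year(availability) // mttr_s)


if __name__ == "__main__":
    budget_min = downtime_seconds_per_year(0.99999) / 60
    print(f"five nines allows about {budget_min:.2f} minutes of downtime per year")
    print(outages_within_budget(0.99999, 1.0), "one-second failovers fit in that budget")
```

Under these assumptions, five nines leaves a little over five minutes of downtime per year, which is why the slide pairs the availability goal with sub-second, automated recovery rather than manual repair.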

Open Architecture Goals
- Many meanings, but core is:
  - Flexibility, openness, and re-use
  - Layered infrastructure with common, open functions and interfaces
  - Decouple domains of concern
  - Modular, scalable
  - Increased industry competition
  - COTS, open standards

OA Reference Model and Infrastructure
- Critical boundary: OA infrastructure and OA-compliant applications
  - Applications: modular components that plug into the OA reference model
- Common, open features at this boundary include:
  - Vertical integration: interfaces for components to access infrastructure services
  - Horizontal integration: interfaces for components to bind to each other
  - External integration: interface to the external world for configuration and administration
  - Modeling: explains component capabilities and policy-driven behaviors
- The better an OA solution enables components to explain their properties, resource needs, and policies:
  - The less such logic needs to be repeated within the applications
  - The more consistently the OA infrastructure can manage those issues

Open Architecture - Infrastructure
- Layered to better partition responsibilities and decouple dependencies
- When open architecture is applied to a specific domain, it is often referred to as an Objective Architecture
- In our example here, the layers are:
  - Platform layer: hardware abstraction layer
  - Node layer: operating system abstraction layer
  - Middleware layer: set of services that treat distributed nodes as a logical world of a single system
    - NOT a physical place or location
    - Includes functions such as: logging, notification, stateful fail-over, fault detection, and
  - Application integration layer: component integration layer with integration methods
- This OA has a mission critical requirement which could be inherent in the problem domain (ex: combat system), or based on a mixed system where mission critical needs arise because of system density

SA Forum Availability Middleware Services
- Public Integration Points
  - System Model
  - Public APIs for each service
  - Formal alarms and notifications
  - Formal configuration access
- Key SA Forum Services
  - Availability Management Framework (AMF)
  - Platform Management (PLM)
  - Cluster Membership (CLM)

Availability Management Framework
- Model the distributed, redundant applications
- Mission Critical Apps
  - Distributed redundant applications configured in Active/Standby relationships
- System Model Policies
  - Deterministic policies drive the AMF Fault Mgmt Policy Engine
- Logical Modeling Entities
  - Service Unit (SU)
  - Service Group (SG)
[Figure: layered system stack: Mission Critical System, Critical applications, OS, MW, Cluster Nodes, Hardware & LRU inventory]
[Figure labels: Power, Cooling, Shelf Mgr, IPMI, SG (2N), SG (N+M), SU, A, S, SM, GoAhead OpenSAFfire, Linux, OS]
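The Active/Standby relationship within a service group can be illustrated with a small model. This is a minimal sketch, not the SA Forum AMF API: the class and method names (`ServiceUnit`, `ServiceGroup2N`, `report_fault`, `repair`) are hypothetical, and the group is simplified to exactly one active and one standby service unit on different nodes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class HaState(Enum):
    ACTIVE = "active"
    STANDBY = "standby"
    FAILED = "failed"


@dataclass
class ServiceUnit:
    """Hypothetical stand-in for an AMF service unit (SU)."""
    name: str
    node: str
    state: HaState = HaState.STANDBY


class ServiceGroup2N:
    """Simplified 2N service group (SG): one active SU, one standby SU."""

    def __init__(self, name: str, units: List[ServiceUnit]) -> None:
        if len(units) != 2:
            raise ValueError("this simplified sketch expects exactly two service units")
        if units[0].node == units[1].node:
            raise ValueError("both units on one node would be a single point of failure")
        self.name = name
        self.units = units
        units[0].state = HaState.ACTIVE
        units[1].state = HaState.STANDBY

    def _find(self, unit_name: str) -> ServiceUnit:
        for unit in self.units:
            if unit.name == unit_name:
                return unit
        raise KeyError(unit_name)

    def active(self) -> Optional[ServiceUnit]:
        """Return the unit currently providing service, or None on outage."""
        return next((u for u in self.units if u.state is HaState.ACTIVE), None)

    def report_fault(self, unit_name: str) -> Optional[ServiceUnit]:
        """Mark a unit failed; if it was active, promote the standby."""
        failed = self._find(unit_name)
        was_active = failed.state is HaState.ACTIVE
        failed.state = HaState.FAILED
        if was_active:
            standby = next((u for u in self.units if u.state is HaState.STANDBY), None)
            if standby is not None:
                standby.state = HaState.ACTIVE
        return self.active()

    def repair(self, unit_name: str) -> None:
        """Return a repaired unit to the group as the new standby."""
        unit = self._find(unit_name)
        if unit.state is HaState.FAILED:
            unit.state = HaState.STANDBY
```

The recovery decision here is fixed in code for brevity; the slide's point is that in AMF such decisions come from deterministic policies in the system model rather than from application logic.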

Cluster Membership
- Model the cluster node members & state
- A cluster made up of a chassis and other compute nodes
- Three clusters partitioned over several chassis
- A cluster can consist of the same or heterogeneous devices such as chassis, rack mount servers, etc. as long as each has visibility to each other through (redundant) communications paths
[Figure: layered system stack: Mission Critical System, Critical applications, OS, MW, Cluster Nodes, Hardware & LRU inventory]
[Figure labels: cluster views connected over IP (Subnets), each with Console, HA Apps, and OpenSAFfire MW: Yellow cluster view, 11 nodes; Yellow cluster view, 9 nodes; Orange cluster view, 12 nodes; Green cluster view, 6 nodes]
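The visibility condition on this slide can be sketched as a simple membership check. This is an assumption-laden illustration rather than how CLM is implemented: the function name, the path names, and the rule "a node stays in the view while at least one of its redundant paths is up" are all hypothetical.

```python
from typing import Dict, List


def membership_view(path_status: Dict[str, Dict[str, bool]]) -> List[str]:
    """Return the nodes visible over at least one communication path.

    path_status maps node name -> {path name: True if the path is up}.
    """
    return sorted(
        node for node, paths in path_status.items() if any(paths.values())
    )


# Hypothetical cluster: a chassis blade and two rack mount servers,
# each reachable over two redundant subnets.
status = {
    "blade-1": {"subnet-a": True, "subnet-b": True},
    "rack-1": {"subnet-a": False, "subnet-b": True},   # one path lost, still a member
    "rack-2": {"subnet-a": False, "subnet-b": False},  # no visibility, drops out
}
print(membership_view(status))
```

With redundant paths, losing a single subnet does not change the view; a node leaves only when every path to it is down.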

Platform Management
- Model the FRU inventory and state
- Platform Management Modeling
- PLM Manages Hardware Resources
  - Model HW elements and dependencies
  - Automated validation of LRU Inventory
  - Hot swap management
  - Configurable power and temperature threshold alarms
  - Power on, off and reset of HW resources
- PLM Manages Execution Environments (Operating System)
  - Model execution environments
    - Hosted directly by computer
    - Hosted by hypervisor
  - Administrative control of operating system state
[Figure: layered system stack: Mission Critical System, Critical applications, OS, MW, Cluster Nodes, Hardware & LRU inventory]
[Figure labels: Physical, Logical, Containment dependencies, Depends-on Relationship, Node, Cluster Membership Nodes]
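One use of a depends-on model is to work out what else is affected when a hardware element fails. The sketch below is illustrative only and does not use the PLM API: the function name, the dictionary representation, and the entity names are assumptions made for this example.

```python
from collections import deque
from typing import Dict, Set


def impacted_by(failed: str, depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Return every entity that depends, directly or transitively, on `failed`.

    depends_on maps an entity to the set of entities it depends on.
    """
    # Invert the map so each entity lists the things that depend on it.
    dependents: Dict[str, Set[str]] = {}
    for entity, requirements in depends_on.items():
        for required in requirements:
            dependents.setdefault(required, set()).add(entity)

    impacted: Set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent != failed and dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


# Hypothetical model: execution environments hosted on blades,
# blades contained in a chassis, one OS hosted by a hypervisor.
model = {
    "blade-1": {"chassis"},
    "blade-2": {"chassis"},
    "hypervisor-1": {"blade-1"},
    "linux-vm-1": {"hypervisor-1"},
    "linux-2": {"blade-2"},
}
print(sorted(impacted_by("blade-1", model)))
```

A breadth-first walk over the inverted graph is enough here because the only question asked is reachability, not ordering of recovery actions.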

Two Models: Conceptual Alignment

Summary
- SA Forum and objective architecture align well
- Core principles of open architecture are reflected in SA Forum
- SA Forum standards are the ONLY proven open standard for mission-critical systems
- SA Forum standards can be integrated into new systems or legacy scenarios
- Deployed in programs such as Aegis Weapon System, Littoral Combat Ship, Common Processing System, Deepwater

Thank You
Stephen Mills
GoAhead Software
Senior Systems Architect
smills@goahead.com
781-654-7273