Risk-Aware Rapid Data Evacuation for Large- Scale Disasters in Optical Cloud Networks

Similar documents
Performance Analysis of Storage-Based Routing for Circuit-Switched Networks [1]

Data Center Network Placement and Data Backup Against Region Failures

Towards Deadline Guaranteed Cloud Storage Services Guoxin Liu, Haiying Shen, and Lei Yu

Sleep/Wake Aware Local Monitoring (SLAM)

An Ant-Based Routing Algorithm to Achieve the Lifetime Bound for Target Tracking Sensor Networks

Multiple Virtual Network Function Service Chain Placement and Routing using Column Generation

Chapter 7 CONCLUSION

CQNCR: Optimal VM Migration Planning in Cloud Data Centers

AS the increasing number of smart devices and various

Virtual Network Embedding with Substrate Support for Parallelization

Deadline Guaranteed Service for Multi- Tenant Cloud Storage Guoxin Liu and Haiying Shen

Data Sheet FUJITSU Storage ETERNUS CD10000 S2 Hyperscale Storage

Global Leased Line Service

Time Slot Assignment Algorithms for Reducing Upstream Latency in IEEE j Networks

CONSTRUCTION AND EVALUATION OF MESHES BASED ON SHORTEST PATH TREE VS. STEINER TREE FOR MULTICAST ROUTING IN MOBILE AD HOC NETWORKS

Video-Aware Networking: Automating Networks and Applications to Simplify the Future of Video

Modeling and Optimization of Resource Allocation in Cloud

Enterprise GRC Implementation

CHAPTER 5 PROPAGATION DELAY

8. CONCLUSION AND FUTURE WORK. To address the formulated research issues, this thesis has achieved each of the objectives delineated in Chapter 1.

Railroad Infrastructure Security

Crosstalk-Aware Spectrum Defragmentation based on Spectrum Compactness in SDM-EON

DEPARTMENT of. Computer & Information Science & Engineering

Emergency Response for Demand Response Transportation Systems

Minimum Overlapping Layers and Its Variant for Prolonging Network Lifetime in PMRC-based Wireless Sensor Networks

The Impact of Control Path Survivability on Data Plane Survivability in SDN. Sedef Savas Networks Lab, Group Meeting Aug 11, 2017

Scale-Out Architectures with Brocade DCX 8510 UltraScale Inter-Chassis Links

Simplifying Collaboration in the Cloud

The Impact of SSD Selection on SQL Server Performance. Solution Brief. Understanding the differences in NVMe and SATA SSD throughput

Resilient Communications: Staying connected during a disaster.

Scaling to Petaflop. Ola Torudbakken Distinguished Engineer. Sun Microsystems, Inc

Adapting Enterprise Distributed Real-time and Embedded (DRE) Pub/Sub Middleware for Cloud Computing Environments

Presenter: Wei Chang4

Routing protocols in WSN

Application of Cloud Computing in Hazardous Mechanical Industries

DCRoute: Speeding up Inter-Datacenter Traffic Allocation while Guaranteeing Deadlines

Randomized User-Centric Clustering for Cloud Radio Access Network with PHY Caching

PCAP: Performance-Aware Power Capping for the Disk Drive in the Cloud

Networked Control Systems for Manufacturing: Parameterization, Differentiation, Evaluation, and Application. Ling Wang

Disaster Recovery and Mitigation: Is your business prepared when disaster hits?

White Paper: Next generation disaster data infrastructure CODATA LODGD Task Group 2017

Lightning Fast Rock Solid

CHAPTER 6 ENERGY AWARE SCHEDULING ALGORITHMS IN CLOUD ENVIRONMENT

Deploying Multiple Service Chain (SC) Instances per Service Chain BY ABHISHEK GUPTA FRIDAY GROUP MEETING APRIL 21, 2017

Performance Evaluation of CoAP and UDP using NS-2 for Fire Alarm System

Figure 1. Clustering in MANET.

CS 498 Hot Topics in High Performance Computing. Networks and Fault Tolerance. 9. Routing and Flow Control

Aggregation on the Fly: Reducing Traffic for Big Data in the Cloud

On Performance Evaluation of Reliable Topology Control Algorithms in Mobile Ad Hoc Networks (Invited Paper)

Network Adaptability under Resource Crunch. Rafael Braz Rebouças Lourenço Networks Lab - UC Davis Friday Lab Meeting - April 7th, 2017

The Storage Industry s Need for Power Efficient Storage. Marty Finkbeiner Senior Vice President Engineering Western Digital Corporation

On the Robustness of Distributed Computing Networks

Disaster-Resilient Control Plane Design and Mapping in Software-Defined Networks

CANVAS DISASTER RECOVERY PLAN AND PROCEDURES

Routing in Delay Tolerant Networks (2)

White paper ETERNUS CS800 Data Deduplication Background

Dell Compellent Storage Center and Windows Server 2012/R2 ODX

Oracle NoSQL Database Overview Marie-Anne Neimat, VP Development

Outline. CS5984 Mobile Computing. Dr. Ayman Abdel-Hamid, CS5984. Wireless Sensor Networks 1/2. Wireless Sensor Networks 2/2

Genetic-Algorithm-Based Construction of Load-Balanced CDSs in Wireless Sensor Networks

Dynamic Grouping Strategy in Cloud Computing

Toward a Reliable Data Transport Architecture for Optical Burst-Switched Networks

On the Robustness of Distributed Computing Networks

Oracle Argus Cloud Service Service Descriptions and Metrics

What's in this guide... 4 Documents related to NetBackup in highly available environments... 5

Splitter Placement in All-Optical WDM Networks

Survivable Routing in Multi-domain Optical Networks with Geographically Correlated Failures

Master s Thesis. Title. Supervisor Professor Masayuki Murata. Author Yuki Koizumi. February 15th, 2006

TELCOM2125: Network Science and Analysis

Data gathering using mobile agents for reducing traffic in dense mobile wireless sensor networks

Automated Harvesting of Loss and Damage Information from Open Sources

CHAPTER 4: ROUTING DYNAMIC. Routing & Switching

Zonal based Deterministic Energy Efficient Clustering Protocol for WSNs

Kapitel 5: Mobile Ad Hoc Networks. Characteristics. Applications of Ad Hoc Networks. Wireless Communication. Wireless communication networks types

Global Crisis Management at Target

Modification of an energy-efficient virtual network mapping method for a load-dependent power consumption model

Connected Point Coverage in Wireless Sensor Networks using Robust Spanning Trees

Application Provisioning in Fog Computingenabled Internet-of-Things: A Network Perspective

A Low-Overhead Hybrid Routing Algorithm for ZigBee Networks. Zhi Ren, Lihua Tian, Jianling Cao, Jibi Li, Zilong Zhang

Outline. Introduction. Outline. Introduction (Cont.) Introduction (Cont.)

Analysis of Cluster-Based Energy-Dynamic Routing Protocols in WSN

Virtualizing SQL Server 2008 Using EMC VNX Series and VMware vsphere 4.1. Reference Architecture

Nodes Energy Conserving Algorithms to prevent Partitioning in Wireless Sensor Networks

IMPACT OF LEADER SELECTION STRATEGIES ON THE PEGASIS DATA GATHERING PROTOCOL FOR WIRELESS SENSOR NETWORKS

Cloud Connect. Gain highly secure, performance-optimized access to third-party public and private cloud providers

Citrix NetScaler LLB Deployment Guide

Optimizing Apache Spark with Memory1. July Page 1 of 14

Basic Switch Organization

DELAY-CONSTRAINED MULTICAST ROUTING ALGORITHM BASED ON AVERAGE DISTANCE HEURISTIC

On the Latency-Accuracy Tradeoff in Approximate MapReduce Jobs

CASER Protocol Using DCFN Mechanism in Wireless Sensor Network

The Construction of Open Source Cloud Storage System for Digital Resources

HUAWEI TECHNOLOGIES CO., LTD. Huawei FireHunter6000 series

Some problems in ad hoc wireless networking. Balaji Prabhakar

An Overview of Smart Sustainable Cities and the Role of Information and Communication Technologies (ICTs)

The Tofu Interconnect D

Wireless Internet Routing. Learning from Deployments Link Metrics

Demand-adaptive VNF placement and scheduling in optical datacenter networks. Speaker: Tao Gao 8/10/2018 Group Meeting Presentation

Application Placement and Demand Distribution in a Global Elastic Cloud: A Unified Approach

RECENTLY virtualization technologies have been

Transcription:

Risk-Aware Rapid Data Evacuation for Large- Scale Disasters in Optical Cloud Networks Presenter: Yongcheng (Jeremy) Li PhD student, School of Electronic and Information Engineering, Soochow University, China Email: liyongcheng621@163.com Group Meeting, Friday, August 26, 2016

Outline 1. Background 2. Risk-Aware Rapid Data Evacuation For Large- Scale Disasters 3. Heuristic Algorithm 4. Performance Evaluation 5. Conclusion 6. Future Work 2

Background Enterprises deploy their cloud services such as cloud data storage and applications in distributed datacenter (DC) networks. Cloud services require Terabytes or Petabytes of data transfer. Optical networks can be used to facilitate data transfers. Advantage: high bandwidth and low latency in inter-dc networks. Disadvantage: services can be disrupted by disasters (such as earthquakes, tornadoes, and intentional attacks). A large-scale disaster can lead to high data loss and service disruptions. 2011 Japan Earthquake damaged many cloud providers data. 3

Background (contd.) Provide redundancy (and protection) against data loss. Distributed content/service replicas in different DCs. All replicas of a content can be lost in a large-scale disaster. Critical data must be quickly evacuated from DCs in the disaster region to safe DCs prior to the disaster. Rapid data evacuation. Receive warning of an oncoming disaster from various sensors and monitors in their network and/or from government or intelligence agencies. Depending on the type of disaster (e.g., earthquake, hurricane or weapons of mass destruction (WMD)), do the following prediction. Disaster zone Evacuation deadline Potential damage in the infrastructure Before deadline, quickly evacuate as much critical data as possible. 4

Background (contd.) Safe data transfers Possible independent disasters may compromise the process of data evacuation. Risk of node/link failures must be considered to ensure safe data transfers. How to objective a tradeoff between evacuation time and risk of data loss during evacuation. 5

Risk-Aware Rapid Data Evacuation For Large-Scale Disasters (contd.) Time delay Path-computation delay. Connection-setup delay. Data-transmission delay. Data-propagation delay. Notations Distance of path: l. Number of hops on path: n. Bandwidth of path: B p Propagation delay per unit distance: μ. Processing delay: η. Switch configuration delay: β. Assume the same propagation delay for data and control messages. 6

Risk-Aware Rapid Data Evacuation For Large-Scale Disasters (contd.) Equations Connection-setup delay Control-message processing delay: (n +1) η Control-message propagation delay: l μ Switch-configuration delay: (n + 1) β Transmission delay: F c B p Propagation delay: l μ 7

Risk-Aware Rapid Data Evacuation For Large-Scale Disasters (contd.) The risk of node/link failures rrrr l m /rrrr n m are the probabilities of link l or node n being damaged due to disasterm M. l L,m M(1 rrrr m l ) n N,m M (1 rrrr m n ) computes the probability that all links and nodes are not damaged by any disaster. Path failure probability: rrrr p = 1 l L,m M(1 rrrr m l ) n N,m M (1 rrrr m n ). 8

Risk-Aware Rapid Data Evacuation For Large-Scale Disasters (contd.) Example Transfer content c1 from node C to safe DCs A or D as fast as possible. Failure probabilities of links C-D and A-C is assumed as 0.6 and 0.2. Select destination DC D Evacuation time is less but the risk along path C-D is higher. Select destination DC A Evacuation time is high but the risk along path C-D is less. 9

Risk-Aware Rapid Data Evacuation For Large-Scale Disasters (contd.) Problem statement Objective: Achieve an optimal tradeoff between evacuation time and risk of data loss during evacuation. Given inputs: N is the set of nodes and L is the set of links. Physical topology G = (N, L). Set of possible disaster zones M. Set of predicted WMD attack zones WW. Set of DCs D. Storage capacity S d. Set of locally hosted contents C d. Number of replicas of content c R c. Importance metric of content c α c. 10

Risk-Aware Rapid Data Evacuation For Large-Scale Disasters (contd.) Problem statement Input: Size of content c F c. Residual link capacity B l, l L. Evacuation deadline T. Set of k-shortest paths R ii for each node pair (i, j), i, j N. Constraint: Available link capacity is limited. Data-transfer delay to be upper bounded by evacuation deadline. Different paths can be used in parallel for different connections if paths do not overlap. 11

Heuristic Algorithm Disaster mapping: Get a set of DCs D ww D, ww WW. // Datacenters in predicted disaster zone Get a set of DCs D ooo_ww D, D ooo_ww = D D ww. // Datacenters outside predicted disaster zone Content selection: For each DC d D ww Get a set of contents, C d, d D ww, C ww = C ww C d, ww WW. // Contents in predicted disaster zone For each content c C ww If all replicas of content c are in the disaster zone WW then Put c in a content list C EEE and get set of DCs D c, which host the replicas of c. // C EEE is a set of contents to be evacuated Sort C EEE based on α c in a descending order. 12

Heuristic Algorithm Destination DC selection and path delay and risk computation: For each content c C EEE For each d D ooo_ww, if F c S d then put d into list of available DCs D AAA // D AAA is a set of safe DCs with available storage Get set of k-shortest paths R ii, i D c,j D AAA For each path p R ii Obtain total delay ddddd p and risk rrrr p to calculate general cost CCCC p CCCC p = ddddd p + φ rrrr p Set path p which has minimum cost CCCC p as final solution 13

Performance Evaluation Simulation conditions Network topology: 24-node USNET topology. Disaster type: WMD attacks. A predicted WMD zone: WM A set of 10 possible independent disaster zones: M. Blue nodes represent safe DCs. Red nodes 6, 9, and 12 are the DCs in the WMD zone WM. 14

Performance Evaluation (contd.) Large storage capacity ranging from 10TB to 100TB (average occupation is assumed to be 40%). Residual link capacity is assumed to range from 500Gbps to 1Tbps (network is assumed to have 30% utilization). Number of contents is assumed to be 300. Size of each content is randomly generated within the range of [100GB, 200GB]. The contents are uniformly distributed among different DCs with the number of replicas ranging from 2 to 4. Contents are randomly assigned the importance metric α c on a scale from 1 to 10. Processing delay, propagation delay, and switch configuration delay to be 10 μs, 5 μs/km, 15 ms. 15

Performance Evaluation (contd.) Total risk of data loss and evacuation time Time consumption (s) 9 8 7 6 5 4 3 2 1 0 0 3 5 7 9 Weight factor ϕ 3000 2500 2000 1500 1000 Shows the time consumption and the total risk of data loss for the RA-RDE algorithm with an increasing weight factor φ. We see that RA-RDE can significantly reduce the total risk with an increasing φ. This is reasonable since a larger weight factor can lead to higher risk reduction in the general cost. 500 0 Total risk of data loss 16

Performance Evaluation (contd.) Performance comparison Result gap compared with LD-RDE 60.00% 50.00% 40.00% 30.00% 20.00% 10.00% Time consumption Total risk of data loss 0.00% 3 5 7 9 Weight factor ϕ Compare the performance of RA-RDE with the LD-RDE algorithm which selects the least-delay paths for data evacuation without considering risk. It should be noted that the RA-RDE algorithm is equivalent to the LD-RDE scheme when φ = 0. We see that, when φ = 3, our approach reduces the total risk by 20% and needs less than 10% additional time consumption. 17

Performance Evaluation (contd.) Time consumption with an increasing number of contents Time consumption (s) 50 40 30 20 10 0 100 200 300 400 500 Number of contents 18 NDE RA-RDE LD-RDE Compare time consumption of the two rapid evacuation approaches with the nearest data evacuation (NDE) approach which evacuates data only to the nearest DC. With an increasing number of contents from 100 to 500 and φ = 3, we can see that RA-RDE performs close to LD-RDE and is much better than NDE, which verifies its time efficiency.

Conclusion To balance performance between time consumption and total risk of data loss, we defined a general cost considering path delay and path risk using a weight factor. We develop a risk-aware rapid data evacuation scheme for largescale disasters in optical cloud networks. Results show that proposed approach significantly reduces total risk with minimal addition time consumption. Time consumption close to LD-RDE under different number of contents. 19

Future Work Try to propose an MILP model to solve the rapid data evacuation problem by using time slot Time slot contiguity Time slot continuity C1 C2 12 Time slot T Investigate an new heuristic algorithms with the ICC paper of Wu Yu 20

Thank you for your attention! Presenter: Yongcheng (Jeremy) Li PhD student, School of Electronic and Information Engineering, Soochow University, China Email: liyongcheng621@163.com Group Meeting, Friday, August 26, 2016 21