Dynamic Grouping Strategy in Cloud Computing

Similar documents
Effective Query Grouping Strategy in Clouds

Outsourcing Privacy-Preserving Social Networks to a Cloud

Clock-Based Proxy Re-encryption Scheme in Unreliable Clouds

Cooperative Private Searching in Clouds

Efficient Information Retrieval for Ranked Queries in Cost-Effective Cloud Environments

HCBE: Achieving Fine-Grained Access Control in Cloud-based PHR Systems

Fault-Tolerant and Secure Data Transmission Using Random Linear Network Coding

Multi-resource Energy-efficient Routing in Cloud Data Centers with Network-as-a-Service

Optimal Slice Allocation in 5G Core Networks

Clustering Analysis for Malicious Network Traffic

over Multi Label Images

RIGHTNOW A C E

Outsourcing Privacy-Preserving Social Networks to a Cloud

Performance Analysis of Storage-Based Routing for Circuit-Switched Networks [1]

MULTI-LAYER VIDEO STREAMING WITH HELPER NODES USING NETWORK CODING

Introduction. Architecture Overview

Efficient Subgraph Matching by Postponing Cartesian Products

Parallels Remote Application Server. Scalability Testing with Login VSI

Hardware Sizing Guide OV

cs/ee 143 Fall

Time-Based Proxy Re-encryption Review

NON-CENTRALIZED DISTINCT L-DIVERSITY

Web caches (proxy server) Applications (part 3) Applications (part 3) Caching example (1) More about Web caching

Automated Control for Elastic Storage Harold Lim, Shivnath Babu, Jeff Chase Duke University

Experimental Model for Load Balancing in Cloud Computing Using Throttled Algorithm

Multi-path based Algorithms for Data Transfer in the Grid Environment

Scalability Testing with Login VSI v16.2. White Paper Parallels Remote Application Server 2018

Preliminary Research on Distributed Cluster Monitoring of G/S Model

Microsoft Office is a collection of programs that you will be already using in school. This includes Word, PowerPoint, Publisher, Excel etc..

Towards Green Cloud Computing: Demand Allocation and Pricing Policies for Cloud Service Brokerage Chenxi Qiu

SSD-based Information Retrieval Systems

Final Exam. Introduction to Artificial Intelligence. CS 188 Spring 2010 INSTRUCTIONS. You have 3 hours.

Extending SLURM with Support for GPU Ranges

Using Global Behavior Modeling to improve QoS in Cloud Data Storage Services

Risk-Aware Rapid Data Evacuation for Large- Scale Disasters in Optical Cloud Networks

THE proliferation of handheld mobile devices and wireless

SSD-based Information Retrieval Systems

Simulation of Cloud Computing Environments with CloudSim

CLUSTER HEAD SELECTION USING QOS STRATEGY IN WSN

Load Balancing Algorithm over a Distributed Cloud Network

Sensor Tasking and Control

Performance Extrapolation for Load Testing Results of Mixture of Applications

Joint Optimization of Server and Network Resource Utilization in Cloud Data Centers

R-Storm: A Resource-Aware Scheduler for STORM. Mohammad Hosseini Boyang Peng Zhihao Hong Reza Farivar Roy Campbell

DONAR Decentralized Server Selection for Cloud Services

Enhancing Software-Defined RAN with Collaborative Caching and Scalable Video Coding

Efficient Load Balancing and Disk Failure Avoidance Approach Using Restful Web Services

A new caching policy for cloud assisted Peer-to-Peer video on-demand services

VIAF: Verification-based Integrity Assurance Framework for MapReduce. YongzhiWang, JinpengWei

A Firewall Architecture to Enhance Performance of Enterprise Network

WIRELESS broadband networks are being increasingly

Robust Wireless Delivery of Scalable Videos using Inter-layer Network Coding

Research and Implementation of Server Load Balancing Strategy in Service System

Utilizing Datacenter Networks: Centralized or Distributed Solutions?

An Evaluation of an Energy Efficient Many-Core SoC with Parallelized Face Detection

Device-to-Device Networking Meets Cellular via Network Coding

Contents Overview of the Compression Server White Paper... 5 Business Problem... 7

Some Routing Challenges in Dynamic Networks

Relational Model, Relational Algebra, and SQL

Mining Frequent Itemsets for data streams over Weighted Sliding Windows

Guaranteeing Heterogeneous Bandwidth Demand in Multitenant Data Center Networks

Cooperative Mobile Internet Access with Opportunistic Scheduling

Server Specifications

Geometric Steiner Trees

Framework Research on Privacy Protection of PHR Owners in Medical Cloud System Based on Aggregation Key Encryption Algorithm

Modeling and Analysis: System Model

Evaluating the Potential of Graphics Processors for High Performance Embedded Computing

Reducing Power Consumption in Data Centers by Jointly Considering VM Placement and Flow Scheduling

Ingo Brenckmann Jochen Kirsten Storage Technology Strategists SAS EMEA Copyright 2003, SAS Institute Inc. All rights reserved.

CQNCR: Optimal VM Migration Planning in Cloud Data Centers

A 3D Point Cloud Registration Algorithm based on Feature Points

Datacenter Simulation Methodologies Case Studies

Spectral Graph Multisection Through Orthogonality. Huanyang Zheng and Jie Wu CIS Department, Temple University

The Construction of Open Source Cloud Storage System for Digital Resources

On the Robustness of Distributed Computing Networks

A Combined Semi-Pipelined Query Processing Architecture For Distributed Full-Text Retrieval

SAP High-Performance Analytic Appliance on the Cisco Unified Computing System

Deltek Vision 7.4 Technical Overview & System Requirements: Basic Deployment ( Employees) 1/28/2015

Some economical principles

Meraki MX Family. Overview

On the Robustness of Distributed Computing Networks

Genetic-Algorithm-Based Construction of Load-Balanced CDSs in Wireless Sensor Networks

Synthesis of Planar Mechanisms, Part IX: Path Generation using 6 Bar 2 Sliders Mechanism

Coriolis: Scalable VM Clustering in Clouds

Column-Stores vs. Row-Stores. How Different are they Really? Arul Bharathi

QLogic 2500 Series FC HBAs Accelerate Application Performance

NSFA: Nested Scale-Free Architecture for Scalable Publish/Subscribe over P2P Networks

ASN Configuration Best Practices

Enhancing Cloud Resource Utilisation using Statistical Analysis

Energy-aware Scheduling for Frame-based Tasks on Heterogeneous Multiprocessor Platforms

Analytical Modeling of Parallel Programs

Capacity planning and.

G-NET: Effective GPU Sharing In NFV Systems

Power-Aware Throughput Control for Database Management Systems

Effect of memory latency

Diffusing Your Mobile Apps: Extending In-Network Function Virtualisation to Mobile Function Offloading

An Efficient Privacy Preserving Keyword Search Scheme in Cloud Computing

Research on Heterogeneous Communication Network for Power Distribution Automation

A Randomized Algorithm for Minimizing User Disturbance Due to Changes in Cellular Technology

Using Hybrid Algorithm in Wireless Ad-Hoc Networks: Reducing the Number of Transmissions

Transcription:

IEEE CGC 2012, November 1-3, Xiangtan, Hunan, China Dynamic Grouping Strategy in Cloud Computing Presenter : Qin Liu a Joint work with Yuhong Guo b, Jie Wu b, and Guojun Wang a a Central South University, China b Temple University, USA 1

Outline 1. 1. Introduction 2. 2. Preliminaries 3. 3. K-Mean-based Dynamic Grouping 4. 4. Extensions 5. 5. Evaluation 6. 6. Conclusion & Future work 2

Introduction 3

Cloud computing model o Cloud computing has emerged as a new type of commercial paradigm due to its overwhelming advantages, such as flexibility, scalability, and cost efficiency F 1 : 1 :{A, B} B} A, B Cloud F 2 : 2 :{B, D} D} Bob F 1 F 1 2 2 F 3 : 3 :{C, D} D} 4

Application scenario University A Cloud Online library Access right Online library 5 Staffs &students

Problem: Cost grows linearly University A Alice {A,B} {F1,F2,F4} Redundant files: F1,F3: 2 F2,F4: 3 Bob Clark {A} {F1,F2} {C,D} {F2,F3,F4} {C} {F3,F4} 6 Donland

New chanllenge Alice University A {A,B} Deploy proxies inside Aggregate user queries Distribute search results Bob {F1, F2, F4} {A} {F1, F2} Proxy 1 {A,B} {F1, F2, F4} 7 Clark {C} {C, D} {F2, F3, F4} Proxy 2 Donland Return 6 files How to group queries to minimize the returned files? {F3, F4} {C, D} {F2, F3, F4}

Naïve solution: Random grouping Alice Clark Bob Donland University A {A,B} {F1, F2, F4} {C, D} {F2, F3, F4} {A} {F1, F2} {C} {F3, F4} Proxy 1 Proxy 2 Problems: Random grouping cause a waste of bandwidth {A,B} {F1, F2, F3 F4} {C, D} {F1,F2, F3, F4} Return 8 files Waste 20% bandwidth 8

Naïve solution: One proxy Alice University A {A,B} {F1, F2, F4} Problem: One proxy causes performance bottleneck and single point of failure Clark Bob Donland {C, D} {F2, F3, F4} {A} {F1, F2} {C} {F3, F4} Proxy {A,B,C, D} {F1,F2,F3,F4} Return 4 files Best performance 9

K-Mean-based Dynamic Grouping (KMDG) Classify n queries into k groups in the case of k proxy servers, so that each group size equals to n/k and the number of returned files is minimized. NP-Hard problem Heuristic grouping strategy Basic strategy: KMDG (based on K-Means) Extensions: KMDG1-Robust version KMDG2-Relax the constraint of equal group size 10

Design goals Effectiveness Cost efficiency Obtain optimal results within a polynomial time Load balancing Balance the bandwidth among proxies Minimize the bandwidth at at the cloud Robustness Obtain search results even if if some machines fail 11

Preliminaries 12

System model The cloud, many users, and many proxy servers Query router (QR) Aggregation and distribution machines (ADMs) 13

System model Given a public dictionary consisting of all keywords University A A user query/a combined query is a 0-1 bit string Alice Clark Bob {A,B} <1100> {C, D} <0011> {A} <1000> {C} <0010> Proxy {A,B,C, D} <1111> Dictionary=<A,B,C D> 14 Donland

Problem formulation Assumption Keywords are uniformly distributed in the file set The probability of each keyword in a file is the same 15 Minimize the total returned files = Minimize the number of keywords in combined queries Group queries with the most common keywords together Minimize the total number of 1s in the combined queries

Parameter analysis The expected value of the number of returned files can be calculated with Eq.1 d: the number of keywords in the dictionary γ: the average number of keywords in a file k: the number of groups/proxies t: the number of files in the cloud (grouping cost) : the average number of keywords in the j-th combined query (1) 16

K-Mean-based Dynamic Grouping (KMDG) 17

Definitions 18

High level ideas Suppose each query denotes a node 19

High level ideas An improvement: First choose a random node as the seed. For the i-th seed, choose the one with the total maximal distance with all i-1 seeds 20

High level ideas Closet: the minimal number of increased 1s after being combined with the seed s1=1100, s2=0011, s3=1001, s4=0101; Q=1100 21

High level ideas After choosing k seeds, grouping process is performed in the same way in next round 22

Algorithm 23

Example Sample Queries 24

Extensions 25

KMDG1-Algorithm Each user generates 2 α k query copies with the constraint that α copies are in different groups 26

KMDG1-Example 27

KMDG2-Algorithm Relax the constraint of equal group size to further reduce bandwidth 28

KMDG2-Example 29

Evaluation 30

Parameters Simulations are conducted with MATLAB R2010a, running on a local machine with an Intel Core 2 Duo E8400 3.0 GHz CPU and 8 GB RAM Summary of parameters 31

Performance Comparison of total number of 1s. X-axis denotes the number of users and Y-axis denotes the total number of 1s. 32

Performance Comparison of bandwidth at the cloud. X-axis denotes the number of users and Y-axis denotes bandwidth at the cloud (MB) 33

Load balancing Comparison of imbalanced transfer-in bandwidth. X-axis denotes the number of users and Y-axis denotes the imbalanced transfer-in bandwidth (MB) 34

Load balancing Comparison of imbalanced transfer-out bandwidth. X-axis denotes the number of users and Y-axis denotes the imbalanced transfer-out bandwidth (MB) 35

Conclusion & Future work 1 A dynamic grouping strategy is proposed to achieve cost efficiency, load balancing, and robustness in cloud computing 2 Experiment results show that KMDG can largely reduce the bandwidth incurred at the cloud 3 Conduct experiments on other keyword distributions to verify the effectiveness of KMDG 36

37 Thank you!