Performing Large Science Experiments on Azure: Pitfalls and Solutions
1 Performing Large Science Experiments on Azure: Pitfalls and Solutions
Wei Lu, Jared Jackson, Jaliya Ekanayake, Roger Barga, Nelson Araujo
Microsoft eXtreme Computing Group
2 Windows Azure
(Diagram: Application, Compute, Storage, Fabric)
3 Suggested Application Model
Using queues for reliable messaging. To scale, add more of either role.
- Web Role (IIS): ASP.NET, WCF, etc.
- Worker Role: main() { }
- Workflow shown in the diagram
  - 2) Put work in queue
  - 3) Get work from queue
  - 4) Do work
- Benefits
  - Decouples the system
  - Absorbs bursts
  - Resilient to instance failure
  - Easy to scale
4 Azure Queue
- Communication channel between instances
- Messages in the queue are reliable and durable
  - 7-day lifetime
- Fault tolerance mechanism
  - A de-queued message becomes visible again after visibilitytimeout if it is not deleted
  - 2-hour maximum limit
- Idempotent processing
(Diagram: several instances sharing one queue)
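The visibility-timeout mechanism and the need for idempotent processing can be illustrated with a small in-memory model. This is only a sketch: `ToyQueue`, its method names, and the explicit `now` clock argument are hypothetical and are not the Azure Storage API. The timeout is expressed in minutes to mirror the 2-hour cap.

```python
import itertools


class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (hypothetical)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._ids = itertools.count()
        self._messages = {}  # message id -> [body, time at which it is visible again]

    def put(self, body):
        self._messages[next(self._ids)] = [body, 0.0]

    def get(self, now):
        """Return (id, body) of a visible message and hide it, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        self._messages.pop(msg_id, None)


def process(task, results):
    """Idempotent task: running it twice leaves the same result."""
    results[task] = f"output-of-{task}"


queue = ToyQueue(visibility_timeout=120)  # minutes
queue.put("partition-7")
results = {}

# Worker A takes the message at t=0 and then crashes before deleting it.
msg_id, task = queue.get(now=0)
hidden = queue.get(now=60)  # still invisible to other workers

# After the timeout the message reappears and worker B completes it.
msg_id_b, task_b = queue.get(now=121)
process(task_b, results)
queue.delete(msg_id_b)
remaining = queue.get(now=500)
```

Because a message can be delivered more than once, the worker writes its result under a key derived from the task, so a repeated execution overwrites rather than duplicates.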
5 AzureBLAST
- BLAST (Basic Local Alignment Search Tool)
  - The most important software in bioinformatics
  - Identifies the similarity between bio-sequences
- BLAST is highly computation-intensive
  - Large number of pairwise alignment operations
  - The size of sequence databases has been growing exponentially
- Two choices for running large BLAST jobs
  - Building a local cluster
  - Submitting jobs to NCBI or EBI
    - Long job queuing time
- BLAST is easy to parallelize
  - Query segmentation
(Diagram: a splitting task fans out to several BLAST tasks, followed by a merging task)
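Query segmentation follows a split, run, merge shape. The sketch below shows that shape on FASTA-style input; `split_fasta` and `fake_blast` are hypothetical names, and `fake_blast` is only a placeholder for a real BLAST invocation (it echoes the query IDs it was given).

```python
def split_fasta(lines, records_per_partition):
    """Group FASTA lines into partitions of at most N whole records each."""
    partitions, current, count = [], [], 0
    for line in lines:
        if line.startswith(">"):  # a header line starts a new record
            if count == records_per_partition:
                partitions.append(current)
                current, count = [], 0
            count += 1
        current.append(line)
    if current:
        partitions.append(current)
    return partitions


def fake_blast(partition):
    """Placeholder for a BLAST task: returns the IDs of the queries it saw."""
    return [line[1:] for line in partition if line.startswith(">")]


fasta = [">s1", "MKV", ">s2", "GAT", "TTC", ">s3", "PLW", ">s4", "QRS", ">s5", "HHN"]
parts = split_fasta(fasta, records_per_partition=2)

# Each partition can run on a different worker; merging concatenates the outputs.
merged = [row for part in parts for row in fake_blast(part)]
```

Splitting on record boundaries keeps every query sequence intact, which is what makes the partitions independent of each other.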
6 AzureBLAST (architecture diagram)
- Web Role: Web Portal, Web Service, job registration
- Job Management Role: Job Scheduler
- Global dispatch queue
- Workers
- Database updating Role: NCBI databases
- Azure Table: Job Registry
- Azure Blob: BLAST databases, temporary data, etc.
7 All-by-All BLAST experiment
- All-by-all query
  - Compare the database against itself
  - Discovering homologs: inter-relationships of known protein sequences
- Large protein database (4.2 GB)
  - 9,865,668 sequences in total
  - In theory, 100 billion sequence comparisons!
- Performance estimate
  - Would require 14 CPU-years
  - One of the biggest BLAST jobs as far as we know
8 Our Solution
- Allocated 3776 weighted instances
  - 475 extra-large instances
  - From three datacenters: US South Central, West Europe and North Europe
- Divided the 10 million sequences into several segments
  - Each segment is submitted to one datacenter as one job
  - Each segment consists of smaller partitions
- In the end, the job took two weeks
- Total size of all outputs is ~230 GB
9 Understanding Azure by analyzing logs
A normal log record should look like this:
3/31/2010 6:14 RD00155D3611B0 Executing the task
3/31/2010 6:25 RD00155D3611B0 Execution of task is done, it takes 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task
3/31/2010 6:44 RD00155D3611B0 Execution of task is done, it takes 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task
3/31/2010 7:02 RD00155D3611B0 Execution of task is done, it takes [...] mins
Otherwise, something is wrong (e.g., a lost task):
3/31/2010 8:22 RD00155D3611B0 Executing the task
3/31/2010 9:50 RD00155D3611B0 Executing the task
3/31/2010 [...]:12 RD00155D3611B0 Execution of task is done, it takes 82 mins
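The "start without a matching finish" pattern can be detected mechanically. The sketch below assumes a hypothetical record layout of `date time machine message`, with a task ID as the fourth word of the message; the IDs 101 to 103 are invented for illustration, since the transcript does not preserve the real ones.

```python
def find_lost_tasks(lines):
    """Return (machine, task_id) pairs whose start was followed by another
    start on the same machine before any completion was logged."""
    running = {}  # machine -> task id believed to be executing
    lost = []
    for line in lines:
        _date, _time, machine, message = line.split(" ", 3)
        words = message.split()
        if message.startswith("Executing the task"):
            if machine in running:
                lost.append((machine, running[machine]))
            running[machine] = words[3]
        elif message.startswith("Execution of task"):
            # Clear the slot only if the finished task is the tracked one.
            if running.get(machine) == words[3]:
                del running[machine]
    return lost


log = [
    "3/31/2010 6:14 RD00155D3611B0 Executing the task 101",
    "3/31/2010 6:25 RD00155D3611B0 Execution of task 101 is done, it takes 10.9mins",
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 102",
    "3/31/2010 9:50 RD00155D3611B0 Executing the task 103",
    "3/31/2010 11:12 RD00155D3611B0 Execution of task 103 is done, it takes 82 mins",
]
lost = find_lost_tasks(log)
```

A task that is still running when the log ends is not reported, so the check only flags starts that were demonstrably superseded.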
10 Challenges & Pitfalls
- Failures
- Instance idle time
- Limitations of the current Azure Queue
- Performance/cost estimation
- Minimizing the need for programming
11 Case Study 1
- North Europe datacenter, 34,265 tasks processed in total
- Node replacement: avoid using the machine name in your program
- Almost one day of delay: try not to orchestrate instances with tight synchronization (e.g., a barrier)
12 Case Study 2
- North Europe datacenter, 34,256 tasks processed in total
- All 62 nodes lost tasks and then came back in groups: this is the update domain at work
  - ~6 nodes in one group
  - ~30 mins
13 Case Study 3
- West Europe datacenter; 30,976 tasks were completed, and the job was killed
- 35 nodes experienced blob-writing failures at the same time
- A reasonable guess: the fault domain is at work
14 Challenges & Pitfalls
- Failures
  - Failures are to be expected, yet unpredictable
  - Design with failure in mind
  - Most are automatically recovered by the cloud
- Instance idle time
- Limitations of the current Azure Queue
- Performance/cost estimation
- Minimizing the need for programming
15 Challenges & Pitfalls
- Failures
- Instance idle time
  - Gap time between two jobs
  - Diversity of workload
  - Load imbalance
- Limitations of the current Azure Queue
- Performance/cost estimation
- Minimizing the need for programming
16 Load imbalance
- North Europe datacenter, 2058 tasks
- Two days of very low system throughput due to some long-tail tasks
- One task needed 8 hours to complete; it was re-executed by 8 nodes because of the 2-hour maximum visibilitytimeout of a message
17 Challenges & Pitfalls
- Failures
- Instance idle time
- Limitations of the current Azure Queue
  - 2-hour maximum visibilitytimeout
    - Each individual task has to finish within 2 hours
  - 7-day maximum message lifetime
    - The entire experiment has to finish in less than 7 days
- Performance/cost estimation
- Minimizing the need for programming
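Both limits can be checked before a job is submitted. The sketch below is a hypothetical pre-flight check: `check_plan` and the per-task estimates are invented for illustration, and the makespan is an idealized lower bound that assumes perfect load balance.

```python
from datetime import timedelta

MAX_TASK = timedelta(hours=2)       # maximum visibilitytimeout
MAX_EXPERIMENT = timedelta(days=7)  # maximum message lifetime


def check_plan(task_estimates, instances):
    """Return (tasks exceeding the 2-hour cap, whether the ideal makespan fits in 7 days)."""
    too_long = [name for name, d in task_estimates.items() if d > MAX_TASK]
    total = sum(task_estimates.values(), timedelta())
    ideal_makespan = total / instances  # lower bound: ignores load imbalance
    return too_long, ideal_makespan <= MAX_EXPERIMENT


estimates = {"p1": timedelta(minutes=90), "p2": timedelta(hours=8)}
too_long, fits = check_plan(estimates, instances=2)
```

Any task reported in `too_long` is a candidate for finer partitioning, since a message that outlives its visibility timeout is handed to another worker.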
18 Challenges & Pitfalls
- Failures
- Instance idle time
- Limitations of the current Azure Queue
- Performance/cost estimation
  - The better you understand your application, the more money you can save
  - BLAST has about 20 arguments
  - VM size
- Minimizing the need for programming
19 Cirrus: Parameter Sweeping Service on Azure (architecture diagram)
- Web Role: Web Portal, Web Service, job registration
- Job Manager Role: Job Scheduler, Scaling Engine, Parametric Engine, Sampling Filter
- Dispatch Queue
- Workers
- Azure Table
- Azure Blob
20 Job Manager Role: Job Definition
(Components: Job Scheduler, Scaling Engine, Parametric Engine, Sampling Filter)
- Declarative job definition
  - Derived from Nimrod
  - Each job can have
    - Prolog
    - Commands
    - Parameters
  - Azure-related operators
    - AzureCopy
    - AzureMount
    - SelectBlobs
  - Job configuration
- Minimizes the programming needed to run legacy binaries on Azure
  - BLAST
  - Bayesian network machine learning
  - Image rendering

    <job name="blast">
      <prolog>
        azurecopy uniref.fasta
      </prolog>
      <cmd>
        azurecopy %partition% input
        blastall.exe -p blastp -d uniref.fasta -i input -o output
        azurecopy output %partition%.out
      </cmd>
      <parameter name="partition">
        <selectblobs>
          <prefix>partitions/</prefix>
        </selectblobs>
      </parameter>
      <configure>
        <mininstances>2</mininstances>
        <maxinstances>4</maxinstances>
        <shutdownwhendone> true </shutdownwhendone>
        <sampling> true </sampling>
      </configure>
    </job>
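A parametric engine could turn such a definition into concrete tasks by substituting each parameter value into the command template. The sketch below is an assumption about how that expansion might work, not the Cirrus implementation: `expand_tasks` is a hypothetical helper, the blob names are invented, and the XML is a trimmed copy of the job definition from the slide.

```python
import xml.etree.ElementTree as ET

JOB_XML = """
<job name="blast">
  <cmd>
    azurecopy %partition% input
    blastall.exe -p blastp -d uniref.fasta -i input -o output
    azurecopy output %partition%.out
  </cmd>
  <parameter name="partition">
    <selectblobs><prefix>partitions/</prefix></selectblobs>
  </parameter>
</job>
"""


def expand_tasks(job_xml, blob_names):
    """Produce one command list per blob that matches the selectblobs prefix."""
    job = ET.fromstring(job_xml.strip())
    param = job.find("parameter")
    name = param.get("name")
    prefix = param.findtext("selectblobs/prefix").strip()
    template = [ln.strip() for ln in job.findtext("cmd").splitlines() if ln.strip()]
    values = [b for b in blob_names if b.startswith(prefix)]
    return [[ln.replace(f"%{name}%", v) for ln in template] for v in values]


# Hypothetical container listing; a real engine would query blob storage.
blobs = ["partitions/p0", "partitions/p1", "other/readme.txt"]
tasks = expand_tasks(JOB_XML, blobs)
```

Each element of `tasks` is the command sequence for one partition, which is the unit that would be placed on the dispatch queue.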
21 Job Manager Role: Dynamic Scaling
(Components: Job Scheduler, Scaling Engine, Parametric Engine, Sampling Filter)
- Scaling in/out for an individual job
  - Fit into the [min, max] window specified in the job config
  - Synchronous scaling
    - Tasks are dispatched after the scaling is done
  - Asynchronous scaling
    - Task execution and the scaling operation are simultaneous
- Scaling in when load imbalance happens
- Scaling in when no new jobs are received after a period of time
  - Or if the job is configured as shutdown-when-done
  - Usually used for the reducing job
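Fitting the instance count into the [min, max] window amounts to a clamp. The sketch below assumes a simple sizing heuristic (a fixed number of pending tasks per instance) that is not stated in the slides; `target_instances` and `tasks_per_instance` are hypothetical.

```python
import math


def target_instances(pending_tasks, tasks_per_instance, min_instances, max_instances):
    """Pick an instance count for the backlog, clamped to the job's [min, max] window."""
    wanted = math.ceil(pending_tasks / tasks_per_instance)  # assumed heuristic
    return max(min_instances, min(max_instances, wanted))


idle = target_instances(0, 10, min_instances=2, max_instances=4)
medium = target_instances(25, 10, min_instances=2, max_instances=4)
burst = target_instances(1000, 10, min_instances=2, max_instances=4)
```

With the window from the job definition (2 to 4 instances), an empty backlog stays at the minimum and a large burst is capped at the maximum.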
22 Job Pause-ReConfig-Resume
- Each job maintains a task status table
  - Checkpoint by snapshotting the task table
  - A task can be incomplete
- Fixes the 7-day/2-hour limitation
- Handle exceptions optimistically
  - Ignore the exceptions, retry incomplete tasks with a reduced number of instances, and minimize the cost of failures
- Handle load imbalance
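The pause-reconfig-resume pattern can be sketched as a status table that is snapshotted on pause and used on resume to find the tasks that still need to run. The `TaskTable` class below is a hypothetical in-memory illustration; the slides describe the table as living in Azure Table storage.

```python
import copy


class TaskTable:
    """Per-job task status table (hypothetical in-memory version)."""

    def __init__(self, task_ids):
        self.status = {t: "incomplete" for t in task_ids}

    def mark_done(self, task_id):
        self.status[task_id] = "done"

    def snapshot(self):
        """Checkpoint: an independent copy of the current statuses."""
        return copy.deepcopy(self.status)

    @classmethod
    def resume(cls, snapshot):
        table = cls([])
        table.status = dict(snapshot)
        return table

    def incomplete(self):
        return [t for t, s in self.status.items() if s != "done"]


table = TaskTable(["p0", "p1", "p2", "p3"])
table.mark_done("p0")
table.mark_done("p2")
checkpoint = table.snapshot()   # pause
table.mark_done("p1")           # later changes do not affect the checkpoint

resumed = TaskTable.resume(checkpoint)  # resume, possibly with a new configuration
to_retry = resumed.incomplete()
```

On resume, only the tasks in `to_retry` would be dispatched again, each as a fresh queue message.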
23 Performance Estimation by Sampling
(Components: Job Manager Role, Job Scheduler, Scaling Engine, Parametric Engine, Sampling Filter)
- Observation-based approach
- Randomly sample the parameter space based on the sampling ratio a
- Only dispatch the sample tasks
- Scale in to only n instances to save cost
- Assuming a uniform distribution, the estimation is done by (formula missing from the transcription)
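Since the slide's formula did not survive, the sketch below uses one plausible estimator as an assumption: if task costs are uniformly distributed, the sample's instance-time divided by the sampling ratio approximates the total work, which is then spread over the full instance count. The function names and parameters are hypothetical.

```python
import random


def sample_tasks(tasks, ratio, seed=None):
    """Randomly pick a fraction `ratio` of the tasks (at least one)."""
    rng = random.Random(seed)
    k = max(1, round(len(tasks) * ratio))
    return rng.sample(tasks, k)


def estimate_full_runtime(sample_minutes, sample_instances, ratio, full_instances):
    """Assumed estimator: scale the sample's instance-minutes up by 1/ratio,
    then divide by the number of instances planned for the full run."""
    sample_work = sample_minutes * sample_instances  # instance-minutes
    total_work = sample_work / ratio
    return total_work / full_instances


chosen = sample_tasks(list(range(1000)), ratio=0.02, seed=1)
estimate = estimate_full_runtime(
    sample_minutes=10, sample_instances=2, ratio=0.25, full_instances=8
)
```

The estimate ignores startup overhead and load imbalance, so it is better read as a rough planning figure than as a prediction.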
24 Evaluation
- A complete BLAST run takes 2 hours with 16 instances
- A 2% sampling run, which achieves 96% accuracy, takes only about 18 minutes with 2 instances
- The overall cost of the sampling run is only 1.8% of the complete run
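The cost figure can be checked with simple arithmetic, assuming cost is proportional to instance-minutes (an assumption; actual billing granularity is not discussed in the slides).

```python
# Cost modeled as instance-minutes (assumption: billing is proportional to
# instances multiplied by wall-clock time).
full_cost = 120 * 16    # 2 hours on 16 instances
sample_cost = 18 * 2    # about 18 minutes on 2 instances

cost_ratio = sample_cost / full_cost
print(f"sampling run cost: {cost_ratio:.2%} of the complete run")
```

Under that assumption the ratio comes out just under 2%, in line with the roughly 1.8% quoted above given that the 18 minutes is itself approximate.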
25 Evaluation
- Scaling out
  - Sync. operation stalls all instances for 80 minutes
  - Async. operation: existing instances keep working; new instances need minutes
  - The 16-instance run is 1.4x faster
- Scaling in
  - Sync. operation finished in 3 minutes
  - Async. operation caused random message loss, which may lead to more idle instance time
- Best practices
  - Scale out asynchronously
  - Scale in synchronously
- New instances join in minutes
- Azure randomly picks the instances to shut down
26 Conclusion
- Running large-scale parameter sweeping experiments on Azure
- Identified pitfalls
  - Design with failure in mind (most failures are recoverable)
  - Watch out for instance idle time
  - Understand your application to save cost
  - Minimize the need for programming
- Our parameter sweeping solutions
  - Declarative job definition
  - Dynamic scaling
  - Job pause-reconfig-resume pattern
  - Performance estimation
Azure Syllabus Cloud Computing What is Cloud Computing Cloud Characteristics Cloud Computing Service Models Deployment Models in Cloud Computing Advantages and Disadvantages of Cloud Computing Getting
More informationTasks. Task Implementation and management
Tasks Task Implementation and management Tasks Vocab Absolute time - real world time Relative time - time referenced to some event Interval - any slice of time characterized by start & end times Duration
More informationBERLIN. 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved
BERLIN 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved Amazon Aurora: Amazon s New Relational Database Engine Carlos Conde Technology Evangelist @caarlco 2015, Amazon Web Services,
More informationStep-by-Step Guide to Installing Cluster Service
Page 1 of 23 TechNet Home > Products & Technologies > Windows 2000 Server > Deploy > Configure Specific Features Step-by-Step Guide to Installing Cluster Service Topics on this Page Introduction Checklists
More informationFlat Datacenter Storage. Edmund B. Nightingale, Jeremy Elson, et al. 6.S897
Flat Datacenter Storage Edmund B. Nightingale, Jeremy Elson, et al. 6.S897 Motivation Imagine a world with flat data storage Simple, Centralized, and easy to program Unfortunately, datacenter networks
More informationDeveloping In The Cloud
Developing In The Cloud What is the Cloud? How does it work? What is P&P doing to help? What Is The Cloud? Cloud computing is a model for enabling Cloud convenient, computingon-demand is the provision
More informationChapter 5: CPU Scheduling
Chapter 5: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Thread Scheduling Multiple-Processor Scheduling Operating Systems Examples Algorithm Evaluation Chapter 5: CPU Scheduling
More informationScalable Parallel Scientific Computing Using Twister4Azure
Scalable Parallel Scientific Computing Using Twister4Azure Thilina Gunarathne, Bingjing Zhang, Tak-Lon Wu, Judy Qiu School of Informatics and Computing Indiana University, Bloomington. {tgunarat, zhangbj,
More informationGFS: The Google File System. Dr. Yingwu Zhu
GFS: The Google File System Dr. Yingwu Zhu Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one big CPU More storage, CPU required than one PC can
More informationVendor: Microsoft. Exam Code: Exam Name: Developing Microsoft Azure Solutions. Version: Demo
Vendor: Microsoft Exam Code: 70-532 Exam Name: Developing Microsoft Azure Solutions Version: Demo DEMO QUESTION 1 You need to configure storage for the solution. What should you do? To answer, drag the
More information