Contents


Preface
PART I: CLOUD, BIG DATA, AND COGNITIVE COMPUTING
1 Principles of Cloud Computing Systems
  1.1 Elastic Cloud Systems for Scalable Computing
    1.1.1 Enabling Technologies for Cloud Computing
    1.1.2 Evolution of Scalable Distributed/Parallel Computing
    1.1.3 Virtualized Resources in Cloud Systems
    1.1.4 Cloud Computing versus On-Premise Computing
  1.2 Cloud Architectures Compared with Distributed Systems
    1.2.1 Basic Cloud Platform Architectures
    1.2.2 Public, Private, Community, and Hybrid Clouds
    1.2.3 Physical Clusters versus Virtual Clusters
    1.2.4 Comparison with Other Parallel/Distributed Systems
  1.3 Service Models, Ecosystems, and Scalability Analysis
    1.3.1 Cloud Service Models: IaaS, PaaS, and SaaS
    1.3.2 Scalability Laws in Evaluating Cloud Performance
    1.3.3 Cloud Ecosystem and User Environments
    1.3.4 Gartner Hype Cycle for Cloud Computing
    1.3.5 Interaction among SMACT Technologies
  1.4 Availability, Mobility, and Cluster Optimization
    1.4.1 Availability Analysis of Cloud Server Clusters
    1.4.2 Fault Tolerance in Virtual Cluster Operations
    1.4.3 Queueing Model of Multiserver Clusters in Clouds
    1.4.4 Multiserver Cluster Optimization for Cloud Computing
  1.5 Conclusions
  Homework Problems

2 Data Analytics, Internet of Things and Cognitive Computing
  2.1 Big Data Science and Application Challenges
    2.1.1 Data Science and Big Data Characteristics
    2.1.2 Gartner Hype Cycle for the Internet of Things
    2.1.3 Towards a Big Data Industry
    2.1.4 Big Data Applications: An Overview
  2.2 The Internet of Things and Cloud Interactions
    2.2.1 IoT Sensing and Platform Architecture
    2.2.2 IoT Value Chains and Development Road Map
    2.2.3 Stand-alone and Cloud-centric IoT Applications
    2.2.4 Smart City and Smart Community Development
  2.3 Data Collection, Mining, and Analytics on Clouds
    2.3.1 Data Quality Control and Representations
    2.3.2 Data Mining and Data Analytics
    2.3.3 Upgrading Data Analytics on Clouds
    2.3.4 Cloud Resources for Supporting Big Data Analytics
  2.4 Neuromorphic Hardware and Cognitive Computing
    2.4.1 Cognitive Computing and Neuromorphic Processors
    2.4.2 SyNAPSE and Related Neurocomputer Projects at IBM
    2.4.3 Cambricon NPU at the Chinese Academy of Sciences
    2.4.4 Google's TPU and Related AI Programs
  2.5 Conclusions
  Homework Problems
PART II: CLOUD ARCHITECTURE AND SERVICE PLATFORM DESIGN
3 Virtual Machines, Docker Containers, and Server Clusters
  3.1 Virtualization in Cloud Computing Systems
    3.1.1 Basic Concept of Machine Virtualization
    3.1.2 Implementation Levels of Virtualization
    3.1.3 Resources Virtualization in Cluster or Cloud Systems
  3.2 Hypervisors for Creating Native Virtual Machines
    3.2.1 Virtual Machine Architecture Types
    3.2.2 Full Virtualization and Hosted Virtualization
    3.2.3 Paravirtualization with Guest OS Modification
    3.2.4 Comparison of Platform Virtualization Software Products and Toolkits
  3.3 Docker Engine and Application Containers
    3.3.1 Virtualization at Linux Kernel Level

  3.4 Docker Containers and Deployment Requirements
    3.4.1 Docker Containers Created with Linux Kernel Functions
    3.4.2 Docker Containers versus Virtual Machines
    3.4.3 Architectural Evolution from VMs to Containers and Unikernel
  3.5 Virtual Machine Management and Container Orchestration
    3.5.1 VM Management Solutions
    3.5.2 VM Migration for Disaster Recovery
    3.5.3 Docker Container Scheduling and Orchestration
  3.6 Eucalyptus, OpenStack, and VMware for Cloud Construction
    3.6.1 Eucalyptus for Virtual Clustering in Private Clouds
    3.6.2 OpenStack Software for Building Private or Public Clouds
    3.6.3 VMware Virtualization Support for Building Hybrid Clouds
  3.7 Conclusions
  Homework Problems
4 Cloud Architectures and Service Platform Design
  4.1 Cloud Architecture and Infrastructure Design
    4.1.1 Public Clouds and Service Offerings
    4.1.2 Business Models of Cloud Services
    4.1.3 Converting Data Centers to Cloud Platforms
    4.1.4 Elastic Resources Provisioning Methods
  4.2 Dynamic Deployment of Virtual Clusters
    4.2.1 Virtual Cluster Deployment Projects
    4.2.2 Virtual Cluster Configuration Adaptation
    4.2.3 Virtualization Support for Data Center Clusters
    4.2.4 VMware vSphere 6: A Commercial Cloud Operating System
  4.3 Amazon AWS Cloud and Service Offerings
    4.3.1 Three Cloud Architectures and Services Convergence
    4.3.2 AWS EC2 Compute Engine and S3 Storage Cloud
    4.3.3 Other AWS Cloud Service Offerings
  4.4 Google App Engine and Microsoft Azure
    4.4.1 Google App Engine and Compute Engine
    4.4.2 Google Hardware/Software Support for Machine Learning Services
    4.4.3 Microsoft Azure and Service Offerings
  4.5 Salesforce, IBM SmartCloud, and Other Clouds
    4.5.1 Salesforce Clouds for SaaS Services
    4.5.2 IBM SmartCloud, IoT, and Cognitive Projects
    4.5.3 Clouds at SGI, NASA, and CERN

  4.6 Conclusions
  Homework Problems
5 Clouds for Mobile, IoT, Social Media, and Mashup Services
  5.1 Wireless Internet and Mobile Cloud Computing
    5.1.1 Mobile Devices and Internet Edge Networks
    5.1.2 Wi-Fi, Bluetooth, and Wireless Sensor Networks
    5.1.3 Cloudlet Mesh for Mobile Cloud Computing
    5.1.4 Mobile Clouds and Colocation Clouds
  5.2 IoT Sensing and Interaction with Clouds
    5.2.1 Local and Global Positioning Systems
    5.2.2 Cloud-Based RAN for Building Mobile Networks
    5.2.3 IoT Interaction Frameworks with Clouds and Devices
  5.3 Cloud Computing in Social Media Applications
    5.3.1 Social Media Big-Data Industrial Applications
    5.3.2 Social Networks and API for Social Media Applications
    5.3.3 Social Graph Properties and Representations
    5.3.4 Social Graph Analysis on Smart Clouds
  5.4 Multicloud Mashup Architecture and Service
    5.4.1 Cloud Mashup Architecture for Agility and Scalability
    5.4.2 Multicloud Mashup Service Architecture
    5.4.3 Skyline Discovery of Mashup Services
    5.4.4 Dynamic Composition of Mashup Services
  5.5 Conclusions
  Homework Problems
PART III: PRINCIPLES OF MACHINE LEARNING AND ARTIFICIAL INTELLIGENCE MACHINES
6 Machine Learning Algorithms and Model Fitting
  6.1 Taxonomy of Machine Learning Methods
    6.1.1 Categories of Machine Learning Algorithms
    6.1.2 Supervised Machine Learning Algorithms
    6.1.3 Unsupervised Machine Learning Algorithms
  6.2 Supervised Regression and Classification Methods
    6.2.1 Linear Regression for Prediction or Forecasting
    6.2.2 Decision Trees for Machine Learning
    6.2.3 Bayesian Classifier with Training Samples
    6.2.4 Support Vector Machines (SVM)
  6.3 Clustering and Dimensionality Reduction Methods

    6.3.1 Cluster Analysis and K-Means Clustering
    6.3.2 Dimensionality Reduction and Reinforcement Learning
    6.3.3 Principal Component Analysis
    6.3.4 Semi-Supervised Learning Methods
  6.4 Model Development for Machine Learning Applications
    6.4.1 Performance Metrics and Model-Fitting Cases
    6.4.2 Methods to Reduce Model Over-Fitting
    6.4.3 Methods to Avoid Model Under-Fitting
    6.4.4 Machine Learning Model Selection Options
  6.5 Conclusions
  Homework Problems
7 Intelligent Machines and Deep Learning Networks
  7.1 Artificial Intelligence and Smart Machine Development
    7.1.1 Analysis of 2016 Gartner Hype Cycle on Smart Machines
    7.1.2 Google's Development of AI Products and Services
    7.1.3 Cognitive Services at IBM and Other Companies
    7.1.4 Deep Learning Chips at Intel, Nvidia, and CAS/ICT
  7.2 Augmented/Virtual Reality and Blockchain Technology
    7.2.1 Augmented, Mediated, and Virtual Realities (AR, MR, VR)
    7.2.2 Virtual Reality and Product Reviews
    7.2.3 Block Chaining for Securing Business Transactions
  7.3 Artificial Neural Networks for Deep Learning
    7.3.1 Deep Learning Mimics Human Cognitive Functions
    7.3.2 Evolution of ANNs and Reported Applications
    7.3.3 Mathematical Description of an Artificial Neuron
    7.3.4 Multilayer Artificial Neural Network
    7.3.5 Forward Propagation and Backward Propagation in ANN
  7.4 Taxonomy of Deep Learning Networks
    7.4.1 Classes and Types of Deep Learning Networks
    7.4.2 Convolutional Neural Networks
    7.4.3 Connectivity in Deep Neural Networks
    7.4.4 Recurrent Neural Networks (RNNs)
  7.5 Deep Learning of Other Brain Functions
    7.5.1 Restricted Boltzmann Machines
    7.5.2 Deep Belief Networks
    7.5.3 Deep Learning to Explore Other Brain Functions
  7.6 Conclusions
  Homework Problems

PART IV: CLOUD PROGRAMMING AND PERFORMANCE BOOSTERS
8 Cloud Programming with Hadoop and Spark
  8.1 Scalable Parallel Computing Over Large Clusters
    8.1.1 Characteristics of Scalable Computing
    8.1.2 From MapReduce to Hadoop and Spark
    8.1.3 Application Software Libraries for Big Data Processing
  8.2 Hadoop Programming with YARN and HDFS
    8.2.1 The MapReduce Compute Engine
    8.2.2 MapReduce for Parallel Matrix Multiplication
    8.2.3 Hadoop Architecture and Recent Extensions
    8.2.4 Hadoop Distributed File System
    8.2.5 Hadoop YARN for Resource Management
  8.3 Spark Core and Resilient Distributed Data Sets
    8.3.1 Spark Core for General-Purpose Applications
    8.3.2 Resilient Distributed Data Sets
    8.3.3 Spark Programming with RDDs for DAG Tasks
  8.4 Spark SQL and Streaming Programming
    8.4.1 Spark SQL with Structured Data
    8.4.2 Spark Streaming with Live Stream of Data
    8.4.3 Spark Streaming Application Examples
  8.5 Spark MLlib for Machine Learning and GraphX for Graph Processing
    8.5.1 Spark MLlib Library for Machine Learning
    8.5.2 Some MLlib Application Examples
    8.5.3 Spark GraphX for Graph Processing
    8.5.4 Some GraphX Programming Examples
  8.6 Conclusions
  Homework Problems
9 TensorFlow, Keras, DeepMind, and Graph Analytics
  9.1 TensorFlow for Neural Network Computing
    9.1.1 Key Concepts of TensorFlow
    9.1.2 Tensors, Variables, Feed, and Fetch Operations
    9.1.3 Distributed TensorFlow Execution Environment
    9.1.4 Execution Sessions in TensorFlow Programs
  9.2 TensorFlow System for Deep Learning
    9.2.1 Layered TensorFlow System Architecture
    9.2.2 TensorFlow Installation on Various Host Machines
    9.2.3 TensorFlow Ecosystem for Distributed Resources Sharing
    9.2.4 TensorFlow for Handwritten Digit Recognition

    9.2.5 TensorFlow Applications for Cognitive Services
  9.3 Google's DeepMind and Other AI Programs
    9.3.1 Reinforcement Deep Learning Algorithm
    9.3.2 Interaction Between Policy Network and Value Network
    9.3.3 Reinforcement Learning in the AlphaGo Program
    9.3.4 DeepMind Health Project in the United Kingdom
  9.4 Predictive Software, Keras, DIGITS, and Graph Libraries
    9.4.1 Predictive Software Libraries for Cognitive Applications
    9.4.2 Keras Library and DIGITS 5 for Deep Learning
    9.4.3 Graph-Parallel Computations on Clouds
    9.4.4 Community Detection in Social Networks
  9.5 Conclusions
  Homework Problems
10 Cloud Performance, Security, and Data Privacy
  10.1 Introduction
    10.1.1 What Are Cloud Performance and QoS?
    10.1.2 How Do You Secure Clouds and Protect Shared Data?
  10.2 Cloud Performance Metrics and Benchmarks
    10.2.1 Auto-Scaling, Scale-Out, and Scale-Up Strategies
    10.2.2 Cloud Performance Metrics
    10.2.3 Cloud Performance Models Expressed in Radar Charts
  10.3 Performance Analysis of Cloud Benchmark Results
    10.3.1 Elastic Analysis of Scalable Cloud Performance
    10.3.2 Scale-Out, Scale-Up, and Mixed Scaling Performance
    10.3.3 Relative Merits of Scaling Strategies
  10.4 Cloud Security and Data Privacy Protection
    10.4.1 Cloud Security and Privacy Issues
    10.4.2 Cloud Security Infrastructure
    10.4.4 Mobile Clouds and Security Threats
  10.5 Trust Management in Clouds and Datacenters
    10.5.1 Distributed Intrusion and Anomaly Detection
    10.5.2 Reputation-Based Trust Management in Clouds
    10.5.3 P2P Trust Overlay Network over Multiple Data Centers
  10.6 Conclusions
  Homework Problems
Index