Automated Deployment of Private Cloud (EasyCloud)


GROUP
Mohannad S. Mostafa
Musab Al Zahrani
Hassan Al Salam
Moath Al Solea
Mohammed Kazim

ADVISOR
Dr. Ahmad Khayyat

COE485, December 2015, Term 151

Table of Contents

Introduction
How this project deals with the issue
Positive impacts on the society
Problem Statement
Background
Definition of Cloud Computing
Characteristics of Cloud Computing
Service Models
Deployment Models
Terminology
Existing research and products
Requirements and Specifications
Functional user requirements
Non-functional user requirements
Technical specifications
System Design
Solution Concept
General Approach
Alternative Approaches
Comparison between Approaches
Sub-functions
Architecture Design
Alternative Architectures
Comparison between Architectures
Hardware/Software Components
Function of Hardware/Software Components
Component Design
Cloud Platform
OpenNebula System
Deployment Management Tool
Automatic Installation Tool
System Integration
Design Evolution
Initial Design
Final Design
Testing, Analysis, and Evaluation
Testing methodology and results
System analysis and evaluation
Issues
Engineering Tools and Standards
Installing the master host
Cobbler
Ansible
Conclusion
Teamwork
Reference
Appendix

Introduction

A private cloud is, by definition, a single-tenant environment in which the hardware, storage, and network are dedicated to a single client or company [1]. Cloud computing has become an important factor in many enterprises. However, deploying a private cloud can be very time-consuming: depending on the scale and the deployment method, it can take months to complete. For example, to deploy a cloud on a cluster of PCs, you have to install an operating system on each machine, manage the machines after installation, and add them to the cloud, which can take from weeks to months depending on the approach. This project tries to solve the deployment issue by implementing an automated deployment system for private clouds.

The first issue is that traditional computing techniques have many limitations, and cloud computing emerged as a solution that expands what is possible. Many businesses are shifting to cloud computing because it offers scalability, flexibility, and agility, along with better distribution of workload, without significantly increasing the IT budget. The second issue is whether to choose a private or a public cloud, which is debatable. The major point in deciding is to evaluate your requirements and then identify which solution works for you. Examples of the criteria that companies evaluate are the business-critical applications they want to move to the cloud, the regulatory issues they may need to comply with, the required service levels, the usage patterns of the workloads, and how integrated the application must be with other enterprise functions.

Figure 1 shows how rapidly companies are shifting to cloud computing. Taking Amazon's cloud as an example, Figure 2 shows how the number of virtual computers created per day is increasing.

Figure 1: Number of websites using cloud providers [9]

Figure 2: Amazon's virtual computers created per day [9]

Figure 3 shows the business trend toward using cloud computing; private cloud has the higher percentage.

Figure 3: Business Trend

Table 1 compares traditional and cloud computing and shows the aspects in which cloud computing has the advantage.

Table 1: Comparison between traditional and cloud computing [10]

Characteristic                    Cloud computing              Traditional computing
Time before accessing a service   Minutes / hours              Days / weeks
Capital expenditure               Pay as you go, variable      Upfront cost, fixed
Economies of scale                Yes, for all organizations   Only for large organizations
Multi-tenancy                     Yes                          Generally no, but can be found in application hosting
Scalability                       Elastic and automatic        Manual
Virtualized                       Usually                      Sometimes

To appreciate the power of cloud computing, we need to understand the strength of these key factors:

- Time before accessing a service: once a cloud environment is set up, access is available almost immediately, whereas traditional computing has a lead time for installation, setup, and configuration.
- Capital expenditure: the cloud reduces the upfront cost of procuring hardware and software and of setting up the environment.

- Economies of scale: once an enhancement is discovered, it can be applied to all other clouds.
- Multi-tenancy: clouds can host multiple consumers effectively on shared resources.
- Scalability: in clouds, scaling can be done automatically, whereas traditional computing requires human intervention in hardware and software.
- Virtualized: clouds are usually virtualized, while traditional environments can be a mix of physical and virtual.

Cloud computing therefore offers significant advantages in many respects. The question now is which kind of cloud service to choose. Our project focuses on private clouds. A private cloud offers a number of advantages that can make it a more viable solution than a public cloud:

- Greater control: the main reason to use a private cloud is to keep your resources under your control, so that you can oversee your data.
- More security: when a cloud is dedicated to a single organization, a high level of security can be designed into the system. National regulations are another reason to keep the cloud on-premises; for example, some countries require clouds to be hosted in-house.
- Higher performance: since a private cloud is deployed inside the firewall on the organization's intranet, transfer rates can be much higher than when going over the internet to a public cloud.
- Customizable: the performance of hardware, network, and storage is customizable, since they are under the organization's control.

In a private cloud, the customer has full control of the resources, which improves security, a critical concern for any organization, and performance can be greatly enhanced.
HOW THIS PROJECT DEALS WITH THE ISSUE

The project deals with the complexity of deploying a customizable private cloud by providing an easier way to deploy it: scripts take care of whatever changes the client wants in their own version of the private cloud. The scripts modify the private cloud design or deployment model depending on the client's requirements.

POSITIVE IMPACTS ON THE SOCIETY

Locally: If our project is used in public services such as healthcare or education, it will help restrict access to sensitive, personal, or private information of the public, since the private cloud is fully customized to serve each client's needs and the client is fully aware of how their data will be handled. This project will also contribute to the economic growth of society by making it easier for small IT companies to host their services and start their business.

Globally: This project will contribute to the development of open source software.

Problem Statement

An automated, i.e. easily reproducible, private cloud setup in which virtual machines can be easily provisioned and additional hardware can be added to increase the platform capacity.

Background

DEFINITION OF CLOUD COMPUTING

According to NIST, cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. [2]

CHARACTERISTICS OF CLOUD COMPUTING

There are five essential characteristics of the cloud model [2]:

1. On-demand self-service: the consumer can acquire the needed resources (e.g. RAM, CPU, and storage) automatically, without human interaction.
2. Broad network access: the service should be reachable over the network from thin or thick client platforms (e.g. a mobile phone or a web browser).

3. Resource pooling: the provider's resources are pooled to serve multiple consumers, and the user can set some preferences about the location of the resources (e.g. datacenter or country).
4. Rapid elasticity: the ability of the system to scale upward or downward based on demand.
5. Measured service: the ability of the system to control, monitor, and report on the underlying infrastructure and its usage.

SERVICE MODELS

The cloud is a very wide concept that covers almost every kind of online service. However, when businesses refer to the cloud, they generally mean one of three service models: SaaS, PaaS, and IaaS. [2]

In SaaS (Software as a Service), users are provided with access to application software, often referred to as on-demand software. They don't have to worry about the installation, setup, and running of the application. Classical examples of this model are Google Apps, Microsoft Office 365, Dropbox, and Box.

In PaaS (Platform as a Service), users are provided with a computing platform, which usually includes an operating system, programming language execution environment, database, and web server. They can develop their applications and deploy them on the PaaS cloud service. Classical examples of this model are AWS Elastic Beanstalk, Windows Azure, Heroku, and Google App Engine.

In IaaS (Infrastructure as a Service), users are provided with the computing infrastructure: physical or, more often, virtual machines and other resources such as a virtual-machine disk image library and block and file-based storage. IaaS is the most flexible cloud computing model because it allows for automated deployment of servers, processing power, storage, and networking. Furthermore, IaaS clients have more control over their infrastructure than clients of the other cloud models. The main uses of IaaS include the development and deployment of PaaS, SaaS, and web-scale applications.
Examples of IaaS platforms and providers include Amazon EC2, Windows Azure, Rackspace, Google Compute Engine, OpenStack, and OpenNebula. The following diagram shows the functionality and services of each service model as layers.

Figure 4: Service models and their services [3]

DEPLOYMENT MODELS

There are four main types of cloud deployment models: public cloud, private cloud, hybrid cloud, and community cloud. [2] In a public cloud, the cloud provider offers the cloud computing infrastructure to its customers and charges them per usage. In a private cloud, a business deploys its own cloud infrastructure for the exclusive use of its units. A hybrid cloud is a deployment model in which a private cloud combines its resources with a public cloud.

TERMINOLOGY

In this section, we explain several terms that are used throughout this report.

Hypervisor
The hypervisor is a software layer that lies between the hardware and the operating system. Its function is to virtualize the hardware in order to create virtual machines for the user. The main purpose of virtualization is to make better use of server resources by running different virtual machines and operating systems that can host many services. There are two types of hypervisor:

1- Full virtualization, where the hypervisor runs on top of the host operating system. For example: KVM, Xen and Microsoft Hyper-V.
2- Hardware-layer virtualization, where the hypervisor runs directly on top of the hardware. For example: VMware ESX. [2]

Master Node
A physical machine that holds and executes the cloud service; also called the frontend. [4]

Host Node
A physical machine that has the hypervisor enabled and provides the resources for the virtual machines. [4]

Bare Metal
Also called a bare machine; a computer without an operating system. [5]

EXISTING RESEARCH AND PRODUCTS

In this section, we present some existing solutions for automating cloud deployment. We also mention a research work that targets cloud platforms and compares them.

Fuel
Fuel is an open source deployment and management tool developed as an OpenStack community effort to accelerate OpenStack deployment and configuration. It is a GUI-driven tool used for the deployment and management of OpenStack and related OpenStack projects and plug-ins. Fuel has several key features, such as hardware discovery, hardware configuration in the UI, pre-deployment checks, and network validation. However, it is limited to the OpenStack platform and could not be used for our purposes, as the following sections of this report show. [6]

Compass
Compass is an open source project designed to provide automated deployment of the OpenStack platform, as a service, to a set of bare metal machines. [7]

TripleO
TripleO is another tool used to install, upgrade, and operate OpenStack clouds using OpenStack's own facilities. Simply put, it uses OpenStack to deploy other OpenStack clouds ("overclouds") on bare metal. [8]

Research: Comparison and Evaluation of Open-source Cloud Management Software
This is a master's thesis written by Srivatsan Jagannathan and published in June 2012 at KTH Royal Institute of Technology, Stockholm, Sweden. The first objective of the research is to provide a framework for comparing different cloud infrastructure (IaaS) platforms and to explain many related concepts and terms. The second objective is to evaluate the performance of one cloud infrastructure platform, OpenNebula. [2]

Requirements and Specifications

FUNCTIONAL USER REQUIREMENTS

- Deployment on hardware
  o Automated deployment of the platform on hardware, e.g. network boot and automatic node configuration.
  o Support for heterogeneous hardware; the hardware does not need to be identical.
  o Automated expansion by deploying on additional hardware, e.g. adding PCs or hard drives.
  o Users can create a VM and configure its specs based on the availability of the hardware.
- Administration
  o Monitoring of resource usage per VM and for the entire platform.
  o Network configuration to control connectivity between VMs.
  o Selection of boot images for the VMs.

NON-FUNCTIONAL USER REQUIREMENTS

- Use open source tools only.
- The system scales from small to large setups: the smallest configuration can be 2 nodes, and the largest can be up to 10000 nodes.
- The system should provide a friendly user interface.
- Deploying the system should take less than 2 hours for a large-scale deployment.
- The system should work at any scale with little performance drop.
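The requirement that a VM's specs be configured "based on the availability of the hardware" can be illustrated with a small check. This is only a sketch under stated assumptions: the `Resources` type, the `fits` and `pick_host` helpers, the node names, and the numbers are hypothetical and are not part of EasyCloud or of any cloud platform's API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Resources:
    """Free resources of a host, or resources requested for a VM (hypothetical units)."""
    vcpus: int
    ram_mb: int
    disk_gb: int


def fits(request: Resources, free: Resources) -> bool:
    """Return True if every requested resource is within the host's free amount."""
    return (request.vcpus <= free.vcpus
            and request.ram_mb <= free.ram_mb
            and request.disk_gb <= free.disk_gb)


def pick_host(request: Resources, hosts: Dict[str, Resources]) -> Optional[str]:
    """Return the first host (in sorted name order) that can hold the VM, or None."""
    for name in sorted(hosts):
        if fits(request, hosts[name]):
            return name
    return None


# Heterogeneous hosts: the hardware does not need to be identical.
hosts = {
    "node-a": Resources(vcpus=2, ram_mb=2048, disk_gb=50),
    "node-b": Resources(vcpus=8, ram_mb=16384, disk_gb=500),
}
print(pick_host(Resources(vcpus=4, ram_mb=4096, disk_gb=100), hosts))
```

In the real system this decision belongs to the scheduler of the chosen cloud platform; the sketch only shows the kind of comparison involved.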

TECHNICAL SPECIFICATIONS

- Use an open source cloud platform, such as OpenStack, Eucalyptus, CloudStack, or OpenNebula.
- Deploying the system takes less than 2 hours.
- The user can specify the OS, RAM, virtual cores, disk storage, and number of network interfaces of the VMs. (The speed of a network interface is determined by the available network speed and the network setup.)
- The response time for creating a virtual machine is less than 15 minutes.
- The admin can see the following per VM and for the entire platform: CPU utilization, RAM, network traffic, and storage.
- Any computer added to the system should be able to boot directly from the network and be automatically configured into the system.
- The system can be installed and configured on any x86 architecture.

System Design

SOLUTION CONCEPT

The problem statement of this project focuses on an automated and easily reproducible private cloud setup whose capacity can be increased by adding hardware. This section describes the general approach to solving the stated problem, the alternative approaches, and the selection criteria.

General Approach

The proposed approach should solve the problems stated in the problem statement and automate the process of deploying a private cloud. The general approach is to find the most suitable private cloud platform, one that meets most of the requirements and specifications. Then we add an automatic installation tool, such as Cobbler or FAI, to install an OS image through TFTP. The advantage of the automatic installation tool is that it configures the DHCP, TFTP, and HTTP servers, which eases our work. After network booting, a deployment management tool installs all the packages required for a node of the selected private cloud platform, along with any additional settings needed. The critical point in this approach is the network booting.
Since the requirements specify that any device can be added to the pool of PCs, any node added to the pool needs to be booted from the TFTP server. Network booting is handled by the automatic installation tool, which eases the process of configuring the DHCP, TFTP, and HTTP servers. The automatic installation tool used in this project is Cobbler (Figure 5). In addition, there is the main node, which can control the other nodes and deploy any configuration to them using a deployment tool such as Ansible. Using an automated deployment tool, clients can easily configure different options and deploy them on all the machines in the same network. This approach requires that the main node be installed and configured first from USB or DVD. Installing the main node requires a small manual intervention to run the program that configures Ansible and Cobbler on the main node.

Figure 5: General Approach

Alternative Approaches

There are multiple alternative ways to implement this system other than the general approach described above:

- Modify an open source cloud so that it automates the deployment itself and holds the configurations of all nodes in the cloud.
- Use an open source deployment tool that was developed for OpenStack and modify it to meet the project requirements. However, the effort needed to understand how such tools work and the restricted environment they need make this option a poor one to follow.
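The order of steps in the general approach (network boot, OS installation by the automatic installation tool, registration with the deployment tool, then cloud configuration) can be sketched as a simple sequence of stages. The stage names below are illustrative assumptions; they are not Cobbler or Ansible concepts, and the sketch does not call either tool.

```python
from enum import Enum


class NodeState(Enum):
    """Hypothetical stages a new node passes through (names are illustrative)."""
    POWERED_ON = "powered_on"        # node starts and requests a network boot
    OS_INSTALLED = "os_installed"    # automatic installation tool installed the OS
    REGISTERED = "registered"        # node was added to the deployment tool's host list
    CONFIGURED = "configured"        # cloud packages and settings were applied


PIPELINE = list(NodeState)  # Enum members iterate in definition order


def next_state(state: NodeState) -> NodeState:
    """Return the stage that follows `state`; the last stage has no successor."""
    index = PIPELINE.index(state)
    if index == len(PIPELINE) - 1:
        raise ValueError(f"{state.name} is the final stage")
    return PIPELINE[index + 1]


# Walk one node through the whole pipeline and record the stages visited.
state = NodeState.POWERED_ON
trace = [state.name]
while state is not NodeState.CONFIGURED:
    state = next_state(state)
    trace.append(state.name)
print(" -> ".join(trace))
```

Keeping the stages explicit makes it clear that a node is only handed to the deployment tool after the operating system installation has finished.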

Comparison between Approaches

This section gives a brief comparison between the different approaches, focusing on the features needed to automate the deployment of a private cloud system.

Table 2: Comparison between different approaches

General Approach
  Advantages:
  - The client can change the configuration at any time using the configuration tool (Ansible).
  - Other cloud systems can be added without changing the system architecture.
  Disadvantages:
  - Requires a specific network setup to work.

Modifying an open source cloud
  Advantages:
  - Full control of everything in the cloud.
  - Any feature can be removed or added.
  Disadvantages:
  - Relatively hard to accomplish.
  - There are already existing solutions, mainly for OpenStack.

Table 2 shows the comparison between the different approaches. The approach followed is the general approach, which uses Cobbler and Ansible to configure the network and the cloud system. Even though modifying an open source cloud is a good option, the team's experience and the time constraints do not allow following that approach at the moment.

Sub-functions

This section describes the sub-functions of the proposed system. There are two types of sub-functions, as shown in Figure 6:

- Functions implemented by the chosen platform
  o Create a virtual machine
  o Configure the resources of a VM
  o Network between the different VMs instantiated
- Functions implemented by the EasyCloud system
  o Network boot for each node
    - Download the chosen Linux distribution
    - Add the node to Ansible
  o Ansible configuration for cloud requirements
  o DHCP, DNS, TFTP configuration
  o Configure the automatic installation tool (Cobbler/FAI)

Figure 6: Sub-Function (platform functions: create virtual machine, configure the resources of a VM, network between the instantiated VMs; developer functions: network boot for each node, design an interface to configure the nodes using the deployment tool, download the OS image, run a script to configure the node)

ARCHITECTURE DESIGN

The main architecture requires a Layer 3 switch to allow VLANs. There should also be a router that connects the cloud network to the clients and the internet. The architecture requires that the host nodes and the master node exist in a separate VLAN.

For redundancy and large-scale implementations, multiple VLANs of the same kind of nodes can be added. This architecture also supports high availability, using cloud hooks for the host nodes and an active-passive architecture for the master node: if the active master node fails, the passive node takes over. A MySQL database is used to keep the state of the system. Furthermore, the automatic installation tool should be installed first in the network so that all the nodes can boot directly from the network. The automatic installation tool configures the DHCP, TFTP, and HTTP servers. However, these servers do not need to be on the same machine to work, except for TFTP. In addition, the deployment configuration tool, such as Ansible, needs to be configured to access the main node first. Then, whenever a host node is added to the system, Ansible can add its IP to the configuration files. Figure 7 shows the main hardware and software components of the EasyCloud architecture.

Figure 7: Distributed Storage Architecture Design
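The step in which a newly added host node's IP is written to the deployment tool's configuration files can be sketched as follows. The sketch assumes an INI-style Ansible inventory in which a `[group]` header is followed by one host per line; the `add_host` helper, the group names, and the addresses are hypothetical, and the report does not describe how EasyCloud actually edits the file.

```python
def add_host(inventory_text: str, group: str, ip: str) -> str:
    """Return the inventory text with `ip` listed under `[group]`, without duplicates."""
    lines = inventory_text.splitlines()
    header = f"[{group}]"
    if header not in lines:
        # The group does not exist yet: append it together with the new host.
        if lines and lines[-1] != "":
            lines.append("")
        lines += [header, ip]
        return "\n".join(lines) + "\n"
    start = lines.index(header) + 1
    end = start
    # The group ends at the next section header or at the end of the file.
    while end < len(lines) and not lines[end].startswith("["):
        end += 1
    if ip in (line.strip() for line in lines[start:end]):
        return "\n".join(lines) + "\n"  # already registered, nothing to change
    # Insert after the last non-blank line of the group.
    insert_at = end
    while insert_at > start and lines[insert_at - 1].strip() == "":
        insert_at -= 1
    lines.insert(insert_at, ip)
    return "\n".join(lines) + "\n"


inventory = "[master]\n10.0.0.1\n\n[hosts]\n10.0.0.11\n"
updated = add_host(inventory, "hosts", "10.0.0.12")
print(updated)
```

Because the function returns new text instead of editing a file in place, calling it twice with the same address leaves the inventory unchanged, which is convenient when a node reboots and registers again.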

Alternative Architectures

1- Shared Storage
Figure 8 shows an alternative architecture of the system. In this architecture, the data storage is a separate entity. The database storage is also in a VLAN separate from the nodes, which helps reduce contention in the nodes' VLAN that might otherwise degrade network performance. In addition, there is no backup for the master node.

Figure 8: Shared Storage Architecture

2- Distributed Storage with Backup Nodes
In this architecture, the master node supports high availability using an active-passive architecture. The difference between the main architecture and this one is the backup nodes, which ensure that there is a backup in case of a host node failure. However, this architecture is not optimal because the backup nodes are not fully utilized: their computing resources are wasted compared to the storage nodes.

Figure 9: Distributed Storage with backup
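The active-passive behaviour of the master node can be sketched as a heartbeat check: if the active master has been silent for longer than a timeout, the passive node takes over. The `MasterPair` class, the node names, and the timeout value are assumptions made for illustration; the report does not specify how the failure of the active master is detected.

```python
from dataclasses import dataclass


@dataclass
class MasterPair:
    """An active and a passive master node (hypothetical model)."""
    active: str
    passive: str
    last_heartbeat: float  # time the active master last reported, in seconds

    def check(self, now: float, timeout: float) -> str:
        """Swap roles if the active master has been silent longer than `timeout`."""
        if now - self.last_heartbeat > timeout:
            self.active, self.passive = self.passive, self.active
            self.last_heartbeat = now  # the new active master starts a fresh window
        return self.active


pair = MasterPair(active="master-1", passive="master-2", last_heartbeat=100.0)
print(pair.check(now=105.0, timeout=10.0))  # heartbeat is recent, no change
print(pair.check(now=120.0, timeout=10.0))  # heartbeat is stale, roles swap
```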

Comparison between Architectures

Table 3: Comparison between Shared and Distributed Storage Architectures

Shared Storage
  Strengths:
  - Reduces VM deployment times.
  - Enables live migration.
  Weaknesses:
  - It can become a bottleneck in the infrastructure, degrading VM performance.

Distributed Storage
  Strengths:
  - Data is backed up on many machines.
  - Devices can be added or removed without losing the storage.
  Weaknesses:
  - Images always have to be copied to the hosts, which can be a very resource-demanding operation.
  - Prevents the use of live migration between hosts.
  - High VM deployment times, depending on the infrastructure network connectivity.

Even though shared storage can be a bottleneck in the network infrastructure, many cloud platforms suggest using shared storage for large-scale deployments, since separate storage ensures that guest network traffic contention does not impact storage performance. However, since this project aims to deploy a cloud on bare metal computers, most of which will not be dedicated storage nodes, the distributed storage architecture is followed. A shared storage system is still supported by writing an Ansible playbook that configures the nodes to work with shared storage. The platform also plays a critical role in the architecture of the system; OpenNebula supports all of the above architectures.
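The weakness that distributed storage has "high VM deployment times depending on the infrastructure network connectivity" can be made concrete with a rough estimate of the time needed to copy an image to a host. The image size, link speeds, and `efficiency` factor below are assumed values for illustration only, not measurements from this project; 1 GB is taken as 8 gigabits.

```python
def copy_time_seconds(image_gb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate the time to copy an image of `image_gb` gigabytes over a link.

    `efficiency` is the assumed fraction of the nominal link speed that is
    actually achieved (1.0 means the full nominal speed).
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    return image_gb * 8 / (link_gbps * efficiency)


# A hypothetical 10 GB image over 1 Gbit/s and 10 Gbit/s links.
print(copy_time_seconds(10, 1))
print(copy_time_seconds(10, 10))
```

With these assumed numbers the copy takes about 80 seconds on the slower link and about 8 seconds on the faster one, before any disk or protocol overhead, which shows why the network matters when every deployment copies an image.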

Hardware/Software Components

The architecture design shown in Figure 9 includes a mixture of hardware and software components.

Hardware Components
- Host Nodes
- Main Node
- Network infrastructure
- Switch
- Router

Software Components
- Automatic Installation Tool (Cobbler)
- Management Deployment Tool (Ansible)
- Cloud Platform
- DHCP Server
- TFTP Server
- Main Python Program

Figure 10: Hardware & Software Components

Function of Hardware/Software Components

Hardware Components:
- Host Nodes: the nodes on which the VMs run.
- Main Node: the node that holds the core of the cloud and controls the host nodes.
- Network Infrastructure: VLANs are used to support large-scale redundancy.
- Storage Nodes: the nodes where all the storage is placed.
- Switch: a Layer 3 switch that supports VLAN trunks.
- Router: responsible for connecting the cloud to the internet and the client network.

Software Components:
- Automatic Installation Tool (Cobbler): responsible for installing the OS on bare-metal hardware. A preseed file can also be configured to install and configure the initial setup of the system.
- Configuration File: a file that enables the client to modify different characteristics of the cloud and deploy them on the system at any time.

- Management Deployment Tool: a tool that deploys the configuration file to a specified set of network devices. It helps change the configuration of a set of devices at once.
- Cloud Platform: the core of the system. It is chosen based on the features that support the requirements and specifications of this project.
- DHCP Server: assigns IP addresses to client computers.
- TFTP Server: TFTP is a simple protocol for transferring files over the User Datagram Protocol (UDP); servers use it to boot diskless workstations.
- Python Program: the final product of the system. It holds all the data, and it can run Ansible and modify other nodes.

COMPONENT DESIGN

This section describes the design of some components and justifies the choices made among ready-made components.

Cloud Platform

The cloud platform is the core of this system. It must be chosen to meet the requirements and specifications of this project, because most of the work is done by the cloud platform. In addition, the architecture of the cloud itself plays a major role in the choice, so it is one of the important selection criteria. The main selection criteria for the cloud platform are:

- Simple deployment architecture
- Features that meet the requirements and specifications
- Additional features that help in expanding the project
- Simple installation

OpenStack
OpenStack is one of the most popular private cloud platforms on the market. However, its architecture is complicated because of its flexibility to work in many environments. The complex architecture and installation would not help us automate the deployment of the cloud.

Figure 11: OpenStack Architecture

OpenNebula

OpenNebula is a cloud computing platform for managing heterogeneous distributed data center infrastructures. The OpenNebula platform manages a data center's virtual infrastructure to build private, public, and hybrid implementations of infrastructure as a service. It provides all the features needed to complete this project, plus additional features that can help the project expand. It is also known for its simple architecture and installation, which make it easy to deploy.

Figure 12: OpenNebula Architecture

CloudStack

CloudStack is open source cloud computing software for creating, managing, and deploying infrastructure cloud services. It uses existing hypervisors such as KVM, VMware vSphere, and XenServer/XCP for virtualization. CloudStack is a strong candidate to be the core infrastructure platform for this project.

Figure 12: CloudStack Architecture

Eucalyptus

Eucalyptus is free and open-source software for building Amazon Web Services (AWS)-compatible private and hybrid cloud computing environments, marketed by the company Eucalyptus Systems. Eucalyptus can provide high availability by building primary and secondary cloud components. In the event of a failure, the secondary component becomes the primary component.

Figure 13: Eucalyptus Architecture

Cloud Platform Comparison

Table 4: Cloud Platform Comparison

| Criteria | OpenStack | CloudStack | OpenNebula | Eucalyptus |
|---|---|---|---|---|
| Simple architecture | No | Yes | Yes | No |
| Simple installation | No | No | Yes | No |
| Features that meet the requirements | Yes | Yes | Yes | Yes |
| Additional features that help in expanding the project | Yes | Yes | Yes | No |
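The comparison in Table 4 can also be tallied mechanically. The following is a minimal sketch, not part of the EasyCloud program: it encodes the Yes/No answers from the table and ranks the platforms by the number of criteria met. Weighting all four criteria equally is an assumption of this sketch, and the names `TABLE_4` and `rank_platforms` are hypothetical.

```python
# Table 4 encoded as data: True means "Yes", False means "No".
# Order of answers follows the order of CRITERIA.
CRITERIA = (
    "Simple architecture",
    "Simple installation",
    "Features that meet the requirements",
    "Additional features that help in expanding the project",
)

TABLE_4 = {
    "OpenStack": (False, False, True, True),
    "CloudStack": (True, False, True, True),
    "OpenNebula": (True, True, True, True),
    "Eucalyptus": (False, False, True, False),
}


def rank_platforms(table):
    """Return (platform, criteria_met) pairs, best first.

    Assumption: every criterion carries the same weight.
    """
    scores = {name: sum(answers) for name, answers in table.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_platforms(TABLE_4):
        print(f"{name}: {score} of {len(CRITERIA)} criteria met")
```

Under equal weighting, OpenNebula meets all four criteria and CloudStack meets three, which matches the two platforms carried forward in this design.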

Table 4 shows that CloudStack and OpenNebula are both strong candidates for this system. The system will therefore support both, depending on the choice of the client.

OPENNEBULA SYSTEM

This section lists the subsystems and components of our OpenNebula cloud platform.

OpenNebula Sunstone: the OpenNebula Cloud Operations Center, a graphical user interface (GUI) intended for regular users and administrators that simplifies the typical management operations in private and hybrid cloud infrastructures. Sunstone makes it easy to manage all OpenNebula resources and to perform typical operations on them.

Figure 14: OpenNebula Sunstone

Users & Groups: OpenNebula includes a complete user and group management system. Users in an OpenNebula installation are classified into four types:
- Administrators: an admin user belongs to an admin group (oneadmin or otherwise) and can perform management operations.
- Regular users: may access most OpenNebula functionality.
- Public users: only basic functionality (and public interfaces) is open to public users.
- Service users: a service user account is used by the OpenNebula services (i.e. cloud APIs like EC2 or GUIs like Sunstone) to proxy authentication requests.

Host: a server that can run virtual machines and that is connected to OpenNebula's frontend server.

Figure 15: Host lists

Datastore: any storage medium used to store disk images for VMs. The types of datastore available in OpenNebula are:
- System: holds the images of running VMs.
- Images: stores the disk image repository. Disk images are moved or cloned to and from the System datastore when VMs are deployed or shut down, or when disks are attached or snapshotted.
- Files: a special datastore used to store plain files rather than disk images. The plain files can be used as kernels, ramdisks, or context files.

Figure 16: Datastores Lists

The Virtualization Subsystem: the component in charge of talking to the hypervisor installed on the hosts and taking the actions needed for each step in the VM lifecycle.

KVM (Kernel-based Virtual Machine): a complete virtualization technique for Linux. It offers full virtualization, where each virtual machine interacts with its own virtualized hardware.

A virtual machine within the OpenNebula system consists of:
- A capacity in terms of memory and CPU
- A set of NICs attached to one or more virtual networks
- A set of disk images
- A state file (optional) or recovery file that contains the memory image of a running VM plus some hypervisor-specific information
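The parts of a virtual machine listed above (capacity, NICs, disk images) map naturally onto a template definition. The following is a minimal sketch, not the project's own code: a hypothetical `VMDefinition` class that renders a text definition using attribute names in the style of OpenNebula templates (`NAME`, `CPU`, `MEMORY`, `DISK`, `NIC`). The image and network names used with it are placeholders and would have to exist in the cloud already.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VMDefinition:
    """Hypothetical container for the parts that make up a VM."""

    name: str
    cpu: float                      # capacity: share of physical CPU
    memory_mb: int                  # capacity: memory in megabytes
    images: List[str] = field(default_factory=list)    # disk images
    networks: List[str] = field(default_factory=list)  # virtual networks for NICs

    def to_template(self) -> str:
        """Render the definition as template text, one attribute per line."""
        lines = [
            f'NAME = "{self.name}"',
            f"CPU = {self.cpu}",
            f"MEMORY = {self.memory_mb}",
        ]
        # One DISK entry per disk image and one NIC entry per network.
        lines += [f'DISK = [ IMAGE = "{image}" ]' for image in self.images]
        lines += [f'NIC = [ NETWORK = "{network}" ]' for network in self.networks]
        return "\n".join(lines)
```

Keeping the definition as data and rendering it on demand makes it easy to produce several similar templates from one script. The optional state file is omitted here because it is produced by the hypervisor rather than declared by the user.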

Figure 17: Virtual Machine Information

Figure 18: Virtual Machine Lists

Templates: In OpenNebula, virtual machines are defined with template files. The Template Repository system allows OpenNebula administrators and users to register virtual machine definitions in the system, to be instantiated later as virtual machine instances. These templates can be instantiated several times and shared with other users.

Images: The storage system allows OpenNebula administrators and users to set up images, which can be operating systems or data, for easy use in virtual machines. These images can be used by several virtual machines simultaneously and shared with other users.

Deployment Management Tool

Deployment management tools let you use recipes, playbooks, templates, or whatever the tool calls them, to simplify automation and orchestration across your environment and provide a standard, consistent deployment. The choice of the deployment management tool in this system depends mainly on its ease of use, its language support, and whether it is open source. Many deployment management tools can automate running a script across a network, for instance Ansible, Puppet, Chef, Fabric, and SaltStack. Although the client may choose any deployment management tool, this system uses Ansible to deploy scripts and the configuration file.

Automatic Installation Tool

Several automatic installation tools ease the automation of network booting of bare-metal devices. These tools provide configurations for DHCP, TFTP, and HTTP, so a dedicated server for each service is not required. In addition, they help customize the Linux distribution with a preseed file that selects the packages to be installed during setup. The tools that were tested and considered in this project are FAI, Cobbler, and Foreman.

| Tool | Advantages | Disadvantages |
|---|---|---|
| Foreman | Easy to use and configure thanks to its user-friendly GUI; easy to install (one line installs all required packages and software); supports different OSs from different families; supports plugins and integrates very well with Puppet; comes with a variety of provisioning templates and partition tables; monitors host configuration and reports status, distribution, and trends | Needs a database (may cause a single point of failure); difficult to automate since it is mostly configured through the GUI; does not integrate well with Ansible; installation media must be retrieved from the web |
| Cobbler | Can create one template and use it on many OS images; supported by Ansible; easy to install; the OS image is mounted on the Cobbler server, which makes installation on the target machine fast | Does not work well with Debian and mainly supports the Fedora distribution; has a problem with the default package tool (sources.list) |
| FAI | No OS image is needed; OS installation is much faster than with any other automation tool, since the OS only needs to be created once for identical devices, and with a cache proxy new machines install very quickly because all needed packages are cached in the proxy; OS package management is very flexible | Developed mainly for Debian, so it is somewhat hard to adapt to another distribution such as Ubuntu; documentation is limited for distributions other than Debian; configuration is not trivial because the tool is very flexible, so modifying the original version is complicated to some extent; automating the configuration needs a lot of work if a specific distribution is desired |

SYSTEM INTEGRATION

For system integration, the first step is to install the master node and then configure DHCP and TFTP. After that, a host node (a node where the VMs will run) is added using tools such as Ansible and Cobbler, following the steps described in Figure 4. Once the cloud is installed on the host node, the node waits for requests to create and run VMs. The system administrator can also access the host nodes to see the active resources. Any later modification of the nodes can be managed through Ansible. Figure 19 shows the sequence of adding a new node and deploying the private cloud system on it.

Steps shown in Figure 19:
1. The master node is installed automatically from USB or DVD. The master has Cobbler and Ansible mostly configured.
2. DHCP and TFTP are configured, either on the master or on a dedicated server.
3. The master node is connected to the network so that it can boot other nodes from the network.
4. The new node asks DHCP for an IP address.
5. DHCP replies with an IP address and the address of the TFTP server.
6. The node asks the TFTP server for an OS installation image over UDP.
7. The TFTP server replies with the OS image.
8. The OS image is installed on the node.
9. Ansible adds the new node to the cloud.
10. Ansible manages the configuration of the new node.
11. The private cloud is ready and waits for requests to create and run VMs.

Figure 19: System Integration
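The sequence for adding a node can be read as a linear progression through a handful of stages. The following is a minimal sketch that models this progression as a small state machine. The stage names are hypothetical labels chosen for illustration, and the code only simulates the order of events; it does not talk to DHCP, TFTP, Cobbler, or Ansible.

```python
from enum import Enum
from typing import List


class Stage(Enum):
    """Stages a new node passes through, in order (labels are illustrative)."""

    BARE_METAL = "powered on with network boot enabled, no OS installed"
    HAS_IP = "received an IP address and the TFTP server address from DHCP"
    INSTALLER_LOADED = "fetched the OS installation image from the TFTP server"
    OS_INSTALLED = "operating system installed on the node"
    IN_CLOUD = "added to the cloud and configured by Ansible"
    READY = "waiting for requests to create and run VMs"


# Enum members iterate in definition order, which is the provisioning order.
ORDER = list(Stage)


def next_stage(stage: Stage) -> Stage:
    """Return the stage that follows `stage`; READY has no successor."""
    index = ORDER.index(stage)
    if index == len(ORDER) - 1:
        raise ValueError("READY is the final stage")
    return ORDER[index + 1]


def provision(start: Stage = Stage.BARE_METAL) -> List[Stage]:
    """Walk a node from `start` to READY and return every stage visited."""
    path = [start]
    while path[-1] is not Stage.READY:
        path.append(next_stage(path[-1]))
    return path


if __name__ == "__main__":
    for step, stage in enumerate(provision(), start=1):
        print(f"{step}. {stage.name}: {stage.value}")
```

Modelling the stages explicitly makes it clear where a failed deployment stopped. A PXE timeout, for example, would leave a node at `HAS_IP` without reaching `INSTALLER_LOADED`.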

Design Evolution

This section shows the evolution of our system design and explains why the design changed from the initial design to the final one.

INITIAL DESIGN

The initial design of our system, called shared storage, aimed to separate the storage nodes from the computing nodes and to use two virtual LANs. The purpose of the separation was to reduce VM deployment times and to utilize the storage resources of each storage machine by not installing any operating system on it, so that these nodes act as storage devices only. Figure 20 shows our initial design.

Figure 20: Initial Design

Since the purpose of this project is to install the cloud on bare-metal machines, we would lose the computing resources of any machine chosen to be a storage node. Conversely, we would lose the storage resources of any node chosen to be a computing node. As a consequence, the initial design was not efficient for our project.

FINAL DESIGN

The final design is called distributed storage architecture with high availability. As the name implies, all host nodes have both storage and computing capabilities. In addition, the master nodes and host nodes are highly available. This allows us to better utilize the physical machines and to recover from failure. The following diagram shows our final design.

Figure 21: Final Design
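The utilization argument behind the change of design can be made concrete with a small calculation. The following is a minimal sketch with hypothetical node sizes: in the separated (initial) design a node contributes either its cores or its disk, while in the combined (final) design every node contributes both. The sketch ignores replication overhead and the disk space taken by the operating system, which is a simplification.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Node:
    """A physical machine with hypothetical capacities."""

    cores: int
    disk_gb: int


def separated_capacity(
    compute_nodes: Sequence[Node], storage_nodes: Sequence[Node]
) -> Tuple[int, int]:
    """Initial design: compute nodes give only cores, storage nodes only disk."""
    cores = sum(node.cores for node in compute_nodes)
    disk_gb = sum(node.disk_gb for node in storage_nodes)
    return cores, disk_gb


def combined_capacity(nodes: Sequence[Node]) -> Tuple[int, int]:
    """Final design: every node gives both its cores and its disk."""
    cores = sum(node.cores for node in nodes)
    disk_gb = sum(node.disk_gb for node in nodes)
    return cores, disk_gb


if __name__ == "__main__":
    machines = [Node(cores=4, disk_gb=500)] * 4  # four identical machines
    print("separated:", separated_capacity(machines[:2], machines[2:]))
    print("combined: ", combined_capacity(machines))
```

With four identical machines of 4 cores and 500 GB each, a two-and-two split yields 8 cores and 1000 GB, whereas the combined design yields 16 cores and 2000 GB from the same hardware.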

Testing, Analysis, and Evaluation

TESTING METHODOLOGY AND RESULTS:

The tables below explain how we determined that our system meets each requirement or technical specification.

FUNCTIONAL USER REQUIREMENTS

| Requirement | Met | Testing methodology |
|---|---|---|
| Automated deployment of platform on hardware, e.g. network boot, automatic node configuration | Yes | We added a new node without any OS. By configuring it to network boot, we were able to automatically install an OS and configure the node to join our cloud system. |
| Support heterogeneous hardware of x86 architecture; hardware does not need to be identical | Yes | We brought laptops with different hardware specifications and were able to add their resources to our cloud system. |
| Automated expansion by deploying on additional hardware, e.g. adding PCs or hard drives | Yes | We installed the node package of OpenNebula on another machine, then checked that the resources of this new machine were added to the resource pool in the frontend node. |
| Enable users to create a VM and to configure its specs based on the availability of the hardware | Yes | We created user accounts with some privileges, then logged in to the frontend using these accounts and checked that we were able to create VMs and configure them. |
| Monitoring of resource usage per VM and for the entire platform | Yes | Using admin accounts logged in to the frontend node, we were able to see the total resources of the entire cloud as well as the resource usage per VM. |
| Network configuration to control connectivity between VMs | Yes | We used SSH to check the connectivity between the VMs. |
| Selection of boot images for the VMs | Yes | Using admin accounts, we created predefined templates for VMs that specify the storage, CPU, RAM, and operating system. |
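The connectivity test between VMs was carried out with SSH. A scripted variant might first check that the SSH port of a VM is reachable at all. The following is a minimal sketch, assuming the VM listens for SSH on the default port 22. It uses only Python's standard `socket` module and tests TCP reachability only, not authentication. The function name `port_reachable` is hypothetical.

```python
import socket


def port_reachable(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Assumption: an open TCP port 22 is taken as a sign that SSH is
    reachable; no login is attempted.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

Running such a check from one VM against the address of another gives a quick yes/no answer before a full SSH session is attempted.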

TECHNICAL SPECIFICATIONS

| Requirement | Met | Testing methodology |
|---|---|---|
| Deploying the system takes less than 2 hours | Yes | Based on the current network in the COE lab (1 GigE), we were able to deploy the cloud system in less than 1 hour. |
| The user can specify the OS, RAM, virtual cores, disk storage, and number of network interfaces of the VMs | Yes | We created user accounts with some privileges, then logged in to the frontend using these accounts and checked that we were able to create VMs and configure them. |
| The response time for creating a virtual machine is less than 20 minutes | Yes | A new VM is created in roughly 5 minutes. |
| The admin can see CPU utilization, RAM, network traffic, and storage per VM and for the entire platform | Yes | Using admin accounts logged in to the frontend node, we were able to see the total resources of the entire cloud as well as the resource usage per VM. |
| Any computer added to the system should be able to boot directly from the network and be automatically configured into the system | Yes | We added a new node without any OS. By configuring it to network boot, we were able to automatically install an OS and configure the node to join our cloud system. |
| The system can be installed and configured on x86 architecture | Yes | We successfully added a new x86 node to our cloud system. |

SYSTEM ANALYSIS AND EVALUATION:

In our project we mainly focused on providing high availability for both VMs and hosts. Our system is prepared for failures in the virtual machines or physical nodes and can recover from them.

Host Failure: when OpenNebula detects that a host enters the ERROR or DISABLE state, a hook can be triggered to deal with the situation. This can be very useful for limiting the downtime of a service due to a hardware failure, since the hook can redeploy the VMs on another host.

Virtual Machine Failure: our system includes hooks that cover the following VM states:
1. UNKNOWN, when the VM enters the unknown state.
2. STOP, after the VM is stopped (including VM image transfers).

Issues

This section describes some of the issues faced in the implementation phase of this project. It also shows how these issues were identified and solved. The issues are as follows:

Virtual machine is not accessible from the network.
The issue is that if you create a virtual machine and want to run a service on it or SSH to it from another machine in the network, it is not accessible.
Attempted solutions: The key concept here is contextualization, which lets the VM's operating system inherit the network information assigned by the cloud's network management. There were several ways to achieve contextualization:
- Use a ready-made image that already includes the contextualization service. We aimed to solve the problem for all images, so this was not an option.
- Modify the image and add the contextualization code to it. However, this method is complicated.
Final solution: Instruct the OpenNebula cloud to add the contextualization to the image before the virtual machine is created, and provide the needed files. Also add a public key to the image so that you can SSH to the virtual machine using the private key.

PXE timeout when trying to deploy an OS through network boot using a TFTP server.
Attempted solutions:
- Changing the permissions of the tftpboot file
- Trying a different machine to boot from the TFTP server
Final solution: Adding the line -A INPUT -i <interface name> -p udp --dport 69 to the iptables file to allow data through port 69 for TFTP.

Foreman downloads a corrupted initrd file, which is essential to start booting Ubuntu.
Attempted resolution:
- Changing the operating system version
Final resolution: Manually downloading the initrd file to replace the corrupted one.

Virtual machine stops working when the host is shut down or encounters an error.
Attempted resolution:
- Trying to find a way to copy the image and instantiate it again on other hosts
Final resolution: Adding hooks to the OpenNebula configuration file to do this automatically.

Engineering Tools and Standards

The EasyCloud project team used many tools and standards to help automate OpenNebula cloud deployment. By integrating deployment tools, deployment is made as simple as possible for customers who want to deploy their own private cloud. We tested and ran many alternative tools to find the ones best suited to our project goals. As mentioned in the project requirements section, our aim is to convert a bare-metal system with nothing installed on it into a fully functional host node for the OpenNebula cloud with the least user interaction. This whole transition from bare metal to a fully supported cloud node should be automated through tools such as Ansible, Cobbler, or even a script file.

INSTALLING THE MASTER HOST

The master host (front-end) can be built and installed by running a script written in Python. We wrote this script and verified its functionality by installing a full, ready front-end node for the OpenNebula cloud; all installation and configuration is done by the script. The tools used for adding new nodes also need to be installed along with the front-end, and the script installs them automatically. Ansible, for example, is installed automatically and configured according to our requirements.

COBBLER

Cobbler is a build and deployment system. Its primary function is to simplify the lives of administrators by automating repetitive actions, such as installing the operating system on new machines. Cobbler can be considered an automatic installation tool that prepares any new machine in the system for future deployment. Cobbler currently supports importing a wide array of distributions from many vendors. However, since Cobbler has a history rooted in Red Hat based distributions, support for them is definitely the strongest. For others, such as Ubuntu and Debian, the level of support varies from very good to requiring many manual steps to get things working smoothly. In our project we use Cobbler to install Ubuntu Server on the new bare-metal machines through network booting (PXE). The network booting procedure is shown in the figure below. When a new machine boots through its NIC, it communicates with the DHCP server to obtain its own IP address and the address of the TFTP server. The installer image is located on the TFTP server. Therefore, the new machines need to connect to the TFTP server to run the automatic installer and install the Ubuntu Server image.

ANSIBLE

Ansible is an IT automation engine that automates configuration management, application deployment, and many other IT needs. Ansible gives much more freedom and flexibility in deploying and managing system components on different machines. Ansible works by connecting to your nodes and pushing out small programs, called modules, to them. These programs are written to be resource models of the desired state of the system. Ansible executes these modules (over SSH by default) and removes them when finished. As shown in the figure below, Ansible consists of an inventory and playbooks. The inventory contains the hosts and groups that receive the scripts and run them; hosts can be divided into groups, and that is how they are managed. The team chose Ansible because it supports turning new systems into fully functional nodes for the OpenNebula cloud and adding them to the cloud automatically, with no user interaction needed. Also, unlike some other tools, it does not require an SQL server to be installed on the other nodes, which is a big advantage in the EasyCloud system.
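Since the main Python program is described as running Ansible, the inventory and playbook invocation can be sketched from Python. The following is a minimal sketch, not the project's actual program: it renders host groups in Ansible's INI inventory format and builds an `ansible-playbook` command line using the `-i` and `--limit` options. The group names, IP addresses, and file names used with it are hypothetical, and the command is only constructed, not executed.

```python
from typing import Dict, List, Optional


def render_inventory(groups: Dict[str, List[str]]) -> str:
    """Render host groups as an INI-style Ansible inventory."""
    sections = []
    for group, hosts in groups.items():
        # Each group is a "[name]" header followed by one host per line.
        sections.append("\n".join([f"[{group}]"] + hosts))
    return "\n\n".join(sections) + "\n"


def playbook_command(
    inventory_path: str, playbook_path: str, limit: Optional[str] = None
) -> List[str]:
    """Build the argument list for running a playbook against an inventory.

    `limit` restricts the run to one host or group, e.g. a newly added node.
    """
    command = ["ansible-playbook", "-i", inventory_path, playbook_path]
    if limit:
        command += ["--limit", limit]
    return command


if __name__ == "__main__":
    inventory = render_inventory(
        {
            "frontend": ["192.168.1.10"],
            "nodes": ["192.168.1.21", "192.168.1.22"],
        }
    )
    print(inventory)
    print(" ".join(playbook_command("inventory.ini", "add_node.yml", "192.168.1.22")))
```

The resulting argument list could be handed to `subprocess.run` once the inventory text has been written to a file. Building the command as a list rather than a single string avoids shell quoting problems.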

Conclusion

WHAT WAS LEARNED?

We learned many concepts in depth during this project, such as the private cloud and its platforms, service models, and deployment models. We also learned different virtualization setups, such as KVM and Xen, and how to work with Linux in depth, for example changing configuration and using the command line instead of a GUI. Furthermore, we explored automatic installation tools in depth and saw their great potential. We spent a long time configuring network booting, but became proficient at it in the end. We also learned to use deployment tools such as Ansible to modify and configure other computers on the same network. Many networking topics, such as DNS, TFTP, and DHCP, were reinforced and applied intensively in this project. In addition, working in a team of five is a different experience from other group projects, which have at most three members. However, if teamwork is not used efficiently to serve the goal of the project, it becomes a waste of resources and time.

WHAT WOULD YOU DO DIFFERENTLY IN A SIMILAR PROJECT?

Start by building the main idea with whatever you find first, instead of spending time trying every small option. Your choices may not be ideal, but you will have a system up and running early and will know all the problems and steps along the way. Then you can look for other options and try them. Also, try working from home if possible. For example, we could have set up a virtual machine with several operating systems and worked from home at the beginning, which would have given us more time and motivation to explore options and tools.

FUTURE WORK

This project has good potential to be developed further and to grow into an industry product. For instance, the following future work and improvements could be done on the system:
- Implement the high availability architecture for the master node.
- Design our own server-based system with its own web UI. This web UI could offer more features, such as full control over all system tools, and be simpler and more attractive.
- Support more than one cloud.
- Add network discovery, so that the user does not need a specific network setup to install our system.