Verteego VDS Documentation
Release 1.0
Verteego
May 31, 2017
Contents

Installation
1 Getting started
2 Ansible
    Install Ansible
    Clone installation package
    Install Verteego DS
    Sign in
    Custom settings
3 Docker
    Install Docker
    Clone installation package
    Install Verteego DS
    Sign in
4 Dataflow
5 Data cleaning
6 Analytics & dashboarding
7 Notebooks
8 Prediction
9 Examples
10 Contact us
11 Community
Verteego Data Science Suite was built to deliver a plug-and-play environment for data scientists, enabling both fast prototyping and scalable production deployments. Verteego's purpose is to provide a bundle of tools covering all common requirements of data scientists. Verteego is not just another big data platform but a best-of-breed mash-up of the most powerful big data and prediction solutions currently available.

Installation
CHAPTER 1 Getting started

Verteego Data Science Suite aims to gather the best data science tools into a single consistent and powerful stack. Installing Verteego is fast and straightforward: no hassle, no complicated configuration. Follow the installation instructions here.
CHAPTER 2 Ansible

Please note that the following installation instructions use two placeholders for local directories: VDS_ROOT and SSH_ROOT. Replace them with the correct directories on your system.

- VDS_ROOT: local directory into which the installation package will be cloned
- SSH_ROOT: local .ssh directory

1. Install Ansible

We'll use Ansible to deploy Verteego DS to your remote server or Virtualbox and to orchestrate the automatic installation process. If you don't have Ansible yet, please install it first.

- Linux
- Mac OS (Not tested)
- Windows (Not tested)

2. Clone installation package

Clone the following repository to your local machine (NOT to the remote server on which you want to run Verteego DS). We'll refer to the cloned directory as VDS_ROOT below.

    git clone
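One convenient way to handle the two placeholders is to keep them in shell variables and let the shell substitute them into later commands. The sketch below assumes a POSIX shell; the default directory names and the `vds_path` helper are hypothetical illustrations, not part of the installation package.

```shell
# Hypothetical default locations; replace them with the directories on your system.
VDS_ROOT="${VDS_ROOT:-$HOME/verteego-ds}"   # where the installation package is cloned
SSH_ROOT="${SSH_ROOT:-$HOME/.ssh}"          # local .ssh directory

# Print the full path of a file inside the installation package.
vds_path() {
    printf '%s/%s\n' "$VDS_ROOT" "$1"
}

# Example: the Ansible inventory file used by the playbooks.
vds_path deployment/ansible/hosts
```

With the variables set, the commands in the following sections can be typed with `$VDS_ROOT` and `$SSH_ROOT` in place of the literal placeholders.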
3. Install Verteego DS

3.1 Installation on Amazon Web Services

1. Configure account and fetch the needed authentication file

- Create a key pair and call it vds.
- Change the permissions of the downloaded file (vds.pem) to 400:

      chmod 0400 Downloads/vds.pem

- Copy it into the installation package:

      cp Downloads/vds.pem VDS_ROOT/deployment/ansible/files/aws/vds.pem

- Configure account rights.
- Create access/secret keys.
- Copy the access and secret keys into the key file VDS_ROOT/deployment/ansible/files/aws/keys.json

2. Launch installation

    ansible-playbook -i VDS_ROOT/deployment/ansible/hosts --private-key=VDS_ROOT/deployment/ansible/files/aws/vds.pem -u admin VDS_ROOT/deployment/ansible/setup_on_aws.yml

3.2 Installation on Google Cloud Platform

1. Install Google Cloud SDK

Before you start, make sure that you have a running Google Cloud Platform account and the GCloud SDK installed.

- Configure your account and project:

      gcloud init

- Generate an SSH key for GCloud:

      gcloud compute config-ssh

2. Set up the VDS environment on Google Cloud

Create a Google service account:

- Select the project in which you want to create the VDS instance.
- Create a service account with the project editor role.
- Check the "Furnish a new private key" option.
- Choose the JSON key type.
- When you click the Create button, a key file will be downloaded.
- Copy the downloaded key file to VDS_ROOT/deployment/ansible/files/gcp and rename it to ansible.json
    cp Downloads/ORIGINAL_KEYFILE.json VDS_ROOT/deployment/ansible/files/gcp/ansible.json

3. Install libcloud

    sudo apt-get install python-pip
    sudo pip install apache-libcloud==1.5.0
    # in case you encounter an ssl certificate validation issue ( readthedocs.io/en/latest/other/ssl-certificate-validation.html#ssl-certificate-validation-in-v2-0)
    sudo pip install --upgrade certifi

4. Launch installation

This will launch the default installation of Verteego Data Suite. For custom settings such as instance sizing, see the Custom settings section.

    ansible-playbook -i VDS_ROOT/deployment/ansible/hosts --private-key=SSH_ROOT/google_compute_engine VDS_ROOT/deployment/ansible/setup_gc_instance.yml

Be patient: deploying all files can take a while, depending on the capacity of the instance you've chosen.

5. Start playing

When the installation process has finished, open a browser and navigate to the external IP of the newly created instance on port :
You can find the external IP address on your Google Cloud Compute Engine console page ( cloud.google.com/compute/instances).

3.3 Installation on a local virtual server (Virtualbox) from sources

1. Install Virtualbox and Vagrant

- Install Virtualbox
- Install Vagrant

2. Launch Vagrant

Go to the Vagrant directory (VDS_ROOT/vagrant) and launch Vagrant (this may take a while, as it downloads a full Debian image to be installed on Virtualbox):

    cd VDS_ROOT/vagrant
    vagrant up

3. Launch installation

    ansible-playbook -i VDS_ROOT/deployment/ansible/hosts --private-key=VDS_ROOT/vagrant/.vagrant/machines/vds/virtualbox/private_key VDS_ROOT/deployment/ansible/setup_on_vbox.yml

4. Start playing

Navigate to
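Each launch command in sections 3.1 to 3.3 passes a private key with --private-key, and OpenSSH refuses key files that are readable by other users. A small pre-flight check can catch that before a long playbook run fails. The sketch below assumes a POSIX shell; `key_is_private` is a hypothetical helper name, not part of the installation package.

```shell
# Succeed only if the file exists and grants no permissions to group or others.
key_is_private() {
    [ -f "$1" ] || return 1
    # Characters 5-10 of the mode string are the group and other permission bits.
    perms=$(ls -ldL "$1" | cut -c5-10)
    [ "$perms" = "------" ]
}

# Usage (not executed here):
#   key_is_private VDS_ROOT/deployment/ansible/files/aws/vds.pem \
#       || chmod 0400 VDS_ROOT/deployment/ansible/files/aws/vds.pem
```

The check reads the mode string from `ls` rather than calling `stat`, because the `stat` options differ between GNU and BSD systems.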
3.4 Installation on a remote virtual private server (VPS)

Requirements:

- This playbook is designed to work on a Debian 8 distribution, so we assume your VPS is running Debian 8.
- You should be able to connect to your VPS using a private key without a password.
- You should know your VPS's public IP.
- The remote user should be part of the sudoers group, because sudo privileges are needed to run all commands.
- Your server should expose the port range to 33335, to enable external access to the Verteego Data Suite.

1. Install VDS

    # Pay attention to the comma after the VPS_PUBLIC_IP
    ansible-playbook \
        -i 'VPS_PUBLIC_IP,' \
        --private-key=path_to_vps_private_ssh_key \
        -u REMOTE_USER \
        VDS_ROOT/setup_on_vps.yml

2. Start playing

Navigate to

3. Sign in

For your first sign-in you can use the following credentials. For security reasons, remember to change them or delete the default user after your first login.

Username: vds-user
Password: verteego

4. Custom settings

Customize infrastructure settings

Your installation can be easily customised using the different .yml files in the VDS_ROOT/deployment/ansible directory.

Example: Use a high-memory instance on Google Cloud

- Open VDS_ROOT/deployment/ansible/setup_gc_instance.yml
- In the vars:machine_type variable, replace n1-standard-1 with n1-highmem-16.

You can also specify settings directly on the command line, using the --extra-vars parameter when running ansible-playbook.

Example: Use a high-memory instance on Google Cloud and deploy the instance in a different zone

    ansible-playbook \
        -i VDS_ROOT/deployment/ansible/hosts \
        --private-key=SSH_ROOT/google_compute_engine \
        VDS_ROOT/deployment/ansible/setup_gc_instance.yml \
        --extra-vars "ginstance_type=n1-highmem-16 gzone=us-central1-f"

Customize application settings

Open VDS_ROOT/deployment/ansible/group_vars/all/vars_file.yml to change the default settings for the different applications composing Verteego Data Suite.
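When several instance sizes or zones are being compared, it can help to generate the command line from parameters and review it before running it. The sketch below only prints the command; it assumes a POSIX shell, takes the variable names ginstance_type and gzone from the example above, and uses a hypothetical helper name, `gc_install_cmd`.

```shell
# Fall back to the literal placeholders if the variables are not set.
VDS_ROOT="${VDS_ROOT:-VDS_ROOT}"
SSH_ROOT="${SSH_ROOT:-SSH_ROOT}"

# Print (do not run) the Google Cloud install command for a machine type and zone.
gc_install_cmd() {
    machine_type=$1
    zone=$2
    printf 'ansible-playbook -i %s --private-key=%s %s --extra-vars "ginstance_type=%s gzone=%s"\n' \
        "$VDS_ROOT/deployment/ansible/hosts" \
        "$SSH_ROOT/google_compute_engine" \
        "$VDS_ROOT/deployment/ansible/setup_gc_instance.yml" \
        "$machine_type" "$zone"
}

gc_install_cmd n1-highmem-16 us-central1-f
```

Printing first and running second keeps a record of exactly which settings were used for a given deployment.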
CHAPTER 3 Docker

1. Install Docker

Download and install Docker for your specific platform; Docker is supported on Windows, Mac OS and Linux. Similarly, install docker-compose.

Once installed, you'll have access to the docker and docker-compose commands from the command line.

    docker --version
    output : Docker version ce, build c6d412e

    docker-compose --version
    output : docker-compose version , build b31ff33

2. Clone installation package

    git clone

3. Install Verteego DS

3.1 Using docker-compose on a single machine

Using the command line, go to the directory deployment/docker inside the repository cloned in step 2.

    docker-compose up
Navigate to

4. Sign in

For your first sign-in you can use the following credentials. For security reasons, remember to change them or delete the default user after your first login.

Username: vds-user
Password: verteego
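The version checks in step 1 can be folded into a small pre-flight test that runs before docker-compose up. The sketch below assumes a POSIX shell; `require_cmds` is a hypothetical helper name, not something shipped with Verteego DS.

```shell
# Return 0 if every named command is on PATH; otherwise report the missing ones.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage (not executed here), from the deployment/docker directory:
#   require_cmds docker docker-compose && docker-compose up
```

Running `docker-compose up -d` instead starts the containers in the background and returns the terminal.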
CHAPTER 4 Dataflow

The dataflow is the backbone of your data science projects. It describes the steps your data runs through, from its source to prediction and visualization.

The dataflow module is based on Apache NiFi, one of the most powerful data routing and transformation tools currently available. It offers a production-ready and highly configurable platform that works with a broad range of data sources, formatting standards (XML, JSON, Avro, CSV, ...) and big data technologies (Hadoop, Kafka, HDFS, Flume, Elasticsearch, HBase, Couchbase, MongoDB, Solr, Splunk, Lumberjack, Cassandra, Hive, ...).

To learn more about the dataflow technology, read on here.
CHAPTER 5 Data cleaning

The data cleaning module provides a powerful user interface for defining a stack of data cleaning tasks that can be applied to any file running through your dataflow.

Technology

The data cleaning module is powered by OpenRefine (formerly Google Refine). You'll find the full user documentation here.

General Refine Expression Language (GREL)

The data cleaning module comes with a powerful expression language that lets you apply precise transformation and cleaning tasks to your data. Check out the GREL documentation.

Running cleaning tasks in dataflow

The dataflow module offers special processors that can be used to apply a cleaning script to a data flow. You can find them among the other processors by typing openrefine in the processor search field. Currently there are processors for XLS, XLSX, CSV and TSV files.
CHAPTER 6 Analytics & dashboarding

Verteego Data Science Suite offers comprehensive analytical and dashboarding capabilities that let you slice, dice and visualize your data at any step of your dataflow.

Technology

The analytics and dashboarding module is powered by Superset, a data exploration platform designed to be visual, intuitive and interactive. To dive deeper into the technology, please check out the Superset GitHub repository and the user group.

Database connections

The analytics and dashboarding module can be used with a variety of databases through the SQLAlchemy library.
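SQLAlchemy identifies a database by a connection URI of the form dialect://user:password@host:port/database, where the dialect may carry an optional driver suffix such as dialect+driver. As a rough illustration, the shell sketch below assembles such a URI; the helper name `sqlalchemy_uri` and all connection values are hypothetical.

```shell
# Build a SQLAlchemy-style connection URI.
# Arguments: dialect (optionally dialect+driver), user, password, host, port, database
sqlalchemy_uri() {
    printf '%s://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Hypothetical PostgreSQL database.
sqlalchemy_uri postgresql analyst secret db.example.com 5432 sales
# prints postgresql://analyst:secret@db.example.com:5432/sales
```

Passwords containing characters such as @ or / need to be URL-encoded before they are placed in the URI.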
CHAPTER 7 Notebooks

Notebooks enable you to write custom scripts that can be run within your dataflow using the ExecuteProcess and ExecuteStreamCommand processors. Verteego comes with a few pre-installed kernels that cover the languages most commonly used by data scientists:

- Python 2.7
- R
- Bash

Technology

The integrated notebooks run on Jupyter.

Install additional kernels

If you need programming languages that are not pre-installed with your Verteego package, you can find more of them here: List of community supported language kernels
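A script run through ExecuteStreamCommand behaves like a filter: NiFi passes the content of the incoming flow file to the command on standard input and takes what the command writes to standard output as the new content. The sketch below shows a minimal filter of that kind in Bash-compatible shell; the function name `clean_records` and the sample data are hypothetical.

```shell
# Read records on stdin, trim trailing whitespace and drop empty lines.
clean_records() {
    sed -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Local dry run with sample CSV data before wiring the script into the dataflow.
printf 'id,name  \n\n1,Ada\n' | clean_records
# prints:
# id,name
# 1,Ada
```

Testing the filter with a pipe on the command line first makes it easier to tell script errors apart from processor configuration errors.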
CHAPTER 8 Prediction

The prediction module offers a simple and powerful way to produce predictive models without writing a line of code. Supported algorithms are:

- Aggregator
- Deep Learning
- Distributed Random Forest
- Gradient Boosting Machine
- Generalized Linear Modeling
- Generalized Low Rank Modeling
- K-means
- Naive Bayes
- Principal Components Analysis

Technology

Verteego Data Science Suite's predictive technology runs on the open source software H2O. You can find the full documentation and some tutorials here.

Predictive modelling

Models trained on H2O are packaged as Plain Old Java Objects (POJO) that can easily be deployed and executed within your dataflow. Check out some examples of production environments.
CHAPTER 9 Examples

Here are a few examples of what we and our users have built with the Verteego Data Science Suite:

- Use Case HR: Reduce employee attrition and make talents stay longer (Part 1: Data Analysis)
- Use Case HR: Reduce employee attrition and make talents stay longer (Part 2: Prediction)
- Clean and verify s

All scripts belonging to these examples can be downloaded from the Verteego GitHub.

Prediction deployment examples

The H2O documentation offers some examples of prediction deployments.
CHAPTER 10 Contact us

Need support, found an issue or have a question? Drop a message to our community support!
CHAPTER 11 Community

Drop a message and we'll do our best to help you out as fast as we can. You can also message us on Twitter.
More informationOpenStack Havana All-in-One lab on VMware Workstation
OpenStack Havana All-in-One lab on VMware Workstation With all of the popularity of OpenStack in general, and specifically with my other posts on deploying the Rackspace Private Cloud lab on VMware Workstation,
More informationBig Data Applications with Spring XD
Big Data Applications with Spring XD Thomas Darimont, Software Engineer, Pivotal Inc. @thomasdarimont Unless otherwise indicated, these slides are 2013-2015 Pivotal Software, Inc. and licensed under a
More informationPVS Deployment in the Cloud. Last Updated: June 17, 2016
PVS Deployment in the Cloud Last Updated: June 17, 2016 Contents Amazon Web Services Introduction 3 Software Requirements 4 Set up a NAT Gateway 5 Install PVS on the NAT Gateway 11 Example Deployment 12
More informationCSCI 350 Virtual Machine Setup Guide
CSCI 350 Virtual Machine Setup Guide This guide will take you through the steps needed to set up the virtual machine to do the PintOS project. Both Macintosh and Windows will run just fine. We have yet
More informationThe SMACK Stack: Spark*, Mesos*, Akka, Cassandra*, Kafka* Elizabeth K. Dublin Apache Kafka Meetup, 30 August 2017.
Dublin Apache Kafka Meetup, 30 August 2017 The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*, Kafka* Elizabeth K. Joseph @pleia2 * ASF projects 1 Elizabeth K. Joseph, Developer Advocate Developer Advocate
More informationComodo Endpoint Security Manager Professional Edition Software Version 3.3
Comodo Endpoint Security Manager Professional Edition Software Version 3.3 Quick Start Guide Guide Version 3.2.022615 Comodo Security Solutions 1255 Broad Street Clifton, NJ 07013 Comodo Endpoint Security
More informationDogeCash Masternode Setup Guide Version 1.2 (Ubuntu 16.04)
DogeCash Masternode Setup Guide Version 1.2 (Ubuntu 16.04) This guide will assist you in setting up a DogeCash Masternode on a Linux Server running Ubuntu 16.04. (Use at your own risk) If you require further
More informationContents Overview... 5 Upgrading Primavera Gateway... 7 Using Gateway Configuration Utilities... 9
Gateway Upgrade Guide for On-Premises Version 17 August 2017 Contents Overview... 5 Downloading Primavera Gateway... 5 Upgrading Primavera Gateway... 7 Prerequisites... 7 Upgrading Existing Gateway Database...
More informationPrototyping Data Intensive Apps: TrendingTopics.org
Prototyping Data Intensive Apps: TrendingTopics.org Pete Skomoroch Research Scientist at LinkedIn Consultant at Data Wrangling @peteskomoroch 09/29/09 1 Talk Outline TrendingTopics Overview Wikipedia Page
More informationSINGLE NODE SETUP APACHE HADOOP
page 1 / 5 page 2 / 5 single node setup apache pdf This article will guide you on how you can install and configure Apache Hadoop on a single node cluster in CentOS 7, RHEL 7 and Fedora 23+ releases. How
More informationCERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI)
CERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI) The Certificate in Software Development Life Cycle in BIGDATA, Business Intelligence and Tableau program
More informationHow To Start Mysql Using Linux Command Line Client In Ubuntu
How To Start Mysql Using Linux Command Line Client In Ubuntu Step One: Install MySQL Client On Debian, Ubuntu or Linux Mint: Before you start typing commands at the MySQL prompt, remember that each In
More informationProfessional Edition User Guide
Professional Edition User Guide Pronto, Visualizer, and Dashboards 2.0 Birst Software Version 5.28.6 Documentation Release Thursday, October 19, 2017 i Copyright 2015-2017 Birst, Inc. Copyright 2015-2017
More informationBitnami ProcessMaker Community Edition for Huawei Enterprise Cloud
Bitnami ProcessMaker Community Edition for Huawei Enterprise Cloud Description ProcessMaker is an easy-to-use, open source workflow automation and Business Process Management platform, designed so Business
More informationHomework #7 Google Cloud Platform
Homework #7 Google Cloud Platform This semester we are allowing all students to explore cloud computing as offered by the Google Cloud Platform. Using the instructions below one can establish a website
More informationQuick Install for Amazon EMR
Quick Install for Amazon EMR Version: 4.2 Doc Build Date: 11/15/2017 Copyright Trifacta Inc. 2017 - All Rights Reserved. CONFIDENTIAL These materials (the Documentation ) are the confidential and proprietary
More informationOracle Cloud Using Oracle Big Data Manager. Release
Oracle Cloud Using Oracle Big Data Manager Release 18.2.5 E91848-08 June 2018 Oracle Cloud Using Oracle Big Data Manager, Release 18.2.5 E91848-08 Copyright 2018, 2018, Oracle and/or its affiliates. All
More informationiway Big Data Integrator New Features Bulletin and Release Notes
iway Big Data Integrator New Features Bulletin and Release Notes Version 1.5.2 DN3502232.0717 Active Technologies, EDA, EDA/SQL, FIDEL, FOCUS, Information Builders, the Information Builders logo, iway,
More informationINDIGO PAAS TUTORIAL. ! Marica Antonacci RIA INFN-Bari
INDIGO PAAS TUTORIAL RIA-653549! Marica Antonacci!! marica.antonacci@ba.infn.it! INFN-Bari INDIGO PAAS Tutorial Introductory Concepts TOSCA Ansible Docker Orchestrator APIs INDIGO TOSCA custom types and
More informationCircuitPython with Jupyter Notebooks
CircuitPython with Jupyter Notebooks Created by Brent Rubell Last updated on 2018-08-22 04:08:47 PM UTC Guide Contents Guide Contents Overview What's a Jupyter Notebook? The Jupyter Notebook is an open-source
More informationChef Server on the AWS Cloud
Chef Server on the AWS Cloud Quick Start Reference Deployment Mike Pfeiffer December 2015 This guide is also available in HTML format at http://docs.aws.amazon.com/quickstart/latest/chef-server/. Contents
More information