Part II (c) Desktop Installation. Net Serpents LLC, USA


Part II (c) Desktop Installation

Desktop Installation

Outline: Supported Platforms; Required Software; Releases & Mirror Sites; Configure; Format; Start/Stop; Verify

Supported Platforms

GNU/Linux is supported for both development and production, and has been demonstrated on clusters of 2000 nodes. Win32 is supported for development only, not as a production platform.

Required Software

The following must be installed first:
- Java 1.6.x or higher
- ssh (Ubuntu: ssh and rsync; Windows: openssh)

Releases and Mirror Sites

Releases are listed at http://hadoop.apache.org/releases.html
- Stable release: 2.7.1 (July 2015)
- Previous stable release: 2.6.0 (Nov 2014)
- Earlier good releases: 2.4.0 (April 2014), 2.2.0 (GA release, Oct 2013), 1.0.0 (Dec 2011)

Mirror Sites

Downloads are available at several mirror sites. Apache's suggested mirror list: http://www.apache.org/dyn/closer.cgi/hadoop/common

Overview: Hadoop 2.6.0 on Ubuntu 14.04
Step 1 - Java
Step 2 - Create a dedicated hadoop user
Step 3 - ssh
Step 4 - Create ssh certificates
Step 5 - Hadoop
Step 6 - Set up configuration files
Step 7 - Format
Step 8 - Start/Stop
Step 9 - Verify

Step 1 - Java

Log in as an admin user:

$ cd ~
# Update the source list
$ sudo apt-get update
$ sudo apt-get install default-jdk
# Verify the Java version is 1.6 or higher
$ java -version
java version "1.7.0_65"
OpenJDK Runtime Environment (IcedTea 2.5.3) (7u71-2.5.3-0ubuntu0.14.04.1)
OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
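The version check above can be scripted. A minimal sketch, assuming a POSIX shell; the `java_major` helper and the sample banner string are illustrative additions, not part of the slides:

```shell
# Extract the major Java version from a `java -version` banner line.
# Handles both the old scheme ("1.7.0_65" -> 7) and the new one ("11.0.2" -> 11).
java_major() {
  ver=$(printf '%s' "$1" | sed -n 's/.*version "\([0-9][0-9._]*\)".*/\1/p')
  case "$ver" in
    1.*) printf '%s' "$ver" | cut -d. -f2 ;;   # old scheme: 1.7 -> 7
    *)   printf '%s' "$ver" | cut -d. -f1 ;;   # new scheme: 11.0.2 -> 11
  esac
}

# Sample banner taken from the slide above.
sample='java version "1.7.0_65"'
if [ "$(java_major "$sample")" -ge 6 ]; then
  echo "Java is new enough for Hadoop 2.6.0"
fi
```

On a live system, replace `sample` with the first line of `java -version 2>&1`.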

Step 2 - Create a dedicated Hadoop user

# Create a hadoop group
$ sudo addgroup hadoop
Adding group `hadoop' (GID 1009)
Done.
# Create the hadoop user
$ sudo adduser --ingroup hadoop huser
Adding user `huser'...
Adding new user `huser' (1001) with group `hadoop'...
Creating home directory `/home/huser'...
Copying files from `/etc/skel'...
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Changing the user information for huser

Step 2 - Create a dedicated Hadoop user (continued)

Enter the new value, or press ENTER for the default
Full Name []:
Room Number []:
Work Phone []:
Home Phone []:
Other []:
Is the information correct? [Y/n] Y
# Add the new user to sudoers
$ sudo adduser huser sudo
[sudo] password for admin:
Adding user `huser' to group `sudo'...
Adding user huser to group sudo
Done.

Step 3 - SSH

$ sudo apt-get install ssh
# Verify SSH is installed
$ which ssh
/usr/bin/ssh
$ which sshd
/usr/sbin/sshd

Step 4 - Create SSH Certificates

$ sudo su huser
# Generate a key pair
$ ssh-keygen -f ~/.ssh/id_rsa -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/huser/.ssh/id_rsa):
Created directory '/home/huser/.ssh'.
Your identification has been saved in /home/huser/.ssh/id_rsa.
Your public key has been saved in /home/huser/.ssh/id_rsa.pub.
The key fingerprint is:
20:6c:f3:ff:0f:33:bf:30:72:c3:22:70:24:cc:2d:d3 huser@laptop
The key's randomart image is:
+--[ RSA 2048]----+
(randomart output truncated)

Step 4 - Create SSH Certificates (continued)

# Add the key to the list of authorized keys to avoid being prompted for a password
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
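sshd is strict about file permissions: it ignores authorized_keys if the file or the ~/.ssh directory is group- or world-writable. A hedged sketch of the same step with the permission tightening, run against a scratch directory so it can be tried safely (the key text is a stand-in, not a real key):

```shell
# Simulate ~/.ssh in a temp dir so this can be run anywhere without sudo.
SSH_DIR=$(mktemp -d)/.ssh
mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"               # sshd requires the directory to be private

# Stand-in public key; on the real host this file is written by ssh-keygen.
echo "ssh-rsa AAAAB3...stand-in huser@laptop" > "$SSH_DIR/id_rsa.pub"

# Same append as on the slide, plus the permission fix.
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"

ls -ld "$SSH_DIR" "$SSH_DIR/authorized_keys"
```

After the real setup, `ssh localhost` should log in without a password prompt.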

Step 5 - Hadoop

# Download the distribution from a mirror site
$ wget http://mirrors.sonic.net/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
# Extract the files
$ tar xvzf hadoop-2.6.0.tar.gz
$ cd hadoop-2.6.0
# Move the files to /usr/local/hadoop
$ sudo mkdir /usr/local/hadoop
$ sudo mv * /usr/local/hadoop
# Change ownership to the hadoop user
$ sudo chown -R huser:hadoop /usr/local/hadoop
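It is worth verifying a downloaded tarball before unpacking it. A sketch of checksum verification with `sha256sum`; the checksum file here is generated locally against a stand-in file purely to demonstrate the `-c` workflow (real Apache releases publish their checksums alongside the tarball):

```shell
WORK=$(mktemp -d)
cd "$WORK"

# Stand-in for the downloaded hadoop-2.6.0.tar.gz.
echo "pretend tarball bytes" > hadoop-2.6.0.tar.gz

# Record its checksum, then verify it the way you would verify a download.
sha256sum hadoop-2.6.0.tar.gz > hadoop-2.6.0.tar.gz.sha256
sha256sum -c hadoop-2.6.0.tar.gz.sha256 && echo "checksum OK"
```

For a real download, fetch the published checksum file from the Apache site rather than generating one locally.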

Step 6 - Configure

# Update links to point to Java
$ update-alternatives --config java
There is only one alternative in link group java (providing /usr/bin/java): /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
Nothing to configure.
# Note down the JAVA_HOME value
$ which javac
/usr/bin/javac
$ readlink -f /usr/bin/javac
/usr/lib/jvm/java-7-openjdk-amd64/bin/javac
(Note: JAVA_HOME is everything before /bin/javac)
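The "everything before /bin/javac" rule can be automated with shell suffix stripping. A sketch using a fake JDK layout in a temp directory so it runs even without a JDK installed; the directory names mirror the slide's paths but are otherwise illustrative:

```shell
# Build a fake /usr/lib/jvm-style layout plus a /usr/bin-style symlink.
FAKE=$(readlink -f "$(mktemp -d)")    # resolve now so paths compare cleanly
mkdir -p "$FAKE/jvm/java-7-openjdk-amd64/bin"
touch "$FAKE/jvm/java-7-openjdk-amd64/bin/javac"
ln -s "$FAKE/jvm/java-7-openjdk-amd64/bin/javac" "$FAKE/javac"

# Resolve the symlink, then strip the /bin/javac suffix to get JAVA_HOME.
javac_real=$(readlink -f "$FAKE/javac")
JAVA_HOME=${javac_real%/bin/javac}
echo "$JAVA_HOME"
```

On a real node the same two lines are simply `readlink -f "$(which javac)"` followed by the suffix strip.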

Step 6 - Configure (continued)

# Add these variables to the end of ~/.bashrc
$ vi ~/.bashrc
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL

Step 6 - Configure (continued)

export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"

# Execute the commands in .bashrc
$ source ~/.bashrc
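If the setup is ever re-run, appending the block again duplicates it in .bashrc. A sketch of an idempotent append, using a scratch file in place of the real ~/.bashrc (the guard on HADOOP_INSTALL and the abbreviated export list are assumptions of this sketch, not from the slides):

```shell
BASHRC=$(mktemp)   # stand-in for ~/.bashrc

add_hadoop_env() {
  # Skip the append if the block is already present.
  grep -q 'HADOOP_INSTALL=' "$1" && return 0
  cat >> "$1" <<'EOF'
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
EOF
}

add_hadoop_env "$BASHRC"
add_hadoop_env "$BASHRC"   # second run is a no-op
grep -c 'HADOOP_INSTALL=' "$BASHRC"   # prints 1, not 2
```

The quoted 'EOF' delimiter keeps `$PATH` and `$HADOOP_INSTALL` unexpanded so they are written literally, as in the slides.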

Step 6 - Configure (continued)

# Configure hadoop-env.sh: change the JAVA_HOME variable
$ vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

Step 6 - Configure (continued)

# Configure core-site.xml
# First create a tmp folder for hadoop
$ sudo mkdir -p /app/hadoop/tmp
$ sudo chown huser:hadoop /app/hadoop/tmp
# Modify core-site.xml
$ vi /usr/local/hadoop/etc/hadoop/core-site.xml

Step 6 - Configure (continued)

Modify as follows:

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The URI's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The URI's authority is used to determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>

Step 6 - Configure (continued)

# Configure mapred-site.xml
# First copy the file from the template provided
$ cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
# Modify mapred-site.xml
$ vi /usr/local/hadoop/etc/hadoop/mapred-site.xml

Step 6 - Configure (continued)

Modify as follows:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs at.</description>
  </property>
</configuration>

Step 6 - Configure (continued)

# Configure hdfs-site.xml
# First create the directories for the name node and data node
$ sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
$ sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
$ sudo chown -R huser:hadoop /usr/local/hadoop_store
# Modify hdfs-site.xml
$ vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml

Step 6 - Configure (continued)

Modify as follows:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>The default block replication. This is only a default; a different value can be specified when a file is created.</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>
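Hand-editing XML is error-prone; the same file can be written in one step from a heredoc. A sketch that writes the hdfs-site.xml above into a scratch directory (on a real node the target would be /usr/local/hadoop/etc/hadoop/hdfs-site.xml):

```shell
CONF=$(mktemp -d)   # stand-in for /usr/local/hadoop/etc/hadoop

cat > "$CONF/hdfs-site.xml" <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>
EOF

grep -c '<property>' "$CONF/hdfs-site.xml"   # 3 properties written
```

This keeps the edit reproducible and avoids stray characters that vi sessions sometimes leave behind.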

Step 7 - Format

$ hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated. Instead use the hdfs command for it.
15/04/18 14:43:03 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = laptop/192.168.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0
STARTUP_MSG:   classpath = /usr/local/hadoop/etc/hadoop...
STARTUP_MSG:   java = 1.7.0_65
************************************************************/
15/04/18 14:43:03 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
15/04/18 14:43:03 INFO namenode.NameNode: createNameNode [-format]
15/04/18 14:43:07 WARN util.NativeCode...

Step 8 - Start/Stop

To view the available commands:
$ ls /usr/local/hadoop/sbin

To start hadoop (log in as the hadoop user), either:
$ start-all.sh    (deprecated; starts all daemons)
or:
$ start-dfs.sh
$ start-yarn.sh
$ mr-jobhistory-daemon.sh start historyserver

Step 8 - Start/Stop (continued)

To stop hadoop (log in as the hadoop user), either:
$ stop-all.sh
or:
$ stop-dfs.sh
$ stop-yarn.sh
$ mr-jobhistory-daemon.sh stop historyserver

Step 9 - Verify

To verify, view the running daemons:
$ jps
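After a successful start-dfs.sh / start-yarn.sh, the `jps` listing should show NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. A sketch of a scripted check; the sample listing below is illustrative, not captured from a real run:

```shell
# Print the names of any expected daemons absent from a jps listing.
missing_daemons() {
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    printf '%s\n' "$1" | grep -qw "$d" || echo "$d"
  done
}

# Illustrative jps output from a healthy pseudo-distributed node.
sample='1201 NameNode
1345 DataNode
1502 SecondaryNameNode
1688 ResourceManager
1790 NodeManager
1900 Jps'

[ -z "$(missing_daemons "$sample")" ] && echo "all daemons running"
```

On a live node, call it as `missing_daemons "$(jps)"`.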

Step 9 - Verify (continued)

Web UI: port 50070, e.g. http://ec2-54-86-169-214.compute-1.amazonaws.com:50070

Quiz

1 - Which two of the following must be installed for the hadoop installation to succeed?
a - ssh  b - java  c - ftp  d - ruby

2 - Which of the following Linux commands may be used to install software such as ssh or java?
a - get  b - put  c - apt-get  d - start-all.sh

3 - Name the commands you would use to start the hadoop daemons and to stop them.

Quiz (continued)

4 - Which of the following is NOT a configuration file for hadoop?
a - hadoop-env.sh  b - core-site.xml  c - mapred-site.xml  d - hadoop-configuration.xml

5 - True or false:
a - The user owning the hadoop files must belong to a group called hadoop.
b - A hadoop cluster running on a single node is called pseudo-distributed mode.
c - A fully distributed hadoop configuration should have a minimum of three nodes.