Pivotal Capgemini Just Do It Training HDFS-NFS Gateway Labs

In this lab exercise you will have an opportunity to explore HDFS as well as become familiar with using the HDFS-NFS Bridge.

First we will go through a few setup steps in preparation for the lab exercise.

1) Gunzip the customers_dim.tsv.gz file located in /retail_demo/customers_dim

[gpadmin@pivhd2 customers_dim]$ gunzip customers_dim.tsv.gz

Confirm the gunzip process was successful:

[gpadmin@pivhd2 customers_dim]$ ls
customers_dim.tsv

2) Move customers_dim.tsv to the Desktop

[gpadmin@pivhd2 customers_dim]$ mv customers_dim.tsv /home/gpadmin/Desktop

Confirm that customers_dim.tsv is on the Desktop.

Now we will work with some of the HDFS commands.

NOTE!!! We will be using commands that have equivalents, i.e. -put/-copyFromLocal and -get/-copyToLocal. You could also use the equivalent Hadoop/HDFS command forms, i.e. hadoop fs -command / hdfs dfs -command.

3) Make 2 new directories in your HDFS home directory (/user/gpadmin)

[gpadmin@pivhd2 Desktop]$ hadoop fs -mkdir hdfs_retail_demo
[gpadmin@pivhd2 Desktop]$ hadoop fs -mkdir nfs_retail_demo

Confirm that the directories were created successfully:

[gpadmin@pivhd2 Desktop]$ hadoop fs -ls
Found 4 items
drwx------   - gpadmin hadoop          0 2014-04-30 05:42 .Trash
drwx------   - gpadmin hadoop          0 2014-07-23 18:51 .staging
drwxr-xr-x   - gpadmin hadoop          0 2014-07-27 09:18 hdfs_retail_demo
drwxr-xr-x   - gpadmin hadoop          0 2014-07-27 09:18 nfs_retail_demo

4) Load customers_dim.tsv from the Desktop to the hdfs_retail_demo directory in HDFS

[gpadmin@pivhd2 Desktop]$ hadoop fs -copyFromLocal customers_dim.tsv hdfs_retail_demo

NOTE!! The hadoop fs -put command could have been used as well.
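As the NOTE above indicates, -put and -copyFromLocal behave the same way for a local-to-HDFS upload, and the hdfs dfs front end accepts the same subcommands as hadoop fs. For illustration, either of the following could have been run instead of (not in addition to) the -copyFromLocal command in step 4:

[gpadmin@pivhd2 Desktop]$ hadoop fs -put customers_dim.tsv hdfs_retail_demo
[gpadmin@pivhd2 Desktop]$ hdfs dfs -copyFromLocal customers_dim.tsv hdfs_retail_demo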

Confirm that the file was loaded successfully:

[gpadmin@pivhd2 Desktop]$ hadoop fs -ls hdfs_retail_demo
-rw-r--r--   1 gpadmin hadoop   10061365 2014-07-27 09:19 hdfs_retail_demo/customers_dim.tsv

Confirm the content of the file:

[gpadmin@pivhd2 Desktop]$ hadoop fs -tail hdfs_retail_demo/customers_dim.tsv

5) Rename customers_dim.tsv to hdfs_customers_dim.tsv in the hdfs_retail_demo directory

[gpadmin@pivhd2 Desktop]$ hadoop fs -mv hdfs_retail_demo/customers_dim.tsv hdfs_retail_demo/hdfs_customers_dim.tsv

Confirm the rename:

[gpadmin@pivhd2 Desktop]$ hadoop fs -ls hdfs_retail_demo
-rw-r--r--   1 gpadmin hadoop   10061365 2014-07-27 09:19 hdfs_retail_demo/hdfs_customers_dim.tsv

6) Copy hdfs_customers_dim.tsv to the HDFS root directory

[gpadmin@pivhd2 Desktop]$ hadoop fs -cp hdfs_retail_demo/hdfs_customers_dim.tsv /

Confirm that the file was copied to the HDFS root directory:

-rw-r--r--   1 gpadmin hadoop   10061365 2014-07-27 09:45 /hdfs_customers_dim.tsv
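If you would like to produce the confirmation listing above yourself, the root copy can be checked directly from the terminal. A short sketch using standard HDFS shell commands:

[gpadmin@pivhd2 Desktop]$ hadoop fs -ls /hdfs_customers_dim.tsv      # listing for the copied file
[gpadmin@pivhd2 Desktop]$ hadoop fs -du /hdfs_customers_dim.tsv      # size in bytes, which should still match the source file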

7) Append customers_dim.tsv from the Desktop to /hdfs_customers_dim.tsv in the HDFS root directory

[gpadmin@pivhd2 Desktop]$ hadoop fs -appendToFile /home/gpadmin/Desktop/customers_dim.tsv /hdfs_customers_dim.tsv

Confirm the append operation was successful. NOTE!!! The file size is now twice as large.

-rw-r--r--   1 gpadmin hadoop  948755358 2014-07-27 09:50 /hdfs_customers_dim.tsv

8) Unload hdfs_customers_dim.tsv from the HDFS root directory to the Desktop

[gpadmin@pivhd2 Desktop]$ hadoop fs -copyToLocal /hdfs_customers_dim.tsv /home/gpadmin/Desktop/

Confirm the copy of hdfs_customers_dim.tsv is on the Desktop and check the properties to ensure that the file size has roughly doubled.

-----------------------------------------------------------------------------------------------------------------

Now you will use the HDFS-NFS Bridge to accomplish tasks similar to those you completed in the previous steps.

There are two (2) icons on the desktop in the VM. Both are named HDFS. One is an icon of a disk drive and the other is an icon of a folder. You will use these browsers for the purposes of this exercise. You may open as many browsers as you wish. For the final step in this exercise you will use a terminal window.
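Because the HDFS-NFS Gateway exposes the HDFS namespace as an ordinary mounted filesystem, you can also browse it from a terminal with standard Unix commands rather than the graphical browsers. A minimal sketch, assuming the gateway is mounted under /home/gpadmin/Desktop/HDFS, as the paths in the following steps suggest:

# List your HDFS home directory through the NFS mount (equivalent to: hadoop fs -ls)
[gpadmin@pivhd2 Desktop]$ ls -l /home/gpadmin/Desktop/HDFS/user/gpadmin

# View the end of a file through the mount (equivalent to: hadoop fs -tail)
[gpadmin@pivhd2 Desktop]$ tail /home/gpadmin/Desktop/HDFS/user/gpadmin/hdfs_retail_demo/hdfs_customers_dim.tsv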

9) Open two HDFS file browsers. Open the HDFS Disk icon from the Desktop. Open the HDFS Folder icon from the Desktop. In one of the HDFS Browser windows drill down to the nfs_retail_demo directory in:

/home/gpadmin/Desktop/HDFS/user/gpadmin/nfs_retail_demo

Drag the customers_dim.tsv file from the Desktop and drop it into the nfs_retail_demo directory. The file should be copied to nfs_retail_demo.

Confirm by:

[gpadmin@pivhd2 Desktop]$ hadoop fs -ls nfs_retail_demo
-rw-r--r--   1 gpadmin hadoop   10061365 2014-07-27 09:19 nfs_retail_demo/customers_dim.tsv

Confirm the content of the file:

[gpadmin@pivhd2 Desktop]$ hadoop fs -tail nfs_retail_demo/customers_dim.tsv

10) Rename customers_dim.tsv to nfs_customers_dim.tsv in the nfs_retail_demo directory

[gpadmin@pivhd2 Desktop]$ hadoop fs -mv nfs_retail_demo/customers_dim.tsv nfs_retail_demo/nfs_customers_dim.tsv

Confirm the rename:

[gpadmin@pivhd2 Desktop]$ hadoop fs -ls nfs_retail_demo
-rw-r--r--   1 gpadmin hadoop   10061365 2014-07-27 09:19 nfs_retail_demo/nfs_customers_dim.tsv

11) Copy nfs_customers_dim.tsv to the HDFS root directory

Right mouse-click on nfs_customers_dim.tsv in the nfs_retail_demo directory and drag it to the HDFS root directory. Let go of the mouse button and select the Copy menu item. For Macintosh, select the file and hold the Command key while dragging it. Release the mouse button and select the Copy menu item.

Confirm that the file was copied to the HDFS root directory:

-rw-r--r--   1 gpadmin hadoop   10061365 2014-07-27 09:45 /nfs_customers_dim.tsv
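If you prefer the terminal to the file browsers, the rename and copy in steps 10 and 11 can also be performed with ordinary shell commands over the NFS mount. A sketch, again assuming the /home/gpadmin/Desktop/HDFS mount point:

# Rename inside nfs_retail_demo through the mount (equivalent to: hadoop fs -mv)
[gpadmin@pivhd2 Desktop]$ cd /home/gpadmin/Desktop/HDFS/user/gpadmin/nfs_retail_demo
[gpadmin@pivhd2 nfs_retail_demo]$ mv customers_dim.tsv nfs_customers_dim.tsv

# Copy to the HDFS root directory through the mount (equivalent to: hadoop fs -cp)
[gpadmin@pivhd2 nfs_retail_demo]$ cp nfs_customers_dim.tsv /home/gpadmin/Desktop/HDFS/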

12) Append customers_dim.tsv from the Desktop to /nfs_customers_dim.tsv in the HDFS root directory

[gpadmin@pivhd2 Desktop]$ cat customers_dim.tsv >> /home/gpadmin/Desktop/HDFS/nfs_customers_dim.tsv

Confirm the append operation was successful. NOTE!!! The file size is now twice as large.

-rw-r--r--   1 gpadmin hadoop   20122730 2014-07-27 09:50 /nfs_customers_dim.tsv

13) Unload nfs_customers_dim.tsv from the HDFS root directory to the Desktop

Right mouse-click on nfs_customers_dim.tsv in the HDFS root directory and drag it to the Desktop. Let go of the mouse button and select the Copy menu item. For Macintosh, select the file and hold the Command key while dragging it. Release the mouse button and select the Copy menu item.

Confirm the copy of nfs_customers_dim.tsv is on the Desktop and check the properties to ensure that the file size has roughly doubled.
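As with the earlier steps, the unload in step 13 can also be done from a terminal window instead of the file browser. A sketch, again assuming the /home/gpadmin/Desktop/HDFS mount point:

# Copy the file out of the HDFS root to the local Desktop through the NFS mount (equivalent to: hadoop fs -copyToLocal)
[gpadmin@pivhd2 Desktop]$ cp /home/gpadmin/Desktop/HDFS/nfs_customers_dim.tsv /home/gpadmin/Desktop/

# Check that the local copy is roughly twice the size of the original 10061365-byte file
[gpadmin@pivhd2 Desktop]$ ls -l /home/gpadmin/Desktop/nfs_customers_dim.tsv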