Pivotal Capgemini Just Do It Training
HDFS-NFS Gateway Labs

In this lab exercise you will explore HDFS and become familiar with the HDFS-NFS Bridge.

First, we will go through a few setup steps in preparation for the lab exercise.

1) Unzip the customers_dim.tsv.gz file located in /retail_demo/customers_dim.

    [gpadmin@pivhd2 customers_dim]$ gunzip customers_dim.tsv.gz

Confirm that gunzip was successful:

    [gpadmin@pivhd2 customers_dim]$ ls customers_dim.tsv

2) Move customers_dim.tsv to the Desktop.

    [gpadmin@pivhd2 customers_dim]$ mv customers_dim.tsv /home/gpadmin/Desktop

Confirm that customers_dim.tsv is on the Desktop.

Now we will work with some of the HDFS commands.

NOTE!!! Some of the commands used here have equivalents, i.e. -put / -copyFromLocal and -get / -copyToLocal. The two command front ends are also interchangeable, i.e. hadoop fs -command / hdfs dfs -command. Command names are case-sensitive.

3) Make two new directories in your HDFS home directory (gpadmin).

    [gpadmin@pivhd2 Desktop]$ hadoop fs -mkdir hdfs_retail_demo
    [gpadmin@pivhd2 Desktop]$ hadoop fs -mkdir nfs_retail_demo

Confirm that the directories were created successfully:

    [gpadmin@pivhd2 Desktop]$ hadoop fs -ls
    Found 4 items
    drwx------   - gpadmin hadoop          0 2014-04-30 05:42 .Trash
    drwx------   - gpadmin hadoop          0 2014-07-23 18:51 .staging
    drwxr-xr-x   - gpadmin hadoop          0 2014-07-27 09:18 hdfs_retail_demo
    drwxr-xr-x   - gpadmin hadoop          0 2014-07-27 09:18 nfs_retail_demo

4) Load customers_dim.tsv from the Desktop into the hdfs_retail_demo directory in HDFS.

    [gpadmin@pivhd2 Desktop]$ hadoop fs -copyFromLocal customers_dim.tsv hdfs_retail_demo

NOTE!! The hadoop fs -put command could have been used as well.

Confirm that the file loaded successfully:
    [gpadmin@pivhd2 Desktop]$ hadoop fs -ls hdfs_retail_demo
    -rw-r--r--   1 gpadmin hadoop   10061365 2014-07-27 09:19 hdfs_retail_demo/customers_dim.tsv

Confirm the content of the file:

    [gpadmin@pivhd2 Desktop]$ hadoop fs -tail hdfs_retail_demo/customers_dim.tsv

5) Rename customers_dim.tsv to hdfs_customers_dim.tsv in the hdfs_retail_demo directory.

    [gpadmin@pivhd2 Desktop]$ hadoop fs -mv hdfs_retail_demo/customers_dim.tsv hdfs_retail_demo/hdfs_customers_dim.tsv

Confirm the rename:

    [gpadmin@pivhd2 Desktop]$ hadoop fs -ls hdfs_retail_demo
    -rw-r--r--   1 gpadmin hadoop   10061365 2014-07-27 09:19 hdfs_retail_demo/hdfs_customers_dim.tsv

6) Copy hdfs_customers_dim.tsv to the HDFS root directory.

    [gpadmin@pivhd2 Desktop]$ hadoop fs -cp hdfs_retail_demo/hdfs_customers_dim.tsv /

Confirm that the file was copied to the HDFS root directory (for example, with hadoop fs -ls /). The listing should include:

    -rw-r--r--   1 gpadmin hadoop   10061365 2014-07-27 09:45 /hdfs_customers_dim.tsv
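
The HDFS steps 3 through 6 can also be collected into one small script. The following is only a minimal sketch, not part of the lab materials: the function name run_hdfs_steps and the HADOOP variable are made up for illustration, and it assumes the hadoop client is on the PATH and that the target files do not exist yet. It stops at the first command that fails and finishes with hadoop fs -test -e, which exits with status 0 only if the path exists.

```shell
# Sketch of steps 3-6 as one function.
# HADOOP and run_hdfs_steps are illustrative names, not part of the lab VM.
HADOOP="${HADOOP:-hadoop}"

run_hdfs_steps() {
    src="$1"    # local file, e.g. customers_dim.tsv on the Desktop
    # Step 3: create both working directories (-p: no error if they exist)
    "$HADOOP" fs -mkdir -p hdfs_retail_demo nfs_retail_demo || return 1
    # Step 4: load the local file into HDFS
    "$HADOOP" fs -copyFromLocal "$src" hdfs_retail_demo || return 1
    # Step 5: rename the file inside HDFS
    "$HADOOP" fs -mv hdfs_retail_demo/customers_dim.tsv \
        hdfs_retail_demo/hdfs_customers_dim.tsv || return 1
    # Step 6: copy the renamed file to the HDFS root directory
    "$HADOOP" fs -cp hdfs_retail_demo/hdfs_customers_dim.tsv / || return 1
    # Confirm: exit status 0 only if the copy exists in the root directory
    "$HADOOP" fs -test -e /hdfs_customers_dim.tsv
}

# Run the steps only where the hadoop client is installed
if command -v "$HADOOP" >/dev/null 2>&1; then
    run_hdfs_steps "${1:-customers_dim.tsv}"
else
    echo "hadoop client not found; skipping" >&2
fi
```

Run it from the directory that holds customers_dim.tsv. Because -copyFromLocal refuses to overwrite an existing file, a second run stops at step 4 unless the earlier results are removed first.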
7) Append customers_dim.tsv from the Desktop to /hdfs_customers_dim.tsv in the HDFS root directory.

    [gpadmin@pivhd2 Desktop]$ hadoop fs -appendToFile /home/gpadmin/Desktop/customers_dim.tsv /hdfs_customers_dim.tsv

Confirm that the append operation was successful. NOTE!!! The file size is now twice as large (2 x 10061365 = 20122730 bytes):

    -rw-r--r--   1 gpadmin hadoop   20122730 2014-07-27 09:50 /hdfs_customers_dim.tsv

8) Unload hdfs_customers_dim.tsv from the HDFS root directory to the Desktop.

    [gpadmin@pivhd2 Desktop]$ hadoop fs -copyToLocal /hdfs_customers_dim.tsv /home/gpadmin/Desktop/

Confirm that the copy of hdfs_customers_dim.tsv is on the Desktop and check its properties to ensure that the file size has roughly doubled.

------------------------------------------------------------

Now you will use the HDFS-NFS Bridge to accomplish tasks similar to those you completed in the previous steps.

There are two (2) icons on the Desktop in the VM. Both are named HDFS. One is an icon of a disk drive and the other is an icon of a folder. You will use these browsers for the purposes of this exercise. You may open as many browsers as you wish. Toward the end of this exercise you will also use a terminal window.

9) Open two HDFS file browsers: open the HDFS Disk icon from the Desktop, then open the HDFS Folder icon from the Desktop.

In one of the HDFS browser windows, drill down to the nfs_retail_demo directory in:
    /home/gpadmin/Desktop/hdfs/user/gpadmin/nfs_retail_demo

Drag the customers_dim.tsv file from the Desktop and drop it into the nfs_retail_demo directory. The file should be copied to nfs_retail_demo.

Confirm by:

    [gpadmin@pivhd2 Desktop]$ hadoop fs -ls nfs_retail_demo
    -rw-r--r--   1 gpadmin hadoop   10061365 2014-07-27 09:19 nfs_retail_demo/customers_dim.tsv

Confirm the content of the file:

    [gpadmin@pivhd2 Desktop]$ hadoop fs -tail nfs_retail_demo/customers_dim.tsv

10) Rename customers_dim.tsv to nfs_customers_dim.tsv in the nfs_retail_demo directory.

    [gpadmin@pivhd2 Desktop]$ hadoop fs -mv nfs_retail_demo/customers_dim.tsv nfs_retail_demo/nfs_customers_dim.tsv

Confirm the rename:

    [gpadmin@pivhd2 Desktop]$ hadoop fs -ls nfs_retail_demo
    -rw-r--r--   1 gpadmin hadoop   10061365 2014-07-27 09:19 nfs_retail_demo/nfs_customers_dim.tsv

11) Copy nfs_customers_dim.tsv to the HDFS root directory.

Using the right mouse button, drag nfs_customers_dim.tsv from the nfs_retail_demo directory to the HDFS root directory. Release the mouse button and select the Copy menu item. On a Macintosh, select the file and hold the Command key while dragging it, then release the mouse button and select the Copy menu item.

Confirm that the file was copied to the HDFS root directory:

    -rw-r--r--   1 gpadmin hadoop   10061365 2014-07-27 09:45 /nfs_customers_dim.tsv
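
Because the bridge exposes HDFS as an ordinary mounted directory, the drag-and-drop actions in steps 9 and 11 (and the rename in step 10) also have plain terminal equivalents. The following is a hedged sketch only: the names HDFS_MOUNT and nfs_copy_steps are invented for illustration, the default mount path is an assumption that should be adjusted to wherever the HDFS icon points on your VM, and it assumes the mount is writable by the current user.

```shell
# Sketch: terminal equivalents of steps 9-11, done through the NFS mount.
# HDFS_MOUNT and nfs_copy_steps are illustrative names; the default path
# below is an assumption about where HDFS is mounted in the VM.
HDFS_MOUNT="${HDFS_MOUNT:-/home/gpadmin/Desktop/hdfs}"

nfs_copy_steps() {
    src="$1"                                        # local customers_dim.tsv
    demo="$HDFS_MOUNT/user/gpadmin/nfs_retail_demo"
    # Step 9: copy the local file into the mounted HDFS directory
    cp "$src" "$demo/customers_dim.tsv" || return 1
    # Step 10: rename through the mount instead of hadoop fs -mv
    mv "$demo/customers_dim.tsv" "$demo/nfs_customers_dim.tsv" || return 1
    # Step 11: copy the renamed file to the HDFS root directory
    cp "$demo/nfs_customers_dim.tsv" "$HDFS_MOUNT/" || return 1
    # Show the result so the size can be compared with the original
    ls -l "$HDFS_MOUNT/nfs_customers_dim.tsv"
}

# Run the steps only if the mounted directory is actually present
if [ -d "$HDFS_MOUNT/user/gpadmin/nfs_retail_demo" ]; then
    nfs_copy_steps "${1:-customers_dim.tsv}"
else
    echo "HDFS mount not found at $HDFS_MOUNT; skipping" >&2
fi
```

The same files can then be checked from the HDFS side with hadoop fs -ls, as in the confirmation steps above.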
12) Append customers_dim.tsv from the Desktop to /nfs_customers_dim.tsv in the HDFS root directory.

    [gpadmin@pivhd2 Desktop]$ cat customers_dim.tsv >> /home/gpadmin/Desktop/hdfs/nfs_customers_dim.tsv

Confirm that the append operation was successful. NOTE!!! The file size is now twice as large:

    -rw-r--r--   1 gpadmin hadoop   20122730 2014-07-27 09:50 /nfs_customers_dim.tsv

13) Unload nfs_customers_dim.tsv from the HDFS root directory to the Desktop.

Using the right mouse button, drag nfs_customers_dim.tsv from the HDFS root directory to the Desktop. Release the mouse button and select the Copy menu item. On a Macintosh, select the file and hold the Command key while dragging it, then release the mouse button and select the Copy menu item.

Confirm that the copy of nfs_customers_dim.tsv is on the Desktop and check its properties to ensure that the file size has roughly doubled.
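
The size check in the confirmation steps can also be done from a terminal instead of the file properties dialog. The following is a minimal sketch; expect_doubled is a hypothetical helper name, and it assumes both files are reachable as ordinary paths (local copies on the Desktop, or files under the NFS mount). It compares byte counts from wc -c and succeeds only if the second file is exactly twice the size of the first.

```shell
# Sketch: check that an append doubled a file's size.
# expect_doubled is an illustrative helper name, not part of the lab VM.
expect_doubled() {
    original="$1"   # file that was appended, e.g. customers_dim.tsv
    combined="$2"   # file after the append, e.g. nfs_customers_dim.tsv
    [ -f "$original" ] && [ -f "$combined" ] || return 1
    # Arithmetic expansion strips any padding that wc adds to its output
    orig_size=$(($(wc -c < "$original")))
    new_size=$(($(wc -c < "$combined")))
    if [ "$new_size" -eq $((orig_size * 2)) ]; then
        echo "OK: $new_size bytes is twice $orig_size"
    else
        echo "MISMATCH: expected $((orig_size * 2)), got $new_size" >&2
        return 1
    fi
}
```

For example, from the Desktop: expect_doubled customers_dim.tsv nfs_customers_dim.tsv. With the sizes shown in this lab, the original is 10061365 bytes, so the appended file should be 2 x 10061365 = 20122730 bytes.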