Installing Apache Zeppelin


Installing
Date of Publish: 2018-04-01
http://docs.hortonworks.com

Contents

Install Using Ambari
Enabling HDFS and Configuration Storage for Zeppelin Notebooks in HDP-2.6.3+
    Overview
    Enable HDFS Storage when Upgrading to HDP-2.6.3+
    Use Local Storage when Upgrading to HDP-2.6.3+

Install Using Ambari

How to install Zeppelin on an Ambari-managed cluster.

Before you begin
Install Zeppelin on a node where the Spark clients are already installed and running. This typically means that Zeppelin is installed on a gateway or edge node.

Zeppelin requires the following software versions:
- HDP 3.0 or later.
- Apache Spark 2.0.
- Java 8 on the node where Zeppelin is installed.

The optional Livy server provides security features and user impersonation support for Zeppelin users. Livy is installed as part of Spark. After installing Spark, Livy, and Zeppelin, refer to "Configuring Zeppelin" in this guide for post-installation steps.

Install Zeppelin Using Ambari
The Ambari installation wizard sets default values for the Zeppelin configuration settings. Initially, you should accept the default settings. Later, when you are more familiar with Zeppelin, consider customizing the Zeppelin configuration settings.

To install Zeppelin using Ambari, add the Zeppelin service:
1. Click the ellipsis (…) symbol next to Services on the Ambari dashboard, then click Add Service.
2. On the Add Service Wizard, under Choose Services, select Zeppelin Notebook, then click Next.
3. On the Assign Masters page, review the node assignment for Zeppelin Notebook, then click Next.
4. On the Customize Services page, review the default values, then click Next.
5. If Kerberos is enabled on the cluster, review the principal and keytab settings on the Configure Identities page, modify the settings if desired, then click Next.
6. Review the configuration on the Review page, then click Deploy to begin the installation.
7. The Install, Start, and Test page displays the installation status.
8. When the progress bar reaches 100% and a "Success" message appears, click Next.
9. On the Summary page, click Complete to finish installing Zeppelin.

To validate the Zeppelin installation, open the Zeppelin Web UI in a browser window. Use the port number configured for Zeppelin (9995 by default); for example:

    http://<zeppelin-host>:9995

You can also open the Zeppelin Web UI by selecting Zeppelin Notebook > Zeppelin UI on the Ambari dashboard.

To check the Zeppelin version number, run the following command on the command line:

    /usr/hdp/current/zeppelin-server/bin/zeppelin-daemon.sh --version

Zeppelin stores configuration settings in the /etc/zeppelin/conf directory. Note, however, that if your cluster is managed by Ambari, you should not modify configuration settings directly; instead, use the Ambari web UI. Zeppelin stores log files in /var/log/zeppelin on the node where Zeppelin is installed.
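The UI validation described above can also be scripted. The following is a minimal sketch; the hostname zeppelin-node.example.com is a hypothetical placeholder, and 9995 is the Ambari default port (zeppelin.server.port).

```shell
#!/bin/sh
# Hypothetical values; substitute the node where Zeppelin was installed
# and the port configured in Ambari (zeppelin.server.port, default 9995).
ZEPPELIN_HOST="zeppelin-node.example.com"
ZEPPELIN_PORT=9995

ZEPPELIN_URL="http://${ZEPPELIN_HOST}:${ZEPPELIN_PORT}"
echo "Zeppelin UI: ${ZEPPELIN_URL}"

# On a live cluster, uncomment to confirm the UI answers with HTTP 200:
# curl -s -o /dev/null -w '%{http_code}\n' "${ZEPPELIN_URL}"
```

The curl line is left commented out because it requires a running Zeppelin server; on a live cluster it prints the HTTP status code returned by the UI.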

Enabling HDFS and Configuration Storage for Zeppelin Notebooks in HDP-2.6.3+

Overview
When upgrading to HDP-2.6.3 and higher versions, additional configuration steps are required to enable HDFS storage for notebooks.

HDP-2.6.3 introduced support for HDFS storage of notebooks and configuration files. In previous versions, notebooks and configuration files were stored on the local disk of the Zeppelin server. When upgrading to HDP-2.6.3 and higher versions, there are two options for configuring Zeppelin notebook and configuration file storage:

- Use HDFS storage (recommended): Zeppelin notebooks and configuration files must be copied to the new HDFS storage location before upgrading. Additional upgrade and post-upgrade steps must also be performed, as described in the following section.
- Use local storage: Perform upgrade and post-upgrade steps to enable local storage.

Enabling HDFS storage makes future upgrades much easier, and it is also the first step toward enabling Zeppelin High Availability. Therefore, it is recommended that you enable HDFS storage for Zeppelin notebooks and configuration files when upgrading to HDP 2.6.3+ from earlier versions of HDP.

Note: Currently, HDFS and local storage are the only supported notebook storage mechanisms in HDP-2.6.3+, and VFSNotebookRepo is the only supported local storage option.

Enable HDFS Storage when Upgrading to HDP-2.6.3+
Perform the following steps to enable HDFS storage when upgrading to HDP 2.6.3+ from earlier versions of HDP.

Procedure
1. Before upgrading Zeppelin, perform the following steps as the Zeppelin service user.
   a. Create the /user/zeppelin/conf and /user/zeppelin/notebook directories in HDFS:

      hdfs dfs -ls /user/zeppelin
      drwxr-xr-x   - zeppelin hdfs          0 2018-01-20 04:17 /user/zeppelin/conf
      drwxr-xr-x   - zeppelin hdfs          0 2018-01-20 03:40 /user/zeppelin/notebook

   b. Copy all notebooks from the local Zeppelin server (for example, /usr/hdp/2.5.3.0-37/zeppelin/notebook/) to the /user/zeppelin/notebook directory in HDFS:

      hdfs dfs -ls /user/zeppelin/notebook
      drwxr-xr-x   - zeppelin hdfs          0 2018-01-19 01:40 /user/zeppelin/notebook/2a94m5j1z
      drwxr-xr-x   - zeppelin hdfs          0 2018-01-19 01:40 /user/zeppelin/notebook/2bwjftxkj

   c. Copy the interpreter.json and notebook-authorization.json files from the local Zeppelin service configuration directory (/etc/zeppelin/conf) to the /user/zeppelin/conf directory in HDFS:

      hdfs dfs -ls /user/zeppelin/conf
      -rw-r--r--   3 zeppelin hdfs     284091 2018-01-22 23:28 /user/zeppelin/conf/interpreter.json
      -rw-r--r--   3 zeppelin hdfs     123849 2018-01-22 23:29 /user/zeppelin/conf/notebook-authorization.json

2. Upgrade Ambari.
3. Upgrade HDP and Zeppelin. During the upgrade, verify that the following configuration settings are present in Ambari for Zeppelin:

      zeppelin.notebook.storage = org.apache.zeppelin.notebook.repo.FileSystemNotebookRepo
      zeppelin.config.fs.dir = conf

   If necessary, add or update these configuration settings as shown above.
4. After the upgrade is complete:
   a. Log on to the Zeppelin server and verify that the following properties exist in the /etc/zeppelin/conf/zeppelin-site.xml file. The actual values for the keytab file and principal name may be different on your cluster.

      <property>
        <name>zeppelin.server.kerberos.keytab</name>
        <value>/etc/security/keytabs/zeppelin.server.kerberos.keytab</value>
      </property>
      <property>
        <name>zeppelin.server.kerberos.principal</name>
        <value>zeppelin@EXAMPLE.COM</value>
      </property>

   b. Check the Zeppelin Interpreter page to see whether any interpreter (for example, the Livy interpreter) is duplicated. This can happen in some cases. If duplicate interpreter entries are found, perform the following steps:
      1. Back up and delete the interpreter.json file from HDFS (/user/zeppelin/conf/interpreter.json) and from the local Zeppelin server.
      2. Restart the Zeppelin service.
      3. Verify that the duplicate entries no longer exist.
      4. If any custom interpreter settings were present before the upgrade, add them again through the Zeppelin interpreter UI page.
   c. Verify that your existing notebooks are available on Zeppelin.
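The pre-upgrade copy in step 1 shows only the verification listings; one way to perform the copies themselves is sketched below. The hdfs dfs -mkdir/-put invocations are a reasonable approach rather than commands taken from this guide, the local paths are examples to substitute for your cluster, and the wrapper prints each command (DRY_RUN=1, the default here) instead of executing it.

```shell
#!/bin/sh
# Sketch of the step 1 pre-upgrade copy (run as the zeppelin service user).
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Example local paths; substitute your HDP version and conf directory.
LOCAL_NOTEBOOKS="/usr/hdp/2.5.3.0-37/zeppelin/notebook"
LOCAL_CONF="/etc/zeppelin/conf"

# Step 1a: create the target directories in HDFS.
run hdfs dfs -mkdir -p /user/zeppelin/conf /user/zeppelin/notebook

# Step 1b: copy all notebooks to HDFS.
run hdfs dfs -put "${LOCAL_NOTEBOOKS}"/* /user/zeppelin/notebook/

# Step 1c: copy the two configuration files to HDFS.
run hdfs dfs -put "${LOCAL_CONF}/interpreter.json" \
    "${LOCAL_CONF}/notebook-authorization.json" /user/zeppelin/conf/
```

Run with DRY_RUN=0 on a node with the HDFS client configured to execute the copies for real; the hdfs dfs -ls listings in step 1 can then be used to confirm the result.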
Note: When an existing notebook is opened for the first time after the upgrade, it may ask you to save the interpreters associated with the notebook.

Use Local Storage when Upgrading to HDP-2.6.3+
Perform the following steps to use local notebook storage when upgrading to HDP 2.6.3+ from earlier versions of HDP.

Procedure
1. Upgrade Ambari.

2. Upgrade HDP and Zeppelin. During the upgrade, verify that the following configuration settings are present in Ambari for Zeppelin:

      zeppelin.notebook.storage = org.apache.zeppelin.notebook.repo.VFSNotebookRepo
      zeppelin.config.fs.dir = file:///etc/zeppelin/conf

   If necessary, add or update these configuration settings as shown above.
3. After the upgrade is complete:
   a. Copy your notebooks and the notebook-authorization.json file from the previous Zeppelin installation directory to the new installation directory on the Zeppelin server machine.
   b. Verify that your existing notebooks are available on Zeppelin.

Note: When an existing notebook is opened for the first time after the upgrade, it may ask you to save the interpreters associated with the notebook.
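Step 3a of the local-storage procedure can be sketched the same way. Both installation directories below are hypothetical examples (the old path uses an HDP 2.5.3 layout, the new path the /usr/hdp/current symlink); substitute the actual directories for your cluster. Commands are printed, not executed, by default.

```shell
#!/bin/sh
# Sketch of step 3a: copy notebooks and notebook-authorization.json from the
# previous Zeppelin installation directory to the new one.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

OLD_ZEPPELIN="/usr/hdp/2.5.3.0-37/zeppelin"      # hypothetical old install dir
NEW_ZEPPELIN="/usr/hdp/current/zeppelin-server"  # hypothetical new install dir

run cp -r "${OLD_ZEPPELIN}/notebook" "${NEW_ZEPPELIN}/"
run cp "${OLD_ZEPPELIN}/conf/notebook-authorization.json" "${NEW_ZEPPELIN}/conf/"
```

After copying, restart Zeppelin from Ambari and verify the notebooks as in step 3b.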