
Hortonworks Technical Preview for Apache Falcon
Architecting the Future of Big Data
Released: 11/20/2013

© 2013 Hortonworks Inc. All Rights Reserved.

Welcome to the Hortonworks technical preview for Apache Falcon. The technical preview provides early access to upcoming features in the Hortonworks product, letting you test and review them during the development process. These features are under development: although your feedback is greatly appreciated, they are not intended for use in production systems and are not supported by Hortonworks. Have fun, and please send us feedback on the community forums: http://hortonworks.com/community/forums/

Contents

Apache Falcon Introduction
    Framework Entities
    Usage
    Architecture
    Deployment Options
    Utilities and APIs
System Requirements
    Hortonworks Data Platform
    Operating Systems
    Software Requirements
    JDK Requirements
Installation
    Install Standalone Falcon Server (Recommended)
    Install Distributed Falcon Prism Server
    Install Falcon Documentation
Setup
    Configuring Oozie
    Using Basic Commands
    Using the CLI
    Using the REST API
    Configuring the Store
Removing Falcon
Known Issues and Limitations
Troubleshooting
Further Reading

Apache Falcon Introduction

Apache Falcon provides a framework for simplifying the development of data management applications in Apache Hadoop. Falcon enables users to automate the movement and processing of data sets. Instead of hard-coding complex data set and pipeline processing capabilities, Hadoop applications can now rely on the Apache Falcon framework for these functions:

Dataset Replication - Replicate data sets (whether HDFS files or Hive tables) as part of your disaster recovery, backup, and archival solution. Falcon can trigger processes for retry and handle late data arrival logic.
Dataset Lifecycle Management - Establish retention policies for data sets, and Falcon will schedule eviction.
Dataset Traceability / Lineage - Use Falcon to view coarse-grained dependencies between clusters, data sets, and processes.

Framework Entities

The Falcon framework defines the fundamental building blocks for data processing applications using entities such as Feeds, Processes, and Clusters. A Hadoop user establishes relationships between entities, and Falcon handles the management, coordination, and scheduling of data set processing.

Cluster - Represents the interfaces to a Hadoop cluster.
Feed - Defines a data set with its location, replication schedule, and retention policy.
Process - Consumes and processes Feeds.
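For illustration, a minimal Cluster entity might look like the following sketch. All endpoints, version numbers, and paths are placeholders for your environment; the Falcon Entity Specification (linked under Usage and Further Reading) is the authoritative schema.

    <cluster name="primary-cluster" colo="datacenter-1" description="example cluster"
             xmlns="uri:falcon:cluster:0.1">
      <interfaces>
        <!-- Placeholders: point these at your own NameNode, ResourceManager,
             Oozie server, and ActiveMQ broker. -->
        <interface type="readonly" endpoint="hftp://namenode:50070" version="2.2.0"/>
        <interface type="write" endpoint="hdfs://namenode:8020" version="2.2.0"/>
        <interface type="execute" endpoint="resourcemanager:8050" version="2.2.0"/>
        <interface type="workflow" endpoint="http://oozieserver:11000/oozie/" version="4.0.0"/>
        <interface type="messaging" endpoint="tcp://falconserver:61616?daemon=true" version="5.1.6"/>
      </interfaces>
      <locations>
        <location name="staging" path="/apps/falcon/staging"/>
        <location name="temp" path="/tmp"/>
        <location name="working" path="/apps/falcon/working"/>
      </locations>
    </cluster>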

Usage

Using the entity definitions, you can create entity specifications (http://falcon.incubator.apache.org/docs/entityspecification.html) and submit them to Falcon for execution. The high-level Falcon operations are:

1. Submit entity specifications into Falcon for Clusters.
2. Design Feeds and Processes and associate them with Clusters.
3. Designate replication schedules and retention policies on the Feeds.
4. Submit entity specifications into Falcon for Feeds and Processes.
5. Instruct Falcon to schedule execution of the Feed and Process entities.
6. Use Falcon to manage instances of execution (status, suspend, resume, kill, re-run).

A CLI sketch of these operations appears after the Architecture overview below.

Architecture

Falcon essentially transforms entity definitions into repeated actions through a workflow scheduler. All of the functions and workflow state management requirements are delegated to the scheduler. By default, Falcon uses Apache Oozie as the scheduler.
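As promised above, here is the usage lifecycle expressed with the Falcon CLI (described under Utilities and APIs). The entity names and file paths are hypothetical; adapt them to your own entity specifications.

    # Submit the cluster first, then the feeds and processes that reference it.
    falcon entity -type cluster -submit -file /tmp/primary-cluster.xml
    falcon entity -type feed -submit -file /tmp/raw-input-feed.xml
    falcon entity -type process -submit -file /tmp/etl-process.xml

    # Schedule the feed and process entities for execution.
    falcon entity -type feed -schedule -name rawInputFeed
    falcon entity -type process -schedule -name etlProcess

    # Manage instances of execution, for example checking status over a time window.
    falcon instance -type process -name etlProcess -status \
        -start 2013-11-01T00:00Z -end 2013-11-02T00:00Z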

Deployment Options

Falcon can run in either standalone or distributed mode. In standalone mode, a single Falcon Server is installed and managed. This Falcon Server can work with and process data sets for one or more clusters in one or more data centers. If you plan to manage multiple clusters across data centers with multiple instances of Falcon Server, you can deploy a Falcon Prism Server for distributed management of the Falcon Servers.

Standalone

When replicating data within the same cluster, or to a different cluster in the same colo, you can run a single Falcon Server instance standalone.

Distributed

When running multiple Falcon Server instances across clusters and colos, you can install a Falcon Prism Server for distributed management of the Falcon Server instances.

Utilities and APIs

Whether you are managing standalone Falcon Servers or a Falcon Prism Server in a distributed deployment, Falcon provides a command line interface (CLI) (http://falcon.incubator.apache.org/docs/falconcli.html) and a REST API (http://falcon.incubator.apache.org/docs/restapi/resourcelist.html). The CLI and REST API both expose Entity Management, Instance Management, and Admin operations.

System Requirements

The Falcon Technical Preview has minimum system requirements in the following areas:

Hortonworks Data Platform (HDP)
Operating Systems
Software Requirements
JDK Requirements

Hortonworks Data Platform

Falcon requires HDP 2.0 GA.

Note: You must have Oozie installed and configured on your cluster.

Operating Systems

64-bit RHEL 6
64-bit CentOS 6
64-bit Oracle Linux 6

Software Requirements

yum
rpm
wget
java (see JDK Requirements)

JDK Requirements

Your system must have the correct JDK installed on all hosts that will run Falcon. The following JDKs are supported:

Oracle JDK 1.7 64-bit
Oracle JDK 1.6 update 31 64-bit
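To confirm the JDK on a host, you can check the version Java reports. The expected output below is only a sketch; exact build strings will vary with your installation.

    java -version
    # expect a 64-bit Oracle JDK, e.g. java version "1.6.0_31" or a 1.7 release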

Installation

Install the Standalone Falcon Server (recommended for first-time Falcon users) or the Distributed Falcon Prism Server on a host that has the Hadoop packages installed. You can run the following command to confirm that the Hadoop packages are installed:

    hadoop version

Install Standalone Falcon Server (Recommended)

Complete the following steps to install the Falcon standalone server:

1. Download the Falcon Server package.

    wget http://public-repo-1.hortonworks.com/hdp-labs/projects/falcon/2.0.6.0-76/rpm/falcon-0.4.0.2.0.6.0-76.el6.noarch.rpm

2. Install the package using RPM.

    rpm -Uvh falcon-0.4.0.2.0.6.0-76.el6.noarch.rpm

The Falcon Server software is installed here: /usr/lib/falcon

Install Distributed Falcon Prism Server

Complete the following steps to install the distributed Falcon Prism server:

Note: Do not install the Distributed Falcon Prism Server package on the same host as a Falcon Server.

1. Download the Falcon Distributed package.

    wget http://public-repo-1.hortonworks.com/hdp-labs/projects/falcon/2.0.6.0-76/rpm/falcon-distributed-0.4.0.2.0.6.0-76.el6.noarch.rpm

2. Install the package using RPM.

    rpm -Uvh falcon-distributed-0.4.0.2.0.6.0-76.el6.noarch.rpm

The Falcon Prism Server software is installed here: /usr/lib/falcon-distributed
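After installing either package, a quick sanity check is to query the RPM database. This is only a sketch; the reported version strings depend on the package you installed.

    rpm -q falcon               # standalone server, if installed
    rpm -q falcon-distributed   # Prism server, if installed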

Install Falcon Documentation

To install the Falcon project documentation:

1. Download the Falcon Documentation package.

    wget http://public-repo-1.hortonworks.com/hdp-labs/projects/falcon/2.0.6.0-76/rpm/falcon-doc-0.4.0.2.0.6.0-76.el6.noarch.rpm

2. Install the package using RPM.

    rpm -Uvh falcon-doc-0.4.0.2.0.6.0-76.el6.noarch.rpm

3. Browse to the Falcon documentation index page: /usr/share/doc/falcon-0.4.0.2.0.6.0-76/index.html

Setup

This section covers:

Configuring Oozie
Using Basic Commands
Using the CLI
Using the REST API
Configuring the Store
Removing Falcon

Configuring Oozie

After installing Falcon, you must make the following configuration changes to Oozie:

1. Add the following properties to the /etc/oozie/conf/oozie-site.xml file (property and class names are case-sensitive):

    <property>
      <name>oozie.service.ProxyUserService.proxyuser.falcon.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>oozie.service.ProxyUserService.proxyuser.falcon.groups</name>
      <value>*</value>
    </property>
    <property>
      <name>oozie.service.URIHandlerService.uri.handlers</name>
      <value>org.apache.oozie.dependency.FSURIHandler,org.apache.oozie.dependency.HCatURIHandler</value>
    </property>
    <property>
      <name>oozie.services.ext</name>
      <value>
        org.apache.oozie.service.JMSAccessorService,
        org.apache.oozie.service.PartitionDependencyManagerService,
        org.apache.oozie.service.HCatAccessorService
      </value>
    </property>

    <!-- Coord EL Functions Properties -->

    <property>
      <name>oozie.service.ELService.ext.functions.coord-job-submit-instances</name>
      <value>
        now=org.apache.oozie.extensions.OozieELExtensions#ph1_now_echo,
        today=org.apache.oozie.extensions.OozieELExtensions#ph1_today_echo,
        yesterday=org.apache.oozie.extensions.OozieELExtensions#ph1_yesterday_echo,
        currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph1_currentMonth_echo,
        lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph1_lastMonth_echo,
        currentYear=org.apache.oozie.extensions.OozieELExtensions#ph1_currentYear_echo,
        lastYear=org.apache.oozie.extensions.OozieELExtensions#ph1_lastYear_echo,
        formatTime=org.apache.oozie.coord.CoordELFunctions#ph1_coord_formatTime_echo,
        latest=org.apache.oozie.coord.CoordELFunctions#ph2_coord_latest_echo,
        future=org.apache.oozie.coord.CoordELFunctions#ph2_coord_future_echo
      </value>
    </property>
    <property>
      <name>oozie.service.ELService.ext.functions.coord-action-create-inst</name>
      <value>
        now=org.apache.oozie.extensions.OozieELExtensions#ph2_now_inst,
        today=org.apache.oozie.extensions.OozieELExtensions#ph2_today_inst,
        yesterday=org.apache.oozie.extensions.OozieELExtensions#ph2_yesterday_inst,
        currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_currentMonth_inst,
        lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_lastMonth_inst,
        currentYear=org.apache.oozie.extensions.OozieELExtensions#ph2_currentYear_inst,
        lastYear=org.apache.oozie.extensions.OozieELExtensions#ph2_lastYear_inst,
        latest=org.apache.oozie.coord.CoordELFunctions#ph2_coord_latest_echo,
        future=org.apache.oozie.coord.CoordELFunctions#ph2_coord_future_echo,
        formatTime=org.apache.oozie.coord.CoordELFunctions#ph2_coord_formatTime,
        user=org.apache.oozie.coord.CoordELFunctions#coord_user
      </value>
    </property>
    <property>
      <name>oozie.service.ELService.ext.functions.coord-action-create</name>
      <value>
        now=org.apache.oozie.extensions.OozieELExtensions#ph2_now,
        today=org.apache.oozie.extensions.OozieELExtensions#ph2_today,
        yesterday=org.apache.oozie.extensions.OozieELExtensions#ph2_yesterday,
        currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_currentMonth,
        lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_lastMonth,
        currentYear=org.apache.oozie.extensions.OozieELExtensions#ph2_currentYear,
        lastYear=org.apache.oozie.extensions.OozieELExtensions#ph2_lastYear,
        latest=org.apache.oozie.coord.CoordELFunctions#ph2_coord_latest_echo,
        future=org.apache.oozie.coord.CoordELFunctions#ph2_coord_future_echo,
        formatTime=org.apache.oozie.coord.CoordELFunctions#ph2_coord_formatTime,
        user=org.apache.oozie.coord.CoordELFunctions#coord_user
      </value>
    </property>
    <property>
      <name>oozie.service.ELService.ext.functions.coord-job-submit-data</name>
      <value>
        now=org.apache.oozie.extensions.OozieELExtensions#ph1_now_echo,
        today=org.apache.oozie.extensions.OozieELExtensions#ph1_today_echo,
        yesterday=org.apache.oozie.extensions.OozieELExtensions#ph1_yesterday_echo,
        currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph1_currentMonth_echo,
        lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph1_lastMonth_echo,
        currentYear=org.apache.oozie.extensions.OozieELExtensions#ph1_currentYear_echo,
        lastYear=org.apache.oozie.extensions.OozieELExtensions#ph1_lastYear_echo,
        dataIn=org.apache.oozie.extensions.OozieELExtensions#ph1_dataIn_echo,
        instanceTime=org.apache.oozie.coord.CoordELFunctions#ph1_coord_nominalTime_echo_wrap,
        formatTime=org.apache.oozie.coord.CoordELFunctions#ph1_coord_formatTime_echo,
        dateOffset=org.apache.oozie.coord.CoordELFunctions#ph1_coord_dateOffset_echo,
        user=org.apache.oozie.coord.CoordELFunctions#coord_user
      </value>
    </property>
    <property>
      <name>oozie.service.ELService.ext.functions.coord-action-start</name>
      <value>
        now=org.apache.oozie.extensions.OozieELExtensions#ph2_now,
        today=org.apache.oozie.extensions.OozieELExtensions#ph2_today,
        yesterday=org.apache.oozie.extensions.OozieELExtensions#ph2_yesterday,
        currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_currentMonth,
        lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_lastMonth,
        currentYear=org.apache.oozie.extensions.OozieELExtensions#ph2_currentYear,
        lastYear=org.apache.oozie.extensions.OozieELExtensions#ph2_lastYear,
        latest=org.apache.oozie.coord.CoordELFunctions#ph3_coord_latest,
        future=org.apache.oozie.coord.CoordELFunctions#ph3_coord_future,
        dataIn=org.apache.oozie.extensions.OozieELExtensions#ph3_dataIn,
        instanceTime=org.apache.oozie.coord.CoordELFunctions#ph3_coord_nominalTime,
        dateOffset=org.apache.oozie.coord.CoordELFunctions#ph3_coord_dateOffset,
        formatTime=org.apache.oozie.coord.CoordELFunctions#ph3_coord_formatTime,
        user=org.apache.oozie.coord.CoordELFunctions#coord_user
      </value>
    </property>

    <property>
      <name>oozie.service.ELService.ext.functions.coord-sla-submit</name>
      <value>
        instanceTime=org.apache.oozie.coord.CoordELFunctions#ph1_coord_nominalTime_echo_fixed,
        user=org.apache.oozie.coord.CoordELFunctions#coord_user
      </value>
    </property>
    <property>
      <name>oozie.service.ELService.ext.functions.coord-sla-create</name>
      <value>
        instanceTime=org.apache.oozie.coord.CoordELFunctions#ph2_coord_nominalTime,
        user=org.apache.oozie.coord.CoordELFunctions#coord_user
      </value>
    </property>

2. Stop your Oozie server.

    su oozie
    cd /usr/lib/oozie
    bin/oozie-stop.sh

3. Confirm the Oozie Share Lib is installed on your cluster. You can install the Oozie Share Lib using the Oozie community quick start docs: http://oozie.apache.org/docs/4.0.0/DG_QuickStart.html#Oozie_Share_Lib_Installation

4. Copy the existing Oozie WAR file to /usr/lib/oozie/oozie.war. This ensures that all existing items in the WAR file are still present after the current update.

    su root
    cp ${CATALINA_BASE}/webapps/oozie.war /usr/lib/oozie/oozie.war

Where ${CATALINA_BASE} is the path for the Oozie web app. By default, ${CATALINA_BASE} is /var/lib/oozie/oozie-server.

5. Add the Falcon EL extensions to Oozie. Copy the extension JAR files provided with the Falcon Server to a temporary directory on the Oozie server. For example, if your standalone Falcon Server is on the same machine as your Oozie server, you can simply copy the JAR files:

    mkdir /tmp/falcon-oozie-jars
    cp /usr/lib/falcon/oozie/ext/falcon-oozie-el-extension-0.4.0.2.0.6.0-76.jar /tmp/falcon-oozie-jars/

6. Package the Oozie WAR file, including the Falcon EL extension JARs.

    su oozie
    cd /usr/lib/oozie
    bin/oozie-setup.sh prepare-war -d /tmp/falcon-oozie-jars

7. Start your Oozie server.

    su oozie
    cd /usr/lib/oozie
    bin/oozie-start.sh

Using Basic Commands

Commands differ between standalone and distributed Falcon and are arranged accordingly below. Verify that you are running the right commands for your system.

Standalone

Starting the Falcon Server (standalone):

    su falcon
    /usr/lib/falcon/bin/falcon-start

Stopping the Falcon Server (standalone):

    su falcon
    /usr/lib/falcon/bin/falcon-stop

Browsing the Falcon logs (standalone): /var/log/falcon/

Running the Falcon CLI (standalone): /usr/lib/falcon/bin/falcon

Accessing the Falcon Server (standalone): http://{your.falcon.server}:15000
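Once the standalone server is started, one way to confirm it is responding is the CLI's admin version option (a quick sanity check; see Using the CLI below):

    su falcon
    /usr/lib/falcon/bin/falcon admin -version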

Distributed

Starting the Falcon Prism Server (distributed):

    su falcon
    /usr/lib/falcon-distributed/bin/falcon-start
    /usr/lib/falcon-distributed/bin/prism-start

Stopping the Falcon Prism Server (distributed):

    su falcon
    /usr/lib/falcon-distributed/bin/falcon-stop
    /usr/lib/falcon-distributed/bin/prism-stop

Browsing the Falcon logs (distributed): /var/log/falcon-distributed/

Running the Falcon CLI (distributed): /usr/lib/falcon-distributed/bin/falcon

Accessing the Falcon Prism Server (distributed): http://{your.falcon.prism.server}:16000

Using the CLI

The Falcon CLI lets you perform Entity Management, Instance Management, and Admin operations from the command line. Run the CLI with the help option to see a list of commands:

    falcon help

Using the REST API

The Falcon REST API lets you perform Entity Management, Instance Management, and Admin operations from a REST utility (for example, curl). When accessing the Falcon Server REST API, be sure to pass the Remote-user header. For example, the following command retrieves the Falcon version using curl:

    curl -H "Remote-user: root" http://{your.falcon.server}:15000/api/admin/version

Note: By default, the Falcon Server runs on port 15000 in standalone mode and on port 16000 in distributed mode.

You can learn more about the Falcon REST API here: http://falcon.incubator.apache.org/docs/restapi/resourcelist.html
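As another example, the REST API's resource list includes an entities/list operation; the following sketch lists the Cluster entities that have been submitted (endpoint path per the REST API docs linked above):

    curl -H "Remote-user: root" http://{your.falcon.server}:15000/api/entities/list/cluster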

Configuring the Store

Once an entity definition is submitted to Falcon, by default the XML is stored in the local file system where the Falcon Server is running. This is configured in the Falcon Server startup properties:

    vi /etc/falcon/conf/startup.properties

    ######### Implementation classes #########
    *.config.store.uri=file://${falcon.home}/store

The local file store is useful during development and testing. In production, you should change this setting to store entity definitions in HDFS. One way to do this is to modify the property and restart your Falcon Server:

    vi /etc/falcon/conf/startup.properties

    ######### Implementation classes #########
    *.config.store.uri=hdfs://{namenode.hostname}:8020/apps/falcon/store

Note: Be sure to create the /apps/falcon directory in HDFS, with ownership set to the falcon user and group and 755 permissions. For example, a listing should show:

    drwxr-xr-x falcon hdfs /apps/falcon

Removing Falcon

To completely remove Falcon from a host:

1. Connect to the host running the Falcon Server.

2. Remove the Falcon packages.

    yum erase falcon -y
    yum erase falcon-distributed -y

3. Delete the Falcon logs directory.

    rm -r /var/log/falcon
    rm -r /var/log/falcon-distributed

4. Delete the Falcon conf directory.

    rm -r /etc/falcon/conf
    rm -r /etc/falcon-distributed/conf

5. Delete the Falcon run directory.

    rm -r /var/run/falcon
    rm -r /var/run/falcon-distributed

6. Delete the Falcon lib directory.

    rm -r /var/lib/falcon
    rm -r /var/lib/falcon-distributed

Known Issues and Limitations

At the time of this release, Falcon has the following known issue:

HIVE-5550: Import fails for tables created with default text and sequence file formats using the HCatalog API.

For more information, see the Falcon documentation included in the falcon-doc package.

Note: Visit the forum for the latest discussions of Hortonworks issues: http://hortonworks.com/community/forums/

Troubleshooting

The following troubleshooting information is available:

Falcon Server fails to start with a "jar: command not found" error
Cluster entity definition submit hangs

Falcon Server fails to start with a "jar: command not found" error

    bash$ /usr/lib/falcon/bin/falcon-start
    /usr/lib/falcon/bin/falcon-config.sh: line 82: jar: command not found
    /usr/lib/falcon/bin Hadoop is installed, adding hadoop classpath to falcon classpath
    prism started using hadoop version: Hadoop 2.2.0.2.0.6.0-76

Before starting the server, be sure you have exported JAVA_HOME and put the Java binaries in the path:

    export JAVA_HOME=/usr/jdk64/jdk1.6.0_31
    export PATH=$PATH:$JAVA_HOME/bin

Alternatively, you can set up your environment to define JAVA_HOME in /etc/falcon/conf/falcon-env.sh or /etc/falcon-distributed/conf/falcon-env.sh.

Cluster entity definition submit hangs

Check the /var/log/falcon application log (or /var/log/falcon-distributed if running distributed) for retry entries.

Retry entries indicate that Falcon was unable to verify one or more interfaces in the cluster entity definition. The operation will time out after the retries are complete.

    2013-11-13 03:05:00,960 INFO - [1962083476@qtp-184766585-0:ambari-qa:POST//entities/submit/cluster 51d22bfc-d54d-4691-95e4-364eb85c3f0e] ~ Retrying connect to server: c6402.ambari.apache.org/192.168.64.102:8050. Already tried 47 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleeptime=1 SECONDS) (Client:783)

Further Reading

The Apache Falcon docs are available here:

Falcon Project Page: http://falcon.incubator.apache.org/
Falcon Project JIRA: https://issues.apache.org/jira/browse/falcon
Falcon CLI: http://falcon.incubator.apache.org/docs/falconcli.html
Falcon REST API: http://falcon.incubator.apache.org/docs/restapi/resourcelist.html
Falcon Entity Specification: http://falcon.incubator.apache.org/docs/entityspecification.html

About Hortonworks

Hortonworks is a leading commercial vendor of Apache Hadoop, the preeminent open source platform for storing, managing, and analyzing big data. The Hortonworks Data Platform provides an open and stable foundation for enterprises and a growing ecosystem to build and deploy big data solutions. Hortonworks is the trusted source for information on Hadoop, and together with the Apache community, Hortonworks is making Hadoop easier to install, manage, and use. Hortonworks provides technical support, training, and certification programs for enterprises, systems integrators, and technology vendors.

3460 W. Bayshore Rd.
Palo Alto, CA 94303 USA
US: 1.855.846.7866
International: 1.408.916.4121
www.hortonworks.com