StreamSets Control Hub Installation Guide


StreamSets Control Hub Installation Guide Version , StreamSets, Inc. All rights reserved.

Table of Contents

Chapter 1: What's New
- What's New in
- What's New in
- What's New in
- What's New in
- What's New in
- What's New in

Chapter 2: Installation Overview
- Architecture
  - Applications
  - Databases
- System Data Collector
- Data Collector Communication
- SDC Edge Communication
- Provisioning Agent Communication
- High Availability
- Authentication
- Installation Requirements
  - Relational Database Requirements
  - Time Series Database Requirements
  - Default Ports

Chapter 3: Installation
- Overview
- Creating the Databases
  - Step 1. Install the Database Software
  - Step 2. Create Databases in the Relational Database
  - Step 3. Create Databases in the Time Series Database
- Installing Control Hub
  - Step 1. Install the System Data Collector
  - Step 2. Set Up Time Synchronization
  - Step 3. Install from the Tarball or RPM Package
  - Step 4. Download the JDBC Driver
  - Step 5. Set Environment Variables
  - Step 6. Set Up Control Hub
  - Step 7. Enable PostgreSQL for the Scheduler Application (Optional)
  - Step 8. Build Schemas in the Relational Database
  - Step 9. Generate Authentication Tokens for Applications
  - Step 10. Activate the Control Hub License
  - Step 11. Start Control Hub
  - Step 12. Log Into Control Hub
  - Step 13. Enable LDAP Authentication (Optional)
  - Step 14. Create a Backup System Administrator
  - Step 15. Create Organizations
- Setting Up a Highly Available Environment
  - Step 1. Set Up Time Synchronization
  - Step 2. Set Up a Load Balancer for Control Hub
  - Step 3. Install the Initial Control Hub Instance
  - Step 4. Install Additional Control Hub Instances
  - Step 5. Start Each Control Hub Instance
- Uninstalling Control Hub

Chapter 4: Administration
- Overview
- Organizations
- System Organization and Administrator
- Organizations and Groups
- Global Organization Configuration
- Dashboards
- Messaging View
- Pipeline Templates
- Data Collector Version Range
- Administer Control Hub Applications
- Logs
- Control Hub Configuration
- Configuring HTTPS Properties
- Customizing the StreamSets Logo
- Control Hub Environment Configuration
- Control Hub Directories
- User and Group for Service Start
- Java Configuration Options
- Shutting Down Control Hub
- Renewing the Control Hub License
- Regenerating the System ID
- Control Hub Admin Tool
- Configuring Users
- Modifying the Log Level

Chapter 5: Upgrade
- Overview
- Preparing for the Upgrade
  - Step 1. Shut Down the Previous Version
  - Step 2. Back Up the Databases
  - Step 3. Create New Databases
- Upgrading the Initial Control Hub Instance
  - Step 1. Back Up the Previous RPM Directories
  - Step 2. Install or Upgrade the System Data Collector
  - Step 3. Install the New Version
  - Step 4. Set Up the New Version
  - Step 5. Enable PostgreSQL for the Scheduler Application (Optional)
  - Step 6. Configure Component IDs (Optional)
  - Step 7. Update Schemas in the Relational Database
  - Step 8. Generate Authentication Tokens
  - Step 9. Activate the Control Hub License
- Upgrading a Highly Available Environment
  - Step 1. Upgrade Additional Control Hub Instances
  - Step 2. Update the Load Balancer
- Starting and Logging Into the Upgraded Control Hub
- Completing Post Upgrade Tasks
  - Delete the Previous Applications
  - Assign New Roles to Upgraded Users
  - Modify the Pipeline Template Organization

Chapter 1
What's New

This chapter includes the following information:
- What's New in
- What's New in
- What's New in
- What's New in
- What's New in
- What's New in 3.0.0

What's New in

StreamSets Control Hub version fixes the following known issues:
- Viewing pipeline details from the Topology view causes an error to occur.
- Time series charts for jobs cannot be viewed from the Topology view even though time series analysis is enabled.
- When a Kubernetes pod is restarted, the Provisioning Agent fails to register the Data Collector containers with Control Hub.

What's New in

StreamSets Control Hub version includes the following new features:

Activate the Control Hub License
Each Control Hub system now requires an active license. During the installation process, you use the Control Hub security command line program to generate a unique system ID and then request an activation key for that system ID from the StreamSets support team. After you receive the activation key, you use the security command line program to activate the license.
Each activation key is generated for a specific Control Hub system ID. If you install multiple Control Hub instances for a highly available system, you only need to activate the license once.

Pipeline Fragments
Control Hub now includes pipeline fragments. A pipeline fragment is a stage or set of connected stages that you can reuse in Data Collector or SDC Edge pipelines. Use pipeline fragments to easily add the same processing logic to multiple pipelines and to ensure that the logic is used as designed.
Pipeline fragments can only be created in the Control Hub Pipeline Designer. You can use any stage available in the authoring Data Collector in a fragment. Pipeline fragments cannot be designed within the Data Collector user interface.

Scheduler
Control Hub now includes a scheduler that manages long-running scheduled tasks. A scheduled task periodically triggers the execution of a job or a data delivery report at the specified frequency. For example, a scheduled task can start a job or generate a data delivery report on a weekly or monthly basis.
Before you can schedule jobs and data delivery reports, the Scheduler Operator role must be assigned to your user account.

Data Delivery Reports
Control Hub now includes data delivery reports that show how much data was processed by a job or topology over a given period of time. You can create periodic reports with the scheduler, or create an on-demand report.
Before you can manage data delivery reports, the Reporting Operator role must be assigned to your user account.

Jobs
- Edit a pipeline version directly from a job - When viewing the details of a job or monitoring a job, you can now edit the latest version of the pipeline directly from the job. Previously, you had to locate the pipeline in the Pipeline Repository view before you could edit the pipeline.
- Enable time series analysis - You can now enable time series analysis for a job. When enabled, you can view historical time series data when you monitor the job or a topology that includes the job.

  When time series analysis is disabled, you can still view the total record count and throughput for a job or topology, but you cannot view the data over a period of time. For example, you can't view the record count for the last five minutes or for the last hour.
  By default, all existing jobs have time series analysis enabled. All new jobs have time series analysis disabled. You might want to enable time series analysis for new jobs for debugging purposes or to analyze dataflow performance.
- Pipeline force stop timeout - In some situations when you stop a job, a remote pipeline instance can remain in a Stopping state for a long time. When you configure a job, you can now configure the number of milliseconds to wait before forcing remote pipeline instances to stop. The default time to force a pipeline to stop is 2 minutes (120,000 milliseconds).
- View logs - While monitoring an active job, the top toolbar now includes a View Logs icon that displays the logs for any remote pipeline instance run from the job.

Subscriptions
- Email action - You can now create a subscription that listens for Control Hub events and then sends an email when those events occur. For example, you might send an email each time a job status changes.
- Pipeline committed event - You can configure an action for a pipeline committed event. For example, you might send a message when a pipeline is committed with the name of the user who committed it.
- Filter the events to subscribe to - You can now use the StreamSets expression language to create an expression that filters the events that you want to subscribe to. You can include subscription parameters and StreamSets string functions in the expression.
  For example, you might enter the following expression for a Job Status Change event so that the subscription is triggered only when the specified job ID encounters a status change:
    ${JOB_ID == '99efe399-7fb e27-e4c56b53db31:MyCompany'}
  If you do not filter the events, then the subscription is triggered each time an event occurs for all objects that you have at least read permission on.
- Permissions - When permission enforcement is enabled for your organization, you can now share and grant permissions on subscriptions.

Provisioned Data Collectors
When you define a deployment YAML specification file for provisioned Data Collectors, you can now optionally associate a Kubernetes Horizontal Pod Autoscaler, service, or Ingress with the deployment.
Define a deployment and Horizontal Pod Autoscaler in the specification file for a deployment of one or more execution Data Collectors that must automatically scale during times of peak performance. The Kubernetes Horizontal Pod Autoscaler automatically scales the deployment based on CPU utilization.
Define a deployment and service in the specification file for a deployment of a single development Data Collector that must be exposed outside the cluster using a Kubernetes service. Optionally associate an Ingress with the service to provide load balancing, SSL termination, and virtual hosting to the service in the Kubernetes cluster.

New Configuration Files
This release includes the following new configuration files located in the $DPM_CONF directory:
- reporting-app.properties - Used to configure the Reporting application.
- scheduler-app.properties - Used to configure the Scheduler application.

What's New in

StreamSets Control Hub version includes the following new feature:

View logs for an active job
When monitoring an active job, you can now view the logs for a remote pipeline instance from the Data Collectors tab.

What's New in

StreamSets Control Hub version includes the following new features:

System Data Collector
You can now configure the system Data Collector connection properties when you run the Control Hub setup script. Previously, you had to modify the $DPM_CONF/common-to-all-apps.properties file to configure the system Data Collector properties.

Pipelines
Pipelines include the following enhancements:
- Duplicate pipelines - You can now select a pipeline in the Pipeline Repository view and then duplicate the pipeline. A duplicate is an exact copy of the original pipeline.
- Commit message when publishing pipelines - You can now enter commit messages when you publish pipelines from Pipeline Designer. Previously, you could only enter commit messages when you published pipelines from a registered Data Collector.

Export and Import
You can now use Control Hub to export and import the following objects:
- Jobs and topologies - You can now export and import jobs and topologies to migrate the objects from one organization to another. You can export a single job or topology or you can export a set of jobs and topologies.
  When you export and import jobs and topologies, you also export and import dependent objects. For jobs, you also export and import the pipelines included in the jobs. For topologies, you also export and import the jobs and pipelines included in the topologies.
- Sets of pipelines - You can now select multiple pipelines in the Pipeline Repository view and export the pipelines as a set to a ZIP file. You can also now import pipelines from a ZIP file containing multiple pipeline files.

Subscriptions
You can now create a subscription so that Control Hub sends an alert to an external system when an event occurs.
For example, you might create a subscription that sends an alert to a Slack channel each time a job status changes.
When you create a subscription, you select the Control Hub events to subscribe to - such as a changed job status or a triggered data SLA. You then configure the action to take when the events occur - such as using a webhook to send an HTTP request. Webhooks automatically trigger external tasks based on an HTTP request.
Important: By default, an organization is not enabled to send events that trigger subscriptions. Before Control Hub can trigger subscriptions for your organization, your organization administrator must enable events for the organization.

Scale Out Active Jobs
When the Number of Instances property for a job is set to -1, Control Hub can now automatically scale out pipeline processing for the active job. When Number of Instances is set to any other value, you must synchronize the active job to start additional pipeline instances on newly available Data Collectors or Edge Data Collectors.
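The scale-out rule above can be sketched in a few lines of Python. This is an illustration only, not Control Hub code: the function name, the label sets, and the Data Collector names are all hypothetical, and the sketch models only the documented behavior that a value of -1 starts an instance on every newly available Data Collector that has all of the job's labels.

```python
from typing import Dict, List, Set


def instances_to_start(
    number_of_instances: int,
    job_labels: Set[str],
    collector_labels: Dict[str, Set[str]],
    already_running: Set[str],
) -> List[str]:
    """Return the Data Collectors that would get a new pipeline instance automatically."""
    if number_of_instances != -1:
        # Any other value: nothing starts on its own; the job must be synchronized.
        return []
    return sorted(
        name
        for name, labels in collector_labels.items()
        # A Data Collector qualifies only if it has all of the labels specified for the job.
        if job_labels <= labels and name not in already_running
    )


# Hypothetical environment: three Data Collectors already run the job.
collectors = {
    "sdc-1": {"prod", "west"},
    "sdc-2": {"prod", "west"},
    "sdc-3": {"prod", "west", "gpu"},
    "sdc-4": {"prod", "west"},  # newly registered
    "sdc-5": {"prod"},          # lacks the "west" label
}
running = {"sdc-1", "sdc-2", "sdc-3"}

print(instances_to_start(-1, {"prod", "west"}, collectors, running))  # ['sdc-4']
print(instances_to_start(3, {"prod", "west"}, collectors, running))   # []
```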

For example, if Number of Instances is set to -1 and three Data Collectors have all of the specified labels for the job, Control Hub runs three pipeline instances, one on each Data Collector. If you register another Data Collector with the same labels as the active job, Control Hub automatically starts a fourth pipeline instance on that newly available Data Collector.
Previously, you had to synchronize all active jobs - regardless of the Number of Instances value - to start additional pipeline instances on a newly registered Data Collector.

What's New in

StreamSets Control Hub version includes the following new feature:

PostgreSQL Support
Control Hub now supports PostgreSQL in addition to MySQL for the relational database that stores metadata written by Control Hub applications.

What's New in

StreamSets Control Hub version includes the following new features and enhancements:

Product Rename
With this release, we have created a new product called StreamSets Control Hub that includes a number of new dataflow design, deployment, and scale-up features. Since this release is now our core service for controlling dataflows, we have renamed "Dataflow Performance Manager (DPM)" to "StreamSets Control Hub". DPM now refers to the performance management functions such as live metrics and data SLAs. Customers who have purchased the StreamSets Enterprise Edition will gain access to all Control Hub functionality and continue to have access to all DPM functionality as before.
To understand the end-to-end StreamSets Data Operations Platform and how the products fit together, visit

Installation
StreamSets Control Hub is now supported on the CentOS 6.x and Red Hat Enterprise Linux 6.x operating systems. StreamSets provides the following RPM packages for Control Hub:
- EL6 - Use to install Control Hub on CentOS 6.x or Red Hat Enterprise Linux 6.x.
- EL7 - Use to install Control Hub on CentOS 7.x or Red Hat Enterprise Linux 7.x.

LDAP Authentication
If your company uses Lightweight Directory Access Protocol (LDAP), you can use the LDAP provider to authenticate Control Hub users and to retrieve group membership. LDAP authenticates a user using the credentials stored in the LDAP server.
LDAP authentication is configured by the system administrator for the complete Control Hub installation. After LDAP authentication is enabled, all organizations must use LDAP authentication. Users log in to Control Hub using their Control Hub user ID and their LDAP password.
To use LDAP authentication, the system administrator configures LDAP connection information for Control Hub and then maps organization administrator accounts to LDAP users. Organization administrators then create Control Hub user accounts and groups for their organization, mapping these to LDAP users and groups.

Pipeline Designer
You can now create and design pipelines directly in the Control Hub Pipeline Designer after you select an authoring Data Collector for Pipeline Designer to use. You select one of the following types of Data Collectors to use as the authoring Data Collector:
- System Data Collector - Use to design pipelines only - cannot be used to preview or explicitly validate pipelines. You configure the system Data Collector that all organizations can use as the default authoring Data Collector for exploration and light development.
- Registered Data Collector using the HTTPS protocol - Use to design, preview, and explicitly validate pipelines. Includes the stage libraries and custom stage libraries installed in the registered Data Collector.
When you create pipelines in Pipeline Designer, you can create a blank pipeline or you can create a pipeline from a template. Use pipeline templates to quickly design pipelines for typical use cases.

Provisioning Data Collectors
You can now automatically provision Data Collectors on a Kubernetes container orchestration framework. Provisioning includes deploying, registering, starting, scaling, and stopping Data Collector Docker containers in the Kubernetes cluster.
Use provisioning to reduce the overhead of managing a large number of Data Collector instances. Instead, you can manage a central Kubernetes cluster used to run multiple Data Collector containers.

Integration with Data Collector Edge
Control Hub now works with Data Collector Edge (SDC Edge) to execute edge pipelines. SDC Edge is a lightweight agent without a UI that runs pipelines on edge devices with limited resources. Edge pipelines read data from the edge device or receive data from another pipeline and then act on that data to control the edge device.
You install SDC Edge on edge devices, then register each SDC Edge with Control Hub. You assign labels to each SDC Edge to determine which jobs are run on that SDC Edge.
You either design edge pipelines in the Control Hub Pipeline Designer or in a development Data Collector. After designing edge pipelines, you publish the pipelines to Control Hub and then add the pipelines to jobs that run on a registered SDC Edge.

Pipeline Comparison
When you compare two pipeline versions, Control Hub now highlights the differences between the versions in the pipeline canvas. Previously, you had to visually compare the two versions to discover the differences between them.

Aggregated Statistics
You can now configure a pipeline to write aggregated statistics to MapR Streams.

Load Balancing for Jobs
You can now balance a job to redistribute the pipeline load across available Data Collectors that are running the fewest number of pipelines. For example, let's say that a failed pipeline restarts on another Data Collector due to the original Data Collector shutting down. When the original Data Collector restarts, you can balance the job to redistribute the pipeline to the restarted Data Collector not currently running any pipelines.

Roles
You can now assign provisioning roles to user accounts, which enable users to view and work with Provisioning Agents and deployments to automatically provision Data Collectors. You must assign the appropriate provisioning roles to users before they can access the Provisioning Agents and Deployments views in the Navigation panel.
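As a rough illustration of the job balancing described under Load Balancing for Jobs, the following Python sketch moves pipeline instances from the busiest Data Collector to the least busy one until the load is even. The guide does not describe the algorithm Control Hub actually uses; the function, the instance names, and the Data Collector names here are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def balance(assignments: Dict[str, str], collectors: List[str]) -> Dict[str, str]:
    """Move pipeline instances from the busiest to the least busy Data Collector
    until no two Data Collectors differ by more than one instance.

    assignments maps a pipeline instance name to the Data Collector running it.
    Every Data Collector named in assignments is assumed to appear in collectors.
    """
    result = dict(assignments)
    names = sorted(collectors)
    if not names:
        return result
    while True:
        # Count running instances, including Data Collectors that run none.
        counts = Counter({name: 0 for name in names})
        counts.update(result.values())
        busiest = max(names, key=lambda n: counts[n])
        idlest = min(names, key=lambda n: counts[n])
        if counts[busiest] - counts[idlest] <= 1:
            return result
        # Move one instance (the alphabetically first) to the least busy Data Collector.
        moved = min(i for i, c in result.items() if c == busiest)
        result[moved] = idlest


# sdc-1 shut down, so both instances ended up on sdc-2. sdc-1 has since restarted.
before = {"orders-1": "sdc-2", "orders-2": "sdc-2"}
print(balance(before, ["sdc-1", "sdc-2"]))
# {'orders-1': 'sdc-1', 'orders-2': 'sdc-2'}
```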

Navigation Panel
The Navigation panel now groups the Data Collectors view under an Execute menu, along with the new Edge Data Collectors, Provisioning Agents, and Deployments views:

Dashboards
The default dashboard now includes the number of users in your organization when your user account has the Organization Administrator role.

New Configuration File
This release includes a new $DPM_CONF/provisioning-app.properties file used to configure the Provisioning application.

Updated Configuration Files
The following updated configuration files include new properties for this release:

common-to-all-apps.properties
The $DPM_CONF/common-to-all-apps.properties file includes the following new properties used to configure the system Data Collector:
- pipeline.designer.system.sdc.url
- pipeline.designer.system.sdc.username
- pipeline.designer.system.sdc.password
The file also includes the following new property that is reserved for future use:
- ui.signup.enabled

pipelinestore-app.properties
The $DPM_CONF/pipelinestore-app.properties file includes the following new properties used to configure the organization that manages pipeline templates:
- pipeline.templates.organization
- pipeline.templates.organizationuser

security-app.properties
The $DPM_CONF/security-app.properties file includes new properties to configure LDAP authentication. The file also includes the following new properties that are reserved for future use:
- trial.days
- trial.maxjobs
- trial.maxtopologies
- trial.provider address
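To check that the three system Data Collector properties listed for common-to-all-apps.properties are set, a small script along the following lines could help. It is a minimal sketch: the property names come from the list above, but the sample values are invented, and the parser handles only simple key=value lines rather than the full Java properties syntax.

```python
from typing import Dict, List

# Property names as listed for $DPM_CONF/common-to-all-apps.properties.
SYSTEM_SDC_PROPERTIES = (
    "pipeline.designer.system.sdc.url",
    "pipeline.designer.system.sdc.username",
    "pipeline.designer.system.sdc.password",
)


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_system_sdc_properties(text: str) -> List[str]:
    """Return the system Data Collector properties that are absent or empty."""
    props = parse_properties(text)
    return [name for name in SYSTEM_SDC_PROPERTIES if not props.get(name)]


# Hypothetical file content; the URL and user name are made-up examples.
SAMPLE = """
# system Data Collector used as the default authoring Data Collector
pipeline.designer.system.sdc.url=https://sdc.example.com:18630
pipeline.designer.system.sdc.username=admin
pipeline.designer.system.sdc.password=
"""

print(missing_system_sdc_properties(SAMPLE))
# ['pipeline.designer.system.sdc.password']
```

In practice you would read the text from the actual file instead of the SAMPLE string.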

Chapter 2
Installation Overview

This chapter includes the following information:
- Architecture
- System Data Collector
- Data Collector Communication
- SDC Edge Communication
- Provisioning Agent Communication
- High Availability
- Authentication
- Installation Requirements

Architecture

StreamSets Control Hub architecture includes applications that manage user requests and databases that store application metadata and time series data.

Control Hub applications manage different user requests, such as designing and publishing a pipeline, starting a job, or measuring topology performance. The applications are independent and isolated from each other. They use REST APIs to communicate and cannot access the database data owned by another application. The applications are internal to Control Hub - when you install Control Hub, all of the applications are installed with Control Hub.

Control Hub uses authentication tokens to authenticate each message or request sent by an application. When you install Control Hub, you generate a unique authentication token for each application. The application includes the authentication token when it issues authenticated messages or requests to other applications.

Control Hub applications store data in a relational database and a time series database. Before you install Control Hub, you must set up the required databases.

The following image displays the architecture of Control Hub:

Applications

The following table describes each Control Hub application:

Application - Description
- Job Runner - Manages job creation and requests to start, stop, and synchronize jobs.
- Messaging - Manages the messages sent between Control Hub, registered Data Collectors, registered Edge Data Collectors, and Provisioning Agents.
- Notification - Manages and triggers subscriptions that listen for Control Hub events. Manages the display and acknowledgement of data SLA alerts configured for topologies in Control Hub. Also manages the display of all metric, data, and data drift alerts configured for pipelines.
- Pipeline Store - Manages pipeline designing, publishing, versioning, and history.
- Provisioning - Manages the automatic provisioning of Data Collectors to a container orchestration framework, such as Kubernetes. Provisioning includes deploying, registering, starting, stopping, and scaling Data Collector containers to work with Control Hub.
- Reporting - Manages data delivery reports - including creating report definitions, generating reports, and displaying generated reports.
- Scheduler - Manages the scheduling of jobs and data delivery reports.
- Security - Ensures the integrity of your data by managing the following components:
  - Registration of Data Collectors and Edge Data Collectors.
  - Management of authentication tokens for Control Hub applications, Provisioning Agents, registered Data Collectors, and registered Edge Data Collectors.
  - Authentication of each message or request sent by Control Hub applications, Provisioning Agents, registered Data Collectors, and registered Edge Data Collectors.
  - Authentication of user accounts and authorization of users through roles and permissions.
- SLA - Manages the creation and triggering of data SLA alerts for topologies.
- Time Series - Reads and stores statistics collected by Data Collectors and Edge Data Collectors running remote pipeline instances. Displays the statistics as metrics when you monitor jobs and topologies.
- Topology - Manages the creation and history of topologies.

Databases

Control Hub uses the following databases:

Relational
Control Hub applications store metadata in a relational database. Control Hub currently supports MySQL or PostgreSQL for the relational database.
The relational database includes a database for each of the following applications:
- Job Runner
- Messaging
- Notification
- Pipeline Store
- Provisioning
- Reporting
- Scheduler
- Security
- SLA
- Time Series
- Topology

Time Series
Control Hub applications store metric data in a time series database. Control Hub currently supports InfluxDB for the time series database.
The time series database includes the following databases:
- Metrics - The Time Series application writes aggregated statistics collected by Data Collectors running remote pipeline instances to this database. Control Hub users view the metrics as time series data when they monitor jobs and topologies. In addition, data delivery reports are generated using these metrics.
  The Time Series application stores all historical metrics in the time series database. The Time Series application stores the current metrics only in the relational database.
- Application Metrics - All Control Hub applications write metrics about their actions to this database. As a system administrator, you can view these application metrics as time series data in the Control Hub Admin tool to help you troubleshoot Control Hub issues.
Note: StreamSets recommends using a time series database so that users can more accurately analyze dataflow performance. However, the time series database is optional. If you choose not to set up a time series database for Control Hub, the Time Series application does not store all historical metrics. Instead, it only stores the current metrics in the relational database. Users can view the total record count and throughput for a job or topology, but they cannot view the data over a period of time nor can they generate data delivery reports.

System Data Collector

Before you can design pipelines directly in StreamSets Control Hub, you select an authoring Data Collector for Control Hub to use. The selected authoring Data Collector determines the stages, stage libraries, and functionality that display in the Control Hub Pipeline Designer.
A system Data Collector is a Data Collector that can be used by all users across all organizations as a default authoring Data Collector. The system Data Collector can be used for exploration and light development to design pipelines - it cannot be used to perform data preview or explicit pipeline validation. Install and configure a system Data Collector during the installation process so that all users have access to an authoring Data Collector used to design pipelines in Control Hub.

Note: After the installation, organization administrators register Data Collectors for their organization. Registered Data Collectors can also be used as authoring Data Collectors for Pipeline Designer. A registered Data Collector owned by an organization can be used to design, preview, and explicitly validate pipelines.

The Pipeline Designer UI uses encrypted REST APIs to communicate with Control Hub and the system Data Collector. As you design pipelines, the Pipeline Designer UI sends requests to the Pipeline Store application. The Pipeline Store application saves and retrieves pipeline definitions in the Pipeline Store relational database. The system Data Collector is stateless - meaning that no pipeline definitions are saved with the Data Collector. The Pipeline Store sends additional requests to the system Data Collector to retrieve stage definitions and perform implicit validation.

The following image shows how the Pipeline Designer UI interacts with the system Data Collector when you design pipelines:

Data Collector Communication

After you install StreamSets Control Hub, you register Data Collectors to work with Control Hub. You register Data Collectors by manually installing and administering them or by automatically provisioning them on a container orchestration framework. All registered Data Collectors - either manually administered or automatically provisioned - function in the same way.

Registered Data Collectors use encrypted REST APIs to communicate with Control Hub. Data Collectors initiate outbound connections to Control Hub on the port number configured in the Control Hub system.

Data Collectors send requests and information to several of the Control Hub applications. Control Hub applications do not directly send requests to Data Collectors. Instead, Control Hub applications send requests using encrypted REST APIs to a messaging queue managed by the Messaging application. Data Collectors periodically check with the queue to retrieve application requests.

Registered Data Collectors communicate with the following Control Hub applications:

Pipeline Store
When you use a Data Collector to publish a pipeline to Control Hub or to download a published pipeline from Control Hub, the Data Collector sends the request to the Pipeline Store application.

Security
When you enable Control Hub within a Data Collector or when a user logs into a registered Data Collector, the Data Collector makes an authentication request to the Security application.

Time Series
Every minute, a Data Collector sends aggregated metrics for remotely running pipelines to the Time Series application.

Messaging
Data Collectors send the following information to the Messaging application:
- At startup, a Data Collector sends the following information: Data Collector version, HTTP URL of the Data Collector, and labels configured in the Control Hub configuration file, dpm.properties.
- When you update permissions on local pipelines, the Data Collector sends the updated pipeline permissions.
- Every five seconds, a Data Collector sends a heartbeat and any status changes for remote pipelines.
- Every minute, a Data Collector sends the last-saved offsets of remotely running pipelines and the status of all running pipelines.
Every three seconds, the Job Runner application checks the Messaging application to retrieve pipeline status changes and last-saved offsets sent by the Data Collectors.
Every five seconds, Data Collectors check with the Messaging application to retrieve requests sent by the Job Runner application. When you start, stop, or delete a job in Control Hub, the Job Runner sends a pipeline request for a specific Data Collector to the Messaging application. The Messaging application retains the request until the receiving Data Collector retrieves the request.

The following image shows the Control Hub applications that registered Data Collectors communicate with:

SDC Edge Communication

StreamSets Control Hub works with Data Collector Edge (SDC Edge) to execute edge pipelines. SDC Edge is a lightweight agent that runs pipelines on edge devices with limited resources. You either design edge pipelines in the Control Hub Pipeline Designer or in a development Data Collector, and then publish the pipelines to Control Hub. Within Control Hub, you add published edge pipelines to jobs and then run the jobs on registered Edge Data Collectors.

You install each SDC Edge on an edge device and then register it to work with Control Hub. Registered Edge Data Collectors use encrypted REST APIs to communicate with Control Hub. Edge Data Collectors initiate outbound connections to Control Hub on the port number configured in the Control Hub system.

Just like Data Collector, a registered SDC Edge sends requests and information to several of the Control Hub applications. Control Hub applications do not directly send requests to an SDC Edge.
Instead, Control Hub sends requests using encrypted REST APIs to a messaging queue managed by the Messaging application. An SDC Edge periodically checks with the queue to retrieve Control Hub requests. SDC Edge communicates with the following Control Hub applications:

Time Series
  Every minute, an SDC Edge sends metrics for remotely running edge pipelines directly to the Time Series application.

Messaging
  SDC Edge sends the following information to the Messaging application:
  - At startup, an SDC Edge sends the following information: SDC Edge version, HTTP URL of the SDC Edge, and labels configured in the SDC Edge configuration file, edge.conf.
  - Every five seconds, an SDC Edge sends a heartbeat and any status changes for remote edge pipelines.
  - Every minute, an SDC Edge sends the last-saved offsets of remotely running edge pipelines and the status of all running edge pipelines.

  Every three seconds, the Job Runner application checks the Messaging application to retrieve pipeline status changes and last-saved offsets sent by each SDC Edge. Every five seconds, each SDC Edge checks with the Messaging application to retrieve requests sent by the Job Runner application.

  When you start, stop, or delete a job, the Job Runner sends a pipeline request for a specific SDC Edge to the Messaging application. The Messaging application retains the request until the receiving SDC Edge retrieves the request.

The following image shows the Control Hub applications that each registered SDC Edge communicates with:

Provisioning Agent Communication

A Provisioning Agent is a containerized application that runs in a container orchestration framework, such as Kubernetes. The agent automatically provisions Data Collector containers in the Kubernetes cluster on which it runs. Provisioning includes deploying, starting, stopping, and scaling the Data Collector containers to work with StreamSets Control Hub. Use provisioning to reduce the overhead of managing individual Data Collector instances.

After you create a Provisioning Agent and deploy the application to a container orchestration framework, the Provisioning Agent uses encrypted REST APIs to communicate with Control Hub. Provisioning Agents initiate outbound connections to Control Hub on the port number configured in the Control Hub system.

Provisioning Agents send requests and information to several of the Control Hub applications. Control Hub applications do not directly send requests to Provisioning Agents. Instead, Control Hub applications send requests using encrypted REST APIs to a messaging queue managed by the Messaging application. Provisioning Agents periodically check with the queue to retrieve application requests.

The container orchestration framework provides high availability for the Provisioning Agent.

Provisioning Agents communicate with the following Control Hub applications:

Security
  When the Provisioning Agent deploys a Data Collector container, the agent makes a request to the Security application for a new authentication token. The Provisioning Agent sends the returned authentication token to the Data Collector container and enables the container to work with Control Hub. During startup of the Data Collector container, the Data Collector registers itself with Control Hub.

  The Provisioning Agent uses private keys to sign authentication tokens, and then Data Collector containers decrypt the tokens. As a result, Data Collector containers are not prone to distributed denial-of-service (DDoS) attacks where an impersonating agent attempts to send an invalid authentication token.

Messaging
  Every five seconds, Provisioning Agents send deployment status changes to the Messaging application. At the same time, Provisioning Agents check with the Messaging application to retrieve requests sent by the Provisioning application.

  When you start, stop, or scale a deployment in Control Hub, the Provisioning application sends a deployment request for a specific Provisioning Agent to the Messaging application. The Messaging application retains the request until the receiving Provisioning Agent retrieves the request. Every 60 seconds, the Provisioning application checks the Messaging application to retrieve deployment status changes.

After the Provisioning Agent deploys Data Collector containers, the Data Collector containers communicate with Control Hub the same way that any registered Data Collector communicates with Control Hub.

The following image shows the Control Hub applications that a Provisioning Agent communicates with:

High Availability

You can run a single StreamSets Control Hub instance in a development environment. However, in a production environment, we recommend using multiple Control Hub instances and a load balancer to ensure that Control Hub is highly available.

To set up Control Hub as a highly available system, complete the following tasks:

Use highly available database clusters
  Use highly available database clusters:
  - For the relational database, use MySQL Enterprise High Availability or use PostgreSQL with high availability enabled.
  - For the time series database, use InfluxEnterprise.

Install Control Hub on multiple machines
  Install Control Hub on multiple machines, ensuring that each Control Hub instance uses the same relational and time series databases and the same SMTP account for emails.

Set up a load balancer for Control Hub
  Set up a load balancer to distribute user and registered Data Collector, Data Collector Edge, and Provisioning Agent requests across the Control Hub system. These Control Hub clients use the load balancer URL to communicate with the Control Hub system. In addition, each Control Hub instance accesses the front end of the load balancer to communicate with the other Control Hub instances.

  We recommend using a Layer 7 load balancer such as HAProxy, NGINX, or F5. As a best practice, use multiple instances of the load balancer to ensure that the load balancer is also highly available.

The following image displays the components of a highly available Control Hub system:

Authentication

Users require a StreamSets Control Hub user account to log in. You can configure Control Hub to use the following methods to authenticate Control Hub user accounts:

Control Hub authentication
  The built-in Control Hub authentication method authenticates a user using the credentials stored in the Control Hub relational database. Control Hub authentication is the default authentication method.

  To use Control Hub authentication, organization administrators simply create Control Hub user accounts for their organization.

SAML authentication
  If an organization uses a Security Assertion Markup Language (SAML) identity provider (IdP), the organization can use the IdP to authenticate Control Hub users. SAML authenticates a user using the credentials stored in the IdP. SAML authentication is configured by the organization administrator for each organization.

  To use SAML authentication, organization administrators enable SAML authentication for their organization, create Control Hub user accounts, and then map the Control Hub user accounts to IdP user accounts.

LDAP authentication
  If your company uses Lightweight Directory Access Protocol (LDAP), you can use the LDAP provider to authenticate Control Hub users. LDAP authenticates a user using the credentials stored in the LDAP server. LDAP authentication is configured by the default system administrator - the admin@admin user account - for the entire Control Hub system.

  To use LDAP authentication, the Control Hub system administrator configures LDAP connection information for Control Hub and then maps organization administrator accounts to LDAP users. Organization administrators then create Control Hub user accounts for their organization, mapping the Control Hub user accounts to LDAP users.

  Control Hub can also retrieve group membership from the LDAP provider. To group users, organization administrators create Control Hub groups, and then map the Control Hub groups to LDAP groups.

Control Hub and SAML authentication are configured at the organization level. As a result, Control Hub can include some organizations that use Control Hub authentication and other organizations that use SAML authentication. For more information about Control Hub and SAML authentication, see the "Organization Security" chapter in the StreamSets Control Hub online help.

LDAP authentication is configured for the entire Control Hub system. As a result, after the system administrator enables LDAP authentication for Control Hub, all organizations must use LDAP authentication. You can enable Control Hub to use LDAP authentication during the installation process or after the installation process. For more information, see Enabling LDAP Authentication.

Related Information
  Organizations on page 51

Installation Requirements

Install StreamSets Control Hub on a machine that meets the following minimum requirements:

Component          Minimum Requirement
Operating system   Use one of the following operating systems and versions:
                   - CentOS 6.x or 7.x
                   - Red Hat Enterprise Linux 6.x or 7.x
                   - Ubuntu LTS
Java               Oracle Java 8.
                   Note: OpenJDK is not supported.
                   Also requires Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files 8.
The remaining requirements depend on whether you are installing a single Control Hub instance for a development environment or multiple Control Hub instances for a highly available production environment:

Component    Single Installation   Multiple Installations for High Availability
CPU          8                     4
RAM          15 GB                 7.5 GB
Disk space   50 GB                 30 GB

If installing on Amazon EC2 instances, install Control Hub on a separate volume instead of the root volume. Use the following instance types:
- Single installation - c4.2xlarge
- Multiple installations for high availability - c4.xlarge
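The CPU, RAM, and disk minimums listed above can be checked on a candidate machine before installation. The following is a minimal sketch, assuming a Linux host that provides `nproc`, `df`, `awk`, and `/proc/meminfo`. The function names, the `INSTALL_DIR` variable, and its `/opt` default are illustrative assumptions and are not part of the Control Hub distribution.

```shell
#!/usr/bin/env bash
# Preflight sketch: compare a host against the documented Control Hub minimums.
# The thresholds come from the requirements listed above; the function names
# and the INSTALL_DIR default are illustrative assumptions.

# meets_minimum ACTUAL REQUIRED -> status 0 when ACTUAL >= REQUIRED (whole numbers).
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# report LABEL ACTUAL REQUIRED UNIT -> prints one OK/FAIL line for a resource.
report() {
  if meets_minimum "$2" "$3"; then
    echo "OK   $1: $2 $4 (minimum $3 $4)"
  else
    echo "FAIL $1: $2 $4 (minimum $3 $4)"
  fi
}

# run_preflight [single|ha] -> probes this Linux host and reports each resource.
run_preflight() {
  local mode="${1:-single}" req_cpu req_ram_mb req_disk_gb
  if [ "$mode" = "ha" ]; then
    req_cpu=4; req_ram_mb=7680; req_disk_gb=30     # 7.5 GB = 7680 MB
  else
    req_cpu=8; req_ram_mb=15360; req_disk_gb=50    # 15 GB = 15360 MB
  fi
  local install_dir="${INSTALL_DIR:-/opt}"         # hypothetical install volume
  local cpus ram_mb disk_gb
  cpus=$(nproc)
  # MemTotal is reported in kB; convert to whole MB.
  ram_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
  # Column 4 of POSIX df output is available space in 1K blocks; convert to GB.
  disk_gb=$(df -Pk "$install_dir" | awk 'NR==2 {print int($4 / 1024 / 1024)}')
  report "CPU" "$cpus" "$req_cpu" "CPUs"
  report "RAM" "$ram_mb" "$req_ram_mb" "MB"
  report "Disk space" "$disk_gb" "$req_disk_gb" "GB"
}
```

Calling `run_preflight` checks the single-installation minimums, and `run_preflight ha` checks the per-instance minimums for a highly available system. Because `MemTotal` excludes memory reserved by the kernel, a machine with exactly the minimum RAM can be reported as FAIL, so treat the RAM line as advisory.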

General Access Requirements

After installation, Control Hub requires access to the following components. These components can be local or remote to the Control Hub installations:

Component: SMTP account
Minimum requirement: SMTP account to send emails.

Component: Load balancer
Minimum requirement: Load balancer to set up a highly available Control Hub system. We recommend using a Layer 7 load balancer such as HAProxy, NGINX, or F5. Required for a production environment, optional for a development environment.

Component: Browser
Minimum requirement: Use the latest version of one of the following browsers:
- Chrome
- Firefox
- Safari

Component: StreamSets Data Collector
Minimum requirement: StreamSets recommends using the latest version of Data Collector, up to the current Control Hub version. The minimum supported Data Collector version depends on how you use Data Collector:
- Version or later is required to design pipelines in Data Collector and to run standalone and cluster pipelines from jobs.
- Version or later is required as the authoring Data Collector used to design pipelines in the Control Hub Pipeline Designer.
- Version is required as the authoring Data Collector used to design pipeline fragments.
If needed, you can customize the supported Data Collector version range.

Component: StreamSets Data Collector Edge (SDC Edge)
Minimum requirement: StreamSets recommends using the latest version of SDC Edge to run edge pipelines from jobs. The minimum supported SDC Edge version is

Component: Statistics aggregator
Minimum requirement: Use one of the following systems to aggregate pipeline statistics when jobs run on multiple Data Collectors:
- Amazon Kinesis Streams
- Kafka version supported by Data Collector
- MapR Streams version supported by Data Collector
Note: In a development environment, you can also use SDC RPC to aggregate pipeline statistics. Using SDC RPC to aggregate statistics is not highly available and might cause the loss of some data. It should be used for development purposes only.

Relational Database Requirements

Control Hub supports MySQL or PostgreSQL for the relational database.
MySQL Requirements

The relational database for a single Control Hub instance requires MySQL 5.6 or 5.7. The relational database for a highly available Control Hub system requires MySQL Enterprise High Availability 5.6 or 5.7.

MySQL installations must meet the following minimum requirements:

Component    Single Installation   Multiple Installations for High Availability
CPU          4                     4
RAM          30.5 GB               30.5 GB
Disk space   50 GB                 100 GB

If installing on Amazon EC2 instances, use the following instance types:
- Single installation - db.r3.xlarge
- Multiple installations for high availability - db.r3.xlarge

PostgreSQL Requirements

The relational database for a single Control Hub instance requires PostgreSQL 9.4. The relational database for a highly available Control Hub system requires PostgreSQL 9.4 with high availability enabled.

PostgreSQL installations must meet the following minimum requirements:

Component    Single Installation   Multiple Installations for High Availability
CPU          4                     4
RAM          30.5 GB               30.5 GB
Disk space   50 GB                 100 GB

If installing on Amazon EC2 instances, use the following instance types:
- Single installation - db.r3.xlarge
- Multiple installations for high availability - db.r3.xlarge

Time Series Database Requirements

The time series database for a single Control Hub instance requires InfluxDB OSS 1.3. The time series database for a highly available Control Hub system requires InfluxEnterprise 1.3, with a minimum of 2 data nodes and 3 meta nodes.

Influx installations must meet the following minimum requirements:

Component    Single Installation   Multiple Installations for High Availability
CPU          4                     8
RAM          30.5 GB               61 GB
Disk space   250 GB                500 GB

If installing on Amazon EC2 instances, use the following instance types:
- Single installation - r4.xlarge

- Multiple installations for high availability - r3.2xlarge

Default Ports

The following table lists the default ports exposed to Control Hub clients and how they are used. Note that the default port numbers can be changed during installation.

In a development environment, configure network routes and firewalls so that web UI clients and registered Data Collectors, Edge Data Collectors, and Provisioning Agents can reach the Control Hub IP addresses.

In a highly available production environment, configure network routes and firewalls so that the Control Hub instances, web UI clients, and registered Data Collectors, Data Collector Edges, and Provisioning Agents can reach the load balancer.

System: Control Hub
Default port: HTTP HTTPS - depends on dpm.properties configuration
Protocol: TCP
Usage: Access to the Control Hub web-based UI and API for a single Control Hub instance in a development environment. Used by developers and administrators to access the UI. Used by registered Data Collectors, Edge Data Collectors, and Provisioning Agents to access the API.

System: Control Hub Admin tool
Default port: HTTP HTTPS - depends on dpm.properties configuration
Protocol: TCP
Usage: Access to the Control Hub Admin tool web-based UI for a single Control Hub instance in a development environment. Used by administrators to access the UI.

System: Load balancer
Default port: Depends on the chosen load balancer
Protocol: TCP
Usage: When using multiple Control Hub instances in a highly available production environment, both Control Hub and the Control Hub Admin tool are accessed through a load balancer.

The following table lists the default ports of the external systems that Control Hub depends on and how they are used. The default port numbers can change - confirm the actual numbers with your systems administrator.

External system: MySQL
Default port: 3306
Protocol: TCP
Usage: Relational database that stores Control Hub application data. Note: Control Hub uses either the MySQL or PostgreSQL relational database - not both.

External system: PostgreSQL
Default port: 5432
Protocol: TCP
Usage: Relational database that stores Control Hub application data.

External system: InfluxDB
Default port: 8086
Protocol: TCP
Usage: Time series database that stores metrics.

External system: LDAP or LDAPS
Protocol: TCP
Usage: Used when Control Hub is configured for LDAP or LDAPS authentication.

External system: SMTP
Default port: 465
Protocol: TCP
Usage: Used by Control Hub to send notifications.
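Before installing, it can help to confirm that the Control Hub machine can open a TCP connection to each external system on the ports listed above. The following is a sketch under stated assumptions: it relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command, the function names are illustrative, and the host names in the usage note are placeholders. Only port numbers that appear in the list above are included.

```shell
#!/usr/bin/env bash
# Reachability sketch for the external systems that Control Hub depends on.
# Function names are illustrative; ports are the documented defaults only.

# default_port SYSTEM -> prints the documented default port, or fails if unknown.
default_port() {
  case "$1" in
    mysql)      echo 3306 ;;
    postgresql) echo 5432 ;;
    influxdb)   echo 8086 ;;
    smtp)       echo 465 ;;
    *)          return 1 ;;
  esac
}

# can_connect HOST PORT -> status 0 if a TCP connection opens within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so it is invoked through bash explicitly.
can_connect() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# check_system SYSTEM HOST [PORT] -> reports whether HOST accepts connections.
# PORT overrides the documented default when your installation changed it.
check_system() {
  local port
  port="${3:-$(default_port "$1")}" || { echo "unknown system: $1" >&2; return 2; }
  if can_connect "$2" "$port"; then
    echo "reachable   $1 at $2:$port"
  else
    echo "unreachable $1 at $2:$port"
    return 1
  fi
}
```

For example, `check_system mysql db.example.com` tests the default MySQL port, and `check_system influxdb tsdb.example.com 8087` tests a non-default InfluxDB port. A successful connection only shows that the route and firewall allow traffic; it does not validate credentials or TLS settings.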

Chapter 3: Installation

This chapter includes the following information:
- Overview
- Creating the Databases
- Installing Control Hub
- Setting Up a Highly Available Environment
- Uninstalling Control Hub

Overview

Installing StreamSets Control Hub includes the following high-level tasks:

Creating the databases
  Install the required database software and then create the required databases.

Installing Control Hub
  Install Control Hub, launch and log into the Control Hub instance, and then create organizations.

Setting up a highly available environment
  For a highly available production environment, set up a load balancer and install Control Hub on multiple machines.

  You can run a single Control Hub instance in a development environment. If you are setting up a development environment, you can skip this section.

Creating the Databases

Install the required database software and create the databases before installing StreamSets Control Hub.

Step 1. Install the Database Software

Control Hub requires MySQL or PostgreSQL for the relational database and requires InfluxDB for the time series database.

You can install the database software on the same machine as the Control Hub instance or on a remote machine. For best performance, we recommend installing on remote machines. Be sure to install the database software on a machine that meets the installation requirements.

For a development environment, install the following:
- Relational database - Install MySQL or PostgreSQL as described in the MySQL documentation or the PostgreSQL documentation.
- Time series database - Install InfluxDB as described in the InfluxDB documentation.

For a highly available Control Hub system in a production environment, install the following:
- Relational database - Install MySQL Enterprise High Availability or PostgreSQL with high availability enabled as described in the MySQL Enterprise High Availability documentation or the PostgreSQL high availability documentation.
- Time series database - Install InfluxEnterprise as described in the InfluxEnterprise documentation.

Related Information
  Relational Database Requirements on page 19
  Time Series Database Requirements on page 20

Setting MySQL Server System Variables

If using MySQL for the relational database, set the following MySQL server system variables to these values on each machine where MySQL is installed:

MySQL System Variable character_set_client character_set_connection character_set_database character_set_server character_set_system collation_connection collation_database collation_server Value utf8 utf8_general_ci utf8_unicode_ci lower_case_table_names 0

Enabling Authentication and HTTPS for InfluxDB

By default, an InfluxDB installation has authentication and HTTPS disabled. For a production environment, we recommend enabling both authentication and HTTPS.

When you enable authentication, you must create an admin user. Use this admin user when you create the required databases in InfluxDB.

For instructions, see Authentication and Authorization and HTTPS Setup in the InfluxDB documentation.

Step 2. Create Databases in the Relational Database

Create a unique database for each Control Hub application in the relational database on MySQL or PostgreSQL.

Creating Databases on MySQL

Create a unique database for each application and then create a user with all privileges on these databases.

1. Connect to MySQL as a user who can create databases.
   You must be an admin user to create databases in MySQL. For more information, see the MySQL documentation.
2. Use the following command to create a unique database for each application:
     CREATE DATABASE <database name>;
   For example, you might create databases with the following names to match each application:
     CREATE DATABASE jobrunner;
     CREATE DATABASE messaging;
     CREATE DATABASE notification;
     CREATE DATABASE pipelinestore;
     CREATE DATABASE provisioning;
     CREATE DATABASE reporting;
     CREATE DATABASE scheduler;
     CREATE DATABASE security;
     CREATE DATABASE sla;
     CREATE DATABASE timeseries;
     CREATE DATABASE topology;

   Tip: If you have both a development and production environment using the same relational database, create unique names for each environment. For example, security_dev and security_prod.
3. Verify that the required databases were successfully created with the correct names.
4. Create a user with all privileges on these databases.
   When you install Control Hub, you'll configure Control Hub to use this user account to connect to the databases. You can use one user account for all of the databases, or you can create a unique user account for each database. For information on creating users, see the MySQL documentation.
   The commands you use depend on whether MySQL is installed on the local Control Hub machine or on a remote machine:
   - Local machine - Use the following commands to create a user with all privileges on the jobrunner database:
       CREATE USER 'jobrunner'@'%' IDENTIFIED BY 'jobrunner';
       GRANT ALL PRIVILEGES ON jobrunner.* TO 'jobrunner'@'%';
       CREATE USER 'jobrunner'@'<full host name>' IDENTIFIED BY 'jobrunner';
       GRANT ALL PRIVILEGES ON jobrunner.* TO 'jobrunner'@'<full host name>';
   - Remote machine - Use the following commands to create a user with all privileges on the jobrunner database:
       CREATE USER 'jobrunner'@'%' IDENTIFIED BY 'jobrunner';
       GRANT ALL PRIVILEGES ON jobrunner.* TO 'jobrunner'@'%';
   Repeat the commands for each application database.

Creating Databases on PostgreSQL

Create a unique database for each application and then create a user with all privileges on these databases.

1. Connect to PostgreSQL as a user who can create databases.
   You must be a superuser or a user with the special CREATEDB privilege to create databases in PostgreSQL. For more information, see the PostgreSQL documentation.
2. Use the following command to create a unique database for each application:
     CREATE DATABASE <database name> WITH ENCODING = 'UTF8';
   For example, you might create databases with the following names to match each application:
     CREATE DATABASE jobrunner WITH ENCODING = 'UTF8';
     CREATE DATABASE messaging WITH ENCODING = 'UTF8';
     CREATE DATABASE notification WITH ENCODING = 'UTF8';
     CREATE DATABASE pipelinestore WITH ENCODING = 'UTF8';
     CREATE DATABASE provisioning WITH ENCODING = 'UTF8';
     CREATE DATABASE reporting WITH ENCODING = 'UTF8';
     CREATE DATABASE scheduler WITH ENCODING = 'UTF8';
     CREATE DATABASE security WITH ENCODING = 'UTF8';
     CREATE DATABASE sla WITH ENCODING = 'UTF8';
     CREATE DATABASE timeseries WITH ENCODING = 'UTF8';
     CREATE DATABASE topology WITH ENCODING = 'UTF8';
   Tip: If you have both a development and production environment using the same relational database, create unique names for each environment. For example, security_dev and security_prod.
3. Verify that the required databases were successfully created with the correct names.
4. Create a user with all privileges on these databases.
   When you install Control Hub, you'll configure Control Hub to use this user account to connect to the databases. You can use one user account for all of the databases, or you can create a unique user account for each database. For information on creating users, see the PostgreSQL documentation.
   For example, use the following commands to create a jobrunner user with all privileges on the jobrunner database:
     CREATE USER jobrunner WITH PASSWORD '<password>';
     GRANT ALL PRIVILEGES ON DATABASE jobrunner TO jobrunner;
   Repeat the commands for each application database.

Step 3. Create Databases in the Time Series Database

Create the following databases in the time series database on InfluxDB:
- Metrics
- Application Metrics

1. Use the following command to start the InfluxDB server:
     service influxdb start
2. Use the following command to connect to InfluxDB as the admin user that you created when you enabled authentication for InfluxDB:
     influx -username <admin user name> -password <password> -precision rfc3339
   You must be an admin user to create databases in InfluxDB.
   Note: The command assumes that InfluxDB is installed on the localhost using the default port. If you changed the defaults during the InfluxDB installation, use the appropriate command to connect. For more information, see the InfluxDB documentation.
3. Use the following Influx Query Language (InfluxQL) statement to create each database:
     > CREATE DATABASE <database name>
   For example, to create a Metrics database named sch and an Application Metrics database named sch_app, run the following statements:
     > CREATE DATABASE sch
     > CREATE DATABASE sch_app
   Tip: If you have both a development and production environment using the same time series database, create unique database names for each environment. For example, sch_dev and sch_prod.
4. Use the following InfluxQL statement to verify that the databases were successfully created with the correct names:
     > SHOW DATABASES
   Verify that the statement output includes the required databases you just created. For example:
     name: databases
     name
     ----
     _internal
     sch
     sch_app
   Note: The _internal database is created and used by InfluxDB to store internal runtime metrics.
5. Create an InfluxDB user with all privileges on these time series databases.
   When you install Control Hub, you'll configure Control Hub to use this user account to connect to the databases.
   You can use one user account for both databases, or you can create a unique user account for each database.
   For example, use the following InfluxQL statements to create a user with all privileges on the Metrics database (sch) and the Application Metrics database (sch_app):
     > CREATE USER <username> WITH PASSWORD '<password>'
     > GRANT ALL ON sch TO <username>
     > GRANT ALL ON sch_app TO <username>
     > GRANT READ ON _internal TO <username>
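The InfluxQL statements in this step can also be generated and run non-interactively, which is convenient when the same setup is repeated for development and production databases. The following sketch prints the statements for an optional environment suffix and feeds them to the InfluxDB 1.x `influx` CLI one at a time through its `-execute` option. The function names, the suffix convention (for example `sch_dev` and `sch_app_dev`), and the `sch_user` account in the usage note are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: generate and run the InfluxQL setup statements from this step.
# Function names and the database suffix convention are illustrative.

# influx_setup_statements USER PASSWORD [SUFFIX]
# Prints one InfluxQL statement per line for the Metrics and Application
# Metrics databases. The password must not contain single quotes.
influx_setup_statements() {
  local user="$1" password="$2" suffix="${3:-}"
  local metrics_db="sch${suffix}"
  local app_metrics_db="sch_app${suffix}"
  cat <<EOF
CREATE DATABASE ${metrics_db}
CREATE DATABASE ${app_metrics_db}
CREATE USER ${user} WITH PASSWORD '${password}'
GRANT ALL ON ${metrics_db} TO ${user}
GRANT ALL ON ${app_metrics_db} TO ${user}
GRANT READ ON _internal TO ${user}
EOF
}

# run_influx_setup ADMIN_USER ADMIN_PASSWORD USER PASSWORD [SUFFIX]
# Runs each generated statement as the InfluxDB admin user and stops at the
# first failure. Assumes the influx CLI is on the PATH and InfluxDB listens
# on localhost with its default port.
run_influx_setup() {
  local admin_user="$1" admin_password="$2"
  shift 2
  influx_setup_statements "$@" | while IFS= read -r statement; do
    # </dev/null keeps the CLI from consuming the remaining statements.
    influx -username "$admin_user" -password "$admin_password" \
      -execute "$statement" </dev/null || exit 1
  done
}
```

For example, `influx_setup_statements sch_user '<password>' _dev` prints the statements for review, and `run_influx_setup <admin user name> '<admin password>' sch_user '<password>' _dev` executes them. Passing passwords on the command line exposes them in the process list and shell history, so this approach suits setup scripts on trusted machines only.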

Installing Control Hub

You can install Control Hub on the same machine as the required databases or on a remote machine. For best performance, we recommend installing on a remote machine.

During the Control Hub installation process, you also install and configure a system Data Collector so that all users have access to an authoring Data Collector used to design pipelines in Control Hub.

Important: After you install Control Hub, you must generate a unique system ID and then send that system ID to StreamSets to request an activation key. Plan to have some downtime while you wait for the activation key. You need the key before you can install the license and start Control Hub.

After starting Control Hub, log into the instance and create the required organizations.

Control Hub also includes a separate Admin tool to monitor and troubleshoot Control Hub issues. For example, if Control Hub becomes inaccessible, the Admin tool remains running. You can still log into the Admin tool to troubleshoot the Control Hub issues.

Step 1. Install the System Data Collector

Install and configure a Data Collector that functions as the system Data Collector used by the Control Hub Pipeline Designer. For more information about how the system Data Collector is used by Pipeline Designer, see System Data Collector.

Note: You can use an existing Data Collector installation as long as the Data Collector has not been registered with Control Hub, meets all requirements, and is configured as described below.

Requirements

The system Data Collector must meet all of the following requirements:

Version
  StreamSets recommends using the latest version of Data Collector. The minimum supported Data Collector version is To design pipeline fragments, the minimum supported Data Collector version is

Installation type
  Use any of the supported installation methods for the system Data Collector - including a tarball, RPM, Cloudera Manager, or Docker installation.
  For a tarball or RPM installation, we recommend installing the core version, and then installing the additional stage libraries required by your data engineers.

  For a Cloudera Manager installation that has added multiple Data Collector role instances to the StreamSets service, you must configure the system Data Collector instance separately from any Data Collector instances that you intend to register with Control Hub. As such, modify the configuration properties for the system Data Collector instance to override the StreamSets service configuration.

Installation location
  In a development environment, you can install the system Data Collector on the same machine as the Control Hub instance as long as the machine has enough resources.

  For best performance in a production environment, we recommend installing on a remote machine within the same internal network as the Control Hub instance. The Control Hub instance must be able to access the system Data Collector URL.

Authentication
  Configure the system Data Collector to use the default file-based authentication and the form authentication type. By default, a new installation uses file-based authentication with the form type. If you choose to use an existing Data Collector, verify that the http.authentication property is set to form in the $SDC_CONF/sdc.properties file.

  Note: The system Data Collector uses encrypted REST APIs to communicate with Control Hub. As a result, you do not need to configure the system Data Collector to use the HTTPS protocol or to use SSL certificates to secure the communication.

Installing and Configuring the System Data Collector

1. Install the required version of Data Collector.
   For installation instructions, see Installation in the Data Collector documentation.
2. Configure a single Data Collector user account with the admin or creator role.
   The system Data Collector uses Data Collector authentication - unlike registered Data Collectors that use Control Hub authentication. Users do not directly log into the system Data Collector. However, a single user account is required so that Control Hub can make requests to the system Data Collector.
   You can use the default admin account. However, be sure to change the default password for the account.
   For instructions on configuring users for file-based authentication, see Configure Users, Groups, and Roles in the Data Collector documentation.
3. Start the Data Collector.

Important: Do not register the system Data Collector with Control Hub.

Step 2. Set Up Time Synchronization

When you install the databases and Control Hub on separate machines, you must set up time synchronization using Network Time Protocol (NTP). NTP synchronizes all participating machines to within a few milliseconds of Coordinated Universal Time (UTC). To use NTP, install and set up an NTP server as described in your operating system documentation.
If you do not set up time synchronization, Control Hub might stop processing tasks due to out-of-order timestamps among the machines.

Step 3. Install from the Tarball or RPM Package

You can install Control Hub from the tarball and start it manually. Or, you can install Control Hub from the RPM package and run it as a service.

Install the tarball or RPM package on a machine that meets the installation requirements.

Installing from the Tarball

You can install the Control Hub tarball on all supported operating systems.

When you install from the tarball, you must start Control Hub manually from the command line. Control Hub runs as the system user account logged into the command prompt when you run the start command. You can alternatively impersonate another user account when you run the command.

1. Download the tarball from the Control Hub on-premises email that you received from StreamSets.
2. Use the following command to extract the tarball:

     tar xzvf streamsets-dpm-<version>.tar.gz
   For example, to extract version 3.2.1, use the following command:
     tar xzvf streamsets-dpm-3.2.1.tar.gz

Installing from the RPM Package

You can install the Control Hub RPM package on CentOS or Red Hat Enterprise Linux.

When you install from the RPM package, Control Hub runs as a service using the default system user account and group named dpm. If a dpm user and a dpm group do not exist on the machine, the installation creates the user and group for you and assigns them the next available user ID and group ID.

Tip: To use specific IDs for the dpm user and group, create the user and group before installation to specify the IDs that you want to use. For example, if you're installing Control Hub on multiple machines for high availability, create the system user and group before installation to ensure that the user ID and group ID are consistent across the machines.

Installing Control Hub as a service requires sudo privileges on the root directory.

1. Download the RPM package for your operating system from the Control Hub on-premises email that you received from StreamSets:
   - For CentOS 6.x or Red Hat Enterprise Linux 6.x, download the RPM EL6 package.
   - For CentOS 7.x or Red Hat Enterprise Linux 7.x, download the RPM EL7 package.
2. Use the following command to install the RPM package:
     yum localinstall streamsets-dpm x86_64.rpm

Step 4. Download the JDBC Driver

Control Hub requires a JDBC driver to connect to the relational database on MySQL or PostgreSQL.

1. Download the JDBC driver for the relational database that you are using:
   - MySQL - Download the MySQL JDBC driver version or later from the following location:
   - PostgreSQL - Download the PostgreSQL JDBC driver version or later from the following location:
2. Copy the driver to the following directory:
     $DPM_HOME/extra-lib
   For example, copy the driver to the following directory in an RPM installation:
     /opt/streamsets-dpm/extra-lib

Step 5. Set Environment Variables

Before you run the Control Hub installation scripts, you must set the DPM_HOME and DPM_CONF environment variables on the command line.

1. Use the following command to set the DPM_HOME environment variable:
     export DPM_HOME=<home directory>
   For example, for a tarball installation use:
     export DPM_HOME=/dpm/streamsets-dpm-3.2.1

34 Installation 30 For an RPM installation use: export DPM_HOME=/opt/streamsets-dpm 2. Use the following command to set the DPM_CONF environment variable: export DPM_CONF=<configuration directory> For example, for a tarball installation use: export DPM_CONF=/dpm/streamsets-dpm-3.2.1/etc For an RPM installation use: export DPM_CONF=/etc/dpm Step 6. Set Up Control Hub Run the Control Hub setup script to configure Control Hub properties, database connection details, and system Data Collector properties. The setup script uses the dialog command line utility to display the configuration properties using dialog boxes. 1. Install the dialog command line utility. For CentOS or Red Hat Linux, use the following command: yum install dialog For Ubuntu, use the following commands: apt-get update apt-get install dialog 2. If using PuTTY as the SSH client to install Control Hub on a remote machine, configure PuTTY to use linux as the terminal emulation mode. By default, PuTTY uses xterm emulation which does not correctly display the dialog command line utility. In the PuTTY Configuration dialog box, click Connection > Data and then set Terminal-type string to linux. 3. Use the following command to run the Control Hub setup script from the $DPM_HOME directory: dev/setup.sh When you run the script for the first time, configure all of the properties. If necessary, you can run the script again to change a few properties, navigating to the appropriate configuration group. See the sections below for a description of each property. Navigation Tips The Control Hub setup script contains multiple configuration groups that you navigate through to configure the required properties. The initial dialog box displays the configuration groups:

Use the arrow keys, the numbers assigned to each section, and the OK, Cancel, and Back options to navigate through the dialog boxes. Type a number to jump to that section rather than using the arrow keys to cycle through each section.

In dialog boxes that offer a selection of two options, use the space bar to select another option. Let's look at the Mail Transport Protocol dialog box:

In this example, the SMTP protocol is currently set, as displayed by the asterisk (*) next to the option. To change to the SMTPS protocol, press the down arrow or the number 2 to highlight SMTPS. Then press the space bar to switch the selection. The screen displays the asterisk next to the SMTPS option. Then press Enter with the OK option highlighted to save your selection.

Control Hub Configuration

The Control Hub configuration group includes the following properties:

Important: You must enter a value for the Load Balancer URL. If you are installing a single Control Hub instance and not using a load balancer, enter the same value that you enter for the Control Hub Base URL.

Control Hub Configuration Property - Description

Control Hub Base URL
    Base URL to access Control Hub based on your installation type:
    - Single installation - You can use the HTTP or HTTPS protocol. Use the HTTPS protocol for a production environment.
      For the HTTP protocol, use the following format for the base URL:

          http://<host name>:<port number>

      Note: 18631 is the default port number for the HTTP protocol. If you change the default port of 18631, then you also must modify the http.port property in the $DPM_CONF/dpm.properties file to use the same port number. To change the Admin tool default port for the HTTP protocol, modify the admin.http.port property in the same file.
      For the HTTPS protocol, use the following format for the base URL, entering a unique secure port number:

          https://<host name>:<port number>

      After installation, you must modify additional configuration files to successfully configure Control Hub to use HTTPS.
    - Multiple installations for high availability - Enter the load balancer URL for this property.

Load Balancer URL
    Load balancer URL based on your installation type:
    - Single installation - You won't use a load balancer for a single installation, so enter the Control Hub base URL for this property.
    - Multiple installations for high availability - Enter the load balancer URL.

Admin Tool 'admin' Password
    For the Control Hub Admin tool, enter a password for the default "admin" user account.

Mail Transport Protocol
    Protocol to use for the SMTP account used for emails. Use the space bar to select SMTP or SMTPS - the asterisk (*) shows your selection - and then press Enter with the OK option highlighted. Default is SMTP.

Mail Server Host
    Host name of the mail server.

Mail Server Port
    Port number of the mail server.

Mail 'From' Address
    Email address to use to send email.

Mail Server Authentication
    Whether the mail server host uses authentication. Use the space bar to select enabled or disabled - the asterisk (*) shows your selection - and then press Enter with the OK option highlighted. Default is disabled.

Mail Server Username
    If the mail server host uses authentication, user name for the account to send email.

Mail Server Password
    If the mail server host uses authentication, password for the account. You can use a file that contains the password for the account. By default, the file name is security-app- -password.txt and is expected in the $DPM_CONF directory. You can replace the default file name.

Relational Database Configuration

The Relational Database configuration group includes the connection details for the databases created for each application in MySQL or PostgreSQL.

Select each application, and then enter the following database connection details:

Relational Database Property - Description

Driver Class
    Name of the JDBC driver class used by the relational database.
    For MySQL, enter: com.mysql.jdbc.Driver
    For PostgreSQL, enter: org.postgresql.Driver

JDBC Connection String
    Connection string to use to connect to the database.
    For MySQL, use the following format:

        jdbc:mysql://<host name>:<port>/<database name>

    For example:

        jdbc:mysql://sch.acme.dbs.com:3306/security

    For PostgreSQL, use the following format:

        jdbc:postgresql://<host name>:<port>/<database name>

    For example:

        jdbc:postgresql://sch.acme.dbs.com:5432/security

Username
    User name for the JDBC connection. The user account must have all privileges on the database.

Password
    Password for the user account.

System Data Collector Configuration

The System Data Collector configuration group includes the following properties for the system Data Collector:

System Data Collector Property - Description

System Data Collector URL
    URL of the system Data Collector.

System Data Collector Username
    Data Collector user account with the admin or creator role. The system Data Collector uses Data Collector authentication, unlike registered Data Collectors, which use Control Hub authentication. Default is admin.

System Data Collector Password
    Password for the Data Collector user account. Default is admin.

Time Series Database Configuration

The Time Series Database configuration group includes the following properties for the databases created in InfluxDB:

Time Series Database Property - Description

Metrics Database URL
    Metrics database URL using the following format:

        <host name>:<port number>

Metrics Database Name
    Name of the Metrics database. For example, sch.

Metrics Database Username
    User name for the database. The user account must have all privileges on the database.

Metrics Database Password
    Password for the database user account.

Application Metrics Database URL
    Application Metrics database URL using the following format:

        <host name>:<port number>

Application Metrics Database Name
    Name of the Application Metrics database. For example, sch_app.

Application Metrics Database Username
    User name for the database. The user account must have all privileges on the database.

Application Metrics Database Password
    Password for the database user account.

Step 7. Enable PostgreSQL for the Scheduler Application (Optional)

If using PostgreSQL for the relational database, configure the Scheduler application to use the driver delegate class for PostgreSQL.

Note: If using MySQL for the relational database, you can skip this step.

Uncomment the following line in the $DPM_CONF/scheduler-app.properties file:

    org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate

Step 8. Build Schemas in the Relational Database

Run the Control Hub database initialization script to create the required tables for each database in the relational database.

1. Use the following command to run the database initialization script from the $DPM_HOME directory:

       dev/01-initdb.sh

2. Use the appropriate MySQL or PostgreSQL command to verify that tables were created for each database.

Step 9. Generate Authentication Tokens for Applications

Run the security script to generate a unique authentication token for each Control Hub application.

Control Hub uses authentication tokens to authenticate each message or request sent by an application. The application includes the authentication token when it issues authenticated messages or requests to other applications.

Use the following command to run the security script from the $DPM_HOME directory:

    dev/02-initsecurity.sh
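The uncommenting that Step 7 asks for can be scripted. The following is a minimal sketch, not part of the Control Hub distribution: it assumes the line is commented out with a leading # character and uses the property casing documented by Quartz (org.quartz.jobStore.driverDelegateClass). The function name uncomment_pg_delegate is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: uncomment the Quartz PostgreSQL delegate line in a properties file.
# Assumption: the line starts with '#' (optionally followed by spaces).
# uncomment_pg_delegate is a hypothetical helper, not a Control Hub command.
uncomment_pg_delegate() {
  local props_file="$1"
  # Strip the leading '#' only from the driverDelegateClass line that names
  # the PostgreSQL delegate; every other line is left untouched.
  # A backup copy is written next to the file with a .bak suffix.
  sed -i.bak -E \
    's/^#[[:space:]]*(org\.quartz\.jobStore\.driverDelegateClass[[:space:]]*=.*PostgreSQLDelegate.*)$/\1/' \
    "$props_file"
}
```

For example, after setting DPM_CONF in Step 5, you might call uncomment_pg_delegate "$DPM_CONF/scheduler-app.properties" and then compare the file against the .bak copy before continuing with Step 8.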

Step 10. Activate the Control Hub License

Each Control Hub system requires an active license before you can start Control Hub.

Note: Each license is activated for a specific Control Hub system ID. If you install multiple Control Hub instances for a highly available system, you only need to activate the license once.

1. Generate the Control Hub system ID by running the following command from the $DPM_HOME directory:

       bin/streamsets dpmcli security systemid -c

2. Open a StreamSets support ticket or contact your StreamSets sales representative to request the activation key for your Control Hub system ID.
3. After you receive the activation key, run the following command from the $DPM_HOME directory to activate the license:

       bin/streamsets dpmcli security activationkey -i activationkey.txt

Step 11. Start Control Hub

Start Control Hub from the command prompt, using the required command for your installation type.

Starting for a Tarball Installation

When you install Control Hub from the tarball, you start Control Hub manually from the command line. Control Hub runs as the system user account logged into the command prompt when you run the start command. You can alternatively impersonate another user account when you run the command.

Use the following command from the $DPM_HOME directory to start Control Hub so that it runs as the system user account logged into the command prompt:

    bin/streamsets dpm

Or, use the following command to start Control Hub and run it in the background:

    nohup bin/streamsets dpm &

Use the following command to start Control Hub so that it runs as another system user account:

    sudo -u <user> bin/streamsets dpm

Starting for an RPM Installation

When you install Control Hub from an RPM package, you run Control Hub as a service. Control Hub runs as the system user account and group defined in environment variables. The default system user and group are named dpm.

Use the required command for your operating system to start Control Hub as a service:

- For CentOS 6.x or Red Hat Enterprise Linux 6.x, use:

      service dpm start

- For CentOS 7.x or Red Hat Enterprise Linux 7.x, use:

      systemctl start dpm

Step 12. Log Into Control Hub

After launching Control Hub, log in to Control Hub using the default system administrator account.

1. Enter the Control Hub base URL in the address bar of your browser.

   You defined the base URL when you ran the Control Hub setup script. After you start Control Hub, the first URL listed in the command output is the base URL:

       <host name>:18631

   The second URL listed in the command output is the URL to the Control Hub Admin tool, which you can use to monitor and troubleshoot Control Hub issues. You do not need to log into that tool now. It is described in more detail in Control Hub Admin Tool.
2. Use the following credentials to log in as the default system administrator:

       /

   Control Hub displays the default dashboard:

   The system administrator is a user account that belongs to the system organization. The system administrator can complete tasks across all Control Hub organizations.
3. Immediately after logging in as the system administrator, change the default password for the account to ensure the integrity of your data.
   a) In the Navigation panel, click Administration > My Account.
   b) Click Update Password.
   c) Enter the current and new password, and then click Save.

Step 13. Enable LDAP Authentication (Optional)

If your company uses Lightweight Directory Access Protocol (LDAP), you can use the LDAP provider to authenticate Control Hub users. LDAP authenticates a user using the credentials stored in the LDAP server.

Note: You can also use the built-in Control Hub authentication method or the SAML authentication method. These authentication methods are configured by the organization administrator for their organization. For more information, see Authentication.

LDAP authentication is configured by the default system administrator - the admin@admin user account - for the entire Control Hub system. After LDAP authentication is enabled, all organizations must use LDAP authentication. Users log in to Control Hub using their Control Hub user ID and their LDAP password.

To use LDAP authentication, the Control Hub system administrator configures LDAP connection information for Control Hub and then maps organization administrator accounts to LDAP users. Organization administrators then create Control Hub user accounts for their organization, mapping the Control Hub user accounts to LDAP users.

Control Hub can also retrieve group membership from the LDAP provider. To group users, organization administrators create Control Hub groups, and then map the Control Hub groups to LDAP groups.

You can enable LDAP authentication during the installation or after the installation.

After LDAP authentication is enabled, any changes in the LDAP provider can take up to a minute to be reflected within Control Hub.

To enable LDAP authentication, complete the following tasks:

- Configure LDAP Connection Information
- Configure Secure Connections to LDAP (Optional)
- Map Control Hub Organization Administrators to LDAP Users
- Map Control Hub Users and Groups to LDAP Users and Groups

System Administrator and LDAP Authentication

When LDAP authentication is enabled, the default system administrator - the admin@admin user account - continues to log in using the password for the account stored in the Control Hub relational database. The system administrator account cannot be mapped to an LDAP user account.

If LDAP authentication is incorrectly configured within Control Hub, users will not be able to log in using LDAP authentication. However, the system administrator can use Control Hub credentials to log in, troubleshoot LDAP-related issues, and then re-enable access to Control Hub.

Configuring LDAP Connection Information

To enable LDAP authentication, configure LDAP connection information in the Control Hub security configuration file, $DPM_CONF/security-app.properties.

If your organization has multiple LDAP servers, you can include multiple LDAP configurations within the file so that Control Hub can connect to multiple LDAP servers. LDAP user accounts must be unique across the servers. A user account that exists in multiple LDAP servers might encounter unexpected results after logging in to Control Hub.

1. In the security-app.properties file, enable LDAP authentication by completing the following steps:
   a) Comment out the usergroupprovider.id=embedded line, as follows:

          #usergroupprovider.id=embedded

      When the property is enabled and set to embedded, Control Hub uses either the built-in Control Hub authentication method or the SAML authentication method. Organization administrators configure these authentication methods for each organization.
   b) Uncomment the second usergroupprovider.id property, as follows:

          usergroupprovider.id=m

      Do not change the value of this property.
2. Configure the following global properties used by all LDAP configurations included in the file:

   Global Property - Description

   usergroupprovider.m.providerclass
       Do not change this value.

   usergroupprovider.m.multi.ids
       Unique ID for each LDAP configuration. The ID can be any string value. To include multiple LDAP configurations, use a comma-separated list. For example, if you want to connect to both a Microsoft Active Directory server and to an OpenLDAP server, you might define the property as follows:

           usergroupprovider.m.multi.ids=ad,openldap

       Default is L.

   usergroupprovider.m.multi.fetchgroups
       Retrieves group information from the provider. Must be true if you map Control Hub groups to the provider groups. Default is true.

   usergroupprovider.m.multi.allgroupsproviderid
       Not used at this time.

3. Configure the following properties for each LDAP configuration.
   Each LDAP property includes the following prefix:

       usergroupprovider.m.multi.<ID>.ldap

   When you configure the properties, replace the <ID> variable in each property name with the unique ID that you defined for the LDAP configuration. For example, if you defined both AD and OpenLDAP as IDs to create multiple LDAP configurations, then you would replace the <ID> variable in the first three properties as follows:

       # Connection information for Active Directory
       usergroupprovider.m.multi.ad.providerclass=com.streamsets.apps.security.authentication.ldap.ldapusergroupprovider
       usergroupprovider.m.multi.ad.ldap.poolminconnections=3
       usergroupprovider.m.multi.ad.ldap.poolmaxconnections=10
       ...
       # Connection information for OpenLDAP
       usergroupprovider.m.multi.openldap.providerclass=com.streamsets.apps.security.authentication.ldap.ldapusergroupprovider
       usergroupprovider.m.multi.openldap.ldap.poolminconnections=3
       usergroupprovider.m.multi.openldap.ldap.poolmaxconnections=10
       ...

   For simplicity, the table below has dropped the following prefix from each LDAP property:

       usergroupprovider.m.multi.<ID>.ldap

   Property - Description

   providerclass
       Use the correct <ID> variable in the property prefix, but do not change the property value.

   poolminconnections
       Minimum number of connections for the Bind DN connection pool. Default is 3.

   poolmaxconnections
       Maximum number of connections for the Bind DN connection pool. Default is 10.

   poolvalidateconnections
       Validate the connections retrieved from the Bind DN connection pool. Default is true.

   connectiontimeoutmillis
       Connection timeout in milliseconds.

   responsetimeoutmillis
       Response timeout in milliseconds.

   hostname
       LDAP server host name.

   port
       LDAP server port. To use unencrypted connections or to use connections secured with StartTLS, enter the LDAP port number, typically 389. To use connections secured with LDAPS, enter the port number for secure connections, typically 636.

   ldaps
       Secure LDAP connections using the LDAPS (LDAP over SSL) protocol. Default is false. You must complete additional steps to use LDAPS. See Configuring Secure Connections to LDAP.

   starttls
       Secure LDAP connections using the StartTLS protocol. Default is false. You must complete additional steps to use StartTLS. See Configuring Secure Connections to LDAP.
       Note: StartTLS and LDAPS cannot be used at the same time. If both starttls and ldaps are set to true, starttls takes precedence.

   userbasedn
       Base DN under which user accounts are located.

   userobjectclass
       Name of the user object class. Default is inetorgperson.

   usernameattribute
       Name of the user ID attribute. Default is uid.

   user attribute
       Name of the email address attribute. Default is mail.

   userfullnameattribute
       Name of the user full name attribute. Default is cn.

   userfilter
       User attribute used to log in to Control Hub. For example, LDAP users can log in using a username, uid, or email address. Default is %s={user}, where %s is replaced with the value of the usernameattribute property.

   binddn
       Root distinguished name (DN) used to query the directory server. This user must have privileges to search the directory.

   bindpassword
       Password for the root distinguished name.

   fetchgroups
       Retrieves group information from the LDAP provider. Must be true if you map Control Hub groups to LDAP groups. Default is true.

   groupbasedn
       Base DN to search for group membership.

   groupobjectclass
       Name of the group object class. Default is groupofuniquenames.

   groupmemberattribute
       Name of the group attribute for user names. Default is uniquemember.

   groupnameattribute
       Name of the attribute for group names. Default is cn.

   groupfullnameattribute
       Name of the group attribute included with the group display name. Default is description.

   groupfilter
       Group attribute used to look up the group memberships for a user. For example, you can search for a user's groups using a base DN, username, or uid. Default is %s={dn}, where %s is replaced with the value of the groupmemberattribute property. The group object class is also automatically added to the group filter.

4. Shut down and then launch Control Hub again to enable the changes.
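Because ldaps and starttls cannot both be in effect for one LDAP configuration, a quick check of the edited file before restarting can catch a conflicting pair. The following is a minimal sketch under stated assumptions: the helper names ldap_prop and check_ldap_security are hypothetical, properties are written as key=value with no spaces around the equals sign, and matching is case-insensitive because the property casing in the shipped file may differ from the names as printed in this guide.

```shell
#!/usr/bin/env bash
# Sketch: flag an LDAP configuration ID that sets both ldaps and starttls to true.
# ldap_prop and check_ldap_security are hypothetical helpers, not Control Hub tools.

# Print the value of usergroupprovider.m.multi.<id>.ldap.<name> from a properties file.
# If the property appears more than once, the last occurrence is printed.
ldap_prop() {
  local file="$1" id="$2" name="$3"
  grep -i -E "^usergroupprovider\.m\.multi\.${id}\.ldap\.${name}=" "$file" \
    | tail -n 1 | cut -d= -f2-
}

# Return 1 (and print a warning on stderr) when both properties are true for the ID.
check_ldap_security() {
  local file="$1" id="$2"
  local ldaps starttls
  ldaps="$(ldap_prop "$file" "$id" ldaps)"
  starttls="$(ldap_prop "$file" "$id" starttls)"
  if [ "$ldaps" = "true" ] && [ "$starttls" = "true" ]; then
    echo "ldaps and starttls are both true for '$id'; starttls takes precedence" >&2
    return 1
  fi
  return 0
}
```

For example, check_ldap_security "$DPM_CONF/security-app.properties" ad would inspect the configuration with the ID ad. The check only reads the file, so it can be run at any point before the restart in the last step.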

Example for Active Directory

The following example shows a configuration to connect to a Microsoft Active Directory server:

    #usergroupprovider.id=embedded
    usergroupprovider.externalprovider.principalcache.expiration.secs=60
    usergroupprovider.id=m
    usergroupprovider.m.providerclass=com.streamsets.apps.security.authentication.multiusergroupprovider
    usergroupprovider.m.multi.ids=ad
    usergroupprovider.m.multi.fetchgroups=true
    usergroupprovider.m.multi.allgroupsproviderid=l
    usergroupprovider.m.multi.ad.providerclass=com.streamsets.apps.security.authentication.ldap.ldapusergroupprovider
    usergroupprovider.m.multi.ad.ldap.poolminconnections=3
    usergroupprovider.m.multi.ad.ldap.poolmaxconnections=10
    usergroupprovider.m.multi.ad.ldap.poolvalidateconnections=true
    usergroupprovider.m.multi.ad.ldap.connectiontimeoutmillis=5000
    usergroupprovider.m.multi.ad.ldap.responsetimeoutmillis=5000
    usergroupprovider.m.multi.ad.ldap.hostname=abc01.mycompany.net
    usergroupprovider.m.multi.ad.ldap.port=636
    usergroupprovider.m.multi.ad.ldap.ldaps=true
    usergroupprovider.m.multi.ad.ldap.starttls=false
    usergroupprovider.m.multi.ad.ldap.userbasedn=ou=mycompany,dc=mycompany,dc=net
    usergroupprovider.m.multi.ad.ldap.userobjectclass=organizationalperson
    usergroupprovider.m.multi.ad.ldap.usernameattribute=samaccountname
    usergroupprovider.m.multi.ad.ldap.user attribute=mail
    usergroupprovider.m.multi.ad.ldap.userfullnameattribute=cn
    usergroupprovider.m.multi.ad.ldap.userfilter=%s={user}
    usergroupprovider.m.multi.ad.ldap.binddn=admin@mycompany.net
    usergroupprovider.m.multi.ad.ldap.bindpassword=******
    usergroupprovider.m.multi.ad.ldap.fetchgroups=true
    usergroupprovider.m.multi.ad.ldap.groupbasedn=ou=mycompany,dc=mycompany,dc=net
    usergroupprovider.m.multi.ad.ldap.groupobjectclass=group
    usergroupprovider.m.multi.ad.ldap.groupmemberattribute=member
    usergroupprovider.m.multi.ad.ldap.groupnameattribute=cn
    usergroupprovider.m.multi.ad.ldap.groupfullnameattribute=description
    usergroupprovider.m.multi.ad.ldap.groupfilter=%s={dn}

Example for OpenLDAP

The following example shows a configuration to connect to an OpenLDAP server:

    #usergroupprovider.id=embedded
    usergroupprovider.externalprovider.principalcache.expiration.secs=60
    usergroupprovider.id=m
    usergroupprovider.m.providerclass=com.streamsets.apps.security.authentication.multiusergroupprovider
    usergroupprovider.m.multi.ids=openldap
    usergroupprovider.m.multi.fetchgroups=true
    usergroupprovider.m.multi.allgroupsproviderid=l
    usergroupprovider.m.multi.openldap.providerclass=com.streamsets.apps.security.authentication.ldap.ldapusergroupprovider
    usergroupprovider.m.multi.openldap.ldap.poolminconnections=3
    usergroupprovider.m.multi.openldap.ldap.poolmaxconnections=10
    usergroupprovider.m.multi.openldap.ldap.poolvalidateconnections=true
    usergroupprovider.m.multi.openldap.ldap.connectiontimeoutmillis=5000
    usergroupprovider.m.multi.openldap.ldap.responsetimeoutmillis=5000
    usergroupprovider.m.multi.openldap.ldap.hostname=abc02.mycompany.net
    usergroupprovider.m.multi.openldap.ldap.port=389
    usergroupprovider.m.multi.openldap.ldap.ldaps=false
    usergroupprovider.m.multi.openldap.ldap.starttls=false
    usergroupprovider.m.multi.openldap.ldap.userbasedn=ou=employees,dc=example,dc=org
    usergroupprovider.m.multi.openldap.ldap.userobjectclass=inetorgperson
    usergroupprovider.m.multi.openldap.ldap.usernameattribute=uid
    usergroupprovider.m.multi.openldap.ldap.user attribute=mail
    usergroupprovider.m.multi.openldap.ldap.userfullnameattribute=cn
    usergroupprovider.m.multi.openldap.ldap.userfilter=%s={user}
    usergroupprovider.m.multi.openldap.ldap.binddn=cn=admin,dc=example,dc=org
    usergroupprovider.m.multi.openldap.ldap.bindpassword=******
    usergroupprovider.m.multi.openldap.ldap.fetchgroups=true
    usergroupprovider.m.multi.openldap.ldap.groupbasedn=ou=departments,dc=example,dc=org
    usergroupprovider.m.multi.openldap.ldap.groupobjectclass=groupofnames
    usergroupprovider.m.multi.openldap.ldap.groupmemberattribute=member
    usergroupprovider.m.multi.openldap.ldap.groupnameattribute=cn
    usergroupprovider.m.multi.openldap.ldap.groupfullnameattribute=description
    usergroupprovider.m.multi.openldap.ldap.groupfilter=%s={dn}

Configuring Secure Connections to LDAP (Optional)

You can optionally configure Control Hub to use one of the following methods to make secure connections to the LDAP server:

LDAP over SSL (LDAPS)
    LDAPS uses SSL to encrypt LDAP connections. LDAPS uses the ldaps:// scheme.

StartTLS
    StartTLS can wrap an unencrypted connection with TLS during the connection process. This allows the same port to handle both unencrypted and encrypted connections. StartTLS uses the ldap:// scheme.

Use the same procedure to configure either secure method.

1. In the $DPM_CONF/security-app.properties file, set one of the following properties to true:

       usergroupprovider.m.multi.<ID>.ldap.ldaps
       usergroupprovider.m.multi.<ID>.ldap.starttls

   By default, both properties are false, so Control Hub makes unencrypted connections to the LDAP server. If you set both properties to true, StartTLS takes precedence.
2. Set the port property in the security-app.properties file as required, based on the method that you enabled:
   - LDAPS - Use the port number for secure connections, typically 636.
   - StartTLS - Use the LDAP port number, typically 389.
3. Define the following options in the DPM_JAVA_OPTS environment variable:
   - javax.net.ssl.trustStore - path to truststore file
   - javax.net.ssl.trustStorePassword - truststore password

   Modify environment variables in the file required by your installation type.
   For example, define the options as follows:

       export DPM_JAVA_OPTS="-Djavax.net.ssl.trustStore=<path to truststore file> -Djavax.net.ssl.trustStorePassword=<password> -server -Xmx1024m -Xms1024m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./dpm-$$-outofmem.hprof -XX:ErrorFile=./dpm-$$-error.log ${DPM_JAVA_OPTS}"

   Or, to secure the password, save the password in a text file and then define the truststore password option as follows:

       -Djavax.net.ssl.trustStorePassword=$(cat passwordfile.txt)

4. Shut down and then launch Control Hub again to enable the changes.

Mapping Organization Administrators to LDAP Users

As the system administrator, you must map each StreamSets Control Hub organization administrator to an LDAP user account.

The steps you use to map organization administrators depend on whether you are creating new organizations or editing existing organizations created before LDAP authentication was enabled:

New organizations
    When you create new organizations, you enter the name of the LDAP user account to map to the organization administrator created with the organization, as described in Creating Organizations.

Existing organizations
    If you have existing organizations created before LDAP authentication was enabled, edit the organization administrator user accounts as follows:
    1. In the Navigation panel, click Administration > Users.
    2. Click the Toggle Filter Column icon to filter the users by organization.
    3. Select an organization and then click the existing organization administrator to display the account details.
    4. Enter the name of the LDAP user account to map to the organization administrator in the LDAP User Name property.
    5. Click Save.

Mapping Users and Groups to LDAP Users and Groups

Organization administrators must map StreamSets Control Hub users and groups to LDAP users and groups for their organization.

When using LDAP authentication with Control Hub, we recommend that organization administrators follow these best practices:

1. Create Control Hub user accounts for the organization and map them to LDAP user accounts by entering the name of the LDAP user in the LDAP User Name property.
   Assign the Organization User role to each Control Hub user account, and then clear all other roles.
2. Create Control Hub groups and map them to LDAP groups by entering the name of the LDAP group in the LDAP Group Name property. LDAP group names are case sensitive.
   When LDAP authentication is enabled, you cannot assign Control Hub users to Control Hub groups. Control Hub retrieves the group memberships from the LDAP provider.
3. Assign Control Hub roles to the Control Hub groups.
   To more efficiently manage user accounts, assign roles to groups rather than individually to each user account.

Step 14. Create a Backup System Administrator

Log into Control Hub and create a backup system administrator for the system organization in case you lose the password for the default system administrator.

1. Log into Control Hub as the system administrator, using the admin@admin user account and the Control Hub password for that account.
   Note: If you enabled LDAP authentication, the admin@admin system administrator is still authenticated by Control Hub using Control Hub credentials.
2. In the Navigation panel, click Administration > Users.
3. Click the Add New User icon.
4. Enter a user ID in the following format: <ID>@admin.
   Note: When you create a user, you can only create the user for the organization that you are currently logged into - the "admin" organization in this case.
5. Enter a display name.
6. If you did not enable LDAP authentication, enter an email address.
   When LDAP authentication is enabled, Control Hub retrieves a user's email address from the LDAP provider.
7. If you enabled LDAP authentication, enter the name of the LDAP user account to map to this backup system administrator in the LDAP User Name property.
8. Clear all of the default roles, and then select the System Administrator role.
   If you did not enable LDAP authentication, the configured Add User window should look like this:
9. Click Save.
   Control Hub sends an email to the specified email address so that you can change the password for this backup system administrator account.

Step 15. Create Organizations

An organization is a secure space provided to a set of Control Hub users. All Data Collectors, pipelines, jobs, and topologies added by any user in the organization belong to that organization. A user logs in to Control Hub as a member of an organization and can access data that belongs to that organization only.

When you create an organization, you create an organization administrator that can perform administrative tasks for that organization only.

Control Hub includes a system organization with an ID of "admin" that includes the default system administrator user account. The system administrator can complete tasks across all Control Hub organizations. We recommend creating at least one backup system administrator account, as described in the previous step.

You can add additional non-admin users to the system organization. However, as a best practice, create one or more organizations for your enterprise separate from the system organization.

For example, you might create a single organization named My Company for your enterprise where you add all user accounts. When you log in to Control Hub as the system administrator, you can see both the system and the My Company organizations:

When the organization administrator for the My Company organization logs in, the organization administrator can see the My Company organization only.

You can create multiple organizations for your enterprise. For example, you might create one organization for the Northern Office and another organization for the Southern Office. Users in the Northern Office organization cannot access any data that belongs to the Southern Office organization.

For more details about the system organization and creating multiple organizations, see Organizations.

Creating Organizations

Create organizations before creating additional user accounts or registering Data Collectors. When you create an organization, you also create an organization administrator that can perform administrative tasks for that organization only.

1. In the Navigation panel, click Administration > Organizations.
2. Click the Add New Organization icon.
3. On the Add Organization window, configure the following properties:

   Organization Property - Description
   Organization ID - ID that uniquely identifies the organization.
   Organization Name - Display name for the organization.
   Admin User ID - ID that uniquely identifies the organization administrator. Use the following format: ID>
   Admin User Display Name - Display name for the organization administrator.
   Admin User Email Address - If LDAP authentication is not enabled, email address for the organization administrator. When LDAP authentication is enabled, Control Hub retrieves a user's email address from the LDAP provider.
   LDAP User Name - If LDAP authentication is enabled, name of the LDAP user account to map to this organization administrator.
   Default User Password Expiry Time - Maximum number of days that a user password is valid. Users in the organization must reset their password when the maximum number of days is reached.
   Valid Domains - List of trusted domains that can make authentication requests to Control Hub on behalf of the organization. Organization administrators can configure the valid domains for the organization, as described in the Control Hub online help. Access the online help from the Help icon.
   Active - Enables access to Control Hub. When disabled, users belonging to the organization cannot log in to Control Hub or any Data Collector registered with the organization.

4. Click Save.

For introductory Control Hub tutorials, including creating users and groups for an organization and registering Data Collectors, see the Control Hub online help. Access the online help from the Help icon.

Setting Up a Highly Available Environment

In a production environment, we recommend using multiple Control Hub instances and a load balancer to ensure that Control Hub is highly available.

Note: You can run a single Control Hub instance in a development environment. If you are setting up a development environment, you can skip this section.

When you plan a highly available environment, consider the following requirements:

- Use highly available database clusters dedicated to the production environment.
  Before you set up a highly available environment, ensure that you have installed MySQL Enterprise High Availability, or are using PostgreSQL with high availability enabled, along with InfluxEnterprise, and that you have created relational and time series databases dedicated to the production environment. If you created development databases for an initial Control Hub development environment as described in Creating the Databases, then follow the same steps to create the production databases.
- Install Control Hub on at least two machines to support Control Hub failover.
  Install Control Hub on additional machines if you find that you need additional instances to handle the number of requests to Control Hub.
- Use unique component IDs for each Control Hub instance.
  Define a unique application component ID for each instance. The default component ID is <application name>000. For example: notification000 and jobrunner000.
  You might want to increment the component ID for each Control Hub instance. For example: <application name>001 for the first instance and <application name>002 for the second instance. Or, you can set the component ID to the IP address of each machine, for example: <application name>

Related Information
High Availability on page 16

Step 1. Set Up Time Synchronization

Set up time synchronization for each additional Control Hub machine using the same NTP server that you configured for the initial Control Hub instance. See Set Up Time Synchronization.

Step 2. Set Up a Load Balancer for Control Hub

Set up a load balancer to distribute requests from users and from registered Data Collectors, Data Collector Edges, and Provisioning Agents across the Control Hub system. These Control Hub clients communicate with the Control Hub system through the load balancer URL.

We recommend using a Layer 7 load balancer such as HAProxy, NGINX, or F5.

Configure the following information to set up the load balancer:

HTTPS protocol
  Configure the load balancer to use the HTTPS protocol. As a best practice, configure the Control Hub instances to use the HTTPS protocol also. If you need the Control Hub instances to use the HTTP protocol, then you must configure the load balancer to add the X-Forwarded-Proto header to all inbound requests.

IP address and port numbers for each Control Hub instance
  Configure the IP address and both of the following port numbers for each Control Hub instance:
  - Control Hub port number. Default is
  - Control Hub Admin tool port number. Default is

Control Hub backend definitions
  Define the following backend definitions for Control Hub:
  - /jobrunner
  - /messaging
  - /pipelinestore
  - /provisioning
  - /reporting
  - /scheduler
  - /security
  - /timeseries
  - /topology
  - /notification/rest/v1
  - /sla/rest/v1
  Define the following backend definition for the Control Hub Admin tool:
  - /rest/v1/system

After setting up the load balancer, configure network routes and firewalls so that the Control Hub instances, web UI clients, and registered Data Collectors, Data Collector Edges, and Provisioning Agents can reach the load balancer.

Step 3. Install the Initial Control Hub Instance

When you install the initial Control Hub instance for a highly available environment, follow the complete installation process as described in Installing Control Hub on page 27. Then, make sure that the Control Hub instance can access the front end of the load balancer to communicate with the other Control Hub instances.
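The backend definitions in Step 2 amount to path-based routing: requests whose path starts with one of the listed prefixes go to a Control Hub port or to the Admin tool port. The following Python sketch only illustrates that mapping. It is not load balancer configuration, and the backend labels "control_hub" and "admin_tool", the function names, and the exact prefix-matching rule are assumptions made for the example. The guide lists only these prefixes, so the sketch returns None for any other path.

```python
# Path prefixes copied from the backend definitions listed in Step 2.
CONTROL_HUB_PREFIXES = [
    "/jobrunner", "/messaging", "/pipelinestore", "/provisioning",
    "/reporting", "/scheduler", "/security", "/timeseries", "/topology",
    "/notification/rest/v1", "/sla/rest/v1",
]
ADMIN_TOOL_PREFIXES = ["/rest/v1/system"]


def _matches(path, prefix):
    # Assumption: a prefix matches the exact path or any sub-path below it.
    return path == prefix or path.startswith(prefix + "/")


def pick_backend(path):
    """Return a hypothetical backend label for a request path, or None."""
    if any(_matches(path, p) for p in ADMIN_TOOL_PREFIXES):
        return "admin_tool"
    if any(_matches(path, p) for p in CONTROL_HUB_PREFIXES):
        return "control_hub"
    return None


if __name__ == "__main__":
    for sample in ("/jobrunner/rest/v1/jobs", "/rest/v1/system/health", "/other"):
        print(sample, "->", pick_backend(sample))
```

A table like this can serve as a checklist when translating the definitions into the configuration syntax of HAProxy, NGINX, or F5.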

Step 4. Install Additional Control Hub Instances

When you install additional Control Hub instances on separate machines for a highly available environment, use the shortened installation process described here.

1. Install Control Hub on a separate machine that meets the installation requirements using one of the following installation methods:
   - Installing from the Tarball
   - Installing from the RPM Package
2. Download the JDBC driver for the relational database that you are using:
   - MySQL - Download the MySQL JDBC driver version or later from the following location:
   - PostgreSQL - Download the PostgreSQL JDBC driver version or later from the following location:
3. Copy the driver to the following directory:
       $DPM_HOME/extra-lib
   For example, copy the driver to the following directory in an RPM installation:
       /opt/streamsets-dpm/extra-lib
4. Set the $DPM_HOME and $DPM_CONF environment variables.
   Use the following command to set the $DPM_HOME environment variable:
       export DPM_HOME=<home directory>
   For example:
       export DPM_HOME=/opt/streamsets-dpm
   Use the following command to set the $DPM_CONF environment variable:
       export DPM_CONF=<configuration directory>
   For example:
       export DPM_CONF=/etc/dpm
5. Copy all files from the $DPM_CONF directory in the initial Control Hub instance to the $DPM_CONF directory in this additional Control Hub instance.
   By copying all configuration files, you ensure that this additional Control Hub instance connects to the same load balancer, databases, and SMTP account as the initial Control Hub instance.
6. Update the configuration file for each Control Hub application to define a unique component ID for this Control Hub instance.
   Modify the value of the dpm.componentid property in these files located in the Control Hub configuration directory, $DPM_CONF:
   - dpm.properties
   - jobrunner-app.properties
   - messaging-app.properties
   - notification-app.properties
   - pipelinestore-app.properties
   - provisioning-app.properties
   - reporting-app.properties
   - scheduler-app.properties
   - security-app.properties
   - sla-app.properties
   - timeseries-app.properties
   - topology-app.properties
   The default component ID for each instance is <application name>000. For example: notification000 and jobrunner000.
   You might want to increment the component ID for each Control Hub instance. For example: <application name>001 for the first instance and <application name>002 for the second instance. Or, you can set the component ID to the IP address of each machine, for example: <application name>
7. Run the security script to generate a unique authentication token for each Control Hub application.
   Use the following command to run the security script from the $DPM_HOME directory:
       dev/02-initsecurity.sh <component ID>
   For example, if you defined the component ID for this installation as <application name>002, use the following command:
       dev/02-initsecurity.sh
8. Make sure that the Control Hub instance can access the front end of the load balancer to communicate with the other Control Hub instances.

Repeat these steps for each additional Control Hub instance.

Step 5. Start Each Control Hub Instance

Start each Control Hub instance from the command prompt, and then use the load balancer URL to log into Control Hub.

1. Start the load balancer.
2. Start each Control Hub instance from the command prompt, using the required command for your installation type:

   Tarball installation
     Use the following command from the $DPM_HOME directory to start Control Hub so that it runs as the system user account logged into the command prompt:
         bin/streamsets dpm
     Or, use the following command to start Control Hub and run it in the background:
         nohup bin/streamsets dpm &
     Use the following command to start Control Hub so that it runs as another system user account:
         sudo -u <user> bin/streamsets dpm

   RPM installation
     Use the required command for your operating system to start Control Hub as a service:
     For CentOS 6.x or Red Hat Enterprise Linux 6.x, use:
         service dpm start
     For CentOS 7.x or Red Hat Enterprise Linux 7.x, use:
         systemctl start dpm

   For more details about starting each instance, see Start DPM.
3. To log in to Control Hub, enter the load balancer URL in the address bar of your browser.
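Updating dpm.componentid in twelve files by hand (Step 4, item 6) is easy to get wrong, so the edit can be scripted. The Python sketch below is one possible approach and not a StreamSets tool. It assumes that each file contains a line of the form dpm.componentid=<application name>000, that is, a value ending in digits as the stated default does, and it replaces only those trailing digits. The function names and the suffix argument are hypothetical.

```python
import re
from pathlib import Path

# File names copied from the list in Step 4, item 6.
APP_FILES = [
    "dpm.properties", "jobrunner-app.properties", "messaging-app.properties",
    "notification-app.properties", "pipelinestore-app.properties",
    "provisioning-app.properties", "reporting-app.properties",
    "scheduler-app.properties", "security-app.properties",
    "sla-app.properties", "timeseries-app.properties",
    "topology-app.properties",
]

# Assumption: the property value ends in digits, as in the default "...000".
_COMPONENT_ID = re.compile(r"^(\s*dpm\.componentid\s*=\s*)(\S*?)(\d+)\s*$")


def set_instance_suffix(properties_text, suffix):
    """Replace the trailing digits of dpm.componentid with `suffix`."""
    lines = []
    for line in properties_text.splitlines():
        match = _COMPONENT_ID.match(line)
        if match:
            line = f"{match.group(1)}{match.group(2)}{suffix}"
        lines.append(line)
    return "\n".join(lines) + "\n"


def update_conf_dir(conf_dir, suffix):
    """Rewrite dpm.componentid in every listed file found in conf_dir."""
    for name in APP_FILES:
        path = Path(conf_dir) / name
        if path.exists():
            path.write_text(set_instance_suffix(path.read_text(), suffix))
```

For a second instance, a call such as update_conf_dir("/etc/dpm", "002") would apply the change in place. Back up the $DPM_CONF directory before running anything like this.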

Uninstalling Control Hub

To uninstall a Control Hub instance, shut down Control Hub and then remove all Control Hub directories. If no additional Control Hub instances are using the same relational and time series databases, then remove those databases also. If no additional Control Hub instances are using the same system Data Collectors, then uninstall those Data Collectors also.

1. Shut down Control Hub.
   To shut down when Control Hub is started as a service, use the required command for your operating system:
   For CentOS 6.x or Red Hat Enterprise Linux 6.x, use:
       service dpm stop
   For CentOS 7.x or Red Hat Enterprise Linux 7.x, use:
       systemctl stop dpm
   To shut down when Control Hub is started manually from the tarball, use the Control Hub process ID in the following command:
       kill <process ID>
2. Remove the Control Hub home directory, $DPM_HOME.
3. If no additional Control Hub instances are using the same relational and time series databases, then remove those databases.
   a) To remove databases from the relational database, connect to MySQL or PostgreSQL and then use the following command to drop the database for each application:
          DROP DATABASE <database name>;
      For example, if you created databases with names that matched each application, then use the following commands:
          DROP DATABASE jobrunner;
          DROP DATABASE messaging;
          DROP DATABASE notification;
          DROP DATABASE pipelinestore;
          DROP DATABASE provisioning;
          DROP DATABASE reporting;
          DROP DATABASE scheduler;
          DROP DATABASE security;
          DROP DATABASE sla;
          DROP DATABASE timeseries;
          DROP DATABASE topology;
   b) To remove databases from the time series database, connect to InfluxDB and then use the following Influx Query Language (InfluxQL) statement to remove each database:
          DROP DATABASE <database name>
      For example, if you named the Metrics database "sch" and named the Application Metrics database "sch_app", run the following statements:
          DROP DATABASE sch
          DROP DATABASE sch_app
4. If no additional Control Hub instances are using the same system Data Collectors, then remove those Data Collectors. For instructions, see Uninstallation in the Data Collector documentation.
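If the databases in item 3 of the uninstall steps use custom names, the statements can be generated from a list and reviewed before anyone runs them. The Python sketch below only prints text and does not connect to any database. The function name, the terminator parameter, and the name check are assumptions added for illustration.

```python
import re

# Database names from the example above, where each database is named
# after its application.
APPLICATION_DATABASES = [
    "jobrunner", "messaging", "notification", "pipelinestore",
    "provisioning", "reporting", "scheduler", "security", "sla",
    "timeseries", "topology",
]

# Assumption: plain identifiers only, so a typo cannot inject other SQL.
_SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def drop_statements(database_names, terminator=";"):
    """Build one DROP DATABASE statement per name, without executing it."""
    statements = []
    for name in database_names:
        if not _SAFE_NAME.fullmatch(name):
            raise ValueError(f"unexpected database name: {name!r}")
        statements.append(f"DROP DATABASE {name}{terminator}")
    return statements


if __name__ == "__main__":
    # Relational database (MySQL or PostgreSQL): statements end with ";".
    print("\n".join(drop_statements(APPLICATION_DATABASES)))
    # Time series database (InfluxQL): shown without a terminator above.
    print("\n".join(drop_statements(["sch", "sch_app"], terminator="")))
```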

Chapter 4
Administration

This chapter includes the following information:

- Overview
- Organizations
- Dashboards
- Messaging View
- Pipeline Templates
- Data Collector Version Range
- Administer Control Hub Applications
- Logs
- Control Hub Configuration
- Control Hub Environment Configuration
- Shutting Down Control Hub
- Renewing the Control Hub License
- Control Hub Admin Tool

Overview

As the StreamSets Control Hub system administrator, you can access additional areas of the Control Hub user interface and the separate Admin tool so that you can administer, monitor, and troubleshoot the complete Control Hub system.

A StreamSets Control Hub system includes the following tools:

Control Hub
  When you log into Control Hub as the system administrator, you can access additional areas of the user interface so that you can complete tasks across all organizations. You can view the metadata of all objects across all organizations; however, you do not have access to the objects. For example, you can view the name and description of all pipelines, but you cannot view the pipeline configuration in the canvas.
  As a system administrator, you can also monitor Control Hub applications. You can access additional dashboards and can monitor the messaging queue managed by the Messaging application. You can also view log files and modify Control Hub configuration property files that are included in the Control Hub installation directory.
  When you launch Control Hub from the command prompt, the first URL listed in the command output is the Control Hub URL: name>:18631

Admin tool
  In addition to the main StreamSets Control Hub tool that all Control Hub users log into, system administrators can log into the separate Admin tool to monitor and troubleshoot Control Hub issues. For example, if Control Hub becomes inaccessible, the Admin tool remains running. You can still log into the Admin tool to troubleshoot the Control Hub issues.
  When you launch Control Hub from the command prompt, the second URL listed in the command output is the Admin tool URL: name>:18632/admin.html

Organizations

An organization is a secure space provided to a set of Control Hub users. All Data Collectors, pipelines, jobs, and topologies added by any user in the organization belong to that organization.
A user logs in to Control Hub as a member of an organization and can access data that belongs to that organization only. Control Hub includes a default system organization with an ID of "admin" and a single system administrator user account. A system administrator can complete tasks across all Control Hub organizations. Create organizations for your enterprise separate from the system organization. When you create an organization, you create an organization administrator that can perform administrative tasks for that organization only. You can create a single organization for your enterprise where you add all users. Or you can create multiple organizations for your enterprise. For example, you might create one organization for the Northern Office and another organization for the Southern Office. Users in the Northern Office organization cannot access any data that belongs to the Southern Office organization. You can use groups to efficiently assign roles and permissions to sets of users within an organization without having to edit individual users. If you create multiple organizations, you can configure global properties that affect all organizations.

System Organization and Administrator

Control Hub includes a single default system organization with an ID of "admin". The system organization functions the same as all other organizations with the following exceptions:

- Users in the system organization can be assigned the System Administrator role. The System Administrator role is not available for any other organization.
- Security Assertion Markup Language (SAML) authentication cannot be enabled for the system organization. Users in the system organization can be authenticated using built-in Control Hub authentication only.
- The system organization includes a set of pipelines that can be used as pipeline templates when users in other organizations design pipelines in Control Hub. For more information, see Pipeline Templates.

You can log in to Control Hub as the default system administrator using the user ID of admin@admin. Or, you can create additional users in the system organization and assign the users the System Administrator role. We recommend creating at least one backup system administrator account, in case you lose the password for the default system administrator.

A user with the System Administrator role can complete the following tasks:

- Create and configure other organizations.
- Configure global properties for all organizations.
- View the metadata of all objects across all organizations. For example, a user with the System Administrator role can view the name and description of all pipelines, but cannot view the pipeline configuration in the canvas.
- Monitor Control Hub applications.
- Monitor the messaging queue managed by the Messaging application.
- Register and administer Data Collectors and configure users and groups for the system organization.

Organizations and Groups

You can use both organizations and groups to create sets of users. However, there are important differences between the two:

Organizations
  Organizations are required. When you create a user, you must specify the organization that the user belongs to.
  Organizations are completely independent from each other. Data cannot be shared between organizations.
  After logging in to Control Hub, users can see the data only for the organization that they logged into. Users cannot view data across organizations in a single login session. If a user needs to access data belonging to two different organizations, the user must have an account in each organization.
  Users can share objects with other users that belong to the same organization - but they cannot share objects across organizations.

Groups
  Groups are optional groupings of users within a single organization. Use groups to more efficiently assign roles and permissions to sets of users without having to edit individual users. When you create a user, you can optionally specify the groups that the user belongs to.
  Groups can be independent from each other, based on how you assign permissions to the groups and users within the groups. Data can also be shared between groups.
  After logging in to Control Hub, users who belong to multiple groups can see all data that all of the groups have been granted access to. Users can view data across groups in a single login session.
  Users can share objects with users in different groups within a single organization.

You can use both organizations and groups to create a multitenant environment:

- To create a multitenant environment with organizations, simply create multiple organizations and add the appropriate users to each organization.
- To create a multitenant environment with multiple groups in a single organization, enable permissions for the organization, create groups of users, and then share objects within the groups to grant each group access to the appropriate objects.

For more information about using groups and permissions to create a multitenant environment, see the StreamSets Control Hub online help. Access the online help from the Help icon.

Global Organization Configuration

You can configure global properties that affect all organizations. Some properties can be overridden by the organization administrator for each organization.

To configure global properties, click the Global Configuration icon in the top toolbar of the Organizations view.

You can globally configure the following organization properties:

Organization Property - Description
Maximum number of jobs, pipelines, Data Collectors, users, and topologies in the organization - Maximum number of objects of each type in the organization. These properties can only be set by the system administrator. An organization administrator can view the values, but cannot change them.
Enforce Permissions During Object Access - Enable permission enforcement to secure the integrity of organization data. Disable permission enforcement if you want all users in the organization to have full access to all objects. An organization administrator can override this property for an organization.
Enable Events to Trigger Subscriptions - Enable events so that Control Hub can trigger subscriptions for organizations. Disable events if you do not want users to use subscriptions. An organization administrator can override this property for an organization.
Enable Time Series Analysis - Enables Control Hub to store time series data for organizations, which users can analyze when monitoring jobs. When time series analysis is disabled, users can still view the total record count and throughput for a job, but cannot view the data over a period of time. For example, they can't view the record count for the last five minutes or for the last hour.
Maximum Password Validity - Maximum number of days that a user password is valid. An organization administrator can override this property for an organization.
Inactivity Period for Session Termination - Maximum number of minutes that a user session can remain inactive before timing out. An organization administrator can override this property for an organization.

You can also globally configure all SAML identity provider (IdP) properties. An organization administrator can override any of the IdP properties.

Dashboards

The Dashboards view includes the default dashboard that all organization administrators can access. When you log in as the system administrator, the default dashboard provides a summary of the system organization activity.

You can also access additional dashboards that help you monitor and troubleshoot the entire Control Hub system. Click Default Dashboard in the top toolbar of the Dashboards view to display the list of additional dashboards.

The additional dashboards include monitoring information and metrics about the Control Hub applications and the time series database. The JVM Metrics dashboard includes metrics about CPU, memory, and heap usage for Control Hub.

Messaging View

The Messaging view in the Navigation panel lists messages waiting in the messaging queue that have not yet been retrieved by the receiving Control Hub component. The Messaging application manages the messaging queue.

For example, when a user starts a job in Control Hub, the Job Runner application sends a pipeline start request for a specific Data Collector to the Messaging application. The Messaging application retains the request and displays it in the Messaging view until the receiving Data Collector retrieves the request.

The Messaging view lists messages by organization. Select the organization that you want to monitor.

Related Information
Data Collector Communication on page 12

Pipeline Templates

The default system organization with an ID of "admin" includes a set of published pipelines. When data engineers in other organizations design pipelines in the Control Hub Pipeline Designer, they can use any published pipeline belonging to the system organization as a pipeline template.

You can edit the provided templates and create new templates. You can also configure another organization to manage the templates. For example, you might create your own set of pipeline templates that include the stages and processing your pipelines require.
Data engineers can use the templates as a basis for their pipeline development.

To view the list of provided pipeline templates, log in as the system administrator and then in the Pipeline Repository view, filter the list of pipelines by the system organization. The provided templates include Data Collector and SDC Edge pipeline templates.

A user with the System Administrator role can view the name and description of the pipelines, but cannot view or edit the pipeline configuration in the canvas. To manage the pipelines, assign the following roles to the system administrator account:

- Pipeline Editor
- Data Collector Creator or Administrator
- Job Operator

The following image displays a few of the provided SDC Edge templates in the Pipeline Repository view of the system organization:

To manage pipeline templates, you can take either of the following approaches:

Modify the provided templates and create new templates in the system organization.
  You can edit or delete the provided templates. You can also create new pipeline templates in the system organization. All published pipelines that belong to the system organization can be used as pipeline templates by any other organization.
  Important: If you create additional user accounts in the system organization to manage the pipeline templates, assign them the required roles to manage pipelines. However, take care not to grant those users the System Administrator role. If you enable permission enforcement for the system organization, grant the users the required pipeline permissions.

Create new templates in another organization.
  You can create and configure another organization that manages pipeline templates. You might want to configure another organization dedicated to pipeline template management to ensure that only system administrators log into and access the system organization.
  You cannot move the pipeline templates provided with the system organization to another organization. However, you can recreate the same pipeline templates in another organization.
  To configure another organization to manage pipeline templates, modify the following property in the $DPM_CONF/pipelinestore-app.properties file:
  pipeline.templates.organization - Name of the organization that manages pipeline templates. All published pipelines that belong to this configured organization can be used as pipeline templates by any other organization.
  Restart Control Hub for the changed property to take effect.
  Create user accounts for this organization and assign them the required roles to manage pipelines. If you enable permission enforcement for the organization, grant the user accounts the required pipeline permissions.
  Note: The pipeline.templates.organizationuser property in the $DPM_CONF/pipelinestore-app.properties file defines the user account assigned to the Committed By property for the provided pipeline templates during installation or upgrade. There is no need to change the value of this property after completing the Control Hub installation or upgrade.

Data Collector Version Range

By default, StreamSets Control Hub can work with registered Data Collectors from version to the current version of Control Hub.

You can customize the Data Collector version range. For example, you might require that the minimum Data Collector version is

You can specify official Data Collector release versions or SNAPSHOT versions installed from a nightly build.

1. In the Navigation panel, click Administration > Data Collectors.
2. Click the Component Version Range icon.
3. Enter the minimum and maximum Data Collector versions that can work with Control Hub.
4. If you enter a SNAPSHOT version for the minimum version, enter the oldest acceptable build date for that version using the following format:
       yyyy-MM-dd'T'HH:mm'Z'
   For example: T06:30Z
   Enter ANY to allow any build date from the specified SNAPSHOT version.
   Control Hub checks the build date only when the Data Collector version matches the specified minimum version.

Administer Control Hub Applications

You can view and administer each Control Hub application by clicking Administration > SCH Apps in the Navigation panel.

The Control Hub Apps view lists each application by component ID and the URL where the application is running. You can deactivate an application or regenerate the authentication token for an application. However, you won't need to deactivate applications or regenerate their authentication tokens unless you are upgrading Control Hub or you have set up a highly available Control Hub environment.

Logs

View Control Hub logs to help you monitor and troubleshoot issues with Control Hub.

Locate the Control Hub log file in the following location:
    $DPM_LOG/dpm.log

Or, you can access the contents of the log file when you are logged into the Control Hub Admin tool.

Note: If you installed multiple Control Hub instances for a highly available environment, each Control Hub installation has its own log file.
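The build date format in step 4 of Data Collector Version Range can be checked before the value is entered in the Component Version Range dialog. The Python sketch below is only a local validation aid and not part of Control Hub. The strptime pattern is a Python rendering of the documented format, the function name is hypothetical, and the sample date in the usage lines is invented for illustration.

```python
from datetime import datetime

# Python equivalent of the documented pattern: a literal "T" between
# the date and the time, and a literal trailing "Z".
BUILD_DATE_FORMAT = "%Y-%m-%dT%H:%MZ"


def parse_build_date(value):
    """Return None for ANY, otherwise the parsed build date.

    Raises ValueError when the text does not match the format.
    """
    if value == "ANY":
        return None
    return datetime.strptime(value, BUILD_DATE_FORMAT)


if __name__ == "__main__":
    # Hypothetical values, not taken from the guide.
    print(parse_build_date("2018-01-15T06:30Z"))
    print(parse_build_date("ANY"))
```

A value that omits the trailing Z or includes seconds raises ValueError in this sketch, which flags the most likely typing mistakes.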

Control Hub Configuration

You can edit StreamSets Control Hub configuration files to configure properties such as the host name and port number and SMTP account information for emails. You can also customize Control Hub to display your company logo instead of the StreamSets logo in the user interface.

Control Hub configuration files are included in the $DPM_CONF directory. View the comments in each file for a description of each property. If you modify a property in a configuration file, restart Control Hub for the changes to take effect.

The following table lists each configuration file:

File Name - Description
basic-realm.properties - Configures users that can log in to the Control Hub Admin tool, as described in Configuring Users.
common-to-all-apps.properties - Properties that are common to all Control Hub applications, including the Control Hub base URL, load balancer URL, SMTP account properties to enable Control Hub to send email, and system Data Collector configuration.
dpm-log4j.properties - Log configuration properties, including the log level.
dpm.properties - Properties for Control Hub, including properties to configure HTTPS.
jobrunner-app.properties - Properties for the Job Runner application.
messaging-app.properties - Properties for the Messaging application.
notification-app.properties - Properties for the Notification application.
pipelinestore-app.properties - Properties for the Pipeline Store application, including properties to configure the organization that manages pipeline templates.
provisioning-app.properties - Properties for the Provisioning application.
reporting-app.properties - Properties for the Reporting application.
scheduler-app.properties - Properties for the Scheduler application.
security-app.properties - Properties for the Security application, including properties to configure LDAP authentication.
sla-app.properties - Properties for the SLA application.
timeseries-app.properties - Properties for the Time Series application.
topology-app.properties - Properties for the Topology application.

Configuring HTTPS

You can configure Control Hub and the separate Admin tool to use HTTP or HTTPS. We recommend using HTTPS in a production environment.

The steps that you complete to configure HTTPS depend on whether you are running a single Control Hub instance in a development environment or multiple Control Hub instances in a highly available production environment.

Single Instance

When running a single Control Hub instance in a development environment, you can configure Control Hub and the separate Admin tool to use HTTP or HTTPS. By default, both use HTTP.

Configure both Control Hub and the Admin tool to use the same protocol. We recommend using HTTPS for both tools in a production environment.

HTTPS requires an SSL/TLS certificate. Control Hub provides a self-signed certificate so you can use HTTPS immediately. You can also generate a certificate that is self-signed or signed by a certifying authority. Web browsers generally issue a warning for self-signed certificates.

1. To define the secure port and the keystore file for Control Hub, configure the following properties in the Control Hub configuration file, $DPM_CONF/dpm.properties:

   Control Hub HTTPS Property - Description
   http.port - Port number for Control Hub. Set to -1 to enable the HTTPS protocol.
   https.port - Secure port number for Control Hub. Any number besides -1 enables the secure port.
   https.keystore.path - Path and name of the keystore file that contains the private key and self-signed certificates for the web server. Store the file on the Control Hub machine. Enter an absolute path or a path relative to the $DPM_CONF directory. Default is keystore.jks in the $DPM_CONF directory.
   https.keystore.password - Path and name of the file that contains the password to open the Java keystore file. Enter an absolute path or a path relative to the $DPM_CONF directory. Default is keystore-password.txt in the $DPM_CONF directory.

2. In the $DPM_CONF/common-to-all-apps.properties file, modify the dpm.base.url property to use the HTTPS protocol and the same secure port that you just defined. For example, if you defined the Control Hub secure port number as 19631, define the Control Hub base URL property as follows:

   dpm.base.url=

3. To define the secure port and the keystore file for the Admin tool, configure the following properties in the Control Hub configuration file, $DPM_CONF/dpm.properties:

   Admin Tool HTTPS Property - Description
   admin.http.port - Port number for the Admin tool. Set to -1 to enable the HTTPS protocol for the Admin tool.
   admin.https.port - Secure port number for the Admin tool, unique from the port number defined for Control Hub. Any number besides -1 enables the secure port.
   admin.https.keystore.path - Path and name of the keystore file that contains the private key and self-signed certificates for the web server. Store the file on the Control Hub machine. Enter an absolute path or a path relative to the $DPM_CONF directory. Default is keystore.jks in the $DPM_CONF directory.
   admin.https.keystore.password - Path and name of the file that contains the password to open the Java keystore file. Enter an absolute path or a path relative to the $DPM_CONF directory. Default is keystore-password.txt in the $DPM_CONF directory.

4. Define the following options in the DPM_JAVA_OPTS environment variable:

   javax.net.ssl.trustStore - path to truststore file
   javax.net.ssl.trustStorePassword - truststore password

   Modify environment variables in the file required by your installation type. For example, define the options as follows:

   export DPM_JAVA_OPTS="-Djavax.net.ssl.trustStore=<path to truststore file> -Djavax.net.ssl.trustStorePassword=<password> ${DPM_JAVA_OPTS}"

5. Run the Control Hub security command to regenerate the Control Hub SAML metadata with the updated URL.

   To configure SAML authentication for an organization, users must download the Control Hub SAML metadata and keys, and then use that information to register Control Hub as a service provider with an IdP. The Control Hub URL is included in that metadata.

   Run the following command from the $DPM_HOME directory:

   bin/streamsets dpmcli security createsamlconfig -u <system administrator> -p <password>

   Where <system administrator> is the Control Hub system administrator account.

6. Restart Control Hub for the changes to take effect.

Multiple Instances for High Availability

When running multiple Control Hub instances in a highly available production environment, configure the load balancer to use the HTTPS protocol. For instructions, see the documentation provided with your load balancer.

Then make the following changes to the Control Hub configuration files:

1. Because the load balancer handles SSL termination (decrypting SSL traffic before passing the request on to Control Hub), configure each Control Hub instance to use the http.port property in the Control Hub configuration file, $DPM_CONF/dpm.properties.

2. Set both of the following properties in the $DPM_CONF/common-to-all-apps.properties file to the load balancer URL for each Control Hub instance:

   dpm.base.url
   http.load.balancer.url

3. Run the Control Hub security command to regenerate the Control Hub SAML metadata with the updated URL.
   To configure SAML authentication for an organization, users must download the Control Hub SAML metadata and keys, and then use that information to register Control Hub as a service provider with an IdP. The Control Hub URL is included in that metadata.

   Run the following command from the $DPM_HOME directory:

   bin/streamsets dpmcli security createsamlconfig -u <system administrator> -p <password>

   Where <system administrator> is the Control Hub system administrator account.

4. Restart Control Hub for the changes to take effect.

Email Properties

The common-to-all-apps.properties file includes the following properties for sending email:

Property - Description
mail.transport.protocol - Use smtp or smtps. Default is smtp.
mail.smtp.host - SMTP host name. Default is localhost.
mail.smtp.port - SMTP port number. Default is 25.
mail.smtp.auth - Whether the SMTP host uses authentication. Use true or false. Default is false.
mail.smtp.starttls.enable - Whether the SMTP host uses STARTTLS encryption. Use true or false. Default is false.
mail.smtps.host - SMTPS host name. Default is localhost.
mail.smtps.port - SMTPS port number. Default is 465.
mail.smtps.auth - Whether the SMTPS host uses authentication. Use true or false. Default is false.
xmail.username - User name for the email account used to send email.
xmail.password - File that contains the password for the email account. By default, the file name is security-app- -password.txt and is expected in the $DPM_CONF directory. You can replace the default file name.
xmail.from.address - Email address to use to send email.

Customizing the StreamSets Logo

You can customize the StreamSets logo that displays in the top toolbar of Control Hub. For example, you can replace the StreamSets logo with your company logo.

To customize the logo, overwrite the following file with your custom logo file:

$DPM_HOME/dpm-static-web/assets/images/logo.png

The change takes effect immediately. You do not need to restart Control Hub to see the customized logo.

Control Hub Environment Configuration

Control Hub includes several environment variables that you can customize. The method that you use to modify environment variables depends on the Control Hub installation type:

Tarball installation on any supported operating system
   For a tarball installation, modify environment variables in the $DPM_HOME/libexec/dpm-env.sh file. Use a text editor to edit the dpm-env.sh file. Some of the environment variables in the file are commented out and do not reflect the default values. Be sure to uncomment the line when you change a variable value. After you edit the file, restart Control Hub to enable the changes.

RPM installation on operating systems that use the SysV init system
   For an RPM installation on CentOS 6.x or Red Hat Enterprise Linux 6.x, modify environment variables in the $DPM_HOME/libexec/dpmd-env.sh file. Use a text editor to edit the dpmd-env.sh file. After you edit the file, restart Control Hub to enable the changes.

RPM installation on operating systems that use the systemd init system
   For an RPM installation on CentOS 7.x or Red Hat Enterprise Linux 7.x, modify environment variables in the /usr/lib/systemd/system/dpm.service file. Override the default values in the dpm.service file using the same procedure that you use to override unit configuration files on a systemd init system. For an example, see "Example 2. Overriding vendor settings" in the systemd.unit manpage. After overriding the default values, use the following command to reload the systemd manager configuration:

   systemctl daemon-reload

   Then restart Control Hub to enable the changes.
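For the systemd case, the override procedure can be sketched as a drop-in file. This is a minimal sketch, not the documented Control Hub procedure: it assumes the dpm.service unit sets its variables with Environment= lines, and the helper name write_dpm_override, the file name override.conf, and the example value are all hypothetical.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that overrides one environment variable
# for the dpm service. The helper name, the drop-in file name (override.conf),
# and the example value below are assumptions, not values from the guide.

write_dpm_override() {
    dropin_dir="$1"   # for example /etc/systemd/system/dpm.service.d
    var_name="$2"     # for example DPM_LOG
    var_value="$3"    # for example /data/dpm/log

    mkdir -p "$dropin_dir" || return 1
    # A drop-in only needs the section header and the keys being overridden.
    printf '[Service]\nEnvironment="%s=%s"\n' "$var_name" "$var_value" \
        > "$dropin_dir/override.conf"
}

# Typical use, run as root:
#   write_dpm_override /etc/systemd/system/dpm.service.d DPM_LOG /data/dpm/log
#   systemctl daemon-reload
#   systemctl restart dpm
```

Keeping the change in a drop-in under /etc/systemd/system leaves the packaged unit file under /usr/lib untouched, so a package upgrade does not discard the override.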
Control Hub Directories

Control Hub includes environment variables that define the directories used to store configuration and log files.

The $DPM_HOME environment variable defines the Control Hub runtime directory. The runtime directory is the base Control Hub directory that stores the executables and related files. This environment variable is set during installation.

When you install Control Hub from the tarball, the default values of the remaining directory variables are relative to the $DPM_HOME runtime directory. When you install Control Hub from the RPM package, the default values of the remaining directory variables are absolute paths that are outside of the $DPM_HOME runtime directory.

Modify environment variables in the file required by your installation type.

You can configure the following environment variables that define directories:

Environment Variable - Description
DPM_CONF - Defines the configuration directory for the Control Hub configuration file, dpm.properties, application configuration files, the basic-realm properties file, and keystore files. Also includes the log4j properties file. Default values:
   Tarball installation - $DPM_HOME/etc
   RPM installation - /etc/dpm
DPM_LOG - Defines the log directory. Default values:
   Tarball installation - $DPM_HOME/log
   RPM installation - /var/log/dpm

User and Group for Service Start

When you install Control Hub from an RPM package, you run Control Hub as a service. Control Hub runs as the system user account and group defined in environment variables. The default system user and group are named dpm. If the defined system user or group does not exist on the machine, the installation creates the user and group for you and assigns them the next available user ID and group ID.

You can modify the values of the environment variables to point to another system user or group. Modify environment variables in the file required by your installation type.

If you change the system user, you must make the new system user the owner of all Control Hub directories:

   $DPM_HOME
   $DPM_CONF
   $DPM_LOG

For example, if you change the system user and group to myuser, use the following command to change the owner of the configuration directory, $DPM_CONF, and all files in the directory to myuser:myuser:

chown -R myuser:myuser /etc/dpm

Note: When you install Control Hub from the tarball, you run Control Hub manually from the command line. Control Hub runs as the system user account logged into the command prompt when the launch command is run. To run as another user account, see Launch Control Hub.

Java Configuration Options

Define Java configuration options used by Control Hub in the DPM_JAVA_OPTS environment variable.
Modify environment variables in the file required by your installation type.

The following is an example of the DPM_JAVA_OPTS environment variable:

export DPM_JAVA_OPTS="-Djavax.net.ssl.trustStore=/opt/ssl/truststore.jks -Djavax.net.ssl.trustStorePassword=abcd -server -Xmx4096m -Xms4096m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dpm-outOfMem.hprof -XX:ErrorFile=/var/log/dpm-error.log ${DPM_JAVA_OPTS}"

Boolean properties are turned on with -XX:+<property name> and are turned off with -XX:-<property name>.
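As an illustration of how such a value might be assembled, the following sketch builds DPM_JAVA_OPTS with the minimum and maximum heap set to the same size and one boolean option turned on. The helper java_heap_opts is hypothetical and not part of Control Hub; the option names are the ones used in the example above.

```shell
#!/bin/sh
# Sketch: build DPM_JAVA_OPTS with equal minimum and maximum heap sizes.
# java_heap_opts is a hypothetical helper, not part of Control Hub.

java_heap_opts() {
    # $1 is a heap size with its unit, for example 4096m or 8g.
    printf '%s' "-Xms$1 -Xmx$1"
}

# Prepend the new options and keep whatever was already set.
# -XX:+<name> turns a boolean option on; -XX:-<name> would turn it off.
DPM_JAVA_OPTS="-server $(java_heap_opts 4096m) -XX:+HeapDumpOnOutOfMemoryError ${DPM_JAVA_OPTS:-}"
export DPM_JAVA_OPTS
```

Appending the previous value of the variable at the end mirrors the ${DPM_JAVA_OPTS} suffix in the guide's own example, so options defined elsewhere are not lost.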

You can configure the following options in the DPM_JAVA_OPTS environment variable:

Property Name and Description

-server
   Define properties for the server, as opposed to the client.

-Xms4096m -Xmx4096m
   The Java heap size determines the heap size allocated to Control Hub and affects the amount of memory it can use. Increase or decrease the Control Hub heap size as needed, based on the resources available on the host machine. Use the following Java properties to define the Java heap size:
      Xms - the minimum heap size
      Xmx - the maximum heap size
   By default, the Java heap size is 4096 MB. To avoid constant recalculation of the allocated heap size, set both properties to the same value. To define the unit of measure, use m for MB and g for GB.

-XX:+HeapDumpOnOutOfMemoryError
   Generate a heap dump when Control Hub encounters an out of memory error.

-XX:HeapDumpPath=<file name>
   Specifies the file name and location to use for heap dump files.

-XX:ErrorFile=<file name>
   Specifies the file name and location to use when Control Hub encounters a fatal error.

-Djavax.net.ssl.trustStore=<file name>
   Specifies the truststore file name and location. Enter an absolute path. The truststore file contains the certificates to use when creating an SSL connection.

-Djavax.net.ssl.trustStorePassword=<password>
   Password to open the truststore file. By default, the password is stored as plaintext. To secure the password, save the password in a text file and then define the truststore password option as follows:
   -Djavax.net.ssl.trustStorePassword=$(cat passwordfile.txt)

Shutting Down Control Hub

You can shut down and then manually launch Control Hub to apply changes to the Control Hub configuration files.

Note: Any registered Data Collectors that are running at the time of the Control Hub shutdown continue to run remote pipeline instances in Control Hub disconnected mode.
The Data Collectors maintain the offsets for the running pipelines and update Control Hub with the latest offsets as soon as they reconnect to Control Hub.

To shut down when Control Hub is started as a service, use the required command for your operating system:

   For CentOS 6.x or Red Hat Enterprise Linux 6.x, use:
   service dpm stop

   For CentOS 7.x or Red Hat Enterprise Linux 7.x, use:
   systemctl stop dpm

To shut down when Control Hub is started manually from the tarball, use the Control Hub process ID in the following command:

kill <process ID>
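The two service commands above differ only by init system, so a wrapper script could choose between them. The sketch below prints the applicable command rather than running it. It relies on the common convention that the /run/systemd/system directory exists only when systemd is the running init system; the function name dpm_stop_command and its optional probe-directory argument are hypothetical and exist only to make the check easy to try out.

```shell
#!/bin/sh
# Sketch: print the stop command that matches the running init system.
# dpm_stop_command and its optional argument are hypothetical helpers.

dpm_stop_command() {
    # By convention, this directory exists only when systemd is the init system.
    probe_dir="${1:-/run/systemd/system}"
    if [ -d "$probe_dir" ]; then
        echo "systemctl stop dpm"    # CentOS 7.x or Red Hat Enterprise Linux 7.x
    else
        echo "service dpm stop"      # CentOS 6.x or Red Hat Enterprise Linux 6.x
    fi
}

# Print the command for this machine; run it as root once it looks right.
dpm_stop_command
```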

Renewing the Control Hub License

Each Control Hub system requires an active license. If the license is about to expire, you'll need to request a new activation key to renew the license.

The Dashboards view notifies the Control Hub system administrator when the system license will expire. You can also view the activation details by clicking the My Account icon in the top right corner and then clicking Activation Details.

Tip: Be sure to renew the license before it expires. When the license expires, you can no longer use Control Hub.

1. Retrieve the unique system ID for your Control Hub system by running the following command from the $DPM_HOME directory:

   bin/streamsets dpmcli security systemid

   Alternatively, as the system administrator, you can retrieve the system ID from the Control Hub UI by clicking the My Account icon in the top right corner of Control Hub and then clicking Activation Details.

2. Open a StreamSets support ticket or contact your StreamSets sales representative to request the activation key for your Control Hub system ID.

3. After you receive the activation key, run the following command from the $DPM_HOME directory to activate the license:

   bin/streamsets dpmcli security activationkey -i activationkey.txt

   The renewed license becomes active within thirty seconds. You do not need to restart Control Hub.

Regenerating the System ID

If needed, you can regenerate the system ID for your Control Hub system. In most cases, you will not need to regenerate the system ID.

Warning: Use caution when regenerating the system ID. The system ID is tied to the activation key. If you regenerate the system ID after you have already activated the license, then the license is no longer valid.

Run the following command from the $DPM_HOME directory to force a regeneration of the system ID:

bin/streamsets dpmcli security systemid -c -f

Control Hub Admin Tool

StreamSets Control Hub includes a separate Admin tool.
Use the Control Hub Admin tool to monitor and troubleshoot Control Hub issues. For example, if Control Hub becomes inaccessible, the Admin tool remains running. You can still log into the Admin tool to troubleshoot the Control Hub issues.

To log in to the Admin tool, use the second URL listed in the command output when you launched Control Hub:

name>:18632/admin.html

Log in as the default system administrator named "admin" using the password that you entered for the Control Hub setup script. You can change or remove this user account, or you can configure additional users.

The Admin tool lists each Control Hub instance with the time that Control Hub was last updated.

Click the following links to access monitoring information about each Control Hub instance:

   Metrics - Displays metrics for each Control Hub application.
   Threads - Displays a thread dump for Control Hub.
   Logs - Displays the most recent log information. Access the complete log file in the following location: $DPM_LOG/dpm.log
   Log4j Config - Displays the contents of the log configuration file, dpm-log4j.properties. You can modify the log level to display messages at another severity level.
   Server Config - Displays a read-only view of the contents of all the Control Hub configuration files. To modify the properties, locate the files in the $DPM_CONF directory. If you modify a property, restart Control Hub for the changes to take effect.
   Flush Caches - Flushes the Security application cache, so that changed roles take effect immediately. In most cases, you won't need to flush the cache.
   Build - Information about the currently installed Control Hub version.

Configuring Users

The Control Hub Admin tool uses file-based authentication. It does not use the Control Hub authentication method. To configure additional user accounts that can log into the tool, configure the $DPM_CONF/basic-realm.properties file.

The Admin tool provides a default user account named "admin". You can change or remove this user account, or you can create new user accounts. For increased security, change the password for the default user account.

The Admin tool provides a single sys-admin role that you must assign to each user. The sys-admin role allows the user to complete all tasks in the Admin tool.

To hash login information, you can use an md5 program such as md5sum on Linux. Hash the password alone. For example:

echo -n "<password>" | md5sum

1. In the basic-realm.properties file, add a user definition for each new user using the following format:

   <user name>: MD5:<md5-text>, user, <role>

   Note: Assign the sys-admin role to each user.
   Be sure to include "user" in every user definition.

   For example, the following line defines a user named jsmith:

   jsmith: MD5:a66e44736e753d ced572ca821,user,sys-admin

2. To make the new users available, restart Control Hub.
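The hashing step and the user definition format described above can be combined in a small script. This is a sketch under stated assumptions rather than a StreamSets-provided tool: the function name realm_entry is hypothetical, it relies on md5sum being installed, and it follows the spacing used in the jsmith example.

```shell
#!/bin/sh
# Sketch: print a basic-realm.properties user definition for one user.
# realm_entry is a hypothetical helper, not part of Control Hub.

realm_entry() {
    user_name="$1"
    password="$2"
    role="${3:-sys-admin}"   # the Admin tool provides only the sys-admin role

    # Hash the password alone, with no trailing newline; md5sum prints
    # "<hash>  -", so keep only the first field.
    hash=$(printf '%s' "$password" | md5sum | cut -d' ' -f1)

    printf '%s: MD5:%s,user,%s\n' "$user_name" "$hash" "$role"
}

# Example: append the printed line to $DPM_CONF/basic-realm.properties,
# then restart Control Hub.
#   realm_entry jsmith '<password>' >> "$DPM_CONF/basic-realm.properties"
```

Passing the password as an argument leaves it in the shell history and process list, so on a shared machine it may be preferable to read it from a prompt or a protected file instead.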


More information
