IBM BigInsights Security Implementation: Part 1 Introduction to Security Architecture


Big data analytics involves processing large amounts of data that cannot be handled by conventional systems. The IBM BigInsights platform processes large amounts of data by breaking the computation into smaller tasks that can be distributed onto several nodes. Because this platform is shared by users in different roles (developers, analysts, data scientists, and testers), it introduces the challenge of provisioning access and authorization to the cluster and securing the data.

Big data platforms are an amalgamation of several individual components that are still evolving, and are based on the challenges and requirements that are dictated by the open source community. These components are developed in isolation by independent teams with no forethought of integrating them in a secure way, which results in individual components defining and exposing their own security policies for data and access protection. This inherent lack of a single security policy enforcement point in big data platforms can be challenging and overwhelming.

This IBM Redbooks Analytics Support web doc introduces a reference security architecture for the IBM BigInsights solution that is in line with current industry practices. It can be used as a reference document for solution architects and solution implementers. This document applies to IBM BigInsights Version 4.2 and later.

Security aspects to consider when designing the security architecture for IBM BigInsights

Securing an IBM BigInsights cluster involves addressing four main security aspects, which are shown in Figure 1:

- Secure perimeter
- Secure data
- Access management
- Audit

Figure 1. Four aspects of IBM BigInsights security

A secure perimeter can be enforced in the following ways:

- By authenticating users against LDAP and Kerberos
- By protecting HTTPS access through the Apache Knox security gateway
- By isolating the data nodes in a secure private network

Data can be secured in the following ways:

- By using Hadoop transparent encryption with Apache KMS (Key Management Server)
- By using IBM Big SQL data masking
- By enabling SSL and TLS support for components to secure the data transfers

Access management should be enforced at several levels:

- At the job level, by using YARN (Yet Another Resource Negotiator) job-queue-based access control
- By using SQL access privileges for SQL access to Hadoop data
- By using ACL-based access control for Hadoop Distributed File System (HDFS) files

Audits and reporting are provided in the following ways:

- By using lightweight monitoring that is based on Java Management Extensions (JMX)
- By using IBM Security Guardium Data Activity Monitor
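The job-queue-based access control mentioned above can be illustrated with a minimal Python sketch. It models the Hadoop-style ACL string format ("user1,user2 group1,group2", where "*" admits everyone and a single space admits no one) and the rule that permission granted on a parent queue carries down to its children. The queue names, users, and groups are hypothetical, and treating a queue with no entry as granting nothing is a simplifying assumption of this sketch, not a statement about BigInsights defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueAcl:
    """Parsed form of a Hadoop-style ACL string: 'user1,user2 group1,group2'."""
    allow_all: bool
    users: frozenset
    groups: frozenset


def parse_acl(spec):
    # "*" admits everyone; a single space (or an empty string) admits no one.
    if spec.strip() == "*":
        return QueueAcl(True, frozenset(), frozenset())
    users_part, _, groups_part = spec.partition(" ")
    users = frozenset(u for u in users_part.split(",") if u)
    groups = frozenset(g for g in groups_part.strip().split(",") if g)
    return QueueAcl(False, users, groups)


def may_submit(user, user_groups, queue_path, acls):
    """Return True if any queue from the root down to the leaf admits the user."""
    parts = queue_path.split(".")
    for depth in range(1, len(parts) + 1):
        prefix = ".".join(parts[:depth])
        # Assumption of this sketch: a queue without an entry grants nothing.
        acl = parse_acl(acls.get(prefix, " "))
        if acl.allow_all or user in acl.users or acl.groups & set(user_groups):
            return True
    return False


# Hypothetical queue layout: the root is locked down, leaves are opened per team.
example_acls = {
    "root": " ",
    "root.analytics": "alice analysts",
    "root.etl": " etl_ops",
}
```

Because a permissive ancestor admits everyone to every child queue, a design like this only restricts access if the root entry is itself restrictive, which is why the hypothetical layout sets the root entry to a single space.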

Figure 2 shows a high-level design of a secure IBM BigInsights cluster. It highlights the components that are the building blocks of a big data cluster architecture design.

Figure 2. IBM BigInsights high-level design

Note the following items in Figure 2:

- An IBM BigInsights cluster can span two networks: a public network and a private network. The communication between the two networks occurs through an edge node (1).
- The edge node has the client components (2) for all the master services that are deployed in the cluster so that users can connect and perform analytics and administration.
- Access to administration and analytic tools is enforced through Ambari user management and the Knox gateway (3).
- Data encryption protects user data from unauthorized access and enforces industry security standards. Data can be encrypted at rest and while it is being transferred over the network. Encryption at rest is performed in two ways: by using Hadoop transparent data encryption and by using IBM Security Guardium Data Encryption.
- Hadoop transparent data encryption uses the key management server (KMS), which holds encryption and decryption keys (4).
- IBM Security Guardium Data Encryption deploys agents on all nodes to perform encryption and decryption of data. The IBM Security Guardium server monitors and enforces encryption and decryption policies and rules on the agents (5).
- The data transfers over the network are secured by configuring services to use SSL and TLS certificates.
- Similar to the Linux file system, HDFS provides fine-grained user access control by using file system access control lists (ACLs) (6).
- Big SQL and Hive provide GRANT and REVOKE commands to authorize users to perform certain operations (7).
- IBM Security Guardium Data Activity Monitor provides monitoring and auditing capabilities that you can use to seamlessly integrate Hadoop data protection into your existing enterprise data security strategy. HDFS has its own auditing mechanism that captures all the file system activities (8).

Designing the security architecture for IBM BigInsights products involves both the features that are provided by individual components and a holistic approach that secures data, users, and functions from possible vulnerabilities.

Acknowledgements

Thanks to Mohan Dani, IBM BigInsights software developer, for his contributions to this project.

Related publications

- IBM BigInsights V4.2 documentation: https://ibm.biz/bdrnvb
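As a supplement to the note on HDFS access control lists, the following Python sketch shows how a POSIX-style ACL check of the kind HDFS uses is commonly evaluated: owner first, then named users, then the owning group and named groups, then everyone else, with the mask filtering every entry except the owner and "other". It is a simplified model under stated assumptions, not HDFS code. The `FileAcl` class, the `is_allowed` function, and all user and group names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FileAcl:
    """Simplified ACL for one file; permissions are strings such as 'r-x'."""
    owner: str
    group: str
    owner_perms: str
    group_perms: str
    other_perms: str
    mask: str = "rwx"  # assumption: no filtering unless a mask is set
    named_users: dict = field(default_factory=dict)
    named_groups: dict = field(default_factory=dict)


def _grants(perms, wanted, mask=None):
    # Effective permissions are the entry's bits, optionally filtered by the mask.
    effective = set(perms) - {"-"}
    if mask is not None:
        effective &= set(mask)
    return set(wanted) <= effective


def is_allowed(acl, user, groups, wanted):
    """Evaluate entries in order; the first matching class decides."""
    if user == acl.owner:
        return _grants(acl.owner_perms, wanted)
    if user in acl.named_users:
        return _grants(acl.named_users[user], wanted, acl.mask)
    # Collect the owning-group entry and every named-group entry that matches.
    group_entries = []
    if acl.group in groups:
        group_entries.append(acl.group_perms)
    group_entries += [p for g, p in acl.named_groups.items() if g in groups]
    if group_entries:
        # A matching group entry that does not grant access means denial;
        # the check does not fall through to the "other" permissions.
        return any(_grants(p, wanted, acl.mask) for p in group_entries)
    return _grants(acl.other_perms, wanted)


# Hypothetical file: owned by alice, with a read-only mask over extended entries.
example = FileAcl(
    owner="alice", group="analysts",
    owner_perms="rwx", group_perms="r-x", other_perms="---",
    mask="r--",
    named_users={"bob": "rw-"},
    named_groups={"etl": "rwx"},
)
```

In this hypothetical example, bob holds a `rw-` entry but can only read, because the mask caps every named entry at `r--`. That capping is the reason a mask is useful as a single switch for tightening all extended entries at once.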

Notices

This information was developed for products and services offered in the US. This material might be available from IBM in other languages. However, you may be required to own a copy of the product or product version in that language in order to access it.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive, MD-NC119, Armonk, NY 10504-1785, US

INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION AS IS WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some jurisdictions do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk.

IBM may use or distribute any of the information you provide in any way it believes appropriate without incurring any obligation to you.

The performance data and client examples cited are presented for illustrative purposes only. Actual performance results may vary depending on specific configurations and operating conditions.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

Statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to actual people or business enterprises is entirely coincidental.

COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are provided AS IS, without warranty of any kind. IBM shall not be liable for any damages arising out of your use of the sample programs.

Copyright International Business Machines Corporation 2016. All rights reserved.

This document was created or updated on August 25, 2016.

Send us your comments in one of the following ways:

- Use the online Contact us review form found at: ibm.com/redbooks
- Send your comments in an e-mail to: redbooks@us.ibm.com
- Mail your comments to: IBM Corporation, International Technical Support Organization, Dept. HYTD Mail Station P099, 2455 South Road, Poughkeepsie, NY 12601-5400 U.S.A.

This document is available online at http://www.ibm.com/redbooks/abstracts/tips1340.html.

Trademarks

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml.

The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:

- BigInsights
- Guardium
- IBM
- InfoSphere
- Redbooks
- Redbooks (logo)

The following terms are trademarks of other companies:

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.

Other company, product, or service names may be trademarks or service marks of others.