Introducing Apache Kudu and RecordService (incubating)

1 Introducing Apache Kudu and RecordService (incubating)
Guido Oswald, Sales Engineer, Switzerland
April 2016, Swiss Big Data User Group Meetup

2 Current storage landscape in Hadoop
HDFS excels at: efficiently scanning large amounts of data; accumulating data with high throughput.
HBase excels at: efficiently finding and writing individual rows; making data mutable.
Gaps exist when these properties are needed simultaneously.

3 Managing the gap (today)
Code complexity: manage the flow and sync of data between HDFS and HBase.
Monitoring and security: managing consistent backups, security policies, monitoring and more is hard.
Performance: significant lag between data arriving in the HBase staging area and the time it becomes available for analytics.

4 Changing hardware landscape
Spinning disk -> solid state storage.
NAND flash: up to 450k read and 250k write IOPS, about 2GB/sec read and 1.5GB/sec write throughput, at a price of less than $3/GB and dropping.
3D XPoint memory (1000x faster than NAND, cheaper than RAM).
RAM is cheaper and more abundant: 64->128->256GB over the last few years.
Takeaway 1: The next bottleneck is CPU, and current storage-heavy applications weren't designed with CPU efficiency in mind.
Takeaway 2: Column stores are feasible for random access.

5 Apache Kudu (Incubating): Storage for Fast Analytics on Fast Data
New updating column store for Hadoop.
Simplifies the architecture for building analytic applications on changing data.
Designed for fast analytic performance.
Natively integrated with Hadoop.
Donated as incubating project at Apache Software Foundation (November 17, 2015).
Beta now available.
[Diagram: Hadoop stack. Process, analyze, serve: batch (Spark, Hive, Pig, MapReduce), stream (Spark), SQL (Impala), search (Solr), SDK (Kite). Unified services: resource management (YARN), security (Sentry, RecordService). Store: filesystem (HDFS), relational (Kudu), NoSQL (HBase). Integrate: structured (Sqoop), unstructured (Kafka, Flume).]

6 Kudu design goals
High throughput for big scans (columnar storage and replication). Goal: within 2x of Parquet.
Low latency for short accesses (primary key indexes and quorum design). Goal: 1ms read/write on SSD.
Database-like semantics (initially single-row ACID).
Relational data model: SQL query; NoSQL-style scan/insert/update (Java client).

7 Kudu basic design
Apache-licensed open source software.
Structured data model. Basic construct: tables.
Tables broken down into tablets (roughly equivalent to partitions).
Architecture supports geographically disparate, active/active systems (not the initial design goal).

8 What Kudu is not
Not a SQL interface: just the storage layer (BYOSQL, bring your own SQL).
Not a file system: data must have tabular structure.
Not an application that runs on HDFS: an alternative, native Hadoop storage engine.
Not a replacement for HDFS or HBase: select the right storage for the right use case. Cloudera will continue to support and invest in all three.

9 Kudu data model
Tables have an RDBMS-like schema.
Finite number of columns (unlike HBase/Cassandra).
Types: BOOL, INT8/16/32/64, FLOAT, DOUBLE, STRING, BINARY, TIMESTAMP.
Some subset of columns makes up a primary key: fast random reads/writes by primary key.
No secondary indexes (yet).
Columnar layout on disk, similar to Parquet: lazy materialization; encoding and compression options.

10 Table partitioning
Hash bucketing: distribute records by hash of the partition column(s). N buckets lead to N tablets.
Range partitioning: distribute records by ranges of the partition column(s). N split keys lead to N+1 tablets.
Can be a mix for different columns of the primary key.
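The two partitioning schemes on this slide can be illustrated with a small Python sketch. This is not Kudu code: `zlib.crc32` is only a stand-in for whatever hash function Kudu uses internally, and the function names `hash_bucket`, `range_partition` and `tablet_for` are hypothetical. In this sketch, N sorted split keys delimit N+1 ranges.

```python
import bisect
import zlib
from typing import Sequence, Tuple


def hash_bucket(key: str, num_buckets: int) -> int:
    """Pick a bucket from a hash of the partition column value.

    crc32 is an assumption made for determinism, not Kudu's real hash.
    """
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def range_partition(key: str, split_keys: Sequence[str]) -> int:
    """Pick a range from sorted split keys.

    A value equal to a split key falls into the range that starts at it.
    """
    return bisect.bisect_right(split_keys, key)


def tablet_for(host: str, day: str, num_buckets: int,
               split_days: Sequence[str]) -> Tuple[int, int]:
    """Mix both schemes: hash one key column, range-partition another."""
    return hash_bucket(host, num_buckets), range_partition(day, split_days)


if __name__ == "__main__":
    splits = ["2016-02-01", "2016-03-01"]  # two split keys, three ranges
    for host, day in [("web01", "2016-01-15"), ("web02", "2016-03-09")]:
        print(host, day, tablet_for(host, day, 4, splits))
```

Hashing spreads writes evenly across tablets, while range partitioning keeps neighbouring keys together for scans; mixing them on different key columns trades off between the two.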

11 Consistency model
Consistency and replication enforced by Raft consensus (similar to Paxos).
Replication by operation, not data.
Single-row transactions now; multi-row transactions later.
Geo-distributed replicas will be possible under strict time synchronization.
Techniques drawn from Google Spanner and others.

12 Kudu interfaces
NoSQL-style APIs: Insert(), Update(), Delete(), Scan().
Java and C++ now; Python soon.
Integrations with MapReduce, Spark, and Impala.
No direct access to underlying Kudu tablet files.
Beta does not have authentication, authorization, or encryption.

13 Impala integration
Opens up Kudu to JDBC/ODBC clients.
Intuitive way to get data into Kudu: INSERT INTO kudu_table SELECT * FROM src_table;
Additional commands: UPDATE, DELETE, efficient INSERT VALUES.
Runs on the Kudu C++ client.

14 Performance characteristics
Very CPU efficient: written in modern C++, uses specialized CPU instructions, JIT compilation with LLVM.
Latency dependent on storage hardware capabilities: expect sub-millisecond response on SSDs and upcoming technologies.
No garbage collection allows a very large memory footprint with no pauses.
Bloom filters reduce the need for many disk accesses.
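To show why a Bloom filter saves disk accesses, here is a minimal Python sketch of the general data structure. It is not Kudu's implementation: the class name, the bit-array size and the use of SHA-256 as the hash are all assumptions made for illustration. The key property is that a negative answer is always correct, so a lookup for an absent key can skip reading the file entirely.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


if __name__ == "__main__":
    keys_on_disk = BloomFilter()
    keys_on_disk.add("row-17")
    # Only touch the disk when the filter cannot rule the key out.
    for key in ("row-17", "row-99"):
        action = "read from disk" if keys_on_disk.might_contain(key) else "skip"
        print(key, "->", action)
```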

15 Operating Kudu
Easiest through Cloudera Manager integration (separate parcel for now).
Kudu is always compacting: no minor vs. major compaction; no compaction latency spikes.
Web UI is full of metrics and logs.

16 Cluster layout
One or multiple masters (only one in the current beta); low CPU and memory impact.
One tablet server per worker node; can share disks with HDFS.
One SSD per worker node just for the Kudu WAL can speed up writes.
No dependencies on other Hadoop ecosystem components, but interfacing components like Impala or Spark do have them.

17 Real-time analytics in Hadoop today
Merging in new data = storage complexity (HDFS + Impala).
[Diagram labels: Incoming Data (Messaging System); HBase; Parquet File; Historic Data; Most Recent Partition; New Partition; Reporting Request.]
Flow: Have we accumulated enough data? Reorganize HBase file into Parquet. Wait for running operations to complete. Define new Impala partition referencing the newly written Parquet file.
Downsides: multiple storage layers; latest data is hidden; files are messy; complex to do updates without breaking running queries.

18 Real-time analytics in Hadoop with Kudu
Kudu + Impala.
[Diagram labels: Incoming Data (Messaging System); Historical and Real-time Data; Reporting Request.]
Improvements: one system to operate; no schedules or background processes; handle late arrivals or data corrections with ease; new data available immediately for analytics or operations.

19 Kudu for data warehousing
Near real time data visibility: BI tools can display events that happened seconds earlier.
Excellent for star schemas: fast scans of deep fact tables; efficient wide fact tables; simplified updates of slowly changing dimensions.

20 Near real time data warehousing on Kudu
[Diagram labels: Simple; Files; RDBMS; User Streams; Flume; Kafka; Spark Streaming; Kudu; Impala; Hue; Complex BI tools.]

21 Resources
Join the community
Download the beta: cloudera.com/downloads
Read the whitepaper: getkudu.io/kudu.pdf

22 Creating a Kudu table
Callouts on the code example:
Table name in Impala does NOT match table name in Kudu. Kudu is its own storage layer.
Kudu storage handler.
Kudu Master hostname and port.
A primary key is mandatory.

23 Spark (Scala) code
Callouts on the code example:
DataFrame Row.
Kudu table name.
Kudu Master hostname and port.
Create a client, session and table object.
Extract values from the row, strong types.
Create an insert object and row.
Set the values by type, column name and column value.
Perform the actual insert.
Cleanup.

24 Kudu code examples and docs 0/topics/kudu_development.html

25 RecordService

26 Permission Enforcement today with Sentry
Admins specify permissions; the Sentry Service holds the Sentry permission rules.
Rule: Allow fraud analysts read access to the transaction table.
Granularity: coarse grained (table).
Sentry enforcement points: Hive Server 2; Impala; Search (Solr); HDFS (MR, Pig, Spark, ...; apps: Datameer, Platfora, Zoomdata, etc.).

27 The Need for Fine-Grained Access Control
Across all access paths.
Columns: sensitive column visibility varies. Example: credit card numbers.
Managers: [value not preserved]
Call Centre: XXXX XXXX XXXX 5678
Analysts: XXXX XXXX XXXX XXXX
Others: do not see the credit card column.
Rows: different groups of users need access to different records (European privacy laws; government security clearance; financial information restrictions).
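The column example on this slide can be sketched as a role-dependent masking function in Python. The role names and the function `mask_card` are hypothetical, and the manager case is an assumption: the slide's value for managers was not preserved, so the sketch supposes they see the full number.

```python
from typing import Optional


def mask_card(number: str, role: str) -> Optional[str]:
    """Return the credit card value a given role is allowed to see."""
    digits = "".join(ch for ch in number if ch.isdigit())
    groups = [digits[i:i + 4] for i in range(0, len(digits), 4)]
    if role == "manager":
        # Assumption: managers see the unmasked number.
        return " ".join(groups)
    if role == "call_centre":
        # Only the last group of digits stays visible.
        return " ".join(["XXXX"] * (len(groups) - 1) + [groups[-1]])
    if role == "analyst":
        return " ".join(["XXXX"] * len(groups))
    # Everyone else does not see the column at all.
    return None


if __name__ == "__main__":
    card = "1234 1234 1234 5678"  # made-up number
    for role in ("manager", "call_centre", "analyst", "intern"):
        print(role, "->", mask_card(card, role))
```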

28 The workaround
Split the original file; use HDFS permissions to limit access.
Original file (columns: Date/time, Accnt #, National Identifier, Asset, Trade, Broker; the values of the first three columns were not preserved):
Asset  Trade  Broker
ABC    Sell   group1
TBT    Buy    group2
DEF    Sell   group3
INTC   Buy    group1
F      Buy    group1
UA     Buy    group3
XYZ    Sell   group2
TMV    Buy    group1
MA     Buy    group3
Split files, one per broker group:
group1: ABC Sell, INTC Buy, F Buy, TMV Buy
group2: TBT Buy, XYZ Sell
group3: DEF Sell, UA Buy, MA Buy
What if only some brokers in each group are allowed to see full IDs?

29 The Solution
Apply controls to the master data file.
Row, column, and sub-column (masking) controls.
Ability to enforce these across access paths.
What all group 1 brokers see (Date/time and Accnt # values were not preserved):
National Identifier  Asset  Trade  Broker
XXX-XX-9876          ABC    Sell   group1
XXX-XX-2345          INTC   Buy    group1
XXX-XX-8765          F      Buy    group1
XXX-XX-5678          TMV    Buy    group1
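A short Python sketch of the idea on this slide: keep one master dataset and apply a row filter plus a sub-column mask at read time, instead of splitting files. The function names are hypothetical, and the leading digits of the national identifiers are invented for the example; only the last four digits of the group1 rows come from the slide.

```python
from typing import Dict, Iterable, Iterator

Row = Dict[str, str]

# Master data; identifier prefixes are made up for illustration.
TRADES = [
    {"national_id": "111-22-9876", "asset": "ABC", "trade": "Sell", "broker": "group1"},
    {"national_id": "222-33-1111", "asset": "TBT", "trade": "Buy", "broker": "group2"},
    {"national_id": "333-44-2222", "asset": "DEF", "trade": "Sell", "broker": "group3"},
    {"national_id": "444-55-2345", "asset": "INTC", "trade": "Buy", "broker": "group1"},
    {"national_id": "555-66-8765", "asset": "F", "trade": "Buy", "broker": "group1"},
    {"national_id": "666-77-5678", "asset": "TMV", "trade": "Buy", "broker": "group1"},
]


def mask_national_id(value: str) -> str:
    """Sub-column control: keep only the last four digits."""
    return "XXX-XX-" + value[-4:]


def visible_rows(rows: Iterable[Row], group: str,
                 can_see_full_ids: bool = False) -> Iterator[Row]:
    """Row control (broker group) plus masking, applied to the master data."""
    for row in rows:
        if row["broker"] != group:
            continue
        out = dict(row)  # copy so the master data is never modified
        if not can_see_full_ids:
            out["national_id"] = mask_national_id(row["national_id"])
        yield out


if __name__ == "__main__":
    for row in visible_rows(TRADES, "group1"):
        print(row["national_id"], row["asset"], row["trade"], row["broker"])
```

Because the `can_see_full_ids` flag is decided per user rather than per file, the case where only some brokers in a group may see full IDs needs no additional copies of the data.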

30 Record Service (Beta) to Enforce Column- and Row-level Rules
Permissions specified by administrators (top-level and delegated) are held as Sentry permission rules in the Sentry Service.
Rule: Allow managers to see National IDs.
[Diagram labels: Applications (Datameer, Platfora, etc.); Hadoop components (MR, Pig, Spark, Solr, Hive Server 2, Impala, ...); RecordService; HDFS; HBase; AWS S3.]

31 Benefits of RecordService
Security: fine-grained data permissions and enforcement across Hadoop; integration with Sentry for policy storage and implementation.
Interoperability: clients no longer need to be aware of the on-disk format; a single data access path means a single place to implement and test file-format-related changes; transparently swap components above or below (e.g. HDFS -> S3).
Performance/Efficiency: performance boosted via Impala's optimized scanner, dynamic code generation, and Parquet implementation; use projections over original source datasets instead of making so many copies/subsets.

32 Record Service Architecture
1. The client sends a request (objects to access, user info) to the RecordServicePlanner and gets a response (list of splits, delegation token). [Diagram: the planner sits next to the HDFS NN, Sentry Service and Hive Metastore.]
2. Job launches as normal.
3. Client tasks read records from RecordServiceWorker. [Diagram: workers sit above HDFS DN, HBase RS and S3; HBase RS is marked "Not yet supported".]
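The three-step flow on this slide can be mimicked with a toy Python model. Nothing here is the RecordService API: the `Planner` and `Worker` classes, the in-memory "blocks", and the HMAC-based token are all assumptions that stand in for the real planner, workers and delegation tokens.

```python
import hashlib
import hmac
from typing import Dict, List, Set, Tuple

SECRET = b"demo-secret"  # hypothetical secret shared by planner and workers
Split = Tuple[str, int]  # (table name, block index)


def _sign(user: str, table: str) -> str:
    """Toy delegation token: an HMAC over the user and table."""
    return hmac.new(SECRET, f"{user}:{table}".encode("utf-8"),
                    hashlib.sha256).hexdigest()


class Planner:
    """Step 1: check permissions, return splits and a token."""

    def __init__(self, permissions: Dict[str, Set[str]],
                 blocks: Dict[str, List[List[str]]]) -> None:
        self.permissions = permissions
        self.blocks = blocks

    def plan(self, user: str, table: str) -> Tuple[List[Split], str]:
        if table not in self.permissions.get(user, set()):
            raise PermissionError(f"{user} may not read {table}")
        splits = [(table, i) for i in range(len(self.blocks[table]))]
        return splits, _sign(user, table)


class Worker:
    """Step 3: serve records for one split if the token is valid."""

    def __init__(self, blocks: Dict[str, List[List[str]]]) -> None:
        self.blocks = blocks

    def read(self, user: str, split: Split, token: str) -> List[str]:
        table, index = split
        if not hmac.compare_digest(token, _sign(user, table)):
            raise PermissionError("invalid delegation token")
        return self.blocks[table][index]


if __name__ == "__main__":
    blocks = {"tpch.nation_names": [["ALGERIA", "ARGENTINA"], ["BRAZIL"]]}
    planner = Planner({"alice": {"tpch.nation_names"}}, blocks)
    worker = Worker(blocks)
    splits, token = planner.plan("alice", "tpch.nation_names")
    # Step 2 (the job launching its tasks) is reduced to a plain loop.
    for split in splits:
        print(split, worker.read("alice", split, token))
```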

33 Enforcing Sentry Permissions for MR/Spark
Create a view in HMS with the necessary column/row restrictions:
CREATE VIEW nation_names AS SELECT n_nationkey, n_name FROM tpch.nation;
Create a role and assign it to a group:
CREATE ROLE demorole;
GRANT ROLE demorole TO GROUP demogroup;
Grant access privilege to that role:
GRANT SELECT ON TABLE tpch.nation_names TO ROLE demorole;

34 Spark Usage Example: RDD
Import the Record Service package:
scala> import com.cloudera.recordservice.spark._;
import com.cloudera.recordservice.spark._
Read data into a variable using the Record Service API:
scala> val data = sc.recordServiceRecords("select * from tpch.nation_names");
data: org.apache.spark.rdd.RDD[Array[org.apache.hadoop.io.Writable]] = RecordServiceRDD[0] at RDD at RecordServiceRDDBase.scala:57
Perform an action:
scala> data.count();
res0: Long = 25

35 Current Feature Availability
Compute: support for MR (InputFormat) and Spark (RDDs, SparkSQL DataFrames).
Storage: support for reading from HDFS or S3 in these file formats: Parquet, Text, Sequence File, RC, Avro.
Data types: INT (8-64 bits), CHAR/VARCHAR, BOOL, FLOAT, DOUBLE, DECIMAL, STRING, TIMESTAMP. No support for LOBs or nested types.
Scalability: tested up to 80 large/powerful nodes; validated against 1 trillion row (100TB) TeraSort dataset/workload; metadata up to 1M blocks (planning only). Note that TPC-DS run on SparkSQL at the 500GB scale point ran 15% faster with Record Service.
Security: authentication via Kerberos / LDAP / AD; authorization via Sentry table-level privileges, plus column- and row-level privileges using HMS views; delegation token + task encryption for secure task execution.

36 Current Limitations
Security limitations: only supports simple single-table views (no joins or aggregations); SSL support has not been tested; Oozie integration has not been tested; UDFs are not supported.
Storage/file format limitations: no support for write path; unable to read from Kudu or HBase.
Operation and administration limitations: no diagnostic bundle support; no metrics available in CM.
Application integration limitations: Spark DataFrame not well tested.

37 Installation and Platform Support
Installation support: CSD installation on CDH5.4+; parcels, via CM; packages; QuickStart VM; client JARs.
Platform/hardware support: server support for RHEL5-7, Ubuntu LTS, SLES, Debian; Intel Nehalem (or later) or AMD Bulldozer (or later) processor; 64GB memory; for optimal performance, run with 12 or more disks or use SSD.
Operation and administration:
Directly from RecordService: metrics exposed via a RecordService webapp; profiles for requests via the RecordService webapp.
From CM: basic service management (start/stop/restart) and basic health checks (process availability); ability to deploy RecordService Planner, Worker, or Planner+Worker roles.

38 Resources
RecordService Beta Docs
Feature list
RecordService Source Code
RecordServiceClient libraries

39 Thank you
