Flash Storage Complementing a Data Lake for Real-Time Insight

Size: px

Start display at page:

Download "Flash Storage Complementing a Data Lake for Real-Time Insight"

Alice Dickerson
5 years ago
Views:

1 Flash Storage Complementing a Data Lake for Real-Time Insight Dr. Sanhita Sarkar Global Director, Analytics Software Development August 7, 2018

2 Agenda Delivering insight along the entire spectrum from edge to core Coupling big data with fast data: complementing a data lake for real-time insight Stream ingestion to ActiveScale object storage as system a as a component within a data a data lake lake IoT speed layer implementation A real-world IoT use case 2

3 Evolving Landscape from Core to Edge 3

4 Evolving Landscape from Core to Edge consumer platforms & systems embedded & removable solid state & hard disk drives 4

5 Delivering Value to Data from Core to Edge Enable tight coupling between Big Data and Fast Data Big Data Data Aggregation Streaming Analytics Fast Data INSIGHT MOBILITY PREDICTION Batch Analytics Machine Learning REAL-TIME RESULTS Modeling PRESCRIPTION Scale Artificial Intelligence SMART MACHINES Performance 5

Future Data Pooled and Shared by Multiple Applications create transform app deliver create

6 Delivering Value to Data from Core to Edge Enable data sharing and utilization of system resources across diverse applications Past Data Held Captive by Single Application Current and Future Data Pooled and Shared by Multiple Applications create transform app deliver create create store transform store capture store access transform create access store create create deliver transform 6

7 Agenda Delivering insight along the entire spectrum from edge to core Coupling big data with fast data: complementing a data lake for real-time insight Stream ingestion to ActiveScale object storage system as a component within a data lake IoT speed layer implementation 5 A real-world IoT use case 7

8 A Unified Workflow for Analytics on the 3 rd Platform Big data Batch Layer Raw Data Hadoop Store + Map Reduce Batch Views Streaming event data from business processes (vehicles, transactions, ) Capture Merged Views Visualization, Machine Learning, Queries, Models, Metrics Serving Layer Transformed Data Streaming Analytics Fast data Incremental Real-time Views Flash Memor y Summit Speed 2018 Layer Santa Clara, CA 8

9 An Enhanced Workflow for Analytics on the 3 rd Platform Streaming event data from business processes (vehicles, transactions, ) ActiveScale object storage system as the repository of all raw data, metadata & process data Data Sources (Images, Backups, Documents,., Business Process Models) All Raw Data Kafka + Apache (index /tag/ Kafka search/ml) Rack-scale Flash (shared PCIe allflash enclosure, IntelliFlash ), NVMe SSDs: memory latency, I/O throughput Transformed Data Data Lake Apache Kafka S3 connector Big data Data Ingest /ML/ Transformation/ Process Data Persistence S3 / NFS ActiveScale S3A connector Other Apps Hadoop YARN Streaming Analytics Fast data Batch View Real-time View Batch Views Incremental Real-time Views Other non-map Reduce apps can use ActiveScale Real-time & Batch Queries, Search, Machine Learning, Dashboards Merged Views Flash Memor y Summit 2018 Santa Clara, Batch Layer Serving Layer Rack-scale Flash (shared PCIe all-flash enclosure, IntelliFlash ), NVMe SSDs: memory latency, IOPS CA 9 Smart Analytics Speed Layer

10 Aggregated vs. Disaggregated Architectures for Analytics Application Tier Data Store Tier Data Network Object storage system Co-located compute and storage Applications communicate with an integrated compute and storage (HDD/Flash) and with a back-end object storage. Homogenous across resources. Application Tier Data Store Tier Pool of compute Data Network Pool of HDDs Shared pool of flash Object storage system Applications communicate with a disaggregated pool of compute, HDDs, shared flash and object storage. Independent scaling of resources. Heterogeneous, and interoperable across resources. 10

11 Agenda Delivering insight along the entire spectrum from edge to core Coupling big data with fast data: complementing a data lake for real-time insight Stream ingestion to ActiveScale object storage system as a component within a data lake IoT speed layer implementation 5 A real-world IoT use case 11

12 Stream Ingestion from Apache Kafka to ActiveScale Architecture Concurrent Clients Client Node 1 Client Node 2 Client Node GigE connections Streaming data 2 NVMe SSDs in a shared PCIe all-flash enclosure allocated to each Kafka node Apache Kafka Cluster Kafka Node 1 Kafka Node 2 Apache Kafka Connect Cluster Connector 1 Worker 1... Connector 32 Worker 2... Connector 64 Connector 33 Worker 3... Connector 96 Connector GigE connections ActiveScale Scaler 1 Scaler 2 Scaler 3 Client Node 4 Client Node 5 8 GB/s Kafka Node GB/s Worker 4... Connector 97 Connector 128 Client Node 6 Client Node 7 Client Node 8 For each benchmark run on 8 clients: Record size: 500 bytes/message. Sends 800 M messages at 8 GB/s. Publishes to a Kafka topic, 32 M messages at a time. Kafka Node 4 For each benchmark run, Kafka cluster can process: 16 M messages/sec, for the message size of 500 bytes. Average latency of acknowledgment to the clients is 75 ms/message. Worker 5... Connector 129 Connector 160 Worker 6... Connector 161 Connector 192 Each connector in the Kafka Connect Cluster works as a subscriber to a Kafka topic in the Kafka Cluster and sinks the data to the ActiveScale. A total of 192 connectors is writing to ActiveScale and saturating the write throughput to a single ActiveScale system. Achieved average and maximum parallel write throughput to ActiveScale are ~ 4.28 GB/s and 6 GB/s, respectively. Projected throughput will increase by 2x with 12 worker nodes in the Kafka Connect cluster. 12

13 Write Throughput (GB/s) Write Throughput (GB/s) Apache Kafka S3 Connector to ActiveScale : Performance Write Throughput (GB/s) Kafka-S3 Connector Scalability with Number of Connectors Average Maximum Number of Connectors (one connector per partition) An optimal throughput is achieved with a total of 192 connectors writing to a single ActiveScale system. Higher the better An optimal and stable performance is achieved with the S3 upload part size of 128 MB and with a commit size of 1 Million messages (500 MB), irrespective of the number of connector partitions. 48 connectors have been used for these performance tests Average 1.93 Kafka-S3 Connector Scalability with S3 Upload Part Size Maximum MB 64 MB 128 MB 256 MB S3 Upload Part Size (1 Million Messages or 500MB Commit Size) Kafka S3 sink Connector Scalability with File/Object Commit Size Average Maximum File/Object Commit Size (MB) (S3 Upload Part Size - 64 MB) Higher the better Higher the better Flash Memor y Summit 2018 Santa Clara, CA 13

14 Acknowledgment Latency (ms) Apache Kafka Performance on Flash vs. HDDs With 16 M messages/sec, for the message size of 500 bytes: Kafka performance per node: Flash vs. HDD 160 Lower the better The average latency of acknowledgment to the clients by Kafka is 75 ms/message with 2 NVMe SSDs in a shared PCIe all-flash enclosure allocated to each of the Kafka nodes HDDs (5 HDDs/LUN, 2 LUNs) - 6TB 75 1 SSD - 3TB 2 SSDs - 6TB 3 SSDs - 9TB 60 A low latency of Kafka with 2 SSDs allocated to each node of a 4-node cluster has allowed for achieving a rate of 16 M messages/s. With 1 SSD 10 HDDs, in terms of latency, it would be necessary to use 20 HDDs to achieve the above message rate by Kafka. 14

15 Agenda Delivering insight along the entire spectrum from edge to core Coupling big data with fast data: complementing a data lake for real-time insight Stream ingestion to ActiveScale object storage system as a component within a data lake IoT speed layer implementation 6 A real-world IoT use case 15

16 IoT Speed layer: Stream Processing and Storing Requirements: Stream Processing Process events in near-real time : low latency; High Velocity; Guaranteed processing; Fault-tolerant; Scales easily. Choices Spark Streaming Low-Latency, High Throughput, Fault-tolerant; Uses Spark core for large-scale stream processing, In-flight messages; Component of Apache Spark Ecosystem; Uses: ETL on streaming data, Operational dashboards, anomaly & fraud detection, NLP analysis, sensor data analytics; High Adoption Rate. Others: Apache Storm, Flink, Apache Apex, Apache Samza, Apache NiFi, etc. Requirements: Storing Fast/real-time lookup storage; Support for write-intensive workloads; Support for concurrent reads and writes of multiple data sizes within a minimal response time. Choices NoSQL is a good option Cassandra Best-in-class, scalable NoSQL data management system; Parallel, distributed, high throughput, 100% fault-tolerant; No dependency on Hadoop; has Spark connector; Typical applications are IoT, fraud detection, recommendation engines, messaging apps. Others: HBase, Redis, Accumulo, MongoDB, etc. 16

Apache Spark Streaming Performance Processing Time ( ms) 70000 60000 50000 40000 30000 20000 10000 0 Spark Streaming Scalability - Rate of incoming events 10K 20K 40K 50K 60K 80K 100K 120K Incoming

17 Apache Spark Streaming Performance Processing Time ( ms) Spark Streaming Scalability - Rate of incoming events 10K 20K 40K 50K 60K 80K 100K 120K Incoming events per sec Lower the better Spark Streaming is implemented on 4 nodes (24c, 48t, 96GB) with 2 NVMe SSDs in a shared PCIe all-flash enclosure allocated to each node. The processing time of a single streaming application reaches a threshold at 100K events/s. Increasing the number of partitions or streams for a single application, does not improve the processing time, due to the overhead of aggregating results across multiple partitions. From measured data, it is estimated that we could easily execute multiple streaming applications concurrently (a total of 3.2M events) within a 55 sec response time and CPU usage of 75-80%. Processing Time (ms) Scalability of Concurrent Streaming Applications - 4 nodes (24c, 48t, 96GB/node) Concurrent Processing Time (secs) CPU(%)/node 47s Spark Streaming Scalability - Number of Kafka partitions/streams 100K events/sec 45,901 46,928 48s 2% 15% 49,122 1p/1s 4p/4s 8p/8s 96p/96s Number of Streams/Partitions 55s 55,189 Lower the better 75% 1 app (100K/s/app) 8 apps (100K/s/app) 32 apps (100K/s/app) (projected) 17

18 Agenda Delivering insight along the entire spectrum from edge to core Coupling big data with fast data: complementing a data lake for real-time insight Stream ingestion to ActiveScale object storage system as a component within a data lake IoT speed layer implementation 6 A real-world IoT use case 18

Real-world Use Case: Unified Streaming and Batch Analytics on Climate IoT Data ActiveScale object storage system as the storage repository of all raw data Raw data feeds and eventdriven data from

19 Real-world Use Case: Unified Streaming and Batch Analytics on Climate IoT Data ActiveScale object storage system as the storage repository of all raw data Raw data feeds and eventdriven data from weather stations All Raw Data Apache Kafka on a shared PCIe all-flash enclosure Kafka + Data Lake Transformed Data Big data Apache Kafka S3 Connector Tornado alerts Hadoop YARN Streaming Analytics Fast data ActiveScale S3A connector for Hadoop Tornadoes past history Batch View Real-time View Batch Views Incremental Real-time Views Apache Hive Spark Streaming on on a shared PCIe all-flash enclosure Merged Views Cassandra on a shared PCIe allflash enclosure Batch Layer Tornadoes present and past, Forecasting Serving Layer Speed Layer 19 Smart Analytics

20 Summary and Best Practices For efficient coupling between big data and fast data for IoT use cases, it is recommended to: Implement ActiveScale object storage system as a component within a data lake, for storing all the incoming raw streaming data from the ingestion layer. Tune the number of connectors configurable in the Kafka Connect Cluster, the S3 upload part size, and the File/Object commit size (also known as flush size), for ingesting streaming data from Apache Kafka to ActiveScale object storage system with an optimal write throughput. Leverage ActiveScale S3A connector for Hadoop to perform various long-running batch analytics and ad-hoc queries against the petabyte-scale data stored in ActiveScale, without any physical data movement. Implement a shared all-flash storage for the speed layer. Allocate the shared flash storage to compute nodes, for a balanced throughput, IOPS and capacity, required by the applications in the speed layer including ingestion, stream processing and real-time store. Implement the shared all-flash storage to complement an ActiveScale object storage system to achieve a disaggregation of compute and storage for the whole solution, without compromising on long-term data persistence, centralization or quality, durability, high availability and cost. Tune the number of concurrent applications, number of streams per application, and also the allocation of compute and JVM memory to achieve an optimal performance in the speed layer. 20

21 Western Digital, the Western Digital logo, ActiveScale, and IntelliFlash are registered trademarks or trademarks of Western Digital Corporation or its affiliates in the US and/or other countries. Apache, Apache Apex, Apache Hive, Apache Samza, Apache Spark, Apache Storm, Accumulo, Cassandra, Flink, Hadoop, HBase, and Kafka are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. The NVMe word mark is a trademark of NVM Express, Inc. All other marks are the property of their respective owners. 21

Toward a Memory-centric Architecture

Toward a Memory-centric Architecture Martin Fink EVP & Chief Technology Officer Western Digital Corporation August 8, 2017 1 SAFE HARBOR DISCLAIMERS Forward-Looking Statements This presentation contains