STYX: Stream Processing with Trustworthy Cloud-based Execution

Size: px
Start display at page:

Download "STYX: Stream Processing with Trustworthy Cloud-based Execution"

Transcription

1 STYX: Stream Processing with Trustworthy Cloud-based Execution Julian Stephen, Savvas Savvides, Vinaitheerthan Sundaram, Masoud Saeida Ardekani, Patrick Eugster October 6, 2016 Purdue University

2 Table of contents 1. Overview 2. Ensuring confidentiality in the cloud 3. Challenges in encrypted stream processing 4. Architecture 5. STYX abstractions and key update 6. Evaluation 7. Related work and conclusion 2

3 Overview

4 Introduction Compute clouds Data analytics platforms Cost-efficiency, on-demand compute, low infrastructure setup cost IoT 26 billion smart devices connected to a network by 2020 Fine-grained user behavior tracking to capture, personalize and/or monetize user experience Stream processing Analytics on real-time streaming data (continuous queries) Many systems over the last few years - Apache Storm, Apache Spark, Apache Flink, Apache Samza, Amazon Kinesis 4

5 Vulnerabilities - 1 5

6 Vulnerabilities - 1 5

7 Vulnerabilities - 1 A 5

8 Vulnerabilities - 1 A f(a) 5

9 Vulnerabilities - 1 A f(a) 5

10 Vulnerabilities - 1 A f(a) A 5

11 Vulnerabilities - 2 Real problems 6

12 Ensuring confidentiality in the cloud

13 Confidentiality in the cloud Fully homomorphic encryption (FHE) Allows arbitrary computation on encrypted data A f f(a) 8

14 Confidentiality in the cloud Fully homomorphic encryption (FHE) Allows arbitrary computation on encrypted data A f f(a) 8

15 Confidentiality in the cloud Fully homomorphic encryption (FHE) Allows arbitrary computation on encrypted data A f f(a) k E(A, k) 8

16 Confidentiality in the cloud Fully homomorphic encryption (FHE) Allows arbitrary computation on encrypted data A f f(a) k E(A, k) f f (E(A, k)) 8

17 Confidentiality in the cloud Fully homomorphic encryption (FHE) Allows arbitrary computation on encrypted data A f f(a) k D(f (E(A, k))) E(A, k) f f (E(A, k)) 8

18 Confidentiality in the cloud Fully homomorphic encryption (FHE) Allows arbitrary computation on encrypted data Prohibitive overhead 8

19 Confidentiality in the cloud Fully homomorphic encryption (FHE) Allows arbitrary computation on encrypted data Prohibitive overhead Partially homomorphic encryption (PHE) Allows certain operations to be performed over encrypted text AHE: D(E(x1)ψE(x2)) = x1 + x2 AHE, MHE, OPE, DET Conjecture Many data analytics jobs can be performed securely using a combination of partially homomorphic encryption schemes 8

20 Vulnerabilities - 3 9

21 Vulnerabilities - 3 A E(A, k) 9

22 Vulnerabilities - 3 A E(A, k) f (E(A, k)) 9

23 Vulnerabilities - 3 A E(A, k) f (E(A, k)) 9

24 Vulnerabilities - 3 A E(A, k) f (E(A, k)) E(A, k) k A 9

25 Challenges in encrypted stream processing

26 Challenges in encrypted stream processing Programmer effort Need to identify encryption scheme for each input data stream, perform cryptographic equivalent of required operation if (Stream1.f1 < 100) return; else... sum = sum + Stream2.f3; 11

27 Challenges in encrypted stream processing Key change PHE requires all tuples in an aggregate function to be encrypted with same key A A B C D E F G A 11

28 Challenges in encrypted stream processing Deployment optimizations Deployment parameters specified for plaintext data may not be optimal when computation happens on encrypted data 11

29 Challenges in encrypted stream processing Programmer effort Need to identify encryption scheme for each input data stream, perform cryptographic equivalent of required operation Key change PHE requires all tuples in an aggregate function to be encrypted with same key Deployment optimizations Deployment parameters specified for plaintext data may not be optimal when computation happens on encrypted data 11

30 Challenges in encrypted stream processing Programmer effort Need to identify encryption scheme for each input data stream, perform cryptographic equivalent of required operation Key change PHE requires all tuples in an aggregate function to be encrypted with same key Deployment optimizations Deployment parameters specified for plaintext data may not be optimal when computation happens on encrypted data Limitations of PHE PHE may not support a sequence of operations requiring trusted nodes to perform remaining computation 11

31 Challenges in encrypted stream processing Programmer effort Need to identify encryption scheme for each input data stream, perform cryptographic equivalent of required operation Key change PHE requires all tuples in an aggregate function to be encrypted with same key Deployment optimizations Deployment parameters specified for plaintext data may not be optimal when computation happens on encrypted data Limitations of PHE PHE may not support a sequence of operations requiring trusted nodes to perform remaining computation Constants and initialization Variables must be initialized using the encrypted value of the initialization constant 11

32 Challenges in encrypted stream processing Programmer effort Need to identify encryption scheme for each input data stream, perform cryptographic equivalent of required operation Key change PHE requires all tuples in an aggregate function to be encrypted with same key Deployment optimizations Deployment parameters specified for plaintext data may not be optimal when computation happens on encrypted data Limitations of PHE PHE may not support a sequence of operations requiring trusted nodes to perform remaining computation Constants and initialization Variables must be initialized using the encrypted value of the initialization constant 11

33 Architecture

34 STYX architecture Analytical model identifies deployment profile execute the graph crypto systems required to identifies analysis Homomorphism User submits program written using system (STYX) API Execution flow Program (STYX API, Annotations) Homomorphism Analysis

35 STYX abstractions and key update

36 STYX abstraction Group sum in a sliding window 1 /** Track sum of values per group per time slot */ 2 public class SlotBasedSum <T > { public void updatesum ( T group, int slot, SecField val ) { 5 SecField [] sums = objgroupsum. get ( group ); 6 if ( sums == null ) { 7 sums = new SecField [ this. numslots ]; 8 init (sums, val ); 9 objgroupsum. put (obj, sums ); 10 } 11 sums [ slot ] = SecureOper 12. add ( sums [ slot ], val ); 13 } 14 } 15

37 STYX abstraction Group sum in a sliding window 1 /** Track sum of values per group per time slot */ 2 public class SlotBasedSum <T > { public void updatesum ( T group, int slot, SecField val ) { 5 SecField [] sums = objgroupsum. get ( group ); 6 if ( sums == null ) { 7 sums = new SecField [ this. numslots ]; 8 init (sums, val ); 9 objgroupsum. put (obj, sums ); 10 } 11 sums [ slot ] = SecureOper 12. add ( sums [ slot ], val ); 13 } 14 } 15

38 Without STYX abstractions Group sum in a sliding window (Storm) 1 public class SlotBasedSum <T > { 2 BigInteger publickey = readpubkey (); 3 public void updatesum ( T group, int slot, BigInteger value ) { 4 BigInteger [] sums = objgroupsum. get ( group ); 5 if ( sums == null ) { 6 sums = new BigInteger [ this. numslots ]; 7 init (sums, " AHE "); 8 objgroupsum. put ( group, sums ); 9 } 10 sums [ slot ] = sums [ slot ]. multiply ( value ) 11. mod ( publickey. multiply ( publickey )); 12 } 13 } 14 16

39 Without STYX abstractions Group sum in a sliding window (Storm) 1 public class SlotBasedSum <T > { 2 BigInteger publickey = readpubkey (); 3 public void updatesum ( T group, int slot, BigInteger value ) { 4 BigInteger [] sums = objgroupsum. get ( group ); 5 if ( sums == null ) { 6 sums = new BigInteger [ this. numslots ]; 7 init (sums, " AHE "); 8 objgroupsum. put ( group, sums ); 9 } 10 sums [ slot ] = sums [ slot ]. multiply ( value ) 11. mod ( publickey. multiply ( publickey )); 12 } 13 } 14 16

40 Without STYX abstractions Group sum in a sliding window (Storm) 1 public class SlotBasedSum <T > { 2 BigInteger publickey = readpubkey (); 3 public void updatesum ( T group, int slot, BigInteger value ) { 4 BigInteger [] sums = objgroupsum. get ( group ); 5 if ( sums == null ) { 6 sums = new BigInteger [ this. numslots ]; 7 init (sums, " AHE "); 8 objgroupsum. put ( group, sums ); 9 } 10 sums [ slot ] = sums [ slot ]. multiply ( value ) 11. mod ( publickey. multiply ( publickey )); 12 } 13 } 14 16

41 Key change Challenges Functions that aggregates data over a sliding window makes it impossible to change the encryption key without disrupting output Problem A A B C D E F G A 17

42 Key change Challenges Functions that aggregates data over a sliding window makes it impossible to change the encryption key without disrupting output Problem A A B C D E F G A 17

43 Key change Challenges Functions that aggregates data over a sliding window makes it impossible to change the encryption key without disrupting output Problem A A B C D E F G A 17

44 Key change Challenges Functions that aggregates data over a sliding window makes it impossible to change the encryption key without disrupting output Problem A A B C D E F G A 17

45 Key change Challenges Continuous queries that aggregates data over a sliding window makes it impossible to change the encryption key without disrupting output Solution A A B C D E D E F G A 18

46 Key change Challenges Continuous queries that aggregates data over a sliding window makes it impossible to change the encryption key without disrupting output Solution A A B C D E D E F G A 18

47 Key change Challenges Continuous queries that aggregates data over a sliding window makes it impossible to change the encryption key without disrupting output Solution A A B C D E D E F G A 18

48 Key change Challenges Continuous queries that aggregates data over a sliding window makes it impossible to change the encryption key without disrupting output Solution A A B C D E D E F G A 18

49 Key change Challenges Continuous queries that aggregates data over a sliding window makes it impossible to change the encryption key without disrupting output Solution A A B C D E D E F G A 18

50 Evaluation

51 Evaluation - 1 IoT Bench Smart meter data over a 24 hour time period at the rate of 1 reading per minute from 443 unique homes, totaling records 4 ec2 m3.large node Each tuple comprises of timestamp, meter id and meter reading Throughput (#tuples/sec) STYX Storm Q1 Q2 Q3 Q4 Q5 Q6 20

52 Evaluation - 2 Performance when keys change New york taxi route data (10G) Application finds the top 10 most frequent routes during the last 30 minutes of taxi servicing 9 ec2 m3.large node Response Time (ms) Time (s) 21

53 Related work and conclusion

54 Related work Prior work on encrypted computaion [Popa et al.; SOSP 11]; [Gentry.; STOC 09]; [Stephen et al.; VLDB 13]; Other approaches Using secure processors (e.g., Intel SGX) 23

55 Conclusion Conclusion Confidentiality breaches pose a serious threat to adoption and utilization of cloud resources PHE has proven to be effective for various batch workloads STYX leverage PHE for stream data analytics in the cloud Makes it easier for programmers to use PHE Automatically translates the program and optimizes deployment parameters 24

56 Questions 25

Northrop Grumman Cybersecurity Research Consortium (NGCRC) Spring 2014 Symposium

Northrop Grumman Cybersecurity Research Consortium (NGCRC) Spring 2014 Symposium Northrop Grumman Cybersecurity Research Consortium (NGCRC) Spring 2014 Symposium Crypsis: Secure Big Data Analysis in Untrusted Clouds 28 May 2014 Julian Stephen, Savvas Savvides, Russell Seidel and Patrick

More information

Secure Big Data Processing. Patrick Eugster Purdue University and TU Darmstadt

Secure Big Data Processing. Patrick Eugster Purdue University and TU Darmstadt Secure Big Data Processing Patrick Eugster Purdue University and TU Darmstadt 1 Outline Context Availability and integrity in big data analytics Privacy in big data analytics Conclusions and outlook 2

More information

Towards a Real- time Processing Pipeline: Running Apache Flink on AWS

Towards a Real- time Processing Pipeline: Running Apache Flink on AWS Towards a Real- time Processing Pipeline: Running Apache Flink on AWS Dr. Steffen Hausmann, Solutions Architect Michael Hanisch, Manager Solutions Architecture November 18 th, 2016 Stream Processing Challenges

More information

AWS & Intel: A Partnership Dedicated to fueling your Innovations. Thomas Kellerer BDM CSP, Intel Central Europe

AWS & Intel: A Partnership Dedicated to fueling your Innovations. Thomas Kellerer BDM CSP, Intel Central Europe AWS & Intel: A Partnership Dedicated to fueling your Innovations Thomas Kellerer BDM CSP, Intel Central Europe The Digital Service Economy Growth in connected devices enables new business opportunities

More information

CERIAS Tech Report Securing Cloud-Based Data Analytics: A Practical Approach by Julian James Stephen Center for Education and Research

CERIAS Tech Report Securing Cloud-Based Data Analytics: A Practical Approach by Julian James Stephen Center for Education and Research CERIAS Tech Report 2016-9 Securing Cloud-Based Data Analytics: A Practical Approach by Julian James Stephen Center for Education and Research Information Assurance and Security Purdue University, West

More information

Practical Big Data Processing An Overview of Apache Flink

Practical Big Data Processing An Overview of Apache Flink Practical Big Data Processing An Overview of Apache Flink Tilmann Rabl Berlin Big Data Center www.dima.tu-berlin.de bbdc.berlin rabl@tu-berlin.de With slides from Volker Markl and data artisans 1 2013

More information

SHE AND FHE. Hammad Mushtaq ENEE759L March 10, 2014

SHE AND FHE. Hammad Mushtaq ENEE759L March 10, 2014 SHE AND FHE Hammad Mushtaq ENEE759L March 10, 2014 Outline Introduction Needs Analogy Somewhat Homomorphic Encryption (SHE) RSA, EL GAMAL (MULT) Pallier (XOR and ADD) Fully Homomorphic Encryption (FHE)

More information

Processing Analytical Queries over Encrypted Data

Processing Analytical Queries over Encrypted Data Processing Analytical Queries over Encrypted Data Stephen Tu M. Frans Kaashoek Sam Madden Nickolai Zeldovich VLDB 2013 Introduction MONOMI a system for securely executing analytical queries over sensitive

More information

DRIZZLE: FAST AND Adaptable STREAM PROCESSING AT SCALE

DRIZZLE: FAST AND Adaptable STREAM PROCESSING AT SCALE DRIZZLE: FAST AND Adaptable STREAM PROCESSING AT SCALE Shivaram Venkataraman, Aurojit Panda, Kay Ousterhout, Michael Armbrust, Ali Ghodsi, Michael Franklin, Benjamin Recht, Ion Stoica STREAMING WORKLOADS

More information

How to Route Internet Traffic between A Mobile Application and IoT Device?

How to Route Internet Traffic between A Mobile Application and IoT Device? Whitepaper How to Route Internet Traffic between A Mobile Application and IoT Device? Website: www.mobodexter.com www.paasmer.co 1 Table of Contents 1. Introduction 3 2. Approach: 1 Uses AWS IoT Setup

More information

US Census Bureau Workshop on Multi-party Computing. David W. Archer, PhD 16-Nov-2017

US Census Bureau Workshop on Multi-party Computing. David W. Archer, PhD 16-Nov-2017 US Census Bureau Workshop on Multi-party Computing David W. Archer, PhD 16-Nov-2017 Census First-round Adoption Concerns Technology maturity Computational overhead Complexity of getting this stuff to work

More information

Managing IoT and Time Series Data with Amazon ElastiCache for Redis

Managing IoT and Time Series Data with Amazon ElastiCache for Redis Managing IoT and Time Series Data with ElastiCache for Redis Darin Briskman, ElastiCache Developer Outreach Michael Labib, Specialist Solutions Architect 2016, Web Services, Inc. or its Affiliates. All

More information

Lambda Architecture for Batch and Stream Processing. October 2018

Lambda Architecture for Batch and Stream Processing. October 2018 Lambda Architecture for Batch and Stream Processing October 2018 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document is provided for informational purposes only.

More information

UNIFY DATA AT MEMORY SPEED. Haoyuan (HY) Li, Alluxio Inc. VAULT Conference 2017

UNIFY DATA AT MEMORY SPEED. Haoyuan (HY) Li, Alluxio Inc. VAULT Conference 2017 UNIFY DATA AT MEMORY SPEED Haoyuan (HY) Li, CEO @ Alluxio Inc. VAULT Conference 2017 March 2017 HISTORY Started at UC Berkeley AMPLab In Summer 2012 Originally named as Tachyon Rebranded to Alluxio in

More information

TOWARDS PORTABILITY AND BEYOND. Maximilian maximilianmichels.com DATA PROCESSING WITH APACHE BEAM

TOWARDS PORTABILITY AND BEYOND. Maximilian maximilianmichels.com DATA PROCESSING WITH APACHE BEAM TOWARDS PORTABILITY AND BEYOND Maximilian Michels mxm@apache.org DATA PROCESSING WITH APACHE BEAM @stadtlegende maximilianmichels.com !2 BEAM VISION Write Pipeline Execute SDKs Runners Backends !3 THE

More information

Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015

Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015 Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL May 2015 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document

More information

Stream Processing on IoT Devices using Calvin Framework

Stream Processing on IoT Devices using Calvin Framework Stream Processing on IoT Devices using Calvin Framework by Ameya Nayak A Project Report Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science Supervised

More information

Serverless Computing. Redefining the Cloud. Roger S. Barga, Ph.D. General Manager Amazon Web Services

Serverless Computing. Redefining the Cloud. Roger S. Barga, Ph.D. General Manager Amazon Web Services Serverless Computing Redefining the Cloud Roger S. Barga, Ph.D. General Manager Amazon Web Services Technology Triggers Highly Recommended http://a16z.com/2016/12/16/the-end-of-cloud-computing/ Serverless

More information

Building systems that compute on encrypted data

Building systems that compute on encrypted data ? xd51db5 X9ce568 xab2356 x453a32 xe891a1 X32e1dc xdd0135 x63ab12 Building systems that compute on encrypted data Raluca Ada Popa MIT Compromise of confidential data is prevalent Problem setup clients

More information

Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda

Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda 1 Motivation And Intro Programming Model Spark Data Transformation Model Construction Model Training Model Inference Execution Model Data Parallel Training

More information

TAKING THE MODULAR VIEW

TAKING THE MODULAR VIEW TAKING THE MODULAR VIEW Extracting security from the application Chenxi Wang, Ph.D. Forrester Research SANS Application Security Summit, May, 2012 Application security remains an elusive goal 2012 Breach

More information

Hadoop. Introduction / Overview

Hadoop. Introduction / Overview Hadoop Introduction / Overview Preface We will use these PowerPoint slides to guide us through our topic. Expect 15 minute segments of lecture Expect 1-4 hour lab segments Expect minimal pretty pictures

More information

@unterstein #bedcon. Operating microservices with Apache Mesos and DC/OS

@unterstein #bedcon. Operating microservices with Apache Mesos and DC/OS @unterstein @dcos @bedcon #bedcon Operating microservices with Apache Mesos and DC/OS 1 Johannes Unterstein Software Engineer @Mesosphere @unterstein @unterstein.mesosphere 2017 Mesosphere, Inc. All Rights

More information

Implementation of 5PM(5ecure Pattern Matching) on Android Platform

Implementation of 5PM(5ecure Pattern Matching) on Android Platform Implementation of 5PM(5ecure Pattern Matching) on Android Platform Overview - Main Objective: Search for a pattern on the server securely The answer at the end -> either YES it is found or NO it is not

More information

Apache Flink. Alessandro Margara

Apache Flink. Alessandro Margara Apache Flink Alessandro Margara alessandro.margara@polimi.it http://home.deib.polimi.it/margara Recap: scenario Big Data Volume and velocity Process large volumes of data possibly produced at high rate

More information

Pocket: Elastic Ephemeral Storage for Serverless Analytics

Pocket: Elastic Ephemeral Storage for Serverless Analytics Pocket: Elastic Ephemeral Storage for Serverless Analytics Ana Klimovic*, Yawen Wang*, Patrick Stuedi +, Animesh Trivedi +, Jonas Pfefferle +, Christos Kozyrakis* *Stanford University, + IBM Research 1

More information

Energy Management with AWS

Energy Management with AWS Energy Management with AWS Kyle Hart and Nandakumar Sreenivasan Amazon Web Services August [XX], 2017 Tampa Convention Center Tampa, Florida What is Cloud? The NIST Definition Broad Network Access On-Demand

More information

The Stream Processor as a Database. Ufuk

The Stream Processor as a Database. Ufuk The Stream Processor as a Database Ufuk Celebi @iamuce Realtime Counts and Aggregates The (Classic) Use Case 2 (Real-)Time Series Statistics Stream of Events Real-time Statistics 3 The Architecture collect

More information

5 Steps to Government IT Modernization

5 Steps to Government IT Modernization 5 Steps to Government IT Modernization 1 WHY MODERNIZE? IT modernization is intimidating, but it s necessary. What are the advantages of modernization? Enhance citizen experience and service delivery Lower

More information

The Pliny Database PDB

The Pliny Database PDB The Pliny Database PDB Chris Jermaine Carlos Monroy, Kia Teymourian, Sourav Sikdar Rice University 1 PDB Overview PDB: Distributed object store + compute platform In Pliny project: Used to store processed

More information

Unifying Big Data Workloads in Apache Spark

Unifying Big Data Workloads in Apache Spark Unifying Big Data Workloads in Apache Spark Hossein Falaki @mhfalaki Outline What s Apache Spark Why Unification Evolution of Unification Apache Spark + Databricks Q & A What s Apache Spark What is Apache

More information

Hadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved

Hadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved Hadoop 2.x Core: YARN, Tez, and Spark YARN Hadoop Machine Types top-of-rack switches core switch client machines have client-side software used to access a cluster to process data master nodes run Hadoop

More information

Big Data on AWS. Big Data Agility and Performance Delivered in the Cloud. 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Big Data on AWS. Big Data Agility and Performance Delivered in the Cloud. 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Big Data on AWS Big Data Agility and Performance Delivered in the Cloud 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Big Data Technologies and techniques for working productively

More information

Portable stateful big data processing in Apache Beam

Portable stateful big data processing in Apache Beam Portable stateful big data processing in Apache Beam Kenneth Knowles Apache Beam PMC Software Engineer @ Google klk@google.com / @KennKnowles https://s.apache.org/ffsf-2017-beam-state Flink Forward San

More information

Real-time data processing with Apache Flink

Real-time data processing with Apache Flink Real-time data processing with Apache Flink Gyula Fóra gyfora@apache.org Flink committer Swedish ICT Stream processing Data stream: Infinite sequence of data arriving in a continuous fashion. Stream processing:

More information

DATA SCIENCE USING SPARK: AN INTRODUCTION

DATA SCIENCE USING SPARK: AN INTRODUCTION DATA SCIENCE USING SPARK: AN INTRODUCTION TOPICS COVERED Introduction to Spark Getting Started with Spark Programming in Spark Data Science with Spark What next? 2 DATA SCIENCE PROCESS Exploratory Data

More information

Intro to Big Data on AWS Igor Roiter Big Data Cloud Solution Architect

Intro to Big Data on AWS Igor Roiter Big Data Cloud Solution Architect Intro to Big Data on AWS Igor Roiter Big Data Cloud Solution Architect Igor Roiter Big Data Cloud Solution Architect Working as a Data Specialist for the last 11 years 9 of them as a Consultant specializing

More information

Cooperative Private Searching in Clouds

Cooperative Private Searching in Clouds Cooperative Private Searching in Clouds Jie Wu Department of Computer and Information Sciences Temple University Road Map Cloud Computing Basics Cloud Computing Security Privacy vs. Performance Proposed

More information

Data Analytics with HPC. Data Streaming

Data Analytics with HPC. Data Streaming Data Analytics with HPC Data Streaming Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us

More information

SOC-2 Requirement Solution Brief. EventTracker 8815 Centre Park Drive, Columbia MD SOC-2

SOC-2 Requirement Solution Brief. EventTracker 8815 Centre Park Drive, Columbia MD SOC-2 Requirement Solution Brief 8815 Centre Park Drive, Columbia MD 21045 About delivers business critical software and services that transform high-volume cryptic log data into actionable, prioritized intelligence

More information

Analysis of Partially and Fully Homomorphic Encryption

Analysis of Partially and Fully Homomorphic Encryption Analysis of Partially and Fully Homomorphic Encryption Liam Morris lcm1115@rit.edu Department of Computer Science, Rochester Institute of Technology, Rochester, New York May 10, 2013 1 Introduction Homomorphic

More information

Healthcare in the Public Cloud DIY vs. Managed Services

Healthcare in the Public Cloud DIY vs. Managed Services Business White Paper Healthcare in the Public Cloud DIY vs. Managed Services Page 2 of 9 Healthcare in the Public Cloud DIY vs. Managed Services Table of Contents Page 2 Healthcare Cloud Migration Page

More information

And Then There Were More:

And Then There Were More: David Naylor Carnegie Mellon And Then There Were More: Secure Communication for More Than Two Parties Richard Li University of Utah Christos Gkantsidis Microsoft Research Thomas Karagiannis Microsoft Research

More information

AWS IoT Overview. July 2016 Thomas Jones, Partner Solutions Architect

AWS IoT Overview. July 2016 Thomas Jones, Partner Solutions Architect AWS IoT Overview July 2016 Thomas Jones, Partner Solutions Architect AWS customers are connecting physical things to the cloud in every industry imaginable. Healthcare and Life Sciences Municipal Infrastructure

More information

Deep Dive Amazon Kinesis. Ian Meyers, Principal Solution Architect - Amazon Web Services

Deep Dive Amazon Kinesis. Ian Meyers, Principal Solution Architect - Amazon Web Services Deep Dive Amazon Kinesis Ian Meyers, Principal Solution Architect - Amazon Web Services Analytics Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure

More information

WHY AND HOW TO LEVERAGE THE POWER AND SIMPLICITY OF SQL ON APACHE FLINK - FABIAN HUESKE, SOFTWARE ENGINEER

WHY AND HOW TO LEVERAGE THE POWER AND SIMPLICITY OF SQL ON APACHE FLINK - FABIAN HUESKE, SOFTWARE ENGINEER WHY AND HOW TO LEVERAGE THE POWER AND SIMPLICITY OF SQL ON APACHE FLINK - FABIAN HUESKE, SOFTWARE ENGINEER ABOUT ME Apache Flink PMC member & ASF member Contributing since day 1 at TU Berlin Focusing on

More information

Python, PySpark and Riak TS. Stephen Etheridge Lead Solution Architect, EMEA

Python, PySpark and Riak TS. Stephen Etheridge Lead Solution Architect, EMEA Python, PySpark and Riak TS Stephen Etheridge Lead Solution Architect, EMEA Agenda Introduction to Riak TS The Riak Python client The Riak Spark connector and PySpark CONFIDENTIAL Basho Technologies 3

More information

Hortonworks and The Internet of Things

Hortonworks and The Internet of Things Hortonworks and The Internet of Things Dr. Bernhard Walter Solutions Engineer About Hortonworks Customer Momentum ~700 customers (as of November 4, 2015) 152 customers added in Q3 2015 Publicly traded

More information

Flash Storage Complementing a Data Lake for Real-Time Insight

Flash Storage Complementing a Data Lake for Real-Time Insight Flash Storage Complementing a Data Lake for Real-Time Insight Dr. Sanhita Sarkar Global Director, Analytics Software Development August 7, 2018 Agenda 1 2 3 4 5 Delivering insight along the entire spectrum

More information

Apache Ignite and Apache Spark Where Fast Data Meets the IoT

Apache Ignite and Apache Spark Where Fast Data Meets the IoT Apache Ignite and Apache Spark Where Fast Data Meets the IoT Denis Magda GridGain Product Manager Apache Ignite PMC http://ignite.apache.org #apacheignite #denismagda Agenda IoT Demands to Software IoT

More information

MapR Enterprise Hadoop

MapR Enterprise Hadoop 2014 MapR Technologies 2014 MapR Technologies 1 MapR Enterprise Hadoop Top Ranked Cloud Leaders 500+ Customers 2014 MapR Technologies 2 Key MapR Advantage Partners Business Services APPLICATIONS & OS ANALYTICS

More information

SafeBricks: Shielding Network Functions in the Cloud

SafeBricks: Shielding Network Functions in the Cloud SafeBricks: Shielding Network Functions in the Cloud Rishabh Poddar, Chang Lan, Raluca Ada Popa, Sylvia Ratnasamy UC Berkeley Network Functions (NFs) in the cloud Clients 2 Enterprise Destination Network

More information

Blended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a)

Blended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a) Blended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a) Cloudera s Developer Training for Apache Spark and Hadoop delivers the key concepts and expertise need to develop high-performance

More information

WHITEPAPER. MemSQL Enterprise Feature List

WHITEPAPER. MemSQL Enterprise Feature List WHITEPAPER MemSQL Enterprise Feature List 2017 MemSQL Enterprise Feature List DEPLOYMENT Provision and deploy MemSQL anywhere according to your desired cluster configuration. On-Premises: Maximize infrastructure

More information

2013 Cisco and/or its affiliates. All rights reserved. 1

2013 Cisco and/or its affiliates. All rights reserved. 1 2013 Cisco and/or its affiliates. All rights reserved. 1 Building the Internet of Things Jim Green - CTO, Data & Analytics Business Group, Cisco Systems Brian McCarson Sr. Principal Engineer & Sr. System

More information

SnailTrail Generalizing Critical Paths for Online Analysis of Distributed Dataflows

SnailTrail Generalizing Critical Paths for Online Analysis of Distributed Dataflows SnailTrail Generalizing Critical Paths for Online Analysis of Distributed Dataflows Moritz Hoffmann, Andrea Lattuada, John Liagouris, Vasiliki Kalavri, Desislava Dimitrova, Sebastian Wicki, Zaheer Chothia,

More information

Scalable Tools - Part I Introduction to Scalable Tools

Scalable Tools - Part I Introduction to Scalable Tools Scalable Tools - Part I Introduction to Scalable Tools Adisak Sukul, Ph.D., Lecturer, Department of Computer Science, adisak@iastate.edu http://web.cs.iastate.edu/~adisak/mbds2018/ Scalable Tools session

More information

Apache Flink- A System for Batch and Realtime Stream Processing

Apache Flink- A System for Batch and Realtime Stream Processing Apache Flink- A System for Batch and Realtime Stream Processing Lecture Notes Winter semester 2016 / 2017 Ludwig-Maximilians-University Munich Prof Dr. Matthias Schubert 2016 Introduction to Apache Flink

More information

PREDICTIVE DATACENTER ANALYTICS WITH STRYMON

PREDICTIVE DATACENTER ANALYTICS WITH STRYMON PREDICTIVE DATACENTER ANALYTICS WITH STRYMON Vasia Kalavri kalavriv@inf.ethz.ch QCon San Francisco 14 November 2017 Support: ABOUT ME Postdoc at ETH Zürich Systems Group: https://www.systems.ethz.ch/ PMC

More information

Lecture Notes to Big Data Management and Analytics Winter Term 2017/2018 Apache Flink

Lecture Notes to Big Data Management and Analytics Winter Term 2017/2018 Apache Flink Lecture Notes to Big Data Management and Analytics Winter Term 2017/2018 Apache Flink Matthias Schubert, Matthias Renz, Felix Borutta, Evgeniy Faerman, Christian Frey, Klaus Arthur Schmid, Daniyal Kazempour,

More information

Maximum Sustainable Throughput Prediction for Large-Scale Data Streaming Systems

Maximum Sustainable Throughput Prediction for Large-Scale Data Streaming Systems JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST 215 1 Maximum Sustainable Throughput Prediction for Large-Scale Data Streaming Systems Abstract In cloud-based stream processing services, the maximum

More information

Developing the ERS Collaboration Framework

Developing the ERS Collaboration Framework 1 Developing the ERS Collaboration Framework Patrick J. Martin, Ph.D. BAE Systems Technology Solutions patrick.j.martin@baesystems.com 10-26-2016 2 ERS Development Challenges Resilient System A system

More information

Amazon Search Services. Christoph Schmitter

Amazon Search Services. Christoph Schmitter Amazon Search Services Christoph Schmitter csc@amazon.de What we'll cover Overview of Amazon Search Services Understand the difference between Cloudsearch and Amazon ElasticSearch Service Q&A Amazon Search

More information

Deploying Applications on DC/OS

Deploying Applications on DC/OS Mesosphere Datacenter Operating System Deploying Applications on DC/OS Keith McClellan - Technical Lead, Federal Programs keith.mcclellan@mesosphere.com V6 THE FUTURE IS ALREADY HERE IT S JUST NOT EVENLY

More information

Apache Ignite - Using a Memory Grid for Heterogeneous Computation Frameworks A Use Case Guided Explanation. Chris Herrera Hashmap

Apache Ignite - Using a Memory Grid for Heterogeneous Computation Frameworks A Use Case Guided Explanation. Chris Herrera Hashmap Apache Ignite - Using a Memory Grid for Heterogeneous Computation Frameworks A Use Case Guided Explanation Chris Herrera Hashmap Topics Who - Key Hashmap Team Members The Use Case - Our Need for a Memory

More information

Vision of the Software Defined Data Center (SDDC)

Vision of the Software Defined Data Center (SDDC) Vision of the Software Defined Data Center (SDDC) Raj Yavatkar, VMware Fellow Vijay Ramachandran, Sr. Director, Storage Product Management Business transformation and disruption A software business that

More information

Cloud Computing and Service-Oriented Architectures

Cloud Computing and Service-Oriented Architectures Material and some slide content from: - Atif Kahn SERVICES COMPONENTS OBJECTS MODULES Cloud Computing and Service-Oriented Architectures Reid Holmes Lecture 29 - Friday March 22 2013. Cloud precursors

More information

Nomad: Mitigating Arbitrary Cloud Side Channels via Provider-Assisted Migration

Nomad: Mitigating Arbitrary Cloud Side Channels via Provider-Assisted Migration Nomad: Mitigating Arbitrary Cloud Side Channels via Provider-Assisted Migration Soo-Jin Moon, Vyas Sekar Michael K. Reiter Context: Infrastructure-as-a-Service Clouds Client API Cloud Controller Machine

More information

What s New at AWS? looking at just a few new things for Enterprise. Philipp Behre, Enterprise Solutions Architect, Amazon Web Services

What s New at AWS? looking at just a few new things for Enterprise. Philipp Behre, Enterprise Solutions Architect, Amazon Web Services What s New at AWS? looking at just a few new things for Enterprise Philipp Behre, Enterprise Solutions Architect, Amazon Web Services 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

More information

Over the last few years, we have seen a disruption in the data management

Over the last few years, we have seen a disruption in the data management JAYANT SHEKHAR AND AMANDEEP KHURANA Jayant is Principal Solutions Architect at Cloudera working with various large and small companies in various Verticals on their big data and data science use cases,

More information

Microsoft. Exam Questions Perform Data Engineering on Microsoft Azure HDInsight (beta) Version:Demo

Microsoft. Exam Questions Perform Data Engineering on Microsoft Azure HDInsight (beta) Version:Demo Microsoft Exam Questions 70-775 Perform Data Engineering on Microsoft Azure HDInsight (beta) Version:Demo NEW QUESTION 1 You have an Azure HDInsight cluster. You need to store data in a file format that

More information

Mobile Security Fall 2012

Mobile Security Fall 2012 Mobile Security 14-829 Fall 2012 Patrick Tague Class #9 The Internet of Things Partial slide credit to L. Zoia and Y. Zhang Announcements If you haven't signed up for a Survey presentation (two teams,

More information

VMware vsphere Clusters in Security Zones

VMware vsphere Clusters in Security Zones SOLUTION OVERVIEW VMware vsan VMware vsphere Clusters in Security Zones A security zone, also referred to as a DMZ," is a sub-network that is designed to provide tightly controlled connectivity to an organization

More information

Deep Dive into Concepts and Tools for Analyzing Streaming Data

Deep Dive into Concepts and Tools for Analyzing Streaming Data Deep Dive into Concepts and Tools for Analyzing Streaming Data Dr. Steffen Hausmann Sr. Solutions Architect, Amazon Web Services Data originates in real-time Photo by mountainamoeba https://www.flickr.com/photos/mountainamoeba/2527300028/

More information

IBM Db2 Event Store Simplifying and Accelerating Storage and Analysis of Fast Data. IBM Db2 Event Store

IBM Db2 Event Store Simplifying and Accelerating Storage and Analysis of Fast Data. IBM Db2 Event Store IBM Db2 Event Store Simplifying and Accelerating Storage and Analysis of Fast Data IBM Db2 Event Store Disclaimer The information contained in this presentation is provided for informational purposes only.

More information

2017 Riverbed Technology. All rights reserved.

2017 Riverbed Technology. All rights reserved. 2017 Riverbed Technology. All rights reserved. Digital Enterprise Matt Eastwood, SVP Digital Disruption is Real $B Share Price ($ Million) $1,000 A Tale of Two Companies +1150% Walmart $750 $500 Amazon

More information

8/24/2017 Week 1-B Instructor: Sangmi Lee Pallickara

8/24/2017 Week 1-B Instructor: Sangmi Lee Pallickara Week 1-B-0 Week 1-B-1 CS535 BIG DATA FAQs Slides are available on the course web Wait list Term project topics PART 0. INTRODUCTION 2. DATA PROCESSING PARADIGMS FOR BIG DATA Sangmi Lee Pallickara Computer

More information

Sparrow. Distributed Low-Latency Spark Scheduling. Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Sparrow. Distributed Low-Latency Spark Scheduling. Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica Outline The Spark scheduling bottleneck Sparrow s fully distributed, fault-tolerant technique

More information

Packet-Level Network Analytics without Compromises NANOG 73, June 26th 2018, Denver, CO. Oliver Michel

Packet-Level Network Analytics without Compromises NANOG 73, June 26th 2018, Denver, CO. Oliver Michel Packet-Level Network Analytics without Compromises NANOG 73, June 26th 2018, Denver, CO Oliver Michel Network monitoring is important Security issues Performance issues Equipment failure Analytics Platform

More information

vsan Security Zone Deployment First Published On: Last Updated On:

vsan Security Zone Deployment First Published On: Last Updated On: First Published On: 06-14-2017 Last Updated On: 11-20-2017 1 1. vsan Security Zone Deployment 1.1.Solution Overview Table of Contents 2 1. vsan Security Zone Deployment 3 1.1 Solution Overview VMware vsphere

More information

Apache Bahir Writing Applications using Apache Bahir

Apache Bahir Writing Applications using Apache Bahir Apache Big Data Seville 2016 Apache Bahir Writing Applications using Apache Bahir Luciano Resende About Me Luciano Resende (lresende@apache.org) Architect and community liaison at Have been contributing

More information

Sub-millisecond Stateful Stream Querying over Fast-evolving Linked Data

Sub-millisecond Stateful Stream Querying over Fast-evolving Linked Data Sub-millisecond Stateful Stream Querying over Fast-evolving Linked Data Yunhao Zhang, Rong Chen, Haibo Chen Institute of Parallel and Distributed Systems (IPADS) Shanghai Jiao Tong University Stream Query

More information

Internet of Things: Driving the Transformation

Internet of Things: Driving the Transformation Internet of Things: Driving the Transformation Annabel Nickles, PhD, MBA Director, Emerging Platform Solutions Integrated Computing Research Intel Labs 1 What Are People Saying about IOT? Vol. 12345 Nr.001

More information

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case

More information

Homomorphic Encryption

Homomorphic Encryption Homomorphic Encryption Travis Mayberry Cloud Computing Cloud Computing Cloud Computing Cloud Computing Cloud Computing Northeastern saves money on infrastructure and gets the benefit of redundancy and

More information

Completing your AWS Cloud SECURING YOUR AMAZON WEB SERVICES ENVIRONMENT

Completing your AWS Cloud SECURING YOUR AMAZON WEB SERVICES ENVIRONMENT Completing your AWS Cloud SECURING YOUR AMAZON WEB SERVICES ENVIRONMENT Introduction Amazon Web Services (AWS) provides Infrastructure as a Service (IaaS) cloud offerings for organizations. Using AWS,

More information

OnboardICNg: a Secure Protocol for On-boarding IoT Devices in ICN

OnboardICNg: a Secure Protocol for On-boarding IoT Devices in ICN OnboardICNg: a Secure Protocol for On-boarding IoT Devices in ICN Alberto Compagno 1,3, Mauro Conti 2 and Ralph Droms 3 1 Sapienza University of Rome 2 University of Padua 3 Cisco Systems 3rd ACM Conference

More information

SAP VORA 1.4 on AWS - MARKETPLACE EDITION FREQUENTLY ASKED QUESTIONS

SAP VORA 1.4 on AWS - MARKETPLACE EDITION FREQUENTLY ASKED QUESTIONS SAP VORA 1.4 on AWS - MARKETPLACE EDITION FREQUENTLY ASKED QUESTIONS 1. What is SAP Vora? SAP Vora is an in-memory, distributed computing solution that helps organizations uncover actionable business insights

More information

exam. Microsoft Perform Data Engineering on Microsoft Azure HDInsight. Version 1.0

exam.   Microsoft Perform Data Engineering on Microsoft Azure HDInsight. Version 1.0 70-775.exam Number: 70-775 Passing Score: 800 Time Limit: 120 min File Version: 1.0 Microsoft 70-775 Perform Data Engineering on Microsoft Azure HDInsight Version 1.0 Exam A QUESTION 1 You use YARN to

More information

Storm. Distributed and fault-tolerant realtime computation. Nathan Marz Twitter

Storm. Distributed and fault-tolerant realtime computation. Nathan Marz Twitter Storm Distributed and fault-tolerant realtime computation Nathan Marz Twitter Storm at Twitter Twitter Web Analytics Before Storm Queues Workers Example (simplified) Example Workers schemify tweets and

More information

The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*, Kafka* Elizabeth K. Dublin Apache Kafka Meetup, 30 August 2017.

The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*, Kafka* Elizabeth K. Dublin Apache Kafka Meetup, 30 August 2017. Dublin Apache Kafka Meetup, 30 August 2017 The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*, Kafka* Elizabeth K. Joseph @pleia2 * ASF projects 1 Elizabeth K. Joseph, Developer Advocate Developer Advocate

More information

Understanding the latent value in all content

Understanding the latent value in all content Understanding the latent value in all content John F. Kennedy (JFK) November 22, 1963 INGEST ENRICH EXPLORE Cognitive skills Data in any format, any Azure store Search Annotations Data Cloud Intelligence

More information

Big data streaming: Choices for high availability and disaster recovery on Microsoft Azure. By Arnab Ganguly DataCAT

Big data streaming: Choices for high availability and disaster recovery on Microsoft Azure. By Arnab Ganguly DataCAT : Choices for high availability and disaster recovery on Microsoft Azure By Arnab Ganguly DataCAT March 2019 Contents Overview... 3 The challenge of a single-region architecture... 3 Configuration considerations...

More information

Benchmarking Distributed Stream Processing Platforms for IoT Applications

Benchmarking Distributed Stream Processing Platforms for IoT Applications DISTRIBUTED RESEARCH ON EMERGING APPLICATIONS & MACHINES dream-lab.in Indian Institute of Science, Bangalore DREAM:Lab Benchmarking Distributed Stream Processing Platforms for IoT Applications Anshu Shukla

More information

Certified Big Data Hadoop and Spark Scala Course Curriculum

Certified Big Data Hadoop and Spark Scala Course Curriculum Certified Big Data Hadoop and Spark Scala Course Curriculum The Certified Big Data Hadoop and Spark Scala course by DataFlair is a perfect blend of indepth theoretical knowledge and strong practical skills

More information

BIG DATA COURSE CONTENT

BIG DATA COURSE CONTENT BIG DATA COURSE CONTENT [I] Get Started with Big Data Microsoft Professional Orientation: Big Data Duration: 12 hrs Course Content: Introduction Course Introduction Data Fundamentals Introduction to Data

More information

Security Overview of the BGI Online Platform

Security Overview of the BGI Online Platform WHITEPAPER 2015 BGI Online All rights reserved Version: Draft v3, April 2015 Security Overview of the BGI Online Platform Data security is, in general, a very important aspect in computing. We put extra

More information

Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware

Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware Florian Tramèr (joint work with Dan Boneh) Intel, Santa Clara August 30 th 2018 Trusted execution of ML: 3 motivating

More information

IBM Cloud Security for the Cloud. Amr Ismail Security Solutions Sales Leader Middle East & Pakistan

IBM Cloud Security for the Cloud. Amr Ismail Security Solutions Sales Leader Middle East & Pakistan IBM Cloud Security for the Cloud Amr Ismail Security Solutions Sales Leader Middle East & Pakistan Today s Drivers for Cloud Adoption ELASTIC LOWER COST SOLVES SKILLS SHORTAGE RAPID INNOVATION GREATER

More information

New Approach to Unstructured Data

New Approach to Unstructured Data Innovations in All-Flash Storage Deliver a New Approach to Unstructured Data Table of Contents Developing a new approach to unstructured data...2 Designing a new storage architecture...2 Understanding

More information