Towards Scalable Cluster Auditing through Grammatical Inference over Provenance Graphs

Size: px
Start display at page:

Download "Towards Scalable Cluster Auditing through Grammatical Inference over Provenance Graphs"

Transcription

1 Towards Scalable Cluster Auditing through Grammatical Inference over Provenance Graphs Wajih Ul Hassan, Mark Lemay, Nuraini Aguse, Adam Bates, Thomas Moyer NDSS Symposium 2018 Feb 20, 2018

2 Notable Data Breach in

3 Notable Data Breach in 2017 Equifax Data Breach Timeline 2017 Breached Detected Breached Announced apr may jun jul aug sep oct Hackers in Equifax Servers Patched 3

4 Notable Data Breach in 2017 Equifax Data Breach Timeline 2017 Breached Detected Breached Announced 3 Months of crucial attack audit logs apr may jun jul aug sep oct Hackers in Equifax Servers Patched 4

5 Notable Data Breach in 2017 Equifax Data Breach Timeline 2017 Breached Detected apr may jun jul aug sep oct Breached Announced 3 Months of crucial attack audit Are current auditing systems scalable? logs Hackers in Equifax Servers Patched 5

6 Data Provenance aka Audit log Lineage of system activities Represented as Directed Acyclic Graph (DAG) Used for forensic analysis Code Audit log Provenance Graph Bash: exec(./nginx ); NGINX: recv(, abc.com ); fread( index.html ); 1. Bash, Spawns NGINX 2. NGINX, Receives from abc.com 3. NGINX, Reads File index.html Bash NGINX abc.com index.html 6

7 Data Provenance in a Cluster Worker Nodes Centralized auditing not practical due to two limitations Master Node 7

8 Limitation#1: Graph Complexity NGINX and MySQL running for 5 mins on a single machine Finding needle in a haystack problem 8

9 Limitation#2: Storage overhead Leads to network overhead as logs are transferred to master node Audit Log Size Growth for a Single NGINX server Uncompressed Compressed [VALUE] GB 2.54 GB/Day on a single machine 10 Log Size (GB) [VALUE] GB [VALUE] GB With compression >1GB/Day 2 0 [VALUE] GB Day1 Day2 Day3 Day4 Day5 9

10 Winnower Cluster applications are replicated in accordance with microservice architecture principle Replicated apps produce highly homogeneous provenance graphs core execution behaviour is similar Key Idea: Remove redundancy from provenance graphs across cluster before sending to master node 10

11 Master Node View with Winnower Before After Bash * NGINX Other library file ver@ces mysql mysqld /up/* * /db/* 11

12 Winnower Build consensus model across cluster using graph grammars Like string grammar, graph grammars provide rule-based mechanisms For generating, manipulating and analyzing graphs Induction produce grammar from a given graph Parsing membership test of a given graph is in a grammar t t t t S A T A a B a a b b b a a b e T B t b S Graph S e Graph Grammar 12

13 Architecture Worker Nodes Worker Node Audit Module Prov. Graph Master Node Model Aggregator Model graphs/grammars from cluster Winnower Agent Fetch graph at each epoch Fine-grained Graph Graph Abstraction Abstracted Graph Graph Induction Model Graph 13

14 Architecture Worker Nodes Worker Node Audit Module Prov. Graph Master Node Model Aggregator Aggregated Model Only send Model updates Winnower Agent Fetch graph at next epoch Fine-grained Graph Graph Abstraction Abstracted Graph Graph Induction Model Graph 14

15 Architecture Worker Nodes Worker Node Audit Module Master Node Model Aggregator Query part of Provenance High-fidelity graph Provenance graph Winnower Agent 15

16 Provenance Graph Abstraction Graph Induction process builds a model/grammar that concisely describe the whole graph However, instance-specific fields frustrate any attempts to build a generic application behaviour model Node 1 Node 2 listener pid:2791 pid:2788 worker pid:2795 pid:2780 listener pid:2789 worker pid: Graph Induc@on No General model as instance specific informa@on such PID is different among graphs /up/file1 Inode:3 /up/file2 Inode:

17 Provenance Graph Abstraction Provenance graph vertices have well defined fields E.g. pid:1234, FilePath:/etc/ld.so Defined rules manually that remove or generalize these fields Node 1 Node 2 pid:2788 pid: Node /24 Node /24 listener pid:2791 worker pid:2795 listener pid:2789 worker pid:2797 Graph Abstrac@on listener worker listener worker /up/file1 Inode:3 /up/file2 Inode: /24 /up/* /up/* /24

18 Provenance Graph Induction Deterministic Finite Automata (DFA) Learning to generate grammar Encodes the causality in generated models In DFA learning the present state of a vertex includes the path taken to reach the vertex (provenance ancestry) Winnower extends it to remember descendants (provenance progeny) State of each vertex consist of three items: 1. Label 2. Provenance ancestry 3. Provenance progeny Bash gzip Ancestry of gzip vertex File1.txt File1.txt Progeny of gzip vertex 18

19 Provenance Graph Induction Finds repetitive patterns using standard implicit and explicit state merging algorithm Implicit state merging combines two subgraphs if states of each vertex are same in both subgraphs Node /24 Node / /24 listener listener worker listener worker Graph Induc@on worker /24 /up/* /24 /up/* /up/* /24 Confidence level Legend 2 19

20 Explicit State Merging At high-level explicit state merging Picks two nodes and make their states same Check if subgraph can be merged implicitly Consider a chained map reduce job java java mapper data Node 1 java java mapper data java Graph Grammar := S S := A S := T V Node 1 := A -> X A -> Y X := B -> W Y := C -> W data W := D D -> S java mapper T java java mapper B A D data java reducer Merge two nodes java reducer java reducer A B C D := data := java mapper := java reducer := java java reducer C 20

21 Provenance Graph Induction Consider a graph with a malicious activity Malicious behavior is visible in the final model listener worker bash wget Node 1 /up/* x.x.x.x Malicious file /up/* /up/* Node 2 Node 3 listener worker listener worker Graph Induc@on /up/* Master Node listener worker bash x.x.x.x wget Confidence level Legend 1 3 Malicious file 21

22 Evaluation Setup Setup 1 VM as master node, 4 VMs as worker nodes SPADE and Docker Swarm Epoch size 50 sec Metrics Storage Overhead Computational Cost Effectiveness 22

23 Storage Overhead on Master Node 98.7% decrease 0.17 MySQL 130 Winnower Raw 0.12 ProFTPD HTTPD LOG SIZE IN MB 23

24 Storage Reduction on Master Node LOG SIZE (MB) Raw(Uncompressed) Raw(Compressed) Winnower Apache Webserver with moderate workload Note the log scale on y- axis 7z compression is not suitable: No global view of cluster Oblivious to previous batch TIME (SEC) 24

25 Evaluation: Computation Cost Average time spent in induction and membership test at each epoch 35 Generate Model for 30 Average Time (sec) Apache MySQL ProFTPD Membership check in model Heterogeneous Workload -> Updates model Elapsed Time (sec) 25

26 Case Study: Ransomware Attack Attacker exploits Redis database server vulnerability version < 3.2 Vulnerability allows attacker to change SSH key and log in as Root Attacker deletes the database and left a note using vim to send bitcoins get database back 26

27 Traditional Graph of Attack 10 instances of redis running in the cluster ~80k vertices and ~83K edges with 161 MB size Part of provenance graph shown below 27

28 Winnower Generated Provenance graph 54 vertices and 68 edges with 0.7 MB size Part of graph is shown below: * bash * Nginx * Worker /uploads/* /proc/12743/stat /var/log/redis/redis.log Confidence level Legend 1 10 redis-server /var/lib/redis/dump.rdb /24 x.x.x.x /dev/tty /root/.ssh/authorized_keys x.x.x.x sshd bash Other library files Attack Provenance vim /root/ransomware.note 28

29 Winnower Generated Provenance graph What happens if we attack all the nodes in the cluster * bash * Nginx * Worker /uploads/* /proc/12743/stat /var/log/redis/redis.log Confidence level Legend 10 redis-server /var/lib/redis/dump.rdb /24 x.x.x.x /dev/tty /root/.ssh/authorized_keys x.x.x.x sshd bash Other library files Attack Provenance vim /root/ransomware.note 29

30 Conclusion Winnower is the first practical system for provenance-based auditing of clusters at scale with low overhead Winnower significantly improves attack identification and investigation in a large cluster 30

31 Questions Thank you for your time. 31

32 Backup Slides 32

33 Threat model Assumptions Winnower only tracks user-space attacks i.e. trusts the OS Log integrity is maintained Attack surface Distributed application replicated on Worker nodes Attacker motive Gain control over worker node by exploiting a software vulnerability in the distributed application 33

Transparent Web Service Auditing via Network Provenance Functions

Transparent Web Service Auditing via Network Provenance Functions Transparent Web Service Auditing via Network Provenance Functions Adam Bates, Wajih Ul Hassan, Kevin Butler, Alin Dobra, Bradley Reaves, Patrick Cable, Thomas Moyer, Nabil Schear ased upon work supported

More information

Trustworthy Whole-System Provenance for the Linux Kernel

Trustworthy Whole-System Provenance for the Linux Kernel Trustworthy Whole-System Provenance for the Linux Kernel Adam Bates, Dave (Jing) Tian, Thomas Moyer, and Kevin R. B. Butler In association with USENIX Security Symposium, Washington D.C., USA 13 August,

More information

Adaptive Cluster Computing using JavaSpaces

Adaptive Cluster Computing using JavaSpaces Adaptive Cluster Computing using JavaSpaces Jyoti Batheja and Manish Parashar The Applied Software Systems Lab. ECE Department, Rutgers University Outline Background Introduction Related Work Summary of

More information

Copyright 2018, Oracle and/or its affiliates. All rights reserved.

Copyright 2018, Oracle and/or its affiliates. All rights reserved. Beyond SQL Tuning: Insider's Guide to Maximizing SQL Performance Monday, Oct 22 10:30 a.m. - 11:15 a.m. Marriott Marquis (Golden Gate Level) - Golden Gate A Ashish Agrawal Group Product Manager Oracle

More information

SOLUTION TRACK Finding the Needle in a Big Data Innovator & Problem Solver Cloudera

SOLUTION TRACK Finding the Needle in a Big Data Innovator & Problem Solver Cloudera SOLUTION TRACK Finding the Needle in a Big Data Haystack @EvaAndreasson, Innovator & Problem Solver Cloudera Agenda Problem (Solving) Apache Solr + Apache Hadoop et al Real-world examples Q&A Problem Solving

More information

Web Gateway Security Appliances for the Enterprise: Comparison of Malware Blocking Rates

Web Gateway Security Appliances for the Enterprise: Comparison of Malware Blocking Rates Web Gateway Security Appliances for the Enterprise: Comparison of Malware Blocking Rates A test commissioned by McAfee, Inc. and performed by AV-Test GmbH Date of the report: December 7 th, 2010 (last

More information

Docker and HPE Accelerate Digital Transformation to Enable Hybrid IT. Steven Follis Solutions Engineer Docker Inc.

Docker and HPE Accelerate Digital Transformation to Enable Hybrid IT. Steven Follis Solutions Engineer Docker Inc. Docker and HPE Accelerate Digital Transformation to Enable Hybrid IT Steven Follis Solutions Engineer Docker Inc. Containers are the Fastest Growing Cloud Enabling Technology Title source: 451 Research

More information

Logging, Monitoring, and Alerting

Logging, Monitoring, and Alerting Logging, Monitoring, and Alerting Logs are a part of daily life in the DevOps world In security, we focus on particular logs to detect security anomalies and for forensic capabilities A basic logging pipeline

More information

The Power of Prediction: Cloud Bandwidth and Cost Reduction

The Power of Prediction: Cloud Bandwidth and Cost Reduction The Power of Prediction: Cloud Bandwidth and Cost Reduction Eyal Zohar Israel Cidon Technion Osnat(Ossi) Mokryn Tel-Aviv College Traffic Redundancy Elimination (TRE) Traffic redundancy stems from downloading

More information

DON T CRY OVER SPILLED RECORDS Memory elasticity of data-parallel applications and its application to cluster scheduling

DON T CRY OVER SPILLED RECORDS Memory elasticity of data-parallel applications and its application to cluster scheduling DON T CRY OVER SPILLED RECORDS Memory elasticity of data-parallel applications and its application to cluster scheduling Călin Iorgulescu (EPFL), Florin Dinu (EPFL), Aunn Raza (NUST Pakistan), Wajih Ul

More information

Lecture 11 Hadoop & Spark

Lecture 11 Hadoop & Spark Lecture 11 Hadoop & Spark Dr. Wilson Rivera ICOM 6025: High Performance Computing Electrical and Computer Engineering Department University of Puerto Rico Outline Distributed File Systems Hadoop Ecosystem

More information

The Linux Audit System WAJIH 04/30/2018

The Linux Audit System WAJIH 04/30/2018 The Linux Audit System WAJIH 04/30/2018 $whoami Third year Ph.D. student in CS Dept. Working with Prof. Adam Bates Research Interests: System Security Data provenance Recent Cyber AFacks Equifax 145 million

More information

Murray Goldschmidt. Chief Operating Officer Sense of Security Pty Ltd. Micro Services, Containers and Serverless PaaS Web Apps? How safe are you?

Murray Goldschmidt. Chief Operating Officer Sense of Security Pty Ltd. Micro Services, Containers and Serverless PaaS Web Apps? How safe are you? Murray Goldschmidt Chief Operating Officer Sense of Security Pty Ltd Micro Services, Containers and Serverless PaaS Web Apps? How safe are you? A G E N D A 1 2 3 Serverless, Microservices and Container

More information

Sophos Central for partners and customers: overview and new features. Jonathan Shaw Senior Product Manager, Sophos Central

Sophos Central for partners and customers: overview and new features. Jonathan Shaw Senior Product Manager, Sophos Central Sophos Central for partners and customers: overview and new features Jonathan Shaw Senior Product Manager, Sophos Central What is Sophos Central? Partner Dashboard Admin Self Service Allows Partners to

More information

Think Small to Scale Big

Think Small to Scale Big Think Small to Scale Big Intro to Containers for the Datacenter Admin Pete Zerger Principal Program Manager, MVP pete.zerger@cireson.com Cireson Lee Berg Blog, e-mail address, title Company Pete Zerger

More information

Nomad: Mitigating Arbitrary Cloud Side Channels via Provider-Assisted Migration

Nomad: Mitigating Arbitrary Cloud Side Channels via Provider-Assisted Migration Nomad: Mitigating Arbitrary Cloud Side Channels via Provider-Assisted Migration Soo-Jin Moon, Vyas Sekar Michael K. Reiter Co-residency side-channel attacks in clouds Stealing secrets (e.g., keys) VM VM

More information

THE THREE WAYS OF SECURITY. Jeff Williams Co-founder and CTO Contrast Security

THE THREE WAYS OF SECURITY. Jeff Williams Co-founder and CTO Contrast Security THE THREE WAYS OF SECURITY Jeff Williams Co-founder and CTO Contrast Security 1. TODAY S AVERAGE APPLICATION IS A SECURITY DISASTER 2. SOFTWARE IS LEAVING SECURITY IN THE DUST SOFTWARE Typical enterprise

More information

Software Security. Final Exam Preparation. Be aware, there is no guarantee for the correctness of the answers!

Software Security. Final Exam Preparation. Be aware, there is no guarantee for the correctness of the answers! Software Security Final Exam Preparation Note: This document contains the questions from the final exam on 09.06.2017. Additionally potential questions about Combinatorial Web Security Testing and Decentralized

More information

IBM Planning Analytics Workspace Local Distributed Soufiane Azizi. IBM Planning Analytics

IBM Planning Analytics Workspace Local Distributed Soufiane Azizi. IBM Planning Analytics IBM Planning Analytics Workspace Local Distributed Soufiane Azizi IBM Planning Analytics IBM Canada - Cognos Ottawa Lab. IBM Planning Analytics Agenda 1. Demo PAW High Availability on a Prebuilt Swarm

More information

Data Storage Infrastructure at Facebook

Data Storage Infrastructure at Facebook Data Storage Infrastructure at Facebook Spring 2018 Cleveland State University CIS 601 Presentation Yi Dong Instructor: Dr. Chung Outline Strategy of data storage, processing, and log collection Data flow

More information

Chapter 5. The MapReduce Programming Model and Implementation

Chapter 5. The MapReduce Programming Model and Implementation Chapter 5. The MapReduce Programming Model and Implementation - Traditional computing: data-to-computing (send data to computing) * Data stored in separate repository * Data brought into system for computing

More information

Jordan Levesque - Keeping your Business Secure

Jordan Levesque - Keeping your Business Secure Jordan Levesque - Keeping your Business Secure Review of PCI Benefits of hosting with RCS File Integrity Monitoring Two Factor Log Aggregation Vulnerability Scanning Configuration Management and Continuous

More information

Improve Web Application Performance with Zend Platform

Improve Web Application Performance with Zend Platform Improve Web Application Performance with Zend Platform Shahar Evron Zend Sr. PHP Specialist Copyright 2007, Zend Technologies Inc. Agenda Benchmark Setup Comprehensive Performance Multilayered Caching

More information

Cisco Tetration Analytics

Cisco Tetration Analytics Cisco Tetration Analytics Enhanced security and operations with real time analytics John Joo Tetration Business Unit Cisco Systems Security Challenges in Modern Data Centers Securing applications has become

More information

Going Without CPU Patches on Oracle E-Business Suite 11i?

Going Without CPU Patches on Oracle E-Business Suite 11i? Going Without CPU Patches on E-Business Suite 11i? September 17, 2013 Stephen Kost Chief Technology Officer Integrigy Corporation Phil Reimann Director of Business Development Integrigy Corporation About

More information

Current procedures, challenges and opportunities for collection and analysis of Criminal Justice statistics CERT-GH

Current procedures, challenges and opportunities for collection and analysis of Criminal Justice statistics CERT-GH Current procedures, challenges and opportunities for collection and analysis of Criminal Justice statistics CERT-GH International Workshop on Criminal Justice Statistics on Cybercrime and Electronic Evidence

More information

Control Center Release Notes

Control Center Release Notes Release 1.4.1 Zenoss, Inc. www.zenoss.com Copyright 2017 Zenoss, Inc. All rights reserved. Zenoss, Own IT, and the Zenoss logo are trademarks or registered trademarks of Zenoss, Inc., in the United States

More information

STATE OF MODERN APPLICATIONS IN THE CLOUD

STATE OF MODERN APPLICATIONS IN THE CLOUD STATE OF MODERN APPLICATIONS IN THE CLOUD 2017 Introduction The Rise of Modern Applications What is the Modern Application? Today s leading enterprises are striving to deliver high performance, highly

More information

Fault Identification from Web Log Files by Pattern Discovery

Fault Identification from Web Log Files by Pattern Discovery ABSTRACT International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2017 IJSRCSEIT Volume 2 Issue 2 ISSN : 2456-3307 Fault Identification from Web Log Files

More information

/ Cloud Computing. Recitation 5 February 14th, 2017

/ Cloud Computing. Recitation 5 February 14th, 2017 15-319 / 15-619 Cloud Computing Recitation 5 February 14th, 2017 1 Overview Administrative issues Office Hours, Piazza guidelines Last week s reflection Project 2.1, OLI Unit 2 modules 5 and 6 This week

More information

Nomad: Mitigating Arbitrary Cloud Side Channels via Provider-Assisted Migration

Nomad: Mitigating Arbitrary Cloud Side Channels via Provider-Assisted Migration Nomad: Mitigating Arbitrary Cloud Side Channels via Provider-Assisted Migration Soo-Jin Moon, Vyas Sekar Michael K. Reiter Context: Infrastructure-as-a-Service Clouds Client API Cloud Controller Machine

More information

Auto Management for Apache Kafka and Distributed Stateful System in General

Auto Management for Apache Kafka and Distributed Stateful System in General Auto Management for Apache Kafka and Distributed Stateful System in General Jiangjie (Becket) Qin Data Infrastructure @LinkedIn GIAC 2017, 12/23/17@Shanghai Agenda Kafka introduction and terminologies

More information

Bigtable. A Distributed Storage System for Structured Data. Presenter: Yunming Zhang Conglong Li. Saturday, September 21, 13

Bigtable. A Distributed Storage System for Structured Data. Presenter: Yunming Zhang Conglong Li. Saturday, September 21, 13 Bigtable A Distributed Storage System for Structured Data Presenter: Yunming Zhang Conglong Li References SOCC 2010 Key Note Slides Jeff Dean Google Introduction to Distributed Computing, Winter 2008 University

More information

Multiple vulnerabilities in Vectra Cognito

Multiple vulnerabilities in Vectra Cognito Multiple vulnerabilities in Vectra Cognito Security advisory 2018-10-01 Julien Egloff Thibault Guittet www.synacktiv.com 5 rue Sextius Michel 75015 Paris Overview Cognito platform Cognito is the ultimate

More information

All Events. One Platform.

All Events. One Platform. All Events. One Platform. Industry s first IT ops platform that truly correlates the metric, flow and log events and turns them into actionable insights. Correlate Integrate Analyze www.motadata.com Motadata

More information

Security of End User based Cloud Services Sang Young

Security of End User based Cloud Services Sang Young Security of End User based Cloud Services Sang Young Chairman, Mobile SIG Professional Information Security Association sang.young@pisa.org.hk Cloud Services you can choose Social Media Business Applications

More information

DevOps Course Content

DevOps Course Content DevOps Course Content 1. Introduction: Understanding Development Development SDLC using WaterFall & Agile Understanding Operations DevOps to the rescue What is DevOps DevOps SDLC Continuous Delivery model

More information

Securing Distributed Computation via Trusted Quorums. Yan Michalevsky, Valeria Nikolaenko, Dan Boneh

Securing Distributed Computation via Trusted Quorums. Yan Michalevsky, Valeria Nikolaenko, Dan Boneh Securing Distributed Computation via Trusted Quorums Yan Michalevsky, Valeria Nikolaenko, Dan Boneh Setting Distributed computation over data contributed by users Communication through a central party

More information

3. EXCEL FORMULAS & TABLES

3. EXCEL FORMULAS & TABLES Winter 2019 CS130 - Excel Formulas & Tables 1 3. EXCEL FORMULAS & TABLES Winter 2019 Winter 2019 CS130 - Excel Formulas & Tables 2 Cell References Absolute reference - refer to cells by their fixed position.

More information

Harvesting Logs and Events Using MetaCentrum Virtualization Services. Radoslav Bodó, Daniel Kouřil CESNET

Harvesting Logs and Events Using MetaCentrum Virtualization Services. Radoslav Bodó, Daniel Kouřil CESNET Harvesting Logs and Events Using MetaCentrum Virtualization Services Radoslav Bodó, Daniel Kouřil CESNET Campus network monitoring and security workshop Prague 2014 Agenda Introduction Collecting logs

More information

VIAF: Verification-based Integrity Assurance Framework for MapReduce. YongzhiWang, JinpengWei

VIAF: Verification-based Integrity Assurance Framework for MapReduce. YongzhiWang, JinpengWei VIAF: Verification-based Integrity Assurance Framework for MapReduce YongzhiWang, JinpengWei MapReduce in Brief Satisfying the demand for large scale data processing It is a parallel programming model

More information

@joerg_schad Nightmares of a Container Orchestration System

@joerg_schad Nightmares of a Container Orchestration System @joerg_schad Nightmares of a Container Orchestration System 2017 Mesosphere, Inc. All Rights Reserved. 1 Jörg Schad Distributed Systems Engineer @joerg_schad Jan Repnak Support Engineer/ Solution Architect

More information

A DEDUPLICATION-INSPIRED FAST DELTA COMPRESSION APPROACH W EN XIA, HONG JIANG, DA N FENG, LEI T I A N, M I N FU, YUKUN Z HOU

A DEDUPLICATION-INSPIRED FAST DELTA COMPRESSION APPROACH W EN XIA, HONG JIANG, DA N FENG, LEI T I A N, M I N FU, YUKUN Z HOU A DEDUPLICATION-INSPIRED FAST DELTA COMPRESSION APPROACH W EN XIA, HONG JIANG, DA N FENG, LEI T I A N, M I N FU, YUKUN Z HOU PRESENTED BY ROMAN SHOR Overview Technics of data reduction in storage systems:

More information

Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context

Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context 1 Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes

More information

L3.4. Data Management Techniques. Frederic Desprez Benjamin Isnard Johan Montagnat

L3.4. Data Management Techniques. Frederic Desprez Benjamin Isnard Johan Montagnat Grid Workflow Efficient Enactment for Data Intensive Applications L3.4 Data Management Techniques Authors : Eddy Caron Frederic Desprez Benjamin Isnard Johan Montagnat Summary : This document presents

More information

Waratek Runtime Protection Platform

Waratek Runtime Protection Platform Waratek Runtime Protection Platform Cirosec TrendTage - March 2018 Waratek Solves the Application Security Problems That No One Else Can Prateep Bandharangshi Director of Client Security Solutions March,

More information

The Speaker. Alain Fuhrer, born in Bachelor in computer science. Working with Oracle databases for about 12 years now

The Speaker. Alain Fuhrer, born in Bachelor in computer science. Working with Oracle databases for about 12 years now The Speaker Alain Fuhrer, born in 1986 Bachelor in computer science Working with Oracle databases for about 12 years now Head IT Databases at Swiss Mobiliar Insurance REPORTS ANALYTICS AUF ON OLTP DATEN

More information

Advanced Threat Hunting:

Advanced Threat Hunting: Advanced Threat Hunting: Identify and Track Adversaries Infiltrating Your Organization In Partnership with: Presented by: Randeep Gill Tony Shadrake Enterprise Security Engineer, Europe Regional Director,

More information

Developing and Testing Java Microservices on Docker. Todd Fasullo Dir. Engineering

Developing and Testing Java Microservices on Docker. Todd Fasullo Dir. Engineering Developing and Testing Java Microservices on Docker Todd Fasullo Dir. Engineering Agenda Who is Smartsheet + why we started using Docker Docker fundamentals Demo - creating a service Demo - building service

More information

Opera Web Browser Archive - FTP Site Statistics. Top 20 Directories Sorted by Disk Space

Opera Web Browser Archive - FTP Site Statistics. Top 20 Directories Sorted by Disk Space Property Value FTP Server ftp.opera.com Description Opera Web Browser Archive Country United States Scan Date 04/Nov/2015 Total Dirs 1,557 Total Files 2,211 Total Data 43.83 GB Top 20 Directories Sorted

More information

Security analytics: From data to action Visual and analytical approaches to detecting modern adversaries

Security analytics: From data to action Visual and analytical approaches to detecting modern adversaries Security analytics: From data to action Visual and analytical approaches to detecting modern adversaries Chris Calvert, CISSP, CISM Director of Solutions Innovation Copyright 2013 Hewlett-Packard Development

More information

PROCERGS Data Processing Company - FTP Site Statistics. Top 20 Directories Sorted by Disk Space

PROCERGS Data Processing Company - FTP Site Statistics. Top 20 Directories Sorted by Disk Space Property Value FTP Server ftp.procergs.com.br Description PROCERGS Data Processing Company Country Brazil Scan Date 29/Aug/2015 Total Dirs 2,261 Total Files 2,506 Total Data 15.31 GB Top 20 Directories

More information

Cybersecurity. You have been breached; What Happens Next THE CHALLENGE FOR THE FINANCIAL SERVICES INDUSTRY

Cybersecurity. You have been breached; What Happens Next THE CHALLENGE FOR THE FINANCIAL SERVICES INDUSTRY Cybersecurity THE CHALLENGE FOR THE FINANCIAL SERVICES INDUSTRY Gary Meshell World Wide Leader Financial Services Industry IBM Security March 21 2019 You have been breached; What Happens Next 2 IBM Security

More information

Next-Generation Cloud Platform

Next-Generation Cloud Platform Next-Generation Cloud Platform Jangwoo Kim Jun 24, 2013 E-mail: jangwoo@postech.ac.kr High Performance Computing Lab Department of Computer Science & Engineering Pohang University of Science and Technology

More information

CALENDAR FOR THE YEAR 2018

CALENDAR FOR THE YEAR 2018 27 Dubai 08-12 Jan 1 Advance Budgeting Workshop 386 Istanbul 11-1 Mar 396 London 13-17 May 2 Advance Business Writing 296 Abu Dhabi 1-19 Jan 3 Contract Management 396 London 22-26 Jan 27 Dubai 18-22 Mar

More information

CONTAINERIZED SPARK ON KUBERNETES. William Benton Red Hat,

CONTAINERIZED SPARK ON KUBERNETES. William Benton Red Hat, CONTAINERIZED SPARK ON KUBERNETES William Benton Red Hat, Inc. @willb willb@redhat.com BACKGROUND BACKGROUND BACKGROUND BACKGROUND BACKGROUND BACKGROUND BACKGROUND BACKGROUND WHAT OUR SPARK CLUSTER LOOKED

More information

Protecting Keys/Secrets in Network Automation Solutions. Dhananjay Pavgi, Tech Mahindra Ltd Srinivasa Addepalli, Intel

Protecting Keys/Secrets in Network Automation Solutions. Dhananjay Pavgi, Tech Mahindra Ltd Srinivasa Addepalli, Intel Protecting Keys/Secrets in Network Automation Solutions Dhananjay Pavgi, Tech Mahindra Ltd Srinivasa Addepalli, Intel Agenda Introduction Private Key Security Secret Management Tamper Detection Summary

More information

Building (Better) Data Pipelines using Apache Airflow

Building (Better) Data Pipelines using Apache Airflow Building (Better) Data Pipelines using Apache Airflow Sid Anand (@r39132) QCon.AI 2018 1 About Me Work [ed s] @ Co-Chair for Maintainer of Spare time 2 Apache Airflow What is it? 3 Apache Airflow : What

More information

Parallelizing CI using Docker Swarm-Mode

Parallelizing CI using Docker Swarm-Mode Open Source Summit Japan (June 1, 2017) Last update: June 1, 2017 Parallelizing CI using Docker Swarm-Mode Akihiro Suda NTT Software Innovation Center github.com/akihirosuda

More information

Informa)on Retrieval and Map- Reduce Implementa)ons. Mohammad Amir Sharif PhD Student Center for Advanced Computer Studies

Informa)on Retrieval and Map- Reduce Implementa)ons. Mohammad Amir Sharif PhD Student Center for Advanced Computer Studies Informa)on Retrieval and Map- Reduce Implementa)ons Mohammad Amir Sharif PhD Student Center for Advanced Computer Studies mas4108@louisiana.edu Map-Reduce: Why? Need to process 100TB datasets On 1 node:

More information

5/24/ MVP SQL Server: Architecture since 2010 MCT since 2001 Consultant and trainer since 1992

5/24/ MVP SQL Server: Architecture since 2010 MCT since 2001 Consultant and trainer since 1992 2014-05-20 MVP SQL Server: Architecture since 2010 MCT since 2001 Consultant and trainer since 1992 @SoQooL http://blog.mssqlserver.se Mattias.Lind@Sogeti.se 1 The evolution of the Microsoft data platform

More information

ovirt and Docker Integration

ovirt and Docker Integration ovirt and Docker Integration October 2014 Federico Simoncelli Principal Software Engineer Red Hat 1 Agenda Deploying an Application (Old-Fashion and Docker) Ecosystem: Kubernetes and Project Atomic Current

More information

Outline STRANGER. Background

Outline STRANGER. Background Outline Malicious Code Analysis II : An Automata-based String Analysis Tool for PHP 1 Mitchell Adair 2 November 28 th, 2011 Outline 1 2 Credit: [: An Automata-based String Analysis Tool for PHP] Background

More information

A Lightweight Blockchain Consensus Protocol

A Lightweight Blockchain Consensus Protocol A Lightweight Blockchain Consensus Protocol Keir Finlow-Bates keir@chainfrog.com Abstract A lightweight yet deterministic and objective consensus protocol would allow blockchain systems to be maintained

More information

Omega Engineering Software Archive - FTP Site Statistics. Top 20 Directories Sorted by Disk Space

Omega Engineering Software Archive - FTP Site Statistics. Top 20 Directories Sorted by Disk Space Omega Engineering Software Archive - FTP Site Statistics Property Value FTP Server ftp.omega.com Description Omega Engineering Software Archive Country United States Scan Date 14/Apr/2015 Total Dirs 460

More information

Scheduling Applications at Scale

Scheduling Applications at Scale Scheduling Applications at Scale Meeting Tomorrow's Application Needs, Today http://1stchoicesportsrehab.com/wp-content/uploads/2012/05/calendar.jpg SETH VARGO @sethvargo Globally Distributed Optimistically

More information

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme CNA2080BU Deep Dive: How to Deploy and Operationalize Kubernetes Cornelia Davis, Pivotal Nathan Ness Technical Product Manager, CNABU @nvpnathan #VMworld #CNA2080BU Disclaimer This presentation may contain

More information

Malicious Code Analysis II

Malicious Code Analysis II Malicious Code Analysis II STRANGER: An Automata-based String Analysis Tool for PHP Mitchell Adair November 28 th, 2011 Outline 1 STRANGER 2 Outline 1 STRANGER 2 STRANGER Credit: [STRANGER: An Automata-based

More information

Server Virtualization and Optimization at HSBC. John Gibson Chief Technical Specialist HSBC Bank plc

Server Virtualization and Optimization at HSBC. John Gibson Chief Technical Specialist HSBC Bank plc Server Virtualization and Optimization at HSBC John Gibson Chief Technical Specialist HSBC Bank plc Background Over 5,500 Windows servers in the last 6 years. Historically, Windows technology dictated

More information

University of Hagen - FTP Site Statistics. Top 20 Directories Sorted by Disk Space

University of Hagen - FTP Site Statistics. Top 20 Directories Sorted by Disk Space Property Value FTP Server ftp.fernuni-hagen.de Description University of Hagen Country Germany Scan Date 25/Feb/2015 Total Dirs 15,751 Total Files 253,958 Total Data 153.37 GB Top 20 Directories Sorted

More information

Adaptive Android Kernel Live Patching

Adaptive Android Kernel Live Patching USENIX Security Symposium 2017 Adaptive Android Kernel Live Patching Yue Chen 1, Yulong Zhang 2, Zhi Wang 1, Liangzhao Xia 2, Chenfu Bao 2, Tao Wei 2 Florida State University 1 Baidu X-Lab 2 Android Kernel

More information

HTRC Data API Performance Study

HTRC Data API Performance Study HTRC Data API Performance Study Yiming Sun, Beth Plale, Jiaan Zeng Amazon Indiana University Bloomington {plale, jiaazeng}@cs.indiana.edu Abstract HathiTrust Research Center (HTRC) allows users to access

More information

National Aeronautics and Space Admin. - FTP Site Statistics. Top 20 Directories Sorted by Disk Space

National Aeronautics and Space Admin. - FTP Site Statistics. Top 20 Directories Sorted by Disk Space National Aeronautics and Space Admin. - FTP Site Statistics Property Value FTP Server ftp.hq.nasa.gov Description National Aeronautics and Space Admin. Country United States Scan Date 26/Apr/2014 Total

More information

Next Generation Enduser Protection

Next Generation Enduser Protection Next Generation Enduser Protection Janne Timisjärvi Systems Engineer 10.5.2017 What is the the real threat? Encrypted! Give me all your Bitcoin$ Let s check if there Is something of value The Evolution

More information

Workload Characterization and Optimization of TPC-H Queries on Apache Spark

Workload Characterization and Optimization of TPC-H Queries on Apache Spark Workload Characterization and Optimization of TPC-H Queries on Apache Spark Tatsuhiro Chiba and Tamiya Onodera IBM Research - Tokyo April. 17-19, 216 IEEE ISPASS 216 @ Uppsala, Sweden Overview IBM Research

More information

LOG AGGREGATION. To better manage your Red Hat footprint. Miguel Pérez Colino Strategic Design Team - ISBU

LOG AGGREGATION. To better manage your Red Hat footprint. Miguel Pérez Colino Strategic Design Team - ISBU LOG AGGREGATION To better manage your Red Hat footprint Miguel Pérez Colino Strategic Design Team - ISBU 2017-05-03 @mmmmmmpc Agenda Managing your Red Hat footprint with Log Aggregation The Situation The

More information

Smart Protection Network. Raimund Genes, CTO

Smart Protection Network. Raimund Genes, CTO Smart Protection Network Raimund Genes, CTO Overwhelmed by Volume of New Threats New unique samples added to AV-Test's malware repository (2000-2010) 20.000.000 18.000.000 16.000.000 14.000.000 12.000.000

More information

Russ McRee Bryan Casper

Russ McRee Bryan Casper Russ McRee Bryan Casper About us We re part of the security incident response team for Microsoft Online Services Security & Compliance We ask more questions than provide answers This presentation is meant

More information

Design Tradeoffs for Data Deduplication Performance in Backup Workloads

Design Tradeoffs for Data Deduplication Performance in Backup Workloads Design Tradeoffs for Data Deduplication Performance in Backup Workloads Min Fu,DanFeng,YuHua,XubinHe, Zuoning Chen *, Wen Xia,YuchengZhang,YujuanTan Huazhong University of Science and Technology Virginia

More information

Processing 11 billions events a day with Spark. Alexander Krasheninnikov

Processing 11 billions events a day with Spark. Alexander Krasheninnikov Processing 11 billions events a day with Spark Alexander Krasheninnikov Badoo facts 46 languages 10M Photos added daily 320M registered users 190 countries 21M daily active users 3000+ servers 2 data-centers

More information

HugeServer Networks Software Archive - FTP Site Statistics. Top 20 Directories Sorted by Disk Space

HugeServer Networks Software Archive - FTP Site Statistics. Top 20 Directories Sorted by Disk Space Property Value FTP Server mirror.lax.hugeserver.com Description HugeServer Networks Software Archive Country United States Scan Date 28/Dec/2015 Total Dirs 3,510 Total Files 162,243 Total Data 365.86 GB

More information

Rapid Prototyping and Evaluation of Intelligence Functions of Active Storage Devices

Rapid Prototyping and Evaluation of Intelligence Functions of Active Storage Devices Rapid Prototyping and Evaluation of Intelligence Functions of Active Storage Devices Yongsoo Joo Embedded Software Research Center Ewha Womans University This research was supported by Basic Science Research

More information

2/26/2017. Originally developed at the University of California - Berkeley's AMPLab

2/26/2017. Originally developed at the University of California - Berkeley's AMPLab Apache is a fast and general engine for large-scale data processing aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes Low latency: sub-second

More information

Blockstack, a New Internet for Decentralized Apps. Muneeb Ali

Blockstack, a New Internet for Decentralized Apps. Muneeb Ali Blockstack, a New Internet for Decentralized Apps Muneeb Ali The New Internet Problems with the traditional internet End-to-end design principle for the Internet. *1981 Saltzer, Reed, and Clark paper End-to-end

More information

Minimizing the PCI Footprint: Reduce Risk and Simplify Compliance

Minimizing the PCI Footprint: Reduce Risk and Simplify Compliance SESSION ID: GRC-F02 Minimizing the PCI Footprint: Reduce Risk and Simplify Compliance Troy Leach CTO PCI Security Standards Council Agenda Today s Landscape Reducing the Card Holder Data Footprint How

More information

Growth of Docker hub pulls

Growth of Docker hub pulls millions 6000 Growth of Docker hub pulls 5000 5000 4000 3000 2000 2000 1000 300 800 1200 0 May-15 Jun-15 Jul-15 Aug-15 Sep-15 2016 A Highly Complex Ecosystem Security challenges of container opera3ons

More information

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018 Cloud Computing and Hadoop Distributed File System UCSB CS70, Spring 08 Cluster Computing Motivations Large-scale data processing on clusters Scan 000 TB on node @ 00 MB/s = days Scan on 000-node cluster

More information

Secure Kubernetes Container Workloads

Secure Kubernetes Container Workloads Secure Kubernetes Container Workloads with Production-Grade Networking Cynthia Thomas Irena Berezovsky Tim Hockin CIA IT operations have top secret apps for their agents, most of which require isolation

More information

Spark, Shark and Spark Streaming Introduction

Spark, Shark and Spark Streaming Introduction Spark, Shark and Spark Streaming Introduction Tushar Kale tusharkale@in.ibm.com June 2015 This Talk Introduction to Shark, Spark and Spark Streaming Architecture Deployment Methodology Performance References

More information

AMP-Based Flow Collection. Greg Virgin - RedJack

AMP-Based Flow Collection. Greg Virgin - RedJack AMP-Based Flow Collection Greg Virgin - RedJack AMP- Based Flow Collection AMP - Analytic Metadata Producer : Patented US Government flow / metadata producer AMP generates data including Flows Host metadata

More information

Flash Storage Complementing a Data Lake for Real-Time Insight

Flash Storage Complementing a Data Lake for Real-Time Insight Flash Storage Complementing a Data Lake for Real-Time Insight Dr. Sanhita Sarkar Global Director, Analytics Software Development August 7, 2018 Agenda 1 2 3 4 5 Delivering insight along the entire spectrum

More information

MPI: Multiple Perspective Attack Investigation with Semantic Aware Execution Partitioning

MPI: Multiple Perspective Attack Investigation with Semantic Aware Execution Partitioning MPI: Multiple Perspective Attack Investigation with Semantic Aware Execution Partitioning Shiqing Ma, Juan Zhai, Fei Wang, Kyu Hyung Lee, Xiangyu Zhang, Dongyan Xu Attack Investigation Provenance tracking

More information

Mpoli Archive - FTP Site Statistics. Top 20 Directories Sorted by Disk Space

Mpoli Archive - FTP Site Statistics. Top 20 Directories Sorted by Disk Space Mpoli Archive - FTP Site Statistics Property Value FTP Server ftp.mpoli.fi Description Mpoli Archive Country Finland Scan Date 01/Nov/2015 Total Dirs 52,408 Total Files 311,725 Total Data 28.53 GB Top

More information

Introduction to MapReduce. Instructor: Dr. Weikuan Yu Computer Sci. & Software Eng.

Introduction to MapReduce. Instructor: Dr. Weikuan Yu Computer Sci. & Software Eng. Introduction to MapReduce Instructor: Dr. Weikuan Yu Computer Sci. & Software Eng. Before MapReduce Large scale data processing was difficult! Managing hundreds or thousands of processors Managing parallelization

More information

In-Memory Data Management

In-Memory Data Management In-Memory Data Management Martin Faust Research Assistant Research Group of Prof. Hasso Plattner Hasso Plattner Institute for Software Engineering University of Potsdam Agenda 2 1. Changed Hardware 2.

More information

INTRODUCING SOPHOS INTERCEPT X

INTRODUCING SOPHOS INTERCEPT X INTRODUCING SOPHOS INTERCEPT X Matt Cooke Senior Product Marketing Manager November 2016 A Leader in Endpoint Security Sophos delivers the most enterprise-friendly SaaS endpoint security suite. Sophos

More information

EECS 498 Introduction to Distributed Systems

EECS 498 Introduction to Distributed Systems EECS 498 Introduction to Distributed Systems Fall 2017 Harsha V. Madhyastha Today Bitcoin: A peer-to-peer digital currency Spark: In-memory big data processing December 4, 2017 EECS 498 Lecture 21 2 December

More information

PNDA.io: when BGP meets Big-Data

PNDA.io: when BGP meets Big-Data PNDA.io: when BGP meets Big-Data Let s go back in time 26 th April 2017 The Internet is very much alive Millions of BGP events occurring every day 15 Routers Monitored 410 active peers (both IPv4 and IPv6)

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 05(b) : 23/10/2012 Data Mining: Concepts and Techniques (3 rd ed.) Chapter

More information

MOHA: Many-Task Computing Framework on Hadoop

MOHA: Many-Task Computing Framework on Hadoop Apache: Big Data North America 2017 @ Miami MOHA: Many-Task Computing Framework on Hadoop Soonwook Hwang Korea Institute of Science and Technology Information May 18, 2017 Table of Contents Introduction

More information