Behavioral Analysis for Intrusion Resilience
Ahmed Fawaz
Dec 6, 2016

Recent Cyber Attacks on Private and Public Entities
Design for Resiliency: Diverse Monitoring, Secure Monitoring, Monitoring Fusion
Source: Verizon 2016 Data Breach Investigations Report

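Traditional Security
[Figure: state-space diagram with regions labeled Good States, Reachable States, Initial State, and All Possible States]

Resiliency Approach
[Figure: state-space diagram with regions labeled Good States, Initial State, Reachable States, and All Possible States]
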
Cyber resilience is the ability to identify, prevent, detect and respond to malicious and random process or technology failures and recover while maintaining an acceptable level of service.
Adapted from: Presidential Policy Directive 21 (PPD-21), Critical Infrastructure Security and Resilience, February 12, 2013; Accenture Consulting, Making your Enterprise Cyber Resilient, 2015.

Notional Architecture for Cyber Resiliency
[Figure: architecture diagram. Components: Diverse System Monitoring, Monitor Fusion, World View, System Model, Response Selection and Actuation, Secure Monitoring and Response Infrastructure. Layers: OFFLINE/ONLINE COMPUTATION, ONLINE COMPUTATION, RESILIENCY INFRASTRUCTURE]
- The system model represents the services, possible responses, attacker characteristics, and architecture of a system.
- Diverse monitors are deployed at all levels of the system to generate diverse sensor data. The sensor inputs, alerts, and logs feed into a different set of fusion and correlation algorithms to generate a higher-level alert.
- The decision algorithm decides on learning responses to intensify and focus the monitoring resources, and/or effects a response strategy, e.g., block an attacker, move a target, reallocate services, or recover services.
- The monitoring and response architecture provides a trustworthy infrastructure on which to implement resiliency services and maintain a trustworthy world view.

Kobra: A Kernel Monitoring Engine

Problem Description
How can diverse data types be used to model application behavior for anomaly detection?

Our Approach
[Figure: Data Sources (Processes, File Operations, Packets), System View, Signal, Learning]

Kobra's Architecture
- Kernel-level monitor for the Windows kernel
- Low overhead
- Cooperative drivers that capture:
  - Network activity
  - Process communications
  - Process creation/termination
  - Object access
  - File system activity
[Figure: components labeled NDIS, Filesystem filter, WFP Callouts, KDOM, Comm Module, Fusion Module, Anomaly Detector, Log Server, Alert]

What is the System View?
- The intent of the system view is to provide high-level information about host state.
- It reflects the methods by which users and user processes access different resources.

File and Network Behaviors
- Example events:
    insert-edge{ VID.mp4 :? 2044} {devos:read} {512}
    insert-edge{ VID.mp4 :? 2044} {devos:read} {4096}
- Filter by process and application
- Data is converted to a discrete time signal
[Figure: signal panels labeled Chromium and VLC]

Map Discrete Events to a Polar Space
- Mapping inspired by digital modulation methods
- Partition the space by quadrants according to the type of event
- Map each event to a part of the quadrant
- The magnitude is a function of the size of the event

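The mapping described on this slide can be sketched as follows. This is only an illustration under stated assumptions, not Kobra's implementation: the event type names, the quadrant assigned to each type, the hash-based position inside a quadrant, and the log-scaled magnitude are all hypothetical choices.

```python
import cmath
import math
import zlib

# Hypothetical assignment of event types to quadrants 0..3; the slides do
# not say which event type occupies which quadrant.
QUADRANT = {"file_read": 0, "file_write": 1, "net_send": 2, "net_recv": 3}


def event_to_polar(event_type: str, obj: str, size: int) -> complex:
    """Map one discrete event to a point in the complex plane."""
    quadrant = QUADRANT[event_type]
    # Position inside the quadrant: a stable hash of the object name picks a
    # fraction in [0.1, 0.9) so points never sit on a quadrant boundary.
    frac = 0.1 + 0.8 * (zlib.crc32(obj.encode()) % 1000) / 1000.0
    angle = (quadrant + frac) * (math.pi / 2)
    # Magnitude grows with event size (assumed log scale); the +1 keeps
    # zero-sized events away from the origin.
    magnitude = 1.0 + math.log2(1 + size)
    return cmath.rect(magnitude, angle)


def quadrant_of(z: complex) -> int:
    """Recover the quadrant index (and hence the event type) of a point."""
    angle = cmath.phase(z) % (2 * math.pi)
    return int(angle // (math.pi / 2))


if __name__ == "__main__":
    small = event_to_polar("file_read", "VID.mp4", 512)
    large = event_to_polar("file_read", "VID.mp4", 4096)
    print(quadrant_of(small), abs(small) < abs(large))
```

Feeding a time-ordered stream of events through `event_to_polar` yields a complex-valued discrete time signal in which the angle encodes what happened and the magnitude encodes how much data was involved.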
Example

Application Behavior Model
- Learn local patterns in the signal (sliding window)
- Learn the co-occurrence relationships between the patterns
- Model: <Local Patterns, Co-occurrence>

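Learning Local Patterns
- Learn a sparse representation dictionary on the time signals
- Dictionary atoms correspond to the local patterns

$$ y \approx D x, \qquad D = \arg\min_{D} \sum_{i=1}^{n} \min_{x_i} \left\{ \| D x_i - y_i \|_2 + \| x_i \|_1 \right\} $$

where $y \in \mathbb{R}^{n}$ is the input signal, $D \in \mathbb{R}^{n \times p}$ is the dictionary, and $x \in \mathbb{R}^{p}$ is the sparse approximation.
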
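The inner step of dictionary learning, finding a sparse approximation x of a signal y over a fixed dictionary D, can be illustrated with a few lines of pure Python. This is a sketch under assumptions: it uses the common lasso form 0.5*||Dx - y||_2^2 + lam*||x||_1 solved by ISTA (iterative soft thresholding), which may differ from the exact objective and solver used in the talk, and the names `sparse_code`, `lam`, and `iters` are hypothetical.

```python
def soft_threshold(value: float, threshold: float) -> float:
    """Shrink a value toward zero by `threshold` (proximal step for the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sparse_code(D, y, lam=0.1, iters=2000):
    """Approximately minimize 0.5*||D x - y||_2^2 + lam*||x||_1 with ISTA.

    D is a list of n rows with p entries each (columns are dictionary atoms),
    y is a list of n samples. Returns the coefficient list x of length p.
    """
    n, p = len(D), len(D[0])
    # Step size 1/||D||_F^2 is safe because the Frobenius norm upper-bounds
    # the spectral norm that determines the gradient's Lipschitz constant.
    step = 1.0 / sum(D[i][j] ** 2 for i in range(n) for j in range(p))
    x = [0.0] * p
    for _ in range(iters):
        residual = [sum(D[i][j] * x[j] for j in range(p)) - y[i] for i in range(n)]
        gradient = [sum(D[i][j] * residual[i] for i in range(n)) for j in range(p)]
        x = [soft_threshold(x[j] - step * gradient[j], step * lam) for j in range(p)]
    return x


if __name__ == "__main__":
    # Toy dictionary with three atoms in R^2; y is exactly twice the third atom.
    D = [[1.0, 0.0, 0.6],
         [0.0, 1.0, 0.8]]
    y = [1.2, 1.6]
    print(sparse_code(D, y))
```

In the toy example the signal is built from a single atom, so the sparse code concentrates on that atom rather than spreading weight across all three, which is the property that lets atoms be read as local patterns.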
Learning Co-occurrence (LSA)
[Figure: matrices with axes labeled Sub-Signals, Local Patterns, and Copatterns]

Anomaly Detection using Model
[Figure: detection pipeline with stages labeled Extract subsequence, Dictionary, Sparse Representation, LSA, LSA Rep., and Anomaly Score]

Reconstruction of MySQL using VLC Model
[Figure: anomaly score versus execution steps for the vlc and mysql traces. Annotations: LSE = 1.8804; 95th percentile of reconstruction error]

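One plausible reading of the figure annotation is that the 95th percentile of the reconstruction error on an application's own traces serves as the detection threshold. The sketch below works under that assumption; the function names and the nearest-rank percentile definition are illustrative choices, not taken from the talk.

```python
def percentile(values, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    ordered = sorted(values)
    rank = (q * len(ordered) + 99) // 100  # ceil(q * n / 100) without floats
    return ordered[rank - 1]


def flag_anomalies(train_errors, test_errors, q=95):
    """Flag test windows whose reconstruction error exceeds the training percentile.

    train_errors: errors from reconstructing the application's own normal traces.
    test_errors: errors from reconstructing the traces under test with the same model.
    Returns (threshold, list of booleans, True meaning anomalous).
    """
    threshold = percentile(train_errors, q)
    return threshold, [error > threshold for error in test_errors]


if __name__ == "__main__":
    normal = [i / 10 for i in range(1, 21)]  # made-up errors 0.1 .. 2.0
    threshold, flags = flag_anomalies(normal, [0.5, 1.9, 2.5])
    print(threshold, flags)
```

A trace from a different application reconstructed with the wrong model should produce errors that sit mostly above this threshold, which is what the MySQL-under-VLC-model plot illustrates.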
Evaluation Methodology
1. Generate traces of normal behavior of applications:
   - VLC playing local files
   - Apache + MySQL running WordPress
   - Windows services
2. Learn a model of each application
3. Inject traces of shellcode behavior into testing traces
4. Compute anomaly scores

Evaluation Results
[Figure: two bar charts of Kobra (FN) and Kobra (FP): False Positive/Negative Rates for Reverse Shell, and False Positive/Negative Rates for Drive-by-Download]

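The false positive and false negative rates in the charts can be computed from ground-truth labels and detector flags. The sketch below assumes window-level labels (True where injected shellcode behavior is present); the slides do not state the granularity used, and `fp_fn_rates` is a hypothetical helper name.

```python
def fp_fn_rates(labels, flags):
    """Return (false positive rate, false negative rate).

    labels: True where a window contains injected (malicious) behavior.
    flags: True where the detector raised an alert for that window.
    """
    pairs = list(zip(labels, flags))
    false_pos = sum(1 for label, flag in pairs if flag and not label)
    false_neg = sum(1 for label, flag in pairs if label and not flag)
    negatives = sum(1 for label, _ in pairs if not label)
    positives = sum(1 for label, _ in pairs if label)
    fp_rate = false_pos / negatives if negatives else 0.0
    fn_rate = false_neg / positives if positives else 0.0
    return fp_rate, fn_rate


if __name__ == "__main__":
    labels = [False, False, False, False, True, True]
    flags = [False, True, False, False, True, False]
    print(fp_fn_rates(labels, flags))
```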
Lateral Movement Detection Using Distributed Data Fusion

Problem Description
How do we fuse diverse data sources using a distributed agent-based system to detect lateral movement in a network while maintaining scalability?

Lateral Movement Explained
- Starting from the entry point, the attacker moves to the target host
- The attacker uses system services or custom tools
- Goal: detect lateral movement chains in a system
[Figure: network of Hosts 1-5 with the Entry Point and the Target Host]

State-of-the-art
- Centrally correlate NetFlows to detect lateral movement
- The NetFlow correlation method is not accurate
- The amount of information is too large to be handled centrally

Approach Overview
- Cluster Comm. Graph
- Host Comm. Graph (Connection Causation Events)
- Process Comm. Graph (Inter Process Comm. Events)

System Model
[Figure: hosts grouped into Cluster 1 and Cluster 2]

Lateral Movement
- A critical step during an APT: moving from the entry point to the target host
[Figure: network with GL, L1, L2, Hosts 1-5, the Entry Point, and the Target Host; path steps numbered 1-6; connections C1 and C2 at Host 1]

Inside Host 1
- Process Communication View created by Kobra using timestamped events:
  - Processes running
  - Process communication (pipes, messages, ...)
  - Network connections (with a unique ID across the system)
  - File access
- A connection causation event is generated when the agent finds a path between incoming and outgoing connections
- The local agent infers connection causation using the Process Communication Graph
[Figure: Process Communication Graph on a timeline from T=0 to T=4, linking Connection 1 (C1) to Connection 2 (C2) through P1, P3 (Fork), Write file, and P4 (Start app using image)]
[Figure: resulting causation event: C1 caused C2, with t(c1) < t(c2)]

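A local agent's inference can be sketched as a search for a time-respecting path through the timestamped process communication events. The code below is a simplified illustration, not the agent's actual algorithm: the edge format, the node names, and the exact timestamps attached to each step are assumptions loosely modeled on the Host 1 figure.

```python
def reachable_in_time_order(edges, start):
    """Return all nodes reachable from `start` along non-decreasing timestamps.

    edges: iterable of (time, source, destination) tuples. Edges are scanned in
    time order, so an edge is only followed if its source was reached earlier.
    Events sharing a timestamp are handled in sorted order; a fuller version
    would iterate to a fixed point within each timestamp.
    """
    reached = {start}
    for _, source, destination in sorted(edges):
        if source in reached:
            reached.add(destination)
    return reached


def causation_event(edges, incoming, outgoing):
    """Return (incoming, outgoing) if the incoming connection can have caused
    the outgoing one, otherwise None."""
    if outgoing in reachable_in_time_order(edges, incoming):
        return (incoming, outgoing)
    return None


if __name__ == "__main__":
    # Hypothetical event log for Host 1.
    events = [
        (0, "C1", "P1"),      # incoming connection C1 handled by P1
        (1, "P1", "P3"),      # P1 forks P3
        (2, "P3", "file"),    # P3 writes a file
        (3, "file", "P4"),    # P4 is started using that file as its image
        (4, "P4", "C2"),      # P4 opens outgoing connection C2
    ]
    print(causation_event(events, "C1", "C2"))
```

Because only the ordering of events on a single host matters here, the agent can rely on its local clock alone.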
Lateral Movement
[Figure: the same network annotated with connections C1-C4 and causation events C1 -> C2, C2 -> C3, and C3 -> C4]

Inside Cluster Leader 1
- The cluster head maintains the Host Communication Graph
- Incoming causation events: C1 -> C2, C2 -> C3, C3 -> C4
- Resulting chain: C1 -> C2 -> C3 -> C4, with t(c1) < t(c2) < t(c3) < t(c4)
- Agents do not need to synchronize clocks
[Figure: Host Communication Graph over Hosts 1-4]

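The chaining performed by a cluster head can be sketched as linking causation events that share a connection ID. This is an illustrative sketch only: `build_chains` and the pair-based event format are assumptions, and the real system may carry more metadata per event.

```python
def build_chains(events):
    """Link (cause, effect) connection pairs into maximal chains.

    events: iterable of (cause_id, effect_id) pairs reported by host agents,
    in any arrival order. Ordering comes from the causal links themselves,
    so no timestamps (and no synchronized clocks) are needed.
    """
    events = list(events)
    successors = {}
    for cause, effect in events:
        successors.setdefault(cause, []).append(effect)
    effects = {effect for _, effect in events}
    # A chain starts at a connection that no reported event names as an effect.
    roots = sorted(cause for cause in successors if cause not in effects)

    chains = []

    def walk(path):
        extended = False
        for nxt in successors.get(path[-1], []):
            if nxt not in path:  # guard against cycles
                extended = True
                walk(path + [nxt])
        if not extended:
            chains.append(path)

    for root in roots:
        walk([root])
    return chains


if __name__ == "__main__":
    # Events arrive out of order from different hosts.
    reported = [("C3", "C4"), ("C1", "C2"), ("C2", "C3")]
    print(build_chains(reported))
```

The same routine could be applied one level up, where a global leader links the chains reported by each cluster.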
Lateral Movement
[Figure: the full network with Cluster1 (L1) and Cluster2 (L2) under GL, connections C1-C6, and causation events C1 -> C2, C2 -> C3, C3 -> C4, C4 -> C5, and C5 -> C6]

Discussion
- Network-level causation inference using host-level calls
- Detection load distributed over all agents via distributed fusion
- Eliminate the need for global clocks by abstracting data using a hierarchy

Conclusion
- We designed an end-to-end solution that provides cyber resiliency against coordinated threats
- Kobra generates views of a host and learns models of applications
- In a hierarchical manner, we used Kobra's views to generate a network-wide chain of a coordinated attack

Future Work
- We will formulate a theory for resilient integrity checking when an attacker is attempting evasion
  - PowerAlert
  - Integrity checking of an SDN
  - Rekeying of smart meters
- We plan to develop a response mechanism for lateral movement using adaptive control
  - The attacker model is unknown, to be learned
  - Response actions change network topology and healing rates of machines

Bibliography
[PRDC 17] A. M. Fawaz and W. H. Sanders, "Learning Process Behavioral Baselines for Anomaly Detection," Proceedings of the 22nd IEEE Pacific Rim International Symposium on Dependable Computing (PRDC 2017), Christchurch, New Zealand, January 22-25, 2017, to appear.
[SRDS 16] A. Fawaz, A. Bohara, C. Cheh, and W. H. Sanders, "Lateral Movement Detection Using Distributed Data Fusion," Proceedings of the 35th Symposium on Reliable Distributed Systems (SRDS), Budapest, Hungary, Sept. 26-29, 2016, to appear.
[RAID 16] A. Fawaz and W. H. Sanders, "Poster: Learning Process Behavioral Baselines for Anomaly Detection," RAID 2016 (poster).
[GameSec 16] M. A. Noureddine, A. Fawaz, W. H. Sanders, and T. Basar, "A Game-Theoretic Approach to Respond to Attacker Lateral Movement," Proceedings of the 7th Conference on Decision and Game Theory for Security (GameSec 2016), New York, New York, November 2-4, 2016, Lecture Notes in Computer Science vol. 9996, Springer, 2016, pp. 294-313.
[TSG 16] A. Fawaz, R. Berthier, and W. H. Sanders, "A Response Cost Model for Advanced Metering Infrastructures," IEEE Transactions on Smart Grid, vol. 7, no. 2, March 2016, pp. 543-553.
[JSAC 13] S. McLaughlin, B. Holbert, A. Fawaz, R. Berthier, and S. Zonouz, "A Multi-Sensor Intrusion and Energy Theft Detection Framework for Advanced Metering Infrastructures," IEEE JSAC Smart Grid Communications Series, vol. 31, no. 7, July 2013, pp. 1319-1330.
[SmartGridComm 12] A. Fawaz, R. Berthier, and W. H. Sanders, "Cost Modeling of Response Actions for Automated Response and Recovery in AMI," Proceedings of the Third IEEE International Conference on Smart Grid Communication (SmartGridComm 2012), Tainan City, Taiwan, Nov. 5-8, 2012, pp. 348-353.
[NISTCPS 12] A. Fawaz, R. Berthier, W. H. Sanders, and P. Pal, "Understanding the Role of Automated Response Actions in Improving AMI Resiliency," Proceedings of the NIST Cybersecurity for Cyber-Physical Systems Workshop, Gaithersburg, Maryland, Apr. 23-24, 2012.