Static Analysis Alert Audits Lexicon And Rules David Svoboda, CERT Lori Flynn, CERT Presenter: Will Snavely, CERT


Software Engineering Institute, Carnegie Mellon University, Pittsburgh, PA 15213
© 2016 Carnegie Mellon University
[DISTRIBUTION STATEMENT A] This material has been approved for public release and unlimited distribution.
REV-03.18.2016.0

Copyright 2016 Carnegie Mellon University and IEEE

This material is based upon work funded and supported by the Department of Defense under Contract No. FA8721-05-C-0003 with Carnegie Mellon University for the operation of the Software Engineering Institute, a federally funded research and development center.

NO WARRANTY. THIS CARNEGIE MELLON UNIVERSITY AND SOFTWARE ENGINEERING INSTITUTE MATERIAL IS FURNISHED ON AN AS-IS BASIS. CARNEGIE MELLON UNIVERSITY MAKES NO WARRANTIES OF ANY KIND, EITHER EXPRESSED OR IMPLIED, AS TO ANY MATTER INCLUDING, BUT NOT LIMITED TO, WARRANTY OF FITNESS FOR PURPOSE OR MERCHANTABILITY, EXCLUSIVITY, OR RESULTS OBTAINED FROM USE OF THE MATERIAL. CARNEGIE MELLON UNIVERSITY DOES NOT MAKE ANY WARRANTY OF ANY KIND WITH RESPECT TO FREEDOM FROM PATENT, TRADEMARK, OR COPYRIGHT INFRINGEMENT.

[Distribution Statement A] This material has been approved for public release and unlimited distribution. Please see Copyright notice for non-US Government use and distribution.

This material may be reproduced in its entirety, without modification, and freely distributed in written or electronic form without requesting formal permission. Permission is required for any other use. Requests for permission should be directed to the Software Engineering Institute at permission@sei.cmu.edu.

Carnegie Mellon and CERT are registered marks of Carnegie Mellon University.

DM-0004189

Audit Lexicon And Rules

- Background
- Lexicon
- Rules
- Future Work
- Questions?

Audit Lexicon And Rules: Background

Background: Automatic Alert Classification

[Figure: Static Analysis Tool(s) -> Alerts -> Alert Consolidation (SCALe) -> Potential Rule Violations -> Auditing -> Determinations -> Training Data -> ML Classifier Development]

- Select candidate code bases for evaluation.
- Run SA tool(s) to get a collection of alerts indicating potential flaws.
- Convert alerts to a common format and map them to CERT Secure Coding Rules/CWEs.
- Humans evaluate the violations, e.g. marking them as TRUE or FALSE.
- Use the training data to build machine learning classifiers that predict TRUE and FALSE determinations for new alerts.

What do TRUE/FALSE mean? Are there other determinations I can use?
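The consolidation step (converting alerts to a common format and mapping them to CERT Secure Coding Rules/CWEs) can be pictured as a lookup table. The following is only a minimal sketch, not SCALe's actual format: the tool names, checker names, and the `checker_mapping` type are hypothetical. The only real identifiers are CERT C rule EXP34-C (do not dereference null pointers) and CWE-476 (NULL pointer dereference).

```c
#include <stddef.h>
#include <string.h>

/* One row of a hypothetical mapping from a tool-specific checker to a
 * coding-taxonomy condition. Tool and checker names are invented. */
typedef struct {
    const char *tool;
    const char *checker;
    const char *cert_rule;
    const char *cwe;
} checker_mapping;

static const checker_mapping mappings[] = {
    { "tool_a", "NULL_DEREF",       "EXP34-C", "CWE-476" },
    { "tool_b", "null-pointer-use", "EXP34-C", "CWE-476" },
};

/* Returns the mapping for (tool, checker), or NULL if none is known. */
const checker_mapping *find_mapping(const char *tool, const char *checker) {
    for (size_t i = 0; i < sizeof mappings / sizeof mappings[0]; i++) {
        if (strcmp(mappings[i].tool, tool) == 0 &&
            strcmp(mappings[i].checker, checker) == 0) {
            return &mappings[i];
        }
    }
    return NULL;
}
```

With a table like this, alerts from two different tools land on the same condition, so an auditor judges them against one definition.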

What is truth?

One collaborator reported using the determination True to indicate that the issue reported by the alert was a real problem in the code.

Another collaborator used True to indicate that something was wrong with the diagnosed code, even if the specific issue reported by the alert was a false positive!

Inconsistent assignment of audit determinations may have a negative impact on classifier development!

Solution: Lexicon And Rules

We developed a lexicon and auditing rule set for our collaborators.

- Includes a standard set of well-defined determinations for static analysis alerts
- Includes a set of auditing rules to help auditors make consistent decisions in commonly encountered situations
- Different auditors should make the same determination for a given alert!
- Improve the quality and consistency of audit data for the purpose of building machine learning classifiers
- Help organizations make better-informed decisions about bug fixes, development, and future audits

Audit Lexicon And Rules: Lexicon

Lexicon: Audit Determinations

Basic Determinations (choose ONE per alert!)
- True
- False
- Complex
- Dependent
- Unknown (default)

Supplemental Determinations (choose ANY NUMBER per alert!)
- Dangerous construct
- Dead
- Ignore
- Inapplicable environment
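The "choose one" versus "choose any number" split could be modeled as one enumerated value plus a set of independent flags. This is only a sketch under that assumption: the type and function names are hypothetical and not taken from SCALe, and the risk level attached to a dangerous construct is omitted for brevity.

```c
#include <stdbool.h>

/* Exactly one basic determination per alert; Unknown is the default. */
typedef enum {
    BASIC_UNKNOWN = 0,
    BASIC_TRUE,
    BASIC_FALSE,
    BASIC_COMPLEX,
    BASIC_DEPENDENT
} basic_determination;

/* Supplemental determinations are independent bit flags, so any
 * number of them can be attached to the same alert. */
typedef enum {
    SUPP_NONE                     = 0,
    SUPP_DANGEROUS_CONSTRUCT      = 1 << 0,
    SUPP_DEAD                     = 1 << 1,
    SUPP_IGNORE                   = 1 << 2,
    SUPP_INAPPLICABLE_ENVIRONMENT = 1 << 3
} supplemental_flag;

typedef struct {
    basic_determination basic;
    unsigned supplemental;   /* bitwise OR of supplemental_flag values */
} audit_result;

/* A freshly created result: Unknown, with no supplemental flags. */
audit_result audit_result_default(void) {
    audit_result r = { BASIC_UNKNOWN, SUPP_NONE };
    return r;
}

void audit_add_supplemental(audit_result *r, supplemental_flag flag) {
    r->supplemental |= (unsigned)flag;
}

bool audit_has_supplemental(const audit_result *r, supplemental_flag flag) {
    return (r->supplemental & (unsigned)flag) != 0;
}
```

Because `basic` is a single field, an alert cannot be both True and False at once, while the flag set lets a False alert also carry Dangerous Construct.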

Lexicon: Basic Determinations

True
The code in question violates the condition indicated by the alert.
- E.g. a valid program should not dereference NULL pointers.
The condition can be determined from the definition of the alert itself, or from the coding taxonomy the alert corresponds to.
- CERT Secure Coding Rules
- CWEs

Lexicon: Basic Determinations

True Example

    #include <stdlib.h>

    char *build_array(size_t size, char first) {
        if (size == 0) {
            return NULL;
        }
        /* malloc may return NULL, and the result is never checked. */
        char *array = malloc(size * sizeof(char));
        array[0] = first;   /* ALERT: Do not dereference NULL pointers! */
        return array;
    }

Determination: TRUE

Lexicon: Basic Determinations

False
The code in question does not violate the condition indicated by the alert.

    #include <stdlib.h>

    char *build_array(size_t size, char first) {
        if (size == 0) {
            return NULL;
        }
        char *array = malloc(size * sizeof(char));
        if (array == NULL) {
            abort();        /* allocation failure is handled here */
        }
        array[0] = first;   /* ALERT: Do not dereference NULL pointers! */
        return array;
    }

Determination: FALSE

Lexicon: Basic Determinations

Complex
The alert is too difficult to judge within a reasonable amount of time and effort.
- "Reasonable" is defined by the individual organization.

Dependent
The alert is related to a True alert that occurs earlier in the code.
- Intuition: fixing the first alert would implicitly fix the second one.

Unknown
None of the above. This is the default determination.

Lexicon: Basic Determinations

Dependent Example

    #include <stdlib.h>

    char *build_array(size_t size, char first, char last) {
        if (size == 0) {
            return NULL;
        }
        char *array = malloc(size * sizeof(char));
        array[0] = first;         /* ALERT: Do not dereference NULL pointers! Determination: TRUE */
        array[size - 1] = last;   /* ALERT: Do not dereference NULL pointers! Determination: DEPENDENT */
        return array;
    }
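The intuition behind Dependent, that fixing the first alert implicitly fixes the second, can be illustrated with one possible repair. This is a sketch only: the name `build_array_checked` is hypothetical, and returning NULL on allocation failure (rather than calling `abort()`) is an assumption made here so the caller can decide how to react.

```c
#include <stdlib.h>

/* A single NULL check after malloc guards both later writes, so the
 * alert on array[0] and the alert on array[size - 1] are resolved by
 * the same fix. */
char *build_array_checked(size_t size, char first, char last) {
    if (size == 0) {
        return NULL;
    }
    char *array = malloc(size * sizeof(char));
    if (array == NULL) {
        return NULL;            /* allocation failed: nothing to write to */
    }
    array[0] = first;           /* array is known to be non-NULL here */
    array[size - 1] = last;     /* ...and here, thanks to the same check */
    return array;
}
```

The caller owns the returned buffer and is expected to `free` it.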

Lexicon: Supplemental Determinations

Dangerous Construct
The alert refers to a piece of code that poses risk if it is not modified.
- Risk level is specified as High, Medium, or Low
- Independent of whether the alert is true or false!

Dead
The code in question is not reachable at runtime.

Inapplicable Environment
The alert does not apply to the current environments where the software runs (OS, CPU, etc.)
- If a new environment were added in the future, the alert may apply.

Ignore
The code in question does not require mitigation.

Lexicon: Supplemental Determinations

Dangerous Construct Example

    #include <stdio.h>

    #define BUF_MAX 128

    void create_file(const char *base_name) {
        // Add the .txt extension!
        char filename[BUF_MAX];
        snprintf(filename, 128, "%s.txt", base_name);   // ALERT: potential buffer overrun!
        // Create the file, etc...
    }

Seems ok, but why not use BUF_MAX instead of 128?

Determination: False + Dangerous Construct
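One way to remove the dangerous construct is to derive the bound from the buffer itself instead of repeating the literal 128. The sketch below is an assumption about how that might look, not the presenters' fix: `make_filename` is a hypothetical helper, and it also reports truncation through the return value of `snprintf`.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes "<base_name>.txt" into out, which holds out_size bytes.
 * Returns 0 on success, or -1 if the name did not fit or snprintf
 * reported an error. The bound always comes from the caller's buffer
 * size, so it cannot drift out of sync with the buffer declaration. */
int make_filename(char *out, size_t out_size, const char *base_name) {
    int n = snprintf(out, out_size, "%s.txt", base_name);
    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    return 0;
}
```

A caller with `char filename[BUF_MAX];` would pass `sizeof filename` as `out_size`, so changing `BUF_MAX` later needs no other edit.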

Audit Lexicon And Rules: Rules

Auditing Rules

Goals
- Clarify ambiguous or complex auditing scenarios
- Establish assumptions auditors can make
- Overall: help make audit determinations more consistent

We developed 12 rules
- Drew on our own experiences auditing code bases at CERT
- Trained 3 groups of engineers on the rules, and incorporated their feedback

In the following slides, we will inspect three of the rules in more detail.

Example Rule: Assume external inputs to the program are malicious

An auditor should assume that inputs to a program module (e.g. function parameters, command line arguments, etc.) may have arbitrary, potentially malicious, values.
- Unless they have a strong guarantee to the contrary

Example from recent history: Java Deserialization
An auditor can assume that external data passed to the readObject deserialization routine may be malicious
- Assuming there are no other mitigations in place
- See: SER12-J, Prevent deserialization of untrusted data

Audit Rules

External Inputs Example

    import java.io.*;

    class DeserializeExample {
        public static Object deserialize(byte[] buffer) throws Exception {
            ByteArrayInputStream bais;
            ObjectInputStream ois;
            bais = new ByteArrayInputStream(buffer);
            ois = new ObjectInputStream(bais);
            return ois.readObject();   // ALERT: Don't deserialize untrusted data!
        }
    }

Without strong evidence to the contrary, assume the buffer could be malicious!

Determination: TRUE
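The same rule applies to ordinary function parameters in C. The following sketch is an illustration constructed for this transcript, not an example from the slides: the table and both function names are hypothetical. Under the rule, an out-of-bounds alert on `lookup_unchecked` would plausibly be audited as True unless the auditor has a strong guarantee about `index`, while the checked variant validates the input itself.

```c
#include <stddef.h>

#define TABLE_SIZE 4

static const int table[TABLE_SIZE] = { 10, 20, 30, 40 };

/* No validation: correctness depends entirely on what callers pass.
 * An auditor should assume index may hold any value. */
int lookup_unchecked(size_t index) {
    return table[index];
}

/* Validates the input before use. Returns 0 and stores the value in
 * *out on success, or -1 if index is out of range. */
int lookup_checked(size_t index, int *out) {
    if (index >= TABLE_SIZE) {
        return -1;
    }
    *out = table[index];
    return 0;
}
```

The checked version gives the auditor the guarantee locally, so there is no need to trace every call site.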

Example Rule: Unless instructed otherwise, assume code must be portable.

When auditing alerts for a code base where the target platform is not specified, the auditor should err on the side of portability.

If a diagnosed segment of code malfunctions on certain platforms, and in doing so violates a condition, this is suitable justification for marking the alert True.

Audit Rules

Portability Example

    int strcmp(const char *str1, const char *str2) {
        while (*str1 == *str2) {
            if (*str1 == '\0') {
                return 0;
            }
            str1++;
            str2++;
        }
        if (*str1 < *str2) {   /* ALERT: Cast to unsigned char before comparing! */
            return -1;
        } else {
            return 1;
        }
    }

This code would be safe on a platform where chars are unsigned, but that hasn't been guaranteed!

Determination: TRUE
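A portable variant of the comparison casts each character to `unsigned char` before ordering them, which is how the C standard defines the ordering used by the library's own `strcmp`. The sketch below uses the hypothetical name `portable_strcmp` to avoid redefining a standard library function. It behaves the same whether plain `char` is signed or unsigned.

```c
/* Compares two NUL-terminated strings. Returns 0 if equal, -1 if str1
 * orders before str2, and 1 otherwise. The first differing characters
 * are compared as unsigned char, so bytes with the high bit set order
 * after ASCII characters on every platform. */
int portable_strcmp(const char *str1, const char *str2) {
    while (*str1 == *str2) {
        if (*str1 == '\0') {
            return 0;
        }
        str1++;
        str2++;
    }
    if ((unsigned char)*str1 < (unsigned char)*str2) {
        return -1;
    } else {
        return 1;
    }
}
```

On a platform with signed `char`, the original code would order the byte 0x80 before `'a'`, while this version orders it after.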

Example Rule: Handle an alert in unreachable code depending on whether it is exportable.

Certain code segments may be unreachable at runtime. This is also called dead code.

A static analysis tool might not be able to realize this, and still mark alerts in code that cannot be executed.
- The Dead supplemental determination can be applied to these alerts.

However, an auditor should take care when deciding if a piece of code is truly dead.

In particular: just because a given program module (function, class) is not used does not mean it is dead.
- The module might be exported as a public interface, for use by another application.

This rule was developed as a result of a scenario encountered by one of our collaborators!
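The distinction between truly dead code and merely unused code can be sketched in C. Both functions below are hypothetical illustrations, not code from the audited projects. The third branch of `clamp_percent` can never execute, so an alert inside it could reasonably receive the Dead supplemental determination. `count_spaces` has external linkage, so even if nothing in this program calls it, another application linking against the code could.

```c
#include <stddef.h>

/* Clamps value to the range 0..100. */
int clamp_percent(int value) {
    if (value < 0) {
        return 0;
    }
    if (value > 100) {
        return 100;
    }
    if (value < 0) {
        /* Unreachable: negative values already returned above.
         * An alert reported here is in dead code. */
        return -1;
    }
    return value;
}

/* External linkage: not called elsewhere in this file, but exported.
 * Unused here does not mean dead, so alerts in this function should
 * be audited normally. */
size_t count_spaces(const char *s) {
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (*s == ' ') {
            n++;
        }
    }
    return n;
}
```

Declaring a helper `static` would change the picture, since a function with internal linkage that is never referenced in its translation unit cannot be reached from outside it.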

Future Work

- Gather feedback on our lexicon and rules from surveys, focus groups, experts, etc.
- Continue to refine the lexicon/rules.
- Further develop CERT's SCALe auditing framework to fully incorporate these concepts.
- Work with more collaborators to test the rules/lexicon in practice.
  - We have some initial feedback from two collaborators, who used our rules to audit several hundred alerts from C and Java codebases.

Audit Lexicon And Rules: Questions?

wsnavely@cert.org