Program Calling Context. Motivation

Similar documents
Probabilistic Calling Context

Lu Fang, University of California, Irvine Liang Dou, East China Normal University Harry Xu, University of California, Irvine

Execution Tracing and Profiling

Towards Parallel, Scalable VM Services

Efficient Path Profiling

Free-Me: A Static Analysis for Automatic Individual Object Reclamation

Efficient Runtime Tracking of Allocation Sites in Java

A Trace-based Java JIT Compiler Retrofitted from a Method-based Compiler

A Dynamic Evaluation of the Precision of Static Heap Abstractions

Identifying the Sources of Cache Misses in Java Programs Without Relying on Hardware Counters. Hiroshi Inoue and Toshio Nakatani IBM Research - Tokyo

Efficient Context Sensitivity for Dynamic Analyses via Calling Context Uptrees and Customized Memory Management

Elfen Scheduling. Fine-Grain Principled Borrowing from Latency-Critical Workloads using Simultaneous Multithreading (SMT) Xi Yang

Lightweight Data Race Detection for Production Runs

Phase-based Adaptive Recompilation in a JVM

How s the Parallel Computing Revolution Going? Towards Parallel, Scalable VM Services

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler

Cachetor: Detecting Cacheable Data to Remove Bloat

Screaming Fast Declarative Pointer Analysis

Toward Efficient Strong Memory Model Support for the Java Platform via Hybrid Synchronization

Uncovering Performance Problems in Java Applications with Reference Propagation Profiling

Bugs in Deployed Software

Adaptive Optimization using Hardware Performance Monitors. Master Thesis by Mathias Payer

Program Analysis. Program Analysis

Space-Time Tradeoffs in Software-Based Deep Packet Inspection

Trace-based JIT Compilation

DELTAPATH: PRECISE AND SCALABLE CALLING CONTEXT ENCODING

Jinn: Synthesizing Dynamic Bug Detectors for Foreign Language Interfaces

PERFBLOWER: Quickly Detecting Memory-Related Performance Problems via Amplification

Phosphor: Illuminating Dynamic. Data Flow in Commodity JVMs

Chronicler: Lightweight Recording to Reproduce Field Failures

Towards Garbage Collection Modeling

Detecting and Fixing Memory-Related Performance Problems in Managed Languages

Part I Garbage Collection Modeling: Progress Report

Java Heap Resizing From Hacked-up Heuristics to Mathematical Models. Jeremy Singer and David R. White

GC Assertions: Using the Garbage Collector to Check Heap Properties

Correcting the Dynamic Call Graph Using Control Flow Constraints

Z-Rays: Divide Arrays and Conquer Speed and Flexibility

Complex, concurrent software. Precision (no false positives) Find real bugs in real executions

Method-Level Phase Behavior in Java Workloads

Efficient Runtime Tracking of Allocation Sites in Java

CSE 373: Data Structures and Algorithms. Memory and Locality. Autumn Shrirang (Shri) Mare

IBM Research - Tokyo 数理 計算科学特論 C プログラミング言語処理系の最先端実装技術. Trace Compilation IBM Corporation

New Algorithms for Static Analysis via Dyck Reachability

Acknowledgements These slides are based on Kathryn McKinley s slides on garbage collection as well as E Christopher Lewis s slides

The Potentials and Challenges of Trace Compilation:

The DaCapo Benchmarks: Java Benchmarking Development and Analysis

1 Scalable Runtime Bloat Detection Using Abstract Dynamic Slicing

Java Performance Evaluation through Rigorous Replay Compilation

Dynamic Analysis of Inefficiently-Used Containers

String Deduplication for Java-based Middleware in Virtualized Environments

Continuous Object Access Profiling and Optimizations to Overcome the Memory Wall and Bloat

Understanding Parallelism-Inhibiting Dependences in Sequential Java

More Graph Algorithms

Averroes: Letting go of the library!

Cooperative Memory Management in Embedded Systems

G Programming Languages - Fall 2012

Pointer Analysis in the Presence of Dynamic Class Loading

Precise Calling Context Encoding

Finding the Limit: Examining the Potential and Complexity of Compilation Scheduling for JIT-Based Runtime Systems

CSCI-1200 Data Structures Spring 2018 Lecture 7 Order Notation & Basic Recursion

Tolerating Memory Leaks

Myths and Realities: The Performance Impact of Garbage Collection

We don t have much time, so we don t teach them [students]; we acquaint them with things that they can learn. Charles E. Leiserson

CSE 417 Branch & Bound (pt 4) Branch & Bound

Algorithms for Data Structures: Uninformed Search. Phillip Smith 19/11/2013

Chapter 2: Basic Data Structures

6.006 Introduction to Algorithms Recitation 19 November 23, 2011

COMP 202 Recursion. CONTENTS: Recursion. COMP Recursion 1

Large-Scale API Protocol Mining for Automated Bug Detection

CISC 662 Graduate Computer Architecture. Classifying ISA. Lecture 3 - ISA Michela Taufer. In a CPU. From Source to Assembly Code

B+ Tree Review. CSE332: Data Abstractions Lecture 10: More B Trees; Hashing. Can do a little better with insert. Adoption for insert

Interprocedural Analysis. Motivation. Interprocedural Analysis. Function Calls and Pointers

Analysis of Algorithms

Graphs & Digraphs Tuesday, November 06, 2007

Write Barrier Removal by Static Analysis

MEMORY: SWAPPING. Shivaram Venkataraman CS 537, Spring 2019

Machine Learning. bad news. Big Picture: Supervised Learning. Supervised (Function) Learning. Learning From Data with Decision Trees.

Automated Documentation Inference to Explain Failed Tests

CMU-Q Lecture 2: Search problems Uninformed search. Teacher: Gianni A. Di Caro

An Efficient Storeless Heap Abstraction Using SSA Form

CS 135 Winter 2018 Tutorial 7: Accumulative Recursion and Binary Trees. CS 135 Winter 2018 Tutorial 7: Accumulative Recursion and Binary Trees 1

IP Forwarding Computer Networking. Graph Model. Routes from Node A. Lecture 11: Intra-Domain Routing

ShortCut: Architectural Support for Fast Object Access in Scripting Languages

References. NoSQL Motivation. Why NoSQL as the Solution? NoSQL Key Feature Decisions. CSE 444: Database Internals

CS 251, LE 2 Fall MIDTERM 2 Tuesday, November 1, 2016 Version 00 - KEY

JAVA PERFORMANCE. PR SW2 S18 Dr. Prähofer DI Leopoldseder

Software Challenges: Abstraction and Analysis

Midterm 1 for CS 170

Toward Efficient Strong Memory Model Support for the Java Platform via Hybrid Synchronization

The wolf sheep cabbage problem. Search. Terminology. Solution. Finite state acceptor / state space

CS415 Compilers. Procedure Abstractions. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Solutions to Exam Data structures (X and NV)

Module 5 Graph Algorithms

Hello, and welcome to the Texas Instruments Precision DAC overview of DC specifications of DACs. In this presentation we will briefly cover the

Scribe: Virginia Williams, Sam Kim (2016), Mary Wootters (2017) Date: May 22, 2017

Evaluating Design Tradeoffs in Numeric Static Analysis for Java

Static Java Program Features for Intelligent Squash Prediction

Calvin Lin The University of Texas at Austin

CCured. One-Slide Summary. Lecture Outline. Type-Safe Retrofitting of C Programs

Preemption Delay Analysis for the Stack Cache

Transcription:

Program alling ontext Motivation alling context enhances program understanding and dynamic analyses by providing a rich representation of program location ollecting calling context can be expensive The resulting representation of contexts can be bulky 2 1

Overview Probabilistic alling ontext [2] Precise alling ontext Encoding [3] 3 Probabilistic alling ontext[2] ased on Mike ond s slides[1] 2

Why ontext Sensitivity? Static program location not enough at com.mckoi.db.jdbcserver.jdinterface.execquery():213 5 Why ontext Sensitivity? Static program location not enough at com.mckoi.db.jdbcserver.jdinterface.execquery():213 at com.mckoi.db.jdbc.monnection.executequery():348 at com.mckoi.db.jdbc.mstatement.executequery():110 at com.mckoi.db.jdbc.mstatement.executequery():127 at Test.main():48 6 3

Why ontext Sensitivity? Static program location not enough at com.mckoi.db.jdbcserver.jdinterface.execquery():213 at com.mckoi.db.jdbc.monnection.executequery():348 at com.mckoi.db.jdbc.mstatement.executequery():110 at com.mckoi.db.jdbc.mstatement.executequery():127 at Test.main():48 Motivated by omplex programs Small methods Virtual dispatch 7 Why ontext Sensitivity? Static program location not enough at com.mckoi.db.jdbcserver.jdinterface.execquery():213 at com.mckoi.db.jdbc.monnection.executequery():348 at com.mckoi.db.jdbc.mstatement.executequery():110 at com.mckoi.db.jdbc.mstatement.executequery():127 at Test.main():48 Motivated by omplex programs Small methods Virtual dispatch /Fortran method call return Java/# method return call 8 4

ontext Is Nontrivial PI calls Program all sites Distinct contexts antlr 4,184 128,627 bloat 3,306 600,947 chart 2,335 202,603 eclipse 9,611 226,020 fop 2,225 37,710 hsqldb 947 16,050 jython 1,830 628,048 luindex 654 102,556 lusearch 507 905 pmd 1,890 847,108 xalan 1,530 17,905 9 Example: Residual Testing t production time, are there any untested behaviors, such as unexercised calling context? class SimpleWindow { close() { class EditorWindow { close() { 10 5

Example: Residual Testing t production time, are there any untested behaviors, such as unexercised calling context? class SimpleWindow { close() { inputhandler() { class EditorWindow { case LIK_EXIT: close() { w.checkunsaved(); w.close(); autoupdate() { for all windows w w.close(); 11 Example: Residual Testing t production time, are there any untested behaviors, such as unexercised calling context? class SimpleWindow { close() { inputhandler() { class EditorWindow { case LIK_EXIT: close() { w.checkunsaved(); w.close(); autoupdate() { for all windows w w.close(); ug! 12 6

inputhandler() { case LIK_EXIT: Example: Residual Testing t production time, are there any untested behaviors, such as unexercised calling context? autoupdate() { class SimpleWindow { close() { for all windows w New behavior indicates bugs w.close(); ontext class EditorWindow sensievity helps { find new behavior close() { w.checkunsaved(); ug! w.close(); 13 Two-Phase Dynamic nalyses Training ProducEon ehavior observed New or anomalous behavior detected 14 7

Probabilistic alling ontext dds context sensitivity to dynamic analyses Maintains value representing context Unique with high probability New value à new context à walk stack High accuracy: <0.1% false negatives Low overhead: 3% overhead, 0-8% for clients Training ProducEon ehavior observed New or anomalous behavior detected 15 P Function f ( V, cs ) q V is P value q cs is call site ID 16 8

f ( V, cs ) q V is P value q cs is call site ID P Function V f ( V, cs 1 ) cs cs V f ( V, cs 2 ) 1 2 V V saved V V saved 17 P Function f ( V, cs ) 3V + cs (mod 2 32 ) q V is P value q cs is call site ID 18 9

P Function f ( V, cs ) 3V + cs (mod 2 32 ) heap to compute Desirable properties: Non-commutative omposition efficient to compute 19 Differentiating Similar ontexts V 3V + cs 1 V 3V + cs 2 V 3V + cs 2 V 3V + cs 1 à () à () à à () à () à 20 10

Differentiating Similar ontexts V 3V + cs 1 V 3V + cs 2 V 3V + cs 2 V 3V + cs 1 Non-commutative f ( f (V, cs 1 ), cs 2 ) f ( f (V, cs 2 ), cs 1 ) 21 Efficiency at Inlined alls V 3V + cs 1 V 3V + cs 2 22 11

Efficiency at Inlined alls V 3V + cs 1 V 3V + cs 2 V 3 ( 3V + cs 1 ) + cs 2 23 Efficiency at Inlined alls V 3V + cs 1 V 3V + cs 2 V 9V + 3cs 1 + cs 2 24 12

Efficiency at Inlined alls V 3V + cs 1 V 3V + cs 2 V 9V + 3cs 1 + cs 2 omposition efficient to compute n n i f ( V, cs ) = 3 V + 3 cs i i i 25 Outline Introduction Previous approaches Maintaining the P value Evaluation Methodology Evaluating potential clients ccuracy Performance 26 13

Methodology Implementation in Jikes RVM 2.4.6 vailable on Jikes RVM Research rchive Deterministic calling context profiling Maintains T node at each call & return enchmarks: Daapo, SPE J2000, SPE JVM98 Platform: 3.6 GHz Pentium 4 w/linux 27 How lients Use P Record values New value à new context à walk stack Training ProducEon ehavior observed New or anomalous behavior detected 28 14

Evaluating Potential lients Record values Training Global hash table heck values (no new values) ProducEon ehavior observed New or anomalous behavior detected 29 Evaluating Potential lients Memory overhead: proporeonal to contexts Record values Training Global hash table heck values (no new values) ProducEon ehavior observed New or anomalous behavior detected 30 15

Evaluating Potential lients nomaly- based intrusion deteceon heck P value at system calls (Network, I/O, OS) Residual teseng heck P value at Java PI calls (calls to java.*) Upper bound heck P value at all calls 31 n Ideal ccuracy P maps context to value New P value à new context Familiar P value à probably familiar context q q 32 16

n Ideal ccuracy P maps context to value New P value à new context Familiar P value à probably familiar context q q Expected conflicts (false negatives) Distinct contexts 32-bit values 64-bit values 100,000 1 (0.0%) 0 (0.0%) 1,000,000 116 (0.0%) 0 (0.0%) 10,000,000 11,632 (0.1%) 0 (0.0%) 100,000,000 1,155,170 (1.2%) 0 (0.0%) 1,000,000,000 107,882,641 (10.8%) 0 (0.0%) 10,000,000,000 6,123,623,065 (61.2%) 3 (0.0%) 33 n Ideal ccuracy P maps context to value New P value à new context Familiar P value à probably familiar context q q Expected conflicts (false negatives) Distinct contexts 32-bit values 64-bit values 100,000 1 (0.0%) 0 (0.0%) 1,000,000 PI calls 116 (0.0%) 0 (0.0%) 10,000,000 11,632 (0.1%) 0 (0.0%) 100,000,000 1,155,170 (1.2%) 0 (0.0%) 1,000,000,000 107,882,641 (10.8%) 0 (0.0%) 10,000,000,000 6,123,623,065 (61.2%) 3 (0.0%) 34 17

n Ideal ccuracy P maps context to value New P value à new context Familiar P value à probably familiar context q q Expected conflicts (false negatives) Distinct contexts 32-bit values 64-bit values 100,000 1 (0.0%) 0 (0.0%) 1,000,000 116 (0.0%) 0 (0.0%) 10,000,000 11,632 (0.1%) ll calls 0 (0.0%) 100,000,000 1,155,170 (1.2%) 0 (0.0%) 1,000,000,000 107,882,641 (10.8%) 0 (0.0%) 10,000,000,000 6,123,623,065 (61.2%) 3 (0.0%) 35 n Ideal ccuracy P maps context to value New P value à new context Familiar P value à probably familiar context q q Expected conflicts (false negatives) Distinct contexts 32-bit values 64-bit values 100,000 1 (0.0%) 0 (0.0%) 1,000,000 116 (0.0%) 0 (0.0%) 10,000,000 Near- perfect 11,632 accuracy (0.1%) 0 (0.0%) 100,000,000 1,155,170 (1.2%) 0 (0.0%) 1,000,000,000 107,882,641 (10.8%) 0 (0.0%) 10,000,000,000 6,123,623,065 (61.2%) 3 (0.0%) 36 18

P s ccuracy System calls Java PI calls Program Dynamic Distinct onf. Dynamic Distinct onf. antlr 211,490 1,567 0 24,422,013 128,627 3 bloat 12 10 0 1,159,281,573 600,947 40 chart 63 62 0 258,891,525 202,603 4 eclipse 14,110 197 0 132,507,343 226,020 5 fop 18 17 0 9,918,275 37,710 0 hsqldb 12 12 0 81,161,541 16,050 0 jython 5,929 4,289 0 543,845,772 628,048 48 luindex 2,615 14 0 39,733,214 102,556 0 lusearch 141 11 0 113,511,311 905 0 pmd 1,045 25 0 537,017,118 847,108 79 xalan 137,895 59 0 2,105,838,670 17,905 0 P s ccuracy System calls Java PI calls Program Dynamic Distinct onf. Dynamic Distinct onf. antlr 211,490 1,567 0 24,422,013 128,627 3 bloat 12 10 0 1,159,281,573 600,947 40 chart 63 62 0 258,891,525 202,603 4 eclipse 14,110 197 0 132,507,343 226,020 5 fop 18 17 0 9,918,275 37,710 0 hsqldb 12 12 0 81,161,541 16,050 0 jython 5,929 4,289 0 543,845,772 628,048 48 luindex 2,615 14 0 39,733,214 102,556 0 lusearch 141 11 0 113,511,311 905 0 pmd 1,045 25 0 537,017,118 847,108 79 xalan 137,895 59 0 2,105,838,670 17,905 0 19

P s ccuracy ll calls Program Dynamic Distinct onf. antlr 490,363,211 1,006,578 118 bloat 6,276,446,059 1,980,205 453 chart 908,459,469 845,432 91 eclipse 1,266,810,504 4,815,901 2,652 fop 44,200,446 174,955 2 hsqldb 877,680,667 110,795 1 jython 5,326,949,158 3,859,545 1,738 luindex 740,053,104 374,201 12 lusearch 1,439,034,336 6,039 0 pmd 2,726,876,957 8,043,096 7,653 xalan 10,083,858,546 163,205 6 P s Execution Time Overhead 60 50 Daapo SPE ll Overhead (%) 40 30 20 10 0 P 3% P + system calls 20

P s Execution Time Overhead 60 50 Daapo SPE ll Overhead (%) 40 30 20 10 0 3% P P + system calls P + PI calls P + all calls Summary P maintains calling context value New value indicates new behavior Low overhead Maintaining P value adds 3% hecking P value 0-8% Memory overhead proportional to contexts High accuracy Less than 0.1% false negative rate P adds context sensitivity to clients that detect anomalous behavior 42 21

Precise alling ontext Encoding [3] ased on Slides[5] Motivation alling context is important for profiling, debugging, and event logging The prior work encodes each calling context into a hash value, but cannot decode the context out of the value Is it possible to encode calling contexts into numbers from which the contexts can be restored? 44 22

ackground Efficient Path Profiling [4] Naively, where do you want to instrument to get the path profile? Is there any way to reduce instrumentation edges? D E F 45 The asic Idea If one node has only one outgoing edge, you don t need to instrument the edge If different paths can be encoded to different numbers, you can restore the whole paths purely based on the number values naïve example: Given the FG, how do you map the values to paths? E D F 46 23

Edge Value ssignment ssign each edge a value such that: Sum of values along DG path is unique, non-negative integer Sum lies in range 0 num_paths 1 Simple linear-time algorithm 47 Edge Value ssignment 48 24

Edge Value ssignment lgorithm foreach vertex v in reverse topological order { if v is a leaf vertex { NumPaths(v) = 1; else { NumPaths(v) = 0; for each edge e = v - > w { Val(e) = NumPaths(v); NumPaths(v) = NumPaths(v) + NumPaths(w); 2 2 1 E D F 49 How to decode a path from a number? lgorithm Given a path id Idx, set current node cur = while (cur!= F) { take the path whose value val(e) is (1) smaller than Idx, and (2) the maximum value among all the values smaller than Idx Idx = Idx val(e); cur = tgt(e); 2 2 1 E D F 50 25

Is the all-larus algorithm applicable to all Graph encoding? Yes. The L algorithm uniquely identifies each path ending at different methods. However, paths ending at different methods can have the same values, if we know ending points. E D 2 1 F ontext id DE 0 DF 1 DE 2 DF 3 E D 1 F ontext id DE 0 DF 0 DE 1 DF 1 51 lgorithm Edge Value ssignment foreach vertex v in topological order { if v is a root vertex { NumPaths(v) = 1; else { NumPaths(v) = 0; for each edge e = w - > v { Val(e) = NumPaths(v); NumPaths(v) = NumPaths(w) + Val(e); 52 26

L vs. New lgorithm 2 1 2 1 D D 1 3 E F E F 53 How to decode a context from a number? E D 1 F ontext id DE 0 DF 0 DE 1 DF 1 54 27

Encoding yclic Gs Introduce dummy edges D 1 Given encoded context (F, 1), does it correspond to path ( D F), ( D F D F), etc? Dummy D 1 1 E F Path Encoding D F (1, F) D F D F (<0, F to >, 1, F) E F 55 Selective Reduction To fit IDs of contexts in 32-bit integers, Profiling is used to identify hot and code call edges ode edges are replaced with dummy edges such as sub-paths starting with these code edges can be separately encoded 56 28

Evaluation 19 programs from SPEint 2000 benchmarks, and a set of open source programs 57 omparison between L and the new algorithm 58 29

Observations L usually instruments outgoing edges when there are multiple The new algorithm usually instruments incoming edges when there are multiple It seems that the latter instrumentation is more efficient than the former one The intuition behind is there are more methods with multi-method callees than methods with multi-method callers No experiment focuses on the instrumentation comparison 59 Stack Depth & Encountered Dynamic ontexts 60 30

Summary alling context is important for program analysis and debugging oth instrumentation overhead and memory overhead (how to encode the context) are important an you think about any approach to further reduce instrumentation effort? 61 References [1] Mike ond, Probabilistic alling ontext, PPT, http://web.cse.ohio-state.edu/~mikebond/ papers.html [2] Mike D. ond, Kathryn S. McKinley, Probabilistic alling ontext, OOPSL 07. [3] William N. Sumner, Yunhui Zheng, Dasarath Weeratunge, Xiangyu Zhang, Precise calling context encoding, ISE 10 [4] Thomas all, James Larus, Efficient Path Profiling, Micro 96 [5] James Larus, Efficient Path Profiling, Pages, http://pages.cs.wisc.edu/~larus/talks/path_talk/ 62 31