High System-Code Security with Low Overhead

1 High System-Code Security with Low Overhead
Jonas Wagner, Volodymyr Kuznetsov, George Candea, and Johannes Kinder
École Polytechnique Fédérale de Lausanne; Royal Holloway, University of London

2 High System-Code Security?
Today's software is dangerous.
Example: OpenSSL, overflow in ssl/t1_lib.c:3997
OpenSSL contains 53,073 memory accesses. How to protect them all?

3 Protect all dangerous operations using sanity checks
Checks are automatically added at compile time
No source code modification is needed
Example: AddressSanitizer turns
    *p = 42;
into
    if (!isvalidaddress(p)) { reporterror(p); abort(); }
    *p = 42;

4 Problem: Sanity checks cause high performance overhead
Tool                                                           Avg. overhead
AddressSanitizer (memory errors)                               73%
SoftBound/CETS (full memory safety)                            116%
UndefinedBehaviorSanitizer (integer overflows, type errors)    71%
Assertions, code contracts                                     depends

5 Problem: Sanity checks cause high performance overhead
People use checks heavily for testing, but disable them in production
Goal: checks in production

6 Insight: Checks are not all equal
Most of the overhead comes from a few expensive checks
  Checks in hot code, each executed many times
Most of the protection comes from many cheap checks
  Checks in cold code

7 Our Approach: ASAP (As Safe As Possible)
Lets users choose their overhead budget (e.g., 5%)
Automatically identifies sanity checks in software
Analyzes the cost of every check
Selects as many checks as fit in the user's budget
Most of the overhead comes from a few expensive checks
Most of the protection comes from many cheap checks
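The selection step on this slide can be illustrated with a minimal sketch. It assumes every check counts equally toward protection, so keeping the cheapest checks first maximizes how many fit in the budget. The function name `select_checks`, the check names, the cycle counts, and the idea of expressing the budget in cycles are all hypothetical and are not taken from ASAP's implementation.

```python
def select_checks(check_costs, budget_cycles):
    """Keep as many checks as fit in the budget, cheapest first.

    check_costs maps a (hypothetical) check name to its measured cost in
    CPU cycles; budget_cycles is the total cost the user is willing to pay.
    """
    selected, spent = [], 0
    for name, cost in sorted(check_costs.items(), key=lambda item: item[1]):
        if spent + cost > budget_cycles:
            break  # every remaining check is at least this expensive
        selected.append(name)
        spent += cost
    return selected


# Made-up costs: one check in hot code dominates the total.
costs = {
    "hot_loop_load": 9000,
    "parser_store": 40,
    "init_load": 5,
    "error_path_load": 0,
}
kept = select_checks(costs, budget_cycles=100)
sanity_level = len(kept) / len(costs)  # fraction of checks that stay active
```

In this toy input, dropping the single expensive check keeps three of the four checks while removing almost all of the cost, which is the effect the slide describes.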

8-11 ASAP Insight & Results
[Figure: sanity level (fraction of critical operations protected by a check, 0% to 100%) plotted against overhead (0% to 60%).]
Most protection comes from cheap checks
A few checks are very expensive
A user on a 5% budget can get 87% of the protection

12 Outline
Introduction: What is ASAP?
Design
Key Algorithms
Results
Conclusion

13 Design
ASAP is built into the compiler
Easy to use (set CC and CFLAGS)
Compatible & robust (parallel compilation, ...)
Components:
  Compiler (LLVM): identify sanity checks
  Profiler: measure check costs
  Optimizer: select maximum set of checks

14-16 Users use ASAP like a regular compiler that adds checks
Compile: .cc, .h -> ASAP -> .exe + .llvm
  ASAP stores intermediate compiler output
Profile: .llvm -> ASAP (Profiler) -> .exe; running it on a workload yields costs
  ASAP generates a program variant with profiling instrumentation. Users run this to measure check costs.
Optimize: .llvm + costs + budget -> ASAP (Optimizer) -> .exe
  ASAP uses costs & budget to generate an optimized program

18 ASAP
Components: identify check instructions, (de)activate checks, measure check cost, quantify protection, compiler driver, IDE integration
Inputs and outputs: budget, profiling workload, .exe
Can users trust ASAP to select checks that use the least CPU cycles?

19-22 Measure Check Cost
Code with its check:
    ...
    if (!isvalidaddress(p)) {
        reporterror(p);
        abort();
    }
    *p = 42;
    ...
The same code with profiling counters:
    ...
    prof1++;
    if (!isvalidaddress(p)) {
        prof2++;
        reporterror(p);
        abort();
    }
    prof3++;
    *p = 42;
    ...
1. Add profiling counters
2. Identify check instructions
3. Use static model of cycles per instruction
4. Compute cost for check in CPU cycles:
   $\sum_{i \in \mathit{check}} \mathrm{prof}(i) \cdot \mathrm{cycles}(i)$
Result: precise cost in CPU cycles
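As a worked instance of step 4, the following sketch multiplies each check instruction's execution count by a per-instruction cycle estimate and sums the products. The instruction names, counts, and cycle values are invented for illustration; the static cycle model used by ASAP is not given on the slides.

```python
def check_cost(check_instructions, prof, cycles):
    """Cost of one check in CPU cycles: sum of prof(i) * cycles(i).

    prof maps an instruction to how often it executed (read from the
    profiling counter of its basic block); cycles maps an instruction to a
    static estimate of its cost. Both tables are hypothetical here.
    """
    return sum(prof[i] * cycles[i] for i in check_instructions)


# One check made of three instructions on the common path plus a report
# call that never ran during profiling.
instructions = ["shadow_load", "compare", "branch", "report_call"]
prof = {"shadow_load": 1000, "compare": 1000, "branch": 1000, "report_call": 0}
cycles = {"shadow_load": 4, "compare": 1, "branch": 1, "report_call": 50}

cost = check_cost(instructions, prof, cycles)  # 1000*4 + 1000*1 + 1000*1 + 0*50
```

The report call is expensive per execution but contributes nothing, because its counter stayed at zero; the cost is driven by how often the check runs.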

23 ASAP (overview diagram as on slide 18)
How can ASAP detect checks robustly?
How are checks switched on or off?

24 // blocksort.c line 349 (bzip2 from SPEC CPU 2006)
    cc1 = eclass[fmap[i]];
LLVM IR:
    %1 = load i32* %fmap_i_ptr, align 4
    %2 = zext i32 %1 to i64
    %3 = getelementptr inbounds i32* %eclass, i64 %2
    %19 = load i32* %3, align 4

25 LLVM IR for the same access, with the AddressSanitizer check:
    ; <label>:0
    %1 = load i32* %fmap_i_ptr, align 4
    %2 = zext i32 %1 to i64
    %3 = getelementptr inbounds i32* %eclass, i64 %2
    %4 = ptrtoint i32* %3 to i64
    %5 = lshr i64 %4, 3
    %6 = add i64 %5, 0x
    %7 = inttoptr i64 %6 to i8*
    %8 = load i8* %7, align 1
    %9 = icmp eq i8 %8, 0
    br i1 %9, label %18, label %10
    ; <label>:10
    %11 = ptrtoint i32* %3 to i64
    %12 = and i64 %11, 7
    %13 = add i64 %12, 3
    %14 = trunc i64 %13 to i8
    %15 = icmp slt i8 %14, %8
    br i1 %15, label %18, label %16
    ; <label>:18
    %19 = load i32* %3, align 4
    ; <label>:16
    %17 = ptrtoint i32* %3 to i64
    call asan_report_load4(i64 %17) #3
    call void asm sideeffect "", ""() #3
    unreachable
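The check in this IR can be read as a two-step test on a shadow byte. The sketch below mirrors that logic under stated assumptions: shadow memory is modelled as a dictionary, a shadow value of 0 means the whole 8-byte granule is addressable, a positive value k means only the first k bytes are, and a negative value means none are. `SHADOW_OFFSET` is a placeholder, since the constant is truncated in the transcription, and `asan_access_is_valid` is a hypothetical name.

```python
def asan_access_is_valid(addr, size, shadow, shadow_offset):
    """Mirror the two-step check in the IR for an access of `size` bytes.

    `shadow` is a dict standing in for shadow memory; missing entries read
    as 0, meaning the whole 8-byte granule is addressable.
    """
    shadow_value = shadow.get((addr >> 3) + shadow_offset, 0)
    if shadow_value == 0:              # fast path: %9 = icmp eq i8 %8, 0
        return True
    last_byte = (addr & 7) + size - 1  # slow path: %12, %13 (size - 1 is 3)
    return last_byte < shadow_value    # %15 = icmp slt i8 %14, %8


SHADOW_OFFSET = 0x1000  # placeholder, NOT the constant truncated on the slide
shadow = {
    (0x2000 >> 3) + SHADOW_OFFSET: 4,   # only the first 4 bytes are valid
    (0x2008 >> 3) + SHADOW_OFFSET: -1,  # fully poisoned granule
}

ok = asan_access_is_valid(0x2000, 4, shadow, SHADOW_OFFSET)
overflow = asan_access_is_valid(0x2002, 4, shadow, SHADOW_OFFSET)
```

A `False` result corresponds to the branch into the block that calls the report function and never returns.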

30 Deactivating the check: the first branch condition is replaced by the constant true
    ; <label>:0
    %1 = load i32* %fmap_i_ptr, align 4
    %2 = zext i32 %1 to i64
    %3 = getelementptr inbounds i32* %eclass, i64 %2
    %4 = ptrtoint i32* %3 to i64
    %5 = lshr i64 %4, 3
    %6 = add i64 %5, 0x
    %7 = inttoptr i64 %6 to i8*
    %8 = load i8* %7, align 1
    %9 = icmp eq i8 %8, 0
    br i1 true, label %18, label %10
    ; <label>:10
    %11 = ptrtoint i32* %3 to i64
    %12 = and i64 %11, 7
    %13 = add i64 %12, 3
    %14 = trunc i64 %13 to i8
    %15 = icmp slt i8 %14, %8
    br i1 %15, label %18, label %16
    ; <label>:18
    %19 = load i32* %3, align 4
    ; <label>:16
    %17 = ptrtoint i32* %3 to i64
    call asan_report_load4(i64 %17) #3
    call void asm sideeffect "", ""() #3
    unreachable

31 The blocks that are no longer reachable (10 and 16) are gone
    ; <label>:0
    %1 = load i32* %fmap_i_ptr, align 4
    %2 = zext i32 %1 to i64
    %3 = getelementptr inbounds i32* %eclass, i64 %2
    %4 = ptrtoint i32* %3 to i64
    %5 = lshr i64 %4, 3
    %6 = add i64 %5, 0x
    %7 = inttoptr i64 %6 to i8*
    %8 = load i8* %7, align 1
    %9 = icmp eq i8 %8, 0
    br i1 true, label %18, label %10
    ; <label>:18
    %19 = load i32* %3, align 4

32 The unused check computation is gone as well; only the original access remains
    ; <label>:0
    %1 = load i32* %fmap_i_ptr, align 4
    %2 = zext i32 %1 to i64
    %3 = getelementptr inbounds i32* %eclass, i64 %2
    %19 = load i32* %3, align 4
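The transformation across these slides can be sketched on a toy control-flow graph: force the check's branch to its safe target, then drop every block that can no longer be reached. This is only a model of the idea, with blocks represented as a dictionary from block label to successor labels; it does not use LLVM's API, and the function names are hypothetical.

```python
def deactivate_check(blocks, check_block, safe_target):
    """Make the branch at the end of check_block always go to safe_target."""
    patched = dict(blocks)
    patched[check_block] = [safe_target]
    return patched


def remove_unreachable(blocks, entry):
    """Keep only the blocks reachable from entry (simple worklist search)."""
    reachable, worklist = set(), [entry]
    while worklist:
        block = worklist.pop()
        if block in reachable:
            continue
        reachable.add(block)
        worklist.extend(blocks[block])
    return {name: succs for name, succs in blocks.items() if name in reachable}


# Block labels follow the IR: 0 and 10 test the shadow byte, 16 reports the
# error, 18 performs the original load.
cfg = {"0": ["18", "10"], "10": ["18", "16"], "16": [], "18": []}
pruned = remove_unreachable(deactivate_check(cfg, "0", "18"), "0")
```

Once blocks 10 and 16 are gone, the shadow-byte computation in block 0 has no users left, so ordinary dead-code elimination can delete it too, which matches the last listing.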

33 ASAP (overview diagram as on slide 18)
If ASAP says you're 87% protected, what does this mean?

34 Typically, instrumentation tools try to offer guarantees (e.g., memory safety)
Lower overheads come from tools with weaker guarantees (e.g., control flow integrity)
Practical implication of weaker guarantees?

35 ASAP quantifies protection using the sanity level
We would like to know the effective protection level
Methodology: measure how many bugs/vulnerabilities are effectively prevented

36 Experiment 1
Source code of Python 2.7
Bug: line that has received a patch between version and 2.7.8

37 [Figure: buggy lines covered by checks (0 to 100%) plotted against sanity level (0 to 100%); plot labels: effective protection, sanity level.]
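The two quantities compared in this experiment can be written down as simple ratios. The sketch below assumes a bug is identified by a source line number and that a line counts as covered when at least one active check guards it; the function names and the line numbers are hypothetical.

```python
def sanity_level(active_checks, all_checks):
    """Fraction of critical operations that are still protected by a check."""
    return len(active_checks) / len(all_checks)


def effective_protection(buggy_lines, checked_lines):
    """Fraction of buggy lines that are covered by an active check."""
    if not buggy_lines:
        raise ValueError("need at least one buggy line")
    return len(buggy_lines & checked_lines) / len(buggy_lines)


# Made-up data: four patched lines, three of them guarded by active checks.
buggy = {10, 20, 30, 40}
checked = {10, 20, 30, 99}
protection = effective_protection(buggy, checked)
level = sanity_level(active_checks={"c1", "c2", "c3"},
                     all_checks={"c1", "c2", "c3", "c4"})
```

Plotting `protection` against `level` for a range of budgets gives a curve of the kind shown on the slide.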

38 Experiment 2: Known bugs
Project            Bugs
Python
OpenSSL            1
RIPE Benchmarks    10
All of these are in cold code

Experiment 3: Vulnerabilities from CVE DB
Analyze 145 vulnerabilities from 2014
  Memory errors
  Open source
  Patch available
  Error location known
83% of these are in cold code

Checks in cold code provide real value

40-44 Results
[Figure: overhead in % (y-axis up to 120) of AddressSanitizer on SPEC benchmarks (libquantum, lbm, soplex, mcf, namd, astar, bzip2, milc, sphinx3, hmmer, gobmk, sjeng, gcc, povray), shown at 100%, 95%, 90%, and 80% sanity level.]
Annotations: CPU cycles saved (at 95% sanity level); residual overhead, not due to checks (at 80% sanity level)

45 Conclusion
Protect your software! dslab.epfl.ch/proj/asap
Run-time checks deliver strong protection at high cost
ASAP: identify sanity checks, measure check costs, select maximum set of checks (budget in, .exe out)
Most of the overhead comes from a few expensive checks
Most of the protection comes from many cheap checks

Lightweight Memory Tracing

Lightweight Memory Tracing Lightweight Memory Tracing Mathias Payer*, Enrico Kravina, Thomas Gross Department of Computer Science ETH Zürich, Switzerland * now at UC Berkeley Memory Tracing via Memlets Execute code (memlets) for

More information

Resource-Conscious Scheduling for Energy Efficiency on Multicore Processors

Resource-Conscious Scheduling for Energy Efficiency on Multicore Processors Resource-Conscious Scheduling for Energy Efficiency on Andreas Merkel, Jan Stoess, Frank Bellosa System Architecture Group KIT The cooperation of Forschungszentrum Karlsruhe GmbH and Universität Karlsruhe

More information

Footprint-based Locality Analysis

Footprint-based Locality Analysis Footprint-based Locality Analysis Xiaoya Xiang, Bin Bao, Chen Ding University of Rochester 2011-11-10 Memory Performance On modern computer system, memory performance depends on the active data usage.

More information

A Fast Instruction Set Simulator for RISC-V

A Fast Instruction Set Simulator for RISC-V A Fast Instruction Set Simulator for RISC-V Maxim.Maslov@esperantotech.com Vadim.Gimpelson@esperantotech.com Nikita.Voronov@esperantotech.com Dave.Ditzel@esperantotech.com Esperanto Technologies, Inc.

More information

Enhanced Operating System Security Through Efficient and Fine-grained Address Space Randomization

Enhanced Operating System Security Through Efficient and Fine-grained Address Space Randomization Enhanced Operating System Security Through Efficient and Fine-grained Address Space Randomization Anton Kuijsten Andrew S. Tanenbaum Vrije Universiteit Amsterdam 21st USENIX Security Symposium Bellevue,

More information

Improving Cache Performance by Exploi7ng Read- Write Disparity. Samira Khan, Alaa R. Alameldeen, Chris Wilkerson, Onur Mutlu, and Daniel A.

Improving Cache Performance by Exploi7ng Read- Write Disparity. Samira Khan, Alaa R. Alameldeen, Chris Wilkerson, Onur Mutlu, and Daniel A. Improving Cache Performance by Exploi7ng Read- Write Disparity Samira Khan, Alaa R. Alameldeen, Chris Wilkerson, Onur Mutlu, and Daniel A. Jiménez Summary Read misses are more cri?cal than write misses

More information

CS377P Programming for Performance Single Thread Performance Out-of-order Superscalar Pipelines

CS377P Programming for Performance Single Thread Performance Out-of-order Superscalar Pipelines CS377P Programming for Performance Single Thread Performance Out-of-order Superscalar Pipelines Sreepathi Pai UTCS September 14, 2015 Outline 1 Introduction 2 Out-of-order Scheduling 3 The Intel Haswell

More information

UCB CS61C : Machine Structures

UCB CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 36 Performance 2010-04-23 Lecturer SOE Dan Garcia How fast is your computer? Every 6 months (Nov/June), the fastest supercomputers in

More information

Energy Models for DVFS Processors

Energy Models for DVFS Processors Energy Models for DVFS Processors Thomas Rauber 1 Gudula Rünger 2 Michael Schwind 2 Haibin Xu 2 Simon Melzner 1 1) Universität Bayreuth 2) TU Chemnitz 9th Scheduling for Large Scale Systems Workshop July

More information

Improving Cache Performance by Exploi7ng Read- Write Disparity. Samira Khan, Alaa R. Alameldeen, Chris Wilkerson, Onur Mutlu, and Daniel A.

Improving Cache Performance by Exploi7ng Read- Write Disparity. Samira Khan, Alaa R. Alameldeen, Chris Wilkerson, Onur Mutlu, and Daniel A. Improving Cache Performance by Exploi7ng Read- Write Disparity Samira Khan, Alaa R. Alameldeen, Chris Wilkerson, Onur Mutlu, and Daniel A. Jiménez Summary Read misses are more cri?cal than write misses

More information

Sandbox Based Optimal Offset Estimation [DPC2]

Sandbox Based Optimal Offset Estimation [DPC2] Sandbox Based Optimal Offset Estimation [DPC2] Nathan T. Brown and Resit Sendag Department of Electrical, Computer, and Biomedical Engineering Outline Motivation Background/Related Work Sequential Offset

More information

Performance Characterization of SPEC CPU Benchmarks on Intel's Core Microarchitecture based processor

Performance Characterization of SPEC CPU Benchmarks on Intel's Core Microarchitecture based processor Performance Characterization of SPEC CPU Benchmarks on Intel's Core Microarchitecture based processor Sarah Bird ϕ, Aashish Phansalkar ϕ, Lizy K. John ϕ, Alex Mericas α and Rajeev Indukuru α ϕ University

More information

Performance. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Performance. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Performance Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Defining Performance (1) Which airplane has the best performance? Boeing 777 Boeing

More information

What run-time services could help scientific programming?

What run-time services could help scientific programming? 1 What run-time services could help scientific programming? Stephen Kell stephen.kell@cl.cam.ac.uk Computer Laboratory University of Cambridge Contrariwise... 2 Some difficulties of software performance!

More information

Protecting Dynamic Code by Modular Control-Flow Integrity

Protecting Dynamic Code by Modular Control-Flow Integrity Protecting Dynamic Code by Modular Control-Flow Integrity Gang Tan Department of CSE, Penn State Univ. At International Workshop on Modularity Across the System Stack (MASS) Mar 14 th, 2016, Malaga, Spain

More information

Balancing DRAM Locality and Parallelism in Shared Memory CMP Systems

Balancing DRAM Locality and Parallelism in Shared Memory CMP Systems Balancing DRAM Locality and Parallelism in Shared Memory CMP Systems Min Kyu Jeong, Doe Hyun Yoon^, Dam Sunwoo*, Michael Sullivan, Ikhwan Lee, and Mattan Erez The University of Texas at Austin Hewlett-Packard

More information

Energy-centric DVFS Controlling Method for Multi-core Platforms

Energy-centric DVFS Controlling Method for Multi-core Platforms Energy-centric DVFS Controlling Method for Multi-core Platforms Shin-gyu Kim, Chanho Choi, Hyeonsang Eom, Heon Y. Yeom Seoul National University, Korea MuCoCoS 2012 Salt Lake City, Utah Abstract Goal To

More information

Super-optimizing LLVM IR

Super-optimizing LLVM IR Super-optimizing LLVM IR Duncan Sands DeepBlueCapital / CNRS Thanks to Google for sponsorship Super optimization Optimization Improve code Super optimization Optimization Improve code Super-optimization

More information

BUNSHIN: Compositing Security Mechanisms through Diversification (with Appendix)

BUNSHIN: Compositing Security Mechanisms through Diversification (with Appendix) BUNSHIN: Compositing Security Mechanisms through Diversification (with Appendix) Meng Xu, Kangjie Lu, Taesoo Kim, Wenke Lee Georgia Institute of Technology in practice. One reason is that the slowdown

More information

Thesis Defense Lavanya Subramanian

Thesis Defense Lavanya Subramanian Providing High and Predictable Performance in Multicore Systems Through Shared Resource Management Thesis Defense Lavanya Subramanian Committee: Advisor: Onur Mutlu Greg Ganger James Hoe Ravi Iyer (Intel)

More information

Kruiser: Semi-synchronized Nonblocking Concurrent Kernel Heap Buffer Overflow Monitoring

Kruiser: Semi-synchronized Nonblocking Concurrent Kernel Heap Buffer Overflow Monitoring NDSS 2012 Kruiser: Semi-synchronized Nonblocking Concurrent Kernel Heap Buffer Overflow Monitoring Donghai Tian 1,2, Qiang Zeng 2, Dinghao Wu 2, Peng Liu 2 and Changzhen Hu 1 1 Beijing Institute of Technology

More information

UCB CS61C : Machine Structures

UCB CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 38 Performance 2008-04-30 Lecturer SOE Dan Garcia How fast is your computer? Every 6 months (Nov/June), the fastest supercomputers in

More information

The Application Slowdown Model: Quantifying and Controlling the Impact of Inter-Application Interference at Shared Caches and Main Memory

The Application Slowdown Model: Quantifying and Controlling the Impact of Inter-Application Interference at Shared Caches and Main Memory The Application Slowdown Model: Quantifying and Controlling the Impact of Inter-Application Interference at Shared Caches and Main Memory Lavanya Subramanian* Vivek Seshadri* Arnab Ghosh* Samira Khan*

More information

NightWatch: Integrating Transparent Cache Pollution Control into Dynamic Memory Allocation Systems

NightWatch: Integrating Transparent Cache Pollution Control into Dynamic Memory Allocation Systems NightWatch: Integrating Transparent Cache Pollution Control into Dynamic Memory Allocation Systems Rentong Guo 1, Xiaofei Liao 1, Hai Jin 1, Jianhui Yue 2, Guang Tan 3 1 Huazhong University of Science

More information

ChargeCache. Reducing DRAM Latency by Exploiting Row Access Locality

ChargeCache. Reducing DRAM Latency by Exploiting Row Access Locality ChargeCache Reducing DRAM Latency by Exploiting Row Access Locality Hasan Hassan, Gennady Pekhimenko, Nandita Vijaykumar, Vivek Seshadri, Donghyuk Lee, Oguz Ergin, Onur Mutlu Executive Summary Goal: Reduce

More information

A Heterogeneous Multiple Network-On-Chip Design: An Application-Aware Approach

A Heterogeneous Multiple Network-On-Chip Design: An Application-Aware Approach A Heterogeneous Multiple Network-On-Chip Design: An Application-Aware Approach Asit K. Mishra Onur Mutlu Chita R. Das Executive summary Problem: Current day NoC designs are agnostic to application requirements

More information

Near-Threshold Computing: How Close Should We Get?

Near-Threshold Computing: How Close Should We Get? Near-Threshold Computing: How Close Should We Get? Alaa R. Alameldeen Intel Labs Workshop on Near-Threshold Computing June 14, 2014 Overview High-level talk summarizing my architectural perspective on

More information

Bunshin: Compositing Security Mechanisms through Diversification

Bunshin: Compositing Security Mechanisms through Diversification Bunshin: Compositing Security Mechanisms through Diversification Meng Xu, Kangjie Lu, Taesoo Kim, and Wenke Lee, Georgia Institute of Technology https://www.usenix.org/conference/atc17/technical-sessions/presentation/xu-meng

More information

HA2lloc: Hardware-Assisted Secure Allocator

HA2lloc: Hardware-Assisted Secure Allocator HA2lloc: Hardware-Assisted Secure Allocator Orlando Arias, Dean Sullivan, Yier Jin {oarias,dean.sullivan}@knights.ucf.edu yier.jin@ece.ufl.edu University of Central Florida University of Florida June 25,

More information

Addressing End-to-End Memory Access Latency in NoC-Based Multicores

Addressing End-to-End Memory Access Latency in NoC-Based Multicores Addressing End-to-End Memory Access Latency in NoC-Based Multicores Akbar Sharifi, Emre Kultursay, Mahmut Kandemir and Chita R. Das The Pennsylvania State University University Park, PA, 682, USA {akbar,euk39,kandemir,das}@cse.psu.edu

More information

Fast, precise dynamic checking of types and bounds in C

Fast, precise dynamic checking of types and bounds in C Fast, precise dynamic checking of types and bounds in C Stephen Kell stephen.kell@cl.cam.ac.uk Computer Laboratory University of Cambridge p.1 Tool wanted if (obj >type == OBJ COMMIT) { if (process commit(walker,

More information

Function Merging by Sequence Alignment

Function Merging by Sequence Alignment Function Merging by Sequence Alignment Rodrigo C. O. Rocha Pavlos Petoumenos Zheng Wang University of Edinburgh, UK University of Edinburgh, UK Lancaster University, UK r.rocha@ed.ac.uk ppetoume@inf.ed.ac.uk

More information

Energy Proportional Datacenter Memory. Brian Neel EE6633 Fall 2012

Energy Proportional Datacenter Memory. Brian Neel EE6633 Fall 2012 Energy Proportional Datacenter Memory Brian Neel EE6633 Fall 2012 Outline Background Motivation Related work DRAM properties Designs References Background The Datacenter as a Computer Luiz André Barroso

More information

EffectiveSan: Type and Memory Error Detection using Dynamically Typed C/C++

EffectiveSan: Type and Memory Error Detection using Dynamically Typed C/C++ EffectiveSan: Type and Memory Error Detection using Dynamically Typed C/C++ Gregory J. Duck and Roland H. C. Yap Department of Computer Science National University of Singapore {gregory, ryap}@comp.nus.edu.sg

More information

SVF: Static Value-Flow Analysis in LLVM

SVF: Static Value-Flow Analysis in LLVM SVF: Static Value-Flow Analysis in LLVM Yulei Sui, Peng Di, Ding Ye, Hua Yan and Jingling Xue School of Computer Science and Engineering The University of New South Wales 2052 Sydney Australia March 18,

More information

Coverage-guided Fuzzing of Individual Functions Without Source Code

Coverage-guided Fuzzing of Individual Functions Without Source Code Coverage-guided Fuzzing of Individual Functions Without Source Code Alessandro Di Federico Politecnico di Milano October 25, 2018 1 Index Coverage-guided fuzzing An overview of rev.ng Experimental results

More information

Architecture of Parallel Computer Systems - Performance Benchmarking -

Architecture of Parallel Computer Systems - Performance Benchmarking - Architecture of Parallel Computer Systems - Performance Benchmarking - SoSe 18 L.079.05810 www.uni-paderborn.de/pc2 J. Simon - Architecture of Parallel Computer Systems SoSe 2018 < 1 > Definition of Benchmark

More information

Perceptron Learning for Reuse Prediction

Perceptron Learning for Reuse Prediction Perceptron Learning for Reuse Prediction Elvira Teran Zhe Wang Daniel A. Jiménez Texas A&M University Intel Labs {eteran,djimenez}@tamu.edu zhe2.wang@intel.com Abstract The disparity between last-level

More information

HexType: Efficient Detection of Type Confusion Errors for C++ Yuseok Jeon Priyam Biswas Scott A. Carr Byoungyoung Lee Mathias Payer

HexType: Efficient Detection of Type Confusion Errors for C++ Yuseok Jeon Priyam Biswas Scott A. Carr Byoungyoung Lee Mathias Payer HexType: Efficient Detection of Type Confusion Errors for C++ Yuseok Jeon Priyam Biswas Scott A. Carr Byoungyoung Lee Mathias Payer Motivation C++ is a popular programming language Google Chrome, Firefox,

More information

Leveraging ECC to Mitigate Read Disturbance, False Reads Mitigating Bitline Crosstalk Noise in DRAM Memories and Write Faults in STT-RAM

Leveraging ECC to Mitigate Read Disturbance, False Reads Mitigating Bitline Crosstalk Noise in DRAM Memories and Write Faults in STT-RAM 1 MEMSYS 2017 DSN 2016 Leveraging ECC to Mitigate ead Disturbance, False eads Mitigating Bitline Crosstalk Noise in DAM Memories and Write Faults in STT-AM Mohammad Seyedzadeh, akan. Maddah, Alex. Jones,

More information

HideM: Protecting the Contents of Userspace Memory in the Face of Disclosure Vulnerabilities

HideM: Protecting the Contents of Userspace Memory in the Face of Disclosure Vulnerabilities HideM: Protecting the Contents of Userspace Memory in the Face of Disclosure Vulnerabilities Jason Gionta, William Enck, Peng Ning 1 JIT-ROP 2 Two Attack Categories Injection Attacks Code Integrity Data

More information

PIPELINING AND PROCESSOR PERFORMANCE

PIPELINING AND PROCESSOR PERFORMANCE PIPELINING AND PROCESSOR PERFORMANCE Slides by: Pedro Tomás Additional reading: Computer Architecture: A Quantitative Approach, 5th edition, Chapter 1, John L. Hennessy and David A. Patterson, Morgan Kaufmann,

More information

Security-Aware Processor Architecture Design. CS 6501 Fall 2018 Ashish Venkat

Security-Aware Processor Architecture Design. CS 6501 Fall 2018 Ashish Venkat Security-Aware Processor Architecture Design CS 6501 Fall 2018 Ashish Venkat Agenda Common Processor Performance Metrics Identifying and Analyzing Bottlenecks Benchmarking and Workload Selection Performance

More information

Thread-Level Speculation on Off-the-Shelf Hardware Transactional Memory

Thread-Level Speculation on Off-the-Shelf Hardware Transactional Memory Thread-Level Speculation on Off-the-Shelf Hardware Transactional Memory Rei Odaira Takuya Nakaike IBM Research Tokyo Thread-Level Speculation (TLS) [Franklin et al., 92] or Speculative Multithreading (SpMT)

More information

DEMM: a Dynamic Energy-saving mechanism for Multicore Memories

DEMM: a Dynamic Energy-saving mechanism for Multicore Memories DEMM: a Dynamic Energy-saving mechanism for Multicore Memories Akbar Sharifi, Wei Ding 2, Diana Guttman 3, Hui Zhao 4, Xulong Tang 5, Mahmut Kandemir 5, Chita Das 5 Facebook 2 Qualcomm 3 Intel 4 University

More information

LACS: A Locality-Aware Cost-Sensitive Cache Replacement Algorithm

LACS: A Locality-Aware Cost-Sensitive Cache Replacement Algorithm 1 LACS: A Locality-Aware Cost-Sensitive Cache Replacement Algorithm Mazen Kharbutli and Rami Sheikh (Submitted to IEEE Transactions on Computers) Mazen Kharbutli is with Jordan University of Science and

More information

SEN361 Computer Organization. Prof. Dr. Hasan Hüseyin BALIK (2 nd Week)

SEN361 Computer Organization. Prof. Dr. Hasan Hüseyin BALIK (2 nd Week) + SEN361 Computer Organization Prof. Dr. Hasan Hüseyin BALIK (2 nd Week) + Outline 1. Overview 1.1 Basic Concepts and Computer Evolution 1.2 Performance Issues + 1.2 Performance Issues + Designing for

More information

Efficient Physical Register File Allocation with Thread Suspension for Simultaneous Multi-Threading Processors

Efficient Physical Register File Allocation with Thread Suspension for Simultaneous Multi-Threading Processors Efficient Physical Register File Allocation with Thread Suspension for Simultaneous Multi-Threading Processors Wenun Wang and Wei-Ming Lin Department of Electrical and Computer Engineering, The University

More information

Efficient and Effective Misaligned Data Access Handling in a Dynamic Binary Translation System

Efficient and Effective Misaligned Data Access Handling in a Dynamic Binary Translation System Efficient and Effective Misaligned Data Access Handling in a Dynamic Binary Translation System JIANJUN LI, Institute of Computing Technology Graduate University of Chinese Academy of Sciences CHENGGANG

More information

Computer Architecture. Introduction

Computer Architecture. Introduction to Computer Architecture 1 Computer Architecture What is Computer Architecture From Wikipedia, the free encyclopedia In computer engineering, computer architecture is a set of rules and methods that describe

More information

562 IEEE TRANSACTIONS ON COMPUTERS, VOL. 65, NO. 2, FEBRUARY 2016

562 IEEE TRANSACTIONS ON COMPUTERS, VOL. 65, NO. 2, FEBRUARY 2016 562 IEEE TRANSACTIONS ON COMPUTERS, VOL. 65, NO. 2, FEBRUARY 2016 Memory Bandwidth Management for Efficient Performance Isolation in Multi-Core Platforms Heechul Yun, Gang Yao, Rodolfo Pellizzoni, Member,

More information

LLVM Performance Improvements and Headroom

LLVM Performance Improvements and Headroom LLVM Performance Improvements and Headroom Gerolf Hoflehner Apple LLVM Developers Meeting 2015 San Jose, CA Messages Tuning and focused local optimizations Advancing optimization technology Getting inspired

More information

Practical Automated Vulnerability Monitoring Using Program State Invariants

Practical Automated Vulnerability Monitoring Using Program State Invariants Practical Automated Vulnerability Monitoring Using Program State Invariants Cristiano Giuffrida Vrije Universiteit Amsterdam giuffrida@cs.vu.nl Lorenzo Cavallaro Royal Holloway, University of London lorenzo.cavallaro@rhul.ac.uk

More information

Insertion and Promotion for Tree-Based PseudoLRU Last-Level Caches

Insertion and Promotion for Tree-Based PseudoLRU Last-Level Caches Insertion and Promotion for Tree-Based PseudoLRU Last-Level Caches Daniel A. Jiménez Department of Computer Science and Engineering Texas A&M University ABSTRACT Last-level caches mitigate the high latency

More information

Efficient Architecture Support for Thread-Level Speculation

A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY Venkatesan Packirisamy IN PARTIAL FULFILLMENT OF THE

Lightweight Memory Tracing

Mathias Payer ETH Zurich Enrico Kravina ETH Zurich Thomas R. Gross ETH Zurich Abstract Memory tracing (executing additional code for every memory access of a program) is a powerful

Microarchitecture Overview. Performance

Microarchitecture Overview Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu January 15, 2007 Performance 4 Make operations faster Process improvements Circuit improvements Use more transistors to make

Exploiting Compressed Block Size as an Indicator of Future Reuse

Gennady Pekhimenko, Tyler Huberty, Rui Cai, Onur Mutlu, Todd C. Mowry Phillip B. Gibbons, Michael A. Kozuch Executive Summary In a compressed

SafeStack+: Enhanced Dual Stack to Combat Data-Flow Hijacking

Yan Lin, Xiaoxiao Tang, and Debin Gao School of Information Systems, Singapore Management University, Singapore {yanlin.2016, xxtang.2013,

A Fast and Low-Overhead Technique to Secure Programs Against Integer Overflows

Raphael Ernani Rodrigues, Victor Hugo Sperle Campos, Fernando Magno Quintão Pereira Department of Computer Science The Federal

EKT 303 WEEK Pearson Education, Inc., Hoboken, NJ. All rights reserved.

EKT 303 WEEK 2 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved. Chapter 2 Performance Issues Designing for Performance The cost of computer systems continues to drop dramatically,

Decoupled Compressed Cache: Exploiting Spatial Locality for Energy-Optimized Compressed Caching

Somayeh Sardashti and David A. Wood University of Wisconsin-Madison 1 Please find the power point presentation

Flexible Cache Error Protection using an ECC FIFO

Doe Hyun Yoon and Mattan Erez Dept Electrical and Computer Engineering The University of Texas at Austin 1 ECC FIFO Goal: to reduce on-chip ECC overhead

Bias Scheduling in Heterogeneous Multi-core Architectures

David Koufaty Dheeraj Reddy Scott Hahn Intel Labs {david.a.koufaty, dheeraj.reddy, scott.hahn}@intel.com Abstract Heterogeneous architectures that

Modeling Virtual Machines Misprediction Overhead

Divino César, Rafael Auler, Rafael Dalibera, Sandro Rigo, Edson Borin and Guido Araújo Institute of Computing, University of Campinas Campinas, São Paulo

Baggy bounds with LLVM

Anton Anastasov Chirantan Ekbote Travis Hance 6.858 Project Final Report 1 Introduction Buffer overflows are a well-known security problem; a simple buffer-overflow bug can often

Improving Error Checking and Unsafe Optimizations using Software Speculation. Kirk Kelsey and Chen Ding University of Rochester

Outline Motivation Brief problem statement How speculation can help Our software

Multiperspective Reuse Prediction

ABSTRACT Daniel A. Jiménez Texas A&M University djimenez@acm.org The disparity between last-level cache and memory latencies motivates the search for efficient cache management policies. Recent work in predicting

Enabling Consolidation and Scaling Down to Provide Power Management for Cloud Computing

Frank Yong-Kyung Oh Hyeong S. Kim Hyeonsang Eom Heon Y. Yeom School of Computer Science and Engineering Seoul National

Data Prefetching by Exploiting Global and Local Access Patterns

Journal of Instruction-Level Parallelism 13 (2011) 1-17 Submitted 3/10; published 1/11 Ahmad Sharif Hsien-Hsin S. Lee School of Electrical

CPU Performance Evaluation: Cycles Per Instruction (CPI) Most computers run synchronously utilizing a CPU clock running at a constant clock rate:

Clock cycle where: Clock rate = 1 / clock cycle f =

Generating Low-Overhead Dynamic Binary Translators

Mathias Payer ETH Zurich, Switzerland mathias.payer@inf.ethz.ch Thomas R. Gross ETH Zurich, Switzerland trg@inf.ethz.ch Abstract Dynamic (on the fly)

POPS: Coherence Protocol Optimization for both Private and Shared Data

Hemayet Hossain, Sandhya Dwarkadas, and Michael C. Huang University of Rochester Rochester, NY 14627, USA {hossain@cs, sandhya@cs,

Virtualized and Flexible ECC for Main Memory

Doe Hyun Yoon and Mattan Erez Dept. Electrical and Computer Engineering The University of Texas at Austin ASPLOS 2010 1 Memory Error Protection Applying ECC

Scalable Dynamic Task Scheduling on Adaptive Many-Cores

Introduction: Many-Core Paradigm [Our Definition] Vanchinathan Venkataramani, Anuj Pathania, Muhammad Shafique, Tulika Mitra, Jörg Henkel Bus CES Chair for

Exploring Speculative Parallelism in SPEC2006

Venkatesan Packirisamy, Antonia Zhai, Wei-Chung Hsu, Pen-Chung Yew and Tin-Fook Ngai University of Minnesota, Minneapolis. Intel Corporation {packve,zhai,hsu,yew}@cs.umn.edu

Linux Performance on IBM zenterprise 196

Martin Kammerer martin.kammerer@de.ibm.com 9/27/10 visit us at http://www.ibm.com/developerworks/linux/linux390/perf/index.html Trademarks IBM, the IBM logo, and

OpenPrefetch. (in-progress)

OpenPrefetch Let There Be Industry-Competitive Prefetching in RISC-V Processors (in-progress) Bowen Huang, Zihao Yu, Zhigang Liu, Chuanqi Zhang, Sa Wang, Yungang Bao Institute of Computing Technology(ICT),

Open Access Research on the Establishment of MSR Model in Cloud Computing based on Standard Performance Evaluation

Send Orders for Reprints to reprints@benthamscience.ae The Open Automation and Control Systems Journal, 2015, 7, 821-825

Architectural Supports to Protect OS Kernels from Code-Injection Attacks

2016-06-18 Hyungon Moon, Jinyong Lee, Dongil Hwang, Seonhwa Jung, Jiwon Seo and Yunheung Paek Seoul National University 1 Why to

Leakage-Resilient Layout Randomization for Mobile Devices

Kjell Braden, Stephen Crane, Lucas Davi, Michael Franz Per Larsen, Christopher Liebchen, Ahmad-Reza Sadeghi, CASED/Technische Universität Darmstadt,

Hybrid Cache Architecture (HCA) with Disparate Memory Technologies

Xiaoxia Wu, Jian Li, Lixin Zhang, Evan Speight, Ram Rajamony, Yuan Xie Pennsylvania State University IBM Austin Research Laboratory Acknowledgement:

HeapTherapy: An Efficient End-to-end Solution Against Heap Buffer Overflows

Qiang Zeng*, Mingyi Zhao*, Peng Liu The Pennsylvania State University University Park, PA, 16802, USA Email: {quz105, muz127,

Dynamic and Adaptive Calling Context Encoding

Jianjun Li State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences lijianjun@ict.ac.cn Wei-Chung Hsu Department

A Hybrid Adaptive Feedback Based Prefetcher

Santhosh Verma, David M. Koppelman and Lu Peng Department of Electrical and Computer Engineering Louisiana State University, Baton Rouge, LA 78 sverma@lsu.edu, koppel@ece.lsu.edu,

IAENG International Journal of Computer Science, 40:4, IJCS_40_4_07. RDCC: A New Metric for Processor Workload Characteristics Evaluation

Tianyong Ao, Pan Chen, Zhangqing He, Kui Dai, Xuecheng Zou Abstract Understanding the characteristics of workloads is extremely important

HOTL: A Higher Order Theory of Locality

Xiaoya Xiang Chen Ding Hao Luo Department of Computer Science University of Rochester {xiang, cding, hluo}@cs.rochester.edu Bin Bao Adobe Systems Incorporated bbao@adobe.com

Efficient Memory Shadowing for 64-bit Architectures

The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation Qin Zhao, Derek Bruening,

Potential for hardware-based techniques for reuse distance analysis

Michigan Technological University Digital Commons @ Michigan Tech Dissertations, Master's Theses and Master's Reports - Open Dissertations, Master's Theses and Master's Reports 2011

VA Smalltalk 9: Exploring the next-gen LLVM-based virtual machine

2017 FAST Conference La Plata, Argentina November, 2017 Alexander Mitin Senior Software Engineer Instantiations, Inc. The New VM Runtime

An Evaluation of Decoupled Access Execute on ARMv8

IT 17 006 Examensarbete 30 hp Januari 2017 Georgios Petrousis Institutionen för informationsteknologi Department of Information Technology Abstract An

An Application-Oriented Approach for Designing Heterogeneous Network-on-Chip

Technical Report CSE-11-7 Monday, June 13, 211 Asit K. Mishra Department of Computer Science and Engineering The Pennsylvania

Last time. Lecture #29 Performance & Parallel Intro

CS61C L29 Performance & Parallel (1) inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #29 Performance & Parallel Intro 2007-8-14 Scott Beamer, Instructor Paper Battery Developed by Researchers

Practical Data Compression for Modern Memory Hierarchies

Thesis Oral Gennady Pekhimenko Committee: Todd Mowry (Co-chair) Onur Mutlu (Co-chair) Kayvon Fatahalian David Wood, University of Wisconsin-Madison

AB-Aware: Application Behavior Aware Management of Shared Last Level Caches

Suhit Pai, Newton Singh and Virendra Singh Computer Architecture and Dependable Systems Laboratory Department of Electrical Engineering

Preventing Use-after-free with Dangling Pointers Nullification

Byoungyoung Lee, Chengyu Song, Yeongjin Jang Tielei Wang, Taesoo Kim, Long Lu, Wenke Lee Georgia Institute of Technology Stony Brook University

Multi-Cache Resizing via Greedy Coordinate Descent

I. Stephen Choi Donald Yeung Abstract To reduce power consumption

Fine-Grained User-Space Security Through Virtualization

Mathias Payer mathias.payer@inf.ethz.ch ETH Zurich, Switzerland Thomas R. Gross trg@inf.ethz.ch ETH Zurich, Switzerland Abstract This paper presents

Scheduling the Intel Core i7

Third Year Project Report University of Manchester SCHOOL OF COMPUTER SCIENCE Ibrahim Alsuheabani Degree Programme: BSc Software Engineering Supervisor: Prof. Alasdair Rawsthorne

An Intelligent Fetching algorithm For Efficient Physical Register File Allocation In Simultaneous Multi-Threading CPUs

International Journal of Computer Systems (ISSN: 2394-1065), Volume 04 Issue 04, April, 2017 Available at http://www.ijcsonline.com/