Hierarchical Pointer Analysis
Transcription
1 Hierarchical Pointer Analysis for Distributed Programs Amir Kamil and Katherine Yelick U.C. Berkeley August 23, Amir Kamil
2 Background
3 Hierarchical Machines
Parallel machines often have hierarchical structure: level 1 (thread local), level 2 (node local), level 3 (cluster local), level 4 (grid world).
4 Partitioned Global Address Space
Partitioned global address space (PGAS) languages provide the illusion of shared memory across the machine. Wide pointers are used to represent global addresses; they contain identifying information plus the physical address (for example, Process ID: 1, Address: 0xf9a0cb48). Narrow pointers can still be used for addresses in the local physical address space (for example, Address: 0xf9a0cb48).
5 The Problems
6 Three Problems
What data is private to a thread? What data is local to the physical address space? What possible race conditions can occur?
7 Data Privacy
Data is private if it cannot leak beyond its source thread. Knowing which data is private is useful for global garbage collection, monitor optimization, and other applications.
8 Data Locality
Recall: global pointers are composed of identifying information and an address (for example, Process ID: 1, Address: 0xf9a0cb48). When a global pointer is dereferenced, the runtime system must perform a check to determine whether the data is actually in the local physical address space. If it is local, the data is accessed directly; if not, communication is performed. Thus, global pointers are more costly in both space and time, even if the actual data is local.
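The locality check described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not Titanium's runtime: `WidePointer`, `dereference`, `remote_fetch`, and the dictionary-based memories are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WidePointer:
    """Hypothetical wide pointer: identifying information plus an address."""
    process_id: int
    address: int


def dereference(ptr, my_process_id, local_memory, remote_fetch):
    """Check locality at run time, then read directly or communicate."""
    if ptr.process_id == my_process_id:
        # Data is in the local physical address space: direct access.
        return local_memory[ptr.address]
    # Data lives in another process: communication is required.
    return remote_fetch(ptr.process_id, ptr.address)


# Toy memories standing in for two processes (assumption for illustration).
local_memory = {0xF9A0CB48: 42}
remote_memories = {1: {0xF9A0CB48: 7}}


def remote_fetch(process_id, address):
    """Stand-in for a network read from another process."""
    return remote_memories[process_id][address]


local_value = dereference(WidePointer(0, 0xF9A0CB48), 0, local_memory, remote_fetch)
remote_value = dereference(WidePointer(1, 0xF9A0CB48), 0, local_memory, remote_fetch)
```

Even the local read pays for the `process_id` comparison, which is the time overhead the slide refers to; the extra field is the space overhead.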
9 Race Detection
Shared memory introduces the possibility of race conditions. A race can occur when two threads access the same memory location, the accesses can be simultaneous (no intermediate synchronization), and at least one access is a write.
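The three race conditions listed on this slide can be written as a single predicate. The sketch below is illustrative only: the `Access` record and its `phase` field (used here as a stand-in for "no intermediate synchronization") are assumptions, not part of the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """Hypothetical record of one memory access."""
    thread: int
    location: str
    is_write: bool
    phase: int  # assumption: accesses in the same phase are not separated by synchronization


def may_race(a, b):
    """True if the two accesses satisfy all race conditions from the slide."""
    return (
        a.thread != b.thread            # two different threads
        and a.location == b.location    # same memory location
        and a.phase == b.phase          # can be simultaneous
        and (a.is_write or b.is_write)  # at least one access is a write
    )


write_x = Access(thread=0, location="x", is_write=True, phase=0)
read_x = Access(thread=1, location="x", is_write=False, phase=0)
read_x_later = Access(thread=1, location="x", is_write=False, phase=1)
```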
10 The Solution
11 A pointer analysis that takes into account the machine hierarchy can answer the preceding questions. For each variable, we want to know not only from which allocation sites the data could have originated, but also from which threads.
12 Related Work
Thread-aware pointer analysis has been done by others (Rugina and Rinard, Zhu and Hendren, Hicks, and others), but none of them did it for hierarchical, distributed machines. Data privacy and locality detection was previously done by Liblit, Aiken, and Yelick; their analysis uses constraint propagation and does not distinguish allocation sites.
13 The Implementation
14 Titanium
Titanium is a single program, multiple data (SPMD) dialect of Java: all threads execute the same program text. It is designed for distributed machines and has a global address space, so all threads can access all memory. At runtime, threads are grouped into processes; a thread shares a physical address space with some other threads, but not all of them.
15 Titanium Memory Hierarchy
Global memory is composed of a hierarchy: program, processes, threads. Locations can be thread-local (tlocal), process-local (plocal), or potentially in another process (global).
16 The Analysis
17 Approach
We define a small SPMD language based on Titanium. We produce a type system that accounts for the memory hierarchy. The analysis can handle an arbitrary number of levels, but we use three levels in this talk. We give an overview of the pointer analysis inference rules.
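18 Language Syntax
Types: $\tau ::= \mathrm{int} \mid \mathrm{ref}\ q\ \tau$
Qualifiers: $q ::= \mathrm{tlocal} \mid \mathrm{plocal} \mid \mathrm{global}$, ordered $\mathrm{tlocal} \sqsubset \mathrm{plocal} \sqsubset \mathrm{global}$
Expressions: $e ::= \mathrm{new}_l\ \tau$ (allocation) $\mid \mathrm{transmit}\ e_1\ \mathrm{from}\ e_2$ (communication) $\mid e_1 \leftarrow e_2$ (dereferencing assignment) $\mid \mathrm{convert}(e, q)$ (type conversion)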
19 Type Rules: Allocation
The expression $\mathrm{new}_l\ \tau$ allocates space of type $\tau$ in local memory and returns a reference to the location. The label $l$ is unique for each allocation site and will be used by the pointer analysis. The resulting reference is qualified with tlocal, since it references thread-local memory. Example: on thread 0, $\mathrm{new}_l\ \mathrm{int}$ yields a tlocal reference.
$$\Gamma \vdash \mathrm{new}_l\ \tau : \mathrm{ref}\ \mathrm{tlocal}\ \tau$$
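20 Type Rules: Communication
The expression $\mathrm{transmit}\ e_1\ \mathrm{from}\ e_2$ evaluates $e_1$ on the thread given by $e_2$ and retrieves the result. If $e_1$ has reference type, the result type must be widened to global: the source thread is not known statically, so we must assume it can be any thread. Example: thread 0 evaluates transmit y from 1 and receives a global reference.
$$\frac{\Gamma \vdash e_1 : \tau \qquad \Gamma \vdash e_2 : \mathrm{int}}{\Gamma \vdash \mathrm{transmit}\ e_1\ \mathrm{from}\ e_2 : \mathrm{expand}(\tau, \mathrm{global})}$$
$$\mathrm{expand}(\mathrm{ref}\ q\ \tau, q') \equiv \mathrm{ref}\ \sqcup(q, q')\ \tau \qquad \mathrm{expand}(\tau, q') \equiv \tau \text{ otherwise}$$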
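The expand function in the communication rule can be sketched in a few lines. This is an illustration under assumptions: the `Q`, `Int`, and `Ref` classes are hypothetical encodings of the talk's qualifiers and types, and the join of two qualifiers is modeled as `max` over an integer ordering tlocal < plocal < global.

```python
from dataclasses import dataclass
from enum import IntEnum


class Q(IntEnum):
    """Qualifiers, ordered from nearest to farthest."""
    TLOCAL = 1
    PLOCAL = 2
    GLOBAL = 3


@dataclass(frozen=True)
class Int:
    """The base type int."""


@dataclass(frozen=True)
class Ref:
    """The type 'ref q target'."""
    q: Q
    target: object


def expand(t, q):
    """Widen the outermost qualifier of a reference type; leave other types alone."""
    if isinstance(t, Ref):
        return Ref(max(t.q, q), t.target)
    return t


def transmit_type(t):
    """Result type of 'transmit e1 from e2' when e1 has type t."""
    return expand(t, Q.GLOBAL)


widened = transmit_type(Ref(Q.TLOCAL, Int()))
unchanged = transmit_type(Int())
```

Only the outermost qualifier is widened in this sketch, matching the two-case definition on the slide.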
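21 Type Rules: Dereferencing Assignment
The expression $e_1 \leftarrow e_2$ puts the value of $e_2$ into the location referenced by $e_1$ (like `*e1 = e2` in C). Some assignments are unsound: for example, storing a tlocal reference through a plocal reference could place a thread-local pointer in a location that belongs to another thread.
$$\frac{\Gamma \vdash e_1 : \mathrm{ref}\ q\ \tau \qquad \Gamma \vdash e_2 : \tau \qquad \mathrm{robust}(\tau, q)}{\Gamma \vdash e_1 \leftarrow e_2 : \mathrm{ref}\ q\ \tau}$$
$$\mathrm{robust}(\mathrm{ref}\ q\ \tau, q') \equiv \mathrm{false} \text{ if } q \sqsubset q' \qquad \mathrm{robust}(\tau, q') \equiv \mathrm{true} \text{ otherwise}$$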
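The robust check that guards dereferencing assignment can be sketched as follows. The encoding is an assumption made for illustration: `Q`, `Int`, `Ref`, and `assignment_is_well_typed` are hypothetical names, and the qualifier order is modeled with integers.

```python
from dataclasses import dataclass
from enum import IntEnum


class Q(IntEnum):
    """Qualifiers, ordered from nearest to farthest."""
    TLOCAL = 1
    PLOCAL = 2
    GLOBAL = 3


@dataclass(frozen=True)
class Int:
    """The base type int."""


@dataclass(frozen=True)
class Ref:
    """The type 'ref q target'."""
    q: Q
    target: object


def robust(t, q):
    """A reference type is not robust if its qualifier is strictly narrower than q."""
    if isinstance(t, Ref) and t.q < q:
        return False
    return True


def assignment_is_well_typed(target_type, value_type):
    """Check 'e1 <- e2' where e1 has type target_type and e2 has type value_type."""
    return (
        isinstance(target_type, Ref)
        and target_type.target == value_type
        and robust(value_type, target_type.q)
    )


tlocal_ref = Ref(Q.TLOCAL, Int())
plocal_ref = Ref(Q.PLOCAL, Int())
```

For instance, writing a `tlocal_ref` value through a plocal reference is rejected, while writing it through a tlocal reference is accepted.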
22 Type Rules: Type Conversion
The expression $\mathrm{convert}(e, q)$ is an assertion that $e$ refers to data that is no further than $q$. Titanium code often checks whether data is plocal and then casts it to plocal before operating on it, for efficiency.
$$\frac{\Gamma \vdash e : \mathrm{ref}\ q'\ \tau}{\Gamma \vdash \mathrm{convert}(e, q) : \mathrm{ref}\ q\ \tau}$$
23 Pointer Analysis
Since the language is SPMD, the analysis is only done for a single thread; we use thread 0 in our examples. Each expression has a points-to set of abstract locations that it can reference. Abstract locations also have points-to sets.
24 Abstract Locations
An abstract location (a-loc) consists of a label and a qualifier. The a-loc $(l, q)$ can refer to any concrete location allocated at label $l$ that is at most distance $q$ from thread 0.
[Figure: threads 0 and 1 each execute $\mathrm{new}_l\ \mathrm{int}$, producing tlocal references; the a-locs shown are $(l, \mathrm{tlocal})$ and $(l, \mathrm{plocal})$.]
25 Pointer Analysis: Allocation and Communication
The inference rules for allocation and communication are similar to the type rules. An allocation $\mathrm{new}_l\ \tau$ produces a new abstract location $(l, \mathrm{tlocal})$. The result of the expression $\mathrm{transmit}\ e_1\ \mathrm{from}\ e_2$ is the set of a-locs resulting from $e_1$, but with global qualifiers.
$e_1 \to \{(l_1, \mathrm{tlocal}), (l_2, \mathrm{plocal}), (l_3, \mathrm{global})\}$
$\mathrm{transmit}\ e_1\ \mathrm{from}\ e_2 \to \{(l_1, \mathrm{global}), (l_2, \mathrm{global}), (l_3, \mathrm{global})\}$
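The allocation and communication rules on points-to sets can be sketched directly. The representation below is an assumption for illustration: an a-loc is a `(label, qualifier)` tuple, and `allocate` and `transmit` are hypothetical helper names.

```python
from enum import IntEnum


class Q(IntEnum):
    """Qualifiers, ordered from nearest to farthest."""
    TLOCAL = 1
    PLOCAL = 2
    GLOBAL = 3


def allocate(label):
    """Points-to set of 'new_l tau': a single thread-local a-loc."""
    return {(label, Q.TLOCAL)}


def transmit(pts):
    """Points-to set of 'transmit e1 from e2': same labels, global qualifiers."""
    return {(label, Q.GLOBAL) for (label, _) in pts}


# The example from the slide.
e1 = {("l1", Q.TLOCAL), ("l2", Q.PLOCAL), ("l3", Q.GLOBAL)}
result = transmit(e1)
```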
26 Pointer Analysis: Dereferencing Assignment
For assignment, the analysis must take into account the actions of other threads.
[Figure: the assignment as executed on threads 0, 1, and 2, each with its own x and y.]
$x \to \{(l_1, \mathrm{tlocal})\}$, $y \to \{(l_2, \mathrm{plocal})\}$
$x \leftarrow y$: $(l_1, \mathrm{tlocal}) \to (l_2, \mathrm{plocal})$, $(l_1, \mathrm{plocal}) \to (l_2, \mathrm{plocal})$, $(l_1, \mathrm{global}) \to (l_2, \mathrm{global})$
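One way to reproduce the slide's example is sketched below. This is a reading of the example, not necessarily the exact inference rule from the talk: for each target a-loc, every qualifier at least as wide as the target's is updated, and the stored a-loc is widened to at least that qualifier. The `assign` helper and the dictionary-based heap are hypothetical.

```python
from enum import IntEnum


class Q(IntEnum):
    """Qualifiers, ordered from nearest to farthest."""
    TLOCAL = 1
    PLOCAL = 2
    GLOBAL = 3


def assign(target_pts, source_pts, heap):
    """Update a-loc points-to sets for 'e1 <- e2' (sketch consistent with the slide)."""
    for (l1, q1) in target_pts:
        for q in Q:
            if q < q1:
                continue  # only views of l1 at distance q1 or farther are affected
            for (l2, q2) in source_pts:
                # From farther away, the stored location is at least as far as q.
                heap.setdefault((l1, q), set()).add((l2, max(q, q2)))
    return heap


# The example from the slide: x -> {(l1, tlocal)}, y -> {(l2, plocal)}.
x = {("l1", Q.TLOCAL)}
y = {("l2", Q.PLOCAL)}
heap = assign(x, y, {})
```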
27 Pointer Analysis: Type Conversion
In the type conversion $\mathrm{convert}(e, q)$, the program is illegal if $e$ evaluates to a location further than $q$. Thus, the result of the expression $\mathrm{convert}(e, q)$ is the set of a-locs resulting from $e$ with the qualifiers reduced to at most $q$.
$e \to \{(l_1, \mathrm{tlocal}), (l_2, \mathrm{plocal}), (l_3, \mathrm{global})\}$
$\mathrm{convert}(e, \mathrm{plocal}) \to \{(l_1, \mathrm{tlocal}), (l_2, \mathrm{plocal}), (l_3, \mathrm{plocal})\}$
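The conversion rule on points-to sets amounts to clamping each qualifier. A minimal sketch, assuming a-locs are `(label, qualifier)` tuples and using a hypothetical helper named `convert`:

```python
from enum import IntEnum


class Q(IntEnum):
    """Qualifiers, ordered from nearest to farthest."""
    TLOCAL = 1
    PLOCAL = 2
    GLOBAL = 3


def convert(pts, q):
    """Points-to set of 'convert(e, q)': qualifiers reduced to at most q."""
    return {(label, min(q0, q)) for (label, q0) in pts}


# The example from the slide.
e = {("l1", Q.TLOCAL), ("l2", Q.PLOCAL), ("l3", Q.GLOBAL)}
converted = convert(e, Q.PLOCAL)
```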
28 Evaluation
29 Benchmarks
Five application benchmarks were used to evaluate the pointer analysis.
Benchmark  Line Count  Description
amr        7581        Adaptive mesh refinement suite
gas        8841        Hyperbolic solver for a gas dynamics problem
ft         1192        NAS Fourier transform benchmark
cg         1595        NAS conjugate gradient benchmark
mg         1952        NAS multigrid benchmark
30 Running Time
Goal: determine the actual cost of introducing multiple levels into the pointer analysis. Tests were run on a 2.4GHz Pentium 4 with 512MB RAM. Three analysis variants were compared.
Name  Description
PA1   Single-level pointer analysis
PA2   Two-level pointer analysis (thread-local and global)
PA3   Three-level pointer analysis
31 Running Time Results
[Figure: Pointer Analysis Running Time. Analysis time in seconds for PA1, PA2, and PA3 on the benchmarks amr, gas, ft, cg, and mg.]
32 Data Privacy Detection
In pointer analysis, an allocation site is private if only thread-local references to it are used. Thus, only two levels, thread-local and global, are needed in the pointer analysis. Two types of analysis were compared.
Name  Description
SQI   Constraint-based analysis by Liblit, Aiken, and Yelick; does not distinguish allocation sites
PA2   Two-level pointer analysis (thread-local and global)
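The privacy criterion can be sketched on top of two-level points-to results. This is an illustration under assumptions: `private_sites` is a hypothetical helper, a-locs are `(label, qualifier)` tuples with string qualifiers, and the input is taken to be every points-to set the analysis produced.

```python
def private_sites(points_to_sets):
    """Return labels that are only ever referenced through tlocal a-locs."""
    seen = set()
    shared = set()
    for pts in points_to_sets:
        for (label, qualifier) in pts:
            seen.add(label)
            if qualifier != "tlocal":
                # A global a-loc for this label means the data may escape its thread.
                shared.add(label)
    return seen - shared


# Hypothetical analysis results for three variables.
results = [
    {("l1", "tlocal")},
    {("l2", "tlocal"), ("l3", "global")},
    {("l2", "global")},
]
private = private_sites(results)
```

Here `l1` is the only site that never appears with a global qualifier, so it is the only one reported as private.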
33 Data Privacy Detection Results
[Figure: Data Privacy Detection. Percent determined to be thread-private for SQI and PA2 on the benchmarks amr, gas, ft, cg, and mg.]
34 Data Locality Detection
Goal: statically determine which pointers must be process-local. Three analyses were compared.
Name  Description
LQI   Constraint-based analysis by Liblit and Aiken; does not distinguish allocation sites
PA2   Two-level pointer analysis (thread-local and global)
PA3   Three-level pointer analysis
35 Data Locality Detection Results
[Figure: Data Locality Detection. Percent determined to be process-local for LQI, PA2, and PA3 on the benchmarks amr, gas, ft, cg, and mg.]
36 Race Detection
Pointer analysis is used with an existing concurrency analysis to detect potential races at compile time. Three analyses were compared.
Name        Description
concur      Concurrency analysis plus constraint-based data sharing analysis and type-based alias analysis
concur+pa1  Concurrency analysis plus single-level pointer analysis
concur+pa3  Concurrency analysis plus three-level pointer analysis
37 Race Detection Results
[Figure: Static Races Detected. Races (logarithmic scale) for concur, concur+pa1, and concur+pa3 on the benchmarks amr, gas, ft, cg, and mg.]
38 Conclusion
39 Conclusion
We developed a pointer analysis for hierarchical, distributed machines. The cost of introducing the memory hierarchy into the analysis is small. On the other hand, the payoff is large.
40 Future Work
Scientific programs tend to use a lot of array-based data structures, so an array index analysis is needed to analyze them properly. Implement a dynamic race detector, using the static results to minimize the program locations that need to be tracked.
41 Questions
More informationCloning-Based Context-Sensitive Pointer Alias Analysis using BDDs
More Pointer Analysis Last time Flow-Insensitive Pointer Analysis Inclusion-based analysis (Andersen) Today Class projects Context-Sensitive analysis March 3, 2014 Flow-Insensitive Pointer Analysis 1 John
More informationDEVELOPING AN OPTIMIZED UPC COMPILER FOR FUTURE ARCHITECTURES
DEVELOPING AN OPTIMIZED UPC COMPILER FOR FUTURE ARCHITECTURES Tarek El-Ghazawi, François Cantonnet, Yiyi Yao Department of Electrical and Computer Engineering The George Washington University tarek@gwu.edu
More informationMultithreaded Architectures and The Sort Benchmark. Phil Garcia Hank Korth Dept. of Computer Science and Engineering Lehigh University
Multithreaded Architectures and The Sort Benchmark Phil Garcia Hank Korth Dept. of Computer Science and Engineering Lehigh University About our Sort Benchmark Based on the benchmark proposed in A measure
More informationStatic and Dynamic Program Analysis: Synergies and Applications
Static and Dynamic Program Analysis: Synergies and Applications Mayur Naik Intel Labs, Berkeley CS 243, Stanford University March 9, 2011 Today s Computing Platforms Trends: parallel cloud mobile Traits:
More informationRun-Time Environments/Garbage Collection
Run-Time Environments/Garbage Collection Department of Computer Science, Faculty of ICT January 5, 2014 Introduction Compilers need to be aware of the run-time environment in which their compiled programs
More informationOutline. Issues with the Memory System Loop Transformations Data Transformations Prefetching Alias Analysis
Memory Optimization Outline Issues with the Memory System Loop Transformations Data Transformations Prefetching Alias Analysis Memory Hierarchy 1-2 ns Registers 32 512 B 3-10 ns 8-30 ns 60-250 ns 5-20
More informationSt. MARTIN S ENGINEERING COLLEGE Dhulapally, Secunderabad
St. MARTIN S ENGINEERING COLLEGE Dhulapally, Secunderabad-00 014 Subject: PPL Class : CSE III 1 P a g e DEPARTMENT COMPUTER SCIENCE AND ENGINEERING S No QUESTION Blooms Course taxonomy level Outcomes UNIT-I
More informationApplication and System Memory Use, Configuration, and Problems on Bassi. Richard Gerber
Application and System Memory Use, Configuration, and Problems on Bassi Richard Gerber Lawrence Berkeley National Laboratory NERSC User Services ScicomP 13, Garching, Germany, July 17, 2007 NERSC is supported
More informationConcurrent Models of Computation
Concurrent Models of Computation Edward A. Lee Robert S. Pepper Distinguished Professor, UC Berkeley EECS 219D Concurrent Models of Computation Fall 2011 Copyright 2009-2011, Edward A. Lee, All rights
More informationSeptember 10,
September 10, 2013 1 Bjarne Stroustrup, AT&T Bell Labs, early 80s cfront original C++ to C translator Difficult to debug Potentially inefficient Many native compilers exist today C++ is mostly upward compatible
More informationTitanium Language Reference Manual, version 2.19
Titanium Language Reference Manual, version 2.19 Paul N. Hilfinger Dan Oscar Bonachea Kaushik Datta David Gay Susan L. Graham Benjamin Robert Liblit Geoffrey Pike Jimmy Zhigang Su Katherine A. Yelick Electrical
More informationThe Checker Framework: pluggable static analysis for Java
The Checker Framework: pluggable static analysis for Java http://checkerframework.org/ Werner Dietl University of Waterloo https://ece.uwaterloo.ca/~wdietl/ Joint work with Michael D. Ernst and many others.
More informationAutomatic Scaling Iterative Computations. Aug. 7 th, 2012
Automatic Scaling Iterative Computations Guozhang Wang Cornell University Aug. 7 th, 2012 1 What are Non-Iterative Computations? Non-iterative computation flow Directed Acyclic Examples Batch style analytics
More informationEfficient AMG on Hybrid GPU Clusters. ScicomP Jiri Kraus, Malte Förster, Thomas Brandes, Thomas Soddemann. Fraunhofer SCAI
Efficient AMG on Hybrid GPU Clusters ScicomP 2012 Jiri Kraus, Malte Förster, Thomas Brandes, Thomas Soddemann Fraunhofer SCAI Illustration: Darin McInnis Motivation Sparse iterative solvers benefit from
More informationEvaluating Titanium SPMD Programs on the Tera MTA
Evaluating Titanium SPMD Programs on the Tera MTA Carleton Miyamoto, Chang Lin {miyamoto,cjlin}@cs.berkeley.edu EECS Computer Science Division University of California, Berkeley Abstract Coarse grained
More information