Hierarchical Pointer Analysis

Size: px
Start display at page:

Download "Hierarchical Pointer Analysis"

Transcription

1 for Distributed Programs Amir Kamil and Katherine Yelick U.C. Berkeley August 23, Amir Kamil

2 Background 2 Amir Kamil

3 Hierarchical Machines Parallel machines often have hierarchical structure 4 A B D 1 2 C 3 level 1 (thread local) level 2 (node local) level 3 (cluster local) level 4 (grid world) 3 Amir Kamil

4 Partitioned Global Address Space Partitioned global address space (PGAS) languages provide the illusion of shared memory across the machine Wide pointers used to represent global addresses Contain identifying information plus the physical address Process ID: 1 Address: 0xf9a0cb48 Narrow pointers can still be used for addresses in the local physical address space Address: 0xf9a0cb48 4 Amir Kamil

5 The Problems 5 Amir Kamil

6 Three Problems What data is private to a thread? What data is local to the physical address space? What possible race conditions can occur? 6 Amir Kamil

7 Data Privacy Data is private if it cannot leak beyond its source thread Useful to know which data is private for global garbage collection, monitor optimization, and other applications 7 Amir Kamil

8 Data Locality Recall: global pointers composed identifying information and an address Process ID: 1 Address: 0xf9a0cb48 0 When dereferenced, runtime system must perform a check to determine if the data is actually in the local physical address space If local, l then access directly If not local, then perform communication Thus, global l pointers are more costly in both space and time, even if the actual data is local 8 Amir Kamil

9 Race Detection Shared memory introduces the possibility of race conditions Two threads access the same memory location The accesses can be simultaneous (no intermediate synchronization) At least one access is a write 9 Amir Kamil

10 The Solution 10 Amir Kamil

11 A pointer analysis that takes into account the machine hierarchy can answer the preceding questions For each variable, we want to know not only from which allocation sites the data could have originated, but also from which threads 11 Amir Kamil

12 Related Work Thread-aware pointer analysis has been done by others Rugina and Rinard, Zhu and Hendren, Hicks, and others None of them did it for hierarchical, distributed machines Data privacy and locality detection previously done by Liblit, Aiken, and Yelick Uses constraint propagation Does not distinguish allocation sites 12 Amir Kamil

13 The Implementation 13 Amir Kamil

14 Titanium Titanium is a single program, multiple data (SPMD) dialect of Java All threads execute the same program text Designed for distributed machines Global l address space all threads can access all memory At runtime, threads are grouped into processes A thread shares a physical address space with some other, but not all threads 14 Amir Kamil

15 Titanium Memory Hierarchy Global memory is composed of a hierarchy Program Processes Locations can be thread-local (tlocal), processlocal (plocal), or potentially in another process (global) Threads global 3 tlocal plocal 15 Amir Kamil

16 The Analysis 16 Amir Kamil

17 Approach We define a small SPMD language based on Titanium We produce a type system that accounts for the memory hierarchy The analysis can handle an arbitrary number of levels, but we use three levels in this talk We give an overview of the pointer analysis inference rules 17 Amir Kamil

18 Language Syntax Types τ ::= int ref q τ Qualifiers q ::= tlocal plocal global (tlocal l plocal l global) l) Expressions e ::= new l τ (allocation) transmite 1 from e 2 (communication) e 1 e 2 (dereferencing assignment) convert(e, n) (type conversion) 18 Amir Kamil

19 Type Rules Allocation The expression new l τ allocates space of type τ in local memory and returns a reference to the location The label l is unique for each allocation site and will be used by the pointer analysis The resulting reference is qualified with tlocal, since it references thread-local memory Thread 0 new l int tlocall Γ new l τ : ref tlocal τ 19 Amir Kamil

20 Type Rules Communication The expression transmit e 1 from e 2 evaluates e 1 on the thread given by e 2 and retrieves the result If e 1 has reference type, the result type must be widened to global Statically do not know source thread, so must assume it can be any thread Γ e 1 : τ Γ e 2 :int Thread 0 Thread 1 transmit y from 1 global y Γ transmit e 1 from e 2 : tlocal expand(τ, global) expand(ref q τ, q ) ref (q, q ) τ expand(τ, q ) τ otherwise 20 Amir Kamil

21 Type Rules Dereferencing Assignment The expression e 1 e 2 puts the value of e 2 into the location referenced by e 1 (like *e 1 = e 2 in C) Some assignments are unsound Γ e 1 : ref q τ Γ e 2 : τ robust(τ, q) Thread 0 Thread 1 Γ e 1 e 2 :ref q τ plocal y l l l z tlocal tlocal plocal robust(ref q τ, τ q ) false if q q robust(τ, q ) true otherwise 21 Amir Kamil

22 Type Rules Type Conversion The expression convert(e, q) is an assertion that e refers to data that is no further than q Titanium code often checks if data is plocal and then casts to it before operating on it for efficiency Thread 0 x global Γ e : ref q τ Γ convert(e, q) : ref q τ 22 Amir Kamil

23 Pointer Analysis Since language is SPMD, analysis is only done for a single thread We use thread 0 in our examples Each expression has a points-to set of abstract locations that it can reference Abstract locations also have points-to sets 23 Amir Kamil

24 Abstract Locations Abstract locations consist of label and qualifier A-loc (l, q) can refer to any concrete location allocated at label l that is at most distance q from thread 0 (l, tlocal) Thread 0 Thread 1 new l int new l int (l, plocal) l) tlocal tlocal 24 Amir Kamil

25 Pointer Analysis Allocation and Communication The inference rules for allocation and communication are similar to the type rules An allocation new l τ produces a new abstract location (l, tlocal) The result of the expression transmit e 1 from e 2 is the set of a-locs resulting from e 1 but with global qualifiers e 1 {(l 1, tlocal), (l 2, plocal), (l 3, global)} transmit e 1 from e 2 {(l 1, global), (l 2, global), (l 3, global)} 25 Amir Kamil

26 Pointer Analysis Dereferencing Assignment For assignment, must take into account actions of other threads x Thread 0 Thread 1 Thread 2 (l 1, tlocal) x (l 1, plocal) x (l 1, plocal) (l 2, tlocal) (l 2, plocal) (l 2, plocal) y y y x {(l 1, tlocal)}, l)} y {(l 2, plocal)} x y : (l 1, tlocal) (l 2, plocal), (l 1, plocal) (l 2, plocal), (l 1, global) lbl) (l 2, global) lbl) 26 Amir Kamil

27 Pointer Analysis Type Conversion Inthetypeconversionconvert(e type convert(e, q), the program is illegal if e evaluates to a location further than q Thus, the result of the expression convert(e, q) is the set of a-locs resulting from e with the qualifiers reduced to at most q e {(l 1, tlocal), (l 2, plocal), (l 3, global)} convert(e, plocal) l) {(l 1, tlocal), l) (l 2, plocal), l) (l 3, plocal)} l)} 27 Amir Kamil

28 Evaluation 28 Amir Kamil

29 Benchmarks Five application benchmarks used to evaluate the pointer analysis Benchmark Line Count Description amr 7581 Adaptive mesh refinement suite gas 8841 Hyperbolic solver for a gas dynamics problem ft 1192 NAS Fourier transform benchmark cg 1595 NAS conjugate gradient benchmark mg 1952 NAS multigrid benchmark 29 Amir Kamil

30 Running Time Determine actual cost of introducing multiple levels into the pointer analysis Tests run on 2.4GHz Pentium 4 with 512MB RAM Three analysis variants compared Name PA1 PA2 PA3 Description Single-level pointer analysis Two-level pointer analysis (thread-local and global) Three-level pointer analysis 30 Amir Kamil

31 Running Time Results 4 Pointer Analysis Running Time PA1 PA2 PA3 3.5 Analysis s Time (sec conds) Good amr gas ft cg mg Benchmark 31 Amir Kamil

32 Data Privacy Detection In pointer analysis, an allocation site is private if only thread-local references to it are used Thus, only two levels, thread-local and global, needed in the pointer analysis Two types of analysis compared Name SQI PA2 Description Constraint-based analysis by Liblit, Aiken, and Yelick; does not distinguish allocation sites Two-level pointer analysis (thread-local and global) 32 Amir Kamil

33 Perc cent Determ mined to be Thread-P Private Data Privacy Detection Results Data Privacy Detection SQI PA2 amr gas ft cg mg Benchmark Good 33 Amir Kamil

34 Data Locality Detection Goal: statically determine which pointers must be process-local Three analyses compared Name LQI PA2 PA3 Description Constraint-based analysis by Liblit and Aiken; does not distinguish allocation sites Two-level pointer analysis (thread-local and global) Three-level pointer analysis 34 Amir Kamil

35 Perc cent Determ mined to be Process- -Local Data Locality Detection Results Data Locality Detection LQI PA2 PA3 Good 0 amr gas ft cg mg Benchmark 35 Amir Kamil

36 Race Detection Pointer analysis used with an existing concurrency analysis to detect potential races at compile-time Three analyses compared Name concur concur+pa1 concur+pa3 Description Concurrency analysis plus constraint-based data sharing analysis and type-based alias analysis Concurrency analysis plus single-level level pointer analysis Concurrency analysis plus three-level pointer analysis 36 Amir Kamil

37 Race Detection Results Static Races Detected concur concur+pa1 concur+pa3 Races (Lo ogarithmic Scale) Good 10 amr gas ft cg mg Benchmark 37 Amir Kamil

38 Conclusion 38 Amir Kamil

39 Conclusion We developed a pointer analysis for hierarchical, distributed machines The cost of introducing the memory hierarchy into the analysis is small On the other hand, the payoff is large 39 Amir Kamil

40 Future Work Scientific programs tend to use a lot of arraybased data structures Need array index analysis to properly analyze them Implement a dynamic race detector Use static results to minimize the program locations that need to be tracked 40 Amir Kamil

41 Questions 41 Amir Kamil

Hierarchical Pointer Analysis for Distributed Programs

Hierarchical Pointer Analysis for Distributed Programs Hierarchical Pointer Analysis for Distributed Programs Amir Kamil Computer Science Division, University of California, Berkeley kamil@cs.berkeley.edu April 14, 2006 1 Introduction Many distributed, parallel

More information

Enforcing Textual Alignment of

Enforcing Textual Alignment of Parallel Hardware Parallel Applications IT industry (Silicon Valley) Parallel Software Users Enforcing Textual Alignment of Collectives using Dynamic Checks and Katherine Yelick UC Berkeley Parallel Computing

More information

A Local-View Array Library for Partitioned Global Address Space C++ Programs

A Local-View Array Library for Partitioned Global Address Space C++ Programs Lawrence Berkeley National Laboratory A Local-View Array Library for Partitioned Global Address Space C++ Programs Amir Kamil, Yili Zheng, and Katherine Yelick Lawrence Berkeley Lab Berkeley, CA, USA June

More information

Hierarchical Pointer Analysis for Distributed Programs

Hierarchical Pointer Analysis for Distributed Programs Hierarchical Pointer Analysis for Distributed Programs Amir Kamil and Katherine Yelick Computer Science Division, University of California, Berkeley {kamil,yelick}@cs.berkeley.edu Abstract. We present

More information

Optimization and Evaluation of a Titanium Adaptive Mesh Refinement Code

Optimization and Evaluation of a Titanium Adaptive Mesh Refinement Code Optimization and Evaluation of a Titanium Adaptive Mesh Refinement Code Amir Kamil Ben Schwarz Jimmy Su kamil@cs.berkeley.edu bschwarz@cs.berkeley.edu jimmysu@cs.berkeley.edu May 19, 2004 Abstract Adaptive

More information

Adaptive Mesh Refinement in Titanium

Adaptive Mesh Refinement in Titanium Adaptive Mesh Refinement in Titanium http://seesar.lbl.gov/anag Lawrence Berkeley National Laboratory April 7, 2005 19 th IPDPS, April 7, 2005 1 Overview Motivations: Build the infrastructure in Titanium

More information

Concurrency Analysis for Parallel Programs with Textually Aligned Barriers. 1 Introduction. 2 Motivation. 2.1 Static Race Detection

Concurrency Analysis for Parallel Programs with Textually Aligned Barriers. 1 Introduction. 2 Motivation. 2.1 Static Race Detection Concurrency Analysis for Parallel Programs with Textually Aligned Barriers Amir Kamil Katherine Yelick Computer Science Division, University of California, Berkeley {kamil,yelick}@cs.berkeley.edu Abstract.

More information

Efficient Data Race Detection for Unified Parallel C

Efficient Data Race Detection for Unified Parallel C P A R A L L E L C O M P U T I N G L A B O R A T O R Y Efficient Data Race Detection for Unified Parallel C ParLab Winter Retreat 1/14/2011" Costin Iancu, LBL" Nick Jalbert, UC Berkeley" Chang-Seo Park,

More information

Compiler and Runtime Support for Scaling Adaptive Mesh Refinement Computations in Titanium

Compiler and Runtime Support for Scaling Adaptive Mesh Refinement Computations in Titanium Compiler and Runtime Support for Scaling Adaptive Mesh Refinement Computations in Titanium Jimmy Zhigang Su Tong Wen Katherine A. Yelick Electrical Engineering and Computer Sciences University of California

More information

Titanium: A High-Performance Java Dialect

Titanium: A High-Performance Java Dialect Titanium: A High-Performance Java Dialect Kathy Yelick, Luigi Semenzato, Geoff Pike, Carleton Miyamoto, Ben Liblit, Arvind Krishnamurthy, Paul Hilfinger, Susan Graham, David Gay, Phil Colella, and Alex

More information

The NAS Parallel Benchmarks in Titanium. by Kaushik Datta. Research Project

The NAS Parallel Benchmarks in Titanium. by Kaushik Datta. Research Project The NAS Parallel Benchmarks in Titanium by Kaushik Datta Research Project Submitted to the Department of Electrical Engineering and Computer Sciences, University of California at Berkeley, in partial satisfaction

More information

Compilation Techniques for Partitioned Global Address Space Languages

Compilation Techniques for Partitioned Global Address Space Languages Compilation Techniques for Partitioned Global Address Space Languages Katherine Yelick U.C. Berkeley and Lawrence Berkeley National Lab http://titanium.cs.berkeley.edu http://upc.lbl.gov 1 Kathy Yelick

More information

The Hierarchical SPMD Programming Model

The Hierarchical SPMD Programming Model The Hierarchical SPMD Programming Model Amir Ashraf Kamil Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2011-28 http://www.eecs.berkeley.edu/pubs/techrpts/2011/eecs-2011-28.html

More information

Productivity and Performance Using Partitioned Global Address Space Languages

Productivity and Performance Using Partitioned Global Address Space Languages Productivity and Performance Using Partitioned Global Address Space Languages Katherine Yelick 1,2, Dan Bonachea 1,2, Wei-Yu Chen 1,2, Phillip Colella 2, Kaushik Datta 1,2, Jason Duell 1,2, Susan L. Graham

More information

Data Sharing Analysis for Titanium

Data Sharing Analysis for Titanium Data Sharing Analysis for Titanium Ben Liblit liblit@ cs. berkeley. edu Alex Aiken aiken@ cs. berkeley. edu Katherine Yelick yelick@ cs. berkeley. edu Report No. UCB//CSD-01-1165 November 2001 Computer

More information

[0569] p 0318 garbage

[0569] p 0318 garbage A Pointer is a variable which contains the address of another variable. Declaration syntax: Pointer_type *pointer_name; This declaration will create a pointer of the pointer_name which will point to the

More information

Static Analysis of Embedded C

Static Analysis of Embedded C Static Analysis of Embedded C John Regehr University of Utah Joint work with Nathan Cooprider Motivating Platform: TinyOS Embedded software for wireless sensor network nodes Has lots of SW components for

More information

A Context-Sensitive Memory Model for Verification of C/C++ Programs

A Context-Sensitive Memory Model for Verification of C/C++ Programs A Context-Sensitive Memory Model for Verification of C/C++ Programs Arie Gurfinkel and Jorge A. Navas University of Waterloo and SRI International SAS 17, August 30th, 2017 Gurfinkel and Navas (UWaterloo/SRI)

More information

Titanium Performance and Potential: an NPB Experimental Study

Titanium Performance and Potential: an NPB Experimental Study Performance and Potential: an NPB Experimental Study Kaushik Datta 1, Dan Bonachea 1 and Katherine Yelick 1,2 {kdatta,bonachea,yelick}@cs.berkeley.edu Computer Science Division, University of California

More information

Regions Review. A region is a (typed) collection. Regent: More on Regions. Regions are the cross product of. CS315B Lecture 5

Regions Review. A region is a (typed) collection. Regent: More on Regions. Regions are the cross product of. CS315B Lecture 5 Regions Review A region is a (typed) collection Regent: More on Regions CS315B Lecture 5 Regions are the cross product of An index space A field space So far we ve seen regions with N-dim index spaces

More information

Titanium. Titanium and Java Parallelism. Java: A Cleaner C++ Java Objects. Java Object Example. Immutable Classes in Titanium

Titanium. Titanium and Java Parallelism. Java: A Cleaner C++ Java Objects. Java Object Example. Immutable Classes in Titanium Titanium Titanium and Java Parallelism Arvind Krishnamurthy Fall 2004 Take the best features of threads and MPI (just like Split-C) global address space like threads (ease programming) SPMD parallelism

More information

Hierarchical Additions to the SPMD Programming Model

Hierarchical Additions to the SPMD Programming Model Hierarchical Additions to the SPMD Programming Model Amir Ashraf Kamil Katherine A. Yelick Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2012-20

More information

Compilation Techniques for Partitioned Global Address Space Languages

Compilation Techniques for Partitioned Global Address Space Languages Compilation Techniques for Partitioned Global Address Space Languages Katherine Yelick U.C. Berkeley and Lawrence Berkeley National Lab http://titanium.cs.berkeley.edu http://upc.lbl.gov 1 Kathy Yelick

More information

Lithe: Enabling Efficient Composition of Parallel Libraries

Lithe: Enabling Efficient Composition of Parallel Libraries Lithe: Enabling Efficient Composition of Parallel Libraries Heidi Pan, Benjamin Hindman, Krste Asanović xoxo@mit.edu apple {benh, krste}@eecs.berkeley.edu Massachusetts Institute of Technology apple UC

More information

Jaguar: Enabling Efficient Communication and I/O in Java

Jaguar: Enabling Efficient Communication and I/O in Java Jaguar: Enabling Efficient Communication and I/O in Java Matt Welsh and David Culler UC Berkeley Presented by David Hovemeyer Outline ' Motivation ' How it works ' Code mappings ' External objects ' Pre

More information

Parallel Languages: Past, Present and Future

Parallel Languages: Past, Present and Future Parallel Languages: Past, Present and Future Katherine Yelick U.C. Berkeley and Lawrence Berkeley National Lab 1 Kathy Yelick Internal Outline Two components: control and data (communication/sharing) One

More information

Cluster Computing Paul A. Farrell 9/15/2011. Dept of Computer Science Kent State University 1. Benchmarking CPU Performance

Cluster Computing Paul A. Farrell 9/15/2011. Dept of Computer Science Kent State University 1. Benchmarking CPU Performance Many benchmarks available MHz (cycle speed of processor) MIPS (million instructions per second) Peak FLOPS Whetstone Stresses unoptimized scalar performance, since it is designed to defeat any effort to

More information

Benchmarking CPU Performance. Benchmarking CPU Performance

Benchmarking CPU Performance. Benchmarking CPU Performance Cluster Computing Benchmarking CPU Performance Many benchmarks available MHz (cycle speed of processor) MIPS (million instructions per second) Peak FLOPS Whetstone Stresses unoptimized scalar performance,

More information

2. Reachability in garbage collection is just an approximation of garbage.

2. Reachability in garbage collection is just an approximation of garbage. symbol tables were on the first exam of this particular year's exam. We did not discuss register allocation in this This exam has questions from previous CISC 471/672. particular year. Not all questions

More information

CalFuzzer: An Extensible Active Testing Framework for Concurrent Programs Pallavi Joshi 1, Mayur Naik 2, Chang-Seo Park 1, and Koushik Sen 1

CalFuzzer: An Extensible Active Testing Framework for Concurrent Programs Pallavi Joshi 1, Mayur Naik 2, Chang-Seo Park 1, and Koushik Sen 1 CalFuzzer: An Extensible Active Testing Framework for Concurrent Programs Pallavi Joshi 1, Mayur Naik 2, Chang-Seo Park 1, and Koushik Sen 1 1 University of California, Berkeley, USA {pallavi,parkcs,ksen}@eecs.berkeley.edu

More information

CS527 Software Security

CS527 Software Security Security Policies Purdue University, Spring 2018 Security Policies A policy is a deliberate system of principles to guide decisions and achieve rational outcomes. A policy is a statement of intent, and

More information

Hierarchical Computation in the SPMD Programming Model

Hierarchical Computation in the SPMD Programming Model Hierarchical Computation in the SPMD Programming Model Amir Kamil Katherine Yelick Computer Science Division, University of California, Berkeley {kamil,yelick@cs.berkeley.edu Abstract. Large-scale parallel

More information

Escape Analysis. Applications to ML and Java TM

Escape Analysis. Applications to ML and Java TM Escape Analysis. Applications to ML and Java TM Bruno Blanchet INRIA Rocquencourt Bruno.Blanchet@inria.fr December 2000 Overview 1. Introduction: escape analysis and applications. 2. Escape analysis 2.a

More information

CSE 403: Software Engineering, Fall courses.cs.washington.edu/courses/cse403/16au/ Static Analysis. Emina Torlak

CSE 403: Software Engineering, Fall courses.cs.washington.edu/courses/cse403/16au/ Static Analysis. Emina Torlak CSE 403: Software Engineering, Fall 2016 courses.cs.washington.edu/courses/cse403/16au/ Static Analysis Emina Torlak emina@cs.washington.edu Outline What is static analysis? How does it work? Free and

More information

Chapter 1 INTRODUCTION SYS-ED/ COMPUTER EDUCATION TECHNIQUES, INC.

Chapter 1 INTRODUCTION SYS-ED/ COMPUTER EDUCATION TECHNIQUES, INC. hapter 1 INTRODUTION SYS-ED/ OMPUTER EDUATION TEHNIQUES, IN. Objectives You will learn: Java features. Java and its associated components. Features of a Java application and applet. Java data types. Java

More information

A Comparison of Unified Parallel C, Titanium and Co-Array Fortran. The purpose of this paper is to compare Unified Parallel C, Titanium and Co-

A Comparison of Unified Parallel C, Titanium and Co-Array Fortran. The purpose of this paper is to compare Unified Parallel C, Titanium and Co- Shaun Lindsay CS425 A Comparison of Unified Parallel C, Titanium and Co-Array Fortran The purpose of this paper is to compare Unified Parallel C, Titanium and Co- Array Fortran s methods of parallelism

More information

CIRM - Dynamic Error Detection

CIRM - Dynamic Error Detection CIRM - Dynamic Error Detection Peter Pirkelbauer Center for Applied Scientific Computing (CASC) Lawrence Livermore National Laboratory This work was funded by the Department of Defense and used elements

More information

Field Analysis. Last time Exploit encapsulation to improve memory system performance

Field Analysis. Last time Exploit encapsulation to improve memory system performance Field Analysis Last time Exploit encapsulation to improve memory system performance This time Exploit encapsulation to simplify analysis Two uses of field analysis Escape analysis Object inlining April

More information

C PGAS XcalableMP(XMP) Unified Parallel

C PGAS XcalableMP(XMP) Unified Parallel PGAS XcalableMP Unified Parallel C 1 2 1, 2 1, 2, 3 C PGAS XcalableMP(XMP) Unified Parallel C(UPC) XMP UPC XMP UPC 1 Berkeley UPC GASNet 1. MPI MPI 1 Center for Computational Sciences, University of Tsukuba

More information

Scaling Abstraction Refinement via Pruning. PLDI - San Jose, CA June 8, 2011

Scaling Abstraction Refinement via Pruning. PLDI - San Jose, CA June 8, 2011 Scaling Abstraction Refinement via Pruning PLDI - San Jose, CA June 8, 2011 Percy Liang UC Berkeley Mayur Naik Intel Labs Berkeley The Big Picture Program 2 The Big Picture Program Static Analysis 2 The

More information

Lecture08: Scope and Lexical Address

Lecture08: Scope and Lexical Address Lecture08: Scope and Lexical Address Free and Bound Variables (EOPL 1.3.1) Given an expression E, does a particular variable reference x appear free or bound in that expression? Definition: A variable

More information

Grade Weights. Language Design and Overview of COOL. CS143 Lecture 2. Programming Language Economics 101. Lecture Outline

Grade Weights. Language Design and Overview of COOL. CS143 Lecture 2. Programming Language Economics 101. Lecture Outline Grade Weights Language Design and Overview of COOL CS143 Lecture 2 Project 0% I, II 10% each III, IV 1% each Midterm 1% Final 2% Written Assignments 10% 2.% each Prof. Aiken CS 143 Lecture 2 1 Prof. Aiken

More information

CSE 307: Principles of Programming Languages

CSE 307: Principles of Programming Languages CSE 307: Principles of Programming Languages Variables and Constants R. Sekar 1 / 22 Topics 2 / 22 Variables and Constants Variables are stored in memory, whereas constants need not be. Value of variables

More information

Run-time Environments -Part 3

Run-time Environments -Part 3 Run-time Environments -Part 3 Y.N. Srikant Computer Science and Automation Indian Institute of Science Bangalore 560 012 NPTEL Course on Compiler Design Outline of the Lecture Part 3 What is run-time support?

More information

Short-term Memory for Self-collecting Mutators. Martin Aigner, Andreas Haas, Christoph Kirsch, Ana Sokolova Universität Salzburg

Short-term Memory for Self-collecting Mutators. Martin Aigner, Andreas Haas, Christoph Kirsch, Ana Sokolova Universität Salzburg Short-term Memory for Self-collecting Mutators Martin Aigner, Andreas Haas, Christoph Kirsch, Ana Sokolova Universität Salzburg CHESS Seminar, UC Berkeley, September 2010 Heap Management explicit heap

More information

The Operating System. Chapter 6

The Operating System. Chapter 6 The Operating System Machine Level Chapter 6 1 Contemporary Multilevel Machines A six-level l computer. The support method for each level is indicated below it.2 Operating System Machine a) Operating System

More information

Code Generators for Stencil Auto-tuning

Code Generators for Stencil Auto-tuning Code Generators for Stencil Auto-tuning Shoaib Kamil with Cy Chan, Sam Williams, Kaushik Datta, John Shalf, Katherine Yelick, Jim Demmel, Leonid Oliker Diagnosing Power/Performance Correctness Where this

More information

Global Address Space Programming in Titanium

Global Address Space Programming in Titanium Global Address Space Programming in Titanium Kathy Yelick CS267 Titanium 1 Titanium Goals Performance close to C/FORTRAN + MPI or better Safety as safe as Java, extended to parallel framework Expressiveness

More information

Halfway! Sequoia. A Point of View. Sequoia. First half of the course is over. Now start the second half. CS315B Lecture 9

Halfway! Sequoia. A Point of View. Sequoia. First half of the course is over. Now start the second half. CS315B Lecture 9 Halfway! Sequoia CS315B Lecture 9 First half of the course is over Overview/Philosophy of Regent Now start the second half Lectures on other programming models Comparing/contrasting with Regent Start with

More information

Managing Hierarchy with Teams in the SPMD Programming Model

Managing Hierarchy with Teams in the SPMD Programming Model Lawrence Berkeley National Laboratory Managing Hierarchy with Teams in the SPMD Programming Model Amir Kamil Lawrence Berkeley Lab Berkeley, CA, USA April 28, 2014 1 Single Program, Multiple Data Model

More information

Static Analysis of Embedded C Code

Static Analysis of Embedded C Code Static Analysis of Embedded C Code John Regehr University of Utah Joint work with Nathan Cooprider Relevant features of C code for MCUs Interrupt-driven concurrency Direct hardware access Whole program

More information

Lecture Notes on Memory Layout

Lecture Notes on Memory Layout Lecture Notes on Memory Layout 15-122: Principles of Imperative Computation Frank Pfenning André Platzer Lecture 11 1 Introduction In order to understand how programs work, we can consider the functions,

More information

CHAPTER 6 Memory. CMPS375 Class Notes Page 1/ 16 by Kuo-pao Yang

CHAPTER 6 Memory. CMPS375 Class Notes Page 1/ 16 by Kuo-pao Yang CHAPTER 6 Memory 6.1 Memory 233 6.2 Types of Memory 233 6.3 The Memory Hierarchy 235 6.3.1 Locality of Reference 237 6.4 Cache Memory 237 6.4.1 Cache Mapping Schemes 239 6.4.2 Replacement Policies 247

More information

Garbage Collection Algorithms. Ganesh Bikshandi

Garbage Collection Algorithms. Ganesh Bikshandi Garbage Collection Algorithms Ganesh Bikshandi Announcement MP4 posted Term paper posted Introduction Garbage : discarded or useless material Collection : the act or process of collecting Garbage collection

More information

for Task-Based Parallelism

for Task-Based Parallelism for Task-Based Parallelism School of EEECS, Queen s University of Belfast July Table of Contents 2 / 24 Thread-Based Programming Hard for programmer to reason about thread interleavings and concurrent

More information

Parallel Programming Principle and Practice. Lecture 9 Introduction to GPGPUs and CUDA Programming Model

Parallel Programming Principle and Practice. Lecture 9 Introduction to GPGPUs and CUDA Programming Model Parallel Programming Principle and Practice Lecture 9 Introduction to GPGPUs and CUDA Programming Model Outline Introduction to GPGPUs and Cuda Programming Model The Cuda Thread Hierarchy / Memory Hierarchy

More information

Programming Languages Third Edition. Chapter 7 Basic Semantics

Programming Languages Third Edition. Chapter 7 Basic Semantics Programming Languages Third Edition Chapter 7 Basic Semantics Objectives Understand attributes, binding, and semantic functions Understand declarations, blocks, and scope Learn how to construct a symbol

More information

Bulk Synchronous and SPMD Programming. The Bulk Synchronous Model. CS315B Lecture 2. Bulk Synchronous Model. The Machine. A model

Bulk Synchronous and SPMD Programming. The Bulk Synchronous Model. CS315B Lecture 2. Bulk Synchronous Model. The Machine. A model Bulk Synchronous and SPMD Programming The Bulk Synchronous Model CS315B Lecture 2 Prof. Aiken CS 315B Lecture 2 1 Prof. Aiken CS 315B Lecture 2 2 Bulk Synchronous Model The Machine A model An idealized

More information

Verifying the Safety of Security-Critical Applications

Verifying the Safety of Security-Critical Applications Verifying the Safety of Security-Critical Applications Thomas Dillig Stanford University Thomas Dillig 1 of 31 Why Program Verification? Reliability and security of software is a huge problem. Thomas Dillig

More information

Resilient X10 Efficient failure-aware programming

Resilient X10 Efficient failure-aware programming David Cunningham, David Grove, Benjamin Herta, Arun Iyengar, Kiyokuni Kawachiya, Hiroki Murata, Vijay Saraswat, Mikio Takeuchi, Olivier Tardieu X10 Workshop 2014 Resilient X10 Efficient failure-aware programming

More information

Static analysis of concurrent avionics software

Static analysis of concurrent avionics software Static analysis of concurrent avionics software with AstréeA Workshop on Static Analysis of Concurrent Software David Delmas Airbus 11 September 2016 Agenda 1 Industrial context Avionics software Formal

More information

Run-time Environments - 3

Run-time Environments - 3 Run-time Environments - 3 Y.N. Srikant Computer Science and Automation Indian Institute of Science Bangalore 560 012 NPTEL Course on Principles of Compiler Design Outline of the Lecture n What is run-time

More information

On the C11 Memory Model and a Program Logic for C11 Concurrency

On the C11 Memory Model and a Program Logic for C11 Concurrency On the C11 Memory Model and a Program Logic for C11 Concurrency Manuel Penschuck Aktuelle Themen der Softwaretechnologie Software Engineering & Programmiersprachen FB 12 Goethe Universität - Frankfurt

More information

Piecewise Holistic Autotuning of Compiler and Runtime Parameters

Piecewise Holistic Autotuning of Compiler and Runtime Parameters Piecewise Holistic Autotuning of Compiler and Runtime Parameters Mihail Popov, Chadi Akel, William Jalby, Pablo de Oliveira Castro University of Versailles Exascale Computing Research August 2016 C E R

More information

CS 470 Spring Parallel Languages. Mike Lam, Professor

CS 470 Spring Parallel Languages. Mike Lam, Professor CS 470 Spring 2017 Mike Lam, Professor Parallel Languages Graphics and content taken from the following: http://dl.acm.org/citation.cfm?id=2716320 http://chapel.cray.com/papers/briefoverviewchapel.pdf

More information

Towards a Reconfigurable HPC Component Model

Towards a Reconfigurable HPC Component Model C2S@EXA Meeting July 10, 2014 Towards a Reconfigurable HPC Component Model Vincent Lanore1, Christian Pérez2 1 ENS de Lyon, LIP 2 Inria, LIP Avalon team 1 Context 1/4 Adaptive Mesh Refinement 2 Context

More information

Concurrency Analysis for Shared Memory Programs with Textually Unaligned Barriers

Concurrency Analysis for Shared Memory Programs with Textually Unaligned Barriers Concurrency Analysis for Shared Memory Programs with Textually Unaligned Barriers Yuan Zhang 1, Evelyn Duesterwald 2, and Guang R. Gao 1 1 University of Delaware, Newark, DE, {zhangy,ggao}@capsl.udel.edu

More information

CS153: Compilers Lecture 15: Local Optimization

CS153: Compilers Lecture 15: Local Optimization CS153: Compilers Lecture 15: Local Optimization Stephen Chong https://www.seas.harvard.edu/courses/cs153 Announcements Project 4 out Due Thursday Oct 25 (2 days) Project 5 out Due Tuesday Nov 13 (21 days)

More information

A Characterization of Shared Data Access Patterns in UPC Programs

A Characterization of Shared Data Access Patterns in UPC Programs IBM T.J. Watson Research Center A Characterization of Shared Data Access Patterns in UPC Programs Christopher Barton, Calin Cascaval, Jose Nelson Amaral LCPC `06 November 2, 2006 Outline Motivation Overview

More information

Assumptions. History

Assumptions. History Assumptions A Brief Introduction to Java for C++ Programmers: Part 1 ENGI 5895: Software Design Faculty of Engineering & Applied Science Memorial University of Newfoundland You already know C++ You understand

More information

Special Topics: Programming Languages

Special Topics: Programming Languages Lecture #23 0 V22.0490.001 Special Topics: Programming Languages B. Mishra New York University. Lecture # 23 Lecture #23 1 Slide 1 Java: History Spring 1990 April 1991: Naughton, Gosling and Sheridan (

More information

HPX. High Performance ParalleX CCT Tech Talk Series. Hartmut Kaiser

HPX. High Performance ParalleX CCT Tech Talk Series. Hartmut Kaiser HPX High Performance CCT Tech Talk Hartmut Kaiser (hkaiser@cct.lsu.edu) 2 What s HPX? Exemplar runtime system implementation Targeting conventional architectures (Linux based SMPs and clusters) Currently,

More information

An Introduction to Heap Analysis. Pietro Ferrara. Chair of Programming Methodology ETH Zurich, Switzerland

An Introduction to Heap Analysis. Pietro Ferrara. Chair of Programming Methodology ETH Zurich, Switzerland An Introduction to Heap Analysis Pietro Ferrara Chair of Programming Methodology ETH Zurich, Switzerland Analisi e Verifica di Programmi Universita Ca Foscari, Venice, Italy Outline 1. Recall of numerical

More information

Formal methods for software security

Formal methods for software security Formal methods for software security Thomas Jensen, INRIA Forum "Méthodes formelles" Toulouse, 31 January 2017 Formal methods for software security Formal methods for software security Confidentiality

More information

D Programming Language

D Programming Language Group 14 Muazam Ali Anil Ozdemir D Programming Language Introduction and Why D? It doesn t come with a religion this is written somewhere along the overview of D programming language. If you actually take

More information

Subtyping (cont) Lecture 15 CS 565 4/3/08

Subtyping (cont) Lecture 15 CS 565 4/3/08 Subtyping (cont) Lecture 15 CS 565 4/3/08 Formalization of Subtyping Inversion of the subtype relation: If σ

More information

Local Qualification Inference for Titanium

Local Qualification Inference for Titanium Local Qualification Inference for Titanium Ben Liblit EECS Department University of California, Berkeley 387 Soda Hall #1776 Berkeley, CA 94720-1776 liblit@cs.berkeley.edu August 26, 1998 Abstract Titanium

More information

Unified Parallel C (UPC)

Unified Parallel C (UPC) Unified Parallel C (UPC) Vivek Sarkar Department of Computer Science Rice University vsarkar@cs.rice.edu COMP 422 Lecture 21 March 27, 2008 Acknowledgments Supercomputing 2007 tutorial on Programming using

More information

Lecture 15: More Iterative Ideas

Lecture 15: More Iterative Ideas Lecture 15: More Iterative Ideas David Bindel 15 Mar 2010 Logistics HW 2 due! Some notes on HW 2. Where we are / where we re going More iterative ideas. Intro to HW 3. More HW 2 notes See solution code!

More information

Recall from Tuesday. Our solution to fragmentation is to split up a process s address space into smaller chunks. Physical Memory OS.

Recall from Tuesday. Our solution to fragmentation is to split up a process s address space into smaller chunks. Physical Memory OS. Paging 11/10/16 Recall from Tuesday Our solution to fragmentation is to split up a process s address space into smaller chunks. Physical Memory OS Process 3 Process 3 OS: Place Process 3 Process 1 Process

More information

Concurrency Analysis for Shared Memory Programs with Textually Unaligned Barriers

Concurrency Analysis for Shared Memory Programs with Textually Unaligned Barriers Concurrency Analysis for Shared Memory Programs with Textually Unaligned Barriers Yuan Zhang 1, Evelyn Duesterwald 2,andGuangR.Gao 1 1 University of Delaware, Newark, DE {zhangy,ggao}@capsl.udel.edu 2

More information

The Apron Library. Bertrand Jeannet and Antoine Miné. CAV 09 conference 02/07/2009 INRIA, CNRS/ENS

The Apron Library. Bertrand Jeannet and Antoine Miné. CAV 09 conference 02/07/2009 INRIA, CNRS/ENS The Apron Library Bertrand Jeannet and Antoine Miné INRIA, CNRS/ENS CAV 09 conference 02/07/2009 Context : Static Analysis What is it about? Discover properties of a program statically and automatically.

More information

Parallel Programming Languages. HPC Fall 2010 Prof. Robert van Engelen

Parallel Programming Languages. HPC Fall 2010 Prof. Robert van Engelen Parallel Programming Languages HPC Fall 2010 Prof. Robert van Engelen Overview Partitioned Global Address Space (PGAS) A selection of PGAS parallel programming languages CAF UPC Further reading HPC Fall

More information

CSC System Development with Java. Exception Handling. Department of Statistics and Computer Science. Budditha Hettige

CSC System Development with Java. Exception Handling. Department of Statistics and Computer Science. Budditha Hettige CSC 308 2.0 System Development with Java Exception Handling Department of Statistics and Computer Science 1 2 Errors Errors can be categorized as several ways; Syntax Errors Logical Errors Runtime Errors

More information

Question No: 1 ( Marks: 1 ) - Please choose one One difference LISP and PROLOG is. AI Puzzle Game All f the given

Question No: 1 ( Marks: 1 ) - Please choose one One difference LISP and PROLOG is. AI Puzzle Game All f the given MUHAMMAD FAISAL MIT 4 th Semester Al-Barq Campus (VGJW01) Gujranwala faisalgrw123@gmail.com MEGA File Solved MCQ s For Final TERM EXAMS CS508- Modern Programming Languages Question No: 1 ( Marks: 1 ) -

More information

Cloning-Based Context-Sensitive Pointer Alias Analysis using BDDs

Cloning-Based Context-Sensitive Pointer Alias Analysis using BDDs More Pointer Analysis Last time Flow-Insensitive Pointer Analysis Inclusion-based analysis (Andersen) Today Class projects Context-Sensitive analysis March 3, 2014 Flow-Insensitive Pointer Analysis 1 John

More information

DEVELOPING AN OPTIMIZED UPC COMPILER FOR FUTURE ARCHITECTURES

DEVELOPING AN OPTIMIZED UPC COMPILER FOR FUTURE ARCHITECTURES DEVELOPING AN OPTIMIZED UPC COMPILER FOR FUTURE ARCHITECTURES Tarek El-Ghazawi, François Cantonnet, Yiyi Yao Department of Electrical and Computer Engineering The George Washington University tarek@gwu.edu

More information

Multithreaded Architectures and The Sort Benchmark. Phil Garcia Hank Korth Dept. of Computer Science and Engineering Lehigh University

Multithreaded Architectures and The Sort Benchmark. Phil Garcia Hank Korth Dept. of Computer Science and Engineering Lehigh University Multithreaded Architectures and The Sort Benchmark Phil Garcia Hank Korth Dept. of Computer Science and Engineering Lehigh University About our Sort Benchmark Based on the benchmark proposed in A measure

More information

Static and Dynamic Program Analysis: Synergies and Applications

Static and Dynamic Program Analysis: Synergies and Applications Static and Dynamic Program Analysis: Synergies and Applications Mayur Naik Intel Labs, Berkeley CS 243, Stanford University March 9, 2011 Today s Computing Platforms Trends: parallel cloud mobile Traits:

More information

Run-Time Environments/Garbage Collection

Run-Time Environments/Garbage Collection Run-Time Environments/Garbage Collection Department of Computer Science, Faculty of ICT January 5, 2014 Introduction Compilers need to be aware of the run-time environment in which their compiled programs

More information

Outline. Issues with the Memory System Loop Transformations Data Transformations Prefetching Alias Analysis

Outline. Issues with the Memory System Loop Transformations Data Transformations Prefetching Alias Analysis Memory Optimization Outline Issues with the Memory System Loop Transformations Data Transformations Prefetching Alias Analysis Memory Hierarchy 1-2 ns Registers 32 512 B 3-10 ns 8-30 ns 60-250 ns 5-20

More information

St. MARTIN S ENGINEERING COLLEGE Dhulapally, Secunderabad

St. MARTIN S ENGINEERING COLLEGE Dhulapally, Secunderabad St. MARTIN S ENGINEERING COLLEGE Dhulapally, Secunderabad-00 014 Subject: PPL Class : CSE III 1 P a g e DEPARTMENT COMPUTER SCIENCE AND ENGINEERING S No QUESTION Blooms Course taxonomy level Outcomes UNIT-I

More information

Application and System Memory Use, Configuration, and Problems on Bassi. Richard Gerber

Application and System Memory Use, Configuration, and Problems on Bassi. Richard Gerber Application and System Memory Use, Configuration, and Problems on Bassi Richard Gerber Lawrence Berkeley National Laboratory NERSC User Services ScicomP 13, Garching, Germany, July 17, 2007 NERSC is supported

More information

Concurrent Models of Computation

Concurrent Models of Computation Concurrent Models of Computation Edward A. Lee Robert S. Pepper Distinguished Professor, UC Berkeley EECS 219D Concurrent Models of Computation Fall 2011 Copyright 2009-2011, Edward A. Lee, All rights

More information

September 10,

September 10, September 10, 2013 1 Bjarne Stroustrup, AT&T Bell Labs, early 80s cfront original C++ to C translator Difficult to debug Potentially inefficient Many native compilers exist today C++ is mostly upward compatible

More information

Titanium Language Reference Manual, version 2.19

Titanium Language Reference Manual, version 2.19 Titanium Language Reference Manual, version 2.19 Paul N. Hilfinger Dan Oscar Bonachea Kaushik Datta David Gay Susan L. Graham Benjamin Robert Liblit Geoffrey Pike Jimmy Zhigang Su Katherine A. Yelick Electrical

More information

The Checker Framework: pluggable static analysis for Java

The Checker Framework: pluggable static analysis for Java The Checker Framework: pluggable static analysis for Java http://checkerframework.org/ Werner Dietl University of Waterloo https://ece.uwaterloo.ca/~wdietl/ Joint work with Michael D. Ernst and many others.

More information

Automatic Scaling Iterative Computations. Aug. 7 th, 2012

Automatic Scaling Iterative Computations. Aug. 7 th, 2012 Automatic Scaling Iterative Computations Guozhang Wang Cornell University Aug. 7 th, 2012 1 What are Non-Iterative Computations? Non-iterative computation flow Directed Acyclic Examples Batch style analytics

More information

Efficient AMG on Hybrid GPU Clusters. ScicomP Jiri Kraus, Malte Förster, Thomas Brandes, Thomas Soddemann. Fraunhofer SCAI

Efficient AMG on Hybrid GPU Clusters. ScicomP Jiri Kraus, Malte Förster, Thomas Brandes, Thomas Soddemann. Fraunhofer SCAI Efficient AMG on Hybrid GPU Clusters ScicomP 2012 Jiri Kraus, Malte Förster, Thomas Brandes, Thomas Soddemann Fraunhofer SCAI Illustration: Darin McInnis Motivation Sparse iterative solvers benefit from

More information

Evaluating Titanium SPMD Programs on the Tera MTA

Evaluating Titanium SPMD Programs on the Tera MTA Evaluating Titanium SPMD Programs on the Tera MTA Carleton Miyamoto, Chang Lin {miyamoto,cjlin}@cs.berkeley.edu EECS Computer Science Division University of California, Berkeley Abstract Coarse grained

More information