High Performance Computing. Without a Degree in Computer Science


1 High Performance Computing Without a Degree in Computer Science

2 Smalley's Top Ten
1. energy
2. water
3. food
4. environment
5. poverty
6. terrorism and war
7. disease
8. education
9. democracy
10. population

3 Number of Physical Scientists and Engineers
- Computation is an important research paradigm in physical science
- Today, many scientists spend (waste?) enormous amounts of time on structuring of computations
- A goal of computer science: help make these scientists more productive

4 A Bit of History
- In the beginning, there was machine language (or assembly language)
- 1957: Fortran (John Backus et al., IBM) made it possible for every scientist to develop applications
- Today: programming high-end machines is once again the near-exclusive domain of experts

5 Key: The Compiler
[Diagram: Fortran Program -> Compiler -> Machine Code]
- Takes care of the many details of making the machine perform well
- Goal: make the penalty for using the programming language as small as possible

6 Making Languages Usable
"It was our belief that if FORTRAN, during its first months, were to translate any reasonable scientific source program into an object program only half as fast as its hand-coded counterpart, then acceptance of our system would be in serious danger... I believe that had we failed to produce efficient programs, the widespread use of languages like FORTRAN would have been seriously delayed."
John Backus

7 The Programming Problem
- Programming is hard, and getting harder with new platforms
- Professional programmers are (still) in short supply
- Programming systems that result in low performance will not be accepted

8 The Programming Problem: A Strategy
- Make it possible for end users to become programmers
    - users integrate software components using problem-solving environments (PSEs) or scripting languages (e.g., Visual Basic, Matlab)
    - professional programmers develop software components

9 The Programming Problem: An Obstacle
- Achieving High Performance:
    - translate scripts and components to common intermediate language
    - optimize the resulting program using whole-program compilation

10 Whole-Program Compilation
[Diagram: Component Library and Script -> Translator -> Global Optimizing Compiler -> Code Generator]
- Problem: long compilation times, even for short scripts!
- Problem: expert knowledge on specialization lost

11 Telescoping Languages
[Diagram: L1 Component Library -> Compiler Generator (could run for many hours) -> L1 Compiler (understands library calls as primitives); Script -> Translator -> L1 Compiler -> Code Generator -> Optimized Application]
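The split between a slow, one-time library analysis and a fast script translation can be sketched in a few lines. This is only a toy illustration, not the Rice implementation: the `scale` routine, the `factor_is_one` property, and the functions `generate_language` and `translate_call` are all hypothetical names invented for this example.

```python
def scale(values, factor):
    """General library routine (hypothetical): multiply every element by factor."""
    return [v * factor for v in values]


def scale_by_one(values, factor):
    """Variant specialized for factor == 1: no multiplication is needed."""
    return list(values)


def generate_language(library):
    """Offline step, run once per library (this is the part that may be slow).

    Builds a table mapping (routine name, call-site property) to a
    pre-optimized variant. The key (name, None) holds the general routine.
    """
    table = {}
    for name, (general, variants) in library.items():
        table[(name, None)] = general
        for prop, variant in variants.items():
            table[(name, prop)] = variant
    return table


def translate_call(table, name, const_args):
    """Fast step: treat the library call as a primitive and pick a variant by lookup."""
    prop = "factor_is_one" if const_args.get("factor") == 1 else None
    return table.get((name, prop), table[(name, None)])


# Hypothetical one-routine library with a single specialized variant.
LIBRARY = {"scale": (scale, {"factor_is_one": scale_by_one})}
TABLE = generate_language(LIBRARY)
```

A call site whose `factor` is known to be 1 resolves to `scale_by_one`; any other call falls back to the general `scale`. The point of the sketch is that the script translator does no analysis of the library body at translation time, which is what keeps compile times short.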

12 Telescoping Languages: Advantages
- Compile times can be reasonable
- High-level optimizations can be included
- User retains substantive control over language performance
- Generate a new language with user-produced libraries
- Reliability can be improved

13 Applications

14 Application: Matlab SP
- Signal processing users want simplicity, programming power, and performance
- Currently over 500,000 Matlab licenses
- Matlab gives them simplicity and power but not performance
- Codes prototyped in Matlab, then rewritten in low-level programming language

15 Matlab SP: Profitable Transformations
- Vectorization: conversion of loops to array expressions
- Optimization of array expressions, including array allocation and reshape
- Applying conventional expression optimizations to procedures

16 Procedure Strength Reduction
Before:
    for i = 1:N
        x = x + f(c1, c2, i, c3)
    end
After:
    f0(c1, c2, c3)
    for i = 1:N
        x = x + f1(i)
    end
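The transformation on this slide can be sketched in Python. The body of `f` below is an assumption made up for illustration (the slide does not show it); what matters is that part of it depends only on the loop-invariant arguments `c1`, `c2`, `c3`. Unlike the slide, where `f1(i)` uses the state set up by `f0` implicitly, this sketch passes that state explicitly.

```python
import math


def f(c1, c2, i, c3):
    """Original procedure: recomputes the loop-invariant part on every call."""
    invariant = math.sqrt(c1 * c1 + c2 * c2) * c3  # depends only on c1, c2, c3
    return invariant + i                           # the only part that uses i


def f0(c1, c2, c3):
    """Initialization part: the invariant work, done once before the loop."""
    return math.sqrt(c1 * c1 + c2 * c2) * c3


def f1(state, i):
    """Incremental part: the cheap per-iteration work."""
    return state + i


def original_loop(c1, c2, c3, n):
    x = 0.0
    for i in range(1, n + 1):
        x = x + f(c1, c2, i, c3)
    return x


def reduced_loop(c1, c2, c3, n):
    state = f0(c1, c2, c3)          # hoisted out of the loop
    x = 0.0
    for i in range(1, n + 1):
        x = x + f1(state, i)
    return x
```

Both loops perform the same arithmetic in the same order, so they return the same value; the reduced form simply evaluates the square root once instead of `n` times.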

17 Strength Reduction Performance
[Figure: Before Strength Reduction (1.0) vs. After Strength Reduction for jmp1, ctss, and olbf]
Results courtesy of Arun Chauhan

18 Application: Matlab SP
- Role of Telescoping Languages:
    - Critical signal processing code modules are reused many times
    - Run these procedures through the language generator
    - Produce Matlab SP, a high-level domain-specific environment

19 Component Integration System
- Component integration systems are viewed as important productivity tools
- Programs constructed from them are often slow because no context-based code improvements can be applied
- Telescoping languages could be applied to construct component integration systems that yield high-performance applications

20 Component Integration: A Special Case
- Integration of different component libraries that
    - Implement data structures (e.g., sparse matrices)
    - Implement functions on data structures (e.g., linear algebra)
- Telescoping languages can handle this well

21 Parallelism in Matlab
- Add distributions to Matlab arrays
- Distributions can cross multiple processors
    A(1:100) | A(101:200) | A(201:300) | A(301:400)
- Use distributions to guide parallelism
- Hide parallelism in component array operations
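The four index ranges on this slide correspond to a block distribution of a 400-element array over four processors. The sketch below shows one way such ranges could be computed and used to hide the parallel structure inside an array operation. The function names (`block_ranges`, `owner`, `distributed_sum`) are hypothetical, and the "processors" are simulated sequentially.

```python
def block_ranges(n, p):
    """Split indices 1..n into p contiguous blocks (1-based, inclusive bounds).

    When n is not divisible by p, the first n % p blocks get one extra element.
    """
    base, extra = divmod(n, p)
    ranges = []
    start = 1
    for rank in range(p):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def owner(index, ranges):
    """Return the rank of the block that holds the given 1-based index."""
    for rank, (lo, hi) in enumerate(ranges):
        if lo <= index <= hi:
            return rank
    raise IndexError(index)


def distributed_sum(values, p):
    """Array operation with the parallelism hidden inside it.

    Each block is reduced separately (here one after another, standing in
    for p processors), then the partial sums are combined.
    """
    ranges = block_ranges(len(values), p)
    partials = [sum(values[lo - 1:hi]) for lo, hi in ranges]
    return sum(partials)
```

For example, `block_ranges(400, 4)` yields the ranges shown on the slide, and a caller of `distributed_sum` never sees the distribution at all.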

22 Library Generator (ARGen)
- Prof. Dan Sorensen (Rice CAAM) maintains ARPACK, a large-scale eigenvalue solver
- He prototypes the algorithms in Matlab, then generates 8 variants in Fortran by hand: {Double, Complex} x {Symmetric, Nonsymmetric} x {Dense, Sparse}
- Could this hand generation step be avoided?
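The count of 8 follows from the three two-way choices (2 x 2 x 2). A small sketch makes the variant space explicit; the naming scheme in `variant_name` is an assumption for illustration and does not reflect ARPACK's actual routine names.

```python
from itertools import product

PRECISIONS = ("Double", "Complex")
SYMMETRIES = ("Symmetric", "Nonsymmetric")
STORAGES = ("Dense", "Sparse")


def enumerate_variants():
    """Return every (precision, symmetry, storage) combination."""
    return list(product(PRECISIONS, SYMMETRIES, STORAGES))


def variant_name(variant):
    """Build a hypothetical identifier for one variant, e.g. 'double_symmetric_dense'."""
    return "_".join(part.lower() for part in variant)
```

Each tuple returned by `enumerate_variants()` stands for one Fortran version that would otherwise be written by hand from the single Matlab prototype.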

23 ARGen Results
[Figure: Matlab, ARGen, and ARPACK compared for Dense Symmetric and Sparse Symmetric]
Results courtesy of Cheryl McCosh

24 A Statistical Analysis Language
- S: A high-level language for manipulating, analyzing, and displaying data, widely used for design of clinical studies in medicine
- S Optimization:
    - All the Matlab optimizations
    - Translation to C with folding of temporary arrays into usage points
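One plausible reading of "folding of temporary arrays into usage points" is that an intermediate array is never materialized; its defining expression is evaluated element by element where the result is consumed. The sketch below illustrates that reading in Python rather than C, with hypothetical function names; it is not taken from the S translator itself.

```python
def with_temporary(a, b, c):
    """Array-expression style: r = (a + b) * c with an explicit temporary."""
    t = [x + y for x, y in zip(a, b)]      # temporary array t = a + b
    return [x * z for x, z in zip(t, c)]   # r = t * c, second pass over the data


def folded(a, b, c):
    """Same result with the temporary folded into its single use.

    One pass over the inputs and no intermediate array.
    """
    return [(x + y) * z for x, y, z in zip(a, b, c)]
```

Both functions compute the same values; the folded form avoids allocating and traversing `t`, which is the kind of saving a translation to a single C loop would be expected to give.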

25 S Optimization Results
[Figure: Speedup for Geneshaving, Gibbs Smpl, and Trial Design]
Results courtesy of Bradley Broom

26 Generator for Grid Computations
- The Grid: nets of interconnected supercomputers
- Several national science infrastructures under development
- Challenge: application development

27 National Distributed Problem Solving
[Diagram: Database, Supercomputer, Supercomputer, Database]

28 Grid Programming Today
- Application development is possible
    - Support for finding available computer cycles, accounting, job initiation, and communication between parts of a program running on different machines
- Applications are programmed by hand
    - Requires special expertise

29 Grid Programming Challenges
- Finding parallelism
- Mapping applications to machines and network links with different capacities
- Adapting to changes in load

30 GrADSoft Architecture
[Diagram components: PSE, Software Components, Libraries, Program Integrator/Compiler, Application, Configurable Object Program, Scheduler/Resource Negotiator, Negotiation, Binder, Real-time Performance Monitor, Grid Run-Time System. Regions: Program Preparation System, Execution Environment]
GrADS Project (NSF NGS): Berman, Chien, Cooper, Dongarra, Foster, Gannon, Johnsson, Kennedy, Kesselman, Mellor-Crummey, Reed, Torczon, Wolski

31 Summary
- A goal of computer science research is to make professionals, especially scientists and engineers, more productive
- This goal is difficult to achieve because of the need for high-performance applications
- One solution is to develop technologies that directly translate prototyping languages to production code

32 Collaborators
Bradley Broom, Arun Chauhan, Keith Cooper, Jack Dongarra, Rob Fowler, Lennart Johnsson, Chuck Koelbel, Cheryl McCosh, John Mellor-Crummey, Linda Torczon

33 The End


More information

FOBS: A Lightweight Communication Protocol for Grid Computing Phillip M. Dickens

FOBS: A Lightweight Communication Protocol for Grid Computing Phillip M. Dickens FOBS: A Lightweight Communication Protocol for Grid Computing Phillip M. Dickens Abstract The advent of high-performance networks in conjunction with low-cost, powerful computational engines has made possible

More information

Performance Analysis of the MPAS-Ocean Code using HPCToolkit and MIAMI

Performance Analysis of the MPAS-Ocean Code using HPCToolkit and MIAMI Performance Analysis of the MPAS-Ocean Code using HPCToolkit and MIAMI Gabriel Marin February 11, 2014 MPAS-Ocean [4] is a component of the MPAS framework of climate models. MPAS-Ocean is an unstructured-mesh

More information

Early Evaluation of the Cray X1 at Oak Ridge National Laboratory

Early Evaluation of the Cray X1 at Oak Ridge National Laboratory Early Evaluation of the Cray X1 at Oak Ridge National Laboratory Patrick H. Worley Thomas H. Dunigan, Jr. Oak Ridge National Laboratory 45th Cray User Group Conference May 13, 2003 Hyatt on Capital Square

More information

Parallel Numerics, WT 2013/ Introduction

Parallel Numerics, WT 2013/ Introduction Parallel Numerics, WT 2013/2014 1 Introduction page 1 of 122 Scope Revise standard numerical methods considering parallel computations! Required knowledge Numerics Parallel Programming Graphs Literature

More information

Virtual Grids. Today s Readings

Virtual Grids. Today s Readings Virtual Grids Last Time» Adaptation by Applications» What do you need to know? To do it well?» Grid Application Development Software (GrADS) Today» Virtual Grids» Virtual Grid Application Development Software

More information

GLAF: A Visual Programming and Auto- Tuning Framework for Parallel Computing

GLAF: A Visual Programming and Auto- Tuning Framework for Parallel Computing GLAF: A Visual Programming and Auto- Tuning Framework for Parallel Computing Student: Konstantinos Krommydas Collaborator: Dr. Ruchira Sasanka (Intel) Advisor: Dr. Wu-chun Feng Motivation High-performance

More information

CS420/CSE 402/ECE 492. Introduction to Parallel Programming for Scientists and Engineers. Spring 2006

CS420/CSE 402/ECE 492. Introduction to Parallel Programming for Scientists and Engineers. Spring 2006 CS420/CSE 402/ECE 492 Introduction to Parallel Programming for Scientists and Engineers Spring 2006 1 of 28 Additional Foils 0.i: Course organization 2 of 28 Instructor: David Padua. 4227 SC padua@uiuc.edu

More information

Building Performance Topologies for Computational Grids UCSB Technical Report

Building Performance Topologies for Computational Grids UCSB Technical Report Building Performance Topologies for Computational Grids UCSB Technical Report 2002-11 Martin Swany and Rich Wolski Department of Computer Science University of California Santa Barbara, CA 93106 {swany,rich}@cs..edu

More information

Team Science in mhealth Research

Team Science in mhealth Research Team Science in mhealth Research Sherry Pagoto, PhD Co-Founder, UMass Center of mhealth and Social Media Associate Professor of Medicine Division of Preventive and Behavioral Medicine University of Massachusetts

More information

Building Performance Topologies for Computational Grids

Building Performance Topologies for Computational Grids Building Performance Topologies for Computational Grids Martin Swany and Rich Wolski Department of Computer Science University of California Santa Barbara, CA 93106 {swany,rich}@cs.ucsb.edu Abstract This

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 2: Hardware/Software Interface Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Overview Basic computer components How does a microprocessor

More information

MATLAB*P: Architecture. Ron Choy, Alan Edelman Laboratory for Computer Science MIT

MATLAB*P: Architecture. Ron Choy, Alan Edelman Laboratory for Computer Science MIT MATLAB*P: Architecture Ron Choy, Alan Edelman Laboratory for Computer Science MIT Outline The p is for parallel MATLAB is what people want Supercomputing in 2003 The impact of the Earth Simulator The impact

More information

Instruction Selection: Preliminaries. Comp 412

Instruction Selection: Preliminaries. Comp 412 COMP 412 FALL 2017 Instruction Selection: Preliminaries Comp 412 source code Front End Optimizer Back End target code Copyright 2017, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled

More information

CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman)

CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman) CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC Guest Lecturer: Sukhyun Song (original slides by Alan Sussman) Parallel Programming with Message Passing and Directives 2 MPI + OpenMP Some applications can

More information

A Performance Oriented Migration Framework For The Grid Λ

A Performance Oriented Migration Framework For The Grid Λ A Performance Oriented Migration Framework For The Grid Λ Sathish S. Vadhiyar and Jack J. Dongarra Computer Science Department University of Tennessee fvss, dongarrag@cs.utk.edu Abstract At least three

More information

Principles of Parallel Algorithm Design: Concurrency and Mapping

Principles of Parallel Algorithm Design: Concurrency and Mapping Principles of Parallel Algorithm Design: Concurrency and Mapping John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 3 28 August 2018 Last Thursday Introduction

More information

Compilers and Compiler-based Tools for HPC

Compilers and Compiler-based Tools for HPC Compilers and Compiler-based Tools for HPC John Mellor-Crummey Department of Computer Science Rice University http://lacsi.rice.edu/review/2004/slides/compilers-tools.pdf High Performance Computing Algorithms

More information

How to perform HPL on CPU&GPU clusters. Dr.sc. Draško Tomić

How to perform HPL on CPU&GPU clusters. Dr.sc. Draško Tomić How to perform HPL on CPU&GPU clusters Dr.sc. Draško Tomić email: drasko.tomic@hp.com Forecasting is not so easy, HPL benchmarking could be even more difficult Agenda TOP500 GPU trends Some basics about

More information

Issues In Implementing The Primal-Dual Method for SDP. Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM

Issues In Implementing The Primal-Dual Method for SDP. Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM Issues In Implementing The Primal-Dual Method for SDP Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM 87801 borchers@nmt.edu Outline 1. Cache and shared memory parallel computing concepts.

More information

Harvard-MIT Division of Health Sciences and Technology HST.952: Computing for Biomedical Scientists HST 952. Computing for Biomedical Scientists

Harvard-MIT Division of Health Sciences and Technology HST.952: Computing for Biomedical Scientists HST 952. Computing for Biomedical Scientists Harvard-MIT Division of Health Sciences and Technology HST.952: Computing for Biomedical Scientists HST 952 Computing for Biomedical Scientists Introduction Medical informatics is interdisciplinary, and

More information

OmniRPC: a Grid RPC facility for Cluster and Global Computing in OpenMP

OmniRPC: a Grid RPC facility for Cluster and Global Computing in OpenMP OmniRPC: a Grid RPC facility for Cluster and Global Computing in OpenMP (extended abstract) Mitsuhisa Sato 1, Motonari Hirano 2, Yoshio Tanaka 2 and Satoshi Sekiguchi 2 1 Real World Computing Partnership,

More information

The Cascade High Productivity Programming Language

The Cascade High Productivity Programming Language The Cascade High Productivity Programming Language Hans P. Zima University of Vienna, Austria and JPL, California Institute of Technology, Pasadena, CA CMWF Workshop on the Use of High Performance Computing

More information

Generating Code for Assignment Statements back to work. Comp 412 COMP 412 FALL Chapters 4, 6 & 7 in EaC2e. source code. IR IR target.

Generating Code for Assignment Statements back to work. Comp 412 COMP 412 FALL Chapters 4, 6 & 7 in EaC2e. source code. IR IR target. COMP 412 FALL 2017 Generating Code for Assignment Statements back to work Comp 412 source code IR IR target Front End Optimizer Back End code Copyright 2017, Keith D. Cooper & Linda Torczon, all rights

More information

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done What is a compiler? What is a compiler? Traditionally: Program that analyzes and translates from a high level language (e.g., C++) to low-level assembly language that can be executed by hardware int a,

More information

NetSolve: past, present, and future; a look at a Grid enabled server 1

NetSolve: past, present, and future; a look at a Grid enabled server 1 24 NetSolve: past, present, and future; a look at a Grid enabled server 1 Sudesh Agrawal, Jack Dongarra, Keith Seymour, and Sathish Vadhiyar University of Tennessee, Tennessee, United States 24.1 INTRODUCTION

More information