FLEXJAVA:)Language'Support' for'safe'and'modular' Approximate'Programming

Similar documents
Approximate Computing Is Dead; Long Live Approximate Computing. Adrian Sampson Cornell

Bridging Analog Neuromorphic and Digital von Neumann Computing

Architecture Support for Disciplined Approximate Programming

Luis Ceze. sa pa. areas: computer architecture, OS, programming languages. Safe MultiProcessing Architectures at the University of Washington

This page intentionally left blank

MA 511: Computer Programming Lecture 3: Partha Sarathi Mandal

Approximate Overview of Approximate Computing

ESC101 : Fundamental of Computing

Core. Error Predictor. Figure 1: Architectural overview of our quality control approach. Approximate Accelerator. Precise.

Refinement Types for TypeScript

Tutorial 11. Exercise 1: CSC111 Computer Programming I. A. Write a code snippet to define the following arrays:

C Programming for Engineers Functions

CSE101-lec#12. Designing Structured Programs Introduction to Functions. Created By: Amanpreet Kaur & Sanjeev Kumar SME (CSE) LPU

CS141 Programming Assignment #5

CS S-08 Arrays and Midterm Review 1

6.092 Introduction to Software Engineering in Java January (IAP) 2009

Architecture, Programming and Performance of MIC Phi Coprocessor

Dealing with Concurrency Problems

Q1 Q2 Q3 Q4 Q5 Total 1 * 7 1 * 5 20 * * Final marks Marks First Question

AxBench: A Benchmark Suite for Approximate Computing Across the System Stack

CS221: Algorithms and Data Structures. Asymptotic Analysis. Alan J. Hu (Borrowing slides from Steve Wolfman)

More types, Methods, Conditionals. ARCS Lab.

Identifying Volatile Numeric Expressions in OpenCL Applications Miriam Leeser

Copyright 2012, Elsevier Inc. All rights reserved.

CSCI 135 Exam #0 Fundamentals of Computer Science I Fall 2013

Approximate Program Synthesis

Array. Array Declaration:

OmpCloud: Bridging the Gap between OpenMP and Cloud Computing

CS 170 Exam 2. Section 004 Fall Name (print): Instructions:

University of Palestine. Mid Exam Total Grade: 100

Data Structure and Programming Languages

AP COMPUTER SCIENCE A DIAGNOSTIC EXAM. Multiple Choice Section Time - 1 hour and 15 minutes Number of questions - 40 Percent of total grade - 50

Recurrent Neural Networks. Deep neural networks have enabled major advances in machine learning and AI. Convolutional Neural Networks

Cloning Arrays. In practice, one might duplicate an array for some reason. One could attempt to use the assignment statement (=), for example,

Runtime Support for Scalable Task-parallel Programs

2 marks. class q1c{ class point{ int p,q; point(int p, int q){ this.p=p; this.q=q; } void printpoint(){ System.out.println(this.p+" "+this.

CSC 1051 Arrays - Review questions

A Preliminary Workload Analysis of SPECjvm2008

Introducing SQL Query Verifier Plugin

Chapter 4 Functions By C.K. Liang

COMPUTER APPLICATIONS

EECS4201 Computer Architecture

Efficient, Deterministic, and Deadlock-free Concurrency

Course web site: teaching/courses/car. Piazza discussion forum:

Data Structure and Programming Languages

Compilation and Hardware Support for Approximate Acceleration

CS18000: Problem Solving And Object-Oriented Programming

Axilog: Language Support for Approximate Hardware Design

Faculty of Engineering Computer Engineering Department Islamic University of Gaza C++ Programming Language Lab # 6 Functions

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 1. Copyright 2012, Elsevier Inc. All rights reserved. Computer Technology

Arrays. CS10001: Programming & Data Structures. Pallab Dasgupta Dept. of Computer Sc. & Engg., Indian Institute of Technology Kharagpur

Getting started with Java

COS 126 General Computer Science Spring Written Exam 1

(Not Quite) Minijava

Use the template below and fill in the areas in Red to complete it.

Understanding Shaders and WebGL. Chris Dalton & Olli Etuaho

Tiling: A Data Locality Optimizing Algorithm

Arrays in C. Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. Indian Institute of Technology Kharagpur. Basic Concept

Evaluating MMX Technology Using DSP and Multimedia Applications

Methods (Deitel chapter 6)

Methods (Deitel chapter 6)

Java Tutorial. Saarland University. Ashkan Taslimi. Tutorial 3 September 6, 2011

Computação I. Exercises. Leonardo Vanneschi NOVA IMS, Universidade Nova de Lisboa. Leonardo Vanneschi Computação I NOVA IMS

EECS402 Lecture 02. Functions. Function Prototype

CODE-GENERATION FOR DIFFERENTIAL EQUATION SOLVERS

Neural Acceleration for GPU Throughput Processors

Quiz for Chapter 2 Instructions: Language of the Computer3.10

Affine Loop Optimization using Modulo Unrolling in CHAPEL

Neural Acceleration for General-Purpose Approximate Programs

Integrated CPU and Cache Power Management in Multiple Clock Domain Processors

Introduction to Knime 1

Visual Programming. Lecture 2: More types, Methods, Conditionals

A Tightly-coupled Light-Weight Neural Network Processing Units with RISC-V Core

Tiling: A Data Locality Optimizing Algorithm

ITEC2620 Introduction to Data Structures

ArcGIS Runtime: Styling Maps. Lucas Danzinger and Michael Wilburn

XPU A Programmable FPGA Accelerator for Diverse Workloads

Neural Network based Energy-Efficient Fault Tolerant Architect

CSE 141 Summer 2016 Homework 2

Object Oriented Programming in Java. Jaanus Pöial, PhD Tallinn, Estonia

Term 1 Unit 1 Week 1 Worksheet: Output Solution

Lecture 9 - C Functions

Overcoming the Barriers to Sustained Petaflop Performance. William D. Gropp Mathematics and Computer Science

Recursive definition: A definition that is defined in terms of itself. Recursive method: a method that calls itself (directly or indirectly).

Java Identifiers, Data Types & Variables

Programming Problem for the 1999 Comprehensive Exam

Chapter 01 Arrays Prepared By: Dr. Murad Magableh 2013

24V, Motorized Actuators

M.CS201 Programming language

Functions. Lab 4. Introduction: A function : is a collection of statements that are grouped together to perform an operation.

The University Of Michigan. EECS402 Lecture 02. Andrew M. Morgan. Savitch Ch. 3-4 Functions Value and Reference Parameters.

Research Group. 2: More types, Methods, Conditionals

OpenMP on the FDSM software distributed shared memory. Hiroya Matsuba Yutaka Ishikawa

A Preliminary Workload Analysis of SPECjvm2008

A PROBLEM can be solved easily if it is decomposed into parts. Similarly a C program decomposes a program into its component functions.

RECURSION. Data Structures & SWU Rachel Cardell- Oliver

CS 470 Spring Mike Lam, Professor. Advanced OpenMP

CS/ENGRD 2110 Object-Oriented Programming and Data Structures Spring 2012 Thorsten Joachims. Lecture 10: Asymptotic Complexity and

DM502 Programming A. Peter Schneider-Kamp.

On the Design of the Local Variable Cache in a Hardware Translation-Based Java Virtual Machine

Transcription:

FLEXJAVA:)Language'Support' for'safe'and'modular' Approximate'Programming Jongse Park,'Hadi'Esmaeilzadeh,'Xin'Zhang,' Mayur Naik,'William'Harris Alternative'Computing'Technologies'(ACT)'Lab Georgia'Institute'of'Technology ESEC/FSE'2015

Energy'is'a'primary'constraint Data)Center Mobile Internet)of) Things 2

Data'growth'vs'performance Data'growth' trends:'idc's'digital'universe'study,'december'2012 Performance'growth' trends:'esmaeilzadeh'et'al,' Dark'Silicon' and'the'end'of' Multicore'Scaling, 'ISCA'2011 3

Adding'a'third'dimension Embracing'Error Processor Pareto.Fron0er Data'center Energy Desktop IoT Mobile Performance 4

Navigating'a'three'dimensional'space Processor Pareto.Fron0er Data'center Energy Desktop IoT Mobile Performance 5

Approximate'computing Embracing'error Relax)the'abstraction'of' near%perfect accuracy)in Processing Storage Communication Allows'errors to'happen'to'improve' performance resource'utilization'efficiency 6

New'landscape'of'computing Personalized'and'targeted'computing 7

Mobile' Computing IoT Computing Cloud'Computing 8

Classes'of approximate'applications Programs'with'analog'inputs Sensors,'scene'reconstruction Programs'with'analog'outputs Multimedia Programs'with'multiple'possible'answers Web'search,'machine'learning Convergent'programs Gradient'descent,'big'data'analytics 9

Avoiding'overkill'design Approximate'Computing Application Programming'Language Compiler Architecutre Microarchitecture Cost Precision Reliability Cost Circuit Physical' Device 10

Cross_stack'approach Application Programming' Language Compiler Architecutre Microarchitecture Circuit Physical' Device 11

WH 2 of'approximation What to'approximate? How'much to'approximate? How to'approximate? Language Compiler Runtime Hardware 12

13

Approximate'program 14

Separation 15

Goal Design'an'automated and'highglevel programming'language' for'approximate'computing Criteria 1) Safety 2) Scalability) 3) Modularity)and)reusability 4) Generality 16

FLEXJAVA Lightweight'set'of'language'extensions Reducing'annotating'effort Language_compiler'co_design Safety Scalability Modularity'and'reusability Generality 17

FLEXJAVA Annotations 1.''Approximation'for'individual'data'and'operations loosen'(variable) loosen_invasive (variable)' tighten'(variable) 2.''Approximation'for'a'code'block begin_loose( TECHNIQUE,'ARGUMENTS) end_loose (ARGUMENTS) 3.'Expressing'the'quality'requirement loosen'(variable,'quality_requirement);' loosen_invasive (variable,'quality_requirement); end_loose (ARGUMENTS,'QUALITY_REQUIREMENT); 18

Safe'and'scalable'approximation float computeluminance (float r, float g, float b) { float luminance = r * 0.3f + g * 0.6f + b * 0.1f; loosen (luminance); return luminance; 19

Safe'and'scalable'approximation float computeluminance (float r, float g, float b) { float luminance = r * 0.3f + g * 0.6f + b * 0.1f; loosen (luminance); return luminance; 19

Safe'and'scalable'approximation float computeluminance (float r, float g, float b) { float luminance = r * 0.3f + g * 0.6f + b * 0.1f; loosen (luminance); return luminance; 19

Safe'and'scalable'approximation float computeluminance (float r, float g, float b) { float luminance = r * 0.3f + g * 0.6f + b * 0.1f; loosen (luminance); return luminance; 19

Safe'and'scalable'approximation int fibonacci (int n) { int r; if (n <= 1) r = n; else r = fibonacci (n 1) + fibonacci (n 2); loosen (r); return r; 20

Safe'and'scalable'approximation int fibonacci (int n) { int r; if (n <= 1) r = n; else r = fibonacci (n 1) + fibonacci (n 2); loosen (r); return r; 20

Modularity int p = 1; for (int i = 0; i < a.length; i++) { p *= a[i]; for (int i = 0; i < b.length; i++) { p += b[i]; loosen(p); 21

Reuse'and'library'support static int square(int a) { int s = a * a; loosen(s); return s; main () { int x = 2 * square(3); System.out.println(x); 22

Reuse'and'library'support static int square(int a) { int s = a * a; loosen(s); return s; main () { int x = 2 * square(3); loosen(x); System.out.println(x); 22

Reuse'and'library'support static int square(int a) { int s = a * a; loosen(s); return s; main () { int x = 2 * square(3); loosen(x); System.out.println(x); 22

Reuse'and'library'support static int square(int a) { int s = a * a; loosen(s); return s; main () { int x = 2 * square(3); loosen(x); System.out.println(x); 22

Reuse'and'library'support static int square(int a) { int s = a * a; return s; main () { int x = 2 * square(3); loosen(x); System.out.println(x); 23

Reuse'and'library'support static int square(int a) { int s = a * a; return s; main () { int x = 2 * square(3); loosen(x); System.out.println(x); 23

Reuse'and'library'support static int square(int a) { int s = a * a; return s; main () { int x = 2 * square(3); loosen_invasive(x); System.out.println(x); 24

Reuse'and'library'support static int square(int a) { int s = a * a; return s; main () { int x = 2 * square(3); loosen_invasive(x); System.out.println(x); 24

Reuse'and'library'support static int square(int a) { int s = a * a; return s; main () { int x = 2 * square(3); loosen_invasive(x); System.out.println(x); 24

Reuse'and'library'support static int square(int a) { int s = a * a; return s; main () { int x = 2 * square(3); loosen_invasive(x); System.out.println(x); 24

Reuse'and'library'support static int square(int a) { int s = a * a; return s; main () { int x = 2 * square(3); loosen_invasive(x); System.out.println(x); 24

Generality Software Loop)perforation)and)loop)earlyGtermination) [ESEC/FSE 11,)PLDI 10] Function)substitution)[PLDI 10] Hardware Truffle:)Architecture)support)for) approximate)computing)[asplos 12] NPU:)Neural)acceleration)[MICRO 12] Memory Flikker:)Saving)DRAM)refresh)power)[ASPLOS 11] RFVP)and)load)value)prediction)[PACT 14,)MICRO 14] 25

Generality begin_loose( PERFORATION, 0.10); for (int i = 0; i < n; i++) { end_loose(); 26

Generality begin_loose( NPU ); p = Math.sin(x) + Math.cos(y); q = 2 * Math.sin(x + y); end_loose(); 27

Approximation'Safety'Analysis Find'the'maximal'set'of'safe_to_approximate data/operations 1. For'each'annotation,'loosen(v),'find'the'set'of'data)and' operations that affect)v 2. Merge'all'the'sets 3. Exclude'the'data'and'operations'that'affect'control)flow 4. Exclude'the'data'and'operations'that'affect'address) calculations The'rest'of'the'program'data'and'operations'are'precise 28

Workflow Source)code FLEXJAVA annotations Programming Approximation Safety)Analysis SafeGtoGapproximate data/operations Compilation SourceGcode Highlighting)Tool Highlighted Source)Code Feedback 29

Benchmark smm #)lines:)38 SciMark2'benchmark Matrix_vector' multiplication mc #)lines:)59 SciMark2'benchmark Mathematical' approximation fft #)lines:)168 SciMark2'benchmark Signal' processing sor #)lines:)36 SciMark2'benchmark Successive' Over_relaxation lu #)lines:)283 SciMark2'benchmark Matrix' factorization raytracer #)lines:)174 Graphics 3D image' renderer jmeint #)lines:)5,962 3D'gaming Triangle intersection kernel hessian #)lines:)10,174 Image'processing Interest point detection sobel #)lines:)163 Image'processing Image edge detection zxing #)lines:)26,171 Image'processing Bar'code' decoder 30

Lower'is'Better Number'of'Annotations 120 EnerJ FLEXJAVA FlexJava 440 240 696 109 100 Number of Annotations 80 60 40 20 0 52 51 40 30 24 13 14 8 11 9 8 3 3 6 3 6 sor smm mc fft lu sobel raytracer jmeint hessian zxing Benchmarks 2 to'17 reduction'in'the'number'of'annotations 31

Lower'is'Better User_study:'Annotation'Time 10'subjects'(programmers) Time'for'annotating'fft using'enerj and'flexjava 25 EnerJ FLEXJAVA FlexJava Annotation Time (min) 20 15 10 5 0 s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 avg Subjects 8 average'reduction'in'annotation'time 32

FLEXJAVA Language_compiler'coGdesign to Automate approximate'programming' Reduce programmer'effort Safety Scalability Modularity'and'reusability Generality 33

Replication'Package http://act_lab.org/artifacts/flexjava/ 1. Annotated'benchmarks' 2. Compiler'with'approximate'safety'analysis 3. Source'code'highlighting'tool' Open'source'git repository: https://bitbucket.org/act_lab/flexjava.code 34