Compiler Structure. Backend. Input program. Intermediate representation. Assembly or machine code. Code generation Code optimization

Similar documents
Model-based Software Development

Embedded Systems Development

Compilers. Lecture 2 Overview. (original slides by Sam

Independent DSP Benchmarks: Methodologies and Results. Outline

COMP 181 Compilers. Administrative. Last time. Prelude. Compilation strategy. Translation strategy. Lecture 2 Overview

Compiler Code Generation COMP360

Compilers and Interpreters

Compiler Design. Computer Science & Information Technology (CS) Rank under AIR 100

REAL TIME DIGITAL SIGNAL PROCESSING

General Purpose Signal Processors

Separating Reality from Hype in Processors' DSP Performance. Evaluating DSP Performance

Compiling Techniques

Better sharc data such as vliw format, number of kind of functional units

Instruction Set Architecture (Contd)

Introduction to Compiler

Compilers and Code Optimization EDOARDO FUSELLA

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

Compiler Design. Subject Code: 6CS63/06IS662. Part A UNIT 1. Chapter Introduction. 1.1 Language Processors

REAL TIME DIGITAL SIGNAL PROCESSING

The Structure of a Syntax-Directed Compiler

Compiler Structure. Lexical. Scanning/ Screening. Analysis. Syntax. Parsing. Analysis. Semantic. Context Analysis. Analysis.

Advances in Compilers

An introduction to DSP s. Examples of DSP applications Why a DSP? Characteristics of a DSP Architectures

What is a compiler? Xiaokang Qiu Purdue University. August 21, 2017 ECE 573

WS_CCESSH-OUT-v1.00.doc Page 1 of 8

CS 4201 Compilers 2014/2015 Handout: Lab 1

A Simple Syntax-Directed Translator

Front End. Hwansoo Han

Group B Assignment 9. Code generation using DAG. Title of Assignment: Problem Definition: Code generation using DAG / labeled tree.

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Computer Architecture and Organization

CS 415 Midterm Exam Spring SOLUTION

Syntax Errors; Static Semantics

Embedded Systems. 7. System Components

Compilers Crash Course

Embedded Systems: Hardware Components (part I) Todor Stefanov

Compiling Regular Expressions COMP360

Ascenium: A Continuously Reconfigurable Architecture. Robert Mykland Founder/CTO August, 2005

Compiler Design (40-414)

Generic Software pipelining at the Assembly Level

Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institution of Technology, Delhi

REAL TIME DIGITAL SIGNAL PROCESSING

COMPILER CONSTRUCTION FOR A NETWORK IDENTIFICATION SUMIT SONI PRAVESH KUMAR

HDL. Operations and dependencies. FSMs Logic functions HDL. Interconnected logic blocks HDL BEHAVIORAL VIEW LOGIC LEVEL ARCHITECTURAL LEVEL

Introduction. Compiler Design CSE Overview. 2 Syntax-Directed Translation. 3 Phases of Translation

Low-power Architecture. By: Jonathan Herbst Scott Duntley

MidTerm Papers Solved MCQS with Reference (1 to 22 lectures)

Design of Embedded DSP Processors Unit 7: Programming toolchain. 9/26/2017 Unit 7 of TSEA H1 1

Compiler Architecture

Unit 2: High-Level Synthesis

Digital Signal Processor Core Technology

The Structure of a Syntax-Directed Compiler

Principles of Compiler Design

DIGITAL VS. ANALOG SIGNAL PROCESSING Digital signal processing (DSP) characterized by: OUTLINE APPLICATIONS OF DIGITAL SIGNAL PROCESSING

tokens parser 1. postfix notation, 2. graphical representation (such as syntax tree or dag), 3. three address code

Computer Hardware Requirements for Real-Time Applications

A simple syntax-directed

Building a Runnable Program and Code Improvement. Dario Marasco, Greg Klepic, Tess DiStefano

Outline. Lecture 17: Putting it all together. Example (input program) How to make the computer understand? Example (Output assembly code) Fall 2002

Code Generation. CS 540 George Mason University

Welcome to CSE131b: Compiler Construction

Overview of Compiler. A. Introduction

Embedded Computing Platform. Architecture and Instruction Set

CPS 506 Comparative Programming Languages. Syntax Specification

VLSI Signal Processing

Static Semantics. Lecture 15. (Notes by P. N. Hilfinger and R. Bodik) 2/29/08 Prof. Hilfinger, CS164 Lecture 15 1

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

COP 3402 Systems Software. Lecture 4: Compilers. Interpreters

HW/SW Co-design. Design of Embedded Systems Jaap Hofstede Version 3, September 1999

TMS320C3X Floating Point DSP

Principles of Compiler Design

Copyright 2012, Elsevier Inc. All rights reserved.

High-level View of a Compiler

Program Analysis ( 软件源代码分析技术 ) ZHENG LI ( 李征 )

Embedded Systems. 8. Hardware Components. Lothar Thiele. Computer Engineering and Networks Laboratory

Embedded C for High Performance DSP Programming with the CoSy Compiler Development System

CS 321 IV. Overview of Compilation

When do We Run a Compiler?

G Compiler Construction Lecture 12: Code Generation I. Mohamed Zahran (aka Z)

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

COMPILER DESIGN. For COMPUTER SCIENCE

1. The output of lexical analyser is a) A set of RE b) Syntax Tree c) Set of Tokens d) String Character

An Overview to Compiler Design. 2008/2/14 \course\cpeg421-08s\topic-1a.ppt 1

An introduction to Digital Signal Processors (DSP) Using the C55xx family

VLSI Design Automation. Maurizio Palesi

Microprocessors, Lecture 1: Introduction to Microprocessors

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

CS412/413. Introduction to Compilers Tim Teitelbaum. Lecture 2: Lexical Analysis 23 Jan 08

Parallel-computing approach for FFT implementation on digital signal processor (DSP)

Introduction to Compilers

EECS150 - Digital Design Lecture 09 - Parallelism

LECTURE 3. Compiler Phases

Crafting a Compiler with C (II) Compiler V. S. Interpreter

Overview. Overview. Programming problems are easier to solve in high-level languages

Instruction Set Architecture

1 Lexical Considerations

Processor Applications. The Processor Design Space. World s Cellular Subscribers. Nov. 12, 1997 Bob Brodersen (

The Structure of a Compiler

Advance CPU Design. MMX technology. Computer Architectures. Tien-Fu Chen. National Chung Cheng Univ. ! Basic concepts

CS606- compiler instruction Solved MCQS From Midterm Papers

Transcription:

Compiler Structure Frontend Backend Input program Intermediate representation Assembly or machine code Syntactic and semantic analysis Code generation Code optimization void main(void) { int a, b, c; c=a+b; r1=dm(i7, m7) r2=pm(i8, m8) r3=r1+r2 } Embedded Systems 2002/2003 (c) Daniel Kästner. 1

Detailed Compiler Structure Source (Text) Decorated Syntax Tree Lexical Analysis High-Level Optimizations Tokenized Program Intermediate Representation Syntax Analysis Code Generation Syntax Tree Semantic Analysis Machine Program Embedded Systems 2002/2003 (c) Daniel Kästner. 2

Lexical Analysis (Scanning & Screeing) Input: Program text as sequence of characters. Output: Program text as sequence of symbols (tokens). 1. Read Input file. 2. Report errors about symbols illegal in the programming language. 3. Screening subtask: Identify language keywords and standard identifiers. Elimininate "white-spaces", e.g. consecutive blanks and newlines. Count line numbers. Embedded Systems 2002/2003 (c) Daniel Kästner. 3

Lexical Analysis (Scanning) Embedded Systems 2002/2003 (c) Daniel Kästner. 4

Syntax Analysis (Parsing) Input: Sequence of symbols (tokens). Output: Structure of the program: concrete syntax tree, abstract syntax tree, or parse (derivation). Syntax errors: report (as many as possible) syntax errors, diagnose syntax errors, correct syntax errors. Embedded Systems 2002/2003 (c) Daniel Kästner. 5

Syntax Analysis (Parsing) Embedded Systems 2002/2003 (c) Daniel Kästner. 6

Semantic Analysis Input: Abstract syntax tree. Output: Abstract syntax tree decorated with attributes, e.g. types of subexpressions. Report semantic errors, e.g. undeclared variables, type mismatches. Resolve usage of variables: identify applied occurrences of variables with their declarations. Compute type of every (sub-)expression. Embedded Systems 2002/2003 (c) Daniel Kästner. 7

Semantic Analysis 1 Embedded Systems 2002/2003 (c) Daniel Kästner. 8

Code Selection, Register Allocation, and Instruction Scheduling Code selection Register allocation c=a+b; load adr(a) load adr(b) add store adr(c) b a c a a r1 b a r2 c a r3 Instruction scheduling r1=load adr(a) r2=load adr(b) r2=load r3=add r1, adr(b) r2 r3=add store adr(c), r1, r2 r3 store adr(c), r3 Embedded Systems 2002/2003 (c) Daniel Kästner. 9

Phase Coupling Problem d1=s1*s1; d2=s1+s2; r3=r1*r1 store r3 r3=r1+r2 store r3 r3=r1*r1, r4=r1+r2 store r3 store r4 Optimize number of used registers Optimize code speed and size Embedded Systems 2002/2003 (c) Daniel Kästner. 10

Main Tasks of Code Generation (1) Code selection: Map the intermediate representation to a semantically equivalent sequence of machine operations that is as efficient as possible. Register allocation: Map the values of the intermediate representation to physical registers in order to minimize the number of memory references during program execution. Register allocation proper: Decide which variables and expressions of the IR are mapped to registers and which ones are kept in memory. Register assignment: Determine the physical registers that are used to store the values that have been previously selected to reside in registers. Embedded Systems 2002/2003 (c) Daniel Kästner. 11

Main Tasks of Code Generation (2) Instruction scheduling: Reorder the produced operation stream in order to minimize pipeline stalls and exploit the available instruction-level parallelism. Resource allocation / functional unit binding: Bind operations to machine resources, e.g. functional units or buses. Embedded Systems 2002/2003 (c) Daniel Kästner. 12

The Code Generation Problem Instruction scheduling, register allocation and code selection are NP complete problems. In classical approaches they are addressed by heuristic methods in separate phases. Unfortunately, all the code generation phases are interdependent, i.e. decisions made in one phase may impose restrictions to the other phases. Thus: often suboptimal combination of suboptimal partial results. Moreover: specific/irregular hardware features not well covered by standard code generation methods. Embedded Systems 2002/2003 (c) Daniel Kästner. 13

The DSPStone Study Evaluation of the performance of DSP compilers and joint compiler/processor systems. Evaluated compilers: Analog Devices ADSP2101, AT&T DSP1610, Motorola DSP56001, NEC mpd77016, TI TMS320C51. Hand-crafted assembly code is compared to the compilergenerated code. Result: overhead between 100% and 1000% of compilergenerated code is typical! Embedded Systems 2002/2003 (c) Daniel Kästner. 14

Compiling for DSPs Code quality of traditional high-level language compilers is not satisfactory. Thus: Assembly programming. Increasing size of DSP applications and time to market pressure Deficiencies of assembly programming:! time consuming! error prone! Bad portability! Bad maintainability Urgent demand for the use of high-level languages Embedded Systems 2002/2003 (c) Daniel Kästner. 15

Case Study: Infineon TriCore C code (FIR Filter): int i,j,sum; for (i=0;i<n-m;i++) { sum=0; for (j=0;j<m;j++) { sum+=array1[i+j]*coeff[j]; } output[i]=sum>>15; } Compiler-generated code (gcc):.l21: add %d15,%d3,%d1 addsc.a %a15,%a4,%d15,1 addsc.a %a2,%a5,%d1,1 mov %d4,49 ld.h %d0,[%a15]0 ld.h %d15,[%a2]0 madd %d2,%d2,%d0,%d15 add %d1,%d1,1 jge %d4,%d1,.l21 Hand-written code (inner loop): _8: ld16.w d5, [a2+] ld16.w d4, [a3+] madd.h e8,e8,d5,d4ul,#0 loop a7,_8! Zero-Overhead Loops! SIMD Instructions! Auto-modify addressing 6 times faster! (execution in SRAM) Embedded Systems 2002/2003 (c) Daniel Kästner. 16

Classification of Microprocessors Microprocessors General Purpose Processors ( GPP) Application Specific Processors ( ASP) Requirements: high performance low cost low power consumption GPP proper: general purpose applications Microcontrollers: industrial applications DSP (Digital Signal Processor): programmable microprocessor for extensive numerical real-time computations ASIC (Application Specific Integrated Circuit): algorithm completely implemented in hardware ASIP (Application Specific Instruction Set Processor): programmable microprocessor where hardware and instruction set are designed together for one special application Specialization Embedded Systems 2002/2003 (c) Daniel Kästner. 17

Architectural Valuation [Campbell, Northrop Grumman Corporation, 1998] More efficient architectures will use less energy to complete the same task on the same generation CMOS solid state technology 2 Power consumption P CV f N C : Capacitance V : CPU Core Voltage f : CPU clock frequency N : Number of gates changing state = Embedded Systems 2002/2003 (c) Daniel Kästner. 18

Architectural Valuation Observations: Higher performance by increasing the clock frequency does not change the performance per power ratio Performance Power = O Cycle CV Cycle Time 2 f N Of = 2 CV f N O = 2 CV N A voltage decrease improves performance per power non-linearly Performance Power O CV f N = 2 Embedded Systems 2002/2003 (c) Daniel Kästner. 19

Architectural Valuation Architectural specialization as a measure for how well the architecture fits a given target application. Estimation of architectural specialization: Performance per power. Embedded Systems 2002/2003 (c) Daniel Kästner. 20

Comparion of Performace Per Power Ratios FFT MFLOPS/WATT 80 70 60 50 40 30 20 10 0 Q Q RISC64 58 = 3.14 DSP32 18 Q Q 10.9 = 4.8 new RISC64 = old RISC64 64 128 256 512 1024 2048 4096 8192 16384 32768 Complex FFT Length (Points) 2.3 Alpha 21064A, 275 MHz, 275 MFLOP Peak, 3.3 Volts, 33+8 Watts Alpha 21164, 333 MHz, 666 MFLOP Peak, 2.2 Volt, 25.4+6 Watts SHARC 21060, 40 MHz, 120 MFLOP Peak, 3.3 Volt, 1.75 Watts Embedded Systems 2002/2003 (c) Daniel Kästner. 21