Automatic Generation of a Code Generator for SHARC ADSP-2106x
Peter Aronsson, Levon Saldamli, Peter Fritzson
(petar, levsa, petfr)@ida.liu.se
Dept. of Computer and Information Science, Linköping University, Sweden
August 3,

Abstract

New DSP processors with increasingly complex instruction sets are continuously being developed. To master such complexity, it is becoming essential to quickly provide efficient high-level language compilers for these processors. This paper describes the use of new compiler generation tools (CoSy) to automatically generate a code generator for the Digital Signal Processor SHARC ADSP-2106x from a description of its instruction set. The resulting C compiler, which generates production-quality code, was produced by two master's students in 5 months. This gives an indication of the power and flexibility of generator tools, compared to traditional manual compiler implementations.

2 Introduction

This paper describes the generation and implementation of a code generator for the digital signal processor SHARC ADSP-2106x from Analog Devices Inc., using the Back End Generator tool, BEG, which is part of the CoSy compiler generation system[2][5]. CoSy is a compiler development tool, developed by ACE (bv) as a spin-off product from the ESPRIT projects COMPARE and PREPARE.

New DSP processors are developed all the time, so quickly developing compilers for them is important for their acceptance. Developing an entirely new compiler for each new processor is far too expensive. Using a compiler construction tool such as CoSy brings several advantages. First, since a compiler in CoSy is built up of several modules which can be reused in other compilers, the development time decreases substantially. In fact, implementing a new compiler for a certain DSP requires only the code generator to be specified. The other modules, such as the front end, can be reused.
Another great advantage is that generators for modules, or engines as they are called in CoSy, exist for optimizers and backends. These generators generate complete, or almost complete, engines from specifications. By using these generators, it is easier to guarantee a compiler of higher quality. Of course, the generator tool itself must then be well tested, so that it doesn't contain errors.

The C frontend delivered with CoSy has an optional DSP-C extension, which allows C programmers to, for instance, declare variables in different memory banks, or to declare a variable of fixed-point type, a type very common in DSP applications. It is also possible to declare an array to be circular, another common data structure in DSP software.

The Back End Generator, BEG, generates a set of engines that work on the internal representation produced by a frontend and produce the output file containing the program in the specified target language. In this case, the target language of the compiler is assembler code for the SHARC processor. BEG uses pattern matching combined with dynamic programming to translate the internal representation, which has the form of an abstract syntax tree, to assembler instructions. The internal representation is called CCMIR, which stands for Common CoSy Medium-level Intermediate Representation. Patterns are reduced to nonterminals, which can correspond to values stored in registers, or to addressing modes.

BEG also generates engines for register allocation and for instruction scheduling. In the current release of CoSy these work independently of each other. This has some disadvantages, especially when the processor has a VLIW architecture, since many operations have register constraints when executing in parallel.

3 DSP-C extension

The DSP-C extensions to the C language are fully integrated in the frontend engine, which translates the program into CCMIR. The CCMIR type system supports the DSP-specific variable declarations. For instance, the code:

    accum acc_val;
    fixed D signal[48];
    fixed P coeff[48];

declares a variable of type accum, which is a fixed-point number with both a fractional and an integral part. It also declares two arrays of type fixed, i.e. a number with only a fractional part. The array named signal is declared to be stored in data memory, hence the D keyword. The array named coeff is declared to be stored in program memory. The DSP-C extension also supports circular arrays, i.e.
an array can be declared as circular. Indexing the array beyond the boundary is safe because it wraps around back into the correct range. All type information is stored in the CCMIR and can be used by the backend to produce efficient code.

4 Back End Generator

BEG uses pattern matching and dynamic programming to select the best instructions for a given subtree in the CCMIR[2][4]. The Code Generator Description file (CGD-file), which is the input to BEG, consists of a set of rules and nonterminals, and a description for the scheduler. Each rule has a pattern that must match for the rule to apply. If the rule applies, it can reduce the part of the tree covered by the pattern to a nonterminal. A nonterminal can be one of four different types:

- Register, to represent a value stored in registers.
- Memory, to represent a storage location in memory.
- Addrmode, to represent an addressing mode.
- Unique, for values stored in some unique location.
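The combination of rule patterns, nonterminals and dynamic programming described above can be sketched in a few lines of Python. This is only an illustration of cost-based tree covering, not BEG's actual algorithm or data structures: the classes `Node` and `Rule`, the functions `label` and `select`, the nonterminal name `imm`, and all rule names and costs are assumptions made for the example. Only the CCMIR-style operator names are borrowed from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Node:
    op: str                 # operator name, e.g. "mirplus"
    kids: tuple = ()        # child nodes
    value: object = None    # payload for leaves (constant value, symbol name)


@dataclass(frozen=True)
class Rule:
    name: str               # hypothetical instruction/rule name
    op: str                 # operator at the root of the pattern
    kid_nts: tuple          # nonterminal required from each child
    result: str             # nonterminal the rule reduces to
    cost: int               # illustrative cost, not taken from any manual
    cond: Optional[Callable] = None   # extra condition on the matched node


# Hypothetical rule set; "imm" is an invented nonterminal for constants.
RULES = [
    Rule("addr", "mirobjectaddr", (), "addr", 0),
    Rule("load", "mircontent", ("addr",), "reg", 2),
    Rule("const", "mirintconst", (), "imm", 0),
    Rule("loadimm", "mirintconst", (), "reg", 1),
    Rule("add", "mirplus", ("reg", "reg"), "reg", 3),
    Rule("add1", "mirplus", ("reg", "imm"), "reg", 3,
         cond=lambda n: n.kids[1].value == 1),
]


def label(node):
    """Bottom-up pass: cheapest (cost, rule) per nonterminal for this subtree."""
    kid_tables = [label(k) for k in node.kids]
    best = {}
    for r in RULES:
        if r.op != node.op or len(r.kid_nts) != len(node.kids):
            continue
        if r.cond is not None and not r.cond(node):
            continue
        if any(nt not in t for nt, t in zip(r.kid_nts, kid_tables)):
            continue
        total = r.cost + sum(t[nt][0] for nt, t in zip(r.kid_nts, kid_tables))
        if r.result not in best or total < best[r.result][0]:
            best[r.result] = (total, r)
    return best


def select(node, goal):
    """Top-down pass: list the chosen rules in emission (post-order) order."""
    _, rule = label(node)[goal]
    chosen = []
    for kid, nt in zip(node.kids, rule.kid_nts):
        chosen += select(kid, nt)
    return chosen + [rule.name]


# Tree for the expression y + 1.
TREE = Node("mirplus", (
    Node("mircontent", (Node("mirobjectaddr", value="y"),)),
    Node("mirintconst", value=1),
))
```

Calling `select(TREE, "reg")` walks the tree once per level, which is wasteful but keeps the sketch short; a real generator would store the label tables on the nodes. Chain rules (one nonterminal reduced directly to another) are left out for the same reason.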
The rules and nonterminals are illustrated by the following example:

    x = y + 1;

    mirassign
        mirobjectaddr x
        mirplus
            mircontent
                mirobjectaddr y
            mirintconst 1

Figure 1: The pattern matching of rules on the CCMIR tree.

The statement above can be covered by the rules as shown in figure 1. Each area corresponds to a rule covering that specific subtree. For instance, the mirplus node can be reduced to a nonterminal that holds the value of the operation in a register. For that rule to match, the children of the mirplus node must also be covered by rules that reduce them to nonterminals holding their value in a register. The mirplus rule looks like this:

    RULE mirplus(rs:reg, c:mirintconst) -> rd:reg;
    COND { c.value == 1 }
    COST 3;
    TEMPLATE alu;
    EMIT { emit(add1,rs,rd); }

The TEMPLATE keyword tells the scheduler which resource template this rule allocates, i.e. which resources the assembler operation needs. In this case the operation is performed in the ALU, thus allocating a resource template named alu. The templates are also specified in the code generator description file, which supports allocating arbitrary resources for an arbitrary number of cycles.

4.1 Instructing the Scheduler

All operations in the SHARC processor have a latency of one[1]. That means the result of every operation is available in the next instruction cycle. However, BEG supports setting a different latency for each rule/instruction. Non-unit latencies are common in several DSP architectures and place tighter constraints on the instruction scheduler.

Many DSP architectures have register constraints on specific operations. For instance, the SHARC ADSP-2106x can issue an operation using the multiplier and the ALU in the same instruction cycle[1]. This can, however, only be done if the operands are taken from specific subsets of the register file. BEG supports specifying, in a rule, constraints on which registers may be used.
This is specified by adding the allowed registers after the nonterminal in the pattern. For instance, the rules for issuing a multiply and an ALU operation in the same cycle look like this:

    RULE [bi_multrealspec] o:mirmult (r1:reg<r0..r3>, r2:reg<r4..r7>) -> r:reg;
    TEMPLATE mulspec;
    ..

    RULE [bi_plusspec] o:mirplus (r1:reg<r8..r11>, r2:reg<r12..r15>) -> r:reg;
    TEMPLATE aluspec;
    ..

The templates for the two rules above are declared as taking up the multiplier resource and the ALU resource, respectively. Thus the two rules can be issued in the same instruction cycle. An ordinary mirplus rule has the alu template resource, which actually allocates all three functional units, since in general only one compute operation can be performed in a single cycle.

Since register allocation is performed prior to instruction scheduling, this implementation can sometimes produce slower code than one without the register constraints. Suppose the rules above are chosen, but the register allocator needs to perform a spill in order to fulfill the constraints. Then this approach produces two extra instructions, one for spilling and one for restoring the register. The best way to handle this problem would be to integrate the register allocator with the instruction scheduler. However, this is not possible in BEG without rewriting all generated engines yourself. Another solution could be to run the backend twice for each procedure: first with register constraints on the rules, then without them. The backend could then be instructed to select the better result.

4.2 Implementing Post Modify Addressing Mode

The SHARC has, like several other DSPs, a specific addressing mode for updating address pointers after accessing memory. This is very efficient when sequentially accessing the values in an array, as is typically done in, for instance, a FIR filter. Implementing this in BEG raises some problems. First, all expressions accessing arrays with indexes must be transformed into pointer expressions, so that the pointer can be post-incremented. Fortunately, there is an engine in the CoSy system that does this. Another problem is that the post-modify instruction actually originates from two statements in the CCMIR, the pointer increment statement and the memory access statement.
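As a rough illustration of the index-to-pointer rewriting and of the two statements behind each post-modify access, the Python sketch below models memory as a flat list and pointers as integer addresses. The function names, the memory layout and the use of Python itself are assumptions for the example; this is not the output of the CoSy engine mentioned above.

```python
def fir_indexed(mem, signal_base, coeff_base, n):
    """Multiply-accumulate loop written with array indexing."""
    acc = 0
    for i in range(n):
        acc += mem[signal_base + i] * mem[coeff_base + i]
    return acc


def fir_pointer(mem, signal_base, coeff_base, n):
    """Same loop after rewriting the indexed accesses into pointer form.

    Each array access is now a pair of statements: a memory access
    through the pointer, followed by an increment of that pointer.
    """
    ps, pc = signal_base, coeff_base
    acc = 0
    for _ in range(n):
        s = mem[ps]   # memory access statement
        ps += 1       # pointer increment statement
        c = mem[pc]   # memory access statement
        pc += 1       # pointer increment statement
        acc += s * c
    return acc
```

Both functions compute the same sum; the pointer form simply exposes each access/increment pair that a post-modify instruction would cover.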
BEG cannot reduce two different statements into one nonterminal, so this special case has to be handled separately. The solution was to handle them as two separate operations; if the scheduler places them in the same instruction, they are rewritten to a single operation.

5 Results

The backend was compared on a number of programs. Figure 2 gives some test results.

[Table: columns C-file, # instructions (a, b, c), ILP (a, b, c); rows 8q.c, fir.c, mov.c, mat.c, vss.c; numeric entries missing from the source.]

Figure 2: Test results for the compiler, compared with g21k from Analog Devices. a is g21k without optimization, b is g21k with optimization, and c is our compiler with optimization. ILP means the percentage of instructions issued in parallel.

The file 8q.c is the eight queens problem. It contains some nested loops and recursive function calls. The file fir.c is a simple FIR filter. The last three files contain matrix and vector manipulations. The tests presented here are only small examples, showing that the compiler does almost as well as g21k, the commercial compiler from Analog Devices. Comparing the assembler files from the two compilers shows that the major difference is that g21k has software pipelining implemented for a set of standard loops, such as a FIR filter. This optimization isn't yet available in CoSy. Runtime tests are presented in [6].

6 Conclusions

A drawback of BEG is that the scheduler only schedules per basic block. This limits the scheduler's options for packing instructions. To get a better schedule, an algorithm working on larger code segments has to be used, for instance trace scheduling[3]. However, these limitations didn't affect the implemented code generator for the SHARC processor that much, mostly because the ADSP-2106x in general has only two issue slots, containing a compute operation and a move operation. If some register constraints are fulfilled, three issue slots can be filled: two compute operations, one in the ALU and one in the multiplier, can run in the same instruction as a move operation. Some test results from the backend gave a rather low percentage of parallel instructions, typically between 5 and 30 percent.

A conclusion drawn from this work is that a code generator for a processor can be implemented in about eight to ten man-months, resulting in a compiler that produces almost as good code as a commercial compiler. Of course, a better backend, supporting more optimizations, can be produced if the development time is increased a bit. Note also that the work included learning the CoSy system. This is a substantial part of the effort, since CoSy is a large system that takes a while to fully understand and master. Approximately half of the time was dedicated to learning the system. This learning process was integrated with development, with the effect that some design decisions could, in hindsight, probably have been better.
To summarize, one could say that developing a compiler using the CoSy system is far more resource-efficient and less error-prone than using conventional methods. The fact that many optimizers and DSP extensions already exist in the CoSy system makes the development time even shorter.

References

[1] Analog Devices, Inc. ADSP-2106x SHARC User's Manual. Analog Devices, Inc., first edition.
[2] Niclas Andersson and Peter Fritzson. Overview and industrial application of code generator generators. Journal of Systems and Software.
[3] J.A. Fisher. Trace scheduling: A technique for global microcode compaction. IEEE Transactions on Computers, 30(7).
[4] H. Emmelmann, F. W. Schröer, and R. Landwehr. BEG - a generator for efficient back ends. ACM SIGPLAN Notices, 24(7).
[5] Martin Alt, Uwe Assmann, and Hans von Someren. Cosy compiler phase embedding with the CoSy compiler model. In Peter A. Fritzson, editor, Compiler Construction.
[6] Peter Aronsson and Levon Saldamli. Code generator for SHARC ADSP-2106x. Master's thesis, Dept. of Computer and Information Science, Linköping University, 1999.
More informationD Programming Language
Group 14 Muazam Ali Anil Ozdemir D Programming Language Introduction and Why D? It doesn t come with a religion this is written somewhere along the overview of D programming language. If you actually take
More informationWACC Report. Zeshan Amjad, Rohan Padmanabhan, Rohan Pritchard, & Edward Stow
WACC Report Zeshan Amjad, Rohan Padmanabhan, Rohan Pritchard, & Edward Stow 1 The Product Our compiler passes all of the supplied test cases, and over 60 additional test cases we wrote to cover areas (mostly
More informationCS415 Compilers. Intermediate Represeation & Code Generation
CS415 Compilers Intermediate Represeation & Code Generation These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Review - Types of Intermediate Representations
More informationAdvanced Parallel Architecture Lesson 3. Annalisa Massini /2015
Advanced Parallel Architecture Lesson 3 Annalisa Massini - Von Neumann Architecture 2 Two lessons Summary of the traditional computer architecture Von Neumann architecture http://williamstallings.com/coa/coa7e.html
More informationLecture 4: Instruction Set Design/Pipelining
Lecture 4: Instruction Set Design/Pipelining Instruction set design (Sections 2.9-2.12) control instructions instruction encoding Basic pipelining implementation (Section A.1) 1 Control Transfer Instructions
More informationAssembly Code Conversion of Software-Pipelined Loop between two VLIW DSP Processors
Assembly Code Conversion of Software-Pipelined Loop between two VLIW DSP Processors Bogong Su 1 Jian Wang 2 Erh-Wen Hu 1 Joseph Manzano 1 (973)720-2979 (514) 818-2541 (973)720-2196 (973)720-2649 sub@wpunj.edu
More informationLecture 7: Binding Time and Storage
Lecture 7: Binding Time and Storage COMP 524 Programming Language Concepts Stephen Olivier February 5, 2009 Based on notes by A. Block, N. Fisher, F. Hernandez-Campos, and D. Stotts Goal of Lecture The
More informationLOW-COST SIMD. Considerations For Selecting a DSP Processor Why Buy The ADSP-21161?
LOW-COST SIMD Considerations For Selecting a DSP Processor Why Buy The ADSP-21161? The Analog Devices ADSP-21161 SIMD SHARC vs. Texas Instruments TMS320C6711 and TMS320C6712 Author : K. Srinivas Introduction
More informationWhat do Compilers Produce?
What do Compilers Produce? Pure Machine Code Compilers may generate code for a particular machine, not assuming any operating system or library routines. This is pure code because it includes nothing beyond
More informationAn introduction to Digital Signal Processors (DSP) Using the C55xx family
An introduction to Digital Signal Processors (DSP) Using the C55xx family Group status (~2 minutes each) 5 groups stand up What processor(s) you are using Wireless? If so, what technologies/chips are you
More informationProject Compiler. CS031 TA Help Session November 28, 2011
Project Compiler CS031 TA Help Session November 28, 2011 Motivation Generally, it s easier to program in higher-level languages than in assembly. Our goal is to automate the conversion from a higher-level
More informationIntroduction to Compiler Construction
Introduction to Compiler Construction ASU Textbook Chapter 1 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 What is a compiler? Definitions: A recognizer. A translator. source
More informationmultiple variables having the same value multiple variables having the same identifier multiple uses of the same variable
PART III : Language processing, interpretation, translation, the concept of binding, variables, name and scope, Type, l-value, r-value, reference and unnamed variables, routines, generic routines, aliasing
More informationNOTE: Answer ANY FOUR of the following 6 sections:
A-PDF MERGER DEMO Philadelphia University Lecturer: Dr. Nadia Y. Yousif Coordinator: Dr. Nadia Y. Yousif Internal Examiner: Dr. Raad Fadhel Examination Paper... Programming Languages Paradigms (750321)
More informationUNIT TESTING OF C++ TEMPLATE METAPROGRAMS
STUDIA UNIV. BABEŞ BOLYAI, INFORMATICA, Volume LV, Number 1, 2010 UNIT TESTING OF C++ TEMPLATE METAPROGRAMS ÁBEL SINKOVICS Abstract. Unit testing, a method for verifying a piece of software, is a widely
More informationCOMPILER DESIGN - RUN-TIME ENVIRONMENT
COMPILER DESIGN - RUN-TIME ENVIRONMENT http://www.tutorialspoint.com/compiler_design/compiler_design_runtime_environment.htm Copyright tutorialspoint.com A program as a source code is merely a collection
More informationFixed-Point Math and Other Optimizations
Fixed-Point Math and Other Optimizations Embedded Systems 8-1 Fixed Point Math Why and How Floating point is too slow and integers truncate the data Floating point subroutines: slower than native, overhead
More informationPrinciples of Programming Languages COMP251: Syntax and Grammars
Principles of Programming Languages COMP251: Syntax and Grammars Prof. Dekai Wu Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong, China Fall 2007
More informationIntro. Scheme Basics. scm> 5 5. scm>
Intro Let s take some time to talk about LISP. It stands for LISt Processing a way of coding using only lists! It sounds pretty radical, and it is. There are lots of cool things to know about LISP; if
More informationLow Power and Memory Efficient FFT Architecture Using Modified CORDIC Algorithm
Low Power and Memory Efficient FFT Architecture Using Modified CORDIC Algorithm 1 A.Malashri, 2 C.Paramasivam 1 PG Student, Department of Electronics and Communication K S Rangasamy College Of Technology,
More informationIntroduction to Compiler Construction
Introduction to Compiler Construction ALSU Textbook Chapter 1.1 1.5 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 What is a compiler? Definitions: a recognizer ; a translator.
More informationHW1 Solutions. Type Old Mix New Mix Cost CPI
HW1 Solutions Problem 1 TABLE 1 1. Given the parameters of Problem 6 (note that int =35% and shift=5% to fix typo in book problem), consider a strength-reducing optimization that converts multiplies by
More informationModel-based Software Development
Model-based Software Development 1 SCADE Suite Application Model in SCADE (data flow + SSM) System Model (tasks, interrupts, buses, ) SymTA/S Generator System-level Schedulability Analysis Astrée ait StackAnalyzer
More informationSection 6 Blackfin ADSP-BF533 Memory
Section 6 Blackfin ADSP-BF533 Memory 6-1 a ADSP-BF533 Block Diagram Core Timer 64 L1 Instruction Memory Performance Monitor JTAG/ Debug Core Processor LD0 32 LD1 32 L1 Data Memory SD32 DMA Mastered 32
More informationPerformance. frontend. iratrecon - rational reconstruction. sprem - sparse pseudo division
Performance frontend The frontend command is used extensively by Maple to map expressions to the domain of rational functions. It was rewritten for Maple 2017 to reduce time and memory usage. The typical
More informationCS1102: Macros and Recursion
CS1102: Macros and Recursion Kathi Fisler, WPI October 5, 2009 This lecture looks at several more macro examples. It aims to show you when you can use recursion safely with macros and when you can t. 1
More informationExcerpt from: Stephen H. Unger, The Essence of Logic Circuits, Second Ed., Wiley, 1997
Excerpt from: Stephen H. Unger, The Essence of Logic Circuits, Second Ed., Wiley, 1997 APPENDIX A.1 Number systems and codes Since ten-fingered humans are addicted to the decimal system, and since computers
More informationThe role of semantic analysis in a compiler
Semantic Analysis Outline The role of semantic analysis in a compiler A laundry list of tasks Scope Static vs. Dynamic scoping Implementation: symbol tables Types Static analyses that detect type errors
More informationUsing Intel Streaming SIMD Extensions for 3D Geometry Processing
Using Intel Streaming SIMD Extensions for 3D Geometry Processing Wan-Chun Ma, Chia-Lin Yang Dept. of Computer Science and Information Engineering National Taiwan University firebird@cmlab.csie.ntu.edu.tw,
More information