Application Specific Signal Processors 521281S
Dept. of Computer Science and Engineering
Mehdi Safarpour
23.9.2018
Course contents
Lecture contents:
1. Introduction and number formats
2. Signal processor architectures
3. Transport-triggered processors
4. Program code and compilation
5. Performance optimization
6. Tools and FPGAs
What you will learn this time
- Structure of a compiler
- How to improve program performance by SW changes
Introduction
In Lecture 2 we concentrated on the processor. This time we will concentrate on the compiler.
Human-readable code -> Compiler -> Machine-readable code -> Processor -> Output
Compiler structure
Human-readable code -> Front-end -> Intermediate representation -> Optimization -> Intermediate representation -> Back-end -> Target machine program
The front-end, the optimization and the back-end together form the compiler.
Compiler structure
Source languages: C, C++, Fortran, Java...
Front-end -> Intermediate representation -> Optimization -> Intermediate representation -> Back-end
Target processors: x86, ARM, c6x, TTA
Language-dependent front-end, code optimizer, processor-dependent back-end
Figure source: Aho, Lam, Sethi, Ullman, Compilers: Principles, Techniques, and Tools (2007)
Lexical analysis: divides the source code into parts
For example, y = x + 3 is divided into:
    y    symbol 1
    =    token (=)
    x    symbol 2
    +    token (+)
    3    number 1
Lexical analysis: divides the source code into parts
An example of input that causes an error in the lexical analyzer:
    6a = x + c;
Assuming this is the C language, identifiers cannot start with a digit.
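The division of y = x + 3 into symbols, operator tokens and numbers can be sketched in a few lines of C. This is only a minimal illustration, not how a production lexer is built. The names TokenKind, Token and tokenize are made up for this example, and only identifiers, decimal numbers and a handful of single-character operators are handled. A digit run followed directly by a letter (as in 6a) is reported as an error.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token categories for this sketch. */
typedef enum { TOK_SYMBOL, TOK_NUMBER, TOK_OPERATOR } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start; /* first character of the lexeme in the source */
    size_t length;     /* number of characters in the lexeme */
} Token;

/* Splits src into at most max_tokens tokens.
 * Returns the number of tokens written, or -1 on a lexical error
 * (or if the output buffer is too small). */
int tokenize(const char *src, Token *out, int max_tokens)
{
    int count = 0;
    const char *p = src;

    while (*p != '\0') {
        if (isspace((unsigned char)*p)) {
            p++;
            continue;
        }
        if (count == max_tokens) {
            return -1;
        }

        Token t;
        t.start = p;
        if (isalpha((unsigned char)*p) || *p == '_') {
            /* Identifier: letter or underscore, then letters, digits, underscores. */
            while (isalnum((unsigned char)*p) || *p == '_') {
                p++;
            }
            t.kind = TOK_SYMBOL;
        } else if (isdigit((unsigned char)*p)) {
            while (isdigit((unsigned char)*p)) {
                p++;
            }
            /* "6a": a letter directly after the digits is rejected. */
            if (isalpha((unsigned char)*p) || *p == '_') {
                return -1;
            }
            t.kind = TOK_NUMBER;
        } else if (*p == '=' || *p == '+' || *p == '-' ||
                   *p == '*' || *p == '/' || *p == ';') {
            /* Single-character operators and punctuation. */
            p++;
            t.kind = TOK_OPERATOR;
        } else {
            return -1; /* unknown character */
        }
        t.length = (size_t)(p - t.start);
        out[count++] = t;
    }
    return count;
}
```

Calling tokenize on "y = x + 3" yields five tokens (symbol, operator, symbol, operator, number), while "6a = x + c;" is rejected.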
Symbol table: maintains a list of the symbols that exist in the code.
Generally there are multiple symbol tables for different parts of the code (for example, one per scope).
Syntax analyzer (parser)
Analyzes the relations between symbols, tokens and numbers according to precedence rules etc.
Example: y = x + 3 is parsed into the tree
        =
       / \
      y   +
         / \
        x   3
Syntax analyzer (parser)
Example of a syntax error: y = * + 3
        =
       / \
      y   ?
The parser cannot build a valid subtree for the right-hand side.
Semantic analyzer
Checks that the program is semantically consistent with the language definition, for example by type checking.
Example of a semantic error (an array index must have an integer type):
    int array[10];
    float index;
    return array[index];
Intermediate code generator
Translates the output of the previous stages into an abstract language that the compiler uses for optimization etc.
Two major styles of intermediate representation exist:
1) syntax trees
2) three-address codes
Example expression: a + a * (b - c) + (b - c) * d
1) Syntax tree (figure): the root + has the children + and *. The left + has the children a and *, and that * has the children a and -. The right * has the children - and d. The - node (children b and c) is shared by both * nodes.
2) Three-address code:
    t1 = b - c
    t2 = a * t1
    t3 = a + t2
    t4 = t1 * d
    t5 = t3 + t4
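The three-address form of a + a * (b - c) + (b - c) * d can be written out directly in C to check that it computes the same value as the original expression. This is a small sketch for illustration. The function names eval_direct and eval_three_address are made up here, and int operands are an assumption.

```c
/* Direct evaluation of the example expression. */
int eval_direct(int a, int b, int c, int d)
{
    return a + a * (b - c) + (b - c) * d;
}

/* The same computation as three-address code: every statement has at most
 * one operator on its right-hand side, and t1 (b - c) is computed once
 * and then used twice. */
int eval_three_address(int a, int b, int c, int d)
{
    int t1 = b - c;
    int t2 = a * t1;
    int t3 = a + t2;
    int t4 = t1 * d;
    int t5 = t3 + t4;
    return t5;
}
```

Because each temporary holds exactly one intermediate result, this form is convenient for the optimizer: the reuse of t1 is already explicit.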
Reminder: this is the compiler front-end.
The front-end is language-dependent.
Code optimizer: makes transformations to the program so that it executes faster but behaves identically to the original version.
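One example of such a behavior-preserving transformation is moving a loop-invariant computation out of a loop. The sketch below shows the idea at the C level. The function names are hypothetical, and a real optimizer performs this step on the intermediate representation, not on the source code.

```c
/* Straightforward version: scale * offset is recomputed in every iteration. */
void add_bias_plain(int *data, int n, int scale, int offset)
{
    for (int i = 0; i < n; i++) {
        data[i] = data[i] + scale * offset;
    }
}

/* Transformed version: the product does not depend on i, so it is
 * computed once before the loop. The results are identical. */
void add_bias_hoisted(int *data, int n, int scale, int offset)
{
    const int bias = scale * offset;
    for (int i = 0; i < n; i++) {
        data[i] = data[i] + bias;
    }
}
```

Both functions leave the same values in data; only the number of multiplications differs.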
Exercise
Together with your pair, answer the questions:
1) What does the C source code do?
2) What does the unoptimized assembly code contain?
3) What does the optimized assembly do?
Description of puts():
    int puts ( const char * str );
Writes the C string pointed to by str to the standard output (stdout) and appends a newline character ('\n').
Code generator: produces assembly instructions or machine code for a specific processor out of the intermediate representation.
Machine-dependent code optimizer: optimizes the assembly/machine code for a specific platform, for example by instruction scheduling.
Loop scheduling
Software pipelining: reorganize a loop so that parts of different loop iterations run in parallel.
A prolog and an epilog are needed around the loop to set up and drain the pipeline.
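The structure of a software-pipelined loop (prolog, kernel, epilog) can be illustrated by pipelining a simple loop by hand in C. This is only a sketch under simplifying assumptions: the loop body is split into two stages (load, then multiply and store), and the function names are made up. On a real processor the compiler does this at the instruction level, where the two stages of the kernel can execute in the same cycle.

```c
/* Reference loop: both stages of one iteration run back to back. */
void scale_plain(const int *in, int *out, int n, int gain)
{
    for (int i = 0; i < n; i++) {
        int x = in[i];     /* stage 1: load */
        out[i] = x * gain; /* stage 2: multiply and store */
    }
}

/* Hand-pipelined version of the same loop. */
void scale_pipelined(const int *in, int *out, int n, int gain)
{
    if (n <= 0) {
        return;
    }

    /* Prolog: start the first iteration (stage 1 only). */
    int x = in[0];

    /* Kernel: stage 2 of iteration i overlaps with stage 1 of iteration i + 1. */
    for (int i = 0; i < n - 1; i++) {
        int next = in[i + 1]; /* stage 1 of iteration i + 1 */
        out[i] = x * gain;    /* stage 2 of iteration i */
        x = next;
    }

    /* Epilog: finish the last iteration (stage 2 only). */
    out[n - 1] = x * gain;
}
```

Both versions produce the same output. The pipelined one makes explicit which operations of neighboring iterations are independent and can therefore be scheduled in parallel.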
Practical compiler-related issues
Key question: how to make the program run more efficiently on the processor?
Efficiency means:
- Avoiding behavior that is not needed (e.g. memory accesses)
- Improving the utilization of processor resources (fewer empty instructions)
As a consequence, the execution time improves and the power consumption is reduced.
The software methods for achieving this depend somewhat on the processor and on the programming language that is used.
For TTA processors the following issues slow down the program:
1. Global data
2. Use of pointers in some cases (int *p)
3. Data size changes (8-bit integer, 16-bit integer, etc.)
4. Conditionals (if-then-else, switch)
1. Global data
In the C language, it is preferable to define variables and arrays inside functions, which makes them local:
    int main() { int x;... }
However, sometimes, by accident or for some important reason, variables are defined outside functions, which makes them global:
    int x; int main() {... }
Reading and writing global variables is slow. The reason is that a global variable is likely to be placed in memory rather than in a register file.
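When a global variable cannot be avoided, one common workaround is to do the work on a local copy and write the global only once. The sketch below illustrates this. The names g_sum, sum_into_global and sum_via_local are made up for the example, and whether it actually helps depends on the compiler and the target, since an optimizing compiler may already do the same on its own.

```c
int g_sum; /* global: likely placed in data memory */

/* Accumulates directly into the global; every iteration may read and
 * write data memory. */
void sum_into_global(const int *data, int n)
{
    g_sum = 0;
    for (int i = 0; i < n; i++) {
        g_sum += data[i];
    }
}

/* Accumulates into a local, which can stay in a register, and writes
 * the global once after the loop. */
void sum_via_local(const int *data, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += data[i];
    }
    g_sum = sum;
}
```

Both functions leave the same value in g_sum.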
2. Pointers
In the C language, pointers are problematic because they can point to any place in memory. Because of this, the C compiler cannot optimize the code around a pointer very much. As a consequence, code containing pointers may run slowly and/or access the data memory more often.
2. Pointers
These cases can make pointers problematic for the compiler:
- Pointers as function parameters:
    int filter (int *x, int *y) { }
- Pointer arithmetic:
    int array[3];
    *(array + i) = value;
2. Pointers
How to avoid the performance loss caused by pointers:
- Use the restrict keyword:
    int filter (int * restrict x, int * restrict y) { }
- Index arrays with []:
    int array[3];
    array[i] = value;
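A small sketch of both recommendations together is given below. The function pairwise_sum is hypothetical and is not the filter from the exercises. The restrict qualifier (C99) is a promise by the programmer that x and y do not overlap, so the compiler is free to keep loaded values of x in registers across the stores to y. If the promise is broken, the behavior is undefined.

```c
/* Computes y[i] = x[i] + x[i + 1] for i = 0 .. n - 1.
 * Assumption: x has at least n + 1 elements and does not overlap y. */
void pairwise_sum(const int * restrict x, int * restrict y, int n)
{
    for (int i = 0; i < n; i++) {
        /* Arrays are indexed with [] instead of pointer arithmetic. */
        y[i] = x[i] + x[i + 1];
    }
}
```

Without restrict, the compiler would have to assume that writing y[i] may change x[i + 1] and reload it from memory in the next iteration.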
3. Data size changes
Data size changes force the compiler to insert extra operations into the program code:
    short val1; int val2;
    val2 = val1;
A sign extension (sxhw) is inserted.
Or:
    unsigned int data;
    _TCE_FIFO_U8_STREAM_IN(0, data, status);
An and-operation is inserted.
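What these inserted operations compute can be written out in plain C. The sketch below is only an illustration of the arithmetic, not of the code the TTA compiler generates. The function names are made up, and the fixed-width types from stdint.h are used in place of short and int to make the sizes explicit.

```c
#include <stdint.h>

/* Effect of a 16-bit to 32-bit sign extension: bit 15 of the input is
 * copied into bits 16..31 of the result. */
int32_t sign_extend_16(uint32_t raw)
{
    uint32_t low = raw & 0xFFFFu;             /* keep the 16 data bits */
    return (int32_t)(low ^ 0x8000u) - 0x8000; /* map 0x8000..0xFFFF to negatives */
}

/* Effect of the and-operation for 8-bit unsigned data: everything above
 * bit 7 is cleared. */
uint32_t mask_to_8_bits(uint32_t raw)
{
    return raw & 0xFFu;
}

/* The implicit conversion from the slide: assigning a 16-bit value to a
 * 32-bit variable requires the sign extension shown above. */
int32_t widen_short(int16_t val1)
{
    int32_t val2 = val1;
    return val2;
}
```

Keeping data at one size throughout a computation avoids these extra operations.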
4. Conditionals
The C compiler in the TTA toolset is very advanced and can usually transform small if-statements into logical computations. Still, in some cases the if-statements must be removed manually.
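Removing an if-statement manually can look like the sketch below, which selects the larger of two integers with logical operations only. The function names are hypothetical. The code assumes two's complement integers, so that negating 1 gives a value with all bits set.

```c
/* Version with an if-statement. */
int max_branch(int a, int b)
{
    if (a > b) {
        return a;
    }
    return b;
}

/* Same result with logical computations only. */
int max_logical(int a, int b)
{
    int mask = -(a > b); /* all ones if a > b, otherwise zero */
    return (a & mask) | (b & ~mask);
}
```

The comparison still exists, but its result is used as data instead of controlling a jump.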
Mini Assignment
Take the filter.c program from Exercise 1 (\float folder). Compile it using the gcc compiler with optimization levels 0 to 3. Use -S to instruct the compiler to generate assembly code. Report the observed changes in the generated assembly code.
Example: gcc filter.c -o filterlevel_1.asm -O1 -S
Of course, the assembly code will be for an x86 machine.