Compiler Theory. (GCC the GNU Compiler Collection) Sandro Spina 2009


GCC
Probably the most widely used compiler. It is not only a native compiler: it can also cross-compile any program, producing executables for systems other than the one it is running on. GCC is written in C and can compile itself. It provides numerous front ends: C, C++, Objective-C, Fortran, Java and Ada.

Intermediate Languages for GCC (GENERIC)
GCC uses three intermediate languages to represent the program during compilation: GENERIC, GIMPLE and RTL.
GENERIC is a language-independent representation generated by each front end. It is used as an interface between the parser and the optimiser. It is a common representation that is able to represent programs written in all the languages supported by GCC.

Intermediate Languages for GCC (GIMPLE)
Gimplification (done by the gimplifier) is the compiler pass which lowers GENERIC to GIMPLE. It works recursively, replacing complex statements with sequences of simple statements.
GIMPLE is used for target- and language-independent optimisations (e.g. inlining, constant propagation, etc.). It is a language-independent, tree-based representation. It differs from GENERIC in that the GIMPLE grammar is more restrictive: expressions contain no more than 3 operands (except for function calls), and there are no control flow structures because these are lowered to gotos.

Intermediate Languages for GCC (RTL)
Register Transfer Language (RTL) exists mainly in an internal form, i.e. as data structures used within GCC for optimisations. It also has a textual form, which is used when printing debugging dumps.

Compiling a C Program
Compilation refers (as you all obviously know by now) to the process of converting a program from source code (text) in a programming language such as C or C++ into machine code, i.e. the sequence of 1s and 0s that controls what the processing unit (CPU, GPU, etc.) does.
    $ gcc -Wall hello.c -o hello
The above compiles the source code in hello.c to machine code and stores it in the executable file hello. The -o option specifies the output file.

GCC warnings flag
The option -Wall turns on all the most commonly used compiler warnings. You should ALWAYS use it, because these warnings are essential for detecting problems and for debugging.

GCC warnings example
    #include <stdio.h>

    int
    main (void)
    {
      printf ("Three plus two is %f\n", 5);
      return 0;
    }

    $ gcc -Wall example.c -o example
    example.c: In function 'main':
    example.c:6: warning: double format, different type arg (arg 2)
Warnings do not prevent compilation but indicate possible problems. The program above compiles (and runs) but gives an incorrect answer, because %f expects a double while the argument 5 is an int.

Compiling multiple source files
Large programs are usually split into multiple files (if yours are not, you might consider it). The nice thing about this is that the individual parts can be compiled independently. Header files are used to specify the prototypes of functions. E.g.
hello.h:
    void hello (const char * name);
main.c:
    #include "hello.h"

    int main (void)
    {
      hello ("world");
      return 0;
    }
hello_fn.c:
    #include <stdio.h>
    #include "hello.h"

    void hello (const char * name)
    {
      printf ("Hello, %s!\n", name);
    }

Compiling multiple source files
    $ gcc -Wall main.c hello_fn.c -o hello
Note that the header file is not included in the list of files to be compiled. This is because the directive #include "hello.h" in the source files instructs the compiler to include it automatically at the appropriate points. This is carried out by the preprocessor, cpp. Because building is a two-stage process, compilation followed by linking, only the files which have changed need to be recompiled.

Compiling then linking
In the first stage each source file is compiled without creating an executable. The result is referred to as an object file and has the extension .o when using gcc. In the second stage the object files are merged by a separate program called a linker, which combines all of them to create an executable. Essentially, an object file contains machine code in which any references to the memory addresses of functions (or variables) in other files are left undefined. This allows source files to be compiled without direct reference to each other. The linker fills in these missing addresses when it produces the executable.

Creating object files from source files
    $ gcc -Wall -c main.c
This produces an object file main.o containing the machine code for the main function, with a reference to the external function hello. The corresponding memory address is left undefined at this stage. The object file hello_fn.o is created in the same way from hello_fn.c.
    $ gcc main.o hello_fn.o -o hello
gcc uses the GNU linker ld for this step. Linking is effectively an unambiguous process which either succeeds or fails (and it fails only if there are references which cannot be resolved), so the -Wall flag is of no use here.

Linking with external libraries
A library is a collection of precompiled object files which can be linked into programs, e.g. the math library libm.a.
Static libraries: these are created from object files with the GNU archiver tool ar, and are used by the linker to resolve references to functions when the program is built.
    $ gcc -Wall add.c -lm -o add
The -l option links against a library; -lm selects the math library libm.
Shared libraries: the library is loaded and references are resolved at run-time.

Linking with shared libraries
When a program is linked against a static library, the machine code of any external functions used by the program is copied from the library into the final executable (increasing its size). With shared libraries a more advanced form of linking is performed, which makes the executables smaller. The shared library extension is .so, for shared object. Instead of the complete machine code of the functions, the executable contains only a small table of the functions it requires. Before the executable starts running, the machine code of the external functions is copied into memory from the shared library. This is called dynamic linking.

Compilation warning options -Wall
-Wcomment (included in -Wall): warns about nested comments, e.g. /* /* */ */, which may become a source of confusion.
-Wformat (included in -Wall): warns about the incorrect use of format strings in functions such as printf, where the format specified (e.g. %f) does not agree with the type of the argument.
-Wunused (included in -Wall): warns about unused variables. An unused variable could be an error or could be genuinely not needed.

Compilation warning options -Wall
-Wimplicit (included in -Wall): warns about any functions which are used without being declared. Usually the #include of a header file is missing.
-Wreturn-type (included in -Wall): warns about functions which are defined without a return type but not declared void. It also catches empty return statements in functions that are not declared void. It is usually good to avoid ambiguity (e.g. use return 0; not just return;).
Any compiler warning can be taken as an indication of a potentially serious problem. Good compiler implementations try to pinpoint these cases.

Warnings not in -Wall
There are cases where one would want the compiler to warn about suspicious code, which might be fine but might also indicate a potential problem. Check this piece of code: what's wrong here?
    int
    foo (unsigned int x)
    {
      if (x < 0)
        return 0;
      else
        return 1;
    }

-W gcc compiler option
-W (named -Wextra in newer versions of GCC) is a general option which warns about a selection of common programming errors, such as functions which can return without a value and comparisons between signed and unsigned values. For the previous example the compiler outputs:
    $ gcc -W -c example.c
    example.c: In function 'foo':
    example.c:4: warning: comparison of unsigned expression < 0 is always false

-W specific options
-Wconversion: warns about implicit type conversions that could cause unexpected results, e.g. unsigned int x = -1;
-Wshadow: warns about the redeclaration of a variable name in a scope where it has already been declared. This is referred to as variable shadowing. Check the code in the next slide.

-Wshadow
    double
    test (double x)
    {
      double y = 1.0;
      {
        double y;
        y = x;
      }
      return y;
    }
This is clearly a valid piece of code, but some people might expect it to return x, when in fact it returns 1.0: the assignment y = x only affects the inner y.

Preprocessor - cpp
The preprocessor is called automatically whenever gcc processes a C or C++ file. In recent versions it is built into the compiler itself, although the standalone cpp program still exists. It is used mainly to expand macros and to handle directives such as #ifdef:
    #ifdef TEST
      printf ("Test mode\n");
    #endif
The gcc option -DNAME defines a preprocessor macro NAME from the command line. To compile the code above in test mode, the command-line option -DTEST is used.

Macros with values
In addition to being defined, a macro can also be assigned a value. This value is inserted in the source code at each point where the macro occurs. The option is -DNAME=value, e.g. -DNUM=100 replaces each occurrence of the macro NUM with 100 in the program.
    $ gcc -Wall -DNUM=100 test.c
    $ gcc -Wall -DNUM="50+50" test.c
The two gcc calls above are equivalent as long as NUM is not used inside a larger expression, since the value is substituted as text. Macros can also be defined inside the code using the #define directive, e.g.
    #define SIZE 100
    int table1[SIZE];

Compiling with optimisation
GCC is an optimising compiler, i.e. it provides a number of options which either increase the speed of an executable or decrease its size (or both).
Source-level optimisation:
Common subexpression elimination
Function inlining (especially important when small functions are invoked repeatedly), e.g.
    double sq (double x) { return x * x; }
Consider this function being called inside a loop!

Compiling with optimisation (ii)
Speed-space tradeoffs: one can produce faster code at the expense of size, e.g. loop unrolling, which increases the speed of loops by eliminating the end-of-loop condition check in each iteration. E.g.
    for (i = 0; i < 8; i++) { y[i] = i; }
is replaced with
    y[0] = 0; y[1] = 1; etc.

Compiling with optimisation (iii)
Scheduling: the lowest level of optimisation, in which the compiler determines the best ordering of individual instructions. Many CPUs use pipelining, in which multiple instructions execute in parallel on the same CPU, and the compiler tries to maximise this parallelism. Scheduling increases the speed of the executable but takes longer to compile due to its complexity!

Optimisation Levels (i)
GCC provides a range of general optimisation levels, numbered 0-3 (selected with -OLEVEL), as well as options for specific optimisations.
-O0 (default): does not perform any optimisations. Best for debugging.
-O1: turns on the most common forms of optimisation that do not require any speed-space tradeoffs.

Optimisation Levels (ii)
-O2: turns on -O1 plus further optimisations, including instruction scheduling. It should not increase the executable size but takes longer to compile. Best for release versions.
-O3: turns on speed-space optimisations, which may increase the size of the executable.
-funroll-loops: a specific optimisation option, which turns on loop unrolling.
-Os: turns on optimisations which should decrease the size of the executable.

Summary
A single invocation of GCC consists of the following stages:
1. Preprocessing (to expand macros)
2. Compilation (from source code to assembly language)
3. Assembly (from assembly language to machine code)
4. Linking (to create the final executable)

References An Introduction to GCC Brian Gough