SLIDE 2 We begin the lecture with a question: on which platform will we work in this course? There are two options: Windows and Linux. The easiest solution is to use a Linux distribution, which contains all the tools you need. Sometimes, however, a Linux distribution may not include the LLgen generator; we will discuss this generator later in the course. On Windows you can use Cygwin, which includes most of the required tools (unfortunately not LLgen).
SLIDE 3 During the exercises we will use the following tools:
1) A C compiler, which is built into Linux. It is available on the server that hosts our accounts.
2) The LEX generator, used to create lexical analyzers.
3) A text editor. Here you have a free choice: use whichever editor you are most comfortable with.
4) The LLgen (LLnextgen) generator, used to create parsers.
5) The Yacc (Bison) generator, also used to create parsers.
SLIDE 4 In the compiler design exercises we use the accounts created on the server. These accounts are equipped with all the tools mentioned above. When working on Linux or Cygwin, it is best to use the default compiler on these systems: gcc for C or g++ for C++. On Windows you can use the commercial Borland and Microsoft compilers (which are available free of charge) or the Open Watcom environment (http://www.openwatcom.org).
SLIDE 5 The LLgen generator, which we will discuss later in the lecture, poses a problem: only some Linux distributions ship it. You can obtain its source version as part of ACK; you can also get LLgen as a separate product from the page... The last possibility is the Minix operating system. This system can be burned onto a CD and loaded into memory from it.
SLIDE 6 on slide
SLIDE 7 on slide
SLIDE 8 After this short introduction, let us look at the subject of our meetings. What will we talk about in the lecture? We will talk about compilers and interpreters: what they are and how they work. We will look at the stages of a compiler's operation, that is, the stages of compilation. We will study lexical analysis, and then the parsing process. We will also discuss three generators: Lex, LLgen, and Yacc. During the classes we will use them to create analyzers. That, in a nutshell, is the plan.
SLIDE 9 A compiler is a special kind of program that converts source code written in a programming language (the source language, following the rules of that language) into equivalent code in another language, which we call the output language (for example, one understandable by a machine). The first compilers appeared in the early 1950s. Since then a great many compilers of various types have been created. There are thousands of compilers for programming languages, from traditional, general-purpose languages such as Fortran, Pascal, and C to specialized languages for specific applications, such as Occam, a programming language for parallel systems created by INMOS and designed for the transputer. Compilers generate different kinds of output languages:
1) the output language is another programming language; there are, for example, compilers translating from Fortran to Pascal;
2) the output language is portable intermediate code for virtual machines (for example, Java and .NET);
3) the output language is the machine language of some computer (from a microcomputer to a supercomputer).
An important type of compiler is the so-called cross-compiler. Cross-compilers are used to generate executable code for a platform other than the one on which the cross-compiler runs. Example: compiling code for platforms on which the compiler cannot run directly, because there is no direct access to them or access is much more difficult (embedded systems). An interpreter is also a program.
SLIDE 11 Interpreters are used to execute the commands of system shell languages.
Interpreters were most popular in the second half of the 1970s and in the 1980s, when the "Basic" language was widespread on home computers. Today interpreters are less popular than compilers, primarily because of the lower performance of interpreted applications. Their advantages:
1) their implementation is sometimes simpler and faster;
2) partially written systems can be run;
3) they allow quick and frequent modifications to the software.
Interpreters are sometimes also used for languages that are usually compiled (such as C). There are also hybrid approaches: the Prolog language, for example, may be compiled or interpreted. An important field of application for interpreters is virtual machines, but here too, for performance reasons, they are being replaced by compilers or hardware mechanisms.
SLIDE 12 At the outset, the source program (in the source language) is delivered to the compiler. Next comes the first stage, the analysis stage, which we will look at more broadly. The second phase of the compiler's work is synthesis, which we will also examine in detail in a moment. As a result of synthesis we obtain the output program. The output program may be a program in another source language or some form of executable code.
SLIDE 13 Analysis is the stage in which the source program is decomposed into its constituent parts and its intermediate representation is then generated.
SLIDE 14 on slide
SLIDE 15 on slide
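The decomposition of a source program into its constituent parts (slide 13) can be illustrated with a minimal sketch. The token classes, the `tokenize` function, and the sample input are assumptions made only for this illustration; this is not the Lex specification used in the exercises.

```python
import re

# Hypothetical token classes for a tiny C-like fragment (illustration only).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=<>;(){}]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Split source text into (kind, text) pairs, dropping white space."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x = y + 42;"))
```

Each pair names the class of a lexical unit and the text that matched it. A generator such as Lex builds a comparable scanner from a list of patterns, so the pattern table above is only a hand-written stand-in.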
SLIDE 16 Let us return for a moment to lexical analysis. Lexical analysis can also greatly simplify the implementation of the next phases of compilation. This simplification consists in removing irrelevant elements from the input, such as white space (spaces, tabs, and newlines) and comments. These elements usually improve the readability of the code and do not affect the semantics of the program. However, there are exceptions to this rule. For example, in the languages Fortran and AWK white space is significant, and in TeX and Turbo Pascal compiler options are placed in comments. One more issue is worth mentioning: lexical analysis is sometimes preceded by a so-called preprocessor, which performs simple operations on the source program such as text replacement, macro expansion, and file inclusion.
SLIDES 17-23 on slides
SLIDE 24 The parsing phase checks whether the lexical units form correct structures of the language. The division of tasks between the lexical analyzer and the syntax analyzer is a matter of convention. In the most common solution, the scanner recognizes the non-recursive structures of the source language, while recursive language constructs are processed by the parser.
SLIDE 25 Parsing decides whether the lexical units at the input are arranged in the correct order, that is, whether they form correct structures of the source language. For natural languages this task is very difficult: natural languages do not have rigid rules and allow words to appear in various orders in a sentence.
In our case we will use formal languages, which are strictly defined by a grammar. Therefore the fragment on the slide, although it consists of correct lexical units of the C language, is incorrect: the order of those units is incompatible with the C language.
SLIDE 26 After lexical and syntactic analysis, the compiler moves on to semantic analysis.
SLIDE 27 At this stage of analysis we have two tasks:
1) checking compliance with the semantic definition of the source language;
2) collecting the information needed in the next phase, which is code generation.
SLIDE 28 During semantic analysis the following are carried out:
1) Type checking. The following checks can be performed:
- whether routines are called with the appropriate number and types of arguments;
- whether the operands used in expressions have the appropriate types;
- whether variables used to index arrays, and subrange variables, stay within the declared range.
2) Control-flow checking: the compiler validates whether control-flow statements are used in the right way. The rules for correct use of specific statements differ between languages: a goto statement in C can jump into the body of a loop, while in Pascal it cannot.
3) Uniqueness of declarations: the correctness of the declarations of objects in the source program is checked, including (if the language requires it) whether each object has been correctly declared before its first use.
4) Repetition of names: some programming languages require the same name to be repeated at the beginning and end of certain constructs, in order to improve the effectiveness of syntactic and semantic error detection and, along the way, the readability of the program.
SLIDE 30 Upon completion of analysis, compilers usually generate intermediate code, also known as an intermediate language or intermediate representation. There are many types of intermediate languages. They can be oriented toward a specific source language, oriented toward the target machine, or universal. The main purpose of using an intermediate language is to reduce the cost of maintaining existing compilers and of building and maintaining new compilers for new languages and target machines.
SLIDE 32 In the example we see the translation of a C loop into three-address intermediate code. Three-address code is a simple imperative intermediate representation that reflects the properties of typical hardware architectures well.
SLIDE 34 Code optimization aims to improve the efficiency of the code. The measures of efficiency are the speed of the code and its size. Optimization can be performed at the level of source code, intermediate code, and output code.
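Since optimization is often performed on intermediate code, it may help to see what the three-address form from slide 32 looks like in practice. The sketch below translates an arithmetic expression rather than the loop shown on the slide. The nested-tuple input format, the function name `to_three_address`, and the temporary names `t1`, `t2`, ... are assumptions made for this illustration.

```python
import itertools


def to_three_address(expr):
    """Translate a nested-tuple expression into three-address instructions.

    expr is a variable name, a number, or a tuple (op, left, right).
    Returns the list of instructions and the name holding the final result.
    """
    code = []
    counter = itertools.count(1)

    def emit(node):
        if not isinstance(node, tuple):
            return str(node)              # a leaf is already a valid operand
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = f"t{next(counter)}"        # fresh temporary for this result
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = emit(expr)
    return code, result


# a + b * (c - 1)
code, result = to_three_address(("+", "a", ("*", "b", ("-", "c", 1))))
print("\n".join(code))
```

Every generated instruction has at most one operator and at most three addresses (two operands and a result), which is what makes this form close to typical machine instructions.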
Code optimization is a complex process. Using advanced data-flow and control-flow analysis techniques, the compiler carries out improving transformations, which must not change the semantics of the program.
SLIDE 35 The most noticeable effect comes from loop optimization. Saving even a single CPU cycle inside a loop gives large savings over successive iterations. Examples of loop optimization at the intermediate-code level are:
- the use of induction variables,
- operator strength reduction.
Induction variables are variables whose values remain in a constant relationship within the body of the loop. Operator strength reduction consists in replacing expensive operations with cheaper ones: addition is faster than multiplication, and multiplication is faster than exponentiation.
SLIDE 36 on slide
SLIDE 37 Two steps of code generation are key to the efficiency of the generated code:
- instruction selection,
- register allocation.
SLIDE 38 The key data structures on which we operate during optimization and code generation are basic blocks and flow graphs. Basic blocks are linear pieces of code which control enters at the beginning and leaves at the end. From code divided into blocks, a flow graph can be constructed.
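The division into basic blocks and the construction of a flow graph can be sketched as follows. The textual instruction format (`L1:` labels, `goto`, `if ... goto`, `return`) and the function names are assumptions made for this illustration. The sketch uses the common "leader" rule: a block starts at the first instruction, at every jump target, and right after every jump.

```python
import re


def label_of(instr):
    """Return the label that prefixes an instruction, or None."""
    m = re.match(r"(\w+):", instr)
    return m.group(1) if m else None


def jump_target(instr):
    """Return the label an instruction jumps to, or None."""
    m = re.search(r"\bgoto (\w+)$", instr)
    return m.group(1) if m else None


def build_flow_graph(code):
    """Split code into basic blocks and return (blocks, edges)."""
    labels = {label_of(ins): i for i, ins in enumerate(code) if label_of(ins)}
    leaders = {0}
    for i, ins in enumerate(code):
        target = jump_target(ins)
        if target is not None:
            leaders.add(labels[target])       # a jump target starts a block
            if i + 1 < len(code):
                leaders.add(i + 1)            # so does the instruction after a jump
    starts = sorted(leaders)
    blocks = [code[s:e] for s, e in zip(starts, starts[1:] + [len(code)])]
    block_of = {s: b for b, s in enumerate(starts)}
    edges = set()
    for b, block in enumerate(blocks):
        last = block[-1]
        target = jump_target(last)
        if target is not None:
            edges.add((b, block_of[labels[target]]))
        body = last.split(":", 1)[-1].strip()
        unconditional = body.startswith("goto ") or body.startswith("return")
        if not unconditional and b + 1 < len(blocks):
            edges.add((b, b + 1))             # control falls through to the next block
    return blocks, sorted(edges)


code = [
    "i = 0",
    "L1: if i >= n goto L2",
    "s = s + i",
    "i = i + 1",
    "goto L1",
    "L2: return s",
]
blocks, edges = build_flow_graph(code)
for number, block in enumerate(blocks):
    print(f"B{number}: {block}")
print(edges)
```

For this loop the sketch yields four blocks, and the edge from the loop body back to the loop test is what makes the loop visible in the graph.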
In the graph, the vertices are the blocks and the edges represent the control flow between blocks.
SLIDE 39-41 on slide
SLIDE 42 An executable program must be linked with the so-called run-time environment, whose tasks include, among others:
- providing access to non-local names;
- exception handling;
- dynamic memory allocation;
- passing parameters to subprograms.
SLIDE 43 A major problem that must be solved in a run-time environment is access to non-local names and the declaration of local variables. The visibility rules of the language determine how references to non-local names are interpreted.
SLIDE 44 Regardless of which visibility rule is applied, the run-time environment must provide a suitable binding of names.
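As an illustration of how a run-time environment might resolve non-local names, here is a minimal sketch of one visibility rule: static (lexical) scoping, where each scope keeps a link to the scope that encloses it. The `Scope` class and its methods are hypothetical and only stand in for whatever mechanism a real run-time environment uses; the lecture does not prescribe this design.

```python
class Scope:
    """A hypothetical activation record with a link to its enclosing scope."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent          # link to the lexically enclosing scope
        self.variables = {}

    def declare(self, var, value):
        self.variables[var] = value

    def lookup(self, var):
        """Return (scope name, value) for the nearest enclosing declaration."""
        scope = self
        while scope is not None:
            if var in scope.variables:
                return scope.name, scope.variables[var]
            scope = scope.parent
        raise NameError(f"name {var!r} is not declared")


global_scope = Scope("global")
global_scope.declare("x", 1)
outer = Scope("outer", parent=global_scope)
outer.declare("y", 2)
inner = Scope("inner", parent=outer)
inner.declare("x", 3)

print(inner.lookup("x"))  # ('inner', 3): the local declaration hides the global one
print(inner.lookup("y"))  # ('outer', 2): non-local name found in the enclosing scope
print(outer.lookup("x"))  # ('global', 1): non-local name found in the global scope
```

Under a different visibility rule, such as dynamic scoping, the `parent` link would point to the calling scope instead of the enclosing one, and the same lookup loop would produce different bindings.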