SEM / YEAR : VI / III CS2352 PRINCIPLES OF COMPLIERS DESIGN UNIT I - LEXICAL ANALYSIS PART - A 1. What is a compiler? (A.U Nov/Dec 2007) A compiler is a program that reads a program written in one language the source language and translates it into an equivalent program in another language-the target language.the compiler reports to its user the presence of errors in the source program. 2. What are the two parts of a compilation? Explain briefly. (A.U Nov/Dec 2007) Analysis and Synthesis are the two parts of compilation. The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program. The synthesis part constructs the desired target program from the intermediate representation. 3. List the subparts or phases of analysis part. Analysis consists of three phases: Linear Analysis. Hierarchical Analysis. Semantic Analysis. 4. Depict diagrammatically how a language is processed. (A.U May/June 2007) Preprocessor Compiler Assembler Loader/ link editor Target assembly program Relocatable machine code Source Program Skeletal source program Absolute machine code 25. What is linear analysis? Linear analysis is one in which the stream of characters making up the sourceprogram is read from left to right and grouped into tokens that are sequences of charactershaving a collective meaning. Also called lexical analysis or scanning. 26. List the various phases of a compiler. (A.U Nov/Dec 2008) The following are the various phases of a compiler: Lexical Analyzer Syntax Analyzer Semantic Analyzer Intermediate code generator Code optimizer Code generator 27. What are the classifications of a compiler? (A.U Nov/Dec 2009) Compilers are classified as: Single- pass Mu lt i-p ass Lo ad-an d go NARESHKUMR R, AP\CSE, MAHALAKSHMI ENGINEERING COLLEGE, TRICHY-621214. Page 1
Debugging or optimizing 28. What is a symbol table? (A.U Nov/Dec 2010) A symbol table is a data structure containing a record for each identifier, with fieldsfor the attributes of the identifier. The data structure allows us to find the record for eachidentifier quickly and to store or retrieve data from that record quickly. Whenever an identifier is detected by a lexical analyzer, it is entered into the symbol table. The attributes of an identifier cannot be determined by the lexical analyzer. 29. Mention some of the cousins of a compiler. (A.U Nov/Dec 2007) May June 2012 Cousins of the compiler are: Preprocessors Assemblers Loaders and Link-Editors 30. List the phases that constitute the front end of a compiler. (A.U Nov/Dec 2010) The front end consists of those phases or parts of phases those depends primarily on the source language and are largely independent of the target machine. These include Lexical and Syntactic analysis The creation of symbol table Semantic analysis Generation of intermediate code A certain amount of code optimization can be done by the front end as well. Also includes error handling that goes along with each of these phases. 31. Mention the back-end phases of a compiler. (A.U May 2008) The back end of compiler includes those portions that depend on the target machineand generally those portions do not depend on the source language, just the intermediatelanguage. These include Code optimization Code generation, along with error handling and symbol- table operations. 32. Define compiler-compiler. (A.U Nov/Dec 2007) Systems to help with the compiler-writing process are often been referred to as compiler-compilers, compiler-generators or translator-writing systems. Largely they are oriented around a particular model of languages, and they are suitable for generating compilers of languages similar model. 33. List the various compiler construction tools. The following is a list of some compiler construction tools: Parser generators Scanner generators Syntax-directed translation engines Automatic code generators Data-flow engines 34. Differentiate tokens, patterns, and lexeme. (A.U May/June 2010) April May 2011 Tokens- Sequence of characters that have a collective meaning. Patterns- There is a set of strings in the input for which the same token is produced as output. This set of strings is described by a rule called a pattern associated with the token Lexeme- A sequence of characters in the source program that ismatched by the pattern for a token. 35. List the operations on languages. o Union - L U M ={s s is in L or s is in M} o Concatenation LM ={st s is in L and t is in M} o Kleene Closure L* (zero or more concatenations of L) o Positive Closure L+ ( one or more concatenations of L) 36. Write a regular expression for an identifier. NOV DEC 2012 NARESHKUMR R, AP\CSE, MAHALAKSHMI ENGINEERING COLLEGE, TRICHY-621214. Page 2
An identifier is defined as a letter followed by zero or more letters or digits. The regular expression for an identifier is given asletter (letter digit)* 37. Mention the various notational shorthands for representing regular expressions. One or more instances (+) Zero or one instance (?) Character classes ([abc] where a,b,c are alphabet symbols denotes the Regular expressions a b c.)n Non regular sets 38. What is the function of a hierarchical analysis? Hierarchical analysis is one in which the tokens are grouped hierarchically into nested collections with collective meaning. Also termed as Parsing. 39. What does a semantic analysis do? Semantic analysis is one in which certain checks are performed to ensure that components of a program fit together meaningfully. Mainly performs type checking. 40. List the various error recovery strategies for a lexical analysis. May June 2012 Possible error recovery actions are: Panic mode recovery Deleting an extraneous character Inserting a missing character Replacing an incorrect character by a correct character Transposing two adjacent characters 41.Wha is an interpreter? April May 2011 Interpreter: It is one of the translators that translate high level language to low level language. high level language Interpreter low level language NARESHKUMR R, AP\CSE, MAHALAKSHMI ENGINEERING COLLEGE, TRICHY-621214. Page 3
43. What are the tools available in analysis phase? i) Structure editors ii) Pretty printer iii) Static checkers iv) Interpreters. 43. Define pretty printers? A pretty printer analyzes a program and prints it in such a way that the structure of the program becomes clearly visible. For the comments may appear with an amount of indentation proportional to the depth of their nesting in the hierarchical organization of the statements. 44. Define assembler and its types? It is defined by the low level language is assembly language and high level language is machine language is called assembler. One pass assembler Two pass assembler 45. Give the types of a language processing system? a) Preprocessors b) Compilers c) Assembler d) Loaders and link editors 46. What are the functions performed in analysis phase? a) Lexical analysis or Linear analysis b) Syntax analysis or hierarchical analysis c) Semantic analysis 47. What are the functions performed in synthesis phase? i) Intermediate code generation ii) Code generation iii) Code optimization 48. Give the classification of processing performed by the semantic analysis? a) Processing of declarative statements. b) Processing of executable statements. 49. Give the properties of intermediate representation? a) It should be easy to produce. b) It should be easy to translate into the target program. 50. What are the two different parts of compilation? a) Analysis phases of compilation b) Synthesis phases of compilation 51. What is meant by lexical analysis? It reads the characters in the program and groups them into tokens that are sequences of characters having a collective meaning. Such as an identifier, a keyword, a punctuation, character or a multi-character operator like ++. 52. What is meant by syntax analysis? It processes the string of descriptors, synthesized by the lexical analyzer, to determine the syntactic structure of an input statement. This process is known as parsing. Output of the parsing step is a representation of the syntactic structure of a statement. It representation in the form of syntax tree. 53. What is meant by intermediate code generation? After syntax and semantic analysis, some compilers generate an explicit intermediate representation of the source program. It can have a variety of forms. This form called three-address code. It consists of sequence of instructions, each NARESHKUMAR.R, AP\CSE,MAHALAKSHMI ENGINEERING COLLEGE, TRICHY-621214 Page 4
of which has at most three operands. 54. What is meant by semantic analysis? This phase checks the source program for semantic errors and gathers type of information for the subsequent phase. 55. What do you meant by interpreter? Certain other translators transform a programming language into a simplified language called intermediate code, which can directly executed using a program called an interpreter. 56. What do you meant by phases? Each of which transforms the source program one representation to another. A phase is a logically cohesive operation that takes as input one representation of the source program and produces as output another representation 57. Write short notes on symbol table manager? The table management or bookkeeping portion of the compiler keeps track of the names used by program and records essential information about each, such as its type (int,real etc.,) the data structure used to record this information is called a symbol table manger. 58. Write short notes on error handler? The error handler is invoked when a flaw in the source program is detected. It must warn the programmer by issuing a diagnostic, and adjust the information being passed from phase to phase so that each phase can proceed. So that as many errors as possible can be detected in one compilation. 59. Mention some of the cousins of the compiler? i) Preprocessors ii) Assemblers iii) Two pass assembly iv) Loaders and Linker-editors. 60. What is front end and back end? The phases are collected into a front end and a back end. The front end consists of those phases or parts of phases, that depends primarily on the source language and is largely independent of the target machine. The back ends that depend on the target machine and generally these portions do not depend on the source language. 61. What do you meant by passes? A pass reads the source program or the output of the previous pass, makes the transformations specified by its phases and writes output into an intermediate file, which may then be read by a subsequent pass. In an implementation of a compiler, portions of one or more phases are combined into a module called pass. 25. List some compiler construction tools? i) Parser generators ii) Scanner generators iii) Syntax-directed translation engine iv) Automatic code generators v) Data-flow engine. 62. Explain any one compiler construction tool? Scanner generators, these automatically generate lexical analyzers normally from a specification based on regular expressions. The resulting of lexical analyzer is in effect of finite automata. 63. What are issues available in lexical analysis? i) Simpler design ii) Compiler efficiency is improved by specialized buffering techniques for reading input characters and processing tokens and significantly NARESHKUMAR.R, AP\CSE,MAHALAKSHMI ENGINEERING COLLEGE, TRICHY-621214 Page 5
speeds up the performance of a compiler. iii) Compiler portability is enhanced. 64. Define patterns/lexeme/tokens? A set of strings in the input for which the same token is produced as output. This set of strings described by a rule called pattern associated with the token. A lexeme is a sequence of characters in the source program that is matched by the pattern for a token. Token is a sequence of character that can be treated as a single logical entity. 65. Give the algebraic properties of regular expression? AXIOM DESCRIPTION i) r/s = s/r / is commutative ii) r/(s/t)=(r/s)/t / is associative iii) (rs)t=r(st) concatenation is associative iv) r(s/t)=rs/rt concatenation distributes over / v) r**=r* * is idempotent 66. What are the systems referred to data flow engine? i) Compiler-compilers ii) Compiler-generators iii) Translator writing systems. 67. Give the parts of a string? Prefix of s, suffix of s, substring of s, proper prefix, proper suffix, proper substring and subsequence of s. 68. What are the operations on language? Union Concatenation Kleene closure or star closure and Star closure. SNSCT Department of Compute Science and Engineering Page 5 69. Give the error recovery actions in lexical errors? i) Deleting an extraneous character ii) Inserting a missing character iii) Replacing an incorrect character by a correct character. 70. What are the ROLE of lexical analyzer? NOV DEC 2011 a) Use a lexical analyzer generator, such as Lex compiler, to produce the lexical analyzer from a regular expression based specification b) Write the lexical analyzer in a conventional systems-programming language using the I/O facilities of that language to read the input. c) Write the lexical analyzer in assembly language and explicitly manage the reading of input. 71. Define regular expression? It is built up out of simpler regular expression using a set of defining rules. Each regular expression r denotes a language L(r). The defining rules specify how L(r) is formed by combining in various ways the languages denoted by the sub expressions of r. 72. Give the precedence of regular expression operator? i) The unary operator * has the highest precedence and is left associative. ii) Concatenation has the second highest precedence and is left associative. iii) / has the lowest precedence and is left associative. 73. Define the length of a string? It is the number of occurrences of symbols in string, s denoted by s. NARESHKUMAR.R, AP\CSE,MAHALAKSHMI ENGINEERING COLLEGE, TRICHY-621214 Page 6
Example: s=abc, s =3. 74. Give the rules in regular expression? 1) is a regular expression that denotes { }, that is the set containing the empty string. 2) If a is a symbol in _, then a is a regular expression that denoted {a}, i.e., the set containing the string a. 3) Suppose r and s are regular expression denoting the languages L( r) and L(s). 75. Define regular set? A language denoted by a regular expression is said to be a regular set. 76. Give the types of notational shorthand s? Zero or more instance One or more instance Character classes Character not in a given set and unary operator? NARESHKUMAR.R, AP\CSE,MAHALAKSHMI ENGINEERING COLLEGE, TRICHY-621214 Page 7