Why are there so many programming languages?


Chapter 1 :: Introduction
Programming Language Pragmatics, Fourth Edition
Michael L. Scott
Copyright 2016 Elsevier
Chapter01_Introduction_4e - Tue November 21, 2017

Introduction
Why are there so many programming languages?
- evolution: we've learned better ways of doing things over time
- socio-economic factors: proprietary interests, commercial advantage
- orientation toward special purposes
- orientation toward special hardware
- diverse ideas about what is pleasant to use

Introduction
What makes a language successful?
- easy to learn (BASIC, Pascal, LOGO, Scheme)
- easy to express things, easy to use once fluent, "powerful" (C, Common Lisp, APL, Algol-68, Perl)
- easy to implement (BASIC, Forth)
- possible to compile to very good (fast/small) code (Fortran)
- backing of a powerful sponsor (COBOL, PL/1, Ada, Visual Basic)
- wide dissemination at minimal cost (Pascal, Turing, Java)

Introduction
Why do we have programming languages? What is a language for?
- a way of thinking, a way of expressing algorithms: languages from the user's point of view
- an abstraction of a virtual machine, a way of specifying what you want the hardware to do without getting down into the bits: languages from the implementor's point of view

Why study programming languages?
Help you choose a language:
- C vs. C++ vs. C# for systems programming
- Fortran vs. C for numerical computations
- PHP vs. Ruby for web-based applications
- Ada vs. C for embedded systems
- Common Lisp vs. Scheme vs. ML for symbolic data manipulation
- Java vs. .NET for networked PC programs

Why study programming languages?
Make it easier to learn new languages:
- some languages are similar; it is easy to walk down the family tree
- concepts have even more similarity: if you think in terms of iteration, recursion, and abstraction (for example), you will find it easier to assimilate the syntax and semantic details of a new language
- think of the analogy to human languages: a good grasp of grammar makes it easier to pick up new languages (at least Indo-European ones)

Why study programming languages?
Help you make better use of whatever language you use:
- understand obscure features:
  - in C, help you understand unions, arrays & pointers, separate compilation, varargs, catch and throw
  - in Common Lisp, help you understand first-class functions/closures, streams, catch and throw, symbol internals
- understand implementation costs: choose between alternative ways of doing things, based on knowledge of what will be done underneath:
  - use simple arithmetic equivalents (use x*x instead of x**2)
  - use C pointers or the Pascal "with" statement to factor address calculations
  - avoid call by value with large data items in Pascal
  - avoid the use of call by name in Algol 60
  - choose between computation and table lookup (e.g., for the cardinality operator in C or C++)

Why study programming languages?
Help you make better use of whatever language you use:
- figure out how to do things in languages that don't support them explicitly:
  - lack of suitable control structures in Fortran: use comments and programmer discipline
  - lack of recursion in Fortran, CSP, etc.: write a recursive algorithm, then use mechanical recursion elimination (even for things that aren't quite tail recursive)
  - lack of named constants and enumerations in Fortran: use variables that are initialized once, then never changed
  - lack of modules in C and Pascal: use comments and programmer discipline
  - lack of iterators in just about everything: fake them with (member?) functions

Imperative languages
Group languages as:
- imperative
  - von Neumann (Fortran, Pascal, Basic, C)
  - object-oriented (Smalltalk, Eiffel, C++?)
  - scripting languages (Perl, Python, JavaScript, PHP)
- declarative
  - functional (Scheme, ML, pure Lisp, FP)
  - logic, constraint-based (Prolog, VisiCalc, RPG)

Imperative languages, particularly the von Neumann languages, predominate; they will occupy the bulk of our attention. We also plan to spend a lot of time on functional and logic languages.

Compilation vs. Interpretation
- not opposites, and not a clear-cut distinction

Pure Compilation
- the compiler translates the high-level source program into an equivalent target program (typically in machine language), and then goes away

Pure Interpretation
- the interpreter stays around for the execution of the program
- the interpreter is the locus of control during execution

Compilation vs. Interpretation
- interpretation: greater flexibility, better diagnostics (error messages)
- compilation: better performance
- the common case is compilation or simple preprocessing, followed by interpretation
- most language implementations include a mixture of both compilation and interpretation

Compilation vs. Interpretation
- note that compilation does NOT have to produce machine language for some sort of hardware
- compilation is translation from one language into another, with full analysis of the meaning of the input
- compilation entails semantic understanding of what is being processed; preprocessing does not
- a preprocessor will often let errors through; a compiler hides further steps, a preprocessor does not
- many compiled languages have interpreted pieces, e.g., formats in Fortran or C
- most use "virtual instructions": set operations in Pascal, string manipulation in Basic
- some compilers produce nothing but virtual instructions, e.g., Pascal P-code, Java byte code, Microsoft COM+

Compilation vs. Interpretation
Implementation strategies: Preprocessor
- removes comments and white space
- groups characters into tokens (keywords, identifiers, numbers, symbols)
- expands abbreviations in the style of a macro assembler
- identifies higher-level syntactic structures (loops, subroutines)

Implementation strategies: Library of Routines and Linking
- the compiler uses a linker program to merge the appropriate library of subroutines (e.g., math functions such as sin, cos, log, etc.) into the final program

Compilation vs. Interpretation
Implementation strategies: Post-compilation Assembly
- facilitates debugging (assembly language is easier for people to read)
- isolates the compiler from changes in the format of machine language files (only the assembler must be changed, and it is shared by many compilers)

Implementation strategies: The C Preprocessor (conditional compilation)
- the preprocessor deletes portions of code, which allows several versions of a program to be built from the same source

Compilation vs. Interpretation
Implementation strategies: Source-to-Source Translation (C++)
- C++ implementations based on the early AT&T compiler generated an intermediate program in C, instead of assembly language

Compilation vs. Interpretation
Implementation strategies: Bootstrapping

Implementation strategies: Compilation of Interpreted Languages
- the compiler generates code that makes assumptions about decisions that won't be finalized until run time
- if these assumptions are valid, the code runs very fast; if not, a dynamic check reverts to the interpreter

Compilation vs. Interpretation
Implementation strategies: Dynamic and Just-in-Time Compilation
- in some cases a programming system may deliberately delay compilation until the last possible moment
- Lisp and Prolog invoke the compiler on the fly, to translate newly created source into machine language, or to optimize the code for a particular input set
- the Java language definition defines a machine-independent intermediate form known as byte code; byte code is the standard format for distribution of Java programs
- the main C# compiler produces .NET Common Intermediate Language (CIL), which is then translated into machine code immediately prior to execution

Implementation strategies: Microcode
- the assembly-level instruction set is not implemented in hardware; it runs on an interpreter
- the interpreter is written in low-level instructions (microcode or firmware), which are stored in read-only memory and executed by the hardware

Compilation vs. Interpretation
- compilers exist for some interpreted languages, but they aren't pure: selective compilation of compilable pieces and extra-sophisticated preprocessing of the remaining source; interpretation of at least parts of the code is still necessary, for the reasons above
- unconventional compilers: text formatters, silicon compilers, query language processors

Programming Environment Tools
- (slide lists tools)

An Overview of Compilation
Phases of Compilation
- (figure: the phases of compilation)

Scanning:
- divides the program into "tokens", which are the smallest meaningful units
- this saves time, since character-by-character processing is slow, and we can tune the scanner better if its job is simple; it also saves complexity (lots of it) for later stages
- you can design a parser to take characters instead of tokens as input, but it isn't pretty
- scanning is recognition of a regular language, e.g., via a DFA

An Overview of Compilation
Parsing:
- parsing is recognition of a context-free language, e.g., via a PDA
- parsing discovers the "context-free" structure of the program
- informally, it finds the structure you can describe with syntax diagrams (the "circles and arrows" in a Pascal manual)

Semantic analysis:
- semantic analysis is the discovery of meaning in the program
- the compiler actually does what is called STATIC semantic analysis: the meaning that can be figured out at compile time
- some things (e.g., array subscript out of bounds) can't be figured out until run time; things like that are part of the program's DYNAMIC semantics

An Overview of Compilation
Intermediate form (IF):
- produced after semantic analysis (if the program passes all checks)
- IFs are often chosen for machine independence, ease of optimization, or compactness (these are somewhat contradictory)
- they often resemble machine code for some imaginary idealized machine, e.g., a stack machine, or a machine with arbitrarily many registers
- many compilers actually move the code through more than one IF

Optimization:
- takes an intermediate-code program and produces another one that does the same thing faster, or in less space
- the term is a misnomer; we just improve the code
- the optimization phase is optional

Code generation:
- produces assembly language or (sometimes) relocatable machine language

An Overview of Compilation
- certain machine-specific optimizations (use of special instructions or addressing modes, etc.) may be performed during or after target code generation

Symbol table:
- all phases rely on a symbol table that keeps track of all the identifiers in the program and what the compiler knows about them
- this symbol table may be retained (in some form) for use by a debugger, even after compilation has completed

Lexical and Syntax Analysis
GCD Program (in C):

int main() {
    int i = getint(), j = getint();
    while (i != j) {
        if (i > j) i = i - j;
        else j = j - i;
    }
    putint(i);
}

An Overview of Compilation
Lexical and Syntax Analysis: GCD Program Tokens
- scanning (lexical analysis) and parsing recognize the structure of the program; scanning groups characters into tokens, the smallest meaningful units of the program

int main ( ) { int i = getint ( ) , j = getint ( ) ; while ( i != j ) { if ( i > j ) i = i - j ; else j = j - i ; } putint ( i ) ; }

Lexical and Syntax Analysis: Context-Free Grammar and Parsing
- parsing organizes tokens into a parse tree that represents higher-level constructs in terms of their constituents
- potentially recursive rules known as a context-free grammar define the ways in which these constituents combine

An Overview of Compilation
Context-Free Grammar and Parsing
Example (the while loop in C):

    iteration-statement -> while ( expression ) statement

statement, in turn, is often a list enclosed in braces:

    statement -> compound-statement
    compound-statement -> { block-item-list-opt }

where

    block-item-list-opt -> block-item-list
    block-item-list-opt -> ϵ

and

    block-item-list -> block-item
    block-item-list -> block-item-list block-item
    block-item -> declaration
    block-item -> statement

Context-Free Grammar and Parsing
GCD Program Parse Tree
- (figure, continued on the next two slides as parts A and B)

An Overview of Compilation
Context-Free Grammar and Parsing (continued)
- (figure: GCD program parse tree, parts A and B)

An Overview of Compilation
Syntax Tree
- (figure: syntax tree for the GCD program, derived from the parse tree)

Lecture 1 (Chapter 1)

Programming languages?
Why so many? evolution, society, orientation, taste
Why successful?
. easy to learn, use, or implement; free
. produces fast or small executables
. sponsor
What for?
. express algorithms
. write programs not in machine language
. defines an abstract machine
.. example: Java bytecode on a Java machine
Why study?
. which are used in practice?
. which languages for which application domains?
. helps to learn new languages
. helps understand changes to a language family
. teaches you current language features you didn't know (some obscure)
. helps you use your current language better
.. do things the language supports
.. fake things the language doesn't support
Families
. imperative: von Neumann, object-oriented, scripting
.. currently the predominant form (esp. von Neumann with some objects)
. declarative: functional, logic or constraint-based
Compilation vs interpretation
. differences, similarities, conceptual overlaps
. compiler
.. high-level source to target language
... can be machine language for a real machine
... can be machine language for a "virtual" machine
... can be another high-level language
.. better performance in general
. interpreter: high-level source to internal representation
.. interpreter runtime system used to execute
.. greater flexibility, often better diagnostics
. often see a mix
.. run-time analysis of a printf format is a kind of interpretation
. compare with preprocessor
.. preprocessor does textual analysis and transformations
... remove comments from source code
... #include adds source code from a file
... conditionally add or remove code at compile time
Implementation strategies
. compile to object code, link-edit to executable
.. object code = machine code with holes
... e.g., the address of a library routine
. compile to assembler, then assemble and run
.. can be easier to debug the compiler
. source-to-source translation
.. original C++ to C compiler
. dynamic compilation
.. invoke the compiler at runtime for newly-generated code
.. or for optimizing code

. just-in-time compilation
.. compile intermediate code at runtime if you need it
.. don't compile code that never gets executed
.. possible runtime delay during compilation
. compile normally-interpreted code
.. not always possible; might need to wait until runtime because
... interconnections of code may not be settled till then
... actions to compile may not be known till then
Compilation phases
. character stream
.. -> Scanner (lexical analyzer) -> token stream
.. -> Parser (syntactic analyzer) -> parse tree
.. -> Semantic Analyzer -> intermediate form (e.g., AST = abstract syntax tree)
.. -> Machine-Independent Code Optimizer* -> modified intermediate form
.. -> Target Code Generator -> target program
.. -> Machine-Specific Code Optimizer* -> final target program
. * these steps are optional
Two chunks to a compiler
. character stream -> Front End -> AST
. AST -> Back End -> final target program
. the front end can be reused on different machines
. the back end can take ASTs from different languages
Parts of a compiler
. Scanner - divides input into tokens (words like IF, multi-char symbols like !=)
. Parser - determines whether the input is syntactically legal, builds a parse tree
. Semantic Analyzer - checks for static properties (variables declared? right types?)
. Parse Tree - tree format that represents the syntactic structure of the input
.. e.g., an assignment statement has a lhs, an assignment symbol, and a rhs
.. the lhs might be an array name, '[', an index expression, and ']'
. Intermediate Form - a representation preferable to parse trees for analysis
.. probably smaller than the parse tree (an abstract syntax tree, e.g.)
.. might be code for a virtual machine
.. can have multiple intermediate forms
. Code Generator - takes the intermediate form to actual machine code (object code)
. Symbol Table - describes the identifiers used by the program
.. e.g., name, type, location in memory
.. might be used by a runtime debugger
. slides include an example of a parse tree
.. slide 45 contains an *abstract syntax tree*, not a parse tree
.. note the AST is much smaller than the parse tree