Parsing and Pattern Recognition

1 Topics in IT 1: Parsing and Pattern Recognition
Week 02: String searching and finite-choice languages
College of Information Science and Engineering, Ritsumeikan University

2 this week
string comparison
  brute force
  using a hash function
string searching
  brute force
  Boyer-Moore-Horspool algorithm
  state machine
recognising finite-choice languages
  table lookup
  binary search
  state machine

3 last week's topics
applications of pattern matching and parsing
the parts of language:
  words and vocabulary: lexemes, lexicons
  sequences of words: sentences
  systems of sentences: grammars
the structure of grammars

4 string comparison: brute force
compare strings, character by character
if all are the same, the strings are equal

    F O O T            F O O T
    F O O T            F O L D
    "FOOT" = "FOOT"    "FOOT" != "FOLD"

    int string_compare(char *s1, char *s2)
    {
        while (*s1)                       // not at end of s1
            if (*s1 != *s2) return 0;     // characters differ
            else ++s1, ++s2;              // move to next character
        return *s2 == 0;                  // at end of s1, check s2 has ended too
    }

5 string comparison: brute force
almost no extra cost to compute the order of the strings
a negative, zero, or positive result means the first string is less than, equal to, or greater than the second string

    F O O T \0          F O O T \0
    F O O T \0          F O L D \0
    '\0' - '\0' = 0     'O' - 'L' = 3
    "FOOT" = "FOOT"     "FOOT" > "FOLD"

    int strcmp(char *s1, char *s2)
    {
        while (*s1 && *s1 == *s2)   // s1 not ended && s2 still matches
            ++s1, ++s2;             // advance to next pair of characters
        return *s1 - *s2;           // difference of first non-matching characters
    }

6 string comparison: hashing
a hash is a small number calculated from some larger data
the hash characterises the data
equal data always has the same hash value
e.g: parity bit, checksum (ISBN, student ID), MD5 signature,...
if we create two hashes from a few characters of two strings:
  if the hashes are different, the strings must be different
  if the hashes are the same, the strings might be different

    #include <string.h>

    int my_hash(char *s)   // hash made from first, middle, and last character
    {
        size_t len = strlen(s);
        size_t last = len ? len - 1 : 0;   // index of last character (0 if s is empty)
        int hash = (s[0] << 16) + (s[last / 2] << 8) + s[last];
        return hash;
    }

    int compare(char *s1, char *s2)
    {
        if (my_hash(s1) != my_hash(s2)) return 0;   // different hash: strings must be different
        return !strcmp(s1, s2);                     // same hash: strings might be different
    }
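The "same hash, strings might be different" case can be made concrete with a small sketch. The names `first_middle_last_hash` and `strings_equal` are hypothetical, and the casts to `unsigned char` are an assumption added here so that characters with the high bit set cannot produce negative values.

```c
#include <string.h>

/* Hypothetical sketch of the slide's idea: a hash built from the first,
 * middle, and last characters of a string. */
static int first_middle_last_hash(const char *s)
{
    size_t len = strlen(s);
    size_t last = len ? len - 1 : 0;   /* an empty string hashes to 0 */
    return ((unsigned char)s[0] << 16)
         + ((unsigned char)s[last / 2] << 8)
         + (unsigned char)s[last];
}

/* Cheap hash check first; fall back to a full comparison only when the
 * hashes agree, because equal hashes do not prove equal strings. */
static int strings_equal(const char *s1, const char *s2)
{
    if (first_middle_last_hash(s1) != first_middle_last_hash(s2))
        return 0;
    return strcmp(s1, s2) == 0;
}
```

With this hash, "FOOT" and "FORT" collide (same first, middle, and last characters), so only the full comparison tells them apart, while "FOOT" and "FOLD" are rejected by the hash alone.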

7 string comparison: perfect hashing
if all possible strings to be compared are known in advance, then a perfect hash function can be constructed automatically
e.g., comparing only the strings cat, bet, and bob
  notice that the middle letter is different in all the strings
  the middle letter itself can be used as a perfect hash value

    /* For a given set of strings we can implement a simplest-possible hash
     * function that guarantees a different result for each string in the set. */
    int my_hash(char *s)   // require: s is one of "cat", "bet", or "bob"
    {
        return s[1];       // middle letter a, e, or o uniquely characterises s
    }

    int string_compare(char *s1, char *s2)
    {
        return my_hash(s1) == my_hash(s2);   // equal perfect hash => equal strings
    }
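Whether a candidate hash really is perfect for a known set can be checked mechanically. The sketch below is only an illustration: `middle_letter_hash` and `hash_is_perfect` are hypothetical names, and the check is a simple pairwise comparison rather than an automatic perfect-hash generator.

```c
#include <stddef.h>

/* The slide's candidate hash: the middle letter of a three-letter word. */
static int middle_letter_hash(const char *s)
{
    return (unsigned char)s[1];
}

/* Return 1 if middle_letter_hash() gives a different value for every
 * string in the set, or 0 if any two strings collide. */
static int hash_is_perfect(const char *const set[], size_t count)
{
    for (size_t i = 0; i < count; ++i)
        for (size_t j = i + 1; j < count; ++j)
            if (middle_letter_hash(set[i]) == middle_letter_hash(set[j]))
                return 0;   /* two different strings share a hash value */
    return 1;
}
```

For the set cat, bet, bob the check succeeds; replacing bob with bat makes it fail, because cat and bat share the middle letter a.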

8 string searching: brute force
compare target string with contents of sliding window
if they match, we have found the target string; otherwise slide the window one character to the right, and repeat

    M E A S U R E M E N T S     text to be searched
    M E N                       target string to be found (window at index 0)
      M E N
        M E N
          M E N
            M E N
              M E N
                M E N
                  M E N         found at index 7

each row is a comparison of the target string with the part of the text that is currently within the bounds of the sliding window
it took 12 comparisons to find men in measurements

9 string searching: brute force

    /* Search for the target string within the given text.
       Return the index of the match, or -1 if no match is found. */
    int string_search(char *text, char *target)
    {
        int target_len = strlen(target);
        if (target_len < 1) return -1;   // empty target: nothing to search for
        int last_win_pos = strlen(text) - target_len;
        for (int win_pos = 0; win_pos <= last_win_pos; ++win_pos, ++text) {
            for (int offset = 0; text[offset] == target[offset]; ++offset) {
                if (offset == target_len - 1)
                    return win_pos;      // target string found at win_pos
            }
        }
        return -1;                       // target was not found in text
    }

10 string searching: Boyer-Moore-Horspool
problem with brute-force search:
  almost always fails when matching the first character in the window
  no information is available about the next character in the window
  must move the window one character to the right

    M E A S U R E M E N T S
    M E N
      M E N                     +1

11 string searching: Boyer-Moore-Horspool
Horspool algorithm compares the target string and window contents backwards, starting with the last character
if the window does not match the target: try to move the window as far to the right as possible
the last character in the window is used to decide how far we can move the window
possibility 1: last character in window does not occur in target
  move window right by the entire length of the target

    M E A S U R E M E N T S
    M E N                       last character in window is A
          M E N                 +3

12 string searching: Boyer-Moore-Horspool
possibility 2: last character in window occurs once in target
  move the window so the character appears in the correct position
(diagram: searching MEASUREMENTS for MEN; a window ending in E moves right by 1, and a window ending in M moves right by 2)

13 string searching: Boyer-Moore-Horspool
possibility 3: last character occurs more than once in target
  move so the last occurrence appears in the correct position
(diagram: searching MEASUREMENTS for CEMENT; a window ending in E moves right by 2, and a window ending in M moves right by 3)

14 string searching: Boyer-Moore-Horspool
possibility 4: strings differ, last character occurs only at end of target
  move window right by the length of the target (draw your own diagram if you cannot see why!)
algorithm: build a table that maps any character to the amount to move right
  characters not in the target move the window by the target length
  if the last target character is not repeated, it moves by the target length
  other characters in the target move themselves to the end of the window
  for repeated characters, use the rightmost
when searching for CEMENT or MEN, our tables look like:

    character:  C  M  E  N  T  others
    move by:    5  3  2  1  6  6

    character:  M  E  N  others
    move by:    2  1  3  3
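For checking hand-made tables, the rules on this slide can be turned into a short routine. This is a minimal sketch under stated assumptions: `build_move_table` and `ALPHABET_SIZE` are hypothetical names, and characters are assumed to be 8-bit values used as table indices.

```c
#include <string.h>

enum { ALPHABET_SIZE = 256 };   /* assumption: one table entry per 8-bit character */

/* Fill move[] for the given target, following the rules on the slide. */
static void build_move_table(const char *target, int move[ALPHABET_SIZE])
{
    int target_len = (int)strlen(target);

    /* characters not in the target move the window by the target length */
    for (int c = 0; c < ALPHABET_SIZE; ++c)
        move[c] = target_len;

    /* every character except the last moves itself to the end of the window;
     * later (rightmost) occurrences overwrite earlier ones */
    for (int index = 0; index < target_len - 1; ++index)
        move[(unsigned char)target[index]] = target_len - 1 - index;
}
```

Because the final character of the target is skipped by the second loop, it keeps the default (the target length) unless it also occurs earlier in the target.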

15 string searching: Boyer-Moore-Horspool
(Boyer-Moore-)Horspool algorithm
move[] has entries for M, E, N, and ? (any other character)
compare target with window backwards

    M E A S U R E M E N T S     text to be searched
    M E N                       move['A'] = 3, so +3
          M E N                 move['R'] = 3, so +3
                M E N           move['E'] = 1, so +1
                  M E N         found at index 7

6 comparisons to find men in measurements, but have to construct move[] array for each specific target string
(practice on a few target strings until you find it easy)

16 string searching: Boyer-Moore-Horspool

    int string_search(char *text, char *target)
    {
        int text_len = strlen(text), target_len = strlen(target);
        if (text_len < 1 || target_len < 1) return -1;   // empty string
        int target_last = target_len - 1;   // index of last character in target
        int window_pos = 0;                 // current position of window in text
        int move[256];                      // amount to move window right
        // default: all characters move the window right by the target length
        for (int c = 0; c < 256; ++c)
            move[c] = target_len;
        // for characters appearing in target, move window right to align them with end of window
        for (int index = 0; index < target_last; ++index)
            move[(unsigned char)target[index]] = target_last - index;
        // search for the target in text
        while (text_len >= target_len) {    // not at end of text
            for (int index = target_last; text[index] == target[index]; --index)
                if (index == 0) return window_pos;   // success if target matches window
            int n = move[(unsigned char)text[target_last]];   // amount to move window right
            window_pos += n;                // remember new position of window
            text += n;                      // move text (start of window) right
            text_len -= n;                  // text has shrunk by the same amount
        }
        return -1;                          // failure: target not found in text
    }
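The comparison counts quoted on slides 8 and 15 (12 for brute force, 6 for Horspool) can be reproduced with instrumented versions of both searches. This is a sketch for experimentation only: the function names and the `comparisons` out-parameter are hypothetical, and each character-to-character test counts as one comparison.

```c
#include <string.h>

/* Brute-force search; *comparisons receives the number of character tests. */
static int brute_force_count(const char *text, const char *target, int *comparisons)
{
    int text_len = (int)strlen(text), target_len = (int)strlen(target);
    *comparisons = 0;
    if (target_len < 1) return -1;
    for (int pos = 0; pos + target_len <= text_len; ++pos) {
        int offset = 0;
        for (;;) {
            ++*comparisons;
            if (text[pos + offset] != target[offset]) break;   /* mismatch: slide by 1 */
            if (++offset == target_len) return pos;            /* whole target matched */
        }
    }
    return -1;
}

/* Horspool search with the same counting convention. */
static int horspool_count(const char *text, const char *target, int *comparisons)
{
    int text_len = (int)strlen(text), target_len = (int)strlen(target);
    int move[256];
    *comparisons = 0;
    if (target_len < 1) return -1;
    int last = target_len - 1;
    for (int c = 0; c < 256; ++c)
        move[c] = target_len;
    for (int i = 0; i < last; ++i)
        move[(unsigned char)target[i]] = last - i;
    for (int pos = 0; pos + target_len <= text_len;
         pos += move[(unsigned char)text[pos + last]]) {
        int index = last;                                       /* compare backwards */
        for (;;) {
            ++*comparisons;
            if (text[pos + index] != target[index]) break;
            if (index-- == 0) return pos;
        }
    }
    return -1;
}
```

Calling both functions with the text "MEASUREMENTS" and the target "MEN" returns index 7 in each case, with the counter left at 12 and 6 respectively.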

17 string searching: Boyer-Moore-Horspool
Horspool works well for large alphabets and large target lengths
e.g., phrases in natural languages

18 string searching: state machine
use successive characters from input to drive a state machine
approximately:

    state 0 --M--> state 1 --E--> state 2 --N--> state 3 (success: target found)
    states 0, 1, and 2 each also have an "any other character" transition

begin in state 0, then...
  look at the next input character, follow the arrow that matches
  if you reach state 3, stop and succeed
  if you run out of input, stop and fail
why approximately? (hint: try searching for aba with input aaba)
later we see how to construct state machines properly, matching flexible patterns
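The hint can be tried out in code. The sketch below assumes that every "any other character" arrow leads back to state 0, which is one reading of the diagram rather than something the slide states; `naive_machine_search` is a hypothetical name.

```c
#include <string.h>

/* Drive the approximate state machine over the text.
 * The state is the number of target characters matched so far.
 * Returns 1 on reaching the success state, 0 if the input runs out first. */
static int naive_machine_search(const char *text, const char *target)
{
    size_t target_len = strlen(target);
    size_t state = 0;

    if (target_len == 0) return 0;
    for (; *text; ++text) {
        if (*text == target[state])
            ++state;        /* follow the arrow labelled with this character */
        else
            state = 0;      /* assumed: any other character returns to state 0 */
        if (state == target_len)
            return 1;       /* success: target found */
    }
    return 0;               /* ran out of input */
}
```

Searching "measurements" for "men" succeeds, but searching "aaba" for "aba" fails even though the target is present: the second a is consumed by the failed transition out of state 1, so the machine forgets that it could have started a new match there.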

19 string comparison and grammars
let's write string comparison as a grammar

    S → hello

this language has only one valid sentence, hello
recognising whether or not a string belongs to this language is easy:
  compare the input string to the one valid sentence
  succeed if the string matches it

    // recognise production rule S
    int recognise_s(char *s)
    {
        if (!strcmp(s, "hello")) return 1;   // succeed
        return 0;                            // fail
    }

20 string comparison and grammars
that's boring, so let's recognise a more interesting language

    S → hello
    S → goodbye

or...

    S → hello | goodbye

or...

    S → hello
      | goodbye

this language has two valid sentences, hello and goodbye
recognising whether a string belongs to this language is also easy:
  compare the input string to all valid sentences
  succeed if the string matches one of them

    // recognise production rule S
    int recognise_s(char *s)
    {
        if (!strcmp(s, "hello"))   return 1;   // succeed
        if (!strcmp(s, "goodbye")) return 1;   // succeed
        return 0;                              // fail
    }

21 string comparison and grammars
a language that consists of fixed strings taken from a finite set of choices is called a finite choice language
a grammar that describes a finite choice language is called a finite choice grammar
are they useful?
  nouns in a natural language: cat, dog, totoro, pikachu, miffy
  reserved words in a programming language: class, public

22 finite-choice grammars
many complex languages have a subset that is finite choice
e.g., in the C programming language...

    keyword → auto | break | case | char | const | continue | default | do
            | double | else | enum | extern | float | for | goto | if
            | int | long | register | return | short | signed | sizeof | static
            | struct | switch | typedef | union | unsigned | void | volatile | while

if we can recognise this FC language very quickly, we can
  treat all identifiers as if they were variable names
  recognise identifiers, using this FC grammar, to detect keywords
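One way to detect keywords among identifiers is to look each identifier up in a sorted keyword table. The sketch below uses the standard library's `bsearch`; the names `keywords`, `compare_words`, and `is_keyword` are hypothetical, and the table relies on the slide's list already being in `strcmp` order.

```c
#include <stdlib.h>
#include <string.h>

/* The 32 keywords from the slide, kept in strcmp() order for bsearch(). */
static const char *const keywords[] = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "int", "long", "register", "return", "short", "signed", "sizeof", "static",
    "struct", "switch", "typedef", "union", "unsigned", "void", "volatile", "while"
};

/* bsearch() passes pointers to table elements, i.e. pointers to strings. */
static int compare_words(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Return 1 if the identifier is a sentence of the keyword FC language. */
static int is_keyword(const char *identifier)
{
    return bsearch(&identifier, keywords,
                   sizeof keywords / sizeof keywords[0],
                   sizeof keywords[0], compare_words) != NULL;
}
```

A lexer built this way would first scan an identifier as if it were a variable name and then call `is_keyword` on it, so "while" is reported as a keyword and "whilst" as an ordinary identifier.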

23 recognising sentences of FC languages
consider a slightly smaller FC language

    S → one | two | three | four | five | six | seven | eight | nine | ten

this language has ten valid sentences: one, two, three, four, five, six, seven, eight, nine, ten
to recognise a valid sentence, just detect one of these strings

24 FC parsing: brute force
brute force method: ten string comparisons

    // recognise the production S; return the number of
    // the rule that matched, or -1 if no rule matches
    int recognise_s(char *sentence)
    {
        if (!strcmp(sentence, "one"  )) return 0;
        if (!strcmp(sentence, "two"  )) return 1;
        if (!strcmp(sentence, "three")) return 2;
        if (!strcmp(sentence, "four" )) return 3;
        if (!strcmp(sentence, "five" )) return 4;
        if (!strcmp(sentence, "six"  )) return 5;
        if (!strcmp(sentence, "seven")) return 6;
        if (!strcmp(sentence, "eight")) return 7;
        if (!strcmp(sentence, "nine" )) return 8;
        if (!strcmp(sentence, "ten"  )) return 9;
        return -1;   // fail
    }

how tedious (and inefficient, for large FC grammars)!

25 FC parsing: linear search
linear search of a table
  scales better (to hundreds of choices)
  smaller code (and probably faster)

    enum { NWORDS = 10 };

    char *words[NWORDS] = {
        "one", "two", "three", "four", "five",
        "six", "seven", "eight", "nine", "ten"
    };

    int recognise_s(char *sentence)
    {
        for (int i = 0; i < NWORDS; ++i)
            if (!strcmp(sentence, words[i])) return i;
        return -1;
    }

26 FC parsing: binary search
since all sentences are known in advance, we can
  sort them alphabetically
  perform a binary search

    index:    0      1     2     3     4    5      6    7    8      9
    words[]:  eight  five  four  nine  one  seven  six  ten  three  two

    iteration  lower  mid  upper  strcmp("seven", words[mid])  next step
    1          0      4    9      > 0                          lower = mid + 1
    2          5      7    9      < 0                          upper = mid - 1
    3          5      5    6      = 0                          found at mid

27 FC parsing: binary search

    char *words[NWORDS] = {
        "eight", "five", "four", "nine", "one",
        "seven", "six", "ten", "three", "two"
    };

    int recognise_s(char *sentence)
    {
        int lower = 0, upper = NWORDS - 1;
        while (lower <= upper) {
            int mid = (lower + upper) / 2;
            int cmp = strcmp(sentence, words[mid]);
            if      (cmp < 0) upper = mid - 1;   // not in [ mid...upper ]
            else if (cmp > 0) lower = mid + 1;   // not in [ lower...mid ]
            else return mid;                     // sentence found
        }
        return -1;                               // sentence not recognised
    }

rule numbering is slightly different (alphabetical order)
trivial to fix with a corresponding table of rule numbers
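The "corresponding table of rule numbers" might look like the following sketch, which restores the numbering used by the brute-force version on slide 24 (one is rule 0, ten is rule 9). The array name `rule_numbers` is hypothetical.

```c
#include <string.h>

enum { NWORDS = 10 };

/* Sentences in alphabetical order, as required by the binary search. */
static const char *const words[NWORDS] = {
    "eight", "five", "four", "nine", "one",
    "seven", "six", "ten", "three", "two"
};

/* rule_numbers[i] is the original rule number of words[i]. */
static const int rule_numbers[NWORDS] = { 7, 4, 3, 8, 0, 6, 5, 9, 2, 1 };

/* Binary search that reports the original rule number, or -1 on failure. */
static int recognise_s(const char *sentence)
{
    int lower = 0, upper = NWORDS - 1;
    while (lower <= upper) {
        int mid = (lower + upper) / 2;
        int cmp = strcmp(sentence, words[mid]);
        if      (cmp < 0) upper = mid - 1;
        else if (cmp > 0) lower = mid + 1;
        else return rule_numbers[mid];   /* translate table index to rule number */
    }
    return -1;
}
```

Keeping the two arrays side by side makes it easy to check by eye that each word is paired with the right rule number.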

28 FC parsing: state machine
for illustration, consider a smaller grammar:

    S → bet | bob | cat

(state diagram: paths of states spell out b-e-t, b-o-b, and c-a-t, each ending in "success: target found"; every other state also has an "any other character" transition)
(we see later how to implement this kind of matching very efficiently)
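A hand-written recogniser for this grammar could be sketched as below. The state names and numbering are assumptions and need not match the diagram, and this version treats "any other character" as outright failure because it recognises a whole sentence rather than searching inside a longer text.

```c
/* Hypothetical state names for recognising S -> bet | bob | cat. */
enum state { START, SEEN_B, SEEN_BE, SEEN_BO, SEEN_C, SEEN_CA, ACCEPT, FAIL };

/* Return 1 if s is exactly one of "bet", "bob", or "cat", otherwise 0. */
static int recognise_s(const char *s)
{
    enum state state = START;

    for (; *s; ++s) {
        switch (state) {
        case START:   state = (*s == 'b') ? SEEN_B  : (*s == 'c') ? SEEN_C  : FAIL; break;
        case SEEN_B:  state = (*s == 'e') ? SEEN_BE : (*s == 'o') ? SEEN_BO : FAIL; break;
        case SEEN_BE: state = (*s == 't') ? ACCEPT : FAIL; break;
        case SEEN_BO: state = (*s == 'b') ? ACCEPT : FAIL; break;
        case SEEN_C:  state = (*s == 'a') ? SEEN_CA : FAIL; break;
        case SEEN_CA: state = (*s == 't') ? ACCEPT : FAIL; break;
        default:      state = FAIL; break;   /* extra input after a complete sentence */
        }
        if (state == FAIL)
            return 0;
    }
    return state == ACCEPT;   /* input must end exactly in the accepting state */
}
```

Each input character is examined once, so the cost depends on the length of the input and not on the number of sentences in the grammar.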

29 summary
simple techniques and algorithms exist to improve the efficiency of
  string comparisons
  string searching (within larger text)
  string searching (within table of strings)
Horspool search algorithm is a good choice
  simple to understand and implement
  good for large alphabet and large target string (natural language)
  better algorithms exist for special cases (but more complex)
parsing is the opposite of generating sentences from a grammar
  parsing: given a sentence and a grammar, how do we make the sentence from the start rule?
parsing a language with a finite-choice grammar is just string search
state machines can be used for matching and searching

30 homework
review these slides
practice making tables for Horspool string searching
preview the slides for the next class
become familiar with the notation used for grammars
download and read the first two handouts
  reading pdf
    Section (how grammars are constructed)
    Section (why grammars describe entire languages)
    (the rest is optional, but good background material if you are interested)
  reading-2.3.pdf
    Section 2.3 (the five types of grammar)

31 glossary
brute force
  solving a problem by using a simple, obvious, direct approach. Much more efficient solutions may exist that are not obvious or simple.
comparing
  determining if two sets of data have the same contents, e.g., two strings that contain the same characters in the same order will compare as equal.
finite
  having a limited, countable number of elements.
hash
  a value that characterises, and is computed from, data. Usually numeric, and much smaller than the original data, making the hash useful in comparison, classification, and verification of data. For a given hash function, the same data should always produce the same hash value. When two hash values are different, we can be certain that the data they represent is different; when two hash values are the same, the data they represent may or may not be the same.

32 hash function
  a function that computes a hash value from a set of data.
order
  the relative position of two data sets according to some classification scheme. Two numbers (including integers representing characters) can be ordered by their magnitude. Two strings can be ordered according to the order of the first character that differs between them (corresponding to dictionary order for English words).
perfect hash function
  a hash function that is designed with prior knowledge of every possible input that it might encounter. Since the possible inputs to the function are known in advance, we arrange for the function to produce a different hash value for each possible input. This makes it possible to compare two (potentially large) data sets by computing their hash values directly, which will be the same only if the two data sets have the same contents.

33 searching
  finding the position of a target set of data within a collection of sets of data. It can be accomplished by comparing the target with each set of data in the collection successively until the comparison succeeds.
sliding window (in data analysis)
  a window that moves for each iteration of an algorithm. For example, when searching for a string in some text, a window (of the same size as the target string) moves to a new position in the text each time a comparison is made between the target string and the portion of the text visible in the window. The term sliding implies that the movement is monotonic (in a single direction) and overlapping (moving a short distance relative to the size of the window).

34 state machine
  a way to model (or implement) a process as a software machine in which there are several distinct states and explicit transitions between them. The model is in only one of the states at a given time. Progress is made by following a transition out of the current state into another, when the next input data item (or some other external stimulus) is received. The current state combined with some characteristic of the data item determines which transition should be followed, and hence what state the machine will be in next. In text searching applications, the input data are successive characters of the text being searched and the machine states represent the progress that has been made towards recognising the target string that is being searched for.

35 window (in data analysis)
  a small (usually contiguous) subset of data taken from a larger set of data to which an algorithm or process is applied. The algorithm can only see the subset of data that is currently revealed by the window. Windows can be fixed, or they can move for successive iterations of the algorithm. If they move then successive windows can overlap or be non-overlapping. For example, when searching for a target string in some larger text, a window on the text reveals a sub-string having the same length as the target string. A comparison can be made directly between the target string and the text visible in the window. If the comparison fails the window is moved and the process repeats until the comparison succeeds or the entire text has been considered.


More information

Topics Applications Most Common Methods Serial Search Binary Search Search by Hashing (next lecture) Run-Time Analysis Average-time analysis Time anal

Topics Applications Most Common Methods Serial Search Binary Search Search by Hashing (next lecture) Run-Time Analysis Average-time analysis Time anal CSC212 Data Structure t Lecture 18 Searching Instructor: George Wolberg Department of Computer Science City College of New York @ George Wolberg, 2016 1 Topics Applications Most Common Methods Serial Search

More information

CMSC 104 -Lecture 5 John Y. Park, adapted by C Grasso

CMSC 104 -Lecture 5 John Y. Park, adapted by C Grasso CMSC 104 -Lecture 5 John Y. Park, adapted by C Grasso 1 Topics Naming Variables Declaring Variables Using Variables The Assignment Statement 2 a + b Variables are notthe same thing as variables in algebra.

More information

CS201 - Introduction to Programming Glossary By

CS201 - Introduction to Programming Glossary By CS201 - Introduction to Programming Glossary By #include : The #include directive instructs the preprocessor to read and include a file into a source code file. The file name is typically enclosed with

More information

Introduction to Programming

Introduction to Programming Introduction to Programming session 6 Instructor: Reza Entezari-Maleki Email: entezari@ce.sharif.edu 1 Spring 2011 These slides are created using Deitel s slides Sharif University of Technology Outlines

More information

ET156 Introduction to C Programming

ET156 Introduction to C Programming ET156 Introduction to C Programming g Unit 22 C Language Elements, Input/output functions, ARITHMETIC EXPRESSIONS AND LIBRARY FUNCTIONS Instructor : Stan Kong Email : skong@itt tech.edutech.edu General

More information

Structured Programming. Jon Macey

Structured Programming. Jon Macey Structured Programming Jon Macey Structured Programming Structured programming is an attempt to formalise the process of program development. There are several basis for a theorem of structured programming,

More information

QUIZ. What is wrong with this code that uses default arguments?

QUIZ. What is wrong with this code that uses default arguments? QUIZ What is wrong with this code that uses default arguments? Solution The value of the default argument should be placed in either declaration or definition, not both! QUIZ What is wrong with this code

More information

Computer Components. Software{ User Programs. Operating System. Hardware

Computer Components. Software{ User Programs. Operating System. Hardware Computer Components Software{ User Programs Operating System Hardware What are Programs? Programs provide instructions for computers Similar to giving directions to a person who is trying to get from point

More information

Chapter 2. C++ Syntax and Semantics, and the Program Development Process. Dale/Weems 1

Chapter 2. C++ Syntax and Semantics, and the Program Development Process. Dale/Weems 1 Chapter 2 C++ Syntax and Semantics, and the Program Development Process Dale/Weems 1 Chapter 2 Topics Programs Composed of Several Functions Syntax Templates Legal C++ Identifiers Assigning Values to Variables

More information

Crit-bit Trees. Adam Langley (Version )

Crit-bit Trees. Adam Langley (Version ) CRITBIT CWEB OUTPUT 1 Crit-bit Trees Adam Langley (agl@imperialviolet.org) (Version 20080926) 1. Introduction This code is taken from Dan Bernstein s qhasm and implements a binary crit-bit (alsa known

More information

DETAILED SYLLABUS INTRODUCTION TO C LANGUAGE

DETAILED SYLLABUS INTRODUCTION TO C LANGUAGE COURSE TITLE C LANGUAGE DETAILED SYLLABUS SR.NO NAME OF CHAPTERS & DETAILS HOURS ALLOTTED 1 INTRODUCTION TO C LANGUAGE About C Language Advantages of C Language Disadvantages of C Language A Sample Program

More information

Agenda. CS 61C: Great Ideas in Computer Architecture. Lecture 2: Numbers & C Language 8/29/17. Recap: Binary Number Conversion

Agenda. CS 61C: Great Ideas in Computer Architecture. Lecture 2: Numbers & C Language 8/29/17. Recap: Binary Number Conversion CS 61C: Great Ideas in Computer Architecture Lecture 2: Numbers & C Language Krste Asanović & Randy Katz http://inst.eecs.berkeley.edu/~cs61c Numbers wrap-up This is not on the exam! Break C Primer Administrivia,

More information

Input And Output of C++

Input And Output of C++ Input And Output of C++ Input And Output of C++ Seperating Lines of Output New lines in output Recall: "\n" "newline" A second method: object endl Examples: cout

More information

Programming Lecture 3

Programming Lecture 3 Programming Lecture 3 Expressions (Chapter 3) Primitive types Aside: Context Free Grammars Constants, variables Identifiers Variable declarations Arithmetic expressions Operator precedence Assignment statements

More information

CS 61C: Great Ideas in Computer Architecture. Lecture 2: Numbers & C Language. Krste Asanović & Randy Katz

CS 61C: Great Ideas in Computer Architecture. Lecture 2: Numbers & C Language. Krste Asanović & Randy Katz CS 61C: Great Ideas in Computer Architecture Lecture 2: Numbers & C Language Krste Asanović & Randy Katz http://inst.eecs.berkeley.edu/~cs61c Numbers wrap-up This is not on the exam! Break C Primer Administrivia,

More information

In this session we will cover the following sub-topics: 1.Identifiers 2.Variables 3.Keywords 4.Statements 5.Comments 6.Whitespaces 7.Syntax 8.

In this session we will cover the following sub-topics: 1.Identifiers 2.Variables 3.Keywords 4.Statements 5.Comments 6.Whitespaces 7.Syntax 8. In this session we will cover the following sub-topics: 1.Identifiers 2.Variables 3.Keywords 4.Statements 5.Comments 6.Whitespaces 7.Syntax 8.Semantic www.tenouk.com, 1/16 C IDENTIFIERS 1. Is a unique

More information

XSEDE Scholars Program Introduction to C Programming. John Lockman III June 7 th, 2012

XSEDE Scholars Program Introduction to C Programming. John Lockman III June 7 th, 2012 XSEDE Scholars Program Introduction to C Programming John Lockman III June 7 th, 2012 Homework 1 Problem 1 Find the error in the following code #include int main(){ } printf(find the error!\n");

More information

C How to Program, 7/e by Pearson Education, Inc. All Rights Reserved.

C How to Program, 7/e by Pearson Education, Inc. All Rights Reserved. C How to Program, 7/e This chapter serves as an introduction to data structures. Arrays are data structures consisting of related data items of the same type. In Chapter 10, we discuss C s notion of

More information

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield CMPS 3500 Programming Languages Dr. Chengwei Lei CEECS California State University, Bakersfield Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing

More information

INTRODUCTION 1 AND REVIEW

INTRODUCTION 1 AND REVIEW INTRODUTION 1 AND REVIEW hapter SYS-ED/ OMPUTER EDUATION TEHNIQUES, IN. Programming: Advanced Objectives You will learn: Program structure. Program statements. Datatypes. Pointers. Arrays. Structures.

More information

Fundamental Data Types. CSE 130: Introduction to Programming in C Stony Brook University

Fundamental Data Types. CSE 130: Introduction to Programming in C Stony Brook University Fundamental Data Types CSE 130: Introduction to Programming in C Stony Brook University Program Organization in C The C System C consists of several parts: The C language The preprocessor The compiler

More information

Indexing and Searching

Indexing and Searching Indexing and Searching Introduction How to retrieval information? A simple alternative is to search the whole text sequentially Another option is to build data structures over the text (called indices)

More information

ME240 Computation for Mechanical Engineering. Lecture 4. C++ Data Types

ME240 Computation for Mechanical Engineering. Lecture 4. C++ Data Types ME240 Computation for Mechanical Engineering Lecture 4 C++ Data Types Introduction In this lecture we will learn some fundamental elements of C++: Introduction Data Types Identifiers Variables Constants

More information

Fundamental of Programming (C)

Fundamental of Programming (C) Borrowed from lecturer notes by Omid Jafarinezhad Fundamental of Programming (C) Lecturer: Vahid Khodabakhshi Lecture 3 Constants, Variables, Data Types, And Operations Department of Computer Engineering

More information

6.096 Introduction to C++ January (IAP) 2009

6.096 Introduction to C++ January (IAP) 2009 MIT OpenCourseWare http://ocw.mit.edu 6.096 Introduction to C++ January (IAP) 2009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. Welcome to 6.096 Lecture

More information

Grammar Rules in Prolog!!

Grammar Rules in Prolog!! Grammar Rules in Prolog GR-1 Backus-Naur Form (BNF) BNF is a common grammar used to define programming languages» Developed in the late 1950 s Because grammars are used to describe a language they are

More information

Week 2: Syntax Specification, Grammars

Week 2: Syntax Specification, Grammars CS320 Principles of Programming Languages Week 2: Syntax Specification, Grammars Jingke Li Portland State University Fall 2017 PSU CS320 Fall 17 Week 2: Syntax Specification, Grammars 1/ 62 Words and Sentences

More information

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1 CSEP 501 Compilers Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter 2008 1/8/2008 2002-08 Hal Perkins & UW CSE B-1 Agenda Basic concepts of formal grammars (review) Regular expressions

More information

Short Notes of CS201

Short Notes of CS201 #includes: Short Notes of CS201 The #include directive instructs the preprocessor to read and include a file into a source code file. The file name is typically enclosed with < and > if the file is a system

More information

Compiler Theory. (Semantic Analysis and Run-Time Environments)

Compiler Theory. (Semantic Analysis and Run-Time Environments) Compiler Theory (Semantic Analysis and Run-Time Environments) 005 Semantic Actions A compiler must do more than recognise whether a sentence belongs to the language of a grammar it must do something useful

More information

The component base of C language. Nguyễn Dũng Faculty of IT Hue College of Science

The component base of C language. Nguyễn Dũng Faculty of IT Hue College of Science The component base of C language Nguyễn Dũng Faculty of IT Hue College of Science Content A brief history of C Standard of C Characteristics of C The C compilation model Character set and keyword Data

More information

Recursive definition: A definition that is defined in terms of itself. Recursive method: a method that calls itself (directly or indirectly).

Recursive definition: A definition that is defined in terms of itself. Recursive method: a method that calls itself (directly or indirectly). Recursion We teach recursion as the first topic, instead of new object-oriented ideas, so that those who are new to Java can have a chance to catch up on the object-oriented ideas from CS100. Recursive

More information

ANSI C Programming Simple Programs

ANSI C Programming Simple Programs ANSI C Programming Simple Programs /* This program computes the distance between two points */ #include #include #include main() { /* Declare and initialize variables */ double

More information

Chapter 3 Basic Data Types. Lecture 3 1

Chapter 3 Basic Data Types. Lecture 3 1 Chapter 3 Basic Data Types Lecture 3 1 Topics Scalar Types in C Integers Bit Operations Floating Point Types Conversions Lecture 4 2 Scalar Types in C The amount of memory available for a variable depends

More information

UNIT -2 LEXICAL ANALYSIS

UNIT -2 LEXICAL ANALYSIS OVER VIEW OF LEXICAL ANALYSIS UNIT -2 LEXICAL ANALYSIS o To identify the tokens we need some method of describing the possible tokens that can appear in the input stream. For this purpose we introduce

More information

Functional Programming in Haskell Prof. Madhavan Mukund and S. P. Suresh Chennai Mathematical Institute

Functional Programming in Haskell Prof. Madhavan Mukund and S. P. Suresh Chennai Mathematical Institute Functional Programming in Haskell Prof. Madhavan Mukund and S. P. Suresh Chennai Mathematical Institute Module # 02 Lecture - 03 Characters and Strings So, let us turn our attention to a data type we have

More information

Chapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines.

Chapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines. Chapter 1 Summary Comments are indicated by a hash sign # (also known as the pound or number sign). Text to the right of the hash sign is ignored. (But, hash loses its special meaning if it is part of

More information

3 Data Storage 3.1. Foundations of Computer Science Cengage Learning

3 Data Storage 3.1. Foundations of Computer Science Cengage Learning 3 Data Storage 3.1 Foundations of Computer Science Cengage Learning Objectives After studying this chapter, the student should be able to: List five different data types used in a computer. Describe how

More information

CSCI-1200 Data Structures Fall 2013 Lecture 9 Iterators & Lists

CSCI-1200 Data Structures Fall 2013 Lecture 9 Iterators & Lists Review from Lecture 8 CSCI-1200 Data Structures Fall 2013 Lecture 9 Iterators & Lists Explored a program to maintain a class enrollment list and an associated waiting list. Unfortunately, erasing items

More information

Hexadecimal Numbers. Journal: If you were to extend our numbering system to more digits, what digits would you use? Why those?

Hexadecimal Numbers. Journal: If you were to extend our numbering system to more digits, what digits would you use? Why those? 9/10/18 1 Binary and Journal: If you were to extend our numbering system to more digits, what digits would you use? Why those? Hexadecimal Numbers Check Homework 3 Binary Numbers A binary (base-two) number

More information

COMP4128 Programming Challenges

COMP4128 Programming Challenges Multi- COMP4128 Programming Challenges School of Computer Science and Engineering UNSW Australia Table of Contents 2 Multi- 1 2 Multi- 3 3 Multi- Given two strings, a text T and a pattern P, find the first

More information

M3-R4: PROGRAMMING AND PROBLEM SOLVING THROUGH C LANGUAGE

M3-R4: PROGRAMMING AND PROBLEM SOLVING THROUGH C LANGUAGE M3-R4: PROGRAMMING AND PROBLEM SOLVING THROUGH C LANGUAGE NOTE: 1. There are TWO PARTS in this Module/Paper. PART ONE contains FOUR questions and PART TWO contains FIVE questions. 2. PART ONE is to be

More information

CSC 421: Algorithm Design & Analysis. Spring Space vs. time

CSC 421: Algorithm Design & Analysis. Spring Space vs. time CSC 421: Algorithm Design & Analysis Spring 2015 Space vs. time space/time tradeoffs examples: heap sort, data structure redundancy, hashing string matching brute force, Horspool algorithm, Boyer-Moore

More information

BLM2031 Structured Programming. Zeyneb KURT

BLM2031 Structured Programming. Zeyneb KURT BLM2031 Structured Programming Zeyneb KURT 1 Contact Contact info office : D-219 e-mail zeynebkurt@gmail.com, zeyneb@ce.yildiz.edu.tr When to contact e-mail first, take an appointment What to expect help

More information

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

SEMANTIC ANALYSIS TYPES AND DECLARATIONS SEMANTIC ANALYSIS CS 403: Type Checking Stefan D. Bruda Winter 2015 Parsing only verifies that the program consists of tokens arranged in a syntactically valid combination now we move to check whether

More information

Arrays. Week 4. Assylbek Jumagaliyev

Arrays. Week 4. Assylbek Jumagaliyev Arrays Week 4 Assylbek Jumagaliyev a.jumagaliyev@iitu.kz Introduction Arrays Structures of related data items Static entity (same size throughout program) A few types Pointer-based arrays (C-like) Arrays

More information

Working with Batches of Data

Working with Batches of Data Hartmut Kaiser hkaiser@cct.lsu.edu http://www.cct.lsu.edu/ hkaiser/fall_2012/csc1254.html 2 Abstract So far we looked at simple read a string print a string problems. Now we will look at more complex problems

More information

Review of the C Programming Language for Principles of Operating Systems

Review of the C Programming Language for Principles of Operating Systems Review of the C Programming Language for Principles of Operating Systems Prof. James L. Frankel Harvard University Version of 7:26 PM 4-Sep-2018 Copyright 2018, 2016, 2015 James L. Frankel. All rights

More information

String Matching Algorithms

String Matching Algorithms String Matching Algorithms Georgy Gimel farb (with basic contributions from M. J. Dinneen, Wikipedia, and web materials by Ch. Charras and Thierry Lecroq, Russ Cox, David Eppstein, etc.) COMPSCI 369 Computational

More information

Do not start the test until instructed to do so!

Do not start the test until instructed to do so! Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted one-page formula sheet. No calculators or other electronic devices

More information

CS & IT Conversions. Magnitude 10,000 1,

CS & IT Conversions. Magnitude 10,000 1, CS & IT Conversions There are several number systems that you will use when working with computers. These include decimal, binary, octal, and hexadecimal. Knowing how to convert between these number systems

More information

Introduction. two of the most fundamental concepts in computer science are, given an array of values:

Introduction. two of the most fundamental concepts in computer science are, given an array of values: Searching Class 28 Introduction two of the most fundamental concepts in computer science are, given an array of values: search through the values to see if a specific value is present and, if so, where

More information

UEE1302 (1102) F10: Introduction to Computers and Programming

UEE1302 (1102) F10: Introduction to Computers and Programming Computational Intelligence on Automation Lab @ NCTU Learning Objectives UEE1302 (1102) F10: Introduction to Computers and Programming Programming Lecture 00 Programming by Example Introduction to C++ Origins,

More information

Information Science 1

Information Science 1 Information Science 1 - Representa*on of Data in Memory- Week 03 College of Information Science and Engineering Ritsumeikan University Topics covered l Basic terms and concepts of The Structure of a Computer

More information

Formal Languages and Compilers Lecture VI: Lexical Analysis

Formal Languages and Compilers Lecture VI: Lexical Analysis Formal Languages and Compilers Lecture VI: Lexical Analysis Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/ artale/ Formal

More information