CSE 154 LECTURE 11: REGULAR EXPRESSIONS

Similar documents
Lecture 11 & 12 PHP - III

CSE 303 Lecture 7. Regular expressions, egrep, and sed. read Linux Pocket Guide pp , 73-74, 81

CSE 390a Lecture 7. Regular expressions, egrep, and sed

Reset Buttons. More forms. Grouping input: <fieldset>,<legend> fieldset groups related input fields, adds a border; legend supplies a caption

CSE 154 LECTURE 21: COOKIES

CSE 154 LECTURE 21: COOKIES

DATA STRUCTURE AND ALGORITHM USING PYTHON

Server-side Web Development (I3302) Semester: 1 Academic Year: 2017/2018 Credits: 4 (50 hours) Dr Antoun Yaacoub

Regular Expressions. Computer Science and Engineering College of Engineering The Ohio State University. Lecture 9

CSE 154, Autumn 2012 Final Exam, Thursday, December 13, 2012

Regular Expressions. Regular expressions are a powerful search-and-replace technique that is widely used in other environments (such as Unix and Perl)

CSc 110, Autumn Lecture 14: Strings. Adapted from slides by Marty Stepp and Stuart Reges

Indian Institute of Technology Kharagpur. PERL Part III. Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. I.I.T.

PIC 40A. Lecture 19: PHP Form handling, session variables and regular expressions. Copyright 2011 Jukka Virtanen UCLA 1 05/25/12

CSc 110, Spring Lecture 14: Booleans and Strings. Adapted from slides by Marty Stepp and Stuart Reges

Problem Description Earned Max 1 CSS 2 PHP 3 PHP/JSON 4 Ajax 5 Regular Expressions 6 SQL TOTAL Total Points 100

EECS1012. Net-centric Introduction to Computing. Lecture JavaScript and Forms

Regular Expressions 1

Lecture 18 Regular Expressions

Regex Guide. Complete Revolution In programming For Text Detection

Problem Description Earned Max 1 PHP 25 2 JavaScript / DOM 25 3 Regular Expressions 25 TOTAL Total Points 100. Good luck! You can do it!

Problem Description Earned Max 1 HTML / CSS Tracing 20 2 CSS 20 3 PHP 20 4 JS / Ajax / JSON 20 5 SQL 20 X Extra Credit 1 TOTAL Total Points 100

WEB APPLICATION ENGINEERING II

Perl Regular Expressions. Perl Patterns. Character Class Shortcuts. Examples of Perl Patterns

Bioinformatics Programming. EE, NCKU Tien-Hao Chang (Darby Chang)

Understanding Regular Expressions, Special Characters, and Patterns

Python allows variables to hold string values, just like any other type (Boolean, int, float). So, the following assignment statements are valid:

PYTHON MOCK TEST PYTHON MOCK TEST III

Global Search And Replace User s Manual

Lecture 9 & 10 PHP - II

CSE 413 Final Exam. June 7, 2011

CSE 143: Computer Programming II Summer 2017 HW4: Grammar (due Tuesday, July :30pm)

ASSIGNMENT 3. COMP-202, Winter 2015, All Sections. Due: Tuesday, February 24, 2015 (23:59)

Regular Expressions. Todd Kelley CST8207 Todd Kelley 1

Problem Description Earned Max 1 PHP 25 2 JavaScript/DOM 25 3 Regular Expressions 25 4 SQL 25 TOTAL Total Points 100

CS 106B Practice Midterm Exam #1 (by Marty Stepp)

Regular Expressions in programming. CSE 307 Principles of Programming Languages Stony Brook University

CS 211 Regular Expressions

Ruby: Introduction, Basics

Lecture 8: The String Class and Boolean Zen

Problem Description Earned Max 1 PHP 20 2 JS 20 3 JS / Ajax / JSON 20 4 SQL 20 X Extra Credit 1 TOTAL Total Points 100

Structure of Programming Languages Lecture 3

EXPERIMENT NO : M/C Lenovo Think center M700 Ci3,6100,6th Gen. H81, 4GB RAM,500GB HDD

University of Washington, CSE 154 Homework Assignment 7: To-Do List

University of Washington, CSE 190 M Homework Assignment 4: NerdLuv

CSE 154 LECTURE 8: FORMS

Ruby: Introduction, Basics

CSC207 Week 9. Larry Zhang

CSE 142, Summer 2015

Who This Book Is For What This Book Covers How This Book Is Structured What You Need to Use This Book. Source Code

CS 2112 Lab: Regular Expressions

CSE 413 Final Exam Spring 2011 Sample Solution. Strings of alternating 0 s and 1 s that begin and end with the same character, either 0 or 1.

Form Validation (with jquery, HTML5, and CSS) MIS Konstantin Bauman. Department of MIS Fox School of Business Temple University

Regular Expressions. Regular Expression Syntax in Python. Achtung!

CS Unix Tools & Scripting

DAY 2. Creating Forms

CIS192 Python Programming

A first look at string processing. Python

CMSC 330: Organization of Programming Languages. Ruby Regular Expressions

Regular Expressions for Technical Writers (tutorial)

JAVA PROGRAMMING LAB. ABSTRACT EXTRA LAB, In this Lab you will learn working with recursion, JRX, Java documentations

FORMULAS QUICK REFERENCE

MCIS/UA. String Literals. String Literals. Here Documents The <<< operator (also known as heredoc) can be used to construct multi-line strings.

CSC 467 Lecture 3: Regular Expressions

Supplemental Materials: Grammars, Parsing, and Expressions. Topics. Grammars 10/11/2017. CS2: Data Structures and Algorithms Colorado State University

Building Java Programs. Chapter 2: Primitive Data and Definite Loops

Introduction to String Manipulation

15-110: Principles of Computing, Spring 2018

"Hello" " This " + "is String " + "concatenation"

Emacs manual:

15-110: Principles of Computing, Spring 2018

Here's an example of how the method works on the string "My text" with a start value of 3 and a length value of 2:

Building Java Programs

Using Java Classes Fall 2018 Margaret Reid-Miller

PowerGREP. Manual. Version October 2005

CSCA20 Worksheet Strings

Building Java Programs

CS160A EXERCISES-FILTERS2 Boyd

Regular expressions and case insensitivity

Building Java Programs

Regular expressions and case insensitivity

Week - 01 Lecture - 04 Downloading and installing Python

Functional Programming in Haskell Prof. Madhavan Mukund and S. P. Suresh Chennai Mathematical Institute

COMS 3101 Programming Languages: Perl. Lecture 3

York University Department of Electrical Engineering and Computer Science. Regular Expressions

Regular Expressions. James Balamuta STAT UIUC. Lecture 25: Nov 9, 2018

Problem Description Earned Max 1 HTML / CSS Tracing 20 2 CSS 20 3 PHP 20 4 JS / Ajax / JSON 20 5 SQL 20 X Extra Credit 1 TOTAL Total Points 100

ME 172. Lecture 2. Data Types and Modifier 3/7/2011. variables scanf() printf() Basic data types are. Modifiers. char int float double

Strings. Strings and their methods. Dr. Siobhán Drohan. Produced by: Department of Computing and Mathematics

Regular Expressions Overview Suppose you needed to find a specific IPv4 address in a bunch of files? This is easy to do; you just specify the IP

CMSC201 Computer Science I for Majors

Chapter 5 BET TER ARRAYS AND STRINGS HANDLING

Computer Systems and Architecture

Documentation Requirements Computer Science 2334 Spring 2016

CSE 21 Intro to Computing II. JAVA Objects: String & Scanner

Web Programming TL 9. Tutorial. Exercise 1: String Manipulation

CS 541 Spring Programming Assignment 2 CSX Scanner

Logical Coding, algorithms and Data Structures

Lab 3: Call to Order CSCI 2101 Fall 2018

Creating Web Pages with SeaMonkey Composer

Transcription:

CSE 154 LECTURE 11: REGULAR EXPRESSIONS

What is form validation? validation: ensuring that form's values are correct some types of validation: preventing blank values (email address) ensuring the type of values integer, real number, currency, phone number, Social Security number, postal address, email address, date, credit card number,... ensuring the format and range of values (ZIP code must be a 5-digit integer) ensuring that values fit together (user types email twice, and the two must match)

A real form that uses validation

Client vs. server-side validation Validation can be performed: client-side (before the form is submitted) can lead to a better user experience, but not secure (why not?) server-side (in PHP code, after the form is submitted) needed for truly secure validation, but slower both best mix of convenience and security, but requires most effort to program

An example form to be validated <form action="http://foo.com/foo.php" method="get"> <div> City: <input name="city" /> <br /> State: <input name="state" size="2" maxlength="2" /> <br /> ZIP: <input name="zip" size="5" maxlength="5" /> <br /> <input type="submit" /> </div> </form> HTML Let's validate this form's data on the server... output

Recall: Basic server-side validation $city = $_POST["city"]; $state = $_POST["state"]; $zip = $_POST["zip"]; if (!$city strlen($state)!= 2 strlen($zip)!= 5) { print "Error, invalid city/state/zip submitted."; } PHP basic idea: examine parameter values, and if they are bad, show an error message and abort. But: How do you test for integers vs. real numbers vs. strings? How do you test for a valid credit card number? How do you test that a person's name has a middle initial? (How do you test whether a given string matches a particular complex format?)

Regular expressions /^[a-za-z_\-]+@(([a-za-z_\-])+\.)+[a-za-z]{2,4}$/ regular expression ("regex"): a description of a pattern of text can test whether a string matches the expression's pattern can use a regex to search/replace characters in a string regular expressions are extremely powerful but tough to read (the above regular expression matches email addresses) regular expressions occur in many places: Java: Scanner, String's split method (CSE 143 sentence generator) supported by PHP, JavaScript, and other languages many text editors (TextPad) allow regexes in search/replace The site Rubular is useful for testing a regex.

Regular expressions This picture best describes regex.

Basic regular expressions /abc/ in PHP, regexes are strings that begin and end with / the simplest regexes simply match a particular substring the above regular expression matches any string containing "abc": YES: "abc", "abcdef", "defabc", ".=.abc.=.",... NO: "fedcba", "ab c", "PHP",...

Wildcards:. A dot. matches any character except a \n line break /.oo.y/ matches "Doocy", "goofy", "LooNy",... A trailing i at the end of a regex (after the closing /) signifies a case-insensitive match /all/i matches Allison Obourn", small", JANE GOODALL",...

Special characters:, (), \ means OR /abc def g/ matches "abc", "def", or "g" There's no AND symbol. Why not? () are for grouping /(Homer Marge) Simpson/ matches "Homer Simpson" or "Marge Simpson" \ starts an escape sequence many characters must be escaped to match them literally: / \ $. [ ] ( ) ^ * +? /<br \/>/ matches lines containing <br /> tags

Quantifiers: *, +,? * means 0 or more occurrences /abc*/ matches "ab", "abc", "abcc", "abccc",... /a(bc)*/ matches "a", "abc", "abcbc", "abcbcbc",... /a.*a/ matches "aa", "aba", "a8qa", "a!?xyz 9a",... + means 1 or more occurrences /Hi!+ there/ matches "Hi! there", "Hi!!! there",... /a(bc)+/ matches "abc", "abcbc", "abcbcbc",...? means 0 or 1 occurrences /a(bc)?/ matches "a" or "abc"

More quantifiers: {min,max} {min,max} means between min and max occurrences (inclusive) /a(bc){2,4}/ matches "abcbc", "abcbcbc", or "abcbcbcbc" min or max may be omitted to specify any number {2,} means 2 or more {,6} means up to 6 {3} means exactly 3

Practice exercise When you search Google, it shows the number of pages of results as "o"s in the word "Google". What regex matches strings like "Google", "Gooogle", "Goooogle",...? (try it) (data) Answer: /Goo+gle/ (or /Go{2,}gle/)

Anchors: ^ and $ ^ represents the beginning of the string or line; $ represents the end /Jess/ matches all strings that contain Jess; /^Jess/ matches all strings that start with Jess; /Jess$/ matches all strings that end with Jess; /^Jess$/ matches the exact string "Jess" only /^Alli.*Obourn$/ matches AlliObourn", Allie Obourn", Allison E Obourn",... but NOT Allison Obourn stinks" or "I H8 Allison Obourn" (on the other slides, when we say, /PATTERN/ matches "text", we really mean that it matches any string that contains that text)

Character sets: [] [] group characters into a character set; will match any single character from the set /[bcd]art/ matches strings containing "bart", "cart", and "dart" equivalent to /(b c d)art/ but shorter inside [], many of the modifier keys act as normal characters /what[!*?]*/ matches "what", "what!", "what?**!", "what??!",... What regular expression matches DNA (strings of A, C, G, or T)? /[ACGT]+/

Character ranges: [start-end] inside a character set, specify a range of characters with - /[a-z]/ matches any lowercase letter /[a-za-z0-9]/ matches any lower- or uppercase letter or digit an initial ^ inside a character set negates it /[^abcd]/ matches any character other than a, b, c, or d inside a character set, - must be escaped to be matched /[+\-]?[0-9]+/ matches an optional + or -, followed by at least one digit

Practice Exercises What regular expression matches letter grades such as A, B+, or D-? (try it) (data) What regular expression would match UW Student ID numbers? (try it) (data) What regular expression would match a sequence of only consonants, assuming that the string consists only of lowercase letters? (try it) (data)

Escape sequences special escape sequence character sets: \d matches any digit (same as [0-9]); \D any non-digit ([^0-9]) \w matches any word character (same as [a-za-z_0-9]); \W any non-word char \s matches any whitespace character (, \t, \n, etc.); \S any non-whitespace What regular expression matches names in a "Last, First M." format with any number of spaces? /\w+,\s+\w+\s+\w\./

Regular expressions in PHP (PDF) regex syntax: strings that begin and end with /, such as "/[AEIOU]+/" function preg_match(regex, string) preg_replace(regex, replacement, string) preg_split(regex, string) description returns TRUE if string matches regex returns a new string with all substrings that match regex replaced by replacement returns an array of strings from given string broken apart using given regex as delimiter (like explode but more powerful)

PHP form validation w/ regexes $state = $_POST["state"]; if (!preg_match("/^[a-z]{2}$/", $state)) { print "Error, invalid state submitted."; } PHP preg_match and regexes help you to validate parameters sites often don't want to give a descriptive error message here (why?)

Regular expression PHP example # replace vowels with stars $str = "the quick brown fox"; $str = preg_replace("/[aeiou]/", "*", $str); # "th* q**ck br*wn f*x" # break apart into words $words = preg_split("/[ ]+/", $str); # ("th*", "q**ck", "br*wn", "f*x") # capitalize words that had 2+ consecutive vowels for ($i = 0; $i < count($words); $i++) { if (preg_match("/\\*{2,}/", $words[$i])) { $words[$i] = strtoupper($words[$i]); } } # ("th*", "Q**CK", "br*wn", "f*x") PHP

Practice exercise Use regular expressions to add validation to the turnin form shown in previous lectures. The student name must not be blank and must contain a first and last name (two words). The student ID must be a seven-digit integer. The assignment must be a string such as "hw1" or "hw6". The section must be a two-letter uppercase string representing a valid section such as AF or BK. The email address must follow a valid general format such as user@example.com. The course must be one of "142", "143", or "154" exactly.

Handling invalid data function check_valid($regex, $param) { if (preg_match($regex, $_POST[$param])) { return $_POST[$param]; } else { # code to run if the parameter is invalid die("bad $param"); } }... $sid = check_valid("/^[0-9]{7}$/", "studentid"); $section = check_valid("/^[ab][a-c]$/i", "section"); PHP Having a common helper function to check parameters is useful. If your page needs to show a particular HTML output on errors, the die function may not be appropriate.

Regular expressions in HTML forms How old are you? <input type="text" name="age" size="2" pattern="[0-9]+" title="an integer" /> <input type="submit" /> HTML output HTML5 adds a new pattern attribute to input elements the browser will refuse to submit the form unless the value matches the regex