Lecture 11 & 12 PHP - III May 14th, 2015 May 16th, 2015 Web Application Development CS 228 Web Development CS 303 Fall 2015 numangift.wordpress.com/web-development-spring-2015
Regular Expressions
What is form validation? validation: ensuring that form's values are correct some types of validation: preventing blank values (email address) ensuring the type of values integer, real number, currency, phone number, Social Security number, postal address, email address, date, credit card number,... ensuring the format and range of values (ZIP code must be a 5-digit integer) ensuring that values fit together (user types email twice, and the two must match)
A real form that uses validation
Client vs. server-side validation Validation can be performed: client-side (before the form is submitted) can lead to a better user experience, but not secure (why not?) server-side (in PHP code, after the form is submitted) needed for truly secure validation, but slower both best mix of convenience and security, but requires most effort to program
An example form to be validated
    <form action="http://foo.com/foo.php" method="post">
      <div>
        City: <input name="city" /> <br />
        State: <input name="state" size="2" maxlength="2" /> <br />
        ZIP: <input name="zip" size="5" maxlength="5" /> <br />
        <input type="submit" />
      </div>
    </form>
Let's validate this form's data on the server...
Recall: Basic server-side validation
    $city = $_POST["city"];
    $state = $_POST["state"];
    $zip = $_POST["zip"];
    if (!$city || strlen($state) != 2 || strlen($zip) != 5) {
        print "Error, invalid city/state/zip submitted.";
    }
basic idea: examine parameter values, and if they are bad, show an error message and abort. But: How do you test for integers vs. real numbers vs. strings? How do you test for a valid credit card number? How do you test that a person's name has a middle initial? (How do you test whether a given string matches a particular complex format?)
Regular expressions /^[a-zA-Z_\-]+@(([a-zA-Z_\-])+\.)+[a-zA-Z]{2,4}$/ regular expression ("regex"): a description of a pattern of text; can test whether a string matches the expression's pattern; can use a regex to search/replace characters in a string; regular expressions are extremely powerful but tough to read (the above regular expression matches simple email addresses); regular expressions occur in many places: Java: Scanner, String's split method (CSE 143 sentence generator); supported by PHP, JavaScript, and other languages; many text editors (TextPad) allow regexes in search/replace. The site Rubular is useful for testing a regex.
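To make the email pattern above concrete, here is a minimal sketch that runs it through preg_match. The constant and function names are hypothetical and chosen only for this illustration, and the pattern is written with the upper- and lowercase ranges [a-zA-Z].

```php
<?php
// The slide's pattern: letters, "_" or "-" before the "@", then one or more
// dot-terminated labels, then a top-level domain of 2 to 4 letters.
const SLIDE_EMAIL_REGEX = '/^[a-zA-Z_\-]+@(([a-zA-Z_\-])+\.)+[a-zA-Z]{2,4}$/';

// Hypothetical helper: true when $email matches the slide's pattern.
function looks_like_email(string $email): bool {
    return preg_match(SLIDE_EMAIL_REGEX, $email) === 1;
}

var_dump(looks_like_email("user@example.com"));      // bool(true)
var_dump(looks_like_email("no-at-sign.com"));        // bool(false)
var_dump(looks_like_email("jsmith42@example.com"));  // bool(false): digits are not in the character sets
```

The last call shows that this pattern is deliberately simple: it rejects some addresses that are valid in practice, so treat it as a teaching example rather than a complete email validator.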
Basic regular expressions /abc/ in PHP, regexes are strings that begin and end with / the simplest regexes simply match a particular substring the above regular expression matches any string containing "abc": YES: "abc", "abcdef", "defabc", ".=.abc.=.",... NO: "fedcba", "ab c", "PHP",...
Wildcards: . A dot . matches any character except a \n line break; /.oo.y/ matches "Doocy", "goofy", "LooNy",... A trailing i at the end of a regex (after the closing /) signifies a case-insensitive match; /all/i matches "Allison Obourn", "small", "JANE GOODALL",...
Special characters: |, (), \ | means OR: /abc|def|g/ matches "abc", "def", or "g". There's no AND symbol. Why not? () are for grouping: /(Homer|Marge) Simpson/ matches "Homer Simpson" or "Marge Simpson". \ starts an escape sequence; many characters must be escaped to match them literally: / \ $ . [ ] ( ) ^ * + ? For example, /<br \/>/ matches lines containing <br /> tags
Quantifiers: *, +, ? * means 0 or more occurrences: /abc*/ matches "ab", "abc", "abcc", "abccc",... /a(bc)*/ matches "a", "abc", "abcbc", "abcbcbc",... /a.*a/ matches "aa", "aba", "a8qa", "a!?xyz 9a",... + means 1 or more occurrences: /Hi!+ there/ matches "Hi! there", "Hi!!! there",... /a(bc)+/ matches "abc", "abcbc", "abcbcbc",... ? means 0 or 1 occurrences: /a(bc)?/ matches "a" or "abc"
More quantifiers: {min,max} {min,max} means between min and max occurrences (inclusive): /a(bc){2,4}/ matches "abcbc", "abcbcbc", or "abcbcbcbc". max may be omitted to mean no upper limit: {2,} means 2 or more. {0,6} means up to 6 (older versions of PHP's regex engine do not treat the shorthand {,6} as a quantifier, so write the 0 explicitly). {3} means exactly 3
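The quantifier rules above can be checked directly with preg_match. This is a small sketch; the helper name matches is an assumption made for the example, and each pattern is wrapped in ^ and $ (the anchors covered on a later slide) so that the whole string has to match and extra repetitions are rejected.

```php
<?php
// Hypothetical helper: true when $regex matches $text.
function matches(string $regex, string $text): bool {
    return preg_match($regex, $text) === 1;
}

// ^ and $ pin the pattern to the whole string.
var_dump(matches('/^a(bc)*$/', 'a'));                // bool(true): zero repetitions are allowed
var_dump(matches('/^a(bc)+$/', 'a'));                // bool(false): needs at least one "bc"
var_dump(matches('/^a(bc)?$/', 'abcbc'));            // bool(false): at most one "bc"
var_dump(matches('/^a(bc){2,4}$/', 'abcbc'));        // bool(true): two repetitions
var_dump(matches('/^a(bc){2,4}$/', 'abcbcbcbcbc'));  // bool(false): five repetitions
var_dump(matches('/^Go{2,}gle$/', 'Goooogle'));      // bool(true): two or more "o"s
```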
Practice exercise When you search Google, it shows the number of pages of results as "o"s in the word "Google". What regex matches strings like "Google", "Gooogle", "Goooogle",...? Answer: /Goo+gle/ (or /Go{2,}gle/)
Anchors: ^ and $ ^ represents the beginning of the string or line; $ represents the end. /Jess/ matches all strings that contain Jess; /^Jess/ matches all strings that start with Jess; /Jess$/ matches all strings that end with Jess; /^Jess$/ matches the exact string "Jess" only. /^Alli.*Obourn$/ matches "AlliObourn", "Allie Obourn", "Allison E Obourn",... but NOT "Allison Obourn stinks" or "I H8 Allison Obourn" (on the other slides, when we say, /PATTERN/ matches "text", we really mean that it matches any string that contains that text)
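A short sketch of the four anchor combinations above, applied to one sample string. The function name anchor_report is hypothetical and exists only for this illustration.

```php
<?php
// Hypothetical helper: reports how "Jess" relates to $text under each anchoring.
function anchor_report(string $text): array {
    return [
        'contains' => preg_match('/Jess/', $text) === 1,
        'starts'   => preg_match('/^Jess/', $text) === 1,
        'ends'     => preg_match('/Jess$/', $text) === 1,
        'exact'    => preg_match('/^Jess$/', $text) === 1,
    ];
}

print_r(anchor_report("I know Jess"));
// contains and ends are true; starts and exact are false
```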
Character sets: [] [] group characters into a character set; will match any single character from the set. /[bcd]art/ matches strings containing "bart", "cart", and "dart"; equivalent to /(b|c|d)art/ but shorter. Inside [], many of the special characters act as normal characters: /what[!*?]*/ matches "what", "what!", "what?**!", "what??!",... What regular expression matches DNA (strings of A, C, G, or T)? /[ACGT]+/
Character ranges: [start-end] inside a character set, specify a range of characters with -; /[a-z]/ matches any lowercase letter; /[a-zA-Z0-9]/ matches any lower- or uppercase letter or digit; an initial ^ inside a character set negates it: /[^abcd]/ matches any character other than a, b, c, or d; inside a character set, - must be escaped to be matched: /[+\-]?[0-9]+/ matches an optional + or -, followed by at least one digit
Practice Exercises What regular expression matches letter grades such as A, B+, or D-? What regular expression would match UW Student ID numbers? What regular expression would match a sequence of only consonants, assuming that the string consists only of lowercase letters?
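One possible set of answers to the exercises above, as a sketch rather than the official solutions. It assumes that grades run over A, B, C, D, and F, that a student ID is exactly seven digits (as a later slide in this lecture states), and that "y" counts as a consonant. The function names are hypothetical.

```php
<?php
// A letter A, B, C, D, or F followed by an optional + or -.
function is_letter_grade(string $s): bool {
    return preg_match('/^[ABCDF][+\-]?$/', $s) === 1;
}

// Exactly seven digits (assumed student ID format).
function is_student_id(string $s): bool {
    return preg_match('/^[0-9]{7}$/', $s) === 1;
}

// One or more characters, none of which is a vowel.
// Only meaningful when the input is known to be lowercase letters.
function is_all_consonants(string $s): bool {
    return preg_match('/^[^aeiou]+$/', $s) === 1;
}

var_dump(is_letter_grade("B+"));          // bool(true)
var_dump(is_letter_grade("E"));           // bool(false)
var_dump(is_student_id("1234567"));       // bool(true)
var_dump(is_all_consonants("rhythm"));    // bool(true)
var_dump(is_all_consonants("tree"));      // bool(false)
```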
Escape sequences special escape sequence character sets: \d matches any digit (same as [0-9]); \D any non-digit ([^0-9]); \w matches any word character (same as [a-zA-Z_0-9]); \W any non-word char; \s matches any whitespace character (space, \t, \n, etc.); \S any non-whitespace. What regular expression matches names in a "Last, First M." format with one or more spaces between the parts? /\w+,\s+\w+\s+\w\./
Regular expressions in PHP
regex syntax: strings that begin and end with /, such as "/[AEIOU]+/"
function: description
preg_match(regex, string): returns 1 (treated as TRUE) if string matches regex, 0 if it does not
preg_replace(regex, replacement, string): returns a new string with all substrings that match regex replaced by replacement
preg_split(regex, string): returns an array of strings from given string broken apart using given regex as delimiter (like explode but more powerful)
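A brief sketch of what each of the three functions returns. The sample strings are made up for the illustration. The optional third argument to preg_match, which receives the captured groups, is part of PHP's preg_match but is not covered on the slide.

```php
<?php
// preg_match: 1 on a match, 0 otherwise; captured groups land in $parts.
$found = preg_match('/^(\d{4})-(\d{2})-(\d{2})$/', "2015-05-14", $parts);
print "$found {$parts[1]} {$parts[2]} {$parts[3]}\n";   // 1 2015 05 14

// preg_split: split on a comma with optional surrounding whitespace.
$fields = preg_split('/\s*,\s*/', "red , green,blue");
print_r($fields);                                        // red, green, blue

// preg_replace: replace every digit with "#".
$masked = preg_replace('/\d/', '#', "card 1234");
print "$masked\n";                                       // card ####
```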
PHP form validation w/ regexes
    $state = $_POST["state"];
    if (!preg_match("/^[A-Z]{2}$/", $state)) {
        print "Error, invalid state submitted.";
    }
preg_match and regexes help you to validate parameters; sites often don't want to give a descriptive error message here (why?)
Regular expression PHP example
    # replace vowels with stars
    $str = "the quick brown fox";
    $str = preg_replace("/[aeiou]/", "*", $str);    # "th* q**ck br*wn f*x"

    # break apart into words
    $words = preg_split("/[ ]+/", $str);            # ("th*", "q**ck", "br*wn", "f*x")

    # capitalize words that had 2+ consecutive vowels
    for ($i = 0; $i < count($words); $i++) {
        if (preg_match("/\\*{2,}/", $words[$i])) {
            $words[$i] = strtoupper($words[$i]);
        }
    }
    # ("th*", "Q**CK", "br*wn", "f*x")
Practice exercise Use regular expressions to add validation to the turnin form shown in previous lectures. The student name must not be blank and must contain a first and last name (two words). The student ID must be a seven-digit integer. The assignment must be a string such as "hw1" or "hw6". The section must be a two-letter uppercase string representing a valid section such as AF or BK. The email address must follow a valid general format such as user@example.com. The course must be one of "142", "143", or "154" exactly.
Handling invalid data
    function check_valid($regex, $param) {
        if (preg_match($regex, $_POST[$param])) {
            return $_POST[$param];
        } else {
            # code to run if the parameter is invalid
            die("bad $param");
        }
    }
    ...
    $sid = check_valid("/^[0-9]{7}$/", "studentid");
    $section = check_valid("/^[AB][A-C]$/i", "section");
Having a common helper function to check parameters is useful. If your page needs to show a particular HTML output on errors, the die function may not be appropriate.
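When die is not appropriate, one alternative is to collect the names of all invalid parameters and let the page decide how to report them. The sketch below assumes a hypothetical rule table for the turnin form described earlier. The parameter names, the patterns, and the function name find_invalid are all illustrative choices, not part of the original form.

```php
<?php
// Hypothetical rule table: parameter name => regex it must match.
const TURNIN_RULES = [
    'name'       => '/^\S+\s+\S+$/',       // two words: first and last name
    'studentid'  => '/^[0-9]{7}$/',        // seven-digit integer
    'assignment' => '/^hw[0-9]+$/',        // "hw1", "hw6", ...
    'section'    => '/^[A-Z]{2}$/',        // two uppercase letters
    'email'      => '/^[a-zA-Z_\-]+@(([a-zA-Z_\-])+\.)+[a-zA-Z]{2,4}$/',
    'course'     => '/^(142|143|154)$/',   // exactly one of the three courses
];

// Returns the names of the parameters that are missing or fail their rule.
function find_invalid(array $params, array $rules): array {
    $bad = [];
    foreach ($rules as $name => $regex) {
        $value = $params[$name] ?? '';
        if (!is_string($value) || preg_match($regex, $value) !== 1) {
            $bad[] = $name;
        }
    }
    return $bad;
}

// Sample data standing in for $_POST.
$submitted = [
    'name' => 'Ada', 'studentid' => '12345', 'assignment' => 'hw6',
    'section' => 'AF', 'email' => 'user@example.com', 'course' => '155',
];
print_r(find_invalid($submitted, TURNIN_RULES));   // name, studentid, course
```

In a real page the call would be find_invalid($_POST, TURNIN_RULES), and an empty result would mean every field passed.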
Regular expressions in HTML forms
    How old are you?
    <input type="text" name="age" size="2" pattern="[0-9]+" title="an integer" />
    <input type="submit" />
HTML5 adds a new pattern attribute to input elements; the browser will refuse to submit the form unless the value matches the regex
Cookies
Stateful client/server interaction Sites like amazon.com seem to "know who I am." How do they do this? How does a client uniquely identify itself to a server, and how does the server provide specific content to each client? HTTP is a stateless protocol; it simply allows a browser to request a single document from a web server today we'll learn about pieces of data called cookies used to work around this problem, which are used as the basis of higher-level sessions between clients and servers
What is a cookie? cookie: a small amount of information sent by a server to a browser, and then sent back by the browser on future page requests cookies have many uses: authentication user tracking maintaining user preferences, shopping carts, etc. a cookie's data consists of a single name/value pair, sent in the header of the client's HTTP GET or POST request
How cookies are sent when the browser requests a page, the server may send back one or more cookies with it; if your server has previously sent any cookies to the browser, the browser will send them back on subsequent requests; alternate model: client-side JavaScript code can set/get cookies
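On the wire, the exchange described above uses two HTTP headers: the server's response carries Set-Cookie, and later requests from the browser carry Cookie. PHP normally does the parsing for you and fills $_COOKIE, so the sketch below is only an illustration of what the browser sends back. The function name parse_cookie_header is hypothetical.

```php
<?php
// Server response:          Set-Cookie: username=allison
// Later browser request:    Cookie: username=allison; age=19

// Parses the value of a "Cookie:" request header into name => value pairs.
function parse_cookie_header(string $header): array {
    $cookies = [];
    foreach (explode(';', $header) as $pair) {
        $pair = trim($pair);
        if ($pair === '' || strpos($pair, '=') === false) {
            continue;   // skip empty or malformed pieces
        }
        [$name, $value] = explode('=', $pair, 2);
        $cookies[$name] = urldecode($value);   // cookie values are URL-encoded
    }
    return $cookies;
}

print_r(parse_cookie_header("username=allison; age=19"));
```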
Myths about cookies Myths: Cookies are like worms/viruses and can erase data from the user's hard disk. Cookies are a form of spyware and can steal your personal information. Cookies generate popups and spam. Cookies are only used for advertising. Facts: Cookies are only data, not program code. Cookies cannot erase or read information from the user's computer. Cookies are usually anonymous (do not contain personal information). Cookies CAN be used to track your viewing habits on a particular site.
A "tracking cookie" an advertising company can put a cookie on your machine when you visit one site, and see it when you visit another site that also uses that advertising company therefore they can tell that the same person (you) visited both sites can be thwarted by telling your browser not to accept "third-party cookies"
Where are the cookies on my computer? IE: HomeDirectory\Cookies e.g. C:\Documents and Settings\jsmith\Cookies; each is stored as a .txt file similar to the site's domain name. Chrome: C:\Users\username\AppData\Local\Google\Chrome\User Data\Default. Firefox: HomeDirectory\.mozilla\firefox\???.default\cookies.txt; view cookies in Firefox preferences: Privacy, Show Cookies...
How long does a cookie exist? session cookie: the default type; a temporary cookie that is stored only in the browser's memory; when the browser is closed, temporary cookies will be erased; cannot be used for tracking long-term information; safer, because no programs other than the browser can access them. persistent cookie: one that is stored in a file on the browser's computer; can track long-term information; potentially less secure, because users (or programs they run) can open cookie files, see/change the cookie values, etc.
Setting a cookie in PHP
    setcookie("name", "value");
    setcookie("username", "allllison");
    setcookie("age", 19);
setcookie causes your script to send a cookie to the user's browser; setcookie must be called before any output statements (HTML blocks, print, or echo); you can set multiple cookies (20-50) per user, each up to 3-4K bytes; by default, the cookie expires when browser is closed (a "session cookie")
Retrieving information from a cookie
    $variable = $_COOKIE["name"];   # retrieve value of the cookie

    if (isset($_COOKIE["username"])) {
        $username = $_COOKIE["username"];
        print("Welcome back, $username.\n");
    } else {
        print("Never heard of you.\n");
    }
    print("All cookies received:\n");
    print_r($_COOKIE);
any cookies sent by client are stored in the $_COOKIE associative array (the name is case-sensitive); use isset function to see whether a given cookie name exists
What cookies have been set? Chrome: F12 > Resources > Cookies; Firefox: F12 > Cookies
Expiration / persistent cookies
    setcookie("name", "value", expiration);

    $expiretime = time() + 60*60*24*7;   # 1 week from now
    setcookie("couponnumber", "389752", $expiretime);
    setcookie("couponvalue", "100.00", $expiretime);
to set a persistent cookie, pass a third parameter for when it should expire, given as a Unix timestamp (an integer number of seconds), usually computed relative to the current time; if no expiration is passed, the cookie is a session cookie and expires when the browser is closed; the time function returns the current time in seconds; the date function can convert a time in seconds to a readable date
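The expiration arithmetic above can be wrapped in a small helper, sketched here. The names SECONDS_PER_DAY and expiry_in_days are hypothetical, and the optional $now parameter is an addition that makes the calculation easy to check with a fixed starting time.

```php
<?php
const SECONDS_PER_DAY = 60 * 60 * 24;

// Returns the Unix timestamp $days days after $now (default: the current time).
function expiry_in_days(int $days, ?int $now = null): int {
    $now = $now ?? time();
    return $now + $days * SECONDS_PER_DAY;
}

$expire = expiry_in_days(7);   // one week from now
// gmdate formats a timestamp as a readable date in UTC.
print "Cookie would expire on " . gmdate("D, d M Y H:i:s", $expire) . " GMT\n";

// The result can then be passed as the third argument:
// setcookie("couponnumber", "389752", $expire);
```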
Deleting a cookie
    setcookie("name", FALSE);
    setcookie("couponnumber", FALSE);
setting the cookie to FALSE erases it; you can also set the cookie but with an expiration that is before the present time:
    setcookie("count", 42, time() - 1);
remember that the cookie will also be deleted automatically when it expires, or can be deleted manually by the user by clearing their browser cookies
Clearing cookies in your browser Chrome: Wrench > History > Clear all browsing data... Firefox: Firefox menu > Options > Privacy > Show Cookies... > Remove (All) Cookies
Cookie scope and attributes
    setcookie("name", "value", expire, "path", "domain", secure, httponly);
a given cookie is associated only with one particular domain (e.g. www.example.com); you can also specify a path to indicate that the cookie should only be sent on certain subsets of pages within that site (e.g. /users/accounts/ will bind to www.example.com/users/accounts); a cookie can be specified as Secure to indicate that it should only be sent over HTTPS; a cookie can be specified as HttpOnly to indicate that it should be sent only in HTTP/HTTPS requests and not exposed to JavaScript (seen later); this helps avoid JavaScript security attacks
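Since PHP 7.3, setcookie also accepts the attributes as an associative array in place of the long positional list, which some find easier to read. The sketch below assumes that PHP version. The helper name account_cookie_options is hypothetical, and the path and domain values are the examples from the slide.

```php
<?php
// Hypothetical helper: builds the attributes for a cookie that is limited to
// the account pages, sent only over HTTPS, and hidden from JavaScript.
function account_cookie_options(int $lifetimeSeconds): array {
    return [
        'expires'  => time() + $lifetimeSeconds,
        'path'     => '/users/accounts/',
        'domain'   => 'www.example.com',
        'secure'   => true,    // HTTPS only
        'httponly' => true,    // not visible to client-side scripts
    ];
}

// Positional form, as on the slide:
//   setcookie("name", "value", $expire, "/users/accounts/", "www.example.com", true, true);
// Array form (PHP 7.3 or later), guarded so it only runs before any output:
if (!headers_sent()) {
    setcookie("name", "value", account_cookie_options(3600));
}
```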
Common cookie bugs
When you call setcookie, the cookie will be available in $_COOKIE on the next page load, but not the current one. If you need the value during the current page request, also store it in a variable:
    setcookie("name", "joe");
    print $_COOKIE["name"];   # undefined

    $name = "joe";
    setcookie("name", $name);
    print $name;              # joe
setcookie must be called before your code prints any output or HTML content:
    <!DOCTYPE html><html>
    <?php
    setcookie("name", "joe");   # should precede HTML content!
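One way to sidestep the first bug is a small wrapper that sends the cookie and also records the value for the rest of the current request. This is a sketch of that idea, not a standard PHP facility. The function name remember is hypothetical, and writing into $_COOKIE by hand is a design choice some projects avoid.

```php
<?php
// Hypothetical helper: queues the cookie for the browser (if headers can still
// be sent) and makes the value readable from $_COOKIE during this request.
// Returns true when the cookie header was queued.
function remember(string $name, string $value): bool {
    $queued = !headers_sent() && setcookie($name, $value);
    $_COOKIE[$name] = $value;   // visible immediately, not just on the next page load
    return $queued;
}

remember("name", "joe");
print $_COOKIE["name"] . "\n";   // joe
```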
Session
What is a session? session: an abstract concept to represent a series of HTTP requests and responses between a specific Web browser and server HTTP doesn't support the notion of a session, but PHP does sessions vs. cookies: a cookie is data stored on the client a session's data is stored on the server (only 1 session per client) sessions are often built on top of cookies: the only data the client stores is a cookie holding a unique session ID on each page request, the client sends its session ID cookie, and the server uses this to find and retrieve the client's session data
How sessions are established client's browser makes an initial request to the server server notes client's IP address/browser, stores some local session data, and sends a session ID back to client (as a cookie) client sends that same session ID (cookie) back to server on future requests server uses session ID cookie to retrieve its data for the client's session later (like a ticket given at a coat-check room)
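The coat-check idea above can be modeled with a toy in-memory store: the server keeps the data and hands out a random ID, which plays the role of the ticket. This is only a sketch to show the mechanism. The class ToySessionStore and its methods are hypothetical, and PHP's real session handling stores data in files or another backend rather than in an array. It assumes PHP 7.4 or later for the typed property.

```php
<?php
// Toy stand-in for the server side of sessions: session ID => session data.
final class ToySessionStore {
    private array $sessions = [];

    // Returns the ID to use: the one from the cookie if the server knows it,
    // otherwise a freshly generated random ID with empty data.
    public function open(?string $idFromCookie): string {
        if ($idFromCookie !== null && isset($this->sessions[$idFromCookie])) {
            return $idFromCookie;
        }
        $id = bin2hex(random_bytes(16));   // 32 hex characters
        $this->sessions[$id] = [];
        return $id;
    }

    public function set(string $id, string $key, $value): void {
        $this->sessions[$id][$key] = $value;
    }

    public function get(string $id, string $key) {
        return $this->sessions[$id][$key] ?? null;
    }
}

$store = new ToySessionStore();
$id = $store->open(null);             // first request: no cookie yet
$store->set($id, "name", "joe");
$same = $store->open($id);            // later request: browser sends the ID back
print $store->get($same, "name") . "\n";   // joe
```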
Cookies vs. sessions duration: sessions live on until the user logs out or closes the browser; cookies can live that long, or until a given fixed timeout (persistent) data storage location: sessions store data on the server (other than a session ID cookie); cookies store data on the user's browser security: sessions are hard for malicious users to tamper with or remove; cookies are easy privacy: sessions protect private information from being seen by other users of your computer; cookies do not
Sessions in PHP: session_start
    session_start();
session_start signifies your script wants a session with the user; it must be called at the top of your script, before any HTML output is produced. When you call session_start: if the server hasn't seen this user before, a new session is created; otherwise, existing session data is loaded into the $_SESSION associative array. You can store data in $_SESSION and retrieve it on future pages.
Accessing session data
    $_SESSION["name"] = value;               # store session data
    $variable = $_SESSION["name"];           # read session data
    if (isset($_SESSION["name"])) { ... }    # check for session data

    if (isset($_SESSION["points"])) {
        $points = $_SESSION["points"];
        print("You've earned $points points.\n");
    } else {
        $_SESSION["points"] = 0;   # default
    }
the $_SESSION associative array reads/stores all session data; use isset function to see whether a given value is in the session
Where is session data stored? on the client, the session ID is stored as a cookie with the name PHPSESSID; on the server, session data are stored as temporary files such as /tmp/sess_fcc17f071...; you can find out (or change) the folder where session data is saved using the session_save_path function; for very large applications, session data can be stored into a SQL database (or other destination) instead using the session_set_save_handler function
Session timeout because HTTP is stateless, it is hard for the server to know when a user has finished a session; ideally, user explicitly logs out, but many users don't; client deletes session cookies when browser closes; server automatically cleans up old sessions after a period of time; old session data consumes resources and may present a security risk; the lifetime is adjustable in PHP server settings (session.gc_maxlifetime), while the session_cache_expire function controls how long cached session pages remain valid; you can explicitly delete a session by calling session_destroy
Ending a session
    session_destroy();
session_destroy ends your current session. Potential problem: if you call session_start again later, it sometimes reuses the same session ID/data you used before. If you may want to start a completely new empty session later, it is best to flush out the old one:
    session_destroy();
    session_start();
    session_regenerate_id(true);   # flushes out session ID number; needs an active session
Common session bugs
session_start doesn't just begin a session; it also reloads any existing session for this user. So it must be called in every page that uses your session data:
    # the user has a session from a previous page
    print $_SESSION["name"];   # undefined
    session_start();
    print $_SESSION["name"];   # joe
previous sessions will linger unless you destroy them and regenerate the user's session ID:
    session_destroy();
    session_start();
    session_regenerate_id(true);   # needs an active session
Implementing user logins many sites have the ability to create accounts and log in users most apps have a database of user accounts when you try to log in, your name/pw are compared to those in the database
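A minimal sketch of the comparison step, using PHP's built-in password_hash and password_verify so that the stored value is a hash and not the password itself. The in-memory array stands in for the accounts database, and the function names and sample credentials are hypothetical.

```php
<?php
// Hypothetical stand-in for the accounts table: username => password hash.
function make_accounts(): array {
    return ['allison' => password_hash('s3cret', PASSWORD_DEFAULT)];
}

// True when the user exists and the password matches the stored hash.
function check_login(array $accounts, string $user, string $password): bool {
    return isset($accounts[$user]) && password_verify($password, $accounts[$user]);
}

$accounts = make_accounts();
var_dump(check_login($accounts, 'allison', 's3cret'));   // bool(true)
var_dump(check_login($accounts, 'allison', 'guess'));    // bool(false)
var_dump(check_login($accounts, 'nobody', 's3cret'));    // bool(false)

// After a successful check, a page would typically call session_start()
// and record the user in $_SESSION, e.g. $_SESSION["user"] = $user;
```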
"Remember Me" feature How might an app implement a "Remember Me" feature, where the user's login info is remembered and reused when the user comes back later? Is this stored as session data? Why or why not? What concerns come up when trying to remember data about the user who has logged in?
Practice problem: Power Animal Write a page poweranimal.php that chooses a random "power animal" for the user. The page should remember what animal was chosen for the user and show it again each time they visit the page. It should also count the number of times that user has visited the page. If the user selects to "start over," the animal and number of page visits should be forgotten.
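One possible shape for the core logic, sketched as a function that works on any array passed by reference so that it can be tried without a browser. The animal list and the names ANIMALS and visit are made up for the illustration. In this sketch, starting over forgets the old data and immediately picks a new animal on the same visit, which is one reading of the exercise.

```php
<?php
const ANIMALS = ['wolf', 'otter', 'falcon', 'tortoise'];   // hypothetical list

// Updates the session array in place and returns the line to display.
function visit(array &$session, bool $startOver = false): string {
    if ($startOver) {
        unset($session['animal'], $session['visits']);   // forget everything
    }
    if (!isset($session['animal'])) {
        $session['animal'] = ANIMALS[array_rand(ANIMALS)];   // random choice
        $session['visits'] = 0;
    }
    $session['visits']++;
    return "Your power animal is the {$session['animal']} (visit #{$session['visits']}).";
}

$demo = [];                   // stands in for $_SESSION
print visit($demo) . "\n";
print visit($demo) . "\n";    // same animal, visit #2
```

In poweranimal.php itself, the page would call session_start() first and then something like visit($_SESSION, isset($_POST["startover"])), where the startover parameter name is again an assumption.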
Credits https://courses.cs.washington.edu/courses/cse154/