1 Introduction to strings CSCA20 Worksheet Strings A string is just a sequence of characters. Why do you think it is called string? List some real life applications that use strings: 2 Basics We define strings using either single or double quotes csca20 or "csca20" unless it needs to span more than one line and then we use triple quotes: >>> This is a sentence... on several lines... it is too long... to fit on one line. This is a sentence\non several lines\nit is too long\nto fit on one line. Hold on, we ll find out what \n stands for shortly! Q. What do you think the result of csca + 20 is? Q. What other use does + have? We say that + is overloaded. Q. How do you think Python knows what to do? Q. Why would "csca" + 20 result in an error? Need to use conversion: >>> dept = csc >>> level = a >>> num = 20 >>> dept + level + str(num) csca20 >>> 1
Q. What do you think "ho" * 3 will do? Q. How about len("saravanapavananthan") and "ou" in "colour"? 3 Slicing Strings Since strings are sequences of characters, we sometimes want to be able to retrieve portions of the string. To do so, we need to have a way to refer the i th character. By convention, the first character has position or index 0. Each subsequent character has index one greater. Here are the ways we can access characters in a string s: s = "sliceofspam" s[0] # Grab a single character. First index is zero. s[3] s[-2] # Negative indexes are for counting from the RHS s[2:5] # Slice a string. From position 2 (inclusive) to 5 (not inclusive) s[3:] # If omit the second index, goes to the far RHS s[:8] # If omit the first index, goes from the far LHS s[:] # If omit both, goes from both extremes. # I.e., you get a copy of the whole string. s[3:-2] # You can use negative indices when slicing too. s[-5:] # s[5:2] # Nothing there! 4 String Methods We have been using some string operators such as: csc + 108 # concatenation ho * 3 # repetition ou in colour # substring There are many string methods. These particular methods belong to the str type. We call methods using the same notation as calling a function in a module: u of toronto.capitalize() campus = utsc campus.capitalize() Some methods require parameters, such as 2
campus.startswith( ut ) You need to know when you are calling a method vs calling a function; the notation is different. For example: s = "Hi, CSCA20 students!" len(s) # len is a built-in function. s is the argument passed in. s.len() # Doesn t work. len is not a string method. s.lower() # lower is a str method lower(s) # Doesn t work. lower() must be called on an object Q. How can we get a list of all string methods with their descriptions? For the methods listed below, assume that the string we are using is: s = "This is CSC120/A20". Method Output Explanation s.lower() s.replace("20", "48") s.count("i") "yabababa daba do!".count("abab") s.find("is") s.rfind("is") s.find("is", 3) s.find("is", 2) s.find("the") Define a new s = " A quite open line ". Method Output Explanation s.lstrip() s.rstrip() s.strip() 3
5 Loops Sometimes we need to do something with each character in a string. This is called looping or iterating over the string. The form for looping through a string: for char in s: <do something involving char> for, in are Python keywords char is a variable that holds the value of the current character s is the string (for now) For example, def print_vertical(s): for char in s: print(char) Q. Trace the call print vertical("word") Q. How many times does the loop execute? Sometimes we want to iterate over the indices of a string. Here is how we can reimplement print vertical (much more on this later). Recall that len(s) returns the length of the string s. def print_vertical(s): for index in range(len(s)): print(s[index]) 6 Let s Practice Let us design and implement some functions that involve strings. Count the number of occurrences of a character in a string. Count the number of vowels in a string. Reverse a string. Remove spaces from a string. Count the number of characters two strings have in common. 4
7 Escape Sequences Q. How can we put a new line or quote inside a string? We use special escape sequences. For each of the following escape sequences, guess what they do: \n \ \" \t \\ How about this example: >>> print("\\t stands for \ tab\ and is printed \ \t\ (without quotes)") \t stands for tab and is printed (without quotes) 8 Printing strings: conversion specifiers Consider this code snippet: a1 = 89 a2 = 76 a3 = 91 print("the maximum of the three grades (", a1, ",", a2, "and", a3, ") is", max(a1, a2, a3)) The result is: The maximum of the three grades ( 89, 76 and 91 ) is 91 It is a bit of a pain to intersperse variable values with non-variable text in a print statement. We can try to use +: print("the maximum of the three grades (" + a1 + "," + a2 + " and " + a3 + ") is " + max(a1, a2, a3)) Q. Why does this cause an error? Q. How can we fix it? A MUCH better fix: print("the maximum of the three grades (%d, %d and %d) is %d" % (a1, a2, a3, max(a1, a2, a3))) There is a % operator with two operands: a string on the left and a bunch of things on the right. 5
Each %d is a placeholder for a decimal (that is, base 10) integer. After the string (with its placeholders in it), you type % followed by parenthesis containing the list of values to be substituted wherever there is a %d. We call the placeholders conversion specifiers. They indicate the format (e.g., integer, float, string) we want the values printed in. Q. The % is overloaded. What else is % used for? Q. What are the other two overloaded operators we have seen? Common conversion specifiers are: %d display the object as a decimal integer %f floating point; display it with 6 decimal places by default %.2f floating point; display it with 2 decimal places (you can specify a number other than 2) %s display the object as a string Example import math print("pi is %.15f" % (math.pi)) r = 9.625 Write a print statement that calculates the area of a circle of radius r and outputs: A circle of radius 9.625000 has area 291.0391. Example Write a snippet of code to ask for the user s name and age and then print: NAME is YYY years old. 8.1 The format built-in function You may find formatting with the % operator awkward, though it does improve with familiarity. An alternative is the built-in function format, which we will not cover here. 6