Python Programming: Lecture 2 Data Types Lili Dworkin University of Pennsylvania
Last Week's Quiz
1. .pyc files contain byte code
2. The type of math.sqrt(9)/3 is float
3. The type of isinstance(5.5, float) is bool
4. The type of print hello() is None
5. "hi" and 0 evaluates to 0
6. Even integers from 0 up to and including 10: range(0, 11, 2)

Ternary Expressions
How do we replicate a bool ? a : b expression? That is, if bool is true, return a; otherwise return b.
>>> b = True
>>> x = 5 if b else 10
>>> x
5
>>> x = 5 if not b else 10
>>> x
10

Lists
Mutable / resizable arrays
Can contain multiple datatypes
Written using square brackets
[] # empty list
[5, "Hi!", [1, 2, 3], False]

Lists: Indexing
>>> a = ['a', 'b', 'c', 'd']
>>> a[0]
'a'
>>> a[-1]
'd'

>>> a = ['a', 'b', 'c', 'd']
>>> range(1,3) # recall...
[1, 2]
>>> a[1:3]
['b', 'c']
>>> a[1:]
['b', 'c', 'd']
>>> a[:-2]
['a', 'b']

>>> a = ['a', 'b', 'c', 'd']
>>> range(1,10,2) # recall...
[1, 3, 5, 7, 9]
>>> a[0:3:2]
['a', 'c']
>>> a[::2]
['a', 'c']

>>> a = [5, "Hi!", [1, 2, 3], False]
>>> a[2][1]
2

Lists: Builtins
>>> b = [5, 10, 15, 20]
>>> b.index(10)
1
>>> len(b)
4
>>> sum(b)
50
>>> 5 in b
True
>>> 6 not in b
True

Lists: Iteration
>>> a = ["apple", "orange", "banana"]
>>> range(len(a))
[0, 1, 2]
>>> for i in range(len(a)): # Bad!
...     print a[i]
...
apple
orange
banana
>>> for i in a: # Good!
...     print i
...
apple
orange
banana

Lists: Insertion
>>> a = ["apple", "orange", "banana"]
>>> a[1] = "pear"
>>> a
['apple', 'pear', 'banana']
>>> a.insert(2, "kiwi")
>>> a
['apple', 'pear', 'kiwi', 'banana']

Lists: Concatenation
>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> a + b
[1, 2, 3, 4, 5, 6]
>>> a.extend(b)
>>> a
[1, 2, 3, 4, 5, 6]
>>> a.append(7)
>>> a
[1, 2, 3, 4, 5, 6, 7]

Lists: Removal
>>> a = ['a', 'b', 'c']
>>> del a[0]
>>> a
['b', 'c']
>>> a.remove('c')
>>> a
['b']
del removes a specific index, whereas remove removes the first matching value.

>>> b = [1, 2, 3, 4]
>>> b[1:3] = []
>>> b
[1, 4]

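A minimal sketch contrasting the removal styles above side by side. The list contents and variable names are made up for illustration, and print is called with parentheses and a single argument so the snippet behaves the same under Python 2 and Python 3.

```python
letters = ['a', 'b', 'c', 'b']

del letters[0]                # by position: drops whatever sits at index 0
after_del = list(letters)     # snapshot: ['b', 'c', 'b']

letters.remove('b')           # by value: drops only the first 'b'
after_remove = list(letters)  # snapshot: ['c', 'b']

# remove raises ValueError when the value is not in the list
try:
    letters.remove('z')
    missing_raised = False
except ValueError:
    missing_raised = True

numbers = [1, 2, 3, 4]
numbers[1:3] = []             # slice assignment; del numbers[1:3] has the same effect here

print(after_del)
print(after_remove)
print(numbers)
```

Checking membership first (if 'z' in letters) is one way to avoid the ValueError when the value might be absent.
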
References
All data in Python is represented by objects.
So variables must all be references to objects.
The memory a variable takes up is merely a pointer to some other part of memory that actually holds the object.
We usually think of the thing being pointed to as the value of the variable, not the pointer itself.

References
== checks whether variables point to objects of the same value.
is checks whether variables point to the same object.
>>> a = [1,2,3]
>>> b = a
>>> a == b
True
>>> a is b
True
>>> a.append(4)
>>> a
[1, 2, 3, 4]
>>> b
[1, 2, 3, 4]

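References
b was just another name for the same list (an alias), not a copy.
Use list or [:] to get a separate copy. This is a shallow copy: the new list is a distinct object, but its elements are shared with the original.
>>> a = [1,2,3]
>>> b = list(a) # or b = a[:]
>>> a == b
True
>>> a is b
False
>>> a.append(5)
>>> a
[1, 2, 3, 5]
>>> b
[1, 2, 3]
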
References
What happens here?
>>> a = [1,2,3]
>>> b = [1,2,3]
>>> a == b
True
>>> a is b
False

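A short sketch of why the distinction between a second name and a copy matters once lists are nested. It assumes nothing beyond the builtins; the names outer, alias and copied are illustrative. list(outer) builds a new outer list, but the inner lists are still shared.

```python
outer = [[1, 2], [3, 4]]
alias = outer          # no copy: a second name for the same list
copied = list(outer)   # new outer list (same result as outer[:])

outer.append([5, 6])   # changes the outer list itself
outer[0].append(99)    # changes an inner list that both outer lists hold

print(alias is outer)          # True
print(copied is outer)         # False
print(copied[0] is outer[0])   # True
print(copied)                  # [[1, 2, 99], [3, 4]]
```

The append to outer is invisible through copied, while the change to the shared inner list is visible through every name.
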
Lists: Multiplication
>>> [0] * 5
[0, 0, 0, 0, 0]
>>> [1, 2] * 3
[1, 2, 1, 2, 1, 2]
>>> [[1, 2]] * 3
[[1, 2], [1, 2], [1, 2]]

But be careful!
>>> l = [[]] * 3
>>> l
[[], [], []]
>>> l[1].append(0)
>>> l
[[0], [0], [0]] # expected [[], [0], []]
When you use the [x]*n syntax, what you get is a list of n many x objects, but they're all references to the same object.

[[]]*n is the same as:
l = []
x = []
for i in range(n):
    l.append(x)
Instead, do:
l = []
for i in range(n):
    l.append([])
We'll see a one-liner that accomplishes this shortly...

Lists: Comprehensions
Produce new lists from existing lists
Basic form: [expr(x) for x in xs if cond(x)]

>>> a = [1, 2, 3, 4, 5]
>>> [x * 2 for x in a] # map
[2, 4, 6, 8, 10]
>>> [x for x in a if x > 3] # filter
[4, 5]
>>> [x * 3 for x in a if x % 2 == 0] # both
[6, 12]

To achieve our desired list-of-lists initialization:
>>> x = [[] for _ in range(3)]

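A small check, using is and id, that the comprehension really creates separate inner lists while [[]] * n repeats a single one. The variable names are illustrative only.

```python
n = 3
shared = [[]] * n                     # n references to one inner list
independent = [[] for _ in range(n)]  # the expression [] runs n times

shared[1].append(0)
independent[1].append(0)

# Every slot of shared is the very same object
all_same = all(item is shared[0] for item in shared)
# Every slot of independent is a different object
all_distinct = len(set(id(item) for item in independent)) == n

print(shared)        # [[0], [0], [0]]
print(independent)   # [[], [0], []]
```
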
Lists: Comprehensions
Nested for loop:
>>> matrix = [[1, 2], [3, 4]]
>>> [element for row in matrix for element in row]
[1, 2, 3, 4]
Read this left-to-right, as:
for row in matrix:
    for element in row:
        element

Nested list comprehension:
>>> matrix = [[1, 2], [3, 4]]
>>> [[row[i] for row in matrix] for i in range(len(matrix))]
[[1, 3], [2, 4]]
Read this right-to-left, as:
for i in range(len(matrix)):
    for row in matrix:
        row[i]

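A sketch of the same two patterns on a non-square matrix. Note that range(len(matrix)) counts rows, which equals the number of columns only when the matrix is square; this version assumes all rows have equal length and uses len(matrix[0]) instead. The zip(*matrix) variant is an alternative, not something the slides use here, and it is wrapped in a list comprehension so it also works under Python 3.

```python
matrix = [[1, 2, 3],
          [4, 5, 6]]

# Flatten: outer loop first, inner loop second (left to right)
flat = [element for row in matrix for element in row]

# Transpose: one new row per column index i
transposed = [[row[i] for row in matrix] for i in range(len(matrix[0]))]

# Alternative: zip(*matrix) pairs up the i-th elements of every row
via_zip = [list(column) for column in zip(*matrix)]

print(flat)         # [1, 2, 3, 4, 5, 6]
print(transposed)   # [[1, 4], [2, 5], [3, 6]]
```
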
Lists: Comprehensions
Find the sum of the integers from 1 to 100 that are even, and not perfect squares.

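One possible answer to the exercise, as a sketch rather than the intended solution. The helper is_perfect_square is a hypothetical name introduced here; it rounds the floating-point square root and checks the result by squaring it again.

```python
def is_perfect_square(n):
    """Return True if n is the square of an integer (n >= 0)."""
    root = int(round(n ** 0.5))
    return root * root == n

# Filter with a comprehension, then sum the survivors
total = sum([x for x in range(1, 101)
             if x % 2 == 0 and not is_perfect_square(x)])

print(total)   # 2330
```

As a sanity check: the even integers from 2 to 100 sum to 2550, and the even perfect squares in that range (4, 16, 36, 64, 100) sum to 220.
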
Strings
Immutable sequences of characters
Use single or double quotes for one-line strings
Use triple quotes for multi-line strings
We can use many of the same operations as with lists, as long as they don't mutate the string

Strings
>>> s = "hello"
>>> s[0]
'h'
>>> for i in s:
...     print i
...
h
e
l
l
o
>>> s[0] = 'a'
TypeError

Strings: Split
>>> s = "Pretend this sentence makes sense."
>>> s.split(" this ")
['Pretend', 'sentence makes sense.']
>>> words = s.split(" ")
>>> words
['Pretend', 'this', 'sentence', 'makes', 'sense.']

Strings: Join
>>> " ".join(words)
'Pretend this sentence makes sense.'
>>> "_".join(words)
'Pretend_this_sentence_makes_sense.'
Note that we are calling the join method of the separator string, not of the list.

join only works on lists of strings, but...
>>> a = [1, 2, 3]
>>> " ".join([str(x) for x in a])
'1 2 3'

Strings: Find and Replace
>>> s = "I wish I were home right now."
>>> s.find('home')
14
>>> s.find('dinosaur')
-1
>>> s = "I'm taking CIS 192."
>>> s.replace('CIS', 'Computer Science')
"I'm taking Computer Science 192."
Note that replace is case-sensitive: s.replace('cis', ...) would leave this string unchanged.

Strings: Case
>>> s = 'dog'
>>> s.capitalize()
'Dog'
>>> s.upper()
'DOG'
>>> s.lower()
'dog'

Strings: Stripping
>>> s = ' hi --'
>>> s.strip(' -')
'hi'
>>> s.lstrip(' ')
'hi --'

Strings: Alphas and Digits
>>> s = 'abc'
>>> s.isalpha()
True
>>> s = '123'
>>> s.isdigit()
True
>>> int(s)
123

Strings: Escape Characters
>>> s = "a\nb"
>>> s
'a\nb'
>>> print s
a
b

Tuples
Immutable lists
Written using parentheses, not square brackets
A way of packaging multiple values together
Functions can return tuples, take tuples as arguments
Immutability makes them more efficient

Tuples
>>> a = [1,2]
>>> b = (a, 3)
>>> b
([1, 2], 3)
>>> b[0] = 0
TypeError
>>> a[0] = 0
>>> b
([0, 2], 3)

Tuples: Enumerate
>>> a = ["apple", "orange", "banana"]
>>> for (index, fruit) in enumerate(a):
...     print str(index) + ": " + fruit
...
0: apple
1: orange
2: banana

Tuples: Enumerate
Write a function tuple_indices(l, position, value) that has the following behavior:
>>> l = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 7)]
>>> tuple_indices(l, 1, 7)
[1, 3]
>>> tuple_indices(l, 0, "kumquat")
[2]

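One way the exercise could be approached, offered as a sketch rather than the official answer: enumerate supplies each tuple together with its index, and a comprehension keeps the indices whose tuple has the wanted value at the given position.

```python
def tuple_indices(l, position, value):
    """Indices of the tuples in l whose element at `position` equals `value`."""
    return [index for (index, item) in enumerate(l)
            if item[position] == value]

fruits = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 7)]
print(tuple_indices(fruits, 1, 7))           # [1, 3]
print(tuple_indices(fruits, 0, "kumquat"))   # [2]
```
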
Tuples: Zip
>>> a = [1, 2, 3]
>>> b = ['a', 'b', 'c']
>>> c = [4, 5, 6]
>>> zip(a, b, c)
[(1, 'a', 4), (2, 'b', 5), (3, 'c', 6)]
Here zip has three arguments, but it can actually take a variable number.

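A brief sketch of two zip behaviors worth knowing: it stops at the shortest input, and zip(*pairs) undoes a zip. Results are wrapped in list() because under Python 3 zip returns an iterator rather than a list; the variable names are illustrative.

```python
a = [1, 2, 3]
b = ['a', 'b', 'c']

pairs = list(zip(a, b))               # [(1, 'a'), (2, 'b'), (3, 'c')]
truncated = list(zip(a, ['x', 'y']))  # stops after the shorter input

# "Unzip": pass the pairs back as separate arguments
numbers, letters = zip(*pairs)

print(truncated)   # [(1, 'x'), (2, 'y')]
print(numbers)     # (1, 2, 3)
print(letters)     # ('a', 'b', 'c')
```
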
Tuples: Singletons
>>> a = ("Hi")
>>> type(a)
<type 'str'>
>>> a = ("Hi",)
>>> type(a)
<type 'tuple'>

Tuples: String Formatting
>>> "Hello, %s!" % "Annie"
'Hello, Annie!'
>>> "A number: %s" % 5
'A number: 5'
>>> "A list: %s" % [1, 2, 3]
'A list: [1, 2, 3]'
Non-string objects are converted using str.

Use a tuple for multiple arguments:
>>> "a = %s and b = %s" % ("apple", "banana")
'a = apple and b = banana'
Can also perform type conversions explicitly:
>>> "a = %d and b = %d" % (0, 1)
'a = 0 and b = 1'
>>> "a = %d and b = %f" % (0, 1.5)
'a = 0 and b = 1.500000'
>>> "a = %d and b = %.1f" % (0, 1.5)
'a = 0 and b = 1.5'

What if we wanted to print a tuple?
>>> x = ("a", "b", "c")
>>> "A tuple: %s" % x
TypeError
Instead:
>>> "A tuple: %s" % (x,)
"A tuple: ('a', 'b', 'c')"

Sidenote on String Formatting
Another option, introduced in Python 2.6:
>>> template = "{0} is {1} years old. {0} is a girl."
>>> template.format("Annie", 20)
'Annie is 20 years old. Annie is a girl.'

Tuples: Assigning Multiple Values at Once
>>> v = ('a', 'b', 'c')
>>> (x, y, z) = v
>>> x
'a'
>>> y
'b'
>>> z
'c'

>>> def foo(x, y):
...     return (x,y)
...
>>> (a, b) = foo('a', 'b')

Dictionaries
Key : Value pairs
Implemented using hash tables
So keys can be any hashable type:
  strings
  numbers
  tuples (if they contain hashable types only!)
  ** not lists **

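A quick way to test the hashability rule, sketched with a hypothetical helper is_hashable that simply tries the builtin hash and reports whether it raised TypeError. The grid dictionary is a made-up example of tuples used as keys.

```python
def is_hashable(obj):
    """Return True if obj could be used as a dictionary key."""
    try:
        hash(obj)
        return True
    except TypeError:
        return False

# Tuples of hashable values make convenient compound keys
grid = {(0, 0): 'start', (2, 3): 'goal'}

print(is_hashable("abc"))      # True
print(is_hashable((1, 'a')))   # True
print(is_hashable([1, 2]))     # False
print(is_hashable((1, [2])))   # False: the tuple contains a list
```
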
Dictionaries
>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> d['b']
2
>>> d['b'] = 3
>>> d['b']
3
>>> 'a' in d
True
>>> d.keys()
['a', 'c', 'b']
>>> d.values()
[1, 3, 3]

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> d['e']
KeyError: 'e'
>>> d.get('e', 0)
0

Dictionaries: Iteration
>>> d.items()
[('a', 1), ('c', 3), ('b', 3)]
>>> for (key, value) in d.items():
...     print "%s: %s" % (key, value)
...
a: 1
c: 3
b: 3

Dictionaries: Construction
Dictionaries can also be constructed from lists of tuples.
>>> e = dict([('first', 'a'), ('second', 'b')])
>>> e
{'second': 'b', 'first': 'a'}
>>> keys = ['a', 'b', 'c']
>>> values = [0, 0, 0]
>>> e = dict(zip(keys, values))
>>> e
{'a': 0, 'c': 0, 'b': 0}

Dictionaries: Construction
>>> import string
>>> alphabet = string.lowercase
>>> alphabet
'abcdefghijklmnopqrstuvwxyz'
>>> d = dict(zip(list(alphabet), [0]*len(alphabet)))
>>> d
{'a': 0, 'c': 0, 'b': 0, 'e': 0,...

Dictionaries: DefaultDict
Counting character occurrences:
d = {}
# we see an 'a'!
# don't know if it's the first one...
if 'a' in d:
    d['a'] += 1
else:
    d['a'] = 1

>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> d['a']
0
>>> d['a'] += 1
>>> d = defaultdict(list)
>>> d['a']
[]
>>> d['a'].append(0)

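Putting the two slides together, a minimal sketch of the character-counting task with defaultdict(int). The function name count_characters and the sample word are illustrative. One thing to keep in mind: merely looking up a missing key in a defaultdict inserts it with the default value.

```python
from collections import defaultdict

def count_characters(text):
    """Map each character in text to the number of times it occurs."""
    counts = defaultdict(int)
    for char in text:
        counts[char] += 1   # missing keys start at int() == 0
    return counts

counts = count_characters("banana")
print(counts['a'])   # 3
print(counts['n'])   # 2

had_z_before = 'z' in counts   # False: membership tests do not insert
counts['z']                    # a plain lookup inserts 'z' with value 0
has_z_after = 'z' in counts    # True
```

Use counts.get('z', 0) or a membership test when a lookup should not add the key.
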
Dictionaries vs. If/Elif
Consider writing a function position(char) that takes a character as input and returns its position in the alphabet.
Don't do:
if char == 'a':
    return 1
elif char == 'b'...
Instead:
alphabet = string.lowercase
d = dict(zip(list(alphabet), range(1,27)))
return d[char]

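A complete, runnable version of the lookup-table idea, as a sketch. It uses string.ascii_lowercase, which exists in both Python 2 and Python 3, in place of string.lowercase, which is Python 2 only. Building the table once at module level (the name POSITIONS is an assumption of this sketch) avoids rebuilding it on every call.

```python
import string

# Built once: {'a': 1, 'b': 2, ..., 'z': 26}
POSITIONS = dict(zip(string.ascii_lowercase, range(1, 27)))

def position(char):
    """Return the 1-based position of a lowercase letter in the alphabet."""
    return POSITIONS[char]   # raises KeyError for anything else

print(position('a'))   # 1
print(position('c'))   # 3
print(position('z'))   # 26
```
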
Shallow/Deep Copies
>>> d1 = {'a': 1, 'b': 2}
>>> d2 = d1
>>> d1['a'] = 5
>>> d2['a']
5

>>> d1 = {'a': 1, 'b': 2}
>>> d2 = dict(d1)
>>> d1['a'] = 5
>>> d2['a']
1

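The two sessions above use only integer values, so they do not show where a shallow copy and a deep copy differ. A sketch with a mutable value makes the difference visible: dict(d1) creates a new dictionary that still shares its value objects, while copy.deepcopy from the standard library copies the nested objects as well.

```python
import copy

d1 = {'a': [1, 2], 'b': 2}
d2 = dict(d1)            # shallow: new dict, same value objects
d3 = copy.deepcopy(d1)   # deep: the nested list is copied too

d1['b'] = 5              # rebinding a key affects d1 only
d1['a'].append(3)        # mutating the shared list shows up in d2

print(d2['b'])   # 2
print(d2['a'])   # [1, 2, 3]
print(d3['a'])   # [1, 2]
```
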
Sets
Lists, but without order or duplicates.
Two common uses:
1. De-duplication of lists:
>>> l = [1, 1, 2, 2, 3]
>>> set(l)
set([1, 2, 3])

Sets
2. Do you just need to know whether or not you've already got a particular value? (And you don't need to know how many times?)
>>> seen = set()
>>> for i in 'abcabcdabcde':
...     seen.add(i)
...
>>> seen
set(['a', 'c', 'b', 'e', 'd'])

Sets
But be careful: sets can only hold hashable types!
>>> s = set()
>>> s.add([1])
TypeError: unhashable type: 'list'

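The two uses combine naturally: a "seen" set can de-duplicate a sequence while keeping the original order, which set(l) alone does not promise. This is a sketch with a hypothetical function name, unique_in_order. For unhashable items such as lists, converting each one to a tuple first is one workaround.

```python
def unique_in_order(items):
    """Return the distinct items of a sequence, in first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:   # fast membership test on the set
            seen.add(item)
            result.append(item)
    return result

print(unique_in_order('abcabcdabcde'))   # ['a', 'b', 'c', 'd', 'e']

# Lists are unhashable, so convert them to tuples before using the set
rows = [[1], [1], [2]]
print(unique_in_order([tuple(row) for row in rows]))   # [(1,), (2,)]
```
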