contacts= { bill : , rich : , jane : } print contacts { jane : , bill : , rich : }

Similar documents
More Data Structures. What is a Dictionary? Dictionaries. Python Dictionary. Key Value Pairs 10/21/2010

Modules and scoping rules

CSCE 110 Programming I

Beyond Blocks: Python Session #1

Python Essential Reference, Second Edition - Chapter 5: Control Flow Page 1 of 8

Lecture 8: A cure for what ails you

Exception Handling and Debugging

Python Reference (The Right Way) Documentation

Python File Modes. Mode Description. Open a file for reading. (default)

Python. Executive Summary

Overview. Recap of FP Classes Instances Inheritance Exceptions

Python Intro GIS Week 1. Jake K. Carr

SBT 645 Introduction to Scientific Computing in Sports Science #6

CS Lecture 26: Grab Bag. Announcements

Advanced Python. Executive Summary, Session 1

bioengineering-book Documentation

Python 3000 and You. Guido van Rossum PyCon March 14, 2008

Python. Karin Lagesen.

Chapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines.

There are four numeric types: 1. Integers, represented as a 32 bit (or longer) quantity. Digits sequences (possibly) signed are integer literals:

Data Structures. Dictionaries - stores a series of unsorted key/value pairs that are indexed using the keys and return the value.

\n is used in a string to indicate the newline character. An expression produces data. The simplest expression

CIS192 Python Programming

CIS192 Python Programming

CIS192 Python Programming

Data Structures. Lists, Tuples, Sets, Dictionaries

CS2304: Python for Java Programmers. CS2304: Sequences and Collections

CIS192 Python Programming

Python Evaluation Rules

LECTURE 3 Python Basics Part 2

Lecture #12: Quick: Exceptions and SQL

Exceptions & a Taste of Declarative Programming in SQL

CSE : Python Programming. Homework 5 and Projects. Announcements. Course project: Overview. Course Project: Grading criteria

CNRS ANF PYTHON Objects everywhere

Iterators & Generators

Python Tutorial. Day 1

COMP 204: Sets, Commenting & Exceptions

Python 3000 and You. Guido van Rossum EuroPython July 7, 2008

Python. How You Can Do More Interesting Things With Python (Part II)

MEIN 50010: Python Flow Control

The Big Python Guide

COMP519 Web Programming Lecture 20: Python (Part 4) Handouts

06/11/2014. Subjects. CS Applied Robotics Lab Gerardo Carmona :: makeroboticsprojects.com June / ) Beginning with Python

Today: Revisit some objects. Programming Languages. Key data structure: Dictionaries. Using Dictionaries. CSE 130 : Winter 2009

Collections. Lists, Tuples, Sets, Dictionaries

CIS192: Python Programming Data Types & Comprehensions Harry Smith University of Pennsylvania September 6, 2017 Harry Smith (University of Pennsylvani

Exceptions CS GMU

Data type built into Python. Dictionaries are sometimes found in other languages as associative memories or associative arrays.

Interactive use. $ python. >>> print 'Hello, world!' Hello, world! >>> 3 $ Ctrl-D

Part IV. More on Python. Tobias Neckel: Scripting with Bash and Python Compact Max-Planck, February 16-26,

Interactive use. $ python. >>> print 'Hello, world!' Hello, world! >>> 3 $ Ctrl-D

Python Programming: Lecture 2 Data Types

CIS192 Python Programming

Index. Cache, 20 Callables, 185 call () method, 290 chain() function, 41 Classes attributes descriptors, 145 properties, 143

Absent: Lecture 3 Page 1. def foo(a, b): a = 5 b[0] = 99

THE CONTENT OF THIS BOOK IS PROVIDED UNDER THE TERMS OF THE CREATIVE COMMONS PUBLIC LICENSE BY-NC-ND 3.0.

Chapter 8 Dictionaries. Dictionaries are mutable, unordered collections with elements in the form of a key:value pairs that associate keys to values.

(IUCAA, Pune) kaustubh[at]iucaa[dot]ernet[dot]in.

Lecture 21. Programming with Subclasses

Sharing, mutability, and immutability. Ruth Anderson UW CSE 160 Spring 2018

COMP 204: Sets, Commenting & Exceptions

Lecture 21. Programming with Subclasses

Collections. Michael Ernst CSE 190p University of Washington

COMP1730/COMP6730 Programming for Scientists. Exceptions and exception handling

Lecture 18. Classes and Types

Python Tutorial. Day 2

(CC)A-NC 2.5 by Randall Munroe Python

The Practice of Computing Using PYTHON

Introduction to Python programming, II

ENGR 102 Engineering Lab I - Computation

Senthil Kumaran S

COMP 204: Dictionaries Recap & Sets

Python for Non-programmers

Exceptions & error handling in Python 2 and Python 3

CS Summer 2013

Exam #3, Form 3 A CSE 231 Fall 2015 (1) DO NOT OPEN YOUR EXAM BOOKLET UNTIL YOU HAVE BEEN TOLD TO BEGIN.

Outline. the try-except statement the try-finally statement. exceptions are classes raising exceptions defining exceptions

1 Strings (Review) CS151: Problem Solving and Programming

Python and Bioinformatics. Pierre Parutto

Functions, Scope & Arguments. HORT Lecture 12 Instructor: Kranthi Varala

PYTHON FOR KIDS A Pl ayfu l I ntrodu ctio n to Prog r am m i ng J a s o n R. B r i g g s

Lecture 21. Programming with Subclasses

Introduction to Python programming, II

Single Source Python 2 3

University of Texas at Arlington, TX, USA

Software Development Python (Part B)

CIS192: Python Programming

Part I. Wei Tianwen. A Brief Introduction to Python. Part I. Wei Tianwen. Basics. Object Oriented Programming

Python Review IPRE

Intro. Scheme Basics. scm> 5 5. scm>

CS1 Lecture 3 Jan. 22, 2018

Exceptions and File I/O

John Perry. Fall 2009

Sequence types. str and bytes are sequence types Sequence types have several operations defined for them. Sequence Types. Python

Notebook. March 30, 2019

Chapter 9: Dealing with Errors

An Introduction to Python

DEBUGGING TIPS. 1 Introduction COMPUTER SCIENCE 61A

Announcements. Lecture Agenda. Class Exercise. Hashable. Mutability. COMP10001 Foundations of Computing Iteration

Basic Python Revision Notes With help from Nitish Mittal

Transcription:

Chapter 8 More Data Structures We have seen the list data structure and its uses. We will now examine two, more advanced data structures: the set and the dictionary. In particular, the dictionary is an important, very useful part of Python as well as generally useful to solve many problems. What is a Dictionary? In data structure terms, a dictionary is better termed an associative array or associative list or a map. You can think if it as a list of pairs, where the first element of the pair, the key, is used to retrieve the second element, the value. Thus we map a key to a value. Key Value Pairs The key acts as a lookup to find the associated value. Just like a dictionary, you look up a word by its spelling to find the associated definition. A dictionary can be searched to locate the value associated with a key. Python Dictionary Use the { } marker to create a dictionary Use the : marker to indicate key:value pairs: contacts= { bill : 353-1234, rich : 269-1234, jane : 352-1234 } print contacts { jane : 352-1234, bill : 353-1234, rich : 269-1234 } Keys and Values Key must be immutable: strings, integers, tuples are fine lists are NOT Value can be anything. 1

Collection but not a Sequence Dictionaries are collections, but they are not sequences like lists, strings or tuples: there is no order to the elements of a dictionary in fact, the order (for example, when printed) might change as elements are added or deleted. So how to access dictionary elements? Access Dictionary Elements Access requires [ ], but the key is the index! mydict={} an empty dictionary mydict[ bill ]=25 added the pair bill :25 print mydict[ bill ] prints 25 Dictionaries are Mutable Like lists, dictionaries are a mutable data structure: you can change the object via various operations, such as index assignment mydict = { bill :3, rich :10} print mydict[ bill ] # prints 2 mydict[ bill ] = 100 print mydict[ bill ] # prints 100 Again, Common Operators Like others, dictionaries respond to these: len(mydict) number of key:value pairs in the dictionary element in mydict boolean, is element a key in the dictionary for key in mydict: iterates through the keys of a dictionary Lots of Methods mydict.items() all the key/value pairs mydict.keys() all the keys mydict.values() all the values key in mydict does the key exist in the dictionary mydict.clear() empty the dictionary mydict.update(yourdict) for each key in yourdict, updates mydict with that key/value pair 2

Dictionaries are Iterable for key in mydict: print key prints all the keys for key,value in mydict.items(): print key,value prints all the key/value pairs for value in mydict.values(): print value prints all the values Building Dictionaries Faster zip creates pairs from two parallel lists: zip( abc,[1,2,3]) yields [( a,1),( b,2),( c,3)] That s good for building dictionaries. We call the dict function which takes a list of pairs to make a dictionary: dict(zip( abc,[1,2,3])) yields {'a': 1, 'c': 3, 'b': 2} Word Frequency Analysis of Gettysburg Address def addword(w, wcdict): '''Update the word frequency: word is the key, frequency is the value.''' if w in wcdict: wcdict[w] += 1 else: wcdict[w] = 1 import string def processline(line, wcdict): '''Process the line to get lowercase words to add to the dictionary.''' line = line.strip() wordlist = line.split() for word in wordlist: # ignore the '--' that is in the file if word!= '--': word = word.lower() word = word.strip() # get commas, periods and other punctuation out as well word = word.strip(string.punctuation) addword(word, wcdict) def prettyprint(wcdict): '''Print nicely from highest to lowest frequency.''' # create a list of tuples, (value, key) # valkeylist = [(val,key) for key,val in d.items()] valkeylist=[] for key,val in wcdict.items(): valkeylist.append((val,key)) # sort method sorts on list's first element, here the frequency. # Reverse to get biggest first valkeylist.sort(reverse=true) print '%-10s%10s'%('Word', 'Count') print '_'*21 for val,key in valkeylist: print '%-12s %3d'%(key,val) 3

sort(...) L.sort(cmp=None, key=none, reverse=false) -- stable sort *IN PLACE*; cmp(x, y) -> -1, 0, 1 def main (): wcdict={} fobj = open('gettysburg.txt','r') for line in fobj: processline(line, wcdict) print 'Length of the dictionary:',len(wcdict) prettyprint(wcdict) Passing Mutables Because we are passing a mutable data structure, such as a dictionary, we do not have to return the dictionary when the function ends. If all we do is update the dictionary (change the object) then the argument will be associated with the changed object. The only built-in sort method works on lists, so if we sort we must sort a list. For complex elements (like a tuple), the sort compares the first element of each complex element: (1, 3) < (2, 1) # True (3,0) < (1,2,3) # False Sets, as in Mathematical Sets In mathematics, a set is a collection of objects, potentially of many different types. In a set, no two elements are identical. That is, a set consists of elements each of which is unique compared to the other elements. There is no order to the elements of a set A set with no elements is the empty set Creating a Set myset = set( abcd ) The set keyword creates a set. The single argument that follows must be iterable, that is, something that can be walked through one item at a time with a for. The result is a set data structure: print myset set(['a', 'c', 'b', 'd']) Diverse Elements A set can consist of a mixture of different types of elements: myset = set([ a,1,3.14159,true]) As long as the single argument can be iterated through, you can make a set of it. Duplicates are automatically removed. myset = set( aabbccdd ) print myset set(['a', 'c', 'b', 'd']) 4

Common Operators Most data structures respond to these: len(myset) the number of elements in a set element in myset boolean indicating whether element is in the set for element in myset: iterate through the elements in myset Set Operators The set data structure provides some special operators that correspond to the operators you learned in middle school. These are various combinations of set contents. Unlike most Python names, these are full names (this changes in Python 3.x). Set Ops, Intersection myset=set( abcd ); newset=set( cdef ) myset.intersection(newset) # returns set([ c, d ]) Set Ops, Difference myset=set( abcd ); newset=set( cdef ) myset.difference(newset) # returns set([ a, b ]) Set Ops, Union myset=set( abcd ); newset=set( cdef ) myset.union(newset) # returns set([ a, b, c, d, e, f ]) Set Ops, symmetricdifference myset=set( abcd ); newset=set( cdef ) myset.symmetric_difference(newset) # returns set([ a, b, e, f ]) Set Ops, super and sub set myset=set( abc ); newset=set( abcdef ) myset.issubset(newset) # returns True newset.issuperset(myset) # returns True Other Set Ops myset.add( g ) Adds to the set, no effect if item is in set already. myset.clear() Empties the set. myset.remove( g ) versus myset.discard( g ) remove throws an error if g isn t there. discard doesn t care. Both remove g from the set. myset.copy() - Returns a shallow copy of myset. 5

Common/Unique Words def addword(w, theset): '''A the word to the set. if len(w) > 3: theset.add(w) No word smaller than length 3.''' def processline(line, theset): '''Process the line to get lowercase words to be added to the set.''' line = line.strip() wordlist = line.split() for word in wordlist: # ignore the '--' thatt is in the file if word!= '--': word = word.strip( () # get commas, periods and other punctuation out as well word = word.strip( (string.punctuation) word = word.lower( () addword(word, theset) def main (): '''Compare the Gettysburg Address and the Declaration of Independence.''' GASet = set() DoISet = set() GAFileObj = open('gettysburg.txt') DoIFileObj = open('declofind.txt') for line in GAFileObj: processline(line, GASet) for line in DoIFileObj: processline(line,doiset) prettyprint(gaset, DoISet) 6

def prettyprint(gaset, doiset): # print some stats about the two sets print 'Count of unique words of length 4 or greater' print 'Gettysburg Addr: %d, Decl of Ind: %d\n'%(len(gaset),len(doiset)) print '%15s %15s'%('Operation', 'Count') print '-'*35 print '%15s %15d'%('Union', len(gaset.union(doiset))) print '%15s %15d'%('Intersection', len(gaset.intersection(doiset))) print '%15s %15d'%('Sym Diff', len(gaset.symmetric_difference(doiset))) print '%15s %15d'%('GA-DoI', len(gaset.difference(doiset))) print '%15s %15d'%('DoI-GA', len(doiset.difference(gaset))) # list the intersection words, 5 to a line, alphabetical order intersectionset = gaset.intersection(doiset) wordlist = list(intersectionset) wordlist.sort() print '\n Common words to both' print '-'*20 cnt = 0 for w in wordlist: if cnt % 5 == 0: print print '%13s'% (w), cnt += 1 Common words in Gettysburg Address and Declaration of Independence Can reuse or only slightly modify much of the code for document frequency. The overall outline remains much the same. For clarity, we will ignore any word that has three characters or less (typically stop words). More on Scope What is a Namespace? We ve had this discussion, but let s review: A namespace is an association of a name and a value. It looks like a dictionary, and for the most part it is (at least for modules and classes). Scope What namespace you might be using is part of identifying the scope of the variables and function you are using. By scope, we mean the context, the part of the code, where we can make a reference to a variable or function. Multiple Scopes Often, there can be multiple scopes that are candidates for determining a reference. Knowing which one is the right one (or more importantly, knowing the order of scope) is important. Two Kinds of Namespaces Unqualified namespaces: this is what we have pretty much seen so far - functions, assignments, etc. Qualified namespaces: these are modules and classes (we ll talk more about this one later in the Classes section). Unqualified This is the standard assignment and def we have seen so far. 7

Determining the scope of a reference identifies what its true value is. Unqualified Follow the LEGB Rule Local - inside the function in which it was defined. If not there, enclosing/encompassing. Is it defined in an enclosing function? If not there, is it defined in the global namespace? Finally, check the built-in, defined as part of the special builtin scope. Else, ERROR. Local def myfn (myparam = 25): print myparam myfn(100) # prints 100 myparam # ERROR Function Local Values If a reference is assigned in a function, then that reference is only available within that function. If a reference with the same name is provided outside the function, the reference is reassigned. Local and Global myvar = 123 # global def myfn(myvar=456): # parameter is local yourvar = -10 # assignment local print myvar, yourvar myfn() # prints 456-10 myfn(500) # prints 500-10 myvar # prints 123 yourvar # ERROR Builtin This is just the standard library of Python. To see what is there, look at import builtin dir( builtin ) ['ArithmeticError', 'AssertionError', 'AttributeError', 'BaseException', 'BufferError', 'BytesWarning', 'DeprecationWarning', 'EOFError', 'Ellipsis', 'EnvironmentError', 'Exception', 'False', 'FloatingPointError', 'FutureWarning', 'GeneratorExit', 'IOError', 'ImportError', 'ImportWarning', 'IndentationError', 'IndexError', 'KeyError', 'KeyboardInterrupt', 'LookupError', 'MemoryError', 'NameError', 'None', 'NotImplemented', 'NotImplementedError', 'OSError', 'OverflowError', 'PendingDeprecationWarning', 'ReferenceError', 'RuntimeError', 'RuntimeWarning', 'StandardError', 'StopIteration', 'SyntaxError', 'SyntaxWarning', 'SystemError', 'SystemExit', 'TabError', 'True', 'TypeError', 'UnboundLocalError', 'UnicodeDecodeError', 'UnicodeEncodeError', 'UnicodeError', 'UnicodeTranslateError', 'UnicodeWarning', 'UserWarning', 'ValueError', 'Warning', 'ZeroDivisionError', '_', ' debug ', ' doc ', ' import ', ' name ', ' package ', 'abs', 'all', 'any', 'apply', 'basestring', 'bin', 'bool', 'buffer', 'bytearray', 'bytes', 'callable', 'chr', 'classmethod', 'cmp', 'coerce', 'compile', 'complex', 'copyright', 'credits', 'delattr', 'dict', 'dir', 'divmod', 'enumerate', 'eval', 'execfile', 'exit', 'file', 'filter', 'float', 'format', 'frozenset', 'getattr', 'globals', 'hasattr', 'hash', 'help', 'hex', 'id', 'input', 'int', 'intern', 'isinstance', 'issubclass', 'iter', 'len', 'license', 'list', 'locals', 'long', 'map', 'max', 'min', 'next', 'object', 'oct', 'open', 'ord', 'pow', 'print', 'property', 'quit', 'range', 'raw_input', 'reduce', 'reload', 'repr', 'reversed', 'round', 'set', 'setattr', 'slice', 'sorted', 'staticmethod', 'str', 'sum', 'super', 'tuple', 'type', 'unichr', 'unicode', 'vars', 'xrange', 'zip'] 8

The Global Statement You can cheat by using the global keyword in a function: myvar = 100 def myfn(): global myvar myvar = -1 myfn() print myvar # prints -1 Nested Functions Look up through the nested functions to find a binding: def afunc(): x = 5 def bfunc(): # x = 10 print x bfunc() afunc() # prints 5 bfunc() ERROR, bfunc not defined 9