Dictionaries, Text, Tuples

Similar documents
SI Networked Computing: Storage, Communication, and Processing, Winter 2009

Python Lists. What is not a Collection. A List is a kind of Collection. friends = [ 'Joseph', 'Glenn', 'Sally' ]

Constants. Variables, Expressions, and Statements. Variables. x = 12.2 y = 14 x = 100. Chapter

It is time to go find some Data to mess with!

Python for Everybody. Exploring Data Using Python 3. Charles R. Severance

Variables, Expressions, and Statements

Why Program? Computers want to be helpful... Programmers Anticipate Needs. Chapter 1. Computers are built for one purpose - to do things for us

Chapter 8 SETS AND DICTIONARIES

Computer Sciences 368 Scripting for CHTC Day 3: Collections Suggested reading: Learning Python

Python for Informatics

Collections. Lists, Tuples, Sets, Dictionaries

Spring INF Principles of Programming for Informatics. Manipulating Lists

MUTABLE LISTS AND DICTIONARIES 4

CIS192: Python Programming Data Types & Comprehensions Harry Smith University of Pennsylvania September 6, 2017 Harry Smith (University of Pennsylvani

CMSC 201 Fall 2015 Lab 12 Tuples and Dictionaries

CIS192 Python Programming

Overview.

List of squares. Program to generate a list containing squares of n integers starting from 0. list. Example. n = 12

Reading Files. Chapter 7. Python for Informatics: Exploring Information

Chapter 8 Dictionaries. Dictionaries are mutable, unordered collections with elements in the form of a key:value pairs that associate keys to values.

Dictionaries. Looking up English words in the dictionary. Python sequences and collections. Properties of sequences and collections

Data Structures. Dictionaries - stores a series of unsorted key/value pairs that are indexed using the keys and return the value.

Python Lists: Example 1: >>> items=["apple", "orange",100,25.5] >>> items[0] 'apple' >>> 3*items[:2]

Hacettepe University Computer Engineering Department. Programming in. BBM103 Introduction to Programming Lab 1 Week 7. Fall 2018

LECTURE 3 Python Basics Part 2

Data Structures. Lists, Tuples, Sets, Dictionaries

Exercise: The basics - variables and types

DM502 Programming A. Peter Schneider-Kamp.

ProductCatalogue ph: (02) e:

Outline: Data collections (Ch11)

Test 2: Compsci 101. Owen Astrachan and Kristin Stephens-Martinez. April 10, 2018

Variable and Data Type 2

Part I. Wei Tianwen. A Brief Introduction to Python. Part I. Wei Tianwen. Basics. Object Oriented Programming

There are four numeric types: 1. Integers, represented as a 32 bit (or longer) quantity. Digits sequences (possibly) signed are integer literals:

a b c d a b c d e 5 e 7

TUPLES AND RECURSIVE LISTS 5

Python Tutorial. Day 1

CIS192 Python Programming

CIS192 Python Programming

Programming in Python

DSC 201: Data Analysis & Visualization

6. Data Types and Dynamic Typing (Cont.)

Sharing, mutability, and immutability. Ruth Anderson UW CSE 160 Spring 2018

(CC)A-NC 2.5 by Randall Munroe Python

Python Tutorial. CS/CME/BioE/Biophys/BMI 279 Oct. 17, 2017 Rishi Bedi

CIS192 Python Programming

Abstract Data Types Chapter 1

At full speed with Python

Script language: Python Data structures

Real Python: Python 3 Cheat Sheet

COMP519 Web Programming Lecture 20: Python (Part 4) Handouts

CS2304: Python for Java Programmers. CS2304: Sequences and Collections

All programs can be represented in terms of sequence, selection and iteration.

A tuple can be created as a comma-separated list of values:

Python. Olmo S. Zavala R. Python Data Structures. Lists Tuples Dictionaries. Center of Atmospheric Sciences, UNAM. August 24, 2016

CS Advanced Unix Tools & Scripting

More Examples Using Lists Tuples and Dictionaries in Python CS 8: Introduction to Computer Science, Winter 2018 Lecture #11

Data type built into Python. Dictionaries are sometimes found in other languages as associative memories or associative arrays.

Data Science Python. Anaconda. Python 3.x. Includes ALL major Python data science packages. Sci-kit learn. Pandas.

GIS 4653/5653: Spatial Programming and GIS. More Python: Statements, Types, Functions, Modules, Classes

(Refer Slide Time: 01:12)

PYTHON FOR MEDICAL PHYSICISTS. Radiation Oncology Medical Physics Cancer Care Services, Royal Brisbane & Women s Hospital

Visualize ComplexCities

Delayed Expressions Fall 2017 Discussion 9: November 8, Iterables and Iterators. For Loops. Other Iterable Uses

Lecture 7: Python s Built-in. in Types and Basic Statements

python 01 September 16, 2016

Worksheet 6: Basic Methods Methods The Format Method Formatting Floats Formatting Different Types Formatting Keywords

They grow as needed, and may be made to shrink. Officially, a Perl array is a variable whose value is a list.

Python Lists 2 CS 8: Introduction to Computer Science Lecture #9

Decisions, Decisions. Testing, testing C H A P T E R 7

Interactive use. $ python. >>> print 'Hello, world!' Hello, world! >>> 3 $ Ctrl-D

LISTS WITH PYTHON. José M. Garrido Department of Computer Science. May College of Computing and Software Engineering Kennesaw State University

The Practice of Computing Using PYTHON

APT Session 2: Python

Python review. 1 Python basics. References. CS 234 Naomi Nishimura

Introduction to Python

Introduction to Python

Haske k ll An introduction to Functional functional programming using Haskell Purely Lazy Example: QuickSort in Java Example: QuickSort in Haskell

Announcements. Lecture Agenda. Class Exercise. Hashable. Mutability. COMP10001 Foundations of Computing Iteration

ENGR 102 Engineering Lab I - Computation

Introduction to Python

More Data Structures. What is a Dictionary? Dictionaries. Python Dictionary. Key Value Pairs 10/21/2010

Sequence types. str and bytes are sequence types Sequence types have several operations defined for them. Sequence Types. Python

Lesson 4: Type Conversion, Mutability, Sequence Indexing. Fundamentals of Text Processing for Linguists Na-Rae Han

An introduction introduction to functional functional programming programming using usin Haskell

Notebook. March 30, 2019

def order(food): food = food.upper() print( Could I have a big + food + please? ) return fresh + food

CSc 120 Introduction to Computer Programing II Adapted from slides by Dr. Saumya Debray

CSC148H Week 3. Sadia Sharmin. May 24, /20

Interactive use. $ python. >>> print 'Hello, world!' Hello, world! >>> 3 $ Ctrl-D

Chapter 8 SETS AND DICTIONARIES

Reading Files. Chapter 7. Python for Everybody

CSC148 Fall 2017 Ramp Up Session Reference

A Brief Introduction to Python for those who know Java. (Last extensive revision: Daniel Moroz, fall 2015)

Announcements COMP 141. Lab from Last Time. Dictionaries 11/29/2017. CS1: Programming Fundamentals

Python Programming: Lecture 2 Data Types

Physics 514 Basic Python Intro

University of Washington CSE 140 Introduction to Data Programming Winter Midterm exam. February 6, 2013

MHealthy Approved Menu Items

CSC326 Python Sequences i. CSC326 Python Sequences

Transcription:

Dictionaries, Text, Tuples 4

We re Familiar with Lists [2, 3, 51, 0] [[ a, 151], [ z, 23], [ i, -42.3]] [2, BBC, KCRW, [ z, 23]] Lists enable us to store multiple values in one variable Lists are ordered, and we can iterate through them We can easily find and/or change a particular element x = [2, 3, 51, 0] x[2] = -23.0 print x 5

A Story of Two Collections.. List > A linear collection of values that stay in order Dictionary > A bag of values, each with its own label 6

It Would Be Hard to Keep Track of My Jelly Bean Collection with a List This is just a small part of it! 7 Buttered Popcorn Chocolate Pudding Mixed Berry Smoothie Orange Sherbet Peach Raspberry Red Apple Sparkling Berry Blue Sparkling Blueberry Sparkling Cream Soda Sparkling Grape Soda Sparkling Green Apple Sparkling Island Punch Sparkling Orange Sparkling Sour Apple Sparkling Sour Lemon Sparkling Very Cherry Sparkling Wild Blackberry Strawberry Jam Watermelon

A Dictionary Is an Associative Array Dictionaries Dictionaries are Python s most powerful data collection Dictionaries allow us to do fast database-like operations in Python Dictionaries have different names in different languages > Associative Arrays - Perl / PHP > Properties or Map or HashMap - Java > Property Bag - C# /.Net http://en.wikipedia.org/wiki/associative_array 8

Dictionaries Lists index their entries based on the position in the list Dictionaries are like bags - no order So we index the things we put in the dictionary with a lookup tag >>> purse = dict() >>> purse['money'] = 12 >>> purse['candy'] = 3 >>> purse['tissues'] = 75 >>> print purse {'money': 12, 'tissues': 75, 'candy': 3} >>> print purse['candy'] 3 >>> purse['candy'] = purse['candy'] + 2 >>> print purse {'money': 12, 'tissues': 75, 'candy': 5} 9

Comparing Lists and Dictionaries Dictionaries are like lists except that they use keys instead of numbers to look up values >>> lst = list() >>> lst.append(21) >>> lst.append(183) >>> print lst [21, 183] >>> lst[0] = 23 >>> print lst [23, 183] >>> ddd = dict() >>> ddd['age'] = 21 >>> ddd['course'] = 182 >>> print ddd {'course': 182, 'age': 21} >>> ddd['age'] = 23 >>> print ddd {'course': 182, 'age': 23} 10

If the value in each list position stood for something specific, it would be hard to keep track of it. Not so in a dictionary! >>> lst = list() >>> lst.append(21) >>> lst.append(183) >>> print lst [21, 183] >>> lst[0] = 23 >>> print lst [23, 183] Key List Value [0] 21 [1] 183 lst >>> ddd = dict() >>> ddd['age'] = 21 >>> ddd['course'] = 182 >>> print ddd {'course': 182, 'age': 21} >>> ddd['age'] = 23 >>> print ddd {'course': 182, 'age': 23} Dictionary Key Value ['course'] 182 ['age'] 21 ddd 11

Dictionary Literals (Constants) Dictionary literals use curly braces and have a list of key : value pairs You can make an empty dictionary using empty curly braces >>> jjj = { 'chuck' : 1, 'fred' : 42, 'jan': 100} >>> print jjj {'jan': 100, 'chuck': 1, 'fred': 42} >>> ooo = { } >>> print ooo {} >>> And See Examples 12

Counting: A Common Use of Dictionaries 13

Most Common Name? marquard zhen csev zhen cwen marquard zhen csev cwen zhen csev marquard zhen 14

Most Common Name? 15

Most Common Name? marquard zhen csev zhen cwen marquard zhen csev cwen zhen csev marquard zhen 16

Most Common Name? marquard zhen csev zhen cwen marquard zhen csev cwen zhen csev marquard zhen 17

Many Counters with a Dictionary One common use of dictionary is counting how often we see something >>> ccc = dict() >>> ccc['csev'] = 1 >>> ccc['cwen'] = 1 >>> print ccc {'csev': 1, 'cwen': 1} >>> ccc['cwen'] = ccc['cwen'] + 1 >>> print ccc {'csev': 1, 'cwen': 2} Key Value What s the problem with using the approach on the left? 18

Dictionary Tracebacks It is an error to reference a key which is not in the dictionary We can use the in operator to see if a key is in the dictionary >>> ccc = dict() >>> print ccc['csev'] Traceback (most recent call last): File "<stdin>", line 1, in <module> KeyError: 'csev' >>> print 'csev' in ccc False 19

One Possible Approach When we see a new name When we encounter a new name, we need to add a new entry in the dictionary and if this the second or later time we have seen the name, we simply add one to the count in the dictionary under that name counts = dict() names = ['csev', 'cwen', 'csev', 'zqian', 'cwen'] for name in names : if name not in counts: counts[name] = 1 else : counts[name] = counts[name] + 1 print counts {'csev': 2, 'zqian': 1, 'cwen': 2} 20

Dictionaries Make Our Life Easier with the get Method The get method for dictionaries This pattern of checking to see if a key is already in a dictionary and assuming a default value if the key is not there is so common, that there is a method called get() that does this for us if name in counts: x = counts[name] else : x = 0 x = counts.get(name, 0) Default value if key does not exist (and no Traceback). {'csev': 2, 'zqian': 1, 'cwen': 2} 21

Simplified counting with get() We can use get() and provide a default value of zero when the key is not yet in the dictionary - and then just add one counts = dict() names = ['csev', 'cwen', 'csev', 'zqian', 'cwen'] for name in names : counts[name] = counts.get(name, 0) + 1 print counts Default {'csev': 2, 'zqian': 1, 'cwen': 2} 22

So Let s Count https://www.youtube.com/watch?v=ehj9uyx5l58 23

the clown ran after the car and the car ran into the tent and the tent fell down on the clown and the car 24

Definite Loops and Dictionaries Even though dictionaries are not stored in order, we can write a for loop that goes through all the entries in a dictionary - actually it goes through all of the keys in the dictionary and looks up the values >>> counts = { 'chuck' : 1, 'fred' : 42, 'jan': 100} >>> for key in counts:... print key, counts[key]... jan 100 chuck 1 fred 42 >>> See Example 25

Helpful Dictionary Methods Exist! You easily can get a list of keys, values, or both (items) from a dictionary dict.keys() dict.values() dict.items() Check it out! 26

Bonus: Two Iteration Variables! We loop through the key-value pairs in a dictionary using *two* iteration variables Each iteration, the first variable is the key and the second variable is the corresponding value for the key >>> jjj = { 'chuck' : 1, 'fred' : 42, 'jan': 100} >>> for aaa,bbb in jjj.items() :... print aaa, bbb... jan 100 chuck 1 aaa bbb fred 42 >>> [jan] 100 [chuck] 1 [fred] 42 27

name = raw_input('enter file:') handle = open(name) text = handle.read() words = text.split() counts = dict() for word in words: counts[word] = counts.get(word,0) + 1 bigcount = None bigword = None for word,count in counts.items(): if bigcount is None or count > bigcount: bigword = word bigcount = count python words.py Enter file: words.txt to 16 python words.py Enter file: clown.txt the 7 print bigword, bigcount 28

In Class Exercise Write a program to read through the mail box data in mbox-small.txt and when it finds line that starts with From, splits the line into words using the split function. We are interested in who sent the message, which is the second word on the From line, e. g., From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008 Collect all the addresses that sent emails, and then print out the count of each. It should look something like this: % python fromcount.py gopal.ramasammycook@gmail.com appears 1 times louis@media.berkeley.edu appears 3 times cwen@iupui.edu appears 5 times [ more output ] 29

handle = open('mbox-short.txt') authors = dict() for line in handle: if line.startswith('from '): name = line.split()[1] authors[name] = 1 + authors.get(name, 0) for (name, count) in authors.items(): print name, 'appears', count, 'times' 30

Tuples 31

Tuples are like lists Tuples are another kind of sequence that functions much like a list - they have elements which are indexed starting at 0 >>> x = ('Glenn', 'Sally', 'Joseph') >>> print x[2] Joseph >>> y = ( 1, 9, 2 ) >>> print y (1, 9, 2) >>> print max(y) 9 >>> for iter in y:... print iter... 1 9 2 >>> 32

but... Tuples are immutable Unlike a list, once you create a tuple, you cannot alter its contents - similar to a string >>> x = [9, 8, 7] >>> x[2] = 6 >>> print x >>>[9, 8, 6] >>> >>> y = 'ABC' >>> y[2] = 'D' Traceback:'str' object does not support item Assignment >>> >>> z = (5, 4, 3) >>> z[2] = 0 Traceback:'tuple' object does not support item Assignment >>> 33

Things not to do with tuples >>> x = (3, 2, 1) >>> x.sort() Traceback: AttributeError: 'tuple' object has no attribute 'sort' >>> x.append(5) Traceback: AttributeError: 'tuple' object has no attribute 'append' >>> x.reverse() Traceback: AttributeError: 'tuple' object has no attribute 'reverse' >>> 34

tuple.count( x ) is how many times x appears tuple.index( x ) is where x first appears A Tale of Two Sequences >>> l = list() >>> dir(l) ['append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort'] >>> t = tuple() >>> dir(t) ['count', 'index'] 35

Why Have Two Similar Data Structures? Tuples are more efficient Since Python does not have to build tuple structures to be modifiable, they are simpler and more efficient in terms of memory use and performance than lists So in our program when we are making temporary variables we prefer tuples over lists 36

Tuples and Assignment We can also put a tuple on the left-hand side of an assignment statement We can even omit the parentheses >>> (x, y) = (4, 'fred') >>> print y fred >>> (a, b) = (99, 98) >>> print a 99 x1, y1 = 0, 180 # is perfectly legal 37

Now, What We Saw Earlier with Dictionaries Makes More Sense Tuples and Dictionaries The items() method in dictionaries returns a list of (key, value) tuples >>> d = dict() >>> d['csev'] = 2 >>> d['cwen'] = 4 >>> for (k,v) in d.items():... print k, v... csev 2 cwen 4 >>> tups = d.items() >>> print tups [('csev', 2), ('cwen', 4)] 38

Tuples are Comparable The comparison operators work with tuples and other sequences. If the first item is equal, Python goes on to the next element, and so on, until it finds elements that differ. >>> (0, 1, 2) < (5, 1, 2) True >>> (0, 1, 2000000) < (0, 3, 4) True >>> ( 'Jones', 'Sally' ) < ('Jones', 'Sam') True >>> ( 'Jones', 'Sally') > ('Adams', 'Sam') True Tuples can therefore be sorted! 39

Once a dictionary is turned into a list of tuples, it can be sorted Sorting Lists of Tuples We can take advantage of the ability to sort a list of tuples to get a sorted version of a dictionary First we sort the dictionary by the key using the items() method >>> d = {'a':10, 'b':1, 'c':22} >>> t = d.items() >>> t [('a', 10), ('c', 22), ('b', 1)] >>> t.sort() >>> t [('a', 10), ('b', 1), ('c', 22)] 40

Using sorted() We can do this even more directly using the built-in function sorted that takes a sequence as a parameter and returns a sorted sequence >>> d = {'a':10, 'b':1, 'c':22} >>> d.items() [('a', 10), ('c', 22), ('b', 1)] >>> t = sorted(d.items()) >>> t [('a', 10), ('b', 1), ('c', 22)] >>> for k, v in sorted(d.items()):... print k, v... a 10 b 1 c 22 41

sort vs. sorted sort is a method that is defined only for lists sorted is a built-in function that can be applied to any iterable structure, and will turn it into a list. E. g., >>> sorted('abdce') ['a', 'b', 'c', 'd', 'e'] sorted can take an argument that defines the value to compare for each element make sense when each element is a more complex structure Read more about sorting here: https://wiki.python.org/moin/howto/sorting 42

Sort by values instead of key If we could construct a list of tuples of the form (value, key) we could sort by value We do this with a for loop that creates a list of tuples >>> c = {'a':10, 'b':1, 'c':22} >>> tmp = list() >>> for k, v in c.items() :... tmp.append( (v, k) )... >>> print tmp [(10, 'a'), (22, 'c'), (1, 'b')] >>> tmp.sort(reverse=true) >>> print tmp [(22, 'c'), (10, 'a'), (1, 'b')] 43

Even Shorter Version >>> c = {'a':10, 'b':1, 'c':22} >>> print sorted( [ (v,k) for k,v in c.items() ] ) [(1, 'b'), (10, 'a'), (22, 'c')] List comprehension creates a dynamic list. In this case, we make a list of reversed tuples and then sort it. Read more about sorting here: https://wiki.python.org/moin/howto/sorting 44