Chapter 8 SETS AND DICTIONARIES

Similar documents
Chapter 8 SETS AND DICTIONARIES

MEMOIZATION, RECURSIVE DATA, AND SETS

CS261 Data Structures. Ordered Bag Dynamic Array Implementa;on

COMP1730/COMP6730 Programming for Scientists. Dictionaries and sets

List of squares. Program to generate a list containing squares of n integers starting from 0. list. Example. n = 12

University of Texas at Arlington, TX, USA

Writing a Fraction Class

Document Databases: MongoDB

Converting File Input

Data Structures. Lists, Tuples, Sets, Dictionaries

Programming Environments

Collections. Lists, Tuples, Sets, Dictionaries

Introduction to Python. Fang (Cherry) Liu Ph.D. Scien5fic Compu5ng Consultant PACE GATECH

15110 Principles of Computing, Carnegie Mellon University - CORTINA. Binary Search. Required: List L of n unique elements.

Array Basics: Outline

Sequence Types FEB

Crash Dive into Python

61A Lecture 21. Friday, March 13

Arithmetic Expressions 9/7/16 44

Array Basics: Outline

Data Structures. Dictionaries - stores a series of unsorted key/value pairs that are indexed using the keys and return the value.

Introduction to Python! Lecture 2

Sets and Maps. Set Commands Set ADT Set ADT implementation Map ADT Map ADT implementation

Chapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines.

File I/O, Benford s Law, and sets

Array Basics: Outline

Agenda. Excep,ons Object oriented Python Library demo: xml rpc

61A LECTURE 15 MEMOIZATION, RECURSIVE DATA, SETS

Chapter 8 Dictionaries. Dictionaries are mutable, unordered collections with elements in the form of a key:value pairs that associate keys to values.

CSCE 110 Programming I

Worksheet 6: Basic Methods Methods The Format Method Formatting Floats Formatting Different Types Formatting Keywords

Strings, Arrays, A1. COMP 401, Spring 2015 Lecture 4 1/20/2015

Dictionaries. Looking up English words in the dictionary. Python sequences and collections. Properties of sequences and collections

CS261 Data Structures. Maps (or Dic4onaries)

Python Tutorial. Day 1

IntSet Documentation. Release David R. MacIver

Brush up Python. December 20, 2018

More Data Structures. What is a Dictionary? Dictionaries. Python Dictionary. Key Value Pairs 10/21/2010

Introduction to Bioinformatics

Strings. Upsorn Praphamontripong. Note: for reference when we practice loop. We ll discuss Strings in detail after Spring break

ECE 364 Software Engineering Tools Laboratory. Lecture 4 Python: Collections I

TUPLES AND RECURSIVE LISTS 5

Chapter 6. Files and Exceptions I

A Python programming primer for biochemists. BCH364C/394P Systems Biology/Bioinformatics Edward Marcotte, Univ of Texas at Austin

LECTURE 3 Python Basics Part 2

CS2304: Python for Java Programmers. CS2304: Sequences and Collections

Python. Chapter 4. Sets

Common Loop Algorithms 9/21/16 42

Object Oriented Programming. Feb 2015

Chapter 14 Tuples, Sets, and Dictionaries. Copyright 2012 by Pearson Education, Inc. All Rights Reserved.

CS251 Programming Languages Spring 2016, Lyn Turbak Department of Computer Science Wellesley College

CSC326 Python Sequences i. CSC326 Python Sequences

University of Texas at Arlington, TX, USA

CSc 120. Introduction to Computer Programming II. 14: Stacks and Queues. Adapted from slides by Dr. Saumya Debray

CSC Discrete Math I, Spring Sets

CS101: Fundamentals of Computer Programming. Dr. Tejada www-bcf.usc.edu/~stejada Week 1 Basic Elements of C++

Python a modern scripting PL. Python

6. Data Types and Dynamic Typing (Cont.)

Topic 7: Lists, Dictionaries and Strings

Review: Python Transi,on Warning

Python iterators and generators

Debugging. BBM Introduc/on to Programming I. Hace7epe University Fall Fuat Akal, Aykut Erdem, Erkut Erdem, Vahid Garousi

COMP 204: Sets, Commenting & Exceptions

COMP 204: Sets, Commenting & Exceptions

CS 10: Problem solving via Object Oriented Programming. Info Retrieval

Index construc-on. Friday, 8 April 16 1

Objec0ves. Gain understanding of what IDA Pro is and what it can do. Expose students to the tool GUI

DSC 201: Data Analysis & Visualization

CIS192 Python Programming

Real Python: Python 3 Cheat Sheet

More on collec)ons and sor)ng. CSCI 136: Fundamentals of Computer Science II Keith Vertanen

Lecture 21. Chapter 12 More Python Containers

Collections. Michael Ernst CSE 190p University of Washington

Python review. 1 Python basics. References. CS 234 Naomi Nishimura

CSc 120. Introduction to Computer Programming II. 07: Excep*ons. Adapted from slides by Dr. Saumya Debray

CS 1301 CS1 with Robots Summer 2007 Exam 1

Teach A level Compu/ng: Algorithms and Data Structures

Week - 03 Lecture - 18 Recursion. For the last lecture of this week, we will look at recursive functions. (Refer Slide Time: 00:05)

Technical Documentation Version 7.3 RPL Data Types and Palette

CIS192: Python Programming Data Types & Comprehensions Harry Smith University of Pennsylvania September 6, 2017 Harry Smith (University of Pennsylvani

Summer 2017 Discussion 10: July 25, Introduction. 2 Primitives and Define

There are four numeric types: 1. Integers, represented as a 32 bit (or longer) quantity. Digits sequences (possibly) signed are integer literals:

CS61A Notes Week 13: Interpreters

Lecture Notes on Tries

Python. Jae-Gil Lee Based on the slides by K. Naik, M. Raju, and S. Bhatkar. December 28, Outline

CSc 120 Introduction to Computer Programing II Adapted from slides by Dr. Saumya Debray

Structure and Interpretation of Computer Programs

A Func'onal Introduc'on. COS 326 David Walker Princeton University

PYTHON FOR MEDICAL PHYSICISTS. Radiation Oncology Medical Physics Cancer Care Services, Royal Brisbane & Women s Hospital

Overview.

University of Texas at Arlington, TX, USA

More on collec)ons and sor)ng

Babu Madhav Institute of Information Technology, UTU 2015

Announcements COMP 141. Lab from Last Time. Dictionaries 11/29/2017. CS1: Programming Fundamentals

RPL Data Types and Palette

Vocabulary Size of Moby Dick

DSC 201: Data Analysis & Visualization

Programming to Python

Lecture #12: Mutable Data. map rlist Illustrated (III) map rlist Illustrated. Using Mutability For Construction: map rlist Revisited

LISTS WITH PYTHON. José M. Garrido Department of Computer Science. May College of Computing and Software Engineering Kennesaw State University

Transcription:

Chapter 8 SETS AND DICTIONARIES

Chapter Contents Sets Dictionaries Complex Data Structures 10/19/16 Page 2

Chapter Goals To build and use a set container To learn common set opera2ons for processing data To build and use a dic2onary container To work with a dic2onary for table lookups To work with complex data structures In this chapter, we will learn how to work with two more types of containers (sets and dictionaries) as well as how to combine containers to model complex structures. 10/19/16 Page 3

Sets SECTION 8.1 10/19/16 4

Sets A set is a container that stores a collec2on of unique values Unlike a list, the elements or members of the set are not stored in any par2cular order and cannot be accessed by posi2on Opera2ons are the same as the opera2ons performed on sets in mathema2cs Because sets do not need to maintain a par2cular order, set opera2ons are much faster than the equivalent list opera2ons 10/19/16 Page 5

Example Set This set contains three sets of colors the colors of the Bri2sh, Canadian, and Italian flags In each set, the order does not maeer, and the colors are not duplicated in any one of the sets 10/19/16 Page 6

Creating and Using Sets To create a set with ini2al elements, you can specify the elements enclosed in braces, just like in mathema2cs: cast = { "Luigi", "Gumbys", "Spiny" } Alterna2vely, you can use the set() func2on to convert any sequence into a set: names = ["Luigi", "Gumbys", "Spiny"] cast = set(names) 10/19/16 Page 7

Creating an Empty Set For historical reasons, you cannot use {} to make an empty set in Python Instead, use the set() func2on with no arguments: cast = set() As with any container, you can use the len() func2on to obtain the number of elements in a set: numberofcharacters = len(cast) # In this case it s zero 10/19/16 Page 8

Set Membership: in To determine whether an element is contained in the set, use the in operator or its inverse, the not in operator: if "Luigi" in cast : print("luigi is a character in Monty Python s Flying Circus.") else : print("luigi is not a character in the show.") 10/19/16 Page 9

Accessing Set Elements Because sets are unordered, you cannot access the elements of a set by posi2on as you can with a list We use a for loop to iterate over the individual elements: print("the cast of characters includes:") for character in cast : print(character) Note that the order in which the elements of the set are visited depends on how they are stored internally 10/19/16 Page 10

Accessing Elements (2) For example, the previous loop above displays the following: The cast of characters includes: Gumbys Spiny Luigi Note that the order of the elements in the output is different from the order in which the set was created 10/19/16 Page 11

Displaying Sets In Sorted Order Use the sorted() func2on, which returns a list (not a set) of the elements in sorted order The following loop prints the cast in sorted order: for actor in sorted(cast) : print(actor) 10/19/16 Page 12

Adding Elements Sets are mutable collec2ons, so you can add elements by using the add() method: cast = set(["luigi", "Gumbys", "Spiny"]) #1 cast.add("arthur") #2 cast.add("spiny") #3 Arthur is not in the set, so it is added to the set and the size of the set is increased by one Spiny is already in the set, so there is no effect on the set 10/19/16 Page 13

Removing Elements: discard() The discard() method removes an element if the element exists: cast.discard("arthur") #4 It has no effect if the given element is not a member of the set: cast.discard("the Colonel") # Has no effect 10/19/16 Page 14

Removing Elements: remove() The remove() method, on the other hand, removes an element if it exists, but raises an excep2on if the given element is not a member of the set: cast.remove("the Colonel") # Raises an exception For this class we will use the discard() method 10/19/16 Page 15

Removing Elements: clear() Finally, the clear() method removes all elements of a set, leaving the empty set: cast.clear() # cast now has size 0 10/19/16 Page 16

Example 10/19/16 Page 17

Subsets A set is a subset of another set if and only if every element of the first set is also an element of the second set In the image below, the Canadian flag colors are a subset of the Bri2sh colors The Italian flag colors are not. 10/19/16 Page 18

The issubset() Method The issubset() method returns True or False to report whether one set is a subset of another: canadian = { "Red", "White" } british = { "Red", "Blue", "White" } italian = { "Red", "White", "Green" } # True if canadian.issubset(british) : print("all Canadian flag colors occur in the British flag.") # True if not italian.issubset(british) : print("at least one of the colors in the Italian flag does not.") 10/19/16 Page 19

Set Equality / Inequality We test set equality with the == and!= operators Two sets are equal if and only if they have exactly the same elements french = { "Red", "White", "Blue" } if british == french : print("the British and French flags use the same colors.") 10/19/16 Page 20

Set Union: union() The union of two sets contains all of the elements from both sets, with duplicates removed # ineither: The set {"Blue", "Green", "White", "Red"} ineither = british.union(italian) Both the Bri2sh and Italian sets contain the colors Red and White, but the union is a set and therefore contains only one instance of each color Note that the union() method returns a new set. It does not modify either of the sets in the call 10/19/16 Page 21

Set Intersection: intersection() The intersection of two sets contains all of the elements that are in both sets # inboth: The set {"White", "Red"} inboth = british.intersection(italian) 10/19/16 Page 22

Difference of Two Sets: difference() The difference of two sets results in a new set that contains those elements in the first set that are not in the second set print("colors that are in the Italian flag but not the British:") print(italian.difference(british)) # Prints {'Green'} 10/19/16 Page 23

Common Set Operations 10/19/16 Page 24

Common Set Operations (2) Remember: union, intersection and difference return new sets They do not modify the set they are applied to 10/19/16 Page 25

Simple Examples Open the file: set examples.py 10/19/16 Page 26

Set Example: Spell Checking The program spellcheck.py reads a file that contains correctly spelled words and places the words in a set It then reads all words from a document here, the book Alice in Wonderland into a second set Finally, it prints all words from the document that are not in the set of correctly spelled words Open the file spellcheck.py 10/19/16 Page 27

Example: Spellcheck.py 10/19/16 Page 28

Example: Spellcheck.py 10/19/16 Page 29

Execution: Spellcheck.py 10/19/16 Page 30

Programming Tip When you write a program that manages a collec2on of unique items, sets are far more efficient than lists Some programmers prefer to use the familiar lists, replacing with: itemset.add(item) if (item not in itemlist) itemlist.append(item) However, the resul2ng program is much slower. The speed factor difference is over 10 2mes 10/19/16 Page 31

Counting Unique Words 10/19/16 32

Problem Statement We want to be able to count the number of unique words in a text document Mary had a liele lamb has 57 unique words Our task is to write a program that reads in a text document and determines the number of unique words in the document 10/19/16 Page 33

Step One: Understand the Task To count the number of unique words in a text document we need to be able to determine if a word has been encountered earlier in the document Only the first occurrence of a word should be counted The easiest way to do this is to read each word from the file and add it to the set Because a set cannot contain duplicates we can use the add method The add method will prevent a word that was encountered earlier from being added to the set A_er we process every word in the document the size of the set will be the number of unique words contained in the document 10/19/16 Page 34

Step Two: Decompose the Problem The problem can be split into several simple steps: Create an empty set for each word in the text document Add the word to the set Number of unique words = the size of the set Crea2ng the empty set, adding an element to the set, and determining the size of the set are standard set opera2ons Reading the words in the file can be handled as a separate task 10/19/16 Page 35

Step Three: Build the Set We need to read individual words from the file. For simplicity in our example we will use a literal file name inputfile = open( nurseryrhyme.txt, r ) For line in inputfile : thewords = line.split() For words in thewords : Process word To count unique words we need to remove any nonleeers and remove capitaliza2on We will design a func2on to clean the words before we add them to the set 10/19/16 Page 36

Step Four: Clean the Words To strip out all the characters that are not leeers we will iterate through the string, one character at a 2me, and build a new clean word def clean(string) : result = for char in string : if char.isalpha() : result = result + char return result.lower() 10/19/16 Page 37

Step Five: Some Assembly Required Implement the main() func2on and combine it with the other func2ons Open the file: countwords.py 10/19/16 Page 38