Lesson 4: Type Conversion, Mutability, Sequence Indexing. Fundamentals of Text Processing for Linguists Na-Rae Han

Similar documents
Lab 6: Data Types, Mutability, Sorting. Ling 1330/2330: Computational Linguistics Na-Rae Han

Lab 3: for and while Loops, Indexing. Ling 1330/2330: Computational Linguistics Na-Rae Han

Python I. Some material adapted from Upenn cmpe391 slides and other sources

Lab 8: File I/O, Mutability vs. Assignment. Ling 1330/2330: Intro to Computational Linguistics Na-Rae Han

Some material adapted from Upenn cmpe391 slides and other sources

History Installing & Running Python Names & Assignment Sequences types: Lists, Tuples, and Strings Mutability

Python Intro GIS Week 1. Jake K. Carr

Types, lists & functions

COMP1730/COMP6730 Programming for Scientists. Strings

Lab 2: input(), if else, String Operations, Variable Assignment. Ling 1330/2330: Computational Linguistics Na-Rae Han

Sequence types. str and bytes are sequence types Sequence types have several operations defined for them. Sequence Types. Python

Lab 1: Course Intro, Getting Started with Python IDLE. Ling 1330/2330 Computational Linguistics Na-Rae Han

Data Structures. Lists, Tuples, Sets, Dictionaries

Lab 5: Function Types, Lists and Dictionaries, Sorting. Ling 1330/2330: Computational Linguistics Na-Rae Han

Python Class-Lesson1 Instructor: Yao

Part I. Wei Tianwen. A Brief Introduction to Python. Part I. Wei Tianwen. Basics. Object Oriented Programming

Programming in Python

Fundamentals of Python: First Programs. Chapter 4: Strings (Indexing, Slicing, and Methods)

Collections. Lists, Tuples, Sets, Dictionaries

Lists in Python CS 8: Introduction to Computer Science, Winter 2018 Lecture #10

Computer Sciences 368 Scripting for CHTC Day 3: Collections Suggested reading: Learning Python

MUTABLE LISTS AND DICTIONARIES 4

python 01 September 16, 2016

Compound Data Types 1

Topic 7: Lists, Dictionaries and Strings

Programming Fundamentals and Python

Python and Bioinformatics. Pierre Parutto

Level 3 Computing Year 2 Lecturer: Phil Smith

Introduction to Python

Python for Non-programmers

Sequences and iteration in Python

CMSC201 Computer Science I for Majors

Basic Scripting, Syntax, and Data Types in Python. Mteor 227 Fall 2017

Python. Karin Lagesen.

Script language: Python Data structures

Lists How lists are like strings

[CHAPTER] 1 INTRODUCTION 1

Overview of List Syntax

Part III Appendices 165

CS2304: Python for Java Programmers. CS2304: Sequences and Collections

Working with Strings. Husni. "The Practice of Computing Using Python", Punch & Enbody, Copyright 2013 Pearson Education, Inc.

DEBUGGING TIPS. 1 Introduction COMPUTER SCIENCE 61A

Sequence of Characters. Non-printing Characters. And Then There Is """ """ Subset of UTF-8. String Representation 6/5/2018.

A Brief Introduction to Python for those who know Java. (Last extensive revision: Daniel Moroz, fall 2015)

Worksheet 6: Basic Methods Methods The Format Method Formatting Floats Formatting Different Types Formatting Keywords

All programs can be represented in terms of sequence, selection and iteration.

Fundamentals of Python: First Programs. Chapter 4: Strings and Text Files

CMSC 201 Fall 2016 Lab 09 Advanced Debugging

Python - Variable Types. John R. Woodward

Chapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines.

Teaching London Computing

LISTS WITH PYTHON. José M. Garrido Department of Computer Science. May College of Computing and Software Engineering Kennesaw State University

Lists, loops and decisions

For strings (and tuples, when we get to them), its easiest to think of them like primitives directly stored in the variable table.

Outline. Simple types in Python Collections Processing collections Strings Tips. 1 On Python language. 2 How to use Python. 3 Syntax of Python

18.1. CS 102 Unit 18. Python. Mark Redekopp

Python for Analytics. Python Fundamentals RSI Chapters 1 and 2

Lecture 10: Lists and Sequences

Python: common syntax

The Practice of Computing Using PYTHON. Chapter 4. Working with Strings. Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

Introduction to Python

ECS15, Lecture 13. Midterm solutions posted. Topic 4: Programming in Python, cont. Staying up to speed with Python RESOURCES. More Python Resources

COMP 364: Functions II

CS61A Lecture 16. Amir Kamil UC Berkeley February 27, 2013

COMP1730/COMP6730 Programming for Scientists. Sequence types, part 2

The current topic: Python. Announcements. Python. Python

CS Advanced Unix Tools & Scripting

CS 1110: Introduction to Computing Using Python Lists and Sequences

UNIVERSITÀ DI PADOVA. < 2014 March >

\n is used in a string to indicate the newline character. An expression produces data. The simplest expression

CS61A Lecture 16. Amir Kamil UC Berkeley February 27, 2013

The Dynamic Typing Interlude

MEIN 50010: Python Introduction

Intro to Programming. Unit 7. What is Programming? What is Programming? Intro to Programming

Programming to Python

Overview of OOP. Dr. Zhang COSC 1436 Summer, /18/2017

At full speed with Python

Strings. Genome 373 Genomic Informatics Elhanan Borenstein

CSCE 110 Programming I

Lecture 7: Python s Built-in. in Types and Basic Statements

Expressions and Variables

Interactive use. $ python. >>> print 'Hello, world!' Hello, world! >>> 3 $ Ctrl-D

Data Handing in Python

CSE341: Programming Languages Lecture 9 Function-Closure Idioms. Dan Grossman Winter 2013

Java Fall 2018 Margaret Reid-Miller

Today: Revisit some objects. Programming Languages. Key data structure: Dictionaries. Using Dictionaries. CSE 130 : Winter 2009

Algorithms and Programming I. Lecture#12 Spring 2015

CSCE 110 Programming I Basics of Python: Lists, Tuples, and Functions

Comp 151. Control structures.

Introduction to Python

Index. object lifetimes, and ownership, use after change by an alias errors, use after drop errors, BTreeMap, 309

3 The Building Blocks: Data Types, Literals, and Variables

Python in 10 (50) minutes

CSCI 2041: Basic OCaml Syntax and Features

Here n is a variable name. The value of that variable is 176.

ECS15, Lecture 14. Reminder: how to learn Python. Today is the final day to request a regrade of the midterm. Topic 4: Programming in Python, cont.

CSc 120 Introduction to Computer Programing II Adapted from slides by Dr. Saumya Debray

Advanced Algorithms and Computational Models (module A)

There are four numeric types: 1. Integers, represented as a 32 bit (or longer) quantity. Digits sequences (possibly) signed are integer literals:

Interactive use. $ python. >>> print 'Hello, world!' Hello, world! >>> 3 $ Ctrl-D

Transcription:

Lesson 4: Type Conversion, Mutability, Sequence Indexing Fundamentals of Text Processing for Linguists Na-Rae Han

Objectives Python data types Mutable vs. immutable object types How variable assignment and reference works Type conversion Conversion between types Sequence types: string, list, tuple Indexing Slicing in, +, * operations * Today's slides borrowed heartily from http://tdcwww.harvard.edu/python.pdf 1/29/2014 2

All about palindromes Fun reads: "Doubling in the Middle", Sep 2011, The Believer http://www.believermag.com/issues//201109/?read=article_kornbluh "World's Longest Palindrome Sentence?" by Peter Norvig http://norvig.com/palindrome.html He uses Python scripts to generate palindromes using word lists his record is 17,826 words! 1/29/2014 3

List is MUTABLE, string is not.reverse() >>> mwords ['Mary', 'had', 'a', 'little', 'lamb'] >>> mwords.reverse() >>> mwords ['lamb', 'little', 'a', 'had', 'Mary'] >>> mwords.reverse() >>> mwords ['Mary', 'had', 'a', 'little', 'lamb'] >>> wd = 'school' >>> wd.upper() 'SCHOOL' >>> wd 'school' Reverses a list IN PLACE: original list is changed List is MUTABLE String is IMMUTABLE cf. String operations do not change the original string! 1/29/2014 4

Python data types 33 5.49 int: integer float: floating point number 'Bart' 'Hello, world!' str: string (a piece of text) ['cat', 'dog', 'fox', 'hippo'] list ('Spring', 'Summer', 'Winter', 'Fall') tuple {'Homer':36, 'Marge':36, 'Bart':10, 'Lisa':8, 'Maggie':1} dict: (dictionary) maps a value to an object 1/29/2014 5

Immutable data types int 33 float 33.5 str tuple 'Hello, world!' ('Spring', 'Summer', 'Winter', 'Fall') In Python, the data types integer, float, string, and tuple are immutable. Python functions do NOT directly change these data they are unchangeable! Instead, a new object is created in memory and returned. Value change only occurs through a new assignment statement. 1/29/2014 6

Mutable data types list dict ['cat', 'dog', 'fox', 'hippo'] {'Homer':36, 'Marge':36, 'Bart':10, 'Lisa':8} List and dictionary are mutable data types. When a Python method* is called on these data, the operation is done in place. The original data in memory are changed! They are not copied into a new memory address each time. * "Methods" are functions that are specific to a data type. They are "called on" a data object and have object.method() syntax. 1/29/2014 7

Assignment: under the hood Binding a variable in Python means setting a name to hold a reference to some object. Assignment creates references, not copies. Names in Python do not have an intrinsic type; objects do. Python determines the type of the reference automatically based on the data object assigned to it. You create a name the first time it appears on the left side of an assignment statement: x = 3 A reference is deleted via garbage collection after any names bound to it have passed out of scope. 1/29/2014 8

Understanding reference semantics Assignment manipulates references var1 = var2 does not make a copy of the object var2 references var1 = var2 makes var1 reference the object var2 references! So, for immutable data types (integers, floats, strings) assignment behaves as you would expect: >>> x = 3 # Creates 3, name x refers to 3 >>> y = x # Creates name y, refers to 3 >>> x = 4 # Creates ref for 4. Change x to point to it >>> print y # No effect on y, which still points to 3 3 1/29/2014 9

Reference vs. immutable types So, for immutable datatypes (integers, floats, strings) assignment behaves as you would expect: >>> x = 3 # Creates 3, name x refers to 3 >>> y = x # Creates name y, refers to 3 >>> x = 4 # Creates ref for 4. Change x to point to it >>> print y # No effect on y, which still points to 3 3 x int 3 1/29/2014 10

Reference vs. immutable types So, for immutable datatypes (integers, floats, strings) assignment behaves as you would expect: >>> x = 3 # Creates 3, name x refers to 3 >>> y = x # Creates name y, refers to 3 >>> x = 4 # Creates ref for 4. Change x to point to it >>> print y # No effect on y, which still points to 3 3 x y int 3 1/29/2014 11

Reference vs. immutable types So, for immutable datatypes (integers, floats, strings) assignment behaves as you would expect: >>> x = 3 # Creates 3, name x refers to 3 >>> y = x # Creates name y, refers to 3 >>> x = 4 # Creates ref for 4. Change x to point to it >>> print y # No effect on y, which still points to 3 3 x int 4 y int 3 1/29/2014 12

Reference vs. mutable types Mutable data types (list, dictionaries) behave differently! Method functions change these data in place, so >>> x = [1,2,3] # x references list [1,2,3] >>> y = x # y now references what x references >>> x.append(4) # this changes the original list in memory >>> print x [1, 2, 3, 4] >>> print y [1, 2, 3, 4] Value of y also changed! 1/29/2014 13

Reference vs. mutable types Mutable data types (list, dictionaries) behave differently! Method functions change these data in place, so >>> x = [1,2,3] # x references list [1,2,3] >>> y = x # y now references what x references >>> x.append(4) # this changes the original list in memory >>> print x [1, 2, 3, 4] >>> print y [1, 2, 3, 4] x 1 2 3 int 1/29/2014 14

Reference vs. mutable types Mutable data types (list, dictionaries) behave differently! Method functions change these data in place, so >>> x = [1,2,3] # x references list [1,2,3] >>> y = x # y now references what x references >>> x.append(4) # this changes the original list in memory >>> print x [1, 2, 3, 4] >>> print y [1, 2, 3, 4] x y 1 2 3 int 1/29/2014 15

Reference vs. mutable types Mutable data types (list, dictionaries) behave differently! Method functions change these data in place, so >>> x = [1,2,3] # x references list [1,2,3] >>> y = x # y now references what x references >>> x.append(4) # this changes the original list in memory >>> print x [1, 2, 3, 4] >>> print y [1, 2, 3, 4] x y 1 2 3 4 int 1/29/2014 16

Reference vs. mutable types Mutable data types (list, dictionaries) behave differently! Method functions change these data in place, so >>> x = [1,2,3] # x references list [1,2,3] >>> y = x # y now references what x references >>> x.append(4) # this changes the original list in memory >>> print x [1, 2, 3, 4] >>> print y [1, 2, 3, 4] x y 1 2 3 4 int x and y refer to the same object in memory! When x is modified, y also changes 1/29/2014 17

== vs. is >>> str1 = 'cat' >>> str2 = 'cat' >>> str1 == str2 True >>> str1 is str2 True >>> list1 = ['cat', 'dog'] >>> list2 = ['cat', 'dog'] >>> list1 == list2 True >>> list1 is list2 False >>> list1 = list2 >>> list1 is list2 True == tests for equivalence of value is tests for equivalence of the object in memory 1/29/2014 18

Data types in Python type() displays the data type of the object >>> h = 'hello' # h is str type >>> h = list(h) # h now refers to a list >>> h ['h', 'e', 'l', 'l', 'o'] >>> type(h) <type 'list'> Beware of type conflicts! >>> w = 'Mary' >>> w+3 Traceback (most recent call last): File "<pyshell#68>", line 1, in <module> w+3 TypeError: cannot concatenate 'str' and 'int' objects 1/29/2014 19

Converting between types int() float() str() string, floating point integer >>> int(3.141592) 3 string, integer floating point number >>> float('256') 256.0 integer, float, list, tuple, dictionary string >>> str(3.141592) '3.141592' >>> str([1,2,3,4]) '[1, 2, 3, 4]' list() string, tuple, dictionary list >>> list('mary') ['M', 'a', 'r', 'y'] 1/29/2014 20

Practice: Fix this script! 2 minutes "Tip calculator" script: Open your IDLE editor, and try it out. The script is broken fix it! 1/29/2014 21

Fixed: type conversion is key Don't forget Space ' ' at the end of the prompt string! 1/29/2014 22

Sequence types Three Python basic data types are based on sequence: Strings, lists, and tuples str list tuple 'Hello, world!' ['cat', 'dog', 'fox', 'hippo'] ('Spring', 'Summer', 'Winter', 'Fall') These sequence types share much of the same syntax and functionality. The operations shown in this section can be applied to all sequence types 1/29/2014 23

Sequence types 1. String A sequence of individual characters Immutable 2. List An ordered sequence of items Items can be mixed types. Mutable: items can be changed, added or removed 3. Tuple (1, 2,'ab', 3.14) A fixed sequence of mixed types of items Sort of like a list, but cannot be altered Immutable! 'hello' [1, 2, 'ab', 3.14] 1/29/2014 24

Indexing Individual items of a string, list, or tuple can be accessed using square bracket "array" notation: [index] Index starts from 0! >>> st = 'Hello, world!' >>> st[1] 'e' >>> li = [1, 2, 'ab', 3.14] >>> li[1] 2 >>> tu = ('Homer', 'Marge', 'Bart', 'Lisa') >>> tu[1] 'Marge' 1/29/2014 25

Positive vs. negative indices >>> st = 'Hello!' >>> li = ['Mary', 'had', 'a', 'little', 'lamb'] Positive index: count from the left, starting with 0 >>> st[1] 'e' >>> li[1] 'had' Negative index: count from right, starting with -1 >>> st[-1] '!' >>> li[-1] 'lamb' H e l l o! 0 1 2 3 4 5-6 -5-4 -3-2 -1 1/29/2014 26

Slicing returns a copy of a subset Slicing syntax: s[start:end] Returns the elements beginning at start and extending up to but not including end Copying starts at the first index, and stops copying before the second index! >>> st = 'Hello!' >>> li = [1, 2, 'ab', 3.14, 'c'] >>> st[1:4] 'ell' >>> li[2:4] ['ab', 3.14] You can also use negative indices when slicing: >>> st[1:-1] 'ello' >>> li[1:-2] [2, 'ab'] 1/29/2014 27

Slicing >>> st = 'Hello!' >>> li = [1, 2, 'ab', 3.14, 'c'] Omit the first index [:5] to make a copy starting from the beginning: >>> st[:4] 'Hell' >>> li[:3] [1, 2, 'ab'] Omit the second index [2:] to make a copy till the end: >>> st[1:] 'ello!' >>> li[-3:] ['ab', 3.14, 'c'] 1/29/2014 28

Copying the whole sequence Use [:] to make a copy of an entire list >>> li = [1, 2, 'ab', 3.14, 'c'] >>> li2 = li[:] A copy means changes to one does not affect the other! >>> li.reverse() >>> li ['c', 3.14, 'ab', 2, 1] >>> li2 [1, 2, 'ab', 3.14, 'c'] Remember how reference works with mutable objects: list2 = list1 # 2 names refer to 1 memory object # --> changing one affects both list2 = list1[:] # Two independent objects, two refs 1/29/2014 29

The versatile in operator For strings, in tests for substrings >>> 'bat' in 'combative' True >>> 'cat' in 'combative' False For lists and tuples, in tests for membership >>> medals = ('gold', 'silver', 'bronze') >>> 'zinc' in medals False >>> 'zinc' not in medals True >>> li ['c', 3.14, 'ab', 2, 1] >>> 3.141592 in li False in is also used as a keyword in the syntax of for loops and others 1/29/2014 30

The + operator + produces a new tuple, list, or string whose value is the concatenation of its arguments. >>> ('Homer', 'Marge') + ('Bart', 'Lisa', 'Maggie') ('Homer', 'Marge', 'Bart', 'Lisa', 'Maggie') >>> [1, 2, 3] + [4, 5] [1, 2, 3, 4, 5] >>> 'cat' + 'dog' + 'fox' 'catdogfox' 1/29/2014 31

The * operator * produces a new tuple, list, or string that "repeats" the original content. >>> (1, 2, 3) * 3 (1, 2, 3, 1, 2, 3, 1, 2, 3) >>> ['a', 'b', 'c'] *3 ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c'] >>> 'Hello' * 3 'HelloHelloHello' 1/29/2014 32

String indexing examples >>> w = 'python' >>> w[2] 't' >>> w[5] 'n' >>> w[9] Traceback (most recent call last): File "<pyshell#8>", line 1, in <module> w[9] IndexError: string index out of range >>> w[-3] 'h' >>> w[2:4] 'th' >>> w[0:3] 'pyt' >>> w[2:] 'thon' >>> w[-5:] 'ython' p y t h o n 0 1 2 3 4 5-6 -5-4 -3-2 -1 >>> w[:4] 'pyth' >>> w[:-1] 'pytho' >>> w[2:100] 'thon' >>> w[:2] + w[2:] 'python' >>> w[:3] + w[3:] 'python' >>> w[:] 'python' >>> w[4:4] '' 33

Practice: a vs. an 3 minutes Indefinite NP Generator Write a Python script that: prompts for a noun prints out the indefinite article (a/an), and the noun It must pick the correct indefinite article! "a cat" vs. "an owl" Describe the algorithm, step-by-step. Specify which Python function you will use with each step. 1/29/2014 34

Attempt 1. Works, but crude 1/29/2014 35

Attempt 2. Elegant! Uses string indexing & list membership test 1/29/2014 36

Practice: Gerund Generator 3 minutes Write a Python script that: prompts for a verb (base form) and then prints out its gerund form, i.e., suffixed with '-ing' CAVEAT: Account for verbal stems that end with ' e' walk walking sleep sleeping make making live living KEY: How to derive 'mak' from 'make'? 1/29/2014 37

1/29/2014 38

Looking ahead for x in LIST : iterates through a list for doing something to each element Press ENTER at the end of each line Iterates through every element s of the simpsons list and prints the value of s followed by 'is a Simpson.'. 1/29/2014 39

Wrap-up Next class Loops! for loop, while loop Exercise #4 Due Tuesday midnight Practice Python for 1 hour: review what we learned in class, and explore on your own. 1/29/2014 40