
LING115 Lecture Note Session #5: Files, Functions and Modules

1. Introduction

A corpus comes packaged as a set of files, so we must know how to read data from a file into our program. It is also convenient to save the output of our analysis of the corpus to a file. Let's learn how to work with files in section 2 and how to write our programs more efficiently using functions and modules in sections 3 and 4.

2. Files

We sometimes want to read data from a file or write the output of our program to a file. This can be done by using the file data-type. We create a file object by opening a file with a particular name. To this end, we use the open function. We specify the name of the file and whether we want to read data from the file or write data to the file. For example, the following will open a file named foo.txt under /home/ling115/python_examples/ so we can read its data:

    f=open("/home/ling115/python_examples/foo.txt", "r")

By default, it is assumed that we open a file to read data. So the above is equivalent to the following:

    f=open("/home/ling115/python_examples/foo.txt")

The following will create a file called blah.txt under /home/hahnkoo/ so we can write data to it:

    f=open("/home/hahnkoo/blah.txt", "w")

If a file with the same name already exists, opening it with the "w" parameter is equivalent to deleting the old file and starting a new file with the same name.

To open a file so that we can append new data to the existing data, we open it with the "a" parameter. For example, the following will allow us to add more data to the end of foo.txt under /home/hahnkoo/ (if the file does not exist yet, it is created):

    f=open("/home/hahnkoo/foo.txt", "a")

Note that I specified the path to the directory in addition to the filename. If the directory path is not specified, Python assumes the directory is the current working directory.

A file object is created as a result of opening a file. In the above examples, the file object is called f.
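The three modes described above can be compared side by side. The following is a minimal sketch, not part of the original notes: the scratch directory created with tempfile and the variable names are assumptions chosen so that the example does not touch any real files, and read() is used only to inspect the result.

```python
import os
import tempfile

# Hypothetical scratch directory, so that no real file is overwritten.
scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, "blah.txt")

f = open(path, "w")   # "w": create the file, or empty it if it already exists
f.write("first\n")
f.close()

f = open(path, "a")   # "a": keep the existing data and add to the end
f.write("second\n")
f.close()

f = open(path)        # no mode given: same as "r"
contents = f.read()   # the whole file as one string
f.close()

f = open(path, "w")   # reopening with "w" discards the old contents
f.close()

f = open(path)
after_reopen = f.read()
f.close()
```

After running this, contents holds both lines, while after_reopen is an empty string because the second "w" open started the file over.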
Having an object of a particular data-type means we can use its methods. Below are some useful methods specific to file objects.

close()

Close the file. Unless we need to have multiple files open at the same time, it is recommended that you close each file as soon as you are done with it. For example, after we read lines from foo.txt into a list, we want to close the file as follows:

    >>> foo=open("/home/ling115/python_examples/foo.txt")
    >>> lines=foo.readlines()
    >>> foo.close()

readline()

Read one line from the file. Each time this method is called, it returns the next line in order, starting from the first line of the file. Try the following, for example.

    >>> foo=open("/home/ling115/python_examples/foo.txt")
    >>> foo.readline()
    >>> foo.readline()
    >>> foo.close()

readlines()

Read all lines from the file into a list.

write(string)

Write string to the file.

    >>> f=open("temp.txt", "w")
    >>> a="a string"
    >>> f.write(a)
    >>> f.close()

writelines(sequence)

Write a sequence of strings to the file. No separators or newlines are added between the strings. For example, the following will create a file called foo.txt which contains one line, astringwithoutspace.

    >>> f=open("foo.txt", "w")
    >>> a=["a", "string", "without", "space"]
    >>> f.writelines(a)
    >>> f.close()

The following, on the other hand, will create a file called foo2.txt which contains three lines.

    >>> f=open("foo2.txt", "w")
    >>> a=["line1\n", "line2\n", "line3\n"]
    >>> f.writelines(a)
    >>> f.close()
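To see how these methods interact, here is a short sketch that writes the three-line file and reads it back. It is an illustration under assumptions rather than part of the original notes: the scratch directory and variable names are made up, and the closing with block is an alternative idiom (not covered above) that closes the file automatically.

```python
import os
import tempfile

# Hypothetical scratch location for the example file.
scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, "foo2.txt")

f = open(path, "w")
f.writelines(["line1\n", "line2\n", "line3\n"])
f.close()

f = open(path)
first = f.readline()    # the first line, including its newline
second = f.readline()   # the next line in order
rest = f.readlines()    # whatever has not been read yet, as a list
at_end = f.readline()   # an empty string once the file is exhausted
f.close()

# The with statement closes the file when the block ends,
# so no explicit close() call is needed.
with open(path) as g:
    all_lines = g.readlines()
```

Note that readlines() only returns the lines that have not been consumed yet, which is why rest contains a single element here.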

3. Functions

Roughly speaking, a function is like a program inside a program: it receives input arguments, does something with them, and returns a value. For example, we could have a function called avg which takes a list of numbers as its argument and returns its arithmetic mean. We could perhaps use it to see how often a word appears in the given corpus on average.

A function must be defined before it can be used.

    def N(A):
        B
        return X

N, A, B, X refer to the name of the function, its arguments, the block of code that defines the function, and the value that the function returns, respectively. For example, the avg function would be defined as follows:

    def avg(numbers):
        total=0.0
        for number in numbers:
            total=total+number
        n=len(numbers)
        return total/n

A function returns a value, so it can be used in expressions. For example, we can subtract the average from a value to calculate its deviation as follows:

    numbers=[1,2,3,4]
    deviation = numbers[2] - avg(numbers)

As you can see from the above example, we use a function by calling it: specify the name of the function followed by its arguments in parentheses.

4. Modules

A module is a file that contains function definitions, so that we can use functions defined in another program file. For example, we define the avg function only once and then use it in any program where we want to calculate the average of a list.

In order to call a function defined in a module, we must first import the module. That is, add the following line:

    import <module>

We have already seen an example when we imported the sys module to process standard input.

    import sys

Once we import the module, we call its function in the following format:

    <module>.<function>

For example, we can use the log function defined in the math module as follows:

    import math
    math.log(100)

There are modules like sys which are built-in. See the list at http://docs.python.org/modindex.html for more. For these, we just need to enter import <module> as above. However, to use a function defined in a file we created, we must do the following:

1) Tell Python the directory that contains the file. This is done by first importing sys and then entering sys.path.append(<directory-path>).
2) Enter import <module>, where <module> is the name of your file without the .py extension.

For example, suppose we defined the avg function in a file named ling115_stat.py under /home/ling115/python_examples/. To import the module:

    import sys
    sys.path.append("/home/ling115/python_examples/")
    import ling115_stat

5. Exercise 1

Suppose we wanted to count the number of words in each file under a specified directory.

- Get a list of files under the specified directory.
- For each file in the list, do the following:
    o Count the number of words in the file.

Let's first define how to count the number of words in each file.

- Define a counting variable named word_count. Initialize it to zero.
- Open the file.
- Read the lines in the file into a list.
- For each line in the list, do the following:
    o Remove leading and trailing whitespace, including the newline at the end of the line.
    o Split the line by white space and store it in a list of words.
    o Increase word_count by the number of words in the list.
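Before turning the steps above into a function, the per-line part can be tried on a single string. This is a small sketch; the sample line is made up for illustration.

```python
word_count = 0

# A made-up line, as readlines() might return it (note the newline).
line = "  the quick brown fox\n"

stripped = line.strip()    # remove leading and trailing whitespace
words = stripped.split()   # split on white space into a list of words
word_count = word_count + len(words)
```

Calling split() with no argument treats any run of white space as a single separator, so extra spaces between words do not produce empty entries in the list.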

We can define a function that captures the counting process. Let's call it count_words.

    def count_words(filename):
        # Read all lines, then close the file right away.
        f=open(filename)
        lines=f.readlines()
        f.close()
        word_count=0
        for line in lines:
            # Strip surrounding whitespace and split into words.
            words=line.strip().split()
            word_count=word_count+len(words)
        return word_count

With the definition above, we can write the program as follows:

    import sys
    import os
    directory_name=sys.argv[1]
    file_list=os.listdir(directory_name)
    for filename in file_list:
        count=count_words(os.path.join(directory_name, filename))
        print(filename+"\t"+str(count))

In the program above, we import two built-in Python modules: sys and os. The sys module is imported to process command-line arguments, as can be seen in directory_name=sys.argv[1]. The os module is imported to list the files in the specified directory: os.listdir(directory_name). Note the use of the count_words function in count=count_words(os.path.join(directory_name, filename)). Since os.listdir returns bare file names, os.path.join is used to put the directory path in front of each name; it works whether or not directory_name ends with a slash.

6. Exercise 2

Now suppose we wanted to calculate the average number of words in each file using the avg function defined in /home/ling115/python_examples/ling115_stat.py. Instead of printing the number of words in each file, we add the word-count to a list while we're in the for-loop and calculate the mean of the list afterwards. In addition to the definition of count_words mentioned in the previous section, our code should include the following:

    import sys
    import os
    sys.path.append("/home/ling115/python_examples/")
    import ling115_stat
    directory_name=sys.argv[1]
    file_list=os.listdir(directory_name)
    count_list=[]
    for filename in file_list:
        count=count_words(os.path.join(directory_name, filename))
        count_list.append(count)
    print(ling115_stat.avg(count_list))

Note that the path to the directory containing ling115_stat.py must first be added to sys.path in order for Python to import ling115_stat into our program.
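The two exercises can be combined into one self-contained sketch that runs without the course files. Several things here are assumptions rather than part of the original exercise: avg is defined inline instead of being imported from ling115_stat, the directory is a temporary one filled with two made-up files, the helper name average_words is hypothetical, and subdirectories are skipped with os.path.isfile because open would fail on them.

```python
import os
import tempfile


def avg(numbers):
    """Return the arithmetic mean of a list of numbers."""
    total = 0.0
    for number in numbers:
        total = total + number
    return total / len(numbers)


def count_words(filename):
    """Count the whitespace-separated words in one file."""
    with open(filename) as f:
        lines = f.readlines()
    word_count = 0
    for line in lines:
        word_count = word_count + len(line.strip().split())
    return word_count


def average_words(directory_name):
    """Average word count over the regular files in a directory."""
    count_list = []
    for name in os.listdir(directory_name):
        path = os.path.join(directory_name, name)
        if os.path.isfile(path):  # skip subdirectories (an added assumption)
            count_list.append(count_words(path))
    return avg(count_list)


# Hypothetical two-file corpus in a scratch directory.
corpus_dir = tempfile.mkdtemp()
with open(os.path.join(corpus_dir, "a.txt"), "w") as f:
    f.write("one two three\nfour\n")
with open(os.path.join(corpus_dir, "b.txt"), "w") as f:
    f.write("five six\n")
os.mkdir(os.path.join(corpus_dir, "subdir"))

mean_count = average_words(corpus_dir)
print(mean_count)
```

The first file contains four words and the second contains two, so the mean is 3.0. As in the original avg, an empty list (for example, a directory with no files) would cause a division by zero, so a real program may want to check for that case first.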