CSCI 4152/6509 Natural Language Processing: Perl Tutorial

Vlado Kešelj

About Perl
- created in 1987 by Larry Wall
- interpreted language, with just-in-time semi-compilation
- provides effective string manipulation; programs can be brief if needed
- convenient for system tasks
- syntax (and semantics) similar to C, shell scripts, awk, sed, even Lisp and C++

Perl Strengths
- good prototyping language; expressive: it can express a lot in a few lines of code
- can be used incrementally: useful even if you learn only a small part of it, and more useful as you learn more; i.e., its learning curve is not steep
- flexible; e.g., most tasks can be done in more than one way
- garbage collection; i.e., no worries about memory management
- free, open-source; portable, extensible
- powerful string and data manipulation, regular expressions
- efficient, especially considering that it is an interpreted language
- supports Object-Oriented style

Perl Weaknesses
- not as efficient as C/C++
- may not be very readable without prior knowledge
- OO features are an add-on, rather than built-in
- not a steep learning curve, but a long one (which is not necessarily a weakness)

Hello world
Choose your favourite editor and edit hello.pl:

    print "Hello world!\n";

Type perl hello.pl to run the program, which should produce:

    Hello world!

You can also execute Perl code by interacting directly with the Perl interpreter:

    perl
    print "Hello world!\n";
    ^D

(The last ^D is actually Ctrl+D.) This means that you can also do:

    perl < hello.pl

Another way to run a program
Let us edit hello.pl again into:

    #!/usr/bin/perl
    print "Hello world!\n";

Change the permissions of the program and run it:

    chmod u+x hello.pl
    ./hello.pl

Running perl -w hello.pl may print useful warnings. The same effect is achieved by putting -w on the first line of the program:

    #!/usr/bin/perl -w
    print "Hello world!\n";

File Names
- extension .pl is common, but not mandatory
- extension .pm is used for Perl modules

Finding Help
- man perl, man perlintro, ...
- Web: perl.com, CPAN.org, perlmonks.org, ...
- books: the Camel book (Programming Perl); Learning Perl, 4th Edition, by Randal L. Schwartz, Tom Phoenix, and brian d foy (2005), available on-line on Safari at Dalhousie: http://proquest.safaribooksonline.com/0596101058

Syntactic Elements
- statements are separated by a semi-colon ;
- white space does not matter except in strings
- line comments begin with #; e.g., # a comment until the end of line
- variable names start with $, @, or %:
    $a   a scalar variable
    @a   an array variable
    %a   an associative array (or hash)
  However: $a[5] is the element at index 5 of array @a (the sixth element, since indices start at 0), and $a{5} is the value associated with key 5 in hash %a
- the starting special symbol is followed either by a name (e.g., $varname) or by a non-letter symbol (e.g., $!)
- user-defined subroutines are usually prefixed with &:
    &a   call the subroutine a (procedure, function)

Example Program 2
We can call this program prog2.pl:

    #!/usr/bin/perl
    print "What is your name? ";
    $name = <>;
    chomp $name;
    print "Hello $name!\n";

chomp removes the trailing newline from $name if there is one. However, changing the special variable $/ will change the behaviour of chomp too.

The declaration use strict; is useful to force stricter verification of the code. If it is used in the previous program, Perl will complain about variable $name not being declared, so you can declare it with my, as in Examples 3 and 4.

Examples 3 and 4

    #!/usr/bin/perl
    use strict;
    my $name;
    print "What is your name? ";
    $name = <>;
    chomp $name;
    print "Hello $name!\n";

or

    #!/usr/bin/perl
    use strict;
    print "What is your name? ";
    my $name = <>;
    chomp $name;
    print "Hello $name!\n";

Example 5: Copy standard input to standard output

    #!/usr/bin/perl
    while ($line = <>) {
        print $line;
    }

The special variable $_ is the default variable for many commands, including print and the expression while (<>), so another version of the program would be:

    #!/usr/bin/perl
    while (<>) { print }

or even shorter:

    #!/usr/bin/perl -p

Variables
- no need to declare them unless use strict; is in place
- use strict; is a good practice for larger projects
- variable type is not declared (it is inferred from context)
- the main variable types:
  1. Scalars
     - numbers (integers and floating-point)
     - strings
     - references (pointers)
  2. Arrays of scalars
  3. Hashes (associative arrays) of scalars

Single-Quoted String Literals

    print 'hello\n';                 # produces hello\n
    print 'It is 5 o\'clock!';       # ' has to be escaped
    print q(another way of 'single-quoting');   # no need to escape this time
    print q< and another way >;
    print q{ and another way };
    print q[ and another way ];
    print q- and another way with almost arbitrary character (e.g. not q)-;
    print 'A multi line
    string (embedded new-line characters)
    ';
    print <<'EOT';
    Some lines of text
    and more $a @b
    EOT

Double-Quoted String Literals

    print "Backslash combinations are interpreted in double-quoted strings.\n";
    print "newline after this\n";
    $a = 'are';
    print "variables $a interpolated in double-quoted strings\n";
    # produces "variables are interpolated" etc.
    @a = ('arrays', 'too');
    print "and @a\n";
    # produces "and arrays too" and a newline
    print qq{similarly to single-quoted, this is also a double-quoted string, (etc.)};

Scalar Variables
- the name starts with $ followed by:
  1. a letter and a sequence of letters, digits or underscores, or
  2. a special character such as punctuation or a digit
- contains a single scalar value such as a number, string, or reference (a pointer)
- no need to worry whether a value is actually a number or the string representation of a number:

    $a = 5.5;
    $b = "$a";
    print $a+$b;    # prints 11

Numerical Operators
- basic operations: + - * /
- transparent conversion between int and float
- additional operators: ** (exponentiation), % (modulo), ++ and -- (post/pre inc/decrement, like in C/C++, Java)
- can be combined into assignment operators: += -= /= *= %= **=

String Operators
- . is concatenation; e.g., $a.$b
- x is the string repetition operator; e.g.,

    print "This sentence goes on"." and on" x 4;

  produces:

    This sentence goes on and on and on and on and on

- assignment operators: = .= x=
- string find and extract functions: index(str,substr[,offset]) and substr(str,offset[,len])
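The following is a minimal sketch of how index and substr can be combined with . and x. The helper name word_after and the sample text are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the word that follows $marker in $text,
# or undef if $marker does not occur.
sub word_after {
    my ($text, $marker) = @_;
    my $pos = index($text, $marker);         # index returns -1 when not found
    return undef if $pos < 0;
    my $start = $pos + length($marker);
    my $end   = index($text, ' ', $start);   # first space after the word
    $end = length($text) if $end < 0;        # the word runs to the end of the string
    return substr($text, $start, $end - $start);
}

# x binds tighter than ., so this builds "name=ababab rest"
my $s = 'name=' . 'ab' x 3 . ' rest';
print word_after($s, 'name='), "\n";         # prints ababab
```

Note that x has higher precedence than the concatenation operator, which is also why the "and on" example repeats only its second string.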

Comparison operators

    Operation                  Numeric   String
    --------------------------------------------
    less than                  <         lt
    less than or equal to      <=        le
    greater than               >         gt
    greater than or equal to   >=        ge
    equal to                   ==        eq
    not equal to               !=        ne
    compare                    <=>       cmp
    --------------------------------------------

Example:

    print ">".(1==1)."<";   # produces: >1<
    print ">".(1==0)."<";   # produces: ><

What is true and what is false
Beware:

    print ''    ? 'true' : 'false';   # false
    print 1     ? 'true' : 'false';   # true
    print '1'   ? 'true' : 'false';   # true
    print 0     ? 'true' : 'false';   # false
    print '0'   ? 'true' : 'false';   # false
    print ' 0'  ? 'true' : 'false';   # true
    print 0.0   ? 'true' : 'false';   # false
    print "0.0" ? 'true' : 'false';   # true
    print 'true' ? 'true' : 'false';  # true
    print 'zero' ? 'true' : 'false';  # true

The false values are: 0, '' (the empty string), '0', or undef. True is anything else.
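The rules above can be checked with a short script. This is only a sketch: the helper name truth and the list of sample values are assumptions made for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: report how Perl treats a value in boolean context.
sub truth {
    my ($value) = @_;
    return $value ? 'true' : 'false';
}

# Only 0, '0', '' and undef are false; strings such as '0.0' or '00' are true.
for my $value (0, '0', '', '0.0', '00', ' 0', 'zero') {
    printf "%-7s => %s\n", "'$value'", truth($value);
}
print "undef   => ", truth(undef), "\n";
```

The string '0.0' is true even though it is numerically equal to zero, because the truth test looks at the string itself, not at its numeric value.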

<=> and cmp
$a <=> $b and $a cmp $b return the sign of $a - $b in a sense:
- -1 if $a < $b or $a lt $b,
- 0 if $a == $b or $a eq $b, and
- 1 if $a > $b or $a gt $b.

Useful with the sort command:

    @a = ( 123, 19, 124 );
    @a = sort @a;               print "@a\n";   # 123 124 19
    @a = sort {$a<=>$b} @a;     print "@a\n";   # 19 123 124
    @a = sort {$b<=>$a} @a;     print "@a\n";   # 124 123 19
    @a = sort {$a cmp $b} @a;   print "@a\n";   # 123 124 19
    @a = sort {$b cmp $a} @a;   print "@a\n";   # 19 124 123
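Because <=> and cmp return 0 on ties, they can be chained with or to sort by more than one criterion. The sketch below assumes a hypothetical helper by_length_then_alpha that orders words by length first and alphabetically second.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: sort words by length, breaking ties alphabetically.
# When the lengths are equal, <=> yields 0 (false), so cmp decides.
sub by_length_then_alpha {
    my @words = @_;
    my @sorted = sort { length($a) <=> length($b) or $a cmp $b } @words;
    return @sorted;
}

my @sorted = by_length_then_alpha(qw(pear fig apple kiwi));
print "@sorted\n";    # prints fig kiwi pear apple
```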

Boolean Operators
Six operators: &&, and, ||, or, !, not
The difference between && and and is in precedence, and similarly for the others.

Range Operators
- .. creates a list in list context, flip-flop otherwise
- ... same, except for the flip-flop behaviour

    @a = 1..10; print "@a\n";     # out: 1 2 3...
    @a = -5..5; print "@a\n";     # out?
    $a = 1; $b = 5;
    @c = ($a..$b, -2..2); print "@c\n";   # ?
    print map{$_.="\n"} ('aa'..'zz');
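A small sketch of the two behaviours of the .. operator follows. The helper names span and between_markers, and the BEGIN/END marker lines, are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# List context: .. builds the list of values from $from up to $to.
sub span {
    my ($from, $to) = @_;
    my @values = ($from .. $to);
    return @values;
}

# Scalar context: .. is a flip-flop. It becomes true at a line matching
# BEGIN and stays true up to and including the next line matching END.
sub between_markers {
    my @lines = @_;
    my @kept = grep { scalar(/^BEGIN$/ .. /^END$/) } @lines;
    return @kept;
}

print join(' ', span(-2, 2)), "\n";       # prints -2 -1 0 1 2
print join(' ', span('a', 'e')), "\n";    # prints a b c d e
print join(' ', between_markers(qw(x BEGIN a b END y))), "\n";   # prints BEGIN a b END
```

A range whose start is greater than its end, such as span(5, 1), yields an empty list rather than counting down.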

Arrays
An array is an ordered list of scalar values. Example:

    my @animals = ("camel", "llama", "owl");
    my @numbers = (23, 42, 69);
    my @mixed = ("camel", 42, 1.23);
    print "animals are @animals that is: $animals[0] $animals[1] $animals[2]\n";
    print "There is a total of ",$#animals+1," animals\n";
    print "There is a total of ",scalar(@animals), " animals\n";
    $animals[5] = 'lion';
    print "animals are @animals\n";

Some Array Functions (Operators)

    @a = (1,2,3);       # @a = (1, 2, 3)
    push @a, 4;         # @a = (1, 2, 3, 4)
    $b = pop @a;        # $b=4, @a = (1, 2, 3)
    $b = shift @a;      # $b=1, @a = (2, 3)
    unshift @a, 5;      # @a = (5, 2, 3)

    $s = "This is a sentence.";
    @a = split /[ .]+/, $s;
    $s = join '<>', @a;
    print $s, "\n";

    print 'Print', 'is', 'also a list operator', "\n";
    print STDERR "print can use a filehandle\n";
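As a sketch of how these operators combine, the hypothetical helpers below rotate a list with shift and push, and split a sentence into words. The separator pattern is an assumption chosen for the example, not part of the original slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: move the first element to the end of the list.
sub rotate_left {
    my @items = @_;
    return @items unless @items;     # nothing to rotate in an empty list
    push @items, shift @items;       # shift removes the head, push appends it
    return @items;
}

# Hypothetical helper: split a sentence on white space and punctuation,
# dropping any empty leading field that split may produce.
sub tokenize {
    my ($sentence) = @_;
    my @tokens = grep { length } split /[\s.,;!?]+/, $sentence;
    return @tokens;
}

print join(' ', rotate_left(1, 2, 3)), "\n";              # prints 2 3 1
print join('|', tokenize('This is a sentence.')), "\n";   # prints This|is|a|sentence
```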

Hashes (Associative arrays)
A hash is a structure that associates keys with values. Example:

    %p = ( one => 'first', two => 'second' );
    $p{'three'} = 'third';
    $p{'four'} = 'fourth';
    @a = keys %p;      # or keys(%p), no order
    @b = values %p;    # or values(%p), no order
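A typical use of a hash in text processing is counting word frequencies. The sketch below assumes a hypothetical helper count_words; since keys returns the keys in no particular order, the output loop sorts them first.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count case-insensitive occurrences of each word.
sub count_words {
    my @words = @_;
    my %count;
    $count{lc $_}++ for @words;      # a missing key starts from undef, then becomes 1
    return %count;
}

my %count = count_words(qw(the cat saw The dog));
for my $word (sort keys %count) {    # sort gives a stable output order
    print "$word: $count{$word}\n";
}
# prints:
# cat: 1
# dog: 1
# saw: 1
# the: 2
```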

Control Structures
- if-elsif-else and unless
- while loop
- for loop
- foreach loop

If-elsif-else

    if (EXPRESSION) {
        STATEMENTS;
    } elsif (EXPRESSION) {    # optional
        STATEMENTS;
    } elsif (EXPRESSION) {    # optional additional elsif's
        STATEMENTS;
    } else {
        STATEMENTS;           # optional else
    }

Other equivalent forms, e.g.:

    if ($x > $y) { $a = $x }
    $a = $x if $x > $y;
    $a = $x unless $x <= $y;
    unless ($x <= $y) { $a = $x }

While Loop

    while (EXPRESSION) {
        STATEMENTS;
    }

- last is used to break out of the loop (like break in C/C++/Java)
- next is used to start the next iteration (like continue)
- redo is similar to next, except that the loop condition is not evaluated
- labels are used to break out of a non-innermost loop, e.g.:

    L: while (EXPRESSION) {
        ...
        while (E1) {
            ...
            last L;
        }
    }
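A labelled last can be sketched with a search over pairs of numbers. The helper name find_pair and the label OUTER are hypothetical; the example uses foreach-style loops, which accept labels in the same way as while loops.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the first pair of numbers that adds up to
# $target, or an empty list if there is no such pair.
sub find_pair {
    my ($target, @nums) = @_;
    my @pair;
    OUTER: for my $i (0 .. $#nums) {
        for my $j ($i + 1 .. $#nums) {
            if ($nums[$i] + $nums[$j] == $target) {
                @pair = ($nums[$i], $nums[$j]);
                last OUTER;          # leaves both loops, not only the inner one
            }
        }
    }
    return @pair;
}

print join(' + ', find_pair(10, 1, 4, 6, 9)), "\n";   # prints 1 + 9
```

Without the label, last would leave only the inner loop and the search would continue with the next value of $i.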

next vs. redo

    #!/usr/bin/perl
    $i=0;
    while (++$i < 5) {
        print "($i) ";
        ++$i;
        next if $i==2;
        print "$i ";
    }
    # output: (1) (3) 4

    $i=0;
    while (++$i < 5) {
        print "($i) ";
        ++$i;
        redo if $i==2;
        print "$i ";
    }
    # output: (1) (2) 3 (4) 5

For Loop

    for ( INIT_EXPR; COND_EXPR; LOOP_EXPR ) {
        STATEMENTS;
    }

Example:

    for (my $i=0; $i <= $#a; ++$i) { print "$a[$i]," }

Foreach Loop
Examples:

    @a = ( 'lion', 'zebra', 'giraffe' );
    foreach $a (@a) { print "$a is an animal\n" }

    # or use default variable
    foreach (@a) { print "$_ is an animal\n" }

    # more examples
    foreach my $a (@a, 'horse') { print "$a is animal\n"}
    foreach (1..50) { print "$_, " }

for can be used instead of foreach as a synonym.

Basic I/O

    # read STDIN and print, or from file specified
    # in the command line
    while ($line = <>) { print $line }

    # or
    while (<>) { print }    # using default variable $_

    $line = <>;             # reads one line
    @lines = <>;            # reads all lines,
                            # (context-dependent behaviour)

    print "a line\n";       # output, or
    printf "%10s %10d %12.4f\n", $s, $n, $fl;   # formatted output

Subroutines

    sub say_hi { print "Hello\n"; }

    &say_hi();    # call
    &say_hi;      # call, another way since we have no params
    say_hi;       # works as well (no variable sign = sub, i.e., &)

    sub add2 { my $a = shift; my $b = shift; return $a + $b; }
    print &add2(2,5);    # produces 7

    # alternative definition
    sub add2 { return $_[0] + $_[1] }

    # @_ is array of parameters
    # shift with no arguments takes @_ by default
    # (or @ARGV outside of a subroutine)

Subroutines (2)

    sub add {
        my $ret = 0;
        while (@_) { $ret += shift }
        return $ret;
    }
    print &add(1..10);    # produces 55
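Besides shift and $_[0], parameters are often copied out of @_ with a list assignment. The sketch below assumes a hypothetical subroutine mean to show this style; it relies on an array evaluating to its length in numeric context.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: arithmetic mean of the arguments,
# or undef when called with no arguments.
sub mean {
    my @values = @_;                 # copy all parameters at once
    return undef unless @values;     # avoid division by zero
    my $sum = 0;
    $sum += $_ for @values;
    return $sum / @values;           # @values in numeric context is its length
}

print mean(2, 4, 9), "\n";           # prints 5
```

Copying @_ into a named array also protects the caller's variables, since the elements of @_ are aliases to the actual arguments.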

Regular Expressions
A simple way to test a regular expression:

    while (<>) {
        print if /book/;
    }

i.e., print lines that contain the substring book.
- /chee[sp]eca[rk]e/ would match: cheesecare, cheepecare, cheesecake, cheepecake
- option /i matches case variants; e.g., /book/i would match Book, BOOK, book, etc., as well

RegEx: Character Class

    /200[012345]/               match one of the chars
    /200[0-9]/                  character range
    /From[^:]/                  match any character but :
    /[^a-zA-Z]the[^a-zA-Z]/     multiple ranges
    /[]]/                       to match ]
    /[]-]/                      to match ] or -
    /[$^]/                      to match $ or ^
    [0-9ABCDEFa-f]              to match a one-digit hexadecimal number

    .  (period) any character but new-line
    \d   any digit; i.e., same as [0-9]
    \D   any character but digit
    \s   any white-space character, including new-line
    \S   any character but white-space, i.e., printable
    \W   any non-word character
    \w   any word character (letter, digit, or underscore)
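The character classes above can be tried out with a few test strings. In this sketch the helper names year_2000s and mentions_the are hypothetical. Note that /[^a-zA-Z]the[^a-zA-Z]/ needs an actual non-letter character on both sides of the word, so it does not match "the" at the very start or end of a string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: does the text contain 200 followed by one more digit?
sub year_2000s {
    my ($text) = @_;
    return $text =~ /200[0-9]/ ? 1 : 0;
}

# Hypothetical helper: does the text contain "the" with a non-letter
# character directly before and directly after it?
sub mentions_the {
    my ($text) = @_;
    return $text =~ /[^a-zA-Z]the[^a-zA-Z]/ ? 1 : 0;
}

print year_2000s('published in 2007'), "\n";    # prints 1
print year_2000s('published in 1999'), "\n";    # prints 0
print mentions_the('see the cat'), "\n";        # prints 1
print mentions_the('the cat'), "\n";            # prints 0 (nothing before "the")
print mentions_the('other things'), "\n";       # prints 0 ("the" is inside a word)
```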