Week 3: 23 January - 27 January. From last week: Arrays. Reading for this week: Hashes, Files. 24 H: Hour 4; PP Ch 6:29-34, Ch 7:51-52
- Angelina Parker
Transcription
1 Week 3: 23 January - 27 January
From last week
  Arrays: 24 H: Hour 4; PP Ch 6:29-34, Ch 7:51-52
Reading for this week
  Hashes: 24 H: Hour 7; PP Ch 6:34-37
  Files: 24 H: Hour 5; PP Ch 19:
Biol Practical Biocomputing 1
2 Week 3 Homework 1

    #
    # HW1
    #
    # For a fastq file on standard input
    #   count and report the number of reads
    #   calculate and report the average length of the sequences
    #
    # there are four lines for each sequence in a fastq file:
    #   title line
    #   sequence
    #   separator
    #   quality line
    #
    # Michael Gribskov 14 january 2016
    #

    # read the file line-by-line; count the number of reads and total length
    $line_num = 0;
    while ( $line = <> ) {
        # remove the newline character
        chomp $line;

        if ( $line_num % 4 == 1 ) {
            # the second line is the sequence line
            #print "$line_num: $line:\n";
            $sum_len += length($line);
            $seq_count++;
        }
        $line_num++;
    }

    $ave_len = $sum_len / $seq_count;
    print "there are $seq_count reads\n";
    print "average read length is $ave_len\n";
3 Week 3 Homework 2
Using the same FastQ file as in homework 1, tabulate the percentage of A, C, G and T bases versus the position in the sequence. The logical way to do this is to save the count of each base in an array where the array index is the position in the sequence.
Following the base output, write a histogram of the base qualities. Base quality is shown in the 4th line of each entry and is encoded as
    Q = chr( 33 + (-10 * log10(error)) )
where chr is a Perl function that converts an integer to the character with the corresponding ASCII code; the ord function does the opposite. For the histogram we can simply use the ASCII values, i.e. ord($quality_letter).
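The relationship between chr, ord and the quality encoding can be sketched as follows. This is only an illustration, not a homework solution: the quality string is made up, and the decoding assumes the offset of 33 given in the formula above.

```perl
use strict;
use warnings;

# Hypothetical quality string (4th line of a FASTQ entry); not from the course data.
my $quality = "II5+!";

my @ascii;    # raw ASCII codes, as suggested for the histogram
my @phred;    # decoded scores, assuming the offset of 33 from the formula
foreach my $letter ( split "", $quality ) {
    my $code = ord($letter);      # letter -> integer
    push @ascii, $code;
    push @phred, $code - 33;      # undo the offset
}

print "ascii: @ascii\n";
print "phred: @phred\n";
print "chr(73) is ", chr(73), "\n";   # integer -> letter
```

The letter I has ASCII code 73, so it decodes to 73 - 33 = 40.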
4 Week 3 Homework 2

    #Positional base content
    #A C G T AT GC
    #quality histogram
5 Arrays - Quick review
Identified by @
  @a and $a are completely different variables (an array and a scalar)
  @name is only used for multiple values (whole array)
  $name[$element] is used for single values
An ordered set of scalars
Initialized using parentheses with the elements separated by commas

    # Define nucleotides
    @base = ( "A", "C", "G", "T" );
    foreach $nuc ( @base ) {
        print "$nuc is a valid base\n";
    }
    print "the first base is $base[0]\n";
6 Hashes
Also called associative arrays
Hashes are arrays indexed by a string rather than a number
Hashes are a set of key/value pairs
When do you use hashes?
  When the natural index is a string, such as a sequence ID, base, or amino acid
Hashes are declared using % and ( )
Hash elements are accessed using { }, e.g. $residue{ala}
Notice the importance of [ ] array element, vs { } hash element, vs ( ) list delimiter
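The three kinds of brackets can be seen side by side in a short sketch. The variable names and values here are illustrative assumptions, not taken from the slides.

```perl
use strict;
use warnings;

# ( ) delimits the list used to fill a hash or an array
my %residue = ( ala => 'A', cys => 'C', asp => 'D' );
my @order   = ( 'ala', 'cys', 'asp' );

my $first  = $order[0];           # [ ] selects an array element by number
my $letter = $residue{$first};    # { } selects a hash element by key

print "$first => $letter\n";
```

Mixing them up, for example writing $residue[0], refers to a different variable (an array called @residue) rather than the hash.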
7 Hashes
Collections of key/value pairs: an array whose index is a word

    # filling a hash, three different ways
    %student = ( name => 'Derren', status => 'Sophomore', gpa => 4.7 );

    %student = ( 'name', 'Derren', 'status', 'Sophomore', 'gpa', 4.7 );

    $student{name}   = 'Derren';
    $student{status} = 'Sophomore';
    $student{gpa}    = 4.7;

    # iterating over hash
    foreach $attribute ( keys %student ) {
        print "$attribute: $student{$attribute}\n";
    }
8 Hashes
Defining hashes
  Two syntaxes for hashes: "," and "=>"

    %student = ( name => 'Derren', status => 'Sophomore', gpa => 4.7 );
    %student = ( 'name', 'Derren', 'status', 'Sophomore', 'gpa', 4.7 );

  As with arrays, use % to access the whole hash and $hash_name{key} to access the individual elements
  In list context, a hash is just a list in which the elements are paired up as keys and values

    %residue = ( ala => 'A', 'cys', 'C', asp => 'D' );
    foreach $k ( %residue ) {
        print "$k\n";
    }

  Output (the pairs may come out in a different order):
    ala
    A
    cys
    C
    asp
    D
9 Hashes
Why use hashes?
  Index is text (string)
  More natural for named entities than having two parallel arrays, one with the name, one with the information
  Sparse arrays, where many values are absent

    # two parallel arrays
    @aa = qw/ala cys asp/;
    @mw = ( 89.0, 121.0, 133.0 );

    # one hash (note the parentheses, not braces)
    %aa_mw = ( ala => 89.0, cys => 121.0, asp => 133.0 );

  Elements in a hash are arranged for fast lookup, not in the order you enter them (unlike an array)
Features of hashes
  There are special functions for access to hashes
    keys: produces a list of just the keys (heavily used)
    values: produces a list of just the values (not used much)
    each: returns one key/value pair per call (hardly ever used)
  Entire hashes can be copied: %working = %original;
  Hash slices can be used; they are more complicated than array slices
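A minimal sketch of the features listed above: keys, copying a whole hash, and a hash slice. The extra gly entry is an illustrative value added only to show that the copy is independent.

```perl
use strict;
use warnings;

my %aa_mw = ( ala => 89.0, cys => 121.0, asp => 133.0 );

# keys returns the keys in no particular order, so sort them for stable output
my @names = sort keys %aa_mw;

# copying a hash makes an independent copy
my %working = %aa_mw;
$working{gly} = 75.0;    # does not change %aa_mw

# hash slice: the @ sigil with { } returns a list of values
my @two = @aa_mw{ 'ala', 'asp' };

print "names: @names\n";
print "slice: @two\n";
print "original: ", scalar( keys %aa_mw ), " keys, copy: ", scalar( keys %working ), " keys\n";
```

The slice syntax is the part that tends to surprise: the sigil is @ because several values come back, while the braces still say that a hash is being indexed.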
10 Hashes
Iterating over hashes - TMTOWTDI

    foreach $key ( keys %x ) {
        # block of code
    }

    foreach $value ( values %x ) {
        # block of code
    }

    while ( ($key, $value) = each %x ) {
        # block of code
    }

    %x = ( 1 => "a", 2 => "b" );
    @k = keys %x;      # @k = (1, 2)
    @v = values %x;    # @v = ("a", "b")
11 Hashes and Arrays
Minor Arcana
  When array assignments overflow or underflow, the leftover values are discarded and the leftover variables are left undefined

    $value    # same as $value = $data[3];
    @data = ( 3, 6, 8, 2 );
    $value = @data;    # $value == 4; number of array elements

    %letter = ( a=>1, b=>2, c=>3 );
    @list = %letter;
    foreach $l ( @list ) {
        print "letter:$l\n";
    }

  Output:
    letter:c
    letter:3
    letter:a
    letter:1
    letter:b
    letter:2
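Overflow, underflow and the difference between list and scalar context can be checked with a short sketch. The variable names other than @data are assumptions chosen for the illustration.

```perl
use strict;
use warnings;

my @data = ( 3, 6, 8, 2 );

# overflow: more values than variables, the extra values (8, 2) are discarded
my ( $first, $second ) = @data;

# underflow: more variables than values, the extra variable is left undefined
my ( $w, $x, $y, $z, $extra ) = @data;

my $count  = @data;    # scalar context: number of elements
my ($only) = @data;    # list context: the first element

print "first=$first second=$second count=$count only=$only\n";
print "extra is undefined\n" unless defined $extra;
```

The parentheses around $only are what switch the assignment from scalar context to list context.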
12 Input/Output
Standard filehandles
  Names are derived from C
  Must be capitalized, no prefix character (sigil)
Standard input: STDIN, default is keyboard
  If no filehandle is supplied for input, the default is STDIN
  <> (functionally the same as <STDIN> when no file names follow the program name) returns the string that is read
  When the input stream ends, it returns undef (false)
Standard output: STDOUT, default is terminal display
  STDOUT is buffered, i.e., output is saved up until a certain amount is present
  If no filehandle is supplied, the default output is STDOUT
  print "test\n"; is the same as print STDOUT "test\n";
Standard error: STDERR
  STDERR is also generally the terminal, but STDERR messages are immediate, not buffered
  As it sounds, STDERR is often used for error messages
  Also used for printing to the terminal when output is going to a file
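A small sketch of sending results and diagnostics to different streams. The report subroutine is a hypothetical helper written for this illustration; it takes the two filehandles as arguments so that the same code works with STDOUT/STDERR or with any other open handles.

```perl
use strict;
use warnings;

# Write result lines to one handle and a diagnostic message to another.
sub report {
    my ( $out, $err, @lines ) = @_;
    foreach my $line (@lines) {
        print $out "$line\n";
    }
    print $err "wrote ", scalar(@lines), " lines\n";
}

# results go to standard output, the diagnostic goes to standard error
report( \*STDOUT, \*STDERR, "first result", "second result" );
```

If the script is run with output redirected to a file, the two result lines end up in the file while the diagnostic still appears on the terminal.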
13 Input/Output
Reading from (multiple) files
  If a file name, or list of file names, follows the name of the program, it will be used as standard input rather than the terminal

a.txt
    This is a
    test file
    three lines of txt

copy_text.pl
    while ( $line = <> ) {
        print "$line";
    }

    % copy_text.pl a.txt
    % copy_text.pl a.txt b.txt c.txt
    % copy_text.pl *.txt        (may not work on all systems, or with very long lists)
14 Input/Output
Reading/writing files using redirection
  Standard input, STDIN, is the keyboard
    Change at the command line by using the input redirect operator (<)
  Standard output, STDOUT, goes to the terminal
    Change at the command line by using the output redirect operator (>)

    % copy_text.pl <a.txt >output.txt
    % copy_text.pl a.txt b.txt c.txt >analysis.dat
15 Input/Output (Recommended)
Reading and printing to/from named files
Filehandle
  A symbol that identifies a file
  Filehandles explicitly identify input and output streams, usually files
  Filehandles have names all in capitals with no symbol preceding
  Predefined: STDIN, STDOUT, STDERR
  Usually you will use an indirect filehandle, a scalar variable ($filehandle)
Three-parameter file open
  Open files with open( $filehandle, mode, $filename )
  You should test whether a filehandle opens correctly; open returns true if successful
  Close with close( $filehandle ); this sends all pending output and closes the file
  If you omit close, all filehandles close when the program terminates
16 Input/Output (Recommended)
Mode indicates whether you want to read or write to the file
  read, mode = <
  write, mode = >
  append, mode = >>

    # test for file opening (preferred method)
    # die is a built-in function that terminates the script and prints the
    # following string (if present)
    $filename = 'seq.fa';
    $outname  = 'seq_copy.fa';    # placeholder name; writing to $filename itself would erase the input
    open( $in,  "<", $filename ) or die "$filename cannot be opened\n";    # mode == read
    open( $out, ">", $outname )  or die "$outname cannot be opened\n";     # mode == write

    $line = <$in>;
    print $out $line;
    close $in;
    close $out;
17 Input/Output old style (deprecated)
  Open with open( FILEHANDLE, "<$filename" );
  Filehandles have names all in capitals with no sigil preceding
  Predefined: STDIN, STDOUT, STDERR
  Close with close( FILEHANDLE ); this sends all pending output and closes the file
  If you omit close, all filehandles close when the program terminates

    # test for file opening
    # die is a built-in function that terminates the script and prints the
    # following string (if present)
    $filename = 'seq.fa';
    $outname  = 'seq_copy.fa';    # placeholder name; writing to $filename itself would erase the input
    open( IN,  "< $filename" ) or die "$filename cannot be opened\n";    # mode == read
    open( OUT, "> $outname" )  or die "$outname cannot be opened\n";     # mode == write

    $line = <IN>;
    print OUT $line;
    close IN;
    close OUT;
18 Input/Output
Opening files
  Files must be opened for reading, writing, or appending
  Files should be closed, but usually it won't matter

    # open for output
    open( $outgoing, ">", "filename.txt" ) or die "filename.txt cannot be opened\n";
    print $outgoing "foo\n";
    close( $outgoing );

    # open for append
    open( $add, ">>", "filename.txt" ) or die "filename.txt cannot be opened\n";
    print $add "foo2\n";
    close( $add );

    # open for input
    open( $incoming, "<", "filename.txt" ) or die "filename.txt cannot be opened\n";
    while ( $foo = <$incoming> ) {    # no spaces inside < >, or Perl treats it as a file glob
        print "=>$foo\n";
    }
    close( $incoming );
19 Input/Output
Reading from filehandles
  Use the input operator with the filehandle, <$filehandle>
Writing to filehandles
  Insert the filehandle after print: print $filehandle "stuff\n"

    # add line numbers to each line of a file
    open( $in, "<", "filename.txt" ) or die "filename.txt cannot be opened\n";
    open( $out, ">", "new.txt" ) or die "new.txt cannot be opened\n";
    $line_no = 0;
    while ( $line = <$in> ) {
        $line_no++;
        print $out "$line_no $line";
    }
    close $in;
    close $out;
20 Input/Output
Special functions for files
  eof: end of file, returns true after the last line is read
  Particularly useful when reading a series of files on STDIN

    $file_num = 1;
    $line_num = 0;
    while ( $line = <> ) {
        chomp $line;
        $line_num++;
        print "($file_num,$line_num) $line\n";
        if ( eof ) {
            print "end of file\n";
            $file_num++;
            $line_num = 0;
        }
    }

file1.txt
    file1=a
    file1=b

file2.txt
    file2,first
    file2,second

    % eof.pl file1.txt file2.txt
    (1,1) file1=a
    (1,2) file1=b
    end of file
    (2,1) file2,first
    (2,2) file2,second
    end of file
21 Input/Output
Using eof (end of file)
  eof is true when you have just read the last line of the file, even if it is not the last line of the input stream
  In most cases you don't need eof because <FILE> returns false when you reach the end of file
  When reading multiple files on standard input this is not true; <> is true until the end of the last file

    # counts the total number of lines in all files
    $line_count = 0;
    while ( $line = <> ) {
        $line_count++;
    }
    print "$line_count\n";

    # counts the number of lines in each file
    $file_no = 0;
    $line_count = 0;
    while ( $line = <> ) {
        $line_count++;
        if ( eof ) {
            $file_no++;
            print "file number:$file_no lines:$line_count\n";
            $line_count = 0;
        }
    }
22 Input/Output
Special functions for files
File test operators (a selection); true means
  -r  file is readable
  -w  file is writable
  -f  file is a plain file (i.e., not a directory or other special file)
  -d  file is a directory
  -M  age of file (since modification), time in days
  -A  age of file (since accessed), time in days

    @files = ( "a.txt", "b.txt", "c.txt" );
    foreach $file ( @files ) {
        next unless -M $file > 0.5;    # files older than 12 hours
    }
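The file test operators can be tried out without depending on files that may not exist. The sketch below creates a scratch file with the core File::Temp module; the variable names are assumptions made for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# create a scratch file so the example does not depend on existing files
my ( $fh, $path ) = tempfile( UNLINK => 1 );
print $fh "ACGT\n";
close $fh;

my $is_plain    = -f $path ? 1 : 0;
my $is_dir      = -d $path ? 1 : 0;
my $is_readable = -r $path ? 1 : 0;
my $age_days    = -M $path;    # days since modification, relative to program start

print "plain:$is_plain dir:$is_dir readable:$is_readable\n";
print "older than 12 hours\n" if $age_days > 0.5;
```

Because the file was just created, the age test fails and the last line prints nothing.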
23 Getting rid of newlines
  Every line you read is ended by a newline (\n)
  When you split the line, the last item will have the newline
  The function chomp removes the last character of a string if and only if it is a newline
  Newline may be different on different operating systems

    while ( $line = <> ) {
        chomp $line;
        @field = split " ", $line;
    }
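That chomp only removes a trailing newline can be confirmed with two in-memory strings. The strings and the @field name are illustrative assumptions.

```perl
use strict;
use warnings;

my $with    = "ACGT\tchr1\n";    # ends with a newline, like a line read from a file
my $without = "ACGT\tchr1";      # no trailing newline

chomp $with;       # the newline is removed
chomp $without;    # nothing to remove, the string is unchanged

my @field = split " ", $with;
print "same after chomp\n" if $with eq $without;
print "last field: /$field[-1]/\n";
```

Calling chomp a second time on the same string is therefore harmless, unlike removing the last character unconditionally.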
24 Input/Output
An entire file can be read into an array at once; this is called slurping
  Slurping can be done with any filehandle
  Each line of the file is one element of the array
  Advantages: fast and simple
  Disadvantages: the entire contents are stored in memory; large files may be too big

    @content = <>;    # note the difference from $content = <>
    foreach $line ( @content ) {
        print "$line";
    }

    # same with a filehandle
    @content = <$sequence>;
    foreach $line ( @content ) {
        print "$line";
    }
25 Input/Output
Sorting a file

student.txt
    Arthur, Chester A
    Adams, John Q
    Buchanan, Jean
    Williams, Andrew
    Jackson, Annette

    $filename = "student.txt";
    open( $infile, "<", $filename ) or die "Unable to open input file $filename\n";
    $filename2 = "student2.txt";
    open( $outfile, ">", $filename2 ) or die "Unable to open file $filename2\n";

    @line = <$infile>;    # reads all of the lines
    @sorted = sort( @line );
    close $infile;

    foreach $line ( @sorted ) {
        print $outfile $line;
    }
    close $outfile;
26 Text Processing
Split and Join functions
  split: break a string into pieces, converts string to list
    split( pattern, expression );           split "pattern", $string;
    split( pattern, expression, limit );    split "pattern", $string, 2;
    the split pattern is removed from the string (more about patterns later)
  join: connect elements of a list into a string, converts list to string
    join( expression, list );

    $text = "But soft,\n what light through \t yonder window\n";
    print "starting string: $text\n";

    # split on white space
    @word = split " ", $text;
    $wordcount = 0;
    foreach $word ( @word ) {
        $wordcount++;
        print "$wordcount: /$word/\n";
    }
    $new_string = join " ", @word;
    print "string after joining: $new_string\n";
    print "\n$wordcount words found\n\n";

  Output:
    starting string: But soft,
     what light through 	 yonder window

    1: /But/
    2: /soft,/
    3: /what/
    4: /light/
    5: /through/
    6: /yonder/
    7: /window/
    string after joining: But soft, what light through yonder window

    7 words found
27 Text Processing
Split can be used to get columns of data
  Ideal for tab-delimited or csv files from MS Excel, GFF files in genomics

Input (one name per line)
    Arthur
    Adams
    Buchanan
    Williams
    Jackson

    # print the columns in a different order
    while ( $line = <> ) {
        ( $name, $score1, $score2 ) = split " ", $line;
        print "score 1: $score1 score 2: $score2 $name\n";
    }

    # alternative: store the columns in hashes, then print sorted by name
    while ( $line = <> ) {
        ( $name, $score1, $score2 ) = split " ", $line;
        $test1{ $name } = $score1;
        $test2{ $name } = $score2;
    }
    foreach $name ( sort keys %test1 ) {
        print "name: $name $test1{$name} $test2{$name}\n";
    }
28 More splitting
Use split to break up a sequence into an array of letters

    $seq = "AGGCCATGAGGTTCC";
    @base = split "", $seq;    # nothing between the double quotes
    foreach $base ( @base ) {
        # one letter per iteration
    }

Use split with a limit to break apart the first line of a FASTA or FASTQ formatted sequence

    >CPK2 calcium dependent protein kinase 2
    2:N:0:GTAGAG
    AGGCCATGAGGTTCCCCAGAAGGAAAGGTCCGGCCGGACCAGTACTCGCGATGAGGCGGACCGGC

    ( $name, $doc ) = split " ", $line, 2;    # split into two parts at first space
29 Unique IDs
A very common problem is to identify a set of unique IDs
  Homework 3 asks you to identify transcripts with the same ID but different isoform numbers
The solution relies on the fact that the keys in hashes must be unique, and on a special function, defined, that tells whether a scalar value has been defined
  (defined $scalar_value) is true if the variable has been assigned a value other than undef; declaring it with my is not enough
  Hash elements are scalar values; they are defined if they have been assigned a value
  (defined $hash{key}) is true if the hash element for key is defined
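The general idea of using hash keys and defined to collect unique IDs can be sketched as follows. The IDs are made up, and this is a generic illustration of the technique rather than the Homework 3 solution.

```perl
use strict;
use warnings;

# Hypothetical IDs of the form gene.isoform; g2.1 appears twice.
my @ids = qw( g1.1 g2.1 g2.2 g3.1 g2.1 );

my %seen;      # one key per distinct ID
my @unique;    # distinct IDs in order of first appearance
foreach my $id (@ids) {
    next if defined $seen{$id};    # already recorded, skip it
    $seen{$id} = 1;
    push @unique, $id;
}

print scalar(@unique), " unique IDs: @unique\n";
```

Keeping the separate @unique array preserves the input order, which the hash alone would not do.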
30 Practical Steps to Writing a Program
  Figure out what you want to do
  Write it as a series of comments
  Take one step at a time, testing while you go
  Focus on the loops first
    test to see the loops work
  Then add the logical tests
    test to see your logic is correct
  Only then add the "meat"
31 GFF3
Genome annotation file; the columns are
  1. Sequence
  2. Source
  3. Feature
  4. Begin
  5. End
  6. Score
  7. Strand
  8. Frame
  9. Comment

    4. repeat_region ?. description=dust
    4. repeat_region ?. description=dust
    4 ensembl protein_coding_gene ID=AT4G00060;description=Nucleotidyltransferase family protein;external_name=mee44;
    4 ensembl transcript ID=AT4G ;Parent=AT4G00060;biotype=protein_coding;logic_name=tair
    4. CDS ID=AT4G ;Parent=AT4G ;rank=1
    4. exon ID=AT4G00060-E.1;Parent=AT4G ;constitutive=1;rank=1
    4. CDS ID=AT4G ;Parent=AT4G ;rank=2
    4. exon ID=AT4G00060-E.2;Parent=AT4G ;constitutive=1;rank=2
    4. CDS ID=AT4G ;Parent=AT4G ;rank=3
    4. exon ID=AT4G00060-E.3;Parent=AT4G ;constitutive=1;rank=
    CDS ID=AT4G ;Parent=AT4G ;rank=15
    4. exon ID=AT4G00060-E.15;Parent=AT4G ;constitutive=1;ensembl_end_phase=-1;rank=15
    4. exon ID=AT4G00060-E.16;Parent=AT4G ;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;rank=16
    4. exon ID=AT4G00060-E.17;Parent=AT4G ;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;rank=17
    4. exon ID=AT4G00060-E.18;Parent=AT4G ;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;rank=18
    4 ensembl protein_coding_gene ID=AT4G00070; description=ring/u-box superfamily protein;
    4 ensembl transcript ID=AT4G ;Parent=AT4G
    exon ID=AT4G00070-E.5;Parent=AT4G ;constitutive=1;ensembl_end_phase=-1;rank=5
    4. CDS ID=AT4G ;Parent=AT4G ;rank=5
    4. CDS ID=AT4G ;Parent=AT4G ;rank=4
    4. exon ID=AT4G00070-E.4;Parent=AT4G ;constitutive=1;rank=4
32 Print out the beginning and ending coordinate of every transcript in the Arabidopsis genome based on the GFF3 file
  Read one line at a time
  Use split to get the columns
  Identify whether the feature is a transcript
  Split the comment to get the ID
  Print out the ID, begin, and end information
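The steps above can be sketched as follows. This is one possible arrangement, not the course's reference solution: the GFF3 lines are hypothetical (made-up IDs and coordinates, tab-delimited) and are held in an array so the sketch runs without an input file. In a real script the loop would read lines with <> instead.

```perl
use strict;
use warnings;

# Hypothetical tab-delimited GFF3 lines; IDs and coordinates are made up.
my @gff = (
    "4\tensembl\tprotein_coding_gene\t100\t900\t.\t+\t.\tID=geneA;description=example",
    "4\tensembl\ttranscript\t100\t900\t.\t+\t.\tID=geneA.1;Parent=geneA;biotype=protein_coding",
    "4\t.\texon\t100\t300\t.\t+\t.\tID=geneA-E.1;Parent=geneA.1;rank=1",
    "4\tensembl\ttranscript\t150\t900\t.\t+\t.\tID=geneA.2;Parent=geneA;biotype=protein_coding",
);

my @report;
foreach my $line (@gff) {
    chomp $line;

    # split on tabs to get the nine columns
    my ( $seq, $source, $feature, $begin, $end, $score, $strand, $frame, $comment ) =
        split "\t", $line;

    # keep only transcript features
    next unless $feature eq 'transcript';

    # the comment column is a ;-separated list of tag=value pairs
    my $id = '';
    foreach my $pair ( split ";", $comment ) {
        my ( $tag, $value ) = split "=", $pair, 2;
        $id = $value if $tag eq 'ID';
    }

    push @report, "$id\t$begin\t$end";
}

print "$_\n" foreach @report;
```

Following the advice on writing programs step by step, the loop and the split can be tested first by printing every feature, and the transcript test and ID extraction added afterwards.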
Indian Institute of Technology Kharagpur. PERL Part II. Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. I.I.T.
Indian Institute of Technology Kharagpur PERL Part II Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. I.I.T. Kharagpur, INDIA Lecture 22: PERL Part II On completion, the student will be able
More informationWeek 4. Week 4 Goals & Reading. Strict pragma P24H: Hour 8: Making a stricter Perl PP: Ch 6 (using the strict pragma)
Week 4 Week 4 Goals & Reading Strict pragma P24H: Hour 8: Making a stricter Perl PP: Ch 6 (using the strict pragma) Regular Expressions P24H: Hour 6, Hour 9: Transliteration PP: Ch10, Ch15 Special variables
More informationRegular expressions and case insensitivity
Regular expressions and case insensitivity As previously mentioned, you can make matching case insensitive with the i flag: /\b[uu][nn][ii][xx]\b/; /\bunix\b/i; # explicitly giving case folding # using
More informationRegular expressions and case insensitivity
Regular expressions and case insensitivity As previously mentioned, you can make matching case insensitive with the i flag: /\b[uu][nn][ii][xx]\b/; # explicitly giving case folding /\bunix\b/i; # using
More informationWhat is PERL?
Perl For Beginners What is PERL? Practical Extraction Reporting Language General-purpose programming language Creation of Larry Wall 1987 Maintained by a community of developers Free/Open Source www.cpan.org
More informationBioinformatics. Computational Methods II: Sequence Analysis with Perl. George Bell WIBR Biocomputing Group
Bioinformatics Computational Methods II: Sequence Analysis with Perl George Bell WIBR Biocomputing Group Sequence Analysis with Perl Introduction Input/output Variables Functions Control structures Arrays
More informationClassnote for COMS6100
Classnote for COMS6100 Yiting Wang 3 November, 2016 Today we learn about subroutines, references, anonymous and file I/O in Perl. 1 Subroutines in Perl First of all, we review the subroutines that we had
More information(Refer Slide Time: 01:12)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #22 PERL Part II We continue with our discussion on the Perl
More informationWeek Overview. Simple filter commands: head, tail, cut, sort, tr, wc grep utility stdin, stdout, stderr Redirection and piping /dev/null file
ULI101 Week 05 Week Overview Simple filter commands: head, tail, cut, sort, tr, wc grep utility stdin, stdout, stderr Redirection and piping /dev/null file head and tail commands These commands display
More informationCOMS 3101 Programming Languages: Perl. Lecture 2
COMS 3101 Programming Languages: Perl Lecture 2 Fall 2013 Instructor: Ilia Vovsha http://www.cs.columbia.edu/~vovsha/coms3101/perl Lecture Outline Control Flow (continued) Input / Output Subroutines Concepts:
More informationSequence Analysis with Perl. Unix, Perl and BioPerl. Why Perl? Objectives. A first Perl program. Perl Input/Output. II: Sequence Analysis with Perl
Sequence Analysis with Perl Unix, Perl and BioPerl II: Sequence Analysis with Perl George Bell, Ph.D. WIBR Bioinformatics and Research Computing Introduction Input/output Variables Functions Control structures
More informationA control expression must evaluate to a value that can be interpreted as true or false.
Control Statements Control Expressions A control expression must evaluate to a value that can be interpreted as true or false. How a control statement behaves depends on the value of its control expression.
More informationUnix, Perl and BioPerl
Unix, Perl and BioPerl II: Sequence Analysis with Perl George Bell, Ph.D. WIBR Bioinformatics and Research Computing Sequence Analysis with Perl Introduction Input/output Variables Functions Control structures
More informationScripting Languages Perl Basics. Course: Hebrew University
Scripting Languages Perl Basics Course: 67557 Hebrew University אליוט יפה Jaffe Lecturer: Elliot FMTEYEWTK Far More Than Everything You've Ever Wanted to Know Perl Pathologically Eclectic Rubbish Lister
More informationCS 230 Programming Languages
CS 230 Programming Languages 09 / 16 / 2013 Instructor: Michael Eckmann Today s Topics Questions/comments? Continue Syntax & Semantics Mini-pascal Attribute Grammars More Perl A more complex grammar Let's
More informationPerl for Biologists. Session 6 April 16, Files, directories and I/O operations. Jaroslaw Pillardy
Perl for Biologists Session 6 April 16, 2014 Files, directories and I/O operations Jaroslaw Pillardy Perl for Biologists 1.1 1 Reminder: What is a Hash? Array Hash Index Value Key Value 0 apple red fruit
More informationPerl for Biologists. Regular Expressions. Session 7. Jon Zhang. April 23, Session 7: Regular Expressions CBSU Perl for Biologists 1.
Perl for Biologists Session 7 April 23, 2014 Regular Expressions Jon Zhang Session 7: Regular Expressions CBSU Perl for Biologists 1.1 1 Review of Session 6 Each program has three default input/output
More informationsottotitolo A.A. 2016/17 Federico Reghenzani, Alessandro Barenghi
Titolo presentazione Piattaforme Software per la Rete sottotitolo BASH Scripting Milano, XX mese 20XX A.A. 2016/17, Alessandro Barenghi Outline 1) Introduction to BASH 2) Helper commands 3) Control Flow
More informationProgramming introduction part I:
Programming introduction part I: Perl, Unix/Linux and using the BlueHive cluster Bio472- Spring 2014 Amanda Larracuente Text editor Syntax coloring Recognize several languages Line numbers Free! Mac/Windows
More informationOutline. CS3157: Advanced Programming. Feedback from last class. Last plug
Outline CS3157: Advanced Programming Lecture #2 Jan 23 Shlomo Hershkop shlomo@cs.columbia.edu Feedback Introduction to Perl review and continued Intro to Regular expressions Reading Programming Perl pg
More informationIT441. Network Services Administration. Perl: File Handles
IT441 Network Services Administration Perl: File Handles Comment Blocks Perl normally treats lines beginning with a # as a comment. Get in the habit of including comments with your code. Put a comment
More informationPathologically Eclectic Rubbish Lister
Pathologically Eclectic Rubbish Lister 1 Perl Design Philosophy Author: Reuben Francis Cornel perl is an acronym for Practical Extraction and Report Language. But I guess the title is a rough translation
More informationIdentiyfing splice junctions from RNA-Seq data
Identiyfing splice junctions from RNA-Seq data Joseph K. Pickrell pickrell@uchicago.edu October 4, 2010 Contents 1 Motivation 2 2 Identification of potential junction-spanning reads 2 3 Calling splice
More informationChapter 3. Basics in Perl. 3.1 Variables and operations Scalars Strings
Chapter 3 Basics in Perl 3.1 Variables and operations 3.1.1 Scalars 2 $hello = "Hello World!"; 3 print $hello; $hello is a scalar variable. It represents an area in the memory where you can store data.
More informationA Crash Course in Perl5
z e e g e e s o f t w a r e A Crash Course in Perl5 Part 5: Data Zeegee Software Inc. http://www.zeegee.com/ Terms and Conditions These slides are Copyright 2008 by Zeegee Software Inc. They have been
More informationAdvanced Perl. Making complex structures
Advanced Perl Making complex structures When you are not sure that you are constructing a complex structure correctly, make a version by hand for testing Array of atom positions (x,y,z) @atom = ( [1.0,
More informationprint STDERR "This is a debugging message.\n";
NAME DESCRIPTION perlopentut - simple recipes for opening files and pipes in Perl Whenever you do I/O on a file in Perl, you do so through what in Perl is called a filehandle. A filehandle is an internal
More informationPerl. Interview Questions and Answers
and Answers Prepared by Abhisek Vyas Document Version 1.0 Team, www.sybaseblog.com 1 of 13 Q. How do you separate executable statements in perl? semi-colons separate executable statements Example: my(
More informationIntroduction to Perl. Perl Background. Sept 24, 2007 Class Meeting 6
Introduction to Perl Sept 24, 2007 Class Meeting 6 * Notes on Perl by Lenwood Heath, Virginia Tech 2004 Perl Background Practical Extraction and Report Language (Perl) Created by Larry Wall, mid-1980's
More informationEssential Skills for Bioinformatics: Unix/Linux
Essential Skills for Bioinformatics: Unix/Linux SHELL SCRIPTING Overview Bash, the shell we have used interactively in this course, is a full-fledged scripting language. Unlike Python, Bash is not a general-purpose
More informationProgramming Perls* Objective: To introduce students to the perl language.
Programming Perls* Objective: To introduce students to the perl language. Perl is a language for getting your job done. Making Easy Things Easy & Hard Things Possible Perl is a language for easily manipulating
More informationDATA STRUCTURES USING C
DATA STRUCTURES USING C File Handling in C Goals By the end of this unit you should understand how to open a file to write to it. how to open a file to read from it. how to open a file to append data to
More informationCSCI 4152/6509 Natural Language Processing. Perl Tutorial CSCI 4152/6509. CSCI 4152/6509, Perl Tutorial 1
CSCI 4152/6509 Natural Language Processing Perl Tutorial CSCI 4152/6509 Vlado Kešelj CSCI 4152/6509, Perl Tutorial 1 created in 1987 by Larry Wall About Perl interpreted language, with just-in-time semi-compilation
More informationLearning Perl 6. brian d foy, Version 0.6, Nordic Perl Workshop 2007
Learning Perl 6 brian d foy, Version 0.6, Nordic Perl Workshop 2007 for the purposes of this tutorial Perl 5 never existed Don t really do this $ ln -s /usr/local/bin/pugs /usr/bin/perl
More informationIntroduction to Perl. c Sanjiv K. Bhatia. Department of Mathematics & Computer Science University of Missouri St. Louis St.
Introduction to Perl c Sanjiv K. Bhatia Department of Mathematics & Computer Science University of Missouri St. Louis St. Louis, MO 63121 Contents 1 Introduction 1 2 Getting started 1 3 Writing Perl scripts
More informationReading and manipulating files
Reading and manipulating files Goals By the end of this lesson you will be able to Read files without using text editors Access specific parts of files Count the number of words and lines in a file Sort
More informationC Concepts - I/O. Lecture 19 COP 3014 Fall November 29, 2017
C Concepts - I/O Lecture 19 COP 3014 Fall 2017 November 29, 2017 C vs. C++: Some important differences C has been around since around 1970 (or before) C++ was based on the C language While C is not actually
More informationBIOS 546 Midterm March 26, Write the line of code that all Perl programs on biolinx must start with so they can be executed.
1. What values are false in Perl? BIOS 546 Midterm March 26, 2007 2. Write the line of code that all Perl programs on biolinx must start with so they can be executed. 3. How do you make a comment in Perl?
More informationCOMS 3101 Programming Languages: Perl. Lecture 1
COMS 3101 Programming Languages: Perl Lecture 1 Fall 2013 Instructor: Ilia Vovsha http://www.cs.columbia.edu/~vovsha/coms3101/perl What is Perl? Perl is a high level language initially developed as a scripting
More informationArrays (Lists) # or, = ("first string", "2nd string", 123);
Arrays (Lists) An array is a sequence of scalars, indexed by position (0,1,2,...) The whole array is denoted by @array Individual array elements are denoted by $array[index] $#array gives the index of
More informationIndian Institute of Technology Kharagpur. PERL Part III. Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. I.I.T.
Indian Institute of Technology Kharagpur PERL Part III Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. I.I.T. Kharagpur, INDIA Lecture 23: PERL Part III On completion, the student will be able
More information1. Introduction. 2. Scalar Data
1. Introduction What Does Perl Stand For? Why Did Larry Create Perl? Why Didn t Larry Just Use Some Other Language? Is Perl Easy or Hard? How Did Perl Get to Be So Popular? What s Happening with Perl Now?
More informationMiniproject 1. Part 1 Due: 16 February. The coverage problem. Method. Why it is hard. Data. Task1
Miniproject 1 Part 1 Due: 16 February The coverage problem given an assembled transcriptome (RNA) and a reference genome (DNA) 1. 2. what fraction (in bases) of the transcriptome sequences match to annotated
More informationThey grow as needed, and may be made to shrink. Officially, a Perl array is a variable whose value is a list.
Arrays Perl arrays store lists of scalar values, which may be of different types. They grow as needed, and may be made to shrink. Officially, a Perl array is a variable whose value is a list. A list literal