Lecture 2, Introduction to Python. Python Programming Language


BINF 3360, Introduction to Computational Biology
Lecture 2, Introduction to Python
Young-Rae Cho, Associate Professor, Department of Computer Science, Baylor University

Python Programming Language

Script Language
    General-purpose script language
    Broad applications (web, bioinformatics, network programming, graphics, software engineering)
Features
    Object-oriented
    Extension with modules
    Database integration
    Embeddable
    Web frameworks / Web modules

Getting Started

Download & Installation
    The most recent version: Python 3.6
    Baylor CS labs have: Python 3.2
    Baylor ECS servers have: Python 2.7
Edit & Run
    Create a file named test.py
    Edit the code
        # This is a test.
        dna = "ATCGATGA"
        print(dna + "\n")
    Run the code
        > python test.py

Primitives

Primitive Data Types
    Numbers or Strings
        num = 1234
        st = "1234"
        num_1 = num + int(st)
        st_1 = str(num) + st
    Substring
        dna1 = "ACGTGAACT"
        dna2 = dna1[0:4]
        length = len(dna2)
    Reversing
        dna1 = "ACGTGAACT"
        dna2 = dna1[::-1]

Lists

List Variables
    A list of comma-separated primitive values
        lst1 = ["A", "C", "G"]
        lst2 = ["T"]
        lst1 = lst1 + lst2
    Variable-length list
Insert, Delete, Append, Reverse, and Sort
        lst = ["A", "T", "G"]
        lst.insert(1, "C")
        del lst[2]
        lst.append("T")
        lst.extend(["A", "C"])
        lst.reverse()
        lst.sort()

        lst = ["A", "T", "G"]
        lst[1:2] = "C"
        lst[1:1] = "T"
        lst[2:3] = []
        lst[len(lst):len(lst)] = "T"
        lst[len(lst):len(lst)] = ["A", "C"]
        lst[::-1]

Sets

Set Variables
        DNAbases = {"A", "C", "G", "T"}
        RNAbases = {"A", "C", "G", "U"}
        DNAbases | RNAbases
        DNAbases & RNAbases
        DNAbases - RNAbases
Add and Remove
        bases = {"A", "D", "G"}
        bases.add("T")
        bases.remove("D")
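To make the effect of the list and set operations on this slide concrete, the following Python 3 sketch replays them and annotates the intermediate values. The function names are introduced here for illustration only. The empty list on the right-hand side of the slice deletion is an assumption, because the slide transcription lost that value.

```python
def replay_list_methods():
    """Apply the slide's list-method calls in order and return the final list."""
    lst = ["A", "T", "G"]
    lst.insert(1, "C")        # ['A', 'C', 'T', 'G']
    del lst[2]                # ['A', 'C', 'G']
    lst.append("T")           # ['A', 'C', 'G', 'T']
    lst.extend(["A", "C"])    # ['A', 'C', 'G', 'T', 'A', 'C']
    lst.reverse()             # ['C', 'A', 'T', 'G', 'C', 'A']
    lst.sort()                # ['A', 'A', 'C', 'C', 'G', 'T']
    return lst


def replay_slice_assignment():
    """Apply the slice-assignment equivalents; return the list and a reversed copy."""
    lst = ["A", "T", "G"]
    lst[1:2] = "C"                        # replace:  ['A', 'C', 'G']
    lst[1:1] = "T"                        # insert:   ['A', 'T', 'C', 'G']
    lst[2:3] = []                         # delete (assumed value): ['A', 'T', 'G']
    lst[len(lst):len(lst)] = "T"          # append:   ['A', 'T', 'G', 'T']
    lst[len(lst):len(lst)] = ["A", "C"]   # extend:   ['A', 'T', 'G', 'T', 'A', 'C']
    return lst, lst[::-1]                 # lst[::-1] builds a reversed copy


def set_operations():
    """Return the union, intersection, and difference of the two base sets."""
    dna_bases = {"A", "C", "G", "T"}
    rna_bases = {"A", "C", "G", "U"}
    return dna_bases | rna_bases, dna_bases & rna_bases, dna_bases - rna_bases


print(replay_list_methods())
print(replay_slice_assignment())
print(set_operations())
```

Note that lst[::-1] returns a new list and leaves lst unchanged, whereas lst.reverse() reverses the list in place.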

Dictionaries

Initialization
        d = {"key1": "value1", "key2": "value2", "key3": "value3"}

        d = dict()
        d["key1"] = "value1"
        k2, v2 = "key2", "value2"
        d[k2] = v2
Mapping
        d["key1"]
        d.get("key1")
        d.keys()
        d.values()
Delete
        del d["key1"]

Input / Output

Standard Input
        import sys
        data = sys.stdin.readline().replace("\n", "")
Reading Files
        name = "myfilename.txt"
        with open(name) as file:
            data = file.read()

        # Remove the trailing newline, otherwise open() looks for a file name ending in "\n"
        name = sys.stdin.readline().replace("\n", "")
        with open(name) as file:
            data = file.read()

        name = sys.argv[1]
        with open(name) as file:
            data = file.read()
Writing Files
        name = "output.txt"
        with open(name, "w") as file:
            file.write("ATCGATG")
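The dictionary and file operations above can be combined in one small round trip. This is a minimal Python 3 sketch, not part of the original slides: the helper names count_bases and write_then_read are hypothetical, and a temporary file stands in for output.txt so the example does not touch the working directory.

```python
import os
import tempfile


def count_bases(seq):
    """Count each character of seq, using d.get with a default of 0."""
    counts = dict()
    for base in seq:
        counts[base] = counts.get(base, 0) + 1
    return counts


def write_then_read(text):
    """Write text to a temporary file, read it back, and remove the file."""
    fd, name = tempfile.mkstemp(suffix=".txt")
    os.close(fd)  # reopen by name below, as on the slide
    try:
        with open(name, "w") as file:
            file.write(text)
        with open(name) as file:
            return file.read()
    finally:
        os.remove(name)


data = write_then_read("ATCGATG")
print(count_bases(data))
```

Unlike d["key1"], d.get("key1") returns None (or the supplied default) instead of raising KeyError when the key is missing, which is what makes the counting loop work without an explicit membership test.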

Functions

Types
    Built-in system functions
    User-defined functions
Defining Function
        def function_name(parameter_list):
            statement
            statement
            return value
Examples
    Function with no return / Function with multiple returns
        def printtext(text):
            print(text)

        def printtext(text):
            print(text)
            return None

        def firstsecondlast(text):
            return text[0:1], text[1:2], text[len(text)-1:len(text)]

Iteration

Iterative Process
        def find_max(lst):
            max_so_far = lst[0]
            for item in lst[1:]:
                if item > max_so_far:
                    max_so_far = item
            return max_so_far

        lst1 = [3, 5, 10, 4, 6]
        maximum = find_max(lst1)
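A function with multiple returns hands back a tuple, which the caller can unpack into separate names. The sketch below reuses the slide's firstsecondlast function in Python 3; the input strings are made up for illustration.

```python
def firstsecondlast(text):
    """Return the first, second, and last characters of text as a tuple."""
    return text[0:1], text[1:2], text[len(text)-1:len(text)]


# Unpack the returned tuple into three variables.
first, second, last = firstsecondlast("ACGT")
print(first, second, last)

# Slices never raise IndexError, so short inputs yield empty strings instead.
print(firstsecondlast("A"))
```

Using slices such as text[1:2] rather than indexes such as text[1] is what keeps the function safe for strings shorter than two characters.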

Recursion

Recursive Call
        def print_tree(tree, level):
            print(" " * 4 * level + tree[0])
            for subtree in tree[1:]:
                print_tree(subtree, level + 1)

        t1 = ["A", ["T", ["A"], ["T"]], ["G", ["G"], ["C"]]]
        print_tree(t1, 0)

Modules

Module
    A collection of functions
    A module is a Python (.py) file in a library directory
Module Call
        import random
        seq = 'ATCGATAGCTA'
        random_base = seq[random.randint(0, len(seq)-1)]

        from random import *
        seq = 'ATCGATAGCTA'
        random_base = seq[randint(0, len(seq)-1)]
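To see what the recursive call produces, the sketch below is a variant of print_tree that returns the indented lines instead of printing them, which makes the result easy to inspect. The name tree_lines is hypothetical; the tree layout (a label followed by zero or more subtrees) is the one used on the slide.

```python
def tree_lines(tree, level=0):
    """Return one indented line per node, visiting the tree depth-first."""
    lines = [" " * 4 * level + tree[0]]   # the node's own label
    for subtree in tree[1:]:              # every remaining element is a subtree
        lines.extend(tree_lines(subtree, level + 1))
    return lines


t1 = ["A", ["T", ["A"], ["T"]], ["G", ["G"], ["C"]]]
print("\n".join(tree_lines(t1)))
```

Each level of recursion adds four spaces of indentation, so the root "A" is flush left, its children "T" and "G" are indented once, and the four leaves are indented twice.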

Regular Expressions (1)

Special Languages
    Metacharacters (characters having special meanings): . (any character), \n, \t, \s (whitespace), \w (any alphabetic or numeric character, or underscore), \W, \d (decimal digit)
    Quantifiers
        e.g., ct.*g, ct.+g, ct.?g, ct{2}g, ct{2,5}g
    Grouping and back-reference
        e.g., (.)(.)aa\1\2
    Alternatives
        e.g., (ct|ca)
    Character set
        e.g., [acgt], [a-zA-Z]
    Anchors: ^ (the start of the string), $ (the end of the string)
        e.g., ^tata, aa$

Regular Expressions (2)

Usage
    search: searches for the first match of the pattern in a string, and returns it as a MatchObject instance whose start() gives the position
        import re
        pos = re.search("TATA.*AA", seq)
        print(pos.start())
    findall: searches for all matches of the pattern in a string, and returns a list of the matches
        import re
        matches = re.findall("TATA.*AA", seq)
        print(matches)
    finditer: searches for all matches of the pattern in a string, and returns an iterator that yields a MatchObject instance for each match
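The slide describes finditer without an example, so here is a minimal Python 3 sketch of it, together with a guarded version of search. The helper names and the sequences are made up for illustration. re.search returns None when nothing matches, so calling start() on the result without a check would raise AttributeError.

```python
import re


def find_matches(pattern, seq):
    """Return (start, matched_text) for every non-overlapping match."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, seq)]


def first_match_start(pattern, seq):
    """Return the start of the first match, or None when nothing matches."""
    match = re.search(pattern, seq)
    return match.start() if match else None


seq = "GGTATAAACCTATAGG"  # made-up sequence for illustration
print(find_matches("TATA", seq))

# Back-reference: two characters, then "aa", then the same two characters again.
print(first_match_start(r"(.)(.)aa\1\2", "ggctaactgg"))

# No match: the guard returns None instead of raising.
print(first_match_start("TATA", "CCCC"))
```

The raw-string prefix r"..." keeps Python from interpreting \1 and \2 as escape sequences before the pattern reaches the re module.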

Biological Applications

    Parsing Sequences
    Sequence Validation
    Motif Search
    Sequence Transformation
    DNA Replication
    Transcription from DNA to RNA
    Translating RNA into Protein
    DNA Sequence Mutation

Parsing Sequences (1)

Single Sequence in FASTA Format
        >gi gb AAD cytochrome b
        LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIP
        YIGTNLVEWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDK
        IPFHPYYTIKDFLGLLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRS
        VPNKLGGVLALFLSIVILGLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYP
        YTIIGQMASILYFSIILAFLPIAGXIENY
Parsing
    Make a function to return the sequence from the FASTA format
        def read_fasta_seq(filename):
            with open(filename) as f:
                return f.read().partition("\n")[2].replace("\n", "")

Parsing Sequences (2)

Multiple Sequences in FASTA Format
        >SEQUENCE_1
        MTEITAAMVKELRESTGAGMMDCKNALSETNGDFDKAVQLLREKGLGKAAKKADRLAAEG
        LVSVKVSDDFTIAAMRPSYLSYEDLDMTFVENEYKALVAELEKENEERRRLKDPNKPEHKIP
        QFASRKQLSDAILKEAEEKIKEELKAQGKPEKIWDNIIPGKMNSFIADNSQLDSKLTLMGQFY
        VMDDKKTVEQVIAEKEKEFGGKIKIVEFICFEVGEGLEKKTEDFAAEVAAQL
        >SEQUENCE_2
        SATVSEINSETDFVAKNDQFIALTKDTTAHIQSNSLQSVEELHSSTINGVKFEEYLKSQIATIGE
        NLVVRRFATLKAGANGVVNGYIHTNGRVGVVIAAACDSAEVASKSRDLLRQICMH
Parsing?

Sequence Validation (1)

DNA Sequence Validation
    Make a function to check that the sequence consists of A, T, C, and G only
        def validate_dna(base_sequence):
            seq = base_sequence.upper()
            for base in seq:
                if base not in "ACGT":
                    return False
            return True

        def validate_dna(base_sequence):
            seq = base_sequence.upper()
            return len(seq) == (seq.count("T") + seq.count("C") + seq.count("A") + seq.count("G"))
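The slide leaves the parsing of a multi-sequence FASTA file as an open question. One possible answer, sketched in Python 3 under the assumption that each record starts with a ">" header line and that sequence lines may wrap, is shown here. The function name parse_fasta and the short sample text are hypothetical.

```python
def parse_fasta(text):
    """Map each header (without the leading '>') to its joined sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                   # skip blank lines
        if line.startswith(">"):
            name = line[1:]            # start a new record
            records[name] = ""
        elif name is not None:
            records[name] += line      # append a wrapped sequence line
    return records


sample = ">SEQUENCE_1\nMTEITAAM\nVKELRE\n>SEQUENCE_2\nSATVSEIN\n"
for header, sequence in parse_fasta(sample).items():
    print(header, sequence)
```

To parse a file, the text can be read as on the earlier slides, for example with open(filename) as f: records = parse_fasta(f.read()). Lines that appear before the first header are ignored in this sketch.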

Sequence Validation (2)

Counting Base Frequency
    Make a function to calculate the percent of C and G in a DNA sequence
        def percent_of_gc(base_sequence):
            seq = base_sequence.upper()
            count = 0
            for base in seq:
                if base in "CG":
                    count += 1
            # Fraction between 0 and 1; multiply by 100 for a percentage
            return float(count) / len(seq)

        def percent_of_gc(base_sequence):
            seq = base_sequence.upper()
            return float(seq.count("G") + seq.count("C")) / len(seq)

Motif Search

Searching Substring
    Make a function to take a sequence and a motif and return the position(s) of matching in the sequence
        def motif_search(seq, motif):
            return seq.find(motif)

        def all_motif_search(seq, motif):
            pos = []
            idx = seq.find(motif)
            if idx < 0:
                # No match at all: return an empty list rather than [-1]
                return pos
            pos.append(idx)
            seq = seq.partition(motif)[2]
            # >= 0 so that a match directly after the previous one is not missed
            while seq.find(motif) >= 0:
                idx += seq.find(motif) + len(motif)
                pos.append(idx)
                seq = seq.partition(motif)[2]
            return pos
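The partition-based search on this slide reports non-overlapping matches only, because each step discards everything up to the end of the previous match. When overlapping occurrences matter, as they often do for short motifs, one alternative is to restart the search one position after each hit using the optional start argument of str.find. This is a Python 3 sketch of that alternative, not the slide's method; the name all_motif_positions is hypothetical.

```python
def all_motif_positions(seq, motif):
    """Return the start index of every occurrence of motif, overlaps included."""
    positions = []
    if not motif:
        return positions               # an empty motif has no meaningful position
    idx = seq.find(motif)
    while idx != -1:
        positions.append(idx)
        idx = seq.find(motif, idx + 1)  # resume one character after the last hit
    return positions


print(all_motif_positions("ATATAT", "ATA"))
print(all_motif_positions("ACGT", "TT"))
```

For the sequence "ATATAT" and the motif "ATA", this version reports both positions 0 and 2, while a non-overlapping search reports position 0 only.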

Transcription

Simulating Transcription
    Make a function to transcribe a DNA into an RNA
        def transcription(dna):
            return dna.replace("T", "U")

Translation (1)

Making Genetic Code
    Make a function to translate a codon to an amino acid
        def codon2aa(codon):
            # Only the first three entries are shown on the slide
            genetic_code = {"UUU": "F", "UUC": "F", "UUA": "L", }
            if codon in genetic_code.keys():
                return genetic_code[codon]
            else:
                return "Error"

Translation (2)

Simulating Translation
    Make a function to translate an RNA into a protein sequence
        def translation(rna):
            protein = ""
            for n in range(0, len(rna), 3):
                protein += codon2aa(rna[n:n+3])
            return protein

Translation (3)

Simulating Translation (cont.)
    Make a generator function which returns values from a series it computes
        def aa_generator(rna):
            return (codon2aa(rna[n:n+3]) for n in range(0, len(rna), 3))

        def translation(rna):
            gen = aa_generator(rna)
            protein = ""
            # next(gen, None) returns None at the end instead of raising StopIteration
            aa = next(gen, None)
            while aa:
                protein += aa
                aa = next(gen, None)
            return protein
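The translation slides show only three entries of the genetic code. The following Python 3 sketch fills in the rest under several stated assumptions that are not in the slides: it uses the standard genetic code, marks stop codons with "*", stops translating at the first stop codon, and ignores a trailing incomplete codon. The names BASES, AMINO_ACIDS, and GENETIC_CODE are introduced here for illustration.

```python
from itertools import product

# Standard genetic code, listed with the bases in the order U, C, A, G
# (third base varies fastest). "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def aa_generator(rna):
    """Yield one amino acid per complete codon, reading from position 0."""
    usable = len(rna) - len(rna) % 3   # drop a trailing incomplete codon
    for n in range(0, usable, 3):
        yield GENETIC_CODE[rna[n:n + 3]]


def translation(rna):
    """Translate rna into a protein string, stopping at the first stop codon."""
    protein = ""
    for aa in aa_generator(rna):       # a for loop consumes the generator safely
        if aa == "*":
            break
        protein += aa
    return protein


print(translation("AUGUUUUAA"))
```

Iterating with a for loop avoids explicit next() calls, so there is no StopIteration to handle. A codon containing a character other than U, C, A, or G raises KeyError in this sketch; the slide's codon2aa returns an error marker instead.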

Mutation

Simulating Mutation
    Make a function to simulate single point mutations in a DNA sequence
        import random

        def mutation(dna):
            position = random.randint(0, len(dna)-1)
            bases = "ACGT"
            new_base = bases[random.randint(0, 3)]
            # Strings are immutable, so build a new string instead of assigning to a slice
            dna = dna[:position] + new_base + dna[position+1:]
            return dna

    Variant that excludes the current base when choosing the new one:
            bases = bases.replace(dna[position], "")
            new_base = bases[random.randint(0, 2)]

Questions?

Lecture Slides are found on the Course Website, web.ecs.baylor.edu/faculty/cho/
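For reference, the mutation function from the final slide and its variant can be combined into one runnable Python 3 function. This is a sketch under the assumption that the sequence is a non-empty string; it always changes exactly one position, because the current base is removed from the candidates before the new base is drawn.

```python
import random


def mutation(dna):
    """Return a copy of dna with exactly one position changed to another base."""
    position = random.randint(0, len(dna) - 1)        # randint includes both ends
    candidates = "ACGT".replace(dna[position], "")    # exclude the current base
    new_base = candidates[random.randint(0, len(candidates) - 1)]
    # Strings are immutable, so assemble the mutated sequence from slices.
    return dna[:position] + new_base + dna[position + 1:]


original = "ACGTACGT"
mutated = mutation(original)
print(original)
print(mutated)
```

Indexing with len(candidates) - 1 rather than a fixed 2 keeps the function working when the chosen position holds a character outside A, C, G, T (for example N), in which case all four bases remain candidates. An empty sequence raises ValueError from random.randint.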

Welcome to. Python 2. Session #5. Michael Purcaro, Chris MacKay, Nick Hathaway, and the GSBS Bootstrappers February 2014

Welcome to. Python 2. Session #5. Michael Purcaro, Chris MacKay, Nick Hathaway, and the GSBS Bootstrappers February 2014 Welcome to Python 2 Session #5 Michael Purcaro, Chris MacKay, Nick Hathaway, and the GSBS Bootstrappers February 2014 michael.purcaro@umassmed.edu 1 Building Blocks: modules To more easily reuse code,

More information

CIS192: Python Programming Data Types & Comprehensions Harry Smith University of Pennsylvania September 6, 2017 Harry Smith (University of Pennsylvani

CIS192: Python Programming Data Types & Comprehensions Harry Smith University of Pennsylvania September 6, 2017 Harry Smith (University of Pennsylvani CIS192: Python Programming Data Types & Comprehensions Harry Smith University of Pennsylvania September 6, 2017 Harry Smith (University of Pennsylvania) CIS 192 Fall Lecture 2 September 6, 2017 1 / 34

More information

CIS192 Python Programming

CIS192 Python Programming CIS192 Python Programming Data Types Joseph Cappadona University of Pennsylvania September 03, 2015 Joseph Cappadona (University of Pennsylvania) CIS 192 September 03, 2015 1 / 32 Outline 1 Data Types

More information

CSC148 Fall 2017 Ramp Up Session Reference

CSC148 Fall 2017 Ramp Up Session Reference Short Python function/method descriptions: builtins : input([prompt]) -> str Read a string from standard input. The trailing newline is stripped. The prompt string, if given, is printed without a trailing

More information

Dictionaries, Functions 1 / 16

Dictionaries, Functions 1 / 16 Dictionaries, Functions 1 / 16 Lists and Array Reminders To create a list of items, use the [ ] genes = ['SOD1','CDC11','YFG1'] print(genes) print(genes[1]) print(genes[1:]) # everything after slot 1 (incl

More information

Computational Molecular Biology

Computational Molecular Biology Computational Molecular Biology Erwin M. Bakker Lecture 3, mainly from material by R. Shamir [2] and H.J. Hoogeboom [4]. 1 Pairwise Sequence Alignment Biological Motivation Algorithmic Aspect Recursive

More information

Question 1. Part (a) Part (b) December 2013 Final Examination Marking Scheme CSC 108 H1F. [13 marks] [4 marks] Consider this program:

Question 1. Part (a) Part (b) December 2013 Final Examination Marking Scheme CSC 108 H1F. [13 marks] [4 marks] Consider this program: Question 1. Part (a) [4 marks] Consider this program: [13 marks] def square(x): (number) -> number Write what this program prints, one line per box. There are more boxes than you need; leave unused ones

More information

Assignment 6: Motif Finding Bio5488 2/24/17. Slide Credits: Nicole Rockweiler

Assignment 6: Motif Finding Bio5488 2/24/17. Slide Credits: Nicole Rockweiler Assignment 6: Motif Finding Bio5488 2/24/17 Slide Credits: Nicole Rockweiler Assignment 6: Motif finding Input Promoter sequences PWMs of DNA-binding proteins Goal Find putative binding sites in the sequences

More information

Files. Reading from a file

Files. Reading from a file Files We often need to read data from files and write data to files within a Python program. The most common type of files you'll encounter in computational biology, are text files. Text files contain

More information

Finding Hidden Patterns in DNA. What makes searching for frequent subsequences hard? Allowing for errors? All the places they could be hiding?

Finding Hidden Patterns in DNA. What makes searching for frequent subsequences hard? Allowing for errors? All the places they could be hiding? Finding Hidden Patterns in DNA What makes searching for frequent subsequences hard? Allowing for errors? All the places they could be hiding? 1 Initiating Transcription As a precursor to transcription

More information

Working with files. File Reading and Writing. Reading and writing. Opening a file

Working with files. File Reading and Writing. Reading and writing. Opening a file Working with files File Reading and Writing Reading get info into your program Parsing processing file contents Writing get info out of your program MBV-INFx410 Fall 2014 Reading and writing Three-step

More information

PLEASE HAND IN UNIVERSITY OF TORONTO Faculty of Arts and Science

PLEASE HAND IN UNIVERSITY OF TORONTO Faculty of Arts and Science PLEASE HAND IN UNIVERSITY OF TORONTO Faculty of Arts and Science DECEMBER 2013 EXAMINATIONS CSC 108 H1F Instructors: Craig and Gries Duration 3 hours PLEASE HAND IN Examination Aids: None Student Number:

More information

Algorithmic Approaches for Biological Data, Lecture #8

Algorithmic Approaches for Biological Data, Lecture #8 Algorithmic Approaches for Biological Data, Lecture #8 Katherine St. John City University of New York American Museum of Natural History 17 February 2016 Outline More on Pattern Finding: Regular Expressions

More information

One minute responses. Not really sure how the loop, for, while, and zip work. I just need more practice problems to work on.

One minute responses. Not really sure how the loop, for, while, and zip work. I just need more practice problems to work on. One minute responses Not really sure how the loop, for, while, and zip work. I just need more practice problems to work on. More practice problems please! Comparing the dierent loops claried how they work.

More information

Managing Your Biological Data with Python

Managing Your Biological Data with Python Chapman & Hall/CRC Mathematical and Computational Biology Series Managing Your Biological Data with Python Ailegra Via Kristian Rother Anna Tramontano CRC Press Taylor & Francis Group Boca Raton London

More information

DNA Sequencing. Overview

DNA Sequencing. Overview BINF 3350, Genomics and Bioinformatics DNA Sequencing Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Backgrounds Eulerian Cycles Problem Hamiltonian Cycles

More information

Introduction to Computer Programming for Non-Majors

Introduction to Computer Programming for Non-Majors Introduction to Computer Programming for Non-Majors CSC 2301, Fall 2016 Chapter 11 Part 1 Instructor: Long Ma The Department of Computer Science Chapter 11 Data Collections Objectives: To understand the

More information

Regular Expressions. Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein

Regular Expressions. Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Regular Expressions Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein A quick review: The super Date class class Date: def init (self, day, month): self.day = day self.month

More information

Python review. 1 Python basics. References. CS 234 Naomi Nishimura

Python review. 1 Python basics. References. CS 234 Naomi Nishimura Python review CS 234 Naomi Nishimura The sections below indicate Python material, the degree to which it will be used in the course, and various resources you can use to review the material. You are not

More information

Overview.

Overview. Overview day one 0. getting set up 1. text output and manipulation day two 2. reading and writing files 3. lists and loops day three 4. writing functions 5. conditional statements day four today day six

More information

Lecture 18: Lists II. CS1068+ Introductory Programming in Python. Dr Kieran T. Herley 2018/19. Department of Computer Science University College Cork

Lecture 18: Lists II. CS1068+ Introductory Programming in Python. Dr Kieran T. Herley 2018/19. Department of Computer Science University College Cork Lecture 18: Lists II CS1068+ Introductory Programming in Python Dr Kieran T. Herley 2018/19 Department of Computer Science University College Cork Summary More on Python s lists. Sorting and reversing.

More information

Introduction to Computer Programming for Non-Majors

Introduction to Computer Programming for Non-Majors Introduction to Computer Programming for Non-Majors CSC 2301, Fall 2015 Chapter 11 Part 1 The Department of Computer Science Objectives Chapter 11 Data Collections To understand the use of lists (arrays)

More information

Programming Applications. What is Computer Programming?

Programming Applications. What is Computer Programming? Programming Applications What is Computer Programming? An algorithm is a series of steps for solving a problem A programming language is a way to express our algorithm to a computer Programming is the

More information

Overview.

Overview. Overview day one 0. getting set up 1. text output and manipulation day two 2. reading and writing files 3. lists and loops today 4. writing functions 5. conditional statements day four day five day six

More information

Script language: Python Data and files

Script language: Python Data and files Script language: Python Data and files Cédric Saule Technische Fakultät Universität Bielefeld 4. Februar 2015 Python User inputs, user outputs Command line parameters, inputs and outputs of user data.

More information

Important Example: Gene Sequence Matching. Corrigiendum. Central Dogma of Modern Biology. Genetics. How Nucleotides code for Amino Acids

Important Example: Gene Sequence Matching. Corrigiendum. Central Dogma of Modern Biology. Genetics. How Nucleotides code for Amino Acids Important Example: Gene Sequence Matching Century of Biology Two views of computer science s relationship to biology: Bioinformatics: computational methods to help discover new biology from lots of data

More information

(DNA#): Molecular Biology Computation Language Proposal

(DNA#): Molecular Biology Computation Language Proposal (DNA#): Molecular Biology Computation Language Proposal Aalhad Patankar, Min Fan, Nan Yu, Oriana Fuentes, Stan Peceny {ap3536, mf3084, ny2263, oif2102, skp2140} @columbia.edu Motivation Inspired by the

More information

Computational Theory MAT542 (Computational Methods in Genomics) - Part 2 & 3 -

Computational Theory MAT542 (Computational Methods in Genomics) - Part 2 & 3 - Computational Theory MAT542 (Computational Methods in Genomics) - Part 2 & 3 - Benjamin King Mount Desert Island Biological Laboratory bking@mdibl.org Overview of 4 Lectures Introduction to Computation

More information

Scientific Programming Practical 10

Scientific Programming Practical 10 Scientific Programming Practical 10 Introduction Luca Bianco - Academic Year 2017-18 luca.bianco@fmach.it Biopython FROM Biopython s website: The Biopython Project is an international association of developers

More information

Beginning Perl for Bioinformatics. Steven Nevers Bioinformatics Research Group Brigham Young University

Beginning Perl for Bioinformatics. Steven Nevers Bioinformatics Research Group Brigham Young University Beginning Perl for Bioinformatics Steven Nevers Bioinformatics Research Group Brigham Young University Why Use Perl? Interpreted language (quick to program) Easy to learn compared to most languages Designed

More information

CS 61A Interpreters, Tail Calls, Macros, Streams, Iterators. Spring 2019 Guerrilla Section 5: April 20, Interpreters.

CS 61A Interpreters, Tail Calls, Macros, Streams, Iterators. Spring 2019 Guerrilla Section 5: April 20, Interpreters. CS 61A Spring 2019 Guerrilla Section 5: April 20, 2019 1 Interpreters 1.1 Determine the number of calls to scheme eval and the number of calls to scheme apply for the following expressions. > (+ 1 2) 3

More information

Guide to Programming with Python. Algorithms & Computer programs. Hello World

Guide to Programming with Python. Algorithms & Computer programs. Hello World Guide to Programming with Python Yuzhen Ye (yye@indiana.edu) School of Informatics and Computing, IUB Objectives Python basics How to run a python program How to write a python program Variables Basic

More information

Introduction to Python

Introduction to Python Introduction to Python Reading assignment: Perkovic text, Ch. 1 and 2.1-2.5 Python Python is an interactive language. Java or C++: compile, run Also, a main function or method Python: type expressions

More information

Outline: Data collections (Ch11)

Outline: Data collections (Ch11) Data collections Michael Mandel Lecture 9 Methods in Computational Linguistics I The City University of New York, Graduate Center https://github.com/ling78100/lectureexamples/blob/master/lecture09final.ipynb

More information

GeneR. JORGE ARTURO ZEPEDA MARTINEZ LOPEZ HERNANDEZ JOSE FABRICIO. October 6, 2009

GeneR. JORGE ARTURO ZEPEDA MARTINEZ LOPEZ HERNANDEZ JOSE FABRICIO.  October 6, 2009 GeneR JORGE ARTURO ZEPEDA MARTINEZ LOPEZ HERNANDEZ JOSE FABRICIO. jzepeda@lcg.unam.mx jlopez@lcg.unam.mx October 6, 2009 Abstract GeneR packages allow direct use of nucleotide sequences within R software.

More information

Regular Expressions. Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein

Regular Expressions. Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Regular Expressions Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein A quick review: The super Date class class Date: def init (self, day, month): self.day = day self.month

More information

Algorithmic Approaches for Biological Data, Lecture #7

Algorithmic Approaches for Biological Data, Lecture #7 Algorithmic Approaches for Biological Data, Lecture #7 Katherine St. John City University of New York American Museum of Natural History 10 February 2016 Outline Patterns in Strings Recap: Files in and

More information

Gene Clustering & Classification

Gene Clustering & Classification BINF, Introduction to Computational Biology Gene Clustering & Classification Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Introduction to Gene Clustering

More information

Shell / Python Tutorial. CS279 Autumn 2017 Rishi Bedi

Shell / Python Tutorial. CS279 Autumn 2017 Rishi Bedi Shell / Python Tutorial CS279 Autumn 2017 Rishi Bedi Shell (== console, == terminal, == command prompt) You might also hear it called bash, which is the most widely used shell program macos Windows 10+

More information

Welcome to. Python 2. Session #2. Michael Purcaro & The GSBS Bootstrappers February 2014

Welcome to. Python 2. Session #2. Michael Purcaro & The GSBS Bootstrappers February 2014 Welcome to Python 2 Session #2 Michael Purcaro & The GSBS Bootstrappers February 2014 michael.purcaro@umassmed.edu 1 Extended Exercise 1 Goal: count how many signal peaks are present in processed ENCODE

More information

Delayed Expressions Fall 2017 Discussion 9: November 8, Iterables and Iterators. For Loops. Other Iterable Uses

Delayed Expressions Fall 2017 Discussion 9: November 8, Iterables and Iterators. For Loops. Other Iterable Uses CS 6A Delayed Expressions Fall 07 Discussion 9: November 8, 07 Iterables and Iterators An iterable is any container that can be processed sequentially. Examples include lists, tuples, strings, and dictionaries.

More information

(CC)A-NC 2.5 by Randall Munroe Python

(CC)A-NC 2.5 by Randall Munroe Python http://xkcd.com/353/ (CC)A-NC 2.5 by Randall Munroe Python Python: Operative Keywords Very high level language Language design is focused on readability Mulit-paradigm Mix of OO, imperative, and functional

More information

Built-in functions. You ve used several functions already. >>> len("atggtca") 7 >>> abs(-6) 6 >>> float("3.1415") >>>

Built-in functions. You ve used several functions already. >>> len(atggtca) 7 >>> abs(-6) 6 >>> float(3.1415) >>> Functions Built-in functions You ve used several functions already len("atggtca") 7 abs(-6) 6 float("3.1415") 3.1415000000000002 What are functions? A function is a code block with a name def hello():

More information

Motif Discovery using optimized Suffix Tries

Motif Discovery using optimized Suffix Tries Motif Discovery using optimized Suffix Tries Sergio Prado Promoter: Prof. dr. ir. Jan Fostier Supervisor: ir. Dieter De Witte Faculty of Engineering and Architecture Department of Information Technology

More information

Introduction to Python Part I

Introduction to Python Part I Introduction to Python Part I BaRC Hot Topics Bioinformatics and Research Computing Whitehead Institute Nov 29th 2018 http://barc.wi.mit.edu/hot_topics/ 1 About Python Object oriented language; easy to

More information

Regexp. Lecture 26: Regular Expressions

Regexp. Lecture 26: Regular Expressions Regexp Lecture 26: Regular Expressions Regular expressions are a small programming language over strings Regex or regexp are not unique to Python They let us to succinctly and compactly represent classes

More information

Lecture 3, Review of Algorithms. What is Algorithm?

Lecture 3, Review of Algorithms. What is Algorithm? BINF 336, Introduction to Computational Biology Lecture 3, Review of Algorithms Young-Rae Cho Associate Professor Department of Computer Science Baylor University What is Algorithm? Definition A process

More information

Lecture 2 Pairwise sequence alignment. Principles Computational Biology Teresa Przytycka, PhD

Lecture 2 Pairwise sequence alignment. Principles Computational Biology Teresa Przytycka, PhD Lecture 2 Pairwise sequence alignment. Principles Computational Biology Teresa Przytycka, PhD Assumptions: Biological sequences evolved by evolution. Micro scale changes: For short sequences (e.g. one

More information

Algorithms for Bioinformatics

Algorithms for Bioinformatics 582670 Algorithms for Bioinformatics Lecture 1: Primer to algorithms and molecular biology 4.9.2012 Course format Thu 12-14 Thu 10-12 Tue 12-14 Grading Exam 48 points Exercises 12 points 30% = 1 85% =

More information

MUTABLE LISTS AND DICTIONARIES 4

MUTABLE LISTS AND DICTIONARIES 4 MUTABLE LISTS AND DICTIONARIES 4 COMPUTER SCIENCE 61A Sept. 24, 2012 1 Lists Lists are similar to tuples: the order of the data matters, their indices start at 0. The big difference is that lists are mutable

More information

Computational Genomics and Molecular Biology, Fall

Computational Genomics and Molecular Biology, Fall Computational Genomics and Molecular Biology, Fall 2015 1 Sequence Alignment Dannie Durand Pairwise Sequence Alignment The goal of pairwise sequence alignment is to establish a correspondence between the

More information

Tutorial 4 BLAST Searching the CHO Genome

Tutorial 4 BLAST Searching the CHO Genome Tutorial 4 BLAST Searching the CHO Genome Accessing the CHO Genome BLAST Tool The CHO BLAST server can be accessed by clicking on the BLAST button on the home page or by selecting BLAST from the menu bar

More information

6. Data Types and Dynamic Typing (Cont.)

6. Data Types and Dynamic Typing (Cont.) 6. Data Types and Dynamic Typing (Cont.) 6.5 Strings Strings can be delimited by a pair of single quotes ('...'), double quotes ("..."), triple single quotes ('''...'''), or triple double quotes ("""...""").

More information

APT Session 2: Python

APT Session 2: Python APT Session 2: Python Laurence Tratt Software Development Team 2017-10-20 1 / 17 http://soft-dev.org/ What to expect from this session: Python 1 What is Python? 2 Basic Python functionality. 2 / 17 http://soft-dev.org/

More information

Algorithmic Thinking: Computing with Lists

Algorithmic Thinking: Computing with Lists Algorithmic Thinking: Computing with Lists So Far in Python Data types: int, float, Boolean, string Assignments, function definitions Control structures: For loops, while loops, conditionals Last Lecture

More information

Biopython. Karin Lagesen.

Biopython. Karin Lagesen. Biopython Karin Lagesen karin.lagesen@bio.uio.no Object oriented programming Biopython is object-oriented Some knowledge helps understand how biopython works OOP is a way of organizing data and methods

More information

Working with files. File Reading and Writing. Reading and writing. Opening a file

Working with files. File Reading and Writing. Reading and writing. Opening a file Working with files File Reading and Writing Reading get info into your program Parsing processing file contents Writing get info out of your program MBV-INFx410 Fall 2015 Reading and writing Three-step

More information

CS 112: Intro to Comp Prog

CS 112: Intro to Comp Prog CS 112: Intro to Comp Prog Importing modules Branching Loops Program Planning Arithmetic Program Lab Assignment #2 Upcoming Assignment #1 Solution CODE: # lab1.py # Student Name: John Noname # Assignment:

More information

This quiz is open book and open notes, but do not use a computer.

This quiz is open book and open notes, but do not use a computer. 1. /15 2. /10 3. /10 4. /18 5. /8 6. /13 7. /15 8. /9 9. /1 10. /1 Total /100 This quiz is open book and open notes, but do not use a computer. Please write your name on the top of each page. Answer all

More information

CS 331/401 Summer 2018 Midterm Exam

CS 331/401 Summer 2018 Midterm Exam CS 331/401 Summer 2018 Midterm Exam Instructions: This exam is closed-book, closed-notes. Computers of any kind are not permitted. For numbered, multiple-choice questions, fill your answer in the corresponding

More information

Modularization. Functions and Modules. Functions. Functions how to define

Modularization. Functions and Modules. Functions. Functions how to define Modularization Functions and Modules MBV-INFx410 Fall 2015 Programs can get big Risk of doing the same thing many times Functions and modules encourage - re-usability - readability - helps with maintenance

More information

Sequences and iteration in Python

Sequences and iteration in Python GC3: Grid Computing Competence Center Sequences and iteration in Python GC3: Grid Computing Competence Center, University of Zurich Sep. 11 12, 2013 Sequences Python provides a few built-in sequence classes:

More information

CSI33 Data Structures

CSI33 Data Structures Outline Department of Mathematics and Computer Science Bronx Community College September 25, 2017 Outline Outline 1 Chapter 4: Linked Structures and Chapter 4: Linked Structures and Outline 1 Chapter 4:

More information

CS 135 Winter 2018 Tutorial 7: Accumulative Recursion and Binary Trees. CS 135 Winter 2018 Tutorial 7: Accumulative Recursion and Binary Trees 1

CS 135 Winter 2018 Tutorial 7: Accumulative Recursion and Binary Trees. CS 135 Winter 2018 Tutorial 7: Accumulative Recursion and Binary Trees 1 CS 135 Winter 2018 Tutorial 7: Accumulative Recursion and Binary Trees CS 135 Winter 2018 Tutorial 7: Accumulative Recursion and Binary Trees 1 Goals of this tutorial You should be able to... understand

More information

: Intro Programming for Scientists and Engineers Assignment 3: Molecular Biology

: Intro Programming for Scientists and Engineers Assignment 3: Molecular Biology Assignment 3: Molecular Biology Page 1 600.112: Intro Programming for Scientists and Engineers Assignment 3: Molecular Biology Peter H. Fröhlich phf@cs.jhu.edu Joanne Selinski joanne@cs.jhu.edu Due Dates:

More information

Lecture 5: Markov models

Lecture 5: Markov models Master s course Bioinformatics Data Analysis and Tools Lecture 5: Markov models Centre for Integrative Bioinformatics Problem in biology Data and patterns are often not clear cut When we want to make a

More information

Mutation & Data Abstraction Summer 2018 Discussion 4: July 3, Mutable Lists

Mutation & Data Abstraction Summer 2018 Discussion 4: July 3, Mutable Lists CS 61A Mutation & Data Abstraction Summer 2018 Discussion 4: July 3, 2018 1 Mutable Lists Let s imagine you order a mushroom and cheese pizza from La Val s, and that they represent your order as a list:

More information

Dynamic Programming & Smith-Waterman algorithm

Dynamic Programming & Smith-Waterman algorithm m m Seminar: Classical Papers in Bioinformatics May 3rd, 2010 m m 1 2 3 m m Introduction m Definition is a method of solving problems by breaking them down into simpler steps problem need to contain overlapping

PLEASE HAND IN UNIVERSITY OF TORONTO Faculty of Arts and Science. SUMMER 2012 EXAMINATIONS, CSC 108 H1Y. Instructors: Janicki. Duration: NA. Examination Aids: None. Student Number: Family Name(s):

for loops Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas. for loop (must be indented): Allows you to perform an operation on each element in a list (or character in

CS 1110 Lecture 26: Grab Bag. Announcements: The End is Nigh! 1. Next (last) lecture will be recap and final exam review 2. A5 due Wednesday night 3. Final exam 7pm Thursday May 15 in Barton Hall (East section)

Outline. 1 Compressing Files: gzip and gunzip, data compression, archiving files and pipes in Unix. 2 File Methods in Python: format conversions, encrypting text. 3 Using Buffers: counting and replacing words using

CS150 Sample Final. Name: Section: A / B Date: Start time: End time: Honor Code: Signature: This exam is closed book, closed notes, closed computer, closed calculator, etc. You may only use (1) the final

Lecture #21: Search and Sets. Last modified: Wed Mar 9 15:44:55 2016. CS61A: Lecture #21. Announcements: My office hours this Thursday (only) are 3-4PM. Homework 5 to be released later today. Many problems

from scratch A primer for scientists working with Next-Generation-Sequencing data CHAPTER 8 biopython. Chapter 8: Biopython. Biopython is a collection of modules that implement common bioinformatical tasks

Regular Expressions. Steve Renals s.renals@ed.ac.uk (based on original notes by Ewan Klein), ICL, 12 October 2005. Outline: Overview of REs, REs in Python. Introduction, Formal Background to REs, Extensions of Basic REs. Overview. Goals: a basic idea

ECE 364 Software Engineering Tools Laboratory. Lecture 4 Python: Collections I. Lecture Summary: Lists, Tuples, Sets, Dictionaries, Printing, More I/O, Bitwise Operations. Lists: list is a built-in Python data

CS150 - Sample Final. Name: Honor code: You may use the following material on this exam: The final exam cheat sheet which I have provided; The matlab basics handout (without any additional notes); Up to two

Computational Molecular Biology. Erwin M. Bakker, Lecture 2. Materials used from R. Shamir [2] and H.J. Hoogeboom [4]. Molecular Biology Sequences: DNA: A, T, C, G; RNA: A, U, C, G; Protein: A, R, D, N, C E,

Ling 684.01: Lecture Notes 5 From Programs to Projects. 1. Modules: You can build projects with shared code files. Other code files (written by you or others) can be imported into a program (a) stmt import

DNA Inspired Bi-directional Lempel-Ziv-like Compression Algorithms. Attiya Mahmood, Nazia Islam, Dawit Nigatu, and Werner Henkel, Jacobs University Bremen, Electrical Engineering and Computer Science, Bremen,

Chapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines. Chapter 1 Summary Comments are indicated by a hash sign # (also known as the pound or number sign). Text to the right of the hash sign is ignored. (But, hash loses its special meaning if it is part of

Today. CISC101 Reminders & Notes. Searching in Python - Cont. Searching in Python. From last time. Test 3 this week in tutorial. USATs at the beginning of next lecture: please attend and fill out an evaluation. School of Computing First Year Information Session Thursday, March

Lecture 15: Dictionaries. CS1068+ Introductory Programming in Python, Dr Kieran T. Herley, 2018/19, Department of Computer Science, University College Cork. Summary: Python's dictionary concept. Dictionaries

Regular Expressions. Regular Expression Syntax in Python. Achtung! Lab Objective: Cleaning and formatting data are fundamental problems in data science. Regular expressions are an important tool for working with text carefully and efficiently, and are

Dictionaries. Looking up English words in the dictionary. Python sequences and collections. Properties of sequences and collections. Comparing sequences to collections. Sequence: a group of things that come one after the other. Collection: a group of (interesting) things brought together for

Transfer String Kernel for Cross-Context Sequence Specific DNA-Protein Binding Prediction. by Ritambhara Singh IIIT-Delhi June 10, 2016. Biology in a Slide: DNA, RNA, PROTEIN, CELL, ORGANISM. DNA and Diseases

for loops Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas. Reminders: use if - elif - else statements for conditional code blocks; memorize the logical operators (==, !=,

Managing big biological sequence data with Biostrings and DECIPHER. Erik Wright University of Wisconsin-Madison. What you should learn: How to use the Biostrings and DECIPHER packages; Creating a database

Programming Languages and Uses in Bioinformatics. Programming in Perl. Perl, Python Pros: reformatting data files; reading, writing and parsing files; building web pages and database access; building work flow

ChIPModule 1.0 Manual Computational Systems Biology Lab EECS,UCF. UNIX version ChIPModule software

1 Modules 2 IO 3 Lambda Functions 4 Some tips and tricks 5 Regex. Sandeep Sadanandan (TU, Munich), Python For Fine Programmers, May 30, 2009. What are they? Modules are collections of classes or functions

Module 09: Additional Options for Organizing Data. Topics: Dictionaries, Classes. Readings: ThinkP 11, 15, 16, 17. Collections of key-value pairs: In CS115, you studied collections of key-value pairs, where

Python for Non-programmers: A Gentle Introduction 2. Yann Tambouret, Scientific Computing and Visualization, Information Services & Technology, Boston University, 111 Cummington St. yannpaul@bu.edu Winter 2013

Package RWebLogo. August 29, 2016. Type: Package. Title: plotting custom sequence logos. Version: 1.0.3. Date: 2014-04-14. Author: Omar Wagih. Maintainer: Omar Wagih. Description: RWebLogo is a wrapper

Short Answer Questions (40 points). CS 1112 Fall 2017 Test 2. 1. TRUE FALSE You have very legibly printed your name and email id below. Name = EMAILD = 2. TRUE FALSE On my honor, I pledge that

UNIVERSITY OF TORONTO SCARBOROUGH. Winter 2016 EXAMINATIONS. CSC A20H Duration 2 hours 45 mins. No Aids Allowed. Student Number: Last Name: First Name: Do not turn this page until you have received the signal

LISTS WITH PYTHON. José M. Garrido, Department of Computer Science, College of Computing and Software Engineering, Kennesaw State University. May 2015. © 2015, J. M. Garrido. Lists with Python

Extended Introduction to Computer Science CS1001.py Lecture 10b: Recursion and Recursive Functions. Instructors: Daniel Deutch, Amir Rubinstein. Teaching Assistants: Amir Gilad, Michal Kleinbort. School

CS558 Programming Languages. Winter 2017, Lecture 6a. Andrew Tolmach, Portland State University, 1994-2017. Iteration into Recursion: Any iteration can be written as a recursion, e.g. while (c) {e Scala is equivalent
