CSE 111 Bio: Program Design I
Lecture 4: Variables, Functions, Strings, GenBank
Randall Munroe, XKCD http://xkcd.com/1319/
Robert Sloan (CS) & Rachel Poretsky (Bio)
University of Illinois, Chicago
September 7, 2017
At the end of this code, what value will y have?
    y = 3 + 2 + 1
    y = y * 2
A. 2
B. 6
C. 8
D. 12
E. This code will cause an error
Code:
    x = 7
    print(x)
    x = x + 1
    print(x)
    y = x - 3
    print(x)
    print(y)
At the end of running this code, what will appear from the print statements in the execution window?
A. 7 8 5 5
B. 7 8 8 5
C. 7 7 7 4
D. This will cause an error
E. I don't know
Survey! Survey URL: CS 111 Green: https://uic.qualtrics.com/jfe/form/sv_e5p8trpi7s7ejnx
Alternate problem if not taking the survey: evaluate in your head; check with the computer when done:
1. 5 ** 2
2. 9 * 5
3. 15 / 12
4. 12 / 15
5. 15 // 12
6. 12 // 15
7. 5 % 2
8. 9 % 5
9. 15 % 12
10. 12 % 15
11. 6 % 6
12. 0 % 7
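One way to check answers like these at the computer is sketched below. The values 17 and 5 and all variable names are assumptions chosen for illustration, so the exercise itself is left unsolved.

```python
# Hypothetical values for illustration; they are not taken from the exercise.
a, b = 17, 5

power = 2 ** 5            # exponentiation
true_quotient = a / b     # ordinary division always produces a float
floor_quotient = a // b   # floored division drops the fractional part
remainder = a % b         # remainder left over after floored division

# For integers, the floored quotient and the remainder rebuild the original number.
rebuilt = floor_quotient * b + remainder

print(power, true_quotient, floor_quotient, remainder, rebuilt)
```

The last line of the calculation is a handy sanity check: rebuilt should always equal a.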
Order of common binary operations
Things in ()s first. Use ()s whenever you are in the slightest doubt
Next ** (exponentiation)
Next *, /, and //
Lastly + and -
Order of common operations
Things in ()s first. Use ()s whenever you are in the slightest doubt
Next ** (exponentiation)
Next *, /, and // (//?!)
Lastly + and -
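A small sketch of the ordering rules above. The expressions and variable names are made up for illustration and do not come from the slides.

```python
# Illustrative expressions showing the order of operations.
no_parens = 2 + 3 * 4 ** 2         # ** first (16), then * (48), then + (50)
with_parens = ((2 + 3) * 4) ** 2   # ()s force + first: (5 * 4) ** 2
same_level = 20 / 5 * 2            # * and / share a level and run left to right

print(no_parens, with_parens, same_level)
```

When in doubt, the parenthesized form says exactly what you mean and costs nothing.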
Division of integers: / vs. //
Recall from the reading that / is ordinary division, and // is floored division.
Does it work with integers?
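A minimal sketch of how the two division operators behave on integers in Python 3. The numbers and variable names are assumptions for illustration.

```python
whole = 7 / 2        # ordinary division of two ints gives a float
floored = 7 // 2     # floored division of two ints gives an int
exact = 6 / 2        # still a float, even though 6 divides evenly by 2
negative = -7 // 2   # flooring rounds down (toward negative infinity), not toward zero

print(whole, type(whole).__name__)
print(floored, type(floored).__name__)
print(exact, type(exact).__name__)
print(negative)
```

So yes, both operators work with integers; the difference is the type and rounding of the result.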
What is printed?
    print(6 + 1 * 3)
A. 9
B. 21
C. Some other value
Brief first look at Functions Much more to come!
Defining your own functions
    def triple(x):
        return 3 * x
Notice the indentation: done using tab and absolutely necessary!
Diagram: x → triple → 3 * x
Functions can have more than one line
    def triple(x):
        return 3 * x
Diagram: x → triple → 3 * x
    def triple(x):
        myanswer = 3 * x
        return myanswer
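A short sketch of defining and then calling the two-line version of triple. The argument values and the names result and word are assumptions for illustration.

```python
def triple(x):
    # Store the intermediate value, then hand it back to the caller.
    myanswer = 3 * x
    return myanswer

result = triple(5)     # x is 5 inside the function
word = triple("ab")    # 3 * "ab" repeats the string three times

print(result)
print(word)
```

Defining a function does not run it; nothing is computed until the function is called with an argument.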
Docstrings
    def triple(x):
        ''' Input is number x, returns 3*x '''
        return 3 * x
Teaching your program to talk to you
Can access via help(triple); then type q to get back to the main Python prompt
3 single or double quotes are just fine (but watch out for the editor trying to give you an even number like 4 marks!)
Use docstrings!
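Besides help(triple), the docstring text is also stored on the function itself as the __doc__ attribute. The sketch below shows this; the variable name doc_text is an assumption for illustration.

```python
def triple(x):
    '''Input is number x, returns 3*x'''
    return 3 * x

# The docstring is stored as an attribute of the function object.
doc_text = triple.__doc__

print(doc_text)
print(triple(4))
```

Printing __doc__ avoids the pager that help() opens, which can be convenient inside a script.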
Comments
    # Tripling program
    # Authors: Rachel and Bob
    # Date: August 52, 2017
    def triple(x):
        ''' Input is number x, returns 3*x '''
        # Comments begin with a hash mark
        return 3 * x
Which of these Python 3 programs will print out an "A"?
A.
    def print_a():
        '''I claim to print A'''
        print("a")
B.
    def print_a():
        '''I claim to print A'''
        print "A"
C.
    def print_a():
        '''I claim to print A'''
        print("b")
D. None of the above
Strings, strings, strings!
Strings
Are a really key data type for many parts of biology
DNA sequences: monster-long strings of A's, C's, T's, and G's
Protein: monster-long sequence of amino acids. For computation, the 20 amino acids are each represented as a string
More later
Python awesome language for working with strings
Python is happy to have strings in 'these' or "these" or even '''these''', making it much simpler to deal with embedded quote characters such as
'''It has been disputed at what period of time the causes of variability, whatever they may be, generally act; whether during the early or late period of development of the embryo, or at the instant of conception. Geoffroy St Hilaire's experiments show that unnatural treatment of the embryo causes monstrosities; and monstrosities cannot be separated by any clear line of distinction from mere variations.'''
Python can handle carriage returns in strings
paragraph1 = ''' WHEN we look to the individuals of the same variety or subvariety of our older cultivated plants and animals, one of the first points which strikes us, is, that they generally differ much more from each other, than do the individuals of any one species or variety in a state of nature. When we reflect on the vast diversity of the plants and animals which have been cultivated, and which have varied during all ages under the most different climates and treatment, I think we are driven to conclude that this greater variability is simply due to our domestic productions having been raised under conditions of life not so uniform as, and somewhat different from, those to which the parent-species have been exposed under nature. There is, also, I think, some probability in the view propounded by Andrew Knight, that this variability may be partly connected with excess of food. It seems pretty clear that organic beings must be exposed during several generations to the new conditions of life to cause any appreciable amount of variation; and that when the organisation has once begun to vary, it generally continues to vary for many generations. No case is on record of a variable being ceasing to be variable under cultivation. Our oldest cultivated plants, such as wheat, still often yield new varieties: our oldest domesticated animals are still capable of rapid improvement or modification.'''
String
String: any sequence of characters enclosed in single, double, or triple quotes
Beginning and ending quote marks need to match
Can use either 3 single quotes ''' or 3 double quotes """ for triple quotes
Convention: Docstrings are in triple quotes
Example of a sequence
Other important kind of sequence is list (coming soon)
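The quoting rules above can be sketched as follows. All variable names and the sample text are assumptions for illustration.

```python
# The three quoting styles all produce the same kind of value.
single = 'DNA'
double = "DNA"
triple_quoted = '''DNA'''
all_equal = single == double == triple_quoted

# Double quotes make an embedded apostrophe painless.
apostrophe = "Geoffroy St Hilaire's experiments"

# Triple quotes let a string run across more than one line.
two_lines = '''line one
line two'''
line_count = len(two_lines.splitlines())

print(all_equal, apostrophe, line_count)
```

The quote marks are only delimiters; they are not part of the string's contents.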
Which is a valid Python string? A. "Admiral Grace Hopper" B. "Admiral Grace Hopper' C. Admiral Grace Hopper D. "Admiral Grace Hopper' " E. Both A and D
Things we can do with strings
Find their length, using the built-in Python function len
In [1]: my_dna = "AATGCCGTGCTT"
In [2]: len(my_dna)
Out[2]: 12
In [3]: len("hi there")
Out[3]: 8
String arithmetic: + → concatenation; * → repeat (as we saw last time)
In [4]: "The DNA string we're working with is: " + my_dna
Out[4]: "The DNA string we're working with is: AATGCCGTGCTT"
In [5]: 2 * my_dna
Out[5]: 'AATGCCGTGCTTAATGCCGTGCTT'
Arithmetic limited to + and integer *
>>> my_dna / 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for /: 'str' and 'int'
>>> 3.0 * my_dna
Also Barf
Importance of types: floats are not integers, and multiplication of int by string makes some sense, but multiplication of float by string makes no sense
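A hedged sketch of the type rules above, catching the error instead of letting the program stop. The try/except approach and the names repeated, failed, and fixed are assumptions for illustration, not something the slides prescribe.

```python
my_dna = "AATGCCGTGCTT"

repeated = 2 * my_dna     # int * str is fine: the string is repeated

failed = False
try:
    3.0 * my_dna          # float * str raises TypeError
except TypeError:
    failed = True

# Converting the float to an int first makes the repetition legal again.
fixed = int(3.0) * my_dna

print(len(repeated), failed, len(fixed))
```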
Using strings: slicing (first look)
>>> mydna = "AATGCCGTGCTT"
Index:  0  1  2  3  4  5  6  7  8  9 10 11
Base:   A  A  T  G  C  C  G  T  G  C  T  T
>>> mydna[0:4]
'AATG'
>>> mydna[3:7]
'GCCG'
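The slicing examples above can be run as a script. The extra slices and all variable names other than mydna are assumptions added for illustration.

```python
mydna = "AATGCCGTGCTT"

first_four = mydna[0:4]   # characters at indices 0, 1, 2, 3 (index 4 is excluded)
middle = mydna[3:7]       # characters at indices 3, 4, 5, 6
single_base = mydna[0]    # a single index gives one character
tail = mydna[8:]          # leaving out the stop index runs to the end of the string

# A slice [start:stop] contains stop - start characters.
slice_length = len(mydna[3:7])

print(first_four, middle, single_base, tail, slice_length)
```

Indices start at 0, and the stop index is never included in the slice.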
Introduction to GenBank GenBank is a public database of nucleotide sequences and supporting information It is hosted at the National Center for Biotechnology Information (NCBI), associated with the US National Institutes of Health (NIH)
Introduction to GenBank http://www.ncbi.nlm.nih.gov/ or http://www.ncbi.nlm.nih.gov/genbank