Getting started with programming for geospatial data analysis
Robert Hijmans, UC Davis
Crash course for the CSI
Cali, 22 Sept 2017
Developing computer programs for geographic data manipulation to get:
- Scalability
- Repeatability
- Documentation
- Innovation
- Speed
Computer languages
- Basic data types (Boolean, integer, float, string)
- Multi-dimensional arrays (vector, matrix, ...)
- Compound data types (structures, classes)
- Arithmetic
- Variables
- Flow control (branching, looping)
- Functions (procedures)
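As a rough illustration of these building blocks in one place, here is a minimal Python sketch (Python is one of the example languages used later). The names `city`, `coords`, and `hemisphere` are made up for this example, and the code is written so that it should run under both Python 2 and Python 3.

```python
# Basic data types
is_capital = False          # Boolean
n_cities = 2                # integer
latitude = 3.45             # float
city = "Cali"               # string

# A matrix-like nested list of longitude/latitude pairs (hypothetical data layout)
coords = [[-76.53, 3.45], [-74.07, 4.71]]

# A function that contains flow control (branching)
def hemisphere(lat):
    """Return 'N' for latitudes on or north of the equator, else 'S'."""
    if lat >= 0:
        return "N"
    else:
        return "S"

# Flow control (looping) over the array, storing results in a variable
labels = []
for lon, lat in coords:
    labels.append(hemisphere(lat))

print(city + " is in hemisphere " + hemisphere(latitude))
print(labels)
```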
Examples
- C++
- Python
- R
- JavaScript (Earth Engine)
- And then more Python
C++

#include <cmath>
#include <iostream>
using namespace std;

int main() {
    double CaliLon, CaliLat, BogLat, BogLon;
    double CaliLonRad, CaliLatRad, BogLatRad, BogLonRad;
    double torad, distanceunitsphere, distance, r;
    int distkm;

    CaliLon = -76.53; CaliLat = 3.45;
    BogLon = -74.07; BogLat = 4.71;

    // convert degrees to radians
    torad = M_PI / 180;
    CaliLonRad = CaliLon * torad;
    CaliLatRad = CaliLat * torad;
    BogLonRad = BogLon * torad;
    BogLatRad = BogLat * torad;

    // angular distance on a unit sphere, then scale by the radius in km
    distanceunitsphere = acos(sin(CaliLatRad) * sin(BogLatRad) + cos(CaliLatRad) * cos(BogLatRad) * cos(CaliLonRad - BogLonRad));
    r = 6378.137;
    distance = distanceunitsphere * r;
    distkm = round(distance);
    cout << "The distance is " << distkm << " km";
    return 0;
}
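The calculation in this code is the spherical law of cosines. Written as a formula, with latitudes \varphi and longitudes \lambda in radians and r the radius of the sphere (the symbols are chosen here for illustration; the code uses variable names instead):

```latex
d = r \cdot \arccos\bigl( \sin\varphi_1 \sin\varphi_2
      + \cos\varphi_1 \cos\varphi_2 \cos(\lambda_1 - \lambda_2) \bigr)
```

The result d is in the same unit as r, so a radius of 6378.137 gives kilometers and a radius of 6378137 gives meters.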
C++ (improved)

#include <cmath>
#include <iostream>
using namespace std;

// distance in meters between two lon/lat points given in degrees
double distcosines(double lona, double lata, double lonb, double latb, double r=6378137) {
    double torad = M_PI / 180;
    lona = lona * torad;
    lonb = lonb * torad;
    lata = lata * torad;
    latb = latb * torad;
    double distance = acos(sin(lata) * sin(latb) + cos(lata) * cos(latb) * cos(lona - lonb)) * r;
    return(distance);
}

int main () {
    double CaliLon = -76.53199;
    double CaliLat = 3.451647;
    double BogLon = -74.07209;
    double BogLat = 4.710989;
    // convert meters to km before rounding to an integer
    int distkm = round(distcosines(CaliLon, CaliLat, BogLon, BogLat) / 1000);
    cout << "The distance is " << distkm << " km";
}
Python

import math

# distance in meters between two lon/lat points given in degrees
def distcosines(lona, lata, lonb, latb, r=6378137):
    torad = math.pi / 180
    lona = lona * torad
    lonb = lonb * torad
    lata = lata * torad
    latb = latb * torad
    distance = math.acos(math.sin(lata) * math.sin(latb) + math.cos(lata) * math.cos(latb) * math.cos(lona - lonb)) * r
    return distance

CaliLon = -76.53199
CaliLat = 3.451647
BogLon = -74.07209
BogLat = 4.710989

distkm = distcosines(CaliLon, CaliLat, BogLon, BogLat) / 1000
distkm = int(round(distkm))
print "The distance is " + str(distkm) + " km"
R

# distance in meters between two lon/lat points given in degrees
distcosines <- function(lona, lata, lonb, latb, r=6378137) {
    torad <- pi / 180
    lona <- lona * torad
    lonb <- lonb * torad
    lata <- lata * torad
    latb <- latb * torad
    distance <- acos(sin(lata) * sin(latb) + cos(lata) * cos(latb) * cos(lona - lonb)) * r
    return(distance)
}

CaliLon <- -76.53199
CaliLat <- 3.451647
BogLon <- -74.07209
BogLat <- 4.710989

distkm <- distcosines(CaliLon, CaliLat, BogLon, BogLat) / 1000
distkm
Real R code

library(dismo)
library(geosphere)

cali = geocode('cali, Colombia')
bogota = geocode('bogota, Colombia')
distCosine(cali[, c('longitude', 'latitude')], bogota[, c('longitude', 'latitude')])
Real R code

> geocode('cartagena, Colombia')
Cartagena, Colombia   Cartagena, Cartagena Province, Bolivar, Colombia   -75.47943   10.39105
library(raster)

col <- getData('GADM', country='COL', level=1, path='c:/goodrive/csi/data')
bio <- getData('worldclim', res=5, var='bio', path='c:/goodrive/csi/data')
bio <- crop(bio, col)
biocol <- mask(bio, col)
plot(biocol, 1)
plot(col, add=TRUE)
hist(biocol)
Don't we have GIS for that?

GIS*                          R
Visual interaction            Data & model focused
Data management               Data analysis
Geometric operations          Attributes as important
Standard workflows            Creativity & innovation
Single map production         Many (simpler) maps
Click, click, click & click   Repeatability (single script)
Speed of execution            Speed of development
Cumbersome                    Easy & powerful (& free)

* There are many different GISs and they evolve.
http://www.rspatial.org/
Python
- General purpose
- High level
- Emphasizes code readability
- Free and open source interpreter
- Easy to learn (by design)
- Widely used ($$$) for stand-alone programs, scripts, websites
Some code.

x = 34
x = x - 23            # A comment.
y = "Hello"           # Another one.
if x < 0 or y == "Hello":
    x = x + 1
    y = y + " World"  # String concat.
print x
print y

Based on presentation from www.cis.upenn.edu/~cse391/cse391_2004/pythonintro1.ppt
Basic Datatypes

Integers (default for numbers)
    z = 5 / 2    # Answer is 2, integer division.
Floats
    x = 3.456
Strings
    Can use "abc" or 'abc' (same thing).
    Unmatched quotes can occur within the string: "matt's"
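The integer-division result on this slide is Python 2 behaviour; in Python 3, `5 / 2` gives 2.5. A small sketch that should behave the same in both versions uses the explicit floor-division operator `//` and a float operand. The variable names are only for illustration.

```python
# Floor division: the result is an integer when both operands are integers
z = 5 // 2
print(z)

# Making one operand a float gives true division in Python 2 and 3 alike
w = 5 / 2.0
print(w)

# Single and double quotes create the same string
s1 = "abc"
s2 = 'abc'
print(s1 == s2)

# A quote of the other kind can appear unescaped inside the string
name = "matt's"
print(name)
```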
Whitespace

Use a newline to end a line of code. (Use \ when a statement must continue on the next line.)
Use consistent indentation to mark blocks. The first line with less indentation is outside of the block; the first line with more indentation starts a nested block.
A colon usually ends the line that introduces a new block.
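These rules can be seen in a short sketch, with made-up variable names, that should run under both Python 2 and Python 3:

```python
# A backslash lets one statement continue on the next line
total = 1 + 2 + \
        3 + 4

# The colon ends the line that opens a block; the indented line is the block
squares = []
for i in range(3):
    squares.append(i * i)   # inside the loop: runs three times

# Less indentation again, so these lines are outside the loop and run once
print(squares)
print(total)
```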
Types

You can't just append an integer to a string. You must first convert the integer to a string itself.

x = "the answer is "    # Decides x is string.
y = 23                  # Decides y is integer.
print x + y             # Python will complain about this.
print x + str(y)
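The complaint is a TypeError. A minimal sketch of catching it and converting explicitly, written so that it should run under both Python 2 and Python 3 (the variable `message` is added here for illustration):

```python
x = "the answer is "
y = 23

try:
    message = x + y          # str + int raises TypeError
except TypeError:
    message = x + str(y)     # convert the integer to a string first

print(message)

# The conversion also works in the other direction
print(int("23") + 1)
```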
Naming Rules

Names are case sensitive and cannot start with a number. They can contain letters, numbers, and underscores.
    bob  Bob  _bob  _2_bob_  bob_2  BoB
But use a good style...

There are some reserved words:
    and, assert, break, class, continue, def, del, elif, else, except, exec, finally, for, from, global, if, import, in, is, lambda, not, or, pass, print, raise, return, try, while
Accessing Non-existent Name

>>> y
Traceback (most recent call last):
  File "<pyshell#16>", line 1, in -toplevel-
    y
NameError: name 'y' is not defined
>>> y = 3
>>> y
3
Multiple Assignment

>>> x, y = 2, 3
>>> x
2
>>> y
3
String manipulations

a = "hello"
a + "world"    "helloworld"         # concatenation
a * 3          "hellohellohello"    # repetition
a[0]           "h"                  # indexing
a[-1]          "o"                  # (from end)
a[1:4]         "ell"                # slicing
len(a)         5                    # size
a < "jello"    True                 # comparison
"e" in a       True                 # search

Based on Introduction to Python by Guido van Rossum: www.python.org/doc/essays/ppt/lwnyc2002/intro22.ppt
Functions and Methods

Function
>>> import math
>>> a = 9
>>> math.sqrt(a)
3.0

Method
>>> a = "hello"
>>> print a.upper()
HELLO

Based on presentation from www.cis.upenn.edu/~cse391/cse391_2004/pythonintro1.ppt
Print

>>> print "abc", "xyz", 34
abc xyz 34

Use the % string operator in combination with the print command to format output text.

>>> print "%s xyz %d" % ("abc", 34)
abc xyz 34

Based on presentation from www.cis.upenn.edu/~cse391/cse391_2004/pythonintro1.ppt
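The % operator builds a formatted string, so it can also be used without print. A small sketch of a few common format codes follows; the place name and the number are made-up values for illustration, and the code should run under both Python 2 and Python 3.

```python
place = "Somewhere"   # hypothetical place name
dist_km = 123.456     # made-up value, not a computed distance

whole = "%s is %d km away" % (place, dist_km)       # %d drops the decimals
one_dec = "%s is %.1f km away" % (place, dist_km)   # %.1f keeps one decimal
table_row = "%-10s|%8.2f" % (place, dist_km)        # fixed widths for columns

print(whole)
print(one_dec)
print(table_row)
```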
for i in range(10):
    x = i * i
    print x

for i in [0,1,2,3,4,5,6,7,8,9]:
    x = i * i
    print x
Lists

a = [99, "bottles of water", ["off", "the", "wall"]]

Same operators as for strings
    a+b, a*3, a[0], a[-1], a[1:], len(a)

Item and slice assignment
    a[0] = 98
    a[1:2] = ["bottles", "of", "water"]
    print a      # [98, "bottles", "of", "water", ["off", "the", "wall"]]
    del a[-1]
    print a      # [98, "bottles", "of", "water"]

Based on presentation from www.cis.upenn.edu/~cse391/cse391_2004/pythonintro1.ppt
>>> a = range(5)       # [0,1,2,3,4]
>>> a.append(5)        # [0,1,2,3,4,5]
>>> a.pop()            # [0,1,2,3,4]
5
>>> a.insert(0, 42)    # [42,0,1,2,3,4]
>>> a.pop(0)           # [0,1,2,3,4]
42
>>> a.reverse()        # [4,3,2,1,0]
>>> a.sort()           # [0,1,2,3,4]
Pyscripter
Earth Engine
Exercise

Use Python to
- Compute the distance from Cali to Cartagena
- Write a program that can compute the distance from Cali to any other place
- Check for bad input data (impossible coordinates)
- Read input from a file
- Write output to a file
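One possible starting point for the input check, not a full solution: a hypothetical helper `valid_lonlat` that assumes longitudes are given in degrees between -180 and 180 and latitudes between -90 and 90. It should run under both Python 2 and Python 3.

```python
def valid_lonlat(lon, lat):
    """Return True if lon and lat (in degrees) are possible coordinates."""
    try:
        lon = float(lon)
        lat = float(lat)
    except (TypeError, ValueError):
        return False    # the input cannot be read as a number
    # assumed valid ranges: -180..180 for longitude, -90..90 for latitude
    return -180 <= lon <= 180 and -90 <= lat <= 90

print(valid_lonlat(-76.53, 3.45))    # coordinates used earlier for Cali
print(valid_lonlat(-76.53, 345))     # latitude out of range
print(valid_lonlat("abc", 3.45))     # not a number
```

Accepting strings as well as numbers is a design choice made with the file-reading step in mind, since values read from a text file arrive as strings.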