Maps 02-201
Arrays/Slices Store Lists of Variables
A list of characters, "Hi There!", at indices 0 to 8.
A list of integers, 1 1 2 3 5 8 13 21 34 55 89, at indices 0 to 10.
A list of strings, ACG TTA GAG CCT TAA GGG CAT, at indices 0 to 6.
What if Indices Aren't Integers?
    key               value
    California        38,802,500
    Texas             26,956,958
    Florida           19,893,297
    New York          19,746,227
    Illinois          12,880,580
    Pennsylvania      12,787,209
    Ohio              11,594,163
    Georgia           10,097,343
    North Carolina     9,943,964
    Michigan           9,909,877
Maps: The Go Data Structure We Want
Recall: slice declaration:
    var a []int
    a = make([]int, 10)
Now: map declaration:
    var statepop map[string]int
    statepop = make(map[string]int)
    // note: no need to specify length of map
    statepop["Pennsylvania"] = 12787209
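The declaration above can be tried as a complete program. This is a minimal sketch: the helper name newStatePop is an assumption made for illustration, and the populations are the ones from the table earlier in these slides. It also shows why the make step matters: a map that is declared but not yet made is nil, so it can be read but not written to.

```go
package main

import "fmt"

// newStatePop is a hypothetical helper that declares, allocates,
// and fills a map from state names to populations.
func newStatePop() map[string]int {
	var statepop map[string]int // declared but nil: writing to it now would panic
	statepop = make(map[string]int)
	statepop["Pennsylvania"] = 12787209
	statepop["Ohio"] = 11594163
	return statepop
}

func main() {
	statepop := newStatePop()
	fmt.Println(statepop["Ohio"])    // prints 11594163
	fmt.Println(statepop["Georgia"]) // key is absent, so this prints the zero value 0
}
```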
Shortcut Declarations
Recall: shortcut slice declaration:
    a := make([]int, 10)
Now: shortcut map declaration:
    statepop := make(map[string]int)
Adding Items to a Map
Recall: appending items to end of a slice
    b := make([]int, 0)
    b = append(b, 23)
Now: can assign map values directly.
    statepop := make(map[string]int)
    statepop["Pennsylvania"] = 12787209
Number of Items in Map
Recall: number of items in slice:
    a := make([]int, 10)
    fmt.Println(len(a)) // prints 10
Now: number of items in map:
    statepop := make(map[string]int)
    statepop["Pennsylvania"] = 12787209
    fmt.Println(len(statepop)) // prints 1
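One detail worth checking by hand: len counts keys, so assigning to a key that already exists does not change it. A small sketch, assuming a hypothetical helper countAfterUpdates and a placeholder value of 0 for the overwrite:

```go
package main

import "fmt"

// countAfterUpdates is a hypothetical helper: it adds two keys,
// overwrites one of them, and reports how many keys the map holds.
func countAfterUpdates() int {
	statepop := make(map[string]int)
	statepop["Pennsylvania"] = 12787209
	statepop["Ohio"] = 11594163
	statepop["Ohio"] = 0 // placeholder value; overwrites the entry, adds no new key
	return len(statepop)
}

func main() {
	fmt.Println(countAfterUpdates()) // prints 2
}
```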
Removing an Item from a Map
Recall: removing item i from a slice:
    a = append(a[:i], a[i+1:]...)
Now: removing key and value from a map:
    delete(statepop, "Florida")
    // deletes both the "Florida" key and the
    // population value it refers to.
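The following sketch shows delete in a runnable setting. The helper removeState is an assumption made for illustration. Two things to notice: deleting a key that is not in the map does nothing, and the caller sees the change because the function receives the same underlying map.

```go
package main

import "fmt"

// removeState is a hypothetical helper that deletes one key
// and returns the number of keys left in the map.
func removeState(pop map[string]int, state string) int {
	delete(pop, state) // does nothing if state is not a key
	return len(pop)
}

func main() {
	statepop := map[string]int{
		"Florida": 19893297,
		"Texas":   26956958,
	}
	fmt.Println(removeState(statepop, "Florida")) // prints 1
	fmt.Println(removeState(statepop, "Florida")) // prints 1 again: key already gone
	_, exists := statepop["Florida"]
	fmt.Println(exists) // prints false
}
```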
Looping through Maps
Recall: looping through slices with two indices:
    for j, v := range list {
        fmt.Println("The value at", j, "is", v)
    }
Now: looping through map proceeds similarly:
    for j, v := range statepop {
        fmt.Println("The pop of", j, "is", v)
    }
Think: In what order do you think the states print?
Note: The ordering of map keys doesn't follow a clear pattern. Go does not guarantee any iteration order, and it can differ from one run to the next.
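When a predictable order is needed, one common approach is to collect the keys into a slice, sort it, and loop over the sorted slice. The sketch below assumes a hypothetical helper sortedKeys and uses sort.Strings from the standard library; it is one way to do this, not the only one.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys is a hypothetical helper that returns the keys
// of m in alphabetical order.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	statepop := map[string]int{
		"Texas":      26956958,
		"California": 38802500,
		"Florida":    19893297,
	}
	// Ranging over the sorted keys gives the same order on every run.
	for _, state := range sortedKeys(statepop) {
		fmt.Println("The pop of", state, "is", statepop[state])
	}
}
```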
Map Literals
Recall: array and slice literals
    var a = [4]float64{3.2, -30.0, 84.71, 62.3}
    var prime = []int{2, 3, 5, 7, 11}
Now: map literals
    charskew := map[byte]int{
        'A': 0,
        'C': -1,
        'G': 1,
        'T': 0, // the last comma is important!
    }
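Map Literals
    func SkewArray(s string) []int {
        a := []int{0} // a[0] = 0
        for i := range s {
            // look up the skew of s[i] with square brackets, not parentheses
            a = append(a, a[i]+charskew[s[i]])
        }
        return a
    }

    // declared at package level so that SkewArray can use it
    var charskew = map[byte]int{
        'A': 0,
        'C': -1,
        'G': 1,
        'T': 0, // the last comma is important!
    }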
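To see the skew array in action, here is a self-contained sketch with a short made-up input string. It assumes the intended result has one more entry than the string, starting at 0, with each later entry adding the skew of the next character.

```go
package main

import "fmt"

// charskew maps each nucleotide to its contribution to the skew.
var charskew = map[byte]int{
	'A': 0,
	'C': -1,
	'G': 1,
	'T': 0,
}

// SkewArray returns a slice of length len(s)+1 whose entry i+1
// is entry i plus the skew of character s[i].
func SkewArray(s string) []int {
	a := []int{0}
	for i := range s {
		a = append(a, a[i]+charskew[s[i]])
	}
	return a
}

func main() {
	fmt.Println(SkewArray("CAGG")) // prints [0 -1 -1 0 1]
}
```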
Returning to Finding Frequent Words
FrequentWords(Text, k)
    FrequentPatterns ← an empty list
    c ← empty array of length |Text| - k + 1
    for i ← 0 to |Text| - k
        Pattern ← Substring(Text, i, k)
        c[i] ← Count(Text, Pattern)
    maxcount ← Max(c)
    for i ← 0 to |Text| - k
        if c[i] = maxcount
            add Substring(Text, i, k) to FrequentPatterns
    FrequentPatterns ← RemoveDuplicates(FrequentPatterns)
    return FrequentPatterns
Why not rewrite with a map instead of an array?
Rewriting Frequent Words Pseudocode
BetterFrequentWords(Text, k)
    FrequentPatterns ← an empty list
    Freq ← empty map
    for i ← 0 to |Text| - k
        Pattern ← Substring(Text, i, k)
        if Freq[Pattern] does not exist
            Freq[Pattern] ← 1
        else
            Freq[Pattern]++
    maxcount ← Max(Freq)
    for all patterns Pattern in Freq
        if Freq[Pattern] = maxcount
            add Pattern to FrequentPatterns
    return FrequentPatterns
Note: We don't need RemoveDuplicates() or Count()!
Returning to BetterFrequentWords()
Exercise: Write a Go function that takes a map of strings to ints as input and returns the maximum value in the map.
Next: let's focus on implementing the for loop that builds Freq.
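One possible solution to the Max exercise is sketched below. Returning 0 for an empty map is an assumption made here, since the pseudocode does not say what should happen in that case. The first flag makes sure the function also works when every value is negative.

```go
package main

import "fmt"

// Max returns the largest value stored in freq.
// Assumption: an empty map yields 0.
func Max(freq map[string]int) int {
	m := 0
	first := true
	for _, val := range freq {
		// Take the first value unconditionally, then keep the larger one.
		if first || val > m {
			m = val
			first = false
		}
	}
	return m
}

func main() {
	freq := map[string]int{"ACG": 2, "CGT": 5, "GTA": 1}
	fmt.Println(Max(freq)) // prints 5
}
```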
Returning to BetterFrequentWords()
BetterFrequentWords(Text, k)
    FrequentPatterns ← an empty list
    Freq ← FrequencyMap(Text, k)
    maxcount ← Max(Freq)
    for all patterns Pattern in Freq
        if Freq[Pattern] = maxcount
            add Pattern to FrequentPatterns
    return FrequentPatterns
Checking if a Map Contains a Key: Method 1
    _, exists := Freq[Pattern]
    // exists is a boolean value that is equal
    // to false if Freq[Pattern] doesn't exist
    if !exists {
        Freq[Pattern] = 1
    } else {
        Freq[Pattern]++
    }
Checking if a Map Contains a Key: Method 2
    Freq[Pattern]++
    // if Pattern is not yet a key, Go reads its value as the
    // default 0, so this creates the key and immediately
    // increments it to 1; otherwise it increments the existing count
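The two methods can be compared side by side. In this sketch the function names countWithCheck and countDirect and the input list are assumptions made for illustration; both functions should produce the same counts.

```go
package main

import "fmt"

// countWithCheck uses the two-value lookup (Method 1).
func countWithCheck(patterns []string) map[string]int {
	freq := make(map[string]int)
	for _, p := range patterns {
		_, exists := freq[p]
		if !exists {
			freq[p] = 1
		} else {
			freq[p]++
		}
	}
	return freq
}

// countDirect relies on a missing key reading as 0 (Method 2).
func countDirect(patterns []string) map[string]int {
	freq := make(map[string]int)
	for _, p := range patterns {
		freq[p]++
	}
	return freq
}

func main() {
	patterns := []string{"ACG", "CGT", "ACG"}
	a := countWithCheck(patterns)
	b := countDirect(patterns)
	fmt.Println(a["ACG"], b["ACG"]) // prints 2 2
	fmt.Println(a["CGT"], b["CGT"]) // prints 1 1
}
```

Method 2 is shorter, but Method 1 is still the tool to reach for when a stored value of 0 must be told apart from a missing key.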
Returning to BetterFrequentWords()
Exercise: Write a Go function implementing FrequencyMap(Text, k).
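A possible solution to the FrequencyMap exercise is sketched below, using the direct increment from Method 2. Returning an empty map when k is not positive is an assumption added here to keep the slicing safe; the pseudocode does not cover that case.

```go
package main

import "fmt"

// FrequencyMap counts how many times each substring of length k
// occurs in text. Assumption: k <= 0 yields an empty map.
func FrequencyMap(text string, k int) map[string]int {
	freq := make(map[string]int)
	if k <= 0 {
		return freq
	}
	// i runs from 0 to len(text)-k inclusive, as in the pseudocode.
	for i := 0; i <= len(text)-k; i++ {
		pattern := text[i : i+k]
		freq[pattern]++
	}
	return freq
}

func main() {
	freq := FrequencyMap("ACGTACGT", 3)
	fmt.Println(freq["ACG"], freq["CGT"], freq["GTA"], freq["TAC"]) // prints 2 2 1 1
}
```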
Implementing FrequentWords() in Go
    func FrequentWords(Text string, k int) []string {
        freqpatterns := make([]string, 0)
        freq := FrequencyMap(Text, k)
        m := Max(freq)
        for pattern, val := range freq {
            if val == m {
                freqpatterns = append(freqpatterns, pattern)
            }
        }
        return freqpatterns
    }
Cataloguing Multiple Genomes
Think: Say that you would like to store frequency maps for the replication origins of 1,000 different bacterial genomes. One way is to have a separate frequency map for each genome. How could we consolidate all of this information into a single data structure?
Map of Maps: Mental Image
Outer map: each key is a bacterium (Bacterium A, Bacterium B, ..., Bacterium G) and each value is that bacterium's frequency map.
Example of one frequency map:
    Key          Value
    ATGCACGCT    8
    GGACGTACG    1
    GTACGACAG    2
    ATAAATTGC    3
    GATACCAGA    2
    ATAGGATCC    6
    GGATATCCC    3
Recall: Creating a 2-D Slice
2-D slices are slices of slices, so we must define the outer slice first.
    var field [][]bool = make([][]bool, m)
To initialize the slices in field, write an explicit loop.
    for row := range field {
        field[row] = make([]bool, n)
    }
The result has m rows and n columns (the figure shows rows 0 to 6 and columns 0 to 3).
Map of Maps in Go
    // say we have a slice of strings genomes
    // containing the bacterial genomes.
    database := make(map[string]map[string]int) // create initial 2-D map
    for _, bact := range genomes {
        // create map for each bacterium
        // (optional here, since the next line replaces it)
        database[bact] = make(map[string]int)
        // map bacterium to its frequency map
        database[bact] = FrequencyMap(bact, k)
    }
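Here is a runnable sketch of the map of maps. It departs from the slide in one labeled way: instead of keying the outer map by the genome string itself, it assumes the genomes arrive in a map from a bacterium's name to its sequence, which matches the mental image above. The helper BuildDatabase, the names, and the short toy sequences are all made up for illustration.

```go
package main

import "fmt"

// FrequencyMap counts each substring of length k in text.
func FrequencyMap(text string, k int) map[string]int {
	freq := make(map[string]int)
	if k <= 0 {
		return freq
	}
	for i := 0; i <= len(text)-k; i++ {
		freq[text[i:i+k]]++
	}
	return freq
}

// BuildDatabase is a hypothetical helper that maps each bacterium's
// name to the frequency map of its sequence.
func BuildDatabase(genomes map[string]string, k int) map[string]map[string]int {
	database := make(map[string]map[string]int)
	for name, genome := range genomes {
		database[name] = FrequencyMap(genome, k)
	}
	return database
}

func main() {
	// Toy sequences, not real genomes.
	genomes := map[string]string{
		"Bacterium A": "ACGTACGT",
		"Bacterium B": "GGGGG",
	}
	db := BuildDatabase(genomes, 3)
	fmt.Println(db["Bacterium A"]["ACG"]) // prints 2
	fmt.Println(db["Bacterium B"]["GGG"]) // prints 3
	fmt.Println(db["Bacterium C"]["ACG"]) // prints 0: reading a missing inner map is safe
}
```

Reading through a missing outer key is safe because the inner map comes back as nil and a nil map reads as empty. Writing through one is not: database["Bacterium C"]["ACG"]++ would panic unless the inner map is made first.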