STL Containers, Part II

Similar documents
Assignment 1: grid. Due November 20, 11:59 PM Introduction

Programming with Haiku

Sets and MultiSets. Contents. Steven J. Zeil. July 19, Overview of Sets and Maps 4

Standard Library Reference

CSE100. Advanced Data Structures. Lecture 4. (Based on Paul Kube course materials)

const int length = myvector.size(); for(int i = 0; i < length; i++) //...Do something that doesn't modify the length of the vector...

Member Initializer Lists

SETS AND MAPS. Chapter 9

static CS106L Spring 2009 Handout #21 May 12, 2009 Introduction

Associative Containers & Iterators. Cristian Cibils

CSE 100: C++ TEMPLATES AND ITERATORS

CSCI-1200 Data Structures Spring 2018 Lecture 14 Associative Containers (Maps), Part 1 (and Problem Solving Too)

Chapter 5. The Standard Template Library.

Problem Set 1. You might want to read Chapter 12 of the course reader before attempting this problem.

Standard Template Library STL

Document Number: P0429R4 Date: Reply to: 0.1 Revisions... 1

Associative Containers and Iterators. Ali Malik

A Standard flat_map. 1 Revisions. 2 Introduction. 3 Motivation and Scope. 1.1 Changes from R Changes from R0

Summary. Design. Layout

CPSC 427a: Object-Oriented Programming

19.1 The Standard Template Library

Copy Constructors and Assignment Operators

Advanced Associative Containers. Ali Malik

Exercise 6.2 A generic container class

STL Containers, Part I

A Standard flat_map. Drop the requirement on container contiguity; sequence container will do.

2

Trees. Chapter 6. strings. 3 Both position and Enumerator are similar in concept to C++ iterators, although the details are quite different.

CSCI-1200 Data Structures Fall 2017 Lecture 10 Vector Iterators & Linked Lists

Function Templates. Consider the following function:

A Standard flat_map. 1 Introduction. 2 Motivation and Scope

Outline. Variables Automatic type inference. Generic programming. Generic programming. Templates Template compilation

CSE 100: C++ TEMPLATES AND ITERATORS

The STL <functional> Library

Basic Array Implementation Exception Safety

Container Notes. Di erent Kinds of Containers. Types Defined by Containers. C++11 Container Notes C++11

CSE 100: C++ TEMPLATES AND ITERATORS

Const Correctness. Ali Malik

A Standard flat_set. This paper outlines what a (mostly) API-compatible, non-node-based set might look like.

Prefix Trees Tables in Practice

CS 11 Ocaml track: lecture 4. Today: modules

Splicing Maps and Sets (Revision 1)

CS11 Advanced C++ Fall Lecture 1

CS193D Handout 12 Winter 2005/2006 January 30, 2006 Introduction to Templates and The STL

Chapter 1 Getting Started

Template based set of collection classes STL collection types (container types)

Lecture 21 Standard Template Library. A simple, but very limited, view of STL is the generality that using template functions provides.

CE221 Programming in C++ Part 1 Introduction

CSCI-1200 Data Structures Fall 2017 Lecture 17 Trees, Part I

CSCI-1200 Data Structures Fall 2016 Lecture 17 Associative Containers (Maps), Part 2

Pointers, Dynamic Data, and Reference Types

Computer Science II CSci 1200 Test 2 Overview and Practice

CS197c: Programming in C++

CSCI-1200 Data Structures Spring 2018 Lecture 16 Trees, Part I

by Pearson Education, Inc. All Rights Reserved. 2

COM S 213 PRELIM EXAMINATION #2 April 26, 2001

Major Language Changes, pt. 1

C++0x. Handout #34 June 2, CS106L Spring Introduction

CSCI-1200 Data Structures Spring 2018 Lecture 15 Associative Containers (Maps), Part 2

SSE2034: System Software Experiment 3

Chapter 21a Other Library Issues

Bruce Merry. IOI Training Dec 2013

Major Language Changes, pt. 1

STL: C++ Standard Library

CSC 427: Data Structures and Algorithm Analysis. Fall 2004

CS93SI Handout 04 Spring 2006 Apr Review Answers

List, Stack, and Queues

CSCI-1200 Computer Science II Fall 2008 Lecture 15 Associative Containers (Maps), Part 2

Getting started with C++ (Part 2)

Chapter 17 vector and Free Store. Bjarne Stroustrup

CSCI-1200 Data Structures Fall 2013 Lecture 9 Iterators & Lists

CS201 - Introduction to Programming Glossary By

Computer Science II Lecture 2 Strings, Vectors and Recursion

Unit 4 Basic Collections

377 Student Guide to C++

CMSC 341 Lecture 6 Templates, Stacks & Queues. Based on slides by Shawn Lupoli & Katherine Gibson at UMBC

CSCI-1200 Data Structures Fall 2017 Lecture 9 Iterators & STL Lists

CSE 333 Autumn 2013 Midterm

Topics. bool and string types input/output library functions comments memory allocation templates classes

Lectures 11,12. Online documentation & links

Standard Template Library. Outline

CSCI-1200 Data Structures Spring 2015 Lecture 18 Trees, Part I

Lecture-5. STL Containers & Iterators

Software Engineering Concepts: Invariants Silently Written & Called Functions Simple Class Example

CSCI-1200 Data Structures Spring 2018 Lecture 8 Templated Classes & Vector Implementation

DataStruct 9. Hash Tables, Maps, Skip Lists, and Dictionaries

Short Notes of CS201

Type Checking and Type Equality

COEN244: Class & function templates

Assignment 1: SmartPointer

CS349/SE382 A1 C Programming Tutorial

l Determine if a number is odd or even l Determine if a number/character is in a range - 1 to 10 (inclusive) - between a and z (inclusive)

III. Classes (Chap. 3)

The vector Class and Memory Management Chapter 17. Bjarne Stroustrup Lawrence "Pete" Petersen Walter Daugherity Fall 2007

CS193D Handout 10 Winter 2005/2006 January 23, 2006 Pimp Your Classes

CSCI-1200 Data Structures Fall 2018 Lecture 22 Hash Tables, part 2 & Priority Queues, part 1

CS106X Handout 03 Autumn 2012 September 24 th, 2012 Getting Started

CSCI-1200 Data Structures Fall 2010 Lecture 8 Iterators

CSCI-1200 Data Structures Spring 2018 Lecture 10 Vector Iterators & Linked Lists


Transcription:

CS106L Handout #08 Fall 2007 October 17, 2007 STL Containers, Part II Introduction Last week we talked about vector and deque, the STL's two managed array classes. However, the STL offers many other containers, two of which, set and map, we will cover in this handout. Because set and map are associative containers, they require a decent understanding of iterators and iterator syntax. At the end, we'll quickly talk about two special container classes, the multimap and multiset, which have no counterparts in the CS106 ADT library. Using typedef Liberally Now that you're using STL iterators more frequently, you might find it useful to liberally sprinkle the typedef statement to create shorthand for your types. For example, if you have a set<string>, the return type of its insert function will be pair<set<string>::iterator, bool>. Do you really want to have to type that out every single time? While it's by no means a requirement, you should strongly consider typedefing iterator types if you plan on using them frequently. For example, if you typedef set<string>::iterator as SetStrItr, then the syntax for the return type of insert reduces to pair<setstritr, bool>, which is still mouthful but is considerably better than the first. set The first container we'll cover today is the set. Like the CS106 Set, set is an ordered container representing an abstract collection of objects. You can test a set for inclusion, and can also insert and remove elements. However, unlike the CS106 Set, the STL set does not have intrinsic support for union, intersection, and difference, though next week we'll see how you can mimic the functionality using STL algorithms. Like the CS106 Set, when storing non-primitive types, the set needs extra information about how to compare those objects. Unfortunately, the syntax to provide a comparison callback for the set either uses complicated, error-prone syntax or requires an understanding of operator overloading, a language feature we haven't covered yet. Thus, for now, you should avoid storing custom data types in sets. The most basic operation you can perform on a set is insertion using the insert function. Unlike the deque and vector insert functions, you do not need to specify a location for the new element because the set automatically orders its elements. Here is some sample code using insert: set<int> myset; myset.insert(137); // Now contains: 137 myset.insert(42); // Now contains: 42 137 (in that order) myset.insert(137); // Now contains: 42 137 (note no duplicates)

To check whether the set contains an element, you can use the find function. find searches the set for an element and, if found, returns an iterator to it. Otherwise, find returns the value of the container's end function as a sentinel indicating the element wasn't found. For example, given the set initialized above: if(myset.find(137)!= myset.end()) cout << "The set contains 137." << endl; // This is printed. if(myset.find(0) == myset.end()) cout << "The set doesn't contain 0." << endl; // Also printed. Instead of find, you can also use the count function, which returns 1 if the element is contained in the set and 0 otherwise. Using C++'s automatic conversion of positive values into true and zero values to false, you can actually write much cleaner code using count. For example: if(myset.count(137)) cout << "137 is in the set." << endl; // Printed if(!myset.count(500)) cout << "500 is not in the set." << endl; // Printed If count is simpler than find, why use find? The reason is that sometimes you'll want an iterator to an element of a set for reasons other than determining membership. You might want to erase the value, for example, or perhaps you're trying to obtain iterators to a range of values for use with an STL algorithm. Thus, while count is quite useful, find is much more common in practice. You can remove elements from a set using erase. erase has several versions, one of which removes the value pointed at by the specified iterator, and another that removes the specified value from the set if it exists. For example: myset.erase(137); // Removes 137, if it exists. set<int>::iterator itr = myset.find(42); if(itr!= myset.end()) myset.erase(itr); Sometimes it can be useful to search the set for all elements within a certain range. For example, suppose we wanted to obtain iterators to all values in the set in the range [10, 100]. Using only the above functions, we'd have to iterate manually over the entire set looking for values that matched the criteria. However, set exports two useful functions called lower_bound and upper_bound that can greatly simplify this task. lower_bound returns an iterator to the first element in the set greater than or equal to the specified value, while upper_bound returns an iterator to the first element in the set that isn't less than or equal to the specified element. Using lower_bound to get an iterator to the first element in the range and iterating until we reach the value returned by upper_bound, we can loop over all elements in a the range. For example, the following loop iterates over all elements in the set in the range [10, 100]: set<int>::iterator stop = myset.upper_bound(100); for(set<int>::iterator itr = myset.lower_bound(10); itr!= stop; ++itr) //... perform tasks... A word of caution: you should not modify items pointed to by set iterators. Since sets are internally stored in sorted order, if you modify a value in a set in-place, you risk ruining the internal

ordering. With the elements no longer in sorted order, the set's searching routines may fail, leading to difficult-to-track bugs. If you want to apply a transformation to a set's elements, unless you're convinced the transformation will leave the elements in forward-sorted order, you should strongly consider creating a second set and storing the result there. The following table lists some of the most important set functions, though there are more. The entries in this table are also applicable for the map. Again, we haven't covered const yet, so for now it's safe to ignore it. Also, we haven't covered const_iterators, but for now you can just treat them as iterators that can't write any values. Constructor: set<t>() set<int> myset; Constructs an empty set. Constructor: set<t>(const set<t> &other) set<int> myotherset = myset; Constructor: set<t>(inputiterator start, InputIterator stop) size_type size() const Constructs a set that's a copy of another set. set<int> myset(myvec.begin(), myvec.end()); Constructs a set containing copies of the elements in the range [start, stop). Any duplicates are discarded, and the elements are sorted. Note that this function accepts iterators from any source. int numentries = myset.size(); Returns the number of elements contained in the set. bool empty() const if(myset.empty()) {... } void clear() iterator begin() const_iterator begin() const iterator end() const_iterator end() pair<iterator, bool> insert(const T& value) void insert(inputiterator begin, InputIterator end) Returns whether the set is empty. myset.clear(); Removes all elements from the set. set<int>::iterator itr = myset.begin(); Returns an iterator to the start of the set. Be careful when modifying elements in-place. while(itr!= myset.end()) {... } Returns an iterator to the element one past the end of the final element of the set. myset.insert(4); myset.insert(myvec.begin(), myvec.end()); The first version inserts the specified value into the set. The return type is a pair containing an iterator to the element and a bool indicating whether the element was inserted successfully (true) or if it already existed (false). The second version inserts the specified range of elements into the set, ignoring duplicates.

set functions, contd. iterator find(const T& element) if(myset.find(0) == myset.end()) {... } Returns an iterator to the specified element if it exists, and end otherwise. size_type count(const T& item) const if(myset.count(0)) {... } size_type erase(const T& element) void erase(iterator itr); void erase(iterator start, iterator stop); iterator lower_bound(const T& value) iterator upper_bound(const T& value) Returns 1 if the specified element is contained in the set, and 0 otherwise. if(myset.erase(0)) {...} // 0 was erased myset.erase(myset.begin()); myset.erase(myset.begin(), myset.end()); Removes an element from the set. In the first version, the specified element is removed if found, and the function returns 1 if the element was removed and 0 if it wasn't in the set. The second version removes the element pointed to by itr. The final version erases elements in the range [start, stop). itr = myset.lower_bound(5); Returns an iterator to the first element that is greater than or equal to the specified value. This function is useful for obtaining iterators to a range of elements, especially in conjunction with upper_bound. itr = myset.upper_bound(100); Returns an iterator to the first element that is greater than the specified value. Because this element must be strictly greater than the specified value, you can iterate over a range until the iterator is equal to upper_bound to obtain all elements less than the parameter. pair Before starting map, let us first quickly discuss pair, a template struct that simply stores two elements of mixed types. pair accepts two template arguments and is declared as pair<typeone, TypeTwo>. pair has two fields, named first and second, which store the values of the two elements of the pair: first is a variable of type TypeOne, second of type TypeTwo. You can create a pair explicitly using this syntax: pair<int, string> mypair; mypair.first = myinteger; mypair.second = mystring; Alternatively, you can initialize the pair's members in the constructor as follows: pair<int, string> mypair(myinteger, mystring);

In some instances, you will need to create a pair on-the-fly to pass as a parameter (especially to the map's insert). You can therefore use the make_pair function as follows: pair<int, string> mypair = make_pair(137, string("string!")); Note that when passing a string literal to make_pair you must manually typecast the string from its normal C string form into an actual C++ string. make_pair will automatically return a pair of the types of its arguments, and unless you manually force C strings to become C++ strings, you'll get back a pair<int, char *> instead of a pair<int, string>. This is an easy mistake to make, but can generate a slew of awful compiler errors. Of course, if you truly want a pair<int, char *>, avoid the typecast. map The map is one of the STL's most useful containers. Unlike the CS106 Map, the STL map lets you choose what key type you want to use. That is, the STL map can map strings to ints, ints to strings, or even vector<int>s to deque<double>s. Also unlike the CS106 Map, the STL map stores its elements in sorted order. That is, if you iterate over the keys in a map<string, int>, you'll get all of the string keys back in alphabetical order. If you want to use nonstandard types as keys in a map for example, your own custom structs you will need to provide a comparison function. Again, like set, providing this function is quite difficult and we will not discuss how until we cover operator overloading. To declare a map, use the following syntax: map<keytype, ValueType> mymap; For example, to create a map similar to the CS106 Map<int>, you'd write map<string, int>, a mapping from strings to ints. Unfortunately, while the STL map is quite powerful, it is a syntax nightmare because its functions and iterators need to accept, return, and manipulate two values. Thus the map's iterators act as pointers to pair<const KeyType, ValueType>s, a pair of an immutable key and a mutable value. For example, to iterate over a map<string, int> and print out all its key/value pairs, you can use the following loop: for(map<string, int>::iterator itr = mymap.begin(); itr!= mymap.end(); ++ itr) cout << itr->first << ": " << itr->second << endl; Note that the key is itr->first and the value is itr->second. While you cannot change the value of the key during iteration, * you can modify its associated value. As with the set, you can use find to return an iterator to an element in a map given a key. For example, to check to see if STL is a key in a map<string, int>, you can use this syntax: if(mymap.find("stl")!= mymap.end()) {... } * As with the set, changing the values of the keys in a map could destroy the internal ordering and render the map non-functional. However, unlike the set, if you try to modify a key with an iterator, you'll get a compile-time error.

Alternatively, you can use the count function as you would with a set. Because the map stores everything as pairs, inserting a new value into a map is a syntax headache. You must manually create the pair to insert using the make_pair function discussed earlier. Thus, if you wanted to insert the key/value pair ( STL, 137) into a map<string, int>, you must use this syntax: mymap.insert(make_pair(string("stl"), 137)); Unlike the CS106 Map, if you try to overwrite the value of an existing item with the insert function, the STL map won't modify the value. Thus, if you write mymap.insert(make_pair(string("stl"), 137)); // Inserted mymap.insert(make_pair(string("stl"), 42)); // Whoops! Not overwritten. The value associated with the key STL will still be the value 137. To check whether your insertion actually created a new value, the map's insert function returns a pair<iterator, bool> that contains information about the result of the operation. The first element of the pair is an iterator to the key/value pair you just inserted, and the second element is a bool indicating whether the insert operation set the value appropriately. If this value is false, the value stored in the map was not updated. Thus, you can write code like this to insert a key/value pair and manually update the value if the key already existed.: // Try to insert normally. pair<map<string, int>::iterator, bool> result; result = mymap.insert(make_pair(string("stl"), 137)); // If insertion failed, manually set the value. if(!result.second) result.first->second = 137; In the last line, the expression result.first->second is the value of the existing entry, since result.first yields an iterator pointing to the entry, so result.first->second is the value field of the iterator to the entry. As you can see, the pair can make for tricky, unintuitive code. There is an alternate syntax you can use for insertion involving the bracket operator. As with the CS106 map, you can access individual elements by writing mymap[key]. If the element doesn't already exist, it will be created. Furthermore, if the key already exists, the value will be updated. For example, you can use the bracket operator to insert a new element into a map<string, int> as follows: mymap["stl"] = 137; You can also use the bracket operator to read a value from the map. As with insertion, if the element doesn't exist, it will be created. For example: int myvalue = mymap["stl"]; If the bracket notation is so much more convenient, why even bother with insert? As with everything in the STL, the reason is efficiency. Suppose you have a

map<string, vector<int> >. If you insert a key/value pair using mymap.insert(make_pair(string("stl"), myvector)) the newly created vector<int> will be initialized to the contents of myvector. With mymap["stl"] = myvector, the bracket operation will first create an empty vector<int>, then assign myvector on top of it. This requires more allocations and deallocations and consequently is much slower. Again, as an STL programmer, you are working with professional libraries. If you want to go for raw speed, yes, you will have to deal with more complicated syntax, but if you want to just get the code up and running, then the STL won't have any reservations. multimap and multiset The STL provides two special multicontainer classes, multimap and multiset, that act as maps and sets except that the values and keys they store are not necessarily unique. That is, a multiset could contain several copies of the same value, while a multimap might have duplicate keys associated with different values. multimap and multiset have identical syntax to map and set, except that some of the functions work slightly differently. For example, the count function will return the number of copies of an element an a multicontainer, not just a binary zero or one. Also, while find will still return an iterator to an element if it exists, the element it points to is not guaranteed to be the only copy of that element in the multicontainer. Finally, the erase function will erase all copies of the specified key or element, not just the first it encounters. One function that's quite useful for the multicontainers is the equal_range function. equal_range returns a pair<iterator, iterator> that represents the span of entries equal to the specified value. For example, given a multimap<string, int>, you could use the following code to iterate over all entries with key STL : // Assume we have multimap<string, int> mymultimap. // Store the result of the equal_range pair<multimap<string, int>::iterator, multimap<string, int>::iterator> mypair = mymultimap.equal_range("stl"); // Iterate over it! for(multimap<string, int>::iterator itr = mypair.first; itr!= mypair.second; ++itr) cout << itr->first << ": " << itr->second << endl; As you can see, you might find yourself needing to typedef your iterator types a bit more with the multicontainers, but nonetheless they are quite useful.

More to Explore In this handout we covered map and set, which combined with vector and deque are the most commonly-used STL containers. However, there are several others we didn't cover, a few of which might be worth looking into. Here are some topics you might want to read up on: 1. list: vector and deque are both sequential containers that mimic built-in arrays. The list container, however, models a sequence of elements without indices. list supports several nifty operations, such as merging, sorting, and splicing, and has quick insertions at almost any point. If you're planning on using a linked list for an operation, the list container is perfect for you. 2. The Boost Containers: As mentioned in the string handout, the Boost C++ Libraries are a collection of functions and classes developed to augment C++'s native library support. Boost offers several new container classes that might be worth looking into. For example, multi_array is a container class that acts as a Grid in any number of dimensions. Also, the unordered_set and unordered_map act as replacements to the set and map that use hashing instead of tree structures to store their data. If you're interested in exploring these containers, head on over to www.boost.org. Practice Problems Some of these are programming exercises, and some of these are just cool things to think about. Play around with them a bit to get some practice with map and set! 1. Using the time trial code from lecture, time the speed differences of the map's insert function versus the bracket syntax in a map<int, int>, then repeat the trial for map<int, string>. What does this tell you about the time it takes to copy objects? 2. To understand the sheer madness of STL error messages, given a map<string, int> called mymap, write mymap.insert(make_pair("stl", 137)) instead of mymap.insert(make_pair(string("stl"), 137)). How many errors do you get? What if you forget the call to make_pair and instead write mymap.insert(string("stl"), 137)? 3. Write a function NumberDuplicateEntries that accepts a map<string, string> and returns the number of duplicate values in the map. 4. As mentioned earlier, you can use a combination of lower_bound and upper_bound to iterate over elements in the closed interval [min, max]. What combination of these two functions could you use to iterate over the interval [min, max)? What about (min, max] and (min, max)? 5. Write a function PrintMatchingPrefixes that accepts a set<string> and a string containing a prefix and prints out all of the entries of the set that begin with that prefix. Your function should only iterate over the entires it finally prints out. You can assume the prefix is nonempty and should treat prefixes case-sensitively. (Hint: In a set<string>, strings are sorted lexicographically, so all strings that start with abc will come before all strings that start with abd. )