Linked Lists
Spring 2016 CS 107 Version

I. Motivating Example

We'll dive right in with an example linked list. Our list will hold the values 1, 2, and 3.

Linked lists can easily grow and shrink. In that spirit, let's
  - Add 4 to the end of the list.
  - Delete the 2.

Notice that as the list changes, we use exactly as much memory as we need, with none of the extra space we would likely have using an array as a container. We can also change the list anywhere without shifting any values, because the changes are made by redirecting pointers.

II. Representing Nodes and Lists

Let's begin with some important vocabulary:
  - Each of the items in the list is stored in a node. A node contains some data and a pointer to some other node.
  - We call the data values we store in each node keys.
  - The list has a defined order in that each node has a defined predecessor before it and a defined successor after it, with two exceptions:
      - The start of the list is called the head; the head has no predecessor.
      - The end of the list is called the tail; the tail has no successor.

So, how do we represent a list? It turns out we don't need any new syntax whatsoever; a list is something we build up from the tools we already know.

The first step is to represent a node. We'll keep things as simple as possible and say we'll work with a linked list of integers. Then, a node is simply a record with two fields: the integer data and a pointer to the next node.

Problem: Write actual code to define such a record.

Page 1 of 7 Prepared by D. Hogan
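One possible answer to the problem above, as a minimal sketch in C++. The type name NodeType and the field names data and next are assumptions chosen for illustration; the link is spelled out explicitly as a pointer to the node type.

```cpp
// A node in a singly linked list of integers.
// NodeType, data, and next are illustrative names (assumptions).
struct NodeType {
    int data;        // the key stored in this node
    NodeType* next;  // pointer to the successor; null for the tail
};
```

A two-node list could then be set up by creating two NodeType variables and storing the address of the second in the next field of the first.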
It turns out that it can get syntactically ugly to deal with pointers so explicitly, so it's customary to define a type for a pointer to a node and use that type in our definition. Here's what I mean:

    struct NodeType;
    typedef NodeType * NodePtr;

    struct NodeType
    {
        int data;
        NodePtr next;
    };

Then, to represent a list, we
  - maintain a pointer to the head of the list (we can use the "pointer to a node" type we just discussed), which, it turns out, is how we store the list itself
  - define nodes for each of the values in the list, where each node points to its successor (the "links")
  - use a null pointer for the successor of the last node

III. Traversing a List

Very well. We know how to represent lists, so it's time for the first fundamental operation on a linked list: traversal of the list (a.k.a. walking the list). Once again, we don't need anything special to do this. We just need to have the head of the list. Here's the gist:
  1. Follow the head pointer to the first node and output the data there.
  2. Check whether the first node has a successor. If it does, follow the pointer to that node and print it out. If it doesn't, stop.
  3. Repeat Step 2 for the current node.
  4. Continue doing this until we reach the end of the list. How do we know we're done?

So, this process essentially boils down to a loop.

Question: Is this loop determinate or indeterminate? What loop is the ideal choice and why?

Before we begin, let's define a temporary variable to hold the node we're currently inspecting:

Where should this variable start?

Now, let's set up the loop to walk the list and display each value:
This technique works for all lists, and you'll often want to print out the state of a list to inspect it, so it behooves you to package this algorithm in a method.

IV. Hardcoding a List

Now we'll work together through how a list works conceptually via a concrete example. This will also function as a lab exercise, as you'll represent this list in code next time. Here in the notes, we'll illustrate the several concepts and write down pseudocode for each operation, and then you'll turn those operations into code.

Now, we'll proceed through several steps. In your lab, label the start of each step both with a comment and a message to the screen telling what step you're on, and after each step, use the list traversal method to print the progress to the screen.

Step 1: Create a head node and store 5 in the first location.

Step 2: Now insert 8 at the beginning of the list.

Step 3: Insert 12 at the end of the list.
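For reference, here is a rough sketch of how the five steps of this exercise (Steps 1 to 5 of Section IV) might look when hardcoded. The function names buildLabList and printList and the local variable names are assumptions made up for this sketch; the explicit (*p).field form is used on purpose, since every pointer is dereferenced by hand.

```cpp
#include <iostream>

struct NodeType;
typedef NodeType* NodePtr;

struct NodeType {
    int data;
    NodePtr next;
};

// Hypothetical helper: prints every key in the list on one line.
void printList(NodePtr head) {
    for (NodePtr cur = head; cur != nullptr; cur = (*cur).next)
        std::cout << (*cur).data << ' ';
    std::cout << '\n';
}

// Hypothetical function: performs Steps 1-5 and returns the head.
NodePtr buildLabList() {
    // Step 1: create a head node holding 5.
    std::cout << "Step 1\n";
    NodePtr head = new NodeType;
    (*head).data = 5;
    (*head).next = nullptr;
    printList(head);

    // Step 2: insert 8 at the beginning; the new node becomes the head.
    std::cout << "Step 2\n";
    NodePtr eight = new NodeType;
    (*eight).data = 8;
    (*eight).next = head;
    head = eight;
    printList(head);

    // Step 3: insert 12 at the end; the tail is currently the second node (5).
    std::cout << "Step 3\n";
    NodePtr twelve = new NodeType;
    (*twelve).data = 12;
    (*twelve).next = nullptr;
    (*(*head).next).next = twelve;
    printList(head);

    // Step 4: insert 7 between 5 and 12.
    std::cout << "Step 4\n";
    NodePtr seven = new NodeType;
    (*seven).data = 7;
    (*seven).next = (*(*head).next).next;  // 7 points to 12
    (*(*head).next).next = seven;          // 5 points to 7
    printList(head);

    // Step 5: delete the 5, which is the successor of the head.
    std::cout << "Step 5\n";
    NodePtr five = (*head).next;
    (*head).next = (*five).next;           // 8 now points to 7
    delete five;
    printList(head);

    return head;
}
```

After all five steps the list holds 8, 7, and 12, in that order. The caller is responsible for freeing the remaining nodes.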
Step 4: Insert 7 between 5 and 12.

Step 5: Delete the 5.

V. C++ Syntax Interlude

You may notice that many times in the lab exercises, we dereferenced a pointer to a record for a node and then immediately accessed fields from that node, e.g.

    (*somenode).data
    (*somenode).next

That's ugly, and we do it so often that C++ has a built-in operator to dereference a pointer to a struct variable and then access a field of it: ->

The general form is

    pointer->field

that is equivalent to

    (*pointer).field

So, we could rewrite the above more succinctly as

    somenode->data
    somenode->next
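The payoff of the arrow operator is clearest when accesses are chained. The sketch below reads the key of the second node both ways; the function names secondKeyLong and secondKeyShort are made up for illustration, and both assume the list has at least two nodes.

```cpp
struct NodeType;
typedef NodeType* NodePtr;

struct NodeType {
    int data;
    NodePtr next;
};

// Key of the second node, written with explicit dereferences.
// Assumes head and (*head).next are both non-null.
int secondKeyLong(NodePtr head) {
    return (*(*head).next).data;
}

// The same access written with the arrow operator.
int secondKeyShort(NodePtr head) {
    return head->next->data;
}
```

Both functions compile to the same access; only the notation differs.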
VI. Generalizing Lists

That lab was fun. Okay, but it was pretty tedious too. Suppose you had a list with hundreds of nodes. You probably wouldn't want to dereference, say, 107 pointers to insert a node. I know I certainly wouldn't.

But it turns out that from working with a very concrete example of a linked list, we're ready to generalize the whole concept, and it won't be too difficult. We can abstract the ideas we used before.

Really, as before, all we need to reference a linked list is a pointer to its head node. This is, in fact, one way to work with linked lists. To summarize, we could:
  - Define a record type to hold a list node
  - Use a pointer to the head of a list to refer to a list
  - Pass the head pointer into every algorithm we have for working with lists

We'll use this style here conceptually. Another way of doing it is to create a data type for a list; doing so involves classes.

VII. Generalizing Insertion

We covered all three cases of where we could insert a new node in our concrete list example:
  - at the head
  - at the tail
  - somewhere in the middle

Let's work with the supposition we have a list with head pointer head and we wish to insert a new node with a key k.

In most applications when we insert, we'll want to insert at one of the extremes, because inserting in the middle generally requires either more input or having restrictions on our list. We'll start with those two cases.

Case of insertion at the head:

Illustration:

Pseudocode:
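Insertion at the head might be sketched in C++ as follows. The function name insertAtHead is an assumption, as is the choice to pass the head pointer by reference so that the caller's head is updated; returning the new head would work equally well.

```cpp
struct NodeType;
typedef NodeType* NodePtr;

struct NodeType {
    int data;
    NodePtr next;
};

// Inserts a new node with key k in front of the current head.
// insertAtHead is an illustrative name; head is passed by reference
// (an assumed design choice) because the head itself changes.
void insertAtHead(NodePtr& head, int k) {
    NodePtr node = new NodeType;
    node->data = k;
    node->next = head;  // the old head (possibly null) becomes the successor
    head = node;        // the new node is now the head
}
```

Because the old head simply becomes the successor of the new node, the same code works for an empty list, where head is null.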
Case of insertion at the tail:

Illustration:

Pseudocode:

VIII. Generalizing Deletion

Let's look at each of the three cases of where we could delete a node:

Case of deletion at the head:

Case of deletion at the tail:

Case of deletion in the middle:
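The cases on this page (insertion at the tail and the three deletion cases) could be sketched in C++ roughly as below. All four function names are assumptions, as is the decision to treat an empty list as a no-op for the deletions; for the middle case the sketch assumes the caller already has a pointer to the predecessor of the node to remove.

```cpp
struct NodeType;
typedef NodeType* NodePtr;

struct NodeType {
    int data;
    NodePtr next;
};

// Appends a new node with key k after the current tail.
void insertAtTail(NodePtr& head, int k) {
    NodePtr node = new NodeType;
    node->data = k;
    node->next = nullptr;              // the new node will be the tail
    if (head == nullptr) {             // empty list: new node is also the head
        head = node;
        return;
    }
    NodePtr cur = head;
    while (cur->next != nullptr)       // walk to the current tail
        cur = cur->next;
    cur->next = node;
}

// Removes the head node, if there is one.
void deleteAtHead(NodePtr& head) {
    if (head == nullptr) return;
    NodePtr old = head;
    head = head->next;                 // the successor becomes the head
    delete old;
}

// Removes the tail node, if there is one.
void deleteAtTail(NodePtr& head) {
    if (head == nullptr) return;
    if (head->next == nullptr) {       // one node: the list becomes empty
        delete head;
        head = nullptr;
        return;
    }
    NodePtr pred = head;
    while (pred->next->next != nullptr)  // stop at the tail's predecessor
        pred = pred->next;
    delete pred->next;
    pred->next = nullptr;
}

// Removes the node that follows pred (deletion in the middle).
void deleteAfter(NodePtr pred) {
    if (pred == nullptr || pred->next == nullptr) return;
    NodePtr cur = pred->next;
    pred->next = cur->next;            // link around the removed node
    delete cur;
}
```

Note that every case except deletion at the head comes down to updating the next field of a predecessor, which is what makes a single general deletion routine possible.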
It turns out all the cases we looked at above can then be abstracted into the same thing. (By the way, there's a big problem solving and mathematical thinking lesson here: divide a problem up into all of the possible cases, analyze those cases individually, and then abstract them as best you can.)

To make things easier, let's make an assumption that all keys in our linked list are unique. (While not a strict requirement, from a data structures perspective, this is common.)

We can generalize with pseudocode again. Here, let's say we are deleting a key k from a linked list whose head is pointed to by head and restricted such that all keys are unique. Here's proposed pseudocode:

    LINKED-LIST-DELETE(head, k)
    {
        Let pred be a pointer to a list node that will eventually hold
            the predecessor of the node to delete
        Let cur be a pointer to a list node

        cur = head
        while cur ≠ NIL and cur->data ≠ k
        {
            pred = cur
            cur = cur->next
        }

        pred->next = cur->next    // or pred->next = pred->next->next
        deallocate cur
    }

Question: What does this pseudocode do if k isn't found? Should we handle that with a precondition or?
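One possible C++ rendering of LINKED-LIST-DELETE is sketched below. It departs from the pseudocode in two ways, both of which are assumptions of this sketch rather than part of the notes: it reports a missing key by returning false instead of relying on a precondition, and it handles the case where the node to delete is the head, which has no predecessor. The name deleteKey is made up for illustration.

```cpp
struct NodeType;
typedef NodeType* NodePtr;

struct NodeType {
    int data;
    NodePtr next;
};

// Deletes the node holding key k, assuming keys are unique.
// Returns true if a node was removed and false if k was not found.
// deleteKey and the bool result are assumptions of this sketch.
bool deleteKey(NodePtr& head, int k) {
    NodePtr pred = nullptr;  // predecessor of cur; null while cur is the head
    NodePtr cur = head;
    while (cur != nullptr && cur->data != k) {
        pred = cur;
        cur = cur->next;
    }
    if (cur == nullptr)      // reached the end: k is not in the list
        return false;
    if (pred == nullptr)     // the head holds k: move the head forward
        head = cur->next;
    else                     // link the predecessor around the node
        pred->next = cur->next;
    delete cur;
    return true;
}
```

Without the two extra checks, a missing key would lead to dereferencing a null cur, and deleting the head would use pred before it has been given a value.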