CS107 Handout 08 Spring 2007 April 9, 2007 The Ins and Outs of C Arrays

Similar documents
Dynamic Allocation in C

Dynamic Allocation in C

CS107 Handout 13 Spring 2008 April 18, 2008 Computer Architecture: Take II

CS 11 C track: lecture 5

CS 61C: Great Ideas in Computer Architecture. C Arrays, Strings, More Pointers

Pointers. 1 Background. 1.1 Variables and Memory. 1.2 Motivating Pointers Massachusetts Institute of Technology

Pointers, Dynamic Data, and Reference Types

So far, system calls have had easy syntax. Integer, character string, and structure arguments.

SYSC 2006 C Winter 2012

Agenda. Peer Instruction Question 1. Peer Instruction Answer 1. Peer Instruction Question 2 6/22/2011

In Java we have the keyword null, which is the value of an uninitialized reference type

PIC 10A Pointers, Arrays, and Dynamic Memory Allocation. Ernest Ryu UCLA Mathematics

CS 61c: Great Ideas in Computer Architecture

CS201- Introduction to Programming Current Quizzes

Pointer Basics. Lecture 13 COP 3014 Spring March 28, 2018

Declaring Pointers. Declaration of pointers <type> *variable <type> *variable = initial-value Examples:

Memory Corruption 101 From Primitives to Exploit

CS162 - POINTERS. Lecture: Pointers and Dynamic Memory

by Pearson Education, Inc. All Rights Reserved.

Today s lecture. Pointers/arrays. Stack versus heap allocation CULTURE FACT: IN CODE, IT S NOT CONSIDERED RUDE TO POINT.

Arrays and Memory Management

Intermediate Programming, Spring 2017*

Dynamic Data Structures. CSCI 112: Programming in C

Today s lecture. Continue exploring C-strings. Pointer mechanics. C arrays. Under the hood: sequence of chars w/ terminating null

Chapter 1 Getting Started

CS 222: Pointers and Manual Memory Management

CS113: Lecture 9. Topics: Dynamic Allocation. Dynamic Data Structures

Common Misunderstandings from Exam 1 Material

Essential C Nick Parlante, 1996.Free for non-commerical use.

Lecture 04 Introduction to pointers

calling a function - function-name(argument list); y = square ( z ); include parentheses even if parameter list is empty!

Programming. Pointers, Multi-dimensional Arrays and Memory Management

CSC 211 Intermediate Programming. Arrays & Pointers

Arrays and Pointers (part 1)

Arrays and Pointers in C. Alan L. Cox

Pointers in C/C++ 1 Memory Addresses 2

Systems Programming and Computer Architecture ( )

CS 61C: Great Ideas in Computer Architecture C Pointers. Instructors: Vladimir Stojanovic & Nicholas Weaver

Pointers. Memory. void foo() { }//return

Chapter 17 vector and Free Store

CA31-1K DIS. Pointers. TA: You Lu

ECE 15B COMPUTER ORGANIZATION

Lecture 14. No in-class files today. Homework 7 (due on Wednesday) and Project 3 (due in 10 days) posted. Questions?

Vector and Free Store (Pointers and Memory Allocation)

Memory, Arrays & Pointers

C Review. MaxMSP Developers Workshop Summer 2009 CNMAT

Chapter 17 vector and Free Store

EL2310 Scientific Programming

Pointers (part 1) What are pointers? EECS We have seen pointers before. scanf( %f, &inches );! 25 September 2017

Arrays. Returning arrays Pointers Dynamic arrays Smart pointers Vectors

Lecture07: Strings, Variable Scope, Memory Model 4/8/2013

Arrays and Pointers. CSE 2031 Fall November 11, 2013

Arrays, Pointers and Memory Management

Chapter 17 vector and Free Store. Bjarne Stroustrup

nptr = new int; // assigns valid address_of_int value to nptr std::cin >> n; // assigns valid int value to n

Section Notes - Week 1 (9/17)

Pointers and Memory 1

Announcements. assign0 due tonight. Labs start this week. No late submissions. Very helpful for assign1

CS61C Machine Structures. Lecture 5 C Structs & Memory Mangement. 1/27/2006 John Wawrzynek. www-inst.eecs.berkeley.edu/~cs61c/

CS 161 Computer Security

CS61C : Machine Structures

Pointers, Arrays, Memory: AKA the cause of those Segfaults

Programs in memory. The layout of memory is roughly:

Administrivia. Introduction to Computer Systems. Pointers, cont. Pointer example, again POINTERS. Project 2 posted, due October 6

Homework #3 CS2255 Fall 2012

Arrays and Pointers (part 1)

Arrays and Pointers. Arrays. Arrays: Example. Arrays: Definition and Access. Arrays Stored in Memory. Initialization. EECS 2031 Fall 2014.

C++ for Java Programmers

CS61C Machine Structures. Lecture 4 C Pointers and Arrays. 1/25/2006 John Wawrzynek. www-inst.eecs.berkeley.edu/~cs61c/

Introduction to Linked Lists

CS2351 Data Structures. Lecture 7: A Brief Review of Pointers in C

CS 330 Lecture 18. Symbol table. C scope rules. Declarations. Chapter 5 Louden Outline

C Pointers. 6th April 2017 Giulio Picierro

Dynamic Memory Allocation (and Multi-Dimensional Arrays)

Introduction to C: Pointers

11 'e' 'x' 'e' 'm' 'p' 'l' 'i' 'f' 'i' 'e' 'd' bool equal(const unsigned char pstr[], const char *cstr) {

Computer Science 50: Introduction to Computer Science I Harvard College Fall Quiz 0 Solutions. Answers other than the below may be possible.

Ch. 11: References & the Copy-Constructor. - continued -

CS 2461: Computer Architecture I

CS61C Machine Structures. Lecture 4 C Structs & Memory Management. 9/5/2007 John Wawrzynek. www-inst.eecs.berkeley.edu/~cs61c/

Agenda. Components of a Computer. Computer Memory Type Name Addr Value. Pointer Type. Pointers. CS 61C: Great Ideas in Computer Architecture

CS 61C: Great Ideas in Computer Architecture. Lecture 3: Pointers. Krste Asanović & Randy Katz

Week 9 Part 1. Kyle Dewey. Tuesday, August 28, 12

Kurt Schmidt. October 30, 2018

Abstract. Vector. Overview. Building from the ground up. Building from the ground up 1/8/10. Chapter 17 vector and Free Store

Fall 2018 Discussion 2: September 3, 2018

Arrays and Pointers.

C Language Part 1 Digital Computer Concept and Practice Copyright 2012 by Jaejin Lee

Pointers. A pointer is simply a reference to a variable/object. Compilers automatically generate code to store/retrieve variables from memory

The vector Class and Memory Management Chapter 17. Bjarne Stroustrup Lawrence "Pete" Petersen Walter Daugherity Fall 2007

Understanding Pointers

Memory and Addresses. Pointers in C. Memory is just a sequence of byte-sized storage devices.

CS 61C: Great Ideas in Computer Architecture. Lecture 3: Pointers. Bernhard Boser & Randy Katz

Review: C Strings. A string in C is just an array of characters. Lecture #4 C Strings, Arrays, & Malloc

A brief introduction to C programming for Java programmers

CS61C : Machine Structures

DAY 3. CS3600, Northeastern University. Alan Mislove

Principles of Programming Pointers, Dynamic Memory Allocation, Character Arrays, and Buffer Overruns

CS61C : Machine Structures

Pointers and Arrays CS 201. This slide set covers pointers and arrays in C++. You should read Chapter 8 from your Deitel & Deitel book.

Transcription:

CS107 Handout 08 Spring 2007 April 9, 2007 The Ins and Outs of C Arrays C Arrays This handout was written by Nick Parlante and Julie Zelenski. As you recall, a C array is formed by laying out all the elements contiguously in memory from low to high. The array as a whole is referred to by the address of the first element. For example, the variable intarray below is synonymous with the address of the first element and can be used in expressions like an int *. int intarray[6]; intarray[0] intarray[1]... The programmer can refer to elements in the array with the simple [] syntax such as intarray[1]. This scheme works by combing the base address of the array with the index to compute the base address of the desired element in the array, using some simple arithmetic. Each element takes up a fixed number of bytes known at compiletime. So address of the nth element in the array (0-based indexing) will be at an offset of (n * element_size) bytes from the base address of the whole array. address of nth element = address_of_0th_element + (n * element_size_in_bytes) The square bracket syntax [] deals with this address arithmetic for you, but it's useful to know what it's doing. The [] multiplies the integer index by the element size, adds the resulting offset to the array base address, and finally dereferences the resulting pointer to get to the desired element. intarray[3] = 13; base + 3*elem_size Assume sizeof(int) = 4. Each element takes up 4 bytes. index Offset in bytes 13 [0] [1] [2] [3] [4] [5] 0 4 8 12 16 20

'+' Syntax In a closely related piece of syntax, adding an integer to a pointer does the same offset computation, but leaves the result as a pointer. The square bracket syntax dereferences that pointer to access the nth element while the + syntax just computes the pointer to the nth element. So the expression (intarray + 3) is a pointer to the integer intarray[3]. (intarray + 3) is of type (int*) while intarray[3] is of type int. The two expression only differ by whether the pointer is dereferenced or not. So the expression (intarray + 3) is exactly equivalent to the expression (&(intarray[3])). In fact those two probably compile to exactly the same code. They both represent a pointer to the element at index 3. Any [] expression can be written with the + syntax instead. We just need to add in the pointer dereference. So intarray[3] is exactly equivalent to *(intarray + 3). For most purposes, it's easiest and most readable to use the [] syntax. Every once in a while the + is convenient if you needed a pointer to the element instead of the element itself. Pointer++ If p is a pointer to an element in an array, then (p+1) points to the next element in the array. Code can exploit this using the construct p++ to step a pointer over the elements in an array. It doesn't help readability any, so I can't recommend the technique, but you may see it in code written by others. Here's a sequence of versions of strcpy written in order: from most verbose to most cryptic. In the first one, the straightforward for loop needs a little follow-up to ensure that the terminating null character is copied over. The second removes that opportunity for error by rearranging into a while loop and moving the assignment into the test. The last two are cute (and they demonstrate using ++ on pointers), but not really the sort of code you want to maintain. Among the four, I think the second is the best stylistically. With a smart compiler, all four will compile to basically the same code with the same efficiency. void strcpy1(char dest[], const char source[]) int i; 2 for (i = 0; source[i]!= '\0'; i++) dest[i] = source[i]; dest[i] = '\0'; // don't forget to null-terminate!

3 // Move the assignment into the test void strcpy2(char dest[], const char source[]) int i = 0; while ((dest[i] = source[i])!= '\0') i++; // Get rid of i and just move the pointers. // Relies on the precedence of * and ++. void strcpy3(char dest[], const char source[]) while ((*dest++ = *source++)!= '\0') ; // Rely on the fact that '\0' is equivalent to FALSE void strcpy4(char dest[], const char source[]) while (*dest++ = *source++) ; Pointer Type Effects Both [] and + implicitly use the compile time type of the pointer to compute the element_size which effects the offset arithmetic. When looking at code, it's easy to assume that everything is in the units of bytes. int *p; p = p + 12; // at run-time, what does this add to p? 12? The above code does not add the number 12 to the address in p that would increment p by 12 bytes. The code above increments p by 12 ints. Each int takes 4 bytes, so at run time the code will effectively increment the address in p by 48. The compiler figures all this out based on the type of the pointer. You can manipulate this using casts. For example, the following code really does just add 12 to the address in the pointer p. It works by telling the compiler that the pointer points to char instead of int. The size of char is defined to be exactly 1 byte (or whatever the smallest addressable unit is on the computer). In other words, sizeof(char) is always 1. We then cast the resulting (char*) back to an (int*). You can use casting like this to change the code the compiler generates. The compiler just blindly follows your orders. p = (int*) ((char*)p + 12); Arithmetic on a void pointer For a (void*) pointer, array subscripting and pointer arithmetic don't quite make sense. These manipulations include implicit multiplication by the size of the element

type, and what is sizeof(void)? Unknown! Some compilers assume that it should treat it like a (char*), but if you were to depend on this you would be creating nonportable code. To be precise and correct, you should cast the (void*) to (char*) before doing any math to make clear that all arithmetic is done in one-byte increments and any necessary multiplication will be done explicitly. void *ptr; p = (char*)ptr + 4; // increments ptr by exactly 4, no extra multiplication 4 Note that you do not need to cast the result back to (void*), a (void*) is the "universal recipient" of pointer types and can be freely assigned any type of pointer. It is best to use casts only when you absolutely must. Arrays and Pointers One effect of the C array scheme is that the compiler does not meaningfully distinguish between arrays and pointers they both just look like pointers. In the following example, the value of intarray is a pointer to the first element in the array so it's an (int*). The value of the variable intptr is also (int*) and it is set to point to a single integer i. So what's the difference between intarray and intptr? Not much as far as the compiler is concerned. They are both just (int*) pointers, and the compiler is perfectly happy to apply the [] or + syntax to either. It's the programmer's responsibility to ensure that the elements referred to by a [] or + operation really are there. Really its just the same old rule that C doesn't do any bounds checking. C thinks of the single integer i as just a sort of degenerate array of size 1. int intarray[6]; int *intptr; int i; intptr = &i; intarray[3] = 13; // ok intptr[0] = 12; // odd, but ok. Changes i. intptr[3] = 13; // BAD! There is no integer reserved here!

5 intarray (intarray+3) [0] [1] [2] [3] [4] [5] intptr (intptr+3) 12 13 i These bytes exist, but they have not been explicitly reserved. They are the bytes which happen to be adjacent to the memory for i. They are probably being used to store something already, such as a smashed looking smiley face. The 13 just gets blindly written over the smiley face. This error will only be apparent later when the program tries to read the smiley face data. Array Names Are Const One subtle distinction between an array and a pointer, is that the pointer which represents the base address of an array cannot be changed in the code. Technically, the array base address is a const pointer. The constraint applies to the name of the array where it is declared in the code the variable intarray in the example below.

6 int intarray[100] int *intptr; int i; intarray = NULL; intarray = &i; intarray = intarray + 1; intarray++; // no, cannot change the base addr ptr // no // no // no intptr = intarray; // ok, intptr is a regular pointer which can be changed // here it is now pointing to same array intarray is intptr++; // ok, intptr can still be changed (and intarray cannot) intptr = NULL; // ok intptr = &i; // ok foo(intarray); // ok (possible foo definitions are below) foo(intptr); // ditto Array parameters are passed as pointers. The arguments to the two definitions of foo look different, but to the compiler they mean exactly the same thing. It's preferable to use whichever syntax is more accurate for readability. If the pointer coming in is going to be treated as the base address of a whole array, then use [], if it is merely a pointer to one integer, the * notation is more appropriate. void foo1(int arrayparam[]) arrayparam = NULL; // Silly but valid. Just changes the local pointer void foo2(int *arrayparam) arrayparam = NULL; // ditto Note that either intarray or intptr (from above) could be passed to either version of foo. In each case, the address of an integer is passed for intarray it is a copy of base address of the array, for intptr, it is a copy of its current value. Either way, once in the function foo, either pointer can be changed to point elsewhere without affecting the value of the pointers in the calling function.

Dynamic Arrays Since arrays are just contiguous areas of bytes, you can allocate your own arrays in the heap using malloc or operator new[]. The following code allocates three arrays of 1000 ints one in the stack the usual way, one in the heap using C s malloc, and the third using C++ s operator new[] functionality. Other than the different allocations, the two are syntactically similar in use. 7 int a[1000]; int *b, *c; b = malloc(sizeof(int) * 1000); c = new int[1000]; a[123] = 13; b[123] = 13; c[123] = 13; // Just use good ol' [] to access elements // in all three arrays. free(b); delete[] c; There are some key differences: Advantages of being in the heap Size (in this case 1000) can be decided at run time. Not so for an array like a. The array will exist until it is explicitly deallocated with a call to free or operator delete[]. This means a pointer to the array can safely be returned from the function not so with a local stack variable that is deallocated on function exit. You can change the size of the malloc ed array at will at run time using realloc. The following changes the size of the array to 2000. The realloc function will potentially just stretch the original region to the new size, but if necessary, it will move the region to a new location, copy over the old elements, and free the previous region. (Note that there s no C++ equivalent to realloc.) b = realloc(b, sizeof(int) * 2000); Disadvantages of being in the heap You have to remember to allocate the array, and you have to get it right. You have to remember to deallocate it exactly once when you are done with it, and you have to get that right. The above two disadvantages have the same basic profile: if you get them wrong, your code still looks right. It compiles fine. It even runs for small cases, but for some input cases it just crashes unexpectedly because random memory is getting overwritten somewhere like the smiley face on page 5. This sort of "random memory smasher" bug can be a real ordeal to track down.