Data Storage Geoffrey Brown Bryce Himebaugh Indiana University August 9, 2016 Geoffrey Brown, Bryce Himebaugh 2015 August 9, 2016 1 / 19

Outline
  Bits, Bytes, Words
  Word Size
  Byte Addressable Memory
  Byte Order
  Arrays and Pointers
  Structures and Unions

It's All Just Bits
  Data is represented in words with a fixed bit width.
  The meaning of the bits is context dependent.
  Some interpretations of bits are supported natively by the processor (e.g. arithmetic on integers).
  Some interpretations are unexpected: writing a particular bit might turn on a light or a motor.
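
To make the point about context concrete, the sketch below reads one 32-bit pattern as a signed integer and as a float. The function names are hypothetical, and the float case assumes a 32-bit IEEE 754 single-precision format, which is common but not guaranteed by C.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float by copying its bytes.
   Assumes float is a 32-bit IEEE 754 single-precision value. */
float bits_as_float(uint32_t bits) {
    float f;
    assert(sizeof f == sizeof bits);   /* the assumption stated above */
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reinterpret the same pattern as a signed 32-bit integer. */
int32_t bits_as_signed(uint32_t bits) {
    int32_t s;
    memcpy(&s, &bits, sizeof s);
    return s;
}
```

Under these assumptions the pattern 0xC0000000 is 3221225472 as an unsigned integer, -1073741824 as a signed integer, and -2.0 as a float.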

Word Bit Numbering (Figure 2.1)
  Bits are numbered from right to left, starting at 0.
  32-bit word: bit 31 (msb) ... bit 0 (lsb)
  Byte: bit 7 (msb) ... bit 0 (lsb)
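
With this numbering, bit n of a word can be isolated by shifting right n places. The helper names below are hypothetical; this is a minimal sketch for 32-bit words.

```c
#include <assert.h>
#include <stdint.h>

/* Return bit n of a 32-bit word, where bit 0 is the lsb and bit 31 the msb. */
unsigned get_bit(uint32_t word, unsigned n) {
    assert(n < 32);
    return (unsigned)((word >> n) & 1u);
}

/* Return the word with bit n set to 1. */
uint32_t set_bit(uint32_t word, unsigned n) {
    assert(n < 32);
    return word | ((uint32_t)1 << n);
}
```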

Hexadecimal
  Hex  Decimal  Binary
  0    0        0000
  1    1        0001
  2    2        0010
  3    3        0011
  4    4        0100
  5    5        0101
  6    6        0110
  7    7        0111
  8    8        1000
  9    9        1001
  A    10       1010
  B    11       1011
  C    12       1100
  D    13       1101
  E    14       1110
  F    15       1111
  $(126)_{10} = (0111\,1110)_2 = (7E)_{16}$
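
Each hex digit stands for one 4-bit group, so a byte is two hex digits. The sketch below converts a byte to both notations and checks the worked example from the slide; the function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Write the 8 bits of value into out as '0'/'1' characters, msb first. */
void byte_to_binary(unsigned char value, char out[9]) {
    for (int i = 0; i < 8; i++) {
        out[i] = ((value >> (7 - i)) & 1) ? '1' : '0';
    }
    out[8] = '\0';
}

/* Write the two hex digits of value into out, high 4 bits first. */
void byte_to_hex(unsigned char value, char out[3]) {
    static const char digits[] = "0123456789ABCDEF";
    out[0] = digits[(value >> 4) & 0xF];
    out[1] = digits[value & 0xF];
    out[2] = '\0';
}

/* Check the slide's example: 126 = 0111 1110 = 7E. */
void check_slide_example(void) {
    char bin[9];
    char hex[3];
    byte_to_binary(126, bin);
    byte_to_hex(126, hex);
    assert(strcmp(bin, "01111110") == 0);
    assert(strcmp(hex, "7E") == 0);
}
```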

Word Size
  A computer's word size is commonly taken to be the number of bits in a memory address or the size of an integer. (There is no standard definition.)
  Examples:
    Pentium (IA32): 32 bits
    IA64: 64 bits
    ARM7, Cortex: 32 bits
    msp430: 16 bits

Word Size Matters
  For an n-bit word:
  Memory address range: $0 \ldots 2^n - 1$
  Primitive integer range:
    signed integers: $-2^{n-1} \ldots 2^{n-1} - 1$
    unsigned integers: $0 \ldots 2^n - 1$
  Operations on larger sizes require multiple instructions.
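
The ranges above can be checked against the limits in <stdint.h>, and the last point can be illustrated by building a 64-bit addition out of 32-bit additions. This is a sketch only: the helper names and the u64_parts type are hypothetical, and a real compiler for a 32-bit target would typically emit an add followed by an add-with-carry.

```c
#include <assert.h>
#include <stdint.h>

/* Largest unsigned value in n bits: 2^n - 1 (1 <= n <= 63 in this sketch). */
uint64_t unsigned_max(unsigned n) {
    assert(n >= 1 && n <= 63);
    return ((uint64_t)1 << n) - 1;
}

/* Smallest two's-complement value in n bits: -2^(n-1). */
int64_t signed_min(unsigned n) {
    assert(n >= 1 && n <= 63);
    return -((int64_t)1 << (n - 1));
}

/* Largest two's-complement value in n bits: 2^(n-1) - 1. */
int64_t signed_max(unsigned n) {
    assert(n >= 1 && n <= 63);
    return ((int64_t)1 << (n - 1)) - 1;
}

/* A 64-bit value held as two 32-bit words (hypothetical type). */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_parts;

/* 64-bit addition built from two 32-bit additions plus a carry. */
u64_parts add64(u64_parts a, u64_parts b) {
    u64_parts r;
    r.lo = a.lo + b.lo;               /* wraps modulo 2^32 */
    uint32_t carry = (r.lo < a.lo);   /* 1 if the low addition wrapped */
    r.hi = a.hi + b.hi + carry;
    return r;
}
```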

C Data Type Sizes (in bytes) for Various Architectures (Figure 2.2)
  C Data Type  IA64  ARM 32-bit  Required
  char         1     1           minimum size to hold a character
  short        2     2           at least 2 bytes
  int          4     4           at least 2 bytes
  long int     8     4           at least 4 bytes
  long long    8     8           at least 8 bytes
  float        4     4           commonly 4 bytes (IEEE)
  double       8     8           commonly 8 bytes (IEEE)
  void *       8     4           machine dependent
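
The sizes for any particular compiler and target can be printed with sizeof. The sketch below also checks the minimum widths that the C standard requires of the integer types (16 bits for short and int, 32 for long, 64 for long long); the function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Return 1 if this platform's integer types meet the C minimum widths. */
int sizes_meet_c_minimums(void) {
    return sizeof(char) == 1
        && sizeof(short) * CHAR_BIT >= 16
        && sizeof(int) * CHAR_BIT >= 16
        && sizeof(long) * CHAR_BIT >= 32
        && sizeof(long long) * CHAR_BIT >= 64;
}

/* Print one row per type, with sizes in bytes. */
void print_type_sizes(void) {
    assert(sizes_meet_c_minimums());
    printf("char       %zu\n", sizeof(char));
    printf("short      %zu\n", sizeof(short));
    printf("int        %zu\n", sizeof(int));
    printf("long int   %zu\n", sizeof(long));
    printf("long long  %zu\n", sizeof(long long));
    printf("float      %zu\n", sizeof(float));
    printf("double     %zu\n", sizeof(double));
    printf("void *     %zu\n", sizeof(void *));
}
```

Calling print_type_sizes() on the development host and on the embedded target is a quick way to see which column of the table applies.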

Byte Addressable Memory (Figure 2.3)
  Address of each unit when the same eight bytes, from low (000) to high (007), are viewed as bytes or as 16-bit, 32-bit, or 64-bit words:
  Byte:    000 001 002 003 004 005 006 007
  16-bit:  000 002 004 006
  32-bit:  000 004
  64-bit:  000

Little Endian Byte Order (Figure 2.4)
  Byte 0, at the lowest address, holds the least significant bits. Bytes are listed from msb to lsb:
  64-bit (bits 63..0):  byte 7  byte 6  byte 5  byte 4  byte 3  byte 2  byte 1  byte 0
  32-bit (bits 31..0):  byte 3  byte 2  byte 1  byte 0
  16-bit (bits 15..0):  byte 1  byte 0

Big Endian Byte Order (Figure 2.5)
  Byte 0, at the lowest address, holds the most significant bits. Bytes are listed from msb to lsb:
  64-bit (bits 63..0):  byte 0  byte 1  byte 2  byte 3  byte 4  byte 5  byte 6  byte 7
  32-bit (bits 31..0):  byte 0  byte 1  byte 2  byte 3
  16-bit (bits 15..0):  byte 0  byte 1
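
The two byte orders can be expressed as two ways of assembling a 32-bit value from four bytes in memory. The sketch below uses shifts so that it gives the same result on any host; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 32-bit value from bytes stored in little-endian order:
   bytes[0] (lowest address) is the least significant byte. */
uint32_t load_le32(const unsigned char bytes[4]) {
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* Assemble a 32-bit value from bytes stored in big-endian order:
   bytes[0] (lowest address) is the most significant byte. */
uint32_t load_be32(const unsigned char bytes[4]) {
    return ((uint32_t)bytes[0] << 24)
         | ((uint32_t)bytes[1] << 16)
         | ((uint32_t)bytes[2] << 8)
         | (uint32_t)bytes[3];
}

/* The same four bytes give two different values. */
void check_byte_orders(void) {
    const unsigned char bytes[4] = {0x54, 0x76, 0x98, 0xAB};
    assert(load_le32(bytes) == 0xAB987654u);
    assert(load_be32(bytes) == 0x547698ABu);
}
```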

Byte Swapping (Figure 2.6)
  The value 0xAB987654 stored at addresses 4 to 7:
  Address  Little Endian  Big Endian
  7        AB             54
  6        98             76
  5        76             98
  4        54             AB
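
Moving a word between machines of different byte order means reversing its bytes. The following is a minimal sketch of a 32-bit swap together with a run-time test of the host's byte order; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if the lowest-addressed byte of a word is its lsb. */
int is_little_endian(void) {
    uint32_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Reverse the order of the four bytes in x. */
uint32_t swap32(uint32_t x) {
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}

/* Swapping twice restores the original value. */
void check_swap(void) {
    assert(swap32(0xAB987654u) == 0x547698ABu);
    assert(swap32(swap32(0xAB987654u)) == 0xAB987654u);
}
```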

Pointers and Arrays (Example 2.1)
    int v;

    void foo(void) {
        unsigned char *p = (unsigned char *) &v;
        p[0] = 0x10;
        p[1] = 0x20;
        p[2] = 0x30;
        p[3] = 0x40;
    }
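
The value that ends up in the word after stores like those in Example 2.1 depends on the byte order of the machine. The sketch below is a variant that returns the word so the result can be inspected; it uses uint32_t in place of int so that the 4-byte size is explicit, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Store four bytes at increasing addresses and return the resulting word. */
uint32_t store_bytes(void) {
    uint32_t v = 0;
    unsigned char *p = (unsigned char *)&v;
    assert(sizeof v == 4);
    p[0] = 0x10;   /* lowest address */
    p[1] = 0x20;
    p[2] = 0x30;
    p[3] = 0x40;   /* highest address */
    return v;
}
```

On a little-endian machine p[0] is the least significant byte, so the result is 0x40302010; on a big-endian machine it is 0x10203040.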

Pointers and Arrays cont.
  The byte stores in Example 2.1 can also be written with pointer arithmetic:
    *p = 0x10; *(p+1) = 0x20; *(p+2) = 0x30; *(p+3) = 0x40;
  This approach can lead to some confusion. Suppose that we wish to access an integer array using pointer arithmetic; then the following two approaches are equivalent:
    int *q =...
    q[0] = 1; q[1] = 2; q[2] = 3;
  and
    *q = 1; *(q+1) = 2; *(q+2) = 3;
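
A short sketch of the equivalence, with hypothetical function names. It also measures how far q + 1 is from q in bytes, which shows that pointer arithmetic on an int pointer advances by whole ints rather than by single bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Fill three elements using array syntax. */
void fill_indexed(int *q) {
    q[0] = 1; q[1] = 2; q[2] = 3;
}

/* Fill three elements using pointer arithmetic. */
void fill_pointer(int *q) {
    *q = 1; *(q + 1) = 2; *(q + 2) = 3;
}

/* Number of bytes between q and q + 1. */
size_t step_in_bytes(const int *q) {
    return (size_t)((const char *)(q + 1) - (const char *)q);
}

/* Both fills leave the same contents. */
void check_equivalence(void) {
    int x[3] = {0, 0, 0};
    int y[3] = {0, 0, 0};
    fill_indexed(x);
    fill_pointer(y);
    assert(x[0] == y[0] && x[1] == y[1] && x[2] == y[2]);
    assert(step_in_bytes(x) == sizeof(int));
}
```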

Pointers and Arrays cont.
  The advantage of the array syntax is that it handles the address calculation in a natural manner, whereas with pointer arithmetic we have to be cognizant of the size of the objects we are addressing.
  A further difference is that array names provide a mechanism for static memory allocation in C. For example,
    int a[7];
  allocates a block of memory large enough to hold seven integers. The name of this block is a and its elements are a[0] ... a[6]. We can reference the address of this block of memory as
    int *p = a;
  or
    int *p = &a[0];
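
The sketch below checks that both initializations yield the same address, and that sizeof applied to the array name reports the size of the whole block rather than the size of a pointer. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

int a[7];   /* static allocation of seven ints */

/* Number of elements, computed from the total size of the block. */
size_t element_count(void) {
    return sizeof a / sizeof a[0];
}

/* Return 1 if both forms give the address of the first element. */
int same_address(void) {
    int *p = a;
    int *q = &a[0];
    return p == q;
}

/* Write every element through a pointer to the block. */
void fill_through_pointer(void) {
    int *p = a;
    assert(element_count() == 7);
    for (size_t i = 0; i < element_count(); i++) {
        p[i] = (int)i;
    }
}
```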

Structure Layout (Figure 2.7)
    typedef struct {
        int a;
        short b;
        long long d;
    } exs;
    exs x;

  Layout in 32-bit words. x[i] denotes byte i of field x, with byte 0 the least significant; pad marks unused padding bytes.

  Little Endian
    Byte number:  3     2     1     0
    Word 3:       d[7]  d[6]  d[5]  d[4]
    Word 2:       d[3]  d[2]  d[1]  d[0]
    Word 1:       pad   pad   b[1]  b[0]
    Word 0:       a[3]  a[2]  a[1]  a[0]

  Big Endian
    Byte number:  0     1     2     3
    Word 3:       d[3]  d[2]  d[1]  d[0]
    Word 2:       d[7]  d[6]  d[5]  d[4]
    Word 1:       b[1]  b[0]  pad   pad
    Word 0:       a[3]  a[2]  a[1]  a[0]

Structure Field Offset
  In a structure, the offset of a field from the start of the structure is computed as follows:
    1. Add the offset and size of the preceding field.
    2. Round up to the nearest address that satisfies the alignment requirement of the field (e.g. a multiple of 8 for a long long).
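
The two steps can be sketched as a pair of small functions. The names are hypothetical, and alignments are assumed to be powers of two, as they are on the architectures discussed here.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of alignment (a power of two). */
size_t align_up(size_t offset, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Offset of a field, given the preceding field's offset and size
   and the field's own alignment requirement. */
size_t field_offset(size_t prev_offset, size_t prev_size, size_t alignment) {
    return align_up(prev_offset + prev_size, alignment);
}
```

For a structure with a 4-byte int, a 2-byte short, and an 8-byte long long in that order, each aligned to its own size, field_offset(0, 4, 2) places the short at offset 4 and field_offset(4, 2, 8) places the long long at offset 8.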

Structure Field Offset (example)
  The overall size of a structure is a multiple of the maximum alignment requirement of its fields. To understand this seemingly wasteful requirement, consider the following:
    struct {
        long long ll;
        int i;
    } a[2];
  The alignment requirements for the first array element could be met with a 12-byte size, but then the second element would be incorrectly aligned.
  C provides two facilities for determining the offset and size of a field: the offsetof macro (defined in <stddef.h>) and the sizeof operator.
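
A sketch using offsetof and sizeof on the structure above. The tag name pair, the array name elems, and the function names are hypothetical. The code relies on the C11 _Alignof operator; on ABIs where long long is 8-byte aligned, sizeof(struct pair) is 16 and the tail padding is 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

struct pair {
    long long ll;
    int i;
};

struct pair elems[2];

/* Distance in bytes between consecutive array elements. */
size_t element_stride(void) {
    return (size_t)((char *)&elems[1] - (char *)&elems[0]);
}

/* Bytes of padding after the last field. */
size_t tail_padding(void) {
    return sizeof(struct pair) - (offsetof(struct pair, i) + sizeof(int));
}

/* Return 1 if the structure size is a multiple of its alignment. */
int size_is_multiple_of_alignment(void) {
    return sizeof(struct pair) % _Alignof(struct pair) == 0;
}

/* Consecutive elements are exactly sizeof(struct pair) bytes apart,
   so the padding keeps elems[1].ll aligned as well. */
void check_array_layout(void) {
    assert(element_stride() == sizeof(struct pair));
    assert(size_is_multiple_of_alignment());
}
```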

Unions
    union {
        int a;
        char b;
        long long c;
    };
  All three fields have the same offset from the beginning of the union: 0.
  The size of the union is the size of its largest field.
  While the Cortex-M0 will execute properly if this union is aligned on 4-byte boundaries, the current ARM ABI requires 8-byte boundaries to ensure interoperability with other members of the ARM family.
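
These properties can be checked with offsetof and sizeof. In the sketch below the tag name example and the function name are hypothetical; the size check is written as "at least as large as the largest field" because a compiler may round a union's size up to its alignment.

```c
#include <assert.h>
#include <stddef.h>

union example {
    int a;
    char b;
    long long c;
};

/* Return 1 if all three fields start at the union's own address. */
int fields_share_address(void) {
    union example u;
    return (void *)&u.a == (void *)&u
        && (void *)&u.b == (void *)&u
        && (void *)&u.c == (void *)&u;
}

/* Every field is at offset 0 and the union can hold its largest field. */
void check_union_layout(void) {
    assert(offsetof(union example, a) == 0);
    assert(offsetof(union example, b) == 0);
    assert(offsetof(union example, c) == 0);
    assert(sizeof(union example) >= sizeof(long long));
}
```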