Simple Data Types in C. Alan L. Cox

Similar documents
Chapter 2 Bits, Data Types, and Operations

Chapter 2 Bits, Data Types, and Operations

Chapter 2 Bits, Data Types, and Operations

CPS 104 Computer Organization and Programming Lecture-2 : Data representations,

Chapter 2 Bits, Data Types, and Operations

Numbers and Computers. Debdeep Mukhopadhyay Assistant Professor Dept of Computer Sc and Engg IIT Madras

Chapter 2 Bits, Data Types, and Operations

Chapter 3. Information Representation

Bits and Bytes. Data Representation. A binary digit or bit has a value of either 0 or 1; these are the values we can store in hardware devices.

Number Systems for Computers. Outline of Introduction. Binary, Octal and Hexadecimal numbers. Issues for Binary Representation of Numbers

Under the Hood: Data Representation. Computer Science 104 Lecture 2

EE 109 Unit 3. Analog vs. Digital. Analog vs. Digital. Binary Representation Systems ANALOG VS. DIGITAL

CS/ECE 252: INTRODUCTION TO COMPUTER ENGINEERING UNIVERSITY OF WISCONSIN MADISON

EE 109 Unit 2. Analog vs. Digital. Analog vs. Digital. Binary Representation Systems ANALOG VS. DIGITAL

Fundamentals of Programming (C)

Number Representations

CS/ECE 252: INTRODUCTION TO COMPUTER ENGINEERING UNIVERSITY OF WISCONSIN MADISON

Unit 3, Lesson 2 Data Types, Arithmetic,Variables, Input, Constants, & Library Functions. Mr. Dave Clausen La Cañada High School

CMSC 313 Lecture 03 Multiple-byte data big-endian vs little-endian sign extension Multiplication and division Floating point formats Character Codes

ASSIGNMENT 5 TIPS AND TRICKS

1.1. INTRODUCTION 1.2. NUMBER SYSTEMS

Exercises Software Development I. 03 Data Representation. Data types, range of values, internal format, literals. October 22nd, 2014

Data Representation and Binary Arithmetic. Lecture 2

Data Representa5on. CSC 2400: Computer Systems. What kinds of data do we need to represent?

Data Representa5on. CSC 2400: Computer Systems. What kinds of data do we need to represent?

Fundamentals of Programming

Number Systems Base r

plc numbers Encoded values; BCD and ASCII Error detection; parity, gray code and checksums

The Binary Number System

Unit 3. Analog vs. Digital. Analog vs. Digital ANALOG VS. DIGITAL. Binary Representation

Number System (Different Ways To Say How Many) Fall 2016

EE 109 Unit 2. Binary Representation Systems

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING LECTURE 02, FALL 2012

Chapter 2 Number System

Chapter 8. Characters and Strings

CSE-1520R Test #1. The exam is closed book, closed notes, and no aids such as calculators, cellphones, etc.

CSE-1520R Test #1. The exam is closed book, closed notes, and no aids such as calculators, cellphones, etc.

Do not start the test until instructed to do so!

COMP2611: Computer Organization. Data Representation

Bits, Bytes, and Integers

Integers. Today. Next time. ! Numeric Encodings! Programming Implications! Basic operations. ! Floats

Positional Number System

3.1. Unit 3. Binary Representation

Fundamental Data Types

Oberon Data Types. Matteo Corti. December 5, 2001

Variables and data types

5/17/2009. Digitizing Discrete Information. Ordering Symbols. Analog vs. Digital

Do not start the test until instructed to do so!

DATA REPRESENTATION. Data Types. Complements. Fixed Point Representations. Floating Point Representations. Other Binary Codes. Error Detection Codes

Carnegie Mellon. Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition

But first, encode deck of cards. Integer Representation. Two possible representations. Two better representations WELLESLEY CS 240 9/8/15

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING LECTURE 02, SPRING 2013

Computer System and programming in C

Do not start the test until instructed to do so!

Bits, Bytes and Integers

Bits, Bytes and Integers Part 1

Representing and Manipulating Integers. Jo, Heeseung

Integers Sep 3, 2002

Week 1 / Lecture 2 8 March 2017 NWEN 241 C Fundamentals. Alvin Valera. School of Engineering and Computer Science Victoria University of Wellington

Source coding and compression

Binary Numbers. The Basics. Base 10 Number. What is a Number? = Binary Number Example. Binary Number Example

These are reserved words of the C language. For example int, float, if, else, for, while etc.

Integers. Dr. Steve Goddard Giving credit where credit is due

CS341 *** TURN OFF ALL CELLPHONES *** Practice NAME

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING LECTURE 02, FALL 2012

Systems 1. Integers. Unsigned & Twoʼs complement. Addition, negation, multiplication

Chapter 2. Data Representation in Computer Systems

2a. Codes and number systems (continued) How to get the binary representation of an integer: special case of application of the inverse Horner scheme

CPSC 301: Computing in the Life Sciences Lecture Notes 16: Data Representation

Data Representation 1

o Echo the input directly to the output o Put all lower-case letters in upper case o Put the first letter of each word in upper case

Topics of this Slideset. CS429: Computer Organization and Architecture. Encoding Integers: Unsigned. C Puzzles. Integers

CS429: Computer Organization and Architecture

Number Systems II MA1S1. Tristan McLoughlin. November 30, 2013

Representing and Manipulating Integers Part I

The type of all data used in a C (or C++) program must be specified

Floating Point January 24, 2008

Integers II. CSE 351 Autumn Instructor: Justin Hsia

Integer Encoding and Manipulation

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

Representing and Manipulating Floating Points. Jo, Heeseung

Systems Programming and Computer Architecture ( )

Systems I. Floating Point. Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties

CS367 Test 1 Review Guide

Computer Architecture and System Software Lecture 02: Overview of Computer Systems & Start of Chapter 2

System Programming CISC 360. Floating Point September 16, 2008

CS 261 Fall Floating-Point Numbers. Mike Lam, Professor.

Floating Point Arithmetic

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

COSC 243. Data Representation 3. Lecture 3 - Data Representation 3 1. COSC 243 (Computer Architecture)

1.3b Type Conversion

ICS Instructor: Aleksandar Kuzmanovic TA: Ionut Trestian Recitation 2

CS 261 Fall Floating-Point Numbers. Mike Lam, Professor.

Floating Point Puzzles The course that gives CMU its Zip! Floating Point Jan 22, IEEE Floating Point. Fractional Binary Numbers.

Bits, Bytes, and Integers Part 2

Representing and Manipulating Floating Points

Lecture 5-6: Bits, Bytes, and Integers

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: Bits and Bytes and Numbers

Computer Systems CEN591(502) Fall 2011

CS 33. Data Representation, Part 2. CS33 Intro to Computer Systems VIII 1 Copyright 2017 Thomas W. Doeppner. All rights reserved.

Transcription:

Simple Data Types in C Alan L. Cox alc@rice.edu

Objectives Be able to explain to others what a data type is Be able to use basic data types in C programs Be able to see the inaccuracies and limitations introduced by machine representations of numbers Cox / Fagan Simple Data Types 2

What do you see? Cox / Fagan Simple Data Types 3

What do you see? Cox / Fagan Simple Data Types 4

Last one Cox / Fagan Simple Data Types 5

Everything is Just a Bunch of Bits Bits can represent many different things Depends on interpretation You and your program must keep track of what kind of data is at each location in the computer s memory E.g., program data types Cox / Fagan Simple Data Types 6

Big Picture Processor works with finite-sized data All data implemented as a sequence of bits Bit = 0 or 1 Represents the level of an electrical charge Byte = 8 bits Word = largest data size handled by processor 32 bits on most older computers 64 bits on most new computers Cox / Fagan Simple Data Types 7

Data types in C Only really four basic types: char int (short, long, long long, unsigned) float double Size of these types on CLEAR machines: Type Size (bytes) char 1 int 4 short 2 Sizes of these types vary from one machine to another! long 8 long long 8 float 4 double 8 Cox / Fagan Simple Data Types 8

Characters (char) Roman alphabet, punctuation, digits, and other symbols: Encoded within one byte (256 possible symbols) ASCII encoding (man ascii for details) In C: char a_char = a ; char newline_char = \n ; char tab_char = \t ; char backslash_char = \\ ; Cox / Fagan Simple Data Types 9

ASCII Special control characte From man ascii : rs 0 NUL 1 SOH 2 STX 3 ETX 4 EOT 5 ENQ 6 ACK 7 BEL 8 BS 9 HT 10 NL 11 VT 12 NP 13 CR 14 SO 15 SI 16 DLE 17 DC1 18 DC2 19 DC3 20 DC4 21 NAK 22 SYN 23 ETB 24 CAN 25 EM 26 SUB 27 ESC 28 FS 29 GS 30 RS 31 US 32 SP 33! 34 " 35 # 36 $ 37 % 38 & 39 ' 40 ( 41 ) 42 * 43 + 44, 45-46. 47 / 48 0 49 1 50 2 51 3 52 4 53 5 54 6 55 7 56 8 57 9 58 : 59 ; 60 < 61 = 62 > 63? 64 @ 65 A 66 B 67 C 68 D 69 E 70 F 71 G 72 H 73 I 74 J 75 K 76 L 77 M 78 N 79 O 80 P 81 Q 82 R 83 S 84 T 85 U 86 V 87 W 88 X 89 Y 90 Z 91 [ 92 \ 93 ] 94 ^ 95 _ 96 ` 97 a 98 b 99 c 100 d 101 e 102 f 103 g 104 h 105 i 106 j 107 k 108 l 109 m 110 n 111 o 112 p 113 q 114 r 115 s 116 t 117 u 118 v 119 w 120 x 121 y 122 z 123 { 124 125 } 126 ~ 127 DEL Cox / Fagan Simple Data Types 10

Characters are just numbers What does this function do? return type procedure name local variable type and name char fun(char c) { char new_c; if ((c >= A ) && (c <= Z )) new_c = c - A + a ; else new_c = c; argument type and name comparisons with characters! Math on characters! } return (new_c); Cox / Fagan Simple Data Types 11

Integers Fundamental problem: Fixed-size representation can t encode all numbers Standard low-level solution: Limit number range and precision Usually sufficient Potential source of bugs Signed and unsigned variants unsigned modifier can be used with any sized integer (short, long, or long long) Cox / Fagan Simple Data Types 12

Signed Integer Cox / Fagan Simple Data Types 13

Integer Representations Base 2 Base 16 Unsigned 2 s Comp. 0000 0 0 0 0001 1 1 1 0010 2 2 2 0011 3 3 3 0100 4 4 4 0101 5 5 5 0110 6 6 6 0111 7 7 7 1000 8 8-8 1001 9 9-7 1010 A 10-6 1011 B 11-5 1100 C 12-4 1101 D 13-3 1110 E 14-2 1111 F 15-1 1101 1100 1011 1110 C D B 1010 E 12 13 11 A 1111 F 10 1001 9 15 14-1 -2-3 -4-5 -6-7 9 0000 0 0 0-8 8 8 1000 7 1 7 1 7 0001 1 2 2 3 6 5 4 6 0111 3 5 2 4 6 0010 0011 3 4 0100 5 0101 0110 Why one more negative than positive? Cox / Fagan Simple Data Types 14

Integer Representations Base 2 Base 16 Unsigned 2 s Comp. 0000 0 0 0 0001 1 1 1 0010 2 2 2 0011 3 3 3 0100 4 4 4 0101 5 5 5 0110 6 6 6 0111 7 7 7 1000 8 8-8 1001 9 9-7 1010 A 10-6 1011 B 11-5 1100 C 12-4 1101 D 13-3 1110 E 14-2 1111 F 15-1 Define B2U ( x) Math for n bits: x xn n 1 i 0 B2T ( x) 2 1 x 0 2 n 1 i xi x n 1 n 2 i 0 sign bit 0=non-negative 1=negative 2 i x i Cox / Fagan Simple Data Types 15

Integer Ranges Unsigned UMin n UMax n = 0 2 n -1: 32 bits: 0... 4,294,967,295 unsigned int 64 bits: 0... 18,446,744,073,709,551,615 unsigned long int 2 s Complement TMin n TMax n = -2 n-1 2 n-1-1: 32 bits: -2,147,483,648... 2,147,483,647 int 64 bits: -9,223,372,036,854,775,808 9,223,372,036,854,775,807 long int Note: C numeric ranges are platform dependent! #include <limits.h> to define ULONG_MAX, UINT_MIN, INT_MAX, Cox / Fagan Simple Data Types 16

Detecting Overflow in Programs Some high-level languages (ML, Ada, ): Overflow causes exception that can be handled C: Overflow causes no special event Programmer must check, if desired E.g., given a, b, and c=uadd n (a,b) overflow? Claim: Overflow iff c < a (Or similarly, iff c < b) Proof: Know 0 b < 2 n If no overflow, c = (a + b) mod 2 n = a + b a If overflow, c = (a + b) mod 2 n = a + b 2 n < a Cox / Fagan Simple Data Types 17

Overflow unsigned int x = 2123456789U; unsigned int y = 3123456789U; unsigned int z; z is 951,946,282 not 5,246,913,578 as expected z = x + y; int x = 2123456789; int y = 3123456789; int z; z = x + y; y is not a valid positive number (sign bit is set)! It s -1,171,510,507 z is still 951,946,282 Cox / Fagan Simple Data Types 18

However, #include <assert.h> void procedure(int x) { assert(x + 10 > x); Should this assertion ever fail? Do a web search for GCC bug 30475. The C language definition does not assume 2 s complement as the underlying implementation. The behavior of signed integer overflow is undefined. Cox / Fagan Simple Data Types 19

You say, What should I do? #include <limits.h> /* Safe, signed add. Won t overflow. */ int safe_add(int x, int y) { } if (y < 0) return (safe_sub(x, -y)); if (INT_MAX y < x) return (INT_MAX); /* Don t overflow! */ return (x + y); Cox / Fagan Simple Data Types 20

Bit Shifting as Multiplication ft left (x << 1) multiplies by 2: 0 0 1 1 = 3 0 1 1 0 = 6 1 1 0 1 = -3 1 0 1 0 = -6 Works for unsigned, 2 s complement Can overflow In decimal, same idea multiplies by 10: e.g., 4 Cox / Fagan Simple Data Types 21

Bit Shifting as Division Logical shift right (x >> 1) divides by 2 for uns 0 1 1 1 = 7 0 0 1 1 = 3 1 0 0 1 = 9 0 1 0 0 = 4 Always rounds down! Arithmetic shift right (x >> 1) divides by 2 for 0 1 1 1 = 7 0 0 1 1 = 3 1 0 0 1 = -7 1 1 0 0 = -4 Always rounds down! Cox / Fagan Simple Data Types 22

Bit Shifting for Multiplication/Division Why useful? Simpler, thus faster, than general multiplication & division Standard compiler optimization Can shift multiple positions at once: Multiplies or divides by corresponding power-of- 2 a << 5 a >> 5 Cox / Fagan Simple Data Types 23

A Sampling of Integer Properties For both unsigned & 2 s complement: Mostly as usual, e.g.: 0 is identity for +, - 1 is identity for, +, -, are associative +, are commutative distributes over +, - Some surprises, e.g.: doesn t distribute over +, - (a,b > 0 a + b > a) Why should you care? Programmer should be aware of behavior of their progr Compiler uses such properties in optimizations Cox / Fagan Simple Data Types 24

Beware of Sign Conversions in C Beware implicit or explicit conversions between unsigned and signed representations! One of many common mistakes: unsigned int u; if (u > -1)? What s wrong?? ways false(!) because -1 is converted to unsigned, yielding UMax n Cox / Fagan Simple Data Types 25

Non-Integral Numbers: How? Fixed-size representations Rational numbers (i.e., pairs of integers) Fixed-point (use integer, remember where point is) Floating-point (scientific notation) Variable-size representations Sums of fractions (e.g., Taylor-series) Unbounded-length series of digits/bits Cox / Fagan Simple Data Types 26

Floating-point Binary version of scientific notation 1.001101110 2 5 = 100110.1110 = 32 + 4 + 2 + 1 / 2 + 1 / 4 + 1 / 8 = 38.875-1.011 2-3 = -.001011 = - ( 1 / 8 + 1 / 32 + 1 / 64 ) = -.171875 binary point Cox / Fagan Simple Data Types 27

FP Overflow & Underflow Fixed-sized representation leads to limitations Round to - Large positive exponent. Unlike integer arithmetic, overflow imprecise result ( ), not inaccurate result Zero Round to + NegativeExpressible Negative Positive Expressible Positive overflow negative values underflow underflow positive values overflow Large negative exponent Round to zero Cox / Fagan Simple Data Types 28

FP Representation 1.001101110 2 5 Fixed-size representation Using more significand bits precision Using more exponent bits significand exponent increased increased range Typically, fixed # of bits for each part, for simplicity Cox / Fagan Simple Data Types 29

FP Representation: IEEE 754 Current standard version of floating-point Single-precision (float) One word: 1 sign bit, 23 bit fraction, 8 bit exponent Positive range: 1.17549435 10-38 3.40282347 10 +38 Double-precision (double) Two words: 1 sign bit, 52 bit fraction, 11 bit exponent Positive range: 2.2250738585072014 10-308 1.7976931348623157 10 +308 Lots of details in B&O Chapter 2.4 Cox / Fagan Simple Data Types 30

IEEE 754 Special Numbers +0.0, -0.0 +, - NaN: Not a number (+1.0 10 +38 ) 2 = + +0.0 +0.0 = NaN +1.0 +0.0 = + + - + = NaN +1.0-0.0 = - 1 = NaN Cox / Fagan Simple Data Types 31

FP vs. Integer Results int i = 20 / 3; float f = 20.0 / 3.0; True mathematical answer: 20 3 = 6 2/3 i =? f =? 6 6.666667 Integer division ignores remainder FP arithmetic rounds result Cox / Fagan Simple Data Types 32

FP vs. Integer Results int i = 1000 / 6; float f = 1000.0 / 6.0; True mathematical answer: 1000 6 = 166 2 / 3 i =? f =? 166 166.666672 Integer division ignores remainder FP arithmetic rounds result Surprise! Arithmetic in binary, printing in decimal doesn t always give expected result Cox / Fagan Simple Data Types 33

FP Integer Conversions in C #include <limits.h> #include <stdio.h> int main(void) { unsigned int ui = UINT_MAX; float f = ui; printf( ui: %u\nf: %f\n, ui, (double)f); } Surprisingly, this program print the following. Why? ui: 4294967295 f: 4294967296.000000 Cox / Fagan Simple Data Types 34

FP Integer Conversions in C int i = 3.3 * 5; float f = i; True mathematical answer: 3.3 5 = 16 ½ i =? f =? 16 16.0 Converts 5 5.0 Truncates result 16 ½ 16 integer FP: Can lose precision Rounds, if necessary FP integer: Truncate fraction If out of range, undefined not error 32-bit int fits in doubleprecision FP Cox / Fagan Simple Data Types 35

FP Behavior Programmer must be aware of accuracy limitations! Dealing with this is a subject of classes like CAAM 453 (10 10 + 10 30 ) + 10 30 =? 10 10 + (10 30 + 10 30 ) 10 30 10 30 =? 10 10 + 0 0 10 10 Operations not associative! (1.0 + 6.0) 640.0 =? (1.0 640.0) + (6.0 64 7.0 640.0 =?.001563 +.009375.010937.010938, not distributive across +,- Cox / Fagan Simple Data Types 36

What about other types? Booleans A late addition to C Strings We ll cover these in a later class Enumerated types A restricted set of integers Complex Numbers Not covered Cox / Fagan Simple Data Types 37

Booleans One bit representation 0 is false 1 is true One byte or word representation Inconvenient to manipulate only one bit Two common encodings: 0000 0000 is false 0000 0000 is false 0000 0001 is true all other words are true all other words are garbage Wastes space, but space is usually cheap Cox / Fagan Simple Data Types 38

Booleans in C #include <stdbool.h> bool bool1 = true; bool bool2 = false; Important! Compiler needs this or it won't know about "bool"! bool added to C in 1999 Many programmers had already defined their own Boolean type To avoid conflict bool is disabled by default Cox / Fagan Simple Data Types 39

C s Common Boolean Operations C extends definitions to integers Booleans are encoded as integers 0 == false non-0 == true Logical AND: 0 && 4 == 0 3 && 4 == 1 3 && 0 == 0 Logical OR: 0 4 == 1 3 4 == 1 3 0 == 1 Logical NOT:! 4 == 0! 0 == 1 && and short-circuit Evaluate 2nd argument only if necessary E.g., 0 && error-producing-code == 0 Cox / Fagan Simple Data Types 40

Enumerated Types E.g., a Color = red, blue, black, or yellow Small (finite) number of choices Booleans & characters are common special cases Pick arbitrary bit patterns for each Not enforced in C Actually just integers Can assign values outside of the enumeration Could cause bugs Cox / Fagan Simple Data Types 41

Enumerated Types in C enum Color { RED, WHITE, BLACK, YELLOW }; enum Color my_color = RED; Alternative style: enum AColor { COLOR_RED, COLOR_WHITE, COLOR_BLACK, COLOR_YELLOW }; typedef enum AColor color_t; color_t my_color = COLOR_RED; The new type name is enum Color Pre-C99 Boolean definition: enum Bool { false = 0, true = 1 }; typedef enum Bool bool; bool my_bool = true; Cox / Fagan Simple Data Types 42

First Assignment Carefully read the assignments web page Honor code policy Slip day policy All assignments will be posted on that web page Assignments are due at 11:55PM on the due date, unless otherwise specified Assignments must be done on CLEAR servers Cox / Fagan Simple Data Types 43

Next Time Arrays and Pointers in C Cox / Fagan Simple Data Types 44