CMPS Introduction to Computer Science
Lecture Notes

Binary Numbers

Until now we have considered the Computing Agent that executes algorithms to be an abstract entity. Now we will be concerned with techniques for designing physical computing machines. Ultimately all data and instructions in modern computing devices are stored as bits (binary digits), i.e. 0s and 1s, so we need to discuss the Binary Numbering System. We begin with a more general idea: the Base-b Positional Numbering System.

What does 2,526 mean to you? You can read it in words, and you have some sense of how big a number it is, but what does it mean?

$$2526 = 0 \cdot 10^4 + 2 \cdot 10^3 + 5 \cdot 10^2 + 2 \cdot 10^1 + 6 \cdot 10^0$$

This is the Base-10 Positional Numbering System we are all familiar with. Recall this was introduced to Europe around 1200 AD, replacing the earlier Roman numerals and radically improving the efficiency of everyday computations.

Given an integer b > 1, the Base-b Positional Numbering System is obtained by assigning b symbols called digits to the integers 0, 1, 2, ..., (b - 1). A sequence (or string) of these symbols can then be interpreted as an integer. If the base is b = 5, we may choose any five symbols to represent 0, 1, 2, 3, and 4. A string of six such symbols standing for 3, 0, 4, 1, 2, and 0, in that order, then represents the integer

$$3 \cdot 5^5 + 0 \cdot 5^4 + 4 \cdot 5^3 + 1 \cdot 5^2 + 2 \cdot 5^1 + 0 \cdot 5^0 = 9{,}910$$

Here on Earth, we would not use unfamiliar symbols since we already have more familiar ones, namely 0, 1, 2, 3, and 4. Since we now wish to express numbers in different bases, we will surround a string of digits with square brackets and add a subscript to indicate the base. We would thus write $[304120]_5 = [9910]_{10}$.

In general, for any base b, we interpret a string of digits $d_{n-1} d_{n-2} \cdots d_2 d_1 d_0$ as

$$[d_{n-1} d_{n-2} \cdots d_2 d_1 d_0]_b = d_{n-1} b^{n-1} + d_{n-2} b^{n-2} + \cdots + d_2 b^2 + d_1 b^1 + d_0 b^0$$

Note the subscripts on the d's were chosen for convenience to coincide with the powers on the base b. More generally, a fraction is represented by inserting a Base-b point:

$$[d_{n-1} \cdots d_1 d_0 \,.\, d_{-1} d_{-2} \cdots d_{-k}]_b = d_{n-1} b^{n-1} + \cdots + d_1 b^1 + d_0 b^0 + d_{-1} b^{-1} + d_{-2} b^{-2} + \cdots + d_{-k} b^{-k}$$

For example,

$$[43.01]_5 = 4 \cdot 5^1 + 3 \cdot 5^0 + 0 \cdot 5^{-1} + 1 \cdot 5^{-2} = 20 + 3 + 0 + (1/25) = [23.04]_{10}$$
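The definition of a Base-b numeral can be checked mechanically. The following is a minimal Python sketch, not part of the original notes: the names `DIGITS`, `digit_value`, and `base_b_value` are hypothetical, and the digit alphabet is assumed to cover only bases 2 through 16. Exact fractions are used so that digits after the point are not rounded.

```python
from fractions import Fraction

# Digit symbols for bases 2 through 16 (an assumption of this sketch).
DIGITS = "0123456789ABCDEF"

def digit_value(symbol, b):
    """Return the integer a digit symbol stands for, checking it is legal in base b."""
    d = DIGITS.index(symbol)  # raises ValueError for an unknown symbol
    if d >= b:
        raise ValueError(f"{symbol!r} is not a base-{b} digit")
    return d

def base_b_value(numeral, b):
    """Evaluate a base-b numeral such as "43.01" as an exact Fraction."""
    whole, _, frac = numeral.partition(".")
    total = Fraction(0)
    # Integer part: the digit i places from the right is multiplied by b**i.
    for i, symbol in enumerate(reversed(whole)):
        total += digit_value(symbol, b) * Fraction(b) ** i
    # Fractional part: the j-th digit after the point is multiplied by b**(-j).
    for j, symbol in enumerate(frac, start=1):
        total += digit_value(symbol, b) * Fraction(b) ** -j
    return total

print(base_b_value("2526", 10))  # 2526
print(base_b_value("43.01", 5))  # 576/25, which is 23.04
```

Summing digit times power follows the definition literally; a faster version would use Horner's rule, but that would hide the correspondence between each digit and its power of b.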
Notice that there is no digit to represent b in the Base-b Positional Numbering System. The string 10 always represents the base of the system. Thus $[10]_5 = [5]_{10}$ and in general $[10]_b = b$.

In Computer Science we are most concerned with the following bases.

Name          Base     Digits
Decimal       b = 10   0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Binary        b = 2    0, 1
Octal         b = 8    0, 1, 2, 3, 4, 5, 6, 7
Hexadecimal   b = 16   0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F

We will now adopt the practice of omitting the square brackets when referring to base-10 numerals, and occasionally for other bases when no confusion arises.

Why can't we just use 10 to stand for ten in base 16? If we did, what number would the symbol sequence $[10]_{16}$ stand for? Would it be 10, or would it be $1 \cdot 16^1 + 0 \cdot 16^0 = 16$?

Counting in binary is like watching the odometer on a car.

Decimal   Binary
0         0
1         1
2         10
3         11
4         100
5         101
6         110
7         111
8         1000
9         1001
10        1010
11        1011
12        1100
13        1101
14        1110
15        1111
16        10000
...       ...
30        11110
31        11111
32        100000
33        100001
...       ...
62        111110
63        111111
64        1000000
65        1000001
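The odometer picture of counting can be made concrete. This is a small illustrative Python sketch under stated assumptions: the function name `next_binary` is hypothetical, and the numeral is assumed to be a string containing only the characters 0 and 1.

```python
def next_binary(numeral):
    """Return the binary numeral that follows `numeral`, odometer style."""
    digits = list(numeral)
    i = len(digits) - 1
    # Roll every trailing 1 over to 0, just as an odometer rolls 9 over to 0.
    while i >= 0 and digits[i] == "1":
        digits[i] = "0"
        i -= 1
    if i >= 0:
        digits[i] = "1"        # the first 0 from the right advances to 1
    else:
        digits.insert(0, "1")  # every wheel rolled over: add a new leading 1
    return "".join(digits)

# Count from 0 to 8 in binary.
numeral = "0"
for decimal in range(9):
    print(decimal, numeral)
    numeral = next_binary(numeral)
```

Each step touches only the rightmost run of 1s, which is why a new digit position appears exactly when the count reaches a power of 2.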
Converting from any base into base 10 is easy. Just write down the definition and simplify by doing all your calculations in base 10, as one would do normally.

$$[1011011101111]_2 = 5{,}871$$
$$[237]_8 = 2 \cdot 8^2 + 3 \cdot 8^1 + 7 \cdot 8^0 = 159$$
$$[A17D]_{16} = 10 \cdot 16^3 + 1 \cdot 16^2 + 7 \cdot 16^1 + 13 \cdot 16^0 = 41{,}341$$

Converting from base 10 into binary takes a little more work. As an example, we find the binary numeral for 357:

$$[357]_{10} = [\,?\,]_2$$

List all the powers of 2 up to the first one greater than or equal to 357.

k:     0   1   2   3   4    5    6    7     8     9
2^k:   1   2   4   8   16   32   64   128   256   512

Write 357 as a sum of powers of 2.

357 = 256 + 101
    = 256 + 64 + 37
    = 256 + 64 + 32 + 5
    = 256 + 64 + 32 + 4 + 1
    = $2^8 + 2^6 + 2^5 + 2^2 + 2^0$

Insert the missing powers of 2.

$$357 = 1 \cdot 2^8 + 0 \cdot 2^7 + 1 \cdot 2^6 + 1 \cdot 2^5 + 0 \cdot 2^4 + 0 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0$$

Now just read off the binary digits: $[357]_{10} = [101100101]_2$.

An alternative approach is to start with 357, compute the quotient and remainder upon division by 2, and repeat the process on the quotient until you reach a quotient of 0.

Quotient   Remainder
357
178        1
89         0
44         1
22         0
11         0
5          1
2          1
1          0
0          1

Now read off the sequence of remainders from bottom to top to get the binary representation of the number: 101100101.
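The first method above (writing the number as a sum of powers of 2 and then filling in the missing powers) can be sketched in Python as follows. This is only an illustration: the function name `to_binary_by_powers` is hypothetical, and the input is assumed to be a non-negative integer.

```python
def to_binary_by_powers(n):
    """Convert a non-negative integer to a binary numeral by subtracting powers of 2."""
    if n == 0:
        return "0"
    # Find the largest power of 2 that does not exceed n.
    power = 1
    while power * 2 <= n:
        power *= 2
    bits = []
    # Walk down through every power of 2, including the "missing" ones.
    while power >= 1:
        if power <= n:
            bits.append("1")   # this power is part of the sum
            n -= power
        else:
            bits.append("0")   # this power is missing from the sum
        power //= 2
    return "".join(bits)

print(to_binary_by_powers(357))  # 101100101
```

For 357 the loop selects 256, 64, 32, 4, and 1, the same powers found by hand, and emits a 0 for 128, 16, 8, and 2.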
Exercise: Try to write the repeated-division procedure above in pseudo-code.

Expressing a fraction in binary is basically the same process, i.e. first write the number as a sum of powers of two, then read off the binary digits.

Examples

12.75 = 8 + 4 + .5 + .25 = $2^3 + 2^2 + 2^{-1} + 2^{-2}$ = $[1100.11]_2$

15.375 = 8 + 4 + 2 + 1 + .25 + .125 = $2^3 + 2^2 + 2^1 + 2^0 + 2^{-2} + 2^{-3}$ = $[1111.011]_2$

Converting from Binary to Octal is simple since 8 is a power of 2. Just partition your binary number into groups of three bits (since $8 = 2^3$), adding extra zeros on the left if necessary, then convert each 3-bit binary number into one octal digit.

$$[101100101]_2 = [101\ 100\ 101]_2 = [545]_8$$

Dictionary

Octal   Binary
0       000
1       001
2       010
3       011
4       100
5       101
6       110
7       111

Binary to Hexadecimal and back is equally simple since 16 is also a power of 2. This time partition the binary numeral into groups of 4 bits (since $16 = 2^4$), again including leading zeros if necessary, then translate each group of 4 bits into one hex digit.

$$[101100101]_2 = [0001\ 0110\ 0101]_2 = [165]_{16}$$

Dictionary

Hexadecimal   Binary
0             0000
1             0001
2             0010
3             0011
4             0100
5             0101
6             0110
7             0111
8             1000
9             1001
A = 10        1010
B = 11        1011
C = 12        1100
D = 13        1101
E = 14        1110
F = 15        1111
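The grouping rule for octal and hexadecimal is the same procedure with a different group size. Here is a minimal Python sketch of it, assuming the input is a non-empty string of 0s and 1s; the names `DIGITS` and `regroup` are hypothetical.

```python
# Digit symbols for bases up to 16 (an assumption of this sketch).
DIGITS = "0123456789ABCDEF"

def regroup(binary, group_size):
    """Convert a binary numeral to base 2**group_size by grouping its bits.

    Use group_size=3 for octal and group_size=4 for hexadecimal.
    """
    # Pad with zeros on the left so the length is a multiple of group_size.
    pad = (-len(binary)) % group_size
    padded = "0" * pad + binary
    # Split into groups and translate each group into a single digit.
    groups = [padded[i:i + group_size] for i in range(0, len(padded), group_size)]
    return "".join(DIGITS[int(group, 2)] for group in groups)

print(regroup("101100101", 3))  # 545
print(regroup("101100101", 4))  # 165
```

The numeral 101100101 is the binary form of 357, so the two printed results are its octal and hexadecimal forms. The built-in `int(group, 2)` plays the role of the dictionary lookup for a single group.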
Reversing this process is also very simple. Translate each octal or hex digit into its group of bits, then drop any leading zeros.

$$[1332]_8 = [001\ 011\ 011\ 010]_2 = [1011011010]_2$$
$$[2DA]_{16} = [0010\ 1101\ 1010]_2 = [1011011010]_2$$

Doing arithmetic in other bases is little different from doing it in base ten. Just carry when a column sum is greater than or equal to the base.

Binary
     111011011
   + 101111010
   -----------
    1101010101

Octal
      733
    + 572
    -----
     1525

Hexadecimal
      1DB
    + 17A
    -----
      355

Observe that these three examples are actually one and the same. In decimal they would read

      475
    + 378
    -----
      853

In modern computers, all data is stored in binary form, whether it be numeric data or any other type of information. The fundamental unit of memory is a single cell. In almost every modern architecture a cell consists of 8 bits, also known as 1 byte.

1 byte = 8 bits

How many different bit patterns can be represented in 1 byte? More generally, how many different bit patterns of length n are there, for any n? There are 2 choices (0 or 1) for the first bit, 2 choices for the second bit, 2 for the third, ..., and finally 2 choices for the n-th bit. Thus there are

$$\underbrace{2 \cdot 2 \cdot 2 \cdots 2}_{n \text{ factors}} = 2^n$$

choices for the entire bit string. Thus

# bit strings of length n = $2^n$
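The carrying procedure for addition described above works the same way in every base. The following Python sketch is an illustration only: the names `DIGITS` and `add_in_base` are hypothetical, and the numerals are assumed to be non-negative, written as strings, in a base from 2 to 16.

```python
# Digit symbols for bases up to 16 (an assumption of this sketch).
DIGITS = "0123456789ABCDEF"

def add_in_base(x, y, b):
    """Add two base-b numerals column by column, carrying as in base ten."""
    # Pad the shorter numeral with leading zeros so the columns line up.
    width = max(len(x), len(y))
    x, y = x.rjust(width, "0"), y.rjust(width, "0")
    result = []
    carry = 0
    # Work from the rightmost column to the leftmost.
    for dx, dy in zip(reversed(x), reversed(y)):
        column = DIGITS.index(dx) + DIGITS.index(dy) + carry
        result.append(DIGITS[column % b])  # digit written in this column
        carry = column // b                # 1 when the column sum reaches b
    if carry:
        result.append(DIGITS[carry])
    return "".join(reversed(result))

print(add_in_base("733", "572", 8))    # 1525
print(add_in_base("1DB", "17A", 16))   # 355
print(add_in_base("475", "378", 10))   # 853
```

All three calls add the same two numbers, 475 and 378, so the three results are the octal, hexadecimal, and decimal numerals for 853.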
Since a single memory cell holds 1 byte = 8 bits, it can store $2^8 = 256$ different bit patterns. If these bit strings represent integers, then the range is 00000000 = 0 to 11111111 = 255. To store integers outside this range requires more than one cell.

We've discussed how to encode numeric data using bits, but how can we encode other types of information, such as text? There are a number of different Character Encodings in use.

ASCII (American Standard Code for Information Interchange) uses 1 byte (8 bits) to represent a single character. The codes 0-31 represent control characters, codes 32-126 are the printable characters (127 is the delete control character), and 128-255 are the so-called extended ASCII character set, which includes special symbols. See http://www.ascii-code.com/ for a complete table.

"Hello World!" would be represented (without quotes) as

Binary: 01001000 01100101 01101100 01101100 01101111 00100000 01010111 01101111 01110010 01101100 01100100 00100001
Hex:    48 65 6C 6C 6F 20 57 6F 72 6C 64 21

Unicode, in its original 16-bit form, uses 2 bytes (16 bits) to store a single character. The number of characters that can be represented this way is therefore $2^{16}$ = 65,536. This is many more than are available on a typical keyboard. We need so many to represent characters in all the world's alphabets. (The Unicode standard has since grown beyond 16 bits, and encodings such as UTF-8 and UTF-16 use a variable number of bytes per character.) The Unicode representation of Latin letters and English language keyboard symbols is just the ASCII code prepended with 8 zeros, so for instance H in Unicode is 0000000001001000 (= 0048 in hex). Go to http://unicode.org/charts/ to see the complete Unicode character set.
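The character codes above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original notes: the helper names `ascii_hex`, `ascii_binary`, and `sixteen_bit` are hypothetical, and the text is assumed to contain only ASCII characters.

```python
def ascii_hex(text):
    """Return the ASCII codes of text as two-digit hex numerals."""
    return " ".join(f"{code:02X}" for code in text.encode("ascii"))

def ascii_binary(text):
    """Return the ASCII codes of text as 8-bit binary numerals."""
    return " ".join(f"{code:08b}" for code in text.encode("ascii"))

def sixteen_bit(character):
    """Return the code of a single character as a 16-bit binary numeral."""
    return format(ord(character), "016b")

print(ascii_hex("Hello World!"))     # 48 65 6C 6C 6F 20 57 6F 72 6C 64 21
print(ascii_binary("Hello World!"))  # twelve 8-bit groups, starting 01001000
print(sixteen_bit("H"))              # 0000000001001000
```

`str.encode("ascii")` raises `UnicodeEncodeError` for any character outside the 0-127 range, which is a convenient way to notice text that plain ASCII cannot represent.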