Data Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach

Hassan Farhat
University of Nebraska at Omaha

Abstract- In the study of data representation in digital design and computer organization, we work with finite domains. We found that the early concepts of formal languages can be applied in the study. We found as well that covering the different data representations in sequence is beneficial. Traditional texts present the topics in a dispersed fashion. The contribution of the paper is in: (a) applying the concepts of an alphabet, a word over an alphabet, and a formal language to a digital design and computer organization course; and (b) deriving a single conversion equation for the different data encodings used. We first review the different binary encodings used in presenting numeric data. We then generate a generalized equation that can be applied to each of the main data types. By proper parameter substitutions we can convert between the different data types.

1 Introduction

Finite automata form one of the cornerstones in the study of computer science; they are also fundamental in the study of computer engineering. Finite automata and pushdown automata are found in the study of compilers in software [2, 3]. Finite automata are also found in the study of hardware design [1, 7]. In [7] the design of the control unit and datapath of a processor can be modeled as a finite state machine (FSM) with a datapath (FSMD). Algorithmic state machines (ASMs) are a generalization of finite state machines and are used throughout the design process at the register transfer level description of a computer.

At the early stages of studying digital design and computer organization, the concepts of finite state machines are deferred towards the second half of the course. They are covered under the concepts of state minimization and sequential circuit realizations. Instead, in the early part of the course, integer data representation and Boolean function realization are covered. Later in the study, the different representations of the real data type are covered.
The study of data representations is dispersed; at different parts of the course, the study includes unsigned data types for integer and fixed-point real numbers, signed data types, and floating-point number representation. When discussing data types we also discuss overflow and arithmetic errors for the many data representations.

The early parts of a formal languages course include the definition of an alphabet, the definition of a word of arbitrary length over an alphabet, and the definition of a formal language over an alphabet. We found these same concepts can be incorporated at the early stages of a digital design course. These concepts can be used to unify the coverage of all data types and present a common basis for the data representation process. In this paper we propose a teaching method that covers the topics as an ordered sequence by starting with the concept of an alphabet, a word over an alphabet, and a formal language. We then use the common features of encoding to derive a single conversion equation from a given code in the finite domain to a decimal equivalent.

The paper is organized as follows. Section 2 starts with the needed definitions of a formal language and how these definitions can be used to construct all possible words over the binary alphabet. The section includes the definition of concatenation of words and languages. Section 3 expands on the applications of the definitions; we progress to the different word interpretations in a digital design course, where we explore the representation of the three major data types: unsigned integers, signed integers and floating-point representations. Section 4 reviews the IEEE 754 floating-point representation. From the equations derived in sections 3 and 4, in section 5 we form a generalized conversion equation that applies to all data types. The conclusion is given in section 6.

2 Alphabet, Words and Formal Languages

The definitions of an alphabet, a word and a formal language are found in [2, 3]. An alphabet is defined as any set of symbols, $\Sigma$. We normally work with a finite set of symbols.
The cardinality of a set, $\Sigma$, is the number of elements in the set and is denoted as $|\Sigma|$. A word over the alphabet, $\Sigma$, is an ordered n-tuple $(a_1, a_2, \ldots, a_n)$ where each $a_i$ is an alphabet symbol in $\Sigma$. The length of the word is the number of alphabet symbols in the word. We normally abbreviate the word by removing the parentheses and commas (represent as $a_1 a_2 \ldots a_n$). We denote the set of all words of length n as

$$\Sigma^n = \{\, w \mid w = a_1 a_2 \ldots a_n \text{ where each } a_i \text{ is an element of } \Sigma \,\}$$

The set of all words of any length over an alphabet is defined as

$$\Sigma^* = \bigcup_{i \ge 0} \Sigma^i$$

where $\bigcup$ represents the union of sets operator. The set $\Sigma^0$ is a special set that contains a single special symbol, the empty word (a word of length 0). This symbol is the identity symbol under concatenation. A formal language, L, over an alphabet $\Sigma$ is defined as any subset of $\Sigma^*$.

The concatenation operator, ".", can be used to generate new words. The operator is defined over two alphabet symbols, two words, two languages, or any combination of symbols, words and languages. For example, let L be a formal language and a any alphabet symbol. The set a.L is defined as the set of all words that are prefixed with the symbol a followed by a word from L. For two languages $L_1$ and $L_2$ we define

$$L_1 . L_2 = \{\, w_1 w_2 \mid w_1 \text{ is in } L_1 \text{ and } w_2 \text{ is in } L_2 \,\}$$

The concatenation operator is associative but not commutative. The operator can be used to generate all words of length n + 1 from a known set of words of length n. We define

$$\Sigma^{(n+1)} = \Sigma . \Sigma^n = \{\, a_i w_j \text{ where } a_i \text{ is an element of } \Sigma \text{ and } w_j \text{ is an element of } \Sigma^n \,\}$$

The above can be used to determine the total number of words (all possible binary combinations) when $\Sigma = \{0, 1\}$. Recursively, one can show $|\Sigma^n| = |\Sigma|^n = 2^n$, the total number of possible binary combinations of length n bits.

It can as well be used to construct a truth table recursively. Given the truth table for n variables, the truth table for n + 1 variables can be constructed from the truth table for n variables using the concatenation operator

$$\Sigma^{(n+1)} = \Sigma . \Sigma^n = \{0, 1\} . \Sigma^n$$

This leads to the traditional method of generating the table, where the n + 1 column (starting at n = 0) is composed of a sequence of $2^n$ zeros followed by a sequence of $2^n$ ones. The definition can also be of help in discussing the concepts of instruction sets and the finite nature of the instruction sets.
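The recursive construction of all words of length n + 1 from the words of length n can be sketched in a few lines of Python. This is a minimal illustration, not part of the original course material; the function name `words` is an assumption introduced here, and the empty word is modeled as the empty string.

```python
def words(alphabet, n):
    """Return all words of length n over alphabet, built as Sigma . Sigma^(n-1)."""
    if n == 0:
        return [""]  # Sigma^0 holds only the empty word
    shorter = words(alphabet, n - 1)
    # Prefix every alphabet symbol a to every word w of length n - 1.
    return [a + w for a in alphabet for w in shorter]


rows = words("01", 3)      # the rows of a 3-variable truth table, in counting order
print(rows)
print(len(rows) == 2 ** 3)  # the cardinality of Sigma^n is 2^n for the binary alphabet
```

Because the symbol 0 is prefixed to every shorter word before the symbol 1, the leftmost column of the result is a run of zeros followed by a run of ones, which is the traditional truth table layout.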
3 The Three Common Data Types in Digital Design

In a typical digital design and computer organization course, the concept of binary arithmetic is considered early in the course and is covered over unsigned numbers. Additional data types and arithmetic on those types are covered later in the course. From the definition of an alphabet we can combine the discussion of the three data types and cover them successively in the order of unsigned, signed, and floating point. In the discussion, we make use of the cardinality of binary words of length n as an independent parameter that limits the range of numbers under any representation (the number of words in any set of n-bit words is always $2^n$). We review the three representations next.

Unsigned binary numbers: Given the n-bit binary (n-tuple) word number $x = x_{(n-1)} x_{n-2} \ldots x_0$, the decimal value of x, unsigned(x), is

$$\mathrm{unsigned}(x) = \sum_{i=0}^{n-1} x_i 2^i \qquad (1)$$

The above function is used to generate the minimum and maximum binary numbers: the minimum is a word with n zeros; the maximum is a word with n ones. Since the cardinality of $\Sigma^n$ is $2^n$, we conclude the maximum decimal value of n consecutive ones is $2^n - 1$. The definition also helps in the binary counting process. When counting, the first bit always alternates between zero and one. The second bit remains unchanged if the first bit value is zero. It changes (gets complemented) only after the first bit is one. Similarly, the ith bit remains unchanged until all less significant bits assume a value of one. This helps in the design of binary sequential counters later in the course.

Signed Numbers: Signed numbers are represented using one of three conventions: signed magnitude, radix complement or diminished radix complement. When applied to binary numbers, radix complement and diminished radix complement become, respectively, two's (2's) and one's (1's) complement. Today's computers represent signed integers in two's complement form. Floating-point numbers are represented using signed-magnitude form. To form the 2's complement of an n-bit number, x, we form the binary subtraction $2^n - x$.
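As a hedged illustration of the unsigned value function and of forming the 2's complement by the subtraction $2^n - x$, the following Python sketch works on bit strings. The helper names `unsigned` and `twos_complement_word` are assumptions made for this example.

```python
def unsigned(word):
    """Decimal value of a binary word: the sum of x_i * 2**i over all bits."""
    n = len(word)
    # word[0] is the most significant bit x_(n-1).
    return sum(int(bit) * 2 ** (n - 1 - pos) for pos, bit in enumerate(word))


def twos_complement_word(word):
    """Form the n-bit 2's complement of word via the subtraction 2**n - x."""
    n = len(word)
    value = (2 ** n - unsigned(word)) % 2 ** n  # the modulo keeps the all-zero word at n bits
    return format(value, "0{}b".format(n))


print(unsigned("1111"))              # n consecutive ones give 2**n - 1
print(twos_complement_word("0101"))  # 16 - 5 = 11, written with 4 bits
```

Applying `twos_complement_word` twice returns the original word, which is a quick sanity check students can run for any word length.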
Given an n-bit number x, as defined above, the decimal value of x, twos(x), can be found to be

$$\mathrm{twos}(x) = -x_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} x_i 2^i = -x_{n-1} 2^{n-1} + \mathrm{unsigned}(x_{n-2} x_{n-3} \ldots x_0) \qquad (2)$$

The above equation and the cardinality of the alphabet can be used to determine: (a) the range of positive numbers; (b) the range of negative numbers; and (c) the minimum and maximum in each range.
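The ranges that follow from the two's complement value function can be checked by enumerating every word of a small length. The sketch below is an illustration under the assumption that words are given as bit strings with the most significant bit first; the name `twos` mirrors the notation of the text.

```python
from itertools import product


def twos(word):
    """Two's complement value: -x_(n-1) * 2**(n-1) plus the unsigned value of the rest."""
    n = len(word)
    low = int(word[1:], 2) if n > 1 else 0
    return -int(word[0]) * 2 ** (n - 1) + low


n = 4
values = [twos("".join(bits)) for bits in product("01", repeat=n)]
negatives = [v for v in values if v < 0]
non_negatives = [v for v in values if v >= 0]
print(min(negatives), max(negatives))          # -2**(n-1) up to -1
print(min(non_negatives), max(non_negatives))  # 0 up to 2**(n-1) - 1
```

The enumeration also shows that the 2^n words split evenly: half encode negative values and half encode non-negative values.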
For the negative part, the range of negative numbers, from smallest to largest, is determined from the unsigned part (the least significant n - 1 bits). A decimal value of 0 for the unsigned part results in the smallest negative value, $-2^{(n-1)}$. When the unsigned part is all ones, its decimal value is $2^{(n-1)} - 1$. Hence, the largest negative decimal value is -1, corresponding to an n-bit word composed of a sequence of n ones. The range of non-negative numbers in binary and the corresponding decimal values is derived similarly. The cardinality of a set is used to determine the range of the numbers.

Real numbers: Real numbers have two common representations, fixed-point and floating-point. While the set of real numbers is uncountable, the range over n-bit words $\Sigma^n$ is finite and countable. Real numbers may need to represent very large numbers or very small fractions. To increase the range of values, floating-point representation is used. A real number, x, in fixed-point representation has a whole part, n bits, and a fractional part, m bits, $x = x_{(n-1)} x_{n-2} \ldots x_0 . x_{-1} \ldots x_{-m}$. Similarly, x in floating-point notation has an n-bit field representing the exponent part and an m-bit field representing the fractional part. The decimal value of the fixed-point number is determined using the equation

$$\mathrm{fixedpoint}(x) = \sum_{i=0}^{n-1} x_i 2^i + \sum_{i=1}^{m} x_{-i} 2^{-i} \qquad (3)$$

The range of the fractional part can be computed using a geometric series or by

$$\mathrm{fraction}(x) = \sum_{i=1}^{m} x_{-i} 2^{-i} = 2^{-m}\, \mathrm{unsigned}(x_{-1} \ldots x_{-m}) \qquad (4)$$

Hence the range of the fractional part is 0 to $2^{-m}(2^m - 1)$. The above representation and the cardinality of a set of words are discussed here as well as when we look at the floating-point representation range. (Over a 5-bit word, the total number of possible words is 32. If these represent numeric data then the maximum number of data items is 32. Hence, 10100 could be a code for: the decimal value 20 (unsigned number), the decimal value -12 (2's complement), or 20/32 (fraction).)

The decimal value of a number represented in floating-point form depends on the standard used. We consider this next and discuss the commonality between the different representations.
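The point that one word can carry several decimal values, depending on the interpretation applied, can be made concrete with a short Python sketch. The word 10100 is chosen here purely for illustration, and the helper names are assumptions of this example. The `twos` helper uses the equivalent form unsigned(x) minus x_(n-1) times 2^n, which follows from the two's complement value function.

```python
def unsigned(word):
    """Unsigned value of a bit string (most significant bit first)."""
    return int(word, 2)


def twos(word):
    """Two's complement value, written as unsigned(x) - x_(n-1) * 2**n."""
    return unsigned(word) - int(word[0]) * 2 ** len(word)


def fixed_point(whole, frac):
    """Fixed-point value: the (n + m)-bit unsigned integer scaled by 2**-m."""
    return unsigned(whole + frac) / 2 ** len(frac)


word = "10100"
print(unsigned(word))         # read as an unsigned integer
print(twos(word))             # read as a 2's complement integer
print(fixed_point("", word))  # read as a pure fraction with m = 5
print(fixed_point("100", "01"))
```

The same 5-bit word yields 20, -12 and 20/32, while the fixed-point word 100.01 yields 4.25.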
4 Floating-Point Numbers and Words Over the Binary Alphabet

When words represent floating-point numbers, the bits of a word are broken into 3 fields: a sign bit, a biased exponent field, and a fractional field. The standard floating-point representation used today is the IEEE 754 format developed around 1985. It applies to 32-bit and 64-bit representations. Earlier computers did not have a standard floating-point representation. Floating-point numbers are used to represent very large numbers as well as very small fractions. We review the representation.

Given a number of the form $N = x \times 10^y$, the number can be represented in binary using the IEEE 754 32-bit standard. The representation has 3 fields: a sign bit, a fractional part (significand), F, and a biased exponent part, E. Biased exponent representation means the encoding includes a constant value (the bias) added to the actual exponent. For an n-bit exponent field, the bias is 0 followed by n - 1 one bits. For the 32-bit IEEE representation (single precision) the floating-point format is shown in Figure 1.

| Bits | 31 | 30 ... 23 | 22 ... 0 |
| Field | S | E | F |

S = 0, positive; S = 1, negative

Figure 1

As can be seen from the figure, there are three fields: a 1-bit sign field, an 8-bit exponent field and a 23-bit fractional field. The sign bit is 0 or 1, representing positive and negative numbers. The exponent value used is a biased exponent. Hence the added bias is $01111111 = 2^7 - 1 = 127$. IEEE 754 has 3 interpretations (representations) of the 32-bit word: denormalized, normalized and special cases. The interpretations are based on the binary representation of the biased exponent.

1. If $E = (00000000)_2 = (00)_{16}$ then the representation is denormalized.
2. If $E \ne (00000000)_2$ and $E \ne (11111111)_2$ then the representation is normalized.
3. If $E = (11111111)_2$ then the representation is a special case.

We next look at the decimal value equation for each.

Denormalized word encoding: When $E = (00000000)_2$ the decimal value of the encoding is given by the equation

$$2^{1-\mathrm{bias}} (0.F) = 2^{-126} (0.F) \qquad (5)$$

Normalized word encoding: When $E \ne (00000000)_2$ and $E \ne (11111111)_2$ the decimal value of the encoding is given by the equation

$$2^{E-\mathrm{bias}} (1.F) = 2^{E-127} (1.F) \qquad (6)$$

Note the addition of 1 before the radix point in the fractional part.
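The three interpretations of a 32-bit single precision word can be sketched in Python and compared against the machine's own decoding. This is a minimal sketch: the function name `decode_single` is an assumption of this example, and the sign factor is applied from the S field (the magnitude equations in the text cover the E and F fields only). The standard `struct` module is used only as a reference for comparison.

```python
import struct

BIAS = 127  # 01111111 for the 8-bit exponent field


def decode_single(word):
    """Decode a 32-bit word (given as an unsigned integer) by the three cases."""
    s = word >> 31            # 1-bit sign field
    e = (word >> 23) & 0xFF   # 8-bit biased exponent field
    f = word & 0x7FFFFF       # 23-bit fractional field
    sign = -1.0 if s else 1.0
    if e == 0:                # denormalized: 2**(1 - bias) * (0.F)
        return sign * 2.0 ** (1 - BIAS) * (f / 2 ** 23)
    if e == 0xFF:             # special cases: infinity or NaN
        return sign * float("inf") if f == 0 else float("nan")
    return sign * 2.0 ** (e - BIAS) * (1 + f / 2 ** 23)  # normalized: 2**(E - bias) * (1.F)


word = 0x40880000  # S = 0, E = 10000001, F = 0001 followed by zeros
reference = struct.unpack(">f", struct.pack(">I", word))[0]
print(decode_single(word), reference)
```

Both values printed for the sample word are 4.25, since E = 129 gives an exponent of 2 and 1.F is 1.0001 in binary.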
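There are two special cases corresponding to an E field composed of all 1 bits. One case represents infinite numbers. This occurs when field F is all zeros. The other case represents not a number (NaN). The NaN case occurs when E is composed of all ones and the F field contains at least one non-zero bit.

The table in Figure 2 gives an illustration based on an example IEEE format but on a small word size, |E| = 2, |F| = 2. The figure includes nonnegative numbers only.

| E | F | E (decimal) | E_adjust | F (value) | F_adjust | Type | Decimal |
|---|---|---|---|---|---|---|---|
| 00 | 00 | 0 | 0 | 0 | 0 | denormalized | 0 |
| 00 | 01 | 0 | 0 | 1/4 | 1/4 | denormalized | 1/4 |
| 00 | 10 | 0 | 0 | 2/4 | 2/4 | denormalized | 2/4 |
| 00 | 11 | 0 | 0 | 3/4 | 3/4 | denormalized | 3/4 |
| 01 | 00 | 1 | 0 | 0 | 1 | normalized | 1 |
| 01 | 01 | 1 | 0 | 1/4 | 5/4 | normalized | 5/4 |
| 01 | 10 | 1 | 0 | 2/4 | 6/4 | normalized | 6/4 |
| 01 | 11 | 1 | 0 | 3/4 | 7/4 | normalized | 7/4 |
| 10 | 00 | 2 | 1 | 0 | 1 | normalized | 2 |
| 10 | 01 | 2 | 1 | 1/4 | 5/4 | normalized | 10/4 |
| 10 | 10 | 2 | 1 | 2/4 | 6/4 | normalized | 12/4 |
| 10 | 11 | 2 | 1 | 3/4 | 7/4 | normalized | 14/4 |
| 11 | 00 | x | x | x | x | Special values | Infinity |
| 11 | 01 | x | x | x | x | Special values | NaN |
| 11 | 10 | x | x | x | x | Special values | NaN |
| 11 | 11 | x | x | x | x | Special values | NaN |

Denormalized rows: $2^{1-\mathrm{bias}}(0.F) = 2^{0}(0.F)$, bias = 1. Normalized rows: $2^{E-\mathrm{bias}}(1.F) = 2^{E-1}(1.F)$, bias = 1. Infinity: E = 11 and F = 00; NaN: E = 11 and F = 01, 10, or 11.

Figure 2

5 Generating a Common Conversion Equation

Assume we are working with the set of data types over 32-bit words. Based on the discussion of alphabets and languages over the alphabet, we develop a common approach to computing the decimal value of a given binary encoding. In all the discussions we note that, because the set of words has finite cardinality while the set of real numbers is uncountable, the encoding function f (for a given real number x, f(x) is the binary encoding of x) is not 1-to-1. Hence, the inverse of f does not exist. We show that all the computations can be written in the form

$$k\, 2^{\alpha} (\beta) \qquad (7)$$

The proof is done by cases.

Case of unsigned numbers: Let the word encoding be $X = x_{(n-1)} x_{(n-2)} \ldots x_0$. By setting $\beta = X$, k = 1 and $\alpha = 0$, the proof follows.

Case of signed integers in 2's complement: Let the word encoding be $X = x_{(n-1)} x_{(n-2)} \ldots x_0$. We know the decimal value of X is given by equation (2) as

$$\mathrm{twos}(X) = -x_{n-1} 2^{n-1} + \mathrm{unsigned}(x_{n-2} x_{n-3} \ldots x_0)$$

The proof follows for k = 1 and $\alpha = 0$, with $\beta$ taken as the signed value on the right-hand side.

Case of real numbers in fixed-point notation: Let $x = x_{(n-1)} x_{n-2} \ldots x_0 . x_{-1} \ldots x_{-m}$. From equations 3 and 4 we have

$$\mathrm{fixedpoint}(x) = \sum_{i=-m}^{n-1} x_i 2^i = 2^{-m} \sum_{i=-m}^{n-1} x_i 2^{i+m} = 2^{-m} (x_{n-1} \ldots x_0 x_{-1} \ldots x_{-m})_2$$

Note that $(x_{n-1} \ldots x_0 x_{-1} \ldots x_{-m})_2$ represents an n + m bit unsigned binary integer. The above equation is satisfied for k = 1, $\alpha = -m$, and $\beta$ equal to this unsigned integer.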
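Case of real numbers in floating-point notation, denormalized form: For this case, the encoding is broken into two parts, the biased exponent part and the fractional part. Let $E = E_{(m-1)} E_{(m-2)} \ldots E_0$ and $F = F_{(n-1)} F_{(n-2)} \ldots F_0$ correspond to the exponent and fractional parts, respectively, and let |F| denote the number of bits in F. The number X is represented as EF. Using equation (5) for the denormalized part we have

$$X = 2^{1-\mathrm{bias}} (0.F) = 2^{1-\mathrm{bias}} \left( 2^{-|F|} F \right) = 2^{(1 - \mathrm{bias} - |F|)} F$$

The definition is satisfied for $\beta = F$, k = 1, $\alpha = (1 - \mathrm{bias} - |F|)$. Note that F is represented as an unsigned integer.

Case of real numbers in floating-point notation, normalized form: This represents the final case. From equation 6 we have

$$X = 2^{E-\mathrm{bias}} (1.F) = 2^{E-\mathrm{bias}}\, 2^{-|F|} (1F) = 2^{-(\mathrm{bias} + |F|)}\, 2^{E} (1F)$$

where 1F is the unsigned integer formed by prefixing a 1 bit to F, that is, $2^{|F|} + F$. The equation can be satisfied by the assignment $k = 2^{-(\mathrm{bias} + |F|)}$, $\alpha = (E)$, $\beta = 1F$.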
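The case analysis can be collected into one evaluation routine with a parameter choice per data type. The Python sketch below is illustrative only: the function names are assumptions, the choice of (k, alpha, beta) for the denormalized case is one valid assignment among several that give the same value, and special-case exponents are rejected rather than decoded.

```python
def convert(k, alpha, beta):
    """Evaluate the common form k * 2**alpha * beta."""
    return k * 2.0 ** alpha * beta


def params_unsigned(word):
    return 1, 0, int(word, 2)


def params_twos(word):
    # beta carries the signed value; k = 1 and alpha = 0.
    low = int(word[1:] or "0", 2)
    return 1, 0, -int(word[0]) * 2 ** (len(word) - 1) + low


def params_fixed(whole, frac):
    # beta is the (n + m)-bit unsigned integer; alpha = -m.
    return 1, -len(frac), int(whole + frac, 2)


def params_float(e_bits, f_bits):
    bias = 2 ** (len(e_bits) - 1) - 1
    E, F, f_len = int(e_bits, 2), int(f_bits, 2), len(f_bits)
    if E == 2 ** len(e_bits) - 1:
        raise ValueError("special case (infinity or NaN) has no decimal value")
    if E == 0:  # denormalized: beta = F as an unsigned integer
        return 1, 1 - bias - f_len, F
    # normalized: beta = 1F as an unsigned integer
    return 2.0 ** -(bias + f_len), E, 2 ** f_len + F


print(convert(*params_fixed("100", "01")))    # fixed-point word 100.01
print(convert(*params_float("101", "0001")))  # |E| = 3, |F| = 4, bias = 3
```

Both calls print 4.25, showing two different encodings that map to the same decimal value under their respective parameter substitutions.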
Before we leave the discussion, we emphasize to the students that under the same representation, unlike mapping decimal to binary, each of the functions above forms a 1-to-1 correspondence; i.e., two different encodings result in two different images (decimal values). We also emphasize that, depending on the nature of the encoding, different binary codes may have the same decimal value. For example, the fixed-point word 100.01 has a decimal value of 4.25. Similarly, the floating-point word 0 101 0001 (|E| = 3, |F| = 4) has the decimal value 4.25. This can be verified using the equation for the normalized floating-point representation, with bias = 3, E = 5 and 1F = 10001 = 17:

$$2^{-(\mathrm{bias} + |F|)}\, 2^{E} (1F) = 2^{-(3+4)} \times 2^{5} \times 17 = \frac{17}{4} = 4.25$$

6 Conclusion

In this paper we have introduced an alternative approach to teaching data representations in a digital design course. We have incorporated the use of formal languages. In addition, we have introduced a new general conversion equation. By proper parameter substitution, the equation can be used in conversions for the common binary number encodings: unsigned integers, signed integers, fixed-point real numbers, and floating-point representations. For the floating-point encoding, we have incorporated the IEEE 754 standard.

References

[1] Weste N. and Eshraghian K. (1993). Principles of CMOS VLSI Design: A Systems Perspective, 2nd edition. Addison Wesley.
[2] Martin J. (1991). Introduction to Languages and the Theory of Computation. McGraw Hill.
[3] Barrett W., Bates R., Gustafson D. and Couch J. (1979). Compiler Construction: Theory and Practice, 2nd edition. SRA Publishing.
[4] Nelson V., Nagle H., Carroll B. and Irwin J. (1995). Digital Logic Circuit Analysis and Design. Prentice Hall.
[5] Katz R. and Borriello G. (2005). Contemporary Logic Design, 2nd edition. Prentice Hall.
[6] Mano M. and Kime C. (2003). Logic and Computer Design Fundamentals, 3rd edition. Prentice Hall.
[7] Gajski D. (1997). Principles of Digital Design. Prentice Hall.

The conversion of a base 10 number to floating point follows these steps: (a) find the binary value of the number, (b) write it as $2^e \times 1.f$, (c) add the bias, in binary, to e to form E, (d) the floating-point number is represented as EF, with the least significant bits of F filled with 0 and the sign bit (MSB) set to 0 or 1.
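The conversion steps (a) to (d) can be sketched in Python for an arbitrary field size. This is a minimal sketch under stated assumptions: the function name `encode` and its parameters `e_len` and `f_len` are introduced here, only normalized values are handled, and fraction bits that do not fit are truncated rather than rounded as IEEE 754 would require. The standard `math.frexp` call stands in for steps (a) and (b).

```python
import math


def encode(value, e_len, f_len):
    """Return (sign, E, F) bit strings for a nonzero value in the normalized range."""
    if value == 0:
        raise ValueError("zero is not a normalized value")
    sign = "1" if value < 0 else "0"
    # frexp gives abs(value) = mantissa * 2**exp with 0.5 <= mantissa < 1.
    mantissa, exp = math.frexp(abs(value))
    e = exp - 1              # rewrite as 2**e * 1.f
    frac = mantissa * 2 - 1  # the fractional part f, with 0 <= f < 1
    bias = 2 ** (e_len - 1) - 1
    E = e + bias             # step (c): add the bias to e
    if not 0 < E < 2 ** e_len - 1:
        raise ValueError("outside the normalized range of this format")
    F = int(frac * 2 ** f_len)  # keep f_len fraction bits (truncation)
    return sign, format(E, "0{}b".format(e_len)), format(F, "0{}b".format(f_len))


print(encode(4.25, 3, 4))
print(encode(4.25, 8, 23))
```

With a 3-bit exponent and 4-bit fraction the call returns the fields 0, 101 and 0001; with the single precision field sizes it returns the exponent 10000001 and a fraction that starts with 0001.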