CS:APP2e Web Aside ASM:X87: X87-Based Support for Floating Point

Size: px
Start display at page:

Download "CS:APP2e Web Aside ASM:X87: X87-Based Support for Floating Point"

Transcription

1 CS:APP2e Web Aside ASM:X87: X87-Based Support for Floating Point Randal E. Bryant David R. O Hallaron June 5, 2012 Notie The material in this doument is supplementary material to the book Computer Systems, A Programmer s Perspetive, Seond Edition, by Randal E. Bryant and David R. O Hallaron, published by Prentie-Hall and opyrighted In this doument, all referenes beginning with CS:APP2e are to this book. More information about the book is available at sapp.s.mu.edu. This doument is being made available to the publi, subjet to opyright provisions. You are free to opy and distribute it, but you should not use any of this material without attribution. 1 Introdution The floating-point arhiteture for a proessor onsists of the different aspets that affet how programs operating on floating-point data are mapped onto the mahine, inluding: How floating-point values are stored and aessed. This is typially via some form of registers. The instrutions that operate on floating-point data. The onventions used for passing floating-point values as arguments to funtions, and for returning them as results. In this doument, we will desribe the floating-point arhiteture for x86 proessors known as x87. The set of instrutions for manipulating floating-point values is one of the least elegant features of the historial x86 arhiteture. In the original Intel mahines, floating point was performed by a separate oproessor, a unit with its own registers and proessing apabilities that exeutes a subset of the instrutions. This oproessor was implemented as a separate hip named the 8087, 80287, and i387, to aompany Copyright 2010, R. E. Bryant, D. R. O Hallaron. All rights reserved. 1

2 2 the proessor hips 8086, 80286, and i386, respetively. During these produt generations, hip apaity was insuffiient to inlude both the main proessor and the floating-point oproessor on a single hip. In addition, lower-budget mahines would omit floating-point hardware and simply perform the floating-point operations (very slowly!) in software. Sine the i486, floating point has been inluded as part of the IA32 CPU hip. The legay 8087 defines a set of instrutions and a storage model for implementing floating-point ode, often referred to as x87, muh as x86 refers to the evolutionary proessor arhiteture that started with the We will use the term x87 instrutions in this doument. The original 8087 oproessor was introdued to great alaim in It was the first single-hip floatingpoint unit (FPU), and the first implementation of what beame the IEEE 754 floating-point standard. Operating as a oproessor, the FPU would take over the exeution of floating-point instrutions after they were fethed by the main proessor. There was minimal onnetion between the FPU and the main proessor. Communiating data from one proessor to the other required the sending proessor to write to memory and the reeiving one to read it. Artifats of that design remain in the x87 floating-point instrution set today. In addition, the ompiler tehnology of 1980 was muh less sophistiated than it is today. Many features of x87 make it a diffiult target for optimizing ompilers. With the introdution of SSE2 in the Pentium 4 (2000), it has beome possible to implement single and double-preision floating-point arithmeti using SSE instrutions. These provide a muh better target for optimizing ompilers (see Web Aside ASM:SSE), and so slowly the use of the x87 floating-point arhiteture is being phased out of x86 ode. Still, x87 instrutions are the default for GCC when generating IA32 floating-point ode. They are also the only way to implement 80-bit extended-preision floating-point operations, suh as for C data type long double. In this doument, we onsider only IA32 ode. All x87 instrutions an be used with x86-64 ode, as well, but the onventions for passing funtion arguments and returning funtion values in x86-64 ode are based on the SSE floating-point arhiteture. 2 Floating-Point Registers X87 has eight floating-point registers, but unlike normal registers, these are treated as a shallow stak. The registers are identified as,, and so on, up to %st(7), with being the top of the stak. When more than eight values are pushed onto the stak, the ones at the bottom simply disappear. Rather than diretly indexing the registers, most of the arithmeti instrutions pop their soure operands from the stak, ompute a result, and then push the result onto the stak. Stak arhitetures were onsidered a lever idea in the 1970s, sine they provide a simple mehanism for evaluating arithmeti instrutions, and they allow a very dense oding of the instrutions. With advanes in ompiler tehnology and with the memory required to enode instrutions no longer onsidered a ritial resoure, these properties are no longer important. Compiler writers would be muh happier with a onventional set of floating-point registers, suh as is available with SSE. Aside: Other stak-based languages. Stak-based interpreters are still ommonly used as an intermediate representation between a high-level language and its mapping onto an atual mahine. Other examples of stak-based evaluators inlude Java byte ode, the intermediate format generated by Java ompilers, and the PostSript page formatting language. End Aside. Having the floating-point registers organized as a bounded stak makes it diffiult for ompilers to use these

3 3 Instrution load S storep D neg addp subp multp divp Effet Push value at S onto stak Pop top stak element and store at D Negate top stak element Pop top two stak elements; Push their sum Pop top two stak elements; Push their differene Pop top two stak elements; Push their produt Pop top two stak elements; Push their ratio Figure 1: Hypothetial stak instrution set. These instrutions are used to illustrate stak-based expression evaluation registers for storing the loal variables of a proedure that alls other proedures. For storing loal integer variables, we have seen that some of the general purpose registers an be designated as allee saved and hene be used to hold loal variables aross a proedure all. Suh a designation is not possible for an x87 register, sine its identity hanges as values are pushed onto and popped from the stak. For example, a push operation auses the value in to now be in. On the other hand, it might be tempting to treat the floating-point registers as a true stak, with eah proedure all pushing its loal values onto it. Unfortunately, this approah would quikly lead to a stak overflow, sine there is room for only eight values. Instead, the x87 registers must be treated as being allersaved. Compilers generate ode that saves every loal floating-point value on the main program stak before alling another proedure and then retrieves them on return. This generates memory traffi that an degrade program performane. As noted in Web Aside DATA:IA32-FP the IA32 floating-point registers are all 80 bits wide. They enode numbers in an extended-preision format as desribed in CS:APP2e Problem All single and doublepreision numbers are onverted to this format as they are loaded from memory into floating-point registers. The arithmeti is always performed in extended preision. Numbers are onverted from extended preision to single or double-preision format as they are stored in memory. 3 Stak Evaluation of Expressions To understand how x87 uses its registers as a stak, let us onsider a more abstrat version of stak-based evaluation on a hypothetial stak mahine. One we have introdued the basi exeution model, we will return to the somewhat more arane x87 arhiteture. Assume we have an arithmeti unit that uses a stak to hold intermediate results, having the instrution set illustrated in Figure 1. In addition to the stak, this unit has a memory that an hold values we will refer to by names suh as a, b, and x. As Figure 1 indiates, we an push memory values onto this stak with the load instrution. The storep operation pops the top element from the stak and stores the result in memory. A unary operation suh as neg (negation) uses the top stak element as its argument and overwrites this element with the result. Binary operations suh as addp and multp use the top two elements of the stak as their arguments. They pop both arguments off the stak and then push the result bak onto the stak. We use the suffix p with the store, add, subtrat,

4 4 multiply, and divide instrutions to emphasize the fat that these instrutions pop their operands. As an example, onsider the expression x = (a-b)/(-b+). We ould translate this expression into the ode that follows. Alongside eah line of ode, we show the ontents of the floating-point register stak. In keeping with our earlier onvention, we show the stak as growing downward, so the top of the stak is really at the bottom. 1 load b + 6 load a a b %st(2) 2 load b b b + 7 subp a b 3 neg b 8 divp (a b)/( b + ) 4 addp b + 9 storep x b + 5 load b b As this example shows, there is a natural reursive proedure for onverting an arithmeti expression into stak ode. Our expression notation has four types of expressions having the following translation rules: 1. A variable referene of the form Var. This is implemented with the instrution load Var. 2. A unary operation of the form - Expr. This is implemented by first generating the ode for Expr followed by a neg instrution. 3. A binary operation of the form Expr 1 + Expr 2, Expr 1 - Expr 2, Expr 1 * Expr 2, or Expr 1 / Expr 2. This is implemented by generating the ode for Expr 2, followed by the ode for Expr 1, followed by an addp, subp, multp, or divp instrution. 4. An assignment of the form Var = Expr. This is implemented by first generating the ode for Expr, followed by the storep Var instrution. As an example, onsider the expression x = a-b/. Sine division has preedene over subtration, this expression an be parenthesized as x = a-(b/). The reursive proedure would therefore proeed as follows: 1. Generate ode for Expr. = a-(b/): (a) Generate ode for Expr 2. = b/: i. Generate ode for Expr 2. = using the instrution load. ii. Generate ode for Expr 1. = b, using the instrution load b.

5 5 iii. Generate instrution divp. (b) Generate ode for Expr 1. = a, using the instrution load a. () Generate instrution subp. 2. Generate instrution storep x. The overall effet is to generate the following stak ode: 1 load b/ 4 load a a 2 load b b 5 subp a (b/) 3 divp b/ Pratie Problem 1: 6 storep x Generate stak ode for the expression x = a*b/ * -(a+b*). Diagram the ontents of the stak for eah step of your ode. Remember to follow the C rules for preedene and assoiativity. Stak evaluation beomes more omplex when we wish to use the result of some omputation multiple times. For example, onsider the expression x = (a*b)*(-(a*b)+). For effiieny, we would like to ompute a*b only one, but our stak instrutions do not provide a way to keep a value on the stak one it has been used. With the set of instrutions listed in Figure 1, we would therefore need to store the intermediate result a*b in some memory loation, say t, and retrieve this value for eah use. This gives the following ode: 1 load 7 neg (a b) 2 load b b 3 load a a b 4 multp a b %st(2) 8 addp (a b) + (a b) + 9 load t a b 10 multp a b ( (a b) + ) 5 storep t 11 storep x 6 load t a b

6 6 Instrution Soure format Soure loation flds Addr Single M 4 [Addr] fldl Addr Double M 8 [Addr] fldt Addr Extended M 10 [Addr] fildl Addr Integer M 4 [Addr] fld %st(i) Extended %st(i) Figure 2: Floating-point load instrutions. All onvert the operand to extended-preision format and push it onto the register stak. This approah has the disadvantage of generating additional memory traffi, even though the register stak has suffiient apaity to hold its intermediate results. The x87 instrution set avoids this ineffiieny by introduing variants of the arithmeti instrutions that leave their seond operand on the stak, and that an use an arbitrary stak value as their seond operand. In addition, it provides an instrution that an swap the top stak element with any other element. Although these extensions an be used to generate more effiient ode, the simple and elegant algorithm for translating arithmeti expressions into stak ode is lost. 4 Floating-Point Data Movement and Conversion Operations Floating-point registers are referened with the notation %st(i), where i denotes the position relative to the top of the stak. The value i an range between 0 and 7. Register is the top stak element, is the seond element, and so on. The top stak element an also be referened as %st. When a new value is pushed onto the stak, the value in register %st(7) is lost. When the stak is popped, the new value in %st(7) is not preditable. Compilers must generate ode that works within the limited apaity of the register stak. Figure 2 shows the set of instrutions used to push values onto the floating-point register stak. The first group of these read from a memory loation, where the argument Addr is a memory address given in one of the memory operand formats listed in CS:APP2e Figure 3.3. These instrutions differ by the presumed format of the soure operand and hene the number of bytes that must be read from memory. Reall that the notation M b [Addr] indiates an aess of b bytes with starting address Addr. All of these instrutions onvert the operand to extended-preision format before pushing it onto the stak. The final load instrution fld is used to dupliate a stak value. That is, it pushes a opy of floating-point register %st(i) onto the stak. For example, the instrution fld pushes a opy of the top stak element onto the stak. Figure 3 shows the instrutions that store the top stak element either in memory or in another floating-point register. There are both popping versions that pop the top element off the stak (similar to the storep instrution for our hypothetial stak evaluator), as well as nonpopping versions that leave the soure value on the top of the stak. As with the floating-point load instrutions, different variants of the instrution generate different formats for the result and therefore store different numbers of bytes. The first group of these store the result in memory. The address is speified using any of the memory operand formats listed in CS:APP2e Figure 3.3. The seond group opies the top stak element to some other floating-point register.

7 7 Instrution Pop (Y/N) Destination format Destination loation fsts Addr N Single M 4 [Addr] fstps Addr Y Single M 4 [Addr] fstl Addr N Double M 8 [Addr] fstpl Addr Y Double M 8 [Addr] fstt Addr N Extended M 10 [Addr] fstpt Addr Y Extended M 10 [Addr] fistl Addr N Integer M 4 [Addr] fistpl Addr Y Integer M 4 [Addr] fst %st(i) N Extended %st(i) fstp %st(i) Y Extended %st(i) Figure 3: Floating-point store instrutions. All onvert from extended-preision format to the destination format. Instrutions with suffix p pop the top element off the stak. Pratie Problem 2: Assume for the following ode fragment that register %eax ontains an integer variable x and that the top two stak elements orrespond to variables a and b, respetively. Fill in the boxes to diagram the stak ontents after eah instrution 1 testl %eax,%eax 2 jne L11 a b 3 fstp 4 jmp L9 5 L11: 6 fstp 7 L9: Write a C expression desribing the ontents of the top stak element at the end of this ode sequene in terms of x, a and b. A final floating-point data movement operation allows the ontents of two floating-point registers to be swapped. The instrution fxh %st(i) exhanges the ontents of floating-point registers and %st(i). The notation fxh written with no argument is equivalent to fxh, that is, swap the top two stak elements.

8 8 Instrution Computation fldz 0 fld1 1 fabs Op fhs Op fos os Op fsin sin Op fsqrt Op fadd Op 1 + Op 2 fsub Op 1 Op 2 fsubr Op 2 Op 1 fdiv Op 1 /Op 2 fdivr Op 2 /Op 1 fmul Op 1 Op 2 Figure 4: Floating-point arithmeti operations. Eah of the binary operations has many variants. Instrution Operand 1 Operand 2 Format Destination Pop (Y/N) fsubs Addr M 4 [Addr] Single N fsubl Addr M 8 [Addr] Double N fsubt Addr M 10 [Addr] Extended N fisubl Addr M 4 [Addr] Integer N fsub %st(i),%st %st(i) Extended N fsub %st,%st(i) %st(i) Extended %st(i) N fsubp %st,%st(i) %st(i) Extended %st(i) Y fsubp Extended Y Figure 5: Floating-point subtration instrutions. All store their results into a floating-point register in extended-preision format. Instrutions with suffix p pop the top element off the stak. 5 Floating-Point Arithmeti Instrutions Figure 4 douments some of the most ommon floating-point arithmeti operations. Instrutions in the first group have no operands. They push the floating-point representation of some numerial onstant onto the stak. There are similar instrutions for suh onstants as π, e, and log Instrutions in the seond group have a single operand. The operand is always the top stak element, similar to the neg operation of the hypothetial stak evaluator. They replae this element with the omputed result. Instrutions in the third group have two operands. For eah of these instrutions, there are many different variants for how the operands are speified, as will be disussed shortly. For nonommutative operations suh as subtration and division there is both a forward (e.g., fsub) and a reverse (e.g., fsubr) version, so that the arguments an be used in either order. In Figure 4 we show just a single form of the subtration operation fsub. In fat, this operation omes in

9 9 many different variants, as shown in Figure 5. All ompute the differene of two operands: Op 1 Op 2 and store the result in some floating-point register. Beyond the simple subp instrution we onsidered for the hypothetial stak evaluator, x87 has instrutions that read their seond operand from memory or from some floating-point register other than. In addition, there are both popping and nonpopping variants. The first group of instrutions reads the seond operand from memory, either in single-preision, double-preision, or integer format. It then onverts this to extended-preision format, subtrats it from the top stak element, and overwrites the top stak element. These an be seen as a ombination of a floating-point load following by a stak-based subtration operation. The seond group of subtration instrutions use the top stak element as one argument and some other stak element as the other, but they vary in the argument ordering, the result destination, and whether or not they pop the top stak element. Observe that the assembly ode line fsubp is shorthand for fsubp %st,. This line orresponds to the subp instrution of our hypothetial stak evaluator. That is, it omputes the differene between the top two stak elements, storing the result in, and then popping so that the omputed value ends up on the top of the stak. All of the binary operations listed in Figure 4 ome in all of the variants listed for fsub in Figure 5. As an example, we an write the ode for the expression x = (a-b)*(-b+) using x87 instrutions. For exposition purposes we will still use symboli names for memory loations and we assume these are double-preision values. 1 fldl b b b + 5 fsubl b a b 2 fhs b 6 fmulp (a b)( b + ) 3 faddl b + 7 fstpl x b + 4 fldl a a As another example, onsider the expression x = ((-(a*b)+)*(a*b). Observe how the instrution fld is used to reate two opies of a*b on the stak, avoiding the need to save the value in a temporary memory loation.

10 10 1 fldl a a a b 5 faddl (a b) + 2 fmull b a b 6 fmulp ( (a b) + ) (a b) a b 3 fld a b a b 4 fhs (a b) 7 fstpl x Pratie Problem 3: Diagram the stak ontents after eah step of the following ode: 1 fldl b 2 fldl a 3 fmul,%st 4 fxh 5 fdivrl 6 fsubrp 7 fstp x Give a C expression desribing this omputation. 6 Using Floating Point in Proedures With IA32, floating-point arguments are passed to a alling proedure on the stak, just as are integer arguments. Eah parameter of type float requires 4 bytes of stak spae, while eah parameter of type

11 11 double requires 8. For funtions whose return values are of type float or double, the result is returned on the top of the floating-point register stak in extended-preision format. As an example, onsider the following funtion 1 double funt(double a, float x, double b, int i) 2 { 3 return a*x - b/i; 4 } Arguments a, x, b, and i will be at byte offsets 8, 16, 20, and 28 relative to %ebp, respetively, as follows: Offset Contents a x b i The body of the generated ode, and the resulting stak values are as follows: 1 fildl 28(%ebp) i 2 fdivrl 20(%ebp) b/i b/i 3 flds 16(%ebp) x b/i 4 fmull 8(%ebp) a x 5 fsubp %st, a x b/i Pratie Problem 4: For a funtion funt2 with arguments p, q, r, and s, the ompiler generates the following ode for the funtion body: 1 fildl 8(%ebp) 2 flds 20(%ebp) 3 faddl 12(%ebp) 4 fdivrp %st, 5 fld1 6 fadds 24(%ebp) 7 fsubrp %st, The returned value is of type double. Write C ode for funt2. Be sure to orretly delare the argument types.

12 12 Ordered Unordered Op 2 Type Number of pops foms Addr fuoms Addr M 4 [Addr] Single 0 foml Addr fuoml Addr M 8 [Addr] Double 0 fom %st(i) fuom %st(i) %st(i) Extended 0 fom fuom Extended 0 fomps Addr fuomps Addr M 4 [Addr] Single 1 fompl Addr fuompl Addr M 8 [Addr] Double 1 fomp %st(i) fuomp %st(i) %st(i) Extended 1 fomp fuomp Extended 1 fompp fuompp Extended 2 Figure 6: Floating-point omparison instrutions. Ordered vs. unordered omparisons differ in their treatment of NaN s. Op 1 : Op 2 Binary Deimal > [ ] 0 < [ ] 1 = [ ] 64 Unordered [ ] 69 Figure 7: Enoded results from floating-point omparison. The results are enoded in the high-order byte of the floating-point status word after masking out all but bits 0, 2, and 6. 7 Testing and Comparing Floating-Point Values Similar to the integer ase, determining the relative values of two floating-point numbers involves using a omparison instrution to set ondition odes and then testing these ondition odes. For floating point, however, the ondition odes are part of the floating-point status word, a 16-bit register that ontains several flags about the floating-point unit. This status word must be transferred to an integer word, and then the partiular bits must be tested. There are a number of different floating-point omparison instrutions as doumented in Figure 6. All of them perform a omparison between operands Op 1 and Op 2, where Op 1 is the top stak element. Eah line of the table douments two different omparison types: an ordered omparison and an unordered omparison. The two omparisons differ only in how they handle the ase when both arguments are some form of NaN. Even then, their only differene are that one sets an exeption flag while the other does not, but this flag is typially ignored anyhow, and so we find GCC using the two forms interhangeably. Different omparison instrutions also differ in the loation of operand Op 2, analogous to the different forms of floating-point load and floating-point arithmeti instrutions. Finally, the different forms differ in the number of elements popped off the stak after the omparison is ompleted. Instrutions in the first group shown in the table do not hange the stak at all. Even for the ase where one of the arguments is in memory, this value is not on the stak at the end. Operations in the seond group pop element Op 1 off the stak. The final operation pops both Op 1 and Op 2 off the stak.

13 13 The floating-point status word is transferred to an integer register with the fnstsw instrution. The operand for this instrution is one of the 16-bit register identifiers shown in CS:APP2e Figure 3.2, for example, %ax. The bits in the status word enoding the omparison results are in bit positions 0, 2, and 6 of the high-order byte of the status word. For example, if we use instrution fnstw %ax to transfer the status word, then the relevant bits will be in %ah. A typial ode sequene to selet these bits is then: 1 fnstsw %ax Store floating-point status word in %ax 2 testb $69, %ah Test bits 0, 2, and 6 of word Note that has bit representation [ ], that is, it has 1s in the three relevant bit positions. Figure 7 shows the possible values of byte %ah that would result from this ode sequene. Observe that there are only four possible outomes for omparing operands Op 1 and Op 2 : the first is either greater, less, equal, or inomparable to the seond, where the latter outome only ours when one of the values is a NaN. (Any omparison with a NaN value should yield 0. For example, if x is NaN, then the omparisons x < y, x == y, and x > y should all yield 0.) As an example, onsider the following proedure: 1 int less(double x, double y) 2 { 3 return x < y; 4 } The ompiled ode for the funtion body is as follows: 1 fldl 16(%ebp) Push y 2 fldl 8(%ebp) Push x 3 fxh Swap x and y on stak 4 fuompp Compare y:x and pop both 5 fnstsw %ax Store floating-point status word in %ax 6 testb $69, %ah Test bits 0, 2, and 6 of word 7 sete %al Test for omparison outome of 0 (>) 8 movzbl %al, %eax Zero extend to get result Pratie Problem 5: Suppose we want to generate ode for the following funtion 1 int lesseq(double x, double y) 2 { 3 return x <= y; 4 } How ould we adapt the ode generated for the less funtion, hanging only the mask used by the testb instrution from 69 to something else. This ompletes our overage of floating-point programming with x87. Even experiened programmers find this ode arane and diffiult to read. The stak-based operations, the awkwardness of getting status

14 14 results from the FPU to the main proessor, and the many subtleties of floating-point omputations ombine to make the mahine ode lengthy and obsure. As mentioned in the introdution, the SSE floating-point arhiteture has beome the preferred method for implementing floating-point operations on x86 proessors, but we an expet that the use of x87 instrutions will ontinue, given the large amount of legay ode and the many legay mahines that still exist. Solutions to Problems Problem 1 Solution: [Pg. 5] This problem gives you a hane to try out the reursive proedure desribed in Setion 3. 1 load 2 load b b 3 multp b b 4 load a a (a + b ) 8 load b b (a + b ) 9 load a a b (a + b ) 10 multp a b (a + b ) 11 divp a b/ %st(2) %st(3) %st(2) %st(2) 5 addp a + b 6 neg (a + b ) (a + b ) 7 load 12 multp a b/ (a + b ) 13 storep x Problem 2 Solution: [Pg. 6] The following ode is similar to that generated by the ompiler for seleting between two values based on the outome of a test: 1 test %eax,%eax 2 jne L11 a b

15 15 3 fstp b 4 jmp L9 5 L11: 6 fstp a 7 L9: The resulting top of stak value is x? a : b. Problem 3 Solution: [Pg. 10] Floating-point ode is triky, with its different onventions about popping operands, the order of the arguments, et. This problem gives you a hane to work through some speifi ases in omplete detail. 1 fldl b b 2 fldl a a b b 3 fmul,%st a b a b 4 fxh b a b 5 fdivrl /b 6 fsubrp a b /b 7 fstp x This ode omputes the expression x = a*b - /b. Problem 4 Solution: [Pg. 11] This problem requires you to think about the different operand types and sizes in floating-point ode. Looking at the ode, we see that it reads its arguments from offsets 8, 12, 20, and 24 relative to %ebp. We then annotate the ode as follows:

16 16 p, q, r, and s are at offsets 8, 12, 16, and 24 from %ebp 1 fildl 8(%ebp) Get p (int) 2 flds 20(%ebp) Get r (float) 3 faddl 12(%ebp) Get q (double) and ompute q+r 4 fdivrp %st, Compute p/(q+r) 5 fld1 Load fadds 24(%ebp) Get s (float) and ompute s fsubrp %st, Compute p/(q+r) - (s+1.0) Based on the instrutions that read the arguments from memory, we determine that arguments p, q, r, and s are of types int, double, float, and float, respetively. We an also determine the expression being omputed, leading us to generate the following C version of the ode: 1 double funt2(int p, double q, float r, float s) 2 { 3 return p/(q+r) - (s+1); 4 } Problem 5 Solution: [Pg. 13] Sine the arguments are ompared in reverse order, we want to return 1 when the outome of the test is either greater or equal. The ode should aept the two ases in Figure 7 with deimal values 0 and 64 but rejet those with values 1 and 69. Using a mask of either 1 or 5 will aomplish this.

17 Index CS:APP2e, 1 arhiteture floating-point, 1 fabs [IA32/x86-64] FP absolute value, 8 fadd [IA32/x86-64] FP add, 8 fhs [IA32/x86-64] FP negate, 8 fom [IA32/x86-64] FP ompare, 12 foml [IA32/x86-64] FP ompare double preision, 12 fomp [IA32/x86-64] FP ompare with pop, 12 fompl [IA32/x86-64] FP ompare double preision with pop, 12 fompp [IA32/x86-64] FP ompare with two pops, 12 fomps [IA32/x86-64] FP ompare single preision with pop, 12 foms [IA32/x86-64] FP ompare single preision, 12 fos [IA32/x86-64] FP osine, 8 fdiv [IA32/x86-64] FP divide, 8 fdivr [IA32/x86-64] FP reverse divide, 8 fildl [IA32/x86-64] FP load and onvert integer, 6 fistl [IA32/x86-64] FP onvert and store integer, 7 fistpl [IA32/x86-64] FP onvert and store integer with pop, 7 fisubl [IA32/x86-64] FP load and onvert integer and subtrat, 8 fld1 [IA32/x86-64] FP load one, 8 fldl [IA32/x86-64] FP load double preision, 6 fldl [IA32/x86-64] FP load from register, 6 flds [IA32/x86-64] FP load single preision, 6 fldt [IA32/x86-64] FP load extended preision, 6 fldz [IA32/x86-64] FP load zero, 8 floating point registers in x87, 2 status word, 12 floating-point arhiteture, 1 fmul [IA32/x86-64] FP multiply, 8 fnstw [IA32/x86-64] opy FP status word, 13 fsin [IA32/x86-64] FP sine, 8 fsqrt [IA32/x86-64] FP square root, 8 fst [IA32/x86-64] FP store to register, 7 fstl [IA32/x86-64] FP store double preision, 7 fstp [IA32/x86-64] FP store to register with pop, 7 fstpl [IA32/x86-64] FP store double preision with pop, 7 fstps [IA32/x86-64] FP store single preision with pop, 7 fstpt [IA32/x86-64] FP store extended preision with pop, 7 fsts [IA32/x86-64] FP store single preision, 7 fstt [IA32/x86-64] FP store extended preision, 7 fsub [IA32/x86-64] FP subtrat, 8 fsubl [IA32/x86-64] FP load double preision and subtrat, 8 fsubp [IA32/x86-64] FP subtrat with pop, 8 fsubr [IA32/x86-64] FP reverse subtrat, 8 fsubs [IA32/x86-64] FP load single preision and subtrat, 8 fsubt [IA32/x86-64] FP load extended preision and subtrat, 8 fuom [IA32/x86-64] FP unordered ompare, 12 fuoml [IA32/x86-64] FP unordered ompare double preision, 12 fuomp [IA32/x86-64] FP unordered ompare with pop, 12 fuompl [IA32/x86-64] FP unordered ompare double preision with pop, 12 fuompp [IA32/x86-64] FP unordered ompare with two pops, 12 fuomps [IA32/x86-64] FP unordered ompare single preision with pop, 12 fuoms [IA32/x86-64] FP unordered ompare single preision, 12 fxh [IA32/x86-64] FP exhange registers, 7 IEEE 754 floating-point standard, 2 registers x87 floating point, 2 17

18 status word, floating-point, 12 18

Reading Object Code. A Visible/Z Lesson

Reading Object Code. A Visible/Z Lesson Reading Objet Code A Visible/Z Lesson The Idea: When programming in a high-level language, we rarely have to think about the speifi ode that is generated for eah instrution by a ompiler. But as an assembly

More information

Reading Object Code. A Visible/Z Lesson

Reading Object Code. A Visible/Z Lesson Reading Objet Code A Visible/Z Lesson The Idea: When programming in a high-level language, we rarely have to think about the speifi ode that is generated for eah instrution by a ompiler. But as an assembly

More information

COMP 181. Prelude. Intermediate representations. Today. Types of IRs. High-level IR. Intermediate representations and code generation

COMP 181. Prelude. Intermediate representations. Today. Types of IRs. High-level IR. Intermediate representations and code generation Prelude COMP 181 Intermediate representations and ode generation November, 009 What is this devie? Large Hadron Collider What is a hadron? Subatomi partile made up of quarks bound by the strong fore What

More information

Compilation Lecture 11a. Register Allocation Noam Rinetzky. Text book: Modern compiler implementation in C Andrew A.

Compilation Lecture 11a. Register Allocation Noam Rinetzky. Text book: Modern compiler implementation in C Andrew A. Compilation 0368-3133 Leture 11a Text book: Modern ompiler implementation in C Andrew A. Appel Register Alloation Noam Rinetzky 1 Registers Dediated memory loations that an be aessed quikly, an have omputations

More information

Background/Review on Numbers and Computers (lecture)

Background/Review on Numbers and Computers (lecture) Bakground/Review on Numbers and Computers (leture) ICS312 Mahine-Level and Systems Programming Henri Casanova (henri@hawaii.edu) Numbers and Computers Throughout this ourse we will use binary and hexadeimal

More information

The x87 Floating-Point Unit

The x87 Floating-Point Unit The x87 Floating-Point Unit Lecture 27 Intel Manual, Vol. 1, Chapter 8 Intel Manual, Vol. 2 Robb T. Koether Hampden-Sydney College Fri, Mar 27, 2015 Robb T. Koether (Hampden-Sydney College) The x87 Floating-Point

More information

13.1 Numerical Evaluation of Integrals Over One Dimension

13.1 Numerical Evaluation of Integrals Over One Dimension 13.1 Numerial Evaluation of Integrals Over One Dimension A. Purpose This olletion of subprograms estimates the value of the integral b a f(x) dx where the integrand f(x) and the limits a and b are supplied

More information

On - Line Path Delay Fault Testing of Omega MINs M. Bellos 1, E. Kalligeros 1, D. Nikolos 1,2 & H. T. Vergos 1,2

On - Line Path Delay Fault Testing of Omega MINs M. Bellos 1, E. Kalligeros 1, D. Nikolos 1,2 & H. T. Vergos 1,2 On - Line Path Delay Fault Testing of Omega MINs M. Bellos, E. Kalligeros, D. Nikolos,2 & H. T. Vergos,2 Dept. of Computer Engineering and Informatis 2 Computer Tehnology Institute University of Patras,

More information

8 Instruction Selection

8 Instruction Selection 8 Instrution Seletion The IR ode instrutions were designed to do exatly one operation: load/store, add, subtrat, jump, et. The mahine instrutions of a real CPU often perform several of these primitive

More information

represent = as a finite deimal" either in base 0 or in base. We an imagine that the omputer first omputes the mathematial = then rounds the result to

represent = as a finite deimal either in base 0 or in base. We an imagine that the omputer first omputes the mathematial = then rounds the result to Sientifi Computing Chapter I Computer Arithmeti Jonathan Goodman Courant Institute of Mathemaial Sienes Last revised January, 00 Introdution One of the many soures of error in sientifi omputing is inexat

More information

HEXA: Compact Data Structures for Faster Packet Processing

HEXA: Compact Data Structures for Faster Packet Processing Washington University in St. Louis Washington University Open Sholarship All Computer Siene and Engineering Researh Computer Siene and Engineering Report Number: 27-26 27 HEXA: Compat Data Strutures for

More information

Direct-Mapped Caches

Direct-Mapped Caches A Case for Diret-Mapped Cahes Mark D. Hill University of Wisonsin ahe is a small, fast buffer in whih a system keeps those parts, of the ontents of a larger, slower memory that are likely to be used soon.

More information

System-Level Parallelism and Throughput Optimization in Designing Reconfigurable Computing Applications

System-Level Parallelism and Throughput Optimization in Designing Reconfigurable Computing Applications System-Level Parallelism and hroughput Optimization in Designing Reonfigurable Computing Appliations Esam El-Araby 1, Mohamed aher 1, Kris Gaj 2, arek El-Ghazawi 1, David Caliga 3, and Nikitas Alexandridis

More information

Algorithms for External Memory Lecture 6 Graph Algorithms - Weighted List Ranking

Algorithms for External Memory Lecture 6 Graph Algorithms - Weighted List Ranking Algorithms for External Memory Leture 6 Graph Algorithms - Weighted List Ranking Leturer: Nodari Sithinava Sribe: Andi Hellmund, Simon Ohsenreither 1 Introdution & Motivation After talking about I/O-effiient

More information

Learning Convention Propagation in BeerAdvocate Reviews from a etwork Perspective. Abstract

Learning Convention Propagation in BeerAdvocate Reviews from a etwork Perspective. Abstract CS 9 Projet Final Report: Learning Convention Propagation in BeerAdvoate Reviews from a etwork Perspetive Abstrat We look at the way onventions propagate between reviews on the BeerAdvoate dataset, and

More information

Analysis of input and output configurations for use in four-valued CCD programmable logic arrays

Analysis of input and output configurations for use in four-valued CCD programmable logic arrays nalysis of input and output onfigurations for use in four-valued D programmable logi arrays J.T. utler H.G. Kerkhoff ndexing terms: Logi, iruit theory and design, harge-oupled devies bstrat: s in binary,

More information

Chapter 2: Introduction to Maple V

Chapter 2: Introduction to Maple V Chapter 2: Introdution to Maple V 2-1 Working with Maple Worksheets Try It! (p. 15) Start a Maple session with an empty worksheet. The name of the worksheet should be Untitled (1). Use one of the standard

More information

What are Cycle-Stealing Systems Good For? A Detailed Performance Model Case Study

What are Cycle-Stealing Systems Good For? A Detailed Performance Model Case Study What are Cyle-Stealing Systems Good For? A Detailed Performane Model Case Study Wayne Kelly and Jiro Sumitomo Queensland University of Tehnology, Australia {w.kelly, j.sumitomo}@qut.edu.au Abstrat The

More information

XML Data Streams. XML Stream Processing. XML Stream Processing. Yanlei Diao. University of Massachusetts Amherst

XML Data Streams. XML Stream Processing. XML Stream Processing. Yanlei Diao. University of Massachusetts Amherst XML Stream Proessing Yanlei Diao University of Massahusetts Amherst XML Data Streams XML is the wire format for data exhanged online. Purhase orders http://www.oasis-open.org/ommittees/t_home.php?wg_abbrev=ubl

More information

Graph-Based vs Depth-Based Data Representation for Multiview Images

Graph-Based vs Depth-Based Data Representation for Multiview Images Graph-Based vs Depth-Based Data Representation for Multiview Images Thomas Maugey, Antonio Ortega, Pasal Frossard Signal Proessing Laboratory (LTS), Eole Polytehnique Fédérale de Lausanne (EPFL) Email:

More information

Instruction Set extensions to X86. Floating Point SIMD instructions

Instruction Set extensions to X86. Floating Point SIMD instructions Instruction Set extensions to X86 Some extensions to x86 instruction set intended to accelerate 3D graphics AMD 3D-Now! Instructions simply accelerate floating point arithmetic. Accelerate object transformations

More information

Pipelined Multipliers for Reconfigurable Hardware

Pipelined Multipliers for Reconfigurable Hardware Pipelined Multipliers for Reonfigurable Hardware Mithell J. Myjak and José G. Delgado-Frias Shool of Eletrial Engineering and Computer Siene, Washington State University Pullman, WA 99164-2752 USA {mmyjak,

More information

Outline: Software Design

Outline: Software Design Outline: Software Design. Goals History of software design ideas Design priniples Design methods Life belt or leg iron? (Budgen) Copyright Nany Leveson, Sept. 1999 A Little History... At first, struggling

More information

Floating Point Instructions

Floating Point Instructions Floating Point Instructions Ned Nedialkov McMaster University Canada SE 3F03 March 2013 Outline Storing data Addition Subtraction Multiplication Division Comparison instructions Some more instructions

More information

A Load-Balanced Clustering Protocol for Hierarchical Wireless Sensor Networks

A Load-Balanced Clustering Protocol for Hierarchical Wireless Sensor Networks International Journal of Advanes in Computer Networks and Its Seurity IJCNS A Load-Balaned Clustering Protool for Hierarhial Wireless Sensor Networks Mehdi Tarhani, Yousef S. Kavian, Saman Siavoshi, Ali

More information

Accommodations of QoS DiffServ Over IP and MPLS Networks

Accommodations of QoS DiffServ Over IP and MPLS Networks Aommodations of QoS DiffServ Over IP and MPLS Networks Abdullah AlWehaibi, Anjali Agarwal, Mihael Kadoh and Ahmed ElHakeem Department of Eletrial and Computer Department de Genie Eletrique Engineering

More information

Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY Fall Test I Solutions

Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY Fall Test I Solutions Department of Eletrial Engineering and Computer iene MAACHUETT INTITUTE OF TECHNOLOGY 6.035 Fall 2016 Test I olutions 1 I Regular Expressions and Finite-tate Automata For Questions 1, 2, and 3, let the

More information

Extracting Partition Statistics from Semistructured Data

Extracting Partition Statistics from Semistructured Data Extrating Partition Statistis from Semistrutured Data John N. Wilson Rihard Gourlay Robert Japp Mathias Neumüller Department of Computer and Information Sienes University of Strathlyde, Glasgow, UK {jnw,rsg,rpj,mathias}@is.strath.a.uk

More information

Gray Codes for Reflectable Languages

Gray Codes for Reflectable Languages Gray Codes for Refletable Languages Yue Li Joe Sawada Marh 8, 2008 Abstrat We lassify a type of language alled a refletable language. We then develop a generi algorithm that an be used to list all strings

More information

Real Arithmetic. Fractional binary numbers. Fractional binary numbers examples. Binary real numbers

Real Arithmetic. Fractional binary numbers. Fractional binary numbers examples. Binary real numbers Fractional binary numbers 2 i 2 i 1 Real Arithmetic Computer Organization and Assembly Languages g Yung-Yu Chuang 4 2 1 b i b i 1 b 2 b 1 b 0. b 1 b 2 b 3 b j Representation 1/2 1/4 1/8 2 j Bits to right

More information

Automatic Physical Design Tuning: Workload as a Sequence Sanjay Agrawal Microsoft Research One Microsoft Way Redmond, WA, USA +1-(425)

Automatic Physical Design Tuning: Workload as a Sequence Sanjay Agrawal Microsoft Research One Microsoft Way Redmond, WA, USA +1-(425) Automati Physial Design Tuning: Workload as a Sequene Sanjay Agrawal Mirosoft Researh One Mirosoft Way Redmond, WA, USA +1-(425) 75-357 sagrawal@mirosoft.om Eri Chu * Computer Sienes Department University

More information

COST PERFORMANCE ASPECTS OF CCD FAST AUXILIARY MEMORY

COST PERFORMANCE ASPECTS OF CCD FAST AUXILIARY MEMORY COST PERFORMANCE ASPECTS OF CCD FAST AUXILIARY MEMORY Dileep P, Bhondarkor Texas Instruments Inorporated Dallas, Texas ABSTRACT Charge oupled devies (CCD's) hove been mentioned as potential fast auxiliary

More information

Partial Character Decoding for Improved Regular Expression Matching in FPGAs

Partial Character Decoding for Improved Regular Expression Matching in FPGAs Partial Charater Deoding for Improved Regular Expression Mathing in FPGAs Peter Sutton Shool of Information Tehnology and Eletrial Engineering The University of Queensland Brisbane, Queensland, 4072, Australia

More information

Capturing Large Intra-class Variations of Biometric Data by Template Co-updating

Capturing Large Intra-class Variations of Biometric Data by Template Co-updating Capturing Large Intra-lass Variations of Biometri Data by Template Co-updating Ajita Rattani University of Cagliari Piazza d'armi, Cagliari, Italy ajita.rattani@diee.unia.it Gian Lua Marialis University

More information

Algorithms, Mechanisms and Procedures for the Computer-aided Project Generation System

Algorithms, Mechanisms and Procedures for the Computer-aided Project Generation System Algorithms, Mehanisms and Proedures for the Computer-aided Projet Generation System Anton O. Butko 1*, Aleksandr P. Briukhovetskii 2, Dmitry E. Grigoriev 2# and Konstantin S. Kalashnikov 3 1 Department

More information

ZDT -A Debugging Program for the Z80

ZDT -A Debugging Program for the Z80 ZDT -A Debugging Program for the Z80 il I,, 1651 Third Ave.. New York, N.Y. 10028 (212) 860-o300 lnt'l Telex 220501 ZOT - A DEBUGGING PROGRAM FOR THE ZAO Distributed by: Lifeboat Assoiates 1651 Third Avenue

More information

Space- and Time-Efficient BDD Construction via Working Set Control

Space- and Time-Efficient BDD Construction via Working Set Control Spae- and Time-Effiient BDD Constrution via Working Set Control Bwolen Yang Yirng-An Chen Randal E. Bryant David R. O Hallaron Computer Siene Department Carnegie Mellon University Pittsburgh, PA 15213.

More information

Self-Adaptive Parent to Mean-Centric Recombination for Real-Parameter Optimization

Self-Adaptive Parent to Mean-Centric Recombination for Real-Parameter Optimization Self-Adaptive Parent to Mean-Centri Reombination for Real-Parameter Optimization Kalyanmoy Deb and Himanshu Jain Department of Mehanial Engineering Indian Institute of Tehnology Kanpur Kanpur, PIN 86 {deb,hjain}@iitk.a.in

More information

Contents Contents...I List of Tables...VIII List of Figures...IX 1. Introduction Information Retrieval... 8

Contents Contents...I List of Tables...VIII List of Figures...IX 1. Introduction Information Retrieval... 8 Contents Contents...I List of Tables...VIII List of Figures...IX 1. Introdution... 1 1.1. Internet Information...2 1.2. Internet Information Retrieval...3 1.2.1. Doument Indexing...4 1.2.2. Doument Retrieval...4

More information

Drawing lines. Naïve line drawing algorithm. drawpixel(x, round(y)); double dy = y1 - y0; double dx = x1 - x0; double m = dy / dx; double y = y0;

Drawing lines. Naïve line drawing algorithm. drawpixel(x, round(y)); double dy = y1 - y0; double dx = x1 - x0; double m = dy / dx; double y = y0; Naïve line drawing algorithm // Connet to grid points(x0,y0) and // (x1,y1) by a line. void drawline(int x0, int y0, int x1, int y1) { int x; double dy = y1 - y0; double dx = x1 - x0; double m = dy / dx;

More information

CleanUp: Improving Quadrilateral Finite Element Meshes

CleanUp: Improving Quadrilateral Finite Element Meshes CleanUp: Improving Quadrilateral Finite Element Meshes Paul Kinney MD-10 ECC P.O. Box 203 Ford Motor Company Dearborn, MI. 8121 (313) 28-1228 pkinney@ford.om Abstrat: Unless an all quadrilateral (quad)

More information

Path Sharing and Predicate Evaluation for High-Performance XML Filtering*

Path Sharing and Predicate Evaluation for High-Performance XML Filtering* Path Sharing and Prediate Evaluation for High-Performane XML Filtering Yanlei Diao, Mihael J. Franklin, Hao Zhang, Peter Fisher EECS, University of California, Berkeley {diaoyl, franklin, nhz, fisherp}@s.erkeley.edu

More information

The Minimum Redundancy Maximum Relevance Approach to Building Sparse Support Vector Machines

The Minimum Redundancy Maximum Relevance Approach to Building Sparse Support Vector Machines The Minimum Redundany Maximum Relevane Approah to Building Sparse Support Vetor Mahines Xiaoxing Yang, Ke Tang, and Xin Yao, Nature Inspired Computation and Appliations Laboratory (NICAL), Shool of Computer

More information

A Dual-Hamiltonian-Path-Based Multicasting Strategy for Wormhole-Routed Star Graph Interconnection Networks

A Dual-Hamiltonian-Path-Based Multicasting Strategy for Wormhole-Routed Star Graph Interconnection Networks A Dual-Hamiltonian-Path-Based Multiasting Strategy for Wormhole-Routed Star Graph Interonnetion Networks Nen-Chung Wang Department of Information and Communiation Engineering Chaoyang University of Tehnology,

More information

80x87 Instruction Set (x87 - Pentium)

80x87 Instruction Set (x87 - Pentium) 80x87 Instruction Set (x87 - Pentium) Legend: General: reg = floating point register, st(0), st(1)... st(7) mem = memory address mem32 = memory address of 32-bit item mem64 = memory address of 64-bit item

More information

The recursive decoupling method for solving tridiagonal linear systems

The recursive decoupling method for solving tridiagonal linear systems Loughborough University Institutional Repository The reursive deoupling method for solving tridiagonal linear systems This item was submitted to Loughborough University's Institutional Repository by the/an

More information

This fact makes it difficult to evaluate the cost function to be minimized

This fact makes it difficult to evaluate the cost function to be minimized RSOURC LLOCTION N SSINMNT In the resoure alloation step the amount of resoures required to exeute the different types of proesses is determined. We will refer to the time interval during whih a proess

More information

Figure 8-1. x87 FPU Execution Environment

Figure 8-1. x87 FPU Execution Environment Sign 79 78 64 63 R7 Exponent R6 R5 R4 R3 R2 R1 R0 Data Registers Significand 0 15 Control Register 0 47 Last Instruction Pointer 0 Status Register Last Data (Operand) Pointer Tag Register 10 Opcode 0 Figure

More information

SAND Unlimited Release Printed November 1995 Updated November 29, :26 PM EXODUS II: A Finite Element Data Model

SAND Unlimited Release Printed November 1995 Updated November 29, :26 PM EXODUS II: A Finite Element Data Model SAND92-2137 Unlimited Release Printed November 1995 Updated November 29, 2006 12:26 PM EXODUS II: A Finite Element Data Model Gregory D. Sjaardema (updated version) Larry A. Shoof, Vitor R. Yarberry Computational

More information

Performance Benchmarks for an Interactive Video-on-Demand System

Performance Benchmarks for an Interactive Video-on-Demand System Performane Benhmarks for an Interative Video-on-Demand System. Guo,P.G.Taylor,E.W.M.Wong,S.Chan,M.Zukerman andk.s.tang ARC Speial Researh Centre for Ultra-Broadband Information Networks (CUBIN) Department

More information

Exploring the Commonality in Feature Modeling Notations

Exploring the Commonality in Feature Modeling Notations Exploring the Commonality in Feature Modeling Notations Miloslav ŠÍPKA Slovak University of Tehnology Faulty of Informatis and Information Tehnologies Ilkovičova 3, 842 16 Bratislava, Slovakia miloslav.sipka@gmail.om

More information

Video Data and Sonar Data: Real World Data Fusion Example

Video Data and Sonar Data: Real World Data Fusion Example 14th International Conferene on Information Fusion Chiago, Illinois, USA, July 5-8, 2011 Video Data and Sonar Data: Real World Data Fusion Example David W. Krout Applied Physis Lab dkrout@apl.washington.edu

More information

Total 100

Total 100 CS331 SOLUTION Problem # Points 1 10 2 15 3 25 4 20 5 15 6 15 Total 100 1. ssume you are dealing with a ompiler for a Java-like language. For eah of the following errors, irle whih phase would normally

More information

UCSB Math TI-85 Tutorials: Basics

UCSB Math TI-85 Tutorials: Basics 3 UCSB Math TI-85 Tutorials: Basis If your alulator sreen doesn t show anything, try adjusting the ontrast aording to the instrutions on page 3, or page I-3, of the alulator manual You should read the

More information

And, the (low-pass) Butterworth filter of order m is given in the frequency domain by

And, the (low-pass) Butterworth filter of order m is given in the frequency domain by Problem Set no.3.a) The ideal low-pass filter is given in the frequeny domain by B ideal ( f ), f f; =, f > f. () And, the (low-pass) Butterworth filter of order m is given in the frequeny domain by B

More information

A Partial Sorting Algorithm in Multi-Hop Wireless Sensor Networks

A Partial Sorting Algorithm in Multi-Hop Wireless Sensor Networks A Partial Sorting Algorithm in Multi-Hop Wireless Sensor Networks Abouberine Ould Cheikhna Department of Computer Siene University of Piardie Jules Verne 80039 Amiens Frane Ould.heikhna.abouberine @u-piardie.fr

More information

Implementing Load-Balanced Switches With Fat-Tree Networks

Implementing Load-Balanced Switches With Fat-Tree Networks Implementing Load-Balaned Swithes With Fat-Tree Networks Hung-Shih Chueh, Ching-Min Lien, Cheng-Shang Chang, Jay Cheng, and Duan-Shin Lee Department of Eletrial Engineering & Institute of Communiations

More information

A Novel Validity Index for Determination of the Optimal Number of Clusters

A Novel Validity Index for Determination of the Optimal Number of Clusters IEICE TRANS. INF. & SYST., VOL.E84 D, NO.2 FEBRUARY 2001 281 LETTER A Novel Validity Index for Determination of the Optimal Number of Clusters Do-Jong KIM, Yong-Woon PARK, and Dong-Jo PARK, Nonmembers

More information

CA Test Data Manager 4.x Implementation Proven Professional Exam (CAT-681) Study Guide Version 1.0

CA Test Data Manager 4.x Implementation Proven Professional Exam (CAT-681) Study Guide Version 1.0 Implementation Proven Professional Study Guide Version 1.0 PROPRIETARY AND CONFIDENTIAL INFORMATION 2017 CA. All rights reserved. CA onfidential & proprietary information. For CA, CA Partner and CA Customer

More information

COSSIM An Integrated Solution to Address the Simulator Gap for Parallel Heterogeneous Systems

COSSIM An Integrated Solution to Address the Simulator Gap for Parallel Heterogeneous Systems COSSIM An Integrated Solution to Address the Simulator Gap for Parallel Heterogeneous Systems Andreas Brokalakis Synelixis Solutions Ltd, Greee brokalakis@synelixis.om Nikolaos Tampouratzis Teleommuniation

More information

Allocating Rotating Registers by Scheduling

Allocating Rotating Registers by Scheduling Alloating Rotating Registers by Sheduling Hongbo Rong Hyunhul Park Cheng Wang Youfeng Wu Programming Systems Lab Intel Labs {hongbo.rong,hyunhul.park,heng..wang,youfeng.wu}@intel.om ABSTRACT A rotating

More information

EXODUS II: A Finite Element Data Model

EXODUS II: A Finite Element Data Model SAND92-2137 Unlimited Release Printed November 1995 Distribution Category UC-705 EXODUS II: A Finite Element Data Model Larry A. Shoof, Vitor R. Yarberry Computational Mehanis and Visualization Department

More information

A {k, n}-secret Sharing Scheme for Color Images

A {k, n}-secret Sharing Scheme for Color Images A {k, n}-seret Sharing Sheme for Color Images Rastislav Luka, Konstantinos N. Plataniotis, and Anastasios N. Venetsanopoulos The Edward S. Rogers Sr. Dept. of Eletrial and Computer Engineering, University

More information

Cross-layer Resource Allocation on Broadband Power Line Based on Novel QoS-priority Scheduling Function in MAC Layer

Cross-layer Resource Allocation on Broadband Power Line Based on Novel QoS-priority Scheduling Function in MAC Layer Communiations and Networ, 2013, 5, 69-73 http://dx.doi.org/10.4236/n.2013.53b2014 Published Online September 2013 (http://www.sirp.org/journal/n) Cross-layer Resoure Alloation on Broadband Power Line Based

More information

SVC-DASH-M: Scalable Video Coding Dynamic Adaptive Streaming Over HTTP Using Multiple Connections

SVC-DASH-M: Scalable Video Coding Dynamic Adaptive Streaming Over HTTP Using Multiple Connections SVC-DASH-M: Salable Video Coding Dynami Adaptive Streaming Over HTTP Using Multiple Connetions Samar Ibrahim, Ahmed H. Zahran and Mahmoud H. Ismail Department of Eletronis and Eletrial Communiations, Faulty

More information

Announcements. Lecture Caching Issues for Multi-core Processors. Shared Vs. Private Caches for Small-scale Multi-core

Announcements. Lecture Caching Issues for Multi-core Processors. Shared Vs. Private Caches for Small-scale Multi-core Announements Your fous should be on the lass projet now Leture 17: Cahing Issues for Multi-ore Proessors This week: status update and meeting A short presentation on: projet desription (problem, importane,

More information

Z8530 Programming Guide

Z8530 Programming Guide Z8530 Programming Guide Alan Cox alan@redhat.om Z8530 Programming Guide by Alan Cox Copyright 2000 by Alan Cox This doumentation is free software; you an redistribute it and/or modify it under the terms

More information

We don t need no generation - a practical approach to sliding window RLNC

We don t need no generation - a practical approach to sliding window RLNC We don t need no generation - a pratial approah to sliding window RLNC Simon Wunderlih, Frank Gabriel, Sreekrishna Pandi, Frank H.P. Fitzek Deutshe Telekom Chair of Communiation Networks, TU Dresden, Dresden,

More information

Automatic Generation of Transaction-Level Models for Rapid Design Space Exploration

Automatic Generation of Transaction-Level Models for Rapid Design Space Exploration Automati Generation of Transation-Level Models for Rapid Design Spae Exploration Dongwan Shin, Andreas Gerstlauer, Junyu Peng, Rainer Dömer and Daniel D. Gajski Center for Embedded Computer Systems University

More information

Flow Demands Oriented Node Placement in Multi-Hop Wireless Networks

Flow Demands Oriented Node Placement in Multi-Hop Wireless Networks Flow Demands Oriented Node Plaement in Multi-Hop Wireless Networks Zimu Yuan Institute of Computing Tehnology, CAS, China {zimu.yuan}@gmail.om arxiv:153.8396v1 [s.ni] 29 Mar 215 Abstrat In multi-hop wireless

More information

1 The Knuth-Morris-Pratt Algorithm

1 The Knuth-Morris-Pratt Algorithm 5-45/65: Design & Analysis of Algorithms September 26, 26 Leture #9: String Mathing last hanged: September 26, 27 There s an entire field dediated to solving problems on strings. The book Algorithms on

More information

NONLINEAR BACK PROJECTION FOR TOMOGRAPHIC IMAGE RECONSTRUCTION. Ken Sauer and Charles A. Bouman

NONLINEAR BACK PROJECTION FOR TOMOGRAPHIC IMAGE RECONSTRUCTION. Ken Sauer and Charles A. Bouman NONLINEAR BACK PROJECTION FOR TOMOGRAPHIC IMAGE RECONSTRUCTION Ken Sauer and Charles A. Bouman Department of Eletrial Engineering, University of Notre Dame Notre Dame, IN 46556, (219) 631-6999 Shool of

More information

PASCAL 64. "The" Pascal Compiler for the Commodore 64. A Data Becker Product. >AbacusiII Software P.O. BOX 7211 GRAND RAPIDS, MICK 49510

PASCAL 64. The Pascal Compiler for the Commodore 64. A Data Becker Product. >AbacusiII Software P.O. BOX 7211 GRAND RAPIDS, MICK 49510 PASCAL 64 "The" Pasal Compiler for the Commodore 64 A Data Beker Produt >AbausiII Software P.O. BOX 7211 GRAND RAPIDS, MICK 49510 7010 COPYRIGHT NOTICE ABACUS Software makes this pakage available for use

More information

- 1 - S 21. Directory-based Administration of Virtual Private Networks: Policy & Configuration. Charles A Kunzinger.

- 1 - S 21. Directory-based Administration of Virtual Private Networks: Policy & Configuration. Charles A Kunzinger. - 1 - S 21 Diretory-based Administration of Virtual Private Networks: Poliy & Configuration Charles A Kunzinger kunzinge@us.ibm.om - 2 - Clik here Agenda to type page title What is a VPN? What is VPN Poliy?

More information

Detection and Recognition of Non-Occluded Objects using Signature Map

Detection and Recognition of Non-Occluded Objects using Signature Map 6th WSEAS International Conferene on CIRCUITS, SYSTEMS, ELECTRONICS,CONTROL & SIGNAL PROCESSING, Cairo, Egypt, De 9-31, 007 65 Detetion and Reognition of Non-Oluded Objets using Signature Map Sangbum Park,

More information

Acoustic Links. Maximizing Channel Utilization for Underwater

Acoustic Links. Maximizing Channel Utilization for Underwater Maximizing Channel Utilization for Underwater Aousti Links Albert F Hairris III Davide G. B. Meneghetti Adihele Zorzi Department of Information Engineering University of Padova, Italy Email: {harris,davide.meneghetti,zorzi}@dei.unipd.it

More information

Calculation of typical running time of a branch-and-bound algorithm for the vertex-cover problem

Calculation of typical running time of a branch-and-bound algorithm for the vertex-cover problem Calulation of typial running time of a branh-and-bound algorithm for the vertex-over problem Joni Pajarinen, Joni.Pajarinen@iki.fi Otober 21, 2007 1 Introdution The vertex-over problem is one of a olletion

More information

Zippy - A coarse-grained reconfigurable array with support for hardware virtualization

Zippy - A coarse-grained reconfigurable array with support for hardware virtualization Zippy - A oarse-grained reonfigurable array with support for hardware virtualization Christian Plessl Computer Engineering and Networks Lab ETH Zürih, Switzerland plessl@tik.ee.ethz.h Maro Platzner Department

More information

Methods for Multi-Dimensional Robustness Optimization in Complex Embedded Systems

Methods for Multi-Dimensional Robustness Optimization in Complex Embedded Systems Methods for Multi-Dimensional Robustness Optimization in Complex Embedded Systems Arne Hamann, Razvan Rau, Rolf Ernst Institute of Computer and Communiation Network Engineering Tehnial University of Braunshweig,

More information

A Novel Bit Level Time Series Representation with Implication of Similarity Search and Clustering

A Novel Bit Level Time Series Representation with Implication of Similarity Search and Clustering A Novel Bit Level Time Series Representation with Impliation of Similarity Searh and lustering hotirat Ratanamahatana, Eamonn Keogh, Anthony J. Bagnall 2, and Stefano Lonardi Dept. of omputer Siene & Engineering,

More information

CA Release Automation 5.x Implementation Proven Professional Exam (CAT-600) Study Guide Version 1.1

CA Release Automation 5.x Implementation Proven Professional Exam (CAT-600) Study Guide Version 1.1 Exam (CAT-600) Study Guide Version 1.1 PROPRIETARY AND CONFIDENTIAL INFORMATION 2016 CA. All rights reserved. CA onfidential & proprietary information. For CA, CA Partner and CA Customer use only. No unauthorized

More information

Intra- and Inter-Stream Synchronisation for Stored Multimedia Streams

Intra- and Inter-Stream Synchronisation for Stored Multimedia Streams IEEE International Conferene on Multimedia Computing & Systems, June 17-23, 1996, in Hiroshima, Japan, p 372-381 Intra- and Inter-Stream Synhronisation for Stored Multimedia Streams Ernst Biersak, Werner

More information

Automated System for the Study of Environmental Loads Applied to Production Risers Dustin M. Brandt 1, Celso K. Morooka 2, Ivan R.

Automated System for the Study of Environmental Loads Applied to Production Risers Dustin M. Brandt 1, Celso K. Morooka 2, Ivan R. EngOpt 2008 - International Conferene on Engineering Optimization Rio de Janeiro, Brazil, 01-05 June 2008. Automated System for the Study of Environmental Loads Applied to Prodution Risers Dustin M. Brandt

More information

Performance Improvement of TCP on Wireless Cellular Networks by Adaptive FEC Combined with Explicit Loss Notification

Performance Improvement of TCP on Wireless Cellular Networks by Adaptive FEC Combined with Explicit Loss Notification erformane Improvement of TC on Wireless Cellular Networks by Adaptive Combined with Expliit Loss tifiation Masahiro Miyoshi, Masashi Sugano, Masayuki Murata Department of Infomatis and Mathematial Siene,

More information

Define - starting approximation for the parameters (p) - observational data (o) - solution criterion (e.g. number of iterations)

Define - starting approximation for the parameters (p) - observational data (o) - solution criterion (e.g. number of iterations) Global Iterative Solution Distributed proessing of the attitude updating L. Lindegren (21 May 2001) SAG LL 37 Abstrat. The attitude updating algorithm given in GAIA LL 24 (v. 2) is modified to allow distributed

More information

Fuzzy Meta Node Fuzzy Metagraph and its Cluster Analysis

Fuzzy Meta Node Fuzzy Metagraph and its Cluster Analysis Journal of Computer Siene 4 (): 9-97, 008 ISSN 549-3636 008 Siene Publiations Fuzzy Meta Node Fuzzy Metagraph and its Cluster Analysis Deepti Gaur, Aditya Shastri and Ranjit Biswas Department of Computer

More information

Scheduling Multiple Independent Hard-Real-Time Jobs on a Heterogeneous Multiprocessor

Scheduling Multiple Independent Hard-Real-Time Jobs on a Heterogeneous Multiprocessor Sheduling Multiple Independent Hard-Real-Time Jobs on a Heterogeneous Multiproessor Orlando Moreira NXP Semiondutors Researh Eindhoven, Netherlands orlando.moreira@nxp.om Frederio Valente Universidade

More information

1. The collection of the vowels in the word probability. 2. The collection of real numbers that satisfy the equation x 9 = 0.

1. The collection of the vowels in the word probability. 2. The collection of real numbers that satisfy the equation x 9 = 0. C HPTER 1 SETS I. DEFINITION OF SET We begin our study of probability with the disussion of the basi onept of set. We assume that there is a ommon understanding of what is meant by the notion of a olletion

More information

Design Implications for Enterprise Storage Systems via Multi-Dimensional Trace Analysis

Design Implications for Enterprise Storage Systems via Multi-Dimensional Trace Analysis Design Impliations for Enterprise Storage Systems via Multi-Dimensional Trae Analysis Yanpei Chen, Kiran Srinivasan, Garth Goodson, Randy Katz University of California, Berkeley, NetApp In. {yhen2, randy}@ees.berkeley.edu,

More information

The AMDREL Project in Retrospective

The AMDREL Project in Retrospective The AMDREL Projet in Retrospetive K. Siozios 1, G. Koutroumpezis 1, K. Tatas 1, N. Vassiliadis 2, V. Kalenteridis 2, H. Pournara 2, I. Pappas 2, D. Soudris 1, S. Nikolaidis 2, S. Siskos 2, and A. Thanailakis

More information

1. Introduction. 2. The Probable Stope Algorithm

1. Introduction. 2. The Probable Stope Algorithm 1. Introdution Optimization in underground mine design has reeived less attention than that in open pit mines. This is mostly due to the diversity o underground mining methods and omplexity o underground

More information

Bring Your Own Coding Style

Bring Your Own Coding Style Bring Your Own Coding Style Naoto Ogura, Shinsuke Matsumoto, Hideaki Hata and Shinji Kusumoto Graduate Shool of Information Siene and Tehnology, Osaka University, Japan {n-ogura, shinsuke, kusumoto@ist.osaka-u.a.jp

More information

Incremental Mining of Partial Periodic Patterns in Time-series Databases

Incremental Mining of Partial Periodic Patterns in Time-series Databases CERIAS Teh Report 2000-03 Inremental Mining of Partial Periodi Patterns in Time-series Dataases Mohamed G. Elfeky Center for Eduation and Researh in Information Assurane and Seurity Purdue University,

More information

Detecting Outliers in High-Dimensional Datasets with Mixed Attributes

Detecting Outliers in High-Dimensional Datasets with Mixed Attributes Deteting Outliers in High-Dimensional Datasets with Mixed Attributes A. Koufakou, M. Georgiopoulos, and G.C. Anagnostopoulos 2 Shool of EECS, University of Central Florida, Orlando, FL, USA 2 Dept. of

More information

Recommendation Subgraphs for Web Discovery

Recommendation Subgraphs for Web Discovery Reommation Subgraphs for Web Disovery Arda Antikaioglu Department of Mathematis Carnegie Mellon University aantika@andrew.mu.edu R. Ravi Tepper Shool of Business Carnegie Mellon University ravi@mu.edu

More information

Divide-and-conquer algorithms 1

Divide-and-conquer algorithms 1 * 1 Multipliation Divide-and-onquer algorithms 1 The mathematiian Gauss one notied that although the produt of two omplex numbers seems to! involve four real-number multipliations it an in fat be done

More information

CA Agile Requirements Designer 2.x Implementation Proven Professional Exam (CAT-720) Study Guide Version 1.0

CA Agile Requirements Designer 2.x Implementation Proven Professional Exam (CAT-720) Study Guide Version 1.0 Exam (CAT-720) Study Guide Version 1.0 PROPRIETARY AND CONFIDENTIAL INFORMATION 2017 CA. All rights reserved. CA onfidential & proprietary information. For CA, CA Partner and CA Customer use only. No unauthorized

More information

Recursion examples: Problem 2. (More) Recursion and Lists. Tail recursion. Recursion examples: Problem 2. Recursion examples: Problem 3

Recursion examples: Problem 2. (More) Recursion and Lists. Tail recursion. Recursion examples: Problem 2. Recursion examples: Problem 3 Reursion eamples: Problem 2 (More) Reursion and s Reursive funtion to reverse a string publi String revstring(string str) { if(str.equals( )) return str; return revstring(str.substring(1, str.length()))

More information

Using Game Theory and Bayesian Networks to Optimize Cooperation in Ad Hoc Wireless Networks

Using Game Theory and Bayesian Networks to Optimize Cooperation in Ad Hoc Wireless Networks Using Game Theory and Bayesian Networks to Optimize Cooperation in Ad Ho Wireless Networks Giorgio Quer, Federio Librino, Lua Canzian, Leonardo Badia, Mihele Zorzi, University of California San Diego La

More information

Detection of RF interference to GPS using day-to-day C/No differences

Detection of RF interference to GPS using day-to-day C/No differences 1 International Symposium on GPS/GSS Otober 6-8, 1. Detetion of RF interferene to GPS using day-to-day /o differenes Ryan J. R. Thompson 1#, Jinghui Wu #, Asghar Tabatabaei Balaei 3^, and Andrew G. Dempster

More information