Introduction to numerical algorithms
- Theodore Sutton
- 5 years ago
Given an algebraic equation or formula, we may want to approximate its value. In calculus, we deal with equations and formulas that are well defined at each point and that have properties such as continuity and differentiability; in engineering applications, such information is not always available. Instead, we may understand that the underlying behaviour is continuous and differentiable, but we may have only samples of its values. Our goal will be to find algorithms that, under many circumstances, give good approximations of solutions to an equation or of the value of a formula.

In this introductory chapter, we will look at:

1. techniques used in numerical algorithms,
2. sources of error, and
3. the representation of floating-point numbers.

Techniques in numerical algorithms

Algorithms for numerical approximation of solutions to algebraic equations and formulas generally use at least one of six techniques:

1. iteration,
2. linear algebra,
3. interpolation,
4. Taylor series,
5. bracketing, and
6. weighted averages.

We will look at each of these six techniques, and while each is relatively straightforward on its own, we will see that together they allow solutions to some of the most complex algebraic equations and formulas to be computed.

Iteration

This section will first introduce the concept of iteration, then look at a straightforward example, the fixed-point theorem, and conclude with a discussion of initial points for such iterations.

Many numerical algorithms involve taking a poor approximation x_k and from it finding a better approximation x_(k+1). This process can be repeated such that, under certain conditions and usually only in theory, the approximations get closer and closer to the correct answer. Problems with iterative approaches include:

1. the sequence converges, but very slowly,
2. the sequence converges to a solution that is not the one we are looking for,
3. the sequence diverges (approaches plus or minus infinity), or
4. the sequence does not converge.

When we discuss the fixed-point theorem, we will see examples of each of these.

Fixed-point theorem

The easiest example of an iterative means of approximating a solution to an equation is finding a solution to
    x = f(x)

for any function f. In this case, if we start with any initial approximation x_0 and then let x_(k+1) = f(x_k), then under specific circumstances the fixed-point theorem says that the sequence will converge to a solution of the equation x = f(x).

Example 1

As an example, suppose we want to approximate a solution to the equation x = cos(x). This equation has a single real solution, approximately x = 0.7390851332. If we start with a 20-decimal-digit initial approximation x_0 and repeatedly calculate x_(k+1) = cos(x_k), the iterates converge to this solution. The easiest way to observe this is to take any older-generation calculator, punch in any number, and then start hitting the cos key.
Example 2

If we try instead to approximate a solution to x = sin(x), we know the only solution to this equation is x = 0. If, however, we start with x_0 = 1.0, the first step looks hopeful, x_1 = sin(1.0) = 0.8414709848, but we might rightfully have reason for concern when it takes nine iterations to get a value less than 0.5. In this case, the convergence is very slow: even after ten thousand iterations, the approximation is far less accurate than twenty iterations with the equation x = cos(x).

Example 3

The equation x = e^(x-1) cos(x) has two solutions; if we start with an initial approximation x_0 near one of them, we find that we actually converge to the other solution.

Example 4

If we take the same equation, x = e^(x-1) cos(x), but start with a slightly different initial value x_0, we find a different result:
the sequence diverges to infinity.

Example 5

If we consider the equation x = 1 + cos(x) e^x, we note that this equation has only one solution; however, the sequence of iterates neither converges to this solution, nor converges to any other value, nor diverges to infinity. Instead, the values always remain bounded.
To demonstrate Example 5 more clearly, one can plot the first 2000 iterations together with the actual solution: the approximations jump both above and below the solution, but never converge to it.

Example 6

Finally, if we consider the equation x = 3.5x(1 - x), we note that this equation has two solutions, at x = 0 and at x = 5/7 (approximately 0.714286); however, after twenty iterations, we note the points bounce between four values, none of which is either solution.
Convergence criteria

If a sequence of points converges to a point x, it is necessary that

    lim (k -> infinity) |x_k - x| = 0,

but for numerical solutions, we don't require the exact answer, only an approximation, and thus we may desire that

    |x_k - x| < eps_abs,

but even then we can't always guarantee the sequence will converge. Of course, we don't know when we're sufficiently close, because we don't know the actual value x; thus, we will instead look at

    |x_(k+1) - x_k| < eps_abs.

Unfortunately, even this does not guarantee convergence, as we saw with the example with x = sin(x): successive iterates eventually differ by very little, but we are still quite far away from the solution x = 0. Where possible, we may require other convergence criteria.

Initial values

As two examples demonstrated, using slightly different initial values resulted in drastically different behaviours: one converged to the other solution, while the other diverged to infinity. In other cases, it can be shown that
iterative methods will converge, but only if the initial approximation is sufficiently close to the actual solution being sought. In general, given an arbitrary iterative method, there are no conditions that tell you where to start. However, as an engineering student, when you use such techniques, you should already know approximately what the solution should be, and from that information you can choose reasonable initial values and reasonable tests of whether or not the approximation is the desired one.

Summary of iteration

As you may have noticed, iteration can be very useful in finding approximations to solutions of equations, but its use allows for many possible failures. Consequently, any approximation produced by iteration should be checked against the problem it is meant to solve.

Linear algebra

The next tool for solving algebraic equations is finding approximations to solutions of linear equations. Given a system of n linear equations in n unknowns, the objective is to find a solution that satisfies all n equations. In general, these are the only systems of equations that we can reliably solve, and therefore in many cases we will linearize a problem, converting a non-linear equation into one that is linear, or a system of non-linear equations into a system of linear equations. In solving the linear system, we hope that it will give us information about the solution of the non-linear equations.

In your course on linear algebra, you have already been exposed to Gaussian elimination. While this technique can be used to find numeric approximations of solutions to a system of linear equations, it is slow (O(n^3) for a system of n linear equations in n unknowns) and it is subject to round-off error; if certain precautions are not taken, the approximation can have a significant error associated with it. There are iterative techniques for approximating solutions to systems of linear equations that are particularly effective for large sparse systems.
Interpolation

Given a set of n points (x_1, y_1), ..., (x_n, y_n), if all the x values are different, there exists a polynomial of degree n - 1 that passes through all n points. This technique will often be used to convert a set of n observations into a continuous function.

Taylor series

A Taylor series describes the behaviour of a function in the neighbourhood of a point:

    f(x + h) = f(x) + f'(x) h + f''(x) h^2/2! + f'''(x) h^3/3! + ...

Taylor series will be used primarily for error analysis, although with techniques such as automatic differentiation (where the derivative of a Matlab or C function can be deduced algorithmically), it is possible to use Taylor series in numerical computations. As an example of automatic differentiation, from the C function

    double f( double x, double y ) {
        return x + x*(x*x - x*y*sin(x));
    }

it could be deduced that the partial derivatives are

    double f_x( double x, double y ) {
        return 1 + x*x - x*y*sin(x) + x*(2*x - y*sin(x) - x*y*cos(x));
    }

    double f_y( double x, double y ) {
        return -x*x*sin(x);
    }

These could then be compiled and called directly.
Also, given a set of n points (x_1, y_1), ..., (x_n, y_n), if we allow the x values to converge on a single point (again, without any repetition except in the limit), the limit of the interpolating polynomials will be the (n - 1)th-order Taylor series approximation of the function at that limit point.

Bracketing

In some cases, it is simply not possible to use interpolation or Taylor series to find approximations to solutions of equations. In such cases, it may be necessary to revert to the intermediate-value theorem. For example, if we are attempting to approximate a root of a function f(x) and we know that f(x_1) < 0 and f(x_2) > 0, then if the function is continuous, there must be a root on the interval [x_1, x_2]. If we let

    x_3 = (x_1 + x_2)/2,

then the sign of f(x_3) will let us know whether the root is in [x_1, x_3] or [x_3, x_2].

Weighted averages

Finally, another approach to finding numerical approximations is to use weighted averages. A simple average of n values is the sum of those values divided by n,

    (x_1 + x_2 + ... + x_n)/n,

but a simple average may not always be the best approximation of a value in question. In some cases, we may have a number of weights c_1, ..., c_n where

    c_1 + c_2 + ... + c_n = 1;

then

    c_1 x_1 + c_2 x_2 + ... + c_n x_n

is a weighted average of the n x values. When c_1 = c_2 = ... = c_n = 1/n, the weighted average is the simple average.

As an example, suppose we wanted to approximate the average value of the sine function on [1.0, 1.2] with three function evaluations. One solution may be to calculate

    (sin 1.0 + sin 1.1 + sin 1.2)/3;

however, the weighted average

    (sin 1.0 + 2 sin 1.1 + sin 1.2)/4

(here c_1 = 0.25, c_2 = 0.5 and c_3 = 0.25) is closer to the actual average value

    (1/0.2) * integral from 1.0 to 1.2 of sin(x) dx.

You may actually notice the error of the weighted average is almost exactly half that of the simple average. We will see later why there are good theoretical reasons for what may appear to be a coincidence.
Summary of numerical techniques

In summary, there are six techniques that we will be using to find numeric approximations to solutions of algebraic equations and formulas. Every algorithm we consider will use at least one of these techniques, and often more. Next, we will look at the sources of error.

Sources of error

One source of error in numerical computations is rounding error, and this manifests itself in two ways:

1. Certain numbers, such as irrational numbers like pi, have non-terminating, non-repeating decimal representations. Such numbers cannot be stored exactly.
2. The result of many arithmetic operations, including most divisions, anything but integer multiplications, the sum of numbers with very different magnitudes, and the subtraction of very similar numbers, will either introduce additional rounding errors or amplify the effect of previous rounding errors.

As an example, suppose we want to calculate the average of two values that are approximately equal. The most obvious solution is to calculate

    c = (a + b)/2,

but what happens if a + b results in a numeric overflow? If we assume b > a, then while

    c = a + (b - a)/2

is algebraically equivalent to the straightforward calculation, this formula is not subject to numeric overflow.

There are other sources of error:

1. the values used may themselves be subject to error: a sensor may only be so precise, or the sensor itself could be subject to a bias (always reading, say, slightly higher or lower than the true value), and
2. the model itself may be incorrect.

Representation of numbers

This section will briefly describe the various means of storing numbers, including:

1. representations of integers,
2. floating-point representations of real numbers,
3. fixed-point representations of real numbers, and
4. the representation of complex numbers.

This course will focus on the second, the floating-point representation of real numbers, but we will at least introduce the other three.

Base 2 in favour of base 10

From early childhood, we have learned to count to 9, and having maxed out the number of digits available, we proceed to writing 10.
This is referred to as base 10, as there are ten digits: 0, 1, 2, ..., 8 and 9. It would be possible to have a computer store a base-10 number using, for example, ten different voltages, but it is easier to use just two voltages, thereby allowing only two digits: 0 and 1. Thus, 0 and 1 represent the first two numbers, but the next must be 10, after which we have 11, and then 100.
Thus, 10 represents two, 11 three, 100 four, and so on. The first seventeen numbers are shown in this table:

    Decimal  Binary
       0          0
       1          1
       2         10
       3         11
       4        100
       5        101
       6        110
       7        111
       8       1000
       9       1001
      10       1010
      11       1011
      12       1100
      13       1101
      14       1110
      15       1111
      16      10000

To differentiate between base-10 numbers ("decimal" numbers) and base-2 numbers ("binary" numbers), if the possibility of ambiguity exists, the base is appended as a subscript, so 101_2 = 5_10. You may wonder whether this is efficient, as it takes five digits to represent sixteen, whereas base 10 requires only two digits. The additional memory, however, is only a constant multiple: it requires approximately log_2(10), or about 3.32, times as many binary digits ("bits") as decimal digits to represent the same number. Thus, while one million may be represented with seven decimal digits, it requires 20 bits (1000000_10 = 11110100001001000000_2). Always remember that binary arithmetic is just like decimal arithmetic, only 1 + 1 = 10 and 10 + 10 = 100, etc.
Converting a decimal number into a binary number is tedious, and the reader is welcome to look this topic up on his or her own; however, we will make one comment: just because a number has a finite representation in base 10, such as 0.3, this does not mean that the binary representation will also be finite. The conversion of a number from binary to decimal is quite straightforward. Recall that the decimal integer d_n ... d_1 d_0 represents the number

    sum from k = 0 to n of d_k * 10^k,

as 5402 is 5*10^3 + 4*10^2 + 0*10^1 + 2*10^0. Similarly, each bit corresponds with a power of two, and for integers, the bit string b_n ... b_1 b_0 represents the number
    sum from k = 0 to n of b_k * 2^k,

so 1101_2 is 2^3 + 2^2 + 2^0 = 13. This also works for real numbers, where 0.1_2 represents 2^(-1) or 0.5, 0.01_2 represents 0.25, and so on. Thus, 0.101_2 represents 0.5 + 0.125 = 0.625.

Representations of integers

Integers are generally stored in computers as n-bit unsigned integers capable of storing values from 0 to 2^n - 1, or as signed integers using 2's complement capable of storing values from -2^(n-1) to 2^(n-1) - 1. In general, positive numbers are always stored using a base-2 representation, where the kth bit represents the coefficient of 2^k in the binary expansion of the number. For example, 0000000000101010 represents 2^5 + 2^3 + 2^1 = 42. Note, however, that if a system is little endian (discussed elsewhere in these notes), the bytes of such a 16-bit representation would be stored in main memory in reverse order.

The 2's-complement representation storing both positive and negative integers is as follows. Given n bits:

1. if the first bit is 0, the remaining n - 1 bits represent integers from 0 to 2^(n-1) - 1 using a base-2 representation, while
2. if the first bit is 1, the remaining n - 1 bits store a positive integer b from 0 to 2^(n-1) - 1, and the represented value is -(2^(n-1) - b), so negative numbers range from -2^(n-1) to -1.

The easiest way to calculate the representation of a negative integer is to take the representation of the corresponding positive integer, take the bit-wise NOT (complement) of it, and add 1 to the result. Note this forces the first bit to be 1. For example, given the 16-bit representation of 42,

    x      = 0000000000101010
    ~x     = 1111111111010101
    ~x + 1 = 1111111111010110

the last pattern is the 16-bit 2's-complement representation of -42. All positive integers have a leading 0 and all negative numbers have a leading 1. Incidentally, the largest negative number is 1000000000000000 while the representation of 0 is 0000000000000000. If you ask most libraries for the absolute value of the largest negative number, it comes back unchanged: a negative number. The most significant benefit of the 2's-complement representation is that addition does not require additional checks.
For example, we can find 10 + (-42) by calculating:

      0000000000001010    (10)
    + 1111111111010110    (-42)
      ----------------
      1111111111100000

This result is negative (the first bit is 1), and thus we calculate the additive inverse of the result:

    y      = 1111111111100000
    ~y     = 0000000000011111
    ~y + 1 = 0000000000100000

That is, the magnitude of the result is 2^5 = 32, so the sum is -32. While there have previously been other digital formats (for example, binary-coded decimal), these representations of unsigned and signed integers are almost universal today.

One issue with integer representations is what happens if the result of an operation does not fit within the representation. For example, suppose we add 1 to the largest signed integer. There are two approaches:

1. The most common is to wrap and signal an overflow, so the result is the largest negative integer. Most high-level programming languages do not allow the programmer to determine whether an overflow has occurred, and therefore it is necessary to check before an operation is performed whether it will overflow.
2. The second is referred to as saturation arithmetic, where, for example, adding one to the largest integer returns the largest integer. This was discussed previously in connection with the QADD operation.

One operation that must, however, be avoided at all costs is a division-by-zero or modulo-zero operation. Such operations will throw an interrupt that will halt the currently executing task. The Clementine lunar mission that failed, in part due to the absence of a watchdog timer, had a second peculiarity: prior to the exception that caused the processor to hang, there had been almost 3000 similar exceptions. See Jack Ganssle's 2002 article "Born to Fail" for further details.

In summary, fixed-length base-2 representations of positive integers and the 2's-complement representation of negative numbers are nearly universal. Most applications use ordinary arithmetic while checking for overflow; however, saturation arithmetic may be more appropriate in critical systems where an accidental overflow may result in disaster (as in the Ariane 5 rocket). Allowing exceptions to result from invalid integer operations has resulted in numerous issues, too.
Floating-point representations

Real numbers are generally approximated using floating- or fixed-point representations. We say "approximated" because almost every real number cannot be represented exactly in any finite-length representation. Floating-point approximations usually use one of two representations specified by IEEE 754: single- and double-precision floating-point numbers, or float and double, respectively. For general applications, double-precision floating-point numbers, which occupy eight bytes, have sufficient precision for most engineering and scientific computation, while single-precision floating-point numbers occupy only four bytes and have significantly less precision, and therefore should be used only when coarse approximations are sufficient, such as in the generation of graphics. In embedded systems, however, if it can be determined that the higher precision of the double format is not necessary, use of the float format can result in significant savings in memory and run time. Most larger microcontrollers have floating-point units (FPUs) which perform floating-point operations.

Issues such as those associated with integer operations are avoided in floating-point arithmetic by the introduction of three special floating-point numbers representing infinity, negative infinity and not-a-number. These result from operations such as 1.0/0.0, -1e300*1e300 and 0.0/0.0, respectively. Consequently, floating-point operations do not, by default, raise exceptions. Note that even zero is signed, where +0 and -0 represent all positive and negative real numbers, respectively, too small to be represented by any other floating-point number. Therefore, 1.0/(-0.0) should result in negative infinity.
For further information on floating-point numbers, see any good text on numerical analysis.

Fixed-point representations

Fixed-point representation of real numbers is usually restricted to smaller microcontrollers that lack an FPU, often with only 24- or 16-bit registers or smaller. In a fixed-point representation, the first bit is usually the sign bit, and the radix point is arbitrarily fixed at some location within the number. Thus, a 16-bit number with a sign bit, 7 bits for the integer component, and 8 bits for the fractional component represents each real number by the nearest multiple of 2^(-8), with a correspondingly small relative error, and can represent real numbers in the range (-256, 256). Adding two fixed-point representations can, for the most part, be done with integer addition, but multiplication requires a little more effort: the 16-bit numbers are multiplied as 32-bit integers, and the last 8 bits of the product are then truncated. Whether or not the extreme representable values should be treated as plus or minus infinity is a question that must be addressed during the design phase.

Representation of complex numbers

The most usual means of representing a complex number is to store a pair of real numbers representing the real and imaginary components of the complex number. Fortunately, Matlab allows you to work seamlessly with complex numbers. By default, the variables i and j both represent the imaginary unit, sqrt(-1), but even if you assign to these variables, you may always enter a complex number by juxtaposing the imaginary unit with the imaginary component:

    >> j = 4;
    >> 1 + 2j
    ans =
       1.0000 + 2.0000i

Indeed, Matlab recommends using 1j instead of j for entering the imaginary unit (to avoid the possibility that your code may at some point later fail if j is accidentally assigned a value earlier in your scripts).
While it is possible to store a complex number as a pair of real numbers representing its magnitude and argument, this is seldom done in practice except in special circumstances.

Summary of the representation of numbers

In this section, we have reviewed or introduced various binary representations of integers and real numbers. Each representation has limitations, and developers of real-time systems must be aware of those limitations. We will continue with the introduction of definitions related to real-time systems.
Summary of our introduction to numerical algorithms

This first chapter discussed techniques that will be used in numerical algorithms, including iteration, linear algebra, interpolation, Taylor series, bracketing and weighted averages; a brief discussion of the sources of error; and the representation of numbers.
More informationSlide Set 1. for ENEL 339 Fall 2014 Lecture Section 02. Steve Norman, PhD, PEng
Slide Set 1 for ENEL 339 Fall 2014 Lecture Section 02 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary Fall Term, 2014 ENEL 353 F14 Section
More informationBinary Adders: Half Adders and Full Adders
Binary Adders: Half Adders and Full Adders In this set of slides, we present the two basic types of adders: 1. Half adders, and 2. Full adders. Each type of adder functions to add two binary bits. In order
More informationComputational Economics and Finance
Computational Economics and Finance Part I: Elementary Concepts of Numerical Analysis Spring 2016 Outline Computer arithmetic Error analysis: Sources of error Error propagation Controlling the error Rates
More informationChapter 3. Errors and numerical stability
Chapter 3 Errors and numerical stability 1 Representation of numbers Binary system : micro-transistor in state off 0 on 1 Smallest amount of stored data bit Object in memory chain of 1 and 0 10011000110101001111010010100010
More informationIntroduction to Computer Programming with MATLAB Calculation and Programming Errors. Selis Önel, PhD
Introduction to Computer Programming with MATLAB Calculation and Programming Errors Selis Önel, PhD Today you will learn Numbers, Significant figures Error analysis Absolute error Relative error Chopping
More informationFloating-Point Arithmetic
ENEE446---Lectures-4/10-15/08 A. Yavuz Oruç Professor, UMD, College Park Copyright 2007 A. Yavuz Oruç. All rights reserved. Floating-Point Arithmetic Integer or fixed-point arithmetic provides a complete
More informationMost nonzero floating-point numbers are normalized. This means they can be expressed as. x = ±(1 + f) 2 e. 0 f < 1
Floating-Point Arithmetic Numerical Analysis uses floating-point arithmetic, but it is just one tool in numerical computation. There is an impression that floating point arithmetic is unpredictable and
More informationAccuracy versus precision
Accuracy versus precision Accuracy is a consistent error from the true value, but not necessarily a good or precise error Precision is a consistent result within a small error, but not necessarily anywhere
More information9/3/2015. Data Representation II. 2.4 Signed Integer Representation. 2.4 Signed Integer Representation
Data Representation II CMSC 313 Sections 01, 02 The conversions we have so far presented have involved only unsigned numbers. To represent signed integers, computer systems allocate the high-order bit
More informationFinite arithmetic and error analysis
Finite arithmetic and error analysis Escuela de Ingeniería Informática de Oviedo (Dpto de Matemáticas-UniOvi) Numerical Computation Finite arithmetic and error analysis 1 / 45 Outline 1 Number representation:
More informationIntegers. N = sum (b i * 2 i ) where b i = 0 or 1. This is called unsigned binary representation. i = 31. i = 0
Integers So far, we've seen how to convert numbers between bases. How do we represent particular kinds of data in a certain (32-bit) architecture? We will consider integers floating point characters What
More informationUsing Arithmetic of Real Numbers to Explore Limits and Continuity
Using Arithmetic of Real Numbers to Explore Limits and Continuity by Maria Terrell Cornell University Problem Let a =.898989... and b =.000000... (a) Find a + b. (b) Use your ideas about how to add a and
More informationClasses of Real Numbers 1/2. The Real Line
Classes of Real Numbers All real numbers can be represented by a line: 1/2 π 1 0 1 2 3 4 real numbers The Real Line { integers rational numbers non-integral fractions irrational numbers Rational numbers
More informationChapter 1. Numeric Artifacts. 1.1 Introduction
Chapter 1 Numeric Artifacts 1.1 Introduction Virtually all solutions to problems in electromagnetics require the use of a computer. Even when an analytic or closed form solution is available which is nominally
More informationCHAPTER V NUMBER SYSTEMS AND ARITHMETIC
CHAPTER V-1 CHAPTER V CHAPTER V NUMBER SYSTEMS AND ARITHMETIC CHAPTER V-2 NUMBER SYSTEMS RADIX-R REPRESENTATION Decimal number expansion 73625 10 = ( 7 10 4 ) + ( 3 10 3 ) + ( 6 10 2 ) + ( 2 10 1 ) +(
More information5.5 Newton s Approximation Method
498CHAPTER 5. USING DERIVATIVES TO ANALYZE FUNCTIONS; FURTHER APPLICATIONS 4 3 y = x 4 3 f(x) = x cosx y = cosx 3 3 x = cosx x cosx = 0 Figure 5.: Figure showing the existence of a solution of x = cos
More informationData Representation Type of Data Representation Integers Bits Unsigned 2 s Comp Excess 7 Excess 8
Data Representation At its most basic level, all digital information must reduce to 0s and 1s, which can be discussed as binary, octal, or hex data. There s no practical limit on how it can be interpreted
More information1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM
1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM 1.1 Introduction Given that digital logic and memory devices are based on two electrical states (on and off), it is natural to use a number
More informationReals 1. Floating-point numbers and their properties. Pitfalls of numeric computation. Horner's method. Bisection. Newton's method.
Reals 1 13 Reals Floating-point numbers and their properties. Pitfalls of numeric computation. Horner's method. Bisection. Newton's method. 13.1 Floating-point numbers Real numbers, those declared to be
More informationIn this lesson you will learn: how to add and multiply positive binary integers how to work with signed binary numbers using two s complement how fixed and floating point numbers are used to represent
More informationUNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666
UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer Arithmetic ECE 666 Part 4-A Floating-Point Arithmetic Israel Koren ECE666/Koren Part.4a.1 Preliminaries - Representation
More informationDIGITAL ARITHMETIC: OPERATIONS AND CIRCUITS
C H A P T E R 6 DIGITAL ARITHMETIC: OPERATIONS AND CIRCUITS OUTLINE 6- Binary Addition 6-2 Representing Signed Numbers 6-3 Addition in the 2 s- Complement System 6-4 Subtraction in the 2 s- Complement
More informationD I G I T A L C I R C U I T S E E
D I G I T A L C I R C U I T S E E Digital Circuits Basic Scope and Introduction This book covers theory solved examples and previous year gate question for following topics: Number system, Boolean algebra,
More informationThe Bisection Method versus Newton s Method in Maple (Classic Version for Windows)
The Bisection Method versus (Classic Version for Windows) Author: Barbara Forrest Contact: baforres@uwaterloo.ca Copyrighted/NOT FOR RESALE version 1.1 Contents 1 Objectives for this Lab i 2 Approximate
More informationBindel, Fall 2016 Matrix Computations (CS 6210) Notes for
1 Logistics Notes for 2016-09-07 1. We are still at 50. If you are still waiting and are not interested in knowing if a slot frees up, let me know. 2. There is a correction to HW 1, problem 4; the condition
More informationCHW 261: Logic Design
CHW 261: Logic Design Instructors: Prof. Hala Zayed Dr. Ahmed Shalaby http://www.bu.edu.eg/staff/halazayed14 http://bu.edu.eg/staff/ahmedshalaby14# Slide 1 Slide 2 Slide 3 Digital Fundamentals CHAPTER
More informationLogic, Words, and Integers
Computer Science 52 Logic, Words, and Integers 1 Words and Data The basic unit of information in a computer is the bit; it is simply a quantity that takes one of two values, 0 or 1. A sequence of k bits
More informationunused unused unused unused unused unused
BCD numbers. In some applications, such as in the financial industry, the errors that can creep in due to converting numbers back and forth between decimal and binary is unacceptable. For these applications
More informationCalculus I Review Handout 1.3 Introduction to Calculus - Limits. by Kevin M. Chevalier
Calculus I Review Handout 1.3 Introduction to Calculus - Limits by Kevin M. Chevalier We are now going to dive into Calculus I as we take a look at the it process. While precalculus covered more static
More informationSCHOOL OF ENGINEERING & BUILT ENVIRONMENT. Mathematics. Numbers & Number Systems
SCHOOL OF ENGINEERING & BUILT ENVIRONMENT Mathematics Numbers & Number Systems Introduction Numbers and Their Properties Multiples and Factors The Division Algorithm Prime and Composite Numbers Prime Factors
More informationNatasha S. Sharma, PhD
Revisiting the function evaluation problem Most functions cannot be evaluated exactly: 2 x, e x, ln x, trigonometric functions since by using a computer we are limited to the use of elementary arithmetic
More informationUp next. Midterm. Today s lecture. To follow
Up next Midterm Next Friday in class Exams page on web site has info + practice problems Excited for you to rock the exams like you have been the assignments! Today s lecture Back to numbers, bits, data
More informationChapter 2. Positional number systems. 2.1 Signed number representations Signed magnitude
Chapter 2 Positional number systems A positional number system represents numeric values as sequences of one or more digits. Each digit in the representation is weighted according to its position in the
More informationCOMP2611: Computer Organization. Data Representation
COMP2611: Computer Organization Comp2611 Fall 2015 2 1. Binary numbers and 2 s Complement Numbers 3 Bits: are the basis for binary number representation in digital computers What you will learn here: How
More informationFoundations of Computer Systems
18-600 Foundations of Computer Systems Lecture 4: Floating Point Required Reading Assignment: Chapter 2 of CS:APP (3 rd edition) by Randy Bryant & Dave O Hallaron Assignments for This Week: Lab 1 18-600
More informationCS321. Introduction to Numerical Methods
CS31 Introduction to Numerical Methods Lecture 1 Number Representations and Errors Professor Jun Zhang Department of Computer Science University of Kentucky Lexington, KY 40506 0633 August 5, 017 Number
More informationTable : IEEE Single Format ± a a 2 a 3 :::a 8 b b 2 b 3 :::b 23 If exponent bitstring a :::a 8 is Then numerical value represented is ( ) 2 = (
Floating Point Numbers in Java by Michael L. Overton Virtually all modern computers follow the IEEE 2 floating point standard in their representation of floating point numbers. The Java programming language
More informationChapter Three. Arithmetic
Chapter Three 1 Arithmetic Where we've been: Performance (seconds, cycles, instructions) Abstractions: Instruction Set Architecture Assembly Language and Machine Language What's up ahead: Implementing
More informationCMPSCI 145 MIDTERM #1 Solution Key. SPRING 2017 March 3, 2017 Professor William T. Verts
CMPSCI 145 MIDTERM #1 Solution Key NAME SPRING 2017 March 3, 2017 PROBLEM SCORE POINTS 1 10 2 10 3 15 4 15 5 20 6 12 7 8 8 10 TOTAL 100 10 Points Examine the following diagram of two systems, one involving
More informationCS429: Computer Organization and Architecture
CS429: Computer Organization and Architecture Dr. Bill Young Department of Computer Sciences University of Texas at Austin Last updated: September 18, 2017 at 12:48 CS429 Slideset 4: 1 Topics of this Slideset
More informationJulia Calculator ( Introduction)
Julia Calculator ( Introduction) Julia can replicate the basics of a calculator with the standard notations. Binary operators Symbol Example Addition + 2+2 = 4 Substraction 2*3 = 6 Multify * 3*3 = 9 Division
More information2.9 Linear Approximations and Differentials
2.9 Linear Approximations and Differentials 2.9.1 Linear Approximation Consider the following graph, Recall that this is the tangent line at x = a. We had the following definition, f (a) = lim x a f(x)
More informationFLOATING POINT NUMBERS
FLOATING POINT NUMBERS Robert P. Webber, Longwood University We have seen how decimal fractions can be converted to binary. For instance, we can write 6.25 10 as 4 + 2 + ¼ = 2 2 + 2 1 + 2-2 = 1*2 2 + 1*2
More informationChapter 2 Bits, Data Types, and Operations
Chapter 2 Bits, Data Types, and Operations Original slides from Gregory Byrd, North Carolina State University Modified slides by Chris Wilcox, Colorado State University How do we represent data in a computer?!
More informationIntegers and Floating Point
CMPE12 More about Numbers Integers and Floating Point (Rest of Textbook Chapter 2 plus more)" Review: Unsigned Integer A string of 0s and 1s that represent a positive integer." String is X n-1, X n-2,
More informationAppendix. Numbering Systems. In This Appendix...
Numbering Systems ppendix In This ppendix... Introduction... inary Numbering System... exadecimal Numbering System... Octal Numbering System... inary oded ecimal () Numbering System... 5 Real (Floating
More informationCHAPTER 5: Representing Numerical Data
CHAPTER 5: Representing Numerical Data The Architecture of Computer Hardware and Systems Software & Networking: An Information Technology Approach 4th Edition, Irv Englander John Wiley and Sons 2010 PowerPoint
More informationVARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung
POLYTECHNIC UNIVERSITY Department of Computer and Information Science VARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung Abstract: Techniques for reducing the variance in Monte Carlo
More informationArithmetic and Bitwise Operations on Binary Data
Arithmetic and Bitwise Operations on Binary Data CSCI 2400: Computer Architecture ECE 3217: Computer Architecture and Organization Instructor: David Ferry Slides adapted from Bryant & O Hallaron s slides
More informationComputational Methods. Sources of Errors
Computational Methods Sources of Errors Manfred Huber 2011 1 Numerical Analysis / Scientific Computing Many problems in Science and Engineering can not be solved analytically on a computer Numeric solutions
More informationDigital Fundamentals
Digital Fundamentals Tenth Edition Floyd Chapter 2 2009 Pearson Education, Upper 2008 Pearson Saddle River, Education NJ 07458. All Rights Reserved Decimal Numbers The position of each digit in a weighted
More informationLimits. f(x) and lim. g(x) g(x)
Limits Limit Laws Suppose c is constant, n is a positive integer, and f() and g() both eist. Then,. [f() + g()] = f() + g() 2. [f() g()] = f() g() [ ] 3. [c f()] = c f() [ ] [ ] 4. [f() g()] = f() g()
More informationSigned umbers. Sign/Magnitude otation
Signed umbers So far we have discussed unsigned number representations. In particular, we have looked at the binary number system and shorthand methods in representing binary codes. With m binary digits,
More informationMC1601 Computer Organization
MC1601 Computer Organization Unit 1 : Digital Fundamentals Lesson1 : Number Systems and Conversions (KSB) (MCA) (2009-12/ODD) (2009-10/1 A&B) Coverage - Lesson1 Shows how various data types found in digital
More informationCOURSE: NUMERICAL ANALYSIS. LESSON: Methods for Solving Non-Linear Equations
COURSE: NUMERICAL ANALYSIS LESSON: Methods for Solving Non-Linear Equations Lesson Developer: RAJNI ARORA COLLEGE/DEPARTMENT: Department of Mathematics, University of Delhi Page No. 1 Contents 1. LEARNING
More informationCMPSCI 145 MIDTERM #2 Solution Key SPRING 2018 April 13, 2018 Professor William T. Verts
CMPSCI 145 MIDTERM #2 Solution Key SPRING 2018 April 13, 2018 10 Points Answer 10 of the following problems (1 point each). Answer more than 10 for extra credit. Scoring will be +1 for each correct
More informationSystems I. Floating Point. Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties
Systems I Floating Point Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties IEEE Floating Point IEEE Standard 754 Established in 1985 as uniform standard for
More informationD-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview
Chapter 888 Introduction This procedure generates D-optimal designs for multi-factor experiments with both quantitative and qualitative factors. The factors can have a mixed number of levels. For example,
More informationLong (or LONGMATH ) floating-point (or integer) variables (length up to 1 million, limited by machine memory, range: approx. ±10 1,000,000.
QuickCalc User Guide. Number Representation, Assignment, and Conversion Variables Constants Usage Double (or DOUBLE ) floating-point variables (approx. 16 significant digits, range: approx. ±10 308 The
More informationAppendix. Numbering Systems. In this Appendix
Numbering Systems ppendix n this ppendix ntroduction... inary Numbering System... exadecimal Numbering System... Octal Numbering System... inary oded ecimal () Numbering System... 5 Real (Floating Point)
More information