Double-Precision Floating Point Emulation Acceleration
- Derick Davis
Application Note
Tensilica, Inc. Scott Blvd. Santa Clara, CA (408) Fax (408)
December 2007
Doc Number: AN
© 2007 Tensilica, Inc. Printed in the United States of America. All Rights Reserved.

This publication is provided "AS IS". Tensilica, Inc. (hereafter "Tensilica") does not make any warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. Information in this document is provided solely to enable system and software developers to use Tensilica processors. Unless specifically set forth herein, there are no express or implied patent, copyright or any other intellectual property rights or licenses granted hereunder to design or fabricate Tensilica integrated circuits or integrated circuits based on the information in this document. Tensilica does not warrant that the contents of this publication, whether individually or as one or more groups, meet your requirements or that the publication is error-free. This publication could include technical inaccuracies or typographical errors. Changes may be made to the information herein, and these changes may be incorporated in new editions of this publication.

Tensilica is a registered trademark of Tensilica, Inc. The following terms are trademarks of Tensilica, Inc.: FLIX, OSKit, Sea of Processors, TurboXim, Vectra, Xenergy, Xplorer, and XPRES. All other trademarks and registered trademarks are the property of their respective companies.

Notice: Tensilica, Inc. reserves the right to make changes to its products or discontinue any of its products or offerings without notice. Tensilica warrants the performance of its products to the specifications applicable at the time of sale in accordance with Tensilica's standard warranty.

Document Change History: Published December 2007
Contents

1 Introduction
2 Accelerating Basic Double-Precision Emulation Functions
   Double-precision Emulation Package Features
   Comparison
   Conformance to IEEE 754 Specification
3 Double-precision Emulation Routine Performance
4 Code Size for the Double-Precision Emulation Functions
5 The TIE Extensions
6 Building and Using the Double-Precision Acceleration Library
   Using Xplorer
   Using Command Line Tools
   Using the Library with an RTOS

Tables

Table 1: Double-precision Floating Point Emulation Library Features
Table 2: Cycle Count Comparison for Double-precision Emulation
Table 3: Cycle Count Comparison for Double-precision Multiply Emulation
Table 4: Cycle Count Comparison for Integer Divide and Modulus Emulation
Table 5: Double-precision Emulation Code Size Comparison
Table 6: Double-precision Emulation Code Size Comparison
Table 7: Integer Divide and Modulus Code Size Comparison
Table 8: Description of Added Instructions
Abstract

Double-precision floating point is used in applications that require greater precision than single-precision floating point provides. In Xtensa 7, LX, LX2 and Diamond products, double-precision floating point operations are implemented with a software emulation library. This application note presents a small set of TIE instructions and states that speed up the existing double-precision software emulation. With 4K-7K additional gates, an Xtensa processor can perform double-precision adds and subtracts in an average of 19 cycles. Multiplies take an average of 26 cycles for configurations with the Multiply High option and 60 cycles for configurations with 16-bit or 32-bit multipliers.

This application note also includes a software library, designed for easy integration into an existing project, that uses these instructions to implement basic double-precision floating point operations. Because the library provides the routines that the compiler invokes when it encounters floating point operations, it is easy to drop into an existing project that needs double-precision floating point. The library directly speeds up double-precision floating point addition, subtraction, multiplication, square root, divide and comparison operations. Other routines that invoke these basic operations are sped up indirectly.

This application note characterizes the IEEE compliance of the implemented emulation routines and provides estimated gate counts for the hardware and code sizes for the software. In addition, we present average and maximum cycle counts for the emulation routines. Finally, we give step-by-step instructions on integrating the TIE and software library into an existing project to speed up double-precision floating point operations.

The instructions added to speed up the floating point divide can also be used to speed up 32-bit integer divide and modulus operations. The package provided in this application note includes software routines for signed and unsigned 32-bit divide and modulus operations, in addition to the double-precision floating point ones.
1 Introduction

This document describes TIE extensions and a software library used to accelerate software emulation of basic double-precision floating point functions. This library can be used to speed up double-precision floating point functionality on Xtensa processors. Customers who need low-energy, moderate-performance double-precision floating point operations should consider using this package. The package adds an estimated 4K gates to a standard Xtensa processor when synthesized for low area, and fewer than 7K gates when synthesized for high speed.

In addition to speeding up double-precision functionality, the instruction extensions used for speeding up the floating point divide operation can also be used to speed up integer divide and modulus operations for configurations without the Divide Option.

This application note is divided into the following sections:

- Presentation of the accelerated double-precision and integer operations. This includes a description of the IEEE compliance of the accelerated double-precision functions.
- Cycle count comparison of the accelerated double-precision emulation functions with the existing software emulation library.
- Cycle count comparison of the accelerated integer divide and modulus functions with the existing software emulation library.
- Gate count estimates for the added TIE instructions and states.
- Step-by-step instructions on using the double-precision acceleration libraries with Xplorer or command line tools.
- A methodology for modifying an existing application to take advantage of the hardware instructions using intrinsics.

2 Accelerating Basic Double-Precision Emulation Functions

This package implements IEEE 754 compliant 64-bit double-precision add, subtract, multiply, square root and divide operations with the round-to-nearest rounding mode. It also implements a complete set of comparison operations that allow for IEEE-compliant comparisons of two double-precision numbers.
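The round-to-nearest behavior can be probed from portable C, together with the gradual underflow to denormalized numbers that IEEE 754 defines. The following is a minimal sketch and not part of the package: it assumes that `double` is IEEE 754 binary64 and that round-to-nearest resolves ties to the even neighbour, and both function names are invented for illustration.

```c
#include <assert.h>
#include <float.h>

/* Returns 1 when double addition rounds halfway cases to the even neighbour. */
int sketch_rounds_ties_to_even(void)
{
    assert(FLT_RADIX == 2);                      /* the checks assume binary floating point */
    volatile double one = 1.0;
    volatile double half_ulp = DBL_EPSILON / 2.0;
    volatile double odd = one + DBL_EPSILON;     /* last mantissa bit is 1 */
    volatile double r_even = one + half_ulp;     /* tie: stays at 1.0 (even mantissa) */
    volatile double r_odd = odd + half_ulp;      /* tie: moves up to 1 + 2*eps (even mantissa) */
    return r_even == 1.0 && r_odd == 1.0 + 2.0 * DBL_EPSILON;
}

/* Returns 1 when results below DBL_MIN are kept as denormals, not flushed to zero. */
int sketch_keeps_denormals(void)
{
    volatile double smallest_normal = DBL_MIN;
    volatile double denormal = smallest_normal / 2.0;
    volatile double back = denormal * 2.0;       /* exact, so it must return to DBL_MIN */
    return denormal > 0.0 && denormal < DBL_MIN && back == DBL_MIN;
}
```

Both functions use only `+`, `/`, `*` and comparisons on `double`, so on a configuration without floating point hardware they would go through whichever emulation routines are linked in. The `volatile` qualifiers are there only to stop the compiler from evaluating the expressions at build time.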
The package correctly handles IEEE denormalized numbers. To use this library, build it and place it on the link line before other libraries. In Xtensa Xplorer, adding the library to your project's dependencies is sufficient.

Double-precision Emulation Package Features

The double-precision emulation package adds a 32-bit state and a 64-bit state to a processor, along with instructions that speed up common double-precision operations. Optimized for area in 90lp technology, the package synthesizes to an extra 4093 gates. Optimized for speed in 90g technology, the package synthesizes to an extra 6721 gates. The package is designed to work with any Xtensa configuration; however, it is recommended that it be used with the Sign Extension option, the Zero-overhead Loop option and at least one of the multiply options.

The iterative divide instruction and the normalization instructions used to accelerate double-precision operations can also be used to implement signed and unsigned integer division and modulo operations. We have included emulation routines for these operations in the library for configurations without the Divide Option. These routines use the extra states that the library provides. If these emulation functions can be invoked from within an interrupt routine, either they should be removed from the library or the interrupt routine must correctly save and restore the extra states.

Comparison

TABLE 1: DOUBLE-PRECISION FLOATING POINT EMULATION LIBRARY FEATURES

Feature | Support
Double-precision Operations | Add, Subtract, Multiply, Divide, Comparisons (==, !=, <, <=, >, >=), Square Root
32-bit Integer Operations | Divide, modulus (signed and unsigned)
Additional Architectural State | 32-bit status state (F64S), 64-bit value state (F64R)
Recommended Processor Configuration Options | Sign Extend; Zero-overhead loop; MAC16, MUL16, MUL32 or MUL32 High
Post-Synthesis Additional Gates (Optimized for Area in 90lp) | 4093 gates
Post-Synthesis Additional Gates (Optimized for Speed in 90g) | 6721 gates
Rounding Modes | Round-to-nearest
Signaling NaNs | No
Overflow/Underflow exceptions | No

Conformance to IEEE 754 Specification

All of the implemented functions correctly implement the round-to-nearest rounding mode. Truncate, round-up and round-down modes are not implemented. The library does not generate underflow, overflow, inexact, invalid or divide-by-zero flags or exceptions.

3 Double-precision Emulation Routine Performance

This section provides individual timing data for each of the functions. The timing data is derived with the cycle-accurate instruction simulator by counting the cycles spent in the emulation functions. The simulation assumes a single-cycle latency for each data memory access. The optimized functions do not access data memory, so this assumption only benefits the un-optimized functions, which use a few PC-relative loads to instantiate constant literal values.

The cycle counts in this application note all assume that the double-precision functions are invoked with a windowed call instruction, that call and return sequences do not overflow or underflow the register file, and that all instructions hit in the cache or local memory. Cycle counts for emulation routines are measured from the commit cycle of the first instruction in the routine to the cycle before the commit of the instruction following the return. The add and subtract routines share code. This can confuse the standard profiler, so the data is measured from execution traces.

Table 2 gives average and maximum cycle counts for the standard emulation library (Base Cycles) and the optimized library (Cycles) for the implemented double-precision emulation routines. The average and maximum cycle data for add, subtract, multiply, divide, and square root was taken from a simulation of the timesoftfloat test that is included in the source package. This program does not include all of the comparison functions, so the average and maximum cycle data for the comparison functions was taken from a separate directed random comparison test.

With optimized functions and the data mix generated by timesoftfloat, a floating point add or subtract takes less than 20 cycles on average. Emulation of the divide takes about 72 cycles and the square root function takes about 78 cycles. If the zero-overhead loop instructions are not available, the performance of the square root and divide routines is significantly degraded.

The accelerated comparison functions (*) take 6 to 8 cycles in the emulation routines. If the TIE instructions that implement them are inlined into the routines that use them, only 2 instructions are needed to produce the binary comparison result. The table reports the number of cycles required when they are invoked through the emulation library functions that the compiler calls for C code with double-precision comparisons. Note that because of compiler expectations, the functions for == and != are the same.
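One routine can serve both == and != because of the return-value convention that libgcc documents for its comparison helpers: the equality helper returns zero when neither argument is NaN and the two are equal, and the inequality helper returns nonzero when either argument is NaN or the two differ. The compiler then tests the result against zero. The sketch below illustrates that convention in portable C. It is not the library's code, `sketch_eqne` and the two wrappers are hypothetical names, and it assumes `double` is IEEE 754 binary64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its 64-bit pattern without violating aliasing rules. */
static uint64_t bits_of(double d)
{
    uint64_t u;
    assert(sizeof u == sizeof d);
    memcpy(&u, &d, sizeof u);
    return u;
}

/* NaN: exponent field all ones and a nonzero mantissa (sign bit ignored). */
static int is_nan_bits(uint64_t u)
{
    return (u & 0x7FFFFFFFFFFFFFFFull) > 0x7FF0000000000000ull;
}

/* Hypothetical combined ==/!= routine: returns 0 if and only if a equals b. */
int sketch_eqne(double a, double b)
{
    uint64_t ua = bits_of(a), ub = bits_of(b);
    if (is_nan_bits(ua) || is_nan_bits(ub))
        return 1;                       /* unordered operands are never equal */
    if (((ua | ub) << 1) == 0)
        return 0;                       /* +0.0 and -0.0 compare equal */
    return ua != ub;
}

/* What a compiler would emit around the call for the two C operators. */
int sketch_equal(double a, double b)     { return sketch_eqne(a, b) == 0; }
int sketch_not_equal(double a, double b) { return sketch_eqne(a, b) != 0; }
```

For example, `sketch_equal(0.0, -0.0)` yields 1, while a NaN operand makes `sketch_equal` yield 0 and `sketch_not_equal` yield 1, which matches the results of the C operators on an IEEE-conforming implementation.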
TABLE 2: CYCLE COUNT COMPARISON FOR DOUBLE-PRECISION EMULATION

Operation | Library Name | Base Cycles (Avg, Max) | Cycles (Avg, Max) | Avg Base /
Add | __adddf3
Sub | __subdf3
Mul | __muldf3
Div | __divdf3
Sqrt | sqrt
==, != | * __eqdf2, * __nedf2
< | * __ltdf2
<= | * __ledf2
> | * __gtdf2
>= | * __gedf2
(The numeric cell values are missing from this transcription.)

Multiply emulation performance has a significant dependence on the base Xtensa processor's multiply hardware. Table 2 presents the performance for a configuration that has Mul32High. Table 3 shows the performance for a variety of multiply configurations. With a base processor that includes the 32-bit Integer Multiply Option with Mul32High, the multiply takes about 26 cycles on average. For processors with the 32-bit Integer Multiply Option that do not include Mul32High and processors with the 16-bit Multiply Option, the multiply takes about 60 cycles on average.
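The dependence on multiply hardware follows from the arithmetic: a double-precision multiply needs the full product of two 53-bit significands, which is up to 106 bits wide. A common way to build it is from 32-bit by 32-bit partial products, each of which needs both its low and its high 32 bits. The sketch below shows that decomposition in portable C. It is an illustration of the general technique under the assumption that each partial product is available in full, not the code used by the library, and the type and function names are invented here.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t hi, lo; } u128_t;   /* 128-bit value as two 64-bit halves */

/* Full 64x64 -> 128-bit product built from four 32x32 -> 64-bit partial products. */
u128_t sketch_mul64_wide(uint64_t a, uint64_t b)
{
    uint32_t a0 = (uint32_t)a, a1 = (uint32_t)(a >> 32);
    uint32_t b0 = (uint32_t)b, b1 = (uint32_t)(b >> 32);

    uint64_t p00 = (uint64_t)a0 * b0;   /* weight 2^0  */
    uint64_t p01 = (uint64_t)a0 * b1;   /* weight 2^32 */
    uint64_t p10 = (uint64_t)a1 * b0;   /* weight 2^32 */
    uint64_t p11 = (uint64_t)a1 * b1;   /* weight 2^64 */

    /* Sum of the three terms that land in bits 32..63; it cannot overflow 64 bits. */
    uint64_t mid = (p00 >> 32) + (uint32_t)p01 + (uint32_t)p10;

    u128_t r;
    r.lo = (mid << 32) | (uint32_t)p00;
    r.hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
    return r;
}

/* Product of two 53-bit significands (hidden bit included). */
u128_t sketch_significand_product(uint64_t sa, uint64_t sb)
{
    assert(sa < (1ull << 53) && sb < (1ull << 53));
    return sketch_mul64_wide(sa, sb);
}
```

When the hardware returns only the low 32 bits of a product, each `p` term above has to be split further into 16-bit pieces, which is one plausible reason for the higher cycle counts reported for configurations without Mul32High.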
For processors without the 32-bit or 16-bit multiply options that include the MAC16 Option, the multiply takes about 70 cycles. Without any multiply or MAC option, a double-precision multiply takes 706 cycles. If the Sign Extension option is not available, the multiply and divide emulation can take an extra cycle with some inputs.

TABLE 3: CYCLE COUNT COMPARISON FOR DOUBLE-PRECISION MULTIPLY EMULATION

Multiply Configuration Option | Base Cycles (Avg, Max) | Cycles (Avg, Max) | Avg Base /
32-Bit Multiply Option with Mul32High
32-Bit Multiply Option without Mul32High
16-Bit Multiply Option
MAC16 Option
No Multiply
(The numeric cell values are missing from this transcription.)

Table 4 gives a cycle count comparison for the 32-bit divide and modulus emulation routines. If the zero-overhead loop option is not available, the performance of these routines is degraded significantly. When the Divide Option is available, these emulation routines are not included.

TABLE 4: CYCLE COUNT COMPARISON FOR INTEGER DIVIDE AND MODULUS EMULATION

Operation | Library Name | Base Cycles (Avg, Max) | Cycles (Avg, Max) | Avg Base /
Unsigned Integer Divide | __udivsi3
Unsigned Integer Modulus | __umodsi3
Integer Divide | __divsi3
Integer Modulus | __modsi3
(The numeric cell values are missing from this transcription.)
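The sensitivity of the divide and modulus routines to loop overhead is easier to see with a bit-serial divider in view. The sketch below is a textbook shift-and-subtract (restoring) division in portable C that produces one quotient bit per iteration. It is offered only as an illustration of an iterative divide and is an assumption about the general shape of such routines, not a description of the package's iterative instruction; all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t quot, rem; } udivmod32_t;
typedef struct { int32_t quot, rem; } divmod32_t;

/* Unsigned 32-bit divide and modulus, one quotient bit per loop iteration. */
udivmod32_t sketch_udivmod32(uint32_t n, uint32_t d)
{
    assert(d != 0);
    uint64_t rem = 0;                       /* 33 bits are needed after the shift */
    uint32_t quot = 0;
    for (int bit = 31; bit >= 0; bit--) {
        rem = (rem << 1) | ((n >> bit) & 1u);   /* bring down the next dividend bit */
        quot <<= 1;
        if (rem >= d) {                         /* subtract succeeds: quotient bit is 1 */
            rem -= d;
            quot |= 1u;
        }
    }
    udivmod32_t r = { quot, (uint32_t)rem };
    return r;
}

/* Signed variant with C semantics: quotient truncates toward zero,
   remainder takes the sign of the dividend. */
divmod32_t sketch_divmod32(int32_t n, int32_t d)
{
    assert(d != 0 && !(n == INT32_MIN && d == -1));
    uint32_t un = n < 0 ? 0u - (uint32_t)n : (uint32_t)n;   /* magnitudes, safe for INT32_MIN */
    uint32_t ud = d < 0 ? 0u - (uint32_t)d : (uint32_t)d;
    udivmod32_t u = sketch_udivmod32(un, ud);
    divmod32_t r;
    r.quot = ((n < 0) != (d < 0)) ? (int32_t)(0u - u.quot) : (int32_t)u.quot;
    r.rem = (n < 0) ? (int32_t)(0u - u.rem) : (int32_t)u.rem;
    return r;
}
```

The loop body is only a few operations long and runs 32 times, so any per-iteration branch cost is a large fraction of the total. That is consistent with the note above that these routines degrade significantly without the zero-overhead loop option.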
4 Code Size for the Double-Precision Emulation Functions

The enhanced double-precision library reduces the code size footprint for all of the emulated functions. Table 5 presents the code reductions for the double-precision emulation routines.

TABLE 5: DOUBLE-PRECISION EMULATION CODE SIZE COMPARISON

Operation | Library Name | Base Size (Bytes) | Code Size Reduction (%)
Add, Sub | __adddf3, __subdf3
Mul (w/ Mul32 High) | __muldf3
Mul (w/ Mul16/Mul32) | __muldf3
Div | __divdf3
Sqrt | sqrt
==, !=, <, <=, >, >= | __eqdf2, __nedf2, __ltdf2, __ledf2, __gtdf2, __gedf2
(The numeric cell values are missing from this transcription.)

Table 6 presents the code reductions for the double-precision multiply with various multiply configuration options.

TABLE 6: DOUBLE-PRECISION EMULATION CODE SIZE COMPARISON

Multiply Configuration Option | Base Size (Bytes) | Code Size Reduction (%)
32-Bit Multiply Option with Mul32High
32-Bit Multiply Option without Mul32High
16-Bit Multiply Option
MAC16 Option
No Multiply
(The numeric cell values are missing from this transcription.)

Table 7 presents the code size reductions for the integer divide and modulus emulation routines. The code size for these routines is reduced by about 50-60%.
TABLE 7: INTEGER DIVIDE AND MODULUS CODE SIZE COMPARISON

Operation | Library Name | Base Size (Bytes) | Code Size Reduction (%)
Unsigned Integer Divide | __udivsi3
Unsigned Integer Modulus | __umodsi3
Integer Divide | __divsi3
Integer Modulus | __modsi3
(The numeric cell values are missing from this transcription.)

5 The TIE Extensions

The TIE package implements 2 new states: a 32-bit F64S status state and a 64-bit F64R state. In addition, it implements a number of operations, which are described briefly in Table 8.

TABLE 8: DESCRIPTION OF ADDED INSTRUCTIONS

Instruction | Description
F64CMPL, F64CMPH | First 2 instructions for each emulation routine. Zeros F64R state and sets F64S status state.
F64ITER | Iterative step for divide and square root
F64RND | Rounding assist
F64NORM | Count leading zeros of a mantissa
F64SIG | Extract the upper part of a mantissa
F64SEXP | Set an exponent
F64ADDC, F64SUBC | Addition and subtraction with carry operations
RF64R, WF64R | Move data in and out of the F64R state
RUR.F64S, WUR.F64S | Move data in and out of the F64S status state
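To make the role of a leading-zero count concrete, the sketch below shows how normalization is commonly done in a software floating point routine: count the leading zeros of the significand, shift it until the leading one sits in bit 52, and adjust the exponent by the same amount. This is a generic illustration in portable C. The exact operand formats and results of the TIE instructions are not given in this note, so nothing here should be read as their definition, and the function and type names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of zero bits above the most significant set bit of a nonzero value. */
int sketch_clz64(uint64_t x)
{
    assert(x != 0);
    int n = 0;
    while ((x & 0x8000000000000000ull) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

typedef struct { uint64_t sig; int exp; } norm_t;

/* Shift a nonzero significand of at most 53 bits so that its leading one
   lands in bit 52, compensating in the exponent so the value is unchanged. */
norm_t sketch_normalize(uint64_t sig, int exp)
{
    assert(sig != 0 && sig < (1ull << 53));
    int shift = sketch_clz64(sig) - 11;     /* bit 52 set means 11 leading zeros */
    norm_t r = { sig << shift, exp - shift };
    return r;
}
```

A denormalized input such as a significand of 1 needs a shift of 52 positions. Doing that with a software loop costs many cycles, which is why a single-instruction leading-zero count is a natural candidate for acceleration.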
6 Building and Using the Double-Precision Acceleration Library

The library is easy to use from Xplorer or from command line tools. In either case, you must first compile the TIE file with your configuration, compile the library, and then make your project dependent on the compiled library.

Using Xplorer

To build the libdfpemu double-precision acceleration library, import the source workspace dfpemu_library.xws into Xtensa Xplorer, attach the TIE file to your configuration and build the test program. The dependent library will be built before the test program. To build a different project, create a Library Dependency on the libdfpemu library before building your project and add some link flags to your program.

1. Import the TIE file and projects into your Xplorer Workspace.
   a. Choose File, Import...
   b. Choose Import Xtensa Xplorer Workspace.
   c. Browse to the dfpemu_library.xws file in the Application Note directory and click Next.
   d. Select all of the projects (libdfpemu and timesoftfloat) in the projects dialog and the dfpemu_lib.tie file in the TIE files dialog. On the last dialog, click Finish to import the projects and TIE file into the workspace.
2. Attach the TIE file to your configuration.
   a. Select the C/C++ perspective if it is not the active perspective. Select the menu Window, Open Perspective, C/C++.
   b. In the System Overview pane, right-click on your configuration and choose Attach TIE and TDB files.
   c. Select the dfpemu_lib.tie file and click Finish.
3. Compile the TDK for your configuration.
   a. In the System Overview pane, right-click on your configuration and choose Compile TDK for Configuration.
4. If building your own project, add the libdfpemu dependency to the project. This step is unnecessary when building the timesoftfloat project because the dependency has been added already.
   a. In the C/C++ Projects pane, right-click your project and choose Properties.
   b. Click Library Dependencies. Choose libdfpemu from the Available Libraries and click Add.
   c. Click OK to dismiss the project properties.
5. When building your own project, add link options to force the dfpemu library routines to be included if your project does not reference them directly.
   a. In the active project area, select your project as the active project, select your configuration as the active configuration, and select the Release target (or the Debug target) as the active target.
   b. Click on the triangle to the right of the active target and select Modify.
   c. Click on the Linker tab to change linker flags for the timesoftfloat program Release target.
   d. In the Linker Flags box, add:
      -Wl,-u,__adddf3,-u,__subdf3,-u,__muldf3,-u,__divdf3,-u,sqrt
      -Wl,-u,__eqdf2,-u,__gedf2,-u,__gtdf2,-u,__ledf2,-u,__ltdf2,-u,__nedf2
   e. For configurations without the Divide Option, add:
      -Wl,-u,__divsi3,-u,__modsi3,-u,__udivsi3,-u,__umodsi3
6. Build the main project.
   a. In the active project area, select the timesoftfloat project (or your own project) as the active project, select your configuration as the active configuration, and select the Release target (or the Debug target) as the active target.
   b. Click Build Active. This will build the dependent libdfpemu library and the main project.
7. Now that the timesoftfloat project has been built, you can run or profile it once you have set up the command line arguments.
   a. Click the triangle to the right of the Run button in the active project area. Choose Run.
   b. In the Configurations area, choose the Auto timesoftfloat launch.
   c. In the Create, Manage, and Run dialog box, choose the Arguments tab. Add all nearesteven tininessafter in the C/C++ Program Arguments box.
   d. Click Apply to save the arguments, then Run to start the simulation.
   e. For a profile, click Profile in the active project area. If the runtime arguments have not been set yet, set them as in step b). Because the floating point add and subtract share code, some of the cycles from one can be misattributed to the other in the profile.

Using Command Line Tools

From a command prompt on a Linux host with the csh or tcsh shell, unpack the library sources, TIE files and test program:

1. unzip dfpemu_library.zip
2. cd dfpemu_library
3. setenv PATH <path_to_your_xtensa_tools>:$PATH
4. setenv XTENSA_CORE <your_config_name>
5. xt-make clean all test profile

This will compile the TIE file, build the libdfpemu.a library, build timesoftfloat, execute it and profile it. It will leave the text profile in timesoftfloat/prof.gmon.txt.
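When linking your own program rather than timesoftfloat, it can help to have a small piece of C that touches each of the accelerated operations, so that a profile shows whether the intended routines are being called. The sketch below is one way such a check might look. The routine names in the comments are the standard libgcc helper names that a compiler typically calls on a target without double-precision hardware; the struct and function names are invented for this example, and square root is left out so that the sketch builds without a math library.

```c
#include <assert.h>

typedef struct {
    double sum, diff, prod, quot;
    int eq, ne, lt, le, gt, ge;
} dp_results_t;

/* Performs one of each basic double-precision operation on a and b. */
dp_results_t sketch_exercise_ops(double a, double b)
{
    assert(b != 0.0);                 /* keep the divide well defined */
    volatile double x = a, y = b;     /* volatile stops compile-time folding */
    dp_results_t r;
    r.sum  = x + y;                   /* typically __adddf3 */
    r.diff = x - y;                   /* typically __subdf3 */
    r.prod = x * y;                   /* typically __muldf3 */
    r.quot = x / y;                   /* typically __divdf3 */
    r.eq = (x == y);                  /* typically __eqdf2 */
    r.ne = (x != y);                  /* typically __nedf2 */
    r.lt = (x <  y);                  /* typically __ltdf2 */
    r.le = (x <= y);                  /* typically __ledf2 */
    r.gt = (x >  y);                  /* typically __gtdf2 */
    r.ge = (x >= y);                  /* typically __gedf2 */
    return r;
}
```

Calling `sketch_exercise_ops(6.0, 1.5)` gives exact results (7.5, 4.5, 9.0 and 4.0), so the same inputs can be used to compare a build against the standard emulation routines with a build against this package.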
The emulation routines in this package redefine routines found in the standard libgcc and libm libraries. The order of objects and libraries on the link command line determines whether the libdfpemu routine or the standard emulation routine is included in the application. To ensure that the libdfpemu library is included with your own project, add the full pathname of the libdfpemu.a library to the link command before any other libraries and add

-Wl,-u,__adddf3,-u,__subdf3,-u,__muldf3,-u,__divdf3,-u,sqrt
-Wl,-u,__eqdf2,-u,__gedf2,-u,__gtdf2,-u,__ledf2,-u,__ltdf2,-u,__nedf2

to the link flags. For configurations without the Divide Option, add

-Wl,-u,__divsi3,-u,__modsi3,-u,__udivsi3,-u,__umodsi3

as well.

Using the Library with an RTOS

The library routines add two additional states, F64R and F64S. If routines in this library can be invoked during interrupt handling, either they should be removed from the library or the interrupt handlers must save the states before invoking these routines and restore them before returning from the interrupt.

It is uncommon for an interrupt handler to invoke double-precision floating point emulation routines. Integer divide and modulus are more likely to be invoked. If they can be invoked from an interrupt handler on a configuration without the Divide Option, then either remove the integer divide and modulus emulation routines from the library or save and restore the F64R and F64S states in the interrupt handling code.
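The save-and-restore discipline can be sketched as follows. This example runs on a host and models the two states with ordinary variables; the accessor functions are hypothetical stand-ins for whatever the real target uses to move data in and out of F64R and F64S, and they are not the names of actual intrinsics. Only the ordering is the point: save both states on entry, do the work that may clobber them, and restore both before returning.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model of the two added states (assumption: stand-ins only). */
static uint64_t model_f64r;
static uint32_t model_f64s;

/* Hypothetical accessors; a real handler would use the target's own moves. */
static uint64_t read_f64r(void)        { return model_f64r; }
static void     write_f64r(uint64_t v) { model_f64r = v; }
static uint32_t read_f64s(void)        { return model_f64s; }
static void     write_f64s(uint32_t v) { model_f64s = v; }

typedef struct { uint64_t f64r; uint32_t f64s; } f64_state_t;

void f64_state_save(f64_state_t *s)
{
    assert(s != 0);
    s->f64r = read_f64r();
    s->f64s = read_f64s();
}

void f64_state_restore(const f64_state_t *s)
{
    assert(s != 0);
    write_f64r(s->f64r);
    write_f64s(s->f64s);
}

/* Stands in for handler work that uses an emulation routine and so
   overwrites both states. */
static void clobbering_work(void)
{
    write_f64r(0xDEADBEEFDEADBEEFull);
    write_f64s(0xFFFFFFFFu);
}

/* Handler shape: the interrupted code sees its own state values on return. */
void sketch_interrupt_handler(void)
{
    f64_state_t saved;
    f64_state_save(&saved);
    clobbering_work();
    f64_state_restore(&saved);
}
```

If the interrupted code had, for example, a partial quotient in F64R when the interrupt arrived, that value is intact after `sketch_interrupt_handler` returns. An RTOS that switches tasks from within interrupts would need the same two states added to its task context.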
More informationFloating Point Arithmetic. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Floating Point Arithmetic Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Floating Point (1) Representation for non-integral numbers Including very
More informationNumber Representations
Number Representations times XVII LIX CLXX -XVII D(CCL)LL DCCC LLLL X-X X-VII = DCCC CC III = MIII X-VII = VIIIII-VII = III 1/25/02 Memory Organization Viewed as a large, single-dimension array, with an
More informationWeek 2: Console I/O and Operators Arithmetic Operators. Integer Division. Arithmetic Operators. Gaddis: Chapter 3 (2.14,3.1-6,3.9-10,5.
Week 2: Console I/O and Operators Gaddis: Chapter 3 (2.14,3.1-6,3.9-10,5.1) CS 1428 Fall 2014 Jill Seaman 1 2.14 Arithmetic Operators An operator is a symbol that tells the computer to perform specific
More informationQuick Front-to-Back Overview Tutorial
Quick Front-to-Back Overview Tutorial PlanAhead Design Tool This tutorial document was last validated using the following software version: ISE Design Suite 14.5 If using a later software version, there
More informationNios II Embedded Design Suite 6.1 Release Notes
December 2006, Version 6.1 Release Notes This document lists the release notes for the Nios II Embedded Design Suite (EDS) version 6.1. Table of Contents: New Features & Enhancements...2 Device & Host
More informationFloating Point Inverse (ALTFP_INV) Megafunction User Guide
Floating Point Inverse (ALTFP_INV) Megafunction User Guide 101 Innovation Drive San Jose, CA 95134 www.altera.com Document Version: 1.0 Document Date: October 2008 Copyright 2008 Altera Corporation. All
More informationChapter Two MIPS Arithmetic
Chapter Two MIPS Arithmetic Computer Organization Review Binary Representation Used for all data and instructions Fixed size values: 8, 16, 32, 64 Hexadecimal Sign extension Base and virtual machines.
More information3.5 Floating Point: Overview
3.5 Floating Point: Overview Floating point (FP) numbers Scientific notation Decimal scientific notation Binary scientific notation IEEE 754 FP Standard Floating point representation inside a computer
More informationFloating Point Numbers
Floating Point Numbers Summer 8 Fractional numbers Fractional numbers fixed point Floating point numbers the IEEE 7 floating point standard Floating point operations Rounding modes CMPE Summer 8 Slides
More informationUNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666
UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer Arithmetic ECE 666 Part 4-A Floating-Point Arithmetic Israel Koren ECE666/Koren Part.4a.1 Preliminaries - Representation
More informationPerformance Evaluation of a Novel Direct Table Lookup Method and Architecture With Application to 16-bit Integer Functions
Performance Evaluation of a Novel Direct Table Lookup Method and Architecture With Application to 16-bit nteger Functions L. Li, Alex Fit-Florea, M. A. Thornton, D. W. Matula Southern Methodist University,
More informationModule 2: Computer Arithmetic
Module 2: Computer Arithmetic 1 B O O K : C O M P U T E R O R G A N I Z A T I O N A N D D E S I G N, 3 E D, D A V I D L. P A T T E R S O N A N D J O H N L. H A N N E S S Y, M O R G A N K A U F M A N N
More informationLaboratory Exercise 3 Comparative Analysis of Hardware and Emulation Forms of Signed 32-Bit Multiplication
Laboratory Exercise 3 Comparative Analysis of Hardware and Emulation Forms of Signed 32-Bit Multiplication Introduction All processors offer some form of instructions to add, subtract, and manipulate data.
More informationFloating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Lecture 3
Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Instructor: Nicole Hynes nicole.hynes@rutgers.edu 1 Fixed Point Numbers Fixed point number: integer part
More informationChapter 2 Float Point Arithmetic. Real Numbers in Decimal Notation. Real Numbers in Decimal Notation
Chapter 2 Float Point Arithmetic Topics IEEE Floating Point Standard Fractional Binary Numbers Rounding Floating Point Operations Mathematical properties Real Numbers in Decimal Notation Representation
More informationVHDL IMPLEMENTATION OF IEEE 754 FLOATING POINT UNIT
VHDL IMPLEMENTATION OF IEEE 754 FLOATING POINT UNIT Ms. Anjana Sasidharan Student, Vivekanandha College of Engineering for Women, Namakkal, Tamilnadu, India. Abstract IEEE-754 specifies interchange and
More informationSystem Programming CISC 360. Floating Point September 16, 2008
System Programming CISC 360 Floating Point September 16, 2008 Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Powerpoint Lecture Notes for Computer Systems:
More informationNumber Systems and Computer Arithmetic
Number Systems and Computer Arithmetic Counting to four billion two fingers at a time What do all those bits mean now? bits (011011011100010...01) instruction R-format I-format... integer data number text
More informationFloating Point Puzzles The course that gives CMU its Zip! Floating Point Jan 22, IEEE Floating Point. Fractional Binary Numbers.
class04.ppt 15-213 The course that gives CMU its Zip! Topics Floating Point Jan 22, 2004 IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Floating Point Puzzles For
More informationFloating Point Arithmetic
Floating Point Arithmetic Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)
More informationfor StarCore DSP Architectures Quick Start for the Windows Edition
for StarCore DSP Architectures Quick Start for the Windows Edition CodeWarrior Development Studio for StarCore DSP Architectures Quick Start for the Windows Edition SYSTEM REQUIREMENTS Hardware Operating
More informationCollecting Linux Trace without using CodeWarrior
Freescale Semiconductor Application Note Document Number: AN5001 Collecting Linux Trace without using CodeWarrior 1. Introduction This document guides you how to collect Linux trace directly from QDS or
More informationFloating-Point Unit. Introduction. Agenda
Floating-Point Unit Introduction This chapter will introduce you to the Floating-Point Unit (FPU) on the LM4F series devices. In the lab we will implement a floating-point sine wave calculator and profile
More informationBy, Ajinkya Karande Adarsh Yoga
By, Ajinkya Karande Adarsh Yoga Introduction Early computer designers believed saving computer time and memory were more important than programmer time. Bug in the divide algorithm used in Intel chips.
More informationChapter 4. Operations on Data
Chapter 4 Operations on Data 1 OBJECTIVES After reading this chapter, the reader should be able to: List the three categories of operations performed on data. Perform unary and binary logic operations
More informationProgramming in C++ 5. Integral data types
Programming in C++ 5. Integral data types! Introduction! Type int! Integer multiplication & division! Increment & decrement operators! Associativity & precedence of operators! Some common operators! Long
More informationWhite Paper Taking Advantage of Advances in FPGA Floating-Point IP Cores
White Paper Recently available FPGA design tools and IP provide a substantial reduction in computational resources, as well as greatly easing the implementation effort in a floating-point datapath. Moreover,
More informationChapter 3. Arithmetic Text: P&H rev
Chapter 3 Arithmetic Text: P&H rev3.29.16 Arithmetic for Computers Operations on integers Addition and subtraction Multiplication and division Dealing with overflow Floating-point real numbers Representation
More informationSoftware Overview Release Rev: 3.0
Software Overview Release Rev: 3.0 1 Overview of ClearSpeed software The ClearSpeed Advance accelerators are provided with a package of runtime software. A software development kit (SDK) is also available
More informationFPLibrary v0.94 User documentation. LIP ÉNS Lyon 46, allée d Italie Lyon cedex 07 France
FPLibrary v0.94 User documentation Jérémie Detrey Florent de Dinechin LIP ÉNS Lyon 46, allée d Italie 69364 Lyon cedex 07 France {Jeremie.Detrey,Florent.de.Dinechin}@ens-lyon.fr Contents 1 Introduction
More informationLibraries Guide. Arithmetic Libraries User Guide. Document #: Rev. *A
Libraries Guide Arithmetic Libraries User Guide Document #: 001-44477 Rev. *A Cypress Semiconductor 198 Champion Court San Jose, CA 95134-1709 Phone (USA): 800.858.1810 Phone (Intnl): 408.943.2600 http://www.cypress.com
More information4 Operations On Data 4.1. Foundations of Computer Science Cengage Learning
4 Operations On Data 4.1 Foundations of Computer Science Cengage Learning Objectives After studying this chapter, the student should be able to: List the three categories of operations performed on data.
More informationSystems I. Floating Point. Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties
Systems I Floating Point Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties IEEE Floating Point IEEE Standard 754 Established in 1985 as uniform standard for
More informationS1C17 Family EEPROM Emulation Library Manual
S1C17 Family EEPROM Emulation Library Manual Rev.1.1 Evaluation board/kit and Development tool important notice 1. This evaluation board/kit or development tool is designed for use for engineering evaluation,
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 3 Arithmetic for Computers Arithmetic for Computers Operations on integers Addition and subtraction Multiplication
More informationFloating Point Arithmetic
Floating Point Arithmetic Clark N. Taylor Department of Electrical and Computer Engineering Brigham Young University clark.taylor@byu.edu 1 Introduction Numerical operations are something at which digital
More informationCS429: Computer Organization and Architecture
CS429: Computer Organization and Architecture Dr. Bill Young Department of Computer Sciences University of Texas at Austin Last updated: September 18, 2017 at 12:48 CS429 Slideset 4: 1 Topics of this Slideset
More informationFloating-point representations
Lecture 10 Floating-point representations Methods of representing real numbers (1) 1. Fixed-point number system limited range and/or limited precision results must be scaled 100101010 1111010 100101010.1111010
More informationFloating-point representations
Lecture 10 Floating-point representations Methods of representing real numbers (1) 1. Fixed-point number system limited range and/or limited precision results must be scaled 100101010 1111010 100101010.1111010
More informationLogiCORE IP Floating-Point Operator v6.2
LogiCORE IP Floating-Point Operator v6.2 Product Guide Table of Contents SECTION I: SUMMARY IP Facts Chapter 1: Overview Unsupported Features..............................................................
More informationGRLIDE. LEON IDE plugin for Eclipse User's Manual. The most important thing we build is trust GR-LIDE-UM. August 2016, Version 1.
. GRLIDE LEON IDE plugin for Eclipse 2016 User's Manual The most important thing we build is trust GR-LIDE 1 Table of Contents 1. Introduction... 3 1.1. Tools... 3 1.2. Supported Operating Systems... 3
More informationFloating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754
Floating Point Puzzles Topics Lecture 3B Floating Point IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties For each of the following C expressions, either: Argue that
More informationDescription. October Rev 4 1/10
Toolset for developing STxP70 applications Data brief production data Features Code development tools stxp70cc C compiler newlib C runtime library targeted for STxP70 (based on newlib v1.18) Multi-context
More informationCisco TEO Adapter Guide for Microsoft System Center Operations Manager 2007
Cisco TEO Adapter Guide for Microsoft System Center Operations Manager 2007 Release 2.3 April 2012 Americas Headquarters Cisco Systems, Inc. 170 West Tasman Drive San Jose, CA 95134-1706 USA http://www.cisco.com
More informationFloating-Point Arithmetic
ENEE446---Lectures-4/10-15/08 A. Yavuz Oruç Professor, UMD, College Park Copyright 2007 A. Yavuz Oruç. All rights reserved. Floating-Point Arithmetic Integer or fixed-point arithmetic provides a complete
More informationComputer Arithmetic Floating Point
Computer Arithmetic Floating Point Chapter 3.6 EEC7 FQ 25 About Floating Point Arithmetic Arithmetic basic operations on floating point numbers are: Add, Subtract, Multiply, Divide Transcendental operations
More information2/5/2018. Expressions are Used to Perform Calculations. ECE 220: Computer Systems & Programming. Our Class Focuses on Four Types of Operator in C
University of Illinois at Urbana-Champaign Dept. of Electrical and Computer Engineering ECE 220: Computer Systems & Programming Expressions and Operators in C (Partially a Review) Expressions are Used
More informationAn Efficient Implementation of Floating Point Multiplier
An Efficient Implementation of Floating Point Multiplier Mohamed Al-Ashrafy Mentor Graphics Mohamed_Samy@Mentor.com Ashraf Salem Mentor Graphics Ashraf_Salem@Mentor.com Wagdy Anis Communications and Electronics
More informationFloating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754
Floating Point Puzzles Topics Lecture 3B Floating Point IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties For each of the following C expressions, either: Argue that
More informationBuilding U-Boot in CodeWarrior ARMv8
NXP Semiconductors Document Number: AN5347 Application Note Rev. 0, 10/2016 Building U-Boot in CodeWarrior ARMv8 1 Introduction This application note defines guidelines for configuring CodeWarrior for
More informationCodeWarrior Development Studio for Freescale 68HC12/HCS12/HCS12X/XGATE Microcontrollers Quick Start SYSTEM REQUIREMENTS Hardware Operating System 200
CodeWarrior Development Studio for Freescale 68HC12/HCS12/HCS12X/XGATE Microcontrollers Quick Start SYSTEM REQUIREMENTS Hardware Operating System 200 MHz Pentium II processor or AMD-K6 class processor,
More informationUC Berkeley CS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c UC Berkeley CS61C : Machine Structures Lecture 16 Floating Point II 2010-02-26 TA Michael Greenbaum www.cs.berkeley.edu/~cs61c-tf Research without Google would be like life
More informationfor ColdFire Architectures V7.2 Quick Start
for ColdFire Architectures V7.2 Quick Start CodeWarrior Development Studio for ColdFire Architectures V7.2 Quick Start SYSTEM REQUIREMENTS Hardware Operating System Disk Space 1 GHz Pentium compatible
More informationAN 834: Developing for the Intel HLS Compiler with an IDE
AN 834: Developing for the Intel HLS Compiler with an IDE Subscribe Send Feedback Latest document on the web: PDF HTML Contents Contents 1 Developing for the Intel HLS Compiler with an Eclipse* IDE...
More informationMPLAB XC8 C Compiler Version 2.00 Release Notes for AVR MCU
MPLAB XC8 C Compiler Version 2.00 Release Notes for AVR MCU THIS DOCUMENT CONTAINS IMPORTANT INFORMATION RELATING TO THE MPLAB XC8 C COM- PILER WHEN TARGETING MICROCHIP AVR DEVICES. PLEASE READ IT BEFORE
More informationRepresenting and Manipulating Floating Points
Representing and Manipulating Floating Points Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu The Problem How to represent fractional values with
More informationMost nonzero floating-point numbers are normalized. This means they can be expressed as. x = ±(1 + f) 2 e. 0 f < 1
Floating-Point Arithmetic Numerical Analysis uses floating-point arithmetic, but it is just one tool in numerical computation. There is an impression that floating point arithmetic is unpredictable and
More information