Engineering Degree Thesis 15 credits. Energy monitoring of the Cortex-M4 core, embedded in the Atmel SAM G55 microcontroller. Zeid Bekli William Ouda


Faculty of Technology and Society
Computer Engineering

Engineering Degree Thesis, 15 credits

Energy monitoring of the Cortex-M4 core, embedded in the Atmel SAM G55 microcontroller

Zeid Bekli
William Ouda

Exam: Bachelor of Science in Engineering
Examiner: Olle Lindeberg
Subject Area: Computer Engineering
Supervisor: Tommy Andersson
Date of final seminar:

Abstract

The technology in cellular phones, portable computing systems, and intelligent and connected devices is evolving at a high pace, and in many cases these devices are required to operate in a low-power environment. A problem that continues to emerge is the power consumption of microcontrollers and DSP devices, which has over time become important to solve in order to maximize battery life. To ease the choice of power-efficient microcontrollers, controlled experiments were performed with the Cortex-M4. This microcontroller was chosen because of its upgraded hardware, which has led to an appreciable change in both power and speed efficiency compared to its predecessors. The conclusion presents important points, along with advantages and difficulties to consider when implementing a DSP application. A comparison of different optimizations with the Floating Point Unit (FPU), Fixed-point and software Floating-point shows that there are major differences in power consumption between these three options. Depending on the option and optimization used, the power consumption can be more than 70% higher than with the other available options.

Acknowledgements

We would like to show our appreciation to our supervisor Tommy Andersson for taking the time to guide and support us during this thesis. We would also like to thank Magnus Krampell for all the support and encouragement during our three years of studies at Malmo University.

Contents

1. Introduction
  Problem domain
  Research Questions
  Limitations
2. Theoretical background
  Digital Signal Processing
  The FIR filter
  The IIR filter
  GNU Compiler Collection (GCC)
  Fixed-point
  Floating-point
  Energy monitoring in MCU
  DSP device
  The Cortex-M4
  The SAM G55 DSC
  Atmel power debugger
  Interrupt latency
3. Related work
  Martin Trevor - The designers guide to the cortex-m processor family
  Li Tan, Jean Jiang - Digital Signal Processing
  Joseph Yiu - The Definitive Guide to ARM Cortex-M3 and Cortex-M4 Processors
  Savita Rani - Area and Speed Efficient Floating-Point Unit
  Alexandre Aminot, et al. - Floating Point Units Efficiency in Multi-Core Processor
4. Method
  Literature study
  Research problem
  Controlled experiments
  SAMG55 with FIR- and IIR- filter
  In-system debugging (DWT)
  Power measuring system
5. Results and Analysis
  Algorithms and filter design
  Energy monitoring
  Enabling the FPU
  Results of the controlled experiments
  The power consumption when using optimization -O0 and -O1 with Software Floating-Point, Fixed-Point and FPU
  The power consumption in FIR filter
  The power consumption in IIR filter
  Comprehensive analysis
6. Discussion
  Method discussion
  Discussion of the data measurement
  The power consumption when executing the FIR filter
  The power consumption when executing the IIR filter
7. Conclusion
  Answering the research questions
  How does power consumption vary when using the same algorithms with and without hardware floating point unit? (Floating-point operation)
  How does power consumption vary when using an algorithm with the same functionality as RQ1? (Fixed-point operation)
  How is the dependency between speed and power consumption?
  Further work
  Contribution of this thesis

Acronyms

ADC         Analog to Digital Converter
ADP         Atmel Data Protocol
CMSIS-DAP   Cortex Microcontroller Software Interface Standard-Debug Access Port
CMSIS_DSP   Cortex Microcontroller Software Interface Standard-Digital Signal Processing
DMIPS       Dhrystone Million Instructions Per Second
DP          Double Precision
DSC         Digital Signal Controllers
DSP         Digital Signal Processing
DSP device  Digital Signal Processor
DWT         Data Watchpoint and trace
EDBG        Atmel Embedded Debugger
FFT         Fast Fourier Transform
FIR         Finite Impulse Response
FPO         Floating Point Operations
FPU         Floating Point Unit
HP          Half Precision
GCC         GNU Compiler Collection
IIR         Infinite Impulse Response
IoT         Internet of Things
JTAG        Joint Test Action Group
MAC         Multiplier Accumulator unit
MCU         Microcontroller Unit
MSB         Most Significant Bit
Opt         Optimization
SIMD        Single Instruction Multiple Data
SP          Single Precision
SWD         Serial Wire Debugger

1 Introduction

Signal processing is used in almost every technology that we rely on today, such as cellphones, computers, smart watches and automotive control systems, just to name a few [1]. Signal processing is at the heart of our modern world, powering today's entertainment and tomorrow's technology [1]. One part of signal processing is Digital Signal Processing (DSP), which refers to a set of algorithms used to process digital signals. Some of these algorithms improve the signal by filtering it, using techniques such as Finite Impulse Response (FIR) and Infinite Impulse Response (IIR) filters. Another widely used algorithm is the Fast Fourier Transform (FFT) [2].

A DSP device (see footnote 1) is a microprocessor specialized in processing real-time signals and DSP algorithms. The advantage of a DSP device over a general-purpose processor is that it processes the same algorithms more efficiently, because signal processing is its main task [3][4]. This leads to one of the most common issues: the increased complexity of DSP blocks has led to an ever-increasing power consumption challenge. DSP blocks consist of a network of adders and multipliers, and it is typically these networks that have a major influence on the power consumption of a DSP device [5][6].

There are processors manufactured with DSP architecture layers that help the general-purpose processor perform DSP algorithms more efficiently, and these can be characterized as Digital Signal Controllers (DSC). This means that there might not be any reason to buy a separate DSP device and a general-purpose processor to get the desired results. One could save significantly on the cost of the products being constructed by replacing two processors with one high-performance processor with a DSP extension [7]. One of these processors is the Cortex-M4, a member of the ARM Cortex-M processor family.

The Cortex-M4 is a powerful member of the Cortex-M family and is used worldwide in a range of digital signal control embedded market segments. Robotics is one of these segments and has a critical role in healthcare, for example in precision surgery and assisted living. Other important segments are automotive control systems, smartwatches and medical instruments [8].

Footnote 1: Both Digital signal processing and Digital signal processor use the acronym DSP; from here onwards the Digital signal processor is referred to as DSP device.

1.1 Problem domain

The technology in cellular phones, portable computing systems, and intelligent and connected devices is evolving at a high pace, and in many cases these devices are required to operate in a low-power environment [7]. A problem that continues to emerge is the power consumption of microprocessors and DSP devices. Over time this issue has become important to solve in order to maximize battery life [7][9].

For mobile devices, where higher performance and better power efficiency are crucial to success, one option to consider is the Cortex-A72 processor. It uses the power-optimizing ARM big.LITTLE processing technology, which combines high-performance ARM CPU cores with more power-efficient ARM CPU cores to give good performance at a significantly lower average power [10].

The Cortex-M4 processor differs from the Cortex-A72 in that it is built on one high-performance core [11], which means that it cannot use the big.LITTLE processing technology. Yet the Cortex-M4 processor is used in areas such as robotics and healthcare. This is because of its upgraded hardware, which has led to an appreciable change in both power and speed efficiency compared to its predecessors. New technologies and hardware accelerators were implemented in the Cortex-M4 CPU, such as single-cycle multiply, hardware division, bit-field instructions and the added DSP functions; these have been an important factor in making the Cortex-M4 a high-performance processor [12]. How does power consumption relate to these new hardware technologies, and are there any significant changes in power consumption when using DSP algorithms?

1.2 Research Questions

The aim of this research is to investigate the power consumption of a Cortex-M4 DSC, and to compare DSP algorithms running with the hardware Floating Point Unit (FPU) featured in the Cortex-M4 DSC enabled against the same algorithms running with it disabled. When implementing DSP algorithms such as FIR and IIR filters, is there any noticeable tradeoff between speed and power consumption? The answer will give better insight into the advantages and disadvantages of the Cortex-M4 DSC.

Main question

How does power efficiency vary in the Cortex-M4 DSC when the FPU is enabled compared to when it is disabled, and is speed related in any way?

Sub questions

RQ1: How does power consumption vary when using the same algorithms with and without hardware floating point unit? (Floating-point operation)
RQ2: How does power consumption vary when using an algorithm with the same functionality as RQ1? (Fixed-point operation)
RQ3: How is the dependency between speed and power consumption?

1.3 Limitations

The aim of this thesis is to measure the power consumption of the Cortex-M4 DSC along with the embedded FPU, which can be enabled optionally. The research on the sub questions in section 1.2 is based on the points below.

DSP execution computed with FPU, and GNU compiler optimization -O0 and -O1
DSP execution computed with software Floating-point, and GNU compiler optimization -O0 and -O1
DSP execution computed with Fixed-point, and GNU compiler optimization -O0 and -O1

The device used in this thesis is the SAM G55 with the Cortex-M4 core. The wake-up time is not included in the measurements, because it differs between the MCUs that use the Cortex-M4 core. The power consumption of the ADC and DAC is also not included in the measurements, because the main focus is on the power consumption of the Cortex-M4 DSC when executing the DSP algorithms. Optimization levels -O2 and -O3 make debugging harder and give incorrect DWT cycle values, and are therefore not used.

2. Theoretical background

The aim of this chapter is to review the areas that are important to understand in order to follow the later chapters of this thesis. Each subsection below should give a sufficient understanding of each term.

2.1 Digital Signal Processing

Signals are patterns of variations that represent information. There are all kinds of signals, such as speech signals, audio signals, video or image signals and radar signals, just to name a few [4][13]. Many signals originate as continuous-time signals, and speech signals are one of these. It can sometimes be desirable to obtain a discrete-time representation of the signal, and one way to do this is to sample the signal at equally spaced points in time. The result is a discrete-time representation of the signal that can be processed digitally [16].

The FIR filter

In FIR filters each value in the output sequence is a weighted sum of a finite number of samples of the input sequence, which is basically a feed-forward difference equation. A general FIR filter is specified by the following equation [4][13].

$$y[n] = \sum_{k=0}^{M} b_k \, x[n-k] \qquad \text{(Eq. 1)}$$

The output signal (y[n]) depends on the input signal (x[n]), the filter order (M) and the values of the impulse response (b_k). The filter can be illustrated with a block diagram. A third-order FIR filter can be seen in figure 1 below, and its equation is

$$y[n] = b_0 \, x[n] + b_1 \, x[n-1] + b_2 \, x[n-2] + b_3 \, x[n-3] \qquad \text{(Eq. 2)}$$
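As an illustration of Eq. 1, the sketch below evaluates the FIR difference equation directly in Python. It is a minimal sketch and not the implementation measured in this thesis: the function name `fir_filter`, the moving-average coefficients and the input samples are assumptions chosen so the arithmetic is easy to follow, and samples before n = 0 are assumed to be zero.

```python
def fir_filter(b, x):
    """Evaluate y[n] = sum_{k=0}^{M} b[k] * x[n-k] for every n.

    Samples x[n] with n < 0 are assumed to be zero (hypothetical choice).
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):
            if n - k >= 0:          # skip taps that reach before the first sample
                acc += bk * x[n - k]
        y.append(acc)
    return y


# Hypothetical third-order filter (M = 3): four equal taps, i.e. a moving average.
b = [0.25, 0.25, 0.25, 0.25]
x = [4.0, 8.0, 12.0, 16.0, 20.0]
y = fir_filter(b, x)
print(y)

# Feeding a unit impulse returns the coefficients b0..b3 (the impulse response).
impulse_response = fir_filter(b, [1.0, 0.0, 0.0, 0.0, 0.0])
print(impulse_response)
```

Each output sample costs M + 1 multiplications and M additions, which is the multiply-accumulate workload that the later chapters run with FPU, software Floating-point and Fixed-point arithmetic.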

Figure 1. Third-order FIR filter

The input signal in a third-order FIR filter (Figure 1) passes through three unit delays. The input and its delayed versions are multiplied by the filter coefficients (b0, b1, b2, b3), and the products are then added to generate the output (y[n]) [13].

The IIR filter

IIR filters are feedback systems in the sense that the output value of the system is reused. The difference between a FIR filter and an IIR filter is the ability of the IIR filter to combine previous output values with the input signal to compute a new output. The general IIR difference equation is [4][13]

$$y[n] = \sum_{i=1}^{N} a_i \, y[n-i] + \sum_{k=0}^{M} b_k \, x[n-k] \qquad \text{(Eq. 3)}$$

The terms of the equation are the feedback coefficients (a_i), the feedback filter order (N), the feedforward coefficients (b_k), the feedforward filter order (M), the input signal (x[n]) and the output signal (y[n]). A closer look at the equation shows that if all the coefficients a_i were zero, the equation of a FIR filter would be obtained [4][13].

A block diagram of a first-order IIR filter with its corresponding equation can be seen in figure 2 and equation 4.

$$y[n] = a_1 \, y[n-1] + b_0 \, x[n] + b_1 \, x[n-1] \qquad \text{(Eq. 4)}$$

Figure 2A. First-order IIR filter in Direct Form I
Figure 2B. First-order IIR filter in Direct Form II

The IIR filter in figure 2A is in Direct Form I; Direct Form II is shown in figure 2B. The difference between the two forms is that the unit delays in Direct Form II can be combined into one, because the signal fed to the unit delays in figure 2B is the same [4][13]. An IIR filter becomes unstable if any of its poles lies outside the unit circle [4].
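The first-order filter of Eq. 4 can be sketched in both structures to show that they produce the same output while Direct Form II needs only one delay element. This is an illustrative sketch: the function names, the coefficient values and the zero initial state are assumptions, and the stability check relies on the fact that, with the sign convention of Eq. 4, the single pole of the first-order filter sits at z = a1.

```python
def iir_direct_form_1(a1, b0, b1, x):
    """Eq. 4 evaluated with separate delays for x[n-1] and y[n-1]."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = a1 * y_prev + b0 * xn + b1 * x_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


def iir_direct_form_2(a1, b0, b1, x):
    """Same filter with a single shared delay w[n-1]."""
    y, w_prev = [], 0.0
    for xn in x:
        w = xn + a1 * w_prev              # feedback part
        y.append(b0 * w + b1 * w_prev)    # feedforward part
        w_prev = w
    return y


def is_stable(a1):
    """The pole lies at z = a1, so it must be strictly inside the unit circle."""
    return abs(a1) < 1.0


# Hypothetical coefficients, chosen as powers of two so the results are exact.
impulse = [1.0, 0.0, 0.0, 0.0]
df1 = iir_direct_form_1(0.5, 1.0, 0.5, impulse)
df2 = iir_direct_form_2(0.5, 1.0, 0.5, impulse)
print(df1, df2, is_stable(0.5), is_stable(1.5))
```

Setting a1 to zero removes the feedback path, and both functions then behave like a first-order FIR filter, which matches the remark made about Eq. 3.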

2.2 GNU Compiler Collection (GCC)

GCC is one of the most used compilers today. It is free software, and volunteers can contribute to improving its functionality. GCC is the optimizing compiler of the GNU project, which also provides object file tools such as the assembler and linker [14]. There are five options for code optimization with GCC [15][16].

-Os - Optimizes for space usage (size) rather than speed.
-O0 - Optimization is disabled, which makes debugging easier; the code is slower than with the three options below.
-O1 - This optimization is also suitable for debugging, and it improves both speed and space usage (size).
-O2 - Full optimization, skipping any optimization that can increase space usage (size).
-O3 - Does the same as -O2, and additionally applies optimizations that trade increased space usage (size) for speed.

Optimization levels -O2 and -O3 give fast code, but make debugging harder [16].

2.3 Fixed-point

Fixed-point arithmetic means that a fixed number of digits is used before and after the decimal point. This implies that the resolution depends on the number of bits. For example, if 8 bits are used for the fraction, the resolution is 2^-8 [4][17][18]. The maximum number in Fixed-point arithmetic is limited by the number of bits available; with 32 bits, for example, anywhere from 0 to 32 bits can be used as fraction bits, and the scaling is predetermined by the user. Fixed-point is commonly used when an FPU is not available in the hardware [2][17].

There are four ways to store and represent integers (converting decimal numbers into binary): Unsigned integer, Offset binary, Sign and Magnitude, and Two's complement [2][17].

Unsigned integer is quite straightforward compared to the other three formats; it goes from zero up to the maximum positive number allowed by the number of bits. One noticeable disadvantage of unsigned integers is that negative numbers cannot be represented.

Offset binary format works in a similar way to the unsigned integer format; the difference lies in the shifted offset, which allows both positive and negative numbers to be represented.

Sign and magnitude format is another way to represent negative and positive integers. The most significant bit (MSB) is zero for positive numbers and one for negative numbers; this is called the Sign bit. The remaining bits function as a standard binary number. This in turn means that there are two ways to represent zero, which wastes a bit pattern [2][17].

Two's complement format is the one most used by engineers, because it is less complex to implement in hardware than the other three formats [2][17]. This is illustrated in the table below.

Table 1. Illustration of Two's complement format (columns: Bits, Decimal)

To represent fractional numbers with sign and magnitude format, it is possible to trick the CPU into thinking that it is dealing with integers. As seen in figure 3, the Sign (S) bit is in the 16th position while the decimal point is placed between the 5th and 6th bit [17][18].

Figure 3. Fraction representation with sign and magnitude format

The equation below calculates the intended value for figure 3 [17]. S is +1 or -1, depending on the Sign bit.

$$S = \begin{cases} +1 \\ -1 \end{cases} \qquad \text{Sum} = S \cdot \left(\text{integer}_{(\text{decimal})} + \text{fraction}_{(\text{decimal})}\right) \cdot 2^{-5} \qquad \text{(Eq. 5)}$$
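The integer formats described in section 2.3 can be compared side by side with a short sketch. The helper names are hypothetical, the 8-bit width is an arbitrary choice, and the offset of 2^(bits-1) for Offset binary is an assumption (a common choice, not stated in the text). The last function shows one possible reading of the layout in figure 3: a 16-bit sign and magnitude word whose magnitude is scaled by 2^-5.

```python
BITS = 8  # arbitrary word length for the illustration


def to_twos_complement(value, bits=BITS):
    """Bit pattern of a signed integer in two's complement."""
    return value & ((1 << bits) - 1)


def from_twos_complement(pattern, bits=BITS):
    """Signed value of a two's complement bit pattern."""
    if pattern & (1 << (bits - 1)):       # sign bit set: negative number
        return pattern - (1 << bits)
    return pattern


def to_sign_magnitude(value, bits=BITS):
    """MSB is the Sign bit, the remaining bits hold |value|."""
    sign = (1 << (bits - 1)) if value < 0 else 0
    return sign | abs(value)


def from_sign_magnitude(pattern, bits=BITS):
    magnitude = pattern & ((1 << (bits - 1)) - 1)
    return -magnitude if pattern & (1 << (bits - 1)) else magnitude


def to_offset_binary(value, bits=BITS):
    """Assumed offset of 2**(bits-1), so zero maps to the mid-scale pattern."""
    return value + (1 << (bits - 1))


def decode_sign_magnitude_fixed(pattern, frac_bits=5, bits=16):
    """One reading of figure 3: S * magnitude * 2**-frac_bits."""
    return from_sign_magnitude(pattern, bits) * 2.0 ** -frac_bits


print(format(to_twos_complement(-5), "08b"))   # two's complement of -5
print(format(to_sign_magnitude(-5), "08b"))    # sign and magnitude of -5
print(format(to_offset_binary(-5), "08b"))     # offset binary of -5
print(decode_sign_magnitude_fixed(0x8000 | 80))
```

The sign and magnitude decoder returns zero for both 0x00 and 0x80, which is the wasted bit pattern mentioned in the text.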

2.4 Floating-point

Floating-point means that the decimal point floats depending on the given value, unlike the Fixed-point representation where the decimal point stays in the same place; this makes Floating-point representations more dynamic and efficient. Three precisions are used with Floating-point: Half Precision (HP) uses 16 bits, Single Precision (SP) is the most common one and uses 32 bits, and Double Precision (DP) uses 64 bits [4][19].

The IEEE standard representation for Floating-point is divided into three parts [4][19].

The sign bit
The exponent
The mantissa

Precision     Sign bit   Exponent   Mantissa
HP floating   1 bit      5 bits     10 bits
SP floating   1 bit      8 bits     23 bits
DP floating   1 bit      11 bits    52 bits

Table 2 - Basic floating formats

The sign bit decides the polarity of the value, where 0 represents positive numbers and 1 represents negative numbers [4][19]. The mantissa is the fraction part (the part after the separator). As an example, the number 7 can be represented as 1.75 * 4 = (1 + 1/2 + 1/4) * 2^2; the mantissa is the fraction part (0.75) and the exponent is 2 + bias [4][19]. For SP floating point, normal numbers have a stored exponent in the range of 1 up to 254, and this is best explained by studying the mathematical equation of the SP value in Eq. 6.

$$\text{value} = (-1)^{\text{sign}} \cdot 2^{(\text{exponent} - 127)} \cdot \left(1 + \tfrac{1}{2}\,\text{Fraction}[22] + \tfrac{1}{4}\,\text{Fraction}[21] + \dots + \tfrac{1}{2^{23}}\,\text{Fraction}[0]\right) \qquad \text{(Eq. 6)}$$

The exponent uses the Offset binary format (explained above in section 2.3), where the exponent is shifted by a bias. The bias shown in the equation above is 127 [4][19].
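Eq. 6 can be checked against the worked example of the number 7 with a short sketch. It uses Python's standard `struct` module to obtain the 32-bit pattern of a single-precision value; the function names are hypothetical, and the reconstruction only covers normal numbers (stored exponent 1 to 254), as in the text.

```python
import struct


def decode_single(x):
    """Split a value, stored as IEEE single precision, into its three fields."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31
    exponent = (raw >> 23) & 0xFF      # stored exponent, bias of 127 included
    fraction = raw & 0x7FFFFF          # 23 mantissa bits, Fraction[22]..Fraction[0]
    return sign, exponent, fraction


def value_from_fields(sign, exponent, fraction):
    """Evaluate Eq. 6 for a normal number (1 <= exponent <= 254)."""
    mantissa = 1.0
    for i in range(23):
        bit = (fraction >> (22 - i)) & 1     # Fraction[22] has weight 1/2
        mantissa += bit * 2.0 ** -(i + 1)
    return (-1) ** sign * 2.0 ** (exponent - 127) * mantissa


sign, exponent, fraction = decode_single(7.0)
print(sign, exponent, hex(fraction))
print(value_from_fields(sign, exponent, fraction))
```

For 7.0 the stored exponent is 2 + 127 = 129 and the two most significant fraction bits are set (1/2 + 1/4 = 0.75), which agrees with the example in the text.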

An FPU is a hardware unit that can be added to processors to perform Floating-point arithmetic operations in fewer cycles than software Floating-point. Most FPUs support the IEEE standard [4][19].

2.5 Energy monitoring in MCU

CMOS technology is used in MCUs, and it has two kinds of power dissipation, static and dynamic. Static dissipation is the power leakage that occurs during steady state, while dynamic dissipation occurs when switching states [20].

There are different ways to monitor energy efficiency and power characteristics. One way to measure energy efficiency is to look at the work that is done with a limited amount of energy. The measurement unit can then take the form of Dhrystone Million Instructions Per Second (DMIPS)/μW or CoreMark (benchmark score)/μW. Both are based on benchmarks for embedded systems. Power measurement, on the other hand, is based on three factors: the active current, which is measured in μA/MHz; the sleep mode current, which is measured in μA since the clock should be stopped; and energy efficiency. For energy efficiency it is the execution time that is taken into consideration: if the MCU has a long execution time, the overall energy consumption will suffer [19].

The Cortex-M3 and Cortex-M4 have a number of power features such as sleep mode, wait mode and backup mode. The Cortex-M4 should be able to run at under 200 μA/MHz, while some other Cortex-M processors are able to run at under 100 μA/MHz [19].

2.6 DSP device

A DSP device is a processor that is specialized in DSP algorithms, which leads to fast arithmetical calculations [3][21]. The Harvard structure or the improved Harvard structure is generally used in a DSP device, which means that data and instructions/program are kept in separate memories. There are at least four buses in the DSP device: the program data bus, the program address bus, the data bus and the data address bus. This separation allows faster and independent access during a cycle [3].

A DSP device usually possesses several processing units, whose main purpose is to enhance the speed of the device [3]. One of the units is the FPU. Approximately one third of the DSP devices on the market have an FPU, and over one half of the FPU non-users are planning to change. This is due to the high cost of the hardware [2][21]. The pipelines are structured differently from those of general-purpose processors, which allows the DSP device to execute multiple instructions simultaneously [3].
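The point made in section 2.5, that execution time matters as much as active current, can be made concrete with the relation energy = voltage x current x time. The sketch below is a rough model under stated assumptions, not measured data: the 3.3 V supply, the cycle counts and the 180 μA/MHz figure are hypothetical, the 200 μA/MHz figure is the upper bound quoted above, and 120 MHz is the clock listed for the SAM G55 board.

```python
SUPPLY_V = 3.3           # assumed supply voltage (hypothetical)
CPU_HZ = 120_000_000     # 120 MHz


def active_current_a(ua_per_mhz, freq_hz):
    """Active current in amperes from a figure given in uA/MHz."""
    return ua_per_mhz * 1e-6 * (freq_hz / 1e6)


def task_energy_j(cycles, ua_per_mhz, freq_hz=CPU_HZ, supply_v=SUPPLY_V):
    """Energy in joules for a task that runs for `cycles` clock cycles."""
    exec_time_s = cycles / freq_hz
    return supply_v * active_current_a(ua_per_mhz, freq_hz) * exec_time_s


# Hypothetical workloads: a slow variant drawing less current versus a fast
# variant drawing more current.
slow = task_energy_j(cycles=10_000, ua_per_mhz=180.0)
fast = task_energy_j(cycles=3_000, ua_per_mhz=200.0)
print(f"slow: {slow * 1e6:.2f} uJ, fast: {fast * 1e6:.2f} uJ")
```

In this made-up case the faster variant draws more current but finishes early enough to use about a third of the energy, which is why a μA/MHz figure alone is not a sufficient measure.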

2.7 The Cortex-M4

The Cortex-M microcontrollers were presented in 2004. The Cortex-M4 and Cortex-M7 processors support DSP instructions. The Cortex-M4 can be used in demanding areas where memory protection and Floating-point support for SP and HP calculations are mandatory. The key features of the Cortex-M4 are DSP, SIMD (Single Instruction Multiple Data), a MAC (Multiply-Accumulate) unit, debug, Harvard architecture, 32-bit performance and an optional FPU [22].

The SAM G55 DSC

The SAM G55 is a microcontroller based on the Cortex-M4 core and is intended for low-power applications [8]. The SAM G55 DSC is a development board with this controller.

Figure 4A. SAM G55 features
Figure 4B. The SAM G55 DSC

The key features of the SAM G55 DSC are the Atmel Embedded Debugger (EDBG), the Atmel Data Protocol (ADP), a current measurement header, 120 MHz operation, an Analog to Digital Converter (ADC) module, Serial Wire Debugger (SWD) and Joint Test Action Group (JTAG) interfaces, and Data Watchpoint and trace (DWT) [23][24]. The EDBG is intended for onboard debugging, and one of its functions is to stream data from the MCU to the host PC. The EDBG uses the ADP when streaming the data. The DWT is a debugging unit that provides data tracing and counters for the processor. SWD is an alternative to the JTAG interface for debugging [23][24].

2.8 Atmel power debugger

The Atmel power debugger (figure 5) is a development tool intended for debugging and programming ARM Cortex-M based Atmel SAM and Atmel AVR microcontrollers. The controllers need to have a JTAG or SWD interface [25]. JTAG, also referred to as boundary-scan, is defined by IEEE as a method for testing functionality on circuit boards [26], while the SWD interface is a subset of the JTAG interface. The SWD interface uses the TCK and TMS pins for connection, and these two pins can also be found on the JTAG 10-pin connector [27].

The power debugger has two separate channels for measuring current and is ARM CMSIS-DAP (Cortex Microcontroller Software Interface Standard-Debug Access Port) compatible, which means it works with Atmel Studio 7.0 or later [25]. CMSIS-DAP is an interface that provides access for debugging [28]. A key benefit of the debugger is that it streams measurements and data to the Atmel Data Visualizer for real-time analysis [25].

Figure 5. The Power Debugger

Channel A of the power debugger provides high-accuracy measurements of low currents in the range of 100mA - 500μA; the resolution is around 3μA and the accuracy is no worse than 3% [25].

2.9 Interrupt latency

Interrupt latency is the number of clock cycles a processor requires to react to an interrupt signal, on entry and on exit. The interrupt latency is around twelve cycles on entry and ten cycles on exit. If the FPU is enabled, an increase of seventeen cycles is possible on entry and on exit [29].
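The cycle counts in section 2.9 translate into time once a clock frequency is fixed. The sketch below assumes the 120 MHz clock listed for the SAM G55 board and treats the seventeen extra cycles as a worst case added to the twelve entry cycles; the constant and function names are hypothetical.

```python
CPU_HZ = 120_000_000     # 120 MHz, assumed core clock
ENTRY_CYCLES = 12        # approximate interrupt entry latency
EXIT_CYCLES = 10         # approximate interrupt exit latency
FPU_EXTRA_CYCLES = 17    # possible increase when the FPU is enabled


def cycles_to_ns(cycles, cpu_hz=CPU_HZ):
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1_000_000_000 / cpu_hz


entry_ns = cycles_to_ns(ENTRY_CYCLES)
exit_ns = cycles_to_ns(EXIT_CYCLES)
entry_fpu_ns = cycles_to_ns(ENTRY_CYCLES + FPU_EXTRA_CYCLES)
print(f"entry {entry_ns:.1f} ns, exit {exit_ns:.1f} ns, "
      f"entry with FPU up to {entry_fpu_ns:.1f} ns")
```

The same conversion applies to cycle counts read from the DWT counter when an execution time is needed.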

3. Related work

This section presents relevant information from previous work that is closely related to this thesis. The works in the subsections below point out important features of the Cortex-M4, the mathematics of DSP, and speed-efficient Floating-point units.

3.1 Martin Trevor - The designers guide to the cortex-m processor family - Chapter 8

The main focus of Trevor's [12] book is to understand the DSP functions that are embedded in the Cortex-M4 and the Cortex-M7. The combination with a traditional MCU can be referred to as a DSC. Martin Trevor explains the key features that are added to the M4 and M7 to support DSP usage. The enhancements are SIMD instructions, the FPU and an improved MAC unit compared to the M3. Trevor then uses the ARM CMSIS-DSP (Cortex Microcontroller Software Interface Standard - Digital Signal Processing) software library to show how to access the functions that are added in the M4 and M7.

Through experiments he explains the difference between the FPU and software Floating-point, and he also explains how to enable and disable the FPU. The SIMD instructions are explained with code examples that show how efficient SIMD is with DSP algorithms, followed by exercises on how to optimize DSP algorithms, presented in chronological order. The CMSIS-DSP software library is then explained in more detail; part of this covers the conversion functions and their ability to convert between Floating-point and Fixed-point.

The most relevant points concern the SIMD, FPU and MAC features embedded in the Cortex-M4. All of these are relevant to this thesis, since they are encountered when answering the research questions in section 1.2. Studying this book has given a better understanding of what a DSC represents, and of how speed and power efficiency play a significant role in modern processors such as the M4.

3.2 Li Tan, Jean Jiang - Digital Signal Processing - chapter 7, chapter 8, and chapter 9

Digital Signal Processing offers electrical engineers and computer engineers an introduction to the use of mathematics in the subject of DSP. Tan, et al. [4] take advantage of the availability of powerful computers and software environments such as MATLAB to perform extensive computation and create laboratories, which gives engineering students a broader perspective on the effects that can be gained from filtering signals.

In chapter 7 Tan, et al. illustrate the concept of FIR filters with figures, mathematical equations and block diagrams. The intention of these basic illustrations is to give engineers an understanding of how FIR filters can be implemented in projects or laboratories.

Chapter 8 is much like chapter 7; it is about IIR filters and how they can be implemented in projects and laboratories. It is explained through block diagrams, mathematical representations and figures. To keep it simple, Tan, et al. explain a first-order IIR filter and a second-order filter, and to sum up the points presented in the subsections they present a few examples of IIR filters.

In chapter 9 Tan, et al. bring up hardware and software for DSP devices. They explain the architectural differences between a DSP device and a traditional MCU, such as the Harvard and Von Neumann architectures, followed by the hardware units that exist in most common DSP devices, such as the MAC unit. They show the importance of a MAC unit with a visual representation of how the MAC function executes. Fixed-point and Floating-point are both covered in detail in this book. Tan, et al. bring up the differences between the two and how they are implemented in DSP devices.

The FPU and MATLAB are both essential to this thesis. Solving RQ1 and RQ2 requires MATLAB for generating filter coefficients and a signal with noise, and following this guide has made the MATLAB workflow less complex to understand. The examples given by Tan, et al. on FIR and IIR filters are implemented with both Fixed-point and Floating-point, which is an important part to understand for this thesis.

3.3 Joseph Yiu - The Definitive Guide to ARM Cortex-M3 and Cortex-M4 Processors - chapter 9, 13, 21 and 22

Joseph Yiu [19] sheds light on the Cortex-M3 and the Cortex-M4 with examples and guidelines. The chapters in focus are 9, 13, 21 and 22, because they are closely related to this thesis.

Chapter 9 is divided into two major sections. The first section, which is the one in focus here, is about low-power systems and the low-power features of the Cortex-M family. Yiu brings up important questions such as what low power means in microcontrollers, and then explains that one typical way to measure energy efficiency is in the form of DMIPS/μW or CoreMark/μW, which is basically how much processing is done with a limited amount of energy. Yiu later states that power has traditionally been measured in μA/MHz, based on active current and sleep mode current, but that this is now inadequate because energy efficiency is equally important. The end of the section is about how to utilize the low-power features in application software, illustrated through charts and tables.

The second important chapter is chapter 13, which covers Floating Point Operations (FPO). Yiu introduces software Floating-point, the FPU and their usage in the Cortex-M4, with examples such as how to convert a value to SP in the IEEE-754 standard, along with HP and DP. He later points out that for MCUs without an FPU, the arithmetic calculations are carried out by run-time library functions.

Chapter 21 (ARM Cortex-M4 and DSP Applications) and chapter 22 (Using the ARM CMSIS-DSP Library) are about the DSP functions in the Cortex-M4 processor and how it compares to DSP devices. Yiu starts by explaining the term DSP and its use on an MCU, which is the key feature that makes the Cortex-M4 a DSC. This is illustrated by showing the architecture layers added to the Cortex-M4. Yiu also states that using the Cortex-M4, which is a DSC, removes the limitation of having a separate MCU and DSP device, which leads to lower power consumption and lower overall system cost. The signal processing algorithms in the CMSIS-DSP library are optimized for the Cortex-M4, and Yiu gives examples and guides the reader through common algorithms from the CMSIS-DSP library such as the FIR filter, the IIR filter and the FFT.

The guidelines that Yiu introduces in the book, such as FPO, the CMSIS-DSP library, SIMD and the Cortex-M4 as a DSC, are important to this thesis. Understanding in which form energy and power efficiency are measured sets the foundation for the benchmark that will be applied to solve the research questions in section

3.4 Savita Rani - Area and Speed Efficient Floating Point Unit

Savita Rani [30] explains what the FPU is and the advantages of using an FPU compared to Fixed-point arithmetic. Rani states that the FPU is a key element in areas where real-time computations are required, such as signal processing, and then mentions that for numbers that are very large or very small the use of an FPU is required, even if Fixed-point arithmetic can be faster. One point that Rani brings up is that multiplication is not as common as addition, but is very important, even essential, for MCUs and DSP devices where DSP applications are involved.

Rani discusses multiplication techniques and methods in more detail, namely Integer Multiplication Methods, Truncated Multipliers and Logarithmic Multipliers. The performance of these three multiplication methods is then investigated using simulations that analyze the output of the multiplication techniques with a full FPU, followed by a discussion of which technique provides better results.

In this thesis both the FPU and Fixed-point arithmetic are used, so it is important to consider what Rani says about very large and very small numbers being an issue for Fixed-point arithmetic, especially since very small numbers are used in both the FIR and the IIR filter. For the IIR filter, small changes in the coefficients can make it very unstable, and this is probably one of the reasons that Rani recommends the FPU over Fixed-point arithmetic when dealing with very small numbers.
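The concern about very small numbers in Fixed-point arithmetic can be illustrated by rounding coefficients to a fixed number of fraction bits. This is a sketch with made-up values, not the coefficients used in this thesis: `quantize` is a hypothetical helper, and 0.001 and 0.999 are arbitrary examples of a small coefficient and of a first-order feedback coefficient (pole) close to the unit circle.

```python
def quantize(value, frac_bits):
    """Round a real value to the nearest multiple of 2**-frac_bits."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


small_coefficient = 0.001   # hypothetical small filter coefficient
pole = 0.999                # hypothetical feedback coefficient, |pole| < 1

for frac_bits in (8, 15):
    q_small = quantize(small_coefficient, frac_bits)
    q_pole = quantize(pole, frac_bits)
    print(f"{frac_bits:2d} fraction bits: coefficient -> {q_small}, "
          f"pole -> {q_pole}, strictly inside unit circle: {abs(q_pole) < 1.0}")
```

With 8 fraction bits (resolution 2^-8) the small coefficient rounds to zero and the pole rounds to exactly 1.0, so the quantized filter is no longer strictly stable; with 15 fraction bits both values survive with a small error.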

3.5 Alexandre Aminot, et al. - Floating Point Units Efficiency in Multi-Core Processor
Alexandre Aminot, et al. [31] explain speed-up extensions in multi-core processors. A multi-core processor with an FPU in every core is called SMP by Aminot, et al., while a processor that only has an FPU in some of the cores is called FAMP. The paper's research questions are how to efficiently exploit floating point units in multi-core processors, and whether there are any advantages to having an FPU in all the cores.

The method used in the research of Aminot, et al. is based on controlled experiments. Three energy management systems are compared on FAMP processors, and the results are then compared with the SMP processors, which only make use of one energy management system. The three energy management systems are the application level, the scheduler event level, and the hardware level, which the SMP uses. It is stated that the code is not modified for the experiments and that different benchmarks are used to estimate the power consumption and the performance.

The results for the first energy management system show that applications that use integers, or make minimal use of floating point, are better served by the FAMP processor, because the speed-up does not balance the power cost. Aminot, et al. mention an application that mostly uses Floating-point, and with this application the SMP consumes less energy and is faster than the FAMP.

The second energy management system is the scheduler level, where the system switches cores depending on the event that occurs in the application. This lowered the energy consumption but increased the execution time, because more time is spent in the core without an FPU. The energy management system did not decrease the energy consumption for applications that depend more on the FPU, because more time is spent on switching cores and less time is spent in the cores without an FPU. Aminot, et al. recommend that applications that need floating point should be executed completely on a core that has an FPU.

The third energy management system, which is tested with both an SMP and a FAMP processor, is the hardware (instruction) level. The hardware level is an aggressive technique that quickly powers up the FPU in the core(s). This technique is application dependent, and the power-up time is 1000 cycles. The energy consumption at the hardware level is reduced for longer applications compared to running the same applications at the scheduler level. The conclusion of Aminot, et al. is that an FPU is not necessary in every core, because of the power leakage that occurs in the FAMP processors due to the FPU.

4. Method
The workflow used in this thesis is presented in this section. It can be seen as a top-down framework that consists of two main categories: literature study and controlled experiments. Controlled experiments are then divided into three subcategories: SAMG55 with FIR- and IIR- filter, Power measuring system, and In-system debugging. This structure is presented in figure 6 [32].

Figure 6. Research workflow

4.1 Literature study
Several studies were reviewed during this thesis, and the most relevant reviews can be found in section 3. This has given an overview of the domain problem stated in section 1.1 and of the techniques (observe, formulate and evaluate) used to identify the sub questions.

4.2 Research problem
The questions in section 1.2 are acquired through studies. To reach a conclusion on the research problem, three steps have to be followed in chronological order to ease the workflow.
- Evaluating the background
- Deciding the problem domain
- Setting the limitations

Evaluating the background is the first step in this thesis; its advantage is that it gives an overview of the area. The following step is to decide the problem domain found in the area that was evaluated in the first step. The last step is to set the limitations in order to focus on the specific problem at hand. These steps are carried out through iteration of the literature study, as can be seen in figure 6. The most related studies in this thesis can be seen in section

4.3 Controlled experiments
"Science classifies knowledge. Experimental science classifies knowledge derived from observation" (Denning P.J [33]). To get a basis in an area it is important to acquire an understanding of the fundamental components and relationships in that area. Experiments provide the data needed to better evaluate, predict, understand, control and improve a development process and product [32]. This is a well-known concept, in which essentially everything is held constant except for one variable [32][33]. The DSP functionality can be seen as the variable in the Cortex-M4 DSC. In this thesis experiments are performed with an apparatus. The data (section 5.4) given by the apparatus is then analyzed and used to answer the problems in question. An apparatus can be divided into two categories, system apparatus and simulator apparatus [33]. The apparatus in this case is the system apparatus: the SAMG55, In-system debugging, and the Power measuring system.

4.3.1 SAMG55 with FIR- and IIR- filter
The system apparatus from Atmel is the SAMG55 development board. The SAMG55 is used to execute DSP algorithms such as the FIR and the IIR filter. During the execution of the algorithms, the current measurement header and the Cortex debug header are connected to a Power measuring system (section 4.3.3). The schematics for this setup can be seen in figure

4.3.2 In-system debugging (DWT)
Measuring speed with in-system debugging is efficient. It is done by marking a section of code with a start and a stop counter, which lets the user see the number of cycles needed to execute the marked code. In this thesis in-system debugging is performed with Atmel Studio to answer sub question three in section 1.3 [34].

4.3.3 Power measuring system
The apparatus used in this thesis is the Atmel power debugger (section 2.8), a device used to measure power consumption. The power debugger allows the user to follow the power consumption of the FIR and the IIR filter in a real-time application and to analyze the efficiency of the Cortex-M4 device in its present state [25]. Atmel data visualizer is a program compatible with the Atmel power debugger that offers a graph plotter, an oscilloscope and other indicators that help in interpreting the data [25]. The power debugger and the data visualizer are used to answer sub questions one, two and three in section 1.3. The measurements are done using common FIR and IIR algorithms.
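The start and stop counter described in section 4.3.2 can be sketched in C as follows. The register addresses and bit positions are the architecturally defined ones for the Cortex-M4 debug and DWT blocks; the function names are hypothetical, and the thesis reads the cycle count through Atmel Studio, not necessarily through code like this.

```c
#include <stdint.h>

/* Core debug registers of the Cortex-M4 (memory-mapped, target only). */
#define DEMCR      (*(volatile uint32_t *)(uintptr_t)0xE000EDFCu) /* Debug Exception and Monitor Control */
#define DWT_CTRL   (*(volatile uint32_t *)(uintptr_t)0xE0001000u) /* DWT control register */
#define DWT_CYCCNT (*(volatile uint32_t *)(uintptr_t)0xE0001004u) /* free-running cycle counter */

#define DEMCR_TRCENA       (1u << 24) /* enables the DWT and ITM blocks */
#define DWT_CTRL_CYCCNTENA (1u << 0)  /* enables the cycle counter */

/* Enable and reset the cycle counter. Only meaningful on the target. */
void dwt_start(void)
{
    DEMCR |= DEMCR_TRCENA;
    DWT_CYCCNT = 0u;
    DWT_CTRL |= DWT_CTRL_CYCCNTENA;
}

/* Read the current counter value. Only meaningful on the target. */
uint32_t dwt_read(void)
{
    return DWT_CYCCNT;
}

/* Cycles between two readings; unsigned subtraction stays correct
 * across a single 32-bit wrap of the counter. */
uint32_t dwt_elapsed(uint32_t start, uint32_t stop)
{
    return stop - start;
}
```

On the target, one would call `dwt_start()` before the marked code, take one reading with `dwt_read()` before and one after it, and pass both readings to `dwt_elapsed()`.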

5. Results and Analysis
The data and results presented in this section are based on the Atmel SAM G55 DSC with the Cortex-M4 core and FPU.

5.1 Algorithms and filter design
The algorithms used to answer RQ1, RQ2, and RQ3 are the FIR filter and the IIR filter. With such filters, there are some key parameters that need to be considered.
- The sampling rate
- The number of taps
- The pass-band
- The stop-band

In this thesis, the tool used to calculate the FIR and IIR filter coefficients is the Filter Designer tool, a MATLAB GUI for designing and analyzing filters. The benefit of this tool is its easy-to-use GUI, which lets the user design digital FIR and IIR filters by setting the parameters (sampling rate, pass-band and stop-band) listed above. The two filters have been designed as low-pass filters with the following parameters.

Filter  Apass  Astop  Fpass    Fstop    Sampling rate  Number of taps received
FIR     1 dB   40 dB  1100 Hz  2000 Hz  [...] Hz       23
IIR     1 dB   40 dB  1100 Hz  2000 Hz  [...] Hz       5

Table 3. The parameters used in FIR- and IIR filter.

The Filter Designer tool gives a magnitude response overview of the design that was created. This makes it possible to evaluate whether the design meets the required specifications, and in this case the requirements were met.
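For orientation, a direct-form FIR filter of the kind designed here can be sketched as below. This is a plain C illustration, not the CMSIS-DSP routine and not the thesis's implementation; the names `fir_state_t`, `fir_init` and `fir_step` and the bound `FIR_MAX_TAPS` are hypothetical, and the 23 coefficients produced by the Filter Designer tool are not reproduced.

```c
#include <stddef.h>

#define FIR_MAX_TAPS 32  /* hypothetical upper bound, large enough for 23 taps */

typedef struct {
    const float *coeffs;          /* filter coefficients h[0..num_taps-1] */
    size_t num_taps;
    float history[FIR_MAX_TAPS];  /* most recent inputs, history[0] is newest */
} fir_state_t;

/* Prepare the filter; returns 0 on success, -1 for an unsupported tap count. */
int fir_init(fir_state_t *f, const float *coeffs, size_t num_taps)
{
    if (num_taps == 0 || num_taps > FIR_MAX_TAPS) return -1;
    f->coeffs = coeffs;
    f->num_taps = num_taps;
    for (size_t i = 0; i < FIR_MAX_TAPS; i++) f->history[i] = 0.0f;
    return 0;
}

/* Process one sample: y[n] = sum over k of h[k] * x[n-k]. */
float fir_step(fir_state_t *f, float x)
{
    /* Shift the delay line by one sample and insert the new input. */
    for (size_t i = f->num_taps - 1; i > 0; i--) f->history[i] = f->history[i - 1];
    f->history[0] = x;

    float acc = 0.0f;
    for (size_t k = 0; k < f->num_taps; k++) acc += f->coeffs[k] * f->history[k];
    return acc;
}
```

Each output sample costs one multiply-accumulate per tap, which is why the number of taps in table 3 has a direct effect on the execution time measured later in this chapter.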

Figure 7A. FIR filter magnitude response
Figure 7B. IIR filter magnitude response

At this point, the FIR and IIR filter coefficients have been created and are then implemented in Atmel Studio. The last step is to generate a sinusoidal signal with some interference. This signal is also created in MATLAB and is the input from which the IIR and FIR filters remove the interference. The interference consists of sinusoids whose frequencies can be seen in the FFT spectrum in figure 8.

Figure 8. FFT generated by MATLAB with Frequency 800, 2500, 4500 Hz

The implementation of the IIR filter with Fixed-point was a special case. It was created as a double-section filter, which in turn means that the number of taps is 3. A double-section filter is intended to function in the same manner as a single-section filter; the difference is that the output of the first section becomes the input of the second section. This is illustrated in the figure below.
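The double-section structure described above, in which the output of the first section feeds the second, can be sketched in C as follows. The sketch uses floating point for readability, whereas the thesis used this structure for the Fixed-point case; the type `biquad_t`, the function names and the sign convention for the feedback coefficients are assumptions, and conventions for those signs differ between tools.

```c
/* One second-order section in Direct Form I. */
typedef struct {
    float b0, b1, b2;  /* feed-forward coefficients */
    float a1, a2;      /* feedback coefficients (a0 is assumed to be 1) */
    float x1, x2;      /* previous two inputs */
    float y1, y2;      /* previous two outputs */
} biquad_t;

/* y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2] */
float biquad_step(biquad_t *s, float x)
{
    float y = s->b0 * x + s->b1 * s->x1 + s->b2 * s->x2
            - s->a1 * s->y1 - s->a2 * s->y2;
    s->x2 = s->x1; s->x1 = x;  /* update the input history */
    s->y2 = s->y1; s->y1 = y;  /* update the output history */
    return y;
}

/* Two sections in series: the output of the first feeds the second. */
float iir_two_section_step(biquad_t *first, biquad_t *second, float x)
{
    return biquad_step(second, biquad_step(first, x));
}
```

Splitting a filter into cascaded low-order sections is a common way to reduce sensitivity to coefficient rounding, which fits the motivation given for this implementation.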

Figure 9. Double section IIR filter

The reason for using this implementation is the low accuracy of the Fixed-point coefficients, which makes the IIR filter unstable, as mentioned by Savita Rani [30]. The filter coefficients were tested on another development board based on the Cortex-M4 core, because the SAMG55 lacks a Digital-to-Analog Converter (DAC).

5.2 Energy monitoring
To monitor the energy in the SAM G55 DSC, two tools were used: the Data visualizer and the Power debugger. The first tool is the Power debugger, which can use both the JTAG and the SWD interface to target the SAM G55 DSC. The main focus is on the SWD (programming and debugging) interface together with the two current sensing channels (power measurement) on the Atmel debugger (figure 10).

Figure 10. Logical Construction of the Power Debugger [25]

A benefit of the Cortex-M4 is its capability to collect data at cycle-by-cycle resolution with the data watchpoint and trace unit (DWT); the data is then shown in Atmel Studio. This has made it possible to identify some particularly energy-consuming spots in the embedded system. The second tool is the Data visualizer, which is based on the ADP. The purpose of the ADP protocol is to transfer data from a target MCU to the user's PC. This is done through the Cortex debug header on the SAM G55. In this project, the data was transferred from the DSC to the PC through the Power debugger. Figure 11 shows the wiring for the SAMG55 MCU.

Figure 11. The wiring diagram [25]

A great benefit is that the Data visualizer can be integrated with the GNU C/C++ compiler and debugger, which makes it easier to monitor the embedded system. It is also important for the monitoring tools to have the same interface as the embedded system; otherwise they are not compatible with each other unless a new interface is implemented. This was not an issue in this project since the SAM G55 has the same interface as the Atmel Studio Data visualizer, which is the ADP mentioned above. In short, the ADP protocol is very important in this project because a large set of data is transferred from the DSC to the host PC.

To measure the power consumption accurately with the Data visualizer, the following three areas are important to monitor.
- The Active Mode
- The Standby Mode
- The Sample Area (Active Mode + Standby Mode)

The Active mode is the part of the Sample area where the interrupt code is executed, while the Standby mode is the time where no code is executed. The Sample area is important because it makes it possible to monitor the overall power consumption in a sample. These three areas are illustrated in figure 12 below.

Figure 12. The power measuring areas

The approach below for measuring the three areas (Active mode, Standby mode, and Sample area) is chosen in order to disregard the capacitors located between the MCU voltage supply headers and the current measurement headers. The Active mode is monitored in five steps. The first step is to increase the sample frequency so that almost no time is spent in Standby mode (the interrupt latency time, section 5.4). The second step is to record the average current and the average power from the Data visualizer. The third step is the in-system debugging, which records the number of cycles it takes to execute the Active mode. Step four is to convert the number of cycles into time (Eq. 7). Step five is to multiply the time in Active mode by the average power to get the energy spent in Active mode (Eq. 8).

\[ \frac{1}{\text{MCU clock frequency}} \times \text{DWT cycles} = \text{Time in Active mode} \tag{Eq. 7} \]

\[ \text{Time in Active mode} \times \text{Average power in Active mode} = \text{Total energy in Active mode} \tag{Eq. 8} \]

The power consumption in the Standby mode area is monitored in four steps. The first step is to set the device in sleep mode. The second step is to record the average current and the average power from the Data visualizer. The third step is to get the time spent in Standby mode (Eq. 9). The last step is to multiply the time in Standby mode by the average power to get the energy spent in Standby mode (Eq. 10).
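The measurement steps above, together with equations 7 to 12, amount to a few lines of arithmetic, sketched here in C. The function names are hypothetical, and all numbers in the usage note are made-up illustration values, not measurements from the thesis.

```c
/* Eq. 7: time in Active mode (s) from DWT cycles and the MCU clock (Hz). */
double active_time_s(unsigned long dwt_cycles, double mcu_clock_hz)
{
    return (double)dwt_cycles / mcu_clock_hz;
}

/* Eq. 8 and Eq. 10: energy (J) = time (s) * average power (W). */
double energy_j(double time_s, double avg_power_w)
{
    return time_s * avg_power_w;
}

/* Eq. 9: time in Standby mode (s) = sample period - time in Active mode. */
double standby_time_s(double sample_freq_hz, double active_s)
{
    return 1.0 / sample_freq_hz - active_s;
}

/* Eq. 11 and Eq. 12: average power (W) over one sample period. */
double sample_avg_power_w(double e_active_j, double e_standby_j,
                          double sample_freq_hz)
{
    double total_energy_j = e_active_j + e_standby_j;  /* Eq. 11 */
    return total_energy_j * sample_freq_hz;            /* Eq. 12: divide by 1/f */
}
```

As a hypothetical example, 12000 cycles at a 120 MHz clock give 100 μs in Active mode; with a 1 kHz sample frequency that leaves 900 μs in Standby mode, and with assumed average powers of 60 mW (Active) and 30 mW (Standby) the Sample area energy is 6 μJ + 27 μJ = 33 μJ, or 33 mW on average.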

\[ \frac{1}{\text{Sample frequency}} - \text{Time in Active mode} = \text{Time in Standby mode} \tag{Eq. 9} \]

\[ \text{Time in Standby mode} \times \text{Average power in Standby mode} = \text{Total energy in Standby mode} \tag{Eq. 10} \]

The total energy in the Sample area is computed by adding the energy from the Standby mode and the Active mode (Eq. 11). The calculation of the average power in the Sample area is shown in equation 12.

\[ \text{Total energy in Active mode} + \text{Total energy in Standby mode} = \text{Total energy in Sample area} \tag{Eq. 11} \]

\[ \frac{\text{Total energy in Sample area}}{\text{Sample time}} = \text{Average power} \tag{Eq. 12} \]

5.3 Enabling the FPU
There are different ways to enable the FPU depending on the MCU. The SAM G55 uses a processor made by Atmel, and these steps were necessary:
1. Make sure that the symbol ARM_MATH_CM4 = true can be found in the compiler settings.
2. Add two flags to both the compiler and the linker: -mfloat-abi=hard -mfpu=fpv4-sp-d16
3. Include the arm_math header in main.c.
4. Call the fpu_enable() function in main.
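What `fpu_enable()` does internally is not shown in the thesis. On Cortex-M4 parts, enabling the FPU generally comes down to granting full access to coprocessors CP10 and CP11 in the Coprocessor Access Control Register (CPACR). The following sketch illustrates that register operation; the function names are hypothetical and this is not the vendor implementation.

```c
#include <stdint.h>

#define CPACR_ADDR           0xE000ED88u   /* Coprocessor Access Control Register */
#define CPACR_CP10_CP11_FULL (0xFu << 20)  /* full access for CP10 and CP11 */

/* Pure helper: the value CPACR must hold for the FPU to be usable. */
uint32_t cpacr_with_fpu(uint32_t current)
{
    return current | CPACR_CP10_CP11_FULL;
}

/* Nonzero if the given CPACR value grants full FPU access. */
int cpacr_fpu_enabled(uint32_t value)
{
    return (value & CPACR_CP10_CP11_FULL) == CPACR_CP10_CP11_FULL;
}

/* Target-only: apply the setting to the real register. */
void fpu_enable_sketch(void)
{
    volatile uint32_t *cpacr = (volatile uint32_t *)(uintptr_t)CPACR_ADDR;
    *cpacr = cpacr_with_fpu(*cpacr);
    /* On hardware, data and instruction barriers (DSB, ISB) normally follow. */
}
```

The compiler flags in step 2 only make the compiler emit FPU instructions; without the register setting, executing such an instruction on the target would fault, which is why both steps are needed.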

5.4 Results of the controlled experiments
The conclusive results obtained through experimentation in a controlled environment can be seen below. The Active current, the Sleep mode, and the energy efficiency mentioned in section 2.5 can also be seen in the subsections below.

The Standby mode average current:
- 11.18 mA, FPU disabled
- 11.46 mA, FPU enabled

For the results in tables 4A-4C, it is important to consider the accuracy of the power debugger, which is explained in chapter 2.8, and the interrupt latency in chapter 2.9. The latency time (entry plus exit) was measured and is around:
- optimization -O0, FPU disabled: 407 ns
- optimization -O0, FPU enabled: 407 ns
- optimization -O1, FPU disabled: 266 ns
- optimization -O1, FPU enabled: 340 ns

During the latency time the MCU is not put into sleep mode, which makes the measurements of the Active mode more accurate. The current measured during the latency time varies with the optimization and with whether the FPU is enabled or disabled; one bound of the measured range is 24.48 mA. The effect of the latency is at worst ~0.88%, which can be calculated with the equations below.

\[ \frac{(\text{active mode time} + \text{latency time}) \times \text{average current} - \text{latency current} \times \text{latency time}}{\text{active mode time}} = X_{\text{current}} \tag{Eq. 13} \]

\[ \left(1 - \frac{\text{average current}}{X_{\text{current}}}\right) \times 100 = \text{error in \%} \tag{Eq. 14} \]
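Equations 13 and 14 can be sketched in C as below. The function names are hypothetical, and the numbers in the usage note are illustration values only, since the measured currents are not all available here.

```c
/* Eq. 13: average current of the Active mode alone, with the latency
 * contribution removed from the measured average. Units only need to be
 * consistent (for example mA for currents and ns for times). */
double corrected_active_current(double active_time, double latency_time,
                                double avg_current, double latency_current)
{
    return ((active_time + latency_time) * avg_current
            - latency_current * latency_time) / active_time;
}

/* Eq. 14: relative error in percent caused by the latency. */
double latency_error_percent(double avg_current, double x_current)
{
    return (1.0 - avg_current / x_current) * 100.0;
}
```

With assumed values of 100 time units in Active mode, 1 time unit of latency, a measured average of 20 mA and a latency current of 25 mA, Eq. 13 gives 19.95 mA and Eq. 14 an error of about -0.25%; the sign only indicates whether the latency current lies above or below the Active mode current.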

Table 4A. Complex composition of the conclusive results for software Floating-point
(Columns: Filter, Opt., Area, DWT cycles, Avg. Current (mA), Avg. Power (mW), Time (μs), Energy (μJ). Rows: FIR and IIR filter, each with -O0 and -O1, each with an Active and a Sample area; every Sample row has two N/A entries.)

Table 4B. Complex composition of the conclusive results for FPU
(Same columns and rows as table 4A.)

2. This value is received from the debugger

Table 4C. Complex composition of the conclusive results for Fixed-point
(Columns: Filter, Opt., Area, DWT cycles, Avg. Current (mA), Avg. Power (mW), Time (μs), Energy (μJ). Rows: FIR and IIR filter, each with -O0 and -O1, each with an Active and a Sample area; every Sample row has two N/A entries.)

5.5 The power consumption when using optimization -O0 and -O1 with Software Floating-Point, Fixed-Point and FPU
The measurements presented in this project are based on the energy consumption and the number of cycles executed. These measurements are for well-documented algorithms such as the FIR filter and the IIR filter. The results shown in tables 4A-C were obtained with the Atmel data visualizer, the Atmel power debugger and the DWT unit. A benefit of the SAM G55 DSC is that it has three low-power modes: backup, wait and sleep. In sleep mode, when it is used correctly, the core clock is stopped while all the other functions are able to keep running [35]. In this experiment, the sleep mode is implemented and used to reduce the power consumption. In the subsections below the main focus is on the average power in the Sample area and the execution time in the Active mode with optimization -O0 and -O1.

5.5.1 The power consumption in FIR filter
The power consumption varies between the two optimizations -O0 and -O1. The values are based on the FIR filter with Software Floating-Point, FPU and Fixed-Point.

Software Floating-Point:
- Active mode time difference: ~32.6%
- Sample area average power difference: ~26.7%
FPU enabled:
- Active mode time difference: ~17.6%
- Sample area average power difference: ~11.1%
Fixed-Point:
- Active mode time difference: ~20.4%
- Sample area average power difference: ~11.9%

For the three options (Fixed-point, FPU and Software Floating-Point) above it is clear that with optimization -O1 the power consumption is reduced by 10% to 25% compared to -O0 and the execution time is reduced by 16% to 29%.

5.5.2 The power consumption in IIR filter
The power consumption also varies for the IIR filter depending on the optimization (-O0 or -O1) chosen.

Software Floating-Point:
- Active mode time difference: ~46%
- Sample area average power difference: ~17.5%
FPU enabled:
- Active mode time difference: ~38%
- Sample area average power difference: ~6.9%
Fixed-Point:
- Active mode time difference: ~41.4%
- Sample area average power difference: ~11.1%

As with the FIR filter, it was more beneficial to use optimization -O1, where the power consumption is reduced by 6% to 17% compared to -O0 and the execution time is reduced by 30% to 38%.

5.6 Comprehensive analysis
When following the workflow of chapter 5, a pattern was noticed in the FIR and IIR filter values obtained with -O0 and -O1; it can be traced back to the preceding subsections on the FIR and the IIR filter. The pattern can be seen in the Active mode time and the Sample area power consumption, where the measured values are lower with -O1 than with -O0. Based on the charts below one can see that the execution time and the power consumption are tightly related to each other. The FPU ensures faster Floating-point calculations, which in turn decreases the time in Active mode and increases the time in Standby mode. Since the DSC is not executing any code in Standby mode, this results in an overall reduced power consumption.

Chart 1. FIR Sample area power consumption (y-axis: Energy (nJ) in Sample area), execution time in Active mode (x-axis: FIR time (μs) in Active mode)

Chart 1 shows the total energy for the FIR filter. An interesting point is that the FIR filter with optimization -O1 executes in less time and with less total energy consumption in the Sample area. When comparing the three options (FPU, software Floating-point and Fixed-point) in chart 1, it becomes clear that when Floating-point is used it costs less energy to enable the FPU. When Fixed-point is compared to the FPU, there are no major differences in the total accumulated energy consumption in the Sample area once the accuracy of the power debugger and the interrupt latency are taken into consideration. When relating the energy consumption to the execution time in the chart above, a longer execution time uses more energy; however, this does not apply when comparing the FPU with Fixed-point at the same optimization. Fixed-point takes more time to execute the code than the FPU, yet the energy consumption is approximately the same.

Chart 2. IIR Sample area power consumption (y-axis: Energy (nJ) in Sample area), execution time in Active mode (x-axis: IIR time (μs) in Active mode)

In the results for the IIR filter measurements seen in chart 2, it is obvious that the FPU, the software Floating-point and the Fixed-point implementations execute faster and consume less energy with -O1. When comparing these three options (FPU, software Floating-point and Fixed-point) with each other at -O0, it is clear that the FPU consumes less energy than the other two options and that its execution time is also shorter. At -O1 the FPU executes faster than the other two options; however, the difference in total energy consumption between Fixed-point and FPU is indistinguishable when the measurement error is taken into account.

Chart 3. FIR Sample area, average power consumption (mW)

          Software Floating-point   FPU    Fixed-point
Opt -O0   [...].6                   54.2   54.2
Opt -O1   70.8                      48.5   48.1

Chart 3 shows that the power consumption at -O0 with software Floating-point is 71% higher than the power consumption of the FPU, while at -O1 it is 46% higher. Optimization -O1 has benefited software Floating-point more than the FPU and Fixed-point, by around 15 mW, but software Floating-point is still inferior to the FPU and Fixed-point.

Chart 4. IIR Sample area, average power consumption (mW)

          Software Floating-point   FPU    Fixed-point
Opt -O0   [...].7                   46.2   49.3
Opt -O1   50.1                      43.1   44.1

For the IIR filter it has been a challenge to distinguish which of the two options (FPU and Fixed-point) has the higher power consumption at -O1, because of the small measurement error that exists. However, even in this case, software Floating-point has proven to be inferior to the other two options.
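The thesis does not state which formula lies behind the "difference" percentages in sections 5.5.1 and 5.5.2. As an inference only, a symmetric percent difference (the gap relative to the mean of the two values) reproduces the reported figures for the fully legible chart values, while the plain reduction relative to -O0 matches the "reduced by" ranges. The sketch below shows both; the function names are hypothetical.

```c
/* Symmetric percent difference: |a - b| relative to the mean of a and b. */
double percent_difference(double a, double b)
{
    double diff = a > b ? a - b : b - a;
    return diff / ((a + b) / 2.0) * 100.0;
}

/* Percent reduction when going from `before` to `after`. */
double percent_reduction(double before, double after)
{
    return (before - after) / before * 100.0;
}
```

For the FIR filter with the FPU, the chart 3 values 54.2 mW (-O0) and 48.5 mW (-O1) give a symmetric difference of about 11.1% and a reduction of about 10.5%, which agrees with the ~11.1% listed in section 5.5.1 and with the lower end of the 10% to 25% range.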

ELC4438: Embedded System Design ARM Embedded Processor

ELC4438: Embedded System Design ARM Embedded Processor ELC4438: Embedded System Design ARM Embedded Processor Liang Dong Electrical and Computer Engineering Baylor University Intro to ARM Embedded Processor (UK 1990) Advanced RISC Machines (ARM) Holding Produce

More information

VIII. DSP Processors. Digital Signal Processing 8 December 24, 2009

VIII. DSP Processors. Digital Signal Processing 8 December 24, 2009 Digital Signal Processing 8 December 24, 2009 VIII. DSP Processors 2007 Syllabus: Introduction to programmable DSPs: Multiplier and Multiplier-Accumulator (MAC), Modified bus structures and memory access

More information

NXP Unveils Its First ARM Cortex -M4 Based Controller Family

NXP Unveils Its First ARM Cortex -M4 Based Controller Family NXP s LPC4300 MCU with Coprocessor: NXP Unveils Its First ARM Cortex -M4 Based Controller Family By Frank Riemenschneider, Editor, Electronik Magazine At the Electronica trade show last fall in Munich,

More information

ARM ARCHITECTURE. Contents at a glance:

ARM ARCHITECTURE. Contents at a glance: UNIT-III ARM ARCHITECTURE Contents at a glance: RISC Design Philosophy ARM Design Philosophy Registers Current Program Status Register(CPSR) Instruction Pipeline Interrupts and Vector Table Architecture

More information

Head, Dept of Electronics & Communication National Institute of Technology Karnataka, Surathkal, India

Head, Dept of Electronics & Communication National Institute of Technology Karnataka, Surathkal, India Mapping Signal Processing Algorithms to Architecture Sumam David S Head, Dept of Electronics & Communication National Institute of Technology Karnataka, Surathkal, India sumam@ieee.org Objectives At the

More information

2-bit ARM Cortex TM -M3 based Microcontroller FM3 Family MB9A130 Series

2-bit ARM Cortex TM -M3 based Microcontroller FM3 Family MB9A130 Series 3 2-bit ARM Cortex TM -M3 based Microcontroller FM3 Family Ten products from the Ultra-low Leak group have been added to the lineup as the third group of products from the 32-bit microcontroller FM3 Family.

More information

CS6303 COMPUTER ARCHITECTURE LESSION NOTES UNIT II ARITHMETIC OPERATIONS ALU In computing an arithmetic logic unit (ALU) is a digital circuit that performs arithmetic and logical operations. The ALU is

More information

Ten Reasons to Optimize a Processor

Ten Reasons to Optimize a Processor By Neil Robinson SoC designs today require application-specific logic that meets exacting design requirements, yet is flexible enough to adjust to evolving industry standards. Optimizing your processor

More information

Implementing Biquad IIR filters with the ASN Filter Designer and the ARM CMSIS DSP software framework

Implementing Biquad IIR filters with the ASN Filter Designer and the ARM CMSIS DSP software framework Implementing Biquad IIR filters with the ASN Filter Designer and the ARM CMSIS DSP software framework Application note (ASN-AN05) November 07 (Rev 4) SYNOPSIS Infinite impulse response (IIR) filters are

More information

Hello and welcome to this Renesas Interactive module that provides an architectural overview of the RX Core.

Hello and welcome to this Renesas Interactive module that provides an architectural overview of the RX Core. Hello and welcome to this Renesas Interactive module that provides an architectural overview of the RX Core. 1 The purpose of this Renesas Interactive module is to introduce the RX architecture and key

More information

Chapter 15 ARM Architecture, Programming and Development Tools

Chapter 15 ARM Architecture, Programming and Development Tools Chapter 15 ARM Architecture, Programming and Development Tools Lesson 07 ARM Cortex CPU and Microcontrollers 2 Microcontroller CORTEX M3 Core 32-bit RALU, single cycle MUL, 2-12 divide, ETM interface,

More information

Ali Karimpour Associate Professor Ferdowsi University of Mashhad

Ali Karimpour Associate Professor Ferdowsi University of Mashhad AUTOMATIC CONTROL SYSTEMS Ali Karimpour Associate Professor Ferdowsi University of Mashhad Main reference: Christopher T. Kilian, (2001), Modern Control Technology: Components and Systems Publisher: Delmar

More information

REAL TIME DIGITAL SIGNAL PROCESSING

REAL TIME DIGITAL SIGNAL PROCESSING REAL TIME DIGITAL SIGNAL PROCESSING UTN-FRBA 2010 Introduction Why Digital? A brief comparison with analog. Advantages Flexibility. Easily modifiable and upgradeable. Reproducibility. Don t depend on components

More information

Implementing FIR Filters

Implementing FIR Filters Implementing FIR Filters in FLEX Devices February 199, ver. 1.01 Application Note 73 FIR Filter Architecture This section describes a conventional FIR filter design and how the design can be optimized

More information

CHAPTER 1 Numerical Representation

CHAPTER 1 Numerical Representation CHAPTER 1 Numerical Representation To process a signal digitally, it must be represented in a digital format. This point may seem obvious, but it turns out that there are a number of different ways to

More information

CMPSCI 145 MIDTERM #1 Solution Key. SPRING 2017 March 3, 2017 Professor William T. Verts

CMPSCI 145 MIDTERM #1 Solution Key. SPRING 2017 March 3, 2017 Professor William T. Verts CMPSCI 145 MIDTERM #1 Solution Key NAME SPRING 2017 March 3, 2017 PROBLEM SCORE POINTS 1 10 2 10 3 15 4 15 5 20 6 12 7 8 8 10 TOTAL 100 10 Points Examine the following diagram of two systems, one involving

More information

AVR XMEGA TM. A New Reference for 8/16-bit Microcontrollers. Ingar Fredriksen AVR Product Marketing Director

AVR XMEGA TM. A New Reference for 8/16-bit Microcontrollers. Ingar Fredriksen AVR Product Marketing Director AVR XMEGA TM A New Reference for 8/16-bit Microcontrollers Ingar Fredriksen AVR Product Marketing Director Kristian Saether AVR Product Marketing Manager Atmel AVR Success Through Innovation First Flash

More information

ECE4703 B Term Laboratory Assignment 2 Floating Point Filters Using the TMS320C6713 DSK Project Code and Report Due at 3 pm 9-Nov-2017

ECE4703 B Term Laboratory Assignment 2 Floating Point Filters Using the TMS320C6713 DSK Project Code and Report Due at 3 pm 9-Nov-2017 ECE4703 B Term 2017 -- Laboratory Assignment 2 Floating Point Filters Using the TMS320C6713 DSK Project Code and Report Due at 3 pm 9-Nov-2017 The goals of this laboratory assignment are: to familiarize

More information

Representation of Numbers and Arithmetic in Signal Processors

Representation of Numbers and Arithmetic in Signal Processors Representation of Numbers and Arithmetic in Signal Processors 1. General facts Without having any information regarding the used consensus for representing binary numbers in a computer, no exact value

More information

An introduction to Digital Signal Processors (DSP) Using the C55xx family

An introduction to Digital Signal Processors (DSP) Using the C55xx family An introduction to Digital Signal Processors (DSP) Using the C55xx family Group status (~2 minutes each) 5 groups stand up What processor(s) you are using Wireless? If so, what technologies/chips are you

More information

Floating-point to Fixed-point Conversion. Digital Signal Processing Programs (Short Version for FPGA DSP)
