Power Measurements using performance counters


CSL862: Low-Power Computing
By Suman A M (2015SIY7524)

Android Power Consumption

Smartphones are powered by batteries that are limited in size and therefore in capacity, so power management in smartphones is of great concern.

Objective: To monitor the power, performance and frequency of the CPU and GPU while running various benchmarks, in order to gather data.

Platform Specifications:
Phone Model : Mi4i
Processor : Qualcomm MSM8939
GPU : Qualcomm Adreno 405
System RAM : 2GB
CPU Cores : 8x ARM Cortex-A53
OS : Android 5.0

Figure 1: Test Platform Mi4i Smartphone

Tools used:

Benchmark Apps: Benchmark apps are standard application programs used to stress various resources of a system.

SPEC CPU: SPEC CPU is an industry-standardized, CPU-intensive benchmark suite that stresses a system's processor, memory subsystem and compiler. It contains 12 different benchmark tests. It provides a comparative measure of compute-intensive performance across the widest practical range of hardware using workloads developed from real user applications.

PerformanceTest Mobile: PerformanceTest Mobile is an Android application used to benchmark an Android device with a variety of speed tests and to compare the results with other devices. It consists of various benchmarking tests.

Experiment:

Setup: The setup uses an app, HWMonitor pro, which reports the data shown in the figure at an interval of 0.5 seconds. The data is acquired on a PC through the remote IP interface provided by the app. It can be logged in CSV format, which is then imported into MS Excel for analysis and charts.

The experiment is conducted by running the HWMonitor pro app in the background, where it collects the performance and power data and transmits it to the remote PC over IP. The power is measured by monitoring the drain current from the battery, as reported by the hardware measuring circuits present in the smartphone.

Initially, the system's idle power is measured with the phone in screen-off mode, where all the CPUs are in the sleep state. When the phone wakes, the LCD turns on and the LCD power is recorded; this forms the base power of the smartphone. The benchmark apps are then run in sequence, one by one, with an interval of 2 seconds between two benchmarks.

SPEC CPU: The SPEC CPU benchmarks are executed on the smartphone using an Android APK. The results of the experiment show that all these benchmarks consume about the same power, around 2.5 to 3 W, because each is assigned to a single core by the scheduler and there is no control to allocate more cores to the process. Thus all show similar power values.

Figure 2: Power values of SPEC CPU Benchmarks (benchmarks shown: MCF, vortex, bzip2, crafty, gcc; vertical axis: Power (W))
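The measurement path described above (battery drain current logged every 0.5 seconds, then analysed from a CSV file) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names, the presence of a battery-voltage column, and all sample values are hypothetical, since the actual HWMonitor pro log layout is not given here. Power is taken as voltage times current, and energy is approximated by summing power over the sampling interval.

```python
import csv
import io

SAMPLE_INTERVAL_S = 0.5  # logging interval stated in the setup

# Illustrative log only: column names and values are made up for this sketch
# and do not reproduce the real app's CSV format or any measured data.
SAMPLE_LOG = """time_s,battery_voltage_v,drain_current_ma
0.0,4.0,200
0.5,4.0,250
1.0,4.0,1500
1.5,4.0,1750
"""


def power_samples(csv_text):
    """Return one power value in watts per logged row (P = V * I)."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        float(r["battery_voltage_v"]) * float(r["drain_current_ma"]) / 1000.0
        for r in rows
    ]


def energy_joules(powers, interval_s=SAMPLE_INTERVAL_S):
    """Approximate energy as sum(P * dt), holding each sample for one interval."""
    return sum(p * interval_s for p in powers)


powers = power_samples(SAMPLE_LOG)
print("power per sample (W):", powers)
print("approximate energy (J):", energy_joules(powers))
```

With a real log, only the column names in `power_samples` would need to change to match the exported file.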

PassMark Benchmark: The PassMark software provides a set of standard tests for benchmarking the performance of an Android smartphone.

Benchmarks:

CPU Tests
Integer Math Operations - The test uses a large set of random 32-bit and 64-bit integers and adds, subtracts, multiplies and divides these numbers.
Floating Point Math Operations - Similar to the integer test, it performs the same operations on floating-point data.
Prime Number Test - This algorithm uses loops and CPU operations that are common in computer software, and determines the speed at which numbers can be compared with other numbers.
String Sorting Test - The String Sorting Test uses the qsort algorithm to see how fast the CPU can sort strings.
Data Encryption Test - The Encryption Test encrypts blocks of random data using the Blowfish algorithm. This test uses many of the techniques in the math tests, but also uses a large amount of binary data manipulation and CPU mathematical functions like 'to the power of'.
Data Compression Test - The compression test uses an adaptive encoding algorithm based on a method described by Ian H. Witten, Radford M. Neal, and John G. Cleary in an article called "Arithmetic Coding for Data Compression". The system uses a model which maintains the probability of each symbol being the next one encoded.

Memory Tests
Memory Read Test and Memory Write Test - A memory file is created in RAM. The size of the memory file is determined by the amount of RAM currently available.

GPU Stress Test
Three helicopters travel around a scene featuring 150 trees on a terrain with water. The terrain is formed with 5,000 triangular polygons (15,000 vertices). The trees and water consist of 2 polygons (6 vertices), and each helicopter object consists of 1,120 triangular polygons (3,360 vertices). Lighting and alpha blending are enabled.

The following parameters are recorded for analysis:
CPU Frequency (Core 0, Core 4)
CPU Utilization
GPU Frequency
Power

Figure 3: Performance chart of benchmarks

The power consumption of the various benchmarks is plotted in fig. 3. The data consists of the minimum, maximum and average power consumed by each benchmark. From the experiment, the following observations are made for each benchmark.

Figure 4: Power vs CPU Utilisation graph

Idle: The idle power of the smartphone is very low, 0.18 W. When the phone is asleep it disables all the processing modes and uses only essential resources such as GSM and 3G connectivity. Power is consumed only when there is a data transaction with the network.
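The minimum, maximum and average power per benchmark mentioned above can be derived from the logged samples once each sample is labelled with the benchmark that was running. The sketch below shows one way to do this in Python. The labels and power values are illustrative placeholders, not the measured data, and how samples get their labels (for example, from benchmark start and stop times) is left as an assumption.

```python
from collections import defaultdict

# Illustrative (benchmark label, power in W) samples; not measured data.
SAMPLES = [
    ("idle", 0.2),
    ("idle", 0.1),
    ("string_sort", 7.0),
    ("string_sort", 8.0),
    ("string_sort", 7.5),
]


def summarize(samples):
    """Group power samples by benchmark and compute min, max and average."""
    grouped = defaultdict(list)
    for label, watts in samples:
        grouped[label].append(watts)
    return {
        label: {
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        }
        for label, values in grouped.items()
    }


summary = summarize(SAMPLES)
for label, stats in summary.items():
    print(label, stats)
```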

LCD: The LCD is the most power-consuming part of the smartphone, since it always remains powered and consumes constant power independent of the application load. It consumed 0.8 W of continuous power until an operation was performed on the phone.

Math Operations (Fixed point): This test runs a set of math operations and computes the results. It is a highly CPU-intensive process: the CPU utilisation is seen to be 98%, and the power consumed is 6.7 W, which is mainly consumed by the CPU.

Math Operations (Floating point): The floating-point operations consumed 4.7 W, which is less than the fixed-point math operations. This can be related to the reduction in CPU usage (85%), mainly due to floating-point operations being performed by the FPU and thus consuming less power.

Prime Number: As prime number detection has many loops and memory operations, the CPU is not fully utilised, and the test consumes 5.7 W.

String Sorting: String sorting uses the Quick sort algorithm, which utilizes the CPU to the maximum, as the sorting is not limited anywhere by memory operations. The CPU is utilised to its full strength to compute the results, and this test thus has the highest power of 7.5 W.

Data Encryption: As with prime number detection, the CPU utilisation is low because data encryption has a memory dependence. It consumed 6.4 W, which is less than string sorting.

Figure 5: CPU Utilisation for various Benchmarks

Figure 6: Average, Min and Max Power consumption for various Benchmarks (vertical axis: power drain from battery, in W)

Memory Read and Write: The memory operations transfer a block file from or to RAM. The CPU utilisation is very low (~14%), as it is a purely memory-bound operation. Thus the power consumption of 2.1 W can be inferred to be consumed by the memory controller only. It is visible from the graph that write cycles consume more power than read cycles, as a memory write is a power-intensive operation.

GPU Test: The GPU test stressed the GPU with its 3D vector processing operations. It consumed 3.3 W of power, and the CPU utilisation is very low (11%), showing that the power is being utilised mainly by the GPU. It is also seen from fig. 7 that the GPU frequency increased to 550 MHz from the 400 MHz used in all other operations.
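For a quick side-by-side view, the power and CPU utilisation figures quoted in the observations above can be collected and ranked. The Python sketch below uses only the values stated in the text; utilisation is left as None where no percentage is given. The variable and function names are illustrative.

```python
# (benchmark, reported power in W, reported CPU utilisation in %, or None if not stated)
REPORTED = [
    ("Integer math", 6.7, 98),
    ("Floating-point math", 4.7, 85),
    ("Prime number", 5.7, None),
    ("String sorting", 7.5, None),
    ("Data encryption", 6.4, None),
    ("Memory read/write", 2.1, 14),
    ("GPU test", 3.3, 11),
]


def rank_by_power(rows):
    """Return the rows sorted from highest to lowest reported power."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


ranked = rank_by_power(REPORTED)
for name, watts, util in ranked:
    util_text = f"{util}%" if util is not None else "not stated"
    print(f"{name:22s} {watts:4.1f} W   CPU utilisation: {util_text}")
```

The ranking places the CPU-bound tests at the top and the memory and GPU tests at the bottom, which is the pattern the observations describe.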

Figure 7: GPU Frequency for various Benchmarks

Conclusion: The power distribution for the various benchmark apps was analysed. From the results we can infer that operations that utilise the CPU consume the most power, with a peak of 7.5 W in the string sorting operation. Therefore the CPU has the highest power rating, based on operating frequency. The LCD consumes a continuous power of 0.8 W throughout the use of a smartphone, whereas the CPU draws bursts of peak power depending on the operations. The GPU and memory controllers consume considerably less power than the CPU and contribute less to the power consumption of the system.

COL862 Programming Assignment-1

COL862 Programming Assignment-1 Submitted By: Rajesh Kedia (214CSZ8383) COL862 Programming Assignment-1 Objective: Understand the power and energy behavior of various benchmarks on different types of x86 based systems. We explore a laptop,

More information

Power Measurement Using Performance Counters

Power Measurement Using Performance Counters Power Measurement Using Performance Counters October 2016 1 Introduction CPU s are based on complementary metal oxide semiconductor technology (CMOS). CMOS technology theoretically only dissipates power

More information

Power Measurements using performance counters CSL862: Low-Power Computing By Radhika D (2014SIY7530)

Power Measurements using performance counters CSL862: Low-Power Computing By Radhika D (2014SIY7530) Power Measurements using performance counters CSL862: Low-Power Computing By Radhika D (214SIY753) 1 Objective: To observe and note the performance and power consumption of Raspberry PI for various benchmark

More information

COL862 - Low Power Computing

COL862 - Low Power Computing COL862 - Low Power Computing Power Measurements using performance counters and studying the low power computing techniques in IoT development board (PSoC 4 BLE Pioneer Kit) and Arduino Mega 2560 Submitted

More information

APPENDIX Summary of Benchmarks

APPENDIX Summary of Benchmarks 158 APPENDIX Summary of Benchmarks The experimental results presented throughout this thesis use programs from four benchmark suites: Cyclone benchmarks (available from [Cyc]): programs used to evaluate

More information

ARM big.little Technology Unleashed An Improved User Experience Delivered

ARM big.little Technology Unleashed An Improved User Experience Delivered ARM big.little Technology Unleashed An Improved User Experience Delivered Govind Wathan Product Specialist Cortex -A Mobile & Consumer CPU Products 1 Agenda Introduction to big.little Technology Benefits

More information

The Role of Performance

The Role of Performance Orange Coast College Business Division Computer Science Department CS 116- Computer Architecture The Role of Performance What is performance? A set of metrics that allow us to compare two different hardware

More information

ELE 455/555 Computer System Engineering. Section 1 Review and Foundations Class 5 Computer System Performance

ELE 455/555 Computer System Engineering. Section 1 Review and Foundations Class 5 Computer System Performance ELE 455/555 Computer System Engineering Section 1 Review and Foundations Class 5 Computer System Overview Eight Great Ideas in Computer Architecture Design for Moore s Law Integrated Circuit resources

More information

Abhishek Pandey Aman Chadha Aditya Prakash

Abhishek Pandey Aman Chadha Aditya Prakash Abhishek Pandey Aman Chadha Aditya Prakash System: Building Blocks Motivation: Problem: Determining when to scale down the frequency at runtime is an intricate task. Proposed Solution: Use Machine learning

More information

Power Control in Virtualized Data Centers

Power Control in Virtualized Data Centers Power Control in Virtualized Data Centers Jie Liu Microsoft Research liuj@microsoft.com Joint work with Aman Kansal and Suman Nath (MSR) Interns: Arka Bhattacharya, Harold Lim, Sriram Govindan, Alan Raytman

More information

TINY System Ultra-Low Power Sensor Hub for Always-on Context Features

TINY System Ultra-Low Power Sensor Hub for Always-on Context Features TINY System Ultra-Low Power Sensor Hub for Always-on Context Features MediaTek White Paper June 2015 MediaTek s sensor hub solution, powered by the TINY Stem low power architecture, supports always-on

More information

Energy-centric DVFS Controlling Method for Multi-core Platforms

Energy-centric DVFS Controlling Method for Multi-core Platforms Energy-centric DVFS Controlling Method for Multi-core Platforms Shin-gyu Kim, Chanho Choi, Hyeonsang Eom, Heon Y. Yeom Seoul National University, Korea MuCoCoS 2012 Salt Lake City, Utah Abstract Goal To

More information

Computer Architecture. Fall Dongkun Shin, SKKU

Computer Architecture. Fall Dongkun Shin, SKKU Computer Architecture Fall 2018 1 Syllabus Instructors: Dongkun Shin Office : Room 85470 E-mail : dongkun@skku.edu Office Hours: Wed. 15:00-17:30 or by appointment Lecture notes nyx.skku.ac.kr Courses

More information

Performance of computer systems

Performance of computer systems Performance of computer systems Many different factors among which: Technology Raw speed of the circuits (clock, switching time) Process technology (how many transistors on a chip) Organization What type

More information

New STM32 F7 Series. World s 1 st to market, ARM Cortex -M7 based 32-bit MCU

New STM32 F7 Series. World s 1 st to market, ARM Cortex -M7 based 32-bit MCU New STM32 F7 Series World s 1 st to market, ARM Cortex -M7 based 32-bit MCU 7 Keys of STM32 F7 series 2 1 2 3 4 5 6 7 First. ST is first to sample a fully functional Cortex-M7 based 32-bit MCU : STM32

More information

Scheduling the Intel Core i7

Scheduling the Intel Core i7 Third Year Project Report University of Manchester SCHOOL OF COMPUTER SCIENCE Scheduling the Intel Core i7 Ibrahim Alsuheabani Degree Programme: BSc Software Engineering Supervisor: Prof. Alasdair Rawsthorne

More information

SMARTPHONE HARDWARE: ANATOMY OF A HANDSET. Mainak Chaudhuri Indian Institute of Technology Kanpur Commonwealth of Learning Vancouver

SMARTPHONE HARDWARE: ANATOMY OF A HANDSET. Mainak Chaudhuri Indian Institute of Technology Kanpur Commonwealth of Learning Vancouver SMARTPHONE HARDWARE: ANATOMY OF A HANDSET Mainak Chaudhuri Indian Institute of Technology Kanpur Commonwealth of Learning Vancouver Outline of topics What is the hardware architecture of a How does communication

More information

Power Management for Embedded Systems

Power Management for Embedded Systems Power Management for Embedded Systems Minsoo Ryu Hanyang University Why Power Management? Battery-operated devices Smartphones, digital cameras, and laptops use batteries Power savings and battery run

More information

Cloud Based Framework for Rich Mobile Application

Cloud Based Framework for Rich Mobile Application Cloud Based Framework for Rich Mobile Application by Andrew Williams (ID: 29003739), Krishna Sharma (ID:), and Roberto Fonseca (ID: 51324561) CS 230 Distributed Systems Project Champion: Reza Rahimi Prof.

More information

CS61C - Machine Structures. Week 6 - Performance. Oct 3, 2003 John Wawrzynek.

CS61C - Machine Structures. Week 6 - Performance. Oct 3, 2003 John Wawrzynek. CS61C - Machine Structures Week 6 - Performance Oct 3, 2003 John Wawrzynek http://www-inst.eecs.berkeley.edu/~cs61c/ 1 Why do we worry about performance? As a consumer: An application might need a certain

More information

Chapter 5. Introduction ARM Cortex series

Chapter 5. Introduction ARM Cortex series Chapter 5 Introduction ARM Cortex series 5.1 ARM Cortex series variants 5.2 ARM Cortex A series 5.3 ARM Cortex R series 5.4 ARM Cortex M series 5.5 Comparison of Cortex M series with 8/16 bit MCUs 51 5.1

More information

ECE 471 Embedded Systems Lecture 2

ECE 471 Embedded Systems Lecture 2 ECE 471 Embedded Systems Lecture 2 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 7 September 2018 Announcements Reminder: The class notes are posted to the website. HW#1 will

More information

Energy Management Issue in Ad Hoc Networks

Energy Management Issue in Ad Hoc Networks Wireless Ad Hoc and Sensor Networks - Energy Management Outline Energy Management Issue in ad hoc networks WS 2010/2011 Main Reasons for Energy Management in ad hoc networks Classification of Energy Management

More information

Performance and Energy Efficiency of the 14 th Generation Dell PowerEdge Servers

Performance and Energy Efficiency of the 14 th Generation Dell PowerEdge Servers Performance and Energy Efficiency of the 14 th Generation Dell PowerEdge Servers This white paper details the performance improvements of Dell PowerEdge servers with the Intel Xeon Processor Scalable CPU

More information

Chip-Multithreading Systems Need A New Operating Systems Scheduler

Chip-Multithreading Systems Need A New Operating Systems Scheduler Chip-Multithreading Systems Need A New Operating Systems Scheduler Alexandra Fedorova Christopher Small Daniel Nussbaum Margo Seltzer Harvard University, Sun Microsystems Sun Microsystems Sun Microsystems

More information

Computer Performance Evaluation: Cycles Per Instruction (CPI)

Computer Performance Evaluation: Cycles Per Instruction (CPI) Computer Performance Evaluation: Cycles Per Instruction (CPI) Most computers run synchronously utilizing a CPU clock running at a constant clock rate: where: Clock rate = 1 / clock cycle A computer machine

More information

Utilization-based Power Modeling of Modern Mobile Application Processor

Utilization-based Power Modeling of Modern Mobile Application Processor Utilization-based Power Modeling of Modern Mobile Application Processor Abstract Power modeling of a modern mobile application processor (AP) is challenging because of its complex architectural characteristics.

More information

Energy Management Issue in Ad Hoc Networks

Energy Management Issue in Ad Hoc Networks Wireless Ad Hoc and Sensor Networks (Energy Management) Outline Energy Management Issue in ad hoc networks WS 2009/2010 Main Reasons for Energy Management in ad hoc networks Classification of Energy Management

More information

Suspend-aware Segment Cleaning in Log-Structured File System

Suspend-aware Segment Cleaning in Log-Structured File System USENI HotStorage 15 Santa Clara, CA, USA, July 6~7, 2015 Suspend-aware Segment Cleaning in Log-Structured File System Dongil Park, Seungyong Cheon, Youjip Won Hanyang University Outline Introduction Log-structured

More information

Ubiquitous and Mobile Computing CS 528:EnergyEfficiency Comparison of Mobile Platforms and Applications: A Quantitative Approach. Norberto Luna Cano

Ubiquitous and Mobile Computing CS 528:EnergyEfficiency Comparison of Mobile Platforms and Applications: A Quantitative Approach. Norberto Luna Cano Ubiquitous and Mobile Computing CS 528:EnergyEfficiency Comparison of Mobile Platforms and Applications: A Quantitative Approach Norberto Luna Cano Computer Science Dept. Worcester Polytechnic Institute

More information

Contour Detection on Mobile Platforms

Contour Detection on Mobile Platforms Contour Detection on Mobile Platforms Bor-Yiing Su, subrian@eecs.berkeley.edu Prof. Kurt Keutzer, keutzer@eecs.berkeley.edu Parallel Computing Lab, University of California, Berkeley 1/26 Diagnosing Power/Performance

More information

Lesson #1. Computer Systems and Program Development. 1. Computer Systems and Program Development - Copyright Denis Hamelin - Ryerson University

Lesson #1. Computer Systems and Program Development. 1. Computer Systems and Program Development - Copyright Denis Hamelin - Ryerson University Lesson #1 Computer Systems and Program Development Computer Systems Computers are electronic systems that can transmit, store, and manipulate information (data). Data can be numeric, character, graphic,

More information

Multimedia in Mobile Phones. Architectures and Trends Lund

Multimedia in Mobile Phones. Architectures and Trends Lund Multimedia in Mobile Phones Architectures and Trends Lund 091124 Presentation Henrik Ohlsson Contact: henrik.h.ohlsson@stericsson.com Working with multimedia hardware (graphics and displays) at ST- Ericsson

More information

LPGPU2 Font Renderer App

LPGPU2 Font Renderer App LPGPU2 Font Renderer App Performance Analysis 2 Introduction As part of LPGPU2 Work Package 3, a font rendering app was developed to research the profiling characteristics of different font rendering algorithms.

More information

Fixed-Point Math and Other Optimizations

Fixed-Point Math and Other Optimizations Fixed-Point Math and Other Optimizations Embedded Systems 8-1 Fixed Point Math Why and How Floating point is too slow and integers truncate the data Floating point subroutines: slower than native, overhead

More information

Low-power Architecture. By: Jonathan Herbst Scott Duntley

Low-power Architecture. By: Jonathan Herbst Scott Duntley Low-power Architecture By: Jonathan Herbst Scott Duntley Why low power? Has become necessary with new-age demands: o Increasing design complexity o Demands of and for portable equipment Communication Media

More information

An Analysis of the Amount of Global Level Redundant Computation in the SPEC 95 and SPEC 2000 Benchmarks

An Analysis of the Amount of Global Level Redundant Computation in the SPEC 95 and SPEC 2000 Benchmarks An Analysis of the Amount of Global Level Redundant Computation in the SPEC 95 and SPEC 2000 s Joshua J. Yi and David J. Lilja Department of Electrical and Computer Engineering Minnesota Supercomputing

More information

SAE5C Computer Organization and Architecture. Unit : I - V

SAE5C Computer Organization and Architecture. Unit : I - V SAE5C Computer Organization and Architecture Unit : I - V UNIT-I Evolution of Pentium and Power PC Evolution of Computer Components functions Interconnection Bus Basics of PCI Memory:Characteristics,Hierarchy

More information

Renesas Synergy MCUs Build a Foundation for Groundbreaking Integrated Embedded Platform Development

Renesas Synergy MCUs Build a Foundation for Groundbreaking Integrated Embedded Platform Development Renesas Synergy MCUs Build a Foundation for Groundbreaking Integrated Embedded Platform Development New Family of Microcontrollers Combine Scalability and Power Efficiency with Extensive Peripheral Capabilities

More information

P a g e 1. MathCAD VS MATLAB. A Usability Comparison. By Brian Tucker

P a g e 1. MathCAD VS MATLAB. A Usability Comparison. By Brian Tucker P a g e 1 MathCAD VS MATLAB A Usability Comparison By Brian Tucker P a g e 2 Table of Contents Introduction... 3 Methodology... 3 Tasks... 3 Test Environment... 3 Evaluative Criteria/Rating Scale... 4

More information

Amber Baruffa Vincent Varouh

Amber Baruffa Vincent Varouh Amber Baruffa Vincent Varouh Advanced RISC Machine 1979 Acorn Computers Created 1985 first RISC processor (ARM1) 25,000 transistors 32-bit instruction set 16 general purpose registers Load/Store Multiple

More information

Big.LITTLE Processing with ARM Cortex -A15 & Cortex-A7

Big.LITTLE Processing with ARM Cortex -A15 & Cortex-A7 Big.LITTLE Processing with ARM Cortex -A15 & Cortex-A7 Improving Energy Efficiency in High-Performance Mobile Platforms Peter Greenhalgh, ARM September 2011 This paper presents the rationale and design

More information

Operating system integrated energy aware scratchpad allocation strategies for multiprocess applications

Operating system integrated energy aware scratchpad allocation strategies for multiprocess applications University of Dortmund Operating system integrated energy aware scratchpad allocation strategies for multiprocess applications Robert Pyka * Christoph Faßbach * Manish Verma + Heiko Falk * Peter Marwedel

More information

Snapdragon S4 System on Chip

Snapdragon S4 System on Chip Snapdragon S4 System on Chip Analyst Webinar 10/19/2011 2011 QUALCOMM Incorporated. All rights reserved. 1 2011 QUALCOMM Incorporated. All rights reserved. 2 New Snapdragon Brand and Roadmap Features Overview

More information

GCSE Computer Science for OCR Overview Scheme of Work

GCSE Computer Science for OCR Overview Scheme of Work GCSE Computer Science for OCR Overview Scheme of Work The following assumes a two-year model. During the course, the final challenges can be used for practice in computational thinking, algorithm design

More information

F28HS Hardware-Software Interface: Systems Programming

F28HS Hardware-Software Interface: Systems Programming F28HS Hardware-Software Interface: Systems Programming Hans-Wolfgang Loidl School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh Semester 2 2017/18 0 No proprietary software has

More information

Enabling the A.I. Era: From Materials to Systems

Enabling the A.I. Era: From Materials to Systems Enabling the A.I. Era: From Materials to Systems Sundeep Bajikar Head of Market Intelligence, Applied Materials New Street Research Conference May 30, 2018 External Use Key Message PART 1 PART 2 A.I. *

More information

CMSC 611: Advanced Computer Architecture

CMSC 611: Advanced Computer Architecture CMSC 611: Advanced Computer Architecture Performance Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science

More information

Advanced Topics UNIT 2 PERFORMANCE EVALUATIONS

Advanced Topics UNIT 2 PERFORMANCE EVALUATIONS Advanced Topics UNIT 2 PERFORMANCE EVALUATIONS Structure Page Nos. 2.0 Introduction 4 2. Objectives 5 2.2 Metrics for Performance Evaluation 5 2.2. Running Time 2.2.2 Speed Up 2.2.3 Efficiency 2.3 Factors

More information

Android Power Management & Ways to reduce the Power Consumption in an Android Smartphone

Android Power Management & Ways to reduce the Power Consumption in an Android Smartphone ISSN 2395-1621 Android Power Management & Ways to reduce the Power Consumption in an Android Smartphone #1 Shailendra Kumar Pandey, #2 Varsha Shinde, #3 Rani Magar #4 Prof. Gunjun K. Naigaonkar 1 pandey_shailendra.ghrcempcse@raisoni.net

More information

Expanding Opportunities in Clamshell Devices. Laurence Bryant VP Strategic Marketing

Expanding Opportunities in Clamshell Devices. Laurence Bryant VP Strategic Marketing Expanding Opportunities in Clamshell Devices Laurence Bryant VP Strategic Marketing 1 PC Mobile Ecosystem Scaling The Richness Of Small Screen Experiences The smartphone and tablet ecosystem is shaping

More information

Affordable and power efficient computing for high energy physics: CPU and FFT benchmarks of ARM processors

Affordable and power efficient computing for high energy physics: CPU and FFT benchmarks of ARM processors Affordable and power efficient computing for high energy physics: CPU and FFT benchmarks of ARM processors Mitchell A Cox, Robert Reed and Bruce Mellado School of Physics, University of the Witwatersrand.

More information

The character of the instruction scheduling problem

The character of the instruction scheduling problem The character of the instruction scheduling problem Darko Stefanović Department of Computer Science University of Massachusetts March 997 Abstract Here I present some measurements that serve to characterize

More information

Baikal-T1 Microprocessor Performance Tests

Baikal-T1 Microprocessor Performance Tests Baikal-T1 Microprocessor Performance Tests Revision list Revision Date Author Description 1.0 15.03.2017 Initial version 1.1 08.08.2017 Added SPEC CPU2006 Int, iperf results Revision list... 1 1. List

More information

Example: Adding 1000 integers on Cortex-M4F. Lower bound: 2n + 1 cycles for n LDR + n ADD. Imagine not knowing this : : :

Example: Adding 1000 integers on Cortex-M4F. Lower bound: 2n + 1 cycles for n LDR + n ADD. Imagine not knowing this : : : Cryptographic software engineering, part 2 1 Daniel J. Bernstein Last time: General software engineering. Using const-time instructions. Comparing time to lower bound. Example: Adding 1000 integers on

More information

Variables and Operators 2/20/01 Lecture #

Variables and Operators 2/20/01 Lecture # Variables and Operators 2/20/01 Lecture #6 16.070 Variables, their characteristics and their uses Operators, their characteristics and their uses Fesq, 2/20/01 1 16.070 Variables Variables enable you to

More information

Introduction to Computer Science. Homework 1

Introduction to Computer Science. Homework 1 Introduction to Computer Science Homework. In each circuit below, the rectangles represent the same type of gate. Based on the input and output information given, identify whether the gate involved is

More information

Negotiating the Maze Getting the most out of memory systems today and tomorrow. Robert Kaye

Negotiating the Maze Getting the most out of memory systems today and tomorrow. Robert Kaye Negotiating the Maze Getting the most out of memory systems today and tomorrow Robert Kaye 1 System on Chip Memory Systems Systems use external memory Large address space Low cost-per-bit Large interface

More information

ARM. Mali GPU. OpenGL ES Application Optimization Guide. Version: 3.0. Copyright 2011, 2013 ARM. All rights reserved. ARM DUI 0555C (ID102813)

ARM. Mali GPU. OpenGL ES Application Optimization Guide. Version: 3.0. Copyright 2011, 2013 ARM. All rights reserved. ARM DUI 0555C (ID102813) ARM Mali GPU Version: 3.0 OpenGL ES Application Optimization Guide Copyright 2011, 2013 ARM. All rights reserved. ARM DUI 0555C () ARM Mali GPU OpenGL ES Application Optimization Guide Copyright 2011,

More information

EPUB // SAMSUNG GALAXY 7500 ONLINE MANUAL DOWNLOAD

EPUB // SAMSUNG GALAXY 7500 ONLINE MANUAL DOWNLOAD 06 January, 2019 EPUB // SAMSUNG GALAXY 7500 ONLINE MANUAL DOWNLOAD Document Filetype: PDF 165.6 KB 0 EPUB // SAMSUNG GALAXY 7500 ONLINE MANUAL DOWNLOAD Samsung GT-S7500 Galaxy Ace Plus complete Service

More information

CS 61C: Great Ideas in Computer Architecture Performance and Floating-Point Arithmetic
