DE0 Nano SoC - CPU Performance and Power

DE0 Nano SoC - CPU Performance and Power While Running Debian
19th March 2017
Group 5 - AR Internet of Things
By: Satyen Akolkar

OVERVIEW

The benchmark was performed with the sysbench utility, a Linux benchmark tool that lets us specify the number of threads used to process requests. The utility is not perfect, as it exercises only certain CPU features. Its CPU test makes the processor verify prime numbers by dividing each candidate by the numbers from 2 up to the square root of the candidate. The max requests option sets how many requests the benchmark processes; the largest number checked for primality is reported separately in the sysbench output (10000 in our runs).

We ran the benchmark on both the ARM Cortex A9 on the DE0 Nano SoC and my laptop, which has an Intel i7-6700hq. We kept the number of threads at 2 because the ARM is dual core, so that the i7 gains no advantage from its four cores.

Result: We can expect a CPU-intensive algorithm to run about 55 times slower on the ARM Cortex A9 on the DE0 Nano SoC than on the i7.

We also measured the power consumption of the board in several states of operation. On the FPGA side there are no Nios II cores, only the base components required for communication between the FPGA and the HPS. With these minimal components on the FPGA side, you can expect a draw of around 4 W when the Debian OS is running.
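The primality check described in the overview can be sketched as follows. This is a minimal illustration of trial division up to the square root, not sysbench's actual source; the function names `is_prime` and `count_primes_up_to` are hypothetical, and the limit of 10000 is taken from the "Maximum prime number checked" line of the sysbench output.

```python
import math


def is_prime(n: int) -> bool:
    """Trial division: try every divisor from 2 up to floor(sqrt(n))."""
    if n < 2:
        return False
    for divisor in range(2, math.isqrt(n) + 1):
        if n % divisor == 0:
            return False
    return True


def count_primes_up_to(limit: int) -> int:
    """Rough stand-in for the work of one request: test every number up to limit."""
    return sum(1 for n in range(2, limit + 1) if is_prime(n))


# One simulated request with the limit reported by sysbench in our runs.
print(count_primes_up_to(10000))
```

The workload is pure integer arithmetic with almost no memory traffic, which is why it says little about caches, floating point, or I/O.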

METHOD

The following sample shows how the benchmark is run, here with max requests set to 5000.

    debian@socsystem:~$ sysbench --test=cpu --max-requests=5000 --num-threads=2 run
    sysbench 0.4.12: multi-threaded system evaluation benchmark

    Running the test with following options:
    Number of threads: 2

    Doing CPU performance benchmark

    Threads started!
    Done.

    Maximum prime number checked in CPU test: 10000

    Test execution summary:
        total time:                          123.7601s
        total number of events:              5000
        total time taken by event execution: 247.4922
        per-request statistics:
            min:                             35.29ms
            avg:                             49.50ms
            max:                             77.63ms
            approx. 95 percentile:           56.52ms

    Threads fairness:
        events (avg/stddev):           2500.0000/0.00
        execution time (avg/stddev):   123.7461/0.00
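To collect the timings from many runs, the summary block can be parsed rather than copied by hand. The sketch below is one hedged way to do it: the regular expressions assume the sysbench 0.4.12 summary layout shown in the sample run, and `parse_summary` and `expected_wall_time` are hypothetical helper names. It also checks that the wall time is roughly the event count times the mean request time, divided by the number of threads.

```python
import re

# Summary lines copied from the 5000-request sample run.
SAMPLE_SUMMARY = """\
Test execution summary:
    total time:                          123.7601s
    total number of events:              5000
    total time taken by event execution: 247.4922
    per-request statistics:
         min:                                 35.29ms
         avg:                                 49.50ms
         max:                                 77.63ms
         approx.  95 percentile:              56.52ms
"""


def parse_summary(text):
    """Extract wall time (s), event count and mean request time (ms)."""
    patterns = {
        "total_time_s": r"total time:\s*([\d.]+)s",
        "events": r"total number of events:\s*(\d+)",
        "avg_ms": r"avg:\s*([\d.]+)ms",
    }
    result = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"missing field: {key}")
        result[key] = float(match.group(1))
    return result


def expected_wall_time(events, avg_ms, threads):
    """Wall time in seconds if requests are split evenly across the threads."""
    return events * avg_ms / 1000.0 / threads


summary = parse_summary(SAMPLE_SUMMARY)
estimate = expected_wall_time(summary["events"], summary["avg_ms"], threads=2)
print(summary["total_time_s"], round(estimate, 2))
```

For the sample run the estimate lands within a few hundredths of a second of the reported total time, which is consistent with both cores being kept busy for the whole run.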

RAW DATA

Time to Process Requests on ARM Cortex A9 (DE0 Nano SoC [Dual Core])

Times are in seconds. Each column is one max requests setting, in increasing order; the 5000-request run shown in METHOD (total time 123.76 s) is consistent with column 5.

                  1       2       3       4       5       6       7       8       9      10
    Run 1     24.74   49.47   74.21   98.95  123.70  148.43  173.17  197.91  222.64  247.38
    Run 2     24.74   49.48   74.21   98.96       -       -       -       -       -       -
    Run 3     24.73   49.51   74.21   98.96       -       -       -       -       -       -
    Run 4     24.76   49.51       -       -       -       -       -       -       -       -
    Run 5     24.74   49.47       -       -       -       -       -       -       -       -
    Average   24.74   49.49   74.21   98.96  123.70  148.43  173.17  197.91  222.64  247.38

The larger request counts took too long to run, there was little variance between runs, and the CPU started getting very hot, so only one run was done for these.

Time to Process Requests on Intel Core i7-6700hq [Quad Core - Hyperthreaded]

Times are in seconds, with the same columns as above.

                  1       2       3       4       5       6       7       8       9      10
    Run 1      0.47    0.89    1.34    1.82    2.27    2.69    3.18    3.50    4.08    4.53
    Run 2      0.44    0.88    1.32    1.77    2.22    2.66    3.10    3.55    3.98    4.42
    Run 3      0.44    0.90    1.38    1.81    2.26    2.66    3.09    3.55    3.98    4.40
    Run 4      0.44    0.90    1.35    1.81    2.20    2.66    3.11    3.54    4.00    4.41
    Run 5      0.44    0.91    1.37    1.80    2.24    2.67    3.15    3.54    4.08    4.37
    Average    0.45    0.89    1.35    1.80    2.24    2.67    3.13    3.54    4.02    4.43

RESULTS

Performance of DE0 Nano SoC CPU Compared to Core i7

                                   1       2       3       4       5       6       7       8       9      10
    Relative Performance (%)    1.82    1.80    1.82    1.82    1.81    1.80    1.81    1.79    1.81    1.79
    Slowdown (x)                55.1    55.4    54.9    55.0    55.3    55.6    55.4    56.0    55.3    55.9
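The two result rows follow from the average times: slowdown is the ARM time divided by the i7 time, and relative performance is the inverse expressed as a percentage. The sketch below recomputes both from the rounded averages in the raw data, so individual values can differ from the reported ones by a few tenths (the report presumably used unrounded averages; that is an assumption). The function names are hypothetical.

```python
# Average times in seconds, copied from the "Average" rows of the raw data.
ARM_AVG = [24.74, 49.49, 74.21, 98.96, 123.70, 148.43, 173.17, 197.91, 222.64, 247.38]
I7_AVG = [0.45, 0.89, 1.35, 1.80, 2.24, 2.67, 3.13, 3.54, 4.02, 4.43]


def slowdown(slow_s, fast_s):
    """How many times longer the slow machine takes."""
    return slow_s / fast_s


def relative_performance_pct(slow_s, fast_s):
    """Speed of the slow machine as a percentage of the fast one."""
    return 100.0 * fast_s / slow_s


slowdowns = [slowdown(arm, i7) for arm, i7 in zip(ARM_AVG, I7_AVG)]
relative = [relative_performance_pct(arm, i7) for arm, i7 in zip(ARM_AVG, I7_AVG)]
mean_slowdown = sum(slowdowns) / len(slowdowns)

for column, (s, r) in enumerate(zip(slowdowns, relative), start=1):
    print(f"column {column:2d}: slowdown {s:5.1f}x, relative performance {r:4.2f}%")
print(f"mean slowdown: {mean_slowdown:.1f}x")
```

The slowdown stays nearly constant across all ten columns, which supports quoting a single figure of about 55x.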

In the plot, the blue dots are the time taken on the ARM Cortex A9 and the red dots are the time taken on the Core i7.

MODES OF OPERATION

Power consumption under various modes of operation.

    Mode of Operation    Voltage (V)   Current (A)   Power (W)
    Debian System               4.94          0.85       4.199
    Debian System Min           4.94          0.76       3.75
    Debian Shutdown             4.94          0.70       3.45

These are the modes of operation of our FPGA board. At maximum load the board draws 0.45 W more than at minimum load and 0.75 W more than with Linux shut down. We measured the maximum power consumption (Debian System) while running the benchmarking program. Minimum system load (Debian System Min) occurs when the system runs in its normal state. Debian Shutdown is the mode in which Linux has been shut down completely and the system is idle.
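The power column is the product of the measured voltage and current, P = V x I. The sketch below recomputes it from the voltage and current readings; the names `MEASUREMENTS` and `power_w` are hypothetical. Using unrounded products, the differences come out as roughly 0.44 W and 0.74 W, slightly below the 0.45 W and 0.75 W obtained from the table's listed power values.

```python
# (voltage in V, current in A) for each mode, copied from the measurements.
MEASUREMENTS = {
    "Debian System": (4.94, 0.85),
    "Debian System Min": (4.94, 0.76),
    "Debian Shutdown": (4.94, 0.70),
}


def power_w(voltage_v, current_a):
    """Electrical power in watts from a DC voltage and current."""
    return voltage_v * current_a


powers = {mode: power_w(v, i) for mode, (v, i) in MEASUREMENTS.items()}
extra_over_min = powers["Debian System"] - powers["Debian System Min"]
extra_over_shutdown = powers["Debian System"] - powers["Debian Shutdown"]

for mode, watts in powers.items():
    print(f"{mode:18s} {watts:.3f} W")
print(f"benchmark load vs minimum load: {extra_over_min:.3f} W")
print(f"benchmark load vs shutdown:     {extra_over_shutdown:.3f} W")
```

Since the supply voltage is the same in every mode, the differences in power come entirely from the change in current.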

REFERENCES

[1] "Sysbench - Gentoo Wiki", Wiki.gentoo.org, 2017. [Online]. Available: https://wiki.gentoo.org/wiki/sysbench. [Accessed: 19-Mar-2017].