Temperature measurement in the Intel CoreTM Duo Processor

Similar documents
Tacked Link List - An Improved Linked List for Advance Resource Reservation

BoxPlot++ Zeina Azmeh, Fady Hamoui, Marianne Huchard. To cite this version: HAL Id: lirmm

Linux: Understanding Process-Level Power Consumption

Fault-Tolerant Storage Servers for the Databases of Redundant Web Servers in a Computing Grid

Multimedia CTI Services for Telecommunication Systems

The Sissy Electro-thermal Simulation System - Based on Modern Software Technologies

Change Detection System for the Maintenance of Automated Testing

FIT IoT-LAB: The Largest IoT Open Experimental Testbed

Accurate Conversion of Earth-Fixed Earth-Centered Coordinates to Geodetic Coordinates

Setup of epiphytic assistance systems with SEPIA

Study on Feebly Open Set with Respect to an Ideal Topological Spaces

Blind Browsing on Hand-Held Devices: Touching the Web... to Understand it Better

How to simulate a volume-controlled flooding with mathematical morphology operators?

QuickRanking: Fast Algorithm For Sorting And Ranking Data

Modularity for Java and How OSGi Can Help

NP versus PSPACE. Frank Vega. To cite this version: HAL Id: hal

Comparison of radiosity and ray-tracing methods for coupled rooms

Comparison of spatial indexes

A Resource Discovery Algorithm in Mobile Grid Computing based on IP-paging Scheme

Branch-and-price algorithms for the Bi-Objective Vehicle Routing Problem with Time Windows

A Voronoi-Based Hybrid Meshing Method

Very Tight Coupling between LTE and WiFi: a Practical Analysis

Regularization parameter estimation for non-negative hyperspectral image deconvolution:supplementary material

SDLS: a Matlab package for solving conic least-squares problems

Moveability and Collision Analysis for Fully-Parallel Manipulators

KeyGlasses : Semi-transparent keys to optimize text input on virtual keyboard

High robustness PNP-based structure for the ESD protection of high voltage I/Os in an advanced smart power technology

Zigbee Wireless Sensor Network Nodes Deployment Strategy for Digital Agricultural Data Acquisition

Taking Benefit from the User Density in Large Cities for Delivering SMS

Relabeling nodes according to the structure of the graph

Mokka, main guidelines and future

HySCaS: Hybrid Stereoscopic Calibration Software

Service Reconfiguration in the DANAH Assistive System

lambda-min Decoding Algorithm of Regular and Irregular LDPC Codes

DANCer: Dynamic Attributed Network with Community Structure Generator

Open Digital Forms. Hiep Le, Thomas Rebele, Fabian Suchanek. HAL Id: hal

SIM-Mee - Mobilizing your social network

The New Territory of Lightweight Security in a Cloud Computing Environment

Malware models for network and service management

DSM GENERATION FROM STEREOSCOPIC IMAGERY FOR DAMAGE MAPPING, APPLICATION ON THE TOHOKU TSUNAMI

Experimental Evaluation of an IEC Station Bus Communication Reliability

Every 3-connected, essentially 11-connected line graph is hamiltonian

Multi-atlas labeling with population-specific template and non-local patch-based label fusion

Natural Language Based User Interface for On-Demand Service Composition

Cloud My Task - A Peer-to-Peer Distributed Python Script Execution Service

Generic Design Space Exploration for Reconfigurable Architectures

A Practical Evaluation Method of Network Traffic Load for Capacity Planning

Representation of Finite Games as Network Congestion Games

An Experimental Assessment of the 2D Visibility Complex

THE KINEMATIC AND INERTIAL SOIL-PILE INTERACTIONS: CENTRIFUGE MODELLING

On-chip temperature-based digital signal processing for customized wireless microcontroller

LaHC at CLEF 2015 SBS Lab

Fuzzy sensor for the perception of colour

Scan chain encryption in Test Standards

Teaching Encapsulation and Modularity in Object-Oriented Languages with Access Graphs

The Connectivity Order of Links

Light field video dataset captured by a R8 Raytrix camera (with disparity maps)

Traffic Grooming in Bidirectional WDM Ring Networks

IntroClassJava: A Benchmark of 297 Small and Buggy Java Programs

Structuring the First Steps of Requirements Elicitation

Linked data from your pocket: The Android RDFContentProvider

Quality of Service Enhancement by Using an Integer Bloom Filter Based Data Deduplication Mechanism in the Cloud Storage Environment

CRC: Protected LRU Algorithm

Real-Time Collision Detection for Dynamic Virtual Environments

Syrtis: New Perspectives for Semantic Web Adoption

ASAP.V2 and ASAP.V3: Sequential optimization of an Algorithm Selector and a Scheduler

Simulations of VANET Scenarios with OPNET and SUMO

Prototype Selection Methods for On-line HWR

Scalewelis: a Scalable Query-based Faceted Search System on Top of SPARQL Endpoints

From medical imaging to numerical simulations

Real-time FEM based control of soft surgical robots

Stream Ciphers: A Practical Solution for Efficient Homomorphic-Ciphertext Compression

THE COVERING OF ANCHORED RECTANGLES UP TO FIVE POINTS

A 64-Kbytes ITTAGE indirect branch predictor

Learning a Representation of a Believable Virtual Character s Environment with an Imitation Algorithm

Mapping classifications and linking related classes through SciGator, a DDC-based browsing library interface

IMPLEMENTATION OF MOTION ESTIMATION BASED ON HETEROGENEOUS PARALLEL COMPUTING SYSTEM WITH OPENC

Real-Time and Resilient Intrusion Detection: A Flow-Based Approach

An FCA Framework for Knowledge Discovery in SPARQL Query Answers

Is GPU the future of Scientific Computing?

FPGA implementation of a real time multi-resolution edge detection video filter

X-Kaapi C programming interface

Catalogue of architectural patterns characterized by constraint components, Version 1.0

Monitoring Air Quality in Korea s Metropolises on Ultra-High Resolution Wall-Sized Displays

Making Full use of Emerging ARM-based Heterogeneous Multicore SoCs

Complexity Comparison of Non-Binary LDPC Decoders

CTF3 BPM acquisition system

Application of RMAN Backup Technology in the Agricultural Products Wholesale Market System

Primitive roots of bi-periodic infinite pictures

Using a Medical Thesaurus to Predict Query Difficulty

Operation of Site Running StratusLab toolkit v1.0

A N-dimensional Stochastic Control Algorithm for Electricity Asset Management on PC cluster and Blue Gene Supercomputer

RecordMe: A Smartphone Application for Experimental Collections of Large Amount of Data Respecting Volunteer s Privacy

The optimal routing of augmented cubes.

BugMaps-Granger: A Tool for Causality Analysis between Source Code Metrics and Bugs

Reverse-engineering of UML 2.0 Sequence Diagrams from Execution Traces

A case-based reasoning approach for unknown class invoice processing

Kernel perfect and critical kernel imperfect digraphs structure

Comparator: A Tool for Quantifying Behavioural Compatibility

Assisted Policy Management for SPARQL Endpoints Access Control

Transcription:

Temperature measurement in the Intel CoreTM Duo Processor E. Rotem, J. Hermerding, A. Cohen, H. Cain To cite this version: E. Rotem, J. Hermerding, A. Cohen, H. Cain. Temperature measurement in the Intel CoreTM Duo Processor. THERMINIC 2006, Sep 2006, Nice, France. TIMA Editions, pp.23-27, 2006. <hal- 00171349> HAL Id: hal-00171349 https://hal.archives-ouvertes.fr/hal-00171349 Submitted on 12 Sep 2007 HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Temperature measurement in the Intel Core TM Duo Processor Efraim Rotem Mobile Platform Group, Intel corporation Jim Hermerding Mobile platform Group, Intel corporation Cohen Aviad - Microprocessor Technology Lab, Intel corporation Cain Harel - Microprocessor Technology Lab, Intel corporation Abstract Modern CPUs with increasing core frequency and power are rapidly reaching a point where the CPU frequency and performance are limited by the amount of heat that can be extracted by the cooling technology. In mobile environment, this issue is becoming more apparent, as form factors become thinner and lighter. Often, mobile platforms trade CPU performance in order to reduce power and manage thermals. This enables the delivery of high performance computing together with improved ergonomics by lowering skin temperature and reducing fan acoustic noise. Most of available high performance CPUs provide thermal sensor on the die to allow thermal management, typically in the form of analog thermal diode. Operating system algorithms and platform embedded controllers read the temperature and control the processor power. Improved thermal sensors directly translate into better system performance, reliability and ergonomics. In this paper we will introduce the new Intel Core TM Duo processor temperature sensing capability and present performance benefits measurements and results. Introduction Today s high performance processors contain over a hundred million transistors, running in a frequency of several gigahertz. The power and thermal characteristics of these processors are becoming more challenging than ever before, and are likely to continue to grow with Moor s low. Improvements in the cooling technology however, are relatively slow and do not follow Moor s low. All computing segments face power and thermal challenges. In the server domain, the cost of electricity and air conditioning is one of the biggest expense items of a data center, and drives the need for low power high efficiency systems. In the mobile computing market, power and thermal management are the key limiter for delivering higher computational performance. Thin and light industrial designs are limited by the heat that can be extracted from the box. Ergonomic characteristics are also highly impacted by thermal considerations. The cooling fan is the major source of acoustic noise in the mobile system and external skin hot spots should be avoided for ergonomic reasons as well. The increasing demand for compute density brings the need for efficient thermal management schemes. Several such schemes have been proposed, for example DVS (dynamic frequency voltage scaling [1]). These mechanisms were implemented in CPUs such as the Intel Centrino Processor [2]. Most operating systems on the market support ACPI [3]. This is an industry standard infrastructure that enables thermal management of computer platforms. Thermal management is done by the use of active cooling devices, such as fans, or passive cooling actions such as DVS. Thermal management schemes accept user preferences for setting management policy. A computer user can select between high performance, energy conservation and improved ergonomics parameters. The basic feedback for most of the power and thermal management schemes is temperature measurement. Both Intel processors [4] and others [5], incorporate temperature sensor on the die to allow thermal measurement, typically in the form of analog thermal diode. The voltage on a diode junction is a function of the junction temperature. The diode is routed to external pins and an A/D chip on the platform converts the voltage into temperature reading. The Intel Centrino processor [2] introduced a fixed thermal sensor, tuned to the max specified junction temperature. In case of abnormal conditions, such as cooling system malfunction, the circuit 1

asserts a signal that activates a programmable self management power saving action that protects the CPU from operating out of its specified thermal range. It is apparent that the accuracy of the thermal measurements directly impacts the performance of the thermal management system and the performance of the CPU. In mobile computers, 1.5 o C accuracy in temperature measurement is equivalent to 1 Watt of CPU power. In desk-top computers the impact is even higher due to the lower thermal resistance and 1 o C accuracy translates into 2 Watt of CPU power. There are several causes for temperature measurement inaccuracy: 1. Parameter variance: The thermal diode is not ideal and during the manufacturing process, there are variations in the diode parameters that translate into reading variations. An offset value is programmed into the Intel Core Duo, to be used by the A/D to generate accurate readings. 2. A/D accuracy: Some errors are associated with the analog to digital conversion due to design and technology limitations as well as quantization errors. The best temperature A/Ds available on the market today provide +/- ½ o C accuracy. 3. Proximity to the hot spot: CPU performance and reliability is limited by the temperature of the hottest location on the die. Thermal diode placement is limited by routing and I/O considerations and usually cannot be placed at the hottest spot on the die. Furthermore, the hot spot tends to shift around as a function of the workload of the CPU. It is not rare to find temperature difference as high as 10 o C between a diode and the hot spot. 4. Manufacturing temperature control: Parts are tested for functionality and reliability at the max temperature specifications. Variations in test temperature drive a need for additional guard-band in the temperature control set points. The speed of response to temperature changes also impacts thermal management performance. The Intel Core TM Duo processor has implemented a new digital temperature reading capability to address the accuracy and response time limitations of existing solutions. The rest of the paper will describe the implementation of the digital sensor and the measured results of it s performance. The Intel Core Duo digital sensor (DTS) The general structure of the digital thermometer of the Intel Core Duo [7] is described in Figure 1. In addition to the analog thermal diode, multiple sensing devices are distributed on the die in all the hot spots. An internal A/D circuit converts each sensor into a 7 bit digital reading. The temperature reading is calculated as an offset from the maximum specified Tj, e.g. 0 indicated that the CPU is at it s maximum allowable Tj, 1 indicated 1 o C below etc. All temperature readings are combined together into a single value, indicating the temperature of the hottest spot on die. The Intel Core TM Duo is a dual core CPU. The DTS offers the ability to read temperature for each core independently and to read the maximum temperature for the entire package. To achieve measurement accuracy, each sensor is calibrated at test time. Calibration is done for the Maximum Tj and the linearity of the readout slope. The temperature reading is post processed for filtering out random noise and generating the H/W activated thermal protection functions. The DTS implementation on the Intel Core TM Duo processor supports the legacy Intel Centrino thermal sensor and fixed function thresholds PROCHOT and THERMTRIP [2]. Multiple Sensors S/W accessible Register Temperature Threshold #1 Threshold #2 A/D Converter Filtering & Post Processing 7 bit temperature reading Thermal shut down Thermal protection Out of Spec High Threshold compare Low Threshold compare Analog Sensor Thermal Interrupt Figure 1: Digital Thermometer Block Diagram PROCHOT is a fixed temperature threshold calibrated to trip at the max specified junction temperature. Upon crossing this threshold, a H/W power reduction action is initiated, reducing the frequency and voltage, keeping the CPU within functionality and reliability limits. A properly designed cooling system with thermal management should 2

not activate the H/W protection mechanisms. Some aggressive platform designs however, may need occasional H/W initiated action due to long response time. On most operating systems, interrupt latency is not guaranteed and therefore, S/W based control may respond too slow. Other actions have inherent long delays. The time extending from activation of a fan and until its maximum speed is reached may be too long. Aggressive thermal design, together with a slow cooling response may cause thermal excursions that may compromise reliability and functionality. It is possible to design a system with enough margins to avoid such cases, but this comes at a cost of performance or compromised ergonomic characteristics. H/W based protection enables better user experience without compromising the device reliability and performance. THERMTRIP is a catastrophic shut down event, both on the CPU and for the platform. It identifies thermal runaway in case of cooling system malfunction and turns off the CPU and platform voltages, preventing meltdown and permanent damage. A new functionality of the DTS on the Intel Core TM Duo is out of spec indication. It is possible for the CPU to operate within specifications while at maximum Tj. Out of spec indication is a notification to the operating system that a malfunction occurred, junction temperature is rising and a graceful shut down is required while functionality is still guaranteed and user data can be saved. In order to perform S/W and ACPI thermal control functions, the DTS offers interrupt generation capability, in addition to the temperature reading. Two S/W programmable thresholds are loaded by S/W and a thermal interrupt is generated upon threshold crossing. This thermal event generates an interrupt to single or both cores simultaneously according to the APIC settings. The digital thermometer is the basis for software thermal control such as the ACPI. In the ACPI infrastructure, thermal management is done by assigning a set of policies or actions to temperature thresholds. A policy can be active, such as activating fan in various speeds (_ACx), or passive (_PSV), by reducing the CPU frequency. Interrupt thresholds are defined to indicate upper and lower temperatures thresholds. An example of digital thermometer usage is given in Figure 2. _TMP 60 = -40- -35- -30- -25- -95- -90- -85- -80- -75- -70- -65- -60- -55- -50- -45- _CRT _AC0 _AC1 _PSV Figure 2: Digital thermometer and ACPI In the above example the current die temperature is at 60 o C. The thresholds set to 5 o C above and below the current temperature. If the temperature rises above 65 o C, an interrupt is generated, notifying the S/W of a significant change in temperature. The control software reads the temperature and identifies the new temperature and initiates action if needed. In the above example, 65 o C requires activating a fan at a low speed. The activation thresholds and policies are defined at system configuration and communicated to the ACPI. Upon interrupt servicing, new thresholds are written around the new temperature to further track temperature changes. Small hysteresis values are applied to prevent frequent interrupts around a threshold point. Measurements and results In previous Pentium - M systems, a single analog thermal diode was used to measure die temperature. Thermal diode cannot be located at the hottest spot of the die due to design limitations. To perform thermal management activities, some fixed offset was applied to the measured temperature, to keep the CPU within specifications. With the increasing performance and power density of the Intel Core TM Duo, the performance implications of guard bands increase. Figure 3 shown measured die temperature of different workloads. It can be seen that the hot spot of the die moves to different locations depending on the nature of the workload. 3

Cache Cache Die Hot-Spot Core #1 Core #2 Core #1 Core #2 Die Hot-Spot Figure 3: Die hot spots at different workloads Figure 3 demonstrates a shift of the hot spot in a dual core workload. A workload that stresses the floating point unit which is a high power operation, will generate hot spot near the floating point while other workloads will stress different locations on the die. Figure 4 shows the thermal impact of single core applications. Die Hot-Spot In order to evaluate the DTS temperature reading, we performed a study to identify the impact of different workloads on the difference between diode and the hot spot, as measured by the DTS. A set of workloads including all SPEC-2K components and other popular benchmarks and applications, at single thread and multithread were executed on the CPU. Several iterations were done to reach a thermal steady state and then the diode and DTS temperatures were measured. Before taking the measurement, a calibration process has been performed, leaving only the temperature offset. As described earlier, both external A/D and internal DTS have some inaccuracies. Calibration procedure is needed to equalize DTS and diode temperature readings and measure temperature offsets only. Figure 5 shows the offset between the analog diode and the hot spot, as measured by the DTS. The horizontal axis represents the hot spot temperature as a percentage of the max temperature. The vertical axis shows the temperature offset between the diode and the hot spot. Each point on the chart represents a single application. It can be seen that large temperature gradients exist on the die. It also can be noted that some workloads display high temperature gradients while other have no offset. Thermal control algorithms need to prevent the hot spot from exceeding the max temperature specification. It is possible to mitigate the temperature difference by applying a fixed offset to the diode reading. This obviously is a non optimal solution as the workloads with low offset will be panelized by the unnecessary temperature offset. The use of digital thermometer provides improved temperature reading, enables higher CPU performance within thermal limitations and improves reliability. Core #1 Cache Core #2 12 Tj max vs Diode Temperature 10 Offset diod Figure 4: Thermal behavior of a single core application It can be noted that a single diode cannot capture the maximal die temperature. Placing a diode between the cores, results in non optimal location as this is a relatively cold area of the die in single thread workloads. Workloads can be migrated by the operating system scheduler from one core to the another on the same die and therefore a symmetrical sensor placement is required. T offset ['C] 8 6 4 2 0 45 55 65 75 85 95 Tj [% of Tj max] Figure 5: Diode to DTS Temp. difference 4

Previous studies [6] have shown that temperature reduction directly translates into performance degradation. The above chart represents 3%-7% reduction in performance due to temperature measurement offset. Building a thermal management system around a thermal diode, with the characteristics shown in Figure 5 requires temperature guard-band. This guard band can be applied to the control set point, and as a result, the workloads that generate high offset temperatures result in lost performance. A different approach can set the Tj threshold assuming that the diode represents the correct die temperature. Some of the workloads will run at high max Tj and therefore risk functional issues or reliability degradation. The DTS also reduces the other temperature readings errors, which are not shown in this paper. The DTS is calibrated at manufacturing conditions and the reference point is set to this test temperature. Functionality, electrical specifications and reliability commitments are guaranteed at maximum Tj as measured by the DTS. Any test inaccuracy or parameters variance are already accounted for in the DTS set point. Summary and conclusions With the increasing demand for computational density and the increase in CPU transistors and frequency, power and thermal are the key limiters for providing computing performance. In recent years, thermal management has become a fundamental function of computer platform. The input to every thermal management scheme is a thermal sensor. We have shown that thermal sensor accuracy translates into power, which in turn translates into better CPU performance. Legacy analog thermal sensors incur inaccuracies due to parameter distribution and temperature offset from the hot spot. In this paper we introduced the new digital thermal sensor (DTS) of the Intel Core Duo processor. We showed that multiple sense point on various hot spots of the die, together with on die A/D converter provide improved temperature reading. The better accuracy translates either into 3%-7% higher performance or into improved ergonomics. The introduction of dual core References [1] D. Brooks, M. Martonosi, Dynamic Thermal Management for High-Performance Microprocessors Proceedings of the HPCA-07, January 2001 [2] Intel Pentium M processor product specifications http://download.intel.com/design/mobile/datashts/302189 08.pdf [3] Advanced Configuration and Power Interface. http://www.teleport.com/ acpi/. [4] Intel SpeedStep Technology, http://support.intel.com/support/processors/mobile/pentiu miii/ss.htm [5] D. Pham1, S. Asano, et. al., The design and implementation of a first-generation CELL processor, Proceedings of ISSCC 2005. [6] E. Rotem, A. Naveh, et al., Analysis of Thermal Monitor features of the Intel Pentium M Processor, Proceedings of TACS-01, ISCA-31, 2004. [7]A. Naveh, E. Rotem, et al., Power and Thermal Management in the Intel Core Duo, Intel Technology Journal Vol. 10 #2, 2006. ITJ MY 5