MCU Low-Power Features
Why Is Low Power So Important for MCUs?
- Longer battery life
- Smaller products
- Simpler power supplies
- Less EMI, which simplifies the PCB
- Permanent battery
- Reduced liability
Power as a Design Constraint
Why worry about power?
- Battery life in portable and mobile platforms
- Power consumption in desktops and server farms
- Cooling costs, packaging costs, reliability, timing
- Power density: 30 W/cm^2 in the Alpha 21364 (3x that of a typical hot plate)
Where does power go in CMOS?
- Dynamic power consumption
- Power due to short-circuit current during transitions
- Power due to leakage current
$$P = A C V^2 f + A V I_{short} f + V I_{leak}$$
Dynamic Power Consumption: $A C V^2 f$
- C: total capacitance seen by the gate's outputs; a function of wire lengths, transistor sizes, ...
- V: supply voltage; trend: has been dropping with each successive fab generation
- A: activity of the gates, i.e. how often on average the wires switch
- f: clock frequency; trend: increasing ...
Reducing dynamic power
1) Reduce V: this has a quadratic effect; limits?
2) Lower C: shrink structures, shorten wires
3) Reduce switching activity: turn off unused parts or use design techniques that minimize the number of transitions
Short-circuit Power Consumption: $A V I_{short} f$
[Figure: switching gate with input Vin, output Vout, load capacitance C_L and short-circuit current I_short]
- The finite slope of the input signal opens a direct current path between VDD and GND for a short time during switching, while both the NMOS and PMOS transistors are conducting
Reducing short-circuit power
1) Lower the supply voltage V
2) Slope engineering: match the rise/fall times of the input and output signals
Leakage Power: $V I_{leak}$
- Sub-threshold current: grows exponentially with increasing temperature and with decreasing threshold voltage Vt
Achieving low power
Reducing dynamic power ($A C V^2 f$):
- Capacitance C is a physical property; it can be tweaked only when designing the chip
- Act on the frequency f:
  1. Clock gating: turn off the clocks driving currently idle logic
  2. Clock scaling: reduce the frequency of peripherals that need lower speed
- Act on the activity A: do not perform useless activities; race to the idle state and perform duty cycling
- Act on the operating voltage V:
  3. Voltage scaling: very powerful, typically done together with frequency scaling (DVFS, dynamic voltage/frequency scaling)
  Pitfall: must be done in a well-designed way, or the logic will stop working!
Reducing static power (mostly leakage, $V I_{leak}$):
- Leakage current depends on physical and electrical properties that are not always tweakable at runtime; exceptions exist (e.g. body biasing) but are rarely exposed to the programmer
- Act on the operating voltage V: same as above
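
The effect of voltage and frequency scaling on a fixed workload can be sketched as follows. This is a minimal illustration that assumes only the dynamic-power model A*C*V^2*f from the slides and a task of a fixed number of clock cycles; the function names and all numeric values used with them are illustrative assumptions, not datasheet figures.

```c
/* Dynamic power in watts: P = A * C * V^2 * f. */
double dynamic_power_w(double activity, double cap_f, double vdd_v, double freq_hz)
{
    return activity * cap_f * vdd_v * vdd_v * freq_hz;
}

/* Energy in joules to run a fixed number of cycles: E = P * (cycles / f).
 * The frequency cancels out, so only A, C and V determine the result. */
double task_energy_j(double activity, double cap_f, double vdd_v,
                     double freq_hz, double cycles)
{
    return dynamic_power_w(activity, cap_f, vdd_v, freq_hz) * (cycles / freq_hz);
}
```

Under this model, lowering f alone leaves the dynamic energy per task unchanged and only stretches the time during which leakage is paid, which is one way to read the "race to idle" advice. Lowering V from 1.8 V to 1.2 V scales the energy by (1.2/1.8)^2, roughly 0.44.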
Achieving low power: application phases
[Figure: supply current I_DD over time, showing OFF, STARTUP/INITIALIZATION, then alternating INACTIVE and ACTIVE phases that repeat with period Tperiod; IRQs start the ACTIVE phases, in which tasks are processed]
Application phases:
- OFF: power is not applied to the MCU
- STARTUP / INITIALIZATION: the MCU performs configuration (peripherals, clocks, ...)
- INACTIVE: the MCU is in a low-power mode to reduce power consumption
- ACTIVE: the MCU is in normal mode and performs tasks
Achieving low power: duty cycling
The balance between performance and execution time needs to be analyzed carefully to find the compromise that leads to the lowest energy consumption.
- T = time of period
- P = activation time
- Duty cycle = (P / T) * 100%
MCU Shootout
Current consumption:
                                  MCU A       MCU B
Active mode                       2 mA        2.1 mA (+5%)
Low-power mode                    1 µA        0.6 µA (-40%)
Energy in a 1 s interval:
                                  MCU A       MCU B
10% duty cycle (10% active)       200.9 µJ    210.5 µJ (+4.8%)
0.1% duty cycle (0.1% active)     3 µJ        2.7 µJ (-10%)
MCU Shootout - 2
What if the MCUs have different architectures and CoreMark scores? The time spent in active mode will not be equal.
                     MCU A (16-bit)    MCU B (32-bit)
Active mode time     10%               7%
Energy (in 1 s)      200.9 µJ          147.6 µJ (-26.6%)
Less time spent in active mode means lower total energy consumed. Optimization techniques can shorten the active time.
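
The shootout figures can be reproduced with a short calculation. The sketch below assumes a simple two-level current profile (active current for a fraction of the period, low-power current for the rest); the function name is hypothetical. Note that the slide values equal current multiplied by time, so they correspond to µJ only at a 1 V supply; for another supply voltage, multiply the result by that voltage.

```c
/* Charge (in microcoulombs) drawn during one period, which equals the energy
 * in microjoules at a 1 V supply.
 * i_active_ua / i_sleep_ua: currents in microamps; period_s: period in seconds;
 * duty: fraction of the period spent in active mode (0.0 to 1.0). */
double period_charge_uc(double i_active_ua, double i_sleep_ua,
                        double period_s, double duty)
{
    double t_active = period_s * duty;
    double t_sleep  = period_s - t_active;
    return i_active_ua * t_active + i_sleep_ua * t_sleep;
}
```

For example, `period_charge_uc(2000.0, 1.0, 1.0, 0.10)` gives the MCU A value for a 10% duty cycle, and `period_charge_uc(2100.0, 0.6, 1.0, 0.07)` gives the MCU B value when it needs only 7% active time.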
STM32L1: an ultra-low-power MCU
- The STM32L1 is a relatively simple microcontroller series from ST targeting ultra-low-power computing (e.g. wireless sensor nodes)
- Cortex-M3 core
- To achieve low power, even the «simple» STM32L1 has a sophisticated, fine-grained architecture for clocking (frequency scaling) and power distribution (voltage scaling)
Clock sources in the STM32L1
The five clock sources, the PLL and the Clock Security System (CSS) offer maximum flexibility and safety for any battery-operated application.
- HSI (internal, 16 MHz): High Speed Internal clock. Multiplied by 2 using the PLL to reach 32 MHz. User-trimmable with +/-0.5% accuracy.
- MSI (internal, 64 kHz to 4 MHz): Multi-Speed Internal clock. Its very low frequencies address applications with an ultra-low consumption budget.
- LSI (internal, 38 kHz): Low Speed Internal clock (also called security clock). Used for watchdog security and the RTC.
- HSE (external, 1-24 MHz): High Speed External clock; the external quartz can be 1 to 24 MHz. The 48 MHz USB clock requires only a (cheaper) 16 MHz crystal, multiplied by 3 using the PLL. Ultra-low consumption values can still be reached below 16 MHz (down to 65 kHz). In case of HSE failure, the CSS switches to HSI.
- LSE (external, 32 kHz): Low Speed External clock (32.768 kHz). Mainly used for a precise RTC. Can be used to calibrate HSI and MSI. LSE can also be calibrated by an external clock (e.g. the 50 Hz of the home power supply).
From clock sources to system clocks
[Figure: clock tree fed by the five sources (MSI internal 64 kHz to 4 MHz, HSI internal 16 MHz, HSE external 1-24 MHz, LSE external 32 kHz, LSI internal 38 kHz), highlighted in seven steps]
1. System clock (SYSCLK): primary clock of the MCU; most clocks used by digital components derive from it
2. High Performance clock (HCLK): used by the Cortex-M3 core, the memory and the main interconnect
3. Peripheral clock x (PCLKx): used by the peripherals connected to the APBx bus
4. Clock generation and source selection, followed by clock prescaling
5. ADC clock (ADCCLK): used by the analog-to-digital converter
6. Timer clocks (TIMxCLK): used by the timers to count time
7. Real-time clock (RTCCLK): always-on, very slow clock used to keep the data that is fundamental for device wake-up
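
The relationship between SYSCLK, HCLK and the peripheral clocks can be sketched as a chain of integer dividers. This is an illustration only: it assumes that HCLK is SYSCLK divided by a bus prescaler and that each PCLKx is HCLK divided by its own prescaler, and the struct, function and parameter names are hypothetical rather than taken from any ST library.

```c
#include <stdint.h>

/* Derived bus clocks, in Hz. */
struct bus_clocks {
    uint32_t sysclk_hz;
    uint32_t hclk_hz;
    uint32_t pclk1_hz;
    uint32_t pclk2_hz;
};

/* Derive HCLK and the two peripheral clocks from SYSCLK by integer division.
 * The divider arguments are hypothetical prescaler settings (must be >= 1). */
struct bus_clocks derive_clocks(uint32_t sysclk_hz, uint32_t ahb_div,
                                uint32_t apb1_div, uint32_t apb2_div)
{
    struct bus_clocks c;
    c.sysclk_hz = sysclk_hz;
    c.hclk_hz   = sysclk_hz / ahb_div;
    c.pclk1_hz  = c.hclk_hz / apb1_div;
    c.pclk2_hz  = c.hclk_hz / apb2_div;
    return c;
}
```

Clock scaling as described earlier then amounts to choosing larger dividers for the buses whose peripherals do not need full speed.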
STM32L1 Clocks Summary
Osc/Clock         Speed             Consumption   Precision (25 °C / 0-85 °C)   Wakeup
HSE ext crystal   1-24 MHz          ~500 µA       ~0.01%                        1 ms
HSE ext clock     1-32 MHz          NA            NA
HSI               16 MHz            100 µA        1% / 2.5%                     3.7 µs
MSI               65 kHz-4.2 MHz    0.7-15 µA     0.5% / 3%                     3.7 µs
PLL               2-32 MHz          350 µA        NA                            100 µs
LSI               37 kHz            0.4 µA        50%                           200 µs
LSE ext crystal   32.7 kHz typ      0.5 µA        ~0.002%                       ~1 s
LSE ext clock     1-1000 kHz        NA            NA
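
To get a feel for the precision column, a frequency error can be converted into accumulated time error. The helper below is a hypothetical sketch that assumes a constant worst-case frequency offset over a full day.

```c
/* Worst-case time error accumulated over one day by a clock whose frequency
 * is off by precision_percent (e.g. 0.002 for 0.002 %). */
double drift_seconds_per_day(double precision_percent)
{
    const double seconds_per_day = 24.0 * 60.0 * 60.0;
    return seconds_per_day * precision_percent / 100.0;
}
```

With the table's ~0.002% for the LSE crystal this gives about 1.7 s per day, whereas the 50% figure for the LSI corresponds to hours per day, which is why the LSE is the source listed for a precise RTC.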
Clock domains in the STM32L1
[Figure: block diagram of the clock domains: HCLK (Cortex-M3 core), PCLK1, PCLK2]
Voltage domains in the STM32L1
- VDDA: analog devices, provided externally
- VDD (1.65 V-3.6 V): digital devices, provided externally, used directly for RTC, wakeup and standby
- VCORE (1.2 V-1.8 V): core, memories and peripherals, generated from VDD by the voltage regulator
Typical current
Mode                               Typical current
Dynamic RUN from Flash             249 µA/MHz at full speed (32 MHz); 183 µA/MHz on MSI clock (4.2 MHz)
LPRUN @ 32 kHz                     9 µA
LPSLEEP + 1 timer @ 32 kHz         4.4 µA
STOP with RTC (without RTC)        1.2 µA (500 nA)
STANDBY with RTC (without RTC)     900 nA (300 nA)
Wake-up time: Stop to Run 8 µs; Standby to Run 50 µs
Limitations depending on the power supply range
VDD range        ADC operation                      USB              VCORE                  Maximum CPU frequency (fCPU max)
1.65 to 1.8 V    Not functional                     Not functional   Range 2 or 3           16 MHz (1 ws), 8 MHz (0 ws)
1.8 to 2.0 V     Conversion rate up to 500 ksps     Not functional   Range 2 or 3           16 MHz (1 ws), 8 MHz (0 ws)
2.0 to 2.4 V     Conversion rate up to 500 ksps     Functional *     Range 1, 2 or 3        32 MHz (1 ws), 16 MHz (0 ws)
2.4 to 3.6 V     Conversion rate up to 1 Msps       Functional *     Range 1, 2 or 3        32 MHz (1 ws), 16 MHz (0 ws)
* requires range 1; the USB transceiver requires VDD>=3.V to be compliant
Limitations depending on VCORE
VCORE     Maximum CPU frequency (fCPU max)                 Programming     HSI               HSE max                             Max PLL frequency    ADC clock
                                                           Flash/EEPROM                                                          after multiply       max
Range 3   2.1 to 4.2 MHz (1 ws), 32 kHz to 2.1 MHz (0 ws)  No              Not for SYSCLK    4 MHz                               24 MHz               4 MHz
Range 2   8 to 16 MHz (1 ws), 32 kHz to 8 MHz (0 ws)       Yes             Yes               16 MHz                              48 MHz               16 MHz
Range 1   16 to 32 MHz (1 ws), 32 kHz to 16 MHz (0 ws)     Yes             Yes               32 MHz (clock), 24 MHz (crystal)    96 MHz               16 MHz
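
The frequency limits in the VCORE table can be captured in a small lookup. The sketch below is an assumption-laden illustration: the function name is hypothetical, it encodes only the upper limits for 0 and 1 Flash wait states (ws) from the table, and it ignores the 32 kHz lower bound.

```c
#include <stdint.h>

/* Flash wait states needed at freq_hz in a given VCORE range (1, 2 or 3),
 * using the limits from the table above. Returns 0 or 1, or -1 when the
 * frequency is not allowed in that range (or the range is invalid). */
int flash_wait_states(int vcore_range, uint32_t freq_hz)
{
    uint32_t max_0ws, max_1ws;
    switch (vcore_range) {
    case 1: max_0ws = 16000000u; max_1ws = 32000000u; break;
    case 2: max_0ws =  8000000u; max_1ws = 16000000u; break;
    case 3: max_0ws =  2100000u; max_1ws =  4200000u; break;
    default: return -1;
    }
    if (freq_hz <= max_0ws) return 0;
    if (freq_hz <= max_1ws) return 1;
    return -1;
}
```

A check like this could guard a DVFS routine: before lowering VCORE, confirm that the target frequency is still legal in the new range, which is the pitfall the voltage-scaling slide warns about.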
STM32L1 operating modes: RUN
- Normal operating mode
- All devices active
- Core clocked by HCLK, up to 32 MHz
STM32L1 operating modes: RUN - 2
- Power consumption can be reduced with fine-grained peripheral clock gating
- Power is also reduced with frequency scaling
STM32L1 operating modes: RUN - 3
- Clocking a peripheral increases consumption whenever the bus clock is running, so the clock driving each peripheral can be gated
- The default state at reset is gated, minimizing consumption
- Peripherals can be gated automatically when entering Sleep mode
- Be aware of non-synchronous consumption, though (GPIO sink/source, etc.)
Peripheral consumption (µA/MHz):
Peripheral      Range 1     Range 2     Range 3     LP Sleep / Run
                (32 MHz)    (16 MHz)    (4 MHz)     (65 kHz)
GPIOA           7           6           5           6
GPIOB           7           6           5           6
CRC             0.5         0.5         0.5         1
DMA1            18          15          13          18
FSMC            15          12          10          12
SYSCFG & RI     3           2           2           3
TIM9            8           7           6           7
TIM10           6           5           5           5
TIM2            13          11          9           11
TIM7            4           4           4           4
LCD             4           3           3           4
WWDG            3           2.5         2.5         3
USB             15          7           7           7
PWR             3           3           3           3
DAC             6           5           4.5         5
...
ALL             279         221         219         215
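
The cost of leaving peripheral clocks enabled can be estimated from the table. The sketch below copies a few rows of the range 1 column and sums them for a set of enabled peripherals; the enum, array and function names are hypothetical, and the model assumes that consumption scales linearly with the bus frequency in MHz.

```c
/* Per-peripheral consumption in uA/MHz at VCORE range 1 (32 MHz column
 * of the table above). Only a few rows are copied here. */
enum periph { P_GPIOA, P_DMA1, P_TIM2, P_USB, P_COUNT };

static const double ua_per_mhz_range1[P_COUNT] = {
    [P_GPIOA] = 7.0, [P_DMA1] = 18.0, [P_TIM2] = 13.0, [P_USB] = 15.0
};

/* Sum the clocking cost of the peripherals whose bit is set in enabled_mask. */
double periph_current_ua(unsigned enabled_mask, double bus_mhz)
{
    double total = 0.0;
    for (int p = 0; p < P_COUNT; p++) {
        if (enabled_mask & (1u << p))
            total += ua_per_mhz_range1[p] * bus_mhz;
    }
    return total;
}
```

For instance, keeping GPIOA and DMA1 clocked at 32 MHz costs (7 + 18) * 32 = 800 µA under this model, which is the kind of saving clock gating targets.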
STM32L1 operating modes: RUN - 4
Dynamic voltage/frequency scaling
[Figure: maximum fCPU (MHz) versus VCORE: 4 MHz at 1.2 V, 16 MHz at 1.5 V, 32 MHz at 1.8 V; the steps are labeled 171 µA/DMIPS, 200 µA/DMIPS and 230 µA/DMIPS; VDD ranges shown are 1.65-3.6 V and 2-3.6 V]
Further power savings in RUN mode:
- Run on the MSI clock: 183 µA/MHz (active mode)
- Run at full speed (32 MHz): 249 µA/MHz with 2.61 CoreMark/MHz
Values given for VDD = 3 V at 25 °C, execution from Flash. Footnote 2: run from Flash with the internal oscillator at minimum values.
STM32L1 operating modes: LPRUN
- Low-power active mode
- Some peripherals are disabled (retaining state); voltage regulator in LP mode
- Core clocked by HCLK, up to 4 MHz (must use the MSI source)
STM32L1 operating modes: LPRUN - 2
LP RUN mode: core running, peripherals kept running
- The system clock is set to the multi-speed internal (MSI) RC oscillator (131 kHz max)
- Execution from SRAM or Flash memory
- The internal regulator is in low-power mode to minimize its operating current
- Flash can be in power-down mode (when executing from RAM)
- VREFINT can be OFF
- The system clock frequency and the enabled peripherals are both limited; the overall consumption of the digital IP is limited to 200 µA
- When Flash is in power-down mode, interrupts must be mapped to RAM
STM32L1 operating modes: SLEEP
- The core is stopped and gated, waiting for a wakeup event or interrupt
- Entered with the wait-for-event (WFE) or wait-for-interrupt (WFI) instruction, or at the exit from an interrupt service routine (if configured to do so)
STM32L1 operating modes: SLEEP - 2
SLEEP mode: core stopped, peripherals kept running
Entered by executing special instructions:
- WFI (Wait For Interrupt)
  - Exit: any peripheral interrupt
- WFE (Wait For Event)
  - An event can be an interrupt enabled in the peripheral control register but NOT in the NVIC, or an EXTI line configured in event mode
  - Exit: as soon as the event occurs; no time is wasted in interrupt entry/exit
Two mechanisms to enter this mode:
- Sleep Now: enter SLEEP mode as soon as WFI or WFE is executed
- Sleep on Exit: enter SLEEP mode as soon as the lowest-priority ISR exits. The stack is not popped before entering sleep, so it does not need to be pushed again when the next interrupt occurs, which saves run time
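
The choice between Sleep Now and Sleep on Exit is made through bits of the Cortex-M System Control Register (SCR). The sketch below only computes the register value so that it can be checked on a host; the function name is hypothetical, and the SLEEPDEEP bit is included as an assumption drawn from the Cortex-M architecture (it selects the deeper modes) rather than from the slides.

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit positions in the Cortex-M System Control Register (SCB->SCR). */
#define SCR_SLEEPONEXIT (1u << 1)  /* re-enter sleep on return from the last ISR */
#define SCR_SLEEPDEEP   (1u << 2)  /* select deep sleep instead of sleep */

/* Return the SCR value to write before executing WFI or WFE. */
uint32_t scr_for_sleep(uint32_t scr, bool sleep_on_exit, bool deep_sleep)
{
    scr &= ~(SCR_SLEEPONEXIT | SCR_SLEEPDEEP);
    if (sleep_on_exit) scr |= SCR_SLEEPONEXIT;
    if (deep_sleep)    scr |= SCR_SLEEPDEEP;
    return scr;
}
```

On a target using CMSIS headers, the result would typically be written to `SCB->SCR` and followed by `__WFI()` or `__WFE()`; that part is omitted here because it cannot run off-target.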
STM32L1 operating modes: LPSLEEP
- The core is stopped and gated, waiting for a wakeup event or interrupt
- Some peripherals are disabled (retaining state); voltage regulator in LP mode
- Flash can be in power-down
STM32L1 operating modes: LPSLEEP - 2
LP sleep mode: core stopped, peripherals kept running
- Entered by executing special instructions from LPRUN mode:
  - WFI (Wait For Interrupt)
  - WFE (Wait For Event)
- The internal regulator is in low-power mode to minimize current draw
- Flash can be in power-down mode
- VREFINT can be OFF
STM32L1 operating modes: STOP
- All VCORE clocks are gated
- Voltage regulator in LP mode; SRAM and registers retain state
STM32L1 operating modes: STOP - 2
STOP mode:
- All peripheral clocks, the PLL, MSI, HSI and HSE are disabled; SRAM and register contents are preserved
- If the RTC, LCD and IWDG are running, they are not stopped
- The voltage regulator can be put in low-power mode
Wake-up sources:
- If WFI was used for entry: any EXTI line configured in interrupt mode
- If WFE was used for entry: any EXTI line configured in event mode
- EXTI sources can be: one of the 16 GPIO lines, PVD, RTC sources, comparators, USB wake-up
After resuming from STOP, the clock configuration returns to its reset state (MSI used as system clock).
Wake-up time from Stop mode on MSI RC at 4 MHz (STM32L15x, typical):
Regulator in run or low-power mode (VREFINT ON)                        7.9 µs
Regulator in run or low-power mode (VREFINT OFF with fast wakeup)      7.9 µs
STM32L1 operating modes: STOP - 3
Wake-up time from STOP mode is defined here as the time from the interrupt event to the interrupt vector fetch.
Wake-up time contributors:
- Analog delay:
  - MSI start-up: 3.5 µs
  - Regulator switch from LP to MR mode: 3.5 µs (the voltage range and the temperature both affect regulator start-up; 6.5 µs max)
  - EEPROM start-up: 3 µs max (after the regulator is ready)
- Digital delay:
  - System synchronization: 10 clock cycles
  - Interrupt vector fetch / context restoring: 20 clock cycles
- Wake-up clock: the wake-up sequence runs on MSI at the frequency selected before entering STOP mode (maximum wake-up frequency is 4.2 MHz)
Wake-up time in the datasheet: 8.2 µs typ / 9.3 µs max (range 1 and range 2)
Low-Power Modes Transitions
[Figure: state diagram of the transitions between Run, LPRun, Sleep, LPSleep, Stop1, Stop2, Standby, Shutdown and Reset. The modes differ in: core ON/OFF, maximum clock frequency, peripherals ON/OFF, regulator HP/LP, RTC ON/OFF, memories retentive/not retentive]
STM32L1 operating modes: STOP - 4
[Figure: timing diagram of the wake-up from STOP with MSI at 4 MHz. After the wake-up event: MSI goes from power-down to ready (MSI start-up, 3.5 µs max); the regulator goes from LP mode to MR mode (regulator start-up, 3.5 µs max); the EEPROM goes from IddQ mode to operating mode (EEPROM wake-up, 3 µs max). The analog delay (6.5 µs) is followed by the digital delay on the CPU clock (3 cycles, 7 cycles, 20 cycles; 2 µs), ending with the interrupt vector fetch and the start of the first ISR word fetch]
STM32L1 operating modes: STANDBY
- All VCORE clocks are gated; the HSI/MSI/HSE oscillators are off
- Waiting for an external wakeup or an RTC wakeup/alarm/tamper event (auto-wakeup)
- The voltage regulator is off; only RTC register contents are retained
STM32L1 operating modes: STANDBY - 2
STANDBY mode:
- The VCORE domain is powered off and VREFINT can be OFF
- SRAM and register contents are lost, except for registers in the standby circuitry
- RTC and IWDG are kept running in STANDBY (if enabled)
- All I/O pins are high impedance, except:
  - Reset pad (still available)
  - RTC_AF1 pin, if configured
  - WKUP1, WKUP2 and WKUP3 pins, if enabled
Wake-up sources:
- Rising edge on the WKUP1, WKUP2 or WKUP3 pins
- RTC alarm A, RTC alarm B, RTC wakeup, tamper event, timestamp
- External reset on the NRST pin
- IWDG reset
After wake-up from STANDBY mode, program execution restarts in the same way as after a RESET.
Wake-up time from STANDBY mode on MSI RC at 2 MHz (STM32L15x, typical):
STANDBY with VREFINT ON      57.2 µs
STANDBY with VREFINT OFF     2.4 ms
STM32L1 operating modes: STANDBY - 3
Standby circuitry contains:
- Low-power calendar RTC (alarm, periodic wakeup)
- 80 bytes of data in RTC registers
- Separate 32 kHz oscillator (LSE) for the RTC
- RCC CSR register: clock + LSE configuration
- The standby circuitry is reset only by an RTC domain reset
Wakeup sources:
- 3 wakeup pins (1 for MD)
- RTC alarm A or alarm B
- RTC wakeup timer
RTC alternate functions:
- Tamper detection: resets all RTC user backup registers
- RTC alarm outputs: alarms A/B, wakeup on the AF1 pin
- RTC clock calibration output
[Figure: standby circuitry block diagram: wakeup pins 1 to 3 and RTC_AF1 feeding the wakeup logic; RCC CSR register; 32 kHz oscillator (LSE); LSI; IWDG; RTC + 128 bytes of data; RTC tamper/timestamp events]
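
Choosing among the low-power modes is a trade-off between wake-up latency and retained state. The sketch below is one hypothetical selection policy, not an ST API: it uses the typical wake-up times quoted in the slides (7.9 µs from STOP, 57.2 µs from STANDBY with VREFINT on) and falls back to SLEEP when neither fits the latency budget.

```c
#include <stdbool.h>

enum lp_mode { LP_SLEEP, LP_STOP, LP_STANDBY };

/* Pick the deepest mode that meets the wake-up latency budget.
 * Typical wake-up times from the slides: 7.9 us from STOP and 57.2 us from
 * STANDBY (VREFINT on). STANDBY loses SRAM and register contents, so it is
 * skipped when need_ram_retention is true. */
enum lp_mode pick_low_power_mode(double max_wakeup_us, bool need_ram_retention)
{
    if (!need_ram_retention && max_wakeup_us >= 57.2)
        return LP_STANDBY;
    if (max_wakeup_us >= 7.9)
        return LP_STOP;
    return LP_SLEEP;
}
```

A real policy would also weigh the cost of re-initializing the system after STANDBY, since execution restarts as after a reset.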
Abstracting the complexity of Microcontrollers Low-Power modes: Real-Time Operating Systems
Polling
- Polling is when a process continually evaluates the status of a register in an effort to synchronize program execution.
- Polling is not widely used because it makes inefficient use of microcontroller resources.
Polling Example
int main(void)
{
    while (1) {
        while (!button1_pressed)
            ;                   /* busy-wait: the CPU never leaves RUN mode */
        turn_on_led();
        while (!button1_pressed)
            ;                   /* busy-wait for the next press */
        turn_off_led();
    }
}
Interrupts
- Interrupt: an event that causes the Program Counter (PC) to change. These events can be internal or external.
- Interrupt Service Routine (ISR): the set of instructions that is executed when an interrupt occurs.
- Interrupt Vector: the memory address of the first instruction in the ISR.
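Example of interrupt
main()
{
    /* some code */
    while (1)
        sleep();            /* the CPU sleeps until an interrupt wakes it */
}

interrupt void ISR1(void)
{
    turn_on_led();
    /* ... */
    /* return from interrupt (reti) */
}
[Figure: interrupt vector table mapping INT1, INT2 and INT3 to ISR1, ISR2 and ISR3]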
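
An interrupt-driven counterpart to the polling example could look like the sketch below. It is written so that it can be exercised on a host: the ISR name, the `led_on` flag and the accessor are hypothetical, and hooking `button_isr` into a real vector table is device-specific and left out.

```c
#include <stdbool.h>

/* Shared between the ISR and the main loop, hence volatile. */
static volatile bool led_on = false;

/* Hypothetical ISR for the button interrupt: toggle the LED state and return
 * quickly, as ISRs should stay short. */
void button_isr(void)
{
    led_on = !led_on;
}

/* Accessor used by the rest of the firmware (and by host-side tests). */
bool led_is_on(void)
{
    return led_on;
}
```

With this structure the main loop has nothing to poll: it can simply sleep between interrupts, which is what lets the MCU spend most of its time in a low-power mode.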
Interrupts
Why should one use interrupts?
- They provide more efficient use of microcontroller resources
- They provide a means to create firmware that can multi-task
- They enable the use of the MCU's SLEEP MODES
Interrupts
- Interrupts are often prioritized within the system and are associated with a variety of on-chip and off-chip peripherals: timers, A/D and D/A converters, UART, GPIO
- The majority of the MCU's interrupts can be enabled or disabled, either globally or individually
- Interrupts that cannot be disabled are called Non-Maskable Interrupts (NMI)
- Interrupts can occur anywhere, at any time, so ISRs should be kept short and fast
Interrupts vs. Polling
Interrupts:
- Allow more efficient use of the microcontroller: faster program execution, multi-tasking, low power
- Facilitate the development of complex firmware
- Support a modular approach to firmware design
Real-time Operating Systems
Real-time Operating System
- A real-time operating system (RTOS) is a multitasking operating system intended for real-time applications
- Mainly used in embedded applications
- Facilitates the creation of a real-time system
- A tool for the real-time software developer
- Provides a layer of abstraction between the hardware and the software
Real-time Operating System
- State: a unique operating condition of the system
- Task: a single thread of execution through a group of related states
- Task Manager: responsible for maintaining the current state of each task and for providing each task with execution time
Real-time Operating System
[Figure: a task manager driving a collection of tasks; a single task is shown as a set of states: Initial State, Idle State, Work State 1, Work State 2, Work State 3, Work State 4, Completion State]
Task
void idlestate(void)
{
    if (rx_buffer_full) {
        read_rxbuffer();
        if (syncbyte_received) {
            transition_to_workstate1();
        } else {
            stay_in_idlestate();
        }
    } else {
        stay_in_idlestate();
    }
}
[Figure: state diagram of the task: Initial State, Idle State, Work State 1, Work State 2, Work State 3, Work State 4, Completion State]
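
The idle state above can be expressed as a pure transition function, which makes each state easy to test in isolation. This is a sketch under stated assumptions: the enum, the struct and the function name are hypothetical stand-ins for the `rx_buffer_full` and `syncbyte_received` conditions in the slide.

```c
#include <stdbool.h>

enum task_state { ST_IDLE, ST_WORK1 };

/* Inputs sampled by the idle state; both are hypothetical stand-ins for the
 * rx_buffer_full and syncbyte_received conditions in the slide. */
struct rx_status {
    bool buffer_full;
    bool sync_byte_received;
};

/* One step of the idle state: returns the next state without blocking. */
enum task_state idle_state_step(const struct rx_status *rx)
{
    if (rx->buffer_full && rx->sync_byte_received)
        return ST_WORK1;
    return ST_IDLE;
}
```

Because each step returns immediately, a task manager can call it once per wake-up and put the MCU back to sleep when every task reports that it is idle.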
Real-time Operating System
Event-driven:
- Tasks are granted execution time based on an event (interrupt)
- Tasks of higher priority are executed first (interrupt priorities)
Time sharing:
- Each task is granted a given amount of time to execute
- Tasks may also be granted execution time based on events
- Creates a more deterministic multi-tasking system
Real-time Operating Systems
void main(void)
{
    initsystem();
    while (1) {
        work();
        sleep();        /* low-power mode until the timer interrupt */
    }
}

void work(void)
{
    dotask(receivemsg);
    dotask(process);
    dotask(transmitresponse);
}

interrupt void Timer_A(void)
{
    wakeup();
}
[Figure: timeline of repeating cycles, each an Initial State followed by State2 steps; the tasks are Receive Msg, Process and Transmit Response]
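
The `work()` routine above can be generalized into a small table-driven task manager. The sketch below is a minimal cooperative round-robin, assuming run-to-completion tasks; the task bodies only bump counters so that the effect is observable, and all names are hypothetical.

```c
#include <stddef.h>

typedef void (*task_fn)(void);

/* Counters so the effect of each hypothetical task can be observed. */
static int received, processed, transmitted;

static void receive_msg(void)       { received++; }
static void process_msg(void)       { processed++; }
static void transmit_response(void) { transmitted++; }

static const task_fn task_table[] = { receive_msg, process_msg, transmit_response };

/* Run every task once, in table order; called after each timer wake-up. */
void work(void)
{
    for (size_t i = 0; i < sizeof task_table / sizeof task_table[0]; i++)
        task_table[i]();
}
```

Adding a task then means adding a table entry, and the main loop stays a plain alternation of `work()` and sleep.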
Real-time Operating System
Available commercially:
- TinyOS: an open-source, component-based operating system and platform targeting wireless sensor networks (WSNs)
- Salvo: an RTOS developed by Pumpkin Inc.
- FreeRTOS: a free RTOS that runs on several architectures
- DrRTOS: works with ARM7
Implement a custom RTOS:
- Can be highly optimized to suit your application
Real-time Operating Systems
- A tool for real-time software developers
- Allows increasingly complex systems to be developed in less time
- Provides a level of abstraction between software and hardware
- Continues the evolution of microcontroller system design