U-Boot Falcon Mode. Minimizing boot times using U-Boot "Falcon" mode. Stefano Babic / Wolfgang Denk. July 2012

Similar documents
Linux FastBoot. Reducing Embedded Linux Boot Times. Embedded World Conference 2012

Achieve Fastest System Startup Sequences.

Memory memories memory

Booting Linux Fast & Fancy. Embedded Linux Conference Europe Cambridge, Robert Schwebel

The Right Approach to Minimal Boot Times

Mass-Storage. ICS332 - Fall 2017 Operating Systems. Henri Casanova

Another fundamental component of the computer is the main memory.

UFCETW-20-2 Examination Answer all questions in Section A (60 marks) and 2 questions from Section B (40 marks)

Software Architecture Division (SARD) Sony India Software Center Pvt Ltd. Copyright 2013 Sony Corporation

NAND/MTD support under Linux

Product Technical Brief S3C2416 May 2008

Input/Output Problems. External Devices. Input/Output Module. I/O Steps. I/O Module Function Computer Architecture

Organisasi Sistem Komputer

Memory Management. Mobile Hardware Resources. Address Space and Process. Memory Management Overview. Memory Management Concepts

OK335x Products Guide. Contents

Scaling Without Sharding. Baron Schwartz Percona Inc Surge 2010

File systems: management 1

Mass-Storage Structure

Misc. Third Generation Batch Multiprogramming. Fourth Generation Time Sharing. Last Time Evolution of OSs

How I survived to a SoC with a terrible Linux BSP

Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi

Product Technical Brief S3C2440X Series Rev 2.0, Oct. 2003

Memory classification:- Topics covered:- types,organization and working

Azure Sphere: Fitting Linux Security in 4 MiB of RAM. Ryan Fairfax Principal Software Engineering Lead Microsoft

Overview. Memory Classification Read-Only Memory (ROM) Random Access Memory (RAM) Functional Behavior of RAM. Implementing Static RAM

Organization. 5.1 Semiconductor Main Memory. William Stallings Computer Organization and Architecture 6th Edition

ECE 471 Embedded Systems Lecture 12

Computer Fundamentals and Operating System Theory. By Neil Bloomberg Spring 2017

Intelop. *As new IP blocks become available, please contact the factory for the latest updated info.

Embedded Systems: Architecture

Cute Tricks with Virtual Memory

AT-501 Cortex-A5 System On Module Product Brief

ARM Security Solutions and Numonyx Authenticated Flash

Test ROM for Zaccaria 1B1165 CPU Board

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

FILE SYSTEMS, PART 2. CS124 Operating Systems Fall , Lecture 24

TQ2440 Development Platform Manual

Operating Systems, Fall

ECE7995 (4) Basics of Memory Hierarchy. [Adapted from Mary Jane Irwin s slides (PSU)]

Product Technical Brief S3C2413 Rev 2.2, Apr. 2006

Full Linux on FPGA. Sven Gregori

The Memory Hierarchy 1

CENG4480 Lecture 09: Memory 1

Pharmacy college.. Assist.Prof. Dr. Abdullah A. Abdullah

Four Components of a Computer System

SAM A5 ARM Cortex - A5 MPUs

KSZ9692PB User Guide Brief

Module 5a: Introduction To Memory System (MAIN MEMORY)

Copyright 2016 Xilinx

December 1, 2015 Jason Kridner

RK3036 Kylin Board Hardware Manual V0.1

How I survived to a SoC with a terrible Linux BSP

SBC8140 Single Board Computer

Introduction to ARM LPC2148 Microcontroller

Flash File Systems Overview

Embest SOC8200 Single Board Computer

FriendlyARM. Mini2440.

Chapter 3: Operating-System Structures

CpE 442. Memory System

MYD-IMX28X Development Board

Introduction to Embedded Bootloader. Intel SSG/SSD/UEFI

Galaxy Note Root Guide. by Max Lee

CS/COE 1550

William Stallings Computer Organization and Architecture 6th Edition. Chapter 5 Internal Memory

Chapter 3: Operating-System Structures

CSE 431 Computer Architecture Fall Chapter 5A: Exploiting the Memory Hierarchy, Part 1

goals review some basic concepts and terminology

CENG3420 Lecture 08: Memory Organization

Design Choices for FPGA-based SoCs When Adding a SATA Storage }

Device Tree as a stable ABI: a fairy tale?

Microprocessor Systems

Product Technical Brief S3C2412 Rev 2.2, Apr. 2006

Track One Building a connected home automation device with the Digi ConnectCore Wi-i.MX51 using LinuxLink

MYD-SAMA5D3X Development Board

Debugging Linux With LinuxScope-JTD

IPL+UBI: Flexible and Reliable with Linux as the Bootloader

Lecture 18: DRAM Technologies

Mass-Storage. ICS332 Operating Systems

Lecture 11: Speed & Communications

Introduction to OS. Introduction MOS Mahmoud El-Gayyar. Mahmoud El-Gayyar / Introduction to OS 1

Fastboot Techniques for the x86 Architecture. Ben Biron QNX Software Systems

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

+ Random-Access Memory (RAM)

Operating-System Structures

Computing s Energy Problem:

i.mx 7 - Hetereogenous Multiprocessing Architecture

Boot time Optimization of Automotive Grade Linux. Shilu SL & Renjith G 14-Jul-2016

Key Points. Rotational delay vs seek delay Disks are slow. Techniques for making disks faster. Flash and SSDs

Last class: OS and Architecture. OS and Computer Architecture

Last class: OS and Architecture. Chapter 3: Operating-System Structures. OS and Computer Architecture. Common System Components

CPE300: Digital System Architecture and Design

Chapter 6 Storage and Other I/O Topics

CSC 553 Operating Systems

Storage Systems : Disks and SSDs. Manu Awasthi July 6 th 2018 Computer Architecture Summer School 2018

GFS: The Google File System

Computing platforms. Design methodology. Consumer electronics architectures. System-level performance and power analysis.

SparkGate7 Quick startup guide

Embedded Linux system development training 5-day session

HC508 TURBO BOARD QUICK START GUIDE. PCB rev. 4a. Connects to expansion port, no need to open your machine to install the card

Computer Science 146. Computer Architecture

Transcription:

U-Boot Falcon Mode Minimizing boot times using U-Boot "Falcon" mode Stefano Babic / Wolfgang Denk July 2012

Overview Requirements for Boot Loaders Frequently Asked For Optimizations: Boot Time Hardware Influence and Considerations Software Optimizations Changes Imposed by Recent Hardware SPL a Little Gem for Multiple Use Example: Boot into Linux/Qt quickly Things to be done Questions...

Overview Requirements for Boot Loaders Frequently Asked For Optimizations: Boot Time Hardware Influence and Considerations Software Optimizations Changes Imposed by Recent Hardware SPL a Little Gem for Multiple Use Example: Boot into Linux/Qt quickly Things to be done Questions...

Many Different Requirements End User: almost never interacts with the boot loader at all Boot OS as quickly as possible Application Engineer: flexible environment for varying software configurations Boot from any available storage devices Easy software installation, reliable software updates BSP Engineer: efficient development environment Easy to port Easy to extend Easy to debug Hardware Engineer: powerful test environment Help with board bringup Production tests Service tools to diagnose hardware problems

Boot Time Optimization Time from Power-On to Operational Mode includes: Boot Loader 0.3 s OS Initialization 3 s Application Startup 30 s Focus here: Boot Loader but remember Donald Knuth: Premature Optimization see also: Röder/Zundel: Linux FastBoot http://www.denx.de/en/documents/presentations

General Optimization Rules Avoid anything you don't really need: Code is fastest if not executed at all! Perfection is reached, not when there is no longer anything to add, but when there is no longer anything to take away.- Antoine de Saint-Exupery Initialize hardware only if it is really used by U-Boot itself, and only then. Make it run fast Caches on? Burst Mode accesses enabled? Fastest hardware? Maximum bus bandwidth? Fixed configuration vs. probing / bus scans (USB, I2C, PCI, )?

Other Optimizations Maximize resource utilization: Don't busy-wait for long running operations ( Fire and Forget ) Run several tasks in parallel Bus bandwidth vs. CPU performance: compression? Interpret Requirements Intelligently Run tests at end of boot cycle, i. e. before power-down

Hardware Areas for Optimization CPU Speed, Bus Bandwidth Boot device (memory or storage?) Boot method (execute user code or immutable boot ROM?) Software Design Trade Security for Speed: switch of checksum tests Trade Costs for Speed: use uncompressed images [may not help on fast CPUs] Implementation Compute checksums while copying/uncompressing images Avoid copy operations: make Linux accept ramdisk in NOR flash

Hardware Considerations Memory Devices: ROM, EEPROM, NOR Flash CPU can directly address individual cells in some ramge of addresses can directly provide code and data, allows XIP limited capacity, expensive fast, reliable Storage Devices: NAND Flash, SDCard, USB,... controller interface; data need to be read into memory buffer before they can be accessed SoC limitations: only small buffer space available Boot ROM limitations: read only first block (2... 128 KiB) huge capacities, cheap slow, unreliable

Hardware Considerations In the olden days: Reset CPU starts executing boot loader in ROM at reset entry point loads OS Optimizations: Fast CPU, fast ROM (usually NOR flash) Maximize bus bandwidth (32 / 64 bit bus interface) Enable caches Enable Burst Mode Accesses

Hardware Considerations Classic U-Boot: reset starts executing code in NOR flash relocation to RAM because execution from RAM typically faster (NOR usually did not support burst mode accesses) and to allow flash programming (other solutions possible) CPU performance versus bus bandwidth: CPU faster: minimize data transfers, use compressed images Bus faster: load uncompressed data

Hardware Considerations Today: Reset CPU executes on-chip boot ROM (immutable) loads X-Loader from Storage loads U-Boot from Storage loads OS Problems: complicated multi-stage boot procedure inherently slow boot ROM cannot be changed many limitations (buffer size, can read only one block of data) X-Loader duplicates U-Boot, but code (especially drivers) cannot be shared

Software Optimizations Step 1: Get rid of X-Loader create SPL (Secondary Program Loader) as separate boot stage that gets loaded and started by the boot ROM small enough to meet hardware restrictions common code base with normal U-Boot, shared drivers flexible allow to use all available hardware resources obsoletes X-Loader, UBL,... single source, common code base is much better, but not inherently faster

Software Optimizations Step 2: Make more flexible SPL basic asks: initialize RAM, load U-Boot from storage, start it generalize: initialize RAM, load some image from storage, start it implement a way to select which image to boot; for example, test a GPIO (switch, button, jumper) more flexible, but how is this faster?

Optimizing Boot Time using SPL 1st image = standard U-Boot All features available as usual suitable for development, production, service, maintenance, software updates,... Development Mode 2nd image = Linux Kernel When all you want to do is booting an OS, then do not load and run the full U-Boot at all! saves time to load (and run) several 100 KiB of code! Production Mode

Optimizing Boot Time using SPL 1st image = standard U-Boot All features available as usual suitable for development, production, service, maintenance, software updates,... Development Mode 2nd image = Linux Kernel When all you want to do is booting an OS, then do not load and run the full U-Boot at all! saves time to load (and run) several 100 KiB of code! Production Mode

Test Case: Twister Board Test setup on Twister Board : TI AM3517 CPU at 600 MHz 256 MiB DDR2 RAM 512 MiB NAND flash added jumper to select boot mode Direct boot of Linux Kernel Root FS = 24 MiB UBIFS in NAND (ELDK 5.1 QtE) with slide-show based on fbi as application See also: http://www.denx.de/wiki/u-bootdoc/falconboottwister

Falcon Boot on AM3517 Twister Board - ROM loads SPL from NAND - GPIO (Jumper) used to select boot mode => Falcon mode (shown here): - Linux kernel loaded from NAND - UBIFS in NAND mounted as root file system -> fast => Service mode: - U-Boot loaded from NAND - U-Boot can boot Linux, or... -> all features available

Slow Motion x10 Seconds after Video Start: 3.00 = +0.00 Power on (see red LED) 3.44 = +0.44 Backlight on 5.36 = +2.36 Linux Penguin 5.88 = +2.88 Penguin off 6.08 = +3.08 Qt App running

How does it work? All relevant parts of the code are already in mainline How can we handle boot arguments or device tree updates normally done in U-Boot? Setup of the system is in two stages: Using normal U-Boot, we prepare a static ( frozen ) parameter block image In Falcon Mode, the SPL just passes this parameter block to the payload So can I use this, too? Yes, if your board uses SPL.

Faster, Faster, Faster... 3 seconds is pretty lame. Company XXXXXX claims they support sub-second boot times. Yes, but did you try to do the same on your own board? Usually such results cannot be re-used: Fine-tuned to specific hardware / boot device (NOR?) Not all needed code / know-how published Synthetic use case that conflicts with real-life requirements This here is different: Only standard technology used Can be used as is in a real project We encourage you to re-use all this!

Limits for Boot Times? So can you do in less than 3 seconds or not? Yes, we can! Code is not fully optimized yet: caches are switched off in SPL due to unknown problem (runs slower with caches on) Use hardware that better supports booting fast Optimize other areas, too (size of Linux kernel image,...) But: Each additional step will need an increasing amount of efforts Further optimizations may strip functionality May quickly become highly project-specific

Things to do Convert more boards to use SPL Spread the word about the new capabilities Fix remaining issues (why is booting slower with caches enabled?) Push the last few remaining bits upstream

What's in a Name? Why the name Falcon mode? All obvious names were already in use: Fast-, Presto-, Quick-, Rapid-, Speed-, Swift-, Turbo-, You-name-it- Boot Pergrin Falcon - The world's fastest animal: The pergrin falcon can fly/dive from up to 100 to 175 miles per hour (160...280 km/h). See http://wiki.answers.com/q/what_is_the_ fastest_animal_in_the_world So Falcon Boot is just our way to say: hey, we can boot pretty quickly...

It's your turn now... Questions...

Triple Constraint Good Pick any two! Fast Cheap