Silver Bullet of Virtualization: Challenges and Concerns
May 27, 2013, v1.0
Agenda
- Introduction / Motivation
- Background
- Use Cases / Scenarios
- Open Questions / Problems
- Q & A
COGENT EMBEDDED
Introduction: who we are, what we do
- Embedded SW services/solutions company
- Working with semiconductor vendors (SOC and IP block providers) and OEM/ODMs (industrial, automotive, medical, consumer)
- Helping to make Open Source work for you
Motivation: why we talk about embedded virtualization
- Embedded industry is evolving: ARM/Intel domination, multi-core designs, Open Source
- Complexity of automotive/embedded designs is already ahead of mobile: cluster, ADAS, infotainment
- Common question from OEM/ODM/Tier-n: ARM introduced virtualization extensions and new SOCs are coming:
  - Does it solve existing problems?
  - Does it bring new (potential) problems?
  - Where does it not help?
Can we learn from desktop/server virtualization experience?
- Sandboxing and containment
- Efficient resource utilization:
  - Dynamic resource allocation
  - Fine-grained QoS control mechanisms
  - Virtualized I/O (for example, Single Root I/O Virtualization (SR-IOV) Ethernet controllers, Multi-Root I/O Virtualization (MR-IOV) storage devices)
- Typically deals with loosely coupled Guest OSes
- Data-center oriented:
  - Focus on infrastructure and manageability
  - Fast VM migration and disaster recovery
  - High availability requirements
- All about Watts/Money/Performance
How is embedded (automotive) virtualization different?
- Static, predictable behavior
- Fast boot / instant-on requirements
- Safety requirements
- Real-time requirements
- Certification
- Power management
How is embedded (automotive) virtualization different? (cont'd)
- Extensive I/O, peripherals
- Complex multi-core environment
- FPGAs
- Limitations: external I/O, memory, power budget, environmental, lifecycle
- No common hardware design *) **)
*) Image ownership and copyrights belong to Intel
**) Image ownership and copyrights belong to NVIDIA
Summary
- ARM is following the trend created by Intel/AMD:
  - Virtualization is a de facto standard in desktop/server
  - Success of cloud technologies
- ARM, Linaro and third parties are actively improving KVM and Xen, but embedded/automotive virtualization is quite different
- Is there an alternative for embedded/automotive? Shall we introduce one or contribute to Xen Embedded?
- Embedded virtualization has always been a domain for commercial/third-party solutions
- What do ARMv7 virtualization extensions bring for embedded? Is it a breakthrough or just a checkbox so far?
Background: embedded virtualization on ARM (until ARMv7 virtualization extensions)
- Embedded virtualization on ARM:
  - Full virtualization
  - Paravirtualization
[Diagram: Full vs. Para. In both columns the Guest (ARM OS) runs in user mode and the VMM (hypervisor) in supervisor mode; labels: VM port patch, trap, call]
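The difference between the two entry paths on this slide can be sketched in a few lines. This is a toy model, not any real hypervisor's interface: the `Hypervisor` class, the `write_ttbr` pseudo-instruction and the `set_ttbr` hypercall name are all made up for illustration. With full virtualization the unmodified guest executes a privileged instruction, which traps and has to be decoded and emulated; with paravirtualization the patched guest calls the hypervisor explicitly.

```python
class Hypervisor:
    """Toy VMM owning one piece of privileged state (a page-table base).

    All names here are hypothetical; no real hypervisor API is modelled.
    """

    def __init__(self):
        self.ttbr = 0
        self.traps = 0       # entries caused by trapped instructions
        self.hypercalls = 0  # entries caused by explicit calls

    def trap(self, instruction):
        """Full virtualization: decode the trapped instruction, then emulate it."""
        self.traps += 1
        opcode, operand = instruction.split()
        if opcode != "write_ttbr":
            raise ValueError(f"unhandled privileged instruction: {opcode}")
        self.ttbr = int(operand, 16)

    def hypercall(self, name, arg):
        """Paravirtualization: the patched guest requests the service by name."""
        self.hypercalls += 1
        if name != "set_ttbr":
            raise ValueError(f"unknown hypercall: {name}")
        self.ttbr = arg


def unmodified_guest(vmm):
    # The guest believes it runs privileged; the instruction traps into the VMM.
    vmm.trap("write_ttbr 0x8000")


def patched_guest(vmm):
    # The guest source was modified ("VM port") to call the VMM directly.
    vmm.hypercall("set_ttbr", 0x8000)
```

Both paths end in the same hypervisor state; what differs is who pays: the trap path needs a decoder in the hypervisor, the call path needs a patched guest.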
Paravirtualization on ARM (example)
- Microkernel (hypervisor)
- Client/server architecture, IPC
- Syscall redirection
- Emulated interrupts
[Diagram: user mode hosts the Linux kernel (with HV port/drivers) and userspace (App(s)/glibc); supervisor mode hosts the microkernel]
Paravirtualization on ARM (performance) (example)
*) Tables extracted from "Performance Evaluation of Para-virtualization on Modern Mobile Phone Platform", Yang Xu, Felix Bruns, Elizabeth Gonzalez, Shadi Traboulsi, Klaus Mott, Attila Bilgic
Paravirtualization drawbacks: overheads
- CPU virtualization overhead (system calls, IPCs): increased number of context switches
- I/O virtualization overhead:
  - Direct access to I/O from a Guest OS can be dangerous
  - I/O (DMA) can read memory that belongs to a different OS
[Diagram: user mode: Guest0 (RTOS), System Server (I/O), Guest1 (Linux kernel, userspace); supervisor mode: ukernel]
Paravirtualization drawbacks: maintenance headache
- Guest OS (e.g. Linux kernel) fork required:
  - Massive changes when adding a new (sub)architecture to the Linux kernel
  - Linux community may not like it: not much advantage of OSS
  - Mainline sync process is tough
- Hypervisor is a moving target as well:
  - Changes in the hypervisor may require changes in the Linux port
  - Hypervisor and Linux port are tightly coupled and have to be maintained together
Paravirtualization advantages
- Better control over the Guest OS:
  - Sandboxing/containment
  - Resource access is 100% controlled by the HV
- Hypervisor is implemented as pure software (easy to patch, fix, change)
- Guest OS can be untied from particular hardware
HW-assisted virtualization: ARM virtualization extensions
- CPU virtualization:
  - New HYP privilege mode (Non-Secure Privilege Level 2)
  - Instructions that cannot be executed natively are trapped into the hypervisor; the Hypervisor Syndrome Register (HSR) helps to identify the entry reason
  - Separate vector table for the hypervisor: Hypervisor Vector Base Address Register (HVBAR)
  - Hypervisor Call (HVC) and the 0x14 vector
- Memory virtualization:
  - Intermediate Physical Address, 2-stage translation (VA->IPA->PA)
  - Large Physical Address Extension (LPAE)
  - Virtual Machine IDentifier (VMID) (TLB maintenance)
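The 2-stage translation (VA->IPA->PA) can be illustrated with a minimal sketch. It assumes flat, single-level page tables modelled as Python dicts and a 4 KiB page size; real LPAE tables are multi-level and carry attributes and permissions that are left out here. The function names and table contents are made up for illustration. Stage 1 is owned by the guest, stage 2 by the hypervisor and selected by VMID.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


class TranslationFault(Exception):
    """Raised when a page has no mapping at the given stage."""


def translate(table, addr, stage):
    """Look up one stage: table maps page number -> page number."""
    page, offset = divmod(addr, PAGE_SIZE)
    if page not in table:
        raise TranslationFault(f"stage {stage} fault at {addr:#x}")
    return table[page] * PAGE_SIZE + offset


def guest_access(stage1, stage2_by_vmid, vmid, va):
    """VA -> IPA via the guest's table, then IPA -> PA via the hypervisor's."""
    ipa = translate(stage1, va, stage=1)
    return translate(stage2_by_vmid[vmid], ipa, stage=2)


# Two guests use identical stage-1 mappings but land in different physical pages.
stage1 = {0x10: 0x2}                          # guest view: VA page 0x10 -> IPA page 0x2
stage2_by_vmid = {1: {0x2: 0x80}, 2: {0x2: 0x90}}
pa_guest1 = guest_access(stage1, stage2_by_vmid, 1, 0x10123)  # 0x80123
pa_guest2 = guest_access(stage1, stage2_by_vmid, 2, 0x10123)  # 0x90123
```

The guest never sees a physical address: it manages stage 1 as if IPAs were physical, and a missing stage-2 entry faults into the hypervisor rather than exposing another guest's memory.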
HW-assisted virtualization: ARM virtualization extensions (cont'd)
- I/O virtualization:
  - Virtual interrupts: Virtual GIC, Virtual Interrupt Distributor
  - System MMU (the IOMMU of the x86 world), even more flexible: 2-stage translation, and the SMMU repeats the MMU table structure
- Is this enough? Compare with:
  - PCI-SIG Single Root I/O Virtualization (SR-IOV) and Multi-Root I/O Virtualization (MR-IOV)
  - Desktop/server video cards (do not offer virtual functions, but provide independent hardware queues that are controlled via separate register pages)
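How a System MMU contains device DMA can be sketched as follows. This is a simplified model under stated assumptions: each DMA master is identified by a stream ID, the hypervisor attaches that stream to one guest's stage-2 table, and tables are flat dicts with 4 KiB pages. `ToySmmu`, `DmaFault` and the stream ID values are hypothetical names, not a real SMMU programming interface.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


class DmaFault(Exception):
    """Raised when a device access is not covered by its translation table."""


class ToySmmu:
    """Maps each DMA stream to the stage-2 table of the guest that owns it."""

    def __init__(self):
        self.stream_to_table = {}

    def attach(self, stream_id, stage2_table):
        """Hypervisor assigns a device (stream) to a guest's address space."""
        self.stream_to_table[stream_id] = stage2_table

    def dma(self, stream_id, ipa):
        """Translate a device-issued address, or fault if it is not mapped."""
        table = self.stream_to_table.get(stream_id)
        if table is None:
            raise DmaFault(f"stream {stream_id} is not attached to any guest")
        page, offset = divmod(ipa, PAGE_SIZE)
        if page not in table:
            raise DmaFault(f"stream {stream_id}: no mapping for {ipa:#x}")
        return table[page] * PAGE_SIZE + offset
```

With the device attached to one guest's table, a DMA descriptor pointing outside that guest's memory faults instead of silently reading another OS's data, which is the hazard named on the paravirtualization drawbacks slide.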
Hypervisor enablement (with HW-assisted virtualization)
- Still a lot of work to do on the hypervisor side:
  - Boot/initialization, lifecycle management
  - Resource allocation/management
  - Capabilities/privileges management
  - IPC
  - Scheduling
  - I/O virtualization
  - Power management
[Diagram: Guest0, Guest1 ... GuestN and the System Server on top of the Hypervisor, next to the TrustZone Secure Domain]
Automotive: real-world scenario
[Diagram: separate ECUs attached to the vehicle domain through a Gateway: Instrument Cluster ECU (graphics SOC *), MCU), Infotainment ECU (IVI SOC *), MCU), Driver assistance ECU **) (MCU, DSP)]
*) Image ownership and copyrights belong to NVIDIA
**) From the EE Times article "Magna brings camera-based driver assistance systems to volume markets"
Will it evolve in the future? A giant step in consolidation
[Diagram: Cluster, Infotainment, ADAS and System domains on a Hypervisor running on a single super SOC (big.LITTLE, GPU, DSPs), connected through a Gateway MCU to the vehicle domain]
Is it feasible nowadays?
- Is there enough room to combine IVI, Cluster, Driver Assistance and other functions on a single SOC?
  - Most recent multi-core ARM SOCs seem to have enough CPU, GPU and memory resources and misc. accelerators
  - Not enough I/O interfaces (need to use companion chips, extenders, etc.)
  - How to share complex IP blocks (GPU, displays, etc.)?
Potential benefits
Hardware:
- Lower total BoM
- Space/size, wiring, weight economy
- Power consumption
- Temperature
- Less effort to design and productize
- Faster interconnect between domains
Software:
- Independent partition management
- Fast boot, instant-on
- Shut-down, restart, lifecycle
- Minimal system can always be up and running
- Easy software update and recovery
- Can enable a variety of automotive OSes simultaneously (including legacy): Linux, QNX, Windows Automotive
Consolidation: already happening
- AMP scenario
- No shared I/O (except the IPC/communication mechanism)
- Need to add knowledge about each domain
- Difficult to achieve absolute isolation
- Complexity of I/O handover from RTOS to Linux (early video/audio)
- Inefficient resource usage (the RTOS may not need the power of a big ARM core)
[Diagram: ARM-based SOC with ARM11 cores running an RTOS (communication, CAN) and Linux (SMP) (graphics, multimedia)]
Now with a super SOC: the sharing problem
- Need to isolate access to critical I/O like clocks and voltages
- Some I/O blocks may have many instances
- Difficult to share offload engines, DSPs, GPU
- May need to share companion chipsets that multiplex different functions (like a PMIC in mobile, hiding controls for audio, touch, USB and power behind I2C): a bad scenario
- May need to share a single A15/A7 core?
[Diagram: Infotainment, ADAS, Cluster and System domains on an ARM-based super SOC with A15 cluster, A7 cluster, GPU, DSP, Display, Video, CAN, I2C, Clocks, Voltages]
Virtual I/O complexities
- HW-assisted virtualization helps to minimize the impact on Guest OSes
- Still need to modify/virtualize the Guest at the BSP/driver level
- Virtual I/O support increases Hypervisor/System Server complexity (repeating complex OS drivers, sharing/QoS/priorities)
- Can we push I/O virtualization complexity further into hardware IP (like in the server world)?
[Diagram: Infotainment and Cluster guests use virtual devices (VGPU, VDisplay1, Display2, V I/O) provided by the System Server and Hypervisor on an ARM-based super SOC with A15 cluster, A7 cluster, GPU, DSP, Display1, Display2, Ethernet, I2C1, Clocks, Voltages]
I/O virtualization in embedded SOCs?
- PCI SR-IOV for embedded: realistic?
- Context-aware offload engines: a true story
Sharing ARM cores: scheduling
- Not enough ARM cores?
  - Introduce domain priorities, go with traditional fully preemptive, priority-based scheduling
  - How to schedule domains with the same priority? Cooperative scheduling is dangerous for CPU-bound tasks
  - What time-slice granularity to choose?
  - Priority inversion?
- Are we ready for big.LITTLE yet? Trade-off between performance, power consumption and deterministic behavior
[Diagram: Guest0 (UP), Guest1 (UP), Guest2 (SMP) and System Server on the Hypervisor, sharing two A15 cores, GPU and DSP]
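The scheduling questions above can be made concrete with a small sketch of fixed-priority preemptive scheduling, using round-robin time slices among domains of equal priority. It is one possible answer, not a recommendation from the slides; the `DomainScheduler` class and the domain names and priorities are assumptions chosen for illustration, and time-slice length and priority inversion are not modelled.

```python
from collections import deque


class DomainScheduler:
    """Highest priority wins; equal-priority domains rotate per time slice."""

    def __init__(self):
        self.queues = {}  # priority -> deque of runnable domain names

    def add(self, name, priority):
        """Mark a domain runnable at the given priority (higher runs first)."""
        self.queues.setdefault(priority, deque()).append(name)

    def block(self, name):
        """Remove a domain from the runnable set (e.g. it is waiting for I/O)."""
        for priority, queue in list(self.queues.items()):
            if name in queue:
                queue.remove(name)
                if not queue:
                    del self.queues[priority]

    def pick_next(self):
        """Return the domain for the next time slice, or None if all are idle."""
        if not self.queues:
            return None
        queue = self.queues[max(self.queues)]
        name = queue.popleft()
        queue.append(name)  # rotate so equal-priority peers take turns
        return name
```

For example, with a hypothetical "cluster" domain at priority 2 and "ivi" and "adas" at priority 1, `pick_next()` returns "cluster" for every slice until it blocks, and only then alternates between the other two. A CPU-bound high-priority domain therefore starves everything below it, which is why sharing a core between domains needs care.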
Power management
- Guest OS (e.g. Linux) has a state-of-the-art power management framework:
  - Static PM: sleep states
  - Dynamic PM: DVFS/CPUFreq, power states, governors, individual peripheral shutdown, CPU hotplug
- When consolidating multiple Guest OSes, the power management heuristics need to be offloaded to the hypervisor/System Server (there is no hardware yet with VM-isolated power states)
- Modify Guest OSes and design a power-management-aware hypervisor
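One way to picture power management heuristics offloaded to the hypervisor is an arbiter that collects requests from all guests sharing a resource. The sketch below assumes a single shared clock domain, a fixed list of operating points, and a simple policy (satisfy the most demanding guest; power down only when every guest agrees). The function names, guest names and frequencies are made up for illustration.

```python
def resolve_frequency(requests_khz, available_khz):
    """Pick the lowest operating point that satisfies the highest request.

    requests_khz: guest name -> requested frequency in kHz
    available_khz: operating points supported by the shared clock domain
    """
    wanted = max(requests_khz.values(), default=0)
    for freq in sorted(available_khz):
        if freq >= wanted:
            return freq
    return max(available_khz)  # request exceeds hardware limits: clamp


def may_power_down(idle_votes):
    """A shared block may be switched off only if every guest voted idle."""
    return all(idle_votes.values())


# Hypothetical operating points and guest requests.
OPERATING_POINTS = [400_000, 800_000, 1_200_000]
chosen = resolve_frequency({"linux": 600_000, "rtos": 1_000_000}, OPERATING_POINTS)
```

In this model a guest's governor no longer programs the clock directly: its request becomes one input to the arbiter, which is the kind of Guest OS modification the slide calls for.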
Summary
- No silver bullet: case-by-case analysis required
- May not even have a choice and be forced to use it: legacy SW migration, combination of multiple OSes
- Already deployed an embedded ARM paravirtualization solution? Now you can get rid of overheads and simplify the design!
- Good scenarios/use cases where a HW-assisted ARM hypervisor fits well:
  - Enhancing the AMP scenario: domain protection/isolation/management, I/O handover between domains
  - Simple, cheap peripheral sharing (or minimal I/O sharing)
Summary (cont'd)
- More advanced scenarios:
  - Increasing complexity of the hypervisor/System Server: I/O sharing, scheduling, power management
  - Embedded SOCs are not 100% ready (yet) for efficient I/O virtualization
- Answer for advanced use cases is in the hands of SOC/IP block vendors: I/O sharing
  - Silicon IP vendors can enhance their products for virtualization scenarios (e.g. GPU, DSP -> multi-context/queue support)
  - SOC vendors can integrate more IP blocks, more cores, more offload engines
- Trade-offs: saving HW costs by increasing SW design complexity, cost, maintenance headache, time to market
Summary (cont'd)
- Think about I/O sharing from the beginning. SOCs have MANY offload engines:
  - Do you really need GPU/OpenGL for simple bit blit operations?
  - Image processing on DSP or GPU?
  - Audio codecs on DSP or ARM?
- Keep things simple vs. trying to be super-flexible: sharing a single CPU in embedded is a potentially dangerous scenario
- Optimistic view:
  - ARM opened the door for virtualization
  - SOC/silicon IP vendors are working on efficient solutions: new/better SOCs + optimized SW, coming really soon
Questions, Comments
Questions, thoughts? Send your questions: hv@cogentembedded.com
Thank you!