SMP/BIOS Overview. Nov 18, 2014

Similar documents
SMP/BIOS Overview. March 19, 2015

SMP/BIOS Overview. June 3, 2013

Single thread Scheduler All processes called once each sample

Debugging with System Analyzer. Todd Mullanix TI-RTOS Apps Manager Oct. 15, 2017

Zero-Latency Interrupts and TI-RTOS on C2000 Devices. Todd Mullanix TI-RTOS Apps Manager March, 5, 2018

file://c:\documents and Settings\degrysep\Local Settings\Temp\~hh607E.htm

Frequently Asked Questions about Real-Time

Frequently Asked Questions about Real-Time

National Aeronautics and Space and Administration Space Administration. cfe Release 6.6

An Interrupt is either a Hardware generated CALL (externally derived from a hardware signal)

Windows Interrupts

Multiprocessor System. Multiprocessor Systems. Bus Based UMA. Types of Multiprocessors (MPs) Cache Consistency. Bus Based UMA. Chapter 8, 8.

Yielding, General Switching. November Winter Term 2008/2009 Gerd Liefländer Universität Karlsruhe (TH), System Architecture Group

Operating Systems. Computer Science & Information Technology (CS) Rank under AIR 100

Multiprocessor Systems. Chapter 8, 8.1

Multiprocessor Systems. COMP s1

GLOSSARY. VisualDSP++ Kernel (VDK) User s Guide B-1

CS 571 Operating Systems. Midterm Review. Angelos Stavrou, George Mason University

Announcements. Reading. Project #1 due in 1 week at 5:00 pm Scheduling Chapter 6 (6 th ed) or Chapter 5 (8 th ed) CMSC 412 S14 (lect 5)

Product Update. Errata to Z8 Encore! 8K Series Silicon. Z8 Encore! 8K Series Silicon with Date Codes 0402 and Later

Multi-core microcontroller design with Cortex-M processors and CoreSight SoC

embos Real-Time Operating System embos plug-in for IAR C-Spy Debugger Document: UM01025 Software Version: 3.1 Revision: 0 Date: May 3, 2018

DSP/BIOS Kernel Scalable, Real-Time Kernel TM. for TMS320 DSPs. Product Bulletin

embos Real-Time Operating System embos plug-in for IAR C-Spy Debugger Document: UM01025 Software Version: 3.0 Revision: 0 Date: September 18, 2017

Green Hills Software, Inc.

Threads, SMP, and Microkernels

What s An OS? Cyclic Executive. Interrupts. Advantages Simple implementation Low overhead Very predictable

CSCI-GA Operating Systems Lecture 3: Processes and Threads -Part 2 Scheduling Hubertus Franke

Interrupt/Timer/DMA 1

RTX64 Features by Release IZ-DOC-X R3

Real-Time Component Software. slide credits: H. Kopetz, P. Puschner

Speeding AM335x Programmable Realtime Unit (PRU) Application Development Through Improved Debug Tools

8086 Interrupts and Interrupt Responses:

PRU Firmware Development. Building Blocks for PRU Development: Module 2

Engineer-to-Engineer Note

CPU Scheduling. Operating Systems (Fall/Winter 2018) Yajin Zhou ( Zhejiang University

CS 326: Operating Systems. CPU Scheduling. Lecture 6

Following are a few basic questions that cover the essentials of OS:

Chap.6 Limited Direct Execution. Dongkun Shin, SKKU

Motivation. Threads. Multithreaded Server Architecture. Thread of execution. Chapter 4

AUTOBEST: A United AUTOSAR-OS And ARINC 653 Kernel. Alexander Züpke, Marc Bommert, Daniel Lohmann

Process Scheduling Queues

embos Real Time Operating System CPU & Compiler specifics for ARM core with ARM RealView Developer Suite 3.0 Document Rev. 1

LabVIEW Programming for a Multicore Environment. Stefan Kreuzer Applications Engineer National Instruments

For use by students enrolled in #71251 CSE430 Fall 2012 at Arizona State University. Do not use if not enrolled.

NI Linux Real-Time. Fanie Coetzer. Field Sales Engineer SA North. ni.com

Process Context & Interrupts. New process can mess up information in old process. (i.e. what if they both use the same register?)

GE420 Laboratory Assignment 3 More SYS/BIOS

CS A331 Programming Language Concepts

CSC 2405: Computer Systems II

Cortex-A9 MPCore Software Development

MARACAS: A Real-Time Multicore VCPU Scheduling Framework

CS370 Operating Systems

Simultaneous Multithreading on Pentium 4

An Overview of MIPS Multi-Threading. White Paper

An Operating System in Action

A unified multicore programming model

CS370 Operating Systems Midterm Review

How to get realistic C-states latency and residency? Vincent Guittot

Linux Driver and Embedded Developer

Chapter 5: CPU Scheduling. Operating System Concepts 8 th Edition,

Processes. Process Management Chapter 3. When does a process gets created? When does a process gets terminated?

ARM CORTEX-R52. Target Audience: Engineers and technicians who develop SoCs and systems based on the ARM Cortex-R52 architecture.

Using kgdb and the kgdb Internals

Operating Systems : Overview

General Objectives: To understand the process management in operating system. Specific Objectives: At the end of the unit you should be able to:

How to Get Started With DSP/BIOS II

Precept 2: Non-preemptive Scheduler. COS 318: Fall 2018

Micrium OS Kernel Labs

ò Paper reading assigned for next Tuesday ò Understand low-level building blocks of a scheduler User Kernel ò Understand competing policy goals

Lecture 3: Concurrency & Tasking

Interrupts (Exceptions) Gary J. Minden September 11, 2014

Conclusions. Introduction. Objectives. Module Topics

2018/04/06 01:40 1/11 SMP

Process Description and Control. Chapter 3

An Interrupt is either a Hardware generated CALL (externally derived from a hardware signal)

Scheduling in the Supermarket

CPU Scheduling: Objectives

Threads, SMP, and Microkernels. Chapter 4

Operating Systems: Internals and Design Principles. Chapter 4 Threads Seventh Edition By William Stallings

CS 471 Operating Systems. Yue Cheng. George Mason University Fall 2017

Real-Time Performance of Linux. OS Latency

Intel SoC FPGA Embedded Development Suite (SoC EDS) Release Notes

Process Characteristics. Threads Chapter 4. Process Characteristics. Multithreading vs. Single threading

Threads Chapter 4. Reading: 4.1,4.4, 4.5

Code Composer Studio v4. Introduction

Migrating to Cortex-M3 Microcontrollers: an RTOS Perspective

19: I/O Devices: Clocks, Power Management

The Google File System

CSE 4/521 Introduction to Operating Systems. Lecture 29 Windows 7 (History, Design Principles, System Components, Programmer Interface) Summer 2018

Chapter 5: Process Scheduling

Software Quality is Directly Proportional to Simulation Speed

Interrupts (Exceptions) (From LM3S1968) Gary J. Minden August 29, 2016

RTX64 Features by Release

Announcement. Exercise #2 will be out today. Due date is next Monday

Fix serial communications not functioning after reboot when the station number is set to iptables Fix iptables.

XDS560 Trace. Technology Showcase. Daniel Rinkes Texas Instruments

Agenda Process Concept Process Scheduling Operations on Processes Interprocess Communication 3.2

Assembling and Debugging VPs of Complex Cycle Accurate Multicore Systems. July 2009

Debugging Nios II Systems with the SignalTap II Logic Analyzer

Transcription:

SMP/BIOS Overview Nov 18, 2014!!! SMP/BIOS is currently supported only on Cortex-M3/M4 (Ducati/Benelli) subsystems and Cortex-A15 (Vayu/K2/Omap5) subsystems!!!

Agenda SMP/BIOS Overview What is SMP/BIOS? SMP/BIOS Benefits New APIs & Config Params Task Scheduling in SMP mode Hwi/Swi Scheduling in SMP mode Inter-core locking Performance and Benchmarking Getting Started CCS setup for SMP debug (Sync Group feature) Future Scope Summary Q&A

What is SMP/BIOS? What is an SMP system? Symmetric Multiprocessing (SMP) systems are composed of two or more identical processor cores that share a common view of memory and peripherals. All processors are managed by a single OS instance. What is SMP/BIOS? SMP/BIOS is a multi-core variant of SYS/BIOS designed to run on SMP systems.

SMP/BIOS Benefits A single instance of SMP/BIOS manages the concurrent execution of tasks on shared cores. Simplified development of multi-core subsystem applications. Easily maximize utilization of each core. Simple migration from multiple separate BIOS application instances to a single SMP/BIOS application instance. Need to load a single BIOS image versus 2 separate BIOS images Helps save boot time. Inherent load-balancing of Tasks. Task affinity needs to be DON T CARE to leverage load-balancing. Backward compatible with SYS/BIOS applications with certain caveats. Possible application behavioral differences (including application failure) if the application relied on task priorities to ensure exclusive access to system objects.

New APIs & Config Params BIOS Module Bool BIOS.smpEnabled (BIOS_smpEnabled) This flag is provided to manage building applications for both SMP and non-smp versions of SYS/BIOS. Task Module UInt Task_setAffinity(Task_Handle handle, UInt affinity); Used to dynamically set a task s core affinity. Can be used by a running task to move itself to another core. UInt Task_getAffinity(Task_Handle handle); Used to dynamically get a task s core affinity. Task_Handle Task_getIdleTaskHandle(Uint coreid); Returns a handle to the idle task object for the specified coreid. Task.defaultAffinity module config parameter Used to globally define default Task affinity of user created tasks. Defaults to Task_AFFINITY_NONE. Task.PARAMS.affinity instance config parameter Used to define a task s affinity at create time. Default is inherited from Task.defaultAffinity (i.e. Task_AFFINITY_NONE).

New APIs & Config Params Idle Module Existing metaonly Idle.addFunc (Function); Add idle functions only to Core 0. New metaonly Idle.addCoreFunc (Function, CoreId); Add idle functions to a specific core. Core Module (new) ti.sysbios.hal.core module UInt Core_getCoreId(); returns the current core id. const UInt Core_numCores (Core.numCores) number of smp cores Core is a proxy for target/device specific delegate module Currently bound to ti.sysbios.family.arm.ducati.smp.core on M3/M4 and ti.sysbios.family.arm.a15.smp.core on A15.

New APIs & Config Params Hwi Module New metaonly config UInt8 intaffinity[]; An array that maps an interrupt number to a coreid. Allows the application to control on which core will a Hwi run. By default, all Hwis run on core0.

Task Scheduling in SMP mode 3*N-1 2*N+3 2*N+2 2*N+1 2*N 2*N-1 N+3 N+2 N+1 N N-1 3 2 1 0......... Don t Care Ready Queues Core 1 Ready Queues Core 0 Ready Queues Core Affinity Support Each task has a core affinity (Hard CPU Affinity) 0, 1, or Task_AFFINITY_NONE Default is Task_AFFINITY_NONE Task Scheduler Ready Queues 3 ready queue sets in total, 1 per core affinity Each ready queue set contains N queues, N being the number of supported task priorities (16 by default). Each ready queue maintains a list of ready tasks that share the same priority and affinity. For coding efficiency, the three sets of ready queues are placed contiguously in memory A ready task is removed from its ready queue when it is made to run.

Task Scheduling Algorithm Core X* Task Scheduler called Pick highest priority ready task from between Current Core s ready Set and Don t Care ready Set SMP Scheduling Rule: At any given time, the two highest priority tasks that are ready to run, ARE in running state. Scheduling Algorithm: The Task scheduler on a particular core is called whenever a Task on that core becomes ready to run due to a Semaphore_post(), Event_post(), Task_sleep() timeout, etc. Core 0 task scheduler always picks the highest priority ready task from between the core 0 ready set and the don t care ready set. Core 1 task scheduler always picks the highest priority ready task from between the core 1 ready set and the don t care ready set. Pre-empted tasks are placed at the beginning of their respective ready queue while Blocked tasks that become ready are placed at the end. Return from Scheduler Interrupt other core to make its scheduler run Yes Selected ready task s priority > Running task s priority? No Yes Put currently running task on its ready queue & remove the selected highest priority ready task from its ready queue and make it run No scheduling required on current core Other Core s scheduler needs to perform a scheduling operation? No * X is current Core Id on which task scheduler is running

Hwi and Swi Scheduling in SMP mode Hwi s can be configured to run on either cores in a SMP application The Hwi module has a new module wide config parameter called intaffinity to manage which core does a Hwi run on in SMP mode. intaffinity[] is an array that maps an interrupt number to the coreid. By default, all interrupts are mapped to core 0. The core affinity cannot be changed for Interrupt numbers below 16 on Cortex-M3/M4. Such interrupts will always be serviced by core 0 in a SMP application. The core affinity cannot be changed for Interrupt numbers below 32 on Cortex-A15. Each core has its own Hwi stack on which the ISR routine runs. For design simplification, Swi s are forced to run on Core 0. Swi s can be posted from either core but will only be run on Core 0. NOTE: Unlike Non-SMP BIOS, Swis can be running on Core 0 while a Task is running on Core 1. Additionally, Hwis can be running on one core while a Task is running on the other core.

Inter-core Locking An Inter-core Lock guarantees exclusive access to Hwi/Swi/Task critical section code/data. Accessed through these Core module APIs: Core_lock() Core_unlock(). These APIs are spec d in ICore but must only be used internally by BIOS. Their implementations are hardware specific. Current Design The Inter-core lock is acquired whenever any of the three schedulers (Task, Swi or Hwi) are disabled by Task_disable(), Swi_disable(), or Hwi_disable(), and released only when all 3 schedulers are enabled. A Task or Swi disable on one core effectively disables Hwi s on the other core.

Performance Simulated 1080P load with 6 tasks, 14% task, 1% Hwi, 2% Swi loads. SYS/BIOS SMP/BIOS SMP/BIOS SMP/BIOS 6.37.00 affinity=0 affinity=x affinity=x Hwis on Core0 Distributed Hwis Task Switches 63995 69975 88768 90486 Swis 18071 19902 19868 19897 Hwis 18073 19904 34604 36046 tsk0count[0] 9082 10003 8894 8513 tsk0count[1] 0 0 1090 1486 tsk1count[0] 9084 10005 792 1977 tsk1count[1] 0 0 9193 8023 tsk2count[0] 8996 9907 8404 3079 tsk2count[1] 0 0 1482 6823 tsk3count[0] 8998 9909 7805 1674 tsk3count[1] 0 0 2082 8229 tsk4count[0] 8998 9909 2078 4656 tsk4count[1] 0 0 7808 5244 tsk5count[0] 9000 9911 1587 5051 tsk5count[1] 0 0 8301 4851 idlecount[0] 506001 262554 1831082 2070815 idlecount[1] 0 12756857 8725290 5953236 Hwi load 1 1 2 2 Swi load 2 3 3 3 tsk0 load 14 14 15 15 tsk1 load 14 14 14 14 tsk2 load 14 14 14 15 tsk3 load 14 14 14 14 tsk4 load 14 14 14 14 tsk5 load 14 14 14 14 Core 0 cpu load 89 92 49 42 Core 1 cpu load 0 0 45 53

Benchmarks M4 Benchmark results on a Vayu board running @212.8MHz * Using SYS/BIOS 6.41.01.28_eng product

Benchmarks A15 Benchmark results on a Vayu board running @588MHz using DMTimer (@19.2MHz) for timestamps * Using SYS/BIOS 6.41.01.28_eng product

Benchmarks A15 Benchmark results on a K2H board running @125MHz using Timer64 (@20.48MHz) for timestamps * Using SYS/BIOS 6.41.01.28_eng product

Getting Started Download and install the latest SYS/BIOS tools SYS/BIOS 6.41.00.26 XDCtools 3.30.04.52 Link to SYS/BIOS download pages: http://downloads.ti.com/dsps/dsps_public_sw/sdo_sb/targetcontent/bios/sysbios/index.html Porting existing SYS/BIOS applications to SMP/BIOS: Merge existing separate applications into a single application. Merge separate platform memory definitions as necessary. Add this to your existing application s config script: BIOS.smpEnabled = true; Use these SMP-aware clone modules in place of their xdc.runtime equivalents: SysMin, SysStd, LoggerBuf (in ti.sybios.smp package) The existing Load module has been tweaked to provide minimal support for SMP. For initial sanity testing, force all tasks to run on core 0: Task.defaultAffinity = 0; Once basic functionality of the merged applications has been demonstrated, either remove Task.defaultAffinity setting or replace it with: Task.defaultAffinity = Task.AFFINITY_NONE;

Getting Started Loading and running applications using CCS Load the single image on both cores and release both cores from reset simultaneously. CCS 5.4+ adds a new Sync Groups features that allows you to group the cores and treat them as a single debug entity. See SMP Debug wiki for more info: http://processors.wiki.ti.com/index.php/smp_debug Note: Using Sync Groups breaks Semi-hosting on Cortex-A15 processors. Semihosting is required for CIO to work on A15. If CIO is desired, Sync Groups should be avoided on A15. A bug has been filed with the CCS team regarding this issue. With Sync Groups, a Software breakpoint set on one core is automatically set on the other core. Also, starting/halting one core, will start/halt the other core. If using an older version of CCS that does not support Sync Groups or if it is not possible to start both cores simultaneously, start Core1 before starting Core0.

CCS setup for SMP Debug Step1: Create a Sync Group that groups the cores in the SMP sub-system.

CCS setup for SMP Debug Step2: Goto Tools->Debugger Options->Misc/Other Options and select Allow software breakpoints to be used.

CCS setup for SMP Debug Step3: Shared memory needs to be setup next. We need to tell CCS what regions of memory are shared between the dual M3/M4/A15 cores. Here s an excerpt from a gel script that is used to configure the memory as shared between the 2 cores.

CCS setup for SMP Debug Known Issues: Occasionally users may observe that the SysMin log does not get flushed to the CIO console. This happens since execution cross-triggering has been enabled and when one of the cores exits and halts, it causes the other core to halt too. The solution is to run the other core that halted before flushing the SysMin buffer. Alternatively, the Halt at program termination feature can be disabled on all cores. This feature can be disabled from the Tools -> Debugger Options -> Program/Memory Load Options window. Once this feature is disabled, the cores will not halt at exit and need to be manually paused/halted. The screenshots below show the Debug window view when such a situation occurs: Run Core 0

CCS setup for SMP Debug Known Issues: The Sync Group feature does not work correctly when debugging applications on Cortex-A15. In particular, the global breakpoints feature and Semi-hosting does not work. A bug has been filed for this issue (SDSCM00051006). A temporary workaround is to group the cores as a regular group and enable global breakpoints. When setting the breakpoints, use hardware breakpoints instead of software breakpoints and set a breakpoint on both cores. Though the global breakpoint feature is still broken, when one of the cores hits a breakpoint, the other core will halt immediately i.e. the halts will be sync ed.

Summary With SMP/BIOS, the TWO highest priorities tasks that are ready to run will be RUNNING SIMULTANEOUSLY at all times. You can not depend on Task priorities to guarantee critical section protection. Currently, there is no attempt to balance Tasks between the 2 cores (No explicit Load Balancing). Hwi s can be mapped to either M3/M4/A15 cores. Core affinity can only be set for interrupt numbers >= 16 on M3/M4 and for interrupt numbers >= 32 on A15. All Swi s are run on core 0. May be posted on either core. Swi s running on core 0 while tasks are running on core 1 may violate thread execution assumptions. You can not depend on Swi/Task priorities to guarantee critical section protection SMP aware SysMin, SysStd, LoggerBuf modules (in ti.sysbios.smp package) are provided in place of corresponding xdc.runtime equivalents. ti.sysbios.load module has been enhanced to support SMP/BIOS.

Q&A SYS/BIOS product download pages SYS/BIOS wiki page SYS/BIOS FAQs SMP/BIOS public wiki SMP/BIOS debug help Useful Links http://downloads.ti.com/dsps/dsps_public_sw/sdo_sb/targetcontent/bios/sysbios /index.html http://processors.wiki.ti.com/index.php/category:sysbios http://processors.wiki.ti.com/index.php/sys/bios_faqs http://processors.wiki.ti.com/index.php/smp/bios http://processors.wiki.ti.com/index.php/smp_debug