A Linux multimedia platform for SH-Mobile processors


Embedded Linux Conference 2009, April 7, 2009

Abstract
Over the past year I've been working with the Japanese semiconductor manufacturer Renesas, developing a Linux multimedia stack for the SH-Mobile series of application processors. These processors have an integrated DSP and dedicated hardware for video encoding and decoding, as well as various other nice features. In this presentation I'll introduce various free software components for building multimedia gadgets on this platform. I'll also discuss architecture-independent details about sharing UIO devices and using OpenMAX IL components.

Overview of topics
- Intro to libshcodecs, omxil-sh
- Resource management, UIOMux
Demo session tomorrow evening at 6:30 PM, Imperial B

SH-Mobile processors
SH-Mobile processors are 32-bit RISC application processors, comprising a system LSI with multiple function blocks.
- SH7722: SH-4, up to 266 MHz
- SH7723: SH-4A, up to 400 MHz
http://www.renesas.com/

IP Blocks
- VPU: Video Processing Unit
- VIO: CEU, VEU, BEU
- JPU: JPEG
- LCDC: LCD controller
- VOU: Video Output
- SIU: Sound I/O
- USB

VPU
Video Processing Unit:
- MPEG-4 and H.264 accelerators
- Encoding and decoding (half-duplex)
- H.264 slice processing

CEU
Camera interface:
- YCbCr 4:2:2 or 4:2:0
- horizontal/vertical sync

VEU
Video Engine Unit: image processing in memory
- colorspace conversion between YCbCr and RGB
- dithering (in RGB color subtraction)
- filtering: mirror, point symmetry, 90-degree rotation, deblocking, median

BEU
Blend Engine Unit: image blending in memory
- picture-in-picture
- graphic combining

DSP
SH7722 also has SH4AL-DSP extended functions:
- max 4 parallel operations: ALU, mult, 2x load/store
- conditional execution
- zero-overhead repeat loop control

We use existing kernel interfaces to provide access to various IP blocks:
- v4l2: Video for Linux 2
- fbdev: Framebuffer device
- UIO: User space IO

v4l2
v4l2 (Video for Linux 2) is the standard Linux kernel interface for video capture. It contains specific interfaces for selecting capture dimensions and color format. Applications can use read() to read frame data, or alternatively can use an ioctl() to stream frame data directly into memory buffers. Camera parameters like brightness and contrast can also be changed while the device is open.

fbdev
fbdev (Framebuffer device) is a standard Linux kernel interface for handling video output. It specifies routines for negotiating screen dimensions, the color map and pixel format, and for using any available acceleration functions.
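To make the negotiation of screen dimensions and pixel format concrete, here is a minimal sketch that queries a framebuffer device and computes where a pixel lives in the mapped buffer. The function names are hypothetical, and the offset calculation assumes a packed pixel format whose depth is a whole number of bytes.

```c
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/fb.h>

/* Hypothetical helper: fetch the variable (mode) and fixed (memory
 * layout) screen information. Returns 0 on success, -1 on failure. */
int fb_query(int fd, struct fb_var_screeninfo *var,
             struct fb_fix_screeninfo *fix)
{
    if (ioctl(fd, FBIOGET_VSCREENINFO, var) < 0)
        return -1;
    if (ioctl(fd, FBIOGET_FSCREENINFO, fix) < 0)
        return -1;
    return 0;
}

/* Hypothetical helper: byte offset of visible pixel (x, y) within the
 * framebuffer memory. Assumes packed pixels with bits_per_pixel a
 * multiple of 8. */
size_t fb_pixel_offset(const struct fb_var_screeninfo *var,
                       const struct fb_fix_screeninfo *fix,
                       unsigned int x, unsigned int y)
{
    return (size_t)(y + var->yoffset) * fix->line_length
         + (size_t)(x + var->xoffset) * (var->bits_per_pixel / 8);
}
```

An application would normally call fb_query() on an opened device such as /dev/fb0, mmap() the framebuffer memory, and use the offset to address individual pixels.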

UIO: overview
UIO (Userspace IO) is a fairly recent Linux kernel mechanism for allowing device driver logic to be implemented in userspace. The kernel level UIO interface itself is generic and does not have functionality specific to any kind of device. It is currently only available for char drivers (not block or network). Using UIO, the implementation of a device driver is split into two parts:
- a small kernel module which provides memory maps, and stubs for interrupt handling and IO
- a userspace device driver which uses normal read(), mmap() system calls to work with the device.

UIO: kernel module
On the kernel side, the module provides a struct specifying:
- the IRQ number, flags and a handler() function which acknowledges the interrupt
- an array of memory areas which can be mapped into userspace
- mmap(), open(), release() functions which are called from the corresponding file operations members

UIO: user space
The user space portion is a normal process which opens e.g. /dev/uio0. This returns a pollable file descriptor from which the interrupt count can be read. Normal system calls can be used to interact with this device:
- poll() waits until the file descriptor is ready to perform I/O.
- read() on this file descriptor returns the total number of events (interrupts) so far; comparing it with the previously read value gives the number of events since the last read. Both blocking and non-blocking reads are possible.
- mmap() should be used to map the memory areas described by the kernel space module.

UIO: enable interrupts

    /* Enable interrupt in UIO driver.
     * UIO expects a 32-bit value; unsigned long is 32 bits on SH-4. */
    {
        unsigned long enable = 1;

        write(uio_dev.fd, &enable, sizeof(u_long));
    }

UIO: wait for interrupt

    /* Wait for an interrupt.
     * UIO returns a 32-bit count; unsigned long is 32 bits on SH-4. */
    {
        unsigned long n_pending;

        read(uio_dev.fd, &n_pending, sizeof(u_long));
    }
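The fragments above can be combined into a slightly fuller sketch of a userspace UIO client. The function names are hypothetical. The sketch uses a fixed-width 32-bit integer for the event count so that it does not depend on the size of unsigned long, and it relies on the UIO convention that mapping N is selected with an mmap() offset of N pages.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* UIO selects memory map N with an mmap() offset of N pages. */
off_t uio_map_offset(unsigned int map_index, long page_size)
{
    return (off_t)map_index * page_size;
}

/* Hypothetical helper: map `length` bytes of UIO memory map
 * `map_index`. Returns MAP_FAILED on error, like mmap(). */
void *uio_map(int fd, unsigned int map_index, size_t length)
{
    return mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
                uio_map_offset(map_index, sysconf(_SC_PAGESIZE)));
}

/* Hypothetical helper: wait up to timeout_ms for an interrupt.
 * Returns 1 and stores the event count on success, 0 on timeout,
 * -1 on error. */
int uio_wait_irq(int fd, int timeout_ms, int32_t *event_count)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready <= 0)
        return ready;   /* 0 = timeout, -1 = poll() error */
    if (read(fd, event_count, sizeof(*event_count))
            != (ssize_t)sizeof(*event_count))
        return -1;
    return 1;
}
```

Using poll() with a timeout rather than a bare blocking read() lets the caller notice a stalled device and recover, which can matter when several IP blocks are driven from one process.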


OpenMAX
OpenMAX is a set of cross-platform APIs for multimedia codec and application portability. It consists of three layers:
- Integration Layer
- Development Layer
- Application Layer
OpenMAX is specified by the Khronos Group.

OpenMAX IL
OpenMAX IL (Integration Layer) is a standardized media component interface to integrate and communicate with multimedia codecs implemented in hardware or software. It does not provide any interfaces for synchronized capture or playback of video and audio.

OpenMAX DL
OpenMAX DL (Development Layer) APIs specify audio, video and imaging functions that can be implemented and optimized on new CPUs, hardware engines, and DSPs and then used for a wide range of accelerated codec functionality such as MPEG-4, H.264, MP3, AAC and JPEG.

OpenMAX AL
OpenMAX AL (Application Layer) provides acceleration of capture and presentation of audio, video, and images. This API is provisional: it was expected to be finalized at the end of 2008.

OpenMAX Implementations
- Bellagio is an open source implementation of OpenMAX IL, developed by STMicroelectronics and Nokia. It provides a shared library with the IL core and a reference component, and some components which pass Khronos conformance tests.
- TI have an implementation for OMAP.
- OpenCore is the multimedia framework used by the Android platform. It was developed by PacketVideo. It includes an open source implementation of OpenMAX IL.

GStreamer
GStreamer is a framework which allows applications to build an arbitrary graph of demux, decode, effects, encode, mux and I/O elements. The scope of GStreamer is roughly equivalent to OpenMAX IL and AL combined. There is an OpenMAX-GStreamer project which implements GStreamer plugin wrappers for Bellagio OpenMAX IL components.

Tunneling
In a real application you want to use various IP blocks together, for example displaying the output of the VPU on the LCD. OpenMAX and GStreamer are abstract interfaces which generally make buffers available for software processing. However, if you connect two plugins that both wrap hardware IP blocks, you want them to share buffers, pass only pointers around, and synchronize buffer access.

GStreamer negotiation
GStreamer has great facilities for discovering plugin capabilities and negotiating data formats for exchange between plugins. GStreamer-OpenMAX can use this negotiation to trigger tunneling between OpenMAX components.

- libshcodecs
- omxil-sh
- libuiomux
These libraries are available under the GNU LGPL.

libshcodecs
Callback-based encoding and decoding. Example apps:
- shcodecs-dec
- shcodecs-enc
- shcodecs-cap, shcodecs-capenc

libshcodecs: Decode
To decode video data, an application provides a callback with the following prototype:

    int vpu4_decoded (SHCodecs_Decoder * decoder,
                      unsigned char * y_buf, int y_size,
                      unsigned char * c_buf, int c_size,
                      void * user_data);

and then supplies encoded video data to the function:

    shcodecs_decode (decoder, input_buffer, length);

The decoder will process the input buffer, and call the provided callback function each time a frame is decoded. The output is given as two planes of YUV 4:2:0: a luma (Y) plane and a chroma (C) plane.

libshcodecs: Decode
There are two modes of operation:
- Frame-by-frame: data passed to shcodecs_decode() corresponds to an entire encoded frame
- Streaming: pass in arbitrary amounts of an MPEG elementary stream.
When all streaming data has been passed in, call shcodecs_decoder_finalize() to flush any constructed frames remaining in the decoder.
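As a sketch of what a decode callback might do with the two output planes, the example below writes each decoded frame to a FILE passed through user_data, and shows the plane sizes one would expect for YUV 4:2:0. The opaque SHCodecs_Decoder declaration is a stand-in so the sketch compiles without the library header, the size helpers are hypothetical, and the meaning of the callback's return value (0 to continue, nonzero on failure) is an assumption.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in declaration so this sketch compiles on its own; a real
 * program would get this type from the libshcodecs header instead. */
typedef struct SHCodecs_Decoder SHCodecs_Decoder;

/* In 4:2:0 the luma plane is full resolution, while Cb and Cr each
 * have a quarter of the samples, so the chroma plane is half the luma
 * size. Assumes even width and height. */
size_t yuv420_luma_size(int width, int height)
{
    return (size_t)width * (size_t)height;
}

size_t yuv420_chroma_size(int width, int height)
{
    return (size_t)width * (size_t)height / 2;
}

/* Frame callback with the prototype shown above: append the Y plane
 * and then the C plane to the FILE given as user_data. */
int vpu4_decoded(SHCodecs_Decoder *decoder,
                 unsigned char *y_buf, int y_size,
                 unsigned char *c_buf, int c_size,
                 void *user_data)
{
    FILE *out = user_data;

    (void)decoder;  /* unused in this sketch */
    if (fwrite(y_buf, 1, (size_t)y_size, out) != (size_t)y_size)
        return -1;
    if (fwrite(c_buf, 1, (size_t)c_size, out) != (size_t)c_size)
        return -1;
    return 0;
}
```

For a 320x240 frame these helpers give 76800 bytes of luma and 38400 bytes of chroma, which is a convenient sanity check on the y_size and c_size values a callback receives.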

libshcodecs: Encode
To encode video data, an application provides both input and output callback functions:

    /* callback for acquiring an image */
    int get_input(shcodecs_encoder * encoder, void *user_data);

    /* callback for writing encoded data */
    int write_output(shcodecs_encoder * encoder,
                     unsigned char *data, int length,
                     void *user_data);

and then runs the encoder with:

    shcodecs_encoder_run(encoder);

The encoder will retrieve raw video as needed using the input callback, and will call the output callback you have provided whenever encoded video data is available.
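The control flow of a callback-driven encoder can be illustrated with a small self-contained mock. This is not libshcodecs code: every mock_ and demo_ name is hypothetical, the "encoding" is a plain pass-through, and the return-value convention (0 to continue, positive for end of input, negative for error) is an assumption chosen for the sketch.

```c
#include <stddef.h>
#include <string.h>

typedef struct mock_encoder mock_encoder;
typedef int (*mock_input_cb)(mock_encoder *enc, void *user_data);
typedef int (*mock_output_cb)(mock_encoder *enc, unsigned char *data,
                              int length, void *user_data);

/* Mock encoder: holds the two callbacks and one tiny "frame". */
struct mock_encoder {
    mock_input_cb get_input;
    mock_output_cb write_output;
    void *user_data;
    unsigned char frame[16];
};

/* Run loop: pull a frame via the input callback, then hand the
 * "encoded" bytes (here simply the raw frame) to the output callback.
 * Stops when the input callback returns nonzero. */
int mock_encoder_run(mock_encoder *enc)
{
    for (;;) {
        int rc = enc->get_input(enc, enc->user_data);

        if (rc != 0)
            return rc < 0 ? rc : 0;  /* positive means end of input */
        rc = enc->write_output(enc, enc->frame,
                               (int)sizeof(enc->frame), enc->user_data);
        if (rc != 0)
            return rc;
    }
}

/* Example application state and callbacks. */
struct demo_state {
    int frames_left;
    size_t bytes_out;
};

int demo_get_input(mock_encoder *enc, void *user_data)
{
    struct demo_state *s = user_data;

    if (s->frames_left == 0)
        return 1;                    /* no more input */
    s->frames_left--;
    memset(enc->frame, 0x80, sizeof(enc->frame));  /* grey test frame */
    return 0;
}

int demo_write_output(mock_encoder *enc, unsigned char *data,
                      int length, void *user_data)
{
    struct demo_state *s = user_data;

    (void)enc;
    (void)data;
    s->bytes_out += (size_t)length;  /* a real app would write data out */
    return 0;
}
```

The point of the pattern is that the application never drives the codec frame by frame; it only answers requests for input and accepts output, which lets the library schedule the hardware as it sees fit.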

MPlayer
Magnus Damm developed VEU support for VIDIX, MPlayer's interface for fast user-space video output drivers. This is now in upstream MPlayer. We also have patches for:
- Video decoding using libshcodecs
- Audio decoding using AAC, MP3 DSP routines

omxil-sh
Bellagio OpenMAX IL components for:
- MP3, AAC decoding using SH7722 DSP
- MPEG4, H.264 decoding (and encoding) using the VPU

GStreamer plugins
GStreamer plugins for VPU encoding and decoding using libshcodecs are currently under development.

Resource allocation
We provide kernel interfaces for various IP blocks: VPU, VEU, (2DG, BEU, URAM, MRAM, JPG). We need to provide a simple way for applications to use any of these individually, and to allow multiple applications to access different functions of a single IP block. For example, encoding and decoding of video via the VPU is half-duplex, and we want to be able to separate these tasks. Thus we provide a resource allocation layer which manages these functions, on top of the UIO user space device driver. It handles map management and event dispatch, but does not have any specific code for the devices.

UIOMux
UIOMux multiplexes access to named resources, including devices which are available via UIO. It can also be queried for whether or not a resource is available on the currently running system. UIOMux consists of a user-level shared library, libuiomux, which manages a shared memory segment containing mutexes for each managed resource. This segment and its mutexes are shared amongst all processes and threads on the system, to provide system-wide locking. In this way, libuiomux can be used to manage contention across multiple simultaneous processes and threads.
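The general technique of keeping mutexes in shared memory can be sketched with POSIX process-shared mutexes. This is an illustration of the idea, not libuiomux's actual code: the demo_ names and the resource count are hypothetical, and the anonymous mapping used here is only visible to a process and its forked children, whereas a system-wide segment would need a named shared memory object.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on strict compilers */
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

#define DEMO_MAX_RESOURCES 4   /* hypothetical number of resources */

/* One mutex per managed resource, all living in shared memory. */
struct demo_shared_state {
    pthread_mutex_t locks[DEMO_MAX_RESOURCES];
};

/* Create the shared segment and initialise its mutexes as
 * process-shared. Returns NULL on failure. */
struct demo_shared_state *demo_shared_state_create(void)
{
    pthread_mutexattr_t attr;
    struct demo_shared_state *st;
    int i;

    st = mmap(NULL, sizeof(*st), PROT_READ | PROT_WRITE,
              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (st == MAP_FAILED)
        return NULL;

    if (pthread_mutexattr_init(&attr) != 0)
        goto fail;
    if (pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0) {
        pthread_mutexattr_destroy(&attr);
        goto fail;
    }
    for (i = 0; i < DEMO_MAX_RESOURCES; i++)
        pthread_mutex_init(&st->locks[i], &attr);
    pthread_mutexattr_destroy(&attr);
    return st;

fail:
    munmap(st, sizeof(*st));
    return NULL;
}
```

Without the PTHREAD_PROCESS_SHARED attribute, a mutex placed in shared memory is only guaranteed to work between threads of a single process, which is why the attribute matters for cross-process locking.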

UIOMux locking
UIOMux allows simultaneous locking of access to multiple resources, with deterministic locking and unlocking order to avoid circular waiting. Processes or threads requiring simultaneous access to more than one resource should lock and unlock them simultaneously via libuiomux. UIOMux will save and restore the memory-mapped IO registers associated with a UIO device. Registers are saved on uiomux_unlock() and restored on uiomux_lock(), if intervening users have used the device.
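One common way to get a deterministic locking order is to identify resources by bits in a mask and always acquire them from the lowest bit upwards. The sketch below shows that approach with ordinary pthread mutexes. The demo_ names are hypothetical, and the specific choice of ascending order for locking and descending order for unlocking is an assumption for illustration, not a description of libuiomux internals.

```c
#include <pthread.h>

#define DEMO_NUM_RESOURCES 3

/* Hypothetical resource identifiers, one bit each. */
enum {
    DEMO_RES_A = 1 << 0,
    DEMO_RES_B = 1 << 1,
    DEMO_RES_C = 1 << 2
};

static pthread_mutex_t demo_locks[DEMO_NUM_RESOURCES] = {
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER
};

/* Lock every resource named in mask, always in ascending bit order.
 * Because all callers use the same order, two callers can never each
 * hold a lock the other is waiting for. Returns 0 on success; on
 * failure, releases anything already taken and returns -1. */
int demo_lock(unsigned int mask)
{
    int i;

    for (i = 0; i < DEMO_NUM_RESOURCES; i++) {
        if (!(mask & (1u << i)))
            continue;
        if (pthread_mutex_lock(&demo_locks[i]) != 0) {
            while (--i >= 0)
                if (mask & (1u << i))
                    pthread_mutex_unlock(&demo_locks[i]);
            return -1;
        }
    }
    return 0;
}

/* Unlock every resource named in mask, in descending bit order. */
void demo_unlock(unsigned int mask)
{
    int i;

    for (i = DEMO_NUM_RESOURCES - 1; i >= 0; i--)
        if (mask & (1u << i))
            pthread_mutex_unlock(&demo_locks[i]);
}
```

A caller needing two blocks at once would pass both bits in a single call, for example demo_lock(DEMO_RES_A | DEMO_RES_C), rather than locking them one at a time in an order of its own choosing.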

Map management
The entire range mapped by the UIO device is implicitly available to all its users, so they must co-operatively share it and take care not to overwrite each other's regions. The kernel is unable to provide more finely grained protection as these regions are not fixed in size at the time of UIO initialization; for example, the size of an image buffer depends on the requested dimensions.


SH-Mobile has various IP blocks which are useful for multimedia processing. Kernel interfaces for these are in the upstream kernel, and we are developing a set of libraries and multimedia components for using them.
Demo session tomorrow evening at 6:30 PM, Imperial B
conrad@metadecks.org