
MULTIMEDIA ADAPTATION FOR DYNAMIC ENVIRONMENTS

Maija Metso, Antti Koivisto and Jaakko Sauvola
Machine Vision and Media Processing Group, Infotech Oulu, University of Oulu
PO BOX 444, 90570 Oulu, FINLAND
tel. +358-8-553 2611, fax +358-8-553 2612
e-mail: maija@ee.oulu.fi, http://ee.oulu.fi/ee/info.laboratory

Abstract - This paper presents a new approach for adapting a multimedia presentation to heterogeneous and dynamically changing environments. A novel media adaptation taxonomy is proposed that includes synchronized media optimization and scaling. The taxonomy incorporates multi-dimensional interaction between media, service requirements and presentation. Network properties, terminal types and integration capabilities differ between presentation requirement profiles. Understanding the relationship between media information and these service characteristics benefits the presentation process. Using our taxonomy, a concrete service presentation is shown, with adaptation properties in different situations and for different types of terminals. The advantages of the content-based adaptation taxonomy are demonstrated with an example of an entire (end-to-end) multimedia service system in a dynamic environment.

Keywords: Media adaptation, Dynamic environments, Content understanding

INTRODUCTION

Multimedia as a service and an information framework has gained wide attention due to the penetration of the Internet, media processing, advancements in office automation, and emerging mobile multimedia communications technologies [1]. Mobile multimedia in particular is a subject of increasing interest. This is related to the fact that humans need to be mobile, and therefore use mobile services in varying environments. This progress places a strong emphasis on multimedia presentation and communication technology. While multimedia and mobile networks develop rapidly [2], new means are needed for preparing and delivering multimedia information over the extending end-to-end system paths [3].
Emerging mobile data communication and work models set new requirements for multimedia as a foundation for services. Eventually, we are forced to focus on the information itself. Different media types (e.g. audio, video, synthetic) contain a vast amount of value-adding information that can be extracted directly from the media itself or through media analysis and understanding techniques. This is the basis of the rapidly increasing research effort in document analysis [4] and in audio, image and video processing. Emerging services based on multimedia, and on the integration of complex systems with multiple, deforming elements, can benefit immensely from an optimized combination of media content, value-adding media information and system knowledge. This

knowledge can be used to adapt the media for the service domain, thus meeting the requirements and enabling the creation of new types of multimedia services. In this paper we address this problem with a new taxonomy for media-, multimedia- and service-based adaptation. At the end we present a concrete example of using the proposed taxonomy in a demanding multimedia service task: video surveillance in a dynamic environment.

TAXONOMY OF MULTIMEDIA ADAPTATION

Media scaling and optimization are two separate but related issues. Together they form a taxonomy hierarchy that we call adaptation. It is exploited in media object preparation, transmission and presentation over an entire service domain. The adaptation process is guided by requirements obtained from the application and set by the content domain, and the output of the process is used to form a presentation, Fig 1. The scaling process modifies the physical characteristics of the media, while the optimization deals mostly with semantic features but also influences the physical nature of the media. Adaptation is required in dynamic (unstable) environments.

Figure 1. Taxonomy of media adaptation.

Media scaling refers to altering the content of media objects in different physical, semantic and functional domains, Fig 1. Scaling may be a fundamental conversion of the physical format of media data (i.e. information with presentation features) from one standard to another, or the removal of unnecessary or excessive information. Based on knowledge of the physical characteristics of the media, its resolution can be rendered dynamically. Media optimization refers to finding the right type and amount of information that ought to be presented over the end-to-end system path. Media optimization is needed when the environment is highly heterogeneous, or when the presentation events are subject to changes that can be predicted robustly.
The actual presentation environment usually comprises multiple networks and different terminals with varying capabilities. To keep transfer time, delivery cost and the use of terminal capabilities acceptable, the media objects must be (re)constructed and transferred in an optimized way.
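The scaling half of the taxonomy can be illustrated with a short sketch. This is not the authors' implementation; the function name and the policy (never upscale, preserve aspect ratio, clamp the frame rate to the terminal) are illustrative assumptions.

```python
def scale_video(src_w, src_h, src_fps, term_w, term_h, term_fps):
    """Scale spatial and temporal resolution down to the terminal's limits.

    The scale factor preserves the source aspect ratio and is capped at 1.0,
    so media is never upscaled beyond its physical characteristics.
    """
    factor = min(term_w / src_w, term_h / src_h, 1.0)
    return int(src_w * factor), int(src_h * factor), min(src_fps, term_fps)

# A 320x240 stream at 30 fps shown on a 160x120, 5 fps handheld display:
print(scale_video(320, 240, 30, 160, 120, 5))   # (160, 120, 5)
```

Optimization, by contrast, would decide *which* information to send at all, e.g. dropping a stream entirely when its semantic content adds nothing for the current service context.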

ADAPTATION TO SERVICE ENVIRONMENT

When applying the presented taxonomy in actual media processing applications, several complex system variables need to be considered. Rules for service adaptation are derived from the information content of the media, Quality of Service (QoS) characteristics and synchronization needs, which in turn are determined from the service and system context, Fig 2. Media properties that affect the entire adaptation sequence can be divided into physical, semantic and functional entities. Functional characteristics such as media context service control include user-driven and system-specific functions in varying or deforming environments, e.g. terminal change, service context switch or service (re)parametrization on the fly. Multimedia systems specify physical requirements for synchronization, timing and processing.

Figure 2. Factors affecting the media adaptation process.

Media adaptation is guided by specific QoS parameters, such as the complexity of media objects in terms of available resources and the requirements set by the user process. QoS parameters include both transfer channel resources, such as bandwidth and reliability, and client-specific capabilities, such as preferred display resolution. They dictate the transmission parameters for the required presentation in the service implementation. If the parameters are unknown, the quality of the presentation can be unsatisfactory: the video stream can be discontinuous, the audio signal can be broken, or media transfer can suffer from long latencies. To address this, a dynamic QoS negotiation mechanism is used at the content and transmission levels of the end-to-end multimedia path; otherwise it would be difficult to guarantee robust performance [5]. A typical criterion is relative picture quality, which can include variables such as spatial resolution.
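One simple way to picture such a negotiation step is clipping the requested presentation profile against what the channel and the terminal each report. This is a hypothetical sketch, not the paper's mechanism; the parameter names are illustrative.

```python
def negotiate(requested, channel, terminal):
    """Return adapted presentation parameters: for each requested parameter,
    take the minimum of the request, the channel's limit and the terminal's
    limit (a missing limit means 'no constraint')."""
    return {key: min(requested[key],
                     channel.get(key, requested[key]),
                     terminal.get(key, requested[key]))
            for key in requested}

requested = {"fps": 30, "width": 320, "height": 240}
channel = {"fps": 12}                        # congested wireless link
terminal = {"width": 160, "height": 120}     # small handheld display
print(negotiate(requested, channel, terminal))
# {'fps': 12, 'width': 160, 'height': 120}
```

A real negotiation would also renegotiate on the fly as the channel degrades, which is what makes the mechanism dynamic.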
In order to adapt the service presentation, the features should be scaled according to physical system requirements such as terminal and network properties. Information needed for the synchronization of a presentation is usually carried in the object headers. Synchronization is needed between multiple streams, for example when audio and video are generated at separate sources and combined into one presentation. Synchronization points are determined from time indexing or content/event information. The main issue in a media-based adaptation taxonomy is understanding the twofold nature of media in a digital service context: its properties can be analyzed and applied from both physical and semantic points of view, Fig 3. Physical content carries the actual information in a specific format with a certain structure and organization. For example, the physical difference between two images is determined from their headers. This information does not change as a function of time and can be used for scaling purposes.

Figure 3. Utilization of media content understanding in adaptation.

The semantic difference between two consecutive objects is determined from their contents, and it can be used to find aliased information or to interpolate excessively quantized data. In the case of a video stream, the temporal relationships of sequential frames can be utilized, for example, in video compression, keyframe extraction or scene detection, Fig 4.

Figure 4. Media in physical, semantic and temporal domains.

EXAMPLE OF ADAPTATION - VIDEO SURVEILLANCE

We demonstrate the proposed media adaptation taxonomy with a video surveillance service. Consider a scene where multiple surveillance cameras are connected to a distributed video delivery system. The system detects movement and generates alerts to guards equipped with multimedia PCs (MMPC) or wireless Personal Digital Assistants (WPDA). The principle of the video surveillance application is presented in Fig 5.
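The use of semantic, inter-frame differences for keyframe extraction can be sketched as follows. This is a minimal illustration, not the paper's detector: frames are toy grayscale matrices and the threshold is an arbitrary assumption.

```python
def frame_difference(a, b):
    """Mean absolute pixel difference between two equally sized frames."""
    total = sum(abs(pa - pb)
                for row_a, row_b in zip(a, b)
                for pa, pb in zip(row_a, row_b))
    return total / (len(a) * len(a[0]))

def keyframes(frames, threshold=10.0):
    """Indices of frames whose semantic difference from the previous frame
    exceeds the threshold (a crude scene-change detector)."""
    picks = [0]                      # the first frame is always a keyframe
    for i in range(1, len(frames)):
        if frame_difference(frames[i - 1], frames[i]) > threshold:
            picks.append(i)
    return picks

dark = [[50, 50], [50, 50]]
bright = [[200, 200], [200, 200]]
print(keyframes([dark, dark, bright, bright]))   # [0, 2]
```

In a surveillance setting, the same difference signal doubles as a motion detector: frames that differ little from their predecessors can be dropped before transmission.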

Physical and semantic content of the video are used for detection. In order to adapt the media objects to specified parameters, such as maximum tolerated latency, the objects must be presented, controlled and retrieved in a manner that also considers the various terminals, networks, processing and transmission of the service.

Figure 5. A block diagram of the video surveillance application.

Two types of networks and terminals are used. The wireless network is capable of transfer rates of 56 kbps-2 Mbps, while the MMPC is connected to a 10 Mbps Ethernet. To plan for error conditions together with protocol stack limits and the available channel capacity, we define the best-case throughput for WLAN/Ethernet as 200 kBps/1 MBps and the worst case as 10 kBps/100 kBps. The WPDA has a 160x120 display, which can present an MPEG-2 stream at 5 frames per second (fps). The MMPC can display 30 fps at 320x240x24 resolution. The following six variables are used for analyzing adaptation performance: bytes, fps and resolution, each both sent and received. A real-time video stream with extracted textual data was sent to both terminals in unadapted and adapted form. The unadapted stream takes up to 0.5 MBps of bandwidth. Our application can alter both the spatial and temporal resolution; for example, a 160x120 video at 1 fps requires only 0.01 MBps. Values are presented in Table 1 under two load cases, adapted (A) and unadapted (NA). The adaptation results in more efficient usage of bandwidth and terminal resources, and decreases the amount of unnecessary data transfer substantially. For example, if the video resolution is scaled to a half and the fps rate is decreased to 1/6 in the AWPDA case, the transmission load of the service process is decreased by a factor of 2.5, while the perceived video quality is even better due to a more continuous stream. Controlled processing of streaming media and adaptation are needed to match the presentation to connection and terminal capabilities.
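A back-of-the-envelope check of the figures quoted above (the rates are taken from the text; the script only verifies the arithmetic):

```python
# Reported stream rates for the WPDA terminal (values from the text above).
unadapted_mbps = 0.5   # unadapted stream, up to 0.5 MBps
adapted_mbps = 0.2     # adapted (AWPDA) stream as sent

# Transmission load reduction, matching the factor of 2.5 claimed above.
reduction = unadapted_mbps / adapted_mbps
print(f"transmission load reduced by a factor of {reduction:.1f}")

# The raw pixel rate shrinks even faster: assuming "half resolution" means
# half per spatial dimension (1/4 of the pixels) and the frame rate drops
# to 1/6, the uncompressed data rate falls by 4 * 6 = 24x. The compressed
# stream shrinks less because per-stream overheads do not scale away.
raw_factor = 4 * 6
print(f"raw pixel rate reduced by a factor of {raw_factor}")
```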
In our example, the video processing and object control, involving process timing information and the utilization of extracted semantic data and retrieval control, are distributed among dedicated servers. When altering the frame rate and resolution along with scalable presentations, the proposed system maps all three aspects of our media adaptation taxonomy (media scaling, optimization and end-to-end requirements) into the system domain. If media is adapted according to the network load and terminal capabilities, bandwidth and transmission latency are reduced with little effect on presentation quality.

Property / Environment    NAMMPC   AMMPC    NAWPDA    AWPDA
MBps sent                 0.5:0.5  0.5:0.1  0.5:0.5   0.2:0.01
MBps received             0.5:0.1  0.5:0.1  0.2:0.01  0.2:0.01
fps sent                  30:30    30:12    30:30     5:1
fps presented             30:6     30:12    5:1       5:1
relative res sent         1:1      1:0.5    1:1       0.5:0.5
relative res presented    1:1      1:0.5    0.5:0.5   0.5:0.5

Table 1: Adaptation of video surveillance media.

SUMMARY

This paper proposed a new taxonomy for multimedia adaptation. The adaptation taxonomy exploits media analysis and understanding techniques together with parameters extracted from network, processing and terminal property evaluation, and is accordingly divided into scaling and optimization. This division is very useful when designing new multimedia services, since it gives a solid basis for implementing optimal processing and transmission tasks within the terminal and network (computing) environment. A video surveillance service was shown in which the proposed taxonomy was applied along an end-to-end path. In comparison to non-adapted use of media, the results indicated a clear improvement.

References

[1] Wu C. & Irwin J. Emerging Multimedia Communication Technologies, Prentice Hall, 443 pages, 1998.
[2] Kummerfield R.J. Server Initiated Delivery of Multimedia Information, Proceedings of WWCA '97, Tsukuba, Japan, section C3, 1997.
[3] Dhawan C. Mobile Computing, McGraw-Hill, 579 pages, 1997.
[4] Proceedings of the International Conference on Document Analysis and Recognition, 1991 Japan, 1993 France, 1996 Canada, 1997 Germany.
[5] Aurrecoechea C., Campbell A. & Hauw L. A Survey of QoS Architectures, Multimedia Systems Journal, Special Issue on QoS Architecture, May 1998.

MULTIMEDIA ADAPTATION FOR DYNAMIC ENVIRONMENTS Maija Metso, Antti Koivisto and Jaakko Sauvola Machine Vision and Media Processing Group Infotech Oulu, University of Oulu PO BOX 444, 90570 Oulu, FINLAND tel. +358-8-553 2611, fax. +358-8-553 2612 e-mail:maija@ee.oulu.fi, http://ee.oulu.fi/ee/info.laboratory Abstract - This paper presents a new approach for adapting a multimedia presentation into heterogeneous and dynamically changing environments. A novel media adaptation taxonomy is proposed that includes synchronized media optimization and scaling. The taxonomy incorporates multi-dimensional interaction between media, service requirements and presentation. Network properties, terminal types and integration capabilities differ in presentation requirement profile(s). Understanding the relationship between media information and these service characteristics benefits the presentation process. Using our taxonomy, a concrete service presentation are shown, with adaptation properties in different situations and for different types of terminals. The advantages of the content-based adaptation taxonomy are demonstrated with an example of an entire (end-to-end) multimedia service system in a dynamic environment. Keywords: Media adaptation, Dynamic environments, Content understanding INTRODUCTION Multimedia as a service and an information framework has gained wide attention, due to penetration of Internet, media processing, advancements in office automation, and emerging mobile multimedia communications technologies [1]. Especially, the mobile multimedia is a subject of increasing interest. This is related to the fact that humans need to be mobile, and therefore use mobile services in varying environments. This progress puts a huge emphasis on multimedia presentation and communication technology. While multimedia and mobile networks develop rapidly [2], new means for preparing and delivering multimedia information over the extending end-to-end system path(s) are needed [3]. 
Emerging mobile data communication and work models set new requirements for multimedia as foundation for services. Eventually, we are forced to focus in the information itself. Different media types (e.g. audio, video, synthetic) contain a vast amount of value adding information that can be extracted directly from the media itself or by the use of media analysis and understanding techniques. This is the basis of rapidly increasing research effort in document analysis [4], audio, image and video processing. Emerging new services based on multimedia and on the integration of complex systems having multiple, deforming elements, can benefit immensely from the optimized combination of media content, value adding media information and system knowledge. This

knowledge can be used to adapt the media for the service domain, thus meeting the requirements and enabling the creation of new types of multimedia services. In this paper we address this problem present with a new taxonomy for media, multimedia and service based adaptation. At the end we present a concrete example of using the proposed taxonomy in an demanding multimedia service task: video surveillance in a dynamic environment. TAXONOMY OF MULTIMEDIA ADAPTATION Media scaling and optimization are two separate but at the same time related issues. Together they form a taxonomy hierarchy that we call adaptation. It is exploited in media object preparation, transmission and presentation over an entire service domain. The adaptation process is guided by the requirements obtained from the application and set by the content domain, and the output of the process is used to form a presentation, Fig 1. The scaling process modifies the physical charasteristics of the media while the optimization deals mostly with semantic features but also influences the physical nature of the media. The adaptation is required in dynamic (unstable) environments. Figure 1. Taxonomy of media adaptation. Media scaling refers to altering the content of media objects in different physical, semantic and functional domains, Fig 1. Scaling may be a fundamental conversion of the physical format of media data (i.e. information with presentation features) from one standard to another, or removal of unnecessary or excessive information. Based on the knowledge of the physical characteristics of the media, its resolution can dynamically be rendered accordingly. Media optimization refers to finding the right type and amount of information that ought to be presented over the end-to-end system path. Media optimization is needed when the environment is highly heterogeneous, or when the presentation events are subject to changes, which can be predicted robustly. 
The actual presentation environment usually comprises multiple networks and different terminals with varying capabilities. To maintain transfer time, delivery cost and the usability of terminal capabilitities, the media objects must be (re)constructed and transferred in an optimized way.

ADAPTATION TO SERVICE ENVIRONMENT When applying the presented taxonomy in actual media processing applications, several complex system variables need to be considered. Rules for service adaptation are derived from the information content of media, Quality of Service characteristics and synchronization needs, which in turn are determined from the service and system context, Fig 2. Media properties that affect the entire adaptation sequence can be divided into physical, semantic and functional entities. Functional characteristics such as media context service control include user-driven and system specific functions, in varying or deforming environments, e.g. terminal change, service context switch or service (re)parametrization on-the-fly. Multimedia systems specify physical requirements for synchronization, timing and processing. Figure 2. Factors affecting the media adaptation process. Media adaptation is guided by specific Quality of Service parameters, such as the complexity of media objects in terms of available resources and the requirements set by the user process. QoS parameters include both the transfer channel resources such as bandwidth and reliability, and the client-specific capabilities such as preferred display resolution. They dictate the transmission parametrics for the required presentation in service implementation. If the parameters are unknown, the quality of the presentation can be unsatisfactory. For example, video stream can be discontinuous, audio signal can be broken or media transfer can suffer from long latencies. To address this, a dynamic QoS-negotiation mechanism is used at the content- and transmission levels of the end-to-end multimedia path otherwise it would be difficult to guarantee robust performance [5]. For example, a typical criteria is relative picture quality, which could include variables such as spatial resolution. 
In order to adapt the service presentation, the features should be scaled according to physical system requirements such as terminal and network properties. Information needed in the synchronization of a presentation is usually carried in the object headers. Synchronization is needed between multiple streams, for example, when audio and video are generated in separate sources and combined into one presentation. Synchronization points are determined from time indexing or content/ event information. The main issue in media-based adaptation taxonomy is understanding the twofolded nature of media in digital service context: Its properties can be analyzed and applied from both physical and semantic point of view, Fig 3. Physical content car-

ries the actual information in a specific format with a certain structure and organization. For example, the physical difference between two images is determined from the headers. This information does not change as a function of time and can be used for scaling purposes. Figure 3. Utilization of media content understanding in adaptation. The semantic difference between two consecutive objects is determined from their contents and it can be used to find aliased information or to interpolate excessively quantized data. In case of a videostream, the temporal relationships of sequential frames can be utilized for example in video compression, keyframe extraction or scene detection, Fig 4. Figure 4. Media in physical, semantic and temporal domain. EXAMPLE OF ADAPTATION - VIDEO SURVEILLANCE We demonstrate the proposed media adaptation taxonomy with a video surveillance service. Consider a scene, where multiple surveillance cameras are connected to a distributed video and delivery system. The system detects movement and generate alerts to guards equipped with multimedia PCs or Personal Digital Assistants. The principle for the video surveillance application are presented in Fig 5. The

Physical and semantic content of video are used for detection. In order to adapt the media objects for specified parameters such as maximum tolerated latency, the objects must be presented, controlled and retrieved in a manner that also considers various terminals, networks, processing and transmission of the service. Figure 5. A block diagram of the video surveillance application. Two types of networks and terminals are used. The wireless network is capable of transfer rates of 56kbps-2Mbps, while the MMPC is connected to 10Mbps the Ethernet. To plan the error conditions together with protocol stack limit and available channel capacity, we define the best-case throughput for WLAN/Ethernet as 200kBps/1MBps and worst-case as 10kBps/100 kbps. The WPDA is supplied with a 160x120 display, which can present MPEG-2 -stream at 5 frames per second (fps). The MMPC can display 30 fps with 320x240x24 resolution. Following six variables are used for analyzing the adaptation performance: sent/ received bytes, fps and resolution, both sent and received. Real-time video stream with extracted textual data was sent to both terminals in unadapted and adapted form. The unadapted stream takes up to 0.5MB of bandwidth. Our application can alter both the spatial and temporal resolutions, for example a 160x120 video at 1 fps requires only 0.01 MBps. Values are presented in Table 1 under two load cases adapted (A) and unadapted (NA). The adaptation results in a more efficient usage of bandwidth and terminal and decreases the amount of unnecessary data transfer substantially. For example, if the video resolution is scaled to a half and fps rate is decreased to 1/6 in AWPDA-case, the transmission load of the service process is decreased by a factor of 2.5 while perceived video quality is even better due to more continous stream. Controlled processing of streaming media and adaptation are needed to match the presentation to connection and terminal capatibilites. 
MULTIMEDIA ADAPTATION FOR DYNAMIC ENVIRONMENTS

Maija Metso, Antti Koivisto and Jaakko Sauvola
Machine Vision and Media Processing Group, Infotech Oulu, University of Oulu
PO BOX 444, 90570 Oulu, FINLAND
tel. +358-8-553 2611, fax +358-8-553 2612
e-mail: maija@ee.oulu.fi, http://ee.oulu.fi/ee/info.laboratory

Abstract - This paper presents a new approach for adapting a multimedia presentation to heterogeneous and dynamically changing environments. A novel media adaptation taxonomy is proposed that includes synchronized media optimization and scaling. The taxonomy incorporates the multi-dimensional interaction between media, service requirements and presentation. Networks, terminal types and integration capabilities differ in their presentation requirement profiles, and understanding the relationship between media information and these service characteristics benefits the presentation process. Using our taxonomy, a concrete service presentation is shown, with adaptation properties in different situations and for different types of terminals. The advantages of the content-based adaptation taxonomy are demonstrated with an example of an entire (end-to-end) multimedia service system in a dynamic environment.

Keywords: Media adaptation, Dynamic environments, Content understanding

INTRODUCTION

Multimedia as a service and information framework has gained wide attention due to the penetration of the Internet, advances in media processing and office automation, and emerging mobile multimedia communication technologies [1]. Mobile multimedia in particular is a subject of increasing interest, since people need to be mobile and therefore use mobile services in varying environments. This progress places heavy demands on multimedia presentation and communication technology. While multimedia and mobile networks develop rapidly [2], new means are needed for preparing and delivering multimedia information over the extending end-to-end system path(s) [3].
Emerging mobile data communication and work models set new requirements for multimedia as a foundation for services. Ultimately, we are forced to focus on the information itself. Different media types (e.g. audio, video, synthetic) contain a vast amount of value-adding information that can be extracted directly from the media itself or through media analysis and understanding techniques. This is the basis of the rapidly increasing research effort in document analysis [4] and in audio, image and video processing. Emerging services based on multimedia, and on the integration of complex systems with multiple, deforming elements, can benefit immensely from an optimized combination of media content, value-adding media information and system knowledge. This knowledge can be used to adapt the media to the service domain, thus meeting the requirements and enabling the creation of new types of multimedia services. In this paper we address this problem by presenting a new taxonomy for media-, multimedia- and service-based adaptation. Finally, we present a concrete example of applying the proposed taxonomy to a demanding multimedia service task: video surveillance in a dynamic environment.

TAXONOMY OF MULTIMEDIA ADAPTATION

Media scaling and optimization are two separate but related issues. Together they form a taxonomy hierarchy that we call adaptation, which is exploited in media object preparation, transmission and presentation over an entire service domain. The adaptation process is guided by requirements obtained from the application and set by the content domain, and its output is used to form a presentation (Fig 1). The scaling process modifies the physical characteristics of the media, while optimization deals mostly with semantic features but also influences the physical nature of the media. Adaptation is required in dynamic (unstable) environments.

Figure 1. Taxonomy of media adaptation.

Media scaling refers to altering the content of media objects in different physical, semantic and functional domains (Fig 1). Scaling may be a conversion of the physical format of media data (i.e. information with presentation features) from one standard to another, or the removal of unnecessary or excessive information. Based on knowledge of the physical characteristics of the media, its resolution can be adjusted dynamically. Media optimization refers to finding the right type and amount of information to present over the end-to-end system path. Media optimization is needed when the environment is highly heterogeneous, or when the presentation events are subject to changes that can be predicted robustly.
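The scaling half of the taxonomy can be sketched in a few lines of code: physical characteristics such as resolution and frame rate are clamped to what the receiving terminal can present. The class names, fields and the example profiles below are our own illustrative assumptions, not part of the paper's system.

```python
from dataclasses import dataclass

# Hypothetical descriptors; names and fields are illustrative assumptions.
@dataclass
class MediaProfile:
    width: int
    height: int
    fps: int

@dataclass
class TerminalProfile:
    max_width: int
    max_height: int
    max_fps: int

def scale_media(media: MediaProfile, terminal: TerminalProfile) -> MediaProfile:
    """Media scaling: clamp physical characteristics (resolution, frame
    rate) to what the terminal can actually present."""
    return MediaProfile(
        width=min(media.width, terminal.max_width),
        height=min(media.height, terminal.max_height),
        fps=min(media.fps, terminal.max_fps),
    )

# A 320x240@30fps source scaled for a 160x120@5fps PDA-class terminal.
adapted = scale_media(MediaProfile(320, 240, 30), TerminalProfile(160, 120, 5))
print(adapted)  # MediaProfile(width=160, height=120, fps=5)
```

Optimization, by contrast, would decide *which* information to send (e.g. only alert-relevant segments), not merely how coarsely to send it.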
The actual presentation environment usually comprises multiple networks and different terminals with varying capabilities. To keep transfer times and delivery costs under control and to make full use of terminal capabilities, the media objects must be (re)constructed and transferred in an optimized way.
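To make the transfer-time concern concrete, a back-of-the-envelope helper (ours, not the paper's) estimates delivery time for a media object over different channels; the figures used are the best- and worst-case WLAN throughputs quoted later in the surveillance example.

```python
def transfer_time_s(object_bytes: int, channel_Bps: float) -> float:
    """Idealized transfer time: object size divided by channel throughput.
    Ignores protocol overhead and retransmissions."""
    return object_bytes / channel_Bps

# One second of an unadapted 0.5 MBps stream over the WLAN channel
# (best case 200 kBps, worst case 10 kBps).
size = 500_000  # bytes
print(transfer_time_s(size, 200_000))  # 2.5 (seconds, best case)
print(transfer_time_s(size, 10_000))   # 50.0 (seconds, worst case)
```

The fifty-second worst case is exactly why the media object must be reconstructed (scaled down) before transfer rather than sent as-is.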

ADAPTATION TO SERVICE ENVIRONMENT

When applying the presented taxonomy in actual media processing applications, several complex system variables need to be considered. Rules for service adaptation are derived from the information content of the media, Quality of Service (QoS) characteristics and synchronization needs, which in turn are determined by the service and system context (Fig 2). The media properties that affect the adaptation sequence can be divided into physical, semantic and functional entities. Functional characteristics, such as media context service control, include user-driven and system-specific functions in varying or deforming environments, e.g. a terminal change, a service context switch or service (re)parametrization on the fly. Multimedia systems specify physical requirements for synchronization, timing and processing.

Figure 2. Factors affecting the media adaptation process.

Media adaptation is guided by specific QoS parameters, such as the complexity of media objects in terms of available resources and the requirements set by the user process. QoS parameters include both transfer channel resources, such as bandwidth and reliability, and client-specific capabilities, such as the preferred display resolution. They dictate the transmission parameters for the required presentation in the service implementation. If the parameters are unknown, the quality of the presentation can be unsatisfactory: a video stream can be discontinuous, an audio signal can break up, or media transfer can suffer from long latencies. To address this, a dynamic QoS negotiation mechanism is used at the content and transmission levels of the end-to-end multimedia path; otherwise it would be difficult to guarantee robust performance [5]. A typical criterion, for example, is relative picture quality, which can include variables such as spatial resolution.
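A minimal sketch of such content-level QoS negotiation is shown below: given a channel budget and the client's display limits, the highest-quality presentation whose bandwidth fits is chosen. The candidate ladder and the crude bandwidth cost model are our illustrative assumptions, not the paper's actual negotiation mechanism.

```python
# Candidate presentation profiles, best quality first: (width, height, fps).
CANDIDATES = [
    (320, 240, 30),
    (320, 240, 12),
    (160, 120, 5),
    (160, 120, 1),
]

def bandwidth_Bps(width, height, fps, bytes_per_pixel=0.15):
    """Crude compressed-stream cost model (assumed, not measured)."""
    return width * height * fps * bytes_per_pixel

def negotiate(channel_Bps, max_width, max_height, max_fps):
    """Return the best candidate that the display can show and the
    channel can carry; fall back to the most modest candidate."""
    for w, h, fps in CANDIDATES:
        fits_display = w <= max_width and h <= max_height and fps <= max_fps
        if fits_display and bandwidth_Bps(w, h, fps) <= channel_Bps:
            return (w, h, fps)
    return CANDIDATES[-1]

# MMPC-class terminal on a healthy 1 MBps channel vs. a small-display
# terminal on a congested 10 kBps wireless link.
print(negotiate(1_000_000, 320, 240, 30))  # (320, 240, 30)
print(negotiate(10_000, 160, 120, 5))      # (160, 120, 1)
```

Re-running the negotiation whenever the measured channel throughput changes gives the dynamic behaviour described above.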
To adapt the service presentation, its features should be scaled according to physical system requirements such as terminal and network properties. The information needed to synchronize a presentation is usually carried in the object headers. Synchronization is needed between multiple streams, for example when audio and video are generated at separate sources and combined into one presentation. Synchronization points are determined from time indexing or from content/event information.

The main issue in a media-based adaptation taxonomy is understanding the twofold nature of media in a digital service context: its properties can be analyzed and applied from both a physical and a semantic point of view (Fig 3). Physical content carries the actual information in a specific format with a certain structure and organization. For example, the physical difference between two images is determined from their headers. This information does not change as a function of time and can be used for scaling purposes.

Figure 3. Utilization of media content understanding in adaptation.

The semantic difference between two consecutive objects is determined from their contents, and it can be used to find aliased information or to interpolate excessively quantized data. In the case of a video stream, the temporal relationships between sequential frames can be utilized, for example, in video compression, keyframe extraction or scene detection (Fig 4).

Figure 4. Media in the physical, semantic and temporal domains.

EXAMPLE OF ADAPTATION - VIDEO SURVEILLANCE

We demonstrate the proposed media adaptation taxonomy with a video surveillance service. Consider a setting where multiple surveillance cameras are connected to a distributed video processing and delivery system. The system detects movement and generates alerts to guards equipped with multimedia PCs (MMPC) or wireless Personal Digital Assistants (WPDA). The principle of the video surveillance application is presented in Fig 5. The physical and semantic content of the video is used for detection. To adapt the media objects to specified parameters, such as the maximum tolerated latency, the objects must be presented, controlled and retrieved in a manner that also considers the various terminals, networks, processing and transmission of the service.

Figure 5. A block diagram of the video surveillance application.

Two types of networks and terminals are used. The wireless network is capable of transfer rates of 56 kbps-2 Mbps, while the MMPC is connected to the 10 Mbps Ethernet. To account for error conditions together with protocol stack limits and the available channel capacity, we define the best-case throughput for WLAN/Ethernet as 200 kBps/1 MBps and the worst case as 10 kBps/100 kBps. The WPDA has a 160x120 display, which can present an MPEG-2 stream at 5 frames per second (fps). The MMPC can display 30 fps at 320x240x24 resolution. The following six variables are used for analyzing adaptation performance: bytes, fps and resolution, each measured as sent and as received. A real-time video stream with extracted textual data was sent to both terminals in unadapted and adapted form. The unadapted stream takes up to 0.5 MBps of bandwidth. Our application can alter both the spatial and temporal resolutions; for example, a 160x120 video at 1 fps requires only 0.01 MBps. The values for the two load cases are presented in Table 1 for the adapted (A) and non-adapted (NA) configurations. Adaptation results in more efficient usage of bandwidth and terminal resources, and substantially decreases the amount of unnecessary data transfer. For example, when the video resolution is scaled to a half and the fps rate is decreased to 1/6 in the AWPDA case, the transmission load of the service process is decreased by a factor of 2.5, while the perceived video quality is even better due to a more continuous stream. Controlled processing of streaming media and adaptation are needed to match the presentation to the connection and terminal capabilities.
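The movement detection that triggers the surveillance alerts can be sketched with a plain frame-difference test on the semantic (pixel) content of consecutive frames. The threshold and the flat-list frame representation are our assumptions for illustration; a real system would operate on decoded video frames.

```python
def frame_diff(prev, curr):
    """Mean absolute pixel difference between two equally sized
    grayscale frames, given as flat lists of 0-255 intensities."""
    assert len(prev) == len(curr)
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(prev)

def detect_motion(prev, curr, threshold=10.0):
    """Semantic use of temporal redundancy: flag a frame as an alert
    candidate when it differs enough from its predecessor."""
    return frame_diff(prev, curr) > threshold

static = [50] * 64               # unchanged background
moved = [50] * 32 + [200] * 32   # an object entered half the view
print(detect_motion(static, static))  # False
print(detect_motion(static, moved))   # True
```

The same difference signal could also steer the adaptation itself, e.g. raising the frame rate toward the terminal's maximum only while motion is present.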
In our example, the video processing and object control, involving process timing information, utilization of extracted semantic data and retrieval control, are distributed among dedicated servers. By altering the frame rate and resolution along with scalable presentations, the proposed system maps all three aspects of our media adaptation taxonomy (media scaling, optimization and end-to-end requirements) into the system domain. When media is adapted according to the network load and terminal capabilities, bandwidth usage and transmission latency are reduced with only a minor effect on presentation quality.

Property/Environment     NAMMPC    AMMPC    NAWPDA    AWPDA
MBps sent                0.5:0.5   0.5:0.1  0.5:0.5   0.2:0.01
MBps received            0.5:0.1   0.5:0.1  0.2:0.01  0.2:0.01
fps sent                 30:30     30:12    30:30     5:1
fps presented            30:6      30:12    5:1       5:1
relative res sent        1:1       1:0.5    1:1       0.5:0.5
relative res presented   1:1       1:0.5    0.5:0.5   0.5:0.5

Table 1: Adaptation of video surveillance media.

SUMMARY

This paper proposed a new taxonomy for multimedia adaptation. The taxonomy exploits media analysis and understanding techniques together with parameters extracted from network, processing and terminal property evaluation, and is divided into scaling and optimization. This division is very useful when designing new multimedia services, since it gives a solid basis for implementing optimal processing and transmission tasks that cluster inside the terminal and network (computing) environment. A video surveillance service was shown in which the proposed taxonomy was applied along an end-to-end path. In comparison to non-adapted use of media, the results indicated a clear improvement.

References

[1] Wu C. & Irwin J. Emerging Multimedia Communication Technologies, Prentice Hall, 443 pages, 1998.
[2] Kummerfield R.J. Server Initiated Delivery of Multimedia Information, Proceedings of WWCA 97, Tsukuba, Japan, section C3, 1997.
[3] Dhawan C. Mobile Computing, McGraw-Hill, 579 pages, 1997.
[4] Proceedings of the International Conference on Document Analysis and Recognition, 1991 Japan, 1993 France, 1996 Canada, 1997 Germany.
[5] Aurrecoechea C., Campbell A. & Hauw L. A Survey of QoS Architectures, Multimedia Systems Journal, Special Issue on QoS Architecture, May 1998.
