Multimedia
  What is multimedia?
  Media types
    + Text
    + Graphics
    + Audio
    + Image
    + Video
  Interchange formats

What is multimedia?
  Multimedia = many media
  User interaction = interactivity
  Script = time

Most common media types
  Text
  Graphics
  Audio
  Image
  Video

Continuous media
  Animations (virtual reality)
  Audio
  Video
  What is the difference between normal and continuous media?

Continuous media processing
  Capture -> A/D conversion -> Preprocessing -> Compression -> Transfer -> Decompression -> Postprocessing -> D/A conversion -> Play

Interactivity
  The user can control the progress of the presentation
  Level of interaction
    + user interface, application, server, content producer
  Amount of interaction
    + WWW page, WWW application, video, game, virtual reality

Time
  Media elements are placed in the time dimension
  Different components are synchronized
  The presentation system takes care of synchronization (orchestration)

Applications
  Multimedia presentations
    + PowerPoint
  Multimedia communications
    + video conference
  Multimedia services
    + video on demand
  Integrated hybrid applications
    + record store

Text
  ASCII
  Text documents
    + Microsoft Word, Adobe Acrobat
  Structured documents
    + SGML, HTML, XML
  Hypertext
    + HyperCard

Graphics
  Bitmap graphics
    + paintings
    + Microsoft Paint
  Vector graphics
    + drawings
    + OpenGL
    + PostScript

Physical properties of sound
  Amplitude
    + dB = $20 \log_{10}(A/B)$
    + hearing threshold 0 dB, pain limit 100-120 dB
  Period / frequency
    + Hz = 1/s
    + hearing band 20 Hz - 20 kHz

Pulse code modulation
  Samples are taken at the sample frequency
  The sample frequency has to be at least twice the maximum signal frequency (the so-called Nyquist rate)
  Common sample frequencies are 8, 44.1, and 48 kHz

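The pulse code modulation steps can be sketched in a few lines of Python. This is a minimal illustration, not a codec: the 440 Hz test tone, the 8 kHz sample frequency, the 8-bit resolution, and all function names (`pcm_encode`, `pcm_decode`, `level_db`, `rms`) are assumptions chosen for the example. It samples a tone, converts each sample into an integer code, and uses the decibel formula from the sound slide to express how large the signal is relative to the rounding error.

```python
import math

def pcm_encode(signal, sample_rate_hz, duration_s, bits):
    """Sample signal(t), assumed to lie in [-1, 1], and round each sample to an integer code."""
    max_code = 2 ** (bits - 1) - 1            # e.g. 127 for 8 bits
    n_samples = round(sample_rate_hz * duration_s)
    return [round(signal(n / sample_rate_hz) * max_code) for n in range(n_samples)]

def pcm_decode(codes, bits):
    """Map integer codes back to amplitudes in [-1, 1]."""
    max_code = 2 ** (bits - 1) - 1
    return [c / max_code for c in codes]

def level_db(a, b):
    """Amplitude ratio in decibels: 20 log10(A/B)."""
    return 20 * math.log10(a / b)

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

tone_hz = 440            # assumed test tone
sample_rate_hz = 8000    # one of the common sample frequencies
bits = 8                 # assumed resolution

# The sample frequency must be at least twice the highest signal frequency.
assert sample_rate_hz >= 2 * tone_hz

def tone(t):
    return math.sin(2 * math.pi * tone_hz * t)

codes = pcm_encode(tone, sample_rate_hz, 0.01, bits)
decoded = pcm_decode(codes, bits)
original = [tone(n / sample_rate_hz) for n in range(len(codes))]
error = [o - d for o, d in zip(original, decoded)]

# Signal-to-error ratio in dB; it grows by roughly 6 dB per extra bit.
snr_db = level_db(rms(original), rms(error))
print(f"{len(codes)} samples, signal-to-error ratio {snr_db:.1f} dB")
```

Raising `bits` shrinks the rounding error, while lowering `sample_rate_hz` below twice `tone_hz` would trip the assertion that encodes the sampling rule.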
Pulse code modulation (cont.)
  The signal amplitude at the sample moment is converted into a numerical value
    + Pulse Code Modulation (PCM)
  Quantizing the sampled amplitude causes a quantization error

Frequency transformation
  Natural sounds are composed of a base frequency and its harmonic frequencies
  Thus, sound can also be presented in the frequency dimension

Clarinet sound
  [figure]

Frequency transformation
  [figure]

Frequency transformation (cont.)
  The Fourier transform coefficients present the sound in the frequency dimension
  Stationary signals can be presented accurately with the Fourier transform
  With dynamic signals, the discrete Fourier transform has to be used
  Usually, the Fast Fourier Transform (FFT) algorithm is used

Psychoacoustics
  The properties of human hearing should be considered in the coding of sound
  Sound should be processed in the frequency dimension
  For example, the hearing threshold depends on frequency
  The ear is sensitive to the valleys and hills of the spectrum
    + for example, detection of vowels

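The frequency transformation slides state that a natural sound consists of a base frequency and its harmonics, and that the FFT is the usual way to compute the spectrum. The sketch below illustrates this with a synthetic tone, not a real clarinet recording. The base frequency of 250 Hz, the harmonic amplitudes, and the function names `dft` and `fft` are assumptions made for the example; the FFT shown is a plain recursive radix-2 variant that needs a power-of-two input length.

```python
import cmath
import math

def dft(samples):
    """Direct discrete Fourier transform, O(N^2); used here only as a reference."""
    n = len(samples)
    return [sum(samples[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return list(samples)
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for m in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * m / n) * odd[m]
        out[m] = even[m] + twiddle
        out[m + n // 2] = even[m] - twiddle
    return out

sample_rate_hz = 8000
n = 256                      # frequency resolution = 8000 / 256 = 31.25 Hz per bin
base_hz = 250                # assumed base frequency, falls exactly on a bin
harmonics = [(1, 1.0), (2, 0.5), (3, 0.25)]   # (multiple of base, amplitude)

signal = [sum(a * math.sin(2 * math.pi * base_hz * h * k / sample_rate_hz)
              for h, a in harmonics)
          for k in range(n)]

spectrum = fft(signal)
# Scale so that a sine of amplitude a shows up with magnitude a.
magnitudes = [abs(c) * 2 / n for c in spectrum[: n // 2]]
peaks = [(m * sample_rate_hz / n, round(mag, 3))
         for m, mag in enumerate(magnitudes) if mag > 0.1]
print(peaks)   # (frequency in Hz, amplitude) of each spectral peak
```

The time-domain signal looks complicated, but in the frequency dimension only three coefficients are significant: the base frequency and its two harmonics.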
Threshold of hearing
  [figure: threshold of hearing, amplitude as a function of frequency]

Psychoacoustics (cont.)
  The so-called masking effect
  A loud sound at a certain frequency raises the threshold of hearing over a wider frequency area

Compression methods
  For example, the masking effect can be used in sound coding
  The signal is divided into frequency bands, which are coded separately (subband coding)
  For example, MiniDisc (Sony), DCC cassettes (Philips), and MPEG audio

[flowchart: subband audio encoder]
  Begin
  Subband analysis
  Scale factor calculation
  Coding of scale factors
  FFT analysis
  Calculation of masking and required bit allocation
  Determination of non-transmitted subbands
  Adjustment to fixed bit rate
  Quantization of samples
  Coding of samples
  Coding of bit allocation
  Formatting and transmission
  End

Image and video coding
  Methods are either lossy or lossless
  The most common lossy method is the Discrete Cosine Transform (DCT)
  For example, the Huffman method is lossless

Coding methods
  Image coding
    + JPEG (Joint Photographic Experts Group)
  Video coding
    + H.261, H.263
    + MPEG (Moving Picture Experts Group)
  Coding methods usually combine several coding techniques

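The Huffman method is named above as an example of lossless coding. The following sketch shows why: frequent symbols get short bit strings, rare ones get long bit strings, and decoding restores the input exactly. It is a textbook construction under assumed names (`huffman_code`, `encode`, `decode`) and assumed toy input, not the table format of any particular standard.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    counts = Counter(symbols)
    if len(counts) == 1:                       # degenerate case: one distinct symbol
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)        # the two least frequent subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(symbols, code):
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    """Decode by reading bits until they match a code word (valid for prefix codes)."""
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

data = list("aaaaaaaabbbbccd")                 # assumed toy input: 15 symbols, 4 distinct
code = huffman_code(data)
bits = encode(data, code)
restored = decode(bits, code)

print(code)
print(len(bits), "bits, versus", 2 * len(data), "bits with a fixed 2-bit code")
```

The decoder needs the same `code` table as the encoder, so in practice the table has to be stored or transmitted along with the bits.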
JPEG Goals
  Compression ratio vs. image quality can be selected
  Works with all kinds of images
  Both software and hardware implementations
  Four modes:
    + sequential coding (original order)
    + progressive coding (multiphase coding)
    + lossless coding (perfect copy)
    + hierarchical coding (multiple resolutions)

JPEG Architectures
  Lossy modes use the DCT on 8 x 8 pixel blocks
  Sequential mode outputs the DCT coefficients block by block
  Progressive mode outputs the DCT coefficients in groups
  Hierarchical mode encodes several resolutions at the same time

Sequential JPEG
  [figure]

Progressive JPEG
  [figure]

Hierarchical JPEG
  [figure]

Lossless JPEG
  [figure]

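The lossy JPEG modes apply the DCT to 8 x 8 pixel blocks. As a rough sketch of what that transform does, the code below implements the two-dimensional DCT and its inverse directly from the cosine formula. The function names (`dct2`, `idct2`), the test blocks, and the slow quadruple-loop implementation are assumptions for illustration; real encoders use fast factorizations and also shift pixel values before the transform, which is omitted here.

```python
import math

N = 8  # block size used by the lossy JPEG modes

def _scale(u):
    """Normalization factor that makes the transform orthonormal."""
    return math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)

def _basis(k, u):
    """Cosine basis function of frequency index u at sample position k."""
    return math.cos((2 * k + 1) * u * math.pi / (2 * N))

def dct2(block):
    """Forward 2-D DCT of an N x N block of pixel values."""
    return [[_scale(u) * _scale(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)]
            for u in range(N)]

def idct2(coeffs):
    """Inverse 2-D DCT: rebuild the pixel block from its coefficients."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)]
            for x in range(N)]

# A flat block has only one non-zero coefficient (the average, or DC, term).
flat = [[100] * N for _ in range(N)]
flat_coeffs = dct2(flat)

# A smooth gradient concentrates its energy in a few low-frequency coefficients.
gradient = [[8 * x + y for y in range(N)] for x in range(N)]
gradient_coeffs = dct2(gradient)
restored = idct2(gradient_coeffs)

print(round(flat_coeffs[0][0], 3))
print([round(c, 1) for c in gradient_coeffs[0]])
```

The transform itself loses nothing: `idct2(dct2(block))` returns the original block up to rounding. The loss in the lossy modes comes from the quantization step that follows.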
DCT and quantization
  [figure]

DCT coding
  The DCT coefficients can be represented as a matrix
  The quantization is done according to a quantization table
  The coefficients are put in zig-zag order
  This places the zero coefficients at the end of the sequence
  Finally, run-length coding eliminates the zeros

Statistical coding
  Uses either Huffman or arithmetic coding
  Huffman coding requires a separate table
  Arithmetic coding does not require a table, but needs more computation
  In addition, the compression ratio of arithmetic coding is 5-10 % better

Lossless encoding
  Lossless encoding utilizes prediction
  Seven different alternatives
    + how many and which pixels are used
  Predictive encoding can reach a compression ratio of 2:1

Efficiency (bits per pixel, bpp)
  0.25-0.5 bpp: reasonable - good quality
  0.5-0.75 bpp: good - very good quality
  0.75-1.5 bpp: very good quality
  1.5-2.0 bpp: visually the same as the original

MPEG
  Resembles the JPEG method
  In addition, properties of moving images are exploited
    + consecutive images are very similar
    + images have few moving objects
    + the scene changes seldom
  Implementation is more complicated
  Usually requires special hardware

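The DCT coding steps listed above (quantization table, zig-zag order, run-length coding) can be sketched as follows. The coefficient values and the quantization table `8 + 4 * (row + col)` are made up for the example and are not the tables from the JPEG standard; the function names are likewise hypothetical. The run-length format is simplified: each non-zero value is stored with the count of zeros before it, and an `"EOB"` marker stands for "only zeros remain".

```python
N = 8

def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]

def zigzag_order(n=N):
    """(row, col) index pairs of an n x n matrix in zig-zag order."""
    order = []
    for d in range(2 * n - 1):                 # d = row + col along each anti-diagonal
        cells = [(r, d - r) for r in range(n) if 0 <= d - r < n]
        # Even diagonals are walked bottom-left to top-right, odd ones top-right to bottom-left.
        order.extend(reversed(cells) if d % 2 == 0 else cells)
    return order

def run_length_encode(values):
    """Encode as (zeros_before, value) pairs followed by an end-of-block marker."""
    last = len(values)
    while last > 0 and values[last - 1] == 0:  # trailing zeros are covered by "EOB"
        last -= 1
    pairs, run = [], 0
    for v in values[:last]:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")
    return pairs

# Assumed example: a block whose energy sits in the low-frequency corner.
coeffs = [[0] * N for _ in range(N)]
coeffs[0][0], coeffs[0][1], coeffs[1][0] = 240, -50, 36
coeffs[1][1], coeffs[0][2], coeffs[3][3] = 10, -20, 5

# Assumed table: coarser steps for higher frequencies.
table = [[8 + 4 * (r + c) for c in range(N)] for r in range(N)]

quantized = quantize(coeffs, table)
scanned = [quantized[r][c] for r, c in zigzag_order()]
encoded = run_length_encode(scanned)
print(encoded)
```

After quantization the small high-frequency coefficient at position (3, 3) becomes zero, and the zig-zag scan collects all zeros into one long tail that the end-of-block marker replaces.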
Motion estimation
  Moving objects are searched for in consecutive images
  Motion estimates are calculated for the changing blocks
  The estimates are called motion vectors, and they are transmitted as part of the coded information

Difference images
  Difference images are calculated between the original and predicted images
  Only changed (non-zero) areas are transmitted

MPEG image sequence
  MPEG utilizes both forward and backward prediction:
    + I = Intra - original image
    + P = forward Prediction
    + B = Bidirectional prediction

Interchange formats
  The co-operation and transfer of multimedia applications require common interchange formats
  An interchange format defines:
    + time, place, structure, and operation (procedures)
  Without an interchange format, content produced with one application cannot be read and used with another application
  Conversion tools are not a good solution

Application areas
  A common interchange format can be used as a
    + storage format (e.g., Macromedia Director)
    + transfer format (e.g., CD-ROM)
    + real-time transfer format (e.g., digital TV)
    + content transfer format between applications (e.g., group work)

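Returning to the motion estimation and difference image slides: a common way to find a motion vector is block matching, where a block of the current image is compared against shifted positions in the previous image. The sketch below uses an exhaustive search with the sum of absolute differences as the cost. The search strategy, the cost function, the tiny 8 x 8 frames, and all names (`sad`, `find_motion_vector`, `difference_block`, `make_frame`) are assumptions for illustration; the slides do not prescribe a particular search method. The vector here points from the block in the current image to its best match in the previous image.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between a block in frame_a and a block in frame_b."""
    return sum(abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
               for j in range(size) for i in range(size))

def find_motion_vector(prev, cur, x, y, size, search):
    """Full search: best (dx, dy) so that prev at (x+dx, y+dy) matches cur at (x, y)."""
    height, width = len(prev), len(prev[0])
    best, best_cost = (0, 0), sad(cur, prev, x, y, x, y, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = x + dx, y + dy
            if 0 <= px <= width - size and 0 <= py <= height - size:
                cost = sad(cur, prev, x, y, px, py, size)
                if cost < best_cost:
                    best, best_cost = (dx, dy), cost
    return best, best_cost

def difference_block(prev, cur, x, y, size, vector):
    """Difference between the current block and its motion-compensated prediction."""
    dx, dy = vector
    return [[cur[y + j][x + i] - prev[y + dy + j][x + dx + i] for i in range(size)]
            for j in range(size)]

def make_frame(obj_x, obj_y, width=8, height=8):
    """Black frame with a small 2 x 2 object whose top-left corner is at (obj_x, obj_y)."""
    frame = [[0] * width for _ in range(height)]
    for j, row in enumerate([[10, 20], [30, 40]]):
        for i, value in enumerate(row):
            frame[obj_y + j][obj_x + i] = value
    return frame

prev_frame = make_frame(1, 1)     # object at (1, 1)
cur_frame = make_frame(3, 2)      # object has moved 2 right and 1 down

vector, cost = find_motion_vector(prev_frame, cur_frame, x=2, y=2, size=4, search=2)
residual = difference_block(prev_frame, cur_frame, 2, 2, 4, vector)
print(vector, cost)
```

When the prediction is perfect, as in this toy case, the difference block contains only zeros, so only the motion vector needs to be transmitted for that block.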
Requirements
  Data model:
    + time, synchronization, different formats, pointers, links, interactivity
  Scripts:
    + programming language or graphical programming
  Capacity:
    + definitions should not take much space
  Retrieval time:
    + decompression has to be fast, progressive resolution, etc.
  Independence:
    + hardware and platform neutral
  Scalability:
    + new formats, attributes, etc.

MHEG
  ISO working group
  Object-based multimedia and hypermedia interchange format
  Supports interactivity and real-time transfer
  Defines the final presentation format

Properties
  Defines a set of platform-independent components for interactive applications
  Two ways to implement interactivity:
    + components are linked to events
    + a scripting language is used
  Events are caused by timers or the user's actions

Properties (cont.)
  Defines both temporal and spatial location
  Macros allow the definition of complex objects based on models
  Also contains support for real-time transfer

MHEG class hierarchy
  MH-object
    + Behavior
        + Action
        + Link
        + Script
    + Component
        + Content
        + Interaction
            + Selection
            + Modification
        + Composite
    + Descriptor
    + Macro
        + Macro Definition
        + Macro Use

HTML
  HTML is also an interchange format
  Limited features
  HTML = structural document, links
    + DOM model is coming
  ECMAScript (JavaScript) = interactivity
  Style sheets (Cascading Style Sheets) = placement
  Synchronization is still missing
    + Synchronized Multimedia Integration Language (SMIL)