CMP632 Multimedia Systems 1. Do not turn this page over until instructed to do so by the Senior Invigilator.


CMP632 Multimedia Systems

CARDIFF UNIVERSITY
EXAMINATION PAPER SOLUTIONS

Academic Year: 2002-2003
Examination Period: Lent 2003
Examination Paper Number: CMP632
Examination Paper Title: Multimedia Systems
Duration: 2 hours

Do not turn this page over until instructed to do so by the Senior Invigilator.

Structure of Examination Paper:
There are FOUR pages.
There are FOUR questions in total.
There are NO appendices.
The maximum mark for the examination paper is 100% and the mark obtainable for a question or part of a question is shown in brackets alongside the question.

Students to be provided with:
The following items of stationery are to be provided: One answer book.

Instructions to Students:
Answer THREE questions.
The use of translation dictionaries between English or Welsh and a foreign language bearing an appropriate departmental stamp is permitted in this examination.

1. (a) Give a definition of a Multimedia Authoring System.

An Authoring System is a program which has pre-programmed elements for the development of interactive multimedia software titles.

2 Marks BOOKWORK

(b) What key features should such a Multimedia Authoring System provide?

- Underlying support of a rich set of multimedia components
- Usually object based
- Visual programming usually supported
- Advanced programming usually via some form of scripting

4 Marks BOOKWORK

(c) What Multimedia Authoring paradigms exist? Briefly describe each paradigm.

There are various paradigms, including:

Scripting Language

The Scripting paradigm is the authoring method closest in form to traditional programming. The paradigm is that of a programming language, which specifies (by filename) multimedia elements, sequencing, hotspots, synchronization, etc. A powerful, object-oriented scripting language is usually the centerpiece of such a system; in-program editing of elements (still graphics, video, audio, etc.) tends to be minimal or non-existent. Scripting languages do vary; check how far the language is object-based or object-oriented. The scripting paradigm tends to be longer in development time (it takes longer to code an individual interaction), but generally more powerful interactivity is possible. Since most scripting languages are interpreted, instead of compiled, the runtime speed gains over other authoring methods are minimal. The media handling can vary widely; check your system carefully against the formats of your contributing packages. Apple's HyperTalk for HyperCard, Asymetrix's OpenScript for ToolBook and the Lingo scripting language of Macromedia Director are examples of multimedia scripting languages.

Here is an example Lingo script to jump to a frame:

    global gnavsprite

    on exitframe
      go the frame
      play sprite gnavsprite
    end

Iconic/Flow Control

This tends to be the speediest (in development time) authoring style; it is best suited for rapid prototyping and short-development time projects. Many of these tools are also optimized for developing Computer-Based Training (CBT). The core of the paradigm is the Icon Palette, containing the possible functions/interactions of a program, and the Flow Line, which shows the actual links between the icons. These programs tend to have the slowest runtimes, because each interaction carries with it all of its possible permutations; the higher end packages, such as Authorware or IconAuthor, are extremely powerful and suffer least from runtime speed problems.

Frame

The Frame paradigm is similar to the Iconic/Flow Control paradigm in that it usually incorporates an icon palette; however, the links drawn between icons are conceptual and do not always represent the actual flow of the program. This is a very fast development system, but requires a good auto-debugging function, as it is visually un-debuggable. The best of these have bundled compiled-language scripting, such as Quest (whose scripting language is C) or Apple Media Kit.

Card/Scripting

The Card/Scripting paradigm provides a great deal of power (via the incorporated scripting language) but suffers from the index-card structure. It is excellently suited for Hypertext applications, and supremely suited for navigation-intensive applications (a la Cyan's MYST game). Such programs are easily extensible via XCMDs and DLLs; they are widely used for shareware applications. The best applications allow all objects (including individual graphic elements) to be scripted; many entertainment applications are prototyped in a card/scripting system prior to compiled-language coding.
Cast/Score/Scripting

The Cast/Score/Scripting paradigm uses a music score as its primary authoring metaphor; the synchronous elements are shown in various horizontal tracks with simultaneity shown via the vertical columns. The true power of this metaphor lies in the ability to script the behavior of each of the cast members. The most popular member of this paradigm is Director, which is used in the creation of many commercial applications. These programs are best suited for animation-intensive or synchronized media applications; they are easily extensible to handle other functions (such as hypertext) via XOBJs, XCMDs, and DLLs. Macromedia Director uses this.

Hierarchical Object

The Hierarchical Object paradigm uses an object metaphor (like OOP) which is visually represented by embedded objects and iconic properties. Although the learning curve is non-trivial, the visual representation of objects can make very complicated constructions possible.

Hypermedia Linkage

The Hypermedia Linkage paradigm is similar to the Frame paradigm in that it shows conceptual links between elements; however, it lacks the Frame paradigm's visual linkage metaphor.

Tagging

The Tagging paradigm uses tags in text files (for instance, SGML/HTML, SMIL (Synchronized Multimedia Integration Language), VRML, 3DML and WinHelp) to link pages, provide interactivity and integrate multimedia elements.

8 Marks --- BOOKWORK

(d) You have been asked to create a multimedia presentation from a set of multimedia data that is distributed over the Internet. For example, a set of audio files exists at www.audio.com and a set of video files resides at www.video.com. Essentially your task is to sequence a series of videos over a series of audio files, where you may assume that the lengths of the audio and video sequences match. You may assume that the video files are video1.mpg, video2.mpg and that the corresponding audio files are audio1.wav, audio2.wav.

Which Multimedia Authoring paradigm is best suited to provide the best solution for this kind of task? Illustrate your answer with suitable fragments of example code.

Best solution as demonstrated in lectures: the Tagging paradigm using SMIL. SMIL has the capability to link to external media. To play the media as required, we need a sequential series of parallel environments, in each of which the video and audio are synchronised,

e.g.

    <smil>
      <head>
        <layout>
          <root-layout height="400" width="600" background-color="#000000" title="slides and Sound"/>
        </layout>
      </head>
      <body>
        <seq>
          <par>
            <audio src="www.audio.com/audio1.wav" title="slide 1"/>
            <video src="www.video.com/video1.mpg" />
          </par>
          <par>
            <audio src="www.audio.com/audio2.wav" title="slide 2"/>
            <video src="www.video.com/video2.mpg" />
          </par>
        </seq>
      </body>
    </smil>

10 Marks UNSEEN APPLICATION OF ABOVE KNOWLEDGE

- 1 mark: paradigm
- 1 mark: SMIL
- 2 marks: basic SMIL syntax
- 4 marks: seq/par environments
- 2 marks: links to audio and video

2. (a) What is MIDI?

Definition of MIDI: a protocol that enables computers, synthesizers, keyboards, and other musical devices to communicate with each other.

2 Marks Basic Bookwork

(b) How is a basic MIDI message structured?

Structure of MIDI messages: a MIDI message includes a status byte and up to two data bytes.

Status byte:
- The most significant bit of the status byte is set to 1.
- The 4 low-order bits identify which channel it belongs to (four bits produce 16 possible channels).
- The 3 remaining bits identify the message.

Data byte:
- The most significant bit of a data byte is set to 0.

4 Marks Basic Bookwork
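The byte layout described in (b) can be sketched in a few lines of Python. This is a minimal illustration, not part of the model answer: the function names are hypothetical, and the example message (Note On, status 0x90, on the first channel) is chosen only for demonstration. It covers channel messages only; for system messages (status 0xF0 and above) the low four bits do not carry a channel.

```python
def parse_status_byte(byte: int) -> tuple[int, int]:
    """Split a MIDI channel-message status byte into (message_type, channel).

    Hypothetical helper for illustration: message_type is the 3-bit field
    after the most significant bit, channel is the low-order 4 bits (0..15).
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("not a single byte")
    if byte & 0x80 == 0:
        raise ValueError("MSB is 0: this is a data byte, not a status byte")
    message_type = (byte >> 4) & 0x07  # the 3 bits that identify the message
    channel = byte & 0x0F              # the 4 low-order bits: 16 channels
    return message_type, channel


def is_data_byte(byte: int) -> bool:
    """A data byte has its most significant bit set to 0 (values 0..127)."""
    return 0 <= byte <= 0x7F


# Example: a Note On message = one status byte followed by two data bytes.
message = bytes([0x90, 60, 100])  # status, note number, velocity
message_type, channel = parse_status_byte(message[0])
print(message_type, channel, all(is_data_byte(b) for b in message[1:]))
```

Because every status byte has its top bit set and every data byte has it clear, a receiver can always tell where a new message starts.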

(c) A piece of music that lasts 3 minutes is to be transmitted over a network. The piece of music has 4 constituent instruments: Drums, Bass, Piano and Trumpet. The music has been recorded at CD quality (44.1 kHz, 16 bit, stereo) and also as MIDI information, where on average the drums play 180 notes per minute, the bass 140 notes per minute, the piano 600 notes per minute and the trumpet 80 notes per minute.

(i) Estimate the number of bytes required for the storage of a full performance at CD quality audio and the number of bytes for the MIDI performance. You should assume that the General MIDI set of instruments is available for any performance of the recorded MIDI data.

CD audio size:
2 channels * 44,100 samples/sec * 2 bytes (16 bits) * 3*60 seconds (3 mins) = 31,752,000 bytes = 30.3 MB

MIDI size:
3 bytes per MIDI message.
Key things to note:
- Need to send 4 program change messages (to set up the General MIDI instruments) = 12 bytes (2 marks)
- Need to send Note ON and Note OFF messages to play each note properly (4 marks)
Then send 3 mins * 3 (MIDI bytes) * 2 (Note ON and OFF) * (180 + 140 + 600 + 80) = 18,000 bytes = 17.58 KB. The 12 bytes of program change messages are negligible next to this and are left out of the estimate.

7 Marks Unseen: 2 for CD audio, 5 for MIDI

(ii) Estimate the time it would take to transmit each performance over a network with 64 kbps.

CD audio time = 31,752,000 * 8 (bits per byte) / (64*1024) = 3,876 seconds = 1.077 hours
MIDI time = 18,000 * 8 / (64*1024) = 2.197 seconds

2 Marks Unseen
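The estimates in (c)(i) and (c)(ii) can be reproduced with a short Python sketch. It follows the model answer's own assumptions: 3 bytes per MIDI message, one Note ON and one Note OFF per note, and a 64 kbps link taken as 64 * 1024 bits per second. The variable and function names are illustrative only.

```python
DURATION_S = 3 * 60                      # 3 minutes of music

# CD quality: 2 channels, 44,100 samples per second, 2 bytes per sample.
cd_bytes = 2 * 44_100 * 2 * DURATION_S

# MIDI: average notes per minute for each instrument, from the question.
notes_per_minute = {"drums": 180, "bass": 140, "piano": 600, "trumpet": 80}
BYTES_PER_MESSAGE = 3                    # assumption taken from the model answer
MESSAGES_PER_NOTE = 2                    # Note ON plus Note OFF

note_bytes = (3 * sum(notes_per_minute.values())
              * MESSAGES_PER_NOTE * BYTES_PER_MESSAGE)
setup_bytes = len(notes_per_minute) * BYTES_PER_MESSAGE  # program changes

LINK_BITS_PER_S = 64 * 1024              # 64 kbps, as the model answer reads it


def transmit_seconds(n_bytes: int) -> float:
    """Time to send n_bytes over the link, ignoring protocol overhead."""
    return n_bytes * 8 / LINK_BITS_PER_S


print(cd_bytes, note_bytes, setup_bytes)
print(transmit_seconds(cd_bytes), transmit_seconds(note_bytes))
```

Changing `LINK_BITS_PER_S` to 64,000 (the decimal reading of 64 kbps) lengthens both times slightly but leaves the ratio between them unchanged.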

(iii) Briefly comment on the merits and drawbacks of each method of transmission of the performance.

Audio:
- Pro: Exact reproduction of source sounds
- Con: High bandwidth/long file transfer for high quality audio

MIDI:
- Pro: Very low bandwidth
- Con: No control over the playback quality of the MIDI sounds

4 Marks Unseen but extended discussion on lecture material

(d) Suppose vocals (where actual lyrics were to be sung) were required to be added to each performance in (c) above. How might each performance be broadcast over a network?

KEY POINT: Vocals cannot utilize MIDI.

Audio: Need to overdub the vocal audio on the background audio track. This needs some audio editing package to mix the combined tracks down to stereo audio. Assuming no change in sample rate or bit size, the new mixed track will have exactly the same file size as the previous audio track, so transmission is the same as in (c).

MIDI: MIDI alone is no longer sufficient, so how to proceed? For best bandwidth, keep the backing tracks as MIDI and send the vocal track as audio. To achieve such a mix, some specialist music production software will be needed to allow a file to be saved with synchronized MIDI and audio.

How to deliver over a network? Need to use a multimedia standard that supports MIDI and digital audio. QuickTime files support both (as do some Macromedia Director/Flash(?) files), so save the mixed MIDI/audio file in this format. The size of the file will be significantly increased by the single channel of audio. If this is not compressed, and assuming a mono audio file, the file size will increase by around 15 MB, so the transmission time will increase drastically.

5 Marks Unseen

3. (a) What is meant by the terms frequency and temporal masking of two or more audio signals? Briefly, what is the cause of this masking?

Frequency Masking: When an audio signal consists of multiple frequencies, the sensitivity of the ear changes with the relative amplitude of the signals. If two frequencies are close and the amplitude of one is lower than that of the other, the quieter frequency may not be heard. The range of closeness for frequency masking (the Critical Bands) depends on the frequencies and relative amplitudes.

Temporal Masking: After the ear hears a loud sound, it takes a further short while before it can hear a quieter sound.

The cause of both types of masking is that within the human ear there are tiny hair cells that are excited by air pressure variations. Different hair cells respond to different ranges of frequencies. Frequency masking occurs because, after excitation by one frequency, further excitation of the same group of cells by a weaker, similar frequency is not possible. Temporal masking occurs because the hairs take time to settle after excitation before they can respond again.

7 Marks BookWork

(b) How does MPEG audio compression exploit such phenomena? Give a schematic diagram of the MPEG audio perceptual encoder.

MPEG uses some perceptual coding concepts:
- Bandwidth is divided into frequency subbands using a bank of analysis filters (critical band filters).
- Each analysis filter uses a scaling factor of subband max amplitudes for psychoacoustic modelling.
- FFT (DFT) used; signal-to-mask ratios used for frequencies below a certain audible threshold.

8 Marks - BookWork

(c) The critical bandwidth for average human hearing is a constant 100 Hz for frequencies less than 500 Hz and increases (approximately) linearly by 100 Hz for each additional 500 Hz.

(i) Given a frequency of 300 Hz, what is the next highest (integer) frequency signal that is distinguishable by the human ear, assuming the latter signal is of a substantially lower amplitude?

The trick is to realize (remember?) the definition of critical band.

Critical Band: The width of a masking area (curve) within which no signal may be heard, given a first carrier signal of higher amplitude within a given frequency range as defined above.

The critical band is 100 Hz for a 300 Hz signal, so the range of the band is 250 to 350 Hz.
So the next highest audible frequency is 351 Hz.

4 Marks Unseen

(ii) Given a frequency of 5000 Hz, what is the next highest (integer) frequency signal that is distinguishable by the human ear, assuming the latter signal is of a substantially lower amplitude?

At 5,000 Hz the critical bandwidth is 10 * 100 Hz = 1000 Hz, so the range of the band is 4500 to 5500 Hz.
So the next highest audible frequency is 5501 Hz.

5 Marks Unseen
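The reasoning in (c) can be checked with a small Python sketch. It assumes, as the model answer does, that the critical band is centred on the louder (masking) frequency and that the first distinguishable integer frequency is the one just above the band's upper edge. The function names are hypothetical.

```python
import math


def critical_bandwidth(freq_hz: float) -> float:
    """Critical bandwidth in Hz under the simplified rule from the question:
    a constant 100 Hz below 500 Hz, then 100 Hz more per additional 500 Hz."""
    if freq_hz < 500:
        return 100.0
    return 100.0 + 100.0 * (freq_hz - 500) / 500


def next_audible_integer_frequency(masker_hz: float) -> int:
    """First integer frequency above the critical band centred on masker_hz."""
    upper_edge = masker_hz + critical_bandwidth(masker_hz) / 2
    return math.floor(upper_edge) + 1


for masker in (300, 5000):
    print(masker, critical_bandwidth(masker),
          next_audible_integer_frequency(masker))
```

For the two exam cases this gives bands of 100 Hz and 1000 Hz, matching the ranges 250 to 350 Hz and 4500 to 5500 Hz in the answers.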

4. (a) What is the distinction between lossy and lossless data compression?

Lossless compression preserves the data undergoing compression exactly. Lossy compression aims to obtain the best possible fidelity for a given bit-rate, or to minimize the bit-rate needed to achieve a given fidelity measure, but will not produce a complete facsimile of the original data.

2 Marks Bookwork

(b) Briefly describe the four basic types of data redundancy that data compression algorithms can apply to audio, image and video signals.

4 types of redundancy:
- Temporal -- correlation between successive values in 1D data, 1D signals, audio etc.
- Spatial -- correlation between neighbouring pixels or data items.
- Spectral -- correlation between colour or luminance components. This uses the frequency domain to exploit relationships between frequency of change in data.
- Psycho-visual, psycho-acoustic -- exploit perceptual properties of the human visual system or aural system to compress data.

8 Marks Bookwork

(c) Encode the following stream of characters using decimal arithmetic coding compression:

MEDIA

You may assume that characters occur with probabilities of M = 0.1, E = 0.3, D = 0.3, I = 0.2 and A = 0.1.

Sort the data with the largest probabilities first and form the cumulative probabilities:

    Symbol  Probability  Range
    E       0.3          0.0 to 0.3
    D       0.3          0.3 to 0.6
    I       0.2          0.6 to 0.8
    M       0.1          0.8 to 0.9
    A       0.1          0.9 to 1.0

There are only 5 characters, so there are 5 segments, each of a width determined by the probability of the related character.

The first character to be encoded is M, which is in the range 0.8 to 0.9; therefore the final codeword lies in the range 0.8 to 0.89999...

Each subsequent character subdivides the current range in the same proportions. After coding M, the range 0.8 to 0.9 is subdivided as:

    0.8 - E - 0.83 - D - 0.86 - I - 0.88 - M - 0.89 - A - 0.9

To code E we take the range 0.8 to 0.83 and subdivide it:

    0.8 - E - 0.809 - D - 0.818 - I - 0.824 - M - 0.827 - A - 0.83

The next character is D, so we split the range 0.809 to 0.818:

    0.809 - E - 0.8117 - D - 0.8144 - I - 0.8162 - M - 0.8171 - A - 0.818

The next character is I, so the range is 0.8144 to 0.8162 and we get:

    0.8144 - E - 0.81494 - D - 0.81548 - I - 0.81584 - M - 0.81602 - A - 0.8162

The final character is A, which is in the range 0.81602 to 0.8162.

So the completed codeword is any number in the range 0.81602 <= codeword < 0.8162.

10 Marks Unseen

(d) Show how your solution to (c) would be decoded.

Assume the codeword is 0.8161.

The decoder can readily determine that the first character is M, since the codeword is in the range 0.8 to 0.9. By expanding that interval we can see that the next character must be an E, as the codeword is in the range 0.8 to 0.83, and so on for all the other intervals.

4 Marks Unseen
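The encoding in (c) and the decoding in (d) can be traced with a minimal Python sketch. It uses exact fractions rather than floating point so that the interval boundaries match the hand calculation digit for digit. The function names are illustrative, and the decoder is told the message length in advance, which is an assumption: the exam answer does not say how the decoder knows when to stop.

```python
from fractions import Fraction

# Symbols sorted with the largest probabilities first, as in the answer.
PROBS = [("E", Fraction(3, 10)), ("D", Fraction(3, 10)),
         ("I", Fraction(2, 10)), ("M", Fraction(1, 10)),
         ("A", Fraction(1, 10))]


def cumulative_ranges(probs):
    """Map each symbol to its half-open range [low, high) within [0, 1)."""
    ranges, low = {}, Fraction(0)
    for symbol, p in probs:
        ranges[symbol] = (low, low + p)
        low += p
    return ranges


RANGES = cumulative_ranges(PROBS)


def encode(message):
    """Return the final interval [low, high); any number in it is a codeword."""
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        sym_low, sym_high = RANGES[symbol]
        low, high = low + width * sym_low, low + width * sym_high
    return low, high


def decode(codeword, length):
    """Recover `length` symbols by repeatedly locating the codeword's segment."""
    low, high = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        width = high - low
        position = (codeword - low) / width  # codeword rescaled into [0, 1)
        for symbol, (sym_low, sym_high) in RANGES.items():
            if sym_low <= position < sym_high:
                out.append(symbol)
                low, high = low + width * sym_low, low + width * sym_high
                break
    return "".join(out)


low, high = encode("MEDIA")
print(float(low), float(high))
print(decode(Fraction("0.8161"), 5))
```

The decoder repeats the encoder's subdivision step for step, which is why both sides need the same symbol ordering and probabilities.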