CSCD 443/533 Advanced Networks Fall 2017


Lecture 18: Compression of Video and Audio

Topics: compression technology (motivation; the human attributes that make it possible), audio compression, video compression, performance.

Motivation: Why Compress? Why do we need to compress streaming media? Consider one instance: 640 x 480 pixel frames, 24 bits of color per pixel, 30 frames/sec. With no compression, it takes over 200 Mbps to transmit the video alone. Do you have a 200 Mbps link? We need massive compression to be able to view streaming video and audio over our current networks.
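The "over 200 Mbps" figure follows directly from the frame size, color depth, and frame rate quoted above. A minimal sketch of the arithmetic (the function name is only illustrative):

```python
def raw_video_bitrate_bps(width, height, bits_per_pixel, frames_per_second):
    """Bits per second needed to send every pixel of every frame uncompressed."""
    return width * height * bits_per_pixel * frames_per_second


if __name__ == "__main__":
    bps = raw_video_bitrate_bps(640, 480, 24, 30)
    print(f"{bps / 1e6:.1f} Mbps")  # 221.2 Mbps
```

Audio, packet headers, and retransmissions would come on top of this number.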

Motivation: Why Compress? What does compression buy us? Uncompressed video: 221 Mbps. Compressed DVD video: about 4 Mbps. That is roughly a 50:1 compression ratio!

Why Compress? In a Nutshell: to reduce the file size, to deliver the stream to the user, and to conserve storage space. Choosing a compression rate is a balance between the quality of the media and the available bandwidth.

So, Why Compress? Delivering video over the Web means compromises, mostly trading image quality for lower bit rates. In general, video and audio are compressed, stuffed into a container, and delivered to you via the Web. If done well, you won't notice the missing bits or the delivery of the media. We will discuss individual formats, codecs, and tradeoffs.

Definitions. File format: the particular way information is stored in a file; file formats are known as containers for streaming media. Codec: short for coder/decoder (often glossed as compression/decompression); a codec is any technology for compressing and decompressing data. Compression: reduces file size, in the lossy case by removing audio or video information, and takes advantage of the limits of human perception.

Format vs. Codec Example: Flash Video (FLV) is a file format. H.264, On2 VP6, and Sorenson Spark are codecs used inside Flash Video files.

Container File Formats. Purpose of container formats: they function as "black boxes" for holding a variety of media formats. Good container formats can handle files compressed with a variety of different codecs. In a perfect world, you could put any codec in any container format; unfortunately, there are some incompatibilities. Examples: MPEG-2, Advanced Systems Format (ASF) from Microsoft, AVI, QuickTime (MOV), MP4, Flash (FLV), RealMedia.

Multimedia Container Files. Multimedia file extensions: .mov, .ogg, .wmv, .flv, .mp4, .mpeg. Essentially, videos are packaged into encapsulation containers, or wrapper formats, that contain all the information needed to present the video. You can think of file formats as containers that hold all this information, very similar to a .zip, .sit, or .rar file.

Differences in Containers. Why are certain formats popular? Popular support: how widely supported is the format? File size: larger is not better for streaming files. Support for advanced codec functionality: older formats such as AVI were not designed for newer codec features like B-frames or VBR audio. Support for advanced content: such as chapters, subtitles, meta-tags, and user data.

Compression

Compression: Two Types. Lossless keeps all bits. Lossy removes bits.

Lossy Compression. Lossy compression schemes reduce file size by discarding some amount of data during encoding, before the media is sent over the Internet. Once the data is received by the client, the codec attempts to reconstruct an approximation of the information that was lost or discarded.
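The difference between the two types can be shown with a small sketch. Python's standard `zlib` module stands in for a lossless scheme here, and dropping the low bits of each byte stands in for a lossy one. Both choices and both function names are illustrative assumptions, not what any audio or video codec actually does.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress and decompress with zlib; every original bit comes back."""
    return zlib.decompress(zlib.compress(data))


def lossy_round_trip(data: bytes, keep_bits: int = 4) -> bytes:
    """Keep only the top `keep_bits` of each byte; the low bits are gone for good."""
    shift = 8 - keep_bits
    return bytes((b >> shift) << shift for b in data)


if __name__ == "__main__":
    original = bytes(range(256))
    print(lossless_round_trip(original) == original)  # True
    print(lossy_round_trip(original) == original)     # False
```

After the lossy round trip only 16 distinct byte values remain, so no decoder can tell which of the 16 original values each one came from.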

Video Lossy Compression: Image Compression. An image format that uses lossy compression samples the image and discards unnecessary color/contrast information.

Can you really see the difference?

Video Lossy Compression. Why can you do lossy compression? Spatial and temporal redundancy: pixel values are not independent; they are correlated with their neighbors both within the same frame and across frames, so the value of a pixel is predictable given the values of neighboring pixels. Psychovisual redundancy: the human eye has a limited response to fine spatial detail and is less sensitive to detail near object edges or around shot changes. Impairments introduced by bit rate reduction should not be visible to the human viewer.

Audio Lossy Compression. Lossy audio compression discards frequencies at the high and low ends of the spectrum and attempts to locate and remove unnecessary audio data. More on this, with a nice description and example programs: http://www.videograbber.net/compress-audio-file.html

Audio Streaming Formats. There are many formats and standards for streaming audio: RealNetworks' RealAudio, streaming MP3, Macromedia's Flash and Director Shockwave, Microsoft's Windows Media, and Apple's QuickTime. There are also recognized standard formats, including Liquid Audio, MP3, MIDI, WAV, and AU.

Audio Lossy Compression. First, the player decompresses the audio file as it downloads to your computer. Then it fills in missing information according to the instructions set by the codec. The compressed file is unintelligible to the listener; the decompressed file is intelligible but of lower quality than the original.

MP3 Audio Lossy Compression. Example: MP3. The MP3 lossy audio compression algorithm takes advantage of a perceptual limitation of human hearing: auditory masking. It was discovered (in the late 1800s) that a tone could be rendered inaudible by another tone of lower frequency. Masking is a matter of how your brain perceives similar sounds.

MP3 Audio Lossy Compression. Uncompressed audio, like that on CDs, stores more data than your brain can actually process. For example: if two notes are very similar and very close together, your brain may perceive only one of them; if two sounds are different but one is much louder than the other, your brain may never perceive the quieter signal.
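The "louder sound hides the quieter one" idea can be caricatured in a few lines. This is a toy sketch only: the function name, the 100 Hz neighborhood, and the 20 dB gap are hypothetical values chosen for illustration, and real psychoacoustic models are far more detailed.

```python
def drop_masked(components, max_gap_hz=100.0, min_drop_db=20.0):
    """Return the (frequency_hz, level_db) components that are not masked.

    A component counts as masked when some other component lies within
    `max_gap_hz` of it and is at least `min_drop_db` louder. Both thresholds
    are illustrative assumptions, not values from any standard.
    """
    kept = []
    for freq, level in components:
        masked = any(
            abs(freq - other_freq) <= max_gap_hz
            and other_level - level >= min_drop_db
            for other_freq, other_level in components
        )
        if not masked:
            kept.append((freq, level))
    return kept
```

With the input `[(1000, 70), (1050, 40), (5000, 45)]`, the 1050 Hz component is dropped because a nearby tone is 30 dB louder. The 5000 Hz component survives even though it is quiet, because nothing loud is near it.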

MP3 Audio Lossy Compression. The study of these auditory phenomena is called psychoacoustics. The phenomena can be accurately described in tables and charts: mathematical models representing human hearing patterns. These can be stored in the codec as reference tables. Article on psychoacoustics: http://www.uaudio.com/blog/how-the-ear-works/

MP3 Audio Lossy Compression. MP3 encoding tools analyze the incoming source signal, break it down into mathematical patterns, and compare these patterns to psychoacoustic models stored in the encoder itself. The encoder can then discard most of the data that doesn't match the stored models, keeping that which does. This shrinks the file by discarding a great deal of extra data.

MP3 Audio Lossy Compression. The MP3 encoding process is a two-pass system.
Step 1: Run all psychoacoustic models, discarding data. Then compress what's left to shrink the storage space.
Step 2: Huffman coding, which does not discard any data. It lets you store what's left in a smaller amount of space by using fewer bits to store the most common symbols.
Step 2a: Break the resulting audio stream into frames, which are assembled into a bitstream with header information preceding each data frame. Headers contain "meta-data" specific to that frame, such as an ID, bitrate, sampling frequency, padding, and type of frame (MPEG-1 or MPEG-2).
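The Huffman step in Step 2 gives shorter codes to more frequent symbols. Below is a minimal textbook Huffman construction over an arbitrary symbol sequence. It is a sketch of the general technique, not the fixed code tables an MP3 encoder actually uses, and the function name is illustrative.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The tie_breaker keeps Python from ever comparing the dictionaries.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, code):
    """Concatenate the code words for a symbol sequence."""
    return "".join(code[s] for s in symbols)
```

For the string `"aaaaaaaabbbc"` the frequent symbol `a` gets a 1-bit code and the rarer `b` and `c` get 2-bit codes, so the message takes 16 bits instead of the 24 a fixed 2-bit code would need. No information is discarded.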

Basic Structure of an Audio Encoder. Limit values to audible tones. Note: a decoder works in just the opposite manner.

Processes of an Audio Encoder. Mapping block: divides the audio input into 32 equal-width frequency subbands (samples). Psychoacoustic block: calculates the masking threshold for each subband.

Processes of an Audio Encoder. Bit-allocation block (in the quantizer block): allocates bits using the outputs of the mapping and psychoacoustic blocks. Quantizer & coding block: scales and quantizes (reduces) the samples. Frame packing block: formats the samples with headers into an encoded stream.
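Two of the encoder blocks described above can be sketched roughly: mapping a frequency to one of 32 equal-width subbands, and scaling then quantizing samples to a given number of bits. The 44.1 kHz sample rate, the peak-based scale factor, and all function names are assumptions made for illustration; a real encoder uses a filter bank and standardized scale factor tables.

```python
def subband_index(freq_hz, sample_rate_hz=44100, num_subbands=32):
    """Index of the equal-width subband (0 up to Nyquist) containing freq_hz."""
    width = (sample_rate_hz / 2) / num_subbands
    return min(int(freq_hz // width), num_subbands - 1)


def quantize(samples, bits):
    """Scale samples by their peak, then round to signed integers.

    Assumes bits >= 2. Returns (scale_factor, integer_codes).
    """
    scale = max(abs(s) for s in samples) or 1.0
    levels = 2 ** (bits - 1) - 1
    return scale, [round(s / scale * levels) for s in samples]


def dequantize(scale, codes, bits):
    """Approximate reconstruction of the samples from scale factor and codes."""
    levels = 2 ** (bits - 1) - 1
    return [c / levels * scale for c in codes]
```

Allocating fewer bits to a subband makes the rounding error (quantization noise) larger. The bit-allocation block tries to keep that noise under the masking threshold computed by the psychoacoustic block.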

Video Encoding, Standards

MPEG Organization. Moving Picture Experts Group, established in 1988. Its standards fall under the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). Official name: ISO/IEC JTC1 SC29 WG11. Responsible for the MPEG standards.

Evolution of MPEG: MPEG-1. Initial audio/video compression standard. Used by VCDs in the 1990s. MP3 = MPEG-1 Audio Layer 3. Target of 1.5 Mb/s bitrate at 352x240 resolution. Supports only progressive pictures, not interlaced pictures.

Evolution of MPEG: MPEG-2. Standard still widely used in DVD and digital TV. Support in current hardware implies that it will be here for a long time. The transition to HDTV has taken over 10 years and is not finished yet. Different profiles and levels allow for quality control.

Evolution of MPEG: MPEG-3. Originally developed for HDTV, but abandoned when MPEG-2 was determined to be sufficient. MPEG-4: includes support for AV objects, 3D content, low-bitrate encoding, and DRM. In practice, it provides equal quality to MPEG-2 at a lower bitrate. MPEG-4 Part 10 is H.264, which is used in HD DVD and Blu-ray. H.264 is the encoding used in video

MPEG-2 technical specification
Part 1 - Systems - describes synchronization and multiplexing of video and audio.
Part 2 - Video - compression codec for interlaced and non-interlaced video signals.
Part 3 - Audio - compression codec for perceptual coding of audio signals. A multichannel-enabled extension of MPEG-1 audio.
Part 4 - Describes procedures for testing compliance.
Part 5 - Describes systems for software simulation.
Part 6 - Describes extensions for DSM-CC (Digital Storage Media Command and Control).
Part 7 - Advanced Audio Coding (AAC).
Part 8 - Deleted.
Part 9 - Extension for real-time interfaces.
Part 10 - Conformance extensions for DSM-CC.

MPEG Video: spatial domain processing. The spatial domain is handled similarly to JPEG. Convert RGB values to the YUV colorspace: one brightness component and two color components. YUV comes from television; RGB from graphics processing. Y represents luminosity; U and V represent color. The color components can be represented with fewer bits, since the human eye is much less sensitive to missing color detail; we care more about brightness. Split the frame into 8x8 blocks.
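A minimal sketch of the color conversion and the "fewer bits for color" step. The coefficients are those of the full-range YCbCr variant used by JPEG/JFIF, which is assumed here as a concrete stand-in for the slide's "YUV". The 2x2 averaging is one simple way to subsample chroma, and both function names are illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr), JFIF-style full range."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_2x2(plane):
    """Average each 2x2 neighborhood of a chroma plane (even dimensions assumed)."""
    return [
        [
            (plane[r][c] + plane[r][c + 1] + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
            for c in range(0, len(plane[0]), 2)
        ]
        for r in range(0, len(plane), 2)
    ]
```

Keeping Y at full resolution and subsampling both color planes by 2x2 cuts the sample count from 3 per pixel to 1.5 per pixel before any transform coding happens.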

8 x 8 Blocks

MPEG Video: spatial domain processing. Apply a 2-D Discrete Cosine Transform (DCT) to each block. This is similar to a Fourier transform in signal processing: it transforms each block into lower-frequency and higher-frequency coefficients. It pushes the more important low-frequency values toward the upper-left corner of the 8x8 block. For a typical image, most of the visually significant information is concentrated in just a few DCT coefficients. Quantization of the DCT coefficients: values that are near zero are converted to zero; the remaining values are scaled down; all are represented by integers.
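The energy-compaction claim is easy to check with a direct (slow) implementation of the 8x8 two-dimensional DCT-II in its orthonormal form. This is a teaching sketch, so the function name is illustrative and real codecs use fast factorizations instead of the quadruple loop.

```python
import math

N = 8  # block size used by JPEG and MPEG


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    )
            out[u][v] = alpha(u) * alpha(v) * total
    return out
```

For a perfectly flat block (every pixel 100), all of the information lands in the single top-left (DC) coefficient, which equals 800, and the other 63 coefficients are zero up to rounding error. Smooth natural image blocks behave similarly, with a few large low-frequency coefficients and many small ones.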

Quantization matrix. Quantization divides each coefficient by a number taken from a quantization matrix. The matrix is pre-calculated (the JPEG standard supplies example tables) and favors the items in the top-left corner of the matrix, the lower-frequency, more significant terms. Each coefficient has a different weighting.
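A sketch of the divide-and-round step follows. The matrix built by `make_quant_matrix` is hypothetical: it simply grows with distance from the top-left corner and is not one of the tables from the JPEG standard. All names here are illustrative.

```python
def make_quant_matrix(base=8, step=4, n=8):
    """Hypothetical quantization matrix: small divisors top-left, large bottom-right."""
    return [[base + step * (u + v) for v in range(n)] for u in range(n)]


def quantize_block(coeffs, qmatrix):
    """Divide each DCT coefficient by its matrix entry and round to an integer."""
    return [
        [round(c / q) for c, q in zip(c_row, q_row)]
        for c_row, q_row in zip(coeffs, qmatrix)
    ]


def dequantize_block(levels, qmatrix):
    """Decoder side: multiply back. The rounding error is not recoverable."""
    return [
        [lv * q for lv, q in zip(l_row, q_row)]
        for l_row, q_row in zip(levels, qmatrix)
    ]
```

With this matrix, a low-frequency coefficient of -31 is divided by 12 and stored as -3, which decodes to -36. A high-frequency coefficient of 20 is divided by 64 and becomes 0, so it disappears entirely.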

Run-length Encoding. The quantized coefficients, many of which are now zero, are run-length encoded. The regular JPEG standard then uses an advanced version of Huffman coding.
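Run-length encoding pays off because the block is read in zigzag order, which groups the high-frequency zeros into long runs. Below is a simplified sketch. Real JPEG codes the top-left (DC) coefficient separately and packs runs and sizes into Huffman symbols, which is omitted here, and the function names are illustrative.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in JPEG-style zigzag order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Sort by anti-diagonal; odd diagonals run top-to-bottom, even ones bottom-to-top.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def run_length_encode(block):
    """Return (zeros_skipped, value) pairs for nonzero entries, then 'EOB'."""
    scan = [block[r][c] for r, c in zigzag_order(len(block))]
    pairs, run = [], 0
    for value in scan:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")  # end of block: everything after this point is zero
    return pairs
```

A quantized block whose only nonzero entries are 100 at (0,0), -3 at (0,1), and 5 at (2,0) is reduced from 64 numbers to `[(0, 100), (0, -3), (1, 5), 'EOB']`.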

DCT Transform on Blocks: Final Result. Reduction in the number of bits. Decompression is the reverse process. However, because of the lossy step, you can't quite get back to the original image; there is a loss of information. Nice examples using the Discrete Cosine Transform: http://www.dspguide.com/ch27/6.htm http://datagenetics.com/blog/november32012/index.html

MPEG video: time domain processing. A totally new ballgame (this concept doesn't exist in JPEG). General idea: use motion vectors to specify how a 16x16 macroblock translates between the reference frame and the current frame, then code the difference between the reference and the actual block.
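The general idea can be sketched as a brute-force block-matching search: try every offset in a small window of the reference frame, keep the one with the lowest error, and code the remaining difference (the residual). The exhaustive search, the mean-squared-error metric, the window size, and the function names are all illustrative assumptions; practical encoders use much smarter search strategies.

```python
def block_at(frame, top, left, size):
    """Extract a size x size block from a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def mse(a, b):
    """Mean squared error between two equally sized blocks."""
    count = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / count


def find_motion_vector(ref, cur, top, left, size=16, search=4):
    """Best (dy, dx) offset into `ref` for the block of `cur` at (top, left).

    Returns ((dy, dx), error, residual_block).
    """
    target = block_at(cur, top, left, size)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + size > len(ref) or c + size > len(ref[0]):
                continue  # candidate block would fall outside the reference frame
            err = mse(block_at(ref, r, c, size), target)
            if best is None or err < best[0]:
                best = (err, dy, dx)
    err, dy, dx = best
    predicted = block_at(ref, top + dy, left + dx, size)
    residual = [[t - p for t, p in zip(tr, pr)] for tr, pr in zip(target, predicted)]
    return (dy, dx), err, residual
```

When the scene has simply shifted, the residual is all zeros, so the encoder only needs to send the motion vector instead of the pixels.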

MPEG video: time domain processing. GOP (Group of Pictures): a GOP is a set of consecutive frames that can be decoded without any other reference frames. It is usually 12 or 15 frames long and starts with an I-frame.

MPEG video: time domain processing. Group of Pictures (GOP):
I-frames: can be reconstructed without any reference to other frames, like still pictures.
P-frames: forward predicted from the last I-frame or P-frame; they code differences such as movement, typically two to four frames in the future.
B-frames: forward and backward predicted.
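A small sketch of how the three frame types are typically laid out in a GOP, and why frames are transmitted in a different order than they are displayed: a B-frame needs the following I- or P-frame before it can be decoded. The spacing of 3 between anchor frames is a common choice assumed here, and the function names are illustrative.

```python
def gop_pattern(length=12, anchor_spacing=3):
    """Display-order frame types: I first, a P every `anchor_spacing`, B elsewhere."""
    return "".join(
        "I" if i == 0 else ("P" if i % anchor_spacing == 0 else "B")
        for i in range(length)
    )


def coding_order(pattern):
    """Display indices in the order frames must be sent and decoded.

    Each I- or P-frame is sent before the B-frames that precede it on screen.
    """
    order, pending_b = [], []
    for index, kind in enumerate(pattern):
        if kind == "B":
            pending_b.append(index)
        else:
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    # Trailing B-frames have no later anchor inside this pattern; in a real
    # stream they would wait for the next GOP's I-frame.
    order.extend(pending_b)
    return order
```

For example, `gop_pattern(12, 3)` gives `IBBPBBPBBPBB`, and the display sequence `IBBPBBP` is sent as frames 0, 3, 1, 2, 6, 4, 5.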

MPEG Processing GOP

MPEG GOP

Final Comments on Prediction. Only use a motion vector if a close match can be found. Evaluate closeness with mean squared error (MSE) or another metric. You can't search all possible blocks, so a smart search algorithm is needed. If no suitable match is found, just code the macroblock as an I-block. If a scene change is detected, start fresh. You don't want too many P- or B-frames in a row: prediction error keeps propagating until the next I-frame, and decoding is delayed.
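The "only use a motion vector if a close match can be found" rule amounts to a threshold test on the matching error. In this minimal sketch the threshold value of 50 and the function names are hypothetical; a real encoder would usually compare the estimated bit cost of each option.

```python
def mean_squared_error(block_a, block_b):
    """MSE between two blocks given as flat lists of pixel values."""
    return sum((a - b) ** 2 for a, b in zip(block_a, block_b)) / len(block_a)


def choose_macroblock_mode(current, best_candidate, threshold=50.0):
    """Return 'inter' to predict from the candidate, or 'intra' to code the block by itself.

    `threshold` is an illustrative assumption, not a value from any standard.
    """
    error = mean_squared_error(current, best_candidate)
    return "inter" if error <= threshold else "intra"
```

A near-identical candidate yields `inter`, so only a small residual is coded. A poor candidate, as after a scene change, yields `intra`, and the block is coded like an I-block.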

MPEG Usefulness: multimedia communications, webcasting, broadcasting, video on demand, interactive digital media, telecommunications, mobile communications.


Summary. Video and audio have become a huge part of our daily interaction with the Internet. New codecs and file formats are being proposed all the time. The number of devices with different needs is driving the push for more efficient ways to compress and deliver streaming media.

End. The new program is up: the last assignment.


Mapping: divide the input into 32 subbands, or frequency samples.
Psychoacoustic: calculate the threshold below which noise is imperceptible to the human ear (mapping and psychoacoustic analysis can be done independently).
Bit allocation: the total noise-to-mask ratio can be minimized over all the channels and subbands.
Frame packing: the header includes bit allocation and scaling information (scale factor).
Quantizer & coding: samples are scaled and quantized according to the bit allocation.


The MPEG file consists of compressed video data, called the video stream. The basic unit of the video stream is a "Group of Pictures" (GOP), made up of three picture types, also called frames: I, P, and B. I-frames can be reconstructed without any reference to other frames; this type of frame contains information only about itself. On average, an I-frame occurs once in every ten to fifteen frames of a motion picture. P-frames can only be recreated by reference to a previous I-frame or P-frame; it is impossible to construct them without data from another frame. B-frames are referred to as bi-directional frames, because they can be recreated based on forward and backward predictions from the information presented in the nearest preceding and following I- or P-frame.
