Vector Bank Based Multimedia Codec System-on-a-Chip (SoC) Design

2009 10th International Symposium on Pervasive Systems, Algorithms, and Networks

Ruei-Xi Chen, Wei Zhao, Jeffrey Fan and Asad Davari
Computer Science and Information Engineering, St. John's University, Taipei, Taiwan
Electrical and Computer Engineering, Florida International University, Miami, Florida, USA
Electrical and Computer Engineering, West Virginia University Institute of Technology, Montgomery, West Virginia, USA

978-0-7695-3908-9/09 $26.00 © 2009 IEEE. DOI 10.1109/I-SPAN.2009.74

Abstract: In this paper, we present a design architecture that implements a Vector Bank in a video encoder system, namely an H.264 encoder, in order to detect and analyze moving objects within a specific area. We also believe that transmission bandwidth can be saved with the Vector Bank design. Motion Estimation is a common technology in today's video codecs. By extracting the vector data from the Motion Estimation block using the motion detection method, and applying the Laplacian of Gaussian operator, we can obtain the object motion data generated from up to 16 reference frames. Thus, the design could dramatically save bandwidth, processing load, and memory resources.

Keywords: H.264; Edge Detection; Motion Estimation

I. INTRODUCTION

People have been using visual sensors to set up surveillance systems for years. However, few such systems are widely used for home security, because of their limited ability to transmit and analyze data. Figure-1 shows a simple platform commonly used for institutional security: analog sensors and an analog network deliver TV signals to a guard watching in a security room. An analog signal is easy to transmit but hard to analyze, store, and encrypt. If a real-time alert is needed, human intervention is in most cases a must. That will not work for regular families in home security.

Figure 1. Analog Visual Sensors network for Secure Surveillance System

Things have changed, and the whole world is becoming digital. Today, digital visual sensors can be purchased on the open market. Their price is reasonably low, and each camera has a video encoding chip inside. Still, this solution is not well suited to home use. As Figure-2 shows, since most video compression codecs are resource-consuming, we need a powerful computer to decompress the video data coming from all the sensors and analyze it in real time. That is clearly not something for home use.

Figure 2. Digital Visual Sensors network for Secure Surveillance System

The technique we propose is to add a vector bank to a typical video codec core, and to use a moving object detection method plus a boundary detection operator to identify the moving object.

H.264/AVC [1] (also known as MPEG-4 Part 10) is becoming the most popular video encoding and decoding standard today. It was developed by the ITU-T (International Telecommunication Union - Telecommunication Standardization Sector) and MPEG (ISO/IEC Moving Picture Experts Group), and it achieves an advanced compression ratio, producing output about 50% of the size of previous generations such as MPEG-2 [2].

An important part of H.264 is Motion Estimation (ME), as shown in Figure-3 [3]. The H.264 encoder pushes both the current frame and the reference frames (i.e., previous frames) to the ME block. The ME block analyzes the similarity of the Macro-Blocks (MB) between the current frame and several reference frames. The resulting relations output by the ME block are called Motion Vectors.

Figure 3. General H.264 Encoder Core Architecture

Moving object detection and tracking has been studied for years. It is widely used in areas such as video surveillance, machine-human interfaces and authentication systems. Different algorithms are in use today to track moving objects [4] [5] [6] [7]. Most of them use the frame differences between neighboring frames to detect moving objects. In this paper, we use the Motion Vectors generated by the H.264 encoder to indicate the difference between two frames and the moving objects. After detecting the moving objects from the vector data coming out of the H.264 encoder, we pass those vectors into the edge detection unit. There are several edge detection operators [8] that we can use. In this paper, we suggest two: the first-derivative Sobel operator and the second-derivative Laplacian of Gaussian operator.

The rest of the paper is organized as follows. The Vector Bank and vector-based Motion Detection are introduced in Section II. The edge detection operators are described in Section III. Experimental results are demonstrated in Section IV. Finally, the conclusion is given in Section V.

II. VECTOR BASED MOVING OBJECT DETECTION

The Vector Bank is a memory-based analyzer attached to the Motion Estimation block of a typical H.264 hardware encoder [9] [10].

A. Motion Estimation and Motion Vectors

Figure-4 [3] shows a portion of a typical Motion Vector Map from a video encoding procedure. Given the current frame and the reference frames (decoded previous frames), the Motion Estimation Block generates the Macro-Block based Motion Vectors.
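To illustrate how a Motion Estimation block can arrive at one vector per Macro-Block, the following is a minimal sketch of exhaustive block matching with the sum of absolute differences (SAD). It is not the H.264 reference algorithm or the hardware design described in this paper: the function name `estimate_motion_vectors`, the search range of 4 pixels, the use of a single reference frame, and the convention that a vector points from the current block to its best match in the reference frame are all assumptions made for the example.

```python
import numpy as np

def estimate_motion_vectors(current, reference, block=16, search=4):
    """Full-search block matching; returns one (dy, dx) vector per block.

    Hypothetical helper for illustration: (dy, dx) is the displacement from
    a block in `current` to its best SAD match in `reference`.
    """
    height, width = current.shape
    rows, cols = height // block, width // block
    vectors = np.zeros((rows, cols, 2), dtype=np.int32)
    # Work in a signed type so pixel differences cannot wrap around.
    cur = current.astype(np.int32)
    ref = reference.astype(np.int32)
    for r in range(rows):
        for c in range(cols):
            y0, x0 = r * block, c * block
            target = cur[y0:y0 + block, x0:x0 + block]
            # Start from the zero vector so a still block keeps (0, 0).
            best_sad = np.abs(target - ref[y0:y0 + block, x0:x0 + block]).sum()
            best = (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = y0 + dy, x0 + dx
                    # Skip candidates that fall outside the reference frame.
                    if y < 0 or x < 0 or y + block > height or x + block > width:
                        continue
                    sad = np.abs(target - ref[y:y + block, x:x + block]).sum()
                    if sad < best_sad:
                        best_sad, best = sad, (dy, dx)
            vectors[r, c] = best
    return vectors
```

Because the search starts from the zero displacement and only accepts strictly smaller SAD values, blocks in an unchanged background keep a zero vector, which is the property the Vector Bank relies on later to separate moving objects from a still scene.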
The motion vectors, along with the residues, are then transformed, quantized and compressed into the video bitstream, and the Motion Estimation Block discards the vectors so that the next MB can be processed. In general, to analyze the vector information inside the encoded video, we must decompress the video file to recover it. Decoding an H.264 Standard Definition (SD) file (e.g., an NTSC 480p file) in real time generally takes a fully loaded Pentium-4 CPU platform. With just 2 video cameras, we would need two PCs. And if the video sensor is HD (High Definition, e.g., 1080p), it is almost impossible to decode the stream in real time, and even harder to run additional algorithms that detect the moving object from the motion vectors.

Figure 4. A typical Motion Vector Map in H.264 encoding procedure

B. Vector Bank

As Figure-5 shows, the Vector Bank grabs the Motion Vectors from the output of the Motion Estimation Block. Using a queue, the Vector Bank reassembles the block-based motion vectors into a frame-based vector map. This is important because moving objects are defined over frames, not over individual Macro-Blocks.

Figure 5. Implementing the Vector Bank in a typical H.264 Motion Estimation block

C. Moving Object Detection

A digital camera for surveillance is normally mounted at a fixed, still location, so the picture has a still or almost steady background. In most cases the background does not move, so it yields no vectors. If any movement occurs, however, the module with the Vector Bank raises an interrupt to the CPU on the non-zero vectors. The CPU (or a programmed embedded DSP) then processes the vector data to obtain information about the moving object.

Here is an example of how we identify moving objects. Figure-6 shows 2 frames of a home video taken by a steady camcorder. The only moving object here is a person. The two frames are taken 0.3 seconds apart, and the field shown in Figure-6 is only a portion of the whole frame.

Figure 6. Two example frames of a home video

With the H.264 motion estimation algorithm, the MB vectors are generated from the differences between the two frames. After the Vector Bank has collected all the Motion Vectors, it holds the view shown in Figure-7. The vectors are not the same for every individual MB, but the still background has no vectors at all in this case. That makes it easy to isolate the moving object from the background. If a programmable DSP or an intelligent CPU is available, object detection becomes smoother, and the direction and speed of the object can be predicted.

Figure 7. The Motion Vectors indicate the difference between the two frames

III. EDGE DETECTION OPERATOR

As an important component of Computer Vision theory, Edge Detection is well developed and widely used in the field of Digital Image Processing. There are many edge detection algorithms, including first-derivative operators and second-derivative operators. First-derivative operators such as Roberts, Prewitt and Sobel [8] detect the edges of an image in one dimension at a time (horizontal or vertical), while second-derivative operators such as the Laplacian operator detect in both dimensions at the same time. In this paper, we consider two of the best-known edge operators, Sobel and Laplacian of Gaussian.

A. Sobel Operator

The Sobel operator is a first-derivative edge detection operator. It is simple and easy to realize in most cases. To detect the edges of a 2-D image, however, we need to run Sobel twice, once in each direction. A typical pair of Sobel directional kernels (also shown in Figure-8) is:

$$G_y = \begin{bmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} \tag{1}$$

and

$$G_x = \begin{bmatrix} +1 & 0 & -1 \\ +2 & 0 & -2 \\ +1 & 0 & -1 \end{bmatrix} \tag{2}$$

Figure 8. The Sobel Operator 3-D Plot in MATLAB

$G_x$ and $G_y$ can be combined to obtain the absolute magnitude of the gradient:

$$|G| = \sqrt{G_x^2 + G_y^2} \tag{3}$$

For fast computation, the magnitude can also be computed approximately as:

$$|G| = |G_x| + |G_y| \tag{4}$$
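Equations (1) to (4) can be checked numerically. The sketch below applies the two kernels to a small grayscale image with plain NumPy and returns both the exact magnitude of Equation (3) and the fast approximation of Equation (4). The helper names `correlate3x3` and `sobel_magnitude`, and the choice to evaluate interior pixels only (no border padding), are assumptions made for this illustration and are not part of the paper.

```python
import numpy as np

# Kernels from Equations (1) and (2).
G_Y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=np.float64)
G_X = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=np.float64)

def correlate3x3(image, kernel):
    """Apply a 3x3 kernel at every interior pixel (hypothetical helper)."""
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    # Accumulate each kernel tap over a shifted view of the image.
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

def sobel_magnitude(image):
    """Return (exact, approx): Equation (3) and Equation (4)."""
    gx = correlate3x3(image, G_X)
    gy = correlate3x3(image, G_Y)
    exact = np.sqrt(gx ** 2 + gy ** 2)
    approx = np.abs(gx) + np.abs(gy)
    return exact, approx
```

For a purely vertical or purely horizontal edge one of the two responses is zero, so the two magnitudes coincide; elsewhere the approximation is never smaller than the exact value, which is why it trades a square root for a slight overestimate.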

Thus, the approximate kernel for the 2-D Sobel detection operator, where $z_1, \dots, z_9$ denote the pixels of the 3×3 neighborhood numbered row by row, is:

$$|G| = \left|(z_1 + 2z_2 + z_3) - (z_7 + 2z_8 + z_9)\right| + \left|(z_3 + 2z_6 + z_9) - (z_1 + 2z_4 + z_7)\right| \tag{5}$$

B. Laplacian of Gaussian (LoG) Operator

The Laplacian of Gaussian (LoG) operator is a second-derivative edge operator. The 2-D Laplacian of a function $f$ is:

$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \tag{6}$$

The typical Gaussian kernel with width $\sigma$ is:

$$G_\sigma = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{x^2 + y^2}{2\sigma^2}} \tag{7}$$

So, the Laplacian of Gaussian is:

$$\nabla^2 G_\sigma = \frac{\partial^2 G_\sigma}{\partial x^2} + \frac{\partial^2 G_\sigma}{\partial y^2} \tag{8}$$

Since $x$ and $y$ enter this equation symmetrically, we determine the $x$ part first (the constant normalization factor is omitted from here on, as it does not change where the response crosses zero):

$$\frac{\partial^2 G_\sigma}{\partial x^2} = \frac{x^2 - \sigma^2}{\sigma^4}\, e^{-\frac{x^2 + y^2}{2\sigma^2}} \tag{9}$$

Let $x^2 + y^2 = r^2$, and put the $x$ and $y$ parts back together:

$$\nabla^2 G_\sigma = \frac{x^2 + y^2 - 2\sigma^2}{\sigma^4}\, e^{-\frac{x^2 + y^2}{2\sigma^2}} = \frac{r^2 - 2\sigma^2}{\sigma^4}\, e^{-\frac{r^2}{2\sigma^2}} \tag{10}$$

Figure 9. The Laplacian of Gaussian 3-D Plot in MATLAB

Finally, with the selection of $\sigma$, we find the 9×9 digital approximation of the equation [9]:

$$\begin{bmatrix}
0 & 1 & 1 & 2 & 2 & 2 & 1 & 1 & 0 \\
1 & 2 & 4 & 5 & 5 & 5 & 4 & 2 & 1 \\
1 & 4 & 5 & 3 & 0 & 3 & 5 & 4 & 1 \\
2 & 5 & 3 & -12 & -24 & -12 & 3 & 5 & 2 \\
2 & 5 & 0 & -24 & -40 & -24 & 0 & 5 & 2 \\
2 & 5 & 3 & -12 & -24 & -12 & 3 & 5 & 2 \\
1 & 4 & 5 & 3 & 0 & 3 & 5 & 4 & 1 \\
1 & 2 & 4 & 5 & 5 & 5 & 4 & 2 & 1 \\
0 & 1 & 1 & 2 & 2 & 2 & 1 & 1 & 0
\end{bmatrix} \tag{11}$$

IV. EXPERIMENTAL RESULT

A. Experiment Assumptions

Consider a typical home security video surveillance sensor monitoring a specific area that contains moving objects, public areas and restricted areas, as shown in Figure-10. In this case, we monitor every possible movement in the scene, not only in the restricted area. If a moving object approaches the restricted area (e.g., a private garden or yard), our system should sound an alert and try to locate the object.

Figure 10. The Experiment Scene Setup

B. LoG result without Motion Vector Bank

As mentioned in the last section, the LoG algorithm operates on a still image and uses boundaries in color difference to detect objects. In a still image, however, the boundary of the moving object is hard to detect and isolate, because the other high-frequency spatial color signals also produce edges. Figure-11 shows the output of the LoG algorithm for a single frame of our video.

Figure 11. The Laplacian of Gaussian (LoG) output for a single frame

C. LoG result with Motion Vector Bank

In comparison to color-based LoG processing, vector-based LoG processing has 2 important advantages:

1) The vector difference result does not contain still color noise. No matter how colorful the still background is, the vector-based LoG result contains only the moving object(s).
2) Vector-based LoG processing is more efficient. Instead of processing 16x16 (256) color points, each 16x16 Macro-Block generates one vector datum. That is a 256-fold saving.

Let us look at the result of the vector-based LoG processing in Figure-12. The only result is the borders of the two moving objects (the background is shown so that readers can better understand those borders).

Figure 12. The Laplacian of Gaussian (LoG) output for a single frame

In the real algorithm, we select an appropriate threshold to achieve the best results. We assign threshold = 0.7 for the color frame LoG and threshold = 0.5 for the vector frame LoG. The tolerance of the color frame LoG threshold is quite small: either 0.6 or 0.8 would ruin the whole picture because of the large variety of color schemes. The Motion Vectors, by contrast, are quite uniform in this case. Even with any threshold from 0.3 to 0.9 for the vector frame, the result remains the same.

Figure 13. The Laplacian of Gaussian (LoG) output for a single frame

D. Border violation detection and feedback

Border violation detection is based on the moving object borders output by the vector LoG processing. We can easily detect that object 1 is inside the pre-defined restricted area, while object 2 is in the public area. So our system should give the feedback that signals the alert, and start storing the H.264 stream from this point on.

V. CONCLUSION

The Vector Bank based H.264 architecture can extract the motion vectors during the encoding process. With a few steps of mathematical analysis, such as the Laplacian of Gaussian filter, we can locate and identify the moving object. Furthermore, the Vector Bank can tell whether there is movement in the observed area, which serves as the key switch to start recording or transmitting the compressed video data. The proposed approach can save bandwidth when more than one visual camera shares the bandwidth with equal priority.

1) An extended memory structure for H.264 is not hard to build. The implementation cost would not be higher than that of today's video encoder chips, so the proposed approach is affordable and achievable.
2) No separate computer (or person) is needed to monitor the video stream, which makes home-based surveillance network sensors possible.
3) Sensors do not need to stream out video data when no moving violation occurs. That saves bandwidth across the overall surveillance network.
4) Surveillance video needs to be recorded only while a moving violation is taking place. This could dramatically save memory resources.

REFERENCES

[1] Joint Video Team of ITU-T and ISO/IEC JTC 1, "Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC)," Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, Document JVT-G050, December 2003.
[2] R. Chen, W. Zhao, Q. Liu, J. Fan, "Efficient H.264 architecture using modular bandwidth estimation," IEEE 5th International Conference on Embedded Software and Systems (ICESS 08), pp. 277-282, Chengdu, China, July 29-31, 2008.
[3] I. E. G. Richardson, "H.264 and MPEG-4 video compression," pp. 27-28, August 2003.
[4] D. Li, "Moving objects detection by block comparison," Electronics, Circuits and Systems, vol. 1, pp. 341-344, Dec. 2000.
[5] R. Cucchiara, C. Grana, M. Piccardi and A. Prati, "Statistic and knowledge-based moving object detection in traffic scenes," IEEE Proceedings, Intelligent Transportation Systems, pp. 27-32, Oct. 2000.

[6] Y.K. Jung, K.W. Lee and Y.S. Ho, "Content-based event retrieval using semantic scene interpretation for automated traffic surveillance," IEEE Transactions on Intelligent Transportation Systems, vol. 2, pp. 151-163, Sep. 2001.
[7] R. Montoliu and F. Pla, "Multiple parametric motion model estimation and segmentation," ICIP 2001, vol. 2, pp. 933-936, Oct. 2001.
[8] R. C. Gonzalez and R. E. Woods, "Digital image processing," vol. 10, no. 2, pp. 585-611, 2001.
[9] W. Zhao, Z. Luo, Jeffrey Fan, S. Tan, "Vector edge detection in H.264 Implementation," IEEE 5th International Conference on Embedded Software and Systems Symposia (ISHSO 08), pp. 208-212, Chengdu, China, July 29-31, 2008.
[10] W. Zhao, Jeffrey Fan, A. Davari, "Vector bank based target tracking via vision sensors in aviation systems," IEEE 41st Southeastern Symposium on System Theory (SSST 09), pp. 73-76, Tullahoma, TN, March 15-17, 2009.