Development and optimization of coding algorithms for mobile 3DTV. Gerhard Tech Heribert Brust Karsten Müller Anil Aksay Done Bugdayci



Abstract: Error resilience tools for H.264/AVC are presented. Slice encoding has been implemented in the H.264/MVC reference software JMVC. An evaluation of the new encoder shows that the additional bit rate needed for error resilience can be neglected for error-free channels. For an error-prone channel, the new slice mode provides sufficient error resilience. The Mixed Resolution Stereo Coding (MRSC) approach has been evaluated, and the optimal bit rate distribution between the left view and the down-sampled right view has been examined. The Advanced Mixed Resolution Stereo Coding (AMRSC) approach has been developed. Its three main features are optimized down sampling, inter-view prediction and view enhancement using unsharp masking. The suitability of sub sampling and low pass filtering together with inter-view prediction has been evaluated objectively. Further improvements of coding efficiency have been achieved by optimizing the bit rate distribution between the full view and the predicted down-sampled view. AMRSC is compliant with the overall MVC coding strategy in Mobile3DTV. For the subjective evaluation of coding methods, 96 test stimuli have been generated from the six sequences of the coding test set using Simulcast, Multi View, Mixed Resolution and Video + Depth coding. Two codec profiles and a low and a high quality level have been used. 24 test stimuli have been generated for the subjective evaluation of transmission approaches using Simulcast, Multi View and Video + Depth coding. Half of the 24 sequences have been coded using the new encoder supporting error resilience methods.

Keywords: 3DTV, Error Resilience, Mixed Resolution Coding, Generation of coded test sequences

Executive Summary

This deliverable has three parts. The first part deals with the new prototype of the software encoder using error resilience tools. The second part describes the examination and the development of the Mixed Resolution approach. The optimization of the coding approach for subjective tests is presented in the last part.

H.264/AVC provides several error resilience tools. However, none of them is implemented in the JMVC reference software for the MVC extension of H.264/AVC. Slice encoding has therefore been implemented in the H.264/MVC reference software JMVC. Frames are stored in smaller data packets that can still be decoded independently in case of losses. The new encoder has been evaluated using an error-free and an error-prone channel. Coding tests show that the additional bit rate needed for error resilience can be neglected and that video quality decreases only slightly for error-free channels. For the error-prone channel it has been demonstrated that the new slice mode provides sufficient error resilience and leads to a high gain in video quality.

In classical stereo coder settings, both views have the same resolution. An interesting alternative is the Mixed Resolution Coding approach, which is also evaluated. It is found that the optimal bit rate distribution between the left view and the down-sampled right view assigns approximately 30% to 35% of the bit rate to the down-sampled view. The quality of Mixed Resolution and Full Resolution Coding is evaluated subjectively. The evaluation shows that the subjective quality of coded Mixed Resolution sequences is better than that of Simulcast coded sequences, due to a reduced number of coding artifacts. Although this approach yields lower bit rates, the perceived quality may not always be close to that of the full view. Therefore, beyond the Mixed Resolution approach, an Advanced Mixed Resolution Stereo Coding (AMRSC) approach has been developed.

The three main features of the AMRSC approach are optimized down sampling, inter-view prediction and view enhancement using unsharp masking at the receiver side. The suitability of sub sampling and low pass filtering together with inter-view prediction was evaluated objectively. Further improvements in coding efficiency were achieved by optimizing the bit rate distribution between the full view and the predicted down-sampled view. The Mobile3DTV application will use an MVC codec, and the AMRSC approach is now compliant with it.

Test stimuli for subjective evaluations have been generated. Coding results for various stereo video coding approaches, codecs and codec settings are presented. For the subjective evaluation of coding methods, a total of 96 test stimuli have been generated from the six sequences of the coding test set using Simulcast, Multi View, Mixed Resolution and Video + Depth Coding. Moreover, a baseline and a high codec profile were used. An objective evaluation was carried out at a low and a high quality level. 24 test stimuli have been generated for the subjective evaluation of transmission methods. For this purpose, the four sequences of the transmission test set have been coded using Simulcast, Multi View and Video + Depth coding. Half of the sequences were coded using the new encoder with error resilience tools.

Table of Contents

1 Introduction
2 Software encoder using error resilience tools
    Slice Interleaving
    Modified MVC encoder
    Modified MVC bit stream assembler
    Modified MVC decoder
    Evaluation of MVC encoder using slice mode
        Test Setup
        Coding Results for error free channel
        Coding Results for error prone channel
    Conclusion
3 Mixed Resolution Coding
    Optimization of Mixed Resolution Stereo Coding (MRSC)
        Different sampling methods of MRSC
        Objective criteria for bit rate allocation
        Subjective evaluation
    Advanced Mixed Resolution Coding (AMRSC)
        Interview Prediction
        View Enhancement using unsharp masking
    Conclusion
4 Optimization of coding approaches for subjective tests
    Test sequences for subjective evaluation of coding approaches
        Test setup
        Coding Results
        Generated Test Stimuli
        Conclusion
    Test sequences for transmission studies
        Test setup
        Coding Results
        Generated Test Stimuli
        Conclusion
5 Conclusion

1 Introduction

This deliverable consists of three parts. The first part deals with the new prototype of the software encoder using error resilience tools. The second part describes the examination and the development of the Mixed Resolution approach. The optimization of coding approaches for subjective tests is presented in the last part.

H.264/AVC provides several error resilience tools. However, none of them is implemented in the JMVC reference software for the MVC extension of H.264/AVC. The implementation of slice encoding in the H.264/MVC reference software JMVC is reported in section 2, together with coding results obtained with this new encoder. The evaluation has been carried out using an error-free and an error-prone channel.

In section 3 the Mixed Resolution Coding approach is presented. Different types of sub sampling one view are compared. The optimum bit rate distribution between both views is investigated, and the quality of Mixed Resolution and Full Resolution Coding is evaluated subjectively. An Advanced Mixed Resolution Stereo Coding (AMRSC) approach is presented, in which the coding of mixed resolution sequences was improved by exploiting inter-view dependences. Moreover, the suitability of sub sampling and low pass filtering together with inter-view prediction was evaluated objectively. Improvements in coding efficiency were achieved by choosing different QP parameters for the left and the right view. Finally, the enhancement of the subjective quality by applying a simple unsharp masking algorithm was investigated.

Section 4 presents methods for the generation of test stimuli for subjective tests. Coding results for various stereo video coding approaches, codecs and codec settings are presented. The focus is on the generation of test stimuli with defined bit rates for the subjective comparison of coding approaches as well as transmission methods. Test stimuli generated for the coding test also include sequences of the simple Mixed Resolution approach. For the generation of some of the transmission test stimuli, the new encoder using error resilience tools has been utilized.

2 Software encoder using error resilience tools

The error resilience tools in H.264/AVC are data partitioning, slice interleaving, flexible macroblock (MB) ordering (FMO), SP/SI frames, reference frame selection, intra block refreshing and redundant slices [1]. SP/SI frames and reference frame selection require feedback from the decoder. Data partitioning, slice interleaving, FMO, intra block refreshing and redundant slices are therefore the candidates for use in MVC. However, none of these tools is implemented in the JMVC reference software for the MVC extension of H.264/AVC. Slice interleaving for error resilience has been integrated into JMVC. In this way, each representation can be coded with or without slices.

2.1 Slice Interleaving

Figure 2.1: Bit stream syntax of H.264/AVC using fixed-size slices (the frames #0 to #3 of the video sequence are mapped to a bit stream of NAL packets: the parameter sets followed by the slices of each frame)

An H.264/AVC bit stream is composed of network abstraction layer (NAL) units, as shown in Figure 2.1. A NAL unit can be a small packet with information about the bit stream, such as a sequence parameter set (SPS), a picture parameter set (PPS) or supplemental enhancement information (SEI). SPS and PPS are required packets, whereas SEI can be skipped. The other NAL units carry the video coding layer (VCL) data, i.e. the coded video. Each of these packets is a slice containing an integer number of macroblocks. A slice can contain all macroblocks of a frame, or as little as a single macroblock. Figure 2.2 depicts a frame encoded using several slices. Slices are independently decodable if the previous frames are available. This is achieved by using the location information in the slice header and by allowing spatial dependencies only inside the slice. For compression efficiency, a single slice per frame is better because it avoids header overhead.

Figure 2.2: Slice encoding of a frame using fixed-size slices

If a NAL unit is bigger than the Maximum Transmission Unit (MTU) of the transport medium, it is fragmented into smaller packets. In erroneous environments some of these smaller packets can be lost. The entire frame is then lost, since the decoder cannot decode parts of a NAL unit. However, if a frame is encoded into several slices such that each slice is smaller than the MTU, each packet that arrives at the decoder can be decoded correctly. Figure 2.3 shows the same error pattern applied to a stream encoded with slices and to a stream encoded without slices. With slice encoding, some of the slices can still be decoded. The performance of slice encoding depends on the burst size of the errors and on the size of the slices.

Figure 2.3: Sliced and not-sliced encoding in cases of erroneous transmission (streams shown over time, with bit errors and lost packets marked)

Slice encoding and decoding has been implemented in JMVC by modifying the encoder, the decoder and the bit stream assembler. In the current version, the encoder generates a variable number of slices per frame according to the input slice size parameter.

2.2 Modified MVC encoder

Several functions were modified in order to integrate slice encoding into the MVC encoder. First, a new function was added that checks the total number of bytes spent on the slice currently being encoded. The slice encoding loop was modified to use this function. Instead of coding all MBs of the frame in one pass, the encoder performs a check after each MB is encoded. If the current slice size exceeds the allowed slice size, the current slice is finalized and the next MB starts a new slice. Another modification concerns the loop filter functions. Previously, loop filter operations were applied to all MBs in the frame. This part was changed so that loop filter operations are applied only to the MBs inside the current slice.
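The size check in the modified slice encoding loop can be illustrated with a short sketch. This is not the JMVC code: the function name `pack_macroblocks`, the list of per-MB byte counts and the omission of slice header bytes are assumptions made for illustration. The check is performed after each MB has been coded, as described above, so a slice may exceed the target size by part of one MB.

```python
def pack_macroblocks(mb_sizes, max_slice_bytes):
    """Group macroblock indices into slices (illustrative sketch only).

    mb_sizes        -- coded size in bytes of each MB, in coding order
    max_slice_bytes -- allowed slice size; 0 disables slice mode
    """
    if max_slice_bytes == 0:
        # Slice mode disabled: the whole frame forms a single slice.
        return [list(range(len(mb_sizes)))] if mb_sizes else []

    slices = []
    current = []          # MB indices of the slice being built
    current_bytes = 0     # bytes spent on the current slice so far
    for index, size in enumerate(mb_sizes):
        current.append(index)
        current_bytes += size
        # Check after the MB is coded: finalize the slice once it
        # exceeds the allowed size, so the next MB starts a new slice.
        if current_bytes > max_slice_bytes:
            slices.append(current)
            current = []
            current_bytes = 0
    if current:
        slices.append(current)
    return slices


print(pack_macroblocks([100, 100, 100, 100, 100], 250))
```

With five MBs of 100 bytes each and an allowed size of 250 bytes, the first slice is closed after the third MB and the remaining two MBs form the second slice.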

2.3 Modified MVC bit stream assembler

The assembler merges the left and right streams before decoding. This step is necessary because the JMVC software decoder takes multi view coded streams in assembled format. The decoder uses the inter-view prediction references between the two streams, so the frames are assembled in alternating order, one from the left view stream and one from the right view stream. The JMVC bit stream assembler assumes that each frame is encoded using a single slice and that there are no losses in the stream. Instead of fixing these problems, a new application was written which assembles the left and right streams correctly in the presence of slice encoding and losses.

2.4 Modified MVC decoder

Since the JMVC reference software does not implement a slice mode, adding the slice mode to the encoder raised the question of decoder modifications. The decoder software does not require any modification for slice decoding itself. However, modifications were made for error handling in case of slice and/or frame losses. Error concealment is not a normative part of H.264/AVC. More information on H.264/AVC error concealment can be found in [1]. However, these concealment strategies are not included in the MVC software. In order to handle frame losses, another application is used to insert skip frames. This enables the decoder to decode lost frames as a copy of the previous frame in the buffer. For slice losses, modifications were made in the MVC decoder. New functions were added that check the decoded MBs in each frame and identify the missing MBs when decoding of the current picture finishes. Missing MBs are copied from the collocated MBs of the nearest decoded picture in the buffer. The loop filter functions were also modified to disable loop filtering for the missing MBs.
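The concealment of missing MBs by copying collocated MBs can be sketched as follows. This is a minimal illustration and not the decoder implementation: the function name, the representation of a picture as a list of luma sample rows and the per-MB map of decoded flags are all assumptions.

```python
MB_SIZE = 16  # macroblock width and height in luma samples


def conceal_missing_mbs(current, reference, decoded):
    """Copy collocated MBs from a reference picture (illustrative sketch).

    current   -- picture being decoded, as a list of sample rows
    reference -- nearest decoded picture in the buffer, same dimensions
    decoded   -- 2D list of booleans, True where the MB was decoded
    """
    concealed = [row[:] for row in current]  # leave the input untouched
    for mb_row, flags in enumerate(decoded):
        for mb_col, is_decoded in enumerate(flags):
            if is_decoded:
                continue
            y0 = mb_row * MB_SIZE
            x0 = mb_col * MB_SIZE
            # Replace the missing MB by the collocated block.
            for y in range(y0, y0 + MB_SIZE):
                concealed[y][x0:x0 + MB_SIZE] = reference[y][x0:x0 + MB_SIZE]
    return concealed


current = [[0] * 32 for _ in range(32)]
reference = [[7] * 32 for _ in range(32)]
decoded = [[True, False], [True, True]]   # top-right MB was lost
result = conceal_missing_mbs(current, reference, decoded)
```

In this example only the top-right 16x16 block of `result` takes its samples from `reference`; all other blocks keep the decoded samples.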

2.5 Evaluation of MVC encoder using slice mode

2.5.1 Test Setup

The evaluation of the MVC encoder was carried out using the four sequences of the test set for transmission studies (see the section on test sequences for transmission studies). To generate the test sequences, the slice argument determining the slice size was varied as well as the QP. The codec parameters are given in Table 2.1. A slice size of zero means that slice interleaving was disabled.

Table 2.1 Codec Settings
Profile             Baseline
GOP Size            1 (IPPP)
Symbol Mode         CAVLC
Search Range        48
Intra Period        16
QP                  24, 28, 32, 36, 40
Slice Size (byte)   250, 500, 750, 1000, 1250

The performance of the encoder has been evaluated for an error-free and an error-prone channel. To generate distorted sequences, the channel model shown in Table 2.2 has been used. For details please refer to [2].

Table 2.2 Channel and transmission parameters
Channel Model             (COST207) Typical Urban 6 taps, fdmax = 24 Hz
Modulation                16QAM
Convolutional Code Rate   2/3
Guard Interval            1/4
FFT Mode                  8K
SNR                       17 dB

2.5.2 Coding Results for error free channel

The rate-distortion characteristics of the error resilient encoder for an error-free channel are depicted in Figure 2.4. The performance of the encoder decreases with decreasing slice size. This can be explained by the increasing overhead introduced by the higher number of data packets. Nevertheless, for a slice size of e.g. byte, performance losses are below 0.5 dB and can be neglected.

Figure 2.4 PSNR vs. bit rate, for different slice sizes and sequences; error-free channel

2.5.3 Coding Results for error prone channel

Results for the error-prone channel are depicted in Figure 2.5. Each of the rate-distortion points shown is an average over the rates and distortions of five sequences distorted with different error patterns. Note that averaging is required for the evaluation, since coding with different QPs and slice sizes results in bit streams of different lengths that are distorted at different positions. For a single sequence, quality can decrease when slice mode is enabled. The reason is that losses that occurred in uncritical parts of the bit stream generated without slice mode can be shifted to critical parts of the bit stream when it is coded with slice mode enabled. On average, however, slice interleaving should lead to an increased quality. To show this, the evaluation has to be carried out statistically. Here only five different error patterns have been used per bit stream. Hence the influence of outliers can still be seen in Figure 2.5 (for example for RollerBlade1 at a slice size of 250). Nevertheless, the converging tendency can already be observed: a smaller slice size results in a higher performance. For high rates the gain obtained by using the slice mode increases. The reason is that the length of critical parts in the bit stream increases, so losses of important data packets are more likely. This effect can be diminished effectively by slice partitioning.

Figure 2.5 PSNR vs. bit rate, for different slice sizes and sequences; error-prone channel; average over sequences distorted by 5 different error patterns
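The averaging behind each rate-distortion point can be sketched as follows. The function name and the input layout are hypothetical, and the sketch assumes that the PSNR values are averaged directly in dB; the text does not state whether the distortions were averaged as PSNR or as MSE.

```python
from statistics import mean


def average_rd_point(runs):
    """Average bit rate and PSNR over several error patterns.

    runs -- list of (bit_rate_kbps, psnr_db) tuples, one entry per
            error pattern applied to the same coded bit stream
    """
    rates, psnrs = zip(*runs)
    # Assumption: PSNR values are averaged directly in the dB domain.
    return mean(rates), mean(psnrs)


# Hypothetical values for three error patterns.
point = average_rd_point([(400, 30.0), (410, 32.0), (390, 28.0)])
```

With only a few error patterns per bit stream, a single outlier in `runs` shifts the averaged point noticeably, which matches the outliers mentioned for Figure 2.5.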

2.6 Conclusion

The prototype of the software encoder using error resilience has been presented. Error resilience is achieved by a new slice mode. Frames are stored in smaller data packets that can still be decoded independently in case of losses. Coding tests show that the additional bit rate needed for error resilience can be neglected and that video quality decreases only slightly for error-free channels. For the error-prone channel it has been demonstrated that the new slice mode provides sufficient error resilience and leads to a high gain in video quality. Beyond the objective examination of the slice encoder carried out here, a subjective evaluation of the encoder using slice mode will be carried out in a large scale subjective test and reported in the upcoming deliverable D4.3 "Results of quality attributes of coding, transmission and their combinations". Further error protection applied in the lower layers will be reported in the upcoming deliverable D3.4 "Stereo DVB-H broadcasting system with error resilient tools".

3 Mixed Resolution Coding

In this section the Mixed Resolution Coding approach is presented. Different types of sub sampling one view are compared. The optimum bit rate distribution between both views is investigated, and the quality of Mixed Resolution and Full Resolution coding is evaluated subjectively. The Advanced Mixed Resolution Stereo Coding (AMRSC) approach is presented, in which the coding of mixed resolution sequences was improved by exploiting inter-view dependences. Moreover, the suitability of sub sampling and low pass filtering together with inter-view prediction was evaluated objectively. Improvements in coding efficiency were achieved by choosing different QP parameters for the left and the right view. Finally, the enhancement of the subjective quality by applying a simple unsharp masking algorithm was investigated.

3.1 Optimization of Mixed Resolution Stereo Coding (MRSC)

If the sharpness of the two views of a stereoscopic signal differs, the perceived quality is close to that of the sharper view (binocular suppression theory) [3], [4]. In the presence of different amounts of blocking artifacts, however, the binocular quality is rated as the average of both views. This means that it should be possible to transmit a stereoscopic video with one reduced resolution view (Mixed Resolution representation) at a lower bit rate and still reach the same quality as with the Full Resolution representation.

3.1.1 Different sampling methods of MRSC

Different ways of reducing the sharpness of one view were investigated. Sub sampling one view in both directions and sub sampling in only one direction were compared. Another method of reducing the sharpness is low pass filtering and coding at the base resolution. These methods have been compared to Full Resolution coding in an informal subjective test with the sequences Mountain, Diving, Performance and Soccer 2 (Figure 3.1).

Figure 3.1: Test sequences for subjective comparison of Mixed Resolution and Full Resolution coding (left view): (a) Mountain (b) Diving (c) Performance (d) Soccer 2

The sequences have a resolution of 320x240 pixels and a frame rate of 30 frames per second. Mountain, Diving and Performance have a length of 240 frames, Soccer2 a length of 450 frames. For down sampling and up sampling, the filters used in the JSVM software [5] were applied row wise and column wise. These are the non-normative dyadic filter for down sampling (equation 1) and the normative dyadic filter for up sampling (equation 2).

(1)
(2)

The test was carried out on a 3.5" display with barrier technology. It has a total resolution of 640x480 pixels and a resolution of 320x480 pixels per view in 3D mode. The sequences were displayed with the stereoscopic player [6], which performs a vertical up sampling by a factor of two. Four different coding types were compared (Table 3.1).

Table 3.1: Tested types of sub sampling of coded stereoscopic sequences
Type   | Left view                                                                               | Right view
Type 1 | Coding of full resolution view                                                          | Coding of full resolution view
Type 2 | Horizontal sub sampling by a factor of 2 and coding                                     | Coding of full resolution view
Type 3 | Horizontal and vertical sub sampling by a factor of 2 and coding                        | Coding of full resolution view
Type 4 | Low pass filtering in horizontal and vertical direction and coding of full resolution view | Coding of full resolution view

The total bit rate was the same for all types, and bit rate distributions of 1:2, 1:4 and 1:8 for left view vs. right view were chosen. For coding, H.264/AVC simulcast with the reference software JM 14.2 was used. In an informal test, 5 video coding experts rated the coding types subjectively from best to worst in the following order:

1. Type 3
2. Type 2
3. Type 4
4. Type 1

This indicates that it is possible to transmit a Mixed Resolution sequence at a higher subjective quality with the same bit rate. To verify these results, further tests have been carried out to show the difference between the best Mixed Resolution representation (Type 3) and Full Resolution. For these further subjective tests, the best bit rate distribution had to be found. This was done with objective criteria, described in the next section.
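The row-wise and column-wise down sampling by a factor of two described above can be sketched as follows. The filter coefficients used here are an assumption: a simple 3-tap binomial low pass (1, 2, 1)/4 stands in for the JSVM down sampling filter, whose coefficients are not reproduced in this text. The function names and the border handling by sample replication are also illustrative choices.

```python
def lowpass_decimate_1d(samples, taps=(1, 2, 1)):
    """Low pass filter a line of samples and keep every second one.

    taps is a placeholder filter, NOT the JSVM down sampling filter.
    """
    norm = sum(taps)
    half = len(taps) // 2
    last = len(samples) - 1
    out = []
    for center in range(0, len(samples), 2):
        acc = 0
        for k, tap in enumerate(taps):
            pos = min(max(center + k - half, 0), last)  # replicate borders
            acc += tap * samples[pos]
        out.append((acc + norm // 2) // norm)  # rounded integer division
    return out


def downsample_2d(image):
    """Apply the 1D filter row wise, then column wise."""
    rows = [lowpass_decimate_1d(row) for row in image]
    cols = [lowpass_decimate_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


view = [[100] * 6 for _ in range(4)]   # 6x4 test picture, constant grey
small = downsample_2d(view)            # 3x2 picture
```

Applying only the row-wise pass corresponds to horizontal sub sampling (Type 2), while `downsample_2d` corresponds to sub sampling in both directions (Type 3).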

3.1.2 Objective criteria for bit rate allocation

To optimize the bit rate distribution between the left and the right view, the PSNR measure was used. Following the theory that, in the presence of blocking artefacts, the binocular quality of a stereo sequence is the average of the quality of both views [4], a total PSNR was calculated that considers all pixels of both views. To realize this with existing tools, the mean squared error (MSE) was first calculated separately for both views. The total PSNR was then calculated from the mean squared errors of the left and the right view using the following equations, where 255 is the peak value of 8-bit samples:

PSNR_t = 10 \log_{10} \frac{255^2}{MSE_t}    (3)

MSE_t = \frac{MSE_1 + MSE_2}{2}    (4)

In the case of Mixed Resolution Coding, the mean squared error was calculated after up-sampling the lower resolution view, with respect to the low pass filtered and up-sampled original view.

Figure 3.2 shows the rate-distortion curves of the left and the right view for the sequence Mountain, both for Mixed Resolution, in which the left view was coded at half horizontal and half vertical resolution, and for Full Resolution without any low pass filtering. A total bit rate of 400 kbit/s was used. The bit rate distribution was varied over the entire range from 100% for the left view to 100% for the right view. Both curves were then interpolated with a cubic spline to calculate the total PSNR and to match the exact total bit rate of 400 kbit/s.

Figure 3.2: PSNR for left view, right view and total PSNR with (a) Mixed Resolution and (b) Full Resolution

It can be seen that the total PSNR curve for Mixed Resolution has its maximum at 30% for the left (subsampled) view, and the curve for Full Resolution at 45% for the left (full) view. Moreover, the total PSNR reaches higher values for Mixed Resolution than for Full Resolution. This is because the PSNR for the left view was calculated with respect to the low pass filtered, up-sampled uncoded view. Hence the total PSNR curves do not take blur into account. However, they follow the binocular quality if no difference is visible between Mixed Resolution (with one low pass filtered, down-sampled and up-sampled view) and Full Resolution. It was shown in D2.4 that the difference between Mixed Resolution and Full Resolution is minimized with increasing base resolution, increasing viewing distance and decreasing display size.

Figure 3.3 shows the total PSNR curves for Mixed Resolution and Full Resolution for different sequences with the following total bit rates: Mountain (425 kbit/s), Diving (236 kbit/s), Performance (1280 kbit/s) and Soccer 2 (350 kbit/s).

Figure 3.3: Total PSNR for Mixed Resolution and Full Resolution for the sequences (a) Mountain, (b) Diving, (c) Performance and (d) Soccer 2

The maximum of the total PSNR curve for Full Resolution lies around 50% for the left view. For Mixed Resolution the total PSNR reaches its maximum at 30% (Mountain and Performance), 35% (Soccer 2) and 45% (Diving). This shows that the optimum bit rate distribution between both views is sequence-dependent.
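The total PSNR described above, computed from the mean of the two per-view mean squared errors, can be sketched as follows. The function names are hypothetical, a peak value of 255 for 8-bit video is assumed, and the search for the best distribution simply picks the best sampled point instead of the cubic spline interpolation used in the report.

```python
import math


def total_psnr(mse_left, mse_right, peak=255.0):
    """Total PSNR of a stereo pair from the per-view MSE values."""
    # Both views have the same pixel count after up-sampling, so the
    # MSE over all pixels is the mean of the two per-view MSE values.
    mse_total = (mse_left + mse_right) / 2.0
    return 10.0 * math.log10(peak ** 2 / mse_total)


def best_distribution(measurements):
    """Return the left-view bit rate share with the highest total PSNR.

    measurements -- dict mapping the left-view share in percent to a
                    (mse_left, mse_right) tuple measured at that share
    """
    return max(measurements, key=lambda share: total_psnr(*measurements[share]))


# Hypothetical MSE values at three bit rate distributions.
example = {20: (90.0, 40.0), 30: (60.0, 50.0), 50: (45.0, 80.0)}
print(best_distribution(example))
print(round(total_psnr(65.025, 65.025), 6))
```

Because the MSE values are averaged before the logarithm is taken, the view with the larger error dominates the total PSNR, which differs from averaging the two per-view PSNR values.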

Figure 3.4 shows the total PSNR curves and their maxima for the sequence Mountain at different total bit rates. It can be seen that the optimum distribution depends on the total bit rate.

Figure 3.4: Total PSNR and corresponding maxima for the sequence Mountain at total bit rates of 200 kbit/s, 400 kbit/s, 600 kbit/s, 800 kbit/s and 1000 kbit/s (from bottom to top)

3.1.3 Subjective evaluation

Small scale subjective tests were carried out to compare Mixed Resolution coding with Full Resolution coding. The test setup was the same as described for the subjective tests with uncoded sequences in D2.4 [5]. The 3.5" and the 32" stereoscopic displays were used. The sequences Mountain, Diving, Performance and Soccer2 were shown to 13 expert viewers in an A-B preference vote in the following order: AABBAABB. It was chosen randomly whether A was the Mixed Resolution sequence and B the Full Resolution sequence or vice versa. The test persons were then asked to rate whether A or B had the better overall quality. The tested bit rates are shown in Table 3.2.

Table 3.2: Bit rate distribution to left and right view for subjective tests with Mixed Resolution and Full Resolution
Columns (sequence): Mountain, Mountain, Diving, Performance, Soccer2, Soccer2
Rows: Total bitrate [kbit/s]; Bit rate left view / total bit rate: Mixed Resolution [%]; Bit rate left view / total bit rate: Full Resolution [%]

Table 3.3: Result of subjective tests with coded sequences
Setups: Setup I (3.5" display), h/d = 1/10; Setup II (32" display), h/d = 1/5
Columns (per setup): Mountain 320 kbit/s, Mountain 425 kbit/s, Diving 236 kbit/s, Performance 1280 kbit/s, Soccer2 260 kbit/s, Soccer2 350 kbit/s, total
Rows: Mixed Resolution better; No difference; Full Resolution better

The results of these tests are shown in Table 3.3. For both displays, Mixed Resolution has a slightly better binocular quality than Full Resolution. This means that for these relatively small bit rates the stronger blocking artifacts of Full Resolution are more annoying than the slightly less sharp images of Mixed Resolution. Nevertheless, the performance of the Mixed Resolution approach also depends on the display. In the large scale subjective evaluation of the coding approaches, the Mixed Resolution approach does not outperform the simulcast approach (see the upcoming Deliverable 4.3 "Results of quality attributes of coding, transmission and their combinations"). This can be related to the advanced NEC display that is used in the large scale study. The NEC autostereoscopic 3.5" display with a resolution of 428 x 240 is based on lenticular sheet technology and provides a much better video quality than the display based on parallax barrier technology. The sharpness difference introduced by mixed resolution seems to be more visible on the NEC display. However, the evaluation carried out here shows the potential of the mixed resolution approach. Therefore an advanced mixed resolution approach has been investigated and is presented in the next section.

3.2 Advanced Mixed Resolution Coding (AMRSC)

The coding tests in section 3.1 were carried out without using any prediction between the two views. Coding a stereoscopic video with inter-view prediction can result in bit rate savings while maintaining the same quality. It was investigated whether inter-view prediction and Mixed Resolution Coding can be combined to obtain better coding results than applying only one of the two techniques. The unsharp masking algorithm presented in the section on view enhancement is a method for enhancing the subjective quality of Mixed Resolution sequences. It is a simple algorithm that can be applied on a mobile device at low computational cost.

A further enhancement of the Mixed Resolution approach can be obtained by using optimized down sampling algorithms. An investigation of these methods is not part of this deliverable but is presented in the upcoming Deliverable 5.4 ("Advanced algorithms for stereo-video preprocessing").

3.2.1 Inter-view Prediction

It was reported in D2.2 that using H.264/MVC (Multiview Video Coding) with inter-view prediction results in a significantly better rate-distortion performance than simulcast coding. On the other hand, coding a low pass filtered or sub sampled video requires less bit rate than coding the original video and maintains a high binocular quality due to binocular suppression. It was investigated how Mixed Resolution Coding can be improved by exploiting the inter-view dependences of both views.

3.2.1.1 Low Pass Filtering and Sub sampling

This section describes the coding experiments with low pass filtered and sub sampled views, with and without inter-view prediction. The right view was coded at the base resolution. The left view was coded with four different methods:

Method I: low pass filtering and coding at the base resolution (Left LP).
Method II: low pass filtering and sub-sampling by a factor of two in both directions and coding at the reduced resolution (Left DS).
Method III: the decoded right view was used for inter-view prediction of the low pass filtered left view (Left LP IV).

Method IV: the decoded right view was low pass filtered and sub-sampled by a factor of two and used for inter-view prediction of the low pass filtered and sub-sampled left view (Left DS IV) [7].

H.264/MVC was used for all methods. The tested sequences are Hands (251 frames), Snail (189 frames), Horse (140 frames) and Car (235 frames) with a base resolution of 480x272 pixels.

Figure 3.5 Sequences for coding test with Mixed Resolution and inter-view prediction: (a) Hands, (b) Snail, (c) Horse and (d) Car

The encoder settings are shown in Table 3.4. The QP was varied from 20 to 44. For the methods using inter-view prediction, the QPs of the left and the right view were the same.

Table 3.4 Encoder setting for Mixed Resolution Coding with inter-view prediction
Encoder Implementation    JMVM 7.0
Quantization Parameter    20, 22, 24, 26, ..., 44
GOP Size                  2
Intra Period              16
Symbol Mode               CAVLC

Figure 3.6: Coding results for Mixed Resolution with inter-view prediction, LP = low-pass filtered, DS = down-sampled, IV = inter-view prediction: (a) Hands, (b) Snail

Figure 3.7: Coding results for Mixed Resolution with inter-view prediction, LP = low-pass filtered, DS = down-sampled, IV = inter-view prediction: (a) Horse and (b) Car

The rate-distortion curves are shown in Figure 3.6 and Figure 3.7. The right (original) view has the worst performance because it was coded at the base resolution with full detail. The coding of the low-pass filtered left view (Left LP) shows a better performance because the PSNR was calculated with respect to the low-pass filtered uncoded view. The PSNR of the decoded low-pass filtered and sub-sampled view (Left DS) was likewise calculated with respect to the low-pass filtered and sub-sampled uncoded view. For all sequences, coding at a lower resolution leads to a better rate-distortion performance than low-pass filtering alone. With inter-view prediction, the low-pass filtered version (Left LP IV) achieves some gain at low bit rates; at high bit rates almost no improvement is visible. Inter-view prediction for the low-pass filtered and down-sampled version (Left DS IV) achieves a gain for all sequences at both low and high bit rates. Bit rate savings of up to 70% compared to coding a down-sampled view without inter-view prediction are possible for some sequences (Horse, Car).

Bit rate allocation

In the tests described in the preceding section, the best result was achieved by coding a sub-sampled version of the left view with inter-view prediction from the right view. The QP values were the same for left and right view. This combination of QP values is not necessarily the best in terms of the overall binocular quality. It was therefore tested how the rate-distortion performance of the left view changes when the QP value of the base (right) view is varied. The coder settings of Table 3.4 were used for this test, while the QP value of the right view was varied from 20 to 44 with a step size of 1. The decoded right view was low-pass filtered and sub-sampled, and used for inter-view prediction of the left view. The left view was coded with QP values from 20 to 44 with a step size of 4, in combination with every QP value of the right view.

Figure 3.8 shows that, relative to the case of equal QPs for left and right view, the left-view PSNR changes when the QP value of the right view is varied. If the quality of the base view is increased, the left view reaches higher PSNR values at lower left-view-only bit rates. When the quality of the right view is decreased, the left view requires a higher bit rate and has a lower PSNR value. For the sequence Snail there are some exceptions to this behavior, but only for relatively low bit rates of the right view. Note that this inverse PSNR behavior only occurs because the left-view PSNR is plotted against the left-view-only bit rate. For the total PSNR versus the total bit rate, the expected behavior occurs, as shown in the following figures.

Figure 3.8: Left view bit rate vs. left view PSNR; the down-sampled left view was coded with inter-view prediction from the right view; the violet curve shows results for the same QP values for left and right view; the black curves show the variation of rate and distortion with varying QP of the right view: (a) Hands, (b) Snail, (c) Horse and (d) Car

To find out which gain is achievable with different QP values compared to the same QP values for left and right view, both views have to be evaluated jointly. This was done by averaging the mean squared errors of both views. The mean squared error of the left view was calculated with respect to the low-pass filtered uncoded left view. Figure 3.9 and Figure 3.10 show the total PSNR versus the total bit rate of both views. The displayed numbers are the QP values of the left (down-sampled) and the right view with the highest total PSNR for particular bit rates.
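The joint evaluation described above, a total PSNR derived from the average of the two per-view mean squared errors, can be written down in a few lines. This is a minimal sketch: the peak value of 255 assumes 8-bit luma samples, which the text does not state explicitly, and the function names are illustrative.

```python
import math


def psnr_from_mse(mse: float, peak: float = 255.0) -> float:
    """PSNR in dB for a given mean squared error (peak=255 assumes 8-bit samples)."""
    if mse <= 0.0:
        raise ValueError("MSE must be positive; identical images have unbounded PSNR")
    return 10.0 * math.log10(peak * peak / mse)


def total_psnr(mse_left: float, mse_right: float, peak: float = 255.0) -> float:
    """Total PSNR of a stereo pair: average the per-view MSEs first, then convert."""
    return psnr_from_mse((mse_left + mse_right) / 2.0, peak)


if __name__ == "__main__":
    print(round(total_psnr(10.0, 100.0), 2))
```

Because the MSEs are averaged before the logarithm is taken, the total PSNR is dominated by the worse view and is lower than the arithmetic mean of the two per-view PSNR values whenever the views differ in quality.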

Figure 3.9: Total PSNR versus total bit rate for Mixed Resolution Coding with inter-view prediction and the QP combinations for left (down-sampled) and right view with the highest PSNR for particular bit rates: (a) Hands and (b) Snail

Figure 3.10: Total PSNR versus total bit rate for Mixed Resolution Coding with inter-view prediction and the QP combinations for left (down-sampled) and right view with the highest PSNR for particular bit rates: (a) Horse and (b) Car

With this optimization, the QP of the right (base) view is always higher than the QP of the left (down-sampled) view. The bit rate distribution between left and right view is shown in Table 3.5. For all tested QPs the bit rate of the right view is higher than the bit rate of the left view. For the sequence Hands the bit rate difference between left and right view is lower than for the other sequences. The reason is that the sequence Hands has fewer inter-view dependencies than, for example, the sequence Horse.

Figure 3.11 and Figure 3.12 compare the optimized QP values for inter-view prediction with the results of inter-view prediction using the same QPs for left and right view. For the sequences Car and Horse the rate-distortion curves are nearly identical for both methods. Different QP values yield a small gain for the sequence Snail and a significant gain for the sequence Hands. For Hands, inter-view prediction with the same QPs achieves less gain over simulcast than for the other sequences, which is why the optimization of the QP values results in high gains for this sequence.

Table 3.5: Coding results for left (down-sampled) and right view with inter-view prediction
Sequences: Hands, Snail, Horse, Car
Columns: Total bit rate [kbit/s], Total PSNR [dB], QP left (DS) view, QP right view, Bit rate left view [kbit/s], Bit rate right view [kbit/s]

Figure 3.11: Total bit rate versus total PSNR for Mixed Resolution Coding without inter-view prediction, with inter-view prediction with same QPs and with different QPs for left and right view: (a) Hands and (b) Snail

Figure 3.12: Total bit rate versus total PSNR for Mixed Resolution Coding without inter-view prediction, with inter-view prediction with same QPs and with different QPs for left and right view: (a) Horse and (b) Car

3.2.2 View Enhancement using unsharp masking

The suitability of unsharp masking filters for enhancing the binocular quality has been evaluated. The filter increases the subjective sharpness of the sub-sampled view. Hence the approach has the potential to achieve a sharper overall image as well as to reduce the sharpness difference between both views. An advantage of this approach is that the bit rate for transmission does not increase.

The resolution of the right (base) view was 480x272 pixels and the resolution of the left (sub-sampled) view was 240x136 pixels. After decoding and up-sampling, an unsharp masking algorithm was applied to the left view. This algorithm applies the convolution matrix (5) to the up-sampled image. As a result, the low-frequency components of the image are reduced while the high-frequency components are enhanced. Thus, the algorithm produces a subjectively sharper sequence. The parameter α adjusts the strength of the unsharp masking. Here, α values of 2 and 4 and QP values of 25, 28, 31, 34 and 37 were tested.

The sequences were shown on the 3.5" and 32" displays (see section 3.1.3) in an informal expert viewing test. It was observed that applying the method did not improve the binocular quality of the Mixed Resolution sequences in all cases. For medium and low bit rates even a slightly worse quality was observed. As expected, the overall subjective sharpness increased, but the coding artifacts were also amplified, so that the binocular quality decreased. Nevertheless, for high bit rates, which support a quality appropriate for a real-life scenario, the positive effect dominates and the algorithm improves the quality due to the increased overall sharpness (Figure 3.13).
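The general principle of unsharp masking can be illustrated with a short sketch. Note that the convolution matrix (5) of the report is not reproduced in this text, so the code below uses the textbook form (image plus a weighted difference between the image and a blurred copy) with a 3x3 box blur. The parameter `amount` is a hypothetical stand-in and is not claimed to correspond to the report's α.

```python
import numpy as np


def box_blur3(img: np.ndarray) -> np.ndarray:
    """3x3 box blur with edge replication (assumed blur, not the report's matrix)."""
    h, w = img.shape
    padded = np.pad(img.astype(np.float64), 1, mode="edge")
    acc = np.zeros((h, w), dtype=np.float64)
    # Sum the nine shifted copies of the image, then normalise.
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0


def unsharp_mask(img: np.ndarray, amount: float) -> np.ndarray:
    """Add `amount` times the high-frequency residual back to an 8-bit image."""
    img_f = img.astype(np.float64)
    detail = img_f - box_blur3(img_f)      # high-frequency part
    sharpened = img_f + amount * detail    # boost edges and fine texture
    return np.clip(np.rint(sharpened), 0, 255).astype(np.uint8)


if __name__ == "__main__":
    step = np.hstack([np.full((4, 4), 100), np.full((4, 4), 150)]).astype(np.uint8)
    print(unsharp_mask(step, 1.0)[0])
```

Flat regions pass through unchanged, while any local deviation, including blocking and ringing from coarse quantization, is amplified along with genuine edges. This is consistent with the observation above that coding artifacts become more visible at low and medium bit rates.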

Figure 3.13: Part of sequence Horse for QP=25 (a), (c), (e) and QP=37 (b), (d), (f); (a) and (b): sub-sampled decoded left view; (c) and (d): sub-sampled decoded left view with unsharp masking; (e) and (f): decoded left view

3.3 Conclusion

Expert evaluation shows that the subjective quality of coded Mixed Resolution sequences is better than that of simulcast coded sequences, due to a reduced number of coding artifacts. The optimized bit rate distribution between left and right view for Mixed Resolution Coding without inter-view prediction was approximately 30 to 35% for the low-resolution view. Nevertheless, in the large-scale subjective evaluation of the coding approaches the Mixed Resolution approach does not outperform the simulcast approach (see the upcoming deliverable D4.3, "Results of quality attributes of coding, transmission and their combinations"). This might be related to the advanced display used in the large-scale study and to the different test methodologies. However, the evaluation carried out in this scope shows the potential of the Mixed Resolution approach.

Therefore the Advanced Mixed Resolution Stereo Coding (AMRSC) approach has been investigated. The three main features of the AMRSC approach are optimized down-sampling, inter-view prediction and view enhancement using unsharp masking at the receiver side. Optimized down-sampling is reported in D5.4 ("Advanced algorithms for stereo-video preprocessing") and leads to PSNR gains of up to 1 dB for the down-sampled view. Inter-view prediction significantly improves the rate-distortion performance. The optimized QP combinations of both views show that the base view should be coded with a higher QP than the predicted low-resolution view. The observed QP difference is between 2 and 8 for the tested sequences and settings. Unsharp masking of the low-resolution view can enhance the overall quality, but only for high bit rates. It does not help at low or medium bit rates, because coding artifacts are also amplified by the sharpening.

Further potential for the Mixed Resolution approach lies in more advanced content-adaptive sharpening and up-sampling algorithms. Approaches that use information from the full view for the reconstruction of the down-sampled view are also conceivable. A higher performance might also result from optimizing the AMRSC approach with a new 3D video quality metric that captures the implications of binocular suppression theory better than the PSNR used in this scope.

4 Optimization of coding approaches for subjective tests

In this section, methods for the generation of test stimuli for subjective tests are described. Coding results for various stereo video coding approaches, codecs and codec settings are presented. The focus of this section is on the generation of test stimuli with defined bit rates for subjective comparison (in contrast to the objective coding comparisons in the previous Deliverable D2.2 [8]).

The coding approaches to be tested have been chosen based on the results of D2.2. From the two possible methods for Video plus Depth coding (MPEG-C Part 3 using H.264/AVC and the H.264 auxiliary picture syntax), MPEG-C Part 3 has been selected because it allows an independent bit rate allocation for video and depth. Coding approaches using inter-view prediction are the H.264/AVC Stereo SEI message and H.264/MVC. Of those, H.264/MVC has been chosen in line with the 3D video community, not for performance reasons, but for its backward compatibility: a bit stream for 2D presentation can be extracted. Beyond the methods examined in D2.2, the simple Mixed Resolution approach was optimized to generate test stimuli for the evaluation of coding methods. Furthermore, the new prototype of the software encoder using slice mode has been utilized for the coding carried out for the evaluation of transmission approaches.

Another difference to D2.2 is the choice of test sequences. The coding test set from D2.1 [9] and a transmission test set have been defined to match the user needs examined in D4.1 [10]. For the coding tests short sequences (~10 s) are used. Longer sequences with audio (~60 s) have been coded for the transmission tests. Further adjustments concern the spatial and temporal resolution: the video format was adapted to match the resolution of the new NEC display and the frame rate was set to 12.5 or 15 fps.

4.1 Test sequences for subjective evaluation of coding approaches

The subjective evaluation of coding approaches aims at finding the optimal approach for coding stereo video content. Therefore a great variety of coded sequences has been generated. The next sections describe the test setup as well as the coding results.

Test setup

For the large-scale evaluation of the four coding approaches Simulcast, Multi View (MVC), Mixed Resolution (MRSC) and Video plus Depth (VD) coding using MPEG-C Part 3, a set of test stimuli has been generated. The coding approaches have been optimized at rate points with a low and a high video quality. Furthermore, a baseline and a high codec profile have been used. The six sequences from the coding test set of the stereo video database [9] have been used. This leads to a total number of 4 (approaches) x 2 (qualities) x 2 (profiles) x 6 (sequences) = 96 test stimuli.

Coding Approaches

H.264/AVC Simulcast

The left and right views are coded as independent streams using H.264/MPEG-4 AVC. Since this method needs no pre-processing before coding and no post-processing after decoding, the complexity on the sender and receiver side is low. Redundancy between the channels is not exploited. Optimization is carried out by jointly varying the quantization parameter (QP) for left and right view.

H.264/AVC Multi View Coding (MVC)

H.264/AVC MVC allows inter-view prediction. The left view is used as reference for the right view. Prediction has been enabled for anchor as well as for non-anchor frames. No pre- or post-processing is required on the sender or receiver side. Optimization is carried out by jointly varying the QP for left and right view.

H.264/AVC Mixed Resolution coding (MRSC)

Binocular suppression theory states that perceived image quality is dominated by the view with the higher spatial resolution [4]. The mixed resolution approach utilizes this attribute of human perception by decimating one view before transmission and up-scaling it at the receiver side. This enables a trade-off between spatial sub-sampling and amplitude quantization. Nevertheless, re-sampling requires pre- as well as post-processing. For the experiments in this scope the right view was decimated by a factor of two in horizontal and vertical direction. The simple MRSC approach without inter-view prediction, optimized down-sampling and unsharp masking has been used. Optimization is carried out by independently varying the QP for left and right view.

MPEG-C Part 3 using H.264/AVC (V+D Coding)

MPEG-C Part 3 defines a video plus depth representation of the stereo video content. Depth was estimated from an original left and right view by the HHI Hybrid Recursive Matching (HRM) algorithm. One view and the associated depth signal are coded. At the receiver the second view is synthesized by depth image based rendering [11]. Compared to video, a depth signal can in most cases be coded at a fraction of the color bit rate at sufficient quality for view synthesis. Nevertheless, errors in depth estimation and the interpolation at disocclusions introduce artifacts into the rendered view. Optimization is carried out by independently varying the QP for video and depth.

Test set

The test set for coding defined in the stereo video database [9] was used to generate the test stimuli. The sequences are shown in Figure 4.1. Details are presented in Table 4.1. All sequences have a frame rate of 15 frames per second.

Table 4.1 Properties of sequences from the coding test set
Sequence | Genre | Camera movement | Object movement | Structural complexity | Depth complexity | Length in sec. | Size in pixels
Horse | Nature | none | low | high | medium | | x240
Bullinger | News | none | low | low | low | | x240
Car | Action | high | low | medium | high | | x240
Mountain | Documentary | medium | low | medium | low | | x240
Butterfly | Animation | none | high | high | medium | | x240
Soccer2 | Sports | high | high | medium | high | | x240
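The stimulus count given in the test setup (4 approaches x 2 qualities x 2 profiles x 6 sequences) can be checked by enumerating the combinations. The labels below are taken from the text; the tuple layout and the function name `enumerate_stimuli` are only an illustrative choice.

```python
from itertools import product

APPROACHES = ["Simulcast", "MVC", "MRSC", "V+D"]
QUALITIES = ["low", "high"]
PROFILES = ["baseline", "high"]
SEQUENCES = ["Horse", "Bullinger", "Car", "Mountain", "Butterfly", "Soccer2"]


def enumerate_stimuli():
    """Return one (approach, quality, profile, sequence) tuple per test stimulus."""
    return list(product(APPROACHES, QUALITIES, PROFILES, SEQUENCES))


if __name__ == "__main__":
    stimuli = enumerate_stimuli()
    print(len(stimuli))
    print(stimuli[0])
```

Each tuple identifies one coded sequence that has to be produced at a matched target bit rate, which is why the target rates are determined per sequence rather than globally.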

Figure 4.1 Sequences of the coding test set: Horse, Bullinger, Car, Mountain, Butterfly, Soccer2

Codec Profiles

Coding has been carried out using two codec profiles. The simple baseline profile uses an IPPP structure and CAVLC. The complex high profile enables hierarchical B-frames and CABAC. For the Simulcast, Mixed Resolution and V+D approaches the AVC reference software JM 14.2 has been used. The MVC stimuli have been coded using the MVC reference software JMVC. Table 4.2 shows the codec settings in detail.

Table 4.2 Codec Settings and Profiles
Profile | Baseline | High
GOP Size | 1 (IPPP) | 8 (Hierarchical B frames)
Symbol Mode | CAVLC | CABAC
Search Range | |
Intra Period | |

High and low quality

The coding approaches have been evaluated at a high and a low quality. Note that it is not useful to define one constant high and one constant low bit rate for all sequences, because the compressibility varies between sequences: a rate sufficient for a high quality for one sequence might produce a low quality for another. To guarantee a comparable low and a comparable high quality for all sequences, a low and a high rate point had to be determined for each sequence individually. The following approach was used to obtain these rate points: the quantization parameter (QP) of the codec for simulcast coding was set to 30 for the high quality and to 37 for the low quality. This results in a low and a high bit rate for each sequence of the coding test set. The resulting bit rates are shown in Table 4.3 and have been used as target rates for the other three approaches together with the baseline profile.

Table 4.3 Target bit rates in kbit/s for high and low quality
Columns: Profile, Quality, Bullinger, Butterfly, Car, Horse, Mountain, Soccer2
Rows: Baseline (Low, High), High (Low, High)

Bit rates for the high profile are also shown in Table 4.3. They are the rates of the sequences coded with the high profile and simulcast that have the same PSNR as the sequences coded with the baseline profile and simulcast at QP 37 and QP 30. This guarantees a comparable objective quality for the baseline and high-profile sequences using simulcast. Hence it can be subjectively evaluated whether the different GOP structures of the two profiles have an influence on the subjective quality that is not reflected by the PSNR.
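Finding the high-profile rate that has the same PSNR as a baseline rate point amounts to reading a rate off the high-profile RD curve at a given PSNR. The sketch below does this by linear interpolation between measured points. This is an assumption for illustration: the report does not say whether interpolation or the nearest coded sequence was used, and the sample values in the usage example are hypothetical, not measured data.

```python
def rate_at_psnr(rd_points, target_psnr):
    """Bit rate at which an RD curve reaches target_psnr.

    rd_points: iterable of (bit_rate_kbps, psnr_db) pairs from one codec setting.
    Uses linear interpolation between the two neighbouring measured points.
    """
    pts = sorted(rd_points, key=lambda p: p[1])  # order by PSNR
    for (r0, q0), (r1, q1) in zip(pts, pts[1:]):
        if q0 <= target_psnr <= q1:
            if q1 == q0:
                return min(r0, r1)
            t = (target_psnr - q0) / (q1 - q0)
            return r0 + t * (r1 - r0)
    raise ValueError("target PSNR lies outside the measured RD curve")


if __name__ == "__main__":
    # Hypothetical high-profile RD points (kbit/s, dB), for illustration only.
    high_profile_curve = [(50.0, 34.0), (100.0, 38.0), (200.0, 42.0)]
    print(rate_at_psnr(high_profile_curve, 36.0))
```

Matching on PSNR rather than on QP keeps the objective quality of the two profiles aligned, so that any remaining subjective difference can be attributed to the coding structure.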

4.1.2 Coding Results

Baseline Profile

Simulcast

Figure 4.2 shows the RD characteristics used for the optimization of the simulcast approach. For coding, a QP range from 18 to 44 with a step size of one was used. Sequences matching the bit rates defined in Table 4.3 have been taken as test stimuli. The Bullinger sequence is highly compressible due to the very low complexity of the constant background and the only slightly moving foreground. The Butterfly and the Horse sequences both have a high structural complexity and no camera movement. Nevertheless, coding gains for Butterfly are higher than for Horse. The reason is the absence of noise and a higher similarity of subsequent frames in the artificial scene. In the sequences Mountain, Soccer2 and Car the camera is moving. The strongest camera motion can be found in Car; however, the camera only moves in the forward direction, so the scene changes rather slowly. This explains the higher gains compared to the Mountain and Soccer2 sequences, in which the camera moves in horizontal or vertical direction.

Figure 4.2 PSNR vs. bit rate of left and right view for simulcast coding (baseline profile)

Multi View Coding

The RD characteristics used for the optimization of the MVC approach are shown in Figure 4.3. The sequences have been coded using a QP range from 18 to 44 with a step size of one. Sequences matching the bit rates defined in Table 4.3 have been taken as test stimuli. A comparison to simulcast coding shows that the coding gain increases. The differences between the sequences are similar. A high gain can be found for the Butterfly sequence. This is related to the similarity of the two artificial views, which enables an efficient inter-view prediction.

Figure 4.3 PSNR vs. bit rate of left and right view for MVC coding (baseline profile)

Mixed Resolution Stereo Coding

To determine the optimal bit rate distribution between the views of the mixed resolution method, the approach suggested in [12] and described earlier in this deliverable was used. Thus the shown PSNR was calculated from the average MSE of the full view and the up-sampled low-resolution view. To take binocular suppression theory into account, the down- and up-sampled original view was taken as reference for the up-sampled low-resolution view. Hence the PSNR calculated this way only evaluates the coding quality and not the overall quality. The left view and the down-sampled right view have been coded with QPs from 18 to 44 with a step size of 2. Coding results are shown in Figure 4.4. Each point represents a QP combination for the left and the down-sampled right view. The optimal QP combinations can be found on the envelope of these points. Sequences matching the bit rates defined in Table 4.3 and coded with optimal QP combinations have been taken as test stimuli. Where necessary, additional coding runs with intermediate QP combinations have been carried out.
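Selecting the QP combinations on the envelope of the (bit rate, PSNR) point cloud can be sketched as follows. The report does not define the envelope formally; the sketch assumes it means the set of non-dominated points, i.e. those for which no other combination reaches at least the same PSNR at the same or a lower total bit rate. The function name and the sample points are hypothetical.

```python
def rd_envelope(points):
    """Return the non-dominated points of an RD point cloud.

    points: iterable of (qp_left, qp_right, total_rate_kbps, total_psnr_db).
    A point is kept if its PSNR exceeds that of every point with a lower or
    equal bit rate (assumed reading of "envelope").
    """
    # Sort by rate; for equal rates consider the higher PSNR first.
    ordered = sorted(points, key=lambda p: (p[2], -p[3]))
    envelope = []
    best_psnr = float("-inf")
    for point in ordered:
        if point[3] > best_psnr:
            envelope.append(point)
            best_psnr = point[3]
    return envelope


if __name__ == "__main__":
    # Hypothetical sample points, for illustration only.
    cloud = [
        (30, 30, 100.0, 35.0),
        (28, 34, 100.0, 34.0),
        (26, 30, 150.0, 36.5),
        (30, 26, 160.0, 36.0),
        (24, 28, 220.0, 38.0),
    ]
    for qp_l, qp_r, rate, psnr in rd_envelope(cloud):
        print(qp_l, qp_r, rate, psnr)
```

A test stimulus for a given target rate would then be taken from the envelope point closest to that rate, with extra coding runs at intermediate QP combinations if the gap is too large.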

Figure 4.4 PSNR vs. bit rate of left view and down-sampled right view for MRSC Coding (baseline profile)

Figure 4.5 PSNR vs. bit rate of left view and depth for V+D Coding (baseline profile)

Video + Depth Coding

Coding results for V+D coding are shown in Figure 4.5. The PSNR was calculated from the average MSE of the left and the rendered right view. The MSE of the rendered right view was calculated taking the right view rendered from uncoded data as reference. Rendering artifacts already existing in the uncoded data are neglected with this approach. Hence the PSNR calculated this way only evaluates the coding quality and not the overall quality. The left view has been coded with QPs from 18 to 44 with a step size of 2. For depth, QPs from 8 to 44 or from 18 to 44, depending on the sequence, have been used with a step size of 2. Each point in Figure 4.5 represents a QP combination for the left view and depth. The optimal QP combinations can be found on the envelope of these points. Sequences matching the bit rates defined in Table 4.3 and coded with optimal QP combinations have been taken as test stimuli. Where necessary, additional coding runs with intermediate QP combinations have been carried out.

High Profile

Figure 4.6 to Figure 4.9 show the coding results for the high profile. Typical characteristics of the sequences are similar to the baseline profile case. For all sequences a high coding gain can be achieved by using the high profile with hierarchical B-pictures and CABAC. A comparison of high and baseline profile is presented separately for each sequence in section 4.1.3 (Generated Test Stimuli).

Simulcast

Figure 4.6 PSNR vs. bit rate of left and right view for simulcast coding (high profile)

Multi View Coding

Figure 4.7 PSNR vs. bit rate of left and right view for MVC coding (high profile)

Mixed Resolution Stereo Coding

Figure 4.8 PSNR vs. bit rate of left view and down-sampled right view for MRSC Coding (high profile)

Video+Depth Coding

Figure 4.9 PSNR vs. bit rate of left view and depth for V+D Coding (high profile)

4.1.3 Generated Test Stimuli

Table 4.4 to Table 4.9 show the PSNRs and the bit rate distribution of the resulting test stimuli. The total PSNR was calculated using the MSE of the single left and right views. Note that for MRSC the PSNR of the right view was calculated using the uncoded up- and down-sampled right view as reference. Therefore these PSNR values are marked with pluses. For V+D coding the PSNR of the right view was calculated using the right view rendered from uncoded data as reference; these PSNR values are marked with asterisks.

Columns: Method, PSNR-Y both views [dB], PSNR-Y left view [dB], PSNR-Y right view [dB], Bit rate right / Bit rate total
Base Profile, Low Rate (74 kbit/s): Simulcast % MVC % MRSC % V+D 37.3* * 23%
Base Profile, High Rate (160 kbit/s): Simulcast % MVC % MRSC % V+D 39.5* * 33%
High Profile, Low Rate (46 kbit/s): Simulcast % MVC % MRSC % V+D 37.3* * 26%
High Profile, High Rate (99 kbit/s): Simulcast % MVC % MRSC % V+D 39.4* * 21%
Table 4.4 Properties of test stimuli of sequence Bullinger

Table 4.4 shows the coding results for Bullinger. Inter-view prediction leads to bit rate savings of about 24% for the right view and to significant PSNR gains for MVC. The optimal share of the bit rate for the down-sampled right view in MR coding ranges from 32% to 38%. Depth can be coded at approximately 20% to 30% of the total bit rate and leads to the best quality of the left view. A comparison of the baseline and the high profile shows that the high profile enables bit rate savings of about 40% while the quality for Simulcast, MRSC and VD coding remains unchanged.

Columns: Method, PSNR-Y both views [dB], PSNR-Y left view [dB], PSNR-Y right view [dB], Bit rate right / Bit rate total
Base Profile, Low Rate (143 kbit/s): Simulcast % MVC % MRSC 33.8* * 45% V+D %
Base Profile, High Rate (318 kbit/s): Simulcast % MVC % MRSC 39.1* * 38% V+D %
High Profile, Low Rate (94 kbit/s): Simulcast % MVC % MRSC 33.3* % V+D %
High Profile, High Rate (212 kbit/s): Simulcast % MVC % MRSC 38.8* * 39% V+D %
Table 4.5 Properties of test stimuli of sequence Butterfly

The results for the Butterfly sequence are shown in Table 4.5. Due to the synthetic character of the sequence and the similarity of both views, inter-view prediction is very efficient. About 50% of the bit rate of the right view can be saved compared to simulcast. The bit rate of the down-sampled right view ranges from 38% to 49% of the total bit rate. The optimal bit rate for depth ranges from about 8% to 29% of the total rate. The bit rates of the sequences coded with the high profile are about 33% lower than those of the sequences coded with the baseline profile. The performance of MVC and MRSC decreases, but the performance of V+D increases.
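The bit rate savings of the high profile quoted above can be reproduced from the rate points given in the headings of Table 4.4 and Table 4.5. The short calculation below uses only those values; the dictionary layout and the function name are illustrative.

```python
# (baseline rate, high-profile rate) in kbit/s, from the Table 4.4 / 4.5 headings.
RATE_POINTS_KBPS = {
    "Bullinger": {"low": (74, 46), "high": (160, 99)},
    "Butterfly": {"low": (143, 94), "high": (318, 212)},
}


def saving_percent(baseline_rate: float, high_profile_rate: float) -> float:
    """Relative bit rate saving of the high profile over the baseline profile."""
    return 100.0 * (baseline_rate - high_profile_rate) / baseline_rate


if __name__ == "__main__":
    for sequence, qualities in RATE_POINTS_KBPS.items():
        for quality, (base, high) in qualities.items():
            print(sequence, quality, round(saving_percent(base, high), 1))
```

The results are close to 38% for both Bullinger rate points and between 33% and 34% for Butterfly, in line with the "about 40%" and "about 33%" stated in the text.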

Columns: Method, PSNR-Y both views [dB], PSNR-Y left view [dB], PSNR-Y right view [dB], Bit rate right / Bit rate total
Base Profile, Low Rate (130 kbit/s): Simulcast % MVC % MRSC 35.1* * 46% V+D %
Base Profile, High Rate (378 kbit/s): Simulcast % MVC % MRSC 39.4* * 42% V+D %
High Profile, Low Rate (112 kbit/s): Simulcast % MVC % MRSC 35.2* * 42% V+D %
High Profile, High Rate (323 kbit/s): Simulcast % MVC % MRSC 39.7* * 34% V+D %
Table 4.6 Properties of test stimuli of sequence Car

Table 4.6 shows the results for the sequence Car. MVC and MRSC result in bit rate savings of approximately 20% for the right view. Depth can be coded efficiently and needs only about 7% to 16% of the total bit rate. The bit rate of the sequences coded with the high profile is about 14% lower than that of the sequences coded with the baseline profile. The quality remains approximately equal for all methods. Thus the gain achieved by using the more complex coding structure is relatively low.

Columns: Method, PSNR-Y both views [dB], PSNR-Y left view [dB], PSNR-Y right view [dB], Bit rate right / Bit rate total
Base Profile, Low Rate (160 kbit/s): Simulcast % MVC % MRSC 31.3* * 41% V+D %
Base Profile, High Rate (450 kbit/s): Simulcast % MVC % MRSC 37.1* * 33% V+D %
High Profile, Low Rate (104 kbit/s): Simulcast % MVC % MRSC 31.2* * 41% V+D %
High Profile, High Rate (284 kbit/s): Simulcast % MVC % MRSC 37.0* * 29% V+D %
Table 4.7 Properties of test stimuli of sequence Horse

Properties of the sequence Horse are provided in Table 4.7. MVC and MRSC lead to a rate for the right view of approximately 30% to 40% of the total bit rate. Depth can be coded at about 9% to 18% of the total bit rate and leads to a high quality of the left view. The high profile enables bit rate savings of 35% compared to the baseline profile at approximately equal quality.

Columns: Method, PSNR-Y both views [dB], PSNR-Y left view [dB], PSNR-Y right view [dB], Bit rate right / Bit rate total
Base Profile, Low Rate (104 kbit/s): Simulcast % MVC % MRSC 31.7* * 33% V+D %
Base Profile, High Rate (367 kbit/s): Simulcast % MVC % MRSC 37.0* * 30% V+D %
High Profile, Low Rate (78 kbit/s): Simulcast % MVC % MRSC 32.7* * 29% V+D %
High Profile, High Rate (208 kbit/s): Simulcast % MVC % MRSC 37.3* * 28% V+D %
Table 4.8 Properties of test stimuli of sequence Mountain

Table 4.8 shows the coding results for the Mountain sequence. The bit rate distribution of simulcast coding shows that the right view is slightly less compressible than the left view. MVC achieves gains of up to 1 dB compared with simulcast. Nevertheless, inter-view prediction is not efficient for the high rate and the baseline profile. MRSC leads to a bit rate share of about 30% for the down-sampled right view. Depth can be coded at 13% to 19% of the total bit rate. At the low rate, up to 25% of the bit rate can be saved with the high profile at a slightly better quality. At the high rate a saving of 40% is achieved.

Columns: Method, PSNR-Y both views [dB], PSNR-Y left view [dB], PSNR-Y right view [dB], Bit rate right / Bit rate total
Base Profile, Low Rate (159 kbit/s): Simulcast % MVC % MRSC 34.4* * 37% V+D %
Base Profile, High Rate (452 kbit/s): Simulcast % MVC % MRSC 39.4* * 36% V+D %
High Profile, Low Rate (134 kbit/s): Simulcast % MVC % MRSC 34.4* * 34% V+D %
High Profile, High Rate (381 kbit/s): Simulcast % MVC % MRSC 39.4* * 34% V+D %
Table 4.9 Properties of test stimuli of sequence Soccer2

Coding results for the Soccer2 sequence are presented in Table 4.9. For MVC, PSNR gains of up to 1.5 dB can be reached. The bit rate share of the right view ranges from 34% to 43% for MVC and MRSC. The depth can be coded at 10% to 15% of the total bit rate and enables gains of up to 2.7 dB for the left view. Coding with the high profile leads to a bit rate saving of 15% at approximately the same quality.

Conclusion

For the subjective evaluation of coding methods, 96 test stimuli have been generated from the six sequences of the coding test set using Simulcast, Multi View, Mixed Resolution and Video + Depth coding. A baseline and a high codec profile were used. An objective evaluation was carried out at a low and a high quality level.

MVC results in a higher PSNR than simulcast. With V+D and Mixed Resolution coding, the PSNR of the left view increases compared to the simulcast approach. Nevertheless, the quality of the rendered right view, the quality of the down-sampled view and the overall quality of both views remain questionable, since rendering artifacts and image distortions introduced by down-sampling cannot be evaluated using the PSNR. Therefore the large-scale subjective test is needed. Results of this test are reported in the upcoming deliverable D4.3, "Results of quality attributes of coding, transmission and their combinations". Coding using the high profile generates sequences at approximately the same quality level, but with bit rate savings from 15% to 50%.

4.2 Test sequences for transmission studies

In addition to the examination of the coding approaches, a study on transmission approaches was carried out. This section deals with the preparation of test stimuli for this study. The focus is on the coding part. Apart from the slice mode, error resilience strategies are discussed in the upcoming deliverable D3.4, "Stereo DVB-H broadcasting system with error resilient tools".

Test setup

For the transmission studies, coded sequences have been generated using Simulcast, Multi View and Video plus Depth coding. The coding approaches have been optimized at rate points with a high video quality. The baseline codec profile has been used. The four sequences from the transmission test set of the stereo video database [9] have been used. Furthermore, sequences have been coded with and without the newly implemented slice mode. This leads to a total of 3 (approaches) x 4 (sequences) x 2 (slice mode) = 24 test stimuli.

Coding Approaches

The Simulcast, Multi View and V+D coding approaches as described in section have been used. Due to the low performance of the simple Mixed Resolution approach in the subjective coding test (see D4.3), this approach was omitted. The Advanced Mixed Resolution Stereo Coding (AMRSC) approach was not available at the time the test sequences were prepared.

Test set

A test set for transmission studies was defined and will be reported in the next update of the stereo video database [9]. The sequences are shown in Figure. Details are presented in Figure. The sequences RhineValleyMoving, KnightsQuest and HeidelbergAlleys consist of different scenes with varying movement and complexity. All sequences have a length of 60 seconds and are available with audio.

Table 4.10 Properties of sequences from the coding test set
Sequence | Genre | Camera movement | Object movement | Structural complexity | Depth complexity | Frame rate | Size in pixels
RollerBlade | Sports, User Created | None | High | Medium | Medium | | x240
RhineValleyMoving | Action | High | High | Medium | Low | | x240
KnightsQuest | Animation | Various | Various | Low | Low | | x240
HeidelbergAlleys | Documentary | Low | Low | High | Various | | x240
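The stimulus count given in the test setup, 3 approaches x 4 sequences x 2 slice-mode settings, can be enumerated directly. This is a small sketch only; the labels and the dictionary layout are illustrative assumptions rather than the naming used in the project.

```python
from itertools import product

# Factors of the transmission test design as described in the test setup.
APPROACHES = ["Simulcast", "MVC", "V+D"]
SEQUENCES = ["RollerBlade", "RhineValleyMoving", "KnightsQuest", "HeidelbergAlleys"]
SLICE_MODES = [False, True]  # coded without / with the slice mode

# One entry per combination of approach, sequence and slice-mode setting.
stimuli = [
    {"approach": approach, "sequence": sequence, "slice_mode": slice_mode}
    for approach, sequence, slice_mode in product(APPROACHES, SEQUENCES, SLICE_MODES)
]

print(len(stimuli))
```

Half of the resulting entries have `slice_mode` set, matching the statement that half of the sequences were coded with the new encoder.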

Figure 4.10: Sequences of the transmission test set (RollerBlade1, RhineValleyMoving, KnightsQuest, HeidelbergAlleys)

Codec Profile

The transmission study has been carried out using the baseline profile shown in Table 4.2. For all approaches the MVC reference software JMVC has been used. Inter-view prediction was not used for simulcast and V+D coding.

Quality level

The coding approaches have been evaluated at a high quality point. Individual target bit rates for each sequence have been found with the approach described in section. To define a high quality for all sequences of the transmission test set, the quantization parameter (QP) of the codec was set to 30 for simulcast coding. This results in a target bit rate for each sequence from the coding test set. Furthermore, bit rates should not exceed 600 kbit/s. It was therefore necessary to set a QP of 33 for the RollerBlade sequence. The resulting bit rates are shown in Table 4.11 and have been used as target rates for the other two approaches and the slice mode.

Table 4.11: Target bit rates in kbit/s
RollerBlade1 | RhineValleyMoving | KnightsQuest | HeidelbergAlleys
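One plausible reading of the quality-level rule is: start from a simulcast QP of 30 and raise it only as far as needed to stay within 600 kbit/s. The sketch below illustrates that reading. The report does not state how the QP of 33 was actually found, so the search procedure, the function names and the toy rate curve are all assumptions for illustration, not measured data.

```python
START_QP = 30        # simulcast QP chosen for high quality in the text
MAX_RATE_KBPS = 600  # upper bound on the bit rate stated in the text
MAX_QP = 51          # largest QP defined for 8-bit H.264/AVC


def select_qp(rate_for_qp, start_qp=START_QP, max_rate=MAX_RATE_KBPS):
    """Return (qp, rate) for the smallest QP >= start_qp within max_rate.

    rate_for_qp is a callable mapping a QP to the resulting bit rate in
    kbit/s, for example by running the encoder (hypothetical here).
    """
    for qp in range(start_qp, MAX_QP + 1):
        rate = rate_for_qp(qp)
        if rate <= max_rate:
            return qp, rate
    raise ValueError("no QP satisfies the rate limit")


def toy_rate(qp):
    """Hypothetical rate curve: the rate halves for every 6 QP steps."""
    return 800.0 * 2 ** (-(qp - START_QP) / 6.0)


qp, rate = select_qp(toy_rate)
print(qp, round(rate, 1))
```

With this toy curve the search stops at a QP above 30, in the same way the RollerBlade sequence required a higher QP than the other sequences. A sequence whose rate at QP 30 is already below the limit keeps QP 30.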

Final report on coding algorithms for mobile 3DTV. Gerhard Tech Karsten Müller Philipp Merkle Heribert Brust Lina Jin

Final report on coding algorithms for mobile 3DTV. Gerhard Tech Karsten Müller Philipp Merkle Heribert Brust Lina Jin Final report on coding algorithms for mobile 3DTV Gerhard Tech Karsten Müller Philipp Merkle Heribert Brust Lina Jin MOBILE3DTV Project No. 216503 Final report on coding algorithms for mobile 3DTV Gerhard

More information

Stereo DVB-H Broadcasting System with Error Resilient Tools

Stereo DVB-H Broadcasting System with Error Resilient Tools Stereo DVB-H Broadcasting System with Error Resilient Tools Done Bugdayci M. Oguz Bici Anil Aksay Murat Demirtas Gozde B Akar Antti Tikanmaki Atanas Gotchev Project No. 21653 Stereo DVB-H Broadcasting

More information

THE H.264 ADVANCED VIDEO COMPRESSION STANDARD

THE H.264 ADVANCED VIDEO COMPRESSION STANDARD THE H.264 ADVANCED VIDEO COMPRESSION STANDARD Second Edition Iain E. Richardson Vcodex Limited, UK WILEY A John Wiley and Sons, Ltd., Publication About the Author Preface Glossary List of Figures List

More information

Upcoming Video Standards. Madhukar Budagavi, Ph.D. DSPS R&D Center, Dallas Texas Instruments Inc.

Upcoming Video Standards. Madhukar Budagavi, Ph.D. DSPS R&D Center, Dallas Texas Instruments Inc. Upcoming Video Standards Madhukar Budagavi, Ph.D. DSPS R&D Center, Dallas Texas Instruments Inc. Outline Brief history of Video Coding standards Scalable Video Coding (SVC) standard Multiview Video Coding

More information

Scalable Extension of HEVC 한종기

Scalable Extension of HEVC 한종기 Scalable Extension of HEVC 한종기 Contents 0. Overview for Scalable Extension of HEVC 1. Requirements and Test Points 2. Coding Gain/Efficiency 3. Complexity 4. System Level Considerations 5. Related Contributions

More information

Chapter 11.3 MPEG-2. MPEG-2: For higher quality video at a bit-rate of more than 4 Mbps Defined seven profiles aimed at different applications:

Chapter 11.3 MPEG-2. MPEG-2: For higher quality video at a bit-rate of more than 4 Mbps Defined seven profiles aimed at different applications: Chapter 11.3 MPEG-2 MPEG-2: For higher quality video at a bit-rate of more than 4 Mbps Defined seven profiles aimed at different applications: Simple, Main, SNR scalable, Spatially scalable, High, 4:2:2,

More information

Advanced Video Coding: The new H.264 video compression standard

Advanced Video Coding: The new H.264 video compression standard Advanced Video Coding: The new H.264 video compression standard August 2003 1. Introduction Video compression ( video coding ), the process of compressing moving images to save storage space and transmission

More information

Recent, Current and Future Developments in Video Coding

Recent, Current and Future Developments in Video Coding Recent, Current and Future Developments in Video Coding Jens-Rainer Ohm Inst. of Commun. Engineering Outline Recent and current activities in MPEG Video and JVT Scalable Video Coding Multiview Video Coding

More information

Results of quality attributes of coding, transmission, and their combinations

Results of quality attributes of coding, transmission, and their combinations Results of quality attributes of coding, transmission, and their combinations Dominik Strohmeier Satu Jumisko-Pyykkö Kristina Kunze Gerhard Tech Döne Buğdaycı Mehmet OguzBici Project No. 216503 Results

More information

H.264 Video Transmission with High Quality and Low Bitrate over Wireless Network

H.264 Video Transmission with High Quality and Low Bitrate over Wireless Network H.264 Video Transmission with High Quality and Low Bitrate over Wireless Network Kadhim Hayyawi Flayyih 1, Mahmood Abdul Hakeem Abbood 2, Prof.Dr.Nasser Nafe a Khamees 3 Master Students, The Informatics

More information

Laboratoire d'informatique, de Robotique et de Microélectronique de Montpellier Montpellier Cedex 5 France

Laboratoire d'informatique, de Robotique et de Microélectronique de Montpellier Montpellier Cedex 5 France Video Compression Zafar Javed SHAHID, Marc CHAUMONT and William PUECH Laboratoire LIRMM VOODDO project Laboratoire d'informatique, de Robotique et de Microélectronique de Montpellier LIRMM UMR 5506 Université

More information

Experimental Evaluation of H.264/Multiview Video Coding over IP Networks

Experimental Evaluation of H.264/Multiview Video Coding over IP Networks ISSC 11, Trinity College Dublin, June 23-24 Experimental Evaluation of H.264/Multiview Video Coding over IP Networks Zhao Liu *, Yuansong Qiao *, Brian Lee *, Enda Fallon **, Karunakar A. K. *, Chunrong

More information

Investigation of the GoP Structure for H.26L Video Streams

Investigation of the GoP Structure for H.26L Video Streams Investigation of the GoP Structure for H.26L Video Streams F. Fitzek P. Seeling M. Reisslein M. Rossi M. Zorzi acticom GmbH mobile networks R & D Group Germany [fitzek seeling]@acticom.de Arizona State

More information

New Techniques for Improved Video Coding

New Techniques for Improved Video Coding New Techniques for Improved Video Coding Thomas Wiegand Fraunhofer Institute for Telecommunications Heinrich Hertz Institute Berlin, Germany wiegand@hhi.de Outline Inter-frame Encoder Optimization Texture

More information

INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO

INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO ISO/IEC JTC1/SC29/WG11 MPEG2011/N12559 February 2012,

More information

Week 14. Video Compression. Ref: Fundamentals of Multimedia

Week 14. Video Compression. Ref: Fundamentals of Multimedia Week 14 Video Compression Ref: Fundamentals of Multimedia Last lecture review Prediction from the previous frame is called forward prediction Prediction from the next frame is called forward prediction

More information

Lecture 13 Video Coding H.264 / MPEG4 AVC

Lecture 13 Video Coding H.264 / MPEG4 AVC Lecture 13 Video Coding H.264 / MPEG4 AVC Last time we saw the macro block partition of H.264, the integer DCT transform, and the cascade using the DC coefficients with the WHT. H.264 has more interesting

More information

Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV

Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV Jeffrey S. McVeigh 1 and Siu-Wai Wu 2 1 Carnegie Mellon University Department of Electrical and Computer Engineering

More information

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE 5359 Gaurav Hansda 1000721849 gaurav.hansda@mavs.uta.edu Outline Introduction to H.264 Current algorithms for

More information

Coding of 3D Videos based on Visual Discomfort

Coding of 3D Videos based on Visual Discomfort Coding of 3D Videos based on Visual Discomfort Dogancan Temel and Ghassan AlRegib School of Electrical and Computer Engineering, Georgia Institute of Technology Atlanta, GA, 30332-0250 USA {cantemel, alregib}@gatech.edu

More information

Video coding. Concepts and notations.

Video coding. Concepts and notations. TSBK06 video coding p.1/47 Video coding Concepts and notations. A video signal consists of a time sequence of images. Typical frame rates are 24, 25, 30, 50 and 60 images per seconds. Each image is either

More information

Jimin Xiao, Tammam Tillo, Senior Member, IEEE, Yao Zhao, Senior Member, IEEE

Jimin Xiao, Tammam Tillo, Senior Member, IEEE, Yao Zhao, Senior Member, IEEE Real-Time Video Streaming Using Randomized Expanding Reed-Solomon Code Jimin Xiao, Tammam Tillo, Senior Member, IEEE, Yao Zhao, Senior Member, IEEE Abstract Forward error correction (FEC) codes are widely

More information

The Scope of Picture and Video Coding Standardization

The Scope of Picture and Video Coding Standardization H.120 H.261 Video Coding Standards MPEG-1 and MPEG-2/H.262 H.263 MPEG-4 H.264 / MPEG-4 AVC Thomas Wiegand: Digital Image Communication Video Coding Standards 1 The Scope of Picture and Video Coding Standardization

More information

Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding.

Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding. Project Title: Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding. Midterm Report CS 584 Multimedia Communications Submitted by: Syed Jawwad Bukhari 2004-03-0028 About

More information

4G WIRELESS VIDEO COMMUNICATIONS

4G WIRELESS VIDEO COMMUNICATIONS 4G WIRELESS VIDEO COMMUNICATIONS Haohong Wang Marvell Semiconductors, USA Lisimachos P. Kondi University of Ioannina, Greece Ajay Luthra Motorola, USA Song Ci University of Nebraska-Lincoln, USA WILEY

More information

JPEG 2000 vs. JPEG in MPEG Encoding

JPEG 2000 vs. JPEG in MPEG Encoding JPEG 2000 vs. JPEG in MPEG Encoding V.G. Ruiz, M.F. López, I. García and E.M.T. Hendrix Dept. Computer Architecture and Electronics University of Almería. 04120 Almería. Spain. E-mail: vruiz@ual.es, mflopez@ace.ual.es,

More information

Lecture 5: Error Resilience & Scalability

Lecture 5: Error Resilience & Scalability Lecture 5: Error Resilience & Scalability Dr Reji Mathew A/Prof. Jian Zhang NICTA & CSE UNSW COMP9519 Multimedia Systems S 010 jzhang@cse.unsw.edu.au Outline Error Resilience Scalability Including slides

More information

H.264/AVC und MPEG-4 SVC - die nächsten Generationen der Videokompression

H.264/AVC und MPEG-4 SVC - die nächsten Generationen der Videokompression Fraunhofer Institut für Nachrichtentechnik Heinrich-Hertz-Institut Ralf Schäfer schaefer@hhi.de http://bs.hhi.de H.264/AVC und MPEG-4 SVC - die nächsten Generationen der Videokompression Introduction H.264/AVC:

More information

Objective: Introduction: To: Dr. K. R. Rao. From: Kaustubh V. Dhonsale (UTA id: ) Date: 04/24/2012

Objective: Introduction: To: Dr. K. R. Rao. From: Kaustubh V. Dhonsale (UTA id: ) Date: 04/24/2012 To: Dr. K. R. Rao From: Kaustubh V. Dhonsale (UTA id: - 1000699333) Date: 04/24/2012 Subject: EE-5359: Class project interim report Proposed project topic: Overview, implementation and comparison of Audio

More information

On the Adoption of Multiview Video Coding in Wireless Multimedia Sensor Networks

On the Adoption of Multiview Video Coding in Wireless Multimedia Sensor Networks 2011 Wireless Advanced On the Adoption of Multiview Video Coding in Wireless Multimedia Sensor Networks S. Colonnese, F. Cuomo, O. Damiano, V. De Pascalis and T. Melodia University of Rome, Sapienza, DIET,

More information

Module 6 STILL IMAGE COMPRESSION STANDARDS

Module 6 STILL IMAGE COMPRESSION STANDARDS Module 6 STILL IMAGE COMPRESSION STANDARDS Lesson 19 JPEG-2000 Error Resiliency Instructional Objectives At the end of this lesson, the students should be able to: 1. Name two different types of lossy

More information

H.264 / AVC (Advanced Video Coding)

H.264 / AVC (Advanced Video Coding) H.264 / AVC (Advanced Video Coding) 2014-2016 Josef Pelikán CGG MFF UK Praha pepca@cgg.mff.cuni.cz http://cgg.mff.cuni.cz/~pepca/ H.264/AVC 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 1 / 20 Context

More information

Professor Laurence S. Dooley. School of Computing and Communications Milton Keynes, UK

Professor Laurence S. Dooley. School of Computing and Communications Milton Keynes, UK Professor Laurence S. Dooley School of Computing and Communications Milton Keynes, UK How many bits required? 2.4Mbytes 84Kbytes 9.8Kbytes 50Kbytes Data Information Data and information are NOT the same!

More information

3366 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 22, NO. 9, SEPTEMBER 2013

3366 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 22, NO. 9, SEPTEMBER 2013 3366 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 22, NO. 9, SEPTEMBER 2013 3D High-Efficiency Video Coding for Multi-View Video and Depth Data Karsten Müller, Senior Member, IEEE, Heiko Schwarz, Detlev

More information

VIDEO COMPRESSION STANDARDS

VIDEO COMPRESSION STANDARDS VIDEO COMPRESSION STANDARDS Family of standards: the evolution of the coding model state of the art (and implementation technology support): H.261: videoconference x64 (1988) MPEG-1: CD storage (up to

More information

Review Article Video Traffic Characteristics of Modern Encoding Standards: H.264/AVC with SVC and MVC Extensions and H.265/HEVC

Review Article Video Traffic Characteristics of Modern Encoding Standards: H.264/AVC with SVC and MVC Extensions and H.265/HEVC e Scientific World Journal, Article ID 189481, 16 pages http://dx.doi.org/10.1155/2014/189481 Review Article Video Traffic Characteristics of Modern Encoding Standards: H.264/AVC with SVC and MVC Extensions

More information

View Synthesis for Multiview Video Compression

View Synthesis for Multiview Video Compression View Synthesis for Multiview Video Compression Emin Martinian, Alexander Behrens, Jun Xin, and Anthony Vetro email:{martinian,jxin,avetro}@merl.com, behrens@tnt.uni-hannover.de Mitsubishi Electric Research

More information

Image and Video Coding I: Fundamentals

Image and Video Coding I: Fundamentals Image and Video Coding I: Fundamentals Thomas Wiegand Technische Universität Berlin T. Wiegand (TU Berlin) Image and Video Coding Organization Vorlesung: Donnerstag 10:15-11:45 Raum EN-368 Material: http://www.ic.tu-berlin.de/menue/studium_und_lehre/

More information

Scalable Video Coding in H.264/AVC

Scalable Video Coding in H.264/AVC Scalable Video Coding in H.264/AVC 1. Introduction Potentials and Applications 2. Scalability Extension of H.264/AVC 2.1Scalability Operation and High-Level Syntax 2.2Temporal Scalability 2.3SNR/Fidelity/Quality

More information

Optimum Quantization Parameters for Mode Decision in Scalable Extension of H.264/AVC Video Codec

Optimum Quantization Parameters for Mode Decision in Scalable Extension of H.264/AVC Video Codec Optimum Quantization Parameters for Mode Decision in Scalable Extension of H.264/AVC Video Codec Seung-Hwan Kim and Yo-Sung Ho Gwangju Institute of Science and Technology (GIST), 1 Oryong-dong Buk-gu,

More information

A COST-EFFICIENT RESIDUAL PREDICTION VLSI ARCHITECTURE FOR H.264/AVC SCALABLE EXTENSION

A COST-EFFICIENT RESIDUAL PREDICTION VLSI ARCHITECTURE FOR H.264/AVC SCALABLE EXTENSION A COST-EFFICIENT RESIDUAL PREDICTION VLSI ARCHITECTURE FOR H.264/AVC SCALABLE EXTENSION Yi-Hau Chen, Tzu-Der Chuang, Chuan-Yung Tsai, Yu-Jen Chen, and Liang-Gee Chen DSP/IC Design Lab., Graduate Institute

More information

Next-Generation 3D Formats with Depth Map Support

Next-Generation 3D Formats with Depth Map Support MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Next-Generation 3D Formats with Depth Map Support Chen, Y.; Vetro, A. TR2014-016 April 2014 Abstract This article reviews the most recent extensions

More information

Introduction to Video Encoding

Introduction to Video Encoding Introduction to Video Encoding INF5063 23. September 2011 History of MPEG Motion Picture Experts Group MPEG1 work started in 1988, published by ISO in 1993 Part 1 Systems, Part 2 Video, Part 3 Audio, Part

More information

MPEG-4: Simple Profile (SP)

MPEG-4: Simple Profile (SP) MPEG-4: Simple Profile (SP) I-VOP (Intra-coded rectangular VOP, progressive video format) P-VOP (Inter-coded rectangular VOP, progressive video format) Short Header mode (compatibility with H.263 codec)

More information

LIST OF TABLES. Table 5.1 Specification of mapping of idx to cij for zig-zag scan 46. Table 5.2 Macroblock types 46

LIST OF TABLES. Table 5.1 Specification of mapping of idx to cij for zig-zag scan 46. Table 5.2 Macroblock types 46 LIST OF TABLES TABLE Table 5.1 Specification of mapping of idx to cij for zig-zag scan 46 Table 5.2 Macroblock types 46 Table 5.3 Inverse Scaling Matrix values 48 Table 5.4 Specification of QPC as function

More information

MPEG-2. ISO/IEC (or ITU-T H.262)

MPEG-2. ISO/IEC (or ITU-T H.262) MPEG-2 1 MPEG-2 ISO/IEC 13818-2 (or ITU-T H.262) High quality encoding of interlaced video at 4-15 Mbps for digital video broadcast TV and digital storage media Applications Broadcast TV, Satellite TV,

More information

EE Low Complexity H.264 encoder for mobile applications

EE Low Complexity H.264 encoder for mobile applications EE 5359 Low Complexity H.264 encoder for mobile applications Thejaswini Purushotham Student I.D.: 1000-616 811 Date: February 18,2010 Objective The objective of the project is to implement a low-complexity

More information

Video Communication Ecosystems. Research Challenges for Immersive. over Future Internet. Converged Networks & Services (CONES) Research Group

Video Communication Ecosystems. Research Challenges for Immersive. over Future Internet. Converged Networks & Services (CONES) Research Group Research Challenges for Immersive Video Communication Ecosystems over Future Internet Tasos Dagiuklas, Ph.D., SMIEEE Assistant Professor Converged Networks & Services (CONES) Research Group Hellenic Open

More information

2014 Summer School on MPEG/VCEG Video. Video Coding Concept

2014 Summer School on MPEG/VCEG Video. Video Coding Concept 2014 Summer School on MPEG/VCEG Video 1 Video Coding Concept Outline 2 Introduction Capture and representation of digital video Fundamentals of video coding Summary Outline 3 Introduction Capture and representation

More information

High Efficiency Video Coding (HEVC) test model HM vs. HM- 16.6: objective and subjective performance analysis

High Efficiency Video Coding (HEVC) test model HM vs. HM- 16.6: objective and subjective performance analysis High Efficiency Video Coding (HEVC) test model HM-16.12 vs. HM- 16.6: objective and subjective performance analysis ZORAN MILICEVIC (1), ZORAN BOJKOVIC (2) 1 Department of Telecommunication and IT GS of

More information

Digital Video Processing

Digital Video Processing Video signal is basically any sequence of time varying images. In a digital video, the picture information is digitized both spatially and temporally and the resultant pixel intensities are quantized.

More information

Reduced Frame Quantization in Video Coding

Reduced Frame Quantization in Video Coding Reduced Frame Quantization in Video Coding Tuukka Toivonen and Janne Heikkilä Machine Vision Group Infotech Oulu and Department of Electrical and Information Engineering P. O. Box 500, FIN-900 University

More information

Overview, implementation and comparison of Audio Video Standard (AVS) China and H.264/MPEG -4 part 10 or Advanced Video Coding Standard

Overview, implementation and comparison of Audio Video Standard (AVS) China and H.264/MPEG -4 part 10 or Advanced Video Coding Standard Multimedia Processing Term project Overview, implementation and comparison of Audio Video Standard (AVS) China and H.264/MPEG -4 part 10 or Advanced Video Coding Standard EE-5359 Class project Spring 2012

More information

Fast Mode Decision for H.264/AVC Using Mode Prediction

Fast Mode Decision for H.264/AVC Using Mode Prediction Fast Mode Decision for H.264/AVC Using Mode Prediction Song-Hak Ri and Joern Ostermann Institut fuer Informationsverarbeitung, Appelstr 9A, D-30167 Hannover, Germany ri@tnt.uni-hannover.de ostermann@tnt.uni-hannover.de

More information

IMPROVED CONTEXT-ADAPTIVE ARITHMETIC CODING IN H.264/AVC

IMPROVED CONTEXT-ADAPTIVE ARITHMETIC CODING IN H.264/AVC 17th European Signal Processing Conference (EUSIPCO 2009) Glasgow, Scotland, August 24-28, 2009 IMPROVED CONTEXT-ADAPTIVE ARITHMETIC CODING IN H.264/AVC Damian Karwowski, Marek Domański Poznań University

More information

Spline-Based Motion Vector Encoding Scheme

Spline-Based Motion Vector Encoding Scheme Spline-Based Motion Vector Encoding Scheme by Parnia Farokhian A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Master of

More information

Video Quality Analysis for H.264 Based on Human Visual System

Video Quality Analysis for H.264 Based on Human Visual System IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021 ISSN (p): 2278-8719 Vol. 04 Issue 08 (August. 2014) V4 PP 01-07 www.iosrjen.org Subrahmanyam.Ch 1 Dr.D.Venkata Rao 2 Dr.N.Usha Rani 3 1 (Research

More information

ERROR-ROBUST INTER/INTRA MACROBLOCK MODE SELECTION USING ISOLATED REGIONS

ERROR-ROBUST INTER/INTRA MACROBLOCK MODE SELECTION USING ISOLATED REGIONS ERROR-ROBUST INTER/INTRA MACROBLOCK MODE SELECTION USING ISOLATED REGIONS Ye-Kui Wang 1, Miska M. Hannuksela 2 and Moncef Gabbouj 3 1 Tampere International Center for Signal Processing (TICSP), Tampere,

More information

Depth Estimation for View Synthesis in Multiview Video Coding

Depth Estimation for View Synthesis in Multiview Video Coding MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Depth Estimation for View Synthesis in Multiview Video Coding Serdar Ince, Emin Martinian, Sehoon Yea, Anthony Vetro TR2007-025 June 2007 Abstract

More information

Implementation and analysis of Directional DCT in H.264

Implementation and analysis of Directional DCT in H.264 Implementation and analysis of Directional DCT in H.264 EE 5359 Multimedia Processing Guidance: Dr K R Rao Priyadarshini Anjanappa UTA ID: 1000730236 priyadarshini.anjanappa@mavs.uta.edu Introduction A

More information

Module 7 VIDEO CODING AND MOTION ESTIMATION

Module 7 VIDEO CODING AND MOTION ESTIMATION Module 7 VIDEO CODING AND MOTION ESTIMATION Lesson 20 Basic Building Blocks & Temporal Redundancy Instructional Objectives At the end of this lesson, the students should be able to: 1. Name at least five

More information

ESTIMATION OF THE UTILITIES OF THE NAL UNITS IN H.264/AVC SCALABLE VIDEO BITSTREAMS. Bin Zhang, Mathias Wien and Jens-Rainer Ohm

ESTIMATION OF THE UTILITIES OF THE NAL UNITS IN H.264/AVC SCALABLE VIDEO BITSTREAMS. Bin Zhang, Mathias Wien and Jens-Rainer Ohm 19th European Signal Processing Conference (EUSIPCO 2011) Barcelona, Spain, August 29 - September 2, 2011 ESTIMATION OF THE UTILITIES OF THE NAL UNITS IN H.264/AVC SCALABLE VIDEO BITSTREAMS Bin Zhang,

More information

ECE 417 Guest Lecture Video Compression in MPEG-1/2/4. Min-Hsuan Tsai Apr 02, 2013

ECE 417 Guest Lecture Video Compression in MPEG-1/2/4. Min-Hsuan Tsai Apr 02, 2013 ECE 417 Guest Lecture Video Compression in MPEG-1/2/4 Min-Hsuan Tsai Apr 2, 213 What is MPEG and its standards MPEG stands for Moving Picture Expert Group Develop standards for video/audio compression

More information

Digital video coding systems MPEG-1/2 Video

Digital video coding systems MPEG-1/2 Video Digital video coding systems MPEG-1/2 Video Introduction What is MPEG? Moving Picture Experts Group Standard body for delivery of video and audio. Part of ISO/IEC/JTC1/SC29/WG11 150 companies & research

More information

Improved Context-Based Adaptive Binary Arithmetic Coding in MPEG-4 AVC/H.264 Video Codec

Improved Context-Based Adaptive Binary Arithmetic Coding in MPEG-4 AVC/H.264 Video Codec Improved Context-Based Adaptive Binary Arithmetic Coding in MPEG-4 AVC/H.264 Video Codec Abstract. An improved Context-based Adaptive Binary Arithmetic Coding (CABAC) is presented for application in compression

More information

SINGLE PASS DEPENDENT BIT ALLOCATION FOR SPATIAL SCALABILITY CODING OF H.264/SVC

SINGLE PASS DEPENDENT BIT ALLOCATION FOR SPATIAL SCALABILITY CODING OF H.264/SVC SINGLE PASS DEPENDENT BIT ALLOCATION FOR SPATIAL SCALABILITY CODING OF H.264/SVC Randa Atta, Rehab F. Abdel-Kader, and Amera Abd-AlRahem Electrical Engineering Department, Faculty of Engineering, Port

More information

Smoooth Streaming over wireless Networks Sreya Chakraborty Final Report EE-5359 under the guidance of Dr. K.R.Rao

Smoooth Streaming over wireless Networks Sreya Chakraborty Final Report EE-5359 under the guidance of Dr. K.R.Rao Smoooth Streaming over wireless Networks Sreya Chakraborty Final Report EE-5359 under the guidance of Dr. K.R.Rao 28th April 2011 LIST OF ACRONYMS AND ABBREVIATIONS AVC: Advanced Video Coding DVD: Digital

More information

BLOCK MATCHING-BASED MOTION COMPENSATION WITH ARBITRARY ACCURACY USING ADAPTIVE INTERPOLATION FILTERS

BLOCK MATCHING-BASED MOTION COMPENSATION WITH ARBITRARY ACCURACY USING ADAPTIVE INTERPOLATION FILTERS 4th European Signal Processing Conference (EUSIPCO ), Florence, Italy, September 4-8,, copyright by EURASIP BLOCK MATCHING-BASED MOTION COMPENSATION WITH ARBITRARY ACCURACY USING ADAPTIVE INTERPOLATION

More information

OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD

OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD Siwei Ma, Shiqi Wang, Wen Gao {swma,sqwang, wgao}@pku.edu.cn Institute of Digital Media, Peking University ABSTRACT IEEE 1857 is a multi-part standard for multimedia

More information

Multimedia Decoder Using the Nios II Processor

Multimedia Decoder Using the Nios II Processor Multimedia Decoder Using the Nios II Processor Third Prize Multimedia Decoder Using the Nios II Processor Institution: Participants: Instructor: Indian Institute of Science Mythri Alle, Naresh K. V., Svatantra

More information

Introduction of Video Codec

Introduction of Video Codec Introduction of Video Codec Min-Chun Hu anita_hu@mail.ncku.edu.tw MISLab, R65601, CSIE New Building 3D Augmented Reality and Interactive Sensor Technology, 2015 Fall The Need for Video Compression High-Definition

More information

High Efficiency Video Coding. Li Li 2016/10/18

High Efficiency Video Coding. Li Li 2016/10/18 High Efficiency Video Coding Li Li 2016/10/18 Email: lili90th@gmail.com Outline Video coding basics High Efficiency Video Coding Conclusion Digital Video A video is nothing but a number of frames Attributes

More information

Scalable Bit Allocation between Texture and Depth Views for 3D Video Streaming over Heterogeneous Networks

Scalable Bit Allocation between Texture and Depth Views for 3D Video Streaming over Heterogeneous Networks Scalable Bit Allocation between Texture and Depth Views for 3D Video Streaming over Heterogeneous Networks Jimin XIAO, Miska M. HANNUKSELA, Member, IEEE, Tammam TILLO, Senior Member, IEEE, Moncef GABBOUJ,

More information

Intra-Mode Indexed Nonuniform Quantization Parameter Matrices in AVC/H.264

Intra-Mode Indexed Nonuniform Quantization Parameter Matrices in AVC/H.264 Intra-Mode Indexed Nonuniform Quantization Parameter Matrices in AVC/H.264 Jing Hu and Jerry D. Gibson Department of Electrical and Computer Engineering University of California, Santa Barbara, California

More information

JUNSHENG FU A Real-time Rate-distortion Oriented Joint Video Denoising and Compression Algorithm

JUNSHENG FU A Real-time Rate-distortion Oriented Joint Video Denoising and Compression Algorithm JUNSHENG FU A Real-time Rate-distortion Oriented Joint Video Denoising and Compression Algorithm Master of Science Thesis Subject approved in the Department Council meeting on the 23rd of August 2011 Examiners:

More information

Compressed-Domain Video Processing and Transcoding

Compressed-Domain Video Processing and Transcoding Compressed-Domain Video Processing and Transcoding Susie Wee, John Apostolopoulos Mobile & Media Systems Lab HP Labs Stanford EE392J Lecture 2006 Hewlett-Packard Development Company, L.P. The information

More information

5LSH0 Advanced Topics Video & Analysis

5LSH0 Advanced Topics Video & Analysis 1 Multiview 3D video / Outline 2 Advanced Topics Multimedia Video (5LSH0), Module 02 3D Geometry, 3D Multiview Video Coding & Rendering Peter H.N. de With, Sveta Zinger & Y. Morvan ( p.h.n.de.with@tue.nl

More information

An Implementation of Multiple Region-Of-Interest Models in H.264/AVC

An Implementation of Multiple Region-Of-Interest Models in H.264/AVC An Implementation of Multiple Region-Of-Interest Models in H.264/AVC Sebastiaan Van Leuven 1, Kris Van Schevensteen 1, Tim Dams 1, and Peter Schelkens 2 1 University College of Antwerp Paardenmarkt 92,

More information

Fraunhofer Institute for Telecommunications - Heinrich Hertz Institute (HHI)

Fraunhofer Institute for Telecommunications - Heinrich Hertz Institute (HHI) Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG (ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6) 9 th Meeting: 2-5 September 2003, San Diego Document: JVT-I032d1 Filename: JVT-I032d5.doc Title: Status:

More information

THIS TUTORIAL on evaluating the performance of video

THIS TUTORIAL on evaluating the performance of video 1142 IEEE COMMUNICATIONS SURVEYS & TUTORIALS, VOL. 14, NO. 4, FOURTH QUARTER 2012 Video Transport Evaluation With H.264 Video Traces Patrick Seeling, Senior Member, IEEE, and Martin Reisslein, Senior Member,

More information

Mesh Based Interpolative Coding (MBIC)

Mesh Based Interpolative Coding (MBIC) Mesh Based Interpolative Coding (MBIC) Eckhart Baum, Joachim Speidel Institut für Nachrichtenübertragung, University of Stuttgart An alternative method to H.6 encoding of moving images at bit rates below

More information

View Synthesis for Multiview Video Compression

View Synthesis for Multiview Video Compression MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com View Synthesis for Multiview Video Compression Emin Martinian, Alexander Behrens, Jun Xin, and Anthony Vetro TR2006-035 April 2006 Abstract

More information
