Study on Real Time Video Transmission for Aid Remote Research of Bees

Silvio Miyadaira Amancio 1, Andre Riyuiti Hirakawa 1, Euripedes Lopes Junior 1, Sergio Dias Hilário 2, Astrid M. P. Kleinert 2, Vera L. Imperatriz Fonseca 2 and Tereza Cristina Giannini 2

1 Laboratory of Agriculture Automation, Electrical Engineering, University of Sao Paulo - Av. Prof. Luciano Gualberto, trav. 3, n. 158, sala C-56 - CEP 05508-900 - Cidade Universitária, S. Paulo - SP
2 Laboratory of Bees, Bioscience Institute, University of Sao Paulo

Abstract. Real-time video is being used to provide a web laboratory (web lab) for bee research as part of the VINCES (Virtual Network Center of Ecosystem Services) consortium, which focuses on research into ecosystem services such as pollination and photosynthesis. Currently, a mandassaia beehive (Melipona quadrifasciata anthidioides) is monitored through analysis of the sound emitted by the bees and through video recorded and transmitted in real time over the Internet. This kind of video application consumes a great deal of bandwidth, even on high-speed networks. This work presents a brief quality comparison of the most widely used video compression formats for real-time video transmission at low data rates, together with the proposed structure for transmitting High Definition videos over a high-speed Internet network. It is intended to provide useful information on the proposed web lab structure for video applications, in which the configuration of the video compression format and its quality play an important role.

Keywords. Web labs, High Definition Videos, High Speed Networks, Real Time Video.

Introduction
Web labs are important components of the VINCES (Virtual Network Center of Ecosystem Services) project, a research consortium focusing on ecosystem services such as pollination and photosynthesis.
The partners of this project include the Agricultural Automation Laboratory of Escola Politecnica da Universidade de Sao Paulo (EPUSP), the Bees Laboratory of the USP Biosciences Institute (IBUSP) and the University of California, San Diego, among others. The web labs are intended to support research through the use of the Kyatera Advanced Internet, a high-speed Internet network. Web lab development allows information sharing, better use of laboratory resources and international collaboration, among other benefits. Several applications, ranging from Robotics to Medicine, are planned to be part of the VINCES project.

The main purpose of the bees web lab is to show the behavior of bees inside their hives at the USP Biosciences Institute. Currently, a real-time video of the entrance of one of those beehives is available on the Internet. The video structure is composed of a standard definition (352x240 resolution) analogue Sony CCD-PC1 camera and a Pinnacle video capture board installed on a Pentium IV 3.0 GHz computer with 1 GB of RAM, running Windows Media Services and Windows Media Encoder on Windows XP. The beehive is illuminated with a red lamp because bees are sensitive to certain parts of the light spectrum. As a result, the video appears in shades of red.

Given the current network restrictions, the bit rates (amount of data transmitted per unit of time) were limited to 384 kbps. The quality of the video obtained at those rates was evaluated using several codecs and different metrics, without taking network conditions into account. Optical fibers are planned to form a stable 1 Gbps network connecting the main research institutes of São Paulo State; in addition, up to 4 pairs of optical fibers will compose experimental networks for research and development of several applications. In video transmission of beehives, it is desirable to generate images that allow researchers to see the insects in detail.
Because their behavior is of interest to many scientific areas (e.g. the bee dance described by Karl von Frisch), it is important to allow detailed observation of bees inside their nest. For this reason, future tests with High Definition (HD) videos, compressed or not, are to be done on this structure.

Video and Codecs
A video consists of a sequence of images. Each image is composed of a matrix of points, called pixels, which are usually represented by three color components (1 byte each in a 24-bit color image, for example), or more generally by one luminance and two chrominance components. The second form of representing a pixel's color is preferred, as it allows luminance, which is visually more important to the human eye than color, to be coded with more bits than the chrominance components.

Digital videos may be compressed or uncompressed. Uncompressed videos are problematic because of the large amount of data involved and the difficulty of transmitting it (for instance, a standard CCIR 601 - Consultative Committee for International Radio - video at 720x576 resolution and 25 frames per second, or at 720x480 and 30 frames per second, generates about 165 Mbps of data). To reduce their size, videos are usually compressed with lossy methods, which may slightly reduce perceptual quality but reduce size significantly: for instance, the bit rate of a typical 640x480 MPEG-2 encoded video may vary between 4 and 9 Mbps. In Digital Television, HD videos are usually taken to be those with resolutions ranging from at least 1280x720 through 1920x1080. New studies reach resolutions up to 3840x2160 at 50 frames per second (SVERIGES TELEVISION, 2006).

To generate a video flow (i.e. continuous transmission of video) on the Internet, the video is usually compressed with one of the existing lossy standards. Among the most widely used codecs (compressor-decompressor), the MPEG (Moving Picture Experts Group) standards are well known and have been improved over several versions. The MPEG-2 standard (ISO/IEC 13818) is an evolution of the MPEG-1 standard (used mostly in Video Compact Disc devices). MPEG-2 is applicable to the compression of higher video resolutions, such as HD content. Compressed HD videos require about 30 Mbps of network bandwidth. Open source software such as Videolan is able to transmit and receive HD MPEG-2. Codecs such as Windows Media Video (WMV), a standard developed by Microsoft, are also widely used in several applications for computers and other equipment.
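The figure of about 165 Mbps quoted above for uncompressed CCIR 601 video can be checked with simple arithmetic. The sketch below is only an illustration: it assumes 4:2:2 chroma sampling at 8 bits per sample (16 bits per pixel on average) and the two common active frame formats, 720x576 at 25 frames per second and 720x480 at 30 frames per second. These parameters, and the function name `raw_bitrate_bps`, are assumptions for the example and are not stated in the text.

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel):
    """Bit rate of uncompressed video, in bits per second."""
    return width * height * fps * bits_per_pixel


# Assumed CCIR 601 parameters: 4:2:2 sampling, 8 bits per sample,
# i.e. 16 bits per pixel on average.
BITS_PER_PIXEL_422 = 16

rate_576 = raw_bitrate_bps(720, 576, 25, BITS_PER_PIXEL_422)
rate_480 = raw_bitrate_bps(720, 480, 30, BITS_PER_PIXEL_422)

print(f"720x576 @ 25 fps: {rate_576 / 1e6:.1f} Mbps")  # 165.9 Mbps
print(f"720x480 @ 30 fps: {rate_480 / 1e6:.1f} Mbps")  # 165.9 Mbps
```

Both formats carry the same number of pixels per second, which is why they give the same raw bit rate.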
The WMV standard is also applicable to High Definition content.

Starting in 1999 with Researchchannel, several experiments have already been conducted in the HD field, mainly for Digital Television, scientific applications and digital cinema. Experiments involving uncompressed HD videos were also conducted by the i2CAT foundation on an open HD platform called HDCAT. This platform uses open source software called Ultragrid and a hardware scheme similar to the one to be implemented in the TIDIA project to support the web labs at VINCES.

Objective Metrics
Assessment of image quality is not an easy task. Several metrics are known for evaluating the quality of a compressed video against its original source, and they are the object of discussion because of their differences in complexity, meaning and impact on the Human Visual System. For human beings, subjective evaluation defines the quality of an image or video well. However, parameters are needed with which the quality of a video can be controlled, benchmarked or optimized automatically, hence the need for an objective metric.

Among the objective metrics widely used today, Peak Signal to Noise Ratio (PSNR) and Structural Similarity (SSIM) were chosen to evaluate the resulting quality of the sampled video: PSNR is widely used for its simplicity, and SSIM is expected to improve on traditional metrics. Image quality has already been assessed for several types of distortion (Wang et al., 2004), showing the advantages of using SSIM instead of more traditional metrics such as PSNR and Mean Squared Error (MSE). MSE and PSNR are given by the following equations:

\[
\mathrm{MSE} = \frac{1}{m\,n} \sum_{i=1}^{m} \sum_{j=1}^{n} \left( x_{ij} - y_{ij} \right)^{2} \tag{1}
\]

\[
\mathrm{PSNR} = 10 \log_{10}\!\left( \frac{255^{2}}{\mathrm{MSE}} \right) \tag{2}
\]

Where: x_{ij} and y_{ij} are the pixels in the i-th row and j-th column of the compressed and the original images, respectively, and m and n are the numbers of rows and columns.

SSIM is based on measurements of three components (luminance, contrast and structural similarities) and it is given by the following equation:

\[
\mathrm{SSIM}(x, y) = \frac{\left( 2 \mu_x \mu_y + C_1 \right)\left( 2 \sigma_{xy} + C_2 \right)}{\left( \mu_x^{2} + \mu_y^{2} + C_1 \right)\left( \sigma_x^{2} + \sigma_y^{2} + C_2 \right)} \tag{3}
\]

Where: C_1 = (K_1 L)^2, C_2 = (K_2 L)^2 and C_3 = C_2 / 2 (the choice of C_3 that reduces the three-component definition to the form above); \mu_x is the average of x, \sigma_x^2 is the variance of x and \sigma_{xy} is the covariance of x and y; L is the range of pixel values (L = 255 for 8 bits/pixel gray scale images); and K_1 and K_2 are scalar constants.

As the standard 10/100 Mbps local network currently limits the web lab performance, video streaming is being done at lower bit rates. The main problem in this case is the typical compression noise that results from the elimination of higher-frequency components in the Discrete Cosine Transform (DCT) of the codecs, often called mosquito noise, together with blurring (Wang et al., 2004). In scenes with higher motion, mosquito noise appears as impairments, or artifacts, around the edges of moving objects (fig. 1). As shown in fig. 2, at the beehive web lab the details of the bees are distorted or lost, causing visually perceptible distortions, mainly because of the bit rate restrictions.

Figure 1. Mosquito Noise over a compressed image of text

To enhance the quality of the resulting video, some of the most widely used codecs were compared at low bit rates (64 to 384 kbps). The comparison involved the codecs MPEG-1, MPEG-2, MPEG-4 based (Xvid), Windows Media Video 9 (WMV) and Theora. For this study, a short video of the bees' nest was compressed. The original video is composed of 2,490 frames (1 minute and 23 seconds at 30 frames per second) and was recorded in an uncompressed AVI format at 352x240 pixels. It shows the behavior of the mandassaia bees on an autumn afternoon.

Figure 2. Original Uncompressed and Compressed Bee Video

Results
Fig. 4 shows the average PSNR values measured for the 2,490-frame movie of the bees web lab. The original video was coded offline, in a single pass at constant bit rate (CBR). Fig. 5 shows the average values for the SSIM comparison made at low bit rates. As demonstrated in fig. 1, at lower bit rates the impairments on the video are severe, even with modern MPEG-4 based codecs. The preliminary results show a slight advantage for the Windows Media 9 codec, which is currently being used to provide the real-time videos of the beehive.

Figure 3. Frame to frame PSNR values for a 192 kbps bees video

Figure 4. Average PSNR Values for low bit rates of a 2,490-frame video
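The average PSNR and SSIM values plotted in Figures 4 and 5 can be obtained by computing each metric frame by frame and averaging over the clip. The following is a minimal sketch of that procedure, not the tool actually used in this study. It assumes 8-bit grayscale frames stored as nested lists, and it computes SSIM over the whole frame as a single window, which is a simplification of the locally windowed SSIM of Wang et al. The constants K1 = 0.01 and K2 = 0.03 are the values commonly used with SSIM and are an assumption here, as are all function names.

```python
import math


def _flatten(frame):
    """Turn a frame (list of rows) into a flat list of floats."""
    return [float(p) for row in frame for p in row]


def mse(x, y):
    """Mean squared error between two frames of equal size."""
    xs, ys = _flatten(x), _flatten(y)
    return sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs)


def psnr(x, y, peak=255.0):
    """PSNR in dB; infinite when the frames are identical."""
    err = mse(x, y)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


def global_ssim(x, y, L=255.0, k1=0.01, k2=0.03):
    """SSIM computed over the whole frame as one window (simplified)."""
    xs, ys = _flatten(x), _flatten(y)
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((a - mu_x) ** 2 for a in xs) / n
    var_y = sum((b - mu_y) ** 2 for b in ys) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n
    c1 = (k1 * L) ** 2
    c2 = (k2 * L) ** 2
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den


def average_metric(metric, compressed_frames, original_frames):
    """Average a per-frame metric over two equally long frame sequences."""
    values = [metric(c, o) for c, o in zip(compressed_frames, original_frames)]
    return sum(values) / len(values)


# Tiny hypothetical 2x2 frames, only to show how the functions are called.
original = [[[10, 20], [30, 40]], [[50, 60], [70, 80]]]
decoded = [[[12, 16], [30, 40]], [[50, 61], [69, 80]]]

print("average PSNR (dB):", average_metric(psnr, decoded, original))
print("average SSIM:", average_metric(global_ssim, decoded, original))
```

A single-window SSIM is easier to follow but less sensitive to localized artifacts such as mosquito noise; the windowed version computes the same expression over small neighborhoods and averages the results.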
Figure 5. Average SSIM Values for low bit rates of a 2,490-frame video

Conclusions
The difference in performance among several codecs was evaluated at lower bit rates. This is one of the steps in developing the VINCES infrastructure to enable HD video transmission. The results presented here will serve as a basis of comparison for higher bit rates, in order to provide better images and reduced impairments in the web labs to be implemented. Despite all the discussion about better metrics for evaluating video quality, the test showed similar results for both metrics used.

References
Fundació i2CAT. http://www.icat.net/icat/servlet/icat.mainservlet?seccio=1_1 Accessed 07 June 2006.
VINCES. http://www.ib.usp.br/vinces/ Accessed 30 May 2006.
Jérôme Buzzi and Frédéric Guichard. Uniqueness of Blur Measure. DOLabs.
mms://143.107.46.10:1 Accessed 07 June 2006.
Researchchannel. http://www.researchchannel.org Accessed 07 June 2006.
SVERIGES TELEVISION. The SVT High Definition Multi Format Test Set. ftp://vqeg.its.bldrdoc.gov/hdtv/svt_multiformat/svt_multiformat_v10.pdf Accessed 1 June 2006.
Z. Wang and A. C. Bovik. Why is image quality assessment so difficult? In Proc. IEEE Int. Conf.
Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004.
ITU-T Recommendation P.930. Principles of a reference impairment system for video. 8/96.
XVID. http://www.xvid.org/ Accessed 07 June 2006.