QUALITY OF EXPERIENCE IN INTERNET TELEVISION

Mathias Gjerstad Lervold (1), Liyuan Xing (2), Andrew Perkis (2)
(1) Accenture
(2) Centre for Quantifiable Quality of Service in Communication Systems (Q2S) 1, Norwegian University of Science and Technology (NTNU), Trondheim, Norway

ABSTRACT

In this paper, we evaluate the Quality of Experience (QoE) of an Internet television service based on a subjective test using VG LIVE, a live streaming service over the Internet. VG LIVE is a service providing Norwegian football matches, both live on match day and as selected highlights from the weeks before. This paper also presents a model of how the various artifacts shaping Quality of Experience can be classified in relation to the service delivery. The subjective test was carefully designed based on this classification of artifacts. The results indicate that the user experience of Internet television is generally better than that of Internet video services like YouTube, but worse than that of traditional TV programs. We observed that both video quality and psychological factors, such as content and context, have an impact on the user's experience. The results make it possible to provide some guidelines for service providers to improve their quality.

Index Terms: Quality of Experience, Internet TV

1. INTRODUCTION

Quality of Experience (QoE) is a term describing the perceived quality from a user's point of view. Whereas Quality of Service (QoS) typically measures a certain set of network parameters, QoE aims at measuring the quality from the point where the content is captured at the head end to its presentation at the user end, as illustrated by Figure 1. Each stage in the service delivery adds complexity to the measurement of QoE and has different characteristics. Several television networks have taken their programs online, and quality is an important factor for revenue.
Studies by Move Networks have shown that a high level of QoE translates into more satisfied customers, who spend more time on the service, both per visit and in number of visits [2]. However, Internet television is delivered over the open Internet and thus has no QoS guarantee. In addition, with a variety of network connections, user terminals and screen sizes, the user end has so far been very difficult to manage. In our studies we have focused on how QoE can be measured and improved for an Internet television provider. Section 2 gives an overview of QoE in Internet television and of how we have classified the various QoE artifacts. Section 3 presents the subjective test we conducted in our research, and section 4 gives the results and analysis of that test. In section 5 we discuss the results and our research, and finally we give some general conclusions in section 6.

2. QOE IN INTERNET TELEVISION

Prior to designing our subjective test for video streaming over the Internet, we grouped the artifacts building the QoE into four categories [3]:

Audiovisual: the overall impression of the quality (or lack thereof), lip sync problems, and adaptation time in the case of adaptive streaming;

Figure 1: QoS vs. QoE for video streaming, based on figure 3 in [1]

1 Centre for Quantifiable Quality of Service in Communication Systems, Centre of Excellence appointed by the Research Council of Norway, funded by the Research Council, NTNU and UNINETT.
Figure 2: A model of how various QoE artifacts relate to the service delivery of an Internet television service

Video: temporal and spatial video artifacts, such as video jerkiness, frame skips, blockiness and blurring;

Audio: choppy audio, noise or silence;

Interactivity: responsiveness (or delay) of interactions with the media player.

Based on this categorization, along with the typical service delivery stages shown in Figure 1, we derived a model of how various QoE artifacts can be classified in relation to the service delivery of an Internet television service, as seen in Figure 2. The figure shows where in the service delivery a QoE artifact is most likely to occur. Here we have split the network stage into network throughput (bandwidth), transmission jitter and packet delay/loss, all of which are measurable QoS parameters. The head end includes the capturing of the content, the pre-/post-processing and the encoding of the video before transmission. At the user end the video is decoded by the client, and finally rendered and presented on a screen (user equipment). It is important to note that there are direct and indirect reasons for a QoE artifact. One example is that low bandwidth will result in low-quality video; however, it is the
compression/encoding that is the direct reason, while the low bandwidth is an indirect reason. Figure 2 focuses on the direct reasons for the various artifacts. Using the classifications and characterizations of the model as a tool, we found a way of monitoring the QoE and determining how it could be improved from (user) feedback.

3. SUBJECTIVE TEST

Based on this, a subjective test was conducted on a Norwegian Internet television service (VG LIVE), where users answered questions about their impression of the QoE after using the service.

3.1 Research method

The subjective test was run as a survey with questionnaires measuring the users' QoE. This works effectively for gathering and processing feedback from a large number of samples. In particular, a longitudinal survey was adopted in our study, whereby the respondents answer the survey several times over a long period, 6 weeks in our test. The survey was threefold. First, we collected demographic and psychographic data about the users as well as their user equipment, with questions such as "What is your attitude towards new technology?", "How often do you watch video on the Internet?" and "What type of screen will you be using?". Secondly, we asked the users to respond to a questionnaire after each viewing session (i.e. each football match or parts thereof) with questions such as "How good would you say the video quality of VG LIVE is compared to football on TV?", "How satisfied were you with the level of detail on the field/crowd etc.?" and "Did you notice any skipping or lack of smoothness in sound or video during the stream?". Finally, we asked the users to respond to a post-survey in order to clarify questions we might have had from the previous questionnaires and to summarize their impressions. Our subjective quality assessment method is most similar to Single Stimulus Continuous Quality Evaluation (SSCQE), with long sequences and no reference, but we use a retrospective rating technique instead of the continuous rating of SSCQE.
3.2 Method of analysis

The data collected are discrete interval data measured on 4-, 5- or 10-point rating scales. First, an analysis of the responses is conducted in order to sort out possible erroneous answers and non-serious respondents. Then we use a descriptive analysis of the survey results, including the mean, standard deviation and confidence interval. In this way, our analysis can be seen as a totality of the survey, and an overview of the perceived level of QoE amongst the respondents is obtained. Furthermore, all the data gathered during the 6-week subjective test have been analyzed and mapped against the classifications in Figure 2. This provides the basis for our analysis and conclusions regarding how a service provider can receive feedback in order to improve the QoE of the Internet television service.

3.3 Technical configuration

The user tests were run in-service on VG LIVE, a live streaming service of Norwegian football. We wanted to test the totality of the user experience as close to the real situation as possible; thus no restrictions were applied and no lab testing was conducted. The users could access the service through a web page (vglive.no) and choose which video stream to watch, when to watch it and for how long (usually the entire 90-minute match). VG LIVE uses Move Networks as the supplier of the video streaming solution. Move Networks uses adaptive streaming based on the On2 VP7 codec, with bit rates spanning from 32 kbps to 1360 kbps (SD quality). The advantages of adaptive streaming are the quick start and the robustness against network problems. The adaptive stream starts at a low bit rate in order to minimize buffering time, and increases the bit rate until it has either reached its maximum or matched the network connection bandwidth. If the network connection is suddenly restricted, the client will request a lower bit rate from the server in order to keep the video stream running smoothly.
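The bit-rate adaptation described above can be sketched as follows. This is a minimal illustration of the general idea, not Move Networks' actual algorithm; the intermediate rungs of the bit-rate ladder and the function name are our assumptions (only the 32 and 1360 kbps endpoints come from the text):

```python
# Sketch of adaptive bit-rate selection: start low to minimize buffering,
# ramp up towards the maximum (the "adaptation time"), and drop immediately
# when the measured network bandwidth falls.
BITRATES_KBPS = [32, 100, 300, 600, 1000, 1360]  # illustrative ladder

def next_bitrate(current_kbps: int, measured_bandwidth_kbps: float) -> int:
    """Pick the bit rate for the next streamlet given the measured bandwidth."""
    # Highest rung that fits within the measured bandwidth.
    affordable = [r for r in BITRATES_KBPS if r <= measured_bandwidth_kbps]
    if not affordable:
        return BITRATES_KBPS[0]  # keep playing at the lowest rate
    target = affordable[-1]
    if target > current_kbps:
        # Ramp up one step at a time instead of jumping straight to the top.
        idx = BITRATES_KBPS.index(current_kbps)
        return BITRATES_KBPS[min(idx + 1, len(BITRATES_KBPS) - 1)]
    return target  # drop straight down to keep playback smooth
```

The asymmetry (gradual increase, immediate decrease) is what produces both the robustness praised by the users and the adaptation-time complaint discussed in section 4.1.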
The total service delivery for VG LIVE can be seen in Figure 3. The video is captured at various stadiums in Norway, produced in a TV production truck outside, and transferred as a single video stream on dedicated fiber to a production/distribution centre. There it is encoded into 2-second streamlets at 8-10 bit rates (Move adaptive streaming), published onto origin servers in the USA and further cached on various servers in Europe. The users access the video stream over the open Internet and have no guarantee of QoS from the server. The pre-/post-production is similar to that of a regular TV production, and before the encoding there are no quality differences in the video stream.

4. RESULTS AND ANALYSIS

Throughout our test period of 6 weeks we received 62 responses to our quality analysis survey from 18 individual respondents. The survey sample ranged from 18 to 55 years of age, with a majority (2/3) between 18 and 30 years. The male/female ratio was about 80/20, with a 60/40 ratio of employees to students. The general interest in new technology in our sample was above average. These demographics fit well with the estimated customer base of VG LIVE. We also registered their user equipment in order to retrospectively see whether screen type/size/resolution, processing power or Internet connection would influence their level of QoE.
Figure 3: An overview of the VG LIVE service delivery, using streaming technology from Move Networks

4.1 Survey results

After each match we asked the users to answer a set of questions regarding their impression of the quality. The questions ranged from comparing the overall quality of the streaming service with other well-known services, to more detailed questions about startup delay, adaptation time, and video and audio artifacts. Figure 4 shows the overall quality rating of VG LIVE compared to regular TV broadcast quality and the general quality of other Internet video services, such as YouTube.

Figure 4: Comparison of the overall quality of VG LIVE to other services

The overall impression of the quality was that it was superior to other Internet video services, but inferior to broadcast TV. An interesting finding was a slight correlation of the quality rating with the outcome of the matches and with the user's experience with Internet video, as seen in Table 1. If their favorite team won the match, the average score was slightly higher than for a loss. Also, if the respondent watched Internet video more frequently, and thus was more accustomed to viewing video over the Internet, the average score was higher. Note that the scale ranges from -2 (VG LIVE much worse) to 2 (VG LIVE much better). The results indicate that content, as well as context, can influence the QoE.

Table 1: The quality rating based on the respondent's attitude towards the content and context
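The descriptive analysis from section 3.2 (mean, standard deviation and confidence interval) behind averages like those in Figure 4 and Table 1 can be sketched as follows; the example ratings are invented for illustration and are not the actual survey data:

```python
import math
import statistics

def describe(ratings: list[float], z: float = 1.96) -> dict:
    """Mean, sample standard deviation and approximate 95% confidence
    interval (normal approximation) for a set of discrete ratings."""
    n = len(ratings)
    mean = statistics.mean(ratings)
    stdev = statistics.stdev(ratings)  # sample standard deviation
    half_width = z * stdev / math.sqrt(n)
    return {
        "mean": mean,
        "stdev": stdev,
        "ci95": (mean - half_width, mean + half_width),
    }

# Made-up ratings on the -2..2 scale of Figure 4 / Table 1.
example = describe([1, 0, 1, 2, 1, -1, 0, 1, 1, 0])
```

With only 62 responses, reporting the confidence interval alongside the mean matters: overlapping intervals between groups (e.g. win vs. loss) signal that an apparent difference may not be significant.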
In order to rate the video encoding we asked the users about their satisfaction with the details and sharpness of the video when watching moving objects (ball/players/referee), stationary objects (on-screen text) and the background (field/crowd). On a scale from "very bad" at 1 to "very good" at 5 (with "OK" in the middle), the overall ratings were 3.5, 3.7 and 3.2 respectively. The encoding was in other words experienced as OK to good. The scores might indicate that there were issues with blurring (background), as well as some issues with motion blurring, edge ringing and color bleeding (foreground/moving objects). The users were in general happy with the response time (lack of buffering delay) and the smooth playback, thanks to adaptive streaming. However, the downside of adaptive streaming is the adaptation time, that is, the time from the start of playback at a low bit rate until the maximum bit rate is reached. This was especially dissatisfying for short replays of goals and other highlights, when the quality didn't reach its peak until the event of interest had passed. In summary, slightly over half of our respondents felt that Internet television in its present state is worth paying for; higher quality (HD) and the option to choose quality and price would increase the value of the service.

4.2 Improving the QoE

As can be seen from Figure 2, most of the spatial artifacts are introduced at the head end. The encoding stage is especially important, and harsh encoding results in blurry pictures, blockiness, color bleeding etc. Quality is a trade-off between bit rate and artifacts; thus higher quality will require higher bandwidth (network throughput). The head end is usually under the control of the service provider, and a full-reference quality assessment metric could be implemented to monitor the quality prior to transmission. Another important issue we found was the adaptation time.
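The full-reference monitoring at the head end mentioned above could, for instance, compute PSNR between the source and the encoded video, since both are available before transmission. The sketch below operates on flat lists of pixel values for brevity; a real monitor would work on whole decoded frames, and the paper does not prescribe a specific metric:

```python
import math

def psnr(reference: list[int], degraded: list[int], max_val: int = 255) -> float:
    """Full-reference PSNR (dB) between reference and encoded pixel values."""
    assert len(reference) == len(degraded)
    mse = sum((r - d) ** 2 for r, d in zip(reference, degraded)) / len(reference)
    if mse == 0:
        return float("inf")  # identical content, no encoding loss
    return 10 * math.log10(max_val ** 2 / mse)
```

A per-streamlet PSNR (or a perceptual metric) logged at each of the 8-10 encoding bit rates would let the provider detect harsh encoding before any viewer sees it.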
In Figure 5 we see how adaptation time is a trade-off against response time, and also how one can tweak the client in order to optimize the service. Some of our users complained about the long delay compared to other live services covering the matches (TV, radio, SMS updates, web updates). This is mostly due to the buffer at the client, which one can also tweak in order to satisfy the customers, however at the cost of lower robustness and reliability.

4.3 Client feedback

The video is transmitted over the open Internet and thus has no guarantee of QoS. Jitter or packet loss may result in temporal artifacts, and low network throughput can result in long delays due to buffering. These parameters can be monitored by the client and reported back to the server. Due to adaptive streaming, as described in section 3.3, the video stream is robust against network errors. The client will at all times be aware of the available bandwidth, and can report to the server about network constrictions and packet errors.

Figure 5: Diagram of how various factors of the QoE can be improved by tweaking the client

The user end is usually the most difficult stage to monitor, while at the same time having a big impact on the user's experience. High-quality user devices are very important in order to properly render and present the video. In our research we proposed a combination of an extended user profile, providing a baseline for the user equipment, and a QoE tool in the form of the menu seen in Figure 6. The menu can provide a guide (1) for the user to improve his QoE, as well as test bandwidth and processing power. It could also provide support through real-time feedback from the user on quality issues (2). Finally, we propose personalization of the service (3), where the user might choose quality according to price and also have the possibility to select among various options in order to maximize his total user experience.
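The client-side reporting suggested in section 4.3 could take a shape like the following. This is a hypothetical sketch: the field names, thresholds and the artifact mapping are our assumptions, not part of the VG LIVE/Move Networks implementation, but the monitored parameters (throughput, jitter, packet loss, buffering delay) and their mapping onto the network stage of Figure 2 come from the text:

```python
from dataclasses import asdict, dataclass

@dataclass
class ClientQoSReport:
    """QoS parameters a client could monitor and report back to the server."""
    session_id: str
    throughput_kbps: float
    jitter_ms: float
    packet_loss_pct: float
    buffer_delay_s: float

    def likely_artifacts(self) -> list[str]:
        """Rough mapping from measured QoS to likely QoE artifacts of
        Figure 2 (thresholds are illustrative only)."""
        artifacts = []
        if self.throughput_kbps < 300:
            artifacts.append("low video quality / long buffering")
        if self.jitter_ms > 100 or self.packet_loss_pct > 1.0:
            artifacts.append("temporal artifacts (jerkiness, choppy audio)")
        return artifacts

report = ClientQoSReport("user-42", throughput_kbps=250.0,
                         jitter_ms=20.0, packet_loss_pct=0.1,
                         buffer_delay_s=4.0)
payload = asdict(report)  # e.g. serialized and sent back to the server
```

Coupled with the user profile, such reports would let the server attribute a quality complaint to the correct stage of the delivery chain.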
Figure 6: A mock-up of the proposed QoE menu with 3 main features: QoE guide (1), support and user feedback (2), options and personalization (3)

The Achilles' heel of our proposed solution is the integration between the user profile, held by the service provider/content owner (VG LIVE), and the video client, provided by the video stream distributor (Move Networks). That is, in order to get the full picture of an individual user's level of QoE, one needs the technical data from the client coupled with the user information and real-time feedback.
5. DISCUSSIONS

Our research has been limited in time and resources, which might be reflected in our survey sample and the depth of our research. Our user selection was limited, and perhaps not entirely representative; thus the results should not be read as universal facts by their numbers. In general, the results show the tendencies within the group, and many of them confirmed our expectations prior to the tests. An important question that arose from our results is the importance of the content and of the context in which it is being watched. Content is likely more important for long sequences, as the user's focus and attention may drift over time. As we saw in Table 1, our results may have been influenced by wins or losses of favorite teams, and although we didn't see any clear correlation between screen sizes and average quality scores, other factors such as Internet video viewing habits suggest that context also plays an important part in the total QoE. Finally, we have shown how the QoE can be improved and how it can be monitored, but one important question we haven't asked is: Is it worth it? That is, is the improved quality and customer satisfaction worth the extra cost? This has been outside our scope, but it is without a doubt an important question and should be researched further.

6. CONCLUSIONS

In our research on Internet television we have found that although it provides satisfying quality compared to other Internet video services, there is a need for improving the QoE in order to gain a greater share of the customers from regular broadcast TV. The combination of monitoring the head end, the network and the user end can give the service provider a good measurement of the QoE for each individual user. QoE problems can be mapped and analyzed against Figure 2, and actions can be taken to fix the problems in the correct stage of the service delivery. In our research we also found that QoE should be extended to psychological factors such as content and context. The viewer's feelings towards the content being presented, and the context in which it is viewed, e.g. on a big screen in his living room or on a laptop at his office desk, will to some extent influence his rating of the quality. However, these factors are not as easily measured as video artifacts and resolution, and will require further research.

7. REFERENCES

[1] F. Boavida and E. Cerqueira, "Benchmarking the Quality of Experience of Video Streaming and Multimedia Search Services: the CONTENT Network of Excellence", Telecommunication Review and Telecommunication News Journal, Technical Journals and Books Publisher SIGMA NOT Ltd, Dec. 2008.

[2] Move Networks, "Move Internet television services: Move Networks provides a new standard for broadcasters", http://www.movenetworks.com, Sept. 2008.

[3] S. Winkler (Symmetricom), "Delivering quality of experience to IPTV and TVoIP customers", 2007.

[4] M. Lervold, "Measuring perceptual quality in Internet television", NTNU (Norwegian University of Science and Technology), Trondheim, June 2009.