Optimized visualization on portable autostereoscopic displays. Atanas Boev Jarkko Pekkarinen Atanas Gotchev


Abstract: Various optical parameters influence the quality of a 3D display, for example resolution, number of views, angular visibility of each view, and crosstalk. However, not all optical parameters have a direct influence on the perceived quality. In this report, we develop a group of algorithms for optimized visualization on 3D displays. We discuss the operation principles of mobile 3D displays and identify the optical parameters which have the biggest influence on the visual quality degradation. We develop a measurement methodology which allows the important optical parameters of a 3D display to be derived. Additionally, we propose a method for estimating the subjective perceptibility of crosstalk in a typical observation condition. We give measurement results and perform a comparative analysis of 10 different 3D displays. We discuss a set of artefact mitigation algorithms: two algorithms for point-of-view optimization, and two for visually optimal antialiasing.

Keywords: 3D displays, autostereoscopic displays, visual quality, 3D artefacts, quality estimation, artefact mitigation

Executive Summary

In this report, we develop a group of algorithms for optimized visualization on 3D displays. First, we discuss the operation principles of mobile 3D displays and the reasons why artefacts appear. We identify the optical parameters which have the biggest influence on the visual quality degradation. There are three major types of artefacts. The first type is viewpoint related artefacts, such as ghosting and pseudoscopy, which appear when the user is not at the correct observation position. The second type is optical mask artefacts, such as aliasing, colour bleeding and minimal crosstalk, which are seen even from the visual sweet-spots of the display. They depend on the quality of the optical mask, its ability to redirect the light, and the interleaving topology chosen by the display manufacturer. The third type is content-related artefacts, such as hyperstereo and accommodation induced diplopia, which are influenced by the disparity range present in the 3D content.

We develop a measurement methodology which allows the important optical parameters of a 3D display to be derived, such as angular brightness, size and position of the sweet-spots, and minimal crosstalk. Additionally, we propose a method for estimating the subjective perceptibility of crosstalk in a typical observation condition, as opposed to objective measurements in a dark room. We give measurement results and perform a comparative analysis of 10 different 3D displays.

Finally, we discuss a set of artefact mitigation algorithms. The artefacts that most influence the quality of mobile 3D displays appear to be the viewpoint-related ones. We propose two algorithms for point-of-view optimization, which utilize a combination of face and eye tracking. By using a single eye-tracking camera, we can eliminate pseudoscopy and mitigate angular dependent crosstalk artefacts. By using stereoscopic eye tracking with two cameras, we can further optimize the image on a 3D display for various observation distances.

The second most important artefacts are the optical mask related ones. We discuss two algorithms for visually optimal antialiasing: one uses subjective measurements and can be used for stereoscopic displays, and the other uses objective measurements and can be applied to multiview displays. The second algorithm allows the user to select a desired level of 3D sharpness, and the proper antialiasing filter is selected automatically in order to keep that level regardless of the frequency content and disparity of the signal.

Based on our measurements, the third type of artefacts, the content related ones, does not influence mobile 3D displays much. Most mobile 3D displays appear to accommodate downscaled HD content with sufficient quality. Some post-processing for content repurposing can still be done, but the increase in quality would not be visible enough to justify the CPU-intensive post-processing needed for on-device repurposing.

Table of contents

Executive Summary
1 Introduction
2 Quality evaluation of 3D displays
  2.1 Visual quality of 3D displays
    Operation principles
    Optical parameters
    3D display as imaging channel
    Visual artefacts on 3D displays
  2.2 Viewpoint related 3D artefacts
    Pseudoscopy
    Diplopia (angular dependent crosstalk)
    Limited head parallax
  2.3 Optical mask related 3D artefacts
    Ghosting (minimal crosstalk)
    Colour bleeding
    Moiré
    Masking
  2.4 Content related 3D artefacts
    Accommodation/convergence rivalry
    Divergent parallax
3 Measurement and comparative analysis of 3D displays
  Displays under test
  Angular visibility function
    Measurement procedure
    Preparation of test images
    Results
  Optical mask related quality parameters
    Objective crosstalk
    Perceived crosstalk
    Perceptual resolution
  Viewpoint related quality parameters
    Position and size of viewing sweet-spots
    Apparent size and resolution

  3.5 Content related quality parameters
    Objective comfort disparity range
    Subjective comfort disparity range
Visual optimization
  Viewpoint optimization
  Pseudoscopy correction
  Point-of-view optimization
  Antialiasing for stereoscopic displays
  Antialiasing for multiview displays

1 Introduction

Various optical parameters influence the quality of a 3D display, for example resolution, number of views, angular visibility of each view, and crosstalk. However, not all optical parameters have a direct influence on the perceived quality. For example, the brightness of the display is perceived differently depending on the observation conditions. Additionally, the quality of the perceived image depends on some parameters of the content, as well as on the position of the observer. Even if all parameters are measured, it is hard for the average consumer or content producer to compare the visual quality of two displays, or to judge whether a given 3D content is suitable for a certain display.

We start by finding the parameters that are important for the optical quality of the display. In chapter 2 we identify three major types of artefacts: viewpoint related, optical mask related and content related. We discuss the reasons for each type of artefact to appear and its perceptual impact on the human visual system (HVS). In chapter 3 we develop a display measurement methodology, which allows us to assess the optical quality of a 3D display and to judge to what extent a given display would be affected by 3D artefacts. Along with the measurement methodology, we present measurement results for a number of portable 3D displays, and a comparative analysis which aims to find whether there are typical values for parameters like crosstalk and position of sweet-spots. In chapter 4 we describe a family of algorithms for optimized visualization on a 3D display. The algorithms utilize the knowledge of optical parameters, and can be used for extending the viewing zone and for mitigating the most common artefacts, such as moiré and ghosting.

2 Quality evaluation of 3D displays

In this chapter, we aim at finding the minimal set of most important parameters which can be used for quality characterization of a 3D display. For each parameter we judge how it is perceived by the HVS in natural observation conditions. We develop a set of measurements that assess the perceptual influence of each parameter. We model the display as an image processing channel and identify the model parameters which can later be used to design image processing algorithms for artefact mitigation. Along with presenting the measurement methodology, we give results of the measurements for the 3D displays available in our laboratory.

2.1 Visual quality of 3D displays

All autostereoscopic displays currently available on the market use an optical mask to separate the light of an underlying TFT-LCD panel towards different directions. Certain properties of this layer create specific artefacts, such as ghost images, moiré patterns and masking. Some portable devices use polarized glasses to achieve binocular separation (for example LG Optimus Pad [1]). Such displays use polarization filters to separate the light, but since the image is spatially interleaved (different image components are seen by each eye), the effect of the filters is quite similar to that of the mask.

A major problem in the deployment of 3D-enabled mobile devices is whether the available 3D content will be suitable for the various mobile 3D displays, and to what extent some post-processing of the content will be needed. In order to select a 3D display module, the vendors of mobile devices need to know how to compare the visual quality of such displays. In order to produce optimized 3D scenes, the content producers need to know what disparity range is suitable for a given display. There are several studies on estimating the visual quality of 3D displays. Some studies propose analytical derivations based on knowledge of display properties [2][3][4], other studies measure the optical parameters of the displays [5][6], or perform subjective tests [7][8][9].
None of these approaches is universally applicable, as display properties might not be known to the user, optical parameters might not be directly related to the perceived quality, and subjective tests are time consuming and expensive.

2.1.1 Operation principles

Autostereoscopic displays can create a binocular illusion of depth without requiring the observer to wear special glasses. They work by beaming a different image towards each eye. Most autostereoscopic displays use a TFT-LCD matrix for image formation [10][11]. An additional optical filter mounted on top of the screen makes the visibility of each TFT element a function of the observation angle. There are three major types of optical filters.

Displays with a lenticular sheet have an array of microlenses which redirects the light, as shown in Figure 1a. This type of optical layer allows high brightness of the displayed image, but cannot be disabled electronically, so these displays operate in 3D mode only.

A parallax barrier works by partially blocking the light travelling in certain directions, as depicted in Figure 1b. This type of optical barrier lets less light through, which results in a darker image, but allows the display to switch between 2D and 3D mode by turning the barrier on and off. The parallax barrier is the most commonly used optical filter in mobile 3D displays.

In the third approach, a polarization-based film, the light from each TFT element is polarized with a different orientation, as shown in Figure 1c. Since the human eye is not sensitive to the polarization of light, an observer without glasses sees all sub-pixels of the image and perceives a normal 2D image. Through the glasses, only half of the sub-pixels are seen by each eye. The image on a display with a polarization filter is less affected by the horizontal observation angle, but degrades quickly as the vertical angle of observation changes.

Figure 1. Different techniques for view separation in mobile 3D displays: a) lenticular sheet, b) parallax barrier, c) polarization layer

2.1.2 Optical parameters

The design of a multiview display is a trade-off between observation convenience and visual quality. The added convenience comes at the expense of limited brightness, contrast and resolution [3], [5], [10]. There are various methods for assessing the optical quality of a display, for example using directional scanning for 2D [12] and 3D [13] displays. In [5] the authors propose an extensive list of optical parameters which can be measured for the characterization of autostereoscopic 3D displays. In this work, we want to identify and measure the parameters that can be used for visual optimization.

Figure 2. Observation zones of one view

The image intended to be seen by one eye of the observer is called a view [10][11]. Mobile autostereoscopic displays usually have two views, as they are meant for a single observer. These views contain the images intended for the left and right eye of the observer, and are therefore referred to as the left view and the right view. The difference in the horizontal position of an object in each view is called disparity, and it is responsible for the binocular illusion of depth. The illusory distance between the object and the display is referred to as apparent depth. The horizontal position of an object inside a view is measured as the distance between the object and the left edge of that view. Disparity is measured as the difference between the position of the object in the left and in the right view. Positive disparity creates the illusion of the object being behind the display, and negative disparity places the object in front of the display.

The range of positions from which a view is visible is called the visibility zone of that view. A photo of the visibility zones of one view of a mobile 3D display is shown in Figure 2. If a display has two views, their visibility zones alternate in the horizontal direction. There are multiple positions (called sweet spots) from which an observer can perceive a proper stereoscopic image. The procedure of mixing and mapping the images of both views to the TFT elements of the display is called interleaving, and the map which determines whether a TFT element belongs to the left or right view is known as the interleaving map. We refer to interleaving maps where all TFT elements on a row belong to the same view as row-interleaved, and to the ones where columns of TFT elements belong to the same view as column-interleaved.

2.1.3 3D display as imaging channel

In order to understand how various artefacts occur, and to assess perceptual differences between the intended and the visualized signal, we consider the 3D display as an image processing channel. Our model follows the stages of content preparation and visualization, as shown in Figure 3. The examples in the figure are given for a single row of a dual-view display; however, the general signal transformations also hold true for a multiview display.

The channel input can be regarded as a continuous signal. Each channel is sampled with a sampling step corresponding to the resolution of the display. If the input is a 3D object at some depth, there would be an offset between the left and right signal, and there would be disparity between the sampled representations of both channels. If the input signal represents a 2D object, but the intention is for this object to appear at a certain depth, disparity can be introduced along with the sampling process.

The next two stages model the interleaving process. Only some sub-pixels of the display belong to one view, and the first step is to decimate the sampled signal in such a way that only the relevant sub-pixels remain. From the interleaving map, one can derive a binary mask that represents the sub-pixels which belong to one view. The left input signal is decimated (sub-sampled) using the binary mask of the left channel, and the right input signal using the mask of the right channel. Note that such decimation requires a suitable pre-filter, otherwise aliasing can occur. Then, the sub-sampled versions of each signal are interleaved in the way prescribed by the interleaving map. Finally, the optical mask (or passive glasses) ensures that the samples intended for each eye are predominantly visible, while the others are suppressed. In the ideal case the separation would be perfect, though in reality some residual of the signal intended for one eye arrives at the other. This residual signal is often modelled as crosstalk.

Figure 3. Generalized model of a stereoscopic display as an image processing channel (stages shown for the left and right channel: signal, sampling, disparity, decimation, interleaving, interleaved signal, seen by left eye, seen by right eye)

2.1.4 Visual artefacts on 3D displays

During its evolution, the HVS was optimized for extracting the structure from an image. As a consequence, it became largely insensitive to global contrast or brightness variance [14][15]. Vision decomposes the scene into patterns of various spatial frequencies and orientations [14]. A number of visual quality metrics attempt to assess the perceptual difference between two images [16][17]. Predicting the visibility of an image detail is a complex task, as it is influenced both by HVS parameters (contrast sensitivity function, pattern masking) and by observation conditions (distance to display, ambient light) [15].

We would like to use the concept of structural distortion for measuring the visibility of all stereoscopic artefacts. High-level visual processing might recognize the distortion as belonging to a certain group of artefacts, but it is early vision that detects it. We think the visibility of any 3D artefact can be expressed as the amount of structural change caused by it. There are works on the general visibility of stereoscopic crosstalk in typical observation conditions [8][9]. According to the Weber-Fechner law, the smallest perceptible change in a stimulus is proportional to the amplitude of the stimulus. This also holds true for the perception of brightness [14]. Following the Weber-Fechner law, in experiments on the perceptibility of crosstalk, the crosstalk is measured as a percentage of the input signal. Crosstalk of less than 5% is considered to be under the visibility threshold, and crosstalk of more than 25% is considered unacceptable [8]. We think that these ratios can be applied to other types of 3D artefacts as well. We propose that the visibility of a 3D distortion is measured as the ratio between the input signal and the signal distortion introduced by the display.
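The channel model of Figure 3 and the proposed distortion ratio can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the report's implementation: the column-interleaved map (even sub-pixels to the left view), the additive leakage model, the use of `np.roll` as a stand-in for disparity, and the function names `stereo_channel` and `distortion_ratio` are all hypothetical choices made for the example. The ratio is computed here from sums of absolute values, which is one of several possible norms.

```python
import numpy as np

def stereo_channel(left, right, crosstalk=0.05):
    """Model one display row: decimate, interleave, then apply the optical mask.

    Assumption: even sub-pixels belong to the left view, odd ones to the right.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    mask_left = np.arange(left.size) % 2 == 0   # binary mask of the left view
    mask_right = ~mask_left                      # binary mask of the right view

    # Decimation and interleaving: keep only the sub-pixels of each view.
    interleaved = np.where(mask_left, left, right)

    # Optical mask: own sub-pixels pass, a fraction of the others leaks through.
    seen_left = interleaved * (mask_left + crosstalk * mask_right)
    seen_right = interleaved * (mask_right + crosstalk * mask_left)
    intended_left = interleaved * mask_left
    intended_right = interleaved * mask_right
    return interleaved, (intended_left, seen_left), (intended_right, seen_right)


def distortion_ratio(intended, seen):
    """Distortion relative to the intended signal (sums of absolute values)."""
    return np.abs(seen - intended).sum() / np.abs(intended).sum()


# A bright bar on a dark background, shifted by 3 samples in the right view.
row = np.full(16, 0.2)
row[6:10] = 1.0
left, right = row, np.roll(row, 3)               # np.roll stands in for disparity
interleaved, (want_l, got_l), _ = stereo_channel(left, right, crosstalk=0.05)
print(f"left-eye distortion: {100 * distortion_ratio(want_l, got_l):.1f}%")
```

Note that the sketch omits the pre-filter before decimation, so it reproduces the aliasing-prone variant of the channel on purpose.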

We measure the distortion at two different levels: distortion at 5% of the signal, which represents unnoticeable levels of distortion, and at 20%, which represents visible but still acceptable stereoscopic distortion.

2.2 Viewpoint related 3D artefacts

2.2.1 Pseudoscopy

The most common artefact in portable stereoscopic displays is pseudoscopy. In dual-view stereoscopic displays the observation zones of the left and right channel are arranged in a repetitive pattern, as shown in Figure 4. This creates the possibility that a pseudoscopic image is perceived. For example, the rightmost observer in Figure 4 sees the right view with the left eye and vice versa. In a pseudoscopic image, the binocular depth cues are reversed, and the objects intended to appear in front of the display appear behind it and vice versa. In most cases this contradicts the depth suggested by other depth cues in the image (shadows, occlusion and parallax) and results in a disturbing image [18]. The perceptual influence of pseudoscopy depends on the amount of disparity. For a 2D image with zero disparity pseudoscopy is not an issue, as in that case the left and right channels are identical. However, once the image has any disparity present, pseudoscopy becomes immediately visible.

Figure 4. Observation positions where pseudoscopy occurs (alternating L and R zones; positions labelled stereoscopy and pseudoscopy, i.e. reverse stereo)

2.2.2 Diplopia (angular dependent crosstalk)

In autostereoscopic displays, the visibility of each view does not change suddenly, but is a smooth function of the angle, as shown in Figure 5. For example, at the point marked with I in the figure, the left view has maximum visibility while the right view is maximally suppressed. In the zone around that point the optical separation is high enough to ensure sufficient quality of the image. On leaving this observation zone, the optical separation drops until the visibility of both views becomes the same, as marked with II in the figure. At that observation angle, both eyes simultaneously see both views, the image is perceived as flat, and the disparity between the views causes the objects to appear with double contours. The perceptual effect is similar to the natural condition in which the eyes cannot converge, known as diplopia. Again, as with all viewpoint related artefacts, the effect is not visible for flat 2D images, but becomes visible for scenes with pronounced depth. Around the optimal observation points of each view (marked with I and III in Figure 5) there is a narrow zone where the separation is good enough to ensure crosstalk of less than 20%. In these zones proper stereoscopic perception is still possible.

Figure 5. Visibilities of left and right view as a function of the observation angle (luminance of the L and R views against observation angle θ in the transverse plane; points I, II and III; no-diplopia zones around I and III, diplopia zone around II)

2.2.3 Limited head parallax

When observing a real 3D scene, the user can freely move the head and see an object from different angles. This is known as head parallax. However, a stereoscopic display recreates the scene from one particular angle only, and a multiview display recreates a small subset of the full scene parallax. A stereoscopic display can be built in a way that ensures that each eye of the observer is always inside a diplopia-free zone, for example with eye tracking and an active optical barrier, or by requiring the observer to wear passive glasses. Still, while moving the head, the observer will always perceive the scene from the same point of view. In a real scene, objects at a certain depth exhibit parallax shifts, but the brain compensates for that movement and perceives the objects as being still. In a display with fixed-parallax visualization, the objects remain still, but the brain, trying to compensate for the expected parallax movement, perceives them as moving in the direction opposite to the head movement. This perceptual effect is known as shear distortion [18].

Multiview displays can recreate a number of observation positions, which allows for some head parallax. Usually, such displays present a couple of zones with limited head parallax, as shown in Figure 6. Inside a zone, the observer has the freedom to observe the scene from different angles. However, once the observer leaves the last visibility zone, the first view becomes visible again, which is perceived as if the objects flip back to the initial parallax position. This effect is referred to as image flipping [18].
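To make the angular dependence of crosstalk around points I, II and III of Figure 5 concrete, the following minimal sketch finds the range of angles in which the left view stays below the 20% crosstalk level. The raised-cosine shape of the visibility function, the 20 degree period, the 3% residual visibility and the sweet-spot positions at -5 and +5 degrees are hypothetical values chosen for illustration only; measured visibility functions would replace `view_visibility` in practice.

```python
import numpy as np

def view_visibility(theta_deg, centre_deg, period_deg=20.0, floor=0.03):
    """Hypothetical raised-cosine visibility of one view.

    Visibility is 1.0 at the sweet spot (centre_deg) and drops to `floor`
    half a period away, where the other view is at its maximum.
    """
    theta = np.asarray(theta_deg, dtype=float)
    phase = 2.0 * np.pi * (theta - centre_deg) / period_deg
    return floor + (1.0 - floor) * 0.5 * (1.0 + np.cos(phase))

def crosstalk(theta_deg, intended_centre_deg, other_centre_deg):
    """Crosstalk as the ratio of unintended to intended view visibility."""
    return (view_visibility(theta_deg, other_centre_deg)
            / view_visibility(theta_deg, intended_centre_deg))

LEFT_CENTRE, RIGHT_CENTRE = -5.0, 5.0            # assumed sweet-spot angles
theta = np.linspace(-10.0, 10.0, 2001)
left_eye_crosstalk = crosstalk(theta, LEFT_CENTRE, RIGHT_CENTRE)

# Angles at which the left view is separated well enough (crosstalk < 20%).
usable = theta[left_eye_crosstalk < 0.20]
print(f"left view usable from {usable.min():.2f} to {usable.max():.2f} degrees")
```

In this model the crosstalk equals the residual visibility at the sweet spot and reaches 100% midway between the two sweet spots, which corresponds to point II in Figure 5.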

Figure 6. Image flipping zone in multiview displays (limited head parallax representation A, limited head parallax representation B, image flipping zone)

2.3 Optical mask related 3D artefacts

The most pronounced display-related artefacts are moiré and ghosting [18], and a third artefact, masking, is visible on displays with a parallax barrier. There are two main reasons for optical mask related artefacts to occur: design trade-offs and the interleaving procedure. The design of a 3D display involves a trade-off between the number of views, the resolution of a view, and visibility artefacts such as image flipping and banding [2]. Often, the visibility zones of different views are interspersed, and from a given angle multiple views are simultaneously visible, albeit with different brightness [10][11][13]. When visualizing 3D objects with pronounced depth, the combination of disparity and simultaneous visibility is perceived as ghosting artefacts [3][8]. Mapping the input images, which are sampled on a regular grid, to the visible pixels of a view requires special antialiasing filters [3][5][19]. Direct mapping of multiple images to the (possibly non-rectangular) grid of visible sub-pixels of a 3D display produces moiré and colour aliasing artefacts.

2.3.1 Ghosting (minimal crosstalk)

Even when seen from an optimal observation position, some of the light intended for one eye is visible to the other. For example, at the observation angle marked with I in Figure 5, which is optimal for the left view, the right view still has some visibility. The combination of disparity and crosstalk is perceived as semi-transparent, ghost-like images floating next to the original objects, as shown in Figure 7. The visibility of ghost artefacts is influenced both by the disparity and by the local contrast of the content. For example, the image in Figure 7 is a stereo pair rendered with 25% crosstalk. In the close figure no ghosting is visible, since the figure is positioned close to the zero-depth plane and is blurred by an out-of-focus effect. In the far plane there are two places at identical depth, exhibiting an identical amount of disparity (as marked with arrows in the figure). The high-contrast area (far figure) has a very pronounced ghost image visible next to it, while in the low-contrast area (dungeon bricks) ghosting is not visible.
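The dependence of ghost visibility on local contrast can be illustrated with a small calculation. The sketch below assumes a simple additive leakage model (the eye sees its own view plus a fraction of the other view) and a disparity large enough that the ghost falls on the background rather than on the object itself; the luminance values are hypothetical, and Weber contrast is used as a simple stand-in for visibility rather than as the report's metric.

```python
def ghost_weber_contrast(object_lum, background_lum, crosstalk):
    """Weber contrast of a ghost against its surround under additive leakage.

    At the ghost position the intended view shows background while the other
    view shows the object; in the surround both views show background.
    """
    ghost = background_lum + crosstalk * object_lum
    surround = background_lum + crosstalk * background_lum
    return (ghost - surround) / surround


# Same disparity and the same 25% crosstalk, different local contrast.
high_contrast = ghost_weber_contrast(object_lum=1.0, background_lum=0.1, crosstalk=0.25)
low_contrast = ghost_weber_contrast(object_lum=0.55, background_lum=0.5, crosstalk=0.25)
print(f"bright object on dark background: ghost contrast {high_contrast:.2f}")
print(f"object close to background level: ghost contrast {low_contrast:.2f}")
```

With these example values the ghost of the bright object stands out from its surround by a factor close to two, while the ghost in the low-contrast region differs from its surround by only about two percent.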


Figure 7. Ghosting artefacts (marked regions: no visible distortions, visible ghost)

2.3.2 Colour bleeding

Depending on the interleaving map, colour bleeding artefacts might occur together with the ghosting. If the angular visibility functions of the colour channels do not overlap, at certain angles some colours are predominantly visible, as shown in Figure 8. Most often this is the case with pixel-interleaved displays, where every second pixel on a given row belongs to the alternative view. In such a case the maximum visibility of each colour component appears at a slightly different angle.

Figure 8. Reasons for colour bleeding: a) colour balanced interleaving, b) interleaving causing colour imbalance [26]

In Figure 9, one can see the difference between displays with different interleaving maps, observed from an angle between two sweet spots. Both images have visible crosstalk; however, the image in Figure 9b also exhibits colour bleeding, while the image in Figure 9a exhibits only crosstalk.


Figure 9. Crosstalk versus colour bleeding between views: a) crosstalk with balanced colour distribution, b) crosstalk with additional colour bleeding artefacts. Figure adapted from [26].

2.3.3 Moiré

In a spatially interleaved 3D display, at least one half of the sub-pixels are not visible from any given observation angle. The selective masking of a 2D image caused by the optical filter can be modelled as sub-sampling. Without pre-filtering, this process creates aliasing frequencies, which are seen as moiré artefacts. Such artefacts are especially visible on multiview displays, where the visibility pattern acts as sub-sampling on a non-orthogonal grid. In Figure 10, we show the effects of the non-orthogonal mask of a multiview 3D display. A test image from the Kodak image database [20] is shown in Figure 10a. The same image, photographed on a 3D display and exhibiting visible structural and colour artefacts, is given in Figure 10b.
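The aliasing mechanism behind moiré can be reproduced in one dimension. The sketch below sub-samples a row by a factor of two, with and without a pre-filter, and reports where the energy ends up in the spectrum. It is a simplified illustration: a real display mask is two-dimensional and possibly non-orthogonal, and the ideal FFT-domain low-pass used here is an assumption made for brevity, not the antialiasing filter discussed in this report. The function names are hypothetical.

```python
import numpy as np

def decimate_by_two(row, prefilter=False):
    """Keep every second sample, optionally after an ideal low-pass pre-filter."""
    row = np.asarray(row, dtype=float)
    if prefilter:
        spectrum = np.fft.rfft(row)
        cutoff = row.size // 4            # Nyquist bin of the decimated signal
        spectrum[cutoff:] = 0.0           # remove everything that would alias
        row = np.fft.irfft(spectrum, n=row.size)
    return row[::2]

def dominant_bin(row):
    """Index of the strongest frequency bin of a real-valued row."""
    return int(np.argmax(np.abs(np.fft.rfft(row))))


n = np.arange(256)
detail = np.cos(2.0 * np.pi * 100 * n / 256)     # 100 cycles per 256 samples

aliased = decimate_by_two(detail)                 # no pre-filter
filtered = decimate_by_two(detail, prefilter=True)

# 100 cycles cannot be represented by 128 samples and fold back to 128 - 100.
print("dominant bin without pre-filter:", dominant_bin(aliased))
print("largest amplitude with pre-filter:", float(np.abs(filtered).max()))
```

Without the pre-filter the fine detail reappears as a coarse pattern at a frequency that was never present in the input, which is the one-dimensional analogue of a moiré pattern; with the pre-filter the detail is removed instead of being misrepresented.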


Figure 10. Moiré artefacts on 3D display: a) test image, b) the same image photographed on a multiview 3D display

2.3.4 Masking

Certain 3D displays, for example the ones that use a parallax barrier or a polarized patterned retarder, suffer from masking artefacts. For these displays, the mask which separates the views appears as visible black stripes over the image. The interleaved signal contains information intended for both eyes (as shown in Figure 11, top left), but the mask allows only some parts to be seen by each eye, and suppresses the light intended to be seen by the opposite eye (as shown in Figure 11, top right). The eye perceives this as a combination of masking and crosstalk. The signal which is intended for the same eye (Figure 11, bottom left) is seen as if part of the pixels are turned off, or masked. The other component of the signal, which is intended for the opposite eye (Figure 11, bottom right), is perceived as crosstalk.

Figure 11. Interleaving in parallax barrier displays as a combination of masking and crosstalk (panels: interleaved signal, seen by left eye, masking, crosstalk)
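Switching off every second pixel does more than halve the brightness: it also adds a new frequency component to the visible signal. The following sketch masks every second sample of a sinusoidal test row and lists the frequency bins that carry significant energy before and after masking. The test signal, the vertical-stripe mask and the 10% threshold are assumptions made for the illustration, and the helper names are hypothetical.

```python
import numpy as np

def mask_every_second_pixel(row):
    """Set the odd-indexed samples to zero, as an ideal barrier stripe would."""
    masked = np.asarray(row, dtype=float).copy()
    masked[1::2] = 0.0
    return masked

def strong_bins(row, rel_threshold=0.1):
    """Frequency bins whose magnitude exceeds a fraction of the strongest one."""
    magnitude = np.abs(np.fft.rfft(row))
    return [int(k) for k in np.nonzero(magnitude > rel_threshold * magnitude.max())[0]]


n = np.arange(64)
test_row = np.cos(2.0 * np.pi * 5 * n / 64)       # 5 cycles across the row

print("before masking:", strong_bins(test_row))
print("after masking: ", strong_bins(mask_every_second_pixel(test_row)))
```

The masked row contains the original 5-cycle component at half amplitude plus a replica at 32 - 5 = 27 cycles of equal strength, because multiplying by an on/off stripe pattern shifts a copy of the spectrum by half the sampling rate.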


The effect of the mask can be modelled as an upsampling stage, where the introduced samples are set to zero [21]. If an upsampled signal is not suitably post-filtered, the spectrum of the signal is replicated in a way similar to aliasing. In the sampling and interpolation literature the effect is denoted as imaging, and the filters tackling it are known as anti-imaging filters. Imaging artefacts cannot be mitigated by software means, as post-filtering the image would require an optical filter (e.g. a diffuser). Even in the simple case of a stereoscopic display with a parallax barrier, where the mask is vertical, the effect of the mask interferes with the pattern masking properties of human vision. As a result, the visibility of the masking artefacts is a complex process which is influenced both by the frequency characteristics of the mask and by those of the underlying signal. As an example, in Figure 12 we show the effect of masking every second pixel of test signals with various frequencies and orientations.

Figure 12. Influence of masking artefacts over various images (six pairs of original and masked test patterns)

2.4 Content related 3D artefacts

2.4.1 Accommodation/convergence rivalry

There are two mechanisms which allow human vision to adapt to objects at different depths. One of them is convergence, where the eyes perform an inward or outward motion in order to keep the projection of the focused object inside the fovea of each eye. The other is accommodation, in which the focal power of the eyes changes in order to keep the object focused on the retina. These two mechanisms are closely coupled, and the eyes automatically accommodate to the distance suggested by the point of convergence [18]. Binocular vision benefits from such coupling in two ways: first, it allows much faster focal accommodation, and second, keeping the objects in front of and behind the point of convergence out of focus eases the extraction of binocular depth cues.

On a stereoscopic display the convergence and focal distances to an object differ. The distance to the point of convergence is influenced by the object disparity, while the focal distance is always equal to the viewing distance, as shown in Figure 13a. This discrepancy causes objects with pronounced apparent depth to be perceived as out of focus, an effect known as accommodation-convergence (A/C) rivalry. The combinations of focal and convergence distances which allow clear vision are listed by A. Percival in [22]. On a plot of focal distance versus convergence distance, these combinations define the so-called zones of clear single vision [23]. Beyond these zones, the A/C rivalry prevents the eyes from converging, causing diplopia (double vision). An example of zones of clear single vision can be seen in Figure 13b.
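The mismatch between convergence and focal distance can be estimated from simple viewing geometry. The sketch below uses similar triangles between the eyes and the two on-screen image points: with disparity p measured on the screen surface, viewing distance D and inter-pupillary distance e, the lines of sight cross at e·D / (e − p). Expressing disparity in centimetres on the screen, the 40 cm viewing distance and the 6.5 cm inter-pupillary distance are assumptions for the example, and the function name is hypothetical.

```python
import math

def convergence_distance_cm(disparity_cm, viewing_distance_cm, ipd_cm=6.5):
    """Distance from the eyes at which the two lines of sight intersect.

    Positive disparity places the object behind the screen, negative in front.
    """
    if disparity_cm > ipd_cm:
        raise ValueError("disparity exceeds the IPD: the eyes would have to diverge")
    if disparity_cm == ipd_cm:
        return math.inf                   # parallel lines of sight
    return ipd_cm * viewing_distance_cm / (ipd_cm - disparity_cm)


viewing_distance = 40.0                   # focal distance equals viewing distance
for disparity in (-0.5, 0.0, 0.5):
    convergence = convergence_distance_cm(disparity, viewing_distance)
    mismatch = convergence - viewing_distance
    print(f"disparity {disparity:+.1f} cm: converge at {convergence:.1f} cm, "
          f"focus at {viewing_distance:.1f} cm, mismatch {mismatch:+.1f} cm")
```

For zero disparity the two distances coincide and there is no rivalry; any other disparity moves the convergence point away from the screen while the focal distance stays fixed.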

However, inside the zones of clear single vision the observer might still experience A/C rivalry and see objects out of focus. In [24], the authors define the so-called Percival's zone of comfort, which is approximately three times narrower than the zone of clear single vision. An example of Percival's zones of comfort is shown in Figure 13b. Within that zone, A/C rivalry is negligible, which allows comfortable 3D perception [23][24]. Notably, the smaller the focal distance is, the more pronounced A/C rivalry becomes, and smaller differences between focal and convergence distance can lead to an uncomfortable 3D scene. As a consequence, the range of comfortable disparities is more limited for handheld 3D displays than for displays that allow a greater viewing distance.

Figure 13. Accommodation-convergence rivalry: a) difference between convergence and focal distances to an object, b) zones of clear single vision and Percival's zones of comfort (axes: convergence distance, cm; focal distance, cm). (from

2.4.2 Divergent parallax

The eyes have a limited ability to perform inward or outward motion. The point of convergence can range from about 5 cm in front of the head to infinity. The eye muscles do not allow the eyes to look in divergent directions. The maximum disparity that can be perceived is therefore limited by the observer's inter-pupillary distance (IPD). If the disparity is larger, divergent parallax occurs, which is a disturbing and painful experience [18]. For mobile displays this limitation is less pronounced, as the mean IPD of 65 mm corresponds to a substantial part of the display width, and the limitation imposed by A/C rivalry occurs at much lower disparity values.

3 Measurement and comparative analysis of 3D displays

Currently, there is a wide variety of 3D displays available on the market.
Since no standard has been developed yet, content that is optimal for one display might appear with reduced quality on another display, or even be incompatible with it. In order to produce optimized 3D scenes, content producers need to know what disparity range is suitable for a given display. There are several studies on estimating the visual quality of 3D displays. Some propose analytical derivations based on knowledge of display properties [2][3][4], others measure the optical parameters of the displays [10][11] or perform subjective tests [7][8]. None of these approaches is universally applicable, as display properties might not be known to the user, optical parameters might not be directly related to the perceived quality, and subjective tests are time consuming and expensive. In this chapter we perform measurements and a comparative analysis of a number of 3D displays. We attempt to identify the display parameters that have a direct effect on the visual quality of the display.

3.1 Displays under test

We compare eight autostereoscopic and two glasses-based 3D displays. The devices under test are listed in Table 3.1. The HDDP display is a lenticular sheet display with HDDP pixel arrangement, produced by NEC LCD [28]. MI_L and MI_P denote two orientations of a display produced by Master Image [31]. That display can operate in 3D-landscape and 3D-portrait mode by changing the direction of its parallax barrier. In either mode, the interleaving is per pixel. MI_L denotes the display operating in landscape mode, and MI_P in portrait mode. FC is a portable stereoscopic camera produced by Fujifilm and equipped with a 3D display [29]. The FC display uses a light guide and a retardation layer. Fujifilm also produces a 3D photo frame, which is labelled FF in our comparison [29]. SL is a laptop with an autostereoscopic 3D display produced by Sharp [32]. V3D is a prototype of a portable PC with a 3D display, which is not in mass production. The FF, SL and V3D displays are all 2D/3D switchable and column-interleaved on a sub-pixel level. For comparison, we have included two displays that work with polarized glasses. AL(OG) and AL(RG) are the same laptop with a 3D display, produced by Acer, which uses a patterned retarder and polarization glasses. The measurements labelled AL(OG) are done through the original glasses that are sold with the laptop, and the ones labelled AL(RG) use replacement glasses sold separately. VUON is a large 3D television set with HDTV resolution. 3DS is a portable gaming console produced by Nintendo, which has two displays, one of which is a parallax barrier-based autostereoscopic display.
Table 3.1 Display models under study

Model | Description | Interleaving | Horizontal resolution (px) | Vertical resolution (px) | Width (cm) | Height (cm)
HDDP | 3.2" display based on the lenticular HDDP technology by NEC | HDDP | | | |
MI_L | MB403M by Master Image (landscape 3D mode) | Column-interleaved, per pixel | | | |
MI_P | MB403M by Master Image (portrait 3D mode) | Column-interleaved, per pixel | | | |
FC | FinePix REAL 3D W1 camera by Fujifilm | Light guide | | | |
FF | FinePix REAL 3D V1 photo frame by Fujifilm | Column-interleaved, per sub-pixel | | | |
SL | Sharp AL3DU (with parallax barrier display) | Column-interleaved, per sub-pixel | | | |
V3D | Portable computer with 3D display prototype | Column-interleaved, per sub-pixel | | | |
AL(OG) | Acer AS5738DG-6165 laptop (original polarized glasses) | Row interleaved | | | |
AL(RG) | Acer AS5738DG-6165 laptop (replacement polarized glasses) | Row interleaved | | | |
VUON | Vuon E465SV 3D TV set by Hyundai (polarized glasses) | Row interleaved | | | |
3DS | Nintendo 3DS (with parallax barrier display) | Column-interleaved, per sub-pixel | | | |

3.2 Angular visibility function

The angular visibility function gives the visibility of each TFT element as a function of the observation angle. It is proportional to the pixel brightness, but is normalized between zero and one, where one is the maximal brightness of the display. The angular visibility function can be used directly for deriving a number of other parameters, such as the minimal crosstalk, the width of the sweet-spots, and the existence of colour bleeding.

3.2.1 Measurement procedure

In our measurement setup, the display under study is stationary and the front element of the camera lens is moved along a line, as shown in Figure 14. The line lies in a plane parallel to the screen surface, at the optimal viewing distance from it, and at the height of the centre of the display. The camera was oriented perpendicular to this plane during all the measurements, which were made at observation points 2 mm apart along the line. The widest available focal length of the camera lens was used in order to obtain measurements from as many observation angles as possible.

Figure 14. Visualization of the measurement setup.

At each measurement point, the appearance of four test images on the display was photographed. The test images are: left view white at full intensity with right view black (L: white, R: black); L: black, R: white; L: white, R: white; and L: black, R: black. Depending on a screen's pixel pattern and the optical layer chosen to separate the views, the relative visibility of the R/G/B components can become distorted depending on the viewing angle. To study this effect in more detail, test images were also prepared in which one view had only the red, green or blue component set to full intensity and the others set to zero, while the other view had all components set to zero. Such test images allowed the visibility of each single colour component to be obtained for both views from all the observation points.

3.2.2 Preparation of test images

For measuring the crosstalk per colour channel we prepared stereo images with different combinations of pure white (W), black (K), red (R), green (G) or blue (B) regions. The combinations we used are W/W, W/K, K/W, K/K, R/K, G/K, B/K, K/R, K/G and K/B, which gives a set of 10 images.
The first of the three composite stereo images was formed so that it contained all the black/white combinations for the two views. To achieve this, the left view showed an image with four vertical stripes of equal width (in the order W/K/K/W), displayed on the screen as

shown in Figure 15. The width and height of these stripes were adjusted for each display so that they filled the whole available screen area. This gives two areas with identical B/W combinations in the stereo image and allows the mean of the measured properties to be calculated.

Figure 15: Black and white test image: a) left channel and b) right channel

The second composite stereo image had in its left view three horizontal stripes of equal height displaying only the red, green and blue components at maximum intensity. The right view was set to full black. At the left and right edges of both views a white border 15 pixels wide was added to help with automated image segmentation. The images for the views are visualized in Figure 16. The third composite stereo image is identical to the second one except that the left and right views are switched.

Figure 16: Colour test image: a) left channel and b) right channel

To prepare the test images for the V3D device, 512x600 versions of the images for each view were first formed and then combined into a single 1024x600 image, with the columns of the left view placed in the even columns (2nd, 4th, ...) and the columns of the right view in the odd columns (1st, 3rd, ...). For the HDDP device the interleaving of the left and right views was not known, as the left and right views, each with a resolution of 427x240, were given to the device placed side by side in the positions visualized in figure 5 inside a 1024x768 input signal. The pixels outside the left/right view positions are ignored when forming the 854x600 pixels to be displayed. Preparing the test images for the 3DS required forming the 400x240 views and combining them into a single MPO file. The MPO file format is described in [25].
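The column interleaving described for the V3D device can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool used for the measurements: the function name `interleave_columns` and the placeholder pixel values `"L"` and `"R"` are hypothetical, and each view is represented as a plain list of rows.

```python
def interleave_columns(left, right):
    """Column-interleave two equally sized views given as lists of rows.

    Following the text (1-based column numbering): the right view fills
    the odd columns (1st, 3rd, ...) and the left view the even columns
    (2nd, 4th, ...), so the output is twice as wide as each view.
    """
    if len(left) != len(right) or any(
        len(l_row) != len(r_row) for l_row, r_row in zip(left, right)
    ):
        raise ValueError("views must have identical dimensions")
    frame = []
    for l_row, r_row in zip(left, right):
        row = []
        for l_px, r_px in zip(l_row, r_row):
            row.append(r_px)  # 1st, 3rd, ... column <- right view
            row.append(l_px)  # 2nd, 4th, ... column <- left view
        frame.append(row)
    return frame


# Hypothetical placeholder views with the per-view size quoted in the text.
WIDTH, HEIGHT = 512, 600
left_view = [["L"] * WIDTH for _ in range(HEIGHT)]
right_view = [["R"] * WIDTH for _ in range(HEIGHT)]
frame = interleave_columns(left_view, right_view)
```

With real images the placeholder strings would be replaced by pixel values; the ordering of the two `append` calls is the only part that encodes the even/odd assignment.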
To display the images with the standard 3DS photo viewing software, it was also necessary to produce a preview image by saving either of the views as a JPEG, and to follow a certain folder and file naming policy for both the .mpo and .jpg files placed on the SD card.

3.2.3 Results

Unlike with the other displays, the colour component measurements with the V3D device were made at only 15 observation points, instead of the standard observation points 2 mm apart at which the black/white measurements were made. The colour measurements were also, exceptionally, made using a pure R/G/B image as the whole of one view and black as the

other, and the camera was oriented towards the centre of the display when taking the measurements. This procedure was followed because the idea of using composite images 2 and 3 had not yet been conceived at the time of the measurements, and the VAIO screen was the first one to be measured. Results for the V3D device can be seen in Figure 17. The figure shows that the visibility peaks of the colour components occur at different viewing angles, which is the reason for colour-bleeding artefacts.

Figure 17. Angular visibility of the V3D display.

Results for the screen of the HDDP prototype device are shown in Figure 18. According to the results, the screen is able to project the views to the user without much crosstalk over wider angle intervals than the VAIO screen. At the same time, the transitions between the views are very steep. The screen also produces little to no distortion of the colour components.

Figure 18. Angular visibility of HDDP display.

Figure 19 shows the results for the 3DS screen. Compared to the HDDP results, the angular visibility curve of the dominating view is more bell-shaped and less rectangular, while the crosstalk levels are very similar. As a result, the screen appears slightly dimmer when not viewed exactly from the optimal angle, but on the other hand this gives the viewer a subtle cue for finding the optimal viewing position. Also, the transitions between views are less sharp, which creates larger

Figure 19. Results for the 3DS screen.

3.3 Optical mask related quality parameters

3.3.1 Objective crosstalk

In 3D displays, crosstalk is measured as the ratio of the visible brightness of the view that is to be suppressed to the visible brightness of the view intended to be seen [13]. It is usually given in percent. Normalized to a scale between 0 and 1, it is derived in the following way:

$$\chi = \frac{I_C - I_{min}}{I_O - I_{min}} \qquad (1.1)$$

where $I_{min}$ is the measured brightness of an area containing only black pixels in each view, $I_O$ is the measured brightness of the current view, where the pixels intended to be visible are white and the pixels intended to be suppressed are black, and $I_C$ is the measured brightness of the other view, where the pixels intended to be visible are black and the suppressed ones are white.

One way to derive the crosstalk is to use the angular visibility function: find the angle at which one channel has peak visibility and assign that value to $I_O$, and then assign to $I_C$ the value of the other channel at the same observation angle. An alternative way to measure the crosstalk is to have horizontal stripes in one view and horizontal stripes in the other, as proposed in [33]. In that case, one needs to find the optimal observation point of a view (a position at which the stripes of one view are predominantly visible) and measure the brightness of four areas on the display, as shown in Figure 20.
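The crosstalk derivation from the angular visibility function can be sketched as follows. This is a hedged illustration, not the authors' measurement code: it assumes the common black-level-corrected form of the ratio (leaked brightness over intended brightness), and the names `crosstalk`, `minimal_crosstalk`, `i_own`, `i_other` and `i_black` are placeholders introduced here.

```python
def crosstalk(i_own, i_other, i_black=0.0):
    """Normalised crosstalk in the range 0..1.

    i_own   -- brightness of the view intended to be seen (white shown)
    i_other -- brightness leaking from the view to be suppressed
    i_black -- brightness measured when both views show black
    """
    intended = i_own - i_black
    if intended <= 0:
        raise ValueError("intended view must be brighter than the black level")
    return (i_other - i_black) / intended


def minimal_crosstalk(vis_own, vis_other, i_black=0.0):
    """Crosstalk at the angle where the own channel has peak visibility.

    vis_own and vis_other are angular visibility samples of the two
    channels taken at the same observation angles.
    """
    peak = max(range(len(vis_own)), key=vis_own.__getitem__)
    return crosstalk(vis_own[peak], vis_other[peak], i_black)


# Hypothetical visibility samples for two channels at four angles.
vis_left = [0.1, 0.6, 1.0, 0.5]
vis_right = [0.9, 0.3, 0.04, 0.4]
result = minimal_crosstalk(vis_left, vis_right)
```

Multiplying the returned value by 100 gives the percentage form used in the tables.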

Figure 20. Methodology for objective measurement of minimal crosstalk, adapted from [26] (plot of luminance against observation angle θ w in the transverse plane for the L and R views, with the levels I max, I min, I O and I C marked)

The minimal crosstalk of our displays according to the measurements is given in Table 3.2.

Table 3.2 Minimal crosstalk of various displays

Model | Crosstalk in %
HDDP | 4%
MI(L) | 22%
MI(P) | 22%
FF | 12%
FC | 35%
SL | 8%
AL(OG) | 24%
AL(RG) | 32%
V3D | 25%
VUON | 21%
3DS | 7%

3.3.2 Perceived crosstalk

Apart from the absolute value of the crosstalk, other factors affect the perceptibility of ghosting, for example the brightness and contrast settings of the display, the ambient light, and the amount of light reflected from the display, to name a few. In order to measure the perceived amount of crosstalk we conducted a small-scale subjective experiment in typical observation conditions. The main idea is to present two stimuli to the observer, one in the desired channel and another in the opposite channel. During the experiment, the intensity of the stimulus in the opposite channel is kept constant, and the user tunes the intensity of the stimulus in the desired channel until both stimuli are perceived as equally bright. If the constant stimulus is at the maximum brightness of the display, the ratio of the tuned intensity to the constant one, expressed as a percentage, gives the perceived crosstalk. Notably, this subjective comparison gave lower values than the objective measurements performed in a dark room, except for the 3DS display, for which the perceived crosstalk is slightly higher than the measured one. We believe that the values for perceived crosstalk are a better approximation of the amount of experienced ghosting.
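The ratio used for perceived crosstalk can be illustrated with a short sketch. It assumes, as a simplification not stated in the report, that stimulus intensity is proportional to the 8-bit gray level (display gamma is ignored); the function names and the sample answers are hypothetical.

```python
def perceived_crosstalk_percent(matched_level, reference_level=255):
    """Perceived crosstalk as the matched level over the reference, in percent.

    matched_level   -- level the subject selected as equally bright
    reference_level -- constant level shown in the opposite channel
                       (255 = full white on an 8-bit display)
    """
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return 100.0 * matched_level / reference_level


def mean_perceived_crosstalk(matched_levels, reference_level=255):
    """Average the perceived crosstalk over the answers of several subjects."""
    values = [perceived_crosstalk_percent(m, reference_level) for m in matched_levels]
    return sum(values) / len(values)


# Hypothetical answers from three subjects (8-bit gray levels).
answers = [48, 51, 54]
average = mean_perceived_crosstalk(answers)
```

A gamma-aware variant would first convert gray levels to luminance before taking the ratio.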

We used a set of stereo images showing a checkerboard pattern. In all images of the set, the right view had pure white and black squares, while the left view had black squares where the right view had white ones, and a varying level of gray where the right view had black ones. A test image was created for every possible gray level of the left image, which added up to 256 stereo images in total. A single stereo pair of the test set is shown in Figure 21, where the left view has a gray level of 90. Borders of 8 pixels were added to the images to help with finding the optimal viewing positions for the displays.

Figure 21. One of the stereo pairs of the checkerboard test set.

The test set was shown to a group of subjects, who were asked to look at the test image with only their left eye and to switch between images of the set until they found the one in which the odd and even squares were perceived as having the same gray level. For displays with smaller crosstalk, the chosen test image had a lower gray level, as less of the pure white of the right view leaked into the left view and only a low gray level was required to compensate.

Table 3.3 Perceptual crosstalk of various displays

Model | Perceived crosstalk in %
HDDP | 1%
AL(OG) | 15%
AL(RG) | 25%
SHARP | 16%
V3D | 32%
3DS | 10%

3.3.3 Perceptual resolution

As discussed in [3], the presence of crosstalk in a 3D display might increase the range of spatial frequencies that can be visualized on the display. If we model the display as an imaging channel, this corresponds to increasing the passband of the channel. Perceptually, the effect would be that a larger range of patterns can be seen, as if the display had increased resolution. Finding the perceptual resolution of a 3D display would allow 3D content to be rendered or encoded with an appropriate pixel density in order to be optimally seen on a given 3D display.
Factors that affect the perceived resolution of a 3D display are the number of views, the interleaving pattern and the disparity of the signal. In order to measure the range of textures that can be seen on a given 3D display, we developed a methodology for measuring the so-called passband of the display [27]. The methodology is to display various patches on the display, photograph them and

analyse the similarity between the input and output. The stages of the algorithm are shown in Figure 22. The input patches have various orientations and spatial frequencies. Each patch is interleaved for a range of disparities, which represent the patch as seen at different apparent depths. For each frequency combination, a pass/fail analysis is performed: each photographed patch is compared with the input, and if the structural distortions are under a certain threshold, the patch is assumed to be successfully visualized on the display and is marked as passed.

Figure 22. Methodology for measuring the passband of a 3D display, adapted from [27] (blocks shown: signal, disparity, 3D warping, views 1 to n, sampling, interleaving, mask, output)

The group of passing patches gives the range of frequencies that can be successfully represented by the display. We refer to this range as the passband of the display. An example passband is marked with red dots in Figure 23a. In order to convert it to perceived resolution, we approximate it with a rectangle that has the same area and the same vertical-to-horizontal proportions as the passband. The perceived resolution of a 3D display differs for signals with different disparity. Examples of resolutions for signals with different disparity are given in Figure 23b.

Figure 23. Deriving the passband: a) approximating rectangular passband area, b) passband areas for various disparities [27] (axes: f x, f y)
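The rectangle approximation of the passband can be sketched as below. This is one possible reading, stated as an assumption: the passband area is taken as the number of passing grid points times the area of one frequency cell, and the vertical-to-horizontal proportion is taken as the ratio of the largest passing vertical and horizontal frequencies. The function name and the sample grid are hypothetical.

```python
import math


def equivalent_rectangle(passed, step_x, step_y):
    """Rectangle with the same area and proportions as a measured passband.

    passed -- list of (fx, fy) frequency pairs that passed the test
    step_x, step_y -- spacing of the frequency grid along each axis
    Returns (width, height) of the approximating rectangle.
    """
    if not passed:
        raise ValueError("no passing patches")
    area = len(passed) * step_x * step_y
    extent_x = max(abs(fx) for fx, _ in passed)
    extent_y = max(abs(fy) for _, fy in passed)
    if extent_x == 0 or extent_y == 0:
        raise ValueError("passband has no extent along one axis")
    ratio = extent_y / extent_x  # vertical-to-horizontal proportion
    width = math.sqrt(area / ratio)
    height = math.sqrt(area * ratio)
    return width, height


# Hypothetical triangular passband on an integer frequency grid.
triangle = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (3, 1)]
rect_w, rect_h = equivalent_rectangle(triangle, 1, 1)
```

By construction `width * height` equals the passband area and `height / width` equals the measured proportion.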

3.4 Viewpoint related quality parameters

3.4.1 Position and size of viewing sweet-spots

All autostereoscopic displays can be used only from a limited range of observation positions. A sweet-spot is a position at which the observer sees the image on the display with sufficient quality. Two artefacts are most influenced by the observation position: ghosting (angle-dependent crosstalk) and pseudoscopy. According to [8], 20% of crosstalk is still considered acceptable. Following this rule, we define the sweet-spot as an observation position where each eye perceives the proper view and the crosstalk between the views is less than 20%. Since the display is flat, from a given observation position different parts of the screen surface are seen from slightly different observation angles, as shown in Figure 24a. The viewing zone of a view is formed by the union of the visibility zones of the pixels that belong to that view, and has a characteristic diamond-like shape, sometimes referred to as a viewing diamond [5]. In order for a good stereo image to be perceived, both eyes need to be in the corresponding sweet-spots, as seen in Figure 24b. This requirement imposes a limit on the range of observation distances suitable for a given display. For a given inter-pupillary distance (IPD) there is a minimal and a maximal distance at which both eyes of the observer are inside the corresponding sweet-spots. There is also an optimal observation distance, at which the optical separation is best and the lowest crosstalk is visible across the whole surface of the display. The minimal, maximal and optimal viewing distances are marked in Figure 24b with VD min, VD max and OVD respectively.

Figure 24. Sweet-spots of an autostereoscopic display: a) left and right sweet-spots, b) optimal, minimal and maximal observation distances.
Adapted from [26]

The size of the sweet-spots can be derived from the angular visibility function, or measured directly using a pair of cameras separated by the chosen IPD. In our measurements we used IPD=65mm, which is the mean IPD value for an adult. The measurement results for OVD, VD min and VD max for a range of 3D displays are given in Table 3.4.

Table 3.4 Optimal, minimal and maximal viewing distances for various 3D displays, measured for IPD=65mm

Model | OVD (cm) | VD min (cm) | VD max (cm)
HDDP | | |
MI(L) | | |
MI(P) | | |
FF | | |
FC | | |
SL | | |
AL(OG) | | |
AL(RG) | | |
V3D | | |
3DS | | |

With the same settings, crosstalk of less than 20% and IPD=65mm, one can measure the width and height of the sweet-spots. The width of the sweet-spots of the displays we measured is given in Figure 25a. The white rectangles indicate the range of angles where proper stereo with low crosstalk can be observed. All measurements are done at the OVD. The wider the spots are, the easier it is for the observer to find the proper observation angle. Notably, the AL display, which uses polarised glasses, has a very wide sweet-spot. In a similar setup, one can measure the height of the sweet-spot by measuring the range of elevation angles that ensure an image with sufficiently low crosstalk. Results of our sweet-spot height measurements for various displays are given in Figure 25b. Most stereoscopic displays are affected very little by the elevation of the observation point. Notably, the AL display, which uses polarised glasses and horizontal interleaving, is very sensitive to the height of observation and provides good quality only for a very limited range of vertical angles. Since the minimal crosstalk of the FC display is never under 20%, for that display we give the areas where the crosstalk is lower than 35%, marked with a dashed line in Figure 25a and Figure 25b.
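Deriving the sweet-spot width from the angular visibility function can be sketched as a threshold scan. This is a simplified illustration under stated assumptions: it treats one view at a time, ignores the black level, takes crosstalk as the ratio of the other view's visibility to the own view's visibility, and uses hypothetical sample data and function names.

```python
def sweet_spot_intervals(angles, vis_own, vis_other, max_crosstalk=0.20):
    """Return (start, end) angle intervals where crosstalk stays below the limit.

    angles    -- observation angles, in increasing order
    vis_own   -- visibility of the view intended for this eye
    vis_other -- visibility of the view to be suppressed
    """
    intervals = []
    start = None  # first angle of the interval being collected
    last = None   # most recent angle that satisfied the criterion
    for angle, own, other in zip(angles, vis_own, vis_other):
        inside = own > 0 and other / own < max_crosstalk
        if inside:
            if start is None:
                start = angle
            last = angle
        elif start is not None:
            intervals.append((start, last))
            start = None
    if start is not None:
        intervals.append((start, last))
    return intervals


# Hypothetical visibility samples at one-degree steps.
angles = [-3, -2, -1, 0, 1, 2, 3]
own = [0.1, 0.5, 0.9, 1.0, 0.9, 0.5, 0.1]
other = [0.9, 0.3, 0.1, 0.05, 0.1, 0.3, 0.9]
spots = sweet_spot_intervals(angles, own, other)
```

Raising `max_crosstalk` to 0.35 corresponds to the relaxed criterion applied to the FC display.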

Figure 25. Width and height of the sweet-spots per display model: a) width and b) height. Adapted from [26] (axes: a) observation angle in transverse plane, degrees; b) height of sweet spot, degrees; models shown: HDDP, MI(L), MI(P), FF, FC, SL, AL, V3D)

3.4.2 Apparent size and resolution

The observer of a 2D display is free to choose an observation distance that gives the preferred tradeoff between pixel density and field of view. However, 3D displays work best at their OVDs. For a fair comparison, one should calculate the apparent size of the display when observed from its OVD. In our measurements, we calculate the horizontal and vertical angle of view (AOV) of each display at its OVD, as shown in Figure 26a.

Figure 26. Relative size and resolution of a 3D display: a) angle of view, b) cycles per degree

The calculation is done as follows:

$$AOV_h = 2\arctan\frac{w}{2\,OVD}, \qquad (2)$$

$$AOV_v = 2\arctan\frac{h}{2\,OVD}, \qquad (3)$$

where $AOV_h$ and $AOV_v$ are the horizontal and vertical AOV, and $w$ and $h$ are the horizontal and vertical size of the display. The calculations for our set of 3D displays are shown in Figure 27a. For comparison, we include the 2D displays used in the Nokia N900 and Apple iPhone mobile devices. In a similar way, when measuring the apparent resolution, one should calculate the apparent density of the pixels at the OVD. Apparent resolution is measured in cycles per degree (CPD), where two pixels constitute one cycle, as shown in Figure 26b. The number of pixels that fill a visual angle of one degree is given by:

$$N_{deg} = 2\,OVD\,\tan(0.5^\circ)\,PPCM, \qquad (4)$$

where PPCM is the pixel density per centimetre of the display. The CPDs of a number of 3D displays are given in Figure 27b. For displays that can switch between 2D and 3D modes, the CPD is given separately for each case. Notably, the HDDP and FC displays have the same resolution in both modes. For comparison, the CPDs of the N900 and iPhone are given, calculated for a typical observation distance of 40cm. The resolution of the human retina for perfect 20/20 vision (50 CPD) is included as well.

Figure 27. Angular size and resolution per display model measured at the optimal viewing distance: a) angular size, b) resolution in 2D and 3D mode. Adapted from [26]. (axes: a) horizontal and vertical AOV at OVD, degrees; b) horizontal and vertical resolution at OVD, CPD; point labels in a): FC@40cm, MI(P)@37cm, V3D@45cm, FF@46cm, MI(L)@37cm, N900@40cm, HDDP@40cm, SL@58cm, AL@60cm)

3.5 Content related quality parameters

3.5.1 Objective comfort disparity range

The same disparity value might produce a very different depth sensation on different 3D displays. The apparent depth perceived for a given disparity value depends on the observation distance, the pixel density of the display, and the amount of visible crosstalk. For each display, there is a limited range of disparity values that can be comfortably observed. We refer to that range as the comfort disparity range.
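The angle-of-view and cycles-per-degree calculations from the subsection on apparent size and resolution can be sketched with basic trigonometry. The formulas below are the standard geometric ones and are assumed here rather than copied from the report; the function names and the numbers in the example are hypothetical.

```python
import math


def angle_of_view_deg(size_cm, ovd_cm):
    """Angle subtended by a display edge of size_cm seen from ovd_cm."""
    return math.degrees(2 * math.atan(size_cm / (2 * ovd_cm)))


def pixels_per_degree(ppcm, ovd_cm):
    """Number of pixels covering one degree of visual angle at ovd_cm."""
    return 2 * ovd_cm * math.tan(math.radians(0.5)) * ppcm


def cycles_per_degree(ppcm, ovd_cm):
    """Apparent resolution in CPD, with two pixels forming one cycle."""
    return pixels_per_degree(ppcm, ovd_cm) / 2


# Hypothetical display: 7 cm wide, 100 pixels per cm, viewed from 40 cm.
aov = angle_of_view_deg(7.0, 40.0)
cpd = cycles_per_degree(100.0, 40.0)
```

For a display whose 3D mode halves the horizontal pixel count per view, the 3D-mode CPD would be computed with half the PPCM.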
The combined influence of A/C rivalry and divergent parallax determines the comfort disparity range of a given display. In our measurements, we calculated the disparity range that would cause neither diplopia nor divergent parallax, given the pixel density and OVD of each display. In order to compare with the disparity range of downscaled HDTV content, we calculated the downscaling factor between the VUON display and each of the portable 3D displays. We calculated Percival's zone of comfort for the OVD of each display as explained in [23]. Then we converted the minimum and maximum apparent distance to disparity:

$$d_{min} = IPD\,\frac{D_{min} - OVD}{D_{min}}; \qquad (5)$$

$$d_{max} = IPD\,\frac{D_{max} - OVD}{D_{max}}; \qquad (6)$$

where $d_{min}$ and $d_{max}$ are the minimum and maximum disparities in centimetres, and $D_{min}$ and $D_{max}$ are the minimum and maximum distances to the convergence point prescribed by Percival's zone of comfort, also in centimetres. Using the optimal disparity range for the VUON display and the downscaling factors, we calculated the disparity range of downscaled content for each display. Figure 28a compares the disparity range of downscaled content (black bars) with the disparity range of display-optimized content (white bars) per display model, with disparity given in centimetres. At first glance, it seems that the allowed disparity range of all 3D displays would be insufficient to accommodate directly downscaled HDTV content. However, mobile displays usually have a higher pixel density than large TV sets, which decreases the absolute disparity in centimetres for a given disparity in pixels. Figure 28b compares the downscaled ranges (black bars) with the optimal ranges (white bars) per display model, for disparities in pixels. In most cases the additional downscaling caused by the higher pixel density compensates for the comfort disparity range reduced by A/C rivalry, and allows the mobile display to accommodate the disparity range of downscaled HDTV content.

Figure 28. Downscaled (black) and optimal (white) disparity range per model: a) in cm, b) in pixels. Adapted from [26] (IPD=6.5cm; models shown: VUON, HDDP, MI(L), MI(P), FF, FC, SL, AL, V3D)

3.5.2 Subjective comfort disparity range

Apart from A/C rivalry and divergent parallax, many additional factors influence the comfort disparity range of a mobile 3D display, for example minimal crosstalk, optical quality, and the brightness and local contrast of the visualized content.
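The conversion from convergence distance to screen disparity used for the objective comfort range can be sketched with similar triangles. This is a hedged illustration: the sign convention (positive disparity for points behind the screen) and all names are assumptions made here, and the distances in the example are hypothetical rather than measured.

```python
def screen_disparity_cm(convergence_cm, viewing_cm, ipd_cm=6.5):
    """Screen disparity of a point perceived at convergence_cm.

    Derived from similar triangles between the eyes, the screen plane
    and the convergence point. Zero on the screen plane, negative in
    front of it, and approaching ipd_cm as the point recedes to infinity.
    """
    if convergence_cm <= 0 or viewing_cm <= 0:
        raise ValueError("distances must be positive")
    return ipd_cm * (convergence_cm - viewing_cm) / convergence_cm


def disparity_px(disparity_cm, ppcm):
    """Convert a disparity in centimetres to pixels for a given pixel density."""
    return disparity_cm * ppcm


# Hypothetical comfort zone of 35..50 cm around a 40 cm viewing distance.
d_near = screen_disparity_cm(35.0, 40.0)
d_far = screen_disparity_cm(50.0, 40.0)
range_px = (disparity_px(d_near, 100.0), disparity_px(d_far, 100.0))
```

The limit of `ipd_cm` for very distant points matches the observation that disparities larger than the IPD produce divergent parallax.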
We performed a small-scale subjective test aimed at determining the subjective comfort range of each display. For the test, we used a synthetic scene with high-contrast content and objects at different apparent depths. Multiple versions of the scene were rendered, each with a different camera baseline, resulting in scenes with different disparity ranges. The participants were asked to choose the scene with the most pronounced, yet comfortably perceived, depth range. Then we calculated the mean maximum and mean minimum value for each display. In Figure 29 we compare the downscaled (black bars) and subjectively optimal (white bars) disparity ranges. We used the subjective disparity range derived for the VUON display, along with the downscaling factors, to calculate the disparity range of downscaled 3D HDTV content. In most cases mobile 3D displays can

accommodate downscaled stereoscopic HD content, with the exception of the FC display, which suffers from high crosstalk. In some cases the optimal disparity range for a mobile display is significantly larger than the range of downscaled HD content. In Table 3.5 we show the relative difference between the two. Most displays provide sufficient disparity range overhead, and directly downscaled content can be comfortably observed on them. However, there is room for content repurposing in order to increase the disparity range and provide an extended range of apparent depth. Still, the impact on the quality might not be large enough to justify the CPU-intensive postprocessing needed for on-device repurposing.

Figure 29. Downscaled (black) and optimal (white) subjective comfort disparity range (axis: disparity range, px; models shown: VUON, HDDP, MI(L), MI(P), FF, FC, SL, AL, V3D)

Table 3.5 Rescaling factors and relative differences between optimized and downscaled content per display model.

Model | Rescaling factor (letterbox) | Negative disparity range | Positive disparity range
HDDP | | % more | 98% more
MI_L | | % more | 53.6% more
MI_P | | % more | 53.6% more
FC | | % less | -52% less
FF | | % more | 140% more
SL | | % more | 12.5% more
AL | | % more | 12.5% more
V3D | | % more | 12.5% more

4 Visual optimization

According to our measurement results, three factors have the most influence on the visual quality of a mobile 3D display: the observation position, the influence of the optical filter and the disparity range of the content. The biggest factor is the observation position. On most of the displays the sweet-spots are rather small, and between two sweet-spots there is a pseudoscopic zone, which confuses the viewer. The optical filter has an influence in two ways: the minimal crosstalk creates ghosting artefacts, and the interleaving pattern creates aliasing and eventually colour bleeding. The disparity range has the least influence of all parameters; according to our measurements, in

32 most cases direct downscaling produces results with sufficient quality. Following these results, we develop a family of algorithms aimed at viewpoint optimization and antialiasing for 3D displays. 4.1 Viewpoint optimization In order adapt the display to the observation position of the user, the algorithm must track the eyes of the observer. We have developed a combination of face and eye-tracking algorithms which allow splitting the processes of face and eye detection between the ARM and DSP cores of an OMAP 3430 [42]. We have ported the face detection library, included in the OpenCV library, which uses the Viola and Jones s algorithm [41]. In order to increase the face detection speed, we search for a subset of all possible face sizes, as the visible face size is limited by the requirement to the user to stay within the visual comfort zone. We are using a face detection library by ArcSoft [35]. Our own face detection algorithm is being ported to the OMAP as well. It is based on a two-stage hybrid technique, combing skin detection with feature-based face detection [36], [37]. ARM side DSP side Camera source program initialization DSP initialization capture frame reset DSP load eye classifiers process image no face detected? yes send face detected message face detected check for face detected message check for eyes detected message eyes detected face detected? yes no eyes detected? yes set 2D/3D mode and display orientation update frame buffer no eye detection update eye coordinates send eye detected message Figure 30, Application flow diagram of DSP and ARM. The part of the algorithm, performing eye-detection is implemented on the DSP core as shown in Figure 30. It detects the two pupils by a Bayesian classifier working on Dual-Tree Complex Wavelet Transform (DT-CWT) features. The DT-CWT has been chosen as a low-cost alternative to Gabor transform for real-time feature extraction implementation [38], [39]. 
The DT-CWT features are formed by a four-scale DT-CWT applied to a spatial area of 16x16 pixels around a landmark, with six differently oriented sub-bands per scale. The resulting twenty-four matrices form a landmark jet [38], [39]. Two landmark classes are modeled: pupil and non-pupil. To model a particular landmark class, we trained a Gaussian mixture model (GMM) for each sub-band in the jet, thus leading to 24 models. We used at most 5 Gaussian components for each sub-band.
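The classification step described above can be illustrated with a minimal sketch. It is not the DSP implementation: the DT-CWT itself is not computed here, and several details are assumptions made only for illustration, namely that each scale halves the resolution of the 16x16 patch, that each sub-band is flattened into a feature vector, that the GMMs have diagonal covariances, and that the 24 sub-band log-likelihoods are simply summed. All function names are hypothetical.

```python
import numpy as np

N_SCALES, N_ORIENTATIONS, PATCH = 4, 6, 16

def jet_shapes():
    """Shapes of the 24 sub-bands in a jet, assuming that every scale
    halves the 16x16 patch (8x8, 4x4, 2x2, 1x1), six orientations each."""
    return [(PATCH >> s, PATCH >> s)
            for s in range(1, N_SCALES + 1)
            for _ in range(N_ORIENTATIONS)]

def gmm_log_likelihood(x, weights, means, variances):
    """log p(x) under a diagonal-covariance GMM (an assumed model form).

    x: feature vector of length D (flattened sub-band),
    weights: (K,), means and variances: (K, D), with K <= 5 in the text.
    """
    x = np.asarray(x, dtype=float).ravel()
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    log_gauss = -0.5 * (np.log(2.0 * np.pi * variances)
                        + (x - means) ** 2 / variances).sum(axis=1)
    a = np.log(weights) + log_gauss
    peak = a.max()
    return float(peak + np.log(np.exp(a - peak).sum()))  # log-sum-exp

def classify_landmark(jet, models, log_priors):
    """Bayesian decision between landmark classes.

    jet: list of 24 sub-band arrays,
    models[label]: list of 24 (weights, means, variances) tuples,
    log_priors[label]: log prior probability of the class.
    Sub-bands are treated as independent (assumption), so their
    log-likelihoods are added.
    """
    scores = {}
    for label, subband_models in models.items():
        scores[label] = log_priors[label] + sum(
            gmm_log_likelihood(f, *gmm)
            for f, gmm in zip(jet, subband_models))
    return max(scores, key=scores.get), scores
```

With models trained on real pupil and non-pupil jets, the label with the highest posterior score would be returned for each candidate location.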


4.1.1 Pseudoscopy correction
We have implemented a system that avoids two of the most common viewpoint-related artefacts on an autostereoscopic 3D display: pseudoscopy and angular-dependent ghosting. The system tracks the observer's eyes using a single camera and controls the 3D display accordingly. The algorithm is initially calibrated, which allows the system to relate the horizontal coordinates of the eyes, as seen by the tracking camera, to the visibility zones of the views. Since the visibility zones are vertical and the camera is positioned on top, the projective distortions related to the distance to the observer are minimal. Based on the horizontal coordinate of the pupil, three tracking zones are defined: the visibility zone of the left view (marked with L in Figure 31), the visibility zone of the right view (marked with R in the same figure), and a zone with high crosstalk (marked with X).

Figure 31, Tracking zones based on the horizontal coordinate of the detected pupil.

Pseudoscopy is avoided by flipping the left and right channels if an eye is detected in the opposite viewing zone. Ghosting artefacts are avoided by turning the parallax barrier off and switching the content to 2D if either of the observer's pupils falls into an X area. The rationale for this rule is that if one eye of the observer perceives excessive crosstalk, stereoscopic perception is not possible, and it is preferable that the observer does not see the ghost artefacts either. More information about the algorithms can be found in [42].

4.1.2 Point-of-view optimization
It is also possible to optimize the image on a 3D display for cases when the observer uses the display at a distance shorter or longer than the optimal one. At the optimal viewing distance, the intended view is seen across the whole surface of the display, as marked with 1 in Figure 32. At a distance closer than the optimal one, the observer sees different visibility zones at the left and right edges of the display, as marked with 2 in Figure 32a. If the distance to the observer is known, the content on the display can be re-rendered accordingly. In the case of a multiview display, the information is shifted between the views: for example, the image along the right edge of the display intended for the central view (marked in red in Figure 32a) can be rendered in the previous view (as shown by the curved arrow). The opposite is done along the left edge.

Figure 32, Distance-based content optimization: a) re-routing of views for an observation distance shorter than the optimal one, b) example re-routing table for a multiview display, c) example re-routing table for a stereoscopic display (36 cells, in rows reading F X 0 0 X F).

This procedure can be expressed as a re-routing table, which optimizes the image for a given observation distance. The re-routing table should be recalculated whenever the distance to the observer changes. In the case of a multiview display, pixels intended for a certain view are re-routed to other views. An example multiview re-routing table is given in Figure 32b. In this table, 2 means that the pixels are re-routed two views to the right: information from view 1 goes to view 3, information from view 2 goes to view 4, and so on. In the same table, -2 means the opposite process: information from view 4 goes to view 2, and so on. The surface of the display is separated into 36 sub-sections, and the number in each sub-section specifies which operation is to be performed in the corresponding area of the display. In the case of a stereoscopic display, the re-routing table looks like the one given in Figure 32c. In this table, 0 means that the pixels in the corresponding area are left unaltered. The pixels in the F areas should be flipped, effectively swapping the pixels intended for the left and right views. The areas marked with X would be perceived with excessive crosstalk, because for these areas the observer is between the viewing zones of the left and right views. In the X areas, a monoscopic image should be shown by copying all pixels from one view to the other. In order to measure the distance to the observer, eye and face tracking is performed by two cameras simultaneously. For more information on the algorithm, the reader can refer to [43].
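The whole-display rule of the pseudoscopy correction and the per-area stereoscopic re-routing table use the same three operations: keep, flip, and fall back to a monoscopic image. The sketch below illustrates both under stated assumptions. The function names are hypothetical, the table is taken as six rows of F X 0 0 X F, the mono fallback copies the left view into the right one, and the behaviour when both eyes fall into the same zone is not specified in the text, so it is left as plain 3D here.

```python
import numpy as np

# Stereoscopic re-routing table, assumed to be six rows of "F X 0 0 X F".
TABLE_STEREO = [list("FX00XF") for _ in range(6)]

def viewer_mode(left_eye_zone, right_eye_zone):
    """Whole-display decision from the tracking zones ('L', 'R' or 'X')."""
    if "X" in (left_eye_zone, right_eye_zone):
        return "2D"    # barrier off, monoscopic content
    if left_eye_zone == "R" and right_eye_zone == "L":
        return "flip"  # swap channels to undo pseudoscopy
    return "3D"        # includes cases the text does not specify

def apply_rerouting(left, right, table):
    """Apply a stereoscopic re-routing table before interleaving.

    left, right: arrays of shape (H, W) or (H, W, C).
    table: 2D grid of codes: '0' keep, 'F' swap views, 'X' mono.
    """
    left_out, right_out = left.copy(), right.copy()
    rows, cols = len(table), len(table[0])
    height, width = left.shape[:2]
    # Block borders; linspace also handles sizes not divisible by the grid.
    ys = np.linspace(0, height, rows + 1).astype(int)
    xs = np.linspace(0, width, cols + 1).astype(int)
    for r in range(rows):
        for c in range(cols):
            block = (slice(ys[r], ys[r + 1]), slice(xs[c], xs[c + 1]))
            code = table[r][c]
            if code == "F":
                left_out[block] = right[block]
                right_out[block] = left[block]
            elif code == "X":
                right_out[block] = left[block]  # same image for both eyes
            elif code != "0":
                raise ValueError(f"unknown re-routing code: {code!r}")
    return left_out, right_out
```

In a tracking loop, the table would be recomputed from the measured distance and `apply_rerouting` called on each stereo pair before interleaving.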
4.2 Antialiasing for stereoscopic displays
In stereoscopic displays, only two views are interleaved, and sub-sampling happens in one direction only. Most autostereoscopic displays have a vertical parallax barrier or a lenticular sheet with vertical lenticules, and their interleaving process involves horizontal sub-sampling. Displays with passive polarized glasses are interleaved row-wise, and interleaving for them needs vertically sub-sampled images. Some mobile displays, for example the lightguide-based displays developed by 3M, use temporal interleaving, and images prepared for them do not need to undergo sub-sampling. Since the decimation is by a factor of two, signal components with frequency higher than half the new sampling rate will be mirrored, causing aliasing. In that case, the optimal antialiasing filter should have cutoff frequency ω_n = 0.5, normalized to the Nyquist frequency of the full-resolution image. However, some people prefer sharper images at the expense of some aliasing. The authors of [44] claim that crosstalk increases the perceptual resolution by allowing a bigger portion of the display pixels to be simultaneously visible. However, this claim holds only for images with no disparity, where the additionally visible pixels are placed close to the ones intended to be seen. For images with disparity, the additional details visible due to crosstalk would appear displaced and might deteriorate the visual quality of the scene. Apart from disparity, the visibility of aliasing depends on the content (whether the aliased signal affects important information) and on the contrast sensitivity of the HVS (whether the aliased signal is visible or masked by other image content).

In order to assess the perceptually optimal cutoff frequency for the antialiasing filter, we performed a small-scale subjective test, in which the observers had to choose the most pleasant image on different displays. We prepared a test image containing text in various font sizes, as shown in Figure 33a. From the image we prepared multiple copies, each filtered with a smoothing filter with a different cutoff frequency. The filter is designed using the Kaiser windowing technique, with length N=10, attenuation parameter a=26 dB, and cutoff ω_n varied in uniform steps between 0.4 and 0.9. The filter is applied in the direction of the decimation (along rows for column-interleaved images, and along columns for row-interleaved images). Each filtered image is interleaved with two disparities, d=0 and d=5, which results in the patch being perceived at different apparent depths. The images with different filters and apparent depths were visualised on 6 different displays, and the observers were asked to select the most readable image for each disparity on each display. The results are presented in Table 4.1. In Figure 33b we present a plot of the perceptually optimal cutoff (on the ordinate) versus the perceived crosstalk (on the abscissa). Unfortunately, no global dependency can be derived from the plot. One surprising outcome is that the preferred cutoff frequency for d=0 is not consistently higher than the one for d=5.
Even if the subjectively optimal ω_n cannot be predicted from the perceived crosstalk, the approach allows a set of optimal filters to be derived for a given stereoscopic display by running subjective experiments with a larger set of images and disparity values.

Table 4.1 Subjectively optimal ω_n for different displays. Columns: display model, ω_n for d=0, ω_n for d=5. Rows: HDDP, ACER(OG), ACER(RG), SHARP, V3D, DS.
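The preparation of the filtered test images in Section 4.2 can be sketched as follows. This is an illustrative windowed-sinc design, not the code used in the experiments. It assumes that N=10 is the number of filter taps, that the attenuation is converted to the Kaiser window parameter with Kaiser's standard empirical formula, and that ω_n is normalized so that 1.0 corresponds to the Nyquist frequency. The function names are hypothetical.

```python
import numpy as np

def kaiser_beta(atten_db):
    """Kaiser's empirical mapping from stop-band attenuation (dB) to beta."""
    if atten_db > 50:
        return 0.1102 * (atten_db - 8.7)
    if atten_db >= 21:
        return 0.5842 * (atten_db - 21) ** 0.4 + 0.07886 * (atten_db - 21)
    return 0.0

def kaiser_lowpass(num_taps=10, cutoff=0.5, atten_db=26.0):
    """Kaiser-windowed sinc low-pass FIR filter.

    cutoff is normalized so that 1.0 is the Nyquist frequency.
    """
    m = np.arange(num_taps) - (num_taps - 1) / 2.0
    window = np.kaiser(num_taps, kaiser_beta(atten_db))
    h = cutoff * np.sinc(cutoff * m) * window
    return h / h.sum()  # unit gain at DC

def filter_and_decimate(image, h, axis=1):
    """Filter along `axis` and keep every second sample.

    axis=1 filters along rows (column-interleaved displays),
    axis=0 filters along columns (row-interleaved displays).
    Each row or column is assumed to be longer than the filter.
    """
    smoothed = np.apply_along_axis(
        lambda v: np.convolve(v, h, mode="same"), axis, image.astype(float))
    index = [slice(None)] * image.ndim
    index[axis] = slice(None, None, 2)
    return smoothed[tuple(index)]
```

A set of test images could then be produced by calling `kaiser_lowpass` with cutoffs between 0.4 and 0.9. Note that a filter with an even number of taps introduces a half-sample shift, which this sketch does not compensate.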

Figure 33, Selecting the subjectively optimal cutoff frequency: a) test image, b) results (perceptually optimal cutoff versus perceived crosstalk, in %, for d=0 and d=5).

4.3 Antialiasing for multiview displays
We have developed an antialiasing framework for multiview displays. The framework allows the user to specify the desired percentage of visible distortions, and does the necessary processing to keep the distortions within the selected limit, taking into account the display passband for different disparity values. It consists of two modules, shown in [27]: an offline processing module, in which the display is measured, and a real-time processing module, which filters the input image according to its apparent depth and the selected distortion limits.

The design of the antialiasing filters starts with finding the passband of the display for different disparity levels. Then, each passband is approximated by a rectangle. The width and the height of the rectangle define the horizontal and vertical passbands of a separable filter that should be applied for that disparity range. The output of the calculations is stored in two tables. One table contains the height of the equivalent passband for various disparity values and levels of distortion, and the other contains the corresponding width of the passband.

The real-time processing module uses these two tables to design the optimal filter for the input image. The disparity value is used to select the corresponding column in each passband table. The user can set the desired distortion level. We refer to this parameter as 3D-sharpness, since it controls the tradeoff between the visibility of details and the visibility of Moiré artefacts. The value of 3D-sharpness is used to select the corresponding row of each table. Together, the row and column selections identify one cell in each table. The values in the selected cells give the desired vertical and horizontal cutoff frequencies of the antialiasing filter, which are used to design the filter. The filter is then applied to the input image before 3D warping and interleaving. Such a filter mitigates the aliasing artefacts for the given disparity level and provides the desired level of 3D-sharpness. For additional information on the algorithm, the reader can refer to [27].
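The table look-up and separable filtering performed by the real-time module can be sketched as below. All numbers in the two passband tables are made up for illustration, since the measured tables are not given in this section. The nearest-entry look-up and the Hamming-windowed sinc design are assumptions chosen for brevity, and the function names are hypothetical.

```python
import numpy as np

# Hypothetical results of the offline module (illustrative values only).
DISPARITIES = np.array([-10, -5, 0, 5, 10])   # table columns, in pixels
SHARPNESS_LEVELS = np.array([0.0, 0.5, 1.0])  # table rows, 3D-sharpness
H_PASSBAND = np.array([[0.20, 0.30, 0.45, 0.30, 0.20],
                       [0.25, 0.38, 0.55, 0.38, 0.25],
                       [0.30, 0.45, 0.65, 0.45, 0.30]])
V_PASSBAND = np.array([[0.30, 0.40, 0.50, 0.40, 0.30],
                       [0.35, 0.48, 0.60, 0.48, 0.35],
                       [0.40, 0.55, 0.70, 0.55, 0.40]])

def select_cutoffs(disparity, sharpness):
    """Pick the table cell: disparity selects the column, sharpness the row."""
    col = int(np.abs(DISPARITIES - disparity).argmin())
    row = int(np.abs(SHARPNESS_LEVELS - sharpness).argmin())
    return float(H_PASSBAND[row, col]), float(V_PASSBAND[row, col])

def lowpass(cutoff, num_taps=11):
    """Hamming-windowed sinc low-pass filter (assumed design method).

    cutoff is normalized so that 1.0 is the Nyquist frequency.
    """
    m = np.arange(num_taps) - (num_taps - 1) / 2.0
    h = cutoff * np.sinc(cutoff * m) * np.hamming(num_taps)
    return h / h.sum()

def antialias(image, disparity, sharpness):
    """Separable filtering of the input image before warping and interleaving."""
    horizontal_cutoff, vertical_cutoff = select_cutoffs(disparity, sharpness)
    hx, hy = lowpass(horizontal_cutoff), lowpass(vertical_cutoff)
    out = np.apply_along_axis(np.convolve, 1, image.astype(float), hx, mode="same")
    out = np.apply_along_axis(np.convolve, 0, out, hy, mode="same")
    return out
```

A higher 3D-sharpness setting selects wider passbands, which keeps more detail at the cost of more visible Moiré artefacts.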

Figure 34, Antialiasing framework for multiview displays. The offline processing module derives the interleaving pattern, measures the angular visibility, prepares test images, derives the passband and approximates the equivalent passband. The real-time processing module uses the disparity and the 3D-sharpness setting to obtain the horizontal and vertical cutoff frequencies, designs the filter, and filters the input image before 3D warping, interleaving and display.

References
[1] LG launches Optimus 3D, Optimus Pad, CNET reviews, available online at http://reviews.cnet.com/8301-13970_7-20031740-78.html
[2] Berkel, V. and Clarke, J., Characterisation and optimisation of 3D-LCD module design, Proc. SPIE 2653 (1997).
[3] Konrad, J. and Agniel, P., Subsampling models and anti-alias filters for 3-D automultiscopic displays, IEEE Trans. Image Process. 15 (2006).
[4] Saveljev, V., Son, J.-Y., Javidi, B., Kim, S.-K. and Kim, D., Moiré Minimization Condition in Three-Dimensional Image Displays, J. Display Technol. 1 (2005).
[5] Salmimaa, M. and Jarvenpaa, T., Optical characterization of autostereoscopic 3-D displays, J. Soc. Inf. Display 16 (2008).
[6] Boher, P., Leroux, T., Bignon, T. and Collomb-Patton, V., A new way to characterize auto-stereoscopic 3D displays using Fourier optics instrument, Proc. of SPIE 7237, 72370Z (2009).
[7] Hakkinen, J., Takatalo, J., Kilpelainen, M., Salmimaa, M. and Nyman, G., Determining limits to avoid double vision in an autostereoscopic display: Disparity and image element width, J. Soc. Inf. Display 17 (2009).
[8] Kooi, F. and Toet, A., Visual comfort of binocular and 3D displays, Displays 25 (2-3) (2004).
[9] Pastoor, S., Human factors of 3D images: Results of recent research at Heinrich-Hertz-Institut Berlin, Proceedings of IDW 95 (1995).
[10] Pastoor, S., 3D displays, in (Schreer, Kauff, Sikora, eds.) 3D Video Communication, Wiley.
[11] Dodgson, N., Autostereoscopic 3D Displays, Computer 38(8) (2005).
