Joint Example-based Depth Map Super-Resolution

Yanjie Li 1, Tianfan Xue 2,3, Lifeng Sun 1, Jianzhuang Liu 2,3,4
1 Information Science and Technology Department, Tsinghua University, Beijing, China
2 Department of Information Engineering, The Chinese University of Hong Kong
3 Shenzhen Key Lab for CVPR, Shenzhen Institutes of Advanced Technology, China
4 Media Lab, Huawei Technologies Co. Ltd., China
carol.lyj@gmail.com, xtf009@ie.cuhk.edu.hk, sunlf@tsinghua.edu.cn, jzliu@ie.cuhk.edu.hk

Abstract

The fast development of time-of-flight (ToF) cameras in recent years enables the capture of high frame-rate 3D depth maps of moving objects. However, the resolution of a depth map captured by a ToF camera is rather limited, so it cannot be used directly to build a high quality 3D model. To handle this problem, we propose a novel joint example-based depth map super-resolution method, which converts a low resolution depth map to a high resolution depth map, using a registered high resolution color image as a reference. Unlike previous depth map SR methods, which have no training stage, we learn a mapping function from a set of training samples and enhance the resolution of the depth map via a sparse coding algorithm. We further use a reconstruction constraint to make object edges sharper. Experimental results show that our method outperforms state-of-the-art methods for depth map super-resolution.

Keywords: Depth map; Super-resolution (SR); Registered color image; Sparse representation

I. INTRODUCTION

A depth map, representing the relative distance of each object point to the video camera, is widely used in 3D object representation. Current depth sensors for capturing depth maps can be grouped into two categories: passive sensors and active sensors. Passive sensors, such as a stereo vision camera system, are time-consuming and inaccurate in textureless or occluded regions. In contrast, active sensors generate more accurate results; the two most popular active sensors are laser scanners and time-of-flight (ToF) sensors.
Laser scanners, despite the high quality depth maps they generate, are limited to applications in static environments, as they can only measure a single point at a time. Compared with these sensors, ToF sensors are much cheaper and can capture depth maps of fast moving objects, and they have drawn more and more interest in recent years [1], [2]. However, a depth map captured by a ToF sensor has very low resolution. For example, the resolution of the PMD CamCube 3.0 is 200 × 200 and that of the MESA SR 4000 is 176 × 144. To improve the quality, a post-processing step is needed to enhance the resolution of the depth map [3], [4], which is called depth map super-resolution (SR) in the literature.

Figure 1. Depth map super-resolution. (a) Raw depth map captured by a ToF camera. (b) Corresponding registered color image. (c) Recovered high resolution depth map.

Some previous SR approaches [3] recover a high resolution depth map from multiple depth maps of the same static object (taken from slightly displaced viewpoints). For the situation with only one depth map captured from a single viewpoint, most state-of-the-art methods focus on recovering a high resolution depth map from the low resolution depth map with the help of a registered high resolution color image, as shown in Figure 1. A common approach is to apply a joint bilateral filter with color information to raise the resolution [4]. This approach can obtain a depth map with sharper boundaries. However, since the joint bilateral filter has no training stage, it is sensitive to noise in the color image, and the recovered depth map often contains false edges. Other algorithms, such as detecting edges of the registered high resolution color image to direct SR [5] and utilizing color values to calculate weights for SR to achieve sharp boundaries [6], follow a similar principle and share the problems of the bilateral filter method.

Recently there has been rapid development in example-based 2D image SR.
In this approach, the algorithm learns the fine details that correspond to different low resolution image patches from a set of low resolution and high resolution training pairs, then uses the learned correspondence to predict the details of a low resolution test image [7]. Sun et al. propose a Bayesian approach to sharpen the boundaries of the interpolation result by inferring high resolution patches from input low resolution patches based on primal sketch priors [8]. Chang et al. propose to learn a mapping function from low resolution patches to high resolution ones using locally linear embedding (LLE) [9]. Glasner et al. propose to collect training patch pairs directly from the single low resolution input image instead of a pre-collected data set [10]. Yang et al. propose an example-based 2D image SR approach using sparse signal representation [11], and Dong et al. extend this work using multiple sets of bases learned from the training data set to adapt to the content variation across different images [12]. However, all these example-based methods take a single image as input and do not fit our application well, where the input includes a depth map and a registered 2D color image.

In this paper, we propose a novel joint example-based SR method to enhance the resolution of a depth map captured by a ToF camera. Unlike traditional example-based SR methods, which only utilize a single 2D image as input, our joint example-based SR uses both a low resolution depth map and a registered color image to recover a high resolution depth map. We design a mapping function from a low resolution depth patch and a color patch to a high resolution depth patch, according to their sparse representations on three related dictionaries. Furthermore, we use a reconstruction constraint to ensure the accuracy of reconstruction and make object edges sharper. Experimental results demonstrate that our SR method obtains a high resolution depth map with clearer boundaries and fewer false edges than state-of-the-art methods.

Figure 2. Pipeline of example-based SR. (Stages shown: interpolation of the low resolution depth map $I_l$ to $H_{int}$; edge extraction and texture edge removal on the registered color map $I_c$; feature extraction; the mapping $H = r(L, C)$; reconstruction of the high resolution depth map $I_h$.)

II. DEPTH MAP SR

In our work, depth map SR reconstructs a high resolution depth map $I_h$ from a low resolution depth map $I_l$ and a high resolution color image $I_c$. First, the reconstructed high resolution depth map $I_h$ should be consistent with the low resolution depth map $I_l$:

$$I_l = D I_h \qquad (1)$$

where $D$ is a downsampling operator.
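As a rough illustration of the consistency constraint (1), the sketch below checks how well a candidate high resolution depth map agrees with the low resolution input. The paper does not specify the form of the downsampling operator at this point, so block averaging is used here purely as an assumption, and the names `downsample` and `consistency_error` are hypothetical.

```python
import numpy as np

def downsample(depth_hr, factor=2):
    """Assumed form of D: average each factor-by-factor block of pixels."""
    h, w = depth_hr.shape
    h, w = h - h % factor, w - w % factor  # crop so blocks tile the image exactly
    blocks = depth_hr[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def consistency_error(depth_lr, depth_hr, factor=2):
    """Root-mean-square residual of the constraint I_l = D I_h."""
    residual = depth_lr - downsample(depth_hr, factor)
    return float(np.sqrt(np.mean(residual ** 2)))

rng = np.random.default_rng(0)
I_h = rng.uniform(0.0, 255.0, size=(8, 8))       # toy high resolution depth map
I_l = downsample(I_h, factor=2)                  # consistent low resolution map
err_consistent = consistency_error(I_l, I_h)
err_shifted = consistency_error(I_l, I_h + 5.0)  # every depth value off by 5
```

A reconstruction that satisfies (1) gives a residual of zero, while a uniformly shifted depth map is penalized by the size of the shift.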
Furthermore, as the reconstructed high resolution depth map corresponds to the registered color image $I_c$ at the pixel level, there is a correlation between the two images $I_c$ and $I_h$. Based on the analysis above, we model the correlation between $I_l$, $I_c$ and $I_h$ as a mapping function $r(\cdot,\cdot)$ from a low resolution depth patch and its corresponding color patch to a high resolution depth patch:

$$H = r(L, C) \qquad (2)$$

where $L$ is a patch in $I_l$, $C$ is the corresponding color patch in $I_c$, and $H$ is the recovered high resolution depth patch in $I_h$. In this paper, we define the mapping function $r(\cdot,\cdot)$ using the sparse representation of patches over a pre-trained dictionary, which will be discussed in Section II-B.

A. Feature Representation for Patches

To enhance robustness, we do not use the raw depth map patches and color patches as input for (2). Instead, we extract features from them and use these features to represent the raw patches, as shown in Figure 2. For a low resolution depth patch $L_{raw}$, we use the first and second derivatives of its bicubic interpolation result to form a feature vector representing this patch:

$$L = \left[ \frac{\partial}{\partial x} H_{int},\; \frac{\partial}{\partial y} H_{int},\; \frac{\partial^2}{\partial x^2} H_{int},\; \frac{\partial^2}{\partial y^2} H_{int} \right] \qquad (3)$$

where $H_{int}$ is the bicubic interpolation result of $L_{raw}$. According to [11], the first and second derivatives of patches better reflect the similarity between patches.

For a color patch $C_{raw}$, we use its edge map as the feature. Since some edges in the color image are caused by the texture of the object and do not correspond to edges in the depth map, we need to remove them to strengthen the correlation between high resolution depth patches and color patches. Therefore, we first upsample the low resolution depth patch using bicubic interpolation and extract the edges of both the color image and the upsampled depth image.
Then the pixel-wise product of these two edge maps is used as the feature representing a color patch, which efficiently removes the texture edges of the color image:

$$C = (\nabla C_{raw}) \odot (\nabla H_{int}) \qquad (4)$$

where $\nabla$ is the edge extraction operator, $\odot$ is the pixel-wise product, $H_{int}$ is the bicubic interpolation result of $L_{raw}$, and $C_{raw}$ is the color patch.

The feature representing a high resolution depth patch $H_{raw}$ is:

$$H = H_{raw} - \mathrm{mean}(H_{raw}) \qquad (5)$$

where $\mathrm{mean}(H_{raw})$ is the vector with all its elements equal to the mean depth value of $H_{raw}$. In the practical SR procedure, $\mathrm{mean}(H_{raw})$ is unknown before reconstruction, so we use $\mathrm{mean}(H_{int})$ in its place. As $H_{int}$ and $H_{raw}$ share a similar mean depth value, this replacement is reasonable.

B. Mapping Function via Sparse Representation

The mapping function $r(\cdot,\cdot)$ in (2) is defined using the sparse representation of the low resolution depth patch and the color patch. We first co-train three dictionaries: $D_h$ consists of high resolution patches, $D_l$ consists of low resolution patches, and $D_c$ consists of color patches. Note that the patches in $D_l$, $D_c$ and $D_h$ correspond to one another, i.e., the $i$-th low resolution patch in $D_l$ corresponds to the $i$-th high resolution patch in $D_h$ and the $i$-th color patch in $D_c$. The details of training are introduced in Section II-C.

Then, for each new input low resolution patch $L$ and its corresponding color patch $C$, we find the output high resolution patch $H$ as follows. We first find a sparse representation of these two patches on the dictionaries $D_l$ and $D_c$, respectively. Here we force $L$ and $C$ to have the same sparse coefficients on $D_l$ and $D_c$. The high resolution patch is then recovered using the same coefficients. The mapping function is defined as $H = r(L, C) = D_h \alpha$, where:

$$\alpha = \arg\min_{\|\alpha\|_0 \le \epsilon} \left\{ \lambda_l \|D_l \alpha - L\|_2^2 + \lambda_c \|D_c \alpha - C\|_2^2 + f(H, L) \right\} \qquad (6)$$

where $\alpha$ is the coefficient vector, each of $D_h$, $D_l$ and $D_c$ is a matrix with each prototype patch as a column vector, $\lambda_l$ and $\lambda_c$ are two balance parameters, and $\|\cdot\|_0$ denotes the $\ell_0$-norm. $f$ is a constraint function that enforces the reconstruction constraint defined in (1). The detailed definition of $f(H, L)$ is given in Section II-D.

We enforce the sparsity constraint $\|\alpha\|_0 \le \epsilon$ for two reasons. First, with the sparsity constraint, it is reasonable to reconstruct $H$ using the coefficients $\alpha$, which are the linear coefficients for representing $L$ (or $C$) using patches in $D_l$ (or $D_c$). Second, as discussed in [13], if a high resolution patch $H$ can be represented as a sufficiently sparse linear combination of patches in $D_h$, it can be perfectly recovered from a low resolution patch.

As in previous work on sparse representation, we replace the $\ell_0$-norm by the $\ell_1$-norm in (6) for computational efficiency. As discussed in [13], the $\ell_1$-norm constraint still ensures the sparsity of the coefficients $\alpha$. Ignoring $f$ for the moment, (6) becomes:

$$\alpha = \arg\min_{\|\alpha\|_1 \le \epsilon} \left\{ \lambda_l \|D_l \alpha - L\|_2^2 + \lambda_c \|D_c \alpha - C\|_2^2 \right\} \qquad (7)$$

C. Dictionary Training

The dictionaries $D_l$, $D_c$ and $D_h$ are trained from a set of corresponding low resolution depth patches, high resolution depth patches (ground truth) and color patches.
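Each training triple is built from the patch features of Section II-A. The sketch below shows one way such a triple of feature vectors could be computed with NumPy. It rests on several assumptions that are not taken from the paper: finite differences (`np.gradient`) stand in for the derivative filters, the gradient magnitude stands in for the edge extraction operator, the color patch is already grayscale, and the bicubic interpolation has been done elsewhere. All function names are hypothetical.

```python
import numpy as np

def gradient_features(h_int):
    """Eq. (3): first and second derivatives of the interpolated depth patch."""
    dy, dx = np.gradient(h_int)        # np.gradient returns the axis-0 (y) term first
    dyy = np.gradient(dy, axis=0)
    dxx = np.gradient(dx, axis=1)
    return np.concatenate([dx.ravel(), dy.ravel(), dxx.ravel(), dyy.ravel()])

def edge_map(img):
    """Assumed edge operator: gradient magnitude."""
    dy, dx = np.gradient(img)
    return np.hypot(dx, dy)

def color_features(c_gray, h_int):
    """Eq. (4): pixel-wise product of color edges and depth edges."""
    return (edge_map(c_gray) * edge_map(h_int)).ravel()

def depth_target(h_raw):
    """Eq. (5): high resolution depth patch with its mean removed."""
    return (h_raw - h_raw.mean()).ravel()

# Toy 8 x 8 patches: a vertical depth edge between columns 3 and 4.
h_int = np.full((8, 8), 10.0)
h_int[:, 4:] = 50.0
c_gray = np.zeros((8, 8))
c_gray[:, 4:] = 1.0      # color edge that coincides with the depth edge
c_gray[2, :] += 0.5      # texture edge with no counterpart in the depth patch

L_feat = gradient_features(h_int)
C_feat = color_features(c_gray, h_int).reshape(8, 8)
H_feat = depth_target(h_int)
```

In this toy case the horizontal texture stripe produces color edges, but the product in (4) leaves nonzero values only in the two columns next to the depth edge, which is the texture edge removal the text describes.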
The training minimizes the following estimation error:

$$E = \|r(L, C) - H\|_2^2 \qquad (8)$$

Combining (7) and (8), we have:

$$\min_{D_l, D_c, D_h, \alpha} \; \lambda_l \|D_l \alpha - L\|_2^2 + \lambda_c \|D_c \alpha - C\|_2^2 + \|D_h \alpha - H\|_2^2 \quad \text{subject to } \|\alpha\|_1 \le \epsilon \qquad (9)$$

In our work, we use about $10^5$ patches for training, and after training each dictionary contains only 1024 patches. We use a dictionary size much smaller than the number of training samples for robustness and efficiency. The dictionary training formulation (9) is common in sparse representation and can be solved using an iterative optimization method [14], [11].

D. Mapping Function with a Reconstruction Constraint

The reconstruction constraint defined in (1) is important for SR. Without this constraint, the downsampled version of the recovered high resolution depth map is not guaranteed to be close to the input low resolution depth map, and serious artifacts appear when the mapping function fails to find the correct high resolution patch.

Directly combining the reconstruction constraint (1) with the mapping function defined in the previous section is not easy. So we first apply an upsampling operator $U$ to both sides of (1), resulting in:

$$U I_l = U D I_h \approx I_h \qquad (10)$$

The simplest way of upsampling is bicubic interpolation. However, there is an obvious blurring effect on the object boundaries in the depth map $\mathbf{H}_{int}$ obtained from the interpolation. To remove this effect, we apply the joint bilateral filter proposed in [4], which generates clearer object boundaries than $\mathbf{H}_{int}$:

$$\mathbf{H}_b(x) = \frac{1}{Z} \sum_{x' \in N(x)} e^{-\frac{\|x - x'\|_2^2}{\theta_s}} \, e^{-\frac{\|C(x) - C(x')\|_2^2}{\theta_c}} \, \mathbf{H}_{int}(x') \qquad (11)$$

where $Z$ is the normalization factor, $N(x)$ is a neighborhood of $x$, $C(x)$ is the RGB color vector of the pixel at position $x$ in the registered color image $I_c$, and $\mathbf{H}_{int}(x)$ is the depth value of the pixel at position $x$ in $\mathbf{H}_{int}$. After filtering, pixels with similar colors tend to have similar depth values. Therefore, the filtering result $\mathbf{H}_b$ normally has sharper boundaries than the interpolation result $\mathbf{H}_{int}$. We then use $\mathbf{H}_b = U I_l \approx I_h$ as the reconstruction constraint. Let $H_b$ be a patch of $\mathbf{H}_b$.
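The filter in (11) can be sketched directly as a double loop over pixels. This is a minimal, unoptimized illustration rather than the implementation of [4]: the window radius and the values of `theta_s` and `theta_c` are arbitrary assumptions, and the function name is hypothetical.

```python
import numpy as np

def joint_bilateral_filter(h_int, color, radius=2, theta_s=4.0, theta_c=100.0):
    """Eq. (11): average h_int with weights from spatial and color distances."""
    height, width = h_int.shape
    out = np.zeros((height, width), dtype=float)
    for y in range(height):
        for x in range(width):
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            yy, xx = np.mgrid[y0:y1, x0:x1]
            spatial = ((yy - y) ** 2 + (xx - x) ** 2) / theta_s
            diff = color[y0:y1, x0:x1].astype(float) - color[y, x].astype(float)
            color_dist = np.sum(diff ** 2, axis=-1) / theta_c
            weights = np.exp(-spatial - color_dist)
            # Dividing by the weight sum plays the role of the factor 1/Z.
            out[y, x] = np.sum(weights * h_int[y0:y1, x0:x1]) / np.sum(weights)
    return out

# Toy input: a blurred depth ramp across a sharp color edge at column 4.
row = np.array([10.0, 10.0, 10.0, 20.0, 40.0, 50.0, 50.0, 50.0])
h_int = np.tile(row, (6, 1))
color = np.zeros((6, 8, 3))
color[:, 4:, :] = 255.0
H_b = joint_bilateral_filter(h_int, color)
jump_before = h_int[0, 4] - h_int[0, 3]
jump_after = H_b[0, 4] - H_b[0, 3]
```

Because the color difference across the edge makes the cross-edge weights vanish, each side is averaged only with pixels of its own color, and the depth jump at the boundary becomes larger than in the interpolated input.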
Then the recovered high resolution depth patch $H$ in $I_h$ should be as close to $H_b$ as possible. We use the $\ell_2$-norm to model this reconstruction constraint $f$ and add it to the mapping function (7): $H = D_h \alpha$, where:

$$\alpha = \arg\min_{\|\alpha\|_1 \le \epsilon} \left\{ \lambda_l \|D_l \alpha - L\|_2^2 + \lambda_c \|D_c \alpha - C\|_2^2 + \lambda_r \|D_h \alpha - H_b\|_2^2 \right\} \qquad (12)$$

where $\lambda_r$ is the weight of the reconstruction constraint. (12) is a LASSO linear regression problem and can be solved efficiently by [15]. In the experiments, we simply set all the weighting parameters $\lambda_l$, $\lambda_c$ and $\lambda_r$ to 1.

E. Proposed Joint Example-based Depth Map SR

Using the final mapping function (12), the proposed example-based depth map SR algorithm can be summarized as Algorithm 1. To remove blocking effects, we divide the low resolution depth map into overlapping patches and obtain the high resolution patch for each using the mapping function (12). We then combine these patches into a whole high resolution depth map by averaging the depth values over the overlapping regions.
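To make the structure of (12) concrete, note that its three quadratic terms can be stacked into a single least-squares term, which turns the problem into an ordinary LASSO. The sketch below solves the penalized (Lagrangian) form with plain ISTA, which is a substitute chosen for brevity and not the solver of [15]; the paper states the constrained form $\|\alpha\|_1 \le \epsilon$. The penalty weight `mu`, the iteration count, the toy dictionary sizes and the function names are all assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def solve_joint_sparse_code(D_l, D_c, D_h, L, C, H_b,
                            lam_l=1.0, lam_c=1.0, lam_r=1.0,
                            mu=0.1, n_iter=500):
    """ISTA on ||A a - b||_2^2 + mu * ||a||_1 with the terms of (12) stacked."""
    A = np.vstack([np.sqrt(lam_l) * D_l,
                   np.sqrt(lam_c) * D_c,
                   np.sqrt(lam_r) * D_h])
    b = np.concatenate([np.sqrt(lam_l) * L,
                        np.sqrt(lam_c) * C,
                        np.sqrt(lam_r) * H_b])
    # Step size = 1 / Lipschitz constant of the gradient of ||A a - b||_2^2.
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    alpha = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ alpha - b)
        alpha = soft_threshold(alpha - step * grad, step * mu)
    return alpha, D_h @ alpha          # coefficients and H = D_h alpha

# Toy problem: small random dictionaries and a 3-sparse ground-truth code.
rng = np.random.default_rng(0)
n_atoms = 20
D_l = rng.standard_normal((32, n_atoms))
D_c = rng.standard_normal((16, n_atoms))
D_h = rng.standard_normal((16, n_atoms))
alpha_true = np.zeros(n_atoms)
alpha_true[[2, 7, 11]] = [1.5, -2.0, 1.0]
L, C, H_true = D_l @ alpha_true, D_c @ alpha_true, D_h @ alpha_true
alpha, H = solve_joint_sparse_code(D_l, D_c, D_h, L, C, H_b=H_true)
rel_err = np.linalg.norm(H - H_true) / np.linalg.norm(H_true)
```

In the toy run the filtered patch `H_b` is simply set to the true patch. With real data it would come from the joint bilateral filter, and the dictionaries would be the trained, overcomplete ones rather than small random matrices.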

Algorithm 1 Proposed example-based depth map SR
Input: A low resolution depth map $I_l$, the corresponding color image $I_c$, and dictionaries $D_l$, $D_c$ and $D_h$
1) Upsample $I_l$ to the same resolution as $I_c$, and apply the bilateral filter (11) to it to get the filtered image $\mathbf{H}_b$
2) for each patch $L$ in $I_l$
3)    Get the corresponding color patch $C$ and depth patch $H_b$ from $I_c$ and $\mathbf{H}_b$, respectively
4)    Get the high resolution depth patch $H$ from $L$, $C$ and $H_b$ by solving the optimization problem (12)
5) end for
6) Recover the whole high resolution depth map by combining all patches $H$ obtained in step 4

III. EXPERIMENTS AND RESULTS

A. Comparison with other approaches

We collect 34 pairs of color images and depth maps from 4 videos of Philips 3DTV. The low resolution depth maps for training are obtained from the high resolution depth maps by Gaussian blurring and downsampling. In the training stage, we only train dictionaries that increase the resolution of a depth map by a factor of 2. Examples of training patches are shown in Figure 3. For larger magnifying factors, such as 8, we obtain the high resolution depth map by applying the SR three times. For each training patch triple (low and high resolution depth patches and a color patch), we randomly select a 4 × 4 patch from the low resolution depth maps and its corresponding 8 × 8 patches from the high resolution depth maps and color images. We extract 100,000 patch triples from the 34 groups of images to train the dictionaries $D_l$, $D_h$ and $D_c$, each of which contains 1024 patches after training.

To evaluate the performance of our algorithm, we compare it with five other SR methods. Three of them use no registered color image information: bicubic interpolation (Bic), bilateral filtering using only depth information (D-B), and the sparse representation method proposed by Yang et al. in [11] (Sc-Y).
The other two methods are state-of-the-art methods designed specially for depth map SR, which take the registered color image information into account: joint bilateral filtering as defined in [4] (J-B), and the algorithm that uses the registered color image to direct the calculation of interpolation weights as in [6] (C-W). From another angle, among the five methods, Sc-Y is an example-based algorithm, just like our method. The same training set, dictionary size, patch size and overlap size are used in both our algorithm and Sc-Y. The other four methods do not have a training stage.

The testing set consists of 13 images randomly selected from 6 videos. Four of these videos are from the Philips 3DTV website, like the training set (Girl, Football, Dandelion and Frisbee), and two are from the Microsoft Research Asia website (Ballet and Breakdancers).

Figure 3. Examples of training patches. The first row shows color patches, the second row shows the corresponding low resolution depth patches, and the third row shows the corresponding high resolution depth patches.

We first evaluate the performance of these six algorithms with a magnifying factor of 8. A quantitative result is given in Table I. It shows that our algorithm achieves the best results on all the testing sequences. The RMSE also shows that the example-based algorithms (our algorithm and Sc-Y) outperform the non-training methods, and that the methods taking the registered color image information into account (J-B and C-W) perform better than the ones using only the low resolution depth map (D-B and Bic).

Table I
ROOT MEAN SQUARE ERROR (RMSE) FOR SIX TESTING VIDEOS

Test video     Average RMSE
               Ours    Sc-Y    J-B     C-W     D-B     Bic
Girl           6.5     7.03    7.3     9.4     7.35    7.36
Football       4.94    5.7     6.5     7.1     6.37    6.39
Dandelion      5.09    5.49    6.18    5.91    6.3     6.3
Frisbee        8.4     8.84    9.37    10.89   9.46    9.47
Ballet         8.83    9.63    9.98    9.4     10.16   10.5
Breakdancer    5.44    5.68    5.7     6.79    5.74    5.76

To further evaluate these six methods, we compare their visual performance. Figure 4 shows examples from one MSRA video (Ballet) and one Philips video (Frisbee).
We can see that the depth maps recovered by our algorithm clearly outperform those of Bic, D-B, J-B and Sc-Y, with sharper boundaries and smoother surfaces, which can easily be seen in the enlarged parts. Although the depth map obtained by C-W has sharper boundaries than ours, it contains obvious artifacts (marked by circles) and its RMSE is much higher than ours (see Table I). This is because C-W is not robust, and a small amount of noise in the color image may greatly affect the upsampled depth map. For instance, the woman's hand in Ballet is clearly incorrect with C-W because its color is very similar to that of the wooden bar behind her, and C-W mainly uses the background depth value to fill the unknown pixels. Another example is the dandelion, which contains many errors because its lines are tiny and easily affected by the surrounding background.

In conclusion, applying machine learning to depth map SR can improve its performance significantly, and taking the registered color image information into consideration can further improve its accuracy. Also, our algorithm is robust to errors in the color image, which are a major problem for some state-of-the-art depth map SR methods.
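The paper does not spell out how the RMSE values in Table I are computed. Assuming the standard definition over all pixels of a recovered depth map and its ground truth, a minimal sketch could look as follows; the function name `rmse` and the toy arrays are hypothetical.

```python
import numpy as np

def rmse(depth_est, depth_gt):
    """Root mean square error between two depth maps of equal shape."""
    est = np.asarray(depth_est, dtype=np.float64)
    gt = np.asarray(depth_gt, dtype=np.float64)
    if est.shape != gt.shape:
        raise ValueError("depth maps must have the same shape")
    return float(np.sqrt(np.mean((est - gt) ** 2)))

# Toy 8-bit depth maps: half of the 16 pixels are off by 3 depth levels.
gt = np.zeros((4, 4), dtype=np.uint8)
est = gt.copy()
est[:, :2] = 3
score = rmse(est, gt)
```

Casting to floating point before subtracting matters for 8-bit depth maps, since unsigned integer subtraction would otherwise wrap around instead of producing negative differences.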

Figure 4. Visual comparison between different algorithms for (a) Ballet (left) and (b) Frisbee (right). Panels: registered color image, ground truth, Bic, D-B, Sc-Y [11], J-B [4], C-W [6] and our algorithm. Because a depth map has little texture and its quality is mainly judged by the quality of object boundaries, we enlarge parts with many elaborate boundaries to show them more clearly.

B. Different magnifying factors

Based on the analysis and experiments above, we use Sc-Y to represent previous example-based algorithms and J-B to represent algorithms using a registered color image. We then further test the performance of four algorithms (our algorithm, Sc-Y, J-B and Bic) with different magnifying factors. Figure 5 shows the magnifying factor-RMSE curves for the sequences Football and Frisbee. Our algorithm outperforms J-B and Bic under all factors. Our algorithm performs similarly to Sc-Y under a magnifying factor of 2. However, as the magnifying factor increases, our algorithm performs better than Sc-Y. Figure 6 shows the visual comparison between the algorithms under two magnifying factors, 8 and 16. The visual quality of the depth maps obtained by Sc-Y and J-B is seriously affected as the magnifying factor increases, while the depth maps obtained by our method remain clear. This demonstrates that our algorithm performs well even for a large magnifying factor. This advantage is significant for practical applications, since a depth map captured by a ToF camera has very low resolution and has to be reconstructed at high resolution for 3D modeling or 3DTV representation.

Another point worth mentioning is that the training patches are all taken from four synthesized videos, similar to the Frisbee image in Figure 4, but the testing set contains both synthesized and real videos, such as the Ballet image in Figure 4.

Figure 5. Magnifying factor-RMSE curves for Football and Frisbee. (Curves: Ours, Sc-Y [11], J-B [4], Bic. Horizontal axis: magnifying factor. Vertical axis: RMSE.)

Figure 6. Comparison on Girl when the magnifying factor is equal to 8 and 16. (a) Magnifying factor is 8. (b) Magnifying factor is 16. Panels: registered color image, ground truth, Ours, Sc-Y [11], J-B [4].

IV. CONCLUSION

In this paper, we have proposed a joint example-based depth map super-resolution method, using a registered high resolution color image as a reference. Previous example-based SR methods only use a single low resolution image as input and do not fit our application well, where the input includes a depth map and a registered 2D color image. We propose to learn a mapping function from both a patch in the low resolution depth map and a patch in the color image to a patch in the high resolution depth map. We also utilize a high resolution depth map reconstructed by the joint bilateral filter as a reconstruction constraint, which helps generate sharp object edges. Our experiments have shown that our algorithm outperforms state-of-the-art algorithms.

V. ACKNOWLEDGEMENT

This work is supported by the Development Plan of China (973) under Grant No. 011CB3006, the National Natural Science Foundation of China under Grant No. 60833009/60933013/6097509/61070148, and Science, Industry, Trade, and Information Technology Commission of Shenzhen Municipality, China (JC00903180635A, JC0100570378A, ZYC01006130313A).

REFERENCES

[1] D. Chan, "Noise vs. feature: Probabilistic denoising of time-of-flight range data," Technical report, Stanford University, 2008.
[2] A. Kolb, E. Barth, R. Koch, and R. Larsen, "Time-of-flight sensors in computer graphics," Eurographics State of the Art Reports, pp. 119-134, 2009.
[3] S. Schuon, C. Theobalt, J. Davis, and S. Thrun, "LidarBoost: Depth superresolution for ToF 3D shape scanning," CVPR, 2009.
[4] Q. Yang, R. Yang, J. Davis, and D. Nister, "Spatial-depth super resolution for range images," CVPR, 2007.
[5] E. Ekmekcioglu, M. Mrak, S. Worrall, and A. Kondoz, "Utilisation of edge adaptive upsampling in compression of depth map videos for enhanced free-viewpoint rendering," ICIP, pp. 733-736, 2009.
[6] Y. Li and L. Sun, "A novel upsampling scheme for depth map compression in 3DTV system," Picture Coding Symposium, pp. 186-189, 2010.
[7] W. Freeman, T. Jones, and E. Pasztor, "Example-based super-resolution," IEEE Computer Graphics and Applications, pp. 56-65, 2002.
[8] J. Sun, N. Zheng, H. Tao, and H. Shum, "Image hallucination with primal sketch priors," CVPR, 2003.
[9] H. Chang, D. Yeung, and Y. Xiong, "Super-resolution through neighbor embedding," CVPR, 2004.
[10] D. Glasner, S. Bagon, and M. Irani, "Super-resolution from a single image," ICCV, pp. 349-356, 2009.
[11] J. Yang, J. Wright, T. Huang, and Y. Ma, "Image super-resolution via sparse representation," IEEE Trans. Image Processing, vol. 19, no. 11, pp. 2861-2873, 2010.
[12] W. Dong, L. Zhang, G. Shi, and X. Wu, "Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization," IEEE Trans. Image Processing, 2011.
[13] D. Donoho, "For most large underdetermined systems of linear equations the minimal l1-norm solution is also the sparsest solution," Communications on Pure and Applied Mathematics, vol. 59, no. 6, pp. 797-829, 2006.
[14] H. Lee, A. Battle, R. Raina, and A. Ng, "Efficient sparse coding algorithms," NIPS, 2007.
[15] R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B, vol. 58, no. 1, pp. 267-288, 1996.