Ping Tan. Simon Fraser University

Photos vs. Videos (live photos): A good photo tells a story. Stories are better told in videos.

Videos in the Mobile Era (mobile & share): More videos are captured by mobile devices. Sales of compact cameras have fallen. 300 hours of video are uploaded to YouTube every minute. Cisco predicts that video will account for 70% of internet traffic by 2017.

Challenges in the Mobile Era. Mobile: how to produce professional videos with mobile devices? Share: how to create exciting content? Examples: the SynthCam app by Marc Levoy; [Yu and Gallup 2014]

Computational Videography Enhance Video Quality Stabilization Enable Advanced Photography Video defog and stereo TrackCam Auto Fence Removal

Pipeline of Video Stabilization: Camera Motion Estimation (feature tracking between frames t and t + 1); Camera Path Smoothing (camera path to smoothed camera path); Re-rendering (by Image Warping)

Digital Video Stabilization. Video stabilization techniques can be categorized as: 2D method [Matsushita et al. PAMI 2006; Grundmann et al. CVPR 2011]; 3D method [Liu et al. SIGGRAPH 2009; Liu et al. CVPR 2012; Zhou et al. CVPR 2013]; 2.5D method [Liu et al. TOG 2011; Goldstein and Fattal, TOG 2012]. Relevant rolling shutter correction techniques: [Baker et al. CVPR 2010; Karpenko et al. Stanford Tech Report. 2011; Grundmann et al. ICCP 2012]. Popular commercial solutions: iMovie (Apple); YouTube Stabilizer (Google); Movie Maker (Microsoft); After Effects CS6 (Adobe)

Challenges in Consumer Videos 1. Large depth variation 2. Quick camera motion (rotation, zooming) 3. Large moving objects 4. Strong rolling shutter effects

Common Artifacts in Stabilized Videos (input vs. previous method) 1. Not stable enough (previous method: VirtualDub stabilizer) 2. Geometry distortion (previous method: YouTube) 3. Cropping (previous method: Adobe After Effects)

Our Goals. To address the challenges in consumer videos: 1. Large depth variation 2. Quick camera motion 3. Large moving foreground 4. Rolling shutter. By our novel techniques in: motion model & estimation; adaptive path smoothing

Contributions (along the pipeline: Camera Motion Estimation, Camera Path Smoothing, Re-rendering by Image Warping)
Shuaicheng Liu, Lu Yuan, Ping Tan, Jian Sun. SteadyFlow: Spatially Smooth Optical Flow for Video Stabilization. IEEE CVPR 2014
Shuaicheng Liu, Lu Yuan, Ping Tan, Jian Sun. Bundled camera paths for video stabilization. ACM SIGGRAPH 2013
Shuaicheng Liu, Yinting Wang, Lu Yuan, Jiajun Bu, Ping Tan, Jian Sun. Video Stabilization with a depth camera. IEEE CVPR 2012

camera motion estimation. 3D method [Liu et al. SIGGRAPH 2009]: spatially-variant motion; time-consuming; depends on fragile 3D reconstruction (structure from motion)

camera motion estimation. 2D method [Matsushita et al. PAMI 2006; Grundmann et al. CVPR 2011]: robust; efficient; homogeneous planar motion

camera motion estimation. 2.5D method [Liu et al. TOG 2011; Goldstein and Fattal, TOG 2012]: replaces 3D reconstruction with 2D feature tracking; spatially-variant motion; feature tracking is fragile to quick camera motion (e.g. tracking when rotating)

camera motion estimation. Our Solution: novel flexible 2D motion model; needs feature correspondence between only 2 frames; spatially-variant motion

our mesh-based motion model (figure: conventional single homography vs. our mesh-based motion model). Divide the video frame into a 2D regular grid mesh. Estimate a homography in each cell (now, spatially-variant motion)

our mesh-based motion model (figure: frame t; frame t+1, warped from frame t). Two challenges: maintain continuity in motion estimation; estimate motion at textureless cells (e.g. in sky)

our mesh-based motion model. Our solution (frame t to frame t + 1): parameterize the motion by the translations at mesh grid points; estimate all translations by an as-similar-as-possible warping [Igarashi et al. 2005; Liu et al. SIGGRAPH 2009]; estimate the homography at each cell from its four grid points

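model estimation. Data term: a feature $p$ in frame t and its match $\hat{p}$ in frame t+1 should have the same local bilinear coordinates: $E_d(\hat{V}) = \sum_p \| \hat{V}_p w_p - \hat{p} \|^2$, where $p = V_p w_p$ ($V_p$: the four grid vertices enclosing $p$; $\hat{V}_p$: the same vertices after warping; $w_p$: the bilinear weights of $p$)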
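The data term can be illustrated for a single axis-aligned cell. This is only a sketch: the function names, the corner ordering (top-left, top-right, bottom-left, bottom-right), and the toy numbers are assumptions made for illustration, not part of the slides.

```python
def bilinear_weights(p, top_left, cell_w, cell_h):
    """Weights of the cell corners (top-left, top-right, bottom-left, bottom-right)."""
    u = (p[0] - top_left[0]) / cell_w
    v = (p[1] - top_left[1]) / cell_h
    return [(1 - u) * (1 - v), u * (1 - v), (1 - u) * v, u * v]


def interpolate(corners, weights):
    """Point that has the given bilinear weights relative to four corners."""
    x = sum(w * c[0] for w, c in zip(weights, corners))
    y = sum(w * c[1] for w, c in zip(weights, corners))
    return (x, y)


def data_residual(p, p_hat, corners, warped_corners):
    """Squared distance between the warped feature and its match p_hat."""
    cell_w = corners[1][0] - corners[0][0]
    cell_h = corners[2][1] - corners[0][1]
    weights = bilinear_weights(p, corners[0], cell_w, cell_h)
    qx, qy = interpolate(warped_corners, weights)
    return (qx - p_hat[0]) ** 2 + (qy - p_hat[1]) ** 2


# Hypothetical cell in frame t and the same cell after a pure translation.
corners = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
warped = [(x + 3.0, y - 1.0) for x, y in corners]
weights = bilinear_weights((2.5, 5.0), corners[0], 10.0, 10.0)
residual = data_residual((2.5, 5.0), (5.5, 4.0), corners, warped)
```

Because the feature keeps its weights while the corners move, the residual vanishes exactly when the warped mesh carries the feature onto its match.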

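model estimation. Smooth term: the warp of each cell should be close to a similarity: $E_s(\hat{V}) = \sum_{\hat{v}} \| \hat{v} - \hat{v}_1 - s R_{90} (\hat{v}_0 - \hat{v}_1) \|^2$, with $R_{90} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ and $s = \|v - v_1\| / \|v_0 - v_1\|$ measured on the original mesh ($v, v_0, v_1$: the vertices of a triangle in the original cell; $\hat{v}, \hat{v}_0, \hat{v}_1$: the same vertices after warping)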
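A small numeric check of the similarity idea, assuming the triangle-based form used in content-preserving warps (one vertex expressed through the other two, a 90-degree rotation, and an edge-length ratio). The helper names and the example triangle are hypothetical.

```python
import math


def rot90(d):
    """Apply R90 = [[0, 1], [-1, 0]] to a 2D vector."""
    return (d[1], -d[0])


def edge_ratio(v, v0, v1):
    """Scale s = |v - v1| / |v0 - v1|, measured on the original triangle."""
    return math.hypot(v[0] - v1[0], v[1] - v1[1]) / math.hypot(v0[0] - v1[0], v0[1] - v1[1])


def similarity_residual(v, v0, v1, s):
    """|| v - v1 - s * R90 (v0 - v1) ||^2 for one (warped) triangle."""
    rx, ry = rot90((v0[0] - v1[0], v0[1] - v1[1]))
    return (v[0] - v1[0] - s * rx) ** 2 + (v[1] - v1[1] - s * ry) ** 2


def similarity(pt, angle, scale, shift):
    """Rotate, uniformly scale, then translate a point."""
    c, s = scale * math.cos(angle), scale * math.sin(angle)
    return (c * pt[0] - s * pt[1] + shift[0], s * pt[0] + c * pt[1] + shift[1])


# Original triangle (v, v0, v1) with a right angle at v1.
tri = [(0.0, -2.0), (1.0, 0.0), (0.0, 0.0)]
s = edge_ratio(*tri)

moved = [similarity(q, 0.5, 1.5, (4.0, -3.0)) for q in tri]   # a similarity warp
sheared = [(x + 0.5 * y, y) for x, y in tri]                  # not a similarity

res_moved = similarity_residual(moved[0], moved[1], moved[2], s)
res_sheared = similarity_residual(sheared[0], sheared[1], sheared[2], s)
```

The residual stays at zero under rotation, uniform scaling, and translation, and becomes positive once the triangle is sheared, which is what makes it usable as a shape-preserving penalty.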

comparison with global homography frame t frame t+1 Single homography [Matsushita et al. PAMI 2006] Our method

comparison with global homography (plot: error vs. frame index, for single homography and mesh-based homography)

comparison with global homography Stabilized with a global homography Stabilized with our method

comparison to [Grundmann et al. ICCP 2012] frame t frame t+1 Gaussian smoothness Homography array [Grundmann et al. ICCP 2012] Our method

comparison to [Grundmann et al. ICCP 2012] (plot: error vs. frame index, for homography array and our method)

comparison to [Grundmann et al. ICCP 2012] Stabilized by Youtube.com Stabilized with our method

camera path smoothing. Prior approaches: low-pass filtering [Morimoto and Chellappa, ICASSP 1999; Matsushita et al. PAMI 2006]; polynomial curves [Chen et al. CG Forum 2008]; piece-wise smoothing [Gleicher and Liu, Multimedia 2007]; L1-norm optimization [Grundmann et al. CVPR 2011]

camera path: the homographies between neighboring frames (t, t + 1, t + 2, ...) are chained into the camera path
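One way to read this slide is that the path is the running product of the per-frame homographies. The sketch below assumes the convention C(0) = I and C(t) = C(t-1) F(t-1); that convention, and all names, are assumptions for illustration.

```python
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def camera_path(frame_homographies):
    """Chain inter-frame homographies F(0), F(1), ... into C(0), C(1), ..."""
    path = [IDENTITY]
    for f in frame_homographies:
        path.append(matmul3(path[-1], f))
    return path


def translation(dx, dy):
    """Homography for a pure translation, used here as toy inter-frame motion."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]


path = camera_path([translation(1.0, 0.0), translation(2.0, -1.0)])
```

In the bundled setting each mesh cell would keep its own chain of this kind, giving one path per cell.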

bundled camera paths (figure: camera paths labeled 1 to 4)

adaptive smoothing: low-pass smoothing vs. our adaptive smoothing (figure labels: input camera path, rapid panning, jitters, distortion)

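smooth a single path. Adaptive smoothing by minimizing: $O(\{P(t)\}) = \sum_t \left( \|P(t) - C(t)\|^2 + \lambda_t \sum_{r \in \Omega_t} \omega_{t,r}(C) \, \|P(t) - P(r)\|^2 \right)$, where $C(t)$ is the original path and $P(t)$ the smoothed path. Data term: close to original path. Smoothness term: bilateral weight $\omega_{t,r} = G_t(\|r - t\|) \cdot G_m(\|C(r) - C(t)\|)$, with a temporal part $G_t$ and a range part $G_m$.

smooth a single path. Iteratively optimized (according to the Jacobi iterative solver) by $P^{(\xi+1)}(t) = \frac{1}{\gamma} C(t) + \sum_{r \in \Omega_t, r \neq t} \frac{2 \lambda_t \omega_{t,r}}{\gamma} P^{(\xi)}(r)$, where $\gamma = 1 + 2 \lambda_t \sum_{r \in \Omega_t, r \neq t} \omega_{t,r}$. Initialized as $P^{(0)}(t) = C(t)$.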
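A minimal sketch of Jacobi-style adaptive smoothing on a scalar path. It assumes an energy with a data term plus bilateral-weighted pairwise differences, a single constant weight `lam`, and Gaussian temporal and range kernels; the parameter names and default values are illustrative only, and real camera paths are homographies rather than scalars.

```python
import math


def gaussian(x, sigma):
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


def smooth_path(C, lam=5.0, radius=3, sigma_t=2.0, sigma_m=1.0, iters=20):
    """Jacobi iterations for a bilateral-weighted path smoothing energy."""
    n = len(C)
    # Bilateral weights are computed once from the original path C.
    W = [{} for _ in range(n)]
    for t in range(n):
        for r in range(max(0, t - radius), min(n, t + radius + 1)):
            if r != t:
                W[t][r] = gaussian(r - t, sigma_t) * gaussian(C[r] - C[t], sigma_m)
    P = list(C)  # initialize with the original path
    for _ in range(iters):
        updated = []
        for t in range(n):
            gamma = 1.0 + 2.0 * lam * sum(W[t].values())
            acc = C[t] + 2.0 * lam * sum(w * P[r] for r, w in W[t].items())
            updated.append(acc / gamma)  # uses only values from the previous sweep
        P = updated
    return P


def roughness(path):
    """Sum of squared first differences, a simple jitter measure."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))


flat = smooth_path([2.0] * 10)
jitter = [float(i % 2) for i in range(12)]
smoothed = smooth_path(jitter)
```

The range kernel lowers the weight between samples whose original values differ strongly, which is what lets fast intentional motion survive while small jitter is averaged away.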

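smooth bundled paths: minimize $\sum_i O(\{P_i(t)\}) + \sum_t \sum_{j \in N(i)} \|P_i(t) - P_j(t)\|^2$, where $P_i(t)$ is the smoothed path of cell $i$, $O$ is the single-path adaptive smoothing objective, and $N(i)$ are the neighboring cells of cell $i$. First term: local adaptive path smoothing. Second term: spatial smoothness.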

re-rendering: each input video frame (frame t-1, frame t, ...) is warped to the corresponding stabilized video frame (frame t-1, frame t, ...)

Video Results

spatial smoothness: without spatial constraint vs. with spatial constraint

local path smoothing: low-pass local path smoothing vs. adaptive local path smoothing

Computational Efficiency. CPU: Intel i7 3.2GHz Quad-Core, RAM: 8 GB. 400 ~ 600 SURF features / frame. 720P video (resolution: 1280 x 720): 392 ms / frame (~2.5 fps), of which: extract feature (300 ms); estimate motion (50 ms); render frame (30 ms); smooth paths (12 ms)

Video Stabilization Pipeline. The bundled path method prefers smaller grid size. What if we use 1x1 grid size? This leads to an optical flow based motion model. How to smooth the flow fields?

A Naïve Method: obtain feature trajectories from optical flow; smooth feature trajectories. Trajectories have irregular shape (which complicates smoothing). Sub-space constraint between trajectories [Liu et al. 2011]. (figure: a feature trajectory)

Feature Trajectories vs Pixel Profiles. A pixel profile: motion vectors at the same pixel location over time. (figure: a feature trajectory; a pixel profile) Pixel profiles are regular (which simplifies smoothing). Different profiles can be smoothed independently. Shuaicheng Liu, Lu Yuan, Ping Tan, Jian Sun. SteadyFlow: Spatially Smooth Optical Flow for Video Stabilization. IEEE CVPR 2014
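The difference between the two can be sketched on toy flow fields. Here `flows[t][y][x]` is assumed to hold the `(dx, dy)` motion vector of pixel `(x, y)` from frame t to t + 1; that layout, the nearest-pixel lookup, and the function names are assumptions for illustration.

```python
def pixel_profile(flows, x, y):
    """Motion vectors collected at one fixed pixel location over time."""
    return [flow[y][x] for flow in flows]


def feature_trajectory(flows, x, y):
    """Positions of a point that is carried along by the flow."""
    pos = (float(x), float(y))
    trajectory = [pos]
    for flow in flows:
        h, w = len(flow), len(flow[0])
        xi = min(max(int(round(pos[0])), 0), w - 1)  # nearest pixel, clamped
        yi = min(max(int(round(pos[1])), 0), h - 1)
        dx, dy = flow[yi][xi]
        pos = (pos[0] + dx, pos[1] + dy)
        trajectory.append(pos)
    return trajectory


def make_flow(w, h, fn):
    """Build a w x h flow field from a function of pixel coordinates."""
    return [[fn(x, y) for x in range(w)] for y in range(h)]


flows = [
    make_flow(4, 4, lambda x, y: (1.0, 0.0)),
    make_flow(4, 4, lambda x, y: (0.0, 1.0) if x >= 1 else (0.0, 0.0)),
    make_flow(4, 4, lambda x, y: (1.0, 0.0)),
]
profile = pixel_profile(flows, 0, 0)
trajectory = feature_trajectory(flows, 0, 0)
```

Every pixel profile has the same length and sits on the regular pixel grid, whereas trajectories start, drift, and end at arbitrary places, which is the regularity argument made on the slide.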

Statistics of Pixel Profiles Scene without motion discontinuity feature trajectory pixel profile

Statistics of Pixel Profiles Scene with motion discontinuity feature trajectory pixel profile

Inpaint Discontinuous Motions input frame optical flow steady-flow motion completion (see paper for more details)

Smoothing Pixel Profiles. Smooth each pixel profile individually by minimizing an energy with a data term (close to original path) and a smoothness term with a bilateral weight (range and temporal parts)

Video Results

Computational Videography Enhance Video Quality Stabilization Enable Advanced Photography Video defog and stereo TrackCam Auto Fence Removal

What are Tracking Shots?

How to Take Tracking Shots?

Our Solution for Tracking Shots

3D Method. Pipeline stages: Input video; Object segmentation (Adobe After Effects RotoBrush); 3D scene reconstruction; 3D foreground motion; Blur kernels; Result. Figure labels: foreground motion trajectory, virtual cameras. Main challenge: recover 3D trajectory of the moving foreground

trajectory triangulation. Challenge: from a static point to a moving point? (figure: static 3D point; dynamic 3D point)

trajectory triangulation Input: Camera pose (computed according to the static background) A 2D position of foreground object at each frame Output: A 3D position of the foreground object

algebraic error constraint: the 3D point $X_i$ is projected to $x_i$ by the camera $P_i$, i.e. $[x_i]_\times P_i X_i = 0$, which gives $E_{alg} = \sum_i \| [x_i]_\times P_i X_i \|^2$
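A projection constraint of this kind is commonly measured with the cross product between the observed homogeneous image point and the projected 3D point. The sketch below uses that standard form as an assumption; the exact expression on the slide may differ, and the camera matrix and points are toy values.

```python
def project(P, X):
    """Apply a 3x4 camera matrix P to a homogeneous 3D point X."""
    return tuple(sum(P[i][k] * X[k] for k in range(4)) for i in range(3))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def algebraic_error(x, P, X):
    """Squared norm of x cross (P X); zero when X projects exactly onto x."""
    c = cross(x, project(P, X))
    return sum(ci * ci for ci in c)


def total_algebraic_error(observations, cameras, points):
    """Sum of per-frame errors for one 3D position per frame."""
    return sum(algebraic_error(x, P, X) for x, P, X in zip(observations, cameras, points))


P = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]  # toy camera
X = (2.0, 4.0, 2.0, 1.0)
on_target = algebraic_error((1.0, 2.0, 1.0), P, X)
off_target = algebraic_error((1.5, 2.0, 1.0), P, X)
```

With one observation per frame this constraint alone leaves the depth of a moving point undetermined, which is why the slides add motion and perspective constraints.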

linear motion constraint: all 3D points roughly form a line [Avidan & Shashua 2000]: $[L] X_i = 0$, where $L$ is a 6D vector (Plücker line representation) and $[L]$ is a 4 × 4 matrix obtained by rearranging the elements of $L$

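constant velocity/acceleration constraint: the foreground object has near constant velocity/acceleration. With $X_i$ the 3D position at frame $i$: $E_{vel} = \sum_i \| X_{i-1} - 2 X_i + X_{i+1} \|^2$ and $E_{acc} = \sum_i \| X_{i+2} - 3 X_{i+1} + 3 X_i - X_{i-1} \|^2$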
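Near-constant velocity corresponds to small second differences of the per-frame positions, and near-constant acceleration to small third differences. The sketch below assumes that finite-difference reading; the function names and sample trajectories are illustrative.

```python
def constant_velocity_energy(points):
    """Sum of squared second differences of a list of 3D positions."""
    total = 0.0
    for i in range(1, len(points) - 1):
        total += sum((a - 2.0 * b + c) ** 2
                     for a, b, c in zip(points[i - 1], points[i], points[i + 1]))
    return total


def constant_acceleration_energy(points):
    """Sum of squared third differences of a list of 3D positions."""
    total = 0.0
    for i in range(1, len(points) - 2):
        total += sum((d - 3.0 * c + 3.0 * b - a) ** 2
                     for a, b, c, d in zip(points[i - 1], points[i],
                                           points[i + 1], points[i + 2]))
    return total


# Uniform motion, and motion with constant acceleration along x.
uniform = [(1.0 + 0.5 * i, 2.0, 3.0 - 0.25 * i) for i in range(6)]
accelerating = [(0.5 * i * i, 0.0, float(i)) for i in range(6)]

e_vel_uniform = constant_velocity_energy(uniform)
e_vel_accel = constant_velocity_energy(accelerating)
e_acc_accel = constant_acceleration_energy(accelerating)
```

Uniform motion is penalized by neither term, while accelerating motion is penalized only by the velocity term, so the two act as progressively weaker priors on the trajectory.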

perspective constraint: the foreground's apparent size is proportional to its inverse depth. $d_i = P_i(3,:) X_i$ is $X_i$'s depth [Hartley & Zisserman 2003]; $S_i$ is the foreground's pixel count; $P_i(3,:)$ is the 3rd row of $P_i$

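final formulation: energy minimization over the constraints above (algebraic error, linear motion, constant velocity/acceleration, perspective); the unknowns are estimated iteratively; applied to overlapping sub-sequences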

3D results

trajectory evaluation without and with perspective constraint without and with constant velocity without and with linear motion

Pseudo 3D Method. Same pipeline (Input video; Object segmentation; 3D scene reconstruction; 3D foreground motion; Blur kernels; Result), with two substitutions: 3D scene reconstruction becomes hallucinate background 3D; 3D foreground motion becomes hallucinate foreground 3D

hallucinate background 3D. Principles: faraway points have smaller disparity. Algorithm: 1. remove camera rotation by stabilization 2. turn feature disparity to depth directly (depth inversely proportional to disparity)

hallucinate foreground 3D. Principles: faraway object appears smaller. Algorithm: turn object size s to depth directly: depth = γ/s

merge foreground and background. Hallucinated foreground & background are in different scales (background: depth inversely proportional to disparity; foreground: depth = γ/s). We fix the background scale and adjust γ interactively.
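The two hallucination rules can be sketched as simple inverse relations. The background scale `beta` is a hypothetical name (the slide's own symbol did not survive), the default values are arbitrary, and the units are relative rather than metric.

```python
def background_depth(disparity, beta=1.0):
    """Pseudo depth of a background feature: smaller disparity means farther away."""
    if disparity <= 0.0:
        raise ValueError("disparity must be positive")
    return beta / disparity


def foreground_depth(size, gamma=1.0):
    """Pseudo depth of the foreground object: smaller apparent size means farther away."""
    if size <= 0.0:
        raise ValueError("size must be positive")
    return gamma / size


# Keep beta fixed and vary gamma to move the foreground relative to the background.
bg = [background_depth(d, beta=10.0) for d in (5.0, 2.0, 1.0)]
fg_near = foreground_depth(4.0, gamma=8.0)
fg_far = foreground_depth(4.0, gamma=16.0)
```

Only the ratio between the two scales matters for how the layers are ordered, which is consistent with fixing one and tuning the other by hand.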

pseudo 3D examples

Evaluation: synthetic examples by Maya; existing commercial tools; a manual tool by user study

synthetic examples Hand-held camera Tracking camera rendered in Maya

synthetic examples (Hand-held Cam vs. Tracking Cam), each compared against the ground truth:
Example 1: 3D method (PSNR = 33.36); Pseudo 3D method (PSNR = 32.28)
Example 2: 3D method (PSNR = 34.12); Pseudo 3D method (PSNR = 31.29)
Example 3: 3D method (PSNR = 31.55); Pseudo 3D method (PSNR = 29.48)
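For reference, PSNR between a result and the ground truth is conventionally computed from the mean squared error. The sketch below assumes 8-bit grayscale images stored as nested lists; the slides do not say how their PSNR values were computed, so this is only the textbook definition.

```python
import math


def psnr(image_a, image_b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared = [(a - b) ** 2
               for row_a, row_b in zip(image_a, image_b)
               for a, b in zip(row_a, row_b)]
    mse = sum(squared) / len(squared)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)


reference = [[10, 20], [30, 40]]
shifted = [[11, 21], [31, 41]]  # every pixel off by one, so the MSE is 1
same = psnr(reference, reference)
off_by_one = psnr(reference, shifted)
```

Higher values mean the rendered tracking shot is closer to the ground truth rendering.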

Photoshop Blur Gallery

the Analog Efex 2

our manual tool 3x fast

User study. 3 subjects, each created 20 tracking shots: A and B used our manual tool; C used our automatic tool (10 by 3D, 10 by pseudo 3D). 30 viewers judged the quality, comparing shots created by A, B, and C.
                    Subject A   Subject B
3D method           61.8%       90.6%
Pseudo 3D method    67.7%       91.2%
The numbers are the percentages of viewers who favored our results

More Results

Summary Enhance Video Quality Stabilization Enable Advanced Photography Video defog and stereo TrackCam Auto Fence Removal