Multi-view 3D Position Estimation of Sports Players


Robbie Vos and Willie Brink
Applied Mathematics, Department of Mathematical Sciences
University of Stellenbosch, South Africa
Email: vosrobbie@gmail.com

Abstract

The problem of estimating the positions of players on a sports field using multiple cameras is considered. A hierarchical particle filter is used to track the players through each video sequence. Position estimation is then performed using multi-view triangulation. We also introduce a feedback loop whereby 3D data can be fed back to the 2D trackers to correct errors. Experiments suggest that the system performs well and yields high accuracy for position estimation, with an average tracking error of about 10 cm.

I. INTRODUCTION

Millions of people around the world follow sports in some form or another. With all the interest that sport gathers, spectators and fans are increasingly looking for up-to-date statistics on all aspects of their game of choice. Many of these statistics are extracted manually while watching the game or from video footage after the game has completed. A system that is able to track players on a field during a game will be able to automatically provide a range of statistics that are relevant for analyzing player and team performances. It will be possible to calculate how much distance players cover in a match as well as in which areas of the field they spend the majority of their time. This can be used to measure the work rate of different players on the field or to compare tactical strategies between teams.

Some work in this field has been done by previous researchers. Khan et al. [1] tracked people moving in a building by comparing the movement of people with camera field of view lines. Cai et al. [2] also looked at tracking people in multiple views. People are detected by segmenting the image into foreground and background, building a model of the background and comparing the current view to the model. Foreground regions are then analyzed to detect human shapes and people are tracked by feature points.
While both of these systems are able to track people in 2D and perform matches between the cameras, they lack some functionality that is required for 3D tracking. The biggest problem is that neither of the two solutions is able to calculate the 3D position of a tracked person. Another problem with the two systems is that they are unable to track people through occlusions.

A full multi-view 3D tracking system was developed by Alahi et al. [3] for tracking players on a basketball court. Using adaptive mixture models a foreground image is extracted for each camera. Foreground silhouettes for each camera are projected onto the ground plane and players are tracked by matching ground plane projections and modelling player behaviour. Another multi-view 3D tracking technique was developed by Xu et al. [4]. A mask of the field is extracted using background modelling techniques, and used to detect players while excluding unwanted regions. Kalman filtering is used for both 2D and 3D tracking.

In our solution motion detection was found to be a fast and effective means of detecting players. In order to track players in each camera view a fast hierarchical approach to the particle filter is chosen due to its robustness to common tracking problems. Finally, player positions are found using multi-view triangulation.

The rest of this paper is structured as follows. Section II explains the camera calibration required for the system to function. In section III 2D player tracking is discussed. Estimation of player positions is considered in section IV and the paper concludes with some results in section V and conclusions in section VI.

II. CALIBRATION METHOD

In order to perform position estimation of players on a field the cameras used need to be calibrated to the world around them. This calibration can be split into two sections: internal and external calibration. During internal calibration the parameters pertaining to the camera itself are calculated.
These parameters remain constant for the camera (the focal length may change, but is assumed to remain constant) and as such may be calculated before the system is deployed. A popular approach to perform this calibration is to use a checker-board pattern where the number and physical size of squares are known. The corners of each of the squares on the checker-board can either be detected automatically or selected by a human. Calibration can then be performed by using the corners as interest points, with a method such as the one developed by Tsai [5].

For external calibration several point correspondences are needed between points with known real-world coordinates and the image coordinates of those points. A minimum of three such point correspondences is required and any points may be used with the restriction that at least three points do not lie on a single line in space. A convenient set of points to use is the set of corners made by intersecting lines on the field. These lines can be accurately detected and the corresponding intersections found to give the corners. To detect the lines, and through them the field corners, the Hough transform is used. Figure 1 illustrates.

Fig. 1. Illustration of the Hough transform for line detection: (a) original image, (b) Sobel edge detection, (c) Hough transform histogram and (d) detected lines.

Once the lines are found the intersection of those lines can be found by turning to homogeneous coordinates. The standard form for lines in $\mathbb{R}^2$ is $ax + by + c = 0$, allowing each line to be represented, up to a non-zero scale factor, by a coefficient vector $l = [a, b, c]^T$. The intersection, $p$, of any two lines $l_1$ and $l_2$ represented this way is simply the cross-product of the two coefficient vectors:

$$ p = l_1 \times l_2. \tag{1} $$

As the real-world coordinates for the corners are known (if the dimensions of the field are known) they make ideal candidates for calibration points. The calibration of the external parameters can then be done using the work presented in [6].

III. TRACKING IN 2D

Tracking players in 3D through the use of multiple cameras first requires that the players be tracked in each of the individual camera sequences. To accomplish this a hierarchical approach to the particle filter is used. In this approach objects are represented using several different descriptors. Descriptors vary from coarse, fast descriptors that may yield many false positives, to fine but slow descriptors for more precise classification. Objects are first compared to the coarse descriptors, and if they are a good match are then passed to the slower second stage. This hierarchical approach allows for much faster execution of the filter while maintaining good results.

Fig. 2. Illustration of rectangle features used as a tracking descriptor.

A. First Stage Descriptors

For first stage descriptors in the hierarchical particle filter the rectangle features of Viola and Jones [7] are used, specifically those depicted in Figure 2. The descriptors act similarly to a mask that is convolved with the image. At each pixel location the intensity values of pixels in the region around the object are added and subtracted.
Figure 2 and the following equation illustrate how the features are calculated using a 9 × 9 block (in our system the block size is closer to 80 × 40),

$$ R(i, j) = \sum_{j=-4}^{4} \left[ \sum_{i=-4}^{-3} I(i, j) + \sum_{i=3}^{4} I(i, j) - \sum_{i=-2}^{2} I(i, j) \right]. \tag{2} $$

These features are chosen as they are very fast to calculate. Using the integral image [7] further speed increases can be made. The integral image is such that each pixel value is the sum of the intensity values of all pixels above and to the left of the current pixel, and allows extremely fast calculation of the sum over an arbitrary block of pixels in the image.

Two rectangle features are calculated for every particle. The dissimilarity measure between the calculated feature and the model feature is taken as the absolute difference between the two. Particle weights for each feature can now be calculated. Each particle has two first-stage weights assigned to it, one for each feature. The weights can be combined either using the average of the two or by multiplying the two together. An average of the two will give particles that have a good match in both features a high final weight; however, particles that have one good and one poor match will still have relatively high scores. Only particles that score a good match in both features are of interest, making the multiplication method better suited to the problem.

B. Second Stage Descriptors

In the second stage of the hierarchical particle filter only those particles that have a contributable weight after the first stage, i.e. those that have a first-stage weight greater than some threshold, are considered. Particles with small weights after the first stage are considered to be very different to the object, and by ignoring them in the slower second stage we gain computational time.

In this stage a histogram of oriented gradients is calculated as a more precise descriptor than the rectangle features used in the first stage. The calculation of a histogram of oriented gradients requires gradient vectors for each pixel in the image.
Discrete derivative operators, such as the Sobel operators, can be used for this purpose and yield two edge images: $E_h$ that highlights horizontal edges and $E_v$ that highlights vertical edges. The magnitude and angle of the gradient vector at each pixel are then calculated as

$$ M(i, j) = \sqrt{E_h(i, j)^2 + E_v(i, j)^2}, \tag{3} $$

$$ G(i, j) = \arctan\left[ E_v(i, j) / E_h(i, j) \right]. \tag{4} $$

Gradients with a magnitude greater than some threshold are then binned into a histogram according to their angles.
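Equations (3) and (4), followed by the thresholded binning, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the system: the function name `orientation_histogram`, the parameters `num_bins` and `mag_threshold`, and the use of `atan2` to resolve the quadrant (so that angles cover the full circle, as the histogram comparison in this section requires) are all hypothetical choices. The edge images are taken as plain nested lists.

```python
import math


def orientation_histogram(e_h, e_v, num_bins=8, mag_threshold=1.0):
    """Bin gradient angles of all pixels whose magnitude exceeds a threshold.

    e_h, e_v: edge images (lists of rows) as produced by, e.g., Sobel operators.
    num_bins and mag_threshold are illustrative parameters, not values from the paper.
    """
    hist = [0] * num_bins
    bin_width = 360.0 / num_bins
    for row_h, row_v in zip(e_h, e_v):
        for g_h, g_v in zip(row_h, row_v):
            magnitude = math.hypot(g_h, g_v)  # equation (3)
            if magnitude <= mag_threshold:
                continue  # weak gradients are left out of the histogram
            # Equation (4); atan2 is an assumption that resolves the quadrant,
            # giving an angle in [0, 360) degrees.
            angle = math.degrees(math.atan2(g_v, g_h)) % 360.0
            hist[int(angle // bin_width) % num_bins] += 1
    return hist


# Four strong gradients pointing into four different quadrants, two weak ones.
e_h = [[1.0, -1.0], [-1.0, 1.0], [0.1, 0.1]]
e_v = [[1.0, 1.0], [-1.0, -1.0], [0.1, 0.1]]
hist = orientation_histogram(e_h, e_v, num_bins=4, mag_threshold=1.0)
```

With four bins of 90 degrees each, every strong gradient in this toy input falls into its own bin and the two weak gradients are discarded.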

The histograms of all the particles need to be compared with that of the model in order to arrive at some dissimilarity value. There are various ways in which the distance between two histograms can be calculated. The city block and Euclidean distances (i.e. the $L_1$ and $L_2$ norms) are fast to compute but do not perform adequately on histograms where the order of the bins carries some meaning. Consider, for example, three histograms $h_1 = [1\ 5\ 1\ 1\ 1\ 1\ 1]$, $h_2 = [3\ 1\ 3\ 1\ 1\ 1\ 1]$ and $h_3 = [1\ 1\ 1\ 1\ 3\ 3\ 1]$. Here $h_1$ and $h_2$ should be considered as being much closer to one another than, say, $h_1$ and $h_3$. However, the Euclidean distance gives $d(h_1, h_2) = d(h_1, h_3) = \sqrt{24}$.

Distances that measure the difference between discrete probability density functions can also be used to compare histograms. Examples include the Kullback-Leibler divergence and the Bhattacharyya distance. These measures, however, also fail for the same reason as the $L_1$ and $L_2$ norms.

There are more indicative measures of the distance between histograms. The earth mover's distance (EMD) [8], for example, regards the histograms as piles of dirt and determines the minimum cost required to turn one into the other (where cost is defined as the amount of dirt times the distance by which it is moved). This optimization problem, although linear, is rather computationally intensive for the purposes of this problem. Cha and Srihari [9] proposed a measure which is related to the EMD but is much faster to calculate. Because gradient orientations range between 0° and 360°, with the endpoints regarded as equal, the modulo distance measure (as explained in full detail in [9]) is used.

Particles that were ignored in the second stage due to low first stage weightings still require a second stage weight for the filter to propagate forward. As there is no distance measure calculated for these particles, they are given a second stage dissimilarity value equal to twice the largest value calculated in the second stage. The second stage weights for all the particles can now be calculated.
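The three example histograms $h_1$, $h_2$ and $h_3$ make the difference between these measures easy to check numerically. The sketch below compares the Euclidean distance with two bin-order-aware distances in the spirit of [9]. It is an illustration only: the function names are hypothetical, and the closed-form "median shift" used for the circular case is an assumption of this sketch (a known way to evaluate a circular earth mover's distance for histograms of equal total mass), not necessarily the algorithm given in [9].

```python
import math
from itertools import accumulate


def euclidean(h_a, h_b):
    """L2 distance; blind to the ordering of the bins."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h_a, h_b)))


def ordinal_distance(h_a, h_b):
    """Cost of moving mass between neighbouring bins on a line.

    Assumes both histograms have the same total mass.
    """
    prefix = accumulate(a - b for a, b in zip(h_a, h_b))
    return sum(abs(p) for p in prefix)


def modulo_distance(h_a, h_b):
    """Same idea on a circle, where the first and last bins are neighbours.

    Sketch assumption: subtracting the median of the prefix sums gives the
    cheapest amount of mass to route across the wrap-around point.
    """
    prefix = sorted(accumulate(a - b for a, b in zip(h_a, h_b)))
    median = prefix[len(prefix) // 2]
    return sum(abs(p - median) for p in prefix)


h1 = [1, 5, 1, 1, 1, 1, 1]
h2 = [3, 1, 3, 1, 1, 1, 1]
h3 = [1, 1, 1, 1, 3, 3, 1]

same_l2 = math.isclose(euclidean(h1, h2), euclidean(h1, h3))  # True
near = modulo_distance(h1, h2)  # 4
far = modulo_distance(h1, h3)   # 12
```

The Euclidean distance cannot separate the two pairs, while the circular distance ranks $h_2$ as three times closer to $h_1$ than $h_3$ is, which matches the intuition described above.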
C. Filter Output and Updating the Filter Model

Once all the first and second stage weights have been calculated a filter output can be obtained. The first and second stage weights for each particle are multiplied together, after which all the weights are normalized to produce a final weight $w_i$ for each particle $p_i$. A weighted average of all the particles is taken to find the filter output $X$:

$$ X(x, y) = \sum_{i=0}^{n} w_i \, p_i(x, y). \tag{5} $$

The model that is being tracked must now be updated for the next iteration of the filter. After finding $X$ the first and second stage descriptors are calculated around that point, and those descriptors are used for the next iteration of the filter.

The next section looks at combining the data from the 2D trackers to estimate 3D position and track players in real-world coordinates.

IV. 3D TRIANGULATION

Once players are successfully tracked in 2D it becomes possible to estimate and track their 3D positions. To accomplish 3D tracking the data from the 2D tracking is required for each player in each view. This 3D data can then be fed back to the 2D trackers to check and possibly correct errors in the 2D tracking. This forms a feedback loop as illustrated in Figure 3. The rest of this section details the processes of combining the various views, triangulating the player positions and the feedback loop.

Fig. 3. Tracking feedback loop: 2D data is used to calculate 3D points which are fed back to the 2D trackers for error correction.

A. Matching Players Between Views

Before triangulation of player positions is possible it is necessary to match the players between views. Several options exist when trying to accomplish this, such as shape, colour and position. Shape and colour methods operate by quantifying the shape and/or colour aspects of the person being tracked using, for example, edge or colour histograms. These histograms can be compared to histograms of players being tracked in different views, and should a match be found they are assumed to be the same player in the different views. These methods can fail, however, when applied to the problem of tracking sports players.
Colour methods are ineffective as players on the same team will all be wearing similar clothing. Attempting to match players in this situation will result in multiple matches, making it impossible to know which is the correct match. Shape matching on the other hand fails as the player shape may vary drastically when viewed from different angles.

Position matching estimates the position of the player on the field from each view individually. This estimation may not be highly accurate, but it does allow one to identify clusters of estimated points. These points may indicate the presence of a player on the field and the corresponding projections of the players in the 2D views.

The single view estimation begins by finding the line in 3D passing through the tracked point on the image plane. Once this line has been found the intersection between the line and a plane some distance above the field is calculated (according to [10] the average height of a male in South Africa is 168 cm, indicating that a plane 84 cm above the playing field should be chosen when tracking the center of the player, while the average height of a female in South Africa is 158 cm, indicating a plane 79 cm above the playing field).

After the intersection points have been calculated they can be compared to intersection points from different views. If clusters of intersection points are found close to one another then a match is made between the different views. If two or more intersection points from a single view are located close to each other, i.e. during an occlusion, that location cannot be assigned with a high level of certainty and it is ignored until the two tracked players move away from one another. Also note that once a match has been made, that match remains for the rest of the program execution and the matching step does not need to be repeated.

Another advantage of this matching method is that the calculation of the 3D lines is also required for the triangulation step. This has the effect that the matching step does not greatly increase computational time.

B. Triangulating Player Positions

Once all the players are tracked in each of the video sequences the positions of players can be triangulated on the field. Two options exist for triangulating from multiple views: (i) pairwise, back-projection error minimization; and (ii) multi-view, forward-projection error minimization.

In the first case the object is triangulated for each possible pair of cameras using a back-projection error minimization technique. This will give $\tfrac{1}{2}n(n-1)$ solutions when the object is visible by $n$ cameras. These solutions may then be combined to get a final point by taking the average or least-squares of the set of points. The second option triangulates a single point using all views in a single step, by minimizing the forward projection error rather than the backward projection error.

Testing of the two techniques produced similar results, causing a decision between the two to hinge on computational speed.
In this respect the multi-view triangulation is superior to pairwise triangulation due to the fact that multi-view triangulation increases linearly in computational complexity with the number of cameras while pairwise triangulation is of order $n^2$.

Triangulation from multiple views presents new challenges, but also some benefits above two-view triangulation. On the one hand multiple views provide more information, allowing for more accurate triangulation. On the other hand it is harder to combine the data in a computationally inexpensive manner while keeping a high degree of accuracy.

We try to minimize the projection error $e_p$. The first step in minimizing $e_p$ is to find the projection of the sought 3D point on each of the lines $l_i$. Each of the lines $l_i$ can be written as $l_i = p_i + k n_i$ where $p_i$ is a point on the line and $n_i$ is a unit vector in the direction of the line. To solve for each of the lines one begins with the camera equation:

$$ x = KR[I \mid -C]X = KRX - KRC \tag{6} $$

which can be rewritten as

$$ X = R^T K^{-1} x + C. \tag{7} $$

Using the camera center, $(0, 0, 0)^T$, and the point on the image plane through which the line is to be drawn, $(x, y, 1)^T$, and substituting for $x$ in equation (7), two points, $q_1$ and $q_2$, in the real-world coordinate system emerge that both lie on the desired line. It is now possible to solve for $p_i$ and $n_i$:

$$ p_i = q_1, \qquad n_i = \frac{q_1 - q_2}{\lVert q_1 - q_2 \rVert}. \tag{8} $$

From here on $x$ denotes the sought 3D point rather than an image point. The projection, $y_i$, of $x$ on $l_i$ can then be found as

$$ y_i = n_i n_i^T (x - p_i) + p_i, \tag{9} $$

with the total projection error

$$ e_p = \sum_i \lVert y_i - x \rVert^2. \tag{10} $$

We now want to find $x$ that minimizes $e_p$:

$$
\begin{aligned}
e_p &= \sum_i \lVert n_i n_i^T (x - p_i) + p_i - x \rVert^2 \\
    &= \sum_i \lVert (n_i n_i^T - I)x - (n_i n_i^T - I)p_i \rVert^2 \\
    &= \sum_i \lVert A_i x - b_i \rVert^2 \\
    &= \sum_i (A_i x - b_i)^T (A_i x - b_i),
\end{aligned}
\tag{11}
$$

where $A_i = n_i n_i^T - I$ and $b_i = A_i p_i$. Taking the derivative of (11) with respect to $x$ and setting it equal to zero yields

$$
\begin{aligned}
\frac{\partial e_p}{\partial x} = \sum_i \left[ 2(A_i^T A_i)x - 2A_i^T b_i \right] &= 0 \\
\sum_i (A_i^T A_i)x &= \sum_i A_i^T b_i \\
\left( \sum_i A_i^T A_i \right) x &= \sum_i A_i^T b_i.
\end{aligned}
\tag{12}
$$

Equation (12) is now in a familiar form, $Cx = d$, allowing us to solve it using standard linear algebra techniques.

C. Error Correction

When tracking the 3D positions of players using multiple cameras, this 3D data can be used to increase the accuracy of the tracking in the individual camera scenes.
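Both the triangulation of section IV-B and the error correction of this section come down to solving equation (12) for a chosen set of viewing lines. The following is a minimal sketch of that solve in plain Python, not the authors' implementation. It assumes each line is already available as a pair $(p_i, n_i)$ from equation (8), and it uses the fact that for a unit vector $n_i$ the matrix $A_i^T A_i$ equals $I - n_i n_i^T$, so that $A_i^T b_i = (I - n_i n_i^T) p_i$. The names `triangulate`, `solve_3x3` and `normalize`, and the camera positions in the usage lines, are hypothetical.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def solve_3x3(m, d):
    """Solve m x = d by Gaussian elimination with partial pivoting."""
    a = [list(row) + [rhs] for row, rhs in zip(m, d)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("lines are parallel; the point is not determined")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        known = sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (a[r][3] - known) / a[r][r]
    return x


def triangulate(lines):
    """Least-squares point closest to all lines, i.e. equation (12).

    lines: iterable of (p, n) pairs, p a point on the line, n its direction.
    """
    c_mat = [[0.0] * 3 for _ in range(3)]
    d_vec = [0.0] * 3
    for p, n in lines:
        n = normalize(n)
        for r in range(3):
            for c in range(3):
                # Entry (r, c) of I - n n^T, which equals A^T A for unit n.
                m_rc = (1.0 if r == c else 0.0) - n[r] * n[c]
                c_mat[r][c] += m_rc
                d_vec[r] += m_rc * p[c]
    return solve_3x3(c_mat, d_vec)


# Hypothetical set-up: three camera centres whose lines all pass through one point.
centres = [(0.0, 0.0, 5.0), (10.0, 0.0, 5.0), (0.0, 10.0, 5.0)]
target = (3.0, 4.0, 1.0)
lines = [(c, [t - ci for t, ci in zip(target, c)]) for c in centres]
estimate = triangulate(lines)  # approximately [3.0, 4.0, 1.0]
```

Because the cost of accumulating the system grows with the number of lines only, the same function can simply be called a second time with the subset of lines that passed the distance threshold when a tracker has to be corrected.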
By comparing the 3D position obtained by triangulation to an estimation of the position based only on each view individually it becomes possible to detect and correct errors in the single view tracking. After triangulating a player based on the 2D tracking results a set of distance measures can be calculated between the triangulated point and the projected point from each camera, using the Euclidean distance. This projected point is the same point as calculated in section IV-A when finding player matches.

If any of the distance measures are greater than some threshold it may indicate that there is a problem with the corresponding 2D tracking. To correct this error the player's location is triangulated a second time, using only tracking results from those trackers where the distance measure is below the threshold. This new 3D point, $X$, is then projected back to the discredited views using the standard camera equation:

$$ x = PX. \tag{13} $$

The tracker corresponding to that player in that view can then be restarted at the calculated point $x$.

If fewer than two of the projection points fall within the threshold then there is no reliable way to determine which of the trackers have failed and which of them are still accurate. In this case the player may need to be dropped from 3D tracking and all corresponding 2D trackers stopped. The player will then be detected and tracked again as if it were a new player on the field.

V. EXPERIMENTS

To test the system that was developed several tests were completed. The first test was to measure the accuracy of the 3D position estimation of a single player running on a field. Figure 4 shows the results for four such sequences. The solid blue line in each figure is the ground truth path that the humanoid ran over, as viewed from above. The blue points are the back projection results and the green points are the forward projection results. This test indicates that triangulation results for the two methods are similar. This test also indicates that the proposed tracking and triangulation method succeeds at locating a player on the field of play.

Fig. 4. Triangulation of two image sequences using forward (green dots) and back projection (blue dots) methods. The solid blue line indicates the ground truth.

In the given figures one unit of measurement corresponds to 2 cm on a real-life field. The maximum deviation from the ground truth between the four sequences is 20 units (40 cm) while the average deviation is about 5 units (10 cm).

In Figure 5 the results of the full system can be seen for tracking 4 players as seen in 4 views over 286 frames (roughly 10 seconds). The coloured dots indicate the triangulated position for each player through the sequence. The approximate positions of the cameras are shown by the little camera drawings. As can be seen from this figure the system as a whole functions as desired: detecting, tracking and triangulating each of the players. This is, however, an ideal case and further testing of some possible problem scenarios needs to be done.

Fig. 5. Results for tracking four players with four views through 286 frames.

During full system testing two cases of interest were identified. The first case is when a player leaves or enters a camera field of view and the second is when a 2D tracker loses its player due to occlusion.

In Figure 6 the solid lines indicate field of view boundaries of the different cameras, and the blue dots indicate the path followed by the player. At point (a) the player moves out of the view of the camera indicated by the green lines. At this point the corresponding 2D tracker is stopped and the player is triangulated using the remaining views. At point (b) the player again moves back into the field of view of the camera and is then again tracked in that view. Single view position estimation showed that this view corresponded to the player already being tracked, so the player is then triangulated using the data from that view as well.

Fig. 6. Tracking a player crossing the field of view line of a camera, shown here from a top down view.

The second case is illustrated in Figure 7, where in one view the two players move in such a way that the one player occludes the other whilst in the other two views they move apart from each other. In frame (1) the players are a distance away from each other. By frame (10) they have started to occlude each other and at frame (32) they are heavily occluded. As can be seen at frame (48) the tracker indicated by the blue square has begun to track the incorrect player. At this point the system detects that the 2D tracker has lost the player and moved from the correct path, and attempts to correct the mistake. In frame (49) the 2D tracker in the first view has been corrected by back projection after triangulating the player using the rest of the views. By frame (57) the 2D tracker has corrected itself and is tracking the player correctly again.

Fig. 7. Automatic correction of tracking occlusion (columns: Camera 1, Camera 2, Camera 3; rows: frames 1, 10, 32, 48, 49 and 57). Frame numbers are listed on the left of the images.

The plot of the player's positions in Figure 8 illustrates the effect of this occlusion. At point (a) the triangulation result begins to drift from the ground truth line. At point (b) the 2D tracker is corrected and the triangulation results snap back to the correct ground truth line.

Fig. 8. Triangulation results of the multi-view tracking in Figure 7, as viewed from above.

VI. CONCLUSIONS AND FUTURE WORK

In this paper the problem of estimating the 3D position of players on a sports field using multiple cameras was discussed. A system was developed, using motion detection to find players, a fast hierarchical particle filter to track players in 2D, and multi-view triangulation to find player positions.

The results obtained for the system were very promising overall. While some improvements can be made the system is able to solve the initial problem to a satisfactory extent. Occlusions present some problems; however, the use of multiple cameras goes some way to solve this.

Some future work that may improve the system would be to remove the fixed block size used in 2D tracking. This may allow cameras to view larger areas of the field as players would not need to appear at some specific size in the image.

Using a player recognition algorithm may also improve both 2D and 3D tracking results. Trackers could correctly distinguish between players after occlusions by recognising the player they need to track. More precise matching of players between views may also be possible if recognition is used rather than just position matching.

VII. ACKNOWLEDGMENTS

We thank MIH and the National Research Foundation for financial assistance.

REFERENCES

[1] S. Khan, O. Javed, Z. Rasheed, M. Shah, Human tracking in multiple cameras, IEEE International Conference on Computer Vision, vol. 1, pp. 331-336, 2001.
[2] Q. Cai, J. Aggarwal, Tracking human motion using multiple cameras, International Conference on Pattern Recognition, vol. 3, pp. 68-72, 1996.
[3] A. Alahi, Y. Boursier, L. Jacques, P. Vandergheynst, Sport players detection and tracking with a mixed network of planar and omnidirectional cameras, IEEE International Conference on Distributed Smart Cameras, pp. 1-8, 2009.
[4] M. Xu, J. Orwell, L. Lowey, D. Thirde, Architecture and algorithms for tracking football players with multiple cameras, Intelligent Distributed Surveillance Systems, pp. 51-55, 2004.
[5] R. Tsai, A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses, IEEE Journal of Robotics and Automation, vol. 4, pp. 323-344, 1987.
[6] M. Haralick, C. Lee, K. Ottenberg, M. Nolle, Review and analysis of solutions of the three point perspective pose estimation problem, International Journal of Computer Vision, pp. 592-598, 1991.
[7] P. Viola, M. J. Jones, Rapid object detection using a boosted cascade of simple features, IEEE Computer Vision and Pattern Recognition, vol. 1, pp. 511-518, 2001.
[8] Y. Rubner, C. Tomasi, L. J. Guibas, A metric for distributions with applications to image databases, International Conference on Computer Vision, pp. 59-66, 1998.
[9] S. Cha, S. Srihari, On measuring the distance between histograms, Pattern Recognition, vol. 35, pp. 1355-1370, 2002.
[10] http://www.doh.gov.za/facts/1998/sadhs98/chapter13.pdf