Lower Body Pose Estimation in Team Sports Videos Using Label-Grid Classifier Integrated with Tracking-by-Detection

Size: px
Start display at page:

Download "Lower Body Pose Estimation in Team Sports Videos Using Label-Grid Classifier Integrated with Tracking-by-Detection"

Transcription

1 Informaton and Meda Technologes 10(2): (2015) reprnted from: IPSJ Transactons on Computer Vson and Applcatons 7: (2015) Informaton Processng Socety of Japan Research Paper Lower Body Pose Estmaton n Team Sports Vdeos Usng Label-Grd Classfer Integrated wth Trackng-by-Detecton Masak Hayash 1,a) Kyoko Oshma 2,b) Masamoto Tanabk 2,c) Yoshmtsu Aok 1,d) Receved: May 16, 2014, Accepted: November 6, 2014, Released: February 16, 2015 Abstract: We propose a human lower body pose estmaton method for team sport vdeos, whch s ntegrated wth trackng-by-detecton technque. The proposed Label-Grd classfer uses the grd hstogram feature of the tracked wndow from the tracker and estmates the lower body jont poston of a specfc jont as the class label of the multclass classfers, whose classes correspond to the canddate jont postons on the grd. By learnng varous types of player poses and scales of Hstogram-of-Orented Gradents features wthn one team sport, our method can estmate poses even f the players are moton-blurred and low-resoluton mages wthout requrng a moton-model regresson or part-based model, whch are popular vson-based human pose estmaton technques. Moreover, our method can estmate poses wth part-occlusons and non-uprght sde poses, whch part-detector-based methods fnd t dffcult to estmate wth only one model. Expermental results show the advantage of our method for sde runnng poses and non-walkng poses. The results also show the robustness of our method for a large varety of poses and scales n team sports vdeos. Keywords: human pose estmaton, people trackng, trackng-by-detecton, Random Forests, feature selecton 1. Introducton Vdeo-based player trackng has drawn nterest n computer vson. Snce vdeo-based object detecton and trackng technques have shown rapd mprovement [1], the applcatons of trackng team sports players are becomng ncreasngly more attractve for professonal sports. Body pose nformaton would be a mddle-level feature for classfyng the detaled acton of each player. Lke actvty recognton methods [2], [3] usng a pose estmated by sngle mage pose estmaton technques [4] suggest, the pose (jont locaton of the person n 2D or 3D) can be a stable and clear cue for detaled and fne-graned actvty recognton. Whle acton recognton methods usng spato-temporal local features [5], [6] can estmate the semantc acton class (e.g., runnng or standng) of the player, nner-class acton dfference (e.g., how wdely the person moves hs or her legs whle runnng) cannot be easly estmated. Even for semantc acton recognton, an acton classfer usng a pose feature (annotated jonts) performs better than one usng low-level feature (dense flow) as Ref. [7] llustrated n ther experments. In partcular, usng the lower body pose (or lower body jont postons) from team sports vdeos would be a new way of recognzng each acton of a player n detal. Snce runnng s the very 1 Keo Unversty, Yokohama, Kanagawa , Japan 2 R&D Dvson, Panasonc Corporaton, Yokohama, Kanagawa , Japan a) mhayash@aok-medalab.org b) ohshma.kyoko@jp.panasonc.com c) tanabk.masamoto@jp.panasonc.com d) aok@elec.keo.ac.jp basc and most frequent acton n all team sports, leg movements are one of the most mportant cues for recognzng player actons. For example, when the subject player s runnng, the jont pose can precsely measure the number of steps of the player, whch can be the cue for classfyng the step types (normal step or cross step), and whch can be the contact pont wth the foot n soccer. However lower body pose estmaton has rarely been nvestgated n computer vson. Human pose estmaton from vdeo s stll an open problem n computer vson whle depth-based methods usng RGB-D sensors has already been realzed as a hghly robust system [8]. We cannot yet estmate the pose of the sports players n all types of sports vdeos, whle pose estmaton of some lmted perodc actons (such as walkng) has only been solved usng nonparametrc regresson technques [9]. On the other hand, the frontal human pose of sports players can be robustly estmated from an mage wth methods usng part detectors and pctoral structures such as the flexble mxture-of-parts model (FMP)[4]. However, part-based methods usually fal to estmate non-frontal poses n team sports vdeos where players tend to frequently be dsplayed as sde poses. The reason s that these methods need enough space between parts to become a star-shaped tree confguraton of body parts. Even mult-vew part-based pctoral structure technques [10] wth pan-tled cameras cannot robustly estmate the sde and part-occluded poses n team sports vdeos because they stll depend on the dscrmnatve part classfcaton as Ref. [8] does. If the sde poses of sports players could be estmated wth monocular vdeos, a broad range of possbltes of 246

2 Informaton and Meda Technologes 10(2): (2015) reprnted from: IPSJ Transactons on Computer Vson and Applcatons 7: (2015) Informaton Processng Socety of Japan Fg. 1 Fg. 2 Example result of our framework. Images n the upper row show the tracked player mages n each frame of the nput vdeo wth the estmated jont poston as colored crcles n each frame. The lower mage shows an nput frame of the vdeo. The rectangle s the tracked player wndow, and the green crcle s the center of the wndow. Label-Grd Classfer. The red crcle on the grd s the classfed jont locaton l j t N 2 on each player wndow from the learned Label- Grd class canddates (pnk crcles). In ths example, the Label-Grd classfer for the j-th jont s on the (6 8) grd structure, and the estmated Label-Grd s on l j t = (1, 7) on all three mages. The number of the class of the Label-Grd classfer of the left foot s 21 (sum of pnk crcles and red crcles). vson-based human moton and behavoral understandng would be opened up. We propose a novel grd-wse pose estmaton classfer for monocular team sports vdeos, whch we call the Label-Grd classfer, ntegrated wth a standard trackng-by-detecton framework, such as Ref. [11]. Our Label-Grd classfer estmates the lower body human jont locaton wth Label-Grd resoluton usng the whole body appearance of the tracked wndow estmated by the player tracker (Fg. 1). In other words, the player tracker frst tracks the player wndow n each frame, then the Label-Grd classfer estmates the jont locaton (grd poston n the player wdow (See Fg. 2). To the best of our knowledge, our method s the frst human pose estmaton method that can estmate the pose of sde-runnng players wth scale changes, whch part-based methods [4], [12] cannot estmate very well. As the example results n Fg. 1 show, our method can robustly estmate the poses of the sde runnng sequence. Smlar to (frontal) facal recognton technques [13], our Label-Grd classfer embeds all types of algned player wndow mage nto only one mult-class classfer, whch enables pose estmaton when they are runnng sdeways. Snce our framework employs Hstograms of Orented Gradents (HOG) [14] as whole body gradent hstogram features, t can estmate the jont locaton even from a low-resoluton vdeos owng to the deformaton nvarance and contrast nvarance of the HOG feature. In addton, t can estmate poses that have smlar appearances between parts (e.g., pants wth only one color) that part-based methods fnd t dffcult to estmate, because part appearances become too ambguous to detect (e.g., whle crossng the legs). The contrbutons of ths paper can be summarzed as havng the followng advantages: Label-Grd classfer whose 2D unt blocks (Label-Grd) are synchronzed wth the resoluton of Grd-Hstogram features va mult-class classfer (n ths paper we use HOG features randomzed by Random Decson Forests). Algn the trackng wndow wth a pelvs-algned detector to provde center-algned vsual features (selected from HOG by Random Decson Forests) to easly classfy the class of Label-Grd classfers. Can also estmate the pose of sde runnng sequences, whch frequently appear n team sports vdeos. Per-frame estmaton wthout temporal pose moton models of a specfc acton, such as Ref. [9], whch uses a walkng pose manfold or temporal pose pror. Per-frame pose estmaton for vdeos wthout pctoral structure and part detectors. Our gnorance of pctoral structure framework acheved fast pose estmaton (about 1 fps computatonal tme). The rest of the paper s organzed as follows. In Secton 2 we frst revew the related work for human pose estmaton methods for mages and vdeos. Our framework s presented n Secton 3. Secton 4 descrbes how to learn the Label-Grd classfer wth a prepared dataset. Secton 5 llustrates the expermental results and evaluates the performance of our framework. Fnally, Secton 6 s the concluson. 2. Related Work Frst, we revew human pose estmaton methods usng the classcal slhouette-based template matchng technque for team sport vdeos (Secton 2.1), whch s not robust and s just for graphcs vsualzaton. Then we revew two types of human pose estmaton technques: human pose estmaton from a sngle mage usng pctoral structure (Secton 2.2), and moton model regresson (Secton 2.3). Fnally, we revew nstance template detecton technques such as poselets [15] and Exemplar-SVMs (Secton 2.4). 2.1 Exemplar-based Pose Search Methods Germann et al. [16] proposed slhouette-based database matchng methods for soccer players. These methods frst construct an exemplar-pose database wth slhouette mages of players. At test tme, ther method frst fnds the most smlar slhouette from the 247

3 Informaton and Meda Technologes 10(2): (2015) reprnted from: IPSJ Transactons on Computer Vson and Applcatons 7: (2015) Informaton Processng Socety of Japan database n each frame, and then apples optmzaton through multple temporal multple frames. The resultng poses are not very accurate because the exemplars cannot nclude every type of human poses, and are senstve when extractng the slhouette va background subtracton. Moreover, ths approach nvolves a hgh computatonal optmzaton cost. 2.2 Sngle Image Pose Estmaton Usng Deformable Part Models The FMP [4] s the extenson of the deformable part models (DPM) [12] that are used to estmate the pose of the human by nferrng from the best part confguraton. DPM was orgnally a weakly-supervsed model usng latent support vector machnes (latent SVM) to learn the appearances and locatons of each part detector automatcally from the labeled whole object wndow. On the other hand, FMP s a supervsed verson of DPM and uses mxture-of-parts to represent the dscrete changes of each part appearance. FMP [12] accurately estmates frontal poses of subjects openng ther legs and arms. However, FMP cannot precsely estmate the poses that have feet and arms occluded, because t depends on the part-detecton scores and depends on the tree-graph where each subnode (arms and feet) s wdely open. For ths reason, FMP tends to estmate the pose of a person who looks rght or left and even frontal poses ncorrectly, because t s hard to detect each arm or leg part n those poses owng to ther ambguous and ncomplete part appearances. Moreover, t s dffcult to model the confguraton wth one tree-structure model for frontal, sde and bendng poses. The model needs to learn those models separately. More recently, poselets-based [15] part-based approaches have been studed [17], [18]. Although those methods overcome the weakness of FMP by representng the relatonshp between parts usng poselets (whch are larger parts than parts of FMP), they are stll poor at hard occluson cases because they stll depend on the pctoral structure. A mult-camera approach [10] helps to deal wth part-occlusons, but even ths methods s not able to tackle wth sde runnng scenaros where part-occlusons occur frequently. There s also a method nvolvng an occluson handlng scheme usng an occluson detector and part-based regresson [19]. Whle ths method provdes good results wth small occlusons between parts, t also cannot estmate sde poses because t stll depends on the pctoral structure. 2.3 Moton Model Regresson and Pose Manfold Methods for Pedestrans If the human pose model only ncludes the pose types wthn one acton class, such as walkng or swngng a golf club, pose estmaton can be solved usng regresson technques wth fxedvew tranng mages. The trackng method usng Gaussan Process Regresson [9] s popular for learnng a (latent) 3D pose manfold from cyclc pedestran mages from one camera vew. For pedestran pose estmaton, Gammeter et al. [20] proposed a people trackng and pose estmaton method for pedestrans usng Gaussan Process Regresson. Rogez et al. [21] proposed a perframe pose estmaton of human pose estmaton based on Random Decson Forests [22] and pose manfolds of a gat sequence wth HOG features. Reference [21] s close to our method n usng Random Decson Forests for pose classfcaton. However, ther Random Decson Forests class s based on camera vews and gat manfold cycle, whle our Label-Grd class s the grd of HOG features. Addtonally, they do not predct the jont postons precsely because they just fnd the most smlar walkng pose exemplar on the gat manfold. These methods only nvestgated the pose dstrbuton of walkng people. They can learn the latent transtons between the poses of pedestrans, but cannot afford to nclude every types of poses. 2.4 Poselets and Exemplar-SVMs Our pelvs-algned detector and Label-Grd classfer are both nspred by the poselets framework [15]. Poselets are the detector of one specfc pose of a mddle-level human part detector, whch can be learned from tranng mages wth the same algned pose and same scale mages but from dfferent subjects (e.g., upper body Poselets wth ther arms crossed). Exemplar-SVM [23] s an object detecton method usng perexemplar detectors. It detects exemplars usng each exemplar- SVM to detect multple appearance types of an object class. Exemplar-SVMs separately learn each pose or vew wth one exemplar-svm (e.g., left-vew car SVM, frontal-vew car SVM, jump-pose human SVM). On the other hand, our pelvs-algned detector learns multple scales and poses of an object class altogether. Ths one-detector soluton makes t easy to ntegrate wth a trackng-by-detecton scheme, whle the goal of Exemplar-SVMs s robust object detecton even wth only one mage usng multple SVMs that know the hard-negatves. 3. Proposed Framework The proposed framework (Fg. 3) s composed of two modules: a trackng player wth trackng-by-detecton technque wth the pelvs-algned detector (Secton 3.1), and estmatng the four jont postons on the grd structure ndependently usng Label- Grd classfers (Secton 3.2). These two modules share the tracked player wndow as HOG (Hstogram-of-Orented Gradents) feature [14] to estmate the pose (locatons of four jonts) n each frame of the vdeo (Secton 3.2). At test tme, the only nput of our framework s the player wndow poston (rectangle) of the subject player n the frst frame of the vdeo. All the lower body poses n each frame are estmated automatcally by trackng the player and are estmated by Label-Grd classfers n each frame. We learn four Label-Grd classfers of each lower-body jont separately; left-knee L lk (x), rght-knee L rk (x), left-foot L lf (x), and rght-foot L rf (x). Note that left or rght means left n the mage and rght n the mage respectvely. Our Label-Grd classfer learns the poston of the left and rght jonts n the mage plane just lke the other pose estmators such as FMP [4] *1. *1 Ths s the typcal lmtaton of the two-dmensonal human pose estmaton methods. To overcome ths, we wll use the three dmensonal nformaton nferred by the other approaches n the future. 248

4 Informaton and Meda Technologes 10(2): (2015) reprnted from: IPSJ Transactons on Computer Vson and Applcatons 7: (2015) Informaton Processng Socety of Japan Fg. 3 Flowchart of the proposed framework. 3.1 Player Trackng wth Pelvs-algned Detector Our frst module s the player trackng method wth standard trackng-by-detecton, such as Ref. [11], to provde an algned player wndow for the second module. We also used ths pelvs-algned detector n our upper body pose estmaton framework [24], the explanaton n Ref. [24] was very short because of the page lmt. Hence, ths paper provdes a more detaled procedure to prepare the dataset of our pelvs-algned player detector n Secton 3.3. For trackng-by-detecton, we use the player detector learned from the dataset D all (Secton 3.3). We use a Kalman Flter (nstead of Partcle Flter n Ref. [11]) to track the player whose lkelhood n each frame s a non-maxmum suppresson result of the detectons wthn the local area around the predcted player locaton. Snce our method manly targets at estmatng sde poses durng runnng or walkng, whch occurs very often n team sports vdeos (as mentoned n the ntroducton), we choose a Kalman Flter by assumng the smple and monotonous trajectory of the subject players n typcal team sports vdeos on large felds. Ths trackng procedure provdes the smoothed and the center of the tracked wndow n each frame, and these tracked wndows are expected to be algned to the pelvs poston. Ths frst module provdes the algned wndow that works well for the Label- Grd Classfer n the second module. Snce we use HOG feature for classfyng the pose, we expect around 1 or 1.5 grd errors n ths tracker to ensure that the Label-Grd classfers can use the algned HOG feature leaned from the pelvs-algned dataset. 3.2 Label-Grd Classfer for Estmatng Jont Grd Poston The second module estmates the four jont locatons usng four Label-Grd classfers. The proposed Label-Grd classfer s a mult-class classfer whose label classes are assgned to the grd locatons of grd hstogram feature such as HOG features [14]. The Label Grd classfer L j (x t ) = {F j (x t ), M j (ŷ j t )} (for the j- th jont) conssts of a mult-class classfer F j (x t ) and the classto-grd mappng functon M j (ŷ j t ), where we denote the nput vsual feature vector (n our case, normalzed three-level HOG) as x t R D at frame t, and the estmated Label-Grd class label of F j (x t )sy j t {l = 1, 2,...,L}. Each class l of F j (x t ) learns the appearances (or poses) of players wth ts j-th jont s on the same grd (See Fg. 2). At test tme, the classfer F j (x t )ofl j (x t ) frst estmates the class y t from the nput vsual feature vector x t : ŷ j t = F j (x t ) (1) The reason why we use not only the lower body but also the full body wndow for the HOG feature s that we am to leverage the whole upper body appearance for classfers, whch result n capturng a wde varaton n upper body appearances n each lower body jont poston. We expect that ths strategy of ncludng upper body appearance makes the Label-Grd classfer easer to dscrmnate the pose of a specfc jont poston from the other poses even when the pelvs s not algned, whle the HOG of only the lower body could cause too much senstvty to the ms-regstraton of the tracker *2. After estmatng the class label ŷ j t from the nput feature vector, we map ŷ j t to the correspondng 2D grd locaton wth M j (ŷ j t ): l j t = M j (ŷ j t ) (2) where M j s the dctonary functon for the j-th jont to map each class ŷ j t to the correspondng grd locaton l j t N 2, whch we call Label-Grd. Ths mappng dctonary M j s bult durng tranng. We typcally assgn each class y j t to the grd from left to rght and from top to bottom f there s more than one sample labeled on the grd (see Fg. 2 for the example grd ndex assgnment). Note that we need the nverse mappng of M j (ŷ j t ) durng tranng, because we frst have to assgn each Label-Grd l j t n the D all to the l-th class n L-class classfer. However, at test tme, we need only M j (ŷ j j t ) for convertng ŷt to l j t. The tranng dataset for the Label-Grd classfer must nclude most of the types and scales of players appearances that could occur n the target vdeos. Hence, our system can estmate the lower body pose n any locaton of the mage (n our case, the *2 If the estmaton of the pelvs s perfect wth any other methods, we need to use only the lower body appearance. 249

5 Informaton and Meda Technologes 10(2): (2015) reprnted from: IPSJ Transactons on Computer Vson and Applcatons 7: (2015) Informaton Processng Socety of Japan sports feld) where player scale vares accordng to the poston and the pose of the camera. 3.3 Dataset Preparaton We share the same data-augmented dataset D all between two modules to learn the pelvs-algned Detector and four Label-Grd classfers. To share the player wndow wth ts center algned to the pelvs of a player, the player detector and the Label-Grd classfer are learned from the same mages and labels from D all.to create D all, we perform data augmentaton wth dfferent scales Fg. 4 Data Augmentaton. Usng h pla (heght of the blue wndow), mages are scaled to the scale s so that the center poston p pel keeps to the center of the Label-Grd even n the scaled mages. By performng ths algned mage samplng of the tranng dataset, the feature space of the Label-Grd classfer can be augmented to the multscale player szes wthn the Label-Grd wndow. and mrrored mages from the orgnal dataset D or (Fg. 4). We frst prepare a tranng dataset D all = {D sca, D mr } from realstc team sports vdeos to learn both player detector and Label- Grd classfers. D sca (Fg. 4 (a)) s the resampled player wndow mages and ther labels from the orgnal dataset D or for whch we should only need to prepare the labels. D mr (Fg. 4 (b)) s the mrrored dataset of D sca whose mages and labels are flpped horzontally. See the left half of the Fg. 5 for ths dataset preparaton procedure. Frst, we prepare a dataset D or wth N mages I = [I 1, I 2,...,I N ] and labels of each mage I so that the mages ncludes varous types of poses of players n one specfc sport. Each player wndow mage I n D or has labels L = [p 1, p2, p3, p4, ppel, h pla ], where p j R D s the j-th jont locaton of the -th mage I on the mage plane, and h pla s the player heght for resamplng the orgnal mages. p pel R D s the locaton of the pelvs of the -th mage I on the mage plane, whch s always at the center of the player wndow and becomes the nformaton of the locaton of the player wndow. Images I are all clpped from the team sports vdeos so that ther wndow centers p pel are all algned to the center of the wndow, and the labelng person manually nputs h pla as a length between the top of head and the bottom of the foot of the player n I. Ths labelng procedure determnes the scale of the player heght h pla to the fxed sze wndow n each sample. For resamplng the orgnal mage of D or to multple scales, we resze all the mages and labels n D or to the resampled player scales s = h pla /h wn, where h wn s the heght of the Label-Grd wdnow. D sca ncludes several player scales wth regular ntervals (e.g., 0.80, 0.85,...,1.00) by reszng D or (Fg. 4 (a)). Fnally, we acqure D mr by flppng the mages and labels of D sca horzontally to learn the mrrored features and labels (Fg. 4 (b)). The player detector uses only the mages of D all because the Fg. 5 Learnng procedure. 250

6 Informaton and Meda Technologes 10(2): (2015) reprnted from: IPSJ Transactons on Computer Vson and Applcatons 7: (2015) Informaton Processng Socety of Japan centers of the players wndow mages n D all are all algned to the pelvs of the players. In contrast, mages and Label-Grd classes of D all are used to learn the classfer. Any types of multclass classfer can be used as a Label-Grd classfer. However, for ts hgh precson and fast computatonal tme, we use Random Decson Forests [22] as a Label-Grd classfer n our experments. 4. Learnng Label-Grd Classfer The Label-Grd classfer s a mult-class classfer wth ts class labels (Label-Grd) assgned to the 2D grd locaton of a feature type wth a grd structure such as HOG features. The Label-Grd classfer can be any types of mult-class classfer (e.g., Support Vector Machne, Random Decson Forests, etc.), but preparng the dataset for a Label-Grd classfer to classfy the lower body grd-poston of the player s our orgnal approach to usng a general mult-class classfer. Every one class (Label-Grd) of the Label-Grd classfer learns the HOG features whose jont s on the same grd poston (Fg. 2). Gven the grd feature of a player n a W H wndow, classfyng the Label-Grd of a specfc jont (e.g., left-knee) can be regarded as an L-class grd classfcaton problem, where the task s to choose a grd poston (, j) from L canddate postons (pnk crcles are marked as canddate postons n Fg. 2). The other N grds n the grd feature are just gnored from label-grds to learn. The number of Label-Grds L s decded when buldng M j (y j t ) (see Secton 4.1). For nstance, f you use HOG features wth 6 10 cells n a player wndow and f there are 35 classes where the jont label-grd exsts more than one jont poston n the tranng dataset, the Label-Grd classfer becomes 35-class classfers. The other 25 (= ) grds, whch have no jont labels n the tranng dataset, are gnored for the classfcaton of the jont. 4.1 Learnng Procedure We wll explan how to learn a Label-Grd Classfer for classfyng the j-th jont (e.g., the left knee jont). Fgure 5 shows the whole procedure, used to learn the Label-Grd Classfers (L lk (x), L rk (x), L lf (x), and L rf (x)) and the pelvs-algned detector by preparng a dataset D all. Gven a data-augmented dataset D all, we frst calculate the grd locaton l j from the j-th jont p j of the -th mage n D all usng pelvs poston and the sze of Label-Grd (e.g., each grd s 8 8 and the wndow sze s ). After calculatng all l j n D all, we then buld the mappng functon M j (y j ) to decde the number of the class N of the Label-Grd Classfer and all the Label-Grd ndces y j of the -th mage n D all. After M j (y j ) has been bult, we can fnally learn F j (x ) (usng Random Decson Forests, n ths paper) wth Label-Grd s and the calculated feature x all that we wll explan n the next subsecton. 4.2 Mult-level HOG Feature and Feature Selecton We use a three-level mage pyramd from the player wndow for calculatng three-level HOG features x 1 t, x 2 t, x 3 t for makng the feature vector x t = [x 1 t x 2 t x 3 t ] for a Label-Grd classfer. Learnng multple resoluton of the HOG appearance makes the Label-Grd classfer restrct the label-grd canddates at each resoluton level, whch helps to avod the bad classfcaton result far from the true poston. To decrease the effect of the dfference of feature scales between the three levels, we normalze the feature vector x t to L 2 unt vector both at tranng tme and test tme. We use L-class Random Decson Forests as the L-class Classfcaton Forests [22] as F j (x t ) of the Label-Grd classfer, whch results n performng feature selecton from these normalzed three-level HOG features x t. After learnng the L-class, each splt functon uses two randomly selected values of the mult-level feature vector x t to estmate the class label y j t. 5. Experments We tested our framework n two scenaros: frontal pose sequences (Secton 5.3.1), whch part-based pose estmator [4] can also estmate robustly, and sde pose sequences (Secton 5.3.2), whch part-based pose estmator cannnot predct properly as argued n Secton Expermental setup We performed expermental evaluatons on our system wth Amercan Football vdeos n professonal league matches. The sze of each vdeo s 1, The vdeos are taken from the matches of Panasonc IMPULSE *3. All vdeos were captured from the hgh place n the stadum wth fxed cameras. All vdeos were converted to 29 fps vdeos whle the orgnal vdeos were recorded at 59 fps. These vdeos nclude players from a team wth a whte-colored unform and players from the other team wth a black-colored unform. Although we captured hgh-resoluton vdeos, moton blur of movng legs and arms sometmes occurs and the players are captured wth a relatvely low resoluton. We created 10 test sequences (test (1) (10)) for the fve frontal pose tests and fve sde pose tests from these vdeos (see Fg. 6 to see the player trajectores on each sequence). Each test vdeo s composed of 40 frames. We wll show the detal behavor (pose) on each sequence n Secton for the frontal pose sequences and Secton for sde pose sequences. We manually clpped player wndows from vdeo frames and assgned labels to create our orgnal dataset D or for tranng both Label-Grd classfers and the player detector. We tred to nclude as many pose patterns (and also vews of the pose) as possble n the dataset D or to make the Label-Grd classfers learn the whole possble appearance patterns n the Amercan Football vdeos. We randomly selected the mages from all the vdeos so that the orgnal dataset ncludes more versatle player poses, and the orgnal dataset becomes 977 mages and ts labels. Note that approxmately 10% of the orgnal tranng mages shares the same mages wth test dataset sequences of test (7) and test (8). Then we resampled 977 mages and labels of D or wth 13 scales s = {0.70, 0.725, 0.75,...,0.975, 1.0} and fnally prepared 25,402 mages (25,402 = ) for tranng four Label-Grd classfers ndependently. *

7 Informaton and Meda Technologes 10(2): (2015) reprnted from: IPSJ Transactons on Computer Vson and Applcatons 7: (2015) Informaton Processng Socety of Japan Fg. 7 Four Label-Grd classfers wth 8 12 Label-Grds, whch we use n our experments. Each red crcle shows the canddate Label-Grd class of the classfer. ndependently (rght/left knee and rght/left foot) from the tranng dataset D all. Consequently, we had 34-class left knee Label- Grd classfer, 34-class rght knee Label-Grd classfer, 38-class left foot Label-Grd classfer, and 39-class rght foot Label-Grd classfer (see Fg. 7 for the class assgnment). To apply FMP [4] as a baselne, we used PartsBasedDetector software [25]. We used 26-parts frontal person models as FMP and regard center postons of the 4 part-detector as four lower body jonts to compare wth our jont locaton classfers (ndex 12 as left knee jont, ndex 13 as left foot jont, ndex 24 as rght knee jont, ndex 25 as rght foot jont). We assume that the center of the rectangle of each detected part s the correspondng jont poston n the mage. Fg. 6 Tracked results of all tests (1) (10). Test (1) (5) are the results of the frontal pose sequences whle tests (6) (10) are the results on the sde pose sequences. Green dots show the player wndow center locatons n each frame. 5.2 Evaluaton Manner Pxel Error of the Jont Poston For measurng the performance of our lower body pose system, we defne the Eucldean dstance error as below: E t = d( ˆp t, p GT t ) (3) HOG wndow sze was pxels (wdth heght), and the cell sze was 8 8 both for our pelvs-algned player detector and the four Label-Grd classfers. For learnng the pelvsalgned player detector, we only used HOG and labels of D all. For the Label-Grd classfers, we also created pyramd mages and from the tracked wndow mage n one frame. We then calculated three-level HOG x 1 t, x 2 t, x 3 t from each level pyramd mage wth 8 8 cell sze and combned them. Fnally, we obtaned a 2268-dmensonal L 2 -normalzed feature vector x t as the nput of each Label-Grd classfer. We learned four Label-Grd classfers as Random Decson Forests [22] wth the feature vector x t for each lower body jont where ˆp t = (x,y) s the center pont of the estmated Label-Grd locaton and the p GT t s the ground truth poston. Snce our Label- Grd classfers are learned wth 8 8 Label-Grd, the center poston ˆp t becomes (4, 4) from the left-top pont (0, 0) n each Label- Grd. Note that the runnng speed of the player s fast n most of our expermental vdeos because we apply our method to the solated runnng players, such as Quarterback, Runnngback, and Lnebacker. For ths reason, the length of each vdeo s very short (40 frames). Another reason s that we cannot collect many sequences of long runnng solated play easly, because each Amercan football play s around only 10 seconds and players tend to be occluded and congested frequently. 252

8 Informaton and Meda Technologes 10(2): (2015) reprnted from: IPSJ Transactons on Computer Vson and Applcatons 7: (2015) Informaton Processng Socety of Japan Fg. 8 Example results from frontal pose experments. The left tow panels n each subfgure (a) (e) show the results of our Label-Grd classfers, and the rght panels show the result of the FMP, where only the four detected jonts are shown (no vsualzaton of jonts means that FMP could not detect anyone n the frame). Although we would lke to compare our methods wth FMP usng the PCP [26] score, whch s broadly appled to the evaluaton of part-based methods, we cannot calculate the PCP score because our method does not nfer the stck area of each part whch s needed for calculatng PCP scores. Ths s one of the reasons why we used the Eucldean dstance for the evaluaton Detecton Rate of FMP and How to Apply FMP to Our Vdeos To test the lmtatons of FMP for sde poses n our vdeos, we defned the detecton rate R as R = k/n, where k s the number of detectons aganst the number of frames N n one test vdeo (n our case, N = 40 frames). Snce FMP [4] was the object detector (but jontly estmate the pose whle detectng the object), we automatcally clpped the magnfed and margn-added mage to apply the FMP detector. We frst clpped the player wndow from the trackng-bydetecton trackng module (Secton 3.2) by addng margn, and magnfed t 200% to enable FMP to detect the players n our vdeo. Each of mages n Fg. 8 shows the clpped mages wth ths procedure. 5.3 Expermental Results Fgure 9 (a) shows the detecton rate of FMP [4] n tests (1) (5). Snce the movement of the players n tests (2) (4) are dagonal and curved (Fg. 6), most of ther poses were dffcult for FMP wth hard occluson and low-resoluton, even though we defned those tests as frontal tests. However, snce our method does not use any part-models and just use tracked (whole) player wndow appearance n each frame, t can classfy the jont poston n all frames n test (1) (5). For example, n the Fg. 8 (d), whle FMP could not detect the player n each frame, our Label-Grd classfer estmated the jont postons correctly from the same mages. Whle we wanted to compare the poston error between our methods and FMP, we abandoned the calculaton of the pxel 253

9 Informaton and Meda Technologes 10(2): (2015) reprnted from: IPSJ Transactons on Computer Vson and Applcatons 7: (2015) Informaton Processng Socety of Japan Table 2 Average estmaton error of each jont n the sde pose tests (6) (10). All errors are n pxels. Jonts (6) (7) (8) (9) (10) Left knee Rght knee Left foot Rght foot Pelvs (a) Detecton rate of FMP [4] n frontal pose tests (1) (5). (b) Detecton rate of FMP [4] n sde pose tests (6) (10). Fg. 9 Detecton Rate of FMP [4] n each test. Table 1 Average estmaton error of each jont n the frontal pose tests (1) (5). All errors are n pxels. Jonts (1) (2) (3) (4) (5) Left knee Rght knee Left foot Rght foot Pelvs error for FMP snce we were not able to get enough detectons from even frontal poses (Fg. 9 (a)) Frontal Pose Experment We tested our system on the followng frontal pose scenaros to compare the performance or detecton rate wth FMP [4]. We prepared the followng fve sequences for a frontal pose dataset (Fg. 6, left column): Test (1): The player walks to the start poston whle facng ther frontal upper body to the camera. Test (2): The player runs up to the upper sde of the feld. Test (3): The player begns to run from the start poston. Test (4): The player runs dagonally. Test (5): A large player walks to the outsde of the feld. Fgure 6 shows the tracked trajectory of pelvs poston (center of the player wndow) n each of tests (1) (10). The panels n the left column show the results of frontal pose tests (1) (5). Table 1 shows the average error of our method and FMP [4] for each jont. Note that our Label-Grd s 8 8 pxels for all tests. Whle FMP sometmes faled to detect a player who had occluded parts, our method could detect non tree-structured poses. Fgure 8 shows the example results of our method and FMP to compare wth each other Sde Pose Experment Just as for the frontal pose experment n the prevous secton, we also performed evaluaton of our method and FMP wth the followng fve sde pose scenaros (Fg. 6, rght column): Test (6): The player runs straght (namely, almost no scale change) at relatvely slow speed from left to rght. Test (7): The player runs very fast from rght to left. Test (8): The Runnngback player runs dagonally from the startng poston. Test (9): The Runnngback runs straght from the startng poston. Test (10): The player walks backward. We collected these sde pose test vdeos so that the upper body drecton of the player was almost the same as the lower body drecton. Table 2 shows the average error of our method n these sde pose tests and Fg. 10 shows the example results of tests (6) (8). Fgure 9 (b) shows the detecton rate of FMP [4] n sde pose tests (6) (8). As Fg. 10 shows, FMP can rarely detect the player n sde pose tests. FMP detects player wth a detecton rate 0.15 to Compared wth these results of FMP, our method could estmate the pose n all frames va ts trackng and classfcaton procedure wthn about two Label-Grd errors (Table 2). 5.4 Dscussons by Topc Whole Body Appearance Feature as Mult-level HOG As already argued n Secton 3.2, we use the whole body appearance to classfy the lower body pose. Our HOG-based classfcaton approach can be vewed as the modern replacement of the classcal slhouette-matchng schemes usng background subtracton, such as Ref. [16]. We nstead use randomzed HOG features (learned by Random Decson Forests) to robustly classfy the pose wth machne learnng. Our strategy has rcher nformaton wth whch to dscrmnate Label-Grd classes than usng only the lower body appearance. We nstead use the whole body HOG appearance to estmate the jont poston. Moreover, ownng to the deformaton nvarance of the HOG features, our Label-Grd classfer can estmate the pose of a larger or slmmer person untl the gradent dstrbuton changes from the feature dstrbuton of the dataset. For nstance, we performed an experment usng a large player n test (5) (see Fg. 8 (e)). Even though we only ncluded mddle-szed and thn players n the orgnal dataset D or, our classfers could stll est- 254

10 Informaton and Meda Technologes 10(2): (2015) reprnted from: IPSJ Transactons on Computer Vson and Applcatons 7: (2015) Informaton Processng Socety of Japan Fg. 10 Example results from sde pose experments. Each row shows the results of sde pose test (6), (7), and (8). Grd classfier s learned from the same clothng type whle the pctoral structure ncludes varous types of clothng. In our experments, we learned Label-Grd classfiers from two Amercan Football teams, but the classfiers can classfy the lower body pose wth the appearances of both teams Algnment and Scale of the Wndow s Important As already mentoned, our framework depends on the algnment of the player wndow between the trackng module and the classficaton module. In our experments, our player detector s learned wth a HOG of 8 8 cell sze, whch tracks the player wthn a cell error. Ths means that Label-Grd classfiers can only deal wth the wndow patterns wthn one or two (at most) cell sze drft. Hence, Label-Grd classfiers can robustly estmate the grd locaton f the tracker can provde well-algned wndows. Fgure 11 shows the temporal analyss of the sde pose test (9). By observng Fg. 11 (a), the left foot poston gradually becomes unstable as the left leg s out of the wndow. Ths sequence shows the nature of our wndow algnment scheme. Whle you can use wder wndows for the Label-Grd classfiers to prevent ths case by restrctng the person wthn the wndow, you need to make more patterns for the dataset because the number of classes ncreases wth a wder Label-Grd wdow. Fgure 11 (b) shows the error values of four lower body jonts and the pelvs poston n each frame of the test (9). Owng to the mate the jont locaton of the large-szed player Dsregard of Pctoral Structure and Low-resoluton Invarance of Our Method Another mportant part of the nature of our method s that our framework dsregards the pctoral structure strategy [27] whle t depends on the algned wndow appearance. As we showed n our expermental results for sde pose tests (Fg. 10), our method can model any types of pose ncludng hardly occluded sde poses, whch pctoral pcture models cannot nfer very well owng to ther tree-models. As we already argued n Secton 4.2, our threelevel HOG feature and the Randomzed feature selecton helps to restrct the error as much as possble. Snce the Random Decson Forests technque takes advantage of the spatal grd structure to learn the dstance between classes, the error seems to be restrcted wthn neghborng Label-Grd (See Fg. 8 and Fg. 10). Whle FMP and the other part-detector technques assume clear and non-blurred mages, our mult-level HOG-based LabelGrd classfiers can even classfy the poses n low-resoluton and moton-blurred mages because the HOG feature s robust for contrast change (between mage scales) usng grd-wse edge hstogram poolng and block-wse normalzaton [14] Sport-specfic Classfier Whle our method s able to estmate the lower body pose wth varous types of poses n Amercan Football robustly, our Label- 255

11 Informaton and Meda Technologes 10(2): (2015) reprnted from: IPSJ Transactons on Computer Vson and Applcatons 7: (2015) Informaton Processng Socety of Japan Label-Grd class. Even though these two advantages help to embed as many (contnuous) scales of features (n Random Decson Forests feature space) as possble, the falure happens f the player s an unknown scale (e.g., s = 0.60) Per Frame Estmaton for Movng back Players In team sports vdeos, players durng defensve acton tend to have a pose or body drecton that s not the same as the player s movng drecton. Fgure 12 shows the result of test (10). Ths shows the ablty of our method to classfy the pose correctly even when the player s movng backward. Ths feature shows the hgh applcablty to team sports vdeos, whereas walkng and runnng backwards only rarely appears n survellance vdeos. Fg. 11 Temporal analyss of test (9). Fg. 12 Walkng back result of test (10). dependence of the algnment of the tracker, Label-Grd classfers tend to have more errors along wth an ncrease n pelvs poston error. In Fg. 11 (a), each jont errors tends to ncrease when pelvs error becomes large. In addton, whether the player s wthn the scales of the dataset durng the test tme s also mportant. In the experments, we used scales s = {0.70, 0.725,..., 1.0} for creatng the augmented tranng mages. If the player scale wthn the wndow s too small or too large, the three-level pyramd HOG features wll become an unknown pattern for the Label-Grd classfer (Random Decson Forests). Our framework has two advantages for overcomng ths problem. Frst, Random Decson Forests can learn the nter-class dstance, as our Label-Grd classfer tends to msclassfy the sample wth the neghborng Label-Grd. In addton, the three-level pyramd HOG features also help the appearance over three resoluton levels and help to evade msclassfcaton to the dstant 6. Concluson To estmate lower body poses from low-resoluton mages, we proposed a new human pose estmaton method usng a Label- Grd classfer that s ntegrated wth an object tracker. Our Label- Grd classfer does not use the pctoral structures, but use the algnment of the player s pelvs poston to classfy varous types and scales of poses nto a grd structure wth off-the-shelf multclass classfers (we use Random Decson Forests n ths paper). Algnment between the trackng-by-detecton module and the Label-Grd classfcaton module s the key to realze the estmaton of lower body poses wth all poses n team sports vdeos. Our system can even estmate poses of the solated player wth part-occlusons and non-uprght poses, whch are dffcult to estmate wth the methods usng pctoral structures and part detectors. Our pose classfcaton strategy usng a whole person HOG makes t possble to classfy the lower body jont locatons of a player even durng the sde runnng poses. Our framework can be vewed as a revsted verson of Ref. [16] by usng machnelearnng and dense vsual features. In other words, tradtonal slhouette matchng strategy for pose estmaton was nnovated by our approach usng HOG features and Random Decson Forests to embed all pose appearance patterns nto the randomzed feature space. In ths work, we only nvestgated the possblty of our framework for an solated player wthout any occluson between players. However, the lower body pose estmaton of solated players wll be useful for many team sports vdeos because players are mostly solated durng play. As our experments showed, our system can estmate all types of poses wth only monocular RGB vdeos f a suffcent amount of poses are prepared n datasets. In addton, our method can estmate sde-runnng poses whle FMP [4] can only detect starshaped part confguratons and cannot estmate non-star sde runnng poses. However, the estmaton fals when the algnment s not very accurate because of the drft of the tracker. We beleve that ths method s advantage over prevous pose estmaton methods wll open up the wde range of potental of player actvty recognton from the estmated jont postons usng only monocular cameras and any low-resoluton settngs of people trackng. Jont postons of the lower body wll provde a new source of rcher nformaton for sports data analyss wth only passve sensng. 256

12 Informaton and Meda Technologes 10(2): (2015) reprnted from: IPSJ Transactons on Computer Vson and Applcatons 7: (2015) Informaton Processng Socety of Japan Our future work ncludes jont estmaton of multple jonts and probablstc formulaton usng a knematc body model (whle ths paper only nvestgates one-shot classfcaton as a frst smple proposal of grd-wse classfcaton for pose estmaton). Moreover, we wll use 3D pose nformaton such as the upper body pose, whch we are explorng n other research [24], or movement drecton to restrct the pose feature space usng these types of nformaton as prors. For behavor understandng usng the lower body pose, we wll nvestgate the leg-based actvty recognton such as recognzng foot steps and key-pose extracton for sports vdeo summarzaton and retreval. Acknowledgments The data were provded by the Panasonc Corporaton and the Japan Amercan Football Assocaton. The authors also would lke to thank Kyosh Matsuo for hs help on the dscusson of the expermental settngs. References [1] Wu, Y., Lm, J. and Yang, M.-H.: Onlne object trackng: A benchmark, 2013 IEEE Conference on Computer Vson and Pattern Recognton (CVPR), pp , IEEE (2013). [2] Wang, C., Wang, Y. and Yulle, A.L.: An approach to pose-based acton recognton, 2013 IEEE Conference on Computer Vson and Pattern Recognton (CVPR), pp , IEEE (2013). [3] Yang, Y., Baker, S., Kannan, A. and Ramanan, D.: Recognzng proxemcs n personal photos, 2012 IEEE Conference on Computer Vson and Pattern Recognton (CVPR), pp , IEEE (2012). [4] Yang, Y. and Ramanan, D.: Artculated pose estmaton wth flexble mxtures-of-parts, 2011 IEEE Conference on Computer Vson and Pattern Recognton (CVPR), pp , IEEE (2011). [5] Klaser, A. and Marszalek, M.: A spato-temporal descrptor based on 3d-gradents (2008). [6] Sadanand, S. and Corso, J.J.: Acton bank: A hgh-level representaton of actvty n vdeo, 2012 IEEE Conference on Computer Vson and Pattern Recognton (CVPR), pp , IEEE (2012). [7] Jhuang, H., Gall, J., Zuff, S., Schmd, C. and Black, M.J.: Towards understandng acton recognton, IEEE Internatonal Conference on Computer Vson (ICCV), Sydney, Australa, pp , IEEE (2013). [8] Shotton, J., Sharp, T., Kpman, A., Ftzgbbon, A., Fnoccho, M., Blake, A., Cook, M. and Moore, R.: Real-tme human pose recognton n parts from sngle depth mages, Comm. ACM, Vol.56, No.1, pp (2013). [9] Urtasun, R., Fleet, D.J. and Fua, P.: 3D people trackng wth Gaussan process dynamcal models, 2006 IEEE Computer Socety Conference on Computer Vson and Pattern Recognton, Vol.1, pp , IEEE (2006). [10] Kazem, V.: Mult-vew Body Part Recognton wth Random Forests, 2013 IEEE Conference on Computer Vson and Pattern Recognton (CVPR), pp , IEEE (2013). [11] Bretensten, M.D., Rechln, F., Lebe, B., Koller-Meer, E. and Van Gool, L.: Robust trackng-by-detecton usng a detector confdence partcle flter, 2009 IEEE 12th Internatonal Conference on Computer Vson, pp , IEEE (2009). [12] Felzenszwalb, P.F., Grshck, R.B., McAllester, D.A. and Ramanan, D.: Object Detecton wth Dscrmnatvely Traned Part-Based Models, IEEE Trans. Pattern Analyss and Machne Intellgence, Vol.32, pp (onlne), DOI: /TPAMI (2010). [13] Jan, A.K. and L, S.Z.: Handbook of face recognton, Sprnger (2005). [14] Dalal, N. and Trggs, B.: Hstograms of Orented Gradents for Human Detecton, Computer Vson and Pattern Recognton, Vol.1, pp (2005). [15] Bourdev, L. and Malk, J.: Poselets: Body Part Detectors Traned Usng 3D Human Pose Annotatons, Internatonal Conference on Computer Vson (ICCV) (2009). [16] Germann, M., Popa, T., Zegler, R., Keser, R. and Gross, M.: Space- Tme Body Pose Estmaton n Uncontrolled Envronments, Proc Internatonal Conference on 3D Imagng, Modelng, Processng, Vsualzaton and Transmsson (3DIMPVT 11), pp , Washngton, DC, USA, IEEE Computer Socety (2011). [17] Wang, Y., Tran, D. and Lao, Z.: Learnng herarchcal poselets for human parsng, 2011 IEEE Conference on Computer Vson and Pattern Recognton (CVPR), pp , IEEE (2011). [18] Pshchuln, L., Andrluka, M., Gehler, P. and Schele, B.: Poselet condtoned pctoral structures, 2013 IEEE Conference on Computer Vson and Pattern Recognton (CVPR), pp , IEEE (2013). [19] Radwan, I., Dhall, A., Josh, J. and Goecke, R.: Regresson based pose estmaton wth automatc occluson detecton and rectfcaton, 2012 IEEE Internatonal Conference on Multmeda and Expo (ICME), pp , IEEE (2012). [20] Gammeter, S., Ess, A., Jäggl, T., Schndler, K., Lebe, B. and Van Gool, L.: Artculated mult-body trackng under egomoton, Computer Vson ECCV 2008, pp , Sprnger (2008). [21] Rogez, G., Rhan, J., Orrte-Uruñuela, C. and Torr, P.H.: Fast human pose detecton usng randomzed herarchcal cascades of rejectors, Internatonal Journal of Computer Vson, Vol.99, No.1, pp (2012). [22] Crmns, A. and Shotton, J.: Decson forests for computer vson and medcal mage analyss, Sprnger (2013). [23] Malsewcz, T., Gupta, A. and Efros, A.A.: Ensemble of Exemplar- SVMs for Object Detecton and Beyond, Internatonal Conference on Computer Vson (ICCV) (2011). [24] Hayash, M., Yamamoto, T., Ohshma, K., Tanabk, M. and Aok, Y.: Head and Upper Body Pose Estmaton n Team Sport Vdeos, nd IAPR Asan Conference on Pattern Recognton (ACPR), pp , IEEE (2013). [25] Brstow, H.: C++ mplementaton of Deva Ramanan s Artculated Pose Estmaton wth Flexble Mxtures of Parts. [26] Ferrar, V., Marn-Jmenez, M. and Zsserman, A.: Progressve search space reducton for human pose estmaton, 2008 IEEE Conference on Computer Vson and Pattern Recognton (CVPR), pp.1 8, IEEE (2008). [27] Andrluka, M., Roth, S. and Schele, B.: Pctoral structures revsted: People detecton and artculated pose estmaton, 2009 IEEE Conference on Computer Vson and Pattern Recognton (CVPR), pp , IEEE (2009). Masak Hayash receved an M.Sc. degree n Computer Vson and Image Processng from Keo Unversty n Snce 2012, he has been a Ph.D. canddate at Keo Unversty. Hs research nterests are vdeo-based human pose estmaton and actvty recognton, especally for team sports vdeos. Kyoko Oshma receved a B.A. of Lberal Arts from Internatonal Chrstan Unversty n She s a staff engneer n the Panasonc Corporaton and works on research and development of computer vson system. Masamoto Tanabk receved an M.Sc. degree n Electrcal Engneerng from Waseda Unversty n Hs research nterests are vdeo-based human trackng and pose estmaton, especally for securty, busness ntellgence, and sports. 257

13 Informaton and Meda Technologes 10(2): (2015) reprnted from: IPSJ Transactons on Computer Vson and Applcatons 7: (2015) Informaton Processng Socety of Japan Yoshmtsu Aok s an Assocate Professor, Department of Electroncs and Electrcal Engneerng, Keo Unversty. He receved hs Ph.D. n Engneerng from Waseda Unversty n From 2002 to 2008, he was an Assocate Professor n Shbaura Insttute of Technology. Snce 2008, he has been an Assocate Professor at Department of Electroncs and Electrcal Engneerng n Keo Unversty. He performs research n the areas of Computer Vson, Pattern Recognton, and Meda Sensng/Understandng. (Communcated by Greg Mor) 258

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1 4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

Detection of hand grasping an object from complex background based on machine learning co-occurrence of local image feature

Detection of hand grasping an object from complex background based on machine learning co-occurrence of local image feature Detecton of hand graspng an object from complex background based on machne learnng co-occurrence of local mage feature Shnya Moroka, Yasuhro Hramoto, Nobutaka Shmada, Tadash Matsuo, Yoshak Shra Rtsumekan

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Shape Representation Robust to the Sketching Order Using Distance Map and Direction Histogram

Shape Representation Robust to the Sketching Order Using Distance Map and Direction Histogram Shape Representaton Robust to the Sketchng Order Usng Dstance Map and Drecton Hstogram Department of Computer Scence Yonse Unversty Kwon Yun CONTENTS Revew Topc Proposed Method System Overvew Sketch Normalzaton

More information

Real-time Joint Tracking of a Hand Manipulating an Object from RGB-D Input

Real-time Joint Tracking of a Hand Manipulating an Object from RGB-D Input Real-tme Jont Tracng of a Hand Manpulatng an Object from RGB-D Input Srnath Srdhar 1 Franzsa Mueller 1 Mchael Zollhöfer 1 Dan Casas 1 Antt Oulasvrta 2 Chrstan Theobalt 1 1 Max Planc Insttute for Informatcs

More information

Face Detection with Deep Learning

Face Detection with Deep Learning Face Detecton wth Deep Learnng Yu Shen Yus122@ucsd.edu A13227146 Kuan-We Chen kuc010@ucsd.edu A99045121 Yzhou Hao y3hao@ucsd.edu A98017773 Mn Hsuan Wu mhwu@ucsd.edu A92424998 Abstract The project here

More information

Multiple Frame Motion Inference Using Belief Propagation

Multiple Frame Motion Inference Using Belief Propagation Multple Frame Moton Inference Usng Belef Propagaton Jang Gao Janbo Sh The Robotcs Insttute Department of Computer and Informaton Scence Carnege Mellon Unversty Unversty of Pennsylvana Pttsburgh, PA 53

More information

Real-time Motion Capture System Using One Video Camera Based on Color and Edge Distribution

Real-time Motion Capture System Using One Video Camera Based on Color and Edge Distribution Real-tme Moton Capture System Usng One Vdeo Camera Based on Color and Edge Dstrbuton YOSHIAKI AKAZAWA, YOSHIHIRO OKADA, AND KOICHI NIIJIMA Graduate School of Informaton Scence and Electrcal Engneerng,

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Collaboratively Regularized Nearest Points for Set Based Recognition

Collaboratively Regularized Nearest Points for Set Based Recognition Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,

More information

Detection of an Object by using Principal Component Analysis

Detection of an Object by using Principal Component Analysis Detecton of an Object by usng Prncpal Component Analyss 1. G. Nagaven, 2. Dr. T. Sreenvasulu Reddy 1. M.Tech, Department of EEE, SVUCE, Trupath, Inda. 2. Assoc. Professor, Department of ECE, SVUCE, Trupath,

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

Local Quaternary Patterns and Feature Local Quaternary Patterns

Local Quaternary Patterns and Feature Local Quaternary Patterns Local Quaternary Patterns and Feature Local Quaternary Patterns Jayu Gu and Chengjun Lu The Department of Computer Scence, New Jersey Insttute of Technology, Newark, NJ 0102, USA Abstract - Ths paper presents

More information

High Five: Recognising human interactions in TV shows

High Five: Recognising human interactions in TV shows PATRON-PEREZ ET AL.: RECOGNISING INTERACTIONS IN TV SHOWS 1 Hgh Fve: Recognsng human nteractons n TV shows Alonso Patron-Perez alonso@robots.ox.ac.uk Marcn Marszalek marcn@robots.ox.ac.uk Andrew Zsserman

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Multi-view 3D Position Estimation of Sports Players

Multi-view 3D Position Estimation of Sports Players Mult-vew 3D Poston Estmaton of Sports Players Robbe Vos and Wlle Brnk Appled Mathematcs Department of Mathematcal Scences Unversty of Stellenbosch, South Afrca Emal: vosrobbe@gmal.com Abstract The problem

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Histogram of Template for Pedestrian Detection

Histogram of Template for Pedestrian Detection PAPER IEICE TRANS. FUNDAMENTALS/COMMUN./ELECTRON./INF. & SYST., VOL. E85-A/B/C/D, No. xx JANUARY 20xx Hstogram of Template for Pedestran Detecton Shaopeng Tang, Non Member, Satosh Goto Fellow Summary In

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

Robust Inlier Feature Tracking Method for Multiple Pedestrian Tracking

Robust Inlier Feature Tracking Method for Multiple Pedestrian Tracking 2011 Internatonal Conference on Informaton and Intellgent Computng IPCSIT vol.18 (2011) (2011) IACSIT Press, Sngapore Robust Inler Feature Trackng Method for Multple Pedestran Trackng Young-Chul Lm a*

More information

Fast Feature Value Searching for Face Detection

Fast Feature Value Searching for Face Detection Vol., No. 2 Computer and Informaton Scence Fast Feature Value Searchng for Face Detecton Yunyang Yan Department of Computer Engneerng Huayn Insttute of Technology Hua an 22300, Chna E-mal: areyyyke@63.com

More information

Face Tracking Using Motion-Guided Dynamic Template Matching

Face Tracking Using Motion-Guided Dynamic Template Matching ACCV2002: The 5th Asan Conference on Computer Vson, 23--25 January 2002, Melbourne, Australa. Face Trackng Usng Moton-Guded Dynamc Template Matchng Lang Wang, Tenu Tan, Wemng Hu atonal Laboratory of Pattern

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Classifier Swarms for Human Detection in Infrared Imagery

Classifier Swarms for Human Detection in Infrared Imagery Classfer Swarms for Human Detecton n Infrared Imagery Yur Owechko, Swarup Medasan, and Narayan Srnvasa HRL Laboratores, LLC 3011 Malbu Canyon Road, Malbu, CA 90265 {owechko, smedasan, nsrnvasa}@hrl.com

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

A Background Subtraction for a Vision-based User Interface *

A Background Subtraction for a Vision-based User Interface * A Background Subtracton for a Vson-based User Interface * Dongpyo Hong and Woontack Woo KJIST U-VR Lab. {dhon wwoo}@kjst.ac.kr Abstract In ths paper, we propose a robust and effcent background subtracton

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS

EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS P.G. Demdov Yaroslavl State Unversty Anatoly Ntn, Vladmr Khryashchev, Olga Stepanova, Igor Kostern EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS Yaroslavl, 2015 Eye

More information

A Study on the Application of Spatial-Knowledge-Tags using Human Motion in Intelligent Space

A Study on the Application of Spatial-Knowledge-Tags using Human Motion in Intelligent Space A Study on the Applcaton of Spatal-Knowledge-Tags usng Human Moton n Intellgent Space Tae-Seok Jn*, Kazuyuk Moroka**, Mhoko Ntsuma*, Takesh Sasak*, and Hdek Hashmoto * * Insttute of Industral Scence, the

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

Combined Object Detection and Segmentation

Combined Object Detection and Segmentation Combned Object Detecton and Segmentaton Jarch Vansteenberge, Masayuk Mukunok, and Mchhko Mnoh Abstract We develop a method for combned object detecton and segmentaton n natural scene. In our approach segmentaton

More information

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity Journal of Sgnal and Informaton Processng, 013, 4, 114-119 do:10.436/jsp.013.43b00 Publshed Onlne August 013 (http://www.scrp.org/journal/jsp) Corner-Based Image Algnment usng Pyramd Structure wth Gradent

More information

Online Detection and Classification of Moving Objects Using Progressively Improving Detectors

Online Detection and Classification of Moving Objects Using Progressively Improving Detectors Onlne Detecton and Classfcaton of Movng Objects Usng Progressvely Improvng Detectors Omar Javed Saad Al Mubarak Shah Computer Vson Lab School of Computer Scence Unversty of Central Florda Orlando, FL 32816

More information

Image Alignment CSC 767

Image Alignment CSC 767 Image Algnment CSC 767 Image algnment Image from http://graphcs.cs.cmu.edu/courses/15-463/2010_fall/ Image algnment: Applcatons Panorama sttchng Image algnment: Applcatons Recognton of object nstances

More information

Comparing Image Representations for Training a Convolutional Neural Network to Classify Gender

Comparing Image Representations for Training a Convolutional Neural Network to Classify Gender 2013 Frst Internatonal Conference on Artfcal Intellgence, Modellng & Smulaton Comparng Image Representatons for Tranng a Convolutonal Neural Network to Classfy Gender Choon-Boon Ng, Yong-Haur Tay, Bok-Mn

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng, Natonal Unversty of Sngapore {shva, phanquyt, tancl }@comp.nus.edu.sg

More information

Motion Boundary Trajectory for Human Action Recognition

Motion Boundary Trajectory for Human Action Recognition Moton Boundary Trajectory for Human Acton Recognton So-Long Lo and Ah-Chung Tso Faculty of Informaton Technology, Macau Unversty of Scence and Technology Abstract. In ths paper, we propose a novel approach

More information

Discriminative classifiers for object classification. Last time

Discriminative classifiers for object classification. Last time Dscrmnatve classfers for object classfcaton Thursday, Nov 12 Krsten Grauman UT Austn Last tme Supervsed classfcaton Loss and rsk, kbayes rule Skn color detecton example Sldng ndo detecton Classfers, boostng

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection 2009 10th Internatonal Conference on Document Analyss and Recognton A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng,

More information

Attributed Relational Graph Based Feature Extraction of Body Poses In Indian Classical Dance Bharathanatyam

Attributed Relational Graph Based Feature Extraction of Body Poses In Indian Classical Dance Bharathanatyam Athra.Sugathan et al Int. Journal of Engneerng Research and Applcatons ISSN : 2248-9622, Vol. 4, Issue 5( Verson 7), May 2014, pp.11-17 RESEARCH ARTICLE OPEN ACCESS Attrbuted Relatonal Graph Based Feature

More information

Action Recognition by Matching Clustered Trajectories of Motion Vectors

Action Recognition by Matching Clustered Trajectories of Motion Vectors Acton Recognton by Matchng Clustered Trajectores of Moton Vectors Mchals Vrgkas 1, Vasleos Karavasls 1, Chrstophoros Nkou 1 and Ioanns Kakadars 2 1 Department of Computer Scence, Unversty of Ioannna, Ioannna,

More information

Learning Ensemble of Local PDM-based Regressions. Yen Le Computational Biomedicine Lab Advisor: Prof. Ioannis A. Kakadiaris

Learning Ensemble of Local PDM-based Regressions. Yen Le Computational Biomedicine Lab Advisor: Prof. Ioannis A. Kakadiaris Learnng Ensemble of Local PDM-based Regressons Yen Le Computatonal Bomedcne Lab Advsor: Prof. Ioanns A. Kakadars 1 Problem statement Fttng a statstcal shape model (PDM) for mage segmentaton Callosum segmentaton

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

Sequential Monte-Carlo Based Road Region Segmentation Algorithm with Uniform Spatial Sampling

Sequential Monte-Carlo Based Road Region Segmentation Algorithm with Uniform Spatial Sampling IPSJ Transactons on Computer Vson and Applcatons Vol.8 1 10 (Feb. 2016) [DOI: 10.2197/psjtcva.8.1] Regular Paper Sequental Monte-Carlo Based Road Regon Segmentaton Algorthm wth Unform Spatal Samplng Zdeněk

More information

Fitting and Alignment

Fitting and Alignment Fttng and Algnment Computer Vson Ja-Bn Huang, Vrgna Tech Many sldes from S. Lazebnk and D. Hoem Admnstratve Stuffs HW 1 Competton: Edge Detecton Submsson lnk HW 2 wll be posted tonght Due Oct 09 (Mon)

More information

Large-scale Web Video Event Classification by use of Fisher Vectors

Large-scale Web Video Event Classification by use of Fisher Vectors Large-scale Web Vdeo Event Classfcaton by use of Fsher Vectors Chen Sun and Ram Nevata Unversty of Southern Calforna, Insttute for Robotcs and Intellgent Systems Los Angeles, CA 90089, USA {chensun nevata}@usc.org

More information

Learning-based License Plate Detection on Edge Features

Learning-based License Plate Detection on Edge Features Learnng-based Lcense Plate Detecton on Edge Features Wng Teng Ho, Woo Hen Yap, Yong Haur Tay Computer Vson and Intellgent Systems (CVIS) Group Unverst Tunku Abdul Rahman, Malaysa wngteng_h@yahoo.com, woohen@yahoo.com,

More information

Machine Learning 9. week

Machine Learning 9. week Machne Learnng 9. week Mappng Concept Radal Bass Functons (RBF) RBF Networks 1 Mappng It s probably the best scenaro for the classfcaton of two dataset s to separate them lnearly. As you see n the below

More information

A B-Snake Model Using Statistical and Geometric Information - Applications to Medical Images

A B-Snake Model Using Statistical and Geometric Information - Applications to Medical Images A B-Snake Model Usng Statstcal and Geometrc Informaton - Applcatons to Medcal Images Yue Wang, Eam Khwang Teoh and Dnggang Shen 2 School of Electrcal and Electronc Engneerng, Nanyang Technologcal Unversty

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

2. Related Work Hand-crafted Features Based Trajectory Prediction Deep Neural Networks Based Trajectory Prediction

2. Related Work Hand-crafted Features Based Trajectory Prediction Deep Neural Networks Based Trajectory Prediction Encodng Crowd Interacton wth Deep Neural Network for Pedestran Trajectory Predcton Yanyu Xu ShanghaTech Unversty xuyy2@shanghatech.edu.cn Zhxn Pao ShanghaTech Unversty paozhx@shanghatech.edu.cn Shenghua

More information

Resolving Ambiguity in Depth Extraction for Motion Capture using Genetic Algorithm

Resolving Ambiguity in Depth Extraction for Motion Capture using Genetic Algorithm Resolvng Ambguty n Depth Extracton for Moton Capture usng Genetc Algorthm Yn Yee Wa, Ch Kn Chow, Tong Lee Computer Vson and Image Processng Laboratory Dept. of Electronc Engneerng The Chnese Unversty of

More information

Tracking by Cluster Analysis of Feature Points and Multiple Particle Filters 1

Tracking by Cluster Analysis of Feature Points and Multiple Particle Filters 1 Tracng by Cluster Analyss of Feature Ponts and Multple Partcle Flters 1 We Du, Justus Pater Unversty of Lège, Department of Electrcal Engneerng and Computer Scence, Insttut Montefore, B28, Sart Tlman Campus,

More information

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines A Modfed Medan Flter for the Removal of Impulse Nose Based on the Support Vector Machnes H. GOMEZ-MORENO, S. MALDONADO-BASCON, F. LOPEZ-FERRERAS, M. UTRILLA- MANSO AND P. GIL-JIMENEZ Departamento de Teoría

More information

Video Object Tracking Based On Extended Active Shape Models With Color Information

Video Object Tracking Based On Extended Active Shape Models With Color Information CGIV'2002: he Frst Frst European Conference Colour on Colour n Graphcs, Imagng, and Vson Vdeo Object rackng Based On Extended Actve Shape Models Wth Color Informaton A. Koschan, S.K. Kang, J.K. Pak, B.

More information

Development of Face Tracking and Recognition Algorithm for DVR (Digital Video Recorder)

Development of Face Tracking and Recognition Algorithm for DVR (Digital Video Recorder) IJCSNS Internatonal Journal of Computer Scence and Network Securty, VOL.6 No.3A, March 2006 7 Development of Face Trackng and Recognton Algorthm for DVR (Dgtal Vdeo Recorder) Jang-Seon Ryu and Eung-Tae

More information

Scale Selective Extended Local Binary Pattern For Texture Classification

Scale Selective Extended Local Binary Pattern For Texture Classification Scale Selectve Extended Local Bnary Pattern For Texture Classfcaton Yutng Hu, Zhlng Long, and Ghassan AlRegb Multmeda & Sensors Lab (MSL) Georga Insttute of Technology 03/09/017 Outlne Texture Representaton

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

Computer Animation and Visualisation. Lecture 4. Rigging / Skinning

Computer Animation and Visualisation. Lecture 4. Rigging / Skinning Computer Anmaton and Vsualsaton Lecture 4. Rggng / Sknnng Taku Komura Overvew Sknnng / Rggng Background knowledge Lnear Blendng How to decde weghts? Example-based Method Anatomcal models Sknnng Assume

More information

What is Object Detection? Face Detection using AdaBoost. Detection as Classification. Principle of Boosting (Schapire 90)

What is Object Detection? Face Detection using AdaBoost. Detection as Classification. Principle of Boosting (Schapire 90) CIS 5543 Coputer Vson Object Detecton What s Object Detecton? Locate an object n an nput age Habn Lng Extensons Vola & Jones, 2004 Dalal & Trggs, 2005 one or ultple objects Object segentaton Object detecton

More information

Book chapter for Machine Learning for Human Motion Analysis: Theory and Practice

Book chapter for Machine Learning for Human Motion Analysis: Theory and Practice YngL Tan 1, Rogero Fers 2, Lsa Brown 2, Danel Vaquero 3, Yun Zha 2, Arun Hampapur 2 1 Department of Electrcal Engneerng Cty College, Cty Unversty of New York, New York, NY ytan@ccny.cuny.edu 2 IBM T. J.

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION 1 THE PUBLISHING HOUSE PROCEEDINGS OF THE ROMANIAN ACADEMY, Seres A, OF THE ROMANIAN ACADEMY Volume 4, Number 2/2003, pp.000-000 A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION Tudor BARBU Insttute

More information

Discrete-Continuous Depth Estimation from a Single Image

Discrete-Continuous Depth Estimation from a Single Image Dscrete-Contnuous Depth Estmaton from a Sngle Image Maomao Lu, Matheu Salzmann, Xumng He NICTA and CECS, ANU, Canberra {maomao.lu, matheu.salzmann, xumng.he}@ncta.com.au Abstract In ths paper, we tackle

More information

A SALIENCY BASED OBJECT TRACKING METHOD

A SALIENCY BASED OBJECT TRACKING METHOD A SALIENCY BASED OBJECT TRACKING METHOD Shje Zhang and Fred Stentford Unversty College London, Adastral Park Campus, Ross Buldng Martlesham Heath, Ipswch, IP5 3RE, UK {j.zhang, f.stentford}@adastral.ucl.ac.uk

More information

Capturing Global and Local Dynamics for Human Action Recognition

Capturing Global and Local Dynamics for Human Action Recognition 2014 22nd Internatonal Conference on Pattern Recognton Capturng Global and Local Dynamcs for Human Acton Recognton Sq Ne Department of Electrcal, Computer and System Engneerng Rensselaer Polytechnc Insttute

More information

2 ZHENG et al.: ASSOCIATING GROUPS OF PEOPLE (a) Ambgutes from person re dentfcaton n solaton (b) Assocatng groups of people may reduce ambgutes n mat

2 ZHENG et al.: ASSOCIATING GROUPS OF PEOPLE (a) Ambgutes from person re dentfcaton n solaton (b) Assocatng groups of people may reduce ambgutes n mat ZHENG et al.: ASSOCIATING GROUPS OF PEOPLE 1 Assocatng Groups of People We-Sh Zheng jason@dcs.qmul.ac.uk Shaogang Gong sgg@dcs.qmul.ac.uk Tao Xang txang@dcs.qmul.ac.uk School of EECS, Queen Mary Unversty

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline mage Vsualzaton mage Vsualzaton mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and Analyss outlne mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and

More information

Adaptive Silhouette Extraction and Human Tracking in Dynamic. Environments 1

Adaptive Silhouette Extraction and Human Tracking in Dynamic. Environments 1 Adaptve Slhouette Extracton and Human Trackng n Dynamc Envronments 1 X Chen, Zhha He, Derek Anderson, James Keller, and Marjore Skubc Department of Electrcal and Computer Engneerng Unversty of Mssour,

More information

A Probabilistic Approach to Detect Urban Regions from Remotely Sensed Images Based on Combination of Local Features

A Probabilistic Approach to Detect Urban Regions from Remotely Sensed Images Based on Combination of Local Features A Probablstc Approach to Detect Urban Regons from Remotely Sensed Images Based on Combnaton of Local Features Berl Sırmaçek German Aerospace Center (DLR) Remote Sensng Technology Insttute Weßlng, 82234,

More information

Gender Classification using Interlaced Derivative Patterns

Gender Classification using Interlaced Derivative Patterns Gender Classfcaton usng Interlaced Dervatve Patterns Author Shobernejad, Ameneh, Gao, Yongsheng Publshed 2 Conference Ttle Proceedngs of the 2th Internatonal Conference on Pattern Recognton (ICPR 2) DOI

More information

Multi-View Face Alignment Using 3D Shape Model for View Estimation

Multi-View Face Alignment Using 3D Shape Model for View Estimation Mult-Vew Face Algnment Usng 3D Shape Model for Vew Estmaton Yanchao Su 1, Hazhou A 1, Shhong Lao 1 Computer Scence and Technology Department, Tsnghua Unversty Core Technology Center, Omron Corporaton ahz@mal.tsnghua.edu.cn

More information

MOTION PANORAMA CONSTRUCTION FROM STREAMING VIDEO FOR POWER- CONSTRAINED MOBILE MULTIMEDIA ENVIRONMENTS XUNYU PAN

MOTION PANORAMA CONSTRUCTION FROM STREAMING VIDEO FOR POWER- CONSTRAINED MOBILE MULTIMEDIA ENVIRONMENTS XUNYU PAN MOTION PANORAMA CONSTRUCTION FROM STREAMING VIDEO FOR POWER- CONSTRAINED MOBILE MULTIMEDIA ENVIRONMENTS by XUNYU PAN (Under the Drecton of Suchendra M. Bhandarkar) ABSTRACT In modern tmes, more and more

More information

Efficient Segmentation and Classification of Remote Sensing Image Using Local Self Similarity

Efficient Segmentation and Classification of Remote Sensing Image Using Local Self Similarity ISSN(Onlne): 2320-9801 ISSN (Prnt): 2320-9798 Internatonal Journal of Innovatve Research n Computer and Communcaton Engneerng (An ISO 3297: 2007 Certfed Organzaton) Vol.2, Specal Issue 1, March 2014 Proceedngs

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

Empirical Distributions of Parameter Estimates. in Binary Logistic Regression Using Bootstrap

Empirical Distributions of Parameter Estimates. in Binary Logistic Regression Using Bootstrap Int. Journal of Math. Analyss, Vol. 8, 4, no. 5, 7-7 HIKARI Ltd, www.m-hkar.com http://dx.do.org/.988/jma.4.494 Emprcal Dstrbutons of Parameter Estmates n Bnary Logstc Regresson Usng Bootstrap Anwar Ftranto*

More information

Face Recognition Based on SVM and 2DPCA

Face Recognition Based on SVM and 2DPCA Vol. 4, o. 3, September, 2011 Face Recognton Based on SVM and 2DPCA Tha Hoang Le, Len Bu Faculty of Informaton Technology, HCMC Unversty of Scence Faculty of Informaton Scences and Engneerng, Unversty

More information

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and

More information

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces Range mages For many structured lght scanners, the range data forms a hghly regular pattern known as a range mage. he samplng pattern s determned by the specfc scanner. Range mage regstraton 1 Examples

More information

Extraction of Human Activities as Action Sequences using plsa and PrefixSpan

Extraction of Human Activities as Action Sequences using plsa and PrefixSpan Extracton of Human Actvtes as Acton Sequences usng plsa and PrefxSpan Takuya TONARU Tetsuya TAKIGUCHI Yasuo ARIKI Graduate School of Engneerng, Kobe Unversty Organzaton of Advanced Scence and Technology,

More information

A high precision collaborative vision measurement of gear chamfering profile

A high precision collaborative vision measurement of gear chamfering profile Internatonal Conference on Advances n Mechancal Engneerng and Industral Informatcs (AMEII 05) A hgh precson collaboratve vson measurement of gear chamferng profle Conglng Zhou, a, Zengpu Xu, b, Chunmng

More information

Detection of Human Actions from a Single Example

Detection of Human Actions from a Single Example Detecton of Human Actons from a Sngle Example Hae Jong Seo and Peyman Mlanfar Electrcal Engneerng Department Unversty of Calforna at Santa Cruz 1156 Hgh Street, Santa Cruz, CA, 95064 {rokaf,mlanfar}@soe.ucsc.edu

More information

Fitting & Matching. Lecture 4 Prof. Bregler. Slides from: S. Lazebnik, S. Seitz, M. Pollefeys, A. Effros.

Fitting & Matching. Lecture 4 Prof. Bregler. Slides from: S. Lazebnik, S. Seitz, M. Pollefeys, A. Effros. Fttng & Matchng Lecture 4 Prof. Bregler Sldes from: S. Lazebnk, S. Setz, M. Pollefeys, A. Effros. How do we buld panorama? We need to match (algn) mages Matchng wth Features Detect feature ponts n both

More information