Focal Loss in 3D Object Detection

Peng Yun1, Lei Tai2, Yuan Wang2, Chengju Liu3, Ming Liu2

Fig. 1. Upper two rows show projected 3D object detection results from the detector trained with binary cross entropy. Lower two rows present related results from the detector trained with the focal loss. Purple and blue bounding boxes are the ground-truth and the estimated results respectively.

Abstract—3D object detection is still an open problem in autonomous driving scenes. When recognizing and localizing key objects from sparse 3D inputs, autonomous vehicles suffer from a larger continuous searching space and higher fore-background imbalance compared to image-based object detection. In this paper, we aim to solve this fore-background imbalance in 3D object detection. Inspired by the recent use of focal loss in image-based object detection, we extend this hard-mining improvement of binary cross entropy to point-cloud-based object detection and conduct experiments to show its performance based on two different 3D detectors: 3D-FCN and VoxelNet. The evaluation results show up to 11.2 AP gains through the focal loss in a wide range of hyperparameters for 3D object detection.

Index Terms—Deep Learning in Robotics and Automation; Object Detection, Segmentation and Categorization; Recognition.

This work was supported by the National Natural Science Foundation of China (Grant No. U ), and was partially supported by Shenzhen Science Technology and Innovation Commission (SZSTI) JCYJ , the Research Grant Council of Hong Kong SAR Government, China, under Project No. , No. and No. awarded to Prof. Ming Liu. (Corresponding author: Peng Yun.)
1 Peng Yun is with the Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong (e-mail: pyun@ust.hk). 2 Lei Tai, Yuan Wang and Ming Liu are with the Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong (e-mail: {ltai, ywangeq, eelium}@ust.hk). 3 Chengju Liu is with the College of Electrical and Information Engineering, Tongji University, China (e-mail: liuchengju@tongji.edu.cn).

I. INTRODUCTION

OBJECT detection in 3D is still challenging in robotics perception, the applied scenes of which widely include urban and suburban roads, highways, bridges and indoor settings. Robots recognize and localize key objects from data in the 3D form and predict their locations, sizes and orientations, which provides both semantic and spatial information for high-level decision making. The point cloud is one of the most commonly used 3D data forms, and can be gathered by range cameras, like LiDAR and RGB-D cameras. Since the coordinate information of point clouds is not influenced by appearance changes, point clouds are also robust in extreme weather and various seasons. In addition, the point cloud is naturally scale-invariant: the scale of an object is invariant anywhere in a point cloud, while it always changes in an image due to foreshortening effects. Moreover, the increasing perception distance and decreasing price of 3D LiDARs make them a promising direction for autonomous driving researchers [1]. Current image-based detectors benefit from the translation invariance of convolution operations and can perform with human-comparable accuracy. However, the successful image-based architectures cannot be directly applied in 3D space. Point-cloud-based object detection consumes point clouds, which are sparse point lists instead of dense arrays. If drawing

on the success of image-based detectors and conducting dense convolution operations to acquire translation invariance, preprocessing must be implemented to convert the sparse point clouds into dense arrays. Otherwise, special layers should be carefully designed to extract meaningful features from the sparse inputs. Additionally, the fore-background imbalance is much more serious than in 2D scenarios, since the new z-axis further enlarges the searching space and the extent of imbalance is different for each z value.

Lin et al. [2] proposed focal loss to tackle the fore-background imbalance in image-based object detection, so that one-stage detectors could achieve state-of-the-art accuracy like two-stage detectors. As a hard-mining improvement of binary cross entropy, it helps the network focus on hard classified objects, in case they are overwhelmed by a large number of easily classified objects.

Similar to image-based detection methods, point-cloud-based detection methods can also be classified into two-stage [3], [4], [5] and one-stage detectors [6], [7]. In this paper, inspired by [2], we aim to solve the fore-background imbalance for 3D object detection through the focal loss. We claim the following contributions:

- We extend focal loss to 3D object detection to solve the huge fore-background imbalance in one-stage detectors, and conduct experiments on two different one-stage 3D object detectors, 3D-FCN [6] and VoxelNet [7]. The experiment results demonstrate up to 11.2 AP gains from the focal loss in a wide range of hyperparameters.
- To further understand focal loss in 3D object detection, we analyze its effect on foreground and background estimations, and validate that it plays a role similar to that in image-based detection. We also find that the special architecture of VoxelNet can naturally handle hard negatives well.
- We plot the final posterior probability distributions of the two detectors and demonstrate that the focal loss with increasing hyperparameter γ decreases the estimated posterior probabilities.

II. RELATED WORK

A. Two-Stage 3D Object Detection

When extending two-stage image detectors to the 3D space, researchers encounter the following problems: (1) the input is sparse and at low resolution; (2) the original image-based methods are not guaranteed to have enough information to generate region proposals. Ku et al. [4] proposed AVOD, which fuses RGB images and point clouds. It first proposes aligned 3D bounding boxes with a multimodal fusion region proposal network. Then, the proposed bounding boxes are classified and regressed with fully connected layers. Both the appearance and the 3D information are well-utilized to improve the accuracy and robustness of the proposed model in extreme scenes. Their hand-crafted features could be further improved to learn representations directly from raw LiDAR inputs to alleviate information loss. Qi et al. [3] proposed F-PointNet and leveraged both 2D object detectors and 3D deep learning for object localization.

TABLE I
IMAGE-BASED AND POINT-CLOUD-BASED OBJECT DETECTION

            Image-Based     Point-Cloud-Based
Method      -               3D-FCN [6]   VoxelNet [7]
Dimension   2D              3D           3D
Input       Dense Grid      Dense Grid   Sparse Point List
Network     Dense Conv      Dense Conv   Heterogeneous Pipeline
Stage       One/Two-Stage   One-Stage    One-Stage

They extracted the 3D bounding frustum of an object with a 2D object detector. Then 3D instance segmentation and 3D bounding box regression were applied with two variants of PointNet [8]. F-PointNet achieves state-of-the-art accuracy on the KITTI 3D object detection challenge [9], and also performs at real-time speed for 3D object detection. Their image detector needs to be carefully designed with a high recall rate, since the accuracy upper bound is determined by the first stage.

B. One-Stage 3D Object Detection

Li [6] extended a 2D fully convolutional network to 3D. The voxelized point clouds are processed by an encoder-decoder network. The 3D fully convolutional network (3D-FCN) finally proposes a probability and a regression map for the whole detection region.
It thoroughly consists of 3D dense convolutions with high computation and memory costs, so that the network depth is limited and it is hard to extract high-level features. Unlike 3D-FCN and AVOD, both of which adopt hand-crafted features to represent the point clouds, Zhou et al. [7] designed an end-to-end network, called VoxelNet, to implement point-cloud-based 3D object detection with learned representations. Compared to 3D-FCN [6], the computation cost is mitigated by the Voxel Feature Encoding Layers (VFELayers) and 2D convolution. In this paper, we adopt 3D-FCN [6] and VoxelNet [7] as two different types of one-stage 3D detectors. As shown in Table I, 3D-FCN consumes dense grids and consists of only 3D dense convolution layers, where the 2D FCN architecture [10] is extended to 3D for dense feature extraction. In contrast, VoxelNet consumes sparse point lists and is a heterogeneous network, which first extracts sparse features with its novel VFELayers and then conducts 3D and 2D convolution sequentially.

C. Imbalance between Foreground and Background

Image-based object detectors can be classified into two-stage and one-stage detectors. For two-stage detectors, like R-CNN [11], the first stage generates a sparse set of candidate object locations and the second stage classifies each candidate location as one of the foreground classes or as the background using a convolutional neural network. The two-stage detectors [12], [13] achieve state-of-the-art accuracy on the COCO benchmark. On the other hand, one-stage detectors, like YOLO [14] and SSD [15], aim to simplify the pipeline. They improve the training speed of deep models and also demonstrate promising results in terms of accuracy.

Lin et al. [2] explored both one-stage and two-stage detectors in image-based object detection, and claimed that the hurdle that obstructs the one-stage detectors from better accuracy is the extreme fore-background class imbalance encountered during training of dense detectors. They reshaped the standard cross entropy loss and proposed the focal loss such that the losses assigned to well-classified examples are down-weighted. This can be seen as a hard-mining improvement of binary cross entropy to help networks focus on hard classified objects in case they are overwhelmed by a large number of easily classified objects.

We extend focal loss to 3D object detection to tackle the fore-background imbalance problem. Different from image-based detection, point-cloud-based object detection is a more challenging perception problem in 3D space with sparse sensor data and suffers from more serious fore-background imbalance. To thoroughly evaluate the performance of the focal loss in this harder task, we conduct experiments based on two different types of one-stage 3D detectors: 3D-FCN and VoxelNet. We analyze the focal loss effect on these two 3D detectors following a similar method to that in [2], and further discuss the decreasing posterior probability effect of the focal loss.

III. FOCAL LOSS

In this section, we first declare notations and revisit the focal loss [2], and then further analyze the fore-background imbalance in 3D object detection.

A. Preliminaries

We define y ∈ {±1} as the ground-truth class, and p as the estimated probability for the class with label y = 1. For notational convenience, we define the posterior probability p_t as

    p_t = { p       if y = 1
          { 1 − p   if y = −1,                                (1)

where p is calculated with p = sigmoid(x). The binary cross entropy (BCE) loss and its derivative can be formulated as

    ε_BCE(p_t) = −log(p_t),                                   (2)

    dε_BCE(p_t)/dx = y(p_t − 1).                              (3)

As claimed in [2], when the network is trained with BCE loss, its gradient will be dominated by the vast easily classified negative samples if a huge fore-background imbalance exists.
Focal loss can be considered as a dynamically scaled cross entropy loss, which is defined as

    ε_FL(p_t) = −(1 − p_t)^γ log(p_t),                        (4)

    dε_FL(p_t)/dx = y(1 − p_t)^γ (γ p_t log(p_t) + p_t − 1).  (5)

The contribution from the well classified samples (p_t ≥ 0.5) to the loss is down-weighted. The hyperparameter γ of the focal loss can be used to tune the weight of different samples. As γ increases, fewer easily classified samples contribute to the training loss. Obviously, when γ reaches 0, the focal loss degrades to the BCE loss. In the following sections, all the cases with γ = 0 represent BCE loss cases.

Researchers have previously either introduced hyperparameters to balance the losses calculated from positive and negative anchors, or normalized positive and negative losses by the frequency of corresponding anchors. However, one essential problem that these two previous methods cannot handle is the gradient salience of hard negative samples: the gradients of hard negative anchors (p_t < 0.5) are overwhelmed by those of a large number of easy negative anchors (p_t ≥ 0.5). Due to the dynamic scaling with the posterior probability p_t, a weighted focal loss can be used to handle both the fore-background imbalance and the gradient salience of hard negative samples with the following form,

    ε_FL(p_t) = −λ(1 − p_t)^γ log(p_t),                       (6)

where λ is introduced to weight different classes. In the following sections, we adopt hyperparameters α and β to weight the positive and negative focal loss respectively.

B. Fore-background Imbalance in 3D Object Detection

The methods for 3D object detection can be classified as one-stage [6], [7] and two-stage [3], [4], [5] detectors. The two-stage detectors first adopt an algorithm with a high recall rate to propose regions that possibly contain objects and then adopt a convolution network to classify classes and regress bounding boxes. The one-stage detectors are end-to-end networks that learn representations and implement classification and regression on all anchors.
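As a concrete illustration, the weighted focal loss of Equation 6 can be sketched in a few lines of NumPy (function and argument names are ours; α and β play the role of the class weight λ for positive and negative samples):

```python
import numpy as np

def weighted_focal_loss(x, y, gamma=2.0, alpha=1.0, beta=1.0):
    """Per-sample weighted focal loss (Eq. 6) on raw logits x,
    with labels y in {+1, -1}. alpha weights positive samples,
    beta negative ones (the lambda of Eq. 6)."""
    p = 1.0 / (1.0 + np.exp(-x))            # p = sigmoid(x)
    p_t = np.where(y == 1, p, 1.0 - p)      # posterior probability, Eq. 1
    lam = np.where(y == 1, alpha, beta)     # class weight lambda
    return -lam * (1.0 - p_t) ** gamma * np.log(p_t)
```

With γ = 0 and α = β = 1 this reduces to the BCE loss of Equation 2, and increasing γ shrinks the loss of well-classified samples (p_t ≥ 0.5) much faster than that of hard ones.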
In one-stage methods, anchors are proposed at each location, and thus a huge fore-background imbalance exists. For instance, there are 50k bounding boxes proposed in each frame for 3D-FCN and 70k for VoxelNet, but fewer than 30 anchors among them contain positive objects (e.g. car, pedestrian, cyclist). Compared to image detectors, the extra estimation along the z-axis further increases the fore-background imbalance. Additionally, positive samples are always located at positions with small z values in some specific scenes. For instance, cars and pedestrians are always on the road in autonomous driving scenes. In such situations, the distribution of fore-background imbalance is different along the z-axis: the extent of imbalance increases with higher z values.

The one-stage methods for 3D detectors are different from the 2D detectors because of their larger searching space, sparse input and different types of network architecture. Therefore, we select two different networks, 3D-FCN and VoxelNet, to conduct experiments to evaluate the performance of focal loss in 3D object detection. The features of these two 3D detectors are discussed in the following two sections, and the experimental details and results are shown in Section VI.

IV. 3D-FCN FEATURES

In this section, we discuss the dense convolution network architecture of 3D-FCN and introduce our enhanced loss function for 3D-FCN. The details of 3D-FCN can be found in [6]. Please refer to the APPENDIX for our implementation of 3D-FCN.

Fig. 2. The dense convolution network architecture of 3D-FCN [6]: the [40,800,800,1] input grid is processed by the BodyNet into [5,100,100,96] features, from which the HeadNet produces the P-Map [5,100,100,1] and the R-Map [5,100,100,24]. The whole network consists of only 3D convolution layers. All intermediate tensors in the hidden space are dense 3D grids (represented by tensors with dimensions [height, width, length, feature]).

A. Dense Convolution Network Architecture

3D-FCN [6] draws on experience from image-based recognition tasks, and extends the 2D convolution layer to 3D space to acquire translation invariance. The input point cloud is firstly voxelized into a 3D dense grid. In each voxel of the 3D dense grid, the values {0,1} are used to represent whether any point is observed. The network architecture of 3D-FCN is shown in Figure 2. The voxelized point cloud is convolved by four blocks sequentially. The output features are then processed by two blocks individually to generate a probability map and a regression map (P-Map and R-Map). Different from image-based object detection, the probability map and regression map are both 3D dense grids, so that the searching space is exponentially increased.

B. Enhanced Loss Function

The original loss function for 3D-FCN [6] is shown on the left of Equations 7 to 11, where ε_P and ε_R represent the classification loss and regression loss, and ε_cls and ε_reg are the loss functions used for classification and regression respectively. In the regression loss ε_R, u and u* are the regression output and ground truth for positive anchors. In the classification loss ε_P, p_pos and p_neg represent the posterior probabilities of positive and negative estimations.

    Original:                       Enhanced:
    ε = ε_P + ε_R                   ε = ε_P + ε_R                             (7)
    ε_P = η(ε_pos + ε_neg)          ε_P = η(ε_pos + ε_neg)                    (8)
    ε_R = Σ ε_reg(u, u*)            ε_R = (1/N_pos) Σ ε_reg(u, u*)            (9)
    ε_pos = Σ ε_cls(p_pos, 1)       ε_pos = α (1/N_pos) Σ ε_cls(p_pos, 1)     (10)
    ε_neg = Σ ε_cls(p_neg, 0)       ε_neg = β (1/N_neg) Σ ε_cls(p_neg, 0)     (11)

In the original form, a large imbalance exists between ε_pos and ε_neg, which represent the classification losses of positive and negative samples respectively.
Therefore, we adopt the loss function used in VoxelNet [7], which normalizes each sub-loss by the corresponding frequency and balances ε_pos and ε_neg with two more hyperparameters α and β. The adopted loss function is shown on the right of Equations 7 to 11. In Section VI, we use the loss function on the right of Equations 7 to 11 to demonstrate the focal loss improvement compared with the BCE loss, where ε_reg denotes the square loss and ε_cls denotes the focal loss. We also show the improvement of the enhanced loss function form compared with the original loss function [6] in the APPENDIX, where ε_reg denotes the square loss and ε_cls denotes the BCE loss.

V. VOXELNET FEATURES

In this section, we discuss the heterogeneous network architecture of VoxelNet, and its bird's-eye-view estimation. The details of VoxelNet can be found in [7]. Please refer to the APPENDIX for our implementation of VoxelNet.

A. Heterogeneous Network Architecture

The heterogeneous architecture overview of VoxelNet is shown in Figure 3. It consists of three main parts: FeatureNet, MiddleLayer and RPN. FeatureNet extracts features directly from sparse point lists. It adopts Voxel Feature Encoding Layers (VFELayers) [7] to extract both point-wise and voxel-wise features directly from points, where fully connected layers are used to extract point-wise features and a symmetric function is used to aggregate local features from all points within a local voxel. Compared to sub-optimally deriving hand-crafted features from voxels, VFELayers can learn representations that minimize the loss function. The derived voxel-wise representations from VFELayers are sparse, which saves memory and time in the computation. In contrast, if a point cloud of the KITTI dataset is partitioned into a [10, 400, 352] dense grid for vehicle detection, only around 5300 voxels (about 0.3%) are non-empty. However, the sparse representation is currently unfriendly to convolutional operations.
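The point-wise and voxel-wise feature extraction of a VFELayer can be sketched as follows, a minimal NumPy version for a single voxel (function and argument names are ours; batch normalization is omitted):

```python
import numpy as np

def vfe_layer(points, W, b):
    """One VFELayer on the T points of a single voxel.
    points: (T, m) array; W: (m, c) and b: (c,) are the shared
    fully connected parameters. Returns (T, 2c): each point-wise
    feature concatenated with the voxel-wise feature obtained by
    element-wise max-pooling over the voxel's points."""
    h = np.maximum(points @ W + b, 0.0)   # shared FC + ReLU, point-wise
    g = h.max(axis=0, keepdims=True)      # symmetric aggregation (max)
    return np.concatenate([h, np.broadcast_to(g, h.shape)], axis=1)
```

Because the max over points is permutation-invariant, the voxel-wise half of the output does not depend on point ordering, which is what lets FeatureNet consume raw sparse point lists.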
In order to implement convolution, VoxelNet compromises on efficiency and converts the sparse representation to a dense representation at the end of FeatureNet: each sparse voxel-wise representation is copied to its specific entry in the dense grid.

MiddleLayer consumes the 3D dense grid and converts it to a 2D bird's-eye-view form, so that further processing can be done in 2D space. The role of MiddleLayer is to learn features from all voxels in the same bird's-eye-view location. Therefore, the 3D convolutional kernel is of size [d,1,1], if we denote the dense grid in the order of z,x,y. The 3D kernel of size [d,1,1] helps aggregate voxel-wise features within a progressively expanding receptive field along the z-axis and keeps the shape in the x,y dimensions.

RPN predicts the probability and regression maps from the 2D bird's-eye-view feature map. Since the increased invariance and large receptive fields of top-level nodes would yield smooth responses and cause inaccurate localization, it does not utilize max-pooling but adopts skip-layers [10] to combine high-level semantic features and low-level spatial features.

B. Estimation in Bird's-eye-view Form

The final probability and regression estimation maps are both in bird's-eye-view form, which is similar to the final estimation of image-based detection methods. This saves both memory and time in the calculation compared to 3D maps, but only one object per location can be estimated in the bird's-eye view.

Fig. 3. VoxelNet heterogeneous architecture [7]. It consists of three main parts: FeatureNet (point-wise and voxel-wise feature transformation, from a K × T × m sparse point list to a K × n sparse feature, followed by a Sparse2Dense conversion), MiddleLayer (3D dense convolution) and RPN (2D dense convolution). The probability and regression maps (P-Map [200,176,2] and R-Map [200,176,14]) are in bird's-eye-view form.

This is acceptable in autonomous driving scenes but will meet problems in indoor scenes, where objects can be stacked up (e.g., a mug on a stack of books). MiddleLayer saves calculation for further processing by aggregating the 3D dense grid into a 2D bird's-eye-view feature map. Otherwise, thoroughly 3D dense convolution in such a deep network (22 convolution layers) would bring exponentially more parameters and calculation. We note that MiddleLayer is still a bottleneck of the whole network, as shown in Table VII, because of its 3D dense convolution operations. An efficient sparse convolutional implementation is still an open problem and deserves effort to solve.

C. Loss Function

We adopt the loss function form from the original VoxelNet [7], which is the same as the right half of Equations 7 to 11. In Section VI, we use the SmoothL1Norm [16] for ε_reg as in the original paper [7] and use the focal loss for ε_cls.

VI. EXPERIMENTS

In this section, we intend to answer two questions: 1) Can focal loss help improve accuracy in the 3D object detection task? 2) Does focal loss have an effect in 3D object detection equal to its effect in image-based detection? To answer the former question, we conduct experiments to compare the performance of 3D-FCN and VoxelNet trained with BCE loss and focal loss on the challenging KITTI benchmark [9]. To answer the second question, we analyze the cumulative distribution curves of 3D-FCN and VoxelNet following a similar method to that in [13].
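The cumulative-distribution analysis just mentioned can be sketched as follows, a minimal NumPy version (the function name is ours): per-sample losses are sorted from low to high, normalized so that they sum to one, and accumulated.

```python
import numpy as np

def loss_cdf(sample_losses):
    """Cumulative distribution of normalized per-sample losses,
    sorted from low to high (the form of the curves in Fig. 4)."""
    s = np.sort(np.asarray(sample_losses, dtype=float))
    return np.cumsum(s / s.sum())   # normalize so losses sum to one
```

Reading such a curve at, e.g., the 85th sample percentile shows what fraction of the total loss the 15% hardest samples account for.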
The code and weights for our experiments are available at .

A. BCE Loss vs. Focal Loss

The KITTI 3D object detection dataset [9] contains 3D annotations for cars, pedestrians and cyclists in urban driving scenarios. The sensor setup mainly consists of a wide-angle camera and a Velodyne LiDAR (HDL-64E), both of which are well-calibrated. The training dataset contains 7481 frames, including both raw sensor data and annotations. The KITTI 3D detection dataset contains some bad annotations, which are empty bounding boxes containing few points. In order to avoid overfitting to those bad annotations, we remove all bounding boxes containing few points (fewer than 10). Following [5], we split the dataset into training and validation sets, each containing around half of the entire set. For simplicity, we conduct experiments only on the car class to show the focal loss improvement. We do so because both 3D-FCN and VoxelNet are trained class-specifically and extending them to other classes is only a matter of tuning; also, the focal loss in the form of Equation 6 is agnostic to the class of objects.

We set α = 1, β = 5, η = 10 in 3D-FCN and α = 1, β = 10, η = 0.5 in VoxelNet so that ε_pos and ε_neg, as well as ε_P and ε_R, are of the same orders of magnitude. As claimed in [2], training a network from scratch with the focal loss is unstable in the beginning. Therefore, we first train the network (both 3D-FCN and VoxelNet) for 30 epochs with the BCE loss and the learning rate lr, and then for another 30 epochs with the focal loss and a discounted learning rate 0.1lr. The minimum overlap thresholds are 0.7, 0.5, 0.5 for 2D evaluation on the image plane, evaluation on the ground plane, and 3D evaluation. The network details of both 3D-FCN and VoxelNet are shown in Table VI and Table VII in the APPENDIX. Non-maximum suppression with the threshold 0.8 is used at the end of 3D-FCN and VoxelNet for estimation refinement. In order to control the single variable γ, we firstly make comparisons among last models, which are trained with the same number of steps.
Additionally, we also make comparisons among best models to make the conclusion more concrete. The best models are selected according to the mean value among easy, moderate and hard 3D detection APs (3D detection mAP). We compare the results of the last models in Table II and Table III, where the rows with γ = 0 and γ > 0 represent the results from the BCE loss and the focal loss respectively. Bolded numbers are the results in which the focal loss cases outperform the BCE loss case. In general, VoxelNet outperforms 3D-FCN in accuracy, since the input of VoxelNet retains the original point clouds, while 3D-FCN suffers from information loss when voxelizing the point clouds into binary representations. Additionally, VoxelNet benefits from its deeper network structure, which is able to extract more useful high-level features. In 3D-FCN, the focal loss helps improve accuracy in all metrics in a wide range of hyperparameters (0 < γ ≤ 2.0), providing gains from 0.3 AP to 11.2 AP. In VoxelNet, the cases with γ = 0.1, 0.5, 1 show gains from the focal loss in all metrics, ranging from 0.6 AP to 9.1 AP. Both gains and losses happen when γ is 0.2 or 2. However, gains (up to 9.1 AP) are generally much greater than losses (at most 2.7 AP). The training processes include some randomness due to sample shuffling and the sophisticated gradient descent training scheme. We further evaluate all intermediate weights and select the best models

Fig. 4. Cumulative distributions of 3D-FCN and VoxelNet for different values of γ. In 3D-FCN (a, b), as γ increases, the loss of both foreground and background samples concentrates on the harder partitions; the effect on the background is stronger. In VoxelNet (c, d), the effect of the focal loss increases as γ increases, but the effect on the foreground is stronger than on the background. Note that the VoxelNet background cumulative distribution (d) is in the range of [0.998, 1].

TABLE II
EVALUATION RESULTS ON THE KITTI VALIDATION DATASET FOR THE LAST MODELS OF 3D-FCN
(columns: γ; Bird's Eye View AP (%) and 3D Detection AP (%), each for Easy/Mod/Hard)

to make the comparison in Table IV. It shows that the focal loss helps improve accuracy in all metrics with a proper γ. The performance losses of γ = 0.2 in Table III might be caused by training randomness and model degradation from redundant training. Table II, Table III and Table IV show that the focal loss in 3D object detection provides better than or comparable results to the BCE loss. Therefore, the focal loss works in 3D object detection and helps improve accuracy in a wide range of γ (normally γ ≤ 2).

B. Analysis of Focal Loss in 3D Detectors

We analyze the empirical cumulative distributions of the loss from the converged 3D-FCN and VoxelNet models as in [2]. We apply the two converged models trained with the focal loss

TABLE III
EVALUATION RESULTS ON THE KITTI VALIDATION DATASET FOR THE LAST MODELS OF VOXELNET
(columns: γ; Bird's Eye View AP (%) and 3D Detection AP (%), each for Easy/Mod/Hard)

TABLE IV
EVALUATION RESULTS ON THE KITTI VALIDATION DATASET FOR THE BEST MODELS
(columns: Detector, γ, lr, Step; Bird's Eye View AP (%) and 3D Detection AP (%), each for Easy/Mod/Hard)

3D-FCN     0     1e-2   126k
3D-FCN     2     1e-2   137k
VoxelNet   0     1e-4   134k
VoxelNet   0.2   1e-4   215k

Note that all cases in Table IV are the evaluation results of the best models selected among all intermediate weights. Thus the accuracy improvement is from the focal loss instead of longer training steps.
(row 2 and row 4 in Table IV) on the validation dataset and sample the predicted probabilities for 10^7 negative windows and

Fig. 5. Posterior probability histograms of 3D-FCN (a) and VoxelNet (b). As γ increases, the peak decreases and moves towards lower values in both 3D-FCN and VoxelNet.

10^5 positive windows. Then, we calculate the focal loss with these probability data. The calculated focal loss is normalized such that it sums to one and is sorted from low to high. We plot the cumulative distributions for 3D-FCN and VoxelNet for different γ in Figure 4.

In 3D-FCN, approximately 15% of the hardest positive samples account for roughly half of the positive loss. As γ increases, more of the loss gets concentrated in the top 15% of examples. However, compared to the effect of the focal loss on negative samples, its effect on the positive samples is minor. For γ = 0, the positive and negative CDFs are quite similar. As γ increases, more weight becomes concentrated on the hard negative examples. With γ = 2 (the best result for 3D-FCN), the vast majority of the loss comes from a small fraction of samples. As claimed in [2], the focal loss can effectively discount the effect of easy negatives, so that the network focuses on learning the hard negative examples.

In VoxelNet, the condition is different. From (c) and (d) in Figure 4, we can see that the effect of the focal loss increases in both the positive and negative samples as γ increases. However, the cumulative distribution functions for the negative samples are quite similar among different values of γ, even though we adjust the x-axis to [0.998, 1]. This shows that VoxelNet trained with the BCE loss is already able to handle hard negative samples. Compared with the results on the negative samples, the effects of focal loss on the positive samples are stronger. Therefore, the accuracy gains of the focal loss in VoxelNet come mainly from the hard positive samples. From the analysis of cumulative distributions, we believe that the focal loss in 3D object detection helps networks alleviate hard sample gradient salience in the training process.

C. Focal Loss Decreases the Posterior Probabilities

When undertaking the experiments, we found that networks trained with the focal loss should be set with a lower threshold for non-maximum suppression. This inspired us to explore the influence of the focal loss on the output posterior probabilities. We take the models in Table II and Table III, and evaluate them on the validation set. We record all the evaluation results and plot the probability histogram for positive bounding boxes. The results are shown in Figure 5. As γ increases, the peak decreases and moves towards lower values. This demonstrates that networks trained with the focal loss output positive estimations with lower posterior probabilities. A probable explanation is that objects with high posterior probabilities are easily classified, and the loss they contribute is down-weighted in the training process due to the focal loss. In other words, they will be relatively ignored in the training process if they are estimated with high posterior probabilities, so that their posterior probabilities cannot be further improved. However, they can still be accurately classified if we decrease the non-maximum suppression threshold in the final output step.

VII. CONCLUSION

In this paper, we extended the focal loss of image detectors to 3D object detection to solve the fore-background imbalance. We conducted experiments on two different types of 3D object detectors to demonstrate the performance of the focal loss in point-cloud-based object detection. The experimental results show that the focal loss helps improve accuracy in 3D object detection, and that it protects the network from fore-background imbalance and alleviates hard sample gradient salience for both positive and negative anchors in the training process. The posterior probability histograms show that networks trained with the focal loss output positive estimations with lower posterior probabilities.

REFERENCES

[1] Z. Wang, Y. Liu, Q. Liao, H. Ye, M. Liu, and L. Wang, "Characterization of a RS-LiDAR for 3D perception," in IEEE International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), July.
[2] T. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1-1.
[3] C. R. Qi, W. Liu, C. Wu, H. Su, and L. J. Guibas, "Frustum pointnets for 3d object detection from rgb-d data," in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2018.
[4] J. Ku, M. Mozifian, J. Lee, A. Harakeh, and S. L. Waslander, "Joint 3d proposal generation and object detection from view aggregation," in 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Oct 2018, pp. 1-8.

[5] X. Chen, H. Ma, J. Wan, B. Li, and T. Xia, "Multi-view 3D object detection network for autonomous driving," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017, pp. [6] B. Li, "3D fully convolutional network for vehicle detection in point cloud," in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sep. 2017, pp. [7] Y. Zhou and O. Tuzel, "VoxelNet: End-to-end learning for point cloud based 3D object detection," in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2018, pp. [8] R. Q. Charles, H. Su, M. Kaichun, and L. J. Guibas, "PointNet: Deep learning on point sets for 3D classification and segmentation," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017, pp. [9] A. Geiger, P. Lenz, and R. Urtasun, "Are we ready for autonomous driving? The KITTI vision benchmark suite," in 2012 IEEE Conference on Computer Vision and Pattern Recognition, June 2012, pp. [10] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015, pp. [11] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," in 2014 IEEE Conference on Computer Vision and Pattern Recognition, June 2014, pp. [12] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," in 2017 IEEE International Conference on Computer Vision (ICCV), Oct. 2017, pp. [13] T. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, "Feature pyramid networks for object detection," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017, pp. [14] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016, pp. [15] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single shot multibox detector," in Computer Vision - ECCV 2016, B. Leibe, J. Matas, N. Sebe, and M. Welling, Eds. Cham: Springer International Publishing, 2016, pp. [16] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp., June.

APPENDIX

A. Improvement of the Enhanced Loss Function for 3D-FCN

We demonstrate the improvement that adopting the loss function from VoxelNet [7] (normalization, new hyperparameters, BCE loss) provides over the original loss function [6] for 3D-FCN. We set α = 1, β = 5, η = 10 in the enhanced 3D-FCN so that ε_pos and ε_neg, as well as ε_P and ε_R, are of the same order of magnitude. We set η = 0.1 in the original 3D-FCN so that ε_P and ε_R are of the same order of magnitude. γ is set to 0, so the BCE loss is used. We train both cases from scratch for 30 epochs. The threshold for non-maximum suppression is set as. η is 100 times larger in the enhanced 3D-FCN because the enhanced loss is normalized and N_neg is much greater than N_pos. We compare the final models in Table V, which shows the improvement of the enhanced loss function.

B. Our 3D-FCN Implementation Details

The network details of 3D-FCN are shown in Table VI. Each block in the BodyNet includes a 3D convolution layer, a ReLU layer and a batch normalization layer, applied sequentially. In the HeadNet, each block is an individual 3D convolution layer. In the training phase, we create the ground truth for the P-Map by setting the object-voxel which contains an object center to 1. For the regression map, we create the ground truth by assigning each object-voxel a 24-length residual vector, which holds the coordinates of the eight points of the bounding box in a fixed order. The result of the 3D-FCN baseline implemented by us is shown in the first row of Table IV.

C. Our VoxelNet Implementation Details

The network details of VoxelNet are shown in Table VII. The FC block in VoxelNet consists of a fully connected layer, a batch normalization layer and a ReLU layer, applied sequentially.
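As a rough illustration of the FC block just described (fully connected layer, then batch normalization, then ReLU), the following NumPy sketch computes one forward pass; the function name, parameter names and shapes are ours for illustration, not taken from the VoxelNet code:

```python
import numpy as np

def fc_bn_relu(x, W, b, gamma, beta, eps=1e-5):
    """One FC block in the VoxelNet style: linear -> batch norm -> ReLU.

    x: (N, d_in) batch of features; W: (d_in, d_out); b, gamma, beta: (d_out,).
    Batch statistics are computed over the batch axis, as in training mode.
    """
    h = x @ W + b                      # fully connected layer
    mu = h.mean(axis=0)                # per-feature batch mean
    var = h.var(axis=0)                # per-feature batch variance
    h_norm = (h - mu) / np.sqrt(var + eps)
    h_bn = gamma * h_norm + beta       # learnable scale and shift
    return np.maximum(h_bn, 0.0)       # ReLU
```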
Each block in the MiddleLayer includes a 3D convolution layer, a ReLU layer and a batch normalization layer. The block in the RPN consists of a 2D convolution layer, a ReLU layer and a batch normalization layer. The P-Map and R-Map modules are each an individual 2D convolution layer. We adopt the original parameterization method and residual vector for regression from VoxelNet [7]. The result of our VoxelNet baseline is shown in the third row of Table IV.

TABLE V
THE IMPROVEMENT OF THE ENHANCED LOSS FUNCTION FOR 3D-FCN

Detector  | Bird's Eye View AP (%)  | 3D Detection AP (%)
          | Easy   Mod    Hard      | Easy   Mod    Hard
Original  |
Enhanced  |

TABLE VI
OUR IMPLEMENTATION DETAILS OF 3D-FCN

Block Name  | Layer Name  | Kernel Size | Strides  | Filter | GFLOPs
Body        | conv3d 1    | [5,5,5]     | [2,2,2]  |        |
Body        | conv3d 2    | [5,5,5]     | [2,2,2]  |        |
Body        | conv3d 3    | [3,3,3]     | [2,2,2]  |        |
Body        | conv3d 4    | [3,3,3]     | [1,1,1]  |        |
Head-PMap   | conv3d obj  | [3,3,3]     | [1,1,1]  |        |
Head-RMap   | conv3d cor  | [3,3,3]     | [1,1,1]  |        |

TABLE VII
OUR IMPLEMENTATION DETAILS OF VOXELNET

Block Name  | Layer Name | Kernel Size / Output Unit | Strides  | Filter | GFLOPs
FeatureNet  | vfe        | 32                        | N/A      | N/A    | <0.1
FeatureNet  | vfe        | 128                       | N/A      | N/A    | <0.1
FeatureNet  | fc         | 128                       | N/A      | N/A    | <0.1
MiddleLayer | conv3d     | [3,3,3]                   | [2,1,1]  |        |
MiddleLayer | conv3d     | [3,3,3]                   | [1,1,1]  |        |
MiddleLayer | conv3d     | [3,3,3]                   | [2,1,1]  |        |
RPN         | reshape    | N/A                       | N/A      | N/A    | /
RPN         | conv2d     | [3,3]                     | [2,2]    |        |
RPN         | conv2d 3   | [3,3]                     | [1,1]    |        |
RPN         | deconv     | [3,3]                     | [1,1]    |        |
RPN         | conv2d     | [3,3]                     | [2,2]    |        |
RPN         | conv2d 5   | [3,3]                     | [1,1]    |        |
RPN         | deconv     | [2,2]                     | [2,2]    |        |
RPN         | conv2d     | [3,3]                     | [2,2]    |        |
RPN         | conv2d 5   | [3,3]                     | [1,1]    |        |
RPN         | deconv     | [4,4]                     | [4,4]    |        |
Prob-Map    | conv2d     | [1,1]                     | [1,1]    |        |
Reg-Map     | conv2d     | [1,1]                     | [1,1]    |        |
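The role of γ discussed in Appendix A, where setting γ = 0 recovers the BCE loss, can be sketched as a generic binary focal loss in the style of Lin et al. [2]; the function name and the α, γ defaults below are illustrative and not necessarily the configuration used in our experiments:

```python
import numpy as np

def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss as a modulated binary cross entropy.

    p: predicted foreground probability in (0, 1); y: label in {0, 1}.
    With gamma = 0 this reduces to alpha-weighted BCE; larger gamma
    down-weights well-classified examples, which is the hard-mining effect.
    """
    p = np.clip(p, eps, 1.0 - eps)             # avoid log(0)
    p_t = np.where(y == 1, p, 1.0 - p)         # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)
```

Because (1 - p_t)^γ shrinks toward 0 for confident predictions, the abundant, easily classified background voxels contribute little to the total loss, which is why the focal loss helps with the fore-background imbalance described in the paper.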


More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

Accounting for the Use of Different Length Scale Factors in x, y and z Directions

Accounting for the Use of Different Length Scale Factors in x, y and z Directions 1 Accountng for the Use of Dfferent Length Scale Factors n x, y and z Drectons Taha Soch (taha.soch@kcl.ac.uk) Imagng Scences & Bomedcal Engneerng, Kng s College London, The Rayne Insttute, St Thomas Hosptal,

More information

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and

More information

A high precision collaborative vision measurement of gear chamfering profile

A high precision collaborative vision measurement of gear chamfering profile Internatonal Conference on Advances n Mechancal Engneerng and Industral Informatcs (AMEII 05) A hgh precson collaboratve vson measurement of gear chamferng profle Conglng Zhou, a, Zengpu Xu, b, Chunmng

More information

ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECT- ORIENTED CLASSIFICATION

ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECT- ORIENTED CLASSIFICATION ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECT- ORIENTED CLASSIFICATION Lng Dng 1, Hongy L 2, *, Changmao Hu 2, We Zhang 2, Shumn Wang 1 1 Insttute of Earthquake Forecastng, Chna Earthquake

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

High resolution 3D Tau-p transform by matching pursuit Weiping Cao* and Warren S. Ross, Shearwater GeoServices

High resolution 3D Tau-p transform by matching pursuit Weiping Cao* and Warren S. Ross, Shearwater GeoServices Hgh resoluton 3D Tau-p transform by matchng pursut Wepng Cao* and Warren S. Ross, Shearwater GeoServces Summary The 3D Tau-p transform s of vtal sgnfcance for processng sesmc data acqured wth modern wde

More information

WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING.

WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING. WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING Tao Ma 1, Yuexan Zou 1 *, Zhqang Xang 1, Le L 1 and Y L 1 ADSPLAB/ELIP, School of ECE, Pekng Unversty, Shenzhen 518055, Chna

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Novel Fuzzy logic Based Edge Detection Technique

Novel Fuzzy logic Based Edge Detection Technique Novel Fuzzy logc Based Edge Detecton Technque Aborsade, D.O Department of Electroncs Engneerng, adoke Akntola Unversty of Tech., Ogbomoso. Oyo-state. doaborsade@yahoo.com Abstract Ths paper s based on

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

Recommended Items Rating Prediction based on RBF Neural Network Optimized by PSO Algorithm

Recommended Items Rating Prediction based on RBF Neural Network Optimized by PSO Algorithm Recommended Items Ratng Predcton based on RBF Neural Network Optmzed by PSO Algorthm Chengfang Tan, Cayn Wang, Yuln L and Xx Q Abstract In order to mtgate the data sparsty and cold-start problems of recommendaton

More information

Large-scale Web Video Event Classification by use of Fisher Vectors

Large-scale Web Video Event Classification by use of Fisher Vectors Large-scale Web Vdeo Event Classfcaton by use of Fsher Vectors Chen Sun and Ram Nevata Unversty of Southern Calforna, Insttute for Robotcs and Intellgent Systems Los Angeles, CA 90089, USA {chensun nevata}@usc.org

More information

Face Recognition University at Buffalo CSE666 Lecture Slides Resources:

Face Recognition University at Buffalo CSE666 Lecture Slides Resources: Face Recognton Unversty at Buffalo CSE666 Lecture Sldes Resources: http://www.face-rec.org/algorthms/ Overvew of face recognton algorthms Correlaton - Pxel based correspondence between two face mages Structural

More information

An Improved Image Segmentation Algorithm Based on the Otsu Method

An Improved Image Segmentation Algorithm Based on the Otsu Method 3th ACIS Internatonal Conference on Software Engneerng, Artfcal Intellgence, Networkng arallel/dstrbuted Computng An Improved Image Segmentaton Algorthm Based on the Otsu Method Mengxng Huang, enjao Yu,

More information

A Computer Vision System for Automated Container Code Recognition

A Computer Vision System for Automated Container Code Recognition A Computer Vson System for Automated Contaner Code Recognton Hsn-Chen Chen, Chh-Ka Chen, Fu-Yu Hsu, Yu-San Ln, Yu-Te Wu, Yung-Nen Sun * Abstract Contaner code examnaton s an essental step n the contaner

More information