EasyChair Preprint. Real Time Object Detection And Tracking


EasyChair Preprint 223

Real Time Object Detection And Tracking

Dária Baikova, Rui Maia, Pedro Santos, João Ferreira and João Oliveira

EasyChair preprints are intended for rapid dissemination of research results and are integrated with the rest of EasyChair.

June 2, 2018

Real Time Object Detection And Tracking

Dária Baikova 1, Rui Maia 1, Pedro Santos 1, João Ferreira 2, and João Oliveira 2

1 INOV, Lisbon, Portugal
2 Instituto Universitário de Lisboa (ISCTE-IUL), Information Sciences, Technologies and Architecture Research Center (ISTAR-IUL), Portugal

{daria.baikova, rui.maia, pedro.santos}@inov.pt, {jcafa, joao.p.oliveira}@iscte-iul.pt

Abstract. The present work proposes a real-time multi-object detection and tracking system to be deployed in commercial areas. The purpose is to gather and make sense of customer behavior data extracted from surveillance footage (available from ceiling cameras) in order to supply retailers with a set of analytics, management and planning tools that help them perform tasks such as planning demand and supply chains and organizing product placement on shelves. To achieve this goal, deep learning techniques are used, which have been yielding outstanding results in computer vision problems in recent years.

Keywords: Deep Learning, Computer Vision, Object Detection, Object Tracking

1 Introduction

Today, big grocery retail chains account for the great majority of grocery shopping worldwide, generating about $4 trillion annually, which makes this industry one of the biggest in the world. To stay competitive, retail chains such as WalMart (U.S.), Carrefour (France) and TESCO (United Kingdom) adopt a wide range of strategies to collect information about their customers, build profiles and analyze shopping habits, such as targeting clients with personalized vouchers or using loyalty cards. However, these strategies don't provide the retailer with sufficient information to understand the shopping patterns of its customers, which leads to inefficiencies felt throughout the current model adopted by most retailers. These inefficiencies affect tasks such as planning demand and supply chains, placing products on shelves, monitoring customer satisfaction and overseeing the payment process at the supermarket checkouts.

To oversee and fix these problems in real time, retailers need access to information that is currently impossible to obtain, for instance: knowing how many customers are inside the store, determining the distribution of customers inside the store, keeping track of the path taken by each customer throughout the store, or learning the customers' shopping patterns under different conditions.

The present work is focused on: 1) studying and implementing computer vision algorithms based on Deep Learning, which are the state of the art for computer vision tasks; and 2) applying these techniques to a domain-specific application (retail), focused on obtaining the information retailers need.

2 Related Work

The main purpose of object tracking is to detect multiple objects in a video frame and maintain their identities in the subsequent frames (over time) in order to identify the trajectory of each object. Classical approaches to object tracking include Multiple Hypothesis Tracking (MHT) [20], recursive methods which make use of Kalman filtering to predict locations [5], Joint Probabilistic Data Association (JPDA) [15] and particle filtering techniques [11]. However, with the recent success of Deep Learning techniques in tasks such as image classification [14,16,22,24,12] and object detection [10,9,19,17,21], and with more access to data, deep learning models have also started being applied to object tracking. In particular, tracking by detection has become a popular paradigm for solving the tracking problem [2,7]. This type of framework consists of two steps: applying a detector and, subsequently, a post-processing step which involves a tracker to propagate detection scores over time. The main challenge is grouping the detected targets in contiguous frames in order to represent the targets by their trajectory over all frames.

The detection step can be solved with a high-performing deep learning detection model [10,9,19,17,21], which in turn may use a deep learning classification model as its base, such as the ones in [24,12,24,23]. The detection association step has many solutions: some approaches associate detections with tracks by calculating the Intersection over Union (IOU) of the bounding boxes in consecutive frames [6]; other approaches leverage Kalman filters and the Hungarian algorithm, such as the SORT method (Simple Online and Realtime Tracking) [4]. Other approaches skip tracking by detection and use fully Deep Learning based methods, for instance, methods based on Recurrent Neural Networks (RNNs) [18,8] or siamese convolutional network architectures [3].
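As an illustration of the IOU-based association mentioned above, the following is a minimal sketch, not the method of [6] or of the present work. It assumes boxes in (x1, y1, x2, y2) pixel coordinates; the greedy matching strategy, the function names and the 0.5 threshold are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 < x2 and y1 < y2


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks: Dict[int, Box], detections: List[Box],
              min_iou: float = 0.5) -> Dict[int, int]:
    """Greedily match each track to its best unused detection.

    Returns {track_id: detection_index}; tracks without a match are omitted.
    The threshold min_iou is an illustrative assumption.
    """
    matches: Dict[int, int] = {}
    used = set()
    for track_id, box in tracks.items():
        best_idx, best_score = -1, min_iou
        for idx, det in enumerate(detections):
            if idx in used:
                continue
            score = iou(box, det)
            if score >= best_score:
                best_idx, best_score = idx, score
        if best_idx >= 0:
            matches[track_id] = best_idx
            used.add(best_idx)
    return matches
```

A greedy pass is the simplest option; methods such as SORT [4] instead solve the assignment jointly with the Hungarian algorithm, which avoids an early track claiming a detection that fits a later track better.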
The siamese CNN (Convolutional Neural Network) proposed in [13] is the base for the present work.

3 Proposed Architecture

At a high level, the project consists of three modules, represented in Fig. 1: a frame reader module, responsible for reading frames from a real-time RTSP stream and writing them to a queue shared between threads, which also serves as a buffer; a detection module, responsible for outputting object bounding boxes; and a tracking module, which uses the previous detection outputs to compute the final verdicts about the tracks of each object. Both the detection and tracking modules use recent deep learning techniques: the detection module was tested with different CNN detection architectures, and the tracking module uses the architecture introduced in [13], codenamed GOTURN (Generic Object Tracking Using Regression Networks). The following sections describe these modules.

Fig. 1: Architecture overview.

3.1 Object Detection Module

The tasks in this module consist of reading frames from the queue, applying a multi-object detector to these frames and finally sending the detections to the tracking module. The results sent to the tracker include the bounding boxes found and their detection scores (the extent to which the model is certain of each detection).

The first problem is to select the detection model, which involves a trade-off between accuracy and speed (inference time). To find the best model for this application, two popular types of CNN detection architecture were tested: Faster R-CNN and SSD models.

The R-CNN family of models [10,9,21] has demonstrated excellent performance in recent years. These models tackle object detection as a three-step problem: first, category-independent region proposals are generated as detection candidates (by an external algorithm such as [25] in the original R-CNN, or by a learned region proposal network in Faster R-CNN [21], the most advanced of these models); the candidates are then fed to a CNN that serves as a feature extractor; and lastly, a set of class-specific classifiers is applied on top of the features to reach the final verdict about the classes of the subjects present in the image. However, due to the region proposal step, the R-CNN models have a slower inference time than single-shot CNN alternatives such as SSD [17] architectures. Speed is a significant factor in the present work due to the real-time requirement of the project.

The key idea of an SSD architecture is that each of the last few layers of the network is responsible for predicting bounding boxes at a different scale, and the final prediction is the union of all these predictions. In this way, these models are able to predict the bounding box and the object class simultaneously in one pass, eliminating the region proposal step altogether and achieving an improvement in speed. In contrast to the 7 frames per second (FPS) and 73.2% mAP that Faster R-CNN achieves on the VOC2007 test set, SSD operates at 59 FPS with 74.3% mAP on the same dataset.
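To make the speed side of this trade-off concrete, a frame rate can be converted into the time available to process each frame. The short sketch below is only illustrative arithmetic; the function names are hypothetical and not part of the described system.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def keeps_up(inference_time_s: float, stream_fps: float) -> bool:
    """True if one inference fits within the per-frame budget of the stream."""
    return inference_time_s * 1000.0 <= frame_budget_ms(stream_fps)
```

For the figures quoted above, 7 FPS corresponds to roughly 143 ms per frame and 59 FPS to roughly 17 ms per frame, so a detector only keeps pace with a stream if its inference time stays below the budget implied by the stream's frame rate.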
The object detection models tested for this module are: a MobileNet SSD, an Inception-v2 SSD and a ResNet101 Faster R-CNN.

3.2 Tracking Module

The tracking module leverages the single-object tracking architecture proposed in [13], codenamed GOTURN (Generic Object Tracking Using Regression Networks). This architecture (Fig. 2) has the structure of a Siamese CNN and is able to perform generic object tracking (i.e., it is not trained for a specific class of objects) at very high speeds (100 FPS), making it appropriate for real-time applications. The network is trained entirely offline and learns the generic relationship between appearance and motion in order to be able to recognize novel objects online.

In order to learn this relationship, at each training iteration the network takes two inputs: a crop of a frame at time t - 1 and a crop of the next frame at time t. The crop of the frame at time t - 1 contains the object the network will be trying to track, i.e., the target. The second input is a crop of the frame at time t, which represents the area in which the network will search for the target object (the search area). It is centered on the same point as the previous crop but is scaled by a factor of k. If the objects don't move too quickly, this scaling ensures that the target object will still be present in the search area. The network directly regresses the bounding box coordinates of the tracked object in the search area of the next frame.

Fig. 2: GOTURN architecture.

Since the GOTURN architecture is only able to track one object at a time, in order to perform multi-object tracking in real time the tracker runs multiple times per frame (once for each object, starting when the object is first detected entering the field of view). Furthermore, additional rules are needed to initialize, maintain and end a track. These rules are described below.

Tracker initialization, maintenance and ending

The tracker is initialized when it receives the first detection from the detection module. This detection constitutes the first input fed to the tracker (the target) and defines what object the tracker will be looking for during its runtime. At each iteration, the tracker uses as input the previously predicted frame crop (at time t - 1) and the current frame's search area (the scaled area of the frame at time t, centered on the center of the target in the previous frame).
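The construction of the search area described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the GOTURN implementation: boxes are (x1, y1, x2, y2) in pixels, the search area is clipped to the frame borders (a simplification chosen here), and the function names are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def search_area(prev_box: Box, k: float, frame_w: int, frame_h: int) -> Box:
    """Region centered on the previous box and k times its size, clipped to the frame."""
    x1, y1, x2, y2 = prev_box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w, half_h = k * (x2 - x1) / 2.0, k * (y2 - y1) / 2.0
    return (max(0.0, cx - half_w), max(0.0, cy - half_h),
            min(float(frame_w), cx + half_w), min(float(frame_h), cy + half_h))


def to_frame_coords(pred_in_crop: Box, crop: Box) -> Box:
    """Map a box predicted relative to the crop origin back to frame coordinates."""
    ox, oy = crop[0], crop[1]
    px1, py1, px2, py2 = pred_in_crop
    return (px1 + ox, py1 + oy, px2 + ox, py2 + oy)
```

Because the network regresses coordinates inside the search area, each prediction has to be shifted back by the crop origin before it can serve as the target box for the next frame.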
In order to make sure the tracker does not lose its target, every N frames the detector re-runs on the current frame and its result is compared with the tracker's prediction via the IOU (Intersection Over Union) measure. The tracker's bounding box prediction is then replaced by the bounding box predicted by the detector, since the detection bounding boxes tend to be more accurate. When an object leaves the camera's field of view, its track should be removed. To achieve this, a parameter M is used: the track of an object is considered finished when no detection of that object has occurred in M frames.
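The initialization, maintenance and ending rules above can be summarized in a small bookkeeping sketch. It is a hedged illustration rather than the authors' code: the class and method names are hypothetical, deciding which detection belongs to which track (for example via an IOU threshold) is left to the caller, and misses are counted per detector run, which is one possible reading of "M frames".

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Track:
    track_id: int
    box: Box         # latest bounding box (from the tracker or the detector)
    missed: int = 0  # consecutive detector runs without a matching detection


class TrackManager:
    def __init__(self, redetect_every: int, max_missed: int) -> None:
        if redetect_every < 1 or max_missed < 1:
            raise ValueError("N and M must both be at least 1")
        self.redetect_every = redetect_every  # N in the text
        self.max_missed = max_missed          # M in the text
        self.tracks: Dict[int, Track] = {}
        self._next_id = 1

    def start(self, detection: Box) -> int:
        """Initialize a track from the first detection of an object."""
        track_id = self._next_id
        self._next_id += 1
        self.tracks[track_id] = Track(track_id, detection)
        return track_id

    def should_redetect(self, frame_index: int) -> bool:
        """The detector re-runs every N frames."""
        return frame_index % self.redetect_every == 0

    def update_from_tracker(self, track_id: int, predicted: Box) -> None:
        """Store the tracker's prediction for the current frame."""
        self.tracks[track_id].box = predicted

    def update_from_detector(self, track_id: int, detection: Optional[Box]) -> None:
        """Replace the box with the detector's box, or count a miss and end the track."""
        track = self.tracks[track_id]
        if detection is not None:
            track.box = detection
            track.missed = 0
            return
        track.missed += 1
        if track.missed >= self.max_missed:
            del self.tracks[track_id]
```

Keeping the miss counter on the track makes the effect of M easy to see: a small M ends tracks quickly when an object leaves, while a large M keeps a stale track alive longer.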

4 Dataset and Network Training

In order to create a new training dataset and to simulate a commercial area, a controlled test environment was set up at the INOV-INESC building entrance area, where hundreds of different people pass through daily. The setup consists of one top camera which captures one frame per second in the period from 07:00 to 20:00. These frames are afterwards stored in a remotely accessible SMB storage system for future use. The live video feed can be accessed in real time via the RTSP protocol.

The training data for the detection models consists of images annotated with object bounding boxes. The tracking dataset has a similar structure, with the addition of the object IDs in each frame. Since the tracking module takes a pair of frames as input, an additional pre-processing step was needed to create pairs of target/search area images. The initial labels were created by hand and, afterwards, an ensemble learning technique was used to generate new data: three different detectors were trained on the initial dataset and, when shown a new frame, each of the models voted on where it thought a bounding box should exist; the final annotation was a combination of all the predictions. In this way, the final training dataset had about labeled images.

All the models (detection and tracking) were trained offline with the previously described datasets using the TensorFlow [1] framework. In order to reduce training time and overfitting, transfer learning was used in the detection models, which were all pre-trained on the COCO dataset.

5 Results

In order to determine which of the object detection models is most suitable for this project, the mean average precision (mAP) and inference time of each model were measured; they are shown in Table 1. Among the three tested detection models, the ResNet101/Faster R-CNN model yields the best results (Fig. 3(a)); however, its inference time is the highest, which could cause a delay in reading and processing frames from the queue and consequently cause the queue to overflow and discard frames. This would be a problem since the tracking module relies on the assumption that objects move slowly between frames, and it can lose track of an object if there are gaps in the frame sequence. The MobileNet/SSD model is more appropriate for real-time applications since it is very fast (inference takes 5 ms); however, its detections are unreliable and, in some cases, it fails to detect anything at all for numerous frames (Fig. 3(b)). The Inception-v2/SSD model seems a good compromise between inference time and precision: it achieves a good mAP value while maintaining a fast inference time.

In conclusion, if the stream reading rate is set between 5 FPS and 10 FPS, ResNet101/Faster R-CNN is still the better choice, since no frames are lost in the queue (the queue writing rate is lower than or equal to the frame processing rate). Since this range of frame rates does not harm tracking results, ResNet101/Faster R-CNN was the chosen architecture for the detection module.
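The queue behaviour discussed above, where frames are lost once the reader writes faster than the detector consumes, can be illustrated with a bounded buffer built on Python's standard `queue` module. This is a sketch under assumptions: the class name `FrameBuffer` is hypothetical, and discarding the newest frame on overflow is a choice made for the example, since the text does not say which frames are discarded.

```python
import queue
from typing import Any, Optional


class FrameBuffer:
    """Bounded FIFO buffer between the frame reader and the detector."""

    def __init__(self, maxsize: int) -> None:
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=maxsize)
        self.dropped = 0  # frames discarded because the buffer was full

    def write(self, frame: Any) -> bool:
        """Called by the reader thread; returns False if the frame was dropped."""
        try:
            self._queue.put_nowait(frame)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def read(self, timeout: Optional[float] = None) -> Any:
        """Called by the detector thread; blocks until a frame is available."""
        return self._queue.get(timeout=timeout)
```

Monitoring the `dropped` counter while varying the stream reading rate is one way to check the condition stated above, namely that the writing rate stays at or below the processing rate.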

Fig. 3: Samples of the test set inference outputs for the different detection models: (a) ResNet classifier with Faster R-CNN detector; (b) MobileNet base network with SSD detector; (c) Inception-v2 base network with SSD detector.

Table 1: Detection results.

Model                    mAP      Inference Time (s)
ResNet101/Faster R-CNN   0,
MobileNet/SSD            0,8015   0,005
Inception-v2/SSD         0,

The tracking module is able to use the detection bounding boxes to assign and maintain the ID of each object along the object's path (Fig. 4). It works well even in situations where multiple people go through the camera's field of view. The results are best when the M parameter (the number of frames without a detection before the track is considered finished) is set to two, i.e., when no detection of an object is found in two consecutive frames, the track is considered finished. If this number is greater than two, there is a risk of a new object being associated with another object's track. Although the tracker seems to perform well, at this time there is still no way of measuring its exact accuracy. This approach is suitable for real-time tracking due to the fast nature of the computations performed, in contrast to more elaborate trackers that use image information (features) to predict the next position of the tracked objects.

6 Conclusion and Future Work

The present work proposes a real-time detection and tracking system aimed at deployment in commercial areas, in which the tracking-by-detection framework is used. The models (both tracking and detection) are already yielding good results and, with more training data, should achieve higher levels of precision. The next step for the tracking module is to annotate new training and test data with the path of each individual object in order to properly measure tracking accuracy. This task can now be achieved easily, since the information returned by the tracking module can be reused (the tracker does most of the work and human input is only needed for minor adjustments).

When these tasks are finished, the next step will be to train deep learning models (for instance, RNNs) to predict object positions, since they are the current state of the art and a promising research field.

Fig. 4: Tracking sequence. The tracker assigns IDs and draws colored bounding boxes around the detections in every frame that it considers to belong to the same object; each object has its own ID and color. The tracker is able to maintain these IDs even with multiple people going through the field of view. Here, four different IDs (from 2 to 5) are represented in different colors.

References

1. Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mane, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viegas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., Zheng, X.: TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems (2016)
2. Andriyenko, A., Schindler, K., Group, R.S.: Multi-target tracking by continuous energy minimization (2014)
3. Bertinetto, L., Valmadre, J., Henriques, J.F., Vedaldi, A., Torr, P.H.: Fully-convolutional siamese networks for object tracking. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9914 LNCS (2016)
4. Bewley, A., Ge, Z., Ott, L., Ramos, F., Upcroft, B.: Simple online and realtime tracking. Proceedings - International Conference on Image Processing, ICIP 2016-August (2016)
5. Black, J., Ellis, T., Rosin, P.: Multi view image surveillance and tracking. Proceedings - Workshop on Motion and Video Computing, MOTION 2002 (2002)
6. Bochinski, E., Eiselein, V., Sikora, T.: High-Speed Tracking-by-Detection Without Using Image Information. In: Proceedings of the IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS) (August) (2017)
7. Choi, W.: Near-online multi-target tracking with aggregated local flow descriptor. Proceedings of the IEEE International Conference on Computer Vision 2015 Inter (2015)
8. Gan, Q., Guo, Q., Zhang, Z., Cho, K.: First Step toward Model-Free, Anonymous Object Tracking with Recurrent Neural Networks (2015)
9. Girshick, R.: Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision 2015 Inter (2015)
10. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2014)
11. Green, P.: Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82(4) (1995), oxfordjournals.org/content/82/4/711.short
12. He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition (Dec 2015)
13. Held, D., Thrun, S., Savarese, S.: Learning to track at 100 FPS with deep regression networks. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9905 LNCS (2016)
14. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet Classification with Deep Convolutional Neural Networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1. NIPS'12, Curran Associates Inc., USA (2012)
15. Kuhn, H.: The Hungarian Method For The Assignment Problem. Naval Research Logistics 52(1), 7-21 (2005)
16. LeCun, Y., Boser, B., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W., Jackel, L.D.: Backpropagation Applied to Handwritten Zip Code Recognition (1989)
17. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., Berg, A.C.: SSD: Single Shot MultiBox Detector (2015)
18. Milan, A., Leal-Taixe, L., Reid, I., Roth, S., Schindler, K.: MOT16: A Benchmark for Multi-Object Tracking (2016)
19. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You Only Look Once: Unified, Real-Time Object Detection. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
20. Reid, D.: An algorithm for tracking multiple targets. IEEE Transactions on Automatic Control 24(6) (1979), http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=
21. Ren, S., He, K., Sun, J., Girshick, R.: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks 39(6) (2017)
22. Simonyan, K., Zisserman, A.: Very Deep Convolutional Networks for Large-Scale Image Recognition (2014)
23. Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning (2016)
24. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition June, 1-9 (2015)
25. Uijlings, J.R.R., Van De Sande, K.E.A., Gevers, T., Smeulders, A.W.M.: Selective Search for Object Recognition (2012)

Object detection with CNNs

Object detection with CNNs Object detection with CNNs 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before CNNs After CNNs 0% 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 year Region proposals

More information

Lecture 5: Object Detection

Lecture 5: Object Detection Object Detection CSED703R: Deep Learning for Visual Recognition (2017F) Lecture 5: Object Detection Bohyung Han Computer Vision Lab. bhhan@postech.ac.kr 2 Traditional Object Detection Algorithms Region-based

More information

Tracking by recognition using neural network

Tracking by recognition using neural network Zhiliang Zeng, Ying Kin Yu, and Kin Hong Wong,"Tracking by recognition using neural network",19th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed

More information

An Accurate and Real-time Self-blast Glass Insulator Location Method Based On Faster R-CNN and U-net with Aerial Images

An Accurate and Real-time Self-blast Glass Insulator Location Method Based On Faster R-CNN and U-net with Aerial Images 1 An Accurate and Real-time Self-blast Glass Insulator Location Method Based On Faster R-CNN and U-net with Aerial Images Zenan Ling 2, Robert C. Qiu 1,2, Fellow, IEEE, Zhijian Jin 2, Member, IEEE Yuhang

More information

Cross-domain Deep Encoding for 3D Voxels and 2D Images

Cross-domain Deep Encoding for 3D Voxels and 2D Images Cross-domain Deep Encoding for 3D Voxels and 2D Images Jingwei Ji Stanford University jingweij@stanford.edu Danyang Wang Stanford University danyangw@stanford.edu 1. Introduction 3D reconstruction is one

More information

Deep learning for object detection. Slides from Svetlana Lazebnik and many others

Deep learning for object detection. Slides from Svetlana Lazebnik and many others Deep learning for object detection Slides from Svetlana Lazebnik and many others Recent developments in object detection 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before deep

More information

Object Detection Based on Deep Learning

Object Detection Based on Deep Learning Object Detection Based on Deep Learning Yurii Pashchenko AI Ukraine 2016, Kharkiv, 2016 Image classification (mostly what you ve seen) http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf

More information

REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION

REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION Kingsley Kuan 1, Gaurav Manek 1, Jie Lin 1, Yuan Fang 1, Vijay Chandrasekhar 1,2 Institute for Infocomm Research, A*STAR, Singapore 1 Nanyang Technological

More information

R-FCN++: Towards Accurate Region-Based Fully Convolutional Networks for Object Detection

R-FCN++: Towards Accurate Region-Based Fully Convolutional Networks for Object Detection The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18) R-FCN++: Towards Accurate Region-Based Fully Convolutional Networks for Object Detection Zeming Li, 1 Yilun Chen, 2 Gang Yu, 2 Yangdong

More information

Scene classification with Convolutional Neural Networks

Scene classification with Convolutional Neural Networks Scene classification with Convolutional Neural Networks Josh King jking9@stanford.edu Vayu Kishore vayu@stanford.edu Filippo Ranalli franalli@stanford.edu Abstract This paper approaches the problem of

More information

Channel Locality Block: A Variant of Squeeze-and-Excitation

Channel Locality Block: A Variant of Squeeze-and-Excitation Channel Locality Block: A Variant of Squeeze-and-Excitation 1 st Huayu Li Northern Arizona University Flagstaff, United State Northern Arizona University hl459@nau.edu arxiv:1901.01493v1 [cs.lg] 6 Jan

More information

3D Deep Convolution Neural Network Application in Lung Nodule Detection on CT Images

3D Deep Convolution Neural Network Application in Lung Nodule Detection on CT Images 3D Deep Convolution Neural Network Application in Lung Nodule Detection on CT Images Fonova zl953@nyu.edu Abstract Pulmonary cancer is the leading cause of cancer-related death worldwide, and early stage

More information

Object Detection and Its Implementation on Android Devices

Object Detection and Its Implementation on Android Devices Object Detection and Its Implementation on Android Devices Zhongjie Li Stanford University 450 Serra Mall, Stanford, CA 94305 jay2015@stanford.edu Rao Zhang Stanford University 450 Serra Mall, Stanford,

More information

PT-NET: IMPROVE OBJECT AND FACE DETECTION VIA A PRE-TRAINED CNN MODEL

PT-NET: IMPROVE OBJECT AND FACE DETECTION VIA A PRE-TRAINED CNN MODEL PT-NET: IMPROVE OBJECT AND FACE DETECTION VIA A PRE-TRAINED CNN MODEL Yingxin Lou 1, Guangtao Fu 2, Zhuqing Jiang 1, Aidong Men 1, and Yun Zhou 2 1 Beijing University of Posts and Telecommunications, Beijing,

More information

Real-Time* Multiple Object Tracking (MOT) for Autonomous Navigation

Real-Time* Multiple Object Tracking (MOT) for Autonomous Navigation Real-Time* Multiple Object Tracking (MOT) for Autonomous Navigation Ankush Agarwal,1 Saurabh Suryavanshi,2 ankushag@stanford.edu saurabhv@stanford.edu Authors contributed equally for this project. 1 Google

More information

Unified, real-time object detection

Unified, real-time object detection Unified, real-time object detection Final Project Report, Group 02, 8 Nov 2016 Akshat Agarwal (13068), Siddharth Tanwar (13699) CS698N: Recent Advances in Computer Vision, Jul Nov 2016 Instructor: Gaurav

More information

SUMMARY. in the task of supervised automatic seismic interpretation. We evaluate these tasks qualitatively and quantitatively.

SUMMARY. in the task of supervised automatic seismic interpretation. We evaluate these tasks qualitatively and quantitatively. Deep learning seismic facies on state-of-the-art CNN architectures Jesper S. Dramsch, Technical University of Denmark, and Mikael Lüthje, Technical University of Denmark SUMMARY We explore propagation

More information

Traffic Multiple Target Detection on YOLOv2

Traffic Multiple Target Detection on YOLOv2 Traffic Multiple Target Detection on YOLOv2 Junhong Li, Huibin Ge, Ziyang Zhang, Weiqin Wang, Yi Yang Taiyuan University of Technology, Shanxi, 030600, China wangweiqin1609@link.tyut.edu.cn Abstract Background

More information

arxiv: v1 [cs.cv] 25 Aug 2018

arxiv: v1 [cs.cv] 25 Aug 2018 Painting Outside the Box: Image Outpainting with GANs Mark Sabini and Gili Rusak Stanford University {msabini, gilir}@cs.stanford.edu arxiv:1808.08483v1 [cs.cv] 25 Aug 2018 Abstract The challenging task

More information

Deep Learning in Visual Recognition. Thanks Da Zhang for the slides

Deep Learning in Visual Recognition. Thanks Da Zhang for the slides Deep Learning in Visual Recognition Thanks Da Zhang for the slides Deep Learning is Everywhere 2 Roadmap Introduction Convolutional Neural Network Application Image Classification Object Detection Object

More information

Real-time Object Detection CS 229 Course Project

Real-time Object Detection CS 229 Course Project Real-time Object Detection CS 229 Course Project Zibo Gong 1, Tianchang He 1, and Ziyi Yang 1 1 Department of Electrical Engineering, Stanford University December 17, 2016 Abstract Objection detection

More information

A LAYER-BLOCK-WISE PIPELINE FOR MEMORY AND BANDWIDTH REDUCTION IN DISTRIBUTED DEEP LEARNING

A LAYER-BLOCK-WISE PIPELINE FOR MEMORY AND BANDWIDTH REDUCTION IN DISTRIBUTED DEEP LEARNING 017 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, SEPT. 5 8, 017, TOKYO, JAPAN A LAYER-BLOCK-WISE PIPELINE FOR MEMORY AND BANDWIDTH REDUCTION IN DISTRIBUTED DEEP LEARNING Haruki

More information

REVISITING DISTRIBUTED SYNCHRONOUS SGD

REVISITING DISTRIBUTED SYNCHRONOUS SGD REVISITING DISTRIBUTED SYNCHRONOUS SGD Jianmin Chen, Rajat Monga, Samy Bengio & Rafal Jozefowicz Google Brain Mountain View, CA, USA {jmchen,rajatmonga,bengio,rafalj}@google.com 1 THE NEED FOR A LARGE

More information

Proceedings of the International MultiConference of Engineers and Computer Scientists 2018 Vol I IMECS 2018, March 14-16, 2018, Hong Kong

Proceedings of the International MultiConference of Engineers and Computer Scientists 2018 Vol I IMECS 2018, March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong TABLE I CLASSIFICATION ACCURACY OF DIFFERENT PRE-TRAINED MODELS ON THE TEST DATA

More information



