Fashion Analytics and Systems
|
|
- Kristina Jackson
- 5 years ago
- Views:
Transcription
1 Learning and Vision Group, NUS (NUS-LV) Fashion Analytics and Systems Shuicheng YAN National University of Singapore [ Special thanks to Luoqi LIU, Xiaodan LIANG, Si LIU, Jianshu LI]
2 Deep Learning Ecosystem in NUS-LV Lab Purine: General, bi-graph based DL framework Multi-PC Multi-CPU/GPU Linear speedup Model parallel, data parallel Brain-like + Baby-like: Brain-like network structures and baby-like self/endless learning process Algorithms Landing Visual Perception + Big Data Modeling/Learning Object/products/human analytics, and other non-visual big data Architecture 1. 4 winner awards in VOC 2. One 2nd prize in VOC 3. 2nd prize in ImageNet st prize in ImageNet 14 Best paper/demo awards: ACM MM13, ACM MM12, Licensed with M-scale users daily Fashion Analytics and Systems Applications LFW: 99.70% Best human parsing performance Cross-age synthesis Face analysis with occlusions
3 Fashion Analytics and Systems in NUS-LV Human Parsing Fashion Item Search Smart Advertisement Body Beauty e-experts Face
4 Task I: Human/Fashion Parsing (Pixel-to-Pixel Deep Prediction: 44.76%--> 85.36%) Time for Human+
5 Goal: Human Parsing Decompose a human photo into semantic fashion/body items Pixel-level semantic labeling Upper-clothes Sun-glass skirt scarf right-shoe right-leg right-arm pants left-shoe left-leg left-arm hat face dress belt bag hair null
6 Human Parsing = Engine for Applications
7 Stage-1: Pipeline Solutions Hand-designed pipelines --- Heavily rely on the performance of individual component --- Founded on hand-designed features and complex context models Parametric model [1] Segmentation hypotheses Extract handcrafted features Non-parametric model (retrieval) [2] Postprocessing (e.g. CRF) [1] Jian Dong, Qiang Chen, Wei Xia, ZhongYang Huang, and Shuicheng Yan. A deformable mixture parsing model with parselets. In ICCV, 2013 [2] K. Yamaguchi, M.H. Kiapour, and T.L. Berg. Paper doll parsing: Retrieving similar styles to parse clothing items. In ICCV, 2013
8 Exemplar Solution: Paper Doll Parsing
9 Stage II: Deep Regression for Components Hair Face Left-arm Sun-glasses Right-arm Bag Upper-clothes Left-leg Skirt Right-leg Left-shoe Right-shoe Assumption: pixel-to-pixel impossible! Deformable Human Items Model (similar to ASM): predict the normalized item masks, and their active shape/location parameters with two CNN networks
10 Normalized Item Mask The masks of different items often appear in various specific shapes The mask can be approximated as a linear combination of the learned templates
11 Superpixel Smoothness Our Framework Active Template Network for predicting item template coefficients Active Shape Network for predicting active shape/location parameters Combine the resulting structure outputs and then refine the parsing result Structure Labels Human detection Active Template Network Item Mask Reconstruction Item confidence Map Active Shape Network Item position, scale and visibility Background confidence Map
12 Active Template Network Learn 50 templates for each item by Non-negative Matrix Factorization (NMF) in an offline way Regress the output: 50*17 for 17 human items Image size 227 Filter size Template coefficients Stride 2 3x3 max pool stride Contrast 3x3 max pool norm stride Contrast norm x3 max pool stride units units 850 Input Image Layer 1 Layer 2 Layer 3 Layer 4 Layer 5 Layer 6 Layer 7 Output
13 Active Shape Network Predict x,y coordinates, width, height, visibility flag for each item Eliminate the max-pooling layer in CNN to keep the position sensitiveness Image size 227 Filter size Active Shape parameters Stride units 1024 units 85 3 Input Image 55 Layer 1 Layer 2 Layer 3 Layer 4 Layer 5 Layer 6 Layer 7 Output Accuracy Foreground accuracy Average precision Average recall Average F-1 scores Original Structure Ours
14 Structure Output Combination Combine the structure outputs from two networks, and generate 17 confidence maps of the human items Hair Sunglasses Face Dress LeftArm Optional bounding-box refinement and super-pixel smoothening
15 Results Datasets: 7,700 images, 6,000 for training, 1,000 for testing and 700 for validation Training: Manually decrease the learning rate according to the validation error Training time: for 120 epochs, take 2-3 days on two NVIDIA GTX TITAN 6GB GPUs Testing time: process one image within about 0.5 second
16 Results Comparison of parsing performances with two state-of-the-art methods: Accuracy Foreground accuracy Average precision Average recall Average F-1 scores Yamaguchi [3] Paper-doll [2] ATR(noSPR) ATR ATR + BBox Regression [2] K. Yamaguchi, M.H. Kiapour, and T.L. Berg. Paper doll parsing: Retrieving similar styles to parse clothing items. In ICCV, 2013 [3] K. Yamaguchi, M.H. Kiapour, L.E. Ortiz, and T.L. Berg. Parsing clothing in fashion photographs. In CVPR 2012.
17 Parsing Results Test Paperdoll ATR (nospr) ATR Test Paperdoll ATR (nospr) ATR
18 Stage-III: Pixel-to-pixel Deep Prediction Contexts + Fully convolutional neural network Cross-layer context: : multi-level feature fusion Global image-level context: : coherence between pixel-wise labelling and image label prediction Local Super-pixel context: : within-superpixel consistency and cross-superpixel appearance consistency
19 Contextualized Network Cross-layer context Four feature map fusions Image 150* *100 75*50 37*25 18*12 37*25 37*25 75*50 75*50 150* * * *5 convolutions Hierarchically combine the low-level local details and high-level semantic information
20 Contextualized Network Global image-level context Incorporate global image label prediction
21 Contextualized Network Local Super-pixel context Integrate within-super-pixel smoothing and cross-super-pixel neighbourhood voting
22 Contextualized Network Global image-level context helps distinguish the ambiguous labels Skirt Dress Co-CNN w/o global label Global image label Co-CNN skirt dress upper-clothes
23 Contextualized Network Local super-pixel context retains with-superpixel and appearance consistency input Co-CNN w/o sp Co-CNN input Co-CNN w/o sp Co-CNN
24 Results Comparison of parsing performances with four state-of-the-art methods on ATR dataset: Accuracy Foreground accuracy Average precision Average recall Average F-1 scores Yamaguchi et al Paperdoll ATR Co-CNN
25 Results Analyses on architectural variants of our model Cross-layer context
26 Results Analyses on architectural variants of our model Global image label context
27 Results Analyses on architectural variants of our model Local super-pixel context
28 Results Adding 10,000 human pictures from chictopia.com Accuracy Foreground accuracy Average precision Average recall Average F-1 scores Paperdoll ATR Co-CNN Co-CNN(+Chictopia10k)
29 Multi-task with Semantic Edge Detection Semantic edge detection task Input Semantic Edge Motivations: Incorrect Edges within-item edge vs. cross-item edge
30 Multi-task with Semantic Edge Detection Semantic edge integrate the semantic edge into the Co-CNN Multi-resolution fusion
31 w/o edge w edge Results Human parsing results with semantic edge co-prediction Semantic edge accuracy: 92.31% Accuracy Foreground accuracy Average precision Average recall Average F-1 scores Paperdoll ATR Co-CNN Co-CNN(+Chictopia10k) Co-CNN(+Chictopia10k) (semanticedge)
32 Parsing Results Test Paperdoll ATR Co-CNN Test Paperdoll ATR Co-CNN
33 Online Human Parsing Engine (<0.15s) 44.76%--> 85.36%: nearly ready for many industry applications
34 Ongoing Work: Robust Snap&Buy
35 Task II: Face Beautification (Beauty e-experts)
36 Makeover (makeup+ hairstyle) Process The makeover process (2) Synthesis Before-makeup (1) Foundation (2) Lip (3) Eye shadow (4) Hairstyle (1) Recommendation color & shape
37 Database System Flowchart Recommendation Module Synthesis Module Image Feature s Beauty Attribut es Beauty Related Attributes Hair Synthesis Makeup Synthesis Try & Buy Testing Face (without makeover)
38 Recommendation Module Beauty Attributes Beautyrelated Attributes Visual Features
39 Beauty Attributes Hair length Hair color Hair shape Hair bangs Hair volume Spectral matting Eye shadow shape Spectral matting+ clustering Eye shadow color Clustering Foundation Clustering Lip gloss Totally, we define 9 kinds of beauty attributes(directly related with real cosmetic products). Clustering
40 Beauty-related Attributes Face shape long oval round Lip thickness thick normal Ocular distance wide normal narrow Race western eastern Totally, we define 21 kinds of beauty-related attributes: (1) Unchanged during makeover process (2) Strong correlations with beauty attributes
41 Visual Features Color Histograms Color Moments Histogram of Gradients Local Binary Patterns ASM Parameters Shape Context
42 Recommendation Model: Formulation Beautyrelated Attributes( a r ) Beauty Attributes (a b ) Visual Features (x) Z x = Gibbs Distribution p a b, a r x = 1 Z x exp E ab, a r, x a b,a r exp E a b, a r, x Super-graph [super-vertex and Maximum Spanning Tree] to define E
43 Synthesis Module Alignment Alpha blending Recommended beauty templates/attributes Synthesis result #3 #12 Short, straight side part #4 #3
44 Exemplar Synthesis Process Foundation Lip Gloss Eye Shadow Hair style
45 Recommendation and Synthesis Results
46 Recommendation and Synthesis Results
47 Beyond Makeover: Shape Beautification
48 Beyond Makeover: Shape Beautification
49 Fashion Analytics and Systems in NUS-LV Human Parsing Fashion Item Search Smart Advertisement Body Beauty e-experts Face
50
Human Parsing with Contextualized Convolutional Neural Network
Human Parsing with Contextualized Convolutional Neural Network Xiaodan Liang 1,2, Chunyan Xu 2, Xiaohui Shen 3, Jianchao Yang 5, Si Liu 6, Jinhui Tang 4 Liang Lin 1, Shuicheng Yan 2 1 Sun Yat-sen University
More informationMatching-CNN Meets KNN: Quasi-Parametric Human Parsing
Matching-CNN Meets KNN: Quasi-Parametric Human Parsing Si Liu 1,2 Xiaodan Liang 2,4 Luoqi Liu 2 Xiaohui Shen 3 Jianchao Yang 3 Changsheng Xu Liang Lin 4 Xiaochun Cao 1 Shuicheng Yan 2 1 SKLOIS, IIE, Chinese
More informationPASCAL VOC Classification: Local Features vs. Deep Features. Shuicheng YAN, NUS
PASCAL VOC Classification: Local Features vs. Deep Features Shuicheng YAN, NUS PASCAL VOC Why valuable? Multi-label, Real Scenarios! Visual Object Recognition Object Classification Object Detection Object
More informationDeep condolence to Professor Mark Everingham
Deep condolence to Professor Mark Everingham Towards VOC2012 Object Classification Challenge Generalized Hierarchical Matching for Sub-category Aware Object Classification National University of Singapore
More informationSemantic Object Parsing with Local-Global Long Short-Term Memory
Semantic Object Parsing with Local-Global Long Short-Term Memory Xiaodan Liang 1,3, Xiaohui Shen 4, Donglai Xiang 3, Jiashi Feng 3 Liang Lin 1, Shuicheng Yan 2,3 1 Sun Yat-sen University 2 360 AI Institute
More informationFashion Parsing With Video Context
IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 17, NO. 8, AUGUST 2015 1347 Fashion Parsing With Video Context Si Liu, Member, IEEE, Xiaodan Liang, Luoqi Liu, Ke Lu, Liang Lin, Xiaochun Cao, Member, IEEE, and Shuicheng
More informationSemantic Object Parsing with Graph LSTM
Semantic Object Parsing with Graph LSTM Xiaodan Liang 1 Xiaohui Shen 4 Jiashi Feng 3 Liang Lin 1 Shuicheng Yan 2,3 1 Sun Yat-sen University 2 360 AI Institue 3 National University of Singapore 4 Adobe
More informationDeep Learning for Virtual Shopping. Dr. Jürgen Sturm Group Leader RGB-D
Deep Learning for Virtual Shopping Dr. Jürgen Sturm Group Leader RGB-D metaio GmbH Augmented Reality with the Metaio SDK: IKEA Catalogue App Metaio: Augmented Reality Metaio SDK for ios, Android and Windows
More informationFashion Parsing with Video Context
Fashion Parsing with Video Context Si Liu 1, Xiaodan Liang 2,1, Luoqi Liu 1, Ke Lu 3, Liang Lin 2, Shuicheng Yan 1 1 National University of Singapore 2 Sun Yat-Sen University 3 University of Chinese Academy
More informationYiqi Yan. May 10, 2017
Yiqi Yan May 10, 2017 P a r t I F u n d a m e n t a l B a c k g r o u n d s Convolution Single Filter Multiple Filters 3 Convolution: case study, 2 filters 4 Convolution: receptive field receptive field
More informationCost-alleviative Learning for Deep Convolutional Neural Network-based Facial Part Labeling
[DOI: 10.2197/ipsjtcva.7.99] Express Paper Cost-alleviative Learning for Deep Convolutional Neural Network-based Facial Part Labeling Takayoshi Yamashita 1,a) Takaya Nakamura 1 Hiroshi Fukui 1,b) Yuji
More informationDeep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks
Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Si Chen The George Washington University sichen@gwmail.gwu.edu Meera Hahn Emory University mhahn7@emory.edu Mentor: Afshin
More informationDeep learning for object detection. Slides from Svetlana Lazebnik and many others
Deep learning for object detection Slides from Svetlana Lazebnik and many others Recent developments in object detection 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before deep
More informationObject detection with CNNs
Object detection with CNNs 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before CNNs After CNNs 0% 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 year Region proposals
More informationLecture 7: Semantic Segmentation
Semantic Segmentation CSED703R: Deep Learning for Visual Recognition (207F) Segmenting images based on its semantic notion Lecture 7: Semantic Segmentation Bohyung Han Computer Vision Lab. bhhanpostech.ac.kr
More informationDetecting and Parsing of Visual Objects: Humans and Animals. Alan Yuille (UCLA)
Detecting and Parsing of Visual Objects: Humans and Animals Alan Yuille (UCLA) Summary This talk describes recent work on detection and parsing visual objects. The methods represent objects in terms of
More informationDisguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network. Nathan Sun CIS601
Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network Nathan Sun CIS601 Introduction Face ID is complicated by alterations to an individual s appearance Beard,
More informationRich feature hierarchies for accurate object detection and semantic segmentation
Rich feature hierarchies for accurate object detection and semantic segmentation BY; ROSS GIRSHICK, JEFF DONAHUE, TREVOR DARRELL AND JITENDRA MALIK PRESENTER; MUHAMMAD OSAMA Object detection vs. classification
More informationarxiv: v1 [cs.cv] 8 Mar 2017
Interpretable Structure-Evolving LSTM Xiaodan Liang 1 Liang Lin 2 Xiaohui Shen 4 Jiashi Feng 3 Shuicheng Yan 3 Eric P. Xing 1 1 Carnegie Mellon University 2 Sun Yat-sen University 3 National University
More informationLSTM and its variants for visual recognition. Xiaodan Liang Sun Yat-sen University
LSTM and its variants for visual recognition Xiaodan Liang xdliang328@gmail.com Sun Yat-sen University Outline Context Modelling with CNN LSTM and its Variants LSTM Architecture Variants Application in
More informationSpeaker: Ming-Ming Cheng Nankai University 15-Sep-17 Towards Weakly Supervised Image Understanding
Towards Weakly Supervised Image Understanding (WSIU) Speaker: Ming-Ming Cheng Nankai University http://mmcheng.net/ 1/50 Understanding Visual Information Image by kirkh.deviantart.com 2/50 Dataset Annotation
More informationLecture 5: Object Detection
Object Detection CSED703R: Deep Learning for Visual Recognition (2017F) Lecture 5: Object Detection Bohyung Han Computer Vision Lab. bhhan@postech.ac.kr 2 Traditional Object Detection Algorithms Region-based
More informationMachine Learning 13. week
Machine Learning 13. week Deep Learning Convolutional Neural Network Recurrent Neural Network 1 Why Deep Learning is so Popular? 1. Increase in the amount of data Thanks to the Internet, huge amount of
More informationClassification of objects from Video Data (Group 30)
Classification of objects from Video Data (Group 30) Sheallika Singh 12665 Vibhuti Mahajan 12792 Aahitagni Mukherjee 12001 M Arvind 12385 1 Motivation Video surveillance has been employed for a long time
More informationJOINT DETECTION AND SEGMENTATION WITH DEEP HIERARCHICAL NETWORKS. Zhao Chen Machine Learning Intern, NVIDIA
JOINT DETECTION AND SEGMENTATION WITH DEEP HIERARCHICAL NETWORKS Zhao Chen Machine Learning Intern, NVIDIA ABOUT ME 5th year PhD student in physics @ Stanford by day, deep learning computer vision scientist
More informationSpatial Localization and Detection. Lecture 8-1
Lecture 8: Spatial Localization and Detection Lecture 8-1 Administrative - Project Proposals were due on Saturday Homework 2 due Friday 2/5 Homework 1 grades out this week Midterm will be in-class on Wednesday
More informationCSE255 Assignment 1 Improved image-based recommendations for what not to wear dataset
CSE255 Assignment 1 Improved image-based recommendations for what not to wear dataset Prabhav Agrawal and Soham Shah 23 February 2015 1 Introduction We are interested in modeling the human perception of
More informationMask R-CNN. presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma
Mask R-CNN presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma Mask R-CNN Background Related Work Architecture Experiment Mask R-CNN Background Related Work Architecture Experiment Background From left
More informationDirect Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab.
[ICIP 2017] Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab., POSTECH Pedestrian Detection Goal To draw bounding boxes that
More informationProposal-free Network for Instance-level Object Segmentation
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. XX, NO. X, X 20XX 1 Proposal-free Network for Instance-level Object Segmentation Xiaodan Liang, Yunchao Wei, Xiaohui Shen, Jianchao
More informationAttributes and More Crowdsourcing
Attributes and More Crowdsourcing Computer Vision CS 143, Brown James Hays Many slides from Derek Hoiem Recap: Human Computation Active Learning: Let the classifier tell you where more annotation is needed.
More informationPart Localization by Exploiting Deep Convolutional Networks
Part Localization by Exploiting Deep Convolutional Networks Marcel Simon, Erik Rodner, and Joachim Denzler Computer Vision Group, Friedrich Schiller University of Jena, Germany www.inf-cv.uni-jena.de Abstract.
More informationWhen Big Datasets are Not Enough: The need for visual virtual worlds.
When Big Datasets are Not Enough: The need for visual virtual worlds. Alan Yuille Bloomberg Distinguished Professor Departments of Cognitive Science and Computer Science Johns Hopkins University Computational
More informationACM MM Dong Liu, Shuicheng Yan, Yong Rui and Hong-Jiang Zhang
ACM MM 2010 Dong Liu, Shuicheng Yan, Yong Rui and Hong-Jiang Zhang Harbin Institute of Technology National University of Singapore Microsoft Corporation Proliferation of images and videos on the Internet
More informationSSD: Single Shot MultiBox Detector. Author: Wei Liu et al. Presenter: Siyu Jiang
SSD: Single Shot MultiBox Detector Author: Wei Liu et al. Presenter: Siyu Jiang Outline 1. Motivations 2. Contributions 3. Methodology 4. Experiments 5. Conclusions 6. Extensions Motivation Motivation
More informationCAP 6412 Advanced Computer Vision
CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong April 21st, 2016 Today Administrivia Free parameters in an approach, model, or algorithm? Egocentric videos by Aisha
More informationarxiv: v2 [cs.cv] 15 Mar 2018
Multi-Human Parsing in the Wild arxiv:1705.07206v2 [cs.cv] 15 Mar 2018 Jianshu Li 1 Jian Zhao 1 Yunchao Wei 1 Congyan Lang 2 Yidong Li 2 Terence Sim 1 Shuicheng Yan 1 Jiashi Feng 1 1 National University
More informationFACIAL POINT DETECTION USING CONVOLUTIONAL NEURAL NETWORK TRANSFERRED FROM A HETEROGENEOUS TASK
FACIAL POINT DETECTION USING CONVOLUTIONAL NEURAL NETWORK TRANSFERRED FROM A HETEROGENEOUS TASK Takayoshi Yamashita* Taro Watasue** Yuji Yamauchi* Hironobu Fujiyoshi* *Chubu University, **Tome R&D 1200,
More informationFaster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun Presented by Tushar Bansal Objective 1. Get bounding box for all objects
More informationDigital Makeup Face Generation
Digital Makeup Face Generation Wut Yee Oo Mechanical Engineering Stanford University wutyee@stanford.edu Abstract Make up applications offer photoshop tools to get users inputs in generating a make up
More informationLearning Semantic Environment Perception for Cognitive Robots
Learning Semantic Environment Perception for Cognitive Robots Sven Behnke University of Bonn, Germany Computer Science Institute VI Autonomous Intelligent Systems Some of Our Cognitive Robots Equipped
More informationDeep Face Recognition. Nathan Sun
Deep Face Recognition Nathan Sun Why Facial Recognition? Picture ID or video tracking Higher Security for Facial Recognition Software Immensely useful to police in tracking suspects Your face will be an
More informationEfficient Segmentation-Aided Text Detection For Intelligent Robots
Efficient Segmentation-Aided Text Detection For Intelligent Robots Junting Zhang, Yuewei Na, Siyang Li, C.-C. Jay Kuo University of Southern California Outline Problem Definition and Motivation Related
More informationDEEP BLIND IMAGE QUALITY ASSESSMENT
DEEP BLIND IMAGE QUALITY ASSESSMENT BY LEARNING SENSITIVITY MAP Jongyoo Kim, Woojae Kim and Sanghoon Lee ICASSP 2018 Deep Learning and Convolutional Neural Networks (CNNs) SOTA in computer vision & image
More informationSemantic Segmentation without Annotating Segments
Chapter 3 Semantic Segmentation without Annotating Segments Numerous existing object segmentation frameworks commonly utilize the object bounding box as a prior. In this chapter, we address semantic segmentation
More informationarxiv: v1 [cs.cv] 15 Oct 2018
Instance Segmentation and Object Detection with Bounding Shape Masks Ha Young Kim 1,2,*, Ba Rom Kang 2 1 Department of Financial Engineering, Ajou University Worldcupro 206, Yeongtong-gu, Suwon, 16499,
More informationPaper Doll Parsing: Retrieving Similar Styles to Parse Clothing Items
2013 IEEE International Conference on Computer Vision Paper Doll Parsing: Retrieving Similar Styles to Parse Clothing Items Kota Yamaguchi Stony Brook University Stony Brook, NY, USA kyamagu@cs.stonybrook.edu
More informationSemantic Segmentation
Semantic Segmentation UCLA:https://goo.gl/images/I0VTi2 OUTLINE Semantic Segmentation Why? Paper to talk about: Fully Convolutional Networks for Semantic Segmentation. J. Long, E. Shelhamer, and T. Darrell,
More informationDirect Matrix Factorization and Alignment Refinement: Application to Defect Detection
Direct Matrix Factorization and Alignment Refinement: Application to Defect Detection Zhen Qin (University of California, Riverside) Peter van Beek & Xu Chen (SHARP Labs of America, Camas, WA) 2015/8/30
More informationEncoder-Decoder Networks for Semantic Segmentation. Sachin Mehta
Encoder-Decoder Networks for Semantic Segmentation Sachin Mehta Outline > Overview of Semantic Segmentation > Encoder-Decoder Networks > Results What is Semantic Segmentation? Input: RGB Image Output:
More informationColor Naming for Multi-Color Fashion Items
Color Naming for Multi-Color Fashion Items Vacit Oguz Yazici 1,2, Joost van de Weijer 1, and Arnau Ramisa 2 1 Computer Vision Center, Universitat Autonoma de Barcelona, Building O Campus UAB, 08193 Bellaterra,
More informationPhoto-realistic Renderings for Machines Seong-heum Kim
Photo-realistic Renderings for Machines 20105034 Seong-heum Kim CS580 Student Presentations 2016.04.28 Photo-realistic Renderings for Machines Scene radiances Model descriptions (Light, Shape, Material,
More informationDiscrete Optimization of Ray Potentials for Semantic 3D Reconstruction
Discrete Optimization of Ray Potentials for Semantic 3D Reconstruction Marc Pollefeys Joined work with Nikolay Savinov, Christian Haene, Lubor Ladicky 2 Comparison to Volumetric Fusion Higher-order ray
More informationREGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION
REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION Kingsley Kuan 1, Gaurav Manek 1, Jie Lin 1, Yuan Fang 1, Vijay Chandrasekhar 1,2 Institute for Infocomm Research, A*STAR, Singapore 1 Nanyang Technological
More informationA Deep Learning Framework for Authorship Classification of Paintings
A Deep Learning Framework for Authorship Classification of Paintings Kai-Lung Hua ( 花凱龍 ) Dept. of Computer Science and Information Engineering National Taiwan University of Science and Technology Taipei,
More informationDeformable Part Models
CS 1674: Intro to Computer Vision Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 9, 2016 Today: Object category detection Window-based approaches: Last time: Viola-Jones
More informationMask R-CNN. By Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick Presented By Aditya Sanghi
Mask R-CNN By Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick Presented By Aditya Sanghi Types of Computer Vision Tasks http://cs231n.stanford.edu/ Semantic vs Instance Segmentation Image
More informationFACIAL POINT DETECTION BASED ON A CONVOLUTIONAL NEURAL NETWORK WITH OPTIMAL MINI-BATCH PROCEDURE. Chubu University 1200, Matsumoto-cho, Kasugai, AICHI
FACIAL POINT DETECTION BASED ON A CONVOLUTIONAL NEURAL NETWORK WITH OPTIMAL MINI-BATCH PROCEDURE Masatoshi Kimura Takayoshi Yamashita Yu Yamauchi Hironobu Fuyoshi* Chubu University 1200, Matsumoto-cho,
More informationSupplemental Document for Deep Photo Style Transfer
Supplemental Document for Deep Photo Style Transfer Fujun Luan Cornell University Sylvain Paris Adobe Eli Shechtman Adobe Kavita Bala Cornell University fujun@cs.cornell.edu sparis@adobe.com elishe@adobe.com
More informationAn Associate-Predict Model for Face Recognition FIPA Seminar WS 2011/2012
An Associate-Predict Model for Face Recognition FIPA Seminar WS 2011/2012, 19.01.2012 INSTITUTE FOR ANTHROPOMATICS, FACIAL IMAGE PROCESSING AND ANALYSIS YIG University of the State of Baden-Wuerttemberg
More informationSeparating Objects and Clutter in Indoor Scenes
Separating Objects and Clutter in Indoor Scenes Salman H. Khan School of Computer Science & Software Engineering, The University of Western Australia Co-authors: Xuming He, Mohammed Bennamoun, Ferdous
More informationSemantic RGB-D Perception for Cognitive Robots
Semantic RGB-D Perception for Cognitive Robots Sven Behnke Computer Science Institute VI Autonomous Intelligent Systems Our Domestic Service Robots Dynamaid Cosero Size: 100-180 cm, weight: 30-35 kg 36
More informationLearning Visual Semantics: Models, Massive Computation, and Innovative Applications
Learning Visual Semantics: Models, Massive Computation, and Innovative Applications Part II: Visual Features and Representations Liangliang Cao, IBM Watson Research Center Evolvement of Visual Features
More informationInteractive Image Search with Attributes
Interactive Image Search with Attributes Adriana Kovashka Department of Computer Science January 13, 2015 Joint work with Kristen Grauman and Devi Parikh We Need Search to Access Visual Data 144,000 hours
More information[Supplementary Material] Improving Occlusion and Hard Negative Handling for Single-Stage Pedestrian Detectors
[Supplementary Material] Improving Occlusion and Hard Negative Handling for Single-Stage Pedestrian Detectors Junhyug Noh Soochan Lee Beomsu Kim Gunhee Kim Department of Computer Science and Engineering
More informationDeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs Zhipeng Yan, Moyuan Huang, Hao Jiang 5/1/2017 1 Outline Background semantic segmentation Objective,
More informationEXPLOITING TEXTURE CUES FOR CLOTHING PARSING IN FASHION IMAGES
EXPLOITING TEXTURE CUES FOR CLOTHING PARSING IN FASHION IMAGES Tarasha Khurana,1 Kushagra Mahajan,1 Chetan Arora 1 Atul Rai 2 1 IIIT Delhi 2 Staqu Technologies ABSTRACT We focus on the problem of parsing
More informationInterpretable Structure-Evolving LSTM
Interpretable Structure-Evolving LSTM Xiaodan Liang 1,2 Liang Lin 2,5 Xiaohui Shen 4 Jiashi Feng 3 Shuicheng Yan 3 Eric P. Xing 1 1 Carnegie Mellon University 2 Sun Yat-sen University 3 National University
More informationLearning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li
Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,
More informationCS 534: Computer Vision Segmentation and Perceptual Grouping
CS 534: Computer Vision Segmentation and Perceptual Grouping Ahmed Elgammal Dept of Computer Science CS 534 Segmentation - 1 Outlines Mid-level vision What is segmentation Perceptual Grouping Segmentation
More informationarxiv: v1 [cs.cv] 23 Jan 2019
DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images Yuying Ge 1, Ruimao Zhang 1, Lingyun Wu 2, Xiaogang Wang 1, Xiaoou Tang 1, and
More informationA Deformable Mixture Parsing Model with Parselets
2013 IEEE International Conference on Computer Vision A Deformable Mixture Parsing Model with Parselets Jian Dong 1, Qiang Chen 1, Wei Xia 1, Zhongyang Huang 2, Shuicheng Yan 1 1 Department of Electrical
More informationMulti-view Stereo. Ivo Boyadzhiev CS7670: September 13, 2011
Multi-view Stereo Ivo Boyadzhiev CS7670: September 13, 2011 What is stereo vision? Generic problem formulation: given several images of the same object or scene, compute a representation of its 3D shape
More informationShape Customisation of Human Subjects Based on Human Parsing Technology
Shape Customisation of Human Subjects Based on Human Parsing Technology Shuaiyin ZHU 2, Yanghong ZHOU 2, K.P. CHAU 2, P.Y. MOK* 1,2 1 The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen,
More informationAttributes. Computer Vision. James Hays. Many slides from Derek Hoiem
Many slides from Derek Hoiem Attributes Computer Vision James Hays Recap: Human Computation Active Learning: Let the classifier tell you where more annotation is needed. Human-in-the-loop recognition:
More informationCIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm
CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm Instructions This is an individual assignment. Individual means each student must hand in their
More informationMeasuring Aristic Similarity of Paintings
Measuring Aristic Similarity of Paintings Jay Whang Stanford SCPD jaywhang@stanford.edu Buhuang Liu Stanford SCPD buhuang@stanford.edu Yancheng Xiao Stanford SCPD ycxiao@stanford.edu Abstract In this project,
More informationMulti-View 3D Object Detection Network for Autonomous Driving
Multi-View 3D Object Detection Network for Autonomous Driving Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, Tian Xia CVPR 2017 (Spotlight) Presented By: Jason Ku Overview Motivation Dataset Network Architecture
More informationObject Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR
Object Detection CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Problem Description Arguably the most important part of perception Long term goals for object recognition: Generalization
More informationDeep Learning for Face Recognition. Xiaogang Wang Department of Electronic Engineering, The Chinese University of Hong Kong
Deep Learning for Face Recognition Xiaogang Wang Department of Electronic Engineering, The Chinese University of Hong Kong Deep Learning Results on LFW Method Accuracy (%) # points # training images Huang
More informationDeep Learning with Tensorflow AlexNet
Machine Learning and Computer Vision Group Deep Learning with Tensorflow http://cvml.ist.ac.at/courses/dlwt_w17/ AlexNet Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton, "Imagenet classification
More informationFashion Style in 128 Floats: Joint Ranking and Classification using Weak Data for Feature Extraction SUPPLEMENTAL MATERIAL
Fashion Style in 128 Floats: Joint Ranking and Classification using Weak Data for Feature Extraction SUPPLEMENTAL MATERIAL Edgar Simo-Serra Waseda University esimo@aoni.waseda.jp Hiroshi Ishikawa Waseda
More informationTransfer Learning. Style Transfer in Deep Learning
Transfer Learning & Style Transfer in Deep Learning 4-DEC-2016 Gal Barzilai, Ram Machlev Deep Learning Seminar School of Electrical Engineering Tel Aviv University Part 1: Transfer Learning in Deep Learning
More informationCS 1674: Intro to Computer Vision. Attributes. Prof. Adriana Kovashka University of Pittsburgh November 2, 2016
CS 1674: Intro to Computer Vision Attributes Prof. Adriana Kovashka University of Pittsburgh November 2, 2016 Plan for today What are attributes and why are they useful? (paper 1) Attributes for zero-shot
More informationConvolutional Networks in Scene Labelling
Convolutional Networks in Scene Labelling Ashwin Paranjape Stanford ashwinpp@stanford.edu Ayesha Mudassir Stanford aysh@stanford.edu Abstract This project tries to address a well known problem of multi-class
More informationContent-Based Image Recovery
Content-Based Image Recovery Hong-Yu Zhou and Jianxin Wu National Key Laboratory for Novel Software Technology Nanjing University, China zhouhy@lamda.nju.edu.cn wujx2001@nju.edu.cn Abstract. We propose
More information3D Shape Analysis with Multi-view Convolutional Networks. Evangelos Kalogerakis
3D Shape Analysis with Multi-view Convolutional Networks Evangelos Kalogerakis 3D model repositories [3D Warehouse - video] 3D geometry acquisition [KinectFusion - video] 3D shapes come in various flavors
More informationEarly Hierarchical Contexts Learned by Convolutional Networks for Image Segmentation
Early Hierarchical Contexts Learned by Convolutional Networks for Image Segmentation Zifeng Wu, Yongzhen Huang, Yinan Yu, Liang Wang and Tieniu Tan Center for Research on Intelligent Perception and Computing
More informationSupervised Hashing for Image Retrieval via Image Representation Learning
Supervised Hashing for Image Retrieval via Image Representation Learning Rongkai Xia, Yan Pan, Cong Liu (Sun Yat-Sen University) Hanjiang Lai, Shuicheng Yan (National University of Singapore) Finding Similar
More informationAn Exploration of Computer Vision Techniques for Bird Species Classification
An Exploration of Computer Vision Techniques for Bird Species Classification Anne L. Alter, Karen M. Wang December 15, 2017 Abstract Bird classification, a fine-grained categorization task, is a complex
More informationGradient of the lower bound
Weakly Supervised with Latent PhD advisor: Dr. Ambedkar Dukkipati Department of Computer Science and Automation gaurav.pandey@csa.iisc.ernet.in Objective Given a training set that comprises image and image-level
More informationMediaTek Video Face Beautify
MediaTek Video Face Beautify November 2014 2014 MediaTek Inc. Table of Contents 1 Introduction... 3 2 The MediaTek Solution... 4 3 Overview of Video Face Beautify... 4 4 Face Detection... 6 5 Skin Detection...
More informationShadowDraw Real-Time User Guidance for Freehand Drawing. Harshal Priyadarshi
ShadowDraw Real-Time User Guidance for Freehand Drawing Harshal Priyadarshi Demo Components of Shadow-Draw Inverted File Structure for indexing Database of images Corresponding Edge maps Query method Dynamically
More informationBidirectional Recurrent Convolutional Networks for Video Super-Resolution
Bidirectional Recurrent Convolutional Networks for Video Super-Resolution Qi Zhang & Yan Huang Center for Research on Intelligent Perception and Computing (CRIPAC) National Laboratory of Pattern Recognition
More informationArchitectures for Scalable Media Object Search
Architectures for Scalable Media Object Search Dennis Sng Deputy Director & Principal Scientist NVIDIA GPU Technology Workshop 10 July 2014 ROSE LAB OVERVIEW 2 Large Database of Media Objects Next- Generation
More informationPredicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture David Eigen, Rob Fergus
Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture David Eigen, Rob Fergus Presented by: Rex Ying and Charles Qi Input: A Single RGB Image Estimate
More informationCNN for Low Level Image Processing. Huanjing Yue
CNN for Low Level Image Processing Huanjing Yue 2017.11 1 Deep Learning for Image Restoration General formulation: min Θ L( x, x) s. t. x = F(y; Θ) Loss function Parameters to be learned Key issues The
More informationAdvances in Face Recognition Research
The face recognition company Advances in Face Recognition Research Presentation for the 2 nd End User Group Meeting Juergen Richter Cognitec Systems GmbH For legal reasons some pictures shown on the presentation
More informationSu et al. Shape Descriptors - III
Su et al. Shape Descriptors - III Siddhartha Chaudhuri http://www.cse.iitb.ac.in/~cs749 Funkhouser; Feng, Liu, Gong Recap Global A shape descriptor is a set of numbers that describes a shape in a way that
More informationXiaowei Hu* Lei Zhu* Chi-Wing Fu Jing Qin Pheng-Ann Heng
Direction-aware Spatial Context Features for Shadow Detection Xiaowei Hu* Lei Zhu* Chi-Wing Fu Jing Qin Pheng-Ann Heng The Chinese University of Hong Kong The Hong Kong Polytechnic University Shenzhen
More information