Xian-Sheng Hua ( 华先胜 )

Size: px
Start display at page:

Download "Xian-Sheng Hua ( 华先胜 )"

Transcription

1 Machine Learning and Applications Workshop 2009 Xian-Sheng Hua ( 华先胜 ) Lead Researcher, Media Computing Group, Microsoft Research Asia Nanjing China Nov 7-8, 2009

2 Introduce research problems in multimedia search area that are related to machine learning Introduce exemplary research efforts of my team along this direction (how we find/analyze/formulate/solve the problems using machine learning in this area) 2

3 Introduction Internet Multimedia Search Machine Learning in Internet Multimedia Search and Mining Learning in internet multimedia indexing Learning in internet multimedia ranking Learning in internet multimedia mining Discussion Microsoft Confidential 3

4 How many images on the Internet? And how about videos? 4

5 4 billion (June 2009) 128 years to view all of them (1s per image) ~4000 uploads/minute 2% Internet users visit Daily time on site: 4.7 minutes 200 million (July 2009 ) 1000 years to see all of them ~20 hours uploaded/minute 20% Internet users visit Daily time on site: 23 minutes 2007 bandwidth = entire Internet in 2000 March 2008: bandwidth cost US$1M a day 12% copyright issue 15 billion (April 2009 ) 480 years to view all of them (1s per image) ~22000 uploads/minute 24% Internet users visit Daily time on site: 30 minutes 5

6 Search Navigation Sharing Authoring/Editing Copy Detection Recommendation Tagging Mining Streaming Summarization Visualization Advertising Categorization Forensics Media on Mobiles 6

7 Query Interface Ranking Indexing User Query Index Data Results Presentation 7

8 Indexing Understand and describe the content, and then build index Ranking Order relevant results according to the relevance to the query/intent Query Interface How to effectively express the query input/intent Result Presentation How to effectively present the search results 8

9 Introduction Internet Multimedia Search Machine Learning in Internet Multimedia Search and Mining Learning in internet multimedia indexing Learning in internet multimedia ranking Learning in internet multimedia mining Discussion Microsoft Confidential 9

10 Q.-J. Qi, X.-S Hua, Y. Rui, et al. Correlative Multi-Label Video Annotation. ACM Multimedia 2007 (Best Paper Award). 10

11 Automatic 90 Pure Content Based (QBE) Learning-Based Recently Automated Annotation Manual Manual Labeling Year 11

12 A typical strategy Individual Concept Detection Annotate multiple concepts separately Low-Level Features Outdoor Face Person People- Marching Road Walking- Running -1 / 1-1 / 1-1 / 1-1 / 1-1 / 1-1 / 1 12

13 Person Street Building Beach Mountain Crowd Outdoor Walking/Running? Marching 13

14 Another typical strategy Fusion-Based Context Based Concept fusion (CBCF) Low-Level Features Outdoor Face Person Road People- Marching Walking- Running Score Score Score Score Score Score -1 / 1-1 / 1-1 / 1-1 / 1-1 / 1-1 / 1 Concept Model Vector Concept Fusion -1 / 1-1 / 1-1 / 1-1 / 1-1 / 1-1 / 1 S. Godbole and S. Sarawagi. Discriminative methods for multi-labeled classification. PAKDD,

15 Low-Level Features Outdoor Face Person People- Marching Road Walking- Running Score Score Score Score Score Score Concept Model Vector Concept Fusion -1 / 1-1 / 1-1 / 1-1 / 1-1 / 1-1 / 1 15

16 A better strategy Integrated Concept Detection Correlative Multi-Label Learning (CML) Low-Level Features Outdoor Face Person People- Marching Road Walking- Running -1 / 1-1 / 1-1 / 1-1 / 1-1 / 1-1 / 1 Guo-Jun Qi, Xian-Sheng Hua, et al. Correlative Multi-Label Video Annotation. ACM Multimedia

17 A better strategy Integrated Concept Detection Correlative Multi-Label Learning (CML) Guo-Jun Qi, Xian-Sheng Hua, et al. Correlative Multi-Label Video Annotation. ACM Multimedia

18 How to model concepts and the correlations among concept in a single step Our Strategy Converting correlations into features. Constructing a new feature vector that captures both The characteristics of concepts, and The correlations among concepts 18

19 Notations 19

20 Modeling concept and correlations 2K D K 1 20

21 Learning the classifier Misclassification Error Loss function Empirical risk Regularization Introduce slack variables Lagrange dual Find solution by SMO 21

22 Connection to Gibbs Random Field Define a random field consists of all adjacent sites, that is, this RF is fully connected is a random field Define energy function Define GRF Rewrite the classifier 22

23 Connection to Gibbs Random Field Define a random field consists of all adjacent sites, that is, this RF is fully connected is a random field Define energy function Define GRF Rewrite the classifier Intuitive explanation of CML 23

24 Experiments TRECVID 2005 dataset (170 hours) 39 concepts (LSCOM-Lite) Training (65%), Validation (16%), Testing (19%) 24

25 Experiments TRECVID 2005 dataset (170 hours) 39 concepts (LSCOM-Lite) Training (65%), Validation (16%), Testing (19%) CML (MAP=0.290) improves IndSVM (MAP=0.246) 17% and CBCF (MAP=0.253) 14% SVM CML 17 CBCF CML 14 SVM CBCF CM 25

26 Experiments TRECVID 2005 dataset (170 hours) 39 concepts (LSCOM-Lite) Training (65%), Validation (16%), Testing (19%) CML (MAP=0.290) improves IndSVM (MAP=0.246) 17% and CBCF (MAP=0.253) 14% SVM CML 131% CBCF CML 128% SVM CBCF CML 26

27 Experiments TRECVID 2005 dataset (170 hours) 39 concepts (LSCOM-Lite) Training (65%), Validation (16%), Testing (19%) CML (MAP=0.290) improves IndSVM (MAP=0.246) 17% and CBCF (MAP=0.253) 14% SVM CBCF CML SVM CBCF CML SVM CBCF CML 27

28 Experiments TRECVID 2005 dataset (170 hours) 39 concepts (LSCOM-Lite) Training (65%), Validation (16%), Testing (19%) CML (MAP=0.290) improves IndSVM (MAP=0.246) 17% and CBCF (MAP=0.253) 14% 28

29 Multi-Instance Multi-Label Learning Zhengjun Zha, Xian-Sheng Hua, et al. Joint Multi-Label Multi-Instance Learning for Image Classification. CVPR Semi-Supervised Multi-Label Learning Jingdong Wang, Yinghai Zhao, Xiuqing Wu, Xian-Sheng Hua: Transductive multi-label learning for video concept detection. Multimedia Information Retrieval 2008 Multi-Label Active Learning Guo-Jun Qi, Xian-Sheng Hua, et al. Two-Dimensional Active Learning for Image Classification. CVPR Online Multi-Label Active Learning Xian-Sheng Hua, Guo-Jun Qi. Online Multi-Label Active Annotation. ACM Multimedia Guo-Jun Qi, Xian-Sheng Hua, et al. Two-Dimensional Multi-Label Active Learning with An Efficient Online Adaptation Model for Image Classification. T-PAMI

30 Sky Water Mountain Sands Scenery Zhengjun Zha, Xian-Sheng Hua, et al. Joint Multi-Label Multi-Instance Learning for Image Classification. CVPR

31 Multi-Label Learning Multi-Instance Learning Multi-Label Multi-Instance Learning Zhengjun Zha, Xian-Sheng Hua, et al. Joint Multi-Label Multi-Instance Learning for Image Classification. CVPR

32 Notations input pattern image label region label 32

33 Formulation: Hidden Conditional Random Field Association between a region and its label Spatial relation between region labels ( is neighbor or not) Coherence between region and image labels (MIL) Correlations of image labels (MLL) Learning: EM Inference: Maximum Posterior Marginals (MPM) Zhengjun Zha, Xian-Sheng Hua, et al. Joint Multi-Label Multi-Instance Learning for Image Classification. CVPR

34 34

35 Outdoor Water Sea People Crowd Sky Cloud Y Y Y N N Y Y Y N N N N Y N Y N N Y Y N N N N N Y N N N (Single-Label Active Learning for Multi-Label Problems) 35

36 Outdoor Water Sea People Crowd Sky Cloud Y N Y N N Y N Y N Y N N N Y N Guo-Jun Qi, Xian-Sheng Hua, et al. Two-Dimensional Active Learning for Image Classification. CVPR

37 Sample dimension Label dimension x 1 x 2 x 3 x 4 +???????-? +???? To minimize multi-labeled Bayesian classification error. Sample-label selection criterion: m (, ) arg max (, ) ( ;, ) xs ys H ys yl( xs) xs MI yi ys yl( xs) xs xs P, ys U( xs ) i 1, i s Self entropy Mutual information between labels Guo-Jun Qi, Xian-Sheng Hua, et al. Two-Dimensional Active Learning for Image Classification. CVPR

38 Online Learner Preserve existing knowledge Comply with the new coming examples Old Multi-Label Model Preserve existing knowledge New Multi-Label Model Comply with the new examples New coming examples Xian-Sheng Hua, Guo-Jun Qi. Online Multi-Label Active Annotation. ACM Multimedia Guo-Jun Qi, Xian-Sheng Hua, et al. Two-Dimensional Multi-Label Active Learning with An Efficient Online Adaptation Model for Image Classification. T-PAMI

39 Minimize KLD between old model and new one ˆ 1 1 ( y x) arg min KL( ( y x) ( y x)) 1 P P D P p P Comply with multi-label constraints s. t. y y,1 i m i 1 i i P P y y y y,1 i j m 1 i j P i j P ij y x y x 1,1 i m,1 l d i l P i l P il y P 1 ( y x) 1 C n n n i ij il i 2 i j 2 i, l 2 Xian-Sheng Hua, Guo-Jun Qi. Online Multi-Label Active Annotation. ACM Multimedia Guo-Jun Qi, Xian-Sheng Hua, et al. Two-Dimensional Multi-Label Active Learning with An Efficient Online Adaptation Model for Image Classification. T-PAMI

40 Online vs. Offline On multi-label scene dataset: 2407 images, 6 labels Performance is very close (F1 score differences are less than 0.001) TRECVID Data 40

41 Single-Label vs. Multi-Label Active Learning On TRECVID dataset: 2006 Dev set, shots, 39 concepts Initial labeling: 10000, each step sample-label pairs 41

42 Adding new labels On TRECVID dataset: 2006 Dev set, shots, 39 concepts Initial labeling: 10000, each step sample-label pairs Building Explosion-Fire 42

43 Introduction Internet Multimedia Search Machine Learning in Internet Multimedia Search and Mining Learning in internet multimedia indexing Learning in internet multimedia ranking Learning in internet multimedia mining Discussion Microsoft Confidential 43

44 Content-Aware Ranking Content-Aware Reranking Text based Ranking 44

45 X. Tian, L. Yang, J. Wang, Y. Yang, X. Wu, X.-S. Hua. Bayesian Video Search Reranking. ACM Multimedia

46 The difficulty of text based search Incorrect surrounding text Ambiguity of words 46

47 Reorder the ranking list from visual consistency Based on initial text-based search results To exploit the intrinsic visual pattern 47

48 Existing methods Pseudo relevance feedback [Yan,03][Liu, 08] Selecting training samples from initial result Training a ranking model Random walk [Hsu, 07][Liu, 07] Propagating the scores Problems: Key problems not well studied How to use the initial text-based search result How to mine the visual pattern 48

49 Objective To maximize the posterior probability of the ranking list Images/video shots in the rank list initial rank list. The prior The likelihood 49

50 Visually similar samples have close ranks min E( r) r 50

51 Point-Wise Distance Dist(r 1,r 0 ) = 0.63 Dist(r 3, r 0 ) =

52 Pair-Wise Distance Simple method: Count the number of inconsistent pairs Dist(r 1,r 0 ) = 10 Dist(r 3, r 0 ) = 0 52

53 Absolute Rank Difference The difference of two ranks Absolute-Difference Distance Reranking Objective function Solved using quadratic programming 53

54 Relative Rank Difference Relative-Difference Distance Reranking Objective function Solved by Matrix Analysis 54

55 Method TRECVID 2006 TRECVID 2007 MAP Gain MAP Gain Text Baseline Random Walk % % Ranking SVM PRF % % GRF [Zhu, 03] % % LGC [Zhou, 03] % % AD Reranking % % RD Reranking % % Dataset TRECVID 2006 & 2007 Text Baseline Visual Feature Measurement Okapi BM-25 5*5 ColorMoment Average Precision 55

56 Initial Reranked 56

57 Initial Reranked 57

58 Bo Geng, Linjun Yang, Xian-Sheng Hua. Learning to Rank with Graph Consistency. MSR Technical Report,

59 Assumptions Visual Consistency The ranks among visually similar images should be consistent Ranking Consistency The reranked results and the initial text-based one should not be too different Two-step solution: text-based search and then content-based reranking 59

60 Why Content-Aware Ranking Textual information is often noisy Reranking suffer from error propagation and depends severely on initial results. Solution Introduce the visual consistency into the ranking model Bo Geng, Linjun Yang, Xian-Sheng Hua. Learning to Rank with Graph Consistency. MSR Technical Report, 2009.

61 Solution Based on Structural SVM framework Cutting-plane algorithm to solve the optimization problem Results

62 Introduction Internet Multimedia Search Machine Learning in Internet Multimedia Search and Mining Learning in internet multimedia indexing Learning in internet multimedia ranking Learning in internet multimedia mining Discussion Microsoft Confidential 62

63 L. Wu, X.-S. Hua, N. Yu, W.-Y. Ma, S. Li. Flickr Distance. ACM Multimedia 2008 (Best Paper Candidate).

64 Multimedia Information Recommendation Annotation Clustering Indexing Ranking Retrieval 64

65 Indexing Annotation Recommendation Image Similarity/ Multimedia Distance Information Retrieval Concept Similarity/ Distance Ranking Clustering 65

66 Image Similarity/Distance Image Similarity/ Distance Concept Similarity/ Distance 66

67 Image Similarity/Distance Numerous efforts have been made. Concept Similarity/Distance Concept Similarity/ Distance 67

68 Image Similarity/Distance Numerous efforts have been made. Concept Similarity/Distance Olympic Cat Sports Paw Tiger More and more used, but not well studied. 68

69 WordNet Distance Google Distance Tag Concurrence Distance 69

70 WordNet 150,000 words WordNet Distance Quite a few methods to get it in WordNet Basic idea is to measure the length of the path between two words Pros and Cons Pros: Built by human experts, so close to Cons: human perception Coverage is limited and difficult to extend 70

71 Normalized Google Distance (NGD) Reflects the concurrency of two words in Web documents Defined as Pros and Cons Pros: Cons: Easy to get and huge coverage Only reflects concurrency in textual documents. Not really concept distance (semantic relationship) 71

72 Concept Pairs Google Distance Airplane Dog Football Soccer Horse Donkey Airplane Airport Car Wheel

73 Image Tag Concurrence Distance (Qi, Hua, et al. ACMMM07) Reflects the frequency of two tags occur in the same images Based on the same idea of NGD Mostly is sparse (> 95% are zero in the similarity matrix) Pros and Cons Pros: Cons: Images are taken into account a) Tags are sparse so visual concurrency is not well reflected b) Training data is difficult to get similarity matrix: 500 tags similarity matrix: 50 tags 73

74 Concept Pairs GoogleTag Distance Concurrence Distance Airplane Dog Football Soccer Horse Donkey Airplane Airport Car Wheel

75 table tennis pingpong horse donkey car wheel airplane airport Synonymy different words but the same meaning Visually Similar similar things or things of same type Meronymy part and the whole Concurrency exist at the same scene/place 75

76 Can we mine concept distance from image content? 76

77 Semantic concept distance is based on human s cognition To mine concept distance from a 80% of human cognition comes from visual information large tagged image collection There are around 4 billion photos on Flickr based on image content In average each Flickr image has around 8 tags bear, fur, grass, tree polar bear, water, sea polar bear, fighting, usa 77

78 Concept A: Airplane Concept B: Airport Concept Model A Concept Model B Flickr Distance (A, B) 78

79 FlickrDistance Flickr Distance is able to cover the four different semantic relationships Synonymy, Visually Similar, Meronymy, and Concurrency 79

80 R1: A Good Image Collection Large High coverage, especially on real-life world With tags Real-life photos reflect human s perception Based on collective intelligence from grassroots 80

81 R2: A Good Concept Representation or Model Based on image content Can cover wider concept relationships Can handle large-concept set Concept Models Discriminative SVM, Boosting, Generative Global Feature Local Feature w/o Spatial Relation w/ Spatial Relation Bag-of-Words (plsa, LDA), 2D HMM, MRF, 81

82 VLM Visual Language Model Spatial-relation sensitive Efficient Concept Models Can handle object variations Discriminative Generative Global Feature Local Feature SVM, Boosting, w/o Spatial Relation w/ Spatial Relation Bag-of-Words, 2D HMM, MRF, 82

83 I am talking about statistical language model. Unigram Model Bigram Model Trigram Model P P P wx w w 1 2 wn P wx w w w w p w w x 1 2 n x x 1 w w w w P w w w x 1 2 n x x 1 x 2 83

84 Visual Word Generation Patch Gradient Texture Histogram Hashing Visual Word Image Patch Unigram Model Bigram Model Trigram Model P w w w w P w xy mn w w w w P w w P xy mn xy x 1, y P w w w w p w w w xy mn xy x 1, y x, y 1 xy Trigram VLM is estimated by directly counting from sufficient samples of each category. To avoid the bias in the sampling, back-off smoothing method is adopted. 84

85 Why Latent-Topic Latent-Topic VLM Visual variations of concept are taken as latent topics 85

86 Comparison on Image Categorization Caltech 8 categories / 5097 images Accuracy (%) plsa (BOW) LDA (BOW) 2D MHMM SVM VLM LT-VLM Training Time (sec/image) plsa (BOW) LDA (BOW) 2D MHMM SVM VLM LT-VLM 86

87 Kullback Leibler (KL) divergence Good, but not symmetric topic distance Jensen Shannon (JS) divergence Better, as it is symmetric And, square root of JS divergence is a metric, so is Flickr Distance topic distance concept distance 87

88 Concept A: Airplane Tag search in Flickr Concept B: Airport Concept Model A LT-VLM Concept Model B Flickr Distance (A, B) Jensen-Shannon Divergence 88

89 Concept clustering Image annotation Tag recommendation 89

90 Concept Clustering 23 concepts; 3 groups (1) outer space, (2) animal and (3) sports Normalized Google Distance Tag Concurrence Distance Flickr Distance Group1 Group2 Group3 Group 1 Group2 Group3 Group1 Group2 Group3 bears horses moon space bowling dolphin donkey Saturn sharks snake softball spiders turtle Venus whale wolf baseball basketball football golf soccer tennis volleyball moon space Venus whale baseball donkey softball wolf basketball bears bowling dolphin football golf horses Saturn sharks soccer spiders tennis turtle volleyball moon Saturn space Venus bears dolphin donkey golf horses sharks spiders tennis whale wolf baseball basketball football snake soccer bowling softball volleyball 90

91 Total number of correct keywords Based on an approach using concept relation Dual Cross-Media Relevance Model (DCMRM, J. Liu et al. ACMMM 2007) On 79 concepts / 79,000 images 1200 NGD-DCMRM TCD-DCMRM FD-DCMRM NGD-DCMRM TCD-DCMRM FD-DCMRM The number of correctly annotated keywords at the first N words 91

92 To Improve Tagging Quality Eliminating tag incompletion, noises, and ambiguity 500 images / 10 recommended tags per image NGD Tag Concurrent Distance Flickr Distance 10 92

93 Why VLM divergence can estimate concept distance? Why FD works well even tags are not complete? Computer VLM: distribution of uni/bi/trigrams room patterns computer patterns other patterns TV room patterns TV patterns other patterns Office room patterns screen patterns other patterns 93

94 If we find similar patterns in the images associated with different concepts, the corresponding concept relationships can be discovered. Computer Office 94

95 95

96 Intention/Semantic Gap User Query Index Data Intention Gap Semantic Gap 96

97 Data Data+User+ Model Large-Scale Users Model User 97

98 Basic Assumptions of all automatic and semi-automatic approaches (model driven / data driven approaches) Data Tag Images have correlations Tags have correlations Tags have correlation with image content Semantics

99 Data Tag Data Tag Semantics Semantics For Model-able Tags For Un-model-able Tags 99

100 This slide is borrowed from Lei Zhang (MSR Asia) 100

101 This slide is borrowed from Lei Zhang (MSR Asia)

102 Data Tag Data Tag Semantics Semantics For Model-able Tags For Un-model-able Tags 102

103 We need Large amounts of tagged data Large amounts of users to do labeling Can be regarded as a combination of Data Tag Manual labeling Model based annotation Data driven approach Semantics 103

104 Research Challenges: Large-scale content-based indexing Large-scale active learning Large-scale online learning Data Tag Training data acquisition/analysis o Labeling psychology and incentive o Labeling quality estimation/evaluation o Label quality improvement/filtering o Labeling Interface o Anti-spam/cheating in labeling o Semantics 104

105 Machine learning has been one of the most effective approaches to enable content-aware multimedia search, including indexing, ranking, query interface and search result navigation Still difficult to bridge the semantic gap and intention gap through machine learning Combining data, user and model may be one possible way out 105

106 106

Columbia University High-Level Feature Detection: Parts-based Concept Detectors

Columbia University High-Level Feature Detection: Parts-based Concept Detectors TRECVID 2005 Workshop Columbia University High-Level Feature Detection: Parts-based Concept Detectors Dong-Qing Zhang, Shih-Fu Chang, Winston Hsu, Lexin Xie, Eric Zavesky Digital Video and Multimedia Lab

More information

Supervised Reranking for Web Image Search

Supervised Reranking for Web Image Search for Web Image Search Query: Red Wine Current Web Image Search Ranking Ranking Features http://www.telegraph.co.uk/306737/red-wineagainst-radiation.html 2 qd, 2.5.5 0.5 0 Linjun Yang and Alan Hanjalic 2

More information

ACM MM Dong Liu, Shuicheng Yan, Yong Rui and Hong-Jiang Zhang

ACM MM Dong Liu, Shuicheng Yan, Yong Rui and Hong-Jiang Zhang ACM MM 2010 Dong Liu, Shuicheng Yan, Yong Rui and Hong-Jiang Zhang Harbin Institute of Technology National University of Singapore Microsoft Corporation Proliferation of images and videos on the Internet

More information

Visual Query Suggestion

Visual Query Suggestion Visual Query Suggestion Zheng-Jun Zha, Linjun Yang, Tao Mei, Meng Wang, Zengfu Wang University of Science and Technology of China Textual Visual Query Suggestion Microsoft Research Asia Motivation Framework

More information

AN ENHANCED ATTRIBUTE RERANKING DESIGN FOR WEB IMAGE SEARCH

AN ENHANCED ATTRIBUTE RERANKING DESIGN FOR WEB IMAGE SEARCH AN ENHANCED ATTRIBUTE RERANKING DESIGN FOR WEB IMAGE SEARCH Sai Tejaswi Dasari #1 and G K Kishore Babu *2 # Student,Cse, CIET, Lam,Guntur, India * Assistant Professort,Cse, CIET, Lam,Guntur, India Abstract-

More information

Graph-based Techniques for Searching Large-Scale Noisy Multimedia Data

Graph-based Techniques for Searching Large-Scale Noisy Multimedia Data Graph-based Techniques for Searching Large-Scale Noisy Multimedia Data Shih-Fu Chang Department of Electrical Engineering Department of Computer Science Columbia University Joint work with Jun Wang (IBM),

More information

Joint Inference in Image Databases via Dense Correspondence. Michael Rubinstein MIT CSAIL (while interning at Microsoft Research)

Joint Inference in Image Databases via Dense Correspondence. Michael Rubinstein MIT CSAIL (while interning at Microsoft Research) Joint Inference in Image Databases via Dense Correspondence Michael Rubinstein MIT CSAIL (while interning at Microsoft Research) My work Throughout the year (and my PhD thesis): Temporal Video Analysis

More information

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, 2013 ISSN:

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, 2013 ISSN: Semi Automatic Annotation Exploitation Similarity of Pics in i Personal Photo Albums P. Subashree Kasi Thangam 1 and R. Rosy Angel 2 1 Assistant Professor, Department of Computer Science Engineering College,

More information

ImageCLEF 2011

ImageCLEF 2011 SZTAKI @ ImageCLEF 2011 Bálint Daróczy joint work with András Benczúr, Róbert Pethes Data Mining and Web Search Group Computer and Automation Research Institute Hungarian Academy of Sciences Training/test

More information

ImgSeek: Capturing User s Intent For Internet Image Search

ImgSeek: Capturing User s Intent For Internet Image Search ImgSeek: Capturing User s Intent For Internet Image Search Abstract - Internet image search engines (e.g. Bing Image Search) frequently lean on adjacent text features. It is difficult for them to illustrate

More information

The Comparative Study of Machine Learning Algorithms in Text Data Classification*

The Comparative Study of Machine Learning Algorithms in Text Data Classification* The Comparative Study of Machine Learning Algorithms in Text Data Classification* Wang Xin School of Science, Beijing Information Science and Technology University Beijing, China Abstract Classification

More information

The Stanford/Technicolor/Fraunhofer HHI Video Semantic Indexing System

The Stanford/Technicolor/Fraunhofer HHI Video Semantic Indexing System The Stanford/Technicolor/Fraunhofer HHI Video Semantic Indexing System Our first participation on the TRECVID workshop A. F. de Araujo 1, F. Silveira 2, H. Lakshman 3, J. Zepeda 2, A. Sheth 2, P. Perez

More information

Part-based and local feature models for generic object recognition

Part-based and local feature models for generic object recognition Part-based and local feature models for generic object recognition May 28 th, 2015 Yong Jae Lee UC Davis Announcements PS2 grades up on SmartSite PS2 stats: Mean: 80.15 Standard Dev: 22.77 Vote on piazza

More information

Analysis: TextonBoost and Semantic Texton Forests. Daniel Munoz Februrary 9, 2009

Analysis: TextonBoost and Semantic Texton Forests. Daniel Munoz Februrary 9, 2009 Analysis: TextonBoost and Semantic Texton Forests Daniel Munoz 16-721 Februrary 9, 2009 Papers [shotton-eccv-06] J. Shotton, J. Winn, C. Rother, A. Criminisi, TextonBoost: Joint Appearance, Shape and Context

More information

over Multi Label Images

over Multi Label Images IBM Research Compact Hashing for Mixed Image Keyword Query over Multi Label Images Xianglong Liu 1, Yadong Mu 2, Bo Lang 1 and Shih Fu Chang 2 1 Beihang University, Beijing, China 2 Columbia University,

More information

Leveraging flickr images for object detection

Leveraging flickr images for object detection Leveraging flickr images for object detection Elisavet Chatzilari Spiros Nikolopoulos Yiannis Kompatsiaris Outline Introduction to object detection Our proposal Experiments Current research 2 Introduction

More information

Visual Search: 3 Levels of Real-Time Feedback

Visual Search: 3 Levels of Real-Time Feedback Visual Search: 3 Levels of Real-Time Feedback Prof. Shih-Fu Chang Department of Electrical Engineering Digital Video and Multimedia Lab http://www.ee.columbia.edu/dvmm A lot of work on Image Classification

More information

Class 5: Attributes and Semantic Features

Class 5: Attributes and Semantic Features Class 5: Attributes and Semantic Features Rogerio Feris, Feb 21, 2013 EECS 6890 Topics in Information Processing Spring 2013, Columbia University http://rogerioferis.com/visualrecognitionandsearch Project

More information

Topic Diversity Method for Image Re-Ranking

Topic Diversity Method for Image Re-Ranking Topic Diversity Method for Image Re-Ranking D.Ashwini 1, P.Jerlin Jeba 2, D.Vanitha 3 M.E, P.Veeralakshmi M.E., Ph.D 4 1,2 Student, 3 Assistant Professor, 4 Associate Professor 1,2,3,4 Department of Information

More information

Class 9 Action Recognition

Class 9 Action Recognition Class 9 Action Recognition Liangliang Cao, April 4, 2013 EECS 6890 Topics in Information Processing Spring 2013, Columbia University http://rogerioferis.com/visualrecognitionandsearch Visual Recognition

More information

Tag Based Image Search by Social Re-ranking

Tag Based Image Search by Social Re-ranking Tag Based Image Search by Social Re-ranking Vilas Dilip Mane, Prof.Nilesh P. Sable Student, Department of Computer Engineering, Imperial College of Engineering & Research, Wagholi, Pune, Savitribai Phule

More information

Re-Ranking of Web Image Search Using Relevance Preserving Ranking Techniques

Re-Ranking of Web Image Search Using Relevance Preserving Ranking Techniques Re-Ranking of Web Image Search Using Relevance Preserving Ranking Techniques Delvia Mary Vadakkan #1, Dr.D.Loganathan *2 # Final year M. Tech CSE, MET S School of Engineering, Mala, Trissur, Kerala * HOD,

More information

Easy Samples First: Self-paced Reranking for Zero-Example Multimedia Search

Easy Samples First: Self-paced Reranking for Zero-Example Multimedia Search Easy Samples First: Self-paced Reranking for Zero-Example Multimedia Search Lu Jiang 1, Deyu Meng 2, Teruko Mitamura 1, Alexander G. Hauptmann 1 1 School of Computer Science, Carnegie Mellon University

More information

A Novel Categorized Search Strategy using Distributional Clustering Neenu Joseph. M 1, Sudheep Elayidom 2

A Novel Categorized Search Strategy using Distributional Clustering Neenu Joseph. M 1, Sudheep Elayidom 2 A Novel Categorized Search Strategy using Distributional Clustering Neenu Joseph. M 1, Sudheep Elayidom 2 1 Student, M.E., (Computer science and Engineering) in M.G University, India, 2 Associate Professor

More information

Multimedia Data Management M

Multimedia Data Management M ALMA MATER STUDIORUM - UNIVERSITÀ DI BOLOGNA Multimedia Data Management M Second cycle degree programme (LM) in Computer Engineering University of Bologna Semantic Multimedia Data Annotation Home page:

More information

Volume 2, Issue 6, June 2014 International Journal of Advance Research in Computer Science and Management Studies

Volume 2, Issue 6, June 2014 International Journal of Advance Research in Computer Science and Management Studies Volume 2, Issue 6, June 2014 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com Internet

More information

Scene Modeling Using Co-Clustering

Scene Modeling Using Co-Clustering Scene Modeling Using Co-Clustering Jingen Liu Computer Vision Lab University of Central Florida liujg@cs.ucf.edu Mubarak Shah Computer Vision Lab University of Central Florida shah@cs.ucf.edu Abstract

More information

Multimedia Information Retrieval The case of video

Multimedia Information Retrieval The case of video Multimedia Information Retrieval The case of video Outline Overview Problems Solutions Trends and Directions Multimedia Information Retrieval Motivation With the explosive growth of digital media data,

More information

A Bayesian Approach to Hybrid Image Retrieval

A Bayesian Approach to Hybrid Image Retrieval A Bayesian Approach to Hybrid Image Retrieval Pradhee Tandon and C. V. Jawahar Center for Visual Information Technology International Institute of Information Technology Hyderabad - 500032, INDIA {pradhee@research.,jawahar@}iiit.ac.in

More information

Multimedia Search Reranking: A Literature Survey

Multimedia Search Reranking: A Literature Survey 1 Multimedia Search Reranking: A Literature Survey Tao Mei, Microsoft Research Asia Yong Rui, Microsoft Research Asia Shipeng Li, Microsoft Research Asia Qi Tian, University of Texas at San Antonio The

More information

70 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 6, NO. 1, FEBRUARY ClassView: Hierarchical Video Shot Classification, Indexing, and Accessing

70 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 6, NO. 1, FEBRUARY ClassView: Hierarchical Video Shot Classification, Indexing, and Accessing 70 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 6, NO. 1, FEBRUARY 2004 ClassView: Hierarchical Video Shot Classification, Indexing, and Accessing Jianping Fan, Ahmed K. Elmagarmid, Senior Member, IEEE, Xingquan

More information

VISUAL RERANKING USING MULTIPLE SEARCH ENGINES

VISUAL RERANKING USING MULTIPLE SEARCH ENGINES VISUAL RERANKING USING MULTIPLE SEARCH ENGINES By Dennis Lim Thye Loon A REPORT SUBMITTED TO Universiti Tunku Abdul Rahman in partial fulfillment of the requirements for the degree of Faculty of Information

More information

Columbia University TRECVID-2005 Video Search and High-Level Feature Extraction

Columbia University TRECVID-2005 Video Search and High-Level Feature Extraction Columbia University TRECVID-2005 Video Search and High-Level Feature Extraction Shih-Fu Chang, Winston Hsu, Lyndon Kennedy, Lexing Xie, Akira Yanagawa, Eric Zavesky, Dong-Qing Zhang Digital Video and Multimedia

More information

Unsupervised discovery of category and object models. The task

Unsupervised discovery of category and object models. The task Unsupervised discovery of category and object models Martial Hebert The task 1 Common ingredients 1. Generate candidate segments 2. Estimate similarity between candidate segments 3. Prune resulting (implicit)

More information

Part based models for recognition. Kristen Grauman

Part based models for recognition. Kristen Grauman Part based models for recognition Kristen Grauman UT Austin Limitations of window-based models Not all objects are box-shaped Assuming specific 2d view of object Local components themselves do not necessarily

More information

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011 Previously Part-based and local feature models for generic object recognition Wed, April 20 UT-Austin Discriminative classifiers Boosting Nearest neighbors Support vector machines Useful for object recognition

More information

Automatic image annotation refinement using fuzzy inference algorithms

Automatic image annotation refinement using fuzzy inference algorithms 16th World Congress of the International Fuzzy Systems Association (IFSA) 9th Conference of the European Society for Fuzzy Logic and Technology (EUSFLAT) Automatic image annotation refinement using fuzzy

More information

Probabilistic Graphical Models Part III: Example Applications

Probabilistic Graphical Models Part III: Example Applications Probabilistic Graphical Models Part III: Example Applications Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2014 CS 551, Fall 2014 c 2014, Selim

More information

Video annotation based on adaptive annular spatial partition scheme

Video annotation based on adaptive annular spatial partition scheme Video annotation based on adaptive annular spatial partition scheme Guiguang Ding a), Lu Zhang, and Xiaoxu Li Key Laboratory for Information System Security, Ministry of Education, Tsinghua National Laboratory

More information

Object Class Recognition using Images of Abstract Regions

Object Class Recognition using Images of Abstract Regions Object Class Recognition using Images of Abstract Regions Yi Li, Jeff A. Bilmes, and Linda G. Shapiro Department of Computer Science and Engineering Department of Electrical Engineering University of Washington

More information

Segmentation. Bottom up Segmentation Semantic Segmentation

Segmentation. Bottom up Segmentation Semantic Segmentation Segmentation Bottom up Segmentation Semantic Segmentation Semantic Labeling of Street Scenes Ground Truth Labels 11 classes, almost all occur simultaneously, large changes in viewpoint, scale sky, road,

More information

Image classification Computer Vision Spring 2018, Lecture 18

Image classification Computer Vision Spring 2018, Lecture 18 Image classification http://www.cs.cmu.edu/~16385/ 16-385 Computer Vision Spring 2018, Lecture 18 Course announcements Homework 5 has been posted and is due on April 6 th. - Dropbox link because course

More information

AUTOMATIC VISUAL CONCEPT DETECTION IN VIDEOS

AUTOMATIC VISUAL CONCEPT DETECTION IN VIDEOS AUTOMATIC VISUAL CONCEPT DETECTION IN VIDEOS Nilam B. Lonkar 1, Dinesh B. Hanchate 2 Student of Computer Engineering, Pune University VPKBIET, Baramati, India Computer Engineering, Pune University VPKBIET,

More information

TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Annotation

TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Annotation TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Annotation Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek, Cordelia Schmid LEAR team, INRIA Rhône-Alpes, Grenoble, France

More information

Label Distribution Learning. Wei Han

Label Distribution Learning. Wei Han Label Distribution Learning Wei Han, Big Data Research Center, UESTC Email:wei.hb.han@gmail.com Outline 1. Why label distribution learning? 2. What is label distribution learning? 2.1. Problem Formulation

More information

An efficient face recognition algorithm based on multi-kernel regularization learning

An efficient face recognition algorithm based on multi-kernel regularization learning Acta Technica 61, No. 4A/2016, 75 84 c 2017 Institute of Thermomechanics CAS, v.v.i. An efficient face recognition algorithm based on multi-kernel regularization learning Bi Rongrong 1 Abstract. A novel

More information

Jan Pedersen 22 July 2010

Jan Pedersen 22 July 2010 Jan Pedersen 22 July 2010 Outline Problem Statement Best effort retrieval vs automated reformulation Query Evaluation Architecture Query Understanding Models Data Sources Standard IR Assumptions Queries

More information

Beyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Beyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba Adding spatial information Forming vocabularies from pairs of nearby features doublets

More information

NUS-WIDE: A Real-World Web Image Database from National University of Singapore

NUS-WIDE: A Real-World Web Image Database from National University of Singapore NUS-WIDE: A Real-World Web Image Database from National University of Singapore Tat-Seng Chua, Jinhui Tang, Richang Hong, Haojie Li, Zhiping Luo, Yantao Zheng National University of Singapore Computing

More information

An Introduction to Content Based Image Retrieval

An Introduction to Content Based Image Retrieval CHAPTER -1 An Introduction to Content Based Image Retrieval 1.1 Introduction With the advancement in internet and multimedia technologies, a huge amount of multimedia data in the form of audio, video and

More information

Keywords semantic, re-ranking, query, search engine, framework.

Keywords semantic, re-ranking, query, search engine, framework. Volume 5, Issue 3, March 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Image Re-Ranking

More information

AS STORAGE and bandwidth capacities increase, digital

AS STORAGE and bandwidth capacities increase, digital 974 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 13, NO. 7, JULY 2004 Concept-Oriented Indexing of Video Databases: Toward Semantic Sensitive Retrieval and Browsing Jianping Fan, Hangzai Luo, and Ahmed

More information

Fuzzy Knowledge-based Image Annotation Refinement

Fuzzy Knowledge-based Image Annotation Refinement 284 Int'l Conf. IP, Comp. Vision, and Pattern Recognition IPCV'15 Fuzzy Knowledge-based Image Annotation Refinement M. Ivašić-Kos 1, M. Pobar 1 and S. Ribarić 2 1 Department of Informatics, University

More information

PRISM: Concept-preserving Social Image Search Results Summarization

PRISM: Concept-preserving Social Image Search Results Summarization PRISM: Concept-preserving Social Image Search Results Summarization Boon-Siew Seah Sourav S Bhowmick Aixin Sun Nanyang Technological University Singapore Outline 1 Introduction 2 Related studies 3 Search

More information

Beyond Mere Pixels: How Can Computers Interpret and Compare Digital Images? Nicholas R. Howe Cornell University

Beyond Mere Pixels: How Can Computers Interpret and Compare Digital Images? Nicholas R. Howe Cornell University Beyond Mere Pixels: How Can Computers Interpret and Compare Digital Images? Nicholas R. Howe Cornell University Why Image Retrieval? World Wide Web: Millions of hosts Billions of images Growth of video

More information

THE POPULARITY of the internet has caused an exponential

THE POPULARITY of the internet has caused an exponential IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 4, APRIL 2011 381 Contextual Bag-of-Words for Visual Categorization Teng Li, Tao Mei, In-So Kweon, Member, IEEE, and Xian-Sheng

More information

Web Image Re-Ranking UsingQuery-Specific Semantic Signatures

Web Image Re-Ranking UsingQuery-Specific Semantic Signatures IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume:36, Issue: 4 ) April 2014, DOI:10.1109/TPAMI.201 Web Image Re-Ranking UsingQuery-Specific Semantic Signatures Xiaogang Wang 1 1 Department

More information

Discriminative classifiers for image recognition

Discriminative classifiers for image recognition Discriminative classifiers for image recognition May 26 th, 2015 Yong Jae Lee UC Davis Outline Last time: window-based generic object detection basic pipeline face detection with boosting as case study

More information

Latest development in image feature representation and extraction

Latest development in image feature representation and extraction International Journal of Advanced Research and Development ISSN: 2455-4030, Impact Factor: RJIF 5.24 www.advancedjournal.com Volume 2; Issue 1; January 2017; Page No. 05-09 Latest development in image

More information

Query-Specific Visual Semantic Spaces for Web Image Re-ranking

Query-Specific Visual Semantic Spaces for Web Image Re-ranking Query-Specific Visual Semantic Spaces for Web Image Re-ranking Xiaogang Wang 1 1 Department of Electronic Engineering The Chinese University of Hong Kong xgwang@ee.cuhk.edu.hk Ke Liu 2 2 Department of

More information

Syllabus. 1. Visual classification Intro 2. SVM 3. Datasets and evaluation 4. Shallow / Deep architectures

Syllabus. 1. Visual classification Intro 2. SVM 3. Datasets and evaluation 4. Shallow / Deep architectures Syllabus 1. Visual classification Intro 2. SVM 3. Datasets and evaluation 4. Shallow / Deep architectures Image classification How to define a category? Bicycle Paintings with women Portraits Concepts,

More information

24 hours of Photo Sharing. installation by Erik Kessels

24 hours of Photo Sharing. installation by Erik Kessels 24 hours of Photo Sharing installation by Erik Kessels And sometimes Internet photos have useful labels Im2gps. Hays and Efros. CVPR 2008 But what if we want more? Image Categorization Training Images

More information

Data Mining Chapter 3: Visualizing and Exploring Data Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Data Mining Chapter 3: Visualizing and Exploring Data Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Data Mining Chapter 3: Visualizing and Exploring Data Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Exploratory data analysis tasks Examine the data, in search of structures

More information

Beyond Bags of features Spatial information & Shape models

Beyond Bags of features Spatial information & Shape models Beyond Bags of features Spatial information & Shape models Jana Kosecka Many slides adapted from S. Lazebnik, FeiFei Li, Rob Fergus, and Antonio Torralba Detection, recognition (so far )! Bags of features

More information

Slides for Data Mining by I. H. Witten and E. Frank

Slides for Data Mining by I. H. Witten and E. Frank Slides for Data Mining by I. H. Witten and E. Frank 7 Engineering the input and output Attribute selection Scheme-independent, scheme-specific Attribute discretization Unsupervised, supervised, error-

More information

An Efficient Methodology for Image Rich Information Retrieval

An Efficient Methodology for Image Rich Information Retrieval An Efficient Methodology for Image Rich Information Retrieval 56 Ashwini Jaid, 2 Komal Savant, 3 Sonali Varma, 4 Pushpa Jat, 5 Prof. Sushama Shinde,2,3,4 Computer Department, Siddhant College of Engineering,

More information

ANovel Approach to Collect Training Images from WWW for Image Thesaurus Building

ANovel Approach to Collect Training Images from WWW for Image Thesaurus Building ANovel Approach to Collect Training Images from WWW for Image Thesaurus Building Joohyoun Park and Jongho Nang Dept. of Computer Science and Engineering Sogang University Seoul, Korea {parkjh, jhnang}@sogang.ac.kr

More information

Content-Based Multimedia Information Retrieval

Content-Based Multimedia Information Retrieval Content-Based Multimedia Information Retrieval Ishwar K. Sethi Intelligent Information Engineering Laboratory Oakland University Rochester, MI 48309 Email: isethi@oakland.edu URL: www.cse.secs.oakland.edu/isethi

More information

Harvesting collective Images for Bi-Concept exploration

Harvesting collective Images for Bi-Concept exploration Harvesting collective Images for Bi-Concept exploration B.Nithya priya K.P.kaliyamurthie Abstract--- Noised positive as well as instructive pessimistic research examples commencing the communal web, to

More information

278 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 12, NO. 4, JUNE 2010

278 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 12, NO. 4, JUNE 2010 278 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 12, NO. 4, JUNE 2010 Image Classification With Kernelized Spatial-Context Guo-Jun Qi, Xian-Sheng Hua, Member, IEEE, Yong Rui, Fellow, IEEE, Jinhui Tang, Member,

More information

EVENT DETECTION AND HUMAN BEHAVIOR RECOGNITION. Ing. Lorenzo Seidenari

EVENT DETECTION AND HUMAN BEHAVIOR RECOGNITION. Ing. Lorenzo Seidenari EVENT DETECTION AND HUMAN BEHAVIOR RECOGNITION Ing. Lorenzo Seidenari e-mail: seidenari@dsi.unifi.it What is an Event? Dictionary.com definition: something that occurs in a certain place during a particular

More information

AUTOMATIC VIDEO ANNOTATION BY CLASSIFICATION

AUTOMATIC VIDEO ANNOTATION BY CLASSIFICATION AUTOMATIC VIDEO ANNOTATION BY CLASSIFICATION MANJARY P. GANGAN, Dr. R. KARTHI Abstract Automatic video annotation is a technique to provide semantic video retrieval. The proposed method of automatic video

More information

Markov Networks in Computer Vision. Sargur Srihari

Markov Networks in Computer Vision. Sargur Srihari Markov Networks in Computer Vision Sargur srihari@cedar.buffalo.edu 1 Markov Networks for Computer Vision Important application area for MNs 1. Image segmentation 2. Removal of blur/noise 3. Stereo reconstruction

More information

Experimenting VIREO-374: Bag-of-Visual-Words and Visual-Based Ontology for Semantic Video Indexing and Search

Experimenting VIREO-374: Bag-of-Visual-Words and Visual-Based Ontology for Semantic Video Indexing and Search Experimenting VIREO-374: Bag-of-Visual-Words and Visual-Based Ontology for Semantic Video Indexing and Search Chong-Wah Ngo, Yu-Gang Jiang, Xiaoyong Wei Feng Wang, Wanlei Zhao, Hung-Khoon Tan and Xiao

More information

Additional Remarks on Designing Category-Level Attributes for Discriminative Visual Recognition

Additional Remarks on Designing Category-Level Attributes for Discriminative Visual Recognition Columbia University Computer Science Department Technical Report # CUCS 007-13 (2013) Additional Remarks on Designing Category-Level Attributes for Discriminative Visual Recognition Felix X. Yu, Liangliang

More information

Region-based Segmentation and Object Detection

Region-based Segmentation and Object Detection Region-based Segmentation and Object Detection Stephen Gould Tianshi Gao Daphne Koller Presented at NIPS 2009 Discussion and Slides by Eric Wang April 23, 2010 Outline Introduction Model Overview Model

More information

NATURAL LANGUAGE PROCESSING

NATURAL LANGUAGE PROCESSING NATURAL LANGUAGE PROCESSING LESSON 9 : SEMANTIC SIMILARITY OUTLINE Semantic Relations Semantic Similarity Levels Sense Level Word Level Text Level WordNet-based Similarity Methods Hybrid Methods Similarity

More information

Latent Variable Models for Structured Prediction and Content-Based Retrieval

Latent Variable Models for Structured Prediction and Content-Based Retrieval Latent Variable Models for Structured Prediction and Content-Based Retrieval Ariadna Quattoni Universitat Politècnica de Catalunya Joint work with Borja Balle, Xavier Carreras, Adrià Recasens, Antonio

More information

CS 231A Computer Vision (Fall 2011) Problem Set 4

CS 231A Computer Vision (Fall 2011) Problem Set 4 CS 231A Computer Vision (Fall 2011) Problem Set 4 Due: Nov. 30 th, 2011 (9:30am) 1 Part-based models for Object Recognition (50 points) One approach to object recognition is to use a deformable part-based

More information

MSRA-MM 2.0: A Large-Scale Web Multimedia Dataset

MSRA-MM 2.0: A Large-Scale Web Multimedia Dataset 2009 IEEE International Conference on Data Mining Workshops MSRA-MM 2.0: A Large-Scale Web Multimedia Dataset Hao Li Institute of Computing Technology Chinese Academy of Sciences Beijing, 100190, China

More information

Structured Learning. Jun Zhu

Structured Learning. Jun Zhu Structured Learning Jun Zhu Supervised learning Given a set of I.I.D. training samples Learn a prediction function b r a c e Supervised learning (cont d) Many different choices Logistic Regression Maximum

More information

Large-scale visual recognition The bag-of-words representation

Large-scale visual recognition The bag-of-words representation Large-scale visual recognition The bag-of-words representation Florent Perronnin, XRCE Hervé Jégou, INRIA CVPR tutorial June 16, 2012 Outline Bag-of-words Large or small vocabularies? Extensions for instance-level

More information

Previous Lecture - Coded aperture photography

Previous Lecture - Coded aperture photography Previous Lecture - Coded aperture photography Depth from a single image based on the amount of blur Estimate the amount of blur using and recover a sharp image by deconvolution with a sparse gradient prior.

More information

Unsupervised Auxiliary Visual Words Discovery for Large-Scale Image Object Retrieval

Unsupervised Auxiliary Visual Words Discovery for Large-Scale Image Object Retrieval Unsupervised Auxiliary Visual Words Discovery for Large-Scale Image Object Retrieval Yin-Hsi Kuo 1,2, Hsuan-Tien Lin 1, Wen-Huang Cheng 2, Yi-Hsuan Yang 1, and Winston H. Hsu 1 1 National Taiwan University

More information

Attributes. Computer Vision. James Hays. Many slides from Derek Hoiem

Attributes. Computer Vision. James Hays. Many slides from Derek Hoiem Many slides from Derek Hoiem Attributes Computer Vision James Hays Recap: Human Computation Active Learning: Let the classifier tell you where more annotation is needed. Human-in-the-loop recognition:

More information

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer Bing Liu Web Data Mining Exploring Hyperlinks, Contents, and Usage Data With 177 Figures Springer Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web

More information

AS DIGITAL cameras become more affordable and widespread,

AS DIGITAL cameras become more affordable and widespread, IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 17, NO. 3, MARCH 2008 407 Integrating Concept Ontology and Multitask Learning to Achieve More Effective Classifier Training for Multilevel Image Annotation Jianping

More information

Metric learning approaches! for image annotation! and face recognition!

Metric learning approaches! for image annotation! and face recognition! Metric learning approaches! for image annotation! and face recognition! Jakob Verbeek" LEAR Team, INRIA Grenoble, France! Joint work with :"!Matthieu Guillaumin"!!Thomas Mensink"!!!Cordelia Schmid! 1 2

More information

Markov Networks in Computer Vision

Markov Networks in Computer Vision Markov Networks in Computer Vision Sargur Srihari srihari@cedar.buffalo.edu 1 Markov Networks for Computer Vision Some applications: 1. Image segmentation 2. Removal of blur/noise 3. Stereo reconstruction

More information

Supervised Models for Multimodal Image Retrieval based on Visual, Semantic and Geographic Information

Supervised Models for Multimodal Image Retrieval based on Visual, Semantic and Geographic Information Supervised Models for Multimodal Image Retrieval based on Visual, Semantic and Geographic Information Duc-Tien Dang-Nguyen, Giulia Boato, Alessandro Moschitti, Francesco G.B. De Natale Department of Information

More information

Large-Scale Semantics for Image and Video Retrieval

Large-Scale Semantics for Image and Video Retrieval Large-Scale Semantics for Image and Video Retrieval Lexing Xie with Apostol Natsev, John Smith, Matthew Hill, John Kender, Quoc-Bao Nyugen, *Jelena Tesic, *Rong Yan, *Alex Haubold IBM T J Watson Research

More information

CS 1674: Intro to Computer Vision. Attributes. Prof. Adriana Kovashka University of Pittsburgh November 2, 2016

CS 1674: Intro to Computer Vision. Attributes. Prof. Adriana Kovashka University of Pittsburgh November 2, 2016 CS 1674: Intro to Computer Vision Attributes Prof. Adriana Kovashka University of Pittsburgh November 2, 2016 Plan for today What are attributes and why are they useful? (paper 1) Attributes for zero-shot

More information

Structured Models in. Dan Huttenlocher. June 2010

Structured Models in. Dan Huttenlocher. June 2010 Structured Models in Computer Vision i Dan Huttenlocher June 2010 Structured Models Problems where output variables are mutually dependent or constrained E.g., spatial or temporal relations Such dependencies

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Overview of Part Two Probabilistic Graphical Models Part Two: Inference and Learning Christopher M. Bishop Exact inference and the junction tree MCMC Variational methods and EM Example General variational

More information

Urban Scene Segmentation, Recognition and Remodeling. Part III. Jinglu Wang 11/24/2016 ACCV 2016 TUTORIAL

Urban Scene Segmentation, Recognition and Remodeling. Part III. Jinglu Wang 11/24/2016 ACCV 2016 TUTORIAL Part III Jinglu Wang Urban Scene Segmentation, Recognition and Remodeling 102 Outline Introduction Related work Approaches Conclusion and future work o o - - ) 11/7/16 103 Introduction Motivation Motivation

More information

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How

More information

A Taxonomy of Semi-Supervised Learning Algorithms

A Taxonomy of Semi-Supervised Learning Algorithms A Taxonomy of Semi-Supervised Learning Algorithms Olivier Chapelle Max Planck Institute for Biological Cybernetics December 2005 Outline 1 Introduction 2 Generative models 3 Low density separation 4 Graph

More information

Development in Object Detection. Junyuan Lin May 4th

Development in Object Detection. Junyuan Lin May 4th Development in Object Detection Junyuan Lin May 4th Line of Research [1] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection, CVPR 2005. HOG Feature template [2] P. Felzenszwalb,

More information

Automatic Categorization of Image Regions using Dominant Color based Vector Quantization

Automatic Categorization of Image Regions using Dominant Color based Vector Quantization Automatic Categorization of Image Regions using Dominant Color based Vector Quantization Md Monirul Islam, Dengsheng Zhang, Guojun Lu Gippsland School of Information Technology, Monash University Churchill

More information

CSCI 599: Applications of Natural Language Processing Information Retrieval Retrieval Models (Part 3)"

CSCI 599: Applications of Natural Language Processing Information Retrieval Retrieval Models (Part 3) CSCI 599: Applications of Natural Language Processing Information Retrieval Retrieval Models (Part 3)" All slides Addison Wesley, Donald Metzler, and Anton Leuski, 2008, 2012! Language Model" Unigram language

More information