Learning Visual Semantics: Models, Massive Computation, and Innovative Applications Tutorial at CVPR 2014 June 23rd, 1:00pm-5:00pm, Columbus, OH
Introduction Instructors: Shih-Fu Chang John Smith Rogerio Feris Liangliang Cao Columbia University IBM T. J. Watson Research Center
Introduction 1970s Early Days of Computer Vision
Introduction First Digital Camera (1975) 0.01 Megapixels 23 seconds to record a photo to cassette
Introduction Datasets with 5 or 10 images Large-Scale Experiment: 800 photos (Takeo Kanade Thesis, 1973) [D. Marr, 1976]
Introduction Today Visual Data is Exploding!
Introduction Announcement of Pope Benedict in 2005
Introduction Announcement of Pope Francis in 2013 Rapid proliferation of mobile devices equipped with cameras
Introduction Era of Big Visual Data Billions of cell phones equipped with cameras ~500 billion consumer photos are taken each year world-wide ~500 million photos taken per year in NYC alone Hundreds of millions of Facebook photo uploads per day
Introduction
Introduction Exciting Time for Computer Vision + DATA + Computational Processing + Advances in Computer Vision and Machine Learning Major opportunities for systems that automatically extract visual semantics from images and videos
Examples of Practical Application Areas
Examples of Application Areas Smart Surveillance Show me all images of people matching the suspect description from time X to time Y from all cameras in area Z. Visual Semantics: Fine-grained person attributes Slide credit: Rogerio Feris
Examples of Application Areas Medical Imaging MRI Brain Axial MRI Knee PET Color DX Appendage Visual Semantics: Medical Image Modality and Anatomy Slide credit: John Smith DX Torso DX Cervical Spine
Examples of Application Areas Astronomy [Cui et al, WACV 2015] http://www.galaxyzoo.org/ Visual Semantics: morphological galaxy attributes (important to understand star formation, gas fraction, galaxy evolution, ) Huge dataset of galaxy images makes manual labeling infeasible Slide credit: Rogerio Feris
Examples of Application Areas Nature / Ecology Snapshot Serengeti http://www.youtube.com/watch?v=aul03ivs8by http://www.snapshotserengeti.org/ Visual Semantics: species of animals from camera traps Understanding how competing species coexist is a fundamental theme in ecology, with important implications for biodiversity, and the sustainability of life on Earth Slide credit: Rogerio Feris
Examples of Application Areas Nature / Ecology Plant Species Bird Species [Kumar et al, ECCV 2012] Used by botanists, educators, http://www.vision.caltech.edu/visipedia/ Understanding of migration, conservation, Slide credit: Rogerio Feris
Examples of Application Areas Social Media: Visual Sentiment Analysis [Borth et al, ACM MM 2013] Colorful clouds Misty night Colorful butterfly Crying Baby Slide credit: Rogerio Feris
Many more applications Google Goggles [Kovashka et al, CVPR 2012] Amazon Slide credit: Rogerio Feris
Tutorial Overview
Tutorial Overview Objectives: Cover state-of-the-art techniques for learning visual semantics from images and videos Focus on intuitive, semantic visual representations Provide tools for scalable learning of semantic models Cover innovative and practical applications Provide pointers to related source code and datasets
Tutorial Overview Part I: Feature Extraction, Coding, and Pooling (Liangliang) Brief Introduction to local feature descriptors, coding,and pooling Focus on modern representations such as Fisher Vector and Sparse Coding
Tutorial Overview Part I: Feature Extraction, Coding, and Pooling (Liangliang) Connections to feature learning approaches (e.g., deep convolutional neural networks) Picture credit: Kai Yu
Tutorial Overview Part II: Large-Scale Semantic Modeling (John Smith) Semantic Concept Modeling: Historic Overview Picture credit: John Smith
Tutorial Overview Part II: Large-Scale Semantic Modeling (John Smith) How to deal with class imbalance? How to scale to millions of semantic unit models? Picture credit: John Smith
Tutorial Overview Part III: Shifting from naming to describing: semantic attribute models (Rogerio Feris) Scalable learning with Attribute Models / Zero-Shot Learning [Lampert et al, CVPR 2009]
Tutorial Overview Part III: Shifting from naming to describing: semantic attribute models (Rogerio Feris) Attribute-based Search Slide credit: Rogerio Feris
Tutorial Overview Part IV: High-level Semantic Modeling: Visual Sentiment Analysis (Shih-Fu Chang) Semantic models for encoding emotions in social media