Lecture 1, August 12: Machine Learning Introduction & Nonparametric Classifiers


1 Machine Learning Introduction & Nonparametric Classifiers. Eric Xing. Lecture 1, August 12, 2010.

2 Machine Learning: Where does it come from?

3 Logistics. Textbook: Chris Bishop, Pattern Recognition and Machine Learning (required); Tom Mitchell, Machine Learning; David MacKay, Information Theory, Inference, and Learning Algorithms; Daphne Koller and Nir Friedman, Probabilistic Graphical Models. Class resource: cn/ds/. Hosts: Lu Baoliang, Shanghai Jiao Tong University; Xue Xiangyang, Fudan University. Instructors: Eric Xing (CMU).

4 Local Hosts and co-instructors

5 Logistics: Class mailing list, Homework, Exam, Project

6 What is Learning? Learning is about seeking a predictive and/or executable understanding of natural/artificial subjects, phenomena, or activities. Examples: apoptosis + medicine, grammatical rules, manufacturing procedures, natural laws, inference.

7 Machine Learning

8 Fetching a stapler from inside an office --- the Stanford STAIR robot

9 What is Machine Learning? Machine Learning seeks to develop theories and computer systems for representing; classifying, clustering and recognizing; reasoning under uncertainty; predicting; and reacting to complex, real-world data, based on the system's own experience with data, and (hopefully) under a unified model or mathematical framework that can be formally characterized and analyzed, can take into account human prior knowledge, can generalize and adapt across data and domains, can operate automatically and autonomously, and can be interpreted and perceived by humans.

10 Where is Machine Learning being used, or where can it be useful? Information retrieval, speech recognition, computer vision, games, robotic control, planning, pedigree, evolution.

11 Natural language processing and speech recognition. Now most pocket speech recognizers or translators run on some sort of learning device: the more you play with/use them, the smarter they become!

12 Object Recognition. Behind a security camera, most likely there is a computer that is learning and/or checking!

13 Robotic Control I. The best helicopter pilot is now a computer! It runs a program that learns how to fly and make acrobatic maneuvers by itself: no taped instructions, joysticks, or the like. [A. Ng 2005]

14 Text Mining. We want: reading, digesting, and categorizing a vast text database; it is too much for humans!

15 Bioinformatics: Where is the gene? [Two slides of raw genomic sequence (acgt text); the question "where is the gene" is hidden inside the sequence.]

16 Paradigms of Machine Learning.
Supervised Learning: Given D = {X_i, Y_i}, learn f(.) such that Y_i = f(X_i), and for a new X predict Y = f(X).
Unsupervised Learning: Given D = {X_i}, learn f(.) such that Y_i = f(X_i), and for a new X predict Y = f(X).
Reinforcement Learning: Given D = {env, actions, rewards} (simulator/trace/real game), learn policy: (e, r) -> a and utility: (a, e) -> r, such that in a new real game the actions a_1, a_2, a_3, ... maximize utility.
Active Learning: Given D ~ G(.), learn to query new data D' ~ G'(.) and f(.), such that f does well on all D from G'(.).

17 Elements of Learning. Here are some important elements to consider before you start:
Task: embedding? classification? clustering? topic extraction?
Data and other info: input and output (e.g., continuous, binary, counts, ...); supervised or unsupervised, or a blend of everything? prior knowledge? bias?
Models and paradigms: BN? MRF? Regression? SVM? Bayesian/Frequentist? Parametric/Nonparametric?
Objective/Loss function: MLE? MCLE? Max margin? Log loss, hinge loss, square loss?
Tractability and exactness trade-off: exact inference? MCMC? Variational? Gradient? Greedy search? Online? Batch? Distributed?
Evaluation: visualization? human interpretability? perplexity? predictive accuracy?
It is better to consider one element at a time!

18 Theories of Learning. For the learned F(.;.): consistency (value, pattern, ...), bias versus variance, sample complexity, learning rate, convergence, error bound, confidence, stability.

19 Classification. Representing data. Hypothesis (classifier).

20 Decision-making as dividing a high-dimensional space. Class-conditional distribution: P(X | Y), e.g. p(X | Y = 1) = p(X; mu_1, Sigma_1) and p(X | Y = 2) = p(X; mu_2, Sigma_2). Class prior (i.e., "weight"): P(Y).

21 The Bayes Rule. What we have just done leads to the following general expression: P(Y | X) = P(X | Y) P(Y) / P(X). This is Bayes' rule.
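To make the formula concrete, here is a minimal numeric sketch in Python; the prior and likelihood values are made-up numbers for illustration, not from the lecture:

```python
# Minimal numeric illustration of Bayes' rule for a two-class problem.
# Priors and class-conditional likelihoods below are invented numbers.
priors = {1: 0.7, 2: 0.3}        # P(Y)
likelihood = {1: 0.2, 2: 0.6}    # P(X = x | Y) for one observed x

evidence = sum(priors[y] * likelihood[y] for y in priors)           # P(X = x)
posterior = {y: priors[y] * likelihood[y] / evidence for y in priors}
print(posterior)  # {1: 0.4375, 2: 0.5625} -- class 2 wins despite smaller prior
```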

22 The Bayes Decision Rule for Minimum Error. The a posteriori probability of a sample: q_i(X) = P(Y = i | X) = p(X | Y = i) P(Y = i) / p(X) = pi_i p(X | Y = i) / sum_j pi_j p(X | Y = j). Bayes test: decide the class with the largest posterior. Likelihood ratio: l(X). Discriminant function: h(X).

23 Example of Decision Rules. When each class is a normal density, we can write the decision boundary analytically in some cases: homework!!

24 Bayes Error. We must calculate the probability of error: the probability that a sample is assigned to the wrong class. Given a datum X, what is the risk? The Bayes error is the expected risk.

25 More on Bayes Error. Bayes error is the lower bound of the probability of classification error. The Bayes classifier is the theoretically best classifier, the one that minimizes the probability of classification error. Computing Bayes error is in general a very complex problem. Why? It requires density estimation and integrating the density function.
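In a toy case where the densities are known exactly, the integral can at least be approximated numerically. A minimal sketch, assuming two 1-D Gaussian class-conditionals with equal priors (all numbers are illustrative):

```python
import numpy as np
from scipy.stats import norm

# Two known 1-D Gaussian class-conditionals with equal priors (toy setup).
p1, p2 = 0.5, 0.5
f1 = norm(loc=-1.0, scale=1.0).pdf
f2 = norm(loc=+1.0, scale=1.0).pdf

# Bayes error = integral over x of min_i P(Y=i) p(x | Y=i) dx,
# approximated here by a Riemann sum on a dense grid.
x = np.linspace(-10.0, 10.0, 200_001)
dx = x[1] - x[0]
bayes_error = np.sum(np.minimum(p1 * f1(x), p2 * f2(x))) * dx
print(bayes_error)  # ~0.1587, i.e. Phi(-1) for this symmetric case
```

With real data neither f1 nor f2 is known, which is exactly why the slide calls this a very complex problem.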

26 Learning Classifier. The decision rule. Learning strategies: generative learning; discriminative learning; instance-based learning (store all past experience in memory), a special case of nonparametric classifier.

27 Supervised Learning. K-Nearest-Neighbor classifier: here h(x) is represented by all the data, plus an algorithm.

28 Recall: Vector Space Representation. Each document is a vector, one component for each term (= word). [Table: term counts for Doc 1, Doc 2, Doc 3, ...] Normalize to unit length. High-dimensional vector space: terms are axes (10,000+ dimensions, or even 100,000+); docs are vectors in this space.
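A minimal sketch of this representation in Python/NumPy; the three documents and tiny vocabulary are invented for the example:

```python
import numpy as np

# Toy term-count vectors: rows = documents, columns = vocabulary terms.
# Real vocabularies have 10,000+ terms; the idea is the same.
docs = np.array([[3.0, 0.0, 1.0],
                 [0.0, 2.0, 2.0],
                 [1.0, 1.0, 0.0]])

# Normalize each document to unit length, so that similarity reflects
# the mix of terms rather than raw document length.
docs_unit = docs / np.linalg.norm(docs, axis=1, keepdims=True)
print(np.linalg.norm(docs_unit, axis=1))  # [1. 1. 1.]
```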

29 Test Document = ? Sports, Science, Arts

30 1-Nearest Neighbor (knn) classifier: Sports, Science, Arts

31 2-Nearest Neighbor (knn) classifier: Sports, Science, Arts

32 3-Nearest Neighbor (knn) classifier: Sports, Science, Arts

33 K-Nearest Neighbor (knn) classifier: Voting knn. Sports, Science, Arts

34 Classes in a Vector Space: Sports, Science, Arts

35 knn Is Close to Optimal [Cover and Hart 1967]. Asymptotically, the error rate of 1-nearest-neighbor classification is less than twice the Bayes rate [the error rate of a classifier knowing the model that generated the data]. In particular, the asymptotic error rate is 0 if the Bayes rate is 0. Where does knn come from? Nonparametric density estimation.
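For reference, the precise Cover and Hart (1967) bound relating the asymptotic 1-NN error rate to the Bayes rate P* for a c-class problem is:

```latex
P^{*} \;\le\; P_{1\text{-}\mathrm{NN}} \;\le\; P^{*}\!\left(2 - \frac{c}{c-1}\,P^{*}\right) \;\le\; 2P^{*}
```

In the two-class case the upper bound is 2P*(1 - P*), and both bounds collapse to 0 when P* = 0, matching the statement above.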

36 Nearest-Neighbor Learning Algorithm. Learning is just storing the representations of the training examples in D. Testing instance x: compute the similarity between x and all examples in D; assign x the category of the most similar example in D. Does not explicitly compute a generalization or category prototypes. Also called: case-based learning, memory-based learning, lazy learning.
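As a concrete illustration of this "lazy" procedure, here is a minimal kNN sketch in Python/NumPy; the toy 2-D data and class labels are invented for the example:

```python
import numpy as np
from collections import Counter

def knn_classify(x, X_train, y_train, k=3):
    """Classify x by majority vote among its k nearest training examples.

    'Learning' is just storing (X_train, y_train); all the work happens
    here, at test time -- hence lazy / memory-based learning.
    """
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distances
    nearest = np.argsort(dists)[:k]              # indices of the k closest
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Tiny made-up example: two well-separated 2-D classes.
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y = np.array(['A', 'A', 'A', 'B', 'B', 'B'])
print(knn_classify(np.array([4.5, 5.0]), X, y, k=3))  # 'B'
```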

37 knn is an instance of Instance-Based Learning. What makes an instance-based learner? A distance metric; how many nearby neighbors to look at; a weighting function (optional); how to relate to the local points.

38 Euclidean Distance Metric. D(x, x') = sqrt(sum_i sigma_i^2 (x_i - x_i')^2), or equivalently D(x, x') = sqrt((x - x')^T Sigma (x - x')) with diagonal Sigma. Other metrics: L1 norm: sum_i |x_i - x_i'|; L-infinity norm: max_i |x_i - x_i'| (elementwise); Mahalanobis: the same quadratic form where Sigma is full and symmetric; correlation; angle; Hamming distance; Manhattan distance.
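The metrics listed above are each a few lines of NumPy. A minimal sketch; here Sigma stands for any symmetric positive-definite weighting matrix (often taken in practice to be the inverse covariance):

```python
import numpy as np

def euclidean(x, xp):
    # D(x, x') = sqrt(sum_i (x_i - x'_i)^2)
    return np.sqrt(np.sum((x - xp) ** 2))

def l1(x, xp):
    # sum_i |x_i - x'_i|  (Manhattan distance)
    return np.sum(np.abs(x - xp))

def linf(x, xp):
    # max_i |x_i - x'_i|  (elementwise max)
    return np.max(np.abs(x - xp))

def mahalanobis(x, xp, Sigma):
    # sqrt((x - x')^T Sigma (x - x')), Sigma full, symmetric positive-definite
    d = x - xp
    return np.sqrt(d @ Sigma @ d)

def angle(x, xp):
    # angle between the two vectors, recovered from the cosine similarity
    c = (x @ xp) / (np.linalg.norm(x) * np.linalg.norm(xp))
    return np.arccos(np.clip(c, -1.0, 1.0))
```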

39 Case Study: knn for Web Classification. Dataset: 20 News Groups (20 classes). Download: 61,... words, 18,774 documents; class label descriptions.

40 Results: Binary Classes. [Plot: accuracy vs. k for alt.atheism vs. comp.graphics; rec.autos vs. rec.sport.baseball; comp.windows.x vs. rec.motorcycles.]

41 Results: Multiple Classes. [Plot: accuracy vs. k. Randomly select 5 out of 20 classes, repeat 10 runs and average; all 20 classes.]

42 Is knn ideal?

43 Is knn ideal? ... more later

44 Effect of Parameters. Sample size: the more the better; need an efficient search algorithm for NN. Dimensionality: curse of dimensionality. Density: how smooth? Metric: the relative scalings in the distance metric affect region shapes. Weight: spurious or less relevant points need to be downweighted. K.

45 Summary. Machine Learning is cool and useful!! Paradigms of machine learning. Design elements of learning. Theories of learning. Fundamental theory of classification: the Bayes optimal classifier. Instance-based learning: knn, a nonparametric classifier. A nonparametric method does not rely on any assumption concerning the structure of the underlying density function; very little learning is involved in these methods. Good news: simple and powerful methods, flexible and easy to apply to many problems; the knn classifier asymptotically approaches the Bayes classifier, which is theoretically the best classifier that minimizes the probability of classification error. Bad news: high memory requirements; very dependent on the scale factor for a specific problem.

46 Learning Based Approaches for Visual Recognition. L. Fei-Fei, Computer Science Dept., Stanford University.

47 As legend goes ...

48 CVPR

49 What is vision? [Diagram: real world <-> pixel world. Forming pictures maps the real world to pixels; understanding pictures maps pixels back to the real world.]

50 What is vision? Low-level vision: edges, intensity, texture.

51 What is vision? Mid-level vision: groupings of similar pixels, geometry.

52 What is vision? High-level vision: "This is a story of love and friendship among three young hamsters. On a sunny day in the garden ..."

53 Humans are extremely good at high-level semantic understanding. [Fei-Fei et al., JoV]

54 Humans are extremely good at high-level semantic understanding. [Fei-Fei et al., JoV]

55 [Subjects' free-recall descriptions of a briefly presented photograph, by presentation time (PT):]
PT = 27ms: "This was a picture with some dark splotches in it. Yeah... that's about it." (Subject: KM)
PT = 40ms: "I think I saw two people on a field." (Subject: RW)
PT = 67ms: "Outdoor scene. There were some kind of animals, maybe dogs or horses, in the middle of the picture. It looked like they were running in the middle of a grassy field." (Subject: IV)
PT = 107ms: "Two groups of two men? The foreground pair looked like one was getting a fist in the face. Outdoors seemed like because I have an impression of grass and maybe lines on the grass? That would be why I think perhaps a game, rough game though, more like rugby than football because they weren't in pads and helmets, though I did get the impression of similar clothing. Maybe some trees? in the background." (Subject: SM)
PT = 500ms: "Some kind of game or fight. Two people, whose profile was toward me, looked like they were on a field of some sort and engaged in some sort of sport (their attire suggested soccer, but it looked like there was too much contact for that)." (Subject: AI)
[Fei-Fei et al., JoV]

56 Visual recognition. [Figure: levels of image representation and recognition tasks, against human visual-system processing time (about 90 msec, 150 msec, 1 sec). Low level: features and descriptors; segmentation. Object: basic-level classification, segmentation, parts and attributes, shape, texture, 3D geometry, recognition. Scene: scene understanding, geometric layout. Event/activity: tracking, action classification, activity understanding, causality, goals and intentions, functionality (human-object interaction), situation roles and functions, social roles.]

57 Visual recognition, 1990 to early 2000. [Same task hierarchy as the previous slide, highlighting the levels studied in this period.]

58 Visual recognition, early 2000 to now. [Same task hierarchy, highlighting the levels studied in this period.]

59 Why machine learning?

60 Why machine learning? 80,000,000,000+ images; 5,000,000,000+ images; 120,000,000 videos (upload: 13 hours/min); 20 GB of images; my mom's hard drive: 220+ GB of images.

61 Why machine learning? Today: lots of data, complex tasks. Internet images, personal photo albums; movies, news, sports; surveillance and security; medical and scientific images.

62 Machine learning in computer vision.
Aug 12, Lecture 1: Nearest Neighbor. Large-scale image classification; scene completion.
Aug 12, Lecture 3: Neural Network. Convolutional nets for object recognition; unsupervised feature learning via Deep Belief Net.
Aug 13, Lecture 7: Dimensionality reduction, manifold learning. Eigen- and Fisher-faces; applications to object representation.
Aug 15, Lecture 13: Conditional Random Field. Image segmentation, object recognition & image annotation.
Aug 15 & 16, Lecture : Topic models. Object recognition; scene classification, image annotation, large-scale image clustering; total scene understanding.

63 Nearest Neighbor approaches: two case studies. L. Fei-Fei, Computer Science Dept., Stanford University.

64 Machine learning in computer vision. Aug 12, Lecture 1: Nearest Neighbor. Large-scale image classification; scene completion.

65 Machine learning in computer vision. Aug 12, Lecture 1: Nearest Neighbor. Large-scale image classification; scene completion.

66 [image slide]

67 [image slide]

68 Large-scale image classification and retrieval is an unaddressed problem in vision??

69 [Plot: number of clean images per category (log_10) vs. number of visual concept categories (log_10), for Humans, PASCAL (1), LabelMe (3), MSRC, Caltech101/256, and Tiny Images (2).]
1. Excluding the Caltech101 datasets from PASCAL.
2. Images in this dataset are not human annotated; the number of clean images per category is a rough estimation.
3. Only categories with more than 100 images are considered.

70 knn for image classification: basic set up. Antelope? Trombone, Jellyfish, German Shepherd, Kangaroo.

71 knn for image classification: basic set up 5 NN?

72 Classification. 5 NN? [Bar chart of neighbor counts over Antelope, Jellyfish, German Shepherd, Kangaroo, Trombone; Kangaroo wins the vote.]

73 10K classes, 4.5M queries, 4.5M training images? Background image courtesy: Antonio Torralba.

74 KNN on 10K classes. 10K classes, 4.5M queries, 4.5M training images. Features: BOW, GIST. [Deng, Berg, Li & Fei-Fei, ECCV 2010]

75 How fast is knn? Brute-force linear scan: e.g., scanning 4.5M images! Can we be faster? Yes, if feature dimensionality is low (e.g. <= 16): KD-tree.

76 Curse of dimensionality. For high dimensionality, in both theory and practice, there is little improvement over brute-force linear scan. E.g., a KD-tree on 128-dimensional SIFT is not much faster than linear scan.
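As a sanity check on the low-dimensional case, here is a small sketch using SciPy's cKDTree; the data are random and the dimension (8) is chosen to be in the regime where KD-trees actually help:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
X = rng.standard_normal((100_000, 8))  # low dimension: KD-tree regime
q = rng.standard_normal(8)

tree = cKDTree(X)               # build the tree once...
dist, idx = tree.query(q, k=5)  # ...then query 5 nearest neighbors quickly

# Brute-force linear scan finds the same neighbors, but touches every point.
brute = np.argsort(np.linalg.norm(X - q, axis=1))[:5]
assert set(brute) == set(idx)
```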

77 Locality sensitive hashing. Approximate knn: good enough in practice, and can get around the curse of dimensionality. Locality sensitive hashing: nearby feature points (likely) get the same hash values. Hash table.

78 Example: Random projection. h(x) = sgn(x . r), where r is a random unit vector. h(x) gives 1 bit; repeat and concatenate. Prob[h(x) = h(y)] = 1 - theta(x, y) / pi. [Diagrams: random hyperplanes through the origin; x and y on the same side give h(x) = h(y), opposite sides give differing bits. Hash table.]

79 Example: Random projection (continued). [More hyperplane examples: when x and y are close in angle, most random hyperplanes assign them the same bit, h(x) = h(y). Hash table.]
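Here is a minimal sketch of this hash family in Python/NumPy, checking the stated collision probability empirically; the dimension and bit count are arbitrary choices for the illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_bits = 64, 10_000

def hash_bits(v, R):
    # One bit per random hyperplane: h(v) = sgn(v . r), stored as booleans.
    return (R @ v) >= 0

R = rng.standard_normal((n_bits, d))   # rows are random hyperplane normals
x = rng.standard_normal(d)
y = x + 0.5 * rng.standard_normal(d)   # a point fairly close to x

theta = np.arccos((x @ y) / (np.linalg.norm(x) * np.linalg.norm(y)))
agree = np.mean(hash_bits(x, R) == hash_bits(y, R))
print(agree, 1.0 - theta / np.pi)  # empirical vs. theoretical collision rate
```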

80 Locality sensitive hashing. [A query hashes to a bucket; the retrieved NN candidates are the points in the same bucket of the hash table.]

81 Locality sensitive hashing. ~1000x speed-up with 50% recall of the top-10 NN (1.2M images). [Plot: recall of top-10 exact NN vs. percentage of points scanned (scan cost), comparing L1Prod LSH + L1Prod ranking against RandHP LSH + L1Prod ranking.]

82 10K classes, 4.5M queries, 4.5M training images?

83 Machine learning in computer vision. Aug 12, Lecture 1: Nearest Neighbor. Large-scale image classification; scene completion (slides courtesy: Alyosha Efros (CMU)).

84 [Hays and Efros. Scene Completion Using Millions of Photographs. SIGGRAPH 2007 and CACM, October 2008.]

85 [image slide]

86 Diffusion Result

87 Efros and Leung result

88 [image slide]

89 Scene Matching for Image Completion

90 Scene Completion Result

91 [image slide]

92 Scene Matching

93 [image slide: 200 total]

94 Context Matching

95 [image slide]

96 [image slide]

97 [image slide]

98 [image slide]

99 The Internet is the world's largest library.

100 [image slide]

101 Nearest neighbors from a collection of 20 thousand images

102 Nearest neighbors from a collection of 2 million images

103 [image slide] Hays and Efros, SIGGRAPH 2007

104 [image slide] Hays and Efros, SIGGRAPH 2007

105 [image slide] Hays and Efros, SIGGRAPH 2007

106 [image slide] Hays and Efros, SIGGRAPH 2007

107 [image slide] Hays and Efros, SIGGRAPH 2007

108 [image slide] Hays and Efros, SIGGRAPH 2007

Machine Learning. Nonparametric methods for Classification. Eric Xing , Fall Lecture 2, September 12, 2016

Machine Learning. Nonparametric methods for Classification. Eric Xing , Fall Lecture 2, September 12, 2016 Machine Learning 10-701, Fall 2016 Nonparametric methods for Classification Eric Xing Lecture 2, September 12, 2016 Reading: 1 Classification Representing data: Hypothesis (classifier) 2 Clustering 3 Supervised

More information

Notes and Announcements

Notes and Announcements Notes and Announcements Midterm exam: Oct 20, Wednesday, In Class Late Homeworks Turn in hardcopies to Michelle. DO NOT ask Michelle for extensions. Note down the date and time of submission. If submitting

More information

Instance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2015

Instance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2015 Instance-based Learning CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2015 Outline Non-parametric approach Unsupervised: Non-parametric density estimation Parzen Windows K-Nearest

More information

Object Recognition. Lecture 11, April 21 st, Lexing Xie. EE4830 Digital Image Processing

Object Recognition. Lecture 11, April 21 st, Lexing Xie. EE4830 Digital Image Processing Object Recognition Lecture 11, April 21 st, 2008 Lexing Xie EE4830 Digital Image Processing http://www.ee.columbia.edu/~xlx/ee4830/ 1 Announcements 2 HW#5 due today HW#6 last HW of the semester Due May

More information

COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor

COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor COSC160: Detection and Classification Jeremy Bolton, PhD Assistant Teaching Professor Outline I. Problem I. Strategies II. Features for training III. Using spatial information? IV. Reducing dimensionality

More information

Requesting changes to the grades, please write to me in an with descriptions. Move my office hours to the morning. 11am - 12:30pm Thursdays

Requesting changes to the grades, please write to me in an  with descriptions. Move my office hours to the morning. 11am - 12:30pm Thursdays Lab1 graded. Requesting changes to the grades, please write to me in an email with descriptions. Move my office hours to the morning. 11am - 12:30pm Thursdays Internet-scale texture synthesis With slides

More information

CS6375: Machine Learning Gautam Kunapuli. Mid-Term Review

CS6375: Machine Learning Gautam Kunapuli. Mid-Term Review Gautam Kunapuli Machine Learning Data is identically and independently distributed Goal is to learn a function that maps to Data is generated using an unknown function Learn a hypothesis that minimizes

More information

Supervised vs unsupervised clustering

Supervised vs unsupervised clustering Classification Supervised vs unsupervised clustering Cluster analysis: Classes are not known a- priori. Classification: Classes are defined a-priori Sometimes called supervised clustering Extract useful

More information

K Nearest Neighbor Wrap Up K- Means Clustering. Slides adapted from Prof. Carpuat

K Nearest Neighbor Wrap Up K- Means Clustering. Slides adapted from Prof. Carpuat K Nearest Neighbor Wrap Up K- Means Clustering Slides adapted from Prof. Carpuat K Nearest Neighbor classification Classification is based on Test instance with Training Data K: number of neighbors that

More information

Semi-supervised Learning

Semi-supervised Learning Semi-supervised Learning Piyush Rai CS5350/6350: Machine Learning November 8, 2011 Semi-supervised Learning Supervised Learning models require labeled data Learning a reliable model usually requires plenty

More information

ECG782: Multidimensional Digital Signal Processing

ECG782: Multidimensional Digital Signal Processing ECG782: Multidimensional Digital Signal Processing Object Recognition http://www.ee.unlv.edu/~b1morris/ecg782/ 2 Outline Knowledge Representation Statistical Pattern Recognition Neural Networks Boosting

More information

Classifiers and Detection. D.A. Forsyth

Classifiers and Detection. D.A. Forsyth Classifiers and Detection D.A. Forsyth Classifiers Take a measurement x, predict a bit (yes/no; 1/-1; 1/0; etc) Detection with a classifier Search all windows at relevant scales Prepare features Classify

More information

Machine Learning / Jan 27, 2010

Machine Learning / Jan 27, 2010 Revisiting Logistic Regression & Naïve Bayes Aarti Singh Machine Learning 10-701/15-781 Jan 27, 2010 Generative and Discriminative Classifiers Training classifiers involves learning a mapping f: X -> Y,

More information

K-Nearest Neighbors. Jia-Bin Huang. Virginia Tech Spring 2019 ECE-5424G / CS-5824

K-Nearest Neighbors. Jia-Bin Huang. Virginia Tech Spring 2019 ECE-5424G / CS-5824 K-Nearest Neighbors Jia-Bin Huang ECE-5424G / CS-5824 Virginia Tech Spring 2019 Administrative Check out review materials Probability Linear algebra Python and NumPy Start your HW 0 On your Local machine:

More information

Neural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /10/2017

Neural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /10/2017 3/0/207 Neural Networks Emily Fox University of Washington March 0, 207 Slides adapted from Ali Farhadi (via Carlos Guestrin and Luke Zettlemoyer) Single-layer neural network 3/0/207 Perceptron as a neural

More information

PATTERN CLASSIFICATION AND SCENE ANALYSIS

PATTERN CLASSIFICATION AND SCENE ANALYSIS PATTERN CLASSIFICATION AND SCENE ANALYSIS RICHARD O. DUDA PETER E. HART Stanford Research Institute, Menlo Park, California A WILEY-INTERSCIENCE PUBLICATION JOHN WILEY & SONS New York Chichester Brisbane

More information

Supervised Learning for Image Segmentation

Supervised Learning for Image Segmentation Supervised Learning for Image Segmentation Raphael Meier 06.10.2016 Raphael Meier MIA 2016 06.10.2016 1 / 52 References A. Ng, Machine Learning lecture, Stanford University. A. Criminisi, J. Shotton, E.

More information

Day 3 Lecture 1. Unsupervised Learning

Day 3 Lecture 1. Unsupervised Learning Day 3 Lecture 1 Unsupervised Learning Semi-supervised and transfer learning Myth: you can t do deep learning unless you have a million labelled examples for your problem. Reality You can learn useful representations

More information

Multiple-Choice Questionnaire Group C

Multiple-Choice Questionnaire Group C Family name: Vision and Machine-Learning Given name: 1/28/2011 Multiple-Choice naire Group C No documents authorized. There can be several right answers to a question. Marking-scheme: 2 points if all right

More information

VECTOR SPACE CLASSIFICATION

VECTOR SPACE CLASSIFICATION VECTOR SPACE CLASSIFICATION Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press. Chapter 14 Wei Wei wwei@idi.ntnu.no Lecture

More information

Object Recognition Using Pictorial Structures. Daniel Huttenlocher Computer Science Department. In This Talk. Object recognition in computer vision

Object Recognition Using Pictorial Structures. Daniel Huttenlocher Computer Science Department. In This Talk. Object recognition in computer vision Object Recognition Using Pictorial Structures Daniel Huttenlocher Computer Science Department Joint work with Pedro Felzenszwalb, MIT AI Lab In This Talk Object recognition in computer vision Brief definition

More information

Announcements. Recognition. Recognition. Recognition. Recognition. Homework 3 is due May 18, 11:59 PM Reading: Computer Vision I CSE 152 Lecture 14

Announcements. Recognition. Recognition. Recognition. Recognition. Homework 3 is due May 18, 11:59 PM Reading: Computer Vision I CSE 152 Lecture 14 Announcements Computer Vision I CSE 152 Lecture 14 Homework 3 is due May 18, 11:59 PM Reading: Chapter 15: Learning to Classify Chapter 16: Classifying Images Chapter 17: Detecting Objects in Images Given

More information

Understanding Faces. Detection, Recognition, and. Transformation of Faces 12/5/17

Understanding Faces. Detection, Recognition, and. Transformation of Faces 12/5/17 Understanding Faces Detection, Recognition, and 12/5/17 Transformation of Faces Lucas by Chuck Close Chuck Close, self portrait Some slides from Amin Sadeghi, Lana Lazebnik, Silvio Savarese, Fei-Fei Li

More information

Part-based and local feature models for generic object recognition

Part-based and local feature models for generic object recognition Part-based and local feature models for generic object recognition May 28 th, 2015 Yong Jae Lee UC Davis Announcements PS2 grades up on SmartSite PS2 stats: Mean: 80.15 Standard Dev: 22.77 Vote on piazza

More information

Dimension Reduction CS534

Dimension Reduction CS534 Dimension Reduction CS534 Why dimension reduction? High dimensionality large number of features E.g., documents represented by thousands of words, millions of bigrams Images represented by thousands of

More information

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011 Previously Part-based and local feature models for generic object recognition Wed, April 20 UT-Austin Discriminative classifiers Boosting Nearest neighbors Support vector machines Useful for object recognition

More information

Nearest Neighbor Classification. Machine Learning Fall 2017

Nearest Neighbor Classification. Machine Learning Fall 2017 Nearest Neighbor Classification Machine Learning Fall 2017 1 This lecture K-nearest neighbor classification The basic algorithm Different distance measures Some practical aspects Voronoi Diagrams and Decision

More information

08 An Introduction to Dense Continuous Robotic Mapping

08 An Introduction to Dense Continuous Robotic Mapping NAVARCH/EECS 568, ROB 530 - Winter 2018 08 An Introduction to Dense Continuous Robotic Mapping Maani Ghaffari March 14, 2018 Previously: Occupancy Grid Maps Pose SLAM graph and its associated dense occupancy

More information

Class 5: Attributes and Semantic Features

Class 5: Attributes and Semantic Features Class 5: Attributes and Semantic Features Rogerio Feris, Feb 21, 2013 EECS 6890 Topics in Information Processing Spring 2013, Columbia University http://rogerioferis.com/visualrecognitionandsearch Project

More information

Using Machine Learning to Optimize Storage Systems

Using Machine Learning to Optimize Storage Systems Using Machine Learning to Optimize Storage Systems Dr. Kiran Gunnam 1 Outline 1. Overview 2. Building Flash Models using Logistic Regression. 3. Storage Object classification 4. Storage Allocation recommendation

More information

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009 Learning and Inferring Depth from Monocular Images Jiyan Pan April 1, 2009 Traditional ways of inferring depth Binocular disparity Structure from motion Defocus Given a single monocular image, how to infer

More information

Opportunities of Scale

Opportunities of Scale Opportunities of Scale 11/07/17 Computational Photography Derek Hoiem, University of Illinois Most slides from Alyosha Efros Graphic from Antonio Torralba Today s class Opportunities of Scale: Data-driven

More information

The exam is closed book, closed notes except your one-page (two-sided) cheat sheet.

The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. CS 189 Spring 2015 Introduction to Machine Learning Final You have 2 hours 50 minutes for the exam. The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. No calculators or

More information

Supervised Learning: Nearest Neighbors

Supervised Learning: Nearest Neighbors CS 2750: Machine Learning Supervised Learning: Nearest Neighbors Prof. Adriana Kovashka University of Pittsburgh February 1, 2016 Today: Supervised Learning Part I Basic formulation of the simplest classifier:

More information

Text classification II CE-324: Modern Information Retrieval Sharif University of Technology

Text classification II CE-324: Modern Information Retrieval Sharif University of Technology Text classification II CE-324: Modern Information Retrieval Sharif University of Technology M. Soleymani Fall 2015 Some slides have been adapted from: Profs. Manning, Nayak & Raghavan (CS-276, Stanford)

More information

k-nearest Neighbors + Model Selection

k-nearest Neighbors + Model Selection 10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University k-nearest Neighbors + Model Selection Matt Gormley Lecture 5 Jan. 30, 2019 1 Reminders

More information

Machine Learning: Think Big and Parallel

Machine Learning: Think Big and Parallel Day 1 Inderjit S. Dhillon Dept of Computer Science UT Austin CS395T: Topics in Multicore Programming Oct 1, 2013 Outline Scikit-learn: Machine Learning in Python Supervised Learning day1 Regression: Least

More information

Applying Supervised Learning

Applying Supervised Learning Applying Supervised Learning When to Consider Supervised Learning A supervised learning algorithm takes a known set of input data (the training set) and known responses to the data (output), and trains

More information

CS570: Introduction to Data Mining

CS570: Introduction to Data Mining CS570: Introduction to Data Mining Classification Advanced Reading: Chapter 8 & 9 Han, Chapters 4 & 5 Tan Anca Doloc-Mihu, Ph.D. Slides courtesy of Li Xiong, Ph.D., 2011 Han, Kamber & Pei. Data Mining.

More information

Generative and discriminative classification techniques

Generative and discriminative classification techniques Generative and discriminative classification techniques Machine Learning and Category Representation 2014-2015 Jakob Verbeek, November 28, 2014 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.14.15

More information

Lecture 3: Linear Classification

Lecture 3: Linear Classification Lecture 3: Linear Classification Roger Grosse 1 Introduction Last week, we saw an example of a learning task called regression. There, the goal was to predict a scalar-valued target from a set of features.

More information

Machine Learning. Classification

Machine Learning. Classification 10-701 Machine Learning Classification Inputs Inputs Inputs Where we are Density Estimator Probability Classifier Predict category Today Regressor Predict real no. Later Classification Assume we want to

More information

10-701/15-781, Fall 2006, Final

10-701/15-781, Fall 2006, Final -7/-78, Fall 6, Final Dec, :pm-8:pm There are 9 questions in this exam ( pages including this cover sheet). If you need more room to work out your answer to a question, use the back of the page and clearly

More information

CSC 411: Lecture 05: Nearest Neighbors

CSC 411: Lecture 05: Nearest Neighbors CSC 411: Lecture 05: Nearest Neighbors Raquel Urtasun & Rich Zemel University of Toronto Sep 28, 2015 Urtasun & Zemel (UofT) CSC 411: 05-Nearest Neighbors Sep 28, 2015 1 / 13 Today Non-parametric models

More information

Applied Statistics for Neuroscientists Part IIa: Machine Learning

Applied Statistics for Neuroscientists Part IIa: Machine Learning Applied Statistics for Neuroscientists Part IIa: Machine Learning Dr. Seyed-Ahmad Ahmadi 04.04.2017 16.11.2017 Outline Machine Learning Difference between statistics and machine learning Modeling the problem

More information

Lecture 12 Recognition. Davide Scaramuzza

Lecture 12 Recognition. Davide Scaramuzza Lecture 12 Recognition Davide Scaramuzza Oral exam dates UZH January 19-20 ETH 30.01 to 9.02 2017 (schedule handled by ETH) Exam location Davide Scaramuzza s office: Andreasstrasse 15, 2.10, 8050 Zurich

More information

CS4495/6495 Introduction to Computer Vision. 8C-L1 Classification: Discriminative models

CS4495/6495 Introduction to Computer Vision. 8C-L1 Classification: Discriminative models CS4495/6495 Introduction to Computer Vision 8C-L1 Classification: Discriminative models Remember: Supervised classification Given a collection of labeled examples, come up with a function that will predict

More information

CISC 4631 Data Mining

CISC 4631 Data Mining CISC 4631 Data Mining Lecture 03: Nearest Neighbor Learning Theses slides are based on the slides by Tan, Steinbach and Kumar (textbook authors) Prof. R. Mooney (UT Austin) Prof E. Keogh (UCR), Prof. F.

More information

Deep Learning for Computer Vision

Deep Learning for Computer Vision Deep Learning for Computer Vision Spring 2018 http://vllab.ee.ntu.edu.tw/dlcv.html (primary) https://ceiba.ntu.edu.tw/1062dlcv (grade, etc.) FB: DLCV Spring 2018 Yu Chiang Frank Wang 王鈺強, Associate Professor

More information

Machine Learning Classifiers and Boosting

Machine Learning Classifiers and Boosting Machine Learning Classifiers and Boosting Reading Ch 18.6-18.12, 20.1-20.3.2 Outline Different types of learning problems Different types of learning algorithms Supervised learning Decision trees Naïve

More information

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Classification Vladimir Curic Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Outline An overview on classification Basics of classification How to choose appropriate

More information

CS 231A Computer Vision (Fall 2011) Problem Set 4

CS 231A Computer Vision (Fall 2011) Problem Set 4 CS 231A Computer Vision (Fall 2011) Problem Set 4 Due: Nov. 30 th, 2011 (9:30am) 1 Part-based models for Object Recognition (50 points) One approach to object recognition is to use a deformable part-based

More information

Instance-Based Learning: Nearest neighbor and kernel regression and classificiation

Instance-Based Learning: Nearest neighbor and kernel regression and classificiation Instance-Based Learning: Nearest neighbor and kernel regression and classificiation Emily Fox University of Washington February 3, 2017 Simplest approach: Nearest neighbor regression 1 Fit locally to each

More information

Naïve Bayes for text classification

Naïve Bayes for text classification Road Map Basic concepts Decision tree induction Evaluation of classifiers Rule induction Classification using association rules Naïve Bayesian classification Naïve Bayes for text classification Support

More information

Announcements. CS 188: Artificial Intelligence Spring Generative vs. Discriminative. Classification: Feature Vectors. Project 4: due Friday.

Announcements. CS 188: Artificial Intelligence Spring Generative vs. Discriminative. Classification: Feature Vectors. Project 4: due Friday. CS 188: Artificial Intelligence Spring 2011 Lecture 21: Perceptrons 4/13/2010 Announcements Project 4: due Friday. Final Contest: up and running! Project 5 out! Pieter Abbeel UC Berkeley Many slides adapted

More information

Lecture 12 Recognition

Lecture 12 Recognition Institute of Informatics Institute of Neuroinformatics Lecture 12 Recognition Davide Scaramuzza 1 Lab exercise today replaced by Deep Learning Tutorial Room ETH HG E 1.1 from 13:15 to 15:00 Optional lab

More information

Pattern Recognition. Kjell Elenius. Speech, Music and Hearing KTH. March 29, 2007 Speech recognition

Pattern Recognition. Kjell Elenius. Speech, Music and Hearing KTH. March 29, 2007 Speech recognition Pattern Recognition Kjell Elenius Speech, Music and Hearing KTH March 29, 2007 Speech recognition 2007 1 Ch 4. Pattern Recognition 1(3) Bayes Decision Theory Minimum-Error-Rate Decision Rules Discriminant

More information

Generative and discriminative classification techniques

Generative and discriminative classification techniques Generative and discriminative classification techniques Machine Learning and Category Representation 013-014 Jakob Verbeek, December 13+0, 013 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.13.14

More information

Classification and Detection in Images. D.A. Forsyth

Classification and Detection in Images. D.A. Forsyth Classification and Detection in Images D.A. Forsyth Classifying Images Motivating problems detecting explicit images classifying materials classifying scenes Strategy build appropriate image features train

More information

Image Classification pipeline. Lecture 2-1

Image Classification pipeline. Lecture 2-1 Lecture 2: Image Classification pipeline Lecture 2-1 Administrative: Piazza For questions about midterm, poster session, projects, etc, use Piazza! SCPD students: Use your @stanford.edu address to register

More information

Lecture 27: Review. Reading: All chapters in ISLR. STATS 202: Data mining and analysis. December 6, 2017

Lecture 27: Review. Reading: All chapters in ISLR. STATS 202: Data mining and analysis. December 6, 2017 Lecture 27: Review Reading: All chapters in ISLR. STATS 202: Data mining and analysis December 6, 2017 1 / 16 Final exam: Announcements Tuesday, December 12, 8:30-11:30 am, in the following rooms: Last

More information

Discriminative classifiers for image recognition

Discriminative classifiers for image recognition Discriminative classifiers for image recognition May 26 th, 2015 Yong Jae Lee UC Davis Outline Last time: window-based generic object detection basic pipeline face detection with boosting as case study

More information

Kernels + K-Means Introduction to Machine Learning. Matt Gormley Lecture 29 April 25, 2018

Kernels + K-Means Introduction to Machine Learning. Matt Gormley Lecture 29 April 25, 2018 10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Kernels + K-Means Matt Gormley Lecture 29 April 25, 2018 1 Reminders Homework 8:

More information

Based on Raymond J. Mooney s slides

Based on Raymond J. Mooney s slides Instance Based Learning Based on Raymond J. Mooney s slides University of Texas at Austin 1 Example 2 Instance-Based Learning Unlike other learning algorithms, does not involve construction of an explicit

More information

CPSC 340: Machine Learning and Data Mining. Non-Parametric Models Fall 2016

CPSC 340: Machine Learning and Data Mining. Non-Parametric Models Fall 2016 CPSC 340: Machine Learning and Data Mining Non-Parametric Models Fall 2016 Assignment 0: Admin 1 late day to hand it in tonight, 2 late days for Wednesday. Assignment 1 is out: Due Friday of next week.

More information

COMP90051 Statistical Machine Learning

COMP90051 Statistical Machine Learning COMP90051 Statistical Machine Learning Semester 2, 2016 Lecturer: Trevor Cohn 20. PGM Representation Next Lectures Representation of joint distributions Conditional/marginal independence * Directed vs

More information

Case-Based Reasoning. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. Parametric / Non-parametric.

Case-Based Reasoning. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. Parametric / Non-parametric. CS 188: Artificial Intelligence Fall 2008 Lecture 25: Kernels and Clustering 12/2/2008 Dan Klein UC Berkeley Case-Based Reasoning Similarity for classification Case-based reasoning Predict an instance

More information

CS 188: Artificial Intelligence Fall 2008

CS 188: Artificial Intelligence Fall 2008 CS 188: Artificial Intelligence Fall 2008 Lecture 25: Kernels and Clustering 12/2/2008 Dan Klein UC Berkeley 1 1 Case-Based Reasoning Similarity for classification Case-based reasoning Predict an instance

More information

Feature Extractors. CS 188: Artificial Intelligence Fall Some (Vague) Biology. The Binary Perceptron. Binary Decision Rule.

Feature Extractors. CS 188: Artificial Intelligence Fall Some (Vague) Biology. The Binary Perceptron. Binary Decision Rule. CS 188: Artificial Intelligence Fall 2008 Lecture 24: Perceptrons II 11/24/2008 Dan Klein UC Berkeley Feature Extractors A feature extractor maps inputs to feature vectors Dear Sir. First, I must solicit

More information

Robot Learning. There are generally three types of robot learning: Learning from data. Learning by demonstration. Reinforcement learning

Robot Learning. There are generally three types of robot learning: Learning from data. Learning by demonstration. Reinforcement learning Robot Learning 1 General Pipeline 1. Data acquisition (e.g., from 3D sensors) 2. Feature extraction and representation construction 3. Robot learning: e.g., classification (recognition) or clustering (knowledge

More information

Storyline Reconstruction for Unordered Images

Storyline Reconstruction for Unordered Images Introduction: Storyline Reconstruction for Unordered Images Final Paper Sameedha Bairagi, Arpit Khandelwal, Venkatesh Raizaday Storyline reconstruction is a relatively new topic and has not been researched

More information

Challenges motivating deep learning. Sargur N. Srihari

Challenges motivating deep learning. Sargur N. Srihari Challenges motivating deep learning Sargur N. srihari@cedar.buffalo.edu 1 Topics In Machine Learning Basics 1. Learning Algorithms 2. Capacity, Overfitting and Underfitting 3. Hyperparameters and Validation

More information

Recognition Tools: Support Vector Machines

Recognition Tools: Support Vector Machines CS 2770: Computer Vision Recognition Tools: Support Vector Machines Prof. Adriana Kovashka University of Pittsburgh January 12, 2017 Announcement TA office hours: Tuesday 4pm-6pm Wednesday 10am-12pm Matlab

More information

Beyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Beyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba Adding spatial information Forming vocabularies from pairs of nearby features doublets

More information

CSE 158. Web Mining and Recommender Systems. Midterm recap

CSE 158. Web Mining and Recommender Systems. Midterm recap CSE 158 Web Mining and Recommender Systems Midterm recap Midterm on Wednesday! 5:10 pm 6:10 pm Closed book but I ll provide a similar level of basic info as in the last page of previous midterms CSE 158

More information

Machine Learning. Topic 5: Linear Discriminants. Bryan Pardo, EECS 349 Machine Learning, 2013

Machine Learning. Topic 5: Linear Discriminants. Bryan Pardo, EECS 349 Machine Learning, 2013 Machine Learning Topic 5: Linear Discriminants Bryan Pardo, EECS 349 Machine Learning, 2013 Thanks to Mark Cartwright for his extensive contributions to these slides Thanks to Alpaydin, Bishop, and Duda/Hart/Stork

More information

Machine Learning Techniques for Data Mining

Machine Learning Techniques for Data Mining Machine Learning Techniques for Data Mining Eibe Frank University of Waikato New Zealand 10/25/2000 1 PART VII Moving on: Engineering the input and output 10/25/2000 2 Applying a learner is not all Already

More information

COMPUTATIONAL INTELLIGENCE SEW (Introduction to Machine Learning), SS18. Lecture 6: k-NN, Cross-validation, Regularization
Learning methods: lazy vs. eager learning. Eager learning generalizes training data before
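
Tying that lecture's three topics together: a minimal k-NN classifier with K selected by cross-validation might look like the sketch below (scikit-learn assumed; the dataset and candidate K values are placeholders). k-NN is the canonical lazy learner, storing the training set and deferring all computation to prediction time.

    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)

    # Larger K acts as a regularizer: it smooths the decision boundary,
    # trading variance for bias. Pick K by 5-fold cross-validation.
    scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
              for k in (1, 3, 5, 7, 9, 15)}
    best_k = max(scores, key=scores.get)
    print(f"best k = {best_k}, cross-validated accuracy = {scores[best_k]:.3f}")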

Instance-Based Learning: Nearest neighbor and kernel regression and classification
Emily Fox, University of Washington, February 3, 2017. Simplest approach: nearest neighbor regression. Fit locally to each
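
The "simplest approach" named in that excerpt fits in a few lines. This sketch (plain numpy, synthetic data of my own construction) predicts by copying the target of the nearest training point; the kernel-regression variant would replace the argmin with a distance-weighted average of nearby targets.

    import numpy as np

    def nn_regress(X_train, y_train, x):
        # 1-NN regression: return the target of the training point closest to x.
        dists = np.linalg.norm(X_train - x, axis=1)
        return y_train[np.argmin(dists)]

    rng = np.random.default_rng(0)
    X_train = rng.uniform(0, 10, size=(50, 1))
    y_train = np.sin(X_train[:, 0]) + rng.normal(0, 0.1, size=50)
    print(nn_regress(X_train, y_train, np.array([3.0])))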

Image Classification pipeline (Lecture 2)
Administrative: for questions about midterm, poster session, projects, etc., use Piazza! SCPD students: use your @stanford.edu address to register

Local Features and Bag of Words Models
Computer Vision CS 143, Brown, James Hays, 10/14/11. Slides from Svetlana Lazebnik, Derek Hoiem, Antonio Torralba, David Lowe, Fei-Fei Li and others. Computer Engineering

What is machine learning?
Machine learning, pattern recognition and statistical data modelling, Lecture 12 (the last lecture), Coryn Bailer-Jones. What is machine learning? Data description and interpretation: finding simpler relationship

CS178: Machine Learning and Data Mining. Complexity & Nearest Neighbor Methods
Prof. Erik Sudderth; some materials courtesy Alex Ihler & Sameer Singh. Machine Learning. Complexity and Overfitting. Nearest

Structured Models in Computer Vision
Dan Huttenlocher, June 2010. Structured models: problems where output variables are mutually dependent or constrained, e.g., spatial or temporal relations. Such dependencies

CPSC 340: Machine Learning and Data Mining. Multi-Class Classification, Fall 2017
Assignment 3 admin: check the update thread on Piazza for the correct definition of trainndx. This could make your cross-validation

Segmentation: Bottom-up Segmentation, Semantic Segmentation
Semantic labeling of street scenes. Ground-truth labels: 11 classes, almost all occur simultaneously, large changes in viewpoint and scale; sky, road,

Adaptive Binary Quantization for Fast Nearest Neighbor Search
IBM Research. Zhujin Li (1), Xianglong Liu (1*), Junjie Wu (1), and Hao Su (2). (1) Beihang University, Beijing, China; (2) Stanford University, Stanford,

Computer Vision. Exercise Session 10: Image Categorization
Object categorization task description: given a small number of training images of a category, recognize a priori unknown instances of that category

An Exploration of Computer Vision Techniques for Bird Species Classification
Anne L. Alter, Karen M. Wang, December 15, 2017. Abstract: bird classification, a fine-grained categorization task, is a complex

CSC2515 Fall 2015: Introduction to Machine Learning. Lecture 9: Support Vector Machines
All lecture slides will be available at http://www.cs.toronto.edu/~urtasun/courses/csc2515/CSC2515_Winter15.html. Many

Instance-based Learning
Machine Learning 10-701/15-781, Carlos Guestrin, Carnegie Mellon University, February 19th, 2007. Why not just use Linear Regression?

Supervised Segmentation: Pixel Classification
Example: a classification problem. Categorize images of fish, say, Atlantic salmon vs. Pacific salmon, using features such as length, width, lightness, fin shape
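
Once each fish image is summarized by a feature vector, the salmon problem is ordinary supervised classification. A sketch using a nearest-class-mean rule; the feature values below are invented for illustration, not measured data, and in practice the features would be normalized so length does not dominate the distance.

    import numpy as np

    # One row per fish image: [length_cm, width_cm, lightness, fin_shape_score].
    X = np.array([[70, 12, 0.8, 0.3],   # Atlantic
                  [68, 11, 0.7, 0.4],   # Atlantic
                  [55, 10, 0.4, 0.7],   # Pacific
                  [58,  9, 0.5, 0.8]])  # Pacific
    y = np.array([0, 0, 1, 1])          # 0 = Atlantic salmon, 1 = Pacific salmon

    # Nearest class mean: assign a new fish to the class whose mean feature
    # vector is closest.
    means = np.array([X[y == c].mean(axis=0) for c in (0, 1)])

    def classify(x):
        return int(np.argmin(np.linalg.norm(means - x, axis=1)))

    print(classify(np.array([60, 10, 0.45, 0.75])))   # -> 1 (Pacific)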

Beyond Bags of Features: Spatial Information & Shape Models
Jana Kosecka. Many slides adapted from S. Lazebnik, Fei-Fei Li, Rob Fergus, and Antonio Torralba. Detection, recognition (so far): bags of features

Perception IV: Place Recognition, Line Extraction
Davide Scaramuzza, University of Zurich; Margarita Chli, Paul Furgale, Marco Hutter, Roland Siegwart. Outline of today's lecture: place recognition using

Attributes and More Crowdsourcing
Computer Vision CS 143, Brown, James Hays. Many slides from Derek Hoiem. Recap: human computation. Active learning: let the classifier tell you where more annotation is needed.
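
"Let the classifier tell you where more annotation is needed" is commonly realized as uncertainty sampling: query the unlabeled example whose predicted class probabilities are least confident. A small sketch under that assumption (scikit-learn assumed; the model and pool names are placeholders, not the slides' code):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def query_index(model, X_pool):
        # Uncertainty sampling: pick the pool example whose highest class
        # probability is smallest, i.e. where the model is least sure.
        probs = model.predict_proba(X_pool)
        return int(np.argmin(probs.max(axis=1)))

    # Usage sketch: fit on the labeled set, then ask a human to label X_pool[i].
    # model = LogisticRegression().fit(X_labeled, y_labeled)
    # i = query_index(model, X_pool)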

CS6670: Computer Vision
Noah Snavely, Lecture 16: Bag-of-words models. Announcements: Project 3 (Eigenfaces) due Wednesday, November 11 at 11:59pm, solo project. Final project presentations:

Part-based models for recognition
Kristen Grauman, UT Austin. Limitations of window-based models: not all objects are box-shaped; assuming a specific 2D view of the object; local components themselves do not necessarily

Applied Data Mining: Introduction
Peter Orbanz. Course webpage: http://stat.columbia.edu/~porbanz/un3106s18.html. What to expect: this class is an introduction to machine learning. Topics: classification; learning; basic neural

Part-Based Models for Object Class Recognition, Part 3
High Level Computer Vision. Bernt Schiele (schiele@mpi-inf.mpg.de), Mario Fritz (mfritz@mpi-inf.mpg.de). http://www.d2.mpi-inf.mpg.de/cv. State-of-the-Art

Machine Learning in the Wild: Dealing with Messy Data
Rajmonda S. Caceres, SDS 293, Smith College, October 30, 2017. Analytical chain, from data to actions: data collection, data cleaning/preparation, analysis