PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks


1 PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks. Pramod Srinivasan, CS591txt - Text Mining Seminar, University of Illinois, Urbana-Champaign. April 8, 2016.

2 Outline: Motivation; Introduction (The Data Sparseness Problem); Related Work (Overview, Information Network Embedding); Predictive Text Embedding (The Text Representation, Heterogeneous Text Network Embedding, Bipartite Network Embedding, Text Embedding); Experiments (Statistics of the Dataset, Results, Results Analysis); Conclusion.

3-7 Motivation: learning a meaningful and effective representation of text is a critical precursor to several machine learning tasks. PTE's goals: capture semantic relatedness between words, and avoid issues such as data sparsity, polysemy, and synonymy.

9 Solving the Problem of Data Sparseness: Text Embedding. Represent words and documents in a low-dimensional space, so that words and documents with similar meanings are embedded close to each other.

11-15 Unsupervised Text Embedding: CBOW (Mikolov et al. 2013), Skip-Gram (Mikolov et al. 2013), Paragraph Vector (Le and Mikolov 2014). Pros: simple yet scalable models; insensitive to parameter choices; can leverage a large amount of unlabeled data, and the embeddings are general across tasks. Cons: fully unsupervised, not tuned for specific tasks.

16-17 Supervised Learning Models: (deep) neural networks such as Recurrent Neural Networks (Mikolov et al. 2010), Recursive Neural Networks (Socher et al. 2012), and Convolutional Neural Networks (Kim 2014). Pros: state-of-the-art performance on specific tasks. Cons: computationally expensive; require a large amount of labeled data and are hard to apply to unlabeled data; very sensitive to parameters and difficult to tune.

18-19 Information Network Embedding: embedding one instance of a mathematical structure within another instance. Words that are used together with many similar words are likely to have similar meanings.

20-23 The Ideal Embedding Model: a reduced parameter space; improved generalization; the ability to transfer or share knowledge between entities.

24-29 LINE: Intuition. (A sequence of figure slides illustrating the intuition behind LINE; the figures are not transcribed.)

30-34 Large-scale Information Network Embedding (LINE): extends the embedding idea to general information networks; more specifically, it maps the vertices of a graph to vectors. LINE preserves the first-order and second-order proximity between vertices. First-order proximity: the observed link between two vertices. Second-order proximity: the similarity between their neighborhood structures (see the toy sketch below).
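To make the two proximities concrete, here is a toy illustration (mine, not part of LINE itself): first-order proximity is just the observed edge weight, while a simple assumed proxy for second-order proximity is the cosine similarity between two vertices' rows of the adjacency matrix, i.e., between their neighborhoods.

```python
import numpy as np

# Toy graph: first-order proximity is the observed edge weight itself; as a
# simple stand-in for second-order proximity we compare neighborhood vectors
# (rows of the adjacency matrix) by cosine similarity.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

def second_order(a, b):
    na, nb = A[a], A[b]
    return na @ nb / (np.linalg.norm(na) * np.linalg.norm(nb))

print("first-order(0, 1):", A[0, 1])              # direct link -> 1.0
print("second-order(0, 1):", second_order(0, 1))  # shared neighbors -> 0.5
```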

36-37 Predictive Text Embedding (PTE): keep the advantages of unsupervised text embedding approaches while naturally utilizing the labeled data for specific tasks. How can unsupervised and supervised information be represented uniformly? With a heterogeneous text network capturing different levels of word co-occurrence: a word-word network, a word-document network, and a word-label network.

38-39 Converting Text Corpora; Partially Labeled Text Corpora. (Figure slides; the diagrams converting a partially labeled corpus into networks are not transcribed.)

40-46 Word-Word and Word-Document Networks: both encode unsupervised information. The word-document network captures the document-level co-occurrences exploited by topic models such as LDA; the word-word network captures the local context-level co-occurrences exploited by Skip-Gram. The weight between a word and a document is the word's frequency in that document; the weight between two words is the number of times they co-occur within a given window size.

47-50 Word-Label Network: whereas the word-word and word-document networks encode unsupervised information, the word-label network encodes the supervised information.

51-54 Heterogeneous Text Network: the union of three bipartite networks, the word-word (word-context), word-document, and word-label networks. It encodes different levels of word co-occurrence and contains both supervised and unsupervised information. By embedding the heterogeneous text network we obtain robust word embeddings optimized for a specific task (a construction sketch follows).
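As a concrete sketch of how such a network could be built (an illustration under assumptions, not the authors' code), the weight definitions above translate directly into three counters, one per bipartite network; for word-label weights we use the total count of a word across documents carrying the label, following the convention of the PTE paper.

```python
from collections import Counter

# Build the three bipartite networks of the heterogeneous text network from a
# tiny, partially labeled toy corpus. Weights follow the definitions above:
# word-word = co-occurrences within a window, word-document = term frequency,
# word-label = word count summed over documents with that label (assumed).
def build_networks(docs, labels, window=2):
    ww, wd, wl = Counter(), Counter(), Counter()
    for d, tokens in enumerate(docs):
        for i, w in enumerate(tokens):
            wd[(w, d)] += 1                           # word-document frequency
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                ww[(w, tokens[j])] += 1               # word-word within window
                ww[(tokens[j], w)] += 1               # keep edges symmetric
            if labels[d] is not None:                 # None = unlabeled document
                wl[(w, labels[d])] += 1               # supervised word-label edge
    return ww, wd, wl

docs = [["deep", "learning", "for", "text"], ["text", "embedding", "models"]]
ww, wd, wl = build_networks(docs, labels=["ml", None])
print(ww[("deep", "learning")], wd[("text", 1)], wl[("text", "ml")])  # 1 1 1
```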

55-57 Bipartite Network Embedding: what is the ideal proximity measure? Hint: it's either first-order or second-order. (PTE adopts second-order proximity, defined next.)

58 Bipartite Network Embedding: Second-Order Proximity. For each edge (v_i, v_j), define the conditional probability

$$p_2(v_j \mid v_i) = \frac{\exp(\vec{u}_j^{\top} \vec{u}_i)}{\sum_{k=1}^{|V|} \exp(\vec{u}_k^{\top} \vec{u}_i)}, \qquad (1)$$

and minimize the weighted negative log-likelihood

$$O_2 = -\sum_{(i,j) \in E} w_{ij} \log p_2(v_j \mid v_i). \qquad (2)$$
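A minimal numpy sketch of Eqs. (1)-(2), with toy sizes and random data (all names and dimensions here are illustrative):

```python
import numpy as np

# Second-order proximity on a bipartite network: U holds the embeddings of the
# conditioning side, V the embeddings of the side being predicted; `edges`
# lists (i, j, w_ij) triples.
rng = np.random.default_rng(0)
U = rng.normal(size=(4, 8))        # e.g. 4 source vertices, 8-dim embeddings
V = rng.normal(size=(3, 8))        # e.g. 3 target vertices
edges = [(0, 1, 2.0), (2, 0, 1.0), (3, 2, 5.0)]

def p2(j, i):
    """p_2(v_j | v_i): softmax over all target vertices, Eq. (1)."""
    scores = V @ U[i]
    scores -= scores.max()                       # numerical stability
    return np.exp(scores[j]) / np.exp(scores).sum()

O2 = -sum(w * np.log(p2(j, i)) for i, j, w in edges)   # Eq. (2)
print(O2)
```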

59-61 Heterogeneous Text Network Embedding: jointly embed the three bipartite networks by minimizing the objective function

$$O_{pte} = O_{ww} + O_{wd} + O_{wl}, \qquad (3)$$

where

$$O_{ww} = -\sum_{(i,j) \in E_{ww}} w_{ij} \log p(v_i \mid v_j), \qquad (4)$$

$$O_{wd} = -\sum_{(i,j) \in E_{wd}} w_{ij} \log p(v_i \mid d_j), \qquad (5)$$

$$O_{wl} = -\sum_{(i,j) \in E_{wl}} w_{ij} \log p(v_i \mid l_j). \qquad (6)$$
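In code, Eqs. (3)-(6) amount to summing the same bipartite objective three times, with the word embeddings shared across all three terms; that sharing is what couples the networks. A self-contained sketch with toy placeholder data:

```python
import numpy as np

# O_pte = O_ww + O_wd + O_wl over three bipartite networks that share the
# word embeddings W; D and L are document and label embeddings. All sizes,
# embeddings and edge lists below are toy placeholders.
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 8))        # word embeddings (shared by all terms)
D = rng.normal(size=(2, 8))        # document embeddings
L = rng.normal(size=(3, 8))        # label embeddings

def bipartite_objective(word_emb, ctx_emb, edges):
    """-sum_(i,j) w_ij log p(v_i | c_j), softmax over all words: Eqs. (4)-(6)."""
    total = 0.0
    for i, j, w in edges:          # i indexes a word, j a context vertex
        scores = word_emb @ ctx_emb[j]
        scores -= scores.max()     # numerical stability
        total -= w * (scores[i] - np.log(np.exp(scores).sum()))
    return total

O_pte = (bipartite_objective(W, W, [(0, 1, 2.0), (1, 3, 1.0)])   # O_ww
         + bipartite_objective(W, D, [(2, 0, 3.0)])              # O_wd
         + bipartite_objective(W, L, [(4, 1, 1.0)]))             # O_wl
print(O_pte)
```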

62 Optimization: two ways, depending on when the labeled data (the word-label network) is used. Joint training: train on the unlabeled and the labeled data simultaneously. Pre-training + fine-tuning: first jointly train on the G_ww and G_wd networks, then fine-tune the word embeddings with the word-label network.
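The two schedules differ only in which edge lists feed the sampler and when. Below is an illustrative sketch (assumptions throughout, not the authors' implementation): each step samples one edge from one network and applies a sigmoid update with a few negative samples, a simplification of the negative-sampling update PTE actually uses; the real algorithm also samples edges proportionally to their weights rather than uniformly.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

W = rng.normal(0, 0.1, size=(5, 8))    # word embeddings (shared)
D = rng.normal(0, 0.1, size=(2, 8))    # document embeddings
L = rng.normal(0, 0.1, size=(3, 8))    # label embeddings
E_ww = [(0, 1, 2.0), (1, 3, 1.0)]      # toy edge lists: (word, ctx, weight)
E_wd = [(2, 0, 3.0)]
E_wl = [(4, 1, 1.0)]

def step(word_emb, ctx_emb, edges, lr=0.025, k=2):
    """One SGD step on a sampled edge, with k uniform negative samples."""
    i, j, _ = edges[rng.integers(len(edges))]
    pairs = [(i, 1.0)] + [(rng.integers(len(word_emb)), 0.0) for _ in range(k)]
    for word, target in pairs:
        g = lr * (sigmoid(word_emb[word] @ ctx_emb[j]) - target)
        w_old = word_emb[word].copy()  # keep pre-update value for the ctx step
        word_emb[word] -= g * ctx_emb[j]
        ctx_emb[j] -= g * w_old

# Joint training: alternate among all three networks from the start.
for t in range(3000):
    step(*[(W, W, E_ww), (W, D, E_wd), (W, L, E_wl)][t % 3])
# Pre-training + fine-tuning would instead loop over (G_ww, G_wd) first
# and only afterwards take steps on the word-label network G_wl.
```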

63 Learning Word Representations: the heterogeneous text network embedding yields robust word embeddings optimized for specific tasks, capturing different levels of word co-occurrence and encoding both supervised and unsupervised data. Given an arbitrary text piece d = w_1 w_2 ... w_n, with embedding u_i for each word w_i, the vector representation of d is the average

$$\vec{d} = \frac{1}{n} \sum_{i=1}^{n} \vec{u}_i. \qquad (7)$$
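Eq. (7) in code, as a minimal sketch (the lookup table is a toy stand-in for learned PTE embeddings; skipping out-of-vocabulary tokens is one common choice, assumed here):

```python
import numpy as np

# The embedding of a text piece is the average of its word embeddings, Eq. (7).
word_vecs = {"predictive": np.array([0.1, 0.9]),
             "text":       np.array([0.4, 0.2]),
             "embedding":  np.array([0.7, 0.5])}

def embed_text(tokens, dim=2):
    vecs = [word_vecs[t] for t in tokens if t in word_vecs]  # skip OOV tokens
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

print(embed_text(["predictive", "text", "embedding", "rocks"]))
```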

65-68 Evaluating the Effectiveness of PTE. Task: text classification, with the learned embeddings as features and logistic regression as the classifier. Compared algorithms: BOW, the classical bag-of-words representation; unsupervised text embeddings, namely Skip-Gram, Paragraph Vector (PV), and LINE applied to text networks; and predictive text embeddings that incorporate labels, i.e., supervised learning, namely CNN and PTE. A sketch of the pipeline follows.
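The evaluation protocol is straightforward to reproduce in outline: document embeddings become fixed feature vectors for a logistic-regression classifier, scored with micro- and macro-F1. A sketch using random stand-ins for the averaged embeddings:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Random placeholders for averaged document embeddings and binary labels.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 100)), rng.integers(0, 2, size=200)
X_test, y_test = rng.normal(size=(50, 100)), rng.integers(0, 2, size=50)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred = clf.predict(X_test)
print("Micro-F1:", f1_score(y_test, pred, average="micro"))
print("Macro-F1:", f1_score(y_test, pred, average="macro"))
```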

69 Datasets. Long documents: 20NG, Wiki, IMDB, and the RCV1 subsets Corporate, Economics, Government, and Market. Short documents: DBLP, MR, and Twitter.

         20NG    Wiki        IMDB    Corporate  Economics  Government  Market   DBLP    MR      Twitter
Train    11,314  1,911,617*  25,000  245,650    77,242     138,990     132,040  61,479  7,108   800,000
Test     7,532   21,000      25,000  122,827    38,623     69,496      66,020   20,000  3,554   400,000
|V|      89,039  913,881     71,381  141,740    65,254     139,960     64,049   22,270  17,376  405,994

*In the Wiki data set, only 42,000 documents are labeled. (The average document lengths and numbers of classes were lost in transcription.)

70-72 Results of Text Classification on Long Documents: Unsupervised. The table reports Micro-F1 and Macro-F1 on 20NG, Wikipedia, and IMDB for BOW, Skip-gram, PVDBOW, PVDM, LINE(G_ww), LINE(G_wd), and LINE(G_ww + G_wd); the numeric scores were lost in transcription. Key observations: for local context-level word co-occurrences, LINE(G_ww) > Skip-gram; for document-level word co-occurrences, LINE(G_wd) > PV.

73-77 Results on Long Documents: Predictive. The table reports Micro-F1 and Macro-F1 on 20NG, Wikipedia, and IMDB for BOW, CNN, CNN(pretrain), PTE(G_wl), PTE(G_ww + G_wl), PTE(G_wd + G_wl), PTE(pretrain), and PTE(joint); the numeric scores were lost in transcription. Key observations: PTE(joint) > PTE(pretrain); PTE(joint) > PTE(G_wl); PTE(joint) > CNN/CNN(pretrain).

78-80 Results on Short Documents: Unsupervised. The table reports Micro-F1 and Macro-F1 on DBLP, MR, and Twitter for the same unsupervised methods; the numeric scores were lost in transcription. Key observations: for local context-level word co-occurrences, LINE(G_ww) > Skip-gram; for document-level word co-occurrences, LINE(G_wd) > PV; and LINE(G_ww + G_wd) > LINE(G_ww) > LINE(G_wd).

81-86 Results on Short Documents: Predictive. The table reports Micro-F1 and Macro-F1 on DBLP, MR, and Twitter for the same predictive methods; the numeric scores were lost in transcription. Key observations: PTE(joint) > PTE(pretrain); PTE(joint) > PTE(G_wl); PTE(joint) > CNN/CNN(pretrain).

88-90 Analysis: Unsupervised Embedding. Long documents: document-level word co-occurrences are more useful than local context-level word co-occurrences, and no improvement is observed when the two are combined. Short documents: local context-level word co-occurrences are more useful than document-level word co-occurrences, and combining the two further improves the embedding.

91-92 Summary: Predictive Text Embedding adapts the advantages of unsupervised text embedding approaches while naturally incorporating labeled data, encoding unsupervised and supervised information through a large-scale heterogeneous text network. It outperforms or is comparable to sophisticated methods such as CNN: it outperforms CNN on long documents and is comparable to CNN on short documents.

93 Takeaway: given a large collection of text with both unlabeled and labeled information, PTE learns low-dimensional representations of words by embedding the heterogeneous text network constructed from the collection into a low-dimensional vector space.

94 Thank You!
