Vector Space Models: Theory and Applications


Alexander Panchenko
Centre de traitement automatique du langage (CENTAL), Université catholique de Louvain
FLTR 2620 Introduction au traitement automatique du langage
8 December 2010
FLTR2620 - Vector-Space Models 1/55

Plan
1 Vector Algebra Basics
2 Vector Space Model
3 Applications of the Vector Space Models
4 References and Further Reading


Vector Space
A set of elements x_1, x_2, x_3, ... is called a vector space L if it is closed under the operations of vector addition and scalar multiplication. The elements of this set are called vectors. The following conditions must hold for all x_1, x_2, x_3 in L and scalars α, β:
1 Commutativity of vector addition: x_1 + x_2 = x_2 + x_1.
2 Associativity of vector addition: (x_1 + x_2) + x_3 = x_1 + (x_2 + x_3).
3 Additive identity: for all x, 0 + x = x + 0 = x.
4 Existence of additive inverse: for any x, there exists a -x such that x + (-x) = 0.
5 Associativity of scalar multiplication: α(βx) = (αβ)x.
6 Distributivity over scalar sums: (α + β)x = αx + βx.
7 Distributivity over vector sums: α(x_1 + x_2) = αx_1 + αx_2.
8 Scalar multiplication identity: 1x = x.


Euclidean Space
Euclidean n-dimensional space R^n is a vector space where (1) the scalars are real numbers, (2) every element is represented by an n-tuple of real numbers, (3) addition is componentwise, and (4) scalar multiplication is applied to each component separately. A scalar α is an element of the field of real numbers R: α in R, for example α = 3.14, β = 5.25, γ = 1.45.

Euclidean Space: Vectors
A vector x is an n-tuple of real numbers, an element of n-dimensional Euclidean space R^n:
x = (x_1, x_2, ..., x_n)^T in R^n = R × R × ... × R,
for example
x_1 = (3.14, 5.25, 1.45)^T in R^3, x_2 = (3.14, 5.25, 1.45, 5.33, 6.44)^T in R^5.

Euclidean Space: Column and Row Vectors
By default, vectors are column vectors:
x = (x_1, x_2, x_3)^T.
The transpose of a column vector is a row vector:
x^T = (x_1, x_2, x_3).

Euclidean Space: Vector Addition, Scalar Multiplication
Vector addition is componentwise:
x_1 + x_2 = (x_11 + x_21, x_12 + x_22, ..., x_1n + x_2n)^T,
for example
x_1 = (3.14, 5.25, 1.45)^T, x_2 = (1.45, 5.25, 3.14)^T, x_1 + x_2 = (4.59, 10.50, 4.59)^T.
Multiplication of a vector x by a scalar α:
αx = (αx_1, αx_2, ..., αx_n)^T,
for example α = 2, x = (3.14, 5.25, 1.45)^T, αx = (6.28, 10.50, 2.90)^T.
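The componentwise operations above can be sketched in Python with NumPy (an assumed tool, not part of the lecture), reproducing the slide's numbers:

```python
import numpy as np

x1 = np.array([3.14, 5.25, 1.45])
x2 = np.array([1.45, 5.25, 3.14])

vector_sum = x1 + x2   # componentwise addition: (4.59, 10.50, 4.59)
scaled = 2 * x1        # scalar multiplication:  (6.28, 10.50, 2.90)
```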

Geometrical Interpretation [figure omitted]

Euclidean Space: Dot Product, Vector Norm
The dot (inner) product of two vectors:
x_1 · x_2 = x_11 x_21 + x_12 x_22 + ... + x_1n x_2n = Σ_{i=1}^n x_1i x_2i,
for example
x_1 = (3.14, 5.25, 1.45)^T, x_2 = (1.45, 5.25, 3.14)^T, x_1 · x_2 = 4.55 + 27.56 + 4.55 = 36.66.
The Euclidean norm of a vector:
‖x‖ = √(x · x) = √(Σ_{i=1}^n x_i^2),
for example
‖x_1‖ = √(3.14^2 + 5.25^2 + 1.45^2) = √(9.86 + 27.56 + 2.10) ≈ 6.28.

Euclidean Space: Cosine
The cosine between two vectors:
cos(x_1, x_2) = (x_1 · x_2) / (‖x_1‖ ‖x_2‖),
for example
x_1 = (3.14, 5.25, 1.45)^T, x_2 = (0, 0, 1)^T, cos(x_1, x_2) = (0 + 0 + 1.45) / (6.28 · 1) ≈ 0.23 (≈ 77°).
The cosine is defined in terms of the vector norm and the inner product. Therefore, every linear space with an inner product defines a cosine between vectors.
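The dot product, norm, and cosine can be verified in NumPy (an assumed tool); the values reproduce the slide's example to the lecture's two decimals:

```python
import numpy as np

def cosine(a, b):
    """cos(a, b) = (a . b) / (||a|| * ||b||)"""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

x1 = np.array([3.14, 5.25, 1.45])
x2 = np.array([0.0, 0.0, 1.0])

dot = x1 @ x2                           # 1.45
cos = cosine(x1, x2)                    # about 0.23
angle_deg = np.degrees(np.arccos(cos))  # about 77 degrees
```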

Geometrical Interpretation
The length of a vector a is its norm ‖a‖. The length of the projection of a vector a onto a unit vector i equals:
a_x = ‖a‖ cos(a, i) = ‖a‖ (a · i) / (‖a‖ ‖i‖) = a · i.
[figure omitted]


Linear Independence
Linear combination: a linear combination of k vectors is an expression of the form
α_1 x_1 + α_2 x_2 + ... + α_k x_k,
where α_1, α_2, ..., α_k in R are scalars.
Linearly dependent and independent vectors: vectors x_1, x_2, ..., x_k are linearly dependent iff there exist scalars α_1, α_2, ..., α_k, not all zero, such that
α_1 x_1 + α_2 x_2 + ... + α_k x_k = 0.
If no such scalars exist, the vectors are said to be linearly independent.

Basis
A basis of a vector space L is a subset b_1, b_2, ..., b_n of vectors in L such that the basis vectors are linearly independent and every vector x in L can be represented as a linear combination of them: for every x in L there exist α_1, α_2, ..., α_n in R such that
x = α_1 b_1 + α_2 b_2 + ... + α_n b_n.
Uniqueness of representation: a vector x in L has exactly one such representation with respect to a given basis.

Standard Basis
The standard basis of a Euclidean space consists of one unit vector pointing in the direction of each axis of the Cartesian coordinate system. The standard basis of three-dimensional Euclidean space R^3 consists of the following three orthogonal vectors of unit length:
i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1).
The standard basis of n-dimensional Euclidean space R^n is the set of vectors:
b_1 = (1, 0, 0, 0, ..., 0)
b_2 = (0, 1, 0, 0, ..., 0)
...
b_n = (0, 0, 0, 0, ..., 1).


Matrix
An m × n matrix X is a rectangular array of scalars x_ij in R, X = [x_ij] in R^{m×n}, for example
X =
1.12 0.55 0.58 0.23
5.52 0.03 1.96 0.03
0.37 0.78 2.02 0.03
in R^{3×4}.
A matrix X with m rows and n columns can be viewed as a set of m row vectors or as a set of n column vectors:
X = (x_1, x_2, ..., x_m)^T or X = (x_1, x_2, ..., x_n).

Matrix Operations
Matrix addition C = A + B is elementwise: c_ij = a_ij + b_ij.
Multiplication of a matrix by a scalar C = αA is applied to each element separately: c_ij = αa_ij.
The Euclidean (Frobenius) norm of a matrix equals
‖A‖ = √(Σ_{i=1}^m Σ_{j=1}^n a_ij^2).
The transpose A^T is the matrix obtained by exchanging the rows and columns of A: a^T_ij = a_ji.
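A minimal NumPy sketch of these elementwise operations (NumPy assumed; the matrix values are invented for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

C_add = A + B               # elementwise: c_ij = a_ij + b_ij
C_scaled = 0.5 * A          # scalar multiplication on each element
norm_A = np.linalg.norm(A)  # Frobenius norm: sqrt(1 + 4 + 9 + 16) = sqrt(30)
A_T = A.T                   # transpose: rows and columns exchanged
```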

Matrix Product: Coordinate Form
Let A = [a_ij] in R^{m×n} and B = [b_ij] in R^{n×k}. The product C = AB is defined as follows:
c_ij = Σ_{l=1}^n a_il b_lj = a_i · b_j.
Matrix multiplication is defined only if the dimensions of the matrices A and B are compatible:
[m × k] = [m × n] [n × k].

Matrix Product: Vector Form
The row-by-column method: represent A as a set of m row vectors a_1, ..., a_m, and B as a set of k column vectors b_1, ..., b_k. Then, if C = AB, the element c_ij of C is the inner product of the i-th row of A and the j-th column of B:
c_ij = a_i · b_j, i = 1, ..., m, j = 1, ..., k.

Matrix Multiplication: Vector Form [figure omitted]

Matrix Product: Example
For example, let
A =
2 4 6
5 7 1
2 3 5
and B =
4 1
0 2
5 1
The dimensions of the matrices agree, so the matrix product is defined:
[3 × 2] = [3 × 3] [3 × 2].
The matrix product equals
C = AB =
(2·4 + 4·0 + 6·5) (2·1 + 4·2 + 6·1)
(5·4 + 7·0 + 1·5) (5·1 + 7·2 + 1·1)
(2·4 + 3·0 + 5·5) (2·1 + 3·2 + 5·1)
=
38 16
25 20
33 13
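The example can be checked with NumPy's matrix product (NumPy assumed):

```python
import numpy as np

A = np.array([[2, 4, 6],
              [5, 7, 1],
              [2, 3, 5]])
B = np.array([[4, 1],
              [0, 2],
              [5, 1]])

# Each c_ij is the inner product of row i of A with column j of B,
# e.g. c_11 = 2*4 + 4*0 + 6*5 = 38.
C = A @ B   # shapes: (3, 3) x (3, 2) -> (3, 2)
```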

Properties of Matrix Product
Matrix multiplication is associative: A(BC) = (AB)C.
Matrix multiplication is distributive over matrix addition: A(B + C) = AB + AC.
The matrix product is compatible with scalar multiplication: α(AB) = (αA)B = A(αB).
Matrix multiplication is NOT commutative: in general, AB ≠ BA.

Matrix Factorization
Singular Value Decomposition (SVD) is a factorization of a rectangular m × n matrix A such that
A = UDV^T,
where U is an m × m matrix and V is an n × n matrix. These matrices are composed of orthogonal column vectors: U^T U = I, V^T V = I. The m × n matrix D has nonnegative real numbers along its diagonal, called singular values.
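A sketch of the factorization A = UDV^T with NumPy (NumPy assumed; the matrix A below is invented for illustration):

```python
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

U, s, Vt = np.linalg.svd(A)       # s: singular values, in descending order
D = np.zeros_like(A)              # 2 x 3 matrix of zeros
D[:len(s), :len(s)] = np.diag(s)  # place the singular values on the diagonal

reconstructed = U @ D @ Vt        # equals A up to floating-point error
```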


Main Characteristics of the Vector Space Model
The Vector Space Model (VSM) calculates similarity between m homogeneous objects O = {o_1, o_2, ..., o_m}. The model represents an object o as a vector (point) x in an n-dimensional Euclidean space R^n. Every dimension of the vector space corresponds to a feature of an object. The set of all objects is represented by a feature matrix X:
X = (x_1, x_2, ..., x_m)^T = [x_ij] in R^{m×n}.
The similarity between objects is modeled in terms of the spatial distance between the vectors (points).

Vector Space Model
Formally, a Vector Space Model can be represented as a quadruple ⟨A, B, S, M⟩, where:
B is a set b_1, ..., b_n of basis elements that determine the dimensionality of the space and the interpretation of each dimension.
A is the weighting function A : R^n → R^n. It takes as input a vector x representing an object o and returns its normalized version.
S is a similarity function S : R^n × R^n → [0, 1] that maps pairs of vectors onto a scalar representing the measure of their similarity.
M is a transformation that takes one vector space L and maps it onto another vector space L′, in order to reduce dimensionality.
The vector space model is sometimes called a semantic space model.


Interpretation: Basis Elements and Objects
Basis elements b_1, ..., b_n define the interpretation of each dimension; they correspond to the standard basis vectors. The type of the objects defines the interpretation of each vector represented in a VSM. The bag-of-words (BOW) model is a vector space model where the objects are text documents and the basis elements are the words of these documents. [term-document matrix figure omitted] Here b_1 = car, b_2 = auto, b_3 = insurance, b_4 = best, and o_1 = Doc1, o_2 = Doc2, o_3 = Doc3.
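A bag-of-words term-document matrix can be sketched in plain Python. Only the basis terms (car, auto, insurance, best) come from the slide; the documents and counts below are invented for illustration:

```python
docs = {
    "Doc1": "car insurance best car",
    "Doc2": "auto insurance",
    "Doc3": "best car best auto",
}
basis = ["car", "auto", "insurance", "best"]

# x_ij = frequency of basis term j in document i
X = [[doc.split().count(term) for term in basis] for doc in docs.values()]
```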

Interpretation: Feature Matrix
Basis elements (features) can also be lemmas, multi-word expressions, named entities, documents, syntactic dependencies, morphemes, etc.
Term-Document matrix: objects are documents, features are the words of the documents. Problems: information retrieval, text categorization and clustering.
Term-Term matrix: objects are terms, features are context words / words from a dictionary definition. Problems: computational lexical semantics, distributional analysis.
Term Senses-Terms matrix: objects are word senses, features are words. Problem: word sense disambiguation.
Term-Syntactic Dependencies matrix: objects are terms, features are syntactic dependencies of a term. Problem: computational lexical semantics.
...


Weighting Function
The weighting function A : R^n → R^n takes as input a vector x representing an object o and returns its normalized version. Weighting is used to adapt a feature value according to its actual importance.
Identity function (trivial): A(x) = x.
Logarithmic weighting: A(x_ij) = 1 + log(x_ij), for x_ij > 0.
Length-normalization with the Euclidean norm: A(x) = x / ‖x‖.
Conversion to a probability distribution: A(x_ij) = p(i, j) = x_ij / Σ_{j=1}^n x_ij = x_ij / ‖x_i‖_{L1}.

Weighting Function (continued)
Entropy weighting:
A(x_ij) = x_ij (1 + Σ_{k=1}^n p_ik log(p_ik) / log(n)), where p_ik = x_ik / Σ_{l=1}^n x_il.
Pointwise Mutual Information:
A(x_ij) = log( p(i, j) / (p(i) p(j)) ).
TF-IDF (Term Frequency - Inverse Document Frequency):
A(x_ij) = (x_ij / Σ_{k=1}^n x_ik) × log( m / |{x_lj > 0, l = 1, ..., m}| ),
where the first factor is the TF and the second is the IDF.
...
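The TF-IDF scheme above can be sketched in NumPy (NumPy assumed; the matrix values are invented): rows are objects (documents), columns are features (terms), and df_j counts the rows with a nonzero j-th entry.

```python
import numpy as np

X = np.array([[2.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 2.0]])
m = X.shape[0]                            # number of documents

tf = X / X.sum(axis=1, keepdims=True)     # row-normalized term frequency
df = (X > 0).sum(axis=0)                  # document frequency of each term
idf = np.log(m / df)
weighted = tf * idf                       # A(x_ij) = tf_ij * idf_j
```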

Weighting Function: Example
Consider a term-document matrix X, where x_ij is a term frequency [matrix figure omitted]. Let us normalize its rows with the Euclidean norm, e.g. for x_Doc1 = (27, 3, 0, 14)^T:
x_Doc1 / ‖x_Doc1‖ = (27, 3, 0, 14)^T / √(27^2 + 3^2 + 0^2 + 14^2) = (27, 3, 0, 14)^T / 30.56 = (0.88, 0.10, 0, 0.46)^T.
Normalizing every row, we obtain the normalized term-document matrix [figure omitted].
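The Doc1 computation can be reproduced with NumPy (an assumed tool); the result is a unit-length vector:

```python
import numpy as np

x = np.array([27.0, 3.0, 0.0, 14.0])
x_norm = x / np.linalg.norm(x)  # norm = sqrt(729 + 9 + 0 + 196) ~ 30.56
# x_norm is approximately (0.88, 0.10, 0.00, 0.46)
```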


Similarity Function
A similarity function S(x, y) defines a measure of similarity of two vectors x, y in R^n. It should satisfy the following properties for any vectors x, y:
Non-negativity: S(x, y) ≥ 0.
Maximality: S(x, x) ≥ S(x, y).
Symmetry: S(x, y) = S(y, x).

Distance Function
A distance (dissimilarity) function D(x, y) defines the distance between two vectors x, y in R^n. It should satisfy the following properties for any vectors x, y, z:
Non-negativity: D(x, y) ≥ 0.
Identity of indiscernibles: D(x, y) = 0 iff x = y.
Symmetry: D(x, y) = D(y, x).
Triangle inequality: D(x, z) ≤ D(x, y) + D(y, z).

Converting Distance to Similarity
A distance measure between two vectors x, y in R^n can be converted to a similarity measure between them as follows:
S(x, y) = 1 - D(x, y), if S(x, y) in [0, 1];
S(x, y) = 1 - 2D(x, y), if S(x, y) in [-1, +1].

Some Similarity and Distance Functions
Minkowski distance (L_q distance): D(x, y) = (Σ_{i=1}^n |x_i - y_i|^q)^{1/q}.
Euclidean distance (L_2 distance): D(x, y) = √(Σ_{i=1}^n (x_i - y_i)^2) = ‖x - y‖.
Manhattan or city-block distance (L_1 distance): D(x, y) = Σ_{i=1}^n |x_i - y_i|.

Some Similarity and Distance Functions (continued)
Jaccard similarity: S(x, y) = Σ_{i=1}^n min(x_i, y_i) / Σ_{i=1}^n max(x_i, y_i).
Dice similarity: S(x, y) = 2 Σ_{i=1}^n min(x_i, y_i) / Σ_{i=1}^n (x_i + y_i).
Cosine similarity: S(x, y) = x · y / (‖x‖ ‖y‖).
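The three similarity functions can be sketched in NumPy (an assumed tool) for nonnegative vectors; the test vectors are invented:

```python
import numpy as np

def jaccard(x, y):
    return np.minimum(x, y).sum() / np.maximum(x, y).sum()

def dice(x, y):
    return 2 * np.minimum(x, y).sum() / (x + y).sum()

def cosine(x, y):
    return x @ y / (np.linalg.norm(x) * np.linalg.norm(y))

x = np.array([1.0, 2.0, 0.0])
y = np.array([1.0, 1.0, 1.0])
# jaccard: min = (1,1,0), max = (1,2,1) -> 2/4 = 0.5
# dice:    2*2 / 6 = 2/3
```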


Transformation: Dimensionality Reduction
The transformation M takes a vector space L and maps it onto another vector space L′ in order to reduce dimensionality, so that dim(L′) ≪ dim(L). The goal of dimensionality reduction is to find a smaller number of uncorrelated or weakly correlated dimensions. Reasons for dimensionality reduction:
The VSM assumes independence of the dimensions. In practice, some dimensions are linear combinations of other dimensions: synonyms, spelling variants, etc.
High computational complexity in a high-dimensional space.
It can help discover latent structure in the data.

Transformation: Dimensionality Reduction (continued)
Simple dimensionality reduction can be done at the preprocessing stage: removing stop words, rare dimensions, etc. In addition, feature matrix factorization methods can be used for dimensionality reduction:
Truncated Singular Value Decomposition (SVD)
Non-Negative Matrix Factorization (NMF)
...
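Truncated SVD keeps only the k largest singular values, giving a rank-k approximation X ≈ U_k D_k V_k^T. A NumPy sketch (NumPy assumed; the matrix is invented for illustration):

```python
import numpy as np

X = np.array([[2.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 2.0]])
k = 2

U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # rank-k approximation of X

# Rows of U[:, :k] * s[:k] give the objects' coordinates
# in the reduced k-dimensional space.
reduced = U[:, :k] * s[:k]
```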

Truncated Singular Value Decomposition [figure omitted]

Various Applications of the Vector Space Models
1 Information Retrieval
2 Computational Lexical Semantics
3 Word Sense Disambiguation
4 Other Applications

Information Retrieval
Problem formulation: given a user query q, find the k most relevant documents {d_1, ..., d_k} from a collection of n documents {d_1, ..., d_n}.
A: TF-IDF
B: terms from all documents
O: documents
S: cosine similarity
M: truncated SVD (Latent Semantic Indexing)
Documents are represented as vectors in the bag-of-words space. The user's text query is represented as a vector in the same space as the documents.

Information Retrieval
Let the search query be q = car; it is then represented as the following vector: q = (1, 0, 0, 0).
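The retrieval setup can be sketched end-to-end in NumPy (an assumed tool): length-normalize the term-document matrix and rank documents by cosine similarity to the query. Doc1's counts echo the weighting example earlier; the other rows are invented.

```python
import numpy as np

X = np.array([[27.0, 3.0, 0.0, 14.0],   # Doc1
              [4.0, 33.0, 33.0, 0.0],   # Doc2
              [24.0, 0.0, 29.0, 17.0]]) # Doc3
q = np.array([1.0, 0.0, 0.0, 0.0])      # query "car" in the BOW space

X_hat = X / np.linalg.norm(X, axis=1, keepdims=True)  # length-normalize rows
scores = X_hat @ (q / np.linalg.norm(q))              # cosine per document
ranking = np.argsort(-scores)                         # best match first
```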

Computational Lexical Semantics
Problem formulation: given a term t, find the k most semantically similar terms {t_1, ..., t_k} from a vocabulary of n terms {t_1, ..., t_n}.
A: Pointwise Mutual Information
B: words / terms / syntactic contexts
O: terms
S: cosine similarity / Kullback-Leibler divergence
M: truncated SVD (Latent Semantic Analysis) / Non-Negative Matrix Factorization
Distributional hypothesis (Harris): terms are semantically similar if they appear within similar context windows.

Computational Lexical Semantics [figure omitted]

Word Sense Disambiguation
Problem formulation: given a word occurrence w, find its sense among the k possible senses {s_1, ..., s_k}.
A: identity function / length-normalization
B: words / terms
O: term senses
S: inner product (simplified Lesk)
M: none
Term senses are represented as vectors in the bag-of-words space of the dictionary definitions. The term is represented as a vector in the same space as the term senses.
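Simplified Lesk in this VSM reading: each sense is the bag of words of its dictionary definition, and the chosen sense maximizes the inner product with the context vector. The glosses and context below are made up for illustration:

```python
from collections import Counter

senses = {
    "bank/river": "sloping land beside a body of water",
    "bank/finance": "an institution that accepts deposits and lends money",
}
context = "he deposited the money at the bank near his institution"

def disambiguate(context, senses):
    ctx = Counter(context.split())  # context vector in the BOW space
    # inner product of the context vector with each sense's gloss vector
    scores = {s: sum(ctx[w] * c for w, c in Counter(gloss.split()).items())
              for s, gloss in senses.items()}
    return max(scores, key=scores.get)
```

Here the context shares "money" and "institution" with the finance gloss and nothing with the river gloss, so the finance sense wins.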

Some Other Applications
Named Entity Disambiguation
Text Document Clustering
Text Document Categorization
Collaborative Recommendations
...

References
Berry, M. W. and Browne, M. (2005). Understanding Search Engines: Mathematical Modeling and Text Retrieval (Software, Environments, Tools). SIAM, Society for Industrial and Applied Mathematics, second edition.
Berry, M. W., Drmac, Z., and Jessup, E. R. (1999). Matrices, vector spaces, and information retrieval. SIAM Review, 41:335-362.
Lowe, W. Towards a theory of semantic space.
Manning, C. D., Raghavan, P., and Schütze, H. (2008). Introduction to Information Retrieval. Cambridge University Press, first edition.
Van de Cruys, T. (2010). Mining for Meaning. The Extraction of Lexicosemantic Knowledge from Text.

Acknowledgments
Some illustrations in this presentation were borrowed from [Manning et al., 2008], [Van de Cruys, 2010], and Wikipedia. I would like to thank the authors of these figures.