Ranking Clustered Data with Pairwise Comparisons
Kevin Kowalski

1. INTRODUCTION

Background. Machine learning often relies heavily on being able to rank instances in a large set of data by some measure of relative fitness, but in many settings the only reasonable way to acquire information on the ranking is through making pairwise comparisons between the instances. Oftentimes, directly comparing these instances requires human intervention, which can be expensive in terms of time and other resources. Work by Jamieson and Nowak in [JN11] mitigates this problem by providing an algorithm that minimizes the number of pairwise comparisons necessary to rank a data set, assuming that the set is structured in a certain way. In particular, [JN11] assumes that the set can be embedded in a Euclidean space and that each point's fitness is inversely proportional to its distance from an unknown reference point. The algorithm presented in that paper achieves a full ranking of the sample space with Θ(d log n) comparisons in the average case, where d is the dimensionality of the Euclidean space and n is the number of samples. This is a nontrivial improvement over general-purpose comparison-based sorting algorithms, which require Ω(n log n) pairwise comparisons even in the average case. The work also presents a version of the algorithm that works in a noisy setting, where each query to the comparison oracle has some probability of returning an incorrect result (and keeps returning the same result on further queries). This version returns a full ranking of some subset of the samples that is consistent with all query responses and uses Θ(d log² n) queries in the average case, assuming a constant probability of error.

Contribution. We generalize the algorithm of [JN11] to perform on data sets with rankings that follow a more general form.
Specifically, we expect the data to be partitioned into k clusters where each cluster has its own unknown reference point, and the distance from each data point to its corresponding reference point determines its fitness. We demonstrate that this extended algorithm achieves Θ(kd log n + n log k) comparisons in the noiseless case, and additionally that it performs robustly in both the noiseless and noisy cases on points drawn uniformly from a hypercube and on real-world data on product reviews, respectively.

2. METHODS

The algorithm we present in this report relies on the algorithms of [JN11] as subroutines in both the noiseless and noisy settings, so we begin with a brief characterization of each. For the remainder of this section, we assume that S = {s_1, ..., s_n} is the set of data points, s_i ≺ s_j denotes the event that s_i is ranked below s_j, S_1 ∪ ... ∪ S_k is a partitioning of S into k clusters, and d is the dimensionality of the points in S.

Noiseless active ranking with one reference point. In the noiseless case, the procedure works by running any standard comparison-based sorting algorithm¹ on the set of points to be ranked, with the following exception. Whenever a comparison is to be made, the procedure first checks whether the outcome of the comparison can be imputed from the values of the comparisons it has already made. If the value of the comparison cannot be imputed, then the comparison oracle is queried; otherwise the imputed value is used instead. More specifically, the outcome of a comparison s_k ≺ s_l is ambiguous if there exists a ranking consistent with the extant comparisons that ranks s_k below s_l, and another that ranks s_l below s_k. Equivalently, a comparison is ambiguous if there exist candidate points of reference that produce each of these rankings. For any point of reference r = (r_1, ..., r_d) and any points s_i, s_j ∈ S, let H_ij be the hyperplane normal to and bisecting the line segment between s_i and s_j.
Then, s_i is ranked above s_j in the ranking induced by r if and only if r lies on the side of H_ij closer to s_i (since H_ij divides the space of all points into a half that is closer to s_i and a half that is closer to s_j). Hence, any candidate reference point r is consistent with the extant comparisons if, for every comparison s_i ≺ s_j whose outcome is known, r lies on the correct side of H_ij. This defines a set of linear constraints on r, and given a comparison s_k ≺ s_l the goal is then to determine whether these constraints force the comparison to take a particular value. We can determine whether it is possible for s_k to be ranked below s_l by adding the linear constraint encoding s_k ≺ s_l to the rest and invoking a standard linear programming algorithm to determine whether a feasible r exists. We can determine whether it is possible to rank s_k above s_l in exactly the same way, so together these two tests tell us whether s_k ≺ s_l is ambiguous, as desired.

¹ For our experiment, we used TimSort, a combination of merge sort and insertion sort that Python uses by default.
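The ambiguity test above reduces to two linear feasibility problems. A minimal sketch in Python follows, using SciPy's `linprog`; the half-space encoding comes from expanding the squared distances (||r − s_i||² ≤ ||r − s_j||² is linear in r once the ||r||² terms cancel), and all function names here are our own rather than from [JN11]'s implementation:

```python
import numpy as np
from scipy.optimize import linprog

def halfspace(winner, loser):
    """Linear constraint on r encoding ||r - winner|| <= ||r - loser||.
    Expanding the squared norms gives 2(loser - winner) . r <= |loser|^2 - |winner|^2."""
    winner, loser = np.asarray(winner, float), np.asarray(loser, float)
    return 2.0 * (loser - winner), float(loser @ loser - winner @ winner)

def feasible(constraints, d):
    """True if some reference point r in R^d satisfies every half-space constraint."""
    if not constraints:
        return True
    A = np.array([a for a, _ in constraints])
    b = np.array([b_ for _, b_ in constraints])
    # Zero objective: we only care whether the region {r : A r <= b} is nonempty.
    res = linprog(np.zeros(d), A_ub=A, b_ub=b, bounds=[(None, None)] * d)
    return res.status == 0

def is_ambiguous(known, s_k, s_l):
    """s_k vs. s_l is ambiguous iff each ordering admits a consistent reference point."""
    d = len(s_k)
    return (feasible(known + [halfspace(s_k, s_l)], d) and
            feasible(known + [halfspace(s_l, s_k)], d))
```

With no prior comparisons every pair is ambiguous, while a known comparison that pins r to one side of the bisecting hyperplane makes the outcome imputable without a query.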
Noisy active ranking with one reference point. In the noisy setting, queries to the comparison oracle have some probability of returning an erroneous result, and this error is persistent across multiple queries regarding the same comparison. Hence, it becomes impossible in the general case to fully rank a list of samples, so our goal instead becomes to return a full ranking on a reasonably large subset of the original list that is consistent with all the responses we receive from the oracle. Working in this setting necessitates two major changes to the noiseless procedure:

1. The underlying comparison-based sorting algorithm must be insertion sort. In order to create the ranking on a subset, the algorithm must build the subset one element at a time while ensuring that query results are consistent with all elements already in the subset. The structure of insertion sort is the only natural one for this purpose.

2. Suppose that we already have a partial ranking on some subset of the first l − 1 samples, and we want to add s_l to the ranking. If the outcome of a query s_k ≺ s_l is imputable, then we take the imputed value as truth, but if the outcome is ambiguous, then instead of querying the oracle and trusting its response, we create a voting set of exactly R samples s_j, where R is a parameter of the algorithm. Each sample votes for the outcome s_k ≺ s_l if, after querying the oracle on s_k ≺ s_j and s_j ≺ s_l (or imputing the values of these queries if they are imputable), we get that s_k ≺ s_j ≺ s_l, or votes for the opposite outcome if s_l ≺ s_j ≺ s_k. The sample abstains if s_j outranks both s_k and s_l or is outranked by both. The plurality vote determines the outcome we accept as truth, and in the case of a tie we directly query the oracle on s_k ≺ s_l. In order for a sample s_j to be considered for the voting set, it must be possible for s_j to lie between s_k and s_l (or else the sample would simply abstain).
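The voting step in change 2 can be sketched as follows. This is a simplification under our own naming: `query(a, b)` stands in for the (possibly noisy) oracle reporting whether a is ranked below b, and the imputation of already-determined queries is omitted.

```python
import random

def resolve_by_vote(query, s_k, s_l, candidates, R):
    """Resolve an ambiguous comparison 'is s_k ranked below s_l?' by plurality vote.

    Each voter s_j casts a vote only if it appears to lie strictly between
    s_k and s_l according to the oracle; otherwise it abstains.
    """
    voters = random.sample(candidates, min(R, len(candidates)))
    votes_below = votes_above = 0
    for s_j in voters:
        if query(s_k, s_j) and query(s_j, s_l):
            votes_below += 1   # evidence for s_k < s_j < s_l
        elif query(s_l, s_j) and query(s_j, s_k):
            votes_above += 1   # evidence for s_l < s_j < s_k
        # otherwise s_j lies outside the interval between s_k and s_l: abstain
    if votes_below != votes_above:
        return votes_below > votes_above
    return query(s_k, s_l)     # tie: fall back to a direct oracle query
```

With a noiseless oracle on points of a line, voters between the two endpoints all vote the same way and the rest abstain, so the vote reproduces the true ordering; under noise, independent voters make a single erroneous response much less likely to flip the outcome.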
Another way of stating this condition is that the outcome of either s_k ≺ s_j or s_j ≺ s_l must be ambiguous with respect to all the query results we have accepted as true.² If there are fewer than R samples that meet this criterion, then we likely will not be able to obtain enough data to rank s_l accurately, so we delete s_l from the ranking.

Active ranking with multiple reference points. In the setting of multiple reference points, we are given S = S_1 ∪ ... ∪ S_k as input, where each data point in S_i is ranked according to its distance to the i-th reference point. A high-level description of our algorithm is as follows. Given S, we pass each cluster S_i to the single-reference-point ranking algorithm in sequence, obtaining a ranking (or partial ranking in the noisy case) on the elements of each cluster. Then, we merge the k rankings together with k − 1 invocations of our pairwise merge procedure. These pairwise merges are organized into a binary tree, so that if σ_1, ..., σ_k are the rankings of the clusters, then first we merge σ_1 and σ_2 to make σ_{1,2}, then σ_3 and σ_4 to make σ_{3,4}, and so on³ until σ_{k−1} and σ_k make σ_{k−1,k}. After that we merge σ_{1,2} and σ_{3,4} to make σ_{1,2,3,4}, and this process continues until we get the final ranking σ_{1,...,k}. In the noiseless case, the pairwise merges work exactly like the standard merge procedure from merge sort: in each iteration, we compare the least element of the first list to the least element of the second, then remove the lesser of the two from the corresponding list and add it to the sorted list we are building. In this case, we can prove an upper bound on the expected number of pairwise comparisons the algorithm makes.

² The definition of ambiguous in this context is actually underspecified in [JN11], but this interpretation makes the algorithm perform about as well as the results in that paper lead us to expect.
³ If k is odd, then σ_k gets merged with an arbitrary other ranking.

Proposition 1.
Let S = S_1 ∪ ... ∪ S_k be such that |S_i| = n_i for i ∈ [k], let σ be a ranking on S that is inducible by some k reference points, and let M(σ) denote the number of pairwise comparisons the algorithm makes to produce a full ranking on S. Then, E[M(σ)] = O(kd log n + n log k), where the expectation is over σ drawn from the uniform distribution over rankings inducible by k reference points, d is the dimensionality of each point in S, and n = Σ_i n_i.

Proof. By [JN11], the expected number of comparisons the noiseless single-reference-point algorithm makes on input of size n_i is at most cd log n_i for some constant c. The expected number of comparisons that the multiple-reference-point algorithm makes is then Σ_i cd log n_i plus the expected number of comparisons for the merging procedure. By the concavity of the logarithm, Σ_i cd log n_i ≤ ckd log(n/k) = O(kd log n). Any pairwise merge of lists of size a and b uses at most a + b comparisons, since it takes at most one comparison to add each element to the fully sorted list, and since the pattern of merges forms a binary tree, each sample undergoes at most log₂ k merges. Hence, the entire merge procedure uses at most n log₂ k = O(n log k) comparisons. Adding this to the expected number of comparisons used to sort each cluster gives a total of O(kd log n + n log k), as desired.

In the noisy case, the algorithm takes an additional parameter R′ that serves a purpose analogous to that of R in the single-reference-point algorithm. Since we cannot trust the result of the comparison between the least element of the first list and the least element of the second, after getting the result we create a voting set of size R′. Let A and B be the two lists we are merging, and assume that they are sorted from least to greatest (so in particular, A[0] and B[0] are the least elements of the two lists). Without loss of generality, assume that the initial query returns that A[0] ≺ B[0].
Then, the voting set for the query consists of the elements {B[0], B[1], ..., B[R′ − 1]}, or all the elements of B if B has fewer than R′ elements. For each element B[j] in the voting set, we query the oracle on whether A[0] ≺ B[j]; each positive result counts as a vote for the outcome A[0] ≺ B[0], while each negative result counts as a vote for B[0] ≺ A[0]. The plurality vote indicates the outcome we accept as truth, and in case of a tie the outcome is chosen randomly. The intuition behind this voting procedure is that if A[0] ≺ B[0], then the remaining elements of the voting set will very likely vote for the correct outcome, and if B[0] ≺ A[0], then
one of two things will happen. If there are many (say, more than R′) elements in B that outrank A[0], then the voting set will likely vote overwhelmingly for the correct outcome, but if there are few (say, fewer than R′/2), then the voting set will likely vote for the incorrect outcome. However, in this latter case, if we rank B[0] below A[0] we are unlikely to create many inversions between B[0] and A, since B[0] and A[0] are likely to be close.

3. RESULTS AND INTERPRETATION

We ran experiments to evaluate the quality of our multiple-reference-point sorter in both the noiseless and noisy cases.

Noiseless experiment. For this algorithm we adapted the noiseless experiment of [JN11] to the case of two reference points. In each trial, S was initialized to contain n = 100 points drawn uniformly at random from the hypercube [0, 1]^d, and two reference points were drawn uniformly at random from the same distribution. The partition S_1 was defined to be the subset of S consisting of those points closer to the first reference point than the second, and S_2 was defined to be the remaining points. For each value of d = 10, 20, ..., 100, 20 trials were run, and the mean numbers of queries the algorithm used are plotted in Figure 1. As in [JN11], the number of queries used approaches an asymptote as the dimensionality increases. In the k = 1 case, the algorithm is exactly TimSort except that the values of certain comparisons are imputed wherever possible, so it is impossible for the algorithm to make more queries than TimSort on any given input. As dimensionality increases, values become harder to impute until imputation is completely impossible in the case of d = 100 (since in this case d = n, and by cleverly choosing the reference point one can make any ordering possible). For k = 2 and k = 4, the algorithm still outperforms the baseline of TimSort for small values of d, but does worse when the dimensionality is high.
This worsening is due to the extra overhead incurred by the merge procedure. The overall numbers of queries are well within the bounds predicted by Proposition 1.

Noisy experiment. We would have liked to use the same data set as in [JN11] to evaluate our noisy algorithm, but the Aural Sonar data set appears to have disappeared. Instead, critical reviews for both A Game of Thrones and The Fault in Our Stars were scraped from Amazon and ranked according to helpfulness: for each review, Amazon gives users the option to mark the review as either helpful or unhelpful, so the helpfulness score is defined as the ratio of helpful votes to total votes. All reviews with fewer votes than a certain threshold value were excluded, giving us a total of 33 reviews for each book. After scraping, each review was mapped onto a bag-of-words representation with nltk. To reduce the dimensionality of this representation, the samples were further mapped into [0, 1]^10 and [0, 1]^15 using non-metric multidimensional scaling, a dimensionality reduction technique that preserves the relative distances (in this case, Euclidean distance in the bag-of-words space) between points. A good reference for this technique can be found in [CC08]. Implicitly, we are assuming that helpful reviews for the same book are more similar to each other in word choice than to unhelpful reviews, which means that we might be able to induce something similar to the helpfulness ranking with a reference point in the low-dimensional Euclidean space. This is not obviously a safe assumption to make, but it is borne out by the strength of our results. For each of the d = 10 and d = 15 representations, we ran our multiple-reference-point sorter with R = 11 and R′ = 5 for 20 iterations, where in each iteration the sorter received the samples in a random order.
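The report does not specify which non-metric MDS implementation was used. As an illustration only, the embedding pipeline can be sketched in plain NumPy with classical (metric) MDS as a stand-in; the non-metric variant preserves only the ordering of the distances rather than their values, and the helper names here are our own:

```python
import numpy as np

def bag_of_words(docs):
    """Map each document to a row of raw token counts over a shared vocabulary."""
    vocab = sorted({w for doc in docs for w in doc.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for row, doc in enumerate(docs):
        for w in doc.lower().split():
            X[row, index[w]] += 1
    return X

def classical_mds(X, d):
    """Embed the rows of X into R^d, approximately preserving their pairwise
    Euclidean distances (exactly, when the points already lie in a d-dim subspace)."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)  # squared distances
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n        # double-centering matrix
    B = -0.5 * J @ D2 @ J                      # Gram matrix of centered points
    w, V = np.linalg.eigh(B)
    top = np.argsort(w)[::-1][:d]              # keep the d largest eigenvalues
    return V[:, top] * np.sqrt(np.maximum(w[top], 0.0))
```

The embedded rows can then be fed to the multiple-reference-point sorter in place of the raw bag-of-words vectors.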
The number of queries used, the number of inversions between the partial ranking output by the algorithm and the correct ranking on those elements, and the number of elements in each partial ranking are all plotted in Figure 2, relative to the maximum possible values of each. Though the performance of our algorithm on this noisy data set is not directly comparable to the performance of the single-point algorithm on the Aural Sonar data set, the proportion of inversions present is similar (at approximately 40% for d = 10 and 35% for d = 15), while the proportion of queries used is much higher (at approximately 30% for d = 10 and 40% for d = 15, compared to 15% in [JN11] for d = 2). A number of factors influence the numbers we obtain.

1. Noise in the data set. The Aural Sonar data set used in [JN11] was likely much more amenable to being embedded in a Euclidean space than our ad hoc solution. Previous work in [PPA06] suggests that a faithful 2-dimensional embedding of that data set exists, which allows the algorithm to work with a very small number of comparisons. Our data set, on the other hand, has only intuition backing up its suitability.

2. The difficulty of the problem. The two books we chose have reviews with very similar average helpfulness ratings, so the true ranking on the merged list requires a great deal of interleaving. In this case, there are many more possible orderings for the samples than in the case where all the samples share a single reference point, so many more comparisons would be necessary to tease apart the ranking to a similar degree of accuracy.

There are no values for the proportion of ranked elements in [JN11], so it is difficult to say how well our algorithm does in that respect, but we seem to rank close to all of the elements in both the d = 10 and d = 15 cases.
Between the d = 10 and d = 15 cases, the d = 15 case sorts a greater proportion of the elements with a smaller proportion of inversions, though it requires a substantially greater number of queries to do so. This is unsurprising: embedding the points into a higher-dimensional space allows for a more faithful representation of distances in the original space, which translates to a lower rate of error. On the flip side, it is more difficult to impute values in a higher-dimensional space, so we more often need to make queries to compute the value of a comparison.
Figure 1: Mean and standard deviation of the number of queries plotted against the dimensionality of the points with n = 100. The dashed line represents the mean number of queries that TimSort uses on the same data sets.

Figure 2: Mean values for various proportions. The bars represent the maximum and minimum values of the proportions over 20 trials.
4. REFERENCES

[CC08] M. A. A. Cox and T. F. Cox. Multidimensional scaling. In Handbook of Data Visualization. Springer-Verlag, 2008.
[JN11] Kevin G. Jamieson and Robert D. Nowak. Active ranking using pairwise comparisons. CoRR, 2011.
[PPA06] S. Philips, J. Pitton, and L. Atlas. Perceptual feature identification for active sonar echoes. In OCEANS 2006, pages 1-6, Sept 2006.
CSC 4 / CSC D / CSC C 3 Sometimes linear models are not sufficient to capture the real-world phenomena, and thus nonlinear models are necessary. In regression, all such models will have the same basic
More informationSpatial Information Based Image Classification Using Support Vector Machine
Spatial Information Based Image Classification Using Support Vector Machine P.Jeevitha, Dr. P. Ganesh Kumar PG Scholar, Dept of IT, Regional Centre of Anna University, Coimbatore, India. Assistant Professor,
More informationEfficient Pairwise Classification
Efficient Pairwise Classification Sang-Hyeun Park and Johannes Fürnkranz TU Darmstadt, Knowledge Engineering Group, D-64289 Darmstadt, Germany Abstract. Pairwise classification is a class binarization
More informationprinceton univ. F 17 cos 521: Advanced Algorithm Design Lecture 24: Online Algorithms
princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 24: Online Algorithms Lecturer: Matt Weinberg Scribe:Matt Weinberg Lecture notes sourced from Avrim Blum s lecture notes here: http://www.cs.cmu.edu/
More informationCSC 447: Parallel Programming for Multi- Core and Cluster Systems
CSC 447: Parallel Programming for Multi- Core and Cluster Systems Parallel Sorting Algorithms Instructor: Haidar M. Harmanani Spring 2016 Topic Overview Issues in Sorting on Parallel Computers Sorting
More informationMatching and Alignment: What is the Cost of User Post-match Effort?
Matching and Alignment: What is the Cost of User Post-match Effort? (Short paper) Fabien Duchateau 1 and Zohra Bellahsene 2 and Remi Coletta 2 1 Norwegian University of Science and Technology NO-7491 Trondheim,
More informationConsensus, impossibility results and Paxos. Ken Birman
Consensus, impossibility results and Paxos Ken Birman Consensus a classic problem Consensus abstraction underlies many distributed systems and protocols N processes They start execution with inputs {0,1}
More informationCOMP Data Structures
COMP 2140 - Data Structures Shahin Kamali Topic 5 - Sorting University of Manitoba Based on notes by S. Durocher. COMP 2140 - Data Structures 1 / 55 Overview Review: Insertion Sort Merge Sort Quicksort
More informationGene Clustering & Classification
BINF, Introduction to Computational Biology Gene Clustering & Classification Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Introduction to Gene Clustering
More informationEfficient Optimal Linear Boosting of A Pair of Classifiers
Efficient Optimal Linear Boosting of A Pair of Classifiers Victor Boyarshinov Dept Computer Science Rensselaer Poly. Institute boyarv@cs.rpi.edu Malik Magdon-Ismail Dept Computer Science Rensselaer Poly.
More informationMATH3016: OPTIMIZATION
MATH3016: OPTIMIZATION Lecturer: Dr Huifu Xu School of Mathematics University of Southampton Highfield SO17 1BJ Southampton Email: h.xu@soton.ac.uk 1 Introduction What is optimization? Optimization is
More informationApplying the Q n Estimator Online
Applying the Q n Estimator Online Robin Nunkesser 1, Karen Schettlinger 2, and Roland Fried 2 1 Department of Computer Science, Univ. Dortmund, 44221 Dortmund Robin.Nunkesser@udo.edu 2 Department of Statistics,
More informationLecture 8 Parallel Algorithms II
Lecture 8 Parallel Algorithms II Dr. Wilson Rivera ICOM 6025: High Performance Computing Electrical and Computer Engineering Department University of Puerto Rico Original slides from Introduction to Parallel
More informationDetecting Clusters and Outliers for Multidimensional
Kennesaw State University DigitalCommons@Kennesaw State University Faculty Publications 2008 Detecting Clusters and Outliers for Multidimensional Data Yong Shi Kennesaw State University, yshi5@kennesaw.edu
More informationUnsupervised Learning and Clustering
Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2008 CS 551, Spring 2008 c 2008, Selim Aksoy (Bilkent University)
More information6.867 Machine Learning
6.867 Machine Learning Problem set - solutions Thursday, October What and how to turn in? Turn in short written answers to the questions explicitly stated, and when requested to explain or prove. Do not
More informationSorting Algorithms. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar
Sorting Algorithms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text ``Introduction to Parallel Computing'', Addison Wesley, 2003. Topic Overview Issues in Sorting on Parallel
More informationCS675: Convex and Combinatorial Optimization Spring 2018 Consequences of the Ellipsoid Algorithm. Instructor: Shaddin Dughmi
CS675: Convex and Combinatorial Optimization Spring 2018 Consequences of the Ellipsoid Algorithm Instructor: Shaddin Dughmi Outline 1 Recapping the Ellipsoid Method 2 Complexity of Convex Optimization
More informationLeveraging Transitive Relations for Crowdsourced Joins*
Leveraging Transitive Relations for Crowdsourced Joins* Jiannan Wang #, Guoliang Li #, Tim Kraska, Michael J. Franklin, Jianhua Feng # # Department of Computer Science, Tsinghua University, Brown University,
More informationCS264: Beyond Worst-Case Analysis Lecture #19: Self-Improving Algorithms
CS264: Beyond Worst-Case Analysis Lecture #19: Self-Improving Algorithms Tim Roughgarden March 14, 2017 1 Preliminaries The last few lectures discussed several interpolations between worst-case and average-case
More informationRepresentation Learning for Clustering: A Statistical Framework
Representation Learning for Clustering: A Statistical Framework Hassan Ashtiani School of Computer Science University of Waterloo mhzokaei@uwaterloo.ca Shai Ben-David School of Computer Science University
More informationClassification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University
Classification Vladimir Curic Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Outline An overview on classification Basics of classification How to choose appropriate
More informationExploiting a database to predict the in-flight stability of the F-16
Exploiting a database to predict the in-flight stability of the F-16 David Amsallem and Julien Cortial December 12, 2008 1 Introduction Among the critical phenomena that have to be taken into account when
More information10-701/15-781, Fall 2006, Final
-7/-78, Fall 6, Final Dec, :pm-8:pm There are 9 questions in this exam ( pages including this cover sheet). If you need more room to work out your answer to a question, use the back of the page and clearly
More informationE-Companion: On Styles in Product Design: An Analysis of US. Design Patents
E-Companion: On Styles in Product Design: An Analysis of US Design Patents 1 PART A: FORMALIZING THE DEFINITION OF STYLES A.1 Styles as categories of designs of similar form Our task involves categorizing
More informationApproximate Nearest Line Search in High Dimensions
Approximate Nearest Line Search in High Dimensions Sepideh Mahabadi MIT mahabadi@mit.edu Abstract We consider the Approximate Nearest Line Search (NLS) problem. Given a set L of N lines in the high dimensional
More informationTraining Digital Circuits with Hamming Clustering
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: FUNDAMENTAL THEORY AND APPLICATIONS, VOL. 47, NO. 4, APRIL 2000 513 Training Digital Circuits with Hamming Clustering Marco Muselli, Member, IEEE, and Diego
More informationHMMT February 2018 February 10, 2018
HMMT February 2018 February 10, 2018 Combinatorics 1. Consider a 2 3 grid where each entry is one of 0, 1, and 2. For how many such grids is the sum of the numbers in every row and in every column a multiple
More informationA New Pool Control Method for Boolean Compressed Sensing Based Adaptive Group Testing
Proceedings of APSIPA Annual Summit and Conference 27 2-5 December 27, Malaysia A New Pool Control Method for Boolean Compressed Sensing Based Adaptive roup Testing Yujia Lu and Kazunori Hayashi raduate
More informationEvaluating Robot Systems
Evaluating Robot Systems November 6, 2008 There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it
More informationPredictive Indexing for Fast Search
Predictive Indexing for Fast Search Sharad Goel, John Langford and Alex Strehl Yahoo! Research, New York Modern Massive Data Sets (MMDS) June 25, 2008 Goel, Langford & Strehl (Yahoo! Research) Predictive
More informationStats 170A: Project in Data Science Exploratory Data Analysis: Clustering Algorithms
Stats 170A: Project in Data Science Exploratory Data Analysis: Clustering Algorithms Padhraic Smyth Department of Computer Science Bren School of Information and Computer Sciences University of California,
More information/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Sorting lower bound and Linear-time sorting Date: 9/19/17
601.433/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Sorting lower bound and Linear-time sorting Date: 9/19/17 5.1 Introduction You should all know a few ways of sorting in O(n log n)
More informationCLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS
CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CHAPTER 4 CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS 4.1 Introduction Optical character recognition is one of
More informationNotes in Computational Geometry Voronoi Diagrams
Notes in Computational Geometry Voronoi Diagrams Prof. Sandeep Sen and Prof. Amit Kumar Indian Institute of Technology, Delhi Voronoi Diagrams In this lecture, we study Voronoi Diagrams, also known as
More informationSmoothing Dissimilarities for Cluster Analysis: Binary Data and Functional Data
Smoothing Dissimilarities for Cluster Analysis: Binary Data and unctional Data David B. University of South Carolina Department of Statistics Joint work with Zhimin Chen University of South Carolina Current
More informationUNLABELED SENSING: RECONSTRUCTION ALGORITHM AND THEORETICAL GUARANTEES
UNLABELED SENSING: RECONSTRUCTION ALGORITHM AND THEORETICAL GUARANTEES Golnoosh Elhami, Adam Scholefield, Benjamín Béjar Haro and Martin Vetterli School of Computer and Communication Sciences École Polytechnique
More information1 Case study of SVM (Rob)
DRAFT a final version will be posted shortly COS 424: Interacting with Data Lecturer: Rob Schapire and David Blei Lecture # 8 Scribe: Indraneel Mukherjee March 1, 2007 In the previous lecture we saw how
More informationFundamental Properties of Graphs
Chapter three In many real-life situations we need to know how robust a graph that represents a certain network is, how edges or vertices can be removed without completely destroying the overall connectivity,
More informationChapter 15 Introduction to Linear Programming
Chapter 15 Introduction to Linear Programming An Introduction to Optimization Spring, 2015 Wei-Ta Chu 1 Brief History of Linear Programming The goal of linear programming is to determine the values of
More informationThe Fibonacci hypercube
AUSTRALASIAN JOURNAL OF COMBINATORICS Volume 40 (2008), Pages 187 196 The Fibonacci hypercube Fred J. Rispoli Department of Mathematics and Computer Science Dowling College, Oakdale, NY 11769 U.S.A. Steven
More informationGraphs and Network Flows IE411. Lecture 21. Dr. Ted Ralphs
Graphs and Network Flows IE411 Lecture 21 Dr. Ted Ralphs IE411 Lecture 21 1 Combinatorial Optimization and Network Flows In general, most combinatorial optimization and integer programming problems are
More information7. Nearest neighbors. Learning objectives. Centre for Computational Biology, Mines ParisTech
Foundations of Machine Learning CentraleSupélec Paris Fall 2016 7. Nearest neighbors Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr Learning
More information