Automated Controversy Detection on the Web

Shiri Dori-Hacohen and James Allan
Center for Intelligent Information Retrieval
Department of Computer Science
University of Massachusetts Amherst
Amherst, Massachusetts
{shiri,

ABSTRACT
Alerting users about controversial search results can encourage critical literacy, promote healthy civic discourse and counteract the filter bubble effect, and would therefore be a useful feature in a search engine or browser extension. In order to implement such a feature, however, the binary classification task of determining which topics or webpages are controversial must be solved. Earlier work described a proof-of-concept approach to this problem using a supervised nearest neighbor classifier, comparing web pages to an oracle of manually annotated Wikipedia articles. This paper generalizes and extends that concept by taking the human out of the loop, leveraging the rich metadata available in Wikipedia articles in a distantly-supervised classification approach. The new technique we present allows the nearest neighbor approach to be extended to a much larger scale and to other datasets. The results improve substantially over naive baselines and are nearly identical to the supervised approach by standard measures of F1, F0.5, and accuracy. Finally, we discuss implications of solving this problem as part of a wider subject of interest to the IR community, and suggest several avenues for further exploration in this exciting new space.

1. INTRODUCTION
On the web today, alternative medicine sites appear alongside pediatrician advice websites. In the United States, the phrase "global warming is a hoax" is in wide circulation, and political debates rage, both online and offline, over the economy, same-sex marriage and healthcare. Worldwide, geopolitical and religious tensions are causing strife and upheaval; the Middle East is being reshaped by a generation that expects, or even demands, access to information. In countries with unlimited access, controversies proliferate online, but not everyone hears all sides of the argument; the filter bubble effect encourages confirmation bias by offering users the answers they want to hear [17]. In other cases, access does not translate into trustworthy information. For example, first-time parents seeking information about vaccines will find plenty of "proof" that they cause autism, and may not even realize the depth of the controversy involved (see Figure 1). Ads for helplines displayed to users searching for abortion are frequently, and discreetly, funded by pro-life (anti-abortion) religious groups [11].

Figure 1: One of the top query results in a major search engine for the query "do vaccines cause autism" [11].

Unfortunately, critical literacy, civic discourse and trustworthy information are not immediate results of successful information retrieval.
The underlying thread connecting all these examples is that users searching for these topics may not even be aware that a controversy exists; without the aid of a search engine feature or browser extension to warn them, they may never find out. We believe that a valuable addition to the end-user experience would be to inform users about controversial topics; this requires detecting such topics as a prerequisite. This problem was first proposed as a binary classification task by Dori-Hacohen and Allan [8]. Our paper suggests a method to solve that task and determine which topics or webpages are controversial. In prior work, a nearest neighbor approach was proposed to detect controversy, and a proof of concept was shown using human judgments as an oracle for the controversy level of related Wikipedia articles [8]; this approach obtained 10-20% absolute gains over several baselines.

However, human annotations are expensive to generate and do not scale. Dori-Hacohen and Allan had 1,761 Wikipedia articles annotated, which, while representing 20% of the articles in their dataset, are only a minuscule 0.04% of the English-language Wikipedia. This naturally raises the question of whether the nearest neighbor approach can be generalized, obviating the need for human annotations.

In this work, we answer that question by extending and generalizing the nearest neighbor approach to controversy detection, and present a new, distantly-supervised approach to detecting controversial topics. We use automated techniques to score related Wikipedia articles, replacing the human annotations and creating a fully automated system, with no human in the loop, that can classify a webpage as either controversial or not. We consider our system distantly-supervised [16] since we use heuristic labels for the Wikipedia articles, which act as a bridge between the rich metadata available in Wikipedia and the sparse data on the web. While one might hypothesize that using an automated approach to scoring Wikipedia articles (instead of an oracle of human annotations) would degrade the results, our approach achieves results comparable to the fully-supervised system [8], while at the same time making the approach scalable and applicable to any web dataset.

2. RELATED WORK
Several strands of related work inform our work on controversy: controversy detection in Wikipedia, controversy on the web and in search, fact disputes and trustworthiness, and sentiment analysis. We describe each area in turn.

2.1 Controversy Detection in Wikipedia
Research on controversy detection in Wikipedia can be leveraged and applied to a wider scope on the web. Wikipedia's collaborative nature, along with its versioning system, is a rich resource that has excited many researchers, with several papers focusing on detecting controversy in Wikipedia. Kittur et al. first proposed that controversy could be detected automatically using the rich metadata available on Wikipedia articles; they proposed a logistic regression classifier based on several metrics such as the length of the article, its associated talk page, and the number of users [14]. Sumi, Yasseri and their colleagues proposed a metric for controversy level, M, that is based on reverts, a special type of Wikipedia edit in which an editor completely undoes the previous edit [21, 25]. Sepehri Rad et al. presented a binary classifier for controversy using collaboration networks, and also presented a comparative study of the various approaches to detecting controversy in Wikipedia [20, 19].

Wikipedia is a valuable resource, but it often hides the existence of debate by presenting even controversial topics in deliberately neutral tones [24], which may be misleading to people unfamiliar with the debate. For example, the Wikipedia article for U.S. President Barack Obama makes him appear non-controversial, while both the Talk page and an automated analysis show otherwise.

While detecting controversy in Wikipedia automatically can be seen as an end in itself, these controversy detection methods have wider reach and can be used as a step toward solving other problems. Recently, Das et al.
explored the possibility of Wikipedia administrators attempting to manipulate controversial pages in a certain area, by extending any controversy metric into a clustered controversy measure; they hypothesized that editors focusing on a certain controversial topic are more invested in the outcomes than those spreading their edits across several topical areas [7]. Additionally, Wikipedia has been used in the past as a valuable resource assisting in controversy detection elsewhere, whether as a lexicon or as a hierarchy for controversial words and topics [3, 2, 18]. Likewise, we make use of a few of the Wikipedia-specific controversy measures described above as a step in our approach (see Section 3.2). As described above, prior work showed that it is possible to use related Wikipedia articles as a proxy for controversy on the web, by using human annotations as an oracle rating the controversy of the articles [8]. In contrast, we use automatically generated values for the Wikipedia articles. In order to evaluate our distantly-supervised system and compare it to the previously described supervised results, we acquired the same dataset collected by Dori-Hacohen and Allan, consisting of webpages and Wikipedia articles annotated as controversial or noncontroversial [8].

2.2 Controversy on the web and in search
Outside of Wikipedia, other targeted domains such as news [3, 5] and Twitter [18] have been mined for controversial topics, mostly focusing on politics and politicians. Some work relies on domain-specific sources such as Debatepedia [3, 13] that are likewise politics-heavy: for example, as of this writing Debatepedia has no entry discussing homeopathy. We consider controversy to be wider in scope than simply politics; medical and religious controversies are equally interesting. In some cases, a simple word search can be useful in detecting controversial queries [10]; unfortunately, this query-side approach to controversy detection relied on the Google Suggest API, which was deprecated in 2011 as part of the Google Labs shutdown.

Assuming one knows that a query is controversial, diversifying search results based on opinions is a useful feature [12, 13]. Prior work has shown that there is value in presenting users with information that may differ from their original perspective, whether it is portrayed implicitly or explicitly [9, 23, 26]. When reading about claims that may be in dispute, users do not actively seek contrasting information or viewpoints, but they are more likely to read a contrasting viewpoint when it is explicitly portrayed as such [23]. Yom-Tov et al. described how, for polarized political topics, two versions of seemingly similar search queries existed; using one query or the other exposed the biases of the user issuing the query (e.g. "obamacare" vs. "affordable health care") [26]. Their research demonstrated that interspersing search results from the opposite viewpoint on such polarized queries had the double effect of encouraging users both to read more diverse opinions and to read more news in general [26]; this partially counteracts the filter bubble effect [17]. This is the motivation behind our work: to automatically inform users of the various viewpoints on controversial topics.

2.3 Fact disputes and Trustworthiness
Previous work has also explored fact disputes and trustworthiness [9, 23], which are often related to controversial topics.

For example, the Dispute Finder tool focused on finding and exposing disputed claims on the web to users as they browse [9]; our ultimate goal, similarly, is to inform users of controversial topics in their search results. However, Dispute Finder focused largely on manually added or bootstrapped fact disputes, whereas we are interested in scalably detecting controversies that may, or may not, stem from fact disputes. Some controversies indeed originate from arguments over fact disputes (such as the birthplace of Barack Obama) or over matters of scientific study (the effect of vaccines on autism). In other cases, however, judgments, values and interpretation can fuel the fire despite widespread agreement on the basic facts of the matter; moral debates over abortion and euthanasia, or political debates over the size of the U.S. government and the Israeli-Palestinian conflict, are some examples.

2.4 Sentiment Analysis
Sentiment analysis can naturally be seen as a useful tool and a step towards detecting varying opinions and, potentially, controversy [5, 18, 22]. However, sentiment analysis and opinion mining work has long focused on product or movie reviews, where users have no intention or reason to hide their true thoughts; in contrast, many biased webpages intentionally obfuscate the sentiment and controversy involved, attempting to pass off their vision as objective. As others have mentioned, sentiment alone may not suffice for detecting controversy [3, 8], though it may be useful as a feature. We believe that warning about controversial webpages therefore requires a dedicated approach that will likely not be based solely on sentiment analysis, and must rely on information outside the text of the page itself - at the very least, other websites on the same topic.

3. NEAREST NEIGHBOR APPROACH
Our approach to detecting controversy on the web is a nearest neighbor classifier that maps webpages to the Wikipedia articles related to them. We start from a webpage and find Wikipedia articles that discuss the same topic; if the Wikipedia articles are controversial, it is reasonable to assume the webpage is controversial as well. Prior work, using human annotations as an oracle for the Wikipedia articles' controversy level, showed that such a nearest neighbor classifier can achieve an accuracy of 72.9% and F0.5 of 64.8%, which represent measurable improvements over several baselines [8]. Our work replicates these results in a scalable manner by removing the human in the loop and replacing it with automatically generated scores for Wikipedia articles.

The choice to map specifically to Wikipedia, rather than to arbitrary webpages, was driven by the availability of Wikipedia's rich metadata and edit history; as described above, there is a fair amount of prior work dedicated to detecting controversy on Wikipedia using this wealth of information [14, 19, 20, 21, 25]. Using a targeted nearest neighbor approach, with neighbors specifically selected from Wikipedia, allows us to bridge the gap between sparse web data and Wikipedia, affording an opportunity to leverage this valuable resource for other areas of the web. We consider our approach a distantly-supervised classifier [16], since the labels used for controversy on Wikipedia are automatically generated heuristics rather than truth labels; some of them were learned using a supervised classifier on Wikipedia, but none of them were trained for the task at hand, namely classifying webpages' controversy.
To implement our nearest neighbor classifier, we use several components: matching via query generation, scoring the Wikipedia articles, aggregation, thresholding and voting. A schematic of the system is shown in Figure 2, which we will refer to while describing the various parts.

Figure 2: Schematic of the fully automated Nearest Neighbor Approach.

3.1 Matching via Query Generation
We use the same query generation approach described in previous work [8] to map from webpages to the related Wikipedia articles. The top ten most frequent terms on the webpage, excluding stop words, are extracted (step 1 in Figure 2) and then used as a keyword query, restricted to the Wikipedia domain and run on a commercial search engine (step 2 in Figure 2). We use one of two different stop sets: a 418-word set (which we refer to as "Full Stopping" [4]) or a 35-word set ("Light Stopping" [15]). Wikipedia redirects were followed wherever applicable in order to ensure we reached the full Wikipedia article with its associated metadata; any talk or user pages were ignored. We consider the articles returned from the query as the webpage's neighbors, which will be evaluated for their controversy level. Based on the assumption that higher-ranked articles might be more relevant but provide less coverage, we varied the number of neighbors in our experiments from 1 to 20, or used all articles returned by the search engine. A brief evaluation of the query generation approach is presented in Section 5.1.
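To make the matching step concrete, the sketch below illustrates one way the query generation could be implemented; the tiny stop list, the tokenizer and the site-restriction syntax are our own simplifications rather than the authors' exact setup.

```python
import re
from collections import Counter

# Tiny illustrative stop list; the paper uses either a 418-word ("Full Stopping")
# or a 35-word ("Light Stopping") stop set.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
              "are", "for", "on", "that", "with"}

def generate_query(page_text, num_terms=10):
    """Return a keyword query built from the top-N most frequent non-stopword terms."""
    tokens = re.findall(r"[a-z']+", page_text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS and len(t) > 1)
    top_terms = [term for term, _ in counts.most_common(num_terms)]
    # Step 2 in Figure 2: restrict the query to the Wikipedia domain before
    # sending it to a commercial search engine (shown here as a site: filter).
    return " ".join(top_terms) + " site:en.wikipedia.org"
```

The articles returned for such a query (after following redirects and discarding talk and user pages) then serve as the webpage's neighbors.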

3.2 Automatically-generated Wikipedia labels
The Wikipedia articles found as neighbors to webpages were labeled with several scores measuring their controversy level (step 3 in Figure 2). In previous work [8], human annotations were used to label the Wikipedia articles. In this paper, we use three different types of automated scores for controversy in Wikipedia, which we refer to as D, C and M. All three scores can be automatically generated for any Wikipedia article, based on information available in its associated metadata, talk page and revision history. For all three scores (and without loss of generality), higher scores indicate more controversy.

3.2.1 D score - presence of dispute tags
D tests for the presence of Dispute tags that are manually added to the talk pages of Wikipedia articles by their contributors [14, 19]. These tags, while very sparse and therefore difficult to rely on [19, 8], are potentially valuable when they are present. We test for the presence of such a tag, and use the result as a binary score (1 if the tag exists, -1 if it doesn't). Although this approach relies on data that was manually entered into Wikipedia, our use of it is automatic and unsupervised. Unfortunately, the number of dispute tags available is very low: in a recent Wikipedia dump, there were only 1,449 articles that had a dispute tag on their talk page, only 0.03% of the articles on Wikipedia. This is an even smaller dataset than the human annotations provided in prior work [8]; the overlap between these articles and the 8,755 articles in our dataset is a mere 165 articles.

3.2.2 C score - metadata-based regression
The C score is a regression that predicts the controversy level of a Wikipedia article using a variety of metadata features. This regression is based on the approach first described by Kittur et al. [14], and is referred to alternately as the "meta" classifier [19] or the C score [7]. The goal of this score is to extend the dispute tags to the wider set of articles that don't contain them, thus increasing coverage. The C score is trained using articles that contain dispute tags as training examples. The regression relies on the assumption that the more revisions of an article the dispute tag appears in, the more controversial the page will be. The classifier is thus implemented as a linear regression that attempts to predict, for a given Wikipedia article that doesn't actually contain a dispute tag, the estimated number of revisions which would theoretically contain a dispute tag (if it did have one). The classifier uses a variety of metadata features, such as the length of the article and its associated talk page, the number of revisions to each of those, and the number of editors and anonymous editors, among others. We use the version of this regression implemented and trained recently by Das et al. [7], which weighs the score by the number of edits on the page and normalizes it, generating a floating point score in the range (0,1).

3.2.3 M score - mutual reverts score
The M score, as defined by Sumi et al., is a different way of estimating the controversy level of a Wikipedia article, based on the concept of reverts and edit wars in Wikipedia [21, 25]. A revert is a special type of edit in Wikipedia, wherein an editor completely undoes a previous edit, thus reverting the page to a prior version. Simply counting reverts, however, is not a sufficient measure of controversy; reverts are quite commonly associated with vandalism, which is very frequent on popular pages regardless of their controversy level. In an edit war, however, multiple editors revert each other, often more than once - termed mutual reverts [21]. The M score is therefore defined as a function of the number of mutual reverts to the article and the number of edits each editor has performed on the page [21]. The premise of the M score is that the more times an editor has edited a page, the less likely it is that a revert of their edit is associated with vandalism (most vandals are blocked from Wikipedia quickly), and therefore the more likely it is to be caused by an actual controversy.
Thus, the M score of an article considers the sum of mutual reverts on the page: every pair of editors engaging in a mutual revert on it is weighed by the minimum number of contributions of the two editors [21]. The effect of vandalism is thus expected to be minimized. Finally, the top pair of reverting editors is ignored, thus nullifying potential personal vendettas. The score is a positive real number, theoretically unbounded (in practice it ranges from 0 to several billion).

3.3 Aggregation and Thresholding
The score for a webpage is computed by taking either the maximum or the average of all its neighbors' scores (step 4 in Figure 2); which of the two aggregation methods is used is one of the parameters we vary in our experiments. After aggregation, each webpage has a single score according to each of the three scoring methods (D, C and M), which can be considered an estimator of the page's controversy level. A threshold must be chosen if one wishes to convert these scores into binary judgments (step 5 in Figure 2). We found that the threshold chosen for the M metric by Sumi et al. [21] was not sufficient for our purposes, while C had no available threshold. We therefore trained several thresholds for both C and M (see Section 4.1).

3.4 Voting
In addition to using each of the three labels in isolation, we can also combine them by voting (step 6 in Figure 2). We apply one of several voting schemes to the binary classification labels, after the thresholds have been applied. The schemes we use are:

Majority vote: consider the webpage controversial if at least two out of the three labels are controversial.

Logical Or: consider the webpage controversial if any of the three labels is controversial.

Logical And: consider the webpage controversial only if all three labels are controversial.

Other logical combinations: we consider results for the combination (Dispute ∨ (C ∧ M)), based on the premise that if the dispute tag happens to be present, it would be valuable. We tried all three combinations of the form (x ∨ (y ∧ z)), but D's coverage was so low that with x=C or x=M the results were essentially identical to majority voting; we therefore omit them.
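Putting the aggregation, thresholding and voting steps together, the following minimal sketch shows how a webpage's binary label could be derived from its neighbors' D, C and M scores. The threshold values are placeholders (the actual thresholds are trained as described in Section 4.1), and the helper is ours, not the authors' implementation.

```python
from statistics import mean

def classify_webpage(neighbor_scores, use_max=True, thresh_c=0.5, thresh_m=1000.0):
    """Steps 4-6 of Figure 2: aggregate, threshold, and majority-vote.

    neighbor_scores: one dict per Wikipedia neighbor, e.g.
        {"D": -1, "C": 0.42, "M": 3500.0}
    """
    aggregate = max if use_max else mean
    agg = {s: aggregate(n[s] for n in neighbor_scores) for s in ("D", "C", "M")}

    # Step 5: thresholding. D is already binary, with a natural threshold at 0.
    labels = {
        "D": agg["D"] > 0,
        "C": agg["C"] > thresh_c,
        "M": agg["M"] > thresh_m,
    }

    # Step 6: majority vote; the "Or"/"And" schemes would use any()/all() instead.
    return sum(labels.values()) >= 2
```

With max aggregation, a single neighbor whose score exceeds a threshold is enough to flip the corresponding label, which is why the choice of aggregation interacts with the choice of threshold (see the discussion in Section 6).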

Table 1: Data set size and annotations. NNT denotes the subset of Wikipedia articles that are Nearest Neighbors of the webpages' Training set.

Webpages
Set        Pages   Controversial
All        377     (32.6%)
Training           (29.8%)
Testing            (38.0%)

Wikipedia articles
Set   Articles   Annotated   Controversial
All   8,755      1,761       (16.0%)
NNT   4,060                  (13.5%)

4. EXPERIMENTAL SETUP AND DATA SET
The dataset we use, which was described in prior work [8], comprises 377 webpages from ClueWeb09, drawn from 41 seed queries sent to a commercial search engine. The collection was randomly split into approximately 60% training and 40% testing sets; the seeds were split up, with some seeds' pages going entirely into either train or test, and others being split between the two sets. The final distribution of controversy in the two sets differed slightly (see Table 1), as the training set had a lower proportion of controversial pages than the testing set (29.8% vs. 38.0%). The webpages were annotated for controversy level on a four-point scale. Additionally, the dataset contained 8,755 Wikipedia articles that were found as neighbors of the webpages in the set, using the approach described in Section 3.1. Of these articles, 4,060 were the Nearest Neighbors associated with the Training set ("NNT" in Table 1), which we used to train the thresholds for C and M.

We use the following five metrics, both to train and to evaluate our results:

Precision is the proportion of pages that the system labeled as controversial that were actually controversial.

Recall is the proportion of pages that the system labeled as controversial, out of all the controversial pages in the dataset.

Accuracy is the proportion of pages that the system labeled correctly (both controversial and noncontroversial), out of all the pages in the dataset.

F1 is the harmonic mean of precision and recall.

F0.5 is the weighted harmonic mean which gives precision twice as much importance as recall.

Another way of looking at these definitions is in the classic IR sense of these metrics, with "controversial" standing in for "relevant".
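Both F-measures are instances of the standard weighted harmonic mean Fβ (with β = 1 and β = 0.5, respectively), shown here for completeness:

```latex
F_\beta = \frac{(1 + \beta^2)\, P\, R}{\beta^2 P + R},
\qquad
F_1 = \frac{2 P R}{P + R},
\qquad
F_{0.5} = \frac{1.25\, P R}{0.25\, P + R}
```

Setting β = 0.5 weights precision twice as heavily as recall, matching the definition above.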
Figure 3: Precision-Recall curves (uninterpolated) for C and M thresholds on the Wikipedia NNT set.

4.1 Threshold training
C and M are both real-valued, as described in Section 3.2; in order to generate a binary classification, we must select a threshold above which the page will be considered controversial. (For the D score, which we defined as a binary value of either -1 or 1, a natural threshold exists at 0.) The Precision-Recall curves for both scores are displayed in Figure 3. Since we had access to human annotations on some of the Wikipedia articles [8], we trained the thresholds for C and M on the subset of articles associated with the training set (labeled NNT in Table 1). We select five thresholds for each of the two scoring methods, based on the best results achieved on this subset for Precision, Recall, F1, Accuracy and F0.5 as defined above. The resulting thresholds are shown in Table 2, along with the results achieved on the same set by D. (Note that a threshold of 0 for the M metric returns every nonzero score as controversial, and only zero scores as noncontroversial.)

For comparison, we also present single-class baselines on this task of labeling the Wikipedia articles. The dominant class baseline labels all pages as noncontroversial, which yields 0% precision and 0% recall since no pages are labeled as controversial; accuracy is the only nonzero metric in this case. The nondominant class baseline, in turn, labels all pages as controversial and thus achieves 100% recall; its precision and accuracy are equivalent to the incidence of controversy in the dataset. Finally, a random baseline, which labels every article as either controversial or noncontroversial based on a coin flip, is presented for comparison (average of three random runs). Its precision is also similar to the incidence of controversy in the dataset, while its recall and accuracy are closer to 50%.

5. EVALUATION
We treat the controversy detection problem as a binary classification problem of assigning labels of controversial and noncontroversial to webpages. We therefore evaluate our results on the classification problem using the definitions of precision, recall, accuracy and Fβ given in Section 4. We present a brief evaluation of the query generation approach before turning to our results for the controversy detection problem.

5.1 Judgments from Matching
A key step in our approach is selecting which Wikipedia articles to use as nearest neighbors. In order to evaluate how well our query generation approach maps webpages to Wikipedia articles, we evaluated the automated queries and the relevance of their results to the original webpage. This allows an intrinsic measure of the effectiveness of this step, independent of its effect on the extrinsic task. Focusing on the queries generated by the Light Stopping set (see Section 3.1), we annotated 3,430 of the query-article combinations (out of 7,630 combinations total) that were returned from the search engine; the combinations represented 2,454 unique Wikipedia articles.

Table 2: Thresholds for M and C trained on the NNT set, based on the task of labeling Wikipedia articles (not webpages). Results are optimized for each of the metrics (Precision, Recall, F1, Accuracy and F0.5); the metric optimized is in bold. [Table body: five thresholds each for M and C, plus the Dispute, Random, Dominant and Non-Dominant baselines, with columns Threshold, P, R, F1, Acc and F0.5; numeric values omitted.]

Figure 4: Judgments on Wikipedia articles returned by the automatically-generated queries, by query rank. Annotators could choose one of the following options: H-On = Highly on [webpage's] topic, S-On = Slightly on topic, S-Off = Slightly off topic, H-Off = Highly off topic, Links = Links to this topic, but doesn't discuss it directly, DK = Don't Know.

Preference was given to annotating articles that were higher-ranked in the query results, resulting in 361 annotations for the top rank, and between 330 and 280 annotations each for ranks 2 through 10. Our annotators were presented with the webpage and the titles of up to 10 Wikipedia articles in alphabetical order (not ranked order). The annotators were asked to name the single article that best matched the webpage. They were also asked to judge, for each article, whether it was relevant to the original page. The annotators were not shown the automatically-generated query. The options were "highly on topic", "slightly on topic", "slightly off topic" or "highly off topic", as well as "links to this topic, but doesn't discuss it directly" or "don't know".

Figure 4 shows how the ranked list of Wikipedia articles was judged. It is clear from the figure that the top-ranking article was usually viewed as highly on topic, but that the quality then dropped rapidly. However, if both on-topic judgments are combined, a large number of highly or slightly relevant articles are being selected. Also, the higher the rank, the more likely that the article was on topic. Figure 5 further supports this, by showing that most of the best articles are at ranks 1 to 3; the top-ranked article was considered the best for 42% of the pages. Using a reciprocal rank contribution of 0 for "don't know" or "none of the above", the Mean Reciprocal Rank for the dataset was

5.2 Baseline runs and prior work
We compare our approach to several baselines. We use the same sentiment analysis approach as described in prior work [8], along with the random and single-class baselines described in Section 4.1. The sentiment analysis baseline is a logistic regression classifier [1] trained to detect the presence of sentiment on the webpage, whether positive or negative; sentiment is used as a proxy for controversy. Finally, the best results from prior work are displayed [8]; as mentioned above, these results use a similar nearest neighbor approach, with human annotations serving as an oracle for the Wikipedia articles' controversy level.

5.3 Our approach
As described in Section 3, we varied several parameters in our nearest neighbor approach:
1. Stopping set (Light or Full)
2. Number of neighbors (k=1..20, or no limit)
3. Aggregation method (average or max)
4. Scoring or voting method (C, M, D; Majority, Or, And, D ∨ (C ∧ M))
5. Thresholds for C and M (one of the five values each from Table 2)
These parameters were evaluated on the training set and the best runs were selected, optimizing for Precision, Recall, F1, F0.5 and Accuracy. The parameters that performed best for each of the scoring/voting methods were then run on the test set.
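This parameter selection amounts to a small grid search over the settings listed above; the sketch below illustrates the procedure, with a hypothetical evaluate_run helper standing in for a full run of the nearest neighbor pipeline on the training set.

```python
from itertools import product

def select_best_runs(evaluate_run, thresholds_c, thresholds_m,
                     metrics=("P", "R", "F1", "F0.5", "Acc")):
    """Return, for each target metric, the best parameter setting on the training set."""
    grid = product(
        ["Light", "Full"],                                   # stopping set
        list(range(1, 21)) + [None],                         # number of neighbors (None = no limit)
        ["avg", "max"],                                      # aggregation
        ["C", "M", "D", "Majority", "Or", "And", "D_or_C_and_M"],  # scoring / voting
        thresholds_c,                                        # trained C thresholds (Table 2)
        thresholds_m,                                        # trained M thresholds (Table 2)
    )
    best = {m: (None, -1.0) for m in metrics}
    for params in grid:
        scores = evaluate_run(*params)   # dict: metric -> value on the training set
        for m in metrics:
            if scores[m] > best[m][1]:
                best[m] = (params, scores[m])
    return best
```

The winning parameter setting for each target metric is then run once on the test set, which is how the rows of Table 3 were selected.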
5.4 Results
The results of our approach on the test set are displayed in Table 3. For ease of discussion, we will refer to row numbers in the table. We present the best runs, as trained on the training set, with their test set results. (In some cases, runs with similar numbers of neighbors and otherwise identical parameters produced very close results; we present only the best of these similar runs.)

6. DISCUSSION

Table 3: Results on the Testing Set. Results are displayed for the best runs on the training set, using each scoring method, optimized for F1, Accuracy and F0.5. Sample results optimizing for Recall are also displayed. The overall best results of our runs, in each metric, are displayed in bold; the best prior results (rows 18-20, marked Oracle [8]) and baseline results (rows 21-25) are also displayed in bold. See text for discussion. Run parameters by row (trained thresholds, target metrics and numeric results omitted):

#   Stop   Score        k         agg
1   Full   M            8         avg
2   Light  M            8         max
3   Light  M            12        max
4   Light  M            18        max
5   Light  C            14        max
6   Light  C            15        max
7   Light  C            7         avg
8   Light  C            7         max
9   Light  D            19        max
10  Full   D            5         max
11  Light  D            6         max
12  Light  Majority     15        max
13  Full   Majority     5         max
14  Full   Or           8         avg
15  Light  And          no limit  max
16  Full   D ∨ (C ∧ M)  8         avg
17  Light  D ∨ (C ∧ M)  7         avg
18  Light  Oracle       4         avg
19  Light  Oracle       20        max
20  Full   Oracle       8         max
21  Sentiment baseline
22  Random baseline
23  Random baseline
24  Dominant Class baseline (all noncontroversial)
25  Non-Dominant Class baseline (all controversial)

As the results in Table 3 show, our fully-automated approach (rows 1-17) achieves results higher than all baselines (rows 21-25) in all five metrics. The precision-recall curves for select runs are presented in Figure 6. Results optimizing for precision and recall are largely omitted from the table; values of exactly 100% for each were achievable on the training set by many different parameter combinations, often using either the most discriminative (for precision) or the most permissive (for recall) thresholds. The runs with 100% recall on the training set generalized well, and consistently achieved at or near 100% recall on the test set. These runs were essentially identical to the nondominant class baseline, and we present the best of them (rows 4 and 8 in Table 3). The thresholds for precision did not generalize across the folds; they were mostly too restrictive, generating a single controversy label on the training set (thus gaining 100% correctness on one label) and, in most cases, returning no results on the test set, making them equal to the dominant class baseline.

Note that for the task of detecting controversy in Wikipedia articles (one of the steps in our approach), the Precision and Recall combinations achieved by the automated methods were fairly low, as shown in Table 2 and Figure 3; F1 and F0.5 are never above 35%. Despite those seemingly poor results for this step, aggregating these automated scores over several neighbors results in fairly high F values for the webpage task (near or above 60% for both the M and C methods). We note that the results for the dispute method on the webpage task were much lower than one might otherwise expect; however, they are partially explained by the low results on the Wikipedia task (see Table 2).

We observe that when taking the maximum of the neighbors, results were generally better with higher, more discriminative thresholds and a large number of neighbors; when the average was used, a lower threshold with fewer neighbors was more effective. This makes sense when considering that the max function is more sensitive to noise than the average function: a single highly controversial neighbor can change the classification, so a higher threshold can reduce the sensitivity to such noise while extending coverage by considering more neighbors.
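As a toy illustration of this sensitivity (our own made-up scores, not data from the paper), consider five neighbors of which one is a noisy outlier:

```python
from statistics import mean

# Hypothetical C scores for five Wikipedia neighbors; the last one is an outlier.
scores = [0.05, 0.10, 0.08, 0.12, 0.90]

print(max(scores))   # 0.90 -- with a moderate threshold (say 0.5), max calls the page controversial
print(mean(scores))  # 0.25 -- the same threshold is not crossed when averaging
```

A higher threshold therefore protects max aggregation from single noisy neighbors, while averaging can tolerate a lower threshold but dilutes the contribution of a genuinely controversial neighbor as more neighbors are added.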
For many of the max runs, increasing the number of neighbors beyond a certain point had very little effect on the actual results, since controversial pages had already been labeled as such by a higher-ranked neighbor. The method that optimized F0.5 on the training set among all the single-score approaches was the run using Light Stopping and M with a rather high (discriminative) threshold, aggregating over the maximal value of all the result neighbors (rows 2 and 3 in Table 3). Using k values of 8 through 12 achieved identical results on the training set.

These runs ended up achieving some of the best results on the test set: with k=8 the results were the best for F0.5 as well as Accuracy (row 2), and with k=12 the results were the best for F1 among the single-score methods (row 3). The k=8 run (row 2) showed a 10.1% absolute gain in accuracy (16.3% relative gain) over the dominant class baseline, which had the best accuracy score among the baselines; F0.5 showed a 19.5% absolute gain (44.5% relative gain) over the best baseline F0.5 score, which was achieved by the Random 50 baseline. The k=12 run (row 3) showed a 9% absolute gain (16.3% relative gain) in F1 over the best baseline for F1, the nondominant class baseline. Even though none of the results displayed in the table were optimized for precision, they still had higher results than the baselines across the board (compare rows 1-17 to rows 21-25).

Figure 5: Frequency of page selected as best, by rank. DK = Don't Know, N = None of the above.

Figure 6: Precision-Recall curves (uninterpolated) for select runs on the Test set. Row numbers refer to Table 3. (Runs shown: Light C 7 avg, row 7; Light M 8 max, row 2.)

Among the voting methods, the method that optimized F1 on the training set was Majority voting, using Light Stopping and aggregating over the maximal value of 15 neighbors, with discriminative thresholds for both M and C (row 12). This run showed a 10.4% absolute gain (18.9% relative gain) on the test set over the best baseline for F1.

The results of the sentiment baseline (row 21 in Table 3) are surprisingly similar to the all-controversial baseline (row 25); at a closer look, the sentiment classifier returns only about 10% of the webpages as lacking sentiment, and therefore its results are not far from the single-class baseline. We tried applying higher confidence thresholds to the sentiment classifier, but this only resulted in lower recall without improvement in precision. It is worth noting that the sentiment classifier was not trained to detect controversy; it is clear from these results, as others have noted, that sentiment alone is too simplistic to predict controversy [3, 8]. For example, a webpage may proclaim its love of Julia Roberts; this contains clear sentiment, but fairly little controversy (unless you happen to hate Julia Roberts).

Finally, when comparing our distantly-supervised results (rows 1-17) to the best supervised runs from prior work (rows 18-20, see [8]), the results are quite comparable. The best supervised run (row 18) had absolute gains of 1.5% and 0.8% in F0.5 and Accuracy, respectively, over our best runs for those metrics (row 2); F1 for the same runs had an absolute gain of 4.5% for our approach over the supervised approach. The run that is best for F1 using our approach (row 12) has a 1.3% absolute gain over the best F1 run in prior work (row 20). This is despite the fact that the supervised runs relied on human judgments for Wikipedia articles, while the distantly-supervised methods we described can be applied automatically to any webpage in a scalable manner.

7. CONCLUSIONS AND FUTURE WORK
In this work, we presented the first fully automated approach to solving the recently proposed binary classification task of controversy detection [8]. We showed that web controversy detection can be done with automatic labeling of exemplars in a nearest neighbor classifier. Our approach improves upon the previous work by creating a scalable, distantly-supervised classification system that leverages the rich metadata available in Wikipedia and uses it to classify webpages for which such information is not available.
We reported results that represent 20% absolute gains in F-measures and 10% absolute gains in accuracy over several baselines including sentiment analysis, and are comparable to prior work that used human annotations as an oracle stand-in [8]. As with other work that leverages such controversy scores calculated on Wikipedia articles, our approach is modular and therefore agnostic to the method chosen; the scores used for the Wikipedia articles can be generated via any method available to evaluate their controversy level [7].

For example, scores based on a network collaboration approach [20] could be substituted in place of the M and C values, or added to them as another feature. Additionally, the nearest neighbor method we described is agnostic to the choice of target collection we query, and is not limited to Wikipedia. Other rich web collections whose controversy values can be easily computed, such as Debate.org, Debatabase or procon.org, could also be used alongside Wikipedia as targeted neighbors, potentially improving precision.

There are several ways in which our method could be improved upon, and we look forward to exploring some of them. More complex query generation methods could be employed to match neighbors; alternatively, matching could be enhanced by entity linking (Wikification), or language models of the pages and articles could be used to compare candidate neighbors directly. Other classifiers for the Wikipedia articles could be explored, and standard machine learning approaches could be used to combine our method with other features such as sentiment analysis. We would also be interested in validating our approach on a larger data set: the low results that the dispute metric received on the data set presented here raise interesting questions that we have not yet had the chance to explore.

The nearest neighbor approach we presented is limited in nature by the collection it targets; it will not properly detect controversial topics that are sparsely covered by Wikipedia, or not covered at all. One could argue that this is a major problem with the approach: though Wikipedia's coverage is fairly wide, it will not reach the truly long tail of the web when it comes to controversies of small scope, or those interesting only to a specific subpopulation. Entirely different approaches would need to be employed to detect such controversies. We believe sentiment presence alone may not suffice to detect controversy, and some biased websites may intentionally attempt to present themselves as neutral. Nonetheless, it is possible that some metric of sentiment variance across multiple websites could provide useful clues; clustering websites topically and then analyzing the sentiments across them might be one way to handle this challenge. Another approach could use language models or topic models to automatically detect the fact that strongly opposing, possibly biased points of view exist on a topic, and thus that it may be controversial. This would flip the directionality of some recent work that presupposes subjectivity and bias to detect points of view [6, 26].

Being able to automatically detect controversies on the web has potential implications for improving critical literacy and civic discourse. Along with the ever-growing trend towards personalization in search comes the risk of fragmenting the web into separate worlds, with search engines showing users only what they think the user wants to see in a self-fulfilling prophecy. The ability to inform users about fact disputes and controversies in their queries enables search engines to improve trustworthiness. Additionally, explicitly exposing bias and polarization may partially counteract the filter bubble or echo chamber effects, wherein click feedback further reinforces users' predispositions.

We see the controversy detection problem itself as a prerequisite to several other interesting applications and larger problems. The most obvious application is a feature for search engines, or a browser plugin, that informs users when the webpage they are looking at is controversial.
It would be useful to conduct user studies to see whether such a warning has an impact on search and browsing activity, whether in the moment or over time. Tracking the evolution of controversial topics on the web over time, or exploring the incidence of controversy in the wild, are additional questions that could be of interest to social scientists. Other interesting problems include diversifying controversial search results according to the stances and opinions expressed in them, and automatically clustering or classifying webpages into various points of view on controversial topics, to name just a few. We are fascinated to explore these larger avenues of future work that can further healthy debates on the web, encourage civic discourse, and promote critical literacy for end-users of search.

8. ACKNOWLEDGMENTS
This work was supported in part by the Center for Intelligent Information Retrieval and in part by NSF grant #IIS. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsor. We would like to thank Ed Chi, Kevin Murphy, Elad Yom-Tov and the members of the CIIR lab for fruitful conversations; Allen Lavoie, Hoda Sepehri Rad and Taha Yasseri added to those, along with supplying valuable resources. Sandeep Kalra, Ravali Rochampally, Emma Tosch and Celal Ziftci commented on versions of the draft and helped improve it significantly. Finally, this paper would not have been possible to write without the incredible support of Gonen Dori-Hacohen.

9. REFERENCES
[1] E. Aktolga and J. Allan. Sentiment diversification with different biases. In Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '13, New York, NY, USA. ACM.
[2] R. Awadallah, M. Ramanath, and G. Weikum. OpinioNetIt: understanding the opinions-people network for politically controversial topics. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, CIKM '11, New York, NY, USA. ACM.
[3] R. Awadallah, M. Ramanath, and G. Weikum. Harmony and dissonance: organizing the people's voices on political controversies. In Proceedings of WSDM.
[4] J. P. Callan, W. B. Croft, and S. M. Harding. The INQUERY retrieval system. In Database and Expert Systems Applications. Springer Vienna.
[5] Y. Choi, Y. Jung, and S.-H. Myaeng. Identifying controversial issues and their sub-topics in news articles. In Proceedings of the 2010 Pacific Asia Conference on Intelligence and Security Informatics, PAISI '10, Berlin, Heidelberg. Springer-Verlag.
[6] S. Das and A. Lavoie. Automated inference of point of view from user interactions in collective intelligence venues. In Workshop on Social Computing and User Generated Content, EC '13.

[7] S. Das, A. Lavoie, and M. Magdon-Ismail. Manipulation among the arbiters of collective intelligence: How Wikipedia administrators mold public opinion. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, CIKM '13, New York, NY, USA. ACM.
[8] S. Dori-Hacohen and J. Allan. Detecting controversy on the web. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, CIKM '13, New York, NY, USA. ACM.
[9] R. Ennals, B. Trushkowsky, and J. M. Agosta. Highlighting disputed claims on the web. In Proceedings of the 19th International Conference on World Wide Web, WWW '10, New York, NY, USA. ACM.
[10] K. Gyllstrom and M.-F. Moens. Clash of the typings: finding controversies and children's topics within queries. In Proceedings of the 33rd European Conference on Advances in Information Retrieval, ECIR '11, pages 80-91, Berlin, Heidelberg. Springer-Verlag.
[11] Heroic Media. Free abortion help website, January.
[12] M. Kacimi and J. Gamper. Diversifying search results of controversial queries. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, CIKM '11, pages 93-98, New York, NY, USA. ACM.
[13] M. Kacimi and J. Gamper. MOUNA: mining opinions to unveil neglected arguments. In Proceedings of CIKM.
[14] A. Kittur, B. Suh, B. A. Pendleton, and E. H. Chi. He says, she says: conflict and coordination in Wikipedia. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '07, New York, NY, USA. ACM.
[15] C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval, volume 1. Cambridge University Press, Cambridge.
[16] M. Mintz, S. Bills, R. Snow, and D. Jurafsky. Distant supervision for relation extraction without labeled data. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2, ACL '09, Stroudsburg, PA, USA. Association for Computational Linguistics.
[17] E. Pariser. The Filter Bubble: What the Internet Is Hiding from You. Penguin Press HC.
[18] A.-M. Popescu and M. Pennacchiotti. Detecting controversial events from Twitter. In Proceedings of CIKM.
[19] H. Sepehri Rad and D. Barbosa. Identifying controversial articles in Wikipedia: A comparative study. In Proceedings of the 8th Conference on WikiSym, WikiSym '12. ACM.
[20] H. Sepehri Rad, A. Makazhanov, D. Rafiei, and D. Barbosa. Leveraging editor collaboration patterns in Wikipedia. In Proceedings of the 23rd ACM Conference on Hypertext and Social Media, HT '12, pages 13-22, New York, NY, USA. ACM.
[21] R. Sumi, T. Yasseri, A. Rung, A. Kornai, and J. Kertész. Edit wars in Wikipedia. CoRR.
[22] M. Tsytsarau, T. Palpanas, and K. Denecke. Scalable discovery of contradictions on the web. In Proceedings of the 19th International Conference on World Wide Web, WWW '10, New York, NY, USA. ACM.
[23] V. G. V. Vydiswaran, C. Zhai, D. Roth, and P. Pirolli. BiasTrust: Teaching biased users about controversial topics. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM '12, New York, NY, USA. ACM.
[24] Wikipedia. Wikipedia: Neutral Point of View policy, January. Retrieved from https://en.wikipedia.org/wiki/Wikipedia:Neutral_point_of_view#Controversial_subjects.
[25] T. Yasseri, R. Sumi, A. Rung, A. Kornai, and J. Kertész. Dynamics of conflicts in Wikipedia. PLoS ONE, 7(6):e38869.
[26] E. Yom-Tov, S. Dumais, and Q. Guo. Promoting civil discourse through search engine diversity. Social Science Computer Review, 2013.


More information

Machine Learning Techniques for Data Mining

Machine Learning Techniques for Data Mining Machine Learning Techniques for Data Mining Eibe Frank University of Waikato New Zealand 10/25/2000 1 PART VII Moving on: Engineering the input and output 10/25/2000 2 Applying a learner is not all Already

More information

Challenges on Combining Open Web and Dataset Evaluation Results: The Case of the Contextual Suggestion Track

Challenges on Combining Open Web and Dataset Evaluation Results: The Case of the Contextual Suggestion Track Challenges on Combining Open Web and Dataset Evaluation Results: The Case of the Contextual Suggestion Track Alejandro Bellogín 1,2, Thaer Samar 1, Arjen P. de Vries 1, and Alan Said 1 1 Centrum Wiskunde

More information

Chapter 6 Evaluation Metrics and Evaluation

Chapter 6 Evaluation Metrics and Evaluation Chapter 6 Evaluation Metrics and Evaluation The area of evaluation of information retrieval and natural language processing systems is complex. It will only be touched on in this chapter. First the scientific

More information

second_language research_teaching sla vivian_cook language_department idl

second_language research_teaching sla vivian_cook language_department idl Using Implicit Relevance Feedback in a Web Search Assistant Maria Fasli and Udo Kruschwitz Department of Computer Science, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, United Kingdom fmfasli

More information

Robust Relevance-Based Language Models

Robust Relevance-Based Language Models Robust Relevance-Based Language Models Xiaoyan Li Department of Computer Science, Mount Holyoke College 50 College Street, South Hadley, MA 01075, USA Email: xli@mtholyoke.edu ABSTRACT We propose a new

More information

Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating

Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating Dipak J Kakade, Nilesh P Sable Department of Computer Engineering, JSPM S Imperial College of Engg. And Research,

More information

Predicting Query Performance on the Web

Predicting Query Performance on the Web Predicting Query Performance on the Web No Author Given Abstract. Predicting performance of queries has many useful applications like automatic query reformulation and automatic spell correction. However,

More information

Oleksandr Kuzomin, Bohdan Tkachenko

Oleksandr Kuzomin, Bohdan Tkachenko International Journal "Information Technologies Knowledge" Volume 9, Number 2, 2015 131 INTELLECTUAL SEARCH ENGINE OF ADEQUATE INFORMATION IN INTERNET FOR CREATING DATABASES AND KNOWLEDGE BASES Oleksandr

More information

Part I: Data Mining Foundations

Part I: Data Mining Foundations Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web and the Internet 2 1.3. Web Data Mining 4 1.3.1. What is Data Mining? 6 1.3.2. What is Web Mining?

More information

CHAPTER 6 PROPOSED HYBRID MEDICAL IMAGE RETRIEVAL SYSTEM USING SEMANTIC AND VISUAL FEATURES

CHAPTER 6 PROPOSED HYBRID MEDICAL IMAGE RETRIEVAL SYSTEM USING SEMANTIC AND VISUAL FEATURES 188 CHAPTER 6 PROPOSED HYBRID MEDICAL IMAGE RETRIEVAL SYSTEM USING SEMANTIC AND VISUAL FEATURES 6.1 INTRODUCTION Image representation schemes designed for image retrieval systems are categorized into two

More information

Evaluating the Usefulness of Sentiment Information for Focused Crawlers

Evaluating the Usefulness of Sentiment Information for Focused Crawlers Evaluating the Usefulness of Sentiment Information for Focused Crawlers Tianjun Fu 1, Ahmed Abbasi 2, Daniel Zeng 1, Hsinchun Chen 1 University of Arizona 1, University of Wisconsin-Milwaukee 2 futj@email.arizona.edu,

More information

A Formal Approach to Score Normalization for Meta-search

A Formal Approach to Score Normalization for Meta-search A Formal Approach to Score Normalization for Meta-search R. Manmatha and H. Sever Center for Intelligent Information Retrieval Computer Science Department University of Massachusetts Amherst, MA 01003

More information

Improving Difficult Queries by Leveraging Clusters in Term Graph

Improving Difficult Queries by Leveraging Clusters in Term Graph Improving Difficult Queries by Leveraging Clusters in Term Graph Rajul Anand and Alexander Kotov Department of Computer Science, Wayne State University, Detroit MI 48226, USA {rajulanand,kotov}@wayne.edu

More information

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How

More information

A Framework for adaptive focused web crawling and information retrieval using genetic algorithms

A Framework for adaptive focused web crawling and information retrieval using genetic algorithms A Framework for adaptive focused web crawling and information retrieval using genetic algorithms Kevin Sebastian Dept of Computer Science, BITS Pilani kevseb1993@gmail.com 1 Abstract The web is undeniably

More information

Evaluation Measures. Sebastian Pölsterl. April 28, Computer Aided Medical Procedures Technische Universität München

Evaluation Measures. Sebastian Pölsterl. April 28, Computer Aided Medical Procedures Technische Universität München Evaluation Measures Sebastian Pölsterl Computer Aided Medical Procedures Technische Universität München April 28, 2015 Outline 1 Classification 1. Confusion Matrix 2. Receiver operating characteristics

More information

Leveraging Transitive Relations for Crowdsourced Joins*

Leveraging Transitive Relations for Crowdsourced Joins* Leveraging Transitive Relations for Crowdsourced Joins* Jiannan Wang #, Guoliang Li #, Tim Kraska, Michael J. Franklin, Jianhua Feng # # Department of Computer Science, Tsinghua University, Brown University,

More information

On the automatic classification of app reviews

On the automatic classification of app reviews Requirements Eng (2016) 21:311 331 DOI 10.1007/s00766-016-0251-9 RE 2015 On the automatic classification of app reviews Walid Maalej 1 Zijad Kurtanović 1 Hadeer Nabil 2 Christoph Stanik 1 Walid: please

More information

Window Extraction for Information Retrieval

Window Extraction for Information Retrieval Window Extraction for Information Retrieval Samuel Huston Center for Intelligent Information Retrieval University of Massachusetts Amherst Amherst, MA, 01002, USA sjh@cs.umass.edu W. Bruce Croft Center

More information

Big Data Methods. Chapter 5: Machine learning. Big Data Methods, Chapter 5, Slide 1

Big Data Methods. Chapter 5: Machine learning. Big Data Methods, Chapter 5, Slide 1 Big Data Methods Chapter 5: Machine learning Big Data Methods, Chapter 5, Slide 1 5.1 Introduction to machine learning What is machine learning? Concerned with the study and development of algorithms that

More information

UFeed: Refining Web Data Integration Based on User Feedback

UFeed: Refining Web Data Integration Based on User Feedback UFeed: Refining Web Data Integration Based on User Feedback ABSTRT Ahmed El-Roby University of Waterloo aelroby@uwaterlooca One of the main challenges in large-scale data integration for relational schemas

More information

Automatic Domain Partitioning for Multi-Domain Learning

Automatic Domain Partitioning for Multi-Domain Learning Automatic Domain Partitioning for Multi-Domain Learning Di Wang diwang@cs.cmu.edu Chenyan Xiong cx@cs.cmu.edu William Yang Wang ww@cmu.edu Abstract Multi-Domain learning (MDL) assumes that the domain labels

More information

Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task

Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task Walid Magdy, Gareth J.F. Jones Centre for Next Generation Localisation School of Computing Dublin City University,

More information

Building and Annotating Corpora of Collaborative Authoring in Wikipedia

Building and Annotating Corpora of Collaborative Authoring in Wikipedia Building and Annotating Corpora of Collaborative Authoring in Wikipedia Johannes Daxenberger, Oliver Ferschke and Iryna Gurevych Workshop: Building Corpora of Computer-Mediated Communication: Issues, Challenges,

More information

Assignment 1. Assignment 2. Relevance. Performance Evaluation. Retrieval System Evaluation. Evaluate an IR system

Assignment 1. Assignment 2. Relevance. Performance Evaluation. Retrieval System Evaluation. Evaluate an IR system Retrieval System Evaluation W. Frisch Institute of Government, European Studies and Comparative Social Science University Vienna Assignment 1 How did you select the search engines? How did you find the

More information

Self-tuning ongoing terminology extraction retrained on terminology validation decisions

Self-tuning ongoing terminology extraction retrained on terminology validation decisions Self-tuning ongoing terminology extraction retrained on terminology validation decisions Alfredo Maldonado and David Lewis ADAPT Centre, School of Computer Science and Statistics, Trinity College Dublin

More information

Popularity of Twitter Accounts: PageRank on a Social Network

Popularity of Twitter Accounts: PageRank on a Social Network Popularity of Twitter Accounts: PageRank on a Social Network A.D-A December 8, 2017 1 Problem Statement Twitter is a social networking service, where users can create and interact with 140 character messages,

More information

Evaluating an Associative Browsing Model for Personal Information

Evaluating an Associative Browsing Model for Personal Information Evaluating an Associative Browsing Model for Personal Information Jinyoung Kim, W. Bruce Croft, David A. Smith and Anton Bakalov Department of Computer Science University of Massachusetts Amherst {jykim,croft,dasmith,abakalov}@cs.umass.edu

More information

Wrapper: An Application for Evaluating Exploratory Searching Outside of the Lab

Wrapper: An Application for Evaluating Exploratory Searching Outside of the Lab Wrapper: An Application for Evaluating Exploratory Searching Outside of the Lab Bernard J Jansen College of Information Sciences and Technology The Pennsylvania State University University Park PA 16802

More information

CS224W: Social and Information Network Analysis Project Report: Edge Detection in Review Networks

CS224W: Social and Information Network Analysis Project Report: Edge Detection in Review Networks CS224W: Social and Information Network Analysis Project Report: Edge Detection in Review Networks Archana Sulebele, Usha Prabhu, William Yang (Group 29) Keywords: Link Prediction, Review Networks, Adamic/Adar,

More information

Final Report - Smart and Fast Sorting

Final Report - Smart and Fast  Sorting Final Report - Smart and Fast Email Sorting Antonin Bas - Clement Mennesson 1 Project s Description Some people receive hundreds of emails a week and sorting all of them into different categories (e.g.

More information

Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman

Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman Abstract We intend to show that leveraging semantic features can improve precision and recall of query results in information

More information

Information Retrieval

Information Retrieval Introduction to Information Retrieval SCCS414: Information Storage and Retrieval Christopher Manning and Prabhakar Raghavan Lecture 10: Text Classification; Vector Space Classification (Rocchio) Relevance

More information

Semantic Website Clustering

Semantic Website Clustering Semantic Website Clustering I-Hsuan Yang, Yu-tsun Huang, Yen-Ling Huang 1. Abstract We propose a new approach to cluster the web pages. Utilizing an iterative reinforced algorithm, the model extracts semantic

More information

Recommender Systems. Collaborative Filtering & Content-Based Recommending

Recommender Systems. Collaborative Filtering & Content-Based Recommending Recommender Systems Collaborative Filtering & Content-Based Recommending 1 Recommender Systems Systems for recommending items (e.g. books, movies, CD s, web pages, newsgroup messages) to users based on

More information

Key questions to ask before commissioning any web designer to build your website.

Key questions to ask before commissioning any web designer to build your website. Key questions to ask before commissioning any web designer to build your website. KEY QUESTIONS TO ASK Before commissioning a web designer to build your website. As both an entrepreneur and business owner,

More information

Query Likelihood with Negative Query Generation

Query Likelihood with Negative Query Generation Query Likelihood with Negative Query Generation Yuanhua Lv Department of Computer Science University of Illinois at Urbana-Champaign Urbana, IL 61801 ylv2@uiuc.edu ChengXiang Zhai Department of Computer

More information

Text Categorization (I)

Text Categorization (I) CS473 CS-473 Text Categorization (I) Luo Si Department of Computer Science Purdue University Text Categorization (I) Outline Introduction to the task of text categorization Manual v.s. automatic text categorization

More information

CMO Briefing Google+:

CMO Briefing Google+: www.bootcampdigital.com CMO Briefing Google+: How Google s New Social Network Can Impact Your Business Facts Google+ had over 30 million users in the first month and was the fastest growing social network

More information

Indexing and Query Processing

Indexing and Query Processing Indexing and Query Processing Jaime Arguello INLS 509: Information Retrieval jarguell@email.unc.edu January 28, 2013 Basic Information Retrieval Process doc doc doc doc doc information need document representation

More information

Website Designs Australia

Website Designs Australia Proudly Brought To You By: Website Designs Australia Contents Disclaimer... 4 Why Your Local Business Needs Google Plus... 5 1 How Google Plus Can Improve Your Search Engine Rankings... 6 1. Google Search

More information

5 Choosing keywords Initially choosing keywords Frequent and rare keywords Evaluating the competition rates of search

5 Choosing keywords Initially choosing keywords Frequent and rare keywords Evaluating the competition rates of search Seo tutorial Seo tutorial Introduction to seo... 4 1. General seo information... 5 1.1 History of search engines... 5 1.2 Common search engine principles... 6 2. Internal ranking factors... 8 2.1 Web page

More information

A Measurement Design for the Comparison of Expert Usability Evaluation and Mobile App User Reviews

A Measurement Design for the Comparison of Expert Usability Evaluation and Mobile App User Reviews A Measurement Design for the Comparison of Expert Usability Evaluation and Mobile App User Reviews Necmiye Genc-Nayebi and Alain Abran Department of Software Engineering and Information Technology, Ecole

More information

CS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul

CS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul 1 CS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul Introduction Our problem is crawling a static social graph (snapshot). Given

More information

Clean Living: Eliminating Near-Duplicates in Lifetime Personal Storage

Clean Living: Eliminating Near-Duplicates in Lifetime Personal Storage Clean Living: Eliminating Near-Duplicates in Lifetime Personal Storage Zhe Wang Princeton University Jim Gemmell Microsoft Research September 2005 Technical Report MSR-TR-2006-30 Microsoft Research Microsoft

More information

How to Conduct a Heuristic Evaluation

How to Conduct a Heuristic Evaluation Page 1 of 9 useit.com Papers and Essays Heuristic Evaluation How to conduct a heuristic evaluation How to Conduct a Heuristic Evaluation by Jakob Nielsen Heuristic evaluation (Nielsen and Molich, 1990;

More information

Use of Synthetic Data in Testing Administrative Records Systems

Use of Synthetic Data in Testing Administrative Records Systems Use of Synthetic Data in Testing Administrative Records Systems K. Bradley Paxton and Thomas Hager ADI, LLC 200 Canal View Boulevard, Rochester, NY 14623 brad.paxton@adillc.net, tom.hager@adillc.net Executive

More information

Evaluating Classifiers

Evaluating Classifiers Evaluating Classifiers Charles Elkan elkan@cs.ucsd.edu January 18, 2011 In a real-world application of supervised learning, we have a training set of examples with labels, and a test set of examples with

More information

Opinions in Federated Search: University of Lugano at TREC 2014 Federated Web Search Track

Opinions in Federated Search: University of Lugano at TREC 2014 Federated Web Search Track Opinions in Federated Search: University of Lugano at TREC 2014 Federated Web Search Track Anastasia Giachanou 1,IlyaMarkov 2 and Fabio Crestani 1 1 Faculty of Informatics, University of Lugano, Switzerland

More information

21. Search Models and UIs for IR

21. Search Models and UIs for IR 21. Search Models and UIs for IR INFO 202-10 November 2008 Bob Glushko Plan for Today's Lecture The "Classical" Model of Search and the "Classical" UI for IR Web-based Search Best practices for UIs in

More information

An Investigation of Basic Retrieval Models for the Dynamic Domain Task

An Investigation of Basic Retrieval Models for the Dynamic Domain Task An Investigation of Basic Retrieval Models for the Dynamic Domain Task Razieh Rahimi and Grace Hui Yang Department of Computer Science, Georgetown University rr1042@georgetown.edu, huiyang@cs.georgetown.edu

More information

Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005

Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005 Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005 Abstract Deciding on which algorithm to use, in terms of which is the most effective and accurate

More information

Joint Entity Resolution

Joint Entity Resolution Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute

More information

Outlier Ensembles. Charu C. Aggarwal IBM T J Watson Research Center Yorktown, NY Keynote, Outlier Detection and Description Workshop, 2013

Outlier Ensembles. Charu C. Aggarwal IBM T J Watson Research Center Yorktown, NY Keynote, Outlier Detection and Description Workshop, 2013 Charu C. Aggarwal IBM T J Watson Research Center Yorktown, NY 10598 Outlier Ensembles Keynote, Outlier Detection and Description Workshop, 2013 Based on the ACM SIGKDD Explorations Position Paper: Outlier

More information

Developing Focused Crawlers for Genre Specific Search Engines

Developing Focused Crawlers for Genre Specific Search Engines Developing Focused Crawlers for Genre Specific Search Engines Nikhil Priyatam Thesis Advisor: Prof. Vasudeva Varma IIIT Hyderabad July 7, 2014 Examples of Genre Specific Search Engines MedlinePlus Naukri.com

More information

Proximity Prestige using Incremental Iteration in Page Rank Algorithm

Proximity Prestige using Incremental Iteration in Page Rank Algorithm Indian Journal of Science and Technology, Vol 9(48), DOI: 10.17485/ijst/2016/v9i48/107962, December 2016 ISSN (Print) : 0974-6846 ISSN (Online) : 0974-5645 Proximity Prestige using Incremental Iteration

More information

Rank Measures for Ordering

Rank Measures for Ordering Rank Measures for Ordering Jin Huang and Charles X. Ling Department of Computer Science The University of Western Ontario London, Ontario, Canada N6A 5B7 email: fjhuang33, clingg@csd.uwo.ca Abstract. Many

More information

Technical Brief: Domain Risk Score Proactively uncover threats using DNS and data science

Technical Brief: Domain Risk Score Proactively uncover threats using DNS and data science Technical Brief: Domain Risk Score Proactively uncover threats using DNS and data science 310 Million + Current Domain Names 11 Billion+ Historical Domain Profiles 5 Million+ New Domain Profiles Daily

More information

Taccumulation of the social network data has raised

Taccumulation of the social network data has raised International Journal of Advanced Research in Social Sciences, Environmental Studies & Technology Hard Print: 2536-6505 Online: 2536-6513 September, 2016 Vol. 2, No. 1 Review Social Network Analysis and

More information

An Evaluation of Information Retrieval Accuracy. with Simulated OCR Output. K. Taghva z, and J. Borsack z. University of Massachusetts, Amherst

An Evaluation of Information Retrieval Accuracy. with Simulated OCR Output. K. Taghva z, and J. Borsack z. University of Massachusetts, Amherst An Evaluation of Information Retrieval Accuracy with Simulated OCR Output W.B. Croft y, S.M. Harding y, K. Taghva z, and J. Borsack z y Computer Science Department University of Massachusetts, Amherst

More information

University of Amsterdam at INEX 2010: Ad hoc and Book Tracks

University of Amsterdam at INEX 2010: Ad hoc and Book Tracks University of Amsterdam at INEX 2010: Ad hoc and Book Tracks Jaap Kamps 1,2 and Marijn Koolen 1 1 Archives and Information Studies, Faculty of Humanities, University of Amsterdam 2 ISLA, Faculty of Science,

More information

Browser-Oriented Universal Cross-Site Recommendation and Explanation based on User Browsing Logs

Browser-Oriented Universal Cross-Site Recommendation and Explanation based on User Browsing Logs Browser-Oriented Universal Cross-Site Recommendation and Explanation based on User Browsing Logs Yongfeng Zhang, Tsinghua University zhangyf07@gmail.com Outline Research Background Research Topic Current

More information

Enhanced Performance of Search Engine with Multitype Feature Co-Selection of Db-scan Clustering Algorithm

Enhanced Performance of Search Engine with Multitype Feature Co-Selection of Db-scan Clustering Algorithm Enhanced Performance of Search Engine with Multitype Feature Co-Selection of Db-scan Clustering Algorithm K.Parimala, Assistant Professor, MCA Department, NMS.S.Vellaichamy Nadar College, Madurai, Dr.V.Palanisamy,

More information

TREC 2017 Dynamic Domain Track Overview

TREC 2017 Dynamic Domain Track Overview TREC 2017 Dynamic Domain Track Overview Grace Hui Yang Zhiwen Tang Ian Soboroff Georgetown University Georgetown University NIST huiyang@cs.georgetown.edu zt79@georgetown.edu ian.soboroff@nist.gov 1. Introduction

More information

Evaluation Metrics. (Classifiers) CS229 Section Anand Avati

Evaluation Metrics. (Classifiers) CS229 Section Anand Avati Evaluation Metrics (Classifiers) CS Section Anand Avati Topics Why? Binary classifiers Metrics Rank view Thresholding Confusion Matrix Point metrics: Accuracy, Precision, Recall / Sensitivity, Specificity,

More information

Building Rich User Profiles for Personalized News Recommendation

Building Rich User Profiles for Personalized News Recommendation Building Rich User Profiles for Personalized News Recommendation Youssef Meguebli 1, Mouna Kacimi 2, Bich-liên Doan 1, and Fabrice Popineau 1 1 SUPELEC Systems Sciences (E3S), Gif sur Yvette, France, {youssef.meguebli,bich-lien.doan,fabrice.popineau}@supelec.fr

More information

Data Mining. Ryan Benton Center for Advanced Computer Studies University of Louisiana at Lafayette Lafayette, La., USA.

Data Mining. Ryan Benton Center for Advanced Computer Studies University of Louisiana at Lafayette Lafayette, La., USA. Data Mining Ryan Benton Center for Advanced Computer Studies University of Louisiana at Lafayette Lafayette, La., USA January 13, 2011 Important Note! This presentation was obtained from Dr. Vijay Raghavan

More information

Advanced Topics in Information Retrieval. Learning to Rank. ATIR July 14, 2016

Advanced Topics in Information Retrieval. Learning to Rank. ATIR July 14, 2016 Advanced Topics in Information Retrieval Learning to Rank Vinay Setty vsetty@mpi-inf.mpg.de Jannik Strötgen jannik.stroetgen@mpi-inf.mpg.de ATIR July 14, 2016 Before we start oral exams July 28, the full

More information

Optimizing Search Engines using Click-through Data

Optimizing Search Engines using Click-through Data Optimizing Search Engines using Click-through Data By Sameep - 100050003 Rahee - 100050028 Anil - 100050082 1 Overview Web Search Engines : Creating a good information retrieval system Previous Approaches

More information