Exploiting Discourse Analysis for Article-Wide Temporal Classification

Size: px
Start display at page:

Download "Exploiting Discourse Analysis for Article-Wide Temporal Classification"

Transcription

1 Exploiting Discourse Analysis for Article-Wide Temporal Classification Jun-Ping Ng Min-Yen Kan Ziheng Lin Vanessa Wei Feng Bin Chen Jian Su Chew-Lim Tan National University of Singapore National University of Singapore SAP Asia Pte. Ltd. University of Toronto Institute for Infocomm Research Institute for Infocomm Research National University of Singapore

2 Outline Preliminaries Discourse Analysis Methodology Experiments Discussion 2

3 Temporal Classification At least 19 people were killed and 114 people were wounded in Tuesday s southern Philippines airport blast, officials said, but reports said the death toll could climb to 30. 3

4 Temporal Classification At least 19 people were killed and 114 people were wounded in Tuesday s southern Philippines airport blast, officials said, but reports said the death toll could climb to 30. wounded said killed time 3

5 Temporal Classification At least 19 people were killed and 114 people were wounded in Tuesday s southern Philippines airport blast, officials said, but reports said the death toll could climb to 30. OVERLAP wounded said killed time 3

6 Temporal Classification At least 19 people were killed and 114 people were wounded in Tuesday s southern Philippines airport blast, officials said, but reports said the death toll could climb to 30. BEFORE OVERLAP wounded said killed time 3

7 Temporal Classification At least 19 people were killed and 114 people were wounded in Tuesday s southern Philippines airport blast, officials said, but reports said the death toll could climb to 30. AFTER OVERLAP wounded said killed time 3

8 Article-Wide Event Pairs Decide on temporal relationship between any two events in an article Helps give us a more complete picture of temporal relations between event pairs 4

9 Challenge Distance between event pairs can potentially be significant Current approaches which are heavy on lexical and syntactic features are not useful 5

10 Discourse Analysis Tells us how sentences are composed together Relating sentences together gives us a better idea of temporal ordering 6

11 Discourse Analysis Rhetorical Structure Theory (RST) PDTB-styled Discourse Relations Topical Text Segmentation 7

12 RST Breaks up text into basic textual units called Elementary Discourse Units (EDU) Relates neighbouring EDUs via a set of typed discourse relations 8

13 RST Max switched off the light. The room was pitch dark. RESULT Max switched off the light. The room was pitch dark. 9

14 PDTB-styled Discourse Relations Identifies explicit and implicit relations between text Also relates text fragments together via a set of typed relations 10

15 PDTB-styled Discourse Relations Max switched off the light. The room was pitch dark. CONTINGENCY :: CAUSE arg1 Max switched off the light. arg2 The room was pitch dark. 11

16 Topical Text Segmentation Groups neighbouring sentences about the same topic together Transitioning across groups of sentences represents a shift in the topic being discussed Coarse-grained discourse analysis 12

17 Topical Text Segmentation The Davao Medical Centre, a regional government hospital, recorded 19 deaths with 50 wounded. Medical evacuation workers however said the injured list was around 114, spread out at various hospitals. A powerful bomb tore through a waiting shed at the Davao City international airport at about 5.15 pm (0915 GMT) while another explosion hit a bus terminal at the city. 13

18 Topical Text Segmentation The Davao Medical Centre, a regional government hospital, recorded 19 deaths with 50 wounded. Medical evacuation workers however said the injured list was around 114, spread out at various hospitals. A powerful bomb tore through a waiting shed at the Davao City international airport at about 5.15 pm (0915 GMT) while another explosion hit a bus terminal at the city. 13

19 Topical Text Segmentation The Davao Medical Centre, a regional government hospital, recorded 19 deaths with 50 wounded. Medical evacuation workers however said the injured list was around 114, spread out at various hospitals. A powerful bomb tore through a waiting shed at the Davao City international airport at about 5.15 pm (0915 GMT) while another explosion hit a bus terminal at the city. 13

20 Outline Preliminaries Discourse Analysis Methodology Experiments Discussion 14

21 Classifying Temporal Relations RST Parsing PDTB Parsing Text Segmentation Identify events within article Feature Generation Convolution Kernel w SVM Temporal Relations 15

22 Classifying Temporal Relations 1 RST Parsing PDTB Parsing Text Segmentation Identify events within article Feature Generation Convolution Kernel w SVM Temporal Relations 15

23 Classifying Temporal Relations 1 2 RST Parsing PDTB Parsing Text Segmentation Identify events within article Feature Generation Convolution Kernel w SVM Temporal Relations 15

24 Classifying Temporal Relations 1 2 RST Parsing PDTB Parsing Text Segmentation 3 Identify events within article Feature Generation Convolution Kernel w SVM Temporal Relations 15

25 Classifying Temporal Relations 1 2 RST Parsing PDTB Parsing Text Segmentation 3 Identify events within article Feature Generation 4 Convolution Kernel w SVM Temporal Relations 15

26 Classifying Temporal Relations 1 2 RST Parsing PDTB Parsing Text Segmentation Identify events within article Feature Generation 3 Convolution Kernel w SVM Temporal Relations

27 RST - Feature conjunction elaboration elaboration elaboration at (0915 GMT) while bus terminal at the city. A powerful. shed at the. airport 16

28 RST - Feature conjunction elaboration elaboration elaboration at (0915 GMT) while bus terminal at the city. A powerful. shed at the. airport 16

29 RST - Feature conjunction elaboration elaboration elaboration at (0915 GMT) while bus terminal at the city. A powerful. shed at the. airport conjunction elaboration 16

30 PDTB - Feature synchrony A powerful bomb. at about 5.15 pm (0915 GMT) while another explosion. at the city 17

31 PDTB - Feature synchrony A powerful bomb. at about 5.15 pm (0915 GMT) while another explosion. at the city 17

32 PDTB - Feature synchrony A powerful bomb. at about 5.15 pm (0915 GMT) while another explosion. at the city synchrony 17

33 Text Segmentation - Feature The Davao Medical Centre, a regional government hospital, recorded 19 deaths with 50 wounded. Medical evacuation workers however said the injured list was around 114, spread out at various hospitals. A powerful bomb tore through a waiting shed at the Davao City international airport at about 5.15 pm (0915 GMT) while another explosion hit a bus terminal at the city. 18

34 Text Segmentation - Feature 1 The Davao Medical Centre, a regional government hospital, recorded 19 deaths with 50 wounded. Medical evacuation workers however said the injured list was around 114, spread out at various hospitals. 2 A powerful bomb tore through a waiting shed at the Davao City international airport at about 5.15 pm (0915 GMT) while another explosion hit a bus terminal at the city. 18

35 Text Segmentation - Feature 1 The Davao Medical Centre, a regional government hospital, recorded 19 deaths with 50 wounded. Medical evacuation workers however said the injured list was around 114, spread out at various hospitals. 2 A powerful bomb tore through a waiting shed at the Davao City international airport at about 5.15 pm (0915 GMT) while another explosion hit a bus terminal at the city. 18

36 SVM + Tree Kernels Proposed discourse features are structural Tree kernels with SVM can help us capture structural similarity easily Does away with difficult feature engineering to vectorise discourse structures 19

37 Measuring Similarity with Tree Kernels R1 R1 R2 R3 R2 R4 R4 20

38 Measuring Similarity with Tree Kernels R1 R1 R1 R2 R1 R3 R2 R1 R3 R4 R4 21

39 Measuring Similarity with Tree Kernels R1 R1 R1 R2 R1 R3 R2 R1 R3 R4 R4 21

40 Outline Preliminaries Discourse Analysis Methodology Experiments Discussion 22

41 Data Set Subset of 20 news articles from ACE 2005 corpus Annotated with 375 event pairs Applied temporal transitivity rules to obtain 7994 event pairs 23

42 Temporal Transitivity Max opened the door. He went into the room. He placed the book on the table. 24

43 Temporal Transitivity Max opened the door. He went into the room. He placed the book on the table. opened went placed time 24

44 Temporal Transitivity Max opened the door. He went into the room. He placed the book on the table. opened went placed time 24

45 Temporal Transitivity Max opened the door. He went into the room. He placed the book on the table. opened went placed time 24

46 Data Set Breakdown Class Proportion OVERLAP 10% BEFORE 45% AFTER 45% 25

47 Sentence Gap and Temporal Class 600 AFTER BEFORE OVERLAP 450 # E-E Pairs Sentence Gap 26

48 Results System P R F1 1 Do Base Base + RST + PDTB + TS Base + RST + PDTB + TS + CR Base + ORST + PDTB + OTS + OCR

49 Results System P R F1 1 Do Base Base + RST + PDTB + TS Base + RST + PDTB + TS + CR Base + ORST + PDTB + OTS + OCR

50 Results System P R F1 1 Do Base Base + RST + PDTB + TS Base + RST + PDTB + TS + CR Base + ORST + PDTB + OTS + OCR

51 Results System P R F1 1 Do Base Base + RST + PDTB + TS Base + RST + PDTB + TS + CR Base + ORST + PDTB + OTS + OCR

52 Results System P R F1 1 Do Base Base + RST + PDTB + TS Base + RST + PDTB + TS + CR Base + ORST + PDTB + OTS + OCR

53 Results System P R F1 1 Do Base Base + RST + PDTB + TS Base + RST + PDTB + TS + CR Base + ORST + PDTB + OTS + OCR

54 Results System P R F1 1 Do Base Base + RST + PDTB + TS Base + RST + PDTB + TS + CR Base + ORST + PDTB + OTS + OCR

55 Useful Temporal Relations (Temporal.. (Temporal (Elaboration... (Elaboration (Background... 34

56 RST Before Relation Milosevic and his wife wielded enormous power in Yugoslavia for more than a decade before he was swept out of power after a popular revolt in October temporal Milosevic wielded a decade temporal before.. swept out.. power after a October

57 RST Before Relation Milosevic and his wife wielded enormous power in Yugoslavia for more than a decade before he was swept out of power after a popular revolt in October temporal Milosevic wielded a decade temporal before.. swept out.. power after a October

58 Confusion Matrix Predicted O B A N Actual O B A 14.7% 14.1% 12.8% 58.5% 0.5% 57.9% 15.5% 26.0% 0.5% 15.7% 57.3% 26.5% 36

59 Confusion Matrix Predicted O B A N Actual O B A 14.7% 14.1% 12.8% 58.5% 0.5% 57.9% 15.5% 26.0% 0.5% 15.7% 57.3% 26.5% 37

60 Confusion Matrix Predicted O B A N Actual O B A 14.7% 14.1% 12.8% 58.5% 0.5% 57.9% 15.5% 26.0% 0.5% 15.7% 57.3% 26.5% 38

61 Doing Better Investigate how to exploit other aspects of discourse analysis Integrate features with joint-inference approaches for better overall results 39

62 Conclusion Discourse analysis is important for temporal classification Rich structural information can be robustly captured with tree kernels Improvements to automatic discourse parsing results will help 40

Exploiting Conversation Structure in Unsupervised Topic Segmentation for s

Exploiting Conversation Structure in Unsupervised Topic Segmentation for  s Exploiting Conversation Structure in Unsupervised Topic Segmentation for Emails Shafiq Joty, Giuseppe Carenini, Gabriel Murray, Raymond Ng University of British Columbia Vancouver, Canada EMNLP 2010 1

More information

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,

More information

Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman

Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman Abstract We intend to show that leveraging semantic features can improve precision and recall of query results in information

More information

Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning

Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning Jing Ma 1, Wei Gao 2*, Kam-Fai Wong 1,3 1 The Chinese University of Hong Kong 2 Victoria University of Wellington, New Zealand

More information

Rhetoric Map of an Answer to Compound Queries

Rhetoric Map of an Answer to Compound Queries Rhetoric Map of an Answer to Compound Queries Boris Galitsky Knowledge Trail Inc. San-Francisco, USA bgalitsky@hotmail.com Dmitry Ilvovsky National Research University Higher School of Economics, Moscow,

More information

Ranking Algorithms For Digital Forensic String Search Hits

Ranking Algorithms For Digital Forensic String Search Hits DIGITAL FORENSIC RESEARCH CONFERENCE Ranking Algorithms For Digital Forensic String Search Hits By Nicole Beebe and Lishu Liu Presented At The Digital Forensic Research Conference DFRWS 2014 USA Denver,

More information

NUS-I2R: Learning a Combined System for Entity Linking

NUS-I2R: Learning a Combined System for Entity Linking NUS-I2R: Learning a Combined System for Entity Linking Wei Zhang Yan Chuan Sim Jian Su Chew Lim Tan School of Computing National University of Singapore {z-wei, tancl} @comp.nus.edu.sg Institute for Infocomm

More information

Indoor Object Recognition of 3D Kinect Dataset with RNNs

Indoor Object Recognition of 3D Kinect Dataset with RNNs Indoor Object Recognition of 3D Kinect Dataset with RNNs Thiraphat Charoensripongsa, Yue Chen, Brian Cheng 1. Introduction Recent work at Stanford in the area of scene understanding has involved using

More information

Large Scale Chinese News Categorization. Peng Wang. Joint work with H. Zhang, B. Xu, H.W. Hao

Large Scale Chinese News Categorization. Peng Wang. Joint work with H. Zhang, B. Xu, H.W. Hao Large Scale Chinese News Categorization --based on Improved Feature Selection Method Peng Wang Joint work with H. Zhang, B. Xu, H.W. Hao Computational-Brain Research Center Institute of Automation, Chinese

More information

Separating Speech From Noise Challenge

Separating Speech From Noise Challenge Separating Speech From Noise Challenge We have used the data from the PASCAL CHiME challenge with the goal of training a Support Vector Machine (SVM) to estimate a noise mask that labels time-frames/frequency-bins

More information

AutoODC: Automated Generation of Orthogonal Defect Classifications

AutoODC: Automated Generation of Orthogonal Defect Classifications AutoODC: Automated Generation of Orthogonal Defect Classifications LiGuo Huang 1 Vincent Ng 2 Isaac Persing 2 Ruili Geng 1 Xu Bai 1 Jeff Tian 1 Dept. of Computer Science and Engineering, Southern Methodist

More information

Domain-specific Concept-based Information Retrieval System

Domain-specific Concept-based Information Retrieval System Domain-specific Concept-based Information Retrieval System L. Shen 1, Y. K. Lim 1, H. T. Loh 2 1 Design Technology Institute Ltd, National University of Singapore, Singapore 2 Department of Mechanical

More information

Hands-Free Internet using Speech Recognition

Hands-Free Internet using Speech Recognition Introduction Trevor Donnell December 7, 2001 6.191 Preliminary Thesis Proposal Hands-Free Internet using Speech Recognition The hands-free Internet will be a system whereby a user has the ability to access

More information

Smarter text input system for mobile phone

Smarter text input system for mobile phone CS 229 project report Younggon Kim (younggon@stanford.edu) Smarter text input system for mobile phone 1. Abstract Machine learning algorithm was applied to text input system called T9. Support vector machine

More information

Online Pose Classification and Walking Speed Estimation using Handheld Devices

Online Pose Classification and Walking Speed Estimation using Handheld Devices Online Pose Classification and Walking Speed Estimation using Handheld Devices Jun-geun Park MIT CSAIL Joint work with: Ami Patel (MIT EECS), Jonathan Ledlie (Nokia Research), Dorothy Curtis (MIT CSAIL),

More information

Chapter 6 Evaluation Metrics and Evaluation

Chapter 6 Evaluation Metrics and Evaluation Chapter 6 Evaluation Metrics and Evaluation The area of evaluation of information retrieval and natural language processing systems is complex. It will only be touched on in this chapter. First the scientific

More information

Math Information Retrieval: User Requirements and Prototype Implementation. Jin Zhao, Min Yen Kan and Yin Leng Theng

Math Information Retrieval: User Requirements and Prototype Implementation. Jin Zhao, Min Yen Kan and Yin Leng Theng Math Information Retrieval: User Requirements and Prototype Implementation Jin Zhao, Min Yen Kan and Yin Leng Theng Why Math Information Retrieval? Examples: Looking for formulas Collect teaching resources

More information

Towards Summarizing the Web of Entities

Towards Summarizing the Web of Entities Towards Summarizing the Web of Entities contributors: August 15, 2012 Thomas Hofmann Director of Engineering Search Ads Quality Zurich, Google Switzerland thofmann@google.com Enrique Alfonseca Yasemin

More information

Two Practical Rhetorical Structure Theory Parsers

Two Practical Rhetorical Structure Theory Parsers Two Practical Rhetorical Structure Theory Parsers Mihai Surdeanu, Thomas Hicks, and Marco A. Valenzuela-Escárcega University of Arizona, Tucson, AZ, USA {msurdeanu, hickst, marcov}@email.arizona.edu Abstract

More information

WELCOME TO A SILVER JACKETS WEBINAR ON:

WELCOME TO A SILVER JACKETS WEBINAR ON: WELCOME TO A SILVER JACKETS WEBINAR ON: Flood Vulnerability Assessment for Critical Facilities For audio, call 877-336-1839 Access code: 8165946 Security Code: 4567 MOLLY WOLOSZYN Extension Climate Specialist

More information

Generating FrameNets of various granularities: The FrameNet Transformer

Generating FrameNets of various granularities: The FrameNet Transformer Generating FrameNets of various granularities: The FrameNet Transformer Josef Ruppenhofer, Jonas Sunde, & Manfred Pinkal Saarland University LREC, May 2010 Ruppenhofer, Sunde, Pinkal (Saarland U.) Generating

More information

Robust Lossless Data Hiding. Outline

Robust Lossless Data Hiding. Outline Robust Lossless Data Hiding Yun Q. Shi, Zhicheng Ni, Nirwan Ansari Electrical and Computer Engineering New Jersey Institute of Technology October 2010 1 Outline What is lossless data hiding Existing robust

More information

Section A. 1. a) Explain the evolution of information systems into today s complex information ecosystems and its consequences.

Section A. 1. a) Explain the evolution of information systems into today s complex information ecosystems and its consequences. Section A 1. a) Explain the evolution of information systems into today s complex information ecosystems and its consequences. b) Discuss the reasons behind the phenomenon of data retention, its disadvantages,

More information

Deep Learning based Authorship Identification

Deep Learning based Authorship Identification Deep Learning based Authorship Identification Chen Qian Tianchang He Rao Zhang Department of Electrical Engineering Stanford University, Stanford, CA 94305 cqian23@stanford.edu th7@stanford.edu zhangrao@stanford.edu

More information

Chapter 3. Speech segmentation. 3.1 Preprocessing

Chapter 3. Speech segmentation. 3.1 Preprocessing , as done in this dissertation, refers to the process of determining the boundaries between phonemes in the speech signal. No higher-level lexical information is used to accomplish this. This chapter presents

More information

Text Classification and Clustering Using Kernels for Structured Data

Text Classification and Clustering Using Kernels for Structured Data Text Mining SVM Conclusion Text Classification and Clustering Using, pgeibel@uos.de DGFS Institut für Kognitionswissenschaft Universität Osnabrück February 2005 Outline Text Mining SVM Conclusion 1 Text

More information

The Graph of an Equation Graph the following by using a table of values and plotting points.

The Graph of an Equation Graph the following by using a table of values and plotting points. Precalculus - Calculus Preparation - Section 1 Graphs and Models Success in math as well as Calculus is to use a multiple perspective -- graphical, analytical, and numerical. Thanks to Rene Descartes we

More information

Real time facial expression recognition from image sequences using Support Vector Machines

Real time facial expression recognition from image sequences using Support Vector Machines Real time facial expression recognition from image sequences using Support Vector Machines I. Kotsia a and I. Pitas a a Aristotle University of Thessaloniki, Department of Informatics, Box 451, 54124 Thessaloniki,

More information

DeepWalk: Online Learning of Social Representations

DeepWalk: Online Learning of Social Representations DeepWalk: Online Learning of Social Representations ACM SIG-KDD August 26, 2014, Rami Al-Rfou, Steven Skiena Stony Brook University Outline Introduction: Graphs as Features Language Modeling DeepWalk Evaluation:

More information

Mining Association Rules in Temporal Document Collections

Mining Association Rules in Temporal Document Collections Mining Association Rules in Temporal Document Collections Kjetil Nørvåg, Trond Øivind Eriksen, and Kjell-Inge Skogstad Dept. of Computer and Information Science, NTNU 7491 Trondheim, Norway Abstract. In

More information

Graphical Notation for Topic Maps (GTM)

Graphical Notation for Topic Maps (GTM) Graphical Notation for Topic Maps (GTM) 2005.11.12 Jaeho Lee University of Seoul jaeho@uos.ac.kr 1 Outline 2 Motivation Requirements for GTM Goals, Scope, Constraints, and Issues Survey on existing approaches

More information

Linear Algebra and Gaming

Linear Algebra and Gaming Linear Algebra and Gaming Bryan Lutgen 05/10/12 Abstract This article will demonstrate how linear algebra is used with virtual environments in video games. It also has implementation of applied physics

More information

Part I: Data Mining Foundations

Part I: Data Mining Foundations Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web and the Internet 2 1.3. Web Data Mining 4 1.3.1. What is Data Mining? 6 1.3.2. What is Web Mining?

More information

Fine-Grained Semantic Class Induction via Hierarchical and Collective Classification

Fine-Grained Semantic Class Induction via Hierarchical and Collective Classification Fine-Grained Semantic Class Induction via Hierarchical and Collective Classification Altaf Rahman and Vincent Ng Human Language Technology Research Institute The University of Texas at Dallas What are

More information

Comment Extraction from Blog Posts and Its Applications to Opinion Mining

Comment Extraction from Blog Posts and Its Applications to Opinion Mining Comment Extraction from Blog Posts and Its Applications to Opinion Mining Huan-An Kao, Hsin-Hsi Chen Department of Computer Science and Information Engineering National Taiwan University, Taipei, Taiwan

More information

Wikipedia and the Web of Confusable Entities: Experience from Entity Linking Query Creation for TAC 2009 Knowledge Base Population

Wikipedia and the Web of Confusable Entities: Experience from Entity Linking Query Creation for TAC 2009 Knowledge Base Population Wikipedia and the Web of Confusable Entities: Experience from Entity Linking Query Creation for TAC 2009 Knowledge Base Population Heather Simpson 1, Stephanie Strassel 1, Robert Parker 1, Paul McNamee

More information

Lecture 10 May 14, Prabhakar Raghavan

Lecture 10 May 14, Prabhakar Raghavan Lecture 10 May 14, 2001 Prabhakar Raghavan Centroid/nearest-neighbor classification Bayesian Classification Link-based classification Document summarization Given training docs for a topic, compute their

More information

LSTM: An Image Classification Model Based on Fashion-MNIST Dataset

LSTM: An Image Classification Model Based on Fashion-MNIST Dataset LSTM: An Image Classification Model Based on Fashion-MNIST Dataset Kexin Zhang, Research School of Computer Science, Australian National University Kexin Zhang, U6342657@anu.edu.au Abstract. The application

More information

SAPIENT Automation project

SAPIENT Automation project Dr Maria Liakata Leverhulme Trust Early Career fellow Department of Computer Science, Aberystwyth University Visitor at EBI, Cambridge mal@aber.ac.uk 25 May 2010, London Motivation SAPIENT Automation Project

More information

S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking

S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking Yi Yang * and Ming-Wei Chang # * Georgia Institute of Technology, Atlanta # Microsoft Research, Redmond Traditional

More information

Predicting Messaging Response Time in a Long Distance Relationship

Predicting Messaging Response Time in a Long Distance Relationship Predicting Messaging Response Time in a Long Distance Relationship Meng-Chen Shieh m3shieh@ucsd.edu I. Introduction The key to any successful relationship is communication, especially during times when

More information

Question Answering Using XML-Tagged Documents

Question Answering Using XML-Tagged Documents Question Answering Using XML-Tagged Documents Ken Litkowski ken@clres.com http://www.clres.com http://www.clres.com/trec11/index.html XML QA System P Full text processing of TREC top 20 documents Sentence

More information

Automatic Summarization

Automatic Summarization Automatic Summarization CS 769 Guest Lecture Andrew B. Goldberg goldberg@cs.wisc.edu Department of Computer Sciences University of Wisconsin, Madison February 22, 2008 Andrew B. Goldberg (CS Dept) Summarization

More information

FLOOD VULNERABILITY ASSESSMENT FOR CRITICAL FACILITIES

FLOOD VULNERABILITY ASSESSMENT FOR CRITICAL FACILITIES FLOOD VULNERABILITY ASSESSMENT FOR CRITICAL FACILITIES Lisa Graff GIS Team Manager Prairie Research Institute Illinois State Water Survey University of Illinois OUTLINE Motivation Project details Partners

More information

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How

More information

Multimodal Medical Image Retrieval based on Latent Topic Modeling

Multimodal Medical Image Retrieval based on Latent Topic Modeling Multimodal Medical Image Retrieval based on Latent Topic Modeling Mandikal Vikram 15it217.vikram@nitk.edu.in Suhas BS 15it110.suhas@nitk.edu.in Aditya Anantharaman 15it201.aditya.a@nitk.edu.in Sowmya Kamath

More information

Topics du jour CS347. Centroid/NN. Example

Topics du jour CS347. Centroid/NN. Example Topics du jour CS347 Lecture 10 May 14, 2001 Prabhakar Raghavan Centroid/nearest-neighbor classification Bayesian Classification Link-based classification Document summarization Centroid/NN Given training

More information

Video Aesthetic Quality Assessment by Temporal Integration of Photo- and Motion-Based Features. Wei-Ta Chu

Video Aesthetic Quality Assessment by Temporal Integration of Photo- and Motion-Based Features. Wei-Ta Chu 1 Video Aesthetic Quality Assessment by Temporal Integration of Photo- and Motion-Based Features Wei-Ta Chu H.-H. Yeh, C.-Y. Yang, M.-S. Lee, and C.-S. Chen, Video Aesthetic Quality Assessment by Temporal

More information

DATA MINING II - 1DL460

DATA MINING II - 1DL460 DATA MINING II - 1DL460 Spring 2016 A second course in data mining http://www.it.uu.se/edu/course/homepage/infoutv2/vt16 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,

More information

AN APPROACH FOR SCANNING BASIC BUSINESS CARD IN ANDROID DEVICE

AN APPROACH FOR SCANNING BASIC BUSINESS CARD IN ANDROID DEVICE AN APPROACH FOR SCANNING BASIC BUSINESS CARD IN ANDROID DEVICE Amareshwar Manjunath Bhat 1, Sahana S Patkar 2, Prof.Venugopal 3 1 MCA, 6th Semester, 2 BE, 8 th Semester CSE Department, 3 CSE Department

More information

Parsing tree matching based question answering

Parsing tree matching based question answering Parsing tree matching based question answering Ping Chen Dept. of Computer and Math Sciences University of Houston-Downtown chenp@uhd.edu Wei Ding Dept. of Computer Science University of Massachusetts

More information

Language resource management Semantic annotation framework (SemAF) Part 8: Semantic relations in discourse, core annotation schema (DR-core)

Language resource management Semantic annotation framework (SemAF) Part 8: Semantic relations in discourse, core annotation schema (DR-core) INTERNATIONAL STANDARD ISO 24617-8 First edition 2016-12-15 Language resource management Semantic annotation framework (SemAF) Part 8: Semantic relations in discourse, core annotation schema (DR-core)

More information

Overview of the medical task of ImageCLEF Alba G. Seco de Herrera Stefano Bromuri Roger Schaer Henning Müller

Overview of the medical task of ImageCLEF Alba G. Seco de Herrera Stefano Bromuri Roger Schaer Henning Müller Overview of the medical task of ImageCLEF 2016 Alba G. Seco de Herrera Stefano Bromuri Roger Schaer Henning Müller Tasks in ImageCLEF 2016 Automatic image annotation Medical image classification Sub-tasks

More information

A Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition

A Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition A Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition Ana Zelaia, Olatz Arregi and Basilio Sierra Computer Science Faculty University of the Basque Country ana.zelaia@ehu.es

More information

Computer Science 572 Midterm Prof. Horowitz Tuesday, March 12, 2013, 12:30pm 1:45pm

Computer Science 572 Midterm Prof. Horowitz Tuesday, March 12, 2013, 12:30pm 1:45pm Computer Science 572 Midterm Prof. Horowitz Tuesday, March 12, 2013, 12:30pm 1:45pm Name: Student Id Number: 1. This is a closed book exam. 2. Please answer all questions. 3. There are a total of 40 questions.

More information

What s the Difference?

What s the Difference? What s the Difference? Subtracting Integers Learning Goals In this lesson, you will: Model subtraction of integers using two-color counters. Model subtraction of integers on a number line. Develop a rule

More information

Learning Algorithms for Medical Image Analysis. Matteo Santoro slipguru

Learning Algorithms for Medical Image Analysis. Matteo Santoro slipguru Learning Algorithms for Medical Image Analysis Matteo Santoro slipguru santoro@disi.unige.it June 8, 2010 Outline 1. learning-based strategies for quantitative image analysis 2. automatic annotation of

More information

A CASE STUDY: Structure learning for Part-of-Speech Tagging. Danilo Croce WMR 2011/2012

A CASE STUDY: Structure learning for Part-of-Speech Tagging. Danilo Croce WMR 2011/2012 A CAS STUDY: Structure learning for Part-of-Speech Tagging Danilo Croce WM 2011/2012 27 gennaio 2012 TASK definition One of the tasks of VALITA 2009 VALITA is an initiative devoted to the evaluation of

More information

A Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition

A Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition A Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition Ana Zelaia, Olatz Arregi and Basilio Sierra Computer Science Faculty University of the Basque Country ana.zelaia@ehu.es

More information

Homework 2: Parsing and Machine Learning

Homework 2: Parsing and Machine Learning Homework 2: Parsing and Machine Learning COMS W4705_001: Natural Language Processing Prof. Kathleen McKeown, Fall 2017 Due: Saturday, October 14th, 2017, 2:00 PM This assignment will consist of tasks in

More information

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank text

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank text Philosophische Fakultät Seminar für Sprachwissenschaft Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank text 06 July 2017, Patricia Fischer & Neele Witte Overview Sentiment

More information

Text Categorization (I)

Text Categorization (I) CS473 CS-473 Text Categorization (I) Luo Si Department of Computer Science Purdue University Text Categorization (I) Outline Introduction to the task of text categorization Manual v.s. automatic text categorization

More information

Text Mining. Representation of Text Documents

Text Mining. Representation of Text Documents Data Mining is typically concerned with the detection of patterns in numeric data, but very often important (e.g., critical to business) information is stored in the form of text. Unlike numeric data,

More information

Image Classification Using Wavelet Coefficients in Low-pass Bands

Image Classification Using Wavelet Coefficients in Low-pass Bands Proceedings of International Joint Conference on Neural Networks, Orlando, Florida, USA, August -7, 007 Image Classification Using Wavelet Coefficients in Low-pass Bands Weibao Zou, Member, IEEE, and Yan

More information

DATA MINING II - 1DL460

DATA MINING II - 1DL460 DATA MINING II - 1DL460 Spring 2012 A second course in data mining!! http://www.it.uu.se/edu/course/homepage/infoutv2/vt12 Kjell Orsborn! Uppsala Database Laboratory! Department of Information Technology,

More information

Semantic and Multimodal Annotation. CLARA University of Copenhagen August 2011 Susan Windisch Brown

Semantic and Multimodal Annotation. CLARA University of Copenhagen August 2011 Susan Windisch Brown Semantic and Multimodal Annotation CLARA University of Copenhagen 15-26 August 2011 Susan Windisch Brown 2 Program: Monday Big picture Coffee break Lexical ambiguity and word sense annotation Lunch break

More information

Title: "A Physician's Authoring Tool for Generation of Personalized. Health Education in Reconstructive Surgery

Title: A Physician's Authoring Tool for Generation of Personalized. Health Education in Reconstructive Surgery Title: "A Physician's Authoring Tool for Generation of Personalized Health Education in Reconstructive Surgery Chrysanne Di Marco, 1 Don Cowan, 1 Peter Bray, 2 Dominic Covvey, 1 Vic Di Ciccio, 1 Eduard

More information

Research Article. Three-dimensional modeling of simulation scene in campus navigation system

Research Article. Three-dimensional modeling of simulation scene in campus navigation system Available online www.jocpr.com Journal of Chemical and Pharmaceutical Research, 2013, 5(12):103-107 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 Three-dimensional modeling of simulation scene

More information

arxiv: v1 [cs.cv] 31 Mar 2016

arxiv: v1 [cs.cv] 31 Mar 2016 Object Boundary Guided Semantic Segmentation Qin Huang, Chunyang Xia, Wenchao Zheng, Yuhang Song, Hao Xu and C.-C. Jay Kuo arxiv:1603.09742v1 [cs.cv] 31 Mar 2016 University of Southern California Abstract.

More information

P-CNN: Pose-based CNN Features for Action Recognition. Iman Rezazadeh

P-CNN: Pose-based CNN Features for Action Recognition. Iman Rezazadeh P-CNN: Pose-based CNN Features for Action Recognition Iman Rezazadeh Introduction automatic understanding of dynamic scenes strong variations of people and scenes in motion and appearance Fine-grained

More information

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs Zhipeng Yan, Moyuan Huang, Hao Jiang 5/1/2017 1 Outline Background semantic segmentation Objective,

More information

Machines that test Software like Humans

Machines that test Software like Humans Machines that test Software like Humans Anurag Dwarakanath anurag.dwarakanath@accenture.com Neville Dubash neville.dubash@accenture.com Sanjay Podder sanjay.podder@accenture.com Abstract Automated software

More information

Using machine learning to predict temporal orientation of search engines queries in the Temporalia challenge

Using machine learning to predict temporal orientation of search engines queries in the Temporalia challenge Using machine learning to predict temporal orientation of search engines queries in the Temporalia challenge Michele Filannino, Goran Nenadic filannim@cs.man.ac.uk, g.nenadic@manchester.ac.uk Tokyo, 11/12/2014

More information

Separation of Overlapping Text from Graphics

Separation of Overlapping Text from Graphics Separation of Overlapping Text from Graphics Ruini Cao, Chew Lim Tan School of Computing, National University of Singapore 3 Science Drive 2, Singapore 117543 Email: {caorn, tancl}@comp.nus.edu.sg Abstract

More information

Query Difficulty Prediction for Contextual Image Retrieval

Query Difficulty Prediction for Contextual Image Retrieval Query Difficulty Prediction for Contextual Image Retrieval Xing Xing 1, Yi Zhang 1, and Mei Han 2 1 School of Engineering, UC Santa Cruz, Santa Cruz, CA 95064 2 Google Inc., Mountain View, CA 94043 Abstract.

More information

Domain-specific user preference prediction based on multiple user activities

Domain-specific user preference prediction based on multiple user activities 7 December 2016 Domain-specific user preference prediction based on multiple user activities Author: YUNFEI LONG, Qin Lu,Yue Xiao, MingLei Li, Chu-Ren Huang. www.comp.polyu.edu.hk/ Dept. of Computing,

More information

Transition-based dependency parsing

Transition-based dependency parsing Transition-based dependency parsing Syntactic analysis (5LN455) 2014-12-18 Sara Stymne Department of Linguistics and Philology Based on slides from Marco Kuhlmann Overview Arc-factored dependency parsing

More information

Hebei University of Technology A Text-Mining-based Patent Analysis in Product Innovative Process

Hebei University of Technology A Text-Mining-based Patent Analysis in Product Innovative Process A Text-Mining-based Patent Analysis in Product Innovative Process Liang Yanhong, Tan Runhua Abstract Hebei University of Technology Patent documents contain important technical knowledge and research results.

More information

Machine Learning (CSMML16) (Autumn term, ) Xia Hong

Machine Learning (CSMML16) (Autumn term, ) Xia Hong Machine Learning (CSMML16) (Autumn term, 28-29) Xia Hong 1 Useful books: 1. C. M. Bishop: Pattern Recognition and Machine Learning (2007) Springer. 2. S. Haykin: Neural Networks (1999) Prentice Hall. 3.

More information

An Error Analysis Tool for Natural Language Processing and Applied Machine Learning

An Error Analysis Tool for Natural Language Processing and Applied Machine Learning An Error Analysis Tool for Natural Language Processing and Applied Machine Learning Apoorv Agarwal Department of Computer Science Columbia University New York, NY, USA apoorv@cs.columbia.edu Ankit Agarwal

More information

EXPLORING STYLE EMERGENCE IN ARCHITECTURAL DESIGNS

EXPLORING STYLE EMERGENCE IN ARCHITECTURAL DESIGNS EXPLORING STYLE EMERGENCE IN ARCHITECTURAL DESIGNS JOHN S. GERO AND LAN DING Key Centre of Design Computing Department of Architectural and Design Science University of Sydney NSW 2006 Australia e-mail:{john,lan}@arch.usyd.edu.au

More information

Structured Learning. Jun Zhu

Structured Learning. Jun Zhu Structured Learning Jun Zhu Supervised learning Given a set of I.I.D. training samples Learn a prediction function b r a c e Supervised learning (cont d) Many different choices Logistic Regression Maximum

More information

Tokenization and Sentence Segmentation. Yan Shao Department of Linguistics and Philology, Uppsala University 29 March 2017

Tokenization and Sentence Segmentation. Yan Shao Department of Linguistics and Philology, Uppsala University 29 March 2017 Tokenization and Sentence Segmentation Yan Shao Department of Linguistics and Philology, Uppsala University 29 March 2017 Outline 1 Tokenization Introduction Exercise Evaluation Summary 2 Sentence segmentation

More information

ME 4054W: SENIOR DESIGN PROJECTS

ME 4054W: SENIOR DESIGN PROJECTS ME 4054W: SENIOR DESIGN PROJECTS Week 3 Thursday Documenting Your Design Before we get started We have received feedback from an industry advisor that some of the students on their design team were not

More information

Ghent University-IBCN Participation in TAC-KBP 2015 Cold Start Slot Filling task

Ghent University-IBCN Participation in TAC-KBP 2015 Cold Start Slot Filling task Ghent University-IBCN Participation in TAC-KBP 2015 Cold Start Slot Filling task Lucas Sterckx, Thomas Demeester, Johannes Deleu, Chris Develder Ghent University - iminds Gaston Crommenlaan 8 Ghent, Belgium

More information

Excel Manual X Axis Labels Below Chart 2010 Scatter

Excel Manual X Axis Labels Below Chart 2010 Scatter Excel Manual X Axis Labels Below Chart 2010 Scatter Of course, I want the chart itself to remain the same, so, the x values of dots are in row "b(o/c)", their y values are in "a(h/c)" row, and their respective

More information

Electronically Filed - City of St. Louis - May 27, :02 PM

Electronically Filed - City of St. Louis - May 27, :02 PM Electronically Filed - City of St. Louis - May 27, 2014-03:02 PM Electronically Filed - City of St. Louis - May 27, 2014-03:02 PM Electronically Filed - City of St. Louis - May 27, 2014-03:02 PM Electronically

More information

Annotation of Human Motion Capture Data using Conditional Random Fields

Annotation of Human Motion Capture Data using Conditional Random Fields Annotation of Human Motion Capture Data using Conditional Random Fields Mert Değirmenci Department of Computer Engineering, Middle East Technical University, Turkey mert.degirmenci@ceng.metu.edu.tr Anıl

More information

Data 100 Lecture 5: Data Cleaning & Exploratory Data Analysis

Data 100 Lecture 5: Data Cleaning & Exploratory Data Analysis OrderNum ProdID Name OrderId Cust Name Date 1 42 Gum 1 Joe 8/21/2017 2 999 NullFood 2 Arthur 8/14/2017 2 42 Towel 2 Arthur 8/14/2017 1/31/18 Data 100 Lecture 5: Data Cleaning & Exploratory Data Analysis

More information

Darshan Institute of Engineering & Technology for Diploma Studies

Darshan Institute of Engineering & Technology for Diploma Studies REQUIREMENTS GATHERING AND ANALYSIS The analyst starts requirement gathering activity by collecting all information that could be useful to develop system. In practice it is very difficult to gather all

More information

Self-tuning ongoing terminology extraction retrained on terminology validation decisions

Self-tuning ongoing terminology extraction retrained on terminology validation decisions Self-tuning ongoing terminology extraction retrained on terminology validation decisions Alfredo Maldonado and David Lewis ADAPT Centre, School of Computer Science and Statistics, Trinity College Dublin

More information

Every Picture Tells a Story: Generating Sentences from Images

Every Picture Tells a Story: Generating Sentences from Images Every Picture Tells a Story: Generating Sentences from Images Ali Farhadi, Mohsen Hejrati, Mohammad Amin Sadeghi, Peter Young, Cyrus Rashtchian, Julia Hockenmaier, David Forsyth University of Illinois

More information

Data 100. Lecture 5: Data Cleaning & Exploratory Data Analysis

Data 100. Lecture 5: Data Cleaning & Exploratory Data Analysis Data 100 Lecture 5: Data Cleaning & Exploratory Data Analysis Slides by: Joseph E. Gonzalez, Deb Nolan, & Joe Hellerstein jegonzal@berkeley.edu deborah_nolan@berkeley.edu hellerstein@berkeley.edu? Last

More information

Natural Language Processing

Natural Language Processing Natural Language Processing Classification III Dan Klein UC Berkeley 1 Classification 2 Linear Models: Perceptron The perceptron algorithm Iteratively processes the training set, reacting to training errors

More information

Robust PDF Table Locator

Robust PDF Table Locator Robust PDF Table Locator December 17, 2016 1 Introduction Data scientists rely on an abundance of tabular data stored in easy-to-machine-read formats like.csv files. Unfortunately, most government records

More information

Comparative analysis of data mining methods for predicting credit default probabilities in a retail bank portfolio

Comparative analysis of data mining methods for predicting credit default probabilities in a retail bank portfolio Comparative analysis of data mining methods for predicting credit default probabilities in a retail bank portfolio Adela Ioana Tudor, Adela Bâra, Simona Vasilica Oprea Department of Economic Informatics

More information

Introduction. Can we use Google for networking research?

Introduction. Can we use Google for networking research? Unconstrained Profiling of Internet Endpoints via Information on the Web ( Googling the Internet) Ionut Trestian1 Soups Ranjan2 Aleksandar Kuzmanovic1 Antonio Nucci2 1 Northwestern 2 Narus University Inc.

More information

Query Modifications Patterns During Web Searching

Query Modifications Patterns During Web Searching Bernard J. Jansen The Pennsylvania State University jjansen@ist.psu.edu Query Modifications Patterns During Web Searching Amanda Spink Queensland University of Technology ah.spink@qut.edu.au Bhuva Narayan

More information

Sentiment analysis under temporal shift

Sentiment analysis under temporal shift Sentiment analysis under temporal shift Jan Lukes and Anders Søgaard Dpt. of Computer Science University of Copenhagen Copenhagen, Denmark smx262@alumni.ku.dk Abstract Sentiment analysis models often rely

More information

Automatic Domain Partitioning for Multi-Domain Learning

Automatic Domain Partitioning for Multi-Domain Learning Automatic Domain Partitioning for Multi-Domain Learning Di Wang diwang@cs.cmu.edu Chenyan Xiong cx@cs.cmu.edu William Yang Wang ww@cmu.edu Abstract Multi-Domain learning (MDL) assumes that the domain labels

More information