Computing Similarity between Cultural Heritage Items using Multimodal Features
Nikolaos Aletras and Mark Stevenson
Department of Computer Science, University of Sheffield
Could the combination of textual and image features assist in similarity estimation between Cultural Heritage items?
Yes. We show that making use of text and image features produces better estimates of similarity than considering only one medium.
Talk Outline
1.
2. Text Similarity
3. Image Similarity
4. Combining Text and Image Similarity
5. Evaluation
6. Results
7. Conclusion
- A huge number of Cultural Heritage (CH) artefacts have been digitised, e.g. by the Louvre, the British Museum and Europeana.
- Artefacts are usually associated with some text and an image.
- Information is diverse and unstructured.
- Exploration and navigation are difficult.

Solution
- Identify similar items in collections.
- Text similarity.
- Image similarity.
Text Similarity
- Corpus-based approaches rely on statistics learned from a given corpus. Each CH item is treated as a document.
- Word Overlap: the number of tokens that the associated texts of two items have in common, normalised by the combined total.
- N-gram Overlap: identify the n-grams that two texts have in common and increase the score by $n^2$ for each shared n-gram of length $n$.
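The two overlap measures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how "the combined total" is counted, so the sketch assumes the number of distinct tokens across both texts, and it scores every distinct shared n-gram up to a hypothetical `max_n` rather than reproducing the exact procedure used in the talk. The function names are invented for the example.

```python
def tokens(text):
    """Lower-case whitespace tokenisation (a stand-in for real preprocessing)."""
    return text.lower().split()


def word_overlap(text_a, text_b):
    """Shared tokens divided by the distinct tokens of both texts (assumed normalisation)."""
    a, b = set(tokens(text_a)), set(tokens(text_b))
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def ngrams(seq, n):
    """Set of all contiguous n-grams of length n in a token list."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def ngram_overlap(text_a, text_b, max_n=3):
    """Add n**2 to the score for every distinct n-gram of length n found in both texts."""
    ta, tb = tokens(text_a), tokens(text_b)
    score = 0
    for n in range(1, max_n + 1):
        shared = ngrams(ta, n) & ngrams(tb, n)
        score += len(shared) * n ** 2
    return score


if __name__ == "__main__":
    a = "oil painting of a ship"
    b = "painting of a harbour"
    print(word_overlap(a, b))
    print(ngram_overlap(a, b))
```

Squaring the n-gram length rewards longer shared phrases much more than the same number of isolated shared words.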
- TF.IDF: term and document frequencies are computed over a corpus of CH items.
- Latent Dirichlet Allocation (Blei et al., 2003): summarises a collection of CH items as a predefined number of topics. Each document is represented as a probability distribution over topics, and each topic as a probability distribution over the words of the corpus. Similarity is the cosine between the topic distributions of two documents, treated as vectors.
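As a rough illustration of the vector-based measures, the sketch below builds tf.idf vectors from a small corpus and compares them with cosine similarity. The weighting variant (raw term frequency times `log(N / df)`) is an assumption, since the slides do not specify one, and the function names are hypothetical. The same `cosine` function can be applied to LDA topic distributions once they are written as sparse vectors mapping topic index to probability; fitting the topic model itself is outside this sketch.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse tf.idf vector (dict term -> weight) per document."""
    tokenised = [d.lower().split() for d in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in tokenised for term in set(doc))
    vectors = []
    for doc in tokenised:
        tf = Counter(doc)
        vectors.append({t: c * math.log(n_docs / df[t]) for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    corpus = ["roman coin silver", "roman coin bronze", "oil painting landscape"]
    vecs = tfidf_vectors(corpus)
    print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
    # Topic distributions (topic index -> probability) can be compared the same way.
    print(cosine({0: 0.7, 1: 0.3}, {0: 0.6, 1: 0.4}))
```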
Image Similarity
- RGB Histogram Intersection (Swain and Ballard, 1991): colour histograms record the number of pixels that fall within predefined intervals (bins). The intersection measures how much of the two histograms coincides, i.e. how many pixels fall into the same colour bins in both images. Similarity score: the average of the red, green and blue histogram similarity scores.
- Image Querying Metric (Jacobs et al., 1995): features capture colour and basic shape information; implemented in the imgseek API.
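A minimal sketch of the RGB histogram measure follows, assuming images are given as lists of `(r, g, b)` tuples with 8-bit channel values. The bin count of 16 and the normalisation by the smaller pixel count are assumptions made for the example, and the function names are hypothetical; the imgseek metric is not reproduced here.

```python
def channel_histogram(values, bins=16):
    """Count how many 8-bit channel values (0..255) fall into each bin."""
    hist = [0] * bins
    for v in values:
        hist[v * bins // 256] += 1
    return hist


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima, normalised by the smaller histogram's pixel count."""
    total = min(sum(h1), sum(h2))
    if total == 0:
        return 0.0
    return sum(min(a, b) for a, b in zip(h1, h2)) / total


def rgb_similarity(pixels_a, pixels_b, bins=16):
    """Average of the red, green and blue histogram intersection scores."""
    scores = []
    for channel in range(3):
        ha = channel_histogram([p[channel] for p in pixels_a], bins)
        hb = channel_histogram([p[channel] for p in pixels_b], bins)
        scores.append(histogram_intersection(ha, hb))
    return sum(scores) / 3


if __name__ == "__main__":
    red = [(255, 0, 0)] * 4
    blue = [(0, 0, 255)] * 4
    print(rgb_similarity(red, red))
    print(rgb_similarity(red, blue))
```

In the example, the pure red and pure blue images still agree on the green channel, which is why per-channel histograms alone are a fairly coarse signal.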
Combining Text and Image Similarity
A weighted linear combination of the text and image similarities, $sim_t$ and $sim_{img}$, between two items $A$ and $B$:

$$sim_{T+I}(A, B) = w_1 \, sim_t(A_t, B_t) + w_2 \, sim_{img}(A_i, B_i)$$

The weights $w_1$ and $w_2$ are optimised using standard linear regression.
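The weight fitting can be illustrated with an ordinary least-squares solve for two predictors. This sketch assumes a model without an intercept, matching the two-weight formula on the slide; whether the original regression included an intercept is not stated. The names `fit_weights` and `combined_similarity` are hypothetical.

```python
def fit_weights(text_sims, image_sims, gold):
    """Least-squares estimate of (w1, w2) in gold ~ w1 * text + w2 * image (no intercept)."""
    s_tt = sum(t * t for t in text_sims)
    s_ii = sum(i * i for i in image_sims)
    s_ti = sum(t * i for t, i in zip(text_sims, image_sims))
    s_tg = sum(t * g for t, g in zip(text_sims, gold))
    s_ig = sum(i * g for i, g in zip(image_sims, gold))
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s_tt * s_ii - s_ti ** 2
    if det == 0:
        raise ValueError("text and image similarities are collinear")
    w1 = (s_tg * s_ii - s_ig * s_ti) / det
    w2 = (s_ig * s_tt - s_tg * s_ti) / det
    return w1, w2


def combined_similarity(sim_t, sim_img, w1, w2):
    """Weighted linear combination of a text and an image similarity score."""
    return w1 * sim_t + w2 * sim_img


if __name__ == "__main__":
    text = [0.1, 0.5, 0.9, 0.3]
    image = [0.2, 0.1, 0.7, 0.6]
    gold = [3 * t + 1 * i for t, i in zip(text, image)]  # synthetic ratings
    w1, w2 = fit_weights(text, image, gold)
    print(w1, w2, combined_similarity(0.4, 0.5, w1, w2))
```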
Evaluation: Europeana
- Web portal providing access to collections of CH items.
- 2,000 contributors throughout Europe.
- 20M CH artefacts, e.g. paintings, photographs, sculpture, newspaper archives.
- Information about each artefact: title, description, subject, creator, provider.
- Thumbnail image.
- Metadata in an XML Schema.
Evaluation Data
- Europeana295 data set: 295 pairs of items from Culture Grid and Scran.
- Textual information: title, description, subject keywords.
- Preprocessing: stemming and stop-word removal.
- Visual information: thumbnail image (average size 7,000-10,000 pixels).
Human Judgements of Similarity
- Crowdflower: humans rate pairs from 0 to 4 (unrelated to highly similar).
- 3,261 annotations from 99 participants.
- Gold standard generated as the average of the human ratings for each pair.
- The linear regression model is trained using the gold standard.
- Inter-annotator agreement: the average of the Pearson correlations between the ratings of each participant and the average ratings of the other participants. ρ =
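The gold standard and the agreement figure can be sketched as follows. As a simplifying assumption, every participant here rates every pair, which was probably not the case in the actual study (3,261 annotations from 99 participants); the data layout and function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def gold_standard(ratings):
    """Average rating per pair; ratings maps participant -> list of 0-4 scores."""
    return [sum(col) / len(col) for col in zip(*ratings.values())]


def inter_annotator_agreement(ratings):
    """Mean correlation of each participant with the average of all the others."""
    correlations = []
    for name in ratings:
        others = [ratings[o] for o in ratings if o != name]
        others_mean = [sum(col) / len(col) for col in zip(*others)]
        correlations.append(pearson(ratings[name], others_mean))
    return sum(correlations) / len(correlations)


if __name__ == "__main__":
    ratings = {"a": [0, 2, 4, 1], "b": [1, 2, 4, 0], "c": [0, 3, 4, 1]}
    print(gold_standard(ratings))
    print(inter_annotator_agreement(ratings))
```

The same `pearson` function also covers the evaluation step, where each similarity measure's scores are correlated with the gold-standard averages.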
Experiments
Three types of experiments:
- Text similarity measures between pairs of items.
- Image similarity measures between pairs of items.
- Linear combination of text and image similarities.
Performance is measured as Pearson's correlation coefficient with the gold-standard data.
Results
Table: Performance of similarity measures applied to the Europeana295 data set (Pearson's correlation coefficient). The table crosses the image similarity measures (RGB, imgseek) with the text similarity measures (Word Overlap, tf.idf, N-gram overlap, LDA).
- Best performance for text similarity: Word Overlap.
- For image similarity, results using imgseek are higher than those using RGB.
- Results obtained from both image similarity measures are lower than those of all the text-based measures.
- The performance of all text similarity measures improves when combined with imgseek.
- RGB reduces performance when combined with text measures.
Conclusion
- Information from text and images of CH artefacts can be combined to improve similarity estimation.
- We combined four corpus-based and two image-based similarity measures.
- Evaluation on a data set of 295 manually annotated pairs of items from Europeana.
- Results showed that the imgseek similarity method consistently improves the performance of text similarity methods.
Thank You
For more details about the PATHS project, please visit:
Questions?
References
- David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 2003.
- Charles E. Jacobs, Adam Finkelstein, and David H. Salesin. Fast multiresolution image querying. In Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 95), New York, NY, USA, 1995.
- Michael J. Swain and Dana H. Ballard. Color indexing. International Journal of Computer Vision, 7:11-32, 1991.