ALMA MATER STUDIORUM - UNIVERSITÀ DI BOLOGNA
Information Technology for Documentary Data Representation
Laurea Magistrale in Scienze del Libro e del Documento
University of Bologna
Course presentation, Academic Year 2016/2017
Home page: http://www-db.disi.unibo.it/courses/dd/
Electronic version: 0.01.Presentation.pdf
Electronic version: 0.01.Presentation-2p.pdf
Ravenna, February 3rd, 2017
Teacher
Prof.ssa Ilaria Bartolini
Department of Computer Science and Engineering (DISI), University of Bologna
Viale Risorgimento, 2, Bologna
Multimedia Database Group: http://www-db.disi.unibo.it/mmdbgroup/
Datalab: http://www-db.disi.unibo.it/research/datalab/
Contacts
E-mail: ilaria.bartolini@unibo.it
Telephone: 051 20 93550
Web: http://www-db.disi.unibo.it/ibartolini/
Office hours: by appointment (via email)
General information and course calendar
Name: Information Technology for Documentary Data Representation
Credits: 6
Teaching hours: 30 hours
Period: III (February 3rd, 2017 to March 9th, 2017)
Lecture schedule: Friday 10:00-13:00 and Friday 14:00-16:00
Location: Room 2, Campus Ravenna - Via Mariani, 5
Course contents
Learning outcomes
The course aims to give students the knowledge and skills needed to manage documentary databases effectively and efficiently. Particular attention is devoted to document representation, indexing techniques, and query models (together with metrics for evaluating the quality of the returned results), starting from traditional documents and then moving on to more complex multimedia documents.
Topics at a glance
  Basics on structured, semi-structured, and unstructured data
  Textual Information Retrieval
  Multimedia Information Retrieval
Main goal
Facilitate and improve access to documentary data repositories for general users, jointly exploiting:
  metadata manually provided by dedicated users
  low-level features (e.g., document keywords, color of an image)
  semi-automatically provided annotations
Models, Algorithms, Interfaces
Example repositories: Archivio Storico Fiat (historical archive), Cineteca (film archive), Archivio Artistico (art archive)
Example records:
  Fiat G212 three-engine aircraft. Date: 1947. Collection: Industrial culture theme. Type: Image. Aircraft, Engine, Wings
  Das Cabinet des Dr. Caligari. Date: 1920. Country: Germany. Director: Robert Wiene. Genre: Horror. Expressionism, Hypnosis, Sleepwalking
  La Gioconda. Site: Louvre Museum, Paris. Century: XVI. Author: Leonardo da Vinci. Period: Renaissance. Date: 1503. Painting, Portrait, Smile
Detailed program
Basics on structured, semi-structured, and unstructured data
Textual Information Retrieval (IR) systems: general principles
  Document representation in IR systems
  Automatic indexing techniques, stemming, stoplist
  Boolean searches
  Phrase and proximity searches
  The Vector Space model: weighting techniques and ranking of the results
  Evaluation of IR systems: Precision and Recall metrics
Multimedia Information Retrieval (MM-IR)
  General concepts of MM-IR systems: feature extraction and similarity criteria
  Examples for different types of multimedia data
  Query paradigms and presentation of the results
  Interactive searches
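As a preview of the evaluation topic in the program above, the sketch below computes Precision and Recall for a single query using their standard definitions. The document identifiers and the function name `precision_recall` are illustrative assumptions, not course material.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set  # relevant documents that were retrieved
    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: the system returns four documents,
# while three documents are actually relevant to the query.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d5"]
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the four retrieved documents are relevant, and two of the three relevant documents were found, which shows how the two metrics can differ for the same result list.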
Course home page
http://www-db.disi.unibo.it/courses/dd/
Contents:
  News
  Copy of slides in PDF format
  Bibliography
  Useful links
  Assessment methods
  Exam sessions
Readings/Bibliography
Educational material provided by the teachers: copies of the slides used in the classroom and relevant scientific literature
Teaching methods
Lectures are held in "traditional" classrooms and make use of slides.
Several use cases will be presented to show how these information technologies can be profitably applied in a number of real applications.
Assessment methods
The exam consists of an oral examination.
To take the exam, interested students must register through the usual UniBO web application, AlmaEsami.
Examination sessions
Six examination sessions per year, at the request of the students