Deep Learning for Broadcast Videos and Multimedia

1 Deep Learning for Broadcast Videos and Multimedia
Lorenzo Baraldi, University of Modena and Reggio Emilia

2 Deep Learning
State of the art in image classification, object detection, semantic object segmentation and action recognition.
It is general: it can be applied to images, videos and multimodal data.
At Imagelab:
  Città Educante project: develop and test new DL algorithms for Temporal Video Segmentation and Concept Detection, with C. Grana and R. Cucchiara.
  Two hardware grants for Deep Learning projects:
    NVIDIA Hardware Grant, with the donation of one Tesla K40 GPU.
    Italian Supercomputing Resource Allocation (ISCRA) Grant from CINECA, which gives access to the Galileo HPC Platform.

3 Città Educante
Partners: Almaviva SpA, RAI, UniTN, UNIMORE, Reggio Children, CNR, ATI Città Educante (13 SMEs)
Diagram components: exo Platform (Almaviva), RAI server, RAI metadata, Neuralstory, Web interface (ATI)
OR 3.2: Knowledge Extraction
  Video annotation
  Temporal Video Segmentation
  Deep learning engine (UNIMORE)

4 Broadcast videos
Audio (speech to text): words, then sentences, with increasing level of abstraction.
Visual (shot detection): frames, then shots, with increasing level of abstraction.
Annotation of basic units (shots and sentences) is a necessary step for dividing a video into complex segments, like storyboards.
Example transcript sentences:
  During Antarctica winter, emperor penguins endure four months of darkness.
  In the Arctic, polar bear cubs take their first steps into a world of rapidly thawing ice.
  In northern Canada, 3 million caribou complete an overland migration.
  The forests of eastern Russia are home to the Amur leopard.
  In the tropics, the jungle that covers 3% of the planet's surface supports 50% of its species.

5 Video story detection
Group adjacent shots according to semantic coherence.
Some stories cannot be identified with visual features.
Others can be identified with visual features only.
Hence the need for multi-modal features!
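The idea of grouping adjacent shots by coherence can be illustrated with a minimal sketch. This is not the method presented in the talk (which relies on a deep multi-modal architecture); it is a simple greedy baseline, and the feature vectors, the `threshold` value and the function names are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def group_adjacent_shots(features, threshold=0.8):
    """Greedily merge consecutive shots whose features are similar.

    features: one vector per shot, in temporal order (hypothetical
    multi-modal descriptors). Returns a list of (first, last) shot
    index pairs, one per story.
    """
    if not features:
        return []
    stories = []
    start = 0
    for i in range(1, len(features)):
        # Open a new story when coherence with the previous shot drops.
        if cosine(features[i - 1], features[i]) < threshold:
            stories.append((start, i - 1))
            start = i
    stories.append((start, len(features) - 1))
    return stories


# Toy descriptors: two shots about one topic, three about another.
shots = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.1, 1.0]]
print(group_adjacent_shots(shots))  # [(0, 1), (2, 4)]
```

Because only neighbouring shots are compared, every story is a contiguous run of shots, which matches the "adjacent shots" constraint on the slide.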

6 Perceptual multi-modal features
Visual appearance:
  1000 visual concepts (ILSVRC-12: 1.2 million images)
  205 scenes (Places: 2.5 million images)
Short-term audio spectrum features
POS tagger
Quantity of speech
Figure axis: Time
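The slide does not define "quantity of speech". One plausible reading, used here purely as an assumption, is the fraction of each shot's duration during which someone is speaking, computed from word timings in the speech-to-text output. The function name and the input layout are hypothetical.

```python
def speech_quantity(shots, words):
    """Fraction of each shot's duration covered by spoken words.

    shots: list of (start, end) times in seconds.
    words: list of (start, end) times of transcript words, assumed
           not to overlap each other.
    """
    result = []
    for s_start, s_end in shots:
        covered = 0.0
        for w_start, w_end in words:
            # Length of the overlap between this word and the shot.
            overlap = min(s_end, w_end) - max(s_start, w_start)
            if overlap > 0:
                covered += overlap
        duration = s_end - s_start
        result.append(covered / duration if duration > 0 else 0.0)
    return result


shots = [(0.0, 4.0), (4.0, 10.0)]
words = [(0.0, 1.0), (1.5, 2.5), (3.5, 4.5), (8.0, 9.0)]
print(speech_quantity(shots, words))  # [0.625, 0.25]
```

A word that straddles a shot boundary, like the third one above, contributes to both shots in proportion to its overlap.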

7 Semantic multi-modal features
Textual semantic: cluster words in the transcript using a Word2Vec embedding space, in which words with similar semantics lie close together. This gives a textual concept space.
Example transcript sentence: "The deciduous forests of America begin to shut down, losing their leaves in preparation for the dark cold months ahead."
Visual semantic: a Visual Word2Vec, in which words in the transcript are visually confirmed using the entire ImageNet dataset ( categories). This gives a visual concept space.
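A minimal sketch of the textual side follows, assuming the clusters have already been computed. The two-dimensional `EMBEDDINGS` and the `CENTROIDS` are toy values invented for illustration; real Word2Vec vectors have hundreds of dimensions and the centroids would come from a clustering step over the vocabulary. Each transcript word is assigned to its nearest cluster, and the shot is described by the normalised histogram of cluster counts.

```python
import math

# Toy 2-D "embeddings" (hypothetical values, not real Word2Vec vectors).
EMBEDDINGS = {
    "forest": [0.9, 0.1],
    "leaves": [0.8, 0.2],
    "tree": [1.0, 0.0],
    "cold": [0.1, 0.9],
    "winter": [0.0, 1.0],
    "dark": [0.2, 0.8],
}
# Hypothetical cluster centres, e.g. the output of k-means.
CENTROIDS = [[1.0, 0.0], [0.0, 1.0]]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def concept_histogram(words, embeddings=EMBEDDINGS, centroids=CENTROIDS):
    """Normalised count of transcript words per semantic cluster."""
    counts = [0] * len(centroids)
    for word in words:
        vec = embeddings.get(word.lower())
        if vec is None:
            continue  # out-of-vocabulary words are skipped
        nearest = max(range(len(centroids)),
                      key=lambda k: cosine(vec, centroids[k]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(centroids)


print(concept_histogram(["forest", "leaves", "tree", "cold"]))  # [0.75, 0.25]
```

Shots whose transcripts share a topic end up with similar histograms, even when they use different words.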

8 A Deep Multi-modal architecture

9 Retrieval: merging semantics and aesthetics
Scene-based: retrieve parts of videos instead of whole videos.
Semantic: the thumbnail should represent the query.
Aesthetic: the thumbnail should be aesthetically pleasing.
Low- and high-level activations from a CNN + max-margin linear ranking.
Less data, no DNN training!
L. Baraldi, C. Grana, R. Cucchiara, "Scene-driven Retrieval in Edited Videos using Aesthetic and Semantic Deep Features", ICMR 2016, New York.
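The phrase "max-margin linear ranking" can be illustrated with a small pairwise hinge-loss ranker. This is a generic sketch, not the authors' implementation: in the talk the inputs would be CNN activations, whereas here they are toy two-dimensional vectors, and the hyperparameters (`epochs`, `lr`, `reg`) are arbitrary assumptions.

```python
def score(w, x):
    """Linear ranking score of feature vector x under weights w."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_ranker(pairs, dim, epochs=100, lr=0.1, reg=0.01):
    """Learn w so that score(better) exceeds score(worse) by a margin.

    pairs: list of (better, worse) feature vectors.
    Minimises reg/2 * |w|^2 + max(0, 1 - w.(better - worse)) per pair
    with plain stochastic gradient descent.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = score(w, diff)
            for i in range(dim):
                grad = reg * w[i]
                if margin < 1.0:
                    # Hinge term is active: push the pair further apart.
                    grad -= diff[i]
                w[i] -= lr * grad
    return w


# Toy thumbnails: the first feature correlates with being preferred.
pairs = [
    ([1.0, 0.2], [0.2, 0.3]),
    ([0.9, 0.8], [0.1, 0.9]),
    ([0.7, 0.5], [0.3, 0.4]),
]
w = train_ranker(pairs, dim=2)
print(all(score(w, b) > score(w, c) for b, c in pairs))
```

Since only a linear model is trained on top of fixed features, little labelled data is needed and the network itself is left untouched, which is consistent with the slide's "Less data, no DNN training!" remark.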

10 Retrieval: merging semantics and aesthetics
Query: penguin and calf
Same video, different scenes
Query: ant and spider
Same scene, different thumbnails

11 Evaluation
Synthetic: YFCC100M-Stories, built using YFCC100M videos.
TV series: AllyMcBeal dataset for scene detection; first four episodes of the first season.
Documentaries: BBC Planet Earth; 11 episodes from a BBC educational TV series, 4900 shots and 670 segments.
Results table (values not preserved in this transcription): "Our method (Deep)" versus "State of the art" on YFCC100M-Stories, AllyMcBeal and BBC Planet Earth.

12 Visualization Automatically generated stories can be visualized in a timeline fashion. Visual concepts enhance navigation and search inside the archive.

13 Visualization Video re-use!

14 Thank you
Any questions?

A Video Library System using Scene Detection and Automatic Tagging

A Video Library System using Scene Detection and Automatic Tagging A Video Library System using Scene Detection and Automatic Tagging Lorenzo Baraldi, Costantino Grana, Rita Cucchiara Dipartimento di Ingegneria Enzo Ferrari Università degli Studi di Modena e Reggio Emilia

More information

CultMEDIA Machine learning-based services for harvesting multimedia documents to support low-cost video post-production and cross-media storytelling

CultMEDIA Machine learning-based services for harvesting multimedia documents to support low-cost video post-production and cross-media storytelling CultMEDIA Machine learning-based services for harvesting multimedia documents to support low-cost video post-production and cross-media storytelling Italian Cluster TICHE Technologies for Cultural Heritage

More information

Lorenzo Baraldi Curriculum Vitae

Lorenzo Baraldi Curriculum Vitae Lorenzo Baraldi Curriculum Vitae lorenzo.baraldi@unimore.it www.lorenzobaraldi.com [11, 13, 33][26, 27, 29][8][19, 20, 24] 2018 - on going Education and Expertise Postdoctoral Fellow, AImageLab, University

More information

Beyond detection: GANs and LSTMs to pay attention at human presence

Beyond detection: GANs and LSTMs to pay attention at human presence Talk @Munich October 11, 2017 Beyond detection: GANs and LSTMs to pay attention at human presence Rita Cucchiara Imagelab, Dipartimento di Ingegneria «Enzo Ferrari» University of Modena e Reggio Emilia,

More information

Shot, scene and keyframe ordering for interactive video re-use

Shot, scene and keyframe ordering for interactive video re-use Shot, scene and keyframe ordering for interactive video re-use Lorenzo Baraldi 1, Costantino Grana 1, Guido Borghi 1, Roberto Vezzani 1, Rita Cucchiara 1 1 Dipartimento di Ingegneria Enzo Ferrari, Università

More information

Shifting from Naming to Describing: Semantic Attribute Models. Rogerio Feris, June 2014

Shifting from Naming to Describing: Semantic Attribute Models. Rogerio Feris, June 2014 Shifting from Naming to Describing: Semantic Attribute Models Rogerio Feris, June 2014 Recap Large-Scale Semantic Modeling Feature Coding and Pooling Low-Level Feature Extraction Training Data Slide credit:

More information

Lecture Video Indexing and Retrieval Using Topic Keywords

Lecture Video Indexing and Retrieval Using Topic Keywords Lecture Video Indexing and Retrieval Using Topic Keywords B. J. Sandesh, Saurabha Jirgi, S. Vidya, Prakash Eljer, Gowri Srinivasa International Science Index, Computer and Information Engineering waset.org/publication/10007915

More information

CIMWOS: A MULTIMEDIA ARCHIVING AND INDEXING SYSTEM

CIMWOS: A MULTIMEDIA ARCHIVING AND INDEXING SYSTEM CIMWOS: A MULTIMEDIA ARCHIVING AND INDEXING SYSTEM Nick Hatzigeorgiu, Nikolaos Sidiropoulos and Harris Papageorgiu Institute for Language and Speech Processing Epidavrou & Artemidos 6, 151 25 Maroussi,

More information

Faceted Navigation for Browsing Large Video Collection

Faceted Navigation for Browsing Large Video Collection Faceted Navigation for Browsing Large Video Collection Zhenxing Zhang, Wei Li, Cathal Gurrin, Alan F. Smeaton Insight Centre for Data Analytics School of Computing, Dublin City University Glasnevin, Co.

More information

Lecture 12: Video Representation, Summarisation, and Query

Lecture 12: Video Representation, Summarisation, and Query Lecture 12: Video Representation, Summarisation, and Query Dr Jing Chen NICTA & CSE UNSW CS9519 Multimedia Systems S2 2006 jchen@cse.unsw.edu.au Last week Structure of video Frame Shot Scene Story Why

More information

BMVA symposium on "Security and surveillance: performance evaluation

BMVA symposium on Security and surveillance: performance evaluation BMVA symposium on "Security and surveillance: performance evaluation Visor Video Surveillance Online Repository Roberto o Vezzani and Rita Cucchiara a Imagelab Information Engineering Department University

More information

Deep Learning in Pulmonary Image Analysis with Incomplete Training Samples

Deep Learning in Pulmonary Image Analysis with Incomplete Training Samples Deep Learning in Pulmonary Image Analysis with Incomplete Training Samples Ziyue Xu, Staff Scientist, National Institutes of Health Nov. 2nd, 2017 (GTC DC Talk DC7137) Image Analysis Arguably the most

More information

MESH. Multimedia Semantic Syndication for Enhanced News Services. Project Overview

MESH. Multimedia Semantic Syndication for Enhanced News Services. Project Overview MESH Multimedia Semantic Syndication for Enhanced News Services Project Overview Presentation Structure 2 Project Summary Project Motivation Problem Description Work Description Expected Result The MESH

More information

Class 5: Attributes and Semantic Features

Class 5: Attributes and Semantic Features Class 5: Attributes and Semantic Features Rogerio Feris, Feb 21, 2013 EECS 6890 Topics in Information Processing Spring 2013, Columbia University http://rogerioferis.com/visualrecognitionandsearch Project

More information

and Creativity on Public Service: A Brave New World

and Creativity on Public Service: A Brave New World Session 5 Innovation and Creativity on Public Service: A Brave New World G. Alberico RAI (Italy) INNOVATION & CREATIVITY Digital switch over introduces more channels More content items produced/published/archived

More information

Photo-realistic Renderings for Machines Seong-heum Kim

Photo-realistic Renderings for Machines Seong-heum Kim Photo-realistic Renderings for Machines 20105034 Seong-heum Kim CS580 Student Presentations 2016.04.28 Photo-realistic Renderings for Machines Scene radiances Model descriptions (Light, Shape, Material,

More information

MemoryBox Reconstructing and Presenting Memories for Persons with Memory Loss

MemoryBox Reconstructing and Presenting Memories for Persons with Memory Loss MemoryBox Reconstructing and Presenting Memories for Persons with Memory Loss Design Document (revision 1) 10/16/09 Alex Day Tommy Garcia Jeff Rzeszotarski Ezra Velazquez Advisor: Amy Csizmar Dalal Introduction

More information

Architectures for Scalable Media Object Search

Architectures for Scalable Media Object Search Architectures for Scalable Media Object Search Dennis Sng Deputy Director & Principal Scientist NVIDIA GPU Technology Workshop 10 July 2014 ROSE LAB OVERVIEW 2 Large Database of Media Objects Next- Generation

More information

Associating video frames with text

Associating video frames with text Associating video frames with text Pinar Duygulu and Howard Wactlar Informedia Project School of Computer Science University Informedia Digital Video Understanding Project IDVL interface returned for "El

More information

Online Open World Face Recognition From Video Streams

Online Open World Face Recognition From Video Streams IARPA JANUS Online Open World Face Recognition From Video Streams ID:23202 Federico Pernici, Federico Bartoli, Matteo Bruni and Alberto Del Bimbo MICC - University of Florence - Italy http://www.micc.unifi.it

More information

Semantic Video Indexing

Semantic Video Indexing Semantic Video Indexing T-61.6030 Multimedia Retrieval Stevan Keraudy stevan.keraudy@tkk.fi Helsinki University of Technology March 14, 2008 What is it? Query by keyword or tag is common Semantic Video

More information

DEEP LEARNING WITH GPUS Maxim Milakov, Senior HPC DevTech Engineer, NVIDIA

DEEP LEARNING WITH GPUS Maxim Milakov, Senior HPC DevTech Engineer, NVIDIA DEEP LEARNING WITH GPUS Maxim Milakov, Senior HPC DevTech Engineer, NVIDIA TOPICS COVERED Convolutional Networks Deep Learning Use Cases GPUs cudnn 2 MACHINE LEARNING! Training! Train the model from supervised

More information

Lesson 11. Media Retrieval. Information Retrieval. Image Retrieval. Video Retrieval. Audio Retrieval

Lesson 11. Media Retrieval. Information Retrieval. Image Retrieval. Video Retrieval. Audio Retrieval Lesson 11 Media Retrieval Information Retrieval Image Retrieval Video Retrieval Audio Retrieval Information Retrieval Retrieval = Query + Search Informational Retrieval: Get required information from database/web

More information

Annotation Graphs, Annotation Servers and Multi-Modal Resources

Annotation Graphs, Annotation Servers and Multi-Modal Resources Annotation Graphs, Annotation Servers and Multi-Modal Resources Infrastructure for Interdisciplinary Education, Research and Development Christopher Cieri and Steven Bird University of Pennsylvania Linguistic

More information

Grounded Compositional Semantics for Finding and Describing Images with Sentences

Grounded Compositional Semantics for Finding and Describing Images with Sentences Grounded Compositional Semantics for Finding and Describing Images with Sentences R. Socher, A. Karpathy, V. Le,D. Manning, A Y. Ng - 2013 Ali Gharaee 1 Alireza Keshavarzi 2 1 Department of Computational

More information

Introduzione alle Biblioteche Digitali Audio/Video

Introduzione alle Biblioteche Digitali Audio/Video Introduzione alle Biblioteche Digitali Audio/Video Biblioteche Digitali 1 Gestione del video Perchè è importante poter gestire biblioteche digitali di audiovisivi Caratteristiche specifiche dell audio/video

More information

What was Monet seeing while painting? Translating artworks to photo-realistic images M. Tomei, L. Baraldi, M. Cornia, R. Cucchiara

What was Monet seeing while painting? Translating artworks to photo-realistic images M. Tomei, L. Baraldi, M. Cornia, R. Cucchiara What was Monet seeing while painting? Translating artworks to photo-realistic images M. Tomei, L. Baraldi, M. Cornia, R. Cucchiara COMPUTER VISION IN THE ARTISTIC DOMAIN The effectiveness of Computer Vision

More information

Deep Learning Based Semantic Video Indexing and Retrieval

Deep Learning Based Semantic Video Indexing and Retrieval Deep Learning Based Semantic Video Indexing and Retrieval Anna Podlesnaya, Sergey Podlesnyy Cinema and Photo Research Institute (NIKFI) This work was funded by Russian Federation Ministry of Culture Contract

More information

Optimal Video Adaptation and Skimming Using a Utility-Based Framework

Optimal Video Adaptation and Skimming Using a Utility-Based Framework Optimal Video Adaptation and Skimming Using a Utility-Based Framework Shih-Fu Chang Digital Video and Multimedia Lab ADVENT University-Industry Consortium Columbia University Sept. 9th 2002 http://www.ee.columbia.edu/dvmm

More information

Alberto Messina, Maurizio Montagnuolo

Alberto Messina, Maurizio Montagnuolo A Generalised Cross-Modal Clustering Method Applied to Multimedia News Semantic Indexing and Retrieval Alberto Messina, Maurizio Montagnuolo RAI Centre for Research and Technological Innovation Madrid,

More information

Multimedia Databases. Wolf-Tilo Balke Younès Ghammad Institut für Informationssysteme Technische Universität Braunschweig

Multimedia Databases. Wolf-Tilo Balke Younès Ghammad Institut für Informationssysteme Technische Universität Braunschweig Multimedia Databases Wolf-Tilo Balke Younès Ghammad Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de Previous Lecture Audio Retrieval - Query by Humming

More information

Enhancing applications with Cognitive APIs IBM Corporation

Enhancing applications with Cognitive APIs IBM Corporation Enhancing applications with Cognitive APIs After you complete this section, you should understand: The Watson Developer Cloud offerings and APIs The benefits of commonly used Cognitive services 2 Watson

More information

Visual Information Retrieval: The Next Frontier in Search

Visual Information Retrieval: The Next Frontier in Search Visual Information Retrieval: The Next Frontier in Search Ramesh Jain Abstract: The first ten years of search techniques for WWW have been concerned with text documents. The nature of data on WWW and in

More information

Columbia University High-Level Feature Detection: Parts-based Concept Detectors

Columbia University High-Level Feature Detection: Parts-based Concept Detectors TRECVID 2005 Workshop Columbia University High-Level Feature Detection: Parts-based Concept Detectors Dong-Qing Zhang, Shih-Fu Chang, Winston Hsu, Lexin Xie, Eric Zavesky Digital Video and Multimedia Lab

More information

Entity-centric Topic Extraction and Exploration: A Network-based Approach

Entity-centric Topic Extraction and Exploration: A Network-based Approach Entity-centric Topic Extraction and Exploration: A Network-based Approach Andreas Spitz and Michael Gertz March 27, 2018 ECIR 2018, Grenoble Heidelberg University, Germany Database Systems Research Group

More information

Video search requires efficient annotation of video content To some extent this can be done automatically

Video search requires efficient annotation of video content To some extent this can be done automatically VIDEO ANNOTATION Market Trends Broadband doubling over next 3-5 years Video enabled devices are emerging rapidly Emergence of mass internet audience Mainstream media moving to the Web What do we search

More information

DIGITS DEEP LEARNING GPU TRAINING SYSTEM

DIGITS DEEP LEARNING GPU TRAINING SYSTEM DIGITS DEEP LEARNING GPU TRAINING SYSTEM AGENDA 1 Introduction to Deep Learning 2 What is DIGITS 3 How to use DIGITS Practical DEEP LEARNING Examples Image Classification, Object Detection, Localization,

More information

IMOTION. Heiko Schuldt, University of Basel, Switzerland

IMOTION. Heiko Schuldt, University of Basel, Switzerland IMOTION Heiko Schuldt, University of Basel, Switzerland heiko.schuldt@unibas.ch IMOTION at a Glance Project Title Intelligent Multimodal Augmented Video Motion Retrieval System (IMOTION) Project Start

More information

DEEP LEARNING AND DIGITS DEEP LEARNING GPU TRAINING SYSTEM

DEEP LEARNING AND DIGITS DEEP LEARNING GPU TRAINING SYSTEM DEEP LEARNING AND DIGITS DEEP LEARNING GPU TRAINING SYSTEM AGENDA 1 Introduction to Deep Learning 2 What is DIGITS 3 How to use DIGITS Practical DEEP LEARNING Examples Image Classification, Object Detection,

More information

Extending the Facets concept by applying NLP tools to catalog records of scientific literature

Extending the Facets concept by applying NLP tools to catalog records of scientific literature Extending the Facets concept by applying NLP tools to catalog records of scientific literature *E. Picchi, *M. Sassi, **S. Biagioni, **S. Giannini *Institute of Computational Linguistics **Institute of

More information

Multimodal Learning. Victoria Dean. MIT 6.S191 Intro to Deep Learning IAP 2017

Multimodal Learning. Victoria Dean. MIT 6.S191 Intro to Deep Learning IAP 2017 Multimodal Learning Victoria Dean Talk outline What is multimodal learning and what are the challenges? Flickr example: joint learning of images and tags Image captioning: generating sentences from images

More information

Enriching Perspectives in Exploring Cultural Heritage Documentaries Using Informedia Technologies

Enriching Perspectives in Exploring Cultural Heritage Documentaries Using Informedia Technologies Enriching Perspectives in Exploring Cultural Heritage Documentaries Using Informedia Technologies Tobun Dorbin Ng, Howard D. Wactlar School of Computer Science Carnegie Mellon University Pittsburgh, PA

More information

Department of Computer Science & Engineering. The Chinese University of Hong Kong Final Year Project LYU0102

Department of Computer Science & Engineering. The Chinese University of Hong Kong Final Year Project LYU0102 Department of Computer Science & Engineering The Chinese University of Hong Kong LYU0102 Supervised by Prof. LYU, Rung Tsong Michael Group Members: Chan Pik Wah Ngai Cheuk Han Prepared by Chan Pik Wah

More information

CIMWOS: A MULTIMEDIA, MULTIMODAL AND MULTILINGUAL INDEXING AND RETRIEVAL SYSTEM

CIMWOS: A MULTIMEDIA, MULTIMODAL AND MULTILINGUAL INDEXING AND RETRIEVAL SYSTEM Ebroul Izquierdo, editor. Digital Media Processing for Multimedia Interactive Services. Proceedings of the 4 th European Workshop on Image Analysis for Multimedia Interactive Services. World Scientific

More information

Multimedia Information Retrieval at ORL

Multimedia Information Retrieval at ORL Multimedia Information Retrieval at ORL Dr Kenneth R Wood ORL 24a Trumpington Street Cambridge CB2 1QA ENGLAND krw@orl.co.uk http://www.orl.co.uk/ 1 ORL Funding & IPR from 1986 1990-1992 ongoing from 1996

More information

International Journal of Advanced Networking & Applications (IJANA) ISSN:

International Journal of Advanced Networking & Applications (IJANA) ISSN: Integration of Visual Temporal and Textual Distribution Information for News Video Mining Prof Shivamurthy R C, Tauseef Ahmed S S Department of Computer Science, Akshaya Institute of Technology, Tumkur

More information

VERGE IN VBS Thessaloniki, Greece {moumtzid, andreadisst, markatopoulou, dgalanop, heliasgj, stefanos, bmezaris,

VERGE IN VBS Thessaloniki, Greece {moumtzid, andreadisst, markatopoulou, dgalanop, heliasgj, stefanos, bmezaris, VERGE IN VBS 2018 Anastasia Moumtzidou 1, Stelios Andreadis 1 Foteini Markatopoulou 1,2, Damianos Galanopoulos 1, Ilias Gialampoukidis 1, Stefanos Vrochidis 1, Vasileios Mezaris 1, Ioannis Kompatsiaris

More information

Hello, I am from the State University of Library Studies and Information Technologies, Bulgaria

Hello, I am from the State University of Library Studies and Information Technologies, Bulgaria Hello, My name is Svetla Boytcheva, I am from the State University of Library Studies and Information Technologies, Bulgaria I am goingto present you work in progress for a research project aiming development

More information

A Hybrid Approach to News Video Classification with Multi-modal Features

A Hybrid Approach to News Video Classification with Multi-modal Features A Hybrid Approach to News Video Classification with Multi-modal Features Peng Wang, Rui Cai and Shi-Qiang Yang Department of Computer Science and Technology, Tsinghua University, Beijing 00084, China Email:

More information

How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics. Jan Neumann Comcast Labs DC May 10th, 2017

How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics. Jan Neumann Comcast Labs DC May 10th, 2017 How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics Jan Neumann Comcast Labs DC May 10th, 2017 Comcast Applied Artificial Intelligence Lab Media & Video Analytics Smart TV Deep Learning

More information

Information Extraction from News Video using Global Rule Induction Technique

Information Extraction from News Video using Global Rule Induction Technique Information Extraction from News Video using Global Rule Induction Technique Lekha Chaisorn and 2 Tat-Seng Chua Media Semantics Department, Media Division, Institute for Infocomm Research (I 2 R), Singapore

More information

Defense Data Generation in Distributed Deep Learning System Se-Yoon Oh / ADD-IDAR

Defense Data Generation in Distributed Deep Learning System Se-Yoon Oh / ADD-IDAR Defense Data Generation in Distributed Deep Learning System Se-Yoon Oh / 2017. 10. 31 syoh@add.re.kr Page 1/36 Overview 1. Introduction 2. Data Generation Synthesis 3. Distributed Deep Learning 4. Conclusions

More information

DL User Interfaces. Giuseppe Santucci Dipartimento di Informatica e Sistemistica Università di Roma La Sapienza

DL User Interfaces. Giuseppe Santucci Dipartimento di Informatica e Sistemistica Università di Roma La Sapienza DL User Interfaces Giuseppe Santucci Dipartimento di Informatica e Sistemistica Università di Roma La Sapienza Delos work on DL interfaces Delos Cluster 4: User interfaces and visualization Cluster s goals:

More information

Multimedia Databases. 9 Video Retrieval. 9.1 Hidden Markov Model. 9.1 Hidden Markov Model. 9.1 Evaluation. 9.1 HMM Example 12/18/2009

Multimedia Databases. 9 Video Retrieval. 9.1 Hidden Markov Model. 9.1 Hidden Markov Model. 9.1 Evaluation. 9.1 HMM Example 12/18/2009 9 Video Retrieval Multimedia Databases 9 Video Retrieval 9.1 Hidden Markov Models (continued from last lecture) 9.2 Introduction into Video Retrieval Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme

More information

Workshop W14 - Audio Gets Smart: Semantic Audio Analysis & Metadata Standards

Workshop W14 - Audio Gets Smart: Semantic Audio Analysis & Metadata Standards Workshop W14 - Audio Gets Smart: Semantic Audio Analysis & Metadata Standards Jürgen Herre for Integrated Circuits (FhG-IIS) Erlangen, Germany Jürgen Herre, hrr@iis.fhg.de Page 1 Overview Extracting meaning

More information

Fraunhofer IAIS Audio Mining Solution for Broadcast Archiving. Dr. Joachim Köhler LT-Innovate Brussels

Fraunhofer IAIS Audio Mining Solution for Broadcast Archiving. Dr. Joachim Köhler LT-Innovate Brussels Fraunhofer IAIS Audio Mining Solution for Broadcast Archiving Dr. Joachim Köhler LT-Innovate Brussels 22.11.2016 1 Outline Speech Technology in the Broadcast World Deep Learning Speech Technologies Fraunhofer

More information

EC-TEL Community Hypermedia in Collaborative Marc Spaniol. and Self-reflective E-learning Applications. Marc Spaniol

EC-TEL Community Hypermedia in Collaborative Marc Spaniol. and Self-reflective E-learning Applications. Marc Spaniol First European Conference on Technology Enhanced Learning Community Hypermedia in Collaborative and Self-reflective E-learning Applications Hersonissou, Greece, 2 nd of October 2006 I5-Spa1006-1/12 Agenda

More information

The David Attenborough Essential Collection CAMPAIGN DETAILS

The David Attenborough Essential Collection CAMPAIGN DETAILS what WE KNOW Documentaries are relevant: In the past 3 months, 70% of parents with kids under 10 watched documentaries. 92% of parents agree it is important for kids to know about nature and animals. Sir

More information

Overview of ImageCLEF Mauricio Villegas (on behalf of all organisers)

Overview of ImageCLEF Mauricio Villegas (on behalf of all organisers) Overview of ImageCLEF 2016 Mauricio Villegas (on behalf of all organisers) ImageCLEF history Started in 2003 with a photo retrieval task 4 participants submitting results In 2009 we had 6 tasks and 65

More information

Two-Stream Convolutional Networks for Action Recognition in Videos

Two-Stream Convolutional Networks for Action Recognition in Videos Two-Stream Convolutional Networks for Action Recognition in Videos Karen Simonyan Andrew Zisserman Cemil Zalluhoğlu Introduction Aim Extend deep Convolution Networks to action recognition in video. Motivation

More information

The Stanford/Technicolor/Fraunhofer HHI Video Semantic Indexing System

The Stanford/Technicolor/Fraunhofer HHI Video Semantic Indexing System The Stanford/Technicolor/Fraunhofer HHI Video Semantic Indexing System Our first participation on the TRECVID workshop A. F. de Araujo 1, F. Silveira 2, H. Lakshman 3, J. Zepeda 2, A. Sheth 2, P. Perez

More information

SES 123 Global and Regional Energy Lab Worksheet

SES 123 Global and Regional Energy Lab Worksheet SES 123 Global and Regional Energy Lab Worksheet Introduction An important aspect to understand about our planet is global temperatures, including spatial variations, such as between oceans and continents

More information

Nuxeo Platform 5.5. DAM Module. User Guide

Nuxeo Platform 5.5. DAM Module. User Guide Nuxeo Platform 5.5 DAM Module User Guide Table of Contents 1. Digital Asset Management User Guide........................................................................... 3 1.1 Digital Asset Management

More information

A Digital Library Framework for Reusing e-learning Video Documents

A Digital Library Framework for Reusing e-learning Video Documents A Digital Library Framework for Reusing e-learning Video Documents Paolo Bolettieri, Fabrizio Falchi, Claudio Gennaro, and Fausto Rabitti ISTI-CNR, via G. Moruzzi 1, 56124 Pisa, Italy paolo.bolettieri,fabrizio.falchi,claudio.gennaro,

More information

USING DEEP LEARNING TECHNOLOGIES TO PROVIDE DESCRIPTIVE METADATA FOR LIVE VIDEO CONTENTS

USING DEEP LEARNING TECHNOLOGIES TO PROVIDE DESCRIPTIVE METADATA FOR LIVE VIDEO CONTENTS USING DEEP LEARNING TECHNOLOGIES TO PROVIDE DESCRIPTIVE METADATA FOR LIVE VIDEO CONTENTS A. Maraga, V. Zamboni Metaliquid, Italy ABSTRACT As the consumption of video contents rises and consumers behaviour

More information

Learning Semantic Video Captioning using Data Generated with Grand Theft Auto

Learning Semantic Video Captioning using Data Generated with Grand Theft Auto A dark car is turning left on an exit Learning Semantic Video Captioning using Data Generated with Grand Theft Auto Alex Polis Polichroniadis Data Scientist, MSc Kolia Sadeghi Applied Mathematician, PhD

More information

Deep Learning with Tensorflow AlexNet

Deep Learning with Tensorflow   AlexNet Machine Learning and Computer Vision Group Deep Learning with Tensorflow http://cvml.ist.ac.at/courses/dlwt_w17/ AlexNet Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton, "Imagenet classification

More information

Available online at ScienceDirect. Procedia Computer Science 87 (2016 ) 12 17

Available online at  ScienceDirect. Procedia Computer Science 87 (2016 ) 12 17 Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 87 (2016 ) 12 17 4th International Conference on Recent Trends in Computer Science & Engineering Segment Based Indexing

More information

The IEE International Symposium on IMAGING FOR CRIME DETECTION AND PREVENTION, Savoy Place, London, UK 7-8 June 2005

The IEE International Symposium on IMAGING FOR CRIME DETECTION AND PREVENTION, Savoy Place, London, UK 7-8 June 2005 Ambient Intelligence for Security in Public Parks: the LAICA Project Rita Cucchiara, Andrea Prati, Roberto Vezzani Dipartimento di Ingegneria dell Informazione University of Modena and Reggio Emilia, Italy

More information

The ToCAI Description Scheme for Indexing and Retrieval of Multimedia Documents 1

The ToCAI Description Scheme for Indexing and Retrieval of Multimedia Documents 1 The ToCAI Description Scheme for Indexing and Retrieval of Multimedia Documents 1 N. Adami, A. Bugatti, A. Corghi, R. Leonardi, P. Migliorati, Lorenzo A. Rossi, C. Saraceno 2 Department of Electronics

More information

A NEW COMPUTING ERA JENSEN HUANG, FOUNDER & CEO GTC CHINA 2017

A NEW COMPUTING ERA JENSEN HUANG, FOUNDER & CEO GTC CHINA 2017 A NEW COMPUTING ERA JENSEN HUANG, FOUNDER & CEO GTC CHINA 2017 TWO FORCES DRIVING THE FUTURE OF COMPUTING 10 7 Transistors (thousands) 10 6 10 5 1.1X per year 10 4 10 3 10 2 1.5X per year Single-threaded

More information

Overview of the medical task of ImageCLEF Alba G. Seco de Herrera Stefano Bromuri Roger Schaer Henning Müller

Overview of the medical task of ImageCLEF Alba G. Seco de Herrera Stefano Bromuri Roger Schaer Henning Müller Overview of the medical task of ImageCLEF 2016 Alba G. Seco de Herrera Stefano Bromuri Roger Schaer Henning Müller Tasks in ImageCLEF 2016 Automatic image annotation Medical image classification Sub-tasks

More information

Browsing News and TAlk Video on a Consumer Electronics Platform Using face Detection

Browsing News and TAlk Video on a Consumer Electronics Platform Using face Detection MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Browsing News and TAlk Video on a Consumer Electronics Platform Using face Detection Kadir A. Peker, Ajay Divakaran, Tom Lanning TR2005-155

More information

Transfer Learning. Style Transfer in Deep Learning

Transfer Learning. Style Transfer in Deep Learning Transfer Learning & Style Transfer in Deep Learning 4-DEC-2016 Gal Barzilai, Ram Machlev Deep Learning Seminar School of Electrical Engineering Tel Aviv University Part 1: Transfer Learning in Deep Learning

More information

Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda

Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda 1 Observe novel applicability of DL techniques in Big Data Analytics. Applications of DL techniques for common Big Data Analytics problems. Semantic indexing

More information

Web-scale Multimedia Search for Internet Video Content

CARNEGIE MELLON UNIVERSITY Lu Jiang CMU-LTI-17-003 Language and Information Technologies School of Computer Science Carnegie Mellon University 5000

Overview MULTIMEDIA INFORMATION RETRIEVAL. Search Engines. Information Retrieval. Explanation. Van Rijsbergen

MULTIMEDIA INFORMATION RETRIEVAL Arjen P. de Vries arjen@acm.org Overview Information Retrieval Text Retrieval Multimedia Retrieval Recent Developments Research Topics Centrum voor Wiskunde en Informatica

Deep Learning for Remote Sensing

ENPC Data Science Week. Alexandre Boulch. ONERA: Research, Innovation, expertise and long-term vision for industry, French government and Europe. Materials Optics Aerodynamics

MultiMatch. D1.4 Functional Specification of the Second Prototype

Project no. 033104 MultiMatch Technology-enhanced Learning and Access to Cultural Heritage Instrument: Specific Targeted Research Project FP6-2005-IST-5

Generative Adversarial Text to Image Synthesis

Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee Presented by: Jingyao Zhan Contents Introduction Related Work Method

Interoperable Content-based Access of Multimedia in Digital Libraries

John R. Smith IBM T. J. Watson Research Center 30 Saw Mill River Road Hawthorne, NY 10532 USA ABSTRACT Recent academic and commercial

Visual Concept Detection and Linked Open Data at the TIB AV-Portal. Felix Saurbier, Matthias Springstein Hamburg, November 6 SWIB 2017

Agenda 1. TIB and TIB AV-Portal 2. Automated Video Analysis 3. Visual

Scalable Hierarchical Summarization of News Using Fidelity in MPEG-7 Description Scheme

Jung-Rim Kim, Seong Soo Chun, Seok-jin Oh, and Sanghoon Sull School of Electrical Engineering, Korea University,

Multi-modal Tag Localization for Mobile Video Search

Rui Zhang, Sheng Tang, Wu Liu, Yongdong Zhang, Jintao Li

WHAT YOU SEE IS (ALMOST) WHAT YOU HEAR: DESIGN PRINCIPLES FOR USER INTERFACES FOR ACCESSING SPEECH ARCHIVES

ISCA Archive http://www.isca-speech.org/archive 5th International Conference on Spoken Language Processing (ICSLP 98) Sydney, Australia November 30 - December 4, 1998

Exploiting noisy web data for large-scale visual recognition

Lamberto Ballan University of Padova, Italy CVPRW WebVision - Jul 26, 2017 Datasets drive computer vision progress ImageNet Slide credit: O.

CHIST-ERA Projects Seminar Topic IUI

Heiko Schuldt, Alexey Andrushevich, Laurence Devillers (based on slides from S. Dupont) Brussels, March 21-23, 2017 Introduction: Projects of the topic eglasses: The

Multimedia Event Detection for Large Scale Video. Benjamin Elizalde

Outline: Motivation, TrecVID task, Related work, Our approach (System, TF/IDF), Results & Processing time, Conclusion & Future work. Agenda

3D Shape Analysis with Multi-view Convolutional Networks. Evangelos Kalogerakis

3D model repositories [3D Warehouse - video] 3D geometry acquisition [KinectFusion - video] 3D shapes come in various flavors

Multimodal Information Spaces for Content-based Image Retrieval

Research Proposal. Abstract: Currently, image retrieval by content is a research problem of great interest in academia and the industry, due

Offering Access to Personalized Interactive Video

Giorgos Andreou, Phivos Mylonas, Manolis Wallace and Stefanos Kollias Image, Video and Multimedia Systems

Real-Time Content-Based Adaptive Streaming of Sports Videos

Shih-Fu Chang, Di Zhong, and Raj Kumar Digital Video and Multimedia Group ADVENT University/Industry Consortium Columbia University December

Apparel Classifier and Recommender using Deep Learning

Live Demo at: http://saurabhg.me/projects/tag-that-apparel Saurabh Gupta sag043@ucsd.edu Siddhartha Agarwal siagarwa@ucsd.edu Apoorve Dave a1dave@ucsd.edu

CONTENT BASED VIDEO RETRIEVAL SYSTEM

Madhav Gitte 1, Harshal Bawaskar 2, Sourabh Sethi 3, Ajinkya Shinde 4 1 B.E. Scholar, Department of Information Technology, Sinhgad College of Engineering Pune-41, University

Multi-modal Information Retrieval experiences from Context-Aware Image Management, CAIM

Joan Nordbotten Dept. of Information and Media Science University of Bergen, Norway Outline Multi-modal Information

UDC at the BBC. Alexander, Fran; Stickley, Kathryn; Buser, Vicky; Miller, Libby.

Item Type: Article. Citation: Alexander, Fran; Stickley, Kathryn; Buser, Vicky; Miller, Libby. UDC at the BBC. Extensions

ACCESSING VIDEO ARCHIVES USING INTERACTIVE SEARCH

M. Worring 1, G.P. Nguyen 1, L. Hollink 2, J.C. van Gemert 1, D.C. Koelma 1 1 Mediamill/University of Amsterdam worring@science.uva.nl, http://www.science.uva.nl/

DEEP NEURAL NETWORKS CHANGING THE AUTONOMOUS VEHICLE LANDSCAPE. Dennis Lui August 2017

THE RISE OF GPU COMPUTING [Chart labels: APPLICATIONS; GPU-Computing perf; 1.5X per year; 1000X by 2025; ALGORITHMS; 1.1X]

Integrating Visual and Textual Cues for Query-by-String Word Spotting

D. Aldavert, M. Rusiñol, R. Toledo and J. Lladós Computer Vision Center, Dept. Ciències de la Computació Edifici O, Univ. Autònoma de Barcelona, Bellaterra (Barcelona),

Deep Face Recognition. Nathan Sun

Why Facial Recognition? Picture ID or video tracking Higher Security for Facial Recognition Software Immensely useful to police in tracking suspects Your face will be an