Deep Learning for Broadcast Videos and Multimedia
- Joshua Todd
- 6 years ago
Transcription
1 Deep Learning for Broadcast Videos and Multimedia Lorenzo Baraldi University of Modena and Reggio Emilia
2 Deep Learning
- State of the art in image classification, object detection, semantic object segmentation and action recognition.
- It is general! It can be applied to images, videos and multimodal data.
At Imagelab:
- Città Educante project: develop and test new DL algorithms for Temporal Video Segmentation and Concept Detection, with C. Grana and R. Cucchiara.
- Two hardware grants for Deep Learning projects:
  - NVIDIA Hardware Grant, with the donation of one Tesla K40 GPU.
  - Italian Supercomputing Resource Allocation (ISCRA) Grant from CINECA, which gives access to the Galileo HPC Platform.
3 Città Educante
- Almaviva SpA, RAI, UniTN, UNIMORE, Reggio Children, CNR, ATI Città Educante (13 SMEs)
- exo Platform (Almaviva), RAI server, RAI metadata, Neuralstory, Web interface (ATI)
- OR 3.2: Knowledge Extraction: video annotation, Temporal Video Segmentation, Deep learning engine (UNIMORE)
4 Broadcast videos
- Audio: speech to text -> words -> sentences (increasing level of abstraction)
- Visual: shot detection -> frames -> shots (increasing level of abstraction)
Annotation of basic units (shots and sentences) is a necessary step for dividing a video into complex segments, like storyboards.
Example sentences:
- During Antarctica winter, emperor penguins endure four months of darkness
- In the Arctic, polar bear cubs take their first steps into a world of rapidly thawing ice
- In northern Canada, 3 million caribou complete an overland migration
- The forests of eastern Russia are home to the Amur leopard
- In the tropics, the jungle that covers 3% of the planet's surface supports 50% of its species
5 Video story detection
- Group adjacent shots according to semantic coherence.
- Some story boundaries cannot be identified with visual features; others can be identified with visual features only.
- Multi-modal features are needed!
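The grouping step named on this slide can be illustrated with a minimal sketch. It is not the method presented in the talk (which relies on a deep multi-modal architecture); it is a simple threshold baseline that assumes each shot is already described by a feature vector. The function names, the toy vectors and the 0.8 threshold are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_adjacent_shots(shot_features, threshold=0.8):
    """Split a shot sequence into stories (lists of consecutive shot indices).

    A new story starts whenever a shot's similarity to the previous shot
    falls below the (hypothetical) threshold.
    """
    if not shot_features:
        return []
    stories = [[0]]
    for i in range(1, len(shot_features)):
        if cosine(shot_features[i - 1], shot_features[i]) >= threshold:
            stories[-1].append(i)   # coherent with the previous shot
        else:
            stories.append([i])     # story boundary
    return stories


# Toy 3-D features for five shots (illustrative values only).
shots = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 0.1, 1.0],
]
stories = group_adjacent_shots(shots)
print(stories)
```

In this sketch the feature vector is where multi-modality would enter: concatenating visual, audio and transcript descriptors per shot lets the same comparison use all modalities at once.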
6 Perceptual multi-modal features
- Visual appearance: 1000 visual concepts (ILSVRC-12: 1.2 million images) and 205 scenes (Places: 2.5 million images)
- Short-term audio spectrum features
- POS tagger
- Quantity of speech
- Time
7 Semantic multi-modal features
- Textual semantic (textual concept space): cluster words in the transcript using a Word2Vec embedding space, where words with similar semantics lie close together.
  Example: "The deciduous forests of America begin to shut down, losing their leaves in preparation for the dark cold months ahead."
- Visual semantic (visual concept space): a Visual Word2Vec, where words in the transcript are visually confirmed using the entire ImageNet dataset ( categories).
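Clustering transcript words in an embedding space can be sketched as follows. This is only an illustration under stated assumptions: the 2-D vectors below are hand-made stand-ins for real Word2Vec embeddings, the choice of k-means and of two clusters is hypothetical, and the slides do not say which clustering algorithm was used.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(vectors, k, iterations=20):
    """Plain k-means; returns one cluster label per input vector."""
    # Farthest-point initialisation keeps the sketch deterministic.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean(members)
    return labels


# Hypothetical 2-D "embeddings" for words from the example sentence.
toy_embedding = {
    "forests": [0.90, 0.10],
    "leaves": [0.80, 0.20],
    "deciduous": [0.85, 0.15],
    "cold": [0.10, 0.90],
    "dark": [0.20, 0.80],
    "months": [0.15, 0.70],
}
words = list(toy_embedding)
labels = kmeans([toy_embedding[w] for w in words], k=2)

clusters = {}
for word, label in zip(words, labels):
    clusters.setdefault(label, []).append(word)
print(clusters)
```

Each resulting cluster plays the role of one textual concept; counting how many words of a shot's transcript fall into each cluster would give a per-shot descriptor.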
8 A Deep Multi-modal architecture
9 Retrieval: merging semantics and aesthetics
- Scene-based: retrieve parts of videos instead of whole videos
- Semantic: the thumbnail should represent the query
- Aesthetic: the thumbnail should be aesthetically pleasing
- Low- and high-level activations from a CNN + max-margin linear ranking
- Less data, no DNN training!
L. Baraldi, C. Grana, R. Cucchiara, "Scene-driven Retrieval in Edited Videos using Aesthetic and Semantic Deep Features", ICMR 2016, New York
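A max-margin linear ranker of the kind mentioned on this slide can be sketched with a pairwise hinge loss. This is a generic illustration, not the authors' implementation: the talk uses CNN activations as features, whereas the 2-D vectors, the preference pairs and the hyperparameters below are hypothetical.

```python
def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(w, x))


def train_ranker(pairs, dim, epochs=100, lr=0.1, reg=0.01):
    """Learn w so that score(better) >= score(worse) + 1 for each pair.

    pairs: list of (better, worse) feature vectors.
    Uses subgradient descent on an L2-regularised pairwise hinge loss.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            margin = dot(w, better) - dot(w, worse)
            for i in range(dim):
                grad = reg * w[i]
                if margin < 1.0:  # margin violated: push the pair apart
                    grad -= better[i] - worse[i]
                w[i] -= lr * grad
    return w


# Hypothetical thumbnail features and preference pairs (better, worse).
pairs = [
    ([0.9, 0.8], [0.2, 0.3]),
    ([0.7, 0.9], [0.6, 0.1]),
    ([0.8, 0.4], [0.1, 0.5]),
]
w = train_ranker(pairs, dim=2)

# Rank unseen candidates by their learned linear score, best first.
candidates = {"thumb_a": [0.3, 0.2], "thumb_b": [0.9, 0.9], "thumb_c": [0.6, 0.5]}
ranking = sorted(candidates, key=lambda name: dot(w, candidates[name]), reverse=True)
print(ranking)
```

Because only a linear model is fitted on top of fixed features, training needs few labelled pairs and no network fine-tuning, which matches the slide's "Less data, no DNN training!" remark.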
10 Retrieval: merging semantics and aesthetics
- Query: penguin and calf. Same video, different scenes
- Query: ant and spider. Same scene, different thumbnails
11 Evaluation
- Synthetic: YFCC100M-Stories, built using YFCC100M videos
- TV series: AllyMcBeal dataset for scene detection, first four episodes of the first season
- Documentaries: BBC Planet Earth, 11 episodes from a BBC educational TV series, 4900 shots and 670 segments
Results table: our method (Deep) vs. state of the art on YFCC100M-Stories, AllyMcBeal and BBC Planet Earth (values not preserved in the transcript)
12 Visualization Automatically generated stories can be visualized in a timeline fashion. Visual concepts enhance navigation and search inside the archive.
13 Visualization Video re-use!
14 Thank you Any questions?