Video Summarization Using MPEG-7 Motion Activity and Audio Descriptors
1 Video Summarization Using MPEG-7 Motion Activity and Audio Descriptors. Ajay Divakaran, Kadir A. Peker, Regunathan Radhakrishnan, Ziyou Xiong and Romain Cabasson. Presented by Giulia Fanti.
2 Overview: Motivation. Overview of MPEG compression. Keyframe extraction: use the motion information already present in MPEG-compressed video (the basis of the MPEG-7 motion activity descriptor) to extract better keyframes. Constant pace video skimming: effectively summarize varying action levels. Audio-assisted news video browsing. Sports highlights detection: golf, soccer.
3 Motivation: The amount of video available is enormous. Users want to search and navigate it easily. Emphasis on computational efficiency and on incorporation into consumer system hardware.
4 How was video previously summarized? Use keyframes (the first frame of a shot), with colour as the similarity metric; the keyframe describes the whole shot. Proposition: use MPEG-7 intensity of motion activity descriptors!
5 MPEG OVERVIEW: How does MPEG work?
6 MPEG OVERVIEW: MPEG (cont'd)
7 MPEG OVERVIEW: MPEG (cont'd)
8 MPEG OVERVIEW: MPEG (cont'd). Figure panels: Frame N, Frame N+1.
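9 KEYFRAME IDENTIFICATION: Specify a notion of fidelity. Let S = keyframes ($s_j$, $j = 1, \dots, m$) and R = regular (remaining) frames ($r_i$, $i = 1, \dots, n$). The distance from regular frame $r_i$ to its nearest keyframe is $d_i = \min_{j = 1, \dots, m} d(s_j, r_i)$. The semi-Hausdorff distance $d_{sh}(S, R) = \max_{i = 1, \dots, n} d_i$ measures keyframe quality.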
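The two formulas on this slide can be sketched in a few lines of Python. This is only an illustration: frames are assumed to be represented by arbitrary feature values, and the frame distance `dist` is a caller-supplied function (the slides do not say which frame distance is used).

```python
def semi_hausdorff(keyframes, regular_frames, dist):
    """Return d_sh(S, R) = max over regular frames of the distance to the nearest keyframe.

    `keyframes` (S) and `regular_frames` (R) are sequences of frame features;
    `dist` is an assumed, caller-supplied frame distance d(s, r).
    """
    # d_i = min_j d(s_j, r_i): how well regular frame r_i is covered by its closest keyframe
    nearest = [min(dist(s, r) for s in keyframes) for r in regular_frames]
    # d_sh = max_i d_i: the worst-covered regular frame decides the score
    return max(nearest)


# Toy usage with scalar "features" and absolute difference as the distance.
score = semi_hausdorff([0.0, 10.0], [1.0, 4.0, 9.0], lambda a, b: abs(a - b))
```

A lower score means every regular frame has some keyframe close to it, which is the sense in which the distance measures keyframe quality.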
10 KEYFRAME IDENTIFICATION: Keyframe Extraction Ideas (1). Baseline technique for comparison. For each set of n keyframes: for each shot, compute d_sh. Choose the keyframe set with the lowest semi-Hausdorff distance. Computationally very expensive, but optimal.
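A minimal sketch of the exhaustive baseline for a single shot, assuming frames are given as a list of feature values and `dist` is a caller-supplied frame distance (both are illustrative assumptions, not details from the slides). It makes the cost visible: every one of the C(N, n) candidate sets is scored.

```python
from itertools import combinations


def best_keyframes(features, n, dist):
    """Try every set of n frame indices and keep the one with the lowest semi-Hausdorff distance."""
    indices = range(len(features))
    best_set, best_score = None, float("inf")
    for candidate in combinations(indices, n):
        chosen = set(candidate)
        rest = [i for i in indices if i not in chosen]
        if rest:
            # worst-case distance from a non-keyframe to its nearest keyframe
            score = max(min(dist(features[k], features[i]) for k in candidate) for i in rest)
        else:
            score = 0.0  # every frame is a keyframe
        if score < best_score:
            best_set, best_score = candidate, score
    return best_set, best_score


# Toy shot with two visually distinct halves.
frames = [0, 1, 2, 10, 11, 12]
chosen, score = best_keyframes(frames, 2, lambda a, b: abs(a - b))
```

The number of candidate sets grows combinatorially with the shot length, which is why this is only useful as a reference point for the cheaper motion-based methods.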
11 KEYFRAME IDENTIFICATION: Keyframe Extraction Ideas (2). To extract n keyframes: examine the cumulative motion function, divide the range of the motion function into n intervals, and choose the frame corresponding to the middle of each interval. Figure: cumulative motion intensity from the first frame.
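The interval-midpoint rule can be sketched as follows, assuming the shot is given as a list of per-frame motion activity intensities (the name `activity` and this input format are assumptions made for illustration).

```python
from bisect import bisect_left
from itertools import accumulate


def keyframes_by_cumulative_motion(activity, n):
    """Pick n frame indices at the midpoints of n equal slices of cumulative motion."""
    cumulative = list(accumulate(activity))  # cumulative motion from the first frame
    total = cumulative[-1]
    picks = []
    for k in range(n):
        target = total * (k + 0.5) / n  # middle of the k-th interval of the range
        # first frame whose cumulative motion reaches the target
        picks.append(bisect_left(cumulative, target))
    return picks


# Low motion in the first half, high motion in the second half.
print(keyframes_by_cumulative_motion([1, 1, 1, 1, 10, 10, 10, 10], 2))  # [4, 6]
```

Compared with uniform sampling, which would pick frames 1 and 5 here, the keyframes drift toward the high-motion part of the shot, where the content changes fastest.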
12 KEYFRAME IDENTIFICATION: Similarity. A single keyframe works for over 90% of low-motion frames.
13 KEYFRAME IDENTIFICATION: Keyframe Extraction Ideas (3). Progressively increase resolution: take the first and last frames as keyframes; at each iteration, add a keyframe in the middle of the motion spectrum.
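One possible reading of the progressive scheme, sketched under assumptions: "middle of the motion spectrum" is interpreted here as repeatedly halving the cumulative motion range, so level 1 adds the frame at 1/2 of the total motion, level 2 adds the frames at 1/4 and 3/4, and so on. The function name and the `levels` parameter are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def progressive_keyframes(activity, levels):
    """Return keyframe indices in the order they are added, coarse to fine."""
    cumulative = list(accumulate(activity))
    total = cumulative[-1]
    keyframes = [0, len(activity) - 1]  # start with the first and last frames
    for level in range(1, levels + 1):
        parts = 2 ** level
        for k in range(1, parts, 2):  # only the new midpoints at this level
            index = bisect_left(cumulative, total * k / parts)
            if index not in keyframes:
                keyframes.append(index)
    return keyframes


print(progressive_keyframes([1, 1, 1, 1, 10, 10, 10, 10], 2))  # [0, 7, 5, 4, 6]
```

Because earlier keyframes are never discarded, a viewer can stop at any level and still have a valid, coarser summary.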
14 Constant Pace Skimming Using Motion Activity: Two interpretations: speed up/slow down playback, or change the sampling rate. Uniform summarization amounts to fast playback. Instead, fix the motion activity and adjust the video accordingly. Set a minimum action threshold for the entire playback.
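The sampling-rate interpretation can be sketched like this, assuming per-frame motion activity values and a target amount of motion per output frame (`pace`); both names are illustrative, and the minimum action threshold mentioned on the slide is not modelled here.

```python
def constant_pace_sample(activity, pace):
    """Keep a frame each time the motion accumulated since the last kept frame reaches `pace`."""
    if pace <= 0:
        raise ValueError("pace must be positive")
    selected = []
    accumulated = 0.0
    for index, value in enumerate(activity):
        accumulated += value
        if accumulated >= pace:
            selected.append(index)
            accumulated = 0.0  # start measuring motion for the next output frame
    return selected


# Quiet stretches are skipped quickly; the busy middle is kept frame by frame.
print(constant_pace_sample([1, 1, 1, 1, 4, 4, 1, 1], pace=4))  # [3, 4, 5]
```

Played back at a fixed frame rate, the selected frames show roughly the same amount of motion per second, which is the "constant pace" the slide refers to.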
15 CONSTANT PACE SKIMMING: Testing. Surveillance data: colour data is useless! No quantitative explanation of results. Figure panels: Uniform Sampling, Motion Compensation.
16 CONSTANT PACE SKIMMING: Testing (commercial video). Observation: sometimes you can get semantic information from the motion vectors! Figure panels: Golf, News, Soccer, Basketball.
17 Audio-Assisted News Browsing: News is structured. Large-scale semantic segmentation: individual shots make up a larger segment. Goal: extract a query segment and generate a summary. Old approach: speech/non-speech detection; train GMMs for each speaker; fit each new speech segment to the GMMs. Computationally complex.
18 AUDIO-ASSISTED NEWS BROWSING: Instead, try the sound-recognition framework by Casey et al. Offline: train HMMs for various sounds. Online: run the Viterbi algorithm on the HMMs. Computationally cheap! Given a sound clip, return a histogram of state frequencies. Figure panels: Dog barks, Man talks, Lady talks, Glass break.
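The online step (Viterbi decoding followed by a state histogram) can be sketched for a toy discrete HMM. This is a generic textbook Viterbi in the log domain, not the MPEG-7 reference implementation; the model parameters below are made up, and real systems decode continuous audio features rather than symbols.

```python
import math
from collections import Counter


def _log(p):
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely state path for a discrete HMM given as nested probability lists."""
    n_states = len(start_p)
    score = [_log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []
    for symbol in obs[1:]:
        prev, score, pointers = score, [], []
        for s in range(n_states):
            # best predecessor of state s at this time step
            q = max(range(n_states), key=lambda r: prev[r] + _log(trans_p[r][s]))
            pointers.append(q)
            score.append(prev[q] + _log(trans_p[q][s]) + _log(emit_p[s][symbol]))
        back.append(pointers)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):  # follow back-pointers from the end
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def state_histogram(path, n_states):
    """Fraction of time spent in each state: the clip's feature vector."""
    counts = Counter(path)
    return [counts[s] / len(path) for s in range(n_states)]


# Hypothetical two-state model with "sticky" states.
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.9, 0.1], [0.1, 0.9]]
path = viterbi([0, 0, 0, 1, 1], start, trans, emit)
hist = state_histogram(path, 2)
```

The histogram discards the ordering of states and keeps only how long the clip dwells in each one, which gives a fixed-length descriptor regardless of clip duration.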
19 Principal Cast Identification: Structure. Feature Extraction: extract MPEG-7 motion activity, colour, and audio features from news (intensity of motion activity from P-frames; 64-bin colour histogram from I-frames; audio energy bands projected onto HMM class bases). ID Speaker Changes: use the sound recognition framework to ID speaker changes (classification as male, female, speech with music; clustering determines break points). Merge motion/speaker clusters to ID principal speakers/segment. Apply the browsing to each segment.
20 PRINCIPAL CAST IDENTIFICATION: Feature Extraction. Figure: timeline marked at 3 s, 6 s, 9 s, 12 s. Sum energy bands. Project onto sound class principal components.
21 Figure panels: Dog barks, Man speaks, Lady speaks, Glass breaks. Feature Vector: [ ]. Source: Casey, M., MPEG-7 Sound Recognition Tools, IEEE Transactions on Circuits and Systems for Video Technology, June.
22 PRINCIPAL CAST IDENTIFICATION: ID Speaker Changes. Observation: the HMM framework is just like GMMs; each state in an HMM is like a cluster in feature space! Use KL divergence as the metric.
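A minimal sketch of comparing two state histograms with KL divergence. The additive smoothing constant `eps` and the symmetrised variant are assumptions added for illustration (plain KL is asymmetric and undefined when a bin is empty); the slides only say that KL divergence is the metric.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for two histograms, with a small smoothing term to avoid log(0)."""
    p_s = [x + eps for x in p]
    q_s = [x + eps for x in q]
    p_sum, q_sum = sum(p_s), sum(q_s)
    p_n = [x / p_sum for x in p_s]  # renormalise after smoothing
    q_n = [x / q_sum for x in q_s]
    return sum(a * math.log(a / b) for a, b in zip(p_n, q_n))


def symmetric_kl(p, q):
    """Symmetrised divergence, usable as a dissimilarity between audio segments."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Similar histograms give a small value, dissimilar ones a large value.
same_speaker = symmetric_kl([0.9, 0.1], [0.8, 0.2])
different_speaker = symmetric_kl([0.9, 0.1], [0.1, 0.9])
```

A speaker change can then be flagged where the divergence between the histograms of adjacent segments exceeds some threshold, which would have to be chosen empirically.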
23 PRINCIPAL CAST IDENTIFICATION: Clustering. Contiguous set of female speech segments. Build a dendrogram by merging clusters to ID speakers.
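Bottom-up merging into a dendrogram can be sketched as below. Average linkage and the scalar toy features are assumptions; the slides do not state which linkage rule or segment distance is used (the KL-based dissimilarity from the previous slide would be a natural candidate for `dist`).

```python
def agglomerate(points, dist):
    """Return the merge history [(cluster_a, cluster_b, distance), ...] of a dendrogram.

    Leaves are numbered 0..N-1; each merge creates a new cluster id N, N+1, ...
    """
    clusters = {i: [i] for i in range(len(points))}
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                # average linkage: mean pairwise distance between members
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist(points[i], points[j]) for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
        next_id += 1
    return merges


# Two tight groups of segments: they merge internally first, then with each other.
history = agglomerate([0.0, 0.1, 5.0, 5.2], lambda x, y: abs(x - y))
```

Cutting the dendrogram at a chosen distance yields the speaker clusters; the last, most expensive merges join segments that most likely belong to different speakers.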
24 AUDIO-ASSISTED NEWS BROWSING: Testing. 3.5 hours of news video for training data, from 4 different TV channels. Training data partitioned 90%-10% training-validation. Actual testing: one 34-minute news segment, one 59-minute segment.
25 Results
26 Sports Highlight Detection: Motion vectors are noisy in sports footage. Use temporal motion patterns to detect highlights. Use the structure of the game to help. Combine visual and audio features. Focus on detection of interesting parts.
27 SPORTS HIGHLIGHTS DETECTION: Golf. Smooth out motion vectors. Look for long stretches of low activity followed by high activity. Stitch together 10-second segments starting at the spike in activity. Misses putts (slow camera motion).
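The golf heuristic can be sketched as a scan over smoothed motion activity. All thresholds, the moving-average smoother, and the parameter names are assumptions for illustration; the slides give only the 10-second clip length.

```python
def moving_average(values, window):
    """Centred moving average; the window shrinks at the sequence boundaries."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def golf_highlights(activity, fps, low, high, min_quiet_s, clip_s=10.0, window=5):
    """Return (start, end) frame ranges that begin at a spike following a long quiet stretch."""
    smooth = moving_average(activity, window)
    min_quiet = int(min_quiet_s * fps)
    clip_len = max(1, int(clip_s * fps))
    clips = []
    quiet_run = 0
    i = 0
    while i < len(smooth):
        value = smooth[i]
        if value >= high:
            if quiet_run >= min_quiet:
                clips.append((i, min(len(smooth), i + clip_len)))
                quiet_run = 0
                i += clip_len  # skip past the clip just taken
                continue
            quiet_run = 0
        elif value < low:
            quiet_run += 1
        # values between low and high leave the quiet run unchanged,
        # which tolerates the ramp that smoothing creates before a spike
        i += 1
    return clips


# One frame per second, smoothing disabled so the toy numbers are easy to follow.
activity = [0] * 5 + [10] * 3 + [0] * 2
print(golf_highlights(activity, fps=1, low=1, high=5, min_quiet_s=3, clip_s=2, window=1))  # [(5, 7)]
```

A putt produces little camera motion, so it never crosses the `high` threshold, which matches the limitation noted on the slide.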
28 SPORTS HIGHLIGHTS DETECTION: Soccer. Locate all audio volume peaks. For each peak, check whether play stopped and stayed stopped. Concatenate the periods immediately preceding stopped play. Testing: 7 soccer games from Korea, USA, Europe.
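A rough sketch of the soccer heuristic, assuming per-frame audio volume and motion activity sequences of equal length. "Play stopped" is approximated here by motion activity staying below a threshold for `hold` frames; that proxy and every parameter name are assumptions, not details given in the slides.

```python
def soccer_highlights(volume, activity, vol_thresh, still_thresh, hold, lead, max_delay):
    """Return (start, end) frame ranges immediately preceding stopped play after a loud peak."""
    clips = []
    for t in range(1, len(volume) - 1):
        is_peak = (volume[t] >= vol_thresh
                   and volume[t] > volume[t - 1]
                   and volume[t] >= volume[t + 1])
        if not is_peak:
            continue
        # look for the moment play stops within max_delay frames of the peak
        search_end = min(len(activity), t + max_delay + 1)
        stop = next((u for u in range(t, search_end) if activity[u] < still_thresh), None)
        if stop is None or stop + hold > len(activity):
            continue
        # play must stay stopped for `hold` frames
        if all(a < still_thresh for a in activity[stop:stop + hold]):
            clips.append((max(0, stop - lead), stop))
    return clips


volume = [0, 0, 0, 9, 0, 0, 0, 0, 0, 0]
activity = [5, 5, 5, 5, 5, 0, 0, 0, 5, 5]
print(soccer_highlights(volume, activity, vol_thresh=5, still_thresh=1,
                        hold=3, lead=4, max_delay=3))  # [(1, 5)]
```

The returned ranges would then be concatenated in time order to form the highlight reel.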
29 SPORTS HIGHLIGHTS DETECTION: Unified framework. It is impractical to make a separate algorithm for every sport. Wish to combine highlight detection for general sport. Consider soccer, golf, and baseball.
30 SPORTS HIGHLIGHTS DETECTION: Technique. Interesting events are usually marked by applause. Want to classify: applause, cheering, ball hits, music, speech, speech with music. Train HMMs for each class. Use Mel-frequency cepstral coefficients (MFCCs) as features.
31 SPORTS HIGHLIGHTS DETECTION: Technique. Collect all sequences of uninterrupted cheering. Keep all sequences that last >67% of the longest cheering segment. Add a time cushion to the start and end of cheering. Use the length of cheering as an indicator of importance. Figure axes: Amplitude, Time.
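The cheering-based selection can be sketched on top of a per-frame label sequence, which is assumed to come from the audio classifier of the previous slide. The label string `"cheering"`, the cushion size, and the output tuple layout are illustrative assumptions; the 67% ratio is from the slide.

```python
from itertools import groupby


def cheering_highlights(labels, keep_ratio=0.67, cushion=2, target="cheering"):
    """Return (start, end, cheer_length) tuples, most important (longest cheering) first."""
    runs = []
    position = 0
    for label, group in groupby(labels):  # consecutive identical labels form one run
        length = len(list(group))
        if label == target:
            runs.append((position, position + length))
        position += length
    if not runs:
        return []
    longest = max(end - start for start, end in runs)
    # keep runs lasting more than keep_ratio of the longest run
    kept = [(s, e) for s, e in runs if (e - s) > keep_ratio * longest]
    kept.sort(key=lambda run: run[1] - run[0], reverse=True)
    # pad each run with a time cushion, clipped to the recording boundaries
    return [(max(0, s - cushion), min(len(labels), e + cushion), e - s) for s, e in kept]


labels = (["speech"] * 3 + ["cheering"] * 10 + ["speech"] * 5 + ["cheering"] * 4
          + ["music"] * 2 + ["cheering"] * 8 + ["speech"] * 2)
print(cheering_highlights(labels))  # [(1, 15, 10), (22, 34, 8)]
```

The 4-frame burst of cheering is dropped because it is shorter than 67% of the longest run, and the remaining segments are ranked by how long the crowd kept cheering.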
32 SPORTS HIGHLIGHTS DETECTION: Highlight Extraction Framework
33 Classification results
34 Future work: Better clustering algorithms: multi-level pruning on dendrograms. More sophisticated associations between audio and visual features. Assessment of semantic success of summarization. Improve audio-based video browsing: more robust; use the semantic info from audio classification more. Content-adaptive techniques that learn variations in content.
35 Conclusions: Totally heuristic approach. Seems to work for their needs. Summarization works best within a semantic segment. Use MPEG-7 generalized sound recognition to ID semantic units. Use domain knowledge to ID regions of high/low motion activity.
36 Questions?