Text, Speech, and Vision for Video Segmentation: The Informedia™ Project

Alexander G. Hauptmann, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA
Michael A. Smith, Dept. of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA

Abstract

We describe three technologies involved in creating a digital video library suitable for full-content search and retrieval. Image processing analyzes scenes, speech processing transcribes the audio signal, and natural language processing determines word relevance. The integration of these technologies enables us to include vast amounts of video data in the library.

1 Introduction

The Informedia Digital Video Library Project at Carnegie Mellon University is creating a digital library of text, images, videos and audio data available for full content retrieval [Stevens94][Christel94]. The initial testbed will be installed in several K-12 schools, and students will use the system to explore multimedia data for educational purposes. The Informedia system for video libraries goes far beyond the current paradigm of video-on-demand by retrieving a short video paragraph in response to the user's query. The project can be divided into two phases: library creation and library exploration (see Figure 1).

1.1 Library creation

The Informedia project is creating intelligent, automatic mechanisms for populating a video library and allowing for its full-content and knowledge-based search and segment retrieval. The material is obtained from the video assets of WQED/Pittsburgh as well as British Open University video courses. The project uses the Sphinx-II speech recognition system to transcribe and align narratives and dialogues automatically. The resulting transcript is then processed through methods of natural language understanding to extract subjective descriptions and mark relevant key words. Acoustic signal analysis identifies potential segment boundaries of paragraph size.
Within a paragraph, scenes are isolated and clustered into video segments through the use of various image understanding techniques. These components are described in Figure 2.

1.2 Library exploration

Users are able to explore the Informedia library through an interface that allows them to search using typed or spoken natural language queries, select relevant documents retrieved from the library, and display the material on their PC workstations. The library retrieval system can effectively process spoken queries and deliver relevant video data in a compact format, based on information embedded with the video during library creation. Video and other data may be explored in depth for related content. During retrieval based on keyword searches by a user, only the relevant video segments are displayed. Prototype exploration systems have been implemented on both Macintosh and PC platforms.

Figure 1: Overview of the Informedia Digital Video Library System. Library creation (offline): TV footage, extra footage and new video footage pass through speech and language interpretation and indexing and through video segmentation and description, producing indexed transcript text and segmented, described video. Library exploration (online): interactive video search by visual, spoken, and natural language query over the indexed video database.

In this paper we will focus on the library creation aspect of the Informedia Project. In particular, we will describe how to segment a video meaningfully using the integration of different technologies. Through the combined efforts of Carnegie Mellon's speech, image and natural language processing groups, this system provides a robust tool for segmenting many types of video data in order to utilize them within a digital video library.

Figure 2: Combined technology to select the representative frame (icon). Panels: video paragraph (speech SNR), speech transcript (keywords), scene isolation (image analysis), representative frame icon (keyword search).

2 Video Segmentation

Generally, the videos in the Informedia Library are full one-hour feature broadcast videos based on educational documentaries. To allow efficient access to the relevant content of the videos, we need to separate them into small pieces. Answering a user query by showing an hour-long video is rarely a reasonable response. The Informedia library creation phase uses three different levels of segmentation for a video. The first and generally largest segment is the video paragraph, which consists of a series of related scenes with a common content. The second level of segmentation identifies a single scene on the video within the video paragraph. Finally, within a single scene we also need to be able to select a representative frame icon for static displays.

2.1 Video paragraphs

When a user receives the response to a query, the system needs to determine how much content and context to display. Where should the video clip start and where does it end? The answer is partly determined by the content of the user query. But the answer is also dependent on the natural segments within the video, which we call video paragraphs. In the ideal case, a video paragraph starts at the natural boundary of the relevant content and ends wherever the video moves to a different context.
2.2 Individual scenes

Segment breaks produced by image processing are examined along with the boundaries identified by speech and natural language processing of the transcript, and an improved set of segment boundaries is heuristically derived to partition the video paragraphs into scenes. All frames from each new scene will be used to select the frame icon. This technique will allow for the inclusion of all relevant image information in the video and the elimination of redundant data.

2.3 Frame icons

For purposes of static displays, the most characteristic frame of a scene is included in static (non-animated) representations of the user's selection. A single frame is displayed as representative of the whole video segment. This is used in an outlined display showing the results of a user query. Showing frame icons allows the user to simultaneously look at a static representation of multiple video paragraphs and to obtain some information about their content and possible relevance to the user's query, before selecting any one paragraph for playback. Frame icons are also important as encapsulations of the video paragraph for printed reports and viewgraphs.

In order to create these various levels of segmentation, we integrate a number of different technologies, which will be described in the next section.

3 Component Technologies

There are 3 broad categories of technologies we can bring to bear on the problem of identifying video segments from broadcast video materials.

a. Text processing looks at the textual (ASCII) representation of the words that were spoken, as well as other annotations derived from the transcript, the production notes or the close-captioning that may be available.
b. Speech signal analysis provides the basis for analyzing the audio component of the material.
c. Image analysis looks at the images in the video-only portion.

Currently, in the library creation phase of the Informedia Digital Video Library, the following specific approaches are used to create segmentation information.

3.1 Text Analysis

Text analysis can work on an existing ASCII transcript to help segment the text into paragraphs. An analysis of keyword prominence allows us to identify important sections in the transcript [Mauldin89]. Other more sophisticated language-based criteria are under investigation. The notion of semantic connections between text portions might be exploited for segmentation as well. Currently we use two main techniques in natural language analysis.

a. If we have a complete time-aligned transcript available from the close-captioning or through a human-generated transcription, we can exploit natural structural text markers such as punctuation to identify segments of video paragraph granularity.
b. To identify and rank the contents of the various segments, we use the well-known technique of TF/IDF (term frequency/inverse document frequency) to identify critical keywords and their relative importance for the video document [Salton83].

3.2 Speech Analysis

Speech analysis operates only on the audio portion of the video. Using speech recognition we can obtain a transcript, although it may contain errors. We can also detect transitions between speakers and topics, which are usually marked by silence or low energy areas in the acoustic signal.

Recognition

To transcribe the content of the video material, we recognize spoken words with the Sphinx-II speech recognizer. The CMU Sphinx-II system uses semi-continuous Hidden Markov Models to model context-dependent phones (triphones), including between-word context [Hwang94]. The recognizer processes an utterance in 3 steps. It makes a forward time-synchronous pass using full between-word models, Viterbi scoring and a trigram language model. This produces a word lattice where words may have only one begin time but several end times. The recognizer then makes a backward pass which uses the end times from the words in the first pass and produces a second lattice which contains multiple begin times for words. An A* algorithm is used to generate the best hypothesis from these two lattices.

The language model consists of words (with probabilities) and bigrams/trigrams, which are word pairs/triplets with conditional probabilities for the last word given the previous word(s). The language model was constructed from a corpus of news stories from the Wall Street Journal from 1989 to 1994 and the Associated Press news service stories from 1988 to 199. Only trigrams that were encountered more than once were included in the model, but all bigrams and the most frequent 588 words in the corpus were included [Rudnicky95].
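The TF/IDF weighting named in Section 3.1 can be illustrated with a small sketch. This is not the Informedia implementation: the function names `tf_idf` and `top_keywords`, the length-normalized term frequency, and the plain `log(N / df)` form of the inverse document frequency are assumptions chosen for illustration, and the token lists are a hypothetical stand-in for transcript segments.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    documents: list of token lists (hypothetical transcript segments).
    weight = (count / segment length) * log(N / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per segment

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens) or 1
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def top_keywords(weights, k=3):
    """Rank terms of one segment by weight, breaking ties alphabetically."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


segments = [
    ["toy", "toy", "jury", "the"],
    ["dinosaur", "extinct", "the"],
    ["planet", "earth", "the"],
]
scores = tf_idf(segments)
print(top_keywords(scores[0], k=2))
```

A term that occurs in every segment, such as "the" here, receives weight zero, which is why frequent function words do not surface as keywords.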
Processing the video tape using the speech recognition system gives us a transcript. This transcript contains errors which, depending on the quality of the tape and the subject matter, currently range from 2% to 7% word error rate.

Acoustic Segmentation

To detect breaks between utterances we use a modification of Signal to Noise ratio (SNR) techniques which compute signal power. This algorithm computes the power of digitized speech samples

$$\mathrm{Power} = \log\left(\frac{1}{n}\sum_{i=1}^{n} S_i^{2}\right)$$

where $S_i$ is a pre-emphasized sample of speech within a frame of 2 milliseconds and $n$ is the number of samples in the frame. A low power level indicates that there is little active speech occurring in this frame (low energy). Segmentation breaks between utterances are set at the minimum power as averaged over a 1 second window. To prevent unusually long segments, we force the system to place at least one break within 3 seconds.

3.3 Image Analysis

Image analysis is primarily used for the identification of breaks between scenes and the identification of a single static frame icon that is representative of a scene.

Histogram Analysis

Video is segmented into scenes through the use of comparative difference measures [Zhang93]. Images with small histogram disparity are considered to be relatively equivalent. By detecting significant changes in the weighted color histogram of each successive frame, image sequences can be separated into individual scenes. A comparison between the cumulative distributions is used as a difference measure. The histogram difference plot is shown in the bottom graph of Figure 3. This result is passed through a high pass filter to further isolate peaks, and an empirical threshold is used to select only those regions where scene breaks occur.

Figure 3: Scene segmentation and motion vector error. Panels: optical flow (motion vector confidence measure) and histogram difference analysis, plotted over frames.

To make the analysis more robust, we examine individual images in tiled subwindows. This reduces the noise in our difference data and compensates for motion between frames. The images are initially subsampled to provide an efficient means of computation. Using only histogram difference, we have achieved 9% accuracy on a test set of roughly 2, video images (2 hours).

Optical Flow

One important method of visual segmentation and description is based on interpreting camera motion [Akutsu94]. We can interpret camera motion as a pan or zoom by examining the geometric properties of the optical flow vectors. Using the Lucas-Kanade gradient descent method for optical flow, we can track individual regions from one frame to the next [Lucas81]. By measuring the velocity that individual regions show over time, a motion representation of the scene is created. Figure 4 shows examples of optical flow analysis for different types of camera motion. Drastic changes in this flow describe random motion and, therefore, new scenes. These changes will also occur during gradual transitions between images such as fades or special effects.

Only regions of low ambiguity are selected for tracking. Trackable regions are found by searching the entire image for subwindows whose gradient derivatives exhibit relatively similar eigenvalues. In order to accurately track a region over large areas, a multiresolution structure is used. With this structure we can track regions across many pixels and reduce the time needed for computation. When the optical flow is minimal, the frames are suitable for an iconic frame representation. Since we are primarily interested in distinguishing static frames from motion frames, it was sufficient to track only the top 3 regions.

Figure 4: Camera motion analysis using optical flow. Flow vectors are amplified for visibility.

These techniques work well when scene changes are abrupt; however, camera motion and gradual changes can severely affect the accuracy of the system. The first graph in Figure 3 shows the optical flow error for a given sequence. When changes are gradual, we combine the optical flow results with histogram analysis. This allows for segmentation under conditions that do not involve drastic changes in image content, and detection accuracy as high as 95%.
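The histogram comparison of Section 3.3 can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the system's code: it uses grayscale intensities in place of the weighted color histogram, omits the high pass filter, tiled subwindows and subsampling, and the function names, the 16-bin resolution and the threshold of 0.25 are all hypothetical choices.

```python
def cumulative_histogram(frame, bins=16, max_value=256):
    """Normalized cumulative histogram of a flat list of pixel intensities."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    total = len(frame)
    cdf, running = [], 0
    for count in counts:
        running += count
        cdf.append(running / total)
    return cdf


def histogram_difference(frame_a, frame_b, bins=16):
    """Mean absolute difference between the two cumulative distributions."""
    cdf_a = cumulative_histogram(frame_a, bins)
    cdf_b = cumulative_histogram(frame_b, bins)
    return sum(abs(a - b) for a, b in zip(cdf_a, cdf_b)) / bins


def scene_breaks(frames, threshold=0.25, bins=16):
    """Indices i at which frame i differs strongly from frame i - 1."""
    return [
        i for i in range(1, len(frames))
        if histogram_difference(frames[i - 1], frames[i], bins) > threshold
    ]


# Two dark frames followed by two bright frames (64 pixels each).
dark, dark_variant, bright = [10] * 64, [12] * 64, [240] * 64
print(scene_breaks([dark, dark_variant, bright, bright]))
```

Small intensity changes that stay within one bin produce a difference of zero, while the dark-to-bright cut moves the whole distribution and exceeds the threshold. In practice the threshold would have to be tuned empirically, as the text notes.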

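The acoustic segmentation of Section 3.2 can be sketched as follows. The sketch assumes the samples are already pre-emphasized, and the helper names, the centered moving average, the power threshold and the rule of forcing a break at the current frame once `max_gap` frames have passed are illustrative assumptions rather than details given in the text.

```python
import math


def frame_power(frame, floor=1e-12):
    """Log of the mean squared amplitude of one frame of samples."""
    mean_square = sum(s * s for s in frame) / len(frame)
    return math.log(max(mean_square, floor))  # floor avoids log(0) on silence


def frame_powers(signal, frame_len):
    """Power of each consecutive, non-overlapping frame of the signal."""
    return [
        frame_power(signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def smooth(values, window):
    """Centered moving average over `window` frames, truncated at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half):i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def find_breaks(powers, window, max_gap, threshold):
    """Place breaks at low local minima of the smoothed power.

    A break is also forced once `max_gap` frames pass without one.
    """
    smoothed = smooth(powers, window)
    breaks, last = [], 0
    for i in range(1, len(smoothed) - 1):
        is_minimum = (smoothed[i] < smoothed[i - 1]
                      and smoothed[i] <= smoothed[i + 1])
        if (is_minimum and smoothed[i] < threshold) or i - last >= max_gap:
            breaks.append(i)
            last = i
    return breaks


# Loud speech, a short pause, then loud speech again.
signal = [1.0, -1.0] * 20 + [0.001] * 20 + [1.0, -1.0] * 20
powers = frame_powers(signal, frame_len=10)
print(find_breaks(powers, window=1, max_gap=100, threshold=-5.0))
```

Here `window` and `max_gap` are expressed in frames, so the 1 second averaging window and the maximum segment length from the text would be converted using the frame duration.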
Figure 5: Analysis of scene changes in the video and audio signal. Panels: histogram scene analysis (scenes), audio signal (samples), audio segments and text.

4 Technology Synthesis

We now describe how we integrate the different component technologies. In our early work on the Informedia digital video library, all segmentation was done by hand. We have now moved to a procedure where segmentation boundaries are suggested by the system, but adjusted and verified by a person supervising the digital video library creation process. Eventually we will transition from computer-assisted procedures to fully automatic video segmentation, as the algorithms described above become better tested and more robust.

Our current library creation process starts with a raw digitized video tape. The audio portion is fed through the speech analysis routines, which produce a transcript of the spoken text. The speech signal is also analyzed for low energy sections that indicate acoustic paragraphs through silence. This is the first pass at segmentation. If a close-caption transcript is available, we use that instead of the speech recognition output, since it contains fewer errors. The transcript is processed by the natural language system and important keywords are identified. Using the results returned from image analysis, we then match the acoustic paragraph to the nearest scene break. This gives us an appropriate video paragraph clip in response to a user's request. The keywords and their corresponding paragraph locations in the video are indexed in the Informedia library catalogue.

To obtain video clips suitable for viewers, we first search for keywords from the user query in the recognition transcript. When we find a match, the surrounding video paragraph is returned. For a static icon representative of a video clip, we place most emphasis on the image data.
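Matching an acoustic paragraph boundary to the nearest scene break, as described above, could look like the following sketch. The function names and the representation of boundaries as timestamps in seconds are assumptions made for illustration; the text does not specify how the matching is implemented.

```python
from bisect import bisect_left


def nearest_scene_break(time, scene_breaks):
    """Return the scene break closest to `time`.

    scene_breaks must be a non-empty list sorted in ascending order.
    """
    pos = bisect_left(scene_breaks, time)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = scene_breaks[max(0, pos - 1):pos + 1]
    return min(candidates, key=lambda b: abs(b - time))


def align_paragraphs(acoustic_breaks, scene_breaks):
    """Snap each acoustic break to a scene break and drop duplicates."""
    snapped = {nearest_scene_break(t, scene_breaks) for t in acoustic_breaks}
    return sorted(snapped)


scene_breaks = [0.0, 4.2, 9.8, 15.1, 21.0]   # seconds, from image analysis
acoustic_breaks = [4.9, 14.0, 14.6]          # seconds, from low-power frames
print(align_paragraphs(acoustic_breaks, scene_breaks))
```

Two acoustic breaks that fall near the same scene break collapse into a single video paragraph boundary, so a clip never starts or ends in the middle of a scene.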
The paragraph is determined by the transcript and keywords. Within the paragraph, the most prominent keywords identify the most prominent scene. The scene boundaries are determined by the image analysis of color histogram differences and optical flow analysis. Figure 5 shows the integration of technologies used by the system.

5 Conclusion

We are currently using these techniques to create digital video library collections suitable for full content retrieval. While some steps are not yet fully integrated, each one has been shown to work independently, and several techniques are fully integrated within the Informedia system. Another use of the combined technologies will be the development of the video skim [Smith95]. By only presenting significant regions, a short synopsis of the video paragraph can be used as a preview for the actual segment.

The Informedia Project will establish an online digital video library consisting of over hours of video material. In order to be able to process this volume of data, practical, effective and efficient tools are essential. We have outlined a practical set of techniques for video segmentation that allows us to automatically process the volume of data required.

6 References

[Akutsu94] Akutsu, A. and Tonomura, Y. Video Tomography: An Efficient Method for Camerawork Extraction and Motion Analysis. Proc. ACM Multimedia 94, Oct. 15-20, 1994, San Francisco, CA.
[Christel94] Christel, M., Stevens, S., and Wactlar, H. Informedia Digital Video Library. Proceedings of the Second ACM International Conference on Multimedia, Video Program. New York: ACM, October 1994.
[Hwang94] Hwang, M., Rosenfeld, R., Thayer, E., Mosur, R., Chase, L., Weide, R., Huang, X., and Alleva, F. Improving Speech Recognition Performance via Phone-Dependent VQ Codebooks and Adaptive Language Models in SPHINX-II. ICASSP-94, vol. I.
[Lucas81] Lucas, B.D. and Kanade, T. An Iterative Technique of Image Registration and Its Application to Stereo. Proc. 7th International Joint Conference on Artificial Intelligence, August.
[Mauldin89] Mauldin, M. Information Retrieval by Text Skimming. PhD Thesis, Carnegie Mellon University, August. Revised edition published as Conceptual Information Retrieval: A Case Study in Adaptive Partial Parsing, Kluwer Press, September.
[Rudnicky95] Rudnicky, A. Language Modeling with Limited Domain Data. Proceedings of the 1995 ARPA Workshop on Spoken Language Technology, in press.
[Salton83] Salton, G. and McGill, M.J. Introduction to Modern Information Retrieval. McGraw-Hill Computer Science Series, McGraw-Hill, New York.
[Smith95] Smith, M. and Kanade, T. Video Skimming for Quick Browsing Based on Audio and Image Characterization. CS Technical Report, Carnegie Mellon University, Summer.
[Stevens94] Stevens, S., Christel, M., and Wactlar, H. Informedia: Improving Access to Digital Video. Interactions 1 (October 1994).
[Zhang93] Zhang, H., Kankanhalli, A., and Smoliar, S. Automatic Partitioning of Full-Motion Video. Multimedia Systems (1993) 1.

Acknowledgment

The authors would like to thank Howard Wactlar and the other members of the Informedia Project for their valuable discussions and contributions. This work is partially funded by the National Science Foundation, the National Aeronautics and Space Administration, and the Advanced Research Projects Agency.


More information

From Multimedia Retrieval to Knowledge Management. Pedro J. Moreno JM Van Thong Beth Logan

From Multimedia Retrieval to Knowledge Management. Pedro J. Moreno JM Van Thong Beth Logan From Multimedia Retrieval to Knowledge Management Pedro J. Moreno JM Van Thong Beth Logan CRL 2002/02 March 2002 From Multimedia Retrieval to Knowledge Management Pedro J. Moreno JM Van Thong Beth Logan

More information

CIMWOS: A MULTIMEDIA ARCHIVING AND INDEXING SYSTEM

CIMWOS: A MULTIMEDIA ARCHIVING AND INDEXING SYSTEM CIMWOS: A MULTIMEDIA ARCHIVING AND INDEXING SYSTEM Nick Hatzigeorgiu, Nikolaos Sidiropoulos and Harris Papageorgiu Institute for Language and Speech Processing Epidavrou & Artemidos 6, 151 25 Maroussi,

More information

Key Frame Extraction and Indexing for Multimedia Databases

Key Frame Extraction and Indexing for Multimedia Databases Key Frame Extraction and Indexing for Multimedia Databases Mohamed AhmedˆÃ Ahmed Karmouchˆ Suhayya Abu-Hakimaˆˆ ÃÃÃÃÃÃÈÃSchool of Information Technology & ˆˆÃ AmikaNow! Corporation Engineering (SITE),

More information

Video shot segmentation using late fusion technique

Video shot segmentation using late fusion technique Video shot segmentation using late fusion technique by C. Krishna Mohan, N. Dhananjaya, B.Yegnanarayana in Proc. Seventh International Conference on Machine Learning and Applications, 2008, San Diego,

More information

Maximum Likelihood Beamforming for Robust Automatic Speech Recognition

Maximum Likelihood Beamforming for Robust Automatic Speech Recognition Maximum Likelihood Beamforming for Robust Automatic Speech Recognition Barbara Rauch barbara@lsv.uni-saarland.de IGK Colloquium, Saarbrücken, 16 February 2006 Agenda Background: Standard ASR Robust ASR

More information

Kanade Lucas Tomasi Tracking (KLT tracker)

Kanade Lucas Tomasi Tracking (KLT tracker) Kanade Lucas Tomasi Tracking (KLT tracker) Tomáš Svoboda, svoboda@cmp.felk.cvut.cz Czech Technical University in Prague, Center for Machine Perception http://cmp.felk.cvut.cz Last update: November 26,

More information

A Video Optimization Framework for Tracking Teachers in the Classroom

A Video Optimization Framework for Tracking Teachers in the Classroom A Video Optimization Framework for Tracking Teachers in the Classroom Lele Ma College of William and Mary lma03@email.wm.edu Yantao Li Southwest University yantaoli@swu.edu.cn Gang Zhou College of William

More information

Lesson 11. Media Retrieval. Information Retrieval. Image Retrieval. Video Retrieval. Audio Retrieval

Lesson 11. Media Retrieval. Information Retrieval. Image Retrieval. Video Retrieval. Audio Retrieval Lesson 11 Media Retrieval Information Retrieval Image Retrieval Video Retrieval Audio Retrieval Information Retrieval Retrieval = Query + Search Informational Retrieval: Get required information from database/web

More information

STEREO BY TWO-LEVEL DYNAMIC PROGRAMMING

STEREO BY TWO-LEVEL DYNAMIC PROGRAMMING STEREO BY TWO-LEVEL DYNAMIC PROGRAMMING Yuichi Ohta Institute of Information Sciences and Electronics University of Tsukuba IBARAKI, 305, JAPAN Takeo Kanade Computer Science Department Carnegie-Mellon

More information

Content-Based Multimedia Information Retrieval

Content-Based Multimedia Information Retrieval Content-Based Multimedia Information Retrieval Ishwar K. Sethi Intelligent Information Engineering Laboratory Oakland University Rochester, MI 48309 Email: isethi@oakland.edu URL: www.cse.secs.oakland.edu/isethi

More information

Semantic Video Indexing

Semantic Video Indexing Semantic Video Indexing T-61.6030 Multimedia Retrieval Stevan Keraudy stevan.keraudy@tkk.fi Helsinki University of Technology March 14, 2008 What is it? Query by keyword or tag is common Semantic Video

More information

One category of visual tracking. Computer Science SURJ. Michael Fischer

One category of visual tracking. Computer Science SURJ. Michael Fischer Computer Science Visual tracking is used in a wide range of applications such as robotics, industrial auto-control systems, traffic monitoring, and manufacturing. This paper describes a new algorithm for

More information

AUTOMATIC VISUAL CONCEPT DETECTION IN VIDEOS

AUTOMATIC VISUAL CONCEPT DETECTION IN VIDEOS AUTOMATIC VISUAL CONCEPT DETECTION IN VIDEOS Nilam B. Lonkar 1, Dinesh B. Hanchate 2 Student of Computer Engineering, Pune University VPKBIET, Baramati, India Computer Engineering, Pune University VPKBIET,

More information

MATRIX BASED INDEXING TECHNIQUE FOR VIDEO DATA

MATRIX BASED INDEXING TECHNIQUE FOR VIDEO DATA Journal of Computer Science, 9 (5): 534-542, 2013 ISSN 1549-3636 2013 doi:10.3844/jcssp.2013.534.542 Published Online 9 (5) 2013 (http://www.thescipub.com/jcs.toc) MATRIX BASED INDEXING TECHNIQUE FOR VIDEO

More information

Search Engines. Information Retrieval in Practice

Search Engines. Information Retrieval in Practice Search Engines Information Retrieval in Practice All slides Addison Wesley, 2008 Beyond Bag of Words Bag of Words a document is considered to be an unordered collection of words with no relationships Extending

More information

Communications of ACM, pp. xx - yy, December Video Abstracting. Rainer Lienhart, Silvia Pfeiffer and Wolfgang Effelsberg

Communications of ACM, pp. xx - yy, December Video Abstracting. Rainer Lienhart, Silvia Pfeiffer and Wolfgang Effelsberg Video Abstracting Rainer Lienhart, Silvia Pfeiffer and Wolfgang Effelsberg University of Mannheim, 68131 Mannheim, Germany {lienhart, pfeiffer, effelsberg}@pi4.informatik.uni-mannheim.de 1. What is a Video

More information

Optimization of HMM by the Tabu Search Algorithm

Optimization of HMM by the Tabu Search Algorithm JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 20, 949-957 (2004) Optimization of HMM by the Tabu Search Algorithm TSONG-YI CHEN, XIAO-DAN MEI *, JENG-SHYANG PAN AND SHENG-HE SUN * Department of Electronic

More information

Features Points. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE)

Features Points. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE) Features Points Andrea Torsello DAIS Università Ca Foscari via Torino 155, 30172 Mestre (VE) Finding Corners Edge detectors perform poorly at corners. Corners provide repeatable points for matching, so

More information

Multimedia structuring using trees

Multimedia structuring using trees Multimedia structuring using trees George Tzanetakis & Luc Julia Computer Human Interaction Center (CHIC!) SRI International 333 Ravenswood Avenue Menlo Park, CA 94025 gtzan@cs.princeton.edu julia@speech.sri.com

More information

arxiv: v1 [cs.cv] 2 May 2016

arxiv: v1 [cs.cv] 2 May 2016 16-811 Math Fundamentals for Robotics Comparison of Optimization Methods in Optical Flow Estimation Final Report, Fall 2015 arxiv:1605.00572v1 [cs.cv] 2 May 2016 Contents Noranart Vesdapunt Master of Computer

More information

Detector. Flash. Detector

Detector. Flash. Detector CLIPS at TRECvid: Shot Boundary Detection and Feature Detection Georges M. Quénot, Daniel Moraru, and Laurent Besacier CLIPS-IMAG, BP53, 38041 Grenoble Cedex 9, France Georges.Quenot@imag.fr Abstract This

More information

3 Publishing Technique

3 Publishing Technique Publishing Tool 32 3 Publishing Technique As discussed in Chapter 2, annotations can be extracted from audio, text, and visual features. The extraction of text features from the audio layer is the approach

More information

Baseball Game Highlight & Event Detection

Baseball Game Highlight & Event Detection Baseball Game Highlight & Event Detection Student: Harry Chao Course Adviser: Winston Hu 1 Outline 1. Goal 2. Previous methods 3. My flowchart 4. My methods 5. Experimental result 6. Conclusion & Future

More information

Hybrid Video Compression Using Selective Keyframe Identification and Patch-Based Super-Resolution

Hybrid Video Compression Using Selective Keyframe Identification and Patch-Based Super-Resolution 2011 IEEE International Symposium on Multimedia Hybrid Video Compression Using Selective Keyframe Identification and Patch-Based Super-Resolution Jeffrey Glaister, Calvin Chan, Michael Frankovich, Adrian

More information

On Modeling Variations for Face Authentication

On Modeling Variations for Face Authentication On Modeling Variations for Face Authentication Xiaoming Liu Tsuhan Chen B.V.K. Vijaya Kumar Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213 xiaoming@andrew.cmu.edu

More information

Towards the completion of assignment 1

Towards the completion of assignment 1 Towards the completion of assignment 1 What to do for calibration What to do for point matching What to do for tracking What to do for GUI COMPSCI 773 Feature Point Detection Why study feature point detection?

More information

Aircraft Tracking Based on KLT Feature Tracker and Image Modeling

Aircraft Tracking Based on KLT Feature Tracker and Image Modeling Aircraft Tracking Based on KLT Feature Tracker and Image Modeling Khawar Ali, Shoab A. Khan, and Usman Akram Computer Engineering Department, College of Electrical & Mechanical Engineering, National University

More information

Automated Tagging to Enable Fine-Grained Browsing of Lecture Videos

Automated Tagging to Enable Fine-Grained Browsing of Lecture Videos Automated Tagging to Enable Fine-Grained Browsing of Lecture Videos K.Vijaya Kumar (09305081) under the guidance of Prof. Sridhar Iyer June 28, 2011 1 / 66 Outline Outline 1 Introduction 2 Motivation 3

More information

An Introduction to Pattern Recognition

An Introduction to Pattern Recognition An Introduction to Pattern Recognition Speaker : Wei lun Chao Advisor : Prof. Jian-jiun Ding DISP Lab Graduate Institute of Communication Engineering 1 Abstract Not a new research field Wide range included

More information

Tips on DVD Authoring and DVD Duplication M A X E L L P R O F E S S I O N A L M E D I A

Tips on DVD Authoring and DVD Duplication M A X E L L P R O F E S S I O N A L M E D I A Tips on DVD Authoring and DVD Duplication DVD Authoring - Introduction The postproduction business has certainly come a long way in the past decade or so. This includes the duplication/authoring aspect

More information

Simple and Robust Tracking of Hands and Objects for Video-based Multimedia Production

Simple and Robust Tracking of Hands and Objects for Video-based Multimedia Production Simple and Robust Tracking of Hands and Objects for Video-based Multimedia Production Masatsugu ITOH Motoyuki OZEKI Yuichi NAKAMURA Yuichi OHTA Institute of Engineering Mechanics and Systems University

More information

Kanade Lucas Tomasi Tracking (KLT tracker)

Kanade Lucas Tomasi Tracking (KLT tracker) Kanade Lucas Tomasi Tracking (KLT tracker) Tomáš Svoboda, svoboda@cmp.felk.cvut.cz Czech Technical University in Prague, Center for Machine Perception http://cmp.felk.cvut.cz Last update: November 26,

More information

Video Skimming and Characterization through the Combination of Image and Language Understanding

Video Skimming and Characterization through the Combination of Image and Language Understanding Video Skimming and Characterization through the Combination of mage and Language Understanding Michael A. Smith Takeo Kanade Department of Electrical and Computer Engineering Carnegie Mellon University

More information

Facilitating Video Access by Visualizing Automatic Analysis

Facilitating Video Access by Visualizing Automatic Analysis Facilitating Video Access by Visualizing Automatic Analysis Andreas Girgensohn, John Boreczky, Lynn Wilcox, and Jonathan Foote FX Palo Alto Laboratory 3400 Hillview Avenue Palo Alto, CA 94304 {andreasg,

More information

Lecture Video Indexing and Retrieval Using Topic Keywords

Lecture Video Indexing and Retrieval Using Topic Keywords Lecture Video Indexing and Retrieval Using Topic Keywords B. J. Sandesh, Saurabha Jirgi, S. Vidya, Prakash Eljer, Gowri Srinivasa International Science Index, Computer and Information Engineering waset.org/publication/10007915

More information

Robust Model-Free Tracking of Non-Rigid Shape. Abstract

Robust Model-Free Tracking of Non-Rigid Shape. Abstract Robust Model-Free Tracking of Non-Rigid Shape Lorenzo Torresani Stanford University ltorresa@cs.stanford.edu Christoph Bregler New York University chris.bregler@nyu.edu New York University CS TR2003-840

More information

Information Extraction from News Video using Global Rule Induction Technique

Information Extraction from News Video using Global Rule Induction Technique Information Extraction from News Video using Global Rule Induction Technique Lekha Chaisorn and 2 Tat-Seng Chua Media Semantics Department, Media Division, Institute for Infocomm Research (I 2 R), Singapore

More information

Consistent Line Clusters for Building Recognition in CBIR

Consistent Line Clusters for Building Recognition in CBIR Consistent Line Clusters for Building Recognition in CBIR Yi Li and Linda G. Shapiro Department of Computer Science and Engineering University of Washington Seattle, WA 98195-250 shapiro,yi @cs.washington.edu

More information

Video Key-Frame Extraction using Entropy value as Global and Local Feature

Video Key-Frame Extraction using Entropy value as Global and Local Feature Video Key-Frame Extraction using Entropy value as Global and Local Feature Siddu. P Algur #1, Vivek. R *2 # Department of Information Science Engineering, B.V. Bhoomraddi College of Engineering and Technology

More information

Workshop W14 - Audio Gets Smart: Semantic Audio Analysis & Metadata Standards

Workshop W14 - Audio Gets Smart: Semantic Audio Analysis & Metadata Standards Workshop W14 - Audio Gets Smart: Semantic Audio Analysis & Metadata Standards Jürgen Herre for Integrated Circuits (FhG-IIS) Erlangen, Germany Jürgen Herre, hrr@iis.fhg.de Page 1 Overview Extracting meaning

More information

An Approach for Reduction of Rain Streaks from a Single Image

An Approach for Reduction of Rain Streaks from a Single Image An Approach for Reduction of Rain Streaks from a Single Image Vijayakumar Majjagi 1, Netravati U M 2 1 4 th Semester, M. Tech, Digital Electronics, Department of Electronics and Communication G M Institute

More information

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A.

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A. Knowledge Retrieval Franz J. Kurfess Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A. 1 Acknowledgements This lecture series has been sponsored by the European

More information

Visual Tracking (1) Feature Point Tracking and Block Matching

Visual Tracking (1) Feature Point Tracking and Block Matching Intelligent Control Systems Visual Tracking (1) Feature Point Tracking and Block Matching Shingo Kagami Graduate School of Information Sciences, Tohoku University swk(at)ic.is.tohoku.ac.jp http://www.ic.is.tohoku.ac.jp/ja/swk/

More information

Iterative Image Based Video Summarization by Node Segmentation

Iterative Image Based Video Summarization by Node Segmentation Iterative Image Based Video Summarization by Node Segmentation Nalini Vasudevan Arjun Jain Himanshu Agrawal Abstract In this paper, we propose a simple video summarization system based on removal of similar

More information

Digital Video Projects (Creating)

Digital Video Projects (Creating) Tim Stack (801) 585-3054 tim@uen.org www.uen.org Digital Video Projects (Creating) OVERVIEW: Explore educational uses for digital video and gain skills necessary to teach students to film, capture, edit

More information

arxiv: v1 [cs.cv] 28 Sep 2018

arxiv: v1 [cs.cv] 28 Sep 2018 Camera Pose Estimation from Sequence of Calibrated Images arxiv:1809.11066v1 [cs.cv] 28 Sep 2018 Jacek Komorowski 1 and Przemyslaw Rokita 2 1 Maria Curie-Sklodowska University, Institute of Computer Science,

More information

BSB663 Image Processing Pinar Duygulu. Slides are adapted from Selim Aksoy

BSB663 Image Processing Pinar Duygulu. Slides are adapted from Selim Aksoy BSB663 Image Processing Pinar Duygulu Slides are adapted from Selim Aksoy Image matching Image matching is a fundamental aspect of many problems in computer vision. Object or scene recognition Solving

More information

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM 1 PHYO THET KHIN, 2 LAI LAI WIN KYI 1,2 Department of Information Technology, Mandalay Technological University The Republic of the Union of Myanmar

More information

DocQspeech for Medical Editors M*Modal Fluency for Transcription

DocQspeech for Medical Editors M*Modal Fluency for Transcription SPEECH RECOGNITION SETTINGS 1. To access the speech recognition settings and select personal preference options, do one of the following: Press Ctrl + Shift + T to open the Speech Recognition tab. Click

More information

Experiments in computer-assisted annotation of audio

Experiments in computer-assisted annotation of audio Experiments in computer-assisted annotation of audio George Tzanetakis Computer Science Dept. Princeton University en St. Princeton, NJ 844 USA +1 69 8 491 gtzan@cs.princeton.edu Perry R. Cook Computer

More information

Enterprise Multimedia Integration and Search

Enterprise Multimedia Integration and Search Enterprise Multimedia Integration and Search José-Manuel López-Cobo 1 and Katharina Siorpaes 1,2 1 playence, Austria, 2 STI Innsbruck, University of Innsbruck, Austria {ozelin.lopez, katharina.siorpaes}@playence.com

More information

MULTIVIEW REPRESENTATION OF 3D OBJECTS OF A SCENE USING VIDEO SEQUENCES

MULTIVIEW REPRESENTATION OF 3D OBJECTS OF A SCENE USING VIDEO SEQUENCES MULTIVIEW REPRESENTATION OF 3D OBJECTS OF A SCENE USING VIDEO SEQUENCES Mehran Yazdi and André Zaccarin CVSL, Dept. of Electrical and Computer Engineering, Laval University Ste-Foy, Québec GK 7P4, Canada

More information

Blood vessel tracking in retinal images

Blood vessel tracking in retinal images Y. Jiang, A. Bainbridge-Smith, A. B. Morris, Blood Vessel Tracking in Retinal Images, Proceedings of Image and Vision Computing New Zealand 2007, pp. 126 131, Hamilton, New Zealand, December 2007. Blood

More information

Comment on Numerical shape from shading and occluding boundaries

Comment on Numerical shape from shading and occluding boundaries Artificial Intelligence 59 (1993) 89-94 Elsevier 89 ARTINT 1001 Comment on Numerical shape from shading and occluding boundaries K. Ikeuchi School of Compurer Science. Carnegie Mellon dniversity. Pirrsburgh.

More information

Recall precision graph

Recall precision graph VIDEO SHOT BOUNDARY DETECTION USING SINGULAR VALUE DECOMPOSITION Λ Z.»CERNEKOVÁ, C. KOTROPOULOS AND I. PITAS Aristotle University of Thessaloniki Box 451, Thessaloniki 541 24, GREECE E-mail: (zuzana, costas,

More information

Story Unit Segmentation with Friendly Acoustic Perception *

Story Unit Segmentation with Friendly Acoustic Perception * Story Unit Segmentation with Friendly Acoustic Perception * Longchuan Yan 1,3, Jun Du 2, Qingming Huang 3, and Shuqiang Jiang 1 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing,

More information

Content Description Servers for Networked Video Surveillance

Content Description Servers for Networked Video Surveillance Content Description Servers for Networked Video Surveillance Jeffrey E. Boyd Maxwell Sayles Luke Olsen Paul Tarjan Department of Computer Science, University of Calgary boyd@cpsc.ucalgary.ca Abstract Advances

More information

Image Database Modeling

Image Database Modeling Image Database Modeling Aibing Rao, Rohini Srihari Center of Excellence for Document Analysis and Recognition, State University of New York At Buffalo, Amherst, NY14228 arao, rohini@cedar.buffalo.edu Abstract

More information