FLORIDA INTERNATIONAL UNIVERSITY. Miami, Florida DIMUSE: AN INTEGRATED FRAMEWORK FOR DISTRIBUTED MULTIMEDIA

Size: px
Start display at page:

Download "FLORIDA INTERNATIONAL UNIVERSITY. Miami, Florida DIMUSE: AN INTEGRATED FRAMEWORK FOR DISTRIBUTED MULTIMEDIA"

Transcription

1 FLORIDA INTERNATIONAL UNIVERSITY Miami, Florida DIMUSE: AN INTEGRATED FRAMEWORK FOR DISTRIBUTED MULTIMEDIA SYSTEM WITH DATABASE MANAGEMENT AND SECURITY SUPPORT A disseraion submied in parial fulfillmen of he requiremens for he degree of DOCTOR OF PHILOSOPHY in COMPUTER SCIENCE by Na Zhao 2007

2 To: Inerim Dean Amir Mirmiran College of Engineering and Compuing This disseraion, wrien by Na Zhao, and eniled DIMUSE: An Inegraed Framework for Disribued Mulimedia Sysem wih Daabase Managemen and Securiy Suppor, having been approved in respec o syle and inellecual conen, is referred o you for judgmen. We have read his disseraion and recommend ha i be approved. Xudong He Nagarajan Prabakar Keqi Zhang Mei-Ling Shyu Shu-Ching Chen, Major Professor Dae of Defense: July 9, 2007 The disseraion of Na Zhao is approved. Inerim Dean Amir Mirmiran College of Engineering and Compuing Dean George Walker Universiy Graduae School Florida Inernaional Universiy, 2007 ii

3 ACKNOWLEDGMENTS I would like o exend my sincere graiude and appreciaion o my Ph.D. advisor, Professor Shu-Ching Chen, for his guidance, suppor, suggesions and encouragemen while his disseraion was being conduced. I am also indebed o Professors Xudong He, Nagarajan Prabakar of he School of Compuing and Informaion Sciences, Professor Keqi Zhang of Deparmen of Environmenal Sudies and Inernaional Hurricane Research Cener, and Professor Mei-Ling Shyu of he Deparmen of Elecrical and Compuer Engineering, Universiy of Miami, for acceping he appoinmen o he disseraion commiee, as well as for heir suggesions and suppor. The financial assisance I received from he School of Compuing and Informaion Sciences and he Disseraion Year Fellowship from Universiy Graduae School is graefully acknowledged. I would also like o hank all my friends and colleagues whom I have me and known while aending Florida Inernaional Universiy. In paricular, I would like o hank Chengcui Zhang, Min Chen, Kasuri Chaerjee, Khalid Saleem, Fauso Fleies, Michael Armella, Hsin-Yu Ha and oher members of he Disribued Mulimedia Informaion Sysem Laboraory for heir generous help. My special hanks go o Mr. Frank Oreovicz in Purdue Universiy for his help wih English wriing and presenaion checking. Finally, my umos graiude goes o my husband Cheng Xu, my faher Guoqing Zhao, my moher Jiyun Cang, and broher Kang Zhao, for heir love, suppor and encouragemen, which made his work possible. iii

4 ABSTRACT OF THE DISSERTATION DIMUSE: AN INTEGRATED FRAMEWORK FOR DISTRIBUTED MULTIMEDIA SYSTEM WITH DATABASE MANAGEMENT AND SECURITY SUPPORT by Na Zhao Florida Inernaional Universiy, 2007 Miami, Florida Professor Shu-Ching Chen, Major Professor Wih he recen explosion in he complexiy and amoun of digial mulimedia daa, here has been a huge impac on he operaions of various organizaions in disinc areas, such as governmen services, educaion, medical care, business, enerainmen, ec. To saisfy he growing demand of mulimedia daa managemen sysems, an inegraed framework called DIMUSE is proposed and deployed for disribued mulimedia applicaions o offer a full scope of mulimedia relaed ools and provide appealing experiences for he users. This research mainly focuses on video daabase modeling and rerieval by addressing a se of core challenges. Firs, a comprehensive mulimedia daabase modeling mechanism called Hierarchical Markov Model Mediaor (HMMM) is proposed o model high dimensional media daa including video objecs, low-level visual/audio feaures, as well as hisorical access paerns and frequencies. The associaed rerieval and ranking algorihms are designed o suppor no only he general queries, bu also he complicaed emporal even paern queries. Second, sysem raining and learning mehodologies are incorporaed such ha user ineress are mined efficienly o improve he rerieval performance. Third, video clusering echniques are proposed o coninuously increase he searching speed and accuracy by archiecing a more efficien mulimedia daabase srucure. A disribued video managemen and rerieval sysem is designed and implemened o demonsrae he overall performance. The proposed approach is furher iv

5 cusomized for a mobile-based video rerieval sysem o solve he percepion subjeciviy issue by considering individual user s profile. Moreover, o deal wih securiy and privacy issues and concerns in disribued mulimedia applicaions, DIMUSE also incorporaes a pracical framework called SMARXO, which suppors mulilevel mulimedia securiy conrol. SMARXO efficienly combines role-based access conrol (RBAC), XML and objec-relaional daabase managemen sysem (ORDBMS) o achieve he arge of proficien securiy conrol. A disribued mulimedia managemen sysem named DMMManager (Disribued MuliMedia Manager) is developed wih he proposed framework DIMUSE o suppor mulimedia capuring, analysis, rerieval, auhoring and presenaion in one single framework. v

6 TABLE OF CONTENTS CHAPTER PAGE CHAPTER I. INTRODUCTION AND MOTIVATION.... Significance and Impac of Mulimedia Sysem Research Proposed Soluions Conribuions Scope and Limiaions of he Proposed Prooype....5 Ouline of he Disseraion...2 CHAPTER II. LITERATURE REVIEW Mulimedia Daa Modeling, Indexing and Daa Srucures Mulimedia Rerieval Mehodologies Keyword-based Rerieval Conen-based Rerieval Challenges in Mulimedia Rerieval Mulimedia Securiy Soluions Prooype Mulimedia Managemen Sysems Conen-based Mulimedia Rerieval Sysems Mulimedia Presenaion Auhoring and Rendering Sysems...32 CHAPTER III. OVERVIEW OF THE FRAMEWORK Mulimedia Daabase Modeling and Rerieval Module Image Daabase Modeling and Rerieval using MMM Video Daabase Modeling and Rerieval using HMMM Online Learning and Offline Training via HMMM Video Daabase Clusering Mulimedia Presenaion Module Presenaion Design wih MATN Model Presenaion Rendering wih JMF and SMIL Securiy Managemen Componen Securiy Policy and Role Managing Securiy Checking Mulimedia Daa Managing and Processing Mulimedia Applicaion and Sysem Inegraion DMMManager: Disribued Mulimedia Manager...45 CHAPTER IV. MULTIMEDIA DATABASE MODELING AND RETRIEVAL Inroducion Overall Framework Hierarchical Markov Model Mediaor (HMMM) Two-level HMMM Model Video sho level MMM Video-level MMM Connecions beween firs level MMMs and second level MMM Iniial Process for Temporal Even Paern Rerieval Video Daabase Clusering and Consrucion of 3rd Level MMM Overall Workflow...67 vi

7 4.5.2 Concepual Video Clusering Consrucing he 3 rd level MMM model Ineracive Rerieval hrough Clusered Video Daabase Experimenal Resuls for Video Clusering Conclusions...78 CHAPTER V. MULTIMEDIA SYSTEM TRAINING AND LEARNING Inroducion Relaed Work Auomae Offline Training using Associaion Rule Mining Overall Process Auomaed Training using ARM Experimenal Resuls for Auomaed Learning Mechanism Online Relevance Feedback Anicipan Even Paern Insance Affiniy Insances for A Feaure Insances for B Updaed Similariy Measuremens and Query Processing Experimenal Resuls for Sysem Learning Techniques Applicaion: A Mobile-based Video Rerieval Sysem Inroducion Relaed Work Sysem Archiecure MoVR: Mobile-based Video Rerieval HMMM-based User Profile Fuzzy Associaed Rerieval Implemenaion and Experimens Summary...24 CHAPTER VI. SECURITY SOLUTIONS FOR MULTIMEDIA SYSTEMS Inroducion SMARXO Archiecure Mulimedia Access Conrol Mulimedia Indexing Phase Securiy Modeling Phase DBMS Managemen Phase Securiy Verificaion Conclusions...37 CHAPTER VII. MULTIMEDIA SYSTEM INTEGRATION Sysem Overview Mulimedia Daa Collecing Mulimedia Analysis and Indexing Image Analysis and Indexing Video Analysis and Indexing Mulimedia Rerieval Conen-based Image Rerieval Video Daa Browsing and Rerieval Mulimedia Presenaion Module...47 vii

8 7.5. Mulimedia Presenaion Auhoring Mulimedia Presenaion Rendering Presenaion Rendering via JMF Player Presenaion Rendering via SMIL Language Conclusions...54 CHAPTER VIII. CONCLUSIONS AND FUTURE WORK Conclusions Fuure Work...57 LIST OF REFERENCES...60 VITA...69 viii

9 TABLE LIST OF TABLES PAGE Table IV-. HMMM is an 8-Tuple: Λ = ( d, S, F, A, B, Π, O, L)...54 Table IV-2. 3-level HMMM model...56 Table IV-3. Feaure lis for he video shos...59 Table V-. Experimenal resuls for ARM-based feedback evaluaions...9 Table V-2. Average accuracy for he differen recommendaions...22 Table VI-. Comparison of mulimedia securiy echniques...37 Table VII-. Example mappings o he graphical query language...45 Table VII-2. MATN srucures for 3 emporal relaionships...49 Table VII-3. MATN design buons & funcionaliies...50 ix

10 LIST OF FIGURES FIGURE PAGE Figure II-. CIRES inerface wih sample images... 2 Figure II-2. WebSeek inerfaces (a) sample caalog (b) image rerieval resuls wih relevance feedback Figure II-3. Query inerface of VDBMS Figure II-4. User inerface of he Goalgle soccer video search engine Figure II-5. User inerface for IBM VideoAnnEx Tool Figure II-6. Query inerface of IBM MARVEL Figure II-7. User inerface for CuVid Figure II-8. User inerface for Youube video search Figure II-9. User inerface for Google video search Figure II-0. User inerface for Yahoo! video search Figure II-. User inerface for AOL TRUVEO video search... 3 Figure II-8. LAMP inerface wih he synchronizaion graph of a news-on-demand presenaion Figure II-9. Views layou and user inerface for T-Cube Figure II-0. Srucured media auhoring environmen of Madeus Figure III-. Overall framework and componens of DIMUSE Figure IV-. Overall framework of video daabase modeling and emporal paern rerieval uilizing HMMM, online learning, offline raining and clusering echniques... 5 Figure IV-2. Three-level consrucion of Hierarchical Markov Model Mediaor Figure IV-3. An example resul of a emporal paern query Figure IV-4. HMMM-based soccer video rerieval inerface Figure IV-5. Overall workflow for he proposed approach Figure IV-6. The proposed concepual video daabase clusering procedure Figure IV-7. Laice srucure of he clusered video daabase x

11 Figure IV-8. Resul paerns and he raverse pah Figure IV-9. Comparison of he average execuion ime Figure IV-0. Soccer video rerieval sysem inerfaces (a) query over non-clusered soccer video daabase (b) query over clusered soccer video daabase Figure V-. Two feedback scenarios for he soccer video goal even rerieval Figure V-2. Overall process for he auomaed raining Figure V-3. Sysem inerfaces for he Mobile-based Video Rerieval Sysem Figure V-4. Online learning procedure of emporal based query paern rerieval Figure V-5. User-cenered soccer video rerieval and feedback inerface Figure V-6. Online raining experimenal resuls for Query Figure V-7. Soccer video rerieval and feedback resuls for Query 2. (a) firs round even paern rerieval; (b) hird round even paern rerieval Figure V-8. Mobile-based video rerieval sysem archiecure Figure V-9. Overall framework of mobile-based video rerieval sysem Figure V-0. Generaion of individual user s affiniy profile... Figure V-. Fuzzy weigh adjusmen ool (a) generalized recommendaion; (b) personalized recommendaion; (c) fuzzy associaed recommendaion... 4 Figure V-2. Mobile-based soccer video rerieval inerfaces (a) iniial choices (b) rerieval by even (c) rerieval by paern... 9 Figure V-3. Mobile-based soccer video rerieval resuls (a) video browsing resuls (b) video rerieval resuls (c) video player... 2 Figure V-4. Experimenal comparison of differen recommendaions Figure VI-. Example of image objec-level securiy (a) original image (b) segmenaion map (c) hiding a porion of he image Figure VI-2. SMARXO archiecure Figure VI-3. Exended RBAC definiions in SMARXO Figure VI-4. XML examples of mulimedia hierarchy (a) example for image objecs (b) example for video hierarchy xi

12 Figure VI-5. XML examples of he fundamenal roles (a) example of subjec roles (b) example of objec roles... 3 Figure VI-6. Example requiremens for video scene/sho-level access conrol Figure VI-7. XML examples of he opional roles (a) example of emporal roles (b) example of IP address roles Figure VI-8. Securiy policies (a) formalized securiy policy (b) XML example on policy roles Figure VI-9. Algorihm for securiy verificaion in SMARXO Figure VII-. The mulimedia managemen flow of DMMManager Figure VII-2. Mulimedia presenaion auhoring ool Figure VII-3. The key-frame based video rerieval inerface wih a sho displayed Figure VII-4. Soccer rerieval inerface wih example emporal query and resuls Figure VII-5. The user inerface for MATN model design... 5 Figure VII-6. The rendered mulimedia presenaion played by he JMF player Figure VII-7. The rendered mulimedia presenaion played by he web browser xii

13 LIST OF DEFINITIONS DEFINITION PAGE Definiion IV-: Markov Model Mediaor (MMM) [Shyu03] Definiion IV-2: Hierarchical Markov Model Mediaor (HMMM) Definiion IV-3: SV(v i, v j ), he similariy measure beween wo videos, is defined by k evaluaing he probabiliies of finding he same even paern Q from v i and v j in he same query for all he query paerns in QS Definiion IV-4: Assume CC m and CC n are wo video clusers in he video daabase D. Their relaionship is denoed as an enry in he affiniy marix A 3, which can be compued by Equaions (IV-22) and (IV-23). Here, SC is he funcion ha calculaes he similariy score beween wo video clusers Definiion V-: An HMMM-based User Profile is defined as a 4-uple: Φ = { τ, Aˆ, Bˆ, O ˆ },... 0 Definiion VI-: An Objec Hierarchy OH = ( O, OG, OG ), where O is a se of objecs and OG = O U G wih G is a se of objec groups. is a parial order on OG called he dominance relaion, and O OG is he se of minimal elemens of OG wih respec o he parial order. Given wo elemens x, y OG, x y iff x is a member of y Definiion VI-2: Given he oces named I, I 2, I 3, I 4, he IP address segmen expression IP can be defined as IP = n x I y I j = >, where n = 4, 0 x 2 8, j j j d 0 y 2 8, x, y N, x y 2 8 for j =,, 4, I d I, I, I, } j j j j + j j { 2 3 I4 Definiion VI-3: Objec Eniy Se: OES ( o) = { o} U { s : s o} xiii

14 CHAPTER I. INTRODUCTION AND MOTIVATION Wih he rapid evoluion of echnologies and applicaions for consumer digial media, here has been an explosion in he complexiy and amoun of digial mulimedia daa being generaed and persisenly sored. This revoluion is changing he way people live, work, and communicae wih each oher, and is impacing he operaions of various organizaions in disinc areas, such as governmen services, educaion, medical care, business, enerainmen, ec. To solve he relaed problems, a large number of papers have been published recenly on mulimedia echniques and mulimedia sysems. However, he issues relaed o analysis, modeling, specificaion, and design of disribued mulimedia sysems and applicaions are sill challenging boh researchers and developers. In comparison o radiional ex and daa, mulimedia objecs are ypically very large and may include images, video, audio and some oher visualizaion componens. Due o he specific characerisics of he mulimedia daa, many subsequen research issues arise wihin he fields of mulimedia analysis, sorage, rerieval, ransmission, presenaion, and securiy proecion. Generally, he following aspecs should be considered when a mulimedia sysem is designed. Firs, a disribued archiecure is required for he consrucion of a large-scale mulimedia sysem. Mulimedia daa is sorage consuming and may be disribued hrough he nework and allocaed a disinc compuers. Accordingly, he mulimedia applicaions should be capable of managing he disribued mulimedia daa in he nework environmen. The sysems should allow mulimedia daa o be ransmied hrough he neworks or oher connecions easily. Second, conen based rerieval is one of he major issues which should be considered in he mulimedia applicaions. As mulimedia daa is rich in semanic informaion, inermediae processing and semanic inerpreaion become much more helpful, especially for handling images, audio and video daa. Manual annoaion of mulimedia daa for conen based rerieval is cumbersome, error prone, and prohibiively expensive. To make i feasible, mulimedia analysis

15 echniques are developed o auomaically exrac he visual/audio feaures and obain he semanic undersanding for he mulimedia conen. Thereafer, an advanced mulimedia modeling framework should be consruced o combine hese feaures, semanic annoaions, along wih he user percepions for conen based rerieval purposes. Third, user feedback should be deployed o refine he rerieval performance by saisfying diverse user ineress. Undoubedly, he disinc background, siuaion and ineres of differen users ineviably call for individual views ino a semanic undersanding of he mulimedia daa and herefore produce user cenered mea-daa. Accordingly, mulimedia rerieval, summarizaion, ranking composiion, delivery, and presenaion need o be designed o saisfy users requiremens and preferences. Fourh, i is a challenging ye rewarding ask o provide securiy suppor for a large scale mulimedia managemen sysem. For some of he mulimedia conen generaed in medical, commercial and miliary fields, i may only be parially exposed o he general public or should no be accessible a all. Hence, i is criical o develop a user-adapive framework for he daa access conrol o provide enhanced securiy suppor in mulimedia daabase design and mulimedia sysem developmen. In recen years, emerging ubiquious mulimedia applicaions have been developed o fulfill various kinds of demands for mulimedia analysis, rerieval, and usage. However, mos of hese sysems can only provide one or few funcionaliies for mulimedia daa managemen. For example, some sysems are concerned wih he producion of mulimedia maerial; some sysems handle mainly mulimedia analysis, annoaion and rerieval issues, while some ohers only provide he funcionaliies for mulimedia presenaion design. Wha is lacking is an inegraed framework for he consrucion of a comprehensive disribued mulimedia sysem, which can suppor a full scope of funcionaliies. 2

16 In his research, an inegraed framework called DIMUSE (DIsribued Mulimedia SysEm) is proposed for disribued mulimedia applicaions including mulimedia daa capuring, analysis, daabase modeling, conen-based rerieval, presenaion auhoring and rendering, ec. In addiion o a complee se of mulimedia searching and ediing ools, anoher aracive aspec of DIMUSE is ha user ineracions and percepions are fully considered in he sysem learning and raining process for providing innovaive mulimedia experience o he users. The remainder of his chaper is organized as follows. The nex secion discusses he deailed significance and impac of mulimedia sysem research o develop an inegraed framework for he daabase modeling, informaion rerieval, and securiy suppor in he disribued mulimedia sysem. In Secion.2, he proposed soluions are inroduced for consrucing such an inegraed framework. Secion.3 presens he main conribuions of his research. The scope and limiaions of his proposed framework are furher explored in Secion.4. Finally, Secion.5 summarizes he ouline of his disseraion.. Significance and Impac of Mulimedia Sysem Research The populariy of digial media is growing fas in all aspecs of he marke including radiional broadcasing, new media enerprise, and World Wide Web. I resuls from he convergence of many facors, including he general affordabiliy of mulimedia capuring, managemen and disribuion devices, and he pervasive increase in nework bandwidh. The digial media is expeced o play a criical role in enhancing he value of radiional compuer applicaions. Correspondingly, here is a growing demand for efficien echnologies for rerieving semanic informaion and exracing knowledge from mulimedia conen. The mehods for describing and rerieving semanic informaion from mulimedia conen can ineviably enable or enhance applicaions and services boh for commercial users and end users. Such kinds of applicaions include, bu are no limied o he following areas: auomaic and semi-auomaic annoaion, mulimedia daa indexing, conen based image/video rerieval, 3

17 collaboraive filering, digial media sharing, personalized adapaion, and mulimedia presenaion delivery. One ulimae goal of mulimedia sysem research is o offer appealing mulimedia experiences o users considering heir own preferences and informaion needs. In mulimedia sysem research, users ineress and percepions no only have o be aken ino accoun, bu in fac mus become he essenial concern a all pars of mulimedia sysem design. This process involves far more han merely physical sorage or echnical analysis of mulimedia daa. I should also consider he user abiliies, device capabiliies, nework characerisics, ec. Moreover, he ineracive mulimedia user inerfaces mus be developed o address he conen based rerieval faciliies, auhoring and display environmen, as well as he managemen funcions for user access conrol. The major research issues in his disseraion can be oulined as follows: () Hierarchical mulimedia daa modeling issue. In he developmen cycle of a mulimedia managemen sysem, one of he mos crucial issues is how o proficienly model, accumulae, and manage he mulimedia daa, along wih heir meadaa, feaures, and oher relaed informaion. In addiion o he source daa, i should also be able o proficienly model he mulimedia objecs in a hierarchical way considering heir emporal and/or spaial relaionships, such as video shos, video key frames, and segmened image objecs. (2) Semanic concep mining, sorage, and rerieval issues. An efficien conen-based rerieval operaion is necessary o offer querying funcionaliy o a mulimedia daabase managemen sysem (MMDBMS). In conen-based mulimedia rerieval, he semanic gap denoes he gap beween he rich meaning and inerpreaion ha he users anicipae he daabase sysems o associae heir queries for searching and browsing mulimedia daa. This issue needs o be addressed o make mulimedia informaion pervasively 4

18 accessible and reusable upon he original conceps and meanings represened by he digial media daa. The criical difficuly here is how o efficienly derive and faciliae semanic annoaions which require knowledge and echniques from assored disciplines and domains, even hough many of hem are ouside of he radiional compuer science fields. (3) User-cenered learning issue. As differen users may evenually have diverse ineress, users percepions need o be aken ino accoun when modeling he underlying daabase. Two kinds of mehods can be considered o improve he rerieval performance. One is o rigger he online learning algorihm which can handle ineracions of a single user, which may pose resrained performance due o he limied size of posiive feedback. An alernaive soluion is o learn general user percepions via feedback from differen users. The raining process is iniiaed only when he number of feedbacks reaches a cerain hreshold. This can improve he overall raining performance bu i becomes a manual process o decide he hreshold and iniiae he raining process. This issue should be furher invesigaed o discover he bes scheme on sysem learning process. (4) Securiy managemen issue. There is a growing concern abou securiy and privacy of he disribued mulimedia conens over he Inerne or local neworks. I may involve muliple levels of access conrol requiremens when accessing mulimedia conens in a disribued environmen. Moreover, composing mulimedia documens brings ogeher mulimedia objecs ha exis in various formas. The securiy requiremen varies for differen ypes of mulimedia objecs. Hence, a securiy model is desired for a disribued mulimedia managemen sysem ha allows creaion, sorage, indexing and presenaion for he muli-level secured mulimedia conens. (5) Sysem inegraion issue. In addiion o he querying capabiliy, he developmen of an absrac semanic model is also essenial for an inegraed robus MMDBMS. The model 5

19 should be powerful enough o suppor mulimedia presenaion synchronizaion and uilize opimal programming daa srucures for he implemenaion. A proficien semanic model is anicipaed o model no only mulimedia presenaions, bu also he emporal and/or spaial relaions of differen media sreams. In addiion, his semanic model should be able o help he inegraion of differen componens wih various funcionaliies for he purpose of developing an advanced mulimedia sysem. For insance, his model can be deployed and inegraed in his sysem o generae mulimedia presenaions by synchronizing user preferred mulimedia daa which are rerieved from various mulimedia browsing or rerieval modules..2 Proposed Soluions In response o he above-menioned research problems, a se of models, mehodologies and echniques are proposed, implemened and applied o fulfill he requiremens of he disribued mulimedia managemen sysem. In he curren sysem implemenaion, we employ a muli-hreaded clien/server archiecure ha can run on Windows, Unix or Linux plaforms. The sysem is developed by using C++, Java, and an objec-relaional daabase called PosgreSQL [PosgreSQL]. In his disribued mulimedia managemen sysem (DMMS), a daabase engine is implemened o suppor image feaure exracion, video sho segmenaion, conen-based image and video queries, daa managemen, file delivery, and mulimedia presenaions suppors, ec. The clien applicaion uilizes a variey of user inerfaces, which allow for he browsing, rerieval and composiion of he media conens from various domains (e.g., spors, hurricane, medical, ec.) from is respecive daa sored wihin he daabase. Paricularly, he following echniques are proposed and developed o address he grea challenges aforemenioned. () Video daabase modeling and rerieval via HMMM To efficienly manage a large mulimedia archive, a promising soluion should incorporae high-level semanic descripions for mulimedia conen processing, managemen, and 6

20 rerieval. In his research, he Hierarchical Markov Model Mediaor (HMMM) mechanism is proposed o efficienly sore, organize, and manage low-level feaures, mulimedia objecs, and semanic evens along wih high-level user percepions, such as user preferences, in he mulimedia daabase managemen sysem (MMDBMS). In order o archive all valuable daa, HMMM also adops muli-disciplinary echniques, such as conen-based analysis, audio feaure exracion, video sho deecion and segmenaion algorihms, machine learning mehodologies, and relevance feedback echniques. Basically, HMMM can help o bridge he semanic gap beween concep-based and he conen-based rerieval approaches o he underlying mulimedia daabase model. By employing he proposed HMMM mechanism, high-dimensional mulimedia daa can be efficienly organized, indexed and managed. Moreover, he emporal relaionships beween he video shos are naurally inegraed in HMMM such ha he proposed mechanism can offer he capabiliy o execue no only he radiional even queries bu also he complicaed emporal paern rerieval owards he large scale video daabase quickly and accuraely. Moreover, an advanced video clusering echnique is proposed and implemened o cluser he videos based on no only he low level feaures, bu also he high level semanic meanings and he user percepions. (2) User ineracion suppor by offline raining and online learning mechanisms In his research, online learning and offline raining sraegies are designed and incorporaed in he HMMM mechanism such ha high-level user percepions and preferences as well as he low-level visual/audio feaures can be considered. Furher research is conduced o combine hese wo echniques o gain he bes radeoff in boh performance and speed. In paricular, a user adapive video rerieval framework called MoVR [Zhao07a] is proposed for efficien mulimedia daa searching and managemen in he mobile wireless environmen. In his framework, individual user profiles are designed o learn personal ineress, while general user access hisory is also recorded o accumulae he common knowledge and preferences. Fuzzy 7

21 associaion concep is adoped here such ha users can choose he bes combinaion beween general user percepions and individual user ineress o find heir anicipaed resuls easily. (3) Securiy managemen via SMARXO A framework called SMARXO [ChenSC04b] is proposed o perform mulilevel mulimedia securiy for mulimedia applicaions. This echnique efficienly combines Role Based Access Conrol (RBAC), XML [XML] and Objec-Relaional Daabase Managemen Sysem (ORDBMS) o achieve he arge of proficien securiy conrol. By using his framework, he sysem can check and follow he securiy policies by evaluaing he user s subjec role, objec role, emporal role, and spaial role. Hence, inappropriae accesses are sricly prohibied and he sensiive mulimedia daa are efficienly proeced. Wih he designed securiy managemen inerfaces, adminisraors are capable of creaing, deleing, and modifying all kinds of access conrol roles and rules. Meanwhile, securiy informaion rerieval becomes very convenien because all he proecion relaed informaion is managed by XML. (4) Sysem developmen and inegraion in DMMManager In his research, a disribued mulimedia managemen sysem named DMMManager (Disribued MuliMedia Manager) is developed wih he proposed framework DIMUSE o suppor mulimedia capuring, analysis, rerieval, auhoring and presenaion in one single framework. The disribued clien/server archiecure is adoped in DMMManager such ha muliple requess from differen cliens can be handled simulaneously. A se of core componens are efficienly inegraed in DMMManager so ha he user can complee various asks including video/audio capuring, conen-based image/video rerieval, mulimedia presenaion design and rendering, and securiy managemen..3 Conribuions In his disseraion, an inegraed framework called DIMUSE is proposed for mulimedia applicaion design. In DIMUSE, a variey of advanced echniques are proposed, implemened and 8

22 inegraed o develop a large scale disribued mulimedia sysem wih boh powerful daabase managemen capabiliies and enhanced securiy proecion. The major conribuions of his research can be oulined as follows: The proposed Hierarchical Markov Model Mediaor (HMMM) mechanism offers a hierarchical srucure o assis in he proficien consrucion of he mulimedia daabase. Wih he power of HMMM, he proposed video daabase modeling mechanism, we can narrow down he semanic gap beween he conen/concep based rerieval approaches wih he comprehensive mulimedia daabase modeling. Firs, HMMM naurally incorporaes he emporal relaionship beween semanic evens such ha complicaed emporal paern queries can be execued. Second, HMMM helps o rerieve more accurae paerns quickly wih lower compuaional coss. Third, he mulimedia rerieval approaches associaed wih HMMM inegrae he feedback and learning sraegies by considering no only he low-level visual/audio feaures, bu also he high-level semanic informaion and user preferences. In his research, he proposed ineracive video rerieval framework incorporaes a concepual video clusering sraegy. The proposed framework can reuse he cumulaed user feedback o perform he video clusering process, such ha he overall sysem can no only learn he user percepions, bu also ge more efficien mulimedia daabase srucure via adoping video clusering echnique. As he HMMM mechanism helps o raverse he mos opimized pah o perform he rerieval, he proposed framework can only search several clusers for he candidae resuls wihou raversing all he pahs o check he whole daabase. Tha is, he proposed video clusering echnique can be conduced o furher reduce he searching ime especially when dealing wih he op-k similariy rerievals. Meanwhile, he clusering echnique helps o furher improve he daabase srucure by adding a new level o model he video clusers. 9

23 DIMUSE employs a sraegy o accommodae advanced queries by considering he high level semanic meaning. Firs, i is capable of searching semanic evens or even paerns considering heir populariy by evaluaing heir access frequencies in a large number of hisorical queries. Second, users can choose one or more example paerns wih heir anicipaed feaures from he iniial rerieved resuls, and hen issue he nex round of queries. I can search and re-rank he candidae paerns which involve similar aspecs wih posiive examples reflecing he user s ineress. Third, i is worh menioning ha his approach suppors boh online learning and offline raining such ha he sysem can efficienly learn he individual user preferences in real ime, while coninuously improving he overall performance o gain he long erm benefis. In his research, he offline raining mechanism is furher improved and auomaed by adoping he associaion rule mining echnique. Tha is, he raining process can be auomaically invoked for cerain videos by evaluaing he hisorical queries and feedbacks. In order o accommodae various consrains of he mobile devices, a se of advanced echniques are developed and deployed o address essenial issues in he proposed mobile-based video rerieval sysem. Firs, HMMM-based user profiles are creaed o inegrae seamlessly wih a novel learning mechanism. I can enable he personalized recommendaion for an individual user by evaluaing his/her personal hisories and feedbacks. Second, he fuzzy associaion concep is employed in he rerieval process such ha he users gain conrol of he preference selecions o achieve reasonable radeoff beween he rerieval performance and processing speed. Third, virual cliens are designed o perform as a middleware beween server applicaions and mobile cliens. This design helps o reduce he sorage load of mobile devices, and o provide greaer accessibiliy wih heir cached media files. 0

24 Several significan access conrol echniques are incorporaed in SMARXO o saisfy he complicaed mulimedia securiy requiremens. Firs, SMARXO incorporaes efficien mulimedia analysis mechanisms o acquire meaningful visual/audio objecs or segmens. Second, XML and objec-relaional daabases are adoped such ha proficien mulimedia conen indexing can be easily achieved. Third, a dominan access conrol model is upgraded and embedded o sui he specific characerisics of mulimedia daa. Moreover, XML is also applied o organize all kinds of securiy relaed roles and policies. Finally, and mos imporanly, all of hese echniques are efficienly organized such ha mulilevel mulimedia access conrol can be achieved in SMARXO easily..4 Scope and Limiaions of he Proposed Prooype The proposed prooype has he following assumpions and limiaions: () The curren design and deploymen of he HMMM model parly relies on he precision and correcness of auomaic pre-filering and semanic even annoaion algorihms. I assumes ha we can ge reasonably good even annoaion resuls auomaically as he inpus. However, in a real scenario, mos of he even annoaion and learning algorihms are consruced by considering he domain knowledge, while he precision of a paricular algorihm canno be guaraneed for all he video samples. The annoaion accuracy could acually be affeced by all kinds of noisy daa. In oher words, semanic informaion exracion is sill an open research issue and a very challenging ask. (2) For he conen based video rerieval (CBVR) approach, he user feedback and sysem raining echniques are proposed and implemened. However, he performance of he offline raining algorihm is dependen on he number of hisorical queries. The assumpion is made such ha enough feedback is provided from muliple users. Moreover, he coverage of user feedback also couns. The greaer he number of he media files involved in he hisorical feedback, he beer he performance of he sysem

25 afer he raining process. If he seleced query sample has no been ouched in any hisorical feedback, he sysem rerieval performance can hardly be improved for his specific query. On he conrary, a small number of hisorical records can be used in online learning bu hey canno dramaically improve he rerieval performance for he whole daabase. (3) The proposed securiy managemen framework is suiable for large scale companies and organizaions because i can suppor muliple levels of securiy assurance while considering all kinds of possible securiy roles and complicaed access conrol rules. However, i could be somewha complicaed o learn and operae. There is sill lacking a general semanic model o describe and formalize he securiy access conrol processes. (4) Alhough he presen sysem suppors disribued archiecure wih muliple servers and muliple cliens, he curren research mainly focuses on one cenral daabase. The exising daabase in DMMManager conains a huge amoun of mulimedia daa (around 5G), and is coninuously expanding. Therefore, more effors should be made o manage he disribued mulimedia daabase. Fuure researches are expeced o manage several separaed daabases which are allocaed a disribued servers hrough he nework. These daabases can be conneced, indexed and linked ogeher by adoping he conceps of HMMM model..5 Ouline of he Disseraion This disseraion is organized as follows. Chaper II gives a lieraure review of he approaches of mulimedia modeling and indexing sraegies, rerieval mehodologies, sysem raining mehodologies, as well as he securiy soluions for a mulimedia sysem. A se of prooype sysems and applicaions are also reviewed. 2

26 In Chaper III, he overall framework of he proposed approach is described for he disribued mulimedia managemen sysems. A series of modules are presened in deail o furher advance he undersanding of he proposed prooype. Chaper IV mainly presens advanced soluions for he mulimedia daabase modeling and conen based rerieval. The Hierarchical Markov Model Mediaor (HMMM) model is inroduced and formalized. Relaed issues are furher discussed, for example: consrucion of a 2- level HMMM model, video clusering mehod and consrucion of he 3 rd level HMMM, ec. Chaper V focuses on he user ineracions hrough mulimedia sysem offline raining and online learning echniques. Firs, an innovaive mehod is proposed o auomae he offline sysem raining by using he associaion rule mining mehod. Second, he online learning scheme is designed based on he HMMM daabase model o improve rerieval performance in real ime. Third, a user adapive video rerieval sysem, called MoVR is proposed and developed in a mobile wireless environmen. Chaper VI presens a securiy model called SMARXO for disribued mulimedia applicaions via uilizing RBAC, XML and Objec-Relaional daabase. The differen phases in securiy conrol managemen are described in deail. The sysem inegraion issues are covered in Chaper VII by presening he disribued mulimedia managemen sysem DMMManager. The sysem modules are described and he sysem inerfaces are demonsraed o show he funcionaliies. Finally, he conclusions and fuure work are summarized in Chaper VIII. 3

27 CHAPTER II. LITERATURE REVIEW Alhough he recen developmen in mulimedia analysis and disribuion echniques has made digial media more accessible han ever before, i sill lacks a comprehensive and effecive soluion for daabase modeling and rerieval. In his chaper, he exising approaches and mehodologies in mulimedia sysem research are summarized. The deailed discussions focus on he areas of mulimedia (especially video) daabase modeling, indexing, conen-based mulimedia rerieval, and securiy managemen, ec. 2. Mulimedia Daa Modeling, Indexing and Daa Srucures I is much more difficul o index and search mulimedia daabases, in which informaion is implicily represened by pixel colors, moion vecors, and audio samples, han he radiional ex documens. Therefore, he mos common approach adoped in mulimedia daa managemen is o firs exrac he media conen represenaions and hen o apply daa indexing or clusering echniques on hem for fas media rerieval. There is a rapid proliferaion of visual processing and analysis echniques o exrac he media conen represenaions for media managemen, indexing and rerieval. Take video daabase as an example, some researchers have conduced experimenal sudies o idenify he salien objecs and heir moions. For example, [ChenSC03c] presens a learning based algorihm o rack he vehicles and idenify heir spaio-emporal informaion in he ransporaion daa. The exraced salien objecs and heir rajecories can be indexed for video rerieval. For he purpose of high-dimensional mulimedia daa indexing and modeling, many echniques have been proposed. In [DeMenhon03], he pixel regions are represened by highdimensional poins, and hese poins are assigned labels and sored ino a single binary ree for k- neares neighbor rerieval. The auhors in [Fan0] proposed a mulilevel video modeling and indexing approach, called MuliView, which consiss of a se of separae indices for he clusers, 4

28 where each cluser is conneced o a single roo node. [Fan04] describes heir laer work, named ClassView, where he daabase indexing srucure includes a se of hash ables for differen visual concep levels, and a roo hash able conaining he informaion abou all semanic clusers. Anoher new layered approach for mulimedia conen represenaion and sorage for search or rerieval was inroduced in [Huang00]. Currenly, here also exis approaches focusing on he clusering echniques for he video daa managemen. For example, a hierarchical clusering mehod for spors video was presened in [Ngo0]. Two levels of clusers are consruced where he op level is clusered by he color feaure and he boom level is clusered by he moion vecors. [Odobez03] presens a specral clusering mehod o group video shos ino scenes based on heir visual similariy and emporal relaionships. In [Xie03], algorihms are proposed for unsupervised discovery of he video srucure by modeling he evens and heir sochasic srucures in video sequences by using Hierarchical Hidden Markov Models (HHMM). However, mos of he exising research works produce he clusers mainly on low-level and/or mid-level feaures, and do no consider high-level conceps or user percepions in he clusering procedure. This gives rise o he problem of semanic gap. 2.2 Mulimedia Rerieval Mehodologies 2.2. Keyword-based Rerieval The early adoped mulimedia rerieval soluions were o query upon he exual daa. For his purpose, radiional mulimedia daabases basically sore he exual descripors along wih he source media daa such ha exual-based mulimedia queries can be convenienly performed. Such exure descripors are mainly exraced based on he use of annoaions. For insance, here exis some video query approaches wih he use of even annoaions ha are generally described as ime-dependen informaion or values ha are synchronous wih he source daa. These approaches eiher suppor semanic queries and some basic emporal queries, or deploy even- 5

29 based indexing via he inclusion of he even name, sar ime, and end ime. For insance, IBM TRL s MEPG-7 auhoring sysem [IBM_TRL] deploys even-based indexing and rerieval. Addiionally, SMOOTH [Kosch0] and GOALGLE [Snoek03] suppor he semanic queries and some basic emporal queries for soccer even rerieval. However, in pracice, i is difficul o perform correc and comprehensive annoaions auomaically by uilizing machine inerpreaion echniques due o he inheren complexiy of media conen. Alernaively, manual annoaions can be performed. However, his process involves some uncerainy because of he subjeciviy of human percepion and he limiaion of keywords and informaion loss. As an aemp o address his issue, [Deyniecki00] inroduces a mehod of fuzzy annoaions by embedding cerainy values in he XML file. Anoher issue of manual annoaion is ha i requires remendous manual effor, which becomes infeasible wih he fas growh of media daa Conen-based Rerieval Differen from he radiional keyword-based search echnologies, he conen-based indexing and rerieval approaches auomaically exrac feaures such as color, exure, shape, ec. I provides more powerful search abiliies and becomes a focus in his research area. Many approaches have been proposed in boh he academia and indusry, such as IBM s QBIC sysem [Flickner95], Virage s VIR engine [Virage], PhooBook [Penland94], and Foofile [Kuchinsky99]. However, hese sysems focus only on conen-based image rerieval (CBIR). There are also some projecs ha aim o offer video daa soluions. In hese projecs, he video daa is analyzed and segmened o faciliae he browsing funcionaliy upon he video srucure daa, e.g., video segmen, scene, sho, frame, ec. For example, in he Mulimedia Analysis and Rerieval Sysem (MARS) [Rui97], he role of a able of conens (ToC) is employed o srucure a video ino a se of scenes. In addiion, an ineracive conen-based video browser is presened in [Guillemo03], which suppors a hierarchical navigaion of video over he Inerne hrough 6

30 muliple levels of key frames. There also exis some sysems which suppor queries for boh images and videos, such as VisualSEEk [Smih96] and VISMAP [ChenW0]. In hese conen-based rerieval sysems, Query by Examples (QBE) is mainly adoped as he query approach. QBE focuses on rerieval based on low-level or mid-level visual/audio feaures. Given an example image or video clip, he sysem aims o rerieve similar mulimedia objecs wih similar feaures (color, shape, ec.). In [Aref02], a video-enhanced daabase sysem called VDBMS is proposed o suppor feaure-based medical video daa rerieval. [ChenL03] describes a sysem which applies image rerieval echniques o query videos by seing up he links beween videos and images. IBM s video rerieval sysem MARVEL [IBM_Marvel] suppors QBE in boh he low-level feaure space and he high-level model-vecor space. In addiion, he auhors in [Ianeva04] presen a probabilisic mulimedia rerieval model, which can capure correlaions in ime and space o improve he precision of he QBE approach. However, QBE approaches have heir own limiaions because he users may no have he image/video example a hand when issuing he queries. In addiion, QBE will no perform well if he query example is no aken wih an appropriae angle or scale Challenges in Mulimedia Rerieval In erms of video daa rerieval, he mos recen researches mainly focus on semanic evens rerieval. The exising even-based and objec-based video rerieval applicaions may encouner a problem since even deecion and objec segmenaion require manual annoaions of video evens, salien objecs, and heir boundaries. Ideally, he semanic conen of he video daa can be mined auomaically by uilizing various machine inerpreaion echniques, and herefore he videos can be auomaically annoaed. However, based on he curren echnology developmen and general experiences, hese kinds of complicaed daa absracions are no feasible in pracice. Insead, he compuer may perform auomaic or semi-auomaic annoaion wih limied semanic inerpreaion. 7

31 The individual user s background calls for a differen view on mulimedia daa. Therefore, anoher criical challenge is o design he mulimedia sysem and rerieval algorihm such ha individual user ineress can be learned and saisfied. For his purpose, relevance feedback (RF) is uilized in Conen Based Image Rerieval (CBIR) o bridge he semanic gaps and provide more accurae resuls based on users responses [Rui98]. Several recen sudies have incorporaed his echnique in video rerieval. [Amir05] presens a video rerieval sysem uilizing relevance feedback for mulimodal formulaions. [Yan03] describes a negaive pseudo-relevance feedback (NPRF) which can exrac informaion from he rerieved iems ha are no similar o he query iems. Relevance feedback helps o refine he mulimedia search resuls. However, he exising RF approaches do no incorporae an efficien mehodology for mulimedia daabase modeling ha has he capabiliy o model all layers of mulimedia objecs and consequenly offer mulimodal video rerieval o saisfy individual user s ineress. 2.3 Mulimedia Securiy Soluions In he earlier imes, Mandaory Access Conrol (MAC) and Discreionary Access Conrol (DAC) were he only wo known access conrol models available. Tha is, if an access conrol model was no MAC, i had o be a DAC and vice versa. In a compuer sysem employing MAC, he adminisraor ses he sysem securiy policies which enirely deermines he access righ graned. Even for he user who creaes he resource, he/she canno gran less resricive access o i han ha specified by an adminisraor. As for DAC, he basic access conrol policies are defined o objecs in a file sysem. The users are permied o enirely deermine he access graned o heir creaed resources. Tha is, unauhorized users can be graned access hrough acciden or malice of he users. Compared wih MAC and DAC, Role-Based Access Conrol (RBAC) is a newer and alernaive securiy soluion o resricing sysem access o auhorized users. The fundamenal feaure of RBAC is o suppor he adminisraion of large numbers of privileges on sysem 8

32 objecs, and reduce he effor o define and manage complex securiy policies. Wih RBAC, roles are creaed for various characers based on heir job funcions. For a specific role, he permissions o perform cerain operaions are assigned o i and he member of his role can acquire hese permissions o perform he paricular sysem operaions. Since he permissions are no assigned o users direcly, he managemen of individual users becomes easier. I is simply a maer of assigning appropriae roles o he users. Sandhu e al. [Sandhu96] summarizes and caegorizes he radiional RBAC models ino four families: RBAC0 base model; RBAC hierarchical model; RBAC2 consrain model; and RBAC3 combined model. Tradiional RBAC models have many resricions on he access conrol modeling. Therefore, numerous exended RBAC models have emerged o handle hose unresolved securiy issues. By evaluaing he radiional RBAC approaches, i has been found ha several issues sill remain open. Firs, emporal consrains may no be considered when seing he roles. Second, he locaions of users are no resriced. Third, mos securiy applicaions can only handle access conrol on mulimedia files wihou aking care of mulimedia conens. Fourh, i lacks a hierarchical archiecure for he roles and herefore he role managemen will become complicaed when he user number increases manifold. Tradiional RBAC models [Sandhu96] have many resricions on access conrol modeling. Therefore, numerous exended RBAC models have emerged o handle hose unresolved issues. In [Berino0], he Temporal Role-Based Access Conrol (TRBAC) model, which brings he basic emporal dependencies, is proposed bu i canno handle several useful emporal variables including he consrains on user-role and role-permission assignmens. The Generalized Temporal Role-Based Access Conrol (GTRBAC) model [Joshi05] is proposed laer o solve his problem. Recenly, his model was exended o an XML based version called X- GTRBAC [Bhai05], which incorporaes he conen- and conex-aware dynamic access conrol requiremens of an enerprise. However, hese models only improved he conrol capabiliy on 9

33 emporal consrains. Moyer e al. propose he Generalized Role-Based Access Conrol (GRBAC) model which leverages he radiional RBAC by incorporaing subjec roles, objec roles, and environmen roles [Moyer0]. Bu hey only inroduce he emporal consrains in he environmen roles, and i can only handle access conrol on mulimedia files wihou aking care of mulimedia conens. Anoher Generalized Objec-Composiion Peri-Ne Model (GOCPN) is proposed in [Joshi02], which mainly focuses on he modeling of documens o allow secure accesses o a mulimedia daabase managemen sysem. GOCPN uilizes a mandaory access conrol (MAC) approach which canno fully perform complicaed roles, role hierarchies, emporal consrains, and IP address resricions. The comparison beween our proposed securiy model wih he aforemenioned research works is discussed in Chaper VI. 2.4 Prooype Mulimedia Managemen Sysems 2.4. Conen-based Mulimedia Rerieval Sysems CIRES: Conen Based Image RErieval Sysem By evaluaing a combinaion of higher level and lower level visual clues, CIRES [Iqbal02] is developed o suppor conen-based image rerieval, including he queries ranging from scenes of purely naural objecs such as sea, sky, rees, ec., o images conaining conspicuous srucural objecs such as buildings, owers, bridges, ec. Figure II- shows he inerface in CIRES sysem. In he lower level analysis, a channel energy model is uilized o represen he image exure, while he color hisogram echniques are also employed. In order o describe he srucural conen of an image, he percepual organizaion is considered in he higher level analysis o exrac fracional energies in various spaial-frequency channels. 20

34 Figure II-. CIRES inerface wih sample images WebSEEk: Conen Based Image and Video Caalog and Search Tool for he Web WebSEEk [WebSEEk] is a web-based image/video caalog and search ool developed by Columbia Universiy. This sysem combines ex-based and color based queries hrough a caalog of images and videos colleced from he Web. The user can iniiae a query by choosing a subjec from he available caalogue or enering a opic (as shown in Figure II-2(a)). There are several selecions available such ha query resuls may be used for a color query in he whole caalogue or for soring he resul lis by decreasing color similariy o he seleced iem. Users can also possibly modify an image/video color hisogram manually before reieraing he search. Furhermore, his sysem also adops he relevance feedback echnique for finer grain refinemen of query resuls (as shown in Figure II-2(b)). 2

35 (a) (b) Figure II-2. WebSeek inerfaces (a) sample caalog (b) image rerieval resuls wih relevance feedback Figure II-3. Query inerface of VDBMS 22

36 VDBMS: A Medical Video Rerieval Sysem In he VDBMS (Video Daa Based Managemen Sysem) Projec [Aref02][Aref03] developed a Purdue Universiy, a video-enhanced daabase sysem is proposed o suppor a series of funcionaliies for video daabase managemen, including video conen preprocessing, represenaion and indexing, video and mea-daa sorage, feaure-based video rerieval, buffer managemen, and coninuous video sreaming. As a muli-discipline video rerieval sysem, VDBMS suppors boh search-by-conen and search-by-sreaming. Furhermore, VDBMS has also been developed as a research plaform via incorporaing new echniques. For insance, wo query operaors are implemened: rank-join and sop-afer algorihms. In addiion, VDBMS employs a mehod o define and process video sreams hrough he query execuion engine such ha he coninuous queries are suppored o realize he requess as fas-forward, lef ouer join, and region-based blurring. Here he window-join algorihm works as he core operaor for coninuous query processing. The query inerface of VDBMS is illusraed in Figure II Goalgle: Soccer Video Search Engine Goalgle [Snoek03] is a prooype search engine for soccer video. As illusraed in Figure II-4, browsing and rerieval funcionaliies are provided by means of a web based inerface. Goalgle allows users o rerieve video segmens from a collecion of prerecorded and analyzed soccer maches by selecing he specific players, evens, maches, and/or ex. In [Snoek05], he auhor expands heir research o a ime inerval mulimedia even (TIME) framework for semanic even classificaion in mulimodal video conens. Three machine learning echniques are sudied and compared: C4.5 decision ree, maximum enropy, and suppor vecor machine. 23

37 Figure II-4. User inerface of he Goalgle soccer video search engine IBM VideoAnnEx: Video Annoaion Tool Figure II-5 shows he IBM VideoAnnEx annoaion ool [IBM_VideoAnnEx], which is developed o assis auhors in he ask of annoaing video sequences wih MPEG-7 meadaa. Each sho in he video sequence can be annoaed wih saic scene descripions, key objec descripions, even descripions, and oher lexicon ses. The annoaed descripions are associaed wih each video sho and are sored as MPEG-7 descripions in an oupu XML file. VideoAnnEx can also open MPEG-7 files in order o display he annoaions for he corresponding video sequence. The annoaion ool also allows cusomized lexicons o be creaed, saved, downloaded, and updaed. 24

38 Figure II-5. User inerface for IBM VideoAnnEx Tool IBM MARVEL The IBM video rerieval sysem MARVEL [IBM_Marvel] is developed o organize he growing amouns of online mulimedia daa by using machine learning echniques o auomaically label he mulimedia conens. I suppors query by example in low level spaces as well as high level model-vecor space. In his research work, he ime-consuming and error-prone processes of meadaa labeling are replaced wih a semanic-based machine learning approach. I is claimed ha only -5% of he conen is required o be manually annoaed as he raining examples. Mulimodal feaures are employed for auomaic annoaing, for example visual clues, sounds, speech ranscrips, ec. The MARVEL mulimedia analysis engine and he MARVEL mulimedia search engine are implemened o provide he inernal suppors. Figure II-6 shows he online sysem inerfaces of IBM MARVAL. 25

39 Figure II-6. Query inerface of IBM MARVEL CuVid: News Video Search Sysem The Columbia DVMM lab creaed he CuVid [CuVid] sysem for he 2005 TRECVID ineracive search evaluaion. I inegraes a search engine for broadcas news video by employing he advanced echniques such as video sory segmenaion, semanic concep deecion, duplicae deecion, mulimodal rerieval, and ineracive browsing inerfaces. Specifically, he sory 26

40 segmenaion algorihm considers he informaion boleneck principle and he fusion of visual feaures and prosody feaures exraced from he speech. Moreover, a pars-based approach is uilized o deec he duplicae scenes across various news sources. The online rerieval inerface of CuVid is shown in Figure II-7. Figure II-7. User inerface for CuVid Youube Youube is a video sharing websie creaed in early 2005 where consumers can upload, view and share video clips [Youube]. I uses Adobe Flash echnology o display a wide variey of video conen, including movie clips, TV clips, and music videos, as well as amaeur conen such as shor videos which are creaed and edied by users. Unregisered users can wach mos videos on he sie, while regisered users are permied o upload an unlimied number of videos. In YouTube s second year, funcions were added o enhance user abiliy o pos video commens 27

41 and subscribe o conen raings. Each uploaded video feaures a series of ags ha are user inpued and hese ags are indexed for keyword-based searches. As demonsraed in Figure II-8, he resuls can be sored by heir posing ime, viewing couns, or raing scores. Figure II-8. User inerface for Youube video search Google Image/Video Search Engine As a company specializing in Inerne searches and online adverising, Google indexes billions of web pages and offers users convenien ools o search for informaion hrough he use of keywords and operaors. Google has also employed web search echnology in oher search services, including image search and video search. The new Google Video has been ransformed ino a video search service ha provides links o online and offsie video conen [GVideo]. I can index media from YouTube as well as an assormen of oher video hosing sies, including 28

42 Meacafe, MySpace, and BBC. Previews are available nex o search resuls ha are hosed on YouTube or Google Video, and humbnail snapshos are available for conen hosed by oher providers. Much like Google image search, a frame wih relevan Google-provided funcionaliy appears a he op of he window when he user clicks hrough a search resul. Figure II-9 shows he video searching resuls of soccer free kick goal, where he keywords are usually found from he video descripions and associaed web pages. Figure II-9. User inerface for Google video search Yahoo! Image/Video Search Engine Originally Yahoo! sared as a web direcory of oher websies, organized in a hierarchy index of pages. Over ime, Yahoo! designed and developed is own web crawler and search engine. In lae 2007, Yahoo! Search was updaed wih a more modern appearance wih Search Assis added, which can auomaically sugges and offer relaed search erms as hey are yped. 29

43 The keyword-based image/video search funcions are also suppored by Yahoo! Search based on he web searching echniques. A combinaion of facors are used in Yahoo! Video Search [YVideo] o enable users o find and view differen ypes of online video, including movie railers, TV clips, news fooage, and independenly produced video. These facors include Yahoo!'s media crawling and ranking echnology, is conen and media relaionships, as well as is suppor for Media Really Simple Syndicaion (Media RSS), a self-publishing specificaion for audio and video conen. Yahoo! Video Search suppors open sandards in he creaion and syndicaion of conen. By supporing Media RSS, Yahoo! Video Search hopes o foser openness and choice for independen video publishers looking o promoe heir conen. Figure II-0 shows he Yahoo! Video Search resuls for a query wih keywords Soccer Free Kick Goal. Figure II-0. User inerface for Yahoo! video search 30

44 2.4.. AOL Truveo Video Search Engine AOL Truveo [Truveo] operaes under he idea ha users do no merely search for video by enering specific words or phrases, as hey would when saring a regular web search. Insead, Truveo assumes ha people do no always know exacly wha hey are looking for in online video searches, so browsing hrough conen can help o rerieve unexpeced bu welcome resuls. Truveo provides useful inerfaces o suppor video browsing. I repeaedly displays spo-on resuls when users are looking for a video abou a specific subjec, or provides a variey of oher video clips ha are similar o encourage users o view more resuls. As shown in Figure II-, one useful feaure of Truveo is he way i shows resuls: by soring clips ino nealy organized caegories, such as Feaured Caegories, Channels, and Tags. These buckes spread ou on he page in a grid-like manner, giving users more o see in a quick glance. Figure II-. User inerface for AOL TRUVEO video search 3

45 2.4.2 Mulimedia Presenaion Auhoring and Rendering Sysems LAMP: Laboraory for Mulimedia Presenaions Prooyping Gaggi e al. [Gaggi06] propose a sysem called LAMP, a prooyping environmen which allows an auhor o se up and es a complex hypermedia presenaion. The media ediing ool in LAMP is implemened based on a graph noaion, where he nodes are media objecs and he edges are he synchronizaion relaions beween hem. An execuion simulaor is also included o es he presenaion dynamics by manually riggering he relaed evens. Finally, a player is developed o display he presenaion and visually inerpre he synchronized schema. In Figure II-2, he LAMP inerface is shown wih an example synchronizaion graph for news-on-demand presenaion. Figure II-2. LAMP inerface wih he synchronizaion graph of a news-on-demand presenaion 32

46 T-Cube: Mulimedia Auhoring Sysem for E-Learning Ma e al. [Ma03] inroduce a rich media auhoring sysem, T-Cube, which has been designed by and used a he Universiy of Trier for elearning. By using T-Cube, mulimedia conen can be consruced and presened o sudens wih eiher offline (CD/DVD/download) or online (in real ime or on demand) usage. The mulimedia-based eaching conens, including video, audio and screensho, are recorded and encoded a he classroom and simulaneously published on he Inerne. Figure II-3 illusraes an example of he layou design and he generaed presenaion inerface for T-Cube. Figure II-3. Views layou and user inerface for T-Cube Madeus: Srucured Media Auhoring ENvironmen In he paper [Jourdan98], Madeus is developed o help in ediing media documens ha conain fine-grained synchronizaions in he emporal, spaial and spaioemporal dimensions. A semiauomaic ool is inegraed in he sysem ha analyzes, generaes and allows he ediing of he conen descripion of video media. Madeus employs a documen model ha is based on srucured, emporal inerval-based and region-based models. Figure II-4 illusraes he sysem inerface for he imeline-based mulimedia presenaion auhoring environmen. 33

47 Figure II-4. Srucured media auhoring environmen of Madeus 34

48 CHAPTER III. OVERVIEW OF THE FRAMEWORK This chaper mainly provides an overview of DIMUSE, he proposed framework for he design of a disribued mulimedia sysem. Three major modules have been incorporaed in he proposed framework: mulimedia rerieval module, mulimedia presenaion design and rendering module, and securiy managemen module. In order o efficienly and effecively inegrae hem ogeher, a se of clien-side inerfaces as well as server-side componens are implemened and conneced. As illusraed in Figure III-, users are required o provide heir ID and password for logging-in purposes. A securiy checking componen is designed o check he user role and securiy rule by sending he reques o he server-side securiy checker. Afer passing he securiy check, general users are allowed o access boh he mulimedia rerieval and presenaion modules (shown as green links), while he adminisraors are also permied o access he securiy managemen module (shown as orange links). Users are able o raverse beween he mulimedia presenaion design environmen and he mulimedia rerieval componens (e.g., conen-based image rerieval, video browsing, soccer video even paern rerieval, ec.). The sysem provides he flexibiliy for users o search heir anicipaed media files and download he daa back o he presenaion design inerface. These source media files are lised and users can preview and hen choose heir preferred maerial o archiec diverse kinds of mulimedia presenaions. These designed presenaion models can be rendered o a real presenaion and displayed on he Java media player or a web browser. However, no all he mulimedia informaion can be accessed or displayed compleely because of he securiy assurance issues. The proposed securiy managemen module acually akes charge of he mulimedia daa accessing conrol of he whole applicaion, including boh of he previously menioned modules. Therefore, he user requess generaed in rerieval module and presenaion module may receive hree kinds of responses. Firs, if he user has full access o some specific media file under he specific environmen, he/she 35

49 can view and download he source daa in boh of he rerieval and presenaion modules. Second, if he user is no allowed o access some specific media file, no maer wha he reason is, he sysem basically rejecs he reques and no daa will be offered. Third, if he user has parial accessibiliy permission o he specific media files, hese files will be processed such ha resriced pars are hidden and no accessible by he user. However, users can sill view he processed media files for he un-sensiive pars and download his processed daa file o he presenaion. The only difference here is ha he resriced daa objecs are no shown in he final presenaion. The whole framework adops muli-hread clien/server archiecure, where a se of nework proocols can be uilized for he ransmission of mulimedia informaion, including boh requess and media daa. These proocols include TCP/IP, UDP, HTTP, RTP, ec. In he serverside, an objec-relaional daabase is designed for mulimedia daa sorage and managemen purposes. In DIMUSE, muliple caegories of daa are sored, which include: () Source media daa, including ex, audio, image, and video daa. (2) Processed mulimedia daa, including he mea daa, image objecs, video segmens, ec. (3) Low level visual/audio feaures. (4) High-level semanic informaion, including affiniy relaionships, and user access frequencies and access paerns, ec. (5) Securiy roles and rules, including user roles, objec roles, environmenal roles (i.e., emporal roles and IP address roles), and securiy policy rules. In DIMUSE, he Markov Model Mediaor (MMM) [Shyu03] and HMMM model is adoped o model mulimedia relaed daabase. The securiy-relaed informaion is mainly sored in XML. 36

50 Server-side Mulimedia Daabase Managemen (Relaional Objec Daabase) MMM & HMMM XML Source Daa Processed Daa Permission Roles Texual Daa Images Audios Videos Segmens and Objecs Mulimedia Mea Daa Visual/Audio Feaures Affiniy Relaionships User Roles Objec Roles Environmenal Roles Securiy Policy Rules Mulimedia Search Engine Image Rerieval Video Rerieval Online Learning Video Clusering Offline Training Reques Handler SMARXO Securiy Conrol Mulimedia Daa Manager Receiver Daa Sender Mulimedia Daa Processor Securiy Policy Checker Mulimedia Daa Analyzer Mulimedia Daa File Supplier Nework Layer TCP/IP UDP RTP HTTP Clien-side Reques Handler Reques Packager Reques Sender Informaion Receiver Mulimedia Browsing and Rerieval Module Presenaion Auhoring and Rendering Module Securiy Managemen Module Video Browsing Video Rerieval MATN Model Design SMIL Generaor User Roles Manage Objec Role Manage Image Rerieval Online Media Rerieval Java Media Frame Player Web based Player Environmen Role Manage Securiy Rule Manage General User Securiy Checking User Log In Adminisraor Figure III-. Overall framework and componens of DIMUSE 37

51 Server-side engines are developed according o he diverse funcionaliies wih daabase access and compuaion-inensive operaions. () A reques handler is implemened for receiving he reques and sending back he resuls. (2) A mulimedia search engine is developed o suppor conen-based image rerieval, video even/paern queries, online learning, offline raining, as well as he video clusering, ec. (3) The mulimedia daa manager is designed, where he daa analyzer is responsible for some analysis funcions, such as feaure exracion and video segmenaion. The mulimedia daa/file supplier is capable of caching he requesed daa/file from he daabase. (4) The SMARXO securiy conrol componen is also incorporaed in he server-side. This sub-module is uilized o manage he access conrol roles and rules, perform securiy checking, and process he media daa upon he securiy consrains. In he clien-side, a se of user-friendly inerfaces as well as some managing funcions are included o fulfill he diverse requiremens. () Muliple media rerieving inerfaces are faciliaed o suppor various mehods of daa accessing, including video browsing, conen-based image rerieval, concepual video rerieval, and web-based mulimedia rerieval. (2) The mulimedia presenaion componen incorporaes a Mulimedia Augmened Transiion Nework (MATN) [ChenSC00a][ChenSC0b] based presenaion design environmen. The sysem can provide wo possible mehods for presenaion rendering: JMF player and web-based player. The SMIL inerpreer is also included for he efficien convering from MATN model o HTML+SMIL scrips. (3) Securiy managemen module offers he inerfaces for user role managing, objec role managing, emporal role managing, IP address role managing, as well as securiy rules managing. 38

52 3. Mulimedia Daabase Modeling and Rerieval Module The major challenges of conen-based mulimedia rerieval include no only he difficuly of exracing feaures and generaing semanic indexes for hierarchical mulimedia conens, bu also he incapabiliy of discovering hidden personalized user ineress. 3.. Image Daabase Modeling and Rerieval using MMM As a well esablished mahemaical consruc, he Markov Model Mediaor (MMM) [Shyu03] is applied o model complicaed images as well as he rerieval engine in he conen based image rerieval componen of DIMUSE. The developmen of he MMM suppored image daabases are accomplished by employing he objec-relaional daabase. This module is no he focus of his disseraion so please refer o paper [Shyu04d] for more deails Video Daabase Modeling and Rerieval using HMMM By exending MMM o a muliple level descripion, an innovaive daabase modeling mechanism called Hierarchical Markov Model Mediaor (HMMM) is proposed in his research for video daabase modeling, sorage and rerieval purposes. In order o model hierarchical media objecs, HMMM is composed wih muliple levels of MMM models which are conneced effecively and efficienly. The dream of pervasive mulimedia rerieval and reuse will no be realized wihou incorporaing semanics in he mulimedia daabase. In his research, HMMM inegraes low-level feaures, semanic conceps, and high-level user percepions for modeling and indexing muliplelevel video objecs o faciliae emporal paern rerieval. A variey of mulimedia objecs in differen levels are modeled wih he sae sequences associaed wih heir ransiion probabiliies by incorporaing he emporal meanings and/or heir affiniy relaionships. Differen from he exising daabase modeling mehods, his proposed approach carries a sochasic and dynamic process in boh search and similariy calculaion. In he rerieval of semanic even paerns, HMMM always ries o raverse he mos opimized pah, and herefore i can assis in rerieving 39

53 more accurae paerns quickly wih lower compuaional coss. Moreover, HMMM suppors feedbacks and learning sraegies, which can proficienly assure he coninuous improvemens of he overall performance Online Learning and Offline Training via HMMM In DIMUSE, an innovaive mehod is proposed and developed o capure he individual user s preferences by considering he low-level feaures as well as he semanic conceps and relaionships. Wih he hierarchical and sochasic design for video daabase modeling, he proposed framework suppors no only he general concep-based rerieval mehods, bu also he complicaed emporal even paern queries. In he online learning approach, a se of MMM insances are creaed for he user wih disinc preferences, and he sysem is capable of learning and hen generaing he updaed resuls o saisfy he special informaion requiremens. Wih he proposed online learning mechanism, he rerieval and ranking of video evens and he emporal paerns can be updaed dynamically in real ime o saisfy individual user s ineress and informaion requiremens. Moreover, user feedback is efficienly accumulaed for he offline sysem raining process such ha he overall rerieval performance can be enhanced periodically and coninuously. Tha is, he overall sysem can always remain as a learning mechanism since he access paerns and frequencies from various users can be proficienly sored and analyzed for he long-erm offline sysem raining. The offline raining process is normally iniiaed only when he number of feedbacks reaches a cerain hreshold. This could improve he performance bu i becomes a manual process o decide he hreshold and iniiae he raining process. To address his challenge, we propose an advanced raining mehod by adoping he associaion rule mining echnique [Zhao07b], which can effecively evaluae accumulaed feedback and auomaically invoke he raining process. Training is performed per video raher han for all videos in he daabase, making he process 40

54 more efficien and robus. In addiion, i can furher improve he semanic models in he video daabase and coninuously improve rerieval performance in he long run. As an example, he proposed mehod is applied o a soccer video rerieval sysem and he experimenal resuls are analyzed. Furher, we applied he Hierarchical Markov Model and sysem raining mechanism in a mobile-based video rerieval sysem (MoVR). We developed innovaive soluions for personal video rerieval and browsing hrough mobile devices wih he suppor of conen analysis, semanic exracion, as well as user ineracions. HMMM-based user profiles were designed o capure and sore individual user s access hisories and preferences such ha he sysem can provide he personalized recommendaion. We also employed he fuzzy associaion concep o empower he framework so ha he users can make heir choices of rerieving conen based solely on heir personal ineress, general users preferences, or anywhere in beween. Consequenly, he users gain conrol in deermining he desirable level of radeoff beween rerieval accuracy and processing speed. A mobile-based soccer video navigaion sysem was implemened and examined o demonsrae he performances of he proposed MoVR framework Video Daabase Clusering To accommodae he requiremens of muli-disciplinary video rerieval in he disribued mulimedia applicaions, a concepual video daabase clusering echnique is proposed, implemened and incorporaed in DIMUSE. As menioned above, he video daabase is modeled by HMMM, which is a hierarchical learning mechanism and suppors boh online and offline raining. Acually, he cumulaed hisorical queries and he associaed user feedbacks can be reused o updae he affiniy relaionships of he video objecs as well as heir iniial sae probabiliies. Correspondingly, boh he high level semanics and user percepions are employed in he video clusering sraegy. The associaed rerieval algorihm is also proposed o search he op-k paerns wih raversing he 4

55 minimum number of clusers. This echnique assiss o cluser he relaed media daa o improve he rerieval performance. Wih he clusering informaion, he daabase srucure can be furher refined by adding a new level of MMM o model he clusers. Furhermore, he compuaion coss in he query processing can be significanly reduced. 3.2 Mulimedia Presenaion Module A mulimedia presenaion is a delivery medium of a collecion of media sreams which are consrained by emporal synchronizaion relaionships among each oher. An absrac model called Mulimedia Augmened Transiion Nework (MATN) [ChenSC00a][ChenSC0b] is adoped in DIMUSE as he presenaion model. This componen is one of he key modules in our disribued mulimedia managemen sysem. However, he mulimedia presenaion module and is relaed echniques are no he major conribuions of his disseraion. This module will only be presened in sysem inegraion secion of Chaper VII and he reader is referred o he book of [ChenSC00a] for more deails relaed o MATN model Presenaion Design wih MATN Model An MATN model is composed wih a group of saes conneced by direced arcs wih marked mulimedia srings. By combining srucure-based auhoring wih well-defined graphicbased noaions, MATN offers grea flexibiliy for users o design a complicaed mulimedia presenaion wih synchronizaion of he heerogeneous mulimedia objecs. MATN suppors he specificaion of emporal consrains for mulimedia conen, and hese emporal requiremens can be saisfied a runime. MATN also provides a good daa srucure for he implemenaion o conrol mulimedia playback. A group of feaures are implemened in he MATN-based presenaion design environmen such ha users can easily add or delee he presenaion saes, adjus he emporal consrains for each arc, design a sub-nework o accommodae diverse condiions, ec. In addiion, he MATN file forma is designed by considering he MATN srucures and embedded 42

56 informaion. The file saving and opening funcions are also developed o sore and resume he user-designed MATN based presenaions Presenaion Rendering wih JMF and SMIL A presenaion rendering componen is implemened and inegraed in DIMUSE o conver he designed MATN model o a mulimedia scenario perceivable o he users. Basically, here are wo approaches provided o fulfill differen requiremens based on diverse environmens. One approach is o synchronize and display he presenaion in a clien-side player which is implemened by using Java Media Framework (JMF) echnologies. JMF provides superior echniques for rendering he presenaion models ino a sand-alone applicaion in a runime environmen. Four kinds of disinc media players are developed o exhibi he ex, image, audio, and video. Since he MATN model capures he spaial and emporal relaionships, hey can be inerpreed and uilized o conrol he players. The oher approach is o conver he designed MATN model o SMIL languages, which can be displayed in he web browser direcly. SMIL noaions can be combined wih he HTML file. Therefore, an SMIL emplae is deployed in he sysem such ha he MATN srucure can be inerpreed ino he SMIL+HTML forma. The SMIL-based scrips can be displayed wherever he web browser is available. This approach is specifically suiable for he online-based mulimedia applicaions. 3.3 Securiy Managemen Componen 3.3. Securiy Policy and Role Managing The main objecive of securiy policy and role manager is o deal wih he various access conrol roles and rules. In he proposed securiy framework, four kinds of roles are defined o handle a reques behavior in he mulimedia applicaions. Firs, as he mos fundamenal feaure in RBAC, subjec roles are defined o recognize he users role in he applicaion. As he 43

57 permissions are graned o he roles, he users wih he same role are permied o perform he same se of operaions. Second, objec roles are faciliaed o conrol he access of no only he source media daa, bu also he embedded objecs and segmens. Third, emporal roles are responsible o conrol he effecive ime of he access funcionaliies. Fourh, spaial roles are designed o resric unauhorized accesses from alien compuers based on he checking of heir IP addresses. By combining all hese four roles, he securiy access policies are defined as access conrol rules. These conrol informaion are designed o be sored in he XML forma so ha hey can be easily rerieved and viewed Securiy Checking Upon receiving an access reques, he securiy checker will firs validae he user ID and Password and idenify is subjec role. As operaion ime and operaor s IP address can be easily obained, he emporal roles and spaial roles are also checked. For he requesed media daa and objecs, he objec roles are considered. Based on he securiy checking resuls, he sysem responds o he reques wih he following hree possibiliies: Firs, he user s access wih cerain media daa is denied; Second, he user is allowed o access and perform cerain operaions for he requesed media daa; Third, he user is allowed o access or operae parial conens of he requesed media daa. Wihin he hird condiion, he sysem will perform media daa processing o hide he resriced pars (objecs, segmens, ec.) from he source media daa and show he processed mulimedia daa o he user Mulimedia Daa Managing and Processing The mulimedia daa manager is responsible of managing he media source daa, along wih heir exraced objecs or segmens. For he purpose of supporing muli-level securiy, mulimedia daa are required o be sored in a hierarchical way. The recen mulimedia daa processing echniques can help in he mulimedia indexing phase o exrac he mulimedia objecs from he source daa. For insance, image segmenaion can help o idenify he image 44

58 objecs; video decoding, sho deecion and scene deecion can assis o achieve meaningful video sho sequences. In addiion, users are allowed o manually idenify and define heir arge mulimedia objecs or segmens. In case of a mulimedia documen conaining resriced objecs, he mulimedia daa manager akes charge o perform he daa processing such ha he resriced pars (e.g. image objecs, video segmens) are hidden while he users are sill capable of viewing he remaining pars of he source media. 3.4 Mulimedia Applicaion and Sysem Inegraion 3.4. DMMManager: Disribued Mulimedia Manager Based on he proposed framework, a disribued mulimedia sysem called DMMManager is developed. DMMManager adops a muli-hreaded clien-server archiecure. In he server-side, an objec relaional daabase called PosgreSQL [PosgreSQL] is employed o sore he media source daa, mea daa, feaures, and he oher informaion. A daabase engine is developed wih C++ o suppor compuaion inensive processes, such as query processing, feaure exracion, media supply, online and offline raining, ec. The clien-side applicaion is developed wih Java, which provides a variey of user-friendly inerfaces for users o issue he mulimedia queries, download he rerieved media files, design and view mulimedia presenaions, ec. DMMManager is capable of supporing a full scope of mulimedia managemen funcionaliies by efficienly inegraing muliple modules ogeher. I is also uilized as a es bed for our recen mulimedia researches. This curren applicaion sores oally 0,000 color images, around 50 videos along wih more han 0,000 video shos. A series of clien-side inerfaces are designed o rank he conen-based image rerieval (CBIR) and conen-based video rerieval (CBVR) query resuls by similariy scores. Moreover, users are allowed o mark he resuling media objec wih posiive or negaive labels. When he number of accumulaed feedbacks reaches a hreshold, he offline raining process will be riggered o refine he underlying affiniy relaionship marix o improve he overall rerieval performance. 45

59 Paricularly, a soccer video rerieval sysem named SoccerQ [ChenSC05a] is developed and inegraed in DMMManager o suppor no only he basic queries bu also he complicaed emporal even / even paern queries for a soccer video daabase. The clien-side inerfaces inegrae he video browsing panels and soccer even query in a common framework. The clienside applicaions can collec he user requess based on heir anicipaed soccer evens wih he associaed emporal relaionships, hen consruc a reques message and send i o he servers. The server-side daabase engine exracs he relaed parameers from he received reques, rerieves he desired video sequences and finally reurns he video clips o cliens. In his research, he SoccerQ sysem is furher updaed and expanded o incorporae new echniques such as video daabase modeling and clusering. 46

60 CHAPTER IV. MULTIMEDIA DATABASE MODELING AND RETRIEVAL This chaper addresses he research issues involved in he mulimedia daabase modeling and rerieval module of DIMUSE, which offers a variey of funcionaliies for users o search for and access heir favorie media files. An innovaive mechanism called Hierarchical Markov Model Mediaor (HMMM) is proposed for managing muliple levels of media objecs. As an example, he mos basic 2-level HMMM model and he associaed rerieval algorihm are inroduced for emporal even paern rerieval. Furhermore, a concepual video clusering sraegy is proposed o improve he overall rerieval performance and reduce he compuaion ime by consrucing he 3 rd level HMMM model. A soccer video rerieval sysem has been developed and employed as a es bed for all hese newly proposed echniques. 4. Inroducion Due o he rapid propagaion of mulimedia applicaions ha require daa managemen, i becomes more desirable o provide effecive mulimedia daabase modeling and rerieval echniques capable of represening and searching he rich semanics in media daa. In he exising conen-based mulimedia rerieval approaches, here are four essenial challenges o be addressed. The firs challenge is o bridge he semanic gap beween he muli-modal visual/audio feaures and he rich semanic, which means ha he users anicipae he daabase sysems o associae heir queries for searching and browsing purposes based on he semanic conceps represened by he digial media daa. The semanic inerpreaions are required o be derived and faciliaed efficienly by uilizing assored mehodologies and echniques from various disciplines and domains, even hough many of hem do no belong o he radiional compuer science fields. The second emerging challenge is o proficienly model and search for he mulimedia objecs by considering heir emporal and/or spaial relaionships. I is anicipaed ha a 47

61 generalized daabase modeling mechanism can be designed o incorporae all he relaed mulimedia informaion o suppor no only he basic rerieval mehods, bu also he complicaed emporal even paern queries (i.e., o rerieve he video clips conaining a user-designed sequence of semanic evens ha follow some specific emporal relaions). Here, semanic even annoaions are used o recognize real-world represenaion of he video shos, also referred o as evens or conceps. Anoher crucial problem is o incorporae high-level user percepions in he daabase modeling and rerieval process. When performing mulimedia rerieval, differen users may evenually have diverse ineress, leading o separae preferences for he anicipaed mulimedia objecs. Therefore, mulimedia summarizaion, rerieval, and ranking should focus on saisfying he individual user s ineres and informaion requiremens. Hence, users percepions need o be aken ino accoun when modeling he underlying daabase and designing he rerieval algorihm. Finally, an addiional imporan research opic is o mine and cluser he mulimedia daa, especially o accommodae he requiremens of video rerieval in a disribued environmen. Wih he recen advances in mulimedia echnologies, he number of mulimedia files and archives increases dramaically. Since he mulimedia daabases may be disribued geographically hrough he local nework or world-wide Inerne, he associaed workloads could be quie expensive when dealing wih complicaed video queries. In paricular, semanic-based video rerieval is muli-disciplinary and involves he inegraion of visual/audio feaures, emporal/spaial relaionships, semanic evens/even paerns, high-level user percepions, ec. Therefore, i is expeced o uilize a concepual daabase clusering echnique o index and manage he mulimedia daabases such ha he relaed daa can be rerieved ogeher and furhermore he communicaion coss in he query processing can be significanly reduced. In his chaper, an inegraed and ineracive framework is proposed for video daabase modeling and rerieval approaches o efficienly and effecively organize, model, and rerieve he 48

62 conen of a large scale mulimedia daabase. In his proposed work, he semanic descripions and user preferences are successfully applied o enhance he performance no only for mulimedia conen managemen bu also daabase clusering and concepual video rerieval. In order o achieve he goal, his newly proposed framework includes a variey of advanced echniques. Firs, for he purpose of daa processing and concep mining, his framework adops muli-disciplinary echniques, such as conen-based image analysis, audio feaure exracion, video sho deecion and segmenaion algorihms, daa mining, and machine learning. Second, he Hierarchical Markov Model Mediaor (HMMM) mechanism is inroduced o efficienly sore, organize, and manage low-level feaures, mulimedia objecs, and semanic evens along wih high-level user percepions (such as user preferences) in he mulimedia daabase managemen sysem (MMDBMS). Third, innovaive feedback and learning mehods are proposed o suppor boh online relevance feedback and offline sysem raining such ha he sysem can learn he common user percepion as well as discover he individual user requiremens. Fourh, a clusering sraegy is also proposed o group video daa wih similar characerisics ino clusers ha exhibi cerain high level semanics. This proposed approach is able o reuse he cumulaed user feedback o perform video clusering, such ha he overall sysem can learn he user percepions and also consruc more efficien mulimedia daabase srucure by adoping he video clusering echnique. For evaluaion purposes, a soccer video rerieval sysem uilizing he proposed framework is developed. 4.2 Overall Framework In general, mulimedia daa and meadaa can be caegorized ino hree groups: eniies, aribues, and values, where he descripion of an eniy is composed of he combinaions of aribues and heir corresponding values. One of he significan characerisics of video daa is ha video eniies may pose various emporal or spaial relaionships. Accordingly, users are normally ineresed in specific semanic conceps and he associaed emporal-based even 49

63 paerns when querying a large scale video archive. However, some of he curren compuer vision and video/audio analysis echniques only offer limied query processing echniques on exual annoaions or primiive low-level or mid-level feaures. Alhough a variey of researches have begun o consider rerieval of semanic evens and he salien objecs, a comprehensive daabase modeling echnique is lacking o suppor he access and query on he emporal-based even paerns. In his sudy, a emporal even paern is defined as a sequence of semanic evens ha follow some specific emporal relaions. Here, a semanic even annoaion is used o mark realworld siuaions of he video sho, also referred o as evens. For insance, in a soccer video, he evens such as goal, corner kick, free kick, foul, goal kick, yellow card, and red card are considered. An example emporal paern query can be expressed as follows: A user wans o search for a specific soccer video segmen wih he following emporal paerns. A firs, a goal even resuling from a free kick happens. Afer ha, a corner kick occurs a some poin in ime, followed by a player change, and finally anoher goal sho even happens. In our earlier sudies, we proposed various approaches in he mulimedia area, especially video daa mining, indexing and rerieval. In [ChenSC03a][ChenSC04a], he mehodologies were proposed o idenify he goal and corner kick evens. Moreover, a emporal query model relaed graphical query language was inroduced in [ChenSC05a] o assure he soccer even queries wih he suppor on emporal relaionships. In his proposed approach, he Markov Model Mediaor (MMM) is exended o he Hierarchical MMM mechanism such ha he muliple-level video eniies and heir associaed emporal or affiniy relaionships can be efficienly modeled o answer his ype of emporal paern query. 50

64 Video Conen Processing Query Translaing Video Sho Deecion Sho Feaure Exracion Daa Cleaning Semanic Even Deecion Graphical Rerieval Inerface Anicipaed Evens or Even Paerns Temporal Relaionships Video Daabase Modeling via HMMM Iniial Query Processing Source Video Daa Affiniy Relaionships beween Videos Numbers of Semanic Evens Video Shos Video Sho Access Paerns / Frequencies Visual/Audio Feaures for Shos Temporal Even or Even Paern Similariy Maching Candidae Video Sho Sequences Ranking Process Videos Iniial Sae Probabiliies Video Shos Iniial Sae Probabiliies Imporance Weighs of Feaures for he Evens Feedback and Training Processing Link condiions beween s and 2 nd levels of MMMs Video Clusers Clusers Iniial Sae Probabiliies Sored Video Clips (Even Paerns) Access Paerns User Saisfied Video Clips (Even Paerns) Access Frequencies Hisory Queries Video Daabase Clusering Video Clusering Offline Training (Long Term) Updae Video Level Affiniy Marix Updae Sho Level Affiniy Marix User Preference Online Learning Consruc/Updae MMM Insances Online Resul Refining Cumulaed User Feedbacks Consruc 3 rd Level MMM for Clusers Updae Iniial Sae Probabiliy Marix, ec. Updaed Resuls Figure IV-. Overall framework of video daabase modeling and emporal paern rerieval uilizing HMMM, online learning, offline raining and clusering echniques As illusraed in Figure IV-, our proposed framework consiss of six major sages. ) The firs sep is o process he video daa by uilizing muli-disciplinary echniques for video sho boundaries deecion and sho feaures exracion. Afer he daa cleaning procedure, daa mining echniques are employed o deec he semanic evens. The algorihms for soccer even deecion can be found in [ChenSC05a]. 5

65 2) Secondly, in he Video Daabase Modeling module, HMMM is employed o model he exraced feaures, deeced evens, segmened video shos along wih he original source daa. The proposed hree-level HMMM model is capable of managing he hierarchical mulimedia objecs (i.e., video clusers, videos, video shos) as well as heir associaed affiniy relaionships. However, i should be noed ha iniially only he firs wo levels of MMM models are consruced. The hird level MMM model is consruced afer video clusering. 3) Once a emporal paern query is issued via he graphical rerieval inerface, he Query Translaor analyzes he user requiremens and encodes he query o a se of expeced evens and heir associaed emporal paerns. 4) These requess are hen sen o he server-side query processing componen (Iniial Query Processing) as inpus. The similariy maching process is hen execued o achieve he candidae video sho sequences and finally hey are sored according o he similariy scores. 5) Wih hese iniial resuls rerieved, users are allowed o choose heir preferred paerns by marking hem as posiive. Wih he online learning mechanism, he sysem can refine he query resuls and rank hem in real-ime capuring a user s specific percepions. Moreover, hese hisorical queries wih feedback are accumulaed in he daabase for fuure usage. As illusraed in he righ-lower box, he HMMM mechanism can be rained by considering he sored user feedback for coninuous sysem learning. The mulimedia sysem raining and learning sraegies will be furher discussed in he nex chaper. 6) Finally, hese hisorical access paerns and frequencies are also uilized in he video clusering mechanism as demonsraed in he lef-lower box. Afer his process, he HMMM-based daabase model can be updaed by adding he hird level of MMM 52

66 model for he generaed video clusers. As he sysem learns user knowledge, all of hese updaes can help o enhance he overall rerieval performance and reduce he compuaion coss. 4.3 Hierarchical Markov Model Mediaor (HMMM) The Markov Model Mediaor (MMM) [Shyu03] is a well-esablished mahemaical consruc capable of modeling complicaed mulimedia daabases and can efficienly collec and repor informaion periodically. MMM has been successfully applied in several applicaions such as conen-based image rerieval [Shyu04b][Shyu04c][Shyu04d] and web documen clusering [Shyu04a]. Definiion IV-: Markov Model Mediaor (MMM) [Shyu03] An MMM is represened by a 5-uple λ = (S, F, A, B, Π), where S is a se of saes which represens disinc media objecs; F includes a variey of disinc feaures; A denoes he saes ransiion probabiliy disribuion, where each enry acually indicaes he relaionship beween wo media objecs, which can be capured hrough he off-line raining processes; B represens he low-level feaure values of media objecs; and Π is he iniial sae probabiliy disribuion, which indicaes he likelihood of a media objec being seleced as he query. Here, a media objec may refer o an image, a salien objec, a video sho, ec., depending on he modeling perspecive and he daa source. A and Π are used o model user preference and o bridge he semanic gap, which are rained via he affiniy-based daa mining process based on he query logs. The basic idea of he affiniy-based daa mining process is ha he more wo media objecs Obj m and Obj n are accessed ogeher, he higher relaive affiniy relaionship hey have, i.e., he probabiliy ha a raversal choice o sae (media objec) Obj n given he curren sae (media objec) is in Obj m (or vice versa) is higher. Deails abou he raining and consrucion processes of he MMM parameers can be found in [Shyu04b]. 53

67 In his research, MMM is exended o muliple level descripions and uilized for video daabase modeling, sorage and rerieval purposes. In paricular, he Hierarchical Markov Model Mediaor (HMMM) is designed o model various levels of mulimedia objecs, heir emporal relaionships, he deeced semanic conceps, and he high-level user percepions. The formal descripion of an HMMM is defined as below. Definiion IV-2: Hierarchical Markov Model Mediaor (HMMM) An HMMM is represened by an 8-uple Λ = ( d, S, F, A, B, Π, O, L), as shown in Table IV-. Table IV-. HMMM is an 8-Tuple: Λ = ( d, S, F, A, B, Π, O, L) Tuple d Number of levels in an HMMM. Represenaion S( S ) The group of mulimedia objec ses in differen levels, where n = o d. n F ( Fn ) The ses of disinc feaures or semanic conceps of he specific mulimedia objecs, where n = o d. The group of sae ransiion probabiliy marices. The higher he enry is, he A ( An ) igher he relaionship ha exiss beween he arge objecs, where n = o d. The group of feaure/concep marices of differen-level MMMs, where n = B ( Bn ) o d. Π ( Π n ) The iniial sae probabiliy disribuions, where n = o d. The weighs of imporance for he lower-level feaures in F O ( O i, i + Fi + Fi ) i when describing he higher level feaure conceps in F i+, where i = o d-. Link condiions beween he higher level saes and he lower level saes, L ( L i,i + ) where i = o d-. Each of he MMM models incorporaes a se of marices for affiniy relaionships, feaures/conceps, and iniial sae probabiliy disribuions. Le S n denoe he size of S n, which means he number of he n h level MMMs (or sae ses). S = S }, where g S. Here, n { g n n g S n represens he sae se of he g h MMM in he n h level. Since he modeling descripions of he MMM models in each level are he same 54

68 and o simplify he noaion, S n is generically used o represen one member in he se of saes in he curren MMM model of ineres and hus g is ignored. S n, i.e., A = A }, where g S. A ( A S S ) is designed as he affiniy marix n { g n n g n g n for he g h MMM in he n h level. I describes he affiniy relaionship beween pairs of saes in g S n. Similarly, A n is generically used o represen any member in g n g n A n. B = B }, where g S. B B S F ) conains he feaure values or n { g n n g g ( g n n n n number of semanic evens for he saes in g S n. Similarly, B n is generically used o represen any member in B n. Π = Π }, where g S. in n { g n n g Π n includes he iniial sae probabiliies for he saes g S n. Similarly, Π n is generically used o represen any member in Π n. 3 rd Level MMM Model for Clusers S g 3 ( i) 2 nd Level MMM Models for Videos S g 2 ( i) s Level MMM Models for Video Shos S g ( i) Figure IV-2. Three-level consrucion of Hierarchical Markov Model Mediaor In his proposed approach, we uilize a hree-level HMMM model o manage he hierarchical video daabase. As demonsraed in Table IV-2, he MMM models of differen levels 55

69 in he 3-level HMMM describe disinc objecs and represen differen meanings. Though he general descripion is he same, he marices in differen levels represen slighly dissimilar meanings o reflec he various naures of disinc mulimedia objecs. In he firs level MMM (d = ), he saes represen he video shos, which are he elemenary unis in he video daabase o describe he coninuous acion beween he sar and he end of a camera operaion. The feaure se (F ) consiss of low-level or mid-level visual/audio feaures. In he second level MMM (d = 2), he saes describe he se of videos in he daabase, and he feaure se (F 2 ) conains he semanic evens deeced in he video collecion. While in he hird level MMM (d = 3), he saes represen he se of video clusers. Table IV-2. 3-level HMMM model s Level MMM 2 nd Level MMM 3 rd Level MMM S Sae se of video shos Sae se of Videos Sae se of video clusers F Low level visual/audio feaures Semanic evens (conceps) - A Temporal based sae ransiion probabiliy beween video shos Affiniy relaionship beween videos Affiniy relaionship beween video clusers B Formalized feaure values Annoaed even numbers - Π Iniial sae probabiliy disribuion for video shos Iniial sae probabiliy disribuion for videos Iniial sae probabiliy disribuion for video clusers 4.4 Two-level HMMM Model In his secion, he firs wo levels of he HMMM model are consruced in he beginning o model he source video and heir associaed video shos. More specifically, he fundamenal level of he MMM model consiss of a series of consecuive video shos. I needs o be noed ha he evens are referred o as sho-level video clips in his research. I is merely a choice of represenaion raher han a saemen abou he acual duraion of a specific even. Thus, one local MMM is designed for he video shos in each video; while he second level MMM models 56

70 are consruced o model he videos in a cluser or daabase and hus hey incorporae all he corresponding lower level MMM models Video sho level MMM As we saed before, he marices for affiniy relaionship, feaure, and iniial sae probabiliy disribuions a differen levels may hold slighly dissimilar meanings alhough he general depicions are he same. In he mos fundamenal level, he saes (S ) represen he video shos, which are he elemenary unis in he video daabase and describe he coninuous acion beween he sar and end of a camera operaion. The feaure se (F ) for he video sho level MMM consiss of low-level or mid-level visual/audio feaures A : emporal-based relaive affiniy marix A represens he emporal-based affiniy relaionship beween he video shos in he video sho-level MMM. Le S ( ) and S ( ) (where 0 < i < j) represen wo specific video shos i j wih cerain semanic evens, and if hey are frequenly accessed ogeher in one emporal even paern, hey are said o have a higher affiniy relaionship. Hence, heir emporal based affiniy relaionship value from S ( ) o S ( ) will be larger. i j ) Iniializaion of A Assume here are N video shos {s, s 2,, s N } in video v, and all of hese video shos follow he emporal sequence, i.e., T < T <... < T s s2 sn, where T is he occurrence ime of video si sho s i. When searching for he emporal even paern, he sysem will search for he video shos by following heir emporal sequences. If he sysem goes from sae s i o sae s j, i mus follow he rule of T T. Therefore, for sae s i, here are (N-i+) possible saes ha he sysem will s j si ransi o. Accordingly, A can be iniialized as follows. ( N i + ); where i Ν, j Ν, j i. A ( i, j) = (IV-) 0; where j < i. 57

71 2) Updae of A By adoping HMMM, users are allowed o provide heir feedback o he sysem. The video sho sequences similar o he anicipaed emporal even paern will be marked as Posiive paerns which are used o capure he user preferences o refine he sysem rerieval capabiliy for he fuure updae. A marix AF is defined o capure he emporal-based affiniy relaionships among all he annoaed video shos using user access paerns and access frequencies. For he y h paern R y, access (y) represens is access frequencies, and use (i, y) equals if s i (he i h video sho) was accessed in he y h paern R y. Moreover, boh s m and s n should belong o his Posiive emporal paern, and s m should occur before s n or hey should occur a he same ime. Le q be he number of posiive paerns on he sho level, AF can be calculaed as below: aff q ( m, n) = A ( m, n) use ( m, y) use ( n, y) access ( y), y= (IV-2) where s R, s R, T T. m y n y sm sn Each enry of aff ( m, ) in AF indicaes he frequency of s m and s n being accessed n ogeher in he firs level MMM, and consequenly he probabiliy of hese wo video shos being accessed ogeher in he emporal paerns. A can hen be updaed via normalizing AF per row and hus MMM represens he relaive affiniy relaionships among all he video clips in he daabase. Le A ( m, ) be he elemen in he (m, n) enry in he firs level MMM, hen n aff ( m, n) A ( m, n) =. (IV-3) aff ( m, j) N j = For he sake of efficiency, he raining sysem can only record all he user access paerns and access frequencies during a raining period, insead of updaing A marix on-line every ime. Once he number of newly achieved feedbacks reaches a cerain hreshold, he updae of A marix can be riggered auomaically. All he compuaions should be done offline. 58

72 Table IV-3. Feaure lis for he video shos Caegory Feaure Name Feaure Descripion grass_raio Average percen of grass areas in a sho Visual Feaures pixel_change_percen hiso_change background_var Average percen of he changed pixels beween frames wihin a sho Mean value of he hisogram difference beween frames wihin a sho Mean value of he variance of background pixels background_mean Mean value of he background pixels volume_mean Mean value of he volume volume_sd Sandard deviaion of he volume, normalized by he maximum volume volume_sdd Sandard deviaion of he difference of he volume volume_range Dynamic range of he volume, defined as (max(v)-min(v))/max(v) energy_mean Mean RMS energy sub_mean Average RMS energy of he firs sub-band Audio Feaures sub3_mean energy_lowrae sub_lowrae sub3_lowrae sub_sd Average RMS energy of he hird sub-band Percenage of samples wih RMS power less han 0.5 imes he mean RMS power Percenage of samples wih RMS power less han 0.5 imes he mean RMS power of he firs sub-band Percenage of samples wih RMS power less han 0.5 imes he mean RMS power of he hird sub-band Sandard deviaion of he mean RMS power of he firs sub-band energy sf_mean sf_sd sf_sdd sf_range Mean value of he Specrum Flux Sandard deviaion of he Specrum Flux, normalized by he maximum Specrum Flux Sandard deviaion of he difference of he Specrum Flux, which is normalized oo Dynamic range of he Specrum Flux. 59

73 B : visual/audio feaure marix We consider boh he visual and audio feaures in he feaure marix B for he video sho level MMM consrucions. As shown in Table IV-3, here are a oal of 5 visual and 5 audio feaures [ChenSC03a]. ) Normalizaion of B The iniial values of he feaures need o be normalized o achieve more accurae similariy measures. To capure he original value of a feaure in a video sho, we define a emporal marix BB whose rows represen he disinc video shos while he columns denoe all he disinc feaures. The enry of BB (i, k) denoes he original value of he k h feaure of he i h video sho, where k K, K is number of feaures and i N, N is he number of video shos. Our arge is o normalize all of he feaures o fall beween [0, ]: BB ( i, k) min( BB ( j, k)) j= B ( i, k) =, where i N, k K. N N max( BB ( j, k)) min( BB ( j, k)) j= N j= (IV-4) Π : iniial sae probabiliy marix for shos The preference of he iniial saes for queries can be achieved from he raining daa se. For any video sho sae sm S, he iniial sae probabiliy is defined as he fracion of he number of occurrences of video sho s m as he iniial sae can raverse wih respec o he oal number of occurrences for all he iniially raversed video sho saes in he video daabase from he raining daa se. The Π can hus be consruced as below, where π m is defined as he iniial sae probabiliy for video sho s m. N y = use ( m, y) Π = { π m } =. (IV-5) use ( l, y) N l S y = 60

74 4.4.2 Video-level MMM The purpose of consrucing video-level MMM is o cluser he videos describing similar evens. A large video archive may conain various kinds of videos, such as news videos, movies, adverisemen videos, and spors videos. The second level MMM is consruced such ha he sysem is able o learn he semanic conceps and hen cluser he videos ino differen caegories A 2 : relaive affiniive marix for videos Based on he informaion conained in he raining daa se, he affiniy relaionships among he video ses in he daabase can be capured, i.e., he higher he frequency of wo videos being accessed ogeher, he closer hey are relaed o each oher. The relaive affiniy marix A 2 is consruced in wo seps as follows: Firs, a marix AF 2 is defined o capure he affiniy measures among all he videos by using user access paerns and access frequencies. Afer ha, each enry aff ( m, ) in AF 2 2 n indicaes he frequency of he wo videos v m and v n being accessed ogeher in he 2 nd level MMM, and consequenly how closely hese wo videos are relaed o each oher. Le q be he number of queries on he video level. aff 2 ( y y q' m, n) = = use ( m, y) use ( n, y) access ( ). (IV-6) The marix A 2 can hen be obained via normalizing AF 2 per row and hus represens he relaive affiniy relaionships among all he M videos in he daabase (D). aff 2 ( m, n) A2 ( m, n) =, where m M and n M. aff ( m, j) M j = 2 (IV-7) Please noe ha A and A 2 are differen since A considers he emporal relaionships as well, while A 2 does no. 6

75 B 2 : even number marix for videos Marix B 2 includes he even numbers of each video, where each row represens a video and each column denoes one semanic even. Assume here are a oal of M videos in he daabase, where he video v i ( i M) conains he se of C evens denoed as {e, e 2,, e C }, and B ( i, ) means he number of he j h even (e j ) in v i. B 2 does no need o be normalized and 2 j he ineger values are kep Π 2 : iniial sae probabiliy marix for videos In he video-level, he access paerns and access frequencies for videos in use 2 (insead of use ) are used o consruc he marix Π Connecions beween firs level MMMs and second level MMM O,2 : weigh imporance marix The weigh imporance marix (O,2 ) is required o denoe he relaionship beween he feaures for video shos and he specific semanic evens. This marix is uilized o adjus he characerisic influences by learning he feaures of he annoaed evens. In O,2, each row represens an even concep, while each column represens a feaure. The value in O,2 means he weigh of imporance of he corresponding feaure for he specific even concep. ) Iniializaion of O,2 Le each mulimedia objec have K feaures {f, f 2,, f K } and C evens {e, e 2,, e C }. We define he iniial value for each feaure in an even concep o be /K, which means hey carry he same weigh imporance. O i, j) =, where i C, j. (IV-8) K,2 ( K 2) Updae of O,2 Once a group of N video shos {s, s 2,, s N } consising of he same even concep e i ( i C) are known, he sandard deviaions of he K feaures for all he N video shos can be 62

76 calculaed as {Sd i,, Sd i,2,, Sd i,k }, where Sd i,k represens he sandard deviaion of he i h even and k h feaure ( i C, k K). Equaions (IV-9)-(IV-) can be employed o compue O,2. The larger he O,2 value is, he more imporan his feaure is when calculaing he similariy score wih he specified even. O' ( i, k) =, where i C, k K. (IV-9) Sd i, k O,2 O' ( i, k) ( i, k) = ; and (IV-0) K O' ( i, k) k = K O,2 ( i, k) = ( ) /( ). (IV-) Sd i k = Sd, k i, k B : mean value of he feaures per even In marix B, he row represens an even (concep), and he column denoes he visual and audio feaures. Assume ha for he even e i ( i C), a se of N video shos s, s,..., s } { 2 N are idenified as e i, where hese video shos are no necessarily consecuive shos. Le B ( s j, f ) k represen he normalized value for video sho s j and feaure f k, he mean value of he feaures f k ( k K) for e i can be calculaed as follows. N B (, ) s j f k ' j B ( ei, f k ) = =, where i C, k K. N (IV-2) L,2 : link condiions marix To faciliae he connecions beween he local MMM model and he second level MMM model, he link condiions marix L,2 is designed. Le {v, v 2,, v M } be he M videos and {s, s 2,, s N } be he N video shos, if s j belongs o v i, L,2 (v i, s j ) = (where i M, j N). Oherwise, L,2 (v i, s j ) = 0. 63

77 4.4.4 Iniial Process for Temporal Even Paern Rerieval Given a emporal paern wih C evens Q = e, e,..., e } sored by he emporal { 2 C relaionships such ha T e T... T e e C, he iniial rerieval process is presened as below. Here, we assume here are M videos {v,, v M } in he mulimedia daabase archive, and here are oal K non-zero feaures {f, f 2,, f K } of he query sample. Here, K 20 since 20 feaures are used. Wihou any online feedback or video clusers, he iniial rerieval process includes he following seps. Sep. Iniializes he flag parameers as i=, =, and y=. Sep 2. Checks marix B 2 and/or marix A 2 o search for video v i which conains even e. This video should have a close affiniy relaionship wih he previous video if i is available. Sep 3. Checks he link condiion marix L,2 and/or marix A o find he specified video sho s which is annoaed as even e or similar o even e. This video sho should also have a srong connecion o he previous video sho. Sep 4. Calculaes he edge weigh w s, e ) using Equaions (IV-3) and (IV-4), which is ( defined as he edge weigh from he curren sae s o he arge even e a he evaluaion of he k h feaure (f k ) in he query, where k K and C. When < C : A =, w s, e ) = Π ( s ) sim( s, ). (IV-3) ( e w s, e ) w ( s, e ) A ( s, s ) sim( s, e ). (IV-4) + ( + + = Equaion (IV-5) defines he similariy funcion o measure he similariy beween s and e based on all of he non-zero feaures in f, f,..., f }. { 2 K 64

78 sim( s K ( B ( s, f k ) B '( e, f k ) ), e ) = ( O,2 ( e, f k ) ), k= B '( e, f ) (IV-5) where s S, k K, C. In each raversal, he sysem will choose he opimized pah o access he nex possible video k sho saes similar o he anicipaed evens. A he end of one video, he nex possible video candidae will be seleced by checking he higher-level affiniy and feaure marices. Sep 5. = +. If > C, all he evens in his paern have been raversed and herefore he similariy score of he whole candidae paern should be compued as indicaed in Sep 6. Oherwise, he sysem goes o Sep 3 o coninue checking he nex video sho candidae which mos closely maches he nex even. Noe ha he raversal pah should be recorded in he whole process. Sep 6. Assumes a candidae video sho sequence is defined as R = s, s,..., s } y { 2 C, he final similariy score can be calculaed as: C SS( Q, Ry ) = = w ( s, e ). (IV-6) Sep 7. i = i+; y = y+. Checks if i > M. If yes, all he candidae video ses are checked and he sysem goes o Sep 8. If no, he sysem goes o Sep 2 and checks marices A 2 and B 2 o find he nex video candidae. Sep 8. There are y- candidae paerns. The sysem ranks he candidae video sho sequences according o he similariy scores. Sep 9. Finally, a lis of y- sored video sho sequences is rerieved as he oupu. Free Kick & Goal Corner Kick Player Change Goal Figure IV-3. An example resul of a emporal paern query 65

79 Figure IV-4. HMMM-based soccer video rerieval inerface As illusraed in Figure IV-3, he key frames of a se of rerieved emporal even paerns are displayed below he emporal paern query o show an example of he resuls. A soccer video rerieval sysem has been developed for he evaluaion of he proposed approach. In he curren approach, he proposed HMMM mechanism is uilized o model he mulimedia daabase. Two levels of MMM models are consruced o model 54 soccer videos which are segmened ino,567 video shos. Among hese video shos, 506 of hem are annoaed as semanic evens. Figure IV-4 shows he clien-side inerface of he sysem, where he lef-boom par shows he ineracive panels where a user can issue he queries. The righ side panel demonsraes he resuling paerns sored by heir similariy scores. In his case, he arge paern is issued wih a goal sho followed by a free kick, and herefore 8 paerns (including 6 shos) are displayed, where he magena box marked he 3 rd paern. The lef-upper panel displays he video sho which is chosen by he user. Finally, by using he drop down menu below he key frames, users are able o selec heir preferred video shos/paerns, and heir feedback can be sen back o he server-side for furher improvemen of he rerieval performance. 66

80 4.5 Video Daabase Clusering and Consrucion of 3rd Level MMM In his secion, an inegraed and ineracive video rerieval framework is proposed o efficienly organize, model, and rerieve he conen of a large scale mulimedia daabase. The core of our proposed framework is a learning mechanism called HMMM (Hierarchical Markov Model Mediaor) [Zhao06a] and an innovaive video clusering sraegy [Zhao06b]. HMMM models he video daabase, while he clusering sraegy groups video daa wih similar characerisics ino clusers ha exhibi cerain high level semanics. The HMMM srucure is hen exended by adding an addiional level o represen he clusers and heir relaionships. The proposed framework is designed o accommodae advanced queries via considering he high level semanic meaning. Firs, i is capable of searching semanic evens or even paerns considering heir populariy by evaluaing heir access frequencies in he large number of hisorical queries. Second, he users can choose one or more example paerns wih heir anicipaed feaures from he iniial rerieved resuls, and hen issue he nex round of queries. I can search and re-rank he candidae paerns which involve similar aspecs wih he posiive examples reflecing he user s ineress. Third, video clusering can be conduced o furher reduce he searching ime especially when dealing wih he op-k similariy rerievals. As he HMMM mechanism helps o raverse he mos opimized pah o perform he rerieval, he proposed framework can only search several clusers for he candidae resuls wihou raversing all he pahs o check he whole daabase Overall Workflow Figure IV-5 demonsraes he overall workflow of he proposed framework. In his framework, he soccer videos are firs segmened ino disinc video shos and heir low-level video/audio feaures are exraced. A mulimedia daa mining approach is uilized o pre-process he video shos o ge an iniial candidae pool for he poenial imporan evens. Afer ha, a se of iniial even labels will be given o some of he shos, where no all of hese labels are correc. 67

81 All he daa and informaion will be fed ino his framework for even paern searching and video rerieval purposes. The videos included in he candidae pool are modeled in he s level of MMM (Markov Model Mediaor) models, whereas all he videos are modeled in he 2 nd level. Afer iniializing he s level and 2 nd level of he MMM models, users are allowed o issue even or even paern queries. Furhermore, users can selec heir even paerns of ineres in he iniial resuls and re-issue he query o refine he rerieval resuls and heir rankings. This sep is also recognized as online learning. These user-seleced sho sequences are sored as posiive paerns for fuure offline raining. Video Sho Segmenaion Sho Feaure Exracion Iniialize he s level and 2 nd level MMM models Daa Cleaning Even Deecion Updae he s level and 2 nd level MMM models Even/Paern Queries Feedback Accumulae Consruc he 3 rd level MMM models and Updae he 2 nd level MMM models Offline Training Video Clusering Provide enhanced performance for he Top-k even paern rerieval Figure IV-5. Overall workflow for he proposed approach Afer a cerain number of queries and feedback, he proposed framework is able o perform he offline raining. The hisorical queries and user access records are uilized o updae he affiniy relaionships of he videos/video shos as well as heir iniial sae probabiliies. Thereafer, boh he semanic evens and he high level user percepions are employed o 68

82 consruc he video clusers, which are hen modeled by a higher level (3 rd level) of he MMM model. In he meanwhile, he 2 nd level MMM models are divided ino a se of sub-models based on he clusered video groups. The clusered daabase and he updaed HMMM mechanism are capable of providing appealing mulimedia experience o he users because he modeled mulimedia daabase sysem learns he user s preferences and ineress ineracively by reusing he hisorical queries Concepual Video Clusering Similariy Measuremen In his proposed framework, a video is reaed as an individual daabase in a disribued mulimedia daabase sysem, where is video shos are he daa insances in he daabase. Accordingly, a similariy measure beween wo videos is defined as a value indicaing he likeness of hese wo videos wih respec o heir concepual conens. I is calculaed by evaluaing heir posiive evens and even paerns in he hisorical queries. If wo videos consis of he same even(s) and/or even paern(s) and are accessed ogeher frequenly, hey are considered closely relaed and heir similariy score should be high. Assume here are H user queries issued hrough he video rerieval framework, where he se of all he query paerns is denoed as QS. In order o refine heir rerieved resuls in real-ime, he users mark heir preferred even paerns as posiive before making he nex query. By evaluaing he issued query ses and heir associaed posiive paerns, he similariy measure is defined as follows. Le v i and v j be wo videos, and X={x,, x m } and Y={y,, y n } be he ses of video shos belonging o v i and v j ( X vi, Y v j ), where m and n are he numbers of annoaed video shos in v i and v j. 69

83 Denoe a query wih an observaion sequence (semanic even paern) wih C semanic k k k { 2 k C evens as Q = e, e,..., e }, where Q k QS. Le R k be he se of G posiive paerns ha a user has seleced from he iniial rerieval resuls for query Q k. This can be represened by a marix of size G C, G, C. As shown in Equaion (IV-7), each row of R k represens an even sho sequence ha he user marked as posiive, and each column includes he candidae even shos which correspond o he requesed even in he query paern. { s, s 2,..., sc} k { s, s2,..., sc} R =. (IV-7)... G G G { s, s2,..., sc } Based on he above assumpions, he video similariy funcion is defined as below. Definiion IV-3: SV(v i, v j ), he similariy measure beween wo videos, is defined by evaluaing he probabiliies of finding he same even paern query paerns in QS. where k H, and (H ) occurrence probabiliies of finding k Q from v i and v j in he same query for all he k k ( v ) P( Q v ) FA( ) SV( vi, v j ) = P Q i j H k Q QS FA is an adjusing facor. P ( ) and ( ) Q k v i. (IV-8) P Q k v j represen he k Q from v i and v j, where he occurrence probabiliy can be obained by summing he join probabiliies over all he possible saes [Rabiner93]. In order o calculae his value, we need o selec all he subses wih C even shos from he posiive paern se R k, which also belong o v i or v j. Tha is, X = { x ', x ',..., x '} and Y = { y ', y ',..., y '}, ' 2 C ' 2 C where k k X ' X, X ' R, Y ' Y, Y ' R. If hese paerns do no exis, hen he probabiliy value is se as 0 auomaically. 70

84 k k k ( Q v ) = P( Q, X ' v ) = P( Q X ', v ) P( X ' v ) P. (IV-9) i all X ' i all X ' i i Assume he saisical independence of he observaions, and given he sae sequence of X = { x ', x ',..., x '}, Equaion (IV-20) gives he probabiliy of X ' given v i. ' 2 C P Here, ( ' ' ) C C ( X v ) = P( x ' x ') P( x ') = A ( x ', x ') Π ( ') + ' i + x = = +. (IV-20) P x x represens he probabiliy of rerieving a video sho x + ' given ha he curren video sho is x '. I corresponds o he A ( x ', x + ' ) enry in he relaionship marix. ( x ') P is he iniial probabiliy for video sho x ', i.e., Π ( '). Equaion (IV-2) gives he probabiliy of an observaion sequence (semanic even paern) x k Q. C k ( X, v ) = P Q i = k ' P( e x '). (IV-2) k where P e x ') indicaes he probabiliy of observing a semanic even ( k e from a video sho This value is compued by using a similariy measure by considering low-level and mid-level feaures. However, in his approach, since he users have already marked hese video shos as he evens hey requesed and preferred, he probabiliy of observing he semanic evens is simply reaed as. x ' Clusering Sraegy Considering a large scale video daabase, i is a significan issue o cluser similar videos ogeher o speed up he similariy search. As we saed before, a wo-level HMMM has been consruced o model video and video shos. Furhermore, a video daabase clusering sraegy which is raversal-based and greedy is proposed. 7

85 As illusraed in Figure IV-6, he proposed video daabase clusering echnique conains he following seps. Given he video daabase D wih M videos and he maximum size of he video daabase cluser as Z (Z 2), he mechanism: a) Iniializes he parameers as p=0; n=0, where p denoes he number of videos being clusered, and n represens he cluser number. b) Ses n = n +. Searches he curren video daabase D for he video v i wih he larges saionary probabiliy Π 2 (v i ), and hen sars a new cluser CC n wih his video (CC n = { }; CC n CC n {v i }). Iniializes he parameer as q=, where q represens he number of videos in he curren cluser. c) Removes v i from daabase D (D D-{v i }). Checks if p = M. If yes, oupu he clusers. If no, goes o sep d). d) Searches for v j, which has he larges A ( v, v ) SV ( v, v ) in D. Adds v j o he curren 2 i j i j cluser CC n (CC n CC n {v i }). e) v i v j, where v i represens he mos recen clusered video. Every ime when a video is assigned o a cluser, i is auomaically removed from D (D D {v i }). f) p++ and q++. Checks if p=m. If yes, oupus he clusering resuls. If no, checks if q=z. If yes, goes o sep b) o sar a new cluser. If no, goes o sep d) o add anoher video in he curren cluser. g) If here is no un-clusered video lef in he curren daabase, i oupus he curren clusers Consrucing he 3 rd level MMM model In his research, he HMMM model is exended by he 3 rd level MMM o improve he overall rerieval performance. In he 3 rd level MMM (d = 3), he saes (S 3 ) denoe he video clusers. Marix A 3 describes he relaionships beween each pair of clusers. 72

86 Definiion IV-4: Assume CC m and CC n are wo video clusers in he video daabase D. Their relaionship is denoed as an enry in he affiniy marix A 3, which can be compued by Equaions (IV-22) and (IV-23). Here, SC is he funcion ha calculaes he similariy score beween wo video clusers. SC( CCm, CCn ) = ( Π2 ( vi ) max ( A2 ( vi, v j ) SV( vi, v j ))) / M, (IV-22) v CC i m v CC j n where CC m D, CC D. n SC( CCm, CCn) A3 ( CCm, CCn) =. (IV-23) SC( CC, CC ) CC D j m j Inpus: () D: daabase including M videos (2) Z: Maximum size of a cluser (Z 2) p=0; n=0; n++; sar a new cluser: CC n {}; CC n CC n {v i }; q=; p++; where v i has he larges Π 2 (v i ) value in D Y p=m? N D D-{v i }; N q=z? Y Add v j o CC n : CC n CC n {v j } ; where v j has he larges value of A ( v, v ) SV ( v, v ) in D 2 i j i j v i v j ; D D-{v i }; p++; q++; p=m? N Y Oupus: () n: Number of video clusers (2) CC k ( k n): video clusers Figure IV-6. The proposed concepual video daabase clusering procedure 73

87 The marix Π 3 can be consruced o represen he iniial sae probabiliy of he clusers. The calculaion of Π 3 is similar o he ones for Π and Π 2. In addiion, marix L 2,3 can also be consruced o illusrae he link condiions beween he 2 nd level MMMs and he 3 rd level MMM Ineracive Rerieval hrough Clusered Video Daabase { 2 C Given an example sho sequence Q = { s, s,..., s } which represens he even paern as ' 2 C e, e,..., e } such ha s i describes e i ( i C), and hey follow he emporal sequence as T s T... T s 2 s C. Assume ha a user wans o find op-k relaed sho sequences which follow similar paerns. In our proposed rerieval algorihm, a recursive process is conduced o raverse he HMMM daabase model and find he op k candidae resuls. As shown in Figure IV-7, a laice-based srucure for he overall video daabase can be consruced. Assume he ransiions are sored based on heir edge weighs [Zhao06a], and he rerieval algorihm will raverse he edge wih a higher weigh each ime. For example, in Figure IV-7, we assume ha he edge weighs saisfy w(s, s 2 ) w(s, s 4 ) w(s, s 7 ). The algorihm can be described as below.. Searches for he firs candidae cluser, firs candidae video and firs candidae video sho by checking marices Π 3, Π 2, B 2, Π and B. 2. If he paern is no complee, coninues search for he nex even (video sho) via compuing he edge weighs by checking A. 3. If he candidae paern has been compleed, goes back sae by sae and checks for oher possible pahs. Also checks if here are already k candidae paerns being rerieved. If yes, sops searching and goes o Sep If here are no more possibiliies in he curren video, hen marks his video wih a searched flag and check A 2 and B 2 o find he nex candidae video. 5. If all he videos are searched in he curren cluser, hen marks he curren cluser as searched cluser and check A 3 o find he nex candidae video cluser. 74

88 6. Once k paerns are rerieved, or here are no more possibiliies in he daabase, ranks he candidae paerns via calculaing he similariy scores [Zhao06a] and oupus he candidae paerns. Cluser Video T : Even T 2 : Even 2 T 3 : Even 3 CC v s s 2 s 3 s 9 s 4 s 5 s 7 s 6 s 8 v 2 s 0 CC 2 v 3 s s 2 s 3 Candidae video cluser Candidae video sae Video sho sae which maches he expeced even Transiion which maches he expeced emporal even Transiion which goes o search he nex candidae video Figure IV-7. Laice srucure of he clusered video daabase R Cluser Video Even Even 2 Even 3 CC v s s 2 s 3 2 CC v s s 4 s 5 3 CC v s s 4 s 6 4 CC v s s 7 s 8 5 CC v s 9 s 7 s 8 6 CC 2 v 3 s s 2 s 3 Figure IV-8. Resul paerns and he raverse pah As shown in he Figure IV-8, he yellow cells include he pahs he algorihm has raversed. Furhermore, we designed a funcion o fill in he missed cells by copying he corresponden shos in he previous candidae paerns. Finally, six complee candidae paerns are generaed. Once k candidae paerns are generaed, he sysem does no need o raverse any 75

89 oher clusers or videos. Therefore, i significanly reduces he searching spaces and acceleraes he searching speed. 200 Execuion Time Comparison Average Execuion Time (milliseconds) wihou Clusers wih Clusers G C F FG CG GC CGF FCG FGC Query Paern Figure IV-9. Comparison of he average execuion ime Experimenal Resuls for Video Clusering We have buil up a soccer video daabase wih 45 videos which conain 8977 video shos. A rerieval sysem has also been implemened for he sysem raining and experimenal ess. Toally, 50 ses of hisorical queries were issued and user feedbacks were reurned wih heir preferred paerns, which cover all of he 45 videos and 259 disinc video shos. In he clusering process, we defined he cluser size as 0 and he expeced resul paern number as k=60. As shown in Figure IV-9, we use leers G, F, and C o represen Goal, Free kick, Corner kick evens, respecively. Therefore, he x-axis represens differen query paerns, e.g., G means a query o search for Goal Evens; FG means a query o search for he even paern where a Free kick followed by a Goal ; and CGF means a query paern of a Corner kick even, followed by a Goal and hen a Free kick, ec. For each query paern, we issued 0 queries o compue he average execuion ime in milliseconds. 76

90 (a) (b) Figure IV-0. Soccer video rerieval sysem inerfaces (a) query over non-clusered soccer video daabase (b) query over clusered soccer video daabase As illusraed in Figure IV-9, he query paerns wih fewer even numbers will be execued in less ime as expeced. In addiion, he execuion ime of he sysem wih clusers is less han ha of he sysem wihou clusers, indicaing ha our proposed approach effecively groups relevan videos in he video clusers so ha only he relevan clusers and heir member 77

91 videos will need o be searched. Therefore, he searching space is dramaically decreased, and he execuion of he queries becomes faser. For he query paern ( Corner kick followed by a Goal ), Figure IV-0(a) demonsraes he firs screen of rerieval resuls over he non-clusered soccer video daabase; while Figure IV-0(b) shows he query resuls over he clusered soccer video daabase. I can be clearly seen ha he query resuls in he same cluser represen he similar visual clues, which are mined from he hisorical queries/feedbacks, and correspondingly represen user preferences. 4.6 Conclusions In his chaper, an HMMM-based mulimedia daa modeling mechanism is proposed o develop a user-ineracive mulimedia rerieval framework. User feedbacks are adoped in his mechanism o perform boh hierarchical learning and concepual-based video clusering. Specifically, he definiion of HMMM is formalized in his chaper, while he consrucion and basic learning mehods are also given for each of he hree levels in HMMM. Furher, several ses of rerieval procedures and ranking algorihms are designed and presened in deail o mee differen condiions of video daabase, i.e., daabase before clusering, clusered daabase, ec. A soccer video rerieval sysem is developed and employed for he experimenal ess in he differen sages of he whole process. The resuls show ha our proposed approach helps accelerae he rerieval speed while providing decen rerieval resuls. The major conribuions of his proposed research include he following aspecs. Firs, he HMMM mechanism offers a hierarchical srucure o assis he proficien consrucion of a highdimensional mulimedia daabase. I also helps o bridge he semanic gap beween he concepbased and he conen-based rerieval approaches o he comprehensive mulimedia daabase modeling. The emporal relaionship beween he semanic evens is naurally incorporaed in HMMM such ha complicaed emporal paern queries can be execued. Second, his framework inegraes he feedback and learning sraegies by considering no only he low-level visual/audio 78

92 feaures, bu also he high-level semanic informaion and user preferences. In addiion, he proposed framework is designed o accommodae advanced queries via considering he high level semanic meaning. Finally, he video clusering can be conduced o furher reduce he searching ime especially when dealing wih he op-k similariy rerievals. As he HMMM mechanism helps o raverse he mos opimized pah o perform he rerieval, he proposed framework can only search several clusers for he candidae resuls wihou raversing all he pahs o check he whole daabase. Hence, more accurae paerns can be rerieved quickly wih lower compuaional coss. I is worh menioning ha his approach suppors no only offline raining, bu also online learning. In his chaper, we only inroduce he basic offline raining mehod, which ries o updaed A and П marices based on a large number of hisorical queries and feedbacks from muliple users. The overall rerieval performance can be refined coninuously o gain long erm benefis. However, his mehod has is own disadvanages: his offline raining mehod lacks efficiency, canno mee individual user preferences, and needs a manual process o rigger. In fac, his research framework can suppor more powerful online learning mehods and he offline raining mehod can also be enhanced. Furher deails will be invesigaed and discussed in Chaper V. 79

93 CHAPTER V. MULTIMEDIA SYSTEM TRAINING AND LEARNING Semanic rerieval of media objecs may well exend exual consequens o include all forms of mulimedia. However, i is a challenging ask for a mulimedia sysem o perform conen-based rerieval on muli-dimensional audio/visual daa, and i is even harder o refine he rerieval resuls ieraively and ineracively based on user preferences. This chaper mainly discusses he sysem learning mechanisms, which conain boh offline sysem raining and online relevance feedback [ChenSC07]. Firs, an innovaive mehod is proposed o auomae he offline sysem raining by using he associaion rule mining mehod. Second, online relevance feedback is hen inroduced o updae he HMMM daabase model and provide refined resuls in real ime. Finally, his chaper addresses he learning issues in designing and implemening a user adapive video rerieval sysem, called MoVR, in a mobile wireless environmen. Paricularly, HMMM-based user profiles are designed and developed for learning individual user preferences as well as general user percepions. The fuzzy associaion concep is uilized in he rerieval and ranking process such ha users can ge he flexibiliy o achieve heir anicipaed refining resuls in erms of differen knowledge ses. 5. Inroducion Users are usually ineresed in specific semanic conceps and/or he associaed emporalbased even paerns when querying a large scale video archive. In his sudy, a emporal even paern is defined as a series of semanic evens wih some paricular emporal relaions. In soccer video rerieval, an example emporal even paern query can be expressed as Search for hose soccer video segmens where a goal resuls from a free kick. Using he algorihm proposed in he previous chaper, he sysem should be able o search for he video clips ha conain he desired paern and rank hem based on a cerain similariy measuremen mehod. However, no all of he reurned video clips will be chosen by he user as posiive resuls. The possible reasons are () 80

94 some video clips may no exacly mach he requesed evens due o he accuracy consrains of he auomaic even annoaion algorihm, and (2) hough some video clips mach he correc even paern, hey do no saisfy he user s paricular ineress. Furhermore, he ranking iniially may no reflec he user expecaions. Thus, he sysem should allow user feedback and learn from i o filer ou inaccurae resuls as well as refine searching & ranking performance. Once he iniial rerieval resuls are reurned and displayed, users should be allowed o provide heir feedback hrough he clien-side inerface. Differen people may have differen perspecives when evaluaing he similariy of he rerieved resuls and heir expeced video clips. Taking he query paern wih only one even as an example, Figure V- illusraes wo possible scenarios for disinc users feedbacks. Given a query o search for he goal shos, a se of resuls are reurned and 0 of hem are shown in Figure V-. One user may wan o find a goal possibly resuling from a corner kick so ha he s, 5h, and 7h key frames marked in red recangles are seleced as he samples of ineres o provide he feedback (shown in Figure V-(a)). Anoher scenario is shown in Figure V-(b), where he oher user may wan a specific se of resuls in some series of soccer video games (e.g., FIFA Women s World Cup 2003 in his example) which represen he similar visual clues. The anicipaed key frames of his user are marked in he blue recangles as shown in Figure V-(b). In general, a se of possible properies can be used o simulae a user s selecions, e.g., low-level visual and audio feaures, high-level semanic conceps, and possibly he emporal informaion. As saed above, i is anicipaed ha he relevance feedback can be suppored by he video rerieval sysem, and herefore he nex round of resuls can be generaed and ranked in real-ime based on he individual user s perspecives. Furhermore, he massive amoun of feedback from muliple users should also be considered o improve he overall performance of he video rerieval mechanism in he long run. In his secion, we will discuss he online relevance of feedback performance, as well as he procedures for he off-line sysem raining. 8

95 (a) (b) Figure V-. Two feedback scenarios for he soccer video goal even rerieval 5.2 Relaed Work One of he mos challenging asks in mulimedia informaion rerieval is o perform he raining and learning process such ha he rerieval performance of he mulimedia search engine can be refined efficienly and coninuously. In general, exising mulimedia sysem raining and learning mechanisms can be caegorized ino online learning and offline raining. Relevance feedback (RF) [Rui98] is designed o bridge he semanic gap and provide more accurae resuls based on he user s responses. Incorporaing RF is an online soluion for improving rerieval accuracy, especially for sill-image applicaions. However, exising RF approaches have several consrains and limiaions such ha i is difficul o employ RF in video rerieval approaches. For example, i does no incorporae any mehodology o model all layers of mulimedia objecs and, consequenly, i does no offer efficien learning for mulimodal video rerieval o saisfy general users ineress. In addiion, as menioned by Muneesawang and Guan 82

96 [Muneesawang03], RF does no offer a decen soluion for video daabase represenaion o incorporae sequenial informaion for analyic purposes. Research effors have been conduced o exend and refine he RF mehod for video rerieval and learning purposes. Several mulimedia sysem raining approaches ry o uilize oher possible learning mechanisms such as Suppor Vecor Machine (SVM) and Neural Nework echniques. For example, a emplae frequency model was proposed and a self-learning neural nework was employed o implemen an auomaic RF scheme by Muneesawang and Guan [Muneesawang03]. Yan e al. [Yan03] describe a negaive pseudo-relevance feedback (NPRF) approach o exrac informaion from he rerieved iems ha are no similar o he query iems. Unlike he canonical RF approach, NPRF does no require he users o make judgmens in he rerieval process, as negaive examples can be obained from he wors maching resuls. In Bruno e al. [Bruo06], a query-based dissimilariy space (QDS) was proposed o cope wih he asymmerical classificaion problem wih query-byexamples (QBE) and RF, where sysem learning in QDS is compleed hrough a simple linear SVM. However, his linear-based mehod failed o saisfy he complicaed requiremens for conen-based video rerieval and learning. For offline raining algorihms, he curren research mainly focuses on one-ime raining using cerain kind of daa ses or classificaion informaion. For some cases, user feedback is no he major daa source in sysem raining. For insance, Herz e al. [Herz03] inroduced a learning approach using he form of equivalence consrains which deermine wheher wo daa poins come from he same class. I provides relaional informaion abou he labels of daa poins raher han he labels hemselves. An auomaic video rerieval approach was proposed by Yan e al. [Yan04] for he queries ha can be mapped ino four predefined user queries: named persons, named objecs, general objecs, and scenes. I learns he query-class dependen weighs uilized in rerieval offline. This kind of offline raining processes is ime-consuming, no fully auomaic, and limied o pre-defined query ypes. 83

97 In summary, mos of he curren online learning algorihms mainly deal wih ineracions wih a single user. Due o he small amoun of feedback, hey can be performed in real-ime, bu he performance could only be improved o a limied exen, especially when handling a largescale mulimedia daabase. On he oher hand, some offline raining mehods ry o learn he knowledge from no only colleced user feedback, bu also some oher raining daa sources. The performance could be beer as i considers more raining daa, bu he major drawback is ha a manual process needs o be execued o iniiae he raining process. Moreover, since raining is performed hrough he enire daabase, i becomes a edious ask and can only run offline. 5.3 Auomae Offline Training using Associaion Rule Mining User feedback is widely deployed in recen mulimedia research o refine rerieval performance. In he previous chaper, we showed an HMMM mechanism designed o suppor offline raining. For he sake of efficiency, he raining sysem is designed o record all he user access paerns and access frequencies during a raining period. Once he number of new feedbacks reaches a cerain hreshold, he sysem will rigger he marix updae procedure auomaically. All he calculaions are execued offline. This procedure helps o refine he overall sysem performance in he long run. This mehod can improve he performance bu i becomes a manual process o decide he hreshold and iniiae he raining process. To address his challenge, we propose an advanced raining mehod by adoping he associaion rule mining echnique, which can effecively evaluae accumulaed feedback and auomaically invoke he raining process. Training is performed per video raher han for all videos in he daabase, making he process more efficien and robus. In addiion, i can furher improve semanic modeling in he video daabase and coninuously improve rerieval performance in he long run. As an example, he proposed mehod is applied o a mobile-based soccer video rerieval sysem and he experimenal resuls are analyzed. 84

98 In his secion, he associaion-rule mining (ARM) echnique [Agrawal93] [Agrawal94] is applied o auomae he raining process. The auomaed raining process has he following advanages. Firs, he mulimedia sysem is updaed o check he hreshold effecively in real ime and iniiae he raining auomaically using accumulaed user feedback. In oher words, no manual process is required. Second, he overall raining process becomes more efficien and robus since only par of he video daabase ha conains enough hisorical rerieval daa and posiive paerns needs o be updaed. Finally, he raining process can furher improve semanic video daabase modeling and coninuously improve sysem rerieval and ranking performance in he long run Overall Process Figure V-2 shows he overall process of he proposed mehod. When a user issues a query paern, he background server execues he query and ranking process such ha he sysem can reurn o he user wih he ranked video clips ha mach he query paern. The user is allowed o choose his/her favorie video clips as posiive paerns and issue he feedback. The server engine receives he feedback and accumulaes hem in he video daabase. The sysem hen checks if he number of new feedbacks reaches a pre-defined hreshold. If yes, a background checking mechanism is invoked auomaically o evaluae he feedback using ARM. Oherwise, he nex round query is performed. For efficiency purposes, he sysem will check if here is any video conaining enough posiive paerns. If no video saisfies he qualificaion, hen he raining process will no proceed. Oherwise, he sysem will iniiae he raining process on cerain video(s) based on he evaluaion resuls. Afer he raining, he posiive paerns used in raining he video daabase model will be removed from he unrained feedback daa se and, accordingly, new couns begin for he nex query. Afer he underlying video daabase has been updaed, all he users can have he opporuniy o achieve he refined ranked resuls based on he rained video daabase models. 85

99 User issues a query paern Sysem query and ranking process Reurn rerieval resuls User provides feedback Accumulae user feedback N Number of new feedbacks reaches he hreshold? Y Evaluae feedback using ARM Is here any video conaining enough posiive paerns? N Y Iniiae sysem raining per video as required Clean he feedback for he rained video Nex round query Figure V-2. Overall process for he auomaed raining Alhough his figure only shows he process sequence for one user, he sysem acually collecs feedback from muliple users. Only one evaluaion measure is calculaed for each video and updaed wih he accumulaed feedback from he common users. Of course, muually exclusive issues should be considered such ha when he feedback from one user invokes sysem raining for a cerain video, he hread which processes he reques from anoher user should be aware of his siuaion and cerain acions should be resriced o avoid any conflics. 86

100 I is also worh menioning ha his proposed framework can be easily adjused and applied o image rerieval applicaions. The basic idea is o evaluae if here are enough posiive image paerns in a local image daabase. Tha is, he raining process is performed on independen local image daabases raher han he overall image reposiory Auomaed Training using ARM The challenge for such a mulimedia raining process is o deermine a suiable hreshold value o invoke model re-raining for a video v. Because he suppor measure used in ARM [Agrawal93] [Agrawal94] can well capure he percenage of daa uples for which he paern is rue, we invesigae how o bes adop his concep for he purpose of inspecing wheher he underlying HMMM model for a paricular video v needs o be re-rained. As firs inroduced by Agrawal e al. [Agrawal93], ARM is designed o discover iems ha co-occur frequenly wihin a daa se. Given a se of ransacions in marke baske analysis applicaions, where each ransacion conains a se of iems, an associaion rule is defined as an expression X Y, where X and Y are ses of iems and X Y = Ø. The rule implies ha he ransacions of he daabase conaining X end o also conain Y. In ARM, he suppor consrain concerns he number of ransacions ha suppor a rule. The suppor value is defined o be he fracion of ransacions ha saisfy he union of iems in he consequen and aneceden of he rule. This idea can be mapped and applied o mine he associaion rules in he posiive feedback. Here, each posiive even paern is reaed as a ransacion, and he hisorical access paern daabase is defined as he se of all ransacions. To saisfy our requiremens, we modify he definiion of arge rules and define hem as wo iemse associaion rules which follow cerain emporal sequences. For example, s m s can be reaed as a arge rule, where s m and s n are n video shos in video v and T < T. Accordingly, he suppor measure can be defined as below: sm sn 87

101 Coun ( sm sn ) Suppor ( m, n) =, (V-) NumTrans of s m where Coun s m s ) reurns he number of posiive even paerns ha conain he rule n ( n s, and NumTrans represens he oal number of all emporal even paerns (ransacions) in he daa se of posiive feedbacks which were no used in he previous raining process. In his applicaion, we are more concerned wih he number of all rules in a cerain video han he number of a specific rule. For a given video v, we can sum all he couns for he idenified emporal rules o ge his number as Coun ( s m s ), which can also be represened as Suppor ( m, n) NumTrans. sm s n sm s n In our video rerieval and raining sysem, a novel means for represening he percenage of s m and s n in video v ha are accessed in he posiive paern R y wih T < T can be uilized o define an Evaluaion measuremen for each video for checking purposes. n sm s n Evaluaion ( v) = Suppor ( m, n) NumTrans sm s n aff sm sn ( m, n) (V-2) Equaion V-2 capures he percenage of s m and s n appearing in he posiive emporal paerns versus he overall affiniy relaionship beween hem. If his percenage reaches a cerain value, i indicaes ha such a relaionship should be refleced more frequenly in he model raining process. Here, he hreshold is defined as H o see if he video is ready for he nex round of raining. When Evaluaion ( v) < H : The daabase model for video v will no be rained and he feedback is simply accumulaed in he server-side. When Evaluaion ( v) H : 88

102 aff ( m, n) = A ( m, n) Suppor ( m, n) NumTrans, iff s m R, s y n R, T y sm T sn (V-3) The aff values are hen uilized for updaing he corresponding affiniy relaionship marix and iniial sae probabiliy marix for he paricular video v Experimenal Resuls for Auomaed Learning Mechanism The proposed approach is applied o a disribued mulimedia sysem environmen wih simulaed mobile cliens, which will be furher inroduced in Secion 5.4. As shown in Figure V-3, afer a user issues he even paern query, he sysem will search, rank he resuls, and reurn he key frames o he user. Due o he limied size of wireless devices, each screen is designed o show up o six candidae video clips as presened in Figure V-3(a). By clicking he user preferred key frame, he corresponding video segmen will be displayed as shown in Figure V-3(b) and he user can provide posiive feedback by using he upper righ buon o rigger he choice of I like i!. (a) (b) Figure V-3. Sysem inerfaces for he Mobile-based Video Rerieval Sysem 89

103 For ARM, we use he source codes for he Apriori algorihm from [Apriori], which provides an efficien program (Borgel e al. [Borgel02][Borgel03]) o find associaion rules and frequen iemses wih he Apriori algorihm. Apriori is designed o operae on he daa ses conaining ransacions (i.e., he posiive even paerns in he hisorical feedback). The curren sysem conains 45 videos and around 0,000 video shos. The ARM-based sysem evaluaion resuls are recorded in Table V-. Experimenal resuls for ARM-based feedback evaluaions. Iniially, he sysem performs ARM-based evaluaion every 50 hisorical queries. However, i seems ha he knowledge capured in he firs 50 hisorical queries is no enough and no video needs o be rained. When i reaches 00 hisorical queries, more associaion rules are discovered. For example, here are 240 disinc iems (posiive video shos), 588 ransacions (posiive even paerns), 44 idenified associaion rules, and accordingly video passing he evaluaion hreshold. The daabase model of his video is rained separaely. Nex, all he posiive feedback paerns ha are in his rained video are removed from he unused feedback daase. Then, he sysem sars ARM-based evaluaions per 00 new feedbacks. The same procedures are applied when he number of queries reaches 200 and 300, where hree videos and wo videos are rained, respecively. The sysem is designed o conduc raining per video, raher han for all he videos in he daabase. We believe such a design can improve semanic modeling in he second layer and lead o furher improvemens in he overall rerieval performance. For example, based on 200 hisorical queries, he sysem evaluaes all videos and deermines ha hree videos need o be rained. The hisorical daa ha are already used for he raining process will hen be excluded from he nex round of evaluaion. When he number of hisorical queries reaches 300, he daabase models for wo more videos are required o be rained. Compared wih sysem raining which needs o updae he MMM models for 45 videos, he proposed approach reduces he raining ime by approximaely 900 percen, while achieving a similar degree of performance improvemen. 90

104 Table V-. Experimenal resuls for ARM-based feedback evaluaions Num of Hisorical Queries Iems Paerns (Transacions) Num of All Rules Videos need o be rained Online Relevance Feedback Our designed rerieval algorihm is capable of achieving he resul evens or paerns which mach wih he user-designed paerns. However, differen users may have heir own preferences and idenify only parial resuls as heir favories based on heir personal judgmens. Hence, online relevance feedback funcionaliies can be incorporaed in he proposed video rerieval sysem, which is realized by creaing and updaing he specific MMM insances for each individual user who has disinc preferences. The MMM insances are buil upon he srucure and values of he exising consruced HMMM model and hey are used in online sysem learning in order o saisfy each specific user s requiremens. Assume for a query paern Q = e, e,..., e }, where { 2 C T e T... T e 2 e C, a se of emporal paerns are rerieved and G of hem R, R,..., R } are marked as Posiive by he { 2 G user. Here, ~ y ~ y,,..., ~ y R = s s s } represens he y h Posiive paern, where y { 2 C y s~ i is defined as he i h video sho in he y h Posiive paern, i C, and y G. The relaed MMM insances are consruced as below Anicipan Even Paern Insance Someimes users may be ineresed in more han one even in some specific emporal posiion alhough hey do no specifically idenify hem in he iniial query paern. Therefore, he sysem should be able o capure and learn he underlying possibiliies for hese addiional evens. In our proposed approach, he sysem creaes an insance for he anicipaed even paerns by 9

105 checking he Posiive video shos. Assume here are oally m addiional evens AE = e ', e ',..., e '} a ime Te ' Te '... Te ' in he user s feedback on he G posiive { 2 m 2 m paerns. The numbers of heir occurrences are denoed as p '( e '), where 0 < p'( e i ') G and i m. Therefore, he acual query paern is expanded from Q = e, e,..., e } i { 2 C o = { e, e,..., }. For an even e i Q', if e i Q, is weigh is p =. Oherwise, if e i AE, Q ' 2 e C + m p = p'( e ) G. This means ha a newly deeced even holds he weigh based on is occurrence i i / frequency in he Posiive paerns. These addiional evens and heir occurrence probabiliies are used in he nex round of rerieval o ge he updaed similariy scores and perform he new similariy ranking process. Given a simple example, for a goal sho rerieval example, he users marked en resuls as Posiive, where six of he video shos are also Corner Kick shos, wo of hem are also marked as Free Kick shos, and he oher wo only have he annoaion of Goals. Therefore, we can exrac a se of wo addiional evens e ', '} which are no menioned in he query { e 2 i paern, where e denoes Corner Kick and e ' represens Free Kick. Their occurrence ' 2 probabiliies are 6/0 and 2/0, respecively. Accordingly, in he nex round of rerieval, hese wo user-preferred evens should be included for comparison purposes. However, since he exraced addiional evens are no specified by he user, hey are used only for he similariy measuremen calculaion Affiniy Insances for A In his framework, he affiniy insances are rerieved and updaed for he purpose of user preference learning. Assuming he G posiive paerns come from M disinc videos (M G), one affiniy insance ( * A ) for each of hese M videos can be consruced. Accordingly, M affiniy insances can be generaed. In each affiniy insance, he rows represen he posiive video shos, and he columns represen all he shos in his video. The affiniy values of he Posiive 92

106 video shos are exraced from A which he corresponding rows are from he exising affiniy relaionship marices. Le FP(i,j) be he number of posiive feedback paerns which conain he emporal sequence like {..., s i, s j,...} and v be one of he M posiive videos, he affiniy insance * A for video v is generaed as follows. A * ( i, j) = where s v', s i j A ( i, j) ( + FP( i, j)), ( A ( i, y) ( + FP( i, y))) ' (V-4) s k v v', T s i T s j, s i R, y G. Le FV(i,j) denoe he number of feedbacks where wo videos v i and v j are accessed ogeher. Accordingly, he higher-level affiniy insance y * A 2 is generaed as shown below. * A2 ( i, j) ( + FV ( i, j)) A2 ( i, j) =, where i M, j M. A ( i, x) ( + FV ( i, x)) 2 vx D (V-5) Feaure Insances for B The iniial search for he arge evens or paerns ries o compare he feaures of he candidae video shos wih he mean values for he feaures of he arge evens (B ). Since users may poenially have heir own preferences on some kind of visual or audio feaures, feaure insances are also consruced. In oher words, once he feedback is issued, he feaures of posiive shos will be aken ino accoun for he nex round of similariy measuremen. As shown in Equaion (V-6), he feaure insance marix values for he visual/audio feaures of he Posiive video shos. B * ( i, j) G B ( ~ s y = = G y i, f ) j * B is consruced for B by calculaing he mean, where i C+m, j K. (V-6) 93

107 Updaed Similariy Measuremens and Query Processing By considering he low-level feaures, high-level semanic conceps, as well as he user s percepions, he specific preferences can be efficienly capured and learned such ha he new requiremens are incorporaed ino he feedback-based similariy measuremen procedure. Le },...,, { 2 ' m y s C s s R + = be he y h candidae paern which maches he query even paern Q, he similariy score of ' y R can be calculaed by Equaions (V-7) o (V-). = = K k k k k f e B f s B f e O e s dis 2 2 *,2 * )) )), ( ), ( ( ), ( ( ( ), (, (V-7) where m C S s +,. ), ( ), ( * * e s dis e s sim + = (V-8) where m C S s +,. * * ), ( ) ( ), ( p e s sim s e s w Π =. (V-9) + < + < = ,, ), ( ), ( ), ( ;,, ), ( ), ( ), ( ), ( * * * * * * y y R s m C where p e s sim s s A e s w R s m C where p e s sim s s A e s w e s w (V-0) ), ( ) ' ', ( * * m C y e s w R Q SS = + =. (V-) Figure V-4 illusraes he procedure of he concep-based video rerieval and online sysem learning. Afer he firs round of rerieval, he users provide heir feedbacks and hen he sysem ries o exrac he affiniy insances and feaure insances from he posiive mulimedia objecs. The anicipaed even paern is also generaed by including more possible evens. The rerieval process on he righ hand side follows in a similar way as he iniial query processing (as shown in Figure V-4) excep he following hree aspecs: () When calculaing he weighs, he sysem will check he affiniy insances in he user s profile. As shown in Equaion (V-0), if he video sho is no posiive in R y, i will no be included in * A, and herefore he sysem will go o

108 A o ge he affiniy value. (2) The feaures used for similariy measuremens are changed from B o * B. (3) If here are addiional evens deeced, he sysem will calculae he similariy score for he generaed paern R y. Inpus: ) Targe emporal paern Q = {e,, e C} 2) Affiniy, feaure, and iniial sae probabiliy marices 3) Video se {v,, v M } ) Anicipan even paern insance Q ={e,, e C+m} 2) Generaed MMM insances, including A *, A 2 *, B *, ec. Even paern rerieval and ranking process i = ; = ; y = ; Ranked video sho sequences Search for he h sho s in v i which maches e Feedback Calculae weigh w * (s, e ) N Posiive evens / even paerns / videos = +; Calculae similariy score SS * (Q, R y ) for each candidae video sho sequence > C+m? Y Consruc/updae he affiniy insances A *, A 2 * i = i+; y = y+; i > M? N Consruc/updae he feaure insance B * Y Rank he candidae video sho sequences according o SS * scores Reconsruc he anicipan paern insance Updaed candidae even paerns Figure V-4. Online learning procedure of emporal based query paern rerieval Experimenal Resuls for Sysem Learning Techniques The curren video daabase conains 54 soccer videos, which are segmened ino,567 video shos and 506 of hem are imporan semanic evens. In he experimens, hree query examples are used o demonsrae he resuls of he rerieval mehod and how he online relevance feedback mehod improves he precision of he resuls and heir rankings. 95

109 Query : Search for he goal evens. Query 2: Find he paerns conaining a goal afer a free kick. Query 3: Search for a video segmen wih he emporal paern conaining a sequence of four evens, namely Goal, Player Change, Corner Kick, and Free Kick. Figure V-5 demonsraes he soccer video rerieval and user feedback inerface wih Query 3 being issued. The resuls are reurned on he righ hand side of he panel and sored from lef o righ and op o down. As marked in he purple box, four video shos form a candidae emporal even paern. By double clicking he ineresed key frame, he video shos are displayed in he upper-lef corner of he screen. Users are able o selec heir preferred video shos by using he drop down menu below he key frames. Video rerieval is no merely he comparison of low-level visual or audio feaures. Moreover, people may have oally differen ineress when hey search for heir arge video shos. For example, a oal of 72 resuls are rerieved for Query, and all of hem are proved o be goal shos. The iniial ranking is he same for all he users. However, users have differen feedbacks even hrough he iniial resuls. Some users may choose he ones in heir ineresed series of soccer games, some users may prefer he goals resuling from he corner kicks, while ohers may be ineresed in he exciing goal shos where he erm of exciing is defined subjecively based on heir own judgmens. I is a very complicaed ask o learn high-level user percepions since even he users hemselves may no be able o describe heir own requiremens. However, our proposed mechanism is capable of refining he video even rerieval resuls by modeling no only low-level feaures bu also high-level conceps and user preferences by uilizing he HMMM mechanism and MMM insances. 96

110 Figure V-5. User-cenered soccer video rerieval and feedback inerface In our experimens, 20 users paricipaed, wih mos of hem no compuer specialiss. Figure V-6 shows he number of user preferred video shos in wo rounds of rerieval. From he firs 6 candidae video shos iniially shown, he average number of user preferred video shos is 5.5. Afer he firs round of feedbacks, i reaches 8. I is noiceable ha mos of he users achieve beer resuls afer he firs round of feedback. Few of he users have very sric requiremens when hey choose he goal shos, and herefore he improvemen gain is limied even afer he feedback and learning process. As for Query 2, here are 75 pairs of resuls rerieved from he video daabase. However, based on our experimens, we observed ha he number of similar paerns is very small. Therefore, an example insead of numbers is used o demonsrae he performance. Figure V-7(a) shows he iniial resul where he paerns in red boxes are seleced as Posiive. They were 97

111 ranked as he h and 5 h candidae paerns iniially. As illusraed in Figure V-7(b), afer wo rounds of feedbacks, all he similar paerns in he same series of soccer videos are successfully rerieved and ranked as he s and 8 h. Online Training Experimenal Resuls of Soccer Video Rerieval Sysem Number of User Preferred Soccer Goal Shos User ID Iniial Resuls Resuls Afer Firs Round Feedback Figure V-6. Online raining experimenal resuls for Query (a) (b) Figure V-7. Soccer video rerieval and feedback resuls for Query 2. (a) firs round even paern rerieval; (b) hird round even paern rerieval. 98

112 5.5 Applicaion: A Mobile-based Video Rerieval Sysem 5.5. Inroducion Nowadays, handheld mobile devices including cell phones and Personal Digial Assisans (PDAs) have become increasingly popular and capable, which creaes new possibiliies for accessing pervasive mulimedia informaion. The new generaions of mobile devices are no longer used only for voice communicaion; hey are also used frequenly o capure, manipulae, and display differen audiovisual media conens. Wih he rapid emergence of wireless nework echnologies, such as GSM, Saellie, Wireless Local Area Nework (WLAN), and 3G, i is much easier now o ransfer large sized mulimedia iems o mobile cliens. However, mulimedia mobile services sill suffer no only he consrains of small display sizes, bu also a limiaion in erms of power supply, sorage space, processing speed, ec. The navigaion of mulimedia conen on handheld devices is always resriced in he limied ime periods wih minimized numbers of ineracions. Meanwhile, large sized mulimedia daa such as video canno be sored permanenly on he mobile devices due o he limied memory. Consider he following ypical scenario in a mobile-based mulimedia applicaion. Spors fans wish o wach spors video hrough heir cell phones. However, i is boh unaffordable and someimes unnecessary for hem o wach he enire game, which would ake a long ime, occupy huge memory, and exhaus he power supply. Thus, a beer soluion is o offer hem he capabiliy of browsing and rerieving only shor video clips conaining ineresing even shos. Given a huge collecion of spors videos, here exis many challenges o accomplish his ask. I is quie difficul o segmen he videos properly and annoae he semanic evens auomaically. Alhough advanced echniques offer grea capabiliies in exracing mulimodal visual and audio feaures from all kinds of videos, he semanic gap remains a crucial problem when bridging hese low-level or mid- 99

113 level feaures wih high-level rich semanics. Even wih he bes even annoaion algorihms, here is hardly sufficien guaranee in erms of he correcness and compleeness of he semanic inerpreaion resuls. This moivaes he modeling of high-level semanic absracions by uilizing he exising annoaion resuls, heir feaures, and user feedbacks. I is criical o address he daabase modeling issue, especially when considering he emporal and/or spaial relaionships beween he mulimedia objecs. I should be able o suppor no only he basic rerieval mehods, bu also he complicaed emporal even paern queries. There is an emerging need for supporing individual user preferences in mulimedia applicaions. I is well-known ha people have diverse ineress and percepions owards media daa. Thus, i is desirable o incorporae user feedbacks wih he purpose of raining he rerieval sysem. In he meanwhile, we may wan o reduce he number of user ineracions o alleviae he burden on he users and o accommodae he resricions of he mobile devices. I is hus desirable o keep rack of user acions and accumulae knowledge abou user preferences. The sysem archiecure should be designed o reduce he size of daa o be ransferred and o minimize he requiremen of daa sorage for he mobile devices. The mobile-based rerieval inerface should be user-friendly, easy o operae, and capable of offering sufficien informaion and choices for he users. Therefore, an efficien and effecive mulimedia conen managemen and rerieval framework will be essenial for he evoluion of mobile-based mulimedia services. 00

114 This secion addresses he issues of designing and implemening a user adapive video rerieval sysem, called MoVR, in a mobile wireless environmen. Innovaive soluions are developed for personal video rerieval and browsing hrough mobile devices wih he suppor of conen analysis, semanic exracion, as well as user ineracions. Firs, a sochasic daabase modeling mechanism called Hierarchical Markov Model Mediaor (HMMM) is deployed o model and organize he videos, along wih heir associaed video shos and clusers, in a mulimedia daabase o offer suppor for boh even and complicaed emporal paern queries. Second, HMMM-based profiles are designed o capure and sore individual user s access hisories and preferences such ha he sysem can provide a personalized recommendaion. Third, he fuzzy associaion concep is employed o empower he framework so ha he users can make heir choices of rerieving conen based solely on heir personal ineress, general users preferences, or anywhere in beween. Consequenly, users gain conrol in deermining he desirable level of radeoff beween rerieval accuracy and processing speed. In addiion, o improve he processing performance and enhance he porabiliy of clien-side applicaions, he sorage consumpion informaion and compuaionally inensive operaions are suppored in he server-side, while mobile cliens mainly arge o manage he rerieved media and user feedbacks for he curren query. In order o provide more efficien accessing and informaion caching for he mobile devices, we also designed he virual cliens a he server-side compuers o keep some relevan informaion ha mobile users require. To demonsrae he performance of he proposed MoVR framework, a mobile-based soccer video navigaion and rerieval sysem is implemened and esed Relaed Work Video browsing and rerieval in mobile devices is an emerging research area. Due o he consrains of mobile devices in erms of heir power consumpion, processing speed and display 0

115 capabiliy, more challenges have been encounered han in radiional mulimedia applicaions and many research sudies have been conduced o address various issues. To reduce he viewing ime and o minimize he amoun of ineracion and navigaion processes, a variey of sudies in academia and indusry have worked on he summarizaion of video conens. For insance, in [Gong0], Singular Value Decomposiion (SVD) of aribue marix was proposed o reduce he redundancy of video segmens and hus generae video summaries. Clusering echniques were also used o opimize key frame selecion based on visual or moion feaures o enhance video summarizaion [Babaguchi0]. In indusry, Virage has implemened preliminary video summarizaion sysems for NHL hockey videos using mulimodal feaures [Virage]. However, a major issue remains in erms of he semanic gap beween compuable video feaures and he meaning of he conen as perceived by he users. For his purpose, meadaa abou he conen was used and i played an acive role in video rerieval [Sachi05]. For insance, onologies have been proposed in [Jokela00] o perform inelligen queries and video summarizaion from meadaa. In [Tseng02], a video semanic summarizaion sysem in he wireless/mobile environmens was presened, which includes an MPEG-7 complian annoaion inerface, a semanic summarizaion middleware, a real-ime MPEG/2 video ranscoder for Palm-OS devices, and an applicaion inerface on color/black-and-whie Palm-OS PDA. Meadaa selecion componen was also developed in [Lahi06a][Lahi06b] o faciliae annoaion. However, auomaic media analysis and annoaion is sill far from maure and purely manual media annoaion is exremely ime-consuming and error prone [Davis04]. Alernaively, semanic even deecion frameworks have been proposed o faciliae video summarizaions. Hu e al. used similar video clips across differen sources of news saions o idenify ineresing news evens [Hu0]. In our earlier sudies [ChenSC03a][ChenSC04a], an effecive approach was proposed for video even deecion faciliaing mulimedia daa mining echnique and mulimodal feaure analysis. 02

116 In erms of video rerieval in mobile devices, Query by Example (QBE) is a wellknown query scheme used for conen-based audio/visual rerieval, and many sysems have been developed accordingly [Ahmad06] [Sachi05]. However, in mos exising approaches, he similariy esimaion in he query process is generally based on he compuaion of he (dis)similariy disance beween a query and each objec in he daabase and followed by a rank operaion [Ahmad06]. Therefore, especially for large daabases, i may urn ou o be a cosly operaion and he rerieval ime becomes unreasonably long for a mobile device. In addiion, emporal paern query, where a sequence of emporal evens is of ineres, is no well suppored in hese sudies. In essence, he aforemenioned approaches conribue o address some resricions of video browsing and rerieval in mobile devices. However, hey fail o accommodae individual user preferences wih heir diverse ineress and percepions owards video daa. In he lieraure, relevance feedback [Rui98] has been widely adoped in he conen-based rerieval research sociey o address user preference issue. [Coyle04] presens a mechanism for learning he requesed feaure weighs based on user feedbacks o improve he recommendaion ranking. In addiion, [Dubois0] sudied he applicaion of fuzzy logic in he represenaion of flexible queries and in expressing a user s preference learned from feedback in a gradual and qualiaive way. In [Doulamis99], fuzzy classificaion and relevance feedback echniques were applied for processing he video conen o capure user preference. A similar idea was also proposed in [Kang06], where a fuzzy ranking model was developed based on user preference conained in feedbacks. However, a common weakness of mos relevance feedback mehods is ha he feedback process has no memory [LiQ0] so he user feedbacks conduced in he pas fail o help he fuure queries. Therefore, he rerieval accuracy does no improve over ime in he long run. 03

117 In addiion, mobile devices have limied luxury o suppor frequen ineracions and realime feedback learning. Alernaively, user profiling has been exensively used in informaion filering and recommendaion. In [ChenL98], a personal agen called WebMae is devised which learns he user profile incremenally and faciliaes browsing and searching on he web. [Marin02] also presened a sudy of he role of user profiles using fuzzy logic in web rerieval processes. John e al. [John0] developed a prooype informaion rerieval sysem using a combinaion of user modeling and fuzzy logic. [Gibbon04] presened he idea of exracing relevan video clips based on a user profile of ineress and creaing personalized informaion delivery sysems o reduce he sorage, bandwidh, and processing power requiremens and o simplify user ineracions. In our framework, a common profile which represens he general knowledge on he semanics of video daa is consruced. Such a general user profile serves as he semanic indexes of video daa o speed up he rerieval process. Meanwhile, a user profile is se up for each user o characerize he personalized ineres. In addiion, mulilevel video modeling and emporal paern queries are well suppored in our approach Sysem Archiecure In his proposed research, he radiional clien/server sysem archiecure is adoped bu enhanced o accommodae he requiremens for he mobile-based mulimedia services. In order o provide he maximum suppor and opimized soluion, he following crieria are sricly followed in he sysem design. Firs, sorage consumpion informaion and compuaionally inensive operaions are handled on he server-side. Second, mobile cliens are solely required o mainain he minimized daa o enable he rerieval process. Third, he sysem should reduce he load for he wireless nework, and a he same ime increase he daa ransfer speed for he mulimedia daa. 04

118 In he server-side daabase, a huge amoun of mulimedia daa are sored and managed by employing he Hierarchical Markov Model Mediaor (HMMM) mechanism. The video daabase conains no only he archived videos, video shos, and clusers, bu also he numerical values which represen heir affiniy relaionships, feaures, and access hisories, ec. Server General User Feedbacks General User Percepions Video Daabase (Modeled using HMMM) Fuzzy Associaed Rerieval Mechanism Reques Handler HMMM based User Profiles Individual User Percepions Virual Cliens Virual Clien () Reques Composer And Resul Receiver Wih Caches Mulimedia Conens For All Queries Access Hisory For All Queries Virual Clien (n-) Virual Clien (n) Mobile Devices Wireless Connecions Mobile Device (n-) Mobile Device (n) Mobile Device () Graphical User Inerface Wih Fuzzy Conroller Mulimedia Conens For Curren Query Access Hisory For Curren Query Figure V-8. Mobile-based video rerieval sysem archiecure 05

119 As shown in Figure V-8, he daabase for general user feedback is developed, which consiss of he posiive access evens or paerns from a whole group of differen users. The individual user feedback can also be exraced from his daabase o develop he HMMM-based individual user profiles, which will be explained in he laer secions. These access hisories are uilized by he sysem o learn boh he general user percepions and individual user ineress. Based on he fuzzy weigh provided by he mobile user, he fuzzy associaed rerieval algorihm is able o make a compromise beween hese wo models of percepions inelligenly, rerieve he video clips, and make he ranking recommendaions accordingly. The reques handler is designed o inerpre he reques packages and o respond o he mobile devices by sending back he rerieved and ranked resuls. The clien-side applicaions on mobile devices do no need o keep sorage of all he accessed media daa. Alernaively, hey mainly arge o manage he rerieved media and user feedbacks for he curren query, which includes he key frames of video shos shown on he curren screen, and he video clips requesed for he curren operaion. The mobile-based graphical inerface is designed for he video rerieval sysem, which allows he user o easily compose and issue he even or emporal paern based queries, o navigae hrough and wach he collecion of rerieved resuls, and o provide he feedback. In order o promoe he mobiliy and manageabiliy of his sysem, a new layer wih virual cliens is designed and incorporaed in he server-side applicaions o exend he dynamic compuing and sorage capabiliy of he mobile cliens. The virual clien is designed o represen he mobile user sae in he mobile-based video rerieval sysem. Each virual clien is cusomized o a disinc mobile user who accesses he video rerieval sysem. I conains a communicaion componen ha consiss of he requess by checking and collecing he messages and commands sen from he mobile devices. The communicaion componen can also receive he mulimedia daa resuls from he server. Since we wan o reduce he daa size sored in he 06

120 mobile devices, he virual clien is designed o cache all he relaed mulimedia conen and access hisories for he corresponding mobile user. Generally speaking, he proposed virual clien soluion can deliver improved flexibiliy, scalabiliy and cos benefis over he radiional clien/server models. Mobile users gain efficiency and produciviy because hey can access he mulimedia resources wihou worrying abou heir sorage limiaion MoVR: Mobile-based Video Rerieval In his chaper, he MoVR framework is proposed for he mobile-based video rerieval sysem developmen. This framework can suppor no only basic even queries, bu also he complicaed queries owards some emporal even paern, which consiss of a se of imporan evens followed by a cerain emporal sequence. More imporanly, his framework is capable of providing boh personalized recommendaions and generalized recommendaions. Users can also specify a fuzzy weigh parameer if heir query ineress have no ye been clearly formed. The sysem will make he adjusmen and generae differen rerieval resuls based on he fuzzy associaed queries. In essence, MoVR is designed o provide no only powerful rerieval capabiliies bu also a porable and flexible soluion o mobile users. As shown in Figure V-9, he overall framework of MoVR includes hree main processing phases on he server-side. Phase is for video daa preprocessing. I consiss of he following seps. The firs sep is o process he source video daa o deec he video sho boundaries, segmen he video, and exrac he sho-based feaures. Daa cleaning and even annoaion algorihms are hen applied o deec he anicipaed semanic evens by employing he exraced sho-level feaures and mulimodal daa mining scheme. The componens in Phase are processed offline, which is no he focus of his chaper. 07

121 Phase 2 is o model he video daabases. As shown in he op-cener box, he HMMM mechanism is deployed o model he mulilevel video eniies, all kinds of feaures, along wih heir associaed emporal and affiniy relaionships. These process seps are also performed offline. Such an HMMM daabase model will be updaed periodically during he learning process by uilizing user feedback. Phase 3 includes he sysem rerieval and learning processes which are mainly performed online in real ime o inerac frequenly wih he virual cliens and he mobile cliens. Once a user issues a query requesing a cerain semanic even or a emporal even paern, such informaion will be sen o he virual clien, where i is packed and passed o he server for processing. Afer his poin, he process will be slighly differen for a firs ime user or a revisiing user on he server-side. For he former case, he HMMM model wih iniial seings will be adoped and he sysem performs general similariy maching and ranking processes. In conras, for a revisiing user, his/her user profile sored in he server-side will be rerieved and used for more advanced rerieval funcionaliies. Accordingly, an enhanced algorihm is developed on he server-side o handle hese fuzzy associaed video rerieval and ranking asks. The rerieved video clips are ranked and sen back o he virual cliens. Though all he resuls are cached for fas rerieval, only a porion of hem are acually delivered o he mobile devices in defaul. Users may issue feedback owards he query resuls hrough heir mobile devices, which will be sen o he virual clien so ha his feedback can be organized and emporarily sored. Afer ha, he feedback will be delivered o he server for he consrucion or updae of he user profiles. Moreover, real-ime online learning is also suppored so ha he sysem will yield refined resuls based solely on he feedback for he curren query. Essenially, wo innovaive echniques, user profiling and fuzzy associaion, are adoped and inegraed wih HMMM inelligenly for server-side applicaions, which are inroduced in he following secions. 08

122 Phase : Video Daa Pre-processing Phase 2: Video Daabase Modeling Phase 3: Sysem Rerieval and Learning Video Sources Video Shos General Process for Similariy Maching and Ranking Video Sho Boundary Deecion Video Sho Segmenaion Visual/Audio Feaure Exracion Daa Cleaning Video Sho Feaures Iniial Annoaion of Semanic Evens Consruc Iniial HMMM Models Video Daabase Modeled wih HMMM Model Candidae Video Shos / Sho Sequences Fuzzy Associaed Video Rerieval and Ranking Consruc/Updae User Profiles Offline Training Sho Even Deecion and Annoaion Updae HMMM Models Online Learning Mobile Cliens and Virual Cliens Compose and Send he Reques Receive, Cache and Transfer he Resuls Organize, Cache, and Send he Feedbacks Query wih Temporal Even Paern (Possibly Fuzzy Associaed) Receive and Display Parial Resuls for he Curren Query Provide Feedback or Access Hisory for he Curren Query Figure V-9. Overall framework of mobile-based video rerieval sysem HMMM-based User Profile One of he major challenges in mulimedia rerieval is o idenify and learn personalized user ineress. The underlying reason for his challenge is ha a user s query ineress can hardly be expressed precisely by using query examples or keywords. In addiion, differen users end o 09

123 have diverse opinions or percepions owards even he same query and inend o seek for differen resuls and rankings. For example, given a query for soccer goal shos, differen users may be ineresed in differen rerieval requiremens: followed by a corner kick; in he female soccer videos; wih exciing screams; ec. In his research sudy, he consruced HMMM model can serve as a general user profile which represens common knowledge of he mulimedia daa and he relaed semanics. On he oher hand, an HMMM-based individual user profile is also consruced for each mobile user, which is mainly consruced based on learning he individual user s query hisory and access paerns. The definiion is described as follows: Definiion V-: An HMMM-based User Profile is defined as a 4-uple: Φ = { τ, Aˆ, Bˆ, O ˆ }, where τ : represens he idenificaion of a mobile user; Â : Affiniy profile, which incorporaes a se of affiniy marices ˆ { ˆ g A = } ha describes he relaionships beween he user accessed media objecs and all he media objecs. Here n d, and g λ n. g A n Bˆ : Feaure profile, which represens he feaure measuremens based on he posiive feedbacks of cerain evens and/or even paerns; Ô : Feaure weigh profile, which consiss of he feaure weighs obained by mining and evaluaing he users access hisory. 0

124 Affiniy Profile The affiniy profile  is designed o model he affiniy relaionships among he mulimedia objecs ha are relaed o users hisoric query/feedback logs. The proposed soluion ries o minimize he memory size ha a user profile would occupy. As illusraed in Figure V-0, he sysem will check he query logs and access hisories for he purpose of consrucing he affiniy profile. i A i Aˆ Figure V-0. Generaion of individual user s affiniy profile Taking affiniy marix j A as an example, i describes he emporal-based affiniy relaionships among he video shos of he j h video. The sysem will find he posiive video shos ha he user has accessed before and he corresponding rows are exraced from he original marix ( j A ). These values are hen updaed and used o creae a new marix j Aˆ in his user s affiniy profile. Similarly, in he second-level user affiniy profiles, he rows represen he accessed videos which include a leas one posiive video sho, and he column includes all he videos in he cluser.

125 For a mobile user, his/her query log and access hisory include he se of issued queries as well as he associaed posiive feedbacks. We define marices UF n o capure he individual user s access frequencies for he n h level objecs in he mulilevel HMMM daabase. For example, le UF (i, j) represen he number of posiive feedback paerns which conain he emporal sequence g g as {..., S ( i), S ( j),...}. UF 2 (i, j) denoes he number of posiive paerns where boh video v i and v j are accessed ogeher, and UF 3 (i, j) denoes he number of posiive paerns which conain he video shos across boh video cluser CC i and CC j. For he affiniy marix g  n, he corresponding affiniy profile is compued and updaed as below. ˆ g An ( i, j) ( + UFn ( i, j)) A n ( i, j) =. (V-2) g A ( i, x) ( + UF ( i, x)) x g n n Where n d, d = 3, and S g n (x) represens all he possible saes in he same MMM model wih S g n (i) and S g n ( j). In addiion, when n =, saes S g ( ) and S g n ( j) also need i o follow he cerain emporal sequence where T T. g S ( i ) g S ( j ) Feaure Profile Feaure profiles are consruced o describe he disinc searching ineress for each user by modifying he arge feaure values. As discussed in our previous paper [Zhao06a], an even feaure marix ' B was compued based on he annoaed evens. However, he annoaed resuls may no be fully correc or complee. Furher, he users may have heir paricular ineress when looking for a cerain even. To address hese issues, a feaure profile ˆB is proposed. Specifically, in he profile marix ˆB, each row represens an even, and each column represens a feaure. Le f k represen he k h feaure, where k K, and K is he oal number of feaures. Given ~ z m as a subse of all he posiive shos wih even ype e m, and leing B ( ~ z ( i), f ) denoe he feaure m k 2

126 values for video sho ~ zm ( i ), Equaion V-3 defines ˆB. If here is no posiive sho wih even ype e ( ~ = 0), he corresponding row is copied from he even feaure marix ' B in he m z m consruced HMMM model. ~ z = ( ~ m i B zm ( i), f k ) ~ ~ ~, where z m, zm, k K; Bˆ ( e, ) = z m f k m (V-3) ' ~ B ( em, f k ), where z m = 0, k K Feaure Weigh Profile In he lieraure, many approaches used Euclidean disance, relaional coefficiens, ec. o deermine he similariy measure beween wo daa iems in erms of heir feaure values. However, he effeciveness of differen feaures migh vary grealy from each oher in expressing he media conen, so i is essenial o apply feaure weighs in measuring he similariy beween mulimedia daa objecs. In HMMM, a marix O, 2 is used o describe he imporance of lower level visual/audio feaures F when describing he even conceps F 2. The iniial values of all is enries are se o be equal, which indicaes ha all he feaures are considered o be equally imporan before any user feedback is colleced and any learning process is performed. Once we obain he annoaed even se, he feaure weighs will hen be updaed as inroduced in our previous work [Zhao06a]. This research mainly focuses on he feaure weigh profile, which is consruced based on he mobile user s individual access and feedback hisories. As users can provide posiive feedback on heir favorie video shos, he basic idea is o increase he weigh of similar feaures among he posiive video shos, while decreasing he weigh of dissimilar feaures among hem. For his purpose, we use he sandard deviaion Sd e m, f ) o measure he disribuion condiion ( k of feaure ( k K) on he video shos conaining even ( m C), where C represens f k e m 3

127 he number of disinc even conceps. A large sandard deviaion indicaes greaer scaer of he daa poins. Accordingly, when here is more han one posiive sho for even e m, ( ~ z ( i ) > m ), he value of / Sd ( e m, f ) can be employed o measure he similariy of he feaures, which in urn k indicaes he imporance of his feaure in erms of evaluaing even e m. However, his soluion does no apply when here is no posiive sho or only one posiive sho ( ~ ( i) ), so we would borrow he corresponding feaure weighs for even e m from marix O, 2. The feaure weigh profile is hus defined as follows. z m ~ z ( ~ ˆ 2 m i= ( B z m ( i), f k ) B ( em, f k )) Sd ( em, f k ) = ~, (V-4) z m Where ~ z ( i) > m, m C, and k K. Figure V-. Fuzzy weigh adjusmen ool (a) generalized recommendaion; (b) personalized recommendaion; (c) fuzzy associaed recommendaion / Sd( em, f k ), where ~ zm >, m C, and k K Oˆ K e f = j = (/ Sd( e m, f j )),2 ( m, k ) (V-5) O ( em, f k ), where ~,2 zm, m C, and k K Fuzzy Associaed Rerieval Fuzzy logic has been noed for is abiliy o describe and model vagueness and uncerainy, which are inheren in mulimedia informaion rerieval. In his proposed framework, fuzzy logic is adoped o model he uncerainy of users rerieval ineress. Specifically, users are allowed o make heir choices of rerieving conen based solely on general user percepions, personalized ineress, or anywhere in beween, which leads o a differen level of radeoff 4

128 beween rerieval accuracy and processing speed. We use a fuzzy weigh parameer ρ [0,] o measure he uncerainy ha users may pose when issuing he video queries. As shown in Figure V-, we use an ineracive gauge on he mobile device inerface for he users o adjus ρ. By choosing he personalized ineres (as shown in Figure V-(b)), ρ = 0, he sysem will evaluae he user s profile and rerieve he video clips based on he learned knowledge wih respec o he user s previous access paerns. On he oher hand, if he generalized recommendaion mode is seleced (Figure V-(a)), i.e., ρ =, he sysem will comply wih he common knowledge learned from he complee query log colleced across differen users. Therefore, he mos popular video clips will be rerieved wih higher ranks. Assuming ha we have already performed video clusering [Zhao06b] hrough he daabase, he generalized recommendaion mode is normally more efficien as saisfacory resuls can generally be rerieved by checking he relaed clusers. Le Q = { e, e,..., e }( T T... T ) be a query paern and s S be a candidae 2 C e e2 ec video sho for he even e ( C ); he sysem can adjus he rerieval algorihm and provide hree kinds of recommendaions, namely, generalized recommendaion, personalized recommendaion, and fuzzy weighed recommendaion according o he fuzzy weigh issued by he user. The deails are addressed below Generalized Recommendaion When a generalized recommendaion (Figure V-(a)) is seleced, he marices in he consruced HMMM model will be used as he common user profile o perform he sochasic rerieval process. Firs, as shown in Equaion (V-6), he weighed Euclidean disance dis s, e ) ( is calculaed by adoping he general feaure weigh O e, f ), which is hen used o derive he,2 ( k ' k similariy measuremens (see Equaion (V-7)). Here, B ( e, f ) denoes he exraced mean 5

129 value of feaure f k wih respec o even e based on he learned general users common knowledge. dis s, e ) ( O ( e, f ) ( B ( s, f ) B ( e, f )) ), (V-6) ( = K k=,2 k k ' k 2 Where s S, k K, C. sim( s, e ) =, (V-7) + dis( s, e ) Where s S, C. Nex, he edge weighs are calculaed based on Equaions (V-8) and (V-9). When = 0, he iniial edge weigh are calculaed by using he iniial sae probabiliy and similariy measure beween sae s and even e. I is worh menioning ha he sysem ries o evaluae he opimized pah o access he nex possible video sho sae which is similar o he nex anicipaed evens. Therefore, he edge weigh from sae s o s + ( C ) is calculaed by adoping he affiniy relaionship as well as he similariy beween he candidae sho s+ wih he even concep e +. w s, e ) = Π ( s ) sim( s, ). (V-8) ( e w + s +, e + ) = w ( s, e ) A ( s, s + ) sim( s +, e + ), where. (V-9) ( C Afer one round of raversal, he sysem rerieves a sequence of video shos R y which mach he desired even paern Q. The nex sep would be he calculaion of he similariy score. Here, SS Q, R ) is compued by summing up all he edge weighs, where a greaer similariy ( y score indicaes a closer mach. SS( Q, R ) = = w ( s, e ) (V-20) y C 6

130 Personalized Recommendaion In case he user prefers a personalized recommendaion (Figure V-(a)), he overall process seps are similar excep ha he marices used are mainly from HMMM-based user profiles. = = K k k k k f e B f s B f e O e s is d 2,2 ) )), ( ˆ ), ( ( ), ( ˆ ( ), ( ˆ (V-2) ), ( ˆ ), ( ˆ e s dis e s sim + = (V-22) Where C S s,. When calculaing he edge weigh ), ( ˆ e s w, here could be wo condiions. If he candidae video sho has been accessed by his user and marked as Posiive ( y R s + ), he user s affiniy profiles should include his video sho and, herefore, he formula akes he affiniy value from he user s personal affiniy profile ( Â ). Oherwise, here is no record for he video sho in he user profile ( y R s + ), and he sysem will pick he value from he affiniy marices ( A ) of he consruced HMMM. ), ( ˆ ) ( ), ( ˆ e s sim s e s w Π = (V-23) < < = y y R s C where e s sim s s A e s w R s C where e s sim s s A e s w e s w, ),, ( ), ( ), ( ˆ, ),, ( ), ( ˆ ), ( ˆ ), ( ˆ (V-24) = = C y e s w R Q S S ), ( ˆ ), ( ˆ (V-25) Fuzzy Associaed Recommendaion Alernaively, if he user is uncerain and hus chooses a fuzzy weigh parameer (0,) ρ o describe his/her ineres (as illusraed in Figure V-(c)), he sysem will make he

131 adjusmen on he edge weighs and accordingly, he opimized pah and he similariy score would possibly be changed as defined in he following equaions. w ~ ( s, e ) = ρ w ( s, e ) + ( ρ) wˆ ( s, e ), where C. (V-26) ~ SS( Q, R ) = C = w ~ ( s, e ). (V-27) y Afer geing he candidae video sho sequences, hey will be ranked based on heir similariy scores and sen back o he clien Implemenaion and Experimens A mobile-based soccer video rerieval sysem is developed based on he proposed MoVR framework, which consiss of he following componens: A soccer video daabase is consruced and mainained in he server-side by using PosgreSQL [PosgreSQL]. Toally, 45 soccer videos along wih 8977 segmened video shos and corresponding key frames are sored and managed in he daabase. Server-side engine is implemened by using C++. This module conains no only he searching and ranking algorihms, bu also a se of oher compuaionally inensive echniques, including video sho segmenaion, HMMM daabase modeling, user profile generaion and updaing, ec. The virual clien applicaion is implemened wih Java J2SE [J2SE]. I works as middleware beween server engine and mobile cliens, where daa communicaion is mainly fulfilled by using UDP and TCP. The user inerface on he mobile device is developed by using Sun Java J2ME [J2ME] Wireless Toolki [JavaWTK]. We ry o make i porable, flexible, and 8

132 user friendly wih simple bu effecive funcions. The user can easily issue even/paern queries, navigae key frames, play ineresed video clips, and provide feedbacks. Figure V-2. Mobile-based soccer video rerieval inerfaces (a) iniial choices (b) rerieval by even (c) rerieval by paern Figure V-2 and Figure V-3 show he user query inerfaces of he MoVR soccer video rerieval sysem. In Figure V-2(a), he iniial choices are displayed, which include Soccer Video Browsing, Soccer Video Rerieval by Even, and Soccer Video Rerieval by Even Paern, ec. The user can use he upper-cener buon o move up/down o selec he arge menu and hen push he lef-upper buon o launch he seleced applicaion. 9

133 Figure V-2(b) shows he query inerface for he single-even queries. I allows he user o choose one or more evens wih no emporal consrains. For insance, in his figure, he user chooses Goal, Free Kick, and Corner Kick, which means he video clips wih eiher one of hese hree evens are of ineres. Under he even lis, here is a gauge conrol which allows he user o change he fuzzy weigh parameer beween wo exremes: personalized recommendaions and generalized recommendaions. The upper-lef buon can be used o exi his componen and go back o he main menu, while he upper-righ buon can be used o issue he query. Figure V-2(c) illusraes he inerface for he emporal even paern rerieval. The user can use he popup liss o choose he even number o define he size of he query paern. Then i is allowed o choose he evens one by one, along wih he emporal relaionship beween wo adjacen evens. Taking his figure as an example, he user firs ses he even number as 2, and hen chooses he paern as Corner Kick <= Goal, which means ha he user wans o search for he video clips wih a Corner Kick followed by a Goal. These wo evens could also occur in he same video sho (when emporal relaionship is se o = ), which can be called a Corner Goal. 20

134 Figure V-3. Mobile-based soccer video rerieval resuls (a) video browsing resuls (b) video rerieval resuls (c) video player The reurned key frames are displayed as shown in Figure V-3(a) and Figure V-3(b). Due o he limiaion of screen display size, only six key frames are shown in he firs screen, where each key frame represens he firs frame for each of he reurned video clips. Users can choose heir key frame of ineres and hen rigger he Play! buon o display he corresponding video clip, which may include one (for even query) or more video shos (for even paern query). Noe ha Figure V-3(a) shows he video browsing resuls wih he key frames displayed for consecuive video shos in one video. Figure V-3(b) illusraes he resuls for an even query argeing he video shos wih corner kicks. These video shos are rerieved from differen soccer videos and are ranked from lef o righ, and from op o down based on heir similariy scores. 2

135 The video player inerface, which is shown in Figure V-3(c), conains a buon called Posiive Feedback, which can be seleced o send back a posiive feedback if he user is saisfied wih he curren video clip. A Snapsho funcionaliy is also provided such ha he user can capure a video frame from he video. Table V-2. Average accuracy for he differen recommendaions ID Query Generalized Recommendaions Fuzzy Weighed Recommendaions Personalized Recommendaions Goal 30.6% 38.9% 6.% 2 Free Kick 9.4% 50.0% 58.3% 3 Corner Kick 27.8% 58.3% 72.2% 4 Goal < Goal 36.% 55.6% 86.% 5 Free Kick <= Goal 36.% 50.0% 66.7% 6 Corner Kick <= Goal 33.3% 47.2% 72.2% 7 Free Kick < Corner Kick 25.0% 36.% 52.8% 8 Corner Kick < Free Kick 36.% 44.4% 52.8% 9 Corner Kick <= Goal < Free Kick 6.7% 38.9% 63.9% 0 Free Kick <= Goal < Goal 25.0% 36.% 55.6% In our experimens, a oal of 300 hisorical queries were used for he consrucion of HMMM model, where he sysem learned he common knowledge from he general users. As shown in Table V-2, we performed he es for en ses of disinc queries, including hree singleeven queries, five wo-even paern queries, and wo hree-even paern queries. For example, Query 9 Corner Kick <= Goal < Free Kick means a paern wih a corner kick, followed by a goal, and hen a free kick, where corner kick and goal may possibly occur wihin he same video sho. For each of hese en queries, hree ses of ess are performed and each of hem represens disinc user ineress. In each of hese ess, he user profile is consruced based on 30 hisorical 22

136 queries and all hree possible recommendaion mehods are esified. Twelve op-ranked video clips shown in he firs wo screens (wih six resuls each) are checked, which is called scope. Thus, accuracy here is defined as he percenage of he number of user saisfied video clips wihin he scope. Finally, he average accuracy is compued based on hese ess. Experimenal Comparison of Differen Recommendaions 00.0% 90.0% 80.0% Generalized Fuzzy Weighed Personalized Average Accuracy 70.0% 60.0% 50.0% 40.0% 30.0% 20.0% 0.0% 0.0% Query ID Figure V-4. Experimenal comparison of differen recommendaions Figure V-4 illusraes he comparison of he average accuracy values across hese hree kinds of recommendaions. As we can see from his figure, he resuls based on Generalized recommendaion have he lowes average accuracy; Personalized recommendaion offers he bes resuls, while Fuzzy Weighed recommendaion has he performance in beween. In general, he personalized recommendaion does offer beer resuls by learning individual user preferences by using he HMMM-based profile. Meanwhile, hough he generalized recommendaion may no fully saisfy he individual users, i represens he common knowledge learned from he general users and requires shorer processing ime. By adoping fuzzy weighed recommendaions, he users are offered more flexibiliy in video rerieval. 23

137 5.5.8 Summary Wih he proliferaion of mobile devices and mulimedia daa sources, here is a grea need for effecive mobile mulimedia services. However, wih heir unique consrains in display size, power supply, sorage space as well as processing speed, he mulimedia applicaions in mobile devices encouner grea challenges. In his secion, we presen MoVR a user adapive video rerieval framework in he mobile wireless environmen. While accommodaing various consrains of he mobile devices, a se of advanced echniques are developed and deployed o address essenial issues, such as he semanic gap beween low-level video feaures and high-level conceps, he emporal characerisics of video evens, and individual user preference, ec. Specifically, a Hierarchical Markov Model Mediaor (HMMM) scheme is proposed o model various levels of media objecs, heir emporal relaionships, he semanic conceps, and highlevel user percepions. The HMMM-based user profile is defined, which is also inegraed seamlessly wih a novel learning mechanism o enable he personalized recommendaion for an individual user by evaluaing his/her personal hisories and feedbacks. In addiion, he fuzzy associaion concep is employed in he rerieval process such ha he users gain conrol of he preference selecions o achieve reasonable radeoff beween he rerieval performance and processing speed. Furhermore, o improve he processing performance and enhance he porabiliy of clien-side applicaions, sorage consumpion informaion and compuaionally inensive operaions are suppored in he server side; whereas he mobile cliens are solely required o mainain he minimized size of daa o enable he rerieval process. The virual cliens are designed o perform as a middleware beween server applicaions and mobile cliens. This design helps o reduce he sorage load of mobile devices and o provide greaer accessibiliy wih heir cached media files. Finally, a mobile-based soccer video rerieval sysem is developed and esed o demonsrae he effeciveness of he proposed framework. 24

138 CHAPTER VI. SECURITY SOLUTIONS FOR MULTIMEDIA SYSTEMS This chaper discusses he securiy module of DIMUSE, which is responsible for he auhoriy checking o guaranee he securiy assurance of he mulimedia informaion rerieved and used by he disribued mulimedia applicaions. In order o saisfy he inricae requiremens of mulimedia securiy, a framework called SMARXO is presened o suppor muli level securiy access conrol by combining several imperaive echniques: Role-Based Access Conrol (RBAC), exensible Markup Language (XML), and Objec-Relaional Daabase Managemen Sysem (ORDBMS). 6. Inroducion Wih he rapid developmen of various mulimedia echnologies, more and more mulimedia daa are generaed in he medical, commercial, and miliary fields, which may include some sensiive informaion ha should no be accessed by, or can only be parially exposed o, general users. Therefore, user-adapive mulimedia daa access conrol has became an essenial opic in he areas of mulimedia daabase design and mulimedia applicaion developmen for informaion securiy purposes. RBAC (Role-Based Access Conrol) is a good candidae for user auhorizaion conrol. However, mos of he exising RBAC models mainly focus on documen proecion wihou fully considering all he possible environmenal consrains. Alhough i is claimed ha some exended models are able o offer proecion on mulimedia files, here are sill some unsolved problems. For insance, Figure VI-(a) shows an image which can be accessed, bu he plae objec inside should no be displayed. If a user requess his image, he/she can only view he parial image as shown in Figure VI-(c). The focal goal of our securiy research can be oulined as consrucing a framework o conrol he access o mulimedia applicaions, files, and furhermore he visual/audio objecs or segmens embedded in he mulimedia daa. In his chaper, we develop a framework named 25

139 SMARXO (Secured Mulimedia Applicaion by adoping RBAC, XML and ORDBMS). Several significan echniques are proficienly mixed in SMARXO o saisfy he complicaed mulimedia securiy requiremens. Firs, efficien mulimedia analysis mechanisms can be used o acquire he meaningful visual/audio objecs or segmens. Second, XML and objec-relaional daabases are adoped such ha proficien mulimedia conen indexing can be easily achieved. Third, we upgrade and embed a dominan access conrol model which can be ailored o he specific characerisics of mulimedia daa. XML is also applied o organize all kinds of securiy-relaed roles and policies. Finally, and mos imporanly, hese echniques are efficienly organized such ha muli-level mulimedia access conrol can be achieved in SMARXO wihou any difficuly. Figure VI-. Example of image objec-level securiy (a) original image (b) segmenaion map (c) hiding a porion of he image 6.2 SMARXO Archiecure There are hree phases available in order o build up he complee securiy verificaion archiecure for mulimedia applicaions. Figure VI-2 illusraes he SMARXO archiecure. The mulimedia daa, exraced feaures, and furhermore he XML documens are all organized in he ORDBMS. Once a user, including he adminisraor, logs in o he sysem and requess he mulimedia daa, he securiy checker verifies he user s idenificaion and he relaed permission. The mulimedia manager responds based on he securiy checking resuls. The source mulimedia daa may need o be processed in order o hide he objec-level or scene/sho-level informaion. In addiion, hrough his framework, he adminisraors are capable of creaing, deleing, and modifying he user roles, objec roles, emporal roles, IP address roles, and securiy policies. 26

140 Since all he proecion-relaed informaion is managed by XML, securiy informaion rerieval becomes very convenien. Clien Side User Adminisraor Secured Mulimedia Managemen Securiy Checker Securiy Manager Mulimedia Manager XML + Objec Relaional Daabase Permission Roles (Allow/Deny) Video Hierarchy Image Objecs... User Roles Objec Roles Temporal Roles IP address Roles Mulimedia Conens Mulimedia Feaures XML doc Bye daa or objecs Texs or numerical daa Figure VI-2. SMARXO archiecure 6.3 Mulimedia Access Conrol The radiional RBAC mehods need o be exended o perform superior access conrol funcionaliies such as he emporal and IP address conrol, objec-level and scene/sho-level access conrol, ec. Based on he formal definiion of radiional RBAC in [Sandhu96], he exended formal definiions are given in Figure VI-3. Compared wih he radiional RBAC model, we also inroduce he objec roles, emporal roles, and IP address roles. The associaed rules are defined such ha hese advanced roles can be combined o perform inclusive access conrol. 27

141 Ses: U: Users (*) O: Objecs Ru: User Roles (*) Ro: Objec Roles S: Sessions (*) R: Temporal Roles P: Permissions (*) Ri: IP address Roles Rules: ) UA U Ru: user-role assignmen 2) RH Ru Ru: a parial order of role hierarchy 3) PA P Ru: a basic permission-user role assignmen 4) (*) OA O Ro: objec-role assignmen 5) (*) OP P Ro: a permission-objec assignmen 6) (*) R Ru R Ri: an assembled role se wih environmenal consrains 7) (*) OPA OP R: an advanced permission-role assignmen 8) user: S U, a funcion mapping a session o a user 9) u_roles: S 2 Ru, a basic funcion mapping a session o a se of user roles 0) (*) roles: S 2 R, an advanced funcion mapping a session o a se of roles ) permissions: Ru 2 P, mapping a user role o a se of permissions 2) permissions : Ru 2 P, mapping a user role o a se of permissions wih role hierarchies 3) (*) permissions : R 2 OP, mapping an assembled role o a se of permissions 4) (*) permissions : R 2 OP, mapping an assembled role o a se of permissions wih role hierarchies 5) permissions(r) = {p: P (r, p) PA } 6) permissions (r) = {p : P r r (r, p) PA} 7) (*) permissions (r) = {p: OP (r, p) OPA } 8) (*) permissions (r) = {p : OP r r (r, p) OPA} (Noe: he ones marked wih * are advanced feaures of SMARXO) Figure VI-3. Exended RBAC definiions in SMARXO 28

142 6.3. Mulimedia Indexing Phase In order o suppor muli-level securiy, he mulimedia daa are required o be sored hierarchically. For insance, by applying image segmenaion echniques on Figure VI-(a), he corresponding segmenaion map (as shown in Figure VI-(b)) can be achieved. Each exraced objec is bounded wih a recangle. The exracion resuls may help people idenify he meaningful objecs and compue he associaed bounding boxes. Boh he original image and he image objec informaion can be sored in he ORDBMS. If a specific securiy policy requires some porions of he arge image o be hidden from he user, he sysem can rerieve he subobjec s aribues and process he original image o hide hose porions (e.g., he proeced plae in Figure VI-(c)). XML can be adoped o index he image objec informaion by a 6- uple elemen: <o_id, o_name, o_x, o_y, o_widh, o_heigh>, which are he objec id, objec descripion, x and y coordinaes of he op-lef poin, and he objec widh and heigh, respecively. Such an example can be found in Figure VI-4(a). (a) <ImageObjecs> <Image imgid= i00 > <Objec o_id= i00o0 > <o_name>tag</o_name> <o_x>40</o_x> <o_y>80</o_y> <o_widh>8</o_widh> <o_heigh>50</o_heigh> </Objec> <Objec o_id= i00o02 > <o_name>car</o_name> </Objec> </Image> </ImageObjecs> (b) <VideoHierarchy> <Video v_id= v0 > <Even e_id= e0 > <Scene c_id= c0 > <Sho s_id= s0 > <frame_s></frame_s> <frame_e>89</frame_e> </Sho> </Scene> </Even> </Video> </VideoHierarchy> Figure VI-4. XML examples of mulimedia hierarchy (a) example for image objecs (b) example for video hierarchy 29

143 By uilizing video decoding, sho deecion, and scene deecion echniques, he specific video can be auomaically segmened and diverse levels of he video objecs can be achieved: frame, sho, scene, and even. For he purpose of video indexing, we can furhermore apply XML o sore his kind of video hierarchy informaion. As shown in Figure VI-4(b), he sar frame and end frame numbers of he shos are sored o mark he segmenaion boundaries. In SMARXO, a sho is reaed as he fundamenal uni o sore he video daa for efficiency purposes. Hence, sho-level securiy can be performed easily by displaying he accessible shos and skipping hose prohibied shos. In addiion, users are allowed o manually idenify heir arge mulimedia objecs or segmens by giving he corresponding parameers Securiy Modeling Phase In mos mulimedia applicaions, a reques behavior can be briefly recognized by a 4- uple: <who, wha, when, where>. The meaning of his reques is ha some user requess some daa a some ime by using some compuer. As discussed before, mos of he relaed research work can only conrol accesses by he who and wha aribues. Few models can suppor securiy verificaion on he when aribue. By conras, our framework suppors all of hem User Roles User roles, also recognized as Subjec roles, are he mos fundamenal feaure of RBAC. In addiion o he basic requiremens, SMARXO suppors one more specific feaure on user auhorizaion. When he adminisraor creaes a new user accoun, he/she can choose he defaul propery of his user from wo opions. One is o iniially gran all he access abiliies o his user, and hen assign he roles which deny his user s access o some objec. The oher opion is o disable he user from accessing by defaul. Then he permission roles can be graned o his accoun. In Figure VI-5(a), he user Bailey in he Professor group is assigned he defaul value Allow ; while he user Smih in he Suden group is assigned he defaul propery Deny.. 30

144 (a) <SubjecRoles> <UserGroup defaul= Allow > <Group g_id= Professor > <User u_id= Bailey > <Password>abc</Password> </User> </Group> </UserGroup> <UserGroup defaul= Deny > <Group g_id= Suden > <User u_id= Smih > <Password>32</Password> </User> </Group> </UserGroup> </SubjecRoles> (b) <ObjecRoles> <o_group id= Shos_a > <scene s_id= s02 > <sho>2</sho> <sho>3</sho> <sho>4</sho> <scene> </o_group> <o_group id= Shos_b > <sho>6/sho> <sho>2</sho> </o_group> </ObjecRoles> Objec Roles Figure VI-5. XML examples of he fundamenal roles (a) example of subjec roles (b) example of objec roles Someimes, he user may no be able o access one or more segmens/objecs of a mulimedia file. However, he/she should be able o access oher pars of his file. The objec roles are faciliaed o saisfy his requiremen. Figure VI-6 illusraes a video sho sequence sored in he daabase. User A canno access shos 2, 3, 4; while User B canno access shos 6 and 2. However, A and B should be allowed o view he oher shos of his video excep for heir prohibied segmens. SAMRXO suppors his kind of access conrol by modeling boh he objec roles and mulimedia hierarchy informaion. Figure VI-5(b) depics an XML example for he objec roles. Furhermore, in order o efficienly organize pleniful objec roles, we inroduce he objec-role hierarchy which is defined as follows. 3

145 User A should no access User B should no access Figure VI-6. Example requiremens for video scene/sho-level access conrol Definiion VI-: An Objec Hierarchy OH = O, OG, ), where O is a se of objecs and ( OG OG = O U G wih G is a se of objec groups. is a parial order on OG called he dominance relaion, and O OG is he se of minimal elemens of OG wih respec o he parial order. Given wo elemens x, y OG, x y iff x is a member of y. (a) <TemporalRoles> <Group e_id= Holiday > <Holiday h_id= Thanksgiving > <Monh></Monh> <WeekNo>4</WeekNo> <WeekDay>4</WeekDay> </Holiday> </Group> <Group e_id= OfficeHour > <H_inerval> <H_sar>9</H_sar> <H_end>7</H_end> </H_inerval> </Group> </TemporalRoles> (b) <SpaialRoles> <ipgroup ipg_id= Universiy > <ipuniv ipu_id= FIU > <ipdep ipd_id= SCS > <seg_fix>3</seg_fix> <seg2_fix>94</seg2_fix> <seg3_fix>33</seg3_fix> <seg4_sar></seg4_sar> <seg4_end>255</seg4_end> </ipdep> </ipuniv> </ipgroup> </SpaialRoles> Figure VI-7. XML examples of he opional roles (a) example of emporal roles (b) example of IP address roles 32

146 Temporal Roles In a mulimedia applicaion, daa may be available o he users a cerain ime periods bu unavailable a ohers. In order o achieve his arge, he emporal consrains can be generally formalized wih he following aribues: year, monh, week number, week day, hour, minues, ec. As shown in Figure VI-7(a), Thanksgiving is depiced wih hree aribues, which means ha Thanksgiving is he fourh Thursday of November. The oher emporal role named OfficeHour illusraes ha he office hours are from 9 o clock o 7 o clock every day Spaial Roles Even for he same user, he/she may be able o access he mulimedia daa only by using some specific compuers. The IP addresses can be used o embed his kind of consrain by idenifying he differen neworks and cliens. Usually, an IP address appears in he equivalen doed decimal represenaion such as and each oce in i ranges from 0 o 255. By checking he associaed IP address, he server can judge wheher his access is allowed. For his purpose, we define he IP address segmen for he relaed role modeling. Definiion VI-2: Given he oces named I, I 2, I 3, I 4, he IP address segmen expression IP can be defined as IP = = >, where n = 4, 0 x 2, 0 y 2, n x I j j j y j I d j 8 j 8 x, y N, x y 2 8 for j =,, 4, I d I, I, I, }. j j j + j { 2 3 I4 The symbol > idenifies he se of saring poins of he inervals. For example, 3 I + 94 I I3 + ( I4 > 254 I4) sands for he segmen beween and I can be modeled by XML as shown in Figure VI-7(b), which also means his segmen is under he role of Universiy FIU SCS Securiy Rules The securiy policies in radiional policies are basically classified ino wo caegories. One is Allow Policy which means ha cerain users can access cerain objecs; he oher is 33

147 Deny Policy which means ha cerain users canno access cerain objecs. SMARXO inroduces Parial Allow Policy which means ha he user can only access parial daa of his objec. The definiion of securiy policy is given in Figure VI-8(a) wih a 5-uple. Figure VI-8(b) gives a policy example which means ha he Suden can access Shos_a in Holiday by using he machines of SCS. (a) A securiy policy can be a 5-uple: <Ru, Ro, R, Ri, Acc> Where: Ru: a user role; Ro: an objec role; R: a emporal role; Ri: an IP address role; Acc: accessibiliy, he value can be Allow, Deny, or PariallyAllow. (b) <PolicyRoles> <policy p_id= p0 > <Ru>Suden</Ru> <Ro>Shos_a<Ro> <R>Holiday<R> <Ri>SCS</Ri> <Acc>Allow</Acc> </policy> </PolicyRoles> Figure VI-8. Securiy policies (a) formalized securiy policy (b) XML example on policy roles DBMS Managemen Phase In his framework, he mulimedia feaures, XML documens, and he mulimedia conens are sored ino an ORDBMS. By efficienly managing he XML segmens in he ORDBMS, he XML documens can be easily updaed when ediing he securiy policies or he mulimedia hierarchy informaion. Moreover, all he conens prepared in XML can be searched easily and accuraely. In oher words, i is very convenien for he adminisraor o rerieve he securiy policies by performing XML queries in he ORDBMS. Furhermore, ORDBMS provides some valuable funcionaliy o sore he bye daa and large objecs. Therefore, he images as well as he video shos can be professionally managed. 34

148 6.4 Securiy Verificaion Based on an access reques, he sysem will firs check he user ID and password, and hen check he user roles, objec roles, emporal roles, and IP address roles consequenly. Afer ha, he securiy policy checks are performed on he Objec Eniy Se (OES) of he requesed objec o ha includes boh he objec iself and all he eniies s (segmens or sub-objecs belong o o). Definiion VI-3: Objec Eniy Se: OES ( o) = { o} U { s : s o}. Figure VI-9 depics he securiy verificaion algorihm. A brief funcion p_check(o) is presumed o check if he user can access objec o in he specified ime from some specified compuer. Three kinds of resuls can be formalized as follows:. The access will be denied iff p _ check( o) = FALSE. 2. A user can access he original mulimedia daa o iff OES( o)[ p _ check( ) = TRUE], where can be any eniy including o and all o s sub-objecs. 3. A user can access he processed mulimedia daa o where he prohibied sub-objecs are removed from o iff ( p _ check( o) = TRUE) ( s o[ p _ check( s) = FALSE]), where s can be any sub-objec or segmen which belongs o o. 35

149 Inpu: An Access Reques <id, pwd, ime*, ip_addr*, objec> Oupu: () FALSE: Access is denied; (2) objec: Complee mulimedia daa as requesed; (3) objec : Processed mulimedia daa wihou he proeced objecs. Algorihm securiy_check(id, pwd, ime*, ip_addr*, objec): ) BEGIN 2) if (id, pwd) U //Verify user ideniy 3) reurn FALSE; 4) else 5) if (ge_user_role(id)) //Check user-role assignmen 6) u_role = ge_user_role(id); 7) else u_role = id; 8) if (ge_objec_role(objec)) //Check objec-role assignmen 9) o_role = ge_objec_role(objec); 0) else o_role = objec; ) if (ge_emporal_role(ime)) //Check emporal-role assignmen 2) _role = ge_emporal_role(ime); 3) else _role = ime; 4) if (ge_ipaddr_role(ip_addr)) //Check IP address role assignmen 5) ip_role = ge_ipaddr_role(ip_addr); 6) else ip_role = ip_addr; 7) if (check_permission(u_role, o_role, _role, ip_role)=deny) 8) reurn FALSE; 9) else 20) for all sub_objec objec //Check permission on he sub-objecs 2) if (check_permission(u_role, sub_objec, _role, ip_role)=deny) { 22) objec = securiy_process(objec) //Process mulimedia daa 23) reurn objec ; } //User can access he processed objec 24) else 25) reurn objec; //User can access he complee objec 26) END (Noe: Feaures marked wih * are advanced ones bu opional in SMARXO.) Figure VI-9. Algorihm for securiy verificaion in SMARXO 36

150 6.5 Conclusions In his chaper, a pracical framework SMARXO is proposed o provide mulilevel mulimedia securiy for mulimedia applicaions. RBAC, XML and ORDBMS are efficienly combined o achieve his arge. In SMARXO, he RBAC model is enhanced and uilized o manage complicaed roles and role hierarchies. Moreover, he mulimedia documens are indexed and modeled such ha access conrol can be faciliaed on muli-level mulimedia daa. Compared wih he oher exising securiy models or projecs, SMARXO can deal wih more inricae siuaions. Firs, he image objec-level securiy and video scene/sho-level securiy can be easily achieved. Second, he emporal consrains and IP address resricions are modeled for access conrol purposes. Finally, XML queries can be performed such ha he adminisraors can proficienly rerieve useful informaion from he securiy roles and policies. The comparison among SMARXO and hese exising securiy models/approaches is depiced in Table VI-. Table VI-. Comparison of mulimedia securiy echniques Suppor RBAC 3 TRBAC GTRBAC GRBAC GOCPN SMARXO Access Conrol Yes Yes Yes Yes Yes Yes Role Hierarchy Yes Yes Yes Yes No Yes Temporal Consrains IP address Resricions Securiy on Mulimedia Daa Securiy on Mulilevel Objecs No Yes Yes Yes No Yes No No No No No Yes No No No Yes Yes Yes No No No No Yes Yes 37

151 CHAPTER VII. MULTIMEDIA SYSTEM INTEGRATION During he las decade, he rapid developmen of echnologies and applicaions for consumer digial media has led o he desire o capure, sore, analyze, organize, rerieve, and display mulimedia daa. Accordingly, various mulimedia daa managemen sysems have been developed o fulfill hese requiremens. However, mos of hese sysems mainly focus on one or few funcionaliies. Some sysems are concerned wih he producion of mulimedia maerial; some sysems handle mulimedia analysis and rerieval issues; while some oher sysems only provide he funcionaliies o synchronize various mulimedia files ino a presenaion. In his chaper, a disribued mulimedia managemen sysem called DMMManager [ChenSC03b] is presened, which ries o inegrae he full scope of funcionaliies for mulimedia managemen including mulimedia daa capuring, analysis, indexing, organizaion, conen-based rerieval, mulimedia presenaion design and rendering. 7. Sysem Overview In order o provide he abiliy for handling simulaneous accesses of muliple users, a muli-hreaded clien/server archiecure is designed and deployed in DMMManager. The server and he clien are developed by using C++ and Java, respecively, and can run on muliple plaforms. On he server-side, a huge amoun of mulimedia daa is organized and sored, and all he compuaion inensive funcions are arranged on he server-side o fully uilize he server's compuaion power. Accordingly, a daabase engine is implemened o suppor file supply, feaure exracion, query processing, raining compuaions, ec. The clien-side applicaions provide several user-friendly inerfaces for he users o issue queries, check rerieval resuls, and edi he mulimedia presenaion convenienly, gaher various commands issued by he users, and package hem ino differen caegories. The TCP proocol is uilized in he clien and server communicaions. Upon receiving he requess, he server analyzes he message, idenifies wha 38

152 kind of operaions he user did, decides which componen(s) need(s) o be run nex, and hen acivaes he corresponding operaions. When he complee mulimedia conens are required for he playback of media sreams, UDP proocol is adoped o ransfer he mulimedia daa. DMMManager Daa Collecing Analysis and Indexing Rerieval Presenaion Design Presenaion Rendering Creae Mulimedia Files via Sofware Disribue Files Ino Direcories Browse Server Direcories Handle MATN Files: Creae Save Open Presenaion via JMF Player Copy or Download from Oher Sorage Devices Transfer from Oher Devices (Scanner, Digial Camera, or Digial Camcorders) Image Feaure Exracion (Color, Objec Locaion, ec.) Video Sho Deecion Conen Based Image Rerieval Training he Sysem wih Access Paerns & Access Frequencies Design MATN Model: Add/Delee Arc (Branches, Subnework, Loops) Presenaion via Web Browser Live Video/Audio Capure Video Indexing via Key-frames & Sho Boundaries Conen Based Video Rerieval Conver MATN Srucure o HTML+SMIL Documen Figure VII-. The mulimedia managemen flow of DMMManager This secion mainly describes how DMMManager deals wih a se of mulimedia managemen issues and generaes a mulimedia presenaion. The mulimedia managemen flow is illusraed in Figure VII-, where he colored boxes conain he funcionaliies suppored by DMMManager and he operaional sequence goes from daa collecing o presenaion rendering. The res of his chaper is organized as follows. In he nex secion, mulimedia daa gahering mehods are addressed and he live video/audio capure componen is inroduced. In Secion 7.3, mulimedia daa analysis and indexing are discussed. Secion 7.4 describes hree major rerieval mehods in DMMManager. The Mulimedia Augmened Transiion Nework (MATN) [ChenSC97][ChenSC00a][ChenSC00c] model and he corresponding presenaion design module are inroduced in Secion 7.5. Secion 7.5 discusses he presenaion rendering echniques as well. Finally, he conclusions are summarized in Secion

153 7.2 Mulimedia Daa Collecing Recen developmens of he mulimedia capure devices, daa compression algorihms, large capaciy sorage and high bandwidh neworks have helped creae he overwhelming producion of mulimedia conen. DMMManager can handle he mulimedia daa generaed or colleced from muliple sources. As shown in Figure VII-, he daa can be creaed via sofware, copied or downloaded from oher sorage devices, ransferred from oher devices, or capured from live video/audio. To capure live video/audio, he video/audio capure hardware such as a web camera is used o capure consecuive scenes, where he users can decide when o begin he capuring process. Then he daa gahered are encoded ino video files wih MPEG or AVI formas. Implemened by using Java Media Framework (JMF) [JMF], decoding and monioring he live video/audio are also suppored. In DMMManager, he capured raw video/audio daa are iniially sored in he clien-side and he user may use he upload funcion o ransfer hem o he daabase a he server so ha hey can be processed and analyzed for fuure rerieval. 7.3 Mulimedia Analysis and Indexing In order o address and access he desired mulimedia daa efficienly, large-scale digial archives need o be analyzed and indexed in he daabase a he server. In DMMManager, differen mulimedia files are caegorized and sored ino a se of direcories based on heir media ypes and conens. For example, he image files can be classified ino animals, flowers, spors, ec.; while he video files can be caegorized as movies, news, adverises, ec Image Analysis and Indexing Due o he explosion of image files and he inefficiency of ex-based image rerieval, Conen-Based Image Rerieval (CBIR) approaches are implemened in DMMManager. Currenly, DMMManager provides he funcionaliies o exrac low-level feaures (such as color) and mid-level feaures (such as objec locaion) from he images, where he HSV color space is 40

154 used o obain he color feaures and he SPCPE algorihm [ChenSC00b] is applied for objec locaion feaures. I is worh menioning ha our sysem is flexible so ha he funcional componens o exrac oher feaures such as exure and shape can be easily plugged in. The exraced feaures ogeher wih he images are indexed and sored in he daabase in he server, where each image is classified ino a cerain caegory and has is own domain name and a unique ID. In DMMManager, he image meadaa and feaure daa ses can be sored in he ex files or in he Microsof Access daabase, while he laer one can provide more suppor on he image daabase managemen and speed up he rerieval process Video Analysis and Indexing Differen from images, videos are coninuous media wih he emporal dimensions and consume a huge amoun of sorage. Therefore, insead of sequenial access o he video conen, which requires remendous ime, video summarizaion, a process of exracing absrac represenaions ha compresses he essence of he video, becomes a challenging research opic. DMMManager adops an efficien way o summarize video by uilizing he video sho deecion mehod proposed in [Yilmaz00], which auomaically segmens he video ino shos and exracs he key frames in a meaningful manner. Here, a video sho is defined as a coninuous view filmed by one camera wihou inerrupion and he firs frame of he deeced sho is considered as he key frame represening his sho. The procedure of video analysis and indexing is as follows. Firs, he video is segmened ino frames. Then he feaures of each frame are exraced for comparison. Noe ha each frame can be considered as an image, and herefore, he image feaure exracion funcionaliies implemened in he image analysis sage can be used o exrac he feaures from he frames. By considering he relaive spaial-emporal relaionships beween video objecs [ChenSC0a], he key frame represenaions and sho boundaries can be achieved. Finally, he video is segmened 4

155 ino smaller video clips (shos) based on hese sho boundaries. These sho files ogeher wih heir boundary informaion and key frames are sored in he server for fuure video rerieval. Boh AVI and MPEG formas are suppored in his video sho deecion and segmenaion process. 7.4 Mulimedia Rerieval Since he mulimedia daa are caegorized and sored in he direcories based on heir media ypes and he defined caegories, DMMManager provides a direcory-based rerieval funcionaliy, where all he daa can be browsed in a hierarchical manner. In order o faciliae he users o selec and acquire he desired files, a file supply funcionaliy is implemened, which is used by a conen-based image rerieval componen and a key-frame based video rerieval componen o allow he clien o browse, download and upload files from/o he server Conen-based Image Rerieval Since he rained CBIR subsysem can provide users wih heir desired images more accuraely and more quickly, i will grealy faciliae he design of mulimedia auhoring and presenaion by using he rerieved images. In DMMManager, his MMM-based CBIR sysem is conneced wih he mulimedia presenaion auhoring ool based on he Mulimedia Augmened Transiion Nework (MATN) model [ChenSC00a]. Wih a muli-hreaded clien/server archiecure, his sysem can suppor muliple cliens o issue queries and offer feedback simulaneously. In he server-side, a daabase engine is implemened o suppor image query processing, file supply, raining compuaions, feaure exracion and indexing of images. There are 0,000 color images as well as heir feaure se sored in our daabase. The clien-side provides a conen-based image rerieval user inerface, which allows he browsing, query and feedback of he image conens. 42

156 Sep3: Download o presenaion design inerface Sep2: Query by example and choose preferred image Sep: Go o Image rerieval inerface Sep 4: Compose MATN presenaion model Sep 5: Play he designed presenaion wih JMF player Figure VII-2. Mulimedia presenaion auhoring ool 43

157 As shown in Figure VII-2, afer he user finds he desired images using he image rerieval sysem, he/she may use he Download funcion o add hem ino he presenaion maerial ree, which can be employed laer in mulimedia presenaion designs. Basically, our CBIR sysem can help users find heir images of ineres more accuraely and more quickly such ha mulimedia presenaion design becomes much easier Video Daa Browsing and Rerieval Key-frame Based Video Browsing In [Yoshiaka99], he Query By Example (QBE) approach for video rerieval is proposed, where he video daa are considered as a se of images wihou emporal inerrelaions. Therefore, he image rerieval mehod can be performed for video rerieval purposes. Differen from he QBE video rerieval sysem, DMMManager adops he key-frame based searching mehod, which considers he emporal relaionships among he video daa and is much more powerful for video queries. Figure VII-3. The key-frame based video rerieval inerface wih a sho displayed Each video file is divided ino shos, which are sored and organized in he server-side wih heir sho boundary informaion and key frames. Each key frame corresponds o a unique 44

158 sho file. The video rerieval process is defined as follows. Firs, users selec he favorie domain. The sysem liss he names of all he video files in his caegory accordingly. Based on his lis, users can selec he desired video file. Then he key frames of he seleced video will be displayed. From he key frames, users can easily know he conens of he whole video wihou previewing i. Once he users double-click a key-frame, he corresponding video sho will be displayed via a JMF player as shown in Figure VII-3. Boh he original video file and he video shos can be downloaded for fuure mulimedia presenaion design and rendering SoccerQ: A Soccer Video Rerieval Sysem A soccer video rerieval sysem named SoccerQ [ChenSC05a] is inegraed in DMMManager o faciliae soccer even searching and browsing. A graphical query language is designed for specifying he relaive emporal queries. Basically, he user is allowed o specify he search arge and search space. Afer ha, differen objecive evens and he emporal relaionship model ypes can be chosen. The double-humbed slider is uilized for he even posiion, duraion, or range specificaion. To furher quanify he relaed parameers, he user is allowed o inpu he number and specify he uni. As menioned earlier, he minue, second, and sho unis are provided. In addiion, he senenial operaors are provided such ha differen query rules can be combined. Those operaors include: he negaive operaor no, he conjuncive operaor and, and he disjuncive operaor or. Given he following wo queries, he corresponding visual query specificaions are lised in Table VII-. Table VII-. Example mappings o he graphical query language Evens and Relaionships: Query 3: R A θ V, θ = sars A = Goal Query 4: A θ R B, θ = sars A = Corner kick B = Goal Parameers (uni) R A0 = 0 R Af = 0 (minues) R B0 -A 0 = 0 R Bf -A f = 2 (minues) R A A Graphical Query Filer R A0 A 0 R Af A f R B R B0 R Bf 45

159 Query : Find all he soccer videos from he daabase where here is a goal occurrence in he firs 0 minues of he video. Query 2: Find all he corner kick shos from all he female soccer videos where he corner kick resuled in a goal even occurring in 2 minues. Figure VII-4. Soccer rerieval inerface wih example emporal query and resuls The clien-side applicaion inegraes boh soccer even query and video browsing panels in a common framework. When a query is issued, he relaed parameers are ransferred o he server-side. Accordingly, he server-side daabase engine performs compuaion-inensive funcionaliies including graphical query ranslaion, query processing, video supply, ec. The huge amoun of mulimedia daa is organized by adoping he PosgreSQL daabase. As shown in Figure VII-4, he query crieria are specified in he graphical query panel in he upper-righ corner. The illusraed query example is o find all he corner kick shos from all he female soccer videos and he corner kick sho can cause a goal even occurring in 2 minues. The key frames of four resul shos are displayed in he video browsing panel. The video 46

160 sho can be displayed by double clicking he corresponding key frame. Finally, he rerieved video or video shos can be downloaded o he clien-side and be composed and hen displayed in a mulimedia presenaion. 7.5 Mulimedia Presenaion Module 7.5. Mulimedia Presenaion Auhoring Compared o he radiional ex or numerical daa, he mulimedia daa are far more complicaed because hey usually conain spaial and/or emporal relaionships. Therefore, in order o compose he mulimedia objecs from disribued sources ino a sophisicaed presenaion, i is very imporan o develop absrac semanic models which mee he following requiremens [Berino98][ChenSC00a]: Firs, he specificaion of emporal consrains for mulimedia conen mus be suppored by he devised model. Second, he model mus ensure ha hese emporal consrains can be saisfied a runime. Finally, he model should be a good programming daa srucure for implemenaion o conrol mulimedia playback. Currenly, he researches on he mulimedia concepual models lead o four differen direcions: Timeline-based models, Scrip-based models, Graph-based models, and Srucure-based models. Our DMMManager adops an absrac semanic model called Mulimedia Augmened Transiion Nework (MATN) model [ChenSC00a], which combines he srucure-based auhoring wih well-defined graphic based noaions. Typically, he MATN model can offer grea flexibiliy for he designers o synchronize he heerogeneous mulimedia objecs ino presenaions. MATN can no only model he sequenial mulimedia scenarios, bu also skech he complicaed presenaions, which suppor user ineracions, srucure reusabiliy, and qualiy of services. Basically, an MATN model consiss of a group of saes conneced by he direced arcs. The mulimedia srings are marked as he labels of he arcs. The mos fundamenal componens of he MATN model are defined as follows: 47

161 Sae: Denoes he saring siuaion of he nex mulimedia sream and/or he ending siuaion of he previous mulimedia sream. Each sae is idenified wih is own name. Arc: An arc connecs wo saes and has is own ime duraion. When he ime duraion is over, a ransiion occurs and he nex sream combined wih he nex arc will be impored and displayed. Mulimedia inpu sring: A regular expression which describes he spaio-emporal combinaion of diverse mulimedia sreams and how hey will be displayed. The single media objec can be represened by is ype ( T : ex; I : image; V : video; A : audio) and a unique number. For example: T, V2, ec. The mulimedia sreams can be expressed by he connecion of he muliple objecs linked by he symbol &, e.g. T&V2, ec. Feaure se: Each mulimedia objec has is individual feaure se, which conains a grea deal of useful informaion for he presenaion rendering, such as: he corresponding mulimedia file s pah and name, duraion, saring ime, ending ime, display window locaion and size, ec. The feaure ses are embedded in he MATN and can be edied easily. As shown in Table VII-2, he MATN models are consruced o model Allen s 3 kinds of emporal relaionships [Allen83], where he leers A and B denoe wo diverse mulimedia objecs. I can be easily seen from he able ha he MATN model has a powerful expressive capabiliy so ha i is able o represen all he 3 kinds of emporal relaionships correcly. In addiion, MATN inroduces some advanced erms and relaed funcionaliies: 48

162 Table VII-2. MATN srucures for 3 emporal relaionships Temporal Relaionships Timeline Represenaion MATN Srucure A is BEFORE B B is AFTER A A B A B A MEETS B B is MET by A A B A B A OVERLAPS B B is OVERLAPED by A A B A A&B B A DURING B B CONTAINS A B A B A&B B A STARTS B B is STARTED by A A A&B B B A FINISHES B B is FINISHED by A B A B A&B A is EQUAL o B (B is EQUAL o A) A B A&B Branches: Users can design he muliple branches and selec he favorie branch in he real presenaion. Subsequenly, he corresponding par will be rendered a runime. Therefore, he indeerminacy and user ineracions can be handled. Loops: The loops can be designed and he user can decide he repeiion imes for he corresponding porions of he scenario. Accordingly, he presenaion srucure is simplified when some sreams need o be displayed repeaedly. Sub-nework: The previously designed MATN srucures can be reused as a porion when auhoring he new scenario. Wih only one noion (e.g., P) on he arc, he presenaion will open he relaed MATN file and follow is srucure. When a sub-nework is encounered during he inerpreaion of an MATN model, he conrol of he main presenaion is passed o he sub-nework level in a way ha he presenaion flow wihin 49

163 he sub-nework is insered seamlessly ino he main presenaion. Consequenly, reusabiliy is ensured and he presenaion can be designed in a hierarchical way. Condiion/Acion Table: The able is used o sore he informaion which canno be implied in he saes or arcs. Differen acions can be carried ou depending on wheher a cerain condiion is me or no. Hence, he qualiy of service (QoS) conrol can be suppored wihou difficuly. For example, if he available bandwidh is lower han a cerain hreshold, he compressed version of he video can be delivered insead of he original one. Table VII-3. MATN design buons & funcionaliies Buon Symbol Buon Funcionaliy Buon Symbol Buon Funcionaliy A B Add an Arc A B Creae Branches (The number of branches >=2) A B Delee an Arc A B Merge he available Branches (The number of branches >=2) A B Add a Sub-nework A A&B B Creae a Loop As discussed before, he MATN model can be formally defined as an eigh-uple: <Σ, Γ, Q, ψ,, S, F, T>. Because his model is more capable of modeling synchronizaion relaionships such as concurrency, alernaives, looping, ec., we implemened i o faciliae he design of mulimedia presenaion in DMMManager. Also he MATN is uilized as he inernal daa srucure of his module. Figure VII-5 shows he user inerface of he MATN model design module. The MATN presenaion model can be easily creaed, edied, saved and opened hrough his inerface. As menioned, during he mulimedia rerieval process, users can download he rerieved mulimedia files. From his figure, we can see ha he corresponding file names are 50

164 caegorized and lised wihin a ree-view in he lef side of he window. Users may preview any lised mulimedia objec and selec one of hem or heir combinaions o consruc he MATN model. Several buons are designed o provide he funcionaliies o assemble he MATN srucure, which can be found in Table VII-3. In addiion, more edi funcions, such as delee funcion, are also developed for users o modify he srucure. Figure VII-5 shows an MATN example ha conains a se of filled circles, arcs and he corresponding arc labels. The video file named beach.mpg is previewed as shown in he lefboom corner. The pop up window works for he Add an arc funcion. By clicking he Ener buon, he seleced files will be synchronized ino a mulimedia sream and he corresponding arc will be generaed and added afer he curren final sae (or he seleced sae). Figure VII-5. The user inerface for MATN model design Mulimedia Presenaion Rendering In order o conver he designed MATN model o a mulimedia scenario perceivable o users, a presenaion rendering componen is implemened in DMMManager. In his presenaion rendering layer, wo approaches are offered o fulfill differen requiremens in differen 5

165 environmens. One is o display he mulimedia presenaion in a clien-side applicaion, namely JMF player, implemened wih Java Media Framework (JMF) [JMF]. When he JMF player is no available, he oher approach can be adoped o conver he MATN srucure ino an HTML file wih SMIL [SMIL] noaions which can be displayed in he web browser direcly Presenaion Rendering via JMF Player The Java Media Framework provides superior echniques for rendering hese kinds of mulimedia presenaion models ino a sand-alone applicaion in a runime environmen. In DMMManager, four kinds of disinc media players are implemened o exhibi he ex, image, video and audio daa, respecively. The synchronizaion informaion and he spaial and/or emporal relaionships can be easily rerieved from he MATN model and used o conrol hese players. As indicaed before, he MATN presenaion model can be used o creae a clean and inegraed srucure while enclosing a large amoun of valuable informaion, which plays an imporan role in he rendering process. By double-clicking he sae, he ime duraion specificaion, which is employed o conrol he sar and end of a cerain media sream, can be checked and modified. In addiion, each mulimedia objec involved in he presenaion has a feaure se, which is used o display he presenaion and conrol he layou. Moreover, he user can choose o render he whole or any porion of he MATN model by indicaing he sar and ending sae. Figure VII-6 demonsraes a presenaion rendering example via JMF players. Four differen ypes of mulimedia objecs (image, video, audio, ex) gahered from disribued sources are displayed concurrenly. 52

166 Figure VII-6. The rendered mulimedia presenaion played by he JMF player Figure VII-7. The rendered mulimedia presenaion played by he web browser Presenaion Rendering via SMIL Language As a scrip-based model, SMIL has some disadvanages. For example, close o he form of a programming language, i is hard for general users o learn and use. However, one of is benefis is ha he SMIL noaions can be combined wih he general HTML documens so he 53

A Matching Algorithm for Content-Based Image Retrieval

A Matching Algorithm for Content-Based Image Retrieval A Maching Algorihm for Conen-Based Image Rerieval Sue J. Cho Deparmen of Compuer Science Seoul Naional Universiy Seoul, Korea Absrac Conen-based image rerieval sysem rerieves an image from a daabase using

More information

Video Content Description Using Fuzzy Spatio-Temporal Relations

Video Content Description Using Fuzzy Spatio-Temporal Relations Proceedings of he 4s Hawaii Inernaional Conference on Sysem Sciences - 008 Video Conen Descripion Using Fuzzy Spaio-Temporal Relaions rchana M. Rajurkar *, R.C. Joshi and Sananu Chaudhary 3 Dep of Compuer

More information

Analysis of Various Types of Bugs in the Object Oriented Java Script Language Coding

Analysis of Various Types of Bugs in the Object Oriented Java Script Language Coding Indian Journal of Science and Technology, Vol 8(21), DOI: 10.17485/ijs/2015/v8i21/69958, Sepember 2015 ISSN (Prin) : 0974-6846 ISSN (Online) : 0974-5645 Analysis of Various Types of Bugs in he Objec Oriened

More information

Simple Network Management Based on PHP and SNMP

Simple Network Management Based on PHP and SNMP Simple Nework Managemen Based on PHP and SNMP Krasimir Trichkov, Elisavea Trichkova bsrac: This paper aims o presen simple mehod for nework managemen based on SNMP - managemen of Cisco rouer. The paper

More information

The Impact of Product Development on the Lifecycle of Defects

The Impact of Product Development on the Lifecycle of Defects The Impac of Produc Developmen on he Lifecycle of Rudolf Ramler Sofware Compeence Cener Hagenberg Sofware Park 21 A-4232 Hagenberg, Ausria +43 7236 3343 872 rudolf.ramler@scch.a ABSTRACT This paper invesigaes

More information

Handling uncertainty in semantic information retrieval process

Handling uncertainty in semantic information retrieval process Handling uncerainy in semanic informaion rerieval process Chkiwa Mounira 1, Jedidi Anis 1 and Faiez Gargouri 1 1 Mulimedia, InfoRmaion sysems and Advanced Compuing Laboraory Sfax Universiy, Tunisia m.chkiwa@gmail.com,

More information

IntentSearch:Capturing User Intention for One-Click Internet Image Search

IntentSearch:Capturing User Intention for One-Click Internet Image Search JOURNAL OF L A T E X CLASS FILES, VOL. 6, NO. 1, JANUARY 2010 1 InenSearch:Capuring User Inenion for One-Click Inerne Image Search Xiaoou Tang, Fellow, IEEE, Ke Liu, Jingyu Cui, Suden Member, IEEE, Fang

More information

Implementing Ray Casting in Tetrahedral Meshes with Programmable Graphics Hardware (Technical Report)

Implementing Ray Casting in Tetrahedral Meshes with Programmable Graphics Hardware (Technical Report) Implemening Ray Casing in Terahedral Meshes wih Programmable Graphics Hardware (Technical Repor) Marin Kraus, Thomas Erl March 28, 2002 1 Inroducion Alhough cell-projecion, e.g., [3, 2], and resampling,

More information

Test - Accredited Configuration Engineer (ACE) Exam - PAN-OS 6.0 Version

Test - Accredited Configuration Engineer (ACE) Exam - PAN-OS 6.0 Version Tes - Accredied Configuraion Engineer (ACE) Exam - PAN-OS 6.0 Version ACE Exam Quesion 1 of 50. Which of he following saemens is NOT abou Palo Alo Neworks firewalls? Sysem defauls may be resored by performing

More information

STEREO PLANE MATCHING TECHNIQUE

STEREO PLANE MATCHING TECHNIQUE STEREO PLANE MATCHING TECHNIQUE Commission III KEY WORDS: Sereo Maching, Surface Modeling, Projecive Transformaion, Homography ABSTRACT: This paper presens a new ype of sereo maching algorihm called Sereo

More information

Image segmentation. Motivation. Objective. Definitions. A classification of segmentation techniques. Assumptions for thresholding

Image segmentation. Motivation. Objective. Definitions. A classification of segmentation techniques. Assumptions for thresholding Moivaion Image segmenaion Which pixels belong o he same objec in an image/video sequence? (spaial segmenaion) Which frames belong o he same video sho? (emporal segmenaion) Which frames belong o he same

More information

Design and Application of Computer-aided English Online Examination System NONG DeChang 1, a

Design and Application of Computer-aided English Online Examination System NONG DeChang 1, a 3rd Inernaional Conference on Maerials Engineering, Manufacuring Technology and Conrol (ICMEMTC 2016) Design and Applicaion of Compuer-aided English Online Examinaion Sysem NONG DeChang 1, a 1,2 Guangxi

More information

A time-space consistency solution for hardware-in-the-loop simulation system

A time-space consistency solution for hardware-in-the-loop simulation system Inernaional Conference on Advanced Elecronic Science and Technology (AEST 206) A ime-space consisency soluion for hardware-in-he-loop simulaion sysem Zexin Jiang a Elecric Power Research Insiue of Guangdong

More information

Learning in Games via Opponent Strategy Estimation and Policy Search

Learning in Games via Opponent Strategy Estimation and Policy Search Learning in Games via Opponen Sraegy Esimaion and Policy Search Yavar Naddaf Deparmen of Compuer Science Universiy of Briish Columbia Vancouver, BC yavar@naddaf.name Nando de Freias (Supervisor) Deparmen

More information

Visual Indoor Localization with a Floor-Plan Map

Visual Indoor Localization with a Floor-Plan Map Visual Indoor Localizaion wih a Floor-Plan Map Hang Chu Dep. of ECE Cornell Universiy Ihaca, NY 14850 hc772@cornell.edu Absrac In his repor, a indoor localizaion mehod is presened. The mehod akes firsperson

More information

STRING DESCRIPTIONS OF DATA FOR DISPLAY*

STRING DESCRIPTIONS OF DATA FOR DISPLAY* SLAC-PUB-383 January 1968 STRING DESCRIPTIONS OF DATA FOR DISPLAY* J. E. George and W. F. Miller Compuer Science Deparmen and Sanford Linear Acceleraor Cener Sanford Universiy Sanford, California Absrac

More information

Voltair Version 2.5 Release Notes (January, 2018)

Voltair Version 2.5 Release Notes (January, 2018) Volair Version 2.5 Release Noes (January, 2018) Inroducion 25-Seven s new Firmware Updae 2.5 for he Volair processor is par of our coninuing effors o improve Volair wih new feaures and capabiliies. For

More information

Improving Ranking of Search Engines Results Based on Power Links

Improving Ranking of Search Engines Results Based on Power Links IPASJ Inernaional Journal of Informaion Technology (IIJIT) Web Sie: hp://www.ipasj.org/iijit/iijit.hm A Publisher for Research Moivaion... Email: edioriiji@ipasj.org Volume 2, Issue 9, Sepember 2014 ISSN

More information

MORPHOLOGICAL SEGMENTATION OF IMAGE SEQUENCES

MORPHOLOGICAL SEGMENTATION OF IMAGE SEQUENCES MORPHOLOGICAL SEGMENTATION OF IMAGE SEQUENCES B. MARCOTEGUI and F. MEYER Ecole des Mines de Paris, Cenre de Morphologie Mahémaique, 35, rue Sain-Honoré, F 77305 Fonainebleau Cedex, France Absrac. In image

More information

Managing XML Versions and Replicas in a P2P Context

Managing XML Versions and Replicas in a P2P Context Managing XML Versions and Replicas in a P2P Conex Deise de Brum Saccol1 1,2, Nina Edelweiss 2, Renaa de Maos Galane 2,4, Carlo Zaniolo 3 2 Insiuo de Informáica - Universidade Federal do Rio Grande do Sul

More information

Coded Caching with Multiple File Requests

Coded Caching with Multiple File Requests Coded Caching wih Muliple File Requess Yi-Peng Wei Sennur Ulukus Deparmen of Elecrical and Compuer Engineering Universiy of Maryland College Park, MD 20742 ypwei@umd.edu ulukus@umd.edu Absrac We sudy a

More information

Service Oriented Solution Modeling and Variation Propagation Analysis based on Architectural Building Blocks

Service Oriented Solution Modeling and Variation Propagation Analysis based on Architectural Building Blocks Carnegie Mellon Universiy From he SelecedWorks of Jia Zhang Ocober, 203 Service Oriened Soluion Modeling and Variaion Propagaion Analysis based on Archiecural uilding locks Liang-Jie Zhang Jia Zhang Available

More information

Let s get physical - EDA Tools for Mobility

Let s get physical - EDA Tools for Mobility Le s ge physical - EDA Tools for Mobiliy Aging and Reliabiliy Communicaion Mobile and Green Mobiliy - Smar and Safe Frank Oppenheimer OFFIS Insiue for Informaion Technology OFFIS a a glance Applicaion-oriened

More information

Network management and QoS provisioning - QoS in Frame Relay. . packet switching with virtual circuit service (virtual circuits are bidirectional);

Network management and QoS provisioning - QoS in Frame Relay. . packet switching with virtual circuit service (virtual circuits are bidirectional); QoS in Frame Relay Frame relay characerisics are:. packe swiching wih virual circui service (virual circuis are bidirecional);. labels are called DLCI (Daa Link Connecion Idenifier);. for connecion is

More information

Querying Moving Objects in SECONDO

Querying Moving Objects in SECONDO Querying Moving Objecs in SECONDO Vicor Teixeira de Almeida, Ralf Harmu Güing, and Thomas Behr LG Daenbanksyseme für neue Anwendungen Fachbereich Informaik, Fernuniversiä Hagen D-58084 Hagen, Germany {vicor.almeida,

More information

Quick Verification of Concurrent Programs by Iteratively Relaxed Scheduling

Quick Verification of Concurrent Programs by Iteratively Relaxed Scheduling Quick Verificaion of Concurren Programs by Ieraively Relaxed Scheduling Parick Mezler, Habib Saissi, Péer Bokor, Neeraj Suri Technische Univerisä Darmsad, Germany {mezler, saissi, pbokor, suri}@deeds.informaik.u-darmsad.de

More information

An Adaptive Spatial Depth Filter for 3D Rendering IP

An Adaptive Spatial Depth Filter for 3D Rendering IP JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.3, NO. 4, DECEMBER, 23 175 An Adapive Spaial Deph Filer for 3D Rendering IP Chang-Hyo Yu and Lee-Sup Kim Absrac In his paper, we presen a new mehod

More information

IDEF3 Process Description Capture Method

IDEF3 Process Description Capture Method IDEF3 Process Descripion Capure Mehod IDEF3 is par of he IDEF family of mehods developmen funded by he US Air Force o provide modelling suppor for sysems engineering and enerprise inegraion 2 IDEF3 Mehod

More information

NRMI: Natural and Efficient Middleware

NRMI: Natural and Efficient Middleware NRMI: Naural and Efficien Middleware Eli Tilevich and Yannis Smaragdakis Cener for Experimenal Research in Compuer Sysems (CERCS), College of Compuing, Georgia Tech {ilevich, yannis}@cc.gaech.edu Absrac

More information

Weighted Voting in 3D Random Forest Segmentation

Weighted Voting in 3D Random Forest Segmentation Weighed Voing in 3D Random Fores Segmenaion M. Yaqub,, P. Mahon 3, M. K. Javaid, C. Cooper, J. A. Noble NDORMS, Universiy of Oxford, IBME, Deparmen of Engineering Science, Universiy of Oxford, 3 MRC Epidemiology

More information

Scheduling. Scheduling. EDA421/DIT171 - Parallel and Distributed Real-Time Systems, Chalmers/GU, 2011/2012 Lecture #4 Updated March 16, 2012

Scheduling. Scheduling. EDA421/DIT171 - Parallel and Distributed Real-Time Systems, Chalmers/GU, 2011/2012 Lecture #4 Updated March 16, 2012 EDA421/DIT171 - Parallel and Disribued Real-Time Sysems, Chalmers/GU, 2011/2012 Lecure #4 Updaed March 16, 2012 Aemps o mee applicaion consrains should be done in a proacive way hrough scheduling. Schedule

More information

CAMERA CALIBRATION BY REGISTRATION STEREO RECONSTRUCTION TO 3D MODEL

CAMERA CALIBRATION BY REGISTRATION STEREO RECONSTRUCTION TO 3D MODEL CAMERA CALIBRATION BY REGISTRATION STEREO RECONSTRUCTION TO 3D MODEL Klečka Jan Docoral Degree Programme (1), FEEC BUT E-mail: xkleck01@sud.feec.vubr.cz Supervised by: Horák Karel E-mail: horak@feec.vubr.cz

More information

User Adjustable Process Scheduling Mechanism for a Multiprocessor Embedded System

User Adjustable Process Scheduling Mechanism for a Multiprocessor Embedded System Proceedings of he 6h WSEAS Inernaional Conference on Applied Compuer Science, Tenerife, Canary Islands, Spain, December 16-18, 2006 346 User Adjusable Process Scheduling Mechanism for a Muliprocessor Embedded

More information

FUZZY HUMAN/MACHINE RELIABILITY USING VHDL

FUZZY HUMAN/MACHINE RELIABILITY USING VHDL FUZZY HUMN/MCHINE RELIBILITY USING VHDL Carlos. Graciós M. 1, lejandro Díaz S. 2, Efrén Gorroiea H. 3 (1) Insiuo Tecnológico de Puebla v. Tecnológico 420. Col. Maravillas, C. P. 72220, Puebla, Pue. México

More information

Michiel Helder and Marielle C.T.A Geurts. Hoofdkantoor PTT Post / Dutch Postal Services Headquarters

Michiel Helder and Marielle C.T.A Geurts. Hoofdkantoor PTT Post / Dutch Postal Services Headquarters SHORT TERM PREDICTIONS A MONITORING SYSTEM by Michiel Helder and Marielle C.T.A Geurs Hoofdkanoor PTT Pos / Duch Posal Services Headquarers Keywords macro ime series shor erm predicions ARIMA-models faciliy

More information

Rule-Based Multi-Query Optimization

Rule-Based Multi-Query Optimization Rule-Based Muli-Query Opimizaion Mingsheng Hong Dep. of Compuer cience Cornell Universiy mshong@cs.cornell.edu Johannes Gehrke Dep. of Compuer cience Cornell Universiy johannes@cs.cornell.edu Mirek Riedewald

More information

source managemen, naming, proecion, and service provisions. This paper concenraes on he basic processor scheduling aspecs of resource managemen. 2 The

source managemen, naming, proecion, and service provisions. This paper concenraes on he basic processor scheduling aspecs of resource managemen. 2 The Virual Compuers A New Paradigm for Disribued Operaing Sysems Banu Ozden y Aaron J. Goldberg Avi Silberschaz z 600 Mounain Ave. AT&T Bell Laboraories Murray Hill, NJ 07974 Absrac The virual compuers (VC)

More information

AML710 CAD LECTURE 11 SPACE CURVES. Space Curves Intrinsic properties Synthetic curves

AML710 CAD LECTURE 11 SPACE CURVES. Space Curves Intrinsic properties Synthetic curves AML7 CAD LECTURE Space Curves Inrinsic properies Synheic curves A curve which may pass hrough any region of hreedimensional space, as conrased o a plane curve which mus lie on a single plane. Space curves

More information

Visually Summarizing the Web using Internal Images and Keyphrases

Visually Summarizing the Web using Internal Images and Keyphrases Visually Summarizing he Web using Inernal Images and Keyphrases M.V.Gedam, S. A. Taale Deparmen of compuer engineering, PUNE Universiy Vidya Praishhan s College of Engg., India Absrac Visual summarizaion

More information

Web System for the Remote Control and Execution of an IEC Application

Web System for the Remote Control and Execution of an IEC Application Web Sysem for he Remoe Conrol and Execuion of an IEC 61499 Applicaion Oana ROHAT, Dan POPESCU Faculy of Auomaion and Compuer Science, Poliehnica Universiy, Splaiul Independenței 313, Bucureși, 060042,

More information

Why not experiment with the system itself? Ways to study a system System. Application areas. Different kinds of systems

Why not experiment with the system itself? Ways to study a system System. Application areas. Different kinds of systems Simulaion Wha is simulaion? Simple synonym: imiaion We are ineresed in sudying a Insead of experimening wih he iself we experimen wih a model of he Experimen wih he Acual Ways o sudy a Sysem Experimen

More information

Evaluation and Improvement of Region-based Motion Segmentation

Evaluation and Improvement of Region-based Motion Segmentation Evaluaion and Improvemen of Region-based Moion Segmenaion Mark Ross Universiy Koblenz-Landau, Insiue of Compuaional Visualisics, Universiässraße 1, 56070 Koblenz, Germany Email: ross@uni-koblenz.de Absrac

More information

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. XX, NO. XX, XX XXXX 1

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. XX, NO. XX, XX XXXX 1 This is he auhor's version of an aricle ha has been published in his journal. Changes were made o his version by he publisher prior o publicaion. IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. XX,

More information

BI-TEMPORAL INDEXING

BI-TEMPORAL INDEXING BI-TEMPORAL INDEXING Mirella M. Moro Uniersidade Federal do Rio Grande do Sul Poro Alegre, RS, Brazil hp://www.inf.ufrgs.br/~mirella/ Vassilis J. Tsoras Uniersiy of California, Rierside Rierside, CA 92521,

More information

PART 1 REFERENCE INFORMATION CONTROL DATA 6400 SYSTEMS CENTRAL PROCESSOR MONITOR

PART 1 REFERENCE INFORMATION CONTROL DATA 6400 SYSTEMS CENTRAL PROCESSOR MONITOR . ~ PART 1 c 0 \,).,,.,, REFERENCE NFORMATON CONTROL DATA 6400 SYSTEMS CENTRAL PROCESSOR MONTOR n CONTROL DATA 6400 Compuer Sysems, sysem funcions are normally handled by he Monior locaed in a Peripheral

More information

MATH Differential Equations September 15, 2008 Project 1, Fall 2008 Due: September 24, 2008

MATH Differential Equations September 15, 2008 Project 1, Fall 2008 Due: September 24, 2008 MATH 5 - Differenial Equaions Sepember 15, 8 Projec 1, Fall 8 Due: Sepember 4, 8 Lab 1.3 - Logisics Populaion Models wih Harvesing For his projec we consider lab 1.3 of Differenial Equaions pages 146 o

More information

MIC2569. Features. General Description. Applications. Typical Application. CableCARD Power Switch

MIC2569. Features. General Description. Applications. Typical Application. CableCARD Power Switch CableCARD Power Swich General Descripion is designed o supply power o OpenCable sysems and CableCARD hoss. These CableCARDs are also known as Poin of Disribuion (POD) cards. suppors boh Single and Muliple

More information

Win, Yuzana; Masada, Tomonari. Citation. Issue Date Right. this work in other works.

Win, Yuzana; Masada, Tomonari. Citation. Issue Date Right. this work in other works. NAOSITE: Nagasaki Universiy's Ac Tile Exploring Technical Phrase Frames f Auhor(s) Ciaion Win, Yuzana; Masada, Tomonari 05 IEEE 9h Inernaional Confer Neworking and Applicaions Worksho Issue Dae 05-04-7

More information

Accenture Report Documentum 4i on NetApp filers Deployment Guide

Accenture Report Documentum 4i on NetApp filers Deployment Guide Accenure Repor Documenum 4i on NeApp filers Deploymen Guide Documen Purpose This projec was conduced by Accenure o research & develop a Nework Appliance filer and Documenum 4i Deploymen Guide. This documen

More information

Detection Tracking and Recognition of Human Poses for a Real Time Spatial Game

Detection Tracking and Recognition of Human Poses for a Real Time Spatial Game Deecion Tracking and Recogniion of Human Poses for a Real Time Spaial Game Feifei Huo, Emile A. Hendriks, A.H.J. Oomes Delf Universiy of Technology The Neherlands f.huo@udelf.nl Pascal van Beek, Remco

More information

Selective Offloading in Mobile Edge Computing for the Green Internet of Things

Selective Offloading in Mobile Edge Computing for the Green Internet of Things EDGE COMPUTING FOR THE INTERNET OF THINGS Selecive Offloading in Mobile Edge Compuing for he Green Inerne of Things Xinchen Lyu, Hui Tian, Li Jiang, Alexey Vinel, Sabia Maharjan, Sein Gjessing, and Yan

More information

The University of Sheffield Department of Computer Science. Indexing XML Databases: Classifications, Problems Identification and a New Approach

The University of Sheffield Department of Computer Science. Indexing XML Databases: Classifications, Problems Identification and a New Approach The Universiy of Sheffield Deparmen of Compuer Science Indexing XML Daabases: Classificaions, Problems Idenificaion and a New Approach Research Memorandum CS-7-5 Mohammed Al-Badawi Compuer Science Dep

More information

Open Access Research on an Improved Medical Image Enhancement Algorithm Based on P-M Model. Luo Aijing 1 and Yin Jin 2,* u = div( c u ) u

Open Access Research on an Improved Medical Image Enhancement Algorithm Based on P-M Model. Luo Aijing 1 and Yin Jin 2,* u = div( c u ) u Send Orders for Reprins o reprins@benhamscience.ae The Open Biomedical Engineering Journal, 5, 9, 9-3 9 Open Access Research on an Improved Medical Image Enhancemen Algorihm Based on P-M Model Luo Aijing

More information

Less Pessimistic Worst-Case Delay Analysis for Packet-Switched Networks

Less Pessimistic Worst-Case Delay Analysis for Packet-Switched Networks Less Pessimisic Wors-Case Delay Analysis for Packe-Swiched Neworks Maias Wecksén Cenre for Research on Embedded Sysems P O Box 823 SE-31 18 Halmsad maias.wecksen@hh.se Magnus Jonsson Cenre for Research

More information

COSC 3213: Computer Networks I Chapter 6 Handout # 7

COSC 3213: Computer Networks I Chapter 6 Handout # 7 COSC 3213: Compuer Neworks I Chaper 6 Handou # 7 Insrucor: Dr. Marvin Mandelbaum Deparmen of Compuer Science York Universiy F05 Secion A Medium Access Conrol (MAC) Topics: 1. Muliple Access Communicaions:

More information

Nonparametric CUSUM Charts for Process Variability

Nonparametric CUSUM Charts for Process Variability Journal of Academia and Indusrial Research (JAIR) Volume 3, Issue June 4 53 REEARCH ARTICLE IN: 78-53 Nonparameric CUUM Chars for Process Variabiliy D.M. Zombade and V.B. Ghue * Dep. of aisics, Walchand

More information

The Beer Dock: Three and a Half Implementations of the Beer Distribution Game

The Beer Dock: Three and a Half Implementations of the Beer Distribution Game The Beer Dock 2002-08-13 17:55:44-0700 The Beer Dock: Three and a Half Implemenaions of he Beer Disribuion Game Michael J. Norh[1] and Charles M. Macal Argonne Naional Laboraory, Argonne, Illinois Absrac

More information

Quantitative macro models feature an infinite number of periods A more realistic (?) view of time

Quantitative macro models feature an infinite number of periods A more realistic (?) view of time INFINIE-HORIZON CONSUMPION-SAVINGS MODEL SEPEMBER, Inroducion BASICS Quaniaive macro models feaure an infinie number of periods A more realisic (?) view of ime Infinie number of periods A meaphor for many

More information

A High-Speed Adaptive Multi-Module Structured Light Scanner

A High-Speed Adaptive Multi-Module Structured Light Scanner A High-Speed Adapive Muli-Module Srucured Ligh Scanner Andreas Griesser 1 Luc Van Gool 1,2 1 Swiss Fed.Ins.of Techn.(ETH) 2 Kaholieke Univ. Leuven D-ITET/Compuer Vision Lab ESAT/VISICS Zürich, Swizerland

More information

A Tool for Multi-Hour ATM Network Design considering Mixed Peer-to-Peer and Client-Server based Services

A Tool for Multi-Hour ATM Network Design considering Mixed Peer-to-Peer and Client-Server based Services A Tool for Muli-Hour ATM Nework Design considering Mied Peer-o-Peer and Clien-Server based Services Conac Auhor Name: Luis Cardoso Company / Organizaion: Porugal Telecom Inovação Complee Mailing Address:

More information

Lecture 18: Mix net Voting Systems

Lecture 18: Mix net Voting Systems 6.897: Advanced Topics in Crypography Apr 9, 2004 Lecure 18: Mix ne Voing Sysems Scribed by: Yael Tauman Kalai 1 Inroducion In he previous lecure, we defined he noion of an elecronic voing sysem, and specified

More information

I. INTRODUCTION. Keywords -- Web Server, Perceived User Latency, HTTP, Local Measuring. interchangeably.

I. INTRODUCTION. Keywords -- Web Server, Perceived User Latency, HTTP, Local Measuring. interchangeably. Evaluaing Web User Perceived Laency Using Server Side Measuremens Marik Marshak 1 and Hanoch Levy School of Compuer Science Tel Aviv Universiy, Tel-Aviv, Israel mmarshak@emc.com, hanoch@pos.au.ac.il 1

More information

(10) Patent No.: US 6,931,558 Bl (57) ABSTRACT ~ :!j 304 ; OS. BMR. & TSM files needed at restore time. Boot ~II backed-up ~ 106

(10) Patent No.: US 6,931,558 Bl (57) ABSTRACT ~ :!j 304 ; OS. BMR. & TSM files needed at restore time. Boot ~II backed-up ~ 106 111111 1111111111111111111111111111111111111111111111111111111111111 US006931558Bl (12) Unied Saes Paen Jeffe e ai. (10) Paen No.: US 6,931,558 Bl (45) Dae of Paen: Aug. 16,2005 (54) COMPUTER RESTORATION

More information

Achieving Security Assurance with Assertion-based Application Construction

Achieving Security Assurance with Assertion-based Application Construction Achieving Securiy Assurance wih Asserion-based Applicaion Consrucion Carlos E. Rubio-Medrano and Gail-Joon Ahn Ira A. Fulon Schools of Engineering Arizona Sae Universiy Tempe, Arizona, USA, 85282 {crubiome,

More information

Video-Based Face Recognition Using Probabilistic Appearance Manifolds

Video-Based Face Recognition Using Probabilistic Appearance Manifolds Video-Based Face Recogniion Using Probabilisic Appearance Manifolds Kuang-Chih Lee Jeffrey Ho Ming-Hsuan Yang David Kriegman klee10@uiuc.edu jho@cs.ucsd.edu myang@honda-ri.com kriegman@cs.ucsd.edu Compuer

More information

Vulnerability Evaluation of Multimedia Subsystem Based on Complex Network

Vulnerability Evaluation of Multimedia Subsystem Based on Complex Network JOURAL OF MULTIMDIA, VOL. 8, O. 4, AUGUST 23 439 Vulnerabiliy valuaion of Mulimedia Subsysem Based on Complex ewor Xiaoling Tang Insiue of Higher ducaion Research, Jilin Business and Technology College,

More information

Improving the Efficiency of Dynamic Service Provisioning in Transport Networks with Scheduled Services

Improving the Efficiency of Dynamic Service Provisioning in Transport Networks with Scheduled Services Improving he Efficiency of Dynamic Service Provisioning in Transpor Neworks wih Scheduled Services Ralf Hülsermann, Monika Jäger and Andreas Gladisch Technologiezenrum, T-Sysems, Goslarer Ufer 35, D-1585

More information

Design Alternatives for a Thin Lens Spatial Integrator Array

Design Alternatives for a Thin Lens Spatial Integrator Array Egyp. J. Solids, Vol. (7), No. (), (004) 75 Design Alernaives for a Thin Lens Spaial Inegraor Array Hala Kamal *, Daniel V azquez and Javier Alda and E. Bernabeu Opics Deparmen. Universiy Compluense of

More information

4 Error Control. 4.1 Issues with Reliable Protocols

4 Error Control. 4.1 Issues with Reliable Protocols 4 Error Conrol Jus abou all communicaion sysems aemp o ensure ha he daa ges o he oher end of he link wihou errors. Since i s impossible o build an error-free physical layer (alhough some shor links can

More information

A NEW APPROACH FOR 3D MODELS TRANSMISSION

A NEW APPROACH FOR 3D MODELS TRANSMISSION A NEW APPROACH FOR 3D MODELS TRANSMISSION A. Guarnieri a, F. Piroi a, M. Ponin a, A. Veore a a CIRGEO, Inerdep. Research Cener of Carography, Phoogrammery, Remoe Sensing and GIS Universiy of Padova, Agripolis

More information

Window Query and Analysis on Massive Spatio-Temporal Data

Window Query and Analysis on Massive Spatio-Temporal Data Available online a www.sciencedirec.com ScienceDirec IERI Procedia 10 (2014 ) 138 143 2014 Inernaional Conference on Fuure Informaion Engineering Window Query and Analysis on Massive Spaio-Temporal Daa

More information

Robot localization under perceptual aliasing conditions based on laser reflectivity using particle filter

Robot localization under perceptual aliasing conditions based on laser reflectivity using particle filter Robo localizaion under percepual aliasing condiions based on laser refleciviy using paricle filer DongXiang Zhang, Ryo Kurazume, Yumi Iwashia, Tsuomu Hasegawa Absrac Global localizaion, which deermines

More information

Optimal Crane Scheduling

Optimal Crane Scheduling Opimal Crane Scheduling Samid Hoda, John Hooker Laife Genc Kaya, Ben Peerson Carnegie Mellon Universiy Iiro Harjunkoski ABB Corporae Research EWO - 13 November 2007 1/16 Problem Track-mouned cranes move

More information

BEST DYNAMICS NAMICS CRM A COMPILATION OF TECH-TIPS TO HELP YOUR BUSINESS SUCCEED WITH DYNAMICS CRM

BEST DYNAMICS NAMICS CRM A COMPILATION OF TECH-TIPS TO HELP YOUR BUSINESS SUCCEED WITH DYNAMICS CRM DYNAMICS CR A Publicaion by elogic s fines Microsof Dynamics CRM Expers { ICS CRM BEST OF 2014 A COMPILATION OF TECH-TIPS TO HELP YOUR BUSINESS SUCCEED WITH DYNAMICS CRM NAMICS CRM { DYNAMICS M INTRODUCTION

More information

Chapter 3 MEDIA ACCESS CONTROL

Chapter 3 MEDIA ACCESS CONTROL Chaper 3 MEDIA ACCESS CONTROL Overview Moivaion SDMA, FDMA, TDMA Aloha Adapive Aloha Backoff proocols Reservaion schemes Polling Disribued Compuing Group Mobile Compuing Summer 2003 Disribued Compuing

More information

A Bayesian Approach to Video Object Segmentation via Merging 3D Watershed Volumes

A Bayesian Approach to Video Object Segmentation via Merging 3D Watershed Volumes A Bayesian Approach o Video Objec Segmenaion via Merging 3D Waershed Volumes Yu-Pao Tsai 1,3, Chih-Chuan Lai 1,2, Yi-Ping Hung 1,2, and Zen-Chung Shih 3 1 Insiue of Informaion Science, Academia Sinica,

More information

4. Minimax and planning problems

4. Minimax and planning problems CS/ECE/ISyE 524 Inroducion o Opimizaion Spring 2017 18 4. Minima and planning problems ˆ Opimizing piecewise linear funcions ˆ Minima problems ˆ Eample: Chebyshev cener ˆ Muli-period planning problems

More information

Motion Level-of-Detail: A Simplification Method on Crowd Scene

Motion Level-of-Detail: A Simplification Method on Crowd Scene Moion Level-of-Deail: A Simplificaion Mehod on Crowd Scene Absrac Junghyun Ahn VR lab, EECS, KAIST ChocChoggi@vr.kais.ac.kr hp://vr.kais.ac.kr/~zhaoyue Recen echnological improvemen in characer animaion

More information

A Formalization of Ray Casting Optimization Techniques

A Formalization of Ray Casting Optimization Techniques A Formalizaion of Ray Casing Opimizaion Techniques J. Revelles, C. Ureña Dp. Lenguajes y Sisemas Informáicos, E.T.S.I. Informáica, Universiy of Granada, Spain e-mail: [jrevelle,almagro]@ugr.es URL: hp://giig.ugr.es

More information

Provided by the author(s) and NUI Galway in accordance with publisher policies. Please cite the published version when available.

Provided by the author(s) and NUI Galway in accordance with publisher policies. Please cite the published version when available. Provided by he auhor(s) and NUI Galway in accordance wih publisher policies. Please cie he published version when available. Tile Conneciviy soluions o link a blueooh camera o he inerne Auhor(s) Ionas,

More information

An Experimental QoS Manager Implementation

An Experimental QoS Manager Implementation An Experimenal QoS Manager Implemenaion Drago Žagar, Goran Marinović, Slavko Rupčić Faculy of Elecrical Engineering Universiy of Osijek Kneza Trpimira 2B, Osijek Croaia drago.zagar@efos.hr Absrac-- Qualiy

More information

FIELD PROGRAMMABLE GATE ARRAY (FPGA) AS A NEW APPROACH TO IMPLEMENT THE CHAOTIC GENERATORS

FIELD PROGRAMMABLE GATE ARRAY (FPGA) AS A NEW APPROACH TO IMPLEMENT THE CHAOTIC GENERATORS FIELD PROGRAMMABLE GATE ARRAY (FPGA) AS A NEW APPROACH TO IMPLEMENT THE CHAOTIC GENERATORS Mohammed A. Aseeri and M. I. Sobhy Deparmen of Elecronics, The Universiy of Ken a Canerbury Canerbury, Ken, CT2

More information

Probabilistic Detection and Tracking of Motion Discontinuities

Probabilistic Detection and Tracking of Motion Discontinuities Probabilisic Deecion and Tracking of Moion Disconinuiies Michael J. Black David J. Flee Xerox Palo Alo Research Cener 3333 Coyoe Hill Road Palo Alo, CA 94304 fblack,fleeg@parc.xerox.com hp://www.parc.xerox.com/fblack,fleeg/

More information

SEINA: A Stealthy and Effective Internal Attack in Hadoop Systems

SEINA: A Stealthy and Effective Internal Attack in Hadoop Systems SEINA: A Sealhy and Effecive Inernal Aack in Hadoop Sysems Jiayin Wang, Teng Wang, Zhengyu Yang, Ying ao, Ningfang i, and Bo Sheng Deparmen of Compuer Science, Universiy of assachuses Boson, 1 orrissey

More information

Landmarks: A New Model for Similarity-Based Pattern Querying in Time Series Databases

Landmarks: A New Model for Similarity-Based Pattern Querying in Time Series Databases Lmarks: A New Model for Similariy-Based Paern Querying in Time Series Daabases Chang-Shing Perng Haixun Wang Sylvia R. Zhang D. So Parker perng@cs.ucla.edu hxwang@cs.ucla.edu Sylvia Zhang@cle.com so@cs.ucla.edu

More information

Automatic Calculation of Coverage Profiles for Coverage-based Testing

Automatic Calculation of Coverage Profiles for Coverage-based Testing Auomaic Calculaion of Coverage Profiles for Coverage-based Tesing Raimund Kirner 1 and Waler Haas 1 Vienna Universiy of Technology, Insiue of Compuer Engineering, Vienna, Ausria, raimund@vmars.uwien.ac.a

More information

Dynamic Route Planning and Obstacle Avoidance Model for Unmanned Aerial Vehicles

Dynamic Route Planning and Obstacle Avoidance Model for Unmanned Aerial Vehicles Volume 116 No. 24 2017, 315-329 ISSN: 1311-8080 (prined version); ISSN: 1314-3395 (on-line version) url: hp://www.ijpam.eu ijpam.eu Dynamic Roue Planning and Obsacle Avoidance Model for Unmanned Aerial

More information

Parallel and Distributed Systems for Constructive Neural Network Learning*

Parallel and Distributed Systems for Constructive Neural Network Learning* Parallel and Disribued Sysems for Consrucive Neural Nework Learning* J. Flecher Z. Obradovi School of Elecrical Engineering and Compuer Science Washingon Sae Universiy Pullman WA 99164-2752 Absrac A consrucive

More information

On the Impact of Concurrency for the Enforcement of Entailment Constraints in Process-driven SOAs

On the Impact of Concurrency for the Enforcement of Entailment Constraints in Process-driven SOAs On he Impac of Concurrency for he Enforcemen of Enailmen Consrains in Process-driven OAs Thomas Quirchmayr and Mark rembeck Insiue for Informaion ysems, New Media Lab, WU Vienna, Ausria {firsname.lasname}@wu.ac.a

More information

A time-varying subjective quality model for mobile streaming videos with stalling events

A time-varying subjective quality model for mobile streaming videos with stalling events A ime-varying subjecive qualiy model for mobile sreaming videos wih salling evens Deepi Ghadiyaram, Janice Pan, and Alan C Bovik The Universiy of Texas a Ausin ABSTRACT Over-he-op mobile video sreaming

More information

Trust-based Service Management of Mobile Devices in Ad Hoc Networks

Trust-based Service Management of Mobile Devices in Ad Hoc Networks Trus-based Service Managemen of Mobile Devices in Ad Hoc Neworks Yaing Wang, Ing-Ray Chen Virginia Tech Deparmen of Compuer Science Falls Church, VA, USA Email: {yaingw, irchen}@v.edu Jin-Hee Cho U.S.

More information

EVALUATING ACCURACY OF A TIME ESTIMATOR IN A PROJECT

EVALUATING ACCURACY OF A TIME ESTIMATOR IN A PROJECT EVALUATING ACCURACY OF A TIME ESTIMATOR IN A PROJECT Thanh-Lam Nguyen, Graduae Insiue of Mechanical and Precision Engineering Wei-Ju Hung, Deparmen of Indusrial Engineering and Managemen Ming-Hung Shu,

More information

Computer representations of piecewise

Computer representations of piecewise Edior: Gabriel Taubin Inroducion o Geomeric Processing hrough Opimizaion Gabriel Taubin Brown Universiy Compuer represenaions o piecewise smooh suraces have become vial echnologies in areas ranging rom

More information

Outline. EECS Components and Design Techniques for Digital Systems. Lec 06 Using FSMs Review: Typical Controller: state

Outline. EECS Components and Design Techniques for Digital Systems. Lec 06 Using FSMs Review: Typical Controller: state Ouline EECS 5 - Componens and Design Techniques for Digial Sysems Lec 6 Using FSMs 9-3-7 Review FSMs Mapping o FPGAs Typical uses of FSMs Synchronous Seq. Circuis safe composiion Timing FSMs in verilog

More information

Mobile Robots Mapping

Mobile Robots Mapping Mobile Robos Mapping 1 Roboics is Easy conrol behavior percepion modelling domain model environmen model informaion exracion raw daa planning ask cogniion reasoning pah planning navigaion pah execuion

More information

Difficulty-aware Hybrid Search in Peer-to-Peer Networks

Difficulty-aware Hybrid Search in Peer-to-Peer Networks Difficuly-aware Hybrid Search in Peer-o-Peer Neworks Hanhua Chen, Hai Jin, Yunhao Liu, Lionel M. Ni School of Compuer Science and Technology Huazhong Univ. of Science and Technology {chenhanhua, hjin}@hus.edu.cn

More information

Low-Cost WLAN based. Dr. Christian Hoene. Computer Science Department, University of Tübingen, Germany

Low-Cost WLAN based. Dr. Christian Hoene. Computer Science Department, University of Tübingen, Germany Low-Cos WLAN based Time-of-fligh fligh Trilaeraion Precision Indoor Personnel Locaion and Tracking for Emergency Responders Third Annual Technology Workshop, Augus 5, 2008 Worceser Polyechnic Insiue, Worceser,

More information

Adding Time to an Object-Oriented Versions Model

Adding Time to an Object-Oriented Versions Model Insiuo de Informáica Universidade Federal do Rio Grande do Sul Poro Alegre - RS - BRAZIL Adding Time o an Objec-Oriened Versions Model Mirella Moura Moro Nina Edelweiss Silvia Maria Saggiorao Clesio Saraiva

More information

EECS 487: Interactive Computer Graphics

EECS 487: Interactive Computer Graphics EECS 487: Ineracive Compuer Graphics Lecure 7: B-splines curves Raional Bézier and NURBS Cubic Splines A represenaion of cubic spline consiss of: four conrol poins (why four?) hese are compleely user specified

More information

An Efficient Delivery Scheme for Coded Caching

An Efficient Delivery Scheme for Coded Caching 201 27h Inernaional Teleraffic Congress An Efficien Delivery Scheme for Coded Caching Abinesh Ramakrishnan, Cedric Wesphal and Ahina Markopoulou Deparmen of Elecrical Engineering and Compuer Science, Universiy

More information