Real-time large-scale analysis of audiovisual data
- Briana Boone
- 5 years ago
Transcription
1 Finnish Center of Excellence in Computational Inference
Real-time large-scale analysis of audiovisual data
Department of Signal Processing and Acoustics, Aalto University School of Electrical Engineering
Thanks to: Jorma Laaksonen, Department of Computer Science, Aalto University School of Science
Thanks also to research groups at both departments
2 About Mikko
- Associate professor in speech and language processing at Aalto
- Background in machine learning algorithms and pattern recognition systems
- PhD 1997 at TKK on speech recognition training algorithms
- Research experience in several top speech and language groups:
  - Research centers: IDIAP (CH), SRI (USA), ICSI (USA)
  - Universities: Edinburgh, Cambridge, Colorado, Nagoya
- Head of the Aalto speech recognition research group; several national and European speech projects
- Research topics: speech recognition, language modeling, speaker adaptation, speech translation, information retrieval from audio and video data
3 Goals of today
1. Know why video data are so important today
2. Learn how large-scale video data are used
3. Learn about related research topics at Aalto
4. Learn how to study speech and video processing at Aalto
4 Most mobile data are video
Global mobile data traffic grew 69 percent in
Mobile video traffic exceeded 50 percent of total mobile data traffic for the first time in (
5 National audiovisual institute KAVI
Archives Finnish television and radio streams (
- 32 main channels full time, every day
- 100 other channels by samples
- Available for studio viewing for researchers and the public since 2009 (no mobile viewing)
6 Digital archives of Yle
- Television and radio broadcasts of the Finnish Broadcasting Company (Yle) archived since 1935
- Full digital archive available for Yle, selected parts also for the public: Elävä Arkisto, Areena
7 What do people watch?
- Every day people watch hundreds of millions of hours on YouTube.
- Over 100 hours of video are uploaded every minute.
- More than half of YouTube views come from mobile devices.
( ress/statistics.html)
8 How to use large-scale video data?
Give a few examples!
9 Research at COIN
- Speech recognition: Turn the speech in videos to text
- Content-based video retrieval: Analyse the visual content
10 Research at COIN
- Speech recognition: Turn the speech in videos to text
  - Index, summarize, search, browse, and play the video based on what was spoken
  - Add captions, translations, and links to support understanding
  - Recognize speakers and provide training data for improving speech recognition and speech synthesis systems
- Content-based video retrieval: Analyse the visual content
11 Research at COIN
- Speech recognition: Turn the speech in videos to text
  - Index, summarize, search, browse, and play the video based on what was spoken
  - Add closed captions, translations, and links to support understanding
  - Recognize speakers and provide data for improving speech recognition and speech synthesis systems
- Content-based video retrieval: Analyse the visual content
  - Segment the video into shots, find visual objects and concepts, describe the video by natural language sentences
  - Recognize people by faces etc.
  - Detect non-speech sounds: explosions, clapping hands, laughing etc.
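As an illustration of the "segment the video into shots" step, here is a minimal sketch of one common baseline: comparing intensity histograms of consecutive frames and declaring a shot boundary when they differ strongly. This is not necessarily the method used at COIN. The frame representation (a flat list of grayscale values), the function names, the bin count and the threshold are all assumptions made for this example.

```python
from typing import List, Optional, Sequence

# Assumption for this sketch: a frame is a flat sequence of grayscale values in 0..255.
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Return the normalized intensity histogram of one frame."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame) or 1
    return [count / total for count in counts]


def shot_boundaries(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Return indices i where frame i is assumed to start a new shot."""
    boundaries: List[int] = []
    previous: Optional[List[float]] = None
    for i, frame in enumerate(frames):
        current = histogram(frame)
        if previous is not None:
            # L1 distance between normalized histograms lies in [0, 2].
            distance = sum(abs(a - b) for a, b in zip(previous, current))
            if distance > threshold:
                boundaries.append(i)
        previous = current
    return boundaries


# Toy "video": three dark frames, two bright frames, one dark frame.
dark = [10] * 16
bright = [240] * 16
video = [dark, dark, dark, bright, bright, dark]
print(shot_boundaries(video))  # prints: [3, 5]
```

A fixed threshold like this reacts only to abrupt cuts; gradual transitions such as fades would need a different criterion.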
12 Real-time analysis
In speech recognition optimize between:
- Acoustic and language model complexity
- Search accuracy in decoding
In visual concept detection optimize between:
- Number of concepts detected
- Number and type of features extracted
- Time complexity of the classifier(s)
- Number of classifiers used in post fusion
- Number of detections made per second
- Obtainable accuracy
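The trade-off above can be sketched as a simple selection problem: among candidate system configurations, keep those that process one second of media in at most one second (a real-time factor of 1.0 or less), then pick the most accurate one. The `Config` class, its field names and all numbers below are invented purely for illustration; they are not measurements from the talk.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Config:
    name: str
    real_time_factor: float  # processing seconds per second of media (hypothetical value)
    accuracy: float          # accuracy on held-out data (hypothetical value)


def best_real_time_config(configs: List[Config], budget: float = 1.0) -> Optional[Config]:
    """Return the most accurate configuration whose real-time factor fits the budget."""
    feasible = [c for c in configs if c.real_time_factor <= budget]
    return max(feasible, key=lambda c: c.accuracy, default=None)


# Illustrative candidates only: larger models are assumed slower but more accurate.
candidates = [
    Config("small", real_time_factor=0.3, accuracy=0.80),
    Config("medium", real_time_factor=0.8, accuracy=0.88),
    Config("large", real_time_factor=2.5, accuracy=0.93),
]

choice = best_real_time_config(candidates)
print(choice.name if choice else "no configuration runs in real time")  # prints: medium
```

In practice the budget would be set below 1.0 to leave headroom for the other components running on the same stream.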
13 Video content annotation demo
- Character recognition for name tags
- Visual concept detection
- Face recognition
- Speaker recognition
- Speech recognition
14 Match voice and face when appearing together
15 Speaker spotting: who is moving her lips?
- Detect faces and identify the rhythm of moving lips, eye blinks and eyebrows
- Results from Jorma Laaksonen
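One simple way to think about speaker spotting is to correlate each detected face's lip movement with the loudness of the audio track and pick the face that follows the audio most closely. The sketch below does exactly that with a Pearson correlation. It is an assumption made for illustration, not the method behind the results shown. The per-frame lip-motion and audio-energy values are assumed to come from upstream face tracking and audio analysis that are not shown, and the names and numbers are hypothetical.

```python
from math import sqrt
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length signals (0.0 if either is constant)."""
    if len(x) != len(y) or not x:
        raise ValueError("signals must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def likely_speaker(lip_motion: Dict[str, Sequence[float]],
                   audio_energy: Sequence[float]) -> str:
    """Return the face whose lip motion correlates best with the audio energy."""
    return max(lip_motion, key=lambda face: pearson(lip_motion[face], audio_energy))


# Hypothetical per-frame measurements for two faces visible at the same time.
audio_energy = [0.1, 0.9, 0.8, 0.1, 0.7, 0.2]
lip_motion = {
    "face_a": [0.0, 0.8, 0.7, 0.1, 0.6, 0.1],  # moves in step with the audio
    "face_b": [0.3, 0.3, 0.3, 0.3, 0.3, 0.3],  # lips stay still
}
print(likely_speaker(lip_motion, audio_energy))  # prints: face_a
```

The slide also mentions eye blinks and eyebrows; those could be added as further per-face signals in the same way.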
16 Information for a second screen
- Use audiovisual analysis to provide additional information.
- Show it on another screen.
- Can be links to Wikipedia, maps, search results
17 Research at COIN
- Speech recognition: Turn the speech in videos to text
  - Index, summarize, search, browse, and play the video based on what was spoken
  - Add closed captions, translations, and links to support understanding
  - Recognize speakers and provide new data for improving and personalizing speech recognition and synthesis
- Content-based video retrieval: Analyse the visual content
  - Segment the video into shots, find visual objects and concepts, describe the video by natural language sentences
  - Recognize people by faces etc.
  - Detect non-speech sounds: explosions, clapping hands, laughing etc.
18 Personalization requires adaptation of the computational speech models to speaker, language, speaking style, and recording conditions.
Speech recognition:
- Dictation
- Translation: input
- Interfaces: input
- Retrieval of A/V content
Speech synthesis:
- Reading text aloud
- Translation: output
- Interfaces: output
- Storing your personal voice
19 How to study the topic at Aalto?
COURSES
- ELEC-E5500 Speech processing
- ELEC-E5510 Speech recognition
- ELEC-E5520 Speech and Language processing methods
- ELEC-E5530 Speech and Language processing seminar
- ELEC-E5550 Statistical natural language processing
- CS-E4850 Computer vision
- CS-E3210 Machine learning
MASTER'S PROGRAMME
- Computer, Communication and Information Sciences
MAJORS
- Signal, Speech and Language Processing
- Machine Learning and Data Mining (Macadamia)
20 More demos, results etc. Contact: ELEC SCI
More information