Automatic Subtitle Generation for Sound in Videos
ISSN: X Impact factor: (Volume 4, Issue 2) Available online at:

Anshul Ganvir, Sanket Jagtap, Kunal Pal, Pranita Katole, Mayur Bhalavi
Datta Meghe Institute of Engineering Technology and

ABSTRACT

The last ten years have witnessed the emergence of video content of every kind. Moreover, the appearance of websites dedicated to this phenomenon has increased the importance the public gives to it. At the same time, some people are deaf and cannot understand the meaning of such videos when no text transcription is available. It is therefore necessary to find solutions that make these media artifacts accessible to most people. Several software packages offer utilities to create subtitles for videos, but all of them require extensive participation from the user. Hence, a more automated concept is envisaged. This report describes a way to generate standards-compliant subtitles using speech recognition. Three parts are distinguished. The first separates the audio from the video and converts the audio to a suitable format if necessary. The second performs recognition of the speech contained in the audio. The final stage generates a subtitle file from the recognition results of the previous step. Directions for implementation are proposed for the three modules. The experimental results were not satisfactory, and adjustments are needed in further work. Decoding parallelization, the use of well-trained models, and punctuation insertion are some of the improvements to be made.

Keywords: Audio Extraction, Java Media Framework, Speech Recognition, Acoustic Model, Subtitle Generation, FFMPEG.

1. INTRODUCTION

This application describes a way to generate standards-compliant subtitles using speech recognition. The system should take a video file as input and generate a subtitle file as output.
Consequently, the study of automatic subtitle generation appears to be a valid subject of research. Nowadays, much software exists that deals with subtitle generation. Some programs operate on copyrighted DVDs by extracting the original subtitle track and converting it to a format recognized by media players, for example ImTOO DVD Subtitle Ripper and Xilisoft DVD Subtitle Ripper. Others allow the user to watch the video and insert subtitles using the video timeline, e.g. Subtitle Editor and Subtitle Workshop. There are also subtitle editors that provide facilities to handle subtitle formats and ease changes, for instance Jubler and Gaupol. Nonetheless, software that generates subtitles using speech recognition, without the intervention of an individual, has not been developed. Therefore, it seems necessary to start investigating this concept.

2018, All Rights Reserved Page 410

2. LITERATURE SURVEY

In a paper by Prof. Sanjib Das, the authors collect information about speech recognition techniques. Speech recognition, also known as Automatic Speech Recognition (ASR) or computer speech recognition, is the process of converting a speech signal to a sequence of words by means of an algorithm implemented as a computer program. It has the potential to be an important mode of interaction between humans and computers. Generally, machine recognition of spoken words is carried out by finding the sequence of words that best matches the given speech sample. The main goal of the speech recognition area is to develop techniques and systems for speech input to machines. Speech is the primary means of communication between humans. Motivations ranging from technological curiosity about the mechanical realization of human speech capabilities to the desire to automate simple tasks make human-machine interaction necessary. Research in ASR has attracted a great deal of attention for about sixty years, and ASR today finds widespread application in tasks that require a human-machine interface, such as automatic call processing. India is a linguistically rich area with 18 constitutional languages written in 10 different scripts. Hence there is a special need to develop ASR systems for different native languages. [1] Sultana, S.; Akhand, M. A. H.; Das, P. K.; and Hafizur Rahman, M. M. investigate Speech-to-Text (STT) conversion using SAPI for the Bangla language. They report that an experimental study was carried out on a newspaper article and that the recognition rate was approximately 78% on average. Although the achieved performance is promising for STT-related studies, they identify several elements that could be improved to give better accuracy, and they expect the approach to be helpful for Speech-to-Text conversion in other languages and for similar tasks. [2] Moulines, E., in the paper "Text-to-speech algorithms based on FFT synthesis," presents FFT synthesis algorithms for a French text-to-speech system based on diphone concatenation. FFT synthesis techniques are capable of producing high-quality prosodic modifications of natural speech. Several approaches are presented to reduce the distortions due to diphone concatenation.
[3] Martinez, M.; Quilis, A.; and Bernstein, J. describe research aiming to develop a text-to-speech converter (TSC) for Spanish that accepts a continuous source of alphanumeric characters (up to 250 words per minute) and produces good-quality, natural Spanish output. Four sets of problems are considered in this work: the hardware structure adopted for real-time operation; the complex control software needed to handle the orthographic input and linguistic programs; the linguistic processing rules; and the parameterization of the Spanish language matched to a TSC. Emphasis is placed on the problems of adapting a general hardware structure to a specific language. [4] By Boris Guenebaut. After this research and literature survey, we set out to design a system that generates subtitles for sound in videos. Having reviewed these papers, we arrived at a system in which subtitles are generated automatically from the video, so that they need not be downloaded from a third-party website. [5]

3. OBJECTIVES

The main objective is to generate subtitles automatically, without human intervention. A second objective is to generate the subtitles using an FFMPEG library. A third is to produce output that is properly time-synchronized and displays accurate subtitles. Finally, we aim to make a media player that is more convenient to use.

4. PROBLEM STATEMENT

In most videos, sound holds an important place. It appears essential to make sound videos understandable for people with auditory problems, and the most natural way to do so is through subtitles. At present, we have to download subtitles ourselves and attach them to the video. However, manual subtitle creation is a long and tedious activity that requires the presence of the user.

5. PROJECT DESCRIPTION

Fig (a): Architecture. Flow: Start, Media File, Audio Extraction, Audio File, Speech Recognition, Time Synchronization, Subtitle Generation, Subtitle File, End.

The figure shows the breakdown structure of the AutoSubGen experimental system. A media file (either video or directly audio) is given as input. The audio track is extracted and then read chunk by chunk until the end of the track is reached. Within this loop, three tasks take place in succession: speech recognition, time synchronization, and subtitle generation. Finally, a subtitle file is returned as output.
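The chunk-by-chunk reading of the audio track described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the audio has already been extracted to an uncompressed PCM WAV file, and the function name `iter_audio_chunks` and the one-second default chunk length are hypothetical choices. Each chunk carries its start and end time so that a later stage can synchronize recognized text with the video.

```python
import wave
from typing import Iterator, Tuple


def iter_audio_chunks(path: str, chunk_seconds: float = 1.0) -> Iterator[Tuple[float, float, bytes]]:
    """Yield (start_time, end_time, raw_frames) for successive chunks of a PCM WAV file.

    Hypothetical helper: the name and the default chunk length are
    illustrative assumptions, not taken from the paper.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        bytes_per_frame = wav.getsampwidth() * wav.getnchannels()
        frames_per_chunk = max(1, int(rate * chunk_seconds))
        position = 0  # number of frames consumed so far
        while True:
            frames = wav.readframes(frames_per_chunk)
            if not frames:
                break  # end of the track reached
            start = position / rate
            position += len(frames) // bytes_per_frame
            yield start, position / rate, frames
```

A recognizer would consume the yielded byte chunks in order, and the accompanying times give the offsets needed for time synchronization.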
A. FFMPEG

The FFMPEG libraries handle most of our multimedia tasks quickly and easily, for example audio compression, audio/video format conversion, and extraction of images from a video. They can be used by developers for transcoding, streaming, and playing, and they form a very stable framework for transcoding video and audio. ffmpeg is a command-line tool that converts audio or video formats. It can also capture and encode in real time from various hardware and software sources such as a TV capture card. ffplay is a simple media player utilizing SDL and the FFmpeg libraries. ffprobe is a command-line tool to display media information (text, CSV, XML, JSON); see also Mediainfo. FFmpeg is used by software such as VLC media player, xine, Plex, Kodi, Blender, YouTube, and MPC-HC; it handles video and audio playback in Google Chrome and the Linux version of Firefox. Graphical user interface front-ends for FFmpeg have been developed, including Avanti, XMedia Recode, and Multimedia Xpert. JavaCV, a Java wrapper for OpenCV, includes a supplementary Java wrapper for FFmpeg. FFmpeg is used by ffdshow, LAV Filters, the GStreamer FFmpeg plug-in, Perian, and OpenMAX IL to expand the encoding and decoding capabilities of their respective multimedia platforms.

B. Audio Extraction

The audio extraction routine is expected to return audio in a suitable format that the speech recognition module can use as input. It must handle a defined list of video and audio formats. It has to verify the file given as input so that it can evaluate whether extraction is feasible. The audio track has to be returned in the most reliable format.

C. Speech Recognition

The speech recognition routine is the key part of the system, since it directly affects performance and the evaluation results. First, it must obtain the type (film, music, information, home-made, etc.) of the input file as often as possible. Then, if the type is provided, an appropriate processing method is chosen.
Otherwise, the routine uses a default configuration.

D. Subtitle Generation

The subtitle generation routine creates a file and writes to it multiple chunks of text corresponding to utterances delimited by silences, together with their respective start and end times. Time synchronization considerations are of primary importance.

6. IMPLEMENTATION METHODOLOGY

A. Audio Extraction

Fig (b): Audio Extraction. Flow: FFMPEG, Input Video, FFMPEG Process in PowerShell, Output Audio File.

The activity diagram for audio extraction describes the successive steps the audio extraction module takes to obtain an audio file from a media file given as input. However, we face some limitations. In particular, our system cannot insert punctuation, since that involves much more speech analysis and a deeper design. The task was to figure out how to convert the output audio file into a format recognized by FFMPEG. Although we followed guidelines for doing so in Java, we did not obtain the expected result.
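As an illustration of the audio extraction step, the sketch below builds and runs an ffmpeg command from Python. It is not the authors' PowerShell process. The target format (mono, 16 kHz, 16-bit PCM WAV) is an assumption chosen because many speech recognizers accept it; the paper does not state which format it uses. The function names are hypothetical, while the ffmpeg options shown (`-y`, `-i`, `-vn`, `-ac`, `-ar`, `-acodec`) are standard command-line options.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_extract_command(video_path: str, audio_path: str, sample_rate: int = 16000) -> List[str]:
    """Return an ffmpeg argument list that extracts the audio track as PCM WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it already exists
        "-i", video_path,         # input media file
        "-vn",                    # discard the video stream
        "-ac", "1",               # downmix to mono (assumed recognizer requirement)
        "-ar", str(sample_rate),  # resample (16 kHz is an assumption)
        "-acodec", "pcm_s16le",   # 16-bit little-endian PCM
        audio_path,
    ]


def extract_audio(video_path: str, audio_path: Optional[str] = None) -> str:
    """Verify the input file, run ffmpeg, and return the path of the WAV file."""
    if not Path(video_path).is_file():
        raise FileNotFoundError(video_path)
    if audio_path is None:
        audio_path = str(Path(video_path).with_suffix(".wav"))
    # check=True raises CalledProcessError if ffmpeg cannot decode the input
    subprocess.run(build_extract_command(video_path, audio_path), check=True)
    return audio_path
```

Separating command construction from execution keeps the feasibility check (does the input exist, did ffmpeg succeed) in one place and makes the command itself easy to inspect without running ffmpeg.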
B. Subtitle Generation

Fig (c): Subtitle Generation. Diagram labels: Current Directory; Audio File Path; Subtitle Generation; Bubble Timeout; Initial Silence Timeout; End Silence Timeout; String Builder; If (rec text == null): True, Break; False.

The activity diagram for subtitle generation shows the principal steps of the subtitle generation module. First, it receives a list of (utterance, speech time) pairs. Then, it traverses the list to the end. In each iteration, the current utterance is checked. If it is a real utterance, we check whether the current line is empty. If so, the subtitle number is incremented and the start time of the current line is set to the utterance's speech time. Then, the utterance is added to the current line. If it is a SIL (silence) utterance, we check whether the current line is empty: if not, the end time of the current line is set to the SIL speech time. If the line is empty, we ignore the SIL utterance. Once the list has been traversed, the file is finalized and released to the user.

C. Speech Recognition

Fig (d): Speech Recognition. Diagram labels: Prompt for input audio file and Parameters; Check Audio File; Filter Input Media Category; Valid Format; Wrong Format; Select Suitable Model; Throw Exception; Adjust SR Config; Allocate Retained Components; Show Helper; Launch Decode Process; Store Result for Later Usage.

The activity diagram for speech recognition shows the successive steps executed during the speech recognition process. An audio file and some parameters are given as arguments to the module. First, the audio file is checked: if its format is valid, the process continues; otherwise, an exception is thrown and the execution ends. According to the category (potentially amateur, movie, news, series, music) given as an argument, related acoustic and language models are selected. Some adjustments are made in the FFMPEG configuration based on the set parameters. Then, all components used in the ASR process are allocated the required resources. Finally, the decoding phase takes place and results are periodically saved to be reused later.

7. CONCLUSION

Using this application, a subtitle file can be generated for any English video. This software minimizes the effort of downloading or manually writing a subtitle file. It supports all the MPEG standards. The video and subtitles are synchronized. The user can extract audio in any MPEG standard format.

8. REFERENCES

[1] Santosh K. Gaikwad, Bharti W. Gawali, Pravin Yannawar, "A Review on Speech Recognition Technique," International Journal of Computer Applications ( ), Volume 10, No. 3, November
[2] Penagarikano, M.; Bordel, G., "Speech-to-text translation by a non-word lexical unit based system," Signal Processing and Its Applications, ISSPA '99. Proceedings of the Fifth International Symposium on, vol. 1, pp. 111-114, 1999
[3] Olabe, J. C.; Santos, A.; Martinez, R.; Munoz, E.; Martinez, M.; Quilis, A.; Bernstein, J., "Real time text-to-speech conversion system for Spanish," Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '84, vol. 9, pp. 85-87, Mar
[4] Kavala, R. et al., "A Dynamic Time Warp Integrated Circuit for a 1000-Word Recognition System," IEEE Journal of Solid-State Circuits, vol. SC-22, no. 1, February 1987, pp. 3-14
[5] F.; Moulines, E., "Text-to-speech algorithms based on FFT synthesis," Acoustics, Speech, and Signal Processing, ICASSP-88, 1988 International Conference on, vol. 1, pp. 667-670, Apr
[6] Willie Walker, Paul Lamere, Philip Kwok, Bhiksha Raj, Rita Singh, Evandro Gouvea, Peter Wolf, and Joe Woelfel. Sphinx-4: A flexible open source framework for speech recognition. In SMLI TR SUN MICROSYSTEMS INC.,
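The list traversal described in Section 6.B can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the output is written in the SubRip (.srt) layout, which the paper does not name explicitly; the silence marker `SIL`, the function names, and the handling of a trailing line that is never closed by a silence are all hypothetical.

```python
from typing import Iterable, List, Tuple

SIL = "<sil>"  # hypothetical marker for a silence utterance


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm (SubRip style)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def _block(number: int, start: float, end: float, words: List[str]) -> str:
    return f"{number}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{' '.join(words)}\n"


def build_srt(pairs: Iterable[Tuple[str, float]]) -> str:
    """Turn (utterance, speech_time) pairs into subtitle text."""
    blocks: List[str] = []
    number = 0             # subtitle counter
    line: List[str] = []   # words of the subtitle line being built
    start = end = 0.0
    for text, time in pairs:
        if text == SIL:
            if line:       # a silence closes a non-empty line
                blocks.append(_block(number, start, time, line))
                line = []
            # a silence while the line is empty is ignored
        else:
            if not line:   # first word of a new line
                number += 1
                start = time
            line.append(text)
            end = time
    if line:
        # Assumption: a line not followed by a silence ends at its last word.
        blocks.append(_block(number, start, end, line))
    return "\n".join(blocks)
```

For example, `build_srt([("hello", 0.5), ("world", 0.9), (SIL, 1.4)])` yields a single subtitle numbered 1 that runs from 00:00:00,500 to 00:00:01,400 with the text "hello world".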
More informationTA Document Test specification of self-test for AV Devices 1.0 (Point-to-Point Test and Network Test)
TA Document 2003012 Test specification of self-test for AV Devices 1.0 (Point-to-Point Test and Network Test) February 4, 2003 Sponsored by: 1394 Trade Association Accepted for Release by: 1394 Trade Association
More informationEVENT VERIFICATION THROUGH VOICE PROCESS USING ANDROID. Kodela Divya* 1, J.Pratibha 2
ISSN 2277-2685 IJESR/May 2015/ Vol-5/Issue-5/179-183 Kodela Divya et. al./ International Journal of Engineering & Science Research EVENT VERIFICATION THROUGH VOICE PROCESS USING ANDROID ABSTRACT Kodela
More informationA Short Introduction to Audio Fingerprinting with a Focus on Shazam
A Short Introduction to Audio Fingerprinting with a Focus on Shazam MUS-17 Simon Froitzheim July 5, 2017 Introduction Audio fingerprinting is the process of encoding a (potentially) unlabeled piece of
More informationAn Efficient Link Bundling Transport Layer Protocol for Achieving Higher Data Rate and Availability
An Efficient Link Bundling Transport Layer Protocol for Achieving Higher Data Rate and Availability Journal: IET Communications Manuscript ID: COM-2011-0376 Manuscript Type: Research Paper Date Submitted
More informationHuman Computer Interaction Using Speech Recognition Technology
International Bulletin of Mathematical Research Volume 2, Issue 1, March 2015 Pages 231-235, ISSN: 2394-7802 Human Computer Interaction Using Recognition Technology Madhu Joshi 1 and Saurabh Ranjan Srivastava
More informationANDROID APPLICATION USED IN DIAGNOSTIC SYSTEM OF ENGINEERING DEVICES AND MACHINES
Journal of KONES Powertrain and Transport, Vol. 25, No. 3 2018 ANDROID APPLICATION USED IN DIAGNOSTIC SYSTEM OF ENGINEERING DEVICES AND MACHINES Artur Gawlik, Damian Brewczyński Cracow University of Technology
More informationSubTech 1. Short intro on different subtitle standards ISOBMFF, MPEG-DASH, DVB-DASH, DASH-IF, CMAF, HLS
SubTech 1 24 Mai 2018, IRT, Symposium on Subtitling Technology Short intro on different subtitle standards ISOBMFF, MPEG-DASH, DVB-DASH, DASH-IF, CMAF, HLS 24 Mai 2018, IRT, Symposium on Subtitling Technology
More informationISSN (PRINT): , (ONLINE): , VOLUME-4, ISSUE-11,
NATURAL LANGUAGE PROCESSING BASED HOME AUTOMATION SYSTEM USING SMART PHONE AND AURDINO MICROCONTROLLER BOARD Burgoji Santhosh Kumar Assistant Professor, Dept Of Ece, Anurag Group Of Institutions, Hyderabad,
More informationA Development Of A Web-Based Application System Of QR Code Location Generator and Scanner named QR- Location
UTM Computing Proceedings Innovations in Computing Technology and Applications Volume 2 Year: 2017 ISBN: 978-967-0194-95-0 1 A Development Of A Web-Based Application System Of QR Code Location Generator
More informationSpoken Document Retrieval (SDR) for Broadcast News in Indian Languages
Spoken Document Retrieval (SDR) for Broadcast News in Indian Languages Chirag Shah Dept. of CSE IIT Madras Chennai - 600036 Tamilnadu, India. chirag@speech.iitm.ernet.in A. Nayeemulla Khan Dept. of CSE
More informationLetterScroll: Text Entry Using a Wheel for Visually Impaired Users
LetterScroll: Text Entry Using a Wheel for Visually Impaired Users Hussain Tinwala Dept. of Computer Science and Engineering, York University 4700 Keele Street Toronto, ON, CANADA M3J 1P3 hussain@cse.yorku.ca
More informationSystem Modeling and Implementation of MPEG-4. Encoder under Fine-Granular-Scalability Framework
System Modeling and Implementation of MPEG-4 Encoder under Fine-Granular-Scalability Framework Final Report Embedded Software Systems Prof. B. L. Evans by Wei Li and Zhenxun Xiao May 8, 2002 Abstract Stream
More information3 September, 2015 Nippon Telegraph and Telephone Corporation NTT Advanced Technology Corporation
3 September, 2015 Nippon Telegraph and Telephone Corporation NTT Advanced Technology Corporation NTT Develops World s First H.265/HEVC Software-Encoding Engine Supporting 60P/120P Simultaneous Transmission
More information1.1 Technical Evaluation Guidelines and Checklist:
1.1 Technical Guidelines and Checklist: This checklist is derived from the LRMDS Technical Specification. Refer to Section 10.2. Instructions: 1. Digital resources may be: a) digital and accessible online
More informationThe innovating Windows Mobile -based Telematic Solution for the car
The innovating Windows Mobile -based Telematic Solution for the car CONTENTS OVERVIEW... 3 The hands-free kit... 3 Message reader... 5 Media player... 6 Road safety... 7 DISPLAY AND BUTTONS ON THE STEERING
More informationINTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO
INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO ISO/IEC JTC1/SC29/WG11 N15071 February 2015, Geneva,
More informationWhat is this Song About?: Identification of Keywords in Bollywood Lyrics
What is this Song About?: Identification of Keywords in Bollywood Lyrics by Drushti Apoorva G, Kritik Mathur, Priyansh Agrawal, Radhika Mamidi in 19th International Conference on Computational Linguistics
More informationIntelligent Hands Free Speech based SMS System on Android
Intelligent Hands Free Speech based SMS System on Android Gulbakshee Dharmale 1, Dr. Vilas Thakare 3, Dr. Dipti D. Patil 2 1,3 Computer Science Dept., SGB Amravati University, Amravati, INDIA. 2 Computer
More informationSection 508 Annual Report
Section 508 Annual Report HHS Requestor: NIH/NHLBI Date: May 10, 2013 Item(s) Name: Web-based Tool Kit Version: Initial Design Vendor: ArchieMD, Inc. Vendor Contact: Robert J. Levine Section 1194.21 Software
More informationA New Technique for Segmentation of Handwritten Numerical Strings of Bangla Language
I.J. Information Technology and Computer Science, 2013, 05, 38-43 Published Online April 2013 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijitcs.2013.05.05 A New Technique for Segmentation of Handwritten
More informationBluray (
Bluray (http://www.blu-ray.com/faq) MPEG-2 - enhanced for HD, also used for playback of DVDs and HDTV recordings MPEG-4 AVC - part of the MPEG-4 standard also known as H.264 (High Profile and Main Profile)
More informationINTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)
INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print), ISSN 0976 6367(Print) ISSN 0976 6375(Online)
More informationTRIBHUVAN UNIVERSITY Institute of Engineering Pulchowk Campus Department of Electronics and Computer Engineering
TRIBHUVAN UNIVERSITY Institute of Engineering Pulchowk Campus Department of Electronics and Computer Engineering A Final project Report ON Minor Project Java Media Player Submitted By Bisharjan Pokharel(061bct512)
More informationThe State of Streaming Video in Professional and Scholarly Communications. STM Digital Publishing, London - December 5, 2017
The State of Streaming Video in Professional and Scholarly Communications STM Digital Publishing, London - December 5, 2017 Streaming Video Survey Results Tracy Gardner Renew Publishing Consultants STM
More informationAn evaluation tool for Wireless Digital Audio applications
An evaluation tool for Wireless Digital Audio applications Nicolas-Alexander Tatlas 1, Andreas Floros 2, and John Mourjopoulos 3 1 Audiogroup, Electrical Engineering and Computer Technology Department,
More informationRTSP Based Video Surveillance System Using IP Camera for Human Detection in OpenCV
RTSP Based Video Surveillance System Using IP Camera for Human Detection in OpenCV K. Bapayya 1,K. Sujitha 2, Mr. SD. Akthar Basha 3 1 Asst. Professor, Department of ECE, CVR College of Engineering, Hyderabad-501510
More informationThis page contains all known bugs from drivers, codecs, Windows, etc. which can cause issues in MediaPortal.
Known Issues Table Of Content 1 Display Issues 1.1 Text with special (extended) characters does not render properly 1.2 MediaPortal GUI gets black after video playback stops 1.3 DVD Menus are black 1.4
More informationShrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India
International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 3 ISSN : 2456-3307 Some Issues in Application of NLP to Intelligent
More informationCodec Error Help Please Use Windows Media Player
Codec Error Help Please Use Windows Media Player Certain avi file won't play in Windows Media Player with error message says: Windows Media Player used to support all video codecs that compressed in AVI.
More informationModule 9 AUDIO CODING. Version 2 ECE IIT, Kharagpur
Module 9 AUDIO CODING Lesson 29 Transform and Filter banks Instructional Objectives At the end of this lesson, the students should be able to: 1. Define the three layers of MPEG-1 audio coding. 2. Define
More informationLesson 5: Multimedia on the Web
Lesson 5: Multimedia on the Web Lesson 5 Objectives Define objects and their relationships to multimedia Explain the fundamentals of C, C++, Java, JavaScript, JScript, C#, ActiveX and VBScript Discuss
More informationconvert MP4 m3u8 convert MP4 MP4 Convert MP4 MP4 MP4 M3U8 convert M3U8 MP4 mp4 MP4
M3u8 mp4 convert May 14, 2016. The m3u8 file extension is commonly used for m3u playlists in UTF-8. M3U8 Converter app can download m3u8 to mp4 in easy step just past. Jun 7, 2017. If you're looking to
More information