Open-Source Speech Recognition for Hand-held and Embedded Devices

Slide 1: PocketSphinx: Open-Source Speech Recognition for Hand-held and Embedded Devices
  David Huggins-Daines, Mohit Kumar, Arthur Chan, Alan W Black (awb@cs.cmu.edu), Mosur Ravishankar (rkm@cs.cmu.edu), Alexander I. Rudnicky (air@cs.cmu.edu)
  Language Technologies Institute, Carnegie Mellon University
  05/18/06

Slide 2: What is PocketSphinx?
  Based on Sphinx-II
    Open source code under MIT-style license
    Widely used at CMU and elsewhere
    Mature and stable API
  Design goals
    Statistical Language Model support
    Finite-State Grammars also available
    Medium-Large Vocabulary (1-10k words)
    Make it go faster

Slide 3: Why do we need it?
  Typical desktop/workstation of
    bit memory bus (6-10GB/sec)
    1.8-3GHz processor (5000 MIPS)
    ATA, SATA, or SCSI storage ( MB/sec)
  Typical PDA/SOC/smartphone of
    or 32-bit memory bus ( MB/sec)
    MHz processor ( MIPS)
    SD/MMC or CF storage (1-16MB/sec)
    no FPU or vector unit (sometimes a DSP...)

Slide 4: ASR bottlenecks
  Wait, you say: My cell phone is pretty darn fast!
    At least as fast as that DEC we had a real-time 20k system on back in 1996!
  However: ASR is system bandwidth limited
    Sphinx benchmarks (shown to the right) favor large caches and high memory bandwidth (Intel)
    Search, LM, and dictionary lookup are highly memory-intensive
    We will have to deal with them
  (Source: techreport.com)

Slide 5: Scaling: Hand-held vs Desktop
  [Plot: speed (xRT) against number of words in vocabulary, for hand-held and desktop]

Slide 6: How to make it go faster
  Low-hanging fruit
    Front-end optimizations (fixed-point, logarithm)
    Speeding up GMM computation
    Old-fashioned beam tuning
  Non-speech-related work
    Memory optimization (+ model compression)
    Machine-level optimization (assembly code)
  What's left?
    Search optimization
      dynamic beam tuning
    Language model compression and optimization

Slide 7: Front-End Optimizations
  Fixed-point calculations
    32-bit, or format
    Using 64-bit multiply (SMULL) on ARM, multiply-accumulate on DSP
    MFCC calculated in log domain, using a lookup of log2 with conversion to log
  Audio downsampling
    Allows smaller order FFT and MFCC
    Not as useful for large-vocabulary systems
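The fixed-point multiply and the table-based logarithm on this slide can be sketched as follows. This is only an illustration under stated assumptions, not the PocketSphinx code: the Q15 format, the 64-entry table, the reading of "log" as the natural logarithm, and every function name here are assumptions, since the format named on the slide did not survive transcription.

```python
import math

Q = 15          # assumed number of fractional bits (slide's format is not recoverable)
TABLE_BITS = 6  # assumed table resolution: 64 entries


def to_fixed(x, q=Q):
    """Convert a float to a Q-format integer."""
    return int(round(x * (1 << q)))


def to_float(a, q=Q):
    """Convert a Q-format integer back to a float."""
    return a / (1 << q)


def wrap32(v):
    """Reduce an integer to a signed 32-bit value, as a 32-bit register would."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v


def fixed_mul(a, b, q=Q):
    """Multiply two Q-format values through a wide intermediate product.

    On ARM the wide product is what SMULL provides; Python integers are
    unbounded, so only the final result is wrapped to 32 bits.
    """
    wide = a * b
    return wrap32(wide >> q)


# log2(1 + i/64) for the bits just below the leading one, stored in Q format.
LOG2_TABLE = [to_fixed(math.log2(1 + i / (1 << TABLE_BITS)))
              for i in range(1 << TABLE_BITS)]
LN2 = to_fixed(math.log(2))


def fixed_ln(x):
    """Approximate ln(x) for a positive integer x using the log2 table."""
    if x <= 0:
        raise ValueError("x must be positive")
    msb = x.bit_length() - 1                     # integer part of log2(x)
    shift = msb - TABLE_BITS
    top = x >> shift if shift >= 0 else x << -shift
    frac = top & ((1 << TABLE_BITS) - 1)         # table index
    log2_q = (msb << Q) + LOG2_TABLE[frac]
    return fixed_mul(log2_q, LN2)                # convert log2 to natural log
```

The table truncates the mantissa to six bits, so the result is coarse by design; a larger table or interpolation would trade memory for accuracy.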

Slide 8: GMM Optimizations
  Top-N based Gaussian selection (Mosur 96)
    Use previous frame's top codewords to select current frame
    standard Sphinx-II technique
  Partial frame-based downsampling (Woszczyna 98)
    Only update top-N every Mth frame
    Can significantly affect accuracy
  kd-tree based Gaussian selection (Fritsch 96)
    Approximate nearest neighbor search in k dimensions using stable partition trees
    10% speedup, little or no effect on accuracy
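One plausible reading of top-N selection seeded by the previous frame is sketched below. It is a simplified illustration, not the Sphinx-II implementation: it uses plain squared Euclidean distance in place of a Gaussian log-likelihood, and the names `sq_dist` and `top_n` are hypothetical. The previous frame's winners are scored first so that they set a tight threshold, and the remaining codewords are abandoned as soon as their partial distance exceeds it.

```python
def sq_dist(x, mean, limit=float("inf")):
    """Squared Euclidean distance, abandoned (None) once it exceeds limit."""
    total = 0.0
    for xi, mi in zip(x, mean):
        total += (xi - mi) ** 2
        if total > limit:
            return None
    return total


def top_n(frame, means, n, prev_top=()):
    """Return the indices of the n closest codewords, closest first.

    prev_top holds the winning indices from the previous frame; scoring
    them first gives an early threshold for pruning the other codewords.
    """
    best = []      # sorted list of (distance, index), at most n long
    seen = set()
    for idx in prev_top:
        if idx not in seen:
            seen.add(idx)
            best.append((sq_dist(frame, means[idx]), idx))
    best.sort()
    best = best[:n]

    for idx, mean in enumerate(means):
        if idx in seen:
            continue
        limit = best[-1][0] if len(best) == n else float("inf")
        d = sq_dist(frame, mean, limit)
        if d is None:
            continue               # cannot beat the current n-th best
        best.append((d, idx))
        best.sort()
        best = best[:n]
    return [idx for _, idx in best]
```

Because consecutive frames tend to be similar, the seeded threshold is usually close to the final one, so most codewords are rejected after only a few dimensions. The frame-downsampling variant on the slide would instead skip the second loop on most frames and rescore only `prev_top`.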

Slide 9: Search Optimizations
  Absolute pruning
    Approximations in the front end and GMM increase the effective beam width, paradoxically decreasing performance
    We would like to enforce a hard limit on the number of states or word exits evaluated per frame - how?
  Histogram pruning (Ney 1996)
    Partition the beam width into bins
    Dynamically recompute beam based on bin occupancy counts
    30% speedup with 10% relative degradation in WER
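Histogram pruning as outlined above might look like the following sketch. It assumes scores are log-probabilities (higher is better) and that `beam` is a positive width below the best score; the function names, the bin count, and the choice to always keep the first bin are assumptions made for illustration rather than details taken from the slides.

```python
def histogram_threshold(scores, beam, max_active, n_bins=16):
    """Return a score threshold; hypotheses with score > threshold survive.

    The beam below the best score is split into n_bins equal bins. Bins are
    accumulated from the best score downward, and the beam is cut at the
    first bin boundary where the running count would exceed max_active.
    """
    best = max(scores)
    bin_width = beam / n_bins
    counts = [0] * n_bins
    for s in scores:
        offset = best - s
        if offset < beam:
            # min() guards against rounding pushing the index to n_bins.
            counts[min(n_bins - 1, int(offset / bin_width))] += 1

    kept = 0
    for b, count in enumerate(counts):
        if kept + count > max_active:
            # Always keep the first bin: the histogram cannot resolve finer,
            # so the limit can be exceeded if that bin alone is overfull.
            return best - max(b, 1) * bin_width
        kept += count
    return best - beam


def prune(scores, beam, max_active, n_bins=16):
    """Return the scores that survive histogram pruning."""
    threshold = histogram_threshold(scores, beam, max_active, n_bins)
    return [s for s in scores if s > threshold]
```

The appeal of the histogram is cost: one counting pass over the active hypotheses replaces a sort, at the price of enforcing the limit only to bin resolution.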

Slide 10: Memory Optimizations
  Read-only model files
    mmap(2)able, shareable between processes
    leverage OS-level caching (virtual memory)
  Precompiled (binary) LM
    Inherited from Sphinx-II
    Adapted for memory-mapping vocabulary in <32M of RAM
  Read-only binary model definition file
    Pre-built radix tree of triphones->senones
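The idea of a read-only, memory-mapped model file can be illustrated with a short sketch. The file layout used here (a little-endian count followed by 32-bit integers) and the names `write_table` and `MappedTable` are invented for the example; they are not the PocketSphinx model or LM formats.

```python
import mmap
import struct


def write_table(path, values):
    """Write a toy binary table: a uint32 count, then int32 values."""
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(values)))
        f.write(struct.pack(f"<{len(values)}i", *values))


class MappedTable:
    """Read-only view of the toy table, backed by the OS page cache."""

    def __init__(self, path):
        self._file = open(path, "rb")
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping
        # read-only, so its pages can be shared between processes.
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        (self.count,) = struct.unpack_from("<I", self._map, 0)

    def __getitem__(self, i):
        if not 0 <= i < self.count:
            raise IndexError(i)
        # Only the page holding this entry needs to be resident.
        (value,) = struct.unpack_from("<i", self._map, 4 + 4 * i)
        return value

    def close(self):
        self._map.close()
        self._file.close()
```

Nothing is parsed or copied at load time: entries are decoded on access, and pages that are never touched are never read from storage, which matters when storage bandwidth is as low as the hand-held figures quoted earlier.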

Slide 11: Performance
  Task          Vocabulary   Perplexity   xreal-time   Word Error
  TIDIGITS                                             %
  RM                                                   %
  WSJ devel5k                                          %
  Test platform: iPAQ MHz StrongARM running Linux (FPU emulation in kernel)
  Also running on:
    Other embedded Linux platforms
    Analog Devices Blackfin, uClinux
    WinCE using GNU toolchain (untested)

Slide 12: How to get it
  Web Site:
  Compiles with GCC for i386, ARM, PowerPC, and Blackfin
  Cross-compiles using an arm-wince-pe toolchain (available in various Linux distributions) for Windows CE
  Compatible with Sphinx2 fbs.h interface
  Good (fast) acoustic models forthcoming

Slide 13: Future work
  Improve accuracy
  Remove Sphinx-II codebook limitations
  Optimize the language model and dictionary
  Statistical profiling of LM access patterns
  Investigate dynamic search strategies
  Remove various legacy code
  Fast speaker and channel adaptation

Slide 14: Thank you
  Any questions?
  This work was supported by DARPA grant NB CH D
  The content of the information in this publication does not necessarily reflect the position or the policy of the US Government, and no official endorsement should be inferred.


More information

Vector Architectures Vs. Superscalar and VLIW for Embedded Media Benchmarks

Vector Architectures Vs. Superscalar and VLIW for Embedded Media Benchmarks Vector Architectures Vs. Superscalar and VLIW for Embedded Media Benchmarks Christos Kozyrakis Stanford University David Patterson U.C. Berkeley http://csl.stanford.edu/~christos Motivation Ideal processor

More information

Dynamic Time Warping

Dynamic Time Warping Centre for Vision Speech & Signal Processing University of Surrey, Guildford GU2 7XH. Dynamic Time Warping Dr Philip Jackson Acoustic features Distance measures Pattern matching Distortion penalties DTW

More information

Independent DSP Benchmarks: Methodologies and Results. Outline

Independent DSP Benchmarks: Methodologies and Results. Outline Independent DSP Benchmarks: Methodologies and Results Berkeley Design Technology, Inc. 2107 Dwight Way, Second Floor Berkeley, California U.S.A. +1 (510) 665-1600 info@bdti.com http:// Copyright 1 Outline

More information

V. Mass Storage Systems

V. Mass Storage Systems TDIU25: Operating Systems V. Mass Storage Systems SGG9: chapter 12 o Mass storage: Hard disks, structure, scheduling, RAID Copyright Notice: The lecture notes are mainly based on modifications of the slides

More information

DSP using Labview FPGA. T.J.Moir AUT University School of Engineering Auckland New-Zealand

DSP using Labview FPGA. T.J.Moir AUT University School of Engineering Auckland New-Zealand DSP using Labview FPGA T.J.Moir AUT University School of Engineering Auckland New-Zealand Limitations of a basic processor Despite all of the advancements we ve made in the world of processors, they still

More information

Assembly Language for x86 Processors 7 th Edition. Chapter 2: x86 Processor Architecture

Assembly Language for x86 Processors 7 th Edition. Chapter 2: x86 Processor Architecture Assembly Language for x86 Processors 7 th Edition Kip Irvine Chapter 2: x86 Processor Architecture Slides prepared by the author Revision date: 1/15/2014 (c) Pearson Education, 2015. All rights reserved.

More information

Single Chip Heterogeneous Multiprocessor Design

Single Chip Heterogeneous Multiprocessor Design Single Chip Heterogeneous Multiprocessor Design JoAnn M. Paul July 7, 2004 Department of Electrical and Computer Engineering Carnegie Mellon University Pittsburgh, PA 15213 The Cell Phone, Circa 2010 Cell

More information

Scheduling FFT Computation on SMP and Multicore Systems Ayaz Ali, Lennart Johnsson & Jaspal Subhlok

Scheduling FFT Computation on SMP and Multicore Systems Ayaz Ali, Lennart Johnsson & Jaspal Subhlok Scheduling FFT Computation on SMP and Multicore Systems Ayaz Ali, Lennart Johnsson & Jaspal Subhlok Texas Learning and Computation Center Department of Computer Science University of Houston Outline Motivation

More information

Rapid: A Configurable Architecture for Compute-Intensive Applications

Rapid: A Configurable Architecture for Compute-Intensive Applications Rapid: Configurable rchitecture for Compute-Intensive pplications Carl Ebeling Dept. of Computer Science and Engineering niversity of Washington lternatives for High-Performance Systems SIC se application-specific

More information

Map3D V58 - Multi-Processor Version

Map3D V58 - Multi-Processor Version Map3D V58 - Multi-Processor Version Announcing the multi-processor version of Map3D. How fast would you like to go? 2x, 4x, 6x? - it's now up to you. In order to achieve these performance gains it is necessary

More information

BENCHMARKING LIBEVENT AGAINST LIBEV

BENCHMARKING LIBEVENT AGAINST LIBEV BENCHMARKING LIBEVENT AGAINST LIBEV Top 2011-01-11, Version 6 This document briefly describes the results of running the libevent benchmark program against both libevent and libev. Libevent Overview Libevent

More information

Julius rev LEE Akinobu, and Julius Development Team 2007/12/19. 1 Introduction 2

Julius rev LEE Akinobu, and Julius Development Team 2007/12/19. 1 Introduction 2 Julius rev. 4.0 L Akinobu, and Julius Development Team 2007/12/19 Contents 1 Introduction 2 2 Framework of Julius-4 2 2.1 System architecture........................... 2 2.2 How it runs...............................

More information

Achieve Fastest System Startup Sequences.

Achieve Fastest System Startup Sequences. Achieve Fastest System Startup Sequences. How to tune an Embedded System. Embedded Systems Design Conference ARM vs. x86 July 3, 2014 Kei Thomsen MicroSys Electronics GmbH Agenda Target: reduce startup

More information

Computer Performance. Relative Performance. Ways to measure Performance. Computer Architecture ELEC /1/17. Dr. Hayden Kwok-Hay So

Computer Performance. Relative Performance. Ways to measure Performance. Computer Architecture ELEC /1/17. Dr. Hayden Kwok-Hay So Computer Architecture ELEC344 Computer Performance How do you measure performance of a computer? 2 nd Semester, 208-9 Dr. Hayden Kwok-Hay So How do you make a computer fast? Department of Electrical and

More information

SDA: Software-Defined Accelerator for Large- Scale DNN Systems

SDA: Software-Defined Accelerator for Large- Scale DNN Systems SDA: Software-Defined Accelerator for Large- Scale DNN Systems Jian Ouyang, 1 Shiding Lin, 1 Wei Qi, Yong Wang, Bo Yu, Song Jiang, 2 1 Baidu, Inc. 2 Wayne State University Introduction of Baidu A dominant

More information

ARM Processor. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

ARM Processor. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University ARM Processor Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu CPU Architecture CPU & Memory address Memory data CPU 200 ADD r5,r1,r3 PC ICE3028:

More information

Separating Reality from Hype in Processors' DSP Performance. Evaluating DSP Performance

Separating Reality from Hype in Processors' DSP Performance. Evaluating DSP Performance Separating Reality from Hype in Processors' DSP Performance Berkeley Design Technology, Inc. +1 (51) 665-16 info@bdti.com Copyright 21 Berkeley Design Technology, Inc. 1 Evaluating DSP Performance! Essential

More information

Computers and Microprocessors. Lecture 34 PHYS3360/AEP3630

Computers and Microprocessors. Lecture 34 PHYS3360/AEP3630 Computers and Microprocessors Lecture 34 PHYS3360/AEP3630 1 Contents Computer architecture / experiment control Microprocessor organization Basic computer components Memory modes for x86 series of microprocessors

More information

Introducing the Superscalar Version 5 ColdFire Core

Introducing the Superscalar Version 5 ColdFire Core Introducing the Superscalar Version 5 ColdFire Core Microprocessor Forum October 16, 2002 Joe Circello Chief ColdFire Architect Motorola Semiconductor Products Sector Joe Circello, Chief ColdFire Architect

More information

Defining Performance. Performance 1. Which airplane has the best performance? Computer Organization II Ribbens & McQuain.

Defining Performance. Performance 1. Which airplane has the best performance? Computer Organization II Ribbens & McQuain. Defining Performance Performance 1 Which airplane has the best performance? Boeing 777 Boeing 777 Boeing 747 BAC/Sud Concorde Douglas DC-8-50 Boeing 747 BAC/Sud Concorde Douglas DC- 8-50 0 100 200 300

More information