The Design and Experimental Study of a Kind of Speech Instruction Control System Prototype of Manned Spacecraft


Hao Zhai, Xiaolin Yang, Jianhua Yang
Lanzhou Institute of Physics, BOX 94, Lanzhou, P.R. China, 730000
Tel: 86-93-82672-536, Fax: 86-93-826539
E-mail: zhaihao8848@sina.com

ABSTRACT

Applying a speech instruction control system in the man-machine interface of a manned spacecraft can make the interface more intelligent and lighten the operating load of cosmonauts. Speech recognition integrated with speech synthesis constitutes a Speech Instruction Control System (SICS). This paper presents the design of a SICS prototype. A 32-bit embedded system is adopted as the hardware core, the middle layer is embedded Linux, and outside the middle layer is the application software shell. Development and experimental study of the SICS prototype have been carried out. A simplified Chinese SICS has been realized under a 100 Mbps fast switched Ethernet and WINDOWS 98 OS environment. The system applies a Dynamic Time Warping (DTW) algorithm with broadened endpoints, which implements rapid speaker-dependent speech recognition and adapts to small variations in speaking rate. The UDP protocol is adopted for real-time transmission of the recognition result codes of speech commands. The SICS prototype has also been realized on Red Hat Linux 7.0. The experiments show that the SICS prototype is feasible. The paper also points out several problems that must be solved before the SICS prototype can be applied in the man-machine interface of a manned spacecraft.

1 Introduction

Applying speech recognition in the man-machine interface of a manned spacecraft can make the interface more intelligent and lighten the operating load of spacemen. It is a useful complement to manual operation, especially in a zero-gravity environment.
The speech input channel acts like a third hand for the spaceman. In particular, it allows remote operation while the spaceman is walking outside the cabin (for example, during maintenance or rendezvous). In this paper, we propose a speech instruction control system (SICS) prototype for manned spacecraft and report the results of an experimental study on a simplified SICS in which rapid speech recognition is applied to realize speech instruction control.

2 Design of a Speech Instruction Control System Prototype of Manned Spacecraft

In the man-machine interface of a spacecraft, speech recognition integrated with speech synthesis constitutes a Speech Instruction Control System (SICS). On the one hand, the system recognizes the input speech command and performs the operation defined by that command, thereby implementing speech control operations. On the other hand, the SICS gives the spaceman an operation acknowledgement cue in synthesized speech according to the first recognition result. Furthermore, speech alarms for out-of-limit parameters and spoken event notifications can be issued if necessary.

The preliminary design of a SICS prototype is presented here. An embedded system is adopted as the hardware platform. In this platform, a 32-bit PowerPC MCU is the hardware core, a DSP and codec module is configured as the speech processing hardware, 8 Mbit of flash and 8 Mbit of RAM are added as storage, and a high-speed industrial Ethernet interface is used instead of MIL-STD-1553B for system integration. The hardware block diagram is shown in Fig. 1.

Above the hardware is the embedded Linux RTOS. As a real-time operating system, it is judged not only by the correctness of its results but also by the time in which those results are produced. Outside the middle layer is the application software shell. On the basis of the RTOS, the real-time demands of both speech recognition and speech synthesis are satisfied.

Information technologies are developing rapidly, and there is no reason not to try new methods and new devices in the traditional space industry, especially in device-level manufacturing. New embedded-system development technology and industrial devices are therefore adopted in this prototype, with the promise of high reliability and low power consumption. This offers new thinking for future manned spacecraft development.

Fig. 1 Hardware Block Diagram of SICS Prototype (blocks: High Speed Industrial Ethernet Interface; Flash and RAM Module; Embedded MCU Module; DSP and Codec Module; Earphone/Speaker; Microphone/Laryngophone)

Fig. 2 The Framework of SICS Software (flow: Input Command; Speech Recognition; output synthesized speech as the operation cue according to the recognition result; decision "Is it sure?" with branches Y and N; recognize the acknowledgement command; recognize the cancel command; Exit)
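The recognize, cue, and confirm-or-cancel flow of Fig. 2 can be sketched in Python as follows. This is only an illustrative sketch, not the prototype's software: the function `run_command_cycle`, the words `ACK` and `CANCEL`, the callback signatures, and the `max_prompts` retry limit are all assumptions introduced here.

```python
from typing import Callable, Optional

ACK = "confirm"     # hypothetical acknowledgement command word
CANCEL = "cancel"   # hypothetical cancel command word


def run_command_cycle(recognize: Callable[[], str],
                      speak: Callable[[str], None],
                      execute: Callable[[str], None],
                      max_prompts: int = 3) -> Optional[str]:
    """One pass through the flow: recognize, cue, then confirm or cancel.

    Returns the executed command, or None if it was cancelled or never
    acknowledged within max_prompts replies (an assumed policy).
    """
    command = recognize()        # first recognition result
    speak(f"{command}?")         # synthesized operation cue
    for _ in range(max_prompts):
        reply = recognize()
        if reply == ACK:         # acknowledgement command recognized
            execute(command)
            return command
        if reply == CANCEL:      # cancel command recognized
            return None
        speak(f"{command}?")     # neither word heard: repeat the cue
    return None
```

Because the command is executed only after the acknowledgement word is recognized, a misrecognized first result leads to a cancelled cycle rather than a wrong operation.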

The framework of the software is shown in Fig. 2. Speech recognition must be accurate, reliable, rapid and robust. Isolated-word or discontinuous speech recognition can satisfy the demands of speech command control. Both speaker-dependent and speaker-independent systems can satisfy the requirement, and multi-user template storage can overcome the limitation of a speaker-dependent system. A 6 Bit 32 kbps limited-vocabulary speech synthesis technology is adopted to synthesize the speech for recognition feedback, speech alarms and spoken notifications. In addition, barge-in technology must be used so that the SICS can process speech recognition while speech is being synthesized. The recognition result is transmitted to the OBDH via a high-speed serial interface. Meanwhile, the codes of speech alarms and events are received from the same interface.

3 The Development of Rapid Speech Recognition System

The classic pattern-recognition approach is used in our development. The block diagram of the approach is shown in Fig. 3.

Fig. 3 Block Diagram of Pattern-Recognition approach (blocks: S(n); Filter Bank; Pattern Training; Templates; Pattern Classifier (DTW algorithm); Decision Logic; Recognition Result)

In this paper, an 8-channel octave-band filter bank is used in the feature extraction phase of speech recognition. It divides the signal bandwidth into a number of frequency bands and measures the signal energy in each band. The system applies a Dynamic Time Warping (DTW) algorithm with relaxed endpoint constraints, which implements rapid speaker-dependent speech recognition and adapts to small variations in speaking rate. The local continuity constraints with slope weighting are shown in Fig. 4.
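As a rough illustration of the filter-bank feature extraction described above, the following Python sketch measures the log energy of one speech frame in octave-wide bands. It rests on assumptions: the paper does not give the band edges or the filter implementation, so the sketch places the top band just below the Nyquist frequency, halves the edges downward, and swaps in a direct DFT for whatever filters the prototype uses. The function name `octave_band_energies` is hypothetical.

```python
import cmath
import math
from typing import List


def octave_band_energies(frame: List[float], sample_rate: float,
                         num_bands: int = 8) -> List[float]:
    """Log energy of `frame` in `num_bands` octave bands below Nyquist."""
    n = len(frame)
    half = n // 2
    # Power spectrum from a direct DFT over bins 1..half (DC is skipped).
    power = []
    for k in range(1, half + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        power.append(abs(acc) ** 2)
    nyquist = sample_rate / 2.0
    energies = []
    for b in range(num_bands):
        lo = nyquist / 2 ** (num_bands - b)       # lower band edge in Hz
        hi = nyquist / 2 ** (num_bands - b - 1)   # upper edge, one octave up
        total = sum(p for k, p in enumerate(power, start=1)
                    if lo <= k * sample_rate / n < hi)
        energies.append(math.log10(total + 1e-12))  # offset avoids log(0)
    return energies
```

Each frame then yields one 8-element feature vector, and a spoken command becomes a sequence of such vectors that the DTW classifier can compare against stored templates.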
The dynamic programming recursion formula is as follows:

$$
D(i_x, i_y) = \min \begin{cases}
D(i_x - 2, i_y - 1) + \frac{1}{2}\left[ d(i_x - 1, i_y) + d(i_x, i_y) \right], \\
D(i_x - 1, i_y - 1) + d(i_x, i_y), \\
D(i_x - 1, i_y - 2) + \frac{1}{2}\left[ d(i_x, i_y - 1) + d(i_x, i_y) \right]
\end{cases}
$$

where:
$D(i_x, i_y)$: the minimum partial accumulated distortion along a path connecting $(1, 1)$ and $(i_x, i_y)$
$d(i_x, i_y)$: the short-time spectral distortion

Fig. 4 The Local Continuity Constraints with slope weighting (three paths into $(i_x, i_y)$: from $(i_x - 2, i_y - 1)$ via $(i_x - 1, i_y)$, directly from $(i_x - 1, i_y - 1)$, and from $(i_x - 1, i_y - 2)$ via $(i_x, i_y - 1)$; each segment of the two-step paths carries weight 1/2)

The DTW algorithm finds the best path through a $T_x$ by $T_y$ grid, beginning at $(1, 1)$ and ending at $(T_x, T_y)$, as follows:

1. Initialization

$$D_A(1, 1) = d(1, 1)\, m(1)$$

where $m(k)$ is the local slope weighting.

2. Recursion

For $1 \le i_x \le T_x$, $1 \le i_y \le T_y$ such that $i_x$ and $i_y$ stay within the allowable grid, compute

$$D_A(i_x, i_y) = \min_{(i_x', i_y')} \left[ D_A(i_x', i_y') + \zeta\big( (i_x', i_y'), (i_x, i_y) \big) \right]$$

where $\zeta\big( (i_x', i_y'), (i_x, i_y) \big)$ is defined by

$$\zeta\big( (i_x', i_y'), (i_x, i_y) \big) = \sum_{l=0}^{L_s} d\big( \phi_x(T' - l), \phi_y(T' - l) \big)\, m(T' - l)$$

in which $d\big( \phi_x(k), \phi_y(k) \big)$ is the short-time spectral distortion between $x_{\phi_x(k)}$ and $y_{\phi_y(k)}$.

3. Termination

$$d(X, Y) = \frac{D_A(T_x, T_y)}{M_\phi}$$

where $d(X, Y)$ is the dissimilarity between $X$ and $Y$, and $M_\phi$ is the normalizing factor.

The following new set of boundary conditions is used to relax the endpoint constraints:

$$1 \le \phi_x(1) \le 1 + Q_{\max}, \qquad 1 \le \phi_y(1) \le 1 + Q_{\max}$$

$$T_x - Q_{\max} \le \phi_x(T) \le T_x, \qquad T_y - Q_{\max} \le \phi_y(T) \le T_y$$

where $Q_{\max}$ represents the maximum anticipated mismatch in the endpoints of the pattern.
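The recursion and the relaxed endpoint conditions above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the prototype's code: the function name `dtw_distance` and the `q_max` parameter are introduced here, the initial slope weight m(1) is taken as 1, and the normalizing factor is assumed to be the length of the first pattern because the paper does not give its value.

```python
from typing import Callable, Sequence, TypeVar

F = TypeVar("F")
INF = float("inf")


def dtw_distance(x: Sequence[F], y: Sequence[F],
                 dist: Callable[[F, F], float],
                 q_max: int = 0) -> float:
    """DTW dissimilarity with three-path local constraints, 1/2 weights.

    q_max relaxes the endpoints: the path may start and end up to q_max
    frames away from the grid corners along either axis.
    Returns inf if no allowed path reaches an allowed end cell.
    """
    tx, ty = len(x), len(y)
    d = [[dist(a, b) for b in y] for a in x]      # local distortions
    D = [[INF] * ty for _ in range(tx)]
    # Initialization: each allowed start cell gets its local distortion
    # (slope weight m(1) assumed to be 1).
    for i in range(min(q_max, tx - 1) + 1):
        for j in range(min(q_max, ty - 1) + 1):
            D[i][j] = d[i][j]
    # Recursion over the grid with the three permitted predecessor paths.
    for i in range(tx):
        for j in range(ty):
            best = D[i][j]                        # inf unless a start cell
            if i >= 2 and j >= 1:
                best = min(best,
                           D[i - 2][j - 1] + 0.5 * (d[i - 1][j] + d[i][j]))
            if i >= 1 and j >= 1:
                best = min(best, D[i - 1][j - 1] + d[i][j])
            if i >= 1 and j >= 2:
                best = min(best,
                           D[i - 1][j - 2] + 0.5 * (d[i][j - 1] + d[i][j]))
            D[i][j] = best
    # Termination: best allowed end cell, normalized by len(x) (assumed).
    end = min(D[i][j]
              for i in range(max(tx - 1 - q_max, 0), tx)
              for j in range(max(ty - 1 - q_max, 0), ty))
    return end / tx
```

A recognizer built on this sketch would compute `dtw_distance` between the input feature sequence and every stored template and pick the template with the smallest value. Note that with `q_max > 0` a path that starts later accumulates fewer terms, so a fixed normalizer slightly favors shorter paths.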

To reach a higher recognition rate, sentence syncopation and keyword matching are applied in this system. The Modified K-means (MKM) algorithm is used in template training, and multi-user template storage is used to overcome the limitation of speaker dependence. That is, each speaker selects the template library previously trained for that speaker.

4 Experiment

Development and experimental study of the SICS prototype have been completed. A simplified Chinese SICS has been realized under a 100 Mbps fast switched Ethernet and WINDOWS 98 OS environment. The configuration of the experimental environment of the SICS is shown in Fig. 5. The rapid speech recognition system described above is applied in the SICS. The sample rate is .052 kHz. As a result, the response time of the system is less than 0.5 second, and the first recognition rate for 4 speech control commands (each including an operation and an operation object) reaches 95% in the laboratory environment (less than 40 dB). In addition, mistaken operations are avoided because each operation is acknowledged through synthesized speech.

The UDP protocol is adopted for real-time transmission of the recognition result codes of speech commands. The recognition result code also remotely drives the corresponding relay output in a 6-relay array to demonstrate the real-time performance of speech control. Recently, the SICS prototype has also been realized on Red Hat Linux 7.0, moving closer to the final target, and the experimental results are the same.

Fig. 5 The Configuration of Experimental Environment of the SICS (components: Server/System Controller; 100M Switched Ethernet; SICS Unit; Drive Demonstration Unit; Relay Output Drive Card; Microphone; Sound Output; Drive Demonstration Panel)
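The UDP transmission of recognition result codes described in this section might look like the following Python sketch. The paper does not specify the datagram layout, so the 2-byte big-endian code format, the function names, and the loopback address used for the demonstration are all assumptions.

```python
import socket
import struct

# Assumed wire format: one unsigned 16-bit result code, network byte order.
CODE_FORMAT = "!H"


def send_result_code(sock: socket.socket, addr, code: int) -> None:
    """Send one recognition result code as a single UDP datagram."""
    sock.sendto(struct.pack(CODE_FORMAT, code), addr)


def receive_result_code(sock: socket.socket) -> int:
    """Wait for one datagram and decode the result code it carries."""
    data, _sender = sock.recvfrom(struct.calcsize(CODE_FORMAT))
    return struct.unpack(CODE_FORMAT, data)[0]


def loopback_demo(code: int) -> int:
    """Send a code to a receiver on the local machine and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2.0)          # UDP gives no delivery guarantee
        send_result_code(sender, receiver.getsockname(), code)
        return receive_result_code(receiver)
```

UDP sends each code without connection setup or retransmission, which keeps latency low. A receiver that drives relay outputs would map each received code to the corresponding output.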

5 Conclusion

The speech instruction control system prototype presented in this article can be used in a spaceship, space station or space lab, especially when the spaceman is walking outside the cabin. The main framework of the system has been demonstrated to be viable. For speech control in the man-machine interface of a manned spacecraft, vocabulary size is not the main factor; a vocabulary of 200 words is quite sufficient. The main factors are speed (real-time performance), recognition rate and robustness. Finally, several problems must be resolved for the SICS prototype:

a higher recognition rate
a more robust system
further miniaturization
the application of a laryngophone

References

[1] Lawrence Rabiner, Biing-Hwang Juang, Fundamentals of Speech Recognition, 1993.
[2] Richard L. Klevans, Robert D. Rodman, Voice Recognition.
[3] Brian Wanstall, DVI in the Military Cockpit: A Third Hand for the Combat Pilot.
[4] Jianhua Yang, Hao Zhai, A Kind of Chinese Speech Synthesis System with Large Vocabulary Which Can Be Used in Man-Machine Interface of Spacecraft, IAF-99-U.4.05.
[5] Hao Zhai, Jianhua Yang, A Study on the Application of Speech Synthesis Technology with Limited Vocabulary in Man-Machine Interface of Manned Spacecraft, 8th ISCOPS, 1999.