A Discriminative Training Algorithm for Hidden Markov Models
Assaf Ben-Yishai and David Burshtein, Senior Member, IEEE

Abstract: We introduce a discriminative training algorithm for the estimation of hidden Markov model (HMM) parameters. This algorithm is based on an approximation of the maximum mutual information (MMI) objective function and its maximization by a technique similar to the expectation-maximization (EM) algorithm. The algorithm is implemented by a simple modification of the standard Baum-Welch algorithm, and can be applied to speech recognition as well as to word-spotting systems. Three tasks were tested: isolated digit recognition in a noisy environment, connected digit recognition in a noisy environment, and word-spotting. In all tasks a significant improvement over maximum likelihood (ML) estimation was observed. We also compared the new algorithm to the commonly used extended Baum-Welch MMI algorithm. In our tests the algorithm showed advantages in terms of both performance and computational complexity.

Index Terms: Discriminative training, hidden Markov model (HMM), maximum mutual information (MMI) criterion.

(Manuscript received August 9, 2001; revised October 7. This research was supported by the KITE consortium of the Israeli Ministry of Industry and Trade. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Andreas Stolcke. The authors are with the Department of Electrical Engineering Systems, Tel-Aviv University, Tel-Aviv 69978, Israel (assaf@eng.tau.ac.il; burstyn@eng.tau.ac.il).)

I. INTRODUCTION

The most popular training method for hidden Markov model (HMM)-based speech recognition systems is maximum likelihood (ML) estimation. The objective of ML estimation is to find the parameter set that maximizes the likelihood of the training utterances given their corresponding transcription. ML estimation stems from the assumption that the speech signal is distributed according to the model, and is well justified in the theory of parameter estimation. Another advantage of ML estimation of HMMs is its simplicity of implementation using the Baum-Welch algorithm. Discriminative training methods such as maximum mutual information (MMI) [1], [13], corrective training [2], and minimum classification error (MCE) [7] attempt to minimize the error rate more effectively by utilizing both the correct category and the competing categories, and incorporating that information into the training phase. Note that the MMI and MCE criteria were shown to be closely related [9]. It was shown in [11] that if the true distribution of the samples to be classified can be accurately described by the assumed statistical model, and the size of the training set tends to infinity, then ML estimation outperforms MMI estimation in the sense that it yields a smaller variance of the parameter estimates. Unfortunately, the true distribution of the speech signal cannot be modeled by an HMM, and in realistic speech recognition tasks the training data is always sparse. Consequently, the minimization of the recognition error rate is a more suitable objective than the minimization of the error of the parameter estimates. MMI estimation aims to maximize the posterior probability of the words in the training set given their corresponding utterances. Unlike in the ML case, there is no simple optimization method for this problem. First experiments in MMI were reported by Bahl et al.
[1], who used the gradient descent algorithm for the optimization of the objective function. Gradient descent algorithms are sensitive to the size of the update step: a large update step can cause unstable behavior, whereas a small update step might result in a prohibitively slow convergence rate. Gopalakrishnan et al. [5] proposed a method for maximizing the MMI objective function which is based on a generalization of the Baum-Eagon inequality. This method was proposed for discrete HMMs. Normandin [13] proposed a useful approximate generalization of this method to HMMs with Gaussian output densities, known as the extended Baum-Welch (EBW) algorithm. Further use of this algorithm was reported in [8], [17], and [18]. A different approach to the optimization of the MMI criterion was reported in [19]. The EBW algorithm is a popular and elegant algorithm that was found useful in various tasks. However, it suffers from the following shortcomings that motivate the development of other algorithms such as the one proposed in this paper:
1) The exact relation between the MMI objective function and the recognition error rate is unknown. This motivates the search for other objective functions.
2) The EBW optimization algorithm is not a simple extension of the standard Baum-Welch algorithm. It requires many iterations and is thus computationally expensive.
3) The EBW algorithm is not easy to generalize to other tasks such as word spotting.
The proposed algorithm addresses the above shortcomings: it is easily implemented by a simple modification of the standard Baum-Welch algorithm; it converges after one or two iterations and is computationally efficient; and it is easily generalized to other tasks such as word spotting. The algorithm we propose is based on an approximation of the MMI objective function, and on its maximization by a technique similar to the expectation-maximization (EM) algorithm [4]. Like the EM algorithm, the algorithm proposed in this paper has the desirable property that in practice it monotonically increases the objective function, and is therefore stable. Although the focus of this paper is on the application of the new algorithm to speech recognition, it can nevertheless be applied to a general statistical pattern recognition problem.
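To make the contrast between the two criteria concrete, the following minimal sketch evaluates both objectives on fabricated per-utterance log-likelihoods for a toy three-word vocabulary with uniform priors. The array values, the vocabulary size, and the priors are illustrative assumptions, not data from the paper.

```python
import numpy as np

# Per-utterance log-likelihoods log p(O_n | w_j): row n is utterance n,
# column j is word j (fabricated numbers for illustration only).
loglik = np.array([[-10.0, -12.0, -15.0],
                   [-11.0, -10.5, -14.0],
                   [ -9.0, -13.0,  -9.5]])
labels = np.array([0, 0, 2])        # index of the correct word of each utterance
priors = np.full(3, 1.0 / 3.0)

# ML objective: only the likelihood of the correct word matters.
f_ml = loglik[np.arange(len(labels)), labels].sum()

# MMI objective: log posterior of the correct word, so the competing words
# (through the denominator) also influence the criterion.
joint = loglik + np.log(priors)
log_denom = np.logaddexp.reduce(joint, axis=1)
f_mmi = (joint[np.arange(len(labels)), labels] - log_denom).sum()

print(f"F_ML = {f_ml:.2f}   F_MMI = {f_mmi:.2f}")
```

Improving the ML objective can be done word by word, whereas the MMI objective couples all models through the denominator; this is the property the algorithm below exploits by approximating that denominator.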

In Section II we give general background on HMM-based speech recognition and explain the standard ML estimation procedure. In Section III we give a general formulation of the approximated MMI algorithm. In Section IV we explain the application of the algorithm to HMMs. In Section V we assess the performance of the algorithm on several tasks (isolated and connected digit recognition in a noisy environment and word-spotting) and make a comparison to the EBW algorithm and the H-criterion, in which our algorithm is found superior in terms of both performance and computational complexity. Finally, in Section VI we conclude the study, summarize the results, and provide some points for further research.

II. BACKGROUND

A. HMM-Based Speech Recognition

In order to simplify the description of the algorithm we assume an isolated word recognition task. Nevertheless, the algorithm can easily be generalized to the recognition of continuous speech; the extensions are given in Section V-C. Our vocabulary comprises the words that form a given set, and each word in the vocabulary has a prior probability of occurrence. The speech signal is divided into frames, and from each frame a feature vector is extracted. We consider the feature vector of each frame and the entire sequence of feature vectors that comprise the utterance. We assume that each word is characterized by a conditional probability density function, and we perform recognition using the MAP criterion, namely, by selecting the word with the largest posterior probability.

B. ML Training

The training of the models is performed according to a given training set, which consists of the utterances and their corresponding transcriptions. ML training is basically the maximization of the ML objective function defined in (1). We assume that parameters are not tied across words. Hence, by (1), it is clear that training can be performed on each word separately. That is, the parameters of each word are estimated according to its correspondingly labeled utterances, whose indices form a set associated with that word. Observing this property, it is clear that ML estimation cannot take into account confusions between words and recognition errors, and in that sense it differs from discriminative training methods. The optimization of the ML objective function is implemented iteratively using the Baum-Welch algorithm [3], which was shown to be a special case of the EM algorithm [4]. The re-estimation formulas are given in (2). We choose to model the words by a Gaussian mixture HMM; the probability density function of the utterance is given in (3) and (4), where the transition probabilities are defined over state sequences whose initial and final states are constrained to be non-emitting, and the summation is over all possible state sequences. The output distributions are given in (5), with the weight of each mixture in each state as in (6), and each mixture component is a Gaussian distribution with a mean vector and a diagonal covariance matrix. Thus, the parameter set of each word comprises the following elements: the transition probabilities between states, the weight of each mixture of each state, the mean vector of each mixture of each state, and the diagonal covariance matrix of each mixture of each state. We denote the entire parameter set of all the words in the vocabulary by a single parameter vector. The occupancy terms appearing in the re-estimation formulas, accumulated over the frames of each utterance, can be calculated efficiently using the well-known forward-backward algorithm, as explained in [14]. For the general case of an HMM parameter, the re-estimation formula takes the form shown in (7) and (8), where the numerator and denominator terms are usually referred to as accumulators and are calculated using the utterances labeled with the corresponding word.
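A minimal sketch of this accumulator-based ML re-estimation for a single state with Gaussian mixture output densities is given below. It assumes the state/mixture posteriors have already been produced by the forward-backward algorithm; here they are simply fabricated so the example runs, and the function and variable names are ours, not the paper's.

```python
import numpy as np

def ml_reestimate(utterances, gammas):
    """Accumulator-based ML re-estimation for one HMM state with M Gaussian
    mixture components and diagonal covariances.

    utterances : list of (T, d) arrays of feature vectors for this word
    gammas     : list of (T, M) arrays; gammas[r][t, m] is the posterior of
                 occupying mixture m of this state at frame t, as produced
                 by the forward-backward algorithm (fabricated in the demo)
    """
    d = utterances[0].shape[1]
    M = gammas[0].shape[1]
    occ = np.zeros(M)              # denominator accumulators: sum of posteriors
    first = np.zeros((M, d))       # numerator accumulators:   sum gamma * o
    second = np.zeros((M, d))      #                           sum gamma * o**2
    for o, g in zip(utterances, gammas):
        occ += g.sum(axis=0)
        first += g.T @ o
        second += g.T @ (o ** 2)
    weights = occ / occ.sum()
    means = first / occ[:, None]
    variances = second / occ[:, None] - means ** 2
    return weights, means, variances

# Demo with fabricated data and posteriors (two mixtures, 3-D features).
rng = np.random.default_rng(1)
utts = [rng.normal(size=(40, 3)) for _ in range(5)]
gams = [rng.dirichlet(np.ones(2), size=40) for _ in range(5)]
print(ml_reestimate(utts, gams))
```

The discriminative procedure introduced next reuses exactly this accumulator machinery, which is what allows it to be implemented as a small modification of the Baum-Welch pass.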

III. THE APPROXIMATED MMI ALGORITHM

A. The Approximated MMI Criterion

The MMI objective function is given by (9). We apply the approximation (10) to the right-hand sum of (9) and obtain (11). Note that this approximation is a special case of the N-best approach with N = 1. However, the choice of the single retained hypothesis is not arbitrary and is crucial for the implementation of the optimization. Now, recall the definition of the set of training utterances labeled with each word, and let the corresponding set of utterances recognized as that word be defined as in (12). If we apply the MAP criterion for recognition, we can say that this set contains the indices of the training utterances that were recognized as the given word. Using these definitions we can rewrite (11) as in (13). Motivated by (13), we introduce the objective function (14), which we call the approximated MMI criterion; it contains a prescribed parameter that controls the discrimination rate. Note that at one extreme value of this parameter we obtain the ML objective function, and at the other extreme we obtain the MMI objective function under the approximation in (10). Note also that in the proposed criterion this parameter plays a role analogous to the weighting parameter of the H-criterion [5]. Observing (14), it is clear that the prior probabilities of the words do not affect the maximization. Unless parameters are tied across the models, maximizing (14) is equivalent to maximizing the per-word objective functions (15), for all words in the vocabulary. As in the ML case, the parameter set of each word can therefore be estimated separately. Note that this is different from the MMI case (9), in which the parameters of the entire vocabulary should be optimized jointly. It is now possible to formulate the two steps of the algorithm:
1. Perform recognition on the training set and obtain the recognized sets and the objective functions.
2. Maximize each objective function with respect to the parameters of the corresponding word, and obtain new estimates of the parameters.
The ML estimates of the parameters are taken as the initial condition. The two steps can be iterated repeatedly. In our experiments, however, best results were obtained after the first iteration.

B. Example

As explained in the introduction, if the observations are distributed according to the assumed statistical model, the optimal training technique is ML estimation. The theoretical justification for using MMI estimation stems from its superiority in examples where the assumed statistical model is incorrect [12]. In this section, we give a simple example in which the assumed statistical model is incorrect, and show that the approximated MMI criterion leads to a better decision rule than the ML criterion, in the sense that it yields a smaller probability of error. Consider a classification problem with two classes that have equal prior probabilities. Both class-conditional densities are Gaussian, with different means and different variances. The MAP classification rule reads as in (16). This rule is optimal in the sense that it reaches the minimal classification error probability. The decision regions can be obtained by an explicit solution of (16); when the two variances differ, the decision rule becomes (17), where the two thresholds are the solutions of (16) with equality imposed. Now suppose that the class densities are not known in advance, and that they are assumed to be Gaussian. The parameters are estimated from a training set consisting of i.i.d. samples from each of the two classes. Let us now make the incorrect assumption that both classes have the same variance, and calculate the estimates of the two means. Under the equal-variance assumption (and equal priors), the decision rule becomes a comparison with a single threshold, the midpoint between the two estimated means.
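Before turning to the ML and approximated-MMI solutions below, here is a small numerical sketch of this setup. The specific means, variances, and discrimination-parameter value are arbitrary choices (the paper's numerical values are not preserved in this transcription), and the approximated-MMI update used here is a plausible reading of (18)-(19): each mean is re-estimated from the accumulator of its labeled samples minus the parameter times the accumulator of the samples recognized as that class, in line with the generic accumulator-difference rule (28) given later in Section IV.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters: unequal true variances, while the assumed model
# uses a common variance (these numbers are ours, not the paper's).
m1, m2 = 0.0, 4.0
s1, s2 = 1.0, 3.0
n = 100_000
x1 = rng.normal(m1, s1, n)          # samples labeled class 1
x2 = rng.normal(m2, s2, n)          # samples labeled class 2

# ML solution under the (incorrect) equal-variance assumption:
mu1_ml, mu2_ml = x1.mean(), x2.mean()
thr_ml = 0.5 * (mu1_ml + mu2_ml)

# One approximated-MMI step: classify with the ML threshold to obtain the
# recognized sets, then update each mean with the accumulator-difference form.
eps = 0.3
x = np.concatenate([x1, x2])
in_s1 = x < thr_ml                   # samples recognized as class 1

def ammi_mean(own, in_si):
    return (own.sum() - eps * x[in_si].sum()) / (own.size - eps * in_si.sum())

thr_ammi = 0.5 * (ammi_mean(x1, in_s1) + ammi_mean(x2, ~in_s1))

def error_rate(thr):
    return 0.5 * ((x1 > thr).mean() + (x2 < thr).mean())

print(f"ML   threshold {thr_ml:.3f}  error {error_rate(thr_ml):.4f}")
print(f"aMMI threshold {thr_ammi:.3f}  error {error_rate(thr_ammi):.4f}")
```

For these illustrative numbers and a small parameter value, the shifted threshold gives a slightly lower error than the ML midpoint, which is the qualitative behavior the example is meant to demonstrate.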
The ML Solution: The ML estimates of the two means are in this case simply the sample averages of the corresponding classes, which, by the law of large numbers, converge to the true class means (the convergence is in the mean-square sense).

The Approximated MMI Solution: We take the threshold obtained by ML estimation as the algorithm's initial estimate. The two recognition sets are defined as in (12). Using straightforward differentiation to maximize the objective functions defined in (15) yields (18) and (19). By the law of large numbers, the various terms in (18) and (19) can be replaced by their expectations, which enables us to calculate the resulting threshold. As an example, specific values were chosen for the class means and variances; the resulting thresholds were computed, along with the probability of error of the decision rule (17). The minimal probability of error obtained by the approximated MMI algorithm is attained at a particular value of the discrimination parameter, and its relative distance from the optimal probability of error (using the correct model and parameter values) is small. The MMI estimates were also calculated using a Monte Carlo experiment, in which samples were drawn for each class and the MMI estimates were obtained by a direct maximization of the MMI objective function. The resulting MMI probability of error is larger than the one obtained by the approximated MMI algorithm. Fig. 1 shows the behavior of the probability of error as a function of the discrimination parameter. It can be seen that for sufficiently small values of the parameter, the probability of error is smaller than the one obtained by ML estimation. We note that at a certain parameter value the estimates diverge, since the denominators of (18) and (19) are zeroed; below that value, however, the probability of error is smaller than the one obtained by the ML estimate. Consecutive iterations were also tried, by taking the current threshold for the calculation of the recognition sets and then calculating a new threshold using (18) and (19). More than one iteration, however, did not yield a consistent improvement in the error rate over the range of parameter values tested. (Fig. 1: probability of error versus the discrimination parameter.)

C. Maximization Process for Models With Incomplete Data

As shown in Section III-A, the approximated MMI estimates are obtained by maximizing the per-word objective functions. However, in our case, due to the nature of the pdfs of the HMMs, these objective functions cannot be maximized in closed form. For this reason we propose an iterative solution for models that include complete and incomplete data, which is similar to the EM algorithm.

Algorithm Formulation: Our training set comprises the observed elements with their pdf. We assume the existence of complete data corresponding to the observations, with its own pdf, where the observed data is obtained from the complete data by a transformation that is, in general, noninvertible (many-to-one). We are interested in maximizing the function in (20). As in the EM algorithm, we can rewrite this function in terms of the complete data; applying the conditional expectation given the observed data then yields an auxiliary function, and (20) becomes the decomposition analyzed next.

Now, similar to the EM algorithm, we want to see whether an increase of the auxiliary function yields an increase of the objective function; that is, we want to check whether the condition in (21) holds. Note that each term can be expressed through the Kullback-Leibler distance between the corresponding densities, which is always nonnegative. All the summation terms on the right-hand side of (21) are Kullback-Leibler distances, hence they are nonnegative. However, the last term in (21) is multiplied by a negative factor. Our assumption is that this does not change the sign of the entire sum, since the discrimination parameter is chosen sufficiently small and the number of recognition errors is small; hence, this term contains a small number of elements. The assumption that a maximization of the auxiliary function actually increases the objective function was found to be true in all our experiments for the range of parameter values used. In light of that, like in the case of the EM algorithm, our algorithm has the desirable property that it monotonically increases the objective function and is therefore stable. Note that the EBW algorithm is also guaranteed to increase the objective function only in a certain range of its free parameter; however, in practice the parameter of that algorithm is chosen empirically, and is typically outside this range [5]. We thus have the following two-step solution: E-step: compute the auxiliary function (22). M-step: maximize it, as in (23). Finally, it should be noted that the algorithm we proposed can be applied to a general parameter estimation problem and is not restricted to the context of speech recognition using HMMs.

IV. APPLICATION TO HMMS

Using the two-step solution given in (22) and (23) it is possible to derive the explicit re-estimation formulas for the HMM case. The explicit derivation is detailed in the Appendix. The resulting re-estimation formulas are (24)-(27), where the occupancy terms are those defined in (6) and (7). Comparing (24)-(27) to the formulas for the ML estimates (2)-(5), it is possible to describe the new re-estimation procedure for each parameter in the way summarized by (28).

In (28), the accumulators in the first terms are computed according to the original transcription of the training set. Similarly, the discriminative accumulators are computed according to the transcription obtained by recognition. As seen so far, the new algorithm has two major steps. Approximation: performing recognition on the training set in order to obtain the recognized sets; using these sets, the approximated MMI objective function can be calculated. Maximization: maximizing the objective function using the re-estimation formulas (24)-(27). The algorithm has the following degrees of freedom: the choice of the discrimination parameter (one constant for all words, or a different one for each word); the recognition method in the Approximation step (when the training set consists of continuous utterances of words, recognition can be performed in several ways: using the boundaries of the words in the transcription, or ignoring them and using Viterbi recognition); the number of iterations (one or more); and the order of the steps in the iterations (applying Approximation and Maximization successively, or applying Approximation once and then several iterations of Maximization).

V. EXPERIMENTAL RESULTS

Experiments were conducted on two tasks. All experiments were done using the HTK3 [6] toolkit.

A. Isolated Digit Recognition in a Noisy Environment

The utterances were taken from the adult speakers of the TIDIGITS corpus [10], which is a multi-speaker small-vocabulary database. The corpus vocabulary comprises 11 words (the digits 1 to 9 plus "oh" and "zero") spoken by 326 speakers, in both an isolated and a continuous manner. We used only the adult speakers of the corpus according to the following division: the training set comprised 113 speakers (55 men, 58 women), and the test set comprised 115 speakers (57 men, 58 women). Each speaker utters each digit twice. In order to lower the baseline recognition rate we added white Gaussian noise to the speech signal so as to obtain an SNR level of 0 dB. The feature vector comprised 12 MFCCs, log energy, and the corresponding delta and acceleration coefficients. The frame rate was 10 ms and the window size was 25 ms. Mean normalization was also applied. Each digit, including the silence segments surrounding it, was modeled by an HMM with 10 emitting states and with diagonal covariance matrices. In the first experiments, single-mixture Gaussian output distributions were chosen. The HMM topology was left-to-right with no skips. The baseline (ML) system was obtained by applying three segmental K-means iterations and seven Baum-Welch iterations. The baseline recognition rate was 88.58%. In the first experiments, one iteration of Approximation and one of Maximization were applied. It was found that re-estimating the means, variances, and transition probabilities yielded better results than re-estimating only the means, or only the means and variances. Various values of the discrimination parameter were tested; each value was used for the re-estimation of all the models. For large parameter values, variances and transition probabilities tended to become negative; in these cases they were replaced by their ML values. However, when such an event occurred, the recognition rate deteriorated drastically, so in further experiments we restricted the parameter to sufficiently small values. Fig. 2 shows the recognition rate versus the discrimination parameter on both the training and test sets.
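A minimal sketch of the combined re-estimation described above for a single diagonal-covariance Gaussian is given below, including the fallback to the ML values when a variance (or an occupancy difference) becomes non-positive. The accumulator layout, function name, and toy numbers are our own illustrative assumptions; the formula follows the accumulator-difference form summarized in (28).

```python
import numpy as np

def discriminative_reestimate(num, den, eps, ml_fallback):
    """Approximated-MMI re-estimation of one diagonal Gaussian (cf. (28)).

    num : accumulators gathered with the original (reference) transcription
    den : discriminative accumulators gathered with the recognized
          transcription; both are dicts with keys
          'occ' (scalar), 'x' (d,), 'xx' (d,)
    eps : the discrimination parameter
    ml_fallback : (mean, var) ML estimates used when the update is invalid
                  (negative variances are replaced by their ML values)
    """
    occ = num['occ'] - eps * den['occ']
    if occ <= 0:
        return ml_fallback
    mean = (num['x'] - eps * den['x']) / occ
    var = (num['xx'] - eps * den['xx']) / occ - mean ** 2
    if np.any(var <= 0):
        return ml_fallback
    return mean, var

# Toy usage with fabricated accumulators for a 2-dimensional Gaussian.
num = {'occ': 120.0, 'x': np.array([60.0, -24.0]), 'xx': np.array([160.0, 90.0])}
den = {'occ': 30.0, 'x': np.array([18.0, -3.0]), 'xx': np.array([40.0, 20.0])}
ml_mean = num['x'] / num['occ']
ml_var = num['xx'] / num['occ'] - ml_mean ** 2
print(discriminative_reestimate(num, den, eps=0.2, ml_fallback=(ml_mean, ml_var)))
```

In an actual system the reference statistics would come from a Baum-Welch pass over the correctly labeled utterances and the discriminative statistics from a pass over the recognized transcriptions.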
Best results were obtained for the same parameter value on both the training and the test set, so in this case the parameter can be set using cross-validation on the training set. A reduction of 57% and 28% in the error rate was observed on the training set and the test set, respectively. The behavior of the algorithm over several iterations was also investigated. All iterations were applied with the same parameter value, and the following criteria were calculated.

The criteria were the recognition rate on the training set, the MMI objective function under the approximation in (13), and the objective function of the approximated MMI algorithm. Iterations were applied using two schedules: 1) each iteration comprises a single Approximation step followed by a single Maximization step; 2) one Approximation step is followed by several Maximization iterations. Fig. 3 shows the evolution of the above criteria along four iterations of the algorithm, where the iterations were implemented according to the first schedule. Each iteration in the graph represents one step of Approximation followed by one step of Maximization; the zeroth iteration represents the values of the criteria before applying the first iteration. It is possible to see that after the first iteration no improvement was obtained, other than a consistent growth in the algorithm's objective function. Fig. 4 shows the corresponding evolution where the iterations were implemented according to the second schedule; in this case a growth in all the objective functions was obtained. Recalling Section III-A, the approximated MMI criterion was obtained by an approximation of the MMI criterion; in the experiment reported here the relative approximation error was only 0.1%. In Section III we assumed that a maximization of the auxiliary function (the Maximization step) increases the value of the approximated MMI criterion. Indeed, in the experiment where several iterations of Maximization were applied, each iteration yielded a monotonic growth in the approximated MMI criterion. In light of that, like in the case of the EM algorithm, our algorithm has the desirable property that it monotonically increases the objective function and is therefore stable. The best recognition rate obtained on the test set was 92.16%, which reflects a reduction of 31% in the error rate in comparison to the ML baseline. This result was obtained by applying two iterations of Maximization (second schedule). Table I summarizes the results obtained by the algorithm on the TIDIGITS database; the iteration columns in the table represent Maximization iterations. The performance of the discriminative training algorithm was also examined while increasing the number of Gaussian mixtures in the output distributions. In all cases we implemented one Approximation step followed by one Maximization step.

The results of this experiment are summarized in Table II. It is possible to see that in all cases the algorithm yielded an improvement over ML estimation; however, the improvement decreased as the number of mixtures was increased.

B. Comparison to Other Discriminative Training Algorithms

In this subsection we compare the performance of the algorithm to that obtained by two other related algorithms, namely, the EBW algorithm as in [13] and the algorithm proposed in [19]. The EBW algorithm introduced in [13] aims to maximize the MMI objective function (34), and is an extension of the optimization procedure proposed in [5] to the case of continuous-output HMMs. We implemented the EBW algorithm in the task of isolated noisy digits modeled by single-mixture HMMs. The EBW re-estimation formulas [13], [18] are given in (29).

The quantities appearing in (29) are defined in (30). The smoothing constant in these formulas is set to the maximum between twice the value that ensures positive variances and a second prescribed term; as in [18], a different value of the constant was set for each state. The evolution of the recognition rate on both the training set and the test set is depicted in Fig. 5, and the results are summarized in Table III. It appears that the EBW algorithm yields a better improvement on the training set (75% versus 57% improvement). However, the opposite occurs on the test set: a 15% improvement by the EBW algorithm versus 31% by the proposed algorithm. We note that the recognition rate on the test set is the one of actual importance. We now compare the computational complexity of the two algorithms, measured in units of one Baum-Welch re-estimation pass over all the models. The new algorithm requires a recognition pass over the training set for the Approximation step (the determination of the recognized sets, assuming an equal number of operations for the Viterbi and Baum-Welch algorithms) and one re-estimation pass for each Maximization step (24)-(27); in total, assuming one Approximation step and two Maximization steps (see Section V-A), only a few such passes are needed. The EBW algorithm requires a comparable amount of computation for each iteration, so its total cost grows with the number of iterations needed for convergence. In terms of the MMI objective function, both algorithms yielded a growth, but the growth obtained by the new algorithm was faster: after four iterations the new algorithm had reached a higher value of the objective than the EBW algorithm. In Fig. 6 we show the evolution of the MMI objective function along the EBW iterations. We note that the EBW algorithm has many free parameters, whose tuning is the result of considerable previous research. In our implementation, following [18], there is one parameter per state for each model, resulting in an overall number of 110 parameters. We also tried to use only one tunable parameter, common to all models, but this resulted in an extremely slow convergence rate. Estimating the transition probabilities did not improve the performance either. In the implementation of our algorithm we had only one tunable parameter, the discrimination parameter. Adding tunable parameters to the algorithm (e.g., a different value for each model or for each state) may improve the performance when the training database is sufficiently large (to avoid overfitting).
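For concreteness, here is a small sketch of the standard EBW update for a single diagonal-covariance Gaussian, in the Normandin/Woodland-Povey form that (29)-(30) and the constant-selection rule above describe. The exact second term of that rule is not preserved in this transcription; the sketch assumes, as in [18], a multiple of the denominator occupancy, and all statistics and names are fabricated for illustration.

```python
import numpy as np

def ebw_update(mu, var, num, den, E=2.0):
    """One EBW update for a single diagonal-covariance Gaussian.

    num / den are numerator (correct-transcription) and denominator
    (recognition) statistics, each a dict with keys:
      'occ' : scalar occupancy        sum_t gamma(t)
      'x'   : first-order statistics  sum_t gamma(t) * o_t      (vector)
      'xx'  : second-order statistics sum_t gamma(t) * o_t**2   (vector)
    """
    def new_params(D):
        denom = num['occ'] - den['occ'] + D
        new_mu = (num['x'] - den['x'] + D * mu) / denom
        new_var = (num['xx'] - den['xx'] + D * (var + mu ** 2)) / denom - new_mu ** 2
        return new_mu, new_var

    # Approximate the smallest D that keeps all variances positive by doubling,
    # then apply the rule: max of twice that value and E times the denominator
    # occupancy (the assumed second term).
    D = 1.0
    while np.any(new_params(D)[1] <= 0.0):
        D *= 2.0
    D = max(2.0 * D, E * den['occ'])
    while np.any(new_params(D)[1] <= 0.0):   # safety guard for the sketch
        D *= 2.0
    return new_params(D)

# Toy usage with fabricated statistics for a 2-dimensional Gaussian.
rng = np.random.default_rng(0)
mu, var = np.zeros(2), np.ones(2)
num = {'occ': 50.0, 'x': rng.normal(size=2) * 50, 'xx': 60 + rng.random(2) * 40}
den = {'occ': 20.0, 'x': rng.normal(size=2) * 20, 'xx': 25 + rng.random(2) * 10}
print(ebw_update(mu, var, num, den))
```

The contrast with the sketch after Section V-A is the point made in the text: EBW interpolates toward the old parameters through the per-state constant and needs many such iterations, whereas the proposed update subtracts scaled discriminative accumulators directly and converges in one or two steps.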

As noted in Section III, our objective function resembles the H-criterion [5], given in (31). The authors of [5] used their optimization algorithm in order to maximize the H-criterion objective function; however, their implementation was for the case of discrete-output HMMs. To the best of our knowledge, the extension of that optimization algorithm to continuous HMMs under the H-criterion is not as straightforward as in the MMI case. However, Zheng et al. [19] have proposed a gradient-descent-based optimization procedure that yields the re-estimation formulas (32) and (33). One quantity in these formulas is a tunable constant which is tuned as in the EBW case; the other, h, is a tunable parameter which resembles the discrimination parameter in our algorithm: one extreme value of h yields the ML objective function and the other yields the MMI objective. Note that for a particular value of h, (32) coincides with (29). We tuned h in the following way: we first implemented one iteration for various values of h. One value yielded the best improvement (12.26%) on the test set and another value yielded the best improvement (18%) on the training set. Then, we implemented 30 iterations of the algorithm for these values and for one additional value. The evolution of the recognition rate along the H-criterion iterations for different values of h is depicted in Fig. 7, and the best improvements are summarized in Table IV (the improvements are given in parentheses). It appears from the results that our algorithm outperforms the H-criterion on both the training and test sets. The H-criterion outperforms the EBW algorithm on the test set but not on the training set.

C. Connected Digit Recognition in a Noisy Environment

The utterances were taken from the adult speakers of the TIDIGITS corpus after adding white Gaussian noise to the speech signal so as to obtain an SNR level of 0 dB. Each speaker's contribution to the corpus consisted of two repetitions of each digit in isolation and 55 digit strings of lengths 2, 3, 4, 5, and 7. The feature vector was the same as in the isolated digit recognition task. Each digit was modeled by an HMM with 8 emitting states, with a single-mixture Gaussian output distribution and with diagonal covariance matrices. The entry and exit silences were modeled by an HMM with 3 emitting states, with a single-mixture Gaussian output distribution and with diagonal covariance matrices. The HMM topology was left-to-right with no skips. The baseline (ML) system was obtained in three steps. First, the models were initialized by applying three segmental K-means iterations using the isolated digit utterances. Second, seven Baum-Welch iterations were applied using the connected digit utterances, which were previously labeled using alignment with models trained on clean speech. Lastly, the models were refined using 19 iterations of embedded training [6]. Embedded training does not require the utterances to be time-segmented. Instead, during training, each continuous utterance is modeled by a composite HMM which is a concatenation of the uttered digits, the accumulators are calculated using this composite HMM, and the parameters are then calculated from the accumulators in the usual way (2)-(5). In order to generalize our algorithm to connected speech, we propose the following. The MMI objective function can be written as (34), in terms of all the training set utterances and the complete training-set transcription. We can also write it as in (35), where the sum is over all possible transcriptions. Following the same steps as in Section III-A, we arrive at the objective function (36), in which the competing transcription corresponds to the largest term in the sum of (35). In order to find this transcription we apply unconstrained Viterbi recognition on the training set.
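The composite-HMM construction used by embedded training can be sketched as follows: per-digit left-to-right models are chained so that the probability mass leaving one word enters the first state of the next word in the transcription. The data layout, function names, and toy parameters below are our own illustration, not HTK's internal representation.

```python
import numpy as np

def compose(word_models, transcription):
    """Concatenate per-word left-to-right HMMs into one composite HMM.

    word_models : dict mapping a word to (A, exit_p, means), where A is the
        (S, S) transition matrix between the word's emitting states, exit_p
        is the (S,) probability of leaving the word from each state, and
        means holds the (S, d) state means (a stand-in for the full Gaussian
        parameters).
    transcription : list of words, e.g. ['one', 'eight', 'oh'].
    """
    total = sum(word_models[w][0].shape[0] for w in transcription)
    A = np.zeros((total, total))
    mean_rows = []
    pos = 0
    for k, w in enumerate(transcription):
        Aw, exit_p, means = word_models[w]
        S = Aw.shape[0]
        A[pos:pos + S, pos:pos + S] = Aw
        if k + 1 < len(transcription):
            # Mass that leaves word k enters the first state of word k + 1.
            A[pos:pos + S, pos + S] += exit_p
        mean_rows.append(means)
        pos += S
    return A, np.vstack(mean_rows)

# Toy 2-state, 1-D word models, just to show the composition.
def toy_word(mu):
    A = np.array([[0.6, 0.4],
                  [0.0, 0.7]])
    exit_p = np.array([0.0, 0.3])    # each row of A plus exit_p sums to one
    return A, exit_p, np.array([[mu], [mu + 1.0]])

models = {'one': toy_word(0.0), 'two': toy_word(5.0)}
A, means = compose(models, ['one', 'two', 'one'])
print(A.shape, means.ravel())
```

Running the forward-backward algorithm over such a composite model with the reference transcription gives the usual accumulators, and running it over the recognized transcription gives the discriminative ones, as described next.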
The accumulators were calculated using an embedded-training pass that does not require time segmentation of the known transcription. The discriminative accumulators were calculated by an embedded-training pass using the transcription obtained by Viterbi recognition. Lastly, the parameters were calculated using (24)-(27). Recognition was implemented using the Viterbi algorithm, where the beginning and the end of each utterance were constrained to be silence. A word insertion penalty [6] was also used in order to reduce insertions.
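The connected-digit results on the next page are reported with the two usual measures (percent correct and accuracy), defined from the number of correctly recognized words and the number of insertions. A minimal scoring sketch is given below; it assumes standard HTK-style definitions based on a minimum-edit-distance alignment, which is our reading of the expressions in the transcription.

```python
def align_counts(ref, hyp):
    """Minimum-edit-distance alignment of reference and hypothesis word lists,
    returning (hits, substitutions, deletions, insertions)."""
    n, m = len(ref), len(hyp)
    # dp[i][j] = (cost, hits, subs, dels, ins) for ref[:i] versus hyp[:j]
    dp = [[None] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = (0, 0, 0, 0, 0)
    for i in range(1, n + 1):
        c = dp[i - 1][0]
        dp[i][0] = (c[0] + 1, c[1], c[2], c[3] + 1, c[4])            # deletion
    for j in range(1, m + 1):
        c = dp[0][j - 1]
        dp[0][j] = (c[0] + 1, c[1], c[2], c[3], c[4] + 1)            # insertion
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = ref[i - 1] == hyp[j - 1]
            a = dp[i - 1][j - 1]
            diag = (a[0] + (0 if match else 1),
                    a[1] + (1 if match else 0),
                    a[2] + (0 if match else 1), a[3], a[4])          # hit/sub
            b = dp[i - 1][j]
            up = (b[0] + 1, b[1], b[2], b[3] + 1, b[4])              # deletion
            c = dp[i][j - 1]
            left = (c[0] + 1, c[1], c[2], c[3], c[4] + 1)            # insertion
            dp[i][j] = min(diag, up, left)
    _, hits, subs, dels, ins = dp[n][m]
    return hits, subs, dels, ins

ref = "one eight oh three".split()
hyp = "one oh oh three five".split()
h, s, d, i = align_counts(ref, hyp)
n = len(ref)
print(f"%Correct = {100.0 * h / n:.1f}   Accuracy = {100.0 * (h - i) / n:.1f}")
```

Percent correct ignores insertions, which is why the word insertion penalty mentioned above matters mainly for the accuracy figure.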

11 214 IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 12, NO. 3, MAY 2004 TABLE V SUMMARY OF THE RESULTS IN THE CONNECTED DIGIT RECOGNITION TASK ON THE TRAINING SET TABLE VII RESULTS IN THE WORD-SPOTTING TASK TABLE VI SUMMARY OF THE RESULTS IN THE CONNECTED DIGIT RECOGNITION TASK ON THE TEST SET Performance was evaluated using the following two expressions: Where is the total number of words in the transcription files, is the number of words correctly recognized, and is the number of insertions. As in the isolated digit recognition task, the best improvement was obtained for both the training and test sets with the same value ( ). The results of this experiment are summarized in Tables V and VI. D. Word-Spotting The task involved the spotting of 20 keywords (KWs) on the conversational part of the Stonehenge corpus taken from the Road Rally database [15]. This corpus contains 20 identified KWs, spoken by 80 speakers (28 females, 52 males) that were recorded in laboratory conditions. The speech was then filtered to simulate a telephone line frequency response. The database transcription contains the KW locations. Non-KW speech is not transcribed. The training and test sets were chosen so as to give a good representation of confusable utterances. Sentences sf01-sf02,sf11-sf12,sf42,sf44- sf48,sm03-sm16,sm49-sm59 comprised the training set (85 minutes of speech, containing 1313 KW utterances). Sentences sf58,sf60-sf64,sm33-sm41,sm43 comprised the test set (39 minutes of speech, containing 617 KW utterances). We used the baseline word-spotter proposed in [16], in which likelihood ratio scoring was implemented. Each KW was modeled by a HMM with 18 emitting states and single mixture Gaussian output distributions. Only one filler model was used, modeled by a stationary HMM with 50 mixtures. The feature vector was where represents an MFCC coefficient. Mean normalization was also applied. Fig. 8. Receiver operating curves. The KW models were re-estimated using the new proposed algorithm. The filler model was trained using standard ML estimation. The Approximation step was implemented by performing recognition on the training set. It was experimentally shown better to use all the false alarms for the calculation of the discriminative accumulators, and not reduce them using scoring. One iteration of Maximization was used, and a different value of was used for each KW. Two estimation procedures were examined for : According to the first, is determined for each KW according to its figure of merit (FOM) on the test set. According to the second, is determined by an empirical rule. The first procedure involves the test set, and therefore is not feasible in a realistic situation. According to the second procedure is chosen for each word as some fraction of the value in which variances become negative. The fraction values that we tried were 0.5, 0.7, and 0.9. Results are shown in Table VII. The Improvement column represents the error rate reduction. Fig. 8 presents the receiver operating curves of the word-spotting system with and without discriminative training. VI. CONCLUSIONS This paper has described a new algorithm for discriminative training. We started by introducing a new estimation criterion referred to as the approximated MMI criterion. We then introduced an optimization technique similar to the EM algorithm. Unlike existing discriminative training algorithms, the training procedure can be implemented by a simple modification of the Baum Welch algorithm.

12 BEN-YISHAI AND BURSHTEIN: DISCRIMINATIVE TRAINING ALGORITHM FOR HIDDEN MARKOV MODELS 215 The algorithm has two major steps: Approximation, which is the derivation of the algorithm s criterion, and Maximization, which is similar to the EM algorithm. It was seen in experiments that the approximation yields a small relative error. The maximization process yielded a monotonic growth in the objective function along the iterations. This is a desirable property that can be proved for the EM algorithm. In the case of the new proposed algorithm, this property was shown to hold under certain conditions that were validated in the experiments. Three tasks were tested: isolated and connected digit recognition in a noisy environment and word-spotting. In the isolated digit recognition task, a reduction of 31% in the error rate was observed. In the connected digit recognition task, a reduction of 13% in the error rate was observed. In the word-spotting task, the best improvement was a reduction of 17% in the error rate. We also compared our algorithm to the EBW algorithm on an isolated digit recognition task. Our algorithm was shown to be superior in terms of its performance on the test database, and in terms of its computational complexity. The generalization to context dependent phonetic systems (e.g., triphone based) is conceptually simple, and is the same as in the generalization to connected speech, (34) (36). Tying is accounted for in the usual way by tying together the appropriate counters. where represents the complete underlying sequence of states and mixtures that correspond to the utterance. In the case of the HMMs defined in Section II.A, the auxiliary function is the first equation at the bottom of the page. In the M-step we maximize with respect to all the elements of the parameter vector. We start with the transition probabilities. In order to satisfy the constraints, we shall use the set of Lagrange multipliers, and maximize the Lagrangian. Hence, Therefore, APPENDIX DERIVATION OF EXPLICIT FORMULAS FOR HMMS We apply the estimation algorithm (22), (23) to a Gaussian mixture HMM, as defined in Section II. Let be the auxiliary function corresponding to Summing over we obtain where. Therefore, see the second equation at the bottom of the page. Deriving the formulas for

the mixture weights, we shall again use Lagrange multipliers in order to satisfy the sum-to-one constraints. Differentiating the Lagrangian, summing over the mixtures to eliminate the multiplier, and solving gives the re-estimation formula for the weights. We then derive the re-estimation formulas for the elements of the mean vectors, and finally for the elements of the diagonal covariance matrices, in the same manner.

ACKNOWLEDGMENT

The authors would like to thank the Cambridge University Engineering Department and, in particular, G. Evermann for providing and supporting the HTK3 toolkit.

REFERENCES

[1] L. R. Bahl, P. F. Brown, P. V. de Souza, and R. L. Mercer, "Maximum mutual information estimation of hidden Markov model parameters for speech recognition," in Proc. ICASSP '86, Apr. 1986.
[2] L. R. Bahl, P. F. Brown, P. V. de Souza, and R. L. Mercer, "A new algorithm for the estimation of hidden Markov model parameters," in Proc. ICASSP '88, 1988.
[3] L. E. Baum, T. Petrie, G. Soules, and N. Weiss, "A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains," Ann. Math. Statist., vol. 41, no. 1.
[4] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc., vol. 39, pp. 1-38, 1977.
[5] P. S. Gopalakrishnan, D. Kanevsky, A. Nádas, and D. Nahamoo, "An inequality for rational functions with applications to some statistical estimation problems," IEEE Trans. Inform. Theory, vol. 37, Jan.
[6] HTK Hidden Markov Model Toolkit [Online]. Available: eng.cam.ac.uk
[7] B.-H. Juang, W. Chou, and C.-H. Lee, "Minimum classification error methods for speech recognition," IEEE Trans. Speech Audio Processing, vol. 5, no. 3.
[8] S. Kapadia, V. Valtchev, and S. J. Young, "MMI training for continuous phoneme recognition on the TIMIT database," in Proc. ICASSP 1993, vol. 2.
[9] S. Katagiri, B.-H. Juang, and C.-H. Lee, "Pattern recognition using a family of design algorithms based upon the generalized probabilistic descent method," Proc. IEEE, vol. 86, no. 11.
[10] R. G. Leonard, "A database for speaker-independent digit recognition," in Proc. ICASSP '84.
[11] A. Nádas, "A decision theoretic formulation of a training problem in speech recognition and a comparison of training by unconditional versus conditional maximum likelihood," IEEE Trans. Acoust., Speech, Signal Processing, vol. 31, no. 4, 1983.

[12] A. Nádas, D. Nahamoo, and M. A. Picheny, "On a model-robust training method for speech recognition," IEEE Trans. Acoust., Speech, Signal Processing, vol. 39, no. 9.
[13] Y. Normandin, R. Cardin, and R. De Mori, "High-performance connected digit recognition using maximum mutual information estimation," IEEE Trans. Speech Audio Processing, vol. 2.
[14] L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77.
[15] The Road Rally Word-Spotting Corpora (RDRALLY1), NIST, NIST Speech Disc6 1.1.
[16] R. C. Rose and D. B. Paul, "A hidden Markov model based keyword recognition system," in Proc. ICASSP '90, vol. 2.24, Apr. 1990.
[17] V. Valtchev, J. J. Odell, P. C. Woodland, and S. J. Young, "MMIE training of large vocabulary speech recognition systems," Speech Commun., vol. 22.
[18] P. C. Woodland and D. Povey, "Large scale MMIE training for conversational telephone speech recognition," in Proc. Speech Transcription Workshop.
[19] J. Zheng, J. Butzberger, H. Franco, and A. Stolcke, "Improved maximum mutual information estimation training of continuous density HMMs," in Proc. 7th Eur. Conf. Speech Communication and Technology, Aalborg, Denmark, Sept.

Assaf Ben-Yishai was born in Israel. He received the B.Sc. and M.Sc. degrees in electrical engineering in 1999 and 2001, respectively, both from Tel-Aviv University. He is currently pursuing the Ph.D. degree in electrical engineering at Tel-Aviv University. His research interests include speech recognition and information theory.

David Burshtein (M'92, SM'99) received the B.Sc. and Ph.D. degrees in electrical engineering in 1982 and 1987, respectively, from Tel-Aviv University. He was a Research Staff Member in the Speech Recognition Group of the IBM T. J. Watson Research Center, Yorktown Heights, NY. In 1989, he joined the Department of Electrical Engineering Systems, Tel-Aviv University, where he is currently an Associate Professor. His research interests include information theory, speech, and signal processing.


More information

Random projection for non-gaussian mixture models

Random projection for non-gaussian mixture models Random projection for non-gaussian mixture models Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 92037 gyozo@cs.ucsd.edu Abstract Recently,

More information

Application of Principal Components Analysis and Gaussian Mixture Models to Printer Identification

Application of Principal Components Analysis and Gaussian Mixture Models to Printer Identification Application of Principal Components Analysis and Gaussian Mixture Models to Printer Identification Gazi. Ali, Pei-Ju Chiang Aravind K. Mikkilineni, George T. Chiu Edward J. Delp, and Jan P. Allebach School

More information

Mixture Models and the EM Algorithm

Mixture Models and the EM Algorithm Mixture Models and the EM Algorithm Padhraic Smyth, Department of Computer Science University of California, Irvine c 2017 1 Finite Mixture Models Say we have a data set D = {x 1,..., x N } where x i is

More information

A Gaussian Mixture Model Spectral Representation for Speech Recognition

A Gaussian Mixture Model Spectral Representation for Speech Recognition A Gaussian Mixture Model Spectral Representation for Speech Recognition Matthew Nicholas Stuttle Hughes Hall and Cambridge University Engineering Department PSfrag replacements July 2003 Dissertation submitted

More information

Gene regulation. DNA is merely the blueprint Shared spatially (among all tissues) and temporally But cells manage to differentiate

Gene regulation. DNA is merely the blueprint Shared spatially (among all tissues) and temporally But cells manage to differentiate Gene regulation DNA is merely the blueprint Shared spatially (among all tissues) and temporally But cells manage to differentiate Especially but not only during developmental stage And cells respond to

More information

Audio-visual interaction in sparse representation features for noise robust audio-visual speech recognition

Audio-visual interaction in sparse representation features for noise robust audio-visual speech recognition ISCA Archive http://www.isca-speech.org/archive Auditory-Visual Speech Processing (AVSP) 2013 Annecy, France August 29 - September 1, 2013 Audio-visual interaction in sparse representation features for

More information

Comparative Evaluation of Feature Normalization Techniques for Speaker Verification

Comparative Evaluation of Feature Normalization Techniques for Speaker Verification Comparative Evaluation of Feature Normalization Techniques for Speaker Verification Md Jahangir Alam 1,2, Pierre Ouellet 1, Patrick Kenny 1, Douglas O Shaughnessy 2, 1 CRIM, Montreal, Canada {Janagir.Alam,

More information

Conditional Random Fields : Theory and Application

Conditional Random Fields : Theory and Application Conditional Random Fields : Theory and Application Matt Seigel (mss46@cam.ac.uk) 3 June 2010 Cambridge University Engineering Department Outline The Sequence Classification Problem Linear Chain CRFs CRF

More information

A General Greedy Approximation Algorithm with Applications

A General Greedy Approximation Algorithm with Applications A General Greedy Approximation Algorithm with Applications Tong Zhang IBM T.J. Watson Research Center Yorktown Heights, NY 10598 tzhang@watson.ibm.com Abstract Greedy approximation algorithms have been

More information

CS839: Probabilistic Graphical Models. Lecture 10: Learning with Partially Observed Data. Theo Rekatsinas

CS839: Probabilistic Graphical Models. Lecture 10: Learning with Partially Observed Data. Theo Rekatsinas CS839: Probabilistic Graphical Models Lecture 10: Learning with Partially Observed Data Theo Rekatsinas 1 Partially Observed GMs Speech recognition 2 Partially Observed GMs Evolution 3 Partially Observed

More information

Constrained Discriminative Training of N-gram Language Models

Constrained Discriminative Training of N-gram Language Models Constrained Discriminative Training of N-gram Language Models Ariya Rastrow #1, Abhinav Sethy 2, Bhuvana Ramabhadran 3 # Human Language Technology Center of Excellence, and Center for Language and Speech

More information

10-701/15-781, Fall 2006, Final

10-701/15-781, Fall 2006, Final -7/-78, Fall 6, Final Dec, :pm-8:pm There are 9 questions in this exam ( pages including this cover sheet). If you need more room to work out your answer to a question, use the back of the page and clearly

More information

HMM-Based Handwritten Amharic Word Recognition with Feature Concatenation

HMM-Based Handwritten Amharic Word Recognition with Feature Concatenation 009 10th International Conference on Document Analysis and Recognition HMM-Based Handwritten Amharic Word Recognition with Feature Concatenation Yaregal Assabie and Josef Bigun School of Information Science,

More information

Shared Kernel Models for Class Conditional Density Estimation

Shared Kernel Models for Class Conditional Density Estimation IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 12, NO. 5, SEPTEMBER 2001 987 Shared Kernel Models for Class Conditional Density Estimation Michalis K. Titsias and Aristidis C. Likas, Member, IEEE Abstract

More information

IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 10, NO. 6, NOVEMBER Inverting Feedforward Neural Networks Using Linear and Nonlinear Programming

IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 10, NO. 6, NOVEMBER Inverting Feedforward Neural Networks Using Linear and Nonlinear Programming IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 10, NO. 6, NOVEMBER 1999 1271 Inverting Feedforward Neural Networks Using Linear and Nonlinear Programming Bao-Liang Lu, Member, IEEE, Hajime Kita, and Yoshikazu

More information

CS 229 Midterm Review

CS 229 Midterm Review CS 229 Midterm Review Course Staff Fall 2018 11/2/2018 Outline Today: SVMs Kernels Tree Ensembles EM Algorithm / Mixture Models [ Focus on building intuition, less so on solving specific problems. Ask

More information

I How does the formulation (5) serve the purpose of the composite parameterization

I How does the formulation (5) serve the purpose of the composite parameterization Supplemental Material to Identifying Alzheimer s Disease-Related Brain Regions from Multi-Modality Neuroimaging Data using Sparse Composite Linear Discrimination Analysis I How does the formulation (5)

More information

Discriminative Training of Decoding Graphs for Large Vocabulary Continuous Speech Recognition

Discriminative Training of Decoding Graphs for Large Vocabulary Continuous Speech Recognition Discriminative Training of Decoding Graphs for Large Vocabulary Continuous Speech Recognition by Hong-Kwang Jeff Kuo, Brian Kingsbury (IBM Research) and Geoffry Zweig (Microsoft Research) ICASSP 2007 Presented

More information

Transductive Phoneme Classification Using Local Scaling And Confidence

Transductive Phoneme Classification Using Local Scaling And Confidence 202 IEEE 27-th Convention of Electrical and Electronics Engineers in Israel Transductive Phoneme Classification Using Local Scaling And Confidence Matan Orbach Dept. of Electrical Engineering Technion

More information

Speaker Diarization System Based on GMM and BIC

Speaker Diarization System Based on GMM and BIC Speaer Diarization System Based on GMM and BIC Tantan Liu 1, Xiaoxing Liu 1, Yonghong Yan 1 1 ThinIT Speech Lab, Institute of Acoustics, Chinese Academy of Sciences Beijing 100080 {tliu, xliu,yyan}@hccl.ioa.ac.cn

More information

LOW-DENSITY PARITY-CHECK (LDPC) codes [1] can

LOW-DENSITY PARITY-CHECK (LDPC) codes [1] can 208 IEEE TRANSACTIONS ON MAGNETICS, VOL 42, NO 2, FEBRUARY 2006 Structured LDPC Codes for High-Density Recording: Large Girth and Low Error Floor J Lu and J M F Moura Department of Electrical and Computer

More information

A long, deep and wide artificial neural net for robust speech recognition in unknown noise

A long, deep and wide artificial neural net for robust speech recognition in unknown noise A long, deep and wide artificial neural net for robust speech recognition in unknown noise Feipeng Li, Phani S. Nidadavolu, and Hynek Hermansky Center for Language and Speech Processing Johns Hopkins University,

More information

ARELAY network consists of a pair of source and destination

ARELAY network consists of a pair of source and destination 158 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL 55, NO 1, JANUARY 2009 Parity Forwarding for Multiple-Relay Networks Peyman Razaghi, Student Member, IEEE, Wei Yu, Senior Member, IEEE Abstract This paper

More information

An Improved Measurement Placement Algorithm for Network Observability

An Improved Measurement Placement Algorithm for Network Observability IEEE TRANSACTIONS ON POWER SYSTEMS, VOL. 16, NO. 4, NOVEMBER 2001 819 An Improved Measurement Placement Algorithm for Network Observability Bei Gou and Ali Abur, Senior Member, IEEE Abstract This paper

More information

Theoretical Concepts of Machine Learning

Theoretical Concepts of Machine Learning Theoretical Concepts of Machine Learning Part 2 Institute of Bioinformatics Johannes Kepler University, Linz, Austria Outline 1 Introduction 2 Generalization Error 3 Maximum Likelihood 4 Noise Models 5

More information

SPEECH FEATURE EXTRACTION USING WEIGHTED HIGHER-ORDER LOCAL AUTO-CORRELATION

SPEECH FEATURE EXTRACTION USING WEIGHTED HIGHER-ORDER LOCAL AUTO-CORRELATION Far East Journal of Electronics and Communications Volume 3, Number 2, 2009, Pages 125-140 Published Online: September 14, 2009 This paper is available online at http://www.pphmj.com 2009 Pushpa Publishing

More information

Analysis of Functional MRI Timeseries Data Using Signal Processing Techniques

Analysis of Functional MRI Timeseries Data Using Signal Processing Techniques Analysis of Functional MRI Timeseries Data Using Signal Processing Techniques Sea Chen Department of Biomedical Engineering Advisors: Dr. Charles A. Bouman and Dr. Mark J. Lowe S. Chen Final Exam October

More information

Introduction to The HTK Toolkit

Introduction to The HTK Toolkit Introduction to The HTK Toolkit Hsin-min Wang Reference: - The HTK Book Outline An Overview of HTK HTK Processing Stages Data Preparation Tools Training Tools Testing Tools Analysis Tools A Tutorial Example

More information

Efficient Tuning of SVM Hyperparameters Using Radius/Margin Bound and Iterative Algorithms

Efficient Tuning of SVM Hyperparameters Using Radius/Margin Bound and Iterative Algorithms IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 13, NO. 5, SEPTEMBER 2002 1225 Efficient Tuning of SVM Hyperparameters Using Radius/Margin Bound and Iterative Algorithms S. Sathiya Keerthi Abstract This paper

More information

Chapter 3. Speech segmentation. 3.1 Preprocessing

Chapter 3. Speech segmentation. 3.1 Preprocessing , as done in this dissertation, refers to the process of determining the boundaries between phonemes in the speech signal. No higher-level lexical information is used to accomplish this. This chapter presents

More information

SVD-based Universal DNN Modeling for Multiple Scenarios

SVD-based Universal DNN Modeling for Multiple Scenarios SVD-based Universal DNN Modeling for Multiple Scenarios Changliang Liu 1, Jinyu Li 2, Yifan Gong 2 1 Microsoft Search echnology Center Asia, Beijing, China 2 Microsoft Corporation, One Microsoft Way, Redmond,

More information

HIERARCHICAL LARGE-MARGIN GAUSSIAN MIXTURE MODELS FOR PHONETIC CLASSIFICATION. Hung-An Chang and James R. Glass

HIERARCHICAL LARGE-MARGIN GAUSSIAN MIXTURE MODELS FOR PHONETIC CLASSIFICATION. Hung-An Chang and James R. Glass HIERARCHICAL LARGE-MARGIN GAUSSIAN MIXTURE MODELS FOR PHONETIC CLASSIFICATION Hung-An Chang and James R. Glass MIT Computer Science and Artificial Intelligence Laboratory Cambridge, Massachusetts, 02139,

More information

The Method of User s Identification Using the Fusion of Wavelet Transform and Hidden Markov Models

The Method of User s Identification Using the Fusion of Wavelet Transform and Hidden Markov Models The Method of User s Identification Using the Fusion of Wavelet Transform and Hidden Markov Models Janusz Bobulski Czȩstochowa University of Technology, Institute of Computer and Information Sciences,

More information

Package HMMCont. February 19, 2015

Package HMMCont. February 19, 2015 Type Package Package HMMCont February 19, 2015 Title Hidden Markov Model for Continuous Observations Processes Version 1.0 Date 2014-02-11 Author Maintainer The package includes

More information

A Model Selection Criterion for Classification: Application to HMM Topology Optimization

A Model Selection Criterion for Classification: Application to HMM Topology Optimization A Model Selection Criterion for Classification Application to HMM Topology Optimization Alain Biem IBM T. J. Watson Research Center P.O Box 218, Yorktown Heights, NY 10549, USA biem@us.ibm.com Abstract

More information

Robust Shape Retrieval Using Maximum Likelihood Theory

Robust Shape Retrieval Using Maximum Likelihood Theory Robust Shape Retrieval Using Maximum Likelihood Theory Naif Alajlan 1, Paul Fieguth 2, and Mohamed Kamel 1 1 PAMI Lab, E & CE Dept., UW, Waterloo, ON, N2L 3G1, Canada. naif, mkamel@pami.uwaterloo.ca 2

More information

Intelligent Hands Free Speech based SMS System on Android

Intelligent Hands Free Speech based SMS System on Android Intelligent Hands Free Speech based SMS System on Android Gulbakshee Dharmale 1, Dr. Vilas Thakare 3, Dr. Dipti D. Patil 2 1,3 Computer Science Dept., SGB Amravati University, Amravati, INDIA. 2 Computer

More information

Image Denoising AGAIN!?

Image Denoising AGAIN!? 1 Image Denoising AGAIN!? 2 A Typical Imaging Pipeline 2 Sources of Noise (1) Shot Noise - Result of random photon arrival - Poisson distributed - Serious in low-light condition - Not so bad under good

More information

Short-time Viterbi for online HMM decoding : evaluation on a real-time phone recognition task

Short-time Viterbi for online HMM decoding : evaluation on a real-time phone recognition task Short-time Viterbi for online HMM decoding : evaluation on a real-time phone recognition task Julien Bloit, Xavier Rodet To cite this version: Julien Bloit, Xavier Rodet. Short-time Viterbi for online

More information

JOINT INTENT DETECTION AND SLOT FILLING USING CONVOLUTIONAL NEURAL NETWORKS. Puyang Xu, Ruhi Sarikaya. Microsoft Corporation

JOINT INTENT DETECTION AND SLOT FILLING USING CONVOLUTIONAL NEURAL NETWORKS. Puyang Xu, Ruhi Sarikaya. Microsoft Corporation JOINT INTENT DETECTION AND SLOT FILLING USING CONVOLUTIONAL NEURAL NETWORKS Puyang Xu, Ruhi Sarikaya Microsoft Corporation ABSTRACT We describe a joint model for intent detection and slot filling based

More information

Complexity-Optimized Low-Density Parity-Check Codes

Complexity-Optimized Low-Density Parity-Check Codes Complexity-Optimized Low-Density Parity-Check Codes Masoud Ardakani Department of Electrical & Computer Engineering University of Alberta, ardakani@ece.ualberta.ca Benjamin Smith, Wei Yu, Frank R. Kschischang

More information

Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection

Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection Petr Somol 1,2, Jana Novovičová 1,2, and Pavel Pudil 2,1 1 Dept. of Pattern Recognition, Institute of Information Theory and

More information

ModelStructureSelection&TrainingAlgorithmsfor an HMMGesture Recognition System

ModelStructureSelection&TrainingAlgorithmsfor an HMMGesture Recognition System ModelStructureSelection&TrainingAlgorithmsfor an HMMGesture Recognition System Nianjun Liu, Brian C. Lovell, Peter J. Kootsookos, and Richard I.A. Davis Intelligent Real-Time Imaging and Sensing (IRIS)

More information

Linear Methods for Regression and Shrinkage Methods

Linear Methods for Regression and Shrinkage Methods Linear Methods for Regression and Shrinkage Methods Reference: The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, J. Friedman, Springer 1 Linear Regression Models Least Squares Input vectors

More information

Speaker Verification with Adaptive Spectral Subband Centroids

Speaker Verification with Adaptive Spectral Subband Centroids Speaker Verification with Adaptive Spectral Subband Centroids Tomi Kinnunen 1, Bingjun Zhang 2, Jia Zhu 2, and Ye Wang 2 1 Speech and Dialogue Processing Lab Institution for Infocomm Research (I 2 R) 21

More information

Machine Learning. Topic 5: Linear Discriminants. Bryan Pardo, EECS 349 Machine Learning, 2013

Machine Learning. Topic 5: Linear Discriminants. Bryan Pardo, EECS 349 Machine Learning, 2013 Machine Learning Topic 5: Linear Discriminants Bryan Pardo, EECS 349 Machine Learning, 2013 Thanks to Mark Cartwright for his extensive contributions to these slides Thanks to Alpaydin, Bishop, and Duda/Hart/Stork

More information

International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research)

International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational

More information

Software/Hardware Co-Design of HMM Based Isolated Digit Recognition System

Software/Hardware Co-Design of HMM Based Isolated Digit Recognition System 154 JOURNAL OF COMPUTERS, VOL. 4, NO. 2, FEBRUARY 2009 Software/Hardware Co-Design of HMM Based Isolated Digit Recognition System V. Amudha, B.Venkataramani, R. Vinoth kumar and S. Ravishankar Department

More information

SOUND EVENT DETECTION AND CONTEXT RECOGNITION 1 INTRODUCTION. Toni Heittola 1, Annamaria Mesaros 1, Tuomas Virtanen 1, Antti Eronen 2

SOUND EVENT DETECTION AND CONTEXT RECOGNITION 1 INTRODUCTION. Toni Heittola 1, Annamaria Mesaros 1, Tuomas Virtanen 1, Antti Eronen 2 Toni Heittola 1, Annamaria Mesaros 1, Tuomas Virtanen 1, Antti Eronen 2 1 Department of Signal Processing, Tampere University of Technology Korkeakoulunkatu 1, 33720, Tampere, Finland toni.heittola@tut.fi,

More information

Adaptive Filtering using Steepest Descent and LMS Algorithm

Adaptive Filtering using Steepest Descent and LMS Algorithm IJSTE - International Journal of Science Technology & Engineering Volume 2 Issue 4 October 2015 ISSN (online): 2349-784X Adaptive Filtering using Steepest Descent and LMS Algorithm Akash Sawant Mukesh

More information

Treba: Efficient Numerically Stable EM for PFA

Treba: Efficient Numerically Stable EM for PFA JMLR: Workshop and Conference Proceedings 21:249 253, 2012 The 11th ICGI Treba: Efficient Numerically Stable EM for PFA Mans Hulden Ikerbasque (Basque Science Foundation) mhulden@email.arizona.edu Abstract

More information

The HTK Book. Steve Young Gunnar Evermann Dan Kershaw Gareth Moore Julian Odell Dave Ollason Dan Povey Valtcho Valtchev Phil Woodland

The HTK Book. Steve Young Gunnar Evermann Dan Kershaw Gareth Moore Julian Odell Dave Ollason Dan Povey Valtcho Valtchev Phil Woodland The HTK Book Steve Young Gunnar Evermann Dan Kershaw Gareth Moore Julian Odell Dave Ollason Dan Povey Valtcho Valtchev Phil Woodland The HTK Book (for HTK Version 3.2) c COPYRIGHT 1995-1999 Microsoft Corporation.

More information

Semi-blind Block Channel Estimation and Signal Detection Using Hidden Markov Models

Semi-blind Block Channel Estimation and Signal Detection Using Hidden Markov Models Semi-blind Block Channel Estimation and Signal Detection Using Hidden Markov Models Pei Chen and Hisashi Kobayashi Department of Electrical Engineering Princeton University Princeton, New Jersey 08544,

More information

Probabilistic Facial Feature Extraction Using Joint Distribution of Location and Texture Information

Probabilistic Facial Feature Extraction Using Joint Distribution of Location and Texture Information Probabilistic Facial Feature Extraction Using Joint Distribution of Location and Texture Information Mustafa Berkay Yilmaz, Hakan Erdogan, Mustafa Unel Sabanci University, Faculty of Engineering and Natural

More information

Speech User Interface for Information Retrieval

Speech User Interface for Information Retrieval Speech User Interface for Information Retrieval Urmila Shrawankar Dept. of Information Technology Govt. Polytechnic Institute, Nagpur Sadar, Nagpur 440001 (INDIA) urmilas@rediffmail.com Cell : +919422803996

More information