Weighted Finite Automatas using Spectral Methods for Computer Vision


Weighted Finite Automatas using Spectral Methods for Computer Vision

A Thesis Presented by
Zulqarnain Qayyum Khan
to
The Department of Electrical and Computer Engineering
in partial fulfillment of the requirements for the degree of
Master of Science in Electrical and Computer Engineering

Northeastern University
Boston, Massachusetts
April 2016

To Abbu Jaan, wish you were still here!

Contents

List of Figures
List of Tables
Acknowledgments
Abstract of the Thesis

1 Introduction
  1.1 Background
  1.2 Problem Statement
  1.3 Related Work
  1.4 Overview
2 Weighted Finite Automatas
  2.1 Introduction
  2.2 Definition
  2.3 Transformations
  2.4 WFA Hankels
  2.5 Spectral Learning
    Empirical Hankel
    Recovering WFA
3 Pre-Processing
  3.1 Introduction
  3.2 Posebits
    Posebit Selection
  3.3 Clusters of Velocities and Acceleration
  3.4 Hankel Matrices
    Clustering Gram Matrices
4 Synthetic Experiments
  4.1 WFA Generation
  4.2 String Generation from WFAs
  4.3 Evaluation Functions, Empirical Hankels, and Spectral Learning
    Evaluation Functions
    Empirical Hankels
    Spectral Learning
  4.4 Experiments to Evaluate Estimated WFAs
    Frobenius Norm
    Perplexity
    K-L Divergence
    Word Prediction Error Rate
5 Experiments
  5.1 Experimental Setup
  5.2 MHAD Dataset
  5.3 MSR3D Dataset
  5.4 Composable Activities Dataset
  5.5 HDM05 Dataset
  5.6 UTKinect Dataset
  5.7 Some Other Experiments
    Experiments on PbDb Dataset
    Experiments with Hankels
6 Conclusion and Future Work

Bibliography

List of Figures

1. Graphical representation of a WFA
2. The general flow sheet of the learning algorithm
3. Examples of posebits, and some poses conditioned on different posebits
4.1. Different relationships in body parts
4.2. Posebit binary tree
5. Description of the structure of a Hankel matrix
6.1. Snapshots from the MHAD Dataset
6.2. Snapshots of the throwing action from the MHAD Dataset
7.1. Confusion matrix for the MHAD Dataset with s=90%
7.2. Confusion matrix for the MHAD Dataset with s=60%
7.3. Confusion matrix for the MHAD Dataset with s=99%
8.1. Confusion matrix for the MSR3D Dataset with s=90%
8.2. Confusion matrix for the MSR3D Dataset with s=95%
8.3. Confusion matrix for the MSR3D Dataset with s=75%
9.1. Snapshots from the Composable Activities Dataset
9.2. Confusion matrix for the Composable Activities Dataset with s=95%
9.3. Confusion matrix for the Composable Activities Dataset with s=99%
9.4. Confusion matrix for the Composable Activities Dataset with s=75%
9.5. Confusion matrix for the Composable Activities Dataset from [38]
10.1. Confusion matrix for the HDM05 Dataset with s=94%
10.2. Confusion matrix for the HDM05 Dataset with s=99%
10.3. Confusion matrix for the HDM05 Dataset with s=85%
10.4. Confusion matrix for the HDM05 Dataset following the protocol of [2]
11.1. Snapshots from the UTKinect Dataset
11.2. Confusion matrix for the UTKinect Dataset with s=95%
11.3. Confusion matrix for the UTKinect Dataset with s=99%
11.4. Confusion matrix for the UTKinect Dataset with s=75%
12.1. Scores when the WFA is trained on walk
12.2. Scores when the WFA is trained on jogging
12.3. Scores when the WFA is trained on boxing

List of Tables

1. Perplexity comparison for estimated WFAs
2. KLD comparison for estimated WFAs
3. Comparison of accuracies with other methods on the MHAD Dataset
4. Comparison of accuracies with other methods on the UTKinect Dataset

Acknowledgments

Here I wish to thank everyone who has supported me during the thesis work, especially Prof. Camps for advising and supervising my work, and Prof. Sznaier and Prof. Dy for agreeing to be on my thesis committee. I would also like to thank my lab fellows, especially Caglayan and Xikang, for guiding and helping me. Last but not least, I'd like to acknowledge the support of my family back home and my support base here in Boston, my Minions.

Abstract of the Thesis

Weighted Finite Automatas using Spectral Methods for Computer Vision

By Zulqarnain Qayyum Khan
Master of Science in Electrical and Computer Engineering
Northeastern University, April 2016
Dr. Octavia I. Camps, Adviser

There are many possible ways to model the machine that generates a set of sequences; Weighted Finite Automatas (WFAs) have been demonstrated to be a powerful tool in this regard by the Natural Language Processing community. Spectral techniques for recovering WFAs from empirically constructed Hankel matrices have also been shown to work very well, with theoretical backing, making the task of recovering the underlying machine very much possible. Our focus here is an attempt to port WFAs and the associated spectral recovery techniques to the field of Computer Vision, implementing every technique from scratch to gain a more in-depth understanding. More specifically, we look at activity videos (simple and complex) as string sequences, where the goal is to recover the underlying machines that generate similar activities. Different features are used to convert the videos into strings, and spectral methods are then applied to demonstrate the viability of WFAs in tasks such as action classification on multiple datasets. The results are encouraging but indicate that further refinement of the approach, and more data, are needed.

Chapter 1

Introduction

1.1. Background

Recognizing, classifying, or segmenting sequences plays a major role in any field that deals with pattern recognition, be it text-based Natural Language Processing or image-based Computer Vision. There are multiple ways to identify sequences. One possible way is to try to differentiate between sequences based on appearance, motion, or other features. Another way is the approach explored in this thesis: instead of directly comparing sequences, assume that the underlying system that generates those sequences can be modelled. For example, [1] and [2] take this approach by attempting to identify and compare the dynamical systems that generate activities, and then using different metrics for the task of activity recognition. Other possible approaches can be broadly classified as generative, such as HMM-based modelling that has been around since as early as [3] up to more recent approaches such as those used by [4], versus discriminative models, which have been in more use recently, such as SVMs and Artificial Neural Networks (ANNs) as utilized by [53][23][30][54]. Keeping these in mind, along with the work done by Borja Balle et al. [5] in the Natural Language Processing community, the intention is to introduce another generative model, namely Weighted Finite Automatas, to the Computer Vision community.

1.2. Problem Statement

We start with a from-scratch implementation of [5], to develop a more in-depth understanding of the workings of Weighted Finite Automatas, and also to make it easier to adapt them to tasks more specific to us. The next step is to test the implementation on synthetic data; for this we'll need to implement a synthetic WFA generator, as well as a generator that can mimic producing strings from WFAs. After testing the discriminative ability of the WFAs on synthetic examples and verifying the implementation, we move on to applying the WFAs and the spectral techniques associated with them to the Computer Vision tasks of activity recognition and action segmentation. The goal is to demonstrate the usability of WFAs in the community and provide this as a tool. To use WFAs, the activity recognition videos need to be pre-processed in different ways, which is also tackled, with a related issue being what kind of videos to use. For now we deal with videos that provide skeletal joint locations.

1.3. Related Work

The body of work related to this thesis can be broadly divided into two subsections, which are touched upon separately below:

Weighted Finite Automatas: To a large extent this is the main focus of the thesis, implementing and following the lead of [5], who in turn are motivated by more detailed work on automata, like [6] on spectral learning and Quadratic Weighted Automata; fundamental work on automata and the theorems that form the backbone of this work can be found in [7].

Activity Recognition: The problem of activity recognition is one of the most intuitive and commonly tackled problems in Computer Vision; despite that, it also remains one of the most complicated ones. This interest and complexity has spawned a number of ways to attack the problem. The approaches vary inherently as well as based on the kind of data they deal with, some being more efficient in tackling data that has skeletal joint information, some dealing with dynamics, and yet others motivated more by appearance-based features. The list of work in the area is exhaustive, and for brevity we'll just point to approaches that differ from each other to give the reader an idea of the work being done.

Recent work includes approaches based on grammars, such as those using segmental grammars to parse videos, for example [8], which uses a latent structural SVM to train grammar parameters, learning the hidden sub-actions in the process; other similar approaches make use of Context Free Grammars (CFGs), such as [9][10][11][12]. This way of looking at actions as a set of sub-actions is very natural and intuitive and hence is oft-utilized, for example by the likes of [13][14], which used decomposable motion segments and learned temporal structures for the task. Yet another way is to make use of spatio-temporal features such as optical flow [15][16][17] and Bags of Features [18]. Longer video sequences containing multiple activities tend to be dealt with by probabilistic models such as Hidden Markov Models (HMMs) [4], including earlier Finite State Machines [19][20], up to more recent models such as Conditional Random Fields (CRFs) [18]. Further variability in the length of video sequences is tackled by approaches such as Hierarchical HMMs [21][22][23] or segmental HMMs [24][25][26][27]. A very different way of approaching the problem is to assume that there are underlying systems that generate a particular activity, and then to make use of Hankelets and dynamical distance metrics to identify those systems indirectly, as done by [1] and [2].

1.4. Overview

The thesis is divided into chapters dealing individually with the different steps and methods involved. Chapter 2 is an in-depth discussion and explanation of Weighted Finite Automatas, their implementation, their generation, as well as the spectral techniques used to recover them. Chapter 3 deals with the pre-processing step, that is, how to convert available videos into strings that can be processed by the WFAs. Chapter 4 is an explanation of our implementation of the whole process and the synthetic experiments done to establish confidence in the method moving forward. Finally, Chapter 5 provides results on multiple real-world datasets with skeletal joint information. This is followed by a conclusion of the whole work and a brief discussion of what the future holds in this direction.

Chapter 2

Weighted Finite Automatas

2.1. Introduction

Weighted Finite Automatas (WFAs), also referred to as Observable Operator Models (OOMs) [28], are a generalization of HMMs [29][28]. WFAs can be viewed as a more expressive form of HMM, with the advantage that this expressiveness does not come at the cost of increased complexity in learning; in fact, as [28] points out, they are often easier to learn. WFAs, like HMMs, are inherently random models and hence are best suited to model systems that are intrinsically random themselves. Moreover, WFAs can be probabilistic as well as non-probabilistic. Keeping this in mind, we now move on to a formal definition of WFAs.

2.2. Definition

From an application point of view, WFAs are functions that map strings to real numbers. More formally, as defined in [5], a WFA W with n states over a set of symbols \Sigma can be completely defined by the tuple W = \langle \alpha_1, \alpha_\infty, \{A_\sigma\} \rangle, where

\alpha_1 \in R^n is the initial state probability vector,
\alpha_\infty \in R^n is the termination probability vector, and
A_\sigma \in R^{n \times n} are the transition probability matrices, one for each symbol \sigma \in \Sigma.

Figure 1. (a) Graphical representation of a WFA with 2 states (n = 2) and \Sigma = \{a, b\}; (b) operator (matrix) representation of the same WFA.

Given this form of WFA and a string x, it can be used to model the probability (or score) of the given string being generated from the WFA W, as follows:

f(x) = \alpha_1^T A_x \alpha_\infty    (1)

where A_x is the product of the matrices associated with the symbols in the string x. For example, given the WFA of Figure 1 and the string x = aba, we have

f(x) = f(aba) = \alpha_1^T A_a A_b A_a \alpha_\infty    (2)

Since this is not a probabilistic WFA, a higher score indicates a higher likelihood of the string coming from this WFA.

2.3. Transformations

Another useful characteristic of a WFA is its ability to model different scoring functions through equivalent transformations. Two transformations, pointed out both by [5] and [28], make it easier to manipulate and use WFAs. Given the WFA W as defined in the previous section, it is possible to transform it into two equivalent WFAs W_s and W_p using the following transformations:

Transformation 1: Given a WFA W = \langle \alpha_1, \alpha_\infty, \{A_\sigma\} \rangle, it can be transformed into W_s = \langle \alpha_{1,s}, \alpha_{\infty,s}, \{A_\sigma\} \rangle, where, with X = \sum_\sigma A_\sigma,

\alpha_{1,s}^T = \alpha_1^T (I - X)^{-1}, \qquad \alpha_{\infty,s} = (I - X)^{-1} \alpha_\infty

Given this representation we can evaluate another scoring function

f_s(x) = E[|w|_x]    (3)

that is, the expected number of times the string x appears as a substring of a string w, which is now given by the equation

f_s(x) = E[|w|_x] = \alpha_{1,s}^T A_x \alpha_{\infty,s}    (4)

This is a critical transformation and one that we will make use of often in this work.
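To make (1) and Transformation 1 concrete, the following is a minimal Python/NumPy sketch (not the thesis's own implementation): it evaluates f(x) by multiplying the operator matrices of a toy 2-state WFA, and builds the substring-transformed vectors of (4) via (I - X)^{-1}. The numerical values of the toy WFA are made up purely for illustration.

import numpy as np

def wfa_score(alpha1, alpha_inf, A, x):
    # f(x) = alpha1^T A_{x_1} ... A_{x_k} alpha_inf   (Eq. 1)
    v = alpha1.copy()
    for sym in x:
        v = v @ A[sym]
    return float(v @ alpha_inf)

def substring_transform(alpha1, alpha_inf, A):
    # Transformation 1: alpha_{1,s}^T = alpha1^T (I - X)^{-1}, alpha_{inf,s} = (I - X)^{-1} alpha_inf
    n = len(alpha1)
    X = sum(A.values())
    M = np.linalg.inv(np.eye(n) - X)
    return alpha1 @ M, M @ alpha_inf

# toy 2-state WFA over {a, b}; the numbers are invented but satisfy the
# probabilistic constraints used later in Chapter 4
alpha1 = np.array([1.0, 0.0])
alpha_inf = np.array([0.2, 0.3])
A = {'a': np.array([[0.3, 0.2], [0.1, 0.2]]),
     'b': np.array([[0.1, 0.2], [0.2, 0.2]])}
print(wfa_score(alpha1, alpha_inf, A, "aba"))          # f(aba) as in Eq. (2)
a1s, ainfs = substring_transform(alpha1, alpha_inf, A)
print(wfa_score(a1s, ainfs, A, "ab"))                  # f_s(ab) = E[|w|_ab]  (Eq. 4)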

Transformation 2: A third representation, W_p, can be obtained by applying the transformation above only to the final probability vector, the transformed WFA being defined by the tuple W_p = \langle \alpha_1, \alpha_{\infty,s}, \{A_\sigma\} \rangle. This realizes another scoring function giving the score/probability of x being a prefix in the sample space of strings \Sigma^*, i.e.

f_p(x) = P(x \Sigma^*) = \alpha_1^T A_x \alpha_{\infty,s}    (5)

We did not find any useful application of this transformation, so it is mentioned here just for completeness. The proofs of both transformations are discussed in detail in [5].

2.4. WFA Hankels

Now we introduce another important building block in this discussion, i.e. creating Hankel matrices from which WFAs can be recovered. The idea is to construct a large matrix H_f \in R^{P \times S} such that H_f(p, s) = f(p\,s), where p \in P and s \in S, P and S being the sets of all possible prefixes and suffixes, respectively. Defined this way, the Hankel matrix is theoretically of infinite size and hence impossible to work with. To circumvent this problem, a basis is defined by restricting the set of prefixes and suffixes beforehand, so that the Hankel can be created while remaining finite. The choice of basis depends on the problem at hand, the important part being that the values of this Hankel correspond to the scores obtained from an underlying WFA.

Example: Let us assume we have a set of sample strings

X = { aa, b, bab, a, b, a, ab, aa, ba, b, aa, a, aa, bab, b, aa }

If we want to create a Hankel matrix that realizes the substring expectations given by (4), we can define the basis P = \{\epsilon, a, b, ba\} and S = \{a, b\}, and empirically fill in the Hankel matrix with these expectations:

\hat{H}_S(p, s) = \frac{1}{N} \sum_{i=1}^{N} |x^i|_{p s}    (6)

where |x^i|_{ps} is the number of times p\,s occurs as a substring of the i-th training string. This gives an empirical Hankel H_s of size 4 x 2, with rows indexed by the prefixes \{\epsilon, a, b, ba\}, columns indexed by the suffixes \{a, b\}, and each entry being the corresponding empirical substring expectation from (6).
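As an illustration of (6), a small sketch that builds the empirical Hankel for the sample set X above is given below. It assumes the substring-expectation reading of (6): each entry is the average number of (possibly overlapping) occurrences of p\,s in a training string.

import numpy as np

def count_sub(w, x):
    # number of (possibly overlapping) occurrences of x as a substring of w
    if x == "":
        return len(w) + 1
    return sum(1 for i in range(len(w) - len(x) + 1) if w[i:i + len(x)] == x)

def empirical_hankel(samples, P, S):
    # hat{H}_S(p, s) = (1/N) sum_i |x^i|_{p s}   (Eq. 6)
    H = np.zeros((len(P), len(S)))
    for i, p in enumerate(P):
        for j, s in enumerate(S):
            H[i, j] = np.mean([count_sub(w, p + s) for w in samples])
    return H

X = ["aa", "b", "bab", "a", "b", "a", "ab", "aa", "ba", "b",
     "aa", "a", "aa", "bab", "b", "aa"]
print(empirical_hankel(X, ["", "a", "b", "ba"], ["a", "b"]))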

In our case we will generally define the Hankel matrix with an equal set of prefix and suffix basis elements, since it is easier to deal with and much more intuitive.

2.5. Spectral Learning

Figure 2. The general flow sheet of the learning algorithm: the training data is used to create the empirical Hankel matrix, which is then factorized to recover the underlying WFA.

The spectral learning of a WFA from data can be divided into two parts:

Empirical Hankel

Now we get to a very integral part of the method, that is, the spectral learning of the underlying WFA responsible for generating a set of sequences. Let X be an available training set of N strings; also assume the strings consist of the symbols a and b appearing in differing orders. The first step in learning the underlying WFA from this sample set is the creation of the empirical Hankel matrix. The critical property of this Hankel matrix, stated in Theorem 1 below, is that its rank gives the number of states of the WFA; the theorem of course holds for the theoretical case of the infinite matrix.

Theorem 1: [30][31]
1. If f = f_A for some WFA A with n states, then rank(H_f) \le n.
2. If rank(H_f) = n, then there exists a WFA A with n states such that f = f_A.

This is an important theorem in the context of the work; however, working with infinite matrices is not possible in practice, and hence, as pointed out earlier, we need to define a set of basis in advance. The big Hankel H is a concatenation of empirically constructed Hankels, one for the empty symbol and one for each symbol of the alphabet, i.e.

H = [\, H_\epsilon \;\; H_a \;\; H_b \,]

where each of the sub-Hankels is of dimension R^{P \times S}, if P and S are the numbers of prefix and suffix basis elements. Two more Hankel vectors are needed for the learning:

h_{P,\lambda} \in R^{P \times 1}, \qquad h_{\lambda,S} \in R^{1 \times S}

Moreover, each H_\sigma is a sub-block of the big H, where H_\sigma(p, s) = H(p\,\sigma\,s).

Example: Consider a set of sequences over the 2 symbols \{a, b\} and a basis of the form P = S = \{\epsilon, a, b, aa, ab, ba, bb\}. Then the matrices and vectors discussed above are 7 x 7 matrices and length-7 vectors whose entries are values of f_s on concatenations of basis elements:

H_\epsilon(p, s) = f_s(p\,s), \qquad H_a(p, s) = f_s(p\,a\,s), \qquad p \in P, \; s \in S

For instance, the first row of H_\epsilon is [\, f_s(\epsilon) \; f_s(a) \; f_s(b) \; f_s(aa) \; f_s(ab) \; f_s(ba) \; f_s(bb) \,] and the first row of H_a is [\, f_s(a) \; f_s(aa) \; f_s(ab) \; f_s(aaa) \; f_s(aab) \; f_s(aba) \; f_s(abb) \,].

H_b is formed analogously, with entries H_b(p, s) = f_s(p\,b\,s), and the two Hankel vectors collect the values of f_s on the basis itself:

h_{P,\lambda} = [\, f_s(\epsilon) \; f_s(a) \; f_s(b) \; f_s(aa) \; f_s(ab) \; f_s(ba) \; f_s(bb) \,]^T, \qquad h_{\lambda,S} = [\, f_s(\epsilon) \; f_s(a) \; f_s(b) \; f_s(aa) \; f_s(ab) \; f_s(ba) \; f_s(bb) \,]

Recovering WFA

Once the above Hankels have been learnt, the recovery part is fairly straightforward and involves taking an SVD and doing some matrix multiplications and (pseudo-)inversions. The step-by-step algorithm is as follows:

1. Given the Hankel matrices H_\epsilon and H_\sigma (one per symbol), and the vectors h_{P,\lambda} and h_{\lambda,S},
2. take a reduced SVD, H_\epsilon \approx U D V^T, truncated to the desired number of states n;
3. let X = U D and Y = V;
4. then

\alpha_{1,s}^T = h_{\lambda,S} Y, \qquad \alpha_{\infty,s} = X^{+} h_{P,\lambda}, \qquad A_\sigma = X^{+} H_\sigma Y

where X^{+} = D^{-1} U^T is the pseudo-inverse of X.

We have found substring counting to be more intuitive, and hence the empirical Hankels are created from substring expectation calculations; that is why the recovered WFA is defined by the tuple W_s = \langle \alpha_{1,s}, \alpha_{\infty,s}, \{A_\sigma\} \rangle and can be transformed into W = \langle \alpha_1, \alpha_\infty, \{A_\sigma\} \rangle using the transformations discussed in Section 2.3.
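A compact sketch of the recovery steps above, assuming NumPy and that the Hankel blocks have already been estimated (for example with the counting sketch of Section 2.4), could look as follows; the pseudo-inverse plays the role of the inversions in step 4.

import numpy as np

def spectral_wfa(H_eps, H_sigma, h_Pl, h_lS, n):
    # H_eps: P x S empty-symbol block; H_sigma: dict of P x S blocks, one per symbol;
    # h_Pl: length-P vector of f values on prefixes; h_lS: length-S vector on suffixes
    U, D, Vt = np.linalg.svd(H_eps)
    U, D, V = U[:, :n], D[:n], Vt[:n, :].T               # reduced SVD  H_eps ~ U D V^T
    X = U * D                                            # X = U D
    Xp = np.linalg.pinv(X)                               # X^+  (step 4 inversion)
    alpha1_s = h_lS @ V                                  # alpha_{1,s}^T = h_{lambda,S} V
    alphainf_s = Xp @ h_Pl                               # alpha_{inf,s} = X^+ h_{P,lambda}
    A = {s: Xp @ Hs @ V for s, Hs in H_sigma.items()}    # A_sigma = X^+ H_sigma V
    return alpha1_s, alphainf_s, A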

Chapter 3

Pre-Processing

3.1. Introduction

The WFAs, as implemented here, primarily deal with strings, while in our target tasks we are dealing with videos. So our data needs to be pre-processed in order for it to be ready for training the WFAs. The intention is to convert the available data into a set of representative alphabet sequences. Several fairly simple ways are explored to this end, the intention in most of them being to exploit dynamical rather than appearance-based information. This is also one of the reasons why we deal primarily with videos that have skeleton joint information available.

3.2. Posebits

One of the very first features we started off with is Posebits, as introduced in [32]. Posebits are a mid-level representation based on Boolean relationships between body parts, for example, "is the left arm in front of the right arm?" (a minimal illustration is sketched below). More examples are shown in Figure 3. The idea is to infer them directly from image features using a trained classifier. They are compositional by nature and hence are very flexible compared to plain action class labels. The dataset made available by [32], known as the Posebit Dataset (PbDb), is mainly made up of videos collected from 4 different datasets, some with available MoCap data while others are 2D images. From MoCap data there are 10,000 poses taken from Human-Eva [33] and HMODB [34], while for 2D images they use Fashion [35] and Parse [36].
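As a toy illustration of what a posebit is (not the actual bits, joint names, or thresholds of [32]), the sketch below computes a few Boolean relations from hypothetical 3D joint positions, assuming y is up and z points forward.

import numpy as np

def example_posebits(joints):
    # joints: dict mapping a (hypothetical) joint name to its 3D position
    bits = {}
    # relative position: is the left wrist in front of the right wrist?
    bits["lwrist_in_front_of_rwrist"] = bool(joints["lwrist"][2] > joints["rwrist"][2])
    # joint distance: are the two wrists close to each other? (threshold is arbitrary)
    bits["wrists_close"] = bool(np.linalg.norm(joints["lwrist"] - joints["rwrist"]) < 0.25)
    # relative height: is the right wrist raised above the head?
    bits["rwrist_above_head"] = bool(joints["rwrist"][1] > joints["head"][1])
    return bits

joints = {"lwrist": np.array([0.3, 1.0, 0.4]),
          "rwrist": np.array([-0.3, 1.6, 0.1]),
          "head":   np.array([0.0, 1.7, 0.0])}
print(example_posebits(joints))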

Figure 3. Examples of posebits, and some poses conditioned on different posebits [32].

Out of these we make use of the Human-Eva data, since it corresponds more closely to the task we are initially looking at, that is, action classification.

Posebit Selection

Figure 4.1. Different relationships between body parts that posebits intend to exploit: joint distances, relative positions, articulation angles.

Figure 4.2. Posebit binary tree: the poses in each leaf node are constrained by all posebits in a posebyte.

Not all posebits are created equal; [32] argue that it is important to select posebits based on the tasks you intend to perform with them. To this end they propose a simple selection mechanism, inspired by decision trees, to choose a subset of posebits from the available ones based on the task at hand. For example, for 3D pose estimation (and activity recognition) the aim is to choose a subset of posebits using the following two criteria:

- reliability of inference from image features, r
- how helpful they are in reducing uncertainty in the hidden variable, x

To select a subset S_m from the available posebit candidate pool S_c, [32] use a forward selection mechanism, choosing the posebits greedily, that is, one bit at a time, with the next posebit at step j selected to maximize a mixed information gain:

a_j^* = \arg\max_{a \in S_c} I_j, \qquad I_j = I_j^C + \lambda I_j^R    (7)

where

I_j    - mixed information gain at the j-th level of the tree
I_j^C  - clustering term
I_j^R  - reliability term
\lambda - balances the two terms

The clustering information gain is further defined in terms of entropies as

I_j^C = H_{j-1} - H_j    (8)

where H_j is the weighted sum of entropies over the nodes of the j-th level of the tree,

H_j = \sum_{c=1}^{2^j} \frac{|S_C^c|}{|S_C|} H(S_C^c)    (9)

S_C^c being the subset of poses at node c of the tree, S_C the larger set of MoCap poses, and H(\cdot) the differential entropy. The reliability measure is defined as

Q(x \mid r, m) = \sum_{a \in A} p(x \mid a)\, p(a \mid r_m)    (10)

p(x \mid a) and p(a \mid r) being the conditional pose and posterior posebyte distributions, respectively. For posebit classification a structural SVM model is used:

\hat{a}_j = \arg\max_{a \in A} F(r, a, w_j) = \arg\max_{a \in A} w_j^T \psi_j(a, r)    (11)

where \psi_j(a, r) is the joint feature map of input r and output a. The experiments we did combining posebits with WFAs are discussed in the experiments chapter.

3.3. Clusters of Velocities and Accelerations

The second approach we used utilizes the skeleton joint information available with datasets such as the Berkeley Multimodal Human Action Database (MHAD) [37], the Composable Activities Dataset [38], and the UTKinect Dataset [39], or extracted from larger datasets such as J-HMDB [40].

Given the skeleton joint positions, for example in 3D (x, y, z), we first center the skeleton around one of the joints (usually the hip joint); afterwards the mean over all frames is removed to center the sequence. A combination of these three simple techniques is then utilized:

1. Sub-sampling: In most cases the joints do not move much from one frame to the next, and using all frames can result in redundantly long sequences that capture little extra information. For this reason, instead of using each frame, an average skeleton is taken over K frames at a time:

F^{subsampled} = \frac{1}{K} \sum_{j=1}^{K} F_j    (12)

2. Velocities: Once these subsampled skeletons have been obtained, their first differences are taken to account for first-order motion:

v_j = F_j^{subsampled} - F_{j-1}^{subsampled}    (13)

3. Acceleration: The acceleration is represented by taking differences of these velocities:

a_j = v_j - v_{j-1}    (14)

Once these have been obtained, different combinations of them are used, and the next step is K-means clustering on the resulting feature vectors, with the number of clusters C serving as the number of characters in the WFA alphabet; we use Matlab's built-in kmeans++ algorithm [41]. A minimal sketch of this pipeline is given below.
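The sketch assumes Python/NumPy with scikit-learn's KMeans standing in for Matlab's kmeans++; the frame layout, window size K, and the particular feature combination are illustrative assumptions, not the exact settings used in the experiments.

import numpy as np
from sklearn.cluster import KMeans

def dyn_features(frames, K=5):
    # frames: T x (3*J) array of centered skeleton joint coordinates
    T = (len(frames) // K) * K
    sub = frames[:T].reshape(-1, K, frames.shape[1]).mean(axis=1)   # Eq. (12): K-frame averages
    vel = np.diff(sub, axis=0)                                      # Eq. (13): velocities
    acc = np.diff(vel, axis=0)                                      # Eq. (14): accelerations
    return np.hstack([vel[1:], acc])        # one possible combination of the features

def build_alphabet(train_seqs, C=10):
    # one K-means model fitted on all training features; cluster ids become the alphabet
    return KMeans(n_clusters=C, n_init=10).fit(np.vstack([dyn_features(f) for f in train_seqs]))

def to_string(frames, km):
    return "".join(chr(ord('a') + int(l)) for l in km.predict(dyn_features(frames)))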

3.4. Hankel Matrices

This is a more complicated approach than the ones discussed above, but it can potentially encode much more dynamical information. From control systems we know that a dynamic system can be defined by the following set of equations:

x_k = A x_{k-1}, \qquad y_k = C x_k + w_k    (15)

Dynamical systems play a pivotal role in recognition systems that emphasize dynamics rather than appearance; these include systems for recognizing gait, dynamic textures, activities, etc. The basic idea is gleaned from system identification methods, that is, the identification of the A and C matrices in (15) from training data. However, most of the time, and especially in computer vision, the identification of these matrices is not an easy task, as they are not unique and trying to recover them can lead to non-convex problem formulations. To work around this, [1] introduced the use of the special structure of Hankel matrices [42]; it is important to note that these Hankel matrices are different from the ones discussed in Chapter 2. To understand Hankelets (tracklets of Hankels), consider a tracklet from a video sequence with measurements t_k; the underlying dynamic sequence behind this tracklet can be modelled by a linear regressor [43]

t_k = \sum_{i=1}^{n} a_i t_{k-i}    (16)

This regressor can be modelled, in the absence of noise, as a Hankel matrix H_D (the subscript D differentiates it from the Hankel matrices discussed previously), such that rank(H_D) equals the order of the system,

H_D = \begin{bmatrix} t_1 & t_2 & \cdots & t_s \\ t_2 & t_3 & \cdots & t_{s+1} \\ \vdots & & & \vdots \\ t_r & t_{r+1} & \cdots & t_{r+s-1} \end{bmatrix}    (17)

Figure 5. The line represents a trajectory, with colored points representing observations; the matrix on the right shows how to create a Hankel matrix from these observations.

The important argument of [42] in favour of this Hankel matrix is that it captures the underlying dynamics of the system irrespective of the initial conditions; in other words, two Hankelets built from two trajectories output by the same underlying system will span the same linear subspace. They show this by factoring H_D into \Gamma X, where \Gamma is the observability matrix and X is the state matrix, that is,

\Gamma = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^m \end{bmatrix}, \qquad X = \begin{bmatrix} x_0 & x_1 & \cdots & x_m \end{bmatrix}    (18)
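For concreteness, a small sketch of building the block-Hankel matrix of (17) from a sequence of observations (each t_k a d-dimensional vector) follows; the number of block rows is left as a free parameter.

import numpy as np

def hankel_from_trajectory(obs, block_rows):
    # obs: sequence of observations t_1 ... t_s, each a d-dimensional vector;
    # returns the block-Hankel matrix of Eq. (17): column j stacks t_j ... t_{j+block_rows-1}
    obs = np.asarray(obs, dtype=float)
    cols = len(obs) - block_rows + 1
    return np.hstack([obs[j:j + block_rows].reshape(-1, 1) for j in range(cols)])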

These Hankel matrices can be formed from trajectories or from any other features, such as joint information. For our purposes we follow the lead of [2] and use Gram matrices of Hankels that encode the joint positions, that is, each observation t_i encodes the 3D locations of the joints in frame i:

t_i = [\, x_i, y_i, z_i, x_i, y_i, z_i, \ldots \,]^T    (19)

Given the Hankel matrix defined as in (17), the corresponding Gram matrix is given by

\hat{G} = \frac{H_D H_D^T}{\| H_D H_D^T \|_F}    (20)

Clustering Gram Matrices

The next step in the conversion to the grammar required by the WFA is the clustering of the Gram matrices defined by (20). For clustering, a distance-like metric needs to be defined to find the centers of the clusters. Since these matrices live on the Positive Semi-Definite (PSD) manifold, [2] mentions a number of metrics that can be used for the purpose, including the Affine Invariant Riemannian Metric (AIRM) [44], defined, for two Gram matrices X and Y, as

d_R(X, Y) = \| \log( X^{-1/2} Y X^{-1/2} ) \|_F    (21)

The second one that can be used is the Log-Euclidean Riemannian Metric (LERM) [45]:

d_{le}(X, Y) = \| \log(X) - \log(Y) \|_F    (22)

Another metric that they mention, argue in favour of, and which is hence used here, is the Jensen-Bregman Log-det Divergence (JBLD) [46], defined as

d_J(X, Y) = \log \left| \frac{X + Y}{2} \right| - \frac{1}{2} \log | X Y |    (23)
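The Gram matrix of (20) and the JBLD of (23) are straightforward to compute; in the sketch below, the small regularization term added to the normalized Gram (to keep it strictly positive definite for the log-determinants) is an implementation choice, not part of (20).

import numpy as np

def normalized_gram(H_D, eps=1e-8):
    # G_hat = H_D H_D^T / ||H_D H_D^T||_F   (Eq. 20), plus eps*I for positive definiteness
    G = H_D @ H_D.T
    G = G / np.linalg.norm(G, 'fro')
    return G + eps * np.eye(G.shape[0])

def jbld(X, Y):
    # Jensen-Bregman Log-det Divergence   (Eq. 23)
    return np.linalg.slogdet((X + Y) / 2.0)[1] - 0.5 * np.linalg.slogdet(X @ Y)[1]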

The JBLD defined in (23) is what we use here for clustering, with the mean (or center of a cluster) defined as

X^* = \arg\min_X \sum_{i=1}^{N} d_J(X, X_i)    (24)

So, in summary, given a set of sequences of different activities, we chop the sequences into smaller overlapping sub-sequences, encode them into Gram matrices, and cluster them using JBLD; the cluster labels then serve as the alphabet for training the WFAs.

Chapter 4

Synthetic Experiments

Before moving on to the target tasks in computer vision, we felt it important to establish our implementation of the Weighted Finite Automatas and their discriminative capabilities on firmer ground by performing a variety of synthetic experiments, the details of which we discuss here with examples. This chapter also covers the implementation details of the WFA part of the work; the topics covered include:

1. WFA Generation
2. String Generation from WFAs
3. Evaluation Functions, Hankel Construction, and Spectral Learning
4. Experiments to Evaluate Estimated WFAs

4.1. WFA Generation

The first step in performing synthetic experiments was the establishment of a ground truth, which means having the ability to create WFAs on our own with different specifications. This was handled using the knowledge that a WFA W is defined by the tuple W = \langle \alpha_1, \alpha_\infty, \{A_\sigma\} \rangle, where

\alpha_1 \in R^N is the initial state probability vector,
\alpha_\infty \in R^N is the termination probability vector, and
A_\sigma \in R^{N \times N} are the transition probabilities for symbol \sigma,

and the knowledge that we can create a probabilistic WFA by following the rules below:

\sum_{i=1}^{N} \alpha_1(i) = 1, \qquad \alpha_\infty(i) + \sum_{\sigma} \sum_{j=1}^{N} A_\sigma(i, j) = 1, \quad i = 1, \ldots, N    (25)

In addition to this, we also provide a sparsity option to control how dense or how sparse (in terms of connections between states) we want the WFA to be.

Example: The generator takes as input the number of states N and the number of characters of the alphabet S. For example, with N = 6 and S = 3 (\Sigma = \{a, b, c\}), a WFA was generated with 80% density; its \alpha_1, \alpha_\infty, and transition matrices A_a, A_b, A_c are 6-dimensional vectors and 6 x 6 matrices satisfying (25) (their numerical values are omitted here).
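A minimal sketch of such a generator, enforcing (25) and a density option, is given below; the per-state normalization and the uniform random sparsity mask are implementation choices rather than the thesis's exact procedure.

import numpy as np

def random_wfa(N, symbols, density=0.8, seed=0):
    rng = np.random.default_rng(seed)
    alpha1 = rng.random(N)
    alpha1 /= alpha1.sum()                               # sum_i alpha1(i) = 1
    A = {s: rng.random((N, N)) * (rng.random((N, N)) < density) for s in symbols}
    alpha_inf = rng.random(N)
    for i in range(N):                                   # per-state normalization of Eq. (25)
        total = alpha_inf[i] + sum(A[s][i].sum() for s in symbols)
        alpha_inf[i] /= total
        for s in symbols:
            A[s][i] /= total
    return alpha1, alpha_inf, A

alpha1, alpha_inf, A = random_wfa(6, "abc", density=0.8)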

4.2. String Generation from WFAs

Now that we have created a WFA, it can be used to generate random sequences by traversing its states; these strings can be used for training as well as testing. The following steps are followed to generate a string from a WFA W = \langle \alpha_1, \alpha_\infty, \{A_\sigma\} \rangle (a sampling sketch is given below):

- Select the initial state by sampling the states according to the initial probability distribution defined by \alpha_1.
- The next state, as well as the symbol to be emitted, is selected according to the probability distribution defined by the rows of \{A_\sigma\} and the termination vector \alpha_\infty.
- No length limit is imposed; the generation stops once termination is reached.

Example: Based on the WFA from Section 4.1, one of the generated strings looks like this:

x = bcbcabbaabcbaacbcbccbbcaabbbbbaacbacca

For our training purposes in the synthetic experiments we generate 10,000 strings from each WFA, with an average length of around 12 characters; the longest generated strings are considerably longer, and the shortest is the empty string, which shows that WFAs cover a wide range of sequence lengths.
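One possible sampling routine implementing these steps is sketched below; at every step it draws among all (symbol, next state) pairs and the termination event according to the row probabilities, which sum to 1 by (25).

import numpy as np

def sample_string(alpha1, alpha_inf, A, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    symbols = list(A)
    N = len(alpha1)
    state = rng.choice(N, p=alpha1)                # initial state from alpha1
    out = []
    while True:
        # probabilities of (emit sigma, move to state j) for all sigma and j, plus termination
        probs = np.concatenate([A[s][state] for s in symbols] + [[alpha_inf[state]]])
        k = rng.choice(len(probs), p=probs)
        if k == len(probs) - 1:                    # termination sampled: stop
            return "".join(out)
        out.append(symbols[k // N])
        state = k % N

# using the WFA returned by the generator sketched in Section 4.1
strings = [sample_string(alpha1, alpha_inf, A) for _ in range(10000)]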

4.3. Evaluation Functions, Empirical Hankels, and Spectral Learning

These have already been discussed in Chapter 2; for completeness and narrative purposes we mention them very briefly here.

Evaluation Functions

Given the WFA W = \langle \alpha_1, \alpha_\infty, \{A_\sigma\} \rangle, its transformation W_s = \langle \alpha_{1,s}, \alpha_{\infty,s}, \{A_\sigma\} \rangle, and a string or substring x, the following two evaluation functions are implemented by simple matrix multiplication:

i)  f(x) = \alpha_1^T A_x \alpha_\infty
ii) f_s(x) = \alpha_{1,s}^T A_x \alpha_{\infty,s}

In practice the scores are kept positive and also normalized, to make sure the length of a sequence has less effect on its score.

Empirical Hankels

The structure of these Hankels, the number of Hankel matrices calculated, as well as their entries are exactly as described in Section 2.5. In terms of implementation, for the running example of this chapter with 3 characters, we select the basis P = S = { all combinations of a, b, c up to length 4 }, which results in a basis of size 121; thus the sizes of the Hankels mentioned previously are

H_\epsilon, H_a, H_b, H_c \in R^{121 \times 121}, \qquad h_{P,\lambda} \in R^{121 \times 1}, \qquad h_{\lambda,S} \in R^{1 \times 121}

The number of basis elements, for an alphabet of S characters and basis strings up to length l, is given by the following formula:

|basis| = \frac{S^{l+1} - 1}{S - 1}    (26)
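A small sketch to enumerate such a basis and check (26):

from itertools import product

def make_basis(symbols, max_len):
    # all strings over `symbols` of length 0 .. max_len
    return ["".join(w) for l in range(max_len + 1) for w in product(symbols, repeat=l)]

basis = make_basis("abc", 4)
print(len(basis))   # 121, matching (3**5 - 1) // (3 - 1) from Eq. (26)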

Moreover, while selecting the basis we do frequency counting, and the order of the bases depends on their frequency as substrings in the training strings. Also, since we have the ground-truth WFA available, we can directly fill in the entries of the Hankel matrices to create what we call the Theoretical Hankel. If our empirically estimated Hankels are close to these Theoretical Hankels, it means our estimation corresponds well to the theory. Recall that the entries are filled in by the evaluation function

f_s(x) = E[|w|_x] = \sum_w P(w) |w|_x

Empirically, this means calculating the expected value of each combination of bases over the training set.

Spectral Learning

Once we have the empirical Hankel matrices, we can proceed with learning the underlying WFA. Since the Hankels are constructed from expected values, the recovered WFA (using the methods outlined in Section 2.5) is the transformed one, \hat{W}_s = \langle \alpha_{1,s}, \alpha_{\infty,s}, \{A_\sigma\} \rangle, which of course can be transformed to correspond to W. An important thing to remember is that the spectral technique does not guarantee the recovery of a unique WFA; so, in terms of the entries of the matrices, the estimate \hat{W} and the ground truth W can, and most of the time will, be very different. What we are interested in is their behavior, which is evaluated using the experiments outlined in the next section.

4.4. Experiments to Evaluate Estimated WFAs

To make the case of whether or not the estimated WFAs are a good approximation of the ground truth, we did the following experiments:

Frobenius Norm

Frobenius norms are generally a good metric for comparing matrices. We posit that even before the spectral learning method kicks in, it is important to establish the closeness of the empirically

calculated Hankel matrices to the Theoretical Hankels created using the ground-truth WFA. If H_e is the estimated Hankel and H_{gt} is the theoretical Hankel, the normalized Frobenius norm distance can be calculated as

d(H_e, H_{gt}) = \frac{\| H_e - H_{gt} \|_F}{\| H_{gt} \|_F}    (27)

We ran exhaustive experiments creating WFAs with different numbers of states and alphabets, and in all cases found that the distance (27) never went above 2%. Moreover, when the same quantity was calculated against different WFAs, the Frobenius norm distance was found to be larger than the distance of the estimated Hankel from the ground truth. For example, with an estimation using 10,000 strings, the Frobenius norm distance (27) to the ground-truth Hankels was found to be around 1%, while when the ground truth was compared to Hankels constructed from false WFAs the distance was much larger (on average more than 10%). Very similar behaviour was observed when the same calculations were done using subspace angles and JBLD instead of the Frobenius norm, all establishing the validity of our counting process and of the empirically created Hankels.

Perplexity

This is a fairly popular metric in Natural Language Processing, and it is also suggested by [5]. The idea is to evaluate a number of test strings, treating them as an ensemble and normalizing the resulting scores to sum to 1, thus forming a probability distribution over the ensemble. If the estimated WFAs are good enough, this probability distribution should be close to the distribution obtained when the scores are calculated using the ground-truth WFAs. Perplexity is a measure used to compare probability distributions; given a probability distribution p(x) and its estimate q(x), the perplexity P(p, q) is defined as

P(p, q) = 2^{- \sum_x p(x) \log(q(x))}    (28)

A lower perplexity means a closer approximation, but an important thing to remember is that there is a lower bound as well; a perplexity lower than the lower bound also indicates a poorer approximation. This lower bound is given by the self-perplexity of p:

L(p) = 2^{- \sum_x p(x) \log(p(x))}    (29)

For example, in our case we have a test ensemble consisting of 1000 strings, \{x_i\}_{i=1}^{1000}, over which the probability distributions are calculated, giving

P(p, q) = 2^{- \sum_{i=1}^{1000} p(x_i) \log(q(x_i))}, \qquad L(p) = 2^{- \sum_{i=1}^{1000} p(x_i) \log(p(x_i))}

The perplexity obtained from the estimated WFA versus the ground-truth WFA is very close to the lower bound, indicating that the estimate q is fairly close to p. To drive home the point, the same calculations were done with false WFAs versus the ground-truth WFA; the results are tabulated in Table 1.

Table 1. Perplexity values for false WFAs (with different numbers of states N) versus the ground-truth WFA: they are either above the perplexity obtained with our estimate, or well below the lower bound, indicating in both cases that our estimate is performing well.

KL Divergence

This is very closely related to perplexity, both considering model entropy, with the exception that the lower bound for the KLD is zero. It is defined as follows:

KLD(p, q) = \sum_x p(x) \ln \frac{p(x)}{q(x)}    (30)

Just like perplexity, a lower KLD indicates a closer estimate.
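A sketch of how the perplexity of (28)-(29) and the KLD of (30) can be computed from two ensembles of scores (ground truth p, estimate q) follows; the clipping constant only guards against log(0) and is an implementation choice.

import numpy as np

def to_dist(scores):
    # normalize an ensemble of (positive) scores into a probability distribution
    s = np.clip(np.asarray(scores, dtype=float), 1e-300, None)
    return s / s.sum()

def perplexity(p, q):
    return 2.0 ** (-np.sum(p * np.log2(q)))   # Eq. (28); perplexity(p, p) is the bound of Eq. (29)

def kld(p, q):
    return np.sum(p * np.log(p / q))          # Eq. (30)

# p: ground-truth WFA scores of the test ensemble, q: estimated WFA scores (both via to_dist)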

The KLD value obtained with our estimate, computed over the same 1000-string ensemble as

KLD(p, q) = \sum_{i=1}^{1000} p(x_i) \ln \frac{p(x_i)}{q(x_i)}

is lower than the values obtained with false WFAs, which are tabulated in Table 2.

Table 2. KLD values for false WFAs (with different numbers of states N) versus the ground-truth WFA; they are higher than the KLD value obtained by our estimate, indicating the estimate is close to the actual WFA.

Word Prediction Error Rate

As mentioned in [5], given a prefix, a WFA is able to predict the next symbol. That gives us another way to see how closely our estimated WFA performs versus the ground-truth WFA. We define the Word Prediction Error Rate (WPER) as the number of times the prediction of the ground-truth WFA W differs from the prediction of the estimate \hat{W}, divided by the total number of symbols predicted. The predictions are done the same way as suggested by [5]: given a WFA W and its transform W_s, if a prefix w_{1,i} is provided, do the following (one possible implementation is sketched below):

- Compute the product A_{w_{1,i}} iteratively, one symbol at a time.
- Compute the score of continuing with \sigma, for every possible next symbol \sigma.
- Compute the score for ending the sequence.
- The symbol (or end of string) that gives the highest score is predicted.

In our case the WPER remains on average around 5%, that is, for every 100 predicted symbols there are only about 5 errors.
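One plausible implementation of this prediction rule is sketched below; the exact combination of W and W_s used for the continuation and end-of-string scores follows one reading of [5] and should be treated as an assumption.

import numpy as np

def predict_next(alpha1, alpha_inf, alpha_inf_s, A, prefix):
    v = alpha1.copy()
    for sym in prefix:                 # build alpha1^T A_{w_{1,i}} one symbol at a time
        v = v @ A[sym]
    scores = {s: float(v @ A[s] @ alpha_inf_s) for s in A}   # continuation score for each symbol
    scores["<eos>"] = float(v @ alpha_inf)                    # score for ending the string here
    return max(scores, key=scores.get)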

Chapter 5

Experiments

After establishing the grounds for the use of WFAs with experiments on synthetic data, we proceed to apply our work to different datasets in Computer Vision, specifically in the activity recognition scenario. What follows in this chapter is a description of the datasets used, followed by the experiments performed while varying different parameters, and the results obtained on each dataset. Some of the initial experiments, for example those done on the Posebit Database, were done as a proof of concept and thus are not as detailed as those done later with other datasets.

5.1. Experimental Setup

As described earlier, there are different parameters to play with in the method, including the method for preparing the alphabet needed to train the WFAs, the number of states of the WFAs, the number of basis elements, and the number of symbols. Since WFAs in general require considerably more data to train than is available in these datasets, we use a leave-one-out strategy for evaluation. The idea is to train one WFA per action, leaving out one sequence for testing, and then check the scores assigned by all the WFAs to that sequence; if the maximum score corresponds to the ground-truth label it is a correct recognition, otherwise it is counted as an error. Instead of controlling the number of states of individual WFAs (which would leave us with too many parameters to tune), we control the number of states en masse by using a percentage rule:

n = the number of eigenvalues of H that are larger than (100 - s)% of the highest eigenvalue of H

so that a larger s retains more states. Thus, varying this s allows us to vary the number of states of the WFAs without individually tuning them. Other parameters include the number of symbols (or clusters) C and the overlap window used when computing velocities, accelerations, etc. During the course of our experiments we found that C = 10 seems to give good results; having too few symbols leads to more monotonous sequences, while having too many symbols slows the process down without yielding any significant improvement. A sketch of the state-selection rule and the leave-one-out protocol is given below.
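The sketch illustrates both pieces of the setup, under the reading that every singular value above (1 - s) times the largest one is kept; train_wfa and score are placeholders for the training and evaluation routines described in Chapters 2 and 4.

import numpy as np

def num_states(H, s=0.95):
    # keep every singular value of the empirical Hankel larger than (1 - s) times the largest
    sv = np.linalg.svd(H, compute_uv=False)
    return int(np.sum(sv >= (1.0 - s) * sv[0]))

def leave_one_out_accuracy(strings, labels, train_wfa, score):
    # train one WFA per action on everything except the held-out sequence,
    # then label it with the WFA giving the highest score
    correct = 0
    for i, (x, y) in enumerate(zip(strings, labels)):
        rest = [(w, l) for j, (w, l) in enumerate(zip(strings, labels)) if j != i]
        models = {a: train_wfa([w for w, l in rest if l == a]) for a in set(labels)}
        pred = max(models, key=lambda a: score(models[a], x))
        correct += int(pred == y)
    return correct / len(strings)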

5.2. MHAD Dataset

Description

The Multimodal Human Action Database (MHAD) [37] is one of the most widely used action recognition datasets with 3D joint information in the Computer Vision community. The dataset consists of 11 actions performed by 12 actors, each action performed 5 times. One sequence is missing, so the total number of sequences is 659. The actions are as follows; the same numbering is followed in the confusion matrices shown in this section:

1. Jumping in place
2. Jumping Jacks
3. Bending (hands up all the way down)
4. Punching/boxing
5. Waving two hands
6. Waving one hand
7. Clapping hands
8. Throwing a ball
9. Sit down then stand up
10. Sit down
11. Stand up

The activities have varying levels of dynamics, some just in the upper body (waving, punching, clapping), while others involve the whole body, so this dataset can be considered a naturalistic dataset.

Figure 6.1. Snapshots of one actor performing the 11 actions in MHAD. We make use of the MoCap data, with 35 three-dimensional joints, leading to 105-dimensional feature vectors.

Figure 6.2. Snapshots of the throwing action as captured by the MoCap cameras from different angles.

Evaluation

For the MHAD dataset, Figure 7.1 shows the best results we were able to achieve, using the parameters C = 10, s = 95%, and basis length 2.

Figure 7.1. Confusion matrix for the activities in the MHAD dataset, with an average accuracy of around 90%.

As opposed to this, if a very low s = 60% is used, the WFAs are not able to capture much of the dynamics of the sequences, and hence the performance drops drastically, as shown in Figure 7.2.

Figure 7.2. Modelling the activities with a lower number of states leads to a significant drop in performance, as shown here for the MHAD dataset; the average accuracy goes down to around 54%.

Similarly, increasing s to a higher percentage, which means allowing a higher dependence on the number of states gleaned from the training data, can result in over-fitting, and hence once again a drop in performance is observed.

Figure 7.3. Confusion matrix showing a drop in average accuracy when the WFAs are allowed to fit too closely to the data (s = 99%).

Overall, the number of states plays a critical role in the performance of the system. We noticed a bell-shaped trend relative to the value of s: increasing s yields an improvement in the average accuracy up to a certain value (generally close to 95%), and any further increase results in deteriorating accuracy. The MHAD dataset is now essentially a solved dataset, and hence our accuracy is not state of the art; recently [2] demonstrated 100% accuracy, and also listed the accuracies achieved by other methods, which we copy here.

Table 3. Comparison of accuracies with existing methods on the MHAD dataset shows there is room for improvement. Methods compared: SMIJ [47], RBF Net [48], Dynemes [49], Bio-LDS [50], HBRNN-L [51] (100%), G-L [2], G-J [2], G-A [2], G-K [2], and our WFA approach (around 90%).

5.3. MSR 3D Dataset

Description

The Microsoft Research 3D (MSR-3D) dataset [52] is another popular action recognition dataset that provides 3D skeleton joint information. The dataset consists of 20 actions of varying similarity and dynamics, from full-body movements to partial-body movements. Each action is performed by 10 subjects, leading to a total of 557 relatively short sequences. The actions are as follows; the same numbering is followed in the confusion matrices:

1. High Arm Wave
2. Horizontal Arm Wave
3. Hammer
4. Hand Catch
5. Forward Punch
6. High Throw
7. Draw Cross
8. Draw Tick
9. Draw Circle
10. Hand Clap
11. Two Hand Wave
12. Side Boxing
13. Bend
14. Forward Kick
15. Side Kick
16. Jogging

17. Tennis Swing
18. Tennis Serve
19. Golf Swing
20. Pickup & Throw

Evaluation

A similar set of experiments was performed, and once again the best accuracy was observed with s = 90%, C = 10, and basis length 2; the results are shown in Figure 8.1 in the form of a confusion matrix.

Figure 8.1. An average accuracy of almost 93% is achieved, with understandable confusion between the two different types of kicks (forward and side kicks).

For s = 95%, the accuracy goes down, indicating overfitting.

Figure 8.2. The average accuracy goes down when s is increased.

Similarly, on lowering s to 75%, the accuracy again suffers drastically, indicating that the WFAs have failed to model the dynamics.

Figure 8.3. The average accuracy again suffers when a smaller s is used.

5.4. Composable Activities Dataset

Description

The Composable Activities Dataset, introduced in [38], is a very different dataset compared to the datasets discussed so far, since it is made up of sequences of complex activities, which in turn are made up of sub-activities. All in all, there are 693 sequences of 16 classes performed by 14 actors. Each composable action is made up of a different combination of 3 to 11 sub-activities out of a total of 26. The dataset exhibits high variance in the complexity as well as the similarity of the sequences, and as such is a difficult dataset for action recognition. The 16 action classes for classification are:

1. Composable Activity 1
2. Composable Activity 2
3. Composable Activity 3
4. Composable Activity 4
5. Composable Activity 5
6. Composable Activity 6
7. Composable Activity 7
8. Composable Activity 8
9. Hand Wave and Drink
10. Talk Phone and Drink
11. Talk Phone and Pickup
12. Talk Phone and Scratch Head
13. Walk while Calling with Hands
14. Walk while Clapping
15. Walk while Hand Waving
16. Walk while Reading

The first 8 activities are composed of 3 to 11 sub-actions, most of the time performed sequentially but sometimes in parallel. The sub-actions include reading, gesticulating, erasing/writing on a board, etc. The authors provide skeleton data and annotations.

Figure 9.1. A few examples of actions from the Composable Activities dataset. Some actions are parallel: top-left, the subject walks while hand waving; top-right, the subject talks on the phone and then runs, sequentially.

Since this is a comparatively harder dataset, it was harder to perform well on it; the best available performance on this dataset was around 86%, by the creators of the dataset [38].

Evaluation

Our best performance is similar to the Bag of Visual Words baseline mentioned by the authors: with s = 95% and C = 10 we were able to get an accuracy of around 67.83%.

Figure 9.2. Confusion matrix for the Composable Activities dataset. We are able to do well on the first 8 activities, which are composed of multiple sub-activities.

As previously observed, increasing s led to a decrease in performance.

Figure 9.3. A drastic decrease in average accuracy, to about half, with s = 99%.

Similarly, a significant decrease in s also led to a similarly reduced recognition accuracy.

Figure 9.4. A decrease in accuracy is seen when s is reduced to 75%.

As before, we are not performing close to the best; however, we were able to match at least one baseline, which goes to show that the method, although not perfect, can be made viable. As a reference, the confusion matrix obtained by [38] is shown here. We are actually able to outperform

them in recognizing some of the activities, like Composable Activities 5, 6, and 7 (numbered the same in Figure 9.2).

Figure 9.5. Confusion matrix obtained by [38], with around 85% average accuracy.

5.5. HDM05 Dataset

Description

The next dataset we experiment on is the HDM05 dataset [55]. It is also a MoCap dataset, providing 3D locations for 31 joints; however, like [2], we used just 4 joints corresponding to the arms and legs. The results again followed the pattern of the previous experiments.

Evaluation

The best recognition accuracy that we were able to achieve was 90.5%, with s = 94% and C = 10. This is the only dataset on which we were able to outperform the state of the

art; however, as mentioned earlier, we follow a leave-one-out protocol, while [2] follows a leave-one-subject-out protocol. Keeping that in mind, our performance is still not state of the art.

Figure 10.1. An accuracy of over 90% is achieved with s = 94% and C = 10.

Following the pattern so far, a higher s leads to a degradation in performance. At s = 99% the confusion matrix looks like this:

Figure 10.2. The average accuracy drops to around 77% at s = 99%.

Similarly, going down also leads to a drop in performance.

Figure 10.3. A drop in performance is observed when s = 85% is used.

Considering that we were able to achieve a higher performance than the one reported in [2], we re-did the experiments following their protocol, with leave-one-subject-out; the performance dropped well below the state of the art, to around 71%.

Figure 10.4. Confusion matrix for the best possible performance following the protocol of [2].

5.6. UTKinect Dataset

Description

The UTKinect Dataset [39] is another popular dataset evaluated frequently in action recognition settings. It is also based on 3D skeleton joints and consists of 10 simple actions:

1. Walk
2. Sit Down
3. Stand Up
4. Pick Up
5. Carry
6. Throw
7. Push
8. Pull
9. Wave Hands
10. Clap Hands

Each action is performed twice by each of 10 subjects, leading to 199 sequences (with one missing sequence).

Figure 11.1. Some sample images from different actions of the UTKinect Dataset.

Evaluation

The leave-one-out protocol is itself proposed by [39] in this case, and we follow it. Table 4 is taken directly from [2] and shows the performance of different methods on the dataset. While, once again, this is now a solved dataset, we are able to perform significantly better than [1] and close to [4].

Table 4. A comparison with different methods on the UTKinect dataset (per-class and average accuracies for Walk, Sit Down, Stand Up, Pick Up, Carry, Throw, Push, Pull, Wave, Clap); the rows are [1], [4], [39], [2], and our WFA with s = 0.95. We are able to do reasonably well on most activities except carry and throw.

The following is the confusion matrix for s = 95%. We perform reasonably well on all activities except carry and throw.

Figure 11.2. Confusion matrix for s = 95%, showing our best performance.

Bumping s up to 99% expectedly results in a drop in average accuracy.

Figure 11.3. Confusion matrix for s = 99%, showing a drop in average accuracy.

Similarly, selecting a lower s also leads to a drastic drop in accuracy.

Figure 11.4. Confusion matrix for s = 75%, showing a drastic drop.


A Visualization Tool to Improve the Performance of a Classifier Based on Hidden Markov Models A Visualization Tool to Improve the Performance of a Classifier Based on Hidden Markov Models Gleidson Pegoretti da Silva, Masaki Nakagawa Department of Computer and Information Sciences Tokyo University

More information

Structural and Syntactic Pattern Recognition

Structural and Syntactic Pattern Recognition Structural and Syntactic Pattern Recognition Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2017 CS 551, Fall 2017 c 2017, Selim Aksoy (Bilkent

More information

Equation to LaTeX. Abhinav Rastogi, Sevy Harris. I. Introduction. Segmentation.

Equation to LaTeX. Abhinav Rastogi, Sevy Harris. I. Introduction. Segmentation. Equation to LaTeX Abhinav Rastogi, Sevy Harris {arastogi,sharris5}@stanford.edu I. Introduction Copying equations from a pdf file to a LaTeX document can be time consuming because there is no easy way

More information

Object Recognition Using Pictorial Structures. Daniel Huttenlocher Computer Science Department. In This Talk. Object recognition in computer vision

Object Recognition Using Pictorial Structures. Daniel Huttenlocher Computer Science Department. In This Talk. Object recognition in computer vision Object Recognition Using Pictorial Structures Daniel Huttenlocher Computer Science Department Joint work with Pedro Felzenszwalb, MIT AI Lab In This Talk Object recognition in computer vision Brief definition

More information

AUBER (Models of Computation, Languages and Automata) EXERCISES

AUBER (Models of Computation, Languages and Automata) EXERCISES AUBER (Models of Computation, Languages and Automata) EXERCISES Xavier Vera, 2002 Languages and alphabets 1.1 Let be an alphabet, and λ the empty string over. (i) Is λ in? (ii) Is it true that λλλ=λ? Is

More information

CS402 Theory of Automata Solved Subjective From Midterm Papers. MIDTERM SPRING 2012 CS402 Theory of Automata

CS402 Theory of Automata Solved Subjective From Midterm Papers. MIDTERM SPRING 2012 CS402 Theory of Automata Solved Subjective From Midterm Papers Dec 07,2012 MC100401285 Moaaz.pk@gmail.com Mc100401285@gmail.com PSMD01 MIDTERM SPRING 2012 Q. Point of Kleen Theory. Answer:- (Page 25) 1. If a language can be accepted

More information

Using Machine Learning to Optimize Storage Systems

Using Machine Learning to Optimize Storage Systems Using Machine Learning to Optimize Storage Systems Dr. Kiran Gunnam 1 Outline 1. Overview 2. Building Flash Models using Logistic Regression. 3. Storage Object classification 4. Storage Allocation recommendation

More information

Graph-based High Level Motion Segmentation using Normalized Cuts

Graph-based High Level Motion Segmentation using Normalized Cuts Graph-based High Level Motion Segmentation using Normalized Cuts Sungju Yun, Anjin Park and Keechul Jung Abstract Motion capture devices have been utilized in producing several contents, such as movies

More information

D-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview

D-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview Chapter 888 Introduction This procedure generates D-optimal designs for multi-factor experiments with both quantitative and qualitative factors. The factors can have a mixed number of levels. For example,

More information

UNIT I PART A PART B

UNIT I PART A PART B OXFORD ENGINEERING COLLEGE (NAAC ACCREDITED WITH B GRADE) DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING LIST OF QUESTIONS YEAR/SEM: III/V STAFF NAME: Dr. Sangeetha Senthilkumar SUB.CODE: CS6503 SUB.NAME:

More information

Finite Automata Part Three

Finite Automata Part Three Finite Automata Part Three Recap from Last Time A language L is called a regular language if there exists a DFA D such that L( D) = L. NFAs An NFA is a Nondeterministic Finite Automaton Can have missing

More information

Autoencoders, denoising autoencoders, and learning deep networks

Autoencoders, denoising autoencoders, and learning deep networks 4 th CiFAR Summer School on Learning and Vision in Biology and Engineering Toronto, August 5-9 2008 Autoencoders, denoising autoencoders, and learning deep networks Part II joint work with Hugo Larochelle,

More information

Fully Automatic Methodology for Human Action Recognition Incorporating Dynamic Information

Fully Automatic Methodology for Human Action Recognition Incorporating Dynamic Information Fully Automatic Methodology for Human Action Recognition Incorporating Dynamic Information Ana González, Marcos Ortega Hortas, and Manuel G. Penedo University of A Coruña, VARPA group, A Coruña 15071,

More information

Combining PGMs and Discriminative Models for Upper Body Pose Detection

Combining PGMs and Discriminative Models for Upper Body Pose Detection Combining PGMs and Discriminative Models for Upper Body Pose Detection Gedas Bertasius May 30, 2014 1 Introduction In this project, I utilized probabilistic graphical models together with discriminative

More information

Estimating Human Pose in Images. Navraj Singh December 11, 2009

Estimating Human Pose in Images. Navraj Singh December 11, 2009 Estimating Human Pose in Images Navraj Singh December 11, 2009 Introduction This project attempts to improve the performance of an existing method of estimating the pose of humans in still images. Tasks

More information

Introduction to Computer Architecture

Introduction to Computer Architecture Boolean Operators The Boolean operators AND and OR are binary infix operators (that is, they take two arguments, and the operator appears between them.) A AND B D OR E We will form Boolean Functions of

More information

Regularization and model selection

Regularization and model selection CS229 Lecture notes Andrew Ng Part VI Regularization and model selection Suppose we are trying select among several different models for a learning problem. For instance, we might be using a polynomial

More information

Natural Language Processing

Natural Language Processing Natural Language Processing Classification III Dan Klein UC Berkeley 1 Classification 2 Linear Models: Perceptron The perceptron algorithm Iteratively processes the training set, reacting to training errors

More information

Combined Shape Analysis of Human Poses and Motion Units for Action Segmentation and Recognition

Combined Shape Analysis of Human Poses and Motion Units for Action Segmentation and Recognition Combined Shape Analysis of Human Poses and Motion Units for Action Segmentation and Recognition Maxime Devanne 1,2, Hazem Wannous 1, Stefano Berretti 2, Pietro Pala 2, Mohamed Daoudi 1, and Alberto Del

More information

Assignment No.4 solution. Pumping Lemma Version I and II. Where m = n! (n-factorial) and n = 1, 2, 3

Assignment No.4 solution. Pumping Lemma Version I and II. Where m = n! (n-factorial) and n = 1, 2, 3 Assignment No.4 solution Question No.1 a. Suppose we have a language defined below, Pumping Lemma Version I and II a n b m Where m = n! (n-factorial) and n = 1, 2, 3 Some strings belonging to this language

More information

Lecture 2 September 3

Lecture 2 September 3 EE 381V: Large Scale Optimization Fall 2012 Lecture 2 September 3 Lecturer: Caramanis & Sanghavi Scribe: Hongbo Si, Qiaoyang Ye 2.1 Overview of the last Lecture The focus of the last lecture was to give

More information

Optimization Methods for Machine Learning (OMML)

Optimization Methods for Machine Learning (OMML) Optimization Methods for Machine Learning (OMML) 2nd lecture Prof. L. Palagi References: 1. Bishop Pattern Recognition and Machine Learning, Springer, 2006 (Chap 1) 2. V. Cherlassky, F. Mulier - Learning

More information

Data Compression Fundamentals

Data Compression Fundamentals 1 Data Compression Fundamentals Touradj Ebrahimi Touradj.Ebrahimi@epfl.ch 2 Several classifications of compression methods are possible Based on data type :» Generic data compression» Audio compression»

More information

Chapter Seven: Regular Expressions

Chapter Seven: Regular Expressions Chapter Seven: Regular Expressions Regular Expressions We have seen that DFAs and NFAs have equal definitional power. It turns out that regular expressions also have exactly that same definitional power:

More information

CPSC 340: Machine Learning and Data Mining. Kernel Trick Fall 2017

CPSC 340: Machine Learning and Data Mining. Kernel Trick Fall 2017 CPSC 340: Machine Learning and Data Mining Kernel Trick Fall 2017 Admin Assignment 3: Due Friday. Midterm: Can view your exam during instructor office hours or after class this week. Digression: the other

More information

ITEC2620 Introduction to Data Structures

ITEC2620 Introduction to Data Structures ITEC2620 Introduction to Data Structures Lecture 9b Grammars I Overview How can a computer do Natural Language Processing? Grammar checking? Artificial Intelligence Represent knowledge so that brute force

More information

Neural Networks: promises of current research

Neural Networks: promises of current research April 2008 www.apstat.com Current research on deep architectures A few labs are currently researching deep neural network training: Geoffrey Hinton s lab at U.Toronto Yann LeCun s lab at NYU Our LISA lab

More information

COMP 551 Applied Machine Learning Lecture 16: Deep Learning

COMP 551 Applied Machine Learning Lecture 16: Deep Learning COMP 551 Applied Machine Learning Lecture 16: Deep Learning Instructor: Ryan Lowe (ryan.lowe@cs.mcgill.ca) Slides mostly by: Class web page: www.cs.mcgill.ca/~hvanho2/comp551 Unless otherwise noted, all

More information

14.1 Encoding for different models of computation

14.1 Encoding for different models of computation Lecture 14 Decidable languages In the previous lecture we discussed some examples of encoding schemes, through which various objects can be represented by strings over a given alphabet. We will begin this

More information

Robust Signal-Structure Reconstruction

Robust Signal-Structure Reconstruction Robust Signal-Structure Reconstruction V. Chetty 1, D. Hayden 2, J. Gonçalves 2, and S. Warnick 1 1 Information and Decision Algorithms Laboratories, Brigham Young University 2 Control Group, Department

More information

Hand Tracking Miro Enev UCDS Cognitive Science Department 9500 Gilman Dr., La Jolla CA

Hand Tracking Miro Enev UCDS Cognitive Science Department 9500 Gilman Dr., La Jolla CA Hand Tracking Miro Enev UCDS Cognitive Science Department 9500 Gilman Dr., La Jolla CA menev@ucsd.edu Abstract: Tracking the pose of a moving hand from a monocular perspective is a difficult problem. In

More information

Kinect Cursor Control EEE178 Dr. Fethi Belkhouche Christopher Harris Danny Nguyen I. INTRODUCTION

Kinect Cursor Control EEE178 Dr. Fethi Belkhouche Christopher Harris Danny Nguyen I. INTRODUCTION Kinect Cursor Control EEE178 Dr. Fethi Belkhouche Christopher Harris Danny Nguyen Abstract: An XBOX 360 Kinect is used to develop two applications to control the desktop cursor of a Windows computer. Application

More information

A Course in Machine Learning

A Course in Machine Learning A Course in Machine Learning Hal Daumé III 13 UNSUPERVISED LEARNING If you have access to labeled training data, you know what to do. This is the supervised setting, in which you have a teacher telling

More information

Turing Machine Languages

Turing Machine Languages Turing Machine Languages Based on Chapters 23-24-25 of (Cohen 1997) Introduction A language L over alphabet is called recursively enumerable (r.e.) if there is a Turing Machine T that accepts every word

More information

Learning and Recognizing Visual Object Categories Without First Detecting Features

Learning and Recognizing Visual Object Categories Without First Detecting Features Learning and Recognizing Visual Object Categories Without First Detecting Features Daniel Huttenlocher 2007 Joint work with D. Crandall and P. Felzenszwalb Object Category Recognition Generic classes rather

More information

Recurrent Neural Networks. Nand Kishore, Audrey Huang, Rohan Batra

Recurrent Neural Networks. Nand Kishore, Audrey Huang, Rohan Batra Recurrent Neural Networks Nand Kishore, Audrey Huang, Rohan Batra Roadmap Issues Motivation 1 Application 1: Sequence Level Training 2 Basic Structure 3 4 Variations 5 Application 3: Image Classification

More information

The Basics of Graphical Models

The Basics of Graphical Models The Basics of Graphical Models David M. Blei Columbia University September 30, 2016 1 Introduction (These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan.

More information

Introduction to Graphical Models

Introduction to Graphical Models Robert Collins CSE586 Introduction to Graphical Models Readings in Prince textbook: Chapters 10 and 11 but mainly only on directed graphs at this time Credits: Several slides are from: Review: Probability

More information

Arm-hand Action Recognition Based on 3D Skeleton Joints Ling RUI 1, Shi-wei MA 1,a, *, Jia-rui WEN 1 and Li-na LIU 1,2

Arm-hand Action Recognition Based on 3D Skeleton Joints Ling RUI 1, Shi-wei MA 1,a, *, Jia-rui WEN 1 and Li-na LIU 1,2 1 International Conference on Control and Automation (ICCA 1) ISBN: 97-1-9-39- Arm-hand Action Recognition Based on 3D Skeleton Joints Ling RUI 1, Shi-wei MA 1,a, *, Jia-rui WEN 1 and Li-na LIU 1, 1 School

More information

1.3 Functions and Equivalence Relations 1.4 Languages

1.3 Functions and Equivalence Relations 1.4 Languages CSC4510 AUTOMATA 1.3 Functions and Equivalence Relations 1.4 Languages Functions and Equivalence Relations f : A B means that f is a function from A to B To each element of A, one element of B is assigned

More information

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009 Learning and Inferring Depth from Monocular Images Jiyan Pan April 1, 2009 Traditional ways of inferring depth Binocular disparity Structure from motion Defocus Given a single monocular image, how to infer

More information

Gaussian Processes for Robotics. McGill COMP 765 Oct 24 th, 2017

Gaussian Processes for Robotics. McGill COMP 765 Oct 24 th, 2017 Gaussian Processes for Robotics McGill COMP 765 Oct 24 th, 2017 A robot must learn Modeling the environment is sometimes an end goal: Space exploration Disaster recovery Environmental monitoring Other

More information

Topology and Topological Spaces

Topology and Topological Spaces Topology and Topological Spaces Mathematical spaces such as vector spaces, normed vector spaces (Banach spaces), and metric spaces are generalizations of ideas that are familiar in R or in R n. For example,

More information

CAP 6412 Advanced Computer Vision

CAP 6412 Advanced Computer Vision CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong April 21st, 2016 Today Administrivia Free parameters in an approach, model, or algorithm? Egocentric videos by Aisha

More information

Limitations of Matrix Completion via Trace Norm Minimization

Limitations of Matrix Completion via Trace Norm Minimization Limitations of Matrix Completion via Trace Norm Minimization ABSTRACT Xiaoxiao Shi Computer Science Department University of Illinois at Chicago xiaoxiao@cs.uic.edu In recent years, compressive sensing

More information

Bayesian Classification Using Probabilistic Graphical Models

Bayesian Classification Using Probabilistic Graphical Models San Jose State University SJSU ScholarWorks Master's Projects Master's Theses and Graduate Research Spring 2014 Bayesian Classification Using Probabilistic Graphical Models Mehal Patel San Jose State University

More information

CS 3100 Models of Computation Fall 2011 This assignment is worth 8% of the total points for assignments 100 points total.

CS 3100 Models of Computation Fall 2011 This assignment is worth 8% of the total points for assignments 100 points total. CS 3100 Models of Computation Fall 2011 This assignment is worth 8% of the total points for assignments 100 points total September 7, 2011 Assignment 3, Posted on: 9/6 Due: 9/15 Thursday 11:59pm 1. (20

More information

Graphical Models. David M. Blei Columbia University. September 17, 2014

Graphical Models. David M. Blei Columbia University. September 17, 2014 Graphical Models David M. Blei Columbia University September 17, 2014 These lecture notes follow the ideas in Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. In addition,

More information

Predicting Popular Xbox games based on Search Queries of Users

Predicting Popular Xbox games based on Search Queries of Users 1 Predicting Popular Xbox games based on Search Queries of Users Chinmoy Mandayam and Saahil Shenoy I. INTRODUCTION This project is based on a completed Kaggle competition. Our goal is to predict which

More information

Derivative Delay Embedding: Online Modeling of Streaming Time Series

Derivative Delay Embedding: Online Modeling of Streaming Time Series Derivative Delay Embedding: Online Modeling of Streaming Time Series Zhifei Zhang (PhD student), Yang Song, Wei Wang, and Hairong Qi Department of Electrical Engineering & Computer Science Outline 1. Challenges

More information

Action recognition in videos

Action recognition in videos Action recognition in videos Cordelia Schmid INRIA Grenoble Joint work with V. Ferrari, A. Gaidon, Z. Harchaoui, A. Klaeser, A. Prest, H. Wang Action recognition - goal Short actions, i.e. drinking, sit

More information

Bilevel Sparse Coding

Bilevel Sparse Coding Adobe Research 345 Park Ave, San Jose, CA Mar 15, 2013 Outline 1 2 The learning model The learning algorithm 3 4 Sparse Modeling Many types of sensory data, e.g., images and audio, are in high-dimensional

More information

Feature Extractors. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. The Perceptron Update Rule.

Feature Extractors. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. The Perceptron Update Rule. CS 188: Artificial Intelligence Fall 2007 Lecture 26: Kernels 11/29/2007 Dan Klein UC Berkeley Feature Extractors A feature extractor maps inputs to feature vectors Dear Sir. First, I must solicit your

More information

Recitation 4: Elimination algorithm, reconstituted graph, triangulation

Recitation 4: Elimination algorithm, reconstituted graph, triangulation Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 Recitation 4: Elimination algorithm, reconstituted graph, triangulation

More information

Robust PDF Table Locator

Robust PDF Table Locator Robust PDF Table Locator December 17, 2016 1 Introduction Data scientists rely on an abundance of tabular data stored in easy-to-machine-read formats like.csv files. Unfortunately, most government records

More information

Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay

Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 29 Source Coding (Part-4) We have already had 3 classes on source coding

More information

Learning Articulated Skeletons From Motion

Learning Articulated Skeletons From Motion Learning Articulated Skeletons From Motion Danny Tarlow University of Toronto, Machine Learning with David Ross and Richard Zemel (and Brendan Frey) August 6, 2007 Point Light Displays It's easy for humans

More information

1. Which of the following regular expressions over {0, 1} denotes the set of all strings not containing 100 as a sub-string?

1. Which of the following regular expressions over {0, 1} denotes the set of all strings not containing 100 as a sub-string? Multiple choice type questions. Which of the following regular expressions over {, } denotes the set of all strings not containing as a sub-string? 2. DFA has a) *(*)* b) ** c) ** d) *(+)* a) single final

More information

Clustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin

Clustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin Clustering K-means Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, 2014 Carlos Guestrin 2005-2014 1 Clustering images Set of Images [Goldberger et al.] Carlos Guestrin 2005-2014

More information

arxiv: v1 [cs.cv] 28 Sep 2018

arxiv: v1 [cs.cv] 28 Sep 2018 Camera Pose Estimation from Sequence of Calibrated Images arxiv:1809.11066v1 [cs.cv] 28 Sep 2018 Jacek Komorowski 1 and Przemyslaw Rokita 2 1 Maria Curie-Sklodowska University, Institute of Computer Science,

More information

3 Feature Selection & Feature Extraction

3 Feature Selection & Feature Extraction 3 Feature Selection & Feature Extraction Overview: 3.1 Introduction 3.2 Feature Extraction 3.3 Feature Selection 3.3.1 Max-Dependency, Max-Relevance, Min-Redundancy 3.3.2 Relevance Filter 3.3.3 Redundancy

More information

Applying Supervised Learning

Applying Supervised Learning Applying Supervised Learning When to Consider Supervised Learning A supervised learning algorithm takes a known set of input data (the training set) and known responses to the data (output), and trains

More information

PSU Student Research Symposium 2017 Bayesian Optimization for Refining Object Proposals, with an Application to Pedestrian Detection Anthony D.

PSU Student Research Symposium 2017 Bayesian Optimization for Refining Object Proposals, with an Application to Pedestrian Detection Anthony D. PSU Student Research Symposium 2017 Bayesian Optimization for Refining Object Proposals, with an Application to Pedestrian Detection Anthony D. Rhodes 5/10/17 What is Machine Learning? Machine learning

More information

1. Draw the state graphs for the finite automata which accept sets of strings composed of zeros and ones which:

1. Draw the state graphs for the finite automata which accept sets of strings composed of zeros and ones which: P R O B L E M S Finite Autom ata. Draw the state graphs for the finite automata which accept sets of strings composed of zeros and ones which: a) Are a multiple of three in length. b) End with the string

More information

Final Exam Study Guide

Final Exam Study Guide Final Exam Study Guide Exam Window: 28th April, 12:00am EST to 30th April, 11:59pm EST Description As indicated in class the goal of the exam is to encourage you to review the material from the course.

More information

CS249: ADVANCED DATA MINING

CS249: ADVANCED DATA MINING CS249: ADVANCED DATA MINING Classification Evaluation and Practical Issues Instructor: Yizhou Sun yzsun@cs.ucla.edu April 24, 2017 Homework 2 out Announcements Due May 3 rd (11:59pm) Course project proposal

More information

Skyup's Media. PART-B 2) Construct a Mealy machine which is equivalent to the Moore machine given in table.

Skyup's Media. PART-B 2) Construct a Mealy machine which is equivalent to the Moore machine given in table. Code No: XXXXX JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY, HYDERABAD B.Tech II Year I Semester Examinations (Common to CSE and IT) Note: This question paper contains two parts A and B. Part A is compulsory

More information

Feature Selection. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani

Feature Selection. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani Feature Selection CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Dimensionality reduction Feature selection vs. feature extraction Filter univariate

More information

Object Segmentation and Tracking in 3D Video With Sparse Depth Information Using a Fully Connected CRF Model

Object Segmentation and Tracking in 3D Video With Sparse Depth Information Using a Fully Connected CRF Model Object Segmentation and Tracking in 3D Video With Sparse Depth Information Using a Fully Connected CRF Model Ido Ofir Computer Science Department Stanford University December 17, 2011 Abstract This project

More information

Facial Expression Classification with Random Filters Feature Extraction

Facial Expression Classification with Random Filters Feature Extraction Facial Expression Classification with Random Filters Feature Extraction Mengye Ren Facial Monkey mren@cs.toronto.edu Zhi Hao Luo It s Me lzh@cs.toronto.edu I. ABSTRACT In our work, we attempted to tackle

More information

Finite-State Transducers in Language and Speech Processing

Finite-State Transducers in Language and Speech Processing Finite-State Transducers in Language and Speech Processing Mehryar Mohri AT&T Labs-Research Finite-state machines have been used in various domains of natural language processing. We consider here the

More information