Scalable Movement Representation In Low Dimensional Latent Space. Ioannis Havoutis


Scalable Movement Representation In Low Dimensional Latent Space

Ioannis Havoutis

Master of Science
Artificial Intelligence
School of Informatics
University of Edinburgh
2007

Abstract

We investigate a novel approach for the representation of kinematic trajectories in complex movement systems. Our framework consists of a powerful nonlinear, probabilistic and fully generative dimensionality reduction method coupled with a nonlinear dynamical systems-based motion representation technique, allowing for the learning of discrete and rhythmic motions. The high-dimensional kinematic data is mapped to a latent space representation. The latent space trajectories formed are then encoded as nonlinear dynamical systems, resulting in a flexible motion representation. Learning in the low-dimensional latent space requires only a small number of systems, without loss of accuracy with respect to the demonstrated motion. This way the encoding is made more compact and more manageable. Moreover, the modification of parameters of the motion representation in latent space is demonstrated to lead to structured changes of the observed motion when mapped back to task space.

We test the system with kinematic data from three different setups. We start with a 3-DoF 3-link planar arm, continue with a 6-DoF manipulator that operates in 3-dimensional space, and then experiment with 59-dimensional human motion capture data. In all cases discrete and rhythmic motion scenarios are employed. The efficiency of the system is measured at various steps of the overall process, while qualitative evaluations are also made. The results are promising, as the system maintains a low error with respect to the desired motions, even in the most complex of the cases evaluated. Moreover, it allows for easy generalisation in the vicinity of the training data, enabling novel motion generation in a structured manner. The system provides a very compact representation that is easily and rapidly learned. In addition, it allows online modification, which provides resistance against external perturbations.

Acknowledgements

I would like to thank my supervisor, Dr. Sethu Vijayakumar, for his invaluable advice, his support and the confidence he placed in me throughout this project. I would also like to thank my second supervisor, Sebastian Bitzer, for his constant support and help with all issues encountered throughout the course of this project. I am thankful to Stefan Klanke for his knowledge on robotics and also to Djordje Mitrovic for his comments and inspiring conversations. Last but not least, I would like to thank all the friends here in Edinburgh.

Declaration

I declare that this thesis was composed by myself, that the work contained herein is my own except where explicitly stated otherwise in the text, and that this work has not been submitted for any other degree or professional qualification except as specified.

(Ioannis Havoutis)

Table of Contents

1 Introduction
   Motivation
   Background and Methodology
   Related Work
   Dissertation Outline
2 Representing Movement Plans
   Dynamical Systems
      First Order Linear Systems
      Second Order Linear Systems
   Control Policies from Dynamical Systems
      Discrete Motion
      Rhythmic Motion
   Benefits of Movement Primitives
3 Nonlinear Dimensionality Reduction
   Principal Component Analysis
   Probabilistic PCA
      Standard Model Fitting Approach
      Dual Model Fitting Approach
   Gaussian Processes
   The Gaussian Process Latent Variable Model
      Fitting the nonlinear GPLVM
   Advantages of the GPLVM
   Other Related Methods
4 Robots and Datasets
   Robot Control
      Kinematics
   Planar Arm
      Redundancy Resolution
      Grid Data
      Discrete Kinematic Data
      Rhythmic Kinematic Data
   Puma Arm
      Redundancy Resolution
      Discrete Kinematic Data
      Rhythmic Kinematic Data
   Human Motion Capture
5 Experiments and Evaluations
   Overview of Methodology
   Quantitative Evaluation
   Structure in the Latent Space
      Experiments with the Planar Arm
      Experiments with the Puma Arm
   Learning and Generalising in the Latent Space
      Discrete Motions
      Rhythmic Motions
   Human Motion Data
      Boxing
      Walking and Running
   Discussion
6 Conclusion
   Future Work
A Rhythmic Motion Experiments
Bibliography

Chapter 1 Introduction

Fast, accurate and compliant movement is a prerequisite for modern day robots. This is crucial as robotic systems move from highly controlled and secure environments (like factory floors and remote handling) to much tighter human-machine interaction (entertainment robotics, rehabilitation robotics) and uncontrolled environments (search and rescue robots). There is a large variety of robotic platforms, ranging from special purpose manipulators and modern entertainment robots to large degree of freedom humanoid anthropomorphic systems. What all such examples share in common is the need for efficient planning and control. This has spawned an ever increasing active area of research in adaptive control and planning, as the morphology and characteristics, materials, actuators and sensors are constantly evolving and their complexity demands flexible approaches.

Robot control involves a number of issues to be addressed, beginning with the specification of an overall goal. The goal has to be divided into a hierarchy of intermediate tasks that is optimised in a goal-directed manner. Movement plans for achieving the sub-tasks defined earlier need to be created. The kinematic plans produced should be able to address goals in distal space. In order to achieve goals in distal space the plans have to be translated to proximal space plans, requiring a series of transformations between coordinate frames. Furthermore, there is the need for adaptive control to account for changes in the robot's structure and changes in a dynamic environment. Corrections based on feedback should also be performed to account for unpredicted and systematic errors. In addition, accurate and adaptive control requires learning of the system's kinematics and dynamics in an online fashion.

This thesis is concerned with kinematic planning for control, addressed as a movement representation task (Figure 1.1). Planning at the kinematics level provides more
generality, allowing for far broader applicability. This way, kinematic plans can be equally applied to robot motions, avatar animation, visual tracking systems and any other subject that requires movement representation.

Figure 1.1: Kinematic plans as a way of representing movements.

1.1 Motivation

A traditional approach to kinematic planning problems is planning based on the optimisation of a cost function. Various cost functions have been applied to movement systems. Some of the high level cost functions used are minimum jerk (Flash and Hogan (1984)), minimum torque (Uno et al. (1989)) and minimum endpoint variance (Harris and Wolpert (1998)). Such approaches face the problem of large computational cost, as the optimisation depends on complicated mathematical models. Moreover, complex movement systems in general have to utilise combinations of cost functions. Recognising the individual cost functions that shape a particular motion is not always possible. In addition, the solution of combinations of cost functions in a mathematical model can be intractable. Specifying movements explicitly is a tedious task for complex systems.

The alternative to cost function based optimisation is to use a higher level encoding for motion representation. Such an encoding should be produced from demonstration and also have a compact and reusable form. In this context, a number of different implementations have been studied for motion representation (Miyamoto et al. (1996), Sutton and Barto (1998), Schaal (1997)). The central drawback of these approaches is that they depend explicitly on time. This way the representations are not flexible and are hard to reuse in novel conditions.
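As a point of reference for the cost-function view mentioned above, the minimum-jerk criterion has a well-known closed-form solution for a point-to-point movement with zero velocity and acceleration at both ends. The short sketch below is an illustration added here, not part of the original thesis; the function name and parameter values are arbitrary.

```python
import numpy as np

def minimum_jerk(y0, g, T, dt=0.01):
    # Closed-form minimum-jerk profile: the trajectory minimising the integrated
    # squared jerk between y0 and g, with zero start/end velocity and acceleration,
    # is y(t) = y0 + (g - y0) * (10 s^3 - 15 s^4 + 6 s^5), where s = t / T.
    t = np.arange(0.0, T + dt, dt)
    s = t / T
    y = y0 + (g - y0) * (10 * s**3 - 15 * s**4 + 6 * s**5)
    return t, y

t, y = minimum_jerk(y0=0.0, g=1.0, T=1.0)
print(y[0], y[-1])  # starts at y0 and ends at the goal g
```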

What this thesis proposes is a framework that allows accurate motion representation in a time independent manner. It allows for a compact representation of motions, with online modification capabilities in order to handle external perturbations, and provides fast and easy motion learning, as learning is performed by observation. At this time there is no system that uses such a combined dimensionality reduction and motion learning approach, thus we consider our formulation to be novel to the literature.

1.2 Background and Methodology

A novel method of planning using control policies based on simple differential equations has been proposed by Ijspeert et al. in (Ijspeert et al. (2002)) and (Ijspeert et al. (2003)), and presents several advantages. Some interesting aspects of the dynamical systems based representation are that movements become robust against external perturbations and changes in a dynamic environment. It provides an ease of representation and learning of desired trajectories, and the generated control policies are easily reusable. Moreover, similar movements can be produced by simple scaling of an existing control policy, which is invaluable in the context of natural motion, as movement trajectories tend to be greatly similar. The process of learning, however, gets increasingly difficult with the dimensionality of the trajectories. As the number of degrees of freedom involved in the movement increases, the learning outcome deteriorates qualitatively.

The central assumption is that even though the kinematic dataset we wish to learn is high dimensional, its intrinsic dimensionality is relatively small (Vijayakumar et al. (2005)). Thus, we should be able to represent the high dimensional kinematic dataset through a low dimensional latent space. Our hypothesis is that learning in a lower dimensional space would be easier. To accomplish this task we need a dimensionality reduction technique that can handle nonlinear data and also provide an accurate, continuous latent space mapping. Moreover, we need to employ a model that can generalise well, as we wish to sustain the generalisation ability of the motion representation in the low dimensional mapping.

For finding lower dimensional representations of kinematic trajectories we use the Gaussian Process Latent Variable Model (GPLVM) (Lawrence (2005)). The GPLVM
is a probabilistic, nonlinear dimensionality reduction technique based on the theory of Gaussian Processes (Rasmussen and Williams (2006)). This model has already been used for dimensionality reduction of kinematic data in related research work (Grochow et al. (2004), Wang et al. (2006)). The GPLVM provides a flexible latent representation of arbitrary dimensionality, with great accuracy of reconstruction. It is a probabilistic model that offers a degree of certainty in the predictions made. Moreover, it is fully generative, allowing for the creation of fantasy data (Lawrence (2004)). Nonetheless, it is unclear whether the properties of the representation algorithm carry over to the low dimensional space, and this investigation is a significant part of this thesis.

The general motivation of this thesis is motion representation. We formulate a framework for representing movement plans in a scalable manner, such that they generalise in time and space and can be learned and manipulated in some lower dimensional latent space, which is much more manageable. We aspire to show that encoding movements in the low dimensional latent space is as accurate as the corresponding representation in the high dimensional joint space. Moreover, we aim to show how this low dimensional encoding preserves the advantages of the dynamical systems-based representation. That is, the encoded motion is easily reusable and can generalise to novel movement in the vicinity of the original trajectory. In addition, we wish to show that the representation is time invariant and can be modified in an online fashion in the low dimensional encoding.

1.3 Related Work

Modeling motion within a low-dimensional representation of the original movement data is a relatively fresh subject of research. This can be attributed to the recent developments in nonlinear dimensionality reduction machinery that allow for such flexible and accurate representations. Below we present the articles that deal with modeling of motion in a low dimensional space, which we consider to be the most relevant to this thesis.

More specifically, Grochow et al. have proposed an inverse kinematics system based on learning from previously observed poses derived from human motion capture data (Grochow et al. (2004)). They developed a computer animation and computer vision approach with applications in interactive character posing, trajectory keyframing and real time motion capture with missing markers. Their main hypothesis is that even
though many poses are possible, some poses are more likely than others in a learned, style based context, so their idea was to define a probability density function (PDF) that would define the likelihood over poses. For learning such a likelihood function they employed a Scaled Gaussian Process Latent Variable Model (SGPLVM), the parameters of which are learned automatically from the training data. Their approach uses a learning active set, which consists of a heuristically determined subset of the original poses, in order to reduce the computational complexity of their algorithm and to optimise the training dataset. Learning consists of optimising the model parameters, optimising the latent variables and selecting the active set. This results in a general purpose probability distribution for new poses. Their SGPLVM avoids overfitting and yields smooth objective functions for large and for small datasets, resulting in natural synthesised poses and trajectories satisfying style based constraints.

Another approach, motivated by modeling human motion for video based people tracking and data driven animation, is the Gaussian Process Dynamical Model (GPDM) proposed by Wang et al. in (Wang et al. (2006)). Again the main observation is that a small number of dimensions, often three, suffice to capture the intrinsic dimensionality of a large state space, that of activity-specific human poses and motions. Their work is directly inspired by the GPLVM, and the GPDM proposed is a GPLVM augmented with a latent dynamical model. The model learns the mapping between the observation space and the latent space, as defined in the GPLVM framework, and also the nonlinear dynamics associated with the low dimensional latent space. It generalises well from small datasets, as it in effect models a high probability tube around the training data. They have also demonstrated that the GPDM can handle missing data, as it produces good results for short missing segments but fails on long missing sequences. The results are smooth predicted human movement sequences sampled from the low dimensional latent model learned.

Both of the above approaches demonstrate that working in a low dimensional latent space favours motion representation. This evidence serves to reinforce our hypothesis.

1.4 Dissertation Outline

This thesis is separated into two main parts. The first part deals with the theoretical and background material that is crucial for the reader in order to have a clear view
of the framework we later detail. The second main part deals with the more practical aspects of the project.

Specifically, the second chapter starts with an overview of the general concepts concerning the subject of dynamical systems, in order to introduce the reader to the dynamical systems-based movement learning method that we have chosen for motion representation, underlining the advantages that follow such a choice. The third chapter discusses the dimensionality reduction framework chosen. We start from a simple linear model and with step-by-step elaborations we reach a powerful, nonlinear, probabilistic and fully generative dimensionality reduction model. Furthermore, the desirable characteristics of the model are discussed.

Chapter four is the beginning of the practical part. In this chapter the robotic setups used are introduced and some general robot control concepts are also presented. The chapter details the data generation process that is used to create meaningful joint angle datasets suitable for the aims of the project. The fifth chapter presents the experimental formulations that support the hypothesised benefits of the combination of dimensionality reduction and movement learning from demonstration. The general methodology followed is explained and the aim of each experiment is made explicit. A thorough discussion of the results follows, pointing out the drawbacks and benefits of the resulting system.

Chapter 2 Representing Movement Plans

Many control problems take place in continuous state-action spaces, where the control objective is often defined as finding a desired trajectory that reaches a particular goal state. The increased complexity, crucial for dexterous movement systems, renders common approaches such as inverse kinematics ill-defined and difficult to work with. Learning from observation is centrally motivated by the key fact that it is easier to show a desired motion than to fully specify it. Thus, motion learning by demonstration has started to gain increasing attention from the robotics research community over the past years. (Note that the terms motion learning and motion encoding are used interchangeably throughout this thesis.) Such formulations suggest that a robot would benefit from possessing a set of movement skills encoded as control policies (CPs), each of which defines a specific motion. Moreover, these movement encodings should exhibit a set of desirable properties, as enumerated in (Ijspeert et al. (2001a)). In particular, these properties are the ease and compactness of representation, re-usability and fast learning, robustness against external perturbations and changes in a dynamic environment, ease of modification for new tasks and ease of categorisation (Ijspeert et al. (2001b)).

Various methods for movement learning have been suggested in the related literature, some of which present great advantages. Nevertheless, most suffer from common difficulties. These methods include memorisation of entire trajectories, spline-based methods (Miyamoto et al. (1996)), recurrent neural networks and reinforcement learning (Sutton and Barto (1998)). Memorising trajectories is a relatively easy task, but it is difficult to reuse such formulations for new tasks and they are sensitive to external perturbations, as they depend explicitly on time. Spline-based methods, though more compact, have a similar problem. Recurrent neural networks and reinforcement learning avoid
explicit time indexing but suffer from computational complexity during training (Schaal (1997)). In this thesis we employ a movement imitation learning method based on nonlinear dynamical systems (Ijspeert et al. (2003)). Such a framework takes advantage of the direct relation between nonlinear differential equations and control policies in the context of robotic control.

In this chapter we introduce the movement imitation framework used in this project. Our framework is based on nonlinear dynamical systems, thus we start off by first introducing some general concepts regarding systems of differential equations, as well as their purpose and usage. Next we briefly present one dimensional linear systems and explain what can be represented by such simple systems, in order to familiarise the reader with the subject. Following that, we introduce the two dimensional linear dynamical systems formulation, their representational power, the general solution to such systems and the different qualitative behaviours such systems might exhibit. Afterwards, the movement imitation method employed is detailed for both the discrete movement and rhythmic movement formulations, as well as the addition of nonlinearity. Emphasis is given to the tunable parameters of both formulations and the effect such changes have when producing novel motion. Last, we demonstrate the major advantages of the nonlinear dynamical systems-based representation.

2.1 Dynamical Systems

Dynamical systems are part of a grander subject known as dynamics. It is a well established interdisciplinary area with abundant applications to the natural sciences and engineering, as a systematic way of explaining the behaviour of various interdependent mechanisms in the course of time. There are two main types of dynamical systems: differential equations and iterated maps (also referred to as difference equations). Iterated maps handle problems where time is discrete, something not suitable for our goals, thus we will focus only on the former.

A differential equation is a formula that describes how state variables evolve over time (Strogatz (1994)). In that way, given the current state of the system, or state vector, and the relation that describes the system's behaviour, we can iteratively predict its future states. Differential equations can be separated into ordinary and partial. Ordinary differential equations involve only ordinary derivatives with respect to one independent variable, time. Partial differential equations involve partial derivatives of the state variables and are
beyond the scope of our overview. Linear differential equations are equations where the state variables appear in linear combinations, as opposed to the nonlinear case where state variables appear in nonlinear combinations such as products, powers and functions. A differential equation is considered autonomous when it does not explicitly depend on time. It is common practice to make a system autonomous by treating time as an additional state variable. The order of a system is given by the number of state variables it involves; thus any nth-order nonautonomous system can be reformulated as an (n + 1)th-order autonomous system.

In general not all dynamical systems have analytical solutions, and solutions depend on the system's initial values. This way, analysing the behaviour of a dynamical system is the process of finding its fixed points (sometimes also referred to as equilibrium points), where all the derivatives are zero, and observing how solutions approach or move away from such points, represented as trajectories in the system's phase space, a Cartesian space of the system's state variables. Many off-the-shelf software packages exist that offer this kind of mathematical analysis, using numerical integration methods such as the Euler method, the Runge-Kutta method or other techniques.

2.1.1 First Order Linear Systems

The general formulation of a first order linear system is a single equation of the form ẋ = f(x), where f(x) can be any linear real-valued function of x, and ẋ denotes the time derivative of x. To find the fixed points of such a system we set f(x) = 0 and solve for x. The behaviour of first order linear systems is quite trivial. Solutions can only decay exponentially toward stable fixed points or grow exponentially away from unstable fixed points. The sign of f′(x) at a fixed point determines its behaviour: f′(x) < 0 defines a stable fixed point, where a small perturbation about it decays exponentially, and f′(x) > 0 defines an unstable fixed point, where a small perturbation about it grows exponentially. A special case is a half-stable point that attracts solutions approaching it from one side and repels solutions that approach it from the opposite side.
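The behaviour described above is easy to verify numerically. The following minimal sketch, added here for illustration and not part of the original thesis, forward-Euler integrates the first order linear system ẋ = −a x (stable) and ẋ = a x (unstable); the constants are arbitrary.

```python
import numpy as np

def simulate(f, x0, dt=0.001, steps=5000):
    # Forward-Euler integration of a one-dimensional system x_dot = f(x).
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + dt * f(xs[-1]))
    return np.array(xs)

a = 2.0
stable = simulate(lambda x: -a * x, x0=1.5)     # f'(x) = -a < 0: decays towards the fixed point x = 0
unstable = simulate(lambda x: a * x, x0=1e-3)   # f'(x) = +a > 0: grows away from the fixed point x = 0

print(stable[-1], unstable[-1])
```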

2.1.2 Second Order Linear Systems

A second order linear system is of the form

ẋ = ax + by
ẏ = cx + dy.

Such a system can be written more compactly in matrix notation as ẋ = Ax, where A = (a, b; c, d) and x = (x, y)ᵀ. It is obvious that ẋ = 0 when x = 0, and this is always a fixed point for any choice of A. The solutions of this system can be visualised as trajectories moving on the (x, y) phase plane. For any matrix A we would like to find the straight-line trajectory solutions of the system. That is, we seek solutions of the form

x(t) = e^(λt) v,

where v is a nonzero fixed vector and λ is a growth rate, both to be determined. Substituting x(t) = e^(λt) v into ẋ = Ax yields λ e^(λt) v = e^(λt) A v, i.e. Av = λv, which shows that the desired solution exists if v is an eigenvector of A with corresponding eigenvalue λ. For the matrix A the characteristic equation is

det(A − λI) = 0,

which yields λ² − τλ + Δ = 0, where τ = trace(A) = a + d and Δ = det(A) = ad − bc. Then

λ₁,₂ = (τ ± √(τ² − 4Δ)) / 2
are the solutions to the quadratic problem. This observation points out that the solutions depend on the trace and determinant of the matrix A. Their positivity, negativity or equality to zero gives rise to a number of interesting behaviours. Depending on these values, the phase portrait of a second order dynamical system can contain stable and unstable nodes and spirals, centres, saddle points, stars, degenerate nodes and non-isolated fixed points. The characteristics of all these cases are not relevant to our goal, except that of stable nodes.

A stable node solution occurs when a number of constraints hold. First, the determinant of A is greater than zero, hence both eigenvalues are real with the same sign. Second, the trace of A is less than zero, thus both eigenvalues have negative signs, that is, the node is attracting. Third, τ² − 4Δ > 0, which ensures that the eigenvalues have no imaginary part, a case that would lead to a spiral. A stable node solution like the one sketched above guarantees that every starting point of the system results in a trajectory that exponentially decays to the attractor point. The importance of this property will become apparent in the next section, where we present the dynamical systems based motion learning framework, according to which the goal state of a movement is encoded as an asymptotically stable universal attractor (Figure 2.1(a)).

Figure 2.1: Different phase portraits. (a) Stable node (attractor), (b) unstable spiral, (c) saddle point.

2.2 Control Policies from Dynamical Systems

In this section we present the dynamical systems framework used throughout this project for movement learning. In a general form, the problem of learning motions
can be formulated as learning a function f, u = f(x, t), that maps a state x of the controlled system at time t to a desired action u. Such an atomic state-action representation can be formulated as a reinforcement learning problem (Sutton and Barto (1998)); nonetheless, high dimensionality and continuous time give rise to many problems with such approaches. The alternative for achieving fast and robust movement learning is to use a higher level representation to encode control policies. This way the relationship between dynamical systems and our goal of motion learning should be apparent, as through dynamical systems we can encode the state evolution of a controlled system.

Considering a learning scenario, our aim is to encode a goal state as a point attractor in a dynamical system's phase portrait. Moreover, we should be able to learn a particular trajectory to such an attractor, as this would result in an accurate representation of a demonstrated motion. In demonstration learning, movement data is acquired as kinematic trajectories. This way our controlled state variables refer to joint positions, velocities and accelerations. The motions to be learned can be divided into discrete and rhythmic motions. Thus there are two relatively similar dynamical systems-based implementations, one for each case. The general methodology was initially introduced in (Ijspeert et al. (2001a)) and (Ijspeert et al. (2001b)); later, various improvements were added, resulting in greater simplifications in (Ijspeert et al. (2002)) and (Ijspeert et al. (2003)). The implementation used for the purposes of this thesis, as well as the one presented below, is the most recent and even more simplified formulation (the core implementation is publicly available, although some alterations were made for our purposes), without loss of any merits.

The approach followed in both cases is surprisingly simple. The key idea around which the framework is built is that we can have a simple first order system that guarantees monotonic convergence to a target state, as explained earlier. The crucial step to take next is to add nonlinearity to such a system, in order to be able to encode a specific trajectory, and also to be able to learn such nonlinearity from demonstration. Unfortunately, nonlinear dynamical systems are in general too complicated to serve such a purpose, so a simpler approach is realised. Nonlinearisation can be achieved by coupling the monotonically decaying system (also referred to as the canonical dynamical system) defined earlier with another second order linear dynamical system that possesses the same well defined properties but is modulated by a nonlinear function. The nonlinearity to be added can then be learned via a function approximation method. Next we present
in detail the discrete and rhythmic formulations. The first system is a discrete control policy (CP) learner, which basically shapes an attractor landscape in accordance with a demonstrated trajectory. The second system is an extension of the first, learning CPs for representing rhythmic movements.

2.2.1 Discrete Motion

The canonical first order dynamical system (x) has the form

τ ẋ = −α_x x,

where α_x is a time constant, τ is a temporal scaling factor and x can be considered an internal state variable of the overall system (here the overall system refers to the set of differential equations that encode a single control policy). This system exhibits a trivial exponential convergence to zero regardless of its initial position. The transformation system (z, y), also referred to as the trajectory plan dynamics, takes the form

τ ż = α_z(β_z(g − y) − z) + (g − y_0) f
τ ẏ = z,

where g is a known goal state, y_0 is the initial position at the time the movement is started, α_z and β_z are time constants, τ is a temporal scaling factor, and y, ẏ and ÿ correspond to the desired position, velocity and acceleration generated by the policy as a movement plan. The nonlinear function f has the form

f(x) = Σ_{i=1}^N ψ_i w_i x / Σ_{i=1}^N ψ_i,  where ψ_i = exp(−h_i (x − c_i)²).

Given a demonstrated motion y_demo, ẏ_demo, ÿ_demo of duration T, learning the desired trajectory can be formulated as learning

f_target = τ ẏ_demo − z_demo,

where z_demo is obtained through integration of the transformation system with y_demo. The corresponding start and end positions are given by y_demo(0) and y_demo(T) respectively. This way motion learning results in a function approximation problem, which is easily addressed with a nonparametric regression technique from locally weighted learning (LWL) (Schaal and Atkeson (1998)).

Discrete CP parameters and movement execution

The tunable parameters of the discrete CP are the initial and goal positions, y_0 and g, and the temporal scaling factor τ, which can be regarded as the duration of the motion. Changing the start position of the CP corresponds to starting from a different position in the kinematic space of the controlled plant. A change of g results in a scaling of the attractor in phase space and a different goal state in kinematic space. Altering τ produces faster or slower reproduction of the learned movement. An important attribute of the CP is that modification of the aforementioned parameters does not affect the topological properties of the attractor landscape, as explained in detail later. Triggering a motion from a learned CP is as simple as specifying y_0, g, τ and setting x to 1.

Figure 2.2: Learning a minimum jerk trajectory with a discrete CP. (a) Position, (b) velocity, (c) canonical dynamics, (d) LWL kernels.
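To make the learn-and-execute cycle of the discrete CP concrete, the following self-contained sketch (added for illustration; it is not the thesis code, which builds on Ijspeert's implementation) learns a one-dimensional policy from a minimum-jerk demonstration and replays it with a new goal and duration. It follows one common variant of the discrete formulation; the constants, the kernel-placement heuristic and the exact form of the forcing term are assumptions of this sketch rather than the thesis's equations.

```python
import numpy as np

# Demonstration: a one-dimensional minimum-jerk reach from 0 to 1 over T = 1 s.
dt, T = 0.005, 1.0
t = np.arange(0.0, T, dt)
s = t / T
y_demo = 10 * s**3 - 15 * s**4 + 6 * s**5
yd_demo = np.gradient(y_demo, dt)
ydd_demo = np.gradient(yd_demo, dt)

tau, alpha_x, alpha_z, beta_z, N = T, 2.0, 25.0, 6.25, 20
y0, g = y_demo[0], y_demo[-1]

x = np.exp(-alpha_x * t / tau)                    # canonical state over the demonstration
c = np.exp(-alpha_x * np.linspace(0.0, 1.0, N))   # kernel centres in canonical-state space
h = N**1.5 / c                                    # kernel widths (a common heuristic, assumed here)

# Forcing term needed so that the transformation system reproduces the demonstration
# (variant with the nonlinearity entering the z equation, without the (g - y0) scaling).
f_target = tau**2 * ydd_demo - alpha_z * (beta_z * (g - y_demo) - tau * yd_demo)

# One weight per kernel via locally weighted least squares, with x as the regressor.
psi = np.exp(-h[None, :] * (x[:, None] - c[None, :])**2)          # (time, kernels)
w = (psi * (x * f_target)[:, None]).sum(0) / ((psi * (x**2)[:, None]).sum(0) + 1e-12)

def rollout(g_new, tau_new, y_start=0.0):
    # Euler-integrate the coupled canonical and transformation systems.
    y, z, xc = y_start, 0.0, 1.0
    out = []
    for _ in range(int(tau_new / dt)):
        psi_c = np.exp(-h * (xc - c)**2)
        f = (psi_c * w).sum() / (psi_c.sum() + 1e-12) * xc
        z += dt * (alpha_z * (beta_z * (g_new - y) - z) + f) / tau_new
        y += dt * z / tau_new
        xc += dt * (-alpha_x * xc) / tau_new
        out.append(y)
    return np.array(out)

y_repro = rollout(g_new=g, tau_new=tau)     # replays the demonstrated reach
y_new = rollout(g_new=0.5, tau_new=2.0)     # same policy with a new goal and duration
print(abs(y_repro[-1] - g), abs(y_new[-1] - 0.5))
```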

2.2.2 Rhythmic Motion

The rhythmic CP formulation is very similar to the discrete one presented earlier, with the basic difference that periodicity is added. The canonical dynamics take the form

τ φ̇ = ω,

with ω as the phase velocity. The transformation system is

τ ż = α_z(β_z(y_m − y) − z) + A f
τ ẏ = z,

where y_m is the baseline of the oscillation, which can be regarded as an anchor point for the trajectory, and A, premultiplying the nonlinearity, is the overall amplitude of the movement. The nonlinear function f becomes

f(φ) = Σ_{i=1}^N ψ_i w_i / Σ_{i=1}^N ψ_i,  where ψ_i = exp(h_i (cos(φ − c_i) − 1)).

The course followed in order to learn the demonstrated rhythmic motion is exactly the same as in the discrete CP case.

Rhythmic CP parameters and movement execution

The parameters of a learned rhythmic CP are the baseline of oscillation y_m, the system's amplitude A and the period of the motion, τ. Modification of y_m shifts the encoded trajectory in phase space, while modification of A results in scaling of the magnitude of oscillation. Changing the period results in faster or slower oscillations. As with the discrete CP, changing these parameters does not affect the topology of the attractor.

2.3 Benefits of Movement Primitives

Both systems presented above simplify the learning and representation of kinematic trajectories. This framework provides a compact representation which is easy to reuse and manipulate. Moreover, it allows for rapid learning in either batch or online fashion. Such a high level representation can handle strong external perturbations. In particular, online modification of a learned motion is possible as time is treated as a state
variable. A simple error feedback between the actual and predicted motion can act as a perceptual cue to gradually slow down, or even halt, the system for the duration of the external perturbation. When the perturbation ceases the system can gradually resume performing a time-delayed version of the learned motion. This feature can be regarded as a major advantage for the control of compliant robots that often interact with humans.

Figure 2.3: Learning a rhythmic trajectory with a rhythmic CP; the motion is repeated for two periods. (a) Position, (b) velocity, (c) acceleration, (d) LWL kernels.

Additionally, the dynamical systems based representation exhibits spatial and temporal invariance for both discrete and rhythmic CPs. Scaling of the goal for a discrete CP or of the amplitude for a rhythmic CP does not affect the topology of the attractor landscape. Accordingly, the period and duration of a learned motion are directly determined by τ, leading to the same behaviour. This means that modification of any of the aforementioned parameters in both formulations does not affect the qualitative shape of the kinematic trajectory. Such a property allows for reusing CPs in novel
conditions, allowing for the generalisation of learned motions in different regions of the controlled system's workspace.

Last, following from the topological and spatial invariance, learning similar movements results in similar parameters w_i in the nonlinearity function f (assuming an equal number of kernels). This property allows for easy movement recognition and classification, as demonstrated in (Ijspeert et al. (2003)).

On the other hand, this approach suffers from a serious drawback: learning gets increasingly difficult as the complexity of the controlled plant rises. The basic defect that results in such behaviour is that each degree of freedom of the plant has to be controlled by an independent CP. The lack of coupling between individual CPs leads to a difficulty in generalising in complex setups, as uniform scaling of all the CPs does not necessarily result in a scaling of the learned motion. This problem of scaling up suggests that combining a nonlinear dimensionality reduction technique with the dynamical systems learning framework should be beneficial in the motion representation context.

The next chapter deals with the dimensionality reduction method employed. It begins with the general motivation behind such an approach and continues with the introduction of a simple dimensionality reduction model. Starting from this simple model and with a step-by-step addition of complexity we reach the final nonlinear model formulation used.

Chapter 3 Nonlinear Dimensionality Reduction

Movement learning in robotics involves the investigation of high dimensional datasets. It is often the case that movement systems possess a large number of degrees of freedom, which results in multidimensional datapoints when considering motion in configuration space. Typically a robotic arm from the shoulder down has nine degrees of freedom, leaving aside the extra dimensions that a hand introduces. A humanoid robot usually has far more than sixty, depending on the robot's design, leading to a high dimensional representation of movement data. Exponentially many datapoints are needed in order to correctly understand the structure of a high dimensional dataset, as implied by the curse of dimensionality (Scott (1992)). In practice, though, we find that many datasets of interest, while seemingly high dimensional, have a much lower intrinsic dimensionality. More specifically, many researchers across various domains like vision, speech, motor control, climate patterns, human gene distributions, and a range of other physical and biological sciences, have reported evidence that supports the fact that the true intrinsic dimensionality of high dimensional data is often very low (Tenenbaum et al. (2000), Roweis and Saul (2000), Vlassis et al. (2002)). Such evidence can be interpreted as the coherent structure the physical world possesses, expressed in terms of strong correlations between the variables that describe the world's state at a particular moment in time (Vijayakumar et al. (2005)). One example from research on human motor control is that despite the fact that humans can accomplish motions in almost any arbitrary way, in other words generating arbitrary distributions of the variables that define their motions, behavioural research has discovered a tremendous amount of regularity within and across individuals, leading to locally low dimensional data distributions (Kawato (1999), Schaal and Sternad (2001)).
Dimensionality reduction is the process of reducing the number of random variables under consideration. It is a widely used approach when dealing with high dimensional datasets from various sources. In this chapter we present the dimensionality reduction method used throughout this project. First we briefly introduce Principal Component Analysis (PCA) as the most commonly used, simple and linear dimensionality reduction method. We then proceed to the formulation of a probabilistic model of PCA (PPCA), in the context of which the basic aspects of latent variable models are discussed. Next we show how the PPCA model can give rise to nonlinear dimensionality reduction when combined with the Gaussian Processes framework, which is also introduced. This combination leads, lastly, to the Gaussian Process Latent Variable Model, which we explain in detail, underlining its main advantageous properties as well as its limitations.

3.1 Principal Component Analysis

PCA is the most widely used linear dimensionality reduction method (Jolliffe (2002)). The main assumption behind its working is that the high dimensional data that we wish to represent lies close to a linear hyperplane. What we are then interested in is finding the vectors, sometimes referred to as the basis set, that span this low dimensional hyperplane close to which the data is concentrated.

Assuming a centred dataset Y = [y_1, ..., y_N]ᵀ of dimensionality D and a low dimensional representation X = [x_1, ..., x_N]ᵀ of dimensionality q, where D ≫ q, we can formulate their relationship as a linear model of the form

y_n = W x_n,

where the linear mapping is given by the matrix W ∈ R^(D×q). For the projection matrix W, q is the number of vectors that approximately span the subspace that the data Y lie close to. The solution for finding these vectors is given by the eigendecomposition of the covariance matrix S of the dataset. The q eigenvectors that correspond to the largest eigenvalues of the covariance matrix S yield the optimal, in the sense of minimal reconstruction error, lower dimensional representation. Furthermore, such a solution leads to orthonormal vectors from which the mapping matrix is composed; thus the low
dimensional representation is of the form

x_n = Wᵀ y_n,

where the approximate reconstruction of the datapoint y_n is

ỹ_n = W x_n.

PCA can be viewed as a change of the coordinate system to a suitable position described by the mean of the dataset (µ), which is zero for the case considered above as we assume a centred dataset, and the choice of a novel, small number of axes governed by the variance of each dimension of the high dimensional dataset. In this way we preserve the dimensions of high variance, translated as large eigenvalues corresponding to the eigenvectors we project the data onto, and discard the remaining dimensions with lower variance. This leads to an approximation error proportional to the variance along the discarded eigenvectors. In effect, the suitability of PCA for dimensionality reduction of a given dataset can be easily confirmed by considering the eigenvalues of the eigendecomposition of the covariance matrix S. Ideally we hope for q distinctively large eigenvalues and D − q small values. In other words, the dominant projection directions should approximately capture the complete variance of the dataset in consideration, while the directions discarded can be interpreted as noise.

In the next section we present the Probabilistic PCA model as a means of introducing the reader to latent variable model formulations. Furthermore we present the dual formulation of PPCA, crucial for the transition to the GPLVM method presented later.

3.2 Probabilistic PCA

A latent variable model can be specified as a model relating a set of latent variables, X ∈ R^(N×q), to a set of observed variables, Y ∈ R^(N×D), through a set of parameters. Such a model has a probabilistic definition, in accordance with the Bayesian perspective, and the standard approach for fitting such a latent variable model is to marginalize over the latent variables and maximize the model's likelihood by optimizing the parameters.
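Before moving to the probabilistic treatment, the following minimal sketch (added for illustration, not part of the thesis; the synthetic data and dimensions are arbitrary) makes the PCA projection and reconstruction of Section 3.1 concrete.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: D = 10 dimensional points generated from q = 2 underlying directions.
N, D, q = 200, 10, 2
latent = rng.normal(size=(N, q))
W_true = rng.normal(size=(D, q))
Y = latent @ W_true.T + 0.05 * rng.normal(size=(N, D))

Y_centred = Y - Y.mean(axis=0)
S = np.cov(Y_centred, rowvar=False)           # D x D covariance matrix
eigvals, eigvecs = np.linalg.eigh(S)          # eigendecomposition (ascending order)
order = np.argsort(eigvals)[::-1]
W = eigvecs[:, order[:q]]                     # basis of the principal subspace

X = Y_centred @ W                             # projection  x_n = W^T y_n
Y_rec = X @ W.T                               # reconstruction  y_n ~ W x_n
err = np.mean(np.sum((Y_centred - Y_rec)**2, axis=1))
print("eigenvalue spectrum:", np.round(eigvals[order], 3))
print("mean reconstruction error:", err)      # roughly the sum of the discarded eigenvalues
```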

Figure 3.1: Graphical representation of a latent variable model. The white shaded X node represents the latent variables and the grey shaded Y node represents the observed variables. Note that parameters are omitted.

3.2.1 Standard Model Fitting Approach

Probabilistic PCA (PPCA) can be viewed as a simple latent variable model, where the latent space X = [x_1, ..., x_N]ᵀ is assumed to be related to the centred data set Y = [y_1, ..., y_N]ᵀ through a linear mapping that is corrupted with noise. Such a relationship between the latent variable x_n of dimensionality q and each datapoint is of the form

y_n = W x_n + η_n,

where the matrix W ∈ R^(D×q) specifies the linear relationship between the latent and the observed spaces of dimensionality q and D respectively. The noise term η_n ∈ R^D is a vector of values sampled independently from a spherical Gaussian distribution of zero mean and β⁻¹I covariance, where the parameter β is an inverse variance and is referred to as a precision, p(η_n) = N(η_n | 0, β⁻¹I). The likelihood of a datapoint can then be written as

p(y_n | x_n, W, β) = N(y_n | W x_n, β⁻¹I).    (3.1)

By integrating over the latent variables we obtain the marginal likelihood,

p(y_n | W, β) = ∫ p(y_n | x_n, W, β) p(x_n) dx_n,    (3.2)

where we are required to specify a prior distribution over x_n. For probabilistic PCA the appropriate prior is a unit covariance, zero mean Gaussian distribution,

p(x_n) = N(x_n | 0, I).
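The generative reading of this model is easy to exercise numerically. The short sketch below (an added illustration with arbitrary dimensions, not part of the thesis) samples from the PPCA model just defined and checks the sample covariance of the data against the marginal covariance derived in the following paragraphs.

```python
import numpy as np

rng = np.random.default_rng(1)
D, q, N, beta = 5, 2, 5000, 100.0       # beta is the noise precision (an inverse variance)
W = rng.normal(size=(D, q))

# Generative model of this section: x_n ~ N(0, I),  y_n = W x_n + eta_n,  eta_n ~ N(0, beta^-1 I).
X = rng.normal(size=(N, q))
noise = rng.normal(scale=np.sqrt(1.0 / beta), size=(N, D))
Y = X @ W.T + noise

# The sample covariance of the y_n approaches W W^T + beta^-1 I, the marginal
# covariance that the marginalisation below yields.
emp_cov = np.cov(Y, rowvar=False)
model_cov = W @ W.T + np.eye(D) / beta
print(np.max(np.abs(emp_cov - model_cov)))  # small for large N
```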

Thus the marginal likelihood for each datapoint y_n can be found analytically through the marginalization in 3.2 as

p(y_n | W, β) = N(y_n | 0, WWᵀ + β⁻¹I).

By assuming independence across datapoints, the likelihood of the full dataset can be written as

p(Y | W, β) = ∏_{n=1}^N p(y_n | W, β).    (3.3)

Maximization of 3.3 yields the optimized parameter set W. Moreover, Tipping and Bishop (1999) have demonstrated that there is also an analytic solution to this maximization. Such a solution is recovered when the matrix W spans the principal subspace of the data, resulting in the interpretation of the probabilistic version of PCA that was presented earlier.

The method presented above is the standard approach for fitting latent variable models by marginalizing the latent variables and optimizing the parameters via maximum likelihood. An alternative approach was proposed in (Lawrence (2004)) as a dual method where parameters are marginalized and latent variables are optimized, in contrast to the marginalization of latent variables and optimization of parameters presented above. In the next section we present this dual approach and how its nonlinearization can arise in combination with the Gaussian Processes framework.

3.2.2 Dual Model Fitting Approach

In this section we present an alternative approach to fitting the latent variable model of PPCA as presented previously. This formulation can be viewed as the dual of the one introduced earlier, sharing obvious similarities on the one hand, but on the other involving a critical change of strategy regarding which terms are optimized and which are marginalized. Thus the following model is termed Dual Probabilistic PCA (DPPCA) (Lawrence (2005)).

In accordance with the Bayesian framework, parameters such as W are considered random variables. Thus we are required to choose an appropriate prior for W and then continue to treat the parameters as random variables. A simple conjugate prior to 3.1 would be a spherical Gaussian distribution

p(W) = ∏_{i=1}^D N(w_i | 0, I),

where w_i is the ith row of the matrix W. Marginalizing over both W and X = [x_1, ..., x_N]ᵀ is intractable, so we are presented with a dilemma of what to marginalize over. In the
previous approach we have chosen to marginalize over X ∈ R^(N×q), as typically it would be of larger dimensionality than W ∈ R^(D×q). In this approach we choose its counterpart, as in practice both approaches are equivalent. Our choice of a conjugate prior earlier makes the marginalization of W straightforward. The marginalized likelihood is of the form

p(Y | X, β) = ∏_{d=1}^D p(y_{:,d} | X, β),

where we use y_{:,d} to represent the dth column of Y and

p(y_{:,d} | X, β) = N(y_{:,d} | 0, XXᵀ + β⁻¹I).

Following this, we optimize with respect to the latent variables, using as an objective function the log-likelihood

L = −(DN/2) ln(2π) − (D/2) ln|K| − (1/2) tr(K⁻¹YYᵀ),    (3.4)

where K = XXᵀ + β⁻¹I. The gradients of 3.4 with respect to X can be found from (Magnus and Neudecker (1999)) as

∂L/∂X = K⁻¹YYᵀK⁻¹X − DK⁻¹X,

while a fixed point where the gradients are zero is then given by

(1/D) YYᵀK⁻¹X = X.

It can be shown that the values of X that maximize the log-likelihood have an interpretation equivalent to the classical PCA presented earlier. Nevertheless, the DPPCA formulation is crucial for the development of the nonlinear latent variable model that follows.

3.3 Gaussian Processes

Gaussian processes are a class of probabilistic models which specify distributions over function spaces (Williams (1999), Rasmussen and Williams (2006), MacKay (1998), O'Hagan (1992), Bishop (2006)). The question that naturally arises is how we can manipulate a function, which is an infinite dimensional object. A Gaussian process is a generalization of the Gaussian probability distribution. Whereas a probability distribution describes random variables which are either scalars
or vectors, a Gaussian process governs the properties of a distribution over functions. Thus, as a Gaussian distribution is parametrized by a mean and a variance, a Gaussian process is parametrized by a mean and a covariance function. The key point that makes such manipulations tractable is that we focus on the space where the function is instantiated. Typically the mean is taken to be zero or another constant value, while the covariance function, also referred to as a kernel, is necessarily constrained to produce positive definite matrices. In this framework the distribution over the instantiations is taken to be Gaussian, and a Gaussian process prior is required.

Figure 3.2: Graphical representation of (left) the standard probabilistic PCA model and (right) its dual representation. The nodes are shaded to represent different treatments. Black shaded nodes are optimized, white shaded nodes are marginalized and grey shaded nodes are observed variables (Lawrence (2005)).

Following directly from the formulation of DPPCA presented earlier, a Gaussian process prior over the space of functions that are linear with Gaussian noise of variance β⁻¹ is given by a kernel of the form

k(x_i, x_j) = x_iᵀ x_j + β⁻¹ δ_ij,

where x_i, x_j are vectors from the space of inputs to the function and δ_ij is the Kronecker delta. Furthermore, if these inputs were taken from the embedding matrix X and the covariance function was evaluated at each of the N latent space points, we would recover a covariance matrix of the form

K = XXᵀ + β⁻¹I,

where the element at the ith row and jth column of K is given by the previous formula. This can be immediately recognized as the covariance associated with the DPPCA model. In this way the marginal likelihood of DPPCA can be viewed as a product of D independent Gaussian processes, which points out the relation between DPPCA and the Gaussian process latent variable model. In the same context DPPCA can be interpreted
as a special case of the GPLVM class of models. The crucial step to take next, as naturally follows, is to introduce a nonlinear covariance function so as to nonlinearize the mapping from the low dimensional embedding to the high dimensional space.

3.4 The Gaussian Process Latent Variable Model

In the previous sections we have presented how a simple dimensionality reduction method can be formulated as a probabilistic model. Furthermore, we have shown how such a model can be fitted to a dataset through two equivalent approaches and how the DPPCA formulation can be combined with the Gaussian processes framework by extending the model with a Gaussian process mapping from the latent space X to the observed data space Y. This key step points to the Gaussian process latent variable models class, which defines a powerful probabilistic, fully generative, nonlinear class of models.

There exists a variety of nonlinear kernels (equivalent to the class of Mercer kernels) that can be used. The constraint when selecting a kernel is that it produces a positive definite matrix, which ensures that the gradients of the kernel with respect to the latent points can be computed. A common choice of kernel is the Radial Basis Function (RBF) kernel, also used throughout this thesis. The resulting model has no analytical closed form solution and its nonlinear nature is sure to yield multiple local optima.

Figure 3.3: Samples from different kernels: (a) a linear kernel, (b) and (c) RBF kernels with different parameter settings (adapted from Lawrence (2006)).
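In the spirit of Figure 3.3, the following minimal sketch (added for illustration, not part of the thesis; the parameter values are arbitrary) draws sample functions from Gaussian process priors with the linear kernel of the previous section and with an RBF kernel.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(-1.0, 1.0, 100)[:, None]    # one-dimensional input locations
beta = 1e4                                  # large precision, i.e. almost noise-free samples

def linear_kernel(X, alpha=1.0):
    # k(x_i, x_j) = alpha * x_i^T x_j, the linear kernel of Section 3.3
    return alpha * X @ X.T

def rbf_kernel(X, alpha=1.0, gamma=10.0):
    # k(x_i, x_j) = alpha * exp(-gamma/2 * ||x_i - x_j||^2)
    d2 = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
    return alpha * np.exp(-0.5 * gamma * d2)

for name, K in [("linear", linear_kernel(x)), ("rbf", rbf_kernel(x))]:
    K = K + np.eye(len(x)) / beta           # add the beta^-1 * delta_ij noise term
    samples = rng.multivariate_normal(np.zeros(len(x)), K, size=3)
    print(name, samples.shape)              # three function draws, evaluated at the 100 inputs
```

Samples from the linear kernel are straight lines through the origin, whereas the RBF kernel yields smooth nonlinear functions whose wiggliness is controlled by gamma.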

3.4.1 Fitting the nonlinear GPLVM

The RBF kernel prior takes the form

k_{n,m} = α exp(−(γ/2)(x_n − x_m)ᵀ(x_n − x_m)) + β⁻¹ δ_{nm},

where k_{n,m} is the element in the nth row and mth column of K, γ is a scale parameter and δ_{nm} denotes the Kronecker delta. The gradients of 3.4 with respect to the latent points can be found by combining

∂L/∂K = K⁻¹YYᵀK⁻¹ − DK⁻¹

with ∂K/∂x_{n,j} via the chain rule. A nonlinear optimizer such as Scaled Conjugate Gradients (SCG) (Moller (1993)) can then be used to obtain the latent variable representation of the data. Moreover, gradients with respect to the parameters of the kernel matrix may be computed and used in order to optimize X, α, γ and β. Again, note that here we have focused on the RBF kernel, but the approach presented above can be used with any kernel that respects the constraints mentioned.

3.5 Advantages of the GPLVM

When choosing a dimensionality reduction method, one is presented with an abundance of alternatives, as the subject of dimensionality reduction has been extensively studied yet still remains an active area of research. In this section we present the major advantages of the GPLVM that led to such a choice.

The GPLVM is a powerful, probabilistic, nonlinear, fully generative dimensionality reduction method. Its merits have been demonstrated in detail in the related bibliography. Such a model is robust against overfitting, even when presented with datasets that have fewer datapoints than dimensions (Lawrence and Quinonero-Candela (2006)). Moreover, it is strong in generalizing and in handling missing values, even with small datasets, while also providing a degree of certainty in the predictions made. In the RBF kernel case such a degree of certainty is high close to the training data and decays gracefully when moving, in latent space, away from the training set. Furthermore, the direct mapping from latent space to observed space serves to yield a fully generative model suitable for creating fantasy data. Last, the suitability of the GPLVM for handling multidimensional movement data has already been established in related research, as presented in the first chapter (Section 1.3).
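For concreteness, the sketch below (an added illustration, not the thesis implementation, which uses Neil Lawrence's code with SCG and analytic gradients) fits the GPLVM objective of Equation 3.4 with the RBF kernel above to a small toy dataset; the latent points are initialised with PCA and refined by a generic numerical optimizer, and the kernel hyperparameters are held fixed, whereas the full model would optimize them jointly.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
# Toy data: N points in D dimensions generated from a one-dimensional nonlinear manifold.
N, D, q = 60, 8, 2
angle = rng.uniform(0.0, 2.0 * np.pi, N)
base = np.stack([np.cos(angle), np.sin(angle)], axis=1)
Y = np.tanh(base @ rng.normal(size=(2, D))) + 0.05 * rng.normal(size=(N, D))
Y = Y - Y.mean(axis=0)

alpha, gamma, beta = 1.0, 1.0, 100.0        # RBF kernel hyperparameters, held fixed here

def rbf_K(X):
    d2 = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
    return alpha * np.exp(-0.5 * gamma * d2) + np.eye(len(X)) / beta

def neg_log_lik(X_flat):
    # Negative of  L = -DN/2 ln(2 pi) - D/2 ln|K| - 1/2 tr(K^-1 Y Y^T)  (constant dropped).
    K = rbf_K(X_flat.reshape(N, q))
    _, logdet = np.linalg.slogdet(K)
    return 0.5 * D * logdet + 0.5 * np.trace(np.linalg.solve(K, Y @ Y.T))

# Initialise the latent points with PCA (a common choice for the GPLVM), then optimize.
_, _, Vt = np.linalg.svd(Y, full_matrices=False)
X0 = Y @ Vt[:q].T
res = minimize(neg_log_lik, X0.ravel(), method="L-BFGS-B")
X_latent = res.x.reshape(N, q)
print("objective before / after:", neg_log_lik(X0.ravel()), res.fun)
```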

Nevertheless, the GPLVM presents some limitations. These include the lack of a direct mapping from observed to latent space, an attribute possessed by other nonlinear dimensionality reduction techniques such as the Laplacian Eigenmaps Latent Variable Model (Carreira-Perpinan and Lu (2007)). Note that there is an indirect approach to overcome this limitation, as presented in (Shon et al. (2006)), but it is beyond the scope of this thesis. Moreover, the GPLVM is sensitive to different initializations, as the log-likelihood that is optimized is a highly nonlinear function of the embeddings and the parameters, leading to multiple local optima. Approaches that address this limitation are initializations using PCA and Isomap (Lawrence (2005)). The computational cost of the model also has to be taken into account, as each gradient step requires an inverse of the kernel matrix, an O(N³) operation, rendering the model impractical for large datasets. Fortunately, the commonly used technique of sparsification, an iterative selection of a subset of the dataset, has been shown to yield good results (Grochow et al. (2004)).

3.6 Other Related Methods

Other methods exist that address both the problem of nonlinear data and the mapping from the low dimensional embedding to the observed space. Comparable to the GPLVM are Density Networks (MacKay (1995)), the Generative Topographic Mapping (GTM) (Bishop et al. (1998)) and the very recent Laplacian Eigenmaps Latent Variable Model (LELVM) (Carreira-Perpinan and Lu (2007)). For this project we have also experimented with the GTM and the LELVM, and here we present the reasons why we have ultimately chosen to use the GPLVM.

The GTM is a probabilistic nonlinear dimensionality reduction method, regarded by some as the probabilistic version of Kohonen's Self-Organizing Map. In this framework the latent space is sampled on a uniform grid and the model is fitted with the expectation-maximization (EM) algorithm. For our experiments we have used the Netlab implementation of the GTM. Several grid sizes and numbers of hidden nodes were tried, with good results. The main disadvantage is that the GTM's layout of latent points presents a characteristic gridding effect that disrupts the desirable continuity of the latent representations. Moreover, a model selection step would have to be introduced for choosing parameters.

The LELVM is a probabilistic method of nonlinear dimensionality reduction that
combines the advantages of spectral methods, in particular Laplacian Eigenmaps (Belkin and Niyogi (2003)), with those of latent variable models (Carreira-Perpinan and Lu (2007)). It is a novel model that has been shown to perform well with movement data. Moreover, the computational cost of fitting the model is very low, allowing for fast training. In our experiments the model performed as accurately as the GPLVM, while the latent space representations were continuous, thus suitable for learning, and the errors extremely low. In addition, as mentioned earlier, the LELVM provides a mapping from observed to latent space, something unavailable with the GPLVM, where the addition of new data requires retraining of the model. Nonetheless, the LELVM requires explicit setting of parameters which are crucial for its performance. This disadvantage, as with the GTM, would result in an extra parameter selection step, something not required for the GPLVM.

In the next chapter we present in detail the movement systems used to generate the high dimensional kinematic datasets in question. We explain every robotic setup in turn, along with the redundancy resolution approach employed in each case. Moreover, we formulate our motivation behind the selection of datasets that portray simple geometrical structures in task space.

Chapter 4

Robots and Datasets

Most applications of interest deal with complex movement systems. In general, controlling a plant that possesses multiple degrees of freedom is not a trivial task. Problems arise as a consequence of the number of degrees of freedom a system has, in relation to the degrees of freedom of its task space. Multiple degrees of freedom result in redundancy, a central problem for motion planning. Nonetheless, redundancy is a crucial characteristic of complex movement systems, as it allows for arbitrary trajectories and unconstrained movement.

In order to investigate the properties of the latent space representations of kinematic data obtained from complex and redundant systems we choose simple geometrical shape motions. Our hypothesis is that structure in the task space will be reflected in the latent space resulting from the GPLVM, even though only joint space data is presented to our system. Our aim is to be able to exploit such structure in latent space in order to generalise demonstrated motion in the vicinity of our observed data. Thus we choose simple geometrical structures in task space, such as grids, straight lines and circles, starting from a relatively simple redundant manipulator with three degrees of freedom and continuing with a six degree of freedom robotic arm. Later, human motion capture data is used in a similar context, demonstrating how such an approach scales to more complex movement systems.

In this chapter we present the robotic setups used to generate our experimental datasets. We start by introducing some general robot control concepts, crucial for understanding the tasks to follow. After that, we introduce the systems employed and how these systems are implemented in simulation. Then we present the redundancy resolution strategy followed for the generation of data in each different setup. Next, we explain in detail the structure of each dataset, as well as the purpose that each dataset

serves in the context of our experimental formulation.

4.1 Robot Control

The theory of robot control defines how to model and control a robotic setup. In this framework, robots are modeled as a series of rigid links of arbitrary length, connected by joints that allow angular movements. Such a simplification might seem severe, but it incurs no loss of generality and is equally valid for setups ranging from simple robotic arms to humanoids, as the latter can be regarded as complicated structures following the underlying skeletal structure of humans.

4.1.1 Kinematics

Kinematics refers to the geometrical and time based properties of motions. The variables of interest are positions, velocities and accelerations. Modeling requires the use of two different coordinate frames. The first is the joint angle space or configuration space, which is defined by the positions of the set of joints the controlled plant possesses. The second is the task space or global coordinate frame, which is defined by the position of a specific part of the robot, in most cases an end-effector of interest, in Cartesian space.

In general, from the roboticist's perspective, it is easier to work with task space representations, as understanding motions in, for example, three dimensional task space comes naturally. On the other hand, specifying motions in joint space is easier from the robot control point of view, as joint angles are the variables that can be directly manipulated. Simple trajectories in task space have complex representations in joint space, dependent on the plant's design, and accurate coordination of the degrees of freedom, in a continuous manner, is crucial for producing meaningful, in the sense of goal directed, movements.

Forward kinematics

The transformation from joint space to task space is defined as forward kinematics. Let x ∈ R^d be a target position in task space and θ ∈ R^q the corresponding position in joint space. For a redundant setup q > d, as the plant has more degrees of freedom than the space it operates in. Forward kinematics defines a unique mapping F from joint space

to task space, of the form x = F(θ). Defining the function F is a straightforward geometric transformation of intermediate coordinate frames using joint positions and link lengths, based on the plant's morphology.

Inverse kinematics

Moving in the opposite direction between coordinate frames is called inverse kinematics. It defines the mapping from task space to joint space and has the form θ = F^-1(x). Unfortunately, inverse kinematics is often an ill-defined problem, as the solutions that map a specific task space position to joint space are not unique, and possibly infinitely many, if they exist at all (Jordan and Rumelhart (1992)). Nonetheless, various approaches for inverse kinematics exist, as it is a central subject of robotics research. For the purposes of this thesis we have chosen to follow an iterative solution that solves the inverse kinematics problem via the optimisation of a cost function for our first robotic setup. This implementation is detailed in section 4.2.1. A well known symbolic solution was used for the second robotic arm, presented in section 4.3.1.

Figure 4.1: The relationship between joint space and task space via forward and inverse kinematics.

Trajectory formulation

The task of trajectory formulation is to define in a structured manner how a motion is performed in either joint or task space. Our interest lies in trajectories in task space, and, having already defined the task of inverse kinematics, trajectory planning defines the points in between a starting and an end position of an end-effector.

Many approaches for trajectory formation exist, most of which are based on the optimisation of a cost function. Examples of simple cost functions are shortest distance (Park and Brockett (1994)), minimum acceleration (Noakes et al. (1989)) and minimum time (Shiller (1994)). Higher level cost functions include minimum jerk (Flash and Hogan (1984)), minimum torque (Uno et al. (1989)) and minimum endpoint variance (Harris and Wolpert (1998)). It has been observed experimentally that humans perform motions similar to minimum jerk predicted movements (Wolpert et al. (1995)). Such an approach seeks to minimise the jerk, the third derivative of position, along the trajectory, resulting in bell-shaped velocity profiles and smooth, natural motion (Figure 4.2). The cost function can be mathematically formulated as

C_J = ∫_0^T ( d³θ(t)/dt³ )² dt,

where T is the duration of the motion. It has been shown in (Flash and Hogan (1984)) that a function with its sixth derivative equal to zero will minimise the jerk of a trajectory. The general solution is of the form

θ(t) = a_0 + a_1 t + a_2 t² + a_3 t³ + a_4 t⁴ + a_5 t⁵,

which, by considering known values of position, velocity and acceleration at the start and end point as well as the duration of movement, results in a solution of the form

θ(t) = θ_s + (θ_f − θ_s) ( 10t³/d³ − 15t⁴/d⁴ + 6t⁵/d⁵ ),

where θ_f and θ_s are the final and initial positions respectively and d is the duration of the movement. Minimum jerk trajectory planning depends only on the kinematics of the task and is independent of the physical structure of the controlled plant. For our purposes we have used minimum jerk straight line trajectories in task space as a means of representing parallel and reaching motions.
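For illustration, this profile can be evaluated in a few lines; the following NumPy sketch (variable names and values are illustrative, not those of our implementation) computes the minimum jerk polynomial for a single coordinate.

import numpy as np

def min_jerk(theta_s, theta_f, d, n_steps=100):
    """Minimum jerk profile from theta_s to theta_f over a duration d."""
    t = np.linspace(0.0, d, n_steps)
    s = t / d                                 # normalised time in [0, 1]
    shape = 10 * s**3 - 15 * s**4 + 6 * s**5  # zero velocity and acceleration at both ends
    return theta_s + (theta_f - theta_s) * shape

# Example: one coordinate moving from 0 to 45 degrees in 1 second.
trajectory = min_jerk(0.0, 45.0, 1.0)

Applying the same profile independently to each task space coordinate of a straight line between a start and an end point yields minimum jerk straight line trajectories of the kind used below.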

Figure 4.2: Exemplary minimum jerk profiles of a joint angle motion: (a) position, (b) velocity, (c) acceleration.

4.2 Planar Arm

We have chosen to start the data generation process with a relatively simple robotic manipulator. This choice serves the purpose of demonstrating the validity of our proposed approach first in a simple setup and then generalising to more complex systems.

The first movement system is a three link planar arm with equal link lengths. It possesses three degrees of freedom, which are the angle between the first link and the origin and the angles between the subsequent links. The first angle corresponds to the displacement between the first link and the x axis of the Cartesian space the robot operates in. The following two DoFs (degrees of freedom) are the angles between consecutive links with respect to the imaginary extension of the previous link. All joint angles are allowed to be both positive and negative. The joints can only rotate about the vertical z axis, thus defining a two dimensional workspace as a circle in the horizontal plane (Figure 4.3). Despite its simplicity, such a planar arm is a redundant system, as it possesses three degrees of freedom while its task space is constrained to be two dimensional. As a result, any target task space point that is not on the bounds of the

arm's reaching area has infinitely many inverse kinematics solutions, relative to the precision considered, given the continuity of the joint space. The problem has an analytical solution if we constrain the orientation with which the end-effector reaches a particular target, or constrain two of the three angles to be equal. The result would be a system of three equations and three unknowns that is easily soluble. This approach, however, leads to an oversimplification of the manipulator that results in trivial motion patterns. We employ a more flexible redundancy resolution method for inverse kinematics which is based on cost function minimisation and allows for multiple optimisation criteria. The next subsection presents our redundancy resolution method for the three link planar arm in detail.

Figure 4.3: The 3-DoF 3-link planar arm as visualised with the Robotics Toolbox, shown in (a) perspective and (b) top view. The blue planar circle defines the system's workspace.

4.2.1 Redundancy Resolution

The three link 3-DoF planar arm is composed of three links of equal length. The forward kinematics of this model is just the transformation from the origin of the first link through consecutive frames, up to the position of the end-effector, the third link's tip, in global coordinates. This is formulated as a series of translations and rotations which result in the form

x_1 = l_1 cos(θ_1) + l_2 cos(θ_1 + θ_2) + l_3 cos(θ_1 + θ_2 + θ_3)
x_2 = l_1 sin(θ_1) + l_2 sin(θ_1 + θ_2) + l_3 sin(θ_1 + θ_2 + θ_3),

where for our case l_1 = l_2 = l_3 = 1. (The measuring unit is not important, but by convention we can regard it as metres.) For inverse kinematics we have used an iterative solution method based on Resolved Motion Rate Control (RMRC) (Whitney (1969)). The key observation of the RMRC framework is that the forward kinematics function F can be linearly approximated when considered in small increments, with use of the Jacobian matrix J. Mathematically that is

Δx = J(θ) Δθ,

where J is defined as the matrix of partial derivatives of the task space positions with respect to the joint space angles. For our setup, J has the form

J(θ) = [ ∂x_1/∂θ_1  ∂x_1/∂θ_2  ∂x_1/∂θ_3
         ∂x_2/∂θ_1  ∂x_2/∂θ_2  ∂x_2/∂θ_3 ].

This way we can relate changes in joint angles to changes in task space positions. Thus, by making the assumption that the underlying function can be linearly approximated, we can calculate the inverse kinematics as

Δθ = J^+(θ) Δx,

where J^+ denotes the pseudoinverse of the Jacobian. Nonetheless, in this formulation singularities can occur that disrupt the iterative solution process. A way to overcome this difficulty is to constrain the joint space movements that align joint angles or reach the bounds of the considered joint space.

The procedure starts from a predefined point in joint space and calculates the forward kinematics in order to measure the distance to the target task space position. In each iteration, up to a predefined number of iterations, a movement in joint space of a small step size is made to minimise the task space error, until the target point is reached. Simultaneously, for every movement the gradient of an objective function is projected onto the null space of J, in order to move in joint space along curves that do not affect the task space position of the arm's end-effector but do minimise the objective function. As soon as the task space error between the current and desired position is minimised, the procedure continues by minimising the cost function. Null space movement is performed until an error threshold is reached or no further optimisation is possible.

The optimisation criterion we have used is the sum of squared joint angles, which seeks the solution with the minimum displacement from the zero configuration. Moreover, we have experimented with weighted sums, penalising a particular joint more than the others. The motivation for such a weighted cost is that in robotics it is often the case that actuators have different characteristics, so that one might prefer movement in a specific fashion. Additionally, we have also experimented with a predefined convenience pose, which is often the configuration of minimum strain for robotic setups, or any other criterion suitable for the task at hand.
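To make the procedure concrete, the following NumPy sketch implements the forward kinematics and Jacobian given above together with one RMRC-style update; the step sizes, gains and the simple squared-angle cost are illustrative choices rather than the exact settings of our implementation.

import numpy as np

def fk(theta, l=(1.0, 1.0, 1.0)):
    """Forward kinematics of the 3-link planar arm (end-effector position)."""
    l = np.asarray(l, dtype=float)
    c = np.cumsum(theta)                       # theta1, theta1+theta2, theta1+theta2+theta3
    return np.array([np.sum(l * np.cos(c)), np.sum(l * np.sin(c))])

def jacobian(theta, l=(1.0, 1.0, 1.0)):
    """Analytic 2x3 Jacobian of the planar arm forward kinematics."""
    l = np.asarray(l, dtype=float)
    c = np.cumsum(theta)
    J = np.zeros((2, 3))
    for j in range(3):
        J[0, j] = -np.sum(l[j:] * np.sin(c[j:]))   # d x_1 / d theta_j
        J[1, j] = np.sum(l[j:] * np.cos(c[j:]))    # d x_2 / d theta_j
    return J

def rmrc_step(theta, x_target, alpha=0.1, beta=0.01):
    """One RMRC update: a pseudoinverse step towards the target plus a
    null-space step along the negative gradient of the cost 0.5 * ||theta||^2."""
    J = jacobian(theta)
    J_pinv = np.linalg.pinv(J)
    dx = x_target - fk(theta)                  # task space error
    null_proj = np.eye(3) - J_pinv @ J         # projector onto the null space of J
    grad_cost = theta                          # gradient of the squared-angle cost
    return theta + alpha * J_pinv @ dx - beta * null_proj @ grad_cost

# Iterate until the end-effector is close enough to the target.
theta = np.array([0.3, 0.3, 0.3])
target = np.array([1.2, 1.5])
for _ in range(200):
    theta = rmrc_step(theta, target)

With a weighted cost the gradient would simply become W @ theta for a diagonal weighting matrix W, which is how the joint-specific penalisation mentioned above can be realised.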

4.2.2 Grid Data

Our initial experiments concerned grid datasets of distinct task space points. These datasets contain a set of 256 points, each sampled equidistantly from a planar grid in task space (Figure 4.4). The grid ranges from 0.5 to 2 in both the x and y directions, resulting in a square grid of points that are 0.1 apart from one another. These task space positions in Cartesian coordinates are then run through the inverse kinematics to produce the corresponding joint angle positions. Two datasets were created: the first corresponds to a uniform penalty cost function, while the second corresponds to a cost function that penalises the angle between the first link and the x axis four times more than the other two angles. Neither of these datasets corresponds to kinematic trajectories, and their experimental purpose is detailed in the next chapter.

Figure 4.4: Grid datapoints in task space.

4.2.3 Discrete Kinematic Data

Our next step in generating data with the planar arm was to produce discrete movement data. Two scenarios were employed. The first corresponds to parallel trajectories in the arm's task space, while the second corresponds to reaching movements in a radial fashion. Again the data was first represented in task space and then passed through

the inverse kinematics to obtain the corresponding joint space representation. In both discrete movement datasets the uniform angle penalisation cost function was used.

In particular, the first dataset consists of ten minimum jerk trajectories that are parallel to the x axis and are spaced equally. The start points lie on a line parallel to the y axis and the end points on a second parallel line (Figure 4.5(a)). The second dataset consists of reaching movements starting on a quarter-circle arc and reaching the angularly corresponding points of a second, concentric quarter-circle arc. Again these are minimum jerk trajectories, as shown in Figure 4.5(b). The trajectory on the x axis was not used, as it would result in a boundary configuration that presents difficulties for the inverse kinematics.

Figure 4.5: Discrete movements with the planar arm: (a) parallel straight line motions, (b) reaching motions.

4.2.4 Rhythmic Kinematic Data

As rhythmic movements we have chosen circular trajectories. The dataset consists of four circular trajectories of radii 0.2, 0.4, 0.6 and 0.8 respectively. The trajectories are homocentric and are sampled with the same number of points each. Their common centre is placed in the interior of the workspace, as shown in figure 4.6. The points on each circle are sampled in equal time steps and result in constant angular velocities for each closed motion. Again the points in task space are run through the inverse kinematics, and the corresponding joint angle datapoints are produced using the uniform angle penalisation cost function.
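As an indication of how such a rhythmic task space dataset can be produced, the following sketch samples homocentric, constant angular velocity circles; the centre and the number of samples are placeholders rather than the exact values used here, and each sampled point would subsequently be converted to joint angles with the inverse kinematics of section 4.2.1.

import numpy as np

def circle_trajectory(centre, radius, n_points):
    """Sample one circle at constant angular velocity (equal time steps)."""
    phase = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    x = centre[0] + radius * np.cos(phase)
    y = centre[1] + radius * np.sin(phase)
    return np.stack([x, y], axis=1)            # (n_points, 2) task space points

# Placeholder centre and sample count: four homocentric circles of increasing radius.
centre = (1.0, 1.0)
task_space_data = [circle_trajectory(centre, r, 100) for r in (0.2, 0.4, 0.6, 0.8)]
# Each task space point is then converted to joint angles with the iterative
# inverse kinematics sketched above.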

Figure 4.6: Circular constant velocity trajectories.

4.3 Puma Arm

The second robotic setup employed is the Unimation Puma-560 robotic arm manipulator. The Puma arm is a well known robotic manipulator with well defined kinematics and dynamics. A detailed model of the Puma arm is readily available in the Matlab Robotics Toolbox (Corke (1996)), implemented by Peter I. Corke, which we used for our simulations and the creation of the relevant datasets.

The Puma arm's morphology in the physical implementation comprises a fixed base trunk, a planar rotating shoulder, a vertically rotating upper arm and a vertically rotating forearm equipped with a planar and vertically rotating gripper module (Figure 4.7(a)). The corresponding simulation model is a series of three links of different lengths. The links are connected by rotary joints at the trunk-shoulder level, at the upper arm and forearm level, for the planar rotation of the forearm, and for the positioning of the gripper module (Figure 4.7(b)). The design results in a six degree of freedom manipulator moving in a three dimensional workspace.

Computation of the inverse kinematics of such a redundant manipulator is a difficult problem. Fortunately, the Puma arm is a widely used manipulator and its inverse kinematics are available in the toolbox employed. Below we present the workings of the inverse kinematics method used.

Figure 4.7: The Unimation Puma-560 robotic arm: (a) the Puma arm's morphology; (b) the simulation model of the Puma arm in the arm-ready position.

4.3.1 Redundancy Resolution

The inverse kinematics approach followed for the Puma arm is qualitatively different from the cost function based optimisation method used for the planar arm. For the Puma arm we have made use of a symbolic solution, specific to manipulators with spherical wrists, based on the homogeneous transformation representation (Paul and Zhang (1986)). This way, the orientation of the end-effector is taken into account and the multiple solutions that result are constrained to be mirror solutions. The mirror solutions correspond to elbow up or down, right or left handed and wrist flipped or not flipped configurations. We have chosen a right handed, elbow up and non-flipped wrist configuration for our experiments.

Mathematically, the joint angles θ_1, θ_2, θ_3, θ_4, θ_5, θ_6 are computed as follows, where we follow the notation presented in (Paul and Zhang (1986)). The parameters a_i, α_i and d_i are joint parameters defined by the robot's design, following the Denavit-Hartenberg conventions. O_x, O_y, O_z and A_x, A_y, A_z are the second and third columns of the homogeneous transformation of the given task space position, while p_x, p_y, p_z are the task space coordinates of the position in Cartesian space. Below we present the formulas that result in the configuration chosen. To find θ_1 we solve

r² = p_x² + p_y²
θ_1 = tan⁻¹(p_y / p_x) + sin⁻¹(d_3 / r),

and for θ_2 we compute

V_114 = p_x cos(θ_1) + p_y sin(θ_1)
r² = V_114² + p_z²
ψ = cos⁻¹( (a_2² − d_4² − a_3² + V_114² + p_z²) / (2 a_2 r) )
θ_2 = tan⁻¹( p_z / V_114 ) + ψ.

Following that, we compute

θ_3 = tan⁻¹( a_3 / d_4 ) − tan⁻¹( (cos(θ_2) V_114 + sin(θ_2) p_z − a_2) / (cos(θ_2) p_z − sin(θ_2) V_114) )

and

V_113 = cos(θ_1) A_x + sin(θ_1) A_y
V_323 = cos(θ_1) A_y − sin(θ_1) A_x
V_313 = cos(θ_2 + θ_3) V_113 + sin(θ_2 + θ_3) A_z
θ_4 = tan⁻¹( V_323 / V_313 ).

Then for θ_5 we have

θ_5 = tan⁻¹( (cos(θ_4) V_313 − sin(θ_4) V_323) / (V_113 sin(θ_2 + θ_3) + A_z cos(θ_2 + θ_3)) ).

Finally, for computing θ_6 we calculate

V_112 = cos(θ_1) O_x + sin(θ_1) O_y
V_132 = sin(θ_1) O_x − cos(θ_1) O_y
V_312 = V_112 cos(θ_2 + θ_3) + O_z sin(θ_2 + θ_3)
V_332 = −V_112 sin(θ_2 + θ_3) + O_z cos(θ_2 + θ_3)
V_412 = V_312 cos(θ_4) − V_132 sin(θ_4)
V_432 = V_312 sin(θ_4) + V_132 cos(θ_4)
θ_6 = tan⁻¹( (V_412 cos(θ_5) − V_332 sin(θ_5)) / V_432 ).

This configuration-based solution is fast and guaranteed to be unique. Nonetheless, such a solution also faces problems when close to the workspace bounds, as a result of

the inverse trigonometric functions that are used for the calculations. In such cases the inverse kinematics may result in complex number solutions, which subsequently cannot be handled by our later processing steps. Moreover, it has to be emphasised that such an analytical solution for the 6-DoF Puma arm is possible only through the definition of the orientation of the end-effector. Such an approach, though, does not constrain the motion of the robot and does not result in an oversimplification of the robot's kinematics. A drawback of this approach can be considered the DoF that controls the wrist's orientation in relation to the desired task space point. This DoF is directly affected by the configuration preference, resulting in a constant value of π for the configuration chosen.

4.3.2 Discrete Kinematic Data

In generating discrete movement data we have followed the same strategy as with the 3-DoF planar arm. For discrete motions we have generated parallel and reaching trajectories in the robot's task space. For the parallel motion case we have generated straight line minimum jerk trajectories that originate at one fixed value of x and end at another, spaced evenly and running perpendicular to the y axis, as shown in figure 4.8(a). For the reaching motions case the straight line minimum jerk trajectories originate, equally spaced, from an inner quarter-circle and terminate on an outer quarter-circle of radius 0.7 at the corresponding angular positions (Figure 4.8(b)). In both cases each kinematic trajectory consists of the same number of points, and the relevant datasets result from passing the task space positions through the arm's inverse kinematics. The datasets are six-dimensional, but the 4th dimension, which defines the wrist's orientation, is constant at π, defining a right-handed configuration. The time evolution in joint space of a trajectory from each dataset is depicted in figure 4.9.

4.3.3 Rhythmic Kinematic Data

From the Puma robotic arm we have generated two rhythmic motion datasets. The first dataset consists of four circular trajectories on the xy plane. Again the trajectories are homocentric, with four different radii, centred at (0.4, 0.4). The trajectories are sampled in equal time steps, resulting in constant angular velocity motions (Figure 4.10(a)). The second dataset consists of three scaled infinity-symbol trajectories (∞), originating at the same starting point in 3D space. The trajectories can be defined

mathematically with the following system of equations

x(t) = C
y(t) = A sin(πt)
z(t) = A sin(2πt),

where C is a predefined constant regulating the position of the trajectory along the x axis and A defines the amplitude of the resulting trajectory. We have recorded three trajectories of amplitude 0.1, 0.2 and 0.3, while C was set to 0.4. The results were rotated by π/4 to be positioned in the arm's workspace, as shown in figure 4.10(b).

Figure 4.8: Straight line minimum jerk trajectories in task space: (a) parallel movements, (b) reaching movements.

Figure 4.9: Kinematic trajectories of discrete movements from the Puma arm: (a) parallel motion, (b) reaching motion.

A special characteristic of this motion, apart from its periodicity, is that it contains

regions of high and low velocity, resulting in more complex kinematic trajectories. Again the joint space datasets were computed with the inverse kinematics. Example joint space trajectories from both datasets are presented in figure 4.11.

Figure 4.10: Rhythmic motion in task space: (a) circular trajectories, (b) infinity trajectories.

Figure 4.11: Kinematic trajectories of rhythmic movements from the Puma arm: (a) circular motion, (b) infinity motion.

4.4 Human Motion Capture

The last datasets employed correspond to human motion capture data. The data was obtained from the Carnegie Mellon University Graphics Lab and is available at the CMU Graphics Lab Motion Capture Database. The database consists of a collection

of motions from various human activities, modeled with a humanoid skeleton mannequin. The data is recorded with the use of markers, and the pose of the actor is obtained with a vision based approach. Such a methodology is very common for motion capture, as the non-intrusive nature of distinctive markers does not interfere with the motion of the subject and is easy to set up. Nonetheless, the resulting data is not noise-free and the quality largely depends on the optical hardware used.

For our purposes we have selected a dataset of a human boxing to serve as discrete movement data. For rhythmic motion we have selected two datasets, one of a human walking and another of a human running. All datasets consist of 62 dimensions, while the first three dimensions encode the position of the pelvis. The position of the pelvis serves as a root node for positioning the mannequin and does not correspond to joint angle data, thus these dimensions were not used for our experiments, resulting in 59-dimensional datasets. Snapshots from each dataset are depicted in figure 4.12.

Figure 4.12: Snapshots from the motion capture datasets: (a) boxing, (b) walking, (c) running.

The following chapter presents our experimental formulations. We start with the formulation of the general methodology followed and then examine each experiment specifically. The aim of the experiments is made explicit for each particular case,

detailing the setups and parametrisations used. Moreover, the experimental results are presented to support the efficiency of the proposed methodology.

Chapter 5

Experiments and Evaluations

In this chapter, we describe how the experiments were formulated and performed. We start by introducing the general methodology followed throughout the course of the project, as well as the metrics employed for the evaluation of the system. Next, each experimental scenario is detailed, along with the aim of each formulation, followed by the presentation of the experimental results.

We have divided the experiments according to the complexity of the controlled plants. Thus we start with the results from the planar arm and then continue with the Puma arm results. Finally, we present the human motion capture experiments, as the motion capture datasets are qualitatively different from the previous ones. Each section is further divided into discrete and rhythmic motion scenarios, as we aspire to underline the applicability of the methodology to both cases. We conclude with a discussion of the results from the different setups and of the overall evaluation of the methodology.

5.1 Overview of Methodology

This thesis aims to clarify whether the properties of the dynamical systems-based kinematic representation carry over when learning is performed in a lower dimensional space. Furthermore, we wish to demonstrate the suitability of the dimensionality reduction method employed to handle kinematic data, within the context of low dimensional learning. Another goal is to show that encoding kinematic trajectories is possible and highly accurate in the low dimensional manifold. Ideally, learning in the latent space would lead to negligible error between demonstrated and learned motion, even though learning is performed with a much smaller number of CPs (control policies).

The first step, as presented in the previous chapter, was to collect a series of kinematic datasets corresponding to motions that follow simple geometrical structures. We should remind the reader that such structure concerns only the task space of the system at hand, in other words it is explicit in the Cartesian space. Our hypothesis is that such structure is implicitly encoded in the joint space data derived through the inverse kinematics. Thus the question is whether the GPLVM would be able to distinguish such patterns in the high dimensional data and portray a relevant pattern in the latent space representation. We should clarify that by relevant we would hope for a stretched, skewed, rotated and translated pattern, as the mapping is nonlinear.

Having reduced the dimensionality of the data with the GPLVM, we investigate the low dimensional representations by visualising the latent space. Next, we use the dynamical systems-based movement representation to encode low dimensional control policies. We then modify the parameters of the low dimensional CPs and use them to generate novel latent space trajectories. The predicted low dimensional motions are then mapped back to the observed space, i.e. to joint angles. The resulting joint angles are run through the forward kinematics to visualise the task space motion that results from the overall process (Figure 5.1).

Figure 5.1: Diagram of the general methodology used for the experiments.

5.1.1 Quantitative Evaluation

For quantitatively evaluating our results we have made use of the Normalised Mean-Squared Error (nmse) criterion. Mathematically, nmse is calculated as

ε = (1 / (N σ_q²)) Σ_{i=1}^{N} (q_i − q̂_i)²,

where q̂_i is the prediction of the system and q_i is the ground truth for the evaluation at hand. This error criterion is advantageous in the movement systems context as it takes into account the variance of the data being predicted. This is suitable for joint data, as it is often the case that different joints have different offset ranges.

Furthermore, due to the complexity of the system, evaluation is performed at many different levels. Thus, in our experimental results we present the nmse between the original joint data and the predictions from the learned GPLVM model; we will refer to this error as the model error. Next, we measure the error between a demonstrated latent space trajectory and the resulting learned trajectory encoded as a CP, termed the learned latent trajectory error. Moreover, we measure the error between the original joint space data and a learned trajectory which is mapped back to joint space, referred to as the joint space error. Last, we measure the error in task space between ground truth data and trajectories that result from the complete path through our processing, as small errors in joint space are prone to be amplified when transformed to task space positions. We will subsequently refer to this measure as the task space error, which can be regarded as the overall process error.

5.2 Structure in the Latent Space

Our first experimental formulation investigates the morphology that the latent space has when the GPLVM is presented with joint data from inverse kinematics. Our question is whether there is a direct correspondence with the structure in task space, and whether we are able to exploit such observed structure in order to infer novel points in latent space via interpolation. This particular experiment does not involve any learning in the latent space, but serves the purpose of demonstrating the accuracy of interpolation between learned latent points, a consequence of the strong generalisation that the GPLVM possesses. The interpolation strategy followed is relatively simple: for predicting novel points in latent space we average over the adjacent latent point representations that result from the training set used. We start off with three-dimensional data derived from the planar arm and then move to the six-dimensional data produced with the Puma arm.
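Both the error criterion and the neighbour-averaging interpolation used in the following experiments are simple to state; a minimal sketch, with illustrative names rather than those of our implementation, is given below.

import numpy as np

def nmse(q_true, q_pred):
    """Normalised mean-squared error: the MSE divided by the variance of the ground truth."""
    q_true = np.asarray(q_true, dtype=float)
    q_pred = np.asarray(q_pred, dtype=float)
    return np.mean((q_true - q_pred) ** 2) / np.var(q_true)

def interpolate_missing(latent_prev, latent_next):
    """Predict a held-out latent point (or trajectory) by averaging its two
    adjacent latent representations, as in the grid and trajectory experiments."""
    return 0.5 * (np.asarray(latent_prev) + np.asarray(latent_next))

For multi-dimensional joint data the criterion would be applied per dimension and the resulting values averaged, so that joints with different ranges contribute comparably.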

5.2.1 Experiments with the Planar Arm

We have made use of the grid dataset presented in section 4.2.2, with a cost function of equal angle penalisation in the first case and quadruple penalisation for the first angle in the second case. The resulting 2-D latent space representations are presented in figure 5.2. Training was carried out with a horizontal line of points missing from the grid data. Then the missing line of points was inferred by simple averaging of the two adjacent points for each missing datum in latent space. The points resulting from averaging are depicted as blue x's in figure 5.2. The latent space data was then mapped back to joint space and subsequently to task space through the forward kinematics. The joint space nmse of the predicted points against the ground truth data is of the order of 10⁻5 for both cost functions, with correspondingly small task space errors (Figure 5.3).

Figure 5.2: 2-D latent space representation of the grid dataset: (a) uniform penalty cost function, (b) weighted penalty cost function. Red x's are training data and blue x's are predicted latent space positions.

The results indicate that the structure of the latent space is governed by the cost function used to perform the inverse kinematics mapping. This suggests that different cost functions shape the latent space differently. Nonetheless, we can accurately infer the latent positions of points in between our training data, offering an easy way to predict task space configurations. Another significant observation is that the inferred points follow the cost function used in creating the data. That is, the cost function is implicitly carried over into the latent space.

After our experimentation with the grid dataset we tried the same experiment with

Figure 5.3: Predictions mapped back to joint angles and task space positions through forward kinematics: (a) uniform penalty cost function, (b) weighted penalty cost function. Circles are ground truth positions, blue dots are trained data and red dots are inferred positions.

trajectories in task space, both for the parallel and reaching motions. Apart from the qualitative difference of the movements, the datasets are also quantitatively different. The grid dataset consists of only 256 datapoints, whilst the straight line minimum jerk trajectory datasets in task space, as presented in section 4.2.3, comprise trajectories of around a hundred datapoints each, presenting 10 and 9 trajectories in total. Again the trajectories appear in the two dimensional latent space, while in both cases the fifth trajectory was held out of the training set for evaluation. The resulting latent space representations are presented in figure 5.4 and the corresponding task space trajectories in figure 5.5. The nmse of our predictions is very small in both cases, in both joint space and task space, for the parallel and reaching datasets alike.

In the same experimental scenario the rhythmic circular motions dataset from the planar arm was tested as well. The results are equally accurate, as presented in figure 5.6. The second kinematic trajectory was not included in the training set for evaluation, and the error against the ground truth data was again very low in both joint and task space.

Figure 5.4: 2-D latent space representation of the discrete movement datasets with a uniform penalty cost function: (a) parallel movements, (b) reaching movements. Red x's are training data and blue x's are predicted latent space positions.

Figure 5.5: Predictions mapped back to joint angles and task space positions through forward kinematics: (a) parallel movements, (b) reaching movements. Circles are ground truth positions, blue dots are trained data and red dots are inferred positions.

5.2.2 Experiments with the Puma Arm

Having presented how structured task space data results in a structured latent space representation for the 3-dimensional datasets from the planar arm, we then increase the complexity of the training set for the GPLVM by using the data from the Puma robot. The data is now six-dimensional, while the motions are qualitatively similar to our previous experiments, as presented in section 4.3. Again we start with the discrete

movement minimum jerk straight line motions in task space. The representation of the datasets in the 2-dimensional latent space is presented in figure 5.7, where we observe that the latent trajectories appear in a structured manner. We again interpolate the missing trajectory by averaging between the preceding and succeeding latent trajectories. Mapping the interpolated latent trajectories back to joint angles and through the forward kinematics results in the predicted trajectories depicted in figure 5.8.

Figure 5.6: 2-D latent space representation and task space points of the rhythmic circular dataset with a uniform penalty cost function: (a) latent space, where red x's are training data and blue x's are predicted latent space positions; (b) task space, where circles are ground truth positions, blue dots are trained data and red dots are inferred positions.

Figure 5.7: 2-D latent space representation of the discrete movement datasets: (a) parallel movements, (b) reaching movements. Red x's are training data and blue x's are predicted latent space positions.

Figure 5.8: Predictions mapped back to joint angles and task space positions through forward kinematics: (a) parallel movements, (b) reaching movements. Circles are ground truth positions, blue dots are trained data and red dots are inferred positions.

The joint space and task space errors are again of the order of 10⁻4 to 10⁻5 for the parallel and reaching movements. Similarly, the rhythmic datasets of circular trajectories and infinity-symbol shaped trajectories resulted in latent space representations as depicted in figure 5.9. The task space trajectories resulting from the interpolated low dimensional trajectories have an nmse of the order of 10⁻3, while the corresponding joint space errors are of the order of 10⁻4 and 10⁻3, for the circular and infinity-symbol motions respectively.

5.3 Learning and Generalising in the Latent Space

Our second set of experiments concerns learning in the low dimensional space. The experiments presented are divided into two major categories: the first deals with learning discrete control policies, while the second concerns learning rhythmic control policies. The methodology followed is to present the GPLVM with the high dimensional data and produce its low dimensional representation. Next, a trajectory in the latent space is chosen to be demonstrated to the movement imitation component. The latent trajectory is then encoded as a discrete or rhythmic CP and the relevant evaluation

steps are taken. Afterwards, we modify the parameters of the learned movement primitive and run the CP on the new parameters to produce novel latent space trajectories. The resulting low-dimensional motion is mapped back to joint space and subsequently to task space, in order to observe the effect that the changes in parameters induce. The pseudocode of this formulation is given in table 5.3.

Figure 5.9: 2-D latent space representation of the rhythmic movement datasets: (a) circular movements, (b) infinity movements. Red x's are training data and blue x's are predicted latent space positions.

Figure 5.10: Predictions mapped back to joint angles and task space positions through forward kinematics: (a) circular movements, (b) infinity movements. Circles are ground truth positions, blue dots are trained data and red dots are inferred positions.

Table 5.3. Pseudocode of the experimental formulation.

1. Obtain joint angle data
   Q = K⁻¹(D), where D is task space data and K the forward kinematics.

2. Represent in latent space
   Initialise the GPLVM with the data Q.
   Optimise with SCG and the objective of eq. (3.4) → q, the low dimensional representation.

3. Learn the low dimensional motion (Tr)
   case (discrete):
      Initialise time constants: α_z = n_res, β_x = n_res/4, α_x = n_res/3, τ = 1,
         where n_res is the number of receptive fields.
      Initialise state variables: x = 1, ẋ = 0, z = 0, ż = 0, ẏ = 0, ÿ = 0, y = y_0 = Tr_start.
      Set goal: g = Tr_end.
   case (rhythmic):
      Initialise time constants: α_z = n_res, β_x = n_res/4, ω = 2π, τ = 1,
         where n_res is the number of receptive fields.
      Initialise state variables: p = 0, ṗ = 0, z = 0, ż = 0, ẏ = 0, ÿ = 0, y = y_m = Tr_start.
      Set amplitude: A = 1.
   Compute f_demo.
   Learn the nonlinearity f with LWL.

4. Change CP parameters
   case (discrete): reset the CP state; y_0 = y_new (start state), g = g_new (goal state), τ = τ_new (duration).
   case (rhythmic): reset the CP state; y_m = y_new (baseline of oscillation), A = A_new (amplitude of oscillation), τ = τ_new (period).

5. Generate novel motion
   Run the CP → q_new. Map back to joint space → Q_new.

6. Observe results in task space
   D_new = K(Q_new).
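As an indication of what running a discrete CP involves, the following sketch integrates a generic second-order dynamical system of the kind summarised in step 3 above, with one such system per latent dimension; the gains, step size and the zero nonlinearity are placeholders rather than the parametrisation of our implementation.

import numpy as np

def run_discrete_cp(y0, g, tau, f=lambda x: 0.0, dt=0.001,
                    alpha_z=25.0, beta_z=6.25, alpha_x=8.0):
    """Integrate a discrete second-order dynamical-system primitive.
    y converges from y0 to the goal g; f(x) stands for the learned nonlinearity
    (zero here), phased by the canonical variable x. Gains are illustrative."""
    x, y, z = 1.0, y0, 0.0                   # canonical state, position, scaled velocity
    trajectory = []
    for _ in range(int(tau / dt)):
        x += (-alpha_x * x) * dt / tau                        # canonical system decays 1 -> 0
        z += (alpha_z * (beta_z * (g - y) - z) + f(x)) * dt / tau
        y += z * dt / tau
        trajectory.append(y)
    return np.array(trajectory)

# Reusing a primitive with a new start, goal and duration gives a topologically
# similar movement, which is the property exploited in the experiments below.
path_a = run_discrete_cp(y0=0.0, g=1.0, tau=1.0)
path_b = run_discrete_cp(y0=0.5, g=2.0, tau=0.5)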

5.3.1 Discrete Motions

For learning low dimensional discrete trajectories we have made use of the discrete datasets presented in the previous chapter and already used in the previous set of experiments. The methodology followed is as described earlier. We start by reducing the dimensionality of the dataset in question and then learn a latent trajectory with a discrete CP. Next, we need to modify the parameters of the CP in a meaningful manner in order to be able to quantitatively measure the generality of the primitive. We have chosen to modify the starting and ending position of the CP according to the starting and ending positions of all the latent trajectories in turn. This serves a dual purpose, as this way we can metrically evaluate the fit of the predicted trajectories and furthermore compare the outcome with the ground truth data. Moreover, following from our previous experimentation, we interpolate between the learned movement and its preceding and succeeding trajectories only by interpolating between the start and end points of the trajectories. The resulting trajectories are presented both in the latent space and in the task space. Again we start by presenting the planar arm experiments and then move to the similar Puma arm examples.

For the discrete case of the planar arm we have used the parallel and reaching motion datasets presented earlier. The resulting latent space is chosen to be two-dimensional and the trajectory chosen to be learned is the fifth one in both cases (Figure 5.11). Below we present the resulting latent space representation of the training data (blue circles), the trajectories predicted by the learned CP when run on novel starting and ending points (red dots) and the interpolated latent trajectories (magenta dots), for both the parallel and reaching motion cases. The kinematic trajectories all have the same number of datapoints, and the CP uses only a small number of basis functions to perform the LWL function approximation. Also note that each CP is encoded by two motion primitives, as the space is two dimensional (Figure 5.11(b)).

Apart from the error measures presented in section 5.1.1, we also measure the error between all the latent space trajectories and the predicted latent space trajectories resulting from running the learned CP on the new parameters, as described earlier. In general the error is of the order of 10⁻4, and the nmses are presented analytically in table 5.1. In the same context the dataset of the reaching kinematic trajectories from the planar arm is evaluated. The results are depicted in figure 5.12. The same experimental methodology was followed for the corresponding discrete motion datasets from the Puma arm. The results for the parallel and reaching motions are depicted in

figures 5.13 and 5.14 respectively.

Figure 5.11: (a) Latent space of the parallel motions dataset with the predicted trajectories, (b) the latent trajectory demonstrated and learned, (c) resulting task space motions.

The results demonstrate how we are able to generalise in latent space, producing similar motions in task space. The system is highly accurate, as the measured nmses are very low. The generalisation, on the other hand, deteriorates when we move far away from the demonstrated trajectory in latent space. This is most obvious in the planar arm datasets, where the topology of the mapping changes more drastically. In contrast, the structure that comes about in the Puma datasets' latent space allows for near optimal results.

Table 5.1: nmse errors for the discrete movement cases for both robotic setups (model error, latent trajectory error, latent space error, joint space error and task space error for the parallel and reaching movements of the planar and Puma arms).

Figure 5.12: (a) Latent space of the reaching motions dataset with the predicted trajectories, (b) the latent trajectory demonstrated and learned, (c) resulting task space motions.

Figure 5.13: (a) Latent space of the parallel motions dataset with the predicted trajectories, (b) the latent trajectory demonstrated and learned, (c) resulting task space motions.

Figure 5.14: (a) Latent space of the reaching motions dataset with the predicted trajectories, (b) the latent trajectory demonstrated and learned, (c) resulting task space motions.

5.3.2 Rhythmic Motions

For experimenting with rhythmic motions we have used the datasets of rhythmic movements presented in sections 4.2.4 and 4.3.3. The strategy is similar to the discrete movement case: the data is presented to the GPLVM and a two-dimensional latent space is learned. A trajectory is then chosen to be the seed of learning for a rhythmic movement primitive, in latent space. In turn, we modify the parameters of the rhythmic CP to produce novel latent space motions. Afterwards, the new latent space trajectories are mapped back to joint space and subsequently to task space positions.

The parameters of the rhythmic movement primitives are the baseline of oscillation, the amplitude of oscillation and the temporal scaling variable (τ). Changing τ leads to a slower or faster motion without a change in the shape of the trajectory. Modifying the other two parameters, on the other hand, is not as straightforward, as for any demonstrated rhythmic movement the baseline and amplitude are by default set to zero and one respectively. We chose to gradually modify these parameters to first observe the change in latent space, then map back to joint space and then to task space, to observe the results of changing the parameters of the low-dimensional encoding.

Below we present the most significant results, as the bulk of experiments has led

to a rather large set of figures (refer to Appendix A for more examples). Cursorily, modifying the amplitude and baseline of the oscillation leads to a similar change in the resulting motion, similar for both movement systems. More specifically, for the circular motion case, for both robots, increasing the amplitude of oscillation equally in both dimensions leads to a series of circular motions in task space (Figure 5.15(c) and 5.15(d)). The step chosen is 0.25, starting from 1 and ranging to 2, i.e. double the amplitude of oscillation in latent space. We have also portrayed a set of trajectories that are closer to the region of the demonstrated movement (black dots), which differ by a smaller step, to depict the generalisation in the vicinity of the original motion. The trajectories that are close to the learned motion exhibit similar topological properties. Moving away from that region yields more and more deformed shapes. Nonetheless, the resulting motions still have a roughly circular, elliptical shape.

Figure 5.15: Result of modifying the amplitude in latent space for the planar arm (first column) and the Puma arm (second column) for the corresponding circular motion datasets.

Changing the baseline of oscillation needs more consideration, as the produced motion, in latent space and consequently in task space, depends on the coupling of both dimensions. Thus, we have selected to change the baseline of each system in turn and then try changing both dimensions equally. The results of changing the second

dimension's baseline are depicted in figure 5.16. Keeping the first dynamical system constant at its default parameters and varying the second system leads to a structured rotation of the circular motion trajectory. Again the produced trajectories are not exactly circular, unlike the demonstrated motion. Such a change of shape is more obvious as we move further away from the default parametrisation. Such behaviour is common to the planar arm and the Puma robot, leading to the hypothesis that the complexity of the controlled system does not affect the topological properties of the latent embedding. On the contrary, the qualitative shape of the data is what affects the latent space, even though such structure is only indirectly represented in joint space.

Figure 5.16: Result of modifying the baseline of oscillation of the second dynamical system in latent space for the planar arm (first column) and the Puma arm (second column) for the corresponding circular motion datasets.

We choose to present the results of the infinity-symbol trajectory in more detail, as it constitutes a more complex kinematic case. The approach is exactly the same and the results surprisingly structured. A change of the baseline of the first dynamical system leads to a vertical displacement of the learned trajectory in the imaginary vertical plane the training data lives within (Figure 5.17(b)). Changing the baseline of the second system leads to a similar horizontal displacement of the trajectory on the plane in space (Figure 5.17(d)). The modification of both systems' baselines leads to the predicted, based on the earlier findings, series of infinity motions from the lower left part to the

upper right part of the region, as depicted in figure 5.17(f). In all previous cases the motion is deformed but still recognisable as an infinity-symbol trajectory, deteriorating as we move further away from the default parametrisation.

Figure 5.17: Modification of the baseline of oscillation of the first dynamical system (first row), the second (second row), and both equally (third row), in latent space, for the Puma arm's infinity dataset.

Accordingly, a change in the systems' amplitudes leads to a corresponding change of the amplitude of motion in the task space. Modification of the first system's amplitude leads to a vertical change of amplitude in the resulting kinematic trajectory (Figure 5.18(b)). Changing the second system's amplitude results in a corresponding modification of the horizontal amplitude of motion (Figure 5.18(d)). When both systems' amplitudes are varied, the kinematic trajectories result in motions closely following the remaining trajectories in the dataset (Figure 5.18(f)), with a deformation of the shape being obvious. We regard these interesting results as a direct consequence of the structure observed in the latent space,

which in turn is nicely inferred by the GPLVM when presented with the joint angle data.

Figure 5.18: Modification of the amplitude of oscillation of the first dynamical system (first row), the second (second row), and both equally (third row), in latent space, for the Puma arm's infinity-symbol dataset.

Table 5.2: nmse errors for the rhythmic movement cases for both robotic setups (model error, latent trajectory error, joint space error and task space error for the circular movements of the planar arm and the circular and infinity movements of the Puma arm).
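To make the role of these parameters concrete, the following sketch modulates a generic two-dimensional rhythmic primitive of the kind used above; the periodic shape function merely stands in for the learned nonlinearity, and all names and values are illustrative rather than those of our implementation.

import numpy as np

def rhythmic_cp(shape_fn, baseline, amplitude, tau, dt=0.01, n_periods=3):
    """Generate a latent trajectory from a rhythmic primitive: a learned periodic
    shape, shifted by the baseline of oscillation and scaled by the amplitude.
    tau sets the period of the motion."""
    t = np.arange(0.0, n_periods * tau, dt)
    phase = 2.0 * np.pi * t / tau
    return baseline + amplitude * shape_fn(phase)      # (n_steps, 2) latent points

def demo_shape(phase):
    # Stand-in for a learned latent circle (one oscillator per latent dimension).
    return np.stack([np.cos(phase), np.sin(phase)], axis=1)

demonstrated = rhythmic_cp(demo_shape, baseline=np.zeros(2), amplitude=1.0, tau=1.0)
larger = rhythmic_cp(demo_shape, baseline=np.zeros(2), amplitude=1.5, tau=1.0)           # amplitude step
shifted = rhythmic_cp(demo_shape, baseline=np.array([0.0, 0.3]), amplitude=1.0, tau=1.0)  # second system's baseline
slower = rhythmic_cp(demo_shape, baseline=np.zeros(2), amplitude=1.0, tau=2.0)            # longer period

Each modulated latent trajectory would then be mapped back to joint angles through the GPLVM and on to task space, as in the figures above.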

5.4 Human Motion Data

At this point we take a giant leap in complexity, as we move from 6-dimensional datasets to 59-dimensional motion capture datasets. These datasets were presented in section 4.4 and, apart from the dimensionality of the joint space, possess another significant difference. This difference is the existence of noise, due to the vision-based marker detection approach used. For the following experiments a 2-dimensional latent space is not adequate to represent the data within a small error margin. We thus increase the dimensionality of the latent space to three dimensions, an approach followed for data from the CMU motion capture database in (Wang et al. (2006)).

Another problem faced was that the level of noise in the datasets resulted in spiky latent space trajectories that are unsuitable to be learned by the motion learning component. Accurate learning of such trajectories would require a large number of basis functions to account for the rapid changes of the motion, something we wish to avoid. The solution to this problem is the addition of a dynamical kernel to the GPLVM. The dynamical kernel is an RBF-linear kernel (Wang et al. (2006)), which takes into account that the data is noisy. This addition serves to smooth the trajectories and furthermore loosens the IID assumption of the data generation process. The latter is particularly beneficial, as it accounts for our knowledge that the data is sequential. The results of adding the RBF-linear kernel are depicted in figure 5.19 for the walking and running datasets, and similarly for the boxing dataset.

5.4.1 Boxing

The evolution of the three latent variables that form the low dimensional trajectory representation of the complete boxing sequence is depicted in figure 5.20. This dataset presents some significant drawbacks that do not allow us to make any strong statements. The motion in the dataset is not structured in any manner, as the mannequin is shadow boxing and moving around at random. Optimally, we would desire a stationary mannequin that would perform similar boxing motions, for example targeting different positions at the same distance with similar boxing moves. Moreover, we do not have the forward kinematics of the humanoid, thus we cannot quantitatively evaluate the effect that parameter changes have on the resulting motions. An interesting observation is that in figure 5.20, whenever there is a spike in the first dimension, a left punch is
5.4.1 Boxing

The evolution of the three latent variables that form the low-dimensional trajectory representation of the complete boxing sequence is depicted in figure 5.10. This dataset presents some significant drawbacks that do not allow us to make any strong statements. The motion in the dataset is not structured in any manner, as the mannequin is shadow boxing and moving around at random. Ideally, we would prefer a stationary mannequin performing similar boxing motions, for example targeting different positions at the same distance with similar boxing moves. Moreover, we do not have the forward kinematics of the humanoid, so we cannot quantitatively evaluate the effect that parameter changes have on the resulting motions. An interesting observation is that in figure 5.10, whenever there is a spike in the first dimension, a left punch is performed by the mannequin.
Similarly, spikes in the third dimension represent right-handed punching motions. Based on this, we have chosen to learn a left-handed punch motion from the complete sequence. The learned CP is depicted in figure 5.11, while the error at all levels of measure is very low (Table 5.3). Slight variations of the end positions of the three dynamical systems lead to similar punching motions, but as mentioned earlier we are unable to make any concrete evaluation. Nonetheless, the system is highly accurate in reproducing the demonstrated motion. The temporal variable τ can also be altered to modulate the velocity of the motion without any loss of accuracy.
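The role of τ can be illustrated with a minimal discrete control policy of the dynamical-systems kind described in Chapter 2: a damped point attractor whose time constant scales the speed of the whole movement. The learned nonlinear forcing term that shapes the demonstrated trajectory between start and goal is omitted for brevity, and all constants are illustrative.

```python
import numpy as np

def discrete_cp(y0, goal, tau=1.0, alpha=25.0, beta=6.25,
                dt=0.002, n_steps=2000):
    """Second-order point attractor: tau * z' = alpha * (beta * (goal - y) - z), tau * y' = z.

    Halving tau roughly halves the duration of the movement without changing its
    shape, which is how altering the temporal variable modulates motion velocity.
    """
    y, z = y0, 0.0                       # z plays the role of tau * dy/dt
    out = np.empty(n_steps)
    for t in range(n_steps):
        z += dt / tau * alpha * (beta * (goal - y) - z)
        y += dt / tau * z
        out[t] = y
    return out

slow = discrete_cp(y0=0.0, goal=1.0, tau=1.0)    # nominal speed
fast = discrete_cp(y0=0.0, goal=1.0, tau=0.5)    # same end position, reached twice as fast
```

Slight changes of `goal` in each latent dimension correspond to the end-position variations mentioned above, while changing `tau` only rescales time.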

Figure 5.10: Complete low-dimensional boxing sequence representation.

Figure 5.11: Learned left-handed punch as represented in the 3-dimensional latent space.

Table 5.3: nmse errors for the left-handed punching motion (model error, latent trajectory error and joint space error).

Figure 5.12: Snapshots from different learned boxing motions.

5.4.2 Walking and Running

Reducing the dimensionality of the walking and running sequences leads to a latent space representation where the periodicity of motion can be easily observed.

The data for both cases do not correspond to one period of walking or running. A periodic part of each resulting latent space sequence was chosen explicitly as the seed of learning. The resulting latent space sequences are presented in figure 5.13 for both datasets, with the chosen segments delimited by red vertical lines. The resulting demonstrated trajectories, along with the corresponding CP predictions, are depicted in figure 5.14.

Figure 5.13: Latent trajectory representation of the (a) walking and (b) running sequences. The segments chosen to be learned are enclosed by the vertical red lines.

Again we have achieved low nmse values for both rhythmic motion cases, detailed in table 5.4.
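For reference, the normalised mean squared error reported in these tables can be computed as in the sketch below. Normalising each output dimension by the variance of the corresponding target signal is an assumption about the exact normalisation; the arrays and values are only an example.

```python
import numpy as np

def nmse(y_true, y_pred):
    """Normalised mean squared error for arrays of shape (T, D).

    The squared error of each dimension is divided by the variance of the
    target signal in that dimension, so a value of 1.0 corresponds to always
    predicting the mean; the per-dimension values are then averaged.
    """
    mse = np.mean((y_true - y_pred) ** 2, axis=0)
    var = np.var(y_true, axis=0)
    return float(np.mean(mse / var))

# Example: error between a demonstrated latent trajectory and the trajectory
# reproduced by the learned control policies (toy data).
t = np.linspace(0.0, 2.0 * np.pi, 200)
demonstrated = np.stack([np.sin(t), np.cos(t), np.sin(2 * t)], axis=1)
reproduced = demonstrated + 0.01 * np.random.randn(*demonstrated.shape)
print(nmse(demonstrated, reproduced))
```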

Figure 5.14: The demonstrated trajectories with the corresponding CP predictions.

The errors for the walking primitive are larger than those of the running motion, as the lengths of the sequences differ: the demonstrated walking movement is double the length of the running motion. Nonetheless, both motions are learned with equally parametrised systems, resulting in low errors.

Table 5.4: nmse errors for the rhythmic motion capture datasets (model error, latent trajectory error and joint space error for the walking and running motions).

When changing the parameters in the resulting 3-dimensional space we have concentrated on altering the amplitude of the oscillations and τ. Experiments were also performed altering the baselines of oscillation of the systems, with little success. It is apparent that, now that the latent space is 3-dimensional, even small uniform alterations quickly result in latent trajectories distant from the training points. Predictions away from the original latent points have low certainty and consequently lead to unnaturally uncoordinated motions when projected back to joint space and performed by the mannequin. Tuning τ by definition leads to faster or slower motion, so as expected we can produce walking and running full-body motions with varying velocities. On the other hand, changing the amplitude of the oscillations does not allow for straightforward predictions about the resulting movements.
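A rough sketch of the amplitude sweep described next: latent trajectories are generated at a range of amplitudes and projected back to joint angles with the GP posterior mean. The sinusoidal latent paths and the plain RBF kernel below are stand-ins for the learned rhythmic systems and the trained GPLVM mapping, and all shapes, names and values are illustrative placeholders.

```python
import numpy as np

def rbf(X1, X2, gamma=1.0):
    """Plain RBF kernel matrix between two sets of latent points."""
    d = (np.sum(X1 ** 2, axis=1)[:, None] + np.sum(X2 ** 2, axis=1)[None, :]
         - 2.0 * X1 @ X2.T)
    return np.exp(-0.5 * gamma * d)

def gp_mean_predict(X_train, Y_train, X_new, noise=1e-3):
    """Project new latent points to joint angles via the GP posterior mean."""
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    return rbf(X_new, X_train) @ np.linalg.solve(K, Y_train)

# Placeholders for the learned latent points and the corresponding joint angles.
X_latent = np.random.randn(300, 3)
Y_joints = np.random.randn(300, 59)

t = np.linspace(0.0, 2.0 * np.pi, 200)
base_cycle = np.stack([np.sin(t), np.cos(t), np.sin(2 * t)], axis=1)  # stand-in limit cycle

for amp in np.linspace(0.1, 1.2, 12):                   # amplitude sweep in steps of 0.1
    X_new = amp * base_cycle                            # uniformly scaled latent trajectory
    joints = gp_mean_predict(X_latent, Y_joints, X_new)  # (200, 59) joint-angle trajectory
```

As the sweep moves away from the training amplitudes, the GP mean falls back towards the data mean, which is consistent with the low-certainty, uncoordinated poses observed for strongly altered parametrisations.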

We have experimented with amplitude parameters ranging from 0.1 (an amplitude of 0 results in no motion at all) to 1.2, in steps of 0.1, altering all three dynamical systems equally. The results were encouraging: we observe that low amplitudes (i.e. 0.1) lead to a small motion about a pose very similar to standing. This underlines that the point in latent space corresponding to the three baselines of the learned systems (the amplitude-zero case) is validly generalised as a roughly standing pose, which we consider a direct consequence of the generalisation ability of the GPLVM. Moreover, increasing the amplitude results in hand and leg swings of increasing offsets, producing well-coordinated and natural walking and running motions. Snapshots of the approximately peak-displacement configurations obtained when varying the amplitudes of oscillation are shown in figures 5.15 and 5.16 for walking and running respectively. Again we face the problem of metrically evaluating the resulting motions in task space, as we do not possess the mannequin's forward kinematics. Additionally, we cannot make any claims about the stability of the produced gaits, as a dynamical analysis of the produced movement is beyond the scope of our experimentation. Nonetheless, it is easy to qualitatively evaluate the overall animated motion. The mannequin moves in a smooth, natural, human manner, both for the walking and running cases, in the original parametrisation as well as the modified scenarios.

Figure 5.15: Snapshots from the same frame (approximately the peak-offset configuration) taken from different-amplitude trajectories (amplitudes 0.1 to 1.2) resulting from the learned walking motion.

5.5 Discussion

All the experiments presented earlier directly support that the combination of the GPLVM with the nonlinear dynamical systems based motion representation framework yields multiple advantages. The nature of the GPLVM allows for powerful generalisation and very accurate mapping, as the extremely low errors confirm. The accuracy of the encoding component leads to minimal residuals between demonstrated and learned motions in the latent space, providing an overall efficient system for motion representation. The system can learn rapidly, be modified on-line and handle strong external perturbations, making it suitable for dynamically changing environments. The representation requires a radically smaller number of CPs, so the encoding is more compact and more easily manageable.

The desired properties of such a high-level motion representation also hold for the low-dimensional trajectories, as we have demonstrated how novel parametrisations in latent space lead to structured changes of the corresponding task space motions. Learning in a structured latent space yields easy generalisation to new latent space trajectories and, consequently, to similar novel task space movements. Modification of parameters in latent space leads to a structured change of the resulting motion in task space. Temporal invariance holds by definition, while spatial invariance is enforced by the structure of the latent space. In this way, we have achieved a beneficial combination of the advantageous properties of both systems, as we have demonstrated that learning is equally accurate and easy in the low-dimensional manifold. This leads to a great simplification for motion learning from demonstration, as the number of CPs used is reduced dramatically. For
