Model Based Impact Location Estimation Using Machine Learning Techniques



1. Introduction

Impacts on composite structures can cause damage that is invisible from the surface and must be detected and repaired before it leads to catastrophic failure of the structure. The location of the impact is a crucial factor in determining the overall response of the system to the impact. Several attempts have been made to estimate impact location from sensor measurements using interpolation methods. In most cases, however, the results are unsatisfactory because only a limited number of training impacts is available. With a numerical model of the structure, many more training impacts can be generated at low cost. This report presents an alternative way of estimating the impact location using machine-learning techniques.

2. Data

2.1 System Setup

Figure 1: system setup (physical panel)
Figure 2: system setup (FEM simulation)

The overall purpose of the project is to identify an impact on a composite structure. A T800H/ graphite-epoxy panel is used to validate the concept of a model-based training technique for general use (Figure 1). To analyze the problem, a simulation model is built using the finite element method (FEM). Impact experiments can then be run on the FEM simulation at virtually no cost (Figure 2), which makes it possible to collect a large number of samples for training.

2.2 Data Collection

The main purpose of the data collection is to create a database that captures the characteristics of impacts over the entire panel (Figure 3). The impacts (yellow arrows) are simulated on the FEM model at different locations. For this specific panel, there are 800 training impacts, uniformly distributed. For each impact, 20 sensor measurements (red dots) are recorded, and each measurement contains 50 data points. The true impact location (x, y) is also recorded for each impact.

As a result, the full sample set contains the following data, stored as vectors and matrices in the format below.

$$X^{(i)} = \begin{bmatrix} s_{1,1} & \cdots & s_{1,50} \\ \vdots & \ddots & \vdots \\ s_{20,1} & \cdots & s_{20,50} \end{bmatrix} \in \mathbb{R}^{20 \times 50}, \qquad y^{(i)} = \begin{bmatrix} x \\ y \end{bmatrix} \in \mathbb{R}^{2}$$

Figure 3: sample data collection for the panel

2.3 Feature

For a given sensor signal of 50 data points, several important characteristics can be extracted. Three are examined here: (1) windowed energy, (2) time of flight, and (3) peak amplitude (Figure 4). These features are well suited to the task because they correlate strongly with the impact location. Extracting them reduces the size of the input space for each impact from 1000 values (20 sensors × 50 data points) to 60 values (20 sensors × 3 features).

Figure 4: features used for sensor measurements

3. Methods

Using the data and features collected above, the goal is to estimate the impact location from a given set of sensor measurements. Linear regression is an intuitive and reasonable model because the target variables are continuous rather than discrete classes. The finite element method is also a linear model, although the sensor response to a change in impact location is non-linear.
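The feature extraction described in Section 2.3 can be sketched as follows. The report does not give exact definitions, so everything specific here is an assumption made for illustration: time of flight is taken as the first sample whose magnitude reaches a fraction (`threshold_ratio`) of the peak, windowed energy is the sum of squares over `window` samples from that arrival, and the names `extract_features` and `feature_matrix` are hypothetical.

```python
import numpy as np


def extract_features(signal, dt=1.0, window=10, threshold_ratio=0.1):
    """Return (windowed energy, time of flight, peak amplitude) for one signal.

    All definitions below are assumed for illustration, not taken from the report.
    """
    signal = np.asarray(signal, dtype=float)
    magnitude = np.abs(signal)
    peak_amplitude = float(magnitude.max())
    if peak_amplitude == 0.0:
        # A flat signal carries no arrival information.
        return 0.0, 0.0, 0.0
    # Assumed time of flight: first sample reaching a fraction of the peak.
    arrival = int(np.argmax(magnitude >= threshold_ratio * peak_amplitude))
    time_of_flight = arrival * dt
    # Assumed windowed energy: sum of squares over a fixed window after arrival.
    energy = float(np.sum(signal[arrival:arrival + window] ** 2) * dt)
    return energy, time_of_flight, peak_amplitude


def feature_matrix(X):
    """Map one impact's (20, 50) measurement matrix to a (20, 3) feature matrix."""
    return np.array([extract_features(row) for row in X])
```

Applied to one impact, `feature_matrix` turns the 20 × 50 measurement matrix into 20 × 3 features, which matches the reduction from 1000 to 60 input values.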

3.1 Holdout Cross Validation

To validate the learning model, the samples are divided into training (70%) and testing (30%) sets. To ensure sufficient coverage of the entire panel, the training samples are chosen at evenly spaced locations. Random selection, assuming a uniform distribution, has also been tried. It gives poorer results because of the additional uncertainty introduced by the selection.

3.2 K-means Clustering

There are 20 sensors in total and 3 features for each measurement, so modeling the entire panel at once would require 60 parameters. Moreover, sensors far from the impact location record only a weak signal. Using all the signals at once therefore gives a poor and biased model. One way to deal with this issue is to divide the panel into regions and develop a model for each sub-region. Dividing the panel by physical coordinates (say, into small rectangles) is straightforward but meaningless, because a new impact cannot be assigned to a region without already knowing its location.

Instead, k-means clustering is used to divide the samples into 12 groups for this specific panel. Only the energy feature is used for clustering: the clustering does not need to be perfect, as long as it is meaningful and distinctive. The energy of the sensor measurements is quite sensitive to impact location, yielding 12 clusters in the 20-dimensional feature space.

3.3 Linear Regression

After the training set is divided into 12 groups, linear regression is performed on each group, which contains about 47 training samples (560/12). To simplify the model further, instead of using all the features from all the sensors (60 parameters in total), each feature is replaced by its weighted average over all the sensors (3 parameters in total). Different combinations of the features are used. For the most comprehensive case:

$$x^{(i)} = \begin{bmatrix} \text{Energy} \\ \text{ToF} \\ \text{Amp} \end{bmatrix} \in \mathbb{R}^{3}, \qquad y^{(i)} = \begin{bmatrix} x \\ y \end{bmatrix} \in \mathbb{R}^{2}$$

$$X = \begin{bmatrix} (x^{(1)})^{T} \\ \vdots \\ (x^{(47)})^{T} \end{bmatrix} \in \mathbb{R}^{47 \times 3}, \qquad Y = \begin{bmatrix} (y^{(1)})^{T} \\ \vdots \\ (y^{(47)})^{T} \end{bmatrix} \in \mathbb{R}^{47 \times 2}$$

Setting $\Theta = (X^{T} X)^{-1} X^{T} Y$ gives the parameter matrix $\Theta \in \mathbb{R}^{3 \times 2}$. The first and second columns of $\Theta$ correspond to the parameters for location x and location y, respectively.
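The clustering and per-cluster regression steps of Sections 3.2 and 3.3 can be sketched as below. This is a minimal illustration under stated assumptions, not the report's implementation: the k-means routine is a plain Lloyd's iteration, the three weighted-average features per impact are assumed to be already computed (the report does not say how the weights are chosen), and the function names are hypothetical. `np.linalg.lstsq` is used in place of forming $(X^{T} X)^{-1}$ explicitly; it solves the same least-squares problem.

```python
import numpy as np


def nearest(E, centroids):
    """Index of the closest centroid for every row of E."""
    d = np.linalg.norm(E[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)


def kmeans(E, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm on energy vectors E of shape (m, n_sensors)."""
    E = np.asarray(E, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct training samples.
    centroids = E[rng.choice(len(E), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest(E, centroids)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's centroid as is
                centroids[j] = E[labels == j].mean(axis=0)
    return centroids, nearest(E, centroids)


def fit_cluster_models(F, Y, labels, k):
    """Fit one Theta (n_features x 2) per cluster by least squares.

    F: (m, n_features) weighted-average features (assumed precomputed).
    Y: (m, 2) true impact locations.
    """
    models = {}
    for j in range(k):
        mask = labels == j
        if mask.any():
            theta, *_ = np.linalg.lstsq(F[mask], Y[mask], rcond=None)
            models[j] = theta
    return models


def predict_location(e_new, f_new, centroids, models):
    """Assign a new impact to the nearest fitted cluster, then apply its Theta."""
    keys = sorted(models)
    d = np.linalg.norm(centroids[keys] - np.asarray(e_new, dtype=float), axis=1)
    theta = models[keys[int(d.argmin())]]
    return np.asarray(f_new, dtype=float) @ theta
```

A new impact is assigned to a cluster using only its 20 sensor energies, which is what makes this grouping usable without knowing the impact location in advance.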


4. Results

4.1 Training Error and Test Error

First, the training and test errors are studied as a function of data size. As mentioned above, the total sample size is 800. It is deliberately reduced to study the accuracy of the model. For a total sample size below 51 (3 × 12 / 0.7), the training error is 0 because the system is underdetermined: each of the 12 clusters then has fewer than 3 training samples for its 3 parameters. The training error increases as the sample size increases. The test error, on the other hand, decreases as the sample size increases. In this model, the samples are evenly spaced over the entire panel. As a result, the training error and test error converge fairly well as the sample size goes up (Figure 5). The effective error used to evaluate accuracy is calculated by averaging the normalized Euclidean distance between the real and estimated locations. The error is also higher than expected, which suggests room for future improvement.

Figure 5: model accuracy as a function of sample size
Figure 6: accuracy vs. features (test error for each feature combination)

The modeling process is also repeated for different combinations of features. Adding the energy feature (1) gives a significant increase in accuracy. Adding the peak amplitude feature (3), which is a lower-order version of the energy, does not increase the accuracy further (Figure 6). Using only time of flight (2) is not sufficient in this case.

4.2 Individual Cases

To better understand the hidden flaws of the model, the errors from the test samples are studied in detail. Impact samples are chosen in three categories: (1) flat region, (2) edges, and (3) stiffened region. In general, the estimation errors are larger when the impact is near the edges or at certain locations with stiffeners on the panel. This is attributed to the high variance in the FEM model of the structure: the signals are more sensitive in these regions.

5. Future Improvements

Analysis of the results and of the error distribution over the panel suggests certain improvements for the future. As mentioned above, a significant part of the error is due to the high variance around certain locations. A signal-preprocessing scheme could therefore be developed to detect high-variance data and eliminate it from the training set. This will decrease the test error in many cases, but will increase the error in some corner cases. Secondly, the sensor locations are not used in this model. The error could be reduced further if those parameters were included.

6. Conclusion

Machine-learning techniques trained on numerically simulated impacts can be used to estimate the impact location. Compared with traditional methods, this approach exploits more features of the signals. The accuracy is comparable with that of other methods, which take sensor locations into account. The training and test errors converge as the sample size grows, although the accuracy is lower than expected. The error is mainly attributed to some outlier signals with high variance. In sum, k-means clustering and linear regression provide an efficient way of estimating the impact location from sensor signals. Modeling and understanding the features of the signal is the main difficulty.


Adaptive Robotics - Final Report Extending Q-Learning to Infinite Spaces Eric Christiansen Michael Gorbach May 13, 2008 Abstract One of the drawbacks of standard reinforcement learning techniques is that