NEURAL-NETWORK MODELING
David Furrer*, Ladish Co. Inc., Cudahy, Wisconsin
Stephen Thaler, Imagination Engines Inc., Maryland Heights, Missouri

Neural-network modeling tools enable the engineer to study and analyze the complex interactions between material and process inputs with the goal of predicting final component properties.

Fig. 1 Predicted cross-sectional tensile yield strength for an example forged titanium Ti-64 component. The image is the result of a neural-network model linked to a Scientific Forming Technologies DEFORM finite element model. (The strength contours are in ksi.)

Neural-network models are mathematical tools designed to map input to output patterns, with the overall goal of minimizing the error between modeled and measured output values. Quite a variety of neural-network models have been designed to fit a range of processes and materials, but this plethora of choices can sometimes be confusing to potential users, so much so that it even inhibits their application. In particular, neural-network models have multiplied for manufacturing and metallurgical engineering. Initial application of neural-network modeling to forging processes was conducted under the U.S. Air Force sponsored Forging Supplier Initiative, and it continues under the U.S. Air Force sponsored Metals Affordability Initiative.

A significant amount of math supports each type of neural-network structure. In fact, these are inherently very complex mathematical models, and it has been challenging to win acceptance by non-mathematician, practical engineers who consider math a tool and not an end in itself. Efforts at Imagination Engines Inc. have resulted in a modeling tool that has a user-friendly interface for inputting data, developing models, and analyzing results. Called PatternMaster, it enables engineers to develop and apply neural-network models on a desktop computer (Fig. 1). In addition, many of the possible neural-network options can be pre-selected to provide useful, fast, and straightforward application. Also, an optimization routine can be utilized that automatically seeks and develops optimum model configurations. This article discusses various neural-network models, and then shows how PatternMaster may be applied to develop products quickly and accurately.

*Fellow of ASM International

Neural-network models

Neural-network models include Perceptrons, Radial Basis Functions, Probabilistic Neural Networks, Generalized Regression Neural Networks, and several others. Of these, the Perceptron models are the most common and can be tailored for nearly any application. The name of this model type does not help in its acceptance by those unfamiliar with neural networks. The term Perceptron suggests images of the brain or some neuroscientific construct, while in fact it is simply a computational program with inputs and outputs. It can be regarded graphically as a collection of nodes in a series of layers. When a perceptron has more than two layers of nodes, it is called a Multilayer Perceptron, or MLP. A node can be schematically drawn as a point with inputs, outputs, and an activation function. Figure 2 shows a schematic of a neural-network model node.

Fig. 2 Schematic configuration of a node within a neural-network model, where In represents the inputs (two are shown), W represents the connection weights, Out is the output, and θ is the bias to the node.

Multilayer model

The layers in a simple perceptron model consist of an input layer (which contains nodes for each input data parameter) and an output layer (which contains nodes for each resultant data parameter). This type of arrangement is suitable for linear regression analyses of datasets. In reality, many real-world relationships are nonlinear and may involve synergistic effects between several input parameters. Therefore, simple linear regression modeling does not provide accurate representations of the general relationships involved with a series of inputs and outputs. To handle this higher level of complexity, additional layers are added to the simple perceptron. Each node in the added layer relates to the prior layer and to the subsequent layer with connections. The added layer or layers sandwiched between the initial input and output layers are called hidden layers. This structure, shown graphically in Fig. 3, allows for very complex equations that fit the relationship between the inputs and the outputs. The larger the number of hidden layers and nodes on each layer, the more capable the MLP will be of absorbing complex relationships. Fortunately, the form of the developed relationship is not needed prior to model construction, although it is best to attempt to model datasets with minimal layers and nodes.

Fig. 3 Schematic of a three-layer neural-network model consisting of an input layer, a hidden layer, and an output layer. The middle layers are called hidden layers.

Network nodes

The nodes in a neural-network model connect to all prior and subsequent nodes in a model. The connections are given values called weights. The node computes an output value based on the input weights and an activation function. The calculation of the output value (often called a signal) and the form of the activation function result in various types of neural-network models. The most common types of models form a weighted sum of the inputs and weights feeding any node, and this sum is passed along to an activation function. The most common type of activation function is the sigmoid function. The sigmoid function serves to switch any given node between low and high states to help model nonlinear behaviors. The ramp connecting these low and high regions assists in modeling linear relationships.
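To make the node computation concrete, here is a minimal Python sketch (not from the article; the numbers are purely illustrative) of a single node that forms the weighted sum of its inputs plus the bias θ and passes the result through a sigmoid activation function:

```python
import math

def sigmoid(x):
    # Switches smoothly between a low state (near 0) and a high state (near 1).
    return 1.0 / (1.0 + math.exp(-x))

def node_output(inputs, weights, bias):
    # Weighted sum of the signals feeding the node plus its bias (theta),
    # passed through the activation function to give the node's output signal.
    weighted_sum = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)

# A node with two inputs, as in Fig. 2 (values are arbitrary).
print(node_output([0.5, 1.2], [0.8, -0.3], bias=0.1))
```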
Training neural-network models

The process of training is aimed at developing the relationship that best fits the general function between the input and output parameters. Any error between predicted and actual output values is measured as each record within a dataset is passed through the neural network. Then the entire set of individual errors from the model establishes an error surface. The training algorithm then updates the connection weights so as to locate the minimum in the error surface.
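As a simplified illustration of what one point on that error surface represents, the following sketch (again illustrative, not the article's algorithm) sums the squared prediction errors of a single sigmoid node over a small dataset for one particular choice of weights and bias; evaluating it for many different weight values traces out the surface that training descends:

```python
import math

def total_squared_error(records, weights, bias):
    # One point on the error surface: the summed squared error over all
    # (inputs, target) records for one particular choice of weights and bias.
    total = 0.0
    for inputs, target in records:
        weighted_sum = sum(i * w for i, w in zip(inputs, weights)) + bias
        predicted = 1.0 / (1.0 + math.exp(-weighted_sum))   # sigmoid node output
        total += (predicted - target) ** 2
    return total

# Example records (invented): two input parameters, one target output.
records = [([0.2, 0.9], 0.0), ([0.8, 0.1], 1.0)]
print(total_squared_error(records, weights=[0.8, -0.3], bias=0.1))
```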

In multilayer neural-network models, the input data is passed in a feed-forward manner, shown as left to right in Fig. 3. Initially, the connection weights are set to random values. As the datasets are passed through the model, an error is calculated between the predicted outputs and the desired outputs. The corrections to all of the weights within the neural net are chosen so as to enter as rapidly as possible into the valleys of the error surface, in a process known as gradient descent. By forcing the network through such gradients, we find mathematically that the update to any given weight should be the product of the net's output error (appropriately weighted by all the connection weights leading back to the neuron it feeds), the first derivative of the recipient neuron's activation function in the neighborhood of its current state, and the raw signal coursing through that weight. An additional multiplicative constant called the learning rate can speed up or slow down the traversal of such gradients.

The magnitude of the learning rate is important in allowing the network weights to assume values that produce global minima, rather than local-error minima. High rates of training provide large changes from iteration to iteration based on the errors calculated, but can also lead to lack of resolution of the global minimum. The more complicated the model (i.e., the more hidden layers and nodes per layer), the greater the number of local minima. Therefore, the simplest model that works for an application will be the safest to train to avoid false minima and to determine the global minimum. Low rates of training can be a problem with complex models having large numbers of local minima, because the model may not be able to escape from a local minimum with the small jumps. A momentum term is also included in the error-correction term. If a correction is in the same general direction for several correction iterations, then the subsequent corrections gain momentum, which can allow low rate-of-training models to escape from local minima into global minima.
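A minimal sketch of the weight correction just described, written for a single sigmoid output neuron (the article gives no code; the learning-rate and momentum values here are purely illustrative):

```python
def sigmoid_derivative(output):
    # Derivative of the sigmoid, expressed in terms of the node's output value.
    return output * (1.0 - output)

def update_weight(weight, output_error, node_output, input_signal,
                  previous_delta, learning_rate=0.1, momentum=0.8):
    # Correction for one connection weight, following the description in the
    # text: output error x derivative of the receiving node's activation x raw
    # input signal, scaled by the learning rate, plus a momentum term that
    # carries a fraction of the previous correction forward.
    delta = learning_rate * output_error * sigmoid_derivative(node_output) * input_signal
    delta += momentum * previous_delta
    return weight + delta, delta

# One correction step for a single output-layer weight (hypothetical numbers).
weight, last_delta = 0.40, 0.0
target, node_output_value, input_signal = 1.0, 0.62, 0.9
weight, last_delta = update_weight(weight, target - node_output_value,
                                   node_output_value, input_signal, last_delta)
print(round(weight, 4), round(last_delta, 4))
```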
Automated training

Training a neural network is an iterative process that occurs automatically within the training algorithm. Training follows these steps:

Input: A set of training data is input into the model. The program processes each record and provides iterative corrections to the network's connection weights.

Training errors: During this process, the training error is minimized. The training error is defined as the error between the modeled outputs and the outputs in the training dataset. After the training error is minimized, any implicit relationships between input and output patterns have been absorbed into the neural network.

Optimal training: Optimally trained neural-network models describe a relationship that accurately represents the general correlation between the input and output parameters. If a model is under-trained, the general relationship may not be determined and therefore cannot be represented by the collection of connection weights within the model. On the other hand, if the model is over-trained, it will model the behavior of the training examples well, but might depart from the overall general relationship.

To show this graphically, Fig. 4 shows a set of plotted data points. The data may have noise in it, and is therefore not exact. A model of the general relationship of this data may best be a smooth line (B), but if the model is over-trained, a complex, higher-order relationship may be developed that fits the example training data well, yet causes problems when the model is applied to other examples it is expected to cover. Multilayer neural-network models with the minimum of hidden layers and nodes per layer will be resistant to over-training. The more complex the neural-network structure, the more capable the model is of forming complex, and possibly non-real, general relationships. Conversely, simpler neural-network model forms cannot depart too far from simple, low-order relationships.

Fig. 4 Training within a neural network, plotted as output parameter versus input parameter. Relationship A shows that the model is under-trained, and relationship C shows over-training. The optimal general relationship is shown at relationship B.
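The behavior sketched in Fig. 4 can be reproduced with any sufficiently flexible fitting tool. The short example below (an analogy using polynomial fits rather than a neural network, and assuming NumPy is available) fits the same noisy, essentially linear data with a simple model and a highly flexible one; the flexible fit typically achieves the lower training error but the larger error on points held out of the fit:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = 2.0 * x + rng.normal(0.0, 0.2, x.size)      # noisy, essentially linear data

train_x, train_y = x[::2], y[::2]               # points used to fit the model
test_x, test_y = x[1::2], y[1::2]               # points held out of the fit

for degree in (1, 9):                           # simple fit (B) vs. flexible fit (C)
    coeffs = np.polyfit(train_x, train_y, degree)
    train_err = np.mean((np.polyval(coeffs, train_x) - train_y) ** 2)
    test_err = np.mean((np.polyval(coeffs, test_x) - test_y) ** 2)
    print(degree, round(float(train_err), 4), round(float(test_err), 4))
```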

Goal of training

The goal of the training process is not to minimize the training error. Instead, the goal is to minimize the error when the model is used with data that was not used for training (i.e., set-aside data). This means that to correctly train a multilayer neural-network model, an available dataset should be divided into two subsets: a training set and a testing set. The training set trains the model, with progressive reduction in training error over successive iterations. The testing set serves to assess the so-called generalization error on a random population of representative data that was not part of the training of the model. The calculated average error between predicted and actual values in the testing dataset is evaluated to determine whether the model is properly trained.

Continued assessment of the training and generalization errors will show a decrease with time, but if the model is over-trained, the assessment error will start to increase with continued training. This is because the model is memorizing the pattern of the examples instead of gleaning the overall general pattern. In addition, noise in the training data, from data measurement errors or the like, will become part of the model, and the assessment dataset will most likely not fit exactly with the model developed from the training dataset. Once the model structure is established and it is optimally trained, the testing dataset is used to confirm the accuracy and acceptability of the model.

It is important to note that a trained multilayer neural-network model is typically good only at predicting outputs from inputs that are within the range of the training dataset. Some extrapolation can be done with caution with this type of model by adjusting the scaling factor in the activation function of each neuron.
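The set-aside approach described above can be sketched in a few lines of Python. The toy example below (illustrative only; the model is a single sigmoid node, and the split fraction and stopping rule are arbitrary choices, not recommendations from the article) trains on one subset, monitors the generalization error on the testing subset after every pass, and stops once that error begins to rise:

```python
import math, random

def predict(x, w, b):
    # Output of a one-input sigmoid node.
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def mean_sq_error(data, w, b):
    return sum((predict(x, w, b) - t) ** 2 for x, t in data) / len(data)

# Noisy example data, divided into a training set and a set-aside testing set.
rng = random.Random(0)
data = [(x / 20.0, (1.0 if x > 10 else 0.0) + rng.gauss(0.0, 0.1)) for x in range(20)]
rng.shuffle(data)
train, test = data[:15], data[15:]

w, b, rate = 0.0, 0.0, 0.5
best_test, rising = float("inf"), 0
for iteration in range(2000):
    for x, t in train:                              # one gradient-descent pass
        out = predict(x, w, b)
        delta = rate * (t - out) * out * (1.0 - out)
        w += delta * x
        b += delta
    gen_error = mean_sq_error(test, w, b)           # generalization error
    if gen_error < best_test:
        best_test, rising = gen_error, 0
    else:
        rising += 1
        if rising >= 20:                            # test error rising: stop training
            break
print(iteration, round(best_test, 4))
```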
Tailorable structures

The previous discussion of neural networks is high level and is not complete in any form or fashion. It is clear that the structure of neural-network models is very tailorable, which is good from the standpoint of flexibility, but is a negative from the standpoint of usability. For a neural-network modeling tool to be practical for engineers, the tool must guide users through the setup of the most appropriate model architecture and through the execution of neural-network model training. No single setting will be perfect for all modeling applications and datasets, but IEI has established a modeling program called PatternMaster in which many of the complex modeling parameters are pre-set. This program also provides an automated function for developing optimum modeling parameters. A wizard tool walks users through establishing an optimization routine that seeks an optimal model architecture and training parameters.

PatternMaster software

The PatternMaster software package, developed by IEI Inc., has several important features, including an XML-based script that describes details of the network architecture, training parameters, and file I/O, and a three-dimensional virtual-reality display of the neural net that assists in visualization of critical factors and underlying schema. The program is extremely fast and efficient at training due to its state-of-the-art model engine and IEI's patented STANNO (Self-Training Artificial Neural-Network Object) technology. Furthermore, a neural network built into the software automatically trains the neural network of interest, rather than requiring the engineer to do this manually. The trainer net learns by experience how to correct the weights of the trainee net. As a result, this training technique is much faster than traditional learning schemes such as conventional back-propagation.

PatternMaster has five main user-interface functions: Model Development Wizard, XML Program View, Network View, Input/Output Prediction Visualization, and Data View. The model development wizard creates the necessary XML training script and links in the relevant training data. After this operation, training of the neural-network model can begin. The network view (Fig. 5) shows the input, hidden, and output layers, as well as the associated connections. Through simple mouse clicks and drags, the user may quickly determine which input parameters are critical to a given output parameter, based on the trained model. The software also allows the user to assess any possible combination of inputs within the range of the training dataset to determine their effect on output parameters. This can be done, one set of input data points at a time, in the Input/Output Visualization screen, allowing the user to quickly assess interactions of input variables and their effects on output predictions.

PatternMaster also provides program files for the trained neural network, which can be linked to other programs or run as a standalone tool. The ability to export a program that emulates the trained neural network is important and extremely useful. It allows generation of the trained neural network in Excel, Java, C++, Fortran, and other codes. These output codes can be linked to other engineering tools such as DEFORM to allow prediction and visualization of forged-component properties for any set of input processing parameters. The software is programmed to provide an optimum set of neural-network parameters (layers, nodes, training rates, momentums, etc.), which therefore do not need to be set by an engineer, making it a useful engineering tool for developing and applying neural-network models.

Fig. 5 An example of a PatternMaster network view showing the layers, nodes, and connections (A). The skeleton view (B) indicates the most significant parameters that affect ultimate tensile strength: UTS is directly related to Cooling Rate (CR) and indirectly related to Solution Temperature.
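The exported program files described above amount to a hard-coded forward pass through the trained network. The sketch below is a hypothetical illustration of what such a generated routine might look like in Python; the layer sizes, weights, and function name are invented for illustration and are not PatternMaster output:

```python
import math

# Hypothetical weights and biases captured from a trained 2-4-1 network.
HIDDEN_WEIGHTS = [[0.8, -1.2], [0.3, 0.9], [-0.5, 0.4], [1.1, -0.7]]
HIDDEN_BIASES = [0.1, -0.2, 0.05, 0.3]
OUTPUT_WEIGHTS = [0.6, -0.4, 0.9, 0.2]
OUTPUT_BIAS = -0.1

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(inputs):
    # Forward pass only: an exported network carries no training code, just the
    # frozen weights needed to reproduce its predictions in another program.
    hidden = [sigmoid(sum(w * i for w, i in zip(ws, inputs)) + b)
              for ws, b in zip(HIDDEN_WEIGHTS, HIDDEN_BIASES)]
    return sigmoid(sum(w * h for w, h in zip(OUTPUT_WEIGHTS, hidden)) + OUTPUT_BIAS)

print(predict([0.5, 0.7]))
```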

Training models

For training a neural-network model, a rule of thumb says that a minimum of one record in a dataset is needed for every neural-network weight (the number of connections and node biases). This means that a model with three inputs, four outputs, and a single hidden layer of eight nodes requires a minimum of 68 training records (56 connections plus 12 biases for the hidden- and output-layer nodes). For optimum model development, it is helpful to have larger quantities of training data, which can be many times larger than this estimated minimum. Less training data results in less fidelity of the general relationship. Models having a large number of input/output parameters have been successfully trained with a small amount of data to determine which input variables contribute most significantly to the output. Once this is known, new models can be developed with a greatly reduced number of input parameters. This can allow for increased model accuracy of the relationships between the most significant factors when limited data is available.

Successful applications

The literature contains a number of citations regarding neural-network models for manufacturing and metallurgical engineering.

Rolling parameters: Neural-network models have helped to develop a relationship between processing parameters, roll settings, and final steel-plate rolling thicknesses. This industrial application is aimed at reducing scrap and improving quality and yield through proper selection, monitoring, and control of in-process manufacturing parameters.

Fatigue cracks: Neural-network modeling of superalloy fatigue crack growth rate has been successful. These efforts showed that second-stage fatigue crack growth rate could be predicted based on temperature, yield strength, ultimate tensile strength, and Young's modulus. The goal of these efforts is to develop a tool that could guide alloy design toward slower crack-growth-rate materials.

Tensile properties: Another neural-network model presented in the literature predicts the tensile properties of nickel-base superalloys based on alloy chemistry and temperature. This neural-network modeling effort has successfully predicted the tensile strength of a wide range of superalloy chemistries and test temperatures. The most significant input parameters (in order of significance) were temperature, percent titanium, percent aluminum, percent niobium, percent tungsten, percent molybdenum, and percent boron. This effort was also aimed at developing a predictive tool for alloy design and optimization.

Foundation design: Design engineers who develop, manufacture, and evaluate the construction of foundations have successfully applied neural-network models. It was noted in the literature that the neural-network approach to shallow and deep foundation modeling was equal to, and often superior to, that of conventional models. Geotechnical materials and structures are very complicated, and many features and interactions are not well understood. Conventional modeling requires assumptions about model equation forms, which often leads to errors. The neural-network models established relationships based on available data and did not require assumptions or theories.

Transformation kinetics: Researchers at Queen's University Belfast have developed commercially available trained neural-network models that provide transformation kinetic (TTT) data and mechanical-property data for titanium alloys as a function of chemistry. These tools are presumably trained and tested with literature data.

Metals Affordability Initiative: Current neural-network activities under the Metals Affordability Initiative include modeling of Ti-64 mechanical properties from measured input material compositions, microstructural features, and input processing parameters. Models are being created for Ti-64 at Ladish and OSU. From the input processing data, it was quickly determined that several chemical elements are critical for increasing strength in Ti-64, as well as strain and the heat-treat cooling rate. The developed models can provide predicted property results as a function of location within the cross-section of a forged and heat-treated component.

For more information: Dr. David Furrer is Manager, Advanced Materials & Process Technology, Ladish Co. Inc., Cudahy, WI 53110-8902; tel: 414/747-3063; e-mail: dfurrer@ladishco.com; Web site: www.ladishco.com. Dr. Stephen Thaler is Chairman and CEO of Imagination Engines Inc., 11970 Borman Drive, Suite 250, St. Louis, MO 63146-4153; tel: 314/317-2228 x 4428; e-mail: sthaler@imagination-engines.com; Web site: www.imagination-engines.com.
This effort was also aimed at developing a predictive tool for alloy design and optimization. Foundation design: Design engineers who develop, manufacture, and evaluate the construction of foundations have successfully applied neural-network models. It was noted in the literature that the neural-network approach to shallow and deep foundation modeling was equal to and often superior to that of conventional models. Geotechnical materials and structures are very complicated, and many features and interactions are not well understood. Conventional modeling requires assumptions of model equation forms, which often leads to errors. The neural-network model established relationships based on available data and did not require assumptions or theories. Transformation kinetics: Researchers at Queen s University in Belfast, Ireland, have developed commercially available trained neuralnetwork models that provide transformation kinetic information (TTT) data, and mechanical property data for titanium alloys as a function of chemistry. These tools are presumably trained and tested with literature data. Metals Affordability Initiative: Current neural-network activities under the Metals Affordability Initiative include modeling of Ti-64 mechanical properties from measured input material compositions, microstructural features, and input processing parameters. Models are being created for Ti-64 at Ladish and OSU. From input processing data, it is quickly determined that several chemical elements are critical for increasing strength in Ti-64, as well as strain and the heattreat cooling rate. The developed models can provide predicted property results as a function of location within the cross-section of a forged and heat-treated component. For more information: Dr. David Furrer is Manager, Advanced Materials & Process Technology, Ladish Co. Inc., Cudahy, WI 53110-8902; tel: 414/747-3063; e-mail: dfurrer@ladishco.com; Web site: www.ladishco.com. Dr. Stephen Thaler is Chairman and CEO of Imagination Engines Inc., 11970 Borman Drive, Suite 250, St. Louis, MO 63146-4153; tel: 314/317-2228 x 4428; e-mail: sthaler@imagination-engines.com; Web site: www. imagination-engines.com. 46 ADVANCED MATERIALS & PROCESSES/NOVEMBER 2005