Pre- and Post-processing
All neural networks take numeric input and produce numeric output. The transfer function of a unit is typically chosen so that it can accept input in any range and produces output in a strictly limited range (it has a squashing effect). Although the input can be in any range, there is a saturation effect, so the unit is only sensitive to inputs within a fairly limited range. The illustration below shows one of the most common transfer functions, the logistic function (also sometimes referred to as the sigmoid function, although strictly speaking it is only one example of a sigmoid, or S-shaped, function). In this case, the output is in the range (0,1), and the unit is sensitive to inputs in a range not much larger than (-1,+1). The function is also smooth and easily differentiable, properties that are critical in allowing the network training algorithms to operate (this is why the step function is not used in practice).
[Figure: the logistic transfer function]
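The squashing and saturation behaviour described above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `logistic` and `logistic_derivative` are chosen here for illustration, and the formula used is the standard logistic function 1 / (1 + e^-x).

```python
import math


def logistic(x: float) -> float:
    """Logistic transfer function: squashes any real input towards (0, 1)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def logistic_derivative(x: float) -> float:
    """Slope of the logistic function, written in terms of its own output."""
    y = logistic(x)
    return y * (1.0 - y)


# The slope is largest near zero and almost vanishes for large |x| (saturation).
for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"x={x:+6.1f}  f(x)={logistic(x):.4f}  slope={logistic_derivative(x):.6f}")
```

The slope column is what a gradient-based training algorithm relies on; a step function would give a slope of zero everywhere except at the jump, which is why it is not usable in the same way.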
The limited numeric response range, together with the fact that information has to be in numeric form, implies that neural solutions require preprocessing and post-processing stages to be used in real applications (see Bishop, 1995). Two issues need to be addressed:
Scaling. Numeric values have to be scaled into a range that is appropriate for the network. Typically, raw variable values are scaled linearly. In some circumstances, non-linear scaling may be appropriate (for example, if you know that a variable is exponentially distributed, you might take the logarithm). Non-linear scaling is not supported in ST Neural Networks. Instead, you should scale the variable using STATISTICA’s data transformation facilities before transferring the data to ST Neural Networks.
Nominal variables. Nominal variables may be two-state (e.g., Gender={Male,Female}) or many-state (i.e., more than two states). A two-state nominal variable is easily represented by transformation into a numeric value (e.g., Male=0, Female=1). Many-state nominal variables are more difficult to handle. They can be represented using an ordinal encoding (e.g., Dog=0, Budgie=1, Cat=2), but this implies a (probably) false ordering on the nominal values: in this case, that Budgies are in some sense midway between Dogs and Cats. A better approach, known as one-of-N encoding, is to use a number of numeric variables to represent the single nominal variable. The number of numeric variables equals the number of possible values; one of the N variables is set, and the others are cleared (e.g., Dog={1,0,0}, Budgie={0,1,0}, Cat={0,0,1}). ST Neural Networks has facilities to convert both two-state and many-state nominal variables for use in the neural network. Unfortunately, a nominal variable with a large number of states would require a prohibitive number of numeric variables for one-of-N encoding, driving up the network size and making training difficult. In such a case it is possible (although unsatisfactory) to model the nominal variable using a single numeric ordinal; a better approach is to look for a different way to represent the information.
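The scaling step can be sketched in a few lines. This is an illustrative example only: it is not how ST Neural Networks or STATISTICA implement scaling, the helper names (`fit_minmax`, `apply_scaling`, `invert_scaling`) are hypothetical, the target range (0,1) is an assumption, and the income figures are made up.

```python
import math


def fit_minmax(values, lo=0.0, hi=1.0):
    """Return (scale, shift) so that scale * v + shift maps [min, max] onto [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Constant variable: send every value to the middle of the target range.
        return 0.0, (lo + hi) / 2.0
    scale = (hi - lo) / (vmax - vmin)
    return scale, lo - scale * vmin


def apply_scaling(values, scale, shift):
    """Pre-processing: linear scaling of raw values into the network's range."""
    return [scale * v + shift for v in values]


def invert_scaling(values, scale, shift):
    """Post-processing: map scaled values back to the original units."""
    if scale == 0.0:
        raise ValueError("cannot invert the scaling of a constant variable")
    return [(v - shift) / scale for v in values]


incomes = [12_000.0, 30_000.0, 48_000.0, 120_000.0]  # made-up sample values

# Linear scaling of the raw values.
scale, shift = fit_minmax(incomes)
scaled = apply_scaling(incomes, scale, shift)

# Non-linear option: take logarithms first, then scale the result linearly.
log_incomes = [math.log(v) for v in incomes]
log_scaled = apply_scaling(log_incomes, *fit_minmax(log_incomes))

print(scaled)
print(log_scaled)
```

Keeping `scale` and `shift` around matters: the same pair fitted on the training data has to be reused for new cases, and inverted when a scaled output is converted back to real units.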

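The two-state and one-of-N encodings for nominal variables can be sketched as follows. The function names `two_state` and `one_of_n` are hypothetical and chosen for this example; the state lists reuse the Gender and Dog/Budgie/Cat examples from the text.

```python
def two_state(value, states):
    """Encode a two-state nominal value as a single number, 0 or 1."""
    if len(states) != 2:
        raise ValueError("two_state expects exactly two states")
    return states.index(value)  # raises ValueError for an unknown value


def one_of_n(value, states):
    """Encode a nominal value as N numbers: one set to 1, the others cleared to 0."""
    if value not in states:
        raise ValueError(f"unknown state: {value!r}")
    return [1 if state == value else 0 for state in states]


GENDER = ["Male", "Female"]
PETS = ["Dog", "Budgie", "Cat"]

print(two_state("Female", GENDER))  # 1
print(one_of_n("Budgie", PETS))     # [0, 1, 0]
```

The length of the encoded list equals the number of states, which is why a nominal variable with very many states quickly becomes expensive to represent this way.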
Prediction problems may be divided into two main categories:
Classification. In classification, the objective is to determine to which of a number of discrete classes a given input case belongs. Examples include credit assignment (is this person a good or bad credit risk), cancer detection (tumor, clear), and signature recognition (forgery, true). In all these cases, the output required is clearly a single nominal variable. The most common classification tasks are (as above) two-state, although many-state tasks also occur.
Regression. In regression, the objective is to predict the value of a (usually) continuous variable: tomorrow’s stock price, the fuel consumption of a car, next year’s profits. In this case, the output required is a single numeric variable.
Neural networks can actually perform a number of regression and/or classification tasks at once, although commonly each network performs only one. In the vast majority of cases, therefore, the network will have a single output variable, although in the case of many-state classification problems, this may correspond to a number of output units (the post-processing stage takes care of the mapping from output units to output variables). If you do define a single network with multiple output variables, it may suffer from cross-talk (the hidden neurons experience difficulty learning, as they are attempting to model at least two functions at once). The best solution is usually to train separate networks for each output, then to combine them into an ensemble so that they can be run as a unit.
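The mapping from output units back to a single nominal output variable might look like the sketch below. The text does not specify the decoding rule, so two assumptions are made here: a winner-takes-all rule for one-of-N outputs, and a 0.5 cut-off for a single two-state output. The function names and the activation values are hypothetical.

```python
def decode_one_of_n(outputs, states):
    """Return the state whose output unit is most active (assumed winner-takes-all rule)."""
    if len(outputs) != len(states):
        raise ValueError("need exactly one output unit per state")
    winner = max(range(len(outputs)), key=lambda i: outputs[i])
    return states[winner]


def decode_two_state(output, states, threshold=0.5):
    """Map one output in (0, 1) to one of two states; the 0.5 threshold is an assumption."""
    return states[1] if output >= threshold else states[0]


PETS = ["Dog", "Budgie", "Cat"]
DIAGNOSIS = ["clear", "tumor"]  # assumed coding: clear=0, tumor=1

print(decode_one_of_n([0.10, 0.70, 0.20], PETS))  # Budgie
print(decode_two_state(0.83, DIAGNOSIS))          # tumor
```

In practice a decoding rule may also reject cases where no unit is clearly dominant; that refinement is left out of this sketch.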
