COST Action IC0702 - SoftStat




Work Group A  

Model Selection and Validation

Statistical Validation and Monitoring of Soft Computing Models

Currently only very simple statistical techniques are applied to validate and assess soft computing models. However, a better, objective validation from the point of view of inferential statistics, which takes the state of the art into account, can lead to an integral methodology with interesting results and high application potential. For example, a good way to cope with the problem of overfitting (that is, adapting a model with possibly too many degrees of freedom to accidental and spurious properties of the training data set) is to assume a true underlying model (describable, of course, by a soft computing technique) from which the observed data has been sampled and then disturbed by random noise. This view of the situation allows us to apply inferential statistical techniques to assess the performance of the model adequately and to handle the given data properly.

Furthermore, resampling techniques like bootstrap and bagging provide means to measure the goodness of fit of a model, to objectively compare different learning methods, and to improve performance by combining multiple models, thus removing or at least mitigating the variance of the individual models. Statistical analysis can also be highly useful for monitoring the resulting system once it is deployed in an application: system failures, change points, loss of fit etc. can be detected, and perhaps even predicted, by analyzing the time series of the inputs and outputs, and may lead to statistics-based model adaptation techniques.
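To make the bootstrap/bagging idea concrete, the following is a minimal sketch (not part of the Action's methodology): a deliberately over-flexible model is fitted on bootstrap resamples of noisy data, and the resample predictions are averaged. All names, the polynomial stand-in model, and the parameter choices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: noisy samples from a "true underlying model".
x = np.linspace(0.0, 1.0, 60)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

def fit_poly(xs, ys, degree=7):
    """Fit a (deliberately over-flexible) polynomial model."""
    return np.polyfit(xs, ys, degree)

# Bagging: fit the model on bootstrap resamples and average the predictions.
n_models = 50
x_eval = np.linspace(0.0, 1.0, 200)
preds = np.empty((n_models, x_eval.size))
for i in range(n_models):
    idx = rng.integers(0, x.size, size=x.size)   # resample with replacement
    preds[i] = np.polyval(fit_poly(x[idx], y[idx]), x_eval)

bagged = preds.mean(axis=0)   # ensemble prediction
spread = preds.std(axis=0)    # bootstrap estimate of prediction variability

# Compare a single fit and the bagged ensemble against the noise-free truth.
truth = np.sin(2 * np.pi * x_eval)
single = np.polyval(fit_poly(x, y), x_eval)
mse_single = np.mean((single - truth) ** 2)
mse_bagged = np.mean((bagged - truth) ** 2)
print(f"single model MSE: {mse_single:.4f}, bagged MSE: {mse_bagged:.4f}")
```

The per-point standard deviation `spread` across resamples is exactly the kind of objective, data-driven uncertainty measure that plain point predictions do not provide.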
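As a sketch of the monitoring idea, one classical statistical tool for detecting change points or loss of fit in a deployed model is a CUSUM chart on the model's residuals. The shift size, allowance `k`, and threshold `h` below are illustrative assumptions, not prescribed values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Residuals of a deployed model: in control at first, then the process drifts.
residuals = np.concatenate([
    rng.normal(0.0, 1.0, 200),   # model still fits
    rng.normal(2.0, 1.0, 100),   # loss of fit: mean shift in the residuals
])

def cusum(xs, k=0.5, h=8.0):
    """One-sided CUSUM on standardized residuals.

    k: allowance (roughly half the shift to detect), h: decision threshold.
    Returns the index of the first alarm, or None if no alarm is raised.
    """
    s = 0.0
    for t, xt in enumerate(xs):
        s = max(0.0, s + xt - k)   # accumulate evidence of an upward shift
        if s > h:
            return t
    return None

alarm = cusum(residuals)
print("change detected at sample", alarm)
```

An alarm like this can trigger exactly the statistics-based model adaptation mentioned above: refitting, or re-weighting recent data.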

Model Selection and Validation for Neural Networks

Even though support vector machines rely on statistical principles, only part of the power and flexibility of neural networks is exploited in combination with statistical techniques. For example, in support vector machines the kernel functions, by which the coordinate transformation is achieved, remain centered at the data points, and their parameters (like the "radius" or "window width" of the kernel functions) are not adapted during training. Neural networks, on the other hand, if trained with some gradient descent scheme like error back-propagation, allow for such adaptations: in principle, any parameter can be treated as an argument of the error function and thus be adapted. It is obviously desirable to have methods that can provide bounds on the expected performance in these cases.

Statistical methods can also be useful in the design and training of neural networks. For instance, the number of neurons in the hidden layer(s) of both multilayer perceptrons and radial basis function networks, as well as the structure of the network, can be examined from a statistical point of view, providing rules for the best choice. Objective methods for comparing different network structures can be developed in terms of statistical hypothesis tests. On the other hand, since neural networks are more flexible than most of the usual predictive statistical methods, it is desirable to analyze possible combinations in order to balance predictive accuracy (which tends to be higher for more flexible methods) against interpretability and robustness (which favor simpler models).
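One standard way to compare two model structures by a hypothesis test is a paired t-test on per-fold cross-validation errors. The sketch below uses polynomial regressors of different degrees as a lightweight stand-in for network structures of different complexity (e.g., different hidden-layer sizes); the data, degrees, and fold count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy samples of a smooth target function.
x = rng.uniform(-1.0, 1.0, 100)
y = np.sin(3 * x) + rng.normal(scale=0.2, size=x.size)

def cv_fold_errors(degrees, n_folds=10):
    """Test error of each candidate structure on each cross-validation fold.

    The same folds are used for all candidates, so the errors are paired.
    """
    idx = rng.permutation(x.size)
    folds = np.array_split(idx, n_folds)
    errs = np.zeros((len(degrees), n_folds))
    for j, test in enumerate(folds):
        train = np.setdiff1d(idx, test)
        for i, deg in enumerate(degrees):
            coeffs = np.polyfit(x[train], y[train], deg)
            errs[i, j] = np.mean((np.polyval(coeffs, x[test]) - y[test]) ** 2)
    return errs

errs = cv_fold_errors([3, 10])   # two candidate "structures"
d = errs[1] - errs[0]            # paired per-fold error differences
t_stat = d.mean() / (d.std(ddof=1) / np.sqrt(d.size))
print("mean fold errors:", errs.mean(axis=1))
print(f"paired t statistic: {t_stat:.2f}")
```

Comparing `t_stat` against a t distribution with `n_folds - 1` degrees of freedom then gives an objective criterion for preferring one structure over the other, rather than relying on a single train/test split.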