NIPS 96 Workshop on Error Surfaces

Towards More Convex Error Surfaces

Talk by Grace Wahba at the NIPS Error Surfaces Workshop, December 7, 1996, at Snowmass CO.

Department of Statistics, University of Wisconsin, Madison.


In this talk I argued that if the error surface you are trying to minimize, as a function of many variables, is sufficiently nasty, as it frequently is when fitting a sigmoidal-basis-function feedforward neural net, then you should think about reformulating the optimization problem to be solved. In keeping with the informality of a workshop, a number of ideas were thrown out, not all combinations of which have been tested.

It was argued that sigmoidal basis functions should be parametrized by (not too many) unknown scale factors, by a unit vector $\gamma$, and by a distance $b$ along the unit vector: $b$ tells you how far to go along the unit vector to reach the half-power point of the sigmoidal function, and a scale factor, say $\alpha$, tells you the slope of the sigmoidal function at the half-power point.

I proposed (for fixed scale factors) the idea of generating a large number of unit vectors, and possibly half-power points, via a random number generator, and using them to build a large library of basis functions; then using a fast rank-one update to iteratively select one new basis function at a time to minimize the penalized likelihood functional, possibly with an iterative update of the smoothing parameter(s) via GCV, perhaps preceded by a screening of the library via support vector methods. Since least squares fits of sigmoidal basis functions are in general fairly insensitive to the scale factors, it is probably appropriate to allow only a small discrete set of scale factors, equally spaced on a log scale, in the search.

I also gave a talk in the NIPS Model Complexity Workshop on December 6, 1996, which provides some related details and discusses other basis functions and various penalty functionals. Further information and references are available via my nips.96 workshop talks home page. The overhead slides for the Model Complexity Workshop are here.
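The random-library and greedy-selection idea can be sketched as follows. This is a minimal illustration, not the implementation from the talk: the library size, the log-spaced scale grid, the way half-power points are drawn, and the use of a simple ridge penalty in place of the full penalized likelihood functional (and of a refit-from-scratch in place of the fast rank-one update) are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_library(X, n_basis=200, scales=(0.5, 2.0, 8.0)):
    """Randomly generated library of sigmoidal basis functions.

    Each basis function is sigma(alpha * (x . gamma - b)): gamma a random
    unit vector, b the half-power point along gamma (drawn within the range
    of the projected data), alpha a slope scale from a small discrete grid
    (illustrative values; the talk suggests log-spaced scale factors)."""
    n, d = X.shape
    gammas = rng.standard_normal((n_basis, d))
    gammas /= np.linalg.norm(gammas, axis=1, keepdims=True)
    proj = X @ gammas.T                           # (n, n_basis) projections
    bs = rng.uniform(proj.min(axis=0), proj.max(axis=0))  # half-power points
    alphas = rng.choice(scales, size=n_basis)
    return 1.0 / (1.0 + np.exp(-alphas * (proj - bs)))   # (n, n_basis)

def greedy_select(Phi, y, k=10, lam=1e-3):
    """Greedily add one basis column at a time to minimize a ridge-penalized
    least squares criterion (stand-in for the penalized likelihood).

    Refits from scratch at each step for clarity; a fast rank-one/QR update
    of the fit would avoid the refit, as suggested in the talk."""
    chosen = []
    for _ in range(k):
        best, best_crit = None, np.inf
        for j in range(Phi.shape[1]):
            if j in chosen:
                continue
            A = Phi[:, chosen + [j]]
            coef = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)
            crit = np.sum((y - A @ coef) ** 2) + lam * np.sum(coef ** 2)
            if crit < best_crit:
                best, best_crit = j, crit
        chosen.append(best)
    return chosen
```

For example, `greedy_select(make_library(X), y, k=5)` returns the indices of five library columns chosen one at a time; a support-vector-style screening pass, as mentioned above, would shrink the library before this loop.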
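The talk mentions updating the smoothing parameter(s) via GCV. Below is a minimal sketch of the standard generalized cross validation score for a ridge-penalized least squares fit, $V(\lambda) = \frac{\frac{1}{n}\|(I - H(\lambda))y\|^2}{\left[\frac{1}{n}\,\mathrm{tr}(I - H(\lambda))\right]^2}$ with influence matrix $H(\lambda) = A(A^{T}A + \lambda I)^{-1}A^{T}$; the grid search over $\lambda$ is an illustrative stand-in for the iterative update mentioned in the talk.

```python
import numpy as np

def gcv(A, y, lam):
    """GCV score V(lambda) for the ridge-penalized fit with design A.

    Forms the n x n influence matrix explicitly, which is fine for a sketch
    but not how one would compute it for large n."""
    n, p = A.shape
    H = A @ np.linalg.solve(A.T @ A + lam * np.eye(p), A.T)  # influence matrix
    resid = y - H @ y
    return (np.sum(resid ** 2) / n) / (np.trace(np.eye(n) - H) / n) ** 2

def gcv_select(A, y, lams):
    """Pick the smoothing parameter minimizing V(lambda) over a grid."""
    scores = [gcv(A, y, lam) for lam in lams]
    return lams[int(np.argmin(scores))]
```

In the scheme described above, one would alternate between adding a basis function and re-selecting $\lambda$, e.g. `lam = gcv_select(Phi[:, chosen], y, np.logspace(-6, 1, 15))` after each greedy step.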

Key words: error surfaces, sigmoidal basis functions, penalized likelihood, iterative update, support vector, generalized cross validation.
Click here for recent technical reports by Grace Wahba, students, and colleagues.