Here is an update on the AMS-IMS-SIAM Joint Summer Research Conference on
Adaptive Selection of Models and Statistical Procedures, held on the campus
of Mount Holyoke College in South Hadley, Massachusetts, from Sunday,
June 23 through Thursday, June 27, 1996.
..........
Principal Invited Speakers:
Lucien Birge
Leo Breiman
Raphy Coifman
Ron Devore
Niels Keiding
Dan McFadden
Dominique Picard
Aad van der Vaart
Vladimir Vapnik
Grace Wahba

Organizers:
Andrew Barron (barron@stat.yale.edu)
Peter Bickel
David Donoho
Iain Johnstone
______________________________________________________________________
ADAPTIVE SELECTION OF MODELS AND STATISTICAL PROCEDURES
Andrew Barron (Yale University), chair
Peter Bickel (University of California, Berkeley), co-chair
Iain Johnstone (Stanford University), co-chair
David Donoho (Stanford University and University of California, Berkeley)
In recent decades scientific development has been accompanied by an
exponential increase in the volume and complexity of data of all sorts and
in the speed of computation and the potential complexity of methods of
analysis. In response, the goal of statisticians (and other scientists) has
come to be seen more and more clearly as obtaining descriptions that are
parsimonious and accurate, with good predictive power.
By parsimony we mean descriptions in terms of a few familiar components in
which variables are linked to a response by a simple mechanism. The
accuracy we refer to is in the identification of the components and the
estimation of the parameters linking the variables to a prediction. By
predictive power we mean that the fitted model is both not too parsimonious
(has small bias) and accurately estimated (has small variance).
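The bias and variance language above corresponds to the standard textbook decomposition of mean squared prediction error for an estimate of a response function, stated here for concreteness:

```latex
\mathbb{E}\bigl[\hat f(x) - f(x)\bigr]^2
  \;=\; \underbrace{\bigl(\mathbb{E}\,\hat f(x) - f(x)\bigr)^2}_{\text{squared bias}}
  \;+\; \underbrace{\mathrm{Var}\bigl(\hat f(x)\bigr)}_{\text{variance}}
```

A model that is too parsimonious inflates the first term; one that is too complex inflates the second.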
Both in statistics and related fields such as signal processing, machine
learning, and information theory the focus has been more and more on
accurately selecting and fitting parsimonious models from ever larger
families of such models. The more complex the data, the more potential ways
there are of describing them. There has been a huge range of statistical
approaches. The small-parameter models of classical statistics have been
extended to semiparametric modelling, particularly in biostatistics
and econometrics. These models are reasonably parsimonious, accurately
estimated, and theoretically well understood, but as with the classical
parametric models they may have little predictive power.
Other approaches have come out of nonparametric modelling and curve
estimation in statistics and the recently growing neural network field. A
pragmatic and, to some extent, applications-oriented point of view here has
given rise to a variety of computer intensive methods. The resulting
procedures are often observed to have good predictive power, but are not
necessarily parsimonious, and their action may not be easy to understand.
One theoretical viewpoint relates the statistics problem to notions of
regularization in classical applied mathematics, giving rise to penalized
maximum likelihood approaches and the method of sieves. Theoretical work in
nonparametric statistics has mathematically characterized the result of the
bias-variance tradeoff for assumed classes of response functions, but we
are only beginning to understand the extent to which it is possible to
achieve the tradeoff adaptively through model selection. Procedures for
model selection seek a data-directed balance between parsimony and
accuracy. Model selection criteria have been based on estimates of
statistical risk, on Bayes procedures, on information-theoretic
characterizations of data description, or, pragmatically, on what bounds can
be developed for the risk of the adaptively selected models.
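As a concrete illustration of a risk-estimate criterion at work (a sketch, not drawn from the conference program; the function names here are our own), Akaike's AIC can adaptively select a polynomial degree, trading residual fit against the number of parameters:

```python
import numpy as np

def gaussian_aic(y, yhat, n_params):
    """AIC under a Gaussian error model: n*log(RSS/n) + 2*(n_params + 1).

    The +1 accounts for the estimated noise variance."""
    n = len(y)
    rss = float(np.sum((y - yhat) ** 2))
    return n * np.log(rss / n) + 2 * (n_params + 1)

def select_polynomial_degree(x, y, max_degree=8):
    """Fit polynomials of degree 0..max_degree by least squares and
    return (best_degree, aic_scores), choosing the AIC minimizer."""
    scores = []
    for d in range(max_degree + 1):
        coef = np.polyfit(x, y, d)          # least-squares fit of degree d
        yhat = np.polyval(coef, x)
        scores.append(gaussian_aic(y, yhat, d + 1))  # d+1 coefficients
    return int(np.argmin(scores)), scores

# Demo on simulated data: a cubic signal with Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 200)
y = x**3 - x + rng.normal(scale=0.1, size=x.size)
best_degree, aic_scores = select_polynomial_degree(x, y)
```

On data like this the minimizer typically lands at degree 3: lower degrees carry large bias, while higher degrees buy negligible reductions in residual sum of squares at a fixed penalty of 2 per extra parameter.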
At this conference we try to bring these strands together. Can a mixed
group of mathematical and computational statisticians, econometricians,
biostatisticians, engineers and computer scientists agree as to how to
adequately measure parsimony, accuracy, and predictability? Given
agreed-upon criteria, which methods are most useful, and where? Can some of
the methods with more theory serve just as usefully as the more ad hoc
methods on real data? Can real theory be developed for some of the ad hoc
methods? In a
variety of settings -- from biostatistics, to wavelet analysis of time
series and images, to sparse multidimensional data analysis -- a discussion
of how the notions of parsimony, accuracy, and predictability are
interpreted and how extensions of different methods compare should prove
very valuable.