TY - GEN
T1 - Linear discriminant analysis with more variables than observations
T2 - 11th Biennial Conference of the International Federation of Classification Societies, IFCS 2009, and 33rd Annual Conference of the German Classification Society (Gesellschaft für Klassifikation) on Classification as a Tool for Research, GfKl 2009
AU - Silva, A. Pedro Duarte
N1 - Copyright 2013 Elsevier B.V., All rights reserved.
PY - 2010
Y1 - 2010
AB - A new linear discrimination rule, designed for two-group problems with many correlated variables, is proposed. This proposal tries to incorporate the most important patterns revealed by the empirical correlations while approximating the optimal Bayes rule as the number of variables grows without limit. In order to achieve this goal the new rule relies on covariance matrix estimates derived from Gaussian factor models with small intrinsic dimensionality. Asymptotic results show that, when the model assumed for the covariance matrix estimate is a reasonable approximation to the true data generating process, the expected error rate of the new rule converges to an error close to that of the optimal Bayes rule, even in several cases where the number of variables grows faster than the number of observations. Simulation results suggest that the new rule clearly outperforms both Fisher's and Naive linear discriminant rules in the data conditions it was designed for.
UR - http://www.scopus.com/inward/record.url?scp=79959704165&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-10745-0_24
DO - 10.1007/978-3-642-10745-0_24
M3 - Conference contribution
AN - SCOPUS:79959704165
SN - 9783642107443
T3 - Studies in Classification, Data Analysis, and Knowledge Organization
SP - 227
EP - 234
BT - Classification as a Tool for Research - Proceedings of the 11th IFCS Biennial Conference and 33rd Annual Conference of the Gesellschaft für Klassifikation e.V., GfKl 2009
Y2 - 13 March 2009 through 18 March 2009
ER -