Discarding variables in a principal component analysis: algorithms for all-subsets comparisons

António Pedro Duarte Silva*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

20 Citations (Scopus)

Abstract

The traditional approach to the interpretation of the results from a Principal Component Analysis implicitly discards variables that are weakly correlated with the most important and/or most interesting Principal Components. Some authors argue that this practice is potentially misleading and that it is preferable to take a variable selection approach, comparing variable subsets according to appropriate approximation criteria. In this paper, we propose algorithms for the comparison of all possible subsets according to some of the most important comparison criteria proposed to date. The computational effort of the proposed algorithms is studied and it is shown that, given current computer technology, they are feasible for problems involving up to thirty variables. A free-domain software implementation can be downloaded from the Internet.
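To make the idea of an all-subsets comparison concrete, here is a minimal brute-force sketch. It is not the algorithms proposed in the paper. The abstract does not name the comparison criteria, so the criterion used here is an assumption: the proportion of the total variance of all variables that is accounted for by linear regression on the retained subset, tr(S[:,K] S[K,K]^-1 S[K,:]) / tr(S). The function names and the example matrix are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    a is a square matrix and b a matrix with the same number of rows,
    both given as lists of rows. Returns x as a list of rows.
    """
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular submatrix")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def explained_variance(s, subset):
    """Assumed criterion: tr(S[:,K] S[K,K]^-1 S[K,:]) / tr(S)."""
    k = list(subset)
    p = len(s)
    s_kk = [[s[i][j] for j in k] for i in k]            # covariance within the subset
    s_k_all = [[s[i][j] for j in range(p)] for i in k]  # subset rows, all columns
    x = solve(s_kk, s_k_all)                            # S[K,K]^-1 S[K,:]
    num = sum(s[j][k[i]] * x[i][j] for i in range(len(k)) for j in range(p))
    return num / sum(s[j][j] for j in range(p))


def best_subset(s, size):
    """Exhaustively compare every subset of the given size."""
    return max(combinations(range(len(s)), size),
               key=lambda k: explained_variance(s, k))


# Hypothetical 3-variable correlation matrix, for illustration only.
S = [[1.0, 0.9, 0.3],
     [0.9, 1.0, 0.0],
     [0.3, 0.0, 1.0]]
```

With this matrix, `best_subset(S, 1)` selects variable 0, while `best_subset(S, 2)` selects variables 1 and 2. The best pair need not contain the best single variable, which is one reason to compare all subsets rather than select stepwise. Plain enumeration like this visits every subset, and thirty variables have 2^30 (about 1.07 billion) subsets, so problems of that size call for more efficient algorithms.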
Original language: English
Pages (from-to): 251-271
Number of pages: 21
Journal: Computational Statistics
Volume: 17
Issue number: 2
Publication status: Published - 2002

Keywords

  • All-subsets algorithms
  • Principal component analysis
  • Principal variables
  • Variable selection
