Linear dimensionality reduction: Survey, insights, and generalizations

Cunningham, JP; Ghahramani, Z

Repository URI

https://www.repository.cam.ac.uk/handle/1810/248027

Files

Cunningham and Ghahramani 2015 Journal of Machine Learning Research.pdf (1.68 MB)

Type

Article

Authors

Cunningham, JP

Ghahramani, Zoubin

https://orcid.org/0000-0002-7464-6475

Abstract

Linear dimensionality reduction methods are a cornerstone of analyzing high dimensional data, due to their simple geometric interpretations and typically attractive computational properties. These methods capture many data features of interest, such as covariance, dynamical structure, correlation between data sets, input-output relationships, and margin between data classes. Methods have been developed with a variety of names and motivations in many fields, and perhaps as a result the connections between all these methods have not been highlighted. Here we survey methods from this disparate literature as optimization programs over matrix manifolds. We discuss principal component analysis, factor analysis, linear multidimensional scaling, Fisher's linear discriminant analysis, canonical correlations analysis, maximum autocorrelation factors, slow feature analysis, sufficient dimensionality reduction, undercomplete independent component analysis, linear regression, distance metric learning, and more. This optimization framework gives insight into some rarely discussed shortcomings of well-known methods, such as the suboptimality of certain eigenvector solutions. Modern techniques for optimization over matrix manifolds enable a generic linear dimensionality reduction solver, which accepts data and an objective to be optimized as input, and returns an optimal low-dimensional projection of the data as output. This simple optimization framework further allows straightforward generalizations and novel variants of classical methods, which we demonstrate here by creating an orthogonal-projection canonical correlations analysis. More broadly, this survey and generic solver suggest that linear dimensionality reduction can move toward becoming a blackbox, objective-agnostic numerical technology.
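
As an illustration of the "optimization program over a matrix manifold" view described in the abstract, the sketch below poses PCA as maximizing tr(M^T C M) over d x r matrices M with orthonormal columns and solves it by gradient ascent with a QR retraction. This is not the authors' solver: the function names, step size, iteration count, and synthetic data are all assumptions chosen for the example, and the method shown is a plain fixed-step scheme rather than any specific algorithm from the paper.

```python
import numpy as np

def stiefel_gradient_ascent(grad, d, r, steps=2000, lr=0.02, seed=0):
    """Maximize an objective over d x r matrices with orthonormal columns.

    `grad` returns the ordinary (Euclidean) gradient of the objective at M.
    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    # Random starting point with orthonormal columns.
    M, _ = np.linalg.qr(rng.standard_normal((d, r)))
    for _ in range(steps):
        G = grad(M)
        # Project the Euclidean gradient onto the tangent space at M.
        sym = 0.5 * (M.T @ G + G.T @ M)
        tangent = G - M @ sym
        # Step, then retract to orthonormal columns with a QR factorization.
        M, _ = np.linalg.qr(M + lr * tangent)
    return M

def pca_gradient(C):
    """Gradient of the PCA objective tr(M^T C M) for a symmetric matrix C."""
    return lambda M: 2.0 * C @ M

# Synthetic data (an assumption): 5 features with well-separated variances.
rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.2])
X -= X.mean(axis=0)
C = X.T @ X / len(X)  # sample covariance

# The same routine could take a different objective's gradient in place of PCA's.
M = stiefel_gradient_ascent(pca_gradient(C), d=5, r=2)

# Reference answer: the top-2 eigenvectors of C (eigh sorts eigenvalues ascending).
V = np.linalg.eigh(C)[1][:, -2:]

# Compare subspaces through their projectors; individual columns may differ
# by a rotation because the trace objective is rotation-invariant.
gap = np.linalg.norm(M @ M.T - V @ V.T)
print("projector difference:", gap)
```

For PCA the eigenvector solution is already optimal, so the iterative result should span the same subspace. The point of the sketch is only that the solver takes the objective's gradient as an argument, which is the sense in which such a routine can be objective-agnostic.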

Keywords

stat.ML

Journal Title

Journal of Machine Learning Research

Journal ISSN

1532-4435
1533-7928

Volume Title

16

Publisher

MIT Press

Publisher URL

http://jmlr.org/papers/v16/cunningham15a.html

Rights

Attribution-NonCommercial 2.0 UK: England & Wales

Sponsorship

JPC and ZG received funding from the UK Engineering and Physical Sciences Research Council (EPSRC EP/H019472/1). JPC received funding from a Sloan Research Fellowship, the Simons Foundation (SCGB#325171 and SCGB#325233), the Grossman Center at Columbia University, and the Gatsby Charitable Trust.

Collections

Scholarly Works - Engineering