Approximate Message Passing for Matrix Regression
Abstract
Approximate message passing (AMP) algorithms have become popular for a range of structured high-dimensional statistical problems. Previous AMP algorithms for generalized linear models (GLMs) typically require the signal to be a vector and the design matrix to have independent and identically distributed (i.i.d.) Gaussian entries. In this thesis, we study estimation problems where these assumptions are no longer satisfied.
To relax the vector signal assumption, we introduce the matrix GLM, which accommodates matrix signals, and propose an AMP algorithm for estimation, rigorously characterizing its performance in the high-dimensional limit. We illustrate the model's usefulness through prominent special cases: mixed linear regression, max-affine regression, and mixture-of-experts. For max-affine regression, we propose an algorithm that combines AMP with expectation-maximization to estimate the intercepts of the model along with the signals.
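As background for readers unfamiliar with AMP, the following is a minimal sketch of the classical AMP recursion for a *linear* model with i.i.d. Gaussian design, the vector setting that the matrix GLM generalizes. The soft-thresholding denoiser, the residual-based threshold rule, and all parameter values are illustrative choices for a sparse signal, not the algorithm developed in the thesis.

```python
import numpy as np

def soft_threshold(v, theta):
    # Elementwise soft-thresholding denoiser, suited to sparse signals.
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def amp_linear(y, A, alpha=1.5, n_iter=30):
    """Sketch of AMP for y = A x + noise, A with i.i.d. N(0, 1/m) entries.

    `alpha` scales the threshold relative to the residual level; this is
    one common heuristic, not the thesis's tuning.
    """
    m, n = A.shape
    x = np.zeros(n)
    z = y.copy()
    for _ in range(n_iter):
        # Threshold tracks the current effective noise level.
        theta = alpha * np.sqrt(np.mean(z ** 2))
        # Effective observation: estimate plus matched-filter residual.
        x = soft_threshold(x + A.T @ z, theta)
        # Onsager correction term (z * count / m) keeps the effective
        # noise in the next iterate approximately Gaussian.
        z = y - A @ x + z * (np.count_nonzero(x) / m)
    return x
```

The Onsager correction in the residual update is what distinguishes AMP from plain iterative thresholding and enables the exact high-dimensional performance characterization via state evolution.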
To relax the Gaussian design assumption, we extend the matrix GLM to allow a generalized white noise design matrix (instead of a Gaussian one), and propose an AMP algorithm for estimating the matrix signal, again rigorously characterizing its performance. We apply this model to the pooled data and quantitative group testing (QGT) problems, where the design matrix is binary-valued. For the noiseless pooled data setting, we show that the studied AMP algorithm is equivalent to one recently proposed by El Alaoui et al. Our results provide a rigorous version of their performance guarantees, which were previously obtained via non-rigorous techniques. For comparison, we also propose estimators based on convex relaxation and iterative thresholding, without theoretical guarantees.
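To make the pooled data setting concrete, the toy generator below builds a noiseless instance: each item carries one of several category labels (encoded as a one-hot row of a signal matrix), each test pools a random subset of items via a Bernoulli(1/2) binary design, and each observation is the vector of category counts within the pool. QGT is the two-category special case (defective / non-defective). All sizes and the Bernoulli(1/2) density are placeholder choices for illustration, not the thesis's settings.

```python
import numpy as np

def pooled_data_instance(n_items=200, n_tests=60, n_categories=3, seed=1):
    """Toy noiseless pooled data instance (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    # Each item gets a category; B is the one-hot signal matrix.
    labels = rng.integers(0, n_categories, size=n_items)
    B = np.eye(n_categories, dtype=int)[labels]          # (n_items, n_categories)
    # Binary test design: item j is included in test i iff X[i, j] = 1.
    X = rng.binomial(1, 0.5, size=(n_tests, n_items))
    # Each observation row gives the category counts within that pool.
    Y = X @ B                                            # (n_tests, n_categories)
    return X, B, Y
```

A sanity check on this model: the counts in each observation row must sum to the size of the corresponding pool, since every pooled item contributes to exactly one category.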
To further improve the performance of AMP for QGT and pooled data, we introduce a spatially coupled Bernoulli test matrix together with a tailored AMP algorithm. We rigorously characterize the algorithm's asymptotic performance in both the noiseless and noisy settings, and prove that in the noiseless case it achieves almost-exact recovery with a number of tests sublinear in the number of items. For both QGT and pooled data, this is the first efficient scheme that provably achieves recovery in the linear regime (where the number of defective items scales linearly with the total number of items) with a sublinear number of tests, and its performance degrades gracefully in the presence of noise.
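The following sketch illustrates one standard way to build a spatially coupled Bernoulli matrix: the columns are split into Λ blocks and the rows into Λ + ω − 1 blocks, and block (r, c) is filled with i.i.d. Bernoulli(p) entries only on the diagonal band c ≤ r ≤ c + ω − 1, and is zero otherwise. The banded base-matrix pattern, even block splitting, and parameter values here are a common generic construction, and may differ in details from the one analyzed in the thesis.

```python
import numpy as np

def sc_bernoulli_matrix(m, n, omega=3, Lambda=8, p=0.5, seed=0):
    """Sketch of a spatially coupled Bernoulli test matrix.

    Columns form Lambda blocks, rows form Lambda + omega - 1 blocks;
    block (r, c) is i.i.d. Bernoulli(p) iff c <= r <= c + omega - 1,
    giving a banded block structure. Blocks are evenly split here.
    """
    rng = np.random.default_rng(seed)
    R = Lambda + omega - 1
    row_blocks = np.array_split(np.arange(m), R)
    col_blocks = np.array_split(np.arange(n), Lambda)
    X = np.zeros((m, n), dtype=np.int8)
    for c, cols in enumerate(col_blocks):
        # Column block c touches row blocks c, ..., c + omega - 1.
        for r in range(c, c + omega):
            rows = row_blocks[r]
            X[np.ix_(rows, cols)] = rng.binomial(1, p, size=(len(rows), len(cols)))
    return X
```

The band structure is what enables the "seeding" effect exploited by spatially coupled AMP: reliable estimates near the ends of the band propagate inward across neighboring blocks.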