## Statistical Hypothesis Testing Under Model Uncertainty

## Abstract

Statistical hypothesis testing is a central problem in statistics and finds applications in a number of fields, including engineering, signal processing, medicine, and finance. Traditionally, in the hypothesis testing problem, the distributions under test are assumed to be known. In practice, however, the true distributions are difficult to obtain. In this thesis, we study hypothesis testing problems with unknown distributions.
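As a point of reference for the settings studied below, the classical binary test with known distributions can be sketched as a likelihood ratio test. The snippet is purely illustrative (the distributions, sample size, and threshold are assumptions, not taken from the thesis):

```python
import numpy as np

def likelihood_ratio_test(x, p0, p1, threshold=0.0):
    """Decide between H0: x ~ p0 and H1: x ~ p1 for an i.i.d. sample x
    over a finite alphabet, by thresholding the normalized
    log-likelihood ratio (1/n) log p1(x)/p0(x)."""
    x = np.asarray(x)
    llr = np.mean(np.log(p1[x]) - np.log(p0[x]))
    return 1 if llr > threshold else 0  # 1 = decide H1 (reject H0)

rng = np.random.default_rng(0)
p0 = np.array([0.5, 0.3, 0.2])  # illustrative null distribution
p1 = np.array([0.2, 0.3, 0.5])  # illustrative alternative distribution
x = rng.choice(3, size=1000, p=p0)  # sample actually drawn from p0
print(likelihood_ratio_test(x, p0, p1))  # → 0, i.e. accept H0
```

With data truly drawn from `p0`, the normalized log-likelihood ratio concentrates around $-D(p_0\|p_1) < 0$, so the test accepts H0 with high probability; the error exponents of this test are the benchmark against which the mismatched and universal tests below are measured.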

In the first part of this thesis, we consider the problem of mismatched binary hypothesis testing between i.i.d. distributions and between Markov sources. We analyze the tradeoff between the pairwise error probability exponents when the actual distributions generating the observations differ from the distributions used in the likelihood ratio test, the sequential probability ratio test, and Hoeffding's generalized likelihood ratio test in the composite setting. When the true distributions lie within a small divergence ball around the test distributions, we characterize the worst-case error exponent of each test relative to the matched error exponent. In addition, we consider the case where an adversary tampers with the observations, again within a divergence ball around the observation type. We show that the tests are more sensitive to distribution mismatch than to adversarial observation tampering.
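The effect of distribution mismatch on a fixed-sample likelihood ratio test can be illustrated with a small Monte Carlo experiment. This is a sketch of the phenomenon only; the distributions, the mismatch, and the zero threshold are assumptions for illustration, not the divergence-ball analysis of the thesis:

```python
import numpy as np

def lrt_type1_error(rng, p_true, q0, q1, n=50, trials=5000):
    """Monte Carlo type-I error of an LRT that thresholds
    (1/n) log q1(x)/q0(x) at 0, while the samples are in fact drawn
    from p_true (possibly different from the test distribution q0)."""
    x = rng.choice(len(p_true), size=(trials, n), p=p_true)
    llr = np.mean(np.log(q1[x]) - np.log(q0[x]), axis=1)
    return float(np.mean(llr > 0))  # fraction of incorrect H1 decisions

rng = np.random.default_rng(1)
q0 = np.array([0.6, 0.4])        # test distributions used by the LRT
q1 = np.array([0.4, 0.6])
matched = lrt_type1_error(rng, q0, q0, q1)       # data truly from q0
p_true = np.array([0.55, 0.45])                  # slightly mismatched source
mismatched = lrt_type1_error(rng, p_true, q0, q1)
print(matched, mismatched)  # the mismatch inflates the type-I error
```

Even this mild perturbation of the true null distribution visibly increases the false-alarm rate, which is the degradation that the worst-case error exponents over a divergence ball quantify.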

In the next part of the thesis, we propose a composite hypothesis test in the Neyman-Pearson setting where the null distribution is known and the alternative distribution belongs to a certain family of distributions.

The proposed test interpolates between Hoeffding's test and the likelihood ratio test and achieves the optimal error exponent tradeoff for every distribution in the family. In addition, the proposed test is shown to attain the same type-I error probability prefactor as Hoeffding's test, as well as the optimal type-II error probability prefactor for every distribution in the family.
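Hoeffding's test, one endpoint of the interpolation, rejects the null when the empirical distribution of the sample is too far from the known null in KL divergence, and therefore needs no knowledge of the alternative. A minimal sketch (the distributions, sample size, and threshold are assumptions for illustration):

```python
import numpy as np

def kl(p, q):
    """KL divergence D(p||q) between finite distributions (natural log)."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def hoeffding_test(x, p0, alphabet_size, threshold):
    """Hoeffding's test: reject H0 when the KL divergence between the
    empirical type of x and the known null p0 exceeds a threshold."""
    counts = np.bincount(np.asarray(x), minlength=alphabet_size)
    p_hat = counts / len(x)
    return 1 if kl(p_hat, p0) > threshold else 0

rng = np.random.default_rng(2)
p0 = np.array([0.5, 0.3, 0.2])   # known null distribution
p1 = np.array([0.2, 0.3, 0.5])   # one member of the unknown alternative family
x0 = rng.choice(3, size=2000, p=p0)
x1 = rng.choice(3, size=2000, p=p1)
# Data from p0 is accepted; data from p1 is rejected, without using p1.
print(hoeffding_test(x0, p0, 3, 0.05), hoeffding_test(x1, p0, 3, 0.05))
```

Because the statistic depends only on the sample's type and the null, the same test handles every alternative in the family, at the cost of a worse type-I prefactor than the likelihood ratio test; the test proposed in the thesis is designed to close that gap.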

Finally, we consider universal binary classification in the Neyman-Pearson setting, where the null distribution is known while only a training sequence is available for the alternative distribution. The proposed classifier interpolates between Hoeffding's classifier and the likelihood ratio test and attains the same error probability prefactor as the likelihood ratio test, i.e., the same prefactor as if both distributions were known. In addition, like Hoeffding's universal classifier, the proposed classifier is shown to attain the optimal error exponent tradeoff achieved by the likelihood ratio test whenever the ratio of training to observation samples exceeds a certain threshold, for which we derive upper and lower bounds. We also propose a sequential classifier that attains the optimal error exponent tradeoff. Lastly, we consider the classification problem in the minimax Neyman-Pearson setting, where both distributions are unknown, and propose classifiers that asymptotically achieve a predetermined ratio between the type-II and type-I error exponents.
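The training-sequence setting can be sketched with a simple plug-in classifier that estimates the unknown alternative from the training data and then runs a likelihood ratio test against the known null. This is an illustration of the problem setup only, not the classifier proposed in the thesis; the distributions, sample sizes, and add-one smoothing are assumptions:

```python
import numpy as np

def plugin_classifier(x, train, p0, alphabet_size, threshold=0.0):
    """Estimate the unknown alternative from the training sequence
    (add-one smoothing keeps every symbol probability positive) and
    threshold the resulting plug-in log-likelihood ratio against the
    known null p0."""
    counts = np.bincount(np.asarray(train), minlength=alphabet_size) + 1.0
    p1_hat = counts / counts.sum()
    x = np.asarray(x)
    llr = np.mean(np.log(p1_hat[x]) - np.log(p0[x]))
    return 1 if llr > threshold else 0  # 1 = decide the alternative

rng = np.random.default_rng(3)
p0 = np.array([0.5, 0.3, 0.2])   # known null distribution
p1 = np.array([0.2, 0.3, 0.5])   # unknown; seen only through training data
train = rng.choice(3, size=500, p=p1)       # training sequence from p1
x_null = rng.choice(3, size=1000, p=p0)     # observation from the null
x_alt = rng.choice(3, size=1000, p=p1)      # observation from the alternative
print(plugin_classifier(x_null, train, p0, 3),
      plugin_classifier(x_alt, train, p0, 3))
```

The quality of the estimate `p1_hat`, and hence the classifier's performance, is governed by how much training data is available relative to the observation length, which is why the ratio of training to observation samples plays the central role described above.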