On Gap-Based Lower Bounding Techniques for Best-Arm Identification.
Published version
Peer-reviewed
Repository URI
Repository DOI
Change log
Authors
Truong, Lan V https://orcid.org/0000-0002-7756-3464
Scarlett, Jonathan https://orcid.org/0000-0003-1403-9160
Abstract
In this paper, we consider techniques for establishing lower bounds on the number of arm pulls for best-arm identification in the multi-armed bandit problem. While a recent divergence-based approach was shown to provide improvements over an older gap-based approach, we show that the latter can be refined to match the former (up to constant factors) in many cases of interest under Bernoulli rewards, including the case that the rewards are bounded away from zero and one. Together with existing upper bounds, this indicates that the divergence-based and gap-based approaches are both effective for establishing sample complexity lower bounds for best-arm identification.
Description
Keywords
PAC learning, best-arm identification, information-theoretic lower bounds, multi-armed bandits
Journal Title
Entropy (Basel)
Conference Name
Journal ISSN
1099-4300
1099-4300
1099-4300
Volume Title
22
Publisher
MDPI AG
Publisher DOI
Sponsorship
National Research Foundation Singapore (R-252-000-A74-281)