Towards a Competitive 3-Player Mahjong AI using Deep Reinforcement Learning
Conference Name
2022 IEEE Conference on Games
Type
Conference Object
This Version
AM
Citation
Zhao, X., & Holden, S. Towards a Competitive 3-Player Mahjong AI using Deep Reinforcement Learning. 2022 IEEE Conference on Games. https://doi.org/10.17863/CAM.86627
Abstract
Mahjong is a multi-player imperfect-information game with challenging features for AI research. Sanma, being a 3-player variant of Japanese Riichi Mahjong, possesses unique characteristics and a more aggressive playing style than the 4-player game. It is thus challenging and of research interest in its own right, but has not been explored. We present Meowjong, the first ever AI for Sanma using deep reinforcement learning (RL). We define a 2-dimensional data structure for encoding the observable information in a game. We pre-train 5 convolutional neural networks (CNNs) for Sanma’s 5 actions (discard, Pon, Kan, Kita and Riichi), and enhance the major (discard) action’s model via self-play reinforcement learning. Meowjong demonstrates potential for becoming the state-of-the-art in Sanma, by achieving test accuracies comparable with AIs for 4-player Mahjong through supervised learning, and gaining a significant further enhancement from reinforcement learning.
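As a rough illustration of what a 2-dimensional encoding of observable information can look like, the sketch below turns a player's hand into a binary matrix with one row per tile kind and one column per copy. It is not the encoding defined in the paper: the tile notation, the 34-row layout, the four-column count representation, and the names `TILE_KINDS`, `SANMA_REMOVED` and `encode_hand` are all assumptions made for this example.

```python
# Hypothetical sketch: tile ordering, matrix layout and all names here are
# assumptions for illustration, not the data structure defined in the paper.

SUITS = "mps"  # man (characters), pin (dots), sou (bamboo)

# The 34 tile kinds of Riichi Mahjong: 1m..9m, 1p..9p, 1s..9s, then 7 honors (1z..7z).
TILE_KINDS = [f"{n}{s}" for s in SUITS for n in range(1, 10)] + [
    f"{n}z" for n in range(1, 8)
]

# Sanma is played without the 2-8 of the man suit, leaving 27 kinds in play.
SANMA_REMOVED = {f"{n}m" for n in range(2, 9)}

COPIES = 4  # each tile kind has four copies in the set


def encode_hand(tiles):
    """Encode a hand as a 34x4 binary matrix.

    Cell [k][j] is 1 if the hand holds more than j copies of tile kind k.
    Rows for the tiles removed in Sanma are always zero.
    """
    counts = [0] * len(TILE_KINDS)
    for tile in tiles:
        if tile in SANMA_REMOVED:
            raise ValueError(f"{tile} is not used in Sanma")
        # list.index raises ValueError for unknown tile names.
        counts[TILE_KINDS.index(tile)] += 1
    if any(c > COPIES for c in counts):
        raise ValueError("a hand cannot hold more than four copies of a tile")
    return [[1 if c > j else 0 for j in range(COPIES)] for c in counts]


if __name__ == "__main__":
    hand = ["1m", "1m", "9m", "5p", "5p", "5p", "7z"]
    matrix = encode_hand(hand)
    for kind, row in zip(TILE_KINDS, matrix):
        if any(row):
            print(kind, row)
```

A matrix of this shape can be stacked with similarly shaped planes for discards, melds and other visible information, which is what makes a 2-dimensional layout a natural input for a CNN.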
Identifiers
External DOI: https://doi.org/10.17863/CAM.86627
This record's URL: https://www.repository.cam.ac.uk/handle/1810/339217