Reinforcement Learning

Towards a Competitive 3-Player Mahjong AI using Deep Reinforcement Learning

Presented at the 2022 IEEE Conference on Games (CoG).

24 Aug 2022 6:50 PM — 7:00 PM Virtual

Xiangyu Zhao, Sean B. Holden

Towards a Competitive 3-Player Mahjong AI using Deep Reinforcement Learning

We present Meowjong, an AI for 3-player Mahjong (Sanma) using deep reinforcement learning. We define an informative and compact 2-dimensional data structure for encoding the observable information in a Sanma game. We pre-train five CNNs, one for each of Sanma’s five actions, and enhance the model for the major action (discard) via self-play reinforcement learning using the Monte Carlo policy gradient method.

Xiangyu Zhao, Sean B. Holden

Multi-Agent Deep Q-Learning for the Berry Poisoning Game

MEng Advanced Topics in Machine Learning Coursework.

Xiangyu Zhao

Last updated on 4 Apr 2022

Building a 3-Player Mahjong AI using Deep Reinforcement Learning

We present Meowjong, an AI for 3-player Mahjong (Sanma) using deep reinforcement learning. We define an informative and compact 2-dimensional data structure for encoding the observable information in a Sanma game. We pre-train five CNNs, one for each of Sanma’s five actions, and enhance the model for the major action (discard) via self-play reinforcement learning using the Monte Carlo policy gradient method.

Xiangyu Zhao, Sean B. Holden

Asynchronous Methods for Deep Reinforcement Learning

Paper-reading presentation for the Reinforcement Learning topic of the MEng Advanced Topics in Machine Learning module.

11 Feb 2022 3:00 PM — 4:00 PM Computer Laboratory, University of Cambridge

Xiangyu Zhao

Deep Reinforcement Learning for Mahjong

Bachelor’s dissertation supervised by Dr Sean Holden. We present Meowjong, an AI for 3-player Mahjong (Sanma) using deep reinforcement learning, with an informative and compact 2-dimensional data structure for encoding the observable information in a Sanma game.

Xiangyu Zhao, Sean B. Holden

14 May 2021
