Players
Player Interface
AlphaZero.AbstractPlayer — Type
AbstractPlayer{Game}
Abstract type for a player of Game.
AlphaZero.think — Function
think(::AbstractPlayer, game)
Return a probability distribution over actions as an (actions, π) pair.
AlphaZero.select_move — Function
select_move(player::AbstractPlayer, game, turn_number)
Return a single action. A default implementation is provided that samples an action according to the distribution computed by think, with a temperature given by player_temperature.
AlphaZero.reset_player! — Function
reset_player!(::AbstractPlayer)
Reset the internal memory of a player (e.g. the MCTS tree). The default implementation does nothing.
AlphaZero.player_temperature — Function
player_temperature(::AbstractPlayer, game, turn_number)
Return the player temperature, given the number of actions that have been played before by both players in the current game.
A default implementation is provided that always returns 1.
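The default select_move behavior described above (sample from the think distribution, rescaled by a temperature) can be sketched as follows. This is a minimal, self-contained illustration and not the library's implementation: the helper names apply_temperature_sketch, sample_index and select_move_sketch are hypothetical, and the rescaling rule (raise each probability to the power 1/τ and renormalize, with τ = 0 meaning "pick the most likely action") is a common convention assumed here.

```julia
using Random

# Hypothetical helper: rescale a distribution π by a temperature τ.
# Assumed convention: π_i^(1/τ), renormalized; τ == 0 means greedy choice.
function apply_temperature_sketch(π::Vector{Float64}, τ::Real)
    if τ == 0
        out = zeros(length(π))
        out[argmax(π)] = 1.0
        return out
    end
    scaled = π .^ (1 / τ)
    return scaled ./ sum(scaled)
end

# Sample an index from a discrete distribution by walking its cumulative sum.
function sample_index(rng::AbstractRNG, p::Vector{Float64})
    u = rand(rng)
    acc = 0.0
    for (i, w) in enumerate(p)
        acc += w
        u < acc && return i
    end
    return length(p)  # guard against floating-point rounding
end

# Hypothetical stand-in for the default move selection: `actions` and `π`
# play the role of the pair returned by think, τ that of the temperature.
function select_move_sketch(rng::AbstractRNG, actions, π, τ)
    p = apply_temperature_sketch(π, τ)
    return actions[sample_index(rng, p)]
end
```

With τ = 1 the distribution is left unchanged, which matches the documented default temperature of 1. Lower temperatures concentrate the probability mass on the most likely actions.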
Player Instances
AlphaZero.MctsPlayer — Type
MctsPlayer{Game, MctsEnv} <: AbstractPlayer{Game}
A player that selects actions using MCTS.
Constructors
MctsPlayer(mcts::MCTS.Env; τ, niters, timeout=nothing)
Construct a player from an MCTS environment. When computing each move:
- if timeout is provided, MCTS simulations are executed for timeout seconds, in groups of niters
- otherwise, niters MCTS simulations are run
The temperature parameter τ can be either a real number or an AbstractSchedule.
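The two simulation budgets described above can be sketched as a small loop. This is an illustration only: run_simulations_sketch and its simulate! callback are hypothetical names, and the exact point at which the real player checks the clock is an assumption (here, before each group of niters simulations).

```julia
# Hypothetical sketch of the budget logic. `simulate!(n)` stands in for
# "run n MCTS simulations"; it is not part of the documented API.
function run_simulations_sketch(simulate!::Function; niters::Int, timeout=nothing)
    total = 0
    if isnothing(timeout)
        # No time budget: run exactly niters simulations.
        simulate!(niters)
        total += niters
    else
        # Time budget: keep running groups of niters simulations
        # until `timeout` seconds have elapsed.
        start = time()
        while time() - start < timeout
            simulate!(niters)
            total += niters
        end
    end
    return total
end
```

In this sketch the total number of simulations is always a multiple of niters, so a time budget can be exceeded by at most the duration of one group.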
MctsPlayer(oracle::MCTS.Oracle, params::MctsParams; timeout=nothing)
Construct an MCTS player from an oracle and an MctsParams structure. If the oracle is a network, this constructor takes care of copying it, putting it in test mode and copying it to the GPU (if necessary).
AlphaZero.RandomPlayer — Type
RandomPlayer{Game} <: AbstractPlayer{Game}
A player that picks actions uniformly at random.
AlphaZero.NetworkPlayer — Type
NetworkPlayer{Game, Net} <: AbstractPlayer{Game}
A player that uses the policy output by a neural network directly, instead of relying on MCTS.
AlphaZero.EpsilonGreedyPlayer — Type
EpsilonGreedyPlayer{Game, Player} <: AbstractPlayer{Game}
A wrapper around a player that makes it choose a random move with a fixed probability $ϵ$.
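The ϵ-greedy behavior can be sketched independently of the library. The function epsilon_greedy_move and its wrapped_move callback are hypothetical names used for illustration, and drawing the random move uniformly among the available actions is an assumption.

```julia
using Random

# Hypothetical sketch: with probability ϵ return a uniformly random action,
# otherwise defer to the wrapped player's choice (`wrapped_move()`).
function epsilon_greedy_move(rng::AbstractRNG, actions::Vector,
                             wrapped_move::Function, ϵ::Real)
    if rand(rng) < ϵ
        return rand(rng, actions)
    else
        return wrapped_move()
    end
end
```

Setting ϵ to 0 recovers the wrapped player exactly, while ϵ equal to 1 yields a purely random player.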
AlphaZero.PlayerWithTemperature — Type
PlayerWithTemperature{Game, Player} <: AbstractPlayer{Game}
A wrapper around a player that makes it possible to override its temperature schedule.
AlphaZero.TwoPlayers — Type
TwoPlayers{Game} <: AbstractPlayer{Game}
If white and black are two AbstractPlayer instances, then TwoPlayers(white, black) is a player that behaves as white when white is to play and as black when black is to play.
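The delegation performed by such a combined player can be sketched with toy types. Everything below (ToyPlayer, ConstantPlayer, ToyTwoPlayers, pick_move, and the white_to_play flag that stands in for a query on the game state) is hypothetical and only mirrors the documented behavior.

```julia
# Toy stand-ins, not the library's types.
abstract type ToyPlayer end

# A player that always answers with the same move.
struct ConstantPlayer <: ToyPlayer
    move::Symbol
end

# A player that forwards each decision to `white` or `black`.
struct ToyTwoPlayers <: ToyPlayer
    white::ToyPlayer
    black::ToyPlayer
end

pick_move(p::ConstantPlayer, white_to_play::Bool) = p.move

# Behave as `white` when white is to play and as `black` otherwise.
pick_move(p::ToyTwoPlayers, white_to_play::Bool) =
    white_to_play ? pick_move(p.white, true) : pick_move(p.black, false)
```

Because the combined object is itself a player, it can be passed to any function that expects a single player.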
Derived Functions
AlphaZero.play_game — Function
play_game(player; flip_probability=0.) :: Trace
Simulate a game by an AbstractPlayer and return a trace.
- For two-player games, please use TwoPlayers.
- If the flip_probability argument is set to $p$, the board is flipped randomly at every turn with probability $p$, using GI.apply_random_symmetry.
AlphaZero.Trace — Type
Trace{Game, State}
An object that collects all states visited during a game, along with the rewards obtained at each step and the successive player policies to be used as targets.
Constructor
Trace{Game}(initial_state)
Base.push! — Method
Base.push!(t::Trace, π, r, s)
Add a (target policy, reward, new state) triple to a trace.
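To make the shape of a trace concrete, here is a toy container with the same push! signature. The type ToyTrace and its field names are hypothetical; the actual layout of Trace is not specified in this section. The sketch only captures the documented idea that a trace starts from an initial state and then grows by one (target policy, reward, new state) triple per step.

```julia
# Toy stand-in for a trace; field names are assumptions.
struct ToyTrace{State}
    states::Vector{State}
    policies::Vector{Vector{Float64}}
    rewards::Vector{Float64}
end

# A trace starts with the initial state and no policy or reward yet.
ToyTrace(initial_state::State) where {State} =
    ToyTrace{State}([initial_state], Vector{Float64}[], Float64[])

# Add a (target policy, reward, new state) triple.
function Base.push!(t::ToyTrace, π, r, s)
    push!(t.policies, π)
    push!(t.rewards, r)
    push!(t.states, s)
    return t
end
```

In this sketch the number of stored states always exceeds the number of stored policies and rewards by one, since the initial state has no incoming move.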
AlphaZero.pit — Function
pit(handler, contender, baseline, ngames)
Evaluate two AbstractPlayer instances against each other in a series of games.
Note that this function can only be used with two-player games.
Arguments
- handler: this function is called after each simulated game with three arguments: the game number i, the reward r for the contender player and the trace t
- ngames: number of games to play
Optional keyword arguments
- reset_every: if set, players are reset every reset_every games
- color_policy: determines the ColorPolicy, which is ALTERNATE_COLORS by default
- flip_probability=0.: see play_game
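The control flow implied by these arguments can be sketched as a plain loop. This is a toy model and not the library's code: ToyColorPolicy, contender_is_white, pit_sketch, the play_one_game callback (assumed to return the contender's reward and a trace) and the on_reset callback are all hypothetical. Which side plays white first under alternation, and whether resets happen before or after a game, are also assumptions.

```julia
# Toy enum mirroring the three documented color policies.
@enum ToyColorPolicy TOY_ALTERNATE TOY_BASELINE_WHITE TOY_CONTENDER_WHITE

# Decide whether the contender plays white in game number i (1-based).
function contender_is_white(policy::ToyColorPolicy, i::Int)
    policy == TOY_CONTENDER_WHITE && return true
    policy == TOY_BASELINE_WHITE && return false
    return isodd(i)  # assumption: the contender is white in odd games
end

# `play_one_game(white)` is a hypothetical callback returning
# (reward for the contender, trace) for one game.
function pit_sketch(handler, play_one_game, ngames;
                    reset_every=nothing, on_reset=(() -> nothing),
                    color_policy=TOY_ALTERNATE)
    for i in 1:ngames
        white = contender_is_white(color_policy, i)
        r, trace = play_one_game(white)
        # Report the game number, contender reward and trace.
        handler(i, r, trace)
        # Assumption: players are reset after every reset_every-th game.
        if !isnothing(reset_every) && i % reset_every == 0
            on_reset()
        end
    end
end
```

A handler would typically accumulate the rewards, for example to compute the contender's average score over the series.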
AlphaZero.ColorPolicy — Type
@enum ColorPolicy ALTERNATE_COLORS BASELINE_WHITE CONTENDER_WHITE
Policy for attributing colors in a duel between a baseline and a contender.
AlphaZero.interactive! — Function
interactive!(game, white, black)
Launch an interactive session for game::AbstractGame between players white and black. Both players have type AbstractPlayer and one of them is typically Human.
AlphaZero.Human — Type
Human{Game} <: AbstractPlayer{Game}
Human player that queries the standard input for actions.
Does not implement think but instead implements select_move directly.