Players
Player Interface
AlphaZero.AbstractPlayer — Type
AbstractPlayer{Game}
Abstract type for a player of Game.
AlphaZero.think — Function
think(::AbstractPlayer, game)
Return a probability distribution over actions as a (actions, π) pair.
AlphaZero.select_move — Function
select_move(player::AbstractPlayer, game, turn_number)
Return a single action. A default implementation is provided that samples an action according to the distribution computed by think, with a temperature given by player_temperature.
AlphaZero.reset_player! — Function
reset_player!(::AbstractPlayer)
Reset the internal memory of a player (e.g. the MCTS tree). The default implementation does nothing.
AlphaZero.player_temperature — Function
player_temperature(::AbstractPlayer, game, turn_number)
Return the player temperature, given the number of actions that have been played before by both players in the current game.
A default implementation is provided that always returns 1.
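To illustrate how think, player_temperature and select_move fit together, here is a minimal, self-contained sketch of temperature-based sampling. It does not use AlphaZero.jl, and the helper names apply_temperature and sample_action are hypothetical. It assumes the common convention of raising each probability to the power 1/τ and renormalizing, with τ = 0 meaning "pick the most likely action"; the library's exact formula is not stated in this page.

```julia
using Random

# Hypothetical helper: sharpen (τ < 1) or flatten (τ > 1) a distribution π.
# ASSUMPTION: the temperature is applied as π_i^(1/τ), then renormalized.
function apply_temperature(π::AbstractVector{<:Real}, τ::Real)
    if iszero(τ)
        # Limit case: put all the mass on the most likely action(s).
        mask = Float64.(π .== maximum(π))
        return mask ./ sum(mask)
    end
    scaled = Float64.(π) .^ (1 / τ)
    return scaled ./ sum(scaled)
end

# Hypothetical helper: sample one action from an (actions, π) pair.
function sample_action(rng::AbstractRNG, actions, π, τ)
    p = apply_temperature(π, τ)
    u = rand(rng)  # uniform draw in [0, 1)
    acc = 0.0
    for (a, pa) in zip(actions, p)
        acc += pa
        u < acc && return a
    end
    return last(actions)  # guard against floating-point rounding
end
```

With τ = 1 the distribution is left unchanged, which matches the default value returned by player_temperature.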
Player Instances
AlphaZero.MctsPlayer — Type
MctsPlayer{Game, MctsEnv} <: AbstractPlayer{Game}
A player that selects actions using MCTS.
Constructors
MctsPlayer(mcts::MCTS.Env; τ, niters, timeout=nothing)
Construct a player from an MCTS environment. When computing each move:
- if timeout is provided, MCTS simulations are executed for timeout seconds, in groups of niters
- otherwise, niters MCTS simulations are run
The temperature parameter τ can be either a real number or an AbstractSchedule.
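The two simulation budgets described above can be sketched as follows. This is an illustration only, not the library's code: run_search! is a hypothetical name, and run_simulations! stands in for whatever runs a given number of MCTS simulations. Whether the library always runs at least one batch when timeout is set is an assumption of this sketch.

```julia
# Hypothetical driver: run simulations either for a fixed count or until a
# time budget (in seconds) is spent, in batches of `niters`.
# Returns the total number of simulations requested.
function run_search!(run_simulations!, niters::Integer; timeout=nothing)
    total = 0
    if isnothing(timeout)
        # No time budget: run exactly `niters` simulations.
        run_simulations!(niters)
        total += niters
    else
        # Time budget: keep running batches of `niters` until it is exhausted.
        start = time()
        while time() - start < timeout
            run_simulations!(niters)
            total += niters
        end
    end
    return total
end
```

Because the clock is only checked between batches, a search with a timeout can overrun by up to the duration of one batch, so niters also controls how precisely the budget is respected.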
MctsPlayer(oracle::MCTS.Oracle, params::MctsParams; timeout=nothing)
Construct an MCTS player from an oracle and an MctsParams structure. If the oracle is a network, this constructor handles copying it, putting it in test mode and copying it to the GPU (if necessary).
AlphaZero.RandomPlayer — Type
RandomPlayer{Game} <: AbstractPlayer{Game}
A player that picks actions uniformly at random.
AlphaZero.NetworkPlayer — Type
NetworkPlayer{Game, Net} <: AbstractPlayer{Game}
A player that uses the policy output by a neural network directly, instead of relying on MCTS.
AlphaZero.EpsilonGreedyPlayer — Type
EpsilonGreedyPlayer{Game, Player} <: AbstractPlayer{Game}
A wrapper on a player that makes it choose a random move with a fixed $ϵ$ probability.
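As a rough illustration of the effect of such a wrapper, the sketch below computes the move distribution that results from following a policy π with probability 1 - ϵ and picking a move at random with probability ϵ. The function name epsilon_greedy_policy is hypothetical, and the sketch assumes that the random move is drawn uniformly among the available actions, which this page does not state.

```julia
# Hypothetical helper: distribution induced by an ϵ-greedy wrapper.
# ASSUMPTION: the random move is uniform over the n available actions.
function epsilon_greedy_policy(π::AbstractVector{<:Real}, ϵ::Real)
    n = length(π)
    # Each action keeps (1 - ϵ) of its original mass and gains ϵ / n.
    return (1 - ϵ) .* π .+ ϵ / n
end
```

The result still sums to 1, and every action gets probability at least ϵ / n, which is what makes the wrapper useful for forcing exploration.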
AlphaZero.PlayerWithTemperature — Type
PlayerWithTemperature{Game, Player} <: AbstractPlayer{Game}
A wrapper on a player that enables overwriting the temperature schedule.
AlphaZero.TwoPlayers — Type
TwoPlayers{Game} <: AbstractPlayer{Game}
If white and black are two AbstractPlayer, then TwoPlayers(white, black) is a player that behaves as white when white is to play and as black when black is to play.
Derived Functions
AlphaZero.play_game — Function
play_game(player; flip_probability=0.) :: Trace
Simulate a game by an AbstractPlayer and return a trace.
- For two-player games, please use TwoPlayers.
- If the flip_probability argument is set to $p$, the board is flipped randomly at every turn with probability $p$, using GI.apply_random_symmetry.
AlphaZero.Trace — Type
Trace{Game, State}
An object that collects all states visited during a game, along with the rewards obtained at each step and the successive player policies to be used as targets.
Constructor
Trace{Game}(initial_state)
Base.push! — Method
Base.push!(t::Trace, π, r, s)
Add a (target policy, reward, new state) triple to a trace.
AlphaZero.pit — Function
pit(handler, contender, baseline, ngames)
Evaluate two AbstractPlayer against each other in a series of games.
Note that this function can only be used with two-player games.
Arguments
- handler: this function is called after each simulated game with three arguments: the game number i, the reward r for the contender player and the trace t
- ngames: number of games to play
Optional keyword arguments
- reset_every: if set, players are reset every reset_every games
- color_policy: determines the ColorPolicy, which is ALTERNATE_COLORS by default
- flip_probability=0.: see play_game
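The handler argument is an ordinary function of three arguments (i, r, t). The sketch below shows one possible handler that records the contender's reward after each game. It is self-contained and does not call pit itself: the three calls at the end stand in for the calls pit would make, and make_reward_recorder is a hypothetical name.

```julia
# Hypothetical factory: returns a handler and the vector it fills in.
function make_reward_recorder()
    rewards = Float64[]
    # i: game number, r: contender reward, t: trace (ignored here)
    handler = (i, r, t) -> push!(rewards, r)
    return handler, rewards
end

handler, rewards = make_reward_recorder()

# Stand-ins for the calls made after each simulated game.
# The trace is unused by this handler, so `nothing` is passed.
handler(1, 1.0, nothing)
handler(2, -1.0, nothing)
handler(3, 1.0, nothing)

# Average reward of the contender over the recorded games.
average_reward = sum(rewards) / length(rewards)
```

Since handler is the first positional argument of pit, Julia's do-block syntax can also be used to pass it.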
AlphaZero.ColorPolicy — Type
@enum ColorPolicy ALTERNATE_COLORS BASELINE_WHITE CONTENDER_WHITE
Policy for attributing colors in a duel between a baseline and a contender.
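One way to picture the three policies is as a rule that decides, for each game number, which player takes white. The sketch below uses its own demo enum (prefixed names, so that it does not clash with the library's ColorPolicy) and a hypothetical helper contender_is_white. The choice that the contender plays white in odd-numbered games under alternation is an assumption of this sketch, not something stated in this page.

```julia
# Demo enum mirroring the three documented policies.
@enum DemoColorPolicy DEMO_ALTERNATE_COLORS DEMO_BASELINE_WHITE DEMO_CONTENDER_WHITE

# Hypothetical helper: does the contender play white in game number i?
# ASSUMPTION: under alternation, the contender is white in odd-numbered games.
function contender_is_white(policy::DemoColorPolicy, i::Integer)
    policy == DEMO_CONTENDER_WHITE && return true
    policy == DEMO_BASELINE_WHITE && return false
    return isodd(i)
end
```

Alternating colors removes any first-move advantage from the comparison, which is a plausible reason for it being the default.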
AlphaZero.interactive! — Function
interactive!(game, white, black)
Launch an interactive session for game::AbstractGame between players white and black. Both players have type AbstractPlayer and one of them is typically Human.
AlphaZero.Human — Type
Human{Game} <: AbstractPlayer{Game}
Human player that queries the standard input for actions.
Does not implement think but instead implements select_move directly.