Players and Simulations

Player Interface

AlphaZero.think (Function)
think(::AbstractPlayer, game)

Return a probability distribution over available actions as an (actions, π) pair.

AlphaZero.select_move (Function)
select_move(player::AbstractPlayer, game, turn_number)

Return a single action. A default implementation is provided that samples an action according to the distribution computed by think, with a temperature given by player_temperature.
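
The default behavior described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the names apply_temperature_sketch, sample_action_sketch and select_move_sketch are hypothetical, and the temperature convention (raise each probability to the power 1/τ and renormalize, with τ = 0 meaning greedy selection) is an assumption based on common AlphaZero practice.

```julia
using Random

# Rescale a distribution by temperature τ.
# Assumed convention: π .^ (1/τ), renormalized; τ == 0 is treated as greedy.
function apply_temperature_sketch(π::Vector{Float64}, τ::Real)
    τ == 1 && return π
    if τ == 0
        # Zero temperature: put all the mass on the most likely action.
        greedy = zeros(length(π))
        greedy[argmax(π)] = 1.0
        return greedy
    end
    scaled = π .^ (1 / τ)
    return scaled ./ sum(scaled)
end

# Draw one action by inverse-CDF sampling.
function sample_action_sketch(rng::AbstractRNG, actions, π::Vector{Float64})
    u = rand(rng)  # uniform in [0, 1)
    # First index whose cumulative mass exceeds u; fall back to the last
    # index if rounding leaves the cumulative sum slightly below 1.
    i = something(findfirst(c -> u < c, cumsum(π)), length(π))
    return actions[i]
end

# Hypothetical stand-in for the default select_move: think, rescale, sample.
function select_move_sketch(rng::AbstractRNG, actions, π::Vector{Float64}, τ::Real)
    return sample_action_sketch(rng, actions, apply_temperature_sketch(π, τ))
end
```

With τ = 1 the action is sampled from π unchanged; as τ approaches 0 the choice concentrates on the most probable action.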

AlphaZero.reset_player! (Function)
reset_player!(::AbstractPlayer)

Reset the internal memory of a player (e.g. the MCTS tree). The default implementation does nothing.

AlphaZero.player_temperature (Function)
player_temperature(::AbstractPlayer, game, turn_number)

Return the player temperature, given the number of actions that have been played before by both players in the current game.

A default implementation is provided that always returns 1.

Player Instances

AlphaZero.MctsPlayer (Type)
MctsPlayer{MctsEnv} <: AbstractPlayer

A player that selects actions using MCTS.

Constructors

MctsPlayer(mcts::MCTS.Env; τ, niters, timeout=nothing)

Construct a player from an MCTS environment. When computing each move:

  • if timeout is provided, MCTS simulations are executed for timeout seconds, in groups of niters
  • otherwise, niters MCTS simulations are run
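
The two cases above can be sketched as a small budget loop. This is an illustration under stated assumptions rather than the library's code: explore_with_budget is a hypothetical name, and run_simulations! is a hypothetical callback standing in for whatever runs n MCTS simulations.

```julia
# Run simulations either once (niters of them) or repeatedly in groups of
# niters until `timeout` seconds have elapsed.
# `run_simulations!` is a hypothetical callback taking a simulation count.
function explore_with_budget(run_simulations!, niters::Integer; timeout=nothing)
    if isnothing(timeout)
        # Fixed budget: exactly niters simulations.
        run_simulations!(niters)
        return niters
    end
    # Time budget: keep launching groups of niters until time runs out.
    start = time()
    total = 0
    while time() - start < timeout
        run_simulations!(niters)
        total += niters
    end
    return total
end
```

Because the clock is only checked between groups, a move can overrun timeout by up to the duration of one group, so a smaller niters gives finer control over the time budget.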

The temperature parameter τ can be either a real number or an AbstractSchedule.

MctsPlayer(game_spec::AbstractGameSpec, oracle,
           params::MctsParams; timeout=nothing)

Construct an MCTS player from an oracle and an MctsParams structure.

AlphaZero.NetworkPlayer (Type)
NetworkPlayer{Net} <: AbstractPlayer

A player that uses the policy output by a neural network directly, instead of relying on MCTS. The given neural network must be in test mode.

AlphaZero.EpsilonGreedyPlayer (Type)
EpsilonGreedyPlayer{Player} <: AbstractPlayer

A wrapper around a player that makes it choose a random move with a fixed probability ϵ.
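
One way to express this behavior at the level of the think distribution is to mix the wrapped player's policy with a uniform one. The sketch below assumes that formulation; epsilon_greedy_policy is a hypothetical name and not part of the documented API.

```julia
# Mix a policy π with the uniform distribution over the same actions:
# with probability ϵ a uniformly random move is chosen, otherwise π is followed.
function epsilon_greedy_policy(π::Vector{Float64}, ϵ::Real)
    n = length(π)
    uniform = fill(1 / n, n)
    return (1 - ϵ) .* π .+ ϵ .* uniform
end
```

The result is still a probability distribution, and every available action receives probability at least ϵ/n.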

AlphaZero.TwoPlayers (Type)
TwoPlayers <: AbstractPlayer

If white and black are two AbstractPlayer instances, then TwoPlayers(white, black) is a player that behaves as white when white is to play and as black when black is to play.
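
The delegation pattern can be illustrated with a self-contained toy. All names here (SketchPlayer, ConstantPlayer, TwoPlayersSketch, pick) are hypothetical and exist only for this sketch; a Bool stands in for asking the game which side is to play.

```julia
abstract type SketchPlayer end

# A toy player that always answers with the same action.
struct ConstantPlayer <: SketchPlayer
    action::Symbol
end

# A composite player that forwards each request to the side to play.
struct TwoPlayersSketch{W<:SketchPlayer,B<:SketchPlayer} <: SketchPlayer
    white::W
    black::B
end

pick(p::ConstantPlayer, white_to_play::Bool) = p.action

# Delegate to white on white's turn and to black otherwise.
pick(p::TwoPlayersSketch, white_to_play::Bool) =
    white_to_play ? pick(p.white, white_to_play) : pick(p.black, white_to_play)
```

Because the composite is itself a player, code that expects a single player can run a two-sided game without knowing two players are involved.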

Game Simulations

Simulation traces

AlphaZero.Trace (Type)
Trace{State}

An object that collects all states visited during a game, along with the rewards obtained at each step and the successive player policies to be used as targets for the neural network.

Constructor

Trace(initial_state)

Base.push! (Method)
Base.push!(t::Trace, π, r, s)

Add a (target policy, reward, new state) triple to a trace.
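
To make the bookkeeping concrete, here is a minimal sketch of a trace-like container. TraceSketch and its field names are assumptions made for illustration and should not be read as the library's actual definition.

```julia
# A trace holds one more state than it holds policies and rewards,
# because it starts from an initial state before any move is played.
struct TraceSketch{State}
    states::Vector{State}
    policies::Vector{Vector{Float64}}
    rewards::Vector{Float64}
end

# Start a trace from an initial state, with no policy or reward yet.
TraceSketch(initial_state::State) where {State} =
    TraceSketch{State}([initial_state], Vector{Float64}[], Float64[])

# Record the policy used at the current state, the reward obtained,
# and the state reached afterwards.
function Base.push!(t::TraceSketch, π, r, s)
    push!(t.policies, π)
    push!(t.rewards, r)
    push!(t.states, s)
    return t
end
```

For example, push!(TraceSketch("s0"), [0.5, 0.5], 1.0, "s1") yields a trace with two states, one policy and one reward.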

Playing a single game

Playing multiple games in a distributed fashion

AlphaZero.Simulator (Type)
Simulator(make_player, make_oracles, measure)

A distributed simulator that encapsulates the details of running simulations across multiple threads and multiple machines.

Arguments

  • make_oracles: a function that takes no argument and returns the oracles used by the player, which can be either nothing, a single oracle or a pair of oracles.
  • make_player: a function that takes as an argument the result of make_oracles and builds a player from it. In practice, an oracle returned by make_oracles may be replaced by a BatchedOracle before it is passed to make_player, which is why these two functions are specified separately.
  • measure(trace, colors_flipped, player): the function that is used to take measurements after each game simulation.

AlphaZero.record_trace (Function)
record_trace

A measurement function to be passed to a Simulator that produces named tuples with two fields: trace::Trace and colors_flipped::Bool.

AlphaZero.simulate (Function)
simulate(::Simulator, ::AbstractGameSpec, ::SimParams; <kwargs>)

Play a series of games using a given Simulator.

Keyword Arguments

  • game_simulated: a function called with no arguments every time a game simulation is completed

Return

Return a vector of objects computed by simulator.measure.

AlphaZero.simulate_distributed (Function)
simulate_distributed(::Simulator, ::AbstractGameSpec, ::SimParams; <kwargs>)

Identical to simulate but splits the work across all available distributed workers, whose number is given by Distributed.nworkers().

Utilities for playing interactive games

AlphaZero.interactive! (Function)
interactive!(game)
interactive!(gspec)
interactive!(game, player)
interactive!(gspec, player)
interactive!(game, white, black)
interactive!(gspec, white, black)

Launch a possibly interactive game session.

This function takes either an AbstractGameSpec or an AbstractGameEnv as its first argument.
