Players and Simulations

Player Interface

AlphaZero.AbstractPlayer — Type

AbstractPlayer

Abstract type for a game player.

AlphaZero.think — Function

think(::AbstractPlayer, game)

Return a probability distribution over available actions as a (actions, π) pair.

AlphaZero.select_move — Function

select_move(player::AbstractPlayer, game, turn_number)

Return a single action. A default implementation is provided that samples an action according to the distribution computed by think, with a temperature given by player_temperature.
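
To make the role of the temperature concrete, here is a minimal, self-contained sketch of temperature-based sampling. It assumes the common convention of raising each probability to the power 1/τ and renormalizing, with τ = 0 treated as a greedy choice. The helper names apply_temperature and sample_action are hypothetical and not part of AlphaZero.jl, whose actual implementation may differ.

```julia
using Random

# Hypothetical helper: sharpen (τ < 1) or flatten (τ > 1) a distribution.
function apply_temperature(π::AbstractVector{<:Real}, τ::Real)
    if iszero(τ)
        # Zero temperature: put all the mass on a most probable action.
        greedy = zeros(Float64, length(π))
        greedy[argmax(π)] = 1.0
        return greedy
    end
    scaled = Float64.(π) .^ (1 / τ)
    return scaled ./ sum(scaled)
end

# Hypothetical helper: draw one action according to the probabilities p.
function sample_action(rng::AbstractRNG, actions, p)
    u = rand(rng)
    acc = 0.0
    for (a, pa) in zip(actions, p)
        acc += pa
        u < acc && return a
    end
    return last(actions)  # guard against floating-point rounding
end

# Example: sample a move from a sharpened version of a think-style output.
actions, π = [:left, :right], [0.2, 0.8]
move = sample_action(MersenneTwister(0), actions, apply_temperature(π, 0.5))
```

With τ = 1 the distribution is left unchanged, which matches the default returned by player_temperature.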

AlphaZero.reset_player! — Function

reset_player!(::AbstractPlayer)

Reset the internal memory of a player (e.g. the MCTS tree). The default implementation does nothing.

AlphaZero.player_temperature — Function

player_temperature(::AbstractPlayer, game, turn_number)

Return the player temperature, given the number of actions that have been played before by both players in the current game.

A default implementation is provided that always returns 1.

Player Instances

AlphaZero.AlphaZeroPlayer — Function

AlphaZeroPlayer(::Env; [timeout, mcts_params, use_gpu])

Create an AlphaZero player from the current training environment.

Note that the returned player may be slow as it does not batch MCTS requests.

AlphaZero.MctsPlayer — Type

MctsPlayer{MctsEnv} <: AbstractPlayer

A player that selects actions using MCTS.

Constructors

MctsPlayer(mcts::MCTS.Env; τ, niters, timeout=nothing)

Construct a player from an MCTS environment. When computing each move:

if timeout is provided, MCTS simulations are run in groups of niters until timeout seconds have elapsed
otherwise, exactly niters MCTS simulations are run
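
The two budgeting modes above can be sketched as follows. This is an illustrative, self-contained approximation: run_simulations! and its simulate! argument (standing for one MCTS simulation) are hypothetical names, and the real MctsPlayer may organize its loop differently.

```julia
# Hypothetical helper: run simulations under either a fixed count or a time budget.
# `simulate!` is a zero-argument function standing for one MCTS simulation.
function run_simulations!(simulate!::Function; niters::Int, timeout=nothing)
    nsims = 0
    if isnothing(timeout)
        # Fixed budget: exactly `niters` simulations.
        for _ in 1:niters
            simulate!()
            nsims += 1
        end
    else
        # Time budget: whole groups of `niters` until `timeout` seconds elapse.
        start = time()
        while time() - start < timeout
            for _ in 1:niters
                simulate!()
                nsims += 1
            end
        end
    end
    return nsims
end
```

In this sketch the clock is only checked between groups, so a move can overrun the timeout by up to the duration of one group. A smaller niters gives finer control over the time budget.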

The temperature parameter τ can be either a real number or an AbstractSchedule.

MctsPlayer(game_spec::AbstractGameSpec, oracle,
           params::MctsParams; timeout=nothing)

Construct an MCTS player from an oracle and an MctsParams structure.

AlphaZero.RandomPlayer — Type

RandomPlayer <: AbstractPlayer

A player that picks actions uniformly at random.

AlphaZero.NetworkPlayer — Type

NetworkPlayer{Net} <: AbstractPlayer

A player that uses the policy output by a neural network directly, instead of relying on MCTS. The given neural network must be in test mode.

AlphaZero.EpsilonGreedyPlayer — Type

EpsilonGreedyPlayer{Player} <: AbstractPlayer

A wrapper around a player that makes it choose a random move with a fixed probability $ϵ$.
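
One way to realize such a wrapper is to blend the wrapped player's policy with the uniform distribution over available actions. The sketch below shows that arithmetic only. The function mix_with_uniform is a hypothetical name, and whether EpsilonGreedyPlayer mixes distributions this way or flips a coin at move-selection time is an assumption here.

```julia
# Hypothetical helper: with probability ϵ act uniformly at random,
# otherwise follow π. Expressed as a single mixed distribution.
function mix_with_uniform(π::AbstractVector{<:Real}, ϵ::Real)
    0 <= ϵ <= 1 || throw(ArgumentError("ϵ must lie in [0, 1]"))
    n = length(π)
    uniform = fill(1 / n, n)
    return (1 - ϵ) .* π .+ ϵ .* uniform
end

# Example: a deterministic two-action policy softened with ϵ = 0.2.
mixed = mix_with_uniform([1.0, 0.0], 0.2)
```

Sampling from the mixed distribution is equivalent to picking a uniformly random move with probability ϵ and following the wrapped policy otherwise.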

AlphaZero.PlayerWithTemperature — Type

PlayerWithTemperature{Player} <: AbstractPlayer

A wrapper around a player that overrides its temperature schedule.

AlphaZero.TwoPlayers — Type

TwoPlayers <: AbstractPlayer

If white and black are two AbstractPlayer instances, then TwoPlayers(white, black) is a player that behaves as white when white is to play and as black when black is to play.

Game Simulations

Simulation traces

AlphaZero.Trace — Type

Trace{State}

An object that collects all states visited during a game, along with the rewards obtained at each step and the successive player policies to be used as targets for the neural network.

Constructor

Trace(initial_state)

Base.push! — Method

Base.push!(t::Trace, π, r, s)

Add a (target policy, reward, new state) triple to a trace.
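
The bookkeeping implied by this method can be illustrated with a small stand-in type. MiniTrace below is hypothetical and only mirrors the description given here (an initial state, then one policy, reward and new state per step). The field names and layout are assumptions, not the actual definition of Trace.

```julia
# Hypothetical stand-in for Trace, for illustration only.
struct MiniTrace{State}
    states::Vector{State}
    policies::Vector{Vector{Float64}}
    rewards::Vector{Float64}
end

# A trace starts with the initial state and no recorded steps.
MiniTrace(initial_state) =
    MiniTrace([initial_state], Vector{Float64}[], Float64[])

# Record one step: the policy used, the reward obtained, and the state reached.
function Base.push!(t::MiniTrace, π, r, s)
    push!(t.policies, π)
    push!(t.rewards, r)
    push!(t.states, s)
    return t
end

# Example: a two-step game.
t = MiniTrace("s0")
push!(t, [0.5, 0.5], 0.0, "s1")
push!(t, [1.0], 1.0, "s2")
```

In this sketch a trace always holds one more state than it holds policies or rewards, since the initial state is recorded before any move is played.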

Playing a single game

AlphaZero.play_game — Function

play_game(gspec::AbstractGameSpec, player; flip_probability=0.) :: Trace

Simulate a game by an AbstractPlayer.

For two-player games, please use TwoPlayers.
If the flip_probability argument is set to $p$, the board is flipped randomly at every turn with probability $p$, using GI.apply_random_symmetry!.

Playing multiple games in a distributed fashion

AlphaZero.Simulator — Type

Simulator(make_player, make_oracles, measure)

A distributed simulator that encapsulates the details of running simulations across multiple threads and multiple machines.

Arguments

make_oracles: a function that takes no argument and returns the oracles used by the player, which can be either nothing, a single oracle or a pair of oracles.
make_player: a function that takes as an argument the result of make_oracles and builds a player from it. In practice, an oracle returned by make_oracles may be replaced by a BatchedOracle before it is passed to make_player, which is why these two functions are specified separately.
measure(trace, colors_flipped, player): the function that is used to take measurements after each game simulation.

AlphaZero.record_trace — Function

record_trace

A measurement function to be passed to a Simulator that produces named tuples with two fields: trace::Trace and colors_flipped::Bool.

AlphaZero.simulate — Function

simulate(::Simulator, ::AbstractGameSpec, ::SimParams; <kwargs>)

Play a series of games using a given Simulator.

Keyword Arguments

game_simulated: a function that is called with no arguments every time a game simulation completes.

Return

Return a vector of objects computed by simulator.measure.

AlphaZero.simulate_distributed — Function

simulate_distributed(::Simulator, ::AbstractGameSpec, ::SimParams; <kwargs>)

Identical to simulate but splits the work across all available distributed workers, whose number is given by Distributed.nworkers().
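
As an illustration of what splitting the work can mean, the sketch below divides a number of games as evenly as possible among a given number of workers. The function split_games is hypothetical and the exact partitioning used by simulate_distributed is not specified here, so this is only one plausible scheme.

```julia
# Hypothetical helper: assign `num_games` games to `nworkers` workers
# so that the per-worker counts differ by at most one.
function split_games(num_games::Integer, nworkers::Integer)
    nworkers >= 1 || throw(ArgumentError("need at least one worker"))
    base, extra = divrem(num_games, nworkers)
    # The first `extra` workers each take one additional game.
    return [base + (i <= extra ? 1 : 0) for i in 1:nworkers]
end

# Example: 10 games over 3 workers.
shares = split_games(10, 3)
```

In a real session the worker count would come from Distributed.nworkers(), and each worker would run its share of the games before the results are concatenated.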

Utilities for playing interactive games

AlphaZero.Human — Type

Human <: AbstractPlayer

Human player that queries the standard input for actions.

Does not implement think but instead implements select_move directly.

AlphaZero.interactive! — Function

interactive!(game)
interactive!(gspec)
interactive!(game, player)
interactive!(gspec, player)
interactive!(game, white, black)
interactive!(gspec, white, black)

Launch a possibly interactive game session.

This function takes either an AbstractGameSpec or an AbstractGameEnv as its first argument.
