Game Interface

AlphaZero.GameInterface (Module)

A generic interface for two-player, zero-sum games.

Stochastic games and intermediate rewards are supported. By convention, rewards are expressed from the point of view of the player called white.

Mandatory Interface

Types

AlphaZero.GameInterface.AbstractGame (Type)
AbstractGame

Abstract base type for a game environment.

Constructors

Any subtype Game must implement the following constructors:

Game()

Return an initialized game environment. Note that this constructor does not have to be deterministic.

Game(state)

Return a fresh game environment starting at a given state.
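A minimal sketch of the two constructors follows. It is self-contained and therefore declares its own stand-in AbstractGame; PileGame and PileState are hypothetical names for a toy game in which players alternately take one to three stones from a pile. In a real implementation the game type would subtype the abstract type exported by AlphaZero.GameInterface instead.

```julia
# Local stand-in for AlphaZero.GameInterface.AbstractGame, so the sketch runs on its own.
abstract type AbstractGame end

# Hypothetical immutable state: pile size and whose turn it is.
struct PileState
  stones :: Int
  white_to_play :: Bool
end

# Hypothetical game environment wrapping the current state.
mutable struct PileGame <: AbstractGame
  state :: PileState
end

# Game(): an initialized environment. The random pile size illustrates that
# this constructor does not have to be deterministic.
PileGame() = PileGame(PileState(rand(10:15), true))

# Game(state): provided here by the default one-argument constructor.
env = PileGame()
fresh = PileGame(PileState(7, false))
```

Because PileState is immutable, passing it to PileGame(state) without copying does not let later moves alter the caller's state.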

AlphaZero.GameInterface.State (Function)
State(Game::Type{<:AbstractGame})

Return the state type corresponding to Game.

State objects must be persistent, or at least behave as if they were, because they are stored in the MCTS tree without being copied. They must also be comparable for equality and hashable.
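The requirements on state objects can be illustrated with a small sketch. Board, BoardGame and with_cell are hypothetical names, and AbstractGame and State are declared locally as stand-ins for the real interface. The state holds a vector, so the sketch defines structural equality and a matching hash, and it updates states by building new values rather than mutating old ones.

```julia
# Local stand-ins, so the sketch runs without the AlphaZero package.
abstract type AbstractGame end
struct BoardGame <: AbstractGame end

# Hypothetical state type holding a vector of cells.
struct Board
  cells :: Vector{Int}
  white_to_play :: Bool
end

# The default == compares the `cells` vectors by identity, so two boards with
# equal contents would differ. Define structural equality and a matching hash.
Base.:(==)(a::Board, b::Board) =
  a.white_to_play == b.white_to_play && a.cells == b.cells
Base.hash(s::Board, h::UInt) = hash(s.white_to_play, hash(s.cells, h))

# State(Game) maps the game type to its state type.
State(::Type{BoardGame}) = Board

# Persistent-style update: copy the cells and return a new Board,
# leaving the original untouched.
function with_cell(s::Board, i, v)
  cells = copy(s.cells)
  cells[i] = v
  return Board(cells, !s.white_to_play)
end

a = Board([0, 0, 0], true)
b = with_cell(a, 2, 1)
```

With these definitions, a state can serve as a dictionary key and be found again from an equal but separately constructed value.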

Game Functions

AlphaZero.GameInterface.white_playing (Function)
white_playing(::Type{<:AbstractGame}, state) :: Bool
white_playing(env::AbstractGame)
  = white_playing(typeof(env), current_state(env))

Return true if white is to play and false otherwise. For a one-player game, it must always return true.
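The two-method pattern in the signature above can be sketched as follows: the game implements the method that takes a game type and a state, and the method on environments is derived from it. PileGame and PileState are hypothetical, and AbstractGame and current_state are local stand-ins here rather than the library's own definitions.

```julia
# Local stand-ins, so the sketch runs without the AlphaZero package.
abstract type AbstractGame end

struct PileState
  stones :: Int
  white_to_play :: Bool
end

mutable struct PileGame <: AbstractGame
  state :: PileState
end

# Stand-in accessor for the environment's current state.
current_state(env::PileGame) = env.state

# Method supplied by the game: decide from the state alone.
white_playing(::Type{PileGame}, state) = state.white_to_play

# Generic method on environments, mirroring the definition shown above.
white_playing(env::AbstractGame) = white_playing(typeof(env), current_state(env))
```

A one-player game would instead define the first method to return true for every state.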

AlphaZero.GameInterface.white_reward (Function)
white_reward(env::AbstractGame)

Return the intermediate reward obtained by the white player after the last transition step. The result is undefined when called on an initial state, since no transition has occurred yet.

AlphaZero.GameInterface.actions_mask (Function)
actions_mask(env::AbstractGame)

Return a boolean mask indicating which actions are available from env.

The following identities must hold:

  • game_terminated(env) || any(actions_mask(env))
  • length(actions_mask(env)) == length(actions(typeof(env)))
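The two identities can be checked mechanically. The sketch below does so for a hypothetical pile game (take one to three stones, the game ends when the pile is empty). AbstractGame, actions and game_terminated are local stand-ins here, and identities_hold is a helper introduced only for illustration.

```julia
# Local stand-ins, so the sketch runs without the AlphaZero package.
abstract type AbstractGame end

struct PileState
  stones :: Int
  white_to_play :: Bool
end

mutable struct PileGame <: AbstractGame
  state :: PileState
end

# An action is the number of stones to take.
actions(::Type{PileGame}) = [1, 2, 3]

game_terminated(env::PileGame) = env.state.stones == 0

# An action is available when the pile holds at least that many stones.
actions_mask(env::PileGame) = [a <= env.state.stones for a in actions(PileGame)]

# Hypothetical helper: true when both identities hold for `env`.
function identities_hold(env)
  mask = actions_mask(env)
  return (game_terminated(env) || any(mask)) &&
         length(mask) == length(actions(typeof(env)))
end
```

The first identity says that a nonterminal state always offers at least one action; the second says that the mask is indexed like the full action list.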
AlphaZero.GameInterface.play! (Function)
play!(env::AbstractGame, action)

Update the game environment by making the current player perform action. Note that this function does not have to be deterministic.
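As an illustration of how play! and white_reward might fit together, the sketch below records the reward of the last transition inside the environment. The game (take one to three stones, taking the last stone wins), the names PileGame and PileState, and the last_white_reward field are all assumptions made for this example; AbstractGame is a local stand-in.

```julia
# Local stand-in, so the sketch runs without the AlphaZero package.
abstract type AbstractGame end

struct PileState
  stones :: Int
  white_to_play :: Bool
end

mutable struct PileGame <: AbstractGame
  state :: PileState
  last_white_reward :: Float64   # reward of the most recent transition
end

PileGame(state::PileState) = PileGame(state, 0.0)

function play!(env::PileGame, action)
  s = env.state
  @assert 1 <= action <= min(3, s.stones) "illegal action"
  remaining = s.stones - action
  # Replace the state instead of mutating it: earlier states may still be
  # referenced elsewhere, for example in a search tree.
  env.state = PileState(remaining, !s.white_to_play)
  # Taking the last stone wins. Rewards are from white's point of view:
  # +1 if white made the winning move, -1 if black did, 0 otherwise.
  env.last_white_reward =
    remaining == 0 ? (s.white_to_play ? 1.0 : -1.0) : 0.0
  return env
end

white_reward(env::PileGame) = env.last_white_reward
```

Storing the reward in the environment is one possible design; it keeps white_reward a cheap lookup at the cost of an extra field.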

AlphaZero.GameInterface.heuristic_value (Function)
heuristic_value(env::AbstractGame)

Return a heuristic estimate of the state value for the current player.

The given state must be nonfinal, and returned values must be finite, that is, belong to the open interval $(-\infty, \infty)$.

This function is not needed by AlphaZero but it is useful for building baselines such as minmax players.
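A possible heuristic for a hypothetical pile game (take one to three stones, taking the last stone wins) is sketched below. It relies on the known fact that the player to move loses under optimal play exactly when the pile size is a multiple of four. PileGame and PileState are illustrative names and AbstractGame is a local stand-in.

```julia
# Local stand-in, so the sketch runs without the AlphaZero package.
abstract type AbstractGame end

struct PileState
  stones :: Int
  white_to_play :: Bool
end

mutable struct PileGame <: AbstractGame
  state :: PileState
end

# Value estimate for the current player (not for white specifically).
# The result is always finite, as required.
function heuristic_value(env::PileGame)
  @assert env.state.stones > 0 "heuristic_value expects a nonfinal state"
  return env.state.stones % 4 == 0 ? -1.0 : 1.0
end
```

The value does not depend on white_to_play because it is expressed from the point of view of whoever moves next.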

AlphaZero.GameInterface.symmetries (Function)
symmetries(::Type{G}, state) where {G <: AbstractGame}

Return the vector of all pairs (s, σ) where:

  • s is the image of state under a non-identity symmetry
  • σ is the associated action permutation, as an integer vector of size num_actions(G).

A default implementation is provided that returns an empty vector.

Example

In tic-tac-toe, composing reflections and rotations of the board yields eight symmetries, including the identity. Since the identity is excluded from the result, an implementation would return seven pairs for this game.
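The tic-tac-toe case can be sketched as below. Several choices here are assumptions not fixed by the text above: the board is a plain vector of nine integers numbered row by row, an action is the index of a cell, and the permutation convention is new_board[i] == board[σ[i]]. TicTacToe, pos, xy, perm and SYMS are hypothetical names, and AbstractGame is a local stand-in.

```julia
# Local stand-ins, so the sketch runs without the AlphaZero package.
abstract type AbstractGame end
struct TicTacToe <: AbstractGame end

const N = 3
pos(x, y) = (y - 1) * N + x                   # (column, row) -> cell index
xy(p) = ((p - 1) % N + 1, (p - 1) ÷ N + 1)    # cell index -> (column, row)

rot((x, y)) = (y, N - x + 1)     # quarter turn of the board
flip((x, y)) = (N - x + 1, y)    # left-right mirror

# Permutation induced by a coordinate map f, with new[p] == old[perm(f)[p]].
perm(f) = [pos(f(xy(p))...) for p in 1:N^2]

# The seven non-identity elements of the symmetry group of the square.
const SYMS = [perm(f) for f in (
  rot, rot ∘ rot, rot ∘ rot ∘ rot,
  flip, flip ∘ rot, flip ∘ rot ∘ rot, flip ∘ rot ∘ rot ∘ rot)]

# One (state, permutation) pair per non-identity symmetry.
symmetries(::Type{TicTacToe}, board::Vector{Int}) =
  [(board[σ], σ) for σ in SYMS]

pairs = symmetries(TicTacToe, [1, 0, 0, 0, 0, 0, 0, 0, 0])
```

Indexing with board[σ] returns a new vector, so the original state is left unchanged.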

Interface for Interactive Tools

AlphaZero.GameInterface.read_state (Function)
read_state(::Type{G}) where G <: AbstractGame :: Union{State(G), Nothing}

Read a state from the standard input. Return the corresponding state, or nothing if the input is invalid.
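One way to structure such a reader is to separate parsing from input, so that the parser can be tested without a terminal. The sketch below assumes a hypothetical pile game and an invented one-line text format, "<stones> <w|b>" (for example "7 w"); parse_state, PileGame and PileState are illustrative names, and AbstractGame and State are local stand-ins.

```julia
# Local stand-ins, so the sketch runs without the AlphaZero package.
abstract type AbstractGame end
struct PileGame <: AbstractGame end

struct PileState
  stones :: Int
  white_to_play :: Bool
end

State(::Type{PileGame}) = PileState

# Hypothetical parser: returns a PileState, or nothing on invalid input.
function parse_state(line::AbstractString)
  parts = split(strip(line))
  length(parts) == 2 || return nothing
  stones = tryparse(Int, parts[1])
  (stones === nothing || stones < 0) && return nothing
  parts[2] in ("w", "b") || return nothing
  return PileState(stones, parts[2] == "w")
end

# Read one line from the standard input and parse it.
read_state(::Type{PileGame}) = parse_state(readline())
```

At end of input, readline() returns an empty string, which the parser rejects, so read_state returns nothing in that case as well.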

Derived Functions