Game Interface

AlphaZero.GameInterface - Module

A generic interface for single-player games and two-player zero-sum games.

Stochastic games and intermediate rewards are supported. By convention, rewards are expressed from the point of view of the player called white. In two-player zero-sum games, we call black the player trying to minimize the reward.

A test suite is provided in AlphaZero.Scripts to check that your environment complies with this interface.

Mandatory Interface

The game interface of AlphaZero.jl differs from many standard RL interfaces by making a distinction between a game specification and a game environment:

  • A specification holds all static information about a game, which does not depend on the current state (e.g. the world dimensions in a grid-world environment).
  • In contrast, an environment holds information about the current state of the game (e.g. the player's position in a grid-world environment).
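
The distinction can be illustrated with a minimal, self-contained sketch. The types `GridSpec` and `GridEnv` and the helpers `new_env` and `move_right!` are hypothetical names chosen for this example; they are not part of AlphaZero.jl and do not subtype its abstract types.

```julia
# Hypothetical types for illustration only (not part of AlphaZero.jl).
struct GridSpec
    width::Int
    height::Int
end

mutable struct GridEnv
    spec::GridSpec        # static information, shared and never mutated
    pos::Tuple{Int,Int}   # current state, changes as the game is played
end

# Start every episode in the top-left corner.
new_env(spec::GridSpec) = GridEnv(spec, (1, 1))

# Move right by one cell, clamped to the world width stored in the spec.
function move_right!(env::GridEnv)
    x, y = env.pos
    env.pos = (min(x + 1, env.spec.width), y)
    return env
end

grid_spec = GridSpec(3, 2)
env_a = new_env(grid_spec)
env_b = new_env(grid_spec)
move_right!(env_a)   # only env_a changes; the spec and env_b are untouched
```

Several environments can share one specification, which is why static data such as the world dimensions belongs in the specification rather than in each environment.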

Game Specifications

Game Environments

AlphaZero.GameInterface.current_state - Function
current_state(game::AbstractGameEnv)

Return the game state.

Warning

The state returned by this function may be stored (e.g. in the MCTS tree) and must therefore either be fresh or persistent. If in doubt, you should make a copy.
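
The following sketch shows why this matters. It uses a hypothetical toy environment (`BoardEnv`, `aliased_state`, `fresh_state` and `mark!` are illustrative names, not AlphaZero.jl functions) whose state is a mutable vector.

```julia
# Hypothetical toy environment whose state is a mutable board vector.
mutable struct BoardEnv
    board::Vector{Int}
end

# Unsafe: hands out the internal vector, which later moves will mutate.
aliased_state(env::BoardEnv) = env.board

# Safe: returns a fresh copy that is independent of the environment.
fresh_state(env::BoardEnv) = copy(env.board)

# A move that mutates the board in place.
function mark!(env::BoardEnv, i::Int)
    env.board[i] = 1
    return env
end

board_env = BoardEnv(zeros(Int, 3))
stored_alias = aliased_state(board_env)   # e.g. what a search tree might keep
stored_copy = fresh_state(board_env)
mark!(board_env, 2)
# stored_alias now reflects the move, while stored_copy still holds the old state.
```

An alternative to copying is to use immutable (persistent) state values, in which case returning the same object repeatedly is safe.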

AlphaZero.GameInterface.actions_mask - Function
actions_mask(::AbstractGameEnv)

Return a boolean mask indicating what actions are available.

The following identities must hold:

  • game_terminated(game) || any(actions_mask(game))
  • length(actions_mask(game)) == length(actions(spec(game)))
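
As a rough illustration, both identities can be checked on a toy game. The functions below (`toy_actions`, `toy_spec`, `toy_actions_mask`, `toy_terminated`) are hypothetical stand-ins for the corresponding interface functions, defined locally so that the sketch runs without AlphaZero.jl.

```julia
# Hypothetical toy game with three actions, each playable at most once.
struct CountSpec end

mutable struct CountEnv
    used::Vector{Bool}
end

toy_actions(::CountSpec) = [1, 2, 3]
toy_spec(::CountEnv) = CountSpec()
toy_actions_mask(env::CountEnv) = .!env.used     # available = not yet used
toy_terminated(env::CountEnv) = all(env.used)    # over when nothing is left

# Check both identities for the current state of `env`.
function mask_is_consistent(env::CountEnv)
    mask = toy_actions_mask(env)
    has_move_or_done = toy_terminated(env) || any(mask)
    right_length = length(mask) == length(toy_actions(toy_spec(env)))
    return has_move_or_done && right_length
end
```

The first identity rules out non-terminal states with no legal move; the second guarantees that the mask can be indexed by action position.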
AlphaZero.GameInterface.play! - Function
play!(game::AbstractGameEnv, action)

Update the game environment by making the current player perform action. Note that this function does not have to be deterministic.

AlphaZero.GameInterface.white_reward - Function
white_reward(game::AbstractGameEnv)

Return the intermediate reward obtained by the white player after the last transition step. The result is undetermined when called at an initial state.
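
A minimal sketch of how intermediate rewards might be accumulated is given below. The toy game and the names `TakeEnv`, `toy_play!`, `toy_white_reward` and `white_return` are assumptions made for this example, not AlphaZero.jl code. The reward is only queried after a transition, never at the initial state.

```julia
# Hypothetical toy game: players alternate, and each move scores one point
# for the mover. Rewards are always reported from white's point of view.
mutable struct TakeEnv
    white_to_move::Bool
    last_reward::Float64
end

TakeEnv() = TakeEnv(true, 0.0)

function toy_play!(env::TakeEnv)
    # +1 when white moved, -1 when black moved (black minimizes white's reward).
    env.last_reward = env.white_to_move ? 1.0 : -1.0
    env.white_to_move = !env.white_to_move
    return env
end

toy_white_reward(env::TakeEnv) = env.last_reward

# Sum the intermediate rewards over `n` moves, reading the reward only
# after each transition step.
function white_return(n::Int)
    env = TakeEnv()
    total = 0.0
    for _ in 1:n
        toy_play!(env)
        total += toy_white_reward(env)
    end
    return total
end
```

Because rewards are expressed from white's perspective, a point scored by black shows up as a negative reward rather than as a separate quantity.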

Optional Interface

Interface for Interactive Tools

These functions are required for the default User Interface to work well.

AlphaZero.GameInterface.read_state - Function
read_state(game_spec::AbstractGameSpec)

Read a state from the standard input. Return the corresponding state (with type state_type(game_spec)) or nothing in case of an invalid input.

Other Optional Functions

AlphaZero.GameInterface.heuristic_value - Function
heuristic_value(game::AbstractGameEnv)

Return a heuristic estimate of the state value for the current player.

The current state of the game must be nonfinal and returned values must belong to the $(-\infty, \infty)$ interval.

This function is not needed by AlphaZero but it is useful for building baselines such as minmax players.

AlphaZero.GameInterface.symmetries - Function
symmetries(::AbstractGameSpec, state)

Return the vector of all pairs (s, σ) where:

  • s is the image of state by a non-identity symmetry
  • σ is the associated actions permutation, as an integer vector of size num_actions(game_spec), where game_spec is the specification passed as the first argument.

A default implementation is provided that returns an empty vector.

Note that the state is passed explicitly: this function takes a game specification rather than an environment, so no current environment state is involved.

Example

In the game of tic-tac-toe, there are eight symmetries that can be obtained by composing reflections and rotations of the board (including the identity symmetry).

Property

If (s2, σ) is a symmetry for state s1, then mask2 == mask1[σ] must hold where mask1 and mask2 are the available action masks for s1 and s2 respectively.
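
The property can be checked on a small example. The board encoding below (a length-9 vector with 0 for empty cells, where action i plays in cell i) and the helper names `apply_symmetry` and `available` are assumptions made for illustration; an actual game implementation may represent states and actions differently.

```julia
# Hypothetical encoding: a tic-tac-toe board is a length-9 vector (column-major
# 3x3 grid) with 0 for empty cells, and action i means "play in cell i".
cells = reshape(1:9, 3, 3)

# Permutation induced by mirroring the board left to right (columns reversed).
flip = vec(cells[:, end:-1:1])

# Image of a board under a cell permutation σ.
apply_symmetry(board::Vector{Int}, σ::Vector{Int}) = board[σ]

# An action is available exactly when its cell is empty.
available(board::Vector{Int}) = board .== 0

s1 = [1, 0, 0, 0, 2, 0, 0, 0, 0]
s2 = apply_symmetry(s1, flip)
mask1, mask2 = available(s1), available(s2)

# The required property for the pair (s2, flip).
property_holds = mask2 == mask1[flip]
```

With this encoding the same permutation acts on board cells and on actions, which is what makes the mask identity hold by construction.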

Derived Functions

Operations on Specifications

AlphaZero.GameInterface.state_type - Function
state_type(::AbstractGameSpec)

Return the state type associated with a game.

State objects must be persistent, or appear as such, since they are stored in the MCTS tree without copying. They must also be comparable and hashable.

AlphaZero.GameInterface.state_memsize - Function
state_memsize(::AbstractGameSpec)

Return the memory footprint occupied by a state of the given game.

The computation is based on a random initial state, assuming that all states have an identical footprint.
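
As a rough sketch of this kind of measurement, Julia's `Base.summarysize` reports the size in bytes of an object together with everything reachable from it. Whether `state_memsize` relies on it is an assumption here, and `BoardState` is a hypothetical state type used only for illustration.

```julia
# Hypothetical state type for illustration.
struct BoardState
    cells::Vector{Int8}
    white_to_move::Bool
end

# Footprint in bytes of one sample state, including the referenced array.
sample_state = BoardState(zeros(Int8, 9), true)
footprint = Base.summarysize(sample_state)

# A state with a larger board has a larger footprint, so measuring a single
# state is only representative when all states have the same size.
bigger_footprint = Base.summarysize(BoardState(zeros(Int8, 64), true))
```

For games whose states grow over time (for example, states that store a move history), a single initial state may underestimate the typical footprint.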

Operations on Environments

Wrapper for CommonRLInterface.jl

AlphaZero.CommonRLInterfaceWrapper.Env - Type
Env(rlenv::CommonRLInterface.AbstractEnv; <kwargs>) <: AbstractGameEnv

Wrap an environment implementing the interface defined in CommonRLInterface.jl into an AbstractGameEnv.

Requirements

The following optional methods must be implemented for rlenv:

  • clone
  • state
  • setstate!
  • valid_action_mask
  • player
  • players

Keyword arguments

The following optional functions from GameInterface are not present in CommonRLInterface.jl and can be provided as keyword arguments:

  • vectorize_state: must be provided unless states already have type Array{<:Number}
  • heuristic_value
  • symmetries
  • render
  • action_string
  • parse_action
  • read_state

If one of these functions f is not provided, the default implementation calls GI.f(::CommonRLInterface.AbstractEnv, ...).
