Game Interface

AlphaZero.GameInterface — Module

A generic interface for single-player games and two-player zero-sum games.

Stochastic games and intermediate rewards are supported. By convention, rewards are expressed from the point of view of the player called white. In two-player zero-sum games, we call black the player trying to minimize the reward.
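The sign convention can be illustrated with a small helper. This is only a sketch: `player_reward` is a hypothetical name that is not part of the interface, and negating the reward for black assumes a two-player zero-sum game.

```julia
# Hypothetical helper (not part of GameInterface): convert a reward
# expressed from white's point of view into the reward of a given player.
# Assumes a zero-sum game, so black's reward is the opposite of white's.
player_reward(white_reward, is_white::Bool) =
  is_white ? white_reward : -white_reward

@assert player_reward(1.0, true) == 1.0    # white won: +1 for white
@assert player_reward(1.0, false) == -1.0  # white won: -1 for black
```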

A test suite is provided in AlphaZero.Scripts to check that your environment complies with this interface.

Mandatory Interface

The game interface of AlphaZero.jl differs from many standard RL interfaces by making a distinction between a game specification and a game environment:

A specification holds all static information about a game, which does not depend on the current state (e.g. the world dimensions in a grid-world environment).
In contrast, an environment holds information about the current state of the game (e.g. the player's position in a grid-world environment).
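The split can be sketched with the grid-world example mentioned above. The block below is self-contained and does not import AlphaZero.jl: `GridSpec`, `GridEnv` and their fields are hypothetical names, and in a real implementation the two types would subtype `AbstractGameSpec` and `AbstractGameEnv` and extend the interface functions instead of defining local ones.

```julia
# Hypothetical grid-world types, for illustration only.

# Static information: does not change during a game.
struct GridSpec
  width :: Int
  height :: Int
end

# Dynamic information: the specification plus the current state.
mutable struct GridEnv
  spec :: GridSpec
  pos :: Tuple{Int, Int}  # the player's position (the current state)
end

# Local analogue of `init`: start in the top-left corner.
init(gspec::GridSpec) = GridEnv(gspec, (1, 1))

# Local analogue of `spec`: recover the specification from an environment.
spec(env::GridEnv) = env.spec

grid = GridSpec(4, 3)
env = init(grid)
env.pos = (2, 1)            # the state changes...
@assert spec(env) == grid   # ...but the specification does not
```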

Game Specifications

AlphaZero.GameInterface.AbstractGameSpec — Type

AbstractGameSpec

Abstract type for a game specification.

The specification holds all static information about a game, which does not depend on the current state.

AlphaZero.GameInterface.two_players — Function

two_players(::AbstractGameSpec) :: Bool

Return whether or not a game is a two-player game.

AlphaZero.GameInterface.actions — Function

actions(::AbstractGameSpec)

Return the vector of all game actions.

AlphaZero.GameInterface.vectorize_state — Function

vectorize_state(::AbstractGameSpec, state) :: Array{Float32}

Return a vectorized representation of a given state.
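As an illustration, a board state could be encoded as a one-hot `Float32` array. This is a minimal sketch under assumptions: the board representation, the function name `vectorize_board` and the array layout are all hypothetical, and a real implementation would instead add a method to `vectorize_state` that takes the game specification as its first argument.

```julia
# Hypothetical state: a vector of cells with values
# 0 (empty), 1 (white) or 2 (black).
# Encode it as a (number of cells) × 3 one-hot Float32 matrix.
function vectorize_board(board::Vector{Int})
  return Float32[board[i] == v for i in eachindex(board), v in 0:2]
end

x = vectorize_board([0, 1, 2])
@assert eltype(x) == Float32
@assert size(x) == (3, 3)  # the kind of shape that `state_dim` reports
```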

Game Environments

AlphaZero.GameInterface.AbstractGameEnv — Type

AbstractGameEnv

Abstract base type for a game environment.

Intuitively, a game environment holds a game specification and a current state.

AlphaZero.GameInterface.init — Function

init(::AbstractGameSpec) :: AbstractGameEnv

Create a new game environment in a (possibly random) initial state.

AlphaZero.GameInterface.spec — Function

spec(game::AbstractGameEnv) :: AbstractGameSpec

Return the game specification of an environment.

AlphaZero.GameInterface.set_state! — Function

set_state!(game::AbstractGameEnv, state)

Modify the state of a game environment in place.

AlphaZero.GameInterface.current_state — Function

current_state(game::AbstractGameEnv)

Return the game state.

Warning

The state returned by this function may be stored (e.g. in the MCTS tree) and must therefore either be fresh or persistent. If in doubt, you should make a copy.
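The following sketch shows why this matters when the state is a mutable object that the environment updates in place. `CounterEnv`, `aliased_state` and `fresh_state` are hypothetical names used only for this illustration.

```julia
# Hypothetical environment whose state is a mutable vector.
mutable struct CounterEnv
  cells :: Vector{Int}
end

# Unsafe: returns the internal vector itself, so later moves
# silently change any state that was stored earlier.
aliased_state(env::CounterEnv) = env.cells

# Safe: returns a fresh copy that the environment never mutates.
fresh_state(env::CounterEnv) = copy(env.cells)

env = CounterEnv([0, 0, 0])
stored_alias = aliased_state(env)
stored_copy = fresh_state(env)
env.cells[1] = 1  # simulate a move that mutates the state in place

@assert stored_alias == [1, 0, 0]  # the stored state was corrupted
@assert stored_copy == [0, 0, 0]   # the copy is unaffected
```

An alternative design is to use immutable state types, in which case no copy is needed.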

AlphaZero.GameInterface.game_terminated — Function

game_terminated(::AbstractGameEnv)

Return a boolean indicating whether or not the game is in a terminal state.

AlphaZero.GameInterface.white_playing — Function

white_playing(::AbstractGameEnv) :: Bool

Return true if white is to play and false otherwise.

For a one-player game, this function must always return true.

AlphaZero.GameInterface.actions_mask — Function

actions_mask(::AbstractGameEnv)

Return a boolean mask indicating what actions are available.

The following identities must hold:

game_terminated(game) || any(actions_mask(game))
length(actions_mask(game)) == length(actions(spec(game)))
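These identities can be checked mechanically. The sketch below uses a hypothetical, self-contained game (`PileEnv`, in which a move removes one to three stones from a pile) with local stand-ins for the interface functions. It does not use AlphaZero.jl itself.

```julia
# Hypothetical game: remove 1, 2 or 3 stones from a pile.
mutable struct PileEnv
  stones :: Int
end

# Local stand-ins for the interface functions.
actions() = [1, 2, 3]
game_terminated(env::PileEnv) = env.stones == 0
actions_mask(env::PileEnv) = [a <= env.stones for a in actions()]

# Check both identities for a given environment.
function check_mask(env)
  mask = actions_mask(env)
  nonempty = game_terminated(env) || any(mask)
  same_length = length(mask) == length(actions())
  return nonempty && same_length
end

@assert all(check_mask(PileEnv(n)) for n in 0:5)
```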

AlphaZero.GameInterface.play! — Function

play!(game::AbstractGameEnv, action)

Update the game environment by making the current player perform action. Note that this function does not have to be deterministic.

AlphaZero.GameInterface.white_reward — Function

white_reward(game::AbstractGameEnv)

Return the intermediate reward obtained by the white player after the last transition step. The result is undetermined when called at an initial state.
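A typical use is to sum intermediate rewards along an episode, querying the reward only after each transition. The sketch below is self-contained and relies on a hypothetical one-player game (`RewardPileEnv`) with local stand-ins for `play!`, `white_reward`, `game_terminated` and `available_actions`.

```julia
# Hypothetical one-player game: remove 1 to 3 stones from a pile.
# Each move yields a reward equal to the number of stones removed.
mutable struct RewardPileEnv
  stones :: Int
  last_reward :: Float64
end

game_terminated(env::RewardPileEnv) = env.stones == 0
available_actions(env::RewardPileEnv) = [a for a in 1:3 if a <= env.stones]
white_reward(env::RewardPileEnv) = env.last_reward

function play!(env::RewardPileEnv, action)
  env.stones -= action
  env.last_reward = action  # reward of the transition just performed
  return env
end

# Play random moves until the game ends, summing intermediate rewards.
function rollout!(env)
  total = 0.0
  while !game_terminated(env)
    play!(env, rand(available_actions(env)))
    total += white_reward(env)  # only queried after a transition
  end
  return total
end

# Whatever moves are drawn, all seven stones end up being removed.
@assert rollout!(RewardPileEnv(7, 0.0)) == 7.0
```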

Optional Interface

Interface for Interactive Tools

These functions are required for the default User Interface to work well.

AlphaZero.GameInterface.action_string — Function

action_string(::AbstractGameSpec, action) :: String

Return a human-readable string representing the provided action.

AlphaZero.GameInterface.parse_action — Function

parse_action(::AbstractGameSpec, str::String)

Return the action described by string str or nothing if str does not denote a valid action.
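For a game whose actions are column indices, the pair `action_string` / `parse_action` might look like the sketch below. It is self-contained and makes several assumptions: actions are the integers 1 to 7, and both functions are local stand-ins that omit the game specification argument of the real interface.

```julia
# Hypothetical actions: column indices from 1 to 7.
action_string(action::Int) = string(action)

# Return the parsed action, or `nothing` if the string is not a valid one.
function parse_action(str::String)
  a = tryparse(Int, strip(str))
  return (a !== nothing && 1 <= a <= 7) ? a : nothing
end

@assert parse_action(action_string(3)) == 3  # round trip
@assert parse_action(" 7 ") == 7             # surrounding spaces are ignored
@assert parse_action("9") === nothing        # out of range
@assert parse_action("abc") === nothing      # not a number
```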

AlphaZero.GameInterface.read_state — Function

read_state(game_spec::AbstractGameSpec)

Read a state from the standard input. Return the corresponding state (with type state_type(game_spec)) or nothing in case of an invalid input.

AlphaZero.GameInterface.render — Function

render(game::AbstractGameEnv)

Print the game state on the standard output.

Other Optional Functions

AlphaZero.GameInterface.heuristic_value — Function

heuristic_value(game::AbstractGameEnv)

Return a heuristic estimate of the state value for the current player.

The given state must be nonfinal and returned values must belong to the $(-\infty, \infty)$ interval.

This function is not needed by AlphaZero but it is useful for building baselines such as minmax players.

AlphaZero.GameInterface.symmetries — Function

symmetries(::AbstractGameSpec, state)

Return the vector of all pairs (s, σ) where:

s is the image of state by a nonidentical symmetry
σ is the associated actions permutation, as an integer vector of size num_actions(game).

A default implementation is provided that returns an empty vector.

Note that the current state of the passed environment is ignored by this function.

Example

In the game of tic-tac-toe, there are eight symmetries that can be obtained by composing reflections and rotations of the board (including the identity symmetry).

Property

If (s2, σ) is a symmetry for state s1, then mask2 == mask1[σ] must hold where mask1 and mask2 are the available action masks for s1 and s2 respectively.
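The property can be checked on a toy example. The sketch below assumes a hypothetical 1×3 board on which action `i` plays in cell `i` and an action is available when its cell is empty. The functions `available` and `mirror` are illustrative names, not part of the interface.

```julia
# Hypothetical 1×3 board: 0 marks an empty cell; action i plays in cell i.
available(state) = state .== 0

# Mirror the board left to right. σ[i] is the action of the original
# state that corresponds to action i of the mirrored state.
function mirror(state)
  σ = collect(length(state):-1:1)
  return (state[σ], σ)
end

s1 = [1, 0, 0]
s2, σ = mirror(s1)
mask1, mask2 = available(s1), available(s2)

@assert s2 == [0, 0, 1]
@assert mask2 == mask1[σ]  # the property stated above
```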

Derived Functions

Operations on Specifications

AlphaZero.GameInterface.state_type — Function

state_type(::AbstractGameSpec)

Return the state type associated with a game.

State objects must be persistent, or appear as such, because they are stored in the MCTS tree without copying. They must also be comparable and hashable.

AlphaZero.GameInterface.state_dim — Function

state_dim(::AbstractGameSpec)

Return a tuple that indicates the shape of a vectorized state representation.

AlphaZero.GameInterface.state_memsize — Function

state_memsize(::AbstractGameSpec)

Return the memory footprint occupied by a state of the given game.

The computation is based on a random initial state, assuming that all states have an identical footprint.

AlphaZero.GameInterface.action_type — Function

action_type(::AbstractGameSpec)

Return the action type associated with a game.

AlphaZero.GameInterface.num_actions — Function

num_actions(::AbstractGameSpec)

Return the total number of actions associated with a game.

AlphaZero.GameInterface.init — Method

init(::AbstractGameSpec, state) :: AbstractGameEnv

Create a new game environment, initialized in a given state.

Operations on Environments

AlphaZero.GameInterface.clone — Function

clone(::AbstractGameEnv)

Return an independent copy of the given environment.

AlphaZero.GameInterface.available_actions — Function

available_actions(::AbstractGameEnv)

Return the vector of all available actions.

AlphaZero.GameInterface.apply_random_symmetry! — Function

apply_random_symmetry!(::AbstractGameEnv)

Update a game environment by applying a random symmetry to the current state (see symmetries).

Wrapper for CommonRLInterface.jl

AlphaZero.CommonRLInterfaceWrapper — Module

Utilities for using AlphaZero.jl on RL environments that implement CommonRLInterface.jl.

AlphaZero.CommonRLInterfaceWrapper.Env — Type

Env(rlenv::CommonRLInterface.AbstractEnv; <kwargs>) <: AbstractGameEnv

Wrap an environment implementing the interface defined in CommonRLInterface.jl into an AbstractGameEnv.

Requirements

The following optional methods must be implemented for rlenv:

clone
state
setstate!
valid_action_mask
player
players

Keyword arguments

The following optional functions from GameInterface are not present in CommonRLInterface.jl and can be provided as keyword arguments:

vectorize_state: must be provided unless states already have type Array{<:Number}
heuristic_value
symmetries
render
action_string
parse_action
read_state

If one of these functions f is not provided, the default implementation calls GI.f(::CommonRLInterface.AbstractEnv, ...).

AlphaZero.CommonRLInterfaceWrapper.Spec — Type

Spec(rlenv::RL.AbstractEnv; kwargs...) = spec(Env(rlenv; kwargs...))
