Environment

AlphaZero.Env (Type)
Env{Game, Network, Board}

Type for an AlphaZero environment.

The environment holds the current neural network, the best neural network seen so far (which is used for data generation), a memory buffer, and an iteration counter.

Constructor

Env{Game}(params, curnn, bestnn=copy(curnn), experience=[], itc=0)

Construct a new AlphaZero environment.

  • Game is the type of the game being played
  • params has type Params
  • curnn is the current neural network and has type AbstractNetwork
  • bestnn is the best neural network so far, which is used for data generation
  • experience is the initial content of the memory buffer as a vector of TrainingSample
  • itc is the value of the iteration counter (0 at the start of training)
AlphaZero.Handlers (Module)
Handlers

Namespace for the callback functions that are used during training. This enables logging, saving and plotting to be implemented separately. An example handler object is Session.

All callback functions take a handler object h as their first argument and sometimes a second argument r, which is a report.

Callback                     Comment
iteration_started(h)         called at the beginning of an iteration
self_play_started(h)         called once per iteration before self-play starts
game_played(h)               called after each game of self-play
self_play_finished(h, r)     sends report: Report.SelfPlay
memory_analyzed(h, r)        sends report: Report.Memory
learning_started(h, r)       sends report: Report.LearningStatus
updates_started(h)           called before each series of batch updates
updates_finished(h, r)       sends report: Report.LearningStatus
checkpoint_started(h)        called before a checkpoint evaluation starts
checkpoint_game_played(h)    called after each arena game
checkpoint_finished(h, r)    sends report: Report.Checkpoint
learning_finished(h, r)      sends report: Report.Learning
iteration_finished(h, r)     sends report: Report.Iteration
training_finished(h)         called once at the end of training
AlphaZero.get_experience (Function)
get_experience(::MemoryBuffer{S}) where S :: Vector{TrainingSample{S}}

Return all samples in the memory buffer.

get_experience(env::Env)

Return the content of the agent's memory as a vector of TrainingSample.

AlphaZero.train! (Method)
train!(env::Env, handler=nothing)

Start or resume the training of an AlphaZero agent.

A handler object can be passed that implements a subset of the callback functions defined in Handlers.
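
As an illustration of implementing only a subset of the callbacks, the sketch below defines a handler that counts self-play games and completed iterations. The type name CountingHandler and its fields are hypothetical and not part of AlphaZero. The sketch also assumes that callbacks are implemented by adding methods to the functions in Handlers, and that callbacks left unimplemented fall back to defaults that do nothing.

```julia
using AlphaZero
import AlphaZero: Handlers

# Hypothetical handler type (not part of AlphaZero): it only keeps counters.
mutable struct CountingHandler
  games::Int       # number of self-play games seen so far
  iterations::Int  # number of completed training iterations
end
CountingHandler() = CountingHandler(0, 0)

# Called after each game of self-play.
function Handlers.game_played(h::CountingHandler)
  h.games += 1
  return
end

# Called with a Report.Iteration at the end of each iteration.
# The report is ignored here; only the counter is updated.
function Handlers.iteration_finished(h::CountingHandler, r)
  h.iterations += 1
  println("iteration $(h.iterations) finished after $(h.games) self-play games")
  return
end

# Usage, assuming `env` is an Env that has already been constructed:
#   handler = CountingHandler()
#   AlphaZero.train!(env, handler)
```

Because the handler is an ordinary object, any state it accumulates (here, the two counters) stays available after train! returns.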
