Env{Game, Network, Board}

Type for an AlphaZero environment.

The environment features the current neural network, the best neural network seen so far that is used for data generation, a memory buffer and an iteration counter.


Env{Game}(params, curnn, bestnn=copy(curnn), experience=[], itc=0)

Construct a new AlphaZero environment.

  • Game is the type of the game being played
  • params has type Params
  • curnn is the current neural network and has type AbstractNetwork
  • bestnn is the best neural network so far, which is used for data generation
  • experience is the initial content of the memory buffer as a vector of TrainingSample
  • itc is the value of the iteration counter (0 at the start of training)
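
The positional defaults in this signature can be illustrated with a small self-contained sketch. `ToyEnv`, `ToyNet` and `TicTacToe` below are made-up stand-ins and not the library's types; the sketch only assumes that `bestnn` defaults to an independent copy of `curnn`, as the signature `bestnn=copy(curnn)` suggests.

```julia
# Toy stand-in for a network type (hypothetical, for illustration only).
mutable struct ToyNet
  weights::Vector{Float64}
end
Base.copy(nn::ToyNet) = ToyNet(copy(nn.weights))

# Toy environment whose inner constructor mirrors the documented defaults.
mutable struct ToyEnv{Game}
  params
  curnn
  bestnn
  experience
  itc::Int
  function ToyEnv{Game}(params, curnn, bestnn=copy(curnn), experience=[], itc=0) where Game
    return new{Game}(params, curnn, bestnn, experience, itc)
  end
end

struct TicTacToe end  # placeholder game type

net = ToyNet([0.1, 0.2])
env = ToyEnv{TicTacToe}(nothing, net)  # bestnn, experience and itc take their defaults

# Mutating the current network does not affect the copied best network.
net.weights[1] = 9.0
```

Because the default is a copy rather than an alias, later updates to the current network leave the best network unchanged until it is replaced explicitly.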

Handlers

Namespace for the callback functions that are used during training. Keeping these callbacks separate from the training loop lets logging, saving and plotting be implemented independently. An example handler object is Session.

All callback functions take a handler object h as their first argument, and some take a second argument r that is a report.

Callback                     Description
iteration_started(h)         called at the beginning of an iteration
self_play_started(h)         called once per iteration before self-play starts
game_played(h)               called after each game of self-play
self_play_finished(h, r)     sends report: Report.SelfPlay
memory_analyzed(h, r)        sends report: Report.Memory
learning_started(h, r)       sends report: Report.LearningStatus
updates_started(h)           called before each series of batch updates
updates_finished(h, r)       sends report: Report.LearningStatus
checkpoint_started(h)        called before a checkpoint evaluation starts
checkpoint_game_played(h)    called after each arena game
checkpoint_finished(h, r)    sends report: Report.Checkpoint
learning_finished(h, r)      sends report: Report.Learning
iteration_finished(h, r)     sends report: Report.Iteration
training_finished(h)         called once at the end of training
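
As a minimal sketch of this pattern, a handler can be any object for which some of these functions are specialized. The `MockHandlers` module below is a hypothetical stand-in written for this example, not the library's own module, and it assumes no-op fallbacks so that a handler only needs to implement the callbacks it cares about. `GameCounter` and `simulate_iteration` are likewise made-up names.

```julia
# Hypothetical stand-in for the callback namespace; NOT the library's module.
module MockHandlers
  # Assumed no-op fallbacks: any handler may ignore any callback.
  iteration_started(h) = nothing
  game_played(h) = nothing
  iteration_finished(h, report) = nothing
end

# Illustrative handler that counts self-play games per iteration.
mutable struct GameCounter
  games::Int
  history::Vector{Int}
end
GameCounter() = GameCounter(0, Int[])

# Specialize only the callbacks of interest.
MockHandlers.iteration_started(h::GameCounter) = (h.games = 0; nothing)
MockHandlers.game_played(h::GameCounter) = (h.games += 1; nothing)
function MockHandlers.iteration_finished(h::GameCounter, report)
  push!(h.history, h.games)
  return nothing
end

# Simulate the events a training loop might emit during one iteration.
function simulate_iteration(h, ngames)
  MockHandlers.iteration_started(h)
  for _ in 1:ngames
    MockHandlers.game_played(h)
  end
  MockHandlers.iteration_finished(h, nothing)
  return h
end

counter = simulate_iteration(GameCounter(), 3)
```

Dispatching on the handler type keeps the training loop unaware of what each handler does with an event, which is what allows logging, saving and plotting to live in separate handler types.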
get_experience(::MemoryBuffer{S}) where S :: Vector{TrainingSample{S}}

Return all samples in the memory buffer.


Return the content of the agent's memory as a vector of TrainingSample.

train!(env::Env, handler=nothing)

Start or resume the training of an AlphaZero agent.

A handler object can be passed that implements a subset of the callback functions defined in Handlers.
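
The following toy loop sketches how an optional handler argument like this can work. Everything in it is hypothetical: `DemoCallbacks`, `DemoEnv`, `demo_train!`, `EventLog` and the `num_iters` keyword are invented for illustration, and the real stopping criterion and callback order are not shown in this section. The sketch assumes that no-op fallbacks make `nothing` a valid handler and that resuming is driven by the iteration counter.

```julia
# Hypothetical callback namespace with no-op fallbacks (assumption for this sketch).
module DemoCallbacks
  iteration_started(h) = nothing
  iteration_finished(h, report) = nothing
  training_finished(h) = nothing
end

# Made-up environment that holds only an iteration counter.
mutable struct DemoEnv
  itc::Int
end

# Toy loop mirroring the shape of `train!(env, handler=nothing)`.
function demo_train!(env::DemoEnv, handler=nothing; num_iters=2)
  while env.itc < num_iters
    DemoCallbacks.iteration_started(handler)
    env.itc += 1
    DemoCallbacks.iteration_finished(handler, (iteration = env.itc,))
  end
  DemoCallbacks.training_finished(handler)
  return env
end

# A handler that implements only a subset of the callbacks.
struct EventLog
  events::Vector{String}
end
DemoCallbacks.iteration_finished(h::EventLog, r) =
  (push!(h.events, "iteration $(r.iteration) finished"); nothing)
DemoCallbacks.training_finished(h::EventLog) =
  (push!(h.events, "training finished"); nothing)

demo_train!(DemoEnv(0))               # no handler: every callback is a no-op
evlog = EventLog(String[])
env = demo_train!(DemoEnv(1), evlog)  # resumes from iteration counter 1
```

Here `EventLog` never defines `iteration_started`, so that event falls through to the no-op fallback while the other two are recorded.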