Environment

AlphaZero.Env — Type

Env{Game, Network, Board}

Type for an AlphaZero environment.

The environment holds the current neural network, the best neural network seen so far (which is used for data generation), a memory buffer, and an iteration counter.

Constructor

Env{Game}(params, curnn, bestnn=copy(curnn), experience=[], itc=0)

Construct a new AlphaZero environment.

Game is the type of the game being played
params has type Params
curnn is the current neural network and has type AbstractNetwork
bestnn is the best neural network so far, which is used for data generation
experience is the initial content of the memory buffer as a vector of TrainingSample
itc is the value of the iteration counter (0 at the start of training)
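
The default arguments above can be illustrated with a minimal, self-contained sketch. It uses toy stand-ins only: `ToyEnv`, `ToyNetwork` and `TicTacToe` are hypothetical names that are not part of AlphaZero.jl, and a plain `Dict` is assumed in place of `Params`. The point is that `bestnn=copy(curnn)` starts the best network as an independent copy, so later updates to the current network do not affect it.

```julia
# Toy stand-ins for illustration; none of these types come from AlphaZero.jl.
struct ToyNetwork
  weights::Vector{Float64}
end

# `copy` must return an independent network so that updating `curnn`
# leaves `bestnn` untouched.
Base.copy(nn::ToyNetwork) = ToyNetwork(copy(nn.weights))

mutable struct ToyEnv{Game, Network}
  params::Dict{Symbol, Any}   # assumption: a Dict stands in for Params
  curnn::Network
  bestnn::Network
  experience::Vector{Any}
  itc::Int
end

# Outer constructor with the same default arguments as the documented signature.
function ToyEnv{Game}(params, curnn, bestnn=copy(curnn), experience=[], itc=0) where Game
  return ToyEnv{Game, typeof(curnn)}(params, curnn, bestnn, experience, itc)
end

struct TicTacToe end  # hypothetical game type

env = ToyEnv{TicTacToe}(Dict{Symbol, Any}(:num_iters => 3), ToyNetwork([0.1, 0.2]))
env.curnn.weights[1] = 9.0  # simulate a training update of the current network
```

After the simulated update, `env.bestnn` still holds the original weights, the memory buffer is empty, and the iteration counter is 0.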

AlphaZero.Handlers — Module

Handlers

Namespace for the callback functions that are used during training. Routing these events through callbacks lets logging, saving, and plotting be implemented separately from the training loop. An example handler object is Session.

All callback functions take a handler object h as their first argument, and some take a second argument r that is a report.

Callback	Comment
`iteration_started(h)`	called at the beginning of an iteration
`self_play_started(h)`	called once per iteration before self play starts
`game_played(h)`	called after each game of self play
`self_play_finished(h, r)`	sends report: `Report.SelfPlay`
`memory_analyzed(h, r)`	sends report: `Report.Memory`
`learning_started(h, r)`	sends report: `Report.LearningStatus`
`updates_started(h)`	called before each series of batch updates
`updates_finished(h, r)`	sends report: `Report.LearningStatus`
`checkpoint_started(h)`	called before a checkpoint evaluation starts
`checkpoint_game_played(h)`	called after each arena game
`checkpoint_finished(h, r)`	sends report: `Report.Checkpoint`
`learning_finished(h, r)`	sends report: `Report.Learning`
`iteration_finished(h, r)`	sends report: `Report.Iteration`
`training_finished(h)`	called once at the end of training
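
One way to picture the callback mechanism is the self-contained sketch below. It is a toy model, not the library's code: `ToyHandlers`, `CountingHandler` and `toy_iteration` are hypothetical names, the fallback-method design is an assumption about how a handler can implement only a subset of the callbacks, and the report is a placeholder named tuple rather than a real `Report.SelfPlay`.

```julia
# Toy namespace mirroring a few callbacks from the table; it is a stand-in,
# not the library's Handlers module.
module ToyHandlers
  # Generic fallbacks: any handler (even `nothing`) silently ignores
  # the callbacks it does not implement.
  iteration_started(h) = nothing
  game_played(h) = nothing
  self_play_finished(h, r) = nothing
end

# Hypothetical handler that counts games and stores self-play reports.
mutable struct CountingHandler
  games::Int
  reports::Vector{Any}
end
CountingHandler() = CountingHandler(0, Any[])

# Implement only the callbacks of interest by adding more specific methods.
ToyHandlers.game_played(h::CountingHandler) = (h.games += 1; nothing)
ToyHandlers.self_play_finished(h::CountingHandler, r) = (push!(h.reports, r); nothing)

# Toy driver that fires a subset of the callbacks in the order of the table.
function toy_iteration(handler, num_games)
  ToyHandlers.iteration_started(handler)
  for _ in 1:num_games
    ToyHandlers.game_played(handler)
  end
  # Placeholder report; a real report would carry self-play statistics.
  ToyHandlers.self_play_finished(handler, (num_games=num_games,))
  return nothing
end

h = CountingHandler()
toy_iteration(h, 5)
toy_iteration(nothing, 5)  # no handler: every callback falls back to the no-op
```

In this sketch, `CountingHandler` never defines `iteration_started`, so that call dispatches to the generic no-op, while the two specialized methods update the handler's state.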

AlphaZero.get_experience — Function

get_experience(::MemoryBuffer{S}) where S :: Vector{TrainingSample{S}}

Return all samples in the memory buffer.

get_experience(env::Env)

Return the content of the agent's memory as a vector of TrainingSample.

AlphaZero.initial_report — Function

initial_report(env::Env)

Return a report summarizing the configuration of the agent before training starts, as an object of type Report.Initial.

AlphaZero.train! — Method

train!(env::Env, handler=nothing)

Start or resume the training of an AlphaZero agent.

An optional handler object can be passed; it may implement any subset of the callback functions defined in Handlers.
