Environment
AlphaZero.Env — Type
Env
Type for an AlphaZero environment.
The environment holds the current neural network, the best neural network seen so far (which is used for data generation), a memory buffer, and an iteration counter.
Constructor
Env(game_spec, params, curnn, bestnn=copy(curnn), experience=[], itc=0)
Construct a new AlphaZero environment:
game_spec specifies the game being played
params has type Params
curnn is the current neural network and has type AbstractNetwork
bestnn is the best neural network so far, which is used for data generation
experience is the initial content of the memory buffer as a vector of TrainingSample
itc is the value of the iteration counter (0 at the start of training)
AlphaZero.Handlers — Module
Handlers
Namespace for the callback functions that are used during training. This enables logging, saving and plotting to be implemented separately. An example handler object is Session.
All callback functions take a handler object h as their first argument, and some take a second argument r, which is a report.
Callback | Comment |
---|---|
iteration_started(h) | called at the beginning of an iteration |
self_play_started(h) | called once per iteration before self-play starts |
game_played(h) | called after each game of self-play |
self_play_finished(h, r) | sends report: Report.SelfPlay |
memory_analyzed(h, r) | sends report: Report.Memory |
learning_started(h) | called at the beginning of the learning phase |
updates_started(h, r) | sends report: Report.LearningStatus |
updates_finished(h, r) | sends report: Report.LearningStatus |
checkpoint_started(h) | called before a checkpoint evaluation starts |
checkpoint_game_played(h) | called after each arena game |
checkpoint_finished(h, r) | sends report: Report.Checkpoint |
learning_finished(h, r) | sends report: Report.Learning |
iteration_finished(h, r) | sends report: Report.Iteration |
training_finished(h) | called once at the end of training |
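The callback pattern described above can be sketched in a self-contained way. The example below does not use the AlphaZero package: MockHandlers is a hypothetical stand-in for the Handlers namespace, CountingHandler and run_mock_iteration are illustrative names, and the no-op fallbacks are an assumption about how a handler can implement only a subset of the callbacks. Only four of the callbacks from the table are mirrored here.

```julia
# Hypothetical stand-in for the Handlers namespace (not the library's code).
module MockHandlers
    # Generic fallbacks: a handler that does not specialize a callback
    # falls through to these no-ops (an assumption made for this sketch).
    iteration_started(h) = nothing
    game_played(h) = nothing
    self_play_finished(h, r) = nothing
    training_finished(h) = nothing
end

# Hypothetical handler that counts games and stores self-play reports.
mutable struct CountingHandler
    games::Int
    reports::Vector{Any}
end
CountingHandler() = CountingHandler(0, Any[])

# Implement only a subset of the callbacks by adding methods for our type.
MockHandlers.game_played(h::CountingHandler) = (h.games += 1; nothing)
MockHandlers.self_play_finished(h::CountingHandler, r) = (push!(h.reports, r); nothing)

# Hypothetical driver that fires the callbacks; the report passed as `r`
# is a placeholder named tuple, not a real Report.SelfPlay object.
function run_mock_iteration(h; num_games=3)
    MockHandlers.iteration_started(h)
    for _ in 1:num_games
        MockHandlers.game_played(h)
    end
    MockHandlers.self_play_finished(h, (games=num_games,))
    MockHandlers.training_finished(h)
    return h
end

h = run_mock_iteration(CountingHandler())
```

Because the fallbacks accept any value, passing nothing as the handler also works in this sketch: every callback resolves to a no-op.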
AlphaZero.get_experience — Method
get_experience(env::Env)
Return the content of the agent's memory as a vector of TrainingSample.
AlphaZero.initial_report — Function
initial_report(env::Env)
Return a report summarizing the configuration of the agent before training starts, as an object of type Report.Initial.
AlphaZero.train! — Method
train!(env::Env, handler=nothing)
Start or resume the training of an AlphaZero agent.
A handler object can be passed that implements a subset of the callback functions defined in Handlers.