Environment
AlphaZero.Env — Type
Env
Type for an AlphaZero environment.
The environment features the current neural network, the best neural network seen so far that is used for data generation, a memory buffer and an iteration counter.
Constructor
Env(game_spec, params, curnn, bestnn=copy(curnn), experience=[], itc=0)
Construct a new AlphaZero environment:
- game_spec specifies the game being played
- params has type Params
- curnn is the current neural network and has type AbstractNetwork
- bestnn is the best neural network so far, which is used for data generation
- experience is the initial content of the memory buffer, as a vector of TrainingSample
- itc is the value of the iteration counter (0 at the start of training)
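
The default arguments in the signature above can be illustrated with a small self-contained sketch. ToyEnv is a hypothetical stand-in written for this example, not the library's Env: the network is replaced by a plain Vector{Float64} and the field types are assumptions. It only shows how the defaults bestnn=copy(curnn), experience=[] and itc=0 behave.

```julia
# Hypothetical stand-in for AlphaZero.Env (NOT the library type).
# Field names follow the constructor arguments documented above.
mutable struct ToyEnv
    game_spec::Any
    params::Any
    curnn::Vector{Float64}     # stand-in for an AbstractNetwork
    bestnn::Vector{Float64}    # stand-in for the best network so far
    experience::Vector{Any}    # stand-in for a vector of TrainingSample
    itc::Int                   # iteration counter

    # Same positional defaults as the documented signature.
    function ToyEnv(game_spec, params, curnn,
                    bestnn=copy(curnn), experience=[], itc=0)
        new(game_spec, params, curnn, bestnn, experience, itc)
    end
end

net = [0.1, 0.2]
env = ToyEnv(:toy_game, nothing, net)

# Mutating the current network leaves the best network untouched,
# because the default bestnn is an independent copy.
net[1] = 9.9
println(env.curnn[1])   # shares storage with net
println(env.bestnn[1])  # still the value at construction time
```

Because bestnn defaults to a copy, updating the current network does not change the network used for data generation until the caller replaces it explicitly.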
AlphaZero.Handlers — Module
Handlers
Namespace for the callback functions that are used during training. This enables logging, saving and plotting to be implemented separately. An example handler object is Session.
All callback functions take a handler object h as their first argument, and some take a second argument r, which is a report.
| Callback | Comment | 
|---|---|
| iteration_started(h) | called at the beginning of an iteration | 
| self_play_started(h) | called once per iteration before self-play starts | 
| game_played(h) | called after each game of self play | 
| self_play_finished(h, r) | sends report: Report.SelfPlay | 
| memory_analyzed(h, r) | sends report: Report.Memory | 
| learning_started(h) | called at the beginning of the learning phase | 
| updates_started(h, r) | sends report: Report.LearningStatus | 
| updates_finished(h, r) | sends report: Report.LearningStatus | 
| checkpoint_started(h) | called before a checkpoint evaluation starts | 
| checkpoint_game_played(h) | called after each arena game | 
| checkpoint_finished(h, r) | sends report: Report.Checkpoint | 
| learning_finished(h, r) | sends report: Report.Learning | 
| iteration_finished(h, r) | sends report: Report.Iteration | 
| training_finished(h) | called once at the end of training | 
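
As a minimal sketch of the callback pattern in the table above, the following self-contained example defines a handler that reacts to only two events. ToyHandlers is a hypothetical stand-in for the Handlers module, and the no-op fallbacks, the GameCounter type and the toy_iteration! driver are all assumptions made for illustration; none of this is the library's training loop.

```julia
# Hypothetical stand-in for AlphaZero.Handlers (assumption: every callback
# has a no-op fallback, so a handler may implement only a subset).
module ToyHandlers
    iteration_started(h) = nothing
    game_played(h) = nothing
    self_play_finished(h, r) = nothing
    iteration_finished(h, r) = nothing
end

# A handler that counts self-play games and completed iterations.
mutable struct GameCounter
    games::Int
    iterations::Int
end
GameCounter() = GameCounter(0, 0)

# Implement only the callbacks of interest; the others fall back to no-ops.
ToyHandlers.game_played(h::GameCounter) = (h.games += 1; nothing)
ToyHandlers.iteration_finished(h::GameCounter, r) = (h.iterations += 1; nothing)

# Hypothetical driver that fires the callbacks in the order listed in the table.
# Reports are replaced by `nothing` in this sketch.
function toy_iteration!(h, num_games)
    ToyHandlers.iteration_started(h)
    for _ in 1:num_games
        ToyHandlers.game_played(h)
    end
    ToyHandlers.self_play_finished(h, nothing)
    ToyHandlers.iteration_finished(h, nothing)
    return h
end

h = GameCounter()
toy_iteration!(h, 3)
toy_iteration!(h, 2)
println("games = ", h.games, ", iterations = ", h.iterations)
```

Dispatching on the handler type keeps logging, saving and plotting code out of the training loop: the loop only fires events, and each handler decides which ones to act on.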
AlphaZero.get_experience — Method
get_experience(env::Env)
Return the content of the agent's memory as a vector of TrainingSample.
AlphaZero.initial_report — Function
initial_report(env::Env)
Return a report summarizing the configuration of the agent before training starts, as an object of type Report.Initial.
AlphaZero.train! — Method
train!(env::Env, handler=nothing)
Start or resume the training of an AlphaZero agent.
An optional handler object can be passed; it may implement any subset of the callback functions defined in Handlers.