User Interface
AlphaZero.UserInterface — Module
The default user interface for AlphaZero.
The user interface is fully separated from the core algorithm and can therefore be replaced easily.
Session
AlphaZero.UserInterface.Session — Type
Session{Env}
A wrapper on an AlphaZero environment that adds features such as:
- Logging and plotting
- Loading and saving of environments
In particular, it implements the Handlers interface.
Public fields
- env::Env is the environment wrapped by the session
- report is the current session report, with type SessionReport
AlphaZero.UserInterface.Session — Method
Session(::Type{Game}, ::Type{Net}, params, netparams) where {Game, Net}
Create a new session using the given parameters, or load it from disk if it already exists.
Arguments
- Game is the type of the game that is being learnt
- Net is the type of the network that is being used
- params has type Params
- netparams has type Network.HyperParams(Net)
Optional keyword arguments
- dir: session directory in which all files and reports are saved; this argument is either a string or nothing (default), in which case the session won't be saved automatically and no file will be generated
- autosave=true: if set to false, the session won't be saved automatically and no file will be generated
- nostdout=false: disables logging on the standard output when set to true
- benchmark=[]: vector of Benchmark.Duel to be used as a benchmark
- load_saved_params=false: if set to true, load the training parameters from the session directory (if present) rather than using the params argument
- save_intermediate=false: if set to true (along with autosave), all intermediate training environments are saved on disk so that the whole training process can be analyzed later. This can consume a lot of disk space.
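As an illustration of this constructor and its keyword arguments, here is a minimal sketch. It assumes that a game type, a network type and the two parameter objects are defined elsewhere and passed in by the caller; the helper name new_session and the default directory path are hypothetical and not part of AlphaZero.

```julia
using AlphaZero

# Hypothetical helper: Game, Net, params and netparams must be provided by
# the caller (they are not defined in this snippet).
function new_session(Game, Net, params, netparams; dir="sessions/demo")
  return AlphaZero.UserInterface.Session(
    Game, Net, params, netparams;
    dir=dir,                  # session files and reports are written here
    autosave=true,            # let the session save itself automatically
    nostdout=false,           # keep logging on the standard output
    save_intermediate=false)  # do not keep every intermediate environment
end
```

Leaving save_intermediate at false is the cautious choice here, since keeping every intermediate environment can consume a lot of disk space.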
Session(::Type{Game}, ::Type{Net}, dir::String) where {Game, Net}
Load an existing session from a directory.
This constructor accepts the optional keyword arguments autosave, nostdout, benchmark and save_intermediate.
Session(env::Env[, dir])
Create a session from an initial environment.
- The iteration counter of the environment must be equal to 0
- If a session directory is provided, this directory must not exist yet
This constructor accepts the optional keyword arguments autosave, nostdout, benchmark and save_intermediate.
AlphaZero.UserInterface.resume! — Function
resume!(session::Session)
Resume a previously created or loaded session. The user can interrupt training by sending a SIGINT signal (for example, by pressing Ctrl+C).
AlphaZero.UserInterface.save — Function
save(session::Session)
Save a session on disk.
This function is called automatically by resume! after each training iteration if the session was created with autosave=true.
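The interplay between loading, resuming and saving can be sketched as follows. This is only an illustration: the helper name continue_training is hypothetical, Game and Net are supplied by the caller, and the sketch assumes that save writes to the session directory even when autosave is disabled and that resume! returns normally once training stops.

```julia
using AlphaZero

# Hypothetical helper: load the session stored in `dir`, train it further,
# then save it once explicitly.
function continue_training(Game, Net, dir::String)
  ui = AlphaZero.UserInterface
  # Load an existing session; autosave is turned off for this example.
  session = ui.Session(Game, Net, dir; autosave=false)
  ui.resume!(session)  # run training (the user may interrupt it)
  ui.save(session)     # explicit save, since autosave is disabled here
  return session
end
```

With autosave=true the explicit call to save would be redundant, because resume! already saves after each training iteration.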
AlphaZero.UserInterface.play_interactive_game — Function
play_interactive_game(session::Session; timeout=2.)
Start an interactive game against AlphaZero, allowing it timeout seconds of thinking time for each move.
AlphaZero.UserInterface.start_explorer — Method
start_explorer(session::Session)
Start an explorer session for the current environment. See Explorer.
AlphaZero.UserInterface.SessionReport — Type
SessionReport
The full collection of statistics and benchmark results collected during a training session.
Fields
- iterations: vector of $n$ iteration reports with type Report.Iteration
- benchmark: vector of $n+1$ benchmark reports with type Benchmark.Report
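The relation between the two field lengths ($n$ iteration reports, $n+1$ benchmark reports) can be checked with a small self-contained sketch. MockSessionReport, is_consistent and num_iterations are hypothetical names introduced only for this illustration, and the symbols stand in for real report objects. One plausible reading of the extra benchmark entry is that a benchmark is run once before training starts, but that is an assumption.

```julia
# Hypothetical stand-in for SessionReport; the element types are placeholders,
# not the real Report.Iteration and Benchmark.Report types.
struct MockSessionReport
  iterations::Vector{Any}
  benchmark::Vector{Any}
end

# A report with n iteration entries should hold n+1 benchmark entries.
is_consistent(r::MockSessionReport) =
  length(r.benchmark) == length(r.iterations) + 1

# Number of recorded iterations n.
num_iterations(r::MockSessionReport) = length(r.iterations)

fresh     = MockSessionReport([], [:b0])                   # n = 0
after_two = MockSessionReport([:it1, :it2], [:b0, :b1, :b2])  # n = 2
broken    = MockSessionReport([:it1], [:b0])               # missing a benchmark
```

Here is_consistent holds for fresh and after_two and fails for broken, which has one iteration report but only one benchmark report.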
Explorer
AlphaZero.UserInterface.Explorer — Type
Explorer{Game}
A command interpreter to explore the internals of a player through interactive play.
Constructors
Explorer(player::AbstractPlayer, state=nothing; memory=nothing)
Build an explorer to investigate the behavior of player from a given state (by default, the initial state). Optionally, a reference to a memory buffer can be provided, in which case additional state statistics will be displayed.
Explorer(env::Env, state=nothing; arena_mode=false)
Build an explorer for the MCTS player based on the neural network env.bestnn and on parameters env.params.self_play.mcts or env.params.arena.mcts (depending on the value of arena_mode).
Commands
The following commands are currently implemented:
- do [action]: make the current player perform action. By default, the action of highest score is played.
- explore [num_sims]: run num_sims MCTS simulations from the current state (for MCTS players only).
- go: query the user for a state description and go to this state.
- flip: flip the board according to a random symmetry.
- undo: undo the effect of the previous command.
- restart: restart the explorer.
AlphaZero.UserInterface.start_explorer — Method
start_explorer(exp::Explorer)
Start an interactive explorer session.
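A minimal sketch of how the Explorer constructor and start_explorer might be combined is given below. The helper name explore_arena_player is hypothetical; the sketch only relies on the public env field of a session and on the Explorer(env; arena_mode) constructor described above.

```julia
using AlphaZero

# Hypothetical helper: inspect the MCTS player of a session using the arena
# parameters (env.params.arena.mcts) instead of the self-play ones.
function explore_arena_player(session)
  ui = AlphaZero.UserInterface
  explorer = ui.Explorer(session.env; arena_mode=true)
  ui.start_explorer(explorer)  # start the interactive explorer session
end
```

Calling start_explorer(session) directly is the shorter route when the default settings are sufficient.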