Networks Library

AlphaZero.KNets — Module

This module provides utilities to build neural networks with Knet, along with a library of standard architectures.
Knet Utilities
AlphaZero.KNets.KNetwork — Type

KNetwork{Game} <: AbstractNetwork{Game}

Abstract type for neural networks implemented using the Knet framework.

- Subtypes are expected to be expressed as the composition of Flux-like layers that implement a functor interface through the functions children and mapchildren.
- A custom implementation of regularized_params_ must be provided for layers containing parameters that are subject to regularization.

Provided that the above holds, KNetwork implements the full network interface with the following exceptions: Network.HyperParams, Network.hyperparams, Network.forward and Network.on_gpu.
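To illustrate the functor interface described above, here is a minimal sketch in plain Julia (deliberately free of any Knet dependency, so the real layer types differ): the hypothetical Dense and Chain layers are assumptions, while the function names children, mapchildren and regularized_params_ are the ones from the interface.

```julia
# Fallbacks: a leaf layer has no children and exposes no regularized parameters.
children(x) = ()
mapchildren(f, x) = x
regularized_params_(x) = []

# A hypothetical dense layer (stand-in for a real Knet layer).
struct Dense
  W :: Matrix{Float32}
  b :: Vector{Float32}
end

(d::Dense)(x) = d.W * x .+ d.b

# A hypothetical container layer composing sub-layers.
struct Chain
  layers :: Tuple
end

Chain(layers...) = Chain(layers)
(c::Chain)(x) = foldl((y, l) -> l(y), c.layers; init=x)

# Functor interface: expose sub-layers so generic code can traverse the network.
children(c::Chain) = c.layers
mapchildren(f, c::Chain) = Chain(map(f, c.layers)...)

# Typically, only weight matrices (not biases) are subject to regularization.
regularized_params_(d::Dense) = [d.W]

# Generic traversal enabled by the interface: collect all regularized
# parameters in a network, recursively.
function all_regularized_params(net)
  ps = Any[]
  append!(ps, regularized_params_(net))
  for c in children(net)
    append!(ps, all_regularized_params(c))
  end
  return ps
end

# Usage:
d1 = Dense(randn(Float32, 3, 4), zeros(Float32, 3))
d2 = Dense(randn(Float32, 2, 3), zeros(Float32, 2))
net = Chain(d1, d2)
```

Once a layer implements this interface, generic traversals like all_regularized_params work on any composition of layers without knowing their concrete types.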
AlphaZero.KNets.TwoHeadNetwork — Type

TwoHeadNetwork{Game} <: KNetwork{Game}

An abstract type for two-head neural networks implemented with Knet.

Subtypes are assumed to have the following fields: hyper, common, vhead and phead. Based on those, an implementation is provided for Network.hyperparams, Network.forward and Network.on_gpu, leaving only Network.HyperParams to be implemented.
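The following plain-Julia sketch shows how a shared implementation can derive the interface functions from the four assumed fields. The MyNet type, its dummy layers, and the unqualified hyperparams/forward names are hypothetical; only the field names hyper, common, vhead and phead come from the documentation above.

```julia
abstract type TwoHeadNetwork end

# A hypothetical concrete subtype with the four assumed fields.
struct MyNet <: TwoHeadNetwork
  hyper    # hyperparameters
  common   # shared trunk
  vhead    # value head
  phead    # policy head
end

# Generic implementations, valid for any subtype with those fields:
hyperparams(nn::TwoHeadNetwork) = nn.hyper

function forward(nn::TwoHeadNetwork, state)
  c = nn.common(state)   # evaluate the shared trunk once
  v = nn.vhead(c)        # value head
  p = nn.phead(c)        # policy head
  return (p, v)
end

# Usage, with closures standing in for real layers:
net = MyNet(nothing, x -> 2 .* x, c -> sum(c), c -> c ./ sum(c))
p, v = forward(net, [1.0, 3.0])
```

The key point is that the trunk is evaluated once and its output feeds both heads, which is exactly what the shared forward implementation factors out of every subtype.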
Networks Library
Convolutional ResNet
AlphaZero.KNets.ResNet — Type

ResNet{Game} <: TwoHeadNetwork{Game}

The convolutional residual network architecture used in the original AlphaGo Zero paper.

AlphaZero.KNets.ResNetHP — Type

ResNetHP

Hyperparameters for the convolutional ResNet architecture.
Parameter | Type | Default |
---|---|---|
num_blocks | Int | - |
num_filters | Int | - |
conv_kernel_size | Tuple{Int, Int} | - |
num_policy_head_filters | Int | 2 |
num_value_head_filters | Int | 1 |
batch_norm_momentum | Float32 | 0.6f0 |
The trunk of the two-head network consists of num_blocks consecutive blocks. Each block features two convolutional layers with num_filters filters and kernel size conv_kernel_size. Note that both kernel dimensions must be odd.
During training, the network is evaluated in training mode on the whole dataset to compute the loss before it is switched to test mode, using big batches. Therefore, it makes sense to use a high batch norm momentum (which puts a lot of weight on the latest measurements).
AlphaGo Zero Parameters
The network in the original paper from DeepMind features 20 blocks with 256 filters per convolutional layer.
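As an illustration, the AlphaGo Zero-style settings quoted above could be expressed as follows. This is a sketch that assumes AlphaZero is installed and that ResNetHP accepts the keyword arguments listed in the table; the variable name netparams is arbitrary.

```julia
using AlphaZero.KNets

netparams = ResNetHP(
  num_blocks = 20,                # as in the original AlphaGo Zero paper
  num_filters = 256,              # idem
  conv_kernel_size = (3, 3),      # both dimensions must be odd
  num_policy_head_filters = 2,    # default
  num_value_head_filters = 1,     # default
  batch_norm_momentum = 0.6f0)    # high momentum, see the note above
```

Given a Game type, the network itself would then presumably be built as ResNet{Game}(netparams).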
Simple Network
AlphaZero.KNets.SimpleNet — Type

SimpleNet{Game} <: TwoHeadNetwork{Game}

A simple two-headed architecture with only dense layers.

AlphaZero.KNets.SimpleNetHP — Type

SimpleNetHP

Hyperparameters for the SimpleNet architecture.
Parameter | Description |
---|---|
width :: Int | Number of neurons on each dense layer |
depth_common :: Int | Number of dense layers in the trunk |
depth_phead = 1 | Number of hidden layers in the actions head |
depth_vhead = 1 | Number of hidden layers in the value head |
use_batch_norm = false | Use batch normalization between successive layers |
batch_norm_momentum = 0.6f0 | Momentum of batch norm statistics updates |
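A sketch of instantiating these hyperparameters, assuming AlphaZero is installed and that SimpleNetHP accepts the keyword arguments listed in the table. The specific values for width and depth_common are arbitrary examples, not recommendations.

```julia
using AlphaZero.KNets

netparams = SimpleNetHP(
  width = 500,            # neurons per dense layer (example value)
  depth_common = 4,       # dense layers in the trunk (example value)
  depth_phead = 1,        # default
  depth_vhead = 1,        # default
  use_batch_norm = true,
  batch_norm_momentum = 0.6f0)
```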