Networks Library

AlphaZero.KNets (Module)

This module provides utilities to build neural networks with Knet, along with a library of standard architectures.


Knet Utilities

AlphaZero.KNets.KNetwork (Type)
KNetwork{Game} <: AbstractNetwork{Game}

Abstract type for neural networks implemented using the Knet framework.

  • Subtypes are expected to be compositions of Flux-like layers that implement a functor interface through the functions children and mapchildren.
  • A custom method of regularized_params_ must also be provided for layers that contain parameters subject to regularization.

Provided that the above holds, KNetwork implements the full network interface, with the exception of Network.HyperParams, Network.hyperparams, Network.forward and Network.on_gpu, which concrete subtypes must still implement.
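
For illustration, here is a minimal sketch of a Flux-like dense layer satisfying this interface. The layer definition, its field names and the use of Knet's param and param0 initializers are assumptions made for the example; only children, mapchildren and regularized_params_ correspond to the interface functions named above.

using Knet

# A hypothetical dense layer (illustrative, not part of the library).
struct Dense
    W  # weight matrix, subject to regularization
    b  # bias vector
end

# Xavier-initialized weights and zero biases, stored as Knet parameters.
Dense(nin::Int, nout::Int) = Dense(param(nout, nin), param0(nout))

# Forward pass: affine transformation.
(d::Dense)(x) = d.W * x .+ d.b

# Functor interface: expose the layer's components and rebuild the layer
# from transformed ones. In the actual module, these would extend the
# corresponding KNets functions rather than define standalone ones.
children(d::Dense) = (d.W, d.b)
mapchildren(f, d::Dense) = Dense(f(d.W), f(d.b))

# Only the weight matrix is subject to regularization, not the bias.
regularized_params_(d::Dense) = (d.W,)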


Networks Library

Convolutional ResNet

AlphaZero.KNets.ResNet (Type)
ResNet{Game} <: TwoHeadNetwork{Game}

The convolutional residual network architecture that is used in the original AlphaGo Zero paper.

AlphaZero.KNets.ResNetHP (Type)
ResNetHP

Hyperparameters for the convolutional resnet architecture.

Parameter               | Type            | Default
num_blocks              | Int             | -
num_filters             | Int             | -
conv_kernel_size        | Tuple{Int, Int} | -
num_policy_head_filters | Int             | 2
num_value_head_filters  | Int             | 1
batch_norm_momentum     | Float32         | 0.6f0

The trunk of the two-head network consists of num_blocks consecutive blocks. Each block features two convolutional layers with num_filters filters each and kernel size conv_kernel_size. Note that both kernel dimensions must be odd.

During training, the network is evaluated in training mode on the whole dataset (using big batches) to compute the loss, before being switched back to test mode. Therefore, it makes sense to use a high batch norm momentum, which puts a lot of weight on the latest measurement.

AlphaGo Zero Parameters

The network in the original paper from DeepMind features 20 blocks with 256 filters per convolutional layer.
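
Concretely, these settings could be written down as follows. This is a sketch: the keyword constructor and the (3, 3) kernel size are assumptions for illustration.

# AlphaGo Zero-like hyperparameters (illustrative values).
alphago_zero_hp = ResNetHP(
    num_blocks = 20,
    num_filters = 256,
    conv_kernel_size = (3, 3))  # both kernel dimensions must be odd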


Simple Network

AlphaZero.KNets.SimpleNetHP (Type)
SimpleNetHP

Hyperparameters for the simplenet architecture.

Parameter                   | Description
width :: Int                | Number of neurons in each dense layer
depth_common :: Int         | Number of dense layers in the trunk
depth_phead = 1             | Number of hidden layers in the actions head
depth_vhead = 1             | Number of hidden layers in the value head
use_batch_norm = false      | Use batch normalization between each layer
batch_norm_momentum = 0.6f0 | Momentum of batch norm statistics updates
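
As an example, a small fully-connected architecture could be specified as follows. This is a sketch: the keyword constructor and the particular values are assumptions for illustration.

# A modest two-head fully-connected network (illustrative values).
simple_hp = SimpleNetHP(
    width = 300,                  # neurons in each dense layer
    depth_common = 4,             # dense layers in the trunk
    use_batch_norm = true,        # insert batch norm between layers
    batch_norm_momentum = 0.6f0)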