# Networks Library

For convenience, we provide a library of standard networks implementing the neural network interface.

These networks are contained in the `AlphaZero.NetLib` module, which is resolved to either `AlphaZero.KnetLib` or `AlphaZero.FluxLib` during precompilation, depending on the value of the `ALPHAZERO_DEFAULT_DL_FRAMEWORK` environment variable (Knet is recommended and used by default).
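Since the backend is selected at precompilation time, the environment variable must be set before `AlphaZero` is loaded. A minimal sketch (the exact accepted values are an assumption here and should be checked against the package sources):

```julia
# Select the DL framework before AlphaZero is (pre)compiled.
# Assumed accepted values: "KNET" (the default) and "FLUX".
ENV["ALPHAZERO_DEFAULT_DL_FRAMEWORK"] = "FLUX"

using AlphaZero

# NetLib should now alias the Flux implementation.
@assert AlphaZero.NetLib === AlphaZero.FluxLib
```

Note that if `AlphaZero` was already precompiled with a different backend, the package may need to be recompiled for the change to take effect.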

## Convolutional ResNet

`AlphaZero.FluxLib.ResNet` — Type

    ResNet <: TwoHeadNetwork

The convolutional residual network architecture that is used in the original AlphaGo Zero paper.

`AlphaZero.FluxLib.ResNetHP` — Type

    ResNetHP

Hyperparameters for the convolutional ResNet architecture.

Parameter | Type | Default |
---|---|---|
`num_blocks` | `Int` | - |
`num_filters` | `Int` | - |
`conv_kernel_size` | `Tuple{Int, Int}` | - |
`num_policy_head_filters` | `Int` | `2` |
`num_value_head_filters` | `Int` | `1` |
`batch_norm_momentum` | `Float32` | `0.6f0` |

The trunk of the two-head network consists of `num_blocks` consecutive blocks. Each block features two convolutional layers with `num_filters` filters and kernel size `conv_kernel_size`. Note that both kernel dimensions must be odd.

During training, the network is evaluated in training mode on the whole dataset to compute the loss before being switched to test mode, using large batches. Therefore, it makes sense to use a high batch norm momentum (putting a lot of weight on the latest measurements).

**AlphaGo Zero Parameters**

The network in DeepMind's original paper features 20 blocks with 256 filters per convolutional layer.
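As an illustration, hyperparameters matching this AlphaGo Zero configuration might be built as follows. This is a sketch assuming a keyword constructor for `ResNetHP` and a standard 3x3 kernel (the kernel size is not specified in the surrounding text); unspecified fields keep their defaults:

```julia
using AlphaZero

# AlphaGo Zero-sized network: 20 blocks, 256 filters per layer.
# The (3, 3) kernel size is an assumption for illustration; both
# dimensions must be odd. Policy/value head filters and batch norm
# momentum keep their defaults (2, 1 and 0.6f0 respectively).
netparams = NetLib.ResNetHP(
  num_blocks=20,
  num_filters=256,
  conv_kernel_size=(3, 3))
```

Such a network is typically much larger than needed for simple games, where a handful of blocks with far fewer filters is common.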

## Simple Network

`AlphaZero.FluxLib.SimpleNet` — Type

    SimpleNet <: TwoHeadNetwork

A simple two-headed architecture with only dense layers.

`AlphaZero.FluxLib.SimpleNetHP` — Type

    SimpleNetHP

Hyperparameters for the `SimpleNet` architecture.

Parameter | Description |
---|---|
`width :: Int` | Number of neurons on each dense layer |
`depth_common :: Int` | Number of dense layers in the trunk |
`depth_phead = 1` | Number of hidden layers in the actions head |
`depth_vhead = 1` | Number of hidden layers in the value head |
`use_batch_norm = false` | Use batch normalization between each layer |
`batch_norm_momentum = 0.6f0` | Momentum of batch norm statistics updates |
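A hypothetical configuration for a small game, again assuming a keyword constructor (the specific values here are illustrative, not recommendations from the package authors):

```julia
using AlphaZero

# A modest fully-connected network: 6 dense layers of 128 neurons
# in the trunk, with batch normalization enabled. The head depths
# and batch norm momentum keep their defaults (1, 1 and 0.6f0).
netparams = NetLib.SimpleNetHP(
  width=128,
  depth_common=6,
  use_batch_norm=true)
```

Being purely dense, `SimpleNet` ignores any spatial structure in the board representation, which makes it a reasonable baseline but usually a weaker choice than the convolutional `ResNet` for board games.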