MAF and RealNVP

LINFA implements two widely used normalizing flow formulations, MAF [PPM18] and RealNVP [DSDB16].

MAF belongs to the class of autoregressive normalizing flows. Given the latent variable \(\boldsymbol{z} = (z_{1},z_{2},\dots,z_{d})\), it assumes \(p(z_i|z_{1},\dots,z_{i-1}) = e^{-\alpha_i}\,\phi[(z_i - \mu_i) / e^{\alpha_i}]\), i.e., a Gaussian conditional with mean \(\mu_i\) and standard deviation \(e^{\alpha_i}\), where \(\phi\) is the standard normal density, \(\mu_i = f_{\mu_i}(z_{1},\dots,z_{i-1})\), \(\alpha_i = f_{\alpha_i}(z_{1},\dots,z_{i-1}),\,i=1,2,\dots,d\), and \(f_{\mu_i}\) and \(f_{\alpha_i}\) are masked autoencoder neural networks (MADE, [GGML15]).
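The factorization above can be illustrated with a minimal NumPy sketch. It is not LINFA's implementation: the function name `maf_log_density` and the linear conditioners (strictly lower-triangular weight matrices standing in for the MADE networks \(f_{\mu_i}\) and \(f_{\alpha_i}\)) are assumptions made only for illustration. The term \(-\alpha_i\) in the sum is the normalization of a Gaussian with standard deviation \(e^{\alpha_i}\).

```python
import numpy as np

def maf_log_density(z, W_mu, b_mu, W_alpha, b_alpha):
    """Log-density of z under Gaussian autoregressive conditionals.

    Hypothetical toy conditioner: linear maps whose weights are made
    strictly lower triangular, so mu_i and alpha_i see only z_1..z_{i-1}.
    """
    mu = np.tril(W_mu, k=-1) @ z + b_mu
    alpha = np.tril(W_alpha, k=-1) @ z + b_alpha
    # Standardize each component with its own conditional mean and scale.
    u = (z - mu) * np.exp(-alpha)
    # log phi(u_i) for the standard normal density phi.
    log_phi = -0.5 * u**2 - 0.5 * np.log(2.0 * np.pi)
    # log p(z) = sum_i [log phi(u_i) - alpha_i]
    return np.sum(log_phi - alpha)

rng = np.random.default_rng(0)
d = 3
W_mu, W_alpha = rng.normal(size=(d, d)), 0.1 * rng.normal(size=(d, d))
b_mu, b_alpha = rng.normal(size=d), 0.1 * rng.normal(size=d)
z = rng.normal(size=d)
log_p = maf_log_density(z, W_mu, b_mu, W_alpha, b_alpha)
```

With all weights and biases set to zero, every conditional reduces to a standard normal and the function returns the log-density of a \(d\)-dimensional standard Gaussian.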

In a MADE autoencoder the network connectivities are multiplied by Boolean masks so that each output depends only on the preceding inputs. The input-output relation therefore maintains a lower triangular structure, which makes the computation of the Jacobian determinant particularly simple. MAF transformations are then composed of multiple MADE layers, possibly interleaved with batch normalization layers [IS15], which are typically used to stabilize training and improve network accuracy [PPM18].
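The masking mechanism can be sketched for a single hidden layer as follows. This is a minimal illustration under stated assumptions, not LINFA's code: the helper names `made_masks` and `made_forward`, the cyclic assignment of hidden-unit degrees, and the `tanh` activation are all hypothetical choices, and the sketch requires \(d \geq 2\). Each hidden unit is assigned a degree in \(1,\dots,d-1\); it connects to inputs of equal or lower degree and to outputs of strictly higher degree.

```python
import numpy as np

def made_masks(d, n_hidden):
    """Boolean masks (stored as floats) for a one-hidden-layer MADE, d >= 2."""
    in_deg = np.arange(1, d + 1)
    hid_deg = np.arange(n_hidden) % (d - 1) + 1  # degrees cycle over 1..d-1
    out_deg = np.arange(1, d + 1)
    # Hidden unit k sees input j only if deg(k) >= j.
    mask_in = (hid_deg[:, None] >= in_deg[None, :]).astype(float)
    # Output i sees hidden unit k only if i > deg(k).
    mask_out = (out_deg[:, None] > hid_deg[None, :]).astype(float)
    return mask_in, mask_out

def made_forward(z, W1, b1, W2, b2, mask_in, mask_out):
    """Masked forward pass: output i depends only on z_1..z_{i-1}."""
    h = np.tanh((W1 * mask_in) @ z + b1)
    return (W2 * mask_out) @ h + b2

d, n_hidden = 4, 8
mask_in, mask_out = made_masks(d, n_hidden)
# Entry (i, j) counts the paths from input j to output i.
connectivity = mask_out @ mask_in
```

The matrix `connectivity` is strictly lower triangular, so \(\mu_i\) and \(\alpha_i\) produced by such a network depend only on \(z_1,\dots,z_{i-1}\). For the map \(u_i = (z_i - \mu_i)\,e^{-\alpha_i}\) the Jacobian is then lower triangular with diagonal entries \(e^{-\alpha_i}\), and its log-determinant is simply \(-\sum_i \alpha_i\).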

RealNVP is another widely used flow where, at each layer, the first \(d'\) variables are left unaltered while the remaining \(d-d'\) are subject to an affine transformation of the form \(\widehat{\boldsymbol{z}}_{d'+1:d} = \boldsymbol{z}_{d'+1:d}\,\odot\,e^{\boldsymbol{\alpha}} + \boldsymbol{\mu}\), where \(\boldsymbol{\mu} = f_{\mu}(\boldsymbol{z}_{1:d'})\) and \(\boldsymbol{\alpha} = f_{\alpha}(\boldsymbol{z}_{1:d'})\) are MADE autoencoders. In this context, RealNVP can be seen as a special case of MAF, obtained by setting \(\mu_i=\alpha_i=0\) for \(i\leq d'\) and letting \(\mu_i\) and \(\alpha_i\) depend only on \(\boldsymbol{z}_{1:d'}\) for \(i>d'\) [PPM18].
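A single coupling layer of this kind can be sketched in NumPy as shown below. The function names and the toy conditioner (a `tanh` followed by a linear map, standing in for the networks \(f_{\mu}\) and \(f_{\alpha}\)) are assumptions for illustration and do not reflect LINFA's API. Because \(\boldsymbol{\mu}\) and \(\boldsymbol{\alpha}\) depend only on the unaltered block \(\boldsymbol{z}_{1:d'}\), the layer is inverted in closed form and its log-determinant is the sum of the entries of \(\boldsymbol{\alpha}\).

```python
import numpy as np

def conditioner(z_fixed, params):
    """Hypothetical toy conditioner returning (mu, alpha) from z_{1:d'}."""
    W_mu, W_alpha = params
    h = np.tanh(z_fixed)
    return W_mu @ h, W_alpha @ h

def coupling_forward(z, d_fixed, params):
    """Affine coupling: keep z_{1:d'}, transform z_{d'+1:d}."""
    z_fixed, z_rest = z[:d_fixed], z[d_fixed:]
    mu, alpha = conditioner(z_fixed, params)
    z_hat = np.concatenate([z_fixed, z_rest * np.exp(alpha) + mu])
    # Jacobian is lower triangular with diagonal (1, ..., 1, e^alpha).
    log_det = np.sum(alpha)
    return z_hat, log_det

def coupling_inverse(z_hat, d_fixed, params):
    """Exact inverse: the fixed block is unchanged, so mu and alpha
    can be recomputed from it."""
    z_fixed, z_rest_hat = z_hat[:d_fixed], z_hat[d_fixed:]
    mu, alpha = conditioner(z_fixed, params)
    return np.concatenate([z_fixed, (z_rest_hat - mu) * np.exp(-alpha)])

rng = np.random.default_rng(0)
d, d_fixed = 4, 2
params = (rng.normal(size=(d - d_fixed, d_fixed)),
          0.5 * rng.normal(size=(d - d_fixed, d_fixed)))
z = rng.normal(size=d)
z_hat, log_det = coupling_forward(z, d_fixed, params)
z_back = coupling_inverse(z_hat, d_fixed, params)  # recovers z
```

Since each layer leaves the first \(d'\) variables untouched, practical flows stack several such layers and change which variables are held fixed between layers, so that every component is eventually transformed.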