This ICML 2015 paper by Salimans, Kingma, and Welling sets out to explore 'a new synthesis of variational inference and Monte Carlo'.
The paper looks at a potentially interesting approach: introducing an optimization objective for the parameters of the inference algorithm and approximating some integrals appearing in that objective with Monte Carlo. However, I had serious problems following the notation and the general ideas. The authors want to treat the Monte Carlo samples as auxiliary variables and introduce a distribution $r(y \mid x, z)$, where $x$ is the data, $z$ is the latent variable, and $y$ denotes the samples produced by the Monte Carlo procedure. This distribution is called the 'auxiliary inference distribution' in one place and the 'inverse model' further down in the same column; this is one example of notation and terminology changing within the paper. I have little intuition for what $r$ is used for or why it is interesting to look at. An elementary explanation would be welcome here.
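If I reconstruct the standard auxiliary-variable argument (this is my own reading, not necessarily the paper's exact formulation), the quantity being optimized should be a lower bound of the form

\[
\log p(x) \;\ge\; \mathbb{E}_{q(y,z \mid x)}\!\left[ \log \frac{p(x,z)\, r(y \mid x, z)}{q(y,z \mid x)} \right],
\]

where $r(y \mid x, z)$ acts as a variational approximation to the intractable 'inverse' $q(y \mid x, z)$, and the bound becomes tight when the two coincide. If this is what the authors intend, stating it this explicitly would already help a great deal.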
When using Monte Carlo to approximate some intractable integrals, in particular a variational lower bound, the authors claim that this yields unbiased estimates. It is unclear to me why the estimates should be unbiased, especially when only a few MC samples are collected, as the paper suggests. The way they propose to compute gradients of the lower bound is completely elusive to me.
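For what it is worth, a single-sample Monte Carlo estimate of an expectation is unbiased by construction, so I suspect the claim refers to this. Below is a minimal sketch (my own illustration with a toy Gaussian model, not the paper's code) of a one-sample estimate of the lower bound using the reparameterization $z = \mu + \sigma \epsilon$; averaging many such noisy estimates recovers the true bound:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (my choice, for illustration): p(z) = N(0, 1),
# p(x | z) = N(z, 1), observed x; variational q(z | x) = N(mu, sigma^2).
x, mu, sigma = 1.0, 0.4, 0.8

def log_p(z):
    # log p(x, z) = log p(z) + log p(x | z)
    return -0.5 * z**2 - 0.5 * (x - z)**2 - np.log(2 * np.pi)

def log_q(z):
    # log q(z | x)
    return -0.5 * ((z - mu) / sigma)**2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

def elbo_one_sample():
    # Reparameterization: z = mu + sigma * eps with eps ~ N(0, 1),
    # so the single-sample estimate log p(x, z) - log q(z | x)
    # is unbiased for E_q[log p - log q], the variational lower bound.
    eps = rng.standard_normal()
    z = mu + sigma * eps
    return log_p(z) - log_q(z)

# Each one-sample estimate is noisy but unbiased: the running mean
# converges to the true lower bound as the number of estimates grows.
estimates = np.array([elbo_one_sample() for _ in range(100_000)])
print(estimates.mean(), estimates.std() / np.sqrt(len(estimates)))
```

The gradient question is then presumably handled by differentiating through the reparameterization, as in standard stochastic gradient variational inference, but the paper should spell this out.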
At this point I was probably so lost that I had little hope of understanding any further material in the paper. The proposed Hamiltonian Variational Inference is the first of the concrete proposals I do not really follow, and the same goes for Annealed Variational Inference.
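As far as I can tell, Hamiltonian Variational Inference uses Hamiltonian dynamics as the stochastic transition of the variational chain. A generic sketch of the leapfrog integrator that underlies standard HMC (my own sketch, not the paper's algorithm; the step size would presumably be among the parameters tuned by their objective) is:

```python
import numpy as np

def leapfrog(z, v, grad_log_p, step_size, n_steps):
    # Standard leapfrog integration of Hamiltonian dynamics.
    # z: position (the latent variable), v: momentum,
    # grad_log_p: gradient in z of the log target density log p(x, z).
    v = v + 0.5 * step_size * grad_log_p(z)        # initial half momentum step
    for i in range(n_steps):
        z = z + step_size * v                      # full position step
        if i < n_steps - 1:
            v = v + step_size * grad_log_p(z)      # full momentum step
    v = v + 0.5 * step_size * grad_log_p(z)        # final half momentum step
    return z, v

# Example: standard normal target, so grad log p(z) = -z.
z0, v0 = 1.0, np.random.default_rng(1).standard_normal()
z1, v1 = leapfrog(z0, v0, lambda z: -z, step_size=0.1, n_steps=10)
```

If the paper's contribution is to learn such transition parameters by optimizing the auxiliary-variable bound, a statement to that effect early on would make the method far easier to follow.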
In general, I feel the paper is a potpourri of ideas, and I am unclear how they fit together. This may, of course, be a result of my not understanding the principles of the approach in the first place.