Note: a trained GFN is both a sampler (i.e., it generates compositional objects) and an inference machine (i.e., it answers questions and predicts probabilities).
What does a parametrized energy function look like?
TO_BE_ANSWERED. But it can be trained with classical maximum likelihood.
we have shown how we can jointly train the GFlowNet sampler and its target reward function (which is an energy function in disguise: R(x)=e^{-{\cal E}(x)}) from a dataset
https://milayb.notion.site/95434ef0e2d94c24aab90e69b30be9b3
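To make the relation R(x)=e^{-E(x)} and the maximum-likelihood remark concrete for myself, here is a minimal sketch. Everything in it is my own assumption and not taken from the tutorial: the linear energy E_theta(x) = -theta . phi(x), the toy space of 3-bit strings, the made-up dataset, and the helper names. It computes the partition function exactly, which only works because the space has 8 elements. No GFlowNet is involved here.

```python
import numpy as np

def energy(theta, features):
    """Hypothetical linear energy: E_theta(x) = -theta . phi(x), for every x at once."""
    return -features @ theta

def log_probs(theta, features):
    """log p_theta(x) = -E_theta(x) - log Z, with Z summed exactly over all x."""
    neg_e = -energy(theta, features)
    m = neg_e.max()  # subtract the max for numerical stability
    log_z = m + np.log(np.exp(neg_e - m).sum())
    return neg_e - log_z

def fit_max_likelihood(features, data_idx, steps=2000, lr=0.5):
    """Gradient ascent on the average log-likelihood of the dataset."""
    theta = np.zeros(features.shape[1])
    data_mean = features[data_idx].mean(axis=0)  # E_data[phi(x)]
    for _ in range(steps):
        p = np.exp(log_probs(theta, features))
        model_mean = p @ features                # E_model[phi(x)]
        # Gradient of the average log-likelihood for this linear energy.
        theta += lr * (data_mean - model_mean)
    return theta

# Toy space: all binary strings of length 3; phi(x) is just the bits (assumption).
features = np.array(
    [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)], dtype=float
)
# Made-up dataset, given as row indices into `features`.
data_idx = np.array([7, 7, 6, 5, 3, 7, 1, 6])

theta = fit_max_likelihood(features, data_idx)
reward = np.exp(-energy(theta, features))   # R(x) = e^{-E(x)}
probs = np.exp(log_probs(theta, features))  # p(x) = R(x) / Z
```

In this sketch the reward is just the unnormalized probability, so `probs` equals `reward / reward.sum()`. The expensive part is the expectation under the model, which is computed here by brute-force enumeration. On large compositional spaces that is not possible, and I assume that is where a sampler would have to come in.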
The reward function is part of the world model. I know that the inference machine Q helps train the world model P. It means that the sampler and the reward function are trained jointly.
But then what is the ultimate training objective that guides the joint training? Or what does the training process look like?
TO_BE_ANSWERED.
What is energy-based modeling?
My answer after brief reading: energy-based modeling (EBM) is a way to describe a system through the energy of its states. Low energy corresponds to a stable state.
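For reference, the standard way an energy function is turned into a probability distribution (the Boltzmann or Gibbs form, which I am fairly sure is what the reward relation R(x)=e^{-E(x)} above refers to) is the following. The sum is over a discrete space, which is my assumption here; for continuous x it would be an integral.

```latex
p(x) = \frac{e^{-\mathcal{E}(x)}}{Z},
\qquad
Z = \sum_{x'} e^{-\mathcal{E}(x')}
```

So lower energy means higher probability, and Z is the normalizing constant that makes the probabilities sum to one.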
To be continued tomorrow.