What is energy-based modeling?
My answer after brief reading: The fundamental idea of EBM is to interpret a system in terms of energy. The energy function, which outputs a scalar value, defines a system’s state. The lower the energy, the more stable, desirable, or likely the state. In optimization, the energy function is similar to a loss function in that both are to be minimized.
Critique: A more precise description is that the energy function assigns a scalar energy value to each configuration (state) of the variables in the model. The energy of a state is inversely related to its probability: the lower the energy, the more probable the state.
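The formula E(x) = −log R(x) that I saw in the GFN tutorial is a rearrangement of the Boltzmann distribution, P(x) = e^{−E(x)} / Z, written in terms of the unnormalized probability function R(x) = e^{−E(x)}, so that P(x) = R(x) / Z. Here Z is the normalizing constant (the partition function), the sum of e^{−E(x)} over all states. A Boltzmann distribution is a probability distribution that gives the probability of a system being in a certain state as a function of that state’s energy: the probability falls off exponentially as the energy increases.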
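To make the relationship concrete, here is a minimal sketch in plain Python. The three energy values and the function name boltzmann_probs are made up for illustration; they are not from the GFN tutorial.

```python
import math

def boltzmann_probs(energies):
    """Turn a list of energies into probabilities P(x) = exp(-E(x)) / Z."""
    # Shifting by the minimum energy avoids overflow in exp();
    # the shift cancels out after normalization.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min)) for e in energies]
    z = sum(weights)  # partition function, up to the constant shift
    return [w / z for w in weights]

# Hypothetical energies of three states (illustrative values only).
energies = [0.0, 1.0, 2.0]
probs = boltzmann_probs(energies)

# Going the other way: from unnormalized weights R(x) back to energies.
rewards = [math.exp(-e) for e in energies]   # R(x) = exp(-E(x))
recovered = [-math.log(r) for r in rewards]  # E(x) = -log R(x)

print(probs)      # lower energy -> higher probability
print(recovered)  # matches the original energies up to rounding
```

The state with the lowest energy gets the highest probability, and taking −log of the unnormalized weights gives back the energies.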
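The relationship between EBM and MCMC: MCMC is used to sample from the probability distribution defined by the energy function. This is useful because the normalizing constant Z is usually intractable to compute, while MCMC methods such as Metropolis-Hastings only need energy differences between states.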
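As a rough illustration of how MCMC samples from an energy function without ever computing Z, here is a small Metropolis sampler sketch. The energy function, the five-state ring, and the step count are all assumptions chosen for the example.

```python
import math
import random

def energy(x):
    # Hypothetical energy over states 0..4, lowest at x = 2.
    return 0.5 * (x - 2) ** 2

def metropolis(energy_fn, n_states, n_steps, seed=0):
    """Metropolis sampling on a ring of n_states; returns visit counts."""
    rng = random.Random(seed)
    x = rng.randrange(n_states)
    counts = [0] * n_states
    for _ in range(n_steps):
        # Symmetric proposal: move one step left or right on the ring.
        x_new = (x + rng.choice((-1, 1))) % n_states
        # Accept with probability min(1, exp(-(E(x_new) - E(x)))).
        delta = energy_fn(x_new) - energy_fn(x)
        if delta <= 0 or rng.random() < math.exp(-delta):
            x = x_new
        counts[x] += 1
    return counts

counts = metropolis(energy, n_states=5, n_steps=50000)
freqs = [c / sum(counts) for c in counts]

# Exact Boltzmann probabilities for comparison (feasible here: only 5 states).
weights = [math.exp(-energy(x)) for x in range(5)]
exact = [w / sum(weights) for w in weights]

print(freqs)
print(exact)
```

Only the energy difference between the current and proposed state enters the acceptance rule, so Z never has to be computed. The exact probabilities are computed here only as a sanity check, which is possible because the toy state space is tiny.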
What could an energy function look like?
This is an example with a simple Boltzmann machine setup:
Suppose we have a system with binary nodes (neurons) v_i (which can take values 0 or 1), weights w_ij between nodes i and j, and a bias b_i for each node. The energy function E for such a system can be defined as:
E(v) = −∑_{i<j} w_{ij} v_i v_j − ∑_i b_i v_i
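A direct translation of this energy function into Python might look like the sketch below. The 3-node weights and biases are made-up values for illustration, and bm_energy is a hypothetical helper name.

```python
def bm_energy(v, w, b):
    """E(v) = -sum_{i<j} w[i][j] * v[i] * v[j] - sum_i b[i] * v[i]."""
    n = len(v)
    # Pairwise term: each pair (i, j) with i < j is counted once.
    pair = sum(w[i][j] * v[i] * v[j]
               for i in range(n) for j in range(i + 1, n))
    # Bias term: one contribution per active node.
    bias = sum(b[i] * v[i] for i in range(n))
    return -pair - bias

# Hypothetical 3-node system with symmetric weights (illustrative values).
w = [[0.0, 1.0, -2.0],
     [1.0, 0.0, 0.5],
     [-2.0, 0.5, 0.0]]
b = [0.1, -0.3, 0.2]

e_on = bm_energy([1, 1, 0], w, b)   # nodes 0 and 1 active
e_off = bm_energy([0, 0, 0], w, b)  # no nodes active

print(e_on, e_off)
```

With nodes 0 and 1 active, only w_01 and the biases b_0 and b_1 contribute, giving an energy of about −0.8; with all nodes off, every term vanishes and the energy is 0.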