GFlowNet Study – Dec 12, 2023

In regular GFlowNets, choosing a_t from s_t deterministically yields some s_{t+1}, which means that we can also write \pi(a_t|s_t) = P_F(s_{t+1}|s_t) for that policy.

Doesn’t choosing a_t from s_t deterministically mean P_F(s_{t+1}|s_t)=1? Why do we still need \pi(a_t|s_t)=P_F(s_{t+1}|s_t)?

My answer (with ChatGPT’s help): \pi(a_t|s_t) is still a stochastic policy over actions; what is deterministic is the transition, i.e., once a_t is chosen from s_t, the next state s_{t+1} is the fixed outcome. So the probability of moving from s_t to s_{t+1}, P_F(s_{t+1}|s_t), is exactly the probability of choosing the action that leads there, \pi(a_t|s_t); it equals 1 only if the policy puts all of its mass on that single action.
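To make the distinction concrete, here is a minimal sketch (the tiny DAG and all names are mine, not the tutorial’s): the policy is a distribution over actions, the transition is a deterministic lookup, and the probability of landing in a given child state is just the probability of picking the action that leads to it.

```python
# Minimal sketch (illustrative toy DAG, not from the tutorial).
# Transitions are deterministic: state + action -> a unique next state.
CHILDREN = {
    "s0": {"a1": "s1", "a2": "s2"},
    "s1": {"a3": "s3"},
    "s2": {"a3": "s3"},
}

def policy(state):
    """A stochastic policy pi(a_t | s_t): here simply uniform over legal actions."""
    actions = list(CHILDREN[state])
    return {a: 1.0 / len(actions) for a in actions}

def forward_prob(state, next_state):
    """P_F(s_{t+1} | s_t): probability of reaching next_state under the policy.
    Because each action leads to exactly one child state, this is just
    pi(a_t | s_t) for the action a_t that produces next_state."""
    return sum(p for a, p in policy(state).items()
               if CHILDREN[state][a] == next_state)

assert forward_prob("s0", "s1") == policy("s0")["a1"]   # pi(a_t|s_t) == P_F(s_{t+1}|s_t)
```

If two different actions could lead to the same child state, P_F(s_{t+1}|s_t) would instead be the sum of their policy probabilities, which is why the sketch sums rather than assuming a single match.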

Note: on “inference”

The relational diagram presented in scaling-in-the-service-of-reasoning-model-based-ml/ depicts the relationship between the world model and the inference machine, the latter embodied by the GFlowNet. I noted this because the architectural designs in both are virtually the same, although the scaling paper keeps things at a more abstract level. The relationship is that the world model provides R(x), while the inference machine learns to sample with P(x) proportional to R(x), which represents the Bayesian posterior.

Inference, in the context of System 2 AI, is the formulation of compositional objects, which could be ideas, questions, solutions, etc.; it goes hand in hand with the source of knowledge, represented by the world model. Inference, in the context of ML, is the process of making predictions with a trained model. Inference, in the most general sense, is the process of drawing conclusions or making judgments based on evidence and reasoning; it is a deductive process.

Overall, GFlowNet is a framework for sampling from a Bayesian posterior, which enables it to perform inference, i.e., to draw conclusions from observed data and prior knowledge.
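As a concrete (and heavily simplified) illustration of that claim, here is a sketch of training a forward policy with the trajectory-balance objective so that the probability of sampling an object approaches R(x)/Z. The toy task, the reward, and all names are my own assumptions, not from the note; because every 3-bit string here has a single construction path, the backward-policy terms of trajectory balance are all 1 and drop out.

```python
# Hedged sketch: train pi so that P(x) ∝ R(x) via the trajectory-balance loss.
# Toy task: build a 3-bit string by appending one bit at a time.
import torch
import torch.nn as nn

LENGTH = 3

def reward(bits):
    # Arbitrary positive toy reward: strings with more 1s get more probability mass.
    return float(1 + sum(bits))

def encode(bits):
    # Fixed-size encoding of a partial string: +1 / -1 per chosen bit, 0 = not chosen yet.
    x = torch.zeros(LENGTH)
    for i, b in enumerate(bits):
        x[i] = 1.0 if b else -1.0
    return x

policy_net = nn.Sequential(nn.Linear(LENGTH, 32), nn.ReLU(), nn.Linear(32, 2))
log_Z = nn.Parameter(torch.zeros(1))          # learned log-partition function
opt = torch.optim.Adam(list(policy_net.parameters()) + [log_Z], lr=1e-2)

for step in range(2000):
    bits, log_pf = [], torch.zeros(1)
    for _ in range(LENGTH):                   # roll out one construction trajectory
        dist = torch.distributions.Categorical(logits=policy_net(encode(bits)))
        a = dist.sample()
        log_pf = log_pf + dist.log_prob(a)
        bits.append(int(a.item()))
    # Trajectory balance (P_B terms are all 1 here): log Z + sum_t log P_F = log R(x).
    loss = ((log_Z + log_pf - torch.log(torch.tensor(reward(bits)))) ** 2).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, empirical sampling frequencies should approach R(x) / sum over x' of R(x').
```

With R(x) chosen to be an unnormalized posterior, a sampler like this draws approximate posterior samples, which is the sense in which the inference machine does Bayesian inference.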

How does R(x) represent a Bayesian posterior?

My answer:
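A sketch of the standard reading (assuming x is the object being inferred, D is the observed data, and the world model defines the reward as the unnormalized posterior):

```latex
\[
R(x) \;=\; p(x)\, p(D \mid x)
\quad\Longrightarrow\quad
P(x) \;=\; \frac{R(x)}{Z}
\;=\; \frac{p(x)\, p(D \mid x)}{\sum_{x'} p(x')\, p(D \mid x')}
\;=\; p(x \mid D).
\]
```

So R(x) is not itself the posterior; it is the posterior up to the normalizing constant Z, and the GFlowNet’s job is to sample in proportion to it without ever computing Z.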

Note: the object construction process of GFlowNet

Note that “GFN” in this illustration isn’t the whole GFlowNet, but its basic building block: a neural network that defines the constructive policy \pi.
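For concreteness, a minimal sketch of such a block (my own illustrative code; the vocabulary, sizes, and stop action are assumptions): a small network maps the partial object to logits over construction actions, and the same block is applied repeatedly until a stop action terminates the trajectory.

```python
# Illustrative sketch: the "GFN" block is a neural network mapping the current
# partial object to logits over construction actions, i.e., it defines pi(a_t | s_t).
import torch
import torch.nn as nn

VOCAB = ["A", "B", "C"]          # hypothetical building blocks
ACTIONS = VOCAB + ["<stop>"]
MAX_LEN = 6

class GFNBlock(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(MAX_LEN * len(VOCAB), hidden),
                                 nn.ReLU(), nn.Linear(hidden, len(ACTIONS)))

    def forward(self, state):
        # state: the list of tokens chosen so far -> padded one-hot encoding.
        enc = torch.zeros(MAX_LEN, len(VOCAB))
        for i, tok in enumerate(state):
            enc[i, VOCAB.index(tok)] = 1.0
        return self.net(enc.flatten())        # logits over the next action

block, state = GFNBlock(), []
while len(state) < MAX_LEN:                   # construct one object step by step
    a = torch.distributions.Categorical(logits=block(state)).sample().item()
    if ACTIONS[a] == "<stop>":
        break
    state.append(ACTIONS[a])
print(state)  # one sampled compositional object (untrained, so essentially random)
```

Training (e.g., with a balance objective like the one sketched earlier) is what turns this random constructor into a sampler whose terminal distribution matches the reward.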

Note: the foundational principles of GFlowNet

  • Compositional. System 2 performs reasoning by building ideas step by step, in a compositional manner.
  • Learned. The policy \pi(a_t|s_t) is learned and modeled after System 2.
  • Generative. It can then be used to artificially build thoughts.

Note: the proposed link between GFlowNet and consciousness science that is the foundation of GFlowNets for Brain-inspired reasoning:

This suggests that something like a GFlowNet could learn the internal policy that selects the sequence of thoughts.

https://milayb.notion.site/The-GFlowNet-Tutorial-95434ef0e2d94c24aab90e69b30be9b3#208ee566b55048cda2c87fd5e0e93330
