How do GFlowNets relate to attention or human thinking?

How are GFlowNets related to attention?

It makes sense to consider deterministic transitions when the policy is an internal policy: one that is not acting in some stochastic external environment but is instead making computational choices (like what to attend to, what computation to perform, what memory to retrieve, etc.).
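As a rough sketch of what that means (my own illustration, not from the tutorial): the state is a partially assembled thought, each action is a computational micro-choice, and the next state is a pure function of the current state and the chosen action. In a GFlowNet, the forward policy over such actions would be trained so that completed states are sampled with probability proportional to a reward; here a placeholder uniform policy just shows the deterministic-transition structure, and the action names are invented for illustration.

```python
import random

# Hypothetical internal actions: computational micro-choices, not moves in an
# external environment. Names are illustrative only.
ACTIONS = ["attend_to_input", "retrieve_memory", "apply_rule", "stop"]

def transition(state, action):
    # Deterministic transition: the next internal state is a pure function
    # of (state, action) -- no external stochasticity.
    return state + (action,)

def rollout(policy, max_steps=5):
    state = ()  # the empty, not-yet-formed thought
    for _ in range(max_steps):
        action = policy(state)
        if action == "stop":
            break
        state = transition(state, action)
    return state  # a completed "thought": a short sequence of micro-choices

def uniform_policy(state):
    # Placeholder for a learned forward policy P_F(action | state).
    return random.choice(ACTIONS)

print(rollout(uniform_policy))
```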


GFlowNets are motivated by this kind of internal policy, one that could model cognition: the sequence of micro-choices a thinking agent makes about its internal computation, which corresponds to a form of probabilistic inference.

https://milayb.notion.site/The-GFlowNet-Tutorial-95434ef0e2d94c24aab90e69b30be9b3#208ee566b55048cda2c87fd5e0e93330

Brain sciences show that conscious reasoning involves a sequential process of thought formation, where at each step a competition takes place among possible thought contents (and the relevant parts of the brain with expertise on that content), and each thought involves very few symbolic elements (a handful). This is the heart of the Global Workspace Theory (GWT), initiated by Baars (1993, 1997) and extended (among others) by Dehaene et al. (2011, 2017, 2020), as well as through Graziano's Attention Schema Theory (2011, 2013, 2017).

So a GFlowNet can be seen as a model of this thought-formation process, which is where the connection to consciousness comes in.

This suggests that something like a GFlowNet could learn the internal policy that selects this sequence of thoughts.

In particular, the GWT bottleneck, when applied to such probabilistic inference, would enforce the inductive bias that the graph of dependencies between high-level concepts (those we can manipulate consciously and reason with) is very sparse, in the sense that each factor or energy term has at most a handful of arguments.
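To make that concrete, here is a toy illustration (my own, with made-up concept names and factor forms, not from the tutorial): the energy over a set of high-level concept variables decomposes into terms that each touch only two or three of them, so the factor graph between concepts is very sparse. A GFlowNet doing inference over such a model would, roughly, sample configurations with probability proportional to exp(-energy).

```python
# Toy sparse energy function over high-level "concept" variables.
# Concept names and factor forms are invented for illustration.
def energy(x):
    terms = [
        (x["cause"] - x["effect"]) ** 2,            # factor with 2 arguments
        (x["effect"] - x["context"]) ** 2,          # factor with 2 arguments
        abs(x["goal"] - x["plan"] - x["context"]),  # factor with 3 arguments
    ]
    # Each term depends on at most a handful of concepts, even though the
    # joint model ranges over all of them: a sparse factor graph.
    return sum(terms)

print(energy({"cause": 1, "effect": 2, "context": 0, "goal": 3, "plan": 3}))
```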

This is the sparsity I have heard Yoshua describe in interviews.
