Category: System 2 AI
-
Reproducing Paper “GPT-4 Can’t Reason”
(updated on Apr 22) The higher-level cognitive abilities of ChatGPT have always been fascinating to me. The topic has sparked numerous debates since OpenAI’s launch, but most commentary is one-sided. Recently I came across Konstantine Arkoudas’s pre-print paper GPT-4 Can’t Reason (arXiv) and was amazed by the clever scoping of the problem statement,…
-
DL weekly study note: Bengio’s AI scientist, Bayesian or Frequentist, and more
Bengio’s AI scientist talk, Why greatness can’t be planned: https://youtu.be/lhYGXYeMq_E?si=EaFHRz0d1MAqQ68G. Bayesian statistics: Bayesian or Frequentist, Which Are You? Basics
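The excerpt cuts off before the comparison, so here is a minimal sketch of the Bayesian-vs-frequentist contrast on the canonical coin-flip example (the data and prior below are illustrative assumptions, not from the post):

```python
# Toy data (illustrative): 7 heads in 10 flips of a coin with unknown bias p.
heads, n = 7, 10
tails = n - heads

# Frequentist answer: the maximum-likelihood point estimate of p.
p_mle = heads / n

# Bayesian answer: with a uniform Beta(1, 1) prior, the posterior over p
# is Beta(1 + heads, 1 + tails); report a summary of that distribution.
alpha, beta = 1 + heads, 1 + tails
p_posterior_mean = alpha / (alpha + beta)

print(f"MLE point estimate: {p_mle:.3f}")                 # 0.700
print(f"Bayesian posterior mean: {p_posterior_mean:.3f}")  # 0.667
```

The posterior mean is pulled toward the prior’s 0.5, and the Bayesian keeps a whole distribution over p rather than a single number; that difference is the debate in miniature.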
-
DL weekly study notes: DL engineering and Bengio’s AI scientist
Lec 3 model reproduction and experiments: https://github.com/gangfang/makemore/blob/main/makemore_mlp.ipynb. Implementation notes, high-level:
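For context, the linked notebook follows Karpathy’s makemore lecture 3, which reproduces the MLP character-level language model of Bengio et al. (2003). Below is a minimal self-contained sketch of that architecture; the toy corpus and hyperparameters are assumptions standing in for the notebook’s actual settings:

```python
import torch
import torch.nn.functional as F

# Toy corpus; the notebook trains on the full makemore names dataset.
words = ["emma", "olivia", "ava", "isabella", "sophia"]
chars = sorted(set("".join(words)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0  # start/end token
vocab_size = len(stoi)

block_size = 3  # context length: how many characters predict the next one
X, Y = [], []
for w in words:
    context = [0] * block_size
    for ch in w + ".":
        X.append(context)
        Y.append(stoi[ch])
        context = context[1:] + [stoi[ch]]  # slide the window forward
X, Y = torch.tensor(X), torch.tensor(Y)

g = torch.Generator().manual_seed(42)
C = torch.randn((vocab_size, 2), generator=g)         # character embeddings
W1 = torch.randn((block_size * 2, 100), generator=g)  # hidden layer
b1 = torch.randn(100, generator=g)
W2 = torch.randn((100, vocab_size), generator=g)      # output layer
b2 = torch.randn(vocab_size, generator=g)
params = [C, W1, b1, W2, b2]
for p in params:
    p.requires_grad = True

for step in range(200):
    emb = C[X]                                          # (N, block_size, 2)
    h = torch.tanh(emb.view(emb.shape[0], -1) @ W1 + b1)
    logits = h @ W2 + b2
    loss = F.cross_entropy(logits, Y)
    for p in params:
        p.grad = None
    loss.backward()
    for p in params:
        p.data += -0.1 * p.grad  # plain SGD
print(loss.item())
```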
-
Generalization and the Scaling laws – Jan 31, 2024
Traditionally, the central challenge in ML has been generalization, in particular out-of-distribution (OOD) generalization. The problem only arises when the model is used to predict on previously unobserved inputs. However, with the current approach to LLM training, which uses the “entire Internet”, there doesn’t seem to be any unobserved data left, and therefore OOD…
-
DL study – Jan 25, 2024
The scaling law I have been hearing about: how much truth is there to it? Looking at it from the opposite angle, the question is: how much does neural architecture matter? Are all those invariances or biases that different architectures possess simply a “shortcut” that facilitates more complex problem solving with…
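For reference, one widely cited empirical form of the scaling law is the Chinchilla parametrization of Hoffmann et al. (2022); the post does not name a specific version, so this is an assumed stand-in. It writes expected loss as a function of parameter count N and training tokens D:

```latex
% Chinchilla-style parametric scaling law (Hoffmann et al., 2022):
% loss falls as a power law in both model size N and data size D,
% down to an irreducible term E.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

The fitted exponents reported there are roughly α ≈ 0.34 and β ≈ 0.28, i.e., neither model size nor data dominates; architecture enters only through the fitted constants, which is exactly the “does architecture matter?” question above.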
-
System 2 AI, DL – Jan 21, 2024
I have been challenging my notion that for a system to conduct deliberate thinking, its design has to have certain inductive biases that underpin such an ability in humans. But when I read that these days neuroscience’s influence on DL is more limited than I used to think, I started to…
-
Can ChatGPT understand? A tiny case study – Jan 16, 2024
Can current machines understand? One observation that seems to strongly suggest ChatGPT doesn’t reason but merely imitates language patterns is that it responds to the same contextual information completely differently depending on the phrasing of the question, especially on whether positivity or negativity is implied when a why-question is asked. This shows that the…
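The excerpt is cut off here, but the probe it describes is easy to reproduce. A minimal sketch, assuming the OpenAI v1 Python SDK; the context and the oppositely framed why-questions below are illustrative stand-ins, not the post’s actual case study:

```python
from openai import OpenAI  # assumes the OpenAI v1 Python SDK is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative context (not the post's case study): the stated fact is true,
# since 20% off $50 is indeed $40.
CONTEXT = (
    "A store offers 20% off any item. Alice buys a $50 jacket "
    "and pays $40 at the register."
)

# The same fact probed with a positive and a negative framing.
prompts = [
    "Why is it correct that Alice paid the right amount?",
    "Why is it wrong that Alice paid the right amount?",
]

for p in prompts:
    resp = client.chat.completions.create(
        model="gpt-4",  # assumed; substitute whichever model you test
        messages=[{"role": "user", "content": f"{CONTEXT}\n\n{p}"}],
    )
    print(p, "->", resp.choices[0].message.content, "\n")
```

If the model actually reasons over the context, it should reject the premise of the second question rather than rationalize it; answers that accommodate both framings are the phrasing sensitivity the post describes.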