Category: AI
-
DL weekly study notes: batch normalization, PyTorch’s APIs and distribution visualization
Updated on May 3. A code reproduction of Andrej Karpathy’s “building makemore part 3” lecture can be found at https://github.com/gangfang/makemore/blob/main/makemore_part3.ipynb These weekly notes include some older notes, written for Andrej’s Building makemore Part 3: Activations, Gradients & BatchNorm YouTube video. Principles: We want stable gradients (neither exploding nor vanishing) of non-linearity throughout the…
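The principle this post explores — keeping activation statistics stable so gradients neither explode nor vanish — can be sketched with PyTorch’s `BatchNorm1d`. This is a minimal illustration, not code from the linked notebook: the layer sizes and the deliberately bad weight scale are hypothetical choices made for the demo.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A batch of 256 inputs with 100 features, fed through a linear layer whose
# weights are deliberately scaled up to mimic a bad initialization.
x = torch.randn(256, 100)
linear = nn.Linear(100, 200)
with torch.no_grad():
    linear.weight *= 5.0  # inflate the pre-activation spread on purpose

pre = linear(x)

# BatchNorm1d (in training mode) normalizes each feature over the batch,
# pulling the pre-activations back to roughly zero mean and unit variance.
bn = nn.BatchNorm1d(200)
post = bn(pre)

print(f"before BN: mean={pre.mean():.2f}, std={pre.std():.2f}")
print(f"after  BN: mean={post.mean():.2f}, std={post.std():.2f}")
```

Feeding `post` (rather than `pre`) into a tanh or similar non-linearity keeps it out of its saturated regions, which is what keeps gradients flowing through deep stacks.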
-
Reproducing Paper “GPT-4 Can’t Reason”
(updated on Apr 22) Introduction The higher-level cognitive abilities of ChatGPT have always been fascinating to me. This topic has sparked numerous debates since OpenAI’s launch, but most comments are one-sided. Recently I came across Konstantine Arkoudas’s pre-print paper GPT-4 Can’t Reason (arXiv) and was amazed by the clever scoping of the problem statement,…
-
Another step towards convergence
It strikes me that history says the opposite. ChatGPT represents another stride in the broader trend of convergence, the same momentum historically embodied by agriculture, industrialization, and the internet. This convergence is characterized by a gradual reduction in uncertainties that, while reducing risks and damage, also diminishes the “interestingness” of human experience. Essentially,…
-
DL weekly study notes: DNN’s Initialization and GPT4 Can’t Reason
Published on Mar 17th, 2024. Notes: Notes on GPT-4 Can’t Reason In my view, the most compelling a priori considerations against the plausibility of reliably robust LLM reasoning turn on computational complexity results. Reasoning is a (very) computationally hard problem. In fact, in the general case (first-order or higher-order logic), it is algorithmically undecidable, i.e.,…
-
DL weekly study note: Bengio’s AI scientist, Bayesian or Frequentist, and more
Bengio’s AI scientist talk: Why greatness can’t be planned: https://youtu.be/lhYGXYeMq_E?si=EaFHRz0d1MAqQ68G Bayesian Statistics Bayesian or Frequentist, Which Are You? Basics
-
DL weekly study notes: DL engineering and Bengio’s AI scientist
lec3 model reproduction and experiments: https://github.com/gangfang/makemore/blob/main/makemore_mlp.ipynb Implementation notes: High-level:
-
Experiments I ran on ChatGPT
Here is a collection of experiments I ran on ChatGPT. These are more exploratory; the aim is to get the chatbot to produce results that reveal more of its nature. 1. Can ChatGPT produce texts that don’t display regularities of the English language? Regularities here refer to anything that gives sense to a text, like…
-
[IN PROGRESS] Software 2.0 and AI in charge of the whole stack to the bits
With Software 2.0, we say that the broader the scope of responsibility DL takes on, the better the end results. Using this as a prior, should we train a model to produce bits that a CPU can execute directly, instead of code in a modern programming language? What is the point of producing code if…