Category: AI
-
DL weekly study notes: batch normalization, PyTorch’s APIs and distribution visualization
Updated on May 3. A code reproduction of Andrej Karpathy’s “building makemore part 3” lecture can be found at https://github.com/gangfang/makemore/blob/main/makemore_part3.ipynb These weekly notes include some older notes, written for Andrej’s Building makemore Part 3: Activations, Gradients & BatchNorm YouTube video. Principles: We want stable gradients (neither exploding nor vanishing) of non-linearity throughout the…
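The principle this post explores — keeping activation statistics stable so gradients neither explode nor vanish — can be sketched with PyTorch’s `BatchNorm1d`. This is a minimal illustration, not code from the linked notebook: the layer sizes and the deliberately bad weight scale are hypothetical choices made for the demo.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A batch of 256 inputs with 100 features, fed through a linear layer whose
# weights are deliberately scaled up to mimic a bad initialization.
x = torch.randn(256, 100)
linear = nn.Linear(100, 200)
with torch.no_grad():
    linear.weight *= 5.0  # inflate the pre-activation spread on purpose

pre = linear(x)

# BatchNorm1d (in training mode) normalizes each feature over the batch,
# pulling the pre-activations back to roughly zero mean and unit variance.
bn = nn.BatchNorm1d(200)
post = bn(pre)

print(f"before BN: mean={pre.mean():.2f}, std={pre.std():.2f}")
print(f"after  BN: mean={post.mean():.2f}, std={post.std():.2f}")
```

Feeding `post` (rather than `pre`) into a tanh or similar non-linearity keeps it out of its saturated regions, which is what keeps gradients flowing through deep stacks.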
-
Reproducing Paper “GPT-4 Can’t Reason”
(updated on Apr 22) Introduction The higher-level cognitive abilities of ChatGPT have always been fascinating to me. This topic has sparked numerous debates since OpenAI’s launch, but most comments are one-sided. Recently I came across Konstantine Arkoudas’s pre-print paper GPT-4 Can’t Reason (arXiv) and was amazed by the clever scoping of the problem statement,…
-
Another step towards convergence
It strikes me that history says the opposite. ChatGPT represents another stride in the broader trend of convergence, the same momentum historically embodied by agriculture, industrialization, and the internet. This convergence is characterized by a gradual reduction in uncertainties that, while reducing risks and damage, also diminishes the “interestingness” of human experience. Essentially,…
-
DL weekly study notes: DNN’s Initialization and GPT4 Can’t Reason
Published on Mar 17th, 2024. Notes: Notes on GPT-4 Can’t Reason In my view, the most compelling a priori considerations against the plausibility of reliably robust LLM reasoning turn on computational complexity results. Reasoning is a (very) computationally hard problem. In fact, in the general case (first-order or higher-order logic), it is algorithmically undecidable, i.e.,…
-
DL weekly study note: Bengio’s AI scientist, Bayesian or Frequentist, and more
Bengio’s AI scientist talk: Why greatness can’t be planned: https://youtu.be/lhYGXYeMq_E?si=EaFHRz0d1MAqQ68G Bayesian Statistics Bayesian or Frequentist, Which Are You? Basics
-
DL weekly study notes: DL engineering and Bengio’s AI scientist
lec3 model reproduction and experiments: https://github.com/gangfang/makemore/blob/main/makemore_mlp.ipynb Implementation notes: High-level:
-
Experiments I ran on ChatGPT
Here is a collection of experiments I ran on ChatGPT. These are more exploratory; the aim is to get the chatbot to produce results that reveal more of its nature. 1. Can ChatGPT produce texts that don’t display regularities of the English language? Regularities here refer to anything that gives sense to a text, like…
-
[IN PROGRESS] Software 2.0 and AI in charge of the whole stack to the bits
With Software 2.0, we say that the broader the scope of responsibility DL takes on, the better the end results. Using this as a prior, should we train a model to produce bits that a CPU can execute directly, instead of code in a modern programming language? What is the point of producing code if…