Gang Fang's Blog

Weekly DL study notes: building GPT from scratch, part 2

Parallelization: By removing sequential dependencies, Transformers could be trained much more efficiently on parallel hardware like GPUs.

gfang1212

June 24, 2024

AI, DL skill training

Weekly DL study notes: Residual Networks

Residual Networks (ResNets)

gfang1212

June 20, 2024

AI, DL skill training

Belly of the beats and Just write me something honest

It is incredibly difficult to be honest. Even with this writing, I am tempted to say stuff that shows off my knowledge, which is why I am writing in Obsidian and not on my blog until I want to publish. So yesterday I went to the belly of the beats battle; I wasn’t physically in the…

gfang1212

June 9, 2024

Art, Dancing

Weekly DL study notes: building GPT from scratch

Updated on Jun 14, 2024. Residual Networks (ResNets). Attention.

gfang1212

June 3, 2024

DL skill training

DL weekly study notes: building a wavenet

Code reproduction of Andrej Karpathy’s “building makemore part 5” lecture: https://github.com/gangfang/makemore/blob/main/makemore_part5.ipynb Study notes:

gfang1212

May 26, 2024

DL skill training

DL weekly study notes: manual backprop i.e., w/o loss.backward()

Code reproduction of Andrej Karpathy’s “building makemore part 4” lecture: https://github.com/gangfang/makemore/blob/main/makemore_part4_manual_backprop.ipynb

gfang1212

May 16, 2024

DL skill training

DL weekly study notes: batch normalization, PyTorch’s APIs and distribution visualization

Updated on May 3. Code reproduction of Andrej Karpathy’s “building makemore part 3” lecture can be found at https://github.com/gangfang/makemore/blob/main/makemore_part3.ipynb These weekly notes include some older notes written for Andrej’s “Building makemore Part 3: Activations, Gradients & BatchNorm” YT video. Principles: We want stable gradients (neither exploding nor vanishing) of non-linearity throughout the…

gfang1212

April 29, 2024

DL skill training

DL weekly study notes: initialization and normalization

gfang1212

April 22, 2024

DL skill training

Reproducing Paper “GPT-4 Can’t Reason”

(updated on Apr 22) Introduction: The higher-level cognitive abilities of ChatGPT have always been fascinating to me. This topic has sparked numerous debates since OpenAI’s launch, but most comments are one-sided. Recently I came across Konstantine Arkoudas’s pre-print paper GPT-4 Can’t Reason (arxiv) and was amazed by the clever scoping of the problem statement,…

gfang1212

March 27, 2024

AI, System 2 AI

Another step towards convergence

It strikes me that history says the opposite. ChatGPT represents another stride toward the broader trend of convergence, aligning with how the internet, industrialization, and agriculture have historically embodied the same momentum. This convergence is characterized by a gradual reduction in uncertainties that, while reducing risks and damages, also diminishes “interestingness” in human experiences. Essentially,…

gfang1212

March 27, 2024

AI