Category: DL skill training
-
Weekly DL study note: GFlowNet Code Tutorial (completed)
Completed code: https://github.com/gangfang/littlegfn/blob/main/face_generator.ipynb
Pre-requisites
Flow Networks (Nothing about training yet)
IMPORTANT – MAIN IDEA OF GFN
The main idea behind GFlowNet is to interpret the DAG as a flow network, and to think of each edge as a pipe through which some amount of water, or particles, flows. We then want to find a flow where (a) flow is…
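As a quick illustration of the flow-preservation idea in the excerpt above, here is a minimal toy sketch (my own example, not code from the linked notebook; all state names and flow values are made up) that assigns flows to the edges of a tiny DAG and checks that inflow equals outflow at intermediate states and equals the reward at the terminal state:

# Toy DAG: s0 -> {s1, s2} -> s3 (terminal). Edge flows are hand-picked so
# that flow is conserved; names and numbers are illustrative only.
edge_flow = {
    ("s0", "s1"): 2.0,
    ("s0", "s2"): 3.0,
    ("s1", "s3"): 2.0,
    ("s2", "s3"): 3.0,
}
reward = {"s3": 5.0}  # the terminal state absorbs all incoming flow as reward

def inflow(state):
    return sum(f for (u, v), f in edge_flow.items() if v == state)

def outflow(state):
    return sum(f for (u, v), f in edge_flow.items() if u == state)

for s in ["s1", "s2"]:            # intermediate states: inflow == outflow
    assert abs(inflow(s) - outflow(s)) < 1e-8
for s, r in reward.items():       # terminal states: inflow == reward
    assert abs(inflow(s) - r) < 1e-8
print("flow is preserved on this toy DAG")

In an actual GFlowNet, a neural network predicts these edge flows, and training objectives such as the flow-matching loss penalize violations of exactly this conservation condition.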
-
Weekly study note: GFlowNet Code Tutorial 2022
Flow Networks (Nothing about training yet)
IMPORTANT – MAIN IDEA OF GFN
The main idea behind GFlowNet is to interpret the DAG as a flow network, and to think of each edge as a pipe through which some amount of water, or particles, flows. We then want to find a flow where (a) flow is preserved, (b) the flow…
-
Experiments with a 2M-param transformer
Jupyter notebook: https://github.com/gangfang/nanogpt/blob/main/gpt_dev.ipynb
-
Weekly DL study notes: building GPT from scratch, part 2
Parallelization: By removing sequential dependencies, Transformers could be trained much more efficiently on parallel hardware like GPUs.
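A minimal sketch of that point (my own illustration with assumed shapes, not taken from the notebook): an RNN has to step through the sequence one position at a time, while self-attention scores all positions against each other with a single batched matrix multiply:

import torch

B, T, C = 4, 8, 32                      # batch, sequence length, channels
x = torch.randn(B, T, C)

# RNN-style: an explicit Python loop, step t depends on step t-1
h = torch.zeros(B, C)
Wx, Wh = torch.randn(C, C), torch.randn(C, C)
for t in range(T):
    h = torch.tanh(x[:, t] @ Wx + h @ Wh)   # sequential dependency

# Attention-style: all T positions attended to at once, no time-step loop
q, k, v = x, x, x                            # untrained projections omitted
att = (q @ k.transpose(-2, -1)) / C**0.5     # (B, T, T) scores in one matmul
att = att.masked_fill(torch.tril(torch.ones(T, T)) == 0, float("-inf"))
att = att.softmax(dim=-1)
out = att @ v                                # (B, T, C)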
-
Weekly DL study notes: building GPT from scratch
Updated on Jun 14, 2024
Residual Networks (ResNets)
Attention
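The two topics listed above meet in the standard transformer block, where each sub-layer sits inside a residual connection. A minimal sketch with assumed layer sizes (my own simplified block, not the notebook's code):

import torch
import torch.nn as nn

class Block(nn.Module):
    """Pre-norm transformer block: residual connections around attention and MLP."""
    def __init__(self, n_embd=64, n_head=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(nn.Linear(n_embd, 4 * n_embd), nn.GELU(),
                                 nn.Linear(4 * n_embd, n_embd))

    def forward(self, x):
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, need_weights=False)
        x = x + a                        # residual: gradients flow straight through the add
        x = x + self.mlp(self.ln2(x))    # second residual around the MLP
        return x

x = torch.randn(2, 16, 64)               # (batch, sequence, embedding)
print(Block()(x).shape)                   # torch.Size([2, 16, 64])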
-
DL weekly study notes: building a wavenet
Code reproduction of Andrej Karpathy’s “building makemore part 5” lecture: https://github.com/gangfang/makemore/blob/main/makemore_part5.ipynb Study notes:
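A minimal sketch of the core idea from that lecture as I understand it: instead of crushing the whole character context in one step, consecutive time steps are fused in pairs, layer by layer, WaveNet-style. The module below is my own simplified version, not a copy of the reproduced notebook:

import torch
import torch.nn as nn

class FlattenConsecutive(nn.Module):
    """Merge groups of n consecutive time steps into the channel dimension."""
    def __init__(self, n):
        super().__init__()
        self.n = n

    def forward(self, x):                 # x: (B, T, C)
        B, T, C = x.shape
        return x.view(B, T // self.n, C * self.n)

# Two characters are fused per layer, so an 8-character context gets combined
# gradually (8 -> 4 -> 2 -> 1) rather than all at once.
x = torch.randn(32, 8, 10)                # (batch, block_size=8, emb_dim=10)
h = FlattenConsecutive(2)(x)
print(h.shape)                            # torch.Size([32, 4, 20])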
-
DL weekly study notes: manual backprop i.e., w/o loss.backward()
Code reproduction of Andrej Karpathy’s “building makemore part 4” lecture: https://github.com/gangfang/makemore/blob/main/makemore_part4_manual_backprop.ipynb
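To make the title concrete, a small self-contained sketch (my own example, not taken from the notebook) that derives the gradient of a linear layer under an MSE loss by hand and checks it against what loss.backward() produces:

import torch

torch.manual_seed(0)
x = torch.randn(4, 3)
W = torch.randn(3, 2, requires_grad=True)
y_true = torch.randn(4, 2)

# forward
y = x @ W
loss = ((y - y_true) ** 2).mean()

# manual backprop: dloss/dy, then the chain rule through the matmul
dy = 2 * (y - y_true) / y.numel()     # gradient of the mean-squared error w.r.t. y
dW_manual = x.T @ dy                  # gradient w.r.t. W

# autograd reference
loss.backward()
print(torch.allclose(dW_manual, W.grad))   # True if the manual gradient matches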
-
DL weekly study notes: batch normalization, PyTorch’s APIs and distribution visualization
Updated on May 3. Code reproduction of Andrej Karpathy’s “building makemore part 3” lecture can be found at https://github.com/gangfang/makemore/blob/main/makemore_part3.ipynb These weekly notes include some older notes, written for Andrej’s “Building makemore Part 3: Activations, Gradients & BatchNorm” YouTube video. Principles: We want stable gradients (neither exploding nor vanishing) through the non-linearity throughout the…
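As a small illustration of the batch-normalization part (a sketch with assumed shapes; it does not reproduce the distribution visualizations from the notes), manually normalizing each feature over the batch matches PyTorch’s nn.BatchNorm1d in training mode with default affine parameters:

import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(64, 100) * 5 + 3          # pre-activations with a bad scale/offset

# manual batchnorm (training mode): normalize each feature over the batch
mean = x.mean(dim=0, keepdim=True)
var = x.var(dim=0, unbiased=False, keepdim=True)
x_manual = (x - mean) / torch.sqrt(var + 1e-5)

# PyTorch's API with default affine params (gamma=1, beta=0)
bn = nn.BatchNorm1d(100)
bn.train()
x_torch = bn(x)

print(torch.allclose(x_manual, x_torch, atol=1e-5))   # True
print(x_manual.mean().item(), x_manual.std().item())  # ~0 mean, ~1 std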