-
Autoencoder, contrarian thought on AGI and GenAI’s impacts on Software engineering, Interpretability
Autoencoder: Contrarian thought on AGI. It’s possible that AGI will never arrive. AI practitioners complain that benchmarks get pushed further and further whenever they make progress. This situation makes sense when viewed through the lens of a Bayesian framework: the prior has changed, which leads to changes in expectations. Inferring from this, I speculate that it is…
-
Weekly DL study notes: building GPT from scratch, part 2
Parallelization: By removing sequential dependencies, Transformers could be trained much more efficiently on parallel hardware like GPUs.
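As a rough illustration of the point above (a minimal sketch, not code from the post): the hypothetical `rnn_states` below must run step by step because each state feeds the next, while the hypothetical `attention_at` computes a causal self-attention output for one position from the raw inputs alone, so positions can be evaluated in any order, or all at once on parallel hardware. Scalar tokens stand in for query, key, and value vectors, and the weights `w_h` and `w_x` are made-up values.

```python
import math

def rnn_states(xs, w_h=0.5, w_x=1.0):
    """Sequential recurrence: h_t depends on h_{t-1}, so the loop cannot be reordered."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w_h * h + w_x * x)
        states.append(h)
    return states

def attention_at(xs, t):
    """Causal self-attention output at position t (toy version: each scalar
    token serves as its own query, key, and value).
    It reads only the inputs xs[0..t], never another position's output."""
    scores = [xs[t] * xs[j] for j in range(t + 1)]
    m = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return sum(w * xs[j] for j, w in enumerate(weights)) / total

xs = [0.1, -0.4, 0.7, 0.2]

# Evaluate positions front-to-back, then back-to-front: same results,
# because no position waits on another position's output.
forward = [attention_at(xs, t) for t in range(len(xs))]
backward = [attention_at(xs, t) for t in reversed(range(len(xs)))][::-1]
```

Because the order of evaluation does not matter for the attention outputs, a real implementation can batch all positions into a few matrix multiplications, which is what makes training on GPUs efficient.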
-
Belly of the beats and Just write me something honest
It is incredibly difficult to be honest. Even with this writing, I am tempted to say stuff that shows off my knowledge, which is why I am writing in Obsidian and not on my blog until I want to publish. So yesterday I went to the belly of the beats battle. I wasn’t physically in the…
-
Weekly DL study notes: building GPT from scratch
Updated on Jun 14, 2024 Residual Networks (ResNets) Attention
-
DL weekly study notes: building a wavenet
Code reproduction of Andrej Karpathy’s “building makemore part 5” lecture: https://github.com/gangfang/makemore/blob/main/makemore_part5.ipynb Study notes:
-
DL weekly study notes: manual backprop i.e., w/o loss.backward()
Code reproduction of Andrej Karpathy’s “building makemore part 4” lecture: https://github.com/gangfang/makemore/blob/main/makemore_part4_manual_backprop.ipynb
-
DL weekly study notes: batch normalization, PyTorch’s APIs and distribution visualization
Updated on May 3. Code reproduction of Andrej Karpathy’s “building makemore part 3” lecture can be found at https://github.com/gangfang/makemore/blob/main/makemore_part3.ipynb These weekly notes include some older notes, which were written for Andrej’s Building makemore Part 3: Activations, Gradients & BatchNorm YT video. Principles: We want stable gradients (neither exploding nor vanishing) of non-linearity throughout the…
-
Reproducing Paper “GPT-4 Can’t Reason”
(updated on Apr 22) Introduction The higher-level cognitive abilities of ChatGPT have always been fascinating to me. This topic has sparked numerous debates since OpenAI’s launch, but most comments are one-sided. Recently I came across Konstantine Arkoudas’s pre-print paper GPT-4 Can’t Reason (arxiv) and was amazed by the clever scoping of the problem statement,…