Category: DL skill training
-
DL weekly study notes: DNN’s Initialization and GPT4 Can’t Reason
Published on Mar 17th, 2024. Notes: Notes on GPT-4 Can’t Reason: In my view, the most compelling a priori considerations against the plausibility of reliably robust LLM reasoning turn on computational complexity results. Reasoning is a (very) computationally hard problem. In fact, in the general case (first-order or higher-order logic), it is algorithmically undecidable, i.e.,…
-
DL weekly study notes: DL engineering and Bengio’s AI scientist
Lecture 3 model reproduction and experiments: https://github.com/gangfang/makemore/blob/main/makemore_mlp.ipynb Implementation notes: High-level:
-
DL implementation study – Feb 16, 2024
makemore repo with the completed bigram model (lecture 2): https://github.com/gangfang/makemore Notes: Questions:
-
DL implementation study – Jan 5, 2024
Building makemore before part 2: # Doing the same thing but with an ANN # imagine what the model looks like before watching the video: # input is a char, output is the next char # one-hot encoding to make each output neuron a binary classifier # softmax to get the probability distribution # N is not…
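A minimal sketch of the model imagined in the excerpt above, not the notebook's actual code: a single linear layer that takes a one-hot encoded character and returns a softmax distribution over the next character. The 27-symbol vocabulary (with "." as a boundary token), the random weights `W`, and the helper names are all assumptions made for illustration, and plain Python stands in for PyTorch so the snippet runs without dependencies.

```python
import math
import random


def one_hot(index, size):
    """Return a list of `size` floats with a 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_char_probs(ch, stoi, W):
    """One-hot encode `ch`, apply the linear layer W, then softmax."""
    x = one_hot(stoi[ch], len(stoi))
    n_out = len(W[0])
    # logits[j] = sum_i x[i] * W[i][j]; with a one-hot x this selects one row of W
    logits = [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(n_out)]
    return softmax(logits)


# Hypothetical vocabulary: "." as a start/end marker plus lowercase letters.
chars = ".abcdefghijklmnopqrstuvwxyz"
stoi = {c: i for i, c in enumerate(chars)}

# Untrained weights; training would adjust these to fit next-character counts.
random.seed(0)
W = [[random.gauss(0.0, 1.0) for _ in chars] for _ in chars]

probs = next_char_probs("a", stoi, W)
```

Because the input is one-hot, the matrix product just picks out one row of `W`, so each row ends up holding the logits for "what follows this character".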
-
DL study – Jan 25, 2024
How much truth is there to the scaling law I have been hearing about? Looking at it from the opposite angle, I think the question is: how much does neural architecture matter? Or are all those invariants or biases that different architectures possess simply a “shortcut” that facilitates more complex problem solving with…
-
DL implementation study – Jan 24, 2024
Completed micrograd exercise: https://github.com/gangfang/micrograd/blob/main/micrograd_exercises.ipynb
-
DL theory study – Jan 20, 2024
Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville: I have been reading the Introduction of the book, and here are some notes: