Building micrograd
Notes
- [DONE] The intuition behind derivative of a multivariate function
- A NN can be represented purely mathematically or as a graph data structure in a programming language. Remember, at the end of the day, they are just languages and we can use any language to represent or describe the construct, in this case a NN, we have in our head.

Questions
Why are gradient descent or its variants used to optimize parameters instead of solving the loss function analytically for its minima?
AN: complex loss functions aren’t easily solvable analytically

Leave a comment