Building makemore bigram
- A lot of the work is essentially transformation of representation. For example, transforming the bigram stats represented in counter (i.e., defaultdict) into a PyTorch tensor representation
- N-dimensional data can be displayed in 2D with N nested “[” like “[[[[” at the beginning of the data. I imagine it can be displayed in 3D too, but anything over 3D is scribbles
- “Sampling from the model” means to have a model which learned the stats in the data and represents the distribution of the data, then to generate new examples based on the distribution represented by the model.
Leave a comment