DL implementation study – Jan 21, 2024

Notes

  • What can be turned into a torch.tensor?
    • a py list or a sequence
    • a shape can also be passed to a tensor-creation function such as torch.zeros or torch.randn, which constructs a tensor of those dimensions. Like this:
    • torch.zeros([2, 4], dtype=torch.int32) or torch.randn((2, 2, 2, 2, 2, 2, 2, 2, 2, 2)) – see the first sketch after these notes
  • in the Bengio 2003 NLP paper, why is an embedding used in the input layer for those 17,000 words? – AN: in one word, efficiency. The traditional one-hot word representation is very sparse and inefficient, and it suffers from the curse of dimensionality. Because the embedding is compact, it makes larger vocabularies feasible to process.
    • When we call this a curse of dimensionality, “dimension” refers to the length of a one-hot vector, which is the number of words in the vocabulary. It does not refer to the dimensionality of the training set, which is always a 2D matrix.
  • what exactly is the curse of dimensionality? – AN: many ML problems become exceedingly difficult as the number of variables (aka dimensions) increases, because the data space grows exponentially with the number of dimensions, so ever more examples are needed to fill/explain it adequately.
  • Two different interpretations of the embedding layer:
    • w/o one-hot encoding: simply a matrix-value lookup, where the embedding is retrieved as a row of the matrix C given an index
    • w/ one-hot encoding: a linear layer in the ANN where C is the weight matrix. The one-hot encoded char, which looks like [0, 0, 0, 1, 0, 0, …], is matrix-multiplied by C (i.e., a linear operation) to get the output of this layer. Both views are compared in the second sketch below.
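
A quick sketch of the two tensor-creation paths noted above; the shapes and dtype are just illustrative:

```python
import torch

# Path 1: from existing data – a Python list (or other sequence) is copied into a tensor.
t1 = torch.tensor([[1, 2], [3, 4]])
print(t1.shape)  # torch.Size([2, 2])

# Path 2: from a shape request – creation functions take the desired
# dimensions and build a tensor of that shape.
t2 = torch.zeros([2, 4], dtype=torch.int32)  # 2x4 tensor of int32 zeros
t3 = torch.randn((2, 2, 2))                  # 3-D tensor of standard-normal samples
print(t2.shape, t3.shape)  # torch.Size([2, 4]) torch.Size([2, 2, 2])
```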
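
And a minimal sketch of the two embedding interpretations; the vocabulary size and embedding width are made-up numbers here, and C is just a random matrix:

```python
import torch
import torch.nn.functional as F

vocab_size, emb_dim = 27, 2           # made-up sizes, for illustration only
C = torch.randn(vocab_size, emb_dim)  # the embedding matrix

idx = torch.tensor([5])               # index of one char/word

# Interpretation 1 (no one-hot): a plain lookup – retrieve row 5 of C.
e1 = C[idx]

# Interpretation 2 (with one-hot): a linear layer whose weight matrix is C.
one_hot = F.one_hot(idx, num_classes=vocab_size).float()
e2 = one_hot @ C

print(torch.allclose(e1, e2))  # True: both views yield the same embedding
```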

Qs:

  • what would happen if I unbind the last dimension, torch.unbind(emb, 2)? Do the resulting tensors have shape torch.Size([32, 3]) or torch.Size([32, 3, 1])? My guess is the second. – The correct answer is torch.Size([32, 3]), because torch.unbind REMOVES the specified dimension; see the sketch below.
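
A quick check of this; emb is given shape [32, 3, 2] here, where the size of the last dimension is my assumption and doesn't affect the answer:

```python
import torch

emb = torch.randn(32, 3, 2)  # assumed shape; only the first two dims matter here

# unbind returns a tuple of slices along the given dimension,
# and that dimension is removed from each slice.
pieces = torch.unbind(emb, 2)

print(len(pieces))      # 2 – one slice per entry along dim 2
print(pieces[0].shape)  # torch.Size([32, 3]) – dim 2 is gone, not kept as size 1
```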

