Notes
- What can be turned into a torch.tensor?
- a Python list or other sequence (e.g. a NumPy array)
- alternatively, for the tensor-creation functions a shape (a list or tuple of dimension sizes) can be passed in, and a tensor of that shape is constructed. Like this (a small sketch follows this list):
- torch.zeros([2, 4], dtype=torch.int32) or torch.randn((2, 2, 2, 2, 2, 2, 2, 2, 2, 2))
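A minimal sketch of the two creation paths (data vs. shape); the variable names are just for illustration:

```python
import torch

# From existing data: torch.tensor copies a Python list (or nested lists)
a = torch.tensor([[1, 2], [3, 4]])
print(a.shape)  # torch.Size([2, 2])

# From a shape: creation functions take the desired sizes, not data
b = torch.zeros([2, 4], dtype=torch.int32)
c = torch.randn((2, 2, 2, 2, 2, 2, 2, 2, 2, 2))
print(b.shape)  # torch.Size([2, 4])
print(c.shape)  # torch.Size([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
```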
- in the Bengio et al. 2003 NLP paper, why is an embedding used in the input layer for those 17,000 words? – AN: one word, efficiency. The traditional one-hot encoded word representation is very sparse and inefficient and suffers from the curse of dimensionality. Because the embedding is compact, it enables processing of larger vocabularies.
- Here, “dimension” refers to the length of a one-hot vector, which is the number of words in the vocabulary. It doesn’t refer to the dimensionality of the training set, which is always 2D.
- what exactly is the curse of dimensionality? – AN: many ML problems become exceedingly difficult as the number of variables (aka dimensions) increases, because the data space grows with every added dimension and many more examples are needed to fill it adequately.
- Two different interpretations of the embedding layer:
- w/o one-hot encoding: simply a lookup, where the embedding row is retrieved from the matrix C at a given index
- w/ one-hot encoding: a linear layer in the neural net where C is the weight matrix. The one-hot encoded char, which looks like [0, 0, 0, 1, 0, 0, …], is matrix-multiplied by C (i.e., a linear operation) to get the output of this layer. (Both views are sketched below.)
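A minimal sketch showing that the two interpretations give the same result; the vocabulary size and embedding width are illustrative, not the lecture’s exact values:

```python
import torch
import torch.nn.functional as F

vocab_size, emb_dim = 27, 2            # illustrative sizes
C = torch.randn(vocab_size, emb_dim)   # embedding matrix

idx = torch.tensor(5)                  # index of one token

# View 1: plain lookup – retrieve row 5 of C
lookup = C[idx]

# View 2: one-hot vector matrix-multiplied by the weight matrix C (a linear layer)
one_hot = F.one_hot(idx, num_classes=vocab_size).float()
linear = one_hot @ C

print(torch.allclose(lookup, linear))  # True – both views produce the same embedding
```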
Qs:
- what would happen if I unbind the last dimension with torch.unbind(emb, 2)? Do I get torch.Size([32, 3]) or torch.Size([32, 3, 1])? My guess is the second. – The correct answer is torch.Size([32, 3]), because torch.unbind REMOVES the specified dimension and returns a tuple of slices along it (checked in the sketch below).
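A quick check of the unbind behaviour; the shape [32, 3, 2] is an assumption for emb (batch of 32, block size 3, embedding width 2):

```python
import torch

emb = torch.randn(32, 3, 2)      # assumed shape, for illustration only

pieces = torch.unbind(emb, 2)    # split along the last dimension

print(len(pieces))       # 2 – one tensor per slice along dim 2
print(pieces[0].shape)   # torch.Size([32, 3]) – dim 2 is removed, not kept as size 1
```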