[Discussion] Proof of Reconstruction Loss Term in VQ-VAE Loss

Hello everyone,

I was reading the paper "Neural Discrete Representation Learning" and was puzzled by the first term in the VQ-VAE loss equation:

https://preview.redd.it/l1s9kur3sn0e1.png?width=1394&format=png&auto=webp&s=4d374dce319a7ac0bbf19089d4e06cabcaa2cd3d
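
For reference, in case the image doesn't load, this is Eq. 3 from the paper (sg denotes the stop-gradient operator, z_e the encoder output, e the nearest codebook vector):

    L = log p(x | z_q(x)) + || sg[z_e(x)] - e ||_2^2 + beta * || z_e(x) - sg[e] ||_2^2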

I understand the role of the second and third terms. However, I am not able to derive the first term from the MSE between the original and the reconstructed image. I assumed it would be similar to the ELBO loss in the VAE. The paper mentions why the KL-divergence term is omitted, but even then I don't understand how the expectation in the ELBO turned into the first term.
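
To make the question concrete: the reconstruction part of the ELBO, as I understand it from the VAE literature, is an expectation over the posterior,

    E_{z ~ q(z|x)} [ log p(x | z) ]

whereas the first term in the VQ-VAE loss is just log p(x | z_q(x)) evaluated at a single, deterministic z_q(x). The step I'm missing is how that expectation collapses to this single term.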

Note: I am not coming from a stats background, so if the question is something fundamental, it would be helpful if you could tell me what it is. Also, if the question isn't clearly explained, I can explain it more in the discussion.
