How I Finally Understood Self-Attention (With PyTorch)