Johns Hohns
[Transformer] Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet (ICCV 2021)