[HN Gopher] Convolutional Neural Networks for Visual Recognition
___________________________________________________________________
Convolutional Neural Networks for Visual Recognition
Author : yu3zhou4
Score : 51 points
Date : 2024-05-19 20:04 UTC (2 hours ago)
(HTM) web link (cs231n.github.io)
(TXT) w3m dump (cs231n.github.io)
| ks2048 wrote:
| One page says "CS231n: Deep Learning for Computer Vision" and
| another says "CS231n: Convolutional Neural Networks for Visual
| Recognition". Did they rename it recently to acknowledge other
| methods such as ViT?
| danjl wrote:
| CNNs are certainly still worth learning. It's still unclear
| whether ViTs are better, and there's enough material for a full
| course on CNNs and a separate course on vision transformers.
| fzliu wrote:
| Agreed. ViTs are better if you're looking to go multimodal or
| use attention-specific mechanisms such as cross-attention. If
| not, there's evidence that ViTs are no better than convnets,
| both for small networks and at scale
| (https://frankzliu.com/blog/vision-transformers-are-overrated).
___________________________________________________________________
(page generated 2024-05-19 23:00 UTC)