[HN Gopher] Convolutional Neural Networks for Visual Recognition
       ___________________________________________________________________
        
       Convolutional Neural Networks for Visual Recognition
        
       Author : yu3zhou4
       Score  : 51 points
       Date   : 2024-05-19 20:04 UTC (2 hours ago)
        
 (HTM) web link (cs231n.github.io)
 (TXT) w3m dump (cs231n.github.io)
        
       | ks2048 wrote:
       | One page says "CS231n: Deep Learning for Computer Vision" and
       | another says "CS231n: Convolutional Neural Networks for Visual
       | Recognition". Did they change it recently to recognize other
       | methods (ViT)?
        
         | danjl wrote:
         | Certainly still worth learning CNNs. Still unclear if ViT is
         | better. And there's certainly enough for a full course on CNNs
         | and a separate course on vision transformers.
        
           | fzliu wrote:
           | Agreed. ViTs are better if you're looking to go multimodal or
           | use attention-specific mechanisms such as cross-attention. If
           | not, there's evidence out there that ViTs are not better than
           | convnets, either for small networks or at scale
           | (https://frankzliu.com/blog/vision-transformers-are-overrated).
        
       ___________________________________________________________________
       (page generated 2024-05-19 23:00 UTC)