[HN Gopher] Scaling Down Deep Learning
       ___________________________________________________________________
        
       Scaling Down Deep Learning
        
       Author : caprock
       Score  : 59 points
       Date   : 2022-12-22 19:04 UTC (3 hours ago)
        
 (HTM) web link (greydanus.github.io)
 (TXT) w3m dump (greydanus.github.io)
        
       | jszymborski wrote:
       | I think this is a wonderful dataset for teaching and will
       | certainly try to include it in the assignments I write.
       | 
       | Other datasets I tried:
       | 
       | Fashion MNIST has a lot of mislabeled examples, and it is
       | also pretty trivial to separate with UMAP alone (see the
       | sketch below).
       | 
       | Google's QuickDraw is better than MNIST, though I haven't
       | tested it against, e.g., logistic regression.
       | 
       | Of course there is CIFAR, but those images hardly look like
       | images; I can't classify half of them myself.
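       | 
       | For the UMAP point, roughly this (a minimal sketch assuming
       | umap-learn and scikit-learn are installed; the OpenML
       | dataset name is my guess at how to fetch Fashion MNIST):
       | 
       |   import umap
       |   from sklearn.datasets import fetch_openml
       |   from sklearn.linear_model import LogisticRegression
       |   from sklearn.model_selection import train_test_split
       | 
       |   X, y = fetch_openml("Fashion-MNIST", version=1,
       |                       return_X_y=True, as_frame=False)
       |   X, y = X[:10000] / 255.0, y[:10000]  # subsample for speed
       | 
       |   # Unsupervised 2-D embedding: labels play no role here.
       |   reducer = umap.UMAP(n_components=2, random_state=0)
       |   emb = reducer.fit_transform(X)
       | 
       |   # A linear classifier on the 2-D embedding is a proxy for
       |   # "trivially separable".
       |   e_tr, e_te, y_tr, y_te = train_test_split(
       |       emb, y, random_state=0)
       |   clf = LogisticRegression(max_iter=1000).fit(e_tr, y_tr)
       |   print("accuracy on 2-D embedding:", clf.score(e_te, y_te))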
        
       | yorwba wrote:
       | Previous discussion (2020):
       | https://news.ycombinator.com/item?id=25314066
        
       | unixlikeposting wrote:
       | >The ideal toy dataset should be procedurally generated so that
       | researchers can smoothly vary parameters such as background
       | noise, translation, and resolution.
       | 
       | It's been a minute since I last touched ML, but that seems like a
       | fairly extreme claim. Am I wrong in thinking this?
        
         | pkghost wrote:
         | What's extreme about it? I'm new to ML, but this seems
         | great from a testing and verification perspective. I
         | actually feel like Christmas came early here: I'm eager to
         | explore novel model architectures, and a small, easily
         | manipulable dataset to experiment with seems perfect for
         | that.
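         | 
         | A minimal sketch of the kind of generator the article is
         | asking for - the class templates and knobs here are my own
         | toy invention, not the post's actual code:
         | 
         |   import numpy as np
         | 
         |   def make_example(template, rng, max_shift=2, noise=0.1,
         |                    size=16):
         |       """Render one sample from a class template."""
         |       canvas = np.zeros((size, size))
         |       h, w = template.shape
         |       # Tunable translation: one knob researchers can vary.
         |       dy = rng.integers(-max_shift, max_shift + 1)
         |       dx = rng.integers(-max_shift, max_shift + 1)
         |       y0, x0 = (size - h) // 2 + dy, (size - w) // 2 + dx
         |       canvas[y0:y0 + h, x0:x0 + w] = template
         |       # Tunable background noise: another knob.
         |       canvas += noise * rng.standard_normal(canvas.shape)
         |       return canvas
         | 
         |   rng = np.random.default_rng(0)
         |   templates = {0: np.eye(8), 1: np.fliplr(np.eye(8))}
         |   X = np.stack([make_example(templates[i % 2], rng)
         |                 for i in range(1000)])
         |   y = np.arange(1000) % 2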
        
       | stared wrote:
       | Well, MNIST is a trivial dataset: many simple methods work
       | astonishingly well on it. For example, it takes some tuning
       | before a neural network beats k-Nearest Neighbors, and t-SNE
       | alone clusters the digits in an unsupervised way, without
       | any preprocessing (a quick sketch of both is below).
       | 
       | Fashion MNIST is no better - it shares the same issues as
       | MNIST. At the very least, use notMNIST (letters A-J from
       | various fonts). Instead, I wholeheartedly recommend Google
       | Quickdraw: hand-drawn doodles, with more samples, more
       | engaging and more diverse classes, and images of the same
       | size as MNIST.
       | 
       | See an example of using it for someone's first neural
       | network:
       | https://github.com/stared/thinking-in-tensors-writing-in-pyt...
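       | 
       | A rough check of both claims, using sklearn's small 8x8
       | digits as a quick stand-in for MNIST:
       | 
       |   from sklearn.datasets import load_digits
       |   from sklearn.manifold import TSNE
       |   from sklearn.model_selection import train_test_split
       |   from sklearn.neighbors import KNeighborsClassifier
       | 
       |   X, y = load_digits(return_X_y=True)
       |   X_tr, X_te, y_tr, y_te = train_test_split(
       |       X, y, random_state=0)
       | 
       |   # Plain kNN on raw pixels: the baseline a net must beat.
       |   knn = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)
       |   print("3-NN accuracy:", knn.score(X_te, y_te))
       | 
       |   # Unsupervised t-SNE on raw pixels; scatter-plot emb
       |   # colored by y and the digit clusters are already visible.
       |   emb = TSNE(n_components=2, random_state=0).fit_transform(X)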
        
         | kkoncevicius wrote:
         | kNN with k=1 is a more complex model than a neural
         | network, not a simpler one. It scales with the amount of
         | data: if we ever reached the point of having a labeled
         | image for every possible pixel combination in MNIST, it
         | would be unbeatable.
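         | 
         | To make that concrete (my sketch, not the parent's): with
         | k=1 the model memorizes its training set, so training
         | accuracy is 1.0 and test accuracy climbs as the data fills
         | the input space.
         | 
         |   from sklearn.datasets import load_digits
         |   from sklearn.model_selection import train_test_split
         |   from sklearn.neighbors import KNeighborsClassifier
         | 
         |   X, y = load_digits(return_X_y=True)
         |   X_tr, X_te, y_tr, y_te = train_test_split(
         |       X, y, random_state=0)
         | 
         |   for n in (100, 500, len(X_tr)):  # grow the training set
         |       nn1 = KNeighborsClassifier(n_neighbors=1)
         |       nn1.fit(X_tr[:n], y_tr[:n])
         |       print(n, "train:", nn1.score(X_tr[:n], y_tr[:n]),
         |             "test:", nn1.score(X_te, y_te))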
        
       | naillo wrote:
       | This is a beautiful blog; for example, this post is one of
       | my favorites:
       | https://greydanus.github.io/2020/10/14/optimizing-a-wing/
        
         | matmatmatmat wrote:
         | Man, ever find someone who's interested in all the same
         | things as you, but who has had the time to explore them
         | properly and even publish the results?
         | 
         | What an amazing find this blog is, especially the wing
         | optimization post (as you pointed out). I hope this guy
         | gets the resources to run free with his work and create
         | incredible things.
        
       ___________________________________________________________________
       (page generated 2022-12-22 23:00 UTC)