[HN Gopher] Neural Networks: Zero to Hero
       ___________________________________________________________________
        
       Neural Networks: Zero to Hero
        
       Author : whereistimbo
       Score  : 243 points
       Date   : 2023-04-05 19:12 UTC (3 hours ago)
        
 (HTM) web link (karpathy.ai)
 (TXT) w3m dump (karpathy.ai)
        
       | whiplash451 wrote:
       | Andrej's course is brilliant and so nice to follow.
       | 
       | His explanation of attention is the most accessible I have ever
       | seen.
        
       | spaceman_2020 wrote:
        | A little off-topic, but is this something that someone with only
        | webdev experience can get started with?
        
         | nomel wrote:
          | Getting started with it takes exactly zero experience. Being
          | productive in it does, but that's unrelated to the starting
          | point and shouldn't discourage you if you really want to do
          | it.
          | 
          | There are several open courses online.
        
         | jfisher4024 wrote:
         | It might make sense to get a handle on Python first
        
         | mhh__ wrote:
          | The tools are basically irrelevant conceptually. It's all about
          | the mathematics.
        
         | whiplash451 wrote:
          | You need some fluency in Python and basic knowledge of linear
          | algebra (matrix multiplication etc.)
         | 
         | If you want, you can also start with the first lessons of
         | course.fast.ai
        
       | yacine_ wrote:
        | What I appreciate about karpathy's videos is that they don't make
        | things any more complicated than they need to be. Simple
        | engineering language is used. No gatekeeping! It's reassuring,
        | and lets everyone know that anyone can do it.
       | 
       | Thanks karpathy!
        
         | bilsbie wrote:
         | I just don't know what he means by logits. Everything else
         | seems like straightforward language.
        
           | airstrike wrote:
           | Having not watched the series, I can only assume he means
           | logit as in a probability function from 0 to 1
           | 
            | https://deepai.org/machine-learning-glossary-and-terms/logit....
        
           | joshvm wrote:
           | When people mention logits, they're usually referring to the
           | raw output of the model before it gets transformed/normalised
           | into a probability distribution (i.e. sums to 1, range
           | [0,1]). Logits can take any value. The naming might not be
           | mathematically strict, because it assumes(?) that you're
           | going to apply softmax (which interprets the output of the
           | model as logits), but that's how the term is used.
           | 
            | For example, in many classification problems you get a 1D
            | vector of logits from the final layer; you apply softmax to
            | normalise, then argmax to extract the predicted class. This
            | extends to other tasks like semantic segmentation (predicting
            | pixel classes), where the "logit" output is the same size as
            | the image with a channel for each class, and you apply the
            | same process to get a single-channel image with a class per
            | pixel.
           | 
           | Here's a nice explanation:
           | https://stackoverflow.com/a/66804099/395457
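            | 
            | A minimal NumPy sketch of that logits -> softmax -> argmax
            | pipeline (the numbers are made up for illustration, not
            | taken from the course):
            | 
            |   import numpy as np
            | 
            |   logits = np.array([2.0, -1.0, 0.5])  # raw output, any reals
            |   # softmax: exponentiate, then normalise so values sum to 1
            |   exp = np.exp(logits - logits.max())  # shift by max for stability
            |   probs = exp / exp.sum()              # ~[0.79, 0.04, 0.18]
            |   pred = int(np.argmax(probs))         # predicted class index: 0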
        
       | sourcecodeplz wrote:
        | This is really cool, and I am so glad my math teacher played
        | hardball and I still remember some calculus.
        | 
        | edit: Python really was/is made for this
        | numbers/calculation/visualization thing. Kinda kicking myself now
        | for sticking with PHP instead of investing more in it. Although
        | PHP has its merits when building different things, Python is a
        | beast with numbers.
        
         | sourcecodeplz wrote:
          | I am at graphviz now and it is getting better and better.
          | 
          | Also using ChatGPT to ask questions when I don't get
          | something.
         | 
         | Wow what a time we live in to learn things.
        
       | 7373737373 wrote:
        | This was the first time I actually _grokked_ backpropagation.
        | Just the first video alone is more lucid and valuable than any
        | other resource about machine learning I had seen before. In
        | fact, it's so well explained that I managed to implement the
        | library almost completely from memory after watching it. I
        | cannot recommend it highly enough, especially for programmers
        | without a math background!
       | 
       | The only aspect I could see being non-ideal for some is that it
       | uses some Python-specific cleverness/advanced syntax and
       | semantics (__call__(), list comprehensions with two for's,
       | **kwargs, __add__, __repr__, subclasses, (nested) functions as
       | variables etc.), but if you are familiar with these it might seem
       | more compact and elegant as well.
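        | 
        | For readers unfamiliar with those constructs, here is a toy
        | sketch (not Karpathy's actual micrograd code) showing a few of
        | them in one place:
        | 
        |   class Value:
        |       def __init__(self, data, **kwargs):  # **kwargs: optional named args
        |           self.data = data
        |           self.label = kwargs.get("label", "")
        |       def __add__(self, other):             # operator overloading: a + b
        |           return Value(self.data + other.data)
        |       def __call__(self, x):                # makes instances callable
        |           return self.data * x
        |       def __repr__(self):                   # printable representation
        |           return f"Value({self.data})"
        | 
        |   a, b = Value(2.0, label="a"), Value(3.0)
        |   print(a + b)   # Value(5.0), via __add__ and __repr__
        |   print(a(10))   # 20.0, via __call__
        |   # list comprehension with two for's: flatten a list of lists
        |   flat = [v for row in [[1, 2], [3, 4]] for v in row]  # [1, 2, 3, 4]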
        
         | whiplash451 wrote:
          | To be fair, Andrew Ng's older online course was also fantastic
          | at explaining backprop.
          | 
          | But that takes nothing away from Andrej's class.
        
       | agentofoblivion wrote:
       | Wonderful! Just went through the GPT video the other day and it
       | was great. Andrej has a talent for pedagogy via simplification.
        
         | yuuuuyu wrote:
          | And he presents all this stuff with humility. Many people who
          | present are just showing off and are pretty much full of
          | themselves. I suppose they need the ego boost, who knows. But
          | Andrej could be the nice guy next door in the dorm who is
          | studying the same course as you, just that he is a lecture or
         | two ahead. (Until you figure out he is the former VP of AI at
         | Tesla or whatever his title ended up being before he left.)
         | 
         | I can even recommend his interview with Lex Fridman.
        
           | Yajirobe wrote:
            | He is also quite good at teaching and solving the Rubik's
            | cube.
        
           | meling wrote:
           | Absolutely agree with this.
        
           | 0cf8612b2e1e wrote:
            | I've only finished the first video, but he made two minor
            | blunders in his code and kept the footage. It really helps
            | your confidence to see a pro make a mistake rather than a
            | perfectly polished but unattainable ideal standard.
        
         | frankcort wrote:
          | Which GPT video?
        
           | nomel wrote:
            | It can be found on the site's home page.
           | 
           | Let's build GPT: from scratch, in code, spelled out:
           | https://www.youtube.com/watch?v=kCc8FmEb1nY
        
             | frankcort wrote:
             | Thank you!
        
         | abraxas wrote:
          | He is a master educator. While at Stanford he developed their
          | undergrad intro course on machine learning, CS231n, which
          | immediately became legendary. It's somewhat out of date on
          | some details, but it's still well worth watching, especially
          | as delivered by Andrej. You can find all 11 lectures on
          | YouTube.
        
       | auggierose wrote:
        | This course, together with the new fastai ones [1], seems to be
        | exactly what I was looking for. The micrograd video is excellent.
       | 
       | [1] https://course.fast.ai
        
       | zakki wrote:
        | Has anybody tried the lessons on an Apple MBA/MBP M1/M2? Do they
        | run easily?
        
       | sirodoht wrote:
        | I'm doing an ML apprenticeship [1] these weeks and Karpathy's
        | videos are part of it. We've gone deep into them, and I found
        | them excellent. All the concepts he illustrates are crystal
        | clear in his mind (even though they are complicated in
        | themselves), and that shows in his explanations.
       | 
        | Also, the way he builds up everything is magnificent: starting
        | from basic Python classes, to derivatives and gradient descent,
        | to micrograd [2], and then from a bigram counting model [3]
        | (sketched after the links below) to makemore [4] and nanoGPT [5].
       | 
       | [1]: https://www.foundersandcoders.com/ml
       | 
       | [2]: https://github.com/karpathy/micrograd
       | 
       | [3]:
       | https://github.com/karpathy/randomfun/blob/master/lectures/m...
       | 
       | [4]: https://github.com/karpathy/makemore
       | 
       | [5]: https://github.com/karpathy/nanoGPT
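        | 
        | To give a flavour of the bigram counting step, here is a toy
        | sketch (an illustration, not the notebook's actual code): count
        | character pairs in a few names, then sample new names from those
        | counts.
        | 
        |   import random
        |   from collections import defaultdict
        | 
        |   words = ["emma", "olivia", "ava"]       # toy training names
        |   counts = defaultdict(lambda: defaultdict(int))
        |   for w in words:
        |       chars = ["."] + list(w) + ["."]     # "." marks start/end
        |       for c1, c2 in zip(chars, chars[1:]):
        |           counts[c1][c2] += 1             # count bigram c1 -> c2
        | 
        |   def sample_next(ch):
        |       # sample the next character in proportion to bigram counts
        |       nxt, wts = zip(*counts[ch].items())
        |       return random.choices(nxt, weights=wts)[0]
        | 
        |   out, ch = [], "."                       # walk the chain from start
        |   while True:
        |       ch = sample_next(ch)
        |       if ch == ".":
        |           break
        |       out.append(ch)
        |   print("".join(out))                     # e.g. "ava" or "emma"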
        
         | bilsbie wrote:
         | Do you run the code as you watch?
         | 
         | I've been simply watching them on a palm from a hammock and I'm
         | worried I'm not getting the full experience.
        
           | sirodoht wrote:
            | I've found that actually running the code is very beneficial
            | for understanding, along with reasoning through each line of
            | code and spending a lot of time with the video paused,
            | discussing and explaining to each other what we understood.
        
             | whiplash451 wrote:
             | Same. I also found the exercises to be useful.
        
         | jimsparkman wrote:
          | That program sounds quite impressive. I wonder if any
          | equivalents exist in the US?
        
           | sirodoht wrote:
            | The website doesn't say what--for me--is the best thing about
            | it. The course is peer-led, which works like this: once you
            | join, you're part of a team with one objective: get the best
            | score with your ML recommendation system.
            | 
            | There is a simulated environment in which all teams of the
            | cohort receive millions of requests per day (and hundreds of
            | thousands of users and items), and you have to build out your
            | infrastructure on an EC2 instance, build a basic model, and
            | then iteratively improve on it. Imagine a simulated
            | facebook/youtube/tiktok-style system where you aim for the
            | best uptime and the best recommendations!
        
       ___________________________________________________________________
       (page generated 2023-04-05 23:00 UTC)