[HN Gopher] Just know stuff (or, how to achieve success in a mac...
       ___________________________________________________________________
        
       Just know stuff (or, how to achieve success in a machine learning
       PhD)
        
       Author : occamschainsaw
       Score  : 185 points
       Date   : 2023-01-27 15:50 UTC (7 hours ago)
        
 (HTM) web link (kidger.site)
 (TXT) w3m dump (kidger.site)
        
       | shihab wrote:
       | I didn't expect this to be as helpful as it actually was. Great
       | list.
       | 
       | Can anyone suggest something similar for the HPC domain?
        
         | sillysaurusx wrote:
         | Depends! HPC is a massive umbrella term. Which part of it are
         | you interested in?
         | 
         | For finance, firewall-to-firewall time was an important
         | concept. (The time it takes a signal to get into your
         | datacenter, be processed, then emit a signal back out.)
         | 
         | But e.g. massive data processing is an entirely different
         | beast. Latency isn't too important, whereas parallelization is
         | crucial.
         | 
         | So it's kind of hard to make a list without being pointed in a
         | vague direction.
        
         | patrickkidger wrote:
         | So HPC means like ten different things.
         | 
          | For example another commenter mentions low-latency concerns in
         | finance, and that's something I have zero experience with.
         | 
         | HPC has often also meant writing a lot of C++ to do e.g. MD or
         | something.
         | 
         | These days, I consider myself HPC-adjacent -- I write
         | scientific ML software, often for use on pretty beefy hardware
         | (TPU pods etc.) So at least for that, here's an off-the-cuff
         | list of a few items that come to mind:
         | 
         | - Know JAX. Really, really well: its internals, how its
         | transforms work. It's definitely a bit bumpy in places, but
         | it's still one of the best things we have for easily scaling
         | programs, e.g. through `jax.pmap`, being able to test on CPU
         | and then run on TPU, etc.
         | 
         | - Triton! New(-ish) kid on the block for GPU programming.
         | 
         | - How CPUs work: L1/L2/L3 caches, branch prediction, etc.
         | Parallelism via OpenMP.
         | 
         | - How GPUs work: warps etc.
         | 
          | - How BLAS works (e.g. tiling).
         | 
          | - Compiler theory. Inlining functions, argument aliasing, NRVO,
         | ...
         | 
         | - Know autodiff well. E.g. have a read of the Dex paper, and
         | the concerns with doing autodiff through index operations.
         | Modern scientific computing is moving towards a ubiquitously
         | autodifferentiable future.
         | 
         | - ... plus loads more, haha. Probably I'm still missing ten
         | different things that another reader considers crucial.
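
The "tiling" item in the list above can be illustrated with a small sketch. This is not how any real BLAS library is written (those use hand-tuned kernels, packing and vector instructions); it only shows the loop structure, assuming plain Python lists of rows. The names `matmul_tiled`, `matmul_naive` and the `tile` parameter are hypothetical and chosen for illustration.

```python
def matmul_naive(a, b):
    """Reference triple loop: c[i][j] = sum over p of a[i][p] * b[p][j]."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def matmul_tiled(a, b, tile=2):
    """Same product, computed one tile-by-tile block at a time.

    Working on small blocks means each block of `a`, `b` and `c` is reused
    many times while it is (ideally) still in cache. `tile` is an
    illustrative block size, not a tuned value.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):              # block row of c
        for j0 in range(0, m, tile):          # block column of c
            for p0 in range(0, k, tile):      # block along the shared dimension
                for i in range(i0, min(i0 + tile, n)):
                    for p in range(p0, min(p0 + tile, k)):
                        a_ip = a[i][p]
                        for j in range(j0, min(j0 + tile, m)):
                            c[i][j] += a_ip * b[p][j]
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(matmul_tiled(a, b, tile=2))
```

In pure Python the tiled version is not faster; the point is only that the arithmetic is unchanged while the order of memory accesses is rearranged, which is the part that matters on real hardware.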
        
       | lostmsu wrote:
       | While this surely seems required for a PhD program in one of the
       | top universities, more down-to-earth programs would require
       | only about 1/4-1/2 of what's in the list depending on the area of
       | research. For instance, graph neural networks are very niche.
       | 
       | And if you just want to do research around current SoTA, that
       | would be more like 1/8th.
        
       | amelius wrote:
       | Good references seem to be missing.
        
         | burnished wrote:
         | Agreed that it would be nice, but exactly how much of the work
         | of becoming educated are you willing to ask this person to do
         | on your behalf?
        
           | l33t233372 wrote:
           | If they are as educated on these topics as they claim, adding
           | references is a trivial matter and generally greatly improves
           | the value of the work.
        
             | burnished wrote:
             | I am going to disagree with you there. I have physical
             | copies of a couple books I recommend and making them easily
             | referenced can be a bit of a pain - not difficult but maybe
             | not worth the effort if I was compiling a list of topics
             | to serve as my general answer to a question I receive
             | frequently.
        
           | amelius wrote:
           | Yeah, but perhaps they should have written/published this in
           | a form such that others can add references.
        
             | burnished wrote:
             | Holy hell, that's an idea with legs. Why don't you make that
             | and send it their way?
        
       | datastoat wrote:
       | > Please, please: learn some probability via measure theory.
       | You'll start reading machine learning papers wondering how people
       | ever express themselves precisely without it. The entire field
       | seems to be predicated around writing things like x ~
       | p_\theta(x|z=q_\phi(x)) as if that's somehow meaningful notation.
       | 
       | Hear hear! How did ML get saddled with such awful notation?
        
         | Kwpolska wrote:
         | How many papers with awful notation are actually the reverse
         | engineering of some (barely) working code, cobbled together
         | from random libraries and coefficients?
        
       | uoaei wrote:
       | This list is great for moving things from the "unknown unknown"
       | (U-U) to the "known unknown" (K-U) bucket. It's relatively easy
       | to move the things from the K-U to the K-K bucket just by virtue
       | of knowing enough search terms and places to start.
       | 
       | I think all of the topics in TFA's list could come into play at
       | some point (I have explored something to do with the majority of
       | these concepts during my work in private research) and it is
       | important to know how compiler optimizations are done, e.g. XLA
       | and Jacobian accumulation techniques to design fast models. I
       | don't think it matters that you don't master all of them, but
       | being able to quickly spin back up to comprehension upon a
       | relatively brief refresh is pretty important when it comes to
       | algorithm design and prototyping.
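
As a rough illustration of why the Jacobian accumulation order mentioned above matters, the sketch below counts scalar multiplications for a chain of dense Jacobians multiplied in forward order versus reverse order. It is a simplified cost model under stated assumptions: real autodiff systems use Jacobian-vector and vector-Jacobian products rather than materialising dense Jacobians, and the function names and layer sizes here are hypothetical.

```python
def matmul_cost(rows, inner, cols):
    """Scalar multiplications for a dense (rows x inner) @ (inner x cols) product."""
    return rows * inner * cols


def accumulation_costs(dims):
    """Cost of forming J = J_L @ ... @ J_1 in two different orders.

    dims = [n_0, n_1, ..., n_L], where layer i maps R^{n_{i-1}} -> R^{n_i},
    so its Jacobian J_i has shape (n_i, n_{i-1}).
    Returns (forward_cost, reverse_cost).
    """
    n_in, n_out = dims[0], dims[-1]
    # Forward accumulation: start from J_1 and left-multiply by J_2, ..., J_L.
    # The running product always has n_in columns.
    forward = sum(matmul_cost(dims[i], dims[i - 1], n_in)
                  for i in range(2, len(dims)))
    # Reverse accumulation: start from J_L and right-multiply by J_{L-1}, ..., J_1.
    # The running product always has n_out rows.
    reverse = sum(matmul_cost(n_out, dims[i], dims[i - 1])
                  for i in range(1, len(dims) - 1))
    return forward, reverse


if __name__ == "__main__":
    # Many inputs, scalar output (e.g. a loss): reverse order is far cheaper.
    print(accumulation_costs([1000, 100, 100, 1]))
    # Scalar input, many outputs: forward order is far cheaper.
    print(accumulation_costs([1, 100, 100, 1000]))
```

Under this model, a scalar-valued function of many inputs favours reverse accumulation, which matches the usual intuition for why backpropagation is used to train models.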
        
       | Foobar8568 wrote:
       | Maybe I should start a PhD in ML as a university drop out, it
       | sounds like a list of basic stuff taught within the first two
       | years of any math/CS program, but I might miss something as it's
       | far from being detailed...
        
         | thwayunion wrote:
         | Now is a terrible time to start a PhD in ML. If you do a CS
         | PhD, pick literally any other subfield.
        
           | Zetobal wrote:
           | Sounds a lot like my dad telling me to go to law school
           | instead of getting into cs. He thought the ultimate goal of
           | cs is to make everyone in the field obsolete... well maybe he
           | was right in the end.
        
             | amelius wrote:
             | Lawyers too (maybe they are first)
        
           | eddsh1994 wrote:
           | Why?
        
             | dekhn wrote:
             | because the very best machine learning models are distilled
             | from postdoc tears.
        
             | thwayunion wrote:
             | The subfield is oversupplied with labor relative to the
             | supply of good ideas worth working on for six years and
             | advising capacity.
        
             | sillysaurusx wrote:
             | It's trendy, and in general it's best to avoid trendy
             | fields.
             | 
             | But if you actually care about ML -- I think it's one of
             | the best things in the world -- go for it! ML has never
             | been more accessible.
        
               | l33t233372 wrote:
               | Trendy fields often have lots of money in them in both
               | the short term (grants, associateships, funding), and the
               | medium term (first job prospects) as well as benefits like
               | preferential appearance in journals (if two equally
               | insightful articles are up for one spot, would you rather
               | take the article on the topic many are interested in or
               | the article from a much more niche area?) and possibly
               | more low-hanging fruit since there is so much more work
               | to piggyback on/respond to/extend.
        
           | verdverm wrote:
           | I'd argue that cross field is a good avenue for a PhD in CS.
           | Work in collaboration with one of the sciences. There are
           | plenty of papers to be had in application. Not everything in
           | ML research needs to be algorithms. You'll likely find places
           | where the models break down and need algo work too
        
           | ad404b8a372f2b9 wrote:
           | People have been saying this for as long as I've been hanging
           | on ML forums, meanwhile the professors I work with can't keep
           | up with industry demand for ML and ML adjacent roles and have
           | to open up new classes.
        
         | LeanderK wrote:
         | I think it's hard to get those basics in an undergrad program.
         | 
         | CS often has the problem that the math basics are not taught
         | as rigorously as they should be (I think calculus I/II and linear
         | algebra I/II should be exactly the same as the lectures for
         | math). Then it misses measure theory (usually in calc III) and
         | therefore you are going to miss solid fundamentals in
         | probability. Optimization is also usually absent, it's
         | sometimes squeezed into the numerical math lecture but it
         | shouldn't since numerics (or better scientific computing) is so
         | important. You also want some statistics course.
         | 
         | Math usually lacks the whole machine learning canon, from SVM
         | to NNs, from bayesian methods to statistical learning theory. I
         | just see the same statistics lecture everywhere building upon
         | probability and deriving good estimators and their properties
         | and that's all. Also, you might miss CS basics like algorithms
         | and struggle with the basics of scientific computing
         | (programming in C, knowing all your matrix decompositions
         | etc.). Also, no knowledge of git, how to navigate a server just
         | with a shell etc.
         | 
         | I think you will either have to learn the missing parts after
         | your undergraduate, for example during your master's or the first
         | year of your PhD, or be really lucky! I think most miss a significant
         | chunk of the topics after completing their undergraduate.
        
       | version_five wrote:
       | It's hard to be sure it really happened this way, but I feel like
       | my PhD began more with a research challenge and then went top
       | down into learning what I needed in order to think about the challenge,
       | and working back up to an academic framing of how the fundamental
       | theory was advanced.
       | 
       | We also had comprehensive exams, so you're forced to know the
       | overall theory of your discipline as part of the rigor of the
       | program.
       | 
       | I personally like the challenge approach to research, it's like
       | what companies call their "north star" sometimes. It's not that
       | you're working directly on that problem necessarily, it's that
       | you're identifying what would have to be true for that problem to
       | be solvable and working on some of those things
        
       | CapmCrackaWaka wrote:
       | This was a really fun list to read through. I agree with the
       | author that knowing as much as possible about how things work,
       | not just what things do, is extremely useful.
       | 
       | However, "Just know stuff" I think is a secondary requirement
       | (although an important one) to be successful. People really
       | struggle to "just know stuff" if they aren't interested in the
       | subject in the first place. People who aren't interested will
       | settle for knowing only what things do, and not dive into how
       | they work.
       | 
       | I am interested in this stuff, and actually self taught
       | (bachelors in mathematics here). I have experience with a lot of
       | stuff on this list, not through work or academia, or because I
       | want to make money, but through fiddling on my desktop at home. I
       | too got the "coveted tech job" as a machine learning engineer,
       | but I never would have if I wasn't legitimately interested in
       | this stuff, studying for fun in my spare time. I have seen lots
       | of people fail to progress in this field because they _don't
       | care_, they just want a good job.
       | 
       | Kind of off topic, but this is actually an integral part of my
       | interviewing process. We give candidates a simple dataset to
       | model, and we receive their script. The performance on the
       | hold-out set is only weighted ~20%. The candidate's ability to talk
       | about their process, about the internals of the model they used,
       | about the feature engineering quirks to work around model
       | limitations, about their parameter tuning scheme - these
       | conversations reveal how much someone is actually interested in
       | the field, and are a great indicator of whether or not they are
       | going to be a good contributor to the team. I've had candidates
       | who couldn't tell me _anything_ about how the models they used
       | actually worked. PhDs included!
        
         | patrickkidger wrote:
         | This is a great point I didn't cover! "Just know stuff" tends
         | to follow naturally from "care about stuff".
        
       | stevofolife wrote:
       | Amazing resource - very applied and relevant. Thank you for this
       | (coming from someone in Data Science).
        
       | dr_kiszonka wrote:
       | > Please, please: learn some probability via measure theory. [And
       | integration]
       | 
       | Are there any good, gentle books on measure theory for self-
       | study?
        
       | 0xBABAD00C wrote:
       | +1 for probability via measure theory & functional analysis
       | 
       | I would add the need for geometry and topology becoming more of a
       | pre-requisite for picking up modern methods:
       | https://arxiv.org/abs/2104.13478
        
       | qsort wrote:
       | "Just know stuff" deeply resonates. Even much below the author's
       | level, as is the case for me, technical knowledge dominates
       | everything else by orders of magnitude. I'm still appalled at how
       | people can manage to gather the courage to utter they don't need
       | math.
       | 
       | A great list, too. I guess I have stuff to brush up on for the
       | next 10 to 20 years?
        
         | yamtaddle wrote:
         | > I'm still appalled at how people can manage to gather the
         | courage to utter they don't need math.
         | 
         | I try to (re-)learn math periodically because I feel like I
         | should, like how one ought to eat one's vegetables and one
         | ought to exercise, but extrinsic motivation is the thing that's
         | lacking. I usually end up on recreational math puzzles or
         | something, before dropping it, since at least those are fun.
         | Probably made three or four cracks at it in a decade, each has
         | gone the same way.
         | 
         | I literally don't know what I'd use most of it for--I can't
         | find that need, even when I try. I'm sure I could make
         | something up just to have an excuse to apply what I was
         | learning, but... why? Probably there are other jobs I could
         | find where knowledge of math was absolutely key, but... why?
         | I've been paid to write code for about 23 years, my pay's
         | great, and I've repeatedly gained a reputation at companies for
         | being the guy to go to for tough or low-level problems. But if
         | you gave me an intro to linear algebra final or calculus 1
         | final, today, I'd be lucky to score 25% on either (hell, I'd be
         | lucky to get _anything_ right on the Calc final, but maybe
         | there'd be a couple easy questions at the beginning--I've
         | never, once, ever applied anything I learned in calc, for any
         | purpose, so what little I knew about it to begin with is long
         | gone)
        
           | Dr_Birdbrain wrote:
           | It's one of those things where if you don't have the math
           | knowledge, the opportunities to apply it will be literally
           | invisible to you.
           | 
           | If all you have is a hammer, everything looks like a nail--
           | but the converse of that is that if you have never seen a
           | hammer, nails will be invisible and incomprehensible to you,
           | they will just blend into the background of noise.
           | 
           | When I learn about something, I suddenly see it everywhere.
           | 
           | Hypothetically, if you wanted to learn it, I would recommend
           | devoting the first hour of every day to it. Before your brain
           | fully wakes up and starts asking why you are doing it, just
           | do it.
           | 
           | I assume you have the standard CS background, with decent
           | knowledge of discrete math, calculus, linear algebra. I would
           | recommend starting with a graduate-level linear algebra
           | textbook. A friend of mine used to say "linear algebra is the
           | new addition", it permeates everything, and knowing it well
           | produces massive dividends.
        
             | yamtaddle wrote:
             | Why should I think I'll find uses this time, when I already
             | burned time and money learning (some of) it once, and lost
             | all that precisely because I never encountered any need for
             | it? An hour a day is a _huge_ time investment for something
             | that's already failed to prove its worth once. Like I'm
             | sure I could go find some jobs that I can't get now because
             | I lack math skills, and I'm sure some (far from all!) of
             | those pay better than what I make now, but... like...
             | that's equally true of jobs that require better business or
             | speaking skills, and unlike linear algebra I can easily
             | point to ways those skills could be beneficial in everyday
             | life and in my existing job. Where's the corresponding
             | immediately-useful benefit for lin. alg.?
        
               | joxel wrote:
               | Clearly there are people that don't need to know math.
               | You happen to be one of them, congratulations. Though I
               | know I'd be bored out of my mind if I did software work
               | that only used high school level math and logic.
        
               | yamtaddle wrote:
               | Sure, some people need it, I was just responding to:
               | 
               | > I'm still appalled at how people can manage to gather
               | the courage to utter they don't need math.
               | 
               | When... well, the vast majority of people really don't.
               | They promptly forget almost everything back to about 6th
               | grade math, shortly after finishing formal education,
               | because _they truly never need it_, so that knowledge
               | and those skills quickly rust.
               | 
               | If these people in fact could make great use of it, such
               | that it's "appalling" that they don't think they need it,
               | then _that's_ probably what school should focus on
               | teaching, at least for non-math-majors. Laser-focus on
               | application in everyday life. Especially in k-12. If it's
               | _actually_ useful and people are being forced to spend
               | hundreds to thousands of hours learning junior high, high
               | school, and college math, but then losing most of it
               | because they never see any use for it, that's a
               | tremendous failing of curriculum that should be addressed
               | as directly as possible. If such a program wouldn't
               | succeed because it's _actually true_ that most people
               | don't really need most of that math for anything, then we
               | ought not be "appalled" at their correctly assessing that
               | truth.
        
               | joxel wrote:
               | I see what you're saying. I don't think it's a failure of
               | the curriculum to teach people things when they are young
               | they don't end up using. Certainly a 13 year old isn't
               | going to know what their future career path/interests
               | will be (some do, but most don't) and we shouldn't let them
               | shut doors down the road at such a young age. I think at
               | this point high school has devolved to the point of just
               | giving everyone the basic broad skills that they could
               | feasibly succeed at any college major. The seniors that
               | have already decided they just want to build houses all
               | day can complain during math, we all heard it, "when will
               | I ever use this", but the problem is the answer is not
               | "never" and it's not "always", it's "we don't know, but you
               | may need it, and closing those doors now will limit your
               | future potential".
        
             | patrickkidger wrote:
             | This is a great description, thank you! What you've said is
             | precisely the reason I emphasised knowing a bit of
             | foundational math, e.g. topology.
        
           | the_only_law wrote:
           | > I try to (re-)learn math periodically because I feel like I
           | should, like how one ought to eat one's vegetables and one
           | ought to exercise, but extrinsic motivation is the thing
           | that's lacking.
           | 
           | Personally, if it were "just relearn calculus and move from
           | there" I think I'd be able to motivate myself. At that point
           | I'm still pretty close to the problems and tasks that
           | interest me. But the reality is I spent much of middle school
           | and high school wasting my time doing things other than math,
           | so in reality I'd have to go much further back and relearn
           | all the prerequisite stuff, and the prerequisite stuff to
           | that, etc. By that point I'm so divorced from the reason I
           | wanted to try and relearn math in the first place that I just
           | get bored and give up.
        
         | bitL wrote:
         | ...and then you meet your new boss who has no clue about
         | anything and makes your life hell when his imagination doesn't
         | match the "stuff you know".
        
       | albertzeyer wrote:
       | I have learned many of these things at some point. Unfortunately,
       | I have also forgotten most of it. Some of it I still remember to
       | some degree, or just the basic idea, and can probably reconstruct
       | it given some time, or easily look it up and remember. However,
       | many of the things I have completely forgotten, and it would take
       | more time to understand it again.
       | 
       | I am not sure I can keep so many things active in my memory +
       | also other things, depending on what I am working on in the last
       | years. If my work in the last few years does not need some of
       | this knowledge, I keep forgetting it.
       | 
       | Maybe my memory is just bad.
       | 
       | I think many of the things listed here are somewhat specific to
       | what the author has worked on, so that it's easier for him to
       | really know all this.
       | 
       | Maybe the point of this post is also more generic: Try to have an
       | active memory over a diverse set of fields.
        
         | patrickkidger wrote:
         | Haha, so I actually have an atrocious memory! Famously so
         | amongst my friends, I never remember what we've discussed.
         | 
         | When I wrote this list I certainly wasn't
         | expecting/recommending all of this to stay in the reader's head
         | forever.
         | 
         | Rather: if you've worked with something deeply at one time,
         | then -- even if you've forgotten the details -- you still can
         | pattern-match on it later. And then look up whatever you've
         | forgotten!
        
       | [deleted]
        
       | jackblemming wrote:
       | I know most of the stuff listed there and do not have a published
       | textbook or papers. It sounds like the author achieved success
       | because they work hard on things they're passionate about.
       | Knowledge and published works are a byproduct of that.
        
         | LeanderK wrote:
         | I think the point is that knowing this stuff makes it easier to
         | write a paper because you are not as easily out of breath. It's
         | not guaranteeing you a good paper and phd, that's your part.
         | 
         | Maybe you too could be a successful phd-student? :) At least
         | you know the basics!
        
       | version_five wrote:
       | Thinking about this, I'd be also interested to hear what the
       | author learned and didn't find useful over his phd. Is this a
       | list of most of what he ended up learning, (which could
       | potentially then have a lot of confirmation bias in it) or is it
       | curated from the maze of blind alleys he went down?
        
         | patrickkidger wrote:
         | Ooh, that's a great suggestion!
         | 
         | So one thing I learned a _lot_ of in my PhD (for literally a
         | whole year), that I literally never needed, was functional
         | analytic methods for PDEs. Stuff like Moser iterations / the
         | De Giorgi-Nash-Moser theorem, etc.
         | 
         | The finer details of Turing machines have never really helped
         | me, although in my case that's probably the exception as I
         | imagine that's still pretty important.
         | 
         | On a more ML note, I have literally never needed SVMs. (And
         | hope I never get asked about them, I've forgotten everything
         | about them haha.)
         | 
         | I think there's a lot of other stuff I could add to the "just-
         | don't-know-stuff" list!
         | 
         | (And to answer your last question: this list is curated, and
         | based on the criteria of (a) is it useful, and (b) is it widely
         | applicable.)
        
       | jll29 wrote:
       | What you need to know depends largely on what you are working on -
       | and that also holds for a successful Ph.D. candidate. Of course,
       | it is good advice to be open-minded beyond one's narrow field of
       | inquiry, as the OP suggests, but a long list of maths topics may
       | not be of much help to beginners.
       | 
       | The Ph.D. period is the time when you have some time available to
       | acquire additional skills, and my advice is: try to strengthen
       | those aspects of your education where you currently have the most
       | glaring deficits. For instance, take a statistics course if that
       | is your weak spot, or learn a foreign language if you have evaded
       | that topic so far in your educational journey.
       | 
       | Read the main textbooks of your field and read and re-read ALL
       | relevant papers for your actual Ph.D. topic, once you've been
       | able to identify it (which may well take you most or all of your
       | first year). Take detailed notes because nobody can remember most
       | of that much highly concentrated advanced material. Try to find
       | gaps: ask questions and find out if people have tried to answer
       | them yet or not. Interact with others e.g. at conferences, after
       | meeting some people there by attending and networking in a prior
       | year.
        
         | SQueeeeeL wrote:
         | > Read the main textbooks of your field and read and re-read
         | ALL relevant papers for your actual Ph.D. topic, once you've
         | been able to identify
         | 
         | I'm gonna be real as someone in grad school. Basically no PhDs,
         | grad students, or professors I know read full textbooks. I
         | hear these ideas a lot, and they often sound like one of those
         | Instagram influencer diets that are completely unreasonable if
         | you have any constraints in your life, and it's mostly been
         | used to gate keep "real" scientists. Be well studied and
         | knowledgeable about your problem domain, but you have a finite
         | lifespan, so never feel bad that you aren't "educated enough".
        
           | patrickkidger wrote:
           | As the author of this article... I have read maybe one
           | textbook cover-to-cover in my life. :D
           | 
           | (Hands-on machine learning, by Geron, back when I made the
           | jump math->ML.)
        
             | mkl wrote:
             | I have an applied maths PhD but no machine learning. Would
             | you still recommend Geron for that purpose?
             | 
             | BTW, I spotted a typo in the first paragraph of your thesis
             | abstract: "neural networks and differential equation are
             | two sides".
        
             | SQueeeeeL wrote:
             | Elements of Statistical Learning is my cover to cover read
             | :), I just think it isn't a requirement to be a "good
             | student"
        
       | steppi wrote:
       | Knowing things is good, but I think the real benefit of a PhD is
       | in developing the skill to learn new things quickly and fill in
       | any gaps as needed. It's important to get practice learning many
       | things in depth but the goal isn't to become a storehouse of
       | knowledge, it's to develop the ability to assimilate existing
       | knowledge and apply it to figuring new things out.
       | 
       | The number of topics I've studied in depth dwarfs this list (and
       | the same is certainly true of its author) but the set of things I
       | could teach a class on today without preparation is much smaller.
       | The important thing is that if I have a problem, I can use the
       | impressions from all I've learned before to get a sense of where
       | to look next. My memory isn't great but it doesn't matter because
       | I can refresh, learn, and figure things out as needed.
        
         | patrickkidger wrote:
         | Completely agreed!
        
       | bayesian_horse wrote:
       | That's what I do. I drink and I know stuff.
        
       | gautamcgoel wrote:
       | I just wrapped up a machine learning PhD at Caltech (now doing a
       | postdoc in ML at Berkeley) and I disagree strongly with this
       | article. What matters isn't knowing a bunch of random stuff, but
       | rather writing/speaking skills, a willingness to learn new
       | things, perseverance in the face of setbacks, creativity, having
       | enough EQ to navigate the advisor-advisee relationship and
       | departmental politics, and most of all, an ability to follow
       | through and actually get things done. These "intangible" skills
       | are far more important than having any specific knowledge.
        
         | mi_lk wrote:
         | This. The article is meh in terms of advice, and even serves
         | more as a self-promotion (which apparently he is good at and
         | that's something to learn). Maybe it somehow works for him and
         | it's fine, but it's no more useful than a table of contents.
        
         | patrickkidger wrote:
         | I think those are all things you need for _life_.
         | 
         | But what do you need for _specifically a PhD_? I argue that
         | "knowing stuff" is what is necessary -- and that indeed it's
         | essentially the purpose of the whole academic institution.
        
           | ModernMech wrote:
           | A Ph.D. isn't about knowing things, it's about doing
           | research. Recipients of the degree are given the title
           | "doctor" not for what they know, but because they are first
           | and foremost teachers of knowledge (from the Latin _docere_,
           | meaning to teach).
           | 
           | "Knowing stuff" is enough to get you to the point where you
           | can formulate a good research question, but being a good
           | researcher is a much broader skillset than just knowing
           | things. And consequently, just knowing things won't get you a
           | Ph.D., because getting one requires you to talk about your
           | research. A lot. Like, all the time. And after all, the last
           | step in getting the Ph.D. is called a "defense", not an
           | "examination", because you are not there to tell anyone what
           | you know -- you are there to _defend_ what you know, and
           | that's a different skillset than gaining knowledge or even
           | disseminating knowledge.
           | 
           | I guess all that is to say, you could know everything in the
           | world, but if you lack the skills to tell anyone else about
           | that knowledge, you'll never get a Ph.D.
        
       | mochomocha wrote:
       | The tone of the post is pretty off-putting. It reads as a "look
       | how smart I am!" article - the author doesn't even pretend to be
       | modest.
        
         | burnished wrote:
         | Should they be? In this case wouldn't it end up being false
         | modesty?
         | 
         | Like, if this person can't say "Look at me, I am UNUSUALLY
         | INTELLIGENT!" then who can?!
        
           | mochomocha wrote:
           | > Like, if this person can't say "Look at me, I am UNUSUALLY
           | INTELLIGENT!" then who can?!
           | 
           | No one, that's my point. If academia has not taught the
           | author that his intelligence isn't unusual, the workforce of
           | his new employer certainly will. Listing github stars and
           | twitter followers in the second paragraph as an achievement
           | to me betrays a lack of maturity and a need for external
           | validation. On the bright side, being good at self-promotion
           | when entering a mega-corporation will make sure he has a
           | great career in front of him.
        
             | patrickkidger wrote:
             | I'm sorry it came across this way for you! Rather, I'm just
             | outlining why folks seem to keep asking me this question.
             | :)
        
               | burnished wrote:
               | I think your post combined with the responses to it are a
               | great example of how messages become stripped of all
               | nuance, even by otherwise very educated/intelligent
               | people. A lot of the critique here ignores the nuance and
               | specificity you give just in the opening paragraphs.
        
             | burnished wrote:
             | I see, you read it as a form of self-aggrandizement. I
             | think your reading was wrong (in addition to being unkind),
             | but wouldn't disagree with the core value under discussion.
             | 
             | I think a fair test to apply is whether or not the thing is
             | a fact - if it is a fact, and you think it is impressive,
             | that might say more about you than the author. I feel like
             | otherwise you are asking people to self-censor too strongly
             | for fear of betraying some perverse sense of modesty that
             | does not allow for anyone to have done anything worth
             | noticing at all.
        
           | p1esk wrote:
           | Just because you can does not mean you should.
        
             | ding_dang wrote:
             | Why not?
        
               | [deleted]
        
       | j7ake wrote:
       | The problem with this list, although impressive in scope, is that
       | it is not clear how /deeply/ one should know each of those items
       | on the list.
       | 
       | One can write several papers for each one of those items on the list.
       | 
       | Knowing the concept well enough to pass a job interview is a much
       | lower bar than well enough to innovate and push new knowledge.
        
       | sillysaurusx wrote:
       | If anyone's reading over this and feels "Gosh, I'll never be an
       | ML dev; this is way too much": I don't know most of that list,
       | and still manage to be a productive researcher. I learn what I
       | need as I go.
       | 
       | That's probably the optimal strategy. I'm skeptical of first-
       | principles learning. It's great to immerse yourself in theory,
       | but when you've gone all the way to "topology" you've probably
       | gone beyond the limit of what most ML devs care about on a day-
       | to-day basis.
       | 
       | It's still useful to know. I've applied lots of ideas from other
       | fields. But can you force that knowledge by forcing yourself to
       | study other fields? Maybe. We all have a finite amount of time
       | though.
       | 
       | That said, lots of items on this list are key, and it'd be worth
       | ranking them. There's no need to memorize formulas for Adam, but
       | knowing the concept of momentum-per-weight is pretty crucial.
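       | 
       | A minimal sketch of that per-weight momentum idea, assuming flat
       | Python lists of floats and the commonly used default
       | hyperparameters. The name adam_step is illustrative, not any
       | library's API; the point is only that every weight keeps its own
       | running averages m (of the gradient) and v (of its square).

```python
import math


def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update; t is the 1-based step count.

    params, grads, m and v are equal-length lists of floats.
    Returns new (params, m, v) lists and leaves the inputs untouched.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Per-weight running average of the gradient (the "momentum").
        m_i = beta1 * m_i + (1 - beta1) * g
        # Per-weight running average of the squared gradient.
        v_i = beta2 * v_i + (1 - beta2) * g * g
        # Bias correction, since both averages start at zero.
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)
        # Each weight's step is scaled by its own gradient history.
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v
```

       | On the very first step (t=1, m and v starting at zero) the
       | bias-corrected update is roughly lr * sign(gradient) for every
       | weight, whatever the scale of that weight's gradient.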
        
         | thwayunion wrote:
         | _> Gosh, I'll never be an ML dev_
         | 
         | FWIW very few of the ML devs I work with have PhDs. I'm not
         | sure aspiring ML devs are the intended audience.
        
         | extasia wrote:
         | The author and you seem to be talking to different audiences:
         | you're talking about ML eng and OP is talking about ML
         | researchers.
         | 
         | Researchers absolutely need to know a lot, not necessarily all
         | the way to topology or w/e but definitely the underlying
         | mathematical principles in order to advance the field (IMO).
        
           | sillysaurusx wrote:
           | You're right, I wasn't clear. But I'm a full-time ML
           | researcher. In terms of advancing the field, my contributions
           | so far have been modest, but they're there. Some of my
           | favorite ideas were swarm training
           | (https://battle.shawwn.com/swarm-training-v01a.pdf), stop
           | loss for stabilizing GAN training
           | (https://twitter.com/search?q=from%3Atheshawwn%20stop%20loss&...),
           | and getting GPT to play chess
           | (https://www.theregister.com/2020/01/10/gpt2_chess/) back
           | when that was a shocking idea. And in terms of cited
           | research, https://arxiv.org/abs/2101.00027 has been the most
           | impactful.
           | 
           | There's significant overlap between ML research and ML dev.
           | If I can do it without most of that list, it should give
           | people here some hope of joining the field without needing to
           | immerse themselves in theory.
        
           | p1esk wrote:
           | I'm an ML researcher, and I'm with sillysaurusx on this. I
           | actually know most of the things on the list, but only
           | because my research is mainly in model compression and
           | computational efficiency. Recently I've been interested in
           | adapting diffusion models to generate music (in raw audio
           | domain), and I'd say only 5 out of 18 bullet points in ML
           | section are relevant - the rest ranges from "nice to know" to
           | "irrelevant".
        
         | bayesian_horse wrote:
         | I feel like I know up to 75% of that stuff and can't get hired
         | even for entry level data science.
        
       | ad404b8a372f2b9 wrote:
       | I keep searching for how to be a good researcher but I haven't found
       | any answers.
       | 
       | I know almost all of this except for weirdly arbitrary/specific
       | stuff (you really don't need to know Haskell for an ML PhD). It's
       | all pretty basic, half of it you learn during any CS Masters
       | degree, the other half are models and concepts that have been
       | popular enough in the last few years that you would have read the
       | papers and possibly implemented them if you keep up with the
       | field.
       | 
       | This hasn't helped me at all, my PhD has gone horribly and I'm
       | not entirely sure why. The first year I wasted on an EEG-related
       | review paper before realizing that EEG data is garbage, the
       | second year I wasted on an original method that I wanted to
       | pursue but it didn't perform well and got scooped. The third year
       | (which ended up being the fourth year because of severe health
       | issues) I was burnt-out and produced nothing of value.
       | 
       | I don't know how to produce good original research. I know how to
       | program like nobody's business, better than any of my lab
       | colleagues but this has done nothing to help me. I can implement
       | a method in a night, I can implement tons of ideas in a year but
       | if they don't perform better on the important metrics what's the
       | point? I know the math and the ML side of things but this hasn't
       | given me any insights, usually when I have an idea I realize it's
       | already been done while doing the literature review.
       | 
       | It's all a big mystery to me, I think maybe the pressure of
       | holding an industry job and having to finish the PhD before my
       | funding ran out prevented me from experimenting freely but it
       | might just be a convenient excuse.
        
         | patrickkidger wrote:
         | I'm sorry to hear that things didn't go so well for you!
         | 
         | FWIW being able to "program like nobody's business" is still
         | really really valuable. It's why I dedicate such a large chunk
         | of the post to software dev skills. :)
        
       | brilee wrote:
       | As somebody from the U.S., I'm struck by how much people from the
       | U.K. value a strong background in theory. At Google my experience
       | has been "what matters is how smart you are, whether you
       | understand the problem we're trying to solve, and whether you
       | have creative solutions". When I applied for DeepMind some time
       | ago, I was grilled, rapid-fire, a hundred questions covering the
       | breadth of a rigorous undergraduate education in linear algebra,
       | stats, ML, calculus, etc. They seemed content to measure my
       | intelligence by seeing how rapidly and deeply I had assimilated
       | standard courses, rather than by seeing how I approached a
       | problem I'd never seen before.
       | 
       | This guy is obviously talented, but also he comes from a
       | tradition of optimizing for this kind of academic culture. You
       | would be similarly weirded out by the leetcode fetish in tech if
       | that's not what you were used to. I think that's what many
       | commenters are missing.
        
         | xmprt wrote:
         | I've noticed that cultural difference too. I think there are
         | things to take away from both approaches. Extremely
         | knowledgeable people with a lot of background in theory should
         | get better at understanding and creatively applying that
         | knowledge to new problems. And people who are great at problem
         | solving should learn more theory instead of expecting
         | themselves to just materialize the best solution out of thin
         | air.
        
       | foobarbecue wrote:
       | 2 years? Sheesh. This is the type of stuff that makes me think
       | genius is a biological thing.
        
         | sampo wrote:
         | > 2 years?
         | 
         | Half of the items, and almost all in the "Mathematics" section,
         | you would have learned during your BSc/MSc (if it's in applied
         | math or physics, and Python programming is your hobby). I can't
         | find the author's CV, but there are 5 years between him
         | submitting his MSc thesis in 2017 and his PhD thesis in 2022. Maybe
         | he studied ML for all those 5 years, or maybe he took off 2
         | years to travel the world, who knows.
        
         | ipnon wrote:
         | You can learn a lot when you're in your early 20s, single,
         | ambitious, and getting a stipend from a university to do
         | nothing but study for 16 hours a day.
        
         | j7ake wrote:
         | In the UK, 3 years is common.
        
           | patrickkidger wrote:
           | Right! This was the situation for me.
        
         | [deleted]
        
         | supernova87a wrote:
         | I was about to comment on the short PhD in "2-and-a-bit years"
         | -- and how the UK expectation for PhD program duration is so
         | different from in the US.
         | 
         | And it wasn't to say that he's a genius. It's that the program
         | is planned to be shorter, and in fact, there's less funding
         | available to go longer even if you wanted to.
        
         | jahewson wrote:
         | One must consider what he spent the previous 22 years doing.
        
       | WithinReason wrote:
       | Maybe not necessary for a PhD, but you should also have some idea
       | about how HW works.
        
       | bitwize wrote:
       | "The stuff is what the stuff is, brother." --from a James Mickens
       | talk on machine learning,
       | https://www.youtube.com/watch?v=ajGX7odA87k&t=13m40s
       | 
       | Can confirm, the way ML is used in many businesses is like an egg
       | drop: You open up Jupyter, load in some data, and play around
       | with various models until you find one that fits the data, then
       | use it to try to predict future data. If the future results
       | comport with the model, congratulations: your egg is safe, at
       | least for drops of that height.
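       | 
       | A toy sketch of that egg-drop loop, assuming two hand-rolled
       | candidate models and made-up data. The names fit_mean, fit_line
       | and pick_model are hypothetical, and a real workflow would
       | normally select on held-out data rather than on training fit.

```python
def fit_mean(xs, ys):
    """Candidate 1: always predict the average of the past targets."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """Candidate 2: ordinary least-squares straight line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def mse(model, xs, ys):
    """Mean squared error of a fitted model on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def pick_model(candidates, xs, ys):
    """Fit every candidate and keep whichever fits the past data best."""
    fitted = {name: fit(xs, ys) for name, fit in candidates.items()}
    best = min(fitted, key=lambda name: mse(fitted[name], xs, ys))
    return best, fitted[best]


# Made-up "past" data, then "future" data to drop the egg on.
past_x, past_y = [0, 1, 2, 3], [1, 3, 5, 7]
future_x, future_y = [4, 5], [9, 11]

best_name, best_model = pick_model({"mean": fit_mean, "line": fit_line}, past_x, past_y)
future_error = mse(best_model, future_x, future_y)
```

       | If future_error stays small, the egg survived this drop; it says
       | nothing about data outside the range the model was tried on.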
        
       ___________________________________________________________________
       (page generated 2023-01-27 23:01 UTC)