[HN Gopher] Just know stuff (or, how to achieve success in a mac...
___________________________________________________________________
Just know stuff (or, how to achieve success in a machine learning
PhD)
Author : occamschainsaw
Score : 185 points
Date : 2023-01-27 15:50 UTC (7 hours ago)
(HTM) web link (kidger.site)
(TXT) w3m dump (kidger.site)
| shihab wrote:
| I didn't expect this to be as helpful as it actually was. Great
| list.
|
| Can anyone suggest something similar for the HPC domain?
| sillysaurusx wrote:
| Depends! HPC is a massive umbrella term. Which part of it are
| you interested in?
|
| For finance, firewall-to-firewall time was an important
| concept. (The time it takes a signal to get into your
| datacenter, be processed, then emit a signal back out.)
|
| But e.g. massive data processing is an entirely different
| beast. Latency isn't too important, whereas parallelization is
| crucial.
|
| So it's kind of hard to make a list without being pointed in a
| vague direction.
| patrickkidger wrote:
| So HPC means like ten different things.
|
| For example another commenter mentions low-latency concerns in
| finance, and that's something I have zero experience with.
|
| HPC has often also meant writing a lot of C++ to do e.g. MD or
| something.
|
| These days, I consider myself HPC-adjacent -- I write
| scientific ML software, often for use on pretty beefy hardware
| (TPU pods etc.) So at least for that, here's an off-the-cuff
| list of a few items that come to mind:
|
| - Know JAX. Really, really well: its internals, how its
| transforms work. It's definitely a bit bumpy in places, but
| it's still one of the best things we have for easily scaling
| programs, e.g. through `jax.pmap`, being able to test on CPU
| and then run on TPU, etc.
|
| - Triton! New(-ish) kid on the block for GPU programming.
|
| - How CPUs work: L1/L2/L3 caches, branch prediction, etc.
| Parallelism via OpenMP.
|
| - How GPUs work: warps etc.
|
| - How BLAS works (e.g. tiling).
|
| - Compiler theory. Inlining functions, argument aliasing, NRVO,
| ...
|
| - Know autodiff well. E.g. have a read of the Dex paper, and
| the concerns with doing autodiff through index operations.
| Modern scientific computing is moving towards a ubiquitously
| autodifferentiable future.
|
| - ... plus loads more, haha. Probably I'm still missing ten
| different things that another reader considers crucial.
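|
| As a rough sketch of the tiling idea in the BLAS item above
| (an illustration, not code from the thread): the loops are
| blocked so that each small tile of the operands is reused
| before moving on. The function names and the `tile` parameter
| are hypothetical, and pure Python shows no cache benefit; only
| the loop structure carries over to real BLAS kernels.
|
```python
def matmul_naive(a, b):
    """Reference product of a (n x k) and b (k x m), given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def matmul_tiled(a, b, tile=2):
    """Same product, computed one (tile x tile) block at a time."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Outer three loops walk over blocks of the output and of the shared dimension.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                # Inner three loops stay inside one block of a, b, and c.
                for i in range(i0, min(i0 + tile, n)):
                    for p in range(p0, min(p0 + tile, k)):
                        a_ip = a[i][p]
                        for j in range(j0, min(j0 + tile, m)):
                            c[i][j] += a_ip * b[p][j]
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(matmul_tiled(a, b, tile=2) == matmul_naive(a, b))
```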
| lostmsu wrote:
| While this surely seems required for a PhD program in one of the
| top universities, more down-to-earth programs would require
| only about 1/4-1/2 of what's in the list depending on the area of
| research. For instance, graph neural networks are very niche.
|
| And if you just want to do research around current SoTA, that
| would be more like 1/8th.
| amelius wrote:
| Good references seem to be missing.
| burnished wrote:
| Agreed that it would be nice, but exactly how much of the work
| of becoming educated are you willing to ask this person to do
| on your behalf?
| l33t233372 wrote:
| If they are as educated on these topics as they claim, adding
| references is a trivial matter and generally greatly improves
| the value of the work.
| burnished wrote:
| I am going to disagree with you there. I have physical
| copies of a couple books I recommend and making them easily
| referenced can be a bit of a pain - not difficult but maybe
| not worth the effort if I was compiling a list of topics
| to serve as my general answer to a question I receive
| frequently
| amelius wrote:
| Yeah, but perhaps they should have written/published this in
| a form such that others can add references.
| burnished wrote:
| Holy hell, that's an idea with legs. Why don't you make that
| and send it their way?
| datastoat wrote:
| > Please, please: learn some probability via measure theory.
| You'll start reading machine learning papers wondering how people
| ever express themselves precisely without it. The entire field
| seems to be predicated around writing things like x ~
| p_\theta(x|z=q_\phi(x)) as if that's somehow meaningful notation.
|
| Hear hear! How did ML get saddled with such awful notation?
| Kwpolska wrote:
| How many papers with awful notation are actually the reverse
| engineering of some (barely) working code, cobbled together
| from random libraries and coefficients?
| uoaei wrote:
| This list is great for moving things from the "unknown unknown"
| (U-U) to the "known unknown" (K-U) bucket. It's relatively easy
| to move the things from the K-U to the K-K bucket just by virtue
| of knowing enough search terms and places to start.
|
| I think all of the topics in TFA's list could come into play at
| some point (I have explored something to do with the majority of
| these concepts during my work in private research) and it is
| important to know how compiler optimizations are done, e.g. XLA
| and Jacobian accumulation techniques to design fast models. I
| don't think it matters that you don't master all of them, but
| being able to quickly spin back up to comprehension upon a
| relatively brief refresh is pretty important when it comes to
| algorithm design and prototyping.
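|
| To make the Jacobian accumulation point concrete, here is a
| small sketch (an illustration, not from the thread) that counts
| scalar multiplications when a chain of dense Jacobians is
| multiplied output-side first (reverse mode) versus input-side
| first (forward mode). The function name and the layer shapes
| are hypothetical, and the count assumes naive dense matrix
| products.
|
```python
def accumulation_cost(shapes, mode):
    """Count multiplications for the product J_L @ ... @ J_1.

    shapes: (rows, cols) of each Jacobian, listed as J_L, ..., J_1.
    mode: "reverse" accumulates from the output side (left to right),
          "forward" accumulates from the input side (right to left).
    """
    cost = 0
    if mode == "reverse":
        rows, cols = shapes[0]
        for r, c in shapes[1:]:
            assert cols == r, "inner dimensions must match"
            cost += rows * cols * c  # (rows x cols) @ (r x c)
            cols = c
    elif mode == "forward":
        rows, cols = shapes[-1]
        for r, c in reversed(shapes[:-1]):
            assert c == rows, "inner dimensions must match"
            cost += r * c * cols  # (r x c) @ (rows x cols)
            rows = r
    else:
        raise ValueError(f"unknown mode: {mode}")
    return cost


if __name__ == "__main__":
    # Hypothetical scalar loss on a 1000-dim input with two 100-wide layers.
    shapes = [(1, 100), (100, 100), (100, 1000)]
    print("reverse:", accumulation_cost(shapes, "reverse"))
    print("forward:", accumulation_cost(shapes, "forward"))
```
|
| For a scalar output and a wide input, the reverse order is far
| cheaper, which is the usual argument for backpropagation.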
| Foobar8568 wrote:
| Maybe I should start a PhD in ML as a university drop out, it
| sounds like a list of basic stuff taught within the first two
| years of any math/CS program, but I might miss something as it's
| far from being detailed...
| thwayunion wrote:
| Now is a terrible time to start a PhD in ML. If you do a CS
| PhD, pick literally any other subfield.
| Zetobal wrote:
| Sounds a lot like my dad telling me to go to law school
| instead of getting into cs. He thought the ultimate goal of
| cs is to make everyone in the field obsolete... well maybe he
| was right in the end.
| amelius wrote:
| Lawyers too (maybe they are first)
| eddsh1994 wrote:
| Why?
| dekhn wrote:
| because the very best machine learning models are distilled
| from postdoc tears.
| thwayunion wrote:
| The subfield is oversupplied with labor relative to the
| supply of good ideas worth working on for six years and
| advising capacity.
| sillysaurusx wrote:
| It's trendy, and in general it's best to avoid trendy
| fields.
|
| But if you actually care about ML -- I think it's one of
| the best things in the world -- go for it! ML has never
| been more accessible.
| l33t233372 wrote:
| Trendy fields often have lots of money in them in both
| the short term (grants, associateships, funding), and the
| medium term (first job prospects) as well as benefits like
| preferential appearance in journals (if two equally
| insightful articles are up for one spot, would you rather
| take the article on the topic many are interested in or
| the article from a much more niche area?) and possibly
| more low-hanging fruit since there is so much more work
| to piggyback on/respond to/extend.
| verdverm wrote:
| I'd argue that cross field is a good avenue for a PhD in CS.
| Work in collaboration with one of the sciences. There are
| plenty of papers to be had in application. Not everything in
| ML research needs to be algorithms. You'll likely find places
| where the models break down and need algo work too
| ad404b8a372f2b9 wrote:
| People have been saying this for as long as I've been hanging
| on ML forums, meanwhile the professors I work with can't keep
| up with industry demand for ML and ML adjacent roles and have
| to open up new classes.
| LeanderK wrote:
| I think it's hard to get those basics in an undergrad program.
|
| CS often has the problem that the math basics are not taught
| as rigorously as they should be (I think calculus I/II and linear
| algebra I/II should be exactly the same as the lectures for
| math). Then it misses measure theory (usually in calc III) and
| therefore you are going to miss solid fundamentals in
| probability. Optimization is also usually absent, it's
| sometimes squeezed into the numerical math. lecture but it
| shouldn't since numerics (or better scientific computing) is so
| important. You also want some statistics course.
|
| Math usually lacks the whole machine learning canon, from SVM
| to NNs, from bayesian methods to statistical learning theory. I
| just see the same statistics lecture everywhere building upon
| probability and deriving good estimators and their properties
| and that's all. Also, you might miss CS basics like algorithms
| and struggle with the basics of scientific computing
| (programming in C, knowing all your matrix decompositions
| etc.). Also, no knowledge of git, how to navigate a server just
| with a shell etc.
|
| I think you will either have to learn the missing parts after
| your undergraduate, for example during your master's or the
| first year of your PhD, or be really lucky! I think most miss a
| significant chunk of the topics after completing their
| undergraduate.
| version_five wrote:
| It's hard to be sure it really happened this way, but I feel like
| my PhD began more with a research challenge and then went top
| down into learning what I needed to to think about the challenge,
| and working back up to an academic framing of how the fundamental
| theory was advanced.
|
| We also had comprehensive exams, so you're forced to know the
| overall theory of your discipline as part of the rigor of the
| program.
|
| I personally like the challenge approach to research, it's like
| what companies call their "north star" sometimes. It's not that
| you're working directly on that problem necessarily, it's that
| you're identifying what would have to be true for that problem to
| be solvable and working on some of those things
| CapmCrackaWaka wrote:
| This was a really fun list to read through. I agree with the
| author that knowing as much as possible about how things work,
| not just what things do, is extremely useful.
|
| However, "Just know stuff" I think is a secondary requirement
| (although an important one) to be successful. People really
| struggle to "just know stuff" if they aren't interested in the
| subject in the first place. People who aren't interested will
| settle for knowing only what things do, and not dive into how
| they work.
|
| I am interested in this stuff, and actually self taught
| (bachelors in mathematics here). I have experience with a lot of
| stuff on this list, not through work or academia, or because I
| want to make money, but through fiddling on my desktop at home. I
| too got the "coveted tech job" as a machine learning engineer,
| but I never would have if I wasn't legitimately interested in
| this stuff, studying for fun in my spare time. I have seen lots
| of people fail to progress in this field because they _don't
| care_, they just want a good job.
|
| Kind of off topic, but this is actually an integral part of my
| interviewing process. We give candidates a simple dataset to
| model, and we receive their script. The performance on the hold-out
| set is only weighted ~20%. The candidate's ability to talk
| about their process, about the internals of the model they used,
| about the feature engineering quirks to work around model
| limitations, about their parameter tuning scheme - these
| conversations reveal how much someone is actually interested in
| the field, and is a great indicator for whether or not they are
| going to be a good contributor to the team. I've had candidates
| who couldn't tell me _anything_ about how the models they used
| actually worked. PhDs included!
| patrickkidger wrote:
| This is a great point I didn't cover! "Just know stuff" tends
| to follow naturally from "care about stuff".
| stevofolife wrote:
| Amazing resource - very applied and relevant. Thank you for this
| (coming from someone in Data Science).
| dr_kiszonka wrote:
| > Please, please: learn some probability via measure theory. [And
| integration]
|
| Are there any good, gentle books on measure theory for self-
| study?
| 0xBABAD00C wrote:
| +1 for probability via measure theory & functional analysis
|
| I would add the need for geometry and topology becoming more of a
| pre-requisite for picking up modern methods:
| https://arxiv.org/abs/2104.13478
| qsort wrote:
| "Just know stuff" deeply resonates. Even much below the author's
| level, as is the case for me, technical knowledge dominates
| everything else by orders of magnitude. I'm still appalled at how
| people can manage to gather the courage to utter they don't need
| math.
|
| A great list, too. I guess I have stuff to brush up on for the
| next 10 to 20 years?
| yamtaddle wrote:
| > I'm still appalled at how people can manage to gather the
| courage to utter they don't need math.
|
| I try to (re-)learn math periodically because I feel like I
| should, like how one ought to eat one's vegetables and one
| ought to exercise, but extrinsic motivation is the thing that's
| lacking. I usually end up on recreational math puzzles or
| something, before dropping it, since at least those are fun.
| Probably made three or four cracks at it in a decade, each has
| gone the same way.
|
| I literally don't know what I'd use most of it for--I can't
| find that need, even when I try. I'm sure I could make
| something up just to have an excuse to apply what I was
| learning, but... why? Probably there are other jobs I could
| find where knowledge of math was absolutely key, but... why?
| I've been paid to write code for about 23 years, my pay's
| great, and I've repeatedly gained a reputation at companies for
| being the guy to go to for tough or low-level problems. But if
| you gave me an intro to linear algebra final or calculus 1
| final, today, I'd be lucky to score 25% on either (hell, I'd be
| lucky to get _anything_ right on the Calc final, but maybe
| there'd be a couple easy questions at the beginning--I've
| never, once, ever applied anything I learned in calc, for any
| purpose, so what little I knew about it to begin with is long
| gone)
| Dr_Birdbrain wrote:
| It's one of those things where if you don't have the math
| knowledge, the opportunities to apply it will be literally
| invisible to you.
|
| If all you have is a hammer, everything looks like a nail--
| but the converse of that is that if you have never seen a
| hammer, nails will be invisible and incomprehensible to you,
| they will just blend into the background of noise.
|
| When I learn about something, I suddenly see it everywhere.
|
| Hypothetically, if you wanted to learn it, I would recommend
| devoting the first hour of every day to it. Before your brain
| fully wakes up and starts asking why you are doing it, just
| do it.
|
| I assume you have the standard CS background, with decent
| knowledge of discrete math, calculus, linear algebra. I would
| recommend starting with a graduate-level linear algebra
| textbook. A friend of mine used to say "linear algebra is the
| new addition", it permeates everything, and knowing it well
| produces massive dividends.
| yamtaddle wrote:
| Why should I think I'll find uses this time, when I already
| burned time and money learning (some of) it once, and lost
| all that precisely because I never encountered any need for
| it? An hour a day is a _huge_ time investment for something
| that's already failed to prove its worth once. Like I'm
| sure I could go find some jobs that I can't get now because
| I lack math skills, and I'm sure some (far from all!) of
| those pay better than what I make now, but... like...
| that's equally true of jobs that require better business or
| speaking skills, and unlike linear algebra I can easily
| point to ways those skills could be beneficial in everyday
| life and in my existing job. Where's the corresponding
| immediately-useful benefit for lin. alg.?
| joxel wrote:
| Clearly there are people that don't need to know math.
| You happen to be one of them, congratulations. Though I
| know I'd be bored out of my mind if I did software work
| that only used high school level math and logic.
| yamtaddle wrote:
| Sure, some people need it, I was just responding to:
|
| > I'm still appalled at how people can manage to gather
| the courage to utter they don't need math.
|
| When... well, the vast majority of people really don't.
| They promptly forget almost everything back to about 6th
| grade math, shortly after finishing formal education,
| because _they truly never need it_, so that knowledge
| and those skills quickly rust.
|
| If these people in fact could make great use of it, such
| that it's "appalling" that they don't think they need it,
| then _that's_ probably what school should focus on
| teaching, at least for non-math-majors. Laser-focus on
| application in everyday life. Especially in k-12. If it's
| _actually_ useful and people are being forced to spend
| hundreds to thousands of hours learning junior high, high
| school, and college math, but then losing most of it
| because they never see any use for it, that's a
| tremendous failing of curriculum that should be addressed
| as directly as possible. If such a program wouldn't
| succeed because it's _actually true_ that most people
| don't really need most of that math for anything, then we
| ought not be "appalled" at their correctly assessing that
| truth.
| joxel wrote:
| I see what you're saying. I don't think it's a failure of
| the curriculum to teach people things when they are young
| they don't end up using. Certainly a 13 year old isn't
| going to know what their future career path/interests
| will be (some do, but most don't) and shouldn't let them
| shut doors down the road at such a young age. I think at
| this point high school has devolved to the point of just
| giving everyone the basic broad skills that they could
| feasibly succeed at any college major. The seniors that
| have already decided they just want to build houses all
| day can complain during math, we all heard it, "when will
| I ever use this", but the problem is the answer is not
| "never" and it's not "always", it's "we don't know, but you
| may need it, and closing those doors now will limit your
| future potential".
| patrickkidger wrote:
| This is a great description, thank you! What you've said is
| precisely the reason I emphasised knowing a bit of
| foundational math, e.g. topology.
| the_only_law wrote:
| > I try to (re-)learn math periodically because I feel like I
| should, like how one ought to eat one's vegetables and one
| ought to exercise, but extrinsic motivation is the thing
| that's lacking.
|
| Personally, if it were "just relearn calculus and move from
| there" I think I'd be able to motivate myself. At that point
| I'm still pretty close to the problems and tasks that
| interest me. But the reality is I spent much of middle school
| and high school wasting my time doing things other than math,
| so in reality I'd have to go much further back and relearn
| all the prerequisite stuff, and the prerequisite stuff to
| that, etc. By that point I'm so divorced from the reason I
| wanted to try and relearn math in the first place that I just
| get bored and give up.
| bitL wrote:
| ...and then you meet your new boss who has no clue about
| anything and makes your life hell when his imagination doesn't
| match the "stuff you know".
| albertzeyer wrote:
| I have learned many of these things at some point. Unfortunately,
| I have also forgotten most of it. Some of it I still remember to
| some degree, or just the basic idea, and can probably reconstruct
| it given some time, or easily look it up and remember. However,
| many of the things I have completely forgotten, and it would take
| more time to understand it again.
|
| I am not sure I can keep so many things active in my memory +
| also other things, depending on what I am working on in the last
| years. If my work in the last few years does not need some of
| this knowledge, I keep forgetting it.
|
| Maybe my memory is just bad.
|
| I think many of the things listed here are somewhat specific to
| what the author has worked on, so that it's easier for him to
| really know all this.
|
| Maybe the point of this post is also more generic: Try to have an
| active memory over a diverse set of fields.
| patrickkidger wrote:
| Haha, so I actually have an atrocious memory! Famously so
| amongst my friends, I never remember what we've discussed.
|
| When I wrote this list I certainly wasn't
| expecting/recommending all of this to stay in the reader's head
| forever.
|
| Rather: if you've worked with something deeply at one time,
| then -- even if you've forgotten the details -- you still can
| pattern-match on it later. And then look up whatever you've
| forgotten!
| [deleted]
| jackblemming wrote:
| I know most of the stuff listed there and do not have a published
| textbook or papers. It sounds like the author achieved success
| because they work hard on things they're passionate about.
| Knowledge and published works are a byproduct of that.
| LeanderK wrote:
| I think the point is that knowing this stuff makes it easier to
| write a paper because you are not as easily out of breath. It's
| not guaranteeing you a good paper and phd, that's your part.
|
| Maybe you too could be a successful phd-student? :) At least
| you know the basics!
| version_five wrote:
| Thinking about this, I'd be also interested to hear what the
| author learned and didn't find useful over his phd. Is this a
| list of most of what he ended up learning, (which could
| potentially then have a lot of confirmation bias in it) or is it
| curated from the maze of blind alleys he went down?
| patrickkidger wrote:
| Ooh, that's a great suggestion!
|
| So one thing I learned a _lot_ of in my PhD (for literally a
| whole year), that I literally never needed, was functional
| analytic methods for PDEs. Stuff like Moser iterations / the
| De Giorgi-Nash-Moser theorem, etc.
|
| The finer details of Turing machines have never really helped
| me, although in my case that's probably the exception as I
| imagine that's still pretty important.
|
| On a more ML note, I have literally never needed SVMs. (And
| hope I never get asked about them, I've forgotten everything
| about them haha.)
|
| I think there's a lot of other stuff I could add to the "just-
| don't-know-stuff" list!
|
| (And to answer your last question: this list is curated, and
| based on the criteria of (a) is it useful, and (b) is it widely
| applicable.)
| jll29 wrote:
| What you need to know depends largely what you are working on -
| and that also holds for a successful Ph.D. candidate. Of course,
| it is good advice to be open-minded beyond one's narrow field of
| inquiry, as the OP suggests, but a long list of maths topics may
| not be very helpful to beginners.
|
| The Ph.D. period is the time when you have some time available to
| acquire additional skills, and my advice is: try to strengthen
| those aspects of your education where you currently have the most
| glaring deficits. For instance, take a statistics course if that
| is your weak spot, or learn a foreign language if you have evaded
| that topic so far in your educational journey.
|
| Read the main textbooks of your field and read and re-read ALL
| relevant papers for your actual Ph.D. topic, once you've been
| able to identify it (which may well take you most or all of your
| first year). Take detailed notes because nobody can remember most
| of that much highly concentrated advanced material. Try to find
| gaps: ask questions and find out if people have tried to answer
| them yet or not. Interact with others e.g. at conferences, after
| meeting some people there by attending and networking in a prior
| year.
| SQueeeeeL wrote:
| > Read the main textbooks of your field and read and re-read
| ALL relevant papers for your actual Ph.D. topic, once you've
| been able to identify
|
| I'm gonna be real as someone in grad school. Basically no PhDs,
| grad students, or professors I know read full textbooks. I
| hear these ideas a lot, and they often sound like one of those
| Instagram influencer diets that are completely unreasonable if
| you have any constraints in your life, and it's mostly been
| used to gatekeep "real" scientists. Be well studied and
| knowledgeable about your problem domain, but you have a finite
| lifespan, so never feel bad that you aren't "educated enough".
| patrickkidger wrote:
| As the author of this article... I have read maybe one
| textbook cover-to-cover in my life. :D
|
| (Hands-on machine learning, by Geron, back when I made the
| jump math->ML.)
| mkl wrote:
| I have an applied maths PhD but no machine learning. Would
| you still recommend Geron for that purpose?
|
| BTW, I spotted a typo in the first paragraph of your thesis
| abstract: "neural networks and differential equation are
| two sides".
| SQueeeeeL wrote:
| Elements of Statistical Learning is my cover to cover read
| :), I just think it isn't a requirement to be a "good
| student"
| steppi wrote:
| Knowing things is good, but I think the real benefit of a PhD is
| in developing the skill to learn new things quickly and fill in
| any gaps as needed. It's important to get practice learning many
| things in depth but the goal isn't to become a storehouse of
| knowledge, it's to develop the ability to assimilate existing
| knowledge and apply it to figuring new things out.
|
| The amount of topics I've studied in depth dwarfs this list (and
| the same is certainly true of its author) but the set of things I
| could teach a class on today without preparation is much smaller.
| The important thing is that if I have a problem, I can use the
| impressions from all I've learned before to get a sense of where
| to look next. My memory isn't great but it doesn't matter because
| I can refresh, learn, and figure things out as needed.
| patrickkidger wrote:
| Completely agreed!
| bayesian_horse wrote:
| That's what I do. I drink and I know stuff.
| gautamcgoel wrote:
| I just wrapped up a machine learning PhD at Caltech (now doing a
| postdoc in ML at Berkeley) and I disagree strongly with this
| article. What matters isn't knowing a bunch of random stuff, but
| rather writing/speaking skills, a willingness to learn new
| things, perseverance in the face of setbacks, creativity, having
| enough EQ to navigate the advisor-advisee relationship and
| departmental politics, and most of all, an ability to follow
| through and actually get things done. These "intangible" skills
| are far more important than having any specific knowledge.
| mi_lk wrote:
| This. The article is meh in terms of advice, and even serves
| more as self-promotion (which apparently he is good at and
| that's something to learn). Maybe it somehow works for him and
| it's fine, but it's no more useful than a table of contents.
| patrickkidger wrote:
| I think those are all things you need for _life_.
|
| But what do you need for _specifically a PhD_? I argue that
| "knowing stuff" is what is necessary -- and that indeed it's
| essentially the purpose of the whole academic institution.
| ModernMech wrote:
| A Ph.D. isn't about knowing things, it's about doing
| research. Recipients of the degree are given the title
| "doctor" not for what they know, but because they are first
| and foremost teachers of knowledge (from the Latin _docere_,
| meaning to teach).
|
| "Knowing stuff" is enough to get you to the point where you
| can formulate a good research question, but being a good
| researcher is a much broader skillset than just knowing
| things. And consequently, just knowing things won't get you a
| Ph.D., because getting one requires you to talk about your
| research. A lot. Like, all the time. And after all, the last
| step in getting the Ph.D. is called a "defense", not an
| "examination", because you are not there to tell anyone what
| you know -- you are there to _defend_ what you know, and
| that's a different skillset than gaining knowledge or even
| disseminating knowledge.
|
| I guess all that is to say, you could know everything in the
| world, but if you lack the skills to tell anyone else about
| that knowledge, you'll never get a Ph.D.
| mochomocha wrote:
| The tone of the post is pretty off putting. It reads as a "look
| how smart I am!" article - the author doesn't even pretend to be
| modest.
| burnished wrote:
| Should they be? In this case wouldn't it end up being false
| modesty?
|
| Like, if this person can't say "Look at me, I am UNUSUALLY
| INTELLIGENT!" then who can?!
| mochomocha wrote:
| > Like, if this person can't say "Look at me, I am UNUSUALLY
| INTELLIGENT!" then who can?!
|
| No one, that's my point. If academia has not taught the
| author that his intelligence isn't unusual, the workforce of
| his new employer certainly will. Listing github stars and
| twitter followers in the second paragraph as an achievement
| to me betrays a lack of maturity and a need for external
| validation. On the bright side, being good at self-promotion
| when entering a mega-corporation will make sure he has a
| great career in front of him.
| patrickkidger wrote:
| I'm sorry it came across this way for you! Rather, I'm just
| outlining why folks seem to keep asking me this question.
| :)
| burnished wrote:
| I think your post combined with the responses to it are a
| great example of how messages become stripped of all
| nuance, even by otherwise very educated/intelligent
| people. A lot of the critique here ignores the nuance and
| specificity you give just in the opening paragraphs.
| burnished wrote:
| I see, you read it as a form of self aggrandizement. I
| think your reading was wrong (in addition to being unkind),
| but wouldn't disagree with the core value under discussion.
|
| I think a fair test to apply is whether or not the thing is
| a fact - if it is a fact, and you think it is impressive,
| that might say more about you than the author. I feel like
| otherwise you are asking people to self censor too strongly
| for fear of betraying some perverse sense of modesty that
| does not allow for anyone to have done anything worth
| noticing at all.
| p1esk wrote:
| Even if you can does not mean you should.
| ding_dang wrote:
| Why not?
| [deleted]
| j7ake wrote:
| The problem with this list, although impressive in scope, is that
| it is not clear how /deeply/ one should know each of those items
| on the list.
|
| One can write several papers for each one of those items on the list.
|
| Knowing the concept well enough to pass a job interview is a much
| lower bar than well enough to innovate and push new knowledge.
| sillysaurusx wrote:
| If anyone's reading over this and feels "Gosh, I'll never be an
| ML dev; this is way too much": I don't know most of that list,
| and still manage to be a productive researcher. I learn what I
| need as I go.
|
| That's probably the optimal strategy. I'm skeptical of first-
| principles learning. It's great to immerse yourself in theory,
| but when you've gone all the way to "topology" you've probably
| gone beyond the limit of what most ML devs care about on a day-
| to-day basis.
|
| It's still useful to know. I've applied lots of ideas from other
| fields. But can you force that knowledge by forcing yourself to
| study other fields? Maybe. We all have a finite amount of time
| though.
|
| That said, lots of items on this list are key, and it'd be worth
| ranking them. There's no need to memorize formulas for Adam, but
| knowing the concept of momentum-per-weight is pretty crucial.
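|
| For anyone who wants the momentum-per-weight idea spelled out,
| here is a minimal sketch of one Adam step in plain Python (an
| illustration, not from the thread). The function name is
| hypothetical; the defaults are the commonly quoted ones. The
| key point is that every weight carries its own running mean of
| gradients `m` and of squared gradients `v`.
|
```python
import math


def adam_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to weights w given gradients g.

    m and v are per-weight running averages; t is the 1-indexed step count.
    Returns the updated (w, m, v) as new lists.
    """
    new_w, new_m, new_v = [], [], []
    for wi, gi, mi, vi in zip(w, g, m, v):
        mi = beta1 * mi + (1 - beta1) * gi        # momentum for this weight
        vi = beta2 * vi + (1 - beta2) * gi * gi   # gradient scale for this weight
        m_hat = mi / (1 - beta1 ** t)             # bias correction for early steps
        v_hat = vi / (1 - beta2 ** t)
        new_w.append(wi - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_w, new_m, new_v


if __name__ == "__main__":
    w, m, v = [1.0, -2.0], [0.0, 0.0], [0.0, 0.0]
    g = [0.5, -4.0]
    w, m, v = adam_step(w, g, m, v, t=1, lr=0.1)
    print(w)
```
|
| On the very first step each weight moves by roughly `lr` in
| the direction opposite its gradient, whatever the gradient's
| magnitude, because the update is normalized per weight.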
| thwayunion wrote:
| _> Gosh, I'll never be an ML dev_
|
| FWIW very few of the ML devs I work with have PhDs. I'm not
| sure aspiring ML devs are the intended audience.
| extasia wrote:
| The author and you seem to be talking to different audiences,
| you're talking about ML eng and OP is talking about ML
| researchers.
|
| Researchers absolutely need to know a lot, not necessarily all
| the way to topology or w/e but definitely the underlying
| mathematical principles in order to advance the field (IMO).
| sillysaurusx wrote:
| You're right, I wasn't clear. But I'm a full-time ML
| researcher. In terms of advancing the field, my contributions
| so far have been modest, but they're there. Some of my
| favorite ideas were swarm training
| (https://battle.shawwn.com/swarm-training-v01a.pdf), stop
| loss for stabilizing GAN training
| (https://twitter.com/search?q=from%3Atheshawwn%20stop%20loss&...),
| and getting GPT to play chess
| (https://www.theregister.com/2020/01/10/gpt2_chess/) back
| when that was a shocking idea. And in terms of cited
| research, https://arxiv.org/abs/2101.00027 has been the most
| impactful.
|
| There's significant overlap between ML research and ML dev.
| If I can do it without most of that list, it should give
| people here some hope of joining the field without needing to
| immerse themselves in theory.
| p1esk wrote:
| I'm an ML researcher, and I'm with sillysaurusx on this. I
| actually know most of the things on the list, but only
| because my research is mainly in model compression and
| computational efficiency. Recently I've been interested in
| adapting diffusion models to generate music (in the raw audio
| domain), and I'd say only 5 out of 18 bullet points in the ML
| section are relevant - the rest range from "nice to know" to
| "irrelevant".
| bayesian_horse wrote:
| I feel like I know up to 75% of that stuff and can't get hired
| even for entry level data science.
| ad404b8a372f2b9 wrote:
| I keep searching for how to be a good researcher but I haven't found
| any answers.
|
| I know almost all of this except for weirdly arbitrary/specific
| stuff (you really don't need to know Haskell for an ML PhD). It's
| all pretty basic: half of it you learn during any CS Master's
| degree, and the other half are models and concepts that have been
| popular enough in the last few years that you would have read the
| papers and possibly implemented them if you keep up with the
| field.
|
| This hasn't helped me at all; my PhD has gone horribly and I'm
| not entirely sure why. The first year I wasted on an EEG-related
| review paper before realizing that EEG data is garbage, the
| second year I wasted on an original method that I wanted to
| pursue but it didn't perform well and got scooped. The third year
| (which ended up being the fourth year because of severe health
| issues) I was burnt-out and produced nothing of value.
|
| I don't know how to produce good original research. I know how to
| program like nobody's business, better than any of my lab
| colleagues but this has done nothing to help me. I can implement
| a method in a night, I can implement tons of ideas in a year but
| if they don't perform better on the important metrics what's the
| point? I know the math and the ML side of things but this hasn't
| given me any insights; usually when I have an idea, I realize
| it's already been done while doing the literature review.
|
| It's all a big mystery to me, I think maybe the pressure of
| holding an industry job and having to finish the PhD before my
| funding ran out prevented me from experimenting freely but it
| might just be a convenient excuse.
| patrickkidger wrote:
| I'm sorry to hear that things didn't go so well for you!
|
| FWIW being able to "program like nobody's business" is still
| really really valuable. It's why I dedicate such a large chunk
| of the post to software dev skills. :)
| brilee wrote:
| As somebody from the U.S., I'm struck by how much people from the
| U.K. value a strong background in theory. At Google my experience
| has been "what matters is how smart you are, whether you
| understand the problem we're trying to solve, and whether you
| have creative solutions". When I applied for DeepMind some time
| ago, I was grilled, rapid-fire, a hundred questions covering the
| breadth of a rigorous undergraduate education in linear algebra,
| stats, ML, calculus, etc. They seemed content to measure my
| intelligence by seeing how rapidly and deeply I had assimilated
| standard courses, rather than by seeing how I approached a
| problem I'd never seen before.
|
| This guy is obviously talented, but also he comes from a
| tradition of optimizing for this kind of academic culture. You
| would be similarly weirded out by the leetcode fetish in tech if
| that's not what you were used to. I think that's what many
| commenters are missing.
| xmprt wrote:
| I've noticed that cultural difference too. I think there are
| things to take away from both approaches. Extremely
| knowledgeable people with a lot of background in theory should
| get better at understanding and creatively applying that
| knowledge to new problems. And people who are great at problem
| solving should learn more theory instead of expecting
| themselves to just materialize the best solution out of thin
| air.
| foobarbecue wrote:
| 2 years? Sheesh. This is the type of stuff that makes me think
| genius is a biological thing.
| sampo wrote:
| > 2 years?
|
| Half of the items, and almost all in the "Mathematics" section,
| you would have learned during your BSc/MSc (if it's in applied
| math or physics, and Python programming is your hobby). I can't
| find the author's CV, but there are 5 years between him submitting
| his MSc thesis in 2017 and submitting his PhD thesis in 2022. Maybe
| he studied ML for all those 5 years, or maybe he took off 2
| years to travel the world, who knows.
| ipnon wrote:
| You can learn a lot when you're in your early 20s, single,
| ambitious, and getting a stipend from a university to do
| nothing but study for 16 hours a day.
| j7ake wrote:
| In the UK, 3 years is common.
| patrickkidger wrote:
| Right! This was the situation for me.
| [deleted]
| supernova87a wrote:
| I was about to comment on the short PhD in "2-and-a-bit years"
| -- and how the UK expectation for PhD program duration is so
| different from that in the US.
|
| And it wasn't to say that he's a genius. It's that the program
| is planned to be shorter, and in fact, there's less funding
| available to go longer even if you wanted to.
| jahewson wrote:
| One must consider what he spent the previous 22 years doing.
| WithinReason wrote:
| Maybe not necessary for a PhD, but you should also have some idea
| about how HW works.
| bitwize wrote:
| "The stuff is what the stuff is, brother." --from a James Mickens
| talk on machine learning,
| https://www.youtube.com/watch?v=ajGX7odA87k&t=13m40s
|
| Can confirm, the way ML is used in many businesses is like an egg
| drop: You open up Jupyter, load in some data, and play around
| with various models until you find one that fits the data, then
| use it to try to predict future data. If the future results
| comport with the model, congratulations: your egg is safe, at
| least for drops of that height.
___________________________________________________________________
(page generated 2023-01-27 23:01 UTC)