hngopher.com

       [HN Gopher] Jax vs. Julia (Vs PyTorch)
       ___________________________________________________________________
        
       Jax vs. Julia (Vs PyTorch)
        
       Author : sebg
       Score  : 63 points
       Date   : 2022-05-04 17:36 UTC (5 hours ago)
        
 (HTM) web link (kidger.site)
        
       | longemen3000 wrote:
       | I feel called out on the academic part, haha. I simply want to
       | code state-of-the-art (thermodynamic) models, and at least Julia
       | helps by providing easy testing and publishing infrastructure.
       | But obviously we can't compete with a corporation in code quality
       | (we are trying!)
       | 
       | Unrelated, but for small sizes, I really prefer to use forward
       | mode in Julia (via ForwardDiff.jl) instead of Zygote. The
       | overhead of reverse ADing over an arbitrary function with
       | mutation is not worth it.
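
A rough illustration of the forward-mode idea mentioned above, written as a minimal dual-number sketch in plain Python. This is not how ForwardDiff.jl or Zygote are implemented; the names `Dual`, `derivative`, and `f` are made up for this example. Each value carries its derivative along with it, so one forward pass yields the derivative with respect to one input, which is roughly why forward mode tends to be cheap when there are only a few inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative (hypothetical helper type)."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def derivative(func, x):
    """Evaluate func at x with a seed derivative of 1 and read off d(func)/dx."""
    return func(Dual(float(x), 1.0)).deriv


def f(x):
    # Example function: f(x) = 3x^2 + 2x + 1, so f'(x) = 6x + 2
    return 3 * x * x + 2 * x + 1
```

Calling `derivative(f, 2.0)` evaluates `f` at x = 2 and reads the derivative off the same pass; no tape or backward sweep is recorded, which is the overhead the comment is avoiding for small problems.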
        
         | tagrun wrote:
         | In the context of neural networks with differential equations
         | (which appears to be the original poster's field), the trade-
         | off depends:
         | https://diffeqflux.sciml.ai/dev/ControllingAdjoints/
        
       | machinekob wrote:
       | I strongly agree on readability. In my opinion it's because
       | academia people live in "bubbles" and assume everyone knows
       | what domain-specific terms and Greek letters mean, so it's
       | easier for them to read some omega than, for example,
       | learning_rate or lr.
       | 
       | But for us mortals who cross multiple domains, it's extremely
       | frustrating to read fully math-based notation without any extra
       | info about the notation in packages/functions etc. Debugging
       | multiple sub-packages gets too time-consuming, as you have to
       | learn each person's style of writing code and the whole
       | scientific notation, and get domain knowledge, before you can
       | even touch the code.
        
         | aaplok wrote:
         | > Academia people live in "bubbles" and assume everyone
         | knows what domain-specific terms and Greek letters
         | 
         | Naming things by their English name is not more universal than
         | using Greek letters. It's just serving another group of people
         | who live in a different bubble.
        
           | belval wrote:
           | Yes and no, the example that the author gives is actually a
           | very good one:
           | 
            | > Many Julia APIs look like Optimiser(η=...) rather than
            | Optimiser(learning_rate=...). This is a pretty unreadable
            | convention.
            | 
            | The learning rate is a well-known name that basically
            | everyone will understand. On the other hand, "η" or eta is
            | not even used everywhere in the literature, with some papers
            | using alpha instead.
           | 
           | This just looks clever, it's a pretty bad parameter name.
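
To make the naming question concrete, here is a hypothetical sketch in Python (which also accepts Unicode identifiers). `sgd_step` and `sgd_step_greek` are illustrative names, not part of Flux, JAX, or PyTorch; both perform the same plain gradient-descent update and differ only in how the keyword is spelled.

```python
def sgd_step(weights, gradients, learning_rate=0.01):
    """One plain gradient-descent update with a spelled-out keyword."""
    return [w - learning_rate * g for w, g in zip(weights, gradients)]


def sgd_step_greek(weights, gradients, η=0.01):
    """The same update, with the keyword spelled as the Greek letter eta."""
    return sgd_step(weights, gradients, learning_rate=η)


# Call sites read differently even though the arithmetic is identical.
spelled_out = sgd_step([1.0, 2.0], [0.5, -1.0], learning_rate=0.5)
greek = sgd_step_greek([1.0, 2.0], [0.5, -1.0], η=0.5)
```

Both calls produce the same updated weights; the disagreement in the thread is only about which call site is easier to read, type, and search for.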
        
             | melissalobos wrote:
              | > The learning rate is a well-known name that basically
              | everyone will understand
              | 
              | Absolutely! Because as we all know, everyone speaks
              | English.
              | 
              | The GP's point was that Greek letters are used in lots and
              | lots of papers, even ones written in other languages. I have
              | read quite a few papers in Japanese that used exactly the
              | same conventions with respect to the Greek and Latin letters
              | used.
        
               | 127 wrote:
               | Google Translate is one click away. I can easily
               | translate both Japanese and Chinese comments and variable
               | names to get the gist of it. Using single hieroglyphs for
               | it makes the entire endeavor impossible.
        
               | belval wrote:
               | How many researchers in the ML/DL community don't speak
               | English? I don't have hard numbers but I highly doubt
               | that it's a significant proportion. What is the reach of
               | your Japanese papers when almost no-one outside of Japan
               | can read Japanese?
               | 
                | Even China, despite its best efforts to de-westernize
                | its culture, still uses English in its research
                | papers.
                | 
                | And if all the above wasn't enough, Julia's libraries are
                | still all in English, so if a hypothetical researcher's
                | English is so poor that they don't know what "learning
                | rate" is, I'd venture that they'll have trouble
                | programming in Julia/JAX/PyTorch.
        
               | SiempreViernes wrote:
               | How many don't speak it as a native language? Quite a lot
               | as most of the world uses something else as their primary
               | language.
               | 
                | If you're instead asking how many can struggle through
                | an English text supported by machine translators, then
                | that's clearly almost everyone.
               | 
               | There's very often a significant gap between the ease
               | with which the native and the foreign language can be
               | used for reasoning, but surely I don't need to point that
               | out since any bilingual person knows this.
        
         | agumonkey wrote:
         | I understand both camps, but I believe these are superficial
         | problems. It's like worrying about the comfort of the seats in
         | the operating room of a nuclear plant.
        
           | abakus wrote:
            | Superficial, you say? How about I name these in Chinese in
            | my package?
        
             | melissalobos wrote:
             | Sure, just try to properly document what it does. There are
             | some characters that are easily confused at first glance
             | even for native speakers, so be sure to use some common
             | sense.
        
           | nextos wrote:
           | Exactly. I actually find Julia's ecosystem (not the language)
            | _way_ more approachable than Python's.
           | 
           | In Python, most libraries are big monoliths. Whereas in
           | Julia, libraries are small and composable. Furthermore, it's
           | the same language all the way down.
           | 
            | Python's libraries are superb, but the learning curve to
            | develop them (not to use them) is really steep.
        
             | jstx1 wrote:
             | I don't understand. What do you mean by "learning curve to
             | develop" an existing Python library?
        
         | bobbylarrybobby wrote:
         | And if you ever _do_ want to edit the code, you have to know
         | the name of every non-ASCII symbol the codebase uses if you
         | want to type out those same symbols without copying and pasting
         | them. If you're not familiar with the material, entering a
         | character like ξ can be a real challenge, and is actually more
         | keystrokes than just typing "xi".
        
           | leephillips wrote:
           | For me it's one keystroke, with the dead-Greek modifier key.
           | 
            | I'll do it right here, by typing <G-x>: ξ
        
           | tagrun wrote:
           | Difference is 1 keystroke: you type \xi. Why is that a "real
           | challenge"?
        
             | tlb wrote:
             | What editors does this work in?
        
               | tagrun wrote:
               | Juno, Jupyter support it out-of-the-box. With plug-ins:
               | VS Code (unicode-latex), Atom (latex-completions), Emacs
               | (which has TeX input), Vim (latex-unicoder), ...
        
               | leephillips wrote:
               | Any editor, on Linux, if you have your keyboard set up to
               | type Greek letters and the Unicode symbols that you use
                | frequently. I can do it directly in the comment box: ξ
        
             | animal_spirits wrote:
              | The challenge is for a person like me, who doesn't know
              | off the top of my head which Greek letter ξ is. So for each
              | of these symbols I'd have to google it and learn it, or
              | keep a notepad from which I can copy and paste the symbol.
        
               | tagrun wrote:
               | So you don't know the Greek alphabet, but write high-
               | performance computing code involving non-trivial math?
               | (Julia's main use case is HPC)
        
               | ShamelessC wrote:
               | What? Did you not read the parent of this thread?
               | 
                | > strongly agree on readability. In my opinion it's
                | because academia people live in "bubbles" and assume
                | everyone knows what domain-specific terms and Greek
                | letters mean
                | 
                | In any case, I program HPC stuff myself with PyTorch and
                | no - I don't know the Greek alphabet and probably don't
                | understand "non trivial math". The assumption that these
                | people can't contribute is pretty off-putting honestly.
                | More engineers would join such efforts if there wasn't so
                | much gatekeeping.
        
               | belval wrote:
                | The author's example is using the Greek letter "eta"
                | instead of spelling out "learning_rate"; this is a pretty
                | damning example.
        
               | dTal wrote:
               | You should probably take the time to learn the names of
               | the Greek letters. There are only 24 of them and they're
               | even related to the English ones. It's not a huge time
               | investment, and it's probably worth it if you work in
               | engineering.
        
       | blindseer wrote:
       | Another massive frustration for me is that Julia has no formal
       | way to say "here are public functions and these are private" but
       | does have a completely orthogonal way of saying "these are
       | functions that will populate your global namespace if you use
       | `using`", i.e. `export variable_name`, and people absolutely
       | confuse the hell out of these. I don't think there's even
       | agreement in the Julia community if you should use `export` for
       | your public API or not.
       | 
       | And if you misspell the exports or change the variable in
       | question, Julia won't even warn you about it. That is straight
       | crazy behavior to me, and I still don't understand how that
       | hasn't been changed.
       | 
       | The `using` + `import` packages in Julia combined with how
       | `export`s work make it SUCH a confusing and frustrating
       | experience for beginners in Julia.
       | 
       | I personally like mathematical symbols when I'm writing and
       | reading code in my domain, but I do feel very lost when I'm
       | reading Julia code outside of my area of expertise. All my
       | colleagues hate it too (hard to grep, hard to type if you don't
       | know the math and are just a software engineer) and I'm coming
       | around to the idea of not using it or documenting it explicitly.
       | 
       | The fact that mutable structs are easier to use but immutable
       | structs are more performant, the lack of composition of fields
       | the way Go handles it, the lack of traits or interfaces, the
       | sorry state of compilation time, the nonexistent tooling all
       | lead a beginner / intermediate Julia developer in the wrong
       | direction in my opinion. It's very easy to write code that is
       | straight up broken or not efficient in Julia, and that's probably
       | why I won't pick it for a big project going forward.
       | 
       | But I'm still keeping my eye on it. Maybe in 5 years it'll be the
       | language for a lot of the jobs?
        
       | adgjlsfhk1 wrote:
       | """ Even in the major well-known well-respected Julia packages -
       | I'll avoid naming names - the source code has very obvious cases
       | of unused local variables, dead code branches that can never be
       | reached, etc. """ Please name names. That way we can fix stuff.
       | Other than that, great post!
        
         | lern_too_spel wrote:
         | I think the author is pushing for better code quality tooling
         | in Julia instead of having people manually fix these problems.
        
           | adgjlsfhk1 wrote:
            | We absolutely should, but in the short term, fixing issues is
           | good.
        
       | SemanticStrengh wrote:
       | The most next-gen autodiff library is probably
       | https://github.com/breandan/kotlingrad because of its features,
       | ergonomics and type safety.
        
         | martinsmit wrote:
         | Idk, Enzyme is pretty next gen, all the way down to LLVM code.
         | 
         | https://github.com/EnzymeAD/Enzyme
        
       | tagrun wrote:
       | In his item #1, he links to
       | https://discourse.julialang.org/t/loaderror-when-using-inter...
       | The issue is actually a bug in Zygote, a Julia package for
       | auto-differentiation, and is not directly related to the Julia
       | codebase (or the Flux package) itself. Furthermore, the
       | problematic code is working fine now, because DiffEqFlux has
       | switched to Enzyme, which doesn't have that bug. He should first
       | confirm whether the problem he is citing is actually a problem
       | or not.
       | 
       | Item #2, again another Zygote bug.
       | 
       | Item #3, which package? That sounds like hyperbole, an
       | extrapolation from a small sample, and "I'll avoid naming names"
       | is a lazy excuse that would hide this. It is similarly easy to
       | point to poorly written (or poorly documented) JAX code or Python
       | code as well, so that doesn't prove that "Julia is lacklustre and
       | JAX shines". Also, as an academician, I strongly disagree that
       | learning_rate=... is better than η=..., but that's a matter of
       | convention & taste, with no bearing on the correctness or
       | performance of the package/language. It's bikeshedding. I agree
       | that errors are usually not "instructable" in Julia ML packages
       | (which needs to be improved), so monkey typing is less likely to
       | succeed.
       | 
       | Item #4 is such a nitpick. Sure, Julia may not have a special
       | syntax for that particular fringe array slicing like NumPy does
       | and you'll instead need to make a function call, but
       | matrix/tensor code in NumPy is usually filled with calls to zips
       | and cats with explicit indices, whereas in Julia, no explicit
       | indexing usually needs to be done. Also, one can nitpick
       | similarly in the opposite direction: Julia has many language
       | features lacking from JAX or Python, why not talk about those as
       | well?
       | 
       | I'm not a huge fan of Julia either (mainly because of its
       | garbage-collected nature), but this is such a low-effort
       | criticism of it.
        
         | blindseer wrote:
         | I've been using Julia since 2017 and still do on a day-to-day
         | basis, and I agree with the author in a lot of cases, even his
         | subjective gripes about naming conventions.
         | 
         | The author's biggest criticism is that Julia doesn't have
         | tooling to make the developer experience better. There's
         | Revise, JuliaFormatter, LanguageServer and JET, but the
         | development experience in Python is enviable. There's like 3
         | different REPLs, at least two competing linters and auto
         | formatters. It's okay to admit that these are places Julia is
         | lacking.
         | 
         | I think your kind of response to criticism about Julia is what
         | gives the Julia community a bad name, in my opinion. What is
         | wrong with saying these things suck and need improvements?
         | Would you rather Julia not improve and stay the way it is right
         | now forever? Surely I hope not.
        
           | tagrun wrote:
           | What exactly do you think I said about Julia's linters and
           | auto-formatters?
        
         | nullstyle wrote:
         | > but that's a matter of convention & taste, with no bearing on
         | the correctness or performance of the package/language. It's
         | bikeshedding.
         | 
         | I heartily disagree with that. Bikeshedding is about focussing
         | on the trivial, and the symbols we choose in our codebases are
         | hardly trivial, and indeed many of us regard naming things as
         | one of the central problems in programming[1].
         | 
         | [1]:https://medium.com/hackernoon/naming-the-things-in-
         | programmi...
        
           | tagrun wrote:
            | Unlike the example in the link you give, η isn't a generic
            | random name like a, x that can mean anything. If you ever read
            | a paper on stochastic gradient optimization, you'd know that
            | η means learning rate in the context.
            | 
            | It is bikeshedding because it is analogous to insisting that
            | using "angle" instead of "θ", or "radius" instead of "r" in
            | a 2D geometry library is superior and takes your code from
            | being lackluster to something that shines (in the words of
            | the original author), while not having anything useful to say
            | about the mathematical/technical aspects of the code
            | itself.
           | 
           | Here is the definition of bikeshedding:
           | 
           | > The term was coined as a metaphor to illuminate Parkinson's
           | Law of Triviality. Parkinson observed that a committee whose
           | job is to approve plans for a nuclear power plant may spend
           | the majority of its time on relatively unimportant but easy-
           | to-grasp issues, such as what materials to use for the staff
           | bikeshed, while neglecting the design of the power plant
           | itself, which is far more important but also far more
           | difficult to criticize constructively. It was popularized in
           | the Berkeley Software Distribution community by Poul-Henning
           | Kamp[1] and has spread from there to the software industry at
           | large.
           | 
           | from https://en.wiktionary.org/wiki/bikeshedding
        
             | 00ajcr wrote:
             | My interpretation of the point in the blog post was that
             | explicitly spelling out variable names makes APIs and the
             | underlying code much more accessible to a wider audience.
             | 
              | Sure, there'll be a subset of users of these libraries that
              | have read ML/textbooks and are familiar with what η means
              | in this context.
              | 
              | Today, many (most?) users of ML libraries will probably not
              | know what η means without looking it up. Adhering to
              | mathematical notation puts up an unnecessary barrier to
              | using the API/code and ultimately limits wider
              | engagement/collaboration.
              | 
              | To attract a bigger slice of the ML community, choosing
              | names that the ML hobbyist can read, understand and use
              | without pause is the better path forward.
        
               | tagrun wrote:
                | You are saying most people don't know what η in that
               | context means (=people who likely haven't read a book or
               | a paper on stochastic gradient, and don't know how it
               | actually works), but they would somehow magically figure
               | out what it actually does if we call it "learning_rate"
               | in ASCII letters. How does that work?
               | 
               | FYI, the documentation of the function
               | https://fluxml.ai/Flux.jl/stable/training/optimisers/
               | explicitly says it is learning rate:
               | 
                | > Learning rate (η): Amount by which gradients are
               | discounted before updating the weights.
               | 
               | so this is already explicit to anyone who reads the
               | documentation. The quibble in the post is about the named
               | parameter.
        
               | jstx1 wrote:
               | > How does that work?
               | 
                | You can look up "learning rate" much more easily than
                | you can look up "what is this Greek letter on my screen"
                | followed by "what is the use of this Greek letter in my
                | context" and only then followed by searching for
                | "learning rate".
               | 
               | More importantly, it's possible to know what a learning
               | rate is without knowing what Greek letter it's commonly
               | denoted as. Especially since mathematical notation is so
               | inconsistent across authors. I want less ambiguity in
               | code, not more. Explicit is better than implicit.
               | 
                | Mathematical notation is notorious for being an absolute
                | mess of inconsistencies. Who in their right mind looked
                | at it and went "yep, I want more of this in my source
                | code"?
        
       ___________________________________________________________________
       (page generated 2022-05-04 23:00 UTC)