[HN Gopher] Jax vs. Julia (Vs PyTorch)
___________________________________________________________________
Jax vs. Julia (Vs PyTorch)
Author : sebg
Score : 63 points
Date : 2022-05-04 17:36 UTC (5 hours ago)
(HTM) web link (kidger.site)
(TXT) w3m dump (kidger.site)
| longemen3000 wrote:
| I feel called out on the academic part, haha. I simply want to
| code state-of-the-art (thermodynamic) models, and at least Julia
| helps by providing easy testing and publishing infrastructure.
| But obviously we can't compete with a corporation on code quality
| (we are trying!)
|
| Unrelated, but for small sizes I really prefer to use forward
| mode in Julia (via ForwardDiff.jl) instead of Zygote. The
| overhead of reverse-mode AD over an arbitrary function with
| mutation is not worth it.
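|
| A minimal sketch of the forward-mode idea mentioned above, in plain
| Python rather than Julia. It is not ForwardDiff.jl's implementation,
| only an illustration of the dual-number technique; the names Dual,
| forward_gradient and f are hypothetical. Each input needs one
| forward pass and no tape is recorded, which is why the approach can
| be attractive when the number of inputs is small.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative along one input direction."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def forward_gradient(f, xs):
    """Gradient of f at xs using one forward pass per input."""
    grad = []
    for i in range(len(xs)):
        # Seed input i with derivative 1 and every other input with 0.
        args = [Dual(x, 1.0 if j == i else 0.0) for j, x in enumerate(xs)]
        grad.append(f(*args).deriv)
    return grad


def f(x, y):
    # Hypothetical test function: x^2 * y + 3y
    return x * x * y + 3 * y


gradient = forward_gradient(f, [2.0, 5.0])  # [df/dx, df/dy] at (2, 5)
```

| With n inputs this costs n passes, so it scales poorly to large
| parameter vectors, where reverse mode is usually preferred.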
| tagrun wrote:
| In the context of neural networks with differential equations
| (which appears to be the original poster's field), the trade-
| off depends:
| https://diffeqflux.sciml.ai/dev/ControllingAdjoints/
| machinekob wrote:
| I strongly agree on readability. In my opinion the cause is that
| people in academia live in "bubbles" and assume everyone knows
| what domain-specific terms and Greek letters mean, so for them it's
| easier to read some omega than, for example, learning_rate or lr.
|
| But for us mortals who cross multiple domains, it gets
| extremely frustrating to read fully math-based notation without
| any extra info about the notation in packages/functions etc., so
| debugging multiple sub-packages gets too time
| consuming: you have to learn each person's style of writing code,
| the whole scientific notation, and the domain knowledge before you
| can even touch the code.
| aaplok wrote:
| > Academia people live in "bubbles" and they assume everyone
| knew what a domain specific terms and greek letters
|
| Naming things by their English name is not more universal than
| using Greek letters. It's just serving another group of people
| who live in a different bubble.
| belval wrote:
| Yes and no, the example that the author gives is actually a
| very good one:
|
| > Many Julia APIs look like Optimiser(η=...) rather than
| Optimiser(learning_rate=...). This is a pretty unreadable
| convention.
|
| The learning rate is a well-known name that basically
| everyone will understand. On the other hand, "η" or eta is not
| even used everywhere in the literature, with some papers using
| alpha instead.
|
| This just looks clever, it's a pretty bad parameter name.
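|
| To make the naming point concrete, here is a small Python sketch.
| The function sgd_step and its η alias are hypothetical and are not
| the API of Flux, JAX or PyTorch. It spells the keyword out as
| learning_rate and still accepts the Greek letter for readers who
| prefer the notation from the papers.

```python
def sgd_step(params, grads, learning_rate=0.01, η=None):
    """One plain gradient-descent update: p <- p - learning_rate * g.

    `η` is an optional (hypothetical) alias for `learning_rate`.
    """
    if η is not None:
        learning_rate = η
    return [p - learning_rate * g for p, g in zip(params, grads)]


# Both spellings produce the same update.
spelled_out = sgd_step([1.0, 2.0], [2.0, -4.0], learning_rate=0.5)
greek = sgd_step([1.0, 2.0], [2.0, -4.0], η=0.5)
```

| Either call site works, but only the first can be understood, typed
| and grepped without knowing which letter the literature uses.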
| melissalobos wrote:
| > The learning rate is a well known name that basically
| every one will understand
|
| Absolutely! Because as we all know, everyone speaks
| English.
|
| The GP's point was that Greek letters are used in lots and
| lots of papers, even ones written in other languages. I have read
| quite a few papers in Japanese that used exactly the same
| conventions with respect to the Greek letters and Latin
| letters used.
| 127 wrote:
| Google Translate is one click away. I can easily
| translate both Japanese and Chinese comments and variable
| names to get the gist of it. Using single hieroglyphs for
| it makes the entire endeavor impossible.
| belval wrote:
| How many researchers in the ML/DL community don't speak
| English? I don't have hard numbers but I highly doubt
| that it's a significant proportion. What is the reach of
| your Japanese papers when almost no-one outside of Japan
| can read Japanese?
|
| Even China, despite its best efforts to de-westernize
| its culture, still uses English in its research
| papers.
|
| And if all the above wasn't enough, Julia's libraries are
| still all in English, so if a hypothetical researcher's
| English is so poor that they don't know what "learning
| rate" is, I'd venture that they'll have trouble
| programming in Julia/JAX/PyTorch.
| SiempreViernes wrote:
| How many don't speak it as a native language? Quite a lot
| as most of the world uses something else as their primary
| language.
|
| If you're instead asking how many can struggle through
| an English text supported by machine translators, then
| that's clearly almost everyone.
|
| There's very often a significant gap between the ease
| with which the native and the foreign language can be
| used for reasoning, but surely I don't need to point that
| out since any bilingual person knows this.
| agumonkey wrote:
| I understand both camps, but I believe these are superficial
| problems. It's like worrying about the comfort of the seats in the
| operating room of a nuclear plant.
| abakus wrote:
| Superficial, you say? How about I name these in Chinese in my
| package?
| melissalobos wrote:
| Sure, just try to properly document what it does. There are
| some characters that are easily confused at first glance
| even for native speakers, so be sure to use some common
| sense.
| nextos wrote:
| Exactly. I actually find Julia's ecosystem (not the language)
| _way_ more approachable than Python's.
|
| In Python, most libraries are big monoliths. Whereas in
| Julia, libraries are small and composable. Furthermore, it's
| the same language all the way down.
|
| Python's libraries are superb, but the learning curve to
| develop them (not to use them) is really steep.
| jstx1 wrote:
| I don't understand. What do you mean by "learning curve to
| develop" an existing Python library?
| bobbylarrybobby wrote:
| And if you ever _do_ want to edit the code, you have to know
| the name of every non-ASCII symbol the codebase uses if you
| want to type out those same symbols without copying and pasting
| them. If you're not familiar with the material, entering a
| character like ξ can be a real challenge, and is actually more
| keystrokes than just typing "xi".
| leephillips wrote:
| For me it's one keystroke, with the dead-Greek modifier key.
|
| I'll do it right here, by typing <G-x>: ξ
| tagrun wrote:
| Difference is 1 keystroke: you type \xi. Why is that a "real
| challenge"?
| tlb wrote:
| What editors does this work in?
| tagrun wrote:
| Juno, Jupyter support it out-of-the-box. With plug-ins:
| VS Code (unicode-latex), Atom (latex-completions), Emacs
| (which has TeX input), Vim (latex-unicoder), ...
| leephillips wrote:
| Any editor, on Linux, if you have your keyboard set up to
| type Greek letters and the Unicode symbols that you use
| frequently. I can do it directly in the comment box: ξ
| animal_spirits wrote:
| The challenge comes from a person like me, who doesn't know
| off the top of my head which Greek letter ξ is. So for each
| of these symbols I'd have to google them and learn it, or
| have some notepad where I can copy paste the needed symbol.
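|
| One way to find out which letter a symbol is, without searching the
| web, is to ask the Unicode database. This sketch uses Python's
| standard unicodedata module; the helper names symbol_name and
| greek_letter are made up for illustration.

```python
import unicodedata


def symbol_name(symbol):
    """Official Unicode name of a single character."""
    return unicodedata.name(symbol)


def greek_letter(name):
    """Lowercase Greek letter for a spelled-out name such as 'xi'."""
    return unicodedata.lookup(f"GREEK SMALL LETTER {name.upper()}")


xi_name = symbol_name("ξ")     # name of the character pasted from source
eta_char = greek_letter("eta")  # character for a name you already know
```

| The lookup goes both ways, so a symbol copied out of a codebase can
| be named, and a name can be turned back into the symbol to paste.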
| tagrun wrote:
| So you don't know the Greek alphabet, but write high-
| performance computing code involving non-trivial math?
| (Julia's main use case is HPC)
| ShamelessC wrote:
| What? Did you not read the parent of this thread?
|
| > strongly agree with readability in my opinion its cause
| Academia people live in "bubbles" and they assume
| everyone knew what a domain specific terms and greek
| letters means
|
| In any case, I program HPC stuff myself with pytorch and
| no - I don't know the Greek alphabet and probably don't
| understand "non trivial math". The assumption that these
| people can't contribute is pretty off-putting honestly.
| More engineers would join such efforts if there wasn't so
| much gatekeeping.
| belval wrote:
| The author's example is using the Greek letter "eta"
| instead of spelling out "learning_rate"; this is a pretty
| damning example.
| dTal wrote:
| You should probably take the time to learn the names of
| the Greek letters. There are only 24 of them and they're
| even related to the English ones. It's not a huge time
| investment, and it's probably worth it if you work in
| engineering.
| blindseer wrote:
| Another massive frustration for me is that Julia has no formal
| way to say "here are public functions and these are private" but
| does have a completely orthogonal way of saying "these are
| functions that will populate your global namespace if you use
| `using`", i.e. `export variable_name`, and people absolutely
| confuse the hell out of these. I don't think there's even
| agreement in the Julia community if you should use `export` for
| your public API or not.
|
| And if you misspell the exports or change the variable in
| question, Julia won't even warn you about it. That is straight
| crazy behavior to me, and I still don't understand how that
| hasn't been changed.
|
| The `using` + `import` packages in Julia combined with how
| `export`s work make it SUCH a confusing and frustrating
| experience for beginners in Julia.
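|
| As a point of comparison only, here is a Python analogue of the
| check described above; it says nothing about how Julia itself
| behaves. Python modules list their public names in __all__, and a
| short helper can report entries that the module never defines. The
| module optim and the helper missing_exports are hypothetical.

```python
import types


def missing_exports(module):
    """Names listed in module.__all__ that the module does not define."""
    declared = getattr(module, "__all__", ())
    return [name for name in declared if not hasattr(module, name)]


# Hypothetical module whose export list contains a misspelled name.
optim = types.ModuleType("optim")
optim.sgd_step = lambda params, grads: params
optim.__all__ = ["sgd_step", "sdg_step"]

typos = missing_exports(optim)
```

| A check like this could run in a test suite or a linter so that a
| renamed or misspelled export is caught before release.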
|
| I personally like mathematical symbols when I'm writing and
| reading code in my domain, but I do feel very lost when I'm
| reading Julia code outside of my area of expertise. All my
| colleagues hate it too (hard to grep, hard to type if you don't
| know the math and are just a software engineer) and I'm coming
| around to the idea of not using it or documenting it explicitly.
|
| The fact that mutable structs are easier to use but immutable
| structs are more performant, the lack of composition of fields
| the way Go handles it, the lack of traits or interfaces, the
| sorry state of compilation time, the nonexistent tooling all
| lead a beginner / intermediate Julia developer in the wrong
| direction in my opinion. It's very easy to write code that is
| straight up broken or not efficient in Julia, and that's probably
| why I won't pick it for a big project going forward.
|
| But I'm still keeping my eye on it. Maybe in 5 years it'll be the
| language for a lot of the jobs?
| adgjlsfhk1 wrote:
| """ Even in the major well-known well-respected Julia packages -
| I'll avoid naming names - the source code has very obvious cases
| of unused local variables, dead code branches that can never be
| reached, etc. """ Please name names. That way we can fix stuff.
| Other than that, great post!
| lern_too_spel wrote:
| I think the author is pushing for better code quality tooling
| in Julia instead of having people manually fix these problems.
| adgjlsfhk1 wrote:
| We absolutely should, but in the short term, fixing issues is
| good.
| SemanticStrengh wrote:
| The most next-gen autodiff library is probably
| https://github.com/breandan/kotlingrad because of its features,
| ergonomics and type safety
| martinsmit wrote:
| Idk, Enzyme is pretty next gen, all the way down to LLVM code.
|
| https://github.com/EnzymeAD/Enzyme
| tagrun wrote:
| In his item #1, he links to
| https://discourse.julialang.org/t/loaderror-when-using-inter...
| The issue is actually a bug in Zygote, a Julia package for
| automatic differentiation, and is not directly related to the Julia
| codebase (or the Flux package) itself. Furthermore, the problematic
| code is working fine now, because DiffEqFlux has switched to Enzyme,
| which doesn't have that bug. He should first confirm whether the
| problem he is citing is actually a problem or not.
|
| Item #2, again another Zygote bug.
|
| Item #3, which package? That sounds like hyperbole, an
| extrapolation from a small sample, and "I'll avoid naming names"
| is a lazy excuse that would hide this. It is similarly easy to
| point to poorly written (or poorly documented) JAX code or Python
| code as well, so that doesn't prove that "Julia is lacklustre and
| JAX shines". Also, as an academician, I strongly disagree that
| learning_rate=... is better than η=..., but that's a matter of
| convention & taste, with no bearing on the correctness or
| performance of the package/language. It's bikeshedding. I agree
| that errors are usually not "instructable" in Julia ML packages
| (which needs to be improved), so monkey typing is less likely to
| succeed.
|
| Item #4 is such a nitpick. Sure, Julia may not have a special
| syntax for that particular fringe array slicing like NumPy does
| and you'll instead need to make a function call, but
| matrix/tensor code in NumPy is usually filled with calls to zips
| and cats with explicit indices, whereas in Julia, no explicit
| indexing usually needs to be done. Also, one can nitpick
| similarly in the opposite direction: Julia has many language
| features lacking from JAX or Python, why not talk about those as
| well?
|
| I'm not a huge fan of Julia either (mainly because of its
| garbage-collected nature), but this is such a low-effort
| criticism of it.
| blindseer wrote:
| I've been using Julia since 2017 and still do on a day to day
| basis, and I agree with the author in a lot of cases, even his
| subjective naming conventions gripes.
|
| The author's biggest criticism is that Julia doesn't have
| tooling to make the developer experience better. There's
| Revise, JuliaFormatter, LanguageServer and JET, but the
| development experience in Python is enviable. There's like 3
| different REPLs, at least two competing linters and auto
| formatters. It's okay to admit that these are places Julia is
| lacking.
|
| I think your kind of response to criticism about Julia is what
| gives the Julia community a bad name, in my opinion. What is
| wrong with saying these things suck and need improvements?
| Would you rather Julia not improve and stay the way it is right
| now forever? Surely I hope not.
| tagrun wrote:
| What exactly do you think I said about Julia's linters and
| auto-formatters?
| nullstyle wrote:
| > but that's a matter of convention & taste, with no bearing on
| the correctness or performance of the package/language. It's
| bikeshedding.
|
| I heartily disagree with that. Bikeshedding is about focussing
| on the trivial, and the symbols we choose in our codebases are
| hardly trivial, and indeed many of us regard naming things as
| one of the central problems in programming[1].
|
| [1]:https://medium.com/hackernoon/naming-the-things-in-
| programmi...
| tagrun wrote:
| Unlike the example in the link you give, η isn't a generic
| random name like a, x that can mean anything. If you ever read
| a paper on stochastic gradient optimization, you'd know that
| η means learning rate in the context.
|
| It is bikeshedding because it is analogous to insisting that
| using "angle" instead of "θ", or "radius" instead of "r" in
| a 2D geometry library is superior and takes your code from
| being lackluster to something that shines (in the words of
| the original author), while not having anything useful to say
| about the mathematical/technical aspects of the code
| itself.
|
| Here is the definition of bikeshedding:
|
| > The term was coined as a metaphor to illuminate Parkinson's
| Law of Triviality. Parkinson observed that a committee whose
| job is to approve plans for a nuclear power plant may spend
| the majority of its time on relatively unimportant but easy-
| to-grasp issues, such as what materials to use for the staff
| bikeshed, while neglecting the design of the power plant
| itself, which is far more important but also far more
| difficult to criticize constructively. It was popularized in
| the Berkeley Software Distribution community by Poul-Henning
| Kamp[1] and has spread from there to the software industry at
| large.
|
| from https://en.wiktionary.org/wiki/bikeshedding
| 00ajcr wrote:
| My interpretation of the point in the blog post was that
| explicitly spelling out variable names makes APIs and the
| underlying code much more accessible to a wider audience.
|
| Sure, there'll be a subset of users of these libraries that
| have read ML textbooks and are familiar with what η means
| in this context.
|
| Today, many (most?) users of ML libraries will probably not
| know what η means without looking it up. Adhering to
| mathematical notation puts up an unnecessary barrier to
| using the API/code and ultimately limits wider
| engagement/collaboration.
|
| To attract a bigger slice of the ML community, choosing
| names that the ML hobbyist can read, understand and use
| without pause is the better path forward.
| tagrun wrote:
| You are saying most people don't know what η in that
| context means (=people who likely haven't read a book or
| a paper on stochastic gradient, and don't know how it
| actually works), but they would somehow magically figure
| out what it actually does if we call it "learning_rate"
| in ASCII letters. How does that work?
|
| FYI, the documentation of the function
| https://fluxml.ai/Flux.jl/stable/training/optimisers/
| explicitly says it is learning rate:
|
| > Learning rate (η): Amount by which gradients are
| discounted before updating the weights.
|
| so this is already explicit to anyone who reads the
| documentation. The quibble in the post is about the named
| parameter.
| jstx1 wrote:
| > How does that work?
|
| You can look up "learning rate" much more easily than you can
| look up "what is this Greek letter on my screen" followed by
| "what is the use of this Greek letter in my context" and
| only then followed by searching for "learning rate".
|
| More importantly, it's possible to know what a learning
| rate is without knowing what Greek letter it's commonly
| denoted as. Especially since mathematical notation is so
| inconsistent across authors. I want less ambiguity in
| code, not more. Explicit is better than implicit.
|
| Mathematical notation is notorious for being an absolute
| mess of inconsistencies. Who in their right mind looked
| at it and went "yep, I want more of this in my source
| code"?
___________________________________________________________________
(page generated 2022-05-04 23:00 UTC)