[HN Gopher] Julia Computing raises $24M Series A
___________________________________________________________________
Julia Computing raises $24M Series A
Author : dklend122
Score : 342 points
Date : 2021-07-19 14:33 UTC (8 hours ago)
(HTM) web link (www.hpcwire.com)
(TXT) w3m dump (www.hpcwire.com)
| mccoyb wrote:
| Just a comment to participants who are suspicious of Julia usage
| over another, more popular language (for example) -- I think most
| Julia users are aware that the ecosystem is young, and that
| encouraging usage in new industry settings incurs engineering
| debt: bringing a new language onto any project requires an
| upfront cost, followed by a maintenance cost.
|
| Most of these arguments are re-hashed on each new Julia post
| here. A few comments:
|
| For most Julia users, any supposed rift between Python and Julia
| is not really a big deal -- we can just use PyCall.jl, a package
| I've personally used many times to wrap existing Python
| packages -- and it supports native interop between Julia and
| NumPy arrays. Wrapping C is similarly easy -- in fact, easier
| than "more advanced" languages like Haskell -- whose C FFI only
| supports passing opaque references over the line.
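|
| A minimal sketch of the PyCall.jl interop described above
| (PyCall.jl and pyimport are real; assumes NumPy is installed on
| the Python side):
|
|     using PyCall
|     np = pyimport("numpy")        # load a Python module
|     x = np.linspace(0, 2pi, 100)  # the NumPy array converts to a Julia Array
|     y = sin.(x)                   # then use it with plain Julia code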
|
| Ultimately, when arguments between languages in this space arise
| -- the long term question is the upfront cost, and the
| maintenance cost for a team. Despite the fact that Julia provides
| an excellent suite of wrapper functionality, I'm aware that
| introducing new Julia code into an existing team space suffers
| from the above issues.
|
| I'm incredibly biased, but I will state: maintaining Julia code
| is infinitely easier than in other uni-typed languages. I've had
| to learn medium-sized Python codebases, and it is a nightmare
| compared to a typed language: a total guessing game about how
| things flow. It really is shocking. Additionally, multiple
| dispatch is one of those idioms where, once you learn about it,
| you can't really imagine how you did things before. I'm aware of
| equivalences between the set of language features (including type
| classes, multi-method dispatch, protocols or interfaces, etc).
|
| Ultimately, the Julia code I write feels (totally subjectively)
| the most elegant of the languages I've used (including Rust,
| Haskell, Python, C, and Zig). Normally I go exploring these other
| languages for ideas -- and there are absolutely winners in this
| group -- but I usually come crawling back to Julia for its
| implementation of multiple dispatch. Some of these languages
| support abstractions which are strictly equivalent to static
| multi-method dispatch -- which I enjoy -- but I also really enjoy
| Julia's dynamism. And the compiler team is working to modularize
| aspects of the compiler so that deeper characteristics of Julia
| (even type inference and optimization) can be configured by
| library developers for advanced applications (like AD, for
| example). The notion of a modular JIT compiler is quite exciting
| to me.
|
| Other common comments: time to first plot, no native compilation
| to binary, etc. are being worked on. Especially the latter, with
| new compiler work (which was ongoing before the new funding) --
| it seems feasible within a few quarters' time.
| amkkma wrote:
| > Some of these languages support abstractions which are
| strictly equivalent to static multi-method dispatch
|
| Yes, and even with languages that have these abstractions, they
| aren't as pervasive as in Julia. For example, there's a
| fundamental difference between typeclasses and functions in
| Haskell; in Julia there's no such distinction, and code is more
| generic.
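|
| For instance, a plain Julia function is already generic; a
| user-defined type flows through it with no typeclass-style
| declaration (the toy Dual type below is hypothetical, for
| illustration):
|
|     area(r) = pi * r^2   # works for Float64, Rational, BigFloat, ...
|
|     struct Dual; v; d; end   # toy forward-mode dual number
|     Base.:^(a::Dual, n::Integer) = Dual(a.v^n, n * a.v^(n-1) * a.d)
|     Base.:*(x::Real, a::Dual) = Dual(x * a.v, x * a.d)
|
|     area(Dual(2.0, 1.0))     # same code, new type: dispatch does the rest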
| caleb-allen wrote:
| Congratulations to Julia Computing!
| didip wrote:
| Will there be a JupyterHub-like written in Julia in the near
| future, then? Because that's clearly where the money is.
| notthemessiah wrote:
| I honestly haven't thought much about Jupyter since I moved to
| Pluto.jl (Observable-style reactive notebook):
|
| https://github.com/fonsp/Pluto.jl
| amkkma wrote:
| It exists in the form of https://juliahub.com/lp/
| mr_overalls wrote:
| Julia seems like such a superior language compared to R. What
| would be required for it to supplant R for statistical work (or
| some subset of it)?
| lycopodiopsida wrote:
| "Only" to write a very high amount of high-quality statistical
| and plot packages...
| tfehring wrote:
| Or more realistically, a caret/parsnip-like interface that
| lets you seamlessly use either R or Julia as a backend.
| mr_overalls wrote:
| Right. R's killer feature is its ecosystem.
|
| I'm wondering if most statisticians or researchers deal with
| data big enough that massively better performance would be
| enough motivation to switch.
| f6v wrote:
| You can write fast software with R, you just need to know
| how. The same applies to Julia - not everyone knows how to
| develop high-performance code.
| spywaregorilla wrote:
| > You can write fast software with R, you just need to
| know how.
|
| When the trick to writing fast R code is to rely on C as
| much as possible, that feels less compelling.
| f6v wrote:
| You'd be surprised how many people don't know that a for
| loop isn't great compared to vectorizing. The same goes for
| Julia: few know that type stability has an impact on speed.
| My point is that you won't automatically write faster
| code.
| tylermw wrote:
| While writing in C is one way to speed up R code, you can
| also get pretty close to compiled speed by writing fully
| vectorized R code and pre-allocating vectors. The R REPL
| is just a thin wrapper over a bunch of C functions, and a
| careful programmer can ensure that allocation and copy
| operations (the slow bits) are kept to a minimum.
| freemint wrote:
| This is good if your code can be expressed in vectorized
| operations and doesn't gain benefits from problem
| structure exploited by multiple dispatch. With R the best
| you can do is the speed of someone else's (or your own) C code,
| while Julia can beat C.
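|
| A concrete case of dispatch exploiting structure (a sketch using
| only the standard library):
|
|     using LinearAlgebra
|
|     D = Diagonal([1.0, 2.0, 3.0])
|     v = [4.0, 5.0, 6.0]
|     # Dispatch picks an O(n) diagonal method instead of the generic
|     # O(n^2) dense multiply; the result is [4.0, 10.0, 18.0].
|     D * v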
| tylermw wrote:
| "Massively better performance" is a bit misleading: Julia
| is only massively better at certain workflows. The fastest
| data.frame library in ALL interpreted languages is
| consistently data.table, which is R. For in-memory data
| analysis, Julia will have to offer more than performance to
| win over statisticians/researchers.
|
| Benchmarks:
| https://www.ritchievink.com/blog/2021/02/28/i-wrote-one-
| of-t...
| mbauman wrote:
| > The fastest data.frame library in ALL interpreted
| languages is consistently data.table, which is R.
|
| DataFrames.jl is very rapidly catching up and starting to
| surpass it. After hitting a stable v1.0 they've begun
| focusing on performance and those benchmarks have changed
| significantly over the past three months. Here's the live
| view: https://h2oai.github.io/db-benchmark/
| nojito wrote:
| 40% slower in groupbys and 4x slower in joins isn't
| convincing.
| mbauman wrote:
| Oh I agree. What's convincing to me is the momentum. The
| DataFrames.jl team only started focusing on performance
| three months ago after hitting v1.0[1] and were able to
| rapidly become competitive with groupbys; the performance
| of join is next[2]. Compare the live view with the state
| when grandparent's blog post was written/updated (March
| of this year).
|
| I expect it to continue to improve; note that it's
| starting to be the fastest implementation on some of the
| groupby benchmarks.
|
| 1. https://discourse.julialang.org/t/release-
| announcements-for-...
|
| 2. https://discourse.julialang.org/t/the-state-of-
| dataframes-jl...
| freemint wrote:
| This seems quite cherry-picked, as there are 3 different
| dataset sizes.
|
| However yes, it does not beat all other packages tested
| in performance.
| nojito wrote:
| Not really cherry picked. Data.table is designed for
| large data sets with many groups + complex joins.
| wdroz wrote:
| I would like to see this benchmark with much more modern
| hardware, especially for GPU-related tools as the 1080 Ti
| they used is 4 years old.
| amkkma wrote:
| In addition to the comment about df.jl catching up, they
| aren't comparable at all.
|
| Julia's DF library is generic and allows user defined ops
| and types. You can put in GPU vectors, distributed
| vectors, custom number types etc. Julia optimizes all
| this stuff.
|
| data.frame is just a giant chunk of c (c++) code that one
| must interact with in very specific ways
| tylermw wrote:
| > Julia's DF library is generic and allows user defined
| ops and types. You can put in GPU vectors, distributed
| vectors, custom number types etc. Julia optimizes all
| this stuff.
|
| These features aren't of interest to practicing
| statisticians, which the parent comment was talking
| about.
|
| > data.frame is just a giant chunk of c (c++) code that
| one must interact with in very specific ways
|
| I don't understand this criticism: yes, data.table has an
| API.
| StefanKarpinski wrote:
| Many practicing statisticians do actually care about
| easily using GPUs and doing distributed computations on
| distributed data sets with the same code they use for a
| local data set, which is what those Julia capabilities
| give you.
| amkkma wrote:
| >These features aren't of interest to practicing
| statisticians, which the parent comment was talking
| about.
|
| It's pretty convenient for things like uncertainty
| propagation and data cleaning...all things statisticians
| should care about.
|
| >I don't understand this criticism: yes, data.table has
| an API
|
| A relatively limited API, walled off from the rest of the
| language.
| snicker7 wrote:
| As another commenter pointed out, DataFrames.jl is
| already faster than data.table in some benchmarks.
|
| And that's the killer feature of Julia. It is easier to
| micro-optimize Julia code than code in any other language,
| static or dynamic. Meaning that if Julia is not best-in-class
| for a certain algorithm, it soon will be.
| f6v wrote:
| Many do; a university cluster is usually full, running
| 3-day-long jobs from hundreds of people. But in order to
| switch I'd need to replace 100+ direct and indirect
| dependencies.
| systemvoltage wrote:
| And clone/bribe Hadley Wickham :-) He is a tour de force of
| R.
| amkkma wrote:
| It's already superior to R for data munging stuff, imo
|
| https://twitter.com/evalparse/status/1416039770833096706
|
| And https://github.com/JuliaPlots/AlgebraOfGraphics.jl >>>
| GoG
| Duller-Finite wrote:
| As an example, Douglas Bates, the author of R's excellent lme4
| package for generalized linear mixed-effects models, has
| switched to Julia to develop MixedModels.jl. The Julia version
| is already excellent, and has many improvements over lme4.
| f6v wrote:
| The majority of researchers don't care about the language
| superiority. They're concerned with different issues and
| software tends to suffer from a "publish and forget" attitude.
| Convenience matters, and the R ecosystem is quite good.
| jakobnissen wrote:
| As a scientist programmer, that has not been my experience.
| In my experience, science programming is characterized by
| having to implement a lot of stuff from the ground up
| yourself, because unlike web dev or containerization, it's
| unlikely there is any existing library for metagenomic
| analysis of modified RNA.
|
| And here Julia is a complete Godsend, since it makes it a joy
| to implement things from the bottom up.
|
| Sure, you also need a language that already has dataframe
| libraries, plotting, editor support et cetera, and Julia is
| lagging behind Python and R in these areas. But Julia's
| getting there, and at the end of the day, it's a relatively
| low number of packages that are must-haves.
| f6v wrote:
| > In my experience, science programming is characterized by
| having to implement a lot of stuff from the ground up
| yourself
|
| It depends on the field; there are hundreds of biological
| publications each month that just use existing software.
| And if I'm developing a new tool for single-cell analysis,
| it's either going to be interoperable with Seurat or
| Bioconductor tools.
| leephillips wrote:
| Exactly. Almost all of it is bespoke implementations,
| sometimes of an algorithm that has just been invented and
| not yet applied to a real problem.
| NeutralForest wrote:
| Congrats, it's a big step in the right direction to support Julia
| development!
| [deleted]
| aazaa wrote:
| > Julia Computing, founded by the creators of the Julia high-
| performance programming language, today announced the completion
| of a $24M Series A fundraising round led by Dorilton Ventures,
| with participation from Menlo Ventures, General Catalyst, and
| HighSage Ventures. ...
|
| What products/services does Julia Computing sell to justify that
| Series A? The article doesn't mention anything.
|
| Although the company website lists some "products," there are no
| price tags or subscription plans attached to anything.
|
| And even if there are revenue-generating products/services on the
| horizon, how will the company protect itself from smaller, more
| nimble competitors that don't have a platform obligation to
| fulfill?
|
| How is this not another Docker? Don't get me wrong, both Julia
| and Docker are amazing, but have we entered the phase of the VC-
| funded deliberate non-profit?
| mbauman wrote:
| > What products/services does Julia Computing sell to justify
| that Series A? The article doesn't mention anything.
|
| That's the entire second paragraph of the article. JuliaHub is
| a paid cloud computing service for running Julia code and
| JuliaSim/JuliaSPICE/Pumas are paid domain-specific modeling and
| simulation products. See also some of the other comments here
| from Keno[1] and Chris Rackauckas[2]:
|
| 1. https://news.ycombinator.com/item?id=27884386
|
| 2. https://news.ycombinator.com/item?id=27887122
| aazaa wrote:
| I'm equating "product" with "revenue-generating product." A
| bit presumptuous, I know, (the objective of a company is to
| make money from those who buy its products and services,
| where products and services are something other than shares)
| but the article doesn't mention any source of revenue for the
| company.
| mbauman wrote:
| Yes, that's what I figured you meant, and that's precisely
| what that paragraph and my comment above detail. You can
| enter a credit card into JuliaHub and start running large
| scale distributed compute on CPUs and GPUs right now. Or
| you can email us about procuring an enterprise version for
| your whole team and/or licensing JuliaSim or Pumas.
| borodi wrote:
| Check some of Keno's replies in the thread; he goes a bit deeper
| into their business model.
| [deleted]
| djhaskin987 wrote:
| Can someone please explain to me, a mere mortal, what the big
| deal with Julia is? Why use it, when there are so many other good
| languages out there with more community/support? Honest question.
| duped wrote:
| There really aren't that many languages out there trying to be
| on the cutting edge of JIT for scientific computing with a
| great REPL experience. There are a few areas where developers
| have to prototype in a language like Python or MATLAB to design
| their systems, generate test data, and even just plot stuff
| during debugging then rewrite in C/C++ for speed. It's an
| enormous time sink that is prone to errors, and leads to
| terrible SWE culture.
|
| If Julia can provide both the REPL/debugging experience of a
| language like Python or MATLAB with a fast enough JIT to use in
| production it would be an enormous boon to productivity and
| robustness.
|
| There are a few limiting factors but I don't think they're
| absolute.
| ziotom78 wrote:
| I have been using Julia extensively since 2013, and I can say that
| it's awesome! But don't try to use it if you're looking for a
| general-purpose scripting language: Python is far better suited
| for this. Similarly, if you want to produce standalone
| executables, C++, Rust, Go or Nim are better.
|
| However, Julia is perfect if you write
| mathematical/physical/engineering simulations and data analysis
| codes, which is my typical use case. Its support for multiple
| dispatch and custom operators lets you write very efficient
| code without sacrificing readability, which is a big plus.
| Support for HPC computing is very good too.
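|
| As a small illustration of the custom-operator point (a sketch; ⊗
| is one of the symbols Julia parses as an infix operator):
|
|     using LinearAlgebra
|
|     const ⊗ = kron   # bind ⊗ to the Kronecker product
|     A = [1 0; 0 1]
|     B = [0 1; 1 0]
|     A ⊗ B            # a 4x4 Kronecker product that reads like the math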
| amkkma wrote:
| >Python is far better suited for this. Similarly, if you want
| to produce standalone executables, C++, Rust, Go or Nim are
| better.
|
| That's the case now, because Julia made a design decision to
| focus on extreme composability, dynamism, generic codegen etc
| which involved compiler tradeoffs...but it's not inherent to
| the language.
|
| For scripting, interpreted Julia is coming. For executables,
| small binary compilation is as well...particularly bullish on
| this given the new funding
| oxinabox wrote:
| > For scripting, interpreted Julia is coming.
|
| Citation for this? Julia has had a built-in interpreter
| since 1.0 (2018); use `--compile=min` or `--compile=none`
| to make use of it. And JuliaInterpreter.jl has been
| working since 2018. Both are very slow -- slower than
| adding in the compile time for most applications. As I
| understand it, this is because a number of things in how
| the language works are predicated on having an optimizing
| JIT compiler, as is how the standard library and basically
| all packages are written.
|
| Julia is going to over time become nicer for scripting,
| just because of various improvements. In particular, I put
| more hope on caching native code than on any new
| interpreter.
| ziotom78 wrote:
| Yeah, you are right, these limitations are not so much a
| matter of the language itself.
| RandomWorker wrote:
| Basically, the syntax is similar to MATLAB, with lots of
| the same conventions around functions and variable declaration
| as Python. The best thing for sure is the native support for
| parallel processing, whereas Python is effectively single-threaded.
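|
| A minimal sketch of the built-in threading (start Julia with e.g.
| `julia -t 4` so multiple threads are available):
|
|     using Base.Threads
|
|     function tmap(f, xs)
|         out = similar(xs)
|         @threads for i in eachindex(xs)   # iterations split across threads
|             out[i] = f(xs[i])
|         end
|         return out
|     end
|
|     tmap(x -> x^2, collect(1:8))   # [1, 4, 9, 16, 25, 36, 49, 64]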
| Kranar wrote:
| I think your question presupposes a lot of assumptions that may
| not be right. For one, I don't know that Julia is like a "big
| deal", certainly Python is the big deal in this field and I
| doubt Julia is looking to displace it wholesale. That said,
| Julia is a great addition to the scientific computing landscape
| because of its performance compared to other languages and its
| use of modern programming features. Python is just really
| really slow compared to Julia and parallelism in Python is a
| huge pain. Fortran is really really fast but that comes at a
| cost of being awkward to use and coming with a great deal of
| baggage. Julia is fast, feels modern, and has pretty easy
| parallelism.
|
| Then there's Matlab, Mathematica, and they are also pretty good
| but they're closed source/proprietary, so their ecosystem is
| mostly limited and driven by commercial interests. Nothing
| wrong with that intrinsically and they're all widely used but
| it's one way Julia differentiates itself, by making the
| language open and making money through services.
| ChrisRackauckas wrote:
| Julia Computing is not a services company. There are
| commercial products built off of this stack which are the
| core of Julia Computing. For example, https://pumas.ai/ is a
| product for pharmacology modeling and simulation, and runs on
| the JuliaHub cloud platform of Julia Computing. It is already
| a big deal in the industry, with the quote everyone refers to
| "Pumas has emerged as our 'go-to' tool for most of our
| analyses in recent months" from the Director Head of Clinical
| Pharmacology and Pharmacometrics at Moderna Therapeutics
| during 2020 (for details, see the full approved press release
| from the Pumas.ai website). JuliaSim is another major product
| which is being released soon, along with JuliaSPICE publicly
| in the pipeline.
|
| But indeed, Julia Computing differentiates itself from
| something like MATLAB or Mathematica by leveraging a strong
| open source community on which these products are developed.
| These products add a lot of the details that are generally
| lacking in the open source space, such as strong adherence to
| file formats, regulatory compliance and validation, GUIs,
| etc. which are required to take such a product from "this guy
| can use it" to a fully marketable product usable by non-
| computational scientists. I will elaborate a bit more on this
| at JuliaCon next week in my talk on the release of JuliaSim.
| codekilla wrote:
| Wanted to ask if JuliaDB is something that might get more
| development attention? Or will that remain a community
| project? (I see it's been in need of a release for awhile.)
| ViralBShah wrote:
| In general, the community has discussed reviving the
| project (or at least the ideas and some of its codebase).
| Julia Computing will also be contributing as part of that
| revival.
| codekilla wrote:
| Thank you both for the comments. I believe I remember
| early on there were some comparisons to kdb+/q. I think
| there is some pretty great potential with an offering
| like this (an in-memory database integrated with the
| language, coupled with solid static storage) from the
| Julia community going forward. I can envision some use
| cases in genomics/transcriptomics.
| ChrisRackauckas wrote:
| It is not in our current set of major products. That
| said, informal office discussions mentioned JuliaDB as
| recently as last week, so it's not forgotten. If there's
| a demonstrated market, say a need for new high-
| performance cloud data science tools as part of the
| pharmaceutical domains we work in, then something like
| JuliaDB could possibly be revived in the future (of
| course, this is no guarantee).
| nvrspyx wrote:
| From what I understand, Julia is dynamically typed and similar
| to a scripting language like Python or Ruby, but is also
| compiled, so it has performance similar to C/C++ (it's also
| written in itself). It also has built-in support for
| parallelism, multi threading, GPU compute, and distributed
| compute. I'm sure others can provide more insight. I've only
| dabbled in it and haven't used it extensively in any sense of
| the word.
| [deleted]
| coldtea wrote:
| > _Why use it, when there are so many other good languages out
| there with more community /support? Honest question._
|
| Such a question seems sort of in bad faith (or loaded), since
| the selling points of Julia have been hammered time and again
| on HN and elsewhere, and are prominent on its website. It's a 1
| minute search to find them, and if someone is already aware
| that there's this thing called Julia to the point that they
| think it's made to be "a big deal", they surely have seen them.
|
| So, what could the answer to the question above be? Some
| objective number that shows Julia is 25.6% better than Java or
| Rust or R or whatever?
|
| But first, who said it's a "big deal"? It's just a language
| that has some development action, sees some adoption, and has
| secured modest funding for its company. That's not some earth-
| shattering hype (if you want to see that, try to read about
| when Java was introduced. Or, to a much lesser degree, Ada, for
| that matter).
|
| You use a language because you've evaluated it for your needs
| and agree with the benefits and tradeoffs.
|
| Julia is high level and at the same time very fast for
| numerical computing allowing you to keep a clean codebase
| that's not a mix of C, C++, Fortran and your "real" language,
| while still getting most of the speed and easy parallelization.
| It also has a special focus on support for data
| science, statistics, and science in general. It's also well
| designed.
|
| On the other hand, it has slow startup/load times, incomplete
| documentation, smaller ecosystem, and several smaller usability
| issues.
| cbkeller wrote:
| While you have a valid perspective, the HN guidelines [1] do
| specifically ask us to assume good faith.
|
| [1] https://news.ycombinator.com/newsguidelines.html
| exdsq wrote:
| You assume they've seen the posts about Julia on HN. If
| you're not interested in PL it's fair to assume they might
| not click on those posts.
| oscardssmith wrote:
| Short answer is that it is (imo) by far the best language for
| writing generic and fast math. Multiple dispatch allows you to
| write math using normal notation and not have to jump through
| hoops to do so.
| enriquto wrote:
| can you show a simple example of that? I tend to see multiple
| dispatch as a mental burden, (e.g.: when I see a function
| call, where will it be dispatched? The answer depends on the
| types that I'm juggling, which may not even be visible at that
| point...)
| oscardssmith wrote:
| The key to make multiple dispatch work well is that you
| shouldn't have to think about what method gets called. For
| this to work out, you need to make sure that you only add a
| method to a function if it does the "same thing" (so don't
| use >> for printing, for example). To see the benefit of this
| in action, consider that in Julia 1im + 2//3 (the syntax for
| sqrt(-1) + two thirds) works and gives you a complex rational
| number (2//3 + 1//1*im). To get this behavior in most other
| languages, you would have to write special code for complex
| numbers with rational coefficients, but in Julia this just
| works since complex and rational numbers can be constructed
| using anything that has arithmetic defined. This goes all
| the way up the stack in Julia. You can put these numbers in
| a matrix, and matrix multiplication will just work, you can
| plot functions using these numbers, you can do gpu
| computation with them etc. All of this works (and is fast)
| because multiple dispatch can pick the right method based
| on all the argument types.
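|
| That promotion is easy to check at the REPL (output from a recent
| Julia version, lightly abbreviated):
|
|     julia> 1im + 2//3
|     2//3 + 1//1*im
|
|     julia> typeof(1im + 2//3)
|     Complex{Rational{Int64}}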
| enriquto wrote:
| > make sure that you only add a method to a function if
| it does the "same thing"
|
| But this only applies when I'm writing the code myself.
| If I read some code and I see a few nested function
| calls, there's a combinatorial explosion of possible
| types that gives me vertigo.
|
| > complex and rational numbers can be constructed using
| anything that has arithmetic defined.
|
| seriously? this does not seem right, it cannot be like
| that. If I build a complex number out of complex numbers,
| I expect a regular complex number, not a "doubly complex"
| number with complex coefficients, akin to a quaternion.
| Or do you? There is surely some hidden dirty magic to
| avoid that case.
| mbauman wrote:
| There's no hidden dirty magic, but you're right: Complex
| numbers require `Real` components and Rationals require
| `Integer` numerators and denominators. Both `Real` and
| `Integer` are simply abstract types in the numeric type
| tree, but you're free to make your own. You can see how
| this works directly in their docstrings -- it's that type
| parameter in curly braces that defines it, and `<:` means
| subtype:
|
|     help?> Complex
|     search: Complex complex ComplexF64 ComplexF32 ComplexF16 completecases
|
|     Complex{T<:Real} <: Number
|
|     Complex number type with real and imaginary part of type T.
| oscardssmith wrote:
| One other really good example is BLAS. Since it is a
| C/Fortran library you have 26 different functions for
| matrix multiply depending on the type of matrix (and at
| least that many for matrix vector multiply). In Julia, you
| just have * which will do the right thing no matter what.
| In languages without multiple dispatch, any code that wants
| to do a matrix multiply will either have to be copied and
| pasted 25 times for each input type, or will have 50 lines
| of code to branch on the input type. Multiple dispatch
| makes all of that pain go away. You just use * and you get
| the most specific method available.
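|
| For example (a sketch using standard-library wrapper types):
|
|     using LinearAlgebra
|
|     A = rand(3, 3)
|     S = Symmetric(A + A')      # symmetric wrapper type
|     U = UpperTriangular(A)     # triangular wrapper type
|     v = rand(3)
|
|     # One operator, many methods: dispatch routes each call to a
|     # specialized routine (BLAS/LAPACK under the hood where it applies).
|     A * v; S * v; U * v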
| CapmCrackaWaka wrote:
| It's a solid combination of performance, easy syntax and
| flexible environment. A big drawback of Python is that any
| performant code is actually written in a lower level language,
| with foreign function calls.
|
| That's not to say that there are no disadvantages to Julia. I
| personally see Julia as a beefed up, new and improved R.
| SatvikBeri wrote:
| We had three big data pipelines written in numpy that we'd
| spent a lot of time optimizing. Rewriting them in Julia, we
| were able to get an 8x (serial -> serial), 14x (parallel ->
| serial), and 28x (parallel -> parallel) speedups respectively -
| and with clearer, more concise code. The difference is huge.
| leephillips wrote:
| If you are doing very high-performance numerical work, your
| choices1 are Fortran, C, C++, or Julia. Julia is way more fun
| to program in than the other choices. Also, it has some
| properties2 that make code re-use and re-combination especially
| easy.
|
| 1 https://www.hpcwire.com/off-the-wire/julia-joins-petaflop-
| cl...
|
| 2 https://arstechnica.com/science/2020/10/the-unreasonable-
| eff...
| devoutsalsa wrote:
| Doesn't Python offer this speed in it's scientific libraries,
| too? Or is the answer "yes, if you use the libraries are
| written in Fortran, C, C++, or Julia!"?
| ska wrote:
| > Or is the answer "yes, if you use the libraries are
| written in Fortran, C, C++, or Julia!"?
|
| That's basically the answer.
| leephillips wrote:
| When those libraries are fast, it is because they are using
| Numpy routines written in Fortran or C. And you can get a
| lot done with those libraries, of course. But they're only
| fast if your code can be fit into stereotyped vector
| patterns. As soon as you need to write a loop, you get slow
| Python performance. Python + Scipy would not be a good
| choice for writing an ocean circulation or galaxy merger
| simulation.
|
| EDIT: And last time I checked, Numpy only parallelizes
| calls to supplied linear algebra routines, and only if you
| have the right library installed. A simple vector
| arithmetic operation like a + b will execute on one core
| only.
| spenczar5 wrote:
| I work in research software for astronomy, and I cannot
| agree with that. A very large amount of astronomy
| software is in Python. Numba has gone a long way toward
| making non-vectorized array operations very fast from
| Python.
|
| Most people use a ton of numpy and scipy. It turns out
| that phrasing things as array operations with numpy
| operators is quite natural in this field, including for
| things like galaxy merger simulations.
|
| I work, in particular, on asteroid detection and orbit
| simulation, and it's all pretty much Python.
| devoutsalsa wrote:
| Out of curiosity, how does someone get into the work
| you're doing? Do you just kind of fall into it
| accidentally? Get a PhD in astronomical computing (if
| that's a thing)?
| maxnoe wrote:
| Numba essentially does the same as Julia: compile to LLVM
| bytecode. In Julia that's a language design decision; in
| Python it is a library.
|
| You can get very far with these approaches I python, but
| having these at the language level just has more
| potential for optimization and less friction.
|
| The debuggability of numba code is very limited and code
| coverage does not work at all.
|
| Having a high level language that has scientific use at
| its core is just great.
|
| Python has the maturity and community size on its side,
| but Julia is catching up on that quickly.
| spenczar5 wrote:
| I agree that numba's JITted code needs debuggability
| improvements. I've been working on getting it to work
| with Linux's perf(1) for that reason.
|
| The Julia-for-astronomy community is just microscopic
| right now, so it's hard to find useful libraries. Nothing
| comes close to, say, Astropy[0].
|
| I'm not a huge fan of the current numpy stack for
| scientific code. I just don't think anyone should get too
| carried away and claim that Julia is taking the entire
| scientific world by storm. I don't know anyone in my
| department who has even looked at it seriously.
|
| [0] https://www.astropy.org/
| leephillips wrote:
| I'm aware that there is plenty of serious computation
| done with these tools. I don't want to overstate; I
| merely meant that, for a fresh project, Julia is now a
| better choice for a large-scale simulation. Note that no
| combination of any of the faster implementations of
| Python + Numpy libraries has ever been used at the most
| demanding level of scientific computation. That has
| always been Fortran, with some C and C++, and now Julia.
|
| "It turns out that phrasing things as array operations
| with numpy operators is quite natural in this field"
|
| But if A and B are numpy arrays, then A + B will
| calculate the elementwise sum on a single core only,
| correct? It will vectorize, but not parallelize. All
| large-scale computation is multi-core.
| spenczar5 wrote:
| > Note that no combination of any of the faster
| implementations of Python + Numpy libraries has ever been
| used at the most demanding level of scientific
| computation. That has always been Fortran, with some C
| and C++, and now Julia.
|
| This still seems like an overstatement, but maybe it
| depends on what you mean by "most demanding level." I
| work on systems for the Rubin Observatory, which is going
| to be the largest astronomical survey by a lot. There's a
| bunch of C++ certainly, but heaps of Python. For example,
| catalog simulation
| (https://www.lsst.org/scientists/simulations/catsim) is
| pretty much entirely in Python.
|
| Take a look at `lsst/imsim`, for example, from the Dark
| Energy collaboration at LSST:
| https://github.com/LSSTDESC/imSim.
|
| Maybe this isn't the "most demanding," but I don't really
| see why not.
|
| > But if A and B are numpy arrays, then A + B will
| calculate the elementwise sum on a single core only,
| correct? It will vectorize, but not parallelize.
|
| That's correct, but numba will parallelize the
| computation for you (https://numba.pydata.org/numba-
| doc/latest/user/parallel.html). It's pretty common to use
| numba's parallelization when relevant.
| leephillips wrote:
| By a large-scale calculation I have in mind something
| like this: https://arxiv.org/pdf/2006.09368.pdf, which is
| in your field of astronomy. It uses about a billion dark-
| matter elements and was run on the Cobra supercomputer at
| Max Planck, which has about 137,000 CPU cores. It used
| the AREPO code, which is a C program that uses MPI. If
| you know of any calculation in this class using Python I
| would be interested to hear about it. But generally one
| doesn't have one's team write a proposal for time on a
| national supercomputing center and then, if it is
| approved, when your 100-hour slot is finally scheduled,
| upload a Python script to the cluster. But strange things
| happen.
|
| EDIT: Yes, numba is impressive.
| Hasnep wrote:
| You're right, Python and R are good choices if your goals
| happen to align with what those libraries are optimised
| for, but outside of that you normally need to start writing
| your own C or C++.
| ziotom78 wrote:
| NumPy is not a good comparison, because Julia can produce
| faster code which takes less memory [1]. The Python library
| that is closest to Julia's spirit is Numba [2], and in fact
| I was able to learn Numba in a few hours thanks to my
| previous exposure to Julia. (It probably helps that they
| are both based on LLVM, unlike NumPy.)
|
| However, Numba is quite limited because it only works well
| for mathematical code (it is not able to apply its
| optimizations to complex objects, like lists of
| dictionaries), while on the other side Julia's compiler
| applies its optimizations to _everything_.
|
| [1] https://discourse.julialang.org/t/comparing-python-
| julia-and...
|
| [2] https://numba.pydata.org/
| SatvikBeri wrote:
| There are a few reasons why Julia still tends to be faster
| than numpy:
|
| * Julia can do loop fusion when broadcasting, while numpy
| can't, meaning numpy uses a lot more memory during complex
| operations. (Numba can handle loop fusion, but it's
| generally much more restrictive.)
|
| * A lot of code in real applications is glue code in
| Python, which is slow. I've literally found in some
| applications that <5% of the time was spent in numpy code,
| despite that being 90% of the code.
|
| That said, if your code is mostly in numba with no pure
| python glue code (not just numpy), you probably won't see
| much of a difference.
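The loop-fusion point above can be sketched with no dependencies. An unfused numpy-style expression like `(a + b) * c` materializes a temporary array for `a + b`, while Julia's fused broadcast `@. (a + b) * c` compiles to a single pass. Plain Python lists stand in for arrays here, and the function names are illustrative only:

```python
# Unfused: two passes over the data and one extra allocation,
# mirroring what numpy does for (a + b) * c.
def unfused(a, b, c):
    tmp = [x + y for x, y in zip(a, b)]      # temporary for a + b
    return [t * z for t, z in zip(tmp, c)]   # second pass

# Fused: one pass, no temporary -- the loop a compiler can emit
# for Julia's @. (a + b) * c.
def fused(a, b, c):
    return [(x + y) * z for x, y, z in zip(a, b, c)]
```

For small arrays the difference is noise, but for arrays that don't fit in cache the extra temporary and extra pass dominate, which is the memory overhead the comment describes.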
| notafraudster wrote:
| What's the argument against using R and dropping into RCpp
| for very limited tasks? I (helped) write a very widely used R
| modelling package and while I wasn't doing anything on the
| numerical side, we seemed to get great performance from this
| approach -- and workflow-wise it wasn't too dissimilar to 25
| years ago where I had to occasionally drop in X86 assembly to
| speed up C code!
|
| (Not a hater of Julia at all, very much think it's a cool
| language and an increasingly vibrant ecosystem and have been
| consistently impressed when Julia devs have spoke at events
| I've attended)
| slownews45 wrote:
| One thing I don't like about the two-language approach:
| the deployment story gets more complicated, it seems?
|
| In my case I went to deploy on a musl system, and things
| with the two languages were just a pain to get up and
| running.
|
| Conversely, everything that was native Python ran fine in a
| musl-based Python container.
|
| Your native Python code also moves nicely between
| Windows / Linux / etc.
| hobofan wrote:
| The development story is only as complicated as the
| tooling makes it. With good tooling that e.g. minimizes
| the amount of glue code and/or makes building the native
| parts easy, it doesn't have to be much more complicated.
| _Wintermute wrote:
| I think the argument is that most R users don't know C++.
| So Julia avoids the "2 language problem" that you get with
| modern scientific computing.
| tylermw wrote:
| Not much an argument at all, if you ask me. There's
| definitely a benefit to only having to learn a single
| language (rather than R and C++), but the library/package
| ecosystem in R is hard to beat; unless you're doing truly
| bespoke computational work, the number of mature
| statistical libraries/packages in R is unmatched. Rcpp's
| syntactic sugar means most slow R bottlenecks can be
| written in C++ almost verbatim, but without the interpreted
| performance penalty. One of R's best and under-emphasized
| features is its straightforward foreign-function interface:
| it's easy to create bindings to C/C++/Fortran routines
| (and Rust support is coming along as well).
|
| I've been impressed with Julia, but it's hard to beat 25
| years of package development.
| QuadmasterXLII wrote:
| I think that's just being clumped in with "Use C++," which
| he mentioned as an option
| galangalalgol wrote:
| And the FFI adds a lot of overhead for granular data.
| Julia just works fast. My only friction has been offline
| development, which isn't well supported yet.
| krastanov wrote:
| Interoperability between libraries that expect your code to
| be pure R / pure Python. If you use Rcpp or Cython or the
| CPython C API, you lose much of the magic behind the
| language that enables the cool (but frequently slow)
| features. My biggest pain point in this situation: you
| cannot use SciPy
| or Pillow or Cython code with Jax/Pytorch/Tensorflow
| (except in very limited fashion).
|
| Differential equation solvers that need to take a very
| custom function that works on fancy R/Python objects are
| another example of clumsiness in these drop-to-C-for-speed
| languages. It works and as a performance-nerd I enjoy
| writing such code, but it is clumsy.
|
| That type of interoperability is trivial in Julia.
| tylermw wrote:
| Once your Rcpp code is compiled, it's almost
| indistinguishable from base R (when you're calling it).
| All R functions eventually end up calling R primitives
| written in C, and Rcpp just simplifies the process of
| writing and linking C/C++ code into the R interpreter.
|
| The only difficulty with Rcpp-based R packages is you
| have to ensure the target system can compile the code,
| which means having a suitable compiler available.
| oxinabox wrote:
| > it's almost indistinguishable from base R (when you're
| calling it).
|
| I am very surprised by this, given how extremely dynamic R
| is, with things like lazy evaluation, where arguments can
| be rewritten before they are evaluated via substitute() --
| which I am sure some packages are using in scary and
| beautiful ways.
| krastanov wrote:
| I wonder how much it differs from Python's use of C
| or Cython (I have only superficial R skills). The
| prototypical example of why Python's C prevents
| interoperability is how the introspection needed by Jax
| or Tensorflow (e.g. for automatic GPU usage or automatic
| differentiation) fails when working on Scipy functions
| implemented in C.
|
| For instance, I imagine there is an R library that makes
| it easy to automatically run R code on a GPU. Can that
| library also work with Rcpp functions?
| ska wrote:
| Same argument as python.
|
| In other words, you can (empirically) get a lot done that
| way, but there is always friction.
| cbkeller wrote:
| Composability via dispatch-oriented programming, e.g. [1]
|
| It also pretty much solved my version of the two-language
| problem, but that means different things to different people so
| ymmv.
|
| [1] https://www.youtube.com/watch?v=kc9HwsxE1OY
| agumonkey wrote:
| Kinda tries to make coding like Python but running like Fortran
| (without having to resort to side-batteries like numpy/scipy).
|
| The designers seem to have a good amount of PLT knowledge and
| laid good foundations.
| axpy906 wrote:
| My understanding is that it runs faster than native Python and
| R. That said, with Numba and other libraries, I see no point.
| NeutralForest wrote:
| It's easier to write Julia code than to deal with Numba,
| tbh, and the ecosystem around Julia makes the code
| composable, which is often not the case if you write Numba
| code and have to deal with other libraries.
| tfehring wrote:
| Numba is great for pure functions on primitive types but it
| breaks down when you need to pass objects around. PyPy is
| fantastic for single-threaded applications but doesn't play
| nicely with multiprocessing or distributed computing IME.
| Numpy helps for stuff you can vectorize, but there's a lot of
| stuff you can't (or can but shouldn't); it also brings lots
| of minor inconveniences by virtue of not being a native type
| - e.g., the code to JSON serialize a `Union[Sequence[float],
| np.ndarray]` isn't exactly Pythonic.
| oscardssmith wrote:
| Also, numpy has about 100x overhead for small arrays (10
| or fewer elements).
| danuker wrote:
| > runs faster than native python and R
|
| That's a bit of an understatement. It's about as fast as C
| and Rust (ignoring JIT compilation time).
|
| https://julialang.org/benchmarks/
| dklend122 wrote:
| Composability, speed, static analysis, type system,
| abstractions, user defined compiler passes, metaprogramming,
| ffi, soon static compilation, differentiability and more
| create an effect that far exceeds numba
| gnicholas wrote:
| Are A-rounds now well into the $20M range?
|
| I remember when they hit $10M and assumed they had continued to
| grow somewhat. But I didn't realize we'd blown well past $20M --
| when did that happen?
| oscardssmith wrote:
| I think a lot of it is that this is a series A for a 6 year old
| company that is already making money. Most other startups would
| be on series B by now, so comparing this number to the Series A
| of a 10-person startup that doesn't have a product yet doesn't
| make a lot of sense.
| mrits wrote:
| I don't have recent data but I'd guess most startups would
| have been dead for several years, not on series B.
| KenoFischer wrote:
| I guess this would be a good place to mention that we're hiring
| for lots of positions, so if you would like to help build
| JuliaHub, or work on compilers, or come play with SDRs, please
| take a look at our job openings :) :
| https://juliacomputing.com/jobs/
| p_j_w wrote:
| >or come play with SDRs
|
| This sounds like an absolute dream!
| paulgb wrote:
| What's SDR in this context? Not software-defined radio,
| right? (Though I suppose Julia is a good fit for signal
| processing!)
| KenoFischer wrote:
| Yes, software-defined radio, we have a very broad set of
| interests, and that happens to be one of the open jobs :).
| slownews45 wrote:
| Very cool. SDRs are in the process of taking over ham
| radio as well, I think (give it another 5 years). So
| flexible.
| p_j_w wrote:
| Understandable if you can't answer this question, but how
| much work have you guys done with SDRs and arrays?
| KenoFischer wrote:
| This project is just starting, so I have hardware sitting
| on my desk and have used it a bit, but other than that
| not much.
| p_j_w wrote:
| I see. Well I wish you guys luck on that!
| sidpatil wrote:
| The article mentions a circuit simulation package named
| JuliaSPICE, but I can't find any information on it. Can someone
| please provide a link?
| oscardssmith wrote:
| It's not released yet. The official announcement is happening
| at JuliaCon in 2 weeks.
| KenoFischer wrote:
| There's not really much public about it yet. There'll be a
| technical talk about it at JuliaCon and we're talking to
| initial potential customers about it, but it's not quite ready
| for the wider community yet. If you want some of the technical
| details, I talked about it a bit in this earlier thread
| https://news.ycombinator.com/item?id=26425659 about the DARPA
| funding for our neural surrogates work in circuits (which will
| be part of the product offering, though the larger product is a
| modern simulator + analog design environment, which is supposed
| to address some of the pain of existing systems with the ML
| bits being a really nice bonus).
| KenoFischer wrote:
| Oh, I suppose I should add if you're looking to use something
| like this in a commercial setting, please feel free to reach
| out. Either directly to me, or just email info@ and you'll
| get routed to the right place.
| ChrisRackauckas wrote:
| For the earliest details, see the press release from our DARPA
| project: https://juliacomputing.com/media/2021/03/darpa-ditto/
| . This is being done in a way where fully usable software is
| the result, so that those accelerations are not just a one-off
| prototype but a product that everyone else can use by the end.
| For more details, wait until next week's JuliaCon.
| infogulch wrote:
| Congrats!
|
| I tried Julia for the first time last week and it was great.
|
| I've been playing with the idea of defining a hash function for
| lists that can be composed with other hashes to find the hash of
| the concatenation of the lists. I tried to do this with matrix
| multiplication of the hash of each list item, but with
| integer mod-256 elements, random matrices are very likely
| to be singular, and after enough multiplications the
| product degenerates to the zero matrix.
| However, with finite field (aka Galois fields) elements, such a
| matrix is much more likely to be invertible and therefore not
| degenerate. But I don't really know anything about finite fields,
| so how should I approach it? Here's where Julia comes in: with
| some help I was able to combine two libraries, LinearAlgebraX.jl
| which has functions for matrices with exact elements, with
| GaloisFields.jl which implements many types of GF, and wrote up a
| working demo implementation of this "list hash" idea in a
| Pluto.jl [2] notebook and published it [0] (and a question on SO
| [1]) after a few days without having any Julia experience at all.
| Julia seems pretty approachable, has great libraries, and is very
| powerful (I was even able to do a simple multithreaded
| implementation in 5 lines).
|
| [0]: https://blog.infogulch.com/2021/07/15/Merklist-GF.html
|
| [1]: https://crypto.stackexchange.com/questions/92139/using-
| rando...
|
| [2]: https://github.com/fonsp/Pluto.jl
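The composable-hash idea above can be sketched over a prime field instead of GF(2^8) (any field works; a large prime modulus makes accidental singularity negligible). Everything below -- the names, the 2x2 matrix size, the modulus -- is an illustrative choice, not the post's actual implementation:

```python
import hashlib

# Map each list element to a 2x2 matrix over GF(P), and hash a list as
# the ordered product of its elements' matrices. Matrix multiplication is
# associative but not commutative, so hash(A ++ B) == hash(A) * hash(B)
# while element order still matters.
P = (1 << 61) - 1  # a Mersenne prime, so arithmetic mod P is a field

def elem_matrix(x):
    """Deterministically derive a 2x2 matrix over GF(P) from an element."""
    d = hashlib.sha256(repr(x).encode()).digest()
    a, b, c, e = (int.from_bytes(d[i:i+8], "big") % P for i in range(0, 32, 8))
    if (a * e - b * c) % P == 0:      # nudge to ensure invertibility
        e = (e + 1) % P
    return ((a, b), (c, e))

def matmul(m, n):
    """2x2 matrix product mod P."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return (((a*e + b*g) % P, (a*f + b*h) % P),
            ((c*e + d*g) % P, (c*f + d*h) % P))

IDENTITY = ((1, 0), (0, 1))

def list_hash(xs):
    """Hash of a list; hash of an empty list is the identity matrix."""
    h = IDENTITY
    for x in xs:
        h = matmul(h, elem_matrix(x))
    return h
```

Because the product is associative, `matmul(list_hash(a), list_hash(b))` equals `list_hash(a + b)`, which is the concatenation property the post is after.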
| tvladeck wrote:
| I would really love to use Julia, and for my team to as well. But
| we are too locked into R to even begin. If I were these folks, I
| would focus on the flow, not stock, of data analysts. Get the
| next generation locked in to Julia. Turn R into SPSS
| cauthon wrote:
| They have to offer a plotting library that's at least half as
| decent as ggplot if they want people to jump ship. Gotta be
| able to visualize your results easily, and they've been stuck
| with a half-baked interface to Python plotting libraries for
| years
| moelf wrote:
| http://makie.juliaplots.org/stable/
|
| hopefully will soon be a dominant force in this direction
| spywaregorilla wrote:
| I played with Julia a bit in grad school. Although I didn't end
| up using it much after that I thought it was a lovely language.
| Forget python, I hope Julia manages to kill off Matlab and its
| weird stranglehold on various pockets of academia. Congrats to
| the team here.
| mint2 wrote:
| Matlab, the SAS of academia and certain engineering fields.
| helmholtz wrote:
| I won't hear of it. Matlab is great. Superb at manipulating
| matrices, great for getting started with differential
| equations, world-class plotting library, and _massively_
| forgiving. All the things a computational engineer like me
| needs.
|
| I would never use it for producing software meant for
| distribution, but people mainly hate on it because it's
| 'cool', without realising that it excels at what it does. I
| fucking love matlab.
| swordsmith wrote:
| +1 on this. Matlab toolboxes are much better tested, have
| thorough documentation, and work much better out of the box,
| especially for controls and signal processing. Working with
| scipy can be a pain sometimes with incorrect or unstable
| results.
| maybelsyrup wrote:
| I know little about Matlab, but I know a lot about SAS,
| where it's the default analytical environment in public
| health practice and research. SAS is trash, and a ripoff to
| boot. R has made good inroads, giving SAS a tiny bit of
| pressure, but frankly it needs more. Bring on Julia! And
| python and anything else.
| wickedsickeune wrote:
| I guess you mean SAP?
| jcuenod wrote:
| Had the same thought but I think SAS is correct:
| Statistical Analysis System
| (https://www.sas.com/en_us/company-
| information/profile.html)?
| awaythrowact wrote:
| Congrats to the Julia team.
|
| I am a python developer who has dabbled with Julia but it never
| stuck for me.
|
| I think Julia was built by academics for other academics running
| innovative high performance computing tasks. It excels at the
| intersection of 1) big data, so speed is important, and 2)
| innovative code, so you can't just use someone else's C package.
| Indeed, Julia's biggest successful applications outside academia
| closely resemble an academic HPC project (eg Pumas). I think it
| will continue to have success in that niche. And that's not a
| small niche! Maybe it's enough to support a billion dollar
| company.
|
| But most of us in industry are not in that niche. Most
| companies are not dealing with truly big data; at our scale,
| it is cheaper to expand the cluster than to rewrite
| everything in Julia. Most who ARE dealing with truly big data
| do not need innovative code; basic summary statistics and
| logistic regression will be good enough, or maybe some cloud
| provider's prepackaged turnkey scale-out neural-net training
| system if they want to do something fancy.
|
| I think for Julia to have an impact outside of academia (and
| academia-like things in industry) it will need to develop killer
| app packages. The next PyTorch needs to be written in Julia. Will
| that happen? Maybe! I hope so! The world would be better off with
| more cool data science packages.
|
| But I think the sales pitch of "it's like Pandas and scikit but
| faster!" is not going to win many converts. So are Jax, Numba,
| Dask, Ray, Pachyderm, and the many other attempts within the
| Python community at scaling and speeding up Python -- and they
| require much less work and expense on my part for the same
| outcome.
|
| Again, congrats to the team, I will continue to follow Julia
| closely, and I'm excited to see what innovative capabilities come
| out of the Julia ecosystem for data scientists like me.
| krastanov wrote:
| There is another important niche I am particularly excited
| about: programming language research geeks and lisp geeks. The
| pervasive multiple dispatch in Julia provides such a
| beautiful way to architect a complicated piece of code.
| vletal wrote:
| > The pervasive multiple dispatch in Julia provides such a
| > beautiful way to architect a complicated piece of code.
|
| On the other hand it makes function discovery almost
| impossible [1]. Combined with the way the 'using' keyword
| exports a predefined subset of functions, this dooms the
| language to miss larger adoption outside of academia, at
| least as long as there is no superb autocompletion and IDE
| support.
|
| [1] https://discourse.julialang.org/t/my-mental-load-using-
| julia...
| mccoyb wrote:
| Have you seen Shuhei Kadowaki's work on JET.jl?
|
| If you're curious: https://github.com/aviatesk/JET.jl
|
| This may seem more about performance (than IDE development)
| but Shuhei is one of the driving contributors behind using
| the compiler's capabilities for IDE integration -- and
| indeed JET.jl contains the kernel of a number of these
| capabilities.
| ampdepolymerase wrote:
| Those two niches don't pay. They are effectively useless
| outside of the occasional evangelism on HN.
| pjmlp wrote:
| Who do you think puts money to action on LLVM, GCC, Swift,
| Rust, .NET (C#, F#, VB), Scala,....?
| trenchgun wrote:
| Money does not create code, humans do.
| UncleOxidant wrote:
| There's a Compiler and Runtime Engineer position listed at
| https://juliacomputing.com/jobs/ which presumably pays
| money.
| pjmlp wrote:
| Yep, that is my niche.
| systemvoltage wrote:
| Great summary. I've worked with scientists that love Julia and
| more power to them. As a software engineer, there are still
| rough edges in productionizing Julia (yes, I know there are a
| few examples of large scale production code). As soon as you
| take Julia out of notebooks and try to build moderately complex
| apps with it, you realize how much you miss Python. Having used
| Julia for last 4 years and having to maintain that code in
| production environment, I am strongly convinced that Julia has
| a niche but it is not going to be a contestant to
| Python/Java/C++ depending on the use case. Which really is a
| shame - I want one goddamn language to rule them all. I _want_
| that and tried to give a fair chance to Julia.
| amkkma wrote:
| > As soon as you take Julia out of notebooks and try to build
| moderately complex apps with it, you realize how much you
| miss Python
|
| Why's that? What features or lack thereof of Julia contribute
| to that experience?
| systemvoltage wrote:
| There is too much to talk about and I'd want to give an
| objective impression with examples in a blog post, but one
| of the major gripes I have is how little information Julia
| gives you in stack traces. Debugging production
| problems with absolutely _zero_ clue of what/where the
| problem might be is one of the most frustrating aspects.
| I've spent so many hours debugging Julia using print
| statements. Debugger is janky, Juno/Atom support is not
| very good. Nothing feels polished and robust. Dependencies
| break all the time. We are stuck with Julia 1.2 and too
| much of an undertaking to go to latest version. Package
| management is an absolute disaster - this is the case with
| Python but Julia is worse. Julia Registry has many issues
| (compat errors). Testing and mocking is also underwhelming.
| I think startup times have improved but still not very
| developer-friendly. Sorry, not the objective answer you're
| looking for. There are also things that Python has, such as an
| excellent web-dev ecosystem and system libs that are
| missing in Julia. Python has everything. Want to generate a
| QR code image? Easy in Python. Want to create PDF reports
| with custom ttf fonts? A breeze in Python.
| amkkma wrote:
| Thanks for sharing! I think much of it has to do with the
| version you're stuck on. (ie startup and compile times
| are way way better with 1.6)
|
| Python having a broader ecosystem is a very good point,
| but I've found pycall to be very helpful
| habibur wrote:
| This matches with my experience with an old version of
| Julia too. But let's not focus on the negative aspects
| here. People have their different sets of priorities.
| awaythrowact wrote:
| I second most of this.
|
| A lot of problems are fixable with time and money. Maybe
| the Series A will help!
|
| But some problems might be related to Julia's design
| choices. One thing I really missed in Julia is Object
| Oriented Programming as a first class paradigm. (Yes I
| know you can cobble together an approximation with
| structs.)
|
| OOP gets a lot of hate these days. Mostly deserved. But
| in some large complex projects it's absolutely the right
| abstraction. I've used OOP in several Python projects.
| Most of the big Python DS/ML packages use OOP.
|
| Maybe you think PyTorch, SciKit, etc are all wrong, and
| would be better off with a more functional style. I know
| it's fashionable in some circles to make fun of the
| "model.fit(X,Y); model.predict(new_data)" style of
| programming but it works well for us busy professionals
| just trying to ship features.
|
| I don't think Julia is wrong for enforcing a more
| functional style. It probably makes it easier for the
| compiler to generate great performance.
|
| But Python has achieved its success because of its
| philosophy of being the "second best tool for every job"
| and that requires a more pragmatic, multiparadigm
| approach.
| amkkma wrote:
| How is "model.fit(X,Y)" better than "fit!(model,X,Y)"?
|
| Julia is object-oriented in a broad sense; it just uses
| multiple dispatch, which is strictly more expressive than
| single dispatch, so it doesn't make sense to have dot
| notation for calling methods, because types don't own
| methods.
|
| In exchange for giving up some facility in function
| discovery, you get speed, composability, generic code...
| and a net gain in usability, because you can have one array
| abstraction for GPUs, CPUs, etc., which is just an instance
| of having common verbs across the ecosystem (enabled by MD).
| Instead of everyone having their own table type or stats
| package, you have Tables.jl or Statsbase.jl that packages
| can plug into and extend without the issues inherent in
| subclassing, monkeypatching etc.
|
| This is a much better, more powerful and pleasant
| experience
|
| Closing the gap in method discovery will simply require a
| language feature with tooling integration, where you
| write the types and then tab to get the functions.
| There's already an open issue/PR for this
| awaythrowact wrote:
| I'll give you my two cents, recognizing that I very well
| might just be ignorant about Julia and multiple dispatch,
| and if so please continue to educate me.
|
| Consider if we want to run many different types of
| models. Logistic regression, gradient boosting, NNs, etc.
| We want the ability to easily plug in any type of model
| into our existing code base. That's why model.fit(X,Y) is
| attractive. I just need to change "model =
| LogisticRegressionModel" to "model =
| GradientBoostingModel" and the rest of the code should
| still Just Work. This is a big part of SciKit's appeal.
|
| But all these different models have very different
| training loops. So with "fit!(model,X,Y)" I need to make
| sure I am calling the compatible "fit" function that
| corresponds to my model type.
|
| You might now say "Ah! Multiple dispatch handles this for
| you. The 'fit' function can detect the type of its
| 'model' argument and dispatch execution to the right
| training loop sub function." And I suppose that's
| theoretically correct. But in practice I think it's
| worse.
|
| It should be the responsibility of the developer of
| "model" to select the "fit" algorithm appropriate for
| "model." (They don't have to implement it, but they do
| have to import the right one.) The developer of "fit"
| should not be responsible for handling every possible
| "model" type. You could have the developer of "model"
| override / extend the definition of "fit" but that opens
| up its own can of worms.
|
| So is it possible to do the same thing with
| "fit!(model,X,Y)"? Yes of course it is. It's possible to
| do anything with any turing complete language. The point
| is, which system provides the best developer ergonomics
| via the right abstractions? I would argue, in many cases,
| including this one, it's useful to be able to bundle
| functions and state, even if that is in theory "less
| flexible" than pure functions, because sometimes
| programming is easier with less flexibility.
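For what it's worth, the swap-the-model-type ergonomics described here are also available in the generic-function style: Python's own stdlib `functools.singledispatch` is a single-argument cousin of Julia's multiple dispatch. A toy sketch with hypothetical model names, not any real library's API:

```python
from dataclasses import dataclass, field
from functools import singledispatch

# Hypothetical model types, standing in for scikit-style estimators.
@dataclass
class LogisticRegressionModel:
    coef: list = field(default_factory=list)

@dataclass
class GradientBoostingModel:
    trees: list = field(default_factory=list)

# Generic fit(model, X, y); dispatch on the model's type selects
# the training loop, like Julia's fit!(model, X, y).
@singledispatch
def fit(model, X, y):
    raise NotImplementedError(f"no fit method for {type(model).__name__}")

@fit.register
def _(model: LogisticRegressionModel, X, y):
    model.coef = [0.0] * len(X[0])   # placeholder training loop
    return model

@fit.register
def _(model: GradientBoostingModel, X, y):
    model.trees = ["stump"]          # placeholder training loop
    return model
```

Swapping `LogisticRegressionModel()` for `GradientBoostingModel()` reroutes `fit` with no change to the calling code, which is the same property `model.fit(X, y)` gives you; Julia generalizes this by dispatching on every argument, not just the first.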
| gbrown wrote:
| Have you checked out MLJ? I think their interface does a
| pretty good job of what you're discussing.
| amkkma wrote:
| Thanks, I see where you are coming from.
|
| >It should be the responsibility of the developer of
| "model" to select the "fit" algorithm appropriate for
| "model." (They don't have to implement it, but they do
| have to import the right one.) The developer of "fit"
| should not be responsible for handling every possible
| "model" type. You could have the developer of "model"
| override / extend the definition of "fit" but that opens
| up its own can of worms.
|
| It's really the same thing as python, just better...I
| don't see the distinction you are drawing.
|
| In python you have a base class with default behavior.
| You can subclass that and inherit or override.
|
| Julia has abstract types with interfaces...instead of
| relying on implementation details like fields, you
| provide functions so that more types of models can work
| even if they don't have that one specific field.
| Otherwise everything is the same where it counts -- you
| can compose, inherit and override. Even better, you can
| work with multiple models and types of data, inheriting
| where you see fit.
|
| I don't see any benefit to python's restrictions here,
| either in ease of use or in expressiveness.
|
| For all intents and purposes it's a strict superset.
|
| Even better, you can use macros and traits to group
| different type trees together.
|
| https://www.stochasticlifestyle.com/type-dispatch-design-
| pos...
|
| These seem to be in contradiction:
|
| >It should be the responsibility of the developer of
| "model" to select the "fit" algorithm appropriate for
| "model.
|
| >You could have the developer of "model" override /
| extend the definition of "fit" but that opens up its own
| can of worms.
|
| It's the same in python, either you inherit Fit or you
| can override it. What's the difference with Julia?
|
| Except in Julia all types and functions have a multiple
| dispatch, parametric type and possible trait lattice of
| things you can override, customize and compose, so that
| even if the model author has to override fit, they can do
| it using small composable building blocks.
| awaythrowact wrote:
| I agree you can achieve the same benefits with macros.
| Indeed, I see that MLJ, Julia's attempt at a SciKit-type
| project, makes extensive use of macros. But I personally
| think macros are an antipattern. In large projects, they
| can introduce subtle bugs. Especially if you're using
| multiple modules that are each editing your code before
| compile time and that don't know about each other. I know
| others in Julia community agree that Macros are
| dangerous.
|
| I think abstract types are a brittle solution. The "can
| of worms" I alluded to is something like this: Library
| TensorFlow implements model "nn" and Library PyTorch also
| implements model "nn" and they both want to override
| "fit" to handle the new type "nn"... Good luck combining
| them in the same codebase. This problem is less
| pronounced in OOP where each development team controls
| their own method. Julia devs can solve this by having
| every developer of every "fit" function and every
| developer of every "model" struct agree beforehand on a
| common abstraction, but that's an expensive, brittle
| solution that hurts innovation velocity.
|
| I think the closest I can do in Julia via pure structs is
| for the developer to define and expose their preferred
| fit function as a variable in the struct, something like
| "fit = model['fit_function']; fit(model,X,Y)" but that
| introduces a boilerplate tax with every method I want to
| call (fit, predict, score, cross-validate, hyperparameter
| search, etc). (EDIT: indeed, I think this is pretty much
| what MLJ is doing, having each model developer expose a
| struct with a "fit" and "predict" function, and using the
| @load macro to automate the above boilerplate to put the
| right version of "fit" into global state when you @load
| each model... but as described above, I don't like macro
| magic like this.)
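| Roughly, the struct-field pattern I'm describing would look
| like this (a sketch with hypothetical names, not MLJ's actual
| API):

```julia
# Hypothetical sketch of the struct-field workaround: the model
# carries its own preferred fit function, and every call pays a
# small boilerplate tax to pull it out first.
struct WrappedModel
    fit::Function
    params::Dict{String,Any}
end

my_fit(m::WrappedModel, X, y) = "fit on $(size(X, 1)) rows"

m = WrappedModel(my_fit, Dict{String,Any}())
fit = m.fit                      # the boilerplate step, per method
println(fit(m, rand(5, 2), rand(5)))
```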
| east2west wrote:
| I don't miss OOP in Julia but I do feel there need to be
| more ways to abstract things than multiple dispatch and
| structs. One thing I do miss is interfaces, which can
| group functions for a common purpose. I understand it may
| not be feasible in a dynamic language, but hopefully
| there will be something above functions as an abstraction
| mechanism.
| chrispeel wrote:
| > One thing I really missed in Julia is Object Oriented
| Programming as a first class paradigm.
|
| Julia peeps would tell you that the multiple dispatch
| used by Julia is a generalization of OOP. And that they
| like multiple dispatch.
| cbkeller wrote:
| Yes indeed. While Julia does generally eschew the
| "model.fit(X,Y); model.predict(new_data)" style, it's not
| because it's functional, it's because it's
| _dispatch-oriented_, which is arguably a superset of
| class-based
| OO, and even perhaps arguably closer to Alan Kay's
| claimed original vision for OO than modern class-based OO
| is.
| amkkma wrote:
| I really don't understand your point about package
| management.
|
| Instead of pip, virtualenv, conda, etc., there's one
| package manager that resolves and installs native and
| binary dependencies, ensures reproducibility with human
| readable project files, is accessible from the REPL etc.
|
| You can get an entire GPU based DL stack up and running
| on a windows machine in under 30 min, with a few
| keystrokes and no special containers or anything like
| that. You don't even have to install the CUDA toolkit.
| It's a dream, and I've heard the same from recent Python
| converts.
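| For concreteness, the whole workflow is roughly this (a
| sketch; Pkg.add naturally needs network access, so it's
| shown commented out):

```julia
# Minimal sketch of Julia's built-in Pkg workflow.
using Pkg
Pkg.activate(; temp = true)  # throwaway project environment
# Pkg.add("CUDA")            # one line resolves source + binary deps
Pkg.status()                 # backed by readable Project/Manifest.toml
```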
| mbauman wrote:
| > We are stuck with Julia 1.2
|
| Gosh, that's really old and it's not even an LTS version.
| I fear many of your woes stem from that. Julia 1.6 has
| huge improvements, and the community has rallied around
| VS Code now that Atom seems to be dying.
|
| It really shouldn't be too bad to update.
| leephillips wrote:
| This is the first time I've heard somebody say that
| Julia's package management is worse than Python's! For
| most people who have spent years grappling with the
| succession of unofficial attempts to supply Python with
| something like package management, including virtual
| environments, etc., and the resulting dependency hell,
| Julia's integrated and sophisticated package management
| (even with its own special REPL mode) is refreshing. I
| don't doubt your experiences are real, but suspect you
| have just had really bad luck.
|
| 1.2 is pretty ancient. Current, or even recent, versions
| of Julia have a fraction of the startup time
| (https://lwn.net/Articles/856819/). Package management
| has been refined further, as well.
| systemvoltage wrote:
| I'm a tortured soul, my opinions might be biased and I
| hope things improve with Julia.
|
| We absolutely cannot upgrade our Julia version right now:
| dozens of repos full of complicated scientific code.
| Management doesn't care as long as it barely runs. I
| don't think it's fair to blame Julia for that, but it
| just shows how much further the language needs to go.
| That should be looked at as a positive thing.
|
| I have one more complaint for the Julia community -
| please don't be too defensive. Accept and embrace
| criticism and turn it into a propellant for improving the
| language.
| I've seen a cult-like behavior in Julia community that is
| borderline toxic. Sorry, it's the truth and needs to be
| said. Speed isn't everything and people are magnetized by
| the benchmarks on the Julia website, especially non-
| software engineers.
| amkkma wrote:
| >I've seen a cult-like behavior in Julia community that
| is borderline toxic. Sorry, it's the truth and needs to
| be said. Speed isn't everything and people are magnetized
| by the benchmarks on the Julia website, especially non-
| software engineers.
|
| I think all languages have this dynamic... I've seen it
| with Python and R. To some extent it's fed by what we
| perceive as criticism from people defending their
| favorite incumbent language with arguments that aren't at
| all informed - such as a focus on speed and claims that
| numba achieves parity there.
|
| In the same vein, I and many Julia users are enthusiastic
| precisely because of things other than speed, such as the
| type system, abstractions, differentiability and other
| things that make programming more joyful, fluid and
| productive.
|
| Agree though, that we could always improve on acceptance
| of criticism.
| leephillips wrote:
| Well you're right about the libraries--Python really does
| have everything (although sometimes those libraries are
| not great quality, but at least they mostly work). But
| Julia makes it easy to use Python libraries with Pycall.
| And there are big things that just don't exist yet in the
| Julia world, such as a web framework. I recently tried to
| create a web project using Genie.jl, which advertises
| itself as a ready-for-prime-time web framework for Julia,
| and I gave up after a few days. It's just not in the same
| universe as something like Django, plus it's barely
| documented.
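| (For reference, the PyCall route mentioned above is about
| this simple - assuming PyCall.jl and a Python with numpy
| are installed:)

```julia
# Sketch of calling a Python library from Julia via PyCall.jl.
using PyCall
np = pyimport("numpy")
a = np.arange(6)        # returned data converts to a Julia array
println(sum(a))         # plain Julia functions work on it directly
```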
| snicker7 wrote:
| What warts did you encounter? What language features /
| tooling would make Julia easier to productionize?
| jjoonathan wrote:
| > The next PyTorch needs to be written in Julia. Will that
| happen? Maybe! I hope so!
|
| The Two Language Problem makes this more likely than one might
| think. Those high level python packages that plaster over
| python's many mediocrities have to be written and maintained by
| someone, and while extremistan bears the brunt of the pain and
| has done a remarkable job shielding mediocristan, it's
| extremistan that gets to decide which language to use for the
| next killer app.
|
| Of course, python has more inertia than god, so it won't go
| quietly.
| stillwater919 wrote:
| Your comment almost reads like a poem
| ssivark wrote:
| > do not need innovative code
|
| I think this is a good deciding factor. Not just for "big
| data". In my experience, with its combination of flexibility
| and raw speed, Julia makes implementing new algorithms (from
| scratch) a breezy experience. The strong research/academic
| presence in the community also helps towards encouraging decent
| Julia libraries for a lot of cutting edge work.
|
| So if you are working in an area where that could make a
| significant difference, it's an excellent reason to use Julia.
|
| > will need to develop killer app packages. The next PyTorch
| needs to be written in Julia. Will that happen?
|
| If enough cutting-edge work happens in Julia, it's likely that
| a few great tools/platforms will emerge from that. We're
| already seeing that in the scientific modeling ecosystem (as an
| example), with Differential Equations infrastructure and
| PumasAI.
| queuebert wrote:
| > The next PyTorch needs to be written in Julia.
|
| If a major company would pick up Flux.jl and fill out the
| ecosystem that would be AMAZING.
|
| PyTorch and Tensorflow feel like duct tape and chewing gum all
| day, every day.
| alfalfasprout wrote:
| Frankly I think the key thing that'll really get a lot of Julia
| adoption is a full-featured ML framework on par with TF, Pytorch,
| etc.
|
| What we've noticed is that the vast majority of the time it's the
| data scientist's code that's slow, not the actual ML model bit. So
| allowing them to write very performant code with a numpy-like
| syntax and not have to deal with painfully slow pandas, the lack
| of true parallelism, etc. would be a true game changer for ML in
| industry.
| tbenst wrote:
| Agreed! Flux & other Julia ML packages are awesome and have
| best-in-class APIs. Performance and memory usage aren't yet on
| par with TF/PyTorch (or at least they weren't when I last
| checked last year), but with more contributors and time I could
| see this gap closing, and I would love to use Julia for ML work.
| amkkma wrote:
| I believe we are currently at PyTorch parity (and sometimes
| better) for speed. Memory usage depends... And this is
| without the extensive upcoming compiler improvements.
|
| The reason for the lag is that Julia has been focusing on
| general composable compiler, codegen, and metaprogramming
| infrastructure which isn't domain specific, whereas PyTorch
| and friends have been putting lots of dev money into C++
| ML-focused optimizers.
|
| Once the new compiler stuff is in place, it would be
| relatively trivial to write such optimizations, in user
| space, in pure Julia. Exceeding that would then be fairly
| simple too, plus things like static analysis of array shapes.
| mccoyb wrote:
| Have you explored the SciML landscape at all?
|
| https://sciml.ai/
|
| There are a number of components here which enable (what I
| would call) the expression of more advanced models using
| Julia's nice compositional properties.
|
| Flux.jl is of course what most people would think of here (one
| of Julia's deep learning frameworks). But the reality behind
| Flux.jl is that it is just Julia code -- nothing too fancy.
|
| There's ongoing work for AD in several directions -- including
| a Julia interface to Enzyme:
| https://github.com/wsmoses/Enzyme.jl
|
| Also, a new AD system which Keno (who you'll see comment below
| or above) has been working on -- see Diffractor.jl on the
| JuliaCon schedule (for example).
|
| Long story short -- there's quite a lot of work going on.
|
| It may not seem like there is a "unified" package -- but that's
| because packages compose so well together in Julia, there's
| really no need for that.
| hawk_ wrote:
| It's not obvious to me what their revenue model is.
| abalaji wrote:
| If Mathworks can make money, so can Julia Computing
| KenoFischer wrote:
| Nothing complicated.
|
| Stream 1: Build amazing products for particular domains, charge
| license fees
|
| Stream 2: Build a great SaaS platform for running Julia, charge
| for compute
|
| Since all of our domain products are built in Julia and often
| involve significant compute cost for their intended
| application, hopefully both at the same time :).
| hawk_ wrote:
| Thanks for the answer, Keno. I guess an example Stream 1
| product is Pumas; I didn't realize it's a separate product
| from Julia. My background is in finance and I am curious
| whether you have any plans to break into that domain (the
| examples on your website include Julia language use).
| KenoFischer wrote:
| Finance was a focus area early on and we have a fair number
| of consulting clients there and JuliaHub is available of
| course, but we were never able to figure out a dedicated
| domain-specific, non-niche product to sell into the space.
| Maybe in the future.
| hpcjoe wrote:
| FWIW: I am using it as a general purpose language at the
| intersection of large data sets, analytics, and related
| bits. At a prop shop.
|
| YMMV, but I find it is fantastic in this use case. And I
| don't have to worry about semantic space.
| slownews45 wrote:
| My own view is that Julia Computing is distinguished a bit by
| a more product focus.
|
| It's a bit less of a pure language / infra play and more a
| product play. Ie, docker / containers was almost a pure infra
| play in the end. These guys make actual things you can use.
|
| The later sells better into business I think and is less
| likely to be competed against. Google / AWS et al are
| generally pretty quick to compete on the infra play level.
| [deleted]
| grumblenum wrote:
| Uber for numpy, perhaps. The best case scenario would be like
| Oracle and JavaSE. I lack the imagination to speculate on the
| worst case, so I am quite surprised that anyone expects to make
| more than $24M in profit from yet another _programming
| language_. Particularly one which caters to the narrow
| intersection of one-off scripts/rapid iteration and
| efficient machine code generation, but without limitations
| on memory usage.
| ChrisRackauckas wrote:
| The product is not the programming language. As addressed in
| there and on this board, there's a healthy set of products
| around pharmaceutical modeling and simulation (Pumas-AI,
| adopted by groups like Moderna), a cloud compute service
| JuliaHub, multi-physics simulation tools (JuliaSim), and
| upcoming verticals like JuliaSPICE for circuit simulation.
| Each of these is itself an entrant into a billion-dollar
| industry, with tools that are, in some cases, already
| outperforming the current market leaders in computational
| speed and are quickly gaining modern reactive GUIs.
| part is simply that it was founded by the creators of the
| Julia programming language and this stack is then (of course)
| built in Julia, leveraging all of its tools to reach these
| speeds. But the product is not the language itself.
| sneilan1 wrote:
| I'm confused as to why Julia, a programming language is worth so
| much money. If the makers of Julia have already given away their
| source code here, https://github.com/JuliaLang/julia what are
| they selling that's worth a 24 million series A round?
|
| Is the Julia business model similar to Redhat or Canonical where
| they sell consulting services?
| freemint wrote:
| Julia Computing is a company that employs (some of) the core
| contributors of Julia and develops enterprise solutions.
|
| Such as:
|
| - the first Julia IDE, Juno, now deprecated
|
| - On premise package server
|
| - A wrapper over AWS called JuliaRun that has a nice web
| interface
|
| - paid "we make a core developer stare at RR traces of your
| problems"
|
| - FDA-approved software for drug development and
| pharmacokinetics at https://juliacomputing.com/products/pumas/
|
| It is not the programming language that is the product.
| amkkma wrote:
| Hi. Good question.
|
| This is addressed in several places in the comment thread by
| Keno and Chris.
| sneilan1 wrote:
| Thank you. I'll do a search for that.
| darksaints wrote:
| I'm sure there is some good in there to have some solid funding
| for additional development, but now that it's a commercial
| venture, I'm terrified to see the revenue model. The moment you
| build your profit platform on top of someone else's profit
| platform, you become someone else's servant.
| meowkit wrote:
| This is my concern too. I skimmed the article and I guess its
| going to be something along the lines of locking certain sim
| modules behind a subscription like Autodesk/Fusion360?
|
| Happy to be wrong here.
| mbauman wrote:
| Nothing has changed here. Julia Computing has always been a
| commercial venture -- that's its reason-for-being, providing
| enterprise support and products built on the language. The
| Julia Language has always been (and always will be) open
| source. The two are completely separate[1], but we at Julia
| Computing invest greatly in the language itself -- our success
| as a company is directly linked to the language's success.
|
| 1. https://julialang.org/blog/2019/02/julia-
| entities/#julia_com...
| KenoFischer wrote:
| These kinds of concerns are not unreasonable in general, of
| course, but in this case let me point out that Julia Computing
| has been a commercial enterprise for more than six years. Also,
| our commercial product is deliberately not Julia; rather,
| we're building our products on top of Julia, just like anyone
| else might. In fact, there are several startups unrelated to us
| that have built multimillion-dollar businesses entirely in
| Julia. At this point there are a bunch of companies that depend
| on Julia and are committed to its future - we're just one
| among them.
| jstx1 wrote:
| You're not _just_ one among them given how much control you
| have over the language itself. Those other companies aren 't
| founded by the co-creators of and main contributors to the
| language.
| KenoFischer wrote:
| Sure, but there are two separate concerns here. One is that
| money will turn us evil and we'll exercise undue influence.
| My counterpoint was that JC has been around for six years
| now and in that time we've actually strengthened Julia
| significantly as an independent project. My other point
| though was about concentration risk. I think far more
| common than people turning evil is that companies go all
| out raising money, become the only people developing a
| project and then if the revenue doesn't come as planned,
| the company and the project fail together with unpleasant
| results. At this point, if Julia Computing fails, Julia the
| language will survive no problem. Of course we're not
| planning on failing, but before going out on this
| commercial path, it was hugely important for all of us that
| Julia is on solid footing. For most of us, what we've built
| in Julia is our "life's work" (ok, it's only been 10 years,
| but that's a substantial effort still) and we're not
| planning to let that just die.
| [deleted]
| gfodor wrote:
| Congrats Julia team!
| StefanKarpinski wrote:
| Thanks, man :D
___________________________________________________________________
(page generated 2021-07-19 23:01 UTC)