[HN Gopher] A new programming language for high-performance computers
___________________________________________________________________
A new programming language for high-performance computers
Author : rbanffy
Score : 69 points
Date : 2022-02-11 20:01 UTC (2 hours ago)
(HTM) web link (news.mit.edu)
(TXT) w3m dump (news.mit.edu)
| going_ham wrote:
| There is also this:
|
| https://anydsl.github.io/
|
| They have a partial-evaluation framework for building high-
| performance libraries from high-level code!
| ogogmad wrote:
| I think formal verification makes sense for programs like
| optimising compilers. You have a reasonable hypothesis to prove:
| That the optimisations don't change the behaviour of the program
| in any way, except for speed.
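|
| A minimal sketch of that hypothesis as a formal statement, in
| Lean (the names Prog, Input, eval and optimize are hypothetical
| placeholders, not from the paper):
|
|     variable {Prog Input Output : Type}
|     variable (eval : Prog → Input → Output) (optimize : Prog → Prog)
|
|     -- the proof obligation a verified optimizer discharges:
|     -- optimizing never changes observable behaviour
|     theorem optimize_sound (p : Prog) (x : Input) :
|         eval (optimize p) x = eval p x := by
|       sorry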
| agambrahma wrote:
| Hmm, I would've expected them to double down on differentiation
| within Julia (via Zygote etc)
| simeonschaub wrote:
| It reads to me like ATL is actually more of a tensor compiler
| than a general-purpose language. In fact, I'd actually be
| curious if we could lower (subsets of) Julia to ATL, let it
| optimize the tensor expressions and transpile everything back
| into Julia/LLVM IR for further optimizations.
|
| In Julia, there is already Tullio.jl [1], which is basically a
| front end for tensor expressions; it can target both CPUs and
| GPUs, supports AD, and automatically parallelizes the
| computation. It doesn't really optimize the generated code much
| right now though, so something like this could be interesting.
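|
| For a flavor, a minimal sketch of Tullio's index notation (the
| matrix-multiply example in the style of its README; nothing here
| is specific to ATL):
|
|     using Tullio
|     A = rand(100, 100); B = rand(100, 100)
|     # The repeated index j is summed over; Tullio generates the
|     # loops (and can thread them) automatically.
|     @tullio C[i, k] := A[i, j] * B[j, k]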
|
| [1] https://github.com/mcabbott/Tullio.jl
| eigenspace wrote:
| MIT is a big place; there are lots of people there who don't use
| Julia, and I don't think there are any structures in place to
| stop MIT people from making new languages (nor should there
| be).
| UncleOxidant wrote:
| Yeah, based on the title and where it's from, I half expected it
| to be some kind of DSL based on Julia. Instead it's based on
| Coq, which gives them formal verification capability.
| dan-robertson wrote:
| In mathematics, there are two ways to deal with tensors:
|
| 1. You come up with notation for tensor spaces, and tensor
| product operations, and a way to specify the indexes of pairs of
| dimensions you would like to contract, either using algebra or
| many-argument linear maps.
|
| 2. You write down a symbol for the tensors and one suffix for
| each of the indices. For an arbitrary rank tensor you write
| something like $T_{ij\cdots k}$. You write tensor operations
| using the summation convention: every index variable must appear
| exactly once in a term (indicating that it is free) or twice
| (indicating that it should be contracted - that is, you sum the
| term over all possible values of that index).
| Special tensors are used to express special things.
|
| So for the product of two matrices C = AB, you could write:
| c_{ik} = a_{ij} b_{jk}
|
| Or a matrix times a vector: v_i = a_{ij} u_j
|
| Or the trace of A: t = a_{ij} \delta_{ij}
|
| (delta is the name of the identity matrix, aka the Kronecker
| delta). For the determinant of a matrix you could write:
| d = a_{1i} a_{2j} \cdots a_{(n)k} \epsilon_{ij\cdots k}
|
| (Epsilon is the antisymmetric tensor, a rank n, n-dimensional
| tensor where the element at i, j, ..., k is 1 if those indices
| are an even permutation of 1, 2, ..., n, -1 if they are an odd
| permutation, and 0 otherwise).
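|
| To make the convention concrete, here is a small Julia sketch
| with the sums written out explicitly (my illustration, not a
| proposed notation):
|
|     using LinearAlgebra
|     a = rand(3, 3); b = rand(3, 3)
|     # c_{ik} = a_{ij} b_{jk}: the repeated index j is contracted
|     c = [sum(a[i, j] * b[j, k] for j in 1:3) for i in 1:3, k in 1:3]
|     c ≈ a * b    # true
|     # t = a_{ij} \delta_{ij} reduces to a sum over the diagonal
|     t = sum(a[i, i] for i in 1:3)
|     t ≈ tr(a)    # true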
|
| The great advantage is that you don't spend lots of time faffing
| around with bounds for the indices or their positions in the
| tensor product. But to some extent this is only possible because
| the dimension (ie the bounds) tends to have a physical or
| geometric meaning. This is also why when the notation is used in
| physics you also have subscripts and superscripts and you must
| contract one subscript index with a corresponding superscript
| index.
|
| I wonder if it is possible to achieve a similarly concise
| convenient notation for expressing these kinds of vector
| computations.
|
| You can imagine abusing the notation a bit but leaving most loops
| and ranges implicit, e.g.
|     let buf[i] = f(x[i]);
|     let buf' = buf or 0;
|     let out[i] = buf'[i-1] + buf[i];
|     out
|
| (With this having a declarative meaning and not specifying how
| iteration should be done).
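|
| A Julia sketch of what that computes, with the implicit range and
| the out-of-bounds zero made explicit (f is a placeholder for any
| elementwise function):
|
|     f(x) = x^2
|     x = rand(10)
|     buf = f.(x)    # buf[i] = f(x[i])
|     # buf' is buf extended with 0 outside its bounds
|     out = [(i == 1 ? 0.0 : buf[i-1]) + buf[i] for i in eachindex(buf)]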
|
| A convolution with a 5x5 kernel could be something like:
|     /* in library */
|     indicator(p) = if p then 1 else 0
|     delta[[is]][[js]] = indicator(is == js)
|     windows(radii)[[is]][[js]][[ks : (-radii..radii)]] =
|         delta[[is]][[js + ks]]
|     /* in user code */
|     out[i,j] = in[p,q] * windows([2, 2])[p,q,i,j,k,l] * weight[k,l]
|     /* implicit sum over all k, l, p, q, but only loops over k, l,
|        as the p, q values are implied */
|
| Where I've invented the syntax [[is]] to refer to multiple
| indices at once.
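|
| Concretely, the computation that index expression describes is
| just the following (a Julia sketch of the 5x5 case, with the p, q
| indices resolved to i+k, j+l so only k and l are real loops):
|
|     inp = rand(32, 32); weight = rand(5, 5)
|     r = 2    # radius, so k and l range over -2:2
|     out = [sum(inp[i+k, j+l] * weight[k+r+1, l+r+1]
|                for k in -r:r, l in -r:r)
|            for i in (1+r):(32-r), j in (1+r):(32-r)]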
|
| I don't know if it would be possible to sufficiently generalise
| it and avoid having to specify the size of everything everywhere,
| or whether a computer could recognise enough patterns to convert
| it to efficient computation.
| floren wrote:
| Meanwhile, HPC software will continue to be written with C+MPI,
| or Fortran+MPI.
|
| Source: I was involved in the development of multiple HPC
| frameworks and attended multiple HPC conferences; despite
| wailing and gnashing of teeth, all actual software continues to
| be written the same way it's always been.
| brrrrrm wrote:
| It'd be quite a boring world if observations of the past could
| reliably predict the future.
| [deleted]
| AnimalMuppet wrote:
| I'm not an expert. But with current CPUs, doesn't HPC
| critically depend on memory alignment? (For two reasons:
| Avoiding cache misses, and enabling SIMD.) Sure, algorithms
| matter, probably more than memory alignment. But when you've
| got the algorithm perfect and you still need more speed,
| you're probably going to need to be able to tune memory
| alignment.
|
| C lets you control memory alignment, in a way that very few
| other languages do. Until the competitors in HPC catch up to
| that, C is going to continue to outrun those other languages.
|
| (Now, for doing the tweaking to get the most out of the
| algorithm, do I want to do that in C? No. I want to do it in
| something that makes me care about fewer details while I
| experiment.)
| brrrrrm wrote:
| You're correct that low level details very much matter in
| the HPC space. The types of optimizations described in this
| paper are exactly that! To elucidate, check out Halide's
| cool visualizations/documentation (which this paper
| compares itself to):
| https://halide-lang.org/tutorials/tutorial_lesson_05_schedul...
| agumonkey wrote:
| No mention of Chapel? Just asking; it's been in development for
| years, and I just saw its name on the Benchmarks Game, so some
| people still have interest in it.
| mindleyhilner wrote:
| Paper:
| http://people.csail.mit.edu/lamanda/assets/documents/LiuPOPL...
| jimbob45 wrote:
| >speed and correctness do not have to compete ... they can go
| together, hand-in-hand
|
| Is Rust considered slow these days outside of compilation speed?
| ska wrote:
| Not fast in this sense, but definitely not slow.
| stonewareslord wrote:
| Not fast in which sense? HPC? Or single/multithreaded
| processing on one machine?
| wrnr wrote:
| Back in the day, the compile times were legendary for being
| slow, and that reputation has stuck around, but it's no longer
| true today.
| brrrrrm wrote:
| For the type of computation described in this article, Rust is
| considered slow. I recently stumbled across a site to track
| its progress in changing that:
| https://www.arewelearningyet.com
| whatshisface wrote:
| No, it's quite fast.
| UncleOxidant wrote:
| This sounds kind of like Halide [1], which I think is also from
| the same group at MIT. However, Halide is an embedded DSL in C++.
| It sounds like ATL, being based on Coq, would be more likely to
| always produce correct code.
|
| [1] https://halide-lang.org/
| xqcgrek2 wrote:
| So, basically a linear algebra template library hiding the
| implementation from the user
|
| big deal /s
| dang wrote:
| " _Please don 't post shallow dismissals, especially of other
| people's work. A good critical comment teaches us something._"
|
| https://news.ycombinator.com/newsguidelines.html
|
| Edit: you've unfortunately been posting a lot of unsubstantive
| and/or flamebait comments. We ban accounts that do that. I
| don't want to ban you again, so could you please review the
| site guidelines and fix this? We'd be grateful.
| jll29 wrote:
| A shame that reporting for "normal" earthlings is unlikely to
| include a few sample lines of code in this new language.
|
| Draft paper: https://arxiv.org/pdf/2008.11256.pdf
|
| Sample code (looks like a DSL embedded in Python):
| https://github.com/gilbo/atl/blob/master/examples/catenary.p...
| ogogmad wrote:
| One of the innovations in the paper is that they find a method
| for implementing reverse-mode autodiff in a stateless manner,
| making optimisations easier to verify.
|
| I might be wrong but they take the principle of dual numbers: a
| + b epsilon, and change the b component into its linear dual.
| Therefore, b is a function, and not a number. If b is a
| function instead of a number, then there's more flexibility
| regarding flow of control, and that flexibility allows you to
| express reverse-mode autodiff.
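|
| A sketch of that idea in Julia (my reading, not the paper's
| actual construction): the epsilon coefficient becomes a function
| that pulls an output cotangent back to the input.
|
|     struct RDual{T}
|         val::T
|         pull::Function    # output cotangent -> input cotangent
|     end
|     Base.:+(a::RDual, b::RDual) =
|         RDual(a.val + b.val, g -> a.pull(g) + b.pull(g))
|     Base.:*(a::RDual, b::RDual) =
|         RDual(a.val * b.val, g -> a.pull(g * b.val) + b.pull(g * a.val))
|     # derivative of f at x: seed with the identity functional
|     derivative(f, x) = f(RDual(x, g -> g)).pull(1.0)
|     derivative(x -> x * x + x, 3.0)    # 7.0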
| brrrrrm wrote:
| a pure functional language is trivially differentiated, no?
| View a program as a data flow graph and then append the
| differentiation data flow graph. How this is executed
| (reverse-mode vs forward-mode vs some exotic mix) is just a
| leaky abstraction. Ideally, it wouldn't need to be exposed
| to the user.
___________________________________________________________________
(page generated 2022-02-11 23:00 UTC)