[HN Gopher] Is Fortran easier to optimize than C for heavy calcu...
___________________________________________________________________
Is Fortran easier to optimize than C for heavy calculations?
Author : jokoon
Score : 76 points
Date : 2023-09-21 09:26 UTC (13 hours ago)
(HTM) web link (stackoverflow.com)
(TXT) w3m dump (stackoverflow.com)
| saagarjha wrote:
| The difference is mostly not relevant these days. Yes, restrict
| is important, but anyone who is optimizing numerical computing
| hard is probably already using it judiciously, and for them what
| is more important is what is familiar to them and makes them most
| productive.
| pjmlp wrote:
| The keyword is _judiciously_ , without triggering UB, and I bet
| most don't.
| gnufx wrote:
| Unfortunately I see rather few people doing serious optimizing
| knowledgeably at all in research computing. (I know how and
| where it's done well, of course.) That has global effects on
| productivity of users, though some of the biggest gains can
| come from more global optimization, like in the use of MPI.
| theodorethomas wrote:
| People mention the no-aliasing, the compilers, the intrinsics,
| the libraries, and the expressivity but one aspect of the
| difference is ignored and it is this: C is a language for
| specifying behaviour of hardware (time sequence of detailed
| states), Fortran is a language for specifying computation of
| values. Fortran abstracts far more of the hardware than C and
| consequently, a Fortran compiler can benefit from quantum
| processors, mind-readers or time-machines, should they ever be
| invented.
|
| A Fortran program that reads its input, calculates and finally
| writes out its output does not have to execute any particular
| instruction at all. As long as the answer is "AS IF" it had done
| the user-specified computation, the Fortran compiler has done its
| job. In between I/O, it submerges into the ineffable like a Cold
| War SSBN.
|
| C is about the instruments, the players and the conductor,
| Fortran is about the music.
| dTal wrote:
| Maybe this was once true, but the hardware that C was designed
| for specifying the behavior of was a PDP-11. Nowadays you are
| programming an abstract C-ish virtual machine that provides
| certain semantic guarantees that don't necessarily map terribly
| well to the physical hardware. For example, if you write to a
| memory address, and then read from the memory address in the
| "next instruction", you expect the change to be immediate, even
| though the code is actually running on a pipeline that could be
| dozens of instructions deep, with several layers of cache
| between the core and system memory. So in a sense there's not
| really a qualitative difference between C and Fortran - they
| are _both_ for specifying a sequence of operations on an
| abstract machine, relying on the compiler to implement that
| machine - and indeed modern optimizing C compilers actually
| provide very few guarantees about specific assembly
| instructions to be executed, happily rewriting or omitting code
| so long as it executes "as if" it ran on the C virtual
| machine.
|
| See "C is not a low-level language" -
| https://queue.acm.org/detail.cfm?id=3212479
| AnimalMuppet wrote:
| > you expect the change to be immediate, even though the code
| is actually running on a pipeline that could be dozens of
| instructions deep, with several layers of cache between the
| core and system memory.
|
| I expect the _hardware_ to handle cache coherency in that
| situation. What the compiler does should be irrelevant.
| dTal wrote:
| Right, but the point is that the hardware is still "meeting
| you halfway" to present the _appearance_ of something which
| isn 't actually happening. Those pointers in C aren't
| really "memory addresses" at all, they're keys in a key-
| value store managed by the hardware to preset the illusion
| of flat, contiguous memory, as mandated by the C
| programming model.
|
| So maybe it's accurate to say that C is "more compatible"
| with real hardware, in the sense that its abstract machine
| is more isomorphic to what's really happening than
| Fortran's is. But it's not exactly "closer to hardware" in
| the way we might be tempted to think; it's more of a lingua
| franca that your processor happens to speak.
|
| If you're still tempted to consider C "close to hardware",
| consider that you can compile the same code for a Z80 and a
| Threadripper. What hardware exactly are you controlling
| that's common to both?
| AnimalMuppet wrote:
| > as mandated by the C programming model.
|
| As PhilipRoman said, this is also true of assembly (or
| _any other programming language model_ [1]).
|
| > If you're still tempted to consider C "close to
| hardware", consider that you can compile the same code
| for a Z80 and a Threadripper. What hardware exactly are
| you controlling that's common to both?
|
| In both of them I can write to a memory-mapped I/O
| device, if it has one. I can write a custom memory
| allocator for a pool that I'm managing myself. I can't do
| either of those in Fortran or Javascript.
|
| [1] Why does it have to be true of any other programming
| language model? Well, maybe I exaggerate slightly. But
| can you show me a (single threaded) programming language
| where "a = 1" does not mean that on the next line, a will
| be 1?
| kmstout wrote:
| > But can you show me a (single threaded) programming
| language where "a = 1" does not mean that on the next
| line, a will be 1?
|
| MIPS I.
|
| https://en.wikipedia.org/wiki/Delay_slot#Load_delay_slot
| gpderetta wrote:
| >But can you show me a (single threaded) programming
| language where "a = 1" does not mean that on the next
| line, a will be 1
|
| Generally agree with your point, but just to play the
| devil's advocate, in a CPU with exposed pipeline and no
| interlocks, setting a register to a value doesn't
| guarantee that a following instruction reading from that
| register will see the last value written.
| theodorethomas wrote:
| I don't see how C can match Fortran's abstraction level and
| still reliably control hardware that uses memory-mapped I/O.
|
| C, as an operating system implementation language, is trying
| to do something fundamentally different than Fortran.
|
| You live by memory address, you die by memory address.
| PhilipRoman wrote:
| > For example, if you write to a memory address, and then
| read from the memory address in the "next instruction", you
| expect the change to be immediate
|
| This would also be true for assembly, hardly a high level
| language
| lmm wrote:
| On modern CPUs assembly is a high level language (or
| rather, it's a language that doesn't have any of the
| advantages of traditional low-level languages, even if it
| also lacks the advantages of traditional high-level
| languages. Much like C)
| coliveira wrote:
| Yes, you're correct. C was created to control a CPU, it is a
| low level language with a comfortable syntax. C abstracts the
| hardware. But Fortran has nothing to do with hardware, it is
| just a notation for computing matrix algorithms. Fortran can be
| thought as a primitive APL. You can do all kinds of
| optimizations in Fortran that you cannot do in C, because it
| doesn't care or know about the underlying hardware.
| mhh__ wrote:
| That was maybe true for C in the seventies but there's
| practically no difference anymore e.g. C has an as if rule too.
| Alifatisk wrote:
| This might be a bit irrelevant but,
|
| my professor who teaches computer programming (who previously
| worked as a nuclear physicist) loved mentioning Fortran whenever
| possible.
|
| He truly enjoy the language and believes it is well suited for
| heavy computation. He said its used a lot for weather
| forecasting.
|
| He is quite old now so I don't know how true it is today, but he
| made me more interested in Fortran and I might give it a shot
| instead of C sometime soon.
| nonameiguess wrote:
| I haven't talked to him in a few years, but used to be good
| friends with a meteorology researcher at Purdue and Fortran was
| virtually all he knew, it's so heavily used. When I first
| started out in GEOINT processing, a lot of our code (NRO code)
| at the library level was still Fortran, but that was mostly
| older libraries. Anything written after 2005 or so was
| predominantly C++. I don't really keep up to date with the
| constellation status since I haven't worked there in a while
| now, but there were still a few orbital platforms older than
| that back around 2017 or so and the code for handling them was
| still largely Fortran. We never used Fortran directly, though,
| only bindings for C++, though I did have to learn a tiny bit of
| Fortran to hack on those once.
| green-salt wrote:
| This was the case for an atmospheric research facility's HPC
| cluster which was pretty much running all fortran for the
| calculations at least back in 2010.
| dkarl wrote:
| Professors will do weird things if they get a chance. I know of
| a junior college where the intro to programming course used
| Fortran77 in the late 2000s. A lot of kids signed up for it
| because they needed a STEM elective and didn't want to take
| math or science.
| Joel_Mckay wrote:
| There are always language specific features that match use-
| cases. Languages like Fortran and Cobol are a niche area
| specific to legacy systems.
|
| Anyone who wants to stay employed in the private sector should
| initially learn JavaScript, Java, Python, C and C++. Then a
| shop specific choice like Scala (Alphabet), Go (Alphabet),
| Erlang (telecom), PHP (Meta), Ruby (small time web devs), and
| or Visual Studio .NET (small MS shops).
|
| Your best advisor is talking with people currently active with
| the company of interest. Some languages are very industry
| specific (Amazon has its own certificates for the proprietary
| stacks).
|
| I would also recommend looking at job postings outside the
| academic walled garden, and chaotic startups. Most academic
| institutions have been sued many times due to the volume of
| nonsense they feed kids. If you are from Engineering, things
| like PLC Ladder-Logic and Verilog are often more important.
| therealcamino wrote:
| "Most academic institutions have been sued many times due to
| the volume of nonsense they feed kids."
|
| I've never heard of that happening, let alone many times to
| most institutions.
| Joel_Mckay wrote:
| T
| __mharrison__ wrote:
| Can someone enlighten me as to what "T" means? First time
| I've seen it.
| jcranmer wrote:
| Not sure, but the individual previously had a different
| comment that was edited to be just "T".
| Joel_Mckay wrote:
| F
| bananapub wrote:
| so your weird unfounded assertion is also explicitly
| unprovable? awesome, thanks for sharing.
| Joel_Mckay wrote:
| In general lexis nexus holds the records of former
| students suing for BS employment promises.
|
| You would be fairly surprised who showed up besides trump
| u in the states.
|
| It was a lot more common than most like to admit. =)
| jcranmer wrote:
| > Part of the settlements is an NDA.
|
| If someone is sued, that creates a court record. Even if
| the _terms_ of the settlement are not disclosed, that
| there was a settlement is lodged in the court record, and
| publicly accessible. Even if court records are sealed,
| the existence of the case is not, and initial complaints
| are rarely sealed.
|
| In short, if suing universities because of what they're
| teaching is routine, you should be able to point to court
| cases. If you can't find them, then it's not because it's
| secret--secret court cases don't exist--it's because they
| don't exist.
| tptacek wrote:
| In the spirit of generosity, you might conclude that the
| parent commenter simply has their wires crossed, and has
| cases involving for-profit schools like University of
| Phoenix confused with real universities. If you didn't
| know much about higher education, you might think
| UPhoenix was one; it's in the name.
| Joel_Mckay wrote:
| Nope, but thank you for your concern. The local class
| action case was with a local school that privatized, lost
| accreditation, and eventually became a trade school. The
| other University case I alluded to was covered in the
| news, and was notable indeed,
|
| I'm just being careful here. Have a great day =)
| jcranmer wrote:
| And you're still refusing to tell us any names because
| we're somehow lazy for not being able to correctly guess
| what locality you're in, what time you're talking about,
| and search local media history for that timeframe to
| uncover the case you're referring to?
| Joel_Mckay wrote:
| T
| jcranmer wrote:
| Give me three case numbers of cases where former students
| successfully sued the institution (cases that were
| dismissed don't count--just because a case was filed does
| not mean it is meritorious). Those case numbers should
| make it easy for anyone to find the complete docket of
| the case on PACER (if it's federal court) or applicable
| state courts (well, to the degree to which states make
| their dockets easily accessible).
|
| You made the assertion; it's your job, not my job, to
| provide evidence for your assertion, and I've put you on
| notice as to the evidence that is satisfactory. If you
| can't provide the evidence, I have no reason to believe
| your assertion.
| Joel_Mckay wrote:
| T
| jcranmer wrote:
| > You would be fairly surprised who showed up besides
| trump u in the states.
|
| Okay then, who showed up? You imply you know who. It
| shouldn't take any effort to name anybody, if you already
| know them.
| Joel_Mckay wrote:
| T
| [deleted]
| tptacek wrote:
| POC||GTFO, as the kids say.
| kergonath wrote:
| It is still very much used for weather forecasting (ECMWF has
| components in it), as well as materials physics, quantum
| chemistry, and things like finite elements and nuclear fuel
| performance codes.
|
| It is much, much better than C for all these applications.
| Nobody in the field seriously considers C; all the non-Fortran
| codes are in C++. Both the Fortran and C++ bits have Python
| wrappers anyway.
| mschaef wrote:
| Back in school (mid-90's), I made a point of taking a FORTRAN
| introductory class to go along with all the other work I was
| doing in Pascal, C++, Lisp, etc.
|
| For most of the class, FORTRAN felt mostly like a penalty box. It
| was full of behaviors and limitations that dated straight back to
| the early 1960's, if not earlier. It was at least easy to make it
| do the things it could do.
|
| Where my opinion of FORTRAN changed was in the last time that
| class met. By that point, we 'knew' the language and the
| instructor was on to more advanced topics, including
| optimization. He made a point of how an optimizing FORTRAN
| compiler could re-nest three nested loops for the purpose of
| automatically improving memory access patterns. Vector Cray
| machines (which implemented common looping patterns in hardware)
| shipped with FORTRAN compilers that could take normal-looking
| loops and turn them into code that optimized for the hardware.
|
| I don't know if it's still the case (thanks to restrict and
| better compilers), but he made a compelling point at the time for
| the simplicity and restrictions of the language leading to more
| options for optimization.
| dralley wrote:
| >I don't know if it's still the case (thanks to restrict and
| better compilers)
|
| Nobody (really) uses restrict, certainly not at scale.
|
| This is especially obvious because when Rust came along (which
| can effectively use restrict in most places, much like Fortran)
| it uncovered a lot of serious bugs in LLVM's implementation for
| restrict. That whole situation only really stabilized in the
| past year or so.
|
| Maybe GCC does a better job as a result of having gfortran.
| jcranmer wrote:
| Fortran has a lot more (N-dimensional) array logic intrinsic to
| the language than C or C++ does, and in general, Fortran has a
| fair number of "little" things that can add up for optimization:
|
| * Fortran arrays do not alias by default. (This is the most well-
| known, and is indeed repeated in the answers on StackOverflow)
|
| * Fortran N-D array indexes are UB if they go out-of-bounds in a
| way that isn't true for most other languages. In C, if you have a
| declaration `int x[2][3];`, the expression `x[0][4]` is a legal
| way to access one of the elements--it's only UB to go outside the
| bounds of x as a whole, not individual rows. In Fortran, the
| equivalent expression (well, modified because Fortran is column-
| major) is illegal.
|
| * Fortran has built in array expressions--you can express a
| vector addition just by adding the arrays themselves. This makes
| it easier to autovectorize.
| LegionMammal978 wrote:
| > In C, if you have a declaration `int x[2][3];`, the
| expression `x[0][4]` is a legal way to access one of the
| elements--it's only UB to go outside the bounds of x as a
| whole, not individual rows.
|
| Even though multidimensional arrays are laid out contiguously
| in C/C++, it doesn't follow that indexing can traverse from one
| row to the next. By C17 6.5.6/8, pointer-integer addition can
| only create pointers to other elements (or one past the last
| element) of the _same array object_ that the original pointer
| came from. However, an int[2][3] array has two int[3] arrays as
| elements, and each of those arrays has three int objects as
| elements. The int objects are not all elements of the same
| array (the standard 's definition of an array element,
| 6.2.5/20, doesn't apply recursively), so you aren't allowed to
| index between them with expressions like (x[0] + 4).
|
| Compilers seem to be pretty lenient about this in practice,
| though, only optimizing out out-of-bounds accesses when they
| exit the complete object. But at least Clang's UBSan will
| complain if you try to index between rows [0].
|
| [0] https://godbolt.org/z/GYosP39Mc
| pklausler wrote:
| Fortran's _dummy arguments_ can (generally) be assumed in the
| scope of a subprogram to not alias, whether arrays or not. The
| burden is on the programmer of calls to procedures to not
| associate any of its dummy arguments that are modified during
| the call with data that might be associated with any other
| dummy argument.
| KeplerBoy wrote:
| > In C, if you have a declaration `int x[2][3];`, the
| expression `x[0][4]` is a legal way to access one of the
| elements
|
| Is it really? Does the C-compiler guarantee that your 6 ints
| will be stored in a contiguous memory area?
| jcranmer wrote:
| Yes. A[i] in C is exactly equivalent to *(A + i).
| (Incidentally, this means that 2[x] is a legal expression in
| C).
| KeplerBoy wrote:
| Of course, that part is clear. The question is, if the two
| rows are guaranteed to be contiguous in memory. If using
| heap allocated memory, one could easily construct a
| counterexample (i'm not sure how the stack allocated array
| behaves, i guess it works because it's the sane choice, but
| not strictly guaranteed):
|
| int *x; int numRows = 2; int numCols = 3;
|
| x = (int *)malloc(numRows * sizeof(int _));
|
| for (int i = 0; i < numRows; i++) { x[i] = (int
| _)malloc(numCols * sizeof(int)); }
|
| // This could segfault x[0][4] = 42;
| jcranmer wrote:
| Yes, it's a multidimensional array, everything is
| required to be consecutively allocated. (See 6.5.2.1 for
| more details). `int[2][3]` is 6 integer objects
| consecutively allocated (as 3 arrays of 2 arrays of
| integers), not 3 pointers to arrays of 2 integers.
| KeplerBoy wrote:
| Thanks for pointing to the actual section. Good to know.
| pipo234 wrote:
| > The performance difference comes from the fact that Fortran
| says aliasing is not allowed [...]
|
| This is one of the legacies modern C++ is still wrestling with
| today, supposedly solved by Rust, Zig, Val and the likes.
| _flux wrote:
| Hmm, I thought basically only Rust solves it. How do Zig and
| Val solve it?
| Conscat wrote:
| Multiple-aliasing cannot exist in a language without
| reference semantics, such as Val (or whatever its new name
| is).
| zik wrote:
| It's true that there's no "restrict" inthe C++ standard,
| however all the major C++ compilers have extensions for it
| ("__restrict__") and have had for a long time.
| patrec wrote:
| Except that it doesn't work and is basically WONTFIX since no
| C or C++ programmer ever uses it.
|
| This is only very slightly exaggerated.
| Conscat wrote:
| I've never had issues with it, personally. In what way does
| it not work?
| lmkg wrote:
| Rust has aliasing guarantees. In theory, Rust code can
| pass noalias annotations to LLVM and become more
| optimized.
|
| In practice, it took _years_ for Rust access these
| optimizations because the noalias annotations kept
| hitting codegen bugs in LLVM. It was an area of the
| compiler that had not seen much active use before Rust,
| and they were causing a lot of dormant bugs to become
| issues.
|
| The feedback cycle was also very long because LLVM didn't
| fix the problems quickly. I didn't pay close attention so
| I don't know if this is prioritization or because those
| fixes required major surgery, but Rust sometimes had to
| wait for major version bumps in LLVM before seeing if
| "this time was the last time."
| vlovich123 wrote:
| Yup. But it is true now though so hopefully they've added
| the test cases to LLVM to make sure things continue to
| work correctly in the face of more and more optimizations
| being thrown at code.
| Measter wrote:
| Not just in theory. Rust does now mark almost all
| references as `noalias`, the exception being `&T` where
| `T` contains an `UnsafeCell`. The equivalent safe Rust
| function signature[1] would have all four references
| marked `noalias`, even though the `input` and `matrix`
| slices could alias.
|
| How much the optimizer can take advantage of that is
| another matter, due to what you said. Doing a quick
| translation to more idiomatic Rust[2], it does hoist the
| accesses to `matrix` out of the hot loop, and does also
| seem to be a bit less moving stuff around compared to the
| C version, which I think is putting the `matrix` values
| in the right places, which Rust did before the loop.
|
| [1] fn transform(output: &mut [f32], input: &[f32],
| matrix: &[f32], n: &i32);
|
| [2] https://godbolt.org/z/5PPn3eh19
| Conscat wrote:
| That's interesting. It also looks like with the
| `__restrict` specifier (https://godbolt.org/z/cvaYoc6zh),
| the clang code is similar regarding the matrix
| multiplication. The body of the vectorized loop itself
| looks identical, `.LBB0_4` in clang and `.LBB1_8` in
| rustc.
| Blackthorn wrote:
| We use it all the time in audio code. How do you get the
| idea that it doesn't work?
| patrec wrote:
| Look at the gyrations the Rust compiler team has been
| going through for _years_ to be able to actually use LLVM
| 's noalias (which is a very desirable optimization in
| Rust where the borrow checker statically guarantees
| absence of aliasing in a lot of code).
|
| I haven't kept up with the current state of things, but
| if you can finally liberally use noalias in LLVM without
| having to fear miscompilations, you probably have solely
| the increased significance of Rust to thank for that.
| Blackthorn wrote:
| We use gcc so honestly I'm not sure how the difference
| goes. I'm glad for Rust making the whole ecosystem more
| usable regardless.
| kergonath wrote:
| GCC had to spend some time getting things correctly
| because of gfortran.
| Conscat wrote:
| https://gcc.gnu.org/onlinedocs/gcc/Loop-Specific-
| Pragmas.htm...
|
| The #pragma GCC ivdep annotation also enables this in loops.
| C23 also has the [[unsequenced]] attribute, which supports
| pointer arguments for optimizations like these, unlike
| [[gnu::const]].
| fsckboy wrote:
| fortran is call by reference, it's all aliasing, you can even
| in some implementations change the value of literals
|
| c is call by value, you have to at least try to alias
| gpderetta wrote:
| my understanding is that in fortran, if two arguments alias
| each other, it is UB. So it is not that fortran doesn't have
| aliasing, it is just that the optimizer assume that it
| doesn't (but don't quote me on this!).
| Bostonian wrote:
| Arguments are allowed to alias if they are both inputs that
| will not be changed in the function -- technically if they
| are both "intent(in)".
| pklausler wrote:
| INTENT(IN) is not required to allow aliasing of
| unmodified dummy arguments.
| kergonath wrote:
| Yes, but in practice no compiler checks whether an
| argument with no INTENT is modified or not, or could
| alias with something else or not. So the compilers tend
| to trust the programmer. With INTENTS, it's obvious and
| much easier to check.
| pklausler wrote:
| Dummy argument aliasing has nothing to do with INTENT.
| gnufx wrote:
| Fortran does _not_ call by reference, and the term aliasing
| isn 't used by the standard. It calls by what's usually
| called value-return or copy-in/copy-out, according to the
| "storage association" rules in the standard.
|
| While it may be true that the rules aren't generally checked
| by compilers, Toolpack did it for pure Fortran77 in the
| 1980s, though I don't know how comprehensive it actually was.
| bregma wrote:
| It took me days one time to figure out why 0 had adopted the
| value four.
|
| The technicality is that Fortran is call by descriptor (not
| by reference, not by value). That's a powerful difference but
| a sharp two-edged sword.
| kaba0 wrote:
| Could you please explain how is it differs from the other
| two? Never looked into Fortran.
| kergonath wrote:
| The Fortran standard does not specify anything about
| passing by references or value. Instead, it specifies the
| expected results. The compilers do it however they want
| or can, which means that they can pass pointers, values,
| copies, or pointers to copies depending on the situation.
|
| Sometimes (most of the time, to be fair), the compiler
| can work out the best solution. Sometimes, the developer
| has to jump through some hoops to avoid unnecessary
| copies.
| bregma wrote:
| Fortran is column-major and C is row-major. That difference alone
| makes vectorization in Fortran much much more frequently used.
| defrost wrote:
| As a 60 year old C | Fortran coder since ~1980 or so, that
| difference makes near zero practical difference to numerical
| computing.
|
| Pretty much any vector|matrix operations will see row against
| column (or colum against row) operations in equal proportion -
| what you gain on the swings you lose on the roundabouts
| whichever way you choose to orientate storage.
|
| Indeed, given large sparse arrays, language specific default
| array storage methods might not even be used - and here C is
| (arguably) more flexible at building task specific data
| structures - a column(?) of _pointers_ to row fragments
| (perhaps).
|
| The big advantage early FORTRAN had was baked in anti alias
| assumptions and a truckload of pre existing relatively
| accessible NIST | NASA grade quality numerical algorithms that
| had been crawled over, eyeballed, tested, and pounded in
| simulations by _many_ qualified applied scientists.
| AnimalMuppet wrote:
| > The big advantage early FORTRAN had was baked in anti alias
| assumptions and a truckload of pre existing relatively
| accessible NIST | NASA grade quality numerical algorithms
| that had been crawled over, eyeballed, tested, and pounded in
| simulations by many qualified applied scientists.
|
| Yeah. That Fortran library that has a solver for sparse
| complex matrices that won't produce nonsense if you give it a
| stiff problem, that has 50 years of people pounding on it to
| make sure that it's rock solid? I'm not sure that there's a
| C/C++ equivalent for it. I _absolutely_ cannot write an
| equivalent. Yes, I can write a matrix solver that takes
| complex data. No, I can 't write one that is as efficient, as
| numerically stable, or as bug-free as what already exists in
| Fortran.
| bregma wrote:
| As a 60 year old C and Fortran coder since ~1980 or so, and
| who has spent the better part of the last decade supporting
| compilers and their internals, I say it makes a big
| difference when it comes to autovectorization optimizations.
| But that's my experience based on most users being naive and
| know-enough-to-be-dangerous programmers, and your experience
| may differ.
| aap_ wrote:
| This is down to convention though. You decide how you interpret
| your data in C.
| bregma wrote:
| And the compiler determines how to implement your decisions.
| andy99 wrote:
| In terms of being easier (for the programmer, not the compiler)
| one thing worth mentioning is that Fortran already has optimized
| intrinsics for matmul and dot_product. I was recently doing a
| small comparison between C and Fortran for matrix-vector
| multiplication and was surprised how fast the matmul is "out of
| the box" vs handwritten loops. Personally I also think that for
| code that is linear algebra heavy Fortran is easier to work with
| because of the intrinsics - it's analogous to numpy, whereas C
| takes more effort, mainly keeping track of sizes of things.
|
| https://gist.github.com/rbitr/3b86154f78a0f0832e8bd171615236...
| physicsguy wrote:
| True for small matrices but above certain size you wouldn't
| want to use these methods or loops anyway, and they're not
| optimised if your matrix is sparse either. So practically you
| end up using a library and you're back to basically being
| equivalent with C.
| andy99 wrote:
| It's possible to link in an external BLAS library that will
| be used by the intrinsics instead. See https://fortran-
| lang.discourse.group/t/compiler-option-matmu...
|
| But yes I agree, at some point it's always going to be back
| to something more customized.
| microtherion wrote:
| And this is where the performance starts converging again,
| because BLAS is also available from C. So if you can
| express your computation in terms of BLAS primitives, it
| does not matter much from what language you're calling
| them.
| kergonath wrote:
| Yes, but you retain Fortran's expressiveness. You can
| write a matrix product A = x * B, regardless of whether
| the matrix multiplication is handled by a compiler's
| multiplication function, any flavour of BLAS, or
| ScaLAPACK over MPI. I don't think Fortran has that much
| of an edge in performance compared to decently optimised
| C, but the Fortran code to get there is much more
| readable and easier to grasp. And with much fewer
| footguns (think allocatables versus pointer arithmetic).
| leereeves wrote:
| The Fortran matmul intrinsic doesn't use an optimal
| algorithm?
| kergonath wrote:
| The compiler is responsible for the implementation of the
| matmul intrinsic. Most of them have a "decent for small
| matrices and easy to inline" version by default and an
| option to convert it to call to BLAS under certain
| conditions (like larger matrices). Then, the performance is
| that of your BLAS library. The assumption (quite good in my
| experience) is that people use the intrinsic naturally
| because it is easy and natural, but turn to BLAS calls when
| performance becomes important. Which is not the case for
| the vast majority of matrix multiplication, even in high
| performance computing.
|
| There is no universal optimal algorithm.
| duped wrote:
| Even theoretically this is only possible if the dimensions
| and sparseness are known at compile time, which can be
| pretty heavy assumptions.
| jcranmer wrote:
| The implementation of every BLAS implementation I'm aware
| of basically looks like this: if (dims in
| range A) gemm_implA else if (dims in range
| B) gemm_implB else if (dims in range C)
| gemm_implC // several more implementations...
|
| The immediate call function is invariably a switch over
| several implementations, and by the time you reach that
| call, you have actual knowledge of dimensions.
| pklausler wrote:
| Subroutine and function "dummy arguments" are basically
| restricted references. When analyzing data dependences in the
| body of a subprogram, a compiler doesn't have to worry about
| writes to one dummy argument affecting any reads to or writes
| from another dummy argument. The burden is on the programmer of a
| procedure's caller to ensure that any dummy argument that is
| modified during the call does not alias with any other dummy
| argument. This is the big one.
|
| Fortran >= 90 has ALLOCATABLE variables (and now components),
| which are dynamically managed objects that are akin to
| std::unique_ptr<> or std::vector<> smart objects. A compiler can
| assume that an ALLOCATABLE object is free from aliasing, unlike a
| POINTER.
|
| Fortran does have POINTERs, but they can only point to objects
| that have been explicitly tagged with the TARGET attribute.
| Objects that are not TARGETs are safe from aliasing with
| POINTERs. Fortran does still not have the concept of a pointer to
| immutable data.
|
| Do any of these still yield performance advantages over C/C++?
| Probably not.
___________________________________________________________________
(page generated 2023-09-21 23:01 UTC)