[HN Gopher] Code in ARM Assembly: Registers Explained
       ___________________________________________________________________
        
       Code in ARM Assembly: Registers Explained
        
       Author : ingve
       Score  : 97 points
       Date   : 2021-06-16 08:19 UTC (14 hours ago)
        
 (HTM) web link (eclecticlight.co)
 (TXT) w3m dump (eclecticlight.co)
        
       | technicalya wrote:
       | I thought the article would contain something exciting, but the
       | excitement ended when I reached the last lines.
        
       | jiri wrote:
       | It's just a pity that the article ended just when I was most
       | interested :-)
        
       | saagarjha wrote:
       | > Passing arrays, structures and other arguments which can't
       | simply be put into a single register requires a different method,
       | in which a pointer to the data is passed: 'call by reference'.
       | 
       | Note that small structures will be passed in registers. Here's an
       | example: https://godbolt.org/z/rsGEbefcv. a and b are 32 bits and
       | passed packed in x0, and c is a long passed in x1.
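
For readers who cannot open the link, here is a minimal C++ sketch of the kind of struct the comment describes. The names `Small`, `sum` and `check_small_struct` are made up for illustration, and the register assignment mentioned in the comments is what the comment above reports; the code itself cannot observe it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 16-byte aggregate: two 32-bit fields and one 64-bit field.
struct Small {
    std::int32_t a;
    std::int32_t b;
    std::int64_t c;
};

// At 16 bytes the struct is small enough to be passed in general-purpose
// registers on AArch64: a and b packed into x0, c in x1.
static_assert(sizeof(Small) == 16, "expected a 16-byte layout");

// Takes the struct by value; no pointer to a memory copy is involved.
std::int64_t sum(Small s) {
    return s.a + s.b + s.c;
}

// Small usage check.
void check_small_struct() {
    assert(sum(Small{1, 2, 3}) == 6);
}
```

Compiling `sum` for an AArch64 target with optimisation enabled should show the fields being extracted from x0 and x1 rather than loaded through a pointer.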
        
         | MarkSweep wrote:
         | You can also pass structs of floating point or SIMD types by
         | register. They have to be 2-4 elements and all of the same
         | type. These are called HFA (Homogeneous Floating-point
         | Aggregate) and HVA (Homogeneous Short-Vector Aggregate).
         | 
         | Source: the Procedure Call Standard for the Arm 64-bit
         | Architecture
         | https://github.com/ARM-software/abi-aa/blob/2021Q1/aapcs64/a...
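
A short C++ sketch of what such an aggregate can look like. `Vec4`, `dot` and `check_dot` are hypothetical names, and the register allocation in the comments is an assumption based on the procedure call standard cited above, not something the code verifies.

```cpp
#include <cassert>

// Hypothetical HFA: four members, all of the same floating-point type.
struct Vec4 {
    float x, y, z, w;
};

// Passed by value, each Vec4 is eligible to travel in floating-point
// registers (s0..s3 for a, s4..s7 for b) instead of through memory.
float dot(Vec4 a, Vec4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

// Small usage check with values that are exact in single precision.
void check_dot() {
    assert(dot(Vec4{1, 2, 3, 4}, Vec4{1, 1, 1, 1}) == 10.0f);
}
```

Mixing member types (say, three floats and a double) would make the struct stop being homogeneous, so it would fall back to the ordinary composite rules.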
        
         | pdpi wrote:
         | It's also wrong to call this "call (or pass) by reference."
         | Pass by reference and pass pointer by value are different
         | things (here's an example -- https://godbolt.org/z/Kv7jPPvnz)
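
The linked example is not reproduced here, so the following is only a sketch of the distinction in C++ with made-up function names. Both functions would typically receive an address in the first argument register, but the language treats them differently at the call site.

```cpp
#include <cassert>

// Pass by reference: `x` is another name for the caller's variable.
void bump_ref(int& x) { x += 1; }

// Pass a pointer by value: `p` is a local copy of an address.
void bump_ptr(int* p) { *p += 1; }

void check_passing() {
    int a = 0;
    int b = 0;
    bump_ref(a);   // no address-of operator at the call site
    bump_ptr(&b);  // the caller has to produce the address explicitly
    assert(a == 1);
    assert(b == 1);
}
```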
        
           | jhgb wrote:
           | Your example seems too C++-specific. Not all languages are
           | defined in the same way.
        
             | Erlangen wrote:
             | I am familiar with a small set of programming languages,
             | but they define reference and value passing similarly.
             | 
             | * Java: Java spec says "pass by value"
             | 
             | * C: pass by value
             | 
             | * C++: pass by value and reference (with ref qualifier)
             | 
             | * C#: pass by value and reference (with ref keyword)
             | 
             | Do you know a language that defines "pass by reference" in
             | a very different way?
        
               | Koshkin wrote:
               | In assemblers and C, passing by reference and passing by
               | pointer value mean the same thing.
        
               | jhgb wrote:
               | Not sure if all language specs explicitly use the "pass-
               | by-X" terminology but for example ANSI Smalltalk standard
               | says things like "Each argument [of a message send] is a
               | reference to an object."
               | 
               | > but they define reference and value passing similarly
               | 
               | Well, if all of them have been derived from C++ or
               | designed by people trying to improve on C++, they would
               | be very likely to have the notion defined similarly
               | regardless of how anyone else would use the term, so
               | there's not much surprise there.
        
               | MaxBarraclough wrote:
               | They're accepted terms in computer science. They're not
               | specific to any programming language.
               | 
               | https://en.wikipedia.org/wiki/Evaluation_strategy
        
               | jhgb wrote:
               | I was simply saying that C++ using a certain terminology
               | was a sufficient condition for C++-derived languages to
               | use the same terminology; you wouldn't need a CS-wide
               | definition for this, and your small set of languages
               | wouldn't allow you to make the inference that such a
               | common definition existed.
               | 
               | BTW the page you've linked seems to contradict some of
               | your statements:
               | 
               | > so they are typically described as call by value even
               | though implementations frequently use call by reference
               | internally for the efficiency benefits
               | 
               | As you seem to be arguing that this is not call by
               | reference you should probably correct the page.
               | 
               | EDIT: It also seems that there IS some amount of language
               | specificity anyway, since in another place of the page it
               | says:
               | 
               | > "In particular it is not call by value because
               | mutations of arguments performed by the called routine
               | will be visible to the caller. And it is not call by
               | reference because access is not given to the variables of
               | the caller, but merely to certain objects"
               | 
               | I'm reasonably sure that the first part would be called
               | "call by value" in C++, even if it wasn't in the context
               | of CLU, provided the usual C++ way of doing the same
               | thing, namely passing a pointer to an object, were used.
        
               | MaxBarraclough wrote:
               | > As you seem to be arguing that this is not call by
               | reference you should probably correct the page.
               | 
               | I don't really disagree with what the article says there,
               | although I don't like the way it's phrased. The full
               | quote:
               | 
               | > _In purely functional languages there is typically no
               | semantic difference between the two strategies (since
               | their data structures are immutable, so there is no
               | possibility for a function to modify any of its
               | arguments), so they are typically described as call by
               | value even though implementations frequently use call by
               | reference internally for the efficiency benefits._
               | 
               | A purely functional language with the property of
               | referential transparency [0] can indeed be treated as
               | using pass-by-value, or pass-by-reference, or even some
               | blend of the two. With referential transparency, nothing
               | hinges on which of the two strategies is used. I'm not
               | sure it's accurate to say that they're _typically
               | described as call by value_.
               | 
               | > there IS some amount of language specificity anyway
               | 
               | As Wikipedia says, the term _call by sharing_ is not as
               | widespread. I hadn't seen it before. I can't say I
               | really see where they're going with that term:
               | 
               | > _In particular it is not call by value because
               | mutations of arguments performed by the called routine
               | will be visible to the caller._
               | 
               | No, they aren't. In Java, you can pass an object
               | reference to the callee function, and the reference will
               | be passed by value. The callee function can modify the
               | _members_ of the referred-to object, in a way that may
               | later be visible to the caller function. So what? That's
               | still pass-by-value. It's the same in C, where the callee
               | can modify a pointed-to variable.
               | 
               | The Wikipedia article also appears to suggest that in
               | Java, _all_ arguments are boxed/unboxed when passed.
               | That isn't true.
               | 
               | I can see some sense in having a term to emphasise that
               | Java sometimes performs boxing, rather than simple pass-
               | by-value. This is to say, I agree that it's a slight
               | oversimplification to say that _Java always simply passes
               | by value_. If we go too far down this road though we'll
               | need to recategorise C, as it allows implicit conversions
               | between many of its primitive types.
               | 
               | > you should probably correct the page
               | 
               | I don't have the will to get into a Wikipedia edit war,
               | which is how this kind of thing often goes.
               | 
               | [0]
               | https://en.wikipedia.org/wiki/Referential_transparency
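
The point about modifying members versus rebinding the argument can be sketched in C++, using a pointer passed by value to stand in for a Java object reference. `Point`, `move_right` and `replace` are hypothetical names chosen for this illustration.

```cpp
#include <cassert>

struct Point {
    int x;
};

// The pointer is copied into `p`; writes through it reach the caller's object.
void move_right(Point* p) { p->x += 1; }

// Assigning to `p` only changes the callee's copy of the pointer.
void replace(Point* p) {
    static Point other{100};
    p = &other;
    p->x += 1;  // modifies `other`, not the caller's object
}

void check_pointer_by_value() {
    Point pt{1};
    Point* ref = &pt;
    move_right(ref);
    assert(pt.x == 2);   // member change is visible to the caller
    replace(ref);
    assert(ref == &pt);  // caller's pointer was not rebound
    assert(pt.x == 2);   // caller's object was not touched
}
```

Under true pass-by-reference (for example a `Point*&` parameter in C++), the assignment inside `replace` would rebind the caller's pointer as well.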
        
               | sry2dogpile wrote:
               | In C and assembly the terms reference and pointer are
               | sometimes used interchangeably. An lval is a reference to
               | a value, * is the dereference operator, and so forth. I
               | admittedly don't know the spec, but I've seen the terms
               | used interchangeably in compiler code and documentation.
               | I think C++ began distinguishing references from pointers
               | and defining them as distinct, and Java removed explicit
               | pointers entirely, generating and dereferencing them
               | automatically for objects.
        
           | secondcoming wrote:
           | A reference is just a pointer but with restrictions enforced
           | by the compiler. ASM-wise, they're totally the same thing.
           | 
           | 'Reference' doesn't mean C++'s reference, it means that you
           | pass where the data is, not the data itself. Even Visual
           | Basic had this concept.
        
             | [deleted]
        
             | MaxBarraclough wrote:
             | Argument-passing semantics (_evaluation strategies_) are
             | something defined by the source language, not by the
             | calling convention of the target architecture/platform.
             | pdpi is right to point out that what the article is
             | describing is _not_ pass-by-reference semantics.
             | 
             | Unlike C++, C does not support pass-by-reference (ignoring
             | preprocessor silliness). Passing a pointer by value is not
             | the same thing. _edit_ To be clear, C++ supports both pass-
             | by-value _and_ pass-by-reference.
             | 
             | This topic cropped up a year ago:
             | https://news.ycombinator.com/item?id=23553574
        
               | cat199 wrote:
               | this is also a case of the later nomenclature redefining
               | the previous nomenclature in a backwards incompatible way
               | -
               | 
               | Wasn't there, but pretty sure in the 70s, the notion of
               | pointer and reference (and 'handle', etc) were pretty
               | much interchangeable, since AFAIK there wasn't any
               | compiler capable of creating a compiler-checked
               | 'reference' construct.
               | 
               | so, if this is true, in the realm of asm, since really we
               | are just dealing with bits in a register that could be
               | interpreted as data or addresses, it makes sense to treat
               | pointer and reference interchangeably, since there is no
               | such thing as a higher-order 'reference' anyway.
        
               | Joker_vD wrote:
               | No, they used to use both "pointer" and "reference"
               | interchangeably to mean an opaque handle that you could
               | only pass around or dereference; for things that
               | supported address arithmetic they used, well, "address".
               | See for example [0] -- the paper itself is from 1992, but
               | it extensively quotes the papers from the seventies.
               | 
               | [0] Amir M. Ben-Amram and Zvi Galil. 1992. On pointers
               | versus addresses. J. ACM 39, 3 (July 1992), 617-648.
               | DOI:https://doi.org/10.1145/146637.146666
        
               | jhgb wrote:
               | > are something defined by the source language, not by
               | the calling convention of the target
               | architecture/platform
               | 
               | In that case perhaps the response should have been "pass-
               | by-reference doesn't exist on machine level", not "pass-
               | by-reference is something different", as the latter would
               | seem to be a category error.
        
               | Koshkin wrote:
               | > _the response should have been "pass-by-reference
               | doesn't exist on machine level"_
               | 
               | Which would barely make any sense.
        
               | jhgb wrote:
               | How come? It would illustrate the notion that to a
               | machine, everything is a value, and that (abstract)
               | references get translated into manipulations of values
               | (and not even always in the same way).
        
               | Koshkin wrote:
               | Because "on a machine level" and "to a machine" are two
               | different things (the former referring to human
               | understanding).
        
               | MaxBarraclough wrote:
               | Assuming the machine code has no concept of a function
               | call, isn't it true to say that pass-by-reference doesn't
               | exist at the machine level? Function calls are a high
               | level concept that compile down to machine code.
               | 
               | A typical CPU deals with abstractions like jumps and
               | copies, and has no real concept of a function call. It
               | seems fair to say that, at the level of a typical
               | assembly language/machine language, there is no argument
               | passing, and therefore no evaluation strategy. Same goes
               | for a Turing machine.
        
               | Koshkin wrote:
               | This is similar to saying, for instance, that there are
               | no pointers, at the machine level, just integers. (In
               | fact, there are just bits and, maybe, bytes, for that
               | matter.) But it is the usage semantics that matters.
               | (Typically, CPU architectures are designed with it in
               | mind.) An instruction that saves the next
               | instruction's address somewhere before performing the
               | jump cannot be thought of as simply combining two
               | arbitrary operations into one...
        
               | secondcoming wrote:
               | I'm reluctant to bikeshed, but if I read your argument
               | correctly, then technically everything is pass-by-value
               | since even for references the actual reference (a.k.a.
               | pointer) is passed by value.
        
               | MaxBarraclough wrote:
               | Pass-by-reference is distinct from passing a pointer by
               | value. In my old comment [0] I gave example code
               | fragments in C# to illustrate the difference. (C#
               | supports both pass-by-value and pass-by-reference. It's
               | like C++ in that regard, although the syntax is
               | different.)
               | 
               | > I'm reluctant to bikeshed
               | 
               | This is the kind of topic where precise phrasing is
               | important, so I wouldn't chalk it up as pedantry. It
               | doesn't help that Java muddied the waters with its use of
               | the word _reference_.
               | 
               | [0] Here's the link again:
               | https://news.ycombinator.com/item?id=23553574
        
               | secondcoming wrote:
               | You've shown a syntactic difference, not a functional
               | difference. What does the 'ref' keyword change about how
               | C# passes the argument to the function? I'd imagine
               | (having zero C# experience) that it passes either a
               | pointer, or an index, or an offset, to the function _by
               | value_. It may also just copy the argument, pass it to
               | the function, and then copy the copy back to the original
               | argument (which would be daft).
        
               | MaxBarraclough wrote:
               | > You've shown a syntactic difference, not a functional
               | difference.
               | 
               | Kinda, but that difference is important. We can say that
               | C allows us to simulate pass-by-reference by using
               | pointers. It's still true that the C language does not
               | support pass-by-reference semantics.
               | 
               | Taking the address of a variable is an operation that the
               | C language permits us to do, yielding a new value (a
               | pointer value). This isn't part of C's argument-passing
               | functionality, though.
               | 
               | > What does the 'ref' keyword change about how C# passes
               | the argument to the function?
               | 
               | The internals of C# compilers aren't really the point,
               | but I believe .Net does it the copy-intensive way,
               | copying to pass in and copying again to pass the new
               | value back out. I don't think this is a real performance
               | problem though, unless you overuse the feature.
        
               | pdpi wrote:
               | Yes exactly. I believe Fortran and Perl only do pass by
               | reference, for example.
               | 
               | It's not a C++ concept, I only used C++ in that example
               | simply because godbolt.org makes it really easy to show
               | how the language treats the two concepts differently even
               | though they compile to the same thing.
        
         | kps wrote:
         | More importantly, it's just wrong. Large objects are not
         | converted to references, they're passed on the stack.
         | 
         | [1] §5.4 https://developer.arm.com/documentation/ihi0055/b/
        
       | johndoe0815 wrote:
       | Related - M1 ARM assembly examples by Alex von Below:
       | https://github.com/below/HelloSilicon
        
         | tyingq wrote:
         | _"Apple reserves X18 for its own use. Do not use this
         | register."_
         | 
         | Interesting. I wonder why that particular one. It's not exactly
         | in the middle of the range. Maybe they searched for typical
         | existing use and decided that was the least likely to have some
         | conflict?
        
           | regularfry wrote:
           | "All these registers are yours - except X18. Attempt no
           | MOVing there."
        
           | monocasa wrote:
           | x16, x17, and x18 are all different env pointers not really
           | available for general use per se in the main ABI. x16 and x17
           | are used for plt linkage there, x18 is reserved for the
           | system to make use of as it sees fit, at its leisure. x18's
           | use there seems to have survived into Apple's slightly
           | modified ABI.
        
           | comex wrote:
           | For more context, look at the standard ARM ABI spec. Apple's
           | ABI diverges from the standard one in some places, IIRC, but
           | in this case it's complying with it: the spec says that X18
           | (which it calls r18) can be reserved by platforms for
           | platform-specific use. The spec also shows how the registers
           | are divided into several categories, which helps explain the
           | choice of X18:
           | 
           | https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aap...
        
         | nonameiguess wrote:
         | Would have posted exactly this if you hadn't already. So many
         | pitfalls with M1 in the way Apple subtly veers from normal
         | ARM64 conventions and the published spec. This guy has already
         | figured them out for you. Also helpful to look at the XNU
         | source code to see how they implement syscalls.
        
           | comex wrote:
           | For the record, Apple does publish its own documentation
           | describing how Apple's ABI diverges from the ARM standard
           | one:
           | 
           | https://developer.apple.com/documentation/xcode/writing-arm6...
        
       | gardaani wrote:
       | Is there a reason for giving registers a different name depending
       | on the length of the value (D0 = 64-bit, S0 = 32-bit)?
       | 
       | I can never remember which names are mapped to the same register.
       | I've had that issue on x86 and I'll have that issue on ARM.
       | 
       | I loved how Motorola 68000 made it simple. Each command had a
       | dot-letter suffix indicating the length: move.b d0, d1
        
         | Someone wrote:
         | On 68k, that move writes 8 bits; it doesn't really write to an
         | 8-bit register. I draw that distinction because, on many (I
         | think) other systems, such a byte move sign-extends the least
         | significant byte in d0 to the width of the target register and
         | writes the full register.
         | 
         | Also, on 68k you can only address the least significant word or
         | byte of a register that way. Some CPUs have, say, 8 8-bit
         | registers that can alternatively be treated as 4 16-bit
         | registers (https://en.wikipedia.org/wiki/Intel_8080#Registers).
         | On such a CPU, one can write to the top half of a register in a
         | single instruction by writing to the 8-bit register that shares
         | its storage (some RISC CPUs support that by having a separate
         | instruction "load top half of register")
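
The two byte-move behaviours contrasted above can be modelled on a 32-bit register value. This is only an illustrative sketch: the function names are invented, and it models the described semantics in C++ rather than any real instruction encoding.

```cpp
#include <cassert>
#include <cstdint>

// 68k-style move.b: only the low 8 bits of the destination change.
std::uint32_t move_b_merge(std::uint32_t dst, std::uint32_t src) {
    return (dst & 0xFFFFFF00u) | (src & 0xFFu);
}

// The other convention: sign-extend the low byte and overwrite the
// whole destination register.
std::uint32_t move_b_sign_extend(std::uint32_t src) {
    std::uint32_t low = src & 0xFFu;
    return (low & 0x80u) ? (low | 0xFFFFFF00u) : low;
}

void check_byte_moves() {
    // Upper 24 bits of the destination survive the merge.
    assert(move_b_merge(0x12345678u, 0xAABBCCDDu) == 0x123456DDu);
    // 0xDD has its top bit set, so the extension fills with ones.
    assert(move_b_sign_extend(0xAABBCCDDu) == 0xFFFFFFDDu);
    // 0x7F has its top bit clear, so the extension fills with zeros.
    assert(move_b_sign_extend(0x0000007Fu) == 0x0000007Fu);
}
```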
        
         | talideon wrote:
         | My guess is that it's some effort to maintain some kind of
         | compatibility with ARM32. Back when I used to write ARM asm, it
         | was the ARM 26/32 days, and it was rXX for all the registers
         | named in the APCS.
         | 
         | It doesn't seem to be as confusing as the x86 situation anyway:
         | none of the registers are _really_ special purpose on ARM.
         | 
         | Here's a link to the details in the ARM64 APCS:
         | https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aap...
        
         | pm215 wrote:
         | For 64-bit Arm FPU registers, the mapping is extremely simple:
         | Q0 is the 128-bit vector register; D0 is the bottom 64 bits of
         | it; S0 is the bottom 32 bits; H0 is the bottom 16 bits; B0 is
         | the bottom 8 bits. Similarly Q1/D1/S1/H1/B1 are all the same
         | underlying register.
         | 
         | For 32-bit Arm, unfortunately, things are different: D0 is the
         | bottom half of Q0, and D1 the top half; similarly S0 is the
         | bottom half of D0 and S1 is the top half (and so Q0 is S0 S1 S2
         | S3, with S3 its most significant 32 bits). This is why "just
         | indicate the length on the insn" wouldn't have worked for
         | 32-bit: there is more than one 32-bit register in each 64-bit
         | register (and some kind of 'high/low' notation would have been
         | annoying when you really do want to just think of it as a
         | collection of 32-bit registers sometimes, especially if your
         | hardware doesn't even have double-precision support!). The
         | 64-bit transition fixed up the awkward overlaid mapping, but
         | retains the notation for the benefit of all the people who were
         | already familiar with the Arm notational conventions for
         | things.
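
The two mappings described above can be written down as index arithmetic. The sketch below models a register file as arrays of four 32-bit lanes (lane 0 being the least significant); the type and function names are invented, and 16 registers are used for both views purely to keep the model small.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// One 128-bit register as four 32-bit lanes; lane 0 is least significant.
using Quad = std::array<std::uint32_t, 4>;
using RegFile = std::array<Quad, 16>;

// 64-bit Arm: Sn is the bottom 32 bits of register n, one Sn per register.
std::uint32_t a64_s(const RegFile& v, int n) { return v[n][0]; }

// 32-bit Arm: S(4k)..S(4k+3) are the four lanes of Qk.
std::uint32_t a32_s(const RegFile& q, int n) { return q[n / 4][n % 4]; }

// 32-bit Arm: D(2k) is the bottom half of Qk, D(2k+1) the top half.
std::uint64_t a32_d(const RegFile& q, int n) {
    const Quad& r = q[n / 2];
    int lo = (n % 2) * 2;
    return (static_cast<std::uint64_t>(r[lo + 1]) << 32) | r[lo];
}

void check_aliasing() {
    RegFile regs{};
    regs[0] = {0x11u, 0x22u, 0x33u, 0x44u};
    regs[1] = {0x55u, 0x66u, 0x77u, 0x88u};
    assert(a64_s(regs, 1) == 0x55u);  // 64-bit view: S1 lives in register 1
    assert(a32_s(regs, 1) == 0x22u);  // 32-bit view: S1 lives inside Q0
    assert(a32_s(regs, 3) == 0x44u);  // 32-bit view: S3 is the top of Q0
    assert(a32_d(regs, 1) == 0x0000004400000033ull);  // D1 is the top half of Q0
}
```

The same register contents give different answers for "S1" depending on the view, which is the overlap the comment says made a plain length suffix unworkable for 32-bit Arm.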
        
           | gardaani wrote:
           | Thanks! I had a closer look at them and the naming is
           | starting to make sense: Q=quad, D=double, S=single, H=half,
           | B=byte (except the article claims that V=128bit and there's
           | no mention of H or B)
           | 
           | For general purpose registers: X=long (64-bit), W=word
           | (32-bit)
        
             | pm215 wrote:
             | It's Qn when you're dealing with the register as a single
             | 128-bit quantity (currently only the SHA256 and SHA512
             | insns need to do this, I think, judging from a quick search
             | through the architecture reference manual), and Vn when
             | you're dealing with the register as a vector of smaller
             | units (eg "ADD V10.4S, V8.4S, V9.4S" does a vector-addition
             | of V8 and V9 into V10, treating each 128-bit V register as
             | a vector of 4 Single-width (32-bit) integer lanes; FADD is
             | the floating-point equivalent): you write
             | vector arguments as Vn.2D, Vn.4S, Vn.8H or Vn.16B. Hn is
             | used for 16-bit floating point arithmetic insns. It looks
             | like nothing's using Bn yet, but the manual defines the
             | notation.
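
As a rough model of what the `.4S` arrangement means, the sketch below adds two 128-bit values lane by lane, with four independent 32-bit lanes. It uses unsigned integer lanes, which is what the integer ADD form operates on; `Lanes4S` and `add_4s` are invented names, and this is plain C++ rather than NEON intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// A 128-bit register viewed as four 32-bit lanes (the .4S arrangement).
using Lanes4S = std::array<std::uint32_t, 4>;

// Lane-wise addition: each lane wraps on its own, with no carry
// propagating into the neighbouring lane.
Lanes4S add_4s(const Lanes4S& a, const Lanes4S& b) {
    Lanes4S out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = a[i] + b[i];
    }
    return out;
}

void check_add_4s() {
    Lanes4S v8{1u, 2u, 3u, 0xFFFFFFFFu};
    Lanes4S v9{10u, 20u, 30u, 1u};
    Lanes4S v10 = add_4s(v8, v9);
    // The last lane overflows and wraps to zero without disturbing the rest.
    assert((v10 == Lanes4S{11u, 22u, 33u, 0u}));
}
```

The same 128 bits viewed as `.2D`, `.8H` or `.16B` would simply use two, eight or sixteen lanes of the corresponding width.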
        
       | MarkSweep wrote:
       | Related, Raymond Chen is doing a series of articles on the ARM
       | instruction set this month:
       | https://devblogs.microsoft.com/oldnewthing/2021/06/
        
         | saati wrote:
         | That's about the deprecated Thumb-2, this article is about
         | ARMv8, they are wildly different.
        
         | mhh__ wrote:
         | I can recommend his articles to anyone interested in what they
         | cover. It's very common to completely miss the bigger picture,
         | or give an incomplete summary, when talking about processors
         | and their design. If I can name names I've always found
         | Lemire's blog anticlimactic - usually a good summary but not
         | fleshed out until someone else comments with the juicy details.
        
       | GeorgeTirebiter wrote:
       | Seems to me that 'registers' are just really fast memory
       | locations that get referenced by a 'stunted' address space in
       | order to fit into most instructions. In other words, this is not
       | a conceptual necessity. It exists because it's hard to build HW
       | which can access all memory locations really fast. This hardware
       | limitation leaks into software via hacks like 'registers'.
       | Dealing with this hack is what forced Armv8 to remove PC as R15;
       | "In the A32 and T32 instruction sets, the PC and SP are general
       | purpose registers. This is not the case in A64 instruction set."
       | 
       | There is a lot of other HW crud that leaks into sw for dubious
       | reasons. For example, although 'wasteful' if every memory
       | location had tag bits -- hardware enforced -- capabilities would
       | layer naturally. More sophisticated memory protection would be
       | possible.
       | 
       | Instead, we have these von Neumann archs hellbent on winning some
       | synthetic benchmark. Will the computer-using community ever
       | demand some sacrifice in overall potential "adds-per-second" to
       | enable better security and to support better abstractions? Sadly,
       | I doubt it. Even RISC-V is a baby step.
        
       ___________________________________________________________________
       (page generated 2021-06-16 23:02 UTC)