[HN Gopher] RISC-V Assembler: Arithmetic
       ___________________________________________________________________
        
       RISC-V Assembler: Arithmetic
        
       Author : WillFlux
       Score  : 50 points
       Date   : 2024-01-31 10:46 UTC (1 days ago)
        
 (HTM) web link (projectf.io)
 (TXT) w3m dump (projectf.io)
        
       | WillFlux wrote:
       | 68000 is, in many ways, the pinnacle of assembler for
       | programming, but RISC-V is pretty fun, too. I hope RISC-V tempts
       | a few more people to try asm programming (again).
        
         | snvzz wrote:
         | I got into 68000 programming quite late (6 years ago), but I have
         | been enjoying it (so far Amiga, Atari ST, rosco-m68k). It is a
         | very programmer-friendly instruction set architecture.
         | 
         | RISC-V I started playing with more recently (early 2023, thanks
         | to VisionFive 2), and it feels like my old favorite (MIPS),
         | without the baggage MIPS carried.
         | 
         | It is a pleasure to work with this many GPRs and the
         | comfortable alternative names for them that the official ABI
         | offers. I am loving it so far.
         | 
         | I expect to have RVA22+V hardware soon (Milk-V Oasis). Very
         | much looking forward to playing with the vector extension on
         | that.
        
           | sylware wrote:
           | Yep, same, I am keeping an eye on Oasis, but to run powerful
           | GPU drivers (much user space would have to be ported from C++
           | to hand-written RISC-V assembly, SDK included). Don't rush it
           | though, concurrent access and memory coherence of device
           | memory are still not finalized.
           | 
           | I have been coding quite a lot of x64 recently, and the
           | limitation of 16 GPRs has been painful. I am sure that when I
           | ramp up on rv64 assembly programming, those 32 GPRs will feel
           | like fresh air.
           | 
           | On the other hand, I am not fond of the ABI register names,
           | or the pseudo-instructions involving mini-compilation. I'll
           | stick to xNUMBER register names and won't use
           | pseudo-instructions. Likewise, I will avoid any abuse of the
           | preprocessor.
        
       | throwaway71271 wrote:
       | I love RISC-V assembler.
       | 
       | I did a bit of x86 asm stuff 20 years ago but hated it. Now I
       | wanted to teach my daughter some C and assembler and was deciding
       | between ARM and RISC-V, but RISC-V is just a joy to teach (I made
       | a RISC-V assembler boardgame to help with the task:
       | https://punkx.org/overflow/)
       | 
       | Recently I was rewatching Hackers (1995) and I also got excited
       | about the same quote:
       | 
       | > "RISC architecture is going to change everything." -- Acid Burn
       | 
       | After spending some time with ESP32s and RISC-V assembler, I think
       | it's more true than before :)
       | 
       | If you haven't given it a try yet, there are many articles like
       | this one, and also projects like
       | https://luplab.gitlab.io/rvcodecjs/ or https://riscv.vercel.app/
       | where you can play with it, or even make your own emulator by
       | learning from other emulators like
       | https://github.com/OpenMachine-ai/tinyfive/blob/main/machine...
       | 
       | This cheatsheet is also very useful:
       | https://www.cl.cam.ac.uk/teaching/1617/ECAD+Arch/files/docs/...
        
         | stockhorn wrote:
         | Wow... This boardgame of yours is crazy. I'll definitely take a
         | deeper look at this :). Thanks
        
       | sylware wrote:
       | RISC-V, being a worldwide royalty-free ISA standard, is meant
       | for assembly writing. The main reason for using C and similar
       | languages, beyond "comfort", was ISA abstraction, which has no
       | meaning in the RISC-V realm. The middle ground is those very
       | high-level language interpreters
       | (python/lua/ruby/javascript/etc.) written directly in RISC-V
       | assembly (without abuse of any assembler preprocessor, ofc).
       | 
       | I am currently writing my own rv64-on-x64 virtual machine
       | process, so that I can code my programs in rv64 rather than x64
       | and be "real hardware ready".
       | 
       | BTW, does anybody know an EU-based distributor of milk-v duo
       | boards with the cv1800 SoC (the one without ARM cores and a rv64
       | MCU) whom I can contact using my domestic email server?
        
       | IshKebab wrote:
       | > ProTip: Hexadecimal literals are prefixed with 0x.
       | 
       | I love the idea that someone could get to this page and not
       | already know that!
       | 
       | Also this nicely highlights my pet peeve with assembly:
       |     add  rd, rs1, rs2  # rd = rs1 + rs2
       | 
       | It's very difficult to remember which parameter is the
       | destination etc. IMO it would be much nicer if assembly had just
       | a _little_ more syntax for that sort of thing. E.g.
       |     rd = add rs1, rs2
       |     t0 = li 5
       | 
       | Just so the destination register is obvious. Ah well, nobody's
       | going to do that. Assembly parsing is a total mess; there's no
       | official grammar or anything - it's just "what GCC/LLVM do".
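
A rough sketch of how thin the proposed syntax could be: a preprocessing
pass that rewrites "rd = op a, b" into the conventional "op rd, a, b" before
handing the text to an ordinary assembler. The function name and regular
expression are hypothetical, not part of GCC, LLVM, or any real tool, and the
sketch ignores instructions without a destination (stores, branches).

```python
import re

# Hypothetical "destination = mnemonic operands" surface form.
# Illustrative only; no real assembler accepts this syntax for RISC-V.
_ASSIGN_FORM = re.compile(r"^\s*(\w+)\s*=\s*(\w[\w.]*)\s+(.*?)\s*$")


def to_conventional(line):
    """Rewrite 'rd = op a, b' as 'op rd, a, b'; leave other lines unchanged."""
    match = _ASSIGN_FORM.match(line)
    if match is None:
        return line
    rd, mnemonic, operands = match.groups()
    return f"{mnemonic} {rd}, {operands}"


for source_line in ("rd = add rs1, rs2", "t0 = li 5", "add rd, rs1, rs2"):
    print(to_conventional(source_line))
```

Lines already in the conventional form pass through untouched, so the two
styles could in principle be mixed in one file.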
        
         | Findecanor wrote:
         | > I love the idea that someone could get to this page and not
         | already know that!
         | 
         | Might be worth noting because it is different in other assembly
         | languages. In e.g. 6502 and MC680x0 asm, you'd first prefix
         | with # to differentiate between an address (to load from and/or
         | store to) and an "immediate" constant number, and then $ to
         | denote hexadecimal. In x86 assembly (Intel style), you instead
         | suffix the number with the letter h.
         | 
         | > rd = add rs1, rs2
         | 
         | There are assembly languages for a few architectures that
         | had/have that form or a similar one. Itanium assembly instead
         | had the form:
         |     add r1 = r2, r3
        
         | JAlexoid wrote:
         | Assembly is a language that can be extended. Assembly++?
         | 
         | But at some point doing simple C makes more sense than adding
         | syntactic sugar to Assembly.
        
         | epcoa wrote:
         | > there's no official grammar
         | 
         | Most architectures have something. Sparc had one (I still have
         | the manual), PPC, 68k. I would even say x86 does as well, but
         | you can't force its adoption; what AT&T and GNU want to do on
         | their own can't be prevented. AT&T I suppose had the goal of
         | making it all consistent, but I'm not sure if that was an
         | improvement, though I know it has vocal defenders.
         | 
         | RISC-V might be the exception more than anything, but they have
         | a de facto syntax used throughout the spec.
         | 
         | Analog Devices DSPs and Itanium are examples with the = token.
         | 
         | http://laplace.physics.ubc.ca/vnp4/intel/docs/asm_lan.pdf
         | 
         | https://www.nxp.com/docs/en/reference-manual/M68000PRM.pdf
        
           | camel-cdr wrote:
           | There is an assembly manual: https://github.com/riscv-non-
           | isa/riscv-asm-manual/blob/maste...
        
             | epcoa wrote:
             | That document is grossly unfinished and doesn't even appear
             | to specify syntax outside pseudo ops and a few other
             | things. In a few places it refers to the output of objdump,
             | which I think is close enough to saying "whatever gcc
             | does".
        
         | jcranmer wrote:
         | If you want fun, there's the x86 assembly syntax where the
         | destination is the first register, and the x86 assembly syntax
         | where the destination is the last register. One is the syntax
         | used in official documentation (the Intel and AMD manuals),
         | most reverse engineering tools, etc. The other is the syntax
         | most commonly used in practice because it's what gcc defaults
         | to, and it actually isn't documented (which becomes a problem
         | when you start running into what gcc figured was the best way
         | to adapt AVX-512 EVEX stuff into assembly).
         | 
         | So there are a lot of times where I'm staring at x86 assembly
         | and going "wait, which version is this? the one that does
         | destination first or destination second?"
        
           | epcoa wrote:
           | > and the x86 assembly syntax where the destination is the
           | last register.
           | 
           | ITYM AT&T :). The idea is that the basic grammar is common
           | across architectures to help compiler backend authors. The
           | historical reason for the ordering is that that's how it
           | was on the PDP-11, the "mother" assembly. And all AT&T/GNU
           | versions preserve this ordering regardless of the vendor
           | format.
           | 
           | > The other is the syntax most commonly used in practice
           | 
           | It wasn't always this way. In the dark ages before NASM,
           | MASM was a top warez.
           | 
           | And depending on what you're doing, I don't think Intel
           | syntax is uncommon; gas will even accept it for the most part
           | these days.
           | 
           | > "wait, which version is this? the one that does destination
           | first or destination second?"
           | 
           | There must be some mnemonic to associate sigil vomit with
           | destination last. Shitty sigils come out the ass?
        
         | dvh wrote:
         | Haystack needle
        
         | edgyquant wrote:
         | Assembly is meant to be 1:1 with machine code, which makes
         | writing an assembler extremely easy as long as you know the
         | architecture. Machine code doesn't have things like equal
         | signs, it's literally just a series of bytes (an opcode and
         | operands)
         | 
         | If you want equal signs, use C
        
           | epcoa wrote:
           | Machine code doesn't have mnemonics either. What's your
           | point? There are assembly syntaxes with equal signs (the
           | Itanium, though ultimately a failure, was not obscure, and
           | there are a handful of DSP ISAs that use this syntax). Not my
           | cup of tea, but your argument is specious.
           | 
           | > Assembly is meant to be 1:1 with machine code,
           | 
           | And that is nonsense as well. RISC-V is a perfect example as
           | it has plenty of pseudo ops. (Or do you actually think that
           | the literal machine code bytes of an add instruction are 0x41
           | 0x44 0x44? - they're not)
           | 
           | Equal signs or not, macro assemblers have plenty of ergonomic
           | conveniences layered above machine code.
           | 
           | How does an equal sign break any concordance with assembly
           | and the underlying machine code anyway?
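
To make the "0x41 0x44 0x44" remark concrete, here is a minimal sketch that
encodes a base-ISA `add` using the R-type layout from the RISC-V unprivileged
spec (funct7, rs2, rs1, funct3, rd, opcode). The helper name `encode_add` is
made up for illustration, and the register numbers in the example (x5, x6,
x7) are arbitrary choices.

```python
import struct


def encode_add(rd, rs1, rs2):
    """Return the 32-bit R-type encoding of 'add rd, rs1, rs2' (base ISA)."""
    for reg in (rd, rs1, rs2):
        if not 0 <= reg < 32:
            raise ValueError("register numbers must be in the range 0..31")
    opcode = 0b0110011  # OP major opcode
    funct3 = 0b000      # selects ADD/SUB
    funct7 = 0b0000000  # selects ADD rather than SUB
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


word = encode_add(5, 6, 7)               # add x5, x6, x7
machine_bytes = struct.pack("<I", word)  # little-endian, as stored in memory

print(hex(word))            # 0x7302b3
print(machine_bytes.hex())  # b3027300
print(b"ADD".hex())         # 414444, the ASCII text of the mnemonic
```

The mnemonic's ASCII bytes and the instruction's encoded bytes have nothing
in common, which is the point being made above.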
        
             | edgyquant wrote:
             | It isn't my definition, so your entire rant here is just
             | being argumentative and defensive for no reason. Just
             | because some fringe ISAs have equal signs in their
             | assembler doesn't change anything either. Assembly is meant
             | to map directly to the way the machine code is written and
             | run, so having [opcode] [operand(s)] makes perfect sense.
             | Quality of life/syntactic sugar beyond very simple things
             | like variables (which don't add some crazy AST or other
             | abstractions that make it a compiler) does not make sense
             | for the tool an assembler is.
        
               | epcoa wrote:
               | > fringe ISAs
               | 
               | Itanium fringe? You're clueless and have no credibility.
               | 
               | We're talking about an = instead of a , ... you're
               | needlessly bringing up "crazy ASTs", so much for being
               | argumentative.
               | 
               | And defensive?
               | 
               | You seem mixed up, I'm not the one even advocating for
               | the damn things. But your argument is ignorant and
               | foolish.
               | 
               | > Assembly is meant to map directly to the way the
               | machine code is written
               | 
               | This is just false.
               | 
               | https://github.com/netwide-
               | assembler/nasm/blob/master/asm/as... Simple?
        
           | IshKebab wrote:
           | Nonsense. Machine code doesn't have things like commas,
           | brackets and letters, yet we use those in assembly. There's
           | zero reason why you couldn't do my proposal.
           | 
           | Also assembly mnemonics aren't even 1:1 with instructions.
           | Pseudoinstructions do pretty much anything, and even
           | something like `add` can assemble to two different
           | instructions depending on the arguments.
           | 
           | As for writing an assembler being "extremely easy"... yeah
           | no. There's no formal grammar so you're going to be reverse
           | engineering LLVM _and_ GCC's hilariously messy assemblers.
           | Or more realistically, guessing and building an enormous test
           | suite. Not easy at all.
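
As a small illustration of mnemonics not being 1:1 with instructions, the
sketch below expands a handful of standard RISC-V pseudoinstructions into the
base instructions they stand for. The function `expand_pseudo` is a
hypothetical helper, covers only a few cases, and handles `li` only when the
immediate fits in 12 signed bits; real assemblers do considerably more.

```python
def expand_pseudo(mnemonic, operands):
    """Expand a few RISC-V pseudoinstructions; pass anything else through."""
    if mnemonic == "nop":
        return "addi x0, x0, 0"
    if mnemonic == "mv":
        rd, rs = operands
        return f"addi {rd}, {rs}, 0"
    if mnemonic == "not":
        rd, rs = operands
        return f"xori {rd}, {rs}, -1"
    if mnemonic == "neg":
        rd, rs = operands
        return f"sub {rd}, x0, {rs}"
    if mnemonic == "li":
        rd, imm = operands
        if -2048 <= int(imm, 0) <= 2047:
            return f"addi {rd}, x0, {imm}"
        # Larger constants need more than one instruction (e.g. lui + addi).
        raise NotImplementedError("only 12-bit immediates are handled here")
    # Not a pseudoinstruction known to this sketch: emit it unchanged.
    return f"{mnemonic} {', '.join(operands)}"


print(expand_pseudo("li", ["t0", "5"]))
print(expand_pseudo("mv", ["t0", "t1"]))
print(expand_pseudo("add", ["a0", "a1", "a2"]))
```

Even this toy version shows one mnemonic (`li`) needing different output
depending on its operand, which is part of why a full assembler is harder
than a lookup table.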
        
       ___________________________________________________________________
       (page generated 2024-02-01 23:01 UTC)