[HN Gopher] AT&T Syntax versus Intel Syntax
___________________________________________________________________
AT&T Syntax versus Intel Syntax
Author : susam
Score : 84 points
Date : 2022-11-13 17:29 UTC (5 hours ago)
(HTM) web link (www.cs.mcgill.ca)
(TXT) w3m dump (www.cs.mcgill.ca)
| ufo wrote:
| On Linux, is there a way to convert an assembly language file
| from one syntax to the other?
|
| I know that there are ways to ask GCC to emit one syntax or the
| other, as well as ways to assemble code in either syntax. However
| I don't know any program that just translates one to the other.
| Graziano_M wrote:
| AT&T vs. Intel syntax has caused many arguments in some of my
| circles. AT&T is an atrocity and if you disagree... you're wrong.
| :)
| [deleted]
| mixmastamyk wrote:
| op src dest
|
| is the logical order, the rest of the syntax I don't care about
| much.
| SoftTalker wrote:
| TFA says "The `source, dest' convention is maintained for
| compatibility with previous Unix assemblers."
| Joker_vD wrote:
| Which is why all memcpy-like functions in the C/C++ standard
| library take arguments in dst, src order.
| pjmlp wrote:
| In mathematics _variable = value_ , variable receives value,
| ergo _op dest src_.
|
| That is logic.
| mrkeen wrote:
| In mathematics _variable = value ⇔ value = variable_
| googlryas wrote:
| That is just an arbitrary notation, nothing to do with
| logic. In English at least, "add 4 to x" is more natural
| than "add to x 4".
|
| Ergo _op src dest_
| bialpio wrote:
| But English "add 4 to x" tells you nothing about where to
| store the result. : - )
|
| "Add 4 to x and store result in x" vs "Let x be x + 4"?
| monocasa wrote:
| Without any other context, you generally assume that the
| target is an accumulator.
|
| So "add 4 coins to that bucket", generally means at the
| end of that operation the bucket contains at least four
| coins plus any coins that were already in the bucket.
| bialpio wrote:
| Huh, I think my brain is not wired to think about `x` as
| a storage, it is closer to "add 3 to x for x = 2", which
| then gets reduced to "add 3 to 2" and then the
| destination is missing...
| pavlov wrote:
| Indeed. Besides, "let there be light" is clearly
| destination-first, so this matter was already resolved by
| Genesis 1:3.
| Eleison23 wrote:
| Which direction was the Hebrew text written in?
| LudwigNagasena wrote:
| That doesn't follow. _variable = value_ , ergo _dest op
| src_.
| monocasa wrote:
| ...math has no such order. Half the point of algebra is
| that the two sides of the equation are semantically
| equivalent and can be swapped at will.
| pjmlp wrote:
| Good luck convincing anyone that _value = variable_ makes
| sense.
| monocasa wrote:
| My math teachers had no problem with '42 = x' vs 'x = 42'
| as long as the steps made sense to get there. In fact
| they'd probably comment that there was no need to go with
| 'x = 42' if I obviously took a circuitous route simply to
| end up with x on the left side of the equation, as that
| would have demonstrated a lack of internalizing some of
| the base ideas of algebra and its approach of equation
| symmetry.
| fweimer wrote:
| Some older C coding styles recommend that order because
| before compilers added warnings, "if (17 = variable)"
| resulted in a diagnostic, while "if (variable = 17)"
| would not. Nowadays, I think most programmers prefer
| putting the fastest-changing expression first.
| [deleted]
| akira2501 wrote:
| And it has no state. Half the point of computation is
| maintaining state for the purpose of efficiency. So,
| computation gets an "assignment" operator where pure math
| lacks one, pure math being relegated to subscripts of
| time and indicator functions instead.
| molticrystal wrote:
| Postfix , infix, prefix, crucifix.
| NobodyNada wrote:
| I used to think this was more intuitive, but after using both
| for a while I came to the conclusion that putting the
| destination first is much more practical, because my eyes can
| scan the left column to quickly find where a register was
| last written to. If the destination is last, it doesn't
| appear in a consistent location horizontally.
| sph wrote:
| Be that as it may, the AT&T indirect memory reference
| `section:disp(base, index,scale)` is an abomination unto God.
|
| At least the Intel one makes actual mathematical sense:
| `section:[base + index*scale + disp]`
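A minimal sketch of the two spellings compared above, written in Python rather than assembly. The function names and the register values are made up for illustration, and segment overrides are left out. Both strings describe the same computation, base + index*scale + disp.

```python
def effective_address(base, index, scale, disp):
    """Address the CPU computes for a base/index/scale/disp operand."""
    if scale not in (1, 2, 4, 8):  # x86 only encodes these scale factors
        raise ValueError("scale must be 1, 2, 4 or 8")
    return base + index * scale + disp


def format_att(base, index, scale, disp):
    """AT&T spelling: disp(base,index,scale), registers prefixed with %."""
    return f"{disp}(%{base},%{index},{scale})"


def format_intel(base, index, scale, disp):
    """Intel spelling: [base + index*scale + disp]."""
    return f"[{base} + {index}*{scale} + {disp}]"


# Hypothetical register contents, chosen only for the example.
regs = {"ebx": 0x1000, "esi": 3}
addr = effective_address(regs["ebx"], regs["esi"], 4, 8)
att = format_att("ebx", "esi", 4, 8)
intel = format_intel("ebx", "esi", 4, 8)
print(att, intel, hex(addr))
```

The Intel form can be read off as arithmetic, while the AT&T form requires remembering which position holds which term.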
| colejohnson66 wrote:
| That "logical" order is only because you're trying to read it
| like a sentence ("mov/add ebx into eax") when you should be
| reading it like a formula or what it actually is - code. And
| that's fine, but considering Intel created the chip, it makes
| sense that they should decide how the assembly syntax should
| be, not AT&T.
|
| The _only_ reason "AT&T syntax" exists for x86 is because
| people working at AT&T refused to use Intel as the
| authoritative reference on the syntax, and, instead, decided
| to follow the convention of the PDP, Motorola, etc. family
| and friends. Hence why `as` (and subsequently `gas`) have
| that as the default.
| JonChesterfield wrote:
| Sketchy part is when operations read and write with the same
| argument. Op first is nice for that as it becomes op arg0
| arg1.
|
| I quite like the SSA style which tends to be dst0 dst1 opcode
| src0 src1 but that doesn't model assembly brilliantly.
| Perhaps that order with read-write arguments required to
| appear on both sides of opcode with the same symbol has some
| merit.
| userbinator wrote:
| That is extremely confusing for comparisons, which are
| effectively a subtraction.
| [deleted]
| bbarnett wrote:
| They are using a classic dollar sign for assignment, so clearly
| at&t is better.
| masklinn wrote:
| > They are using a classic dollar sign for assignment
|
| They're using a dollar sign for _immediates_.
|
| As if you can't notice that it's a number.
| addl $4, %eax
|
| that's 3 different completely unnecessary symbols for things
| which are not ambiguous in the first place:
|
| - the operation width is provided by the register
|
| - a number is an immediate
|
| - a register is named
|
| Hence the much less noisy Intel syntax:
| add eax, 4
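A toy sketch of the rewrite described above (drop the suffix, drop the `$` and `%` sigils, swap the operands), in Python. It is not a real translator: the suffix table and the function name `att_to_intel` are assumptions for illustration, and it handles only register and immediate operands. Memory operands are rejected, since there the sigils stop being redundant.

```python
# Assumed suffix table: only a few 32-bit mnemonics, for illustration.
SUFFIXED = {"addl": "add", "subl": "sub", "movl": "mov"}


def att_to_intel(line):
    """Translate `op src, dst` with register/immediate operands only."""
    mnemonic, operands = line.split(None, 1)
    src, dst = [part.strip() for part in operands.split(",")]

    def strip_sigil(operand):
        if operand.startswith("$") or operand.startswith("%"):
            return operand[1:]
        # No sigil means a memory operand, which this toy does not handle.
        raise ValueError(f"unsupported operand: {operand}")

    op = SUFFIXED.get(mnemonic, mnemonic)
    # Intel order is destination first, so the operands are swapped.
    return f"{op} {strip_sigil(dst)}, {strip_sigil(src)}"


print(att_to_intel("addl $4, %eax"))
```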
| docandrew wrote:
| The comma is unnecessary too, so that's 4 unneeded symbols.
| matja wrote:
| addl is not necessary if it is not ambiguous
| add $4, %eax
|
| - is fine
|
| Compare with
| add 4, %eax
|
| The "less noisy" Intel syntax becomes:
| add eax, DWORD PTR [4]
| __init wrote:
| Most assemblers for Intel syntax will let you write:
| add eax, [4]
|
| if you desire. Indeed, many disassemblers will follow
| suit in unambiguous cases. IDA, for example, does this.
| colejohnson66 wrote:
| The only time "DWORD PTR" is required is when (1) you're
| working with old assemblers, or (2) you're using a memory
| operand with an immediate:
| add eax, [4] ; inferred
| add [eax], 4 ; ambiguous
| add DWORD [eax], 4 ; explicit
|
| A disassembler may output it when not necessary, however.
| amluto wrote:
| Contemplate the AT&T vs Intel syntax for x86 addressing modes
| and say that again.
| ratsmack wrote:
| X86 addressing modes are an atrocity:
| https://stackoverflow.com/questions/63571979/clarifying-
| thre...
| amluto wrote:
| They are an atrocity _in AT&T syntax_. They are
| perfectly readable in Intel format. For example:
|
| https://blog.yossarian.net/2020/06/13/How-x86_64-addresse
| s-m...
| ratsmack wrote:
| That is an excellent write up, thanks.
| pjmlp wrote:
| Because of a dumb idea to depend only on GAS, before it had good
| support for Intel syntax, I ported some code from Intel syntax to
| AT&T many moons ago. If I were doing it today I would have listed
| nasm or yasm as a requirement and been done with it.
| retrac wrote:
| Here's the why for the curious. It's just a historical quirk.
|
| It's "AT&T syntax" because it dates to the 1978 AT&T Labs effort
| to port UNIX to the 8086. [1] While the 8086 did not have virtual
| memory or hardware protection, its memory segmentation model was
| still adequate to support UNIX. It was the first microprocessor
| practically capable of running UNIX, and this was realized before
| the chip was even released. The porting effort started
| immediately. (Though most of the energy would soon switch to the
| 68000 when that was released a year later.)
|
| The AT&T folks did not wait for Intel's assembler. (Written in
| Fortran, to run on mainframes, or on Intel's development
| systems). Nor did they closely model their assembler after it.
| They just took the assembler they already had for the PDP-11 and
| adapted it with minimal changes for the 8086. Quick and dirty.
| Which was okay. You're not supposed to write assembly on UNIX
| systems, anyway. Only the poor people who had to write kernel
| drivers and compilers would ever have to deal with it.
|
| [1] https://www.bell-labs.com/usr/dmr/www/otherports/newp.pdf
| (see section III)
| bbanyc wrote:
| I think there's a bit more to the story. It was before my time,
| but as I understand it the most widely used Unix for the 8086
| was XENIX (initially a Microsoft product, later sold to SCO),
| which used Intel-syntax MASM as its assembler.
|
| XENIX for 386 was based on AT&T System V/386, which introduced
| the AT&T syntax to 32-bit x86. I've found some references to
| 32-bit XENIX still using an assembler called "masm" but I don't
| know if it was still based on Microsoft's MASM or just called
| that for compatibility, or whether it was AT&T or Intel syntax.
| Also by that point compilers and assemblers weren't included in
| the base OS anymore, but a "development kit" sold separately.
|
| The Minix compiler and assembler also used Intel syntax.
| bluedino wrote:
| It's funny how many things come down to 'UNIX hackers did it
| this way when they had to work with a PC'
| jcalvinowens wrote:
| I initially learned the Intel syntax, and preferred it for
| a while. But the more I work with non-x86 CPUs, the more I prefer
| AT&T just because it feels less different than everything else.
| aap_ wrote:
| Much agreed. Intel syntax just seems somewhat alien.
| userbinator wrote:
| Intel syntax is more similar to ARM, MIPS, and even RISC-V's
| official syntax than AT&T.
| secondcoming wrote:
| But all the Intel instruction documentation unsurprisingly
| uses the Intel syntax!
| yakaccount4 wrote:
| That's interesting. My entire world revolves around x86 and
| ARM, so "Intel" syntax (which to me mostly means op dest,
| src) is what seems normal to me.
| aap_ wrote:
| Order of dst and src i have no strong feelings about, it's
| all the rest that i find weird about intel syntax.
| GeorgeTirebiter wrote:
| dest = src
|
| That's how I think of it.
|
| ALSO: I really do not like the MOV instruction; I much
| prefer LD. The Z-80 instruction names got everything
| mostly right.
| jart wrote:
| AT&T syntax is the most elite syntax. I've used it to write some
| famous hacks, like Actually Portable Executable, which is a
| 16-bit BIOS bootloader shell script ELF / Mach-O / PE executable.
| People dislike it because writing assembly Bell Labs style
| requires great responsibility. What makes AT&T syntax so powerful
| is its tight cooperation with the linker. I don't think it would
| have been possible for me to invent APE had I been using tools
| like NASM and YASM and FASM because PC assemblers were
| traditionally written as isolated tools that weren't able to take
| a holistic view with linker scripts and the C preprocessor.
| https://raw.githubusercontent.com/jart/cosmopolitan/master/a...
| chrisseaton wrote:
| Isn't this a function of the tools you were using, not the
| syntax? Couldn't any of these tools support any syntax and do
| the same thing?
| JonChesterfield wrote:
| The line between syntax and functionality is pretty thin for
| an assembler.
|
| I've definitely had code that some assemblers accepted and
| others didn't on the same arch, so even if they could be
| equally expressive in practice they aren't. Fairly sure
| that's also true of inline assembly on clang x64, had to
| change between intel and at&t for something a while ago.
| fingleberry wrote:
| I'm sorry, no. This is incredibly misleading. Even if the
| linker step and assembly are completely separated, everything
| you've built in Cosmopolitan and APE is 100% buildable in other
| tools. It might take more effort in some cases, but there's
| more to a native stack than choice of tooling and syntax; if
| you genuinely know your stack and architecture, anything is
| possible.
|
| Your accomplishments have zero to do with the 'elite' tooling
| you use (why are you gatekeeping and creating class
| distinctions out of assemblers?), and more that you've taken
| the time to really think about how memory is laid out and how
| the architecture works - which most of us who started out
| writing operating systems instead of Rails understand perfectly
| fine. Nothing about the relationship between gas and ld
| achieves uniqueness not seen in other native stacks. That's
| just made up.
|
| There are multiple operating systems built with hand written
| NASM. Arguing about assemblers like they matter for more than
| five seconds is tiring 1990s IRC stuff. They turn syntax into a
| byte layout. It's like realizing oh, this assembler sucks at
| ELF, why don't I just hand lay one out? and boom, you're on the
| way to APE.
| dboreham wrote:
| Hmm. "AT&T"? I thought it at least came from DEC via the PDP-11,7
| assembler.
|
| And I'm guessing Intel didn't invent doing it backwards on their
| own either.
| masklinn wrote:
| It's the AT&T syntax because AT&T are the ones who unleashed
| that on the world against the wishes of everyone. See the
| sibling comment:
|
| > The AT&T folks did not even wait for Intel's assembler [...]
| Nor did they closely model their assembler after it. They just
| took the assembler they already had for the PDP-11 and adapted
| it with minimal changes for the 8086.
| ksherlock wrote:
| Wikipedia confirms it.
|
| https://en.wikipedia.org/wiki/X86_assembly_language#Syntax
|
| "The AT&T syntax is nearly universal to all other architectures
| with the same mov order; it was originally a syntax for PDP-11
| assembly. The Intel syntax is specific to the x86 architecture,
| and is the one used in the x86 platform's documentation."
|
| https://en.wikipedia.org/wiki/As_(Unix)
|
| "As of November 1971, an assembler invoked as as was available
| for Unix. Implemented by Bell Labs staff, it was based upon the
| Digital Equipment Corporation's PAL-11R assembler."
| [deleted]
| monocasa wrote:
| AT&T as in Bell Labs in the work that would become Unix.
| PaulHoule wrote:
| I never liked the syntax used by gas. It feels like something
| intended to be part of a C compiler, not like something you'd use
| for the joy of assembly language.
| userbinator wrote:
| IMHO the most confusing part is that AT&T/GAS syntax inverts the
| comparisons, which are otherwise natural in Intel syntax:
| cmp eax, ebx ; eax ? ebx
| jg foo ; jump if eax > ebx
|
| Related: http://x86asm.net/articles/what-i-dislike-about-gas/
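A small Python model of the inversion described above, assuming signed 32-bit values that fit in ordinary integers. The function names and register values are hypothetical. In both syntaxes the same machine instruction is assembled. Only the written operand order differs, so the "greater" in `jg` refers to the first written operand in Intel syntax and the second in AT&T syntax.

```python
def jg_taken_intel(first, second):
    """Intel: `cmp first, second` then `jg` jumps when first > second."""
    return first > second


def jg_taken_att(first, second):
    """AT&T: `cmp first, second` then `jg` jumps when second > first,
    because the operands are written in the opposite order."""
    return second > first


eax, ebx = 7, 3  # hypothetical register values

# Intel: cmp eax, ebx   / jg foo
# AT&T:  cmp %ebx, %eax / jg foo   (same machine code)
same_outcome = jg_taken_intel(eax, ebx) == jg_taken_att(ebx, eax)
print(jg_taken_intel(eax, ebx), same_outcome)
```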
| mrjin wrote:
| Also the % before register names is completely unnecessary; it's
| just another extra character to type.
| [deleted]
| rwmj wrote:
| It disambiguates labels from registers, assuming of course
| you allow labels to have register names. eg this is valid:
| mov rax,%rax
| rax: .ascii "hello\0"
|
| Stupid perhaps, but valid.
| jck wrote:
| I wonder if it would have been more ergonomic to have the
| labels be % prefixed instead of registers.
| _tomcat_ wrote:
| mshockwave wrote:
| A similar thing also happens on 68k: Motorola syntax vs. "MIT"
| syntax, which is probably only used by the GNU toolchain
| ack_complete wrote:
| Practically, 68k is far more usable in AT&T syntax than x86.
| When I used to do PalmPilot development, you could basically
| write standard 68k asm with just some extra %s sprinkled before
| registers and as would be fine with it. The x86 AT&T syntax is
| far more alien compared to the syntax in the official manuals,
| with arguments backward and nonstandard instruction names like
| addl and movabsq.
| bmc7505 wrote:
| Interesting bit of trivia, Prof. Ratzer, the author of this
| piece, was the first graduate student in computing at McGill
| University [1] and one of the founding members of the School of
| Computer Science [2], which just recently celebrated its 50th
| anniversary. [3]
|
| [1]:
| https://en.wikipedia.org/wiki/McGill_University_School_of_Co...
|
| [2]: https://www.cs.mcgill.ca/~ratzer/backup/welcome.html
|
| [3]:
| https://mcgill.imodules.com/controls/email_marketing/view_in...
___________________________________________________________________
(page generated 2022-11-13 23:00 UTC)