[HN Gopher] Actually Portable Executable
       ___________________________________________________________________
        
       Actually Portable Executable
        
       Author : NilsIRL
       Score  : 590 points
       Date   : 2021-02-26 12:06 UTC (10 hours ago)
        
 (HTM) web link (justine.lol)
 (TXT) w3m dump (justine.lol)
        
       | bmn__ wrote:
       | Previously:
       | 
       | https://news.ycombinator.com/item?id=24256883
       | 
       | https://news.ycombinator.com/item?id=25556286
        
       | fctorial wrote:
       | > One of the reasons why I love working with a lot of these old
       | unsexy technologies, is that I want any software work I'm
       | involved in to stand the test of time with minimal toil.
       | 
       | Could've written a win32 program.
        
         | est wrote:
         | Could it disguise as a WinRT program?
        
           | fctorial wrote:
           | It doesn't need to.
        
             | NullPrefix wrote:
             | Unless it wants to pass the built in anti malware filter.
        
               | colejohnson66 wrote:
               | What's up with your computer where Defender flags any
               | non-RT programs? I don't have that issue.
        
       | timcook4253stg wrote:
       | Hello its good
        
       | dang wrote:
       | Of course this is fabulous but it's a follow-up to
       | 
       |  _Show HN: Redbean - Single-file distributable web server_ -
       | https://news.ycombinator.com/item?id=26271117 - Feb 2021 (141
       | comments)
       | 
       | ... which is still high on the front page. Also there was a big
       | thread last year, which is still within the dupe window:
       | 
       |  _αcτµαlly pδrταblε εxεcµταblε_ -
       | https://news.ycombinator.com/item?id=24256883 - Aug 2020 (286
       | comments)
       | 
       | We downweight follow-ups because otherwise the front page gets
       | too repetitive and repetition is mainly what we try to avoid
       | here:
       | 
       | https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...
       | 
       | https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so...
        
         | greggturkington wrote:
         | There is no way to find the other thread using the site search
         | though
        
           | TheRealPomax wrote:
           | the what now?
        
       | dfgdghdf wrote:
       | Can someone explain the advantage over building an executable for
       | each target system?
        
         | burkaman wrote:
         | It's cool.
        
         | mikepurvis wrote:
         | I don't think it has any benefit if you're installing software
         | exclusively that you built yourself on your own targets, or
         | from a distro package manager. But it's potentially a boon for
         | a whole class of statically-linked rescue tools, installers,
         | command-line utilities -- basically anything where there's a
         | website with a curl path/to/thing > local/bin/thing
         | installation option.
        
           | sirius87 wrote:
           | And malware. I don't know why that popped into my mind as the
           | first use-case for this and the web server. :|
        
           | CydeWeys wrote:
           | It also makes manually downloaded software distribution
           | easier. Rather than the user having to select which version
           | of the software to download (which users often get wrong), or
           | trying to guess based on browser user-agent, there's just the
           | one download link that works on everything.
        
         | sime2009 wrote:
         | Not having to build an executable for each target system.
        
           | mikepurvis wrote:
           | I think it's neat how this acknowledges the reality that the
           | actual meat of the machine code is identical for every x86_64
           | target-- all that's different is the OS interface. So unlike
           | other "fat binary" schemes where there's a lot of
           | duplication, this one has a single main program and then
           | small shims to provide the Linux ABI on MacOS and Windows.
        
         | amarant wrote:
         | This is slightly faster I guess?
         | 
         | I don't think it's that big of a deal either, since compiling
         | these days is fast enough you can do it 3 times without it
         | being a problem.
         | 
         | Don't get me wrong, it's very impressive, I just don't think it
         | makes that big of a difference in practice, especially since
         | environmental differences will still require you to have 2
         | codebases in many scenarios (like accessing the filesystem for
         | example)
        
           | rurban wrote:
           | Nope much slower. And never worked for me.
        
           | mikepurvis wrote:
           | Every source-portable program has that anyway though,
           | typically either with a bunch of ifdefs, or by linking to an
           | abstraction like boost::filesystem.
           | 
           | The change here would basically be that all versions of it
           | would have to be compiled into the same binary, with a
           | runtime switch.
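
A minimal Python sketch of the "runtime switch" idea above: every OS-specific variant ships in the same artifact, and the choice is made when the program runs rather than by ifdefs at build time. The names `temp_dir` and `IMPLEMENTATIONS` and the fallback paths are hypothetical, chosen only for illustration; this is not a description of how Cosmopolitan itself does it.

```python
import sys


def _temp_dir_windows(env):
    # Windows variant: hypothetical fallback path if TEMP is unset.
    return env.get("TEMP", "C:\\Temp")


def _temp_dir_posix(env):
    # POSIX variant: hypothetical fallback path if TMPDIR is unset.
    return env.get("TMPDIR", "/tmp")


# Both variants live side by side; nothing is compiled out.
IMPLEMENTATIONS = {"win32": _temp_dir_windows}


def temp_dir(env, platform_name=sys.platform):
    """Select the OS-specific variant at run time."""
    impl = IMPLEMENTATIONS.get(platform_name, _temp_dir_posix)
    return impl(env)
```

Passing `platform_name` explicitly makes both branches testable on a single host, which is the practical upside of a runtime switch over a build-time one.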
        
         | user-the-name wrote:
         | It is very satisfying.
        
       | smallpipe wrote:
       | Using completely wrong greek letters in the title is making me
       | very uneasy for no good reason
        
         | jdxcode wrote:
         | I read your comment thinking you were being stodgy, but then I
         | went to the site and had the same reaction.
        
         | enriquto wrote:
         | I used to feel the same way, and this is indeed an annoying
         | practice. Yet here it makes perfect sense, since this work is
         | based on using certain symbols (e.g., header magic numbers in
         | one executable file format) according to a non-intended
         | interpretation based on casual and meaningless similarities
         | (e.g., as machine instructions in another executable file
         | format).
        
           | madeofpalk wrote:
           | okay well just fuck anyone using a screen reader then.
        
             | ryanianian wrote:
             | The HTML is accessible:
             | 
             |     <h1 title="Actually Portable Executable">actually ...
        
               | madeofpalk wrote:
               | ahhh fair!!! Unfortunately only the page title. Elsewhere
               | in the document it's not annotated                   the
               | <a href="https://raw.githubusercontent.com/jart/cosmopoli
               | tan/667ab245fe0326972b7da52a95da97125d61c8cf/ape/ape.S">α
               | cτµαlly pδrταblε εxεcµταblε</a> format
               | 
               | still obnoxious though imho
        
               | oqkf wrote:
               | No, that should be an ARIA label. Specifically NOT a
               | `title` attribute.
               | 
               | A title provides additional (not redundant) info and
               | browsers and assistive technologies implement the
               | attribute differently.
        
               | enriquto wrote:
               | The whole page is written in beautiful HTML also
               | (probably by hand?)
        
               | JxLS-cpgbe0 wrote:
               | <p style="float:right">              <center>
               | 
               | Sure looks handwritten to me. I guess "beautiful" is
               | subjective; the <center> element being deprecated is
               | _not_.
        
             | [deleted]
        
             | coldtea wrote:
             | Why, do the people with the screen reader have some
             | specific need to read the title of this article? As if it's
             | some important resource or something?
             | 
             | It's just one irrelevant thing they can't read, same as
             | millions of articles written in different languages...
        
               | colejohnson66 wrote:
               | Because the title tells you what the whole page is about.
               | You can know when to continue or not sometimes with just
               | the title.
        
               | oqkf wrote:
               | Users with or without screen readers could reasonably
               | expect to read plain text.
        
           | defgeneric wrote:
           | But the metaphor doesn't work. I understand the intent, but
           | if anything, it's more appropriate for an error-correcting
           | code.
        
           | oqkf wrote:
           | It doesn't make perfect sense. If the name is _actually_
           | "Actually Portable Executable" then all users should be able
           | to read it that way.
           | 
           | If it is only _stylized_ as "αcτµαlly pδrταblε εxεcµταblε", a
           | readable name should be available alongside the visual
           | styling (using ARIA attributes, for example).
        
             | thotsBgone wrote:
             | Mouse over the title on the webpage
        
               | oqkf wrote:
               | Ok, it shows a tooltip.
               | 
               | The `title` attribute should specifically _not_ be used
               | here; ARIA labels _should_. The title attribute is
               | implemented differently across browsers and assistive
               | technologies [1] and is supposed to be a _title_ , not a
               | duplicate of the content of the element.
               | 
               | https://developer.mozilla.org/en-
               | US/docs/Web/HTML/Global_att...
        
         | JoelMcCracken wrote:
         | I mean, i keep reading it as the letters themselves; really the
         | only two that bothered me were delta ~ d and not o, and mu is m
         | and not u
        
         | codetrotter wrote:
         | Now you reminded me of a book that my grand father has, where
         | the title is something like:
         | 
         | Iatsssidi Sdias
         | 
         | And it's just so horrible to make a title using Cyrillic
         | characters according to what looks like Latin and not according
         | to their actual sounds XD
        
         | omnicognate wrote:
         | > I chose the name because I like the idea of having the
         | freedom to write software without restrictions that transcends
         | traditional boundaries.
         | 
         | Actmallu pdrtable execmtable
        
           | jwilk wrote:
           | Sorry to ruin the joke, but _y_ is actually in Latin script.
        
             | omnicognate wrote:
             | Yeah, that's why I tried to pick the closest Greek has to a
             | y sound, which I think (I only know a bit of GCSE ancient
             | greek from 30 years ago) is upsilon. If I'd read it as the
             | letter that _looks_ most like a y it would have been
             | "actmallg" (gamma).
        
         | pcthrowaway wrote:
         | As someone who reads and speaks (mono ena ligo - only a little)
         | Greek, same.
         | 
         | There's a subreddit for parodying the phenomenon -
         | http://reddit.com/r/grssk
        
       | m33k44 wrote:
       | "The most compelling use case for making x86-64-linux-gnu as tiny
       | as possible, with the availability of full emulation, is that it
       | enables normal simple native programs to run everywhere including
       | web browsers by default....I think we need compatibility glue
       | that just runs programs, ignores the systems, and treats
       | x86_64-linux-gnu as a canonical software encoding."
       | 
       | :)
       | 
       | Just a smiley, no other words!
        
       | mlok wrote:
       | Closely related : this "Show HN" of an Actually Portable
       | Executable for a web server, published earlier today by the
       | author : https://news.ycombinator.com/item?id=26271117
        
         | ericol wrote:
         | Yep, there is usually a lot of "piggybacking" (this comment is
         | not mean-spirited, just stating a fact) on HN. I made a similar
         | comment a while back [1]
         | 
         | https://news.ycombinator.com/item?id=25625703
        
           | TheRealPomax wrote:
           | You do know that it's the same person, right? They're both
           | links to articles on her website. There is no "piggybacking",
           | this is literally her writing about making these things.
        
             | [deleted]
        
             | md224 wrote:
             | No, "piggybacking" refers to the posting of related
             | material on HN after the original material becomes popular.
             | The posting on HN is the piggybacking, not the writing of
             | the material itself.
        
       | nindalf wrote:
       | The author is less than enthusiastic about Apple and Microsoft
       | pivoting to ARM. Considering the perf of the M1, this is
       | virtually inevitable. And once most developer tool chains start
       | supporting ARM as a first class citizen, I see no reason why we
       | wouldn't start running our applications on ARM in the cloud. A
       | world with 2 architectures for mainstream use cases is the
       | future, there's no point fighting it.
       | 
       | (Unless you're Intel/AMD in which case please fight it by giving
       | us faster, more power-efficient chips for cheaper. Thanks!)
        
         | jart wrote:
         | We achieved a near ubiquitous consensus with the x86 PC. Then
         | APPLE said, Behold, the programmers are one, and they can build
         | portable binaries with one machine language; and this they
         | begin to do: and now nothing will be restrained from them,
         | which they have imagined to do. Go to, let us go down, and
         | there confound their machine code, that they may not run each
         | other's apps across platforms. So APPLE scattered them abroad
         | with M1 processors from thence upon the face of all the
         | Internet: and they left off to rebuild their open source.
        
           | pcwalton wrote:
           | Except ARM CPUs have been vastly outselling x86 CPUs for a
           | very long time, long before the M1 entered the scene. In just
           | Q4 2020, 6.7 _billion_ ARM-based devices shipped, while 275
           | million PCs shipped in _all of 2020_. Desktop PCs are only a
           | small fraction of the total computing ecosystem.
           | 
           | The stuck-in-the-'90s "desktop is all there is" mindset is a
           | weird holdover from the early growth of PCs in developed
           | countries. If you look at emerging markets, mobile is
           | completely dominant.
        
             | jart wrote:
             | Raw sales numbers are going to be biased because ARM is
             | like Zerg and x86 is Protoss. In the ARM world there isn't
             | the same concept of a central processor so normally lots of
             | chips get built into each individual device.
             | 
             | ARM has also been historically used most often on
             | proprietary systems you need authorization to develop for.
             | So it's made less sense as a target for open source tooling
             | hack projects like this one.
             | 
             | ARM also has so many sub-targets that it's almost like a
             | coalition of ISAs rather than a unified one like x86. So
             | adding ARM support to Actually Portable Executable might
             | not be as simple as including an ARM build in the binary.
             | We might need to have multiple ARM builds for its
             | microarchitectures. Because ARM users want resource
             | efficiency and they're not going to be happy with a
             | generalized build that broadly targets ARM; they want code
             | that's narrowly targeted to the specific revisions of the
             | processor that they're using.
             | 
             | In other words, we can't give ARM users portable binaries
             | because ARM users do not want them.
             | 
             | I also always thought that code for other architectures was
             | the kind of thing that mostly got contributed by the people
             | who build those architectures. Things like how IBM always
             | graces our GitHub issues with patches each time our code
             | doesn't work on s390x mainframes. I like that they do it by
             | contributing patches rather than the feedback of why don't
             | you support this? Why don't you support that? Oh I didn't
             | say I actually needed it.
        
               | dapids wrote:
               | Hahaha, I love the StarCraft analogy!
        
               | NobodyNada wrote:
               | > they're not going to be happy with a generalized build
               | that broadly targets ARM; they want code that's narrowly
               | targeted to the specific revisions of the processor that
               | they're using.
               | 
               | Apple already does this for x86: macOS contains
               | duplicates of all operating system binaries & libraries
               | compiled for pre- vs. post-Haswell processors.
        
               | a1369209993 wrote:
               | > We might need to have multiple ARM builds for its
               | microarchitectures.
               | 
               | Nitpick: by definition you don't need different builds
               | for different _micro_architectures (except maybe
               | for performance). If the same code doesn't work on
               | different chips, it's because they have different
               | ('versions of') instruction set _architectures_.
               | 
               | Edit: nevermind, on rereading, you were already
               | complaining about that. You should probably stick scare-
               | quotes on "'need'", though.
        
             | user-the-name wrote:
             | I'm not sure bursting in with an "EXCEPT!" is quite the
             | correct response to a satirical bible quote.
        
             | FranOntanaya wrote:
             | How do PC sales get tracked when a lot of builds are put
             | together from parts at mom-and-pop stores or by the
             | customers themselves?
        
               | toast0 wrote:
               | There's only two vendors selling CPUs for build your own
               | PCs. Track their sales, and there's your PC numbers.
        
           | mvh wrote:
           | Lol this made my morning
        
           | reaperducer wrote:
           | I'm not sure what you're saying, but I appreciate that you
           | put more effort into saying it than ten other HN posts
           | combined.
        
             | dtech wrote:
             | It's an adapted Bible quote. Genesis 11:6, about the tower
             | of Babel.
        
               | reaperducer wrote:
               | Cool, thanks for the explanation. It sounded Biblical,
               | but I wasn't sure.
        
             | beardbound wrote:
             | I think it was an allusion to the tower of Babel.
        
             | [deleted]
        
           | stergios wrote:
           | I read that in the voice of Cecil B. DeMille narrating in The
           | Ten Commandments. Well done.
        
           | blueblob wrote:
           | Just because it's ubiquitous doesn't mean that it's good.
           | Also, to be clear, x86 then became x86_64/amd64 which isn't
           | the same architecture either. There will always be iterations
           | and oddball architectures where something new can be learned
           | and even reapplied to update x86. POWER, Sparc, etc. all
           | taught new lessons.
           | 
           | Apple isn't scattering anything by running processors that
           | run the same architecture as android and ios on phones. Most
           | open source software can already be compiled for x86, arm,
           | sparc, etc.
        
         | wbl wrote:
         | How many toolchains do not have Arm64 support? Cross-compiling
         | is ancient and most tools predate x86 being useful.
        
         | Vvector wrote:
         | Does ARM really have a performance advantage? Or is it the
         | specific Apple customizations tailored to their use case?
         | 
         | Apple doesn't have to worry about 35 years of legacy
         | architecture to support.
        
           | manigandham wrote:
           | It currently has a performance per watt advantage because of
           | a fundamental design difference (smaller, simpler, many
           | cores) which works great for mobile and can be scaled up to
           | desktop/server rather than trying to scale down x86.
        
           | SuperscalarMeme wrote:
           | Performance is agnostic of ISA. Apple's custom designed cores
           | do indeed have a _massive_ performance/Watt advantage over
           | x86 based designs and happen to be using ARM. However, it's
           | not impossible for an x86 CPU to be designed in a similar
           | way. It does, however, get more difficult to do so due to
           | x86's variable length instruction encoding, which ARM does
           | not have.
        
             | colejohnson66 wrote:
             | x86's instruction decoder suffers from its inability to
             | parallelize some things. Because instructions have no fixed
             | boundary,[a] something has to process the bytes
             | _sequentially_. Even if they can be read from memory in
             | massive amounts, something still has to sit there going
             | byte by byte to find the boundaries.
             | 
             | The good news is, once those boundaries are found, uops can
             | be generated. But that ~5% or so of die space is always
             | running full tilt (provided there's no pipeline stalls).
             | 
             | I'm sure Intel and AMD have put a massive amount of work
             | into theirs to make it as quick as possible,[b] but it's
             | still ultimately a sequential operation.
             | 
             | With RISC-like architectures like ARM and RISC-V, you don't
             | need that boundary detector. Just feed the 2 or 4 bytes
             | straight into the decoders.
             | 
             | [a]: Unlike ARM and RISC-V which have fixed 2 or 4 byte
             | encodings (depending on processor mode), x86's instructions
             | can be anywhere from 1 through _15_ bytes.
             | 
             | [b]: Take the EVEX prefix for example. It is _always_ 4
             | bytes long with the first one being 0x62. So, once you see
             | that 0x62 byte after the optional "legacy prefixes", you
             | can skip 3 bytes and go to the opcode. But then you need to
             | decode that opcode to see if it has a ModR/M byte, decode
             | that (partially) to see if there's an SIB byte, decode that
             | to see if there's a displacement (of 1, 2, or 4 bytes),
             | etc. And then, don't forget about the immediate (which can
             | be 1, 2, 4, or (in one case of MOV) 8 bytes).
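
The boundary-finding point above can be sketched in Python with a toy encoding. This is an assumption-laden illustration, not real x86: here the low 4 bits of an instruction's first byte give its total length, whereas real x86 needs prefixes, opcode, ModR/M and so on to be partially decoded first. The function names are hypothetical.

```python
def variable_length_boundaries(code):
    """Find instruction start offsets in the toy variable-length encoding.

    Each offset depends on the length of the instruction before it,
    so the scan has to walk the byte stream in order.
    """
    offsets = []
    pos = 0
    while pos < len(code):
        length = code[pos] & 0x0F  # toy rule: length lives in the low nibble
        if length == 0:
            raise ValueError("invalid length at offset %d" % pos)
        offsets.append(pos)
        pos += length  # the next boundary is only known after this step
    return offsets


def fixed_width_boundaries(code, width=4):
    """With fixed-width instructions every offset is known up front."""
    return list(range(0, len(code), width))
```

In the fixed-width case each offset is computed independently of the others, which is why the bytes can be handed to several decoders at once.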
        
               | teucris wrote:
               | Something has been bugging me about x86's lack of
               | boundaries...could the boundaries be computed ahead-of-
               | time and passed to the processor?
        
               | colejohnson66 wrote:
               | Not that I'm aware of. The decoding of an instruction is
               | complicated and also dependent on the current operating
               | mode and a few other things. So, for an OS to pass those
               | lengths before hand, it'd have to know everything about
               | the current state of the processor at that instruction.
               | For example, in 16 and 32 bit modes, opcodes 0x40 through
               | 0x4F are single byte INC and DEC (one for each register).
               | In 64 bit mode, those are the single byte REX prefixes;
               | The actual opcode follows. See also: the halting problem.
               | 
               | As for why it became an issue, instruction sets need to
               | be designed from the beginning to be forward expandable.
               | Intel has historically _not_ done that with x86. Take AVX
               | for example. Originally, it was just 128 bit (XMM)
               | vectors encoded as an opcode with various prefix bytes
               | being used in ways they weren't intended. Later, 256 bit
               | vectors were needed. So they made the VEX prefix. But it
               | only had 1 bit for vector length. This allowed 128 bit
               | (XMM) and 256 bit (YMM) vectors, but nothing else. So
               | when AVX-512 came along, Intel had to ditch it and create
               | the EVEX prefix and allow both to be used. But EVEX only
               | has 2 bits for vector length. So, should something past
               | AVX-512 come out (AVX-768 or AVX-1024?), it'll probably
               | use the reserved bit pattern 11, and they'll be stuck
               | again if they want to go past that.
               | 
               | For an example of this being done right, ForwardCom[0]
               | (started by the great Agner Fog) took the "forward
               | compatibility" (hence the name) issue into mind and used
               | 2 bits to signal the instruction length. It'll probably
               | never reach silicon, but it and RISC-V (which is in
               | silicon form) are good examples of attempting to keep
               | things forward compatible.
               | 
               | [0]: https://forwardcom.info/
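
The mode dependence of opcodes 0x40 through 0x4F mentioned above can be sketched in Python. The function name and output strings are hypothetical, and the legacy branch assumes a 32-bit operand size (in 16-bit code the registers would be ax, cx, and so on).

```python
# Register order used by the single-byte INC/DEC encodings.
REGS_32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_40_4f(opcode, long_mode):
    """Interpret a byte in 0x40..0x4F according to the operating mode."""
    if not 0x40 <= opcode <= 0x4F:
        raise ValueError("opcode outside 0x40..0x4F")
    if long_mode:
        # 64-bit mode: the byte is a REX prefix laid out as 0100WRXB,
        # and the actual opcode follows it.
        w = (opcode >> 3) & 1
        r = (opcode >> 2) & 1
        x = (opcode >> 1) & 1
        b = opcode & 1
        return "REX prefix W=%d R=%d X=%d B=%d" % (w, r, x, b)
    # 16/32-bit modes: 0x40..0x47 are INC, 0x48..0x4F are DEC.
    mnemonic = "inc" if opcode < 0x48 else "dec"
    return "%s %s" % (mnemonic, REGS_32[opcode & 7])
```

The same byte yields a complete instruction in one mode and only a prefix in the other, so the length of the instruction cannot be determined without knowing the mode.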
        
             | dapids wrote:
             | Que? Look at VLIW ISAs for five minutes and tell me how
             | you've arrived at "agnostic".
        
             | mhh__ wrote:
             | Agnostic is a little strong, although it is true that M1 is
             | _extremely_ wide especially for a laptop chip, and wide in
             | ways beyond the decoder which could be applied to an X86
             | part.
             | 
             | Ultimately these discussions are quite hard because AMD
             | aren't on exactly the same density, and Intel are quite a
             | way behind at the moment.
        
               | [deleted]
        
             | api wrote:
             | "Performance is agnostic of ISA" is too strong a statement.
             | The variable length instruction encoding is a significant
             | performance disadvantage, as is the strict memory ordering
             | requirement of X86/X64.
             | 
             | X64 decoders are indeed only ~5% of the die on a modern
             | CPU, but it's 5% that is always at 100% utilization. That's
             | a non-trivial amount of extra power. X64 decode parallelism
             | is also limited. I've heard four instructions at once as a
             | magic number beyond which it becomes really hard. This is
             | why hyperthreading (SMT) is so common on X64 chips. It's a
             | "cheat" to keep the pipeline full by decoding two different
             | streams in parallel (allowing 8X parallelism). SMT isn't
             | free though. It drags in a lot of complexity at the
             | register file, pipeline, and scheduler levels, and is a bit
             | of a security minefield due to spectre-style attacks. All
             | that complexity adds more overhead and therefore more power
             | consumption as well as taking up die space that could be
             | used for more cores, wider cores, more cache, etc.
             | 
             | ARM is just a lot easier to optimize and crank up
             | performance than X86. The M1 apparently has 8X wide
             | instruction decode, and with fixed length instructions it
             | would be trivial to take it to 16X or 32X if there was
             | benefit to that. I could definitely imagine something like
             | a 16X wide ARM64 core at 3nm capable of achieving up to 16X
             | instruction level parallelism as well as supporting really
             | wide vector operations at really high throughput. Put like
             | 16 of those on a die and we're really far beyond X64
             | performance in every category.
             | 
             | This is also why SMT/hyperthreading doesn't really exist in
             | the ARM world. There's less to be gained from it. Better to
             | have a simpler core and more of them.
             | 
             | IMHO X86/X64 has hit a performance wall at least in terms
             | of power/performance, and this time it might be
             | insurmountable due to variable length instructions and
             | associated overhead. It matters in the data center as well
             | as for mobile and laptops. There's a reason AWS is pricing
             | to steer people toward Graviton: it costs less to run.
             | Power is the largest component of most data center costs.
        
               | bertr4nd wrote:
               | While it's absolutely true that fixed width instructions
               | make parallel decoding vastly easier, there's a cost in
               | terms of binary footprint size. x86 generally has an
               | advantage in instruction cache and TLB performance for
               | this reason, which can be significant depending on the
               | workload.
        
               | innocenat wrote:
               | Is this still really relevant? I can understand that it
               | could have been a problem 20 years ago, but with current
               | processors with huge L1 caches and memory bandwidth, I am
               | starting to think that 4 bytes (or variable 4/8 bytes) is
               | not a bad tradeoff for density vs. superscalar.
        
               | formerly_proven wrote:
               | L1 size in 1999: 32 kB
               | 
               | L1 size in 2021: 64 kB
        
               | my123 wrote:
               | Apple M1 big core cache sizes:
               | 
               | 256KB L1I/128KB L1D
               | 
               | Little cores: 128KB L1I/64KB L1D
        
               | cesarb wrote:
               | The L1 size is yet another place where the x86 legacy
               | hinders things. To avoid aliasing in a virtually indexed
               | L1 cache (which is what you want for performance in a L1
               | cache, since a physically indexed cache would have to
               | wait for the TLB lookup), the size of each way is limited
               | to the page size, which on x86 is 4096 bytes. To get a 64
               | KiB L1 cache, it would have to be a 16-way cache, and
               | increasing that too much makes the cache slower and more
               | power-hungry. It's no wonder Apple decided to use a 16
               | KiB page size instead of a 4 KiB page size; a 64 KiB VIPT
               | L1 cache with 16 KiB page size needs only 4 ways.
               | 
               | For the L1 _instruction_ cache, aliasing shouldn't be a
               | problem (since it's never written to), but this is once
               | again another place where the x86 legacy hinders things:
               | instead of requiring an explicit instruction to
               | invalidate a virtual address in the instruction cache,
               | it's implicitly invalidated when writing to that address.
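               | 
               | (Editorial sketch, not part of the original comment.) The
               | way-count arithmetic above can be checked in a few lines of
               | Python. It assumes the simplified rule stated in the comment,
               | that each way of a VIPT cache may be no larger than one page;
               | the function name min_ways is made up for illustration.

```python
KIB = 1024

def min_ways(cache_bytes, page_bytes):
    """Smallest associativity such that every way fits inside one page.

    Assumption: way size <= page size, so that the index bits come
    entirely from the page offset and no aliasing can occur.
    """
    # Ceiling division without importing math.
    return -(-cache_bytes // page_bytes)

# The two cases from the comment: 64 KiB L1 with 4 KiB and 16 KiB pages.
print(min_ways(64 * KIB, 4 * KIB))    # 16
print(min_ways(64 * KIB, 16 * KIB))   # 4
```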
        
               | pcwalton wrote:
               | Not true. This is a common myth that comes from some old
               | Linus posts in the 32-bit Pentium 4 days and still won't
               | die. I've done comparisons to test this. Compare sizes of
               | modern x86-64 Linux binaries to their counterparts on
               | AArch64. You'll find that they're extremely close.
               | 
               | The biggest problem is all the REX prefixes. The
               | inefficient encoding of registers in x86-64 squandered
               | all the advantages that x86 had.
        
               | a1369209993 wrote:
               | Is true. They said:
               | 
               | > > _x86_ generally has an advantage [emphasis added, not
               | "x86-64"]
               | 
               | Obviously if you take the worst of both worlds (bloated
               | _and_ variable-width instructions), you can squander that
               | advantage, but the advantage is in fact real.
        
             | Sparkyte wrote:
             | The ability to HT greatly increases the total performance
             | across all cores in an x86 chip. I digress there isn't one
             | better than other, just one more complex than the other.
             | CISC vs RISC and neither are truly CISC or RISC anymore in
             | terms of desktop processors. Each come at their own limits.
             | Apple's M1 custom is great because everyone stopped
             | innovating. It's like if Intel wasn't greedy and funded
             | development it would exceed the M1 in performance and
             | wattage, but 14nm+++++ anyone?
        
           | masklinn wrote:
           | > Or is it the specific Apple customizations tailored to
           | their use case?
           | 
           | Apple's use case is "run applications". It's not like there's
           | any magic or they have some sort of ultra specific workload
           | they improved by 10x while the rest sat there.
           | 
           | Apple's customisations are largely "throw hardware at the
           | problem", which I'm reasonably sure Intel would do if that
           | worked for x86. So sounds like something you can do with ARM,
           | which you can't with x86.
           | 
           | The more magical customisations _are_ workload specific, but
           | then they would only trigger for these workloads, both of
           | which are pretty much opt-in: running emulated x64 code on
           | ARM, and performing matrix computations (which AFAIK will
           | only be used through the Accelerate framework).
        
             | jdsully wrote:
             | Intel would do that if they could shrink their transistors.
             | But because they are _still_ at 14nm they are heavily
             | constrained. It's actually amazing they are competitive at
             | all given they are now 3 generations behind in
             | manufacturing.
        
             | dtech wrote:
             | > So sounds like something you can do with ARM, which you
             | can't with x86.
             | 
             | There's no reason why Intel couldn't, but they don't have
             | the incentive to hyper-optimize frequently used Apple
             | workloads like Final Cut Pro.
        
               | masklinn wrote:
               | > There's no reason why Intel couldn't
               | 
               | If Intel could they would, for years now they've been
               | spending billions to get fraction-of-a-percent improvements on
               | benchmarks. You really think if they could increase die
               | size by 10% and get 30% better perfs they'd say no? Come
               | on.
               | 
               | > they don't have the incentive to hyper-optimize
               | frequently used Apple workloads like Final Cut Pro.
               | 
               | Except M1's performance improvements show up across the
               | board including software which has no relation to Apple,
               | so this is just complete nonsense.
        
             | skohan wrote:
             | As far as I understood, _some_ of the reasons M1 is fast
             | are in fact specific to ARM. For Instance, the advantages
             | given by the width of the decode depend partly on the
             | uniformity of ARM instruction size, and M1 also benefits
             | from looser ordering of memory operations
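             | 
             | (Editorial sketch, not part of the original comment.) A toy
             | model of why uniform instruction size helps a wide decoder:
             | with fixed-width instructions every start offset is known up
             | front, while with variable-width instructions each start
             | depends on the lengths of all earlier ones. The function
             | names are hypothetical, and real decoders must first work
             | out each length from the instruction bytes, which this
             | sketch takes as given.

```python
from itertools import accumulate

def fixed_width_starts(code_len, width=4):
    """Start offsets for fixed-width instructions: simply i * width.

    Each offset is independent of the others, so a decoder could
    compute all of them at once.
    """
    return list(range(0, code_len, width))

def variable_width_starts(lengths):
    """Start offsets when instruction lengths vary.

    Offset k is the running sum of lengths[0..k-1], so it cannot be
    known before the earlier instructions have been sized.
    """
    if not lengths:
        return []
    return [0] + list(accumulate(lengths))[:-1]

print(fixed_width_starts(16))                 # [0, 4, 8, 12]
print(variable_width_starts([1, 3, 2, 5]))    # [0, 1, 4, 6]
```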
        
         | DamnYuppie wrote:
         | It seems we are finally going back to the ecosystem of the 90's
         | with multiple processors. This was the genesis of Java at the
         | time and the promise of _Write Once Run Anywhere_ was quite
         | appealing to many developers at the time.
         | 
         | Back then IBM Mainframes still had a strong foothold in large
         | corporate IT departments. Sun had a dominant position as well
         | for most newer companies. If you wanted multiple CPU's with
         | redundant fail over and gigs of RAM Sun was your huckleberry
         | back in the day.
        
           | skohan wrote:
           | It seems like the current iteration is that modern build
           | systems provide the "write once run anywhere" rather than
           | virtual machines, which have their own compatibility and
           | performance issues.
           | 
           | It's trivial nowadays to write a program in Go or Rust and
           | deploy it to whatever architecture you want, without any
           | arcane knowledge of the build process
        
             | ketzu wrote:
             | > It's trivial nowadays to write a program in Go or Rust
             | and deploy it to whatever architecture you want
             | 
             | According to rust docs [1] and go wikipedia page [2] both
             | mainly support x86, while go recently added
             | support for macos/arm and in 2019 windows/arm, rust only
             | has tier 1 ("guaranteed to work") support for arm-linux and
             | x86.
             | 
             | Am I misreading this? It does not seem "trivial" to me for
             | arbitrary platforms.
             | 
             | [1] https://doc.rust-lang.org/nightly/rustc/platform-
             | support.htm...
             | 
             | [2] https://en.wikipedia.org/wiki/Go_(programming_language)
             | #Vers...
        
               | steveklabnik wrote:
               | A significant barrier to getting platforms to Tier 1
               | support for Rust is actual hardware to run CI on. Tier 1
               | is an _extremely_ high bar for support.
               | 
               | I do my job at work every day on a Tier 2 ARM target, and
               | in practice, don't notice any difference from the Tier 1
               | targets. YMMV of course.
        
               | ketzu wrote:
               | Thank you, that helped me put it into perspective!
        
               | stu2b50 wrote:
               | I'm not sure about Go, but Rust should work on everything
               | LLVM can emit native code for. While ARM may not be
               | listed as "tier 1", Rust worked on M1's on launch day
               | because of LLVM portability.
        
         | Cthulhu_ wrote:
         | > A world with 2 architectures for mainstream use cases is the
         | future
         | 
         | And the past, I mean PowerPC was a thing for a long time on
         | both Apple desktop systems and servers.
         | 
         | Glad that the compiler toolchains make this transition a lot
         | easier, and Apple has been a major contributor to that. It
         | helped ease the transition from 32 bit ARM to 64 bit ARM, it
         | enabled easy cross-platform apps (Mac Catalyst) and now from
         | x86 to ARM for desktop apps.
         | 
         | And of course it's not particularly new; 25 years ago (ish)
         | Java came out that promised the same thing, one codebase that
         | runs on all architectures. Scripting languages, too.
        
         | andrewaylett wrote:
         | I very definitely read that tongue-in-cheek. Her project
         | targets everything, so long as it's using AMD64, therefore
         | anything _not_ AMD64 is useless, as it can't run her project.
        
         | api wrote:
         | Only 2 architectures? Hah! I remember when there was X86,
         | Sparc, MIPS, PowerPC, M68K, and Alpha, all in relatively common
         | use. There were a few Itanium, S390x, and other weird things
         | floating around too. (MIPS, PowerPC, and S390x are still
         | hanging around in niche applications today.)
         | 
         | Portability is not hard. If you write standard C/C++ that does
         | not depend on undefined behavior (like wild-ass pointer casts,
         | etc.) you will be fine 99% of the time. Use newer languages
         | like Go and Rust or higher-level languages and you won't even
         | notice.
         | 
         | The only hard areas where labor intensive porting is needed are
         | hand rolled ASM or the use of CPU-specific extensions like
         | vector code (e.g. __m128i and friends). That's a tiny fraction
         | of code written and is generally confined to things like
         | codecs, graphics engines, crypto, and math kernels.
        
           | madsbuch wrote:
           | The problem with C/C++ and "newer" languages is that programs
           | need to be individually compiled or have an interpreter
           | installed to execute them - the main problem the author is
           | solving.
           | 
           | The author precisely realises that there used to be multiple
           | architectures, just like you, but also notices that we have
           | converged on x86-64 - what she terms the lingua franca.
           | 
           | I also completely follow her sentiment, that we should not
           | switch ISA _unless_ there is a very real computation per
           | power unit benefit of doing so.
        
           | JadeNB wrote:
           | > Portability is not hard. If you write standard C/C++ that
           | does not depend on undefined behavior (like wild-ass pointer
           | casts, etc.) you will be fine 99% of the time.
           | 
           | This is not my area of expertise, so I'm not the one to write
           | the rebuttal, but it seems that "---- is not hard" is never
           | anything more than an invitation to someone who fully
           | understands ---- to explain why it _is_ hard.
           | 
           | Succinctly, serving 99% of the use cases with no effort
           | mainly seems to be a recipe for making sure that, when one
           | hits those 1% problems, one has no idea how to deal with
           | them. I suspect that portability is one of those things where
           | it's easy to do a mediocre job but hard to do a good/robust
           | job.
        
       | arendtio wrote:
       | Actually, I wonder why not every OS comes with a Posix shell and
       | a Python interpreter nowadays. Posix shells should be super
       | easy, because most systems have them onboard already anyways.
       | However, since Posix shells are kinda broken, I think Python
       | should be the next iteration.
       | 
       | Just to give some context, I am not a Python person, as I prefer
       | Go. But given the popularity and the suited use-cases I think it
       | is a good option.
        
         | colejohnson66 wrote:
         | The problem is bash et al are languages designed for the
         | command line. Every line is a separate command.
         | 
         | Contrast:
         | 
         |     cat test.txt | grep search
         | 
         | With:[0]
         | 
         |     import os
         |     import subprocess
         |     with open('test.txt', 'r') as f:
         |         for line in f:
         |             line = line.rstrip()
         |             subprocess.call(['/bin/grep', line, 'search'])
         | 
         | While the first may use some "magic" symbols such as the pipe,
         | it's really concise in conveying what it's doing.
         | 
         | I will give you this: bash variables and expansions can be
         | confusing. Contrast with programming where this probably works:
         | "start" + variable + "end"
         | 
         | [0]: https://stackoverflow.com/a/9018183/1350209
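         | 
         | (Editorial sketch, not part of the original comment.) For
         | comparison, the shell pipeline above can also be written in
         | Python without spawning grep at all. This is a minimal
         | sketch assuming a plain substring match (closer to grep -F
         | than to regex grep); the function name grep and the sample
         | lines are made up for illustration.

```python
def grep(pattern, lines):
    """Yield each line that contains pattern, without its trailing newline."""
    for line in lines:
        if pattern in line:
            yield line.rstrip("\n")

# Stand-in for the contents of test.txt; an open file object
# can be passed in the same way, since it iterates over lines.
sample = [
    "first line\n",
    "a search hit\n",
    "nothing here\n",
    "search again\n",
]

for match in grep("search", sample):
    print(match)
```

         | Even in this form the Python version is several lines where
         | the shell needs one, which is the point the comment makes.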
        
         | perlgeek wrote:
         | As somewhat of a Python developer these days, I have to point
         | out that each 3.something release of Python potentially breaks
         | backwards compatibility.
         | 
         | Second issue is that Python without any extra modules is still
         | pretty limited, nearly every serious python project comes with
         | some extra dependencies.
         | 
         | So having a Python interpreter available only gets you so far...
         | 
         | (For the record, the same could be said about pretty much all
         | dynamic languages I'm familiar with).
        
         | toast0 wrote:
         | Maybe a POSIX shell, as the POSIX shell standard is small and
         | for most purposes fixed. But you don't want to use the OS
         | python, as it is inevitably old and outdated.
        
       | gpvos wrote:
       | This got my upvote at "zip source file embedding could be a more
       | socially conscious way of wasting resources in order to gain
       | appeal with the non-classical software consumer".
        
       | danmg wrote:
       | Very interesting, but every one of these executables I try on my
       | fairly stock Ubuntu system returns 'run-detectors: unable to find
       | an interpreter'.
       | 
       | I'm invoking them with 'bash -c'.
        
         | jart wrote:
         | Author here. That error means you're using binfmt_misc. You can
         | fix that by saying:
         | 
         |     sudo sh -c "echo ':APE:M::MZqFpD::/bin/sh:' >/proc/sys/fs/binfmt_misc/register"
         | 
         | Then you're good to go!
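         | 
         | (Editorial sketch, not part of the original comment.) The
         | registration string in the command above follows the Linux
         | binfmt_misc layout :name:type:offset:magic:mask:interpreter:flags,
         | where the first character sets the separator. The helper
         | below only splits the string into those fields so the rule
         | is easier to read; parse_binfmt_rule is a hypothetical name
         | and nothing here touches /proc.

```python
FIELDS = ("name", "type", "offset", "magic", "mask", "interpreter", "flags")

def parse_binfmt_rule(rule):
    """Split a binfmt_misc registration string into named fields."""
    if not rule:
        raise ValueError("empty rule")
    sep = rule[0]                  # the first character is the separator
    parts = rule[1:].split(sep)
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    return dict(zip(FIELDS, parts))

rule = parse_binfmt_rule(":APE:M::MZqFpD::/bin/sh:")
for field in FIELDS:
    print("%-12s %r" % (field, rule[field]))
```

         | Read this way, the rule says: for files whose first bytes
         | are the magic string MZqFpD (type M, default offset), hand
         | the file to /bin/sh.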
        
           | danmg wrote:
           | Thanks!
        
         | user-the-name wrote:
         | https://news.ycombinator.com/item?id=26273058
        
       | whitten wrote:
       | A clever solution but still dependent on qemu.
        
         | invokestatic wrote:
         | I think it only uses qemu if you attempt to execute on non-x86
         | architectures. So it's not a build-time dependency.
        
           | jessermeyer wrote:
           | Right. I believe the long term vision is to JIT for other
           | architectures.
        
       | breatheoften wrote:
       | I think heterogeneous computing is actually coming this time.
       | Increasing binary size requirements wherever portability is
       | required will be an intended sacrifice toward that aim (but for
       | app store based distributions - the only place required to pay
       | the binary size tax will be in the size of the bundle provided to
       | the app store vendor).
       | 
       | I think the importance of ISAs will fade away generally in favor
       | of specifications that enable coordination of higher level memory
       | model semantics "across" compute resources -- the cpu/compute
       | core becomes the thing that allows you to share reads and
       | transfer write ownership of memory as efficiently as possible
       | between heterogeneous components that operate on the compute
       | graph -- and many of these compute components may require various
       | binary forms of task specific instruction encoding ...
        
       | StavrosK wrote:
       | > actually pdrtable executable
       | 
       | As a Greek, if you do this, I hate you. Why the hell do you have
       | to make me read "actmally pdrtable execmtable"? At least this is
       | one of the less offensive cases.
       | 
       | EDIT: Solidarity to our Cyrillic friends!
        
         | soheil wrote:
         | Because it makes you look like you know what you're doing, not
         | too different than obfuscating javascript for the sake of
         | security, which does kind of work, at least on the lowest
         | common denominator type of attacks, and this does kind of work
         | too by having people think you're more of a genius than
         | previously thought because you can turn boring English letters
         | into something exotic which appeals to the ignorance of the
         | masses [0].
         | 
         | [0] https://en.wikipedia.org/wiki/Argument_from_ignorance
        
           | blueline wrote:
           | you might be overthinking this
        
         | Tenal wrote:
         | Deep breaths. Deep, slow breaths. You're going to be just fine.
        
           | [deleted]
        
         | yosefk wrote:
         | As a Russian speaker, I LOVE these things. Both ways (Russian
         | letters abused to spell English words and vice versa.) In fact
         | I miss old phones with just English keyboards where abuse to
         | spell Russian words (eg CCCP) was an art form, for a brief
         | period.
        
         | notretarded wrote:
         | Female white privilege
        
         | jcburnham wrote:
         | Especially since if you want to port the title to Greek
         | lettering, you have upsilon and omicron for u and o:
         | 
         | actually portable executable
        
           | StavrosK wrote:
           | Omicron looks exactly the same as the English o (it's not
           | visually distinguishable in most typefaces) so it doesn't
           | matter much, but upsilon is an "ee" sound usually, not an
           | "oo" like in "actually" and "executable", so it wouldn't work
           | exactly. It would read "actially execeetable".
           | 
           | EDIT: For completeness, the full transliteration (or as close
           | to it) would be "axouali portampol exekioutampol". The extra
           | "o" in "portabol" and "execiutabol" is actually a schwa, I
           | think, so it can be omitted.
        
             | jcburnham wrote:
             | Upsilon is admittedly an "i" sound in Modern Greek, but in
             | Attic Greek (which is what I studied, sorry) it did have
             | the "oo" sound.
             | 
             | Edited: I missed the pi and rho completely though, my bad
        
               | StavrosK wrote:
               | Ah yes, you are correct!
        
         | jart wrote:
         | Author here. I wanted to honor Greece for the amazing cultural
         | impact they've had, similar to how mathematics honors Greece.
         | We got a lot of comments like this in the last thread. What
         | dang said about it was really smart:
         | https://news.ycombinator.com/item?id=24264514
        
           | StavrosK wrote:
           | Ah, I don't want to make a fuss about it (my comment was
           | tongue-in-cheek), it's really not a big deal, but it is
           | annoying to spend 2-3 seconds trying to figure out if you're
           | having a stroke, and then some more trying to suss out what
           | the sentence is actually trying to say.
           | 
           | If you want to honor Greece, use the letters as they're meant
           | to be used! "Actually portable executable" would be much
           | better (though I've intentionally tried to give English
           | readers a stroke with this one :)!
        
             | user-the-name wrote:
             | The entire project is built on not using things the way
             | they are meant to be used, though. The name is kind of
             | doing the exact same thing the code is.
        
               | StavrosK wrote:
               | Though, oddly enough, the English letters are used
               | exactly how they're meant to be used :P
        
           | zem wrote:
           | I have to admit, every previous time I saw this linked I
           | didn't bother clicking through, because from the title I
           | thought it was a post mocking the concept of portable
           | executables.
        
             | TeMPOraL wrote:
             | I'm guessing this is the unfortunate consequence of the
             | pattern "actually, " becoming a pejorative meme in the past
             | year or so.
        
               | zem wrote:
               | no, it's that the greek letters reminded me of the
               | twitter "Im MoCkInG SoMeThInG sTuPid" format
        
           | soheil wrote:
           | Aren't the Greek symbols used in math void of implicit
           | meaning? You're taking a meaningful English sentence and
           | replacing its letters with Greek letters while making it
           | extremely difficult for people with disabilities who use screen
           | readers; those two things are not the same.
        
           | defgeneric wrote:
           | It's actually not smart at all. Replacing the letters in the
           | Roman alphabet with Greek letters based on superficial
           | resemblance is not any different from replacing the "R" with
           | "Ia" when writing about anything Russian-related (you see it
           | stupidly used in book covers, t-shirts, etc).
           | 
           | How does this do anything to honor the cultural legacy of
           | Greece? Perhaps we could honor the legacy of 19th century
           | mathematics by using Fraktur characters when they resemble
           | Latin ones?
           | 
           | When people who can read Greek are telling you it's bad taste
           | maybe take _their_ word for it! Not dang.
           | 
           | What you're really saying is that the Greek alphabet (and by
           | extension its language community) is so insignificant
           | compared to Latin that the cost of potential misrecognition
           | is so low that it can be disregarded. This is chauvinism, not
           | "honoring Greek mathematics"!
        
             | ulzeraj wrote:
             | Word. I'm still trying to find out who Doidld Tyatsmr is
             | and why is he so hated in the US.
        
           | JxLS-cpgbe0 wrote:
           | What dang said about it was not smart.
           | 
           | > it's good for readers to have to work a little
           | 
           | Unless they're using assistive technologies. In that case
           | it's a nightmare. Don't make your users _work_.
           | 
           | > it's not hard for any HN reader to do the bit of work to
           | figure it out
           | 
           | Unless they're using assistive technologies. Or just want to
           | read it without _work_.
           | 
           | Or, say searching for it. This post comes up. The one you
           | linked to doesn't.
           | 
           | https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu.
           | ..
           | 
           | Respect to you for wanting to honor Greece. I think using the
           | letters* correctly would honor them more. _(thanks for the
           | correction)_
        
             | StavrosK wrote:
             | > I think using Cyrillic correctly would honor them more.
             | 
             | (Greece doesn't use Cyrillic but I agree with you
             | otherwise)
        
             | ClawsOnPaws wrote:
             | Yes. My screen reader, at least Voiceover on my phone, had
             | a stroke reading that. I had to navigate letter by letter
             | and guess what it meant. But it's also quite common so I'm
             | used to doing that regardless.
        
           | ActorNightly wrote:
           | I know this is a loaded question, but are there any resources
           | you can point to in learning the linux syscall stuff, or
           | perhaps writing a C compiler from scratch? I thought I had a
           | fairly good grasp of this stuff but after looking through
           | cosmopolitan code, I realized I'm not even close.
        
             | jart wrote:
             | Rui is writing a book for the chibicc compiler in the cosmo
             | codebase. I should probably write a book on system
             | interfaces since there's no school for it. I had to go
             | straight to the primary materials, i.e. the source to
             | pretty much every existing kernel and libc along with the
             | historical ones in order to understand the origin of
             | influence. That's what helped me have a razor sharp focus
             | on the commonalities which made this project possible.
             | 
             | So I'd say that the SVR4 source code would be a good place
             | for you to start. It's like ambrosia and once you've read
             | it you can always tell by reading modern code which
             | developers have and haven't seen it. There's also the
             | Lions' Commentary on Unix. I highly recommend Richard W.
             | Stevens. The last book on the required reading list is
             | BOFH.
        
           | climech wrote:
           | I appreciate the good intentions, but confusing Greek readers
           | doesn't seem to me like a good way to honor the cultural
           | impact of Greece.
        
           | simonebrunozzi wrote:
           | > The quality of this post is so high that it doesn't feel
           | right to override any aspect of what the author created,
           | including quirks like the title.
           | 
           | I agree with dang's feelings/thoughts about the issue.
           | 
           | Perhaps a solution would be to add the "normal" meaning
           | in parentheses, after the one in the Greek alphabet?
        
           | simonebrunozzi wrote:
           | By the way, Justine: great work. Besides the obvious HN
           | recognition, I wanted to tell you explicitly as well.
           | 
           | What are you going to work on in the near future? Curious to
           | hear about it. If you don't want to post in public,
           | $my_hn_username at gmail
        
       | [deleted]
        
       ___________________________________________________________________
       (page generated 2021-02-26 23:02 UTC)