Post AwS9BuYH7VjJLdcfZ2 by amonakov@mastodon.gamedev.place
(DIR) Post #AwS1wiFXsSlVAefSoy by harold@mastodon.gamedev.place
2025-07-24T01:46:04Z
0 likes, 0 repeats
Memory renaming has made the so-called "fastcall" calling convention a bit of a joke. As another joke, I wrote some "what if stdcall, but 64-bit" code, and it was slightly faster...
(DIR) Post #AwS1wjiePjdnjDIDpI by wolf480pl@mstdn.io
2025-07-24T07:18:16Z
0 likes, 0 repeats
@harold what's memory renaming?
(DIR) Post #AwS7mJCsjOc0KYqOhs by amonakov@mastodon.gamedev.place
2025-07-24T08:23:39Z
0 likes, 0 repeats
@wolf480pl @harold it's a pipeline optimization for forwarding stored values to loads from the same address. Normally this is done late in the pipeline, picking the value from the store buffer (never wrong). With memory renaming, you track the association between the address and value registers seen in stores, and then if you see a load with a symbolically matching address, you take the associated value register as its result (can be wrong if the address changed or there was another store to the same address)
(DIR) Post #AwS8Kvr9XVwO2uBrPs by amonakov@mastodon.gamedev.place
2025-07-24T08:24:56Z
0 likes, 0 repeats
@wolf480pl @harold this makes such forwarded stores zero latency instead of 3+ cycles of conventional forwarding latency. On x86, this was first done by AMD in Zen 2, disappeared again in Zen 3. Not sure about the rest.
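A toy sketch of the mechanism described above, in Python. Everything here is an illustrative assumption (the `ToyCore` class, the register names, the addresses); it models only the symbolic-match idea and the aliasing failure case, not how any real core implements it. The rename table stores the value directly as a stand-in for "the register that held the stored value".

```python
class ToyCore:
    """Hypothetical toy model of memory renaming; not a real CPU design."""

    def __init__(self, regs):
        self.regs = dict(regs)  # register name -> current value
        self.memory = {}        # concrete address -> value
        self.rename = {}        # (base register, displacement) -> tracked value

    def store(self, base, disp, src):
        value = self.regs[src]
        # The architectural effect: write to the concrete address.
        self.memory[self.regs[base] + disp] = value
        # The renaming side: remember the value under the *symbolic* address.
        self.rename[(base, disp)] = value

    def load(self, base, disp):
        # Slow but always-correct path: read the concrete address.
        actual = self.memory[self.regs[base] + disp]
        # Fast path: symbolic match on (base, disp); None if nothing is tracked.
        predicted = self.rename.get((base, disp))
        return predicted, actual


core = ToyCore({"rsp": 1000, "rbx": 1008, "rax": 7, "rcx": 9})

core.store("rsp", 8, "rax")   # store rax to [rsp+8]
first = core.load("rsp", 8)   # symbolic match, prediction is right

core.store("rbx", 0, "rcx")   # rbx+0 aliases rsp+8, but looks different symbolically
second = core.load("rsp", 8)  # stale prediction: another store hit the same address
```

In this sketch `first` has matching predicted and actual values, while `second` shows the "can be wrong" case: the table still holds the old value for `(rsp, 8)` because the aliasing store went through a different symbolic address, so a real design would have to detect the mismatch and recover.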
(DIR) Post #AwS8KwkoCaSapWEJAO by wolf480pl@mstdn.io
2025-07-24T08:29:56Z
0 likes, 0 repeats
@amonakov @harold hmm okay but on x86-64 the most common calling convention is to pass arguments through registers - isn't that effectively fastcall? And if so, is it slower than passing through stack when memory renaming is present?
(DIR) Post #AwS9BuYH7VjJLdcfZ2 by amonakov@mastodon.gamedev.place
2025-07-24T08:39:31Z
1 likes, 0 repeats
@wolf480pl passing by itself shouldn't be slower, this is what makes the experiment in the initial post surprising (at least to me)
but then it depends on what you do with the arguments: if they are reused a few times and register pressure is high, having them on stack in the first place could be preferable to spilling
@harold is that code something you could share?