Post AWS1mFFDXfxtuwkSCu by djspiewak@fosstodon.org
(DIR) Post #AWS1mDa3jDRql0U4Po by SethTisue@fosstodon.org
2023-06-07T01:00:00Z
0 likes, 0 repeats
“Warning: Do not ever use JMH on Apple's M-series hardware”, says @djspiewak (with characteristic comic overstatement, of course) — they're weird processors that have very different performance characteristics than the CPUs you'll likely deploy on. #ScalaDays
(DIR) Post #AWS1mEVqGNfXeDWDTs by leviramsey@social.vivaldi.net
2023-06-07T13:37:39Z
0 likes, 0 repeats
@SethTisue @djspiewak Likely even if the processors you end up deploying on are ARM.
(DIR) Post #AWS1mFFDXfxtuwkSCu by djspiewak@fosstodon.org
2023-06-07T13:58:39Z
0 likes, 0 repeats
@leviramsey @SethTisue Indeed. The problem isn’t ARM (usually). The problem is the SoC.
(DIR) Post #AWS1mFdg4jWz8oBzKi by SethTisue@fosstodon.org
2023-06-07T01:04:33Z
0 likes, 0 repeats
P.S. on Planet Spiewak, slow things are “even even even even even more worse” and fast things are “l-u-u-u-u-u-udicrously fast”
(DIR) Post #AWS1mFwSwsYm54yzcO by alexelcu@social.alexn.org
2023-06-07T14:41:44Z
0 likes, 0 repeats
@djspiewak I was aware that M1/M2 has specific optimizations that aren't representative of other ARM platforms, but not that it makes such a difference in benchmarks. Are there any resources we can read? @leviramsey @SethTisue
(DIR) Post #AWS9MCDYyKKydP43MG by djspiewak@fosstodon.org
2023-06-07T16:06:42Z
0 likes, 0 repeats
@alexelcu @leviramsey @SethTisue It hasn't been talked about too much publicly to my knowledge, but it really works out to the same factors which make the M-series chips so incredibly fast in practice. The memory bandwidth and physical proximity of main memory to the compute units is the most relevant difference (for non-graphical workloads).
(DIR) Post #AWSHu8iRIXUnFiCe2a by alexelcu@social.alexn.org
2023-06-07T17:42:30Z
0 likes, 0 repeats
@djspiewak Ah, so memory access is faster. Therefore, there's less pressure to optimize memory access patterns. @leviramsey @SethTisue
(DIR) Post #AWSI5fkGpqnisRYhf6 by djspiewak@fosstodon.org
2023-06-07T17:44:34Z
0 likes, 0 repeats
@alexelcu @leviramsey @SethTisue Exactly. In fact, memory access is so fast that it's almost like having a really, really large L3 cache. Since cache management is such a massive bottleneck in modern processors, this ends up significantly distorting performance.
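
(Editor's illustration, not part of the thread.) One rough way to see the effect described above is to time the same pointer-chasing loop over a sequential chain and over a shuffled one. The shuffled chain defeats hardware prefetching, so each hop is more likely to pay main-memory latency. The sketch below is a minimal Python version under stated assumptions: the names `build_chain`, `chase`, and `time_chase` are hypothetical, the sizes are arbitrary, and interpreter overhead will dampen the gap compared with a JVM benchmark. The point is only the shape of the comparison: running it on an M-series laptop and on the deployment hardware may give noticeably different sequential-to-shuffled ratios.

```python
import random
import time
from array import array


def build_chain(n, shuffled, seed=0):
    """Return a next-index table that forms one cycle over n slots."""
    order = list(range(n))
    if shuffled:
        # Assumption: a fixed seed keeps runs comparable across machines.
        random.Random(seed).shuffle(order)
    nxt = array("q", [0]) * n
    for i in range(n):
        # Each visited slot points at the slot visited next.
        nxt[order[i]] = order[(i + 1) % n]
    return nxt


def chase(nxt, steps):
    """Follow the chain for `steps` hops and return the final index."""
    i = 0
    for _ in range(steps):
        i = nxt[i]
    return i


def time_chase(n, shuffled, steps):
    """Return the average time per hop in nanoseconds."""
    nxt = build_chain(n, shuffled)
    start = time.perf_counter()
    chase(nxt, steps)
    return (time.perf_counter() - start) / steps * 1e9


if __name__ == "__main__":
    n = 1 << 22       # 4M 8-byte entries, about 32 MiB: larger than most caches
    steps = 1 << 21
    seq = time_chase(n, shuffled=False, steps=steps)
    rnd = time_chase(n, shuffled=True, steps=steps)
    print(f"sequential: {seq:.1f} ns/hop")
    print(f"shuffled:   {rnd:.1f} ns/hop")
    print(f"ratio:      {rnd / seq:.2f}x")
```

The ratio, not the absolute numbers, is the figure worth comparing between machines; a ratio close to 1 on the development machine says little about how access-pattern-sensitive the code will be elsewhere.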