Yes, it's obvious that more compiler work is needed. This is just an initial implementation.
Replying to @whitequark @bascule
If you and @pcwalton are really going to implement it for each arch, that's awesome, but it isn't *needed* since the old way can work fine.
This and "the old way" are exactly equivalent. __morestack had to be implemented for every arch separately as well.
The point being that doing stack probes on archs like arm64 that way is less, not more, efficient. We're talking about different things here.
Which way? You don't have to rely on MMU/MPU in the implementation of __rust_stackprobe, you can compare against a TLS slot.
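A minimal sketch of what "compare against a TLS slot" could look like, written in ordinary Rust rather than as the real `__rust_stackprobe` (which this does not reproduce). `STACK_LIMIT`, `set_stack_limit`, and `frame_fits` are hypothetical names for illustration, and the stack pointer is passed in as a plain integer instead of being read from a register:

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical per-thread slot holding the lowest usable stack address.
    // Assumption: a runtime would populate this when the thread starts.
    static STACK_LIMIT: Cell<usize> = Cell::new(0);
}

/// Record the stack limit for the current thread (illustrative helper).
fn set_stack_limit(limit: usize) {
    STACK_LIMIT.with(|slot| slot.set(limit));
}

/// Return true if a frame of `frame_size` bytes allocated below `sp`
/// stays at or above the current thread's recorded limit.
/// The stack is assumed to grow downward.
fn frame_fits(sp: usize, frame_size: usize) -> bool {
    let limit = STACK_LIMIT.with(|slot| slot.get());
    match sp.checked_sub(frame_size) {
        Some(new_sp) => new_sp >= limit,
        // Subtraction underflowed: the frame cannot possibly fit.
        None => false,
    }
}
```

The check is one load and one compare, with no memory touched in the new frame, so it does not depend on a guard page or on the MMU/MPU.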
You're saying that this may not necessarily work on ARM64, but what is the exact case when it breaks?
I'm not saying that it doesn't work on ARM64. I'm only stating that naive probes are less efficient than doing the old kind of comparison.
Sure. Then we shouldn't do naive probes on ARM64. Is that all? I don't see a problem here.
The hardwired TLS slot was such a pain (deciding on number, populating, avoiding conflicts). Stack probes are more maintainable IMHO.
It's less of a pain now that changing the TLS slot doesn't need an LLVM patch, so I think it could be manageable.
Not sure without data that there's any meaningful perf difference anyway; at worst you move cache misses to the fn prolog instead of the body
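To make the "naive probe" cost concrete, here is a rough sketch that only computes which offsets such a probe would touch. It assumes a naive probe means one touch per page spanned by the new frame. `probe_offsets` is a hypothetical helper, not any compiler's actual routine, and the page size is a parameter rather than a known platform value:

```rust
/// Offsets, in bytes below the current stack pointer, that a naive
/// page-by-page probe would touch for a frame of `frame_size` bytes.
/// Assumption: one touch per full page, plus one at the end of the frame.
fn probe_offsets(frame_size: usize, page_size: usize) -> Vec<usize> {
    assert!(page_size > 0, "page size must be non-zero");
    let mut offsets = Vec::new();
    let mut offset = page_size;
    // Touch each page boundary that lies strictly inside the frame.
    while offset < frame_size {
        offsets.push(offset);
        offset += page_size;
    }
    // Touch the far end of the frame itself.
    if frame_size > 0 {
        offsets.push(frame_size);
    }
    offsets
}
```

Under these assumptions a 10000-byte frame with 4096-byte pages is touched three times in the prologue. Those are memory accesses the function body would likely have made later anyway, which is the point about moving cache misses rather than adding them.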