If you put me in charge of objc_msgSend, I think I’d have the compiler scatter a couple of polymorphic inline caches into the binary at each call site (using indirect jumps to satisfy W^X). I think it’d reduce insn count from 11 to ~6, and it would use the BTB better.
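For readers unfamiliar with the idea, here is a minimal sketch of a per-call-site polymorphic inline cache, simulated in Python rather than in generated machine code. The names `CallSite`, `send`, and `max_entries` are hypothetical and chosen for illustration; nothing here reflects how objc_msgSend or any real compiler implements it.

```python
class CallSite:
    """One call site with a small polymorphic inline cache (illustrative only)."""

    def __init__(self, selector, max_entries=2):
        self.selector = selector
        self.max_entries = max_entries   # assumed cache size, not a measured value
        self.entries = []                # (receiver class, function) pairs
        self.hits = 0
        self.misses = 0

    def _lookup(self, cls):
        # Slow path: walk the class hierarchy, standing in for a full method lookup.
        for klass in cls.__mro__:
            if self.selector in vars(klass):
                return vars(klass)[self.selector]
        raise AttributeError(self.selector)

    def send(self, receiver, *args):
        cls = type(receiver)
        # Fast path: compare the receiver's class against the cached classes.
        for cached_cls, func in self.entries:
            if cached_cls is cls:
                self.hits += 1
                return func(receiver, *args)
        # Miss: do the full lookup and cache the result if there is room.
        self.misses += 1
        func = self._lookup(cls)
        if len(self.entries) < self.max_entries:
            self.entries.append((cls, func))
        return func(receiver, *args)


class Circle:
    def area(self):
        return 3


class Square:
    def area(self):
        return 4


site = CallSite("area")
results = [site.send(shape) for shape in (Circle(), Square(), Circle(), Circle())]
```

The first call for each class takes the slow path; the later `Circle` calls are served from the cache at the call site.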
-
-
I mean, it seems obvious that you will sometimes benefit from ICs, so the question is only what threshold you want to kick in the optimization at (via PGO), not whether you should do it at all.
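One way to picture the threshold question is a filter over profile data. The sketch below is an assumption-laden illustration: the profile format, the function name `sites_to_cache`, and the default values for `min_calls` and `min_coverage` are all hypothetical, not taken from any real PGO toolchain.

```python
from collections import Counter


def sites_to_cache(profile, min_calls=1000, min_coverage=0.9, max_entries=2):
    """Pick the call sites where an inline cache looks worthwhile.

    profile maps a call-site name to {receiver class name: observed call count}.
    All thresholds are placeholder values for illustration.
    """
    chosen = {}
    for site, counts in profile.items():
        total = sum(counts.values())
        if total < min_calls:
            continue  # too cold to be worth the extra code size
        top = Counter(counts).most_common(max_entries)
        covered = sum(n for _, n in top)
        if covered / total >= min_coverage:
            # The few most common classes cover nearly all calls at this site.
            chosen[site] = [cls for cls, _ in top]
    return chosen


example_profile = {
    "hot_mostly_one_class": {"X": 950, "Y": 50},
    "cold": {"X": 10},
    "hot_but_spread_out": {"X": 400, "Y": 300, "Z": 300},
}
selected = sites_to_cache(example_profile)
```

Here only the hot site dominated by a couple of classes is selected; the cold site and the one spread across many classes are left on the normal dispatch path.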
-
Profitable PGO thresholds are one problem. Another is that PGO and the associated performance measurement and feedback apparatus are not yet solved problems, as far as I know.
-
-
In addition to all of Greg’s points, remember that a call cache in JS gets to skip more work; ObjC assumes you’re calling a method with the right signature.
-
And the caching can do more than just devirtualization, e.g. you can devirtualize to a variant which propagates argument type information.
-
-
Inline caches in a runtime-compiled environment like typical JavaScript benefit from the fact that they have an interpreter handling the infrequently-executed code. It is not the case that all JavaScript call sites get an inline cache.
-
Also, the high-performance designs that work well for your web browser might not scale when applied to every process in the OS. I haven’t run any of those numbers recently, though; perhaps the JS folks would know better.