I mean, that's one way to implement a polymorphic inline cache…
Quote Tweet
Looks like Apple’s A15 (?) chip will ship with the objc_msgSend branch predictor: opensource.apple.com/source/objc4/o
Code snippet from runtime/objc-bp-assist.h in objc4-824. Mildly truncated, but you get the gist:

// Compute the control register value. For more detail on what
// the various pieces mean, see Apple ISA Extensions to ARMv8:
// version H14, chapter 23, "Software assisted Branch Prediction:
// objc_msgSend".
uint64_t enable = 0x8000000000000000ULL;
uint64_t mask = 0x0000000000000000ULL;
uint64_t classPtrRegister = 16; // Class pointer is loaded into x16
uint64_t selectorRegister = 1; // Selector is in x1.

uint64_t controlRegisterValue = enable;
controlRegisterValue |= mask;
controlRegisterValue |= classPtrRegister << ARM64_REG_BP_OBJC_CTL_EL1_AR_ClassPtr_shift;
controlRegisterValue |= selectorRegister << ARM64_REG_BP_OBJC_CTL_EL1_AR_Selector_shift;
controlRegisterValue |= indirectBranchOffset << ARM64_REG_BP_OBJC_CTL_EL1_Br_Offset_shift;
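For context on the opening quip: a polymorphic inline cache in software looks roughly like this. This is a minimal C sketch of the general technique, not Apple's or HotSpot's actual implementation; `Class`, `SEL`, and `IMP` here are simplified stand-ins for the real Objective-C runtime types, and `pic_entry`/`pic_dispatch` are hypothetical names.

```c
#include <stddef.h>

typedef struct objc_class *Class;   /* stand-ins for the ObjC runtime types */
typedef const char *SEL;
typedef long (*IMP)(void *self, SEL sel);

struct objc_class {
    IMP (*lookup)(Class cls, SEL sel);  /* the slow, full method lookup */
};

/* One cache entry per receiver class seen at this call site. */
struct pic_entry { Class cls; IMP imp; };

#define PIC_SIZE 4

/* Per-call-site cache: check the few classes seen so far before
   falling back to the slow, full lookup; install on a miss. */
static IMP pic_dispatch(struct pic_entry cache[PIC_SIZE],
                        Class cls, SEL sel) {
    for (size_t i = 0; i < PIC_SIZE && cache[i].cls; i++)
        if (cache[i].cls == cls)
            return cache[i].imp;          /* fast path: cache hit */

    IMP imp = cls->lookup(cls, sel);      /* slow path: full lookup */
    for (size_t i = 0; i < PIC_SIZE; i++)
        if (!cache[i].cls) {              /* install in first free slot */
            cache[i].cls = cls;
            cache[i].imp = imp;
            break;
        }
    return imp;
}
```

The joke, of course, is that the control-register snippet above moves this class-pointer/selector comparison out of software and into the branch predictor.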
A decade ago, Java 7 added invokedynamic, a programmable call-site mechanism that lets HotSpot’s JIT provide PICs, method inlining, and guard-based cache invalidation at call-site granularity. In 2021, Apple is putting it in hardware. I’ve heard this one before, and it’s called Jazelle DBX.
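The guard-based invalidation mentioned here can be sketched mechanically. This is a hypothetical C sketch of the idea, not HotSpot's API: each call site caches a resolved target along with the "world version" it was resolved under, and redefinition bumps a global epoch that invalidates every cached site. All names (`world_epoch`, `call_site`, `slow_resolve`) are invented for illustration.

```c
#include <stdint.h>
#include <stddef.h>

typedef long (*target_fn)(long);

/* Global "world version": bumped whenever a method is redefined,
   which invalidates every speculatively cached call site. */
static uint64_t world_epoch = 1;

struct call_site {
    uint64_t  epoch;   /* epoch the cached target was valid in */
    target_fn target;  /* cached (devirtualized) target, or NULL */
};

static long slow_resolve_calls;
static long double_it(long x) { return 2 * x; }

/* Stand-in for the runtime's full, slow method resolution. */
static target_fn slow_resolve(void) {
    slow_resolve_calls++;
    return double_it;
}

static long call(struct call_site *site, long arg) {
    if (site->target == NULL || site->epoch != world_epoch) {
        site->target = slow_resolve();   /* guard failed: re-resolve */
        site->epoch  = world_epoch;
    }
    return site->target(arg);            /* fast path: guarded direct call */
}

/* A redefinition (e.g. new class loaded) invalidates all cached sites. */
static void redefine_method(void) { world_epoch++; }
```

A JIT does the same thing with compiled code instead of a function pointer: the guard is a cheap check compiled into the call site, and failing it deoptimizes back to the slow path.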
I guess this is the difference between HotSpot’s aggressively JIT-oriented strategy (where all dispatch is virtual until inlined) and the largely static “choose your own dispatch” strategy used by AOT “JITs” like ObjC and C#.
I still think that the Android JIT approach is better than the iOS AOT approach for performance, security concerns notwithstanding, because speculative devirtualization enables so many powerful optimizations
Android doesn't really do anything fancy based on JIT compilation. It has fairly sophisticated devirtualization but it works with AOT compilation. ART JIT is essentially just an in-memory variant of the AOT compiler for a limited amount of hot code that hasn't been AOT compiled.
Interpreter / JIT generate profiles which are saved and used to decide what they should AOT compile in the background. Google has started distributing pre-generated profiles so AOT compilation can start working right away. Cold code, unused code, etc. doesn't end up AOT compiled.
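The profile-then-compile loop described above can be sketched as follows. This is a toy C sketch of the idea only; ART's real profile format, thresholds, and compile pipeline are far more sophisticated, and `HOT_THRESHOLD` and all names here are assumptions for illustration.

```c
#include <stddef.h>

#define HOT_THRESHOLD 100   /* assumed threshold, not ART's real value */

struct method_profile {
    const char *name;
    unsigned    invocations;   /* bumped by the interpreter/JIT */
    int         aot_compiled;  /* set by the background compile pass */
};

/* Interpreter/JIT hook: record one execution of a method.
   These counts are what gets saved to the on-disk profile. */
static void record_invocation(struct method_profile *m) {
    m->invocations++;
}

/* Background pass: AOT-compile only the methods the saved profile
   shows to be hot; cold/unused code is left uncompiled. Returns the
   number of methods compiled. */
static size_t compile_hot_methods(struct method_profile *profiles, size_t n) {
    size_t compiled = 0;
    for (size_t i = 0; i < n; i++)
        if (profiles[i].invocations >= HOT_THRESHOLD) {
            profiles[i].aot_compiled = 1;
            compiled++;
        }
    return compiled;
}
```

The pre-generated profiles Google distributes amount to shipping the `invocations` data up front, so the background pass has something to act on before the user has ever run the app.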
There's not really much distinction between ART AOT and JIT compilation. The JIT can generate non-position-independent code as a tiny micro-optimization. It prefers AOT-compiled code paged from storage once that's available. The main purpose of the ART JIT is avoiding wasting storage on cold/unused AOT-compiled code.
Once you've heavily used an app for a few days, you end up with the same result as AOT-compiling the whole thing from the beginning, but without a pile of compiled code for the large portion of the app that's never actually used on your device with your usage patterns.
They definitely have a very smart compiler system, GC and overall runtime but the JIT is deliberately very stupid and essentially exists to obsolete itself over time with AOT compiled code. They decided that doing sophisticated JIT was a waste of memory, cache, battery life, etc.
It works very differently on a device with 512 MB of memory compared to a Pixel. A low-memory device relies a lot on things like GC scheduled in sync with rendering frames, etc. On a Pixel 5 with 8 GB of memory, the foreground app doesn't need to care much about wasting memory.