Conversation

There's partial documentation on this here: source.android.com/devices/tech/d. For example, I use the near-full AOT compilation mode ('speed') without JIT or profiling: github.com/GrapheneOS/pla. Full AOT compilation mode (disabling the heuristics that leave cold code interpreted) is 'everything'.
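For reference, the compiler filter can also be forced per-app from the shell on a stock device with adb access (the package name below is a placeholder):

```shell
# Recompile one app with the near-full AOT filter ('speed');
# -f forces recompilation even if compiled code already exists.
adb shell cmd package compile -m speed -f com.example.app
```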
They've also done some interesting optimizations, most of which I need to disable for my work. You may know that it spawns apps by forking from a Zygote acting as a template with the common classes / libraries already loaded and initialized, a preloaded OpenGL context and so on.
They also have some weird optimizations like shared RELRO sections and pre-generating the heaps for libraries, not just the code. I think the way it works is they load them up in a deterministic environment and then write out the heap in a way that's quick to verify on boot.
That sounds a lot like what Darwin does with the shared cache, to prelink the system dylibs into one image, pre-bind ObjC and Swift runtime data structures, etc., though we're starting from already-native code.
The shared RELRO thing is for native libraries / executables, specifically Chromium, since the library gets mapped in all of the Chromium renderer sandbox processes which are also spawned for every app using the WebView. There's actually a separate WebView sandbox Zygote now.
I have to drop most of this in order to have proper per-app ASLR and other probabilistic mitigations rather than shared bases and other secrets. It makes a big difference for app spawning time and also memory usage, since a lot of the initial heap in the Zygote remains the same.
You can see some of how they take advantage of the Zygote here: android.googlesource.com/platform/frame. This runs a single time on boot for each supported architecture. The image classes section in source.android.com/devices/tech/d explains that bit. For their use it only really optimizes Zygote start-up.
In my past work, which I still need to port forward, I just disabled all of this and used on-demand loading. It definitely makes a noticeable difference though: there's a delay of 200-300ms or so, like launching a desktop app, instead of everything launching within a couple of frames.
The JIT compiler is also very different from something like the standard Java runtime's JIT, for reasons tied to that too. The GC is also quite different, with priorities much more like the Go GC's. I'm sure people using Java for desktop apps would love to have a desktop version of ART.
It's obviously a nice ideal to share that enormous amount of work between languages, but it really breaks down with JIT. I think WebKit's move to LLVM and then away from it is very telling. LLVM just takes far too long and uses too many resources (hurting battery life).
Thank you for laying this out. Funny that ART is an open-source system used by so many people, yet so few (myself included) know much about how it works. This makes me curious about the details of the GC.