Setting the PKRU register for MPK has more overhead than I'd expected, at least within an EC2 virtual machine. I can try a fully inlined pkey_set, but I don't think it will make much difference. It's still worth having this as an option for anyone willing to sacrifice more performance.
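For context, a fully inlined pkey_set would look roughly like the sketch below. It is not the code from this change, only an illustration that assumes x86 with GCC/Clang inline assembly; the names `pkru_with_rights`, `rdpkru`, `wrpkru`, `pkey_set_inline` and the `PKRU_*` constants are hypothetical. PKRU holds two bits per key (access-disable and write-disable), so the update is a read, a mask, and a write.

```c
#include <stdint.h>

/* Hypothetical names for the two per-key PKRU bits. */
#define PKRU_ACCESS_DISABLE 0x1u
#define PKRU_WRITE_DISABLE  0x2u

/* Pure helper: return pkru with the two bits for pkey replaced by rights.
 * Assumes 0 <= pkey <= 15; no validation is done here. */
static inline uint32_t pkru_with_rights(uint32_t pkru, int pkey, uint32_t rights) {
    unsigned shift = (unsigned)pkey * 2;
    pkru &= ~(3u << shift);          /* clear the old rights for this key */
    pkru |= (rights & 3u) << shift;  /* install the new rights */
    return pkru;
}

#if defined(__x86_64__) || defined(__i386__)
/* RDPKRU: requires ECX = 0, returns PKRU in EAX and clears EDX. */
static inline uint32_t rdpkru(void) {
    uint32_t eax, edx;
    __asm__ volatile(".byte 0x0f,0x01,0xee" : "=a"(eax), "=d"(edx) : "c"(0));
    (void)edx;
    return eax;
}

/* WRPKRU: requires ECX = 0 and EDX = 0, writes EAX to PKRU. */
static inline void wrpkru(uint32_t pkru) {
    __asm__ volatile(".byte 0x0f,0x01,0xef" : : "a"(pkru), "c"(0), "d"(0) : "memory");
}

/* Inlined equivalent of pkey_set: read-modify-write of the calling thread's PKRU.
 * Only safe to call on a CPU and kernel with protection keys enabled;
 * otherwise the instructions fault. */
static inline void pkey_set_inline(int pkey, uint32_t rights) {
    wrpkru(pkru_with_rights(rdpkru(), pkey, rights));
}
#endif
```

Inlining only removes the function call and argument checks. The WRPKRU instruction itself still executes on every change of rights, which is why I don't expect a large difference.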
It's quite possible it will get much faster on future hardware. MPK is a bleeding-edge feature with limited hardware availability (Skylake-SP only). I expected it to be faster since the PKRU state is entirely thread-local. It'd need to be 5-10x faster for me to want to use it by default.
