TIL container memory isolation is a lie. User pages are accounted, but kernel allocations need to be explicitly marked as "accounted" (~24 accounted vs ~31551 unaccounted). E.g. you can eat 2GB of non-pageable, unaccounted memory by creating a large netfilter table: https://groups.google.com/forum/#!msg/syzkaller-bugs/3N989qrPDrg/aAADfSTFDwAJ …
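A rough way to see what a cgroup does charge is to split its `memory.stat` into user-page and kernel-side counters. This is only a sketch: it assumes the cgroup v2 `memory.stat` field names, the grouping into "user" and "kernel" is my own, and the default path is hypothetical. Anything the kernel allocated without marking it as accounted shows up in neither number, which is the whole problem.

```python
# Sketch: split a cgroup v2 memory.stat dump into user-page and
# kernel-side counters. Assumptions: field names follow the cgroup v2
# memory.stat format; the grouping below is illustrative, not official.

USER_FIELDS = ("anon", "file")
KERNEL_FIELDS = ("kernel_stack", "pagetables", "percpu", "sock", "vmalloc", "slab")


def parse_memory_stat(text):
    """Parse 'name value' lines into a dict, skipping malformed lines."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def summarize(stats):
    """Sum the charged bytes per group; missing fields count as 0."""
    user = sum(stats.get(k, 0) for k in USER_FIELDS)
    kernel = sum(stats.get(k, 0) for k in KERNEL_FIELDS)
    return {"user_bytes": user, "kernel_accounted_bytes": kernel}


if __name__ == "__main__":
    import sys

    # Hypothetical default path; pass the real memory.stat path as argv[1].
    path = sys.argv[1] if len(sys.argv) > 1 else "/sys/fs/cgroup/mygroup/memory.stat"
    with open(path) as f:
        print(summarize(parse_memory_stat(f.read())))
```

If the container's real footprint grows while both numbers stay flat, the growth is in unaccounted kernel memory and the cgroup limit will not stop it.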
It would get better if, to start with, you never allowed syscalls that aren't in the syzkaller descriptions. Maybe ask Kees Cook about alt-syscall?