Interesting... I've been looking into why Linux forces you to create a user namespace and remap yourself to root just to get a mount namespace (PR_SET_NO_NEW_PRIVS should be sufficient), and it looks like now it kinda doesn't. 1/N
New user namespaces start with all capabilities (as if root, even without remapping the uid to 0), but Linux capabilities all get dropped at execve if uid!=0. So you *can* do new mounts without remapping, but only in the initial userns-creating program before execve. 2/N
However, Linux 4.3 introduced a new capability set, the "ambient capabilities", which are preservable across execve even without root. Presumably by using these, you can make an environment in which the user "remains themselves" but can do tmpfs and bind mounts freely! 3/N
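To make the difference concrete, here is a rough sketch of the execve capability rules as documented in capabilities(7), written as shell arithmetic. The helper name `caps_after_execve` and its argument order are made up for illustration, setuid/setgid binaries are ignored, and the masks are example values (CAP_SYS_ADMIN is capability 21, i.e. bit 0x200000). Note that the pre-execve permitted set is not an input at all for a plain binary, which is why "all capabilities" in a fresh userns does not survive on its own.

```shell
# Sketch of the execve() capability transformation from capabilities(7).
# caps_after_execve is a hypothetical helper, not a real tool.
# Assumption: the executed file is not setuid/setgid and the caller is not uid 0.
caps_after_execve() {
    p_inh=$1 p_bnd=$2 p_amb=$3   # process sets before execve (hex, no 0x prefix)
    f_inh=$4 f_perm=$5 f_eff=$6  # file sets (hex) and the file effective bit (0/1)

    # The ambient set is cleared when the file carries file capabilities.
    if [ $(( 0x$f_inh | 0x$f_perm )) -ne 0 ] || [ "$f_eff" -ne 0 ]; then
        amb=0
    else
        amb=$(( 0x$p_amb ))
    fi

    # P'(permitted) = (P(inh) & F(inh)) | (F(perm) & P(bounding)) | P'(ambient)
    perm=$(( (0x$p_inh & 0x$f_inh) | (0x$f_perm & 0x$p_bnd) | $amb ))

    # P'(effective) = F(effective) ? P'(permitted) : P'(ambient)
    if [ "$f_eff" -ne 0 ]; then eff=$perm; else eff=$amb; fi

    printf 'permitted=%x effective=%x ambient=%x\n' "$perm" "$eff" "$amb"
}

# Plain binary, nothing ambient: everything is gone after execve.
caps_after_execve 0 1ffffffffff 0 0 0 0
# prints: permitted=0 effective=0 ambient=0

# Same binary, CAP_SYS_ADMIN raised as inheritable + ambient beforehand.
caps_after_execve 200000 1ffffffffff 200000 0 0 0
# prints: permitted=200000 effective=200000 ambient=200000
```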
And, I'm told unshare(1) already supports this! For example:
unshare -mc --keep-caps
mount --bind ~/src/musl/libc/libc.so /lib/ld-musl-x86_64.so.1
4/N
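One way to check that the capabilities really are ambient inside that shell is to look at the `CapAmb` line in `/proc/<pid>/status`. This is only a sketch: `has_cap` is a hypothetical helper, and it assumes the mask fits in the shell's signed 64-bit arithmetic, which holds for current kernels.

```shell
# has_cap MASK CAPNUM: succeed if bit CAPNUM is set in the hex MASK (no 0x prefix).
# Hypothetical helper for illustration only.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Read this shell's ambient set; fall back to 0 if /proc is unavailable
# or the kernel predates the CapAmb field (added in Linux 4.3).
cap_amb=$(awk '/^CapAmb:/ {print $2}' "/proc/$$/status" 2>/dev/null) || cap_amb=0

# CAP_SYS_ADMIN is capability number 21.
if has_cap "${cap_amb:-0}" 21; then
    echo "CAP_SYS_ADMIN is ambient: it survives execve"
else
    echo "CAP_SYS_ADMIN is not ambient"
fi
```

Run in an ordinary login shell this should report "not ambient"; inside the `unshare -mc --keep-caps` shell it is expected to report the capability as ambient, assuming the system allows unprivileged user namespaces.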
Replying to
Unfortunately this ends up being really confusing and hard to work with in my experience. I had to undo various things I'd done using this kind of thing because I'd keep screwing things up by not having a good intuition for layers of mount namespaces. Also really love how systemd decides to change how stuff works (like propagation) on a global level so it doesn't work consistently between systems.

