A problem with that is that you can't undo the seccomp filter to do debugging/profiling without spawning the application again. Due to the design of the system call API, you can't really do anything more than disallow it as a whole. You can't allow a small portion of it.
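To make the "can't undo" point concrete, here's a toy Python model of how stacked seccomp filters combine. It isn't real seccomp code. The precedence order follows seccomp(2); the filter functions and the choice of ptrace and perf_event_open as the blocked debugging/profiling calls are assumptions for illustration only.

```python
# Toy model of stacked seccomp filters; not a real filter.
# Actions listed from most to least restrictive, following seccomp(2).
PRECEDENCE = [
    "KILL_PROCESS", "KILL_THREAD", "TRAP", "ERRNO",
    "USER_NOTIF", "TRACE", "LOG", "ALLOW",
]

def evaluate(filters, syscall):
    """Run every installed filter and return the most restrictive action."""
    actions = [f(syscall) for f in filters]
    return min(actions, key=PRECEDENCE.index, default="ALLOW")

def sandbox_filter(syscall):
    # Hypothetical sandbox policy: block debugging/profiling entry points.
    return "ERRNO" if syscall in {"ptrace", "perf_event_open"} else "ALLOW"

def debug_filter(syscall):
    # Hypothetical later filter that tries to re-allow everything.
    return "ALLOW"

installed = [sandbox_filter]
installed.append(debug_filter)  # filters can only be added, never removed

print(evaluate(installed, "ptrace"))  # the earlier ERRNO still wins
print(evaluate(installed, "read"))
```

Adding a permissive filter later changes nothing because the most restrictive result wins, so the only way back is a fresh process without the filter.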
The granularity of seccomp-bpf is based on system calls and integer parameters. Look at how the io_uring kernel API is set up as another example. If you don't fully disallow it, it bypasses an ever-increasing amount of seccomp-bpf filtering, since seccomp-bpf is blind to what's behind pointers.
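A rough sketch of what a filter actually gets to look at, modeled in Python with ctypes. The struct layout mirrors struct seccomp_data from <linux/seccomp.h> and the syscall numbers are the x86_64 ones; toy_filter and the helper functions are hypothetical stand-ins for a BPF program, not a real one.

```python
import ctypes

class SeccompData(ctypes.Structure):
    # Same fields as struct seccomp_data: this is all a filter can inspect.
    _fields_ = [
        ("nr", ctypes.c_int),
        ("arch", ctypes.c_uint32),
        ("instruction_pointer", ctypes.c_uint64),
        ("args", ctypes.c_uint64 * 6),
    ]

SYS_OPENAT = 257          # x86_64 syscall numbers
SYS_IO_URING_ENTER = 426
O_ACCMODE = 0o3           # mask for the access mode bits in open flags
O_WRONLY = 0o1

def toy_filter(data):
    """Hypothetical policy: allow read-only openat, allow everything else."""
    if data.nr == SYS_OPENAT:
        # args[1] is the path pointer: just an integer address here.
        # The filter cannot dereference it, so the path plays no role.
        return "ALLOW" if (data.args[2] & O_ACCMODE) == 0 else "ERRNO"
    return "ALLOW"

def make_openat(path_buf, flags):
    data = SeccompData(nr=SYS_OPENAT)
    data.args[1] = ctypes.addressof(path_buf)
    data.args[2] = flags
    return data

def make_io_uring_enter(ring_fd, to_submit):
    data = SeccompData(nr=SYS_IO_URING_ENTER)
    data.args[0] = ring_fd
    data.args[1] = to_submit
    return data

path_a = ctypes.create_string_buffer(b"/etc/hostname")
path_b = ctypes.create_string_buffer(b"/etc/shadow")

# Different paths, same verdict: only the integer flags matter.
print(toy_filter(make_openat(path_a, 0)), toy_filter(make_openat(path_b, 0)))
print(toy_filter(make_openat(path_a, O_WRONLY)))

# The queued io_uring operations sit in shared memory, not in the arguments,
# so two submissions doing entirely different things look identical.
print(bytes(make_io_uring_enter(3, 1)) == bytes(make_io_uring_enter(3, 1)))
```

The filter can tell read-only opens from writes because that's an integer flag, but it can't tell the two paths apart, and it can't tell what an io_uring submission is going to do at all.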
For the most part, namespaces don't restrict what processes do. Processes gain capabilities from the kernel and from other processes via file descriptors. Mount namespaces give them their own path hierarchy but don't sandbox their filesystem access. It's similar for most of the other namespace types, apart from userns.
For example, Android has a service with the low-level read-only and runtime system properties. The rules for accessing that are statically defined via SELinux policy. The low-level rules for accessing services are also defined that way. It's just a static form of security policy.