seccomp-bpf's granularity is limited to system call numbers and integer arguments. Look at how the io_uring kernel API is set up as another example: if you don't fully disallow it, it bypasses an ever-increasing amount of seccomp-bpf filtering, since seccomp-bpf is blind to what's behind pointers.
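A rough model of that limitation, not a real filter: the sketch below packs the `struct seccomp_data` record a seccomp-bpf program gets to inspect and runs a toy policy over it. The x86_64 syscall numbers are an assumption about the target architecture, and `pack_call` and `toy_filter` are hypothetical names made up for illustration.

```python
import struct

# Layout of struct seccomp_data from <linux/seccomp.h>:
#   int nr; __u32 arch; __u64 instruction_pointer; __u64 args[6];
SECCOMP_DATA = struct.Struct("=iIQ6Q")

# Assumption: x86_64 syscall numbers and audit arch value.
AUDIT_ARCH_X86_64 = 0xC000003E
SYS_OPENAT = 257
SYS_IO_URING_SETUP = 425
SYS_IO_URING_ENTER = 426
SYS_IO_URING_REGISTER = 427
IO_URING_SYSCALLS = {SYS_IO_URING_SETUP, SYS_IO_URING_ENTER, SYS_IO_URING_REGISTER}


def pack_call(nr, *args):
    """Build the record a filter would see for one syscall (hypothetical helper)."""
    padded = list(args) + [0] * (6 - len(args))
    return SECCOMP_DATA.pack(nr, AUDIT_ARCH_X86_64, 0, *padded)


def toy_filter(raw):
    """Toy policy: refuse io_uring entirely, allow everything else."""
    nr, arch, ip, *args = SECCOMP_DATA.unpack(raw)
    if nr in IO_URING_SYSCALLS:
        return "ERRNO"
    # For openat, args[1] is only the address of the path string.
    # The filter has no way to read the bytes stored at that address.
    return "ALLOW"


# The second argument is a pointer value, not a path.
record = pack_call(SYS_OPENAT, 3, 0x7FFD00001000, 0)
print(len(record), toy_filter(record))
print(toy_filter(pack_call(SYS_IO_URING_SETUP, 8, 0x7FFD00002000)))
```

The whole record is 64 bytes of integers. Anything an API passes behind a pointer or through shared memory never shows up in it, which is why blocking the three io_uring syscalls outright is the only option here.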
Landlock LSM (kernel.org/doc/html/lates) is meant to be the solution to seccomp-bpf being too crippled as a way to do self-sandboxing. I don't think the kernel developers will be receptive to seccomp-bpf's needs in how they design system calls. Just look at what they did with io_uring already.
Landlock looks promising (a first for LSMs) but largely unneeded IMO. Namespaces + seccomp pretty much fully suffice: seccomp to block all the newfangled syscalls and trace/debug-type stuff, and namespaces for virtualizing resources.
For the most part, namespaces don't restrict what processes do. Processes gain capabilities from the kernel and from other processes via file descriptors.
Mount namespaces give them their own path hierarchy but don't sandbox their filesystem access. The same goes for most namespace types other than user namespaces.
Gain capabilities? That's not a thing unless you botched the basics (nosuid).
In a user namespace, you can't gain any new capabilities outside of your user namespace. User namespaces are where all the most important stuff happens.
And mount namespaces do sandbox filesystem access, but you have to know how to use them. It involves bind mounting over things that should not be accessible (or that you want to interpose different content over) and then making a new nested namespace so the bind mounts can't be undone.
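A minimal sketch of that recipe, assuming Linux, a caller that already holds CAP_SYS_ADMIN in its user namespace, and a single-threaded process. `hide_paths` and the `cover` directory are hypothetical names made up for illustration; the flag values are the ones from `<sched.h>` and `<sys/mount.h>`.

```python
import ctypes
import os

# Flag values from <sched.h> and <sys/mount.h> on Linux.
CLONE_NEWNS = 0x00020000
CLONE_NEWUSER = 0x10000000
MS_BIND = 0x1000
MS_REC = 0x4000
MS_PRIVATE = 0x40000


def _check(ret, what):
    """Raise OSError with errno if a libc call returned non-zero."""
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, f"{what}: {os.strerror(err)}")


def hide_paths(hidden, cover):
    """Bind-mount the directory `cover` over each directory in `hidden`,
    then nest a user + mount namespace so the covers stay in place."""
    libc = ctypes.CDLL(None, use_errno=True)  # loaded lazily, Linux only
    libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_ulong, ctypes.c_void_p]

    # 1. Private mount namespace so nothing propagates back to the host.
    _check(libc.unshare(CLONE_NEWNS), "unshare(CLONE_NEWNS)")
    _check(libc.mount(b"none", b"/", None, MS_REC | MS_PRIVATE, None),
           "make / private")

    # 2. Cover each path that should not be reachable.
    for path in hidden:
        _check(libc.mount(os.fsencode(cover), os.fsencode(path), None,
                          MS_BIND, None), f"bind over {path}")

    # 3. Nested user + mount namespace: mounts inherited from the more
    #    privileged namespace are locked and cannot be unmounted one by one.
    _check(libc.unshare(CLONE_NEWUSER | CLONE_NEWNS), "unshare(nested)")
```

A call would look like `hide_paths(["/home"], "/some/empty/dir")`, with both paths being placeholders. The sketch leaves out writing `uid_map`/`gid_map` for the nested user namespace and dropping privileges afterwards.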
That's changing what's accessible via paths, not sandboxing filesystem access. The program isn't restricted in any way from accessing files passed to it from outside the sandbox. If the user opens a file with an application via system UI, something is being passed to the app, etc.
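To make the path vs. file distinction concrete, here is a small sketch using SCM_RIGHTS descriptor passing through Python's `socket.send_fds` / `socket.recv_fds` (Python 3.9+, Unix only). One process plays both sides for brevity, and unlinking the file stands in for "no path in the sandbox leads to it"; `pass_fd_demo` is a hypothetical name.

```python
import os
import socket
import tempfile


def pass_fd_demo():
    """Pass an open file over a Unix socket, then read it with no usable path."""
    outside, inside = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    with tempfile.NamedTemporaryFile() as f:
        f.write(b"secret")
        f.flush()
        fd = os.open(f.name, os.O_RDONLY)
        # "Outside" hands the open file to the sandboxed side.
        socket.send_fds(outside, [b"x"], [fd])
        os.close(fd)
    # The temp file is unlinked here, so no path refers to it any more.

    # "Inside" receives a descriptor and can still read the file.
    _msg, fds, _flags, _addr = socket.recv_fds(inside, 16, 1)
    data = os.read(fds[0], 64)

    os.close(fds[0])
    outside.close()
    inside.close()
    return data


print(pass_fd_demo())
```

The receiving side never resolves a path, so whatever the mount namespace hides or rearranges has no bearing on this access.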
I'm aware of how mount namespaces work. Changing what can be accessed via paths is a much different thing than restricting which files can actually be accessed. Mount namespaces are doing the former rather than the latter. Same applies to other non-user namespaces.
I'm assuming if you pass a file descriptor into the sandbox from outside it, you're okay with it being accessed. If you don't want that, don't do it.
Look at this small fraction of what they're building:
flatpak.github.io/xdg-desktop-po
This doesn't include a whole lot of the baseline IPC.
It's going to get 100x larger and far more complicated as they fill out the capabilities to actually be able to sandbox more complex real apps.
So how does an application do something like using OpenGL, opening a file with user consent, taking a picture with user consent, obtaining access to take multiple pictures within the current session (or persistently, for a camera app), and so on and so on?
They have to convert device access and things done via system services, other apps, etc. to this along with the baseline IPC for Flatpak and between apps. It's going to end up having complex security policies spread out all over the place. There's value in declarative policy.
It's not usable for that much yet and has a lot of limitations for those things, which is part of why apps just opt out of any real sandboxing. Their roadmap to sandboxing everything is getting apps to move to these APIs instead of the traditional ways of doing it.

