Compressible resources like CPU can be made available quickly by the kernel, with little impact on the threads that were interrupted, provided it knows which threads urgently need the resources and which ones don't. We call these latency-sensitive and latency-tolerant, respectively.
Super informative thread, would love to read more from you on the subject! Many thanks



Hey, the thread is ready and compiled. You can read the whole version here: https://threader.app/thread/1153341109692588032 …
Thanks for this, great explanation! You mention that setting request < limit allows for oversubscription. Question: given some historical data of resource usage, how might you actually set these values to achieve a defined level of oversubscription?
This doesn't have a Tweet-sized answer. :-) VPA (https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler …) is part of it, but the level of acceptable performance needs to be specified and measured. Quasar (https://dl.acm.org/doi/10.1145/2541940.2541941 …) was an interesting automated approach from a few years ago.
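
To make the question a bit more concrete, here is a minimal sketch of one naive heuristic, not how VPA or Quasar work: take historical CPU usage samples for a container, set the request at a typical-usage percentile and the limit at a peak percentile, then check the resulting ratio of total limits to total requests. The percentile choices (p50 and p99), the function names, and the sample numbers are all assumptions made up for illustration; as noted above, the acceptable level of performance still has to be specified and measured separately.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile of a non-empty list, with p in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_cpu(samples_millicores, request_pct=50, limit_pct=99):
    """Hypothetical heuristic: request = typical usage, limit = peak usage.

    The default percentiles are illustrative assumptions, not recommendations.
    """
    request = percentile(samples_millicores, request_pct)
    limit = percentile(samples_millicores, limit_pct)
    return request, limit


def oversubscription_ratio(pods):
    """Sum of limits divided by sum of requests over (request, limit) pairs.

    A ratio of 1.0 means no oversubscription; higher values mean the
    containers may collectively burst further beyond what was reserved.
    """
    total_request = sum(request for request, _ in pods)
    total_limit = sum(limit for _, limit in pods)
    return total_limit / total_request


if __name__ == "__main__":
    # Made-up CPU usage samples in millicores for one container.
    history = [100, 120, 150, 200, 400, 110, 130, 140, 160, 900]
    request, limit = suggest_cpu(history)
    print(f"request={request}m limit={limit}m")
    print(f"ratio={oversubscription_ratio([(request, limit)]):.2f}")
```

Raising `request_pct` toward `limit_pct` pushes the ratio toward 1.0 (less oversubscription, more reserved capacity); lowering it does the opposite, at the cost of more contention when several containers peak at once.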