Could you explain pull vs push CDN?
Cloudflare is a caching reverse proxy sitting in front of your origin servers. When it receives a request, it makes a request to the origin and then caches the response, primarily based on the Cache-Control and legacy Expires headers. That's the pull model: content is pulled into edge nodes based on what's requested.
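To make the header-driven part concrete, here is a minimal sketch of how a shared cache might turn Cache-Control and Expires into a freshness lifetime. It is a simplification of standard HTTP caching rules, not Cloudflare's actual logic; the function name `freshness_seconds` is made up for illustration, and real CDNs layer their own defaults and overrides on top.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def freshness_seconds(headers, now=None):
    """Seconds a shared cache may reuse a response; 0 means do not reuse it as-is.

    Hypothetical helper: a simplified reading of Cache-Control and Expires.
    """
    now = now or datetime.now(timezone.utc)

    # Parse "public, max-age=60" into {"public": "", "max-age": "60"}.
    directives = {}
    for part in headers.get("Cache-Control", "").split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    # Directives that stop a shared cache from reusing the response freely.
    if any(d in directives for d in ("no-store", "private", "no-cache")):
        return 0

    # s-maxage targets shared caches and takes priority over max-age.
    for key in ("s-maxage", "max-age"):
        if key in directives:
            try:
                return max(0, int(directives[key]))
            except ValueError:
                return 0

    # Legacy fallback: Expires is only consulted when no max-age is present.
    if "Expires" in headers:
        try:
            expires = parsedate_to_datetime(headers["Expires"])
        except (TypeError, ValueError):
            return 0  # an unparseable date is treated as already expired
        if expires.tzinfo is None:
            expires = expires.replace(tzinfo=timezone.utc)
        return max(0, int((expires - now).total_seconds()))

    return 0  # no explicit freshness information in this sketch
```

A response with `Cache-Control: public, max-age=60` would be reusable for 60 seconds here, while `no-store` always goes back to the origin.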
The caching isn't mandatory. Cloudflare can sit in front of dynamic services and will cache what it can. Most CDNs have you upload (push) content, and you either link to that content from your web site or put the entire web site onto the CDN as a static site, etc. Some services support both.
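A toy sketch of the difference, with no expiry or networking. The class names `PullCDN` and `PushCDN` and the `fetch_from_origin` callback are hypothetical, just enough to show who initiates the transfer in each model.

```python
class PullCDN:
    """Edge fetches from the origin on the first request, then serves from cache."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # hypothetical callback: path -> bytes
        self._cache = {}
        self.origin_requests = 0

    def get(self, path):
        if path not in self._cache:
            # Cache miss: the edge pulls the content because a client asked for it.
            self.origin_requests += 1
            self._cache[path] = self._fetch(path)
        return self._cache[path]


class PushCDN:
    """Edge only serves what the site owner uploaded ahead of time."""

    def __init__(self):
        self._store = {}

    def upload(self, path, body):
        # The owner pushes content before any client requests it.
        self._store[path] = body

    def get(self, path):
        # None stands in for a 404: there is no origin to fall back to.
        return self._store.get(path)
```

With the pull model the origin stays the source of truth and the edge fills itself on demand. With the push model the CDN's storage is the source of truth.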
They also have related services, including load balancing across more than one origin server. You can use the load balancing with their proxy disabled too. You can also run code on the edge nodes via Workers to reduce the need to make requests to the origin server, etc.
Cloudflare started out as a caching nginx reverse proxy as a service, with a network of many edge nodes using anycast IPs to route traffic to the nearest edge node, and from there to your origin server(s) for anything that wasn't cached. They diverged from nginx though.
They still use a substantially modified fork of nginx, which has diverged more and more and does much more. They run multiple types of instances of it on every node. It's substantially different from the upstream project and they're gradually replacing it with homegrown stuff.
nginx's own caching reverse proxy support and other features are far more primitive, and they gradually replaced almost all of it with their own code, to the point that it's not really nginx anymore. They tried to upstream a bit, but it mostly didn't go anywhere and it's too late now.
A problem with nginx as an open source project is that it has an open core model, where a bunch of important and even very basic features are part of NGINX Plus instead of open source nginx. Simple example: nginx permanently caches the DNS result for each host entry in upstream blocks.
Want to resolve the IPs again when the DNS TTL expires? Pay $2500+/month per instance of NGINX Plus. For simple cases you can work around it with a hack: skip the upstream block and just use proxy_pass with a dynamic variable, but you lose all upstream block features.
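For contrast, this is the behavior being asked for, sketched in Python rather than nginx config: cache a DNS answer only until its TTL runs out, then look it up again. `TTLResolver` and the injected `lookup` callback are hypothetical. The callback is assumed to return the addresses together with a TTL, which the standard library's `socket.getaddrinfo` does not expose.

```python
import time


class TTLResolver:
    """Caches DNS answers per host and re-resolves once the TTL has elapsed."""

    def __init__(self, lookup, clock=time.monotonic):
        # Assumed signature: lookup(host) -> (list_of_ips, ttl_seconds).
        self._lookup = lookup
        self._clock = clock
        self._entries = {}  # host -> (ips, expires_at)

    def resolve(self, host):
        now = self._clock()
        entry = self._entries.get(host)
        if entry is None or now >= entry[1]:
            # Missing or expired: ask DNS again instead of keeping the old IPs forever.
            ips, ttl = self._lookup(host)
            entry = (ips, now + ttl)
            self._entries[host] = entry
        return entry[0]
```

Caching the first answer permanently is the same code with the expiry check removed, which is why an origin that changes IPs gets missed until a reload.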
Caching has some major limitations even if you pay for it. For example, there's no way to have a single request go to the backend to refresh the cache unless you accept serving stale content until that initial request finishes.
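The alternative is usually called request coalescing: concurrent requests for the same key wait on one in-flight backend fetch instead of each hitting the backend or being handed stale content. A minimal threaded sketch follows, assuming a hypothetical `fetch` callback and leaving out expiry entirely. It is not how nginx or Cloudflare implement it.

```python
import threading


class CoalescingCache:
    """Lets only one caller per key hit the backend; the others wait for its result."""

    def __init__(self, fetch):
        self._fetch = fetch  # hypothetical callback: key -> value
        self._lock = threading.Lock()
        self._values = {}
        self._inflight = {}  # key -> Event that is set when the fetch finishes

    def get(self, key):
        with self._lock:
            if key in self._values:
                return self._values[key]
            event = self._inflight.get(key)
            leader = event is None
            if leader:
                # First caller for this key becomes the leader and does the fetch.
                event = threading.Event()
                self._inflight[key] = event

        if leader:
            try:
                value = self._fetch(key)
                with self._lock:
                    self._values[key] = value
            finally:
                with self._lock:
                    del self._inflight[key]
                event.set()  # wake the followers whether the fetch succeeded or failed
            return value

        # Follower: wait for the leader, then retry. If the leader failed,
        # one of the retrying callers becomes the new leader.
        event.wait()
        return self.get(key)
```

Followers block until the fresh value exists, so nobody is served stale content and the backend still sees a single request per key.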
A request queue for when every server is at its connection limit is NGINX Plus only, etc.
Cloudflare had to rewrite / replace massive parts of it and turned it into something much different, to the point that it's not really nginx and they're just replacing it outright. You used to be able to see basic nginx limitations / quirks in their service years ago.


