Using proxy_cache_use_stale updating together with proxy_cache_background_update in Nginx can deliver significant performance improvements. With both enabled, Nginx can serve an expired (STALE) response from cache immediately while refreshing the object from the origin in a background subrequest. So if the cached response can be returned in e.g. 300 ms but the origin fetch takes 10 seconds, the client still gets a response in ~300 ms instead of waiting the full 10 seconds after the cache entry expires.
There is an important side effect of this combination that isn’t described in the official Nginx documentation, though it has been confirmed as expected behavior in https://trac.nginx.org/nginx/ticket/1723#comment:1 and https://trac.nginx.org/nginx/ticket/1738#comment:1.

The problem
As I mentioned above, when using proxy_cache_background_update in Nginx, a background subrequest updates an expired cache item while the response is returned to the client.
If the update takes longer than sending the response, subsequent requests on the same connection can be blocked when using HTTP/1.1 with keepalive. This can cause increased response times in high-traffic scenarios, where the same connections are reused frequently.
To showcase why this is an issue, consider a basic Nginx config, which enables caching and allows for background cache updates:
worker_processes 1;
error_log /dev/stderr info;

events {
    worker_connections 1024;
}

http {
    access_log /dev/stdout;

    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=cache:10m
                     max_size=100m inactive=60m use_temp_path=off;

    upstream backend {
        server backend:5000;
        keepalive 100;
    }

    server {
        listen 80;
        server_name localhost;

        location = /with-issue {
            proxy_pass http://backend/;

            # Enable keepalive to upstream
            proxy_http_version 1.1;
            proxy_set_header Connection "";

            # Short TTL to trigger stale responses sooner
            proxy_cache cache;
            proxy_cache_key $host$uri$is_args$args;
            proxy_cache_valid 200 5s;

            # Serve stale while updating in background
            proxy_cache_use_stale updating error timeout;

            # THIS IS THE PROBLEMATIC SETTING when combined with keepalive
            proxy_cache_background_update on;

            # Add header to see cache status
            add_header X-Cache-Status $upstream_cache_status always;
        }
    }
}
I’ve set up a simple backend server that responds with a 10-second delay, to simulate a slow origin fetch.
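The exact backend implementation doesn’t matter; any server with an artificial delay will do. A minimal sketch using only the Python standard library might look like this (the delay constant and response body are illustrative, not taken from the original setup):

```python
# Hypothetical stand-in for the slow backend: an HTTP server that sleeps
# before responding, simulating a slow origin fetch.
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

DELAY_SECONDS = 10  # simulated origin latency


class SlowOriginHandler(BaseHTTPRequestHandler):
    """Responds to every GET after a fixed delay."""

    def do_GET(self):
        time.sleep(DELAY_SECONDS)
        body = b"hello from the slow origin\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


# To run it standalone on port 5000 (the port the `backend` upstream expects):
# HTTPServer(("", 5000), SlowOriginHandler).serve_forever()
```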
Using curl --next we can test the issue by forcing curl to make several requests over the same connection. First we run three requests to show that the initial MISS is slow, because it needs to fetch from the origin, while subsequent requests are fast because they are served from cache.
> curl -s --http1.1 \
    -w "Request 1: %{time_total}s (HTTP %{http_code}) Cache: %header{X-Cache-Status}\n" -o /dev/null \
    "http://localhost:8080/with-issue?cache_buster=1234" \
    --next \
    -w "Request 2: %{time_total}s (HTTP %{http_code}) Cache: %header{X-Cache-Status}\n" -o /dev/null \
    "http://localhost:8080/with-issue?cache_buster=1234" \
    --next \
    -w "Request 3: %{time_total}s (HTTP %{http_code}) Cache: %header{X-Cache-Status}\n" -o /dev/null \
    "http://localhost:8080/with-issue?cache_buster=1234"

Request 1: 10.011811s (HTTP 200) Cache: MISS
Request 2: 0.000817s (HTTP 200) Cache: HIT
Request 3: 0.000675s (HTTP 200) Cache: HIT
Now we wait at least 5 seconds for the cache entry to expire, then run the same test again. The first request returns the STALE response quickly, straight from the cache, but it also triggers a background update. The next request over the same connection is then blocked until the origin responds to the background subrequest, instead of receiving a fast UPDATING response.
> curl -s --http1.1 \
    -w "Request 1: %{time_total}s (HTTP %{http_code}) Cache: %header{X-Cache-Status}\n" -o /dev/null \
    "http://localhost:8080/with-issue?cache_buster=1234" \
    --next \
    -w "Request 2: %{time_total}s (HTTP %{http_code}) Cache: %header{X-Cache-Status}\n" -o /dev/null \
    "http://localhost:8080/with-issue?cache_buster=1234" \
    --next \
    -w "Request 3: %{time_total}s (HTTP %{http_code}) Cache: %header{X-Cache-Status}\n" -o /dev/null \
    "http://localhost:8080/with-issue?cache_buster=1234"

Request 1: 0.002262s (HTTP 200) Cache: STALE
Request 2: 10.004759s (HTTP 200) Cache: HIT
Request 3: 0.000881s (HTTP 200) Cache: HIT
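If you prefer to script this check rather than hand-assemble curl commands, the same keepalive test can be sketched in Python: http.client keeps the TCP connection open between requests, just like curl --next. This is an illustrative alternative, not part of the original test setup; it assumes the demo Nginx from above is reachable.

```python
# Hypothetical scripted version of the keepalive test: issue several GETs over
# one persistent HTTP/1.1 connection and time each response.
import http.client
import time


def timed_get(conn, path):
    """Send one GET on an already-open connection and return
    (elapsed seconds, value of the X-Cache-Status response header)."""
    start = time.monotonic()
    conn.request("GET", path)
    resp = conn.getresponse()
    resp.read()  # drain the body so the connection can be reused
    return time.monotonic() - start, resp.getheader("X-Cache-Status")


# Against the demo Nginx listening on localhost:8080:
# conn = http.client.HTTPConnection("localhost", 8080)
# for i in range(3):
#     elapsed, status = timed_get(conn, "/with-issue?cache_buster=1234")
#     print(f"Request {i + 1}: {elapsed:.3f}s Cache: {status}")
```

With the problematic config, the second request’s elapsed time should jump to roughly the origin delay once the first request returns STALE.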
The fix
There’s no great way to fix this issue, especially since it appears to be expected Nginx behavior. Background updates can be disabled, but then you lose all the performance benefits they bring.
There is a way to work around it, though: close the connection whenever the cache status is STALE or EXPIRED. While this has some drawbacks, I found it to perform better overall.
We can close the connection by adding this Lua code to our Nginx location config:
header_filter_by_lua_block {
    local cache_status = ngx.var.upstream_cache_status
    if cache_status == "STALE" or cache_status == "EXPIRED" then
        ngx.header["Connection"] = "close"
    end
}
We can observe the effects by repeating the same curl test as above:
> curl -s --http1.1 \
    -w "Request 1: %{time_total}s (HTTP %{http_code}) Cache: %header{X-Cache-Status}\n" -o /dev/null \
    "http://localhost:8080/with-fix?cache_buster=123456" \
    --next \
    -w "Request 2: %{time_total}s (HTTP %{http_code}) Cache: %header{X-Cache-Status}\n" -o /dev/null \
    "http://localhost:8080/with-fix?cache_buster=123456" \
    --next \
    -w "Request 3: %{time_total}s (HTTP %{http_code}) Cache: %header{X-Cache-Status}\n" -o /dev/null \
    "http://localhost:8080/with-fix?cache_buster=123456"

Request 1: 10.013353s (HTTP 200) Cache: MISS
Request 2: 0.001427s (HTTP 200) Cache: HIT
Request 3: 0.001296s (HTTP 200) Cache: HIT
Again, the first request is slow because it fetches from the origin, and the subsequent requests are fast due to a cache HIT. Now we wait for the cache to expire again, and re-run the same command:
> curl -s --http1.1 \
    -w "Request 1: %{time_total}s (HTTP %{http_code}) Cache: %header{X-Cache-Status}\n" -o /dev/null \
    "http://localhost:8080/with-fix?cache_buster=123456" \
    --next \
    -w "Request 2: %{time_total}s (HTTP %{http_code}) Cache: %header{X-Cache-Status}\n" -o /dev/null \
    "http://localhost:8080/with-fix?cache_buster=123456" \
    --next \
    -w "Request 3: %{time_total}s (HTTP %{http_code}) Cache: %header{X-Cache-Status}\n" -o /dev/null \
    "http://localhost:8080/with-fix?cache_buster=123456"

Request 1: 0.001685s (HTTP 200) Cache: STALE
Request 2: 0.000992s (HTTP 200) Cache: UPDATING
Request 3: 0.000990s (HTTP 200) Cache: UPDATING
We can see that all three requests are now as fast as a cache HIT, as they’re all served from cache. The UPDATING status shows that the background request intended to update the cache with fresh content is still in progress.
If you don’t use Lua in your Nginx, you can also achieve the same with a map and add_header:
map $upstream_cache_status $connection_close {
    STALE   "close";
    EXPIRED "close";
    default "";
}

add_header Connection $connection_close always;
Since HTTP/2 and HTTP/3 aren’t affected, this won’t impact most browser traffic. It’s more likely to surface in complex setups where multiple Nginx instances communicate over HTTP/1.1.
You can find the full solution, including a test script, on my GitHub.