Part III described this journey in the opposite direction: there, the request was travelling from the client to the server. In this part, the response is travelling from the server back to the client.
Admittedly, this is not much different from what was described in Part III, so this part is much shorter than the others. The frames are passed from machine to machine through the local loop, then the intranet, then the Internet until they reach the client machine. Routing is the same.
The only real difference is what the caching servers do along the way. During the client-to-server journey they looked in their cache to see if they could serve the resource without bothering the web server. During the present server-to-client journey, they add the newly-retrieved resource to the cache (if it’s cacheable, of course).
If the current resource is too big, the caching server will just pass it along to the next machine without adding it to the cache. The definition of “too big” is configurable. Note that it has nothing to do with the size of the cache or how full it is. It is simply a count of the number of bytes in the current resource. If the current resource is bigger than the configured maximum, it will not be cached. [Cisco Example]
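As one concrete illustration, Squid (a widely used caching proxy) exposes this threshold as the `maximum_object_size` directive. The values below are illustrative, not recommendations; note that the object-size limit is configured independently of the cache's overall size, just as described above.

```
# squid.conf -- illustrative values, not recommendations
maximum_object_size 4 MB    # responses larger than this are never cached
cache_mem 256 MB            # size of the in-memory cache (a separate setting)
```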
Do not set the maximum number of bytes as high as possible, nor as low as possible; either extreme yields sub-optimal performance. This is one of those fiddle-with-it settings, so fiddle away until you get the best results. [The fiddling has to be done in production, not in the test environment.] Keep in mind that things change over time. Like most of the fiddle-with-it settings, this one needs to be re-evaluated from time to time and monitored on an ongoing basis.
If the caching server’s cache is already full, the least-recently used entries will be deleted until there is enough room to store the new entry.
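The two caching behaviors described above (refusing resources over the configured maximum, and evicting least-recently-used entries to make room) can be sketched in a few lines. This is a minimal sketch with hypothetical names (`Cache`, `store`, `lookup`); real caching servers are vastly more sophisticated, but the size check and LRU eviction work on the same principle.

```python
from collections import OrderedDict

class Cache:
    """A toy caching-server store: size-limited entries, LRU eviction."""

    def __init__(self, capacity_bytes, max_object_size):
        self.capacity = capacity_bytes          # total cache size
        self.max_object_size = max_object_size  # the "too big" threshold
        self.used = 0
        self.entries = OrderedDict()            # url -> body, oldest first

    def store(self, url, body):
        # A resource bigger than the configured maximum is passed along
        # to the next machine but never added to the cache.
        if len(body) > self.max_object_size:
            return False
        # If the cache is full, delete least-recently-used entries
        # until there is enough room for the new one.
        while self.used + len(body) > self.capacity and self.entries:
            _, evicted = self.entries.popitem(last=False)
            self.used -= len(evicted)
        self.entries[url] = body
        self.used += len(body)
        return True

    def lookup(self, url):
        # A hit moves the entry to the most-recently-used position.
        if url in self.entries:
            self.entries.move_to_end(url)
            return self.entries[url]
        return None
```

During the client-to-server journey the proxy calls `lookup`; during the server-to-client journey it calls `store`.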
I would expect the caching server to retransmit the frames to the next machine before adding them to the cache; to do the opposite would be foolish from a performance standpoint. From a webapp developer’s viewpoint, though, this is almost irrelevant, because that in-between machine is probably not within our control. We control the server and may have some degree of control over the client, but all those in-between machines out there on the Internet are not ours.
SPDY may offer hope for reduced latency, but we should note that caching servers are excluded from the process. SPDY compresses and encrypts the headers, then sends multiple requests over a single connection. The in-between caching servers cannot read the headers, so they are denied the opportunity to speed up the response. I expect the SPDY developers will find a way around this soon. Perhaps they already have.
And now the resource arrives at the client machine… (to be continued)
For quick reference, here is the series’ table of contents:
- Part I – an overview of the entire process from beginning to end
- Part II – down the protocol stack (client side)
- Part III – the journey from client to server
- Part IV – up the protocol stack (server side)
- Part V – the web server (software)
- Part VI – the server side script
- Part VII – the database management system
- Part VIII – down the protocol stack (server side)
- Part IX (this one) – the journey from server to client
- Part X – up the protocol stack (client side)
- Part XI – the client-side script
- Part XII – the Document Object Model
- Part XIII – after the document is complete
- Part XIV – concurrency
- Part XV – wrap-up; best practices