Webpage Component Caching (Part 1 of 2)

Website Performance: Taxonomy of Tips introduced a classification scheme to help us organize the many performance tips found on the Internet.  Empty Src and Href Attributes then started to examine the “journey from the server to the client” category by discussing the one tip that Yahoo lists as the most important for web application performance.  Today’s article continues this discussion by examining Yahoo’s second-most-highly weighted tip.

Cache Components As Long As Possible

The best way to improve performance of an action is to not perform the action.  Zero sounds like a great performance measurement!  When it comes to the journey a response makes from a server to a client, the best possible performance is to eliminate the journey completely.

A web page consists of many components:  the HTML, CSS, JavaScript, various types of media, and others.  Each component has to make the journey from the server to the client.  Or does it?  If the component hasn’t changed since the last time it made the trip, why make the trip a second time?  Why can’t the browser just use the component it got last time?

If the browser would be so kind as to store the component locally, it could reuse that component in the future instead of downloading it again.  In fact, browsers can do this.  All they need to know is how long to cache the component before re-requesting it from the server.  The browser cannot be expected to know when the component changes, so the server has to tell it.  This caching process happens on the client’s machine, but is under the server’s control.
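
To make the idea concrete, here is a minimal sketch in Python of the decision a browser makes before reusing a stored component.  The function name is_fresh and its parameters are made up for illustration, and real browsers apply more rules than this; it only shows the basic "is the copy still within its allowed lifetime?" test.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(stored_at: datetime, max_age_seconds: int, now: datetime) -> bool:
    """Return True while the cached copy may be reused without a new download.

    stored_at       -- when the browser received the component (hypothetical name)
    max_age_seconds -- the lifetime the server allowed for this component
    now             -- the moment the component is needed again
    """
    age = now - stored_at
    return age < timedelta(seconds=max_age_seconds)


if __name__ == "__main__":
    stored = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    # Needed again 30 minutes later, with a one-hour lifetime: reuse the copy.
    print(is_fresh(stored, 3600, stored + timedelta(minutes=30)))  # True
    # Needed again two hours later: the copy has expired, so ask the server.
    print(is_fresh(stored, 3600, stored + timedelta(hours=2)))     # False
```

The lifetime comes from the server, which is why the caching happens on the client but stays under the server’s control.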

HTML caching is inappropriate if the page changes dynamically or frequently.  It also interferes with page statistics because it hides user hits from the server.  For these reasons, it is often more appropriate to make the HTML non-cacheable and to place all cacheable components in external files.

How To Implement HTML Caching

Server-side scripting languages (e.g., PHP, ASP) provide a way to set headers for the current HTML page.  Use it to include either the Expires or the Cache-Control: max-age header line.  Expires allows us to set an expiry date; Cache-Control: max-age allows us to set a duration (the length of time, in seconds, from access date to expiry date).
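
As a rough illustration of what those two header lines look like, the following Python sketch builds both values for a given lifetime.  The helper name caching_headers is hypothetical, and how the headers are actually attached to the response depends on the server-side language or framework in use; only the standard library’s email.utils.formatdate is relied on here to produce the HTTP date format.

```python
import time
from email.utils import formatdate


def caching_headers(max_age_seconds: int, now: float) -> dict:
    """Build Expires and Cache-Control values for a response sent at `now`.

    `now` is a Unix timestamp; `caching_headers` is an illustrative helper,
    not part of any web framework.
    """
    return {
        # Expires is an absolute date in HTTP format, always expressed in GMT.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
        # max-age is a relative duration in seconds.
        "Cache-Control": f"max-age={max_age_seconds}",
    }


if __name__ == "__main__":
    # Allow the page to be cached for one hour from the current moment.
    for name, value in caching_headers(3600, time.time()).items():
        print(f"{name}: {value}")
```

Either header alone is enough to tell the browser how long it may keep the page.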

How To Implement Other Caching

Apache’s ExpiresDefault and ExpiresByType directives (provided by the mod_expires module) can be used within .htaccess files to set the expiry dates.  For example:

ExpiresActive On
ExpiresByType image/gif "access plus 1 year"
ExpiresByType application/ecmascript M86400
ExpiresDefault A3600

In this example, the first line turns on generation of the expiry headers (a necessary first step).  The second line sets GIF images served from this directory to expire one year after the browser accesses them.  The third line sets JavaScript files (served here as application/ecmascript) to expire 24 hours after they were last modified; the M prefix means the number of seconds is counted from the file’s modification time.  The fourth line sets all other data served from this directory to expire one hour after it is accessed; the A prefix means the number of seconds is counted from the access time.
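
The short forms A3600 and M86400 can be read as "base time plus a number of seconds".  The Python sketch below mimics that arithmetic so the difference between the two base times is easy to see.  The function expiry_time is an illustration written for this article, not Apache code, and the dates are invented.

```python
from datetime import datetime, timedelta, timezone


def expiry_time(spec: str, accessed: datetime, modified: datetime) -> datetime:
    """Compute an expiry time from a short-form spec such as 'A3600' or 'M86400'.

    'A' counts from the time of access, 'M' from the file's last modification.
    """
    base_code, seconds = spec[0], int(spec[1:])
    if base_code == "A":
        base = accessed
    elif base_code == "M":
        base = modified
    else:
        raise ValueError(f"unknown base code: {base_code!r}")
    return base + timedelta(seconds=seconds)


if __name__ == "__main__":
    modified = datetime(2024, 1, 1, 9, 0, 0, tzinfo=timezone.utc)   # file saved
    accessed = datetime(2024, 1, 3, 12, 0, 0, tzinfo=timezone.utc)  # file requested

    print(expiry_time("A3600", accessed, modified))   # one hour after the request
    print(expiry_time("M86400", accessed, modified))  # 24 hours after the file was saved
```

With these sample dates the M86400 result falls before the request itself, because the file was last modified more than 24 hours earlier.  That is worth keeping in mind when choosing between the two base times.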

Keep in mind that .htaccess applies to the directory it is in and to all subdirectories below it.

Continued …

Part Two of this article will discuss some of the planning and thinking a developer goes through when implementing caching.  Watch for it on the monitor.us blog.


