Caching: A Problem or a Solution

November 4, 2009 scerrimark

Caching is a very good technique for minimizing load times and bandwidth use. It ensures that components of your site such as images, stylesheets, and JavaScript files are downloaded once by your visitors and do not need to be downloaded again each time they access your site. However, you have probably updated one of your stylesheets at some point, only to find that some of your visitors did not see the changes.

So let’s investigate this problem. By default, cached files expire quickly. This means that very frequent visitors might see a stale style sheet immediately after you make a change, while infrequent visitors (those who do not visit the site every day) will not encounter a problem with cached content. However, it also means that visitors will often be downloading files unnecessarily, leading to longer load times and wasted bandwidth.

To reduce load times and bandwidth use, you may want to tell browsers to keep these cached files for a longer period of time. But having done this, your site’s visitors may experience problems when you make changes to images, style sheets, or scripts. Showing an outdated image is not that harmful, but imagine that the HTML is updated and the JavaScript is not: this may break the functionality of your site. Below I describe a number of techniques that can be used to trick the browser into thinking that a changed file is new content.
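As for telling browsers to keep files longer in the first place: on Apache, one option is to send far-future expiry headers with mod_expires from an .htaccess file. The following is only a minimal sketch. It assumes mod_expires is available on your server, and the MIME types and lifetimes shown are illustrative choices of mine rather than recommended values.

```apache
# Applied only if mod_expires is loaded; otherwise the block is skipped.
<IfModule mod_expires.c>
    ExpiresActive On
    # Illustrative lifetimes, counted from the moment the browser fetches the file.
    ExpiresByType text/css "access plus 1 year"
    # Assumption: your server labels .js files as application/javascript.
    ExpiresByType application/javascript "access plus 1 year"
    ExpiresByType image/png "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"
</IfModule>
```

Long lifetimes like these are exactly what makes the techniques in the following sections necessary, because a browser will not ask for the file again until the lifetime runs out or the URL changes.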

Modify File Name

If you change the name of a file, the browser thinks that it is new content and therefore downloads it. This can be tedious if you have a lot of files or if you change their content often. If you choose a good naming convention, however, the file names can also serve as a form of version control.

Query Strings

You can add a query string to the file path. Each time the query string changes, the browser downloads the file again. For example, you can include the version number in the path, such as “/css/styles.css?version=1.0”. This is also a very good technique for version control. The only problem with this technique is that if you use it with images, you may also need to change all the stylesheets and HTML that reference that image.

The Path Method

In my opinion this is the best solution. It includes the version information in the path of the resource, rather than in the file name or query string. So, for example:

<link rel="stylesheet" href="/css.1234/styles.css" type="text/css" />

I know what you are thinking: this involves as much work as renaming the file. It does not, because there is a very neat solution to this problem. What we can do is use a rewrite rule to make the URL /css.1234/styles.css point to /css/styles.css on the server, so only the number in your HTML has to change while the files stay where they are. With Apache, for example, you can do this with mod_rewrite. The module needs to be installed and enabled, and the server must permit .htaccess files. Setting up a rule for this is straightforward. You can add the following lines to the .htaccess file located in your site’s root web folder:

RewriteEngine On
RewriteRule css[.][0-9]+/(.*)$ css/$1 [L]

This rule uses a regular expression to match any path containing css, followed by a period (.), followed by one or more digits, and finally a slash. Any path matching this pattern is rewritten to just /css/<filename>. The [L] flag specifies that no further rewrite rules should be applied to this request.
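If you want to version scripts and images in the same way, the rule can be generalized. The following is a sketch under a few assumptions: the directory names js and images are hypothetical, the rule lives in the .htaccess file in the site root, and the pattern is anchored with ^ so that only paths beginning with one of these directory names match.

```apache
RewriteEngine On
# Hypothetical layout: /css, /js and /images directly under the site root.
# ^ anchors the match at the start of the requested path.
# $1 is the directory name and $2 is the rest of the path after the slash.
RewriteRule ^(css|js|images)\.[0-9]+/(.*)$ $1/$2 [L]
```

With this in place, a request for /js.1234/app.js (app.js being a made-up file name) would be served from /js/app.js, just as /css.1234/styles.css is served from /css/styles.css.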

Conclusion

No site can afford to ignore browser caching. Caching can significantly improve performance for your users and save you money on bandwidth while you’re at it. But like any other technology, it is easy to use incorrectly. If you use the methods above, you should be able to use caching to your advantage.

Categories: Website Development