Feature: Developing Your Site for Performance, part 1 - Tips for Client-Side Code Optimisation
Source: UN, 12 January 2005
Submitted by
Ann Light
This pair of articles outlines a common sense, cost-effective approach to website acceleration according to the two simple laws of Web performance:
* Send as little data as possible
* Send it as infrequently as possible
If used properly, these basic principles should result in:
* Faster Web page loads
* Reduction of server usage
* Improved bandwidth utilisation
Developing a website with these techniques in mind should not only improve user satisfaction with a site or Web-based application, but also save money on site delivery costs. Part one on "Client-Side Code Optimisation", here, will be followed by a second part on "Optimal cache control" in following weeks. Both articles offer the technical detail to support the good coding practice that should underlie good Web usability more generally.
Code for Yourself, Compile for Delivery
Any application programmer knows that there are good reasons why the code one works with is not the code one should deliver. It is best to comment source code extensively, to format it for maximum readability, and to avoid overly terse, but convoluted syntax that makes maintenance difficult. Later, one translates that source code using a compiler into some other form that is optimised for performance and protected from reverse engineering. This model can be applied to Web development as well. To do so, you would take the "source" version of your site and prepare it for delivery by "crunching" it down through simple techniques like white space reduction, image and script optimisation, and file renaming. You would then take your delivery-ready site and post it.
Now hopefully this isn't too foreign a concept, since you are likely already at least working on a copy of your site, rather than posting changes directly to the live site. If not, please stop reading right now and make a copy of your site, as this is the only proper way to develop, regardless of whether the site is a static brochure or a complex, CMS-driven application. If you don't believe us now, you surely will some day in the very near future if you ruin some of your site files and can't easily recover them.
As you build your site, you are probably focusing on the biggest culprits in site download speed reduction - images and binary files like Flash. While reducing the colors in GIF files, compressing JPEGs, and optimising SWF files will certainly help a great deal, there are still plenty of other areas for improvement. Remembering the first rule of Web performance, we should always strive to send as few bytes as possible, regardless of whether the file is markup, image, or script. It might seem like wasted effort to focus on shaving bytes here and there in (X)HTML, CSS or JavaScript. However, this may be precisely where the greatest attention ought to be paid.
During a typical Web page fetch, an (X)HTML document is the first to be delivered to a browser. We can dub this the host document since it determines the relationships to all other files. Once received, the browser begins to parse the markup, and in doing so, often initiates a number of requests for dependent objects such as external scripts, linked style sheets, images, embedded Flash, and so on. These CSS and JavaScript files may, in turn, host additional calls for related image or script files. The faster these requests for dependent files get queued up, the faster they will get back to the browser and start rendering in the page. Given the importance of the host document, it would seem critical to get it delivered to the browser and parsed as quickly as possible since, despite constituting a relatively small percentage of the overall page weight, it can dramatically impede the loading of the page. Remember: users don't measure bytes, they measure time!
So what specifically do you need to do to fully prep your site for optimal delivery? The basic approach involves reducing white space, crunching CSS and JavaScript, renaming files, and similar strategies for making the delivered code as terse as possible (See Google for an example). These general techniques are well known and documented on the Web and in books like Andy King's Speed up Your Site: Website Optimisation. In this article we present what we consider to be the top twenty markup and code optimisation techniques. You can certainly perform some of these optimisations by hand, find some Web editors and utilities that perform a few of the features for you, or roll your own crunching utilities. As a reference implementation for nearly all the optimising features described here and as a legitimate example of the "real world" value of code optimisation, we do also point you to a tool developed at Port80 Software, called the w3compiler. Now on with the tips!
Markup Optimisation
Typical markup is either very tight, hand-crafted and standards-focused, filled with comments and formatting white space, or it is bulky, editor-generated markup with excessive indenting, editor-specific comments often used as control structures, and even redundant or needless markup or code. Neither case is optimal for delivery. The following tips are safe and easy ways to decrease file size:
1. Remove white space wherever possible In general, multiple white space characters (spaces, tabs, newlines) can safely be eliminated, but of course avoid changing "pre", "textarea", and tags affected by the "white-space" CSS property.
2. Remove comments Almost all comments, save for client-side conditional comments for IE and doctype statements, can be safely removed.
3. Remap color values to their smallest forms Rather than using all hex values or all color names, use whichever form is shortest in each particular case. For example, a color attribute value like #ff0000 could be replaced with red, while lightgoldenrodyellow would become #FAFAD2.
4. Remap character entities to their smallest forms As with color substitution, you can substitute a numeric entity for a longer alpha-oriented entity. For example, &Egrave; would become &#200;. Occasionally, this works in reverse as well: &#240; saves a byte if referenced as &eth;. However, this is not quite as safe to do, and the savings are limited.
5. Remove useless tags Some "junk" markup, such as tags applied multiple times or certain "meta" tags used as advertisements for editors, can safely be eliminated from documents.
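Tips 1 and 2 above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the w3compiler's method: the function name crunchMarkup is hypothetical, and the sketch assumes the page has no inline "script" or "style" blocks and no elements styled with the "white-space" CSS property, since it only protects "pre" and "textarea".

```javascript
// Matches whole <pre>...</pre> and <textarea>...</textarea> blocks, which must be left alone.
const PROTECTED_BLOCKS = /(<pre\b[\s\S]*?<\/pre>|<textarea\b[\s\S]*?<\/textarea>)/gi;

// Hypothetical helper: strips comments and collapses white space outside protected blocks.
function crunchMarkup(html) {
  return html
    .split(PROTECTED_BLOCKS) // odd-indexed parts are the protected blocks
    .map((part, i) => {
      if (i % 2 === 1) return part; // keep <pre> and <textarea> byte for byte
      return part
        .replace(/<!--(?!\[if)[\s\S]*?-->/g, "") // drop comments, keep IE conditional comments
        .replace(/\s+/g, " "); // collapse runs of spaces, tabs, and newlines
    })
    .join("");
}

console.log(crunchMarkup("<p>  Hello   <!-- note -->world</p>\n<pre>  keep   this </pre>"));
```

Doctype statements are untouched because they do not start with the comment opener. Run a sketch like this only on the delivery copy of a site, never on the source.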
CSS Optimisations
CSS is also ripe for simple optimisations. In fact, most CSS created today tends to shrink far more under these techniques than (X)HTML does. The following techniques are all safe, except for the final one, the complexities of which demonstrate the extent to which client-side Web technologies can be intertwined.
6. Remove CSS white space As is the case with (X)HTML, CSS is not terribly sensitive to white space, and thus its removal is a good way to significantly reduce the size of both CSS files and "style" blocks.
7. Remove CSS comments Just like markup comments, CSS comments should be removed, as they provide no value to the typical end user. However, a CSS masking comment in a "style" tag probably should not be removed if you are concerned about down-level browsers.
8. Remap colors in CSS to their smallest forms As in HTML, CSS colors can be remapped from word to hex format. However, the advantage gained by doing this in CSS is slightly greater. The main reason for this is that CSS supports three-hex color values like #fff for white.
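Tips 6 to 8 can be sketched as follows. The function name crunchCss is hypothetical, and the sketch assumes the style sheet contains no quoted strings (for example in "content" values) whose spacing or punctuation matters, and no masking comment that has to be preserved.

```javascript
// Hypothetical helper: removes comments and white space, then shortens hex colors.
function crunchCss(css) {
  return css
    .replace(/\/\*[\s\S]*?\*\//g, "") // tip 7: remove comments
    .replace(/\s+/g, " ") // tip 6: collapse white space runs
    .replace(/\s*([{};,])\s*/g, "$1") // no spaces around braces, semi-colons, commas
    .replace(/:\s+/g, ":") // no spaces after a colon
    // tip 8: #ffffff becomes #fff, but only inside a declaration block,
    // so an id selector such as #aabbcc is left alone
    .replace(/#([0-9a-f])\1([0-9a-f])\2([0-9a-f])\3\b(?=[^{}]*\})/gi, "#$1$2$3")
    .trim();
}

console.log(crunchCss("/* heading */\nh1 {\n  color: #ffffff;\n  background: #FF0000;\n}"));
```

The lookahead on the color rule is the design choice worth noting: a hex value inside a declaration is followed by a closing brace before any opening brace, while an id selector is not.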
9. Combine, reduce, and remove CSS rules CSS rules like font-size, font-weight, and so on can often be expressed in a shorthand notation using the single property font. When employed properly, this technique allows you to take something like p {font-size: 36pt; font-family: Arial; line-height: 48pt; font-weight: bold;} and rewrite it as
p{font:bold 36pt/48pt Arial;}
You also may find that some rules in style sheets can be significantly reduced or even completely eliminated if inheritance is used properly. So far, there are no automatic rule-reduction tools available, so CSS wizards will have to hand-tweak for these extra savings. However, the upcoming 2.0 release of the w3compiler will include this feature.
10. Rename class and id values The most dangerous optimisation that can be performed on CSS is to rename class or id values. Consider a rule like
.superSpecial {color: red; font-size: 36pt;}
It might seem appropriate to rename the class to sS. You might also take an id rule like
#firstParagraph {background-color: yellow;}
and use #fp in place of #firstParagraph, changing the appropriate id values throughout the document. Of course, in doing this you start to run into the problem of markup-style-script dependency: If a tag has an id value, it is possible that this value is used not only for a style sheet, but also as a script reference, or even as a link destination. If you modify this value, you need to make very sure that you modify all related script and link references as well. These may even be located in other files, so be careful.
Changing class values is not quite as dangerous, since experience shows that most JavaScript developers tend not to manipulate class values as often as they do id values. However, class name reduction ultimately suffers from the same problem as id reduction, so again, be careful.
Note: You should probably never remap name attributes, particularly on form fields, since these values are also operated on by server-side programs that would have to be altered as well. Though not impossible, calculating such dependencies would be difficult in many Web site environments.
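Before renaming an id, it helps to list every file that mentions it. The sketch below shows one rough way to do that; the function name findIdReferences and the file names are hypothetical, the id is assumed to contain only letters and digits, and the three patterns cover only the most common reference styles, so a clean result is not proof that a rename is safe.

```javascript
// Hypothetical helper: returns the names of files that appear to reference the given id.
// `files` maps a file name to its text content.
function findIdReferences(id, files) {
  const patterns = [
    new RegExp(`\\bid=["']${id}["']`), // markup: id="firstParagraph"
    new RegExp(`#${id}\\b`), // CSS selector or link destination: #firstParagraph
    new RegExp(`getElementById\\(["']${id}["']\\)`), // script lookup
  ];
  return Object.keys(files).filter((name) =>
    patterns.some((pattern) => pattern.test(files[name]))
  );
}

const siteFiles = {
  "index.html": '<p id="firstParagraph">Hi</p><a href="#firstParagraph">top</a>',
  "site.css": "#firstParagraph{background-color:yellow}",
  "menu.js": 'var p=document.getElementById("firstParagraph");',
};
console.log(findIdReferences("firstParagraph", siteFiles));
```

Every file in the returned list has to be updated together when #firstParagraph becomes #fp.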
JavaScript Optimisation
More and more sites rely on JavaScript to provide navigational menus, form validation, and a variety of other useful things. Not surprisingly, much of this code is quite bulky and begs for optimisation. Many of the techniques for JavaScript optimisation are similar to those used for markup and CSS. However, JavaScript optimisation must be performed far more carefully because, if it is done improperly, the result is not just a visual distortion, but potentially a broken page! We start with the most obvious and easiest improvements and then move on to ones that require greater care.
11. Remove JavaScript comments Except for the "<!-- //-->" masking comment, all JavaScript comments indicated by // or /* */ can safely be removed, as they offer no value to end users (except for the ones who want to understand how your script works).
12. Remove white space in JavaScript Interestingly, white space removal in JavaScript is not nearly as beneficial as it might seem. On the one hand, code like
x = x + 1;
can obviously be reduced to
x=x+1;
However, because of the common sloppy coding practice of JavaScript developers failing to terminate lines with semi-colons, white space reduction can cause problems. For example, given the legal JavaScript below which uses implied semi-colons
x=x+1
y=y+1
a simple white space remover might produce
x=x+1y=y+1
which would obviously throw an error. If you add in the needed semi-colons to produce
x=x+1;y=y+1;
you actually gain nothing in byte count. We still encourage this transformation, however, since Web developers who provided feedback on the Beta version of the w3compiler found the "visually compressed" script more satisfying (perhaps as visual confirmation that they are looking at transformed rather than original code). They also liked the side benefit of delivering more obfuscated code.
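One cheap safeguard when crunching by hand is to confirm that the result still parses. The sketch below, with a hypothetical helper named parses, uses the standard Function constructor to compile the text without running it, and applies it to the three variants discussed above.

```javascript
// Hypothetical helper: true when the source text parses as JavaScript; nothing is executed.
function parses(source) {
  try {
    new Function(source); // compiles the text as a function body without calling it
    return true;
  } catch (err) {
    return false;
  }
}

console.log(parses("x=x+1\ny=y+1")); // true: the newline implies a semi-colon
console.log(parses("x=x+1y=y+1")); // false: the naive white space remover broke the script
console.log(parses("x=x+1;y=y+1;")); // true: explicit semi-colons survive crunching
```

A successful parse only shows the script is syntactically valid; it does not show that the script still behaves the same way.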
13. Perform code optimisations Simple ideas like removing implied semi-colons, var statements in certain cases, or empty return statements can help to further reduce some script code. Shorthand can also be employed in a number of situations, for example
x=x+1;
can become
x++;
However, be careful, as it is quite easy to break your code unless your optimisations are very conservative.
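As one illustration of how a shorthand can change behaviour, the two forms above agree for numbers but not for strings, which matters when a value comes from a form field. The function names longhand and shorthand are hypothetical.

```javascript
// The original form: + concatenates when x is a string.
function longhand(x) {
  x = x + 1;
  return x;
}

// The "optimised" form: ++ always converts x to a number first.
function shorthand(x) {
  x++;
  return x;
}

console.log(longhand(5), shorthand(5)); // 6 6
console.log(longhand("5"), shorthand("5")); // 51 6
```

A conservative optimiser would apply this rewrite only where it can tell the variable always holds a number.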
14. Rename user-defined variables and function names For good readability, any script should use variables like sumTotal instead of s. However, for download speed, the lengthy variable sumTotal is a liability and it provides no user value, so s is a much better choice. Here again, writing your source code in a readable fashion and then using a tool to prepare it for delivery shows its value, since remapping all user defined variable and function names to short one- and two-letter identifiers can produce significant savings.
15. Remap built-in objects The bulkiness of JavaScript code, beyond long user variable names, comes from the use of built-in objects like Window, Document, Navigator and so on. For example, given code like
alert(window.navigator.appName);
alert(window.navigator.appVersion);
alert(window.navigator.userAgent);
you could rewrite it as
w=window;n=w.navigator;a=alert;
a(n.appName);
a(n.appVersion);
a(n.userAgent);
This type of remapping is quite valuable when objects are used repeatedly, which they generally are. Note however, that if the window or navigator object were used only once, these substitutions would actually make the code bigger, so be careful if you are optimising by hand. Fortunately, many JavaScript code optimisers will take this into account automatically.
This tip brings up a related issue regarding the performance of scripts with remapped objects: in addition to the benefit of size reduction, such remappings actually slightly improve script execution times because the objects are copied higher up into JavaScript's scope chain. This technique has been used for years by developers who write JavaScript games, and while it does improve both download and execution performance, it does so at the expense of local browser memory usage.
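The break-even point for a remapping is simple arithmetic: each use saves the difference in length between the name and its alias, and the alias declaration costs the length of "alias=name;". A small sketch, with a hypothetical function name aliasSaving:

```javascript
// Hypothetical helper: net bytes saved by aliasing `name` as `alias` when it is used `uses` times.
// A negative result means the remapping makes the script bigger.
function aliasSaving(name, alias, uses) {
  const declarationCost = `${alias}=${name};`.length; // e.g. "w=window;" is 9 bytes
  const savedPerUse = name.length - alias.length; // e.g. "window" to "w" saves 5 bytes
  return uses * savedPerUse - declarationCost;
}

console.log(aliasSaving("window", "w", 1)); // -4: a single use costs more than it saves
console.log(aliasSaving("window", "w", 3)); // 6
console.log(aliasSaving("alert", "a", 3)); // 4
```

This count ignores chained aliases such as n=w.navigator, where the declaration itself benefits from an earlier alias.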
File-Related Optimisation
The last set of optimisation techniques is related to file and site organisation. Some of the optimisations mentioned here might require server modifications or site restructuring.
16. Rename non-user accessed dependent files and directories Sites will often have file names such as SubHeaderAbout.gif or rollover.js for dependent objects that are never accessed by a user via the URL. Very often, these are kept in a standard directory like /images, so you may see markup like
<img src="/images/SubHeaderAbout.gif">
or worse
<img src="../../../images/SubHeaderAbout.gif">
Given that these files will never be accessed directly, this readability provides no value to the user, only the developer. For delivery's sake it would make more sense to use markup like
<img src="/0/a.gif">
While manual file-and-directory remapping can be an intensive process, some content management systems can deploy content to target names, including shortened values. Furthermore, the w3compiler has a feature that automatically copies and sets up these dependencies. If used properly, this can result in very noticeable savings in the (X)HTML files that reference these objects and can also make reworking of stolen site markup much more difficult.
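A hand-rolled version of this remapping might look like the sketch below. It is not how the w3compiler or any particular content management system works; the function names buildRenameMap and rewriteReferences, the target directory /0/, and the base-36 naming scheme are all assumptions made for illustration. The sketch also assumes no listed path is a prefix of another.

```javascript
// Hypothetical helper: assigns each dependent file a short name such as /0/0.gif, /0/1.js.
function buildRenameMap(paths, dir = "/0/") {
  const map = new Map();
  paths.forEach((path, i) => {
    const dot = path.lastIndexOf(".");
    const ext = dot === -1 ? "" : path.slice(dot); // keep the original extension
    map.set(path, dir + i.toString(36) + ext); // base 36 keeps the names short
  });
  return map;
}

// Hypothetical helper: replaces every occurrence of each original path in the markup.
function rewriteReferences(markup, map) {
  let out = markup;
  for (const [from, to] of map) {
    out = out.split(from).join(to);
  }
  return out;
}

const renames = buildRenameMap(["/images/SubHeaderAbout.gif", "/scripts/rollover.js"]);
console.log(rewriteReferences('<img src="/images/SubHeaderAbout.gif">', renames));
```

The files themselves still have to be copied to their new names on the delivery copy of the site, which this sketch does not do.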
17. Shorten all page URLs using a URL rewriter Notice that the previous step does not suggest renaming the host files like products.html, which would change markup like
<a href="products.html">
to something like
<a href="p.html">
The main reason is that end users will see a URL like http://www.sitename.com/p.html, rather than the infinitely more usable http://www.sitename.com/products.html.
However, it is possible to reap the benefits of file name reduction in your source code without sacrificing meaningful page URLs if you combine the renaming technique with a change to your Web server's configuration. For example, you could substitute p.html for products.html in your source code, but then set up a URL rewriting rule to be used by a server filter like mod_rewrite to expand the URL back into a user friendly value. Note that this trick will only put the new URL in the user's address bar if the rewrite rule employs an "external" redirect, thereby forcing the browser to re-request the page. In this case, the files themselves are not renamed as the short identifiers are only used in the source code URLs.
Because of the reliance on URL rewriting and the lack of widespread developer access to, and understanding of, such server-side tools as mod_rewrite, even an advanced tool like the w3compiler does not currently promote this technique. However, considering that sites like Yahoo! actively employ this technique for significant savings, it should not be ignored, as it does produce noticeable (X)HTML reduction when extremely descriptive directory and file names are used in a site.
18. Remove or reduce file extensions Interestingly, there really is little value to including file extensions such as .gif, .jpg, .js, and so on. The browser does not rely on these values to render a page; rather it uses the MIME type header in the response. Knowing this, we might take
<img src="images/SubHeaderAbout.gif">
and shorten it to
<img src="images/SubHeaderAbout">
Or, if combined with file renaming, you might have
<img src="/0/sA">
Don't be scared away by how strange this technique looks at first, as your actual file will still be sA.gif. It is just the end user who won't see it that way!
In order to take advantage of this more advanced technique, however, you do need to make modifications to your server. The main thing you will have to do is to enable something called "content negotiation," which may be native to your server or require an extension such as mod_negotiation for Apache or Port80's PageXchanger for IIS. The downside to this is that it may cause a slight performance hit on your server. However, the benefits of adding content negotiation far outweigh the costs. Clean URLs improve both security and portability of your sites, and even allow for adaptive content delivery whereby you can send different image types or languages to users based upon their browser's capabilities or system preferences! See "Towards Next Generation URLs" by the same authors for more information.
Note: Extension-less URLs will not hurt your search engine ranking. Port80 Software, as well as major sites like the W3C, use this technique and have suffered no ill effects.
19. Restructure "script" and "style" inclusions for optimal number of requests
You will often see in the "head" of an HTML document markup like
<script src="/scripts/rollovers.js"></script>
<script src="/scripts/validation.js"></script>
<script src="/scripts/tracking.js"></script>
In most cases, this should have been reduced to
<script src="/0/g.js"></script>
where g.js contains all the globally used functions. While the break-up of the script files into three pieces makes sense for maintainability, for delivery it does not. The single script download is far more efficient than three separate requests, and it even reduces the amount of needed markup. Interestingly, this approach mimics the concept of linking in a traditional programming language compiler.
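The "linking" step can be as simple as joining the script sources in load order. The sketch below uses a hypothetical function named bundle that works on source text already read into memory. It separates files with a newline, a semi-colon, and another newline, so that a file ending in a // comment or an unterminated statement cannot run into the file that follows it.

```javascript
// Hypothetical helper: concatenates script sources, in order, into one deliverable script.
function bundle(sources) {
  return sources
    .map((source) => source.trim())
    .filter((source) => source.length > 0) // skip empty files
    .join("\n;\n"); // newline ends a trailing // comment, ";" ends the last statement
}

const combined = bundle([
  "var a=1",
  "var b=a+1 // trailing note",
  "var c=b+1;",
]);
console.log(combined);
```

The extra semi-colon is at worst an empty statement, which costs a byte or two per file but avoids the implied semi-colon problems described under tip 12.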
20. Consider cacheability at the code level One of the most important improvements to site performance that can be made is to improve cacheability. Web developers may be very familiar with using the "meta" tag to set cache control, but (apart from the fact that meta has no effect on proxy caches) the true value of cacheability is found in its application to dependent objects such as images and scripts. To prepare your site for improved caching, you should consider segmenting your dependent objects according to frequency of change, storing your more cacheable items in a directory like /cache or /images/cache. Once you start organising your site this way, it will be very easy to add cache control rules that will make your site clearly "pop" for users who are frequent visitors.
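Once dependent objects are segmented by directory, the cache rule reduces to a lookup on the request path. The sketch below only illustrates that idea: the function name cacheControlFor is hypothetical, the 30-day lifetime is an assumed value rather than a recommendation from this article, and in practice the rule would normally live in the Web server's configuration.

```javascript
// Directories assumed to hold rarely changing files, following the layout suggested above.
const LONG_LIVED_PREFIXES = ["/cache/", "/images/cache/"];

// Hypothetical helper: picks a Cache-Control header value from the request path.
function cacheControlFor(path) {
  const isLongLived = LONG_LIVED_PREFIXES.some((prefix) => path.startsWith(prefix));
  return isLongLived
    ? "public, max-age=2592000" // 30 days in seconds (assumed lifetime)
    : "no-cache"; // everything else is revalidated on each use
}

console.log(cacheControlFor("/images/cache/logo.gif"));
console.log(cacheControlFor("/products.html"));
```

Keeping the decision tied to the directory, rather than to individual file names, is what makes the rule easy to maintain as the site grows.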
So you now have 20 useful code optimisation tips to make your site faster. One by one, they may not seem very powerful, but taken together you will see an obvious improvement in site delivery. Next, we will focus primarily on caching, explaining how it is generally misused and how you can significantly improve performance with just a few simple changes.
Thomas A. Powell and Joe Lima
About The Authors Thomas A. Powell is the founder of PINT, Inc., an instructor at the University of California, San Diego Computer Science Department, and the author of Web development books including "HTML & XHTML: The Complete Reference" and "JavaScript: The Complete Reference".
Joe Lima is a lead architect at Port80 Software and also teaches Web server technology for UCSD Extension.
Associated Link:
Port80 Software