Everything about Web and Network Monitoring

The Chronology of a Click, Part I

A user clicks on a link in a web page. Then he waits. Nothing seems to be happening, but he faithfully waits anyhow. Eventually the browser’s viewport goes blank. There’s nothing there but the white screen of death. Again he waits. Our hero is so-o-o patient. Eventually something appears – perhaps a header, perhaps an ad, perhaps a rectangle. Again, he waits, but this time he has some entertainment. He gets to watch the various components of the page magically appear, then jump around from place to place, and finally settle down. He now believes that the page is ready for use. He may be right; he may be wrong. But he goes ahead and uses the page anyhow. If he was wrong, he’ll find out the hard way.

That’s a very realistic example of what users go through, but that’s not the topic of this article. This article looks at what’s happening behind the scenes while the user is waiting for the next page, especially from a performance perspective. This is the chronology of that simple click.

This article is presented in multiple parts. This first part is a simple overview of what goes on. It’s our chance to see the forest before looking too closely at the many trees. Future parts will deal with the client-side activity, the journey from the client to the server, the server-side activity, the journey from the server back to the client, and the post-response client-side activity.

Our typical reader is a web developer or techie of some sort. If that’s you, you may find this introduction a little simple. Never fear – the follow-on articles will provide much more detail.

An Algorithmic Overview

Let’s use pseudo-pseudocode to look at the entire process. Please note that this process varies from browser to browser, and even from version to version of the same browser. The following is not intended to be gospel-truth. It exists only to give us a general idea of what’s going on.

IF the browser does not have an unexpired copy of the resource {
  The client builds an HTTP request for the resource.
  The client sends the HTTP request to the server.
  The request journeys from the client to the server.
  IF the server rewrites the URL {
    it sends the new URL to the client
    the client goes back to the beginning of this process
  }
  IF a server-side script is being requested {
    The server executes it.
  }
  The server builds the HTTP response from the file or the output from the script.
  The server sends the HTTP response to the client.
  The response journeys from the server to the client.
}
The browser parses the HTML: {
  it builds the DOM tree and the render tree as it goes
  IF an element references a component that is not yet downloaded {
    queue the component for download or download it immediately
    (immediate downloading blocks parsing, which blocks)
    (all following downloads and executions)
    (downloaded scripts are executed immediately)
  }
}
WHEN the DOM tree is fully constructed: {
  WHEN all components except images have been downloaded {
    the DOMContentLoaded (document.ready) event fires.
  }
  WHEN all components have been downloaded {
    the onLoad event fires.
  }
}
FOR EACH script that was waiting for onLoad: {
  the browser executes the script
  (this may include downloading more components)
  (downloaded scripts are executed immediately)
}
WHEN other events (timers, mouse/keyboard activity, etc.) occur {
  the browser executes any scripts that were waiting for the event
  (this may include downloading more components)
  (downloaded scripts are executed immediately)
}

The above pseudo-pseudocode breaks the structured programming rules in a couple of places. That’s okay, though, because it’s not intended to be used as a design document for some implementation. Its purpose is to give us an overview of a process — and it does a good job at that.

Pseudocode Not Your Bag? Let’s Try English

When a user clicks on a link, the browser checks to see if it already has an unexpired copy of the requested resource. If so, it uses that copy. If not, the client builds and sends an HTTP request, which journeys from the client to the server, hopping merrily from machine to machine.When the server receives the HTTP request, it may rewrite the URL and send the new URL back to the client. It is now the client’s job to send a new HTTP request for the new URL.If the request is for a file, the server sends the file. If not, the server executes a server-side script to produce a download stream. Either the file or the output from the script is packaged into an HTTP response and sent to the client.
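The client’s decision after a response arrives can be sketched as a tiny function. The helper and its names are hypothetical — this is not a browser API, just the branch described above: follow the new URL the server sent back, or consume the body.

```javascript
// Hypothetical helper sketching the client's choice after a response
// arrives: start over with the new URL (the server "rewrote" it) or
// parse the body it was given.
function nextAction(status, headers) {
  if (status >= 300 && status < 400 && headers.location) {
    // The server sent a new URL; the whole process begins again.
    return { action: 'request-again', url: headers.location };
  }
  // Otherwise the response carries the file or the script's output.
  return { action: 'parse-body' };
}

console.log(nextAction(301, { location: 'https://example.com/new' }));
// { action: 'request-again', url: 'https://example.com/new' }
console.log(nextAction(200, {}));
// { action: 'parse-body' }
```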

Upon receipt of the HTTP response, the client unpackages it and parses the HTML contained therein. If, during parsing, the browser encounters a component that is not yet downloaded, it either queues that component for later download or downloads it immediately. Further parsing may be blocked during immediate download.

Any time the browser downloads a script (whether it was immediate, queued, or waiting for an event), it executes that script in its entirety. Parsing is suspended, so anything downloaded later has to wait. [However, scripts that are set aside while waiting for an event do not block parsing while they wait for the event.]

When the DOM tree is fully built, all components (except images) have been downloaded, and all waiting scripts have finished executing, DOMContentLoaded (jQuery’s document.ready) fires and all scripts that were waiting for this event are executed in order.

When the DOM tree is fully built, all components (including images) have been downloaded, and all waiting scripts have finished executing, onLoad fires and all scripts that were waiting for this event are executed in order.

Other client-side scripts may be downloaded and/or executed in response to any event. Example: A script may be waiting for a timer event. Example: The user may trigger a script by clicking on one of the elements in the web page.

Whenever a script executes, it may download more components. Any time it downloads another script, that script will also execute.

Performance Considerations

Even though we have not gotten into any meat yet, the above overview provides several opportunities for us to discuss performance issues. For each stage of the process, we can try to do it faster, do it at a more opportune time, not do it at all, or do more than one thing at a time.

Do It Faster:  This is the first thought that pops into most people’s minds. For each of the stages described above, the Internet community has put much effort into trying to make it faster. Some of these efforts resulted in changed standards; some resulted in performance tips that individual developers need to implement on their own websites. We will discuss the do-it-faster performance tips in the follow-on articles.

Do It At A More Opportune Time:  With a tip of the hat to our end-users, the ones who ultimately decide the fate of our websites, we recognize that the perception of speed is more important than the reality of speed. As much as possible, let’s not process or transmit data while the user is tapping his fingers waiting for the computer. If that processing and transmission were to happen at some other time, when the user has something productive to do, there may be no effect on real performance, but there is a huge effect on perceived performance.

The best tip in this category is to move as much processing to build-time as possible. You can read more about this technique in Whatever Happened to Build Time?

The most talked-about tip in this category tells us to postpone loading components and executing scripts until after the document achieves interactivity (the point where the end-user can continue interacting with the website). This is still a bit of an art form because developers need to get a good feel for which components the users consider essential. Interactivity is not determined by a JavaScript event; it is determined by when the user can get on with what he was doing.

A page should download and execute the needed-now components statically and inline, then wait for the onLoad event, then download the needed-soon components. The needed-soon components download and execute while the user is skimming through the web page and deciding how to interact with it, so the user doesn’t notice. According to his perception, downloading and executing the needed-soon components do not count as a performance issue.

A similar tip tells us to preload components before the user requests them. Yes, webapp developers must be mind-readers! Of course, it’s not that hard. If we’ve been monitoring the paths users take through the site, we have a really good idea where the user may go next. If so, let’s start loading the components of that page in the background while the user is still reading the current page.
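The mind-reading can be as simple as ranking the navigation data we already collect. This is a hypothetical sketch — the stats object and page paths are invented; real numbers would come from your own monitoring.

```javascript
// Hypothetical sketch: pick the most-visited next page from monitoring
// data, so its components can be preloaded in the background.
function likelyNext(stats) {
  let best = null;
  for (const [page, visits] of Object.entries(stats)) {
    if (best === null || visits > stats[best]) best = page;
  }
  return best;
}

// Invented data: where users went after the home page.
const fromHomePage = { '/pricing': 120, '/docs': 340, '/about': 15 };
console.log(likelyNext(fromHomePage)); // '/docs' — start preloading it
```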

Even if we don’t know which page comes next, we can start downloading components, especially images, that are used on many or most of the web pages. This might even be a good time to download some of the larger images we use.

Memoizing components in localStorage is a new technique you’ll read more about in the near future. For now, check out Memoizing Snippets in LocalStorage.
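The idea can be sketched briefly. A Map-backed stub stands in for localStorage so the sketch runs outside a browser, and the fetch is faked; in a page you would use the real localStorage and a real request.

```javascript
// Sketch of memoizing a fetched snippet in localStorage.
// A Map-backed stub stands in for the browser's localStorage.
const store = new Map();
const localStorageStub = {
  getItem: (k) => (store.has(k) ? store.get(k) : null),
  setItem: (k, v) => store.set(k, String(v)),
};

let fetches = 0;
function fetchSnippet(url) {          // stand-in for a network fetch
  fetches += 1;
  return `<div>snippet from ${url}</div>`;
}

function memoizedSnippet(url, storage = localStorageStub) {
  const cached = storage.getItem('snippet:' + url);
  if (cached !== null) return cached; // already have it — don't fetch again
  const body = fetchSnippet(url);
  storage.setItem('snippet:' + url, body);
  return body;
}

memoizedSnippet('/widgets/nav');
memoizedSnippet('/widgets/nav');      // served from storage this time
console.log(fetches); // 1
```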

Preloading and postloading are both better than downloading and executing while the user waits. Download everything at the most opportune time.

Don’t Do It:  Our primary metric is the time something takes. Smaller is better. Zero is best. How can we do something in zero time? Simple – just don’t do it. If there is any way not to do something, we should gravitate to that method first.

Caching may be the best example of don’t do it. If we need a component or a DNS entry, doesn’t it make a lot more sense to use a locally-cached copy instead of fetching it yet again? Web developers have control over how long something is cached. If they expire a component or a DNS entry earlier than they have to, they’re contributing to the performance problem.
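The freshness check itself is tiny. A minimal sketch, assuming a Cache-Control max-age model with times in seconds; the function name is illustrative, not a browser API.

```javascript
// Sketch of the "don't do it" cache check: a copy fetched with
// Cache-Control: max-age is reused until it expires.
function isFresh(fetchedAt, maxAge, now) {
  return now - fetchedAt < maxAge;
}

const fetchedAt = 1000;
const maxAge = 86400; // one day, in seconds
console.log(isFresh(fetchedAt, maxAge, fetchedAt + 3600));  // true  — reuse it
console.log(isFresh(fetchedAt, maxAge, fetchedAt + 90000)); // false — fetch again
```

A developer who sets max-age shorter than necessary forces that second case — and the extra fetch — more often than needed.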

Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL), are also good examples of don’t do it. Every component that is protected by SSL or TLS requires extra round trips through the Internet to set up the encrypted connection. Avoiding these extra trips is as easy as using HTTP instead of HTTPS. Instead of encrypting everything on a page, why not encrypt only those components that absolutely require it?

Parallel Processing:  The algorithm above incorrectly implies that everything happens serially, with each step waiting for the previous one to complete. Nothing could be further from the truth.

In fact, parallel processing (more than one thing happening at a time) offers a big performance boost. While waiting for something to happen, do something else.

Example: While waiting for the arrival of one download, request the next one. Your browser probably requests certain files in parallel, but it imposes a per-domain limit on this parallelism. As one well-known tip says, increase parallelism and performance by downloading your components from multiple domains.
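Domain sharding can be sketched as a round-robin assignment. The shard hostnames here are hypothetical; the point is that the browser’s per-domain connection limit applies to each shard separately, so more downloads run in parallel.

```javascript
// Sketch of domain sharding: spread components across several hostnames
// so the per-domain connection limit applies to each one separately.
const shards = ['img1.example.com', 'img2.example.com']; // hypothetical

function shardUrl(path, i) {
  return 'http://' + shards[i % shards.length] + path;
}

const components = ['/a.png', '/b.png', '/c.png', '/d.png'];
console.log(components.map(shardUrl));
// ['http://img1.example.com/a.png', 'http://img2.example.com/b.png',
//  'http://img1.example.com/c.png', 'http://img2.example.com/d.png']
```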

Example: If you are building your download stream dynamically in PHP, you can use flush() to send whatever’s done so far to the browser. This lets the browser go ahead and start doing its thing while you’re still building the download stream on the server.
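The same early-flush idea can be sketched in Node. The `res` object here is a fake stand-in for an http.ServerResponse so the sketch is self-contained; the key point is that the head goes out before the slow body is finished.

```javascript
// Node sketch of PHP's flush(): write the finished part of the page
// immediately instead of buffering the whole response.
// `res` is a fake stand-in for an http.ServerResponse.
const chunks = [];
const res = { write: (c) => chunks.push(c), end: () => chunks.push('<END>') };

function servePage(res) {
  // Send the <head> as soon as it is ready so the browser can start
  // fetching the stylesheet while the server still builds the body.
  res.write('<html><head><link rel="stylesheet" href="/site.css"></head>');
  const body = '<body>...slowly built dynamic content...</body></html>';
  res.write(body);
  res.end();
}

servePage(res);
console.log(chunks.length); // 3 — head, body, then end
```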

The Horrible, Disgusting Non-Example: JavaScript is a single-threaded language. It does not give you the opportunity to execute your code in parallel with other parts of your code (Web Workers are a limited exception, but they cannot touch the DOM). Too bad; so sad!

Conclusions

Understanding the request/response/parse/render process helps developers avoid performance problems. It also helps explain the performance tips we read about on the web.

The above is a top-level discussion. Follow-on articles will go into more depth to explain other performance issues and tips. Watch for them at the Monitor.Us blog.

About Warren Gaebel

Warren wrote his first computer program in 1970 (yes, it was Fortran).  He earned his Bachelor of Arts degree from the University of Waterloo and his Bachelor of Computer Science degree at the University of Windsor.  After a few years at IBM, he worked on a Master of Mathematics (Computer Science) degree at the University of Waterloo.  He decided to stay home to take care of his newborn son rather than complete that degree.  That decision cost him his career, but he would gladly make the same decision again. Warren is now retired, but he finds it hard to do nothing, so he writes web performance articles for the Monitor.Us blog.  Life is good!