Everything about Web and Network Monitoring

The Chronology of a Click, Part II

This article is presented in multiple parts. Part I was a simple overview of what goes on from the time the user clicks on a link to the time the new web page is completely available. It was our chance to see the forest before looking too closely at the many trees. Part II, today’s post, details what happens from the time the user clicks the mouse to the time the request leaves the local machine. Subsequent parts will deal with the journey from the client to the server, the server-side activity, the journey from the server back to the client, and the post-response client-side activity.

The following looks at the activity from the OSI model’s viewpoint. Note that your operating system probably does not follow the OSI model exactly. This may result in minor reordering, combining, or splitting of activities, but not enough to make a huge difference to the discussion.

Performance Consideration:  The browser and the protocol stack described below will both perform better if they have sufficient memory. If the system is paging, there is not enough memory.

Performance Consideration:  Measure, benchmark, and monitor. Of course, this tip applies to every stage of the chronology. Each stage presents its own obstacles to performance, so each stage should be monitored.

Performance Consideration:  Each of the following components can be configured to one degree or another. Default configurations are not always the best. Make sure each layer is configured for maximum performance.

The Browser

If the browser already has an unexpired copy of the requested resource, it uses that copy instead of going through all the work described below and in the upcoming parts of this article. This avoids everything to do with networking and servers.

Performance Consideration:  Most browsers cache some data, but some cache more than others. Since this form of caching eliminates so much work, advise your end-users to install a browser that maximizes caching (which changes over time as new versions are released). Remember, too, that the browser’s default configuration may not be good enough. Change it to get the best possible performance.
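To make the "unexpired copy" test concrete, here is a minimal sketch of a freshness check. It assumes the cached response carried a max-age lifetime. The function name and parameters are hypothetical, and a real browser applies the full HTTP caching rules rather than this single comparison.

```python
from datetime import datetime, timedelta, timezone

def is_fresh(stored_at, max_age_seconds, now):
    """Return True while the cached copy is younger than its max-age."""
    return now - stored_at < timedelta(seconds=max_age_seconds)

# Illustrative values only: a copy stored at noon with a 60-second lifetime.
stored = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
use_cache = is_fresh(stored, 60, stored + timedelta(seconds=30))
```

When the check fails, the browser has to go through everything described in the rest of this article.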

The Application Layer

Be it Chrome, Firefox, Safari, Internet Explorer, or one of the lesser-knowns, the browser sends its request through the API (application programming interface) supplied by the application layer. The application layer then builds the HTTP request from the information provided in the <a href="…"> tag.

If the user clicked on a form’s submit button, the form tag may have specified the GET method or the POST method. If this is a GET request, the first line of the HTTP request will start with the word GET. If this is a POST request, the first line of the HTTP request will start with the word POST.

Form data is handled differently for GET and POST. In both cases the form data is copied into a URL-encoded query string as name-value pairs. However, for GET requests the query string is added to the URL in the first line of the HTTP request, but for POST requests the query string is placed into the HTTP request’s body.
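The difference is easy to see in a small sketch. The helper below is hypothetical (it is not part of any browser), and the path and form fields are made up for illustration. It uses Python's urllib.parse.urlencode to build the query string, then places it either in the request line or in the body.

```python
from urllib.parse import urlencode

def build_request_start(method, path, form):
    """Return (request_line, body) for a form submission."""
    query = urlencode(form)  # URL-encoded name=value pairs joined by "&"
    if method == "GET":
        # GET: the query string rides along in the URL on the first line
        return f"GET {path}?{query} HTTP/1.1", ""
    if method == "POST":
        # POST: the first line carries only the path; the data goes in the body
        return f"POST {path} HTTP/1.1", query
    raise ValueError("only GET and POST are sketched here")

form = {"q": "web monitoring", "page": "2"}
get_line, get_body = build_request_start("GET", "/search", form)
post_line, post_body = build_request_start("POST", "/search", form)
```

The same encoded string appears in both cases. Only its position in the request changes.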

Performance Consideration:  Use GET rather than POST. GET imposes a lower limit on the number of bytes, so it will remind you when your form data is starting to bloat.

The application layer then puts some of the following headers (and perhaps others) into the request:

  • Host – the server’s host name and port number
  • Connection – “close” if the connection is to be closed after the request is fulfilled
  • User-Agent – the browser’s operating system, name, version, and installed components
  • Date – the date and time the request was sent
  • Referer – the web page the request originated from (i.e., the one that had the link)
  • Accept – the Internet Media Types (MIME types) the browser can receive from the server
  • Accept-Encoding – the types of compression the browser can receive from the server
  • Accept-Language – the languages the browser expects to receive from the server
  • Accept-Charset – the character sets the browser can receive from the server
  • Content-Length – the number of bytes in the request’s body
  • Content-MD5 – the MD5 checksum of the request’s body
  • Content-Type – The Internet Media Type (MIME type) of the body
  • Cookie – the data the server previously stored on the client

The meaning of the Connection header changed between HTTP 1.0 and 1.1. Although it is not defined in either 1.0 or 1.1, keep-alive was a widely-used experimental connection token that many HTTP 1.0 servers understood. It was the browser’s way to request a persistent (reusable) connection.

HTTP 1.1 operates differently. It assumes the browser wants a persistent connection unless told otherwise. The browser can tell the server it doesn’t want a persistent connection by specifying the Connection: close header. The server will finish serving the requested resource, then close the connection.

See sections 8.1.2.1 and 19.6.2 of RFC 2616 for more information.

Performance Consideration:  Use persistent connections. Do not use Connection: close until the last resource is requested. Really, though, the browser should take care of this for you, but don’t assume that. [Note: If you are running an HTTP 1.0 server, use Connection: keep-alive to request a persistent connection.]
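The decision can be sketched in a few lines. This is a simplification under stated assumptions: the function name is hypothetical, only the "close" and "keep-alive" tokens are examined, and the version is passed in as a plain string.

```python
def wants_persistent_connection(http_version, connection_header=""):
    """Decide whether the client is asking to keep the connection open."""
    tokens = {t.strip().lower() for t in connection_header.split(",") if t.strip()}
    if http_version == "1.1":
        return "close" not in tokens   # persistent unless told otherwise
    return "keep-alive" in tokens      # older servers: the client must ask explicitly
```

For example, wants_persistent_connection("1.1") is true with no header at all, while an HTTP 1.0 exchange is persistent only when the keep-alive token is present.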

The browser uses the Cookie header to return the data the server previously stored on the client. It sends all cookies for the requested directory, its ancestor directories, and its ancestor domains. Example: If the tag is

<a href="https://a.example.com/b/index.html">

the browser will add all the cookies that exist for:

a.example.com/b/
a.example.com/
example.com/b/
example.com/
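The list above can be generated mechanically. The sketch below is illustrative only: cookie_scopes is a hypothetical helper, and it naively stops one label short of the top-level domain, whereas real browsers consult a public suffix list and each cookie's own Domain and Path attributes.

```python
from urllib.parse import urlsplit

def cookie_scopes(url):
    """List the domain/path scopes whose cookies could accompany a request."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    # The host itself, then each ancestor domain (never the bare top-level domain)
    domains = [".".join(labels[i:]) for i in range(len(labels) - 1)]
    # The directory holding the resource, then each ancestor directory up to "/"
    directory = parts.path.rsplit("/", 1)[0] + "/"
    segments = [s for s in directory.split("/") if s]
    paths = ["/" + "/".join(segments[:i]) + ("/" if i else "")
             for i in range(len(segments), -1, -1)]
    return [d + p for d in domains for p in paths]

scopes = cookie_scopes("https://a.example.com/b/index.html")
```

Every additional directory level or subdomain adds more scopes, and therefore more places where a stray cookie can attach itself to the request.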

Performance Consideration:  EVERY cookie is sent. The more cookies used by the entire website, the more cookies will be sent with EVERY HTTP request. Even cookies that aren’t needed are sent. Here are some tips to minimize the number of cookie bytes (pun intended):

  • Organize your cookie hierarchy so cookies that pertain to only one page are set for the directory that contains that page, cookies that pertain to one webapp are set for the root of the hierarchy containing that webapp, and cookies that pertain to the entire website are set for the root of all your server-side code. Only server-side code should be served from this hierarchy. Everything else (images, style sheets, html, client-side code, fonts, data, etc.) should be served from some other cookie-less hierarchy. Some people even move the non-server-side-code components to a separate cookie-less domain.
  • Limit the number of cookies used.
  • Limit the amount of data stored in each cookie.
  • Serve resources from IP addresses instead of from domains if cookies aren’t used by that resource. [Cookies set for a domain aren’t sent with requests addressed to an IP address.] [This won’t work when a number of domains share the same IP address.]
  • Use HTML5’s localStorage instead of cookies.

Then comes a blank line and the body. As mentioned above, this is where the query string goes if form data is being sent with the POST method. Content-Length, Content-Type, and Content-MD5 are not needed if the body is empty.

And then the application layer sends the HTTP request to the presentation layer…
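Before following the request down the stack, here is a rough sketch of what the assembled request looks like as bytes: the first line, the header lines, a blank line, and then the body. The function name, host, and header values are assumptions chosen for illustration. A real browser adds many more headers.

```python
def build_http_request(request_line, headers, body=""):
    """Assemble request line, headers, blank line, and body into raw bytes."""
    body_bytes = body.encode("utf-8")
    headers = dict(headers)  # copy so the caller's dict is left untouched
    if body_bytes:
        # Content-Length is only needed when there is a body to describe
        headers["Content-Length"] = str(len(body_bytes))
    lines = [request_line] + [f"{name}: {value}" for name, value in headers.items()]
    # Lines end with CRLF; an empty line separates the headers from the body
    return "\r\n".join(lines).encode("ascii") + b"\r\n\r\n" + body_bytes

raw = build_http_request(
    "POST /search HTTP/1.1",
    {"Host": "www.example.com", "Content-Type": "application/x-www-form-urlencoded"},
    "q=1",
)
```

With an empty body the function emits only the first line, the headers, and the blank line.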

The Presentation Layer

The presentation layer makes sure the client and the server are speaking the same language by translating character codes and graphics as required. It also compresses and encrypts as required.

Performance Consideration:  Compression is not usually important at this point because HTTP requests are typically quite small. However, if you are in the habit of creating wickedly long HTTP requests, compressing them might be a good idea. Of course, shortening them would be the better solution.

Performance Consideration:  Always include an Accept-Encoding header line in every request. It should specify every type of compression your browser understands. Let the server decide whether compression will help – it will (or should) compress only those that will be made smaller.
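The "compress only when it helps" decision might look something like the sketch below. It is written from the server's point of view, considers gzip only, and ignores quality values such as q=0. The function name is hypothetical.

```python
import gzip

def maybe_compress(body, accept_encoding):
    """Return (payload, encoding), compressing only when it shrinks the body."""
    offered = {t.split(";")[0].strip().lower() for t in accept_encoding.split(",")}
    if "gzip" in offered:
        packed = gzip.compress(body)
        if len(packed) < len(body):
            return packed, "gzip"
    return body, None  # send as-is: gzip not offered, or it made the body bigger
```

A two-byte body comes back untouched because the gzip wrapper alone is larger than the data, while a long, repetitive body comes back compressed.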

And then the presentation layer sends the data to the session layer…

The Session Layer

The session layer establishes, manages, and terminates connections, but before it can do that…

We Interrupt This Program to Bring You the Domain Name System

Before the session layer can do its work, it must know the IP address of the server, so it slams on the brakes for this request and issues a Domain Name System (DNS) request to translate the domain name to an IP address.

DNS queries are only required if the HTTP request specifies a domain name. HTTP requests can be directed to IP addresses instead of domain names, in which case the DNS system is bypassed.
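A simple way to picture that decision is a check on whether the host part of the URL is already an IP address. The sketch below uses Python's standard ipaddress module. The function name is an assumption made for illustration.

```python
import ipaddress

def needs_dns_lookup(host):
    """Return False when the host is already an IPv4 or IPv6 literal."""
    try:
        ipaddress.ip_address(host.strip("[]"))  # IPv6 literals appear in brackets in URLs
        return False
    except ValueError:
        return True  # not an address, so the name must be resolved
```

needs_dns_lookup("www.example.com") is true, while needs_dns_lookup("192.0.2.10") is false.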

Performance Implication:  Avoid this entire process and all the back-and-forth communication by serving your resources from an IP address instead of from a domain name. Note: This won’t work if your server hosts more than one domain.

The DNS request goes down through the application layer, the presentation layer, the session layer, the transport layer, the network layer, and the data link layer, then skips from machine to machine until it reaches a DNS name server. The reply will come back across the Internet, then up the protocol stack and back to the session layer.

To keep things manageable, this description is way over-simplified. In actual fact, getting the IP address may involve multiple trips back and forth between the local machine and various name servers.

Fortunately, caches may exist at various points throughout the process, even within the local machine itself. If an unexpired DNS entry is found in a cache along the way, it can be passed back to the session layer, which circumvents the rest of the process. Whew!

Performance Implication:  The local machine’s DNS resolver should be configured to avoid TCP unless UDP won’t resolve. RFC 1123 says, “UDP queries have much lower overhead, both in packet count and in connection state,” so use TCP only as a backup.

Performance Implication:  Caching saves considerable time in this process, but caching will not provide as much benefit if the DNS entries expire sooner than they need to. Developers should make sure the records in their DNS zone files carry time-to-live (TTL) values that are as long as is practical.

Performance Implication:  If your end-users are in a small geographic area (e.g., all in one city), make sure there is a caching DNS name server physically close to them.

Performance Consideration:  Here’s a horrible worst-case scenario. Suppose your DNS entries expire immediately. That means they never get cached anywhere. That means every time you request a component from that domain, the request will be put on hold while the DNS system is consulted over and over and over and over… (once for each component).
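The effect of expiry on a cache can be shown with a toy model. The class below is a sketch, not how any particular resolver is written. DnsCache and its methods are hypothetical names, and the clock is passed in so the behaviour can be demonstrated without waiting.

```python
class DnsCache:
    """Toy DNS cache: entries disappear once their lifetime has passed."""

    def __init__(self, clock):
        self._clock = clock      # callable returning the current time in seconds
        self._entries = {}       # name -> (address, expiry time)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None          # never cached: a full lookup is needed
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]
            return None          # expired: a full lookup is needed again
        return address
```

An entry stored with a lifetime of zero is already expired on the very next get, which is exactly the worst-case scenario described above.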

The session layer now knows the all-important IP address, so it can continue with what it was doing.

Back to The Session Layer

Now that it has the IP address, the session layer creates a connection to the server or reuses an existing connection if one is available. The session layer manages all connections to all servers, including initiating and terminating connections when required.

Performance Implication:  The more requests you issue, the more work for the session layer and the more delays due to DNS lookups and establishing connections. Combining requests is one of the more important performance tips. Inline small images, scripts, and style sheets to minimize the number of HTTP requests. Using image sprites can also reduce the number of requests.

If the request is an HTTPS request, the session layer must use SSL/TLS encryption on the connection. This requires extra handshaking (trips back and forth between client and server) when creating the connection. The session layer also encrypts the data before passing it to the next layer.

Performance Implication:  Since HTTPS requires extra trips to the server, plus the time spent encrypting, avoid it as much as possible. Do not encrypt data that doesn’t require it.

The session layer also enforces the maximum number of connections per server. If an attempt is made to connect to any server too many times at once, the session layer steps in and puts it on the back burner until previously-used connections become available.
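Here is one way to model that back burner. It is a simplified sketch with hypothetical names. A real browser's connection pool also deals with timeouts, priorities, and connection reuse.

```python
from collections import defaultdict, deque

class ConnectionLimiter:
    """Toy model of a per-server connection cap with a waiting queue."""

    def __init__(self, max_per_host):
        self.max_per_host = max_per_host
        self.active = defaultdict(int)      # host -> connections in use
        self.waiting = defaultdict(deque)   # host -> queued request ids

    def request(self, host, request_id):
        """Return True if the request may connect now; otherwise queue it."""
        if self.active[host] < self.max_per_host:
            self.active[host] += 1
            return True
        self.waiting[host].append(request_id)
        return False

    def release(self, host):
        """A connection finished; pass its slot to the next waiting request."""
        if self.waiting[host]:
            return self.waiting[host].popleft()  # slot stays in use
        self.active[host] -= 1
        return None
```

Because the limit is tracked per host, a request to a second server is not held up by a full queue on the first.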

Performance Implication:  If your browser allows you to change the number-of-connections-per-server, set it higher than the default. However, setting it too high can adversely affect performance, too, so optimization will require a bit of fiddling. [The development team also needs to ask whether they want the end-user doing this. It can lead to other problems.]

Performance Implication:  Since the maximum number of connections is defined per server, increase the number of servers. Serve some resources from one server and other resources from other servers. Content delivery networks (CDNs) offer this flexibility.

And then the session layer sends the data to the transport layer…

The Transport Layer

The transport layer packages the data into TCP segments. It adds the source and destination port numbers to the header. It adds a sequence number so it can guarantee in-order delivery with no duplication. It adds a checksum to the header so the server will know if the data was corrupted en route.
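For the curious, the sketch below packs a bare 20-byte TCP header (no options) around a payload and fills in the checksum. It assumes IPv4, and the addresses, ports, flags, and window size are arbitrary illustrative values. The operating system normally does all of this for you.

```python
import ipaddress
import struct

def internet_checksum(data):
    """16-bit one's-complement checksum (the RFC 1071 algorithm)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:   # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def pseudo_header(src_ip, dst_ip, tcp_length):
    """IPv4 pseudo-header that TCP includes in its checksum (protocol 6 = TCP)."""
    return (ipaddress.IPv4Address(src_ip).packed
            + ipaddress.IPv4Address(dst_ip).packed
            + struct.pack("!BBH", 0, 6, tcp_length))

def build_tcp_segment(src_ip, dst_ip, src_port, dst_port, seq, payload):
    def header(checksum):
        # source port, destination port, sequence number, acknowledgement number,
        # data offset (5 words) + PSH/ACK flags, window, checksum, urgent pointer
        return struct.pack("!HHIIHHHH", src_port, dst_port, seq, 0,
                           (5 << 12) | 0x018, 65535, checksum, 0)
    pseudo = pseudo_header(src_ip, dst_ip, 20 + len(payload))
    checksum = internet_checksum(pseudo + header(0) + payload)
    return header(checksum) + payload
```

The receiver runs the same checksum over the pseudo-header and the whole segment. If nothing was corrupted, the result is zero.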

And then the transport layer sends the TCP segment to the network layer.

If the transport layer does not get an acknowledgement from the server within a reasonable time, it assumes the segment was lost somewhere along the way and sends it again.

The Network Layer

The network layer packages the TCP segments into IP datagrams and puts the source and destination IP addresses into the datagram’s header. It also decides the routing (i.e., which machine to send the datagram to).
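The simplest routing decision a client makes is "local or not?" The sketch below assumes a single local network and a single default gateway, and the function name is hypothetical. Real routing tables can hold many more entries.

```python
import ipaddress

def next_hop(destination, local_network, default_gateway):
    """Pick the machine that should receive the datagram next."""
    dest = ipaddress.ip_address(destination)
    if dest in ipaddress.ip_network(local_network):
        return str(dest)        # same network: deliver directly
    return default_gateway      # elsewhere: hand it to the gateway
```

For a web request the destination is usually somewhere else on the Internet, so the datagram's first stop is the gateway.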

And then the network layer sends the IP datagrams to the data link layer…

The Data Link Layer

The data link layer is the software that controls the networking hardware. It repackages the IP datagrams into frames and “puts the frames on the wire” bit by bit or byte by byte. Unlike the network layer, the data link layer sees only the local network, not the entire Internet. It works with MAC addresses, not IP addresses.
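As an illustration, the sketch below wraps an IP datagram in an Ethernet II header. It assumes Ethernet as the local network technology and leaves out the preamble, padding, and trailing frame check sequence, which the hardware or driver normally supplies. The function names and MAC addresses are made up.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # tells the receiver that the payload is an IPv4 datagram

def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into its six raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))

def build_ethernet_frame(dst_mac, src_mac, ip_datagram):
    """Prefix the datagram with destination MAC, source MAC, and EtherType."""
    header = mac_to_bytes(dst_mac) + mac_to_bytes(src_mac) + struct.pack("!H", ETHERTYPE_IPV4)
    return header + ip_datagram
```

Notice that no IP address appears in the frame header. The destination MAC address belongs to the next machine on the local network, which is often the gateway rather than the web server.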

The data link layer responds to errors reported by the networking hardware. The response could be a simple retransmit, but things are not always that easy.

And so the data link layer sends the frames to the physical layer…

The Physical Layer

The physical layer is the physical equipment that lets computers talk to each other. If you can touch it, it’s part of the physical layer. Examples: wires, fibre optic lines, physical ports, adapter cards.

Once the request hits the physical layer… (to be continued in Part III; keep your eye on the Monitor.Us blog)

About Warren Gaebel

Warren wrote his first computer program in 1970 (yes, it was Fortran).  He earned his Bachelor of Arts degree from the University of Waterloo and his Bachelor of Computer Science degree at the University of Windsor.  After a few years at IBM, he worked on a Master of Mathematics (Computer Science) degree at the University of Waterloo.  He decided to stay home to take care of his newborn son rather than complete that degree.  That decision cost him his career, but he would gladly make the same decision again. Warren is now retired, but he finds it hard to do nothing, so he writes web performance articles for the Monitor.Us blog.  Life is good!