The Chronology of a Click, Part XII

This series has been following the chronological progress of a click on a link. The click was converted to a request, sent to the server, and processed by the server. The server created a response and sent it back to the client, where the client-side script executed. Today’s episode tells how the output from the client-side script becomes a fully-rendered web page. Part XIII tells how event-driven scripts, which execute after the page is fully loaded, affect performance.

The client-side script seems to send its output to the browser’s viewport, but in reality the output is sent to the browser engine. There are several engines in common usage: WebKit is used in Chrome and Safari, Gecko is used in Firefox, Trident is used in Internet Explorer, and Presto is used in Opera.

The browser engine is sometimes referred to as a rendering engine or a layout engine, but these are incorrect terms. The rendering engine and the layout engine are subcomponents of the browser engine. The layout engine is described in Build the Render Tree below. The rendering engine is described in Render the Document below.

 

1) Build the Content Tree

The content tree, usually called the DOM tree, is a hierarchical data structure. It represents the content and structure of the document with no formatting information. Upon receipt of HTML from the client-side script, the browser engine parses the HTML, creates a node for each element, then inserts each node into the appropriate place in the content tree.

The content tree’s structure represents the structure of the document. The relationship between an HTML tag and the HTML block in which it appears is represented by a child/parent relationship within the content tree. For example, the html node has two children, the head node and the body node. Another example: If a <div> section of the HTML contains five paragraphs, then the corresponding div node in the content tree has five p nodes as its children.
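
The parent/child structure described above can be sketched in a few lines of Python. This is only a toy model built on the standard library's html.parser, not how any browser engine works internally. The names Node, ContentTreeBuilder, and build_content_tree are made up for this illustration, and the sketch assumes well-formed markup with no void elements such as <br>.

```python
from html.parser import HTMLParser


class Node:
    """One node of a toy content tree: a tag name plus its child nodes."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class ContentTreeBuilder(HTMLParser):
    """Builds a toy content tree from well-formed HTML (no error recovery)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, parent=self.current)
        self.current.children.append(node)
        self.current = node  # descend: later tags become children of this one

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent  # climb back up to the parent


def build_content_tree(html):
    builder = ContentTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


SAMPLE = (
    "<html><head><title>Demo</title></head>"
    "<body><div>" + "<p>text</p>" * 5 + "</div></body></html>"
)

tree = build_content_tree(SAMPLE)
html_node = tree.children[0]
div_node = html_node.children[1].children[0]
print([child.tag for child in html_node.children])  # ['head', 'body']
print([child.tag for child in div_node.children])   # ['p', 'p', 'p', 'p', 'p']
```

The html node ends up with exactly two children, and the div node has five p children, matching the two examples in the text.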

Performance Consideration: 
The HTML parser is very forgiving. When we violate HTML rules, it does its best to figure out what we meant and to present some output that might be what we’re looking for. Unfortunately, there is a performance penalty for that extra effort. For best performance, write well-formed markup with XHTML-style discipline (every tag closed and properly nested) so the parser has nothing to repair.

Performance Consideration: 
Reduce the size of the content tree, especially its depth.
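
To know whether a page needs this treatment, it helps to measure it first. The following rough Python sketch counts element nodes and reports the deepest nesting level. The TreeStats class and tree_stats function are hypothetical names for this illustration, and the sketch assumes well-formed markup in which every opened tag is explicitly closed.

```python
from html.parser import HTMLParser


class TreeStats(HTMLParser):
    """Counts element nodes and tracks the deepest nesting level seen."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0
        self.node_count = 0

    def handle_starttag(self, tag, attrs):
        self.node_count += 1
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        self.depth -= 1


def tree_stats(html):
    """Return (element count, maximum depth) for a well-formed document."""
    stats = TreeStats()
    stats.feed(html)
    stats.close()
    return stats.node_count, stats.max_depth


nested = "<html><body><div><div><div><p>Hi</p></div></div></div></body></html>"
flat = "<html><body><p>Hi</p></body></html>"

print(tree_stats(nested))  # (6, 6)
print(tree_stats(flat))    # (3, 3)
```

Both documents show the same text to the end-user, but the second one produces half as many nodes at half the depth.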

2) Build the Style Structure

This section follows Build the Content Tree on the page, but it does not follow it chronologically. Building the content tree and building the style structure happen concurrently: as the parser works its way through the document, it builds the content tree as it encounters HTML and it builds the style structure as it encounters formatting information. The content tree and the style structure together represent a separation of content and formatting.

I expect the style structure is not cascading, but rather something like a list of already-cascaded styles that apply to nodes in the content tree. It therefore contains fully-realized formatting rules (i.e., effective styles as opposed to cascading styles).

The style structure differs from browser engine to browser engine. There is no explicit or de facto standard, but that’s okay because there is no public API into it. Only the parser and the rendering engine need to understand it.

Performance Consideration: 
Minimize the number of CSS rules. Eliminate duplicated and unused rules.
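
The idea of "already-cascaded" styles can be illustrated with a deliberately simplified Python sketch. It is a guess at the concept, not a description of any engine's real style structure. The effective_styles function and its rule format are invented for this example, and the model ignores specificity, origin, and inheritance: a later rule simply overrides an earlier one.

```python
def effective_styles(rules, tags):
    """Collapse ordered CSS-like rules into one final style dict per tag.

    `rules` is a list of (tag, declarations) pairs in source order. A later
    rule overrides an earlier one for the same property. Real cascading also
    weighs specificity, origin, and inheritance, which this toy ignores.
    """
    resolved = {tag: {} for tag in tags}
    for tag, declarations in rules:
        if tag in resolved:  # rules for tags not in the document are skipped
            resolved[tag].update(declarations)
    return resolved


rules = [
    ("p", {"color": "black", "margin": "1em"}),
    ("div", {"display": "block"}),
    ("p", {"color": "navy"}),    # duplicate: overrides the earlier color for p
    ("span", {"color": "red"}),  # unused: the document has no span
]

styles = effective_styles(rules, tags=["div", "p"])
print(styles["p"])    # {'color': 'navy', 'margin': '1em'}
print(styles["div"])  # {'display': 'block'}
```

Even in this toy, the duplicated and unused rules still have to be read and examined before they are discarded, which is the cost the performance consideration above asks us to eliminate.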

 

3) Build the Render Tree

The render tree is a hierarchical structure that represents the positioning and formatting of the document. Each node in the render tree is called a frame. Each frame represents a visible rectangle on the end-user’s screen. The structure of this tree reflects the structure of the visuals seen by the end-user. [It does not reflect the structure of the document. That’s what the content tree does.] The render tree links the content and its formatting together, making it ready for rendering.

After the content tree and style structure are built, the layout engine uses that information to build the render tree. Now, that’s a bit of a lie, isn’t it? The content tree and the style structure need not be fully built. The layout engine can start determining positions and sizes as soon as the first little bit of information becomes available. True, it may need to duplicate some of its effort (e.g., an element that is relatively positioned above elements that have already been laid out will require those other elements to be laid out a second time), but delaying layout until the parser has completely finished can take even longer.

Performance Consideration: 
JavaScript should never use the DOM during page load. Create static style sheets in the <head> instead. If the client-side script cannot produce a static style sheet for the element, perhaps the server-side script can. In any case, format the element statically and stay away from the DOM tree while the page is loading.

After the page is completely loaded, it’s a different story. See the Reflow section in Part XIII for tips we can use after the page loads (i.e., in response to events).

Performance Consideration: 
Avoid HTML tables. Never use them for page layout; use <div>s instead. If you must use them, keep them as small as possible. Layout can almost always be calculated in a single pass, but HTML tables may need two passes.

There is no one-to-one correspondence between the content tree’s structure and the render tree’s. The content tree’s structure reflects the document’s HTML structure. The render tree’s structure reflects the positioning of the frames seen by the end-user, left-to-right and top-to-bottom. These two structures can be quite different.

Some nodes found in the content tree are not found in the render tree. For example, the head node will not appear in the render tree because it requires no rendering. Another example: Content tree nodes that are not visible (e.g., display:none in CSS) are not included in the render tree.

Some nodes found in the render tree are not found in the content tree. For example, each line of text requires a separate node in the render tree, but they are lumped together into one text node in the content tree.
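
The differences between the two trees can be sketched in Python. This is a simplified model under stated assumptions, not a real layout engine: ContentNode, Frame, and build_frames are names invented here, the head node and display:none nodes are simply dropped, and text is wrapped at a fixed character width instead of being measured in pixels.

```python
import textwrap
from dataclasses import dataclass, field


@dataclass
class ContentNode:
    tag: str
    style: dict = field(default_factory=dict)
    text: str = ""
    children: list = field(default_factory=list)


@dataclass
class Frame:
    kind: str
    children: list = field(default_factory=list)


def build_frames(node, line_width=20):
    """Return the list of frames produced by one content node (maybe empty)."""
    if node.tag == "head" or node.style.get("display") == "none":
        return []  # nothing to draw, so no frame is created
    if node.tag == "#text":
        # One content-tree text node becomes one frame per wrapped line.
        lines = textwrap.wrap(node.text, line_width)
        return [Frame(f"line: {line}") for line in lines]
    frame = Frame(node.tag)
    for child in node.children:
        frame.children.extend(build_frames(child, line_width))
    return [frame]


SENTENCE = "The quick brown fox jumps over the lazy dog"
document = ContentNode("html", children=[
    ContentNode("head", children=[ContentNode("title")]),
    ContentNode("body", children=[
        ContentNode("p", children=[ContentNode("#text", text=SENTENCE)]),
        ContentNode("div", style={"display": "none"}),
    ]),
])

root = build_frames(document)[0]
body = root.children[0]
paragraph = body.children[0]
print([frame.kind for frame in root.children])  # ['body']
print([frame.kind for frame in body.children])  # ['p']
print([frame.kind for frame in paragraph.children])
```

The head node and the hidden div never reach the render tree, while the single text node in the content tree turns into several line frames.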

 

4) Render the Document

After the browser engine starts to build the render tree, the rendering engine kicks in and paints (or draws, if you prefer) the visuals that the user will see. In fact, if it renders slowly enough, the end-user can watch the document appear, be formatted, and move into the correct position. Gyuque uploaded these three visualizations of the rendering process to YouTube:

https://www.youtube.com/watch?v=dndeRnzkJDU
https://www.youtube.com/watch?v=ZTnIxIA5KGw
https://www.youtube.com/watch?v=AKZ2fj8155I

Did you notice how much repetition there is? Most frames are built and rebuilt several times. This is an example of the repetition mentioned in Build the Render Tree above. Each change can trigger any number of other changes.

Rendering does not really happen after the previous step. It can begin as soon as the layout engine puts something into the render tree. Then it runs concurrently with the parser and layout engine.

If the rendering engine responds too slowly to changes in the render tree, that is an obvious performance issue. However, responding too quickly can also become a performance issue. If a script or stylesheet changes previously-rendered nodes, the rendering engine can find itself rendering, re-rendering, re-re-rendering, and so on. [This is actually much more common than one might expect.]

Performance Consideration: 
Specify all formatting in the <head> and specify all content in the <body>. If a document provides formatting information after the content to which it applies, the rendering engine will use default formatting rules when it encounters the content. Later, when the layout engine sees the new formatting information, it will lay out the frame and all affected frames again, then the rendering engine will have to do all its time-consuming work again.

If the layout engine were to wait a few milliseconds for the content tree and style structure to settle down into their final state, it could do the layout once rather than multiple times. This would trigger one change to the render tree instead of multiple changes, which would make the rendering engine do its job once instead of multiple times.

Performance Consideration: 
In fact, some browsers do improve performance by purposely delaying rendering, but in one way they are at the mercy of the programmer. If a script asks for current layout information, the layout engine has to immediately process all the queued frames before it can answer the question. That forced flush throws away the savings the delay was meant to provide. Programmers should avoid code that queries the size (height or width) or position (top or left) of any element, especially in between changes to the document.
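
The cost of asking layout questions at the wrong time can be illustrated with a toy simulation in Python. It is not a browser API. ToyLayoutEngine, set_width, get_width, and finish are hypothetical names, and the model assumes only what the paragraph above describes: changes are queued, and any read forces the queue to be processed first.

```python
class ToyLayoutEngine:
    """Queues changes and lays out lazily; any read forces a layout pass."""

    def __init__(self):
        self.widths = {}
        self.pending = []
        self.layout_passes = 0

    def set_width(self, element, width):
        self.pending.append((element, width))  # cheap: just queue the change

    def get_width(self, element):
        self._flush()  # a read must see up-to-date geometry
        return self.widths.get(element, 0)

    def finish(self):
        self._flush()  # end of the script: process whatever is still queued

    def _flush(self):
        if not self.pending:
            return
        for element, width in self.pending:
            self.widths[element] = width
        self.pending.clear()
        self.layout_passes += 1


def interleaved(engine, elements):
    # Write, then read, for each element: every read forces a flush.
    for i, element in enumerate(elements):
        engine.set_width(element, 100 + i)
        engine.get_width(element)
    engine.finish()
    return engine.layout_passes


def batched(engine, elements):
    # All writes first, no reads: the queue is processed once at the end.
    for i, element in enumerate(elements):
        engine.set_width(element, 100 + i)
    engine.finish()
    return engine.layout_passes


elements = [f"box{i}" for i in range(10)]
print(interleaved(ToyLayoutEngine(), elements))  # 10
print(batched(ToyLayoutEngine(), elements))      # 1
```

The same ten changes cost ten layout passes when each one is followed by a query, and a single pass when the queries are left out.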

5) When onLoad Fires

When the layout engine and rendering engine are finished, the <body>’s onLoad event fires. This indicates that the document is ready for user interaction. It is fully visible and all the elements are functional. We’re good to go. JavaScript code can respond to the onLoad event.

Performance Consideration: 
In the grand scheme of things, it’s not really onLoad that determines when a document is interactive, it’s the end-user. Some elements must be present and fully functional before the end-user will consider the page ready to go. Other elements aren’t quite so urgent. They can wait a few milliseconds, or perhaps even a few hundred milliseconds.

The developer must be aware of the end-user’s thought processes, and must know what the user needs immediately vs. what the user needs in a second or two. Example: A typical user needs to see the top of the document sooner than the bottom of the document. The top is therefore needed-now and the bottom is therefore needed-soon. Other elements that may be needed-soon or needed-later: footers, advertising, dynamic content, images, etc.

Code that creates needed-later elements can be executed after onLoad fires. We note that this is a perceived performance improvement, not a real one. However, from the end-user’s perspective, it is very real. He can get on with what he wants to do sooner. He may not even notice that some elements have not been fully created yet.

If, however, most users do notice the delay in an element, then we have incorrectly classified the element as needed-later when it is really needed-now. Performance must always relate back to the end-user’s wait time.
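
The classification above can be written down as a small planning step. The Python sketch below is only an illustration: the plan function, the task names, and the urgency assigned to each task are assumptions made for this example, and a real page would make these decisions in its own scripts based on what its users actually notice.

```python
URGENCY_ORDER = {"needed-now": 0, "needed-soon": 1, "needed-later": 2}


def plan(tasks):
    """Order tasks by urgency and defer needed-later ones until after onLoad."""
    # sorted() is stable, so tasks of equal urgency keep their original order.
    ordered = sorted(tasks, key=lambda task: URGENCY_ORDER[task[1]])
    before_onload = [name for name, urgency in ordered if urgency != "needed-later"]
    after_onload = [name for name, urgency in ordered if urgency == "needed-later"]
    return before_onload, after_onload


# Hypothetical classification; each site must decide this for its own users.
tasks = [
    ("footer", "needed-later"),
    ("article-top", "needed-now"),
    ("advertising", "needed-later"),
    ("article-bottom", "needed-soon"),
]

before, after = plan(tasks)
print(before)  # ['article-top', 'article-bottom']
print(after)   # ['footer', 'advertising']
```

If users turn out to notice the missing footer, the fix is to reclassify it in the task list, which moves it back ahead of onLoad.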

Conclusion

The parser builds the content tree and the style structure from the HTML. The layout engine builds the render tree from the content tree and style structure. The rendering engine uses the render tree to draw the text and images that are seen by the end-user.

Since the layout engine and the rendering engine execute concurrently, the layout engine can make changes to frames that have already been drawn. When this happens, the frames need to be drawn again. If this were only an occasional occurrence, it wouldn’t be a problem, but in real life it happens far too often.

Part XIII tells how post-loading, event-driven scripts affect performance. Watch for it on the Monitor.Us Blog.

References

Garsiel, Tali. How Browsers Work. Published 2010.02.13 and last updated 2011.06.04 at https://taligarsiel.com/Projects/howbrowserswork1.htm. This classic article details the workings of modern web browsers. It’s a little bit behind the times, but not enough to matter to a casual reader. The explanations are well-written and easy to understand (assuming some technical background on the reader’s part). It’s not short, but shortening it would eliminate some of the details that it sets out to explain.
About Warren Gaebel

Warren wrote his first computer program in 1970 (yes, it was Fortran).  He earned his Bachelor of Arts degree from the University of Waterloo and his Bachelor of Computer Science degree at the University of Windsor.  After a few years at IBM, he worked on a Master of Mathematics (Computer Science) degree at the University of Waterloo.  He decided to stay home to take care of his newborn son rather than complete that degree.  That decision cost him his career, but he would gladly make the same decision again. Warren is now retired, but he finds it hard to do nothing, so he writes web performance articles for the Monitor.Us blog.  Life is good!