Mobile Rendering Performance at Sky
The above is what all prospect customers see when they visit Sky.com. Codenamed Polaris, it’s about creating a premium, uniquely Sky, brand engagement in order to convert a prospect into a customer.
Perceived speed is a key part of creating a premium experience and maximising KPIs such as click-through. The site has to feel fast to load and buttery smooth while in use.
There is a huge amount to say about performance on our www.sky.com estate. So I’m going to relentlessly stick to describing what improvements can be made to the empty-cache rendering time of the prospect homepage and what implications this has for our frontend architecture across the estate. I will not cover: runtime performance, backend performance, or system performance. There is a ton to read about general optimisations, and I may do other posts which focus on this aspect. However, I will try to keep general optimisation tips to a minimum and focus on what we need to do at Sky specifically.
Before and After
I had a chance recently to test some modern techniques for improving front end performance, or critical rendering path, on a mobile device on a 3G connection. As suggested by the introduction, I tested against the prospect homepage. Check out the following - top is before (demo environment), bottom is after (crit environment). Perhaps best to right click and open in another tab!
A few caveats to this film reel. These are not customer-facing speeds. These are test environments. There is no CDN and no gzip, both of which will have a huge impact on the final speeds in both environments. This is also a work in progress. I’m far from done. The point here is to compare.
Let’s play spot the difference for the crit environment (bottom reel):
- the user receives visual feedback that the page is loading 2 seconds earlier.
- the above the fold content is visually complete 8.5 seconds earlier.
- the page loads 11 seconds earlier.
- the masthead loads 3 seconds later.
I should caveat the above points by saying that I’m defining load/loaded to be visually complete, not necessarily interactive.
In this post we’ll accept for now that faster = better, but just briefly:
- Research shows that users experience a mental context switch after 3 seconds.
- 50% of users expect a page to load within 1 second.
Look out for a future post which sources more research and goes into depth here.
The how
Some things were already done in the demo environment:
- minified, concatenated JavaScript and CSS assets with appropriate cache headers.
Some of the steps I took were fairly obvious and part of a very generic approach to performance:
- responsive images, which are kept as small as possible by restricting the dimensions
- interlaced images, which first render a complete, lower quality frame and then progressively sharpen, rather than loading from top to bottom at full quality, which is perceived as slower.
Other steps applied the now well-described techniques for identifying and improving the critical rendering path:
- inlining critical CSS in the document itself (in a style element in the head) at build time. This avoids the extra TCP round trips to the server otherwise needed to obtain the CSS for the above the fold content.
- rendering the above the fold content on the server and deferring the rest to the client. The truth is you have to do this if you want reasonable performance on a mobile connection.
- all JavaScript loaded asynchronously.
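To make the first and last of those steps concrete, here is a minimal build-time sketch. It is not the code used in the proof of concept: the function names and the "critical.css" and "app.js" file names are hypothetical, and a real build would more likely use an HTML parser than regular expressions.

```typescript
// Build-time sketch: swap the <link> for a critical stylesheet with an inline <style>.
// The function name and the href passed in are hypothetical.
function inlineCriticalCss(html: string, href: string, css: string): string {
  // Escape regex metacharacters in the href so it is matched literally.
  const escaped = href.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const linkTag = new RegExp(`<link[^>]*href=["']${escaped}["'][^>]*>`, "i");
  // A replacer function stops "$" sequences in the CSS being read as patterns.
  return html.replace(linkTag, () => `<style>${css}</style>`);
}

// Adds the async attribute to every external script that does not already have it.
// Inline scripts (no src attribute) are left untouched.
function markScriptsAsync(html: string): string {
  return html.replace(
    /<script\b(?![^>]*\basync\b)([^>]*\bsrc=[^>]*)>/gi,
    "<script async$1>"
  );
}
```

Run over the server-rendered page at build time, the first function removes a render-blocking request and the second stops scripts from blocking the parser.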
Other steps were much more Sky specific and deserve their own paragraphs.
The masthead is the bottleneck for all front end performance at Sky. By far the most impactful thing I did to improve the performance of this page was to embed the masthead resources within the app at build time and then consume those resources on the client, deferring the latter until after the critical resources from the application had loaded. I cannot recommend this approach. It took a great deal of finessing to load all the masthead resources asynchronously when they are not designed this way. I settled on CSS tricks, but I’ve tried everything, including timers which poll the page for computed styles to check the HTML and CSS are ready for display. When you are also trying to load your own JavaScript asynchronously using the HTML5 async attribute, but there are dependencies between your code and the masthead, this is more fiddly than it sounds to get right.
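The dependency problem is easier to see in code. Scripts loaded with async can finish in any order, so anything that needs both the masthead and the application has to wait for whichever arrives last. The following is a sketch of one generic way to coordinate that, not the CSS-based approach I settled on; the function name and the "masthead" and "app" labels are hypothetical.

```typescript
// Returns a function that each async script calls when it has finished loading.
// The callback runs exactly once, after every named dependency has reported in.
function createReadyGate(names: string[], onAllReady: () => void): (name: string) => void {
  const pending = new Set(names);
  let fired = false;
  return (name: string): void => {
    pending.delete(name);
    if (!fired && pending.size === 0) {
      fired = true;
      onAllReady();
    }
  };
}
```

Each script would end by calling the returned function with its own name, so the start-up code runs no matter which script the network delivers first.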
We are also very wasteful with fonts. Sometimes we load the same font multiple times. We load every variant (bold, medium, whatever) in a single concatenated file. And we provide many different font formats, which is no longer necessary to support the vast majority of customers. While it’s true some browsers are quite clever about what they do here, the fact remains that to get the vast majority of our text to an acceptable level at load time for the vast majority of browsers, we only need Sky Regular in WOFF format. I encoded and inlined the whole font in the document.
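As an illustration of what inlining a font involves, here is a small sketch that turns the raw bytes of a WOFF file into an @font-face rule with a data URI. It assumes base64 encoding, which is the usual choice but not something the proof of concept is tied to, and both function names are hypothetical. The encoder is written out by hand only so that the example runs anywhere.

```typescript
const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encodes raw bytes as base64 without relying on browser or Node globals.
function toBase64(bytes: number[]): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const hasSecond = i + 1 < bytes.length;
    const hasThird = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = hasSecond ? bytes[i + 1] : 0;
    const b2 = hasThird ? bytes[i + 2] : 0;
    out += BASE64_ALPHABET.charAt(b0 >> 2);
    out += BASE64_ALPHABET.charAt(((b0 & 3) << 4) | (b1 >> 4));
    // Missing trailing bytes are represented by "=" padding.
    out += hasSecond ? BASE64_ALPHABET.charAt(((b1 & 15) << 2) | (b2 >> 6)) : "=";
    out += hasThird ? BASE64_ALPHABET.charAt(b2 & 63) : "=";
  }
  return out;
}

// Builds an @font-face rule whose source is the font itself, so no extra request is made.
function inlineFontFace(family: string, woffBytes: number[]): string {
  const dataUri = `data:font/woff;base64,${toBase64(woffBytes)}`;
  return `@font-face{font-family:"${family}";src:url(${dataUri}) format("woff");}`;
}
```

Base64 makes the font roughly a third larger than the binary file, which is one more reason to inline a single variant rather than the whole family.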
There is an interesting tradeoff in that decision. Modern browsers wait up to 3 seconds to display text if they think they are waiting on an external custom font resource to download. Some people advocate loading in a default font and using CSS tricks to make this as similar as possible to your custom font (the results can be surprisingly good). In this case I opted to delay the text, but display it in the right font when it does appear. Ideally, we’d try both and validate our decision with real user metrics.
Recommendations
I’ll now go into a bit more detail about what we can change culturally, technically, and architecturally as a result of this proof of concept. Some of this is hugely ambitious, some achievable very quickly.
Goals
We need domain specific performance goals. For the prospect homepage, a decision was taken to load in the full bleed image above the fold before worrying about the masthead. That’s because we don’t want people to come to the homepage for email anymore. Indeed, if you are a prospect customer you are unlikely to care about this. We instead want to funnel those customers straight into engagement with our content.
How do you measure such a thing? Page load time was a good proxy for perceived speed in the early 2000s when pages were largely text and images. It is terrible now owing to our friend and foe JavaScript. And, as described above, it was always a poor fit because it is so generic. Instead, we’d be much better off with custom instrumentation which tells you how long that hero image took to load. But failing that we should look at algorithms which measure the time it took for the above the fold content to be visually complete, such as Speed Index.
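For anyone who hasn’t met Speed Index, the idea is to add up how visually incomplete the viewport is over time, so a page that paints most of its content early scores lower (better) than one that paints everything at the end. Here is a minimal sketch of that calculation, assuming we already have visual completeness readings taken from a film strip; the type and function names are hypothetical, and real tools derive completeness from video frames.

```typescript
interface VisualFrame {
  timeMs: number;       // time since navigation start
  completeness: number; // fraction of the final viewport already painted, 0 to 1
}

// Sums (time spent) x (fraction still missing) across the film strip.
// The page is treated as 0% complete until the first frame.
function speedIndex(frames: VisualFrame[]): number {
  const sorted = frames.slice().sort((a, b) => a.timeMs - b.timeMs);
  let total = 0;
  let lastTime = 0;
  let lastCompleteness = 0;
  for (const frame of sorted) {
    total += (frame.timeMs - lastTime) * (1 - lastCompleteness);
    lastTime = frame.timeMs;
    lastCompleteness = frame.completeness;
  }
  return total;
}
```

Two pages that are both visually complete at 2 seconds can score very differently: one that is 50% painted at 1 second scores 1500, while one that is 75% painted at 1 second scores 1250.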
We need aggressive goals that match the business desire for a premium, mobile first, experience. 1 second above the fold visually complete on a good 3G connection (around 150ms RTT) is possible and we should strive for it.
Data
We need performance data to answer two types of question. First, resource specific questions. For example, it’s clear some decisions, like encoding and concatenating font variants in one CSS file, were taken to benefit a cached experience. But we don’t know what proportion of users arrive with a fresh cache. We need data about how many visits are cached versus stale or empty.
The best approach here would be to collect information from the Resource Timing and Navigation Timing APIs using Sky Tags. While, for privacy reasons, the browser won’t explicitly tell you which resources are cached, you can infer this from the time it took a resource to load.
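A rough sketch of that inference is below. It works on plain objects carrying the same name and duration fields as Resource Timing entries, so it can run anywhere. The cut-off is an assumption that would need tuning against real data, and the type and function names are hypothetical.

```typescript
interface ResourceSample {
  name: string;     // resource URL, as in PerformanceResourceTiming.name
  duration: number; // milliseconds, as in PerformanceResourceTiming.duration
}

// Splits resources into "probably served from cache" and "probably fetched over
// the network", using load duration as the only signal.
function splitByLikelyCache(
  samples: ResourceSample[],
  thresholdMs: number // assumed cut-off; anything faster is treated as a cache hit
): { likelyCached: string[]; likelyNetwork: string[] } {
  const likelyCached: string[] = [];
  const likelyNetwork: string[] = [];
  for (const sample of samples) {
    (sample.duration < thresholdMs ? likelyCached : likelyNetwork).push(sample.name);
  }
  return { likelyCached, likelyNetwork };
}
```

In the browser the samples would come from performance.getEntriesByType("resource"), and the split would be sent along with the rest of the analytics payload.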
Second, estate wide questions: for example, the benefits of cross-caching across all our different properties. Could we benefit from a single, rarely changing, cacheable, 3rd party bundle containing widely used resources like Underscore, for example? Or are we better off with everyone doing their own thing?
We need to tie this data to business metrics. If we introduce an artificial 5 second delay to a site for 10% of visitors, what impact does this have on behaviour? Where we make performance improvements, can we A/B test them to validate the work? This is the holy grail because it not only allows us to validate and justify performance work, which has traditionally been on an indefinite backburner, but also to question new features and 3rd party additions to the estate. Perhaps a script adds an advanced analytics capability, but the delay it causes drops sales by 2% and adds an extra 3% of calls to the call centre; oh hai SessionCam. In which case it should be removed.
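The mechanics of that kind of experiment are simple. The sketch below assigns a stable fraction of visitors to the delayed group by hashing a visitor identifier, so the same visitor always lands in the same group. The identifier, the function names, and the choice of an FNV-1a style hash are all assumptions made for illustration.

```typescript
// Maps a string to a number in [0, 1) using a 32-bit FNV-1a style hash.
function hashToUnitInterval(id: string): number {
  let h = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < id.length; i++) {
    h ^= id.charCodeAt(i);
    h = Math.imul(h, 0x01000193); // FNV prime, multiplied modulo 2^32
  }
  return (h >>> 0) / 0x100000000;
}

// True if this visitor belongs to the cohort that receives the artificial delay.
// cohortFraction is the share of traffic to include, e.g. 0.1 for 10%.
function inDelayCohort(visitorId: string, cohortFraction: number): boolean {
  return hashToUnitInterval(visitorId) < cohortFraction;
}
```

Because the assignment is deterministic, the behaviour of the two groups can be compared over days rather than within a single page view.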
Knowledge
We need to marry this data with an understanding of TCP and the browser. Why is inlining your critical CSS into the document such a good step? Because connections on mobile take a long time, as do big files, owing to a TCP algorithm known as slow start. Why should we not load in every font type and variant? Because browser support for WOFF is extensive and browsers delay the appearance of text for up to 3 seconds to wait on fonts. This is a lower level of knowledge than traditionally assumed for performance optimisation at Sky.
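Slow start is worth a worked example. A new connection may only send a small window of data per round trip, and that window roughly doubles each time. The sketch below is an idealised model: it assumes an initial window of 10 segments of 1460 bytes, no packet loss, and no other limits, and it counts one round trip for the TCP handshake while ignoring DNS and TLS. The function names are hypothetical.

```typescript
// Number of round trips needed to deliver a response under idealised slow start.
function slowStartRoundTrips(bytes: number, initialSegments = 10, segmentBytes = 1460): number {
  let windowSegments = initialSegments;
  let remaining = Math.ceil(bytes / segmentBytes);
  let roundTrips = 0;
  while (remaining > 0) {
    remaining -= windowSegments; // one window of segments is sent per round trip
    windowSegments *= 2;         // the window doubles after each round trip
    roundTrips++;
  }
  return roundTrips;
}

// Rough fetch time on a new connection: one handshake round trip plus the transfer.
function estimateFetchMs(bytes: number, rttMs: number): number {
  return (1 + slowStartRoundTrips(bytes)) * rttMs;
}
```

With a 150ms round trip, anything up to about 14KB arrives in a single data round trip, whereas a 64KB file needs three, or roughly 600ms once the handshake is included. That is most of a 1 second budget spent on one resource, which is why the critical content needs to be small and inlined.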
And of course we need to marry this data with an understanding of the user. If there’s a mental context switch after 1 second, what can we do to at least give some visual feedback the page is loading? Does that drop the bounce rate? Do more users go on to buy Sky or are driven to other channels?
Technical changes
The masthead should become much lighter weight. It should no longer be a place to whang every script imaginable. It shouldn’t even be a service. It would be much better placed, not just from a performance perspective, but from a shared ownership perspective, as a light web toolkit component. Here it would no longer be an overloaded shared architectural touch point, but instead just a menu - nothing more.
We also need tooling to integrate with our build pipelines. Much more could be said about this, but I’ll leave it there.
Wrapping up
Speed matters.
Long term, we need to join up psychology, synthetic tests, and real user monitoring in order to inform domain specific goals, measure those goals with modern, useful proxies like Speed Index, A/B test every change for a detrimental performance impact, and compare the results to business metrics.
Short term, I’ve shown we can achieve massive performance gains simply by reorganising the way that resources are loaded onto the page using modern techniques. I’ve not even looked at the code itself where there is sure to be plenty of low hanging fruit. It’s a similar story for a great deal of the estate. We can aim for ambitious, but achievable, more generic performance goals, such as a visually complete above the fold experience on a good 3G connection within 1 second.
Medium term, we can change obvious bottlenecks like the masthead and fonts, and go on making sure standard best practices like gzip and geographical content distribution via a CDN are in place.
I’ve been wanting to do this proof of concept for ages. It’s not finished yet. It has been great fun. Hopefully this post is useful. I’ve teased about doing more posts. If anyone would like to see more, let me know.