The HTTP/2 standard was finalized in May 2015. Most major browsers support it, and Google uses it heavily.
HTTP/2 leaves the basic concepts of Requests, Responses and Headers intact. Changes are mostly at the transport level, improving the performance of parallel requests with few changes to your application. The Go HTTP/2 ‘gophertiles’ demo nicely demonstrates this effect.
A new concept in HTTP/2 is Server Push, which allows the server to speculatively start sending resources to the client. This can potentially speed up initial page load times: the browser doesn’t have to parse the HTML page to find out which other resources to load; instead, the server can start sending them immediately.
This article will demonstrate how Server Push affects the load time of the ‘gophertiles’ demo.
HTTP/2 in a nutshell
The key characteristic of HTTP/2 is that all requests to a server are sent over one TCP connection, and responses can come in parallel over that connection.
Using only one connection reduces the overhead caused by TCP and TLS handshakes. Allowing responses to be sent in parallel is an improvement over HTTP/1.1 pipelining, which requires responses to be returned in the order the requests were sent, so one slow response holds up all the responses behind it.
Additionally, because all requests are sent over one connection, there is a Header Compression mechanism that reduces the bandwidth needed for headers that previously would have had to be repeated for each request.
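The single-connection behaviour can be observed with a small Go program. This is only a sketch under stated assumptions: it uses a throwaway local test server from net/http/httptest rather than the gophertiles demo, the names connCounter, fetch and run are made up for illustration, and it sends one warm-up request first so that the HTTP/2 connection already exists when the parallel requests start.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
)

// connCounter records the distinct client addresses the server has seen.
// Each TCP connection has its own client address, so this counts connections.
type connCounter struct {
	mu    sync.Mutex
	addrs map[string]bool
}

func (c *connCounter) handler(w http.ResponseWriter, r *http.Request) {
	c.mu.Lock()
	c.addrs[r.RemoteAddr] = true
	c.mu.Unlock()
	fmt.Fprintln(w, r.Proto)
}

func (c *connCounter) count() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.addrs)
}

// fetch issues a GET, drains the body and returns the negotiated protocol.
func fetch(client *http.Client, url string) (string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return resp.Proto, err
}

// run sends one warm-up request followed by n concurrent requests and returns
// the protocol in use and the number of connections the server observed.
func run(n int) (proto string, conns int, err error) {
	counter := &connCounter{addrs: map[string]bool{}}
	srv := httptest.NewUnstartedServer(http.HandlerFunc(counter.handler))
	srv.EnableHTTP2 = true // hypothetical local test server, not the demo
	srv.StartTLS()
	defer srv.Close()
	client := srv.Client()

	// Warm-up: establish the HTTP/2 connection before the parallel burst.
	if proto, err = fetch(client, srv.URL); err != nil {
		return "", 0, err
	}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			fetch(client, srv.URL) // errors ignored in this sketch
		}()
	}
	wg.Wait()
	return proto, counter.count(), nil
}

func main() {
	proto, conns, err := run(20)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Printf("protocol: %s, connections used: %d\n", proto, conns)
}
```

Setting EnableHTTP2 to false on the test server makes the same program fall back to HTTP/1.1, where the client typically opens several connections for the parallel requests.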
Server Push
Server Push allows the server to preemptively send a ‘push promise’ (a PUSH_PROMISE frame describing the request the server is about to answer) and an accompanying response to the client.
The most obvious use case for this technology is sending resources like images, CSS and JavaScript along with the page that includes them. Traditionally, the browser would have to first fetch the HTML, parse it, and then make subsequent requests for other resources. As the server can fairly accurately predict which resources a client will need, with Server Push it does not have to wait for those requests and can begin sending the resources immediately.
Of course, sometimes you really do want to fetch only the HTML and not the accompanying resources. There are two ways to accomplish this: the client can specify that it does not want to receive any pushed resources at all (by setting SETTINGS_ENABLE_PUSH to 0), or it can cancel an individual push after receiving the push promise (by resetting the promised stream). In the latter case the client cannot prevent the server from initiating the push, though, so some bandwidth may already have been wasted. This makes it a subtle decision whether to use Server Push for resources that the browser might already have cached.
Demo
To show the effect HTTP/2 Server Push can have, I have extended the gophertiles demo so that its behavior can be tested with and without Server Push. It is available online, hosted on an old Raspberry Pi.
Both the latency of loading the HTML and the latency of loading each tile are now artificially increased.
When visiting the page without Server Push with an artificial latency of 1000ms, you will notice that loading the HTML takes at least one second, and then loading all images in parallel again takes at least one second – so rendering the complete page takes well above 2 seconds.
With Server Push enabled, you will see that after the DOM has loaded, the images are there almost immediately, because they have already been pushed.
All that glitters, however, is not gold: as you will notice when experimenting (especially at lower artificial latencies), while Server Push fairly reliably reduces the complete load time, it sometimes increases the time until the DOM content is loaded. While this makes sense (the browser needs to process the frames relating to the pushed resources), it could have an impact on the perceived performance of your page: for example, it could delay running JavaScript code embedded in your site.
HTTP/2 does give you tools to address this, such as Stream Priorities, but using them may require careful tuning, and they need to be supported by the HTTP/2 library you choose.
Conclusions
HTTP/2 is here today, and can provide a considerable improvement in perceived performance – even with few changes in your application.
Server Push potentially allows you to improve your page loading times even further, but requires careful analysis and tuning – otherwise it might even have an adverse effect.