You often hear that TLS is the most important security protocol. Usually, the reasoning is that it’s very widely deployed and also that it works for many higher-level protocols. That’s certainly true, but for those who work more closely with these protocols there is another important aspect: we can learn so much about protocol design by carefully examining the evolution of TLS.
Update (13 January 2017): Since this post was originally published, the TLS 1.3 specification was amended to change how future protocol versions are negotiated. The new approach bypasses the intolerance problem described below. Thus deployment of TLS 1.3 will proceed largely without fear of breaking compatibility.
Protocol Version Intolerance
One important lesson we learned is never to use explicit protocol version numbers. In TLS, they have been a continuous source of trouble, leading to wasted time and security issues. The root cause is that many servers freak out when a client offers to negotiate a TLS version number they don’t understand. According to the protocol specification, they’re supposed to respectfully decline to use an unknown version and fall back to the best protocol they can offer.
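To make the difference concrete, here is a minimal sketch in Python of the two behaviours. It is not taken from any real TLS library; the function names and the list of supported versions are assumptions made for illustration. Versions are written as (major, minor) wire values, so TLS 1.0 is (3, 1), TLS 1.2 is (3, 3), and TLS 1.3 is (3, 4).

```python
# Hypothetical server configuration: TLS 1.0, 1.1 and 1.2 (illustrative only).
SERVER_SUPPORTED = [(3, 1), (3, 2), (3, 3)]


def negotiate_version(client_version, supported=SERVER_SUPPORTED):
    """Tolerant behaviour: pick the best version not above the client's offer."""
    candidates = [v for v in supported if v <= client_version]
    if not candidates:
        # The client's best version is older than anything we support.
        raise ValueError("no common protocol version")
    return max(candidates)


def intolerant_negotiate(client_version, supported=SERVER_SUPPORTED):
    """Intolerant behaviour: fail outright on any version we do not recognise."""
    if client_version not in supported:
        raise ConnectionError("handshake failure")
    return client_version


if __name__ == "__main__":
    # A client offering TLS 1.3 still gets TLS 1.2 from the tolerant server.
    print(negotiate_version((3, 4)))
    try:
        intolerant_negotiate((3, 4))
    except ConnectionError as exc:
        print("intolerant server:", exc)
```

The tolerant function never needs to know what the unknown version means. It only needs to compare numbers, which is why intolerance is best seen as an implementation defect rather than a protocol limitation.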
Despite this being a server-side problem (and, at a deeper level, a library problem), in practice browsers get all the blame if they can’t communicate with a particular web site. This leads to a behaviour sometimes called voluntary protocol downgrade: after trying their best protocol version and failing, browsers try their second best, then third best, and so on, until hopefully they manage to establish an encrypted channel with the server.
Although this browser behaviour helps with interoperability, it slows down communication and also enables active network attackers to downgrade secure communication to the worst protocol version supported by both the client and server.
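The fallback loop and its weakness can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular browser implemented it: `intolerant_server`, `active_attacker`, and `connect_with_fallback` are hypothetical names, and a handshake is reduced to a function that returns the negotiated version or `None` on failure.

```python
def intolerant_server(supported):
    """Model a server that fails the handshake on any version it does not know."""
    def handshake(offered):
        return offered if offered in supported else None
    return handshake


def active_attacker(handshake, ceiling):
    """Wrap a handshake so that every offer above `ceiling` appears to fail."""
    def tampered(offered):
        if offered > ceiling:
            return None  # the attacker drops or corrupts the connection
        return handshake(offered)
    return tampered


def connect_with_fallback(handshake, client_versions):
    """Try the best version first, then step down until something works."""
    attempts = 0
    for offered in sorted(client_versions, reverse=True):
        attempts += 1
        negotiated = handshake(offered)
        if negotiated is not None:
            return negotiated, attempts
    return None, attempts


if __name__ == "__main__":
    # (major, minor) wire values: SSL 3.0 = (3, 0) ... TLS 1.3 = (3, 4).
    client = [(3, 0), (3, 1), (3, 2), (3, 3), (3, 4)]
    server = intolerant_server({(3, 0), (3, 1), (3, 2), (3, 3)})

    # Without interference: one wasted attempt, then TLS 1.2.
    print(connect_with_fallback(server, client))
    # With an attacker blocking everything above SSL 3.0: downgraded to SSL 3.0.
    print(connect_with_fallback(active_attacker(server, (3, 0)), client))
```

The client cannot tell an intolerant server from an attacker who is deliberately breaking handshakes, because both look like a failed connection. That is why the fallback costs extra round trips in the honest case and hands control of the protocol version to the attacker in the hostile one.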
After many years of struggle, major browser vendors eventually managed to disable the fallbacks. The fallback to SSL v3 was the first to go, prompted by the discovery of the POODLE attack. Other fallbacks were removed later, as you can see on this timeline of events in the SSL/TLS ecosystem.
The version intolerance problems will persist for as long as we continue to publish new protocol versions. Protocol fallbacks are relevant again because TLS 1.3 is close to being finished and we now have to face the harsh reality of many intolerant servers out there.
Testing for version intolerance in SSL Labs
Protocol version intolerance testing has been implemented in SSL Labs for a very long time. If you’re unfortunate enough to run into a server that’s intolerant to some of TLS 1.0, 1.1, or 1.2, you get a big warning. But such servers are getting increasingly rare; after all, most browsers implemented TLS 1.2 by the end of 2013, which means that system owners have had a few years to resolve the intolerance problems.
We also test for intolerance of TLS 1.3 and some other absurdly large version numbers. The TLS 1.3 results are now of immediate interest; the other tests are possibly more of interest to protocol developers.
Given that the release of TLS 1.3 is imminent, we’re planning to make the intolerance warnings more prominent, as well as to take intolerance into account when grading servers. There will soon be a separate post about that.
Version intolerance tracking in SSL Pulse
Our SSL Pulse project, which monitors SSL/TLS configuration on about 150,000 of the most popular web sites, provides version intolerance information from the first ever scan we did in April 2012. You can see what that looks like in the following chart.
You can see that, in our most recent scan (July 2016), 3.2% of servers from our data set don’t like TLS 1.3. A further 3.6% of servers don’t like TLS 2.152. This doesn’t sound like much, but it’s a huge problem for browsers because it translates to thousands of sites, some very popular.
You will also notice a huge drop in the number of intolerant servers in May 2015, which warrants an explanation. When we first started measuring version intolerance, the problem was genuinely much bigger; at its peak, about 12% of servers were intolerant to TLS 1.3 and over 60% of servers were intolerant to TLS 2. With such numbers, there was little chance of introducing a new protocol version smoothly. But then someone realised that the problem could be reduced with only a slight tweak to the TLS 1.3 protocol.
If you look under the hood of any SSL or TLS protocol version, you will see that they all use two different version numbers. There’s one version used for the main protocol (TLS 1.0, 1.1, and so on), but there is also the record layer version. You can think about the record layer as a subprotocol. With two version numbers, there has always been confusion about what exactly to send in each field. Some clients would raise only the main protocol version (e.g., to TLS 1.2), but keep the record layer version at SSL 3 or TLS 1.0, to indicate that, at that layer, nothing changed. However, some clients would use the higher protocol version in both fields. Naturally, in our tests we simulated the worst case.
Anyway, it transpired that most problems arise from record layer intolerance. Given that no one actually needs two version numbers, the TLS 1.3 specification was subsequently changed to deprecate the record layer version number and fix it at TLS 1.0 (0x0301). Then, some time later, in May 2015, we started testing for version number intolerance under the new rules, which caused the big drop.
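To show where the two version numbers sit on the wire, here is a minimal Python sketch that builds and parses only the first few bytes of a ClientHello. It is deliberately incomplete (no random, cipher suites, or extensions), and the function names are assumptions made for illustration. The byte layout follows the TLS record and handshake headers: a 5-byte record header (content type, record version, length), then a 4-byte handshake header (message type, 3-byte length), then the 2-byte client version.

```python
import struct

HANDSHAKE = 22     # record layer content type for handshake messages
CLIENT_HELLO = 1   # handshake message type


def client_hello_prefix(record_version, client_version):
    """Build a truncated ClientHello that carries only the two version fields."""
    hello_body = struct.pack("!H", client_version)  # main protocol version
    handshake = (
        struct.pack("!B", CLIENT_HELLO)
        + len(hello_body).to_bytes(3, "big")
        + hello_body
    )
    # The record header carries its own, separate version number.
    header = struct.pack("!BHH", HANDSHAKE, record_version, len(handshake))
    return header + handshake


def read_versions(record):
    """Return (record layer version, client version) from the bytes above."""
    _content_type, record_version, _length = struct.unpack("!BHH", record[:5])
    (client_version,) = struct.unpack("!H", record[9:11])
    return record_version, client_version


if __name__ == "__main__":
    # Worst case simulated in the old tests: the new version in both fields.
    old_style = client_hello_prefix(0x0304, 0x0304)
    # New rules: record layer fixed at TLS 1.0 (0x0301), TLS 1.3 offered inside.
    new_style = client_hello_prefix(0x0301, 0x0304)
    print(old_style.hex(), [hex(v) for v in read_versions(old_style)])
    print(new_style.hex(), [hex(v) for v in read_versions(new_style)])
```

A server that checks the record header before it even looks at the handshake message will reject the first form and accept the second, which is consistent with the drop seen once testing switched to the new rules.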