I have a confession to make: I fear that HTTP Public Key Pinning (HPKP, RFC 7469)—a standard that was intended to bring public key pinning to the masses—might be dead. As a proponent of a fully encrypted and secure Internet I have every desire for HPKP to succeed, but I worry that it’s too difficult and too dangerous to use, and that it won’t go anywhere unless we fix it.
What is public key pinning?
Before I go on, let’s briefly discuss why we need public key pinning in the first place. The problem is with the way we manage trust: we have hundreds of CAs and each of them is able to issue a certificate for any web site in the world. Technically, owner permission is not necessary. Now, I think this system actually works rather well, which is evident from the fact that the rate of incidents is very low. But fraudulent certificates can be created in one way or another, and that’s not good enough for high-profile web sites. Further, technical people have trouble relying on a system that’s not foolproof.
Enter public key pinning, which is a technique that enables site owners to have a say in which certificates are valid for their sites. For example, in one of the possible deployment options, you choose two or more CAs to trust; after that, any certificate issued by anyone else is ignored. What’s not to like?
Public key pinning started at Google, and they first implemented it in Chrome, pinning their own web sites. Their approach is an example of static pinning; the pins are not easy to change because they’re embedded in the browser. Chrome’s pinning has served us well over the years, uncovering many cases of fraudulent certificates that would otherwise perhaps fly under the radar. Google also allowed (and still allows) other organisations to embed their pins in Chrome. These days, Firefox also supports static pinning, drawing from the same pins maintained by Google.
Although static pinning works well, it has a problem: maintaining pins is a slow manual process that doesn’t scale. For that reason Google also sparked the IETF work that led to RFC 7469, officially known as “Public Key Pinning Extension for HTTP”, but which everybody calls just HPKP. HPKP is an example of dynamic pinning; web site owners can set the pins at will.
What is the problem with HPKP then?
The main problem with HPKP, and with pinning in general, is that it can brick web sites. The culprit is the memory effect: pins, once set, remain valid for a period of time. Each pin is associated with a unique cryptographic identity that the web site must possess to continue operation. If you lose control of these identities, you effectively also lose your web site.
Clearly, pinning introduces a paradigm shift. Without it, TLS is quite forgiving—if you lose your keys you can always create a new set and get a fresh certificate for them. With pinning, your keys become precious.
There is some relief in the fact that a valid HPKP configuration must include at least one backup key. The idea is that, if something goes seriously wrong, you fetch your securely-stored backup key and resume normal operation.
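To make the backup-key requirement concrete, here is a minimal Python sketch of how a pin and a policy header could be produced. In RFC 7469 a pin is the Base64-encoded SHA-256 digest of the DER-encoded SubjectPublicKeyInfo (SPKI) of a key, and the policy travels in the Public-Key-Pins response header. The function names are hypothetical, and the sketch assumes you have already extracted the SPKI bytes of your keys with some other tool.

```python
import base64
import hashlib
from typing import Iterable


def pin_sha256(spki_der: bytes) -> str:
    """Return the Base64 SHA-256 digest of a DER-encoded SubjectPublicKeyInfo."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def build_hpkp_header(active_spki: bytes, backup_spkis: Iterable[bytes],
                      max_age: int, include_subdomains: bool = False) -> str:
    """Build a Public-Key-Pins header value (hypothetical helper).

    Refuses to build a policy that has no backup pin distinct from the
    active key, mirroring the backup requirement described above.
    """
    pins = [pin_sha256(active_spki)]
    for spki in backup_spkis:
        pin = pin_sha256(spki)
        if pin not in pins:  # skip duplicates of keys already listed
            pins.append(pin)
    if len(pins) < 2:
        raise ValueError("at least one backup pin distinct from the active key is required")
    parts = ['pin-sha256="%s"' % p for p in pins]
    parts.append("max-age=%d" % max_age)
    if include_subdomains:
        parts.append("includeSubDomains")
    return "; ".join(parts)
```

Calling build_hpkp_header(active, [backup], 5184000) yields two pin-sha256 directives followed by max-age=5184000. The backup key itself should never be on the server; only its hash appears in the header.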
Even if you don’t lose your pinning keys, you have to be careful how and when you’re changing them. Your configuration must, at any time, be offering at least one pin that matches the configuration you offered to all your previous users. If you rotate the keys too quickly you risk not having a pin for some of your older visitors.
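The rotation rule can be expressed as a simple check. The sketch below is illustrative only: it assumes you keep a record of every policy you have served (the ServedPolicy type and its fields are my own invention) and it asks whether the keys in the certificate chain you are about to deploy still satisfy each policy that some visitor might have cached.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServedPolicy:
    last_served: int   # Unix time at which this policy was last sent to any client
    max_age: int       # max-age of the policy, in seconds
    pins: frozenset    # pin-sha256 values listed in the policy


def unsatisfied_policies(history, new_chain_pins, now):
    """Return past policies that may still be cached and that the new chain cannot satisfy.

    new_chain_pins holds the pin-sha256 values of every key in the
    certificate chain you plan to serve. An empty result means no visitor
    holding an unexpired policy would be locked out.
    """
    new_chain_pins = set(new_chain_pins)
    broken = []
    for policy in history:
        still_cached = policy.last_served + policy.max_age > now
        if still_cached and not (policy.pins & new_chain_pins):
            broken.append(policy)
    return broken
```

If the function returns anything, the rotation is too fast: either wait until the listed policies expire or keep one of their pinned keys in the new chain.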
To sum up, HPKP is not for the faint of heart; you essentially need to know what you’re doing and be careful about it.
Talking about knowing what you’re doing, HPKP is also too flexible about what you can do with it. With it you can pin any public key in the certificate chain, choosing from your own keys (the leaf certificate), the intermediate certificates, or the root. Each decision comes with its advantages and disadvantages, but you need to understand PKI very well to appreciate them. This flexibility is a point of great confusion that in practice often leads to paralysis (“What to do?”). Some sites will inevitably make the wrong choice and suffer for it.
Thus, a successful pinning strategy requires that you:
- Build a threat model to determine if there is a real threat out there that pinning can address at an acceptable cost
- Understand PKI and HPKP and choose the right place to pin
- Avoid losing your pinning keys
- Keep backup keys in a separate location in case of server compromise
- Have a robust plan for the key rotation and execute it smoothly
The above steps aren’t terribly difficult to carry out, but the stakes are pretty high. A serious mistake with pinning can lead to the business shutting down. The deployment numbers reflect this; in August 2016 Scott Helme found only 375 sites with HPKP deployed, along with 76 sites with HPKP running in report-only mode. Contrast these numbers with about 30,000 instances of HSTS in the same data set.
A potentially bigger problem with HPKP is that it can be abused by malicious actors. Let’s say, for example, that someone breaks into your server (a very common occurrence) and thus gains control of your web site. They can then silently activate HPKP and serve pinning instructions to a large chunk of your user base. It’s very unlikely that you will detect this. After a long-enough period, they remove the pinning keys from the server and brick your web site just for the fun of it. Or, if you’re lucky, they seek ransom, giving you a chance to get the backup pinning key from them and keep your business.
The HPKP working group was aware of this problem (one early example here), but didn’t include a mitigation mechanism in the standard. Early HPKP drafts had specified a ramp-up period; pinning had to be observed over a period of time to become fully operational. That feature was eventually removed, probably because it wasn’t quite clear how to effectively assess continuity. In the end, the RFC specified that one active pin and one inactive (backup) pin are sufficient to activate pinning for essentially any duration. The “Hostile Pinning” section of the RFC mentions this problem almost in passing, noting that sites should be able to recover after the maximum policy duration expires. The RFC leaves it to browsers to decide on their maximum max-age and has a very soft recommendation to cap it at 60 days.
We are not seeing attacks just yet because HPKP is relatively new, but the word is starting to get out. Just this month there was a talk at DEFCON about what they called RansomPKP. If you’re interested in this topic, you should also read this follow-up from Scott Helme, who provided more details.
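To show how little is needed to activate a policy, here is a simplified Python sketch of what a browser might do with a received header. It is not a complete RFC 7469 parser: it only extracts pin-sha256 and max-age, the function names are hypothetical, and the 60-day cap stands in for whatever limit a given browser actually chooses.

```python
import re

SIXTY_DAYS = 60 * 24 * 60 * 60  # 5,184,000 seconds; an assumed browser cap

_PIN_RE = re.compile(r'pin-sha256="([A-Za-z0-9+/=]+)"', re.IGNORECASE)
_MAX_AGE_RE = re.compile(r'(?:^|;)\s*max-age\s*=\s*"?(\d+)"?', re.IGNORECASE)


def parse_policy(header_value, cap=SIXTY_DAYS):
    """Extract pins and a capped max-age; return None if max-age is missing."""
    match = _MAX_AGE_RE.search(header_value)
    if match is None:
        return None
    return {
        "pins": set(_PIN_RE.findall(header_value)),
        "max_age": min(int(match.group(1)), cap),
    }


def would_be_noted(policy, chain_pins):
    """Check the activation rule: one pin in the current chain, one pin outside it."""
    chain_pins = set(chain_pins)
    has_active = bool(policy["pins"] & chain_pins)
    has_backup = bool(policy["pins"] - chain_pins)
    return has_active and has_backup
```

Note that nothing in this check looks at history. A policy seen once, on a single connection, is enough, which is exactly what makes hostile pinning practical.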
So why do I say that HPKP is dead?
I think that, ultimately, HPKP requires too much effort and that only a small number of sites will ever deploy it. At the same time, it can be—in the current form—used as a powerful weapon against everybody.
The irony of HPKP is that it’s not going to be used by many sites (because it’s too taxing), but that it can be used against the long tail of millions of small sites who are not even aware that HPKP exists. For the small number of sites who are using pinning, it’s just as likely that static pinning would work well, with less fuss and no danger for the rest of the Web.
Can HPKP be fixed?
To fix HPKP we need to 1) make it easier and less dangerous to deploy and 2) have a way to deal with potential malicious use.
For the latter, one possibility is to “dull” HPKP so that it remains useful while its really dangerous aspects are addressed. There’s a variety of ways in which this could be done. For example, we could reintroduce the ramp-up mechanism. Another solution might be to restrict pinning to those who can demonstrate a level of security knowledge and operational proficiency, for example those who are already preloading HSTS. And perhaps browsers can build an undo mechanism that could be used to override broken pins.
Making HPKP easier to use probably means allowing sites to deploy safer pinning, for example pinning to CAs, not their roots. In fact, that’s largely how static pinning is done right now. It goes like this: you choose 2-3 CAs and require that only they can issue valid certificates for your site. This approach is not as secure as pinning to the leafs (your own keys), but it vastly reduces the attack surface and with much less effort. (This idea had also been discussed during the development of HPKP, but it was rejected because it wasn’t a purely technical solution and required the collaboration of many organisations.)
Technically, pinning to CAs is possible today provided you have a very good understanding of PKI. The key issue is understanding how different root stores include different root keys, how root keys change, and so on. I have some hope that with more public information about these topics and with help from interested CAs we can make this safer style of pinning possible, even without any changes to HPKP.
What can you do today?
Leaving all the possibilities of the future aside, let’s try to figure out what we can do today. Here are some ideas you can consider to make yourself safer:
- First, you could have a monitoring system in place to audit your configuration and detect unwanted pinning. For large enterprises this could be a good idea anyway, because pinning (and other security technologies) could be deployed by an eager developer, without organisation-wide coordination. (As an example, both HPKP and HSTS have a memory effect and also support the includeSubDomains directive, which could make a configuration spread to an entire domain name, even to servers controlled by other teams.)
- You could front your sites with a reverse proxy and make sure that the HPKP response header is never sent to your users. This defence measure won’t address all the possible attack vectors (e.g., if someone redirects the web site to other servers and abuses a misissued certificate), but it would prevent escalation from server take-over.
- If you don’t mind your hand being forced, the pinning itself can be an effective defence against malicious pinning. All the attack vectors that include DNS hijacking and fraudulent certificates can be detected if you’re already using pinning. Sadly, an attacker who takes control over your pins (by compromising your servers) can rotate the pins to those they control. (But the previous control, the reverse proxy, helps in that case.)
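The first two ideas come down to the same small piece of logic: recognising the HPKP response headers. The Python sketch below works on a list of (name, value) header pairs, which is an assumption about how your monitoring tool or proxy exposes responses; the function names are hypothetical. RFC 7469 defines two header names, Public-Key-Pins and Public-Key-Pins-Report-Only, and the sketch treats both as worth flagging.

```python
# Header names defined by RFC 7469, lowercased for case-insensitive matching.
PINNING_HEADERS = {"public-key-pins", "public-key-pins-report-only"}


def find_pinning_headers(headers):
    """Monitoring: return the (name, value) pairs that set an HPKP policy."""
    return [(name, value) for name, value in headers
            if name.strip().lower() in PINNING_HEADERS]


def strip_pinning_headers(headers):
    """Proxy filter: return the headers with every HPKP header removed."""
    return [(name, value) for name, value in headers
            if name.strip().lower() not in PINNING_HEADERS]
```

A monitoring job would alert whenever find_pinning_headers returns a non-empty list for a site that is not supposed to pin. A reverse proxy would apply strip_pinning_headers to every upstream response. Removing the report-only header as well is harmless, because that header never enforces anything.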
Thanks to Ryan Sleevi and Scott Helme for reading early drafts of this post.