Tim Berners-Lee recently posted a paper about the movement to upgrade the web to TLS. He argued that we should drop the https prefix from encrypted URLs in favour of sticking with http, have the browser always try for a TLS connection first, and signal to the user accordingly.
In general, I agree, but I’d look at a lot of the same points from a different perspective.
Modern websites, especially media websites like the FT, often include dozens or even hundreds of third-party components. Content that is fully or partly funded by advertising can even include sub-resources from different third parties from one request to the next. This is one reason why some news providers are still “stuck” on HTTP: the way security is signalled to users ranks a partly secure page as worse than an entirely insecure one. A move to optimistically upgrade every connection to TLS regardless of the prefix would necessarily be accompanied by a change in the policy on UI signals shown to the end user, which would be welcome and would help big sites upgrade to TLS more incrementally.
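To make the optimistic-upgrade idea concrete, here is a minimal sketch, in TypeScript, of a client that tries TLS first and falls back to plaintext only if the secure connection fails. The optimisticFetch helper is a made-up name and the fallback policy is an assumption, not any browser's actual algorithm (a real implementation would also have to consider timeouts, downgrade attacks and remembering the result):

```typescript
// Sketch only: try the same resource over TLS first, fall back to plaintext.
// `optimisticFetch` is a hypothetical helper, not a real browser API.
async function optimisticFetch(url: string): Promise<Response> {
  const u = new URL(url);
  if (u.protocol === "http:") {
    const secure = new URL(u.href);
    secure.protocol = "https:"; // same host, path and query; only the scheme changes
    try {
      return await fetch(secure.href);
    } catch {
      // TLS unavailable or handshake failed: fall back to the original scheme
    }
  }
  return fetch(u.href);
}
```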
For example, if the page is secure but a number of subresources are not, or the page loads an IFRAME that is not secure, then currently a browser will get very upset, and express that upset in the form of loud nannying UI. If instead it displayed the page in the same manner as if the whole lot had been insecure, then as developers of that site we no longer have a perverse incentive to stick with HTTP for the page.
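To see concretely what the browser is objecting to, a page script can enumerate the offending subresources itself. A rough sketch, assuming a browser environment (real mixed-content detection covers many more cases, such as stylesheets and script-initiated requests):

```typescript
// Sketch: list the http:// subresources of an https:// page, i.e. the ones
// that trigger mixed-content warnings. Browser environment assumed.
const insecureSubresources = Array.from(
  document.querySelectorAll("img[src], script[src], iframe[src]")
)
  .map((el) => new URL(el.getAttribute("src")!, location.href))
  .filter((u) => u.protocol === "http:")
  .map((u) => u.href);

console.log(insecureSubresources);
```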
I don’t agree that moving to HTTPS breaks “the web of links”. Practically since its inception HTTP has had redirects, and most links that cross domain boundaries these days originate from sources that curate them automatically (i.e. search engines), so a migration to HTTPS signalled by a redirect will rapidly be detected and optimised out of the workflow by the search crawler. Manually curated links suffer a few hundred milliseconds of additional latency. Not a big deal.
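For illustration, the redirect that keeps old links working is a few lines on most platforms. A sketch using Node's built-in http module, with a placeholder host name:

```typescript
import { createServer } from "node:http";

// Sketch: answer every plaintext request with a permanent redirect to the
// same path over TLS. Crawlers follow the 301 and update their index, so
// old http:// links keep working at the cost of one extra round trip.
createServer((req, res) => {
  res.writeHead(301, { Location: `https://example.com${req.url ?? "/"}` });
  res.end();
}).listen(80);
```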
It’s also questionable in my mind whether the https prefix is a misuse of the URI or not – after all, it might be possible to fetch a document via FTP as well as HTTP, resulting in two URLs that are identical apart from the scheme. If https should not be in the URL, then should the URL contain protocol information at all? It’s an argument that rapidly becomes pointlessly academic, like the view that URLs should never end in a file extension like .html or .jpg because that’s what content negotiation and the Content-Type header are for. Yes, but millions of sites do it, no-one dies, so let’s move on.
In practice I don’t think having the https scheme is really a problem. It’s another untidiness in a web that is basically held together by untidinesses. But here’s the bit that matters: if we begin to use TLS on http URLs, then the https prefix can be seen simply as a form of client-initiated HSTS. Browsers could display http regardless (even if under the hood they make an https request), and treat two otherwise identical URLs as the same resource.
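If the https prefix is client-initiated HSTS (the server-initiated kind being the real Strict-Transport-Security response header), then browser-side stores such as history and caches could collapse the two schemes into one resource identity. A sketch of that normalisation, with a made-up function name:

```typescript
// Sketch: collapse http/https to a single canonical key, so two otherwise
// identical URLs are treated as the same resource. Illustrative only; the
// scheme becomes a display/storage detail and transport is decided separately.
function canonicalKey(url: string): string {
  const u = new URL(url);
  if (u.protocol === "https:") {
    u.protocol = "http:";
  }
  return u.href;
}

canonicalKey("https://example.com/page"); // => "http://example.com/page"
canonicalKey("http://example.com/page");  // => "http://example.com/page"
```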
The UX of this needs careful planning – does it benefit the user to see the http/https in the URL bar? I don’t know. But none of this potential future change is derailed by encouraging site owners to move to HTTPS URLs in the interim, so that still deserves wholehearted support.