On June 8, early risers on the east coast of the United States woke up to a broken internet. A wide range of websites, including Amazon, Reddit, and The New York Times, were unavailable. The Drudge Report headline, for those who could access it, screamed, “GLOBAL WEB FAILURE – MEDIA, GOVERNMENT WEBSITES DOWN.” The brief but widespread outage was traced to a single bug affecting services managed by Fastly, a cloud service provider. Along with competitors such as Cloudflare and Akamai, Fastly offers customers a sort of internet fast lane, reducing website load times by storing copies of requested content on a highly optimized network of servers spread across the globe.
A handful of these content delivery networks (CDNs) support many of the world’s most visited websites, and for the most part they have made using the internet faster and less frustrating. Because content is cached close to users, someone in Seoul and someone in Manhattan can both watch New York Times videos seamlessly. But with these benefits come distinct risks. When the accessibility of major sites depends on the performance of a single provider, that provider becomes a single point of failure for a large part of the internet. A difficult trade-off emerges: websites using CDNs tend to perform better the vast majority of the time, but when they fail, they fail en masse. All the eggs now sit in a comparatively stronger basket.
In a recent study, we and our co-authors Samantha Bates, Shane Greenstein, Jordi Weinstock, and Yunhan Xu examine the structural risks associated with the emergence of a small handful of dominant cloud providers. The study provides a window into trends in “internet entropy” – the degree to which the hosting of major online destinations is distributed rather than concentrated. The internet was designed from the start to be radically decentralized and, therefore, robust to the failure of individual components. If a router goes down, packets can usually find an alternate route to their destination. If Bank of America’s online check deposit service goes down, it doesn’t take Capital One’s down with it. But as more and more websites, each acting reasonably on its own, outsource their hosting and networking to a handful of cloud service providers, that design paradigm is eroding.
To track the decline of the internet’s entropy, the study examines trends in how an index of the world’s 1,000 busiest websites implemented a critical piece of network technology – the Domain Name System (DNS), through which users find them – between November 2011 and July 2018. DNS allows web browsers to translate between human-friendly domain names (such as www.google.com) and machine-friendly numerical IP addresses that let them locate and access websites’ servers. Websites designate nameservers that map their widely advertised domain names – the way people find them – to the IP addresses needed to reach them. Websites can delegate this task, if they wish, to a cloud service provider. Like Fastly’s content delivery service, DNS is an essential one: a website without functioning DNS servers is inaccessible to most users.
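The study’s core measurement – whether a site runs its own DNS or outsources it – can be sketched by comparing a site’s domain against the domains of its nameservers. The sketch below is illustrative only: the domains and nameserver lists are hypothetical, and real measurement would need the Public Suffix List rather than a crude last-two-labels heuristic.

```python
# Sketch: classify a domain's DNS hosting as self-hosted, external, or both,
# based on its NS (nameserver) records. Hypothetical data; the last-two-labels
# heuristic is a rough stand-in for proper public-suffix handling.

def registered_domain(hostname: str) -> str:
    """Crude approximation: keep only the last two labels of a hostname."""
    return ".".join(hostname.rstrip(".").split(".")[-2:])

def classify_dns_hosting(site: str, nameservers: list[str]) -> str:
    """Return 'self-hosted', 'external', or 'both'."""
    site_domain = registered_domain(site)
    kinds = {"self" if registered_domain(ns) == site_domain else "external"
             for ns in nameservers}
    if kinds == {"self"}:
        return "self-hosted"
    if kinds == {"external"}:
        return "external"
    return "both"

# Hypothetical examples (provider names invented):
print(classify_dns_hosting("example.com",
                           ["ns1.example.com", "ns2.example.com"]))        # self-hosted
print(classify_dns_hosting("example.com",
                           ["dns1.provider-a.net", "dns2.provider-a.net"]))  # external
print(classify_dns_hosting("example.com",
                           ["ns1.example.com", "dns1.provider-a.net"]))      # both
```

Applied to NS records collected for each of the top 1,000 sites at each point in time, a classifier along these lines yields the self-hosted/external/both breakdown shown in Figure 1.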
Our data, shown in Figure 1, reflect a massive and continuous shift from self-hosted DNS to reliance on external cloud service providers. While only around 32.9% of websites relied exclusively on a cloud service provider to manage their DNS in 2011, almost 66% did so by 2018.
Figure 1. Percentage of domains relying on self-hosted DNS, externally hosted DNS, or both over time.
This move towards external DNS hosting has coincided with the emergence of a concentrated group of dominant cloud service providers managing a growing fraction of the DNS services market. The lines in Figure 2 represent what economists call “concentration ratios” – the percentages of market share captured by the top one, four, and eight suppliers, respectively – over the period we studied. As of November 2011, the eight largest external DNS providers collectively hosted DNS for just over 24% of the top 1,000 websites. By July 2018, that proportion had risen to around 61.6%.
Figure 2. Percentage of overall market share (CR) held by the top one, four, and eight vendors over time.
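A concentration ratio CR_k is simply the combined market share of the k largest suppliers. A minimal sketch, using hypothetical shares rather than the study’s data:

```python
def concentration_ratio(shares: list[float], k: int) -> float:
    """CR_k: combined market share of the k largest suppliers.
    `shares` are market-share fractions, one per supplier."""
    return sum(sorted(shares, reverse=True)[:k])

# Hypothetical provider shares of an external DNS hosting market:
shares = [0.25, 0.15, 0.10, 0.08, 0.05, 0.04, 0.03, 0.02, 0.02]
print(round(concentration_ratio(shares, 1), 2))  # 0.25
print(round(concentration_ratio(shares, 4), 2))  # 0.58
print(round(concentration_ratio(shares, 8), 2))  # 0.72
```

Plotting CR1, CR4, and CR8 at each measurement date produces lines like those in Figure 2; rising curves indicate a market consolidating around fewer providers.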
Taken together, these two trends – the move towards external DNS hosting and the concentration of the external DNS hosting market – amount to a huge reduction in internet entropy. A once-distributed system is now channeled, to an increasing extent, through the infrastructure of a small group of cloud service providers. This matters for the same reason that websites’ dependence on a few dominant CDNs does: when a major external DNS provider goes down, so do the many sites that depend on it. And the threat of massive DNS outages is not theoretical: a distributed denial-of-service (DDoS) attack on DNS provider Dyn in October 2016 caused catastrophic outages for websites across the United States. (“INTERNET FAILURE,” Drudge cried.)
Unfortunately, websites haven’t used the tools at their disposal to mitigate the risk of future massive outages. The DNS protocol explicitly contemplates the use of backup DNS servers that take over automatically when the main servers go down, keeping a site accessible. Despite a slight increase in the practice following the Dyn attack (reflected in Figure 3), the use of secondary DNS – which would allow websites to “diversify” their DNS across multiple providers or self-hosted backup servers – remains marginal. The reason is not entirely clear, but the trend could be related to vendor lock-in among cloud hosting providers, resource limitations, a general lack of awareness, or a combination of these and other factors.
Figure 3. Percentage of domains using one, two, or three or more DNS providers over time.
In the case of DNS, the way forward in the face of declining internet entropy seems relatively clear. While the use of secondary DNS may not restore entropy per se, it has the potential to dramatically increase resilience to internet-wide errors and attacks. External DNS providers should encourage the use of secondary DNS by making it as easy as possible to set up a backup provider. In most cases, provisioning secondary DNS should be relatively inexpensive.
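Mechanically, diversifying DNS amounts to publishing NS records that point at more than one provider, so resolvers can fall back automatically if one provider fails. A hypothetical zone fragment (domain and provider names invented for illustration) might look like:

```
; Hypothetical delegation for example.com: nameservers split across
; two independent DNS providers, so the failure of either provider
; alone does not make the domain unresolvable.
example.com.   86400  IN  NS  dns1.provider-a.net.
example.com.   86400  IN  NS  dns2.provider-a.net.
example.com.   86400  IN  NS  ns1.provider-b.org.
example.com.   86400  IN  NS  ns2.provider-b.org.
```

Resolvers treat the listed nameservers as interchangeable, so no client-side configuration is needed; the redundancy lives entirely in the domain’s delegation records.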
The fact that secondary DNS is readily available to many websites makes it all the more striking that adoption more or less stagnated in the months following the Dyn attack. That so many websites haven’t adopted such a simple and effective mitigation strategy is testament to an all-too-common phenomenon in the world of cybersecurity (and, more broadly, engineering): unpatched, poorly designed, or otherwise neglected systems force those charged with maintaining and defending them into a reactive rather than a proactive posture. Even well-resourced organizations struggle to learn from the mistakes of others, leading them down the same dangerous paths again. Indeed, our analysis found that more than half of the websites that added a secondary DNS provider in the wake of the Dyn attack were the ones that had paid the price directly – sites that relied exclusively on Dyn at the time of the attack. By contrast, undiversified websites that had dodged the bullet by relying on another provider diversified at considerably lower rates.
And with industry-wide DNS resiliency still seemingly out of reach, it’s not reassuring that tackling the decline of internet entropy may prove even more difficult for internet technologies beyond DNS, including the much more complex and expensive content delivery services provided by the likes of Fastly. It is probably not possible to put the genie back in the bottle and “re-decentralize” the web, although some nobly try. Even so, asking dominant cloud service providers to “do better” is unlikely to be enough. Bugs, cyberattacks, and human error plague even the most sophisticated operations. Businesses and other organizations that rely on consistent website availability should look for ways to guard against the failure of the cloud service providers they depend on, including building redundancy into their systems and developing contingency plans that account for provider failures.
Ultimately, markets on their own may not be up to the task of formulating and adopting new best practices for this era of declining internet entropy. That is especially true in cybersecurity, where those who build services must keep pace with those who wish to break them. Comprehensive cybersecurity legislation, long a chimera in the United States and elsewhere, remains a distant prospect for now, but the government could get the ball rolling by revisiting its own procurement policies. In a promising display of attention, Congress passed a law in December 2020 establishing baseline security standards for “Internet of Things” devices purchased for federal use.
The road to acclimating to the internet’s new, more centralized architecture will be long and uncertain.