Fastly brings down Amazon, Twitter, Twitch and many more

author: MedUX

Fastly brings down Amazon, Twitter, Twitch and many more: MedUX impact assessment

  • Fastly’s CDN went down between 12 PM and 1 PM CET on Tuesday, June 8th, 2021, disrupting web browsing and content delivery globally for approximately an hour.
Article written in collaboration with Xuehong Jiang, MedUX Business Analyst and Consultant.

MedUX technology has detected the Fastly service disruption, which has brought down a number of major websites around the world, including news, government, and social media websites.

Fastly, a leading content delivery network (CDN) provider, runs an edge cloud between data centers and end users, moving content closer to the user to reduce latency and improve the customer experience.

However, today’s Fastly outage prevented users worldwide from accessing hundreds of sites, including Amazon, Twitch, Reddit, gov.uk, PayPal, HBO Max, The New York Times, The Financial Times, and eBay at an international level, and El Mundo, Marca, and LaSexta in Spain, among others.

Websites have already recovered from this major service outage, which significantly impacted customer experience. Please stay tuned or get in touch with us at hello@medux.com for a more detailed analysis or impact assessment.

Preliminary analysis of customer experience impact

Fastly’s outage had a major impact on the user experience, as can be seen in the graphs below. For illustrative purposes, we have selected some performance indicators and websites in Spain to understand how the service outage impacted the user experience.

According to our real-time dataset, the services impacted included Twitch, Marca (Spanish sports newspaper), El Mundo (Spanish newspaper), and Yahoo. All graphs below show that the customer experience was affected the most between 12 PM and 1 PM CET, as end users either could not access the websites at all or saw loading times change significantly.

 

Website Availability (Completed Tests)

  • Twitch: Tests against www.twitch.tv started to fail around 11:50 AM, with an outage duration of approximately 70 min, although some abnormal spikes around 9 AM and 10:30 AM had already foreshadowed the disruption.
  • El Mundo: Availability of www.elmundo.es appears to have been affected for a shorter period (30 min).
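
As a rough illustration of how an outage window like the ones above can be derived from periodic availability tests, the Python sketch below groups consecutive failed tests into windows. The sample timestamps, the 10-minute test interval, and the function name are hypothetical assumptions for illustration; they are not MedUX data or the MedUX method.

```python
from datetime import datetime, timedelta


def outage_windows(results, interval):
    """Group consecutive failed tests into (start, end, duration) windows.

    results: list of (timestamp, succeeded) tuples sorted by time.
    interval: assumed spacing between tests; each failed test is counted
    as covering one full interval.
    """
    windows = []
    start = last = None
    for ts, ok in results:
        if not ok:
            if start is None:
                start = ts  # first failure opens a new window
            last = ts
        elif start is not None:
            end = last + interval
            windows.append((start, end, end - start))
            start = last = None
    if start is not None:  # series ended while tests were still failing
        end = last + interval
        windows.append((start, end, end - start))
    return windows


# Hypothetical sample: one test every 10 minutes, failing from 11:50 to 12:50.
t0 = datetime(2021, 6, 8, 11, 30)
step = timedelta(minutes=10)
sample = [(t0 + i * step, not (2 <= i <= 8)) for i in range(12)]

windows = outage_windows(sample, step)
for start, end, duration in windows:
    print(start.time(), end.time(), duration)
```

Counting each failed test as one full interval is a simplification: the true outage boundaries lie somewhere between the last successful test and the first failed one, so shorter test intervals give tighter estimates.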

Web Browsing Time

  • Yahoo: Web loading time of www.yahoo.com increased by roughly 70%, from typical values of 700 ms to 1200 ms.
  • Marca: Web loading time of www.marca.com dropped sharply while the site was showing the Fastly connectivity error, presumably because an error page loads much faster than the full site, and loading times continued to show an aftereffect once the outage was over.

As stated in TechCrunch’s article ‘Twitch, Pinterest, Reddit and more go down in Fastly CDN outage’, “CDNs are a key part of the internet infrastructure. […] Even though the web is a digital platform, it’s very physical by nature. When you load a page on a server on the other side of the world, it’s going to take hundreds of milliseconds to get the page. Over time, this latency adds up and it feels like a sluggish experience. When a page is already cached, a CDN can usually start sending the content of the page in less than 25 milliseconds.”

At MedUX we are already testing a new CDN performance test, which will soon be available to all our clients. It is designed to measure and benchmark several CDNs on metrics that are key to the overall customer experience: availability, responsiveness, and throughput when downloading content cached in different CDN providers.

We are already doing trials for some of our clients to help them determine whether collaborating with certain providers, such as Akamai CDN, Amazon CloudFront, Google Cloud CDN, Cloudflare CDN, Microsoft Azure CDN, and Alibaba Cloud, would lead to a meaningful improvement in end-user experience.

Disruption overview

The issue was first detected at around midday in Europe (CET). “We’re currently investigating potential impact to performance with our CDN services”, Fastly reported.

When users tried to access the websites mentioned above, including digital newspapers and governmental and institutional sites, the servers returned the error message “503 – service unavailable”. This status indicates that the server handling the request is temporarily unable to serve it, which prevented users from accessing the content.
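
To make the meaning of the 503 status more concrete, here is a minimal Python sketch of how a monitoring probe might separate a “service unavailable” response from other outcomes. The function name and the label strings are hypothetical; only `http.HTTPStatus` comes from the Python standard library.

```python
from http import HTTPStatus


def classify_response(status_code):
    """Map an HTTP status code to a coarse availability label."""
    if 200 <= status_code < 400:
        return "available"
    if status_code == HTTPStatus.SERVICE_UNAVAILABLE:  # 503
        return "service unavailable"
    if 500 <= status_code < 600:
        return "server error"
    if 400 <= status_code < 500:
        return "client error"
    return "other"


# The standard reason phrase that accompanies a 503 response.
print(classify_response(503), "-", HTTPStatus(503).phrase)
```

A 503 is a temporary condition by definition, so a probe would normally keep retrying rather than treat the site as permanently gone.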

It appears that the issue was a “global CDN disruption” rather than an issue tied to a particular data center.

Some sites that were still running also showed service failures. Twitter, for example, had problems displaying images and videos/GIFs.

The issue was resolved after one hour. Fastly reports on its website that “the issue has been identified and a fix has been applied. Customers may experience increased origin load as global services return”. This has made the affected services available again.

Although it was resolved quickly, the outage led to an increase in complaints from customers who encountered an error message when trying to access several websites.

Fastly’s outage has been a reminder of the complexity of the Internet and the importance of redundancy. Apparently, Fastly customers using multiple CDNs were only partially affected, and their service was recovered through alternative providers. In addition, some Fastly customers tried to mitigate the impact by redirecting users to their origin servers. Optimal delivery depends on the ability to identify and diagnose this kind of issue, but also on the redundancy strategy for essential parts of the network, such as hosting servers, DNS resolution, or CDNs.
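
The redundancy idea described above can be sketched in a few lines of Python: try a primary CDN first, then a secondary CDN, then the origin server. This is only an illustration under stated assumptions. The host names are hypothetical, and the `fetch` callable is injected rather than tied to a real HTTP client, so the example does not reflect how any particular Fastly customer handled the outage.

```python
def fetch_with_failover(path, hosts, fetch):
    """Try each host in order and return (host, body) from the first success.

    hosts: ordered list, e.g. primary CDN, secondary CDN, then origin.
    fetch: callable (host, path) -> (status_code, body); injected so the
    sketch stays independent of any particular HTTP client.
    """
    errors = []
    for host in hosts:
        try:
            status, body = fetch(host, path)
        except OSError as exc:  # connection-level failure
            errors.append((host, str(exc)))
            continue
        if status < 500:  # anything below 5xx means this host answered
            return host, body
        errors.append((host, f"HTTP {status}"))
    raise RuntimeError(f"all hosts failed: {errors}")


# Hypothetical hosts: the primary CDN answers 503, the secondary works.
def fake_fetch(host, path):
    if host == "cdn-a.example.com":
        return 503, ""
    return 200, f"content of {path} via {host}"


served_by, body = fetch_with_failover(
    "/index.html",
    ["cdn-a.example.com", "cdn-b.example.com", "origin.example.com"],
    fake_fetch,
)
print(served_by)
```

In practice this kind of switching is usually done at the DNS or traffic-steering layer rather than in client code; the sketch only illustrates the ordering logic. Falling back to the origin also shifts load onto it, which is consistent with Fastly’s warning about increased origin load.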

About MedUX

MedUX is the leading Customer Experience monitoring and measurement solution for telecommunication services, governments, and OTT providers by making use of AI-Powered Real-Time Advanced Analytics.

Our ecosystem helps to understand a wide range of incidents affecting the customer experience, from Quality of Service (QoS) problems and network outages, to Quality of Experience (QoE) issues and degradation of services, as happened last year with the massive YouTube outage.

Over 5,000 MedUX HOME devices deployed in 10 European countries allow us to measure the quality and experience of fixed broadband networks in real time. MedUX monitors the quality of broadband and of the most used services, including OTT (Over The Top) applications such as social networks, messaging platforms, web browsing, or video streaming, from the end-user perspective. In addition, the MedUX ecosystem’s analytical capabilities enable the diagnosis and troubleshooting of network and service issues.

Stay tuned to our next reports and insights and get in touch with us at hello@medux.com if you need further information. Our team will be glad to discuss our new features to prevent Customer Experience issues in this innovative and hyper-connected era. Find out how we help operators globally to deliver on new technology promises while testing and having visibility into services from the user perspective.

Don’t forget to follow us on social networks and subscribe to MedUX newsletter!
