Back after mammoth 6-hour outage, Facebook blames changes it made to its routers
Almost unprecedented global shutdown reportedly exacerbated by tech chaos that kept staff from getting into building to find and fix problem; Zuckerberg’s wealth said to have dipped $6b
After an almost unprecedented six-hour global outage, Facebook restored its services and those of WhatsApp and Instagram on Monday and blamed the fiasco on configuration changes it made to the routers that coordinate network traffic between its data centers.
“This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt,” Facebook vice president of infrastructure Santosh Janardhan said in a post.
The technical chaos reportedly extended to Facebook’s own employees’ email and work passes, complicating its efforts to fix the problem. New York Times technology reporter Sheera Frenkel told the BBC that “the people trying to figure out what this problem was couldn’t even physically get into the building” at its California campus to work out what had gone wrong.
Facebook made clear that the shutdown, which meant billions of users worldwide could not access its services, was caused internally, rather than by a cyberattack or other outside forces. It did not immediately disclose details of how the outage was fixed, but the Guardian noted “multiple reports” that it sent technical staff to manually reset the servers in California where the problem originated.
The US internet infrastructure and security firm Cloudflare explained that, in an internal change, Facebook had essentially told the internet that the routes to its platforms no longer existed.
“Externally, we saw the BGP [Border Gateway Protocol] and DNS [Domain Name System] problems … but the problem actually began with a configuration change that affected the entire internal backbone. That cascaded into Facebook and [its] other properties [such as Instagram and WhatsApp] disappearing and staff internal to Facebook having difficulty getting service going again.”
Facebook (accidentally, we assume) sent an update to a deep-level routing protocol on the internet that said, basically, "hey we don't have any servers any more xoxo"
— alex hern (@alexhern) October 4, 2021
The outage was highly unusual, but not the worst that the social media giant has suffered. Two years ago, Facebook and its associated platforms were largely inaccessible worldwide for 14 hours.
Facebook chief Mark Zuckerberg’s personal wealth dropped by over $6 billion during the outage, Bloomberg reported.
“A selloff sent the social-media giant’s stock plummeting around 5% on Monday, adding to a drop of about 15% since mid-September,” it said, adding that Zuckerberg’s worth fell to $120.9 billion, placing him below Bill Gates, at No. 5, on the Bloomberg Billionaires Index. “He’s lost about $19 billion of wealth since Sept. 13, when he was worth nearly $140 billion.”
Zuckerberg apologized for the failure in a post.
Facebook, Instagram, WhatsApp and Messenger are coming back online now. Sorry for the disruption today — I know how much you rely on our services to stay connected with the people you care about.
Posted by Mark Zuckerberg on Monday, October 4, 2021
At the height of the outage, Facebook had resorted to Twitter to acknowledge that “some people are having trouble accessing (the) Facebook app” and say it was working on restoring access.
We’re aware that some people are having trouble accessing our apps and products. We’re working to get things back to normal as quickly as possible, and we apologize for any inconvenience.
— Facebook (@Facebook) October 4, 2021
After restoring services to Facebook, and its Instagram and WhatsApp platforms, the company said late Monday that “the root cause of this outage was a faulty configuration change” and that there was “no evidence that user data was compromised as a result” of the outage.
The company apologized and said it was working to understand more about the cause of the outage, which began around 11:40 a.m. Eastern.
Facebook was already in the throes of a separate major crisis after whistleblower Frances Haugen, a former Facebook product manager, provided The Wall Street Journal with internal documents that exposed the company’s awareness of harms caused by its products and decisions. Haugen went public on CBS’s “60 Minutes” program Sunday and is scheduled to testify before a Senate subcommittee Tuesday.
Haugen had also anonymously filed complaints with federal law enforcement alleging Facebook’s own research shows how it magnifies hate and misinformation and leads to increased polarization. It also showed that the company was aware that Instagram can harm teenage girls’ mental health.
The Journal’s stories, called “The Facebook Files,” painted a picture of a company focused on growth and its own interests over the public good. Facebook has tried to play down their impact. Nick Clegg, the company’s vice president of policy and public affairs, wrote to Facebook employees in a memo Friday that “social media has had a big impact on society in recent years, and Facebook is often a place where much of this debate plays out.”
The outage didn’t exactly bolster Facebook’s argument that its size and clout provide important benefits for the world. London-based internet monitoring firm Netblocks noted that the company’s plans to integrate the technology behind its platforms — announced in 2019 — had raised concerns about the risks of such a move. While such centralization “gives the company a unified view of users’ internet usage habits,” Netblocks said, it also makes the services vulnerable to single points of failure.
“This is epic,” said Doug Madory, director of internet analysis for Kentik Inc, a network monitoring and intelligence company. The last major internet outage, which knocked many of the world’s top websites offline in June, lasted less than an hour. The stricken content-delivery company in that case, Fastly, blamed a software bug triggered by a customer who changed a setting.
In Monday night’s statement, Facebook blamed changes on routers that coordinate network traffic between data centers. The company said the changes interrupted the communication, which had “a cascading effect on the way our data centers communicate, bringing our services to a halt.”
Madory said earlier Facebook appeared to have deleted basic data that tells the rest of the internet how to communicate with its properties. Such data is part of the internet’s Domain Name System, a central component that directs its traffic. Without Facebook broadcasting its location on the public internet, apps and web addresses simply could not locate it.
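For readers who want to see the mechanism, the failure described by Madory and Cloudflare can be approximated with a toy model. The sketch below is purely illustrative and is not based on Facebook's actual systems: the `ToyInternet` class, the `example.com` zone and the `192.0.2.0/24` documentation address range are all assumptions made for the example. It shows that once the routes covering a company's name servers are withdrawn, lookups fail even though the DNS records themselves still exist.

```python
import ipaddress


class ToyInternet:
    """Hypothetical toy model: announced routes plus DNS zones whose
    name servers live inside those routes. Not a real BGP or DNS stack."""

    def __init__(self):
        self.routes = set()   # network prefixes currently announced
        self.zones = {}       # domain -> (name server IP, {hostname: IP})

    def announce(self, prefix):
        # Tell the rest of the "internet" that this prefix is reachable.
        self.routes.add(ipaddress.ip_network(prefix))

    def withdraw(self, prefix):
        # Tell the rest of the "internet" the route no longer exists.
        self.routes.discard(ipaddress.ip_network(prefix))

    def reachable(self, ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in self.routes)

    def add_zone(self, domain, ns_ip, records):
        self.zones[domain] = (ns_ip, dict(records))

    def resolve(self, hostname):
        # Simplification: the zone is the last two labels of the hostname.
        domain = ".".join(hostname.split(".")[-2:])
        if domain not in self.zones:
            return None
        ns_ip, records = self.zones[domain]
        if not self.reachable(ns_ip):
            # Records still exist, but nobody can reach the server holding them.
            return None
        return records.get(hostname)


net = ToyInternet()
net.announce("192.0.2.0/24")
net.add_zone("example.com", "192.0.2.53", {"www.example.com": "192.0.2.10"})

before = net.resolve("www.example.com")   # name server reachable
net.withdraw("192.0.2.0/24")              # the faulty "configuration change"
after = net.resolve("www.example.com")    # name server unreachable

print(before, after)
```

In this simplified picture, nothing is wrong with the stored address records after the withdrawal; the lookup fails only because the path to the server that answers for them has vanished, which is roughly the chain of events outside observers reported.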
There was no evidence as of Monday afternoon that malicious activity was involved. Matthew Prince, CEO of the internet infrastructure provider Cloudflare, tweeted that “nothing we’re seeing related to the Facebook services outage suggests it was an attack.”
Facebook did not respond to messages for comment about the outage or the possibility of malicious activity.
While much of Facebook’s workforce is still working remotely, there were reports that employees at work on the company’s Menlo Park, California, campus had trouble entering buildings because the outage had rendered their security badges useless.
But the impact was far worse for multitudes of Facebook’s nearly 3 billion users, showing just how much the world has come to rely on it and its properties — to run businesses, connect with online communities, log on to multiple other websites and even order food.
It also showed that despite the presence of Twitter, Telegram, Signal, TikTok, Snapchat and a bevy of other platforms, nothing can easily replace the social network that over the past 17 years has effectively evolved into critical infrastructure. The outage came the same day Facebook asked a federal judge that a revised antitrust complaint against it by the Federal Trade Commission be dismissed because it faces vigorous competition from other services.
There are certainly other online services for posting selfies, connecting with fans or reaching out to elected officials. But those who rely on Facebook to run their business or communicate with friends and family in far-flung places saw this as little consolation.
Kendall Ross, owner of a knitwear brand called Knit That in Oklahoma City, said he has 32,000 followers on his Instagram business page @id.knit.that. Almost all of his website traffic comes directly from Instagram. He posted a product photo about an hour before Instagram went out. He said he tends to sell about two hand-knit pieces after posting a product photo for about $300 to $400.
“The outage today is frustrating financially,” he said. “It’s also a huge awakening that social media controls so much of my success in business.”
So many people are reliant on Facebook, WhatsApp or Instagram as primary modes of communication that losing access for so long can make them vulnerable to criminals taking advantage of the outage, said Rachel Tobac, a hacker and CEO of SocialProof Security.
“They don’t know how to contact the people in their lives without it,” she said. “They’re more susceptible to social engineering because they’re so desperate to communicate.” Tobac said during previous outages, some people have received emails promising to restore their social media account by clicking on a malicious link that can expose their personal data.
Jake Williams, chief technical officer of the cybersecurity firm BreachQuest, said that while foul play cannot be completely ruled out, chances were good that the outage is “an operational issue” caused by human error.
“What it boils down to: running a LARGE, even by internet standards, distributed system is very hard, even for the very best,” tweeted Columbia University computer scientist Steven Bellovin.
Twitter, meanwhile, chimed in from the company’s main account on its service, posting “hello literally everyone” as jokes and memes about the Facebook outage flooded the platform.
hello literally everyone
— Twitter (@Twitter) October 4, 2021
Later, as an unverified screenshot suggesting that the facebook.com address was for sale circulated, Twitter CEO Jack Dorsey tweeted, “how much?”