Facebook explains error that caused global outage

Facebook has said an error during routine maintenance of its network of data centers caused a cascade of problems that took down its platforms for more than six hours on Monday.

In a blogpost published on Tuesday, Santosh Janardhan, vice-president of engineering, said the global outage that saw Facebook, Instagram and WhatsApp go dark for billions of users had begun when the company’s engineers issued a command that unintentionally disconnected Facebook data centers from the rest of the world.

Janardhan described the error as originating within the company’s “global backbone” of fiber-optic cables and data centers.

“This outage was triggered by the system that manages our global backbone network capacity,” Janardhan wrote. “The backbone is the network Facebook has built to connect all our computing facilities together, which consists of tens of thousands of miles of fiber-optic cables crossing the globe and linking all our data centers.”

“During one of these routine maintenance jobs, a command was issued with the intention to assess the availability of global backbone capacity, which unintentionally took down all the connections in our backbone network, effectively disconnecting Facebook data centers globally,” Janardhan said.


The company said its systems were designed to audit commands to prevent mistakes, but the audit tool had encountered a bug and had failed to stop the command that caused the outage. The outage had knocked out tools that engineers would normally use to investigate and repair such outages, making the task even more difficult.
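The idea of auditing a command before it runs can be illustrated with a short sketch. This is not Facebook's tooling, which the company has not published; the `Command` type, the `links_affected` field and the 10% threshold are all hypothetical, chosen only to show how a check of this kind might refuse a command that would touch too much of a network at once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    """Hypothetical maintenance command; not a real Facebook structure."""
    name: str
    links_affected: int  # number of backbone links the command would take down


def audit(command: Command, total_links: int, max_fraction: float = 0.1) -> bool:
    """Return True if the command is allowed to run.

    Assumed policy: reject anything that would take down more than
    max_fraction of all links in a single step.
    """
    if total_links <= 0:
        raise ValueError("total_links must be positive")
    return command.links_affected / total_links <= max_fraction


def run_if_safe(command: Command, total_links: int) -> str:
    """Gate execution on the audit result instead of running blindly."""
    if not audit(command, total_links):
        return f"blocked: {command.name}"
    return f"executed: {command.name}"


if __name__ == "__main__":
    print(run_if_safe(Command("capacity-check", 100), total_links=100))
    print(run_if_safe(Command("drain-one-link", 1), total_links=100))
```

In this sketch the first command would be blocked and the second allowed. A bug in a check like `audit` is the kind of failure the company described: the safeguard exists, but it does not fire.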

The outage was the largest that Downdetector, a web monitoring firm, said it had ever seen.

Facebook said the outage had not been caused by malicious activity.

While users lost access to one of the world’s most popular messaging apps – WhatsApp has more than 2 billion users – employees were also blocked from internal tools.

The company said it had sent a team of engineers to its data centers to try to debug and restart the systems.

However, it took extra time to get engineers inside to work on the servers because of the physical and system security in place.

Even after network connectivity was restored to the data centers, Facebook said it worried that a sudden surge in traffic would cause its websites and apps to crash.

But because the company had run drills to prepare for such situations, access to its services returned relatively quickly.
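One common way to avoid a crash from a sudden surge is to bring traffic back in stages rather than all at once. The sketch below shows that general technique only; the article does not say how Facebook restored its services, and the function names, step count and request figures are assumptions made for illustration.

```python
def ramp_schedule(steps: int) -> list[float]:
    """Return the fraction of traffic to admit at each stage, ending at 1.0."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [(i + 1) / steps for i in range(steps)]


def admitted_requests(demand: int, fraction: float) -> int:
    """Requests let through at one stage; the rest would be deferred or retried."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return int(demand * fraction)


if __name__ == "__main__":
    # Hypothetical demand of 1,000 requests per interval, restored in 4 stages.
    for fraction in ramp_schedule(4):
        print(fraction, admitted_requests(1000, fraction))
```

Each stage lets operators confirm that systems are stable before admitting more load.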

“Every failure like this is an opportunity to learn and get better,” Janardhan wrote. “From here on out, our job is to … make sure events like this happen as rarely as possible.”

The outage came during a difficult week for Facebook, as the US Senate held a hearing with a former employee turned whistleblower who accused the social network of putting profits before people’s safety, a claim that Facebook disputes.

