Thread: Fecebook down
#5
10-05-2021, 02:01 PM
Senior Member
Joined in Sep 2017
2,017 posts
Red neck
Quote:
Originally Posted by jwxie518 View Post
Our team at FB did a great job pulling this through. Can't disclose much about what happened throughout, but all I can say is this literally needed ppl physically in the data centers (multiple of them), and we had to pull in everyone in all time zones. All hands on deck.

We had some shit employees leaking stuff to the public, which made the investigation harder to coordinate and wasted time on setting up a new, private communication channel.

The other good part is that internally the company runs a lot of drills on an almost daily basis in different regions, and this type of complete outage had been planned for: we had runbooks on who to call, where to meet, etc. The sad thing is we still have a lot of tooling behind our network, and because DNS couldn't resolve, it was nearly impossible for anyone to get onto any server, let alone open our internal wiki pages...

BGP and DNS are not "down" per se. That's a gross simplification. We simply stopped announcing our DNS to the whole BGP network, and the reason will be investigated.

Lucky me, I didn't have to be 100% available for troubleshooting since I am not on call for one of the critical systems this week, but man, I could feel the heat even just listening to the live Zoom call we had until someone fked up with leaks.
Is your job hiring / sponsoring EB3 visas? lol