/srv/irclogs.ubuntu.com/2015/02/05/#ubuntu-us-mi.txt

cmaloneyMorning13:17
jrwrenmornin13:20
cmaloneyMan, surprise covers abound13:31
* cmaloney is listening to Deconbrio - Guilty13:31
cmaloneycover of Gravity Kills - Guilty13:31
cmaloneyI like.13:31
jrwrenrather it be a klute cover :p13:31
cmaloneyNot familiar13:32
jrwrenleather strip side project13:38
jrwrenvery aggro13:38
cmaloneyAh13:44
cmaloneylaeeaeaeaaeeeaeeeather strip?13:44
cmaloney;)13:44
greg-gweeee17:36
greg-g https://twitter.com/Wikimedia/status/563388375898411008 17:36
greg-g"All Wikimedia sites are experiencing issues due to a network problem. We'll be back up shortly!"17:36
cmaloneygreg-g: Woo woo17:39
brouschgreg-g: Fix it!18:26
greg-gluckily, I don't have to18:30
greg-gpoweroutage to an important switch18:30
cmaloneyLovely18:54
greg-gyeah, I feel bad for the DC tech18:55
cmaloneyI'm sure his death will be swift.18:55
cmaloneyThough not honorable18:55
jrwrensingle switch?18:58
jrwrenthat is a shit spof design.18:58
jrwrenbridging is your friend FFS! :p18:58
cmaloneyjrwren: Likely one switch with bad failover.18:58
greg-gyeah, I don't know all the details, but it sounds like it was a perfect shitty accidental storm18:59
cmaloneyyeah, there's a lot of instances where if one component fails then things are fine, but if one component disappears (or conversely doesn't disappear enough) then shit doesn't work19:00
cmaloney"Hi, I know this is weird and all but I just powered up and I have no idea what a route is. Pleased to meet me"19:01
greg-gthe best part is, it looks like our logging system kept us from coming back up in a timely manner, we had to disable logging for a $timespan19:02
cmaloney"Hi other router. Apparently you're up, so here's all the traffic. Derp derp"19:02
greg-g:)19:02
cmaloneygreg-g: Those are the worst19:02
cmaloneyWhen your tracking is actively fucking you.19:03
greg-gyeah, which we also just beefed up a bit (and starting logging a lot more stuff by default)19:03
greg-gWe've indeed had a total site outage for roughly 30 minutes. We're still19:12
greg-gcollecting all data, but we've tracked down the cause to multiple cascading19:12
greg-gissues including loss of power to a critical SPOF network switch and HHVM19:12
greg-gMediaWiki application servers getting blocked due to multiple unoptimal19:12
greg-gtimeout settings. We'll post a full incident report soon, and work to19:12
greg-gcorrect the underlying issues as soon as possible.19:12
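[Editor's sketch] The summary above mentions application servers blocking on timeout settings, and the earlier messages say logging kept the site from recovering. One common way to keep a stalled log backend from blocking request handling is a bounded, non-blocking log queue that drops records when full. The code below is only an illustration of that general idea, not what Wikimedia deployed; `DroppingQueueHandler` and `build_logger` are hypothetical names, and nothing drains the queue here, which simulates a backend that has stopped responding.

```python
import logging
import logging.handlers
import queue


class DroppingQueueHandler(logging.handlers.QueueHandler):
    """Hypothetical handler: never block the caller; count records dropped when full."""

    def __init__(self, q):
        super().__init__(q)
        self.dropped = 0

    def enqueue(self, record):
        # put_nowait() raises queue.Full instead of waiting for free space,
        # so the calling thread (e.g. a request handler) is never stalled.
        try:
            self.queue.put_nowait(record)
        except queue.Full:
            self.dropped += 1


def build_logger(name, maxsize):
    """Return (logger, handler, queue) wired to a bounded in-memory queue."""
    q = queue.Queue(maxsize=maxsize)
    handler = DroppingQueueHandler(q)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the sketch's records away from root handlers
    logger.addHandler(handler)
    return logger, handler, q
```

In a real setup a `logging.handlers.QueueListener` (or similar consumer) would drain the queue and ship records to the backend; the trade-off is losing some log lines under pressure instead of losing the site.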
brouschhmmmm21:44
brousch https://www.irccloud.com/pastebin/oz47PtCo 21:44
brouschAh, it's docker.io, not docker21:49
greg-g https://wikitech.wikimedia.org/wiki/Incident_documentation/20150205-SiteOutage 21:57
rick_h_evening22:23
cmaloneyHello from OCC22:23
rick_h_party!22:24
cmaloneyW00t22:25
* DrDaemonEye waves at cmaloney23:00
cmaloneyHowdy23:14
DrDaemonEyehow goes?23:17
cmaloneyWriting23:18
DrDaemonEyefun times23:18
cmaloneyYeah23:18
