Nick | Message | Time |
---|---|---|
cmaloney | Morning | 13:17 |
jrwren | mornin | 13:20 |
cmaloney | Man, surprise covers abound | 13:31 |
* cmaloney | is listening to Deconbrio - Guilty | 13:31 |
cmaloney | cover of Gravity Kills - Guilty | 13:31 |
cmaloney | I like. | 13:31 |
jrwren | rather it be a klute cover :p | 13:31 |
cmaloney | Not familiar | 13:32 |
jrwren | leather strip side project | 13:38 |
jrwren | very aggro | 13:38 |
cmaloney | Ah | 13:44 |
cmaloney | laeeaeaeaaeeeaeeeather strip? | 13:44 |
cmaloney | ;) | 13:44 |
greg-g | weeee | 17:36 |
greg-g | https://twitter.com/Wikimedia/status/563388375898411008 | 17:36 |
greg-g | "All Wikimedia sites are experiencing issues due to a network problem. We'll be back up shortly!" | 17:36 |
cmaloney | greg-g: Woo woo | 17:39 |
brousch | greg-g: Fix it! | 18:26 |
greg-g | luckily, I don't have to | 18:30 |
greg-g | power outage to an important switch | 18:30 |
cmaloney | Lovely | 18:54 |
greg-g | yeah, I feel bad for the DC tech | 18:55 |
cmaloney | I'm sure his death will be swift. | 18:55 |
cmaloney | Though not honorable | 18:55 |
jrwren | single switch? | 18:58 |
jrwren | that is a shit spof design. | 18:58 |
jrwren | bridging is your friend FFS! :p | 18:58 |
cmaloney | jrwren: Likely one switch with bad failover. | 18:58 |
greg-g | yeah, I don't know all the details, but it sounds like it was a perfect shitty accidental storm | 18:59 |
cmaloney | yeah, there's a lot of instances where if one component fails then things are fine, but if one component disappears (or conversely doesn't disappear enough) then shit doesn't work | 19:00 |
cmaloney | "Hi, I know this is weird and all but I just powered up and I have no idea what a route is. Pleased to meet me" | 19:01 |
greg-g | the best part is, it looks like our logging system kept us from coming back up in a timely manner, we had to disable logging for a $timespan | 19:02 |
cmaloney | "Hi other router. Apparently you're up, so here's all the traffic. Derp derp" | 19:02 |
greg-g | :) | 19:02 |
cmaloney | greg-g: Those are the worst | 19:02 |
cmaloney | When your tracking is actively fucking you. | 19:03 |
greg-g | yeah, which we also just beefed up a bit (and started logging a lot more stuff by default) | 19:03 |
greg-g | We've indeed had a total site outage for roughly 30 minutes. We're still | 19:12 |
greg-g | collecting all data, but we've tracked down the cause to multiple cascading | 19:12 |
greg-g | issues including loss of power to a critical SPOF network switch and HHVM | 19:12 |
greg-g | MediaWiki application servers getting blocked due to multiple unoptimal | 19:12 |
greg-g | timeout settings. We'll post a full incident report soon, and work to | 19:12 |
greg-g | correct the underlying issues as soon as possible. | 19:12 |
brousch | hmmmm | 21:44 |
brousch | https://www.irccloud.com/pastebin/oz47PtCo | 21:44 |
brousch | Ah, it's docker.io, not docker | 21:49 |
greg-g | https://wikitech.wikimedia.org/wiki/Incident_documentation/20150205-SiteOutage | 21:57 |
rick_h_ | evening | 22:23 |
cmaloney | Hello from OCC | 22:23 |
rick_h_ | party! | 22:24 |
cmaloney | W00t | 22:25 |
* DrDaemonEye | waves at cmaloney | 23:00 |
cmaloney | Howdy | 23:14 |
DrDaemonEye | how goes? | 23:17 |
cmaloney | Writing | 23:18 |
DrDaemonEye | fun times | 23:18 |
cmaloney | Yeah | 23:18 |