[01:07] <axw> wallyworld: did I miss anything this morning?
[01:07] <wallyworld> axw: not really, i can fill you in if you wanted a quick HO
[01:07] <axw> wallyworld: ok, see you at standup
[01:07] <wallyworld> with site news etc
[02:03] <menn0> babbageclunk: ping?
[02:03] <babbageclunk> menn0: hey
[02:04] <menn0> babbageclunk: so I think your hacked mgoprune takes care of a case which mgopurge doesn't handle (as you found)
[02:04] <babbageclunk> menn0: ok
[02:05] <babbageclunk> menn0: I mean, it just reports them, doesn't do any cleanup
[02:05] <menn0> babbageclunk: you were seeing Insert ops in the txns collection which didn't have a matching entry in stash - is that right?
[02:06] <babbageclunk> menn0: they had a stash record, but that stash didn't have the txnid_nonce in txn-queue
[02:06] <menn0> hmmm ok
[02:07] <babbageclunk> menn0: It was raising the error from flusher.go:475
[02:08] <menn0> babbageclunk: the existing PurgeMissing function goes through the stash and looks for txn-queue entries there which don't have a matching txns doc
[02:08] <menn0> babbageclunk: but this is the other way around
[02:09] <babbageclunk> menn0: right
[02:09] <menn0> babbageclunk: I guess we should extend mgopurge to handle this
[02:11] <babbageclunk> menn0: yeah, that was my thinking
[02:11] <menn0> babbageclunk: how did you fix the issue?
[02:13] <babbageclunk> menn0: The bad transactions were all leadership lease updates (I think), so blahdeblah thought it made more sense to remove them.
[02:13] <babbageclunk> menn0: so we removed the txn id from the txn-queue of the *other* record and then removed the txn record.
[02:14] <babbageclunk> i.e. all of the txns were an update to one record and an insert.
[02:15] <babbageclunk> menn0: I thought the other way to fix it was to insert the txn id into the start of the stash record that was missing it. That's probably the more general fix.
[02:16] <babbageclunk> (Although we didn't try it, so I guess we can't be sure it's actually right.)
[02:17] <menn0> babbageclunk: that might work but it seems riskier... I wonder if it's safer to just remove the badness
[02:18] <babbageclunk> menn0: that was the thinking in this case, although it might not only happen with this specific type of (pretty disposable) transaction
[02:19] <menn0> babbageclunk: true... i'll have a play around
[02:21] <blahdeblah> babbageclunk: blame me, why don't you? :-P
[02:21] <babbageclunk> menn0: maybe the fact that we've only seen it with leadership-related transactions means it's something to do with the way we're handling them? Or maybe it's just that we have lots of them, so that's where it turns up.
[02:21] <menn0> I think it's the latter
[02:22] <menn0> babbageclunk: but it would be great to understand why these problems happen at all
[02:22] <menn0> babbageclunk: we've never really managed to figure that out... it's either mongodb screwing up or a bug in mgo
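A minimal sketch of the check menn0 and babbageclunk are describing: finding insert-op transactions whose stash document exists but whose txn-queue is missing the transaction's token. This uses in-memory stand-ins for the txns and stash collections; the type and function names here are illustrative, not mgopurge's actual API.

```go
package main

import "fmt"

// Op is one operation inside a transaction document.
type Op struct {
	C  string // target collection
	Id string // target document id
}

// Txn is a simplified txns-collection document.
type Txn struct {
	Token string // "<txnid>_<nonce>"
	Ops   []Op
}

// StashDoc is a simplified txns.stash document.
type StashDoc struct {
	TxnQueue []string // tokens of transactions queued against this doc
}

// findOrphanInserts reports transactions whose stash doc exists but
// whose txn-queue lacks the transaction's token -- the case the
// hacked mgoprune reported and mgopurge didn't yet handle.
func findOrphanInserts(txns []Txn, stash map[string]StashDoc) []string {
	var orphans []string
	for _, t := range txns {
		for _, op := range t.Ops {
			doc, ok := stash[op.C+"/"+op.Id]
			if !ok {
				continue // no stash doc at all: a different class of problem
			}
			found := false
			for _, tok := range doc.TxnQueue {
				if tok == t.Token {
					found = true
					break
				}
			}
			if !found {
				orphans = append(orphans, t.Token)
			}
		}
	}
	return orphans
}

func main() {
	txns := []Txn{{Token: "abc_1", Ops: []Op{{C: "leases", Id: "l0"}}}}
	stash := map[string]StashDoc{"leases/l0": {TxnQueue: []string{"other_9"}}}
	fmt.Println(findOrphanInserts(txns, stash)) // [abc_1]
}
```

As discussed above, a real fix would then either remove the bad transaction (plus its token in any other records' txn-queues) or insert the missing token into the stash doc's txn-queue; the conversation leans toward removal as the safer option.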
[02:22] <babbageclunk> blahdeblah: :) I was just trying to lend the decision some weight by adding your name to it!
[02:38] <blahdeblah> I'm not fat; my chest just slipped a bit
[04:22] <axw> wallyworld: I've found some nice low hanging fruit for reducing the number of active sessions. every agent (machine & unit) has a session open for the lifetime of the agent's connection, for logging to the db
[04:23] <wallyworld> yeah, i think it's 2 per agent?
[04:23] <axw> wallyworld: I'm thinking we might want to apply jam's idea of aggregating writes into a bulk insertion for presence here as well
[04:23] <axw> wallyworld: just 1 AFAICS
[04:23] <wallyworld> sgtm
[04:34] <babbageclunk> wallyworld, axw: seen the message about mongo replication in #juju@irc.canonical.com? Any ideas?
[04:35] <axw> babbageclunk: I'm afraid not
[04:35] <axw> babbageclunk: I mean I've seen it now, I don't have any good ideas
[04:37] <jam> axw: wallyworld: "reducing the number of active sessions", is that closing the session while it's active, or doing all of the writes by Cloning the one session per agent?
[04:38] <jam> axw: wallyworld: I'd also say aggregating presence should certainly be 2.2.1 not delay 2.2rc3/2.2.0
[04:38] <wallyworld> no, it won't delay
[04:38] <wallyworld> it will be 2.2.1
[04:38] <jam> I'm off today for my wife's birthday, so I won't be around much
[04:38] <wallyworld> have fun
[04:39] <axw> jam: if we aggregate, then either Copy() for the lifetime of the aggregator, or Copy() just before doing a bulk insert, and close straight after. I'm not sure which is best yet
[04:39] <wallyworld> site looks ok atm
[04:39] <jam> axw: I'd copy before insert, given that if we're bulk updating, we shouldn't have a lot of those active.
[04:39] <jam> axw: I was trying to find a nice number for the frequency; given it's a 30s window, it feels like we could batch all the way up to 1s intervals
[04:40]  * jam goes to take the dog out
[04:40] <axw> jam: that sounds reasonable. it could be made adaptive too, i.e. insert more frequently if the incoming rate is slower
[04:40]  * axw waves
[04:41] <jam> axw: the other option is that we just aggregate whatever we get until the last update finishes
[04:42] <jam> so we only have 1 presence-based write at any time
[04:42] <jam> and just splat down whatever updates we've gotten since then
[04:42]  * axw nods
[05:03] <anastasiamac> menn0: standup?
[07:18] <axw> wallyworld: do you know if it's ok for us to merge things like https://github.com/juju/docs/pull/1910 into master? or should we wait for evilnick or someone else on the docs team to merge?
[07:18] <axw> not sure if there's some special process, don't want to bork things
[07:34] <wallyworld> axw: yeah, not sure. for something like that it would be nice to merge
[07:35] <axw> wallyworld: ok. burton, I'll merge and we can ask for forgiveness... :)
[07:36] <burton-aus> axw wallyworld GREAT!
[08:40] <axw> wallyworld: no rush since it won't be landing until >2.2, but https://github.com/juju/juju/pull/7496 is a prereq for a follow-up db logger session reduction PR
[08:54] <wpk>  netpln--help
[09:05] <wallyworld> axw: just getting dinner, will look after
[09:10] <wpk> wallyworld: btw, I got your message, thank you, I'm still looking for a hotel in Brisbane (but there are rooms available, I'll book something in the next few days)
[09:28] <wallyworld> wpk: great, let me know if you need any help etc. i've booked you an extra 2 nights at the venue and then monday night will be in Brisbane
[09:42] <wpk> Is there a way to check if goyaml.Unmarshal processed everything?
[09:42] <wpk> I see an error returned if there's a field of the wrong type, but what about a field that is in the yaml but not in the structure?
[10:37] <mup> Bug #1697664 opened: Juju 2.X using MAAS, juju status hangs after juju controller reboot <juju-core:New> <https://launchpad.net/bugs/1697664>
[11:50] <wpk> anyone familiar with yaml.v2 ?
[12:05] <stub> wpk: Fields in the yaml and not in the structure get dropped. I think if you don't want that, you need to unmarshal to a mapping and handle it yourself.
[12:05] <wpk> stub: I added an UnmarshalStrict method and PRed
[12:08] <wpk> Other thing is that yamllint says that a list marshaled by goyaml is badly indented
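For the unknown-field question: yaml.v2's plain `Unmarshal` silently drops fields with no matching struct field, which is the strict-mode gap wpk's `UnmarshalStrict` PR addresses. The same behaviour already exists in the Go stdlib JSON decoder via `DisallowUnknownFields`, shown here as a self-contained analogue (this sketch deliberately avoids importing yaml.v2):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Config is a toy target struct; "name" is its only known field.
type Config struct {
	Name string `json:"name"`
}

// strictDecode fails if the input contains fields that Config does
// not declare -- the JSON equivalent of a YAML UnmarshalStrict.
func strictDecode(data []byte, v interface{}) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	return dec.Decode(v)
}

func main() {
	var c Config
	err := strictDecode([]byte(`{"name":"juju","bogus":1}`), &c)
	fmt.Println(err != nil) // true: "bogus" is not a field of Config
}
```

stub's alternative (unmarshal into a `map[string]interface{}` and diff the keys against the struct's fields yourself) works with any decoder, at the cost of doing the bookkeeping by hand.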
[23:15] <babbageclunk> What's the best way to generate a lot of status updates for a model? Can I do it with juju run?
[23:16] <babbageclunk> Or do we have a charm that will do it?
[23:16] <babbageclunk> Asking for a friend.
[23:16] <babbageclunk> wallyworld, menn0: ^
[23:17] <wallyworld> babbageclunk: you mean the hook execution?
[23:18] <babbageclunk> wallyworld: Oh, will any hook execution generate updates? So I can just use peer-xplod?
[23:19] <wallyworld> babbageclunk: there needs to be an actual implementation of that hook to then write values
[23:19] <wallyworld> and juju only runs the hook once every 5 minutes
[23:20] <wallyworld> or are you talking about unit status history log?
[23:20] <babbageclunk> wallyworld: Hmm, that'll take too long. I guess I should make a charm that just sits in a loop updating status?
[23:20] <wallyworld> rather than unit/machine status values
[23:21] <babbageclunk> wallyworld: For the pruner it doesn't matter - just want to generate megs of updates in status history, right?
[23:21] <wallyworld> oh, i thought you were talking about status values
[23:21] <wallyworld> "status updates"
[23:22] <wallyworld> you can always hack a charm as you suggest
[23:22] <wallyworld> or use juju run
[23:22] <babbageclunk> I don't understand the difference between status values and status updates.
[23:23] <babbageclunk> Oh, you mean the update status hook.
[23:23] <wallyworld> sorry, my brain was differentiating between what goes into the status collection vs status history collection
[23:24] <wallyworld> it's all very confusing, similar terminology
[23:24] <babbageclunk> Gotcha, sorry. Yeah, I want to fill up the history so that I can QA my pruning work.
[23:24] <wallyworld> you want to test history pruning
[23:24] <babbageclunk> I'm not really asking for a friend, that was a joke. :)
[23:24] <wallyworld> lol
[23:25] <wallyworld> i'd just do a juju run inside a bash loop?
[23:25] <wallyworld> you can execute the status-set hook tool inside juju run
[23:25] <babbageclunk> Ah, awesome - playing with that now - thanks!
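wallyworld's suggestion of "juju run inside a bash loop" could be sketched like this. The unit name and messages are placeholders, and `JUJU_CMD` defaults to `echo juju` so the script just prints the commands it would run; set `JUJU_CMD=juju` against a real model to actually fill the status-history collection.

```shell
#!/bin/sh
# generate_updates emits one "juju run ... status-set" invocation per
# iteration. With the default JUJU_CMD it only prints the commands;
# with JUJU_CMD=juju it executes them against the current model.
generate_updates() {
    n="$1"
    unit="$2"
    i=1
    while [ "$i" -le "$n" ]; do
        ${JUJU_CMD:-echo juju} run --unit "$unit" \
            "status-set maintenance 'filler update $i'"
        i=$((i + 1))
    done
}

generate_updates 5 ubuntu/0
```

Each `status-set` call lands a fresh entry in the unit's status history, which is exactly the data the pruner QA needs in bulk.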
[23:37] <menn0> babbageclunk: easy review please: https://github.com/juju/txn/pull/37
[23:38] <babbageclunk> menn0: looking
[23:40] <babbageclunk> menn0: Approved
[23:42] <menn0> babbageclunk: cheers