[00:45] babbageclunk: another easy one: https://github.com/juju/juju/pull/7499
[00:47] menn0: approved
[00:48] babbageclunk: thanks. develop one is on its way.
[00:48] menn0: I don't think you need that reviewed though.
=== mup_ is now known as mup
[00:52] babbageclunk: true
[00:58] wallyworld: doing the sort/check in SetAPIHostPorts should be fine, we don't seem to care about the order in the worker, so we shouldn't care there either
[01:44] axw: excellent, thanks for checking
[03:15] menn0: can you please review https://github.com/juju/juju/pull/7501?
[03:15] will do
[03:15] thanks!
[03:16] * babbageclunk goes for a run, anyway
[05:11] jam: ping?
[05:11] babbageclunk: in standup, will ping when done
[05:11] jam: cool cool
[05:12] menn0: ping?
[05:12] babbageclunk: otp
[05:12] * babbageclunk sulks
[05:40] babbageclunk: what's up
[05:41] babbageclunk: we have the same standup :)
[05:42] jam: yeah, sorry - forgot!
[05:43] jam: Here's my change to the status history pruning, if you want to take a look: https://github.com/juju/juju/pull/7501
[05:44] babbageclunk: so my concern with the 'while true' stuff is that you probably aren't getting close to actually having 4GB of history
[05:44] and that's the point where we need to watch out for how it operates.
[05:44] babbageclunk: did you do any big scale testing to know that it performs decently?
[05:44] jam: I tried it with 2,000,000 records in unit tests, but not any more than that.
[05:44] and/or analysis of how much memory we store, how long it takes to get the list of things to be deleted
[05:45] jam: with those it was getting through ~400k rows in each 15s chunk
[05:46] babbageclunk: where was this running?
[05:46] jam: but you're right, checking how long the initial query takes is something I'll do
[05:47] babbageclunk: yeah, one of the particular concerns is just the 'grab all the ids' never returning before getting killed
[05:47] on my nice fast laptop with ssd, so I'm not sure how it'll scale.
[05:47] babbageclunk: ah, I guess you are walking the iterator and deleting, right?
[05:47] so it doesn't have to read all 2M before it starts removing stuff
[05:48] jam: *In theory* it should be streaming the query, but that's definitely something I should confirm for sure.
[05:48] yup
[05:48] babbageclunk: I've seen collection.count() fail
[05:48] in some of the logs
[05:48] and *that* should be much cheaper
[05:50] jam: yeah, that bit's problematic anyway - the number we get back is only a rough estimate and we frequently exit the loop before deleting all of them (because the size drops below the threshold).
[05:51] jam: maybe there's a more stats-y way to get a rough row count (since the scale is really the information we need)
[05:51] babbageclunk: so coll.Count() should just be reading the stats on the collection
[05:52] as long as we aren't doing query.Count()
[05:52] babbageclunk: and it may be that if something like Count() is failing, there just really isn't anything we can do
[05:52] jam: yeah, true
[05:52] cause we wouldn't be able to get useful work done
[05:53] babbageclunk: so *if* you're looking at this, is there any chance you could try a perf test of doing Bulk().Remove()*1000, vs coll.RemoveAll({_id: $in{1000}})?
[05:53] jam: sorry, I was a bit unclear - the count is probably exact but our calculation of how many rows need to be deleted is approximate.
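[editor's sketch] For reference, a minimal sketch of the streaming approach discussed at [05:47]-[05:48]: walk the query with an mgo iterator so the pruner never holds all matching ids in memory, and delete in fixed-size, time-boxed batches. The collection handle, the "updated"-as-UnixNano field, the id type, and the batch/time limits are illustrative assumptions, not the actual juju code.

    package prune

    import (
        "time"

        "gopkg.in/mgo.v2"
        "gopkg.in/mgo.v2/bson"
    )

    // pruneOlderThan streams only the _ids of history docs older than cutoff
    // and removes them in batches, stopping once maxRuntime has elapsed so the
    // loop stays time-boxed like the 15s chunks mentioned in the chat.
    func pruneOlderThan(coll *mgo.Collection, cutoff time.Time, batchSize int, maxRuntime time.Duration) (int, error) {
        deadline := time.Now().Add(maxRuntime)
        // Selecting only _id keeps the cursor cheap to stream.
        iter := coll.Find(bson.M{"updated": bson.M{"$lt": cutoff.UnixNano()}}).
            Select(bson.M{"_id": 1}).Iter()
        var doc struct {
            Id interface{} `bson:"_id"`
        }
        ids := make([]interface{}, 0, batchSize)
        removed := 0
        flush := func() error {
            if len(ids) == 0 {
                return nil
            }
            info, err := coll.RemoveAll(bson.M{"_id": bson.M{"$in": ids}})
            if err != nil {
                return err
            }
            removed += info.Removed
            ids = ids[:0]
            return nil
        }
        for iter.Next(&doc) {
            ids = append(ids, doc.Id)
            if len(ids) >= batchSize {
                if err := flush(); err != nil {
                    return removed, err
                }
                if time.Now().After(deadline) {
                    break
                }
            }
        }
        if err := iter.Close(); err != nil {
            return removed, err
        }
        return removed, flush()
    }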
[05:54] babbageclunk: no I understood that part, and I think it's still worthwhile from a "ok, you've taken 10min, are you going to take another 30s, or another 3hrs" perspective
[05:54] jam: Oh yes - I did that. The bulk call is a bit faster at a batch size of 1000.
[05:54] babbageclunk: k. I have a very strong feeling it is much slower on mongo 2.4 (cause it doesn't support actual pipelined operations, so mgo fakes it)
[05:55] which *may* impact us on Trusty, where I'm told we're still using 2.4, but I can live with that.
[05:55] jam: RemoveAll is faster with a batch size of 10000, but bulk doesn't support more than 1000
[05:55] babbageclunk: do you have numbers?
[05:55] jam: It's easy for me to change back to using RemoveAll - it's all in one commit.
[05:55] as in 10% difference, 2x faster?
[05:55] 25% faster
[05:56] RemoveAll(10k) is 25% faster than Bulk(1k)
[05:56] ?
[05:56] yuo
[05:56] yup
[05:56] babbageclunk: and RemoveAll(1k) vs Bulk(1k)?
[05:57] I was getting 400k / 15s block for bulk vs 490k / 15s for RemoveAll(10k)
[05:57] I can't remember the number for RemoveAll(1k) - there wasn't much difference between Bulk and RemoveAll
[05:58] * babbageclunk checks scrollback
[05:59] babbageclunk: surprisingly historicalStatusDoc also doesn't have any ,omitempty fields
[05:59] babbageclunk: can you create a bug against 2.3 about adding them?
[06:00] jam: ok. I'm only selecting ids in this case though, so hopefully that wouldn't change it.
[06:00] only one I kind of care about is StatusData as it is very likely to be empty and that's just bytes on the docs we don't need.
[06:00] babbageclunk: not about anything you're doing
[06:00] ok
[06:00] babbageclunk: it's about "oh hey, this isn't correct"
[06:00] sure :)
[06:00] It's like when you lift a rock and see all the creepy crawlies
[06:01] babbageclunk: it's a change that I'm not 100% comfortable just landing in a 2.2.* series, but also low-hanging fruit for 2.3
[06:01] Yeah, makes sense
[06:02] babbageclunk: every status history doc is at least 52 bytes long just from the keyword fields
[06:02] given we have millions of them, we probably should also consider being more frugal
[06:04] babbageclunk: so just a general "we should re-evaluate the fields in statuses-history because we are being wasteful with the size of a doc we have lots of"
[06:05] babbageclunk: my gut (and I'd like menn0 to have a say here) is to go with RemoveAll(10k), because there are fewer total moving parts, and it will do better on 2.4 anyway
[06:05] babbageclunk: the other thing to compare against is what is the total time when it was a single query?
[06:05] I guess we can't really shorten the stored field names at this point either.
[06:05] as in, RemoveAll(t < X) => 500k/15s average time
[06:05] babbageclunk: well, that we can do with an upgrade step
[06:06] *could*
[06:06] axw: i'm confused. some things use a *Macaroon, other times we pass around a [][]*Macaroon (in fact []macaroon.Slice). do you know if just storing a single macaroon for cmr auth will be sufficient?
[06:06] babbageclunk: or is RemoveAll(t < X) 1M/15s
[06:06] babbageclunk: do you have that number?
[06:06] I'm guessing it may still be worth it to give feedback and be making concrete progress
[06:06] No, sorry, haven't tested that.
[06:07] babbageclunk: ok, I'd like you to do that comparison, just to have some information about whether we're making a big impact or not.
[06:07] Would be good to know how much the incremental processing is costing.
[06:07] Ok, I'll compare to that.
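[editor's sketch] A rough sketch of the two deletion paths being compared above - batched Bulk().Remove() calls versus a RemoveAll with an $in selector over the same ids. It assumes the same hypothetical prune package and imports as the earlier sketch; it is illustrative, not the benchmark that was actually run.

    // removeViaBulk deletes ids with mgo's Bulk API; the chat above notes the
    // bulk path doesn't support more than 1000 operations, so the ids are
    // chunked into batches of that size.
    func removeViaBulk(coll *mgo.Collection, ids []interface{}) error {
        const batch = 1000
        for start := 0; start < len(ids); start += batch {
            end := start + batch
            if end > len(ids) {
                end = len(ids)
            }
            bulk := coll.Bulk()
            bulk.Unordered()
            for _, id := range ids[start:end] {
                bulk.Remove(bson.M{"_id": id})
            }
            if _, err := bulk.Run(); err != nil {
                return err
            }
        }
        return nil
    }

    // removeViaRemoveAll deletes the same ids with a single RemoveAll and an
    // $in selector (e.g. 10k ids at a time), the variant reported above as
    // roughly 25% faster on this workload.
    func removeViaRemoveAll(coll *mgo.Collection, ids []interface{}) (int, error) {
        info, err := coll.RemoveAll(bson.M{"_id": bson.M{"$in": ids}})
        if err != nil {
            return 0, err
        }
        return info.Removed, nil
    }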
[06:07] babbageclunk: my experience on Pruning is that Read is *way* cheaper that Remove
[06:07] than
[06:07] babbageclunk: as in, PruneAll takes seconds to figure out what to do and minutes to remove them
[06:08] Right
[06:08] babbageclunk: but I'd like confirmation here
[06:08] babbageclunk: also, make sure that you're doing the prunes while the charms are firin
[06:08] firing
[06:08] so that there are active inserts while we're removing
[06:08] Yup
[06:09] jam: ok - I have to go help feed the kids before I get in trouble.
[06:10] jam: Thanks though - I'll compare those.
[06:10] babbageclunk: np, I approved the PR conditional on the testing
[06:10] jam: awesome
[06:44] wallyworld: sorry, was riding. uhm. IIRC, you pass around a collection if you'll need to discharge. I think in your case you only need to pass around one
[06:45] yeah, that's all i was counting on doing
[08:58] jam: did you want to discuss mgo/txn changes?
[08:58] menn0: indeed
[08:59] jam: hangout?
[09:00] menn0: https://hangouts.google.com/hangouts/_/canonical.com/john-menno?authuser=1
[09:02] jam: sorry, having auth issues
[09:02] bear with me
[09:02] menn0: np, I thought I was having the problems, hangouts doesn't like me all the time
[09:03] jam: i'm in the hangout now
[09:14] menn0: I'm finally there, are you still connecting?
[09:14] jam: i've been in it for a while
[13:27] axw: reviewed your log buffering
=== frankban|afk is now known as frankban
[23:57] hml: ping
[23:59] rich_h: hi
[23:59] or rick_h even. :-) hi
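[editor's sketch] A last sketch for the point made at [06:08]: timing of the pruner should happen while there are active inserts, mimicking charms still firing. It reuses the hypothetical prune package and pruneOlderThan helper from the first sketch; the inserted doc fields are made up for illustration.

    // pruneUnderLoad runs one time-boxed prune pass while a goroutine keeps
    // inserting fresh history docs, so removal throughput is measured under
    // concurrent writes rather than against a quiet collection.
    func pruneUnderLoad(coll *mgo.Collection, cutoff time.Time) (int, error) {
        stop := make(chan struct{})
        writerErr := make(chan error, 1)
        go func() {
            for {
                select {
                case <-stop:
                    writerErr <- nil
                    return
                default:
                }
                // Simulated charm activity; field names are illustrative.
                err := coll.Insert(bson.M{
                    "_id":     bson.NewObjectId(),
                    "status":  "active",
                    "updated": time.Now().UnixNano(),
                })
                if err != nil {
                    writerErr <- err
                    return
                }
            }
        }()
        removed, err := pruneOlderThan(coll, cutoff, 1000, 15*time.Second)
        close(stop)
        if werr := <-writerErr; err == nil {
            err = werr
        }
        return removed, err
    }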