[00:20] wallyworld: I've got a bug fix to land. Shall I target 1.24 first and after shipit land it in master?
[00:21] waigani: which bug?
[00:21] wallyworld: https://bugs.launchpad.net/juju-core/1.24/+bug/1384549
[00:21] Bug #1384549: Running Juju ensure-availability twice in a row adds extra machines
[00:21] waigani: 1.24 please. master is blocked. we will look to merge 1.24 into master and get all the fixes in one go
[00:22] wallyworld: okay, sounds like a plan - thanks
[00:22] np, great that you have a fix for that bug
[00:23] wallyworld: yeah, well see what you think...
[00:23] ah, have to branch 1.24, I was working against master
[00:35] thumper: HA bug fix: http://reviews.vapour.ws/r/1570
[00:59] waigani: did you ever get your virtual maas setup?
[01:00] wwitzel3: yes I did!
[01:00] wwitzel3: then I read a comment from nate that the bug I was working on didn't need maas to be reproduced :/
[01:01] wwitzel3: but I now have maas setup on my machine for future debugging
[01:06] Bug #1451626 was opened: Erroneous Juju user data on Windows for Juju version 1.23 <1.23>
[01:37] wallyworld: hey, sorry I missed the standup. kids kept waking up last night, so I slept in
[01:37] axw: np. hope everything's ok
[01:38] i've been trying to fix my farking internet
[01:38] wallyworld: yeah, just a bit cold and they keep rolling around and kicking their covers off...
[01:38] got a new modem
[01:38] oh, finally :)
[01:38] well, i didn't know what the issue was , so i'm trying swapping out components :-)
[01:39] axw: we'll be releasing 1.24 alpha 1 tomorrow, and then will look to merge 1.24 into master to bulk forward port the fixes
[01:41] axw: o/
[01:41] anastasiamac: howdy
[01:41] wallyworld: same question as waigani
[01:41] I have the patch for the debug hooks issue
[01:41] perrito666: great, merge into 1.24 first please
[01:42] perrito666: but do it tomorrow as it's past your eod
[01:42] axw: m good :D i have a branch for u to review when u settle :D
[01:42] wallyworld: seems there's some issues with charm.v5? mgz proposed a revert of one of my branches which updates dependencies.tsv
[01:42] axw: storage stuff is beta than coffee, u know :)
[01:43] wallyworld: k just wondering if it was required before the cut
[01:43] anastasiamac: sure, I'd like to try and prevent a revert of my work first though
[01:43] nah, that bug has been there for a while
[01:43] axw: sounds good
[01:44] axw: reverts r always painfull and 1.24 takes priority over my branch (against master) anyway
[01:44] axw: i looked at the revert and it looked like the commit u added for charm.v5 is more recent?..
[01:44] wallyworld: http://reviews.vapour.ws/r/1571/
[01:44] cheers people
[01:45] perrito666: thanks, will look
[01:45] anastasiamac: commits were added to v6-unstable, and dependencies.tsv apparently had that rev listed. so it says v5, but the commit wasn't on even in that branch
[01:46] axw: what a mess :(
[01:46] indeed
[01:50] axw: perrito666's change looks ok but i'd like a 2nd opinion, so after your charm.v5 adventures, would be great if you could look. no rush as the work will be landed tomorrow anyway
[01:53] menn0: with that upgrade bug, i really think juju should by default be changed to upgarde to latest stable. this whole only 1 version at a time thing is a relic of when we didn't do upgrades very well
[01:54] wallyworld: I agree
[01:55] i think there's a 1.24 bug related to that
[01:55] or at least a similar issue
[01:55] wallyworld: related to what?
[01:55] wallyworld: https://github.com/juju/charm/pull/126 <- PTAL
[01:56] an upgrade selected a surprising version
[01:56] wallyworld: right
[01:56] menn0: so doing upgrades right will sort of fix that implicitly
[01:56] axw: looking
[01:56] wallyworld: if we are going to support big upgrade jumps then we need to socialise that and have a CI test that upgrades from 1.18 to current stable
[01:57] yes
[01:57] wallyworld: I think i've found yet another upgrade bug going from 1.18 to 1.23
[01:57] wallyworld: just figuring it out now
[01:57] best find them now before we change behaviour :-)
[01:57] axw: how was the wrong rev added to dependencies.tsv?
[01:57] wallyworld: fuck knows
[01:58] someone was lazy
[01:58] * axw shrugs
[01:59] axw: i added $$merge$$, can't recall if there's a bot
[01:59] will merge manually if needed
[01:59] wallyworld: thanks, I think I can merge
[01:59] tests all pass right?
[01:59] ok
[01:59] what, run tests? I suppose I should, but it's pretty trivial
[02:00] all tests pass :)
[02:02] \o/
[02:04] axw: so when you proposed the juju core hooks change and updated the charm.v5 revision, did you just grab the tip revision from master?
[02:05] wallyworld: no. I branched off v5 and added my commit, then updated juju to use that. someone else had previously pulled a commit from v6-unstable
[02:05] so when I made my change, the fixes on v6-unstable were lost
[02:07] ah right
[02:15] wallyworld: ok, definitely another upgrade problem
[02:15] axw: looks like it was a cloud sigma change - they use a stale version of v5
[02:16] wallyworld: don't understand
[02:17] they commited to dependencies.tsv and updated it with a stale charm.v5 revision ie one that was about a week old
[02:17] wallyworld: I was trying to avoid laying blame, but the problem occurred on this commit https://github.com/juju/juju/commit/73b4f331085fabd5bef5188e5af193118ec573fb
[02:18] wallyworld: note the change in dependencies.tsv
[02:18] the commit changed to does not exist in v5
[02:18] not so much laying blame but understanding the cause so we can ensure it doesnt happen again
[02:19] wallyworld: I replied to mgz's email, with a note on what to avoid doing
[02:19] and explained what happened
[02:19] rightio, need to check mail
[02:31] waigani: can you please hold off on retrying that build for a minute
[02:31] your change doesn't look quite right
[02:31] axw: yeah for sure
[02:31] axw: what's your concern?
[02:32] waigani: I'm just reading over the code and thinking, pretty sure what was there was doing the right thing... need to go over it some more
[02:33] axw: servers that were not available and wanted vote where getting demoted
[02:33] axw: that's the crux of the bug
[02:34] waigani: that's exactly the specified behaviour of ensure-availability
[02:34] axw: with the fix servers that not available, have vote and want vote now get demoted
[02:35] axw: "adding vote" == want vote + don't have vote
[02:35] axw: we shouldn't be demoting servers that are getting a vote added to them
[02:36] waigani: why not?
[02:36] the machine is unavailable, so why would we not demote it?
[02:36] that prevents another machine from becoming a state server
[02:40] waigani: I suspect what's happened in the bug scenario is that the machine's pinger hasn't started yet
[02:40] so it looks like the machine isn't available, but it kinda-sorta really is
[02:40] axw: I've manually tested this on aws, I ran ensure-availability, turned off an instance, reran ensure-availability, dead server got demoted an new one got added
[02:41] waigani: yes... that is what is meant to happen
[02:41] axw: and it still does
[02:44] waigani: how long did you wait? if you wait long enough, that state server should lose its "has vote" status
[02:45] waigani: i.e. because the peer grouper has noticed that the mongo has stopped responding
[02:45] waigani: and once that happens, the machine will be "maintained" forever
[02:47] axw: I've just destroyed the environment, but let me do the same again and we can poke around
[02:48] waigani: thanks. I think you should be able to see from the peergrouper logs which machines have/haven't got a vote
[02:48] oh you already said it's in the status, never mind
[02:49] axw: yep. this is from last run: http://pastebin.ubuntu.com/10987944/
[02:50] axw: I'm just bootstrapping again now, I'll run HA, turn off an instance, run HA again. Then I'll ping you.
[02:50] waigani: thanks
[03:00] wallyworld: can you please close https://github.com/juju/juju/pull/2209?
[03:00] axw: yep, sure
[03:01] wallyworld: ehm, hmm. on 1.24, I get "2015-05-05 03:01:02 WARNING juju.cmd.envcmd environmentcommand.go:253 invalid JUJU_CLI_VERSION value:"
[03:01] I guess something is not checking for ""
[03:01] I guess something is not checking for ""
[03:01] oops
[03:02] axw: hmmm, was supposed to check
[03:02] i'll double check and fix
[03:03] axw: ah, the warning is printed unnecessarily
[03:03] ffs
[03:07] axw: here is where the env is up to: http://pastebin.ubuntu.com/10987993/
[03:07] axw: stopped machine is being demoted, new machine is getting the vote
[03:07] waigani: it'll do that as long as the machine "has-vote"
[03:07] waigani: I think it should lose the vote after a long enough time of mongo not being running
[03:10] axw: are you saying, if I stop an instance and don't run HA straight away, but wait, that instance will loose "has-vote" and then the logic is screwed?
[03:10] waigani: right (assuming has-vote does get lost)
[03:11] axw: how about l stop another instance and let it sit for the afternoon - checking in now and then
[03:12] waigani: I'd appreciate it. I'm looking at the code now, but do need to get back to other things soon
[03:12] axw: I'll also read through the code and see if I can spot anywhere that removes it
[03:12] waigani: worker/peergrouper is the place to look
[03:12] axw: ha, I was just about to ask - thanks
[03:13] axw: on an interesting aside, I used destroy-environment --force on last env, but it didn't terminate the stopped instance - it's still there in aws
[03:14] waigani: hrm :/
[03:15] axw: yeah. Other instances are marked as terminated. But that one is still there as "stopped".
[03:15] I'm guessing that we're filtering for only running instances
[03:18] axw: a reasonable guess.
[03:22] waigani: okay, here's a scenario where it would (I think) occur: ensure-availability, then immediately stop jujud and mongod on the new state server
[03:22] i.e. regardless of has-vote being lost
[03:22] (I think has-vote will never be lost unless another eligible state server comes along)
[03:24] axw: I currently have two stopped state server machines: one with vote, one without. How about I spin up the one without?
[03:24] axw: then we'd have another eligible state server right?
[03:24] waigani: sure. I'll try this in my own env too
=== beisner- is now known as beisner
[03:36] axw: is this the scenario we need: http://pastebin.ubuntu.com/10988070/
[03:37] axw: machine 2 is started and has no vote. machine 3 is stopped and has vote
[03:38] waigani: that's *a* scenario, yeah. what happens with ensure-availability once all of the agents are running/available?
[03:40] axw: yikes, I just noticed that 0 and 1 are down - they shouldn't be?
[03:41] axw: okay they are up now, I don't understand why they went down?
[03:41] waigani: so anyway, I'm going to have to NOT LGTM this. with your change, if a state server machine never gets provisioned, then it'll never get replaced
[03:42] waigani: there's a small window as machines and units start where they are marked as down - the default presence value is 0
[03:43] this is all messed up but that's how it was written
[03:43] axw: yeah, but 0 and 1 were started, I just stopped and started 2. Anyway, focusing on the bug at hand
[03:44] waigani: sorry, I don't know why those agents are going up and down
[03:44] axw: so they are up now. the stopped server still has no vote.
[03:44] waigani: I don't think there is a simple fix here. We need to improve agent presence, as wallyworld mentions
[03:46] s/improve/burn in hell
[03:46] it is horrid
[03:46] hehe
[03:46] wallyworld: not landing stuff on a blocked branch are you?
[03:46] axw: how could a state server get the vote without being provisioned?
[03:47] thumper: yes i am
[03:47] thumper: i talked to curtis a few days ago and we agreed onecould be prmatic
[03:47] and this ia a small fix to a bug in a previous landing
[03:48] waigani: it doesn't ever need to have gotten a vote.
[03:48] axw: sorry, got my logic backwards - it needs the vote to get demoted
[03:48] right
[03:48] axw: once you finish your conversation, do you have time for a quick chat about storage with me and anastasia?
[03:49] axw: so we can't tell the difference between a state server that is un-provisioned because of an error and one that is in the process of being provisioned
[03:49] waigani: or one where provisioning failed but can be retried, or a provisioned one where the agents's dead, or mongo's dead, or ...
[03:50] wallyworld: sure, tanzanite hangout?
[03:50] yep
[03:50] anastasiamac: ^^^
[03:50] wallyworld: m here :D
[03:52] wallyworld: can u hear us?
[03:52] no, hangout is hanging
[03:52] ironic
[03:53] ffs
[03:59] axw: what if we use a wait and retry policy?
[03:59] axw: after n retries and x time, it gets demoted?
[04:05] waigani: IMO that should be part of the "availability"
[04:05] which is more general than HA
[04:05] although HA availability *should* consider more than just agent availability
[04:09] axw: but it's not (or the window is not big enough). because we are getting servers that are not available because they are in the process of being provisioned
[04:09] I know it's not, I'm saying it should be
[04:09] it's known that agent availability/presence is not great
[05:11] wallyworld or axw: I am seeing a case of incorrect field ordering of docs in mongodb which is causing an assert to fail
[05:12] oh?
[05:12] wallyworld or axw: could be upgrade related. i see it when upgrading to 1.23
[05:12] example?
[05:12] wallyworld or axw: it breaks address setting on machines
[05:13] wallyworld: i see it happen fairly often on either the addresses and or machineaddresses fields
[05:13] "machineaddresses" : [
[05:13] {
[05:13] "addresstype" : "ipv4",
[05:13] "networkscope" : "local-machine",
[05:13] "value" : "127.0.0.1"
[05:13] },
[05:13] {
[05:13] "value" : "192.168.122.107",
[05:13] "addresstype" : "ipv4",
[05:13] "networkscope" : "local-cloud"
[05:14] },
[05:14] {
[05:14] "addresstype" : "ipv6",
[05:14] "value" : "fe80::5054:ff:fe92:1645"
[05:14] },
[05:14] {
[05:14] "value" : "fe80::c8a9:ceff:fef4:18fb",
[05:14] "addresstype" : "ipv6"
[05:14] }
[05:14] ],
[05:14] note how the field ordering isn't consistent
[05:14] mongo cares about field ordering when doing comparisons
[05:14] so the assertion fails
[05:14] there have been issues in the past where something would sort addresses when it shouldn't; order of addresses in state is meant to be maintained
[05:14] axw: it's not the order of the docs that's the issue. that seems to be correct.
[05:14] axw: it's the order of the fields inside the docs
[05:15] hmmm
[05:15] menn0: perhaps someone changed the struct definition
[05:15] axw: in this case the 2nd and 4th docs matches the struct definition. the others don't
[05:15] hrm :/
[05:15] wallyworld: i've checked that and it doesn't look like it
[05:16] wh should mongo care about field ordering ffs?
[05:16] what sort of retarted "db" it that
[05:16] i'm wondering if it's related to the dict randomisation in Go 1.3
[05:16] i'm on vivid today
[05:16] but mgo/bson seems to do everything right
[05:16] i might stick some debugging stuff into the transaction layer and see what I can find
[05:17] even with map odering, surely a map comparison doesn't care
[05:17] wallyworld: mgo/txn actually makes a query to MongoDB to do the comparison
[05:17] wallyworld: so it's being done at the MongoDB side, so field ordering does matter.
[05:17] so are you saying if mgo gets a map{1:2, 3:4} that will be considered diferent to {3:4, 1:2}
[05:18] if so, wtf
[05:18] wallyworld: if you replace the word "map" with "document" then yes
[05:18] why?
[05:18] that just makes no sense to me
[05:18] wallyworld: beats me but that's what it does
[05:18] ffs
[05:18] i hate mongo even more now
[05:19] http://devblog.me/wtf-mongo
[05:19] wallyworld: ^^^ first item
[05:19] great link title
[05:21] menn0: so the link offers a work around
[05:21] alter the find syntax slightly
[05:21] wallyworld: that doesn't really work for this code.
[05:21] mwhudson: https://go-review.googlesource.com/#/c/9526
[05:21] wallyworld: it's asserting that a long list of network address structs matches
[05:21] if you have time
[05:21] a bit messy
[05:22] but it gets the job done
[05:22] wallyworld: I guess you *could* translate it
[05:22] menn0: we *may* have to :-(
[05:23] wallyworld: but first I'd like to understand what's going on because as far as I can see we only write using our structs which should preserve order (i've been trying to break mgo/bson and so far it's doing the right thing)
[05:23] menn0: are you sure the struct field order is that same for 1.18 vs 1.23 etc?
[05:24] wallyworld: fairly sure... I've been comparing both codes bases
[05:24] wallyworld: i'll do some more digging
[05:24] wallyworld: this is pretty frustrating
[05:24] indeed :-(
[05:41] wallyworld: well fuck, I see the problem and this is big
[05:41] oh no
[05:41] wallyworld: just about all the env UUID migrations use one helper function
[05:42] wallyworld: and that helper loads the doc to be migration into a bson.M, modifies the _id, adds an env-uuid field and writes it back out
[05:42] wallyworld: b/c it's loading it into a bson.M which is just a map (and in this case a map of maps)
[05:42] wallyworld: and b/c it's Go 1.3
[05:43] wallyworld: the field orderings get messed up
[05:43] oh dear
[05:43] needs to be into a slice
[05:43] wallyworld: this will be a problem where-ever we have a Go 1.3+ compiled Juju and nested docs or arrays of docs
[05:44] luckily our releases are all 1.2 at present
[05:44] wallyworld: I'm pretty sure that the releases for vivid are being compiled on 1.3
[05:44] wallyworld: at least I thought I saw sinzui say that
[05:44] hmmm, could be
[05:44] shit
[05:45] I can fix the migration helper
[05:45] luckily i feel most vivid usages of juju are to play with openstack kilo
[05:45] new environments
[05:46] but still
[05:46] i'll create a new ticket
[05:46] i think we are fine to ship aplha with this unfixed
[05:47] but needs fixing for 1.24.0
[05:47] wallyworld: yeah, it's still alpha. if you upgrade an important environment to an alpha release then you deserve it.
[05:47] yup
[05:48] and we will add in big red letters to release notes
[06:15] Bug #1451674 was opened: Broken DB field ordering when upgrading to Juju compiled with Go 1.3+
=== urulama is now known as urulama__
[07:42] fwereade, ping?
=== bradm_ is now known as bradm
=== liam_ is now known as Guest74585
[08:09] morning o/
=== axw_ is now known as axw
[08:14] wallyworld: when you're free, please take another look at the storage hook order review. it's changed a bit, to fix unit termination
=== ashipika1 is now known as ashipika
[09:00] hangout, omw
[09:01] voidspace: hangout!
[09:01] dooferlad: omw
[09:28] mgz: hiya
[09:29] rogpeppe: hey
[09:29] mgz: i've just landed a change to godeps that should make our life easier: https://codereview.appspot.com/230460044/
[09:29] mgz: (i wanted your review but you weren't in sight :-) )
[09:30] rogpeppe: I saw that one, thanks (I also added a hack in CI to delete stuff, your change is much nicer)
[09:30] mgz: this should put paid to all those transitory dependency issues
[09:30] rogpeppe: yeah, sorry, but saw it was getting looked at
[09:31] mgz: if your change just deleted extraneous dependencies, then that was kinda flawed... :)
[09:31] rogpeppe: indeed. relies on the build/test actually catching stuff it's missing but cares about.
[09:32] mgz: well, the whole point of the test (and why the build was failing) was to catch places where we have a dependency that's not mentioned in dependencies.tsv, i think
[09:32] yup
[09:32] mgz: you'd've been better off just removing that check...
[09:32] mgz: but anyway, now there is a Better Way :)
[09:32] anyway, not using godeps is just much better
[09:33] what's the alternative?
[09:33] or did you mean "not using go get" ?
[09:33] rogpeppe: we also needed an actually correct tarball for releasing, which means not including code we don't declare (and have done debian copyright checking on and such like)
[09:34] rogpeppe: I did
[09:34] using godeps, not using go get
[09:36] mgz: actually deleting the extras was probably fine in fact, 'cos it would cause the build to fail if the dep was actually being used
[09:36] that was the idea.
[09:36] mgz: BTW I was using godeps -P 20 to fetch dependencies and it seemed to work reliably. i wonder if you might want to experiment with that to speed up turnaround time.
[09:36] mgz: re https://github.com/juju/juju/pull/2209#issuecomment-99005800 -- the last windows unit test job passed, so I don't think 1.24 is blocked by the issue anymore?
[09:37] axw, right, in process of opening both branches, one more forward port and some joyent junk to look at
[09:38] ok then
[09:41] oh, and functional-restricted-network is borked on 1.24 still... did that get a bug filed...
[09:54] can I have a re-stamp on eric's fix, http://reviews.vapour.ws/r/1576
[09:56] mgz: stamped
[09:56] ta
[09:59] morning all
[09:59] hey nate
[10:01] sorry if I came off as pissy in the email, should not be tracking down bugs late at night
[10:02] I thought it was fine. You're absolutely right that we shouldn't be landing branches that don't fix the blockers. To be fair, I did land a bugfix on 1.24 that was marked critical... but that one on master I had intended to let just sit there until master was unblocked.
[10:15] davecheney: looks good to me
[10:19] davecheney: thanks, btw
[10:48] morning
=== lazyPower_ is now known as lazyPower
[11:11] lazyPower: around?
[11:12] axw: back from soccer, will have dinner and then look
[11:13] wallyworld: your network seems to have settled, what did you do?
[11:13] perrito666: replaced the @$*@&!^ modem
[11:13] that happens
[11:16] * perrito666 destroy one cheap AP per year and one modem every two
[11:19] * perrito666 frowns at bugs that asume one knows things
[11:31] perrito666: i am
[11:35] lazyPower: https://bugs.launchpad.net/juju-core/+bug/1444861
[11:35] Bug #1444861: Juju 1.23-beta4 introduces ssh key bug when used w/ DHX
[11:36] I would appreciate a bit more detail on reproducing this, I have never used the plugin
[11:36] so what provider where you using, version of the plugin, steps?
[11:38] perrito666: I have a plan for that fix, I'll add to the bug
[11:38] it's basically an api versioning/breakage issue
[11:40] mgz: tx
[11:41] mgz: do you know what time sinzui starts his day? I am waiting for his cut to merge a minor bug in 1.24
[11:41] perrito666: shortly
[11:41] perrito666: I'm trying to open both master and 1.24 currently, master should be clear shortly
[11:42] perrito666: launchpad is rejecting edits to the bug - is adding it as a comment fine?
[11:43] lazyPower: it is
[11:43] Updated as comment #2
[11:45] tx
[11:58] axw: reviewed with a test suggestion
[12:06] mgz: should I wait for your input on the bug or go on and try to fix it?
[12:06] perrito666: enarly there
[12:20] perrito666: commented, yell if anyhting doesn't make sense
[12:21] * perrito666 yells because life is senseless
[12:21] I think you can do a 1.23 fix that unbreaks the api, but refactors enough to be useful for a new api call with a better result struct
[12:22] * perrito666 just read: make a dirty hack just don t make it look as a dirty hack
[12:22] ... I think it's pretty elegant really... :)
[12:23] you make runSSHKeyImport do what we want, the just adapt how it's mapped to ErrorResults in ImportKey different
[12:23] I havent read the comment, I was talking about what you justsaid
[12:24] :P
[12:39] mgz: are there any unverified bug fixes on 1.24? can I JFDI a fix for storage?
[12:40] axw: just gone through, functional-restricted-network is still unhappy but all blockers done
[12:40] one sec and I'll unblock, no forcing needed
[12:40] mgz: sweet, thanks
[12:41] axw: go ahead and $$merge$$
[12:42] mgz: thanks
[12:42] I'll file something about the network test, may not be juju breakage but something needs fixing
[12:50] meh, I think 1.24 bootstrapping may actually be doing something wrong on limited networks
[12:54] mgz: if I cancel the current github-merge-juju job on Jenkins, will it clean up the instance and so on?
[12:54] missed a test fix while changing a signature
[12:55] axw: possibnly not, but we'll catch and clean it up later anyway
[12:55] axw: beware, that might be mine
[12:55] mgz: ok, thanks
[12:55] so perhaps you still have some room to fix :p
[12:56] perrito666: it's not, promise
[12:56] anh, axw won the race
[12:56] axw: bummer
[12:58] mgz: :( where'd that blocker come from
[12:58] axw: is terminated
[12:59] so yeah, our script does handle manual aborts fine
[12:59] mgz: cool
[13:00] menno filed it, probably doesn't need to block 1.24 or trunk at this point
[13:01] gimme a sec
[13:01] thanks
[13:01] 1.23 would be fair
[13:02] perrito666, mgz; 'sup
[13:02] voidspace, TheMue: https://plus.google.com/hangouts/_/canonical.com/maas-juju-net
[13:03] sinzui: so, but 1451674 should absolutely block a release on any of those branches, but doesn't actually need to block development on 1.24/trunk right?
[13:03] we have bug fixes that won't interfer with that, it's not a new regression
[13:04] *bug 1451674
[13:04] Bug #1451674: Broken DB field ordering when upgrading to Juju compiled with Go 1.3+
[13:04] mgz: it won't block the alpha release today
[13:04] sinzui: so, we do not want to mark it as a blocker, right?
[13:04] we shouldn't necessarilyt block an alpha release on thsi bug
[13:04] sinzui: just wanted to know what time where you going to cut 1.24.xxxx so I can merge something into 1.24 (a fix)
[13:04] it's an alpha - so we can tell people not up upgrade using vivide
[13:04] wallyworld: Y U NO SLEEPING
[13:04] 1.24 should be open atm
[13:04] axw: ^^^^
[13:05] woop
[13:05] mgz: it is a blocker, because we cannot do a real release with it, it is not critical for 1.24-alpha at this moment
[13:05] sleep gives you cancer
[13:06] sinzui: so we need to land storage work so we can get 1.24 alpha out
[13:06] okay, we need to sort out what we're doing here then, currently blocker prevents landing, which I want for "we have new regressions that need handling before ci can do useful work"
[13:06] sinzui: so does everything else
[13:06] wallyworld, oh, I thought we agreed to release today.
[13:06] this issue should be resolved before 1.24.0 but not alpha1
[13:06] that's not the case for this bug, it's a real upgrade issue, but not a new one from recent landings that needs resolving before we get meaningful ci results
[13:06] sinzui: we did, hence trying to land thi storage fix which we cant release witout
[13:07] I'd prefer we don't jfdi things through, but label bugs to reflect the state of a branch vs release/devel clearly
[13:07] mgz: CI hasn't failed so far with this bug, so it's not fully testing gor it
[13:07] mgz: this issue should not block landings for 1.24 alpha 1
[13:07] yup, we are not testing vivid state servers or --upload-tools from vivid
[13:08] wallyworld, and you know why, there wasn't a version to test with
[13:08] we need to be prgatic about blocking landings
[13:08] wallyworld: that is what I am saying
[13:08] We can add upgrade testing today for vivid and we can add gce
[13:08] yes, all good reasons why this lsipped through
[13:09] I want you guys to not land trivial stuff when which means I have to read 2700 line diffs to work out which change broke something, but blocking currently doesn't cause that pain
[13:09] you shouldn't have to do that
[13:09] mgz, to be clear, --upload-tools is just a gateway for developers doing what we officially tell users not to do
[13:09] that's devel's job to fix our stuff
[13:10] mgz, vivid's compiler is 1.3 and until we released last week, it wasn't possible to test this case
[13:10] wallyworld, all devel release come with an advisory that it doesn't support upgrade, so I remain unconcerned
[13:11] sinzui: so, how do we want to mark bugs that we cannot release with, but have already been tracked down and have limited disruption for ci
[13:11] I'd have thought "critical" but no "blocker" tag
[13:11] sinzui: +1, me too, that is my position also
[13:11] mgz, the opposite. it is a blocker that that release team is tracking. it is just high on the alpha
[13:12] hence the push back for blocking landings
[13:12] * sinzui already fixes it
[13:12] okay.
[13:12] so, 1.24 currently unblocked
[13:13] there *is* an issue with cloud-utils but I'm trying to get details on that still
[13:13] mgz: I am more concerned about what we don't know. Why does the restricted network test fail. I fear Juju changes something that breaks private clouds
[13:14] sinzui: canonistack also fails the same way - which supports that idea
[13:14] mgz, expand the list at http://juju-ci.vapour.ws:8080/view/Juju%20Revisions/job/functional-restricted-network/ and you will see 99% passes become 90^ fails
[13:14] am trying to find out more
[13:26] sinzui: that last storage branch just landed
[13:27] wallyworld, okay. I will watch the next run
[13:27] i don't think there's anything else blocking a release of alpha 1 as soon as CI passes
[13:36] mgz: i'm just trying to get my head around git cherry-pick to forward-port the changes that have landed in charm.v5. I was a bit confused by 87cf11b8cb2ab830c4ed9c2eab4d633004bb4689 which *looks* like a merge (from the message) but isn't actually. did you by any chance use cherry-pick to create that branch?
[13:36] s/that branch/that commit/
[13:39] yeah, charm got super-confused
[13:39] because the head of v5 moved
=== kadams54-away is now known as kadams54
[13:44] mgz: hmm, i hope that wasn't my fault
[13:45] rogpeppe1: may have been, but we should be able to fix up somehow
[13:46] mgz: ah! i think i might know what happened
[13:47] mgz: i accidentally pushed the new v6-unstable branch to v5 and quickly reverted it, but i guess i must have force-pushed an earlier version
[13:47] rogpeppe1: my theory was v5 got renamed to v6-unstable
[13:47] aha, that also sounds possible
[13:48] mgz: probably because i hadn't done a git fetch recently enough
[13:48] anyway, we probably want to check nothing else got dropped, axw merged the two I highlighted back into v5
[13:49] mgz: ... and i thought it was unproblematic because i realised my mistake after only 5 seconds...
[13:49] git is dangerous:)
=== kadams54 is now known as kadams54-away
=== urulama__ is now known as urulama
[13:55] axw: ping
[13:58] rogpeppe1: pong
[13:58] axw: i just noticed https://github.com/juju/charm/pull/125/files
[13:59] axw: and thought i should mention that, for future reference, that kind of change is technically API-breaking
[13:59] rogpeppe1: it was not in any release yet
[13:59] axw: currently we don't have the 'bot guarding API-breaking changes, but we should have...
[13:59] axw: any juju-core release?
[14:00] rogpeppe1: correct
[14:00] axw: there are quite a few users of the charm package
[14:00] axw: not just juju-core
[14:00] rogpeppe1: ok. does anything care about storage hooks yet?
[14:00] axw: and the major versioning *should* apply regardless of release status
[14:01] axw: dunno
[14:01] axw: we try to keep to the letter of the rules about API versioning, i guess
[14:01] rogpeppe1: is there a merge bot there?
[14:01] axw: because it's so easy to break things [14:01] perrito666: not yet [14:01] rogpeppe1: ok, will remember that for next time [14:02] axw: thanks. [14:02] I will happily mergebot charm now I have a godeps that can actually work# [14:02] that explains why the $$merge$$ in https://github.com/juju/charm/pull/122 never did anything [14:02] axw: tbh the gopkg.in thing is still in trial - it's good to try to be honest with ourselves [14:03] natefinch: stand up [14:03] issue before was it didn't have dependencies.tsv but had branches that din't work against tip of all deps [14:03] mgz: right. it needs a dependencies.tsv [14:03] now I can at least manufacture one that will work [14:03] mgz: cool === mgz is now known as mgz_ [14:25] mgz, rogpeppe1: do we have more than one dependencies.tsv in the tree of dependencies that juju uses? [14:25] natefinch: not sure if this is related; axw has a pr for the charm.v5 stuff: http://reviews.vapour.ws/r/1575/diff/# [14:28] mgz_: what should I be doing with https://bugs.launchpad.net/juju-core/+bug/1450919 ? The windows patch that got backed out, I can fix easily. But I have no idea how the charm.v5 code has anything to do with windows breakages [14:28] Bug #1450919: many window unit tests failures [14:29] natefinch: in a call [14:31] dooferlad: sorry I missed the call. a) I forgot it was Tuesday - feels like Monday to me. [14:31] dooferlad: and b) one my neighbours got burgled (by another of my neighbours - in a normally sleepy village!) [14:31] dooferlad: and I got inveigled [14:31] dooferlad: anything I should know about? [14:31] voidspace: no, all quiet [14:32] voidspace: nice negbours :) [14:32] perrito666: yeah :-/ [14:35] voidspace: dooferlad: hey just a heads-up. we are cutting 1.24-alpha1 today if you can wrap up any of the bugs you guys are working on [14:35] katco: thanks [14:35] voidspace: np, and wb [14:35] katco: I don't think I've got anything I can get in 1.24 today [14:35] katco: o/ [14:36] katco: thanks. 
I am in a similar situation to voidspace :-| [14:36] voidspace: wow, I was sure inveigled must have been a typo. It's not often people use words in common conversation that I've never even heard of before [14:36] dooferlad: understood, and wb as well [14:36] katco: is it already unblocked? [14:37] perrito666: no, wwitzel3 is working on the blocking bug [14:37] natefinch: really? voidspace does that to me all the time [14:37] perrito666: I guess he *is* british [14:37] perrito666: well, and natefinch is working on the windows blocker [14:37] but then again, my list of english words must be way shorter than yours [14:38] natefinch: heh [14:38] perrito666: I'm sorry... [14:39] voidspace: don't be, it's enriching, most of my english is otherwise based on technical words only [14:39] yeah, I always like learning new words [14:39] me too [14:39] or three actually I guess... [14:40] voidspace: although none of the definitions of inveigled I found make sense in your sentence [14:40] perrito666: get involved by trickery and subterfuge [14:41] perrito666: he got snookered [14:41] * natefinch is probably not helping ;) [14:41] perrito666: there wasn't really much subterfuge, but it was rather against my will that I got dragged into it [14:41] hehe [14:41] natefinch: ... I am not even gonna try.... [14:41] voidspace: ahh I see [14:41] perrito666: I was inveigled into the situation [14:45] TheMue: ping [14:45] voidspace: pong [14:45] TheMue: bug https://bugs.launchpad.net/juju-core/+bug/1442801 [14:45] Bug #1442801: aws containers are broken in 1.23 [14:46] TheMue: for juju-core this is marked as assigned to you and fix released [14:46] TheMue: as far as I *know*, part of the fix is putting addressable containers behind a feature flag [14:46] TheMue: which hasn't yet landed [14:46] TheMue: am I incorrect?
[14:46] TheMue: (I have a PR for putting addressable containers behind a feature flag on trunk and I'm wondering if this is the right issue) [14:47] TheMue: if I am correct, I'll JFDI it as it's a critical bug [14:47] voidspace: it has been a small fix by dooferlad, ported by dimiter to 1.24 and now by me [14:47] TheMue: ah [14:47] TheMue: so it's not the issue I'm thinking of [14:47] TheMue: thanks [14:48] TheMue: this was the DNS search domains one then I guess [14:48] voidspace: take a look at http://reviews.vapour.ws/r/1564/diff/#, there you see the few changes [14:48] ah ok [14:48] not that one either :-) [14:48] TheMue: thanks [14:50] voidspace: yw [14:51] fwereade, do peer relations remain accessible during upgrade-charm execution? [14:52] jamespage, yes, they should do [14:52] jamespage, are you seeing otherwise? [14:52] fwereade, not sure - still triaging [14:52] fwereade, maybe bad charm code [14:55] evilnickveitch: typo fix branch for review: https://github.com/juju/docs/pull/410 [14:56] bac, thanks [15:40] fwereade: leadership log spam review if you have a moment (small review): http://reviews.vapour.ws/r/1577/ [16:01] natefinch: sorry for not answering - i was in a call, then forgot you'd asked a question... [16:01] natefinch: the answer is yes [16:01] natefinch: (but i don't think it matters) [16:01] pwd [16:02] rogpeppe1: /home/hduran/gocode/src/github.com/juju/juju [16:02] * perrito666 hates to leave wrong window command unanswered [16:02] omg i've changed identity! [16:03] rogpeppe1: I am pretty sure you would be waaay more freaked if I had produced your pwd [16:04] perrito666: i don't think it would be hard to guess... [16:04] perrito666: well, one of them anyway [16:23] all blockers cleared! [16:29] Bug #1450919 changed: many window unit tests failures [16:29] Bug #1451100 changed: TestCheckProviderProvisional fails on ppc64 [16:31] wwitzel3: test ping [16:31] katco: pong, yes!
[16:31] great success [16:31] wwitzel3: :) [16:37] abentley, mgz, katco, wallyworld: CI blessed master, and it closed the remaining blocking bugs by itself [16:37] sinzui: yay :) [16:38] CI will now test 1.24 with the new storage feature [16:38] Bug #1450919 was opened: many window unit tests failures [16:38] Bug #1451100 was opened: TestCheckProviderProvisional fails on ppc64 [16:47] Bug #1450919 changed: many window unit tests failures [16:47] Bug #1451100 changed: TestCheckProviderProvisional fails on ppc64 [17:12] katco, shit it [17:12] er [17:12] ship it [17:12] ... [17:12] * fwereade crawls off to hide in a corner somewhere [17:12] that's not how we do releases around here fwereade :P [17:15] lol === liam_ is now known as Guest95660 [17:25] LOL [17:39] * fwereade is off to catch some crabs with laura, will be back later [17:40] maybe even an eel === redelmann is now known as redel|meating [17:50] wwitzel3: looks like you don't have to look at that critical bug... blessing of master closed it === redel|meating is now known as redel|failmeadin === redel|failmeadin is now known as redelmann [18:00] katco: awesome, that's good timing too [18:00] katco: since it was on my list for after lunch :) [18:01] :) [18:22] is anyone looking at bug 1451674?
it's a blocker for master [18:22] Bug #1451674: Broken DB field ordering when upgrading to Juju compiled with Go 1.3+ [18:28] fwereade, that sounds like awesome fun [18:29] katco, I'm just EODing, if no one picks it up between now and me starting tomorrow I'll take a look [18:29] mattyw: looks like menn0 might be involved, but i'm sure ian will coordinate [18:29] katco, but if someone is able to look at it in the meantime that would be nice - I have stuff waiting to land :) [18:29] mattyw: ty though [18:29] mattyw: yeah :p [18:32] wwitzel3: perrito666: natefinch: cherylj: i need a volunteer ^^ [18:38] (let ((volunteers '(wwitzel3 perrito666 cherylj))) [18:38] (elt volunteers (random (length volunteers)))) => cherylj [18:38] cherylj: what are you up to? [18:38] :) [18:39] katco: sorry, was on the phone with the vet. Had to take my cat in for surgery today. But other than that, I can help [18:40] eek.. everything ok? [18:40] Yeah... it wasn't an emergency surgery [18:40] oh good [18:40] she had to get some teeth extracted and it turned out to be a lot worse than they thought initially [18:40] fun times [18:41] oh =/ yeah we almost had to do that to one of our cats [18:41] hope they're ok in the end [18:41] can you take a look at bug 1451674? it doesn't seem like it should be too bad... looks like a map ordering issue [18:41] Bug #1451674: Broken DB field ordering when upgrading to Juju compiled with Go 1.3+ [18:42] yeah, I'll take a look [18:42] cherylj: ty! fyi, it's a blocker (as indicated by the blocker tag) [18:42] yay! [18:42] hehe [18:43] lol [18:43] cherylj: you will be a hero to all, as TheMue was in the yesterday [18:44] cherylj: your name will ring out in CI logs across the build farm [18:44] well that all depends on the timeliness of my fix [18:44] heh [18:44] we have faith! :) [18:44] start with 1.23 and work your way up, as is custom [18:45] ah, menn0 was talking about this in our standup yesterday [18:46] oh awesome, so you have some context then? 
[18:47] not much more than what's in the report [18:59] hey I am back, cherylj if you get tired of it you can toss it this way [19:02] Thanks, perrito666. I can take it, but wouldn't mind some pointers as this part of the code is all new to me. [19:03] perrito666: Do I just need to change the doc to be bson.D rather than bson.M? [19:05] cherylj: yes, you might need to do some cascade fixing since I have seen some queries use .M just to make the envuuid thingy and will blow if you change it [19:06] ok, cool, thanks [19:06] if it weren't for point 2 on that report this would not be critical [19:26] are we supposed to be using go 1.3 for our local development? [19:27] cherylj: not really, but juju in vivid is supposed to be compiled with it iirc [19:27] ok, I'm just trying to compile with 1.3 to try and recreate and I'm not having a lot of luck [19:28] cherylj: mm, test not failing right? [19:28] perrito666: it seems to be complaining that the dependencies are not also built with 1.3 [19:29] cherylj: rm -r $GOPATH/pkg [19:30] perrito666: thanks!!! [19:30] np that is a nasty one, I have it in my buildjuju alias === kadams54 is now known as kadams54-away [19:58] katco: huge thank you for fixing the leadership log spam [19:58] natefinch: lol === kadams54-away is now known as kadams54 [21:08] wallyworld katco xwwt: I won't be at the meeting in 25 minutes. we got CI passes for master and 1.24. [21:08] yay [21:08] sinzui: :) thanks for the heads-up [21:08] sinzui: so 1.24 release status? [21:09] katco: btw, i think those log levels should be trace [21:09] wallyworld, katco xwwt bug 1451674 is the last blocker. CI cannot test it yet. [21:09] very noisy even for debug [21:09] Bug #1451674: Broken DB field ordering when upgrading to Juju compiled with Go 1.3+ [21:09] sinzui: menn0 is onto that today i believe [21:09] wallyworld, We just got the bless. I will queue the release when I get back from my son's school [21:09] sinzui: yay tyvm [21:10] wallyworld: i think you're right.
some of them should remain debug, but i should have made the majority trace [21:10] yeah :-) [21:10] can be fixed for 1.24 final [21:10] wallyworld, sinzui: I am. will get on to it straight after the standup [21:10] \o/ [21:11] sinzui: we still having issues with restricted network tests last time i looked? [21:11] wallyworld, I just sussed it. I will let you read the MP when I explain it [21:11] wallyworld, not a juju issue [21:11] ok [21:11] menn0: wallyworld: sinzui: i believe cherylj is looking into that as well [21:12] cool :-) [21:12] katco: yep saw that [21:12] yeah, my next step was to bring in menn0. I'm having problems reproducing [21:31] davecheney: looks like all the patches to go 1.3 in vivid are backports so hopefully they are all fixed in 1.4.2 [21:31] (which is what debian sid has) [21:31] one is a patch to go-mode.el so that's irrelevant now [21:31] two are archive/tar things [21:31] no, three [21:32] and one is an armhf linker thing [21:32] hm, linker thing appears not to be present in 1.5 [22:02] davecheney, mwhudson: possible Go 1.4 release notes error: "much of the runtime was translated from Go to C". I thought it was the other way around. https://golang.org/doc/go1.4#performance [22:02] haha really? [22:03] lolz [22:04] er [22:04] mwhudson: I've just seen elsewhere in the notes that it's the other way around [22:04] menn0: that's not what i see on the page [22:04] oh [22:04] mwhudson: the performance section [22:04] mwhudson: sorry, ignore me [22:04] that says "much of the runtime was translated to Go from C" [22:04] mwhudson: I read it the wrong way [22:04] :) [22:04] sorry [22:05] * menn0 had a bad night with both children being awake a lot [22:05] * menn0 goes to make another coffee [22:05] seems like a great day to fix some bugs === kadams54 is now known as kadams54-away [22:12] menn0: find . -name *.go | xargs rm [22:12] no more bugs! 
=== kadams54-away is now known as kadams54 [22:17] Bug #1452050 was opened: Add log when firing hook [22:20] mwhudson: good plan! i wonder how hard it'll be to get that PR past review... [22:20] * menn0 seeks out a Juju devel who's equally tired... [22:20] menn0: it'll be huge, the reviewer won't read it properly [22:20] ha :) === StoneTable is now known as aisrael === hazmat_ is now known as hazmat === kadams54 is now known as kadams54-away [23:02] Bug #1452050 changed: Add log when firing hook [23:08] Bug #1452050 was opened: Add log when firing hook [23:17] perrito666: standup? [23:17] Bug #1452050 changed: Add log when firing hook [23:19] * thumper heading out for early lunch