=== kadams54 is now known as kadams54-away
[07:24] Bug #1442493 was opened: Openstack services failing on 1 node while deploying using JUJU
[07:26] TheMue: morning! o/
[07:27] dooferlad: morning o/
[07:27] TheMue: Could you review this for me? http://reviews.vapour.ws/r/1406/
[07:28] dooferlad: so the sapphire group is complete for today ;)
[07:28] TheMue: yep!
[07:28] dooferlad: yes, will do
[07:28] TheMue: thanks
[07:38] dooferlad: uff, reading the bug is longer than the fix :)
[07:39] TheMue: the effort of testing it on EC2 was larger than that by many times. MAAS just worked. Our lack of support for when there is no default VPC really screwed me over
[07:39] dooferlad: do you know why this SNAT rule has been in there? so doesn't its removal crash something?
[07:39] TheMue: not that I have found.
[07:41] dooferlad: hmm, ok. here I definitely need some good networking lessons to get more confident in understanding it
[07:41] TheMue: I am sure I can arrange that :-)
[07:42] dooferlad: would be great. in my past I mostly have done typical business software, an absolutely different area ;)
[07:43] dooferlad: you've got your ship it
[07:43] TheMue: thanks!
[07:44] yw
[08:13] * fwereade out for an hour or two
[09:00] TheMue: hangout?
[09:01] dooferlad: coming, just had a phone call
[09:05] this PR removes the testing deps from the production juju code: http://reviews.vapour.ws/r/1407/
[09:05] reviews appreciated (it's pretty trivial)
[09:05] rogpeppe1: will do after hangout
[09:06] TheMue: ta!
[09:30] TheMue: http://reviews.vapour.ws/r/1408/ is very similar to the last change :-)
[09:30] *click*
[09:32] dooferlad: thinking you're right, so ship it. alternatively %T is also valid, but no combination
[09:33] dooferlad, TheMue: there's nothing wrong with %#T
[09:33] http://play.golang.org/p/8XAOJVdeO2
[09:34] natefinch: why would you use %#T - it's identical to %T
[09:35] rogpeppe1: because I always forget it's the same
[09:35] natefinch: rogpeppe1: %#T is not documented, and %#v and %T behave differently, since %#v also shows the values
[09:36] Bug #1442541 was opened: hook name ommitted using juju-log
[09:36] hmm, does vet complain about it?
[09:37] TheMue: %T is the correct fix then. I just got done fixing that error message so it *didn't* use %v. It's doing type checking, so it should print out the type if it's incorrect, not the error's string, which doesn't tell you anything.
[09:37] dooferlad: ^
[09:38] natefinch: ic, so dooferlad, fix it, then ship it ;)
[09:38] dooferlad: and thanks for fixing my go vet mistake
[09:42] natefinch: no problem - shame I didn't ask you what your intention was before the first review :-)
[09:45] dooferlad: np :)
[09:51] dooferlad: $$merge$$, not %%
[09:52] TheMue: darn it!
[09:55] dooferlad: no wonder, after all this %... discussion
[10:07] natefinch: looks like go vet doesn't like the %#T
[10:08] mgz: ping
[10:15] natefinch: i think in the place you've used it, %#v would be a better bet, as if the test fails you really want to see the actual value there, not just the type
[10:15] rogpeppe1: very true
[10:16] natefinch: the test isn't doing type checking, it's doing value checking, BTW
[10:16] anyone have any idea how this build failure might be happening? http://juju-ci.vapour.ws:8080/job/github-merge-juju/2845/console
[10:17] Extant directories unknown:
[10:17] gopkg.in/juju/charm.v5
[10:17] rogpeppe1: regardless, printing out the string value of the error is quite a bit less than useful
[10:17] i'm presuming that's the reason for the build failure, not the vet message: apiserver/server_test.go:88: unrecognized printf flag for verb 'T': '#'
[10:17] natefinch: that's why i'd use %#v
[10:17] rogpeppe1: yep
[10:18] natefinch: the string value is almost certainly going to be more useful than just the type though
[10:18] natefinch: which will usually be just *errors.Err
[10:19] rogpeppe1: let's just agree that %#v is the correct fix :)
[10:19] natefinch: :)
[10:19] dooferlad: is that code already committed?
[10:19] natefinch: yes
[10:20] natefinch: as %T
[10:20] I did wonder about %T: %#v...
[10:21] dooferlad: %#v includes the type... %#v is basically %T with %v
[10:21] well.. no, that's a bad description
[10:23] dooferlad: %#v prints out the value as if it were go code to construct the value
[10:23] dooferlad: http://play.golang.org/p/6HKt-7atpL
[10:24] except of course, ignore the "type:" string at the beginning of each line, since that was just copied from the old text I had in there
[10:24] fixed example: http://play.golang.org/p/FtEYRp1dJS
[10:24] I need to stop trying to explain things before 7am
[10:25] natefinch: no worries :-)
[10:25] natefinch: actually the merge job failed, so I can easily switch back to %#v if you like
[10:26] dooferlad: yes please :)
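(A minimal runnable sketch of the verbs debated above; the error value here is hypothetical:)

    package main

    import (
        "errors"
        "fmt"
    )

    func main() {
        var err error = errors.New("boom") // hypothetical error value
        fmt.Printf("%T\n", err)  // *errors.errorString -- the dynamic type only
        fmt.Printf("%v\n", err)  // boom -- the error string; says nothing about the type
        fmt.Printf("%#v\n", err) // &errors.errorString{s:"boom"} -- Go-syntax value, type included
        // %#T prints the same as %T at runtime, but go vet rejects it:
        //   unrecognized printf flag for verb 'T': '#'
    }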
[10:31] rogpeppe1: about that error in the console... note that the error message is looking for gopkg.in/juju/charm.v5, not gopkg.in/juju/charm.v5-unstable
[10:31] natefinch: yes, i think i know what's going on now
[10:32] natefinch: i've updated that branch to change everything to use charm.v5
[10:32] natefinch: which should fix the issue (although it makes that branch quite a bit bulkier)
[10:32] natefinch: i've also changed it to fix your %#T thing
[10:33] rogpeppe1: I think dooferlad is making the %#v fix, though I suppose it can't hurt
[10:33] rogpeppe1: hey
[10:33] mgz: hiya
[10:33] mgz: too late! :)
[10:33] you're just loving breaking deps at the moment
[10:34] mgz: it's one of my favourite activities
[10:34] well, at least this one was an actual catch by the bot
[10:34] mgz: not really
[10:34] mgz: it's actually a bug in godeps
[10:35] mgz: because godeps uses go get to fetch new deps
[10:35] mgz: and there's no way to tell go get not to fetch recursively
[10:35] yeah, which just pulls in everything
[10:35] mgz: so really i think godeps needs to fork the go get vcs fetch functionality
[10:35] so, tells you to fix the imports, no?
[10:35] mgz: no, the imports are right
[10:36] rogpeppe1: I agree with that though, we need to not go get really
[10:36] mgz: the problem is that the tip of the repo has a different set of deps from the dep we wanted to use
[10:36] rogpeppe1: ah, that one again
[10:36] er... i can sort that for now I guess
[10:37] mgz: i'm fixing it by updating to use the latest deps
[10:37] I have a flip that just removes unknown deps rather than complaining, and assumes that if things build, the godeps-stated deps were in fact the intended ones
[10:37] mgz: the branch was all about updating deps anyway
[10:37] fair enough
[10:37] mgz: so here's the branch, updated (and now huge 'cos of all the import path changes): http://reviews.vapour.ws/r/1407
[10:39] mgz: if i want to change the description of a PR, should I do it in reviews.vapour or in github?
[10:39] rogpeppe1: well, reviews is the one people read
[10:39] mgz: i care mostly about the commit log message
[10:40] I guess change on github then
[10:41] * rogpeppe1 $$merge$$s
=== rogpeppe1 is now known as rogpeppe
[12:00] mgz: https://go-review.googlesource.com/#/c/8725/
[12:00] mgz: it won't fix it for now but sometime in the future, i hope
[12:00] mgz: (assuming it's accepted)
[12:35] fwereade, rogpeppe: is one or both of you packing dominion?
[12:36] mattyw, I will surely pack a set or two
[12:36] mattyw, hopefully so will rogpeppe or dimitern as well
[13:13] fwereade: i definitely intend to bring some, probably seaside, prosperity, maybe intrigue too
[13:14] rogpeppe, intrigue is probably my favourite -- hopefully dimitern will bring his :)
[13:14] rogpeppe, any of mine you'd be particularly interested to see?
[13:14] TheMue: Another review for you... http://reviews.vapour.ws/r/1400/
[13:15] fwereade: i haven't played many real-life games with alchemy, and dark ages is always good
[13:15] rogpeppe, I don't have alchemy actually -- I'll probably go with hinterlands/dark ages
[13:15] fwereade: sgtm
[13:17] fwereade: BTW i came across your recent dependency engine thing - cool stuff!
[13:17] rogpeppe, glad you like it -- it borrows heavily from worker.Runner
[13:17] fwereade: i like that it fits in with the worker framework too
[13:18] rogpeppe, and I *suspect* it will come closer still to it as I need to cover some of the more baroque cases I want in jujud
[13:18] rogpeppe, yeah, absolutely, the worker approach has been fantastic -- it's just organising them clearly where we've really fallen down
[13:18] fwereade: i still don't quite understand the overall motivation behind it, mind
[13:19] rogpeppe, I want to run something to take over some of the uniter's responsibilities when certain resources (like an api conn) are not available
[13:19] fwereade: the output value thing could probably do with a little more documentation
[13:20] rogpeppe, the thought of coordinating two distinct workers that ran at such very different levels made me cry
[13:20] rogpeppe, noted, thanks
[13:21] fwereade: i found the term "manifold" somewhat opaque; was this the meaning you had in mind: "a manifold is a topological space that resembles Euclidean space near each point. More precisely, each point of an n-dimensional manifold has a neighbourhood that is homeomorphic to the Euclidean space of dimension n"
[13:31] ?
[13:32] rogpeppe, no, it's a mechanism with pipes going in and out of it
[13:32] fwereade: ah, "a chamber having several outlets through which a liquid or gas is distributed or gathered."
[13:32] fwereade: not a term i'm familiar with
[13:33] fwereade: (which isn't to say that it's not entirely appropriate!)
[13:33] rogpeppe, I could swear I made a point of documenting that...
[13:33] // Manifold defines the behaviour of a node in an Engine's dependency graph. It's
[13:33] // named for the "device that connects multiple inputs or outputs" sense of the
[13:33] // word.
[13:34] fwereade: ha, i evidently skipped over that :)
[13:34] rogpeppe, ;p
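(A rough sketch of the shape fwereade describes: named inputs feed the node and a resource flows back out, like pipes on a manifold. The Worker interface and field names are assumptions for illustration, not necessarily juju's actual API:)

    // Worker stands in for juju's worker interface (assumed shape).
    type Worker interface {
        Kill()
        Wait() error
    }

    // Manifold describes one node in the dependency engine's graph.
    type Manifold struct {
        // Inputs names the resources this node needs before it can start.
        Inputs []string
        // Start runs the node's worker once every input is available;
        // getResource copies a named input into out.
        Start func(getResource func(name string, out interface{}) error) (Worker, error)
        // Output exposes the running worker's resource to dependent nodes.
        Output func(in Worker, out interface{}) error
    }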
[13:45] katco, dooferlad: the commit that reverted the container SNAT rule in 1.23 may be causing aws bundle tests to fail exactly like bug 1441319
[13:45] Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop
[13:45] * sinzui tries to get container log
[13:47] sinzui: no, that is because we don't support container cloning.
[13:47] dooferlad, this test passed on the previous version of 1.23 and we know the landscape bundle we are deploying has not changed
[13:48] dooferlad: just seen your message, reviewing it now
[13:48] this test passed on joyent and maas 1.7 and hp cloud
[13:48] sinzui: it was also reported before the SNAT rule change was reverted
[13:49] dooferlad, not by CI and not with 1.23
[13:50] dooferlad, Until your commit, CI had never seen this error. This test is voting, so 1.23 will not be blessed for a release. I need to get more information about what is wrong.
[13:57] sinzui: OK, the only thing that should have changed in terms of what other machines see is that traffic from containers comes from the container's IP address, not the host machine's address. This used to be the case before rev b584fcb85b9bcce3dadba97d32fe50e8f3680e40
[13:57] in that commit we added an SNAT rule to modify all traffic leaving a container host to look as if it came from that host
[13:58] dooferlad, yes, I agree, and this test was happy when that change was made. I don't know why it would appear. Could there have been another change made in conjunction with it for aws?
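(The rule under discussion would have been of this general form; the interface and address below are placeholders, not juju's exact rule:)

    # Rewrite the source of all traffic leaving the host via eth0 so it
    # appears to come from the host's own address:
    iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to-source 10.0.0.5
    # Reverting it means deleting the rule (-D), so container traffic
    # keeps the container's own IP instead.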
[14:01] * natefinch almost doesn't know what to do now that the HA code finally made it in.
[14:03] sinzui: so it worked OK on the 8th after the proxy change? That would seem more likely.
[14:03] dooferlad, yep
[14:04] sinzui: what would you say is the correct behavior for this bug? https://bugs.launchpad.net/juju-core/+bug/1441904 just failing?
[14:04] Bug #1441904: juju upgrade-juju goes into an infinite loop if apt-get fails for any reason
[14:05] natefinch, yes, say it failed
[14:05] sinzui: wonder if we should retry a few times in case of network problems etc
[14:07] natefinch, I thought all apt calls were retried 3 times. dimitern fixed an issue a few months ago where juju wasn't retrying
[14:08] sinzui: hmm.. good question, I can check, I don't know offhand
[14:15] katco, I added the container.log you requested on https://bugs.launchpad.net/juju-core/+bug/1441319
[14:15] Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop
[14:16] sinzui: ty... that's from the machine that failed?
[14:17] katco, no, larry's. the latest commit to 1.23 broke the aws deployer test in CI
[14:18] sinzui: confused... isn't larry the one who reported the failure? and this is the one we're thinking is intermittent?
[14:18] katco, yes, but now CI is affected. I am providing a log
[14:18] sinzui: ah ok. so we're sure it's the same issue?
[14:19] yep. we can reproduce it on command with 1.23 tip deploying the landscape bundle to aws
[14:19] I have 5 failures
[14:19] sinzui: i thought larry's issue was an lxc container problem...
[14:19] * katco apologizes if she's being dense
[14:20] rogpeppe, fwereade: FWIW, I have Alchemy, it's just not that great, so I kinda hesitate to lug it 3000 miles :)
[14:21] natefinch: i've quite enjoyed it on occasion in Androminion
[14:21] katco, oh, sorry, my error is not the same. I will report another bug.
[14:22] sinzui: k, thanks for walking me around the block on that one
[14:22] it looks like that container is having problems because of apparmor. The "peer has disconnected" message is because https://github.com/dotcloud/lxc/blob/master/src/lxc/af_unix.c -> lxc_af_unix_rcv_credential is failing.
[14:22] sinzui: still working on some caffeien here :p
[14:22] caffeine even
[14:22] katco, thank you for questioning me. I am low on caffeine.
[14:22] sinzui: a toast!
[14:23] rogpeppe: at least it's small. I could probably leave the box at home and it would be less of a big deal.
[14:23] dooferlad: are you referring to 1441319?
[14:23] katco: yes
[14:24] dooferlad: which log on that bug?
[14:24] katco: https://bugs.launchpad.net/juju-core/+bug/1441319/+attachment/4371597/+files/container.log (the last attachment, from sinzui)
[14:24] Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop
[14:25] dooferlad: ah ok. so that will be relavant to sinzui's new bug he's opening, not 1441319
[14:25] katco: yea, indeed
[14:25] relevant even
[14:25] jees with the spelling this morning
[14:29] dooferlad, can you work with sinzui on a bug as a result of your latest commit
[14:33] alexisb: sure
[14:33] and dooferlad I just finished reading the backscroll :)
[14:35] dooferlad, katco: sorry, more caffeine and CI's error is the same as oil's 'failed to retrieve the template to clone: template container
[14:35] "juju-trusty-lxc-template" did not stop'
[14:35] instance-id: pending, so https://bugs.launchpad.net/juju-core/+bug/1441319 is the issue
[14:35] Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop
[14:37] sinzui: that's actually great! repeatable!
[14:49] TheMue: thanks for that review. Fixes committed.
[14:54] anyone have an opinion on the best way to get a PR merged into master into the 1.23 branch? I could re-PR to 1.23, or I could cherry-pick from master to 1.23... not sure which way is "better" (or if there's a better third way)
[14:56] mgz: ^ any thoughts?
[14:56] (should say "to get a PR that was merged into master, into the 1.23 branch")
[14:59] sinzui: for the record, we try apt installs *30* times, with 10 second delays between each. Wowza.
[14:59] natefinch, well we tried. juju can admit it failed
[14:59] natefinch: i always like to cherry pick from master, but that's just my workflow
[14:59] natefinch: if it's important that it be done fast for 1.23, land it there first and forward port
[15:00] sinzui: yep, it shouldn't be a big deal to have the error from apt-get just tell the command to fail. If it doesn't work after trying for 5 minutes, it's not going to work.
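(The 30-tries-with-10-second-delays behaviour described above amounts to a loop like the sketch below; this assumes a plain apt-get invocation rather than juju's actual helper:)

    package main

    import (
        "fmt"
        "os/exec"
        "time"
    )

    // aptGetInstall retries up to 30 times with a 10-second pause between
    // failed attempts, then gives up and reports the last error rather
    // than looping forever.
    func aptGetInstall(pkgs ...string) error {
        var err error
        for attempt := 0; attempt < 30; attempt++ {
            args := append([]string{"install", "--yes"}, pkgs...)
            if err = exec.Command("apt-get", args...).Run(); err == nil {
                return nil
            }
            time.Sleep(10 * time.Second)
        }
        return fmt.Errorf("apt-get install failed after 30 attempts: %v", err)
    }

    func main() {
        if err := aptGetInstall("curl"); err != nil {
            fmt.Println(err)
        }
    }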
[15:00] katco: I thought forward porting was the new normal. Maybe just for Sapphire.
[15:00] no, we're supposed to land on 1.23 first and forward port, from what I remember... I just did the wrong thing
[15:00] dooferlad: could be? i don't think it really matters... isn't it personal preference?
[15:01] katco: I don't think it really matters, which is not the same as saying no one cares ;)
[15:01] natefinch: hey. you paint that shed blue dammit.
[15:03] I have an email from alexisb about making sure to land on stable and then forward port to master.... but it was only sent to me and wwitzel3, but I think that's The Way It's Supposed To Be™
[15:17] natefinch, bug 1394755 says it is fixed in 1.23, but I don't see the commit. I do see the commit in master though
[15:17] Bug #1394755: juju ensure-availability should be able to target existing machines
[15:22] Bug #1436863 was opened: test failure replicaset_test.go:52: SetUpTest.pN42_github_com_juju_juju_replicaset.MongoSuite
[15:31] sinzui: working on that, sorry
[15:36] katco: how do I cherry-pick the PR without listing out the 37 commits? Picking just the merge commit says I'm missing -m, and using -m says it expects a numerical value, but I have no idea what the number should be
[15:37] also probably doesn't help that I have merge commits from master into that branch... sigh
[16:04] mgz, can you review http://reviews.vapour.ws/r/1409/
[16:05] natefinch: sorry i was in a meeting
[16:06] natefinch: i am the worst person to ask for advice on git, because i use an emacs mode which does it all for me
[16:06] katco: haha no problem
[16:06] sinzui: on it
[16:06] katco: I think I'm getting it. I just ended up doing a cherry pick of all ~34 non-merge commits
[16:07] sinzui: lgtm, hold till when you need it
[16:07] thank you mgz
[16:13] natefinch, "git help -a" should have what you are looking for
[16:13] though I must admit that it has been a long time since i have done a kernel patch and used git
[16:15] alexisb: git help doesn't always.
[16:15] alexisb: you used to work on the linux kernel?
[16:15] yes ma'am
[16:15] alexisb: that's awesome!
[16:16] alexisb: I thought i had it, but somehow my cherry pick left this branch in a bad state... some code referring to packages that don't exist. Sigh.
[16:16] it was fun, there are days I think about going back :)
[16:16] alexisb: nooo! we need you!
[16:17] katco, it has been long enough I would have to really work to get back into it, my last public commit was in 2008 I believe
[16:17] alexisb: I'm sure it's mostly the same ;)
[16:18] natefinch, I have paused CI because I have a change merging now, and I don't want CI to be busy when your change arrives
[16:18] sinzui: thanks, I guess. I don't know that my change will make it in terribly soon, given the troubles I'm having with git, and I need to go pick up my daughter from preschool in 20 minutes
[16:19] :(
[16:19] natefinch: is it checked into 1.23?
[16:19] natefinch, I assume CI will take 3 hours to test my version change.
[16:20] katco: no, it's in master, that's the problem... I developed it off of master, merged master into my branch a couple times, and then PR'd into master
[16:20] natefinch: so you're trying to backport?
[16:20] katco: yes
[16:20] natefinch: what's the commit hash?
[16:21] katco: b228e89dd3ef9a3fe0f14b958db123e87a68bc48
[16:22] katco: I presume there's some magic command line that'll do the right thing... but I'll be damned if I can figure it out.
[16:22] natefinch: from what i can remember, it's not a single command
[16:22] natefinch: you set cherry head, and upstream
[16:22] natefinch: and then start picking commits
[16:25] natefinch, katco: if we have someone that can help out natefinch on this that would be great
[16:25] we gotta get 1.23 out
[16:25] alexisb: i am backporting now
[16:28] Bug #1258485 changed: support leader election for charms
[16:29] natefinch: ptal: http://reviews.vapour.ws/r/1410/
[16:29] natefinch: especially make sure i didn't mess up the merge
[16:31] katco: hmm.... I think I must have given you the wrong hash.... that's not what needed to get ported... well, that may still need to get ported, but that's not all of it
[16:31] natefinch: is there more than one commit?
[16:31] katco: there's two different merges... one is the test fixes and one is the HA --to code
[16:32] natefinch: k i'll pull the other commit too
[16:33] katco: there's Merge pull request #1962 from natefinch/ha3 which contains 37 commits :/
[16:33] natefinch: k tal
[16:34] it is possible to cherry-pick an entire merged branch with git, I don't recall the spelling off the top of my head
[16:35] mgz: it's the spelling that I'm not sure of... plus the branch contains commits that are merges from master to that branch, which would contain changes that shouldn't go into 1.23 (presumably)
[16:35] natefinch: right, but the diff from the merged-to-master commit vs the previous commit to master should be the right thing
[16:37] so the dumb version is to just take that diff and apply it, but cherry-pick does have a one-step version of that
[16:37] mgz: I would think so, but I'm always disappointed in what vcs figures out on its own
[16:38] git just doesn't record mainline, so talking about three-way merges on the commandline is more annoying
[16:38] unfortunately, I gotta run to pick up my daughter.... I'll be back in like an hour. That's about the best I can do.
[16:44] gosh... it's going to be hard for me to merge this much in without knowing anything about what i'm merging
[16:48] katco: how conflicty is the `git cherry-pick -m `?
[16:56] natefinch, mgz, I think I had to use patch or format-patch once
[16:58] mgz: 47 unmerged files
[16:59] Bug #1442719 was opened: juju sync-tools fails
[16:59] mgz: i (or rather emacs) ended up doing "git --no-pager -c core.preloadindex=true merge master~2 --no-commit"
[17:01] hm, I wonder if that's actually right without specifying the mainline?
[17:01] mgz: i'm merging it into a branch off of 1.23 upstream
[17:02] maybe the ~2 just happens to be resolving right on sha1 alpha order
[17:02] mgz: it is; that's emacs doing magic
[17:02] mgz: i found the merge master hash, hit m and it does the right thing
[17:02] mgz: i would hate doing all of this manually
[17:03] mgz: which is to say via the command line
[17:03] well, resolving conflicts is always going to need an editor
[17:03] you can use an OS if you want :P
[17:03] mgz: haha
[17:03] mgz: i'll show you this workflow in nuremberg. it's quite nice
[17:04] neat
[17:04] mgz: merge dumps me into a buffer with all merge conflicts. pressing "e" on an unmerged file dumps me into a visual diff
[17:04] mgz: and i hit "a" or "b" depending on which side i want. and can edit in a 3rd pane at the bottom
[17:05] alexisb: i can't perform this backport. i don't know enough about the patch to perform the requisite merges.
[17:05] katco, just fyi, the juju-core leadership team advocates the process of forward porting, as juju has historically seen more issues with back porting than forward porting
=== psivaa_ is now known as psivaa-afk
[17:05] katco, ack, we will have to wait for natefinch then
[17:05] alexisb: ok, duly noted.
[17:05] as wwitzel3 is traveling
[17:06] mgz, sinzui: if need be we will have to add the HA work in a point release
[17:07] sinzui, mgz, dooferlad, where are we with the aws container failure?
[17:08] alexisb, If Ubuntu deem the patch to be a feature, they will not accept it.
[17:08] natefinch, is the patch more like a bug than a feature?
[17:10] sinzui, in many of the "bugs" we fix for stakeholders there is a fine line between feature and bug
[17:11] I see your point, but thought we had decided this was a bug
[17:12] alexisb, yes, and Ubuntu doesn't honour that line; they rejected our streams change in 1.20 and outright refused to package the backup/restore changes to 1.16.4
[17:14] katco, get with natefinch when he is back and let's get this merged
[17:14] sinzui, I see this impacting our ability to deliver by weds
[17:14] sinzui, where do we stand on the aws failure mentioned this morning?
[17:14] No progress.
[17:15] alexisb, and this bug was just moved to juju-core https://bugs.launchpad.net/juju-core/+bug/1439535
[17:15] Bug #1439535: 1.23-beta2 websocket incompatibility
[17:16] ^ There are contradictions about beta3 being affected
[17:16] so sinzui adding bugs is not helping get the release out :)
[17:17] so let's start with the bug from this morning, what does no progress mean?
[17:17] is there no one from core looking at it? is there something blocking us from debugging the issue?
[17:18] alexisb, neither katco nor dooferlad have asked for more information from me after I provided the log that was requested.
[17:18] alexisb, I am at a critical moment in the release, so I cannot let myself get too distracted
[17:19] sinzui, that is fine, I understand
[17:19] dooferlad, are you still around?
[17:19] alexisb: just. In the middle of feeding the baby
[17:20] :) that is a fun task
[17:20] I know it is close to your eod but have you had a chance to take a look at the logs sinzui provided
[17:21] alexisb: As I said above, the failure seems to be to do with LXC and apparmor having a bad interaction.
[17:21] alexisb: and I haven't touched code anywhere near that
[17:22] dooferlad, ack. katco, is this similar to the issue you looked at earlier this week?
[17:22] alexisb: yeah... looks to be exactly the same
[17:23] which we determined was not a juju issue, correct?
[17:23] dooferlad, really. great. let me check to see if an apparmor change was delivered to trusty in the last day (and see if the aws mirrors are out of sync)
[17:24] alexisb: we were leaning that way because we hadn't touched the code in that area; however, sinzui indicated that a commit caused this to be repeatable?
[17:24] dooferlad: or was sinzui saying the change was your commit?
[17:24] dooferlad, katco. apparmor did change last week:
[17:24] 2.8.95~2430-0ubuntu5 release (main) 2014-04-04
[17:25] sinzui: if we haven't done any commits that touch that area of code, i'm blaming that
[17:25] We have testing this week that says aws was fine, but let me see if I can find evidence that the mirrors are stale
[17:25] sinzui: i apologize, i still don't understand the connection between aws and an lxc issue?
[17:26] katco, I am sorry about the situation
[17:26] sinzui: i'm assuming that i'm missing something; nothing to be sorry for
[17:30] OK all, see you in Germany!
[17:30] dooferlad: see you there!
[17:31] katco: back
[17:31] mgz: here's the start of the fix to godeps - copying a bunch of code from the go tool: https://codereview.appspot.com/223390043/
[17:32] natefinch: hey, i can't backport that change... there are too many merge conflicts i don't understand
[17:32] katco: I was worried about that
[17:33] rogpeppe: neat! I had a look at your get upstream change earlier, that seems like a good idea regardless
[17:39] natefinch: try "git --no-pager -c core.preloadindex=true merge master~2 --no-commit" but tweak the ~2 to point to the correct merge hash for your branch of master
[17:40] katco: man, how did I miss something so obvious?
[17:40] natefinch: (shrugs) it's easy to get blinders on when you have to cherry-pick
[17:41] natefinch: obviously you don't use emacs, there most likely is a shortcut for that, smth like: Ctrl-M gnpccptmmnc2
[17:41] perrito666: lol i put my cursor over the merge hash and pressed "m"
[17:41] some of those command line flags are probably superfluous
[17:42] * katco somewhat embarrassingly just realized natefinch was joking.
[17:42] katco: how do I figure out the number after ~? That stuff confuses the hell out of me
[17:42] katco: lol yes
[17:42] natefinch: it's just an offset from head
[17:43] natefinch: so however many commits back from head that merge hash is for you
[17:43] katco: you say that, but sometimes ~2 brings in like 1000 changes if you've done a merge
[17:43] natefinch: i pulled from trunk recently so it is likely the same, but i would 2x check
[17:48] katco, I need to step out for a bit, but I will be back soon. can you please make sure that the latest summary gets captured here: https://bugs.launchpad.net/juju-core/+bug/1441319
[17:48] Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop
[17:49] so that we don't lose the conversation that happened on irc
[17:49] alexisb: sure
[17:49] thanks
[17:50] katco: this keeps bringing in a ton of changes to files I know I haven't been anywhere near
[17:51] natefinch: ok, i'll see if i can wizard up another command for you
[17:51] katco: what I really need is to just whip up a patch file from the commit to master and then apply that to 1.23...
[17:55] natefinch: so see on the PR, the merge has 2 parents? https://github.com/juju/juju/commit/b228e89dd3ef9a3fe0f14b958db123e87a68bc48
[17:55] natefinch: which is the parent you don't want to cherry?
[17:56] * perrito666 has a ton of things to do and already lost 6hs on bureaucracy and now I just discovered that while I was away the mailman delivered mail... before the rain
[17:57] katco: both of those should be fine... though I don't know why those are parents per se... those are just the two previous commits I made
[18:00] katco: I think the problem is that I merged master into my branch, which is evidently screwing up git, because it thinks I then want to merge that into 1.23, which I don't.
[18:02] natefinch, i've often found git diff start..merge | patch -p1 --merge works, when cherry-pick doesn't dwiw. maybe there's a way to tell cherry-pick how to pick a sensible parent, but i've not grasped it
[18:03] that's kind of dirty, but works when branches have diverged quite a bit
[18:03] natefinch: so maybe do both: git cherry-pick b228e89dd3ef9a3fe0f14b958db123e87a68bc48 -m 1
[18:04] natefinch: and then git cherry-pick b228e89dd3ef9a3fe0f14b958db123e87a68bc48 -m 2
[18:07] right, that's it
[18:07] whether it's -m 1 or 2 depends on alpha order of the parent shas
[18:08] you may want to squidge the commits anyway if using cherry-pick
[18:08] git-squidge should totally be a thing
[18:08] cmars: lol
[18:09] katco: so far, the first one looks promising
[18:10] katco: and that second one brings in a bunch of crap that is totally wrong. So maybe just the first one is correct
[18:11] running tests... we'll see how it goes. It looks promising
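(For the record, -m does not depend on the alphabetical order of the shas: it names which parent of the merge to diff against, and parent 1 is the branch the merge landed on, master in this case. A sketch of the backport, with a hypothetical branch name:)

    git checkout 1.23
    git checkout -b backport-ha        # hypothetical branch name
    # Apply what the merge introduced relative to its first parent
    # (the old tip of master), i.e. just the ha3 branch's changes:
    git cherry-pick -m 1 b228e89dd3ef9a3fe0f14b958db123e87a68bc48
    # -m 2 diffs against the second parent (the ha3 branch itself),
    # which drags in everything else that was on master -- hence the
    # "bunch of crap that is totally wrong".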
[18:18] natefinch: are you sure that has all of the changes you need?
[18:19] katco: nope. I should probably throw up a diff and make sure it includes everything
[18:29] katco: looking good. Files look correct. Number of changes is correct
[18:30] natefinch: woohoo!
[18:45] well, it is missing my other test fixes, but that's just a single commit which hopefully is trivial to cherry-pick
[18:45] not bad
[18:45] they're not strictly part of HA, but they fail reliably on my machine... maybe they don't on the bot though
[18:46] I think the bot is running with GOMAXPROCS=1, which is likely the difference
[18:46] ah yeah i run with that too... i learned quickly lol
[18:48] katco: I think that backport you produced is actually the other half of the stuff I'd like to get in
[18:48] natefinch: lol i deleted the pr
[18:49] katco: here's the review for my backport (really, I should just call it your backport ;)
[18:49] http://reviews.vapour.ws/r/1414/
[18:49] natefinch: ah jees... for the same reason i couldn't do the backport, this is going to be hard to review
[18:49] natefinch: how confident are you that you pulled the right changes?
[18:50] I'm going to have a bunch of trivial changes to licence headers I'll need a rubber stamp on shortly
[18:50] ..meh, and dep bumps, that's a bit more painful
[18:51] katco: basically 100%. I did a visual side-by-side diff of the changes for the original PR: https://github.com/juju/juju/pull/1962/files vs. the backport PR: https://github.com/juju/juju/pull/2060/files
[18:51] katco: the diff in the review in this case is not really useful. You need to diff the two diffs :)
[18:51] natefinch: k i'm going to rubber stamp it then...
[18:52] natefinch: and tests are passing?
[18:52] oh, but I'm being called for food first
[18:53] katco: they only fail in the way the ones on master fail for me... if that's any consolation. Some of that is what I fixed in that other PR
[18:53] k
[18:53] you have been rubber stamped
[18:54] katco: I promise to fall on the grenade if it blows up after merging.
[18:54] lol
[18:57] So, last night at 6pm, I realized that my last two pairs of jeans that were at all fit to wear had decided to simultaneously acquire large rips in the knees.... and given no time to actually go out shopping anywhere... I made my first purchase of jeans off amazon, to be delivered tomorrow, hopefully before I have to leave.
[18:57] lol that is brave
[18:58] I *think* they're a style that has fit well before... and men's sizes are usually regular enough that as long as I stay away from slim-fit and other nonsense, they'll just work. But yes, it is a bit of a gamble.
[18:58] I wouldn't have done it if they didn't explicitly say they have free returns.
[19:00] katco: btw..... just noticed this works: https://patch-diff.githubusercontent.com/raw/juju/juju/pull/1962.patch
[19:00] dooferlad, katco. I have confirmed that 1.22.1 and 1.23-beta4 can deploy the test bundle to aws, but the revision that reverted container networking cannot. I can report a separate bug if you think that will be less confusing.
[19:01] katco: I got that from seeing this tip at the bottom of the PR page: ProTip! Add .patch or .diff to the end of URLs for Git's plaintext views.
[19:01] dooferlad, The release notes for container networking state it was only enabled for aws and maas. Could the revision be incomplete? Does something else need to change for aws?
[19:01] anyway, I gotta run some errands, I'll check back in later to see if this has blown up or not
[19:02] katco: I'll try to get you an email with info ASAP, but it might be late tonight.
[19:02] (info about team lead stuff)
[19:03] sinzui: can you explain the lxc + aws connection to me?
[19:03] katco, the bundle is deploying to two containers.
[19:04] sinzui: ah ok, so it's deploying containers to an aws host?
[19:05] yes. Two app servers in containers, and haproxy directly on the host to round-robin the workload
[19:05] sinzui: ah ok. thanks, that's the information i was missing
[19:05] katco, let me update the bug with the command line I used. Any of us can run it. It isn't anything like the awkward CI tests
[19:06] sinzui: awesome
[19:06] sinzui: so to echo this back to make sure i understand
[19:07] sinzui: 1.23-b4 works? where was the commit that breaks things? trunk?
[19:08] katco, dooferlad's commit is the only change from 1.23-beta4 as released. We used cherylj's commit
[19:09] sinzui: ah, so it was to 1.23-beta4, it just hasn't been packaged (released) yet?
[19:09] katco, it was released 20 minutes ago.
[19:11] sinzui: so i'm confused (sorry)... you said 1.23-beta4 is working, but 1.23-beta4 includes dooferlad's commit, which is not working?
[19:11] no
[19:12] katco, we selected cherylj's as the official 1.23-beta4 because tip was broken.
[19:13] ah ok
[19:13] i understand now, thanks
[19:13] and that commit repeatably works whereas dooferlad's commit repeatably fails
[19:14] sinzui: is that https://github.com/juju/juju/commit/7e7bc9d3ad436cf25fa725f54334527cea9cb938 ?
[19:14] katco, https://bugs.launchpad.net/juju-core/+bug/1441319/comments/11
[19:14] Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop
[19:15] yes, that is the commit that breaks things.
[19:15] sinzui: well at first glance, it certainly looks to be directly related...
[19:16] katco, Since the release notes say only maas and aws were supported by SNAT, maybe another change was made that also needs to be reverted
[19:17] Attention core developers - juju actions are friggin sweet. Thank you for this feature. that is all
[19:17] lazyPower: hey you are pretty awesome.
[19:17] :D
[19:18] I've been prototyping actions for a couple days now. i want to go through and hyperbole all the charms with actions now
[19:18] lazyPower: i want to work on a charm with you at the sprint
[19:18] actions ALL THE THINGS
[19:18] lol
[19:18] katco: i won't be at n.burg :(
[19:18] lazyPower: oh what the boo
[19:18] only a handful of eco peeps will be there, we got split up
[19:18] we're now under d. westervelt in the big division of juju teams
[19:19] yeah heard about that
[19:19] didn't realize you wouldn't be at the sprints though... for good?
[19:19] hopefully not
[19:19] but we'll see what the future brings
[19:19] marco, cory and antonio will be there however
[19:19] cool
[19:20] also - i'm *always* up for a review/pairing session
[19:20] just hit me with a repo link and a timeblock and i can make time to pair
[19:20] cool
[19:20] wish i had more time to actually use juju and not just work on it :p
[19:20] just have a small installation on my home network
[19:20] i've got, count it, 8 environments running now
[19:21] that i manage/assist
[19:21] wow
[19:21] that's pretty cool
[19:21] :D
[19:21] getting easier every day
[19:21] my cousin works at a zoo which is understaffed
[19:21] keep landing great features and i'll keep spinning up the envs
[19:21] and they need to stand up a bunch of standard windows things
[19:21] pinged him to use juju
[19:21] oh man
[19:21] our windows support without maas/openstack is non-existent today
[19:22] so bring them to maas! ;)
[19:22] +1 for that
[19:26] sinzui: so this kind of blows any theory that the apparmor update was at all responsible?
[19:26] sinzui: since cherylj's commit works fine
[19:27] katco, yes, I didn't see any issue with the aws mirrors to imply they are different from other mirrors too
[19:31] sinzui: ok. well, we'll have to ping alexisb for her opinion on what to do. i'm assuming this blocks v1.23 for the time being?
[19:37] katco, this is very awkward, because dooferlad was reverting to unblock, so we cannot revert his change. We need to find another change that was made related to the SNAT
[19:40] sinzui: do you know if anyone else was involved with that code?
[19:41] I don't. I can only think to use annotate and log to find other parts of the code that were changed for snat.
[19:45] sinzui: looks like possibly dimiter
[19:59] katco, what's up?
[20:00] alexisb: well, curtis has arguably proved that we have a commit in juju that is responsible for the failures
[20:00] lazyPower, make sure to thank jw4 for actions
[20:00] alexisb: it's made complicated by the fact that the commit was backing another commit out to fix another issue
[20:01] alexisb: we probably need dimiter+team to make further changes to fix tip
[20:01] 1.23 tip
[20:03] well that means we have to wait till monday for those fixes
[20:03] alexisb: that is the issue we needed your input on
[20:03] alexisb: what does that mean for vivid?
[20:04] it means we release vivid with a broken juju
[20:05] so what "fix" was reverted that broke aws and why did we revert it?
[20:06] sorry had to blow my nose >.<
[20:06] sinzui: do you know the answer to that question?
[20:07] the issue is that we have a known issue that affects oil and any openstack bundle really, and the current 1.23 blessed version does not have a fix for that issue
[20:07] alexisb: here's the suspect commit: https://github.com/juju/juju/commit/7e7bc9d3ad436cf25fa725f54334527cea9cb938
[20:08] which means the first 1.23.0 we are targeted to get released to vivid will break many of our stakeholders
[20:08] but if we don't release then juju will just not work on vivid
[20:09] alexisb: well that's not a good decision to have to make.
[20:09] katco, so that commit is actually a fix for the issue I am referring to
[20:09] I take it that is the one that is breaking aws
[20:09] alexisb: yep.
[20:09] alexisb, the bug doesn't really say we reverted the container network changes. It says we made a small change that addresses stakeholder concerns. The change just is not enough to restore AWS containers; maas containers are fixed
[20:10] alexisb: can oil work around the bug by manually tweaking the nat tables on containers?
[20:10] yeah I actually don't think it is reverting anything
[20:10] hmm, who reported the aws container issues?
[20:10] alexisb, this is difficult. before the change this morning some charms were broken, now all aws containers are broken
[20:10] mgz, you guys did :)
[20:11] katco, I don't have an answer to your question; jamespage would have an answer
[20:11] sinzui, yeah it sucks :)
[20:12] mgz, https://bugs.launchpad.net/juju-core/+bug/1441319/comments/9 and beyond
[20:12] Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop
[20:12] so if we know that maas is fixed, is there a way to trace the aws code path and figure out what is different that would affect the issue
[20:13] katco, question for you ^^^
[20:13] alexisb: it's possible, but not likely in the time we have with what resources are left
[20:14] katco, that is totally fair
[20:15] so the next question is, is it more important for the first release to have oil working or aws containers working?
[20:15] and katco I need to follow up re your question on a workaround
[20:16] so... I think containers on aws matter a fair bit for our testing, but I don't know of any customers deploying on aws
[20:16] * alexisb goes to jason hobbs
[20:16] as in, big paying ones; lots of people use it
[20:18] alexisb: mgz: yeah, i'm thinking OIL is probably more important to get right out of the gate?
[20:18] yeah I am thinking oil working is going to be the priority
[20:18] katco, you beat me to it
[20:18] will oil be utilizing vivid right away?
[20:18] almost certainly not
[20:18] katco, not likely
[20:19] hm
[20:19] so the "crumple zones" we have are: workarounds, and adoption rate
[20:19] but they do use the latest stable of juju
[20:19] if we can understand those better, we could probably make a better decision
[20:19] and they want leader elections
[20:19] yeah
[20:20] well, i guess i vote oil then.
[20:20] yeah so katco we do have a pretty good understanding of those, oil is the right choice
[20:20] it just sucks
[20:20] given there is an obvious failure we can see in our testing
[20:20] yeah =/
[20:21] so sinzui ....
[20:21] another lesson learned: releases should be "done" well before sprints and everyone takes off
[20:21] lol
[20:21] alexisb, there are now 3 regressions in https://bugs.launchpad.net/juju-core/+milestone/1.23.0
[20:21] well we had planned for 3/27 but you know how that goes
[20:21] lol yeah
[20:24] sinzui, you are referring to all the ones marked critical
[20:25] yes.
[20:25] so lp 1439535 just came to us today
[20:25] we think that 811 is fixed
[20:25] alexisb, and I think we need me to split one of them, because while the errors CI sees are the same as oil's in one of the bugs, we know they were using an older juju
[20:25] and will not be able to verify that without a release
[20:26] alexisb, we could say it is fixed, except the same charms listed in the bug are still broken on aws
[20:27] sinzui, we should open a new bug, as what I really want to know is if oil is fixed for that bug
[20:27] if it is then we need to figure out the path for aws
[20:27] if it is not then we need to rethink the fix altogether
[20:27] okay, I will shuffle the bugs
[20:30] sinzui, thank you for all your work helping us sort all these issues out
[20:31] sinzui: yeah curtis, you are amazing
[20:31] katco: anything customer-facing will be on the LTS releases, as that is all we will support and all that our support wing will support customers using
[20:31] we don't have charms for the non-LTS releases either
[20:31] alexisb, katco, My son doesn't think so. I forgot to pick him up from school 45 minutes ago. I suck
[20:32] sinzui: oh no :(
[20:32] davecheney: ah ok, ty for the added info
[20:34] katco: and recently I found out that CTS doesn't even start recommending the new LTS version for 4-6 months after release
[20:34] katco, davecheney is correct, however we do have a special circumstance in the case of 1.23 because the openstack team's next released set of charms will leverage Leader Elections
[20:34] the first "stability release" or some other euphemism, strangely coinciding with the point release
[20:34] so we will need 1.23 for that release at the end of the month
[20:34] alexisb: what does that have to do with V ?
[20:35] davecheney: i am familiar with that pattern from the MS world. maybe it spilled over somehow
[20:35] davecheney, point taken, actually nothing except the release is aligned
[20:37] alexisb: it's probably because to get 1.23 backported we always have to land a release in the current version
[20:37] that's the way backports work apparently
[20:37] if you'd like to know more
[20:37] i'll be crying into my beer all next week about this
[20:37] (try the fish)
[20:38] davecheney: lol, i'll need to pick your brain to understand the situation better
[20:39] davecheney: if you don't mind some annoying questions :)
[20:40] katco: oh it's a turbulet tail
[20:41] haha
[20:41] full of technical intrigue and skullduggery
[20:41] turbulet -- what kind of a word is that
[20:41] davecheney, lol
[20:41] davecheney, I like having you in this timezone
[20:41] how do I even adjective
[20:41] davecheney: i think you have a novel in you ;)
[20:41] it's weird working in this timezone
[20:42] i'm used to having the west coast available all morning
[20:42] right now, they don't come online til I get back to the hotel
[20:47] katco: I am spamming reviewboard with licence header fixes for various juju subrepos at various versions of the branches - if you get a mo can you rubber stamp them?
[20:47] mgz: sure
[20:48] cmd and testing*2 are there, charm*2 should be there shortly
[20:48] then I can do the dep update and juju*2
[20:49] katco: hm, actually charm is not under reviewboard, those are on https://github.com/juju/charm/pulls only
[20:50] Bug #1442801 was opened: aws containers are broken in 1.23
[20:50] mgz: k tal
[20:52] mgz: i think i got all of them
[20:53] katco: thanks!
[20:53] np
[21:10] wtf is the date 2015-03-30T-9:15:59Z in dependencies.tsv about...
[21:10] I guess I'll just change that line as well
[21:12] mgz: i don't know what that's used for, so i always just change it
[21:12] urk, there are non-trivial charm.v4 changes it seems
[21:14] hm nope, git is just weird, everything is fine
[21:14] time to switch to my laptop to make sure everything is working properly
[21:26] katco: last bit http://reviews.vapour.ws/r/1417
[21:26] I may also do trunk while I'm here but less urgency on that
[21:27] mgz: kathump
[21:27] (sound of a rubber stamp)
[21:28] :D
[21:32] katco, did you assign yourself to lp1442801
[21:34] alexisb: no
[21:34] katco, ok, I am going to go ahead and assign it to james
[21:34] alexisb: ok
[21:59] if anyone has a mo, licence header changes for trunk too, http://reviews.vapour.ws/r/1418