[00:03] menn0: are you getting time to finish off https://github.com/juju/juju/pull/2724?
[00:03] thumper: working on that right now
[00:03] thumper: not far
[00:03] cool
[00:11] thumper: done, about to merge
[01:26] mehh I was reviewing a pr and when I got to the end it got discarded
[01:27] perrito666: :(
[01:27] perrito666: what time is it for u?
[01:30] wallyworld: I don't suppose someone on your team can fix the intermittent failure in uniter worker?
[01:30] FAIL: uniter_test.go:892: UniterSuite.TestUniterUpgradeConflicts
[01:30] happens relatively regularly
[01:31] anastasiamac: 22:30
[02:13] axw: ping
[02:15] thumper: sorry, missed ping about failing test. will take a look. currently full with WIP fixes for 1.25 release and soon a critical customer issue which is more a feature than a bug
[02:15] i suspect will be end of week or early next
[02:15] also working on arm issue for 1.24.3
[03:07] wallyworld, thumper: the mongodb timeout PR has landed in 1.24
[03:07] great
[03:08] do we know who to prod to see if it helps with anything?
[03:09] wallyworld: ahasenack reported the bug that led to me doing this work (i'm not sure if it will help or not)
[03:10] ok
[03:10] thumper also thinks it might help with a problem env he was looking at
[03:10] time will tell i guess
[03:27] wallyworld: so what is the github.com/juju/juju/juju package all about?
[03:30] seems like stuff that I would have expected to be in the state package
[03:41] menn0: that package is sort of an attempt to get stuff out of state as far as i understand
[03:42] it is horribly named
[03:44] wallyworld: so state helpers then?
[03:45] wwitzel3: ping?
[03:45] menn0: more like juju core business logic
[03:45] that is not persistence related
[03:45] hmm ok
[03:45] i didn't add it :-)
[03:46] menn0: cool
[03:48] wallyworld: I didn't think you did
[03:48] i can't defend it too hard :-)
[03:53] wallyworld: review done
[03:54] ty menn0
[04:02] thumper: have you picked up much about juju and arm via osmosis?
[04:02] nope
[04:02] and dave is away this week :-(
[04:08] aye
=== kadams54 is now known as kadams54-away
[05:25] thumper: u got a fish trophy!
[05:56] jam: hi
[05:57] hey wallyworld
[05:57] jam: network related question if you have a moment
[05:57] bug 1472014
[05:57] Bug #1472014: juju 1.24.0: wget cert issues causing failure to create containers on 14.04.2 with lxc 1.07
[05:57] see the last couple of comments
[05:58] it seems we don't store / report all the cloud local addresses for a machine
[05:58] so a machine's AddressWatcher doesn't get told about all possible addresses an https request can arrive on
[05:59] do you know why we throw away some cloud local addresses?
[05:59] i've not looked at the code in detail, just going by james's comment
[05:59] but i'm wary about changing network related code
[06:00] as it has the habit of breaking things
[06:02] wallyworld: so offhand I'd say we don't actually cope with having multiple addresses where things could arrive
[06:02] wallyworld: consider charms, they can only really report 1 private address to each other
[06:02] hmmm, so which one to pick then
[06:02] wallyworld: so the issue is probably that we are thinking 10.0.6.* is the right address, when really the correct cloud-local address is the 10.0.3.* one
[06:02] as we are picking the wrong one
[06:03] hmmm, so how to pick the right one
[06:04] wallyworld: so that machine has 3 addresses that I would consider "cloud-local" sort of addresses, a 10.0.3 a 10.0.6 and a 192.168
[06:04] for this purpose we should just stick everything in the SAN
[06:04] wallyworld: Surprisingly (for me) 10.0.3 is usually the LXC bridge (I thought)
[06:05] jam: what about this line : setting API hostPorts
[06:05] wallyworld: I think in the idealized model Juju would be aware of all the subnets and have labeled names for them (spaces), in which case it would know that machine X is supposed to talk to machine Y on a given address.
[06:05] it seems there we pass in everything
[06:05] wallyworld: internally it feels like we should be aware and save all the addresses
[06:05] yes, save them all internally sounds good to me too, but if we do that now stuff would break i would think
[06:06] wallyworld: short term, I think just adding all addresses to the SAN is fine.
[06:06] but
[06:06] that relies on AddressWatcher :(
[06:06] so i'll need to change how it all works
[06:06] bollocks
[06:06] i'll go read the code and see what can be done
[06:06] wallyworld: well, I think you can certainly get help from Sapphire on this one.
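The short-term idea discussed above, putting every known machine address into the server certificate's subject alternative names (SAN), could be sketched in Go roughly as follows. This is a minimal illustration using only the standard library, not juju's cert updater: the helper name `newServerCert`, the common name, and the sample addresses are all hypothetical, and the certificate is self-signed purely to keep the example short.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// newServerCert is a hypothetical helper (not a juju API). It self-signs a
// certificate whose SAN list contains every address passed in: IP literals
// go into IPAddresses, anything else is treated as a DNS name.
func newServerCert(addrs []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example-api-server"}, // assumed name
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	for _, a := range addrs {
		if ip := net.ParseIP(a); ip != nil {
			tmpl.IPAddresses = append(tmpl.IPAddresses, ip)
		} else {
			tmpl.DNSNames = append(tmpl.DNSNames, a)
		}
	}
	// Template and parent are the same, so the result is self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	// Sample addresses are made up; they mirror the three "cloud-local"
	// ranges mentioned in the conversation.
	cert, err := newServerCert([]string{"10.0.3.1", "10.0.6.12", "192.168.1.20"})
	if err != nil {
		panic(err)
	}
	for _, host := range []string{"10.0.3.1", "10.0.6.12", "10.0.9.9"} {
		fmt.Printf("%s accepted: %v\n", host, cert.VerifyHostname(host) == nil)
	}
}
```

With all addresses in the SAN, a client connecting over either the 10.0.3 or the 10.0.6 address would pass hostname verification, while an address that was never recorded would still be rejected. It does not address the separate question raised in the chat of how the cert updater learns about the full address list in the first place.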
[06:07] jam: i asked but none were 100% sure about why only one address was saved etc
[06:07] so maybe there's not the level of knowledge there to dive right in
[06:07] wallyworld: well, dimitern is away and the others probably not quite as familiar
[06:08] gophercon being this week.
[06:08] yeah, that's what i figured, hence asking you :-)
[06:08] i'll see if it's possible to tinker with the cert updater
[06:08] i could re-read all machine addresses
[06:08] but may not be triggered
[08:09] wallyworld, jam: just read the backlog. Yell if you want a networking hand.
[08:10] wallyworld: also with ARM, I am an ex-ARM employee so if I can help with that please call on me
[08:41] dooferlad: oh, i might take you up on that arm offer. i might ping you after dinner
[09:02] jam, fwereade: hangout?
[09:30] fwereade: git blame -L302,302 provider/ec2/config_test.go
[09:30] fwereade: that's the test that fails for me on master
[09:31] fwereade: git blame may be deceived of course...
[11:15] fwereade, quick ping?
[11:15] mattyw, heyhey
[11:16] fwereade, is there any doc or something about the uniter operation/callbacks arch? I'm finding myself getting into it and was hoping I could make some decisions about my stuff without having to hassle you
[11:16] mattyw, only what's inline, I'm afraid
[11:17] fwereade, I probably only want to call a certain function when a certain hook has finished
[11:17] mattyw, that sounds like the responsibility of the CommitHook bit to me?
[11:18] mattyw, but the callbacks themselves are basically evil
[11:18] fwereade, time for a 5 minute hangout?
[11:18] mattyw, it's basically just a cut-down uniter facade/adapter for the use of the ops
[11:18] fwereade, I'll try to timebox it at that
[11:18] mattyw, sure, start one please?
[12:36] ericsnow: ping
[12:52] Bug #1472596 opened: bootstrap failed yet retry says it succeeded
[13:42] bogdanteleaga: is rr 2107 live or not atm?
[13:49] mgz: no it's not
[13:49] mgz: it's more of a weird interaction, 2109 is the same but should show a better diff
[13:49] mgz: any ideas if I can delete that one?
[13:52] bogdanteleaga: it is marked as discarded, so that's probably fine, is just getting updated still I guess as it's the same github branch
[13:53] mgz: yeah, that was the one submitted to github, but the diff would be a bit funky since it contains another branch
[13:56] bogdanteleaga: do you need anything else on those branches, or are you good to go?
[13:56] bogdanteleaga: I think we'll want to backport to 1.24 after master has blessed the change
[13:59] mgz: no, I was doing some final tests a couple of hours ago, but everything seems fine
[13:59] mgz: got caught up with something else
[14:00] bogdanteleaga: no problem
[14:01] mgz: squashing now and I'll start merging
[14:01] * bogdanteleaga grabs popcorn
[14:01] :)
[14:19] ericsnow: ping
[14:20] wwitzel3: hey hey hey
[14:24] ericsnow: trying to work through an issue and I've run into some code that I could use some help debugging
[14:24] wwitzel3: sure
[14:25] wwitzel3: moonstone?
[14:25] ericsnow: we can go to a query, conference wifi probably won't work so well with a hangout
[14:25] wwitzel3: right :)
[14:35] mgz: you said they ought to be backported?
[14:35] bogdanteleaga: I think we do need it on 1.24, yeah
[14:35] want to see it work on master first of course
[14:39] mgz: Build failed: Does not match ['fixes-1472632']
[14:39] can't find a bug with that number
[14:40] abentley: ^did you mean to make that bug a blocker and private?
[14:40] mgz: Yes.
[14:40] it does not really strike me as either
[14:40] mgz: Regressions are blockers.
[14:40] mgz: It has debug logs from SSH.
[14:42] I am reading the debug ssh log and apart from containing your name and some of juju-ci's IP addresses it seems to have nothing personal
[14:42] and I don't see how this bug prevents us releasing, which is the point of blockers
[14:43] we've been releasing fine with this for three 1.24-s
[14:43] mgz: Well, I was erring on the side of caution with the SSH. If you're willing to take responsibility for making it public, I'm fine with that.
[14:44] mgz: I don't know why we've continued to make releases after it was discovered. I assumed sinzui had filed a regression bug, since he knew about it.
[14:44] abentley: I didn't know about it
[14:44] abentley: okay. I can confirm this does not contain your private ssh keys. :)
[14:45] sinzui: Oh? Didn't you say you'd had to rename your ssh key to id_rsa to deal with this issue?
[14:45] abentley: the issue with the ssh stuff is it depends somewhat on your personal setup, so I knew that the combo of my local ssh config + juju ci scripts borked ssh for juju
[14:46] abentley: yes, for just bootstrapping ec2 and openstack providers. I haven't seen any issue with other providers or ssh in general
[14:46] I agree this is a regression, but given it has an annoying but somewhat trivial workaround (don't use your personal ssh config) I don't see how it's critical
[14:47] abentley: and this behaviour matches the windows setup from 18 months ago
[14:50] mgz: I don't think the existence or lack of workarounds is a factor in whether an issue should block. We don't want to break users' existing workflows, and this does break users' existing workflows.
[14:51] abentley: I don't disagree
[15:11] Bug #1472632 opened: regression: juju ssh dies with (publickey)
[15:17] abentley, do we know what commit caused the regression?
[15:17] I would like to see bogdanteleaga be able to land his fix, it is a critical fix for 1.24.3
[15:17] alexisb: No, we don't know which commit.
We found it by hand, not with automated tests, so we don't have logs that would show it.
[15:18] abentley, ok
[15:21] sinzui, abentley, mgz there will not be anyone from core looking at that bug until NZ/AUS comes online
[15:22] I don't see how blocking is productive for this issue, nor how it's justified by our procedure
[15:22] alexisb: it doesn't block
[15:22] not tag blocker
[15:22] sinzui: it has that tag currently, I was planning on raising in standup in 5 mins
[15:23] sinzui, ok, I must have misread the back scroll
[15:23] I thought bogdanteleaga was blocked
[15:24] sinzui: `./check_blockers.py check master`
[15:24] alexisb: if he is, he can add __JFDI__ to $$merge$$ to make it merge and test <- bogdanteleaga
[15:24] noo..
[15:24] either the bug blocks or it doesn't, we shouldn't be bypassing
[15:25] I don't think it should block.
[15:25] it is blocked currently, but this is just the fix for master, the one for 1.24 is coming up
[15:26] not sure how long it takes for the upgrade ci job to test the fix though
[15:26] mgz: as we don't have a test for it and the regression is in the wild, we don't need to block. I think this is like the expression closing the stable door after the horse has bolted
[15:29] sinzui: That assumes that the number of people who have not upgraded to 1.24 is not significant. I think that it is significant. I think every time we put out a release, especially if we release 1.25, we encourage people who are using 1.23 and earlier to upgrade.
[15:53] bogdanteleaga: add "$$merge$$ fixes-1471332" to the pull request comment to ensure CI will test and merge
[15:55] sinzui: I did try it with __JFDI__ but I fluked a test
[15:55] sinzui: should be fine on the next try; should I use jfdi or fixes?
[15:56] either bogdanteleaga
[15:56] cool
[16:10] dooferlad: TheMue: if you have a chance I'd appreciate a review http://reviews.vapour.ws/r/2116/
[16:11] voidspace: one moment, hunting a test failure, but will start in a few seconds
[16:11] cool, thanks
[16:11] good luck with the hunt :-)
[16:15] ah, kewl, panic is gone. now I can jump into your review
[16:18] voidspace: seeing your new ReleaseAddress() signature. could it be that the passed address doesn't match the passed MAC address? IMHO they always should be a kind of pair with 1:N (one MAC, multiple IP)
[16:18] TheMue: no they're always 1:1
[16:18] TheMue: the MAC comes from state.IPAddress
[16:19] instance id is stored there too
[16:19] voidspace: ok, but technologically one MAC could have multiple IP, we only don't model it so right now
[16:19] ah
[16:19] our model does allow that, sorry
[16:19] although we don't use that capability
[16:20] ok
[16:20] so yes, one mac address could have multiple ip addresses - and we handle that
[16:20] we only delete the device if the IP address is the last one
[16:20] otherwise we just release the address normally
[16:20] biab, grabbing coffee whilst you read
[16:31] voidspace: you've got a review
[16:34] * TheMue is afk for a moment, continue later
=== liam_ is now known as Guest71963
[17:15] TheMue: thanks
[17:32] Bug #1472711 opened: MAAS node has "failed deployment", juju just says "pending"
=== kadams54 is now known as kadams54-away
[18:20] Bug #1472729 opened: juju stuck in "upgrade in progress " for 20min
[18:23] Bug #1472729 changed: juju stuck in "upgrade in progress " for 20min
[18:32] Bug #1472729 opened: juju stuck in "upgrade in progress " for 20min
=== kadams54-away is now known as kadams54
[19:47] Bug #1472749 opened: github.com/juju/utils has contradictory licences
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
=== kadams54 is now known as kadams54-away
[21:49] hello world
[21:49] maas 1.8.0 juju 1.24.2 deploying
to LXC containers seems stuck, it's trying to wget the image from the bootstrap node/api server and it's been doing that for like 15 mins
[21:53] jk, it's moving on now
[22:00] hey marcoceppi welcome back from vacation
[22:01] alexisb: hey, thanks!
[22:37] holy shit balls
[22:37] these worker tests are taking ages to run...
[22:39] FAIL github.com/juju/juju/state/leadership 1200.021s
[22:39] hmm
[22:39] timeout kill
[22:40] nbd, 20 min tests
[22:41] nbd?
[22:42] no big deal
[22:44] nah man, it is a big deal
[22:44] this is broke
[22:54] thumper: shall I pick up the juju ssh blocker?
[22:54] bug?
[22:55] thumper: bug 1472632
[22:55] Bug #1472632: regression: juju ssh dies with (publickey)
[22:55] it's blocking master and 1.24
[22:58] sinzui, ^^ I thought we had decided that bug was not a blocker
[22:58] menn0, we discussed that one earlier today
[23:00] menn0: I think abentley's analysis is wrong
[23:00] menn0: if you look at the logs, it was his personal id_rsa that worked
[23:00] but 1.24 and master did not appear to be trying
=== mwhudson_ is now known as mwhudson
=== mwhudson is now known as Guest82160
[23:02] thumper: i haven't looked at it in any detail
[23:03] so menn0, thumper, my understanding from this morning is that bug should not be a blocker and the tag was going to be removed
[23:04] * thumper removes blocker tag
[23:04] unfortunately I don't see any of the release dudes online atm
[23:04] sweet
[23:04] thumper, alexisb: cool. that unblocks master
[23:04] 1.24 is still blocked due to the windows upgrade issue
[23:04] i've just been talking to bogdanteleaga
[23:05] he's waiting to see the problem is fixed on master before pushing the fixes to 1.24
[23:05] that shouldn't block 1.24 then
[23:05] the PR to fix 1.24 is ready to go though
[23:06] thumper: no?
[23:06] * thumper thinks
[23:06] it will block us doing a release
[23:06] but I don't think it should block us landing other fixes on 1.24
[23:07] thumper: remove the blocker tag then?
[23:08] menn0: do we know if it fixed the issue on master?
[23:08] thumper: no we don't. CI hasn't gotten to running it yet
[23:09] thumper: bogdanteleaga just noticed that his fix broke the unit tests under windows so he's going to do a fix for that now
[23:11] waigani_: can you take a look at 1.24 bug 1472711? it claims bug 1376246 may not quite be fixed
[23:11] Bug #1472711: MAAS node has "failed deployment", juju just says "pending"
[23:11] Bug #1376246: MAAS provider doesn't know about "Failed deployment" instance status
=== Guest82160 is now known as mwhudson
=== kadams54 is now known as kadams54-away
[23:37] rick_h_: hey there
[23:59] thumper: so the windows upgrade still isn't working in CI