[00:06] <menn0> anastasiamac: I thought you were off today?
[00:07] <wallyworld> babbageclunk: i have to head to cricket in about an hour - you ok to take another look at pr so i can land before i go?
[00:07] <anastasiamac> menn0: m not working :) wallyworld made me log in \o/
[00:07] <menn0> anastasiamac: that bastard
[00:08] <anastasiamac> menn0: :D to organise the PARTY \o/
[00:08] <anastasiamac> come to think of it - not that much different from my usual work
[00:08] <menn0> haha
[00:08] <menn0> party?
[00:08]  * wallyworld is innocent
[00:09] <blahdeblah> because my bugs are always a party!
[00:09] <wallyworld> menn0: you get to have an xmas party, 2 x per diem
[00:09] <wallyworld> there's 5 of us close by in brisbane
[00:09] <anastasiamac> menn0: redir made us acquire a taste for michelin star restaurants, i've found an equivalent in bne ;)
[00:09] <anastasiamac> menn0: http://www.restaurant2.com.au
[00:10] <menn0> anastasiamac: nice! can I come too?
[00:10] <redir> anastasiamac: you arranging my flight?
[00:10] <wallyworld> menn0: of course but you need to pay your own flight :-)
[00:11] <anastasiamac> menn0: redir: as long as u r ready for sauna :D VERY HOT here
[00:12] <redir> anastasiamac: hehe. I think Mari would be very unhappy...
[00:16] <anastasiamac> redir: partners r invited too :)
[00:19] <wallyworld> but partners pay their own :-)
[00:19] <anastasiamac> wallyworld: of course! partners and plane tickets r not covered ;)
[00:20] <wallyworld> would have been good to have menn0 and redir fly in :-)
[00:20] <blahdeblah> anastasiamac: I was just thinking how nice & cool it was here yesterday
[00:20] <blahdeblah> And ocean still isn't like a bath (as of this morning, at least)
[00:20] <anastasiamac> blahdeblah: u r killing me almost as much as wallyworld with his pool
[00:21] <blahdeblah> pool schmool - give me waves & salt any day
[00:21] <anastasiamac> blahdeblah: m more of spa kind of person - everything else involves effort :)
[00:22] <anastasiamac> also m not big fan of sand in my bathing suite
[00:22] <anastasiamac> suit*
[00:23] <blahdeblah> Exercise is actually good for you, y'know... ;-)
[00:26] <anastasiamac> not in this heat :)
[00:27] <babbageclunk> wallyworld: yup - sorry was lunching
[00:27] <wallyworld> no wuckers
[00:30] <babbageclunk> wallyworld: looks good
[00:30] <wallyworld> yay tyvm
[00:30] <wallyworld> hopefully it lands first time
[00:47] <babbageclunk> wallyworld: I'll keep an eye on it
[00:54] <wallyworld> babbageclunk: ty. i'm about to head off, pr is almost done, hopefully it gets through
[01:00] <babbageclunk> anastasiamac: do you think I need to port the fix for bug 1641824 to the 2.0 branch?
[01:00] <mup> Bug #1641824: Migrating a model back to controller fails: model XXXX has been removed <model-migration> <juju:In Progress by 2-xtian> <https://launchpad.net/bugs/1641824>
[01:01] <babbageclunk> anastasiamac: the bug's about migrations but the underlying problem is a resource leak when the model is destroyed
[01:02] <wallyworld> babbageclunk: damn, windows failures. hit merge again but got to head off now, will be back later tonight
[01:02] <babbageclunk> wallyworld: stink - will watch it like an easily-distracted hawk
[01:02] <wallyworld> np, ty
[01:03] <babbageclunk> like a sort of hawk-magpie hybrid
[01:06] <wallyworld> babbageclunk: anastasiamac is supposed to be on leave, but given we are supporting 2.0.x and the leak exists, would be worth a backport IMHO
[01:11] <anastasiamac> babbageclunk: i agree with wallyworld (both about being on holiday and the backport) \o/
[01:12] <wallyworld> anastasiamac: go away :-)
[01:12] <wallyworld> oh, my ride is here, gotta go
[01:12] <babbageclunk> anastasiamac: oops, sorry! you shouldn't be in here then! Ok, will do the backport - it's not totally straightforward unfortunately
[01:12] <anastasiamac> wallyworld: nice try :) i need numbers tho unless u can txt me :)
[01:13] <babbageclunk> wallyworld: ok, will do
[01:58] <menn0> thumper: here's the fix to the unit agents going failed as the migration starts: https://github.com/juju/juju/pull/6712
[01:58]  * thumper looks
[02:00] <menn0> thumper: coming to hangout now
[02:26] <menn0> thumper: i'm fairly certain there even used to be an attempt field but I removed it because I didn't think it was needed
[02:27] <thumper> :)
[02:29] <babbageclunk> can I get a review of https://github.com/juju/juju/pull/6713
[02:30] <babbageclunk> thumper, menn0: ^
[02:30] <babbageclunk> It's a backport of the state pool fixes to 2.0
[02:31] <menn0> babbageclunk: is it basically the same or did you have to restructure things?
[02:31] <menn0> babbageclunk: if it's the same, no need for a re-review
[02:31] <babbageclunk> It's basically the same, modulo some conflicts - yeah, probably doesn't need rereview
[02:37]  * thumper sighs
[02:37] <thumper> menn0: I may have found another...
[02:37] <menn0> thumper: fuuuuuuu
[02:38] <menn0> babbageclunk: just land it
[02:38] <menn0> thumper: better us than end users I guess
[02:38] <thumper> let's just say I went to migrate a model, had no units deployed because I hadn't got to that
[02:38] <thumper> and it is in the target
[02:38] <thumper> but still in the source
[02:38] <thumper> shows in show-models
[02:38] <thumper> shows in list-models
[02:38] <thumper> but won't in show-model
[02:38] <thumper> I get permission denied
[02:39] <menn0> that implies that reap didn't work
[03:13] <menn0> thumper: PR coming for dumb migration sorting bug
[03:13] <thumper> k
[03:13] <thumper> menn0: also I want to change the migration master logger
[03:13] <thumper> so it uses a "." not a ":" separator
[03:13] <thumper> because you can't use --include-module juju.worker.migrationmaster with debug-log
[03:13] <thumper> because of the :xxx
[03:14] <menn0> thumper: ok sounds good
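[Editor's note] The separator problem thumper describes can be sketched as follows. This is illustrative Go only, assuming debug-log's `--include-module` filter matches loggers by "."-separated prefix (the matching logic here is a guess at the behaviour, not juju's actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// matchesModule mimics a dot-separated module-prefix filter: a logger
// named "juju.worker.migrationmaster:deadbe" can never match the prefix
// "juju.worker.migrationmaster", while the dotted form can.
func matchesModule(loggerName, module string) bool {
	return loggerName == module || strings.HasPrefix(loggerName, module+".")
}

func main() {
	const module = "juju.worker.migrationmaster"
	fmt.Println(matchesModule("juju.worker.migrationmaster:deadbe", module)) // false
	fmt.Println(matchesModule("juju.worker.migrationmaster.deadbe", module)) // true
}
```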
[03:15] <menn0> thumper: here it is: https://github.com/juju/juju/pull/6715
[03:28] <menn0> thumper: there's QA steps on https://github.com/juju/juju/pull/6715 now
[03:29] <thumper> looking
[03:44] <menn0> thumper: so where are we at with the "" is not a valid tag issue?
[03:44] <menn0> thumper: derailed by all the other problems discovered?
[03:45] <thumper> menn0: extra logging added to the CI test
[03:45] <thumper> waiting for it to happen again
[03:45] <thumper> the bug is marked incomplete with a note
[03:46] <menn0> thumper: ok cool
[03:48] <thumper> menn0: interestingly, the 6 hex digit suffix to the logger for the migration master uses the last 6; all other places use the first 6 of the model uuid
[03:48] <menn0> thumper: we should change that too
[03:48] <menn0> thumper: I honestly thought other places used the last digits as well
[03:48] <menn0> thumper: what's used for the instance ids?
[03:48] <thumper> :)
[03:48] <thumper> first 6
[03:48] <menn0> that's what I thought I checked against
[03:48] <menn0> ok
[03:49] <menn0> thumper: if you're in there please fix that too
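[Editor's note] The inconsistency being discussed, as a minimal sketch. The helper names here are hypothetical (juju does not export functions like these); the point is just that the migrationmaster logger abbreviated the model UUID with its last 6 hex characters while instance IDs and other log tags use the first 6:

```go
package main

import "fmt"

// firstSix is what instance IDs and most log tags use to abbreviate a
// model UUID; lastSix is what the migrationmaster logger was using.
func firstSix(uuid string) string { return uuid[:6] }
func lastSix(uuid string) string  { return uuid[len(uuid)-6:] }

func main() {
	uuid := "deadbeef-1234-4d5e-8a6f-00234e1a5f22"
	fmt.Println(firstSix(uuid)) // "deadbe"
	fmt.Println(lastSix(uuid))  // "1a5f22"
}
```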
[03:49] <menn0> thumper: i'm onto the next thing. is there anything I can do with the "migration already in progress" (like reap issue) problem
[03:49] <menn0> ?
[03:49] <thumper> I just hit the unit not idle bug
[03:49] <menn0> i've been trying to land that - it's attempting now
[03:53] <thumper> approved the attempt addition one
[03:53] <menn0> thumper: thanks
[03:59] <veebers> thumper, menn0 I may have found a bug with migrations, I'm trying to confirm it further now. It seems that superuser migrating a user's model causes issues with controller cleanup
[03:59] <veebers> i.e. this test passes that the superuser can migrate the model, but during cleanup the model is mentioned in both controllers' cleanup output, while sitting there waiting for it to die for ages
[04:01] <thumper> veebers: that is the bug I'm currently trying to fix I think
[04:01] <thumper> where the reap phase hits a bug
[04:01] <thumper> which is the one that cleans things up
[04:01] <veebers> thumper: ah, well, ok then. In that case it seems we have a test case in place that can confirm it's fixed
[04:02] <veebers> thumper: do you have a bug id for that?
[04:02] <thumper> it is very intermittent
[04:02] <thumper> I've done 30 migrations without triggering it
[04:02] <thumper> veebers: is it reproducable for you?
[04:02] <veebers> thumper: if it's the same bug I'm hitting I can get it every time it seems
[04:02] <thumper> oh
[04:02] <thumper> interesting
[04:02] <veebers> thumper: what can I check to see if it's the same thing?
[04:03] <thumper> how confident are you with the mongo shell?
[04:03] <thumper> actually
[04:03] <thumper> what log level are the controllers capturing?
[04:04] <veebers> thumper: it's a ci test so the default level they run at
[04:04] <thumper> ugh...
[04:04] <veebers> thumper: i can run with --debug if you like
[04:04] <thumper> which isn't high enough to see what's going on
[04:04] <thumper> yeah, run with debug
[04:05] <thumper> then search the source controller logs for "setting migration phase to REAP"
[04:05] <veebers> thumper: cool, I'll just wait for this to clean up then re-run
[04:05] <thumper> if it is successful, you'll see "setting migration phase to DONE"
[04:05] <thumper> if you are hitting the same bug, you'll not see that
[04:06] <veebers> thumper: cool I'll get on that and let you know
[04:06] <veebers> thumper, menn0: further model migration question: migration of model logs is done?
[04:06] <thumper> yes
[04:07] <veebers> thumper: cool, I'll check the implementation of the test because it fails for me currently
[04:07] <thumper> k
[04:11] <thumper> menn0: I'm struggling to reproduce the error
[04:12] <thumper> menn0: I'll just land a patch that does the logging and reapfail
[04:18] <thumper> menn0: huh... you may be right re: last 6 digits
[04:18] <thumper> I thought it was the first
[04:19] <menn0> thumper: possibly b/c there's less variation at the start?
[04:22] <thumper> menn0: trivial https://github.com/juju/juju/pull/6717/files
[04:22] <menn0> thumper: looking
[04:23] <menn0> thumper: ship it
[04:40] <menn0> babbageclunk: trivial one: https://github.com/juju/juju/pull/6718
[04:40] <babbageclunk> menn0: looking
[04:40] <menn0> babbageclunk: cheers
[04:44] <babbageclunk> menn0: minor typo in docstring otherwise LGTM
[04:44] <menn0> babbageclunk: doh ... thanks
[04:49] <veebers> thumper: remind me what JES is in reference to juju?
[04:50] <thumper> wow
[04:50] <thumper> old name
[04:50] <thumper> Juju Environment Server
[04:50] <thumper> what Controllers are now
[04:50] <veebers> thumper: coolio, thanks
[04:57] <babbageclunk> menn0: no comment on my funny joke :(
[04:57] <menn0> babbageclunk: youse guys are very funny
[04:58] <babbageclunk> I think it's delayed jetlag
[04:59] <veebers> menn0: if you're still around, I'm not seeing REAP mentioned at all in the source controllers machine-0.log file, does that mean I'm not seeing the same bug as thumper or something else?
[09:22] <hoenir> could anyone provide some opinions on this patch? https://github.com/juju/juju/pull/6523
[10:16] <voidspace> wallyworld: ping
[10:18] <jam> voidspace: calendar says that today is a wallyworld swap day
[10:18] <jam> voidspace: looks like he's gone until the end of winter break, so if you need him, I'd recommend email
[10:18] <voidspace> jam: ah, thanks - I will do
[10:23] <voidspace> jam: I still want to think about your default space email, it seems like a good analysis
[11:26] <jam> hm. 'juju show-machine 0/lxd/0' doesn't work. you have to specify the host machine
[11:33] <perrito666> morning all
[11:37] <jam> morning perrito666
[11:38] <jam> frobware: voidspace: should 'juju add-machine --constraints spaces=UNKNOWN' be rejected early?
[11:38] <jam> I'm seeing it try to provision and then fail, cause who is 'unknown' ?
[11:38] <voidspace> jam: failing early always seems better
[11:42] <frobware> voidspace, jam: +1 on failing early
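[Editor's note] The "fail early" check being agreed on here could look roughly like this. Names are hypothetical, not juju's API: the idea is to validate space constraints against the spaces the model actually knows about at add-machine time, instead of letting the provisioner fail later:

```go
package main

import "fmt"

// validateSpaceConstraints rejects any requested space name that isn't
// in the model's known set, so `juju add-machine --constraints
// spaces=UNKNOWN` could error immediately rather than at provisioning.
func validateSpaceConstraints(requested, known []string) error {
	knownSet := make(map[string]bool, len(known))
	for _, name := range known {
		knownSet[name] = true
	}
	for _, name := range requested {
		if !knownSet[name] {
			return fmt.Errorf("unknown space %q", name)
		}
	}
	return nil
}

func main() {
	known := []string{"default", "dmz"}
	fmt.Println(validateSpaceConstraints([]string{"dmz"}, known))     // <nil>
	fmt.Println(validateSpaceConstraints([]string{"UNKNOWN"}, known)) // unknown space "UNKNOWN"
}
```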
[11:59] <wallyworld> voidspace: hey, am here now
[11:59] <jam> frobware: voidspace: rick_h: https://pastebin.canonical.com/173699/ "juju show-machine" showing that the host machine has 3 bridges, but 0/lxd/2 is only in one of them
[11:59] <voidspace> wallyworld: hey, hi
[11:59] <jam> wallyworld: go home
[11:59] <jam> :)
[11:59] <voidspace> wallyworld: how well do you recall the details of bundlechanges?
[11:59] <jam> wallyworld: well, you're probably home. but you shouldn't be working. you have vacation time for a reason :)
[12:00] <voidspace> wallyworld: it has good tools for me to play with it, so I'm finding my way round it slowly...
[12:00] <rick_h> jam: <3 !!
[12:00] <voidspace> wallyworld: I need to copy application constraints from the application to container creation (sometimes)
[12:00] <wallyworld> jam: just got back from cricket match, so i'll go away soon
[12:00] <voidspace> wallyworld: I'm about to go into a call though - I'll email you...
[12:00] <wallyworld> voidspace: you meant the juju/bundlechanges repo?
[12:00] <rick_h> wallyworld: did anyone win? or do you go back tomorrow for more game?
[12:00] <voidspace> wallyworld: yep
[12:00] <wallyworld> voidspace: +
[12:01] <frobware> jam: I'm wondering if that is misleading. That's fallen back to using lxdbr0
[12:01] <wallyworld> that was written by rick's guys
[12:01] <rick_h> psh, wallyworld is rick's guys :P
[12:01] <wallyworld> voidspace: i did make some tweaks to account for multi-series
[12:01] <voidspace> wallyworld: rick_h thinks frankban would be a better person to talk to...
[12:01] <voidspace> wallyworld: I will pester him instead :-)
[12:01] <wallyworld> voidspace: correct, i tweaked that package but didn't write it
[12:02] <jam> frobware: heh, well, it could just be broken, indeed. I saw 10... but missed that it wasn't 10.100.
[12:02] <jam> frobware: but it only has 1 :)
[12:02] <frobware> jam, that I can agree with. :)
[12:05] <frobware> yet again maas 2.1.1 failed to release my node. Need to capture logs for this + bug.
[12:05] <frobware> or... it could be Juju.
[12:13] <jam> frobware: db.linklayerdevices.find() shows that I don't have any records for the container...
[12:13] <jam> time for more debugging
[12:15] <frobware> jam: https://github.com/juju/juju/blob/staging/apiserver/machine/machiner.go#L233
[12:15] <frobware> jam: and various other places that do the same check
[12:16] <jam> frobware: I don't think that is valid here. I think I'm getting an error somewhere, but it isn't getting logged/reported
[12:22] <jam> frobware: 20MB upload of .gz to try out a new jujud
[12:23] <frobware> jam: build locally @frobware.com if it helps
[12:27] <jam> frobware: that would be 14M if we used xz instead of gz
[12:27] <frobware> jam: I would use pxz. MOAR CORES! :)
[12:28] <frobware> jam: well, same format, just quicker to compress/decompress
[12:36] <jam> frobware: yeah, I was getting an error, it was triggering: got error looking for host spaces: subnet "127.0.0.0/8" not found
[12:36] <jam> so our tests need to include Addresses that aren't in known spaces, and we see that it properly skips them
[12:37] <frobware> jam: did you just run into https://bugs.launchpad.net/charms/+source/ceph-osd/+bug/1603074
[12:37] <mup> Bug #1603074: subnet "127.0.0.0/8" not found <juju:New> <ceph-osd (Juju Charms Collection):Invalid> <https://launchpad.net/bugs/1603074>
[12:37] <frobware> rick_h: ^^
[12:38] <perrito666> meh, either my office RCD is malfunctioning or the pc is
[13:03] <voidspace> perrito666: sounds worse than my lights problem :-(
[13:15] <voidspace> rick_h: yup, with my change to bundlechanges (not yet pushed anywhere - will do that now and work on test)
[13:15] <voidspace> rick_h: application constraints are honoured by KVM placement
[13:16] <voidspace> rick_h: and it's a relatively simple change
[13:16] <rick_h> voidspace: <3 great to hear
[13:29] <jam> mgz: thoughts on having Juju support '.xz' for agent binary tarballs? On my machine it is 20MB as a .gz, but 14MB as a .xz
[13:29] <mgz> can ubuntu core unpack them?
[13:40] <jam> mgz: 'xz' is the standard distribution of deb files now, I believe
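[Editor's note] If juju did grow `.xz` support alongside `.gz` for agent tarballs, the download path would need to tell the two formats apart; sniffing the magic bytes is one way. This is an illustrative sketch, not juju code, though the magic values themselves are the real gzip and xz file signatures:

```go
package main

import (
	"bytes"
	"fmt"
)

// Standard file signatures: gzip starts 0x1f 0x8b, xz starts
// 0xfd '7' 'z' 'X' 'Z' 0x00.
var (
	gzipMagic = []byte{0x1f, 0x8b}
	xzMagic   = []byte{0xfd, '7', 'z', 'X', 'Z', 0x00}
)

// archiveFormat reports which compression format a tarball's leading
// bytes indicate.
func archiveFormat(header []byte) string {
	switch {
	case bytes.HasPrefix(header, xzMagic):
		return "xz"
	case bytes.HasPrefix(header, gzipMagic):
		return "gzip"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(archiveFormat([]byte{0x1f, 0x8b, 0x08}))                 // gzip
	fmt.Println(archiveFormat([]byte{0xfd, '7', 'z', 'X', 'Z', 0x00})) // xz
}
```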
[13:41] <jam> mgz: http://juju-ci.vapour.ws/job/github-check-merge-juju/454/console I accidentally submitted it 1 time against staging, but it is now targeting 'develop'
[13:41] <jam> mgz: why is it still complaining?
[13:41] <mgz> jam: lets see
[13:42] <jam> mgz: unfortunately the 'check' system doesn't let you go back and see old vs new (or I haven't found it)
[13:42] <jam> so I don't know if it is a stale failure
[13:42] <jam> or it ran again and is still failing
[13:43] <mgz> jam: I see the pr targeted vs staging still?
[13:43] <mgz> #6720
[13:44] <jam> mgz: but I'm trying to get 6722 checked
[13:44] <jam> sorry https://github.com/juju/juju/pull/6721
[13:44] <jam> 6721
[13:44] <jam> mgz: why is the link from 6721 linking to the check for 6720 ?
[13:44] <jam> mgz: do we only support 1 source branch?
[13:44] <mgz> jam: you have confused it, it seems :)
[13:44] <jam> (1 pr per source branch even though I have targeted it 2x)
[13:44] <jam> I even closed the 6720
[13:45] <mgz> jam: let's see if I can do a manual build to unstick
[13:45] <jam> mgz: I can't retarget 6720 cause something is already targeting develop
[13:45] <jam> (6721)
[13:47] <jam> and googling says you can't delete PR, only close them
[13:54] <mgz> jam: I believe I have unstuck you
[13:58] <jam> rick_h: voidspace: frobware: a proper 'juju show-machine' https://pastebin.canonical.com/173707/
[13:58] <jam> note that the address is from the right IP range
[13:59] <jam> 172.16.102.34 instead of 172.16.101 or 10.*
[14:00] <rick_h> jam: ty
[14:00] <rick_h> macgreagoir: ^
[14:00] <voidspace> jam: cool
[14:01] <jam> the ip results from that machine, I'm wondering if it doesn't have a default gateway at all:
[14:01] <jam> https://pastebin.canonical.com/173708/
[14:01] <jam> it must have something, as I can ping 8.8.8.8
[14:01] <jam> but it definitely has an unroutable resolver
[14:01] <voidspace> rick_h: jam: heh, looks like I uncovered (and fixed by accident) another bug in bundlechanges - application constraints were ignored for "new" machines as well (when specified with placement) - not just containers
[14:01] <rick_h> voidspace: gotta love drive-by improvements
[14:02] <voidspace> just need to add a test to cover that change and frankban is happy
[14:02] <rick_h> voidspace: make sure to get frankban|afk as one of the reviewers if we can please.
[14:02] <rick_h> voidspace: <3
[14:02] <voidspace> rick_h: already done
[14:13] <natefinch> ha, I misread that as "just need a test to see if frankban is happy"   which seems like a pretty fragile test, IMO
[14:15] <voidspace> natefinch: hehe, yeah - making that deterministic would be "challenging"
[14:15] <natefinch> crap, why did CI fail this time?  Oh, dang, frank stubbed his toe again.
[14:16] <voidspace> :-)
[14:33] <frankban> natefinch: yeah I hate when it happens
[14:53] <frobware> rick_h, macgreagoir, jam: so other info to capture as part of  show-machine is: gateway,  DNS nameserver entries. And probably more as containers are not bridged onto all subnets...
[14:55] <frobware> Just noticed: Gogdand - https://www.jetbrains.com/go/
[14:55] <frobware> or, gogland :/
[14:56] <macgreagoir> frobware: Do you mind adding anything you think of to that card. I'll add these now.
[14:57] <frobware> macgreagoir: will do. ty for adding the ones I just mentioned.
[15:00] <macgreagoir> It'll be a straight dump of the machines doc soon :-)
[15:01] <natefinch> this is the worst:
[15:01] <natefinch> $ juju credentials
[15:01] <natefinch> ERROR removing secrets from credentials for cloud azure: auth-type "userpass" not supported
[15:01] <frobware> macgreagoir: we could just run, in a fuzzy way, all of ip, brctl, cat /etc/ .... :)
[15:14] <natefinch> katco: you wanted to talk about pollster?
[15:15] <katco> natefinch: yes, i couldn't find how to give default values for questions
[15:16] <natefinch> katco: like a default for an open ended question, e.g. name?
[15:17] <katco> natefinch: yeah, e.g.: "What would you like to name your cloud? [my-cloud]: "
[15:17] <katco> natefinch: i thought you had said yesterday that this was possible
[15:17] <natefinch> katco: I haven't written support for it yet.
[15:17] <katco> natefinch: ah ok. i'll just skip that question then
[15:17] <natefinch> katco: I have support for defaults for multiple choice, but not open ended
[15:17] <katco> natefinch: ta
[15:18] <natefinch> welcome./.. I actually will be adding that with my bug fix I'm working on today
[15:18] <natefinch> so it'll probably work out
[16:05] <voidspace> mgz: hey, the jujugui bot has different magic to trigger a merge - do you know it?
[16:05] <rick_h> voidspace: :shipit:
[16:05] <voidspace> rick_h: ah....
[16:05] <voidspace> rick_h: thanks :-)
[16:07] <mgz> rick_h won
[16:13] <voidspace> mgz: you can do a review though instead...
[16:13] <voidspace> mgz: https://github.com/juju/juju/pull/6723
[16:14] <voidspace> mgz: if you would like...
[16:14] <rick_h> voidspace: just sanity checking, is there any way this can cause issues in non-container setups?
[16:15] <rick_h> voidspace: the change didn't seem to key off containers in any way so if I deploy with a constraint to a machine on the bare metal it can blow up now vs what it used to do?
[16:15] <voidspace> rick_h: it has two changes - new machines will now honour app constraints
[16:15] <voidspace> rick_h: and also containers will honour app constraints
[16:15] <rick_h> ah ok, since it's not a "new machine" it'll be ok
[16:15] <voidspace> rick_h: those are the only two places that now have constraints that previously didn't
[16:16] <voidspace> rick_h: and the absence of those two places are both bugs
[16:16] <rick_h> voidspace: rgr ty
[16:17] <voidspace> rick_h: both new changes are guarded by a check - either ContainerType != ""
[16:17] <voidspace> rick_h: or placement.Machine == "new"
[16:17] <voidspace> rick_h: so I'm *fairly sure* there are no new issues being created here...
[16:18] <rick_h> voidspace: you're good. I was just looking at the diff scanning for container checks but those were already there
[16:18] <rick_h> voidspace: sorry, just got my brain thinking appreciate you reassuring my worry-self :)
[16:18] <voidspace> :-)
[16:21] <voidspace> frobware: you're OCR today I believe, would be nice to land this on 2.1 and develop today before I EOD
[16:22] <voidspace> frobware: https://github.com/juju/juju/pull/6723
[16:24] <mgz> voidspace: I have stamped the branch, have not done the QA steps
[16:25] <mgz> you might want to flatten to one commit
[16:31] <voidspace> mgz: thanks, I have tried the QA step myself
[16:32] <voidspace> also https://github.com/juju/juju/pull/6724
[16:42] <katco> natefinch: you're probably the best to review this (small): https://github.com/juju/juju/pull/6725
[16:47] <voidspace> mgz: hopefully one last time please
[16:47] <voidspace> mgz: https://github.com/juju/juju/pull/6726
[16:48] <natefinch> katco: looking
[16:48] <katco> natefinch: ta
[16:49] <mgz> voidspace: approved
[16:49] <voidspace> mgz: you rock, as always
[16:53] <redir> +1
[17:08] <natefinch> dammit... my 24" monitor just made a nasty hissing noise and turned off... and now won't turn on
[17:09] <natefinch> possibly just a fuse or something, but.... dammit.
[17:32] <perrito666> natefinch: lol, yes, that definitely is something burning
[17:32] <perrito666> hissing sound i would go for something more serious than a fuse
[17:39] <mgz> unless there is a snake sheltering in there for warmth, and it hit the power button
[18:06] <natefinch> Gah, this is horrible... all my muscle memory has me looking at my now-dead monitor for IRC :/
[18:07] <natefinch> on a different note, I see this test failing: TestAgentConnectionDelaysShutdownWithPing    anyone seen this as an unreliable test?  It fails with "connection didn't get shut down"  which sounds suspiciously like a timeout
[18:09] <natefinch> sinzui, abentley, mgz: have you seen this failure? ^  full failing output: http://pastebin.ubuntu.com/23634769/
[18:09] <sinzui> natefinch: that looks like the old timing bug we often see in lxd
[18:10] <sinzui> natefinch: https://bugs.launchpad.net/juju/+bug/1632485
[18:10] <mup> Bug #1632485: TestAgentConnectionDelaysShutdownWithPing fails <ci> <intermittent-failure> <unit-tests> <juju:Incomplete> <juju-core:Won't Fix> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1632485>
[18:33] <natefinch> sinzui: thanks for the bug link
[20:33] <redir> bugger commodity hardware
[21:04] <alexisb> hml, sorry omw
[21:07] <hml> alexisb: okay
[21:16]  * redir lunches
[21:41] <alexisb> veebers, ping
[21:41] <veebers> alexisb: hello o/
[21:42] <alexisb> heya veebers I hear there is some turmoil in the MM tests
[21:42] <alexisb> wanted to let you know that with menno and tim out babbageclunk is around if you need a developer to take a look
[21:43] <veebers> alexisb: ah excellent thanks :-) I'll be bothering babbageclunk a little bit today to clarify some concerns I've run into with the tests
[21:43] <veebers> I'm just retesting some now as I see some fixes landed between my last testing and now
[21:43] <babbageclunk> veebers: cool cool
[21:44] <alexisb> awesome will leave you guys to it, thank you both
[22:39] <veebers> babbageclunk: So I have a test here which is passing the migration part, but the controllers fail to be cleaned up (kill-controller is waiting for ages as well as attempting to kill the migrated model on both controllers)
[22:40] <veebers> babbageclunk: thumper mentioned it may be related to a bug he was working on but I'm not sure. The test creates a user with superuser perms and does the model migration (that's the difference between this test and the migration test that passes)
[22:40] <mup> Bug #1650401 opened: Kubernetes-core bundle fails to deploy. Easy-rsa results with failed hook <juju-core:New> <https://launchpad.net/bugs/1650401>
[22:42] <veebers> babbageclunk: would you have some time to take a look with me at this issue?
[22:49] <mgz> veebers: kill-controller waits more than 5 minutes? or just destroy-controller
[22:49] <mgz> veebers: my starting point would be to turn on trace logging, ssh in during destroy controller, and pull down all the logs from the controller machine 0
[22:50] <mgz> we have seen a bunch of teardown issues following migrations, of interesting nature
[22:51] <babbageclunk> veebers: sorry, missed this until now!
[22:52] <mup> Bug #1650405 opened: Juju Embedded - Juju logout/login not working for multiple users connected to same controller   <juju-core:New> <https://launchpad.net/bugs/1650405>
[22:53] <babbageclunk> veebers: ready to look at it with you now
[22:53] <veebers> babbageclunk: nw, I have something to try out now so will try collect those logs
[22:54] <veebers> mgz: is there a nice way to turn on trace logging for a test? --debug is easy enough but maybe not the level that you want?
[22:54] <babbageclunk> veebers: ok, let me know if you want me to take a look too
[22:55] <veebers> babbageclunk: will do, I'll probably call on you when I'm going through the collected logs etc.
[22:57] <mgz> veebers: export JUJU_LOGGING_CONFIG="<root>=TRACE" before bootstrap (or running a ci test)
[22:57] <babbageclunk> veebers: ok
[22:57] <mgz> or babbageclunk can perhaps give you a more targeted logging config
[22:57] <mgz> we might just want the workers stuff and some other bits at trace
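[Editor's note] The `JUJU_LOGGING_CONFIG` string mgz shows above takes semicolon-separated `module=LEVEL` pairs, with `<root>` naming the root logger. This is a minimal, assumption-laden parse of that format for illustration; juju's loggo package has its own real parser and this sketch does not validate level names:

```go
package main

import (
	"fmt"
	"strings"
)

// parseLoggingConfig splits a config like
// "<root>=WARNING;juju.worker.migrationmaster=TRACE" into a
// module -> level map. Malformed entries are silently skipped.
func parseLoggingConfig(cfg string) map[string]string {
	levels := make(map[string]string)
	for _, entry := range strings.Split(cfg, ";") {
		parts := strings.SplitN(strings.TrimSpace(entry), "=", 2)
		if len(parts) == 2 {
			levels[parts[0]] = parts[1]
		}
	}
	return levels
}

func main() {
	cfg := parseLoggingConfig("<root>=WARNING;juju.worker.migrationmaster=TRACE")
	fmt.Println(cfg["<root>"], cfg["juju.worker.migrationmaster"]) // WARNING TRACE
}
```

A targeted config like the one mgz suggests would then just add per-worker entries instead of raising everything under `<root>` to TRACE.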
[23:02] <babbageclunk> mgz, veebers: well, there was the dependency problem that I fixed while we were in Barcelona, but that shouldn't be happening anymore.
[23:02] <babbageclunk> veebers: Does this problem happen reliably?
[23:02] <mup> Bug #1650405 changed: Juju Embedded - Juju logout/login not working for multiple users connected to same controller   <juju-core:New> <https://launchpad.net/bugs/1650405>
[23:03] <veebers> babbageclunk: yes it does, everytime I run this test
[23:03] <mgz> veebers: grab the logs and trace during the hung teardown, and some careful pouring should give an idea
[23:07] <veebers> mgz, babbageclunk test is running now. I'm about to pop down to town for an errand so might be a short delay before I can get you the logs
[23:07] <babbageclunk> veebers: ok cool
[23:37] <babbageclunk> mgz: ping
[23:45] <mgz> babbageclunk: heya
[23:49] <babbageclunk> mgz: Hey (isn't it super late for you?)
[23:50] <babbageclunk> mgz: I'm trying to change my lxd config using `lxd init` but it keeps complaining that I've got containers when I don't. Any ideas?
[23:51] <babbageclunk> mgz: correction - I didn't when I was running the command, but I do now, so I probably can't try anything out.
[23:51] <mgz> babbageclunk: I'm back in for the evening
[23:53] <mgz> babbageclunk: did you run lxd init as root? there may also be some other bits of setup that prevent it being rerun
[23:54] <babbageclunk> mgz: yup, ran with sudo.
[23:54] <babbageclunk> mgz: I could've sworn that I've run it before to change subnet + stuff
[23:55] <babbageclunk> mgz: trying to turn off zfs given worse performance
[23:55] <mgz> yeah, it is rerunable to some extent