=== StoneTable is now known as aisrael
[01:22] thumper: do we have any nice way of passing a userTag over the wire?
[01:23] I'm currently breaking it up on the client, putting it in a params struct and rebuilding it on the server
[01:25] alternatively, we could not build the tag until the server side, and just pass through usernames...
[01:36] waigani: yes, it is a tag
[01:39] wallyworld: got 5 min?
[01:39] sure
[01:41] thumper: we should allow the env to be 'unshared' as well right?
[01:41] wallyworld: https://plus.google.com/hangouts/_/canonical.com/tim-ian
[01:41] waigani: yes...
[01:41] waigani: not sure "unshare" is the right verb
[01:41] hence the scare quotes
[01:42] under the hood that would just be RemoveEnvironmentUser
[02:24] trivial review for someone: https://github.com/juju/juju/pull/641
[02:25] and my notes about bootstrapping: https://github.com/juju/juju/pull/642
[02:26] review please: https://github.com/juju/juju/pull/643 -- fixes CI blocker
[02:28] axw: does this address both blockers?
[02:28] thumper: I've only looked at the one so far
[02:28] will take a look at the other now
[02:28] kk
[02:43] thumper: what's the other blocker? there was only one i thought
[02:43] the other critical bug is not a blocker as it's not a regression
[02:43] https://bugs.launchpad.net/juju-core/+bug/1363143
[02:43] Bug #1363143: local lxc deployments fail to create machines
[02:43] showing "ci regression"
[02:44] thumper: ah, it hasn't got a milestone
[02:44] i was looking at 1.21alpha1 milestone bugs
[02:45] it could be the same root cause as the manual provider fix
[02:49] could be
[02:55] wallyworld thumper: is local meant to be doing apt-get update/upgrade by default now?
[02:55] cos it seems to be
[02:55] axw: with lxc-clone, it wasn't meant to
[02:56] isn't lxc-clone implicit on local?
[02:56] it is now yes
[02:56] as of 1.20
[02:57] I'm doing add-machine on trunk with no special config, and added machines are updating/upgrading
[02:57] if local is doing the apt dance with lxc-clone = true, that's wrong
[02:57] I expected that to clone and not update/upgrade
[02:57] do you have a template container created?
[02:57] yes
[02:57] and if it is using that, ie clone = true, then there's a bug if it's doing apt
[02:58] I'll look into it more after I've repro'd this bug
[02:58] thumper: added Usermanager.AddEnvironmentUser to the serverapi, we can expose it as Environment.Share on the client: https://github.com/juju/juju/pull/644
[02:58] waigani: Noooooooo
[02:58] sigghhhhhhh
[02:58] axw: i'll look as well - people wanted the option of doing apt in clones so as not to have stale templates, but we wanted to also preserve the current default behaviour
[02:59] thanks
[03:00] thumper: hangout?
[03:00] * thumper nods
[03:00] axw: there's a test that passes TestLocalDisablesUpgradesWhenCloning
[03:01] maybe the test is wrong
[03:01] waigani: muted?
[03:01] waigani: you are very quiet
[03:01] wallyworld: I didn't specify lxc-clone or use-clone or anything
[03:01] * axw looks at the test
[03:01] axw: lxc-clone defaults to true - maybe the apt logic is broken in that it doesn't handle implicit defaults
[03:02] i'll check
[03:04] axw: i found the bug
[03:04] well, i think so from reading the code - i'll fix
[03:04] wallyworld: cool, so I'm not going crazy :)
[03:04] thanks
[03:04] nope, not this time :-)
[03:04] :p
[03:05] i'll write a test to check my theory
[03:13] axw: just checked the config code - seems that lxc-clone defaults to false, which means i totally misremembered what was done
[03:13] huh, ok.
[03:14] wallyworld: it's in container/lxc
[03:14] i *think* we must have wanted to retain the original old behaviour which was before clone was supported ie do not clone unless asked
[03:14] if it's not set, container/lxc auto-detects support based on series
[03:14] https://github.com/juju/juju/blob/master/container/lxc/lxc.go#L104
[03:15] axw: huh, well that conflicts with the code in config
[03:16] it will be messy because that check in lxc.go is done on the host machine
[03:16] the config parsing is done on the client
[03:16] they're one and the same :)
[03:16] for local anyway
[03:16] not for maas with lxc etc
[03:16] yes, will be messy for them probably
[03:17] for now, maybe best just to be explicit with lxc-clone=true
[03:49] axw: I'll update my doc to match your comments
[03:57] thumper: thanks
[03:57] and thanks in general, I've been meaning to write that for a while...
[03:58] yeah, me too
=== Guest18526 is now known as wallyworld
[04:33] well...
[04:33] I deleted the AddAdminUser method
[04:33] and got all the breakages I expected before...
[04:33] * thumper enfixorates
[04:35] * thumper chuckles at the pure number of panics
[04:38] just for state: OOPS: 248 passed, 90 FAILED, 213 MISSED
[04:38] whoops, juju dev is usually #2 on my irris
[04:56] thumper: https://github.com/juju/juju/pull/645
[05:14] whelp, that's a circular import
[05:14] time to table flip and try again
[05:15] state/state.go:
[05:15] 30: "github.com/juju/juju/environmentserver/authentication"
[05:15] ffs
[05:38] * thumper weeps and leaves
[05:38] searched for "user-admin" in our tests
[05:38] lots
[05:38] no longer admin
[05:38] * thumper goes to make dinner
[05:38] problem for tomorrow-tim
[06:14] wallyworld: it looks like the tools-in-cloud-config for local may be what's making things not work well in local/lxc
[06:15] I'm going to back it out and see if it fixes things
[06:16] from what I can see, it makes cloud-init take a lot more CPU. I guess it's hurting the YAML parser
[06:20] axw: could be, yeah. i would prefer another way tbh
[06:21] wallyworld: prefer another way? as in, other than what's currently in master?
[06:22] axw: scp or something like that - to get tools into the bootstrap machine. we used to set up an http service but that fails if firewall ports are closed
[06:23] wallyworld: ah. well I'm only changing it for non-bootstrap machines atm. bootstrap is fine
[06:23] (but could be better, I agree)
[06:23] np, let's iterate on it
[06:23] * wallyworld -> school pickup, bbiab
[06:26] axw: are you putting an 8MB tarball into a text configuration file ?
[06:26] jam: indeed, and now reverting that :)
[06:27] axw: did you test whether that size was even feasible for Userdata ?
[06:27] jam: yes, it does work. I only did it on the local provider, and it works for both lxc and kvm
[06:27] but it does seem to add significant overhead, which I hadn't noticed before
[06:28] and delays agent startup... probably didn't notice because I only tested on my laptop before
[06:28] just tested on a VBox VM, and it was noticeable
[06:28] yaml doesn't do length-prefixed delimiting, so I can imagine that looking for the closing marker on 8MB * (4/3 base64) of tarball is a bit expensive
[06:32] jam: https://github.com/juju/juju/pull/648 reverts it, if you have a moment to review. fairly trivial
=== uru_ is now known as urulama
[06:33] axw: the first part of that certainly looks like we are only putting the URL into the user-data
[06:33] Tools.URL
[06:33] is being set
[06:33] shouldn't there be some sort of encoding of the actual content?
[06:33] jam: environs/cloudinit treats file:// specially, and reads the file in when generating the cloud-config
[06:34] that's used for bootstrap at the moment
[06:34] and manual provisioning
[06:36] axw: that looks so totally horrible to me... :(
[06:36] "if something starts with file://" then it must be a tools prefix and thus we should read it into our cloud config file as tools.tar.gz
[06:37] very "spooky action at a distance"
[06:37] yes, it is a bit magical and needs fixing
[06:37] maybe if it had at least "tools" in that string.
[06:37] in which string?
[06:38] jam: oh, we only do that in one specific place: when generating the "copy tools" command
[06:38] nm, it isn't anything with file:// it is just if Tools.URL has file:// which is slightly better, but still
[06:38] yes, still a bit magical. it'd be better if we just scp'd it in the first place
[06:38] axw: just to confirm you've tried it with and without and the overhead is significantly better after your patch, right?
[06:38] will try to reorganise things at some point to accommodate that
[06:38] axw: agreed
[06:38] jam: yes, on my VM it's noticeably faster
[06:39] also doesn't leave 8MB cloud-init files lying around in the lxc container cache
[06:39] axw: so with your change we just get the Tools.URL that we discovered, rather than overwriting it as a "file://" url, right?
[06:39] yup, it'll just do what every other provider does - download from the API server
[06:41] axw: oh man... AddBinaryFile adds a shell script which is doing "printf %s BASE64CONTENTS | base64 -d > file"
[06:41] We're lucky the shell was allowing it given that size
[06:43] axw: LGTM
[06:43] thanks
=== urulama is now known as urulama-afk
[06:49] axw: fwiw, ec2 User Data is capped at 16kB, which I think would be a sane rule for us to follow.
[06:50] jam: yep, I won't make that mistake again. bootstrap will migrate sooner or later, but that at least does not seem to have this problem
[06:51] bbs, school pickup
[06:51] later
[06:57] morning
[06:58] dimitern: just grabbing some coffee, will be there in a bit
[06:58] jam, sure, omw
[07:06] fwereade: heya, you back on board this week?
[07:06] wallyworld, heyhey
[07:06] wallyworld, yeah :)
[07:06] fwereade: awesome. i'd love to catch up via hangout when you have some time
[07:07] maybe ping me later when you have read all your email
[07:09] wallyworld, cheers
[07:10] wallyworld, will do
[07:31] dimitern: did my connection die or is it yours?
[07:33] so - is there a way for me to know if landing is currently blocked?
[07:33] also - morning all
[07:33] mattyw: try to land something and CI will reject it?
[07:33] AFAIK there is no blockers right now
[07:33] are no
[07:34] critical bugs are in the topic - maybe we should update those to indicate which ones block landings
[07:34] jam, I was wondering if there was a better way - so I can work out if there's any point trying to land something
[07:34] but I guess better to ask forgiveness and all that
[07:34] mattyw: well you can do the search yourself
[07:34] for any "ci+regression" bugs
[07:35] maybe the bugs in the topic are the blockers already
[07:35] https://bugs.launchpad.net/juju-core/+bugs?field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.importance%3Alist=CRITICAL&field.tag=ci+regression+&field.tags_combinator=ALL
[07:35] jam, ok great
[07:35] wallyworld: looks like LXC deployments failing to create machines is considered blocking?
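Editorial aside on the tools-in-cloud-config thread above: the size arithmetic behind "8MB * (4/3 base64)" and the 16kB user-data cap can be sketched in a few lines of Go. This is only an illustration under stated assumptions. The names maxUserData, encodedSize and fitsInUserData are hypothetical and are not juju code, and the sketch treats the cap as applying to the encoded text embedded in the cloud-config.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// maxUserData is the 16kB cap quoted in the discussion (hypothetical constant name).
const maxUserData = 16 * 1024

// encodedSize returns how many bytes rawLen bytes occupy once they are
// base64-encoded with padding, i.e. roughly 4/3 of the raw size.
func encodedSize(rawLen int) int {
	return base64.StdEncoding.EncodedLen(rawLen)
}

// fitsInUserData reports whether a payload of rawLen bytes would stay
// under the cap after base64 encoding.
func fitsInUserData(rawLen int) bool {
	return encodedSize(rawLen) <= maxUserData
}

func main() {
	tarball := 8 * 1024 * 1024 // the 8MB tools tarball from the discussion
	fmt.Printf("raw=%d encoded=%d fits=%v\n",
		tarball, encodedSize(tarball), fitsInUserData(tarball))
}
```

A check like this would reject the tarball long before any YAML parser or shell had to cope with it, which is the rule of thumb suggested at 06:49.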
[07:35] wallyworld, jam, that's what blocked me on friday
[07:36] mattyw: axw just submitted something for it
[07:36] at least, I reviewed something that sounded like it was this
[07:36] jam1: yeah, there were 2 - one for manual, one for local
[07:36] first is done, second is on its way
[07:36] the manual one was fallout of two branches sort of landing together
[07:37] mattyw: https://github.com/juju/juju/pull/648 is addressing bug #1363143
[07:37] Bug #1363143: local lxc deployments fail to create machines
[07:37] so it is queued right now
[07:37] fwereade: also, since you're ocr today, i'd love a review of this which *may* solve one of the container pending issues we cannot reproduce but the landscape guys can https://github.com/juju/juju/pull/646
[07:38] wallyworld, just having a ciggie; quick hangout after that?
[07:38] sure
[07:38] wallyworld: so my quick take from my experience there was that it was that the provisioner would puke on certain kinds of errors
[07:38] is that what you're changing?
[07:38] wallyworld, we still using the spreadsheet to work out our ocr days? isn't it changing to the calendar soon?
[07:38] mattyw: yes, soon :-)
[07:38] wallyworld: I saw that when tools couldn't be found
[07:39] it treated no-tools as a bug in the Provisioner code, which would cause it to restart
[07:39] jam1: pretty much - but in this case, it's an inconsistent database. the machine was there but the status record wasn't
[07:39] but of course, its queue still said "i need to start a machine with tools that aren't available" so it would just keep doing that
[07:39] this is different
[07:39] so it *felt* to me that errors during provisioning shouldn't be treated as a Provisioner failure but a failure to provision
[07:39] and thus the Provisioner could keep going to the next thing to start.
[07:40] yep, agreed
[07:40] which is what we did, but specific to tools
[07:40] this problem is one case of the provisioner not being robust
[07:40] and also is due to our lack of transactions
[07:41] so the provisioner gets told a machine is ready to provision, except it isn't
[07:41] because there's no status record yet
[07:41] that arrives later, after the provisioner has already errored for that machine
[07:41] well, that's what the debug logs show
[07:42] and it explains the behaviour that's been seen
[07:46] mattyw: wallyworld: andrew's patch 'succeeded' except for a replicaset_test timeout
[07:46] so it should be unblocked RSN
[07:46] !@%$!~$@~ replicasets
[07:47] wallyworld: so it looks to be a traceback in TestAddRemoveSet for the MongoSuite and not MongoSuiteIPv6
[07:47] which leads me to believe that it is just both that are flakey
[07:47] though you specifically called out just the IPv6 version
[07:47] (they run the same test with different addresses)
[07:48] jam1: i called that one out because there's been a bug raised for it and it was assigned to 1.20 series
[07:48] i think it was holding up CI at one point
[07:48] ie failing very often, hence they raised it and assigned it
[07:49] wallyworld: sure. Looking at the code AFAICT they are identical except we always do the IPv4 setup, and then we follow that with an IPv6 setup in the v6 case
[07:49] and I thought that doing double set up might be the problem.
[07:49] but didn't get a chance to actually run it enough to have any confidence there.
[07:49] ah, ok. i hadn't looked specifically at the test
[07:53] wallyworld: it only matters because *my* team was responsible for adding IPv6 but we didn't write the original test, not that we can't be the ones to fix it
[07:54] jam1: fair enough. i was sort of thinking that whoever wrote the failing tests could have the best chance of fixing them, but of course anyone can fix any test
[08:14] jam, so even though the lxc bug has been marked as fix committed my branch still fails to land because of it - do I need to wait till it's marked fix-released?
[08:16] mattyw: I just checked the URL that the bot is supposed to be using, and it is clear now. Though LP timed out the first time I tried
[08:16] maybe just try submitting again? It hadn't updated when it got to your previous request
[08:16] jam1, will do - thanks
[08:23] jam1, I just tried again and it failed - I'll go make some coffee and try later, I don't want to spam the poor thing on a monday morning
[08:24] mattyw: can you link the PR, I'd like to see it
[08:24] jam1, https://github.com/juju/juju/pull/562
[08:26] axw: wallyworld: I thought the bot ignored Fix Committed bugs, but it is clearly still complaining.
[08:26] Do we need to drop it from CRITICAL so that we can land code again?
[08:28] jam1: not sure, I thought it ignored them too. can try it I guess
[08:31] axw: http://paste.ubuntu.com/8204244/ is the specific request the bot is making
[08:32] that sure looks like just Triaged and In Progress (and not Fix Committed)
[08:33] axw: mattyw: when I go to https://api.launchpad.net/devel/juju-core?ws.op=searchTasks&status%3Alist=Triaged&status%3Alist=In+Progress&importance%3Alist=Critical&tags%3Alist=regression&tags%3Alist=ci&tags_combinator=All
[08:33] it returns an empty list
[08:34] weird
[08:34] I'll just try to land one more time
[08:35] I'm sure the bot likes being kept busy anyway
[08:35] jam1: where's the code that's making the request? could it be caching?
[08:35] axw: I'm pretty sure the code is: https://code.launchpad.net/~juju-qa/juju-ci-tools/trunk
[08:35] but I can't find anything that calls check_blockers.py
[08:36] axw: that code is just doing urllib2.urlopen so it shouldn't be doing any caching.
[08:39] jam1: the Jenkins job calls it directly
[08:39] calls check_blockers.py
[08:40] axw: k, I don't think I have visibility into that layer
[08:41] nothing enlightening
[08:41] * axw shrugs
[08:42] and it has now been 7 minutes from a $$Merge$$ before the bot has noticed.
[08:43] all the other requests in that thread show as "request accepted within 1 min"
[08:44] it really doesn't want to land that branch
[08:45] jam1, I've seen it completely miss a branch before - can't remember the reason, but I've seen it happen
[08:50] TheMue: I'm currently in a meeting, so I might be a bit late for our 1:1
[08:50] but I'll keep you updated
[08:51] mgz, ping?
[08:58] jam: ok, I'm here
[08:58] jam: so ping me when you're ready
[09:19] axw: you free for standup now since katherine is away?
[09:20] wallyworld: sure, gimme a couple of mins
[09:20] jam: sorry, was on a call before, did you get the bot issue sorted?
[09:21] wallyworld: the bot seemed to be sleeping for an hour: http://juju-ci.vapour.ws:8080/job/github-merge-juju/
[09:21] see the 1hr gap from 8:18 to 9:17
[09:21] wallyworld: axw: mattyw: it appears to have woken up for axw, but is still ignoring matty ?
[09:21] jam: so it was, hmmmm
[09:22] wallyworld: https://github.com/juju/juju/pull/562 seems to have been pending for an hour without the bot noticing
[09:22] jam2, looks like it - I've seen this happen before - but not for ages
[09:22] wallyworld: can barely hear you
[09:22] jam, jam2 which is the real one?
[09:23] jam: i had no idea why ottomh
[09:23] maybe mgz can look into it
[09:23] jam, last time this happened mgz did some magic
[09:23] now frozen
[09:23] mattyw: I'm on 2 machines, jam2 happens to be my laptop which is a bit better than my desktop *right now*
[09:24] jam, so I guess what would be most annoying for you would be pinging them alternately?
[09:24] jam2, right?
[09:24] my laptop pings on both of them
[09:24] axw: you froze, you still there?
[09:25] wallyworld: hangouts isn't loading..
[09:25] I got cut off, now I can't load hangouts
[09:25] hang on
[09:25] \o/
[09:25] my wifi setup is dodgy atm
[09:48] TheMue: I'm in the hangout, if you can try to make it quickly
[09:48] jam: OK
[09:52] fwereade, we should have a chat about what we've discussed around metrics & environment
[09:52] fwereade, also, good morning
[10:07] mattyw, heyhey :)
[10:13] TheMue: just a reminder to be booking your travel to brussels
=== jamespag` is now known as jamespage
=== bloodearnest_ is now known as bloodearnest
[10:37] fwereade: maybe this time https://github.com/juju/juju/pull/650
[10:47] wallyworld, LGTM
[10:47] fwereade: tyvm
[10:47] mattyw, free for a quick chat?
[10:56] jam: i take your point but if I do prereqOps = append(prereqOps, machineOp) then the result isn't just prereqOps anymore, so I deliberately used a different name. prereqOps isn't used elsewhere so if it gets modified it doesn't matter
[10:57] dimitern: https://github.com/juju/juju/pull/627
[10:58] sound ok?
[11:01] wallyworld: I understood why, but it means that line has an unspecified side effect, I'd rather we were concrete about them.
[11:01] You could just do:
[11:01] return mdoc, append(...), nil
[11:02] I'll live with it either way, but I feel it is risky to do "append()" and assign it to another variable
[11:02] thought about that too, but it looked ugly. i might change it to that though. the side effect is local only
=== gsamfira1 is now known as gsamfira
[11:03] there's one case with two lists that need appending where it will get messy
[11:07] fwereade, sorry - yes I am now
[11:08] mattyw, np, I'll start a hangout
[11:15] hey, could I get another review for https://github.com/juju/testing/pull/31 ? It's a small PR in the testing package that adds a missing windows script, as well as new failing functionality for testing
[11:31] jam, you've got a reviw
[11:31] review even
[11:45] dimitern: thanks
[11:45] perrito666, ping?
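Editorial aside on the append exchange at 10:56 to 11:02: the risk being described is that append may write into the spare capacity of the original slice, so a result stored under a new name can share memory with the old one. The sketch below shows the effect with plain strings standing in for transaction ops. The function appendTwice and the op names are hypothetical, not the code under review.

```go
package main

import "fmt"

// appendTwice appends two different values to the same base slice and keeps
// each result under its own name, the pattern the review comment warns about.
func appendTwice(base []string, a, b string) (first, second []string) {
	first = append(base, a)
	second = append(base, b)
	return first, second
}

func main() {
	// base has spare capacity, so neither append reallocates and both
	// results share one backing array.
	prereqOps := make([]string, 1, 4)
	prereqOps[0] = "assert-env-alive"

	machineOps, containerOps := appendTwice(prereqOps, "insert-machine", "insert-container")

	// The second append overwrote index 1 of the shared array, so the
	// first result no longer holds the value that was appended to it.
	fmt.Println(machineOps[1], containerOps[1])
}
```

When the base slice has no spare capacity each append allocates a fresh array and the problem disappears, which is why code like this can pass tests and still be fragile. Returning the append result directly, as suggested at 11:01, avoids keeping two live names for possibly shared storage.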
=== jam2 is now known as jam1
[11:48] mattyw: pong
[11:49] perrito666, good morning, you did a fix for this? https://bugs.launchpad.net/juju-core/+bug/1363079
[11:49] Bug #1363079: userManagerSuite.TestUserInfoUserExists fails
[11:49] perrito666, the reason I ask is that a number of the test failures in our google doc look to be the same error as was reported by that bug
[11:50] perrito666, I'm trying to work out if we can consider the other tests "fixed" or at least keep an eye on them for probably/maybe being fixed
[11:50] mattyw: I did not manage to get it solved, I did figure what might be happening to that particular test
[11:50] mattyw: want to tell me more?
[13:33] could I get somebody to take another look at https://github.com/juju/juju/pull/517 ?
[14:35] hello folks. Can someone have another look at: https://github.com/juju/testing/pull/31 ?
[15:00] fwereade: I think you're OCR today? https://github.com/juju/juju/pull/651
[15:02] jcw4, cheers
[15:02] fwereade: tx
=== Ursinha is now known as Ursinha-afk
[17:46] hello all, three last branches for Actions on the unit ready for final review --
[17:46] https://github.com/juju/juju/pull/615
[17:46] https://github.com/juju/juju/pull/415
[17:46] https://github.com/juju/juju/pull/520
[17:46] sorry, 617, not 615
[17:54] perrito666, we'll talk about those errors tomorrow if that's ok?
=== Ursinha-afk is now known as Ursinha
=== Ursinha is now known as Ursinha-afk
[21:01] morning
[21:13] thumper, g'morning mate
[21:13] o/
[21:16] mm, something is odd here, thumper said good morning and it's not yet night here
[21:16] perrito666: days are getting longer
[21:17] * perrito666 calls the ministry of truth to fix that
[21:19] perrito666: the ratio of daylight to non-daylight is increasing in the southern hemisphere
[21:20] lifeless: I should know, I live there
[21:20] perrito666: kk :)
[21:22] lifeless: we are now on the road to the interesting time where it's 10PM and it's still afternoonish
[21:23] perrito666: you must be waaaay south for that - mcmurdo?
[21:24] lifeless: geographic center of argentina, which is quite south.
[21:24] I don't mind much the sun, the 30°C during the night is what gets me
[21:24] ah, I should have been able to guess that from the earlier quip ;)
[21:24] perrito666: 30° is too hot to sleep comfortably for sure
[21:27] thumper: when making a user, what is a valid 'Name' field? Should '@' be allowed at all?
[21:28] waigani: no, it should fit through this hole: var validPart = "[a-zA-Z][a-zA-Z0-9.-]*[a-zA-Z0-9]"
[21:28] waigani: that is the name part from names/user.go
[21:28] thumper: currently state.AddUser checks via names.IsValidUser
[21:29] waigani: right, but valid user is user@provider
[21:29] what I think we want is "names.IsValidUserName"
[21:29] which just checks the valid part
[21:30] thumper: so that will be a new func on names?
[21:30] right
=== Ursinha-afk is now known as Ursinha
[21:37] thumper: meta-review please: https://github.com/juju/juju/pull/649
[21:37] menn0: kk, shortly
[21:37] thumper: it's a small one so shouldn't take long
[21:39] thumper, menn0, waigani, mornings :)
[21:39] fwereade: morning :)
[21:39] fwereade: hai!
[21:39] or evening for you
[21:39] fwereade: o/
[21:39] waigani, details details, fell asleep putting laura to bed, bit confused :)
[21:40] how's everything?
[21:40] fwereade, menn0: I'm taking a week off before the sprint. All goes well, weekend in London and a few days in Paris.
[21:40] waigani, oh, lovely
[21:41] so any must see/do please email me :D
[21:41] waigani: sounds good!
[21:41] hopefully Molly will come - it will be our first real holiday :)
[21:43] fwereade: are you around for a bit? I have a few quick questions regarding stuff we did in Dunedin.
[22:25] thumper: another quick meta-review please: https://github.com/juju/names/pull/24
[22:35] omg, this branch is going to be so horrible
[22:39] menn0: happy with regex? https://github.com/juju/names/pull/24/files
[22:41] * thumper nods
[22:53] waigani: "^"+ValidPart+"$"
[22:58] * thumper wonders if the compile part of the test run will be done before the standup
[22:58] fan going full speed
[22:59] thumper: I thought you used go nowadays?
[22:59] lifeless: I do
[22:59] lifeless: the tests are compiled
[22:59] thumper: and that it had ultra super awesome compile times
[22:59] lifeless: they do...
[22:59] thumper: renice
[22:59] each package is compiled into its own executable I think
[23:00] correct
[23:28] * thumper afk to collect rachel and have a coffee
[23:39] waigani: thumper menn0 as discussed https://github.com/juju/juju/pull/653
[23:57] thumper: should state.AddUser be able to take "user@provider" or just "user"?
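Editorial aside: the name check discussed at 21:28 and 22:53, the validPart pattern anchored with "^" and "$", can be sketched as below. This is an illustration only and should not be read as the code in the names pull request. The pattern is copied from the log, while the function body and the sample inputs are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// validPart is the name pattern quoted at 21:28.
const validPart = "[a-zA-Z][a-zA-Z0-9.-]*[a-zA-Z0-9]"

// validUserName anchors the pattern at both ends, as suggested at 22:53,
// so the whole string has to match rather than just a substring.
var validUserName = regexp.MustCompile("^" + validPart + "$")

// IsValidUserName is a hypothetical version of the proposed helper: it
// checks only the name part, so anything containing '@' is rejected.
func IsValidUserName(name string) bool {
	return validUserName.MatchString(name)
}

func main() {
	for _, name := range []string{"bob", "bob-smith", "bob@local", "9bob", "bob-"} {
		fmt.Printf("%-10s %v\n", name, IsValidUserName(name))
	}
}
```

Without the anchors, MatchString would accept "bob@local" because the "bob" prefix matches on its own. Note also that the pattern as quoted needs at least two characters, so a single-letter name would be rejected.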