[02:15] grrrr... an interface that mimics bits of State using the same names but slightly different method signatures [02:15] what a great idea :) [02:27] wallyworld: sorry, another question. how does a machine agent get into the "Started" state? where is that set? [02:28] menn0: from memory, that happens when the agent comes up on the node and "phones home" via the presence api [02:28] until then the status is pending/allocating [02:28] wallyworld: with this issue I'm looking at, it seems that the peergrouper isn't even noticing the new controller nodes so isn't adding them to the replicaset [02:28] or it may be a direct status call, not sure [02:29] wallyworld: and the only reason that that can happen (really) is if the machine doesn't get to "started" [02:29] you mean not noticing the status change? [02:29] is there a watcher for status? [02:29] wallyworld: no it can't be that [02:30] wallyworld: the peergrouper polls so even if the watcher wasn't working any changes still get noticed [02:30] wallyworld: I can see the peergrouper running regularly in the logs, but the other machines aren't picked up [02:31] when enable ha is run, the machine collection does get the new machine docs [02:31] wallyworld: those machines do connect to the API on machine-0 even when their own mongodb instance isn't part of the replicaset yet [02:31] what is the peergrouper polling for? [02:32] I think it polls in case the replicaset changes underneath it (in mongodb) [02:32] sorry, i meant what data is it polling? [02:32] there are watchers so that it reacts straight away when something changes in state [02:32] and the polling is there to catch any changes in mongodb [02:33] the only way the peergrouper can be not reacting to the new machines is if they don't get to "started" [02:34] so it watches for machine status changes [02:34] not really [02:34] it watches for controller changes as well as waking up periodically [02:34] that's what "started" is isn't it? 
a machine agent status [02:34] and when it wakes up due to either a poll or a controller change [02:34] it checks the controller machines against the mongodb replicaset config [02:35] and updates things if required [02:35] but controller machines are only considered if they're "started" [02:35] yes machine agent status [02:35] obtained via (state.)Machine.Status() [02:35] so, can we see in the logs if that SetStatus() api call is being made [02:36] * menn0 checks [02:36] by the new agent when it comes up via jujud [02:36] wallyworld: they are being made: [02:36] 2016-04-14 21:01:33 DEBUG juju.apiserver apiserver.go:291 <- [475] machine-1 {"RequestId":67,"Type":"Machiner","Version":1,"Request":"SetStatus","Params":"'params redacted'"} [02:36] 2016-04-14 21:01:33 DEBUG juju.apiserver apiserver.go:305 -> [475] machine-1 324.478971ms {"RequestId":67,"Response":"'body redacted'"} Machiner[""].SetStatus [02:37] don't know what the status is being set to but the calls are there [02:37] * menn0 checks the machine-1 logs at that time [02:37] if you run with trace you'll see the params [02:37] it sounds like maybe the watcher is suspect? [02:38] wallyworld: I can't repro the problem reliably, it's intermittent [02:38] awesome [02:38] so I'm stuck with the DEBUG logs from the CI failures [02:39] I think the machine was set to started [02:39] 2016-04-14 21:01:33 INFO juju.worker.machiner machiner.go:105 "machine-1" started [02:39] sounds like it [02:39] so why the hell isn't the peergrouper noticing the machine..... [02:39] so we need to know why the watcher is not firing or why the peergrouper doesn't see that [02:39] exactly [02:39] if the logs were at TRACE I'd see why in the peergrouper logs [02:40] we can ask QA i guess [03:45] wallyworld: I've just simulated a machine not setting its status to started (by hacking the machiner to not set the status for a specific tag) and I see exactly the same log output in the machine-N.logs and the mongodb logs. 
[03:45] wallyworld: so that's the likely intermediate cause [03:45] wallyworld: now to figure out why the machine isn't "started" when the peergrouper looks [03:45] menn0: hmmm, interesting. maybe there's a network/routing issue at play? [03:46] wallyworld: it can't be that because we see the new machines contact the API on the bootstrap node [03:46] we even see it make the SetStatus call [03:46] wallyworld: my guess is that something is setting the status to another value after the machiner sets it to started [03:47] wallyworld: is that possible to your knowledge? [03:47] not ottomh [03:47] but could be i guess [03:47] the logs should show extra SetStatus calls [03:48] wallyworld: there aren't any [03:48] menn0: could there be a legit issue with the watcher firing? [03:48] it does get there eventually right? [03:49] but after a long time [03:49] wallyworld: well it polls every minute [03:49] that is true [03:49] so wtf [03:49] wallyworld: so even if the watcher doesn't fire the peergrouper still sees any changes at least once a minute [03:50] yes [03:51] wallyworld: the other way the peergrouper might not be seeing a controller machine is if it's not in the controllerinfo doc [03:51] wallyworld: but that seems less likely based on my read of the code [03:52] yeah, i'm not overly familiar with the controllerInfo state code [03:53] wallyworld: the code that updates it looks solid... it's either going to work or the whole txn which adds to it and adds the machine docs fails [03:53] and the machine docs are clearly being added [03:54] menn0: could it be one of those corner cases we've hit before with txn etc? [03:55] wallyworld: maybe... I don't think so [03:55] i've just noticed something else in the peergrouper code... 
digging some more [03:55] ok [04:05] wallyworld: I just noticed that the list of controller machines and machine status is only updated/checked based on the watcher [04:05] wallyworld: and the code to do that is horrid [04:05] menn0: you saying the poll every minute relies on the watcher firing? [04:06] Bug #1571476 opened: "juju register" stores password on disk [04:06] Bug #1571477 opened: juju 1.25.3: juju-run symlink to tmpdir [04:06] to update the machine list to poll [04:06] Bug #1571478 opened: juju login/register should only ask for password once [04:06] wallyworld: no the poll happens every minute regardless [04:06] right [04:06] but the list to poll [04:06] comes from the watcher? [04:06] wallyworld: and during that poll the latest status of the mongo replicaset is updated [04:07] wallyworld: but the controller data from state is only refreshed based on the watchers [04:07] wallyworld: and the way it's done screams of data race to me [04:07] seems like it :-( [04:08] there's a separate goroutine which passes a method on itself over a channel to main peergrouper goroutine [04:08] i wonder why we poll at all? [04:08] the peergroup receives the method, calls it [04:08] which then goes and modifies stuff back on the main watcher again [04:08] sounds like a good refactoring is in order [04:09] yep, I might try that [04:09] but first I'm going to land some simple logging changes so that we can get more info when it fails in CI [04:10] sgtm [04:11] wallyworld: thanks for being a sounding board all day... it helps a lot having to explain what i'm seeing [04:11] tis ok, i didn't do much [04:11] or anything really :-) [04:20] axw: any chance of a small review? https://github.com/juju/bundlechanges/pull/22 [04:21] wallyworld: sure [04:25] wallyworld: a little confused by the second paragraph in the description - I don't see any change relating to series override [04:26] wallyworld: is that a change to be made in juju/juju? 
[04:26] axw: a test failed - the bundle said trusty but the charm in the bundle said precise. the test expected trusty, and i changed to precise [04:27] actually, that may have been me adding trusty first up to the test [04:27] and then having to change to precise [04:27] so maybe ignore that [04:27] wallyworld: yeah, there are only additions in the code and tests [04:28] yeah, i just checked too, so i think i just forgot what i changed/added [04:28] wallyworld: LGTM, please drop that para before merging to avoid confusing anyone else :p [04:29] yep :-) [04:29] ty [05:09] wallyworld: peergrouper logging changes: http://reviews.vapour.ws/r/4623/ [05:09] back soon [05:09] awesome, will look in a sec [05:50] axw: here's the other half of that fix, only a small change http://reviews.vapour.ws/r/4624/ [06:01] wallyworld: reviewed [06:01] ty [06:02] axw: yeah, i'll need to rework the common function. technically it is "user" specified (not charm specified) but i take the point about the message [06:02] well, i guess not really user specified [06:03] wallyworld: no, I don't think so. in the original use of that function, it was user-specified because the user specifies with --series on the command line [06:03] here they're just doing "juju deploy some-bundle" [06:03] yeah, i reconsidered my position :-) [06:03] cool [06:30] jam: didn't get anywhere with tests (re RB). Was testing some stuff that dooferlad was proposing. === frobware_ is now known as frobware [06:45] frobware: morning [06:45] frobware: I have 2 PRs up for review, more coming later [06:45] http://reviews.vapour.ws/r/4614/ and https://github.com/juju/gomaasapi/pull/42 [07:42] frobware: ping [07:51] dimitern: morning [07:51] frobware: morning :) [07:51] frobware: have you seen the links I pasted above? 
[07:52] (still not quite sure ERC doesn't shit itself and appears connected but it isn't) [07:52] just about to look, but have a 1:1 with jam in a minute [07:52] frobware: sure, np [08:08] frobware: didn't like what I had to say? [08:21] wallyworld: ping [08:22] or axw ? [08:22] dimitern: heya, what's up? [08:23] I'm wondering why juju list-machines (or status for that matter) does not display containers [08:23] dimitern: erm, it doesn't? no idea [08:23] I haven't used containers in ages [08:23] axw: ok, np :) [08:23] I think, unless it's on purpose, it should be a bug [08:45] Bug #1571545 opened: juju status with default tabular format or juju list-machines does not show containers [09:17] why do some test files on windows, like fork/exec, fail? I know the fork syscall doesn't exist on windows, but we should try to detect the specific platform and then run just the specific tests [09:18] am I right? [09:29] dimitern, frobware, voidspace: sorry, dropped accidentally, but won't have much to contribute to topic beyond "I endorse the removal of hacks" [09:48] dimitern: hey, sorry was afk [09:48] wallyworld: np - it's rather late anyway [09:49] dimitern: re ntpdate: https://bugs.launchpad.net/bugs/1564397 [09:49] Bug #1564397: MAAS provider bridge script deletes /etc/network/if-up.d/ntpdate during bootstrap [09:49] wallyworld: check out https://launchpad.net/bugs/1571545 [09:49] Bug #1571545: juju status with default tabular format or juju list-machines does not show containers [09:49] dimitern: i do have a question for you - i started to look at removing the Network attr from deploy service and it was a very deep rabbit hole and many 100s of lines of code that i started to delete and then i noticed i started to overlap with one of your existing prs so backed off [09:50] dimitern: i am hoping all that network stuff - including collections, params structs etc - can all be deleted for 2.0 rc1 === babbageclunk` is now known as babbageclunk [09:50] wallyworld: ah, well - 
yeah, it's gnarly but we'll get there and drop it soonish I hope [09:51] dimitern: yeah, it needs to be done before 2.0 final [09:51] it's no longer used and can't influence the code path anymore [09:51] that status thing - it may not have ever included containers, not sure, but seems like a bug [09:51] dimitern: but the api does expose it etc, so we need to drop that bit at least [09:52] it's fine to drop the CLI argument (if still there) [09:52] i.e. just hide it while it can be dropped [09:52] dimitern: the model migration stuff also references the obsolete collection(s) [09:53] it would be best just to drop the whole lot; we will delete a couple of 1000 lines of code i think [09:53] wallyworld: ah, those I *think* are safer to remove now [09:53] wallyworld: let me have a look today how much we can drop [09:54] wallyworld: networksC is only still referenced by the opened ports, but should be easy to move that to spaces instead [09:54] dimitern: ok, let me know how you get on. 
i deleted a shit tonne of stuff from state, apis, params etc before i started to hit the address allocation feature flag stuff [09:54] so i stopped [09:55] wallyworld: yeah, I really wanted to drop the whole thing, but it was suggested as safer to drop it on maas only for now [09:55] and yeah, the ports thing i wasn't sure about [09:56] dimitern: not sure 100%, but i think the status thing was by design - [Machines] just really does mean machines and not containers [09:56] to keep it not too verbose and to fit on a screen [09:57] well containers *are* machines :) [09:57] not sure if we should keep it like that and add a --with-containers arg [09:57] i'm just guessing [09:57] what the rationale may have been [09:57] but yeah, i agree with you [09:57] it will be nice to not have to go through a pile of yaml just to get what addresses containers have [09:58] agreed [10:03] frobware, voidspace, babbageclunk: friendly review poke :) https://github.com/juju/gomaasapi/pull/42 http://reviews.vapour.ws/r/4614/ http://reviews.vapour.ws/r/4626/ [10:03] * dimitern steps out for a while [10:24] dimitern: reviewed the first two - the third's getting pretty far away from anything I know about, so might take me a bit longer. [10:59] erm [11:03] Bug #1571593 opened: lxd bootstrap fails with unhelpful 'invalid config: no addresses match' === JoseeAntonioR is now known as jose === Ursinha_ is now known as Ursinha [11:10] Bug #1571593 changed: lxd bootstrap fails with unhelpful 'invalid config: no addresses match' === cppforlife__ is now known as cppforlife_ [11:16] Bug #1571593 opened: lxd bootstrap fails with unhelpful 'invalid config: no addresses match' [11:19] frobware: I had to destroy that ZNC service - it was hogging my nick here! But I still couldn't work out if it was listening on any local ports. [11:22] babbageclunk: shame [12:00] cheers babbageclunk! [12:01] dimitern: :) oops, forgot to ping you when I'd done them! 
[12:01] babbageclunk: np, I've just got back anyway [12:02] frobware: I'll give it another go later on. [12:17] dimitern: did you get your reviews done? [12:18] voidspace: yeah, most of them - I'd appreciate a look on the last one though: http://reviews.vapour.ws/r/4626/ [12:18] dimitern: I'll swap it: http://reviews.vapour.ws/r/4629/ [12:18] voidspace: sure thing [12:20] dimitern: I like the logging changes :-) [12:21] dimitern: I have no new issues to add to the reviews already there [12:21] :) I'm sure *everybody* does heh [12:21] * voidspace lunches [12:21] voidspace: ta [12:21] babbageclunk: I'll pick up something new after lunch [12:22] babbageclunk: we're nearly there! [12:34] voidspace: reviewed [12:51] voidspace - around? Want to pick your brains about how devicename and hardware id are set. [13:01] frobware, voidspace, babbageclunk: guys, I still need an approval on http://reviews.vapour.ws/r/4626/ - please, have a look [13:16] frobware: you know what? /etc/network/if-up.d/ntpdate is missing on xenial - it's only there on trusty [13:17] (well, maybe also in more recent non-LTS *releases) [13:18] dimitern: is that because ntpdate is not in the base image on xenial? [13:19] frobware: I suspect so - it's not on my machine after upgrading to xenial, nor is it on freshly deployed xenial maas nodes with the most recent images [13:20] dimitern: but technically it could come back (i.e., later versions add it (again)) [13:21] charms can also install packages. [13:21] pretty sure neither ntp or ntpdate have ever been part of the base server image [13:22] true, but I can confirm ntpdate is there on trusty and not there on xenial images [13:23] even without juju in the picture [13:23] frobware: it could, but the code handles that transparently [13:23] voidspace, state/machine.go:1262 seems like it might not be quite right -- surely the preferred address should be allowed to change if it's no longer one of the known addresses? [13:23] (chmod -f -x .. 
and later chmod -f +x ..) [13:24] dimitern: do we fail on chmod or just ignore? [13:24] chmod -f does not fail when the file is missing [13:24] dimitern: do we fail if chmod fails? [13:24] dimitern: ty [13:24] dimitern: and chmod -f is supported as an option in precise? [13:25] frobware: unfortunately I could see a bunch of ntpdate still hanging around with the chmod -x patch, as it seems ifup calls `/bin/sh /etc/network/if-up.d/ntpdate` [13:26] dimitern: yay [13:26] frobware: we *could* try using ifup --no-scripts, but that seems more dangerous [13:26] dimitern: ooh. interesting. [13:27] frobware: I'll do some experiments to see [13:27] dimitern: we *should* try this. the scripts will have run once for curtin's ENI, and they will run on every reboot. Just not whilst we're replacing stuff. [13:28] dimitern: at face value that seems ok [13:28] frobware: that's an excellent point (which I keep forgetting about) [13:28] dimitern: is --no-scripts supported in precise? :) [13:29] frobware: will try precise as well [13:31] dimitern: fwiw, I don't think '--no-loopback' is supported in precise [13:31] dimitern: nope, not supported. [13:33] dimitern: http://pastebin.ubuntu.com/15912945/ [13:36] frobware: unfortunately --no-scripts does not work even on xenial [13:36] frobware: that is, it works, but since one of the scripts is `bridge`, the bridges are not configured ok [13:36] dimitern: what does 'bridge' mean here? [13:37] frobware: /etc/network/if-pre-up.d/bridge -> /lib/bridge-utils/ifupdown.sh* [13:38] dimitern: which doesn't exist on precise... let me look elsewhere [13:38] frobware: it should be there if bridge-utils is installed [13:42] voidspace, ignore me [13:56] babbageclunk, frobware: how about `verifySubnetAliveUnlessMissing(cidr) error` ? it will still return no error if cidr does not match an existing subnet [13:59] would that be easier to follow? 
[14:02] lol [14:02] dimitern: a bit, although it's still got too many clauses in the name [14:02] if you have to make a function into a sentence, you probably need more than one function [14:02] what's wrong with descriptive names? [14:03] or that's just not how real go devs roll :D [14:03] dimitern: hang on, I'm typing up what I mean [14:03] dimitern: nothing wrong with descriptive names, it's that the thing shouldn't be one function if its name has to be too descriptive. [14:04] dimitern, natefinch: fwiw, that was my original concern in the PR [14:04] generally if you need a name that specific, it means you're tying the implementation of that function too tightly into what your consumer needs. Just split it into two functions that do two simple things [14:05] ok, a better option will be I think to return a concrete error in the case the subnet does not exist, so it can be verified where needed [14:05] granted, I don't know what that function does per se, but it seems something like if subnetExists(cidr) { return verifyAlive(cidr) } is probably clearer and the individual functions are more reusable [14:07] fwereade_: ok, I will ignore you [14:07] babbageclunk: you there? [14:09] voidspace: yup, just typing up something [14:11] dimitern: https://pastebin.canonical.com/154572/ [14:11] dimitern: Maybe? [14:14] voidspace: yup? [14:14] babbageclunk: do you still have questions? [14:14] voidspace: yes! [14:14] voidspace: hangout? [14:14] babbageclunk: why do you want to ask about new device name? [14:14] anyone up for a review of a 2.0 bug? 
http://reviews.vapour.ws/r/4616/diff/# [14:14] babbageclunk: sure [14:15] babbageclunk: I'm trying out a similar approach, will update the PR soon with it [14:18] jam: I think the bits we need to expose to apply NICs to the container is: SetContainerConfig(container, key, value string) [14:25] dimitern: cool [14:25] dimitern: can I pick your brains about /list [14:26] dimitern: oops, that is not what I meant to type [14:26] babbageclunk: yeah? :) [14:26] dimitern: what I meant to type was: provider/maas/volumes.go [14:27] babbageclunk: I'm not *that* familiar with it, but I'll help with what I can [14:27] babbageclunk: HO? [14:27] dimitern: ok thanks - voidspace was not much help! [14:28] dimitern: yup [14:41] frobware, babbageclunk: updated http://reviews.vapour.ws/r/4626/diff/ can you have another look please? [14:44] dimitern: looking [14:45] Gah, X keeps crashing on me. :( [14:45] dimitern: I'm landing my branch - the only issue you opened was invalid and your other two comments I addressed [14:46] dimitern: (you suggested changing map[string]bool to set.Strings but the bool has significance, it isn't just a set) [14:46] voidspace: sure, sounds good [14:46] dimitern: the map tracks which subnets we actually found (true/false) [14:46] dimitern: cool [14:47] voidspace: we should bump deps.tsv for gomaasapi at some point as well [14:47] dimitern: it's been bumped whenever needed [14:47] dimitern: last time was on Friday [14:48] voidspace: ok [14:48] dimitern: I don't think anything has been done since then that needs updating [14:48] voidspace: my PR that fixes fetching VLANs with a null name [14:48] (landed earlier) [14:49] dimitern: ah, cool [14:49] dimitern: want me to do just that and propose it? 
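The refactoring suggested earlier — splitting an over-descriptive `verifySubnetAliveUnlessMissing(cidr)` into two small reusable functions, as in natefinch's `if subnetExists(cidr) { return verifyAlive(cidr) }` — might look roughly like this. This is a toy sketch: the `subnets` map stands in for real state, and all names are illustrative rather than juju's actual code:

```go
package main

import "fmt"

// Toy stand-in for state: cidr -> life. Hypothetical data, for illustration only.
var subnets = map[string]string{
	"10.0.0.0/24": "alive",
	"10.0.1.0/24": "dead",
}

// subnetExists reports whether a subnet with this CIDR is known at all.
func subnetExists(cidr string) bool {
	_, ok := subnets[cidr]
	return ok
}

// verifyAlive fails if the subnet is not alive.
func verifyAlive(cidr string) error {
	if subnets[cidr] != "alive" {
		return fmt.Errorf("subnet %q is not alive", cidr)
	}
	return nil
}

// verifySubnetAlive composes the two: a missing subnet is skipped, which is
// the "unless missing" semantics discussed in the channel, but now each
// concern is a separate, reusable function.
func verifySubnetAlive(cidr string) error {
	if subnetExists(cidr) {
		return verifyAlive(cidr)
	}
	return nil
}

func main() {
	fmt.Println(verifySubnetAlive("10.0.0.0/24"))        // <nil>: exists and alive
	fmt.Println(verifySubnetAlive("10.0.2.0/24"))        // <nil>: missing, so skipped
	fmt.Println(verifySubnetAlive("10.0.1.0/24") != nil) // true: exists but dead
}
```

The composed function no longer needs a sentence-length name, because the two branches it takes are now named pieces a reader can follow.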
[14:49] dimitern: I just knocked one more thing off the maas2 list and was about to tackle the next [14:49] dimitern: looking [14:49] voidspace: well, as it's not a blocker for your maas I can do it later tonight or tomorrow [14:50] dimitern: ok [14:50] frobware: thanks! [14:51] babbageclunk: updated/simplified http://reviews.vapour.ws/r/4626/diff/ (you dropped and missed this I think) [14:51] dimitern: LGTM on your branch [14:51] voidspace: ta! [14:52] Bug #1571687 opened: Azure-arm leaves machine-0 from the admin model behind [15:02] mgz: ping [15:02] ericsnow: standup time [15:03] voidspace: yo [15:03] dimitern: looking now [15:03] thanks babbageclunk [15:05] mgz: just emailed you [15:05] mgz: I thought it was better to do as an email anyway [15:05] mgz: we'd like all the MAAS CI tests duplicating for MAAS 2.0 please :-) [15:06] voidspace: we have a card for it [15:06] mgz: ah, awesome [15:06] mgz: we're very near to needing it [15:06] mgz: anytime tomorrow will be fine ;-) [15:06] :-P [15:07] what's not working on master at present? [15:08] mgz: we don't add machine tags in instance characteristics (in progress) [15:08] mgz: instance.volumes unimplemented (in progress) [15:08] mgz: all container support not done yet (a couple of days work probably) [15:08] mgz: a network interface function that is implemented but not wired in [15:08] mgz: (that's trivial) [15:09] hm, so the basic deploy test should work, but the bundle ones probably won't quite yet [15:09] mgz: yep [15:09] mgz: but it will only be a handful of days which is why I'm pinging now [15:09] thanks :) [15:09] mgz: and thanks to you sir [15:11] cherylj: I commented on https://bugs.launchpad.net/juju-core/+bug/1531444 ... 
Maybe I'm missing some context, but it seems like it's probably not super critical [15:11] Bug #1531444: azure: add public mapping of series->Publisher:Offering:SKU [15:12] natefinch: it was marked as critical as it impacted our ability to publish centos / windows in streams for azure [15:12] Is there a protocol for asking for help in canonical #maas? [15:13] Someone particular I should ask? [15:13] babbageclunk: I usually ask roaksoax, or mpontillo [15:13] cherylj: yes, but if it's only when a new version of windows comes out... we need to update core for that anyway (which, admittedly is horrible and bad, but it is the state of the code AFAIK) [15:14] babbageclunk: roaksoax has promised to help us [15:14] babbageclunk: allenap, mpontillo, roaksoax, blake_r [15:14] it's not a very lively channel [15:14] but gavin is in our timezone, and sometimes in hitting range of me (allenap) [15:14] natefinch: thanks for checking it out, I'll bring it up again today to better understand the blockage [15:14] voidspace: or maybe he *vowed* to help us? [15:14] Ok, thanks [15:15] babbageclunk: uhm, maybe I guess... [15:16] voidspace: It would be more dramatic. [15:16] babbageclunk: it certainly would be [15:16] voidspace: Never promise when you could vow [15:17] babbageclunk: heh, sound life advice there [15:18] babbageclunk: frobware: dimitern: a really difficult one http://reviews.vapour.ws/r/4630/ [15:18] voidspace: looking [15:19] dimitern: does it need a test where there are no tags? [15:19] voidspace: :) LGTM [15:19] dimitern: thanks [15:20] frobware: that's up to gomaasapi I think - it should handle the lack of tags as an empty slice (or nil) [15:20] voidspace: ^^ [15:20] dimitern: frobware: yep, it will just be an empty slice [15:20] ok [15:21] dimitern: frobware: calling gomaasapi gives us a struct with a Tags member - so either there's something there or there isn't. It doesn't matter. 
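The point above about missing tags "not mattering" holds because, in Go, a nil slice and an empty slice behave identically under `len` and `range`. A quick illustration — `describeTags` is a hypothetical helper for this sketch, not a gomaasapi function:

```go
package main

import "fmt"

// describeTags is a hypothetical consumer of a Tags slice. The len check is
// true for both a nil slice and an empty one, so the caller never needs to
// distinguish which it received.
func describeTags(tags []string) string {
	if len(tags) == 0 { // holds for nil and for []string{}
		return "no tags"
	}
	return fmt.Sprintf("%d tags", len(tags))
}

func main() {
	var nilTags []string // nil: no allocation has happened
	fmt.Println(describeTags(nilTags))             // no tags
	fmt.Println(describeTags([]string{}))          // no tags
	fmt.Println(describeTags([]string{"virtual"})) // 1 tags
}
```

Ranging over a nil slice simply executes the loop body zero times, which is why code consuming the Tags result doesn't panic when a machine has no tags.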
[15:21] or rather, an interface with a Tags method [15:21] voidspace: cool [15:21] frobware: thanks [15:22] voidspace: also I doubt you have any tags on your vmaas vms - otherwise that would've been noticed earlier :) [15:22] (I mean if it's a panic or something nasty like that) [15:22] indeed [15:23] babbageclunk: don't forget to update the status doc - ta! [15:24] voidspace, babbageclunk, frobware: http://reports.vapour.ws/releases/3899/job/run-unit-tests-race/attempt/1338#highlight it might be worth running provider/maas tests with '-race' a few times to find and fix that [15:26] dimitern: I've added it as a TODO on the status doc [15:27] voidspace: +1 [15:36] voidspace, babbageclunk, dimitern: reminder - no rick call today [15:37] frobware: ok [15:38] * dimitern bbl [15:41] dimitern, voidspace: whoa, that data race is weird - can someone explain it to me a bit? [16:01] Bug #1519877 changed: 'juju help' Provider information is out of date === tasdomas` is now known as tasdomas [16:14] babbageclunk: I looked at it, went "whoa" and stopped looking at it [16:14] babbageclunk: as you might guess, the race detector detects possible race conditions between goroutines [16:14] babbageclunk: so it shouldn't be *too* hard to work out [16:14] voidspace: Ah, I think I get it - it's the fact we store the filename in GetFile. [16:15] voidspace: In fakeController. [16:15] babbageclunk: if you think you can fix it then awesome [16:15] babbageclunk: ah, right [16:15] sounds likely [16:17] voidspace: do you think I should put locking on all of the fakeController methods that store state on the controller for later? [16:17] voidspace: may as well, right? [16:18] babbageclunk: if it's not too much work [16:19] babbageclunk: I don't really like "just in case" code [16:19] babbageclunk: but locking is perhaps an exception [16:23] It's only a few methods [16:31] Bug #1571737 opened: Race is mass provider storage [16:39] babbageclunk, voidspace, so who gets bug 1571737? 
:) [16:39] Bug #1571737: Race is mass provider storage [16:39] cherylj: me, me! [16:39] we have a winner! [16:39] cherylj: fixing it now [16:39] yay! [16:39] thank you, babbageclunk :) [16:39] cherylj: :) [16:41] babbageclunk: what's your lp ID? [16:41] cherylj: hmm, good question - checking now [16:41] cherylj: 2-xtian [16:41] yeah, I never would've guessed that [16:41] thanks, babbageclunk :) [16:42] me neither [16:42] and I'm sure you told me this a few weeks ago [16:49] voidspace, dimitern, frobware: review my data race fix please? http://reviews.vapour.ws/r/4631/ [17:00] babbageclunk: LGTM [17:01] voidspace: sweet. What's the protocol for closing bugs? Will anything update it automatically on merge if I put a tag on the PR, or do I just close it manually? [17:02] babbageclunk: assign it to yourself, mark it in progress [17:02] babbageclunk: then once the fix lands mark it fix committed [17:02] babbageclunk: QA are responsible for marking it fix released (effectively closing it) [17:03] babbageclunk: I don't *think* there's anything auto here [17:03] our release process does auto-fix-released bugs targeted at the milestone [17:03] mgz: cool [17:03] babbageclunk: you should probably target the bug at the latest 2.0 beta/rc or whatever the latest is then [17:04] yeah, rc1 [17:04] Bug #1570035 changed: Race in api/watcher/watcher.go [17:04] Bug #1570994 changed: deploy fails to download updated local charm [17:06] When you say target it at 2.0-rc1 - is that the milestone? (It's already set to that.) [17:07] babbageclunk: yes [17:08] Ok, I've marked it as in-progress, and when the merge passes (as I'm sure it will with no flaky tests!) I'll change it to fix-committed. [17:09] babbageclunk: don't forget status doc [17:10] voidspace: haven't! [17:10] babbageclunk: :-p [17:10] voidspace: no, I mean I haven't updated it ;) [17:11] hah [17:11] I know you haven't [17:12] voidspace: but I will. [17:12] ok [17:12] you do that [17:30] voidspace: ding! 
[17:31] Bug # changed: 1556113, 1556146, 1556180, 1558901 [17:47] ericsnow, katco`: if you're looking to break up your day, you could review my bugfix from last week: http://reviews.vapour.ws/r/4616/ === katco` is now known as katco [17:47] natefinch: will take a look in a bit [17:47] natefinch: same [17:52] Bug #1571783 opened: Windows unit tests cannot setup under go 1.6 [17:55] Bug #1571783 changed: Windows unit tests cannot setup under go 1.6 [18:01] Bug #1571783 opened: Windows unit tests cannot setup under go 1.6 [18:12] I really wish juju status would print out the controller name and model name I'm looking at [18:12] natefinch: open a bug === cherylj_ is now known as cherylj === redir is now known as redir_lunch [18:32] Bug #1571792 opened: Juju status should show controller and model names === redir_lunch is now known as redir [19:25] cmars: I'm looking at https://bugs.launchpad.net/juju-core/+bug/1566130 but I can't reproduce it with a trivial install hook that just does an exit 1... do you still have a good repro? [19:25] Bug #1566130: awaiting error resolution for "install" hook [19:27] natefinch, try pulling cs:~cmars/gogs and introducing an install hook error in reactive/gogs.py [19:27] anyone have a minute to rubber duck something with me? [19:27] natefinch, maybe raise Exception("foo") in there [19:28] cmars: ok, I'll give it a try, thanks [19:28] natefinch, then, "fix" it, do `juju upgrade-charm gogs --force-units` [19:28] natefinch, then possibly juju resolved --retry gogs/0 [19:32] katco: who would I ping about zseries information? [19:32] Bug #1556155 changed: worker/periodicworker data race [19:32] Bug #1570219 changed: juju2 openstack provider setting default network [19:32] redir: sec [19:32] np [19:37] redir: i answered on internal network in case you missed it [19:42] I did [20:19] oh weird, we automatically retry failed hooks now? [20:20] wallyworld: you around? 
[20:20] natefinch: yeah, see dev thread in jan, from message from bogdan in nov [20:20] did his followup changes all get reviewed? [20:20] I saw at least one hanging around for a while [20:21] Oh yeah, I remember that thread now [20:21] mgz: no idea about reviews [20:24] mgz: was wondering if maybe the retry code had something to do with the bug I'm looking at: https://bugs.launchpad.net/juju-core/+bug/1566130 [20:24] Bug #1566130: awaiting error resolution for "install" hook [20:26] natefinch: seems possible at least - don't have a tighter revision window to check for you I'm afraid, as we don't have tests for actually borked charms. [20:26] *nod* [20:27] should at least have something exercising resolved --retry and the like, wouldn't be that hard to add. [20:34] natefinch: sorta [20:34] wallyworld: np, got an answer elsewhere === jillr_ is now known as jillr === redir is now known as redir_afk [20:47] I'll be back later this eve, but not sure what time yet. [20:48] redir_afk: gl [20:50] Bug #1571831 opened: TxnPrunerSuite.TestPrunes intermittent test failure [20:50] Bug #1571832 opened: Respect the full tools list on InstanceConfig when building userdata config script. [20:57] cmars: that install bug... have you been able to reproduce it with latest master? I tried a few different ways of having install fail, to no avail. [20:58] natefinch, a very recent 2.0rc1 master. i'll try in a few min & demonstrate [20:58] cmars: thanks === alexlist` is now known as alexlist [21:49] natefinch, updated the bug with really precise instructions. confirmed it's still there with latest master. the trick is to upgrade-charm --force-units after you get the hook error [21:53] Bug #1571855 opened: User lacking model write access confronted with unhelpful message [22:05] Bug #1571861 opened: juju upgrade-charm requires --switch for local charms [23:50] wallyworld: I'm thinking of removing the --generate flag from change-user-password. 
it currently doesn't tell you what it generated, so not very helpful; and really, you should just use a password manager if you want that [23:51] axw: sgtm i think