[00:36] thumper: cherylj
[00:36] machine-0: 2016-02-09 00:36:41 INFO juju.apiserver apiserver.go:302 [9] user-admin@local API connection terminated after 7.391830549s, active connections: 5
[00:36] machine-0: 2016-02-09 00:36:41 INFO juju.apiserver apiserver.go:302 [A] user-admin@local API connection terminated after 2.046902769s, active connections: 4
[00:37] machine-0: 2016-02-09 00:36:41 INFO juju.apiserver apiserver.go:302 [C] user-admin@local API connection terminated after 91.680242ms, active connections: 3
[00:37] ^ this is what I can do
[00:41] davechen1y: active connections is new current count?
[00:43] yup
[00:43] cherylj: 2016-02-08 22:30:16 ERROR Command '('juju', '--debug', 'deploy', '-m', 'maas-1_9-deploy-trusty-amd64', 'local:trusty/dummy-sink')' returned non-zero exit status 1
[00:43] where can I find the source for this charm ? assuming the charm matters
[00:44] davechen1y: https://private-fileshare.canonical.com/~cherylj/dummy-charms/
[00:44] There's a tar file there, and some copied commands that the CI test uses
[00:44] ta
[00:45] is this environment a ha environment ?
[00:45] but I imagine the charm doesn't matter
[00:45] davechen1y: probably not, but you can double check by looking at the test run
[00:45] yeah, i deployed some of my usual favorites
[00:45] i'm bloody sick of the tools version checker
[00:45] is that on the top of a list for someone to fix ?
[00:46] checking every 3 seconds is stupid
[00:46] 15 minutes would be sufficient
[00:46] davechen1y: no, there are other more serious failures that everyone's working on
[00:46] unfortunatley
[00:46] unfortunately, even. Because that one is darn annoying
[00:46] davechen1y: no, the env is not HA
[00:47] juju debug-log pauses if you run it after calling enable-ha
[00:47] brilliant
[00:48] so does juju ssh
[00:48] that's wonderful
[00:48] juju enable-ha
[00:48] puts your environment into catatonia until that process finishes
[00:49] doesn't surprise me. everything dies / pauses when you enable-ha
[00:54] cherylj: so what's happened is
[00:55] the addresses of the additional mongo servers have been added to the list of state servers
[00:55] but those additional mongos are up
[00:55] sorry, not up
[00:55] they are still going through the cloud-init dance
[00:55] so you have a 2/3 chance that the apiserver running on machine-0 will try to connect to those
[00:56] 2016-02-09 00:56:04 DEBUG juju.worker.peergrouper desired.go:116 machine "0" is already voting
[00:56] 2016-02-09 00:56:04 DEBUG juju.worker.peergrouper desired.go:123 machine "2" is not ready (has status: true)
[00:56] 2016-02-09 00:56:04 DEBUG juju.worker.peergrouper desired.go:123 machine "3" is not ready (has status: true)
[00:56] yet the api server still tries to dial it :)
[00:56] 2016-02-09 00:55:49 DEBUG juju.mongo open.go:117 connection failed, will retry: dial tcp 10.251.20.20:37017: getsockopt: connection refused
[00:56] 2016-02-09 00:55:49 DEBUG juju.mongo open.go:117 connection failed, will retry: dial tcp 10.241.59.50:37017: getsockopt: connection refused
[00:56] 2016-02-09 00:55:50 DEBUG juju.mongo open.go:117 connection failed, will retry: dial tcp 10.251.20.20:37017: getsockopt: connection refused
[00:56] 2016-02-09 00:55:50 DEBUG juju.mongo open.go:117 connection failed, will retry: dial tcp 10.241.59.50:37017: getsockopt: connection refused
[00:56] 2016-02-09 00:55:50 DEBUG juju.mongo open.go:117 connection failed, will retry: dial tcp 10.251.20.20:37017: getsockopt: connection refused
[00:56] 2016-02-09 00:55:50 DEBUG juju.mongo open.go:117 connection failed, will retry: dial tcp 10.241.59.50:37017: getsockopt: connection refused
[00:57] davechen1y: but the two bugs that you're looking at aren't using ha
[00:58] right
[00:58] i won't get distracted
[00:58] i was just using enable-ha to try to get the apiserver to explode
[00:58] and enter its restart behaviour
[00:59] davechen1y: ah, I was confused
[01:00] i shouldn't go poking around in juju
[01:00] there be dragons
[01:01] ain't that the truth.
[01:03] and it's failed
[01:03] you'll love this
[01:03] Attempt 63 to download tools from https://10.251.11.185:17070/tools/2.0-alpha2.1-precise-amd64...
[01:03] + curl -sSfw tools from %{url_effective} downloaded: HTTP %{http_code}; time %{time_total}s; size %{size_download} bytes; speed %{speed_download} bytes/s --noproxy * --insecure -o /var/lib/juju/tools/2.0-alpha2.1-precise-amd64/tools.tar.gz https://10.251.11.185:17070/tools/2.0-alpha2.1-precise-amd64
[01:03] curl: (7) couldn't connect to host
[01:03] tools from https://10.251.11.185:17070/tools/2.0-alpha2.1-precise-amd64 downloaded: HTTP 000; time 0.001s; size 0 bytes; speed 0.000 bytes/s + echo Download failed, retrying in 15s
[01:03] Download failed, retrying in 15s
[01:03] machine-2 is trying to bootstrap
[01:03] it needs tools from machine-0
[01:03] machine-0's agent is trying to connect to the replica set
[01:04] the replica set is down because it's trying to ensure ha
[01:04] so, no tools
[01:04] no bootstrap
[01:04] no ha
[01:04] no tools
[01:04] etc
[01:05] hmm
[01:05] this is even weirder
[01:06] this is another case of machine-0 not listening on port 17017
[01:06] 17070
[01:06] I need to look into this
[01:06] if that port is not open, no api server
[01:39] 2016-02-09 01:36:02 INFO juju.apiserver apiserver.go:302 [1] machine-0 API connection terminated after 3m35.591602712s, active connections: 0
[01:39] 2016-02-09 01:36:02 INFO juju.apiserver apiserver.go:325 closed listening socket "[::]:17070" with final error:
[01:39] gets more and more interesting
[01:40] the api server shuts down, but never starts back up again
[01:46] OH MY GODS
[01:46] the bit where we send data to bash via ssh
[01:46] after this line
[01:46] 2016-02-09 01:43:36 DEBUG juju.utils.ssh ssh.go:249 using OpenSSH ssh client
[01:46] we're sending ONE CHARACTER AT A TIME
[01:46] axw: ping
[01:47] davechen1y: pong
[01:47] say whaaat
[01:48] davechen1y: where's the code that's doing that?
[01:48] https://bugs.launchpad.net/juju-core/+bug/1543388
[01:48] Bug #1543388: bootstrapping talks to the remote machine one character at a time
[01:48] i was watching the bootstrap
[01:48] and i'm like why is top using 48% cpu
[01:48] so I straced it
[01:48] read(0, "e", 1) = 1
[01:48] read(0, "H", 1) = 1
[01:48] read(0, "t", 1) = 1
[01:48] read(0, "D", 1) = 1
[01:48] read(0, "N", 1) = 1
[01:48] read(0, "y", 1) = 1
[01:48] read(0, "q", 1) = 1
[01:48] read(0, "5", 1) = 1
[01:48] read(0, "g", 1) = 1
[01:48] read(0, "H", 1) = 1
[01:48] read(0, "/", 1) = 1
[01:48] read(0, "S", 1) = 1
[01:48] read(0, "o", 1) = 1
[01:48] read(0, "5", 1) = 1
[01:48] read(0, "t", 1) = 1
[01:49] read(0, "6", 1) = 1
[01:49] read(0, "C", 1) = 1
[01:49] :/
[01:49] Bug #1543388 opened: bootstrapping talks to the remote machine one character at a time
[01:55] Bug #1543388 changed: bootstrapping talks to the remote machine one character at a time
[01:58] Bug #1543388 opened: bootstrapping talks to the remote machine one character at a time
[02:04] can tomb.Kill block ?
=== natefinch-afk is now known as natefinch
[02:06] davechen1y: it does acquire a lock, so, in theory, yes.
[02:06] davechen1y: but otherwise, it just closes the dying channel
[02:06] * thumper takes a deep breath and dives into importing units
[02:13] * natefinch looks up the time formatting date for the 1000th time
[02:16] natefinch: https://github.com/juju/juju/blob/master/apiserver/apiserver.go#L96
[02:16] so here's the thing
[02:16] the only way processCertChanges can exit is if someone called tomb.Kill
[02:16] and the only thing that can is cl.Close
[02:16] so this code calls tomb.Kill twice, then tomb.Done for good measure ...
[02:16] seems like overkill
[02:16] wallyworld: can you take a look at my enable-ha change: http://reviews.vapour.ws/r/3782/
[02:17] sure
[02:17] thanks!
[02:17] cherylj: did axw ping you about possibly making a non blocking channel send?
[02:17] davechen1y: yep... also, it's doubly bad, because someone might see cl.tomb.Kill(cl.processCertChanges()) and think they don't need those other lines, but that line won't ever kill the tomb
[02:18] wallyworld: no, not yet
[02:18] was maybe an alternative to increasing the channel buffer size
[02:19] but the buffer size increase might be acceptable for now until manifolds are done for those workers
[02:19] okay, can take a look later. I'm going to be afk for ~40 mins or so, but I'll be back
[02:19] ok
[02:20] cherylj: wallyworld: wasn't going to bother, since we need to change it again. non-blocking send wouldn't work here anyway, it's not just a notify chan
[02:20] ok, just wanted to double check, ty :-)
[02:20] channels with buffers that are not 0 or 1 are pretty suspect, in my experience
[02:21] unless it's directly next to the things populating the channel, I tend to agree
[02:24] this sounds like one of those cases where you could have an arbitrary number of sends on the channel before any reads from the channel, in which case any buffer size could be insufficient
[02:28] not arbitrary - is equal to the number of allowed controllers (7) plus a known number of local addresses
[02:28] rick_h___: noted, about the name=file for resources.
[02:28] it's a short term fix until workers migrated to dep engine
[02:29] wallyworld: ahh, the fact that it's a short term fix makes a big difference.
[02:29] for 2.0, sadly dep engine not going into 1.25
[02:30] so not sure what to do there
[02:30] wallyworld: well, we're not going to support 1.25 for very long, right? ;)
[02:30] 2 years :-(
[02:30] natefinch: wahhahahahahaha
[02:30] wallyworld: I know, I was joking
[02:30] but 1.25 is blocked right now, so need to get 1.25.4 out
[02:30] o/ gallows humor
[02:31] wallyworld: just make the channel buffer 128... larger powers of 2 are always better
[02:31] ok
[02:31] i'll add a comment
[02:32] wallyworld: definitely comment why the 10 is 10.
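The trade-off discussed between 02:17 and 02:32 (a non-blocking send versus a larger channel buffer) can be sketched in Go as below. This is a minimal illustration, not the juju code: the helper name trySend and the string payload are assumptions made for the example. It shows why a non-blocking send is only safe for a pure notification channel: once the buffer is full, the value is silently dropped.

```go
package main

import "fmt"

// trySend is a hypothetical helper: it attempts a non-blocking send and
// reports whether the value was accepted. If the channel's buffer is full
// (and nobody is receiving), the value is dropped.
func trySend(ch chan string, v string) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	// Stand-in for a change channel with a buffer of 1.
	changes := make(chan string, 1)

	fmt.Println(trySend(changes, "first"))  // accepted: fits in the buffer
	fmt.Println(trySend(changes, "second")) // dropped: buffer already full

	// Only the first value is ever seen by the receiver.
	fmt.Println(<-changes)
}
```

A larger buffer avoids the drop only while the number of sends before the first receive stays within the buffer size, which is the bound being argued about in the messages above.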
[02:32] axw: 2016-02-09 02:27:09 DEBUG juju.utils.ssh ssh.go:249 using OpenSSH ssh client
[02:32] what happens after this line
[02:32] something in bootstrap
[02:33] but it's mute until the other side starts to output things
[02:33] davechen1y: umm. could be one of a few things, that debug message gets printed whenever an ssh client is created
[02:33] davechen1y: first we ssh to each of the possible controller addresses
[02:34] davechen1y: then (if you're uploading tools), copy tools across via ssh
[02:34] davechen1y: then run the cloud-config rendered as a bash script
[02:34] I think that's it
[02:35] davechen1y: AFAICR, we just open "ssh" with the script as a bytes.Buffer piped to the ssh process's stdin
[02:35] davechen1y: could be that ssh is in an interactive mode? looking for escape codes?
[02:35] 13:34 < axw> davechen1y: then (if you're uploading tools), copy tools across via ssh
[02:35] ^ it'll be this
[02:36] i'm spelunking in the code now
[02:36] short version is the openssh impl doesn't buffer stdout/stdin
[02:36] or something
[02:38] davechen1y: actually, gross as it is, we just add the contents of the tools to the bash script (base64 encoded or something)
[02:38] so the 2nd and 3rd steps are just one
[02:43] rick_h___: you around?
[03:00] axw: that's fine
[03:00] i knew we base64'd them
[03:00] the problem is something is unbuffered there and it's sending one character at a time over ssh
[03:00] which is going to turn each byte into about 400
[03:00] maybe 200
[03:00] but it's a lot
[03:00] and the cpu on both sides is non trivial
[03:05] davechen1y: yep. I had a look, nothing obvious. what were you stracing exactly? bash on the remote side? ssh on the client?
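The mechanism axw describes at 02:35 (a script held in a bytes.Buffer and piped to a child process's stdin) can be sketched in Go roughly as follows. This is an assumption-laden illustration rather than juju's implementation: runScript is a hypothetical helper, and a local "sh -s" stands in for the remote ssh session so the example runs without a network.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// runScript is a hypothetical helper: it feeds script to a shell on its
// standard input and returns the combined stdout and stderr.
// "sh -s" is a local stand-in for the ssh process mentioned in the log.
func runScript(script string) (string, error) {
	cmd := exec.Command("sh", "-s")
	// The whole script is buffered in memory on the sending side and
	// handed to the child process through a pipe.
	cmd.Stdin = bytes.NewBufferString(script)
	out, err := cmd.CombinedOutput()
	return string(out), err
}

func main() {
	out, err := runScript("echo hello from the script\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Running a sketch like this under strace is one way to observe how the shell on the receiving end reads from the pipe, independently of how the sender buffers the data.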
[03:07] remote side
[03:07] bash is hitting 50% cpu
[03:07] results are in that ticket
[03:09] davechen1y: ok, will take another look later
[03:20] umm, https://github.com/juju/juju/blob/master/api/apiclient.go#L554
[03:20] natefinch: axw https://github.com/juju/juju/blob/master/api/apiclient.go#L554
[03:21] this construction is unsafe
[03:23] davechen1y: you mean because two calls could race?
[03:24] davechen1y: if so yeah.. pretty sure it's always one thing's responsibility to close though.
[03:24] so in theory, but not in practice (unless we're doing something dumb, which I wouldn't rule out)
[03:26] this would have to be hit way harder than we could
[03:26] but it's entirely possible to hit this
[03:26] https://bugs.launchpad.net/juju-core/+bug/1543404
[03:26] Bug #1543404: unsafe double channel close idiom
[03:27] that code in apiclient doesn't look like it's intended to be threadsafe, so hopefully we're not trying to use it from multiple goroutines
[03:28] lol panics
[03:29] https://github.com/juju/juju/blob/master/api/apiclient.go#L435
[03:34] sooo, http://paste.ubuntu.com/14999670/
[03:34] no matter what logging I add, i cannot get line 71 to output something
[03:35] all I can think of is somehow tools are being cached
[03:35] and i'm not pushing up what I think i'm pushing up
[03:36] davechen1y: log.Infof, not logger?
[03:38] OH FOR FUCKS SAKE
[03:38] thanks
[03:38] * davechen1y wonders what log was in this scope ...
[03:39] heh, np. One of those things your brain just can't see if you were the one that wrote it.
[03:43] sooo, amazon just gave me a machine without a public ip
[03:43] has that ever happened to anyone ?
[03:44] it has no public ip or public dns
[03:46] natefinch, wallyworld, so should I do 10 or 128?
[03:46] :)
[03:46] Bug #1543404 opened: unsafe double channel close idiom
[03:46] 128 according to nate
[03:46] cherylj: I was joking
[03:46] ha, ok :)
[03:47] cherylj: I do fear that 10 will just error out less often... but *shrug* Seems better than 1 :/
[03:47] natefinch: in practice, I see it firing twice
[03:47] if that helps :)
[03:47] 10 has some science behind it
[03:48] no, not science, it's http://i.imgur.com/24Jw4gM.gif
[03:48] lol, educated guess then
[03:48] cherylj: 1 should be enough
[03:48] based on knowledge of the system
[03:48] hehe. I love that gif
[03:49] davechen1y: I've seen that it's not
[03:49] if you just want to hand off the value between producer and consumer without either blocking
[03:49] it was 1 before
[03:49] and that was the problem
[03:49] cherylj: shit
[03:49] that's more serious
[03:49] it was sending twice
[03:49] if it was already buffered
[03:49] yeah
[03:49] what about making the receive side timeout
[03:50] I'm not sure I see how we would do that. The receiver is blocked waiting on a lock that the sender is holding
[03:50] davechen1y: it's a clusterfuck that is getting rewritten soonish
[03:50] davechen1y: thus, the bigger buffer is just a stopgap
[03:50] and a band aid for 1.25 :)
[03:50] whee 1.25 clusterfuck, keeps for 2 years even under adverse conditions
[03:53] but, we should explore the other option menn0 suggested for 1.25 - where there's some other synchronization between certupdater and apiserver
[04:06] natefinch: still around?
[04:07] wanting to verify that the assign units collection is a transitory collection
[04:07] meaning that once all units have been assigned to machines, the length of that collection should be zero
[04:12] thumper: cherylj http://reviews.vapour.ws/r/3784/
[04:13] thumper: that is correct.
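For the "unsafe double channel close" concern raised at 03:20 (Bug #1543404), one common way to make a close safe to call more than once, and from more than one goroutine, is to guard it with sync.Once. The sketch below is an assumption about the kind of problem the bug title describes; the linked apiclient.go code is not reproduced here, and the closer type and its method names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// closer is a hypothetical type wrapping a channel so that Close may be
// called repeatedly, and concurrently, without panicking on a second close.
type closer struct {
	once   sync.Once
	closed chan struct{}
}

func newCloser() *closer {
	return &closer{closed: make(chan struct{})}
}

// Close closes the channel exactly once; later calls are no-ops.
func (c *closer) Close() {
	c.once.Do(func() { close(c.closed) })
}

// isClosed reports whether Close has been called.
func (c *closer) isClosed() bool {
	select {
	case <-c.closed:
		return true
	default:
		return false
	}
}

func main() {
	c := newCloser()
	var wg sync.WaitGroup
	// Ten goroutines race to close; only one close actually happens.
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Close()
		}()
	}
	wg.Wait()
	fmt.Println(c.isClosed())
}
```

A check-then-close written as a select with a default branch is not equivalent, because two goroutines can both take the default branch before either closes the channel.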
[04:13] axw: ta
[04:16] Bug #1543408 opened: WatchControllerStatusChanges needs unit tests
[04:18] thumper: yes it should be zero
[04:19] thumper: when we assign a unit to a machine we also remove that unit from the unit assignment collection
[04:19] saw that, just wanted to confirm
[04:19] ta
[04:39] wallyworld: :/ I've been working on credentials support for the clouds
[04:40] oh, damn
[04:40] sorry, i thought you were doing --config
[04:40] wallyworld: I did, then the other. never mind. I've done other clouds as well
[04:40] i started adding joyent support and noticed we needed to do a bit of work
[04:42] wallyworld: axw: PTAL http://reviews.vapour.ws/r/3787/
[04:43] axw: so mine just does maas and joyent. with maas, i made the maas-server come from the cloud endpoint attribute in clouds.yaml
[04:43] wallyworld: yeah, I did the same in my branch. reviewing now
[04:43] axw: i'm just resolving a conflict and pushing
[04:44] anastasiamac_: not sure why you would move controllerserver to jujuclient. it's not a client-side thing.
[04:49] axw: that's my fault, i misread a question and thought we were renaming controllerserver to just controller at the top level to hold server side controller stuff
[04:49] i don't like the name controllerserver
[04:50] nor do I
[04:50] environmentserver made some sense, controllerserver does not
[04:50] it was the best i could think of at the time :-/
[04:50] i don't really like environmentserver either
[04:51] wallyworld: not suggesting we go back, but there was some connection to the two words before
[04:51] axw: atm we have controller and controllerserver at the top level. we choose just one i think
[04:51] a controller's a controller, adding server to the end doesn't change anything
[04:52] wallyworld: I think go with "controller". the things that *were* in controller have been moved to jujuclient
[04:52] oh, ok, we'll fix that
[04:52] i like controller also
[04:53] not sure if it should stay a top level package, but ok for now
[05:00] wallyworld: reviewed
[05:00] ty, noticed your config one, looking at that
[05:00] wallyworld: I'll do azure, cloudsigma, and vsphere now
[05:00] ok, i promise i won't :-)
[05:01] :)
[05:01] axw: with the manta url - that's going away as soon as storage is gone
[05:01] wallyworld: yep, as per comment
[05:02] fair enough, i'll leave in comment till then, i was hoping storage would be gone this week or next
[05:43] wallyworld: replied to comment about Apply
[05:43] ok
[05:45] axw: fair point, i think, seems like the tests need updating which i'll look at. are you happy with the modified todo for the private key stuff?
[05:45] wallyworld: didn't read yet, one sec
[05:47] wallyworld: yep. you probably wouldn't want to enter it interactively, but we can read the file during interactive entry of the filename, and add the value
[05:47] wallyworld: then your credentials.yaml is protected from changes on disk.
[05:47] maybe not obvious though
[05:47] not sure, we can leave it for now
[05:47] covered the 99% case I think
[05:47] i think the key on disk will be pretty static
[05:47] yeah, all uses use file path afaik
[05:57] axw: that should be good to go now
[05:58] wallyworld: thanks, shipit
[05:58] tyvm
[05:59] wallyworld: just testing azure, should be ready to propose the rest very shortly
[05:59] awesome
[05:59] wallyworld: there's still an issue with azure, another case where we need to be able to specify multiple endpoints
[06:00] wallyworld: in azure there's separate endpoints for storage and everything else, and they're not necessarily derivable
[06:00] damn
[06:00] wallyworld: pretty sure we're going to have to extend the clouds.yaml format
[06:00] seems so
[06:01] do we *need* azure storage long term?
[06:01] wallyworld: yes. for volume support, and also some more basic operations like specifying where the VM image should live
[06:02] storage-endpoint then i guess
[06:02] axw: are the storage endpoints well known like the auth ones?
[06:02] can we add them to public cloud.yaml
[06:03] wallyworld: yes, for azure public cloud. for azure stack you'd specify your own
[06:03] axw: well seems like we should just update our public cloud yaml and cloud metadata struct then
[06:04] wallyworld: you mean with a new storage-endpoint field?
[06:04] yeah
[06:05] if it's not derivable
[06:05] wallyworld: I'm on the fence as to whether it should be specific to storage, rather than having a flexible map of :. storage-endpoint is probably fine though
[06:06] given it's optional, it keeps the default yaml nice and simple
[06:06] for other clouds that don't need it
[06:06] we can always get feedback and tweak
[06:06] wallyworld: ok, sounds fine
[06:07] can someone help me figure out why go test isn't actually testing anything in a particular directory? http://paste.ubuntu.com/15000293/
[06:07] wallyworld: added a card to Next
[06:07] no suite registered?
[06:08] the test ran in the merge bot, and failed, obviously
[06:08] but not when I do it locally
[06:08] sigh, hate that
[06:08] cherylj: have you changed anything? changed a test file from package to package_test perhaps?
[06:08] axw: no, I didn't change any test files
[06:09] wallyworld: axw: updated move \o/ PTAL?
[06:09] looks like even on master I see the same issue
[06:09] I change a test to fail, and it happily thinks there's nothing to test
[06:11] anastasiamac_: one sec
[06:11] * anastasiamac_ waiting :D
[06:12] ugh, the suite_test.go was for peergrouper_test, and nothing else was
[06:12] cherylj: m blind but where is package_test.go?
[06:12] wonder how long it's been like that
[06:12] in worker/peergrouper...
[06:12] anastasiamac_: guess they're using suite_test.go, rather than package_test
[06:13] could be the problem?..
[06:13] anastasiamac_: LGTM, thanks
[06:13] axw: \o/
[06:13] wow - 2 in one day!!
[06:15] wallyworld: http://reviews.vapour.ws/r/3790/ -- here's the rest
[06:15] ta, looking
[06:27] axw: reviewed, you may want to rebase first as my branch is almost landed and it will probably conflict in fallback public clouds yaml
[06:27] wallyworld: thanks, yep, will do
[06:27] bbiab, school pickup
[06:44] wallyworld, axw can one of you review the test changes I had to make? http://reviews.vapour.ws/r/3782/
[06:45] I'm going to add in the unit tests for the change and am tracking that work via bug 1543408
[06:45] Bug #1543408: WatchControllerStatusChanges needs unit tests
[06:45] cherylj: just the last diff?
[06:45] or all of it?
[06:45] axw: yes, the last diff
[06:45] I copied the mock watcher that was there and modified it to be a strings watcher
[06:46] had to pull suite_test.go into the peergrouper package. I'll see about making the tests all external when I write the tests for the functional change.
[06:46] it wasn't a no-op this time, so I skipped it
[06:49] cherylj: ignoring the error from ControllerInfo in state watcher seems a bit alarming. why not just error out there?
[06:49] (sorry, couldn't help myself and looked at the rest)
[06:50] axw: if we're getting an error there, we'll get that error elsewhere and things will get restarted
[06:50] that was my thinking anyway
[06:50] cherylj: so it's not harmful to error out there as well right?
[06:50] there's not really a way to return an error there, from what I saw.
[06:50] oh. /me looks again
[06:50] cherylj: have you moved to Australia, or do you just hate sleep? :-)
[06:50] I miss sleep
[06:51] we used to be buddies
[06:51] now he's all emaciated because I don't feed him
[06:51] That's no way to treat your buddies. :-)
[06:51] cherylj: right, there's not, sorry.
[06:51] cherylj: LGTM
[06:52] thanks, axw!
[06:53] and now, SLEEEEEP
[06:53] be nice to your buddies!
[06:54] cherylj: BTW, thanks for some of those recent bug fixes.
[06:55] cherylj: i have a strong suspicion that bootstrap is not making it to waitForInitalisatin
[06:55] I think it's bombing out _WAY_ earlier
[06:56] davechen1y: shh... cherylj is sleeping \o/
[07:15] the new bootstrap syntax has already infected my brain. switching back and forth between 1.25 and 2.0 is going to be fun
[07:19] anastasiamac_: looks like the controller.yaml file isn't quite right - it looks like it is storing model information as well as controllers
[07:19] it also doesn't have the local. prefix
[07:19] for the controller name
[07:27] davechen1y: it appears that piping stuff to bash causes bash to read a character at a time
[07:27] davechen1y: as in, piping commands
[07:44] oh wow
[07:44] that's rubbish
[07:53] wallyworld: would you like me to look at updating "switch", or are you doing that?
[08:12] only in juju would our rpc layer have a field called "Code", that was a string ...
[08:14] axw: RB is not picking this up.. Could you PTAL on github? https://github.com/juju/juju/pull/4347
[08:19] wallyworld: so... according to the spec, "local" is a prefix that is added to model name at bootstrap.
[08:19] wallyworld: to me, this reads as a separate PR unrelated to my controllers.yaml work
[08:20] wallyworld: model name will be transformed before the file is written, hence, my work will just pick it automatically
[08:20] wallyworld: ping me when u r back \o/
[08:20] wallyworld: why there is model info in the file now, is probably an "oversight" :D i'll look
[08:21] anastasiamac_: done
[08:21] axw: awesome \o/
[08:22] axw: would we want to differentiate btw "controllers.lock", "models.lock", "accounts.lock" for different files?
[08:23] anastasiamac_: probably not. models and accounts are both related to controllers
[08:24] axw: k. thnx
[08:24] anastasiamac_: meaning: removing a controller should remove related info
[08:24] anastasiamac_: so you can't not lock the controllers file if you're modifying models file
[08:25] and vice versa
[08:25] axw: to what m observing, we lock a dir not a file...
[08:25] anastasiamac_: for now, yes
[09:10] frobware, voidspace, dooferlad, I've updated the PR to make the diff a bit easier to follow and included a fix after live testing on maas:
[09:10] frobware, voidspace, dooferlad, so please have a look when you can http://reviews.vapour.ws/r/3773/
[09:21] dimitern: ok
[09:44] davechen1y: would appreciate if you could test https://github.com/juju/juju/pull/4349 tomorrow, and see if it resolves the issue for you. fairly sure it does, but need to get on with other things
[09:48] axw, I think this should be applied to AddScripts as well - specifically in maas now we push a whole lot of python code to set up the bridge script
[09:49] ..or perhaps that includes all runcmd / bootcmd scripts already
[09:49] dimitern: is that going to pipe commands to bash? I don't think so
[09:49] dimitern: yeah, this is the entire cloud-config script
[09:51] axw, right, I've just noticed in maas we do a similar thing for the bridge script - put it in /tmp, trap exit and run it
[10:15] Hey everyone, does any one know if/how is it possible to access relation/conversation data from within an action?
[10:31] jam: dooferlad: dimitern: sorry, screwed my network temporarily
[10:45] jam, sorry I was too quick - what were you about to say?
[10:46] dimitern: just wanted to check if anyone has heard from alexisb today. Usually I have a 1:1 with her last night, but she missed it, and didn't reply to my email, which is unlike her
[10:47] jam, nope - but according to the calendar she was off yesterday
[10:47] dimitern: sigh. I had turned off "Juju Team Calendar" when I was at the Cape Town sprint, so that's why I didn't see it.
[10:48] dimitern: thanks for noticing and checking. team calendar is back on.
[10:53] Bug #1543517 opened: status command tests fail when you have dns spoofing
[10:54] :) np
[10:57] dimitern: when a user specifies multiple spaces for deployment constraints are we still only using the first?
[10:57] dimitern: apiserver/provisioner/provisioninginfo.go line 224
[10:57] dimitern: we should fix that soon I think
[11:02] Bug #1543517 changed: status command tests fail when you have dns spoofing
[11:05] Bug #1543517 opened: status command tests fail when you have dns spoofing
[11:08] dimitern: LGTM on your PR
[11:08] dimitern: the basic approach seems sound and I have no specific suggestions for it
[11:08] dimitern: there's a lot of context on the code this touches I'm missing, so you may want someone else to look at it too
[11:13] voidspace, sure, the more sets of eyes the better
[11:13] voidspace, cheers
[11:14] voidspace, yeah, that's due to aws
[11:14] voidspace, (ignoring all but the first space)
[11:14] voidspace, but in maas we actually use the spaces constraints for a machine directly
[11:40] anastasiamac_: hey. yes it will be a separate PR, just wanted to let you know that it needed looking at as part of the overall work
[11:56] voidspace: you found the problem, jeez what a strange issue
[12:06] wallyworld: \o/ added card
[12:06] ty
[12:41] wallyworld: yeah, took most of the day
[12:42] voidspace: you may be interested in https://bugs.launchpad.net/bugs/1539428
[12:42] Bug #1539428: cmd/juju/status: status filtering performs IP resolution of patterns
[12:42] related to your issue
[12:42] wallyworld: indeed
[12:45] wallyworld: so getting rid of the resolution would be a double win.
[12:45] yes :-)
[12:46] it will have to be fixed for 2.0 i think
[12:51] dooferlad, replied to your mail
[12:53] dooferlad, let me know if that helps
[13:13] frobware: dimitern: trivial one for you http://reviews.vapour.ws/r/3796/
[13:16] voidspace, looking
[13:27] voidspace, reviewed
[13:31] dooferlad, ping
[14:02] dimitern, voidspace: is the MAAS meeting happening? I am the only person in there...
[14:05] dooferlad, sorry got distracted - is it still going?
[14:05] dimitern: just starting
[14:05] dooferlad, omw
[14:10] omw too
[14:48] dooferlad, did you have a chance to look at my PR #4331 ?
[14:49] dimitern: I thought voidspace had
[14:49] dimitern: happy to if it needs more eyes
[14:49] dooferlad, I'd appreciate it, thanks
[14:49] dooferlad, as I'd like to get it in today, if possible and have another one almost ready to propose
[14:52] dooferlad: I did review it and it looks sound to me, but I have a lot of missing context so I think another pair of eyes would be useful
[16:06] dimitern: LGTM. I would appreciate some card/bug links, but I didn't require them because it is in old code.
[16:08] dooferlad, thank you
[16:12] Bug #1497301 opened: mongodb3 SASL authentication failure
[16:14] hey perrito666, looks like a couple bugs on the mongo3 branch need some love: bug 1534620 and bug 1497301
[16:14] Bug #1534620: TestMongo26UpgradeStep fails on windows because of dos paths
[16:14] Bug #1497301: mongodb3 SASL authentication failure
[16:15] Bug #1497301 changed: mongodb3 SASL authentication failure
[16:24] hey dimitern, how are things going with maas-spaces?
[16:24] Bug #1497301 opened: mongodb3 SASL authentication failure
[16:25] cherylj, hey
[16:26] cherylj, we're tracking master daily and wait for some CI feedback from the changes we pushed in maas-spaces so far (4 days we had no CI run of maas-spaces)
[16:26] cherylj, we're also ~2d away from having a hopefully blessed maas-spaces
[16:26] dimitern: ping
[16:26] voidspace, pong
[16:27] dimitern: the card I'm working on is "provider/maas: Addresses() to set SpaceProviderId not SpaceName"
[16:27] dimitern: there is no provider/maas Addresses method
[16:27] dimitern: and in fact "SpaceName" doesn't appear at all in provider/maas/environ.go
[16:28] voidspace, it's in instance.go or interfaces.go IIRC
[16:28] dimitern: ah yes it is
[16:29] dimitern: better use of grep just found it in instance.go
[16:29] dimitern: sorry for the noise :-)
[16:29] dimitern, are there still changes the team needs to push to the maas spaces branch for a CI bless?
[16:29] regex for the win
[16:29] alexisb: yes
[16:29] alexisb: proper handling of controller space
[16:29] voidspace, and we are looking at 2 days for those fixes?
[16:29] alexisb: the failures related to "Default Space" actually highlighted an important problem which we've nearly fixed
[16:30] alexisb: yes
[16:30] ok
[16:30] we shouldn't have been using the default space at all and we *must* know the space to deploy controllers to
[16:30] voidspace, dimitern would a CI run on maas-spaces now be useful to validate landed fixes?
[16:30] alexisb: yes
[16:30] voidspace, thank you
[16:30] alexisb, yes, absolutely - we didn't have one for 4 days
[16:30] alexisb: space discovery problems should be completely fixed and we haven't had a full run since those fixes landed
[16:30] we need to stay in daily contact regarding progress on this branch
[16:31] voidspace, ack
[16:31] alexisb, we're hopefully down to 2-3 failures now compared to master, but still need to get a CI run to be sure
[16:31] sinzui, mgz, can we please get a run on the maas spaces branch
[16:31] alexisb: it is running
[16:31] alexisb: FYI the three yellow cards in our "In Progress" kanban lane are tracking progress
[16:31] alexisb: https://canonical.leankit.com/Boards/View/101652562#workflow-view
[16:32] sinzui: thanks
[16:32] voidspace, awesome thank you
[16:51] Bug #1543660 opened: juju needs to fix the yaml parser
[17:08] dimitern, voidspace: EOD for me. Still watching email.
[17:10] hey tych0, looks like CI on your branch had more build failures: http://reports.vapour.ws/releases/3586 (non-trusty build failures)
[17:11] dooferlad, cheers, but I need to go soon as well
[17:12] katco: can you double check with rick_h__ that we're dropping comment for now?
[17:12] natefinch: yes
[17:12] natefinch: doh, sorry i did. we are.
[17:12] oh good :)
[17:13] natefinch: sorry, forgot to include that in the email. ericsnow fyi ^^^
[17:28] ericsnow: did you have something you needed reviewed?
[17:29] natefinch: http://reviews.vapour.ws/r/3778/
[19:09] ericsnow: how much of cmd/juju/charmcmd/store.go is copied from elsewhere?
[19:10] natefinch: nearly all of it
[19:10] ericsnow: I thought so. Can you put in comments as to that effect? Then I don't have to try so hard to figure out why it's doing all this stuff.
[19:11] ericsnow: like directly in the function bodies that have been copied [19:12] natefinch: sure [19:12] natefinch: oh, and internal tests are making my life miserable [19:13] ericsnow: how are internal tests making your life miserable? [19:13] natefinch: import cycles, and fixing them is a ton of work :( [19:13] ericsnow: ahh boo.... well, import cycles are a valid reason to use external tests [19:14] natefinch: that doesn't help me now though [19:14] ericsnow: sorry :/ [19:15] ericsnow: is this because of existing internal tests? If so, I'd definitely like to hear details at some point, so we can try to avoid this in the future. [19:30] ericsnow, katco: trivial rename in charm metadata: https://github.com/juju/charm/pull/190 [19:31] natefinch: +1 from me! [19:32] natefinch: same here [20:00] ericsnow, katco: I have the s/comment/description patch for juju ready, but it's dependent on my other patch that's up for review, and I can't make rbt show the right diff.... so I'll just leave it until the other one lands. It's another trivial diff anyway. [20:01] natefinch: ok that sounds fine [20:01] katco: so, should I grab the --debug card? or should we point the upgrade-charm card? [20:02] natefinch: grab the --debug card since that's the last user-facing bit [20:06] katco: ok [20:21] natefinch: is this something specific to resources? https://github.com/juju/juju/commit/2910afceacca26dc1d880cedaa5d7560b249ebbe#diff-5c78256455800f538886c2beef98045aR66 [20:21] natefinch: doesn't seem to be in master at any point [20:22] Bug #1543770 opened: Juju 2.0alpha1.1 does not assign a proper IP address for LXC containers [20:22] katco: you mean the arguments struct? [20:22] natefinch: specifically "toMachineSpec" [20:23] katco: it is there on the left side of the diff... that's what juju deploy foo --to 3 gets turned into [20:23] katco: possible that it's been changed in master?
[20:23] natefinch: it has; trying to resolve the diff [20:24] natefinch: i was mistaken, just found it's lineage in master [20:24] *its [20:24] katco: certainly, it doesn't have anything to do with resources.. I was just refactoring the args into a struct. [20:24] natefinch: thx that makes the merge much easier :) [20:24] katco: cool [20:40] is there a function to get the unit number from the unitID? [20:41] like, I can certainly write one, but I'd rather reuse someone else's [20:42] thumper: when you have a chance, could you look at my replies to your comments on http://reviews.vapour.ws/r/3745/ ? === menn0_ is now known as menn0 [21:02] any charm-tools people I could bother with a couple of reviews? [21:02] menn0: ok [21:03] rick_h__, ericsnow, katco: should we support juju resources foo/0 --debug, and only show detailed information about that particular unit's resources? [21:03] (currently the spec only talks about juju resources foo --debug) [21:04] natefinch: yes, I think it should be able to go into per unit as well [21:04] rick_h__: ok. luckily, that's not much more work [21:04] :) [21:04] menn0: I'm tempted to call yagni and just use user tags [21:05] rick_h__: what should we show if the unit hasn't called resource-get for a resource? [21:05] revision "-" or something? [21:05] natefinch: otp atm [21:09] thumper, menn0: If yagni is an option, it's almost always the right choice. "Just in case" just makes your code confusing when it can really only be one thing right now. (IMO) [21:09] thumper: i'm fine with that. The facade can always be bumped later. [21:11] natefinch: I generally agree with that too. In this case the cost difference between either approach is nominal so it wasn't clear which way to go. [21:12] menn0: yeah. I generally prefer more specific types, so they convey the right information.
Seeing an API that takes a names.Tag, but really only one type of tag ever gets passed in, can easily make someone write the wrong code, trying to support a bunch of arguments that they don't really even need to worry about. [21:13] natefinch: yep, agreed. [21:13] natefinch: I'm going with the more specific type. === natefinch is now known as natefinch-afk [21:42] cherylj: ok, i'll take a look now. was distracted by some other stuff this morning [21:56] cherylj: https://github.com/juju/juju/pull/4355 [21:57] looks like somehow that commit got dropped in all the shuffling. anyway, I think that should fix at least the problems you're seeing now. [21:57] (assuming the i18n stuff was the only issue, that's what i saw in my monte carlo sampling :) [22:38] thumper: can you pls take another look at https://github.com/juju/juju/pull/4303 ? [22:39] thumper: also http://reviews.vapour.ws/r/3746/ pls (tiny) [23:29] here is a simple review https://github.com/juju/juju/pull/4357 [23:29] anyone ? [23:31] Bug #1543839 opened: juju/service: test fixture panics if SetUpSuite fails [23:34] Bug #1543839 changed: juju/service: test fixture panics if SetUpSuite fails [23:40] Bug #1543839 opened: juju/service: test fixture panics if SetUpSuite fails
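On the [20:40] question about getting the unit number from a unit ID: the log does not say whether an existing helper was found, so the following is only a minimal sketch of what writing one might look like. It assumes unit names have the form "service/N" (as in "foo/0" above); unitNumber is a hypothetical name for illustration, not a function from juju/names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// unitNumber is a hypothetical helper (not part of juju/names).
// It assumes a unit name of the form "service/N" and returns N.
func unitNumber(unitName string) (int, error) {
	// The number is whatever follows the last slash.
	i := strings.LastIndex(unitName, "/")
	if i <= 0 || i == len(unitName)-1 {
		return 0, fmt.Errorf("invalid unit name %q", unitName)
	}
	n, err := strconv.Atoi(unitName[i+1:])
	if err != nil || n < 0 {
		return 0, fmt.Errorf("invalid unit number in %q", unitName)
	}
	return n, nil
}

func main() {
	for _, name := range []string{"foo/0", "mysql/12", "foo"} {
		n, err := unitNumber(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(name, "->", n)
	}
}
```

Returning an error rather than a sentinel value keeps a malformed name such as "foo" from being mistaken for unit 0.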