[00:24] perrito666: I'm not sure what your message was in reply to ("I believe so too...") [00:24] * axw goes to get breakfast [01:12] alrighty all I am off for the night, see everyone tomorrow [01:40] axw: good point about not showing for only one unit [01:40] I'll look into it [01:41] thumper: thanks. one complication would be when you specify status filtering [01:41] i.e. you might just be showing a subset of units, and there may be more than what you're showing [01:41] yeah... [01:42] also, what about showing leader in yaml / json? [01:42] keep there even if only one? [01:42] thumper: IMO it's fine to have it in there [01:43] ok [01:43] * thumper will think on it [03:12] thumper: seems I led you astray on the openstack storage fix. https://bugs.launchpad.net/juju/+bug/1615095 [03:12] Bug #1615095: storage: volumes not supported [03:12] oh? [03:12] damn [03:12] thumper: you know how we're looking for the volume endpoint in SetConfig? [03:13] thumper: doesn't work, because the client isn't authenticated yet [03:13] ah [03:13] oops [03:13] we don't authenticate to keep Open fast [03:13] thumper: gtg out, will fix it tomorrow [03:13] ack === frankban|afk is now known as frankban [08:13] axw: thanks a lot for the review of my letsencrypt branch [08:38] rogpeppe1: hey, I'm trying to figure out why the api server started to throw errors like this: ERROR juju.worker runner.go:210 exited "apiserver": cannot start api server worker: crypto/tls: private key does not match public key [08:39] rogpeppe1: it seems to happen frequently after the controller machine was restarted, on CI [08:39] dimitern: interesting === rogpeppe1 is now known as rogpeppe [08:39] dimitern: which version of juju are you using? [08:40] rogpeppe: what I found so far seems to indicate the key file was corrupted somehow [08:41] dimitern: wouldn't it be nice if debugging output printed the whole error stack? 
[08:41] rogpeppe: 2.0, on a feature branch master-lp1627037 [08:41] dimitern: so have you seen this issue on master? [08:42] rogpeppe: yes, 2 days ago [08:42] rogpeppe: http://reports.vapour.ws/releases/4429/job/functional-container-networking-maas-2-0/attempt/1093 [08:42] rogpeppe: and the last occurrence is http://reports.vapour.ws/releases/4436/job/functional-container-networking-maas-2-0/attempt/1108 [08:43] rogpeppe: the errors are visible in machine-0.log on the controller, after it has been rebooted - it seems intermittent though [08:44] dimitern: ah, so the reports.vapour.ws log doesn't have that error in [08:44] rogpeppe: it just shows it failed to connect to the apiserver after 10m of trying [08:44] rogpeppe: but the machine-0.log shows the apiserver keeps restarting every 3s with that error [08:45] dimitern: i don't see any occurrence of "does not match" in http://data.vapour.ws/juju-ci/products/version-4429/functional-container-networking-maas-2-0/build-1093/controller/machine-0/machine-0.log.gz [08:46] rogpeppe: sorry, so the one on master shows a different error, but still related: 2016/09/27 13:24:45 http: TLS handshake error from 10.0.30.40:43974: remote error: bad certificate [08:47] or maybe not related ... /me is getting confused :/ [08:47] dimitern: it seems like it might well be related [08:47] rogpeppe: I know some things around tls / certs have changed lately [08:47] dimitern: they have? [08:48] dimitern: have you got a link to a log that contains the "private key does not match" error? [08:48] rogpeppe: might have.. 
not sure - I know you added tests, but that shouldn't have caused such things [08:48] yeah, just a sec [08:49] dimitern: yeah, actually, i did change some stuff, it's true, i'd forgotten that [08:49] rogpeppe: here's the log http://data.vapour.ws/juju-ci/products/version-4436/functional-container-networking-maas-2-0/build-1107/machine-0.log.gz [08:50] dimitern: are all the failures since that commit (42d0c9c07cffbc5075cb05add9e1398056f0d890) ? [08:50] rogpeppe: it's from the feature branch CI run, but the branch itself does not have anything to do with tls or certs - just changes in provider/maas [08:50] rogpeppe: let me check [08:50] dimitern: does that feature branch include commit 42d0c9c07cffbc5075cb05add9e1398056f0d890 ? [08:52] dimitern: does the report (http://reports.vapour.ws/releases/4429/job/functional-container-networking-maas-2-0/attempt/1093) mention the commit id anywhere? i can't find it currently. [08:52] rogpeppe: I can't seem to find that commit on the fbranch or master [08:52] dimitern: i don't know what "revision 4429" means in a git context [08:52] rogpeppe: check the top of the report - has a link to the commit hash tested and it links to github [08:53] dimitern: what's the text of the link? [08:53] rogpeppe: ah, no actually - the [08:53] Jenkins link links to the job which mentions the commit id [08:53] e.g. http://juju-ci.vapour.ws:8080/job/functional-container-networking-maas-2-0/1107/ [08:54] gitbranch:master-lp1627037:github.com/juju/juju bbd844f [08:54] dimitern: i get a 404 from the jenkins link [08:54] rogpeppe: ah, looking at that job's build history I can confirm the commit 42d0c9c [08:54] dimitern: how do you find the build history? [08:54] rogpeppe: failed [08:55] dimitern: failed what? [08:55] rogpeppe: can you open http://juju-ci.vapour.ws:8080/job/functional-container-networking-maas-2-0/ for example?
[08:55] dimitern: nope [08:55] dimitern: 404 [08:55] mgz: ping [08:57] rogpeppe: try logging in and then the link above? [08:57] dimitern: ok, works now, thanks [08:57] dimitern: one might've thought the 404 would contain a login link [08:57] rogpeppe: try to login ;) this 404 is misleading... it really means that u r not authenticated... [08:57] :) [08:57] mgz: unping :-) [08:57] anastasiamac: yeah, i hate that :) [08:57] rogpeppe: yeah, it's not *that* helpful.. [08:58] rogpeppe: \o/ keeping us on our toes [09:01] dimitern: so it looks like http://juju-ci.vapour.ws:8080/job/functional-container-networking-maas-2-0/1093/console doesn't include the commit we're thinking of (42d0c9c) [09:01] dimitern: so perhaps we shouldn't level the blame at that :) [09:03] dimitern: assuming the "REVISION_ID=a5606e7126c0ee5b816b3c52e85f5c77635b5ce3" holds the revision being tested [09:07] rogpeppe: well, 42d0c9c did fail the job on master though... [09:08] rogpeppe: http://juju-ci.vapour.ws:8080/job/functional-container-networking-maas-2-0/1098/ [09:08] dimitern: and that was the first time it failed like that? [09:08] rogpeppe: it failed before, but not like this I think.. checking earlier logs [09:09] rogpeppe: it failed before because the substrate was unclean - no machine matches constraints [09:11] rogpeppe: but interestingly, the very next run on 42d0c9c passed ok.. [09:11] might be just flaky.. 
or misconfigured maas node [09:11] dimitern, frobware: So, this keeps happening: [09:12] MODEL CONTROLLER CLOUD/REGION VERSION [09:12] foo maas maas 2.0-rc2.1 [09:12] APP VERSION STATUS SCALE CHARM STORE REV OS NOTES [09:12] UNIT WORKLOAD AGENT MACHINE PUBLIC-ADDRESS PORTS MESSAGE [09:12] MACHINE STATE DNS INS-ID SERIES AZ [09:12] 0 started 192.168.1.101 /MAAS/api/1.0/nodes/node-67b68b08-1452-11e6-9228-54a050d5d9eb/ xenial default [09:12] 0/lxd/0 started 10.0.0.199 juju-df0cd5-0-lxd-0 xenial [09:12] 1 started 192.168.1.102 /MAAS/api/1.0/nodes/node-7b5b54e0-1452-11e6-9228-54a050d5d9eb/ xenial default [09:12] 1/lxd/0 started 192.168.1.103 juju-df0cd5-1-lxd-0 xenial [09:12] oops, that should have gone to pastebin [09:12] hang on [09:12] yeah [09:12] :) [09:12] https://www.irccloud.com/pastebin/PrlicZuK/ [09:12] I have a smart IRC client, dumb user. [09:12] dimitern: which is the juju report associated with http://juju-ci.vapour.ws:8080/job/functional-container-networking-maas-2-0/1098/ ? [09:12] dooferlad: on the 0/lxd/0 machine can you cat: [09:13] /var/lib/cloud/seed/nocloud-net/network-config [09:13] frobware: on it [09:13] dimitern: i'd like to see the machine-0.log for it [09:14] https://www.irccloud.com/pastebin/5d4y61hL/ [09:14] frobware: ^^ [09:14] rogpeppe: it should say somewhere.. looking [09:15] frobware: it is the same as on the LXC that worked [09:15] rogpeppe: http://reports.vapour.ws/releases/4431 [09:16] dooferlad: MAAS? [09:16] frobware: yes [09:16] frobware: 1.9 [09:17] dooferlad: this look like you need mick's fix [09:18] dimitern: but which link on that page has the actual test run that contains the machine-0.log artifact for that run? [09:18] dooferlad: https://github.com/juju/juju/pull/6276 [09:18] dooferlad: I just ran into this too. [09:18] frobware: it was doing this yesterday too though, before that landed (I think) [09:19] dooferlad: Just checking that it has landed... [09:19] rogpeppe: scroll down [09:19] dooferlad: so it has. 
[09:19] dimitern: i see hundreds of links but no artifacts [09:19] rogpeppe: and search for the job name - on the build number to the right (hovering) you'll see the logs [09:19] dimitern: what job name am i looking for? [09:20] dooferlad: 17 hours ago - is that before or after the CI job? [09:21] rogpeppe: it's frustrating how it overlaps the logs, but if you hover on functional-container-networking-maas-2-0 | Succeeded | >1099< [09:21] frobware: it is showing up in the history, so it must have merged, right? [09:21] dimitern: but that's a success - i thought this was meant to have failed [09:21] dooferlad: agreed. but pull and check [09:21] rogpeppe: the list that appears has 1098 ... but unfortunately http://reports.vapour.ws/releases/4431/job/functional-container-networking-maas-2-0/attempt/1098 does not appear to have the machine-0 log [09:22] frobware: or is this the 'new process' stuff that I have been ignoring [09:22] dooferlad: that was going through my head [09:22] damn :/ hate it when that happens! [09:22] frobware: yes, it is there in my build [09:23] dimitern: so you're not sure whether 42d0c9c failed because of the issue you've described? [09:25] frobware: and it wasn't yesterday when I was running into it the first time [09:25] dimitern: looking at the changes i made in that branch, i'd find it very unlikely that they could cause an extra error case when starting the api server [09:26] dimitern: the changes were all about how changed certificates were handled [09:27] dimitern: i think the issue probably arises because a duff cert/key pair is being passed into the api server [09:28] dimitern: it's difficult to say without being able to reproduce the issue [09:28] dimitern: perhaps add some debugging log statements that might help if the issue happens again [09:28] dimitern: for example if the cert is wrong, log it and the key [09:29] rogpeppe: well, I'll let you know if we repro it again, and thanks for looking into it!
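The debugging idea rogpeppe floats above (check the cert/key pair and log something useful when it is wrong) could be sketched roughly as follows. This is a minimal standalone sketch, not Juju's apiserver code: `newPair` and `checkPair` are hypothetical names, the throwaway self-signed certificates exist only to make the example runnable, and it logs the certificate's SHA-256 fingerprint rather than the private key itself, which is a deliberate deviation from "log it and the key". The only library behaviour relied on is that `tls.X509KeyPair` returns an error when the key does not match the certificate.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"time"
)

// newPair generates a throwaway self-signed certificate and its private key,
// both PEM-encoded. It exists only so the example is self-contained.
func newPair(cn string) (certPEM, keyPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: cn},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

// checkPair reports whether certPEM and keyPEM belong together. On failure
// the error carries the certificate's SHA-256 fingerprint, so a log entry
// identifies which certificate was involved without exposing the key.
func checkPair(certPEM, keyPEM []byte) error {
	_, err := tls.X509KeyPair(certPEM, keyPEM)
	if err == nil {
		return nil
	}
	fp := "unparseable"
	if block, _ := pem.Decode(certPEM); block != nil {
		fp = fmt.Sprintf("%x", sha256.Sum256(block.Bytes))
	}
	return fmt.Errorf("bad cert/key pair (cert sha256 %s): %w", fp, err)
}

func main() {
	certA, keyA, err := newPair("a")
	if err != nil {
		panic(err)
	}
	_, keyB, err := newPair("b")
	if err != nil {
		panic(err)
	}
	fmt.Println("matching pair:", checkPair(certA, keyA))
	fmt.Println("mismatched pair:", checkPair(certA, keyB))
}
```

Running a check like this just before the API server worker starts would at least show which certificate was on disk when the mismatch happened, which is the information missing from the CI logs discussed here.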
[09:29] dimitern: np [09:41] dooferlad: did you draw any conclusion? [09:41] frobware: no. Got stuck in other email. Will take another look in a moment or 10 [09:46] dooferlad: pulling that change on top of what I'm doing fixed that lxd/dhcp/eth0 case for me [09:51] frobware: frustrating that it didn't help me then :-| [09:52] hey, I need a review for https://github.com/juju/juju/pull/6352 anyone available? thanks! [09:52] * dooferlad tries again in case of user error [10:06] jam: i've replied to https://github.com/juju/juju/pull/6345 and made one or two changes. do you think it's good to land? [10:09] voidspace: your PR fails make check on trusty btw [10:10] voidspace: you can see which failed in trusty.out log - GetServerAddrs or something like this [10:10] voidspace: http://juju-ci.vapour.ws:8080/job/github-merge-juju/9362/artifact/artifacts/trusty-out.log [10:14] voidspace: ah, you've updated that I guess it will pass now - sorry for the noise :) [10:16] axw: could you please take a look at https://github.com/juju/juju/pull/6352 when you have time? thanks [10:16] dimitern: I know, already fixed [10:16] dimitern: but thanks [10:23] ok, latest master still has the random not getting the right address problem. No question. Yay. Starting more digging... [10:30] frobware, dimitern: the answer is in /var/log/lxd//lxc.conf -- the container that doesn't end up on the right subnet gets: [10:30] lxc.network.type = veth [10:30] lxc.network.flags = up [10:30] lxc.network.link = lxdbr0 [10:30] lxc.network.hwaddr = 00:16:3e:4c:3d:b3 [10:30] lxc.network.name = eth0 [10:30] The right thing would be more like: [10:30] lxc.network.type = veth [10:30] lxc.network.flags = up [10:30] lxc.network.link = br-eth0 [10:30] lxc.network.hwaddr = 00:16:3e:ca:f3:6c [10:30] lxc.network.mtu = 1500 [10:30] dooferlad: what's in the machine-0.log around PrepareContainerInterfaceInfo API call? 
[10:30] lxc.network.name = eth0 [10:30] lxc.network.type = veth [10:30] lxc.network.flags = up [10:30] lxc.network.link = br-eth1 [10:30] lxc.network.hwaddr = 00:16:3e:40:36:21 [10:30] lxc.network.mtu = 1500 [10:30] lxc.network.name = eth1 [10:30] dooferlad: any errors [10:30] aaaaah stop it:) [10:31] dooferlad: what does: lxc config show show? [10:31] dooferlad: \o/ i second dimitern :D [10:31] bah [10:32] dooferlad: humbug? [10:33] dimitern: logs https://www.irccloud.com/pastebin/4xfeEDhy/ [10:37] dooferlad: I can't see PrepareContainerInterfaceInfo response there? [10:38] dooferlad: it should be a bit earlier in the contoller [10:38] dooferlad: controller's machine log [10:39] dimitern, dooferlad, voidspace, babbageclunk: I concluded my manual testing on https://github.com/juju/juju/pull/6342 [10:40] dimitern, dooferlad, voidspace, babbageclunk: want to land this but looking for final review/approval now [10:42] frobware: http://pasteboard.co/8ThbB7FWl.png [10:45] dimitern: do you know what the set-numa-control-policy setting does, by any chance? [10:45] dimitern: i was mucking around in controller/config.go and came across: // NumaControlPolicyKey stores the value for this setting. [10:46] dimitern: which is possibly the most uninformative doc comment I have ever come across [10:46] the above is a question for anyone else too, BTW. [10:46] dimitern: sorry, my fingers typed 'juju ssh 0', not 'juju ssh -m controller 0' [10:46] * dooferlad curses CLI changes [10:48] rogpeppe: sounds like my thing :D off memory, it was done as a fix for https://bugs.launchpad.net/juju-core/+bug/1350337 [10:48] Bug #1350337: Juju DB should use numactl when running mongo on multi-socket nodes [10:48] this is also hilarious: // DefaultNUMAControlPolicy should not be used by default. [10:48] rogpeppe: it's a setting specific for NUMA machines. r u using NUMA? 
[10:49] anastasiamac: no, but i think that every attribute should be adequately documented [10:49] rogpeppe: mongo needs to have flag setup to run on NUMA. Hence, the setting [10:49] :D [10:49] it is [10:49] hilarious [10:49] dimitern: guestmount -a /var/lib/libvirt/images/maas19-node6.qcow2 -o ro -m /dev/sda1 /mnt [10:49] rogpeppe: agreed :) it was my early days on juju and of course, as a dev, it was obvious to me at the time :) sorry [10:49] anastasiamac: ok, well it should be documented that it's about running mongo with numactl for a start [10:50] rogpeppe: agreed, feel free to clarify as an external force that is now aware :-P [10:50] rogpeppe: especially, since u r in the area :) [10:51] anastasiamac: do you know what a "multi-socket server" is? [10:52] rogpeppe: :D not sure where u r reading "server"... i thought it was about nodes :) [10:52] anastasiamac: BTW shouldn't this be done by some code on the system to test if it's NUMA rather than with a config setting? [10:52] anastasiamac: from the bug report you linked "When running Juju on multi-socket servers I see this in the mongo log:" [10:53] anastasiamac: perhaps it's talking about physical CPU sockets [10:54] rogpeppe: haha ;) no, i do not know... can't really cast my mind that far back: feels like another lifetime \o/ [10:54] anastasiamac: how about this for a doc comment? [10:54] // NUMAControlPolicyKey specifies whether the MongoDB [10:54] // instance on the controller nodes should be run under numactl. [10:54] // This should be set if the controller will run on NUMA hardware.
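For context, the doc comment proposed above could sit on a constant roughly like this. It is only a sketch: the key string is the `set-numa-control-policy` setting name mentioned earlier in the conversation, `mongoCommand` is a hypothetical helper invented for illustration (not Juju's actual code), the port is an arbitrary example value, and `numactl --interleave=all` is assumed here as the usual way to start mongod on NUMA hardware.

```go
package main

import "fmt"

// NUMAControlPolicyKey specifies whether the MongoDB
// instance on the controller nodes should be run under numactl.
// This should be set if the controller will run on NUMA hardware.
const NUMAControlPolicyKey = "set-numa-control-policy"

// mongoCommand is a hypothetical helper: it builds the mongod command line
// and wraps it in numactl when the controller config enables the setting.
func mongoCommand(cfg map[string]interface{}, mongod string, args ...string) []string {
	cmd := append([]string{mongod}, args...)
	if numa, _ := cfg[NUMAControlPolicyKey].(bool); numa {
		// Assumption: interleave memory across all NUMA nodes.
		cmd = append([]string{"numactl", "--interleave=all"}, cmd...)
	}
	return cmd
}

func main() {
	cfg := map[string]interface{}{NUMAControlPolicyKey: true}
	fmt.Println(mongoCommand(cfg, "mongod", "--port", "37017"))
}
```

With the setting absent or false the command is returned unwrapped, which matches the idea that the flag only matters on NUMA machines.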
[10:56] rogpeppe: \o/ sounds perfect :D [11:00] dimitern, frobware: got it: [11:00] 77427b4d-a1e4-4659-8b34-fc17ed10ac2b machine-0: 2016-09-29 10:10:35 DEBUG juju.apiserver request_notifier.go:140 -> [3C2] machine-0 694.756636ms {"request-id":107,"response":"'body redacted'"} Provisioner[""].PrepareContainerInterfaceInfo [11:00] e6b42b7c-5cb7-47e2-8e4d-27040bdc810b machine-0: 2016-09-29 10:10:35 WARNING juju.provisioner lxd-broker.go:62 failed to prepare container "0/lxd/0" network config: creating device interface: ServerError: 400 BAD REQUEST ({"vlan": ["This field is required."]}) [11:00] missing vlan field [11:03] dooferlad: what does the setup look like to get into this state? [11:03] frobware: could you be more specific? [11:04] dooferlad: :) MAAS setup / node config / interface config on the nodes [11:05] frobware: ip addr https://www.irccloud.com/pastebin/mxvcDvoP/ [11:05] frobware: sorry, ignore that [11:06] frobware: this is the right machine https://www.irccloud.com/pastebin/XrgdgkDx/ [11:07] that machine has 3 NICs with eth1 assigned an address [11:07] no VLAN tags [11:07] I really don't want to see if it works if I change the NIC with an address to eth0. It would make me throw things if that was it. [11:09] frobware: I need to go make lunch for older daughter. Back in a few minutes. [11:16] axw: still here? [11:22] a large-but-mechanical change to use consistent spellings for API and NUMA throughout Juju. review appreciated, thanks. https://github.com/juju/juju/pull/6353 [11:34] https://bugs.launchpad.net/juju-core/+bug/1173122/+index?ss=1 [11:34] Bug #1173122: API server should not log passwords [11:34] why log passwords? and why not review the patch sent? (check diff). [11:47] anyone? [11:49] hoenir: sorry, I've no idea why this never got looked at.
It's from over 2 years ago on the juju-core release in maintenance mode because we've focused on juju 2.0 in launchpad.net/juju and since the code's moved to github we've not looked at code reviews in LP for some time [11:50] rick_h_, thanks for making it clear. [11:54] rick_h_, It makes me wonder... why we keep them if they are invalid? Why not close all bugs/PR's on bzr/launchpad and add into the project description smth like "Juju doesn't receive any more commits on launchpad, we moved to github" ? [11:55] hoenir: all I can say is that I've never even looked to be honest. [11:55] hoenir: there's no reason not to [11:56] dooferlad: sorry, was sidetracked. will look in a bit. [11:57] rick_h_: we have not landed the patch (yet!). We ran into an issue with aliases when testing this morning. [11:57] I think we will make our lives better if we will take some time cleaning the project issues/bugs... It's kind of a mess. [11:58] frobware: k, please keep us up to speed on that. [11:58] rick_h_: we should sync (now?) [11:58] frobware: k, meet you in standup [11:58] rick_h_: omw [12:04] hoenir: "project issues/bugs".. are u referring to launchpad? [12:06] anastasiamac, yeah [12:07] dooferlad, babbageclunk, voidspace, macgreagoir: PTAL @ https://github.com/juju/juju/pull/6342 and rubber stamp an approval if you're happy with it. [12:07] anastasiamac, lately the review system is changed or I'm understanding it wrong? We still use http://reviews.vapour.ws or the github one? [12:08] hoenir: we r experimenting with doing reviews on github [12:09] hoenir: ur patch in launchpad, is it possible for u to re-propose against github? [12:09] Hello guys, I was wondering whether this bug fix was included in the openstack oslo messaging charms for Liberty?
https://bugs.launchpad.net/oslo.service/+bug/1524907 [12:09] Bug #1524907: [SRU] Race condition in SIGTERM signal handler [12:11] Andrew_jedi: this is mostly juju-dev channel, u might get more info in #juju - where charmers are \o/ [12:12] anastasiamac: thanks! [12:12] anastasiamac, could you reformulate the last question again? [12:13] anastasiamac, the patch in the launchpad It's not mine. [12:13] hoenir: oh k.. i wonder if we have addressed it in codebase in github already [12:15] dooferlad: are you still reviewing the bridge changes pr? [12:15] Yeah that's why I'm suggesting to clean up a bit the project tracker issue. [12:16] mgz: heads up, might be a couple of min late to our call this morning. Gotten sidetracked this morning [12:17] rick_h_: no problem [12:29] babbageclunk: I was playing around with your tests again, and if I run with '-v' I do see some errors here and there about "http: TLS handshake error from..." I wonder if the mongo driver is assuming ssl? [12:29] and periodically it is trying to poll the replicaset but failing because of TLS stuff, which isn't failing the primary test [12:30] jam: I think we see that occasionally in normal test runs anyway? Maybe. [12:30] babbageclunk: it certainly sounds like something is genuinely wrong and we just haven't been noticing. [12:30] I'll try running with pure trunk and see if I still see it [12:31] jam: Yeah, could be. [12:31] jam: have you got a final opinion on https://github.com/juju/juju/pull/6345 (your review didn't approve or not-approve) ? [12:31] rogpeppe: I haven't seen your response yet, will try and look again [12:32] jam: ta [12:35] mgz: omw [12:35] rick_h_: I await [12:41] rogpeppe: I had some comments, but I think LGTM [12:45] jam: thanks. i responded to "I would guess the minimum is still how long it takes for DNS to update." [12:45] jam: it's independent of DNS updates [12:46] rogpeppe: so its not entirely independent.
if you run 2 controllers at the same IP address, then sure, they both get a signed cert. [12:46] but if you are doing "juju bootstrap" you then need to go update your IP record [12:46] and then it has to propagate [12:46] and *then* you can get a signed cert. [12:46] jam: you do, but that's independent of letsencrypt [12:47] jam: ideally I think we'd provide a way for Juju to automatically update its own DNS record [12:47] jam: and then of course it would take a while for the record in the IP address to propagate [12:48] jam: one other thing, kinda trivial: I've been wanting to standardise on Api vs API spelling for ages (we're inconsistent all over), and this PR does that. Do you think it's a good idea? https://github.com/juju/juju/pull/6353 [12:49] rogpeppe: you didn't address Http while you were at it? [12:49] rogpeppe: jam my one concern would be if things like deployer/etc break due to api changes to them ^ [12:49] I'm +1 on the general concept of preferring API [12:49] jam: good point, that would be good too [12:49] rick_h_: and +1 on that [12:50] rick_h_: i'll double check we're not changing any API calls [12:50] the only thing I worry about with a global search/replace is things that we've exposed as part of a public interface. [12:50] as rick_h_ pointed out as well. [12:52] jam: looks like no API calls are affected [12:54] jam: or by Http vs HTTP either, which I'll do too [12:54] rogpeppe: <3 ty [13:15] babbageclunk: looks like you're right, the TLS stuff is pre-existing. [13:15] I wonder if something is trying to update to an old address that got torn down [13:16] (my quick hack of apiserver.go to drop TLS shows about 10 tests that actually need ssl cause they are testing that we expose SSL endpoints, and 54s vs 58s runtime) [13:16] so another 10%-ish, but quite a bit more actual work to get all of those to actually run. 
[13:33] rogpeppe: looks like you missed a legacy_test.go http://juju-ci.vapour.ws:8080/job/github-merge-juju/9366/artifact/artifacts/trusty-err.log [13:34] Bug #1173122 changed: API server should not log passwords [13:38] jam: darn, i thought i'd run all the tests locally [13:39] jam: ah, it's not surprising - that code has only just landed in master [13:40] * rogpeppe should've rebased before $$merge$$ [13:41] jam: BTW I changed https://github.com/juju/juju/pull/6353 to do all of: API, NUMA, HTTPS, HTTP, FTP and URL. [13:43] rogpeppe: so I'm +1 on theory, but I'd like someone to really go through it carefully to watch out for any actual public API changes. [13:44] jam: so I did a grep for all those names inside apiserver/... and there are no methods that have changed name; and there are no fields in apiserver/params without JSON annotations that have changed name either [13:44] frobware: dimitern ty for getting that in and landed! please enjoy the rest of the sprint! [13:45] rick_h_: thank you ;) I'm glad it landed [13:46] jam: i'd really like it if we had an automatic API compatibility checking tool that checked for any type and name changes. obviously it wouldn't be sufficient, but it would be a good sanity check to have. [13:46] jam: i started doing one when we had "friday labs" but no time now [13:49] oh the good ol' days.. [13:49] :D [13:49] jam: here are all the differences that are inside apiserver/... that aren't in _test.go files: http://paste.ubuntu.com/23251153/ [13:49] dimitern: :) [13:49] Bug #1173122 opened: API server should not log passwords [13:55] Bug #1173122 changed: API server should not log passwords [14:02] rick_h_, voidspace, dooferlad, macgreagoir: ping for standup [14:02] natefinch: omw :-) [14:02] natefinch: thanks [14:03] natefinch: macgreagoir is at sprint :) r u just missing Mick or do u need to update him? 
:D [14:04] natefinch: Aye, sorry, I'm sprinting :-) [14:05] macgreagoir: forgot, nevermind :) [14:16] jam: Sorry, have you already done the no-ssl-api-connection hack? [14:16] babbageclunk: only in one package and a very dirty dirty hack without making all the tests work. [14:17] with failing tests its hard to say if its faster because of the SSL or because of the tests stopping early [14:17] jam: ah, cool - I'm finding it pretty slow going/very whack-a-mole [14:17] I just did "apiserver/" itself. [14:18] I have the feeling that the difficulty you're seeing in implementing is enough of an answer in itself. [14:18] not worth it for the time to whack all the moles [14:18] jam: yeah, I'm thinking that too [14:19] babbageclunk: because to do it *cleanly* you still have to whack those moles [14:39] natefinch, axw: I have two branch up for review when you have time: https://github.com/juju/juju/pull/6352 and https://github.com/juju/juju/pull/6354 thanks! [14:39] macgreagoir, anastasiamac: PTAL https://github.com/juju/juju/pull/6355 - small time-related state improvements [14:40] frankban: will take a look [14:41] natefinch: ty! [14:45] dimitern: lgtm [14:45] natefinch: ping, sorry forgot to stay on [14:45] natefinch: can you meet me back in the standup? [14:46] jam: tell user when operations are blocked \o/ https://github.com/juju/juju/pull/6356 [14:46] rick_h_: sure [14:58] dooferlad: re: https://bugs.launchpad.net/juju/+bug/1623480 did you run frobware's test on azure and that's corrected? [14:58] Bug #1623480: Cannot resolve own hostname in LXD container [14:59] rick_h_: no. [14:59] rick_h_: probably should! [15:00] rick_h_: hey what was the consensus on the new charm URL format. is the change from cs:/- to cs:// intended? [15:00] dooferlad: ty [15:00] katco`: consensus was "watch out! it's a swamp!" [15:00] lol [15:01] rick_h_: but i'm OK to correct unit tests to new format? i'm not just fixing by coincidence? 
[15:01] katco`: I've got a reply on the latest there I need to go process. [15:01] katco`: i'm not sure tbh [15:01] rick_h_: ok, well lmk; no rush [15:01] katco`: the quick test is can you use that url in trunk atm? [15:02] rick_h_: looks like it locates it at least [15:03] katco`: ok, I know not all the url space works so I think it might be in an in-between land [15:03] katco`: so we have to run with what we've got as to get it all working seems like it's beyond the 2.0 scope atm [15:03] rick_h_: although it's cs:// [15:03] katco`: right, a lovely mongrel [15:04] although actually that's good charm/series/rev, wonder if charm/rev works [15:04] * rick_h_ needs to update trunk and tinker [15:04] rick_h_: doesn't look like it [15:05] rick_h_: if you're going to specify a rev, looks like you need a series. which doesn't make any sense with multi-series charms [15:05] katco`: right :/ [15:25] frobware, dimitern: found it https://bugs.launchpad.net/juju/+bug/1628973 [15:25] Bug #1628973: maas provider uses 'primary interface' logic - allocateContainerAddresses1 fails when interface 0 doesn't have a VLAN [15:25] dooferlad: looking [15:27] dooferlad: there is in fact such a thing as primary nic for a device [15:27] dooferlad: it's the only one maas creates along with the device when you pass it a mac address [15:27] dimitern, frobware: that whole function is suspect; looking at a subnet to find what VLAN an interface should use? You can have the same subnet used by two different VLANs - kind of the point. [15:28] dimitern: no, there isn't. [15:28] dimitern: that logic uses 'primary' to mean 'first in the list' [15:28] dooferlad: no, there is ever only one vlan for any subnet in maas [15:28] in fact subnets are contained within a vlan [15:28] dimitern: that is a current MAAS limitation, not a network limitation. 
[15:28] dimitern: BTW that doc comment on NowToTheSecond is misleading - mongo stores time in millisecond resolution, not second resolution [15:29] and I'm talking about the maas vlan db entity, not a VLAN with tag 1234 [15:29] dooferlad: I agree it's maas specific, but that's what the code is handling there [15:30] dooferlad: ok the name might be changed to "first" vs "primary" [15:30] dimitern: I suggest comments. They are useful :-) [15:31] maas db triggers ensure every device has at least 1 NIC, but it doesn't link it to the correct vlan (i.e. uses the "default vlan" for it) [15:32] the problem with that statement is that 'default vlan' could be the absence of a VLAN in terms of the network, right? [15:32] dooferlad: I'm not saying they aren't useful ;) I'm thankful [15:32] dooferlad: it's a vlan used when maas cannot figure out which vlan to use - it's in fact hardcoded in maas src with id=5001 [15:33] horrible, horrible stuff [15:33] yikes! [15:33] do you know what happens if we don't specify the VLAN field in the API request? It seems like that is the right thing to do in this case. 
[15:34] I need to look at the API docs (and hope they are correct) [15:34] I know yes - it fails, as for a physical nic vlan is required [15:34] and all device nics are physical [15:35] that chunk of code tries to satisfy the requirement [15:35] oh my [15:36] I think we need to just use id=5001 when it isn't set then :-( [15:36] you can easily check: maas devices create parent=xyz mac_addresses=aa:bb:cc:dd:ee:f0 ; maas interfaces create-physical [no vlan=] [15:37] might be, but if it happens not to match the vlan used by the host bridge, you'll get issues [15:40] fortunately, figuring out which subnet the host bridge is on (and the vlan of that subnet) is easier now, since we only bridge host interfaces with ip addresses [15:41] it was much harder previously, as the host bridge might be without address (hence subnet) [15:47] dimitern: in allocateContainerAddresses1 that isn't the case. On my hardware br-eth0 and br-eth2 don't have addresses, so we shouldn't be trying to get lxd/0 eth0 or eth2 addresses. That code even uses the horror of if nic.InterfaceName == "eth0". [15:47] it will fail if an interface called eth0 isn't configured to use as a set of defaults. [15:48] dooferlad: are you using tip of master? how come those address-less br-* get created? [15:48] dimitern: yes, using the tip of master [15:49] dimitern: as of this morning that is. [15:50] dooferlad: yeah, well the bridge-only-configured (master-lp1627073 ?) PR landed, so now you shouldn't be seeing such bridges anymore [15:50] dimitern: so your changes will have stopped br-eth0 from turning up, but this code still looks for an eth0 (from MAAS?) [15:50] so it doesn't make any difference [15:50] ...maybe [15:50] dooferlad: eth0 is inside the device [15:50] will it be linked to br-eth0? 
[15:50] dooferlad: it will be linked to the first bridge on the host [15:51] dooferlad: which has an address - might be br-ens5 or br-eth0 (depends on the host node network config) [15:51] so, it will be linked to br-eth1 in this case. OK. [15:51] yeah [15:52] dooferlad: also maas names the first device nic "eth0", fwiw [15:52] dimitern: but first != special [15:52] dimitern: and untagged is a magic API value [15:52] dooferlad: it kinda is, because it's always there [15:52] (5001) [15:53] dooferlad: so we can't create it, just find it and update it (if needed) [15:53] maas auto-creates it using the mac address passed to create device [15:53] so after the message "NIC %v has no subnet - setting to manual and using untagged VLAN", shouldn't we assign the value "5001" to match the API? [15:54] dooferlad: I think that's just dead code after that PR landed :/ [15:54] dooferlad: since it will have a subnet always [15:54] dimitern: I am all for deleting it and replacing it with an error! [15:55] dooferlad: if you're willing to go there, please do - I'll happily review the changes [15:55] (now rc2's been cut off anyway) [15:58] dimitern: I think we need to add a cleanup card to schedule the work. That function is over 100 lines long without any comments and I think has bigger problems than the dead code we just identified. [15:58] p [15:58] dooferlad: sgtm === frankban is now known as frankban|afk [16:32] dimitern: confirmed the problem has gone with your latest changes :-) [16:32] dooferlad: \o/ !! :) [17:01] rick_h_: so, authentication with apikey is going to be more difficult than I thought. It's special for rackspace, not using the openstack standards, and since our rackspace code just reuses all that... 
it would take some work to extract the authentication code into something rackspace can override [17:03] natefinch: yea, that's what I was thinking [17:03] natefinch: so it'd be good to drop any notes into the bug, rename it to "add support for api-key auth to the rackspace provider" and 2.1 it [17:03] rick_h_: yep [17:04] rick_h_: it's unfortunate... I can change the names we call them, but then deep in the goose code, it has hard coded expectations of what the values will be called, and those aren't what rackspace expects. [17:04] natefinch: yea, that's what I expected [17:05] rick_h_: oh well. I can disable access-key type authentication easily enough [17:06] natefinch: ty [17:51] Simple review anyone? +25 -17: https://github.com/juju/juju/pull/6357 [17:53] rick_h_: ^ "fixed" :) [17:53] natefinch: ty [18:00] when connecting to a controller via the API the user-info returns two fields 'controller-access' and 'model-access' - why would these values differ from what is returned via the cli for the same user? [18:01] they are both "" when the cli reports that it should have addModel [18:01] natefinch: so when you add credential there's no option right? [18:01] natefinch: as far as which type to use? [18:02] rick_h_: via the api you can specify for certain clouds [18:02] rick_h_: correct. it just says 'Using auth-type "userpass"' [18:02] $ juju add-credential rackspace [18:02] Enter credential name: bar [18:02] Using auth-type "userpass". [18:02] Enter username: foo [18:02] Enter password: [18:02] Enter tenant-name: a [18:02] Credentials added for cloud rackspace. [18:03] natefinch: <3 ty [18:04] rick_h_: d-(^_^)-b [18:07] ok let me rephrase my question, when logging into the controller should the acl data be returned in the response, or do I need to make another request for the data? [18:08] perrito666: ^ [18:12] https://github.com/juju/juju/pull/6348 could use another look.
warning: scripted renames ahead [18:13] natefinch: katco` can you all swap for review then please? [18:16] yep yep [18:19] yep [18:20] katco`: is this correct? cs:~a-user/trusty/spam-5 is now cs:trusty/spam/~a-user/5 ? [18:21] natefinch: i think i got the bargain in this transaction: lgtm [18:21] katco`: lol I was gonna say.... :) [18:21] natefinch: sigh no that's not correct [18:23] katco`: well, at least that's fairly easy to find with a regex [18:23] yeah [18:23] i have a feeling i probably missed some cases too [18:24] I presume /2 instead of -2 is the new correct way? [18:24] yeah [18:24] read commit message [18:25] ahh yeah, missed that, but figured it from context [18:25] i actually don't know how it's supposed to look for a user url [18:26] I presume ~user/charmname/series/revision ..... anything else would be kinda wacky [18:26] (which, yes, means the old way was wacky ;) [18:26] natefinch: cs:charm/series/rev means is kinda wacky for multiseries charms too [18:28] katco`: very true [18:34] katco`: if I ran the zoo, all charms would be multiseries... it's silly to have different charms when the charm format is cross platform. [18:45] hatch: to be honest, I don't remember [18:48] what was the consensus on squashing commits with reviews? [18:48] katco`: do it, I believe was the consensus [18:49] natefinch: squashed with fix for user urls [18:51] perrito666: heh ok [18:56] perrito666: can you please help hatch out with the gui using ACL bits. There's a possibility of a bug that we're about to head out in rc2 and I want to make sure it's ok [18:57] rick_h_: sure [18:58] thanks all [18:58] you rock [18:58] yes we do [19:02] katco`: I like the way you show the regex you used as if I can somehow tell if it's right or wrong ;) [19:04] natefinch: lol, well it's mainly there in case we need it again or it's screwed something up :) [19:04] natefinch: just good to pair the command used with the commit [19:04] katco`: lgtm.
I have a quibble with the logging, but meh, it's not that important [19:04] natefinch: yeah i am considering your comments [19:05] natefinch: i usually have a rather black and white view on the matter; i.e. it's not the place of a function to determine what happens with its errors. i'm weighing your opinion trying to decide if it's not always so [19:06] natefinch: you are correct in that we only ever log it; the case i consider is the future. it's much easier to change things on the edges, and when the handling is done on the bottom-edge, it couples all callers together with their handling [19:06] well, either senderror has to handle its own errors, or the http handler function has to handle their own errors [19:07] natefinch: right, and i always prefer the top-edge (i.e. http handler) [19:07] pushing decisions down into your tree makes code rigid, but easier to read/use [19:07] I guess I see senderror as just a shortcut instead of copying and pasting that whole thing into every http handler [19:08] natefinch: i believe sometimes it goes through other functions before making its way up to the http handler? [19:09] natefinch: at any rate, i am still mulling over your comments. i'm going to $$merge$$ so that the tests i undoubtedly missed fail and take a late lunch [19:09] heh good idea [19:10] if the merge goes through, like I said, it's mostly a philosophical difference. It's obviously not incorrect as it is. [19:11] katco`: btw, it looks like sendError *used* to get called from lower in helper functions, and I agree that would be bad... but now those helper functions (correctly) just return errors, so sendError is only ever called from the top level http handler functions. [19:12] natefinch: ah [19:12] natefinch: i have been told to wait on lunch in case my wife needs help stuffing our cat into the crate for the vet [19:13] haha, I know how that goes [19:13] one of our cats is like "No!!! DEATH FIRST!"
And the other one is like "hey, what's this, a nice little place to curl up? don't mind if I do" [19:14] lol yep. we have those same 2 cats === frankban|afk is now known as frankban [19:33] redir, you about? === frankban is now known as frankban|afk [19:40] balloons, nope he won't be [19:40] balloons, something I can help you with? [19:40] perrito666, ping [19:41] alexisb, just looking for someone to +1 this so we can merge it ;-) https://github.com/juju/juju/pull/6358 [19:41] I'm on call reviewer [19:41] ahh, I misread.. howdy natefinch [19:42] you've been through the drill before [19:42] balloons: lgtm [19:42] natefinch, beat me to it :) [19:45] alexisb: pong (sorry was having afternoon tea, yes we do that here) [19:46] :) np === mup_ is now known as mup [19:59] balloons: all set? [19:59] redir, yeppers! [19:59] * redir nods [20:09] anyone remember if facades are supposed to start at version 0 or 1? === mup_ is now known as mup [20:39] natefinch: 1 [20:39] natefinch: bc 0 is the default value and is equivalent to forgetting about the version [20:40] Has this been discussed before? During a bootstrap when an 'apt-get' fails juju continues to then try to install some packages and doesn't time out at all. === mup_ is now known as mup [22:00] axw: ? === mup_ is now known as mup [22:09] morning [22:14] heya thumper [22:15] perrito666: awake now, but I'm heading into a meeting shortly === mup_ is now known as mup
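Note on the 20:09 to 20:39 exchange about facade versions: the reasoning given there ("0 is the default value") can be illustrated with a minimal Go sketch. The `facadeRequest` type and `versionSpecified` helper below are hypothetical names invented for illustration and are not Juju's actual API types; the only point is that an unset Go `int` field is indistinguishable from an explicit 0, so starting real versions at 1 keeps "forgot to set it" detectable.

```go
package main

import "fmt"

// facadeRequest is a hypothetical request shape, NOT Juju's real type.
type facadeRequest struct {
	Facade  string
	Version int // Go's zero value: 0 whenever the caller never sets it
}

// versionSpecified reports whether the caller explicitly chose a version,
// under the assumption that real facade versions start at 1.
func versionSpecified(r facadeRequest) bool {
	return r.Version != 0
}

func main() {
	forgot := facadeRequest{Facade: "Example"}               // Version left unset
	explicit := facadeRequest{Facade: "Example", Version: 1} // first real version

	fmt.Println(versionSpecified(forgot))
	fmt.Println(versionSpecified(explicit))
}
```

If versions started at 0 instead, the first request above would look identical to a deliberate request for the first version.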