[00:00] <waigani_> davecheney: is there a particular bug you'd like me to look at next or should I just grab one from the list?
[00:11] <sinzui> wallyworld_, thumper: does any of this look familiar and do you have any advice? http://pastebin.ubuntu.com/7224152/
[00:13] <wallyworld_> sinzui: and juju works fine on hp cloud?
[00:16] <sinzui> wallyworld_, yes
[00:16] <wallyworld_> hmmm
[00:17] <sinzui> wallyworld_, I just confirmed that both configs use the same keys
[00:18] <wallyworld_> sinzui: in the past, where container permissions have been wrong, the container has been created but subsequent reads failed. here we can't even create the container
[00:18] <wallyworld_> it does seem to imply a canonistack swift issue
[00:18] <sinzui> wallyworld_, noted. and in the past creation fails were race conditions. this says auth failure
[00:19] <wallyworld_> sinzui: have you tried the other region?
[00:20] <sinzui> wallyworld_, no
[00:20] <wallyworld_> sometimes that can work
[00:20] <wallyworld_> lyc01 vs lyc02
[00:20] <sinzui> wallyworld_, but I just checked the canonistack dashboard for /both/ accounts. The container view shows an error
[00:21] <wallyworld_> hmmm, ok
[00:21] <wallyworld_> and no joy asking in #is?
[00:22] <sinzui> wallyworld_, they officially defer to canonical support. I opened a ticket there 10 hours ago and no one will talk to me
[00:22] <wallyworld_> :-(
[00:23] <sinzui> I am tempted to send an email notifying canonistack that it will be desupported. Without working accounts, I cannot deliver the next juju to it
[00:24] <wallyworld_> agreed
[00:24] <wallyworld_> we do need an openstack deployment to test against though :-( besides hp cloud
[00:25] <wallyworld_> thumper: this fixes bug 1304132 and also removes the log noise from the critical bug alexis emailed about https://codereview.appspot.com/85770043
[00:25] <_mup_> Bug #1304132: nasty worrying output using local provider <ppc64el> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1304132>
[00:53] <davecheney> arosales: hazmat email sent
[00:53] <davecheney> with ppc segfault information
[00:54]  * arosales looks
[00:55] <hazmat> davecheney, k.. trying fresh on new ppc8.. orange box #2 vanquished
[00:55] <davecheney> hazmat: that would be a good data point, i only have access to wolfe and winton, which are power7
[00:56] <davecheney> hazmat: if you see panics on your power8 host, you should revert to that kernel I specified
[00:56] <hazmat> davecheney, my p7 host has been good wolfe-02..  3.13-08
[00:57] <hazmat> trying on stilton-5
[00:57] <davecheney> hazmat: yes, that is the working kernel
[00:57] <davecheney> it is pre the switch to 64k
[00:57] <davecheney> pages
[00:57] <arosales> dfc: so 3.13.0-08.28 is what we need correct?
[00:57] <arosales> hazmat: are you running -08.28?
[00:57] <davecheney> the .28 isn't the important bit
[00:58] <davecheney> the -08 -18, -23 is
[00:58] <davecheney> i've included a link to the old kernel in the archive
[00:58] <hazmat> one moment.. switching tracks off maas
[01:01] <arosales> dfc, gotcha, avoid -18 and -23
[01:01] <hazmat> davecheney, so my p7 is -> 3.13.0-8-generic
[01:01] <hazmat> davecheney, my p8 is -> 3.13.0-23-generic
[01:06] <davecheney> hazmat: right
[01:12] <hazmat> davecheney, so.. theory being that's an okay version? .. i'm gonna test and find out either way
[01:13] <davecheney> hazmat: i've tested -8, -18 and -23
[01:14] <davecheney> only -8, which was pre the switch from 4k to 64k pages, can run juju stably
[01:14] <davecheney> the other kernels randomly kill juju processes with SEGVs
[01:14] <hazmat> solid
[01:15] <hazmat> davecheney, cool, thanks for tracking that down.. apparently i lucked into having at least one good demo p machine
[01:15] <davecheney> hazmat: yeah me too
[01:15] <davecheney> winton-02 is ooooooooooold
[01:15] <davecheney> so it was running a very old kernel
[01:15] <davecheney> but thumper bodie and timv hit problems
[01:17] <mwhudson> oh man, 64k pages kill the gccgo runtime?
[01:17] <mwhudson> somehow that's easy to believe
[01:18] <davecheney> mwhudson: yup
[01:18] <davecheney> mwhudson: tell me your thoughts
[01:18] <davecheney> its signal related
[01:19] <davecheney> somehow an invalid signal is generated, or created, or just pops into existence
[01:19] <davecheney> the powerpc/kernel/signal_64.c doesn't know how to handle it, so it calls force_sigsegv
[01:19] <davecheney> and the userland thinks it has hit a nil pointer exception and panics
[01:19] <mwhudson> davecheney: well i think malloc.goc has a #define PAGE_BITS 12 in it
[01:19] <mwhudson> o
[01:19] <mwhudson> h
[01:19] <mwhudson> that sounds pretty messed up
[01:19] <davecheney> mwhudson: but why should that matter
[01:20] <davecheney> 12 is < 16
[01:20] <mwhudson> davecheney: dunno
[01:20] <davecheney> but is a multiple
[01:20] <davecheney> all that happens is if you call mmap(0, 4096) you get a 64k allocation
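The point about mmap(0, 4096) turning into a 64k allocation comes down to rounding the requested length up to a whole number of pages. A minimal sketch of that arithmetic, assuming nothing about the gccgo runtime itself (the helper name roundUpToPage is made up for illustration):

```go
package main

import (
	"fmt"
	"os"
)

// roundUpToPage returns the smallest multiple of pageSize that is >= length.
// This is a hypothetical helper; it only illustrates how a mapping request
// is sized in whole pages, not what any particular runtime does.
func roundUpToPage(length, pageSize int) int {
	return (length + pageSize - 1) / pageSize * pageSize
}

func main() {
	// A 4096-byte request on a kernel with 4k pages versus 64k pages.
	fmt.Println(roundUpToPage(4096, 4096))
	fmt.Println(roundUpToPage(4096, 65536))
	// The page size of the machine this runs on.
	fmt.Println(os.Getpagesize())
}
```

So a runtime that assumes 4k pages (PAGE_BITS 12) still gets valid, page-aligned memory; it just receives more than it asked for, which is why the page size alone does not obviously explain the SEGVs.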
[01:20] <davecheney> mwhudson: i'm logging this all in a bug now
[01:20] <davecheney> then i have a juju test to fix
[01:21] <davecheney> then i'll try to create a smaller reproduction case
[01:21] <davecheney> there is additional debugging in that file
[01:21] <davecheney> but it appears to be turned off in this build
[01:21] <davecheney> maybe spinning a new kernel with it enabled is the next step
[01:22] <thumper> wallyworld_: back from the gym now
[01:22] <mwhudson> davecheney: an invalid signal number is generated?
[01:22] <mwhudson> wow
[01:22] <mwhudson> what is the userspace doing when this signal arrives?
[01:22] <davecheney> chilling
[01:22] <wallyworld_> thumper: ok. i have 2 fixes for that critical bug https://codereview.appspot.com/85770043 and https://codereview.appspot.com/85750045
[01:23] <mwhudson> so it's an async signal?
[01:23] <wallyworld_> not sure if more work is needed
[01:23] <davecheney> [18519.444748] jujud[19277]: bad frame in setup_rt_frame:
[01:23] <davecheney> 0000000000000000 nip 0000000000000000 lr 0000000000000000
[01:23] <davecheney> [18519.673632] init: juju-agent-ubuntu-local main process (19220)
[01:23] <davecheney> killed by SEGV signal
[01:23] <davecheney> [18519.673651] init: juju-agent-ubuntu-local main process ended, respawning
[01:23] <thumper> wallyworld_: so what was going wrong?
[01:23] <wallyworld_> thumper: i'll get those landed and will have to either test or ask axw if there's anything else obvious that needs looking at
[01:24] <wallyworld_> thumper: 2 things 1. instance poller noise due to it not ignoring unprovisioned machines
[01:24] <thumper> +1 for that
[01:24] <wallyworld_> 2. bad schema def for storage-port config attr on manual provider causing provisioner startup to fail
[01:24] <wallyworld_> due to json serialisation issue
[01:24] <wallyworld_> float64 vs int and all that
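The "float64 vs int" problem is a general property of Go's encoding/json: a number decoded into an interface{} always comes back as float64, even if it went in as an int. A minimal sketch, assuming a plain map for the config (the toInt helper and the port value 8040 are illustrative, not juju's actual schema code):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toInt coerces a value that may have been through a JSON round trip
// back to an int. Hypothetical helper for illustration only.
func toInt(v interface{}) (int, error) {
	switch n := v.(type) {
	case int:
		return n, nil
	case float64:
		if n != float64(int(n)) {
			return 0, fmt.Errorf("expected an integer, got %v", n)
		}
		return int(n), nil
	}
	return 0, fmt.Errorf("expected a number, got %T", v)
}

func main() {
	in := map[string]interface{}{"storage-port": 8040}
	data, err := json.Marshal(in)
	if err != nil {
		panic(err)
	}
	var out map[string]interface{}
	if err := json.Unmarshal(data, &out); err != nil {
		panic(err)
	}
	// The dynamic type changes across the round trip.
	fmt.Printf("%T -> %T\n", in["storage-port"], out["storage-port"])
	port, err := toInt(out["storage-port"])
	fmt.Println(port, err)
}
```

A schema check that only accepts int would reject the decoded value, which matches the kind of startup failure described above.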
[01:25] <wallyworld_> so those 2 fixes i did just by looking at logs
[01:25] <wallyworld_> i had a look at the code to see if i could relate the fixes to the actual observed issue, but didn't get far enough
[01:26] <wallyworld_> so i figured we could fire up some arm instances and test and/or ask axw  for input when he comes online
[01:26] <axw> I am online
[01:26] <axw> what input do you need?
[01:26] <wallyworld_> \o/
[01:26] <axw> I have LGTM'd your two fixes
[01:26] <wallyworld_> axw: bug 1302205
[01:26] <_mup_> Bug #1302205: manual provisioned systems stuck in pending on arm64 <add-machine> <hs-arm64> <manual-provider> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1302205>
[01:26] <wallyworld_> ok :-)
[01:26] <wallyworld_> i am not sure if my fixes are sufficient
[01:27] <wallyworld_> they are needed, but is there more to be done
[01:27] <hazmat> davecheney, confirmed btw re 23.. panic while doing nothing detected in the log
[01:27] <axw> hrm
[01:27] <wallyworld_> have you seen similar issues when developing the manual provider?
[01:27] <axw> wallyworld_: nope
[01:28] <wallyworld_> or maybe we just need to test with the fixes
[01:28] <wallyworld_> could yet be an arm issue i guess
[01:28] <mwhudson> davecheney: can you run my test program from https://sourceware.org/bugzilla/show_bug.cgi?id=16629 ?
[01:28] <axw> looking at the logs now...
[01:28] <davecheney> hazmat: i've even seen /usr/bin/go panic while running tests
[01:28] <davecheney> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1304754
[01:28] <_mup_> Bug #1304754: gccgo compiled binaries are killed by SEGV on 64k ppc64el kernels <linux (Ubuntu):New> <https://launchpad.net/bugs/1304754>
[01:28] <wallyworld_> certainly that storage-port issue is pretty fatal
[01:28] <davecheney> arosales: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1304754
[01:28] <_mup_> Bug #1304754: gccgo compiled binaries are killed by SEGV on 64k ppc64el kernels <linux (Ubuntu):New> <https://launchpad.net/bugs/1304754>
[01:28] <axw> wallyworld_: ah, the "close of closed channel" is something rogpeppe brought up last night
[01:29] <axw> there's a bug in cmd/jujud
[01:29] <axw> not sure if he fixed it yet...
[01:29] <mwhudson> (although i can't see why 64k pages would matter here)
[01:29] <wallyworld_> axw: is that in the machine 0 log attached to the bug?
[01:29] <axw> wallyworld_: yeah
[01:29] <wallyworld_> let me look
[01:29] <axw> wallyworld_: https://codereview.appspot.com/85450044/
[01:29] <axw> wallyworld_: I broke the machine agent when I allowed upgrade steps to get a state connection
[01:30] <mwhudson> davecheney: also, it would sure be nice to follow the execution of handle_rt_signal64 with gdb
[01:30] <wallyworld_> axw: does that mp fix the close channel issue?
[01:31] <davecheney> mwhudson: way above my pay grade
[01:31] <davecheney> i'm not even qualified for pointer arithmetic
[01:31] <axw> wallyworld_: yeah
[01:31] <arosales> davecheney: looks like the latest in the archive is -23
[01:31] <davecheney> yup
[01:31] <wallyworld_> axw: great. so i'll land my branches and we can re-test i guess
[01:32] <axw> wallyworld_: sgtm
[01:32] <axw> wallyworld_: it would be nice to silence "cannot get instance info for instance "manual:10.0.128.7": no instances found" too, but it's not critical
[01:33] <wallyworld_> axw: i haven't looked into that one yet - what's the cause?
[01:33] <mwhudson> i wonder if there is an arm64 kernel with 64k pages i can try with
[01:33] <axw> wallyworld_: manually provisioned machines are not managed by the provider - they just should not be polled
[01:33] <davecheney> mwhudson: that would be a good test
[01:34] <davecheney> i tried to test using gccgo/amd64
[01:34] <wallyworld_> axw: in that case i'll add some code to my first branch
[01:34] <davecheney> but lxc was all fucked on amd64 yesterday
[01:34] <wallyworld_> do both fixes in one go
[01:34] <axw> wallyworld_: there's a state.Machine.IsManual method that'll help there
[01:34] <mwhudson> i don't know anything about legacy architectures like amd64
[01:38] <arosales> davecheney: do you have a link handy to the initrd matching the -28 .deb you pointed at?
[01:41] <hazmat> arosales, so i removed the other kernels .. sudo upgrade-grub.. currently doing shutdown -r now .. to see if it worked ;-)
[01:41] <hazmat> arosales_, removed via pkgs that is
[01:42] <jcastro> where's this -28 kernel at, I don't see it in proposed?
[01:42] <hazmat> jcastro, its on the machines that barf.. ls /boot
[01:42] <thumper> jcastro: o/
[01:42] <thumper> jcastro: I have a version of debug-log on my machine that works with the local provider
[01:42] <hazmat> arosales, don't do what i just suggested.. it doesn't like that ;-)
[01:42] <jcastro> thumper, hey! we made a plugin, heh
[01:43] <thumper> yeah, but it doesn't do filtering
[01:43]  * thumper guesses
[01:43] <jcastro> oooh
[01:44] <hazmat> or replay or exclude/include by unit/machine or channel
[01:44] <hazmat> jcastro, we have parity..
[01:45]  * hazmat sheds a tear
[01:45] <hazmat> for debug-log
[01:45] <jcastro> heh
[01:45] <jcastro> thumper, also, one thing we should talk about
[01:45] <davecheney> arosales: not -28
[01:45] <jcastro> is the debug-hooks <-> retry --resolved thing makes me cry
[01:45] <arosales> hazmat, ack :-)
[01:45] <davecheney> you want uname -a
[01:45] <davecheney> Linux winton-02 3.13.0-8-generic #28-Ubuntu SMP Mon Feb 17 08:22:39 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
[01:45] <jcastro> oh _8_
[01:45] <davecheney> wget https://launchpad.net/ubuntu/+source/linux/3.13.0-8.28/+build/5602341/+files/linux-image-3.13.0-8-generic_3.13.0-8.28_ppc64el.deb
[01:45] <jcastro> not 28
[01:46] <arosales> davecheney, so the _only_ fix is to revert to -08?
[01:46] <davecheney> jcastro: i'll forward you the email
[01:46] <jcastro> davecheney, do I need an initrd for that?
[01:46] <hazmat> arosales, atm yes
[01:46] <davecheney> arosales: the only workaround I have at this time
[01:46] <davecheney> jcastro: no
[01:46] <jcastro> davecheney, thanks!
[01:46] <thumper> jcastro: I don't understand
[01:46] <jcastro> so I do debug-hooks
[01:47] <jcastro> and in order to be able to fire off a hook to debug it
[01:47] <thumper> also, know this
[01:47] <thumper> I can only make one person happy at a time
[01:47] <jcastro> I need to open a new terminal and do resolved --retry
[01:47] <thumper> it isn't your turn
[01:47] <thumper> it is hazmat's
[01:47] <jcastro> <3
[01:47] <jcastro> local log will keep me happy. :D
[01:48] <thumper> yeah, I'm open to looking to fix it...
[01:48] <arosales> davecheney, I am confused in your email you state, "workarounds: you should install this kernel
[01:48] <arosales> wget https://launchpad.net/ubuntu/+source/linux/3.13.0-8.28/+build/5602341/+files/linux-image-3.13.0-8-generic_3.13.0-8.28_ppc64el.deb"
[01:49] <arosales> davecheney, ah I should have said -8.28 not -28
[01:49] <arosales> davecheney, gotcha
[01:49] <arosales> which is a revert
[01:49] <hazmat> thumper, and all of cts :-)
[01:49] <arosales> sorry long day
[01:49]  * arosales better just grab some dinner
[01:50] <davecheney> arosales: yup, we're also lucky that -28.8 isn't a think
[01:51] <davecheney> both those numbers appear to be increasing
[01:51] <davecheney> s/think/thing
[02:08] <davecheney>     c.Assert(err, gc.IsNil)
[02:08] <davecheney> ... value *mgo.QueryError = &mgo.QueryError{Code:16149, Message:"exception: cannot run map reduce without the js engine", Assertion:false} ("exception: cannot run map reduce without the js engine")
[02:08] <davecheney> store tests are failing again
[02:09] <davecheney> i thought that the store tests wouldn't run unless we passed a flag ?
[02:09] <davecheney> cmars: didn't you fix this ?
[02:09] <cmars> davecheney, thought so, yes. is this trunk or 1.18?
[02:09] <davecheney> cmars: trunk
[02:09] <cmars> hmm
[02:10] <cmars> davecheney, which test is it? is there a file & line #?
[02:10] <davecheney> cmars: please hold
[02:11] <davecheney>  go test launchpad.net/juju-core/store 2>&1 | pastebinit
[02:11] <davecheney> Failed to contact the server: [Errno socket error] [Errno socket error] timed out
[02:11] <davecheney> oh for fucks sake
[02:11] <davecheney> does nothing work today ?
[02:13] <davecheney> thumper: what is the env var to lower logging ?
[02:13] <davecheney> JUJU_LOG= ?
[02:14] <thumper> here's mine: JUJU_LOGGING_CONFIG=<root>=INFO; juju.container=TRACE; juju.provisioner=TRACE
[02:14] <thumper> that is read by bootstrap
[02:14] <davecheney> ta
[02:14] <davecheney> hmm, that isn't it
[02:14] <davecheney> rog had a different one
[02:14] <davecheney> a flag to testing
[02:15] <davecheney> -juju.log WARNING
[02:17] <davecheney> cmars: http://paste.ubuntu.com/7224466/
[02:17] <cmars> ok, thanks. looking
[02:19] <davecheney> ta
[02:22] <cmars> davecheney, i thought it had landed, but it hasn't
[02:22] <cmars> https://code.launchpad.net/~cmars/juju-core/cs-mongo-tests/+merge/213563
[02:23] <cmars> i think we can land it, if CI will support running the store tests with full mongodb tests
[02:25] <cmars> davecheney, what do you think? will you take that as an action item?
[02:25] <davecheney> cmars: no
[02:25] <cmars> ok :)
[02:25] <davecheney> i cannot take that as an action item
[02:26] <cmars> i'll follow up w/curtis tmw then
[02:26] <davecheney> cool
[02:26] <cmars> cheers
[03:19] <wallyworld_> arosales: do you have any doc or otherwise that tells me what i need to do to get access to some arm vms to test a fix for that bug
[03:27] <dannf> wallyworld_: i can help w/ that
[03:27] <wallyworld_> \o/
[03:27] <dannf> wallyworld_: do you have an account on batuan?
[03:27] <dannf> that's the gateway into our network - if you don't, you can ask for one in #is
[03:27] <wallyworld_> yes, since i have logged onto power vms previously
[03:27] <dannf> sweet
[03:28] <wallyworld_> i think you know it's bug 1302205
[03:28] <_mup_> Bug #1302205: manual provisioned systems stuck in pending on arm64 <add-machine> <hs-arm64> <manual-provider> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1302205>
[03:28] <dannf> right
[03:28] <wallyworld_> i don't think it's specifically an arm issue
[03:28] <wallyworld_> but want to test on arm anyway
[03:28] <dannf> yep - just a sec... wonder if someone just took our host down
[03:28] <wallyworld_> there have been 3 branches which landed today or last night which should hopefully fix it
[03:34] <dannf> wallyworld_: yeah, looks like its in use debugging an unrelated issue. i'll send you an e-mail with access info and ping you (or have someone ping you) when its ready
[03:34] <wallyworld_> great thanks :-)
[03:36] <thumper> davecheney: how can I make sure that the bufio.Scanner doesn't consume too much?
[03:36] <thumper> davecheney: I have an io.ReadCloser
[03:36] <thumper> davecheney: and I want to read up to the first new line, and no more
[03:36] <thumper> the Scanner when I call scan reads 4k
[03:36] <thumper> which consumes way more than I want
[03:45] <dannf> wallyworld_: e-mail sent - system isn't quite ready yet, stay tuned..
[03:45] <wallyworld_> ok
[03:51] <arosales> dannf, thanks re wallyworld arm access question :-)
[03:53] <dannf> np, very happy to have help w/ it :) two days and i've just managed to learn how to kinda read go :/
[04:06] <dannf> wallyworld_: ok, have at it
[04:06] <wallyworld_> dannf: awesome, firing up ssh now
[04:07] <wallyworld_> dannf: "ssh 10.229.41.200" should work right?
[04:07] <dannf> it should, but its not working for me either - lemme ask
[04:40] <thumper> is that because the ssh config can't work out where to proxy through?
[04:42] <jcw4> thumper: fyi I filed a merge request for the changes you suggested yesterday about isolating the git tests
[04:42] <thumper> jcw4: awesome, I saw that in my inbox
[04:42] <thumper> jcw4: I'll take a look once I submit this change :-)
[04:43] <jcw4> thumper: thanks; no rush... just excited about contributing ;)
[04:46] <thumper> wallyworld_: fyi instance type constraint branch has conflicts
[04:46] <wallyworld_> yes it does, fixed locally
[04:46] <wallyworld_> still wip
[04:47] <thumper> jcw4: did you push your changes?
[04:47] <jcw4> yes; to a new branch
[04:47] <jcw4> the last one was too messy
[04:47] <thumper> :-)
[04:48] <thumper> jcw4: ok, there is a resubmit option on the RHS of the merge proposal page, that includes a "start over"
[04:48] <thumper> which would have marked the old as superseded
[04:48] <thumper> but that's OK, I'll just reject the old one.
[04:48] <jcw4> I see. Thanks
[04:50] <thumper> jcw4: what happens when you change the LC_ALL to "C" ?
[04:50] <jcw4> I was planning on doing that after running all the tests without it.
[04:50] <jcw4> after they all passed I was too excited and forgot
[04:50] <jcw4> testing now
[04:52] <jcw4> thumper: worker/uniter/charm/... tests passed
[04:52] <jcw4> I'll push that change too?
[04:52] <thumper> move that patch env into the base git test suite
[04:52] <thumper> with the other env patches
[04:53] <thumper> jcw4: then you can delete the SetUpTest for GitDirSuite
[04:53] <jcw4> cool, right
[04:53] <thumper> as it won't be doing anything
[04:53] <thumper> then yes, push that
[05:03] <jcw4> thumper: the LoggingSuite TearDownTest(c) needs to be called in the GitSuite TearDownTest?  I'd add that back in if necessary
[05:03] <thumper> jcw4: if the only line of the tear down is to upcall the tear down, then you can just delete it
[05:03] <jcw4> thumper: okay; that's all there is.  tx
[05:15]  * thumper EODs
[05:20] <axw> wallyworld_: I've responded to your comments, but I'm now looking at HA
[05:20] <wallyworld_> np
[05:21] <wallyworld_> just wanted to get some thoughts down
[05:21] <wallyworld_> i'm stuck on other things also
[05:33] <yaguang> hi all,  I am using 1.16.6 stable juju-core to bootstrap an Openstack Havana cloud, but fails with can't find index.json
[05:34] <yaguang> it seems that juju is trying to find the meta file in the path  /streams/  but  swift has  tools/streams/
[05:41] <dimitern> morning all
[05:57] <dimitern> fwereade, can you take a look at this please ? https://codereview.appspot.com/85220044/
[06:51] <bigjools> dimitern: howdy!  How's the vlan work coming along?
[06:51] <dimitern> hey bigjools
[06:52] <dimitern> bigjools, i'm in the final steps - cloudinit scripts that bring up network interfaces
[06:53] <dimitern> bigjools, vladk is working on a few extensions to gomaasapi to allow us to unit test the new api calls
[06:53] <dimitern> bigjools, capabilities; lshw dump of a node; networks?op=list_connected_macs
[06:54] <bigjools> dimitern: nice, all going to make it for the release?  And any issues with maas I need to know about?
[06:54] <dimitern> bigjools, but all these were live tested on my local maas using daily builds ppa
[06:55] <dimitern> bigjools, we're aiming for feature completeness by friday, but should be ready before that
[06:55] <bigjools> excellent
[06:56] <dimitern> bigjools, bug 1303617 hit me after a recent upgrade and i can no longer use the fast installer (fails at boot and doesn't recover), which is slow and tedious
[06:56] <_mup_> Bug #1303617: pc-grub install path broken in curtin <landscape> <curtin:Fix Released by smoser> <curtin (Ubuntu):Fix Released> <https://launchpad.net/bugs/1303617>
[06:56] <bigjools> dimitern: weird, I  did a fast install today and it was fine
[06:56] <dimitern> hmm Fix Released - i'll try it now
[06:57] <dimitern> bigjools, we have a few wishlist items for the maas api
[06:58] <dimitern> bigjools, like the ability to see networks + connected macs in one place (either in GET node/system_id or in GET networks/(all))
[06:59] <bigjools> dimitern: please file bugs
[06:59] <dimitern> bigjools, will do
[06:59] <bigjools> I will triage them as wishlist and we'll put them on the stack
[06:59] <dimitern> bigjools, otherwise now we need to do several api calls at startinstance time to get all we need
[06:59] <bigjools> ok we can optimise that
[07:10] <jam> cmars: I'm not sure why your test failed, but it would seem that we could tell the landing bot to always run the mongojs tests
[07:10] <jam> cmars: though because of that, I'd actually rather have the CI tests disable it, rather than disabled by default.
[07:11] <jam> experience has shown that ENV vars play nicer with go test than flags, because flags are only valid per package, and "go test ./..." tries to pass all flags to all packages.
[07:11] <dimitern> bigjools, filed bug 1304857
[07:11] <_mup_> Bug #1304857: API should report networks and connected macs in the response of a single node <api> <MAAS:New> <https://launchpad.net/bugs/1304857>
[07:12] <dimitern> jam, fancy a review ? https://codereview.appspot.com/85220044/
[07:15] <dimitern> bigjools, another question re gomaasapi - how do you feel about adding high level wrappers around common APIs? like having AcquireNode method that the provider calls, rather than constructing a URL internally? Other similar examples are ListNetworks, ListNodes, etc.?
[07:20] <bigjools> dimitern: I can't remember much about gomaasapi
[07:22] <dimitern> bigjools, :) well, that was just a thought
[07:22] <bigjools> dimitern: one of the other guys will have an opinion I'm sure
[07:23] <dimitern> bigjools, we'll ask them for reviews when changes are proposed
[07:30] <fwereade> dimitern, I'm getting progressively more nervous about NetworkName vs making it clear that it's a provider-specific id like instance.Id
[07:31] <dimitern> fwereade, can't we say yes, it's provider specific, but it's also used by juju to identify the network internally?
[07:31] <fwereade> dimitern, it will be, indeed
[07:31] <fwereade> dimitern, but we're going to want network names as well
[07:31] <dimitern> fwereade, ok, how are we going to make it clearer?
[07:32] <fwereade> dimitern, when openstack gives us network abcdef638746328756865198, and we call that NetworkName, what field will we use for the "private" name users will want to use
[07:32] <fwereade> dimitern, or "my_network" or whatever
[07:33] <dimitern> fwereade, openstack has labels for networks just the same
[07:33] <fwereade> dimitern, and so does every provider ever?
[07:33] <fwereade> dimitern, that's quite the prediction ;p
[07:33] <dimitern> fwereade, i can't say that :P
[07:34] <dimitern> fwereade, so tell me how to alleviate your nervousness about it? :)
[07:34] <fwereade> dimitern, call it NetworkId :)
[07:35] <fwereade> dimitern, you know -- we have machine ids, and instance ids, and they are not the same
[07:35] <fwereade> dimitern, (and machine ids are machine names really, but hysterical raisins)
[07:36] <dimitern> fwereade, so, basically change it everywhere from NetworkName to NetworkId ?
[07:36] <dimitern> fwereade, I need a follow-up for that
[07:36] <fwereade> dimitern, I'm more concerned about the API
[07:37] <dimitern> fwereade, you're thinking about network tags?
[07:37] <fwereade> dimitern, and that the terminology that's hard to change be consistent with what we expect to do
[07:37] <fwereade> dimitern, well, that was my first thought
[07:37] <fwereade> dimitern, but then I realised that converting these names into tags would be completely wrong
[07:37] <rogpeppe> mornin' all
[07:37] <dimitern> morning rogpeppe
[07:37] <fwereade> dimitern, because they're provider vocabulary, not juju vocabulary
[07:37] <rogpeppe> dimitern: hiya
[07:38] <dimitern> fwereade, ok, so then what?
[07:38] <dimitern> fwereade, i'm trying to follow but can't see what's needed
[07:38] <rogpeppe> axw: ping
[07:38] <fwereade> dimitern, although -- wait, don't you use tags in the client api? I think we should...
[07:38] <axw> rogpeppe: pong
[07:39] <dimitern> fwereade, we use tags everywhere in the api
[07:39] <dimitern> fwereade, but not for networks
[07:39] <rogpeppe> axw: about removing JobManageEnviron:
[07:39]  * fwereade grumbles
[07:39] <rogpeppe> axw: the reason we don't want to remove JobManageEnviron from a voting state server is that when a machine hasn't got JobManageEnviron we allow it to be removed
[07:40] <fwereade> dimitern, we don't identify machines in the client API by provider-specific instance id, and we shouldn't identify networks that way either
[07:40] <rogpeppe> axw: and if that happens we could break the invariant that we only ever have an odd number of voting state servers
[07:40] <rogpeppe> axw: or rather, and odd number of state servers that *want* to vote
[07:40] <dimitern> fwereade, i agree, but the only way we can deal with networks so far is if we get them from the provider
[07:40] <dimitern> fwereade, at provisioning time
[07:40] <rogpeppe> s/and odd/an odd/
[07:41] <fwereade> dimitern, we *can* impose a requirement that network names match provider ids exactly, this is MVP after all
[07:41] <dimitern> fwereade, i guess you're suggesting to require the user to add any networks to juju before being able to deploy with them
[07:42] <dimitern> fwereade, and i can see how this is the way we wanna go eventually, but not for nwo
[07:42] <dimitern> now
[07:42] <fwereade> dimitern, well, mid-term, yes -- I'd expect --networks params to be validated
[07:42] <axw> rogpeppe: okay
[07:42] <axw> rogpeppe: lots to take in here, still figuring out how all the voting bits work.
[07:43] <fwereade> dimitern, short-term, I want us to be clear on the distinction between juju vocabulary over the client API (tags) and provider vocabulary over the internal API (network ids)
[07:43] <dimitern> fwereade, so let's make a plan - i land this last CL and make another one for s/NetworkName/NetworkId/ throughout, and then do  the cloudinit stuff
[07:43] <rogpeppe> axw: thanks for taking a look. feel free to ask about whatever doesn't seem to make sense.
[07:43] <fwereade> dimitern, internally I'm fine saying that network name == network id (for mvp at least)
[07:43] <fwereade> dimitern, am I helping?
[07:43] <dimitern> fwereade, how can we be clear about this? in the docs? where?
[07:43] <rogpeppe> fwereade: it would be nice if network ids were distinguishable from machine ids and unit ids (which are both currently distinguishable from each other)
[07:44] <rogpeppe> fwereade: and service names, of course
[07:44] <dimitern> fwereade, yeah, that is how it's gonna be for now - we call it networkId, but we mean maas-specific name
[07:44] <fwereade> rogpeppe, not really gonna happen, that's why we have tags
[07:44] <rogpeppe> fwereade: yeah, fair enough
[07:44] <fwereade> rogpeppe, although *probably* units/machines will be safe, but services won't ;p
[07:45] <rogpeppe> fwereade: yeah
[07:45] <fwereade> dimitern, so, in the Setprovisioned bits: it's a provider-specific network id, not a tag
[07:46] <fwereade> dimitern, in the Client-facing IncludeNetworks/ExcludeNetworks bits, we should be using tags
[07:46] <dimitern> fwereade, yes, you mean better doc comment
[07:46] <fwereade> dimitern, internally we can just strip off the "network-" prefix and keep going mapping 1:1 with provider-specific network ids
[07:46] <dimitern> fwereade, so juju deploy --networks=net1,net2 which goes over the API as network-net1, network-net2
[07:47] <dimitern> fwereade, and for include/excludeNetworks in state we still use the ids, not tags as usual
[07:48] <fwereade> dimitern, yeah, exactly -- and for now the stripped names have to map to internal provider ids, but we keep them distinct so it doesn't become confusing when we have to change over later
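The tag scheme agreed here (names go over the client API as "network-net1", and the prefix is stripped internally to get the provider-specific id) can be sketched as a pair of helpers. The function names networkTag and parseNetworkTag are hypothetical and not taken from juju-core:

```go
package main

import (
	"fmt"
	"strings"
)

const networkTagPrefix = "network-"

// networkTag builds the client-facing tag for a network name.
func networkTag(name string) string {
	return networkTagPrefix + name
}

// parseNetworkTag strips the prefix and returns the bare name, which
// for now maps 1:1 onto the provider-specific network id.
func parseNetworkTag(tag string) (string, error) {
	if !strings.HasPrefix(tag, networkTagPrefix) || len(tag) == len(networkTagPrefix) {
		return "", fmt.Errorf("%q is not a valid network tag", tag)
	}
	return strings.TrimPrefix(tag, networkTagPrefix), nil
}

func main() {
	// What "juju deploy --networks=net1,net2" would send and recover.
	for _, name := range []string{"net1", "net2"} {
		tag := networkTag(name)
		id, err := parseNetworkTag(tag)
		fmt.Println(tag, id, err)
	}
}
```

Keeping the conversion in one place means the 1:1 mapping can later be replaced by a real lookup without touching the API vocabulary.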
[07:48] <dimitern> fwereade, ok, got it
[07:48] <fwereade> dimitern, inside state you can even stick with a single field in the document doing both duties... but be very clear that the _id field is for the *juju* name, not the provider name
[07:49] <dimitern> fwereade, better comments, ok
[07:49] <fwereade> dimitern, brilliant, thanks
[07:49] <dimitern> fwereade, i'll try to remember all that :) will propose it some time later today
[07:49] <fwereade> dimitern, I'm going through the CL now in case you hadn't realised ;p -- some more naming quibbles but otherwise looking sound I think
[07:50] <dimitern> fwereade, great!
[07:52] <fwereade> dimitern, reviewed
[07:53] <fwereade> rogpeppe, btw, do you think you might have a spare cycle to look at https://bugs.launchpad.net/bugs/1303735 today? it looks a bit like something you might know about
[07:53] <_mup_> Bug #1303735: private-address change to internal bridge post juju-upgrade <openstack-provider> <juju-core:Triaged> <juju-core 1.18:Triaged> <https://launchpad.net/bugs/1303735>
[07:53] <rogpeppe> fwereade: looking
[07:53] <fwereade> axw, did you see https://bugs.launchpad.net/bugs/1303583 ?
[07:53] <_mup_> Bug #1303583: provider/azure: new test failure <gccgo> <juju-core:Triaged> <https://launchpad.net/bugs/1303583>
[07:54] <axw> fwereade: I have, but haven't had time to look into it yet
[07:54] <fwereade> axw, np, just wanted to make sure it was on your radar
[07:54] <dimitern> fwereade, ta
[07:58] <rogpeppe> fwereade: the issue is quite obscure to me - i can't see the exact problem that's being reported there
[07:59] <fwereade> rogpeppe, AIUI it's a change in behaviour -- jamespage will be able to make it clear I think?
[08:00] <rogpeppe> fwereade: right. it would be nice to know what's the expected behaviour there and how the reported logs differ
[08:00] <jamespage> rogpeppe, I upgrade nova-compute nodes (which had the correct private-address) and the private-address switches to be the ip address of the internal bridge virbr0
[08:00] <rogpeppe> jamespage: where can i see the result of that in the status? (or the logs?)
[08:00] <jamespage> rogpeppe, in the bug report
[08:00] <rogpeppe> jamespage: yeah, i was looking at the bug report
[08:01] <jamespage> rogpeppe, the dns-name of all the nodes are the same
[08:01] <jamespage> #err
[08:01] <rogpeppe> jamespage: the status doesn't seem to show private addresses
[08:01] <jamespage> rogpeppe, OK - public-address then
[08:01] <rogpeppe> jamespage: ah, dns-name, sorry
[08:01] <jamespage> rogpeppe, whatever happened it was wrong
[08:01] <rogpeppe> jamespage: right, the public address. that really confused me.
[08:01] <jamespage> rogpeppe, I'm not sure about the private-address tbh
[08:02] <jamespage> rogpeppe, title changed
[08:02] <rogpeppe> jamespage: thanks
[08:16] <dimitern> fwereade, how about s/SetProvisionedWithNetworks/ProvisionInstance/ ?
[08:18] <rogpeppe> jamespage: can you find out what addresses nova returns for the instance ids?
[08:18] <jamespage> rogpeppe, not right now
[08:18] <rogpeppe> jamespage: ok
[08:18] <jamespage> but I can look again later
[08:19] <rogpeppe> jamespage: i'm suspecting that nova is returning the libvirt bridge address as one of the addresses for an instance, and our logic happens to be picking it out
[08:19] <jamespage> rogpeppe, hmm
[08:20] <jamespage> rogpeppe, nova has no knowledge of that afaik
[08:20] <jamespage> as in there is no agent in the instance that would let it know
[08:20] <rogpeppe> jamespage: hmm
[08:25] <rogpeppe> jamespage: ah, i see where it comes from
[08:31] <rogpeppe> axw: i think this issue (#1303735) is to do with worker/machiner - setMachineAddresses is setting the libvirt bridge address without marking it as NetworkMachineLocal
[08:31] <_mup_> Bug #1303735: public-address change to internal bridge post juju-upgrade <openstack-provider> <juju-core:Triaged> <juju-core 1.18:Triaged> <https://launchpad.net/bugs/1303735>
[08:31] <rogpeppe> jamespage: so, i know what the issue is, but i'm not yet sure of the right way to fix it
[08:37] <axw> rogpeppe: that would suggest that the openstack provider doesn't have any cloud-local addresses
[08:37] <axw> is that expected?
[08:38] <axw> rogpeppe: and you're right of course, it's not setting them to local - how would it know to do that?
[08:38] <rogpeppe> axw: yeah
[08:39] <rogpeppe> axw: i don't think it means the provider doesn't have any cloud-local addresses, as we're looking for public addresses here
[08:39] <axw> rogpeppe: sorry, misread the bug
[08:39] <axw> rogpeppe: I thought it was private
[08:39] <rogpeppe> axw: it seems like state.mergedAddresses doesn't preserve ordering, which is perhaps a pity
[08:40] <rogpeppe> jamespage: it would still be useful to see what addresses nova is returning for the instances
[08:42] <rogpeppe> axw: i'm thinking that it might be possible for a machine to know which interfaces are private, but it might be quite os-specific
[08:42] <axw> rogpeppe: ISTM that the best thing we could do is to prefer cloud-local over unknown
[08:43] <axw> rogpeppe: indeed
[08:43] <rogpeppe> axw: when asking for a public address?
[08:43] <axw> (quite os specific)
[08:43] <rogpeppe> axw: that seems wrong to me
[08:43] <axw> rogpeppe: yeah, if there's no public address
[08:43] <axw> is it less wrong to choose an unknown address that might be private (like this)?
[08:43] <rogpeppe> axw: another possibility is to strictly order Machine.Addresses before Machine.MachineAddresses
[08:43] <axw> the right thing of course is to classify things properly
[08:44] <axw> rogpeppe: looking at instance.SelectPublicAddress, that won't work - it chooses the last cloud-local/unknown in the list
[08:45] <axw> which is different to internal, for some reason
[08:45] <rogpeppe> axw: that's definitely wrong if so
[08:45] <axw> rogpeppe: perhaps it just needs to change to be like internal
[08:46] <rogpeppe> axw: yes
[08:46] <axw> (and preserve order)
[08:46] <rogpeppe> axw: in fact, the implementation of internalAddressIndex and publicAddressIndex should probably be merged
[09:10] <rogpeppe> jamespage: when you have a moment, would you be able to run this go program on one of the openstack nodes in the juju env that exhibits this problem? http://play.golang.org/p/GH0261EIHH
[09:11] <rogpeppe> axw: i'm thinking we might be able to make some deductions from the interface name
[09:13] <jamespage> rogpeppe, OK - lemme finish up the upgrade testing I'm doing and I'll try again
[09:36] <natefinch> morning all
[09:36] <natefinch> rogpeppe: you around?
[09:36] <rogpeppe> natefinch: yup
[09:36] <rogpeppe> natefinch: just doing a review. will be with you shortly.
[09:36] <natefinch> rogpeppe: sure
[09:42] <axw> rogpeppe: sorry was afk. I suppose that would be better than what we have now
[09:43] <rogpeppe> axw: preserving order, you mean?
[09:43] <axw> rogpeppe: deducing classification
[09:43] <rogpeppe> axw: yeah
[09:44] <rogpeppe> axw: we should preserve order too, i think, so the addresses are in predictable order. currently we're shuffling them randomly, which isn't great
[09:44] <natefinch> rogpeppe, axw: you guys talking about the sort.Stable address problem with replicaset addresses?
[09:44] <axw> rogpeppe: provider addresses should certainly come before machine, but otherwise I think relying on order is a mistake
[09:45] <axw> natefinch: no, something else entirely - choosing public addresses when there are only unknown/cloud-local
[09:45] <rogpeppe> natefinch: no, we're talking about #1303735
[09:45] <_mup_> Bug #1303735: public-address change to internal bridge post juju-upgrade <openstack-provider> <juju-core:Triaged> <juju-core 1.18:Triaged> <https://launchpad.net/bugs/1303735>
[09:45] <rogpeppe> axw: mgz says that order is important
[09:45] <natefinch> ahh ok
[09:46] <axw> rogpeppe: if addresses really do have a priority, then I think that should be explicit
[09:46] <rogpeppe> axw: and that a provider can return preferred addresses by putting them earlier in the addresses slice
[09:46] <axw> ordering in a slice seems pretty subtle, easy to break
[09:46] <rogpeppe> axw: that's the current design, FWIW
[09:47] <axw> yeah, I get that - it needs to be fixed - just whining :)
[09:47] <natefinch>  type AddressByPriority []Address
[09:47] <natefinch> now it's explicit
[09:48] <rogpeppe> natefinch: i think it's reasonable as is, actually.
[09:48] <axw> rogpeppe: we should probably document that order is important on instance.Instance.Addresses
[09:49] <rogpeppe> axw: it's not too hard to take care to preserve order. it would be nice if there was a function to help with merging address slices in the instance package
[09:49] <rogpeppe> axw: definitely
[09:51] <natefinch> rogpeppe: I'm not a huge fan of relying on order of a generic slice.  I guess we very rarely pass it around outside the provider, and if the provider interface makes it clear the order matters, then that's probably ok.
[09:51] <rogpeppe> natefinch: we pass it around a lot actually
[09:51] <rogpeppe> natefinch: i don't really see the problem - slices are inherently ordered
[09:53] <natefinch> rogpeppe: yes, but that order usually doesn't matter.  And it's not clear it matters when some random function gets a list of addresses deep in the bowels of the code.
[09:53] <rogpeppe> natefinch: huh? that order often/usually does matter!
[09:53] <natefinch> I presume we got into this mess because we didn't realize the order of the slice matters
[09:53] <rogpeppe> natefinch: e.g. []byte
[09:54] <rogpeppe> natefinch: we definitely need to document that more
[09:54] <rogpeppe> natefinch: but i think it's reasonable to have a convention that []Address is ordered
[09:55] <rogpeppe> natefinch: otherwise we'd end up adding some kind of a priority field which would actually make things considerably harder
[09:59] <natefinch> rogpeppe: I'm just not a fan of preventing bugs by following conventions that are likely only written down in one place in a huge codebase.  But I agree making the providers return a different type would be a hassle.
[10:00] <rogpeppe> natefinch: it's not just making the providers return a different type - it's coordinating priorities. do you have some global definition of address priority levels? what do you do when you combine address from two different sources?
[10:00] <rogpeppe> natefinch: all those issues fall out naturally if you assume that ordering matters in a slice
[10:01] <rogpeppe> natefinch: we should definitely write down in a couple of places that order is significant
[10:01] <natefinch> rogpeppe: I don't want to continue to argue it, since it's just stopping us from actually doing anything, but I think the answer is non-trivial no matter what we do.
[10:01] <rogpeppe> natefinch: i don't think it's too hard actually. just preserve order when combining addresses.
[10:02] <wallyworld_> mgz: have you nova booted an instance on hp cloud manually and then attempted to ssh into it? i've had no luck getting in via ssh
[10:02] <natefinch> rogpeppe: I guess I don't know how to preserve order when merging two slices unless you know how they were sorted in the first place.
[10:03] <rogpeppe> natefinch: trivial answer: just concatenate the slices
[10:03] <jam> axw: I just got a "session already closed" panic on the bot. Doesn't your patch fix that?
[10:04] <axw> my patch?
[10:04] <jam> axw: the one that untwines StateWorker and APIWorker
[10:04] <axw> jam: rogpeppe fixed a channel closed one
[10:04] <rogpeppe> natefinch: more sophisticated answer: delete items in the second slice that exist in the first slice before concatenating them
[10:04] <axw> jam: link?
[10:05] <axw> jam: nm, found it
[10:05] <natefinch> rogpeppe: how do you know the ones in the second slice are lower priority than all the ones in the first slice?
[10:05] <jam> axw: here's a link to the failure: https://code.launchpad.net/~jameinel/juju-core/go-vet-cleanup/+merge/214911
[10:05] <rogpeppe> jam: yeah, my patch wasn't for a "session already closed" error
[10:05] <axw> jam: that looks different
[10:05] <rogpeppe> natefinch: you make that decision
[10:05] <rogpeppe> natefinch: based on the origin of each slice
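A minimal sketch of the merge rogpeppe describes above: keep every address from the preferred source in order, then append addresses from the second source that are not already present. The Address type and the mergeAddresses name are hypothetical stand-ins for illustration, not juju's actual instance package API.

```go
package main

import "fmt"

// Address is a simplified, hypothetical stand-in for an instance address.
type Address struct {
	Value string
	Scope string // e.g. "public", "local-cloud", or "" for unknown
}

// mergeAddresses returns the addresses in preferred followed by those in
// others, dropping any address whose Value has already been seen. Relative
// order within each input slice is preserved, so the caller decides priority
// simply by choosing which slice to pass first.
func mergeAddresses(preferred, others []Address) []Address {
	seen := make(map[string]bool, len(preferred)+len(others))
	merged := make([]Address, 0, len(preferred)+len(others))
	for _, group := range [][]Address{preferred, others} {
		for _, a := range group {
			if seen[a.Value] {
				continue
			}
			seen[a.Value] = true
			merged = append(merged, a)
		}
	}
	return merged
}

func main() {
	// Provider-reported addresses come first; machine-reported ones follow.
	provider := []Address{{"203.0.113.7", "public"}, {"10.0.0.5", "local-cloud"}}
	machine := []Address{{"10.0.0.5", ""}, {"192.168.122.1", ""}}
	for _, a := range mergeAddresses(provider, machine) {
		fmt.Printf("%s scope=%q\n", a.Value, a.Scope)
	}
}
```

The priority decision lives entirely in the argument order, which matches the point that it is made from the origin of each slice rather than from a per-address priority field.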
[10:06] <rogpeppe> jam: it may well be related to my patch though
[10:06] <rogpeppe> jam: i'll have a look
[10:07] <jam> axw: I thought you had a comment in IRC about breaking the machine agent because of the multiple connections during upgrade, which might be related, but maybe not directly.
[10:07] <jam> this, in particular, looks like a Watcher that is trying to finish something while the connections are cleaning up.
[10:08] <axw> jam: I did, and rogpeppe fixed it... I don't think it is related, but maybe rog will have a better idea
[10:10] <jam> axw: rogpeppe: looking at state/watcher/watcher.go it looks like it could be a race condition. If we triggered tomb.Dying but also got the timeout in time.After(period), the w.needSync will be checked without looking at tomb.Dying
[10:11] <jam> hmm.. alternatively, on first entering the function, you also set needSync, but haven't looked at Dying yet (AFAICT)
[10:11] <rogpeppe> jam: i don't think that should matter
[10:12] <jam> the traceback says that it was happening in New()
[10:12] <rogpeppe> jam: until the watcher's tomb is Dead, it's entitled to do anything it likes
[10:12] <jam> though it doesn't go above that.
[10:12] <rogpeppe> jam: i think it must be that we're not closing things down properly
[10:12] <jam> rogpeppe: sure, it looks like we might have gotten a closed session while we were doing something else, and we're closing it concurrently with creating something new.. ?
[10:14] <jam> rogpeppe: anyway, don't look too deeply on this, I was just trying to push out some of wwitzel's in-progress stuff while he was gone
[10:14] <jam> it isn't critical work
[10:17] <natefinch> btw, rogpeppe: to land HA, we need to rework the sort.Stable of addresses.  sort.Stable is go 1.2, and we only require go 1.1.2  right now
[10:18] <axw> natefinch: why do you need to stable sort?
[10:18] <rogpeppe> natefinch: right - i saw that. all that selectPreferredStateServerAddress logic is about to go anyway
[10:19] <rogpeppe> natefinch: i didn't suggest taking it out because i didn't want to perturb the branch any more
[10:19] <rogpeppe> natefinch: i'd just delete all of that and use mongo.SelectPeerAddress instead
[10:19] <rogpeppe> axw: we used a stable sort to preserve address order
[10:20] <natefinch> rogpeppe: right, we just have to take it out since the bot can't compile it
[10:20] <jamespage> rogpeppe, OK - this is from 12.04 - http://paste.ubuntu.com/7225600/
[10:20] <axw> rogpeppe: yeah, just wondering what part of the address is being ignored for the sort.Sort not to be good enough
[10:20] <jamespage> rogpeppe, however I think I saw the issue on 14.04 nodes - so doing it there as well.
[10:21] <rogpeppe> jamespage: oh, one mo. i didn't include some crucial info.
[10:21] <axw> cos if they're equal and we're considering all fields, surely we don't care
[10:24] <rogpeppe> jamespage: this is more useful: http://play.golang.org/p/mmy9KhUy9T
[10:24] <rogpeppe> axw: we weren't comparing all fields
[10:37] <mgz> wallyworld_: yeah, you need to add your ssh key either through cloud-init or via nova though
[10:37] <wallyworld_> mgz: i tried via nova using keypair-add
[10:37] <mgz> right, with that... it didn't work?
[10:37] <wallyworld_> i used the --pub-key option
[10:38] <wallyworld_> yeah, didn't work
[10:38] <wallyworld_> mgz: i'm trying to test the latest fixes to the manual provider that landed today
[10:38] <jamespage> rogpeppe, http://paste.ubuntu.com/7225650/
[10:38] <mgz> you can use `nova console-log` to see what's up if you supplied any cloud init bits
[10:39] <wallyworld_> mgz: didn't supply any cloud init bits, was just assuming keypair-add would work
[10:39] <wallyworld_> console log seemed to show some random key being used
[10:39] <wallyworld_> not mine
[10:39] <mgz> odd
[10:39] <rogpeppe> jamespage: thanks
[10:40] <mgz> wallyworld_: ah,
[10:40] <rogpeppe> jamespage, mgz: do you think it would be reasonable to pattern match on the interface name to determine the class of address? (e.g. if it matches virbr* then assume it's machine-local)
[10:40] <mgz> did you actually use `nova boot --key-name MYKEY` ?
[10:40] <wallyworld_> yep
[10:40] <rogpeppe> jamespage: i don't know how predictable interface names are in linux
[10:40] <mgz> okay, I'm out of ideas then :P
[10:40] <wallyworld_> mgz: the same name as i used for keypair-add
[10:41] <wallyworld_> :-(
[10:41] <mgz> wallyworld_: try supplying a key with cloud-init instead
[10:41] <jamespage> rogpeppe, hmm
[10:41] <mgz> 's a bit more work but should be fine
[10:41] <wallyworld_> mgz: point me to some doc to tell me what to do?
[10:41] <mgz> sec
[10:41] <wallyworld_> or i can try with lxc i guess
[10:42] <rogpeppe> jamespage: because i believe there are cases where we really do want to get the addresses off the local machine interfaces. but that's hard if we can't tell which ones are machine-local.
[10:45] <mgz> wallyworld_: basically, make a text file with "#cloud-config\nssh_authorized_keys:\n  - ssh-rsa .... blah@blah\n"
[10:45] <jamespage> rogpeppe, you can't safely make that assumption "if it matches virbr* then assume it's machine-local"
[10:46] <mgz> see doc/examples/cloud-config-ssh-keys.txt in lp:cloud-init for an example
[10:46] <wallyworld_> ta, ok
[10:46] <mgz> then you can supply that file straight as --user-data to boot
[10:46] <mgz> (no need to gzip as it's so small)
[10:46] <wallyworld_> ok, i'll try that
[10:47] <jamespage> rogpeppe, is it possible to limit juju to querying interfaces it's been told about or created itself?
[10:47] <jamespage> rogpeppe, whitelist rather than blacklist
[10:47] <dimitern> jam, standup?
[10:48] <mgz> jamespage: we'll nearly start doing that with maas now I think, as dimitern has started getting the network interfaces from the lshw that maas provides
[10:48] <mgz> we should probably do something similar when we grow better networking support in other clouds
[10:49] <dimitern> mgz, jamespage, we definitely will do that for other clouds, gradually as juju networking support grows
[10:50] <dimitern> fwereade, updated and tested https://codereview.appspot.com/85220044/ - should be good to land
[10:50] <fwereade> dimitern, cheers
[11:07] <mattyw> fwereade, I've made the small change you asked for - just added a test and a small fix - happy for me to land it? https://codereview.appspot.com/83060049/
[11:17] <jam1> dimitern: sorry I missed the ping. I completely spaced off the standup, and was on my other laptop.
[11:18] <dimitern> jam1, we're still there, you can join if you like :)
[11:32] <fwereade> mattyw, if I LGTMed with fixes you don't need to ask, but you can always ask for another review if you're not sure
[11:32] <mattyw> fwereade, ok, I just added the test - and a fix I found while writing it so I'll approve it then, thanks
[11:33] <fwereade> mattyw, cool
[11:39] <jam1> fwereade, dimitern: https://code.launchpad.net/~jameinel/juju-core/1.18-refuse-downgrade-1299802/+merge/214878 needs a review
[11:40] <dimitern> jam1, looking
[11:42] <jam1> dimitern: thanks
[11:47] <dimitern> jam1, LGTM
[12:12] <rogpeppe> mgz, jamespage: i wonder if we could just add only addresses from eth* interfaces for the time being. that would probably cover the case that we care about most currently.
[12:13] <rogpeppe> natefinch: hangout?
[12:13] <axw> rogpeppe: is it expected we'll want to have non-voting replicaset members? is that why we have NoVote/WantsVote? or is that specifically for handling inaccessible members?
[12:14] <rogpeppe> axw: yes - if a machine goes down, we don't know that it might just come back up again in a few moments, so we don't want to just destroy it or remove it immediately
[12:15] <rogpeppe> axw: so we just mark it so that it doesn't want the vote
[12:15] <rogpeppe> axw: also, we can have a machine with WantsVote=false and HasVote=true
[12:17] <axw> ok
[12:17] <natefinch> rogpeppe:  sure
[12:17] <rogpeppe> axw: our main invariant is that the number of machines that *want* the vote must always be odd, and similarly the number of machines in the replica set configuration that *have* the vote must always be odd.
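The invariant stated above can be sketched as a small check: count the machines that want the vote and the members that currently have it, and require both counts to be odd. The Member type and checkVoteInvariant are hypothetical names for illustration and assume nothing about how juju actually stores this state.

```go
package main

import "fmt"

// Member is a hypothetical view of one state server machine.
type Member struct {
	ID        string
	WantsVote bool // the machine should be a voting member
	HasVote   bool // the replica set config currently gives it a vote
}

// checkVoteInvariant returns an error unless both the number of members
// that want the vote and the number that have the vote are odd.
func checkVoteInvariant(members []Member) error {
	wants, has := 0, 0
	for _, m := range members {
		if m.WantsVote {
			wants++
		}
		if m.HasVote {
			has++
		}
	}
	if wants%2 == 0 {
		return fmt.Errorf("%d machines want the vote; need an odd number", wants)
	}
	if has%2 == 0 {
		return fmt.Errorf("%d members have the vote; need an odd number", has)
	}
	return nil
}

func main() {
	// Machine 3 is being demoted (WantsVote=false, HasVote=true) while
	// machine 2 is waiting to be promoted; both counts are still odd.
	members := []Member{
		{ID: "0", WantsVote: true, HasVote: true},
		{ID: "1", WantsVote: true, HasVote: true},
		{ID: "2", WantsVote: true, HasVote: false},
		{ID: "3", WantsVote: false, HasVote: true},
	}
	fmt.Println(checkVoteInvariant(members))
}
```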
[12:17] <rogpeppe> natefinch: one mo, i've just been called to lunch
[12:17] <natefinch> ok
[12:17]  * rogpeppe lunches
[12:18]  * natefinch breakfasts
[12:21]  * perrito666 snacks after breakfast
[12:21] <perrito666> we really need more names for eating occasions
[12:21]  * axw pats his belly full of pizza
[12:22] <jam1> perrito666: brunch is the breakfast + lunch meal
[12:22] <jam1> second breakfast is the hobbit one, (along with elevensies (sp?))
[12:22] <axw> heh
[12:22] <perrito666> jam1: I am in the hobbit one
[12:22] <perrito666> I intend to lunch too
[12:22] <perrito666> (and honestly, I might also eat something near eleven now that you mention it)
[12:23] <jam1> perrito666: http://www.moviemistakes.com/film1778/quotes
[12:23] <jam1> so, breakfast, second breakfast, elevensies, Lunch, Luncheon, Afternoon tea, dinner, supper, I'm not sure if there are more
[12:24] <perrito666> that pretty much covers my day :)
[12:25] <jam1> rogpeppe, natefinch: so how close are we to having a "juju ensure-state-availability" that we can play with ?
[12:37] <rogpeppe> jam1: i've got a branch that seems to work
[12:38] <rogpeppe> jam1: but it needs more tests
[12:48] <jam1> rogpeppe: natefinch: I just noticed that we thought EnsureMongo could probably land (and be polished from there) yesterday, but it is still up for review.
[12:49] <jam1> at least the comment yesterday was "if I get enough time before the kids wake up", which probably didn't happen, but certainly afterwards... ?
[12:49] <rogpeppe> jam1: it's landing very soon
[12:49] <rogpeppe> jam1: it used a go 1.2 feature which meant it couldn't land as was
[12:49] <jam1> rogpeppe: if that wasn't said weeks ago, I would trust you :)
[12:49] <jam1> rogpeppe: what was that? (I wasn't particularly aware of 1.2 incompatibilities)
[12:50] <rogpeppe> jam1: it used sort.Stable, which is a go1.2 addition
[12:50] <jam1> ah
[12:50] <rogpeppe> jam1: it's been LGTM'd
[12:51] <mattyw> is the landing bot awake?
[12:52] <jam1> mattyw: it landed my stuff 10 min ago
[12:52] <jam1> but I'll check it
[12:52] <jam1> mattyw: do you have something that it isn't noticing?
[12:53] <mattyw> jam1,  https://code.launchpad.net/~mattyw/juju-core/deploy-with-user-name/+merge/213962
[12:53] <mattyw> jam, I guess there might be a queue?
[12:53] <jam1> mattyw: you don't have a commit message set
[12:53] <jam1> so the bot ignores it
[12:53] <mattyw> jam, ah - of course, thanks
[12:53] <jam1> mattyw: I copied your description
[12:54] <mattyw> jam1, that's great thanks very much
[12:54] <mattyw> jam, I'll try to remember for next time
[12:58] <dimitern> fwereade, poke re https://codereview.appspot.com/85220044/
[13:02] <rogpeppe> natefinch: i've got a dentist's appointment now. back in 30 mins
[13:02] <jam1> mattyw: I can see the bot is processing your request.
[13:02] <jam1> Note that we've had some intermittent failures with "Session already closed". If you see that, you can resubmit.
[13:03] <mattyw> jam1, ok thanks
[13:04] <sinzui> Hi jam, fwereade : I think this bug is describing unsupported behaviour of lxc nested in kvm: https://bugs.launchpad.net/juju-core/+bug/1304530
[13:04] <_mup_> Bug #1304530: nested lxc's within a kvm machine are not accessible <addressability> <cloud-installer> <kvm> <local-provider> <lxc> <juju-core:New> <https://launchpad.net/bugs/1304530>
[13:05] <mgz> sinzui: yeah, that's likely just a case of no one having tried it yet
[13:06] <mgz> the local provider is already pretty crazy when it comes to addressing without adding nested containers in
[13:08] <sinzui> mgz, I think stokachu has done something like that and it required esoteric magic to work
[13:08] <mgz> if you manually fiddle with the network setup you could probably make it work
[13:08] <mgz> it's not something we're looking to support for trusty though
[13:09] <sinzui> mgz, CI hates trunk https://bugs.launchpad.net/juju-core/+bug/1305047
[13:09] <_mup_> Bug #1305047: Unit tests fail on lp:juju-core r2588  <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1305047>
[13:10] <mgz> sinzui: that's rogpeppe's bug
[13:10] <mgz> rogpeppe: have you got a bug number for it?
[13:12] <sinzui> Ah, silly me, stokachu is the reporter of the bug. So I think he has reached the dead end that thumper predicted
[13:23] <fwereade> dimitern, rereviewed
[13:27] <dimitern> fwereade, thanks
[13:29] <rogpeppe> mgz, sinzui: i tried and failed to reproduce that problem
[13:31] <sinzui> :(
[13:32] <rogpeppe> sinzui: interestingly panic is in a different test to the one that jam saw
[13:33] <rogpeppe> s/panic/that panic/
[13:35] <sinzui> rogpeppe, CI will run the tests 5 times before giving up. It tried for many revs and did many fails
[13:35] <sinzui> rogpeppe, But CI just got a pass http://ec2-54-84-137-170.compute-1.amazonaws.com:8080/job/run-unit-tests-amd64-precise/
[13:36] <sinzui> rogpeppe, trusty has the same bad record, but its passes happen in a better order to make CI happy: http://ec2-54-84-137-170.compute-1.amazonaws.com:8080/job/run-unit-tests-amd64-trusty/
[13:37] <mgz> sinzui: it seems like a pretty easy to hit race fail
[13:37] <mgz> landing bot has jackpotted a number of times
[13:37] <natefinch> rogpeppe:  you have dentist appointments that only take 30 minutes?  Damn, mine always take like an hour.
[13:37] <natefinch> (and that's not including time to get there)
[13:39] <rogpeppe> natefinch: the actual appointment was just a checkup - 10 minutes only; and the dentist is only a couple of minutes bike ride away
[13:39] <natefinch> rogpeppe: that's cool.
[13:41] <rogpeppe> hmm, i just saw another (probably unrelated) panic when testing
[13:42] <rogpeppe> http://paste.ubuntu.com/7226267/
[13:42] <mgz> rogpeppe: is the change you suspect just revertable?
[13:43] <rogpeppe> mgz: probably
[13:43] <rogpeppe> mgz: i'd like to know what's going on though
[13:43] <mgz> if CI can hit the error this reliably, should be pretty easy to confirm blame or not
[13:50] <jamespage> rogpeppe, does the same code get used in the MAAS provider? when using LXC containers, brX is also valid
[13:51] <rogpeppe> jamespage: yes, the same code gets used in the MAAS provider
[13:51] <rogpeppe> jamespage: perhaps we need provider-specific code to run in the client to get the addresses
[13:51] <jamespage> rogpeppe, so in the MAAS provider the IP address is assigned to the bridge, not the physical interface
[13:51] <jamespage> assuming LXC or KVM containers have been created
[13:52] <jamespage> whitelisting eth* and br* might work OK
[13:52] <jamespage> that said I've seen emX style entries as well with biosdevname
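A rough sketch of the whitelist idea being discussed: only consider addresses from interfaces whose names start with an allowed prefix. The prefix list and the allowedInterface helper are assumptions for illustration; as jamespage notes, names such as emX (biosdevname) would be missed by this list, so it is not a complete answer.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// allowedPrefixes is an assumed whitelist taken from the discussion above.
var allowedPrefixes = []string{"eth", "br"}

// allowedInterface reports whether an interface name starts with one of the
// whitelisted prefixes. "virbr0" is rejected because it starts with "vir".
func allowedInterface(name string) bool {
	for _, p := range allowedPrefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	return false
}

func main() {
	ifaces, err := net.Interfaces()
	if err != nil {
		fmt.Println("cannot list interfaces:", err)
		return
	}
	for _, iface := range ifaces {
		if !allowedInterface(iface.Name) {
			continue
		}
		addrs, err := iface.Addrs()
		if err != nil {
			continue // skip interfaces whose addresses cannot be read
		}
		for _, a := range addrs {
			fmt.Println(iface.Name, a.String())
		}
	}
}
```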
[13:52] <rogpeppe> jamespage: i'm not familiar with the details of what scenarios we really need the machine-local address discovery.
[13:52] <rogpeppe> fwereade, mgz: ^
[13:53] <rogpeppe> s/discovery/discovery for/
[13:53] <mgz> jamespage: I'm suspicious of anything just running on the machines themselves
[13:53] <jamespage> mgz, rogpeppe: do we know what this local discovery step is used for?
[13:54] <rogpeppe> jamespage: i'm not sure. i'm guessing there are some places that we can't use the provider for discovery
[13:54] <rogpeppe> jamespage: perhaps this is something that's there for manual provisioning only
[13:56] <mgz> jamespage: this is all re bug 1303735 right?
[13:56] <_mup_> Bug #1303735: public-address change to internal bridge post juju-upgrade <openstack-provider> <juju-core:Triaged> <juju-core 1.18:Triaged> <https://launchpad.net/bugs/1303735>
[13:57] <jamespage> mgz, yes
[14:27] <fwereade> rogpeppe, jamespage: it's the blasted local provider that needs local address discovery
[14:27] <rogpeppe> ah, interesting, there's a race in the test between agent config and the apiaddressupdater
[14:27] <fwereade> rogpeppe, jamespage: I can't immediately recall whether we have lxc-ls --fancy everywhere we might need it, which *could* let us work around that
[14:27] <fwereade> rogpeppe, jujud tests?
[14:28] <rogpeppe> fwereade: yeah
[14:28] <fwereade> rogpeppe, those things suck
[14:28] <fwereade> ;)
[14:28] <rogpeppe> fwereade: it's only a race because we don't set the APIHostPorts in the state
[14:28] <rogpeppe> fwereade: so the apiaddressupdater starts and immediately assigns no addresses to the agent config
[14:29] <rogpeppe> fwereade: the test only works if the APIWorker grabs the APIInfo before it does that
[14:29] <rogpeppe> fwereade: we should have valid APIHostPorts in the state, then there shouldn't be a problem
[14:30] <fwereade> rogpeppe, right, makes sense
[14:30] <rogpeppe> fwereade: i wondered about having an EnvironProvider method that allows us to ask a provider for local addresses
[14:31] <rogpeppe> fwereade: then the local provider could implement it, but the other providers could just return nothing
[14:32] <rogpeppe> fwereade: (but it would potentially allow us to move away from using the instancepoller for some providers, if we wanted to - hazmat thinks that's a good idea)
[14:33] <fwereade> rogpeppe, honestly I think it's a matter of tuning the instancepoller more than it is a matter of dropping it
[14:33] <fwereade> rogpeppe, we want to keep track of instance status as well
[14:33] <fwereade> rogpeppe, I think there's something else
[14:33] <fwereade> rogpeppe, oh, yeah, instance networks
[14:34] <rogpeppe> fwereade: regardless, having a way for providers to add locally-sourced addresses to a machine seems like a reasonable idea
[14:34] <fwereade> rogpeppe, yeah, I wouldn't object to making the MachineAddresses stuff smarter
[14:34] <dimitern> fwereade, network ids and tags https://codereview.appspot.com/86010044 when you can take a look
[14:35] <rogpeppe> fwereade: FWIW MachineAddresses isn't a great name - it doesn't really say why Machine.MachineAddresses is different from Machine.Addresses...
[14:35] <fwereade> rogpeppe, agreed, it's an awful name
[14:36] <rogpeppe> fwereade: LocalAddresses?
[14:36] <rogpeppe> fwereade: LocallySourcedAddresses?
[14:36] <fwereade> rogpeppe, the semantic payload there is not ideal either
[14:36] <rogpeppe> (i don't like either of those, BTW)
[14:36] <fwereade> rogpeppe, the latter is probably best
[14:37] <natefinch> anyone know why I'm getting this when I run juju? WARNING unknown config field "proxy-ssh"
[14:37] <fwereade> (best of a bad bunch)
[14:37] <rogpeppe> fwereade: AgentProvidedAddresses ?
[14:39] <fwereade> natefinch, hmm, that seems odd -- it's something axw added for azure, but I'm not sure where the error comes from
[14:39] <natefinch> fwereade: I don't have proxy-ssh in my environments.yaml anywhere
[14:43] <natefinch> fwereade: I think blowing away my old environments.yaml and making a new one helped
[14:44] <vladk> dimitern: I did lbox propose for gomaasapi, but no codereview on appspot was created, only on LP
[14:44] <dimitern> vladk, did lbox give you any errors?
[14:44] <fwereade> natefinch, would you take a few moments to dig into it sometime today please? many users will have old environments.yamls...
[14:45] <vladk> dimitern: no, it just printed a link to LP:
[14:45] <vladk> Proposal: https://code.launchpad.net/~klyachin/gomaasapi/101-testserver-extensions/+merge/214961
[14:45] <dimitern> vladk, you could try running it again
[14:46] <natefinch> fwereade: yeah, once HA lands, I'll be able to actually work on other things
[14:46] <natefinch> fwereade: which will be as soon as I can run a few tests
[14:47]  * fwereade cheers at natefinch
[14:48] <natefinch> fwereade: this also seems to have fixed other problems I had been experiencing.  Definitely worth investigating
[14:49] <natefinch> (luckily I kept around the old environments.yaml)
[14:51] <natefinch> I wish we had an environment variable we could set that would effectively add --debug to every juju command line call
[15:03] <vladk> dimitern: I specified -cr explicitly: https://codereview.appspot.com/86070043
[15:03] <dimitern> vladk, ah, you know - gomaasapi doesn't have .lbox.check in the root dir I think
[15:04] <dimitern> vladk, in juju-core we have .lbox containing the default args for lbox: "propose -cr -for lp:juju-core"
[15:04] <mgz> vladk: you can always rerun lbox propose as many times as you want, so that's fine
[15:04] <mgz> I always do `lbox propose -cr -v` out of long standing habit
[15:07] <jamespage> fwereade, are you guys aiming for a 1.18.1 release prior to the end of tomorrow?
[15:07] <jamespage> (which co-incidentally is when Final Freeze kicks in)
[15:08] <mgz> jamespage: we appear to have no fixed bugs in 1.18.1
[15:08] <mgz> so I'd guess not.
[15:09] <natefinch> gah, I still can't deploy stuff locally
[15:09] <jamespage> mgz, well we have 24 hrs until tomorrow eod :-)
[15:10] <mgz> jamespage: yeah, but all the bugs look hard... ;_;
[15:15] <fwereade> jamespage, it's not impossible: I'm working on the first; I will ask axw to look at the second overnight; just asked for cmars' comments re third; not sure about 4th, I'll ask an australian to take a look; 5th appears unreproducible
[15:15] <jamespage> fwereade, OK - thanks
[15:16] <natefinch> ah hah..... lxc-ls seems broken.  I bet that's my problem
[15:16] <jamespage> fwereade, it's not impossible to get a point release in after tomorrow
[15:16] <fwereade> jamespage, sure, but I prefer to be a good citizen where practical
[15:18] <rogpeppe> natefinch: i'm just proposing a branch to fix one of the machine agent panics. perhaps we could join up to move the HA stuff forward after that?
[15:19] <dimitern> fwereade, updated https://codereview.appspot.com/85220044/ once more
[15:19] <natefinch> rogpeppe: sure.  I fixed the port problem with the initiate address and removed the testing panics you mentioned in the review.  It works on amazon, but I seem to have an LXC problem on my local host, so I'm apt-getting and will reboot after to see if that fixes anything
[15:19] <dimitern> fwereade, i really want to land that and https://codereview.appspot.com/86010044/ today if i can
[15:21] <rogpeppe> this fixes a cmd/jujud test crash: https://codereview.appspot.com/86080043/
[15:21] <rogpeppe> fwereade, mgz, dimitern, natefinch: review appreciated
[15:22] <rogpeppe> unfortunately it's not the one that people have been seeing on the 'bot and in CI
[15:23] <mgz> ha, typed in the wrong id
[15:23] <mgz> but I should actually review dimitern's branch, which is where I ended up :P
[15:24] <natefinch> brb, gonna reboot now that I have upgraded, see if that fixes my lxc problems
[15:29] <dimitern> rogpeppe, reviewed
[15:29] <rogpeppe> dimitern: ta!
[15:30] <rogpeppe> dimitern: in general i prefer to use a literal - if i use "nothing", then i have to check its value. i don't mind the slightly greater verbosity.
[15:31] <rogpeppe> dimitern: at some point in the future i hope to see a "zero" builtin in Go that acts like nil except it represents the zero value for any type.
[15:32] <dimitern> rogpeppe, I know you don't mind :) It's just my opinion
[15:32] <dimitern> rogpeppe, yeah, that will be very handy
[15:33] <rogpeppe> dimitern: FWIW, i think the code with naive literals reads slightly more easily - it's more directly obvious what the code is doing.
[15:34] <fwereade> dimitern, https://codereview.appspot.com/86010044/ reviewed
[15:38] <dimitern> fwereade, ta!
[15:38] <fwereade> dimitern, you might not be so happy when you read it, we may need to discuss, I fear I have been unclear
[15:39] <dimitern> rogpeppe, the first thing i'm doing when reading unfamiliar code and see a var/type/etc. i don't get, i immediately hit M-. in emacs, which invokes godef on the symbol and voila!
[15:39] <rogpeppe> dimitern: i'm usually reading the code in codereview...
[15:40] <rogpeppe> dimitern: but without the nothing declaration there's no need for any second look - it's immediately obvious on first scan
[15:40] <rogpeppe> dimitern: which is why i prefer it more direct like that
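The two styles being compared above can be sketched in Go as follows. This is only an illustration of the trade-off, not the code under review: `Result`, `errEmptyName`, and both lookup functions are hypothetical names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Result is a hypothetical return type standing in for whatever the
// reviewed code actually returned.
type Result struct {
	Name  string
	Count int
}

var errEmptyName = errors.New("empty name")

// lookupWithLiteral returns the zero value as a literal: the reader can
// see at the return site that nothing meaningful is being returned.
func lookupWithLiteral(name string) (Result, error) {
	if name == "" {
		return Result{}, errEmptyName
	}
	return Result{Name: name, Count: 1}, nil
}

// lookupWithNamedZero returns a declared-but-unassigned variable. It is
// equivalent, but the reader has to check that "nothing" was never set.
func lookupWithNamedZero(name string) (Result, error) {
	var nothing Result
	if name == "" {
		return nothing, errEmptyName
	}
	return Result{Name: name, Count: 1}, nil
}

// describe formats a result/error pair for printing.
func describe(r Result, err error) string {
	return fmt.Sprintf("%+v, err=%v", r, err)
}

func main() {
	fmt.Println(describe(lookupWithLiteral("")))
	fmt.Println(describe(lookupWithNamedZero("")))
}
```

Both functions behave identically; the difference is purely in how much the reader has to verify when scanning the return statement.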
[15:42] <fwereade> dimitern, the other one LGTM with trivials
[15:42] <dimitern> fwereade, thanks, still reading the last review :)
[15:43] <dimitern> fwereade, i can't really impose restrictions on what juju deems a valid network id until it's provider specific, can I ?
[15:44] <natefinch> rogpeppe: lucky 13? https://codereview.appspot.com/72500043/   addressed the things you mentioned in the last review, and it tests ok live on local and amazon
[15:44] <fwereade> dimitern, bugger, ofc you're right
[15:44] <fwereade> dimitern, TODO it with a short explanation of why we're so lax? or, hmm, ask the maas guys what their restrictions on net names are?
[15:45] <fwereade> dimitern, and slavishly copy those? :)
[15:45] <dimitern> fwereade, and re params.Network having both Tag and Id as you suggest - I can do that and for now make sure both match always
[15:45] <dimitern> fwereade, (except for the tag prefix ofc)
[15:46] <dimitern> fwereade, I'll just look at the maas source
[15:47] <dimitern> fwereade, re tags/names/ids - we can have in state and in the api + params all three and make tags work with names and keep name=id for now
[15:53] <vladk> dimitern: I got LGTM from rvba. Could you give me the next task?
[15:54] <dimitern> vladk, sorry, I have a few comments for your review
[15:54] <dimitern> vladk, will submit in a minute
[15:54] <fwereade> dimitern, yeah -- params.Network is just saying here's the net with this juju name, and this is what the provider calls it
[15:54] <fwereade> dimitern, sounds like we're aligned
[15:54] <fwereade> dimitern, thanks
[15:56] <dimitern> fwereade, yep, thanks, will propose a bit later, if you're still here will ping you again :)
[15:58] <dimitern> vladk, reviewed
[15:59] <mgz> fwereade: your comments on dimitern's proposal confuse me
[15:59] <fwereade> mgz, ha :)
[15:59] <fwereade> mgz, it is perfectly possible that I am missing something
[15:59] <rogpeppe> natefinch: get it approved!
[16:00] <fwereade> mgz, would you expand a little?
[16:00] <mgz> it seemed like we were deriving the tags from the cloud provider stuff, hence the no restrictions bar != "", rather than from being named by the use
[16:00] <mgz> *user
[16:01] <mgz> tag=what juju calls the network, id=what the cloud calls the network, name=junked due to being ambiguous
[16:01] <natefinch> rogpeppe: it's going. I *just* merged and fixed a conflict, so it should just work.   fingers crossed.
[16:01] <rogpeppe> natefinch: hangout?
[16:01] <mgz> (and label=optional friendly id for network also from the cloud)
[16:02] <mgz> I guess your review is saying we *should* be providing a way for the user to specify a tag, tied to a given id
[16:03] <mgz> but without some juju cli network commands, I don't see how we add that
[16:06] <mgz> fwereade: ^does that make sense of my confusion?
[16:07] <fwereade> mgz, tag is purely API-level, it's not what juju calls things
[16:07] <dimitern> mgz, for now network.ProviderId == network.Name in juju (both state and api) and tags are created from names
[16:07] <mgz> fwereade: ugh, the name inside the tag then
[16:08] <mgz> having tag be the magic decoration bit is annoying
[16:08] <fwereade> mgz, we *will* have cli network commands, and I want what we have today to fit in with what we will need to do soon
[16:09] <fwereade> mgz, for now, maas is the only thing that has networks, and there's a perfect mapping between provider id and user name
[16:09] <mgz> fwereade: so, dimitern's version seems to do that, by autonaming the... names inside the tags, and placing no restrictions on them
[16:09] <mgz> so they can become user-specified later
[16:11] <fwereade> mgz, but it also conflates names and ids in several places -- and if we don't make the kind of data clear now we will have the devil of a time once we have a name that does not match an id
[16:11] <mgz> fwereade: well the conflating I saw was from that autonaming business... maybe there's some other bits I missed?
[16:12] <dimitern> mgz, I wasn't clear about not just renaming Name to Id, but having both and using name for juju stuff and id for provider stuff
[16:12] <fwereade> mgz, it seemed to me that it was using Id instead of name across the board
[16:13] <mgz> okay, so we just need to be picky as hell about the naming... which will still be confusing even if we are due to too many things, too few names for names...
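A minimal Go sketch of the name/id/tag separation discussed above, assuming the current state of affairs where the name is auto-derived from the provider id and the only restriction is that it is non-empty. The type, the constructor, and the `network-` tag prefix are hypothetical and are not taken from the juju-core source.

```go
package main

import (
	"errors"
	"fmt"
)

// networkTagPrefix is an assumed prefix used only for this illustration.
const networkTagPrefix = "network-"

// Network keeps the juju-side name and the provider-side id as separate
// fields, even though they currently hold the same value.
type Network struct {
	Name       string // what juju (and later the user) calls the network
	ProviderId string // what the provider (for now only MAAS) calls it
}

// Tag is the API-level identifier; it is derived from the name, not the id.
func (n Network) Tag() string {
	return networkTagPrefix + n.Name
}

// String renders all three identifiers side by side.
func (n Network) String() string {
	return fmt.Sprintf("name=%s id=%s tag=%s", n.Name, n.ProviderId, n.Tag())
}

// NetworkFromProvider auto-names a network after its provider id. The only
// validation is that the id is non-empty, because valid ids are
// provider specific.
func NetworkFromProvider(providerId string) (Network, error) {
	if providerId == "" {
		return Network{}, errors.New("empty provider network id")
	}
	return Network{Name: providerId, ProviderId: providerId}, nil
}

func main() {
	n, err := NetworkFromProvider("vlan42")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(n)
}
```

Keeping two fields now means that code which needs the provider's identifier never reads `Name`, so nothing has to be untangled once a user-chosen name stops matching the id.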
[16:17] <natefinch> rogpeppe: sorry, stepped out to get lunch.  I can hangout now, yeah
[16:17] <rogpeppe> natefinch: ok, cool
[16:17] <rogpeppe> natefinch: https://plus.google.com/hangouts/_/canonical.com/juju-core?authuser=1
[16:19] <rogpeppe> natefinch: lp:~rogpeppe/juju-core/540-enable-HA
[16:21] <vladk> :332
[16:22] <natefinch> \o/  The proposal to merge lp:~natefinch/juju-core/030-MA-HA into lp:juju-core has been updated.   Status: Approved => Merged
[16:24] <alexisb> natefinch, sweetness!
[16:24] <natefinch> finally finally finally
[16:24] <rick_h_> natefinch: if you get time want to chat about that for a couple of min
[16:25] <natefinch> rick_h_: how urgent is it?  My day is pretty slammed
[16:25] <rick_h_> natefinch: not at all, completely when you've got time
[16:25] <rick_h_> and the time can even be 'let's catch up in vegas'
[16:25] <natefinch> ha, ok
[16:25] <rick_h_> just a heads up gui wants to catch up on HA to see what we can/should do from our end so we can have a plan
[16:28] <natefinch> rick_h_: good enough... I'll shoot you an email about it.  We're not quite done with it, but this was a huge chunk that had taken way too long to get in
[16:28] <rick_h_> natefinch: rgr, thanks
[16:39] <jam1> hey guys, something in the test suite is now creating a directory and turning it into 666
[16:39] <jam1> which means it is not executable or writeable
[16:39] <jam1> which means the test suite is failing to clean it up
[16:39] <jam1> a lot of: /tmp/jctest.LpP/gocheck-5577006791947779410/27/some-file
[16:39] <jam1>  files
[16:40] <jam1> fwereade: is that one of your FT tests ?
[16:40] <natefinch> jam1: there's a lot of tests that use 0666
[16:41] <jam1> natefinch: sure, but you don't normally change a *DIR* to 0666
[16:41] <fwereade> jam, hmm, I didn't *think* I did that, but I can't swear to it
[16:41] <jam1> they need 7 to be able to read the content
[16:41] <natefinch> jam1: oh, directory, right
[16:42] <natefinch> jam1: Tcharm/repo_test.go has a couple of those
[16:42] <natefinch> s/Tcharm/charm.
[16:43] <fwereade> jam1, hmm, I do have an 0644 in there, drivebying it now
[16:43] <jam1> fwereade: sorry, it is 444 read only
[16:43] <jam1> 6 would be rw
[16:43] <fwereade> jam, none of them I think
[16:44] <fwereade> ah-ha! yes I do
[16:44] <fwereade> bugger
[16:44] <jam1> well, it is all test number 27 ... ):
[16:44] <jam1> :)
[16:44] <jam1> not that *that* part helps
[16:45] <jam1> fwereade: TestRemovedCreateFailure
[16:45] <jam1> TestDirCreateFailure
[16:45] <jam1> fwereade: so I think just adding a Chmod(777) so we can clean up afterwards would be nice
[16:46] <fwereade> jam1, yeah, deferred chmods back to 0777
[16:46] <jam1> I don't think it is causing the test suite to *fail*, but it is preventing rm-rf from cleaning up after itself
[16:46] <fwereade> jam1, yep
[16:46] <fwereade> jam1, sorry about that
[16:46] <jam1> fwereade: np, I only noticed because the test suite is failing for other reasons
[16:46] <jam1> and that shows up in the log
[16:47] <jam1> dimiter's last patch just failed, some of which looks transient, and some which looks like an error message changed.
[16:47] <jam1> but at the *end* of that, it says "I couldn't clean up"
[16:48] <jam1> but hey, root can do anything it wants...
[16:48] <jam1> :)
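The cleanup fix agreed above (a deferred chmod back to 0777) can be sketched in Go like this. It is an assumption-laden illustration, not the actual test code: `withReadOnlyDir` and `demo` are hypothetical helpers, and the directory name `27` merely echoes the path quoted in the log. Note the leading zero: Go needs the octal literal `0777`, since a plain `777` would be read as decimal.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// withReadOnlyDir makes dir read-only (0444) while fn runs, then restores
// 0777 so that later cleanup can traverse and remove it.
func withReadOnlyDir(dir string, fn func() error) error {
	if err := os.Chmod(dir, 0444); err != nil {
		return fmt.Errorf("chmod read-only: %w", err)
	}
	// Restore permissions even if fn fails or panics.
	defer os.Chmod(dir, 0777)
	return fn()
}

// demo builds a temp tree, exercises a create failure inside a read-only
// directory, and reports the directory mode afterwards plus the result of
// removing the whole tree.
func demo() (os.FileMode, error) {
	base, err := os.MkdirTemp("", "jctest-sketch-")
	if err != nil {
		return 0, fmt.Errorf("mkdtemp: %w", err)
	}
	dir := filepath.Join(base, "27")
	if err := os.Mkdir(dir, 0777); err != nil {
		return 0, fmt.Errorf("mkdir: %w", err)
	}
	if err := os.WriteFile(filepath.Join(dir, "some-file"), []byte("x"), 0666); err != nil {
		return 0, fmt.Errorf("write: %w", err)
	}

	// Creating a file here is expected to fail for a non-root user; the
	// error is deliberately ignored because only the cleanup matters.
	_ = withReadOnlyDir(dir, func() error {
		f, err := os.Create(filepath.Join(dir, "other-file"))
		if err == nil {
			f.Close()
		}
		return err
	})

	info, err := os.Stat(dir)
	if err != nil {
		return 0, fmt.Errorf("stat: %w", err)
	}
	return info.Mode().Perm(), os.RemoveAll(base)
}

func main() {
	mode, err := demo()
	fmt.Printf("mode after restore: %o, cleanup error: %v\n", mode, err)
}
```

Without the deferred chmod, a non-root `os.RemoveAll` on the parent fails because a 0444 directory has neither the execute bit needed to traverse it nor the write bit needed to unlink its entries, which matches the leftover `/tmp/jctest...` files described above.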
[16:53] <natefinch> trivial code review anyone?  https://codereview.appspot.com/85970044
[16:54] <natefinch> jam1: btw, did you see EnsureMongoServer finally landed?
[16:54] <jam1> natefinch: I didn't. YAY \o/
[16:54] <natefinch> right? :)  Super psyched
[17:01] <natefinch> rogpeppe: https://codereview.appspot.com/85970044
[17:04] <jam1> natefinch: lgtm
[17:16] <fwereade> natefinch, woooot!
[17:19] <natefinch> fwereade: thanks :)
[17:20]  * natefinch has to see a man about some bees.  
[17:24]  * rogpeppe is done for the day
[17:24] <rogpeppe> might make it back in later for a little bit
[17:24] <rogpeppe> g'night all
[17:37]  * fwereade needs to go out for a while, would appreciate looks at https://codereview.appspot.com/85670046
[17:44] <jam1> natefinch-afk: you forgot to set a commit message when you proposed your merge, I'll do it for you
[18:06] <cmars> proposal for LP: #1303880 up, PTAL https://codereview.appspot.com/86130043
[18:06] <_mup_> Bug #1303880: Juju 1.18.0, can not deploy local charms without series <charms> <regression> <series> <juju-core:In Progress by cmars> <https://launchpad.net/bugs/1303880>
[18:11] <cmars> jam1, can you take a look at ^^
[18:19] <sinzui> cmars, I need to update the bug and note that setting the default-series in the env is also a solution if you are opposed to typing the series when you deploy a charm
[18:20] <sinzui> cmars, I am taking the regression tag off now that we know the affected users are the edge cases we talked about. I think the solution is to show the right error message
[18:23] <cmars> sinzui, that's a much easier fix :) please note the desired error message in the bug
[18:24] <sinzui> I see you included lucid, but juju and charms don't run on it
[18:24] <sinzui> cmars, I will think of a message right now
[18:41] <sinzui> cmars, https://bugs.launchpad.net/juju-core/+bug/1303880/comments/6
[18:41] <_mup_> Bug #1303880: Juju 1.18.0, can not deploy local charms without series <charms> <series> <juju-core:In Progress by cmars> <https://launchpad.net/bugs/1303880>
[18:42] <sinzui> though I see I meant to write "release notes" in that comment
[19:44] <cmars> sinzui, updated my proposal. can someone take a look, it's much smaller now :) https://codereview.appspot.com/86160043
[19:45] <sinzui> cmars, I am not a reviewer but that size is nice.
[19:45] <cmars> :)
[19:45] <sinzui> where did the US go?
[19:48] <sinzui> perrito666, can you review cmars's branch?
[21:15] <rogpeppe1> anyone up for doing a review? https://codereview.appspot.com/86200043/
[21:22] <bac> sinzui: have you tried using staging-tools or their kin lately?
[21:23] <sinzui> bac, I don't even know what they are
[21:24] <bac> sinzui: you created the branch :)
[21:24] <bac> https://code.launchpad.net/~ce-orange-squad/charmworld/staging-tools
[21:25] <sinzui> bac :) I have forgotten much
[21:25] <sinzui> oh
[21:25] <sinzui> bac.
[21:25] <sinzui> you probably care about the RT I reported today
[21:26] <bac> yes, maybe.  does it involve access to canonistack post hb?
[21:26] <sinzui> bac: I used those tools several times a week. orangesquad and juju-qa cannot use swift. Juju is unusable
[21:26] <sinzui> bac: nova is fine
[21:27] <bac> sinzui: i'm seeing canonical-sshuttle dying, not being able to connect to canonistack
[21:27] <bac> it all worked the last time i tried
[21:27]  * sinzui tries
[21:28] <sinzui> bac, I am connected, but what did I connect to because it looks empty
[21:29] <sinzui> bac: I think my jenv is bad. I am told the env is not bootstrapped
[21:30] <bac> sinzui: you think you're on staging?
[21:56] <thumper> davecheney: bugger... it seems like godeps doesn't update the hg branches
[22:01] <thumper> ah, no it does, it just doesn't say that it does
[22:10] <thumper> trivial review to just update the go.net library: https://codereview.appspot.com/86250043
[22:12] <cory_fu> Has juju ssh to a machine number (juju ssh 0) been fixed yet for LXC?
[22:15] <thumper> cory_fu: what do you mean?
[22:15] <thumper> cory_fu: for the local provider?
[22:15] <thumper> cory_fu: yes it works, except for machine 0 as that is the host
[22:15] <thumper> unless your host actually has sshd running
[22:17] <thumper> cmars: how goes https://bugs.launchpad.net/juju-core/+bug/1303880
[22:17] <_mup_> Bug #1303880: Juju 1.18.0, can not deploy local charms without series <charms> <series> <juju-core:In Progress by cmars> <https://launchpad.net/bugs/1303880>
[22:18] <cmars> thumper, i have a proposal, PTAL, https://codereview.appspot.com/86160043/
[22:18] <thumper> ack
[22:19] <dannf> wallyworld_: just to clarify - the branch you linked fixed that error, but wasn't the root cause, correct? just wondering if there's a fix i can/should verify
[22:24] <thumper> dannf: perhaps more context would help :-)
[22:26] <dannf> thumper: LP: #1302205
[22:27] <_mup_> Bug #1302205: manual provisioned systems stuck in pending on arm64 <add-machine> <hs-arm64> <manual-provider> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1302205>
[22:27] <thumper> dannf: it was my understanding that the fixes that wallyworld had were to fix the root cause
[22:27] <thumper> dannf: are you still having issues?
[22:27] <thumper> I know that wallyworld was trying to test yesterday
[22:27] <thumper> but not sure on the final progress
[22:27] <wallyworld> dannf: hi, i just logged on again after networking issues so can't see that backscroll, can i help?
[22:28] <dannf> wallyworld: yeah - just curious if the branches you linked were root cause - i.e., if it is worth me retesting w/ them
[22:29] <wallyworld> dannf: i committed fixes, but had trouble testing because i have to run from trunk to test and so can't use simplestreams to get the tools and building juju from source on the arm vms just hung
[22:29] <wallyworld> so i couldn't get tools built to test with
[22:30] <dannf> wallyworld: did you try building on the nova host? i've built there many times w/o a problem
[22:30] <wallyworld> i installed gcc-go and couldn't get outgoing access to launchpad or github so just copied my source tarball across
[22:30] <wallyworld> yeah, i think i build on the nova host
[22:31] <wallyworld> i can try again
[22:31] <wallyworld> actually
[22:31] <wallyworld> i could get outgoing access via wget
[22:31] <wallyworld> but go get just hung
[22:31] <wallyworld> so i couldn't get the juju source in the normal way
[22:31] <wallyworld> via vcs
[22:31] <dannf> though building in the vms *should* work - if not, probably a bug
[22:32] <rogpeppe1> a fairly trivial review if anyone wants to take a look: https://codereview.appspot.com/85600044
[22:32] <dannf> wallyworld: we can ask IS to open access for us to certain things. surprised lp access was blocked
[22:33] <wallyworld> dannf: i could wget to launchpad but "go get launchpad.net/juju-core" failed
[22:33] <wallyworld> or hung
[22:33] <dannf> ah - go get... never used that before
[22:33] <cmars> thumper, thanks
[22:33] <wallyworld> so i just copied across the source
[22:33] <dannf> i'll investigate that and at least get a bug filed if needed
[22:33] <wallyworld> dannf: go get uses bzr behind the scenes
[22:33] <wallyworld> dannf: i have a meeting now but will ping back when done
[22:34] <dannf> ack
[22:35] <cmars> sinzui, fix for LP: #1303880 is landing in trunk. do you need it proposed to any branches?
[22:35] <_mup_> Bug #1303880: Juju 1.18.0, can not deploy local charms without series <charms> <series> <juju-core:In Progress by cmars> <https://launchpad.net/bugs/1303880>
[22:36] <sinzui> cmars, yes please lp:juju-core/1.18
[22:37] <dannf> wallyworld: seems to be working for me. but slow. and no output. i just see the directory growing in size
[22:40] <thumper> wallyworld: trivial review for you, although since it is needed on two branches, lbox isn't good at handling it
[22:41] <thumper> https://code.launchpad.net/~thumper/juju-core/update-websocket-lib/+merge/215057
[22:41] <thumper> and https://code.launchpad.net/~thumper/juju-core/update-websocket-lib/+merge/215046
[22:43] <wallyworld> thumper: otp, will look soon
[22:43] <thumper> ack
[22:55] <thumper> cmars: I have approved the other branch
[22:55] <thumper> cmars: although I did realise that there aren't any tests for the new error message
[22:55] <thumper> cmars: is it hard to add one?
[22:56] <thumper> cmars: also, lbox doesn't like submitting the same branch to multiple targets
[22:56] <thumper> cmars: it is a bit too dumb
[23:13] <cmars> thumper, i'll propose a test case for deploying local without series. might be after dinner
[23:13] <thumper> cmars: ack
[23:18] <dannf> wallyworld: and go get seems to have completed (/home/ubuntu/dannf/go)
[23:19] <wallyworld> dannf: great :-) still in meeting, will check back soon
[23:19] <dannf> wallyworld: np; i need to start up the grill, so responses will be latent
[23:43] <davecheney> arosales: hazmat ping
[23:45] <davecheney> arosales: hazmat do you have time for a quick G+ to talk about the demo
[23:48] <arosales> davecheney: hello
[23:48] <davecheney> arosales: hazmat lets take this to #eco
[23:49] <arosales> ok