[00:09] <davecheney> _thumper_: seen mramm ?
[00:10] <_thumper_> davecheney: yeah
[00:10]  * davecheney waits on mramm for our 1-on-1
[02:33] <bigjools> davecheney: I'm trying to do a go get on github.com/andelf/go-curl and go says "cannot find package", yet it seems to work for others.  Any ideas?
[02:35] <bigjools> ok we worked it out
[02:35] <bigjools> FFS
[02:35] <bigjools> my GOPATH started with a :
[02:35] <bigjools> ridiculous
[03:38] <davecheney> bigjools: O_O!!
[03:39] <davecheney> ahh, there is a long story about that
[03:39] <davecheney> which isn't (all) go's fault
[03:40] <bigjools> davecheney: I see :)
[03:41] <davecheney> for example, did you know that PATH=$PATH:: is short hand for
[03:41] <davecheney> PATH=$PATH:.
[03:42] <bigjools> !
[03:43] <bigjools> davecheney: so when I do a perfectly reasonable line like this in my .bashrc, I am screwed then: export GOPATH="$GOPATH":/my/path
[03:43] <davecheney> yup, that isn't awesome
[03:44] <davecheney> that says GOPATH=.:/something/else
[03:44] <davecheney> where . is the first element, and will be the target for go get
[03:44] <davecheney> if it is of any consolation, i have fixed many of these usability problems in 1.1
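A minimal Go sketch of the failure mode discussed above, assuming a Unix-style ':' list separator. The helpers splitList, hasEmptyElement and appendPath are hypothetical names for illustration, not part of the go tool. It shows how expanding an unset variable into "$GOPATH":/my/path leaves an empty first element, and one way to append without creating it.

```go
package main

import (
	"fmt"
	"strings"
)

// splitList splits a colon-separated path list, keeping empty elements.
func splitList(list string) []string {
	if list == "" {
		return nil
	}
	return strings.Split(list, ":")
}

// hasEmptyElement reports whether any element of the list is empty.
func hasEmptyElement(list string) bool {
	for _, e := range splitList(list) {
		if e == "" {
			return true
		}
	}
	return false
}

// appendPath appends dir to list without introducing an empty
// element when list is unset.
func appendPath(list, dir string) string {
	if list == "" {
		return dir
	}
	return list + ":" + dir
}

func main() {
	unset := "" // stands in for an unset $GOPATH
	naive := unset + ":/my/path"
	fmt.Printf("%q\n", splitList(naive))
	fmt.Println("empty element present:", hasEmptyElement(naive))
	fmt.Println("safe append:", appendPath(unset, "/my/path"))
}
```

The shell equivalent of appendPath is to add the separator only when the variable is already set.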
[03:45] <davecheney> bigjools: thumper do you two have time to talk about the mongo problem ?
[03:45] <bigjools> sire
[03:45] <bigjools> sure even
[03:46] <davecheney> so, latest i have is everyone is working really hard to get mongo into at least raring by 13.04
[03:46] <davecheney> but I think we need a backup
[03:47] <davecheney> bigjools: what I would like to do is move from the tarball to your ppa for getting mongo for the bootstrap nodes
[03:47] <bigjools> the current raring packaging has a bug when it tries to build with ssl, fwiw
[03:47] <bigjools> \o/
[03:47] <bigjools> how about an official PPA
[03:48] <bigjools> I just did mine as I don't trust unsigned binaries :)
[03:48] <davecheney> bigjools: sounds fine by me, i'd really like your guidance on this
[03:48] <bigjools> ok
[03:48] <davecheney> the trick is we need P, Q and R, amd64 as a must
[03:48] <davecheney> 386 and arm as a maybe
[03:48] <bigjools> thumper was saying that the 2.2.3 build didn't work for him
[03:48] <davecheney> shitter
[03:48] <bigjools> 2.2.2 was defo ok
[03:48] <bigjools> I built 2.2.3 in the same way so I dunno what's up
[03:49] <davecheney> bigjools: how do I, or you
[03:49] <davecheney> i think you'd be faster
[03:49] <davecheney> get a PPA location setup so I can then reference that in the cloud init scripts ?
[03:50] <bigjools> activate a new PPA for the juju devs?
[03:50] <davecheney> bigjools: full disclosure, i'm a launchpad numpty
[03:50] <davecheney> you'll have to walk me through this
[03:50] <bigjools> heh
[03:50] <bigjools> np
[03:50] <bigjools> which team should own the ppa?
[03:50] <davecheney> i think there is a juju team now
[03:50] <davecheney> it used to be ~gophers
[03:50] <bigjools> perfect
[03:50] <davecheney> but that was changed recently
[03:50]  * davecheney checks
[03:51] <davecheney> bigjools: if you are not a member
[03:51] <davecheney> i can fix that
[03:51] <bigjools> so if you are an admin of the team you can activate a ppa for it
[03:51] <bigjools> once you do that you can't rename the team
[03:51] <bigjools> not sure I need to be a member
[03:51]  * davecheney goes to look
[03:51] <davecheney> https://launchpad.net/~juju
[03:51] <davecheney> ^ i can create a new ppa
[03:52] <bigjools> on the left mid way down, yes
[03:52] <bigjools> so I suggest making one called devel, one called staging and one called stable
[03:53] <bigjools> or perhaps s/devel/experimental/
[03:53] <bigjools> then we can test packages and copy them later to the stable ppa
[03:53] <davecheney> bigjools: sounds reasonable
[03:53] <davecheney> i'm going to be adding this ppa to the cloudinit script
[03:53] <bigjools> the forms on LP are pretty self-explanatory I hope
[03:54] <davecheney> how will having dev/stage/stable play into this
[03:54] <bigjools> yeah then it does need a production PPA that is stable :)
[03:54] <davecheney> ok, done devel and stable
[03:54] <davecheney> that'll do for the moment
[03:55] <davecheney> bigjools: also, because i'm a lp numpty, i'm not setup with all the pkg signing apparatus
[03:55] <bigjools> easy to sort
[03:55] <davecheney> ok
[03:55] <bigjools> on the cmd line just type "add-apt-repository ppa:account/ppa-name"
[03:56] <davecheney> oh yeah, i've done that side
[03:56] <bigjools> it won't work until the 1st package is in the PPA though
[03:56] <davecheney> but the publishing part will probably involve more steps
[03:56] <bigjools> packaging is tricky, yes
[03:56] <bigjools> I can help there
[03:56] <davecheney> sweet, thanks
[03:56] <bigjools> I suggest that once you have the experimental PPA up, you copy my 2.2.3 package into there
[03:57] <davecheney> ok, experimental created
[03:57] <davecheney> i copied your ppa once before
[03:57]  * davecheney looks
[03:57] <bigjools> so visit my PPA's packages page
[03:58] <bigjools> and click copy packages
[03:58] <bigjools> then your experimental PPA will be in the destination drop down
[03:58] <bigjools> copy to same series and copy source + binary
[04:00] <davecheney> https://launchpad.net/~julian-edwards/+archive/mongodb
[04:00] <davecheney> why is it always a sonofabitch to find the copy link ?
[04:00] <bigjools> click "view packages"
[04:00] <bigjools> it's on that page
[04:00] <davecheney> gotcha
[04:02] <davecheney> ok, wheels are grinding now
[04:02] <davecheney> while that is baking i shall make a branch to use it
[04:05] <bigjools> will be interesting to see why it fails
[04:05] <bigjools> or at least thumper said it fails :)
[04:05] <davecheney> bigjools: can I use copy package to change the series to quantal, etc ?
[04:05] <bigjools> davecheney: not a good idea to go backwards
[04:05] <davecheney> ok
[04:05] <bigjools> davecheney: I did a 2.2.2 for quantal
[04:05] <bigjools> it works fine
[04:06] <davecheney> that is ok
[04:06] <davecheney> once we get _one_ working mongo
[04:06] <davecheney> i can use default-series to boot an environment that matches
[04:06] <bigjools> you can promote 2.2.2 from quantal to raring
[04:06] <davecheney> bigjools: is there anyone I should talk to about using ppa's in cloudinit ?
[04:06] <bigjools> smoser
[04:07] <davecheney> of course
[04:07] <davecheney> da man
[04:31] <davecheney> bigjools: are you on juju-dev ?
[04:31] <davecheney> ML
[04:31] <bigjools> davecheney: yes I think so
[04:31] <davecheney> ok, just announced my intentions for this packaging bollocks there
[04:32] <davecheney> please comment and/or throw fruit
[04:32] <bigjools> seems ok
[04:32] <davecheney> didn't expect it to be controversial
[04:32] <bigjools> I don't understand juju well enough to comment really
[04:33] <bigjools> writing the python maas provider was enough for me :)
[04:33] <davecheney> i can give you more background
[04:33] <davecheney> but you probably don't want to know
[04:34] <davecheney> thumper: seen this test failure ?
[04:34] <davecheney> ... obtained *net.OpError = &net.OpError{Op:"write", Net:"tcp", Addr:(*net.TCPAddr)(0xc200245d50), Err:0x68} ("write tcp [::1]:46014: connection reset by peer")
[04:34] <davecheney> ... expected *rpc.ServerError = &rpc.ServerError{Message:"transformed: message", Code:"transformed: code"} ("server error: transformed: message (transformed: code)")
[04:59] <thumper> umm...
[05:00] <thumper> perhaps
[05:00] <thumper> I do still get some intermittent failures
[05:00] <davecheney> happens 100% of the time
[05:00] <davecheney> will bisect in a few mins
[05:00] <thumper> hmm
[05:01] <thumper> I ran tests just before I went out
[05:01] <thumper> and all passed
[05:01] <davecheney> trying to improve my jenkins builder to run our tests
[05:01] <thumper> let me check my trunk revision
[05:01] <davecheney> thumper: which revno
[05:01] <thumper> r1024
[05:01] <thumper> davecheney: what do you have?
[05:01] <davecheney> hmm, how can I check
[05:01] <davecheney> this branch has local changes
[05:02] <thumper> bzr revno
[05:02] <davecheney> yeah, checking 1024 now
[05:03] <davecheney> thumper: yup, totally repeatable @1024
[05:03] <davecheney> if you can confirm the failure I will log a bug about it
[05:03] <thumper> hmm... I don't get it on r1024
[05:03] <davecheney> we've seen failures like this before
[05:03] <thumper> let me triple check
[05:04] <davecheney> races between the client reading the response and the server closing the connection
[05:04] <thumper> ok  	launchpad.net/juju-core/rpc	0.186s
[05:04] <thumper> that is with r1024 of trunk
[05:15] <davecheney> thumper: can you please try GOMAXPROCS=2 go test launchpad.net/juju-core/rpc
[05:16] <thumper> davecheney: tried three times, succeeded each time
[05:16]  * thumper is trying to land the upload tools tweaks
[05:16] <davecheney> with GOMAXPROCS=2 (or any number other than 1)
[05:16] <davecheney> fair enough
[05:16] <thumper> OMG there were a lot of conflicts with trunk
[05:17] <thumper> davecheney: yes, with set to 2
[05:17] <davecheney> that is important, gets the review queue unblocked
[05:17] <thumper> exactly
[05:17] <davecheney> carry on
[05:18]  * thumper nods
[05:24] <davecheney> bigjools: https://launchpad.net/~juju/+archive/experimental/+packages
[05:24] <davecheney> looks like it worked
[05:24] <bigjools> davecheney: cool
[05:24] <bigjools> so hax0r away
[05:30] <thumper> bigjools: have you fixed mongo?
[05:30] <thumper> davecheney: branch landed
[05:30] <bigjools> rofl
[05:30]  * thumper is done
[05:31] <bigjools> thumper: it's not a packaging problem AFAICS therefore it's a juju problem
[05:32] <bigjools> or mongo.... take your pick. either way, it's not *my* problem :)
[05:32] <thumper> heh
[05:32] <bigjools> you said you saw libssl linked?
[05:32] <bigjools> perhaps 2.2.3 has something that breaks juju
[05:33] <thumper> maybe... NFI
[05:33]  * thumper goes to help make dinner
[05:36] <davecheney> LFTM, ldd $(which mongod) linux-gate.so.1 =>  (0xb76ef000) libssl.so.1.0.0 => /lib/i386-linux-gnu/libssl.so.1.0.0 (0xb7690000)
[05:36] <davecheney> mongo is stupid
[05:36] <davecheney> they require the elephant that is libboost
[05:36] <davecheney> but they only need it for the headers
[05:40] <davecheney> thumper: ok, the rpc test failure is real
[05:40] <davecheney> just verified under 1.0.3 as well
[05:41] <davecheney> it fails under my stress test script
[05:41] <davecheney> will raise a bug
[05:43] <davecheney> https://bugs.launchpad.net/juju-core/+bug/1157553
[05:43] <_mup_> Bug #1157553: rpc: tests are unreliable <juju-core:New> < https://launchpad.net/bugs/1157553 >
[08:21] <davecheney> jam1: question about juju cli for windows ?
[08:21] <jam> yes davecheney?
[08:21] <davecheney> did you get a cross compile working, or did you do it directly on win32 ?
[08:22] <jam> davecheney: just did it directly
[08:22] <davecheney> fair enough
[08:22] <davecheney> that is what I tought
[08:22] <davecheney> thought
[08:22] <davecheney> as you were
[08:22] <jam> well, I need to actually run it there anyway, right ? :)
[08:22] <davecheney> yes
[08:22] <davecheney> sorry, was answering a question from another channel
[08:22] <jam> np
[08:23] <jam> I did recompile go, but I probably could have just downloaded the 32-bit version. The big key was just needing 'gcc' for cgo.
[08:23] <jam> Though maybe I would need a "close" match between the gcc for go and the one for goyaml
[08:23] <jam> so maybe I would need to recompile go
[08:23] <jam> C usually isn't as version specific as C++
[08:24] <davecheney> i would really like to remove the cgo requirement for goyaml
[08:24] <davecheney> which is a medium sized job
[08:25] <davecheney> but might be helpful if we have to cross compile a lot in the future
[08:25] <jam> when I mentioned it in the past (1yr?) gustavo claimed that he didn't want to implement the mess that was yaml parsing.
[08:25] <davecheney> ie, I don't see lp supporting a win32 target any time in the future
[08:25] <davecheney> yeah, that is very true
[08:25] <davecheney> goyaml just wraps the python yaml c code
[08:25] <jam> right
[08:25] <davecheney> so it has the same compatibility as the python version
[08:25] <davecheney> a decent compromise
[08:26] <jam> from what kapil said, yaml parsing speed becomes a bottleneck at large scale, so having a slow parser might also be bad
[08:26] <jam> but it could be *really* useful if the API was compatible
[08:26] <jam> so you could use "A" or "B" yaml libsd
[08:26] <jam> libs
[08:26] <jam> and then we could have native go  yaml for stuff like win32
[08:27] <davecheney> i don't think yaml parsing speed is that important
[08:27] <jam> davecheney: do you know why cgo doesn't support cross-compiling? You certainly have cross-compiling gcc
[08:27] <davecheney> we don't have the topology node
[08:27] <davecheney> jam: it's not that it doesn't support it
[08:27] <davecheney> but the moving parts required especially to go from elf -> PE
[08:27] <davecheney> are substantial
[08:28] <davecheney> if you did all the compilation steps by hand, and had the right cross compiler and cross linker, as well as headers and .so's
[08:28] <davecheney> you could do it
[08:28] <davecheney> but it was just too hard to automate with the simple go tool
[08:28] <jam> sure. So for Bazaar, we just created an EC2 instance that we bundled, and then brought up when we did a release to build the next binaries.
[08:29] <jam> It wouldn't be hard to set that up again.
[08:29] <rogpeppe> mornin' all
[08:29] <jam> hi rogpeppe, hope your day has started well so far
[08:29] <davecheney> morning rogpeppe
[08:29] <rogpeppe> jam: the shower was hot :-)
[08:29] <rogpeppe> davecheney: hiya
[08:29] <davecheney> jam: i think that is a good suggestion
[08:29] <davecheney> lets see how it pans out
[08:30] <davecheney> maybe someone will kick in the resources for this machine on a native platform
[08:30] <davecheney> :P
[08:30] <jam> davecheney: so in the short-term, I have no problem doing the compiling on win32, but I imagine we'll want a more automated system so we can detect issues pre-release.
[08:31] <dimitern> morning
[08:31] <davecheney> jam: smells like a jenkins shaped problem
[08:31] <jam> hi dimitern
[08:31] <jam> davecheney: it does, indeed. Though my personal jenkins-fu is quite rusty.
[08:31] <davecheney> jam: i've setup a jenkins CI build now that i have the right i386 mongo
[08:31] <jam> I've used it, I've played with it, but not all that much.
[08:32] <jam> davecheney: you have a mongod that runs on Windows? Or is that Linux-i386?
[08:32] <davecheney> jam: jenkins is easy, imagine a hammer that has been worn down by repeatedly bashing on hard things
[08:32] <davecheney> that is jenkins
[08:32] <davecheney> jam: i386, see mail to juju-dev about backstop plan for how to get away from using a tarball for mongo
[08:33] <jam> vila from our old squad had jenkins stuff set up. I think his biggest trouble was trying to create a reproducible save state (so if needed someone else could bring it up similarly). It wasn't terrible to configure and sort of hack it together.
[08:33] <jam> davecheney: yeah, I saw the "start at a ppa"
[08:33] <jam> though IIRC the ppa build breaks the test suite.
[08:33] <jam> If you run with '-gocheck.vv' I *think* rogpeppe and I (mostly him at my request) made it so the failing mongod would at least get logged.
[08:33] <davecheney> jam: i think we fixed that
[08:36] <dimitern> setting up jenkins is painful, but at least takes less than a day, even from scratch - did it twice :)
[08:36] <rogpeppe> davecheney: i saw your TestTransformErrors failure yesterday
[08:36] <rogpeppe> davecheney: it only happens in go tip
[08:36] <rogpeppe> davecheney: and i haven't narrowed it down yet
[08:39] <davecheney> rogpeppe: it happens with 1.0.3
[08:39] <davecheney> i checked
[08:39] <rogpeppe> davecheney: orly?
[08:39] <davecheney> it's timing related
[08:39] <davecheney> rogpeppe: http://play.golang.org/p/lSPW2BO2Tk
[08:39] <davecheney> cd $PKG
[08:39] <davecheney> bash stress.bash
[08:39] <rogpeppe> davecheney: i'm sure it never happened before on go tip, so i'm surprised about 1.0.3
[08:39] <davecheney> fails in a few seconds
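The contents of stress.bash are not shown in this log. As a rough illustration of the idea only, here is a Go sketch of a stress loop that reruns a check until it first fails. The names stress, flaky and errFlaky are hypothetical, and flaky is a deterministic stand-in for an intermittent test so the example is repeatable.

```go
package main

import (
	"errors"
	"fmt"
)

// errFlaky mimics the error text seen in the failing rpc test.
var errFlaky = errors.New("connection reset by peer")

// flaky is a stand-in for an intermittent test: it fails on run 7.
func flaky(i int) error {
	if i == 7 {
		return errFlaky
	}
	return nil
}

// stress runs fn up to n times. It returns the 1-based iteration at
// which fn first failed together with the error, or (0, nil) if
// every run passed.
func stress(n int, fn func(i int) error) (int, error) {
	for i := 1; i <= n; i++ {
		if err := fn(i); err != nil {
			return i, err
		}
	}
	return 0, nil
}

func main() {
	iter, err := stress(100, flaky)
	fmt.Println("first failure at run", iter, "error:", err)
}
```

A real timing-dependent failure will not recur at a fixed iteration; the loop only raises the odds of hitting it within a short run.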
[08:40] <dimitern> fwereade: https://codereview.appspot.com/7497047/ please?
[08:46] <rogpeppe> davecheney: ah, i've managed to reproduce the problem on 1.0.2 (and i also saw a log-target-race related panic)
[08:47] <davecheney> it's less prevalent under the old 1.0.x scheduler
[08:47] <davecheney> but the race is there
[08:48] <rogpeppe> davecheney: yeah (this time i saw the issue in a different test, BTW, but still "connection reset by peer")
[08:49] <rogpeppe> davecheney: (when reading a response, but writing the request though - it might be a different issue but i suspect not)
[08:49] <rogpeppe> s/but/not/
[08:49] <davecheney> rogpeppe: indeed, the failure modes are always subtly different but it appears to be the client and the server racing to read the final message
[08:49] <davecheney> rogpeppe: didn't dimeter fix a very similar bug in Atlanta ?
[08:49] <fwereade> dimitern, sorry, on it right now
[08:49] <dimitern> fwereade: cheers!
[08:50] <rogpeppe> davecheney: ah, i *think* i might see the problem
[08:51] <dimitern> how can i subscribe to canonical-juju ?
[08:51] <davecheney> dimitern: how much cash do you have ?
[08:51] <dimitern> davecheney: :D
[08:53] <davecheney> dimitern: https://lists.canonical.com/mailman/listinfo/canonical-juju
[08:53] <rogpeppe> davecheney: yup, fixed it, i think
[08:53] <dimitern> davecheney: as a BG rapper sang once "i own tens of bucks"
[08:53] <dimitern> davecheney: cheers
[08:56] <rogpeppe> davecheney: https://codereview.appspot.com/7919043
[08:57]  * davecheney looks
[08:58] <davecheney> rogpeppe: that'll do it every time
[08:58] <rogpeppe> davecheney: yup :-)
[08:58] <davecheney> i'm surprised that ever work
[08:58] <davecheney> worked
[08:58] <rogpeppe> davecheney: yup.
[08:58] <davecheney> what is the history on that file
[08:59] <davecheney> i have a worrying feeling that we tried to 'fix' it before
[08:59] <davecheney> (poorly)
[08:59] <rogpeppe> davecheney: i guess it was almost always running the new goroutine first
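The change in the linked review is not shown in this log, so the following is not that fix. It is a generic Go sketch, under that assumption, of the class of bug being described: code that only works when a newly started goroutine happens to run first. The worker type and its fields are hypothetical. Closing the ready channel makes the ordering explicit instead of leaving it to the scheduler.

```go
package main

import "fmt"

// worker is a toy type whose setup happens in a new goroutine.
type worker struct {
	ready chan struct{}
	name  string // written by the goroutine before ready is closed
}

// start launches the setup goroutine and returns immediately.
func start() *worker {
	w := &worker{ready: make(chan struct{})}
	go func() {
		w.name = "rpc-conn" // setup work done in the new goroutine
		close(w.ready)      // publish: setup is complete
	}()
	return w
}

// Name waits for setup to finish before reading name. Without the
// receive, the result would depend on which goroutine ran first.
func (w *worker) Name() string {
	<-w.ready
	return w.name
}

func main() {
	w := start()
	fmt.Println(w.Name())
}
```

The close of a channel is ordered before a receive that returns because the channel is closed, so the read of name is safe regardless of GOMAXPROCS.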
[08:59] <rogpeppe> davecheney: i'm not sure. i don't remember doing so
[09:00] <davecheney> i have a vague memory from atlanta
[09:01] <rogpeppe> davecheney: i might've missed it, being involved with gui stuff
[09:01]  * davecheney checks
[09:01] <davecheney> the reason i mention it is we had a very similar bug that week
[09:04] <rogpeppe> davecheney: well, there are no fixes of that kind in the rpc package at any rate
[09:04] <rogpeppe> davecheney: this produces only ian's recent changes to log:  for(i in *.go) {echo $i; bzr blame $i | grep -v '^ +\|' | grep -v roger.p }
[09:05] <davecheney> someone showed me
[09:05] <davecheney> bzr log --show-diffs
[09:05] <davecheney> but I can't make it work
[09:05] <davecheney> what am I doing wrong ?
[09:05] <rogpeppe> davecheney: i just checked that no one other than me has made any changes to the files in rpc. :-)
[09:06] <davecheney> maybe it was a similar error in another package
[09:06] <davecheney> screw it
[09:06] <davecheney> get one more LGTM and job done
[09:07] <davecheney> yup, i'm happy we are not reverting another fix
[09:11] <fwereade> dimitern, qualified LGTM, ping me if there's any uncertainty
[09:12] <dimitern> fwereade: tyvm
[09:14] <dimitern> fwereade: no, i'm not getting the thing about using the wrong charm url in unit.go, please explain
[09:15] <fwereade> dimitern, the service config we care about from the perspective of the unit is the config that is known to match the *unit*'s charm url
[09:15]  * TheMue likes failing tests *hmpf*
[09:16] <dimitern> fwereade: exactly what i'm doing isn't it?
[09:16] <dimitern> fwereade: getting u.doc.CharmURL + u.doc.Service for the key
[09:16] <dimitern> (well, vice versa, but the code is correct i think)
[09:17] <fwereade> dimitern, looking up the service's charm url is defeating the point of doing all the work in the first place
[09:17] <fwereade> dimitern, not entirely ofc, because you do the right thing in watcher.go
[09:17] <fwereade> dimitern, when watching the unit's service config
[09:17] <dimitern> fwereade: are you talking about WatchServiceConfig() ?
[09:17] <fwereade> dimitern, WatchServiceConfig is Correct
[09:17] <fwereade> dimitern, ServiceConfig is not
[09:17] <dimitern> fwereade: aah, got you :)
[09:18] <fwereade> dimitern, they should be using the same document
[09:18] <dimitern> fwereade: right, sorry - i'll fix it
[09:18] <fwereade> dimitern, np, I didn't catch that first time round anyway
[09:19] <fwereade> dimitern, but, sorry, wait a mo
[09:20] <fwereade> dimitern, no, we're good
[09:20] <fwereade> dimitern, thanks :)
[09:21] <dimitern> fwereade: i'm a bit divided what comments to put around the ops
[09:21] <fwereade> dimitern, which ones? :)
[09:21] <dimitern> fwereade: this case you described about inc ops with a single upgraded unit in error state getting reverted
[09:22] <dimitern> fwereade: should we document this somehow, or make the "first use" comment more correct
[09:23] <fwereade> dimitern, I think that the explanatory line should be either made strictly accurate, or maybe even dropped; as it is, it clearly contradicts the preceding if/else chain
[09:24] <fwereade> dimitern, I'm not sure what explanatory power a strictly accurate version has though
[09:24] <dimitern> fwereade: ok, i'm dropping it :) can't be bothered to wrap my mind around commenting all cases in a sentence
[09:24] <fwereade> dimitern, how about "The document might need to be created"?
[09:24] <fwereade> dimitern, I think that is accurate and high-level helpful
[09:25] <fwereade> dimitern, "A new settings document might need to be created"?
[09:25] <dimitern> fwereade: that's like saying the water is wet
[09:25] <dimitern> fwereade: we're creating it anyway
[09:26] <fwereade> dimitern, ha, sorry, "a new refcount document might need to be created"?
[09:26] <dimitern> fwereade: hmm... let me think a bit
[09:27] <dimitern> fwereade: what we're actually doing there is either creating a refcount=1 or just refcount++
[09:27] <fwereade> dimitern, honestly, it is documented anyway in settingsIncRefOp
[09:28] <dimitern> fwereade: yeah, i think it makes sense, although not saying when we might do it... ah right - just take a look at the settingsIncRefOp
[09:28] <fwereade> dimitern, on service creation we just blat it out directly because we don't need the complexity of an incref, but in this case we need it so we use it
[09:28] <dimitern> fwereade: ok, i'll change both inc and dec comments so
[09:29] <fwereade> dimitern, maybe the comments are best couched in terms of "the unit adds a reference to the new settings doc" and "the service drops its reference to its old settings doc"?
[09:29] <dimitern> fwereade: even better! 10x
[09:30] <fwereade> dimitern, cool
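A small in-memory Go sketch of the reference counting described above: a reference is either created at 1 or incremented, and the entry goes away when the last reference is dropped. This is not the real settingsIncRefOp, which builds transaction operations against mongo; the refCounts type and the key strings are hypothetical.

```go
package main

import "fmt"

// refCounts maps a settings key to the number of references on it.
type refCounts map[string]int

// incRef creates the entry with a count of 1, or increments it.
func (r refCounts) incRef(key string) {
	r[key]++
}

// decRef drops one reference and deletes the entry when the count
// reaches zero. It reports whether the entry was removed.
func (r refCounts) decRef(key string) (bool, error) {
	n, ok := r[key]
	if !ok {
		return false, fmt.Errorf("no references to %q", key)
	}
	if n == 1 {
		delete(r, key)
		return true, nil
	}
	r[key] = n - 1
	return false, nil
}

func main() {
	refs := refCounts{}
	refs.incRef("svc/charm-1") // service created with charm-1
	refs.incRef("svc/charm-1") // a unit starts using charm-1
	refs.incRef("svc/charm-2") // service adds a reference to its new settings
	refs.decRef("svc/charm-1") // service drops its reference to its old settings
	refs.incRef("svc/charm-2") // the unit adds a reference to the new settings
	removed, err := refs.decRef("svc/charm-1") // the unit drops the old one
	fmt.Println(removed, err, refs)
}
```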
[09:32] <TheMue> Ha, killed the next failure. *g*
[09:32]  * fwereade cheers at TheMue
[09:34] <TheMue> fwereade: After changing the hard coded references to $HOME/.juju there are now still some hidden glitches in the tests. ;)
[09:34] <fwereade> TheMue, yeah, no doubt -- good luck with them
[09:34] <TheMue> fwereade: thx
[09:44] <dimitern> the update manager this morning suggested to do a "partial distribution upgrade" - never seen this before, i did it and /etc/issue still reports 12.10 - anyone seen this?
[09:46] <dimitern> fwereade: done with the changes - would you like one last check (mostly for wording in comments) before I submit?  https://codereview.appspot.com/7497047/
[09:59] <fwereade> dimitern, sure
[10:22] <davecheney> good night gentlemen
[10:22] <davecheney> don't forget, release on friday
[10:23] <davecheney> i'll do the release notes dance tomorrow
[10:23] <dimitern> davecheney: good night!
[10:23] <davecheney> night y'all
[10:23] <jam> night davecheney
[10:24] <jam> fwereade: so we had a question come up recently about Series vs Arch. Specifically, one seems to be a conf parameter, and one is meant to be a --constraint (from what I could tell).
[10:24] <jam> I noticed because both are a bit borked on Windows
[10:24] <jam> where it wants <unknown> i386, and we don't have those charms available :)
[10:25] <jam> I can change the code so that if series is unknown, it returns the default series from conf, but where do I overload the arch? (--constraint was the recommended method from mgz, IIRC)
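A minimal Go sketch of the fallback jam describes, assuming the detected series is reported as an empty string or a placeholder when it cannot be determined. The function name seriesOrDefault and the unknownSeries constant are hypothetical, not juju-core identifiers.

```go
package main

import "fmt"

// unknownSeries is an assumed placeholder for an undetectable series.
const unknownSeries = "unknown"

// seriesOrDefault returns the detected series, or the configured
// default when detection yielded nothing usable.
func seriesOrDefault(detected, defaultSeries string) string {
	if detected == "" || detected == unknownSeries {
		return defaultSeries
	}
	return detected
}

func main() {
	fmt.Println(seriesOrDefault("unknown", "precise"))
	fmt.Println(seriesOrDefault("quantal", "precise"))
}
```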
[10:26] <jam> mgz, dimitern: can either of you try bootstrapping on HP? I'm getting weird inability to write to swift, but I don't know if it is something about me.
[10:26] <jam> I would ask wallyworld, but he is obviously away right now.
[10:26] <fwereade> jam, heyhey
[10:27] <fwereade> jam, usage of version.Current.Series will hopefully be evaporating imminently: tim's in the process of clearing up the environs api to accept series instead of tools
[10:27] <jam> also, fwereade, https://code.launchpad.net/~fwereade/juju-core/bootstrap-constraints-3a/+merge/153725 landed, so I think your "EC2 should choose" kanban card can go into merged?
[10:28] <jam> fwereade: yeah, I was hoping I wouldn't have to do much about series, and could let Tim do the work for me there.
[10:28] <jam> But Arch is also important in this context, and it was unclear how that happens.
[10:28] <jam> At the very least, it seemed both Arch and Series should be configured in the same manner
[10:28] <jam> (environments.yaml, or --constraints, but not both)
[10:29] <jam> well, even both, but not one one way and the other differently
[10:29] <mgz> jam: can check
[10:29] <jam> thanks mgz, good morning, btw
[10:29] <jam> I hope your Muselix is tasty today
[10:29] <fwereade> jam, I don't consider my card really done until I've merged -6 -- it was originally meant to include that too
[10:29] <fwereade> jam, but I am working on merging it all today
[10:29] <jam> k
[10:29] <fwereade> jam, wrt series and arch I think matters are a little more subtle
[10:30] <mgz> eaten long ago, the fun of getting on the 7:45 bus...
[10:30] <jam> mgz: I thought colocation was Thurs?
[10:30] <mgz> at least I have a net connection up to doing audio+ssh today, not sure what was up with the crazy packet loss at home yesterday
[10:30] <jam> yeah, that got pretty crazy at the end.
[10:30] <mgz> it is, I'm casually in town
[10:31] <jam> I could hear you fine, but your ssh lag was pretty bad
[10:31] <mgz> and irc stayed up...
[10:31] <mgz> but incoming audio was being mangled, and screen was paaainful
[10:32] <jam> mgz: so how much to get Fi-OS to your house? :0
[10:32] <mgz> well, there have been promises from the local authority...
[10:32] <jam> we promise to make you pay through the nose
[10:32] <jam> to get 1Mbit
[10:32] <fwereade> jam, would you like a quick hangout?
[10:32] <jam> I guess T3 was 1.5Mbit?
[10:32] <jam> fwereade: sure
[10:33] <jam> fwereade: you want to set it up or me?
[10:33] <mgz> it's mostly a service reliability and cost thing these days rather than speed, which is actually pretty reasonable
[10:33] <fwereade> jam, I was getting water but will do so now
[10:33] <jam> np
[10:38] <dimitern> fwereade: ping
[10:41] <fwereade> dimitern, pong
[10:41] <dimitern> fwereade: does it look ok?
[10:41] <fwereade> dimitern, dammit sorry
[10:42] <dimitern> fwereade: np, i'm already half through the upgrade-charm cmd anyway :)
[10:44] <rogpeppe> dimitern: one last question on your review https://codereview.appspot.com/7497047/
[10:44] <dimitern> rogpeppe: just answered :)
[10:45] <rogpeppe> dimitern: hmm, but settingsRefsDoc doesn't *have* an id field. that was what was confusing me.
[10:45] <dimitern> rogpeppe: it has an implied id
[10:46] <dimitern> rogpeppe: if not specified mongo creates one, afaik
[10:46] <rogpeppe> dimitern: oh, i see. we're relying on a mongo-created _id?
[10:48] <dimitern> rogpeppe: yeah
[10:49] <rogpeppe> dimitern: so how do we find out that key to increment the ref count?
[10:49] <dimitern> rogpeppe: we use the serviceSettingsKey() to construct the key, and then Find/FindId
[10:50] <rogpeppe> dimitern: oh, i see, we do specify the id when creating the document
[10:50] <dimitern> rogpeppe: exactly
[10:50] <rogpeppe> dimitern: we just don't have it in the doc
[10:50] <rogpeppe> dimitern: hmm, i think that's a bit confusing. all the other docs contain their id fields.
[10:50] <dimitern> rogpeppe: yeah, it seemed a bit weird at first, but fwereade explained it and it made sense
[10:51] <fwereade> dimitern, rogpeppe: no, we specify the key in the txn.Op
[10:51] <rogpeppe> fwereade: yes, i saw that. but don't all the other doc types have the key in the struct too?
[10:52] <dimitern> rogpeppe: well, saves a bit of typing: settingsRefsDoc{1} instead of {Id:x, RefCount:1}
[10:52] <fwereade> dimitern, rogpeppe: it ends up cleaner imo, because then we can just use the doc directly in update operations without mgo whining about _id not being sane to update
[10:52] <fwereade> rogpeppe, and since it's not actually required anywhere else, meh, why include it?
[10:52] <rogpeppe> fwereade: ok. maybe a comment next to the settingsRefDoc would be appropriate then
[10:53] <fwereade> rogpeppe, +1 to that
[10:53] <dimitern> rogpeppe: i'll add a comment
[10:53] <rogpeppe> fwereade: because it wasn't obvious to me at any rate
[10:53] <mgz> jam: bootstrap on hp worked btw, want to compare config?
[10:53] <fwereade> rogpeppe, yeah, good observation
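A Go sketch of the design being discussed: the document struct carries only the count, and the id travels with the operation that writes it. The op type here is a local, hypothetical stand-in for a transaction operation, the collection name is made up, and the key format in settingsKey is an assumption since the real serviceSettingsKey output is not shown in this log.

```go
package main

import "fmt"

// settingsRefsDoc deliberately has no id field; the id is supplied
// by the operation that inserts or updates the document.
type settingsRefsDoc struct {
	RefCount int
}

// op is a hypothetical stand-in for a transaction operation that
// keeps the document id separate from the document body.
type op struct {
	Collection string
	Id         string
	Insert     interface{}
}

// settingsKey builds an illustrative key from service and charm URL.
func settingsKey(service, charmURL string) string {
	return service + "#" + charmURL
}

// newRefsOp returns an op inserting a fresh refcount of 1.
func newRefsOp(service, charmURL string) op {
	return op{
		Collection: "settingsrefs",
		Id:         settingsKey(service, charmURL),
		Insert:     settingsRefsDoc{1},
	}
}

func main() {
	o := newRefsOp("wordpress", "cs:precise/wordpress-1")
	fmt.Printf("%+v\n", o)
}
```

Keeping the id out of the struct means the same value can serve as an update body without touching the id, which matches the reasoning given above; the cost is that the struct alone does not show how the document is found, hence the request for a comment.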
[10:54] <jam> mgz:  my hp config: https://pastebin.canonical.com/87223/
[10:55] <mgz> basically, pulled juju-core trunk, `go install ./...`, sourced creds, `~/go/bin/juju bootstrap --upload-tools` and `~/go/bin/juju status`
[10:55] <dimitern> fwereade, rogpeppe: thanks for the reviews. i'll include your suggestions and submit it shortly
[10:55] <fwereade> dimitern, cool, thanks
[10:57] <jam> mgz: and the failure: https://pastebin.canonical.com/87224/
[10:57] <jam> I can't --upload-tools on Windows
[10:57] <jam> but that isn't where it is failing
[10:58] <mgz> jam: https://pastebin.canonical.com/87225/
[10:58] <mgz> jam: check your OS_REGION_NAME
[10:59] <jam> mgz: export OS_REGION_NAME="az-3.region-a.geo-1"
[10:59] <jam> I checked that
[10:59] <jam> and the URL for the swift service looks right
[10:59] <jam> it is going to: https://region-a.geo-1.objects.hpcloudsvc.com/v1/AUTH_3...
[11:00] <jam> but it is getting an EOF
[11:00] <jam> which if I then 'swift list' after changing OS_REGION_NAME, I don't see the bucket mentioned in the bootstrap debug
[11:00] <jam> while I *do* see your hpgztesting bucket
[11:02] <mgz> I'll try changing the bucket name to see if creation is borked somehow, I would have had that bucket before
[11:05] <jam> mgz: I would have thought 'destroy-environment' would delete the bucket.
[11:06] <mgz> nope, that works as well
[11:06] <mgz> jam: you'd have thought...
[11:07] <jam> mgz: (that can also be tested :)
[11:07] <jam> mgz: I don't see your old bucket, and I do see a "hpgztesting-new" bucket
[11:09] <mgz> $ swift list
[11:09] <mgz> No handlers could be found for logger "keystoneclient.v2_0.client"
[11:09] <mgz> Endpoint for object-store not found - have you specified a region?
[11:09] <mgz> ...thanks python-swiftclient
[11:09] <mgz> ...is that actually the bucket name?
[11:10] <mgz> ah, no, it should have my hex bit as well
[11:14] <jam> mgz: you have to change OS_REGION_NAME for swift list
[11:14] <jam> because it doesn't match the nova region
[11:17] <mgz> ah, ta
[11:21] <jam> (note that you have to use the regular one for 'nova list' but you have to change it for 'swift list'... sign)
[11:21] <jam> sigh
[11:22] <mgz> jam: so yeah, sorry, works for me, and nothing's obviously wrong with your setup in comparison
[11:25] <dimitern> rogpeppe, fwereade: is it ok for both of you if I just drop the last sentence of settingsRefsDoc comment (the one about the last unit upgrading should drop the old settings doc) and leave the rest? that is already mentioned in the service.go inline
[11:26] <fwereade> dimitern, I think I'd still favour dropping all but the first, for the same reasons, but I think I should defer judgment to roger who's in a better position to know how the code looks when it's unfamiliar
[11:27] <dimitern> fwereade: ok, so to make peace, I'll leave it then :)
[11:28] <fwereade> dimitern, what? this means WAR!
[11:28] <fwereade> dimitern, (no, that's fine ;))
[11:28] <dimitern> fwereade: :D
[11:28] <jam> WAR? This means PEACE!
[11:28] <jam> wait...
[11:28] <rogpeppe> fwereade: is there anywhere else that has a more coherent comment about the overall strategy for managing settings docs?
[11:28] <fwereade> rogpeppe, there is not
[11:28] <dimitern> rogpeppe: not in one place, no
[11:29] <fwereade> rogpeppe, it would be a very good thing for us to write such a document, but in my personal queue it comes behind a how-to-write-transactions document
[11:29] <rogpeppe> fwereade: in which case, i think it's good to have that comment there (and to perhaps fix it if it's not accurate), rather than inferring the strategy from scattered comments
[11:29] <jam> wallyworld_: I didn't mean to scare you away, please come back :)
[11:29] <fwereade> rogpeppe, yep, +1
[11:29] <wallyworld_> jam: stupid mumble is freezing :-(
[11:29] <rogpeppe>  fwereade: i really think it's nice to have comments in the code (perhaps in addition to other external docs)
[11:30] <rogpeppe> fwereade: or at the least a pointer to the external doc within the code.
[11:30] <fwereade> dimitern, for pedantry's sake would you mention that the service decrefs the settings doc for its old url on change please?
[11:30] <dimitern> fwereade: sure, +1
[11:31] <rogpeppe> fwereade: when the service config settings change?
[11:31] <fwereade> rogpeppe, no, when the service's charm url changes
[11:31] <rogpeppe> fwereade: how is that different from upgrading?
[11:32] <fwereade> rogpeppe, it's not, but the decref is not mentioned -- only the incref to the new one
[11:32] <dimitern> rogpeppe, fwereade: there it is - take it or leave it :) http://paste.ubuntu.com/5630902/
[11:32] <rogpeppe> fwereade: "
[11:32] <rogpeppe> When a unit upgrades to the new charm,
[11:32] <rogpeppe>  717 // the old service settings ref count is decremented
[11:32] <rogpeppe> "
[11:32] <rogpeppe> isn't that what you're talking about?
[11:33] <fwereade> rogpeppe, ok, I would appear to be lacking in perception
[11:33] <fwereade> dimitern, go for it
[11:33] <rogpeppe> fwereade: ah, i thought i was missing something crucial :-)
[11:33] <dimitern> cheers
[11:33] <jam> wallyworld_: no love for mumble today?
[11:34] <wallyworld_> trying
[11:36] <wallyworld_> jam: mumble won't connect, wanna do a hangout?
[11:37] <jam> wallyworld_: unfortunately, mgz is on his chromebook, and there are no binaries for G+ on it... :( (but I can hangout with you and proxy for martin)
[11:37] <jam> wallyworld_: can you hear us?
[11:37] <jam> I see you in mumble
[11:37] <wallyworld_> no :-(
[11:37] <jam> if you can hear, then you can "speak" in the channel maybe?
[11:37] <wallyworld_> mumble locks the cpu
[11:37] <jam> joy joy... :(
[11:38] <wallyworld_> i guess we can start a hangout
[11:39] <jam> wallyworld_: https://plus.google.com/hangouts/_/261ed219337d5a6c8fb184b90b243b7a06b746f3?authuser=2&hl=en
[11:39] <jam> dimitern: mgz ^^
[11:39] <mgz> will be there shortly
[11:40] <wallyworld_> jam: can you hear me?
[11:40] <jam> wallyworld_: we can hear you
[11:40] <jam> you can't hear us now?
[11:43] <jam> mgz: no audio
[12:14] <TheMue> Ahhh! *jump* *jump* *jump* It seems I've found it.
[12:20] <TheMue> *tschakka* Yeah, got it. *big-smile*
[12:24]  * fwereade_ cheers at TheMue again
[12:24] <fwereade_> rogpeppe, does http://paste.ubuntu.com/5631012/ look like something you've seen before?
[12:24] <jam> mgz: it might have been the tools that are in the bucket on hp-cloud (in the shared account). When I added public-bucket-url, it seems to be working...
[12:25] <jam> (maybe we have a bug if you download from swift and then try to upload it)
[12:25] <jam> does swift do anything with uploading identical content?
[12:25] <jam> like tell you "I don't need you to send any bytes because you already have a file matching that sha1" ?
[12:25] <rogpeppe> fwereade_: no - AddRelation is new in the api. it's almost certainly something to do with the way that the AddRelation operation is being undone (or not...)
[12:26] <rogpeppe> fwereade_: at least, i *think* AddRelation is new
[12:27] <fwereade_> rogpeppe, yeah, that's the latest commit
[12:31] <fwereade_> teknico, I'm seeing a consistent failure in trunk: http://paste.ubuntu.com/5631012/
[12:32] <teknico> fwereade_, what did I break? looking
[12:32] <dimitern> fwereade_: i have the same issue
[12:32] <dimitern> just run the tests
[12:32] <dimitern> on trunk
[12:33] <fwereade_> teknico, if it's not happening for you, that's not immediately clear :)
[12:34] <dimitern> fwereade_: could it be because of the go version? i'm on 1.0.3
[12:34] <fwereade_> dimitern, I'm 1.0.2
[12:35] <fwereade_> teknico, assuming it's not so trivial as to take only a couple of minutes, would you back it out please? I'd be happy to help investigate on another branch if it turns out to be something tricky
[12:37] <fwereade_> teknico_, what was the last thing you saw?
[12:37] <teknico_> fwereade_, my own question at 13:33:55
[12:38] <fwereade_> teknico_, I saw nothing from you before 13:32
[12:38] <fwereade_> teknico, I have http://paste.ubuntu.com/5631037/
[12:40] <teknico> fwereade_, thanks, here's what I said:
[12:40] <teknico> [13:33:55] <teknico> fwereade_, is it the test on line 1952? it did pass here before I submitted
[12:40] <teknico> checking right now
[12:46] <fwereade_> rogpeppe, fwiw it is not totally wonderful that a test failure seems to cascade into a panic; would it be simple to address this?
[12:47] <teknico> fwereade_, yes, it's happening here too, I'm not clear on why, so I'm reverting the whole branch
[12:48] <fwereade_> teknico, no worries, let me know when it's reverted
[12:48] <rogpeppe> fwereade_: it's because opClientAddRelation returns a nil function pointer. i can't see how that could ever have worked
[12:48] <rogpeppe> fwereade_: i don't believe the tests were run before submitting
[12:48] <fwereade_> rogpeppe, ah, jolly good
[12:49] <fwereade_> rogpeppe, thanks
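A minimal sketch of the failure mode rogpeppe describes, assuming the panic came from calling a nil function value. The names `undo`, `opWithoutUndo` and `runOp` are hypothetical and are not taken from the juju-core test suite; the point is only that a nil check turns the panic into an ordinary, reportable error.

```go
package main

import (
	"errors"
	"fmt"
)

// undo is a hypothetical cleanup callback returned by an operation.
type undo func()

// opWithoutUndo is a hypothetical operation that forgets to return
// an undo function, so its caller receives a nil function value.
func opWithoutUndo() (undo, error) {
	return nil, nil
}

// runOp calls op and then the undo function it returned. Calling a nil
// function value panics in Go, so a nil result is reported as an error
// instead of being invoked.
func runOp(op func() (undo, error)) error {
	u, err := op()
	if err != nil {
		return err
	}
	if u == nil {
		return errors.New("operation returned a nil undo function")
	}
	u()
	return nil
}

func main() {
	fmt.Println(runOp(opWithoutUndo))
}
```

With a guard like this, one broken operation shows up as a single failed check rather than taking the whole test binary down.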
[12:51] <benji> fwereade_: I am about to submit this branch https://codereview.appspot.com/7600044/ and it was suggested that you might want to know it is coming because it may cause merge conflicts with your current work
[12:54] <teknico> fwereade_, reverting branch is lp:~teknico/juju-core/revert-add-relation-commit
[12:54] <teknico> fwereade_, shall I directly "lbox submit", or do I still need to "lbox propose"?
[12:55] <fwereade_> teknico, I think you need to do both
[12:59] <teknico> fwereade_, https://codereview.appspot.com/7719050 , wanna have a look or shall I just submit it?
[13:00] <fwereade_> teknico, LGTMed on trust for form's sake
[13:01] <TheMue> rogpeppe: ping
[13:01] <rogpeppe> TheMue: pong
[13:01] <dimitern> benji: please wait before submitting - trunk is currently broken
[13:01] <TheMue> rogpeppe: just updated and made a test ./...
[13:01] <benji> dimitern: thanks for the heads-up
[13:02] <TheMue> rogpeppe: now I've got this http://paste.ubuntu.com/5631081/
[13:02] <TheMue> rogpeppe: any idea?
[13:03] <dimitern> TheMue: that's the problem we have in trunk, pending a revert on last commit
[13:03] <rogpeppe> TheMue: yes, the recently merged branch broke trunk
[13:04] <TheMue> rogpeppe: ah, thx, so it's well known
[13:04] <teknico> fwereade_, landed, sorry about that
[13:04] <teknico> and everyone too :-)
[13:04] <fwereade_> teknico, np, I'm pretty sure we've all done it, thanks for cleaning it up swiftly
[13:05] <dimitern> teknico: it happened to me recently, i know the feeling
[13:15] <rogpeppe> fwereade_: i replied to the "Synchronizing: watchers, GUI, and API" thread BTW. you might want to have a look and weigh in.
[13:17] <fwereade_> rogpeppe, looks sane to me -- I don't feel any particular need to add to it, because I've shunted those details off into my mental "oakland" bucket
[13:17] <rogpeppe> fwereade_: cool, thanks
[13:18]  * rogpeppe goes for some lunch
[13:51] <rogpeppe> back
[13:58] <niemeyer> Greetings al
[13:58] <niemeyer> l
[13:58] <niemeyer> mthaddon: Heya
[13:58] <niemeyer> mthaddon: Can we do a quick store deploy today?
[14:00] <mthaddon> niemeyer: sure, anyone from webops can do that - can you file an RT with the details?
[14:00] <niemeyer> mthaddon: Sure
[14:00] <mthaddon> thx
[14:00] <niemeyer> mthaddon: Doing that right now
[14:24] <niemeyer> mthaddon: #60244 is up
[14:24] <_mup_> Bug #60244: install problem <ubiquity (Ubuntu):New> < https://launchpad.net/bugs/60244 >
[14:24] <niemeyer> _mup_: no no no
[14:25] <mthaddon> thx
[14:25] <niemeyer> mthaddon: Thank you!
[14:30] <mthaddon> niemeyer: https://pastebin.canonical.com/87252/ - has the branch location changed?
[14:31] <niemeyer> mthaddon: Hmm
[14:31] <niemeyer> mthaddon: It has, kind of
[14:32] <niemeyer> mthaddon: It's still at lp:juju-core
[14:32] <niemeyer> mthaddon: But the trunk owner has changed
[14:32] <niemeyer> mthaddon: To ~juju
[14:32] <niemeyer> mthaddon: If you had lp:juju-core, it should theoretically just work
[14:33] <niemeyer> mthaddon: If you must use the explicit reference for some reason, then lp:~juju/juju-core/trunk should do
[14:33] <mthaddon> bzr remember resolves the full URL I believe, so I'm not sure if that's an option - will adjust manually
[14:34] <mthaddon> ok, happier now
[14:34] <niemeyer> sweet
[14:35] <mthaddon> niemeyer: deployed
[14:35] <niemeyer> woot, thanks!
[15:22] <rogpeppe> dimitern: i just saw this in trunk; any ideas?: http://paste.ubuntu.com/5631407/
[15:22] <rogpeppe> dimitern: hmm, maybe not trunk actually, but i'm pretty sure the branch hasn't got a problem there.
[15:22] <dimitern> rogpeppe: hmm, haven't seen this
[15:23] <dimitern> rogpeppe: let me look closer
[15:24] <dimitern> rogpeppe: does it happen every time?
[15:24] <rogpeppe> dimitern: i'll let you know when this test finishes
[15:25] <dimitern> rogpeppe: if it appears hung, just kill it - this happens sometimes on failure in the uniter tests
[15:25] <rogpeppe> dimitern: it just passed
[15:25] <rogpeppe> dimitern: i think it may just be a timeout that's too short
[15:25] <dimitern> rogpeppe: so it's intermittent
[15:26] <rogpeppe> (i hate those friggin' arbitrary timeouts)
[15:26] <frankban> rogpeppe, dimitern: that seems similar to the 8 failures in trunk I mentioned yesterday, I was able to make the suite pass by increasing timeouts in uniter_test
[15:26] <rogpeppe> frankban: yes, that's what i was thinking
[15:27] <dimitern> rogpeppe: no this seems something else
[15:27] <dimitern> rogpeppe: the unit's charm never gets installed for some reason
[15:27] <dimitern> fwereade_: ideas? ^^
[15:33] <dimitern> frankban: what were these failures and what did you change?
[15:39] <fwereade_> dimitern, rogpeppe: looking
[15:41] <rogpeppe> fwereade_: the most significant difference i can see is that the broken version goes into "awaiting error resolution for "start hook"" after "loading uniter state" where the ok version says "charm is not deployed"
[15:41] <fwereade_> rogpeppe, yeah, I am staring at that in complete bafflement
[15:42] <frankban> dimitern: I still see these failures in trunk -> http://pastebin.ubuntu.com/5631469/
[15:43] <dimitern> frankban: at trunk tip?
[15:43] <frankban> dimitern: tests passed when I locally increased timeouts in uniter_test.go (from 5 to 20 seconds and from 50 to 200 milliseconds)
[15:43] <frankban> dimitern: revno 1034
[15:43] <dimitern> frankban: hmm... i'm getting trunk and running tests now
[15:45] <dimitern> frankban: have you seen these before? r1034 is gustavo's store CL - not related to the uniter
[15:46] <frankban> dimitern: yes, 2 days ago, 1034 is just the revno of my last test run in trunk
[15:46] <frankban> dimitern: I think it was 1013.
[15:48] <rogpeppe> fwereade_: can i run something by you please?
[15:48] <fwereade_> rogpeppe, sure
[15:49] <dimitern> frankban: that's really strange - i run all tests on trunk at least 10 times a day and haven't seen these (apart from temporary failures when I was working on the uniter itself before proposing)
[15:49] <rogpeppe> fwereade_: i'm looking at line 1353 on https://codereview.appspot.com/7598043/diff2/31001:88001/state/state_test.go
[15:49] <rogpeppe> fwereade_: and wondering why we're calling Initialize on the state when all we want is to write the environ config
[15:49] <rogpeppe> fwereade_: there are quite a few other places that do that too. and Initialize is not in fact called in the usual case in tests!
[15:50] <rogpeppe> fwereade_: (AFAICS)
[15:51] <rogpeppe> fwereade_: i'm thinking that we should always call state.Initialize, apart from the few places we actually want to test Initialize.
[15:51] <fwereade_> rogpeppe, +1 to that
[15:51] <dimitern> frankban: all tests pass for me @1036 - what's your go version?
[15:52] <rogpeppe> fwereade_: i'll leave frankban's changes to go in and then we'll fix it in a later branch
[15:52] <fwereade_> rogpeppe, cool, thanks
[15:52] <frankban> rogpeppe: cool
[15:52] <frankban> dimitern: 1.0.2
[15:53] <frankban> dimitern: it seems that my wall clock is faster than yours, i.e. for some reason, my local machine is slower
[15:53] <dimitern> frankban: i'm on 1.0.3, but this shouldn't be the problem really.. are you using HDD or SSD drive?
[15:53] <frankban> dimitern: hybrid drive
[15:54] <fwereade_> rogpeppe, so, the only scenario I can come up with when looking at the code in trunk is that there's some weird stale uniter state somewhere
[15:54] <dimitern> frankban: which timeouts in uniter_test you needed to change?
[15:54] <fwereade_> rogpeppe, but I don't see how that's possible either
[15:54] <dimitern> frankban: i'm on ssd, but i think this is not the most common in the team, so also probably not an issue
[15:55] <frankban> dimitern: I changed all, just to check that time was the problem there
[15:55] <rogpeppe> fwereade_: i haven't looked at the code yet, i'm afraid. i'm just running all tests again to see if it happens again.
[15:55] <rogpeppe> frankban: LGTM
[15:55] <frankban> dimitern: I think Makyo reproduced those 8 failures too, in his local env
[15:55] <frankban> rogpeppe: \o/
[15:56] <rogpeppe> frankban: thanks a lot for your patience with this branch!
[15:56] <Makyo> frankban, dimitern.  Yes.  I have a log saved, I think.
[15:56] <dimitern> frankban: with changed timeouts how long does it take to run all tests? these are my timings: http://paste.ubuntu.com/5631522/
[15:56] <frankban> rogpeppe: thank you for your help
[15:57] <dimitern> Makyo: does it happen with the current trunk tip as well? also timeouts perhaps..
[15:58] <fwereade_> TheMue, am I being dense? I can't see what happens when $JUJU_HOME is not set
[15:59] <rogpeppe> fwereade_: i just saw the same failure again. not in trunk though - perhaps my megawatcher changes are making a difference. that wouldn't surprise me entirely, as we've got the allWatcher that might be affecting things.
[15:59] <dimitern> fwereade_: i have some questions about how to properly test upgrade-charm cmd and a wip CL, if you have the time?
[16:00] <fwereade_> dimitern, I am very keen to take a look but chat with TheMue takes priority, if he's available
[16:01] <fwereade_> dimitern, send me the link and I'll let you know my thoughts
[16:01] <dimitern> fwereade_: sure, np - https://codereview.appspot.com/7927043/
[16:01] <Makyo> dimitern, checking.
[16:01] <TheMue> fwereade_: if it's not set it will fallback to $HOME/.juju.
[16:02] <TheMue> fwereade_: and if $HOME is also not set it will panic with an according message.
[16:03] <TheMue> fwereade_: dimitern asked for a test, i'll see how I can do it. i already tested it by hand and it's working.
[16:03] <dimitern> TheMue: PanicMatches checker?
[16:04] <rogpeppe> dimitern, fwereade_: and another failure: http://paste.ubuntu.com/5631542/
[16:04] <fwereade_> TheMue, hold on, we panic if we don't have a $HOME? why should we need a $HOME to load the config package?
[16:04] <TheMue> dimitern: yep, remove both envs before and then call Restore... with the PanicMatches.
[16:04] <rogpeppe> dimitern: i think the allWatcher must be interfering with the uniter operation somehow, but i'm not sure how.
[16:04] <TheMue> fwereade_: if we don't have a home, how do you know where to locate .juju?
[16:05] <dimitern> rogpeppe: wow.. I've never seen this one - it's falling apart :)
[16:05] <fwereade_> TheMue, why do we need to know that in order to manipulate a config file
[16:05] <TheMue> fwereade_: we have a lot of places in the code where $HOME is read and taken, even if it is unset. Getenv returns "" then.
[16:06] <fwereade_> dimitern, that latest one smells of races and luck in the change you made actually ;p
[16:06] <TheMue> fwereade_: where do you read environments.yaml from if $HOME isn't set?
[16:06] <dimitern> gents, have to step out for about half an hour, bbiab
[16:06] <fwereade_> dimitern, having hit start-failed is not sufficient reason to believe we've run the subsequent upgrade
[16:06] <fwereade_> dimitern, ttyl
[16:07] <fwereade_> TheMue, but this panics on package load if $HOME is not set, right?
[16:07] <fwereade_> TheMue, I don't think that's a reasonable requirement
[16:08] <TheMue> fwereade_: yep, but I also can live with a different default for the juju home if $JUJU_HOME and $HOME aren't set. so what do you want JujuHome() to return then?
[16:09] <fwereade_> TheMue, JujuHome can panic, that's fine
[16:09] <TheMue> fwereade_: so we would panic a bit later. ;)
[16:09] <TheMue> fwereade_: when the first one accesses it.
[16:10] <rogpeppe> dimitern: wow, *everything* is failing now: http://paste.ubuntu.com/5631564/
[16:10] <fwereade_> TheMue, right, and if that happens it's usually an error
[16:10] <rogpeppe> dimitern: and that's without the allWatcher running.
[16:10] <rogpeppe> dimitern: but this still isn't in trunk.
[16:10] <rogpeppe> dimitern: so it may well be a problem i've introduced
[16:11] <TheMue> fwereade_: so not JH() string but JH() (string, error)? no prob, please write it into the review.
[16:11] <fwereade_> TheMue, I'm thinking about it
[16:12] <fwereade_> rogpeppe, I'm baffled by the jump into start mode, but that other one seems like a real problem to me
[16:13] <fwereade_> TheMue, I think the situation is usually better communicated by a panic than an error -- it generally indicates that one of us is not thinking properly and is calling the wrong code in the wrong context
[16:14] <rogpeppe> fwereade_: ha, yes, i just saw that failure in trunk
[16:14] <rogpeppe> dimitern: ^
[16:14] <fwereade_> TheMue, but I also think we should have no risk of panic in the actual client: even if they don't have either set for whatever crazy reason, a goroutine trace is not a nice way to say hello
[16:14] <rogpeppe> dimitern: so i don't think the issue is my fault, so i'm going to submit the allWatcher integration branch
[16:15] <fwereade_> TheMue, in addition, even if $HOME is set, that is not enough reason to set JH
[16:15] <benji> I think the tests are leaving un-cleaned-up files in /tmp/.  Is that a known issue?
[16:15] <TheMue> fwereade_: IMHO it's a panic situation, but you're right about the info to the user.
[16:16] <rogpeppe> dimitern: here's the latest failure i've seen, in trunk: http://paste.ubuntu.com/5631582/
[16:16] <fwereade_> benji, it is not, they should in general be being cleaned up
[16:16] <benji> fwereade_: thanks; I'll verify that it is and file an issue.
[16:16] <fwereade_> benji, thanks
[16:17] <TheMue> fwereade_: so I could switch from an automatic init() to a manual Init() error that has to be called in commands. and also a panic by JujuHome() if the init hasn't been done.
[16:17] <rogpeppe> dimitern: it also has the same "[LOG] 15.63869 INFO: worker/uniter: awaiting error resolution for "start" hook" error
[16:18] <fwereade_> rogpeppe, I'm sure that's a straight-up bug
[16:18] <rogpeppe> fwereade_: yeah, i think it must  be
[16:18] <rogpeppe> fwereade_: i was worried it was provoked by my allWatcher stuff, but i don't think so
[16:18] <fwereade_> rogpeppe, well, actually, I don't know what it *is*, it still defies my understanding
[16:19] <rogpeppe> fwereade_: well, i can reproduce the issue reasonably consistently
[16:19] <fwereade_> rogpeppe, are you getting test droppings in your temp dir as well as benji?
[16:19] <rogpeppe> fwereade_: in /tmp ?
[16:20] <fwereade_> rogpeppe, yeah, but I forget where exactly
[16:20] <fwereade_> rogpeppe, the code seems very clear that it *has* loaded a state file in a sane format, and that the state file told it what to do
[16:20] <fwereade_> rogpeppe, which makes me wonder whether that one is a test isolation issue
[16:21] <fwereade_> rogpeppe, although it's strange that it never came up before
[16:21] <rogpeppe> fwereade_: it seems quite possible. perhaps the uniter isn't being cleaned up synchronously or something?
[16:21] <fwereade_> rogpeppe, quite possible, I shall poke at it
[16:21] <rogpeppe> fwereade_: i don't see any recent test droppings in /tmp
[16:22] <fwereade_> rogpeppe, ok, cool, that is good to know
[16:22] <rogpeppe> fwereade_: you might see if you can reproduce the issue with GOMAXPROCS=5 go test
[16:22] <rogpeppe> fwereade_: also, i'm running go tip, which has a completely different scheduler now, which might be the trigger
[16:23] <Makyo> dimitern, current trunk tests for me: http://paste.ubuntu.com/5631605/
[16:23] <fwereade_> TheMue, I think so -- set it to panic-on-call by default, and make cmd/juju call something to make it safe
[16:24] <TheMue> fwereade_: yep, will do so. you'll note it in the review too, please?
[16:24] <fwereade_> TheMue, sure, I'm just soliciting opinions live before I do the review proper
[16:25] <TheMue> fwereade_: yeah, sure, I only want to have it noted afterwards, in case I'm not remembering it correctly. I'm getting older. :D
[16:38] <fwereade_> TheMue, sent
[16:43] <TheMue> fwereade_: great, thank you
[16:45] <TheMue> fwereade_: you ask why the juju home is set in the hook context. exactly there, during a hook execution, i had the case of no $JUJU_HOME and no $HOME.
[16:46] <rogpeppe> fwereade_: you've got a review of the state dev docs branch, BTW
[16:46] <TheMue> fwereade_: i sadly don't have the log anymore, but it has been the install hook.
[16:46] <dimitern> back
[16:47] <fwereade_> TheMue, my point is that I think agent code *should* fail if it even tried to look at JUJU_HOME
[16:47] <fwereade_> TheMue, it's not its business at all
[16:48] <fwereade_> TheMue, it's no more meaningful than trying to hit the ec2 metadata service from your own laptop
[16:48] <TheMue> fwereade_: yeah, the change we discussed will lead to a removal of this setting.
[16:49] <TheMue> fwereade_: good catch.
[16:50] <fwereade_> rogpeppe, thanks, good comments
[16:51] <dimitern> fwereade_: if you still feel like it, take a peek at my CL as well please
[16:52] <fwereade_> dimitern, sorry, a load got added to the stack
[16:52] <fwereade_> dimitern, fwiw I think I know the problem in the uniter tests
[16:52] <dimitern> fwereade_: oh, what is it?
[16:53] <dimitern> fwereade_: np about the cmd CL, i'll leave it for tomorrow
[16:53] <fwereade_> dimitern, two things; :572 is asserting something it should be waiting for, and the tests aren't properly cleaning up after themselves
[16:54] <fwereade_> dimitern, there's a RemoveAll(s.unitDir) that is not run quite as often as it should be if tests fail
[16:54] <fwereade_> dimitern, if I check your review would you see if you can repro rog's failures against go tip?
[16:56] <dimitern> fwereade_: sorry, i'm confused now
[16:57] <dimitern> fwereade_: the 2 things are about uniter failures?
[16:57] <fwereade_> dimitern, yeah, rog sent a number of pastes while you were away
[16:58] <dimitern> fwereade_: these were test execution dumps, not code right?
[16:58] <fwereade_> dimitern, yeah
[16:58] <dimitern> fwereade_: and :572 is where?
[16:58] <fwereade_> dimitern, line 572, the second forced upgrade error test iirc
[16:59] <dimitern> fwereade_: let me check
[16:59] <fwereade_> dimitern, it's a verifyCharm{1} that depends on unwarranted assumptions
[16:59] <fwereade_> dimitern, then when that fails it takes out all subsequent tests because it doesn't clean up
[17:00] <dimitern> fwereade_: is this the possible case of tests hanging on failure sometimes?
[17:00] <dimitern> *cause
[17:01] <dimitern> fwereade_: right! the RemoveAll in runUniterTests?
[17:01]  * TheMue is AFK for some time, bbl
[17:01] <fwereade_> dimitern, hanging forever is unrelated, is there a bug for that?
[17:01] <fwereade_> dimitern, yeah
[17:02] <dimitern> fwereade_: don't think so - but I've only seen it twice over a month probably and it was always when some other test fails first
[17:03] <dimitern> fwereade_: so that RemoveAll should be deferred as well probably
[17:04] <fwereade_> dimitern, can't remember what happens if RemoveAll fails
[17:04] <fwereade_> dimitern, but yeah, I think that would do it
[17:04] <dimitern> fwereade_: I *cannot* reproduce any of these failures against tip - tried 5 or 6 times already
[17:04] <fwereade_> dimitern, hum, that is annoying
[17:05] <dimitern> fwereade_: trying again now
[17:15] <fwereade_> dimitern, quick hangout re upgrade-charm when you're free? or maybe over a beer later?
[17:15] <dimitern> fwereade_: +1 for beer
[17:15] <dimitern> fwereade_: still no joy - all tests pass
[17:16] <fwereade_> ok: dimitern, rogpeppe, I need to visit the shops with some haste
[17:17] <fwereade_> dimitern, if you make a test fail deliberately you can verify the effect of a deferred RemoveAll; if that's shown to work it'll get a swift LGTM on its own when I return
[17:18] <fwereade_> dimitern, rogpeppe: but in the large I think this is a situation for skip-the-test, add-critical-bug
[17:19] <rogpeppe> fwereade_: sounds reasonable
[17:19] <fwereade_> dimitern, rogpeppe: I'm pretty sure I know what's going on but it won't be fixed tonight and I don't *think* in this case that it is better to back out the change
[17:19] <fwereade_> ttyl all
[17:20] <rogpeppe> dimitern: have you tried running against go tip, with GOMAXPROCS > n ?
[17:20] <dimitern> i'll add the quickfix for RemoveAll and try
[17:22] <dimitern> rogpeppe: no
[17:22] <rogpeppe> dimitern: that's where i see the problem (the go tip scheduler is different)
[17:22] <dimitern> rogpeppe: i have go tip at hand, but haven't updated recently
[17:22] <rogpeppe> dimitern: "hg sync; cd src; ./all.bash" and bob's yer uncle
[17:23] <dimitern> rogpeppe: hg pull you mean?
[17:23] <rogpeppe> dimitern: yeah, same thing. i think "sync" is a synonym they introduced that also copes with CL changes.
[17:23] <rogpeppe> dimitern: unless you're contributing to the go project, there's no need to use it.
[17:24] <dimitern> rogpeppe: haven't seen it, it might be a plugin (reports an error here)
[17:24] <rogpeppe> dimitern: yeah, it's a plugin
[17:24] <rogpeppe> dimitern: just use hg pull; hg update tip instead
[17:24] <dimitern> rogpeppe: done, building now
[17:26] <dimitern> rogpeppe: what's n? ^^ 2 or more
[17:26] <rogpeppe> dimitern: i used 5, one more than the number of cores in my machine
[17:27] <dimitern> rogpeppe: ok
[17:27] <dimitern> rogpeppe: how do i run it with 5 ?
[17:28] <rogpeppe> dimitern: GOMAXPROCS=5 go test
[17:29] <dimitern> rogpeppe: running all tests now, first without GOMAXPROCS
[17:30] <rogpeppe> dimitern: i hope you've changed GOROOT and GOPATH
[17:30] <rogpeppe> dimitern: otherwise you might be running the wrong binaries
[17:30] <dimitern> rogpeppe: ofc, even run go version: go version devel +8cc853b84f89 Wed Mar 20 09:06:33 2013 -0700 linux/amd64
[17:31] <rogpeppe> dimitern: cool
[17:31] <rogpeppe> dimitern: i have a separate GOPATH dir for the alternative go compiler, so there's no chance of confusion
[17:31] <dimitern> rogpeppe: actually, do I have to wipe out pkg/* as well as changing GOROOT?
[17:31] <rogpeppe> dimitern: probably not, as everything will have changed, but in general yes
[17:32] <rogpeppe> dimitern: that's why i have a separate copy of everything in its own GOPATH
[17:33] <dimitern> rogpeppe: so you have 2 separate bzr working trees?
[17:33] <rogpeppe> dimitern: yup
[17:33] <rogpeppe> dimitern: and a command (go-alt-pull) that pulls the current branch in my usual working dir into the alternative branch
[17:34] <dimitern> rogpeppe: with a corresponding magical shortcut, beginning with ",x.." ? :)
[17:34] <rogpeppe> dimitern: (most of the time i work with go tip, but i test against 1.0.2 before submitting
[17:34] <rogpeppe> )
[17:34] <dimitern> all tests pass with trunk/tip and go/tip, now will try GOMAXPROCS=5
[17:34] <rogpeppe> dimitern: nah, i just type go-alt-pull
[17:34] <rogpeppe> :-)
[17:35] <rogpeppe> dimitern: ,x just edits.
[17:35] <dimitern> rogpeppe: ahaa
[17:36] <rogpeppe> dimitern: ("," is short for "0,$" and "x" says do something for everything matching the following regexp)
[17:36] <dimitern> rogpeppe: apart from some weird warnings in decode/cgo something no other issues
[17:36] <rogpeppe> dimitern: hmm.
[17:36] <rogpeppe> dimitern: try it a few times :-)
[17:37] <dimitern> i have some failures now
[17:38] <rogpeppe> dimitern: cool
[17:38] <rogpeppe> dimitern: the same ones?
[17:38] <dimitern> in rpc though http://paste.ubuntu.com/5631798/
[17:39] <dimitern> fwereade_: riak testing charm surprisingly has rev 7
[17:39] <dimitern> rogpeppe: nope, no errors, will run it a couple of times again
[17:39] <dimitern> rogpeppe: (apart from the rpc panic)
[17:40] <rogpeppe> dimitern: ah, i think i know why that happens, and it should be fixed by dfc's upcoming branch
[17:40] <dimitern> rogpeppe: the logging stuff?
[17:41] <dimitern> rogpeppe: on second run (n=5) rpc passed
[17:41] <rogpeppe> dimitern: yeah, the race setting log.Target
[17:43] <dimitern> rogpeppe: it failed now, but with a different issue (SIGSEGV) http://paste.ubuntu.com/5631814/
[17:45] <rogpeppe> dimitern: hmm, i've not seen that befoe
[17:45] <rogpeppe> re
[17:45] <dimitern> rogpeppe: how often do the uniter tests fail like that for you?
[17:45] <rogpeppe> dimitern: most times
[17:47] <rogpeppe> dimitern: i can reproduce by just running the uniter tests too
[17:47] <dimitern> rogpeppe: rpc failed again (the same issue), but the uniter tests passed (i noticed with n=5 they are slightly faster - about 10-15%)
[17:48] <dimitern> running again without n=5
[17:48] <fwereade_> rogpeppe, does my sketchy judgment that it looks like (1) a genuinely flaky test that fails in verifyCharm, and (2) poor isolation causing all subsequent test cases to fail
[17:48] <fwereade_> rogpeppe, ...match your own observations?
[17:48] <dimitern> fwereade_: imo 2 is definitely happening
[17:48] <rogpeppe> fwereade_: i haven't been delving into it as much as dimitern
[17:48] <dimitern> fwereade_: alas, not for me
[17:49] <dimitern> that segfault is troubling though
[17:49] <rogpeppe> fwereade_: (i figured both you and dimitern are closer to the problem than me, and will likely find the fix much sooner)
[17:49] <fwereade_> dimitern, while I have concern about the things you're observing I am focused pretty hard on the uniter tests
[17:50] <fwereade_> rogpeppe, ofc -- that was just something tossed into the air in case it were to get shot down with an instant "well no I've also seen X, Y, Z" :)
[17:50] <fwereade_> dimitern, agreed
[17:50] <fwereade_> dimitern, 8ish @cuba?
[17:50] <dimitern> wow, new panic now (second run, n=1 go tip): http://paste.ubuntu.com/5631831/
[17:51] <dimitern> fwereade_: sgtm
[17:51] <dimitern> I've seen this before once
[17:52] <dimitern> and it's a pass (except the panic)
[17:52] <rogpeppe> fwereade_: looking at the errors, yes, i think your analysis seems likely
[17:53] <dimitern> fwereade_: so my thoughts about the cleanup: there are 2 approaches - extract the loop body of runUniterTests and add a defer RemoveAll there; OR just add an additional defer RemoveAll in runUniterTests
[17:55] <rogpeppe> dimitern: hmm, i can't see how that could happen
[17:55] <dimitern> rogpeppe: what?
[17:55] <rogpeppe> dimitern: the panic in randomHexToken
[17:56] <dimitern> rogpeppe: today's the day for obscure test failures it seems :)
[17:56] <rogpeppe> dimitern: unless... what revision of goose are you using?
[17:57] <dimitern> maybe it's biorhythms or full moon or smth
[17:57] <dimitern> rogpeppe: haven't updated recently, that could be it, lemme check
[17:57] <rogpeppe> dimitern: yup, that's it
[17:57] <rogpeppe> dimitern: if you're before r80, that's the reason
[17:58] <dimitern> rogpeppe: i was, now @80, running again with n=5/tip
[18:02] <dimitern> now the uniter tests failed..goroutine 4574  wtf?!
[18:02] <dimitern> http://paste.ubuntu.com/5631858/
[18:03] <dimitern> maybe mgo has a hard time coping with a lot of concurrent procs?
[18:04] <fwereade_> dimitern, whoa, that definitely deserves a bug, probably against mgo... niemeyer? does a theory leap out at you?
[18:05] <fwereade_> actually, I have to be off for supper
[18:06] <dimitern> fwereade_: so see you around 8 @cuba then
[18:07] <fwereade_> rogpeppe, would you skip that test and add a critical bug/card for it please?
[18:07] <rogpeppe> fwereade_: the uniter test?
[18:07] <fwereade_> rogpeppe, if nobody's around to approve it I pre-LGTM a single line that just skips the one that fails for you :)
[18:07] <fwereade_> rogpeppe, please, if you have time
[18:07] <rogpeppe> fwereade_: ok, will do
[18:07] <fwereade_> rogpeppe, awesome, tyvm
[18:08] <fwereade_> later all
[18:13] <dimitern> i'll call it a day as well
[18:13] <dimitern> night y'all
[18:14] <rogpeppe> dimitern: g'night
[18:21] <hazmat> seems odd that a co of juju-core gets .   submit branch: /home/rog/src/go/src/launchpad.net/juju-core/juju/.bzr/cobzr/go-state-units-under-service
[18:39] <rogpeppe> hazmat: that is really weird
[18:39] <rogpeppe> hazmat: i think that branch dates from when i did see some weirdness around there
[18:39] <hazmat> rogpeppe, just some cached metadata in the branch
[18:39] <hazmat> not an issue, just odd
[18:40] <rogpeppe> hazmat: it may have happened when i did a push --overwrite, possibly
[18:40] <rogpeppe> anyway, eod here
[18:40] <hazmat> rogpeppe, cheers
[18:40] <rogpeppe> hazmat: cheers to you too
[18:40] <rogpeppe> and g'night all others
[20:09] <niemeyer> fwereade_: I can't see how there could be a nil reference in that line.. mongoServers is part of the cluster by value
[20:10] <niemeyer> fwereade_: Do you know what Go version this is running on?
[20:22] <thumper> morning folks
[21:03]  * thumper does the conflict resolution dance again
[21:11] <robbiew> lol
[21:23]  * thumper sighs
[21:23] <thumper> never just as simple as resolving conflicts
[21:23]  * thumper fixes test failures, import errors and other boring bits
[21:26]  * thumper wishes go had a build_tests
[21:26] <thumper> so I could check all the tests built before running them...
[21:28] <mramm> that would be nice
[22:35] <davecheney> thumper: any final comments on https://codereview.appspot.com/7524046
[22:35] <davecheney> i don't think this bike shed will stand up to any more coats of paint
[22:37]  * thumper looks
[22:37] <thumper> davecheney: morning
[22:39] <davecheney> morning
[22:39] <davecheney> gutter cleaners are here today in the complex
[22:39] <davecheney> and they have bought the Australian strategic reserve of petrol-powered leaf blowers
[22:40] <thumper> davecheney: defer log.SetTarget(log.SetTarget(c))
[22:40] <thumper> does the defer command evaluate all the args at the call time?
[22:41] <thumper> davecheney: I'm assuming that log.SetTarget(c) is evaluated prior to the defer stashing the method away for later
[22:41] <davecheney> thumper: yes
[22:42] <davecheney> so the inner log.SetTarget(c) sets the target to c and returns the previous value
[22:42] <davecheney> that value is captured by the defer and passed to the outer log.SetTarget when run
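[editor's note] The behaviour davecheney confirms can be sketched in isolation. Go evaluates the arguments of a deferred call when the defer statement executes, so the inner call runs immediately and only the outer call is postponed. The setTarget function and the string-valued target below are hypothetical stand-ins for log.SetTarget, whose real signature is not shown in this log.

```go
package main

import "fmt"

// target stands in for the logger's current output (hypothetical).
var target = "default"

// setTarget installs a new target and returns the previous one,
// mirroring the swap-and-return shape described in the chat.
func setTarget(t string) string {
	prev := target
	target = t
	return prev
}

// duringTest swaps the target for the duration of the function and
// reports what the target was while the body ran.
func duringTest() string {
	// The inner call runs now and yields the old value; the outer
	// call, holding that old value, runs when duringTest returns.
	defer setTarget(setTarget("test"))
	return target
}

func main() {
	inside := duringTest()
	fmt.Println(inside, target) // prints "test default"
}
```

The one-liner therefore acts as a save-and-restore pair: the swap happens on entry and the restore happens on exit.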
[22:46] <thumper> davecheney: got time for a hangout chat?
[22:51] <davecheney> sure
[22:51] <davecheney> lemmie go upstairs
[22:51] <thumper> ok