[00:30] <thumper> taking the kids out for lunch
[00:30] <thumper> back later
[03:04] <thumper> davecheney: got a few minutes for a hangout?
[03:04] <davecheney> sure
[03:05] <davecheney> i'll get my other pc
[03:05] <thumper> k
[03:05] <davecheney> ready
[03:06] <thumper> davecheney: https://plus.google.com/hangouts/_/gqpui2frcqiffoyg6usnsp6dfaa?authuser=1&hl=en
[03:47] <stokachu> someone mind taking a look at this json response http://paste.ubuntu.com/7778445/
[03:47] <stokachu> im passing it the ConfigYAML file with the service name
[03:48] <stokachu> but it isn't finding the settings for keystone for some reason
[03:49] <stokachu> i parsed the yaml file to make sure there were no errors and it loads up with the keystone set as the key
[03:53] <thumper> stokachu: is the service named keystone?
[03:54] <thumper> stokachu: I've found that the config is based on the name of the service, not the charm name
[03:54] <stokachu> ah hmm
[03:54] <stokachu> lemme check something
[03:54] <stokachu> so i set the ServiceName to keystone
[03:55] <thumper> stokachu: what is it named in status?
[03:55] <thumper> stokachu: or, another way, what was your deploy command?
[03:56] <stokachu> im using ServiceDeploy api call
[03:56] <stokachu> so i pass in the params for ServiceName etc
[03:57] <stokachu> thumper, if i deploy without ConfigYAML set then it deploys the keystone service named 'keystone'
[03:57] <thumper> right
[03:58] <thumper> sorry, not sure
[04:01] <stokachu> i cant find a test that uses the ConfigYAML parameters
[04:04] <stokachu> i tried to mimic the api call to https://github.com/juju/juju/blob/master/state/apiserver/client/client_test.go#L807
[04:04] <stokachu> which i think is right
[04:06] <thumper> stokachu: let me look at something
[04:06] <stokachu> ok
[04:10] <thumper> stokachu: according to the code in cmd/juju/deploy.go, the yaml config expected in the ServiceDeploy api is the entire yaml file with a top level name matching the service name.
[04:10] <stokachu> so not the string?
[04:10] <stokachu> filename string i take it
[04:10] <thumper> haha
[04:10] <thumper> no
[04:11] <thumper> you need to read the file, and pass the yaml as string
[04:11] <stokachu> ah well then thats probably why
[04:11] <thumper> yep
[04:11] <thumper> will be
[04:11] <stokachu> shouldnt code so late
[04:11]  * thumper was hacking at 1am the other night
[04:11] <thumper> gets tiring
[04:12] <stokachu> yea but im so close to get this working lol
[04:14] <stokachu> damnit what a stupid mistake
[04:15] <stokachu> thumper, thanks for the extra set of eyes
[04:15] <thumper> np
[04:20] <thumper> my brain has stopped working
[04:20] <thumper> time to leave
[04:46] <waigani> ugh, just realized my irc was not connected
[04:48] <waigani> davecheney: here's the output of my script: http://pastebin.ubuntu.com/7778556/
[04:50] <waigani> davecheney: it deals with comments, if the groups are in the wrong order, if there are too many groups. I *think* I don't have any false positives.
[05:15]  * davecheney looks
[06:51] <vladk> dimitern: please, take a look http://github.com/juju/juju/pull/121
[06:57] <dimitern> vladk, looking
[07:33] <TheMue> morning
[07:33] <urulama> morning
[07:39] <dimitern> vladk, reviewed
[07:39] <vladk> dimitern: thanks
[07:42] <dimitern> vladk, remind me again please, why do we need to care about parsing the interfaces files so thoroughly ?
[07:42] <dimitern> vladk, can't we get away with just what the net package gives us?
[07:42] <dimitern> (and running some commands like ifup etc.)
[07:43] <dimitern> also, seems like fixing maas changes to interfaces files is not a job for juju
[07:43] <dimitern> (does it need to be a special case like that?)
[07:51] <vladk> dimitern: to bring up an interface, we need to add some lines to the configuration file; in the future we are going to support static IPs, so the configuration lines may depend on network state
[07:52] <dimitern> vladk, well, my point is we can *write* to the interfaces files, but use the net package to *read* what's currently on and has address
[07:52] <dimitern> vladk, re static ips - let's think about that when we get there - it won't be until october that's for sure
[07:53] <vladk> dimitern: I asked people and found that most prefer to have configuration in separate files
[07:53] <vladk> additionally there may be interfaces not managed by juju
[07:53] <dimitern> vladk, if it's a machine juju started, juju has total control over it and we don't have to care about what users change later
[07:55] <dimitern> vladk, so istm being less careful to preserve what's there and making sure we (over)write the config file for interfaces juju manages (all of them) should be fine
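dimitern's suggestion — juju simply owning and rewriting the stanzas for the interfaces it manages, with unmanaged interfaces kept in separate files — would amount to emitting something like this hypothetical fragment (for illustration only, not what juju actually writes):

```
# Hypothetical /etc/network/interfaces.d/ fragment that juju could
# (over)write wholesale for a managed NIC; interfaces not managed by
# juju would live in separate files left untouched.
auto eth0
iface eth0 inet dhcp
```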
[08:40] <dimitern> TheMue, a small-ish review? https://github.com/juju/juju/pull/293
[08:41] <TheMue> dimitern: yep
[08:44] <dimitern> damn! i've edited a card and then it disappeared!
[08:45] <dimitern> TheMue, thanks!
[08:47] <TheMue> dimitern: after first fears that the review process in GH gets worse I now feel more comfortable with it
[08:51] <dimitern> TheMue, it grows on you :)
[08:54] <TheMue> dimitern: ;)
[08:55] <TheMue> dimitern: you’ve got a review
[08:55] <dimitern> TheMue, tyvm
[08:55] <TheMue> dimitern: yw
[08:56]  * TheMue continues with the flakey tests
[08:58] <dimitern> TheMue, need any help on these? For the "port 54321 already in use" it should be trivial not to hard code it, but use ":0" and then get the random port Listen used
[08:59] <TheMue> dimitern: I’m currently at the double interface event
[09:00] <dimitern> TheMue, ok
[09:00] <TheMue> dimitern: but I’ll keep your hint in mind, thanks
[09:00] <dimitern> TheMue, cheers :)
[09:05] <TheMue> ouch
[09:05] <TheMue> system just logged me out (w/ killing all apps)
[09:05] <dimitern> TheMue, the PR linked to the flaky tests bug report is wrong btw
[09:05] <dimitern> aw..
[09:06] <TheMue> dimitern: which PR?
[09:06] <TheMue> dimitern: I’ve not yet done one
[09:06] <dimitern> TheMue, the one linked is about the networker https://github.com/juju/juju/pull/207/
[09:08] <dimitern> TheMue, but the failing test is in state/machine_test.go, which is an earlier one - https://github.com/juju/juju/pull/169/
[09:08] <TheMue> dimitern: thx, will change it
[09:15] <TheMue> dimitern: could you pls take a look at state/machine_test.go line 1803 - 1808?
[09:16] <dimitern> TheMue, something's fishy, I've been looking at the machine interfaces watcher implementation
[09:16] <TheMue> dimitern: if two different interfaces are disabled, why do we only await one change?
[09:17] <dimitern> TheMue, the only reason I can think of is if 2 changes are merged and reported as one, but the watcher has to be written like that
[09:18] <TheMue> dimitern: that would be in watcher.go 1758ff
[09:18] <dimitern> TheMue, there's a collect() func in state/watcher
[09:19] <dimitern> TheMue, which is supposed to collapse multiple changes into one, but the machine interfaces watcher is not using it, yet it's being tested as if it were
[09:20] <TheMue> dimitern: yes, looks like
[09:21] <dimitern> TheMue, what freaks me out is why it's not happening *all the time*
[09:22] <TheMue> dimitern: it’s a race, sometimes the second disabling event is received fast enough, sometimes not
[09:22] <dimitern> TheMue, that seems to be the issue - *all* notify watchers use collect() inside the loop
[09:22] <TheMue> dimitern: so I’ll change it to do so here too
[09:23] <dimitern> TheMue, can that be forced by introducing a delay before disabling the second iface in the test?
[09:23] <TheMue> dimitern: not yet tested
[09:23] <dimitern> TheMue, that way you can be sure your changes in the loop() work
[09:23] <TheMue> dimitern: good idea, will try that first to have a good testbed
[09:23] <dimitern> TheMue, thanks
[09:24] <dimitern> TheMue, also, while you're at it, can you please drop the notify flag from the loop? setting out to nil or to w.out is enough to signal a notification is needed
[09:25] <TheMue> dimitern: yip
[09:26] <dimitern> TheMue, ta
[09:29] <TheMue> dimitern: btw, why did you move the flakey test card, there’s no PR yet ;)
[09:31] <dimitern> TheMue, did I? Wasn't intentional if it was me
[09:31] <TheMue> dimitern: changed it back, and will split it into machine interface and the other one
[09:32] <dimitern> TheMue, good idea
[09:57] <vladk> dimitern: you wrote many comments on exportable names. 90% of them are related to access to internals from test package, because test package has different name. How do you solve this problem?
[09:58] <dimitern> vladk, using export_test?
[10:46] <TheMue> dimitern: standup
[10:46] <dimitern> TheMue, brt
[12:44] <mgz> http://code.mumak.net/2014/07/cutting-edge-debugging.html
[12:55] <TheMue> vladk: ping
[12:55] <vladk> pong
[12:55] <vladk> TheMue: pong
[12:56] <TheMue> vladk: in TestWatchInterfaces() in line 1854, why do you expect no change here?
[12:57] <TheMue> vladk: state/machine_test.go
[12:59] <vladk> TheMue: because we have a watcher per machine, so it does not notify about interface changes on foreign machine
[12:59] <TheMue> vladk: ah, ic, thx
[13:00] <perrito666> natefinch: the suspense is killing us, what is the verdict?
[13:00] <TheMue> vladk: and found the error here in my code ;)
[13:03] <TheMue> vladk: yeah, that has been the needed info, now it works.
[13:11] <TheMue> vladk: you’re OCR? here’s PR https://github.com/juju/juju/pull/294 for one of the flakey tests (see lp:1339864)
[13:11] <vladk> TheMue: looking
[13:11] <TheMue> vladk: thanks
[13:15] <perrito666> more than 60 secs to run 1 test when using mgosuite... beautiful
[13:38] <perrito666> wwitzel3: dont get that near the computer, it will mess your eyes
[13:38] <perrito666> :p
[13:44] <wwitzel3> perrito666: :)
[14:14] <TheMue> vladk: any questions regarding the PR?
[14:15] <vladk> TheMue: no, just a minute
[14:15] <TheMue> vladk: thx
[14:15] <TheMue> vladk: looking into the second issue during that time ;)
[14:18] <mgz> TheMue: I'm confused as to how that pr actually makes the test reliable
[14:19] <TheMue> mgz: it was a race condition, as it didn’t use the collection of changes before. and when those came with the wrong timing the watcher sent multiple events
[14:20] <TheMue> mgz: I’ve been able to reproduce the error locally by removing the error asserts in the test (to make it faster; funnily a Sleep doesn’t change it)
[14:21] <TheMue> mgz: and after the change (dimiter gave me the hint with collect) it worked fine in all situations
[14:32] <vladk> TheMue: where did you find a race condition? We have a separate watcher for every machine with separate map of changes
[14:33] <TheMue> vladk: I didn’t find it, Andrew did. See https://bugs.launchpad.net/juju-core/+bug/1339864.
[14:33] <_mup_> Bug #1339864: MachineSuite.TestWatchInterfaces intermittently fails <intermittent-failure> <test-failure> <juju-core:In Progress by themue> <https://launchpad.net/bugs/1339864>
[14:35] <TheMue> vladk: I simply tried to understand it, discussed with dimiter, found a way to have it almost every time here on my machine too, changed it and now it never happened again (ran the test multiple times in a loop)
[14:35] <mgz> $$\S+$$
[14:37] <vladk> TheMue: It would be better to drop that test with two sequential changes rather than implement collect. I agree that the test is not correct, but there is no race condition here. Even with collect implemented, that test may fail.
[14:37] <TheMue> vladk: why use a different pattern than in the other watchers?
[14:38] <TheMue> vladk: collect() hasn’t been implemented by me and is used in multiple watchers
[14:38] <vladk> TheMue: because all watchers already have different patterns
[14:39] <vladk> I'm not sure that this code has large practical meaning, because interfaces are not changed very often. We will get more complex code, but will not get any increase in performance.
[14:39] <TheMue> vladk: in details yes, not always in patterns
[14:40] <vladk> TheMue: I gave you LGTM
[14:41] <TheMue> vladk: thanks
[14:41] <vladk> TheMue: But I would drop that test with 2 events or change it to one event.
[14:41] <vladk> TheMue: also there is a mistake in WatchInterfaces() comment
[14:42] <TheMue> vladk: Some minutes
[14:44] <wallyworld> mgz: https://github.com/juju/juju/pull/295
[15:36] <axw> wallyworld: https://github.com/juju/juju/pull/296
[16:26] <katco> niemeyer: you around?
[16:40] <niemeyer> katco: Always!
[16:40] <niemeyer> ;)
[17:19] <mattyw> rogpeppe, ping?
[17:25] <rogpeppe> mattyw: off now, sorry
[17:25] <rogpeppe> g'night all
[17:32] <voidspace> niemeyer: ping
[17:33] <voidspace> niemeyer: I'm looking at a juju bug, "i/o timeout" errors, which occur in various places https://bugs.launchpad.net/juju-core/+bug/1307434
[17:33] <_mup_> Bug #1307434: talking to mongo can fail with "TCP i/o timeout" <cloud-installer> <landscape> <performance> <reliability> <juju-core:Triaged> <https://launchpad.net/bugs/1307434>
[17:33] <voidspace> niemeyer: we *believe* this is due to the way we use a single session for *everything* in juju
[17:34] <voidspace> niemeyer: please bear in mind that I know very little about mgo and our use of it, so I'm very happy for you to correct any misconceptions I may have....
[17:34] <voidspace> niemeyer: state.State has a transaction runner (actually our own that wraps an mgo one)
[17:34] <niemeyer> voidspace: Heya
[17:34] <voidspace> niemeyer: o/ sorry for the wall of text :-)
[17:34] <niemeyer> voidspace: In a call, but can be with you in 20mins, at most
[17:35] <voidspace> niemeyer: ok, sure - I'll finish describing the problem and if you get a chance to look at it and answer my question (I do have a question) then great
[17:35] <voidspace> niemeyer: if not I can fire it off as an email (probably when I get back from Lexington)
[17:35] <niemeyer> voidspace: Sounds good
[17:36] <voidspace> niemeyer: we would rather use the socket pooling that is built into mgo
[17:36] <niemeyer> voidspace: I mean, please feel free to continue describing the problem.. will read and respond right after the call
[17:36] <voidspace> niemeyer: yep, thanks
[17:37] <voidspace> niemeyer: the comment on the mgo transaction runner says
[17:37] <voidspace> // Multiple transaction collections may exist in a single database, but
[17:37] <voidspace> // all collections that are touched by operations in a given transaction
[17:37] <voidspace> // collection must be handled exclusively by it.
[17:38] <voidspace> niemeyer: this leads us to believe that we must only have a single transaction runner and that we can't just switch to using multiple sessions (as the transaction runner has a reference to the session it is created from)
[17:40] <mgz> fun fixes: https://github.com/juju/juju/pull/298
[17:40] <voidspace> niemeyer: and for some reason, just refreshing the session every time we use it is causing auth errors (possibly related to the way we change passwords - but reopening the state after changing the password doesn't fix the auth errors, so there must be some other interaction here)
[17:40] <voidspace> niemeyer: so, can you see a way for us to avoid having a long lived transaction runner / session - and do you think that it is indeed likely that the long lived session is the root cause of these errors we're seeing
[17:50] <niemeyer> voidspace: Okay, here
[17:50] <niemeyer> Reading...
[17:50] <voidspace> niemeyer: o/
[17:50] <niemeyer> voidspace: The reason why it's timing out is likely that you have a session sitting around unused for more than 10 minutes
[17:51] <voidspace> niemeyer: yep, we use a single session for everything
[17:51] <niemeyer> voidspace: mgo currently does not ping sessions in use to keep them alive
[17:52] <niemeyer> voidspace: Using a single session is not strictly a problem by itself
[17:52] <niemeyer> voidspace: It may be a problem depending on what else is being done
[17:52] <voidspace> niemeyer: ok
[17:52] <voidspace> niemeyer: the transaction runner keeps a reference to the session, and we must have a single transaction runner (we believe)
[17:52] <niemeyer> voidspace: For example, sharing a session means sharing the underlying socket too, which means concurrent clients may have to wait for long running operations
[17:53] <voidspace> niemeyer: right
[17:53] <niemeyer> voidspace: It also means that connectivity errors break both clients at once, and you cannot recover the session locally because it would hide the error from every other user
[17:53] <voidspace> niemeyer: so we could use multiple sessions (copy / clone / refresh), but we also need to ensure we keep alive the one used by the transaction runner
[17:54] <niemeyer> voidspace: Okay, switching to the transaction runner issue
[17:54] <niemeyer> voidspace: You can run as many transaction runners with as many sessions as you want
[17:54] <niemeyer> voidspace: Creating a runner is very cheap, and has no kind of caching
[17:55] <voidspace> niemeyer: what about that comment in the transaction runner code?
[17:55] <niemeyer> voidspace: That comment is about the transaction collection..
[17:55] <niemeyer> voidspace: not about sessions
[17:55] <niemeyer> (or runners)
[17:55] <voidspace> ah... indeed
[17:55] <voidspace> cool
[17:56] <niemeyer> voidspace: I would like to implement global session pinging, but I'm a bit concerned about the cost of doing so
[17:57] <niemeyer> voidspace: I will sit on this over the weekend to ponder how it might be implemented in a good way
[17:57] <voidspace> niemeyer: I think I mentally substituted "runner" for "collection" when reading that comment - so it's not actually a concern
[17:57] <niemeyer> voidspace: I definitely don't want to have an extra goroutine for every single session, for instance
[17:57] <voidspace> niemeyer: that would certainly help us, but we should probably fix our code anyway
[17:57] <voidspace> right
[17:57] <voidspace> niemeyer: thanks, that's very helpful - I think I have some things I can try now
[17:57] <niemeyer> voidspace: Right, I do think you likely have issues to solve there either way
[17:57] <voidspace> yep
[17:58] <niemeyer> voidspace: FWIW, I've shared that sentiment before in that same context (I/O timeouts)
[17:58] <voidspace> we could have a session pinger ourselves, but that's probably not the best fix
[17:58] <niemeyer> voidspace: I pretty much always implement http request handling as Copy + defer Close, for instance
[17:58] <niemeyer> voidspace: Agreed.. it sounds like this is a symptom of a more fundamental design issue
[17:59] <voidspace> I still have to work out why we get these auth errors, but that's my problem...
[17:59] <niemeyer> voidspace: Even with pinging, connections die all the time.. then what?
[18:33] <mgz> wallyworld_: https://bugs.launchpad.net/juju-core/+bug/1340893
[18:33] <_mup_> Bug #1340893: juju bootstrap in an existing environment destroys the environment <canonical-is> <juju-core:New> <https://launchpad.net/bugs/1340893>
[19:35] <bodie_> super simple PR for testing Actions -- literally nothing added, just a testing charm location.  I'll need the godeps change in order to land Actions RunHook
[19:35] <bodie_> https://github.com/juju/charm/pull/16
[19:52] <jcw4> bodie_: LGTM, I think it's fine to merge
[19:53] <bodie_> yeah
[20:00] <bodie_> hmm, my dependencies.tsv diff seems off
[20:01] <jcw4> how so?
[20:02] <bodie_> I seem to have a different entry for loggo even though I've synced with master
[20:02] <mgz> bodie_: see my latest mp
[20:02] <jcw4> bodie_: I think the launchpad.net/loggo dependency was wrong
[20:02] <jcw4> mgz: +1
[20:02] <bodie_> ah
[20:02] <mgz> pull/299
[20:02] <bodie_> thanks
[20:03] <mgz> so, just remove that line
[20:10] <bodie_> there we are then https://github.com/juju/juju/pull/301