[00:30] <thumper> taking the kids out for lunch
[00:30] <thumper> back later
[03:04] <thumper> davecheney: got a few minutes for a hangout?
[03:04] <davecheney> sure
[03:05] <davecheney> i'll get my other pc
[03:05] <thumper> k
[03:05] <davecheney> ready
[03:06] <thumper> davecheney: https://plus.google.com/hangouts/_/gqpui2frcqiffoyg6usnsp6dfaa?authuser=1&hl=en
[03:47] <stokachu> someone mind taking a look at this json response http://paste.ubuntu.com/7778445/
[03:47] <stokachu> im passing it the ConfigYAML file with the service name
[03:48] <stokachu> but it isn't finding the settings for keystone for some reason
[03:49] <stokachu> i parsed the yaml file to make sure there were no errors and it loads up with the keystone set as the key
[03:53] <thumper> stokachu: is the service named keystone?
[03:54] <thumper> stokachu: I've found that the config is based on the name of the service, not the charm name
[03:54] <stokachu> ah hmm
[03:54] <stokachu> lemme check something
[03:54] <stokachu> so i set the ServiceName to keystone
[03:55] <thumper> stokachu: what is it named in status?
[03:55] <thumper> stokachu: or, another way, what was your deploy command?
[03:56] <stokachu> im using ServiceDeploy api call
[03:56] <stokachu> so i pass in the params for ServiceName etc
[03:57] <stokachu> thumper, if i deploy without ConfigYAML set then it deploys the keystone service named 'keystone'
[03:57] <thumper> right
[03:58] <thumper> sorry, not sure
[04:01] <stokachu> i cant find a test that uses the ConfigYAML parameters
[04:04] <stokachu> i tried to mimic the api call to https://github.com/juju/juju/blob/master/state/apiserver/client/client_test.go#L807
[04:04] <stokachu> which i think is right
[04:06] <thumper> stokachu: let me look at something
[04:06] <stokachu> ok
[04:10] <thumper> stokachu: according to the code in cmd/juju/deploy.go, the yaml config expected in the ServiceDeploy api is the entire yaml file with a top level name matching the service name.
[04:10] <stokachu> so not the string?
[04:10] <stokachu> filename string i take it
[04:10] <thumper> haha
[04:10] <thumper> no
[04:11] <thumper> you need to read the file, and pass the yaml as string
[04:11] <stokachu> ah well then thats probably why
[04:11] <thumper> yep
[04:11] <thumper> will be
[04:11] <stokachu> shouldnt code so late
[04:11]  * thumper was hacking at 1am the other night
[04:11] <thumper> gets tiring
[04:12] <stokachu> yea but im so close to get this working lol
[04:14] <stokachu> damnit what a stupid mistake
[04:15] <stokachu> thumper, thanks for the extra set of eyes
[04:15] <thumper> np
[04:20] <thumper> my brain has stopped working
[04:20] <thumper> time to leave
[04:46] <waigani> ugh, just realized my irc was not connected
[04:48] <waigani> davecheney: here's the output of my script: http://pastebin.ubuntu.com/7778556/
[04:50] <waigani> davecheney: it deals with comments, if the groups are in the wrong order, if there are too many groups. I *think* I don't have any false positives.
[05:15]  * davecheney looks
[06:51] <vladk> dimitern: please, take a look http://github.com/juju/juju/pull/121
[06:57] <dimitern> vladk, looking
[07:33] <TheMue> morning
[07:33] <urulama> morning
[07:39] <dimitern> vladk, reviewed
[07:39] <vladk> dimitern: thanks
[07:42] <dimitern> vladk, remind me again please, why do we need to care about parsing the interfaces files so thoroughly ?
[07:42] <dimitern> vladk, can't we get away with just what the net package gives us?
[07:42] <dimitern> (and running some commands like ifup etc.)
[07:43] <dimitern> also, seems like fixing maas changes to interfaces files is not a job for juju
[07:43] <dimitern> (does it need to be a special case like that?)
[07:51] <vladk> dimitern: to bring up an interface, we need to add some lines to the configuration file; in the future we are going to support static IPs, so the configuration lines may depend on network state
[07:52] <dimitern> vladk, well, my point is we can *write* to the interfaces files, but use the net package to *read* what's currently on and has address
[07:52] <dimitern> vladk, re static ips - let's think about that when we get there - it won't be until october that's for sure
[07:53] <vladk> dimitern: I asked people and found that most prefer to have configuration in separate files
[07:53] <vladk> additionally there may be interfaces not managed by juju
[07:53] <dimitern> vladk, if it's a machine juju started, juju has total control over it and we don't have to care about what users change later
[07:55] <dimitern> vladk, so istm being less careful to preserve what's there and making sure we (over)write the config file for interfaces juju manages (all of them) should be fine
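dimitern's suggestion — juju simply owning and rewriting the stanzas for the interfaces it manages, with unmanaged interfaces kept in separate files — would amount to emitting something like this hypothetical fragment (for illustration only, not what juju actually writes):

```
# Hypothetical /etc/network/interfaces.d/ fragment that juju could
# (over)write wholesale for a managed NIC; interfaces not managed by
# juju would live in separate files left untouched.
auto eth0
iface eth0 inet dhcp
```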
[08:40] <dimitern> TheMue, a small-ish review? https://github.com/juju/juju/pull/293
[08:41] <TheMue> dimitern: yep
[08:44] <dimitern> damn! i've edited a card and then it disappeared!
[08:45] <dimitern> TheMue, thanks!
[08:47] <TheMue> dimitern: after first fears that the review process in GH gets worse I now feel more comfortable with it
[08:51] <dimitern> TheMue, it grows on you :)
[08:54] <TheMue> dimitern: ;)
[08:55] <TheMue> dimitern: you’ve got a review
[08:55] <dimitern> TheMue, tyvm
[08:55] <TheMue> dimitern: yw
[08:56]  * TheMue continues with the flakey tests
[08:58] <dimitern> TheMue, need any help on these? For the "port 54321 already in use" it should be trivial not to hard code it, but use ":0" and then get the random port Listen used
[08:59] <TheMue> dimitern: I’m currently at the double interface event
[09:00] <dimitern> TheMue, ok
[09:00] <TheMue> dimitern: but I’ll keep your hint in mind, thanks
[09:00] <dimitern> TheMue, cheers :)
[09:05] <TheMue> ouch
[09:05] <TheMue> system just logged me out (w/ killing all apps)
[09:05] <dimitern> TheMue, the PR linked to the flaky tests bug report is wrong btw
[09:05] <dimitern> aw..
[09:06] <TheMue> dimitern: which PR?
[09:06] <TheMue> dimitern: I’ve not yet done one
[09:06] <dimitern> TheMue, the one linked is about the networker https://github.com/juju/juju/pull/207/
[09:08] <dimitern> TheMue, but the failing test is in state/machine_test.go, which is an earlier one - https://github.com/juju/juju/pull/169/
[09:08] <TheMue> dimitern: thx, will change it
[09:15] <TheMue> dimitern: could you pls take a look at state/machine_test.go line 1803 - 1808?
[09:16] <dimitern> TheMue, something's fishy, I've been looking at the machine interfaces watcher implementation
[09:16] <TheMue> dimitern: if two different interfaces are disabled, why do we only await one change?
[09:17] <dimitern> TheMue, the only reason I can think of is if 2 changes are merged and reported as one, but the watcher has to be written like that
[09:18] <TheMue> dimitern: that would be in watcher.go 1758ff
[09:18] <dimitern> TheMue, there's a collect() func in state/watcher
[09:19] <dimitern> TheMue, which is supposed to collapse multiple changes into one, but the machine interfaces watcher is not using it, yet it's being tested as if it were
[09:20] <TheMue> dimitern: yes, looks like
[09:21] <dimitern> TheMue, what freaks me out is why it's not happening *all the time*
[09:22] <TheMue> dimitern: it’s a race, sometimes the second disabling event is received fast enough, sometimes not
[09:22] <dimitern> TheMue, that seems to be the issue - *all* notify watchers use collect() inside the loop
[09:22] <TheMue> dimitern: so I’ll change it to do so here too
[09:23] <dimitern> TheMue, can that be forced by introducing a delay before disabling the second iface in the test?
[09:23] <TheMue> dimitern: not yet tested
[09:23] <dimitern> TheMue, that way you can be sure your changes in the loop() work
[09:23] <TheMue> dimitern: good idea, will try that first to have a good testbed
[09:23] <dimitern> TheMue, thanks
[09:24] <dimitern> TheMue, also, while you're at it, can you please drop the notify flag from the loop? setting out to nil or to w.out is enough to signal a notification is needed
[09:25] <TheMue> dimitern: yip
[09:26] <dimitern> TheMue, ta
[09:29] <TheMue> dimitern: btw, why did you move the flakey test card, there’s no PR yet ;)
[09:31] <dimitern> TheMue, did I? Wasn't intentional if it was me
[09:31] <TheMue> dimitern: changed it back, and will split it into machine interface and the other one
[09:32] <dimitern> TheMue, good idea
[09:57] <vladk> dimitern: you wrote many comments on exportable names. 90% of them are related to access to internals from test package, because test package has different name. How do you solve this problem?
[09:58] <dimitern> vladk, using export_test?
[10:46] <TheMue> dimitern: standup
[10:46] <dimitern> TheMue, brt
[12:44] <mgz> http://code.mumak.net/2014/07/cutting-edge-debugging.html
[12:55] <TheMue> vladk: ping
[12:55] <vladk> pong
[12:55] <vladk> TheMue: pong
[12:56] <TheMue> vladk: in TestWatchInterfaces() in line 1854, why do you expect no change here?
[12:57] <TheMue> vladk: state/machine_test.go
[12:59] <vladk> TheMue: because we have a watcher per machine, so it does not notify about interface changes on foreign machine
[12:59] <TheMue> vladk: ah, ic, thx
[13:00] <perrito666> natefinch: the suspense is killing us, what is the verdict?
[13:00] <TheMue> vladk: and found the error here in my code ;)
[13:03] <TheMue> vladk: yeah, that has been the needed info, now it works.
[13:11] <TheMue> vladk: you’re OCR? here’s PR https://github.com/juju/juju/pull/294 for one of the flakey tests (see lp:1339864)
[13:11] <vladk> TheMue: looking
[13:11] <TheMue> vladk: thanks
[13:15] <perrito666> more than 60 secs to run 1 test when using mgosuite... beautiful
[13:38] <perrito666> wwitzel3: dont get that near the computer, it will mess your eyes
[13:38] <perrito666> :p
[13:44] <wwitzel3> perrito666: :)
[14:14] <TheMue> vladk: any questions regarding the PR?
[14:15] <vladk> TheMue: no, just a minute
[14:15] <TheMue> vladk: thx
[14:15] <TheMue> vladk: looking into the second issue during that time ;)
[14:18] <mgz> TheMue: I'm confused as to how that pr actually makes the test reliable
[14:19] <TheMue> mgz: it was a race condition, as it didn’t use the collection of changes before. and when those came with the wrong timing the watcher sent multiple events
[14:20] <TheMue> mgz: I’ve been able to reproduce the error locally by removing the error asserts in the test (to make it faster; funnily a Sleep doesn’t change it)
[14:21] <TheMue> mgz: and after the change (dimiter gave me the hint with collect) it worked fine in all situations
[14:32] <vladk> TheMue: where did you find a race condition? We have a separate watcher for every machine with separate map of changes
[14:33] <TheMue> vladk: I didn’t find it, Andrew did. See https://bugs.launchpad.net/juju-core/+bug/1339864.
[14:33] <_mup_> Bug #1339864: MachineSuite.TestWatchInterfaces intermittently fails <intermittent-failure> <test-failure> <juju-core:In Progress by themue> <https://launchpad.net/bugs/1339864>
[14:35] <TheMue> vladk: I simply tried to understand it, discussed with dimiter, found a way to have it almost every time here on my machine too, changed it and now it never happened again (ran the test multiple times in a loop)
[14:35] <mgz> $$\S+$$
[14:37] <vladk> TheMue: It would be better to drop that test with two sequential changes rather than implement collect. I agree that the test is not correct, but there is no race condition here. Even with collect implemented, that test may fail.
[14:37] <TheMue> vladk: why use a different pattern than in the other watchers?
[14:38] <TheMue> vladk: collect() hasn’t been implemented by me and is used in multiple watchers
[14:38] <vladk> TheMue: because all watchers already have different patterns
[14:39] <vladk> I'm not sure that this code has large practical meaning, because interfaces are not changed very often. We will get more complex code, but will not get any increase in performance.
[14:39] <TheMue> vladk: in details yes, not always in patterns
[14:40] <vladk> TheMue: I gave you LGTM
[14:41] <TheMue> vladk: thanks
[14:41] <vladk> TheMue: But I would drop that test with 2 events or change it to one event.
[14:41] <vladk> TheMue: also there is a mistake in WatchInterfaces() comment
[14:42] <TheMue> vladk: Some minutes
[14:44] <wallyworld> mgz: https://github.com/juju/juju/pull/295
[15:36] <axw> wallyworld: https://github.com/juju/juju/pull/296
[16:26] <katco> niemeyer: you around?
[16:40] <niemeyer> katco: Always!
[16:40] <niemeyer> ;)
[17:19] <mattyw> rogpeppe, ping?
[17:25] <rogpeppe> mattyw: off now, sorry
[17:25] <rogpeppe> g'night all
[17:32] <voidspace> niemeyer: ping
[17:33] <voidspace> niemeyer: I'm looking at a juju bug, "i/o timeout" errors, which occur in various places https://bugs.launchpad.net/juju-core/+bug/1307434
[17:33] <_mup_> Bug #1307434: talking to mongo can fail with "TCP i/o timeout" <cloud-installer> <landscape> <performance> <reliability> <juju-core:Triaged> <https://launchpad.net/bugs/1307434>
[17:33] <voidspace> niemeyer: we *believe* this is due to the way we use a single session for *everything* in juju
[17:34] <voidspace> niemeyer: please bear in mind that I know very little about mgo and our use of it, so I'm very happy for you to correct any misconceptions I may have....
[17:34] <voidspace> niemeyer: state.State has a transaction runner (actually our own that wraps an mgo one)
[17:34] <niemeyer> voidspace: Heya
[17:34] <voidspace> niemeyer: o/ sorry for the wall of text :-)
[17:34] <niemeyer> voidspace: In a call, but can be with you in 20mins, at most
[17:35] <voidspace> niemeyer: ok, sure - I'll finish describing the problem and if you get a chance to look at it and answer my question (I do have a question) then great
[17:35] <voidspace> niemeyer: if not I can fire it off as an email (probably when I get back from Lexington)
[17:35] <niemeyer> voidspace: Sounds good
[17:36] <voidspace> niemeyer: we would rather use the socket pooling that is built into mgo
[17:36] <niemeyer> voidspace: I mean, please feel free to continue describing the problem.. will read and respond right after the call
[17:36] <voidspace> niemeyer: yep, thanks
[17:37] <voidspace> niemeyer: the comment on the mgo transaction runner says
[17:37] <voidspace> // Multiple transaction collections may exist in a single database, but
[17:37] <voidspace> // all collections that are touched by operations in a given transaction
[17:37] <voidspace> // collection must be handled exclusively by it.
[17:38] <voidspace> niemeyer: this leads us to believe that we must only have a single transaction runner and that we can't just switch to using multiple sessions (as the transaction runner has a reference to the session it is created from)
[17:40] <mgz> fun fixes: https://github.com/juju/juju/pull/298
[17:40] <voidspace> niemeyer: and for some reason, just refreshing the session every time we use it is causing auth errors (possibly related to the way we change passwords - but reopening the state after changing the password doesn't fix the auth errors, so there must be some other interaction here)
[17:40] <voidspace> niemeyer: so, can you see a way for us to avoid having a long lived transaction runner / session - and do you think that it is indeed likely that the long lived session is the root cause of these errors we're seeing
[17:50] <niemeyer> voidspace: Okay, here
[17:50] <niemeyer> Reading...
[17:50] <voidspace> niemeyer: o/
[17:50] <niemeyer> voidspace: The reason why it's timing out is likely that you have a session sitting around unused for more than 10 minutes
[17:51] <voidspace> niemeyer: yep, we use a single session for everything
[17:51] <niemeyer> voidspace: mgo currently does not ping sessions in use to keep them alive
[17:52] <niemeyer> voidspace: Using a single session is not strictly a problem by itself
[17:52] <niemeyer> voidspace: It may be a problem depending on what else is being done
[17:52] <voidspace> niemeyer: ok
[17:52] <voidspace> niemeyer: the transaction runner keeps a reference to the session, and we must have a single transaction runner (we believe)
[17:52] <niemeyer> voidspace: For example, sharing a session means sharing the underlying socket too, which means concurrent clients may have to wait for long running operations
[17:53] <voidspace> niemeyer: right
[17:53] <niemeyer> voidspace: It also means that connectivity errors break both clients at once, and you cannot recover the session locally because it would hide the error from every other user
[17:53] <voidspace> niemeyer: so we could use multiple sessions (copy / clone / refresh), but we also need to ensure we keep alive the one used by the transaction runner
[17:54] <niemeyer> voidspace: Okay, switching to the transaction runner issue
[17:54] <niemeyer> voidspace: You can run as many transaction runners with as many sessions as you want
[17:54] <niemeyer> voidspace: Creating a runner is very cheap, and has no kind of caching
[17:55] <voidspace> niemeyer: what about that comment in the transaction runner code?
[17:55] <niemeyer> voidspace: That comment is about the transaction collection..
[17:55] <niemeyer> voidspace: not about sessions
[17:55] <niemeyer> (or runners)
[17:55] <voidspace> ah... indeed
[17:55] <voidspace> cool
[17:56] <niemeyer> voidspace: I would like to implement global session pinging, but I'm a bit concerned about the cost of doing so
[17:57] <niemeyer> voidspace: I will sit on this over the weekend to ponder how it might be implemented in a good way
[17:57] <voidspace> niemeyer: I think I mentally substituted "runner" for "collection" when reading that comment - so it's not actually a concern
[17:57] <niemeyer> voidspace: I definitely don't want to have an extra goroutine for every single session, for instance
[17:57] <voidspace> niemeyer: that would certainly help us, but we should probably fix our code anyway
[17:57] <voidspace> right
[17:57] <voidspace> niemeyer: thanks, that's very helpful - I think I have some things I can try now
[17:57] <niemeyer> voidspace: Right, I do think you likely have issues to solve there either way
[17:57] <voidspace> yep
[17:58] <niemeyer> voidspace: FWIW, I've shared that sentiment before in that same context (I/O timeouts)
[17:58] <voidspace> we could have a session pinger ourselves, but that's probably not the best fix
[17:58] <niemeyer> voidspace: I pretty much always implement http request handling as Copy + defer Close, for instance
[17:58] <niemeyer> voidspace: Agreed.. it sounds like this is a symptom of a more fundamental design issue
[17:59] <voidspace> I still have to work out why we get these auth errors, but that's my problem...
[17:59] <niemeyer> voidspace: Even with pinging, connections die all the time.. then what?
[18:33] <mgz> wallyworld_: https://bugs.launchpad.net/juju-core/+bug/1340893
[18:33] <_mup_> Bug #1340893: juju bootstrap in an existing environment destroys the environment <canonical-is> <juju-core:New> <https://launchpad.net/bugs/1340893>
[19:35] <bodie_> super simple PR for testing Actions -- literally nothing added, just a testing charm location.  I'll need the godeps change in order to land Actions RunHook
[19:35] <bodie_> https://github.com/juju/charm/pull/16
[19:52] <jcw4> bodie_: LGTM, I think it's fine to merge
[19:53] <bodie_> yeah
[20:00] <bodie_> hmm, my dependencies.tsv diff seems off
[20:01] <jcw4> how so?
[20:02] <bodie_> I seem to have a different entry for loggo even though I've synced with master
[20:02] <mgz> bodie_: see my latest mp
[20:02] <jcw4> bodie_: I think the launchpad.net/loggo dependency was wrong
[20:02] <jcw4> mgz: +1
[20:02] <bodie_> ah
[20:02] <mgz> pull/299
[20:02] <bodie_> thanks
[20:03] <mgz> so, just remove that line
[20:10] <bodie_> there we are then https://github.com/juju/juju/pull/301