thumper | taking the kids out for lunch | 00:30 |
---|---|---|
thumper | back later | 00:30 |
=== thumper is now known as thumper-afk | ||
=== thumper-afk is now known as thumper | ||
thumper | davecheney: got a few minutes for a hangout? | 03:04 |
davecheney | sure | 03:04 |
davecheney | i'll get my other pc | 03:05 |
thumper | k | 03:05 |
davecheney | ready | 03:05 |
thumper | davecheney: https://plus.google.com/hangouts/_/gqpui2frcqiffoyg6usnsp6dfaa?authuser=1&hl=en | 03:06 |
=== Ursinha is now known as Ursinha-afk | ||
=== Ursinha-afk is now known as Ursinha | ||
stokachu | someone mind taking a look at this json response http://paste.ubuntu.com/7778445/ | 03:47 |
stokachu | im passing it the ConfigYAML file with the service name | 03:47 |
stokachu | but it isn't finding the settings for keystone for some reason | 03:48 |
stokachu | i parsed the yaml file to make sure there were no errors and it loads up with the keystone set as the key | 03:49 |
thumper | stokachu: is the service named keystone? | 03:53 |
thumper | stokachu: I've found that the config is based on the name of the service, not the charm name | 03:54 |
stokachu | ah hmm | 03:54 |
stokachu | lemme check something | 03:54 |
stokachu | so i set the ServiceName to keystone | 03:54 |
thumper | stokachu: what is it named in status? | 03:55 |
thumper | stokachu: or, another way, what was your deploy command? | 03:55 |
stokachu | im using ServiceDeploy api call | 03:56 |
stokachu | so i pass in the params for ServiceName etc | 03:56 |
stokachu | thumper, if i deploy without ConfigYAML set then it deploys the keystone service named 'keystone' | 03:57 |
thumper | right | 03:57 |
thumper | sorry, not sure | 03:58 |
stokachu | i cant find a test that uses the ConfigYAML parameters | 04:01 |
stokachu | i tried to mimic the api call to https://github.com/juju/juju/blob/master/state/apiserver/client/client_test.go#L807 | 04:04 |
stokachu | which i think is right | 04:04 |
thumper | stokachu: let me look at something | 04:06 |
stokachu | ok | 04:06 |
thumper | stokachu: according to the code in cmd/juju/deploy.go, the yaml config expected in the ServiceDeploy api is the entire yaml file with a top level name matching the service name. | 04:10 |
stokachu | so not the string? | 04:10 |
stokachu | filename string i take it | 04:10 |
thumper | haha | 04:10 |
thumper | no | 04:10 |
thumper | you need to read the file, and pass the yaml as string | 04:11 |
stokachu | ah well then thats probably why | 04:11 |
thumper | yep | 04:11 |
thumper | will be | 04:11 |
stokachu | shouldnt code so late | 04:11 |
* thumper was hacking at 1am the other night | 04:11 | |
thumper | gets tiring | 04:11 |
stokachu | yea but im so close to get this working lol | 04:12 |
stokachu | damnit what a stupid mistake | 04:14 |
stokachu | thumper, thanks for the extra set of eyes | 04:15 |
thumper | np | 04:15 |
thumper | my brain has stopped working | 04:20 |
thumper | time to leave | 04:20 |
waigani | ugh, just realized my irc was not connected | 04:46 |
waigani | davecheney: here's the output of my script: http://pastebin.ubuntu.com/7778556/ | 04:48 |
waigani | davecheney: it deals with comments, if the groups are in the wrong order, if there are too many groups. I *think* I don't have any false positives. | 04:50 |
* davecheney looks | 05:15 | |
=== vladk|offline is now known as vladk | ||
vladk | dimitern: please, take a look http://github.com/juju/juju/pull/121 | 06:51 |
dimitern | vladk, looking | 06:57 |
TheMue | morning | 07:33 |
urulama | morning | 07:33 |
dimitern | vladk, reviewed | 07:39 |
vladk | dimitern: thanks | 07:39 |
dimitern | vladk, remind me again please, why do we need to care about parsing the interfaces files so thoroughly ? | 07:42 |
dimitern | vladk, can't we get away with just what the net package gives us? | 07:42 |
dimitern | (and running some commands like ifup etc.) | 07:42 |
dimitern | also, seems like fixing maas changes to interfaces files is not a job for juju | 07:43 |
dimitern | (does it need to be a special case like that?) | 07:43 |
vladk | dimitern: to bring up interface, we need to add some lines to configuration file, in the future we are going to support static IPs, so configuration lines may depend on network state | 07:51 |
dimitern | vladk, well, my point is we can *write* to the interfaces files, but use the net package to *read* what's currently on and has address | 07:52 |
dimitern | vladk, re static ips - let's think about that when we get there - it won't be until october that's for sure | 07:52 |
vladk | dimitern: I asked people and found that most prefer to have configuration in separate files | 07:53 |
vladk | additionally it may be interfaces not managed by juju | 07:53 |
dimitern | vladk, if it's a machine juju started, juju has total control over it and we don't have to care about what users change later | 07:53 |
dimitern | vladk, so istm being less careful to preserve what's there and making sure we (over)write the config file for interfaces juju manages (all of them) should be fine | 07:55 |
dimitern | TheMue, a small-ish review? https://github.com/juju/juju/pull/293 | 08:40 |
TheMue | dimitern: yep | 08:41 |
dimitern | damn! i've edited a card and then it disappeared! | 08:44 |
dimitern | TheMue, thanks! | 08:45 |
TheMue | dimitern: after first fears that the review process in GH gets worse I now feel more comfortable with it | 08:47 |
dimitern | TheMue, it grows on you :) | 08:51 |
TheMue | dimitern: ;) | 08:54 |
TheMue | dimitern: you’ve got a review | 08:55 |
dimitern | TheMue, tyvm | 08:55 |
TheMue | dimitern: yw | 08:55 |
* TheMue continues with the flakey tests | 08:56 | |
dimitern | TheMue, need any help on these? For the "port 54321 already in use" it should be trivial not to hard code it, but use ":0" and then get the random port Listen used | 08:58 |
TheMue | dimitern: I’m currently at the double interface event | 08:59 |
dimitern | TheMue, ok | 09:00 |
TheMue | dimitern: but I’ll keep your hint in mind, thanks | 09:00 |
dimitern | TheMue, cheer | 09:00 |
dimitern | s :) | 09:00 |
=== vladk is now known as vladk|offline | ||
=== vladk|offline is now known as vladk | ||
TheMue | ouch | 09:05 |
TheMue | system just logged me out (w/ kiling all appls) | 09:05 |
dimitern | TheMue, the PR linked to the flaky tests bug report is wrong btw | 09:05 |
dimitern | aw.. | 09:05 |
TheMue | dimitern: which PR? | 09:06 |
TheMue | dimitern: I’ve not yet done one | 09:06 |
dimitern | TheMue, the one linked is about the networker https://github.com/juju/juju/pull/207/ | 09:06 |
dimitern | TheMue, but the failing test is in state/machine_test.go, which is an earlier one - https://github.com/juju/juju/pull/169/ | 09:08 |
TheMue | dimitern: thx, will change it | 09:08 |
=== vladk is now known as vladk|offline | ||
TheMue | dimitern: could you pls take a look at state/machine_test.go line 1803 - 1808? | 09:15 |
dimitern | TheMue, something's fishy, I've been looking at the machine interfaces watcher implementation | 09:16 |
TheMue | dimitern: if two different interfaces are disabled, why we only await one change? | 09:16 |
dimitern | TheMue, the only reason I can think of is if 2 changes are merged and reported as one, but the watcher has to be written like that | 09:17 |
TheMue | dimitern: that would be in watcher.go 1758ff | 09:18 |
dimitern | TheMue, there's a collect() func in state/watcher | 09:18 |
dimitern | TheMue, which is supposed to collapse multiple changes into one, but the machine interfaces watcher is not using it, but it's being tested as it is | 09:19 |
TheMue | dimitern: yes, looks like | 09:20 |
dimitern | TheMue, what freaks me out is why it's not happening *all the time* | 09:21 |
TheMue | dimitern: it’s a race, sometimes the second disabling event is fast enough received, sometimes not | 09:22 |
dimitern | TheMue, that seems to be the issue - *all* notify watchers use collect() inside the loop | 09:22 |
TheMue | dimitern: so I’ll change it do it so here too | 09:22 |
dimitern | TheMue, can that be forced by introducing a delay before disabling the second iface in the test? | 09:23 |
TheMue | dimitern: not yet tested | 09:23 |
dimitern | TheMue, that way you can be sure your changes in the loop() work | 09:23 |
TheMue | dimitern: good idea, will try that first to have a good testbed | 09:23 |
dimitern | TheMue, thanks | 09:23 |
dimitern | TheMue, also, while you're at it, can you please drop the notify flag from the loop? setting out to nil or to w.out is enough to signal a notification is needed | 09:24 |
TheMue | dimitern: yip | 09:25 |
dimitern | TheMue, ta | 09:26 |
TheMue | dimitern: btw, why did you moved the flakey test card, there’s no PR yet ;) | 09:29 |
dimitern | TheMue, did I? Wasn't intentional if it was me | 09:31 |
TheMue | dimitern: changed it back, an will split it into machine interface and the other one | 09:31 |
dimitern | TheMue, good idea | 09:32 |
=== vladk|offline is now known as vladk | ||
vladk | dimitern: you wrote many comments on exportable names. 90% of them are related to access to internals from test package, because test package has different name. How do you solve this problem? | 09:57 |
dimitern | vladk, using export_test? | 09:58 |
=== urulama is now known as uru-afk | ||
TheMue | dimitern: standup | 10:46 |
dimitern | TheMue, brt | 10:46 |
=== uru-afk is now known as urulama | ||
=== vladk is now known as vladk|offline | ||
=== vladk|offline is now known as vladk | ||
=== vladk is now known as vladk|offline | ||
=== vladk|offline is now known as vladk | ||
mgz | http://code.mumak.net/2014/07/cutting-edge-debugging.html | 12:44 |
=== psivaa is now known as psivaa-lunch | ||
TheMue | vladk: ping | 12:55 |
vladk | pong | 12:55 |
vladk | TheMue: pong | 12:55 |
TheMue | vladk: in TestWatchInterfaces() in line 1854, why do you expect no change here? | 12:56 |
TheMue | vladk: state/machine_test.go | 12:57 |
vladk | TheMue: because we have a watcher per machine, so it does not notify about interface changes on foreign machine | 12:59 |
TheMue | vladk: ah, ic, thx | 12:59 |
perrito666 | natefinch: the suspense is killing us, what is the veredict? | 13:00 |
TheMue | vladk: and found the error here in my code ;) | 13:00 |
TheMue | vladk: yeah, that has been the needed info, now it works. | 13:03 |
TheMue | vladk: you’re OCR? here’s PR https://github.com/juju/juju/pull/294 for one of the flakey tests (see lp:1339864) | 13:11 |
vladk | TheMue: looking | 13:11 |
TheMue | vladk: thanks | 13:11 |
perrito666 | more than 60 secs to run 1 test when using mgosuite... beautiful | 13:15 |
perrito666 | wwitzel3: dont get that near the computer, it will mess your eyes | 13:38 |
perrito666 | :p | 13:38 |
wwitzel3 | perrito666: :) | 13:44 |
=== Ursinha is now known as Ursinha-afk | ||
=== Ursinha-afk is now known as Ursinha | ||
=== psivaa-lunch is now known as psivaa | ||
TheMue | vladk: any questions regarding the PR? | 14:14 |
vladk | TheMue: no, just a minute | 14:15 |
TheMue | vladk: thx | 14:15 |
TheMue | vladk: looking into the second issue during that time ;) | 14:15 |
mgz | TheMue: I'm confused as to how that pr actually makes the test reliable | 14:18 |
TheMue | mgz: it has been a race condition as it not used the collection of changes before. and when those have been with the wrong timng the watcher sent multiple events | 14:19 |
TheMue | mgz: I’ve been able to reproduce the error locally by removing the error asserts in the test (to make it faster, funnily a Sleep doesn’t changed it) | 14:20 |
TheMue | mgz: and after the change (dimiter gave me the hint with collect) it worked fine in all situations | 14:21 |
vladk | TheMue: where did you find a race condition? We have a separate watcher for every machine with separate map of changes | 14:32 |
TheMue | vladk: I didn’t found it, Andrew. See https://bugs.launchpad.net/juju-core/+bug/1339864. | 14:33 |
_mup_ | Bug #1339864: MachineSuite.TestWatchInterfaces intermittently fails <intermittent-failure> <test-failure> <juju-core:In Progress by themue> <https://launchpad.net/bugs/1339864> | 14:33 |
=== vladk is now known as vladk|offline | ||
=== vladk|offline is now known as vladk | ||
TheMue | vladk: I simply tried to understand it, discussed with dimiter, found a way to have it almost every time here on my machine too, changed it and now it never happened again (ran the test multiple times in a loop) | 14:35 |
mgz | \$\$\S+\$\$ | 14:35 |
vladk | TheMue: It would better to drop that test with two sequential changes, rather to implement collect. I agree that the test is not correct. But no race condition here. Even with the collect implemented that test may failed. | 14:37 |
TheMue | vladk: why using a different pattern then in the other watchers? | 14:37 |
TheMue | vladk: collect() hasn’t been implemented by me and is used in multiple watchers | 14:38 |
vladk | TheMue: because all watchers already has different patterens | 14:38 |
vladk | I'm not sure that this code has large practical meaning, because interfaces are not changed very often. We will get more complex code, but will not get any increase in performance. | 14:39 |
TheMue | vladk: in details yes, not always in patterns | 14:39 |
vladk | TheMue: I gave you LGTM | 14:40 |
TheMue | vladk: thanks | 14:41 |
vladk | TheMue: But I would drop that test with 2 events or changed it to one event. | 14:41 |
vladk | TheMue: also there is a mistake in WatchInterfaces() comment | 14:41 |
TheMue | vladk: Some minutes | 14:42 |
=== vladk is now known as vladk|offline | ||
wallyworld | mgz: https://github.com/juju/juju/pull/295 | 14:44 |
axw | wallyworld: https://github.com/juju/juju/pull/296 | 15:36 |
=== vladk|offline is now known as vladk | ||
katco | niemeyer: you around? | 16:26 |
niemeyer | katco: Always! | 16:40 |
niemeyer | ;) | 16:40 |
=== urulama is now known as uru-afk | ||
mattyw | rogpeppe, ping? | 17:19 |
rogpeppe | mattyw: off now, sorry | 17:25 |
rogpeppe | g'night all | 17:25 |
voidspace | niemeyer: ping | 17:32 |
voidspace | niemeyer: I'm looking at a juju bug, "i/o timeout" errors, which occur in various places https://bugs.launchpad.net/juju-core/+bug/1307434 | 17:33 |
_mup_ | Bug #1307434: talking to mongo can fail with "TCP i/o timeout" <cloud-installer> <landscape> <performance> <reliability> <juju-core:Triaged> <https://launchpad.net/bugs/1307434> | 17:33 |
voidspace | niemeyer: we *believe* this is due to the way we use a single session for *everything* in juju | 17:33 |
voidspace | niemeyer: please bear in mind that I know very little about mgo and our use of it, so I'm very happy for you to correct any misconceptions I may have.... | 17:34 |
voidspace | niemeyer: state.State has a transaction runner (actually our own that wraps an mgo one) | 17:34 |
niemeyer | voidspace: Heya | 17:34 |
=== ev_ is now known as ev | ||
voidspace | niemeyer: o/ sorry for the wall of text :-) | 17:34 |
niemeyer | voidspace: In a call, but can be with you in 20mins, at most | 17:34 |
voidspace | niemeyer: ok, sure - I'll finish describing the problem and if you get a chance to look at it and answer my question (I do have a question) then great | 17:35 |
voidspace | niemeyer: if not I can fire it off as an email (probably when I get back from Lexington) | 17:35 |
niemeyer | voidspace: Sounds good | 17:35 |
voidspace | niemeyer: we would rather use the socket pooling that is built into mgo | 17:36 |
niemeyer | voidspace: I mean, please feel free to continue describing the problem.. will read and respond right after the call | 17:36 |
voidspace | niemeyer: yep, thanks | 17:36 |
voidspace | niemeyer: the comment on the mgo transaction runner says | 17:37 |
voidspace | / Multiple transaction collections may exist in a single database, but | 17:37 |
voidspace | / all collections that are touched by operations in a given transaction | 17:37 |
voidspace | / collection must be handled exclusively by it. | 17:37 |
voidspace | niemeyer: this leads us to believe that we must only have a single transaction runner and that we can't just switch to using multiple sessions (as the transaction runner has a reference to the session it is created from) | 17:38 |
mgz | fun fixes: https://github.com/juju/juju/pull/298 | 17:40 |
voidspace | niemeyer: and for some reason, just refreshing the session every time we use it is causing auth errors (possibly related to the way we change passwords - but reopening the state after changing password doesn't fix the auth errors so there must be some other interactin here) | 17:40 |
voidspace | niemeyer: so, can you see a way for us to avoid having a long lived transaction runner / session - and do you think that it is indeed likely that the long lived session is the root cause of these errors we're seeing | 17:40 |
niemeyer | voidspace: Okay, here | 17:50 |
niemeyer | Reading... | 17:50 |
voidspace | niemeyer: o/ | 17:50 |
niemeyer | voidspace: The reason why it's timing out is likely that you have a session sitting around unused for more than 10 minutes | 17:50 |
voidspace | niemeyer: yep, we use a single session for everything | 17:51 |
niemeyer | voidspace: mgo currently does not ping sessions in use to keep them alive | 17:51 |
niemeyer | voidspace: Using a single session is not strictly a problem by itself | 17:52 |
niemeyer | voidspace: It may be a problem depending on what else is being done | 17:52 |
voidspace | niemeyer: ok | 17:52 |
voidspace | niemeyer: the transaction runner keeps a reference to the session, and we must have a single transaction runner (we believe) | 17:52 |
niemeyer | voidspace: For example, sharing a session means sharing the underlying socket too, which means concurrent clients may have to wait for long running operations | 17:52 |
voidspace | niemeyer: right | 17:53 |
niemeyer | voidspace: It also means that connectivity errors break both clients at once, and you cannot recover the session locally because it would hide the error from every other user | 17:53 |
voidspace | niemeyer: so we could use multiple sessions (copy / clone / refresh), but we also need to ensure we keep alive the one used by the transaction runner | 17:53 |
niemeyer | voidspace: Okay, switching to the transaction runner issue | 17:54 |
niemeyer | voidspace: You can run as many transaction runners with as many sessions as you want | 17:54 |
niemeyer | voidspace: Creating a runner is very cheap, and has no kind of caching | 17:54 |
voidspace | niemeyer: what about that comment in the transaction runner code? | 17:55 |
niemeyer | voidspace: That comment is about the transaction collection.. | 17:55 |
niemeyer | voidspace: not about sessions | 17:55 |
niemeyer | (or runners) | 17:55 |
voidspace | ah... indeed | 17:55 |
voidspace | cool | 17:55 |
niemeyer | voidspace: I would like to implement global session pinging, but I'm a bit concerned about the cost of doing so | 17:56 |
niemeyer | voidspace: I will sit on this over the weekend to ponder how it might be implemented in a good way | 17:57 |
voidspace | niemeyer: I think I mentally substituted "runner" for "collection" when reading that comment - so it's not actually a concern | 17:57 |
niemeyer | voidspace: I definitely don't want to have an extra goroutine for every single session, for instance | 17:57 |
voidspace | niemeyer: that would certainly help us, but we should probably fix our code anyway | 17:57 |
voidspace | right | 17:57 |
voidspace | niemeyer: thanks, that's very helpful - I think I have some things I can try now | 17:57 |
niemeyer | voidspace: Right, I do think you likely have issues to solve there either way | 17:57 |
voidspace | yep | 17:57 |
niemeyer | voidspace: FWIW, I've shared that sentiment before in that same context (I/O timeouts) | 17:58 |
voidspace | we could have a session pinger ourselves, but that's probably not the best fix | 17:58 |
niemeyer | voidspace: I pretty much always implement http request handling as Copy + defer Close, for instance | 17:58 |
niemeyer | voidspace: Agreed.. it sounds like this is a symptom of a more fundamental design issue | 17:58 |
voidspace | I still have to work out why we get these auth errors, but that's my problem... | 17:59 |
niemeyer | voidspace: Even with pinging, connections die all the time.. then what? | 17:59 |
voidspace | m1k43l | 18:05 |
voidspace | m1k43l | 18:05 |
mgz | wallyworld_: https://bugs.launchpad.net/juju-core/+bug/1340893 | 18:33 |
_mup_ | Bug #1340893: juju bootstrap in an existing environment destroys the environment <canonical-is> <juju-core:New> <https://launchpad.net/bugs/1340893> | 18:33 |
bodie_ | super simple PR for testing Actions -- literally nothing added, just a testing charm location. I'll need the godeps change in order to land Actions RunHook | 19:35 |
bodie_ | https://github.com/juju/charm/pull/16 | 19:35 |
=== vladk is now known as vladk|offline | ||
=== vladk|offline is now known as vladk | ||
jcw4 | bodie_: LGTM, I think it's fine to merge | 19:52 |
bodie_ | yeah | 19:53 |
=== vladk is now known as vladk|offline | ||
bodie_ | hmm, my dependencies.tsv diff seems off | 20:00 |
jcw4 | how so? | 20:01 |
bodie_ | I seem to have a different entry for loggo even though I've synced with master | 20:02 |
mgz | bodie_: see my latest mp | 20:02 |
jcw4 | bodie_: I think the launchpad.net/loggo dependency was wrong | 20:02 |
jcw4 | mgz: +1 | 20:02 |
bodie_ | ah | 20:02 |
mgz | pull/299 | 20:02 |
bodie_ | thanks | 20:02 |
mgz | so, just remove that line | 20:03 |
bodie_ | there we are then https://github.com/juju/juju/pull/301 | 20:10 |
Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!