[00:16] wallyworld_: o/
[00:29] hi davecheney
[00:32] wallyworld_: how can I help with that ppc bug ?
[00:32] davecheney: fix the compiler :-)
[00:32] wallyworld_: link
[00:33] it broke between -10 and -12
[00:33] * wallyworld_ looks up link
[00:33] https://bugs.launchpad.net/bugs/1365480
[00:33] Bug #1365480: new compiler breaks ppc64el unit tests in many ways
[00:34] the part that hooks in the monkey patch is broken
[00:34] i guess because the call stack has changed somehow
[00:34] let me know if i can provide more info, thanks for looking
[00:34] np
[00:34] sorry i couldn't help on friday
[00:34] i have many hats atm
[00:36] np
[00:36] i got the landing unblocked
[00:36] cause i managed to help prove it was a compiler bug, not juju
[00:37] excellent
[00:42] oh crap
[00:42] i forgot python on winton-09 was screwed
[00:42] hmm
=== Guest4804 is now known as wallyworld
[01:15] hmm, mercurial on stilson-07 is also screwed
[01:40] axw_: can I get a few quick reviews?
[01:40] https://github.com/juju/juju/pull/692
[01:40] thumper: hey. sure
[01:40] https://github.com/juju/names/pull/25
[01:40] https://github.com/juju/juju/pull/693
[01:40] axw_: all related to changes due to land soon where the initial user isn't called "admin" any more
[01:40] okey dokey
[01:41] I'm trying to break it up into small understandable bits
[01:46] wallyworld_: i cannot reproduce the error
[01:46] hmmm, that doesn't make sense
[01:47] very little does these days
[01:47] so the affected tests using the ControlHook stuff all pass on -12?
[01:47] wallyworld_: that isn't what I see in https://bugs.launchpad.net/bugs/1365480
[01:47] Bug #1365480: new compiler breaks ppc64el unit tests in many ways
[01:48] i see a repl test failure and a linking failure
[01:48] https://bugs.launchpad.net/juju-core/+bug/1365480/comments/3
[01:49] man, there are three independent bugs on that issue
[01:49] i can't confirm that failure either
[01:50] from the attached log
[01:50] local_test.go:244:
[01:50] c.Assert(err, gc.ErrorMatches, "(.|\n)*cannot allocate a public IP as needed(.|\n)*")
[01:50] ... value = nil
[01:50] ... regex string = "" +
[01:50] ... "(.|\n" +
[01:50] ... ")*cannot allocate a public IP as needed(.|\n" +
[01:50] ... ")*"
[01:50] ... Error value is nil
[01:50] that is the symptom of the compiler issue
[01:50] forget the replicaset failure
[01:50] that fails all the time
[01:50] i cannot confirm that failure
[01:51] i've been able to reproduce the linking failure now
[01:51] the local test failure?
[01:51] /tmp/go-build604963151/github.com/juju/juju/api/provisioner/_test/github.com/juju/juju/api/libprovisioner.a(provisioner.o): In function `github_com_juju_juju_api_provisioner.State$equal':
[01:51] /home/ubuntu/src/github.com/juju/juju/api/provisioner/provisioner.go:10: multiple definition of `github_com_juju_juju_api_provisioner.State$equal'
[01:51] /home/ubuntu/pkg/gccgo_linux_ppc64/github.com/juju/juju/api/libprovisioner.a(provisioner.o):/home/ubuntu/src/github.com/juju/juju/api/provisioner/provisioner.go:10: first defined here
[01:51] trivial for someone: https://github.com/juju/juju/pull/694
[01:52] davecheney: so with -12, you don't get FAIL: local_test.go:227: TestBootstrapFailsWhenPublicIPError.pN61_github_com_juju_juju_provider_openstack_test.localServerSuite ???
[01:53] i think that failure is because of the one above
[01:53] it's really hard to tell
[01:53] so many things are broken on ppc
[01:53] python has stopped working
[01:53] no, it's due to my comment in the ug
[01:53] bug
[01:53] i think
[01:53] maybe not
[01:53] but all the failures point to the control hook not being run
[01:54] ok
[01:54] "all the failures" = the ones where a control hook is used but not run
[01:54] and a slightly bigger one: https://github.com/juju/juju/pull/695
[01:55] wallyworld_: which package
[01:55] provider/openstack ?
[01:55] davecheney: provider/openstack
[01:55] yep
[01:56] there's maybe 4 or 5 tests which use the control hook stuff
[01:56] and i think they all fail
[01:56] but pass with -10
[01:56] ok, confirmed with -12
[01:56] thanks
[01:59] great, ty
[02:27] axw_: i called the handleFailure func because i thought we'd need to ensure that the interrupt notify APIs were called. do we not need to do that?
[02:28] 1 sec
[02:29] wallyworld_: no we don't, that's just there to add feedback if the user hits ctrl-c while it's tearing down
[02:29] ah, rightio
[02:29] i'll fix, thanks
[02:29] cheers
[02:30] axw_: speaking of that, have you seen this one? https://bugs.launchpad.net/juju-core/+bug/1365665
[02:30] Bug #1365665: ^C doesn't stop bootstrap
[02:31] wallyworld_: hmm, nope, hadn't
[02:31] np, keep it in mind as a background task, might be a simple fix
[02:54] thumper: https://github.com/juju/juju/pull/697
[02:58] thumper: wallyworld_ https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=750444
[02:58] mercurial blow up bug when http_proxy is set
[02:59] http://bugs.python.org/issue7776
[02:59] broken in both python2 and python3
[02:59] yay
[03:00] davecheney: but fixed in hg 3.0.1?
[03:00] we don't ship that version in trusty
[03:00] ii mercurial 2.8.2-1ubuntu1 ppc64el easy-to-use, scalable distributed version control system
[03:00] ah, i'm running utopic
[03:00] ii python 2.7.8-1~ubuntu1 ppc64el interactive high-level object-oriented language (default version)
[03:00] only happens if you set http_proxy
[03:00] which we do on ppc machines in our lab
[03:00] ok
[03:00] :-(
[03:02] maybe i can just hulk smash in the utopic version to get things going
[03:38] thumper: re my comment
[03:39] we (juju) have several admin users
[03:39] yes...
[03:39] well, for now
[03:39] right
[03:39] and currently they also all have the same password
[03:39] but in the future they won't
[03:39] so please make it very obvious which "admin" we're talking about
[03:39] sure...
[03:39] ie, does the mongo admin user have to be called "admin"
[03:40] i'd prefer it if it wasn't
[03:40] AFAICT, which is why I added `mongo.AdminUser`
[03:40] thumper: my concern is we have too many sets of credentials called admin
[03:40] we have the mongodb username/password
[03:40] we have the state first user
[03:41] we have the details we pass over the api.
etc
[03:41] can they all have different usernames to make it super easy when one is being transposed
[03:41] otherwise it'll just be "admin": login failed, wrong password
[03:41] where it would be clearer if
[03:41] mongodb: "environment-admin-user": login failed, wrong password
[03:41] well, for now the initial user is still called "admin", but the branch I'm landing soon changes that for all our tests
[03:42] I've had to tease it apart
[03:42] ok, so if you tell me it'll get changed in a followup
[03:42] davecheney: RSN, the only "admin" user will be the mongo admin user
[03:42] i withdraw my nitpicking
[03:42] ok
[03:42] but when that happens
[03:42] can the mongo admin user not be called admin
[03:42] "mongo-admin-user"
[03:42] "master"
[03:42] "root"
[03:42] "scott"
[03:42] anything
[03:42] just not admin
[03:42] I'm not sure we can... can we?
[03:43] maybe...
[03:43] ok, if mongo is hard coded to call the admin user admin
[03:43] then we cannot use that word anywhere else
[03:43] it'll be the only one
[03:43] ie, on the api
[03:43] ok then
[03:43] I've changed our tests to have the env owner as:
[03:43] "initialize-admin"
[03:43] "factory-admin"
[03:43] "dummy-admin"
[03:43] etc
[03:44] EXCELLENT
[03:44] davecheney: it's coming
[03:44] very soon now
[03:44] ...
[04:16] davecheney: https://github.com/howbazaar/juju/commit/26e038ab7d7c0ebb47bae2b598903b1aa5e8cd5c for https://github.com/juju/juju/pull/695
[04:21] thumper: do userTag.Name() and userTag.Id() do the same thing ?
[04:22] there are too many subtly named methods on that type
[04:24] davecheney: no
[04:24] davecheney: Id() returns what it was constructed with...
[04:25] Name() returns the first part of the string
[04:25] so...
[04:25] NewUserTag("dave"), id == "dave", name == "dave", provider == "local" (but internally it is "")
[04:25] NewUserTag("dave@local"), id == "dave@local", name == "dave", provider == "local"
[04:26] the Id needs to be able to be fed back into the type to make something equal
[04:26] the tag round-tripping bit
[04:26] do Id and Username do the same thing ?
[04:26] no
[04:26] NewUserTag("dave"), id == "dave", username == "dave@local"
[04:27] NewUserTag("dave@local"), id == "dave@local", username == "dave@local"
[04:27] wow, subtle
[04:27] yes, had to be for the transition period
[04:27] there are six bullets in that footgun
[04:27] I've managed to keep it pretty sane in most places
[04:27] which is why I don't generally use Id() for name tags
[04:28] good plan
[04:29] davecheney: I'm starting a branch that you'll like :)
[04:33] wallyworld_: FYI, https://github.com/juju/juju/pull/700
[04:33] looking
[04:34] wallyworld_: there's no upgrade steps to move tools from provider storage yet, but it won't adversely affect anything until we drop provider storage
[04:34] searching/fetching tools will still go to provider storage for now, it's just behind the scenes
[04:34] sure, sounds ok
[04:35] going to work on getting toolstorage to remove old tarballs now
[04:35] easy review pls: https://github.com/juju/juju/pull/699 (thumper?)
[04:35] then will move onto charms I think
[04:35] * thumper looks at menn0's branch
[04:36] thumper: this is just something in the support infrastructure that came up today
[04:36] thumper: the big PR is Coming Soon
[04:36] * thumper nods
[04:52] thumper: cheers. I've changed to use backticks throughout those tests where it makes sense
[04:52] kk
[05:17] davecheney: are you happy enough with https://github.com/juju/juju/pull/695 ?
[05:17] yup
[05:18] go for it
[05:21] kk
[05:27] * thumper got sad
[05:28] seeing usernames passed around as tags, because tags are strings and usernames are strings, so that's all good, right?
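[editor's note] The Id / Name / Username distinction described at 04:24-04:27 can be sketched in a few lines of Go. This is only an illustration of the semantics as stated in the chat, not the code in github.com/juju/names: the `userTag` type, the `newUserTag` constructor and the `localProvider` constant are hypothetical names made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// localProvider is the provider assumed when the id has no "@provider" part.
// Hypothetical constant for this sketch.
const localProvider = "local"

// userTag is a hypothetical stand-in for the user tag discussed in the log.
type userTag struct {
	id string // exactly the string the tag was constructed with
}

func newUserTag(id string) userTag { return userTag{id: id} }

// Id returns the construction string unchanged, so feeding it back into
// newUserTag yields an equal tag (the round-tripping property).
func (t userTag) Id() string { return t.id }

// Name returns the part of the id before "@", or the whole id if there is none.
func (t userTag) Name() string {
	if i := strings.Index(t.id, "@"); i >= 0 {
		return t.id[:i]
	}
	return t.id
}

// Provider returns the part after "@", defaulting to "local".
func (t userTag) Provider() string {
	if i := strings.Index(t.id, "@"); i >= 0 {
		return t.id[i+1:]
	}
	return localProvider
}

// Username always returns the fully qualified name@provider form.
func (t userTag) Username() string { return t.Name() + "@" + t.Provider() }

func main() {
	for _, s := range []string{"dave", "dave@local"} {
		t := newUserTag(s)
		fmt.Printf("id=%q name=%q provider=%q username=%q\n",
			t.Id(), t.Name(), t.Provider(), t.Username())
	}
}
```

Under these assumptions the two constructions differ only in Id(), which is the "footgun" being pointed out: comparing ids treats "dave" and "dave@local" as different users, while comparing usernames does not.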
[05:28] but not string versions of those tags, oh no, just names
[05:44] wallyworld_: i'm not going to have a ppc fix for you today
[05:44] i've not gotten to the bottom of the issue
[05:44] davecheney: no worries, CI has been tweaked to avoid having the failing ppc tests block landings
[05:44] when we fixed this last time we made it automagically figure out the stack call depth
[05:44] and that bit is working now
[05:44] well it is working
[05:45] so something broke between -10 and -12?
[05:45] absolutely
[05:45] -12 is broken
[05:45] but I can't tell you what
[05:45] or how to fix it yet
[05:46] are there tests that should be written to avid this again?
[05:46] avoid
[05:46] unknown
[05:46] ok
[05:46] thanks for looking at it
[05:46] np
[05:46] i'll keep looking
[05:46] ok
[05:51] * thumper closes irc and plans to work on his KiwiPyCon talk tonight
[05:52] good luck
=== uru_ is now known as urulama
[06:08] wallyworld_: data point: the registerserverpoint thinggy doesn't appear to be run
[06:09] know that is what you said
[06:09] but putting a panic in the function doesn't do jack ...
[06:09] i wonder what is going on
[06:09] davecheney: i added a panic too, and it didn't get called
[06:10] i haven't looked at the code in a while, but from memory it looks at the call stack to figure out when to run the patched code
[06:10] what the firetruck
[06:10] ok
[06:10] i'll keep looking
[06:18] wallyworld_: look what I found
[06:18] // This is very fragile. fullName will be something like:
[06:18] // launchpad.net_goose_testservices_novaservice.removeServer.pN49_launchpad.net_goose_testservices_novaservice.Nova
[06:18] // so if the number of dots in the full package path changes,
[06:18] // this will need to too...
[06:19] \o/
[06:19] what file is that in?
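[editor's note] The comment quoted at 06:18 explains why the hook broke: the function name was being picked out of the gccgo symbol name by counting dots. One way to avoid depending on the dot count, sketched here purely as an illustration, is to anchor on the `.pN` receiver marker that appears in the quoted name and take the last dot-separated segment before it. `methodName` is a hypothetical helper; this is not the fix that landed in goose, and it assumes the marker always looks like the one in the quoted example.

```go
package main

import (
	"fmt"
	"strings"
)

// methodName pulls the function name out of a gccgo-style symbol name such as
//   pkgpath.removeServer.pN49_pkgpath.Nova
// It cuts the string at the first ".pN" substring (assumed to be the receiver
// marker) and returns the last dot-separated segment before that point, so
// dots inside the package path do not matter.
func methodName(fullName string) string {
	head := fullName
	if i := strings.Index(fullName, ".pN"); i >= 0 {
		head = fullName[:i]
	}
	// LastIndex returns -1 when there is no dot, so +1 yields the whole string.
	return head[strings.LastIndex(head, ".")+1:]
}

func main() {
	fmt.Println(methodName("launchpad.net_goose_testservices_novaservice.removeServer.pN49_launchpad.net_goose_testservices_novaservice.Nova"))
}
```

The sketch would still break if a package path itself contained ".pN", which is one reason a test run on the affected compiler is the real safeguard.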
[06:20] service_gccgo.go
[06:20] hook/
[06:21] gotta go get kid, bbiab
[06:25] wallyworld_: tweaked constants, things look better
[06:25] wallyworld_: btw, we already have tests for all of this
[06:25] they are in the goose repo
[06:25] can goose be tested on ppc regularly ?
[06:40] davecheney: i can ask that to happen
[06:40] davecheney: will the changes break with -10?
[06:41] wallyworld_: not sure
[06:41] i've just proposed a branch
[06:41] which should try to figure it out automatically
[06:41] ok
[06:41] the bug was one of those 'how the hell did that work in the first place' errors
[06:42] lbox propose --- ah memories
[06:43] FARK
[06:43] my branch was screwed
[06:43] wallyworld_: can you figure out how to merge this ? https://codereview.appspot.com/31890043
[06:44] davecheney: there's a prereq branch?
[06:45] looks like the prereq has been merged already
[06:47] not sure what you've done there. did you bzr branch off lp:goose to create a new branch? and then push that? and then propose?
[06:53] my goose was very old
[06:53] i just proposed it on whatever
[06:54] i've forgotten almost all of the old ways
[06:54] davecheney: i can propose if you like, but i have to go to the doctor in a bit so can do it after dinner
[06:55] ok
[06:55] let me see if i can figure it out
[06:55] i'll also let curtis know so we can get a goose ppc test set up
[06:56] ok, i'll check back a bit later, gotta leave in about 5 mins
[07:02] dimitern: 1:1 ?
=== rogpeppe2 is now known as rogpeppe
[07:47] morning
[07:52] dimitern: I changed PR 626 after your review. mind another look?
[07:54] TheMue, morning, sure - looking
[07:58] dimitern: thanks
[08:04] wallyworld_: i got that branch landed
[08:04] https://code.launchpad.net/~go-bot/goose/trunk
[08:04] need to land a followup to update dependencies.tsv
[08:11] wallyworld_: https://github.com/juju/juju/pull/703
[08:15] morning TheMue
[08:16] jam: heya
[08:20] jam: your emails are interesting
[08:20] jam: I noticed the "socket.Close()" issue
[08:20] jam: I assumed that the test author knew what they were doing and the deferred session.Close() was late bound
[08:21] (i.e. the newly opened session would be closed too)
[08:21] I don't think that can be the cause of the spurious connect (log lines connecting to ipv4 address)
[08:32] davecheney: thanks for fix, I'll let curtis know
[08:33] not sure how/when/if to backport to 1.20
[08:33] we could do, i can update the deps file
[08:33] wallyworld_: i reckon if it ain't broke, don't touch it
[08:33] it's not clear what compiler 1.20 is being built with on trusty
[08:34] not sure, i'll find out
[08:35] heya voidspace
[08:36] jam: morning
[08:36] voidspace: so I'm not sure if session.Close is the issue, as it seems to be doing a syncServer inside the 'mgo' driver
[08:36] and I wonder if the fact that we are calling inst.Destroy() isn't causing the cluster sync to get cleaned up.
[08:37] jam: I'd like to see the mongo logs
[08:37] jam: do you know where they go to?
[08:37] jam: I'm seeing the tests die sometimes with unreachable servers - so it looks like mongo has fallen over altogether
[08:37] voidspace: they are supposed to be on stdout, such that if you kill mongo somehow you'll get them in the test log
[08:37] github.com/juju/testing/mgo.go is where that is set up
[08:37] you can hack it to pass "--logpath" to mongo
[08:37] jam: ah...
we catch standard out when we start mongo
[08:37] and then debug them from there
[08:38] jam: and we parse the log to tell when mongo has started
[08:38] voidspace: yes
[08:38] jam: so you can't redirect it to a logfile
[08:38] ah, we might not be able to redirect
[08:38] someone did that in the past, you're right
[08:38] maybe I can tee that handle
[08:38] and send it to standard out as well
[08:38] I really need those logs
[08:39] jam: using CurrentStatus and checking member state *plus* uptime works *most of the time* for telling when the replicaset is healthy
[08:39] "most of the time" lovely
[08:39] sigh
[08:39] jam: however it often dies with unreachable servers - and can't recover within five minutes
[08:40] voidspace: have you tried looking in the mongo source ?
[08:40] to see what it uses ?
[08:40] jam: nope
[08:40] jam: if I see the original errors in the mongo log then I can follow from there
[08:40] voidspace: I'm not sure if it is worth it, but at this point it's where I would go next if we can't get something reliable
[08:41] right
[08:41] voidspace: no reachable servers might be better to focus on,
[08:41] jam: the thing is that the attemptLoop wasn't dying like this
[08:41] and that sounds strange given we know mongo has started from its stdout right?
[08:41] voidspace: I have the feeling it was failing in its own way that was a similar root cause but a retry caused it to get papered over
[08:42] jam: right, but I have session.Refresh inside the healthy check - so I don't know (yet) what's different about the way attemptLoop was retrying and this is retrying
[08:42] that's my next point of attack
[08:42] but I'd like to see the logs to see why mongo thinks it's fallen over
[08:44] jam: the main difference is that before we would retry the operation multiple times (with the same session)
[08:44] e.g.
Set() (which is usually the one that dies)
[08:44] whereas now we just retry CurrentStatus multiple times
[08:44] voidspace: I'm pretty sure we iterate over the output from mongo, you can probably line-by-line that to a logger.Debugf() call
[08:44] jam: yep
[08:45] have juju actions happened yet?
[08:47] mwhudson: TheMue is working with the guys who are doing it. I don't think it is something you can put in a charm yet
[08:48] ok
[08:48] is there a spec?
[08:48] i remember some discussion on the list, but that was a while ago
[08:49] jam: so we do log the lines - at Trace level
[08:49] mwhudson: I believe it was part of the "mega planning" sheet that was put together a while back, I'm not sure if there is a focused line, TheMue would probably know better when he notices these pings
[08:49] heh
[08:50] mwhudson: https://docs.google.com/a/canonical.com/document/d/14W1-QqB1pXZxyZW5QzFFoDwxxeQXBUzgj8IUkLId6cc/edit exists, last edit was 2 weeks ago, so it is probably up to date
[08:50] * jam goes to make coffee, brb
[08:51] mwhudson: so far I'm collecting the infos about all docs. one interesting source may be https://github.com/binary132/juju/blob/actions-doc/doc/actions.md
[08:59] TheMue, you've got a review
[09:00] dimitern: yep, already seen first feedback. thanks. will discuss one point with you after my 1:1, got a question here.
[09:00] TheMue, sure
[09:00] TheMue: thanks, looks appropriate for what i want to do, but for now will go for "juju ssh unit/0 random-shell-script.sh" :)
[09:01] mwhudson: ;)
[09:02] mwhudson: I think it is "juju scp unit/0 random-shell-script.sh . && juju ssh unit/0 random-shell-script.sh" right?
[09:03] jam: the charm will provide random-shell-script.sh
[09:04] mwhudson: SGTM
[09:50] voidspace: I need to take my dog out, so I might be a little late for our 1:1, but hopefully not too late
[09:50] jam: no problem
[10:07] voidspace: I'm here now
[10:09] jam: cool
[10:09] jam: sapphire hangout?
[10:09] voidspace: I linked it to the calendar event: https://plus.google.com/hangouts/_/canonical.com/john-michael
[10:11] jam: found it :-)
[10:45] jam, voidspace, TheMue, standup?
[10:46] dimitern: just finishing up our 1:1 be there in a sec
[10:46] jam, sure, np
[11:12] jam: when it fails (the IPv6 test), I see this in the logs:
[11:12] [LOG] 0:43.834 DEBUG juju.testing tls.Dial(127.0.0.1:37874) failed with EOF
[11:12] jam: which mirrors what you were saying and is I believe "a clue"
[11:12] using the non IPv6 address
[11:12] tasdomas, ping
[11:23] Ah, *actually* switched on the mongo log
[11:23] That's better
[11:25] voidspace: so... when the bus comes and says "we need someone at the door immediately because we can't just park here", my response is "don't show up 10 minutes early without letting me know"...
[11:25] jam: it's unfortunate the number of people who think their mistake is your problem...
[11:26] voidspace: so I have to work on homework, but then I'd be willing to pair debugging if mongo is being a pain
[11:26] jam: yep, there should be a kind of automatism to signal the approach of your kid
[11:26] jam: sure
[11:27] Hah, and of course after switching on logging the test refuses to fail
[11:27] :-)
[11:27] I'm sure it will happen shortly
[11:28] voidspace: turning on logging changes timings...
[11:28] mongo is thinking "they're watching what I'm doing, I'd better behave now"
[11:28] jam: I've only changed the log level in our code, so it shouldn't *actually*
[11:28] vo
[11:28] but yeah, definitely timing
[11:28] writing to stdout is slow
[11:29] 4 passes in a row, it was failing every time a minute ago :-)
[11:33] voidspace: and now if you turn off stdout logging, they will pass every time
[11:33] it was a phase of the moon thing
[11:36] morning
[12:01] lunch
[12:01] jam: I'm taking a break - going to post office so may be a bit
[12:19] dimitern, pong
[12:40] voidspace: np.
The homework for today seems completely inappropriate (almost all stuff that I know he didn't learn last year), so it's going rather slowly. I'm not sure if it is "pre test" and this is what he'll be learning, or whether somehow we're completely off track...
[12:42] jam: teacher having a papers mixup?
[12:42] perrito666: it certainly looks like stuff he might learn this year, so I'm not really sure.
[12:43] he's going into second grade, and they're having him do stuff like "friends" vs "friend's"
[12:43] he's a bit more at the speed of being able to read the word "friend"
[12:43] jam: well just in case you should not help him (in case it's a level finding test)
[12:44] perrito666: the rules from earlier in the week that they sent home are that we should help if they ask for help
[12:44] and there is only 1 line for each type of thing
[12:44] anyway, I'm certainly going to be asking about it.
[12:45] jam: I don't think it is customary for teachers to hand out their phone numbers there, right?
[12:45] I have her email address.
[12:46] teachers and emails are not as fast as one would expect :p
[12:46] * perrito666 will be a very annoying parent someday
[12:46] perrito666: :0
[12:46] :)
[12:57] hazmat: question about bug 1363971 if you are online
[12:57] Bug #1363971: add-machine containers should default to latest lts <14.10>
[13:00] wallyworld_, shoot
[13:01] wallyworld_, the issue is the container default series logic is shared i think between local and other providers.
[13:01] wallyworld_, the issue is the fallback is basically a hardcoded string of precise
[13:01] wallyworld_, i'm in the uk this week fwiw
[13:02] wallyworld_, in terms of what the proper behavior would be.. default to latest lts or in non local provider case host series both sound reasonable.
[13:10] jam: :-/
[13:11] jam: good news for me though - text and email to say internet is on
[13:11] jam: about to find out if this is true...
[13:46] no, it seems like a lie
[13:46] no internet on the dsl line
[13:47] the router can detect a dsl line, but no internet...
[13:53] How do I see Trace logs in tests? -gocheck.vv doesn't do it and there's no -gocheck.vvv
[14:19] natefinch: do you know the trick for turning on verbose mgo logging?
[14:19] natefinch: dimitern mentioned it last week and foolishly I didn't write it down
[14:19] voidspace: nope
[14:19] heh
[14:20] natefinch: I'm pretty sure we don't write mongo logs to disk by the way
[14:20] natefinch: we deliberately send to stdout and capture them so we parse them to know when mongo has started
[14:20] voidspace: possible during the tests I guess
[14:20] natefinch: we log them at Trace level
[14:20] voidspace: hey, you are from uk :)
[14:20] natefinch: I couldn't see how to show them, but I've just changed them to Debug for the moment
[14:20] perrito666: I am...
[14:21] voidspace: priv
[14:21] natefinch: annoyingly I see lots of mongo heartbeat messages whilst still getting a "no reachable servers" error
[14:21] natefinch: and increasing that timeout (it's in the Dial function we use as far as I can see) doesn't fix it...
[14:21] voidspace: heh yeah
[14:21] so as far as mongo is concerned everything is ok as far as I can tell
[14:21] that's why I want those mgo logs
[14:22] the timeout is in cluster AcquireSocket
[14:30] alexisb, I take the silence regarding inserting a new development milestone means I can release 1.21alpha1
[14:34] jam: I think you mentioned something about someone working on making juju resolved --retry into just juju retry? Is that work that is currently underway?
[15:23] dimitern: ping
[15:23] voidspace, hey
[15:23] dimitern: last week you suggested a way to get more logging out of mgo
[15:23] dimitern: which I didn't write down...
[15:24] voidspace, right :) so take a look at mgo.SetLogger() and mgo.SetDebug(true)
[15:24] voidspace, SetLogger can take *gc.C I think, as it needs something with .Output(..)
method, so you can have both mgo and other logs in the same place
[15:25] dimitern: ok, I'll try tinkering with those
[15:25] dimitern: thanks
[15:25] in case no one else noticed.... there's a built-in side by side diff on github now
[15:26] voidspace, you'll need to call these early, like in SetUpSuite, although depending on what you need just setting them up in SetUpTest or the TestXX case itself could be enough
[15:26] dimitern, jam: *phew* I think I found a nice way of testing an API for all existing versions taking into account the individual changes
[15:26] dimitern, jam: you'll see it with the next push
[15:26] natefinch, cool!
[15:26] TheMue, nice, I'll have a look
[15:27] dimitern: I'll ping you then
[15:27] natefinch: hey, yes, simply press "Split". that's cool, thanks for the hint
[15:28] dimitern: wow
[15:28] dimitern: that's screenful after screenful of logs...
[15:28] like, constant
[15:32] voidspace, it is a lot, and it takes some time to understand how to read it :)
[15:34] dimitern: this is interesting
[15:35] dimitern: for mongo we have to *not* use square brackets for ipv6 addresses
[15:35] dimitern: i.e. :::port
[15:35] dimitern: but mgo uses net.DialTimeout
[15:35] dimitern: and *it* specifies
[15:35] If host is a literal IPv6 address or host name, it must be enclosed in square brackets as in "[::1]:80"
[15:36] so, technically the address form we're using for ipv6 in the tests is invalid for net.DialTimeout
[15:36] now that I have logs, the specific error in mgo is
[15:36] [LOG] 2:22.633 SYNC Failed to start sync of ::1:41050: failed to resolve server address: ::1:41050
[15:36] whilst trying to sync servers
[15:36] [LOG] 7:21.167 SYNC Synchronization was partial (cannot talk to primary).
[15:37] voidspace, hmm..
interesting
[15:37] but it's not deterministic
[15:37] but it may be that ipv6 addresses of that form are incompatible with *part* of mgo
[15:38] voidspace, IPv6 localhost address + port could be specified as [::1]:port, but it depends on the way the thing is written I guess
[15:38] dimitern: no, mongo fails to parse that
[15:39] dimitern: we specifically can't use that format
[15:40] voidspace, fails to parse "::1:12345" or "[::1]:12345" ?
[15:40] dimitern: the second
[15:40] the correct one :-)
[15:42] voidspace, hmm, but it still works sometimes?
[15:42] dimitern: yeah, which is *possibly* due to the test only failing if we cause primary renegotiation
[15:42] dimitern: so it has to redial
[15:42] dimitern: and hits this function which can't handle our ipv6 address format
[15:43] dimitern: it's pushed. the interesting one is apiserver/machine/machiner_test.go. the suite is now run twice, once for v0, once for v1. when the version is lower than 1 the special tests for v1 (today only one) are skipped. this concept should also work when we get more versions.
[15:43] dimitern: I'm going to try locally and see if it does work with addresses in that format
[15:43] voidspace, so the ::1:port is only needed when passing args to mongod, or?
[15:43] dimitern: yep
[15:43] dimitern: but the address we pass to mongo is the address used by the dial function
[15:43] dimitern: I'd have to hack round it if that is the problem
[15:44] dimitern: I'm going to see if net.DialTimeout does work with these addresses despite claiming not to
[15:44] if it does then the resolve error must be due to something else
[15:46] TheMue, it's run twice due to you calling gc.Suite() twice for the same type with different constructor args
[15:48] dimitern: not looking good
[15:48] dimitern: dial udp: too many colons in address ::1:80
[15:48] dimitern: but mgo discards the actual error and we just see "failed to resolve"
[15:48] dimitern: so for ipv6 I think that's the problem
[15:49] just confirming added logging of the actual error and re-running
[15:49] voidspace, weird..
[15:49] voidspace, what if you try with "[::1]:80"
[15:49] dimitern: in the test?
[15:49] dimitern: mongo will fail to start
[15:49] that's what will happen
[15:50] I'll need to hack round it (if I even can) to mix the forms
[15:50] if I can confirm that this is the problem I'll do that
[15:50] I just need the test to fail...
[15:50] it fails about 50/50
[15:51] voidspace, right, sounds good
[15:53] dimitern: *damn* - got it to fail, but I logged the address not the error!!!
[15:53] now to try again...
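[editor's note] The "too many colons in address" error quoted at 15:48 comes from Go's net package refusing an unbracketed IPv6 literal followed by a port. A minimal sketch of the difference, using only the standard library (no mgo or mongod involved): `net.JoinHostPort` builds the bracketed form that the dial functions accept, and `net.SplitHostPort` shows the unbracketed form being rejected. The helper names `dialAddr` and `splitsCleanly` are made up for this example.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// dialAddr builds an address suitable for the net dial functions.
// JoinHostPort adds square brackets when the host contains a colon.
func dialAddr(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}

// splitsCleanly reports whether the net package can parse addr as host:port.
func splitsCleanly(addr string) bool {
	_, _, err := net.SplitHostPort(addr)
	return err == nil
}

func main() {
	// The form the tests pass to mongod: rejected by the net package.
	_, _, err := net.SplitHostPort("::1:80")
	fmt.Println("unbracketed:", err)

	// The bracketed form parses, and splits back into host and port.
	addr := dialAddr("::1", 80)
	host, port, err := net.SplitHostPort(addr)
	fmt.Println("bracketed:", addr, host, port, err)

	// IPv4 and hostnames are left without brackets.
	fmt.Println(dialAddr("127.0.0.1", 80))
}
```

A workaround along the lines discussed would keep the unbracketed string only for the mongod command line and derive the dial address from the separate host and port with something like `dialAddr`; whether that fits the test harness is not settled in the log.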
[15:58] dimitern: confirmed
[15:58] dimitern: when the IPv6 mongo replicaset fails, the root cause is this error inside mgo
[15:58] dimitern:
[15:58] [LOG] 1:26.513 SYNC failed to connect: dial udp: too many colons in address ::1:42113
[15:59] during a syncServers
[15:59] dimitern: I'll look at a workaround
[15:59] jam: I've found the cause of the ipv6 failures - but it isn't a general issue for the other tests
[16:00] jam: we use the form ::1:port for starting mongo on ipv6 because mongo doesn't understand [::1]:port
[16:00] jam: but if the Set operation causes a primary renegotiation mgo has an internal call to net.DialTimeout
[16:00] jam: and net.DialTimeout doesn't work with ::1:port format
[16:00] jam: mgo logging doesn't log the original error so it was only visible as "no reachable servers"
[16:01] jam: the actual error is
[16:01] [LOG] 1:26.513 SYNC failed to connect: dial udp: too many colons in address ::1:42113
[16:01] jam: so we need to be able to start mongo with the "bad" ipv6 address format, and then use the "good" format elsewhere
[16:03] voidspace, sweet! there's progress then :)
=== Ursinha is now known as Ursinha-afk
=== urulama is now known as urulama-afk
[16:17] ericsnow: do you know how to use rbtools with our OAuth login on the rb instance?
[16:20] wwitzel3: yuck
[16:20] wwitzel3: I hadn't thought of that
[16:20] wwitzel3: I'll take a look
[16:23] ericsnow: yeah, we will have to extend rbtools to support it
[16:23] ericsnow: looking at the code now
[16:24] wwitzel3: that or some kind of wrapper
[16:24] wwitzel3: I'm also going to look for a git plugin for rbt
[16:25] ericsnow: yeah, or one you oauth with git, make the user able to set a password and login either way?
[16:25] ericsnow: but only allow creation through the initial oauth
[16:25] wwitzel3: yeah, maybe
[16:26] wwitzel3: depends on how easy the OAuth approach is
=== Ursinha-afk is now known as Ursinha
[17:04] wwitzel3: try this for your password: oauth:wwitzel@github
=== Ursinha is now known as Ursinha-afk
=== Ursinha-afk is now known as Ursinha
[17:48] Looks like home internet works!
[17:48] This box not yet on it
[17:48] Big switchover shortly
[17:48] Still a crappy 3.6Mbps line though, which is really annoying
[17:49] voidspace: wow, I have more, that is not normal
[17:50] perrito666: it used to be ok, about four months ago it got really crappy
[17:50] perrito666: I hoped changing provider might fix that
[17:50] apparently not
[17:50] all changing provider did is leave me without internet at all for ten days...
[17:50] voidspace: I have a 10 or 12 Mbps line, lousy upload and terrible lag but at least has more bw :p
[17:51] hah
[17:51] :-p
=== Ursinha is now known as Ursinha-afk
[18:07] ericsnow: that worked! :)
[18:08] wwitzel3: awesome
[18:08] voidspace, perrito666: no worries, Google Fiber will soon take over the world via project Loon :P
[18:08] wwitzel3: now if only I could get rbt post to do what I want
[18:09] gsamfira: I live outside of the world :p
[18:09] =))
[18:09] ericsnow: what do you want it to do?
[18:09] wwitzel3: actually work :)
[18:09] wwitzel3: it's not generating the right diff
[18:11] ericsnow: hrmm .. mine worked just fine, are you trying to do something specific?
[18:11] ericsnow: or just the default behavior of diff vs.
master [18:11] wwitzel3: yep, just that [18:11] wwitzel3: I'm pretty sure my branches have confused it [18:13] wwitzel3: so diffs are reflecting only the last commit in the branch [18:14] ericsnow: if none of the other commits are merged in to master, that is really weird [18:15] wwitzel3: yep [18:15] ericsnow: the first step is figuring out how to generate the right diff using git diff --full-index and if you can get the diff you want there, then you can usually get rbt to do what you want. [18:23] wwitzel3: looks like rbt can't handle it when my branch is based on a stale master [18:55] ericsnow: to be fair, would you ever really want to review something like that? [18:56] katco: no :) however, rbt happily did the wrong thing [18:57] katco: well, the wrong thing for me :) [18:57] ericsnow: ah i see :p [18:59] ericsnow: did you get your bkp code landed? [18:59] wwitzel3: FYI, I made a comment on your test review request :) === Ursinha-afk is now known as Ursinha [19:00] perrito666: still waiting on reviews [19:00] ericsnow: sorry... I'll look at it after my current meeting [19:01] natefinch: thanks! [19:01] or during the meeting if alexisb is held up ;) [20:12] wwitzel3: you have been reviewed [20:17] ericsnow: you have an LGTM [20:17] natefinch: thanks [20:18] 'o/ [20:19] \o/ [20:19] my previous happy guy had no right arm apparently [20:20] perrito666: the result of a terrible waving accident [20:20] ericsnow: the result of spending a life using a en_US keyboard and now type in an es_ES [20:21] perrito666: ah, it's worse than I thought ;) [20:21] ericsnow: since I moved to my wife's laptop I am stuck with an es kb until my next US trip (this one is far easier to change though :p) [20:28] perrito666: no more ripping out keyboards via brute force? [20:28] perrito666: I've reviewed your restore-mode patch. Basically LGTM. 
[20:28] natefinch: its a thinkpad, its as close as it gets to being made out of lego [20:28] ericsnow: thanks [20:33] ericsnow: merge streak [20:34] perrito666: don't forget that my LGTM doesn't count yet :) [20:34] ericsnow: well you just merged my patch [20:35] oh shoot [20:35] s/oo/i/ [20:35] perrito666: no wonder it didn't go for the one I thought I had done [20:37] is there a way to cancel the $$merge$$? [20:37] ericsnow: I presume it can be done by hand [20:38] I might be able to break the patch so it wont merge, but I would rather not [20:38] natefinch: you know how to tell the bot to cancel a merge? [20:47] ericsnow: I don't think it's possible [20:48] ericsnow: at least not easily [20:49] well, it passed CI and merged [20:49] natefinch: should we revert it? [20:52] ericsnow: what's the PR? Maybe I can do a retro-active review :) [20:52] natefinch: :) [20:52] natefinch: https://github.com/juju/juju/pull/678 [20:55] perrito666: thanks, that else was just an artifact of refactoring, nothing actually goes there, I will remove it [21:01] perrito666: I don't see anything glaringly wrong, but I haven't really had time to thoroughly review it. [21:02] perrito666: I say leave it [21:02] perrito666: only minor fidly stuff, no logic problems. [21:02] natefinch, perrito666: yeah, that's the way I saw it too [21:04] perrito666, ericsnow: gotta run. Good luck! 
[21:07] perrito666: FYI, that backups patch has landed now [21:07] perrito666: https://github.com/juju/juju/pull/708 adds the top-level backups abstraction [21:08] * perrito666 jumps like an anime little girl [21:08] perrito666: lol [21:14] perrito666, ericsnow: what do you guys need reviews of [21:14] wwitzel3: https://github.com/juju/juju/pull/708 [21:14] besides that [21:14] thumper: morning, https://github.com/juju/juju/pull/702 I've got an LGTM but wanted to double check with you [21:14] j/k :P [21:14] wwitzel3: :) [21:15] morning [21:15] wwitzel3: 702 is a much smaller patch, not the fat one that just landed [21:15] waigani_: fine with me [21:15] wwitzel3: 708 I mean [21:31] kwmonroe: whats your google+ link you want in the circle? [21:31] oops wrong chan [21:51] rick_h_: can we reschedule the call at the end of the week re:mess? [22:05] thumper: when works for you? [22:05] thumper: I really want to chat. I've not heard what's up since Germany really and want to make sure we've got a good path on that [22:05] rick_h_: can you do a day earlier? [22:06] thumper: put something on the calendar and we can make it fit [22:06] rick_h_: any changes I make only appear on my calendar [22:06] the event creator needs to change it [22:07] thumper: ok, so the same time the day before? [22:07] yep [22:07] thumper: or a diff time as well? [22:07] I can make that time [22:08] thumper: ok cool [22:08] thumper: alexisb has stuff at that time but will move so we can get caught up [22:08] thumper: and we can report to urulama-afk and alexisb if we need then [22:10] thumper: ok, moved and such. Let me know if you don't get the notice. Dinner time here, biab [22:11] thumper: meta-review pls: https://github.com/juju/juju/pull/678 [22:14] katco: hiya, back now if you are free [22:14] wallyworld_: i am, one sec [22:14] menn0: ack, otp [22:14] thumper: no rush [22:18] wallyworld_: ok ready? 
[22:18] yup === jherouxz is now known as jheroux_away [22:22] menn0: thanks a lot dude [22:22] perrito666: np [22:23] wow github lied to me blatantly [22:35] wow, thumper uses emoticons in his reviews, we are on a whole new level here [22:35] I wonder if I can get davecheney to do the same with mine :p [22:36] heh [22:36] doesn' [22:36] I don't think that github supports the unicode "pile of poo" [22:42] * perrito666 feels thumper wants to tell him something [22:42] * thumper whistles [22:43] thumper: look out, you might have an accident while visiting a foreign country like say... belgium? [22:44] hmm... waffles... [22:45] tell me more about waffles [22:46] if someone were to read this conversation without context none of us would pass a turing test [22:49] perrito666: and that is the way it should be [22:49] unpredictable [22:59] wallyworld_: FYI, I've added you and the other team leads who have logged in as reviewboard admins [22:59] sorry thumper [23:00] sinzui: ping [23:03] ericsnow: awesome, thank you, on call, will look soon [23:04] wallyworld_: no worries [23:04] wallyworld_: I'm pretty close to EOD, but I'm sure I'll pop in a time or two before tomorrow [23:07] ericsnow: why is there an interface for Backups? [23:08] ericsnow: I am pretty sure you already explained this to me at some point [23:09] perrito666: it's to hide away the implementation details of backups [23:09] perrito666: so that there is a single concise implementation in terms of the larger low-level code [23:10] perrito666: without the low-level details (as much as possible) [23:10] ericsnow: I take it there is more to the implementation than what I see in 708? [23:11] hazmat: sorry i missed you, i fell asleep after pinging you. i will change the hard coded "precise" string.
what i was curious about was why the distro-info command failed which caused the hard coded default to be used [23:12] hazmat: we already default to the host series for non-local from what i can see; i'll double check that's the case [23:13] perrito666: there's the create() implementation from the previous PRs [23:13] perrito666: the code in 708 just has to call the low-level create() method [23:13] perrito666: basically what you already have implemented for restore would be the low-level implementation of it [23:19] sinzui: hi, there's been a change to goose to work around the ggc compiler change which broke ppc tests on CI. juju has had the dependencies updated so those failures are not happening anymore. but there still looks like a linker error with the compiler causing other failures [23:20] sinzui: can a CI test be set up for goose on ppc - this will flag any future breakages like the one we just saw? [23:31] davecheney, thumper: I blew away my go/pkg dir and the pre-push hook passed [23:32] waigani_: did you run ./all.bash ? [23:32] thumper, waigani_: here it is: https://github.com/juju/juju/pull/709 [23:32] it will have given you a message explaining exactly what has changed [23:32] davecheney: no... [23:32] oh [23:33] menn0: ta da [23:33] waigani_: http://paste.ubuntu.com/8294496/ [23:34] davecheney: handy, thanks [23:34] waigani_: please remember that the official compiler version we use to compile juju is the one that ships with ubuntu [23:34] if you want to use tip [23:34] that's cool, it often turns up interesting bugs [23:34] but it's your responsibility to own your environment and deal with the tools we've written that may not expect to be run against tup [23:34] tip [23:35] yep, understood I'm using go that ships with Ubuntu [23:36] 09:31 < waigani_> davecheney, thumper: I blew away my go/pkg dir and the pre-push hook passed [23:36] ^ what does this mean ? 
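Note on the Backups interface ericsnow explains above: a rough Go sketch of a thin top-level abstraction that only delegates to a low-level create function. Everything here (the Backups method set, createFunc, NewBackups, the notes argument) is a hypothetical illustration and is not the code from PR 708.

```go
package main

import "fmt"

// Backups is a hypothetical top-level abstraction: callers depend on
// this interface and never see the low-level details.
type Backups interface {
	Create(notes string) (id string, err error)
}

// createFunc stands in for the low-level create() implementation
// mentioned in the chat. Its signature is an assumption.
type createFunc func(notes string) (string, error)

// backups is the single concise implementation; it holds the low-level
// function and forwards to it.
type backups struct {
	create createFunc
}

// NewBackups wires a low-level create function behind the interface.
func NewBackups(create createFunc) Backups {
	return &backups{create: create}
}

// Create delegates to the low-level implementation.
func (b *backups) Create(notes string) (string, error) {
	return b.create(notes)
}

func main() {
	// A fake low-level implementation, as a test might supply.
	fake := func(notes string) (string, error) { return "backup-1", nil }
	id, err := NewBackups(fake).Create("before upgrade")
	fmt.Println(id, err)
}
```

One benefit of this shape is that tests can pass in a fake low-level function, and a restore operation could later be added behind the same interface in the same way.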
[23:36] oh [23:37] you mean $GOPATH/pkg [23:37] right [23:37] sorry, yeah [23:37] nope, my mistake [23:38] next time I'll use ./all.bash and remove just the offending pkg [23:38] sinzui: ping [23:38] waigani_: please don [23:38] please disregard all the advice I gave you [23:38] it is only relevant if you are running tip [23:39] oh right. consider it disregarded! [23:39] thumper: quick look - https://github.com/juju/juju/pull/702/files I've added three lines [23:44] wallyworld_: can you kick this build job http://juju-ci.vapour.ws:8080/job/run-unit-tests-trusty-ppc64el-devel/ [23:44] for me [23:44] waigani_: my gut reaction is that we should only need one factory line not two... [23:44] ok [23:44] waigani_: but since an env user isn't necessarily local [23:44] waigani_: it seems that we should add something to the makeUser method [23:44] waigani_: so it can also create the env user [23:44] thumper: yeah, I know what you mean, had the same initial thought [23:45] waigani_: in fact, we probably want the default behaviour of MakeUser to be one where the env user is created [23:45] thumper: sure, shall I add that to this branch? [23:45] waigani_: and we need to explicitly say "don't make an env user for this user" [23:45] waigani_: yeah, should only be a few lines [23:45] * thumper looks at the params [23:45] yep, no problem [23:46] davecheney: what do you need done? [23:47] waigani_: since go defaults to zero values, I feel we are going to have a double negative... [23:47] waigani_: so "if !params.NoEnvUser { make the env user }" [23:47] wallyworld_: i'd like to know if the fix to goose has made the ppc situation any better [23:47] wallyworld_: can you push the build button on that job for me ? [23:47] waigani_: that way we could create a user, and set { NoEnvUser: true, ... } in the params [23:47] davecheney: it has - you can see from the job runs overnight.
but there's still a linker issue [23:48] waigani_: that way the default will be the most expected behaviour [23:48] wall fuk [23:48] err [23:48] ok [23:48] waigani_: make sense? [23:48] davecheney: eg [23:48] /tmp/go-build476839839/github.com/juju/juju/api/provisioner/_test/github.com/juju/juju/api/libprovisioner.a(provisioner.o): In function `github_com_juju_juju_api_provisioner.DistributionGroup.pN44_github_com_juju_juju_api_provisioner.Machine': [23:48] /home/ubuntu/juju-core_1.21-alpha1/src/github.com/juju/juju/api/provisioner/machine.go:161: multiple definition of `github_com_juju_juju_api_provisioner.DistributionGroup.pN44_github_com_juju_juju_api_provisioner.Machine' [23:49] wallyworld_: got it [23:49] davecheney: ty
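Note on thumper's NoEnvUser suggestion above: a minimal Go sketch of why the flag is phrased as a negative. A Go bool defaults to false, so a negatively named field makes "create the env user" the behaviour callers get when they set nothing. The UserParams and User types here are hypothetical stand-ins, not the real juju test factory types; only the NoEnvUser field name and the MakeUser name come from the conversation.

```go
package main

import "fmt"

// UserParams is a hypothetical stand-in for the factory params.
type UserParams struct {
	Name string
	// NoEnvUser opts out of creating the env user. Its zero value is
	// false, so leaving it unset creates the env user.
	NoEnvUser bool
}

// User is a hypothetical result type recording what was created.
type User struct {
	Name       string
	HasEnvUser bool
}

// MakeUser creates a user and, unless the caller opted out, the
// matching env user as well.
func MakeUser(p UserParams) User {
	u := User{Name: p.Name}
	if !p.NoEnvUser {
		u.HasEnvUser = true // stands in for "make the env user"
	}
	return u
}

func main() {
	fmt.Println(MakeUser(UserParams{Name: "bob"}).HasEnvUser)
	fmt.Println(MakeUser(UserParams{Name: "bob", NoEnvUser: true}).HasEnvUser)
}
```

The double negative in `!p.NoEnvUser` is the price of making the most expected behaviour the default; a positively named field such as a hypothetical CreateEnvUser would force every caller to set it explicitly.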