[00:03] dannf: yeah, seems to be working for me now. i think it was just so slow yesterday that i gave up. i also tried building juju from src and i gave up after 30 minutes. maybe the vms are ust slow [00:13] thumper: wallyworld__ on my ppc systems I see the provisioning constantly (every 300 sec) polling the charm store [00:13] this fails because FIREWALLS! [00:14] but i wonder why does it poll at all ? [00:14] davecheney: to see if charms are out of date [00:14] so that can be shown in status [00:14] wallyworld__: ok, so i should raise an RT to get access to charm store enabled [00:14] looks like the proxy is blocking it [00:14] yeah that would be good [00:15] wallyworld__: ill fix [00:15] davecheney: so when you run status, it says "hey you have mysql version 10 installed but version 12 is available" [00:40] ahh nice [00:40] https://bugs.launchpad.net/juju-core/+bug/1305365 [00:40] <_mup_> Bug #1305365: juju 1.18.0 environment unusable after bootstrap [00:40] ^ what the hell [00:41] it bootstrapped and deployed fine [00:41] why does status have a whinge [01:00] thumper, i'm trying to capture output that's been written with logger.Errorf (from within in a coretesting.RunCommand). is there a way to hook into loggo like this? [01:00] i've tried coretesting.Stdout(ctx), and Stderr, but nothing there [01:08] cmars: hey [01:08] hi [01:08] cmars: chances are you are using a LoggingSuite [01:08] it caputres the logging [01:09] it is in the gocheck *gc.C thingy [01:09] as some test log [01:09] normally we don't test logging [01:09] what are you after exactly? [01:10] hi axw [01:10] hey thumper [01:10] axw: was asked to hit you up about a critical bug [01:10] bug 1303735 [01:10] ok [01:10] <_mup_> Bug #1303735: public-address change to internal bridge post juju-upgrade [01:10] thumper, the recent error message I've added. i think that'll do it, thanks [01:11] apparently you were talking with rob about it [01:11] ah, was chatting with rog about that last night... 
[01:11] cmars: there are some new methods on the context object [01:11] cmars: if you are testing the error [01:11] then you should check the error response from Run [01:12] cmars: testing.RunCommand returns a context and an error [01:12] the error is what you want to be checking [01:13] axw: do you know what the issue is there with the addresses [01:13] ? [01:13] I recall work going on there [01:13] but I don't know exactly what changed [01:14] thumper: it looks like the openstack environment has no public addresses, only cloud-local and/or unknown; the SelectPublicAddress code chooses the last cloud-local/unknown address in the list... so it gets one of the unknown addresses that the machiner records [01:15] hmm... [01:15] so... how do we go about fixing it? [01:15] thumper: so, we should probably prefer the provider addresses over machine addresses if they're all cloud-local/unknown [01:16] are we able to tell the difference [01:16] ? [01:16] yes, they're recorded in separate lists [01:17] thumper: it would be ideal if the machiner could decide which addresses were machine-local, but it's not a straightforward thing to do [01:17] apart from localhost addresses of course... [01:18] right [02:08] thumper: davecheney has LGTMed https://codereview.appspot.com/85710043/ did you want to take a look before I land it? [02:08] yes please === alexlist` is now known as alexlist [02:15] ah, I should remove loggingSuite from environ_test.go [02:33] https://bugs.launchpad.net/juju-core/+bug/1305386 [02:33] <_mup_> Bug #1305386: state/apiserver: multiple data races [03:03] davecheney: what causes these and do you have any suggestions to fix? [03:06] thumper: not sure yet [03:06] i'm hoping its something we're doing (sharing a mgo conn) not a driver bug [03:06] its not clear if these are cosmetic or serious [03:12] thumper: awesome, found out why go install was hanging. 
trunk of code.google.com/p/go.crypto has changed and juju-core no longer compiles but that causes go install to just hang unless you compile with the -v option [03:12] how good is thst [03:12] wut? [03:12] hah [03:12] so we had better not update go.crypto [03:12] why? [03:12] no idea [03:12] wallyworld__: so you should run godeps, yeah? [03:13] seems so [03:13] but i mean wtf [03:13] it should have told me there was a compile error [03:13] i would have thought so [03:14] i just thought it was slow cause other stuff has been slow [03:15] wallyworld__: is the compiler spinning ? [03:15] agl and hanwen landed that big branch today [03:15] davecheney: not sure, how do i tell? [03:15] i didn't see any discussion about it [03:15] top ? [03:15] that was rather unorthodox [03:16] davecheney: strangely using the -v flag which prints aeach package as it is compiled printed out the errors also and then exited when done [03:16] wallyworld__: paste [03:16] ? [03:18] davecheney: https://pastebin.canonical.com/108166/ shows first attempt hanging, and then -v showing errors [03:18] wallyworld__: can you use pastebin it [03:18] my 2fa is downstairs [03:19] ok [03:19] http://pastebin.ubuntu.com/7229253/ [03:20] wallyworld__: can't tell if a hang, or just took a long time [03:20] try [03:20] rm -rf $GOPATH/pkg [03:20] davecheney: it ran for over an hour before i hit ^C [03:20] wallyworld__: did you look at top ? [03:20] no, not at the time [03:20] bummer [03:20] as for tip of ssh being broken [03:20] yes, that is poor form [03:20] but we're not blameless here [03:21] the top of gomaasapi is unusable isn't it [03:21] or is it gwacl ? 
[03:21] not sure [03:21] davecheney: i ran it again and that time it exited with the compile errors even without the -v flag [03:22] i don't -v would have any impact on a hang [03:22] -v is for cmd/go [03:22] which just forks the compiler [03:22] its hard to say uless you can get it to happen again [03:22] i'd 100% believe go get hanging [03:22] i read in help that -v is for verbose [03:22] wallyworld__: yes, that is what it does [03:23] go get can get confused when hg/bzr/git fire off some program to deal with merge conflicts as the pty they run against isn't a terminal [03:23] ok [03:25] davecheney: right, after running godeps, install now appears to be spinning [03:26] which process is spinning [03:26] cmd/go [03:26] yep [03:26] or the compiler ? [03:26] kill cmd/go with SIGQUIT [03:27] capture the output [03:27] great [03:27] this is some pretty shit [03:27] 11th hour and everyting is breaking, including the toolchain [03:27] wonderful [03:28] davecheney: http://pastebin.ubuntu.com/7229265/ [03:28] something about stack unavailable [03:29] nuking pkg fixes it [03:31] wallyworld__: whos' machine is this ? [03:31] its running gccgo [03:39] thumper: \o/ and now i have it compiled i can't test anyway because of bug 1304742 FML [03:39] <_mup_> Bug #1304742: version reports "armhf" on arm64 [03:39] i guess i should fix that [03:39] yeah... guess so [03:40] * wallyworld__ sighs heavily [03:45] * axw joins in [03:45] HP cloud doesn't like my change [03:49] supportedArchitectures isn't an ordered list, right ? https://bugs.launchpad.net/juju-core/+bug/1305397 [03:49] <_mup_> Bug #1305397: provider/common: test failure [03:50] * davecheney goes to lunch [03:50] in the rain [04:04] davecheney: another one http://pastebin.ubuntu.com/7229318/ [04:21] faaaark. 
only took 1000 attempts but finally got the go compiler to build without panicking
=== vladk|offline is now known as vladk
[05:17] wallyworld__: eerk, that looks similar to the bug we see on 64k ppc64 kernels
[05:18] davecheney: yuk. i'm not having much luck with juju per se either. machine 0 agent won't start properly
[05:18] wallyworld__: more nil pointers ?
[05:18] doesn't seem to get past starting the start server
[05:19] not sure yet if it's because it can't see the db or something else
[05:20] wallyworld__: using lxc ?
[05:20] nope, arm vms
[05:20] manual provisioning
[05:20] ok
[05:21] lxc is broken atm
[05:23] :-(
[05:24] wallyworld__: i know
[05:25] i'm running from trunk and it seems like it's dying inside the ensure ha stuff but there's not enough logging to know for sure
[05:25] :(
[05:25] :emoji crying:
[06:29] davecheney: i farking give up. takes 1000 retries to get a compile. then complains about missing libgo.so.5. so find that i need extra compile flags. 1000 retries later get new jujud. still libgo.so.5 missing. so i install deb package on target machine just to get that shared lib. rinse and repeat. still complains.
sigh
[06:32] wallyworld__: see https://docs.google.com/a/canonical.com/document/d/1m9R2n6LPLNLGjdopcNkQYVG8D5V4FTyvc1vvn-9ZifM/edit
[06:32] at the bottom
[06:32] the gb alias
[06:32] wallyworld__: can you paste me anything interesting from dmesg
[06:32] davecheney: i used those flags
[06:32] i want to compare with the ppc64 problems
[06:33] if it's complaining that libgo.so.5 is missing
[06:33] binaries were about 30% bigger after
[06:33] then it didn't work
[06:33] wallyworld__: i think you should cut your losses
[06:33] so then i tried installing the lib directly on the target machine
[06:33] looks like juju doesn't work on arm64
[06:33] i don't know if anyone is working on that atm
[06:34] we're close
[06:34] dmesg doesn't have anything that interesting
[06:35] wallyworld__: ldd $(which jujud)
[06:35] ubuntu@ms01a:~/juju/src/launchpad.net/juju-core/cmd/jujud$ ldd $(which jujud)
[06:35] linux-vdso.so.1 => (0x0000007fafbdd000)
[06:35] libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000007fafba7000)
[06:35] libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000007fafb07000)
[06:35] libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000007fafae4000)
[06:35] libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000007faf999000)
[06:35] /lib/ld-linux-aarch64.so.1 (0x0000005595496000)
[06:36] hmm, so libgo.so.5 is missing
[06:36] i built with
[06:36] go install -a -v -gccgoflags -static-libgo launchpad.net/juju-core/...
[06:36] ah missing =
[06:37] fark
[06:37] wallyworld__: nope
[06:37] you got it right
[06:37] if you got it wrong
[06:37] there would be a line showing
[06:37] libgo.so.5 => (missing)
[06:37] ok
[06:38] but you should probably use the =
[06:39] used the =, same result
[06:40] mornin' all
[06:41] in this case it will be
[06:41] but if you wanted to pass a second option
[06:41] you'd need = and to quote it
[06:41] rogpeppe2: o/
[06:41] davecheney: yo!
[06:44] morning rogpeppe2 [06:45] axw: hiya [06:45] rogpeppe2: I've got a CL for the addresses bug, about to propose (just finishing live test) - got time to have a look? [06:45] axw: sure [06:46] rogpeppe2: I made the changes we talked about, but also changed instance.NewAddress derive scope from IPv4 address range [06:46] so 10.*, 192.168.*, etc. are recorded as cloud-local [06:47] axw: that seems reasonable to me [06:49] axw: although that's not really true in the case of this bug [06:49] axw: the 192.168.* addresses were machine-local [06:50] rogpeppe2: I think that's okay, because we'll still choose provider cloud-local over machine cloud-local [06:50] machiner* [06:51] axw: yeah, it should be ok [06:51] * axw taps finger, waits for lbox [06:54] davecheney: success. i blew away *everything* and started again. bootstrapped manual provider on arm64. gotta run to soccer, but will add another machine later and deploy a charm [06:59] rogpeppe2: https://codereview.appspot.com/85590044/ [07:00] mgz: would appreciate a review from you too, in case there's some subtlety I've missed [07:01] rogpeppe2: on a completely unrelated note, I was thinking about the EnsureAvailability task some more this morning. Is there a reason why we shouldn't move a non-voting state server back to voting if it becomes accessible again? [07:02] axw: absolutely not. [07:02] axw: i think we should do that [07:02] goodo [07:02] that's what I thought too [07:02] axw: that's why we keep 'em around === vladk is now known as vladk|offline [07:21] anyone know anything about the replicaset failure in https://code.launchpad.net/~fwereade/juju-core/uniter-relation-states/+merge/215003 ? [07:21] are we seeing it a lot? [07:27] fwereade: wrt your previous comment [07:27] http://paste.ubuntu.com/7229693/ [07:28] i am not sure if this is a real race, or an instrumentatoin error [07:28] davecheney, I will take a look at that, thanks [07:29] davecheney, although at first glance... 
sync.WaitGroup *ought* to be used that way, oughtn't it?
[07:30] fwereade: you cannot call Add() while another goroutine calls Wait()
[07:30] only Done()
[07:31] axw: reviewed
[07:31] ta
[07:32] rogpeppe2: heh, I kept writing IPv4 in my new code and then changed it to match the existing :/
[07:32] I will fix it another time- this is going to need to be backported, don't want to create unnecessary hardship :)
[07:32] axw: yeah, it's definitely worth changing at some point
[07:32] axw: fair enough
[07:33] http://golang.org/pkg/sync/#WaitGroup.Add
[07:34] fwereade: ah, i think i know what that might be to do with
[07:35] fwereade: i think perhaps the replicaset tests were relying on the fact that Set put the session into monotonic mode.
[07:35] rogpeppe2: likewise for NewAddress arg order if you don't mind - that will create a lot of noise
[07:35] axw: ok
[07:36] fwereade: (i mean Initiate, not Set)
[07:37] davecheney, indeed, I see; but I'm having some difficulty mapping it onto the code
[07:37] fwereade: yes
[07:37] davecheney, the code looks odd fwiw, I don't see why we'd wait twice
[07:37] i'm having trouble understanding if that is a real race
[07:37] or a bug in the race detector
[07:37] i'd use 1.3, but there are heaps of bugs at tip
[07:37] heh
[07:37] as usual juju is leading the way in showing bugs in go tip
[07:37] go us!
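(Editor's note: a minimal Go sketch of the WaitGroup rule described above, where Add happens before the goroutine is started and before anyone calls Wait, and workers only ever call Done. `runWorkers` is a hypothetical helper for illustration, not code from juju-core, and it does not claim to reproduce the race in the paste.)

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers is a hypothetical helper (not juju-core code). It starts n
// goroutines and waits for all of them. Add is called in the parent
// goroutine before each worker starts, so no Add can overlap with Wait.
func runWorkers(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 1; i <= n; i++ {
		wg.Add(1) // before "go", never inside the new goroutine
		go func(v int) {
			defer wg.Done() // workers only call Done
			mu.Lock()
			total += v
			mu.Unlock()
		}(i)
	}
	wg.Wait() // every Add has already happened by this point
	return total
}

func main() {
	fmt.Println(runWorkers(10))
}
```

Moving the `wg.Add(1)` call inside the goroutine would let it run concurrently with `wg.Wait()`, which is the pattern the race detector complains about.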
[07:38] fwereade: hmm, it doesn't look as if that is actually the case [07:39] davecheney: it may well be a real race [07:39] rogpeppe2: \o/, i guess [07:39] davecheney: something similar came up on golang-nuts recently [07:39] rogpeppe2: indeed [07:40] and the semantics may change [07:40] but under 1.2 [07:40] the old semantics apply, you can't call Add(positive) after someone else has called Wait() [07:40] davecheney, hey, I think it is a race, but I'm still only at single-coffee levels of incisiveness [07:41] fwereade: based on that assesment, I shall raise a bug [07:41] i have found many other races [07:41] some in the mongo driver [07:41] davecheney, cheers, we can always close it if it isn't [07:42] davecheney: looks like a trivial fix for that one at least [07:42] davecheney: though i'm not sure it will actually fix the bug we're seeing [07:42] rogpeppe2: they normally are [07:42] true [07:43] davecheney: it may do though, thinking about it [07:43] davecheney: get that Add in! [07:46] https://code.launchpad.net/~dave-cheney/juju-core/128-environs-sync-tempdir-prefix/+merge/215087 [07:46] if anyone has two seconds [07:46] no [07:46] ignore, that one landed [07:46] https://code.launchpad.net/~dave-cheney/juju-core/127-fix-lp-1305397/+merge/215077 [07:46] this one is also trivial [07:46] rogpeppe2, that failure in my MP has an *awful* lot of "attempting Set got error: replSetReconfig command must be sent to the current replica set primary." lines before the no-reachable-servers failure [07:47] fwereade: yeah, i don't know why [07:57] rogpeppe2: worker/peergrouper/worker_test.go :318 [07:58] i see test failures when the servers is not the first entry in expectedAPIHostPorts(3) [07:58] davecheney: yeah, agreed, it's dubious. weren't we going to sort the servers slice? 
[07:59] rogpeppe2: i am trying [07:59] but how can I sort a set of []instance.APIPorts [08:00] davecheney: fairly easily [08:00] do tell [08:00] davecheney: it's not hard to define an ordering between two []instance.HostPort values [08:01] davecheney: (compare each element in order) [08:01] rogpeppe2: thta isn't what you told me to do [08:01] davecheney: you can probably just compare address value [08:01] you said that []instance.HostPort is already sorted [08:01] davecheney: it is [08:01] wright [08:02] rifht [08:02] davecheney: we're sorting a [][]instance.HostPort [08:02] so I need to sort a [][]instance.HostPort [08:02] yup [08:02] davecheney: so to do that, you need to compare two []instance.HostPorts [08:02] rogpeppe2: ok, got it [08:02] davecheney: cool [08:29] anyone know how to find out what code a given CI test is actually running? [08:29] (the CI test itself, that is) [08:31] rogpeppe2, I think it's lp:~juju-qa/juju-core/ci-cd-scripts2 [08:32] fwereade: i just found that, but it doesn't seem to have all the scripts in it (e.g. 
aws-upgrade) [08:33] fwereade: but i'm not sure what the correspondence is between that branch and the test names we see in jenkins [08:44] rogpeppe2: https://codereview.appspot.com/86400043/ [08:44] darn, i've just realised that we really need to support upgrading to HA [08:44] your thoughts sir [08:44] opps [08:44] two secs [08:46] davecheney: that doesn't look quite right [08:46] davecheney: i don't think that Less function is commutative [08:47] https://codereview.appspot.com/86400043/ [08:47] sorry, have another look [08:47] davecheney: i think you want: if len(a) != len(b) { return len(a) < len(b) } [08:49] ok [08:53] davecheney: in fact, i think the other test isn't right either [08:53] yeah, that was tricky [08:53] davecheney: i think it needs to compare the values for equality, and only if they're not equal should it compare the less-ness of them [08:54] yeah, that makes sense [08:54] davecheney: i've got that wrong before [08:54] davecheney: so i always look closely at Less functions now :-) [08:55] rogpeppe2: this one is tricky because in our case the ports are always the same [08:55] so a < b is always false [08:55] davecheney: i don't think it's that tricky [08:55] davecheney: one mo, i'll paste a suggestion [08:58] davecheney: reviewed (with suggested code in the review) https://codereview.appspot.com/86400043/ [08:58] rogpeppe2: ta === rogpeppe2 is now known as rogpeppe [09:00] rogpeppe: thanks, that is much more straight forward [09:00] davecheney: np [09:02] davecheney: just sent one other trivial comment [09:02] mgz: ping [09:03] afk [09:03] rogpeppe: et al, http://code.google.com/p/go/issues/detail?id=7749 [09:03] juju really breaks go 1.3 at the moment [09:11] axw: hey [09:14] mgz: hey, would you please take a look at https://codereview.appspot.com/85590044/ when you have a moment? 
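(Editor's note: a rough Go sketch of the ordering discussed above for sorting a [][]instance.HostPort: compare lengths first, then walk the elements, and only compare for less-ness once two elements differ. `HostPort` here is a stand-in type with assumed fields, and `hostPortsLess` and `byHostPorts` are made-up names; this is not the code from the review.)

```go
package main

import (
	"fmt"
	"sort"
)

// HostPort is a stand-in for instance.HostPort; the real type's fields
// may differ. It exists only to make this sketch self-contained.
type HostPort struct {
	Host string
	Port int
}

// hostPortsLess reports whether a sorts before b. Shorter slices sort
// first; equal-length slices are compared element by element, skipping
// elements that are equal.
func hostPortsLess(a, b []HostPort) bool {
	if len(a) != len(b) {
		return len(a) < len(b)
	}
	for i := range a {
		if a[i] == b[i] {
			continue // equal so far, keep looking for a difference
		}
		if a[i].Host != b[i].Host {
			return a[i].Host < b[i].Host
		}
		return a[i].Port < b[i].Port
	}
	return false // all elements equal, so neither is less
}

// byHostPorts implements sort.Interface for [][]HostPort.
type byHostPorts [][]HostPort

func (s byHostPorts) Len() int           { return len(s) }
func (s byHostPorts) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }
func (s byHostPorts) Less(i, j int) bool { return hostPortsLess(s[i], s[j]) }

func main() {
	servers := [][]HostPort{
		{{"10.0.0.2", 17070}},
		{{"10.0.0.1", 17070}, {"10.0.0.9", 17070}},
		{{"10.0.0.1", 17070}},
	}
	sort.Sort(byHostPorts(servers))
	fmt.Println(servers)
}
```

Checking equality before less-ness matters here because, as noted in the chat, the ports are usually identical, so a comparison that looks only at ports never distinguishes two entries.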
[09:14] I've changed some address logic, and I'm a bit nervous about breaking everything :) [09:14] :) [09:18] axw: had planned the pivate range stuff, that looks fine, will have a proper poke through before the standup and see if any of the subtle logic bits have got lost [09:21] mgz: great, thanks [09:22] axw: only thing that jumps out as a risk is canonistack and similar setups [09:23] were we don't actually *have* a public address generally, [09:23] but currently juju can lie, and return the NetworkUnknown 10. address which will work with sshuttle [09:25] right, I need to change location now [09:25] mgz: I did live test with canonistack (and HP); the 10. address is now recorded as cloud-local, and usable as both public and internal [09:25] davecheney: :-( [09:25] davecheney: can you reproduce that reliably? === natefinch-afk is now known as natefinch [09:30] morning all [09:31] morning natefinch === vladk|offline is now known as vladk [09:43] natefinch: good morning. extra early for you, isn't it ? [09:45] jam1: yeah, been trying to get up early to get some more work done on HA. Also, yesterday I lost a good bit of the working day due to some emergency beekeeping tasks that came up. [09:55] good morning [09:57] just mentioning, standup in 3 min [10:19] natefinch: you have bees? [10:24] axw: just wondering about 11.0.0.0/8 [10:24] axw: it is technically public [10:24] axw: but actually private [10:24] axw: I wonder if anyone tries to use it [10:24] axw: it is the ipaddress of the us military group [10:25] and is a disconnected internet type network [10:25] I think the current code is good, and if someone is stupid enough to use this [10:25] then it can be on their head :-) [10:25] thumper: that would be an interesting problem to have :) [10:26] * axw thinks [10:26] it will be no worse than it is now, as cloud-local will be used for public/internal as well as unknown [10:26] oh wait [10:26] public.. 
hrm [10:27] well they would be inside the cloud anyway, so it would be fine I think [10:27] inside the cloud -> inside the network [10:28] axw: I do like the 'less unknowns' :-) [10:38] hey fwereade this is a wip https://codereview.appspot.com/86430043 for some reason https://codereview.appspot.com/86430043/patch/1/10003 is getting empty unit networks, care to take a quick look? (I did not upload work for the machine not changed checks to avoid clutter on what I am trying to debug) [10:43] thumper: yeah, three hives. bees are awesome :) [10:44] perrito666, actually do you want to pop on, I'm not quite following everything there [10:45] pop? [10:45] perrito666, sorry, I mean, quickly reenter the team meeting hangout [10:45] I certainly can [10:46] i've got two branches up for review. would much appreciate if someone could take a look: https://codereview.appspot.com/86200043/ https://codereview.appspot.com/85600044/ [10:52] axw: properly going over address branch now, I see rog has already looked through it [10:57] c7z: I assume that's mgz; yes he has, just thought I'd get your thoughts too, because you did the original work I think? [10:58] bbs [10:59] axw: yup, though there's a lot more complexity than the first version unfortuanately [11:04] rogpeppe: looking [11:05] oh noooo, daylight savings I missed the meeting [11:05] hahaha [11:06] I was like, why is there no one here??? [11:06] yep, daylight savings is annoying [11:06] sigh, reading notes [11:06] fwereade, sorry - bug 1305780 [11:06] <_mup_> Bug #1305780: juju-backup command fails against trusty bootstrap node [11:07] waigani: didn't miss much [11:07] hey, I'm in the notes! [11:08] natefinch: okay, that's good. Well I'll get some sleep and be more productive tomorrow! 
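(Editor's note: a small Go sketch of deriving an address scope from the IPv4 range, as described for the NewAddress change: 10.*, 172.16-31.* and 192.168.* become cloud-local, while something like 11.0.0.1 still counts as public. The scope strings, `deriveScope`, and the loopback handling are assumptions for illustration; the real constants and logic in juju's instance package may differ.)

```go
package main

import (
	"fmt"
	"net"
)

// Scope names are illustrative only; the real constants live in juju's
// instance package and may be spelled differently.
const (
	scopeUnknown      = "unknown"
	scopeMachineLocal = "machine-local"
	scopeCloudLocal   = "cloud-local"
	scopePublic       = "public"
)

// privateNets holds the RFC 1918 private IPv4 ranges.
var privateNets = mustParseCIDRs("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")

// mustParseCIDRs parses fixed CIDR literals and panics on a bad one.
func mustParseCIDRs(cidrs ...string) []*net.IPNet {
	nets := make([]*net.IPNet, len(cidrs))
	for i, c := range cidrs {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			panic(err)
		}
		nets[i] = n
	}
	return nets
}

// deriveScope is a hypothetical helper that guesses a scope from the
// address value alone.
func deriveScope(addr string) string {
	ip := net.ParseIP(addr)
	if ip == nil {
		return scopeUnknown // a hostname, or not an address at all
	}
	if ip.IsLoopback() {
		return scopeMachineLocal // assumption: loopback is machine-local
	}
	if ip.To4() == nil {
		return scopeUnknown // IPv6 is left out of this sketch
	}
	for _, n := range privateNets {
		if n.Contains(ip) {
			return scopeCloudLocal
		}
	}
	return scopePublic
}

func main() {
	for _, a := range []string{"10.1.2.3", "192.168.0.5", "11.0.0.1", "127.0.0.1"} {
		fmt.Printf("%s => %s\n", a, deriveScope(a))
	}
}
```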
[11:08] night all [11:08] waigani: g'night [11:22] jamespage, I thought we had the tools used by backup/restore in juju-mongodb [11:22] fwereade, we do - but I suspect the fact they are not in the path is breaking things [11:22] jamespage, gaaaah ofc [11:22] fwereade, it works fine on 12.04 [11:22] where that is the case [11:23] axw: commented [11:23] fwereade, I suspect if I could get restore past "error: cannot restore bootstrap machine: cannot get public address of bootstrap machine: machine "0" has no public address" [11:23] then I would hit the same issue gain on 14.04 [11:24] jamespage: https://codereview.appspot.com/85590044/ [11:24] one of your bugs is nearlyfixed [11:24] c7z, \o/ [11:24] woser [11:24] that was complex [11:25] c7z, I better stop finding new ones :-) [11:26] yeah, axw decided to start doing the right thing with deriving the network scope rather than piling on hacks [11:36] natefinch: "could this code get moved to the loop over entity.jobs below?" [11:36] natefinch: i don't think so [11:36] natefinch: because newSingularRunner can fail [11:37] jam, mgz, have the bot stopped landing stuff for gomaasapi? [11:37] dimitern: I'm pretty sure the bot never landed things for gomaasapi [11:38] only gwacl [11:38] dimitern: it never did in the current iteration [11:38] really? [11:38] dimitern: I manually landed the last bits, I can land anything else you guys need [11:38] there was once a gomaasapi bot, but it wasn't ours, and it went away [11:38] dimitern: lp:~juju/gomaasapi/trunk [11:38] end of fairytail [11:38] rogpeppe: yeah, that's a good point. I guess if you need to make sure you fail early, that's valid. [11:38] vladk, there's your reason ^^ c7z is your man :) [11:38] the bot has never been in ~juju, IIRC [11:39] rogpeppe: not for this review, but that method really needs to be refactored. 110 lines is just too long. 
[11:39] dimitern: I intentionally didn't want to give the bot too much access to things that weren't its, nor give ~juju direct access to bits controlled by the bot [11:40] natefinch: yeah, it could be easily split up [11:40] * jam is away for a bit [11:40] jam, yep, understandable concerns [11:41] * rogpeppe has just acquired a new, unbroken phone [11:41] woo [11:41] rogpeppe: what did you go for? [11:41] also, if it doesn't have a smashed screen by vegas, I'll be disappointed [11:41] c7z: a samsung galaxy s4 active from ebay [11:41] c7z: :-) [11:42] c7z: main reasons were the fact that it is waterproof (i killed a previous phone from water damage) and it has a replaceable battery [11:46] c7z: could you manually land my branch to gomaasapi: https://code.launchpad.net/~klyachin/gomaasapi/101-testserver-extensions/+merge/214961 [11:47] vladk: on it [11:47] c7z: just so I understand about the floating IP... [11:47] c7z: floating-ip would be stored as NetworkUnknown as well? [11:47] c7z: hence why we would take the last one? [11:49] yeah, so, the original iteration of the code understood some network names as special [11:50] but hp and some others were annoying in that they had a network named 'private'... but the (public) floating ip just got appended to that network, not added to a new one [11:50] * rogpeppe just realises that c7z==mgz [11:50] rogpeppe: sorry :P [11:51] but I'm pretty sure that your code will actually pass that test (with less fiddling that you needed), because of the new scope detection code [11:53] c7z: okay, cool. yes I think it should work then [11:53] c7z: good catch on NetworkName. I'll fix that and land [11:53] but yeah, the current/old version of openstack bits basically left everything as NetworkUnknown and used some hacks based on ordering to make the various previous cases work [11:57] vladk: landed [11:59] natefinch: how's it going? [12:06] rogpeppe: sorry, helping my ender daughter, Lily get ready for preschool. 
Should be back in about 45 minutes, though not at full capacity for a little over an hour. [12:06] s/ender/elder/ [12:06] natefinch: ok. perhaps you could just push the branch you were working on last night? [12:06] natefinch: then i can move it forward [12:08] rogpeppe: cool, just pushed it here: lp:~natefinch/juju-core/041-moremongo [12:09] natefinch: thanks [12:16] evilnickveitch: had a note that the environments.yaml config option for bug 1241674 isn't added to the 1.18 docs, what branch do I need to get to put it in? [12:16] <_mup_> Bug #1241674: juju-core broken with OpenStack Havana for tenants with multiple networks [12:19] c7z, it should go in the master branch for now, thanks! [12:28] c7z: sorry, will fix that test in a followup [12:29] axw: I wasn't quite clear the first time around that the test should have just been s/127\./10\./ [12:29] c7z: no worries, understood now - I think the bot's already running it though [12:31] anyone else having trouble getting to bazaar.launchpad.net ? [12:31] axw: no problems, as I said, should pass that way as well [12:32] jam: I just did, and it was transient [12:32] c7z: can you check if you can get to launchpad? [12:32] c7z: k, it is still failing for me... :( [12:32] as in, failed to branch twice, pinged, worked, sshed, worked, branched... worked [12:32] c7z: I can't SSH or get to the HTTP page [12:45] jam: apparrently lp app servers were seeing issues getting stuff through squid, it's still working for me at present [12:45] c7z: it just worked fro me [12:45] for [12:51] dimitern: ping about SCP and extra arguments [12:51] you seem to have a patch that made it so that scp only supports *1* extra argument [12:51] and CI wants to use about 5 extra args [12:51] when doing a local provider deploy for the first time, how can you track the image download progress? [12:51] c7z: iftop ? 
[12:52] I wish I knew a better way, I think it is controlled underneath lxc [12:52] (hidden from us) [12:52] isn't that fun [12:54] jam, my changes to scp was that it can accept any number of extra args [12:54] dimitern: not in 1.18 [12:55] dimitern: "juju scp 1:foo . -o "StrictHostKeyChecking: no" [12:55] jam, if it got changed later i don't know [12:55] complains that "-o" is unknown [12:55] dimitern: vs juju scp 1:foo . -o"StrictHostKeyChecking: no" [12:55] works [12:55] but it has to be *1* argument [12:57] dimitern: the line is "if i != len(c.Args) - 1" [12:57] sounds like it only accepts 1 extra argument, and is attributed to you (according to bzr annotate) [12:57] jam, hmm.. looking at the code I see the problem [12:57] jam, it was broken before - not being able to take more than 3 targets [12:58] jam, but i broke that it seems - the fix should be: once we start adding extraArgs we treat all the rest as extraArgs [12:58] so CI was trying to do: if timeout 5m juju --show-log scp -e $ENV -- -o "StrictHostKeyChecking no" -o "UserKnownHostsFile /dev/null" -i $JUJU_HOME/staging-juju-rsa 0:/var/log/juju/all-machines.log $log_path; then [12:58] using "--" [12:59] which I don't quite see that we ever actually supported anyway [12:59] But they would like to have the target late, and extra args early [12:59] we can move that around [12:59] (I think) [12:59] but we should support more args [13:00] why targets late and extra args early? [13:00] dimitern: it is a natural way to write it, if you were writing "SCP" code. [13:00] as in, it is how *I* would write "scp ..." [13:00] it used to be documented that extra args are passed after -- that was never implemented [13:00] dimitern: I don't think we *have* to because SCP will let you passed them late. 
[13:02] jam, well, initially i made it so -- can be at any place (even between targets) to specify one or more -args, but it was rejected on the review [13:02] jam, yeah, passing them last is both easier to parse and not far from natural [13:05] jam, I think something like this should fix the issue http://paste.ubuntu.com/7230671/ [13:14] I'm getting a weird error with cobzr. I should be able to diff against trunk, right? [13:15] rogpeppe: https://codereview.appspot.com/86490043 -- hopefully I'm at least on track here... [13:15] axw: looking [13:16] rogpeppe: I'm about to sign off, so no particular rush [13:22] evilnickveitch: https://code.launchpad.net/~gz/juju-core/update_config_openstack_1.18/+merge/215177 [13:23] c7z, thanks, i'll take a look [13:25] I'm off now, will check a bit later but would appreciate if someone could poke my MP if it fails again: https://code.launchpad.net/~axwalk/juju-core/lp1303735-fix-address-logic/+merge/215085 [13:26] and I'll backport to 1.18 in the morning [13:31] dimitern: the problem is the signal to start adding things has a "if i != len(args) -1" [13:31] so we really just need a different check. I think actually implementing "--" would be the easiest thing, if we want to support multiple SCP targets [13:33] jam, -- sgtm to [13:33] jam, can't remember who was against it in the review ;) [13:34] ...the code did use --... which broke things apparently, hence the fix to not [13:34] c7z: well the CI guys had written "scp -- 0:foo ." which would be broken regardless [13:34] because that would pass an explicit "0:blah" to scp [13:35] We *could* just detect if any given argument had a possible Juju identifier at the start and map it [13:35] with potentially an escape character? [13:35] * jam doesn't like escapes very much here === Ursinha is now known as Ursinha-afk === hatch__ is now known as hatch [14:12] fwereade, are you around? 
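(Editor's note: a short Go sketch of the fix dimitern suggests above, where once the first option-like argument is seen, everything after it is treated as an extra argument for scp. `splitSCPArgs` is a hypothetical function for illustration, not the juju scp command's actual parsing, and it does not implement the "--" variant that was also discussed.)

```go
package main

import (
	"fmt"
	"strings"
)

// splitSCPArgs is a hypothetical helper (not the real juju code). It
// separates copy targets from options meant for scp itself: everything
// from the first argument starting with "-" onwards is an extra argument.
func splitSCPArgs(args []string) (targets, extra []string) {
	for i, arg := range args {
		if strings.HasPrefix(arg, "-") {
			return args[:i], args[i:]
		}
	}
	return args, nil // no options given
}

func main() {
	targets, extra := splitSCPArgs([]string{
		"1:foo", ".", "-o", "StrictHostKeyChecking no", "-i", "id_rsa",
	})
	fmt.Printf("targets=%q extra=%q\n", targets, extra)
}
```

This keeps targets first and options last, which matches the "passing them last is easier to parse" point, at the cost of not supporting options placed before the targets.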
[14:24] fwereade: alexisb: I just put one of you on the hook for a 1.19 and 1.18 update at the cross-team meeting. [14:24] mramm ack === Ursinha-afk is now known as Ursinha [14:32] mramm: I think fwereade said he had a headache, so is off for a bit. which puts alexisb on the hook :) [14:32] jam: cool [14:33] I'm sure alexisb can handle it ;) [14:33] alexisb: here's my summary, I think cmars is already working on #1304770, but I don't have a feeling for why it isn't done already [14:33] <_mup_> Bug #1304770: store: tests do not pass with juju-mongodb [14:33] mramm, jam, alexisb: thanks, I just dragged myself up here to make sure there was someone there [14:33] * fwereade goes back to bed [14:33] bug #1302205 was the one that wallyworld__ sent an email on. I don't think it should block 1.19.0 if we can't fix it in time [14:33] <_mup_> Bug #1302205: manual provisioned systems stuck in pending on arm64 [14:33] the toolchain is hard to work with there, so it isn't a *regression* [14:34] we should definitely talk over that bug on the call [14:34] bug #1303697 I thought fwereade actually had a patch that I reviewed, so it should be just-about done as well [14:34] <_mup_> Bug #1303697: peer relation disappears during upgrade of juju [14:34] dimitern: do you have a status for bug #1304905? Presumably that is Critical because you wanted to fix the API before it becomes released? 
[14:34] <_mup_> Bug #1304905: Change NetworkName to NetworkId across codebase and use network tags in the API
[14:35] bug #1305386 is about "this might be a bug" being told to us by the go compiler, which is something we should try to address, but it isn't a bug that has seen actual live problems
[14:35] <_mup_> Bug #1305386: state/apiserver: multiple data races
[14:37] jam, i'm about to propose the CL that fixes it
[14:43] jam, alexisb, I have a fix for 1304770, which I can land, we'll just need to update CI to test the charm store tests that will become disabled by default
[14:43] cmars: could we invert the logic? or provide it via an ENV var and just have the build process set that ENV var?
[14:43] because the landing bot is happy to run the tests
[14:43] and so is CI
[14:44] it is more that the build process for trusty needs to be able to disable them
[14:44] We could even just do "if version.Current.arch == 'trusty'" if we wanted.
[14:44] jam, i can certainly invert the logic
[14:44] jam, mgz, rogpeppe, anyone.. I'd really appreciate a review on this https://codereview.appspot.com/86010044/, which also fixes bug 1304905
[14:44] <_mup_> Bug #1304905: Change NetworkName to NetworkId across codebase and use network tags in the API
[14:44] jam, would you prefer a switch or env var?
[14:45] dimitern: ok, lookin
[14:45] g
[14:45] or both :)
[14:45] rogpeppe, cheers - it looks huge, but it's not, just touches a lot of things
[14:47] cmars: *I* prefer an env var, because 'go test ./...' doesn't pass switches well, they have to be defined on all packages
[14:47] while an ENV var can just be picked up while running.
[14:47] wfm
[14:48] +1 for me too
[14:50] jam: +1
[14:51] jam: it would be nice to have a small set of well defined env vars though
[14:55] fwereade: do you recognise this last test failure? looks like a test wasn't updated. https://code.launchpad.net/~rogpeppe/juju-core/mfoord-wrapsingletonworkers/+merge/215035
[14:57] rogpeppe: fwereade seems to be ill in bed :(
[14:57] perrito666: ah
[15:04] so jam just a general note... much of the status on bugs you are giving here is not actually in the bug, any reason for that? Can we just ask folks to update bugs, seems like a pretty good common place to put status :)
[15:32] jam, updated the nomongojs option for tests, ptal https://codereview.appspot.com/82930043
[15:48] dimitern: you've got a review
[16:06] rogpeppe, thanks!
[16:06] dimitern: yw
[16:15] c7z, ping
[16:16] dimitern: hey, will be free in a min
[16:17] c7z, rogpeppe reviewed my https://codereview.appspot.com/86010044/, but can you take a look as well please? I want to land it today, and I almost have the last bit working (cloudinit).
[16:19] fwereade: does your patch for https://bugs.launchpad.net/juju-core/+bug/1303697 actually fix the full bug? I felt like it did, so I went ahead and marked it Fix Committed since your branch is merged.
[16:19] <_mup_> Bug #1303697: peer relation disappears during upgrade of juju
[16:19] fwereade: I should note that if you do "bzr commit --fixes lp:12345" then Tarmac will mark things as Fix Committed when it merges them.
[16:23] alexisb: so I updated some things, for things that "have a branch in progress ready to land" that information is captured in the bug, as long as people link their branches to the bugs. (after the Description is a section about Related Branches), see https://bugs.launchpad.net/juju-core/+bug/1303697
[16:23] <_mup_> Bug #1303697: peer relation disappears during upgrade of juju
[16:24] some of the other bits I mentioned were just stuff that I worked out while reading over the current list of critical bugs, thus existed nowhere but in IRC at that moment :)
[16:25] * jam1 goes to take my son to bed
[16:25] can someone PTAL at my mongojs notest option proposal, https://codereview.appspot.com/82930043/?
[16:29] thus far i've only tried building juju from the deb package - what's the right mechanism for doing it w/ go gotten source?
[16:30] nm - see the README
[16:41] natefinch: lp:~rogpeppe/juju-core/natefinch-041-moremongo
[16:43] rogpeppe: func fakeCmd(path string) {
[16:43] err := ioutil.WriteFile(path, []byte("#!/bin/bash --norc\nexit 0"), 0755)
[16:43] if err != nil {
[16:43] panic(err)
[16:43] }
[16:43] }
[16:45] jam1, thanks!
[17:14] cmars: lgtm
[17:15] jam1, thanks!
[17:15] cmars: make sure to update the bug with how the build process can skip the tests if it needs to
[17:16] jam1, will do. i'm standing up an LXC to test it with an actual juju-mongodb right now, just to be sure. i usually develop against a stock mongodb-server
[17:21] cmars: I also wonder if there is an obvious place in a README or HACKING to describe env flags for the test suite like this
[17:27] jam1, CONTRIBUTING, in the Testing section
[17:28] i'll add a blurb
[17:28] jam1, any objections to me moving the target for 1302205 to 19.1?
[17:31] alexisb: fine with me
[17:31] alexisb: it sounds like something that has a stakeholder, so it is important work, but not something that used to work that we broke
[17:32] nor something that should hold up 19.0
[17:32] I think wallyworld__ moved it because he had 3 fixes he wanted to land but there are still additional issues that need to be investigated
[17:35] anyone else having issues with launchpad server?
[17:38] alexisb: the juju-core homepage opens ok for me, but I haven't really been on it all day
[17:43] natefinch, I got logged out and am having troubles with the login page
[17:44] alexisb: haha, yeah, the login page looks borked: Something broke while generating the page. Please try again in a few minutes, and if the problem persists file a bug or contact customer support. Please quote OOPS-ID ['OOPS-a584d1910ee34d7d91342302107837b7']
[17:44] worked this last time though
[17:44] heh
[17:44] well I am in now
[17:45] yep, me too
[17:47] interesting got same type of oops trying to get back into email
[17:50] alexisb: it has been flakey on and off today. You can probably go into #webops on irc.canonical.com and ask there when we have trouble like this. Apparently there is an issue with some of the Launchpad servers and some squid proxy machines
[17:50] good pointer, thank you jam1
[17:58] jam1: do we really need to support quantal?
[18:00] https://codereview.appspot.com/86600043 - reviews appreciated (VLAN cloudinit network setup for MAAS)
[18:01] natefinch: is there a specific bug in quantal that we're trying to avoid?
[18:02] jam1: lack of mongo
[18:03] natefinch: I don't have something that says "OMG, we must support Q for people". However, it sounds like more code than not to avoid it.
[18:03] natefinch: so, given that nobody ever bothered to port mongo to Q and nobody has complained, I think its pretty clear where our support for it lise
[18:03] lies
[18:03] however, a *lot* of our test code uses Q as a "your not actually running the test suite on this otherwise 'supported' version"
[18:05] jam1: we have code that says "if you're running quantal, apt-add-repository ppa:juju/stable
[18:07] natefinch: so if we have mongo, what is the problem?
[18:07] jam1: just more code to maintain, more special case tests to write
[18:07] (I was writing the test to check that quantal did the add-apt-repository)
[18:08] natefinch: Quantal EOL's when Trusty is released, If I'm reading http://en.wikipedia.org/wiki/List_of_Ubuntu_releases#Ubuntu_12.10_.28Quantal_Quetzal.29 correctly
[18:08] natefinch: so... I'd rather limp along and support it for the next release
[18:09] but I wouldn't spend huge amounts of time on it.
[18:10] jam1: ok, that's fair
=== vladk is now known as vladk|offline
[18:50] * rogpeppe has reached eofd
[18:50] or eod even
[18:51] g'night all
[18:56] anyone else able to "lbox propose" ? I get: error: Get https://api.launchpad.net/devel/people/+me: x509: certificate signed by unknown authority
[18:56] I have the feeling LP changed their SSL certs due to Heartbleed, and now LBOX is refusing to let us get our work done.
[20:13] jam1, i just got that same x509 error w/lbox, then retried and it was successful. strange
[20:14] cmars: maybe it depends what lp appserver you get
[20:14] as there are like 16 of them
[20:14] I did get it 2x in a row
[20:14] jam1: it worked for me this morning. haven't tried since then, though I've heard others have had problems sporadically
[20:14] though I thought all the SSL stuff was handle in the Apache front end
[21:24] morning all
[21:25] morning waigani
[21:25] morning
[21:25] waigani: thanks for trying to resubmit your branch
[21:26] mwhudson: do you have time for some deb hand holding ?
[21:26] davecheney: no problem, looks like mongod problem again?
[21:26] waigani: that bot is screwed
[21:26] hmmm
[21:26] turns out our dogfood isn't fit for human consumption
[21:26] haha
[21:27] davecheney: fyi I'm working on this now: 1304767
[21:28] davecheney: probably
[21:28] https://bugs.launchpad.net/juju-core/+bug/1304767
[21:28] <_mup_> Bug #1304767: test failure in cmd/juju
[21:28] waigani: ok
[21:29] basically, I'll mock out a fake tarball so we are just testing tools uploading - not actually building
[21:30] waigani: hmm
[21:30] i don't think that is the right solution
[21:30] it might eb
[21:30] davecheney: open to suggestions
[21:31] be what I have found in is in many test cases they use the version.Current symbol
[21:31] but the test then expects amd64 tools
[21:31] id check that first
[21:31] davecheney: okay
[21:32] davecheney: wallyworld__and thumper pointed me in the direction of mocking out the tools tarball
[21:32] waigani: thumper knows what he is doing
[21:33] davecheney: I'll look into both :)
[21:45] davecheney, i've flipped the logic for mongojs tests and that's landed in trunk, re: 1304770
[21:45] davecheney, can you PTAL at landing this in https://codereview.appspot.com/86650043/?
[21:46] ^in 1.18 i mean
[21:48] cmars: so be it
[21:48] nobody will read that
[21:48] and we'll get bug reports
[21:48] but so be it
[21:48] until the store moves
[21:48] that is
[21:48] *not very subtle hint*
[21:49] it's not a big problem, nobody runs the tests but us
[21:50] sounds like a topic for discussion at the next sprint. for now, this is a way to unblock CI
[21:51] cmars: roger
[21:54] cmars: thanks for getting that fix in
[21:54] np
[22:11] ffs
[22:11] the bot is really screwed
[22:11] every single change has to be landed several times before it sticks
[22:32] http://paste.ubuntu.com/7232727/
[22:32] mongo is blowing up :(
[22:33] can anyone connect to the bot and see if there are like 1,000 mongod processes leaking around
[22:46] hazmat: where did you look for the source of juju-mongodb again?
[22:47] hazmat: I'm looking at bug 1302747
[22:47] <_mup_> Bug #1302747: mongodb fails to start with local provider
[22:47] davecheney: not 1000, but maybe 5
[22:47] i'll nuke them
[22:47] wallyworld__: ta
[22:47] i normally find a few doen lurking around by the EOD
[22:47] thumper, i went to github lsat time
[22:48] thumper, mongodb upstream that is
[22:48] kk
[23:15] anyone seen juju machines pending forever on aws?
[23:17] http://paste.ubuntu.com/7232845/ machine 8 fwiw machine-0 log
[23:17] pending in status for 20m
[23:22] * hazmat switches to stable branch tip
[23:25] https://codereview.appspot.com/86650043/, can I get a look (backport to 1.18)
[23:25] needs a LGTM