[00:15] thumper: ping
[00:16] lazyPower: hey man
[00:16] hey thumper :) is there any way that i can query an action status in juju? i've run 3 items (1 ran, 2 queued)
[00:17] i'm realllllyyyy curious what's going on with those other 2 actions that queued that gave me zero feedback other than a queue with a hash.
[00:17] lazyPower: NFI sorry, jw4 around?
[00:17] yep
[00:17] \o/
[00:17] score, right people, right time
[00:17] lazyPower: juju action status ?
[00:18] how... did i miss this in juju action help?
[00:18] * lazyPower facepalms
[00:18] lazyPower: bad docs
[00:18] mea culpa :(
[00:20] lazyPower: btw... juju help actions is a very truncated primer
[00:20] it's a new feature, i forgive you jw4
[00:21] lazyPower: lol thanks
[00:21] we'll do better next time \o/
[00:25] Bug #1441915 was opened: juju uses unsafe options to dpkg inappropriately
=== kadams54_ is now known as kadams54-away
[00:29] axw: standup?
[00:32] lazyPower: ping
=== kadams54-away is now known as kadams54_
[00:33] rick_h_: pong
[00:33] lazyPower: hate to be a bother man, but noticed that our link to the openstack bundle was broken and see you pushed it under -basic vs -base per https://pastebin.canonical.com/129202/
[00:33] yikes
[00:34] 1 sec, let me fix that
[00:34] lazyPower: working on the release notes email and wanted to call out the bundle move, any chance you've got time to repush it? and remember the name in the yaml has to match please
[00:34] sorry for not realizing it when you did it :(
[00:35] All good - glad we caught it before we broadcast a 404
[00:35] lazyPower: +1
[00:35] rick_h_: the bundle deploy command turns into: juju quickstart bundle:openstack/openstack-base
[00:35] right?
[00:35] or is it openstack-base/openstack-base
[00:35] juju quickstart openstack-base
[00:36] that's it, just the same thing as the jujucharms.com url for promulgated bundles
[00:36] lazyPower: e.g. http://jujucharms.com/mongodb-cluster check the copy/paste command there
[00:36] ok openstack-base is pushed
[00:36] or even the one there now https://jujucharms.com/openstack-basic/
[00:37] mgz: sinzui: merge incorrectly failed on goose: http://juju-ci.vapour.ws:8080/job/github-merge-goose/3/console
[00:37] lazyPower: <3 my hero
[00:37] rick_h_: do we want to wipe the openstack-basic LP repo and nuke the charmstore bin?
[00:37] lazyPower: will wait for it to ingest before I send my email
[00:37] or have i created cruft :|
[00:37] lazyPower: yes please
[00:37] ack, on it now
[00:37] I'll blow it away once the branch is gone
[00:38] done, branch should 404 now
[00:41] katco: back, ready now?
[00:41] axw: yep
[00:44] lazyPower: can you double check that the url in the new bundle is right? lp:~charmers/charms/bundles/openstack-base/bundle (ends in /bundle?)
[00:45] fetches 34 revisions from bzr
[00:45] lazyPower: I think we only ingest /trunk to avoid everyone's branches in dev/progress
[00:46] that's a mismatch with what everything else that's a bundle is pushed to for bundles
[00:46] the /trunk is correct nomenclature with charms, but /bundle is what we've been using for bundles since i started.
[00:46] lazyPower: ah ok then coolio
[00:46] lazyPower: just double checking
[00:46] :)
[00:47] https://launchpad.net/~charmers/charms/bundles/mediawiki/bundle <- as verification
[00:47] yep, gotcha
[00:47] on that first page the only other bundle was /trunk so I got nervous
=== kadams54_ is now known as kadams54-away
[00:55] mgz: goose bot still doesn't want to merge stuff: http://juju-ci.vapour.ws:8080/job/github-merge-goose/3/console
[00:57] lazyPower: <3 https://jujucharms.com/openstack-base/
[00:57] rick_h_: would you mind hitting the merge button on https://github.com/go-goose/goose/pull/6? the tests pass (see console above), but the lander mustn't be set up correctly because the merge failed again.
[00:58] axw: looking
[00:58] Email Deployed!
[00:58] axw: button hit
[00:58] rick_h_: thanks :)
[00:58] np
[00:58] always happy to use my powers for evil
[01:02] rick_h_: ty sir :)
[02:03] axw: you around?
[02:03] thumper: I am
[02:03] howdy
[02:03] axw: team meeting?
[02:03] oops
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
[03:18] thumper: hey, did you see my emails about potential changes to the JES spec? I haven't heard any replies from you.
[03:18] I have to go walk the dog, but I'll be back in a bit
[03:22] jam: yeah...
[03:22] jam: we should have a hangout when you are back and have time
[03:22] jam: it was on my todo list to give you a comprehensive response
=== kadams54 is now known as kadams54-away
[04:53] thumper: are you still there?
[04:53] yup
[04:53] I'm up for a hangout, let me grab my coffee cup
[04:54] heh, empty anyway :)
[04:54] jam: thumper just replied to the cli stuff fyi
[04:54] rick_h_: why are you still here?
[04:55] surely it is past your bed time :)
[04:55] thumper: because I can't sleep and I was clearing kanban and replying to bugs and found your email interesting :)
[04:55] at least more interesting than bug triage
[04:55] so party time!
[04:55] rick_h_: I don't know the ascii art for blowing a noisemaker... :)
[04:56] jam: that's ok, I couldn't do it in animated gif form either so we'll be quiet party folk
[04:57] thumper: https://plus.google.com/hangouts/_/gqw3kdy7c4nxrjs2a7adlggamea
[05:00] jam: thumper is that JES or uncommitted and worth me listening in on for fly on the wall info next week? or carry on with my bugs?
[05:00] rick_h_: the chat is about JES I believe
[05:00] rick_h_: you are welcome
[05:00] you're welcome if you're interested
=== JoshStrobl is now known as JoshStrobl[AFK]
=== thumper is now known as thumper-afk
[07:20] Bug #1427814 changed: juju bootstrap fails on maas with when the node has empty lshw output from commissioning
[08:35] dooferlad, hey there
[08:35] Bug #1442012 was opened: persist iptables rules / routes for addressable containers across host reboots
[08:35] dimitern: hi
[08:36] dooferlad, I'm reviewing your branch, but something more urgent came up
[08:36] dimitern: yea, the container stuff?
[08:37] dooferlad, yeah - some experiments are needed - have a look at bug 1441811
[08:37] Bug #1441811: juju-1.23beta3 breaks glance <-> mysql relation when glance is hosted in a container
[08:37] dooferlad, I've added some comments with suggestions how to change the iptables rules we generate so we'll hopefully solve the issue
[08:38] dooferlad, can you please try these (or others if you have a better idea) - the result we're seeking is that packets from a container-hosted charm arrive at another host with the container's IP as source, not its host's
[08:39] dimitern: sure
[08:39] dooferlad, cheers! in the meantime I'll finish your review
[08:40] dimitern: this only applies to new containers deployed with 1.23, the upgrade doesn't change existing containers' network configuration - right?
[08:40] voidspace, which does?
[08:41] dimitern: that bug - the routing problems
[08:41] voidspace, ah, yes
[08:41] voidspace, but post-upgrade any new instance hosting containers will potentially have the issue
[08:42] dimitern: yep
[08:42] dimitern: so you can be in a "mixed" environment, with old style and new style containers
[08:43] voidspace, oh most certainly :)
[08:43] voidspace, we need to deal with this gracefully though
[08:45] Bug #1442012 changed: persist iptables rules / routes for addressable containers across host reboots
[08:54] Bug #1442012 was opened: persist iptables rules / routes for addressable containers across host reboots
[09:33] voidspace, aren't those 2 cards in review merged btw?
[09:34] dimitern: iptraf seems to be a good tool to add to the collection. Shows source address of traffic, so just pinging a container is enough to see if the source address is the host or the container.
[09:36] dooferlad, cool - how do you run it?
[09:37] dimitern: it is a console app. It is apt-getable as well.
[09:39] dooferlad, I'll check it out
[09:59] dimitern: it looks like if we just don't add the current SNAT rule we are fine. Currently we have "-A POSTROUTING -o eth0 -j SNAT --to-source "
[09:59] dimitern: without it, ping responses are from the container IP
[10:00] dimitern: just testing a small change
[10:01] dooferlad, hmm we better check any proposed fix on both AWS and MAAS just to be sure
[10:01] dimitern: indeed!
[10:09] Bug #1442046 was opened: Charm upgrades not easy due to versions not being clearly stated
[10:26] i just saw this panic when running github.com/juju/juju/cmd/jujud/agent tests: http://paste.ubuntu.com/10781542/
[10:26] it looks like a legit juju bug to me
[10:27] rogpeppe1: heh I was just trying to fix that on my branch, assuming it was my own fault somehow
[10:27] rogpeppe1, looks like a bug to me as well << axw
[10:27] rogpeppe1: seems like ensureErr should just return nil if the thing it's given is nil... though there might be more to the bug than that.... like why we're passing it something nil
[10:27] natefinch: it looks to me as if we're getting a closed channel on filesystemsChanges before the first environConfigChanges value has arrived
[10:28] natefinch: that would be my thought for a fix too
[10:28] natefinch, nope, EnsureErr's reason for existence is to report watcher errors on failure, it should not be called when err != nil
[10:28] dimitern: it's being called with a nil Errer
[10:29] rogpeppe1, yeah, but it shouldn't - seems to me like an omission in the loop using the watcher
[10:30] dimitern: hmm, you're right, it shouldn't.
looking more closely, i don't see how we can possibly be getting a closed channel on filesystemsChanges when filesystemsWatcher is still nil
[10:31] weird
[10:36] fwereade: i'm seeing sporadic failures in UniterSuite.TestUniterUpgradeConflicts too
[10:44] dimitern: What do I do to get a public IP address for an EC2 container?
[10:46] dooferlad, there's no such thing yet
[10:46] dooferlad, the public address of the host is used
[10:47] dimitern: ah, so what do we expect from an EC2 container? Being able to start a service in it, and access it using the host IP?
[10:48] dooferlad, the public address is only needed for exposing stuff
[10:48] dooferlad, but in AWS we should have the same behavior as MAAS (assuming the fix worked)
[10:49] dimitern: well, in MAAS I can create a container and ping it, getting the response back from the container IP address
[10:49] dooferlad, i.e. other hosts (or containers on the same or other hosts) talking to a container-hosted charm should see the packets from the container's ip
[10:51] dimitern: which is private to the host at the moment, because it is on a bridge?
[10:51] unless a service has been exposed
[10:51] dooferlad, what is private to the host?
[10:52] dimitern: to ask a better question, do we expect addresses of containers, attached to lxcbr0, to be accessible from other physical machines in the same VPC?
=== thumper-afk is now known as thumper
[10:55] dooferlad, ok let's call 10.0.3.0/24 and 192.168.122.0/24 addresses "local container addresses" and the ones in the same range as their host "internal container addresses"
[10:55] dooferlad, we expect other hosts (or containers on other hosts) to be able to connect to the internal container address and see connections originating from the same address
[10:55] hmm...
[10:56] noticed the check-in time at Nuremberg is 3pm
[10:56] I arrive around 7:30 am
[10:56] dimitern: great
[10:56] who is staying saturday night?
[10:56] dooferlad, while the local container addresses are irrelevant (if possible)
[10:57] thumper, I'll be there saturday afternoon
[10:57] hazaah
[10:57] I think I'll be half dead / asleep
[10:57] dooferlad, "if possible" == I'd rather not have to deal with local container addresses wrt iptables rules (just their CIDR range)
[10:58] dimitern: yes, good point - I'll move them now
[10:58] voidspace, cheers!
[10:58] dimitern: OK, I started an EC2 machine with an LXC on it, juju status says its IP address is 10.0.3.82. I guess that is bad.
[10:59] dooferlad, it is bad
[10:59] dooferlad, it needs to be something like 172.x.x.x
[11:00] dooferlad, check machine 0 logs around PrepareContainerInterfaceInfo (I hope you're logging at TRACE level)
[11:01] dimitern: "cannot allocate addresses: no interfaces available"
[11:02] logging at debug
[11:02] dimitern: yeah I noticed that just earlier myself, filesystemsWatcher isn't being assigned
[11:02] rogpeppe1: ^^
[11:02] it's shadowed
[11:02] in startWatchers
[11:02] gtg catch a plane, see you in a few days
[11:03] axw: ah!
[11:03] axw: it's not the only one either
[11:04] dimitern: there's another bug there too
[11:05] rogpeppe1, oh yeah?
[11:05] dimitern: when environConfigChanges is closed, we should do EnsureErr on environConfigWatcher, but it's doing it on volumesWatcher instead
[11:05] dimitern: so i think that's three easy-to-fix bugs :)
[11:05] rogpeppe1, nice catch!
[11:06] dimitern: i saw an actual panic from that one too
[11:17] fwereade, hey, are you about?
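The "shadowed in startWatchers" bug axw points at here is a classic Go mistake: ':=' inside a closure declares a fresh local variable instead of assigning the outer one. A minimal, self-contained sketch of the pattern, with names borrowed from the chat (this is not the actual juju worker source):

    package main

    import "fmt"

    type fakeWatcher struct{ what string }

    func watchFilesystems() (*fakeWatcher, error) {
        return &fakeWatcher{"filesystems"}, nil
    }

    func main() {
        var filesystemsWatcher *fakeWatcher

        startWatchers := func() error {
            // BUG: ':=' declares a *new* local filesystemsWatcher, shadowing
            // the outer variable, which therefore stays nil after this runs.
            filesystemsWatcher, err := watchFilesystems()
            if err != nil {
                return err
            }
            _ = filesystemsWatcher

            // Fix: declare err separately and use plain assignment so the
            // outer variable is the one that gets set:
            //
            //   var err error
            //   filesystemsWatcher, err = watchFilesystems()
            return nil
        }

        if err := startWatchers(); err != nil {
            panic(err)
        }
        // Prints <nil>, which is why the loop later sees a nil watcher.
        fmt.Println("outer filesystemsWatcher after startWatchers:", filesystemsWatcher)
    }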
[11:17] dimitern, o/
[11:18] fwereade, :) I hope you've sorted out your flights for nuremberg
[11:18] dimitern, ha, yes
[11:18] fwereade, cause you're still red on the logistics spreadsheet
[11:18] ...shite, I think I forgot that bit
[11:19] * fwereade goes crawling off to try to sort that out
=== kadams54-away is now known as kadams54
[11:58] mgz: would it be possible to fix this failure in the juju landing 'bot please? http://juju-ci.vapour.ws:8080/job/github-merge-juju/2834/console
[11:59] mgz: i think it's happening because code.google.com/p/go.net is no longer a dependency (a Good Thing)
=== kadams54 is now known as kadams54-away
[12:00] mgz: and for some reason the script is trying to remove charset/testdata
[12:07] is anyone else here able to fix the landing bot ?
[12:28] rogpeppe1: sure
[12:28] mgz: thanks
[12:28] fwereade: ping, I had some questions about a leader-election test
[12:28] jam, heyhey
[12:28] rogpeppe1: when you say no longer a dep, did it actually just move?
[12:28] mgz: yeah
[12:29] so, do we still need to remove that stuff but from a different path?
[12:29] mgz: to golang.org/x/net
[12:29] mgz: quite possibly.
[12:29] okay, I will do that
[12:29] mgz: do you know why it was removed anyway?
[12:29] fwereade: I don't know if you saw http://reviews.vapour.ws/r/1378/ which basically just removes leader-elected as a feature flag (it's just always enabled) per Mark's request.
[12:29] rogpeppe1: it's not properly licenced
[12:29] I had a bit of cleanup (some stuff with how exceptions were getting wrapped and unwrapped)
[12:29] but mostly it worked
[12:30] except one test
[12:30] mgz: why should that matter in the build bot?
[12:30] fwereade: specifically the UnitDying test causes leader-settings-changed (presumably because the unit loses its leader status)
[12:30] but leader-settings-changed isn't ordered vs db-relation-departed
[12:30] jam, ah yeah, I saw the ship-it and didn't look further than that
[12:31] rogpeppe1: we use the same tarball creation script as the actual release, so we're testing the right stuff
[12:32] fwereade: so http://reviews.vapour.ws/r/1378/diff/# line 67 is where I unwrapped the error
[12:32] and 1034 is the test I commented out
[12:32] axw made the comment "maybe leader-settings-changed should be suppressed when dying" which I had just thought of independently
[12:34] jam, hmm, so re the unwrapping that surprises me a bit -- what is it that's wrapping an ErrDying in the first place? I usually think of that as a direct signal rather than something that bubbles through many layers
[12:34] fwereade: isLeader
[12:34] jam, not to say that it can't happen or that it's not good though
[12:34] times out with ErrDying
[12:34] and the upper layers do "errors.Trace()"
[12:34] jam, generally I think we should be checking Cause(err) rather than err just about everywhere though
[12:35] jam, it's the price we pay for tracing
[12:35] jam, re leader-settings-changed while dying
[12:35] mgz: i'm surprised about the licensing - golang repos don't usually contain anything encumbered
[12:35] fwereade: yeah, so maybe we want a helper that wraps tomb.Kill in tomb.Kill(errors.Cause(err))
[12:36] jam, don't think so? it's site-specific
[12:36] jam, frequently the context is just what the doctor ordered
[12:36] jam, it's only for certain special values in any given case
[12:36] jam, even if there are going to be some very common cases...
[12:36] fwereade: you're right. It was more about Signaling (singleton?) errors
errors [12:37] the trace did help me actually find where the error was being generated [12:37] though interestingly enough [12:37] jam, cool :) [12:37] isLeader() doesn't return errors.Trace(ErrDying) [12:37] which would have actually gotten the line [12:37] rogpeppe1: changed, try sending it through again [12:38] mgz: trying [12:38] jam, heh, interesting point [12:39] jam, errors.Trace(tomb.ErrDying) is squicky at first sight, but I can't think of a good argument against it [12:39] rogpeppe1: the html test cases are from the w3c, and their licence is non-free [12:39] Bug #1442132 was opened: [packaging juju-1.23] Issues met while working on debian/copyright file [12:39] jam, and that'd then require us to enforce cause-checking properly [12:39] fwereade: well one bit is that it makes it *obvious* that the errors are going to be wrapped and you need errors.Cause() before passing to tomb. [12:40] paraphrase jinx [12:40] jam, indeed :) [12:40] has a nohm, their website currently claims dual licenced to 3-clause bsd, I wonder if that's a recent change [12:40] jam, so, yeah, I think you''re right [12:41] their licence has an advertising clause [12:41] jam, the traces are good, cause-checking is generally necessary anyway, we should just trace special error values from the start [12:41] mgz: it looks fairly free to me: http://www.w3.org/Consortium/Legal/2008/04-testsuite-copyright.html [12:42] jam, doesn't affect the most common errors.New/Errorf code, right? [12:42] and no-modification [12:42] fwereade: so a helper to errors.Cause ErrDying for tomb is sane? [12:42] mgz: is the problem this: "Neither the name of the W3C nor the names of its contributors may be used to endorse or promote products derived from this work without specific prior written permission." [12:42] ? [12:42] jam, yeah, and we can just stick it in in place of all the x.tomb.Kill(x.loop()) calls [12:43] rogpeppe1: and "No right to create modifications or derivatives of W3C documents is granted..." [12:43] jam, about LSC when dying [12:43] mgz: i don't see that clause anywhere [12:43] but by that current doc, we can just take 3-clause bsd, I'll check with rbasak [12:44] jam, we should look further into where that LSC is coming from -- I don't recall us abdicating leadership at that point [12:44] jam, off the top of my head, I'd suspect the filter of starting a fresh watcher maybe? [12:45] jam, assuming it is a legitimate hook, though, the arbitrary ordering is a feature not a bug [12:46] fwereade: right, I don't think we can prescribe an ordering. [12:46] fwereade: so we have checks that say "if as a result of this action, I'm no longer leader, trigger leader-settings-changed" [12:47] jam, so looking at AliveLoop/DyingLoop I am mainly concerned that they are more different than I would have hoped -- ie that we seem to stop reacting to all manner of hooks while dying [12:47] mgz: same issue still: http://juju-ci.vapour.ws:8080/job/github-merge-juju/2835/console [12:47] jam, and I'm not completely sure that's correct -- I know it has on occasion irritated users that we don't handle charm upgrades while dying, for example [12:47] jam, and as a charmer you *don't know* whether you're dying [12:47] rogpeppe1: doh, I changed the wrong machine [12:48] jam, so every difference between alive and dying is just arbitrarily varying behaviour from the POV of the charmer [12:48] fwereade: could it be uniter.go line 397 "before.Leader != after.Leader" ? 
[12:49] jam, quite likely, yes -- but I'm still not seeing what'd trigger it
[12:49] jam, oops sorry wrong bit
[12:49] fwereade: so a unit no longer being alive means it can't be elected to leader, right?
[12:50] voidspace, dooferlad, Subnets API - AllSpaces() - please, take a look: reviews.vapour.ws/r/1403/
[12:50] certainly we don't want a Dying unit to become the leader (I would think)
[12:50] jam, after.Leader shouldn't have been set, should it?
[12:50] rogpeppe1: re-re-try
[12:50] jam, completely independent currently
[12:50] mgz: reretrying
[12:50] fwereade: so I haven't debugged here, but before.Leader should have been set, right? It was the only unit of a service, thus the leader, then it goes to dying
[12:51] fwereade: we know the unit is not leader because it got leader-settings-changed not leader-elected
[12:51] jam, yes, before.Leader should have been set, and unless we ran a resignLeadership op it should still be set
[12:51] jam, apart from anything else
[12:52] jam, a dying unit should not renounce leadership if it's the only unit
[12:52] fwereade: so I don't *know* that it's that code that's triggering it. I just know I'm seeing a leader-settings-changed in that test, and it feels a lot like something noticing it's not leader anymore and thus queuing a leader-settings-changed event
[12:52] jam, and the simplest way to implement that is to completely decouple leadership from life -- the only cost is that the next leader may be elected after a short delay
[12:53] jam, I admit I am mostly occupied with a different structure in my head right now so I might easily be wrong somewhere
[12:53] fwereade: meaning, if you were leader you have 30s after you die before we notice that you're no longer handling leadership ?
[12:53] jam, yes
[12:53] fwereade: what about the code that votes in a new leader, seems it should favor non-dying
[12:54] jam, dropping it the moment the unit's set to Dead would be fine and good
[12:55] jam, not convinced that makes much difference in the end? should the only remaining unit, dying but blocked, never be elected leader, and thus never (say) run service-level actions?
[12:55] well I did say favor not never-elect
[12:56] jam, true :)
[12:56] but at the same time, if you're Dying I don't know whether leader stuff actually matters.
[12:56] jam, but I can't see a robust way to do that and I'd rather do nothing than something inherently flaky
[12:56] fairy 'nuff
[12:57] jam, it still does, I think -- you're still likely to be responsible for a bunch of things even if the charm isn't aware of it
[12:57] jam, given a service composed of N dying+blocked units, you still want one of them to be leader so it can aggregate statuses, run service actions, whatever
[12:58] jam, and when a dying and non-blocked unit is elected, ehh, it'll be deposed soon enough, and we have to expect and tolerate leadership changes *anyway*
[13:02] fwereade: so runOperation appears to be the only place that could possibly call WantLeaderSettingsEvents(true) - does that fit with you?
[13:02] jam, I think so, yes
[13:03] fwereade: I was just trying to figure out where to look to find out why we were deposed
[13:03] jam, that's the one place that should see all the state changes we record between leader/minion
[13:03] jam, seeing what operation set it to false in there should work
[13:04] fwereade: and your thought is that it should be a ResignLeadership op ?
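For readers following along, here is a rough paraphrase of the check being discussed (the chat points at uniter.go around "before.Leader != after.Leader"): when an operation's state change flips the Leader flag, the uniter tells its filter whether it now wants leader-settings events, which is what ends up queueing leader-settings-changed for a freshly deposed unit. The types, field names and the exact semantics of the flag below are reconstructed from the conversation, not copied from the juju source:

    package main

    import "fmt"

    type opState struct {
        Leader bool
    }

    type fakeFilter struct{}

    func (f *fakeFilter) WantLeaderSettingsEvents(want bool) {
        fmt.Println("WantLeaderSettingsEvents:", want)
    }

    func main() {
        f := &fakeFilter{}

        before := opState{Leader: true}
        // A resign-leadership operation (such as the one run on entering
        // modeAbideDyingLoop) applies a state change with Leader: false.
        after := opState{Leader: false}

        if before.Leader != after.Leader {
            // Minions watch leader settings; the current leader does not
            // need to, since it is the one writing them.
            f.WantLeaderSettingsEvents(!after.Leader)
        }
    }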
[13:05] jam, well, I sorta think it *shouldn't* be -- ie, yes, that is the only op that should; but I don't *think* we should get that op... should we?
[13:06] fwereade: well, that's the only code that has "Leader = false" but potentially leadership.State{} also sets it to false
[13:06] (setting by omission is one of my dislikes of go defaults)
[13:06] jam, indeed, I would be suspicious of finding some op that wasn't using stateChange.apply
=== tvan-afk is now known as tvansteenburgh
[13:34] dimitern: so the reason that EC2 containers aren't working is that provider/ec2/environ.go -> NetworkInterfaces -> ec2Client.NetworkInterfaces is not returning any interfaces.
[13:34] ...so we can't find out what the machine's IP address is, so we can't set up the container correctly
[13:35] dooferlad, have you tried without your fix?
[13:35] dimitern: no, but I am not sure how it would make any difference.
[13:35] dimitern: can do now if you like.
[13:36] dooferlad, please do
[13:36] * dimitern hopes we didn't break addressable containers on AWS
[13:36] fwereade: so I do see "running operation resign leadership" being triggered, but sometimes the test doesn't fail...
[13:36] is there a good way to figure out why we'd be running that op?
[13:37] dooferlad, which branch are you using?
[13:37] benji: yw
[13:37] dimitern: 1.23
[13:37] jam, look in op_plumbing.go for the creator func and see where that's used
[13:37] jam, should only be a couple of places
[13:37] jam, and probably only one of them should be a plausible source given the test
[13:37] dooferlad, ok, I'll try 1.23 tip here on AWS
[13:38] dimitern: I am already running
[13:38] fwereade: modeAbideDyingLoop has newResignLeadershipOp()
[13:38] jam, right, but I thought it only triggered when the tracker told us to
[13:38] Refresh, DestroyAllSubordinates, SetDying, ResignLeadership
[13:39] fwereade: ModeAbide has "<-u.f.UnitDying() return modeAbideDyingLoop"
[13:41] jam, *dammit* sorry
[13:42] jam, I do "resign leadership" as soon as we hit that loop
[13:43] jam, it doesn't affect anything else, it's effectively just early warning for the charm that soon it won't be leader
[13:43] jam, so I think the problem is that the other hooks are racing with <-UnitDying in modeAbideAliveLoop
[13:43] fwereade: sure, so ordering means we may or may not get it before db-relation-broken and db-relation-dying etc.
[13:44] jam, we should run it, you're absolutely correct, all my intimations that we shouldn't have been complete nonsense
[13:44] jam, if the broken and dying were triggered by the unitdying too we'd be fine I think
[13:45] jam, is this test one where we're dying *while* the remote units really are leaving the relation?
[13:45] jam, if so, it's unorderable I fear
[13:45] jam, collecting laura gtg bbs
[13:45] fwereade: enjoy
[13:45] yeah, I don't think it should have an order, the question is how to properly test it.
[13:48] rogpeppe1, dimitern: did you guys figure out that panic w/ EnsureErr?
[13:48] natefinch: axw worked it out
[13:48] natefinch: i'm leaving it for one of you guys to fix (there are about 3 bugs there)
[13:49] rogpeppe1: since it's blocking me from committing, I'm more than willing to fix it.
[13:49] natefinch: it's mostly a shadowed-variable bug
[13:49] natefinch: but there's one place that the wrong variable is used too
[13:50] natefinch: it's sporadic (i just managed to merge a PR)
[13:50] rogpeppe1: saw it in scrollback, and I see it in the code, looks straightforward enough...
just remove a couple colons
[13:51] rogpeppe1: what's the wrong variable?
[13:51] natefinch:
[13:51] case _, ok := <-environConfigChanges:
[13:52] if !ok {
[13:52] return watcher.EnsureErr(volumesWatcher)
[13:52] }
[13:52] natefinch: volumesWatcher should be environConfigWatcher
[13:52] rogpeppe1: yep, ok, I see it
[13:54] dimitern: no change without the fix
[13:56] dooferlad, well, something is wrong at your side, because I've just bootstrapped and deployed a container on AWS - with an address from the host's range
[13:57] dimitern: well, that's no good :-|
[13:57] dooferlad, what AWS account are you using?
[13:58] dimitern: the canonical one I was given
[13:58] dooferlad, is the env still alive?
[13:58] dimitern: yes
[13:59] dooferlad, in us-east-1 ?
[13:59] dimitern: yes, ec2-54-159-20-216.compute-1.amazonaws.com
[14:00] dooferlad, got it - the issue is there's no default VPC there
[14:00] dimitern: oh (*%&^*
[14:01] dimitern: what region should I target?
[14:01] dooferlad, hmm.. no there is one actually
[14:01] dooferlad, but why wasn't it used? I can see the instance is a classic EC2 one, not VPC
[14:02] dooferlad, if you have full TRACE logs and --debug log from the bootstrap that might give us some pointers
[14:03] dimitern: http://paste.ubuntu.com/10782887/
[14:03] wwitzel3: natefinch are we having standup?
[14:04] dooferlad, and the machine-0 log?
[14:05] dimitern: http://paste.ubuntu.com/10782893/
[14:17] perrito666: sorry, lost track of time
[14:27] katco: you doing the cross team call? Do you have the info?
[14:27] I think my ears are shrinking
[14:27] my earplugs are not as comfortable as they used to be
[14:27] natefinch: yeah i'll be there... although the call keeps dropping for some reason
[14:47] hi!
[14:53] * dimitern steps out for ~1h
[15:14] ahh sleeps in tests, the hallmark of true quality
[15:16] dimitern: gobot doesn't actually have push rights to go-goose
[15:16] *jujubot
[15:17] anyone else seeing notifyWorkerSuite.TestCallSetUpAndTearDown failing sometimes with it not having called setup?
[15:18] I'm not sure where it's falling down exactly though
[15:21] mgz: did you do the "set membership to public" thing? I don't seem to have rights to see the membership stuff, so I can't tell myself.
[15:22] natefinch: I did
[15:22] but I don't have perms to poke further
[15:22] mgz: is this the first time we've tried this with something external to github.com/juju?
[15:24] natefinch: yup, but it's not really much different
[16:04] wow, ok, I see what's wrong with the notifyworker tests..... it's a built-in race condition. We're assuming a notifyworker running a goroutine will call Setup() before one of the tests checks that setup has been called.
[16:06] is it just me, or is there no clear documentation on how to create a new hook?
[16:07] perrito666: documentation is for the weak
[16:12] or for those that need to implement a hook before monday
[16:13] perrito666: sorry.... all the info I can see in the docs is here: https://github.com/juju/juju/blob/e751bc6d1b44ef71679b946417a3b5c7484673b2/doc/charms-in-action.txt
[16:14] perrito666: which from a quick skim does not seem to really talk about making new hooks
[16:14] yup, same here, I think I'll dig out a bit among the implemented ones, I foresee lots of jumping around :p
[16:25] rogpeppe1: Care to do a super quick review of those changes you found, plus a couple other small test fixes? http://reviews.vapour.ws/r/1405/
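The wrong-variable half of that fix, sketched as a self-contained program: when a watcher's Changes channel closes, the error has to be collected from *that* watcher, not a neighbouring one. fakeWatcher and ensureErr stand in for the real API watchers and watcher.EnsureErr; this is not the actual storage worker code.

    package main

    import (
        "errors"
        "fmt"
    )

    type fakeWatcher struct {
        changes chan struct{}
        err     error
    }

    func (w *fakeWatcher) Err() error { return w.err }

    // ensureErr mimics the contract described in the chat: report the
    // watcher's own error once its channel has closed, or a generic one if
    // it closed without recording a failure.
    func ensureErr(w *fakeWatcher) error {
        if err := w.Err(); err != nil {
            return err
        }
        return errors.New("watcher closed channel without error")
    }

    func main() {
        environConfigWatcher := &fakeWatcher{
            changes: make(chan struct{}),
            err:     errors.New("environ config watcher died"),
        }
        volumesWatcher := &fakeWatcher{changes: make(chan struct{})}

        close(environConfigWatcher.changes) // simulate that watcher failing

        select {
        case _, ok := <-environConfigWatcher.changes:
            if !ok {
                // The buggy version returned ensureErr(volumesWatcher) here,
                // reporting the state of the wrong (still healthy) watcher.
                fmt.Println("loop exits with:", ensureErr(environConfigWatcher))
            }
        case _, ok := <-volumesWatcher.changes:
            if !ok {
                fmt.Println("loop exits with:", ensureErr(volumesWatcher))
            }
        }
    }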
[16:28] or dimitern, or katco or anyone else ^^
[16:29] jam: fwereade: ^^
[16:29] natefinch: tal
[16:30] katco: thanks
[16:30] perrito666, what hook? :)
[16:31] Bug #1442257 was opened: lxc network.mtu setting not set consistently across hosts
[16:31] perrito666, I can try to hit the high points
[16:33] wwitzel3, fwereade, katco, perrito666 I need a volunteer to help with a critical customer issue
[16:33] someone have bandwidth to help the onsite team?
[16:34] alexisb, I can jump in but perhaps not for very long
[16:34] fwereade, pointed you to the chatter
[16:35] natefinch: given notifyHandler is test code, is there any reason SetUp is done asynchronously in the first place?
[16:35] alexisb, cheers
[16:36] natefinch: reviewed
[16:37] jam: the setup is the watcher method that gets called by the watcher code
[16:40] natefinch: also reviewed... just a few questions
[16:40] alexisb: fyi i think wwitzel3 is traveling atm
[16:40] jam: it's this code, which is done from a goroutine spawned in NewStringsWorker: https://github.com/juju/juju/blob/master/worker/stringsworker.go#L64
[16:41] be back in a bit... gotta pick up my kid from preschool
[16:47] launchpad question: if a bug is blocked while we're waiting on more information, do we mark it as incomplete?
[16:48] i'm hesitant because it says "cannot be verified", but what i really mean is "verified, but need more information"
[16:49] katco: incomplete is fine
[16:50] mgz: cool, ty
[17:06] natefinch, how goes your branch for bug 1394755?
[17:06] Bug #1394755: juju ensure-availability should be able to target existing machines
[17:07] voidspace, how goes the fix for bug 1441206?
[17:07] Bug #1441206: Container destruction doesn't mark IP addresses as Dead
[17:12] sinzui: natefinch is picking up his kiddo
[17:14] sinzui, natefinch's fix is ready but he is seeing failing tests when he tries to merge
[17:14] he is currently investigating
[17:14] alexisb: i think that fix is under review too
[17:14] voidspace, is done for the day
=== kadams54 is now known as kadams54-away
[17:15] and fwereade you're a rock star, thank you for the help!
[17:15] katco, thank you.
=== kadams54-away is now known as kadams54
=== kadams54 is now known as kadams54-away
[17:24] alexisb, I just turn up and frown at the bugs and they go away ;p
[17:27] fwereade: you are chuck norris
=== kadams54-away is now known as kadams54
[17:33] back
[17:34] front
[17:40] katco: responded to your review... do you understand my response?
[17:41] * katco looking
[17:41] * natefinch doesn't want to just dismiss your review and commit without you actually reading it and agreeing I'm not crazy ;)
[17:42] natefinch: ah yeah that makes sense
[17:42] (about this)
[17:42] katco: cool
[17:42] natefinch: another place channels would have saved us some trouble :)
[17:42] katco: yep, channels are pretty cool
[17:44] * natefinch really wishes reviewboard's markdown parser would auto-link urls
[17:45] natefinch: I would not make the merging of a patch dependent on someone saying you are not crazy
[17:48] :)
=== JoshStrobl[AFK] is now known as JoshStrobl
[17:55] sinzui: yeah, sorry - EOD
[17:55] sinzui: it won't be finished today and I'm off tomorrow
[17:55] sinzui: will land in Nuremberg...
[17:57] voidspace, that is ok, it will just go in a point release
[17:57] perrito666: haha very funny
[17:57] have a great weekend and we will see you next week voidspace
[17:57] voidspace, thank you
[18:04] I am somewhat unreasonably excited to see people again.... I think the fact that the last 4 months of my life have been dominated by a very tiny human being probably has something to do with that.
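Stepping back to the notifyWorkerSuite race natefinch described earlier (SetUp is called from a goroutine spawned by the worker's constructor, so a test cannot assume it has already run): one way to close that kind of race is to have the test handler signal on a channel and make the test wait for the signal. The sketch below is only an illustration of the race and of that pattern, not the change that actually landed in the review above.

    package main

    import (
        "fmt"
        "time"
    )

    type testHandler struct {
        setUpCalled chan struct{}
    }

    func (h *testHandler) SetUp() error {
        close(h.setUpCalled) // signal instead of setting a bare flag the test polls
        return nil
    }

    // newFakeNotifyWorker mirrors the shape of NewNotifyWorker/NewStringsWorker:
    // SetUp runs on the worker's own goroutine, so there is no ordering
    // guarantee with the test body.
    func newFakeNotifyWorker(h *testHandler) {
        go func() {
            _ = h.SetUp()
            // ... the watch/handle loop would follow here ...
        }()
    }

    func main() {
        h := &testHandler{setUpCalled: make(chan struct{})}
        newFakeNotifyWorker(h)

        // The racy test checked a flag right here and sometimes ran before
        // the goroutine did; waiting on the channel makes it deterministic.
        select {
        case <-h.setUpCalled:
            fmt.Println("SetUp observed")
        case <-time.After(5 * time.Second):
            fmt.Println("timed out waiting for SetUp")
        }
    }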
[18:07] natefinch, I totally understand :)
[18:08] though it only takes ~30 hours to start getting homesick
[18:08] my husband says that he would really love a week away with adults and is jealous
[18:08] alexisb: i have instructed my wife to send me at least 1 new picture of my daughter a day
[18:09] heh yeah, my wife always tells me she's jealous that I get a week's holiday... no matter how much I tell her it's really work
[18:09] alexisb: heh, I'm getting payback. One week after I get back she's gone for 8 days. Her longest trip ever.
[18:09] the pictures really do help
[18:09] i miss her when she's at school. i'm awful =|
[18:09] yep
[18:09] natefinch: my in-laws still try to build me a travel itinerary "You have to go see xxx and yyy and zzz" :)
[18:09] rick_h_: haha
[18:10] rick_h_: lol i get that too!
[18:10] work, enjoy dinner...bar maybe? then bed
[18:10] talking with the kids over hangouts helps, but there's no replacing hugs in person
[18:10] rick_h_, natefinch husband jokes that I missed all the fun in school so sprints are now my frat parties
[18:11] natefinch, right now the hangouts are hard with jay because he cries for mommy
[18:11] ack!
[18:11] austin this last time was really really bad
[18:11] that has to be heart wrenching
[18:11] I actually stopped talking to him because it made it so hard on james
[18:11] which killed me
[18:11] alexisb: yeah, this one is going to be really bad for me, Zoë has become a huge Daddy's girl, and the first couple nights I'm sure it'll be hell trying to get her to go to bed.
[18:12] :(
[18:12] and by "for me", I mean "but much worse for my wife"
[18:12] natefinch: yea, it's rough when mom asks "come here and talk to daddy" and the response is "no, I don't want to talk to him" and it's your one chance to chat in however many days
[18:12] the royal we
[18:12] takes time for them to figure out wtf is going on as well
[18:12] rick_h_: that's rough
[18:12] we do phone recorded video swaps now more
[18:13] rick_h_, that is a good idea
[18:13] helps with TZ diff, and I'll record something special, so did from the top of table mountain last trip
[18:13] and it's async and mom can help record something when he's in a good mood
[18:13] go back/forth and helps be somewhere in the middle I think
[18:13] natefinch: oh i forgot to tell you! we got a ladybug girl book for my daughter :) she's still a little young, but she likes looking at it :)
[18:13] katco: awesome.. love those books :)
[18:14] natefinch, has all the good skinning on the kids books
[18:14] jay and I read "How does a dinosaur say goodnight" twice this morning
[18:14] we also just got this book "dinotrucks" i think... it's hilarious
[18:14] my nephew loves cars and stuff, so he loves it
[18:14] alexisb: if your boy gets into dragons pick up 'dragons love tacos'
[18:14] alexisb: how do dinosaurs say good night? :)
[18:14] got it in vegas and still a fav
[18:15] with a kiss and a hug and by tucking their tails in
[18:15] aw
[18:15] which I love to read to him because I get a kiss and a hug
[18:16] my daughter is *just* starting to give hugs and it's the best thing ever
[18:17] alexisb: sinzui: thanks - see you next week
[18:40] cherylj, ping
[18:43] Bug #1442308 was opened: 1.23 cannot deploy on vivid, but master can
[18:43] alexisb: btw, my old co-worker just applied for that dev manager position.
I don't know who the hiring manager is for that, but hopefully they'll see her as a quality candidate, even if her prior technology skill set isn't an exact match.
[18:44] natefinch, if you forward me her info I can ping the hiring manager
[18:44] alexisb: what's up?
[18:45] cherylj, the latest 1.23 bug may be fixed by one of your 1.24 fixes
[18:45] can you take a quick peek at https://bugs.launchpad.net/juju-core/+bug/1442308
[18:45] Bug #1442308: 1.23 cannot deploy on vivid, but master can
[18:45] alexisb: sure
[18:45] thanks
[18:46] oh, the commit they're referencing is new functionality
[18:47] and would not fix the problem
[18:47] I'll look more
[18:59] katco: realized I missed some other spots where we were manually creating that type, and so had to move the construction into a function... much cleaner now.. and it fixes some other spots that I didn't realize also needed it: http://reviews.vapour.ws/r/1405/
[19:01] natefinch: tal
[19:01] katco: thanks... I gotta run for a bit, but will try to merge later if it passes muster
[19:01] k
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
[19:34] cherylj, can you have a look at bug 1442308. Master likes vivid but 1.23 does not, and one of your commits might be the fix (though I cannot see how)
[19:34] Bug #1442308: 1.23 cannot deploy on vivid, but master can
[19:35] sinzui: Yeah, I'm looking at that now and I know for certain that my commit wouldn't come into play here...
[19:38] sinzui: in the similar bug for trusty, it looks like they were able to grab the lxc log from /var/lib/juju/containers/juju-*-lxc-template/container.log. How can I get that for this vivid failure?
[19:39] cherylj, ha. it is still there from the last failure. let me get that to you before something cleans up
[19:40] sinzui: thanks :)
[19:40] cherylj, I can see that juju-vivid-lxc-template is still there from the last attempt too
[19:43] cherylj, I attached the log
[19:44] sinzui: thanks! I'll take a look
=== kadams54 is now known as kadams54-away
[20:12] cherylj, should I delete the current container log? will that make future runs easier to diagnose.
[20:15] sinzui: yes, go ahead
[20:16] okay
[20:18] sinzui: Do you know if you could grab that same log from the trusty failure?
[20:19] cherylj, I cannot, trusty has never failed in CI
[20:19] cherylj, only vivid fails
[20:21] sinzui: oh, I guess I'm confused about the difference between bug 1442308 and 1441319
[20:21] Bug #1442308: 1.23 cannot deploy on vivid, but master can
[20:22] cherylj, thumper asked me to separate the vivid issue from trusty. I reported a new bug because ubuntu-engineering want it fixed
[20:24] sinzui: did you ever find a log in /var/lib/juju/containers/juju-trusty-lxc-template/container.log?
[20:24] for bug 1441319
[20:24] Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop
[20:25] I'm just confused because I'm comparing the two, but they're both referencing juju-vivid-lxc-template, even though the failure message for 1441319 indicates juju-trusty-lxc-template
[20:25] cherylj, as I said CI didn't fail, oil did and they have the log
[20:26] ah, okay, I'll ping lmic
[20:32] cherylj, I will hide my vivid comment from the other bug so that it is only about oil
[20:37] sinzui: cool, thanks. Would it be possible to get the /var/lib/juju/containers/juju-vivid-lxc-template/container.log from the successful master run?
[20:44] cherylj, wasn't that what I just gave?
or is the log reset at each bootstrap?
[20:45] cherylj, the log is not saved, so all I could get is what was left of the machine and the size led me to think it was from many weeks of tries
[20:48] sinzui: I think the large size is due to this "peer has disconnected" error being repeated numerous times.
[20:48] understood
[20:56] cherylj, I have mixed news. beta4 can bootstrap and deploy trusty charms. this is good, but you need to set default-series: trusty in your env to ensure juju doesn't try to use vivid versions of local charms
[20:57] my bad news is I just ran out of disk.
[20:57] * sinzui cleans up
[20:57] d'oh
[21:16] cherylj, I can document a workaround for vivid deploying vivid charms. beta4 doesn't complete the vivid template. we can stop it ourselves, then destroy the env. creating and deploying again will work
[21:17] so as long as you don't remove a working template, Juju is good. I am going to change CI to not delete my working template
[21:24] thumper, ping
[21:25] alexisb: yaas?
[21:25] do you mind joining the release call in 5 minutes
[21:25] we are going to need to make some tough calls on 1.23 and I need a clear picture of the lxc issues on vivid
[21:29] alexisb: ok
[21:29] will be there
[21:29] link?
[21:29] sent you the invite
[21:30] k
=== kadams54 is now known as kadams54-away
[21:59] sinzui: ok, where are these vivid machines?
[22:00] thumper, I am going to add your key to ubuntu@vivid-slave-b.vapour.ws
[22:02] katco, does morning or afternoon work better for you tomorrow?
[22:02] thumper, try logging in.
[22:02] thumper, I am disabling the one job on it so that CI won't use the machine
[22:02] alexisb: probably morning-ish
[22:03] thumper, and it looks like CI already used the machine and confirmed that master does love vivid.
[22:04] sinzui: so master shuts down nicely, but 1.23 doesn't?
[22:04] thumper, the juju on the machine is the latest beta which mostly works
[22:04] thumper, yes
[22:04] sinzui: ok, all I need to do is go through all that is in master that isn't in 1.23 and look for systemd stuff I guess
[22:04] simple
[22:04] ...
[22:05] thumper, but cherylj and I could not find a commit to correlate to the passes yesterday
[22:05] * thumper nods
[22:05] I'll start from the start
[22:38] * thumper sighs
[22:38] how do I just make a git branch refer to a particular head?
[22:38] upstream 1.23 here
[22:44] thumper you mean --set-upstream ?
[22:44] git branch --set-upstream?
[22:44] or do you mean check out a branch starting at origin/1.23
[22:45] git checkout -t upstream/1.23
[22:45] found that command
[22:45] that's what I wanted
[22:45] coolio
[22:58] WTF? juju goes from 47 meg to 51 meg between 1.23 and 1.24
[23:17] sinzui: I can't even seem to get 1.23 to bootstrap a local provider on vivid
[23:18] thumper I had just bootstrapped on that machine using local 30 minutes before our meeting
[23:18] * sinzui tries again
[23:18] sinzui: I'm currently trying to bootstrap
[23:19] the first time it failed with: 2015-04-09 23:05:35 ERROR juju.cmd supercommand.go:430 cannot initiate replica set: cannot dial mongo to initiate replicaset: no reachable servers
[23:19] thumper, I see master passed, but the first two tries died quickly. are you getting an error immediately
[23:19] oh...
[23:19] and interestingly (FSVO) it failed to remove the .jenv file or the datadir
[23:19] thumper, I have seen that on this machine in the past, but not in the last few days
[23:19] so it thought it was still bootstrapped for the next try
[23:20] how long does the bootstrap process normally take?
[23:20] thumper, I've been using --force a lot on this machine trying to kill mongo
[23:20] so I may not have seen this
[23:20] thumper, just over a minute
[23:20] if bootstrap fails, it shouldn't leave cruft behind
[23:20] it is a bug if it does
[23:20] and it is
[23:20] which is a bug
[23:20] * thumper sadface
[23:21] at least it was my understanding that it was a bug
[23:22] perhaps the behaviour has changed to allow people to investigate
[23:22] no reachable servers... again
[23:22] takes 5 minutes to time out
[23:23] * thumper tries a third time
[23:25] oh... worked that time
[23:25] yay?
[23:40] can anyone give me the 2 minute overview of systemd commands?
[23:40] sinzui: do you know ^^?
[23:50] sinzui: just bootstrapped 1.23, deployed the ubuntu charm, and destroyed the environment, all looked fine
[23:50] thumper, no mongod left running?
[23:50] with 1.23 built locally from upstream/1.23 and copied across
[23:50] It happened to me 3 times
[23:50] sinzui: not that I could tell...
[23:50] I didn't use --force
[23:51] I saw a mongod running after the message.
[23:51] how?
[23:51] second bootstrap worked
[23:51] and your 1.23 must be the same as mine because no one has committed
[23:51] thumper, ps ax | grep mongod
[23:52] * thumper looks
[23:52] but if you didn't see an error, juju thinks it did everything right
[23:52] ok, just the one
[23:52] for the currently running
[23:52] * thumper destroys again
[23:53] and no mongo
[23:53] sinzui: hangout?
[23:53] sure
[23:54] sinzui: https://plus.google.com/hangouts/_/canonical.com/vivid