[00:20] wallyworld_: thanks for the review on http://reviews.vapour.ws/r/549/ (I finally got around to responding)
[00:20] sure, i'll go look
[00:24] wallyworld_: thanks
[00:54] wallyworld_: thanks for doing the backport - was about to do that
[00:54] axw: np, i got blocked on something else so thought i may as well
[01:12] ericsnow: what's happening with the current CI blocker?
[01:13] wallyworld_: sinzui was going to add a 5 minute sleep before running backup in the script (or something like that)
[01:13] ok, thanks
[01:13] i guess we still added the wait for ha?
[01:13] wallyworld_: I don't know what's going to happen with that blocker bug
[01:14] wallyworld_: voidspace's patch did not go in
[01:14] oh ok
[01:14] was it rejected?
[01:14] wallyworld_: incidentally, that failing test passed at least once today, so we must be relatively close on when HA is becoming ready
[01:15] mongo sorta sucks that it seems to lie to us about if it is ready
[01:15] wallyworld_: yep
[01:16] wallyworld_: that's the gist of the current blocker bug, if I understand correctly
[01:16] we need to find a solution
[01:16] nate was talking about getting in touch with the mongo folks (I'll follow up tomorrow)
[01:17] sounds like he has had some contact with them before about replica set stuff
[01:19] wallyworld_: on that upload patch are you okay with me leaving the cmd tests as is (they do mock out the API client)
[01:19] yep
[01:19] k
[02:17] wallyworld_: can't hear you anymore...
=== kadams54 is now known as kadams54-away
[02:54] axw: FWIW, I was mostly joking about exa, yotta and zeta bytes...
[02:56] thumper: 640k is enough for anybody
[02:56] thumper: heh :)
[02:56] for sure
[02:57] thumper: doesn't really hurt
[02:57] if someone tries it, it's likely to fail for the foreseeable future, but... may as well do the 10 minute job now and forget about it
[02:58] wallyworld_: sent out the email. I've got a workshop to go to at the school, heading off in a few
[02:58] axw: sure, tyvm
=== linstatsdr_ is now known as LinstatSDR
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
=== kadams54 is now known as kadams54-away
[05:09] hatch: Ugh, now I have to mock classList.remove and classList.add because we don't have a dom to work with :(
[05:10] huwshimi: wrong channel? :)
[05:10] erm
[08:41] watching the first end-to-end run of an Action right now :)
[08:44] if anyone would like to play with / break it, we have a branch available at under "actions" https://github.com/juju-actions/juju/
[08:44] landing a commit in a moment that should make "do", "fetch", and "defined" functional, along with the charm-side stuff
[08:48] also some good content in the wiki at that fork
[08:58] bodie_, sweet \o/ I'll give it a try later today
[08:59] dimitern, awesome :) there's a Phoronix testing suite charm we have linked in our wiki, but marcoceppi knows more about how to use it
[08:59] bodie_, cool, thanks - I'll give you a shout if something is unclear :)
[09:17] dimitern, great, and feel free to open issues if you manage to break something
[09:17] bodie_, sure, no worries
[09:55] ericsnow: ping
[10:01] voidspace, jam1, standup?
[10:01] dimitern: ok
[10:02] dimitern: jam1 will be off won't he
[10:02] voidspace, ah, it's friday - yeah
[10:02] TheMue: stdup?
[10:15] morning
[10:18] perrito666: o/
[10:33] jamespage, ping
[10:34] jamespage, hey, a friendly reminder to send me that mail with request-address/interface use cases when you have time please :)
[10:34] on my list for today
[10:34] jamespage, thanks!
[12:48] how is the blocking status on CI?
[12:48] heh, surprised no one's written an irc bot yet :) !ci-status
[12:52] * perrito666 goes figure out if he is hearing gunshots or fireworks brb
[13:00] back
[13:42] ericsnow: ping
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
[14:33] voidspace, are you working on bug 1399277 as well?
[14:33] Bug #1399277: ensure-availability is not reliable
[14:34] dimitern: yes
[14:34] dimitern: we hope this branch is the fix http://reviews.vapour.ws/r/583/
[14:35] dimitern: but it needs a review
[14:35] dimitern: and I'm not convinced it's sufficient but it will definitely help
[14:35] dimitern: I got a "ShipIt" but I added a change to go from "all healthy" to "majority healthy" which needs review
[14:35] dimitern: and as this changes ericsnow's code I'd really like him to see it before I merge it
[14:37] voidspace, ok, I'll have a look
[14:44] dimitern: PickNewAddress is done bar the tests, it's quite a funky little algorithm
[14:45] dimitern: several possible places for fence post errors, so needs careful checking
[14:46] voidspace, great!
[14:46] voidspace, you have a review btw
[14:47] voidspace, eww... my first comment got awkwardly formatted
[14:47] dimitern: I understand
[14:47] dimitern: and I don't know a better way than string comparison
[14:47] dimitern: I agree it's icky
[14:47] dimitern: suggestions welcomedc
[14:47] *welcomed
[14:48] dimitern: ericsnow has another PR that does the same thing but hides it in a function - I think I'll just merge that
[14:48] voidspace, this is an error coming from mgo, right?
[14:49] dimitern: right
[14:50] voidspace, I have also added a 5m sleep after status reports all machines has-vote. that change did improve the pass rate.
[14:50] sinzui: cool
[14:50] sinzui: only "improve", or "fixes the problem"?
[14:50] shame to add five minutes to the run time
[14:50] improved
[14:50] :-/
[14:51] a automatic retest passed. So we passed slowly
[14:51] if a five minute sleep doesn't solve this problem then a half hour sleep wouldn't - there's some other issue
[14:51] that at least is progress
[14:52] dimitern, how goes your work with bug 1397376
[14:52] Bug #1397376: maas provider: 1.21b3 removes ip from api-endpoints
[14:54] sinzui, I'm about to propose a fix for 1.22, just fixing a final test
[14:54] \o/
[14:54] sinzui, it took a lot of time to ensure I don't break something; live tested on maas, local, canonistack, ec2
[14:55] understood
[14:55] voidspace, ah, too bad then - go with string comparison :)
[14:59] voidspace: o/
[15:32] ericsnow: hey, hi
[15:32] ericsnow: I made a change to IsReady
[15:33] ericsnow: I'd like your agreement before I merge
[15:33] ericsnow: http://reviews.vapour.ws/r/583/
[15:33] voidspace: k
[15:33] * ericsnow takes a look
[15:33] ericsnow: with the current implementation, if one state server in an HA environment goes down
[15:33] ericsnow: and *then* the user tries to backup
[15:33] ericsnow: IsReady will always report false and they can't backup
[15:33] ericsnow: so I changed IsReady to check for majority healthy rather than all healthy
[15:34] ericsnow: (majority healthy is the requirement a mongo replicaset has to be functioning)
[15:43] mm, tests dont like to be ctrl+z
[15:43] voidspace: why are we using health at all? It's not at all clear what "up" means..... plus, there's a bug in the logic, because the member you're getting the status from always excludes its own value for health, so it'll default to false.
[15:43] voidspace: there's member status which seems much more detailed about what each state actually means.
[15:45] ericsnow, voidspace ping
[15:45] alexisb: pong
[15:45] alexisb: hey
[15:46] hey guys, do you guys know where the April PyCon is going to be held?
[15:46] natefinch: interesting, that doesn't seem to happen in practise
[15:46] ie what location
[15:46] natefinch: so I think your're wrong about defaulting to false
[15:46] alexisb: montreal
[15:46] alexisb: montreal
[15:46] ok thanls
[15:46] thanks
[15:46] alexisb: the proposed sprint date clashes with the pycon sprints
[15:46] alexisb: or is that deliberate?
[15:46] yep that is what I am working on
[15:46] alexisb: cool
[15:47] natefinch: if you were right, the current implementation would *always* return false
[15:47] natefinch: so backup could *never* work - which isn't what we're seeing
[15:52] sinzui, voidspace: the current failure (connection refused) with the HA backup CI test may not be HA-related
[15:52] sinzui, voidspace: see http://reviews.vapour.ws/r/590/
[15:53] so, an API client is only good for a single request (and then disconnects)?
[15:53] anyone ^
[15:54] alexisb: I have a talk and poster session on juju submitted to pycon
[15:54] alexisb: no idea if either has been (or will be) accepted yet
[15:54] voidspace, awesome
[15:54] ericsnow, I don't know, but I have hope that the 5m sleep gives juju a moment to gather its wits
[15:54] hehe
[15:54] sinzui: I think it helped
[15:54] ericsnow: I'm merging your PR with the better error checking for IsReady with mine
[15:55] just running tests
[15:55] voidspace: hmm that;s true. Weird. I wonder why it doesn't work that way... since the docs definitely say the value doesn't exist for the member you get status from, and therefore the bool should default to false
[15:55] ericsnow, agreed, the test is still retried, but at least its odds of success are better
[15:56] voidspace: I don't think we need to switch to WaitUntilReady to fix the current failures
[15:56] natefinch: my defence is that *that* code (in juju) pre-exists my PR which just waits for it
[15:56] voidspace: http://docs.mongodb.org/v2.4/reference/command/replSetGetStatus/
[15:56] voidspace: though the fix to IsReady stands on its own
[15:56] ericsnow: I think it's an improvement and we *definitely* need to move to checking for majority healthy rather than all
[15:56] voidspace: oh yeah, totally... I just wondered if you knew why it was there. I guess I should ask git who to ask
[15:56] ericsnow: otherwise if one state server goes down you can't backup!
[15:56] natefinch: it's ericsnow :-)
[15:57] natefinch: I did look in detail at the CurrentConfig and CurrentStatus meanings, but it was a little while ago now
[15:57] natefinch: I just did what we were doing elsewhere in juju
[15:57] maybe mgo does helpful magic for us
[15:58] natefinch: actually, the code I added just calls other code that predates mine
[15:58] natefinch: that code is what does the check IIRC
[15:58] natefinch: in that doc I see "self" included in members with meaningful data
[15:58] i.e. health and state
[15:58] with "self": true
[15:59] perrito666: here's that fix to try out: http://reviews.vapour.ws/r/590/
[15:59] ericsnow: going
[15:59] natefinch: "The members field holds an array that contains a document for every member in the replica set."
[15:59] natefinch: what are you reading?
[16:00] voidspace: replSetGetStatus.members.health
[16:00] The health value is only present for the other members of the replica set (i.e. not the member that returns rs.status.) This field conveys if the member is up (i.e. 1) or down (i.e. 0.)
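The "majority healthy" rule and the question of the responding member's missing health value can be sketched as follows. This is only an illustration in Python, not juju's Go implementation: the function names are hypothetical, and the input is assumed to be a list of member documents shaped like the `members` array that replSetGetStatus returns, with an optional `health` field and a `self` flag on the member that answered.

```python
from typing import Any, Mapping, Sequence


def member_is_up(member: Mapping[str, Any]) -> bool:
    """Hypothetical helper: decide whether one member counts as up."""
    if member.get("self", False):
        # Assumption: the member that answered the status request is
        # reachable, so count it as up even if "health" is absent.
        return True
    # A missing health field is treated as down, like a bool zero value.
    return member.get("health", 0) == 1


def majority_healthy(members: Sequence[Mapping[str, Any]]) -> bool:
    """Return True when strictly more than half of the members are up."""
    if not members:
        return False
    up = sum(1 for m in members if member_is_up(m))
    return up > len(members) // 2
```

With three members, one of them down, this still reports ready, which is the backup scenario described above; with two of three down it does not.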
[16:00] natefinch: hah, ah right
[16:01] natefinch: indeed, I concur with your reading
[16:01] ericsnow: you will need a bit of patience, this test takes a bit to run
[16:01] natefinch: so we should be checking for Member.self and assuming health: 1 for that member
[16:01] else you couldn't connect to it
[16:01] perrito666: I figured as much
[16:01] perrito666: thanks
[16:01] but looks like we don't need to in practise
[16:02] natefinch: ah, but see the MemberStatus struct
[16:02] natefinch: and specifically the Healthy field
[16:02]
[16:02] // Healthy reports whether the member is up. It is true for the
[16:02] // member that the request was made to.
[16:02] natefinch: so we specifically don't need to do that
[16:03] ericsnow: so this is my latest version: http://reviews.vapour.ws/r/583/diff/#
[16:03] ericsnow: my feeling is that it's only an improvement
[16:03] ericsnow: and so we should merge
[16:04] voidspace: yeah, but I wrote that, and the code does not seem to back it up (assuming the docs on mongodb are more likely to be accurate than my code comments)
[16:04] natefinch: yet it seems to be true in practise
[16:05] voidspace: agreed
[16:05] ericsnow: I'm going to hit merge then, we'll see if it helps
[16:05] voidspace: thanks for doing that
[16:06] ericsnow: we'll see...
[16:06] I need a break, biab
[16:08] does anyone know if an API client is only good for a single request (and then disconnects)?
[16:09] sinzui: mgz_ abentley "ImportError: No module named boto" <-- what is boto?
[16:09] perrito666, it is a python lib for talking to aws
[16:10] perrito666, sudo apt-get install boto
[16:10] voidspace: BTW, the "connection refused" condition for isConnectionNotAvailable should probably be dropped
[16:10] sinzui: oh ok, I though it was some internal module from ci
[16:10] perrito666, sudo apt-get install python-boto
[16:10] voidspace: that message came from the API, not from mongo
[16:12] cmars: could I get a review on http://reviews.vapour.ws/r/590/? (you're OCR, right?)
[16:12] cmars: if I understood right, it should help unblock CI
[16:12] ericsnow, yep, i'll take a look, was just reading 583
[16:12] perrito666: what sinzui said, but I think it's python-boto
[16:12] mgz_: it is
[16:12] cmars: ta
[16:12] ericsnow, ok, awesome. i know nothing about HA & replica sets, so i'll probably have lots of stupid questions
[16:13] cmars: well, my patch is unrelated to HA :)
[16:13] ericsnow, even better :)
[16:13] cmars: and it's small
[16:16] ericsnow, why can't you use an API client for more than a single request?
[16:16] cmars: I haven't gotten any answer on that yet :P
[16:17] cmars: I find it surprising if true (which is what perrito666 explained to me recently)
[16:17] ericsnow, hmm. let me look a bit at NewAPIClient just to see if this mostly harmless. would i be likely to see any difference in tests passing in my dev env with vs without this patch?
[16:18] cmars: in the interest of getting CI unblocked I figured I'd take the chance that the patch helped (since it doesn't hurt)
[16:18] * cmars wishes CI could test PR branches for these kinds of experiments
[16:18] but i don't want to be terribly difficult either
[16:19] cmars: no worries
[16:19] cmars: it's doable
[16:19] let me look over that api client just a few minutes & I'll let you know.
[16:19] cmars: I definitely want an answer to the API client question either way :)
[16:19] cmars: thanks
[16:20] mgz_: that would be awesome
[16:20] mgz_, thanks. is it difficult?
[16:20] mgz_: from what perrito666 told me, it's a pain setting things up to run CI tests yourself
[16:26] ericsnow: it not super fun, the other option is we can send alternative branches through
[16:28] mgz_, wasn't trying to be difficult, i was thinking of the github pull request jenkins plugin. i think it'd be able to pull something like this off (though maybe it is tricky to fit into the existing setup?)
[16:29] cmars: sure but if you're talking about actual ci runs rather than just the gating, that's too much to do on every push to a proposed branch
[16:29] mgz_, ah, that's true, it'd have to be a targeted approach and maybe that's where it falls down
[16:30] natefinch: on second thought, using member.Healthy in IsReady is strictly my fault
[16:31] natefinch: I made the decision based on the doc comments in MemberStatus
[16:31] natefinch: so I'm sure it could stand improvement :)
[16:33] mgz_: for Python core development the CI is set up to allow running against a branch on a dev repo on request
[16:33] mgz_: that works well
[16:35] ericsnow, is the backups httpClient reusing the API websocket http connection to upload and download files?
[16:35] cmars: nope
[16:35] mgz_: hit this before? http://pastebin.ubuntu.com/9384377/
[16:36] ericsnow, where does it get its httpClient initialized?
[16:36] perrito666: got your ec2 creds soruced?
[16:36] cmars: pretty sure its api/http.go
[16:36] mgz_: nope, I dont recall being required before, though it makes sense
[16:36] ericsnow, ah, ok thanks
[16:37] cmars: ericsnow if you can find out why cient fails to being re-used you get each a beer on me next sprint
[16:37] perrito666: you can also have the ccred sin the environment.yaml name you pass
[16:37] cmars: that code follows the precedent of charm download and tools upload
[16:37] mgz_: I am passing: WORKSPACE=$(pwd) ./assess_recovery.py --ha-backup /home/hduran/gocode/bin perritoec2
[16:38] perritoec2 being my ec2 env
[16:38] perrito666: so, if perritoec2 had the creds, I'd expect that excdeption not to happen
[16:38] perrito666, do you see client re-use issues in other facades or just backups?
[16:38] I always just source my creds instead of putting them in the yaml
[16:38] mgz_: it has them, since it works :p
[16:39] cmars: I have not actually tried, but most likely only backups facade
[16:39] cmars: i first encountered this error while trying to retry actions waiting for upgrade to finish
[16:39] mgz_, perrito666: ^^^ in the restore patch
[16:40] (http://reviews.vapour.ws/r/298/)
[16:40] perrito666: oh, right (pre-dating restore)
[16:40] ericsnow: ?
[16:41] ericsnow: I am trying your patch again now
[16:41] perrito666: thanks
[16:48] ericsnow, LGTM'd it. let's give it a shot
[16:48] cmars: k
[16:49] cmars: opening a bug on the api client re-use thing first
[16:50] ericsnow, perfect
[16:50] ericsnow: we should write a facade call empirical test to make succesful and unsuccesful facade calls and see how the client behaves there
[16:56] ericsnow> natefinch: I made the decision based on the doc comments in MemberStatus
[16:56] ericsnow: and that's what you get for basing your implementation on my shoddy documentation ;)
[16:56] natefinch: :)
[16:57] natefinch: and it was quite late and I was working on unblocking CI so perhaps I wasn't as rigorous for the sake of urgency :)
[17:05] ericsnow: well my test run failed for the second time without relation to your patch :p running a third one
[17:05] perrito666: k :(
[17:07] natefinch: want me to open a bug on IsReady?
[17:09] ericsnow: I don't know for sure that it needs a bug... it's entirely possible that Health is perfectly acceptable and does The Right Thing... I'm just wary of trusting vague docs :)
[17:10] :p MayBeReadyOrJustAlways0
[17:10] natefinch: well, I'll open a bug just so we make sure we at least look into it :)
[17:10] natefinch: we can easily close it
[17:10] ericsnow: good point
[17:18] natefinch: https://bugs.launchpad.net/juju-core/+bug/1399730 (feel free to comment on the bug)
[17:18] Bug #1399730: replicaset.IsReady should check MemberStatus.State rather than Healthy.
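One way to read the bug title is a readiness check driven by each member's replica set state rather than its health flag. The sketch below is an assumption-laden illustration in Python, not the fix that went into juju: the function name is hypothetical, the choice of which states count as settled is a guess, and only the numeric codes for PRIMARY (1), SECONDARY (2) and ARBITER (7) are taken from MongoDB's documented member states.

```python
from typing import Any, Mapping, Sequence

# MongoDB replica set member state codes used by this sketch.
PRIMARY = 1
SECONDARY = 2
ARBITER = 7

# Assumption: these are the states in which a member is "settled".
SETTLED_STATES = frozenset({PRIMARY, SECONDARY, ARBITER})


def is_ready_by_state(members: Sequence[Mapping[str, Any]]) -> bool:
    """Ready when a primary exists and a strict majority is settled."""
    if not members:
        return False
    has_primary = any(m.get("state") == PRIMARY for m in members)
    settled = sum(1 for m in members if m.get("state") in SETTLED_STATES)
    return has_primary and settled > len(members) // 2
```

Unlike a health flag, the state value distinguishes a member that is merely reachable from one that has finished starting up, which may be closer to what "HA is ready" needs to mean here.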
[17:19] natefinch: in fact, feel free to pick up the bug ;)
[17:38] is there any way from the command line to get juju to print the environment UUID?
[17:41] rogpeppe: the only one I know of is a little indirect: "juju backups create" :)
[17:41] ericsnow: i'm not sure i'm gonna use that one :)
[17:41] rogpeppe: the env UUID in the backup metadata
[17:41] rogpeppe: yeah, I figured :)
[17:41] i guess it might be good as part of juju status
[17:41] rogpeppe: +1
[17:42] rogpeppe: does the juju api endpoints command do it?
[17:42] rick_h_: i don't think so
[17:42] rick_h_: that just prints the api endpoint addresses
[17:42] k
[17:43] sinzui: both my and voidspace's fixes have landed so I'm hopeful for the HA backups CI test result
[17:43] rock
[17:44] sinzui: keep in mind that our fixes don't really address the underlying concern we have with knowing for sure that HA is ready
[17:45] ericsnow, understood. I hope all our work improve the test's success.
[17:45] sinzui: me too!
[17:45] ericsnow, CI found the new commits and has started the tests.
[17:45] sinzui: cool
[17:50] alexisb: we're still on in 10 minutes, no?
[17:50] ericsnow, yep
[17:50] alexisb: k
=== kadams54 is now known as kadams54-away
[17:57] ericsnow: pre-good news, the test passes
[17:57] perrito666: cool
[17:57] perrito666: and it fails without the patch?
[18:00] ericsnow: I did not try, but I assumed so
[18:00] I am runing the same code as CI
[18:00] perrito666: cool
[18:00] perrito666: CI should be running the test in a little while
[18:00] ericsnow: lets hope with the same results
=== kadams54-away is now known as kadams54
[18:15] ericsnow, just LGTM'd 549
[18:30] cmars: thanks!
[18:30] katco, mind if I take a few minutes to chat with natefinch before we meet?
[18:30] alexisb: not at all!
[18:30] Hello All.
[18:30] alexisb: ty for asking :)
[18:32] cmars: would you mind reviewing a couple more for me (http://reviews.vapour.ws/r/591/ and http://reviews.vapour.ws/r/578/)?
[18:32] ericsnow, already on 591
[18:32] cmars: http://reviews.vapour.ws/r/557/ too if you feel up to it :)
[18:32] cmars: awesome
[18:37] sinzui: how soon do you think we'll know on the CI results for functional-ha-backup-restore?
[18:38] ericsnow, about an hour. We are in the dull period where CI is publishing streams for the cloud-based tests
[18:38] sinzui: k
[18:38] g'night all
[18:38] have a happy weekend
[18:39] voidspace: thanks for all your help
[18:39] EOW
[18:39] ericsnow: see you on monday
[18:43] sinzui: FWIW, perrito666 verified that with my patch the CI test passed on his EC2 instance
[18:44] fab
[18:48] ok katco ready and joing the hangout
[19:12] cmars: thanks for those reviews
=== kadams54 is now known as kadams54-away
[19:23] sinzui: looks like the test made it past the spot where it's been failing :)
[19:25] ericsnow, fingers crossed :)
[19:31] sinzui: so the test passed and I'm confident that it will pass in re-test
[19:32] sinzui: should that blocker bug be marked as fixed-released or should the CI tag be dropped?
[19:33] ericsnow, I am marking it as fix released
[19:33] sinzui: okay, cool
[19:33] sinzui: thanks for all the help
[19:33] sinzui: I think that sleep really helped
[19:56] alexisb: http://www.pvponline.com/comic/2007/12/06/kringus-risen-part-4
[19:56] lol nice!
[19:56] alexisb: we call our cat kringus because he likes to get in the tree :)
[20:19] cmars: I'm guessing you got those errors from a header file in the mongo source...
[20:20] cmars: are they exposed in mgo at all?
[20:22] cmars: or are those from errno?
[20:39] ericsnow, I grepped 'connection' out of the go stdlib :)
[20:39] just for a sample
[20:39] ericsnow, mgo doesn't have anything specific to connection errors, other than when trying to dial
=== kadams54 is now known as kadams54-away
[20:41] cmars: okay, I'll go with the ones you listed (see http://golang.org/pkg/syscall/ and http://golang.org/src/pkg/syscall/zerrors_linux_amd64.go)
[20:49] ericsnow: beware using anything from syscall. It should set off warning bells in your head if you ever import it. It is explicitly not cross-platform compatible. Sometimes the methods & types will be the same across platforms, and sometimes they won't be. We need to at least make sure that everything compiles on all our platforms (by which of course I mean Windows)
[20:51] natefinch: good point
[20:52] natefinch: in this case the errnos are set up for most (all?) platforms go supports, including windows
[20:53] natefinch: the errno constants are all I need
[20:53] ericsnow: ok, just wanted to make sure. Syscall always gives me the heebie jeebies :)
[20:53] natefinch: hey, good thinking
=== kadams54-away is now known as kadams54
[21:07] cmars: I've updated http://reviews.vapour.ws/r/591/ with those errnos
[21:08] cmars: do you think you'll have time to look at http://reviews.vapour.ws/r/557/?
[21:08] cmars: it's a large patch, but most of it is mechanical
[21:09] cmars: I tried to break it down helpfully in the description
[21:13] ericsnow, thanks. yep, i'll take at 557 next
[21:14] cmars: awesome! thanks for grinding through these :)
[21:28] ericsnow, i don't think i'm qualified to approve 557, sorry
[21:28] cmars: no worries :)
[21:28] cmars: thanks for taking a look
[21:30] cmars: I'll see if I can talk axw into it on Monday :)
[21:44] ericsnow: I wonder if we shouldn't just put availabilityZone into hardware characteristics
[21:46] natefinch: I considered that but from my cursory introduction to that corner of juju it didn't seem to fit quite right there
[21:46] natefinch: what would be the motivation?
[21:46] natefinch: other than reducing the churn in that patch :)
[21:48] ericsnow: reducing churn is a noble goal. But it also just seems like something you might want to pass around with hardware characteristics.
[21:50] natefinch: I suppose, but it still doesn't seem like a good fit IMHO
[21:51] ericsnow: at least as good a fit as "tags". AZ can't change for the hardware anymore than the architecture can. Where the hardware is seems like a valid piece of info about the hardware.
[21:51] natefinch: FWIW, on IRC axw gave me a non-binding LGTM on the approach after a cursory review
[21:51] lol
[21:51] natefinch: fair enough
[21:52] ericsnow: I think it's worth bringing it up to others first... I don't want to muck up other parts of the code if changing HW chars has other specific meaning, but from a pure code point of view, it's a lot better than adding yet another return value to all these functions.
[21:53] natefinch: sounds good
[21:54] natefinch: at once point I believe I had the AZ as stored on the Instance (and expected with an AvailabilityZone method)
[21:55] s/expected/exported
[21:57] ericsnow: I thought of that too... I think HWC is slightly better.. but not 100%
[21:57] natefinch: agreed; instance.Instance is more about stuff that I expect can change
=== kadams54 is now known as kadams54-away
[23:10] i'm done. have a good weekend juju people