[00:00] <anastasiamac> sinzui: considering extended block of ci, is there any word on extension to feature freeze date?
[00:00] <perrito666> sinzui: can I bail out blaming azure?
[00:00] <perrito666> sinzui: the good news is, the new restore runs a couple of times faster than the previous one
[00:01] <sinzui> perrito666, ci will retest with another substrate and it might blame you
[00:01] <perrito666> something in the order of 4 to 5 times in my machine
[00:01] <sinzui> anastasiamac, no there are no plans to extend the deadline
[00:00] <perrito666> sinzui: oh no no, we have to blame the test incorrectly looking for the output and THEN you can blame me
[00:02] <anastasiamac> sinzui: thnx :D
[00:06] <rick_h_> alexisb: not looked yet looking
[00:11] <rick_h_> sinzui: dimitern so we have a plan for the juju-client stuff then? Our guys hit this today and we were just starting to look into it but hadn't nailed down it was the latest release
[00:12] <sinzui> rick_h_, alexisb and xwwt aren't keen to work on 1.21 given that 1.22 is supposed to be stable.
[00:13] <rick_h_> sinzui: ok
[00:13] <rick_h_> I think that's what I told Makyo today was that he should be running the latest 1.22 release
[00:13] <rick_h_> so we didn't file anything until he upgraded and checked it out
[00:13] <sinzui> rick_h_, there is a patch for jujuclient and I know a build is circulating. we can put that version in the stable ppa
[00:14] <rick_h_> sinzui: rgr ok we'll have to vendor that into the GUI/quickstart then. Will add a card.
[00:14] <rick_h_> sinzui: do you know who's got that build going and where I can watch it?
[00:14] <sinzui> rick_h_, It would be nice if more stakeholders used the proposed streams since they demanded something in between devel and stable
[00:15] <dimitern> rick_h_, AIUI a patch for python-jujuclient was done- https://bugs.launchpad.net/juju-core/+bug/1425435/comments/11
[00:15] <mup> Bug #1425435: juju-deployer/jujuclient incompatibility with 1.21.3 <api> <cts> <network> <oil> <openstack> <regression> <uosci> <juju-core:Triaged> <juju-core 1.22:Triaged> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1425435>
[00:15] <sinzui> rick_h_, https://launchpad.net/~ahasenack/+archive/ubuntu/python-jujuclient
[00:16] <rick_h_> sinzui: understand. I think the big thing is that we're running the dev stuff in progress and not the older versions.
[00:16] <rick_h_> and have been wanting to get our tools into the Juju cross version testing on the QA end vs implementing cross juju version testing on our end
[00:17] <sinzui> rick_h_, well, we have 3 versions of juju. that is mad. we are months behind in releases, so we never release betas from trunk
[00:17] <rick_h_> sinzui: understood and agreed
[00:18] <rick_h_> sinzui: ok, well we've got gui and quickstart releases planned for tomorrow so we'll make sure this dep update gets into them and should have fixes on our ends tomorrow. Thanks for the info/heads up
[00:20] <sinzui> rick_h_, we deploy the landscape bundle with quickstart with every juju revision we test. And we do it on 3 maases, aws, hp, and joyent. We know the current juju likes quickstart
[00:21] <rick_h_> sinzui: right, the issue here is that it was the older juju that would/should have broken correct?
[00:21] <rick_h_> sinzui: basically the orange boxes lag behind stable a bit and they got bit by it it seems to me
[00:23] <sinzui> rick_h_, we tested 1.20, 1.21, 1.22 and 1.23 after the issue was reported, but packaging assumes that the set of packages is consistent. Er. juju/stable packages work together, but ubuntu taking one package doesn't mean ubuntu has a stable set
[00:23] <rick_h_> sinzui: ah gotcha
[00:34] <sinzui> wallyworld, the voting 386 unit-test job passed. The non-voting unit tests jobs did not. We are still good for a release
[00:34] <wallyworld> great
[00:55] <perrito666> sinzui: restore merged
[00:56] <sinzui> noted
[00:56]  * perrito666 braces
[00:56] <perrito666> I feel as if we should go celebrate this
[01:09] <axw> wallyworld: thanks for merging my branch
[01:09] <wallyworld> axw: no, wanted to get in before the rush :-)
[01:09] <wallyworld> np
[01:21] <jw4> menn0: thanks for merging my branch too :) - I was waiting for all the important ones to land first
[01:26] <sinzui> wallyworld, in a few minutes, this page will show we have all blesses http://reports.vapour.ws/releases
[01:27] <wallyworld> \o/
[01:27] <sinzui> CI is testing 31ca67f now
[01:27] <perrito666> sinzui: how can i see http://juju-ci.vapour.ws:8080/job/functional-ha-backup-restore now?
[01:28] <sinzui> perrito666, developers get their own login now to see all and even retest
[01:28] <perrito666> was there a mail about that?
[01:31] <menn0> jw4: I merged your branch?
[01:31] <menn0> jw4: I thought all I did was comment on the PR
[01:31] <sinzui> perrito666, there wasn't because I don't know how to distribute the password. I am telling everyone who asks
[01:31] <jw4> menn0, really? Interesting
[01:32] <jw4> menn0, well it merged
[01:32] <menn0> jw4: sweet :)
[01:32] <menn0> jw4: I think I know what happened
[01:32] <jw4> menn0, interesting discovery...
[01:33] <menn0> there was already a $$merge$$ somewhere in the comment thread and my comment triggered the bot to look at the PR again. it saw the earlier $$merge$$ and decided to send it off again.
[01:33] <jw4> yep, that makes sense
[01:33] <menn0> that's a little bit silly I guess but oh well
[01:33] <jw4> menn0, not sure if it's a bug or a feature
[01:33] <jw4> probably the first
[01:33] <jw4> :)
[01:34] <menn0> jw4: i can understand why it works that way. otherwise the bot would have to keep state about each PR.
[01:35] <jw4> menn0, makes sense - if it only looked at the *last* message it may miss the $$merge$$ if someone comments again, and if it looks at 'n' last messages.... where do you draw the line?
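menn0's explanation of the stateless bot can be sketched roughly as below. This is an illustrative reconstruction, not the actual lander's code; `shouldMerge` and the comment-thread shape are hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// shouldMerge mimics a stateless landing bot: on every new comment it
// rescans the whole thread rather than keeping per-PR state, so an old
// $$merge$$ re-triggers a merge even when the newest comment is unrelated.
func shouldMerge(comments []string) bool {
	for _, c := range comments {
		if strings.Contains(c, "$$merge$$") {
			return true
		}
	}
	return false
}

func main() {
	thread := []string{"LGTM", "$$merge$$", "just a late review comment"}
	fmt.Println(shouldMerge(thread)) // prints "true": the late comment re-triggers
}
```

This is the trade-off menn0 describes: scanning the full thread avoids keeping state, at the cost of re-firing on already-handled $$merge$$ comments.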
[01:35] <wallyworld> axw: you saw the maas meeting tonight?
[01:35] <axw> wallyworld: yep
[01:35] <perrito666> uff I am far down the queue
[01:42] <menn0> jw4: exactly
[01:45] <dimitern> jw4, I added the last $$merge$$ btw :)
[01:54] <perrito666> lol cherylj did you see what reviewboard added to your rev 1000?
[01:59] <wallyworld> axw: for later when you have a moment http://reviews.vapour.ws/r/1017/
[01:59] <jw4> dimitern, oh yeah - thanks :)
[02:00] <cherylj> perrito666: yeah, I did!
[02:01] <perrito666> cherylj: you got the prize, I wonder if those pop at other numbers :p I'll be standing by
[02:01] <cherylj> perrito666: I was joking with katco when I submitted that review, asking if I get a trophy and it turns out that I do!
[02:03]  * perrito666 frantically hits refresh on the backup restore jenkins test
[02:36] <axw> wallyworld: getting a cup of tea, will review your branch when I return
[02:36] <wallyworld> sure, np
[02:38] <perrito666> wallyworld: I managed to fix the tests for the uniter in uas_new_statuses branch so if no other tests are broken tomorrow I might add the compatibility layer
[02:38] <wallyworld> perrito666: awesome, ty
[02:38] <perrito666> It's near midnight for me so I'll just go to sleep knowing that restore is finally merged
[02:38] <perrito666> :p
[02:38] <wallyworld> perrito666: yeah, well done on restore
[02:38] <perrito666> I have been working on that since vegas :p
[02:38] <perrito666> that is almost a year
[02:38] <perrito666> lol
[02:39] <wallyworld> but now it's done
[02:39] <perrito666> restore is never done muhahahah
[02:39] <perrito666> anyhow, good night all
[02:54]  * thumper needs to go make a coffee
[02:59] <alexisb> perrito666, well done man!  sleep well
[03:03] <alexisb> jam, leads call?
[03:13] <sinzui> wallyworld, thumper looks like gcc/ppc compilation errors quickly merged into master
[03:15] <wallyworld> sinzui: you mean people had issues in their queued up branches?
[03:15] <sinzui> yep
[03:15] <sinzui> we are already broken
[03:16] <sinzui> I am gathering the log before the curse is revealed in 45 minutes
[03:17] <sinzui> wallyworld, sort of good news, It is the multiple definition error again
[03:17] <alexisb> sinzui, we should not be blocking on known gccgo bugs
[03:18] <sinzui> alexisb, this isn't a known gcc bug; the test passed, now it does not
[03:19] <alexisb> thumper, davecheney we will want to be prepared to triage this asap ^^^
[03:20] <thumper> sinzui: I thought we were going to have ppc not block?
[03:20] <thumper> sinzui: the branch that cherylj had reverted before triggers a bug in ppc, but it is a bug in gcc on power
[03:20] <sinzui> thumper, how can I tell you haven't broken ppc64el again if I cannot use a test suite
[03:21] <thumper> well... it fails to compile, and it is a gcc issue
[03:21] <davecheney> alexisb: there is no point in triage
[03:21] <davecheney> we'll never get the fix merged back into T and V in time
[03:21] <sinzui> thumper, as angry as I am, I will say nothing; triage it as you will
[03:22] <sinzui> thumper, bug 1425788
[03:22] <mup> Bug #1425788: multiple definition of http.HandlerFunc <ci> <gccgo> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1425788>
[03:23] <sinzui> oops, even my fingers make it critical when I say I won't.
[03:23] <sinzui> thumper, lower it if you wish
[03:24] <alexisb> thumper, lower it, sinzui we should revisit this with xwwt and team in the morning
[03:25] <thumper> alexisb: lowered to high, comment added
[03:30] <sinzui> good news: the bogus link to get the windows installer for juju has quadrupled the number of win beta testers, but I doubt they know that version won't work with release streams
[03:59] <sinzui> thumper, wallyworld axw: Do either of you have a moment to review http://reviews.vapour.ws/r/1018/
[03:59] <axw> sinzui: done
[04:00] <sinzui> thank you axw
[04:08] <mattyw> jam, are you about yet?
[04:08] <jam> hey mattyw, what's up?
[04:31] <anastasiamac> wallyworld: r u here?
[04:51] <lazyPower> o/
[04:52] <lazyPower> is it possible to disable juju's AZ spread after deployment? running 1.21.3 wgrant is seeing weird behavior out of a havana env when using juju deploy --to zone=foobar
[04:52] <bradm> its an icehouse environment
[04:52] <lazyPower> oh right, sorry
[04:54] <wgrant> Hm, I suppose patching my local juju to ignore the forbidden AZ won't really work, will it.
[04:59] <wgrant> https://bugs.launchpad.net/juju-core/+bug/1311976 looks relevant
[04:59] <mup> Bug #1311976: Support environment-specific placement directives in deploy/add-unit <add-unit> <deploy> <maas> <juju-core:Triaged> <https://launchpad.net/bugs/1311976>
[04:59] <wgrant> Though being able to set the possible AZs would be handy.
[05:00] <lazyPower> wgrant: and you found the magic answer. so you were right based on our soundboarding
[05:00] <axw> wgrant: fraid not. you can add a machine to a specific zone and then deploy --to that
[05:01] <wgrant> Yeah, that's what it seems like I'll have to do.
[05:01] <wgrant> Thanks.
[05:02] <bradm> that doesn't seem like a great answer though, you'll end up manually having to distribute across the AZ
[05:04] <axw> bradm: I agree, it's just a workaround
[05:05] <bradm> is there an existing bug to fix this?  or should we file one?
[05:05] <bradm> and by we, I mean wgrant. ;)
[05:06] <axw> bradm: we do have support for failing over to another AZ, so I guess it's just not checking errors well enough.
[05:06] <axw> bradm: not that I'm aware of
[05:07] <axw> bradm wgrant: if you would file a bug that'd be good. in particular, I'd like to see machine-0.log, which should have the error we need to check to fail over properly
[05:09] <wgrant> axw: In this case the instance gets created but the scheduler fails to find a host for it. I'll set up a clean example and attach the log.
[05:10] <wgrant> Inter-AZ latency is usually multiple milliseconds though, so automatic distribution without a very easy to way to override it on a per-service basis seems like a nasty trap.
[05:10] <axw> wgrant: oh, if nova isn't erroring then that's unhelpful. we currently check for an error saying that no hosts could be found in the specified zone
[05:11] <wgrant> axw: Ah, yeah, the nova boot succeeds, but a second or so later nova-scheduler places it into ERROR saying no hosts were valid or something.
[05:11] <wgrant> "No valid host was found."
[05:12] <axw> I see, perhaps we should be waiting until the scheduler picks it up.
[05:12] <axw> thanks
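The fix axw suggests, waiting until the scheduler has actually placed the instance before claiming success, could look roughly like the sketch below. The status values match nova's (BUILD/ACTIVE/ERROR), but `waitScheduled`, `getStatus`, and the retry shape are assumptions for illustration, not juju's provisioner code:

```go
package main

import (
	"errors"
	"fmt"
)

// waitScheduled polls an instance's status until nova's scheduler either
// places it (ACTIVE) or rejects it (ERROR, e.g. "No valid host was found").
// A provider using this can fail over to another AZ instead of reporting
// success for an instance the scheduler will never place.
func waitScheduled(getStatus func() string, maxPolls int) error {
	for i := 0; i < maxPolls; i++ {
		switch getStatus() {
		case "ACTIVE":
			return nil
		case "ERROR":
			return errors.New("scheduler found no valid host; try another AZ")
		}
		// status is still BUILD/scheduling; real code would sleep between polls
	}
	return errors.New("timed out waiting for scheduler")
}

func main() {
	// Simulate wgrant's case: boot succeeds, then the scheduler errors out.
	statuses := []string{"BUILD", "BUILD", "ERROR"}
	i := 0
	next := func() string { s := statuses[i]; i++; return s }
	fmt.Println(waitScheduled(next, 10))
}
```

The key difference from the behaviour described above is that a successful `nova boot` call alone is not treated as success.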
[05:12] <wgrant> (eg. we couldn't run LP on AWS without manually assigning AZs, because 200 queries to a DB server that was randomly assigned 5ms away is a full extra second on your page load times)
[05:12] <axw> wgrant: there is an outstanding bug to support AZ service constraints - that would solve that issue, right?
[05:13] <axw> oh I see, you want LP and the DB in the same AZ?
[05:14] <axw> there was some work done to allow charms to query AZ properties of remote units, to allow them to self organise
[05:14] <wgrant> Yeah
[05:14] <axw> I'm not sure if it's done, or if it's helpful here
[05:15] <wgrant> Well, I guess eventually I want a DB service and a webapp service with units in both, that prefer to talk to the local units of the other service, but that's more complicated than having one DB service and one webapp service in each AZ and handling the interconnections with relations.
[05:16] <axw> wallyworld: do you remember what happened with service-level AZ constraints?
[05:17] <wgrant> (in the specific case that caused confusion here, one AZ isn't usable by my dev tenant at all. but there are more general issues.)
[05:23] <wallyworld> axw: i think they got dropped? i think we decided placement directives would have to suffice. but can't recall for sure
[05:24] <axw> wallyworld: mk
[05:26] <wallyworld> axw: so just skimming, would the behaviour to skip unsuitable az and try another not suffice?
[05:26] <axw> wallyworld: in this particular case, yes
[05:26] <wgrant> It would solve today's specific issue.
[05:27] <axw> wallyworld: but that leaves the performance issue
[05:27] <wallyworld> and we have that in aws and openstack now
[05:27] <wallyworld> right
[05:27] <axw> wallyworld: yeah, looks like what's in openstack is insufficient
[05:28] <axw> wallyworld: sounds like we need to wait until the scheduler picks up the node before claiming success
[05:28] <wgrant> So it's reasonable to file a bug asking juju to detect this and fall back to another AZ?
[05:28] <wallyworld> i think so
[05:28] <axw> wgrant: yes, we should be doing it already - just seems to be a bug in the implementation
[05:40] <wgrant> "nonce":"user-admin:bootstrap"
[05:40] <wgrant> That doesn't sound very noncey.
[05:47] <wgrant> axw: https://bugs.launchpad.net/juju-core/+bug/1425808
[05:47] <mup> Bug #1425808: OpenStack provider doesn't try another AZ if the scheduler fails to find a valid host <juju-core:New> <https://launchpad.net/bugs/1425808>
[05:48] <axw> wgrant: thanks
[05:49] <axw> wallyworld: does that belong on 1.23?
[05:49] <wallyworld> um
[05:49] <wallyworld> i guess we could say that as it's a bug in an existing feature
[05:50] <wallyworld> we modelled the implementation on what was done for AWS I think
[05:50] <wgrant> AWS clients seem to traditionally wait until the instance is started before returning success.
[05:51] <wallyworld> whereas ours accepts a successful call to run the instance
[05:51] <wallyworld> which you are saying might transition later to error
[05:51] <wgrant> Right. I don't know how your EC2 client works.
[05:51] <thumper> sometimes our code make me weep
[05:51] <wgrant> Or indeed if EC2's API doesn't bother returning until it's starting.
[05:52] <wallyworld> we are triaging 1.23 bugs for the first milestone next week
[05:53] <wallyworld> how important is this?
[05:54] <wgrant> We're redeploying our cloud in a different way soon to avoid this, partly because it's not really possible to effectively use a recent juju in our existing design.
[05:55] <wallyworld> gotta go get kid, bbiab
[06:09]  * thumper is done for the day
[06:09] <thumper> see ya all tomorrow
[09:56] <perrito666> sinzui: I am guessing you did not sleep :( is it possible that aws is misbehaving?
[10:01] <dooferlad> voidspace: hangout time!
[10:02] <voidspace> dooferlad: yeah, my laptop isn't cooperating
[10:02] <voidspace> dooferlad: maybe it needs coffee...
[10:02] <voidspace> dooferlad: with you in a minute
[10:09] <voidspace> TheMue:  dooferlad: hangout plugin keeps crashing on firefox and safari won't start
[10:09] <voidspace> TheMue: dooferlad: looks like it's reboot time
[12:08] <axw> wallyworld: didn't take long for trunk to become blocked again :~(
[12:09] <wallyworld> indeed :-(
[12:09]  * anastasiamac crying
[12:09] <anastasiamac> axw: and just sorted my PR and got like a 3rd shipit on it :D
[12:09] <axw> lol
[12:09] <wallyworld> perrito666: you going to look at restore issue?
[12:10] <wallyworld> poor anastasiamac is crying
[12:10] <wallyworld> she wants to land her branch so badly
[12:10] <anastasiamac> wallyworld: mayb tears of joy?..
[12:10] <wallyworld> yeah, that's it
[12:10] <anastasiamac> wallyworld: but yes, landing will be euphoric!
[12:10] <voidspace> Anyone want me to buy an Ubuntu Phone to bring to the April sprint for them?
[12:11] <axw> if I hadn't just bought a phone...
[12:11] <voidspace> :-)
[12:11] <axw> voidspace: getting yourself one I guess?
[12:11] <voidspace> axw: yeah, about to order
[12:11] <axw> cool
[12:11] <anastasiamac> voidspace: i think i mite want on :D
[12:11] <anastasiamac> one*
[12:12] <voidspace> anastasiamac: shall I order you one?
[12:12] <voidspace> or do you want to think about it
[12:12] <voidspace> we can share shipping cost if I order now...]
[12:12]  * axw goes to rescue wife from crying baby
[12:14] <anastasiamac> voidspace: i was told that we r waiting for them to b legally sold in oz
[12:14] <anastasiamac> voidspace: i'll skip this time ;D
[12:14] <voidspace> anastasiamac: cool
[12:14] <anastasiamac> voidspace: isn't ur new addition due today?
[12:15] <voidspace> anastasiamac: yep :-)
[12:15] <anastasiamac> voidspace: \o/ u must b very anxious - why r u not with her?
[12:15] <anastasiamac> voidspace: ur wife?..
[12:16] <voidspace> anastasiamac: she's at home 300 metres away. We have no internet at the new house (moved last week) so I'm working from a friend's house.
[12:16] <voidspace> anastasiamac: no sign of the baby yet...
[12:16] <voidspace> anastasiamac: she'll ring me if there's any news and I'm less than five minutes away
[12:16] <mgz> the pile-on after block does have the downside of often reblocking :)
[12:17] <anastasiamac> voidspace: congratulations on the move! what a busy season for u :)
[12:17] <voidspace> anastasiamac: yeah, we were hoping to get the move done at least a week earlier but lots of last minute delays on buying the house
[12:17] <voidspace> a really nice house though, so it's worth it
[12:17] <voidspace> anastasiamac: and thanks :-)
[12:17] <voidspace> phone
[12:18] <anastasiamac> voidspace: !
[12:18] <anastasiamac> mgz: but ci gets unblocked so rarely - it becomes a mad rush to land :P
[12:21] <mgz> not sure what the plan ob bug 1425788 is
[12:21] <mup> Bug #1425788: multiple definition of http.HandlerFunc <ci> <gccgo> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1425788>
[12:22] <wallyworld> mgz: gccgo bugs are not to be blockers
[12:22] <wallyworld> a recent gccgo works fine
[12:22] <wallyworld> bug CI is using an old gccgo
[12:22] <wallyworld> but
[12:22] <mgz> well, we're using the packaged one, but sure, the plan then is backport a newer gccgo
[12:23] <wallyworld> yeah, but the politics there is huge :-(
[12:23] <mgz> :)
[12:23] <mgz> so, it's just backup-restore blocking
[12:23] <wallyworld> i think so
[12:24] <mgz> that can probably get fixed today, so should be less of a painful block than the last
[12:24] <wallyworld> we need perrito666 to look at it, will hopefully be unblocked for tomorrow so we can finish work for 1.23 feature freeze
[12:24] <mgz> right.
[12:24] <mgz> worst case, we revert the deprecation I suspect
[12:25] <wallyworld> yep
[12:25] <wallyworld> we'll have to do that tomorrow if not fixed, but i think there were a couple of commits in that area, will need to check
[12:53] <voidspace> dooferlad: I left some comments on your review
[12:54] <voidspace> dooferlad: we can't change the subnetSet to a String Set
[12:54] <voidspace> dooferlad: as the bool value has significance, but map[string]bool is good
[12:54] <voidspace> dooferlad: I think that's what I did in EC2 but forgot to change EC2 - it means one less cast
[13:52] <sinzui> mgz, can you report a bug about the consistent failure in run-unit-tests-precise-i386 in master
[13:53] <sinzui> mgz, I retested the vivid job and it passed. The machine was clean too. I cannot explain why it failed repeatedly hours ago
[13:55] <sinzui> perrito666, While we have consistent failures over two builds, I am retesting the job. I am willing to force it to something like HP or joyent too
[13:56] <mgz> sinzui: sure
[13:56] <sinzui> perrito666, also, as ci is locked down, I think we can enable --debug if that will help you.
[13:56] <mgz> the changes I made to the windows test seemed to have helped
[14:00] <sinzui> mgz fab. I haven't had time to look yet. I queued the release last night and am now releasing
[14:03] <perrito666> sinzui: I am doing a backup and restore by hand and getting success, I'll try to run the CI job here
[14:04] <sinzui> perrito666, well that is good news. are you using aws?
[14:05] <perrito666> sinzui: yup
[14:06] <sinzui> perrito666, okay. I just switched the job to be hp.
[14:07] <perrito666> sinzui: tx
[14:18] <perrito666> sinzui: a requirements.txt would be a much appreciated asset in the ci tools :)
[14:18] <sinzui> perrito666, ah, well I have a work-in-progress that will address that
[14:19] <perrito666> sinzui: :) tx a lot
[14:34] <perrito666> sinzui: familiar with http://pastebin.ubuntu.com/10429862/ ?
[14:35] <sinzui> perrito666, no, but I also know that that function changed a few hours ago.
[14:36] <perrito666> I just bzr pulled and bzr revert-ed
[14:37] <sinzui> perrito666, I am going to revert that change
[14:42] <sinzui> perrito666, tip is reverted
[14:42] <perrito666> tx
[14:44] <ahasenack> hazmat: around?
[14:47] <sinzui> perrito666, Looking at http://juju-ci.vapour.ws:8080/view/Juju%20Revisions/job/functional-ha-backup-restore/1538/console, there is a different error
[14:47] <sinzui> perrito666, do you want me to retry hp?
[14:48] <perrito666> sinzui: yes, odd one, that means, basically that the bootstrap went wrong
[15:04] <Muntaner> good morning, devs
[15:09] <ericsnow> jam: ping
[15:10] <ericsnow> jam: un-ping
[15:10] <Muntaner> having problems in running windows-boilerplate with juju, anyone can help me?
[15:12] <perrito666> this testing suite is going to kill me euca-terminate-instances: error: missing access key ID; please supply one with -I
[15:13] <perrito666> sinzui:  ^ did I miss something in my test call?
[15:13] <perrito666> Muntaner: good morning, tell me more
[15:14] <Muntaner> hi perrito666, I simply downloaded the windows-boilerplate charm
[15:14] <perrito666> Muntaner: point me to it please
[15:14] <sinzui> perrito666, I cannot remember if you need to source ec2 creds first. I suppose so since the key is definitely in the yaml
[15:15] <perrito666> that is new :)
[15:17] <Muntaner> perrito666, chatted you in query
[15:20] <perrito666> natefinch: ericsnow I seem to have forgot standup
[15:23] <perrito666> ok, if this run does not pass (assuming I didnt forget to set up things) Ill revert the patch
[15:24] <natefinch> perrito666: heh sorry, thought you were out
[15:28] <perrito666> nah, dentist was fast and bad, Ill change dentist for the next one :p I was absorbed trying to fix the restore bug
[15:36] <perrito666> agrjaslkdjaslkjds I hate euca
[15:39] <sinzui> perrito666, hp failed again, *before* setup. I am going to try another provider since hp doesn't appear to be suitable
[15:42] <perrito666> I am being frustrated by this euca issue, i just exported EC2_ACCESS_KEY and EC2_SECRET_KEY
[15:53] <voidspace> ah, I thought it was quiet
[15:53] <voidspace> I wasn't connected
[15:54] <alexisb> voidspace, aren't you suppose to be in the delivery room??
[15:54] <voidspace> alexisb: we're having the baby in a pool at home
[15:54] <voidspace> alexisb: I'm down the road at a friend's house where we have internet
[15:55] <voidspace> alexisb: and the baby hasn't made any signs of appearing yet... :-)
[15:55] <perrito666> voidspace: you have a pool? sweeeet
[16:01] <voidspace> perrito666: yeah, it will double as a great paddling pool after the birth (and after we've thrown away the lining - it will look like the remnants of a shark attack immediately after!)
[16:12] <hazmat> ahasenack: i am.. sorry didn't get to it last night was 5hrs on a bus yesterday
[16:12] <hazmat> ahasenack: or something new?
[16:14] <ahasenack> hazmat: new
[16:14] <hazmat> ahasenack: all ears
[16:14] <ahasenack> let me fetch the pastebin
[16:14] <ahasenack> hazmat: http://pastebin.ubuntu.com/10430000/
[16:15] <ahasenack> that with r53 of jujuclient
[16:15] <hazmat> that looks like an error
[16:15] <hazmat> in code
[16:16]  * hazmat checks
[16:18] <ahasenack> hazmat: it's also a bit confusing where python-jujuclient is hosted. There is a project and branch in launchpad, and another one in github
[16:18] <hazmat> ahasenack: interesting, its a by-product of the py3 support. base class for self there is an iterator
[16:18] <ahasenack> ok
[16:18] <hazmat> ahasenack: i was hoping to move it to github, but wanted to try and coordinate with the recipes, and setup the foreign branch import
[16:18] <ahasenack> ok, another issue for another day
[16:19] <hazmat> yeah
[16:19] <ahasenack> I switched my builds from lp:~hazmat/python-jujuclient/trunk to lp:python-jujuclient
[16:19] <hazmat> cool
[16:19] <ahasenack> I think you had setup the former to be a mirror
[16:19] <ahasenack> but doesn't matter now
[16:20] <rick_h_> hazmat: ahasenack so we're working on trying to test and QA the latest python jujuclient and to QA it against quickstart/gui charm so we can update the juju stable ppa
[16:20] <rick_h_> do we know who has pypi permissions?
[16:20] <hazmat> sinzui: when you adding a deployer test to qa for core?
[16:20] <ahasenack> rick_h_: hazmat I suppose, no? He owns it
[16:21] <sinzui> hazmat not yet
[16:21] <hazmat> rick_h_: i do
[16:21] <sinzui> hazmat, we test juju-quickstart with every revision against maas 1.7, aws, hp, and joyent
[16:21] <rick_h_> hazmat: k, we'll QA and then hit you up for a release then? Can we get permission for someone else to help handle uploads due to your network fun?
[16:22] <hazmat> sinzui: thats a good start, but given the amount of real world production workflows we have with deployer, it needs to be in the qa tests for core..
[16:22] <hazmat> rick_h_: qa what?
[16:22] <sinzui> hazmat, I don't disagree, jog was taken off deployer to work on kubernetes
[16:22] <hazmat> rick_h_: so latest releases should always be on pypi
[16:24] <hazmat> rick_h_: i'll add one of the lp juju-deployers to pypi perms..
[16:25] <hazmat> probably tvan
[16:25] <rick_h_> hazmat: well so there's a ppa with a patch and we're chasing does that patch exist in trunk and is his .19 PPA build from trunk and ok to be in the juju stable ppa
[16:25] <dimitern> sinzui, are you around?
[16:25] <hazmat> rick_h_: well gui team doesn't have any extant merges ...
[16:26] <rick_h_> hazmat: right, this patch came from andreas is my understanding
[16:26] <dimitern> mgz, or you?
[16:26] <mgz> dimitern: hey
[16:26] <dimitern> xwwt, mgz, sinzui, i might have the reason why the restore job fails
[16:27] <ahasenack> rick_h_: dpb1 merged that patch, I don't know the author. I just rebuilt the packages
[16:27] <mgz> dimitern: what's your theory?
[16:27] <xwwt> dimitern: What is your thought?
[16:27] <rick_h_> ahasenack: ok, so the patch is in trunk then and you're doing a fresh build for juju stable?
[16:27] <sinzui> dimitern, oh goody. I haven't a clue and gathering evidence is hard
[16:27] <hazmat> rick_h_: for deployer or client?
[16:27] <dimitern> xwwt, mgz, sinzui, so I was looking at the logs of the last successful and first failed jobs
[16:27] <rick_h_> hazmat: python-jujuclient
[16:27] <ahasenack> rick_h_: no, I don't touch juju stable. it's in my ppa, a daily one for jujuclient. It uses LP recipes
[16:28] <dimitern> xwwt, sinzui, mgz, and so far I can see at least 3 separate issues
[16:28] <rick_h_> ahasenack: ok, but that should be good to copy to juju stable then if it qa's ok against quickstart/gui/deployer in there
[16:28] <rick_h_> ahasenack: as it's an 'official python-jujuclient release'?
[16:28] <hazmat> rick_h_: yeah..  that patch was outstanding for not a long time.. like two days
[16:28] <hazmat> afaics
[16:28] <rick_h_> hazmat: right cool. I'm just trying to unblock us releasing the gui/quickstart with the patched python-jujuclient
[16:28] <dimitern> xwwt, mgz, sinzui, 1) the mentioned PR https://github.com/juju/juju/pull/1667/ *did* in fact break the job's expectations and as it is it won't be able to ever pass
[16:29] <rick_h_> hazmat: and to keep people on the juju stable ppa vs the personal ppa of andreas
[16:29] <dpb1> yes, beisner is the author.  but it's straightforward, just an if clause added to deal with the missing data
[16:29] <hazmat> rick_h_: yes.. that's great
[16:29] <hazmat> rick_h_: deployer & client needs help on packaging both for stable ppa and distro
[16:29] <hazmat> in general
[16:30] <rick_h_>  hazmat yea, I'm trying to see what we can do with that as we depend on it :)
[16:30] <sinzui> hurray, dimitern. I can fix a test script, probably quickly
[16:30] <dimitern> xwwt, mgz, sinzui, that's because the job internally calls "juju restore --show-log [--constraints mem=2G]" etc., while after the patch #1667 landed it doesn't matter how you call it - it internally calls
[16:30] <dimitern> cmd := exec.Command("juju", "backups", "restore", "-b", "--file", c.backupFile)
[16:31] <dimitern> which obviously ignores the arguments; 2) issue - https://github.com/juju/juju/pull/1596/ which landed some time ago actually triggers the "/mnt/jenkinshome/jobs/functional-ha-backup-restore/workspace/extracted-bin/usr/lib/juju-1.23-alpha1/bin/juju-backup: line 5: [: ==: unary operator expected"
[16:31] <dimitern> errors, which in turn (AIUI due to the set -xeu) always fail the job when first run
[16:32] <ahasenack> rick_h_: with all that said, I'm still seeing an issue with current jujuclient from trunk, r53: http://pastebin.ubuntu.com/10430000/
[16:32] <rick_h_> ahasenack: ugh ok so trunk isn't ready for release atm
[16:33] <ahasenack> I think not, I had to downgrade jujuclient and juju-core to get some deploys working today
[16:33] <rick_h_> ahasenack: ok, is there a bug for this?
[16:33] <dimitern> that in turn causes the 3) issue - which might be the most minor one - the restore_present_state_server method in assess_recovery incorrectly assumes restore failed early (due to the exit 1)
[16:33] <ahasenack> right now I'm with jujuclient 0.18.4-5 from juju-stable and core 1.20.11-trusty-amd64 from trusty updates
[16:33] <ahasenack> rick_h_: not yet, just pasted it to hazmat a few minutes ago looking for confirmation it's something in jujuclient itself
[16:34] <rick_h_> ahasenack: ok
[16:34] <dimitern> sinzui, that's the other option - fixing the job to call "juju backups whatever" (new style) instead of "juju restore ..." (old style, before the restore plugin was deprecated)
[16:34] <rick_h_> ahasenack: please file a bug and we'll start looking into it
[16:34] <ahasenack> ok
[16:35] <rick_h_> ahasenack: we need to release the python-jujuclient and so will look into the bug/fix to unblock all this then.
[16:35] <rick_h_> jcsackett: will look into it hazmat ahasenack ^
[16:36] <dimitern> sinzui, and *the* better option long term I believe; otherwise we might just apply a hot fix for the stuff landed in #1596 (quote the "$1" in the two ifs there) and #1667 (to properly pass arguments to juju backups, rather than ignoring them)
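The second half of that hot fix, passing the old plugin's arguments through to `juju backups restore` instead of ignoring them, might look like this sketch. The fixed flag list comes from the `exec.Command` call quoted earlier; `restoreArgs` and the sample flags are illustrative, not the actual patch:

```go
package main

import (
	"fmt"
	"os/exec"
)

// restoreArgs builds the new-style "juju backups restore" invocation while
// forwarding whatever extra flags (--show-log, --constraints, ...) the
// deprecated "juju restore" plugin was called with, rather than dropping
// them as the hard-coded command did.
func restoreArgs(backupFile string, extra []string) []string {
	args := []string{"backups", "restore", "-b", "--file", backupFile}
	return append(args, extra...)
}

func main() {
	args := restoreArgs("state.tgz", []string{"--show-log", "--constraints", "mem=2G"})
	cmd := exec.Command("juju", args...)
	fmt.Println(cmd.Args)
}
```

With this shape, a CI job (or customer script) calling the old `juju restore --show-log --constraints mem=2G` keeps working because its flags survive the translation to the new subcommand.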
[16:36] <jcsackett> ahasenack: hi; when you file a bug on this with steps to repro, can you ping me with it?
[16:36] <dimitern> xwwt, mgz ^^
[16:36] <ahasenack> jcsackett: ok
[16:37] <jcsackett> ahasenack: thanks :)
[16:37] <sinzui> dimitern, mgz: I am uncertain. We want to ensure 1.18 compatibility. anyone who scripted the restore will fail as we did
[16:39] <dimitern> sinzui, that's true - so I would've already proposed the "hot fix" option, but I'm not quite certain about the implications of converting old-style juju restore args to the new-style juju backups - perrito666 might tell if that's sane
[16:39] <sinzui> dimitern, exactly my conundrum
[16:40] <dimitern> sinzui, however, I'm not excluding this - because 1.23 *does* deprecate juju restore as plugin, you're expected to see your script job start failing and see why
[16:40] <dimitern> (as a customer - a bit too presumptuous perhaps)
[16:42] <perrito666> dimitern: sorry I was paying attention to my terminal
[16:43] <perrito666> dimitern: have you slept since we last spoke?
[16:43] <sinzui> dimitern, the problem with deprecation is that Ubuntu wont let you obsolete it. The contract must remain consistent for trusty
[16:43] <hazmat> rick_h_: yeah.. it's something about the super and iterators I did for py3 compat.. investigating
[16:45] <perrito666> dimitern: that should work ootb
[16:45] <perrito666> I mean sending the params to new restore
[16:45] <perrito666> brb, dimitern if you have said patch, push it
[16:46] <dimitern> perrito666, oh yes, I got a good 10h thanks :)
[16:46] <dimitern> perrito666, not yet, wanted to check with sinzui and you first
[16:47] <rick_h_> hazmat: k, jcsackett is tasked to look into it today so we can try to unblock a formal release.
[16:47] <dimitern> perrito666, but it should be trivial - I'll get on it now, and ask you for a review
[16:48] <dimitern> sinzui, as for the i386 tests failing and the ppc64 one - easily skippable
[16:48] <hazmat> rick_h_: ok.. in future a little more heads up would be good. i'll work on it some this evening.
[16:48] <rick_h_> hazmat: I'd love more heads up. We just found all this out last night while watching irc tbh. :)
[16:48] <rick_h_> hazmat: so trying to respond today
[16:50] <sinzui> dimitern, +1 for the 386 timeout
[16:52] <ahasenack> jcsackett: reproduced it with this yaml: http://pastebin.ubuntu.com/10432117/
[16:52] <ahasenack> jcsackett: bug has the details: https://bugs.launchpad.net/python-jujuclient/+bug/1426020
[16:52] <mup> Bug #1426020: TypeError: super object is not an iterator <python-jujuclient:New> <https://launchpad.net/bugs/1426020>
[16:53] <ahasenack> local env, but happened also with maas
[16:53] <ahasenack> these are the two I tried
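[Editor's note] The traceback in bug #1426020 points at a classic Python 2/3 iterator-compat pitfall. A minimal sketch (the class names are hypothetical, not jujuclient's actual code): `next(obj)` looks up `__next__` on the *type* of its argument, and a `super` object is just a proxy that doesn't define it.

```python
class Base:
    def __init__(self):
        self._items = iter([1, 2, 3])

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._items)

    next = __next__  # Python 2 spelling of the same method


class Child(Base):
    def __next__(self):
        # Broken py3-compat attempt: raises
        # "TypeError: super object is not an iterator", because next()
        # needs __next__ defined on the type, and super is only a proxy.
        #   return next(super(Child, self))

        # Correct: call the parent's __next__ explicitly.
        return super(Child, self).__next__()


print(list(Child()))  # [1, 2, 3]
```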
[16:58] <Muntaner> hey guys
[16:58] <Muntaner> a good guide to get started with writing charms?
[17:02] <hazmat> ahasenack: thanks
[17:02] <natefinch> Muntaner: https://jujucharms.com/docs/authors-intro
[17:03] <natefinch> Muntaner: the navigation on the page is not very obvious, but if you collapse the "User Guide" section of the menu on the left side, you'll see the section for "Charm Authors" and you should just go down that list of links in order.
[17:04] <natefinch> rick_h_: ^^ Are we going to make navigating these pages easier for people?  They're supposed to be kind of serial, but there's still no way to just go to the "next" page... so for anyone that doesn't already know the layout, actually reading through them can be really frustrating.
[17:11] <abentley> dimitern, sinzui: Because we are expected to maintain compatibility with 1.18+, I think that the "hot fix" option is the only viable option.  We could test "backups restore" in addition to "restore" if we wanted to, but one doesn't replace the other.
[17:11] <dimitern> abentley, indeed
[17:12] <dimitern> abentley, sinzui, in fact after a few tests replicating what the job does the fix turned out to be much smaller than I expected
[17:12] <abentley> dimitern: Excellent!
[17:12] <dimitern> 4 characters to be exact :)
[17:13]  * sinzui awards dimitern  with 4 gold stars ★★★★
[17:15] <dimitern> :D
[17:15] <mgz> :)
[17:17] <dimitern> sinzui, let's see if it'll work on CI - http://reviews.vapour.ws/r/1021/ << perrito666
[17:19] <rick_h_> natefinch: we've nothing on the roadmap for docs other than supporting multiple versions which is in progress. We rely on the docs folks to manage that at this time.
[17:21] <dimitern> perrito666, sinzui, abentley, should we try landing this? ^^ at least ISTM it won't cause more harm if it doesn't work
[17:22] <natefinch> rick_h_: ok, then, where do I file a bug against the docs? :)
[17:22] <sinzui> dimitern +1  from me
[17:22] <rick_h_> natefinch: github.com/juju/docs
[17:22] <rick_h_> ?
[17:22] <dimitern> cool, let's roll then :)
[17:24] <natefinch> rick_h_: ahh, cool, I didn't think to look there... assumed they were using launchpad.  This is much easier
[17:24] <perrito666> dimitern: but that patches backup not restore :p
[17:25] <perrito666> dimitern: i think your patch can't harm
[17:25] <dimitern> perrito666, these missing quotes are the reason for: "juju-backup: line 14: [: ==: unary operator expected: exit 1"
[17:26] <dimitern> AIUI at least
[17:26] <perrito666> dimitern: mm, i am not sure, but it will take some clutter off the errors, so for me, go
[17:26] <coreycb> jw4, hi, what's the status of juju actions these days?  I'm getting hits for "actions" in the git log so I'm getting a good feeling. :)
[17:27] <jw4> coreycb: they're pink and freshly scrubbed
[17:27] <jw4> coreycb: we're actually hoping to get the feature flag removed for the code freeze tomorrow
[17:28] <perrito666> oh ffs, whatever version of euca i am using requires -I and -S
[17:28] <jw4> coreycb: at which point actions will include a functional basic command line
[17:28] <coreycb> jw4, awesome!
[17:28] <jw4> coreycb: and a good set of basic features
[17:29] <jw4> coreycb: let me see if I can find some documentation on the state of the union. bodie_ do you have any quick doc links handy?
[17:29] <coreycb> jw4, that's great.  I'll be trying it out soon.   yeah I could use some usage/dev examples or doc if you have them
[17:29] <perrito666> dimitern: i am about to hit merge on your behalf
[17:29] <dimitern> perrito666, +1
[17:30] <dimitern> I thought I already did though :)
[17:32] <bodie_> jw4, coreycb eh, one sec
[17:32] <perrito666> you did
[17:32] <bodie_> coreycb, take a look here https://jujucharms.com/docs/actions
[17:34] <coreycb> bodie_, awesome, ty
[17:34] <bodie_> coreycb, you might want to start with the doc for Charm authors, since it explains more about how they work and about the schemas
[17:34] <bodie_> link on the first line of the Action users doc
[17:35] <bodie_> coreycb, also, you CAN use them now in juju trunk as long as you export JUJU_DEV_FEATURE_FLAG=actions
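[Editor's note] A quick sketch of bodie_'s instructions. The flag name comes from the chat; the charm/unit/action names are made-up examples, and the exact 1.23-era `juju action` subcommand spellings may differ, so the commands are left commented:

```shell
# Enable the flag-gated actions feature on juju trunk.
export JUJU_DEV_FEATURE_FLAG=actions

# With the flag set, the actions subcommands become available, e.g.:
# juju action defined mysql       # list actions the deployed charm defines
# juju action do mysql/0 backup   # queue an action on a unit
# juju action fetch <action-id>   # retrieve the action's results
```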
[17:36] <coreycb> bodie_, very good, thanks!
[17:36] <bodie_> :)
[17:37] <bodie_> let us know if you have any questions or issues... #jujuskunkworks
[17:49] <perrito666> sinzui: ok, apparently my credentials are not good for euca :s
[17:50] <perrito666> sinzui: at this point I would really like to see a run in debug mode, if dimitern's patch does not fix the issue
[17:55] <mgz> perrito666: are you using the right envvar names?
[17:57] <perrito666> https://www.eucalyptus.com/docs/eucalyptus/3.2/cli/setting_vars_euca2ools.html
[17:57] <perrito666> mgz: I actually hacked the parameters into the euca call in the code
[18:03] <perrito666> hey nice, when it's a timeout it gets noticed by jenkins
[18:36] <dimitern> perrito666, I think I found a deeper issue and that fix won't harm, but not fix the failure
[18:36] <dimitern> perrito666, look at this: "ERROR could not exit restoring status: cannot complete restore: <nil>: Restore did not finish succesfuly"
[18:37] <dimitern> perrito666, this nil there is because of the way your api/backups client is calling the apiserver/backups facade methods
[18:38] <dimitern> perrito666, when you have e.g. func (a *API) Method() error { } defined on the facade, the *correct* way to call it is like func (c *Client) Method() error { return c.facade.FacadeCall("Method", nil, nil) }
[18:39] <perrito666> dimitern: again?
[18:39] <dimitern> perrito666, and not: err := client.facade.FacadeCall("Method", nil, &remoteError) \n return err, remoteError << that will always be nil
[18:40] <dimitern> rogpeppe2 can confirm this ^^
[18:41] <rogpeppe2> dimitern: sorry, facades were invented after i left juju-core
[18:41] <dimitern> perrito666, so the signature the apiserver expects from facade methods is listed on rpc.Conn.Serve()'s doc comment
[18:41] <dimitern> rogpeppe2, yes, but it still holds I think
[18:41] <perrito666> dimitern: iirc, err is an error in the calling process and remoteError is an error from the facade Method
[18:42] <dimitern> perrito666, I don't think so - there is only 1 error - the one returned by the remote method
[18:42] <perrito666> not exactly true, you can have an error in facadecall
[18:42] <perrito666> dimitern: btw, where did you get that error/
[18:42] <perrito666> ?
[18:43] <dimitern> perrito666, compare with apiserver/client.go - DestroyMachines(args) error and api/client.go - calling return c.facade.FacadeCall("DestroyMachines", args, nil)
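[Editor's note] The pattern dimitern describes can be sketched with a stand-in for the rpc layer (this is a hypothetical mock, not juju's real api/base code): the remote method's error comes back as the return value of `FacadeCall` itself, so a "response" pointer used to receive it stays nil — matching the `<nil>` seen in the failing job's log.

```go
package main

import "fmt"

// facadeCaller is a hypothetical stand-in for juju's rpc client layer:
// the server-side facade method's error is returned by the call itself,
// never written into the response struct.
type facadeCaller struct{}

func (facadeCaller) FacadeCall(method string, params, response interface{}) error {
	// Simulate the remote method failing; the rpc layer surfaces it here.
	return fmt.Errorf("remote: restore did not finish")
}

// Correct pattern: pass nil for the response and return the call's error.
func callCorrect(f facadeCaller) error {
	return f.FacadeCall("FinishRestore", nil, nil)
}

// Incorrect pattern: expecting the remote error in a response pointer.
func callIncorrect(f facadeCaller) error {
	var remoteError error
	f.FacadeCall("FinishRestore", nil, &remoteError) // call's error dropped
	return remoteError                               // always nil
}

func main() {
	fmt.Println(callCorrect(facadeCaller{}))   // the remote error
	fmt.Println(callIncorrect(facadeCaller{})) // <nil>
}
```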
[18:43] <rogpeppe2> perrito666, dimitern: if it's anything like it was, the error returned from an API method comes out in the error returned from the client call
[18:43] <dimitern> perrito666, it's in the log of most failing jobs - e.g. http://juju-ci.vapour.ws:8080/job/functional-ha-backup-restore/1530/console
[18:44] <dimitern> rogpeppe2, perrito666, considering apiserver/client and api/client are both fairly well tested and we have cases in there where a method returns just 1 error (not results, error)
[18:44] <perrito666> ah I see it, it is something we need to address eventually but in this case, its a red herring
[18:45] <perrito666> dimitern: that error is triggered by trying to salvage the machine after a failed restore
[18:45] <dimitern> rogpeppe2, perrito666, and since I can't see a test for FinishRestore anywhere
[18:46] <dimitern> that might be the issue - it's just not getting called correctly (or the response is not interpreted correctly on the client)
[18:47] <perrito666> ffs i cant believe this thing is so hard to run in a local machine
[18:48] <perrito666> dimitern: I am worried about the:  open /var/lib/juju/agents/machine-0/agent.conf: no such file or directory
[18:48] <perrito666> error
[18:51] <perrito666> I cannot find a good reason for that not to exist
[18:53] <dimitern> perrito666, it comes most likely from state/backups/backups_linux.go:66
[18:55] <dimitern> perrito666, which leads to updateBackupMachineTag switching dataDirs
[18:56] <perrito666> dimitern: but the thing is, that IS the right path
[18:56] <perrito666> Starting Juju machine agent (jujud-machine-0)
[18:56] <perrito666> so that IS machine 0
[18:57] <dimitern> perrito666, yeah. newTag and oldTag match
[18:58] <perrito666> dimitern: yup, that is a workaround so we can deploy backups in ha envs
[18:58] <dimitern> perrito666, but PrepareMachineForRestore earlier recreated the paths
[19:00] <perrito666> dimitern: the process is roughly: machine is bootstrapped, dirs are nuked, files are uncompressed, files might be moved to match new tag, config is read
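[Editor's note] The "files might be moved to match new tag" step perrito666 mentions can be sketched like this; the paths, tags, and file contents are illustrative, not juju's actual restore code:

```shell
#!/bin/bash
# Sketch: after unpacking the backup, rename the agent dir from the
# backed-up machine's tag to the new machine's tag so the agent finds
# its agent.conf under the expected path.
old_tag="machine-2"   # tag recorded in the backup's metadata (example)
new_tag="machine-0"   # tag of the freshly bootstrapped machine (example)

data_dir=$(mktemp -d)
mkdir -p "$data_dir/agents/$old_tag"
echo "stub config" > "$data_dir/agents/$old_tag/agent.conf"

# Only rename when the tags actually differ.
if [ "$old_tag" != "$new_tag" ]; then
    mv "$data_dir/agents/$old_tag" "$data_dir/agents/$new_tag"
fi

ls "$data_dir/agents"
```

If this step runs with the wrong source tag (e.g. a wrong `meta.Origin.Machine`), the agent dir ends up under the wrong name and the later config read fails with exactly the "agent.conf: no such file or directory" error discussed above.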
[19:00] <dimitern> perrito666, I think at this point you might consider reverting the removal of juju-restore plugin (#1667)
[19:00] <perrito666> and t'is driving me crazy, i can do it by hand and for some reason i cannot get this awful ci script to run
[19:00] <perrito666> agjhaskdjasld
[19:00] <perrito666> I will
[19:00] <perrito666> dimitern: on it
[19:01] <dimitern> perrito666, do what I do in such frustrating moments - add TONS of logger.Tracef *everywhere*, just in case :)
[19:01] <dimitern> perrito666, cheers
[19:03] <perrito666> dimitern: I'll roll down and cry and then do that
[19:06] <dimitern> :/
[19:09] <perrito666> and when you cant be frustrated enough... the computer freezes due to lack of ram fml
[19:09] <perrito666> dimitern: http://reviews.vapour.ws/r/1022/
[19:10] <perrito666> I really need to move to a country with more upload and amazon shipping
[19:13] <dimitern> perrito666, :ship it:
[19:14] <perrito666> dimitern: shipping
[19:15] <dimitern> perrito666, sweet!
[19:16] <dimitern> sinzui, ^^ it might get green soon on restore after all
[19:18]  * dimitern is off again
[19:35] <perrito666> fix committed
[19:41] <perrito666> I think I found the issue :D
[19:41] <perrito666> ericsnow: around?
[19:41] <ericsnow> perrito666: sure
[19:43] <perrito666> ericsnow: is it possible that either I misread what meta.Origin.BackupMachine is, or it's returning the wrong value
[19:43] <perrito666> ericsnow: state/backups/backups_linux.go
[19:43] <perrito666> line 43
[19:45] <ericsnow> perrito666: what's the error?
[19:45] <perrito666> sorry I meant meta.Origin.Machine
[19:45] <perrito666> ericsnow: well, I am assuming that it is the machine where the backup was made, and from that asking restore to update the paths accordingly (from machine-N to machine-M)
[19:46] <perrito666> buuuut, i am getting the wrong N
[19:46] <ericsnow> perrito666: it is supposed to be the machine where the backup was made
[19:46] <ericsnow> perrito666: it comes from the API server
[19:46] <perrito666> ericsnow: is it possible that it's not? :p
[19:47] <ericsnow> perrito666: I'm looking at apiserver/backups/backups.go to see :)
[19:48] <perrito666> thanks, a second set of eyes is really appreciated
[19:48] <bodie_> am I confused or has 'fixes-1425807' already been merged yet CI is still blocking? https://github.com/juju/juju/pull/1693
[19:49] <bodie_> the PR to get Actions exposed for 1.23 is waiting to land before the freeze and I'm sweating... just a tiny bit... maybe two drops of sweat
[19:49] <perrito666> bodie_: ci running the tests
[19:51] <ericsnow> perrito666: so the machine ID comes from the apiserver/common.Resources (the "machineID" key) that got passed in to NewAPI by the request handler
[19:52] <ericsnow> perrito666: so it should match the machine ID of the API server
[19:52] <perrito666> ericsnow: mmmmmmm
[19:52] <perrito666> I wonder what that does under HA?
[19:53] <ericsnow> perrito666: either the resource is incorrect or it did not get serialized properly into the metadata file
[19:53] <perrito666> do you get the principal one's id?
[19:53] <perrito666> let  me cut open this tarball
[19:53] <ericsnow> perrito666: yeah, take a look at the metadata.json file
[19:54] <ericsnow> perrito666: re: HA, I expect the machine ID is strictly that of the current API server and unrelated to the replicaset
[19:54]  * ericsnow looks
[19:55] <perrito666> ericsnow: it is the right ID.. I wonder if I am getting the right one when I ask
[19:56]  * perrito666 found the error to be there and now chases it
[19:56] <ericsnow> perrito666: oh, good, I'll stop digging through the tangle of API code :)
[19:56] <perrito666> ericsnow: thanks a lot
[19:56] <ericsnow> perrito666: np
[20:57] <perrito666> ok, and now for something completely different, this bug is not deterministic
[21:10]  * perrito666 crafts a backup by hand and makes the bug deterministic
[21:36] <ericsnow> if anyone can spare some time, I could really use a review on http://reviews.vapour.ws/r/1002
[21:54]  * thumper tries to ignore the world for 30 minutes and write some tests
[21:57] <alexisb> abentley, perrito666 are we still blocked due to 1425807?
[22:00] <abentley> alexisb: We don't have a blessed revision yet.  The last tested was 0872c2cb
[22:02] <abentley> alexisb: 43bbcb6 is currently being tested.  That is likely to have the fix shown.
[22:03] <alexisb> abentley, ok
[22:06] <perrito666> alexisb: I reverted the offending commit and it's the commit after 0872
[22:10]  * perrito666 is invited to a dinner in 1.5 hours, and wonders if the other people there would have a problem with him running tests while he eats
[22:11] <thumper> hahaha
[22:36] <jw4> ericsnow_afk, ungraduated review complete... 1002 LGTM :)
[22:49] <ericsnow_afk> jw4: thanks!!!
[23:00] <jw4> ericsnow_afk: I'm afraid all my feedback is trivial and nitpicky