/srv/irclogs.ubuntu.com/2014/12/02/#juju-dev.txt

jw4thumper: do you have some direction for me regarding feature flags?  We'd like to begin landing the Actions CLI behind a feature toggle until it's complete.00:00
=== kadams54 is now known as kadams54-away
jw4thumper: I'm planning on using an envar say DEV_ACTIONS_CLI or something00:01
hazmatwallyworld_, that would be great00:02
wallyworld_np, done00:02
wallyworld_or will be real soon :-)00:02
wallyworld_hazmat: are some of those instance types only available in certain regions?00:15
wallyworld_cause Juju is complaining there's missing cost data in regions for G1 etc00:16
hazmatwallyworld_, bummer00:16
hazmatwallyworld_, bummer.. needs better unit tests..00:16
hazmatwallyworld_, the G series are not publicly available, i put in the definitions for them.. but only setup mappings for the g5s for a  partner request00:17
hazmattheir in a public beta atm, but pricing is not known00:17
hazmatbut a partner was interested in using them00:17
wallyworld_what default would be sensible you think?00:17
wallyworld_or would a default even be worthwhile00:18
wallyworld_if i put in a high figure, that would mre or ess require explicit instance type selection by the user00:18
wallyworld_that probably could be ok i think00:19
hazmatwallyworld_, sounds reasonable00:19
wallyworld_ok, will do thanks00:19
hazmatwallyworld_, g series are mostly monster machines.00:19
=== kadams54-away is now known as kadams54
hazmatso i imagine the price is.. well if you have to ask ;-)00:19
wallyworld_lol00:19
thumperjw4: oh hai00:35
thumperjw4: the feature flag branch needs tweaking and can land shortly after00:35
thumperdidn't realise anyone else was waiting on it00:35
davecheneymwhudson: 1.5 discussion will start in the next 48 hours00:37
davecheneyprobably sooner00:37
mwhudsondavecheney: \o/00:38
mwhudsonin other news, i've found yet another gccgo bug00:38
jw4thumper: no worries - we just started talking about it too00:43
wallyworld_axw: i had to make a fix to gwacl to ensure all instance types had costs. I fixed the tests also. these changes are in rolesizes.go and rolesizes_test.go. However, I got sick of the non-standard formatting cause my editor gofmts on save, so the branch has had vanilla gofmt run over it so the diff is large. all tests run, the real changes are in the files i just mentioned. are you able to take a look?00:43
wallyworld_https://code.launchpad.net/~wallyworld/gwacl/ensure-all-roles-have-costs/+merge/24334600:44
wallyworld_ignore my crappy branch name00:44
jw4thumper: I will be interested to see it when the branch is ready to land00:44
thumperjw4: ack00:44
=== kadams54 is now known as kadams54-away
davecheneymwhudson: sorry, there is no prize for gccgo bugs00:54
axwwallyworld_: ok00:58
wallyworld_ty and sorry :-)00:59
wallyworld_hopefully you can ignore the white space00:59
axwwallyworld_: if you haven't already, can you please either delete the format target in Makefile, or change it to "go fmt"?01:00
mwhudsondavecheney: heh heh01:02
wallyworld_axw: yeah did that :-)01:02
axwwallyworld_ hazmat: do we even care about actual costs? I think we only care about relative costs... I wonder if we could just do "1, 2, 3, ..." in order of role size01:05
wallyworld_axw: i made the costs arbitary and large01:05
wallyworld_and in order of role size01:05
axwwallyworld_: I guess I should just actually look at the MP ;)01:05
wallyworld_axw: that way, G1 etc are not selected by default, and require an explicit instance type constraint01:06
axwright. goodo01:06
perrito666I try to run my tests with -gocheck.f="myfilter" but whenever I use -gocheck.something the only run test is dependenciesTest.TestDependenciesTsvFormat01:22
perrito666anybody hit tat before?01:22
davecheneyperrito666: are you still stuck on that dependency problem ?01:24
perrito666davecheney: nope, I just tried to run one test alone of a suite01:25
perrito666davecheney: tx for your help the other day, I realized too late you where not working that day, apologies01:26
mwhudsondavecheney: ah no, it's probably just a go tool bug01:28
mwhudsondavecheney: even tinier non-prize01:28
davecheneyperrito666: meh, if i'm not working, i'll ignore you01:31
davecheneyyou don't need to appologise01:31
perrito666:p I am a very nice person, why would you ignore me?01:33
* perrito666 is offended01:33
mwhudsonyes01:34
mwhudsondavecheney: do you remember enough of the pain to say if this makes sense? http://paste.ubuntu.com/9338212/01:37
davecheneymwhudson: i remember that link order was super fragile01:40
davecheneywe occasionally see builds blow up on power64 when link order is wrong01:41
davecheneyi can't say that what you have proposed is wrong01:41
mwhudsondavecheney: yeah, that's the bug i'm poking at01:41
davecheneybut it does make sense that that is the cause01:41
mwhudsondavecheney: the problem seems to be that i added code to not pass the target of two actions for the same a.p01:42
mwhudsonbut there are two different a.p's for the package being tested when you run go test01:42
mwhudson(which makes sense, maybe, now that i think about it: one has the internal _test.go files in it)01:42
wallyworld_katco: i asked for tests, let me know if you have any questions01:46
davecheneymwhudson: juju uses 'external' tests a lot more than the stb lib02:03
davecheneyand we're also leaning hevily on the export_test.go escape hatch to get access to internal functions during tests02:03
davecheneyi'm sure this has agrevated the issue02:04
=== kadams54 is now known as kadams54-away
jw4thanks thumper - I see your PR02:55
thumperhmm... is juju/cmd hooked up to review board?03:08
thumperhttps://github.com/juju/cmd/pull/1003:08
thumpermenn0: you might like that one :-) ^^^^03:17
menn0thumper: looking...03:22
thumpermenn0: obviously the supercommand code wasn't complicated enough...03:23
thumpermenn0: I've just noticed the I never got around to updating the doc strings for the new functions03:24
menn0thumper: i'll ignore that then :)03:29
menn0thumper: reviews for other repos don't go to RB then?03:29
thumpermenn0: some do, some don't03:30
thumperI don't know which do03:30
menn0thumper: well this hasn't so I'll do the review on GH03:31
menn0thumper: so I should ignore the incorrect docstring for RegisterSuperAlias?03:37
thumperyes03:38
ericsnowaxw: patch to add AZ to instanceData: http://reviews.vapour.ws/r/557/03:43
ericsnowaxw: let me know if that's what you had in mind03:43
=== kadams54-away is now known as kadams54
ericsnowaxw: with that my unit-get patch will get a lot simpler :)03:44
axwericsnow: yep, pretty much. I don't think Instance needs to get a new AvailabilityZone method; we should just return it from StartInstance03:46
axwI guess it doesn't hurt to have it there though03:47
axwericsnow: oh right, it's needed for upgrades. never mind me03:48
axwah hm, not really if it's a ZonedEnviron ... anyway.03:49
menn0thedac: review done04:10
menn0thedac: sorry, that was supposed to be for thumper04:10
davecheneymwhudson: the patch attached to this issue appears to be a patch, if it's not a patch, then remove the patch tag from this issue ...04:38
davecheneythanks, i guess04:38
=== kadams54 is now known as kadams54-away
huwshimiHi, I was just wanting to try a few things with 1.21 and wondering which ppa to use?05:19
axwhuwshimi: I think you want https://launchpad.net/~juju/+archive/ubuntu/devel05:30
huwshimiaxw: A brilliant thanks, I'll take a look05:31
huwshimi*Ah05:32
dimiternmorning05:48
axwfwereade_: any chance you could take a look at https://github.com/juju/charm/pull/77 some time soon? I expect we can iterate on this for changes due to mandatory constraints07:35
fwereade_axw, sorry, queued it up07:36
axwfwereade_: thanks07:36
dimitern_fwereade_, wallyworld_, TheMue, hey guys, any of you willing to review a fix for bug 1395908 ? http://reviews.vapour.ws/r/558/09:06
mupBug #1395908: LXC containers in pending state due to juju-br0 misconfiguration <lxc> <oil> <juju-core:In Progress by dimitern> <juju-core 1.20:In Progress by dimitern> <juju-core 1.21:In Progress by dimitern> <https://launchpad.net/bugs/1395908>09:06
wallyworld_dimitern_: i've been called to dinner but can look afterwards if not done by then09:07
dimitern_wallyworld_, no worries09:07
fwereade_dimitern_, LGTM, very clean, ty09:08
mattywmorning all09:09
dimitern_fwereade_, thanks!09:09
dimitern_morning matty09:10
mattywdimitern_, morning09:10
mattywfwereade_, morning, would you have a few minutes to take a look at http://reviews.vapour.ws/r/553/09:11
dimitern_fwereade_, when you have 5m, I have a backport for the same bug fix - http://reviews.vapour.ws/r/559/ (for 1.21) - no other changes09:46
fwereade_dimitern_, LGTM again :)09:48
dimitern_fwereade_, cheers :)09:50
voidspaceNobody has posted to juju or canonical-tech about rocket...09:52
fwereade_voidspace, it was on cloud though09:54
* fwereade_ popping out, bbs09:54
voidspacefwereade_: ah, I may not be on that list then...09:59
voidspacejam1: dimitern_: stdup?10:01
dimitern_voidspace, omw, sorry10:02
fwereade_menn0, ping10:50
menn0fwereade_: hi10:51
fwereade_menn0, remind/reassure me: did we manage to close off the paths whereby non-state server agents can end up connecting to state servers of an earlier version than themselves?10:51
menn0fwereade_: you mean while an upgrade is in progress?10:53
fwereade_menn0, yeah10:53
menn0fwereade_: yes, that was already done before I touched the upgrade code10:53
menn0fwereade_: state servers don't advertise the new version until they themselves have upgrades10:53
menn0upgraded10:53
fwereade_menn0, I'm just looking at the bit I wrote in the uniter for spinning on CodeNotImplemented from a particular call while waiting for the state server to be upgraded10:53
fwereade_menn0, and suspecting I can drop it10:53
fwereade_menn0, cool10:53
fwereade_menn0, thanks10:54
menn0fwereade_: and given that an upgrade can't start unless all the hosts in the env are on the current or next version there's really no way a non-state server can get ahead of the state servers10:55
natefinchfwereade_: do you mind if I tweak this a little?  I'm not really a fan of the cast here, since it can panic: http://bazaar.launchpad.net/~fwereade/juju-core/provider-skeleton/view/head:/provider/skeleton/config.go#L12511:11
wallyworld_natefinch: here's a fix for the critical bug on master http://reviews.vapour.ws/r/560/11:12
natefinchfwereade_: I'd much rather do it once in validateConfig and just store the values in a normal old struct11:12
natefinchwallyworld_: looking11:12
wallyworld_ty11:12
fwereade_natefinch, so, in *theory* it won't panic because it should have been validated on creation11:13
fwereade_natefinch, but *hell yes* put things in structs if you can11:14
natefinchfwereade_: kk will do.  I know it shouldn't panic, but it gives me the heebie jeebies11:14
natefinchman, %q was just the best idea ever11:21
mattywhi folks, I'm consistently getting errors on FAIL: status_test.go:2702: StatusSuite.TestFilterSubordinateButNotParent. Anyone else? It seems to have been happening for a while for me so it must just be me11:32
wallyworld_jam1: do you know the history surrounding the change in api endpoints as per bug 1397376? it's claimed the change to only return a single address for api-endpoints is a regression, and it sure seems that way. also the use of dns name rather than ip address11:33
mupBug #1397376: maas provider: 1.21b3 removes ip from api-endpoints <api> <cloud-installer> <landscape> <maas-provider> <juju-core:New> <juju-core 1.21:New> <https://launchpad.net/bugs/1397376>11:33
wallyworld_mattyw: katco should be able to help with that11:34
natefinchwallyworld_: lgtm11:38
wallyworld_natefinch: tyvm11:38
mattywwallyworld_, ack, thanks very much11:55
wallyworld_np, she should be on soon11:55
dimitern_wallyworld_, your fix landed, but the bot still refuses new PRs12:01
dimitern___JFDI__ doesn't seem to work either :/12:01
wallyworld_dimitern_: i think it takes a while for the fix to go through CI12:01
dimitern_wallyworld_, ah, my mistake - it did work with __JFDI__12:02
wallyworld_great12:02
mattywfwereade_, ping?12:18
fwereade_mattyw, pong, in meeting, responding slowly12:19
=== Ursinha is now known as Ursinha-brb
=== Ursinha-brb is now known as Ursinha
dimitern_fwereade_, last backport - for 1.20; same bug, no changes - will you approve it? http://reviews.vapour.ws/r/561/12:27
dimitern_voidspace, can you have a look instead?  http://reviews.vapour.ws/r/561/12:54
=== dimitern_ is now known as dimitern
* dimitern is afk for a while12:56
voidspacedimitern: sure13:02
voidspacedimitern: LGTM13:03
dimiternvoidspace, thanks!13:09
fwereade_can anyone think of a reason not to make "machineenvironmentworker.MachineEnvironmentWorker" into something a bit more readable like "proxy.Worker"?13:51
perrito666is anyone very very savvy about github.com/juju/cmd?13:54
dimiternfwereade_, +10014:04
dimiternperrito666, rogpeppe or thumper I think14:04
perrito666dimitern: tx, I figured out, testing subcommands was confusing me a bitç14:05
rogpeppefwereade_: why "proxy"?14:05
rogpeppefwereade_: i'd certainly use less stuttering14:06
jw4wallyworld_: how can I know definitively when CI is accepting new builds?  CI must be using some check that I can run myself before $$merge$$?14:07
perrito666fwereade_: +1 to have names that can be pronounced without taking a breath at the point14:08
fwereade_rogpeppe, because it's all about setting proxy-related env vars, and writing out files describing what proxies to use ;)14:09
dimiternjw4, I use this bookmark https://bugs.launchpad.net/juju-core/+bugs?field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.importance%3Alist=CRITICAL&field.tag=ci+regression+&field.tags_combinator=ALL14:09
fwereade_rogpeppe, but maybe I missed some detail?14:09
jw4dimitern: that's basically what I use too, but CI is still rejecting builds even though that reports zero bugs14:10
dimiternjw4, it was blocked earlier, but not anymore - a PR of mine landed less than an hour ago14:11
rogpeppefwereade_: is there any reason to export the worker type at all from that package?14:11
fwereade_rogpeppe, fair point14:11
jw4dimitern: hmm; mine was rejected after yours was accepted14:11
rogpeppefwereade_: i'd suggest proxyworker as the package identifier14:11
rogpeppefwereade_: and define proxyworker.New as the creator of the worker14:11
fwereade_rogpeppe, "worker/proxyworker" is a little bit yucky, but yeah, that sounds sane14:12
rogpeppefwereade_: and use "worker" as the internal type14:12
rogpeppefwereade_: i think it's fairly idiomatic - e.g. http/httputil14:12
fwereade_rogpeppe, yeah14:13
jw4wallyworld_, dimitern yeah - again : Does not match ['fixes-1396981']14:14
jw4CI must be using a different query - I just want to know what that is so I don't have to keep guessing when CI gets unblocked14:14
dimiternjw4, they must've changed it14:26
dimiternmgz_, sinzui, any clue why the bot is still blocked on bug 1396981 ?14:27
mupBug #1396981: Upgrade fails because product json is renamed <ci> <regression> <upgrade-juju> <juju-core:Fix Committed by wallyworld> <https://launchpad.net/bugs/1396981>14:27
jw4dimitern: oh well - I've waited almost a week to land this anyway - I can wait longer :)14:27
sinzuidimitern, 1. the bug is not Fix released, 2. we need to remove the rule for frozen index.json.14:28
sinzuidimitern, I think master will be open in about 30 minutes14:28
sinzuijam can you get someone to look into Bug #1398406? the gwacl change broke azure deploys for CI14:29
mupBug #1398406: Azure provider attempts to deploy with unsupported "D1" RoleSize <azure-provider> <bootstrap> <ci> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1398406>14:29
dimiternsinzui, great, thanks!14:30
dimiternsinzui, jam is not here today, I can have a look perhaps14:30
sinzuithank you dimitern14:31
dimiternsinzui, btw, lest I forget - will there be a  1.20.14 release? I have just fixed an issue with maas there - bug 139590814:32
mupBug #1395908: LXC containers in pending state due to juju-br0 misconfiguration <lxc> <oil> <juju-core:Fix Committed by dimitern> <juju-core 1.20:Fix Committed by dimitern> <juju-core 1.21:Fix Committed by dimitern> <https://launchpad.net/bugs/1395908>14:32
sinzuidimitern, we need to do one for the gwacl change...but the change is broke tests14:33
dimiternsinzui, the gwacl one I'm looking at now?14:33
sinzuidimitern, I was just looking at that bug and think about how I might verify the change works with the QA maas14:33
sinzuidimitern, yes. bug 1398406 is fallout from bug 138942214:34
mupBug #1398406: Azure provider attempts to deploy with unsupported "D1" RoleSize <azure-provider> <bootstrap> <ci> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1398406>14:34
mupBug #1389422: azure instance-types and regions missing <azure-provider> <constraints> <Go Windows Azure Client Library:Fix Committed by hazmat> <juju-core:In Progress by wallyworld> <juju-core 1.20:Triaged> <juju-core 1.21:Fix Committed by wallyworld> <https://launchpad.net/bugs/1389422>14:34
dimiternsinzui, well, if it's a virtual maas it can be tested by tweaking one or a few kvm domain configs (vms) to have disabled NICs14:34
sinzuithank you dimitern,14:35
dimiternsinzui, that's how I tested it locally, I can give you more details should you need them14:35
sinzui:)14:36
dimiternsinzui, with the fix juju will log "node "XXX" skipping disabled network interface "XYZ" for disabled devices; luckily by default only the primary interface is enabled (as discovered by lshw) so you should see these right after a node is acquired from maas (e.g. during bootstrap --debug)14:40
dimiternsinzui, and after a few "skipping disabled network interface.." log lines (depending on how many NICs are there) there will be a final "node "<maas-uuid>" primary network interface is "XYZ" (e.g. "eth2", which will appear in /etc/network/interfaces under the juju-br0 "bridge-ports" setting)14:42
jw4dimitern: (sinzui ) this updated query takes into account Fix Committed and not Fix Released... : https://api.launchpad.net/devel/juju-core?ws.op=searchTasks&status%3Alist=Triaged&status%3Alist=In+Progress&status%3Alist=Fix+Committed&importance%3Alist=Critical&tags%3Alist=regression&tags%3Alist=ci&tags_combinator=All14:55
jw4sinzui: that should be the right query for me to check status before merging right?14:56
dimiternjw4, do you have a bugs.launchpad.net link with the same filters?14:57
perrito666natefinch: ericsnow can we jump into the standup since we are in it?14:57
jw4dimitern, no but I think I can whip it up...14:57
dimiternjw4, cheers!14:57
dimiternsinzui, is bug 1398406 only for 1.21 or also for trunk and 1.20 ?14:58
mupBug #1398406: Azure provider attempts to deploy with unsupported "D1" RoleSize <azure-provider> <bootstrap> <ci> <regression> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1398406>14:58
sinzuidimitern, the Lp only claims the merge was in 1.2114:58
* sinzui checks log14:58
perrito666natefinch: ?14:59
sinzuidimitern, I don't see the dep change in master. just 1.2114:59
dimiternsinzui, right; so after some analysis I think the issue is MS renamed "D1" to "Standard_D1" etc.14:59
dimiternsinzui, I'll propose a fix, can it be tested in the same CI scenario easily?15:00
sinzuiyes15:00
sinzuidimitern, this is the command that failed. the constraint is important, the env can be any asure: juju --show-log bootstrap -e azure-deploy-precise-amd64 --constraints mem=2G15:04
dimiternsinzui, right, I'll try it locally as well15:05
dimiternthanks15:05
sinzuijw4, that call is right. I use15:06
sinzui juju-ci-tools/check_blockers.py master 123415:06
sinzuiwhere the number is random15:06
jw4sinzui: perfect15:06
jw4dimitern: check_blockers.py is the best route, but I think the query would look like: https://bugs.launchpad.net/juju-core/+bugs?field.searchtext=&orderby=-importance&search=Search&field.status%3Alist=NEW&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=FIXCOMMITTED&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE&field.15:08
jw4importance%3Alist=CRITICAL&assignee_option=any&field.assignee=&field.bug_reporter=&field.bug_commenter=&field.subscriber=&field.structural_subscriber=&field.tag=ci+regression+&field.tags_combinator=ALL&field.has_cve.used=&field.omit_dupes.used=&field.omit_dupes=on&field.affects_me.used=&field.has_patch.used=&field.has_branches.used=&field.has_branches=on&field.has_no_branches.used=&field.has_no_branches=on&field.has_blueprints.used=&field.15:08
jw4has_blueprints=on&field.has_no_blueprints.used=&field.has_no_blueprints=on15:08
mgz_jw4: it's possible to get a shorter version of that url :)15:08
sinzuidimitern, jw4 abentley: bug 1398406 is wrongly on master (and the wrong milestone), I am going to target it to the 1.21 branch and beta4 to get master unblocked15:09
mupBug #1398406: Azure provider attempts to deploy with unsupported "D1" RoleSize <azure-provider> <bootstrap> <ci> <regression> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1398406>15:09
jw4mgz_: lol - you mean with a shortner or by removing all the empty fields ?15:09
dimiternjw4, thanks!15:09
jw4mgz_: I keep trimming the empty fields by hand, but the search tool puts them back in whenever I tweak the params15:09
dimiternsinzui, cool, thank you15:09
mgz_jw4: :)15:10
jw4mgz_: (dimitern) how about : https://bugs.launchpad.net/juju-core/+bugs?field.searchtext=&orderby=-importance&field.status%3Alist=NEW&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=FIXCOMMITTED&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE&field.importance%3Alist=CRITICAL&field.tag=ci+regression+&field.tags_combinator=ALL15:12
jw4tiny bit shorter15:12
mgz_that looks reasonable15:12
dimiternthanks again jw415:12
jw4dimitern: :)15:13
natefinchericsnow, perrito666: sorry, chrome crashed again, and I can't get back in.  I think we mostly were done.  I'll be back in like 5 minutes.15:20
perrito666natefinch: we noticed15:20
natefinchbtw, is it possible to downgrade back to trusty?  Because this is just ridiculous15:20
ericsnownatefinch: ping me when you are ready for our 1-on-115:20
natefinchericsnow: will do]15:20
perrito666natefinch: mm, I am not sure, it should be it requires some crafting by hand and some things might break15:27
perrito666there are not that many changes in utopic15:28
perrito666natefinch: you could downgrade the browser15:28
natefinchI don't even know that it changed the browser.  I'm running Chrome, not Chromium15:28
perrito666natefinch: chances are that the browser updated and it might also be broken in trusty, chrome comes from a ppa iirc15:30
alexisbhey there team, I see katco has pickup one critical bug do we have an assignee for this one:15:34
alexisbhttps://bugs.launchpad.net/juju-core/+bug/139737615:34
mupBug #1397376: maas provider: 1.21b3 removes ip from api-endpoints <api> <cloud-installer> <landscape> <maas-provider> <juju-core:Triaged> <juju-core 1.21:Triaged> <https://launchpad.net/bugs/1397376>15:34
natefinchalexisb: looking15:41
natefinchsinzui: about that api-endpoints bug.... it sounds like the scripts were relying on behavior that was not guaranteed, but simply happened to be somewhat reliable in their particular environments.  I'm with Tim in that I'm not sure this is a regression.15:53
sinzuinatefinch, They are relying on behaviour that worked most of the time, now doesn't work most of the time15:54
sinzuinatefinch, I think we need to know if anyone has built tools on the unguaranteed behaviour. If deployer assumes it, we have a problem15:55
sinzuihazmat, juju 1.21 api-endpoints is more likely to return a dns address instead of an ip address. Some people assume it only returned an IP address. does deployer make such assumptions?15:57
natefinchsinzui: isn't deployer on the server?  api-endpoints is a client-only command,  I don't think deployer could call it if it wanted to.  Also, again, deployer is on the server, so it shouldn't need to get the api endpoint, right?16:02
sinzuinatefinch, no, deployer drives your client16:02
natefinchor am I crazy about how deployer works16:02
* natefinch is crazy, ok. :)16:02
natefinchI've never used the deployer, sorry :)16:02
sinzuiwell supplants your client with an bulk api clinet16:02
sinzuinatefinch,  a deployer files/bundle is a summary of an end state of an env. deployer/quickstart walks juju through the steps to but the env into the state16:03
voidspacedimitern: ping16:27
dimiternvoidspace, pong16:28
voidspacedimitern: I'm stuck on a failing test16:28
voidspacedimitern: this is for IPAddress.SetState16:28
voidspacedimitern: the Assert seems to prevent State being changed at all16:29
voidspacedimitern: this is the code https://github.com/voidspace/juju/compare/state-ipaddresses16:29
voidspacedimitern: this is the error message: http://pastebin.ubuntu.com/9346641/16:29
dimiternvoidspace, looking16:29
voidspacedimitern: the failing test is TestIPAddressSetState16:30
voidspacedimitern: it fails on line 1306 - where we attempt to change the state (actually calling SetState)16:30
voidspacedimitern: I've checked other Asserts in our code and can't see the difference16:30
voidspacedimitern: the Assert itself is line 96 of the diff (first block - the new ipaddresses.go file)16:31
dimiternvoidspace, unknownOrSame?16:32
voidspacedimitern: yes16:32
voidspacedimitern: to change the state, the existing state must be AddressStateUnknown, or the same state as the one we're setting16:32
voidspaceso the state must be unknownOrSame16:32
voidspacewhich was your name I believe... yesterday :-)16:33
dimiternvoidspace, yeah16:33
dimiternvoidspace, hmm.. weird..16:33
voidspacedimitern: the test creates a new address, checks that the state is AddressStateUnknown and then attempts to set the state to AddressStateAllocated16:34
voidspacedimitern: and that fails with "transaction aborted"16:34
dimiternvoidspace, how about if you use getCollection + defer like in other places, rather than using C: ipaddressesC16:34
voidspacedimitern: ok, I'll try that16:34
dimiternvoidspace, ah, another thing I just noticed16:34
voidspacedimitern: hmm... other places uses st.runTransaction with C16:35
dimiternvoidspace, state is omitempty, which means it can be null in addition to any valid state16:35
dimiternvoidspace, it shouldn't be omitempty16:35
voidspacedimitern: ok16:35
voidspacedimitern: that might be it16:35
voidspaceStateUnknown is ""16:35
dimiternvoidspace, we have an "unknown" .. yeah :)16:35
voidspacedimitern: getting rid of omitempty worked!16:37
voidspacedimitern: thanks :-)16:37
dimiternvoidspace, sweet!16:38
voidspaceI knew there was a reason we kept you around ;-)16:38
dimitern:D16:38
dimiternindeed16:38
voidspace:-)16:38
dimiternok, eod for me16:43
dimiterng'night all16:43
voidspacedimitern: g'ngight16:43
dimiternvoidspace, o/16:43
=== kadams54 is now known as kadams54-away
mattyw_ericsnow, ping?16:52
ericsnowmattyw_: hey16:52
mattyw_ericsnow, hey hey, what's the current process for proposing a branch with a prereq?16:52
ericsnowmattyw_: Do it like normal and then use "rbt post -u --parent ..." afterward16:53
mattyw_ericsnow, perfect thanks very much16:54
ericsnowmattyw_: np :)16:54
=== kadams54-away is now known as kadams54
mattyw_cmars, http://reviews.vapour.ws/r/563/16:59
voidspaceg'night all18:13
jw4on call reviewer today available to PTAL?  Very small change - fully tested with juju-core (small fix to juju-core to follow this update) http://reviews.vapour.ws/r/565/18:33
natefinchjw4: what have we learned?18:34
natefinch<chorus> never use regular expressions </chorus>18:35
katconatefinch: that is a rather firm stance to take18:36
natefinchkatco: perhaps only slightly too firm.  Never use regular expressions longer than 10 characters?18:36
katconatefinch: i can at least consider that statement :)18:37
katconatefinch: very useful: http://www.colm.net/open-source/ragel/18:37
jw4natefinch: lol!18:38
natefinchjw4: haha, did we not even have a positive testcase for that?18:38
natefinchwait wait wait18:38
jw4natefinch: blush18:38
natefinchwhy is IsValidUUIDString not just18:39
jw4katco: interesting link18:39
natefinch_, err := utils.UUIDFromString(mystring)18:39
katcojw4: it's a very neat project18:39
natefinchreturn err == nil18:39
jw4natefinch: that's actually how this was caught18:40
jw4the check was passing but the UUIDFromString was panicing18:40
natefinchin fact... why do we even have an IsValidUUIDString?18:40
jw4natefinch: no answer except that we just started using that method too18:41
jw4natefinch: we could switch to the _,err approach and remove the bool version18:41
natefinchjw4: please do... there's no sense having two implementations which can get out of sync18:41
jw4natefinch: I'll have a juju core PR to follow this, and now a names package follow up too18:42
natefinchjw4: cool, sorry to give you more work, but it's definitely the better fix18:43
jw4natefinch: no problem - I'm semi blocked on CI anyway18:43
perrito666I am all for regexes as long as you have a test battery that covers every possible case for them :p18:46
* perrito666 gets a friend to bring an en_US kb from ny and jumps of happiness like mario with a new coin18:47
jw4natefinch: the regex is useful because it's an easy way to verify that the 15th char is a '4' and the 20th char is one of '8','9','a','b'18:51
jw4natefinch: we  could do that check manually but the regex actually seems simpler18:51
jw4natefinch: utilis.UUIDFromString uses the regex to verify those special characters, and then the rest is just simple hex decoding18:54
natefinchjw4: you think that 73 character regex is simpler than  s[14] == '4' && switch s[19] { case  '8','9','a','b':    default: error } ?18:54
jw4natefinch: ;-p18:54
jw4natefinch: okay18:54
natefinchjw4: also, what's up with the 4 and 89ab thing anyway?18:55
jw4natefinch: something to do with valid UUID per a spec somewhere18:55
jw4natefinch: fwereade_ gave me a link but I forget it18:55
jw4natefinch: the 4 is a version number18:55
perrito666mm uuid validator, I seem to recall we have code that checks that18:56
natefinchahh yeah, I see some comments about version 418:56
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
Makyosinzui, ping18:56
jw4natefinch: btw, we also have to validate length if we remove the regex.  I'll update the PR with those changes18:56
sinzuihi Makyo18:56
jw4perrito666: other than in the utils package?18:56
perrito666jw4: nope, same ugly regexp18:57
Makyosinzui, was told to get in touch with you re: juju beta PPA18:57
natefinchjw4: is there a reason we're rolling our own and not using something like https://godoc.org/code.google.com/p/go-uuid/uuid18:57
perrito666I recall having a fake uuid breaking and trying to decypher that regex to figure out wtf until I just read the rfc18:57
sinzuiMakyo, https://launchpad.net/~juju/+archive/ubuntu/devel ?18:58
jw4perrito666: fwereade_ pointed me to that package - we can ask him about go-uuid18:58
Makyosinzui, ah, thanks, we didn't know the location.18:58
perrito666jw4: I know natefinch will not agree with me, but I would definitely go for something more in the side of http://pastebin.ubuntu.com/9348308/19:07
perrito666if I actually had to use that regex19:07
perrito666and properly name the blocks19:07
perrito666aaaand of course a nice doc explaining it a bit more19:07
jw4perrito666: I'm okay with that - if natefinch agrees I will go that route19:07
natefinchjw4: until such a time as we can use something more complete, that's fine... except you don't need sprintf19:12
jw4natefinch: +119:12
perrito666duh, the sprintf was from the first implementation I wrote19:16
perrito666but I found it clearer by just concatenating the strings19:17
jw4perrito666: yeah19:17
jw4perrito666: natefinch: also - I'm guessing we want to allow upper case letters too19:17
perrito666jw4: It would be wiser to de-capitalize the uuid before matching19:17
jw4perrito666: kk19:19
perrito666can I trigger the compilation of a _windows file on linux??19:19
* perrito666 knows the answer and starts a windows+19:20
jw4GO_ARCH=windows?19:20
natefinchjw4: so, yes, but I think you have to have the go tools built to run that way19:21
jw4natefinch: I see.  FWIW, I'm coming back to a non-regex approach again! :)19:22
natefinchjw4: http://dave.cheney.net/2013/07/09/an-introduction-to-cross-compilation-with-go-1-119:22
jw4natefinch: cool.19:23
natefinchjw4: once the tools are built `GOOS=windows go build` will work19:24
perrito666if you build them, they will compile...19:27
=== tvansteenburgh is now known as tvan-afk
jw4natefinch, perrito666 : http://reviews.vapour.ws/r/565/19:43
perrito666jw4: I lgtmd but you need davecheney to bless it19:53
perrito666jw4: I proposed the change :p I was not going to be against it19:53
=== kadams54 is now known as kadams54-away
perrito666we have tests failing because my windows is in spanish :s20:24
perrito666is reviewboard on for utils?21:02
perrito666it is, and it changes the description, how nice21:04
perrito666http://reviews.vapour.ws/r/567/21:04
perrito666ok EOD for me, cheers everyone21:32
* menn0 is sick of CI being blocked21:32
menn0is anyone looking at bug 1398473?21:33
mupBug #1398473: backup output format changed <backup-restore> <ci> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1398473>21:33
perrito666menn0: lemme check21:36
menn0perrito666: either the test needs to change or the output filename21:36
menn0perrito666: the other blocker is also backup related21:37
menn0perrito666: bug 139844821:37
mupBug #1398448: ERROR while creating backup archive: while bundling state-critical files: write to tar file failed: open /var/log/juju/all-machines.log: no such file or directory <backup-restore> <ci> <ha> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1398448>21:37
menn0perrito666: seems more serious21:37
perrito666ericsnow: did you merge the deprecation of backup before restore is merged?21:37
ericsnowperrito666: yes21:38
perrito666ericsnow: old restore will not restore the new backups21:38
ericsnowperrito666: why not?21:38
perrito666ericsnow: are you dumping with ooplog?21:38
ericsnowperrito666: the oplog doesn't matter21:38
perrito666ericsnow: ?21:39
ericsnowperrito666: the dumped oplog will just get ignored, giving exactly the same result as if dumped without the oplog21:40
perrito666ericsnow: except that loosing everything that happened between the beginning and end of juju backups create21:40
perrito666old backup would stop the state server while backing up21:40
ericsnowperrito666: I see21:41
perrito666yes, I know, its a PITA21:41
perrito666:)21:41
ericsnowperrito666: however, restore will land in 1.22 so it won't matter, right?21:41
ericsnowperrito666: is the backup archive (with oplog) breaking tests?21:42
ericsnowperrito666: that bug is related to the filename, not the archive content21:42
perrito666ericsnow: no that is just a side effect I noticed while looking the other issue21:42
ericsnowperrito666: k21:42
perrito666ericsnow: I believe a change in the test is in order, unless there is some strict rule on the file name21:42
ericsnowperrito666: looks like the test has some hard-coded regex for the filename that no longer works21:43
perrito666ericsnow: it has21:43
perrito666it expects it to have the old name21:43
perrito666sinzui: is this considered a break on backwards compatibility?21:44
menn0perrito666, ericsnow: if users of the old backup system have scripts which expect the old filename the change will break them21:44
ericsnowmenn0: fair enough21:44
menn0perrito666, ericsnow: not sure how worried we should be about that though21:44
perrito666ericsnow: I presume you changed the script to call backups create, right?21:45
menn0wallyworld_: ^^^21:45
ericsnowperrito666: right21:45
perrito666ericsnow: you could add a mv at the end21:45
perrito666but, apart from that21:45
perrito666there is an underlying problem there21:45
ericsnowperrito666: I was thinking the same thing, though I don't like it :P21:46
perrito666ericsnow: although that script should be nuked before 1.2221:46
ericsnowperrito666: we have to keep the script around due to backward-compatibility21:47
menn0ericsnow: so what's the difference between the old and new style filenames?21:47
sinzuiperrito666, menn0, ericsnow : I don't think we promised a naming convention for the backup file. The promise is a new file.21:47
perrito666it will produce a different format than previously and an unknowing user might try to restore one of these into an old juju (because of reasons)21:47
ericsnowmenn0: the new filename has a .tar.gz suffix instead of .tgz, and now includes the env UUID in the filename21:47
menn0perrito666: that's a good point. given that the format has changed a new file name style is probably a good thing21:48
ericsnowmenn0: actually, the env UUID isn't included in the filename21:48
ericsnowmenn0: so just the suffix changed21:48
menn0ericsnow: ok21:48
menn0ericsnow: I agree that tar.gz is marginally better :)21:49
perrito666so this bug is on sinzui's side21:49
menn0so if the filename is staying the same let's change the test to expect the new filename21:49
perrito666although its our fault21:49
perrito666for not communicating properly the change21:49
menn0perrito666: we have access to the CI tests and can fix them too...21:50
ericsnowmenn0: I'm all for just fixing the regex in the CI test :)21:50
perrito666menn0: we do, that is a particularly convoluted tests though :) because it tests an ugly plugin of juju21:51
ericsnowFWIW, I don't think there's a whole lot we can do about someone trying to restore a 1.22 backup using 1.20 or 1.2121:51
perrito666I am not sure for the case of backup, but it also might be expecting certain output from stdout21:51
menn0sinzui: remind me what the process for changing CI tests is21:51
perrito666ericsnow: we could print a hughe warning ni the script, that is it21:52
sinzuiperrito666, abentley I agree that "juju-backup-20141202-141348.tar.gz" is still clearly a backup. We will loosen the regex21:52
ericsnowDoing so should still work fine, with the (very?) remote chance of introducing DB inconsistencies21:52
perrito666ericsnow: well, there will not be inconsistencies in CI, just in practice21:53
abentleysinzui: It is a regression, though.21:53
sinzuiabentley, yep, I have targeted the bug and started a fix21:53
abentleysinzui: If it breaks my code, it can just as easily break someone else's.21:53
ericsnowperrito666: and probably very very rarely21:54
menn0ericsnow, perrito666: now what about the other issue relating to all-machines.log?21:54
perrito666ericsnow: I agree, in the current status of users for juju, although it is technical debt21:54
perrito666menn0: checking21:54
menn0ericsnow, perrito666: bug 139844821:54
mupBug #1398448: ERROR while creating backup archive: while bundling state-critical files: write to tar file failed: open /var/log/juju/all-machines.log: no such file or directory <backup-restore> <ci> <ha> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1398448>21:54
sinzuiabentley, there is no promise of a specific naming convention. The new name is very similar and We...specifically me was probably too specific.21:55
* ericsnow looks21:55
perrito666menn0: that is odd, I have personally done that process a lot of times21:56
ericsnowabentley: shouldn't /var/log/juju/all-machines.log always be there when the CI tests run?21:57
perrito666ericsnow: albeit, we might just let it pass if a log file is not there21:57
ericsnowperrito666: we already do that for a couple other files21:57
ericsnowperrito666: but I'd expect all-machines.log to be there21:58
perrito666ericsnow: I think logs are not that critical as config and db21:58
perrito666ericsnow: although yes, it should be there21:59
abentleysinzui: I think that it is a regression to change even behaviour that was not promised.  Users will depend on anything they can get their hands on.21:59
abentleysinzui: But it's your call.21:59
perrito666abentley: I dont think this qualifies as a change in behavior21:59
sinzuiabentley, no one told me a naming convention. I picked a test that satisfied myself21:59
abentleyperrito666: Then why is our test broken?21:59
perrito666we make a backup and download a file in the same format, the name changes but that is an accident21:59
perrito666abentley: you made a reasonable assumption on the file name, yet it is a bit extremist to call it a regression22:01
perrito666the compression is the same, the format is the same, it is compatible22:01
perrito666abentley: ftr, we suck for not spec-ing that and I do apologize for it22:02
perrito666I am working on a spec22:02
abentleyperrito666: Yes, and the same could be said if you had changed the commandline arguments.22:02
abentleyperrito666: But just like this change, changing the commandline arguments would break scripts.22:02
jw4perrito666: thanks22:03
abentleyperrito666: I'm not upset that you changed it, I'm just concerned to make sure we don't break our customers.22:03
perrito666abentley: I am trying to rubber duck with you, I am not yet decided on if we break or not our customers22:04
abentleyperrito666: You've already said the regex makes a "reasonable assumption" about the filename format.  Do you think it is unlikely that any of our customers made the same assumption?22:05
perrito666abentley: yes, I was re-reading myself22:06
wallyworld_menn0: hi, did you need me?22:08
perrito666abentley: I concede, it is a reasonable assumption and we might need to maintain it.22:08
abentleyperrito666: Okay, well sinzui has overruled me, so I won't waste everyone's time discussing it further.  EOD for me.22:08
perrito666oh you can override abentley :p that is a cool feature22:09
perrito666have a nice EOD22:09
perrito666ericsnow: sinzui well its your call guys22:09
perrito666I am EOD too22:09
Ice-xdoes anyone know where I can buy dvd movies with a low price on the internet ?22:10
ericsnowperrito666: goodnight22:10
menn0perrito666: ciao22:10
jw4davecheney: ptal http://reviews.vapour.ws/r/565/22:15
jw4davecheney: perrito666 said I should get your review on that.22:15
thumperIce-x: wrong channel22:20
mwhudsonthumper, davecheney, alexisb: https://groups.google.com/d/msg/golang-dev/2ZUi792oztM/UCA2V7Ul3nkJ22:28
davecheneyjw4: /me looks22:30
jw4davecheney: ta22:31
jw4thanks davecheney :)  Comments duly noted22:33
jw4davecheney: actually - my preference would be to use https://code.google.com/p/go-uuid/ and not even mess with it ourselves - the only benefit is that our version is presumably lighterweight22:34
davecheneyjw4: if someone else has alredy written a package to to dhtat22:35
davecheneyuse it22:35
davecheneyjuju clearly has no policy against external depenencies22:35
jw4davecheney: I think I floated that idea and fwereade_ or someone pointed me to the utils package instead (sorry if I'm falsely accusing you fwereade_ )22:36
davecheneyjw4: meh22:37
davecheneywhat's better than having one of something ?22:37
davecheneyhaving more than one of something!!22:37
jw4davecheney: lol22:37
davecheneycode reuse involves less heroism22:37
=== kadams54 is now known as kadams54-away
ericsnowabentley: so about all-machines.log in CI, why would it be missing?22:42
menn0ericsnow: it could be a logging regression22:43
ericsnowmenn0: perhaps; do we have any other CI tests that rely on all-machines.log22:44
ericsnow?22:44
menn0ericsnow: where the log file isn't being created on some state servers?22:44
menn0ericsnow: just a guess22:44
ericsnowmenn0: yeah22:44
ericsnowmenn0: could be a permissions issue too22:44
menn0ericsnow: i'm just spinning up an HA env on ec2 now to check22:45
ericsnowmenn0: cool22:46
* ericsnow remembers abentley is EOD already22:47
katcowallyworld_: can someone stamp http://reviews.vapour.ws/r/569/22:50
katcooops, that's to anyone, not just wallyworld_22:50
wallyworld_katco: sure, looking22:50
katcowallyworld_: ty sir22:50
wallyworld_katco: land away22:53
katcowallyworld_: sweet. what the heck is the fixes syntax (or where can i find it)?22:53
wallyworld_$$fixes12345$$22:53
katcoty22:53
wallyworld_oops22:53
wallyworld_$$fixes-12345$$22:53
wallyworld_katco: ^^^22:54
katcoit's too late! the damage is done!!22:54
katcooh, the horror!22:54
wallyworld_sorry22:54
wallyworld_it will be rejected22:54
katcoi'm just kidding around :)22:54
wallyworld_i know22:54
menn0ericsnow: perms on all-machines.log could be the problem. on a new state server the logs dir looks like:22:55
menn0-rw------- 1 syslog adm    99219 Dec  2 22:54 all-machines.log22:55
menn0-rw------- 1 syslog syslog   883 Dec  2 22:52 ca-cert.pem22:55
menn0-rw------- 1 syslog syslog   589 Dec  2 22:52 logrotate.conf22:55
menn0-rwx------ 1 syslog syslog    83 Dec  2 22:52 logrotate.run*22:55
menn0-rw------- 1 syslog syslog 95478 Dec  2 22:54 machine-0.log22:55
menn0-rw------- 1 syslog syslog   895 Dec  2 22:52 rsyslog-cert.pem22:55
menn0-rw------- 1 syslog syslog   891 Dec  2 22:52 rsyslog-key.pem22:55
katcoit will be interesting to see if this completely fixes andreas's issue22:55
ericsnowmenn0: I recall thumper saying something about log permissions recently22:56
ericsnowmenn0: which is what made me think of it22:56
menn0ericsnow: but the backup runs as root right?22:57
ericsnowmenn0: I expect the issue of permissions partly depends on the user running the CI tests22:58
ericsnowmenn0: I don't know how they are set up22:59
ericsnowmenn0: I'm less convinced it's a permissions issue (the error message is "no such file or directory")22:59
menn0ericsnow: I'm pretty sure these tests run against a separate ec2 env basically from the end user's persepective23:00
menn0ericsnow: i've just got that ec2 env up23:01
menn0ericsnow: there's no all-machines.log on the 2 new state servers created by ensure-availability23:02
menn0ericsnow: so that could be the problem23:02
ericsnowmenn0: ah, yep, and from that perspective it's the API server process that is running the backup23:02
menn0ericsnow: ok the all-machines.log is there now on the new machines23:03
menn0ericsnow: it just takes a while to appear23:03
menn0ericsnow: that's probably the issue23:03
ericsnowmenn0: so if one of them runs the backup we'd get the failure we're seeing23:03
menn0ericsnow: which state server runs the backup? the master or the API server that handled the request?23:03
ericsnowmenn0: so a race condition of sorts?23:03
ericsnowmenn0: the one that handled the request23:04
menn0ericsnow: yeah, probably a race condition23:04
menn0ericsnow: i've looked at the test and it waits for all machines to be "started" after ensure-availability23:04
menn0ericsnow: but when I looked all machines were in the started state23:05
menn0ericsnow: but the all-machines.log wasn't there yet23:05
ericsnowmenn0: but it showed up after a little while, right?23:08
menn0ericsnow: yep23:08
ericsnowmenn0: so in practice this isn't likely to bite users23:08
menn0ericsnow: looking at the logs for the rsyslog worker, there might be at least a 20s gap23:08
menn0ericsnow: it would suck though if something wasn't quite right with rsyslog meaning that backup couldn't be taken23:09
ericsnowmenn0: agreed23:09
menn0ericsnow: maybe the backup system should be tolerant of missing log files23:10
ericsnowmenn0: we already have precedent23:10
ericsnowmenn0: what would be the importance of the logs on the restored host23:11
ericsnowmenn0: do we have code that relies on all-machines.log?23:11
ericsnowmenn0: did the /var/log/juju directory exist even when the log file was missing?23:12
menn0ericsnow: yes /var/log/juju was there with the local machine-N.log and the key files for logging23:15
menn0ericsnow: I don't think anything in Juju relies on all-machines.log except the debug-log feature23:15
menn0ericsnow: I don't think it's at all essential for the backup23:15
ericsnowmenn0: okay, good23:15
menn0wallyworld_: do you agree that it isn't essential that all-machines.log is included in a backup?23:16
wallyworld_menn0: do we include other logs?23:16
wallyworld_i can see that it would be useful, but not strictly necessary to get a system up and running again23:17
ericsnowmenn0: k23:17
ericsnowmenn0: I'm working up a patch to allow for missing log files23:18
ericsnowwallyworld_, menn0: we also back up machine-0.log23:18
katcowallyworld_: (when you get a chance) regarding your comment about tests for apiserver/leadership regarding permissions: can i just stub out an authorizer that returns a failure and ensure the code behaves correctly?23:19
wallyworld_so in that case why not include all-machines.log23:19
wallyworld_katco: i think so - from memory, the other similar tests use a fakeAuthorizer, but i'd need to check to be sure23:19
ericsnowwallyworld_: all-machines.log does get backed up23:19
katcowallyworld_: cool, ty.23:20
wallyworld_ericsnow: ok23:20
ericsnowwallyworld_: the problem is where it didn't exist yet in the HA case23:20
wallyworld_doesn't it always exist?23:20
ericsnowwallyworld_: apparently it doesn't show up immediately on the extra hosts under HA23:21
ericsnowwallyworld_: menn0 saw something like a 20s gap23:21
ericsnowwallyworld_: and that was enough for CI to fail23:21
wallyworld_oh, i see23:22
wallyworld_so why doesn't backup use the primary state server?23:22
ericsnowwallyworld_: we use whichever one cmd.Command.NewAPIRoot gives us23:26
ericsnowwallyworld_: you're suggesting we force that to machine 0?23:27
wallyworld_ericsnow: the state server itself that receives the request could redirect to the primary (not necessarily machine 0). i'm just thinking out loud23:27
ericsnowwallyworld_: yeah, menn0 hinted at the same thing23:28
menn0wallyworld_: that seems pretty fragile23:28
wallyworld_why fragile?23:28
menn0what happens if the master changes23:28
menn0if state servers have come and go23:28
menn0what is machine-0 anyway?23:29
wallyworld_the cluster knows which is master23:29
wallyworld_surely?23:29
wallyworld_i didn't say machine 023:29
wallyworld_i said NOT necessarily machine 023:29
menn0the cluster does know who master is23:29
wallyworld_right23:29
wallyworld_so if the machine that receives the request is not master, it redirects to master23:29
menn0it seems like a lot of extra complexity just to get a non-essential log file23:29
menn0which will be there most of the time anyway23:30
menn0it's only not there for the first 20s ish after a new state server comes up23:30
wallyworld_trouble is, people may want it (I can't make that call), and the backup needs to be consistent23:30
wallyworld_it can't sometimes include all-machines and sometimes not23:30
menn0I guess the oldest state server (which may or may not be the master) will have the most complete all-machines.log23:31
wallyworld_i'm not saying we must include it etc - just offering an argument of why we should23:31
ericsnowwallyworld_: agreed that the contents of a backup archive should be consistent23:32
wallyworld_there's an argument we could leave it out23:32
=== kadams54 is now known as kadams54-away
menn0wallyworld_: the problem is, I've seen the mongo master change even when the master doesn't go away23:32
wallyworld_menn0: ericsnow: so, i think we need to ask stakeholders - do they want all-machines, and can we exclude it23:33
wallyworld_hopefully they'll say they don't want it - they just want a backup so that a system can be restored if needed23:33
wallyworld_for now, to remove the regression, we can ignore all-machines if it doesn't exist23:34
wallyworld_or just exclude always23:34
ericsnowwallyworld_: why was it included in the first place (along with machine-0.log)?23:34
wallyworld_don't know for sure, i think everything in logs was just included23:35
menn0wallyworld_: I don't think we're talking about excluding all-machines.log are we23:35
menn0wallyworld_: just not aborting the backup if isn't there23:35
wallyworld_menn0: well, we could exclude it so that we always have a consistent backup23:36
menn0wallyworld_: i.e. it'll get included if it's there but otherwise there will just be a warning or something but the backup continues23:36
menn0wallyworld_: ok fair enough23:36
wallyworld_i'd rather not have it there sometimes and not others23:36
wallyworld_that's IMHO23:36
menn0wallyworld_: fair enough23:36
ericsnowmenn0: I'd be nervous about a backup archive having it only some of the time23:37
menn0wallyworld_: if we do decide it should be there I think it still shouldn't cause the backup to abort if it's not there23:37
wallyworld_agreed23:37
menn0wallyworld_: not being able to backup if rsyslogd isn't working would be kinda shite23:37
wallyworld_yep, i hope stakeholders just say we can exclude it23:38
menn0wallyworld_: so how about ericsnow makes an immediate change to make the log optional so we can get CI unblocked23:38
wallyworld_menn0: yep, that was my suggestion above23:38
menn0wallyworld_: and then once we've heard more from stakeholders there might be a further change23:38
menn0wallyworld_: ok. i missed that.23:38
wallyworld_and ericsnow needs to ask nate to engage stakeholers to see what we need to do23:38
wallyworld_we need to explain to them the implications etc so they understand the issues23:39
menn0ericsnow: i've just noticed there's another cause for this CI test failure too23:41
menn0ericsnow: ERROR unable to get DB names: EOF23:41
menn0ericsnow: that's from a different run. http://juju-ci.vapour.ws:8080/job/functional-ha-backup-restore/1165/console23:41
ericsnowwallyworld_: will do23:41
menn0ericsnow: looks unrelated to the all-machines.log issue23:41
wallyworld_ty23:41
ericsnowmenn0: I'll take a look in a sec23:42
menn0ericsnow: ok, i'll leave you to it23:46
ericsnowmenn0: k23:46
menn0sinzui, wallyworld_: do you think we can the all-machines.log backup CI blocker to no longer be a regression. it's really more of a race condition that's only likely to happen in the test.23:48
wallyworld_menn0: i do but it's not my ultimate call23:48
ericsnowwallyworld_, menn0: http://reviews.vapour.ws/r/572/23:55
wallyworld_ericsnow: why machine0Log?23:56
wallyworld_it won't be always machine0 surely?23:56
ericsnowwallyworld_: right23:59

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!