/srv/irclogs.ubuntu.com/2018/08/21/#juju.txt

[00:34] <veebers> thumper: It should take about how it's used by the tests (not juju) to store the details for substrates to test against
=== wgrant_ is now known as wgrant
[01:16] <thumper> veebers: do you have a few minutes
[01:16] <veebers> thumper: I do
[01:16] <thumper> veebers: HO?
[01:16] <veebers> thumper: sounds good, up release-call?
[01:16] <thumper> sure
[01:21] <wallyworld> anastasiamac: small one https://github.com/juju/juju/pull/9085
[01:22] * anastasiamac looking
[01:24] <anastasiamac> wallyworld: lgtm as long as 'hooks' dir is created.. m guessing it is otherwise it would not have worked for u :D
[01:24] <wallyworld> the python code does all that
[01:24] <wallyworld> you just need to assign the hooks
[01:24] <anastasiamac> yep. i assumed so :) thanks for a quick fix!!
[01:24] <wallyworld> np, i should do the same fix for the other charm in the same pr
[01:24] <wallyworld> the constraints one
[01:25] <wallyworld> will be the same thing
[01:25] <anastasiamac> yes, all charms should have it now...
[01:25] <anastasiamac> unless thay r testing the actual failure to deploy invalid charms and I do not think we have ci tests for that, only unit ones)
[01:36] <veebers> wallyworld, thumper it seems the upgrade test failure for 2.4 is legit, upgrade commands states to use proposed, controller logs show it's looking in released: https://pastebin.canonical.com/p/NG8PRCg26f/
[01:37] <wallyworld> shit eh
[01:38] <wallyworld> lucky we have tests
[01:46] <veebers> I've noted in the doc, moving on to the next one
[01:48] <veebers> team, what's the haps with https://bugs.launchpad.net/juju/+bug/1782803 (just noticed it as I was filing a bug)
[01:48] <mup> Bug #1782803: juju 2.4.1: juju status failure <cdo-qa> <cdo-qa-blocker> <foundation-engine> <juju:New> <Juju Wait Plugin:Invalid> <https://launchpad.net/bugs/1782803>
[01:48] <veebers> just noticed it was critical*
[01:55] <wallyworld> veebers: from memory we told them it wasn't for juju to retry
[01:55] <wallyworld> if it's the one i'm thinking of
[01:56] <wallyworld> anastasiamac: i pushed a couple of small fixes for the other 2 CI failures
[01:56] <anastasiamac> veebers: it was marked as New overnight but it should not be critical
[01:57] <veebers> wallyworld: ok, the bug has been updated 6 hours ago. It might not be clear there as it's still marked crit
[01:57] <veebers> ack anastasiamac ^^
[01:57] <anastasiamac> veebers: it's something with their setup and yes, wallyworld is right - not on us
[01:58] <wallyworld> looks like they've reopened it will logs attached
[01:59] <wallyworld> *with
[01:59] <wallyworld> it can be looked at but IMO we'll push back as not a release stopper
[01:59] <anastasiamac> wallyworld: m not convinced that the api change is needed but m not too attached to it :D so my +1 still stands unless u want multiple +1 from me :D
[02:00] <wallyworld> anastasiamac: what api change?
[02:00] <anastasiamac> "zip file spec 4.4.17.1 says that separators are always "/" even on Windows."
[02:00] <wallyworld> ok that. that's why the unit tests are failing on windows
[02:00] <wallyworld> we were looking for a hooks\install
[02:00] <anastasiamac> wallyworld: oh ic... good to know
[02:01] <wallyworld> i have not rerun the windows unit tests but that *has* to be the reason i think
[02:01] <wallyworld> we'll see soon enough
[02:01] <anastasiamac> :)
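
(Editor's sketch of the zip separator point above, not the actual juju fix, which is in Go. It uses only Python's standard zipfile module; the has_hook helper and the in-memory archive are hypothetical, for illustration. Entry names inside an archive use "/" on every platform, so a lookup built with the OS path separator would search for hooks\install on Windows and miss the entry.)

```python
import io
import zipfile

# Build a tiny charm-like archive in memory. "hooks/install" is the entry
# discussed above; the file content is a placeholder.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hooks/install", "#!/bin/sh\n")

with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()


def has_hook(names, hook):
    # Hypothetical helper: always join with "/", never with os.path.join(),
    # because os.path.join() would produce "hooks\\install" on Windows.
    return "hooks/" + hook in names


print(has_hook(names, "install"))
print("hooks\\install" in names)
```
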
[02:02] <veebers> kelvinliu__: nice work with the enable-condition
[02:04] <wallyworld> kelvinliu__: has that fix above landed? if so i'll strike out the issue in the doc
[02:05] * thumper groans
[02:06] <thumper> veebers: the test failed with an unrelated failure AFAICT
[02:06] <kelvinliu__> veebers, wallyworld just in 1:1 meeting with Tim. going to land it now,
[02:06] <wallyworld> gr8 ty
[02:07] <veebers> thumper: which test
[02:07] <kelvinliu__> veebers, I deployed the RunFunctionaltests-amd64 job with the fix, and tested
[02:07] <veebers> thumper: ah log rotation right
[02:07] <thumper> yeah...
[02:07] <veebers> kelvinliu__: sweetbix
[02:08] <thumper> I'm just deploying the charm myself locally and testing that way
[02:08] <kelvinliu__> wallyworld, landed and tested. going to re-test crd now
[02:08] <wallyworld> ty
[02:09] <veebers> thumper: what was the new failure?
[02:10] <thumper> https://pastebin.canonical.com/p/rQfKs22C4d/
[02:12] <veebers> thumper: you weren't expecting "ERROR:root:Wrong unit name: Expected: /var/log/juju/machine-0.log, actual: /var/log/juju/machine-lock.log" ?
[02:14] <thumper> veebers: oh, I was just looking at the last error...
[02:15] <veebers> thumper: ack, that last failure is probably jujupy choking because you used --existing and it screwed up and got confused :-|
[02:15] <thumper> ah
[02:26] <thumper> wallyworld: quick call?
[02:27] <wallyworld> ok
[02:27] <thumper> wallyworld: release call HO?
[02:31] <thumper> wallyworld: https://github.com/juju/juju/pull/9086
[02:50] <veebers> lxc list
[02:50] <veebers> lol, wrong window
[02:58] <anastasiamac> veebers: lolo :) at least no password... we've all putour password into irc chat at least once :)
[02:58] <veebers> hah, I have done that too :-P
[02:59] <veebers> or perhaps 'lxc list' *is* my password >_>
[03:00] <anastasiamac> hmmm k that would b pretty sad pass phrase :) altho m not better - i usually use song lyrics as my pass phrases :)
[03:01] <anastasiamac> like a variation on 'a spoonful of sugar' :D
[03:06] <veebers> ^_^
[03:08] <veebers> ok, I'm redoing how we do the manual provider test, it's silly how we're currently doing it
[03:13] <babbageclunk> Did someone clean up the GCE addresses? The quota is saying 4/23 in use.
[03:15] <veebers> babbageclunk: I didn't, is it split by region?
[03:16] <babbageclunk> veebers: I think so, but this is for the us-central1 region that's in the error.
[03:16] <babbageclunk> huh, curiouser and curiouser.
[03:16] <veebers> babbageclunk: hmm odd, Perhaps it's was a perfect storm and there was heaps of jobs running in that region at the time and we got unlucky to run out
[03:17] <babbageclunk> Yeah, maybe.
[03:19] <veebers> babbageclunk: could be worth checking what regions are used in tests and perhaps manually distributing them out a bit?
[03:20] <babbageclunk> veebers: ok, just looking at the job config to understand what it's doing.
[03:20] <veebers> babbageclunk: heh, let me know if you need anything clarified :-) Most the job configs are setup, the test run is a single build step
[03:22] <babbageclunk> Thanks, I'll have a go at working it out first before roping you in! :)
[03:22] <veebers> is it possible to set a UserKnownHostsFile option for juju (i.e. ssh option)?
[03:51] <wallyworld> veebers: juju help ssh says yes
[03:52] <wallyworld> i assume you are talking about for running juju ssh
[03:52] <veebers> wallyworld: I meant for everything ssh that juju does (i.e. with a manual provider how it gets into the machine)
[03:53] <wallyworld> oh, juju use of ssh internally. i think that's all fixed
[03:53] <veebers> it's ok I've gone with a different approach that'll work. It's just not so fancy
[03:53] <wallyworld> fixed as in hard coded
[03:53] <veebers> wallyworld: ack, thanks for confirming. I've got something working though
[03:54] <veebers> (the reason was: I was 'lxc copy'-ing new machines from a base, but need to auth them to ssh in, using a generated known_hosts key would work, but need to set which file that actually is).
[03:54] <wallyworld> veebers: i left a comment on that upgrade bug - not something we can fix quickly / easily sadly IIANM
[03:54] <veebers> I've since created manual tests for the different clouds and locked down which IPs they start with. The lxd network management seems pretty nifty https://stgraber.org/2016/10/27/network-management-with-lxd-2-3/
[03:55] <veebers> wallyworld: oh, :-(
[03:55] <veebers> wallyworld: it worked previously though right?
[03:55] <wallyworld> not that i can see
[03:55] <wallyworld> i can't have
[03:55] <wallyworld> it
[03:56] <veebers> ah ok
[03:56] <veebers> oh, it works in develop though
[03:56] <wallyworld> simple controller model owrks
[03:56] <wallyworld> but not agents on machines
[03:56] <wallyworld> probably broken in devel too, or not?
[03:57] <wallyworld> need to check but if it works in develop my theory is wrong
[03:59] <wallyworld> veebers: is the pexpect() stuff a substring match? eg does child.expect('(?i)password') match "some text here password:"
[04:02] <veebers> wallyworld: that test is green for develop branch (upgrades)
[04:03] <veebers> wallyworld: re: pexpect, should just be regex IIRC
[04:03] <wallyworld> it could be green because the agent binaries get cached
[04:03] <wallyworld> so my theory could be wrong. the code looks correct though
[04:04] <wallyworld> for the controller be use the supplied agent stream
[04:05] <veebers> wallyworld: FYI https://pexpect.readthedocs.io/en/stable/api/pexpect.html#pexpect.spawn.expect
[04:06] <wallyworld> hmmm, that test should work then
[04:07] <wallyworld> unless it needs ^.* etc
[04:09] <veebers> we should make it as promiscuous as possible, we only care if its asking for a password
[04:12] <veebers> wallyworld: FYI I found a 2.4 branch run of the upgrade tst that passed: http://10.125.0.203:8080/job/nw-upgrade-juju-amd64-lxd/199/console (2.4-rc2)
[04:12] <wallyworld> veebers: it could be the error is misleading then
[04:13] <wallyworld> the agents will only look in release streams
[04:13] <wallyworld> but if the controller has been done successfully first, the agents will be cached
[04:14] <veebers> although this one fails as we're seeing now: http://10.125.0.203:8080/job/nw-upgrade-juju-amd64-lxd/233/console
[04:25] <thumper> I can't seem to get the dbLog feature tests that fail intermittently to fail on my machine at all
[04:26] <wallyworld> veebers: if my reading of the pexpect doc is correct, our test is broken. http://pexpect.sourceforge.net/pexpect.html#spawn-expect seems to say that expect("bar") will not match "foobar". so our expect("password") will not match "Enter a password:"
[04:29] <veebers> wallyworld: huh, that seems to be the case if we're just passing in the string. We could pass in a compiled regex instead
[04:29] <veebers> is the tst really just using ("password")? that sucks
[04:30] <wallyworld> child.expect('(?i)password')
[04:30] <wallyworld> which hopefully is treated as an uncompiled regex
[04:31] <wallyworld> alol other usages seem to do the right thing and use the whole prompt
[04:32] <wallyworld> eg
[04:32] <wallyworld> child.expect('Enter client-email:')
[04:32] <babbageclunk> did someone delete that GCE quota?
[04:32] <wallyworld> not me said the duck
[04:32] <veebers> babbageclunk: I haven't touched it
[04:32] <babbageclunk> Weird, it's not listed on the quota page anymore. :/
[04:33] <veebers> wallyworld: "Strings will be compiled to re types"
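
(Editor's sketch of the question above. It does not call pexpect. It uses the standard re module to show how the string '(?i)password' behaves once compiled, on the assumption that expect() then searches the child's output for the pattern instead of anchoring it at the start. The sample prompts are taken from the chat.)

```python
import re

# Stand-in for the pattern handed to child.expect(); per the quoted docs,
# a plain string is compiled to a regular expression.
pattern = re.compile("(?i)password", re.DOTALL)

samples = ["Enter a password:", "some text here password:", "Enter client-email:"]

# search() finds the pattern anywhere in the text; match() would only
# succeed if the text started with it.
results = {text: bool(pattern.search(text)) for text in samples}

for text, found in results.items():
    print(repr(text), found)
```

Under that assumption the first two prompts match and the client-email prompt does not, so no leading ^.* would be needed.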
[04:33] <veebers> babbageclunk: that's really odd
[04:33] <kelvinliu__> wallyworld, the crd works as expected.
[04:34] <wallyworld> veebers: ok, i'll look to follow convention elsewhere and use the exact prompt
[04:34] <wallyworld> kelvinliu__: awesome ty
[04:34] <kelvinliu__> wallyworld, np
[04:34] <veebers> wallyworld: a regex would be better surely? so we don't get tripped up by minor text changes
[04:35] <wallyworld> our preferred convention elsewhere (in juju also) is to use exact text
[04:35] <wallyworld> so we get breakages
[04:35] <wallyworld> so we think about the consequences of changing
[04:35] <wallyworld> and also so we can see when error messages are dumb
[04:36] <wallyworld> if you just match on a small regexp, you miss things like "could not do this because: could not do this: because could not do it" etc
[04:37] <veebers> wallyworld: ack
[04:37] <veebers> good point
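
(Editor's sketch of the exact-text convention described above, using only the standard re module. The helper names and the "permission denied" message are hypothetical; the stuttering message is the one quoted in the chat. A small regex accepts both messages, so it hides the badly worded one, while an exact comparison makes a test fail as soon as the wording changes.)

```python
import re


def matches_loosely(fragment, actual):
    # Small-regex style: passes if the fragment appears anywhere.
    return re.search(fragment, actual) is not None


def matches_exactly(expected, actual):
    # Exact-text style: re.escape() makes the whole message literal and
    # fullmatch() rejects any extra or changed text.
    return re.fullmatch(re.escape(expected), actual) is not None


good = "could not do this: permission denied"  # hypothetical message
stutter = "could not do this because: could not do this: because could not do it"

print(matches_loosely("could not do this", good))
print(matches_loosely("could not do this", stutter))
print(matches_exactly(good, good))
print(matches_exactly(good, stutter))
```
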
[04:39] <veebers> thumper: it seems like the commands in that job are failing which feeds bad input into the next command. one sec I'll line something up
[04:48] * thumper nods
[04:59] <veebers> vinodhini: looks like the timeout extension worked, it needed an extra 10 minutes apparently
[04:59] <veebers> 100 minutes is a long time for that test though, maybe there is an issue with azure-arm. Did you try a different region too? Perhaps the default we use is slow etc.
[04:59] <vinodhini> i didnt try diff region.
[05:00] <vinodhini> its just timeout period i incresed first in default reg
[05:00] <vinodhini> http://localhost:18080/job/nw-model-migration-amd64-azure-arm/647/console
[05:05] <veebers> vinodhini: I would attempting trying a different region see if that goes faster; having a test take 1hr 40 min is a bit gross :-)
[05:07] <vinodhini> i will try with actual time period and diff region
[05:07] <vinodhini> i mean the orig time period
[05:16] <vinodhini> veebers: just a quick clarification plz correct me if i am wrong here - ENV=parallel-azure-arm -- iam setting this to different region. and i am listing out the regions from juju list-region azure
[05:16] <veebers> vinodhini: no, that env stays the same (it's the part that says run this test in azure-arm). just below that should be the assess_<blah> call, that should take a --region arg
[05:16] <veebers> one sec, let me check
[05:17] <wallyworld> veebers: a small PR for the pexpect fix
[05:17] <wallyworld> https://github.com/juju/juju/pull/9087
[05:17] <vinodhini> ok. iam seeing in acceptance test assess_model_migration
[05:17] <vinodhini> i got that.
[05:18] <vinodhini> --region is option which overrides it.
[05:18] <vinodhini> it alright thanks veebers
[05:18] <veebers> vinodhini: yeah --region should be there for the model migration test
[05:18] <veebers> sweet :-_
[05:18] <veebers> wallyworld: ack, looking
[05:18] <veebers> wallyworld: you've used a json query CLI tool before? something like jq or so?
[05:18] <wallyworld> i have
[05:19] <wallyworld> can't remember the syntax though
[05:19] <wallyworld> been a while but very useful
[05:20] <vinodhini> its ok. i verified in py script
[05:20] <veebers> wallyworld: ack cool I'll look it up, Should be able to use this 5 piped command using grep/sed/head etc. ^_^
[05:20] <vinodhini> now i have set time 90 and diff region and started it
[05:20] <vinodhini> lets see
[05:20] <wallyworld> veebers: yep, i pipe from stdin etc when i used it
[05:22] <wallyworld> veebers: i thought about controller_name but that is the one bit we don't really care about that could change
[05:22] <veebers> wallyworld: ack, fair enough
[05:22] <wallyworld> and it may not be contreoller_name
[05:23] <wallyworld> the test should be using a different controller
[05:23] <wallyworld> for true multi-controller cmr
[05:35] <babbageclunk> Is anyone else getting gocomplaints from gometalinter about gomocks-generated files not being goimported?
[05:44] <babbageclunk> wallyworld: ^
[05:44] <veebers> I've updated the nw-bootstrap-constraints-maas-2-2 job so it should get the right input for the test, going to have tea will check back in later on.
[05:53] <wallyworld> babbageclunk: i haven't so far
[05:53] <wallyworld> kelvin added some new micks yesterday
[05:53] <wallyworld> but they are all committed in tree
[05:54] <babbageclunk> wallyworld: I tried running it again and it went away, so I don't know what was happening there.
[05:54] * wallyworld shrugs
[05:54] * babbageclunk also
[06:25] <veebers> wallyworld: don't forget to propse your fixes to develop too :-)
[06:30] <vinodhini> veebers: are u strill ard. i did revert back the time qnd changed the region and its all good Success.
[06:30] <vinodhini> http://localhost:18080/job/nw-model-migration-amd64-azure-arm/648/console
[06:48] <vinodhini> wallyworld: looks like veebers not ard
[06:49] <vinodhini> i would like to know abt this azure failure which is actually fine if we change the regin.
[06:50] <stub> go go gadget gometalinter
[06:50] <vinodhini> so what shd be the solution ? i have made the modification directly in Web UI
[06:59] <vinodhini> I have updated the doc.
[07:15] <wallyworld> vinodhini: not sure, i'll have to read the failure, i am not faimi9lair with it
[07:18] <wallyworld> vinodhini: wouldn't it be better to increase the test timeout? that's what i seem to recall may have been discussed this morning
[07:43] <vinodhini> wallyworld: i was away to get some dinner.
[07:43] <vinodhini> Yes. initially i increased the timeout period and it was successful.
[07:43] <vinodhini> but veebers was asking me not to do that way
[07:52] <wallyworld> vinodhini: ok, i'm surprised at that. i'll talk to him tomorrow. just changing the region is quite fragile as that coud slow down also
[07:52] <wallyworld> thanks for looking into it
[07:52] <vinodhini> its ok.
[07:53] <vinodhini> i was working in credentialsd part its was just side by side running.
[07:53] <wallyworld> good plan
[07:53] <vinodhini> this is not potential failure. its slow thats why its an issue
[07:53] <wallyworld> yeah, azure is very slow at instance creation/destroy
[07:54] <vinodhini> So we arent doing release today ?
[07:54] <vinodhini> I am sure veebers will look into the status a bit later :-)
[07:54] <wallyworld> maybe, maybe not, depends on how the other guys go with the remaining issues. i'd say not today but tomorrow if i had to guess
[07:55] <vinodhini> In this case how to target the solution iam not sure. Modifying a config option is not a fix.
[07:55] <vinodhini> So we should focus on solution.
[07:57] <vinodhini> ok. wallyworld. I am drafting a mail to you. I wont be there tomorrow morning hours as i have appoinment with Indian consulate.
[07:59] <wallyworld> it depends on the root cause. if the substrate is slow, then increasing a timeout seems reasonable to me
[08:04] <veebers> vinodhini, wallyworld: The timeout is already 90 minutes, any more seems like a huge amount. My suggestion was to try a different region in case the original is having troubles etc.
[08:04] <wallyworld> wow 90 minutes!!!
[08:04] <wallyworld> fark
[08:05] <veebers> wallyworld: if it's still taking ages in another region there is an issue there
[08:06] <wallyworld> yeah, let's see
[08:06] <veebers> yeah, it times out after 90 :-) Takes about 1hr 45 min for a successful run
[08:07] <wallyworld> veebers: do you know the gce quota status? was that sorted?
[08:08] <veebers> wallyworld: no idea sorry. I know babbageclunk was looking. We thought perhaps it was bad timing and we had a bunch of stuff all running the same region etc. Not sure if the suggestion to check which region is used across tests (with the thought to share it out a bit) went
[08:08] <wallyworld> ok, np
[08:10] <veebers> wallyworld: the jq way is much better: https://github.com/CanonicalLtd/juju-qa-jenkins/pull/81/files
[08:56] <wallyworld> veebers: looks good
[09:10] <veebers> wallyworld: this is an easy one: https://github.com/juju/juju/pull/9088
[09:10] <wallyworld> looking
[09:11] <wallyworld> lgtm ty
[10:56] <stickupkid> manadart: you got 5 minutes for a quick HO?
[11:04] * stickupkid gone for lunch
[11:14] <rick_h_> morning party folks
[11:30] <rick_h_> stickupkid: morning
[11:31] <rick_h_> stickupkid: can I ask you to pause WIP and grab an issue from the release blocking doc please?
[12:11] <stickupkid> sure can
[12:12] <rick_h_> stickupkid: ty, the other side of the world cranked out a lot of notes/fixes and we need to help move forward today.
[12:12] <stickupkid> rick_h_: just reading up on the doc
[12:12] <rick_h_> stickupkid: k, let me or hml know if you have any questions/issues
[14:04] <manadart> externalreality: Approved #9084
[14:15] <externalreality> manadart, cool. I spoted that I attempted to push the removal of the Id feild did not make it in. Gonna add that before attempting to land.
[15:16] <manadart> externalreality: Didn't quite get all of my PR done before EoD, but I've put it up as a WIP, if you are able to review: https://github.com/juju/juju/pull/9090
[15:27] <hml> stickupkid: quick pr pls: https://github.com/juju/juju/pull/9091
[15:28] <stickupkid> hml: done
[15:28] <hml> stickupkid: ty
[15:30] <hml> stickupkid: i’m off to long lunch shortly.  do you have anything for me to review?
[15:30] <stickupkid> hml: nope, nothing atm, just digging
[15:30] <stickupkid> pretty sure I'm just making the hole deeper
[15:30] <hml> stickupkid: ha!
[15:34] <externalreality> manadart, reviewing now
[15:38] <rick_h_> stickupkid: "I'm gonna need a bigger shovel!"
[15:39] <stickupkid> rick_h_: true
[15:52] <stickupkid> has anyone seen this recently "16:51:11 DEBUG juju.provider.common bootstrap.go:575 connection attempt for 10.156.96.10 failed: /var/lib/juju/nonce.txt does not exist" - it's been happening a couple of times today
[15:52] <stickupkid> ?
[15:53] <stickupkid> Just doing a "juju bootstrap localhost --debug" on the 2.4 branch
[15:53] <stickupkid> it works in the end, but really takes it's time...
[15:57] <rick_h_> stickupkid: looks like some history https://bugs.launchpad.net/juju-core/+bug/1314682
[15:57] <mup> Bug #1314682: Bootstrap fails, missing /var/lib/juju/nonce.txt (containing 'user-admin:bootstrap') <bootstrap> <juju> <maas-provider> <juju:Expired> <juju-core:Won't Fix> <https://launchpad.net/bugs/1314682>
[15:57] <stickupkid> rick_h_: nice, i'll give that a read
[16:00] <stickupkid> rick_h_: so i guess the retry that's implemented to fix this, does work... maybe my computer was just being slow...
[16:01] <rick_h_> stickupkid: yea, not sure.
[16:01] * stickupkid back to digging...
=== beisner_ is now known as beisner
[20:55] <veebers> Morning o.
[20:58] <rick_h_> wheeeee
[21:20] <cory_fu> wallyworld, kelvinliu_, knobby: This call reminded me of this, if you haven't seen it: https://www.youtube.com/watch?v=JMOOG7rWTPg  :p
[21:24] <kelvinliu_> cory_fu, ^.@
[21:24] <babbageclunk> wallyworld, veebers: I had a look at the GCE quota thing. As far as I could see the quota was now fine - IP addresses in use was fluctuating between 4 and 0 when the test was running. I couldn't change the region tests were using because it's defined as us-central1 in environments.yaml. Maybe I could duplicate parallel-gce as parallel-gce-us-east1 and move some jobs to use that instead?
[21:25] <veebers> babbageclunk: using --region with an assess script should overwrite that IIRC
[21:25] <hml> babbageclunk: it looks like we may hit it when there are two ci-run going at the same time
[21:26] <hml> babbageclunk: that’s what was giong on when it was hit again in run 1089
[21:26] <babbageclunk> veebers: ah, thanks - so if I change the jobs to use different regions that might avoid it? It definitely looks like a per-region quota.
[21:27] <babbageclunk> veebers: ok, I'm going to do that now.
[21:29] <babbageclunk> (dumb question, but what does the nw- prefix mean?)
[22:31] <babbageclunk> gah, my brain's stopped accepting "likelihood" as a real word.
[22:31] <babbageclunk> likeli
[22:32] <rick_h_> it does look strange written out
[22:46] <babbageclunk> veebers: can you take a look at https://github.com/CanonicalLtd/juju-qa-jenkins/pull/83 ? I've checked there are no errors from jenkins-jobs.
[22:46] <veebers> babbageclunk: can do
[22:46] <babbageclunk> ta
[22:46] <babbageclunk> After it's deployed I'll make sure to run each of the changed jobs, just in case I missed a \
[22:48] <veebers> babbageclunk: LGTM. a redeploy should be just doing nw-* so it redploys all the functional jobs (no need to screw around cherry picking names etc.)
[22:49] <babbageclunk> veebers: more detail? Hang on, I'll read more of the readme.
[22:50] <veebers> babbageclunk: oh, your question earlier re: nw_ prefix; hah it's because while we where spinning up the new CI run bits we continued to run the original jobs; You couldn't run both at the same time as they stomped on each other (workspace/$JOBNAME is the working dir for a job). So I added nw- (new world), it was supposed to be changed when we did the roll over but never was
[22:51] <veebers> babbageclunk: ah sorry, hah yeah the arg for jenkins-job . . . . -r jobs/ci-run nw-*
[22:52] <babbageclunk> Ah, ok - so running `jenkins-jobs update` like in the deploying jobs section, but with a wildcard to do all the new-world jobs.
[22:52] <babbageclunk> veebers: coolthanks!
[22:52] <veebers> babbageclunk: yep that's the one
[22:54] <babbageclunk> veebers: ok, having a go at deploying them now.
[22:55] <veebers> babbageclunk: sweet, let me know when it's done as I'm deploying and testing some changes I'm making
[23:43] <babbageclunk> how do you add a private key interactively (in juju add-credential)? Remove all the linebreaks?
