/srv/irclogs.ubuntu.com/2018/09/07/#juju.txt

babbageclunk--config="features=[legacy-leases]"00:01
babbageclunkwallyworld: ^00:01
babbageclunkuh-oh, why?00:01
babbageclunkfound a bug?00:01
wallyworldbabbageclunk: in k8s bootstrap testing, the initialisation is getting lease errors and we want to rule out a raft issue in the docker image00:01
babbageclunkcool00:01
thumperwallyworld: got 5 minutes?00:08
wallyworldthumper: sure, just talking to kelvin, give me a minute00:09
thumperwallyworld: I'll jump into our 1:1, come when you're ready00:13
babbageclunkwallyworld: what size did you suggest to use as a minimum for downloading the agent binaries again?01:08
wallyworldi think they are approx 80MB but we should allow for unpacking temp space etc01:08
wallyworldso say 250MB to be safe?01:09
babbageclunkheh, that's the same as in upgrades/preupgradesteps.go01:10
babbageclunkIt's checking after the binaries are downloaded/unpacked and the agent has restart, though, so not quite the same.01:11
babbageclunk*restarted01:12
wallyworldkelvinliu_: no rush, would love a review on this today sometime https://github.com/juju/juju/pull/917401:19
kelvinliu_wallyworld, it's a big one! looking now.01:21
wallyworldkelvinliu_: yeah - a lot of it is shifting code from one facade to another01:22
wallyworldno hurry01:22
kelvinliu_wallyworld, yup01:22
kelvinliu_wallyworld, wondering if it's easy to change the APIAddresses to [juju-controller-service-internal-endpoint]01:27
wallyworldkelvinliu_: i don't quite understand? can you explain?01:28
kelvinliu_for example, juju-controller[.juju-namespace]:1707001:29
kelvinliu_wallyworld, not for this PR, just thinking it for juju k8s version01:30
wallyworldi'm sure we can discuss that01:31
* wallyworld goes to buy coffee, bbiab01:31
veebersoh man, search command history I should be more careful. delete and describe namespace both start with de but do very different things01:38
kelvinliu_ - -!!01:40
wallyworldveebers: only just saw that comment. i shouldn't laugh but i can't help it :-D02:22
veebershah, it's ok it was an *almost* mistake, I caught it in time02:25
veeberswallyworld: we're still expecting caas charms to set Active after setting pod spec right?02:25
wallyworldi think so for now; but they only really need to if they have set "maintenance". if the unit status remains "waiting for container" we will override that02:27
veeberswallyworld: ack, sweet that matches my expectation. I did a run through but using the not updated charm so didn't work quite as expected; watch this space, just waiting for the bootstrap to complete02:28
wallyworldyeah, demo charms will need updating02:28
veeberswallyworld: re: this comment https://github.com/juju/juju/pull/9081/files/#r215465303, the tests added to state/caasmodel_test.go should cover the caas model and tests in state/status_model_test.go should cover IAAS models02:37
veebersactually state/status_model_test.go might not show up on that diff as I've modified my changes to it etc.02:37
wallyworldok, so long as we have coverage. i can check the final PR02:46
wallyworldkelvinliu_: you ask a good question - i have answered in the PR, but basically I am allowing for the future if we want to only use a single config map for all applications02:46
wallyworlddoes that make sense?02:47
kelvinliu_wallyworld, yeah, it makes sense. LGTM, thanks02:47
wallyworldtyvm02:47
veebersanastasiamac: tough question, have we ever seen the 'local charm archive' test fail in merge, or just the check-merge jobs?03:16
veeberswallyworld: FYI: https://pastebin.canonical.com/p/vFjmPhPrJc/03:28
veebersNote the "Instantiting pod." is poor messaging from me03:29
wallyworldveebers: also though, the active status for workload is premature03:29
wallyworldsince the pod has still not come up03:29
wallyworldi think at that stage the container status is "waiting" ?03:30
veeberswallyworld: didn't we discuss having the charm set active after it sets pod spec?03:30
wallyworldyes03:30
wallyworldbut03:30
veebers(I'm with you that it seems dishonest to have it set active then)03:30
wallyworldwe said that would be overridden from container status if the container is not running03:31
wallyworldi.e. if container status is blocked, that takes precedence03:31
veeberswallyworld: I'm pretty sure the container was never in the blocked stage (or that we polled at least), it was a happy deploy so we set pod spec and the pod came up real quick03:32
wallyworldthe container status message would be the reason why it is blocked03:32
veebersI'm happy to confirm this though03:33
wallyworldit had to be blocked because the pvc failed03:33
wallyworldthe k8s pod status would have been Unschedulable03:33
veebersah right, yeah that did happen, then after the trust addition the pod came up happy03:33
wallyworldyup03:33
wallyworldso until trust is run, the unit needs to show blocked03:34
wallyworldwhich it gets from container status03:34
veebersit did show that, but it went from active -> blocked (as the charm did 'set spec', then set active), so we saw active first, then blocked once k8s realised the pod was unschedulable03:35
wallyworldthe container status would not have been "running" yet though?03:38
wallyworldi think unless the container status is running, we should not show the unit status as active03:39
anastasiamacveebers: let me check...03:41
anastasiamacveebers: the one babbageclunk linked yesterday was in merge not the check...03:42
anastasiamacwhy?03:42
veeberswallyworld: (unless I have the whole charm should set to active when doing the podspec) then there will be a gap, charm sets pod spec then sets active, k8s/pod will spin up then find out that it's blocked but by then juju has seen the 'active' status from the charm03:42
veebers(it's too late, it's seen everything)03:42
veebersanastasiamac: ah ok, I'm thinking that with the way the PR check job is run, there is a bit of space for interruption from other jobs or jenkins. Really wanting to get my reshuffle of that stuff sorted and proposed03:43
wallyworldyeah right. we are stuck because the charm cannot see the workload03:43
anastasiamacveebers: and fwiw, m not seeing it at all locally... more than 24hr running under stress...03:43
wallyworldveebers: but at that stage, there will be no container status right?03:43
veebersanastasiamac: aye, this is why I think it's how it's run in jenkins03:43
wallyworldso we could also say if container status is not found, don't go to active03:44
veeberswallyworld: right, because the mechanism that does that has only just kicked off03:44
wallyworldtreat it as container is still allocating03:44
veeberswallyworld: ok, will look at adding that into it too :-)03:44
wallyworldty03:44
wallyworldso close03:44
anastasiamacveebers: i think it's test data setup time when run in parallel... i think just waiting for charm to be setup will solve the issue...03:44
veeberswallyworld: feels like we're shoring it up with toothpicks and ducttape though ^_^03:45
veebersanastasiamac: hopefully so03:45
wallyworlda bit but there's not much else we can do yet03:45
veebersaye03:45
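[Editor's sketch] The display rule agreed above could look roughly like the Go below. This is a minimal sketch under stated assumptions, not Juju's actual helper: the names Status, StatusInfo and displayStatus are hypothetical, and the status strings are illustrative. It encodes three points from the discussion: a blocked container status takes precedence, a unit is not shown as active unless the container is running, and a missing container status is treated as the container still allocating.

```go
package main

import "fmt"

// Status and StatusInfo are hypothetical stand-ins for the real status types.
type Status string

const (
	Active  Status = "active"
	Blocked Status = "blocked"
	Waiting Status = "waiting"
	Running Status = "running"
)

type StatusInfo struct {
	Status  Status
	Message string
}

// displayStatus derives the status to present from the raw unit status
// (as set by the charm) and the raw container status (nil if none has
// been recorded yet). Neither input is modified.
func displayStatus(unit StatusInfo, container *StatusInfo) StatusInfo {
	if container == nil {
		// No container status yet: do not trust "active" from the charm.
		if unit.Status == Active {
			return StatusInfo{Status: Waiting, Message: "waiting for container"}
		}
		return unit
	}
	// A blocked container wins; its message explains why it is blocked.
	if container.Status == Blocked {
		return *container
	}
	// Only show "active" once the container is actually running.
	if unit.Status == Active && container.Status != Running {
		return StatusInfo{Status: Waiting, Message: container.Message}
	}
	return unit
}

func main() {
	charm := StatusInfo{Status: Active, Message: "ready"}
	blocked := StatusInfo{Status: Blocked, Message: "pod unschedulable"}
	fmt.Println(displayStatus(charm, nil))
	fmt.Println(displayStatus(charm, &blocked))
}
```

Because the function only reads its inputs, the raw unit and container statuses can be stored exactly as the charm and the provisioner set them, with the transformation applied when presenting status.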
babbageclunkveebers, wallyworld, thumper: who knows about statfs?04:36
veebersbabbageclunk: what is statfs?04:36
babbageclunkor anastasiamac, kelvinliu_04:36
veebers(I guess that answers for me .  . .)04:36
wallyworldwhat's the question?04:36
babbageclunkIt's a syscall to get filesystem stats04:36
veebersah /me googles04:36
anastasiamac the same as veebers 'wot's statfs'?04:37
anastasiamacbabbageclunk: why do u need it? what is wrong?04:37
babbageclunkI'm trying to test my check-space change, but having a bit of a mare.04:37
anastasiamactry stallion?04:37
anastasiamacbut srsly.... what kind of nightmare?04:38
babbageclunkwallyworld: I'm trying to understand the difference between the Bfree and Bavail fields (and how they interact with fallocate, which is how I'm filling up the disk)04:38
wallyworldhmmm, shrug, sorry :-(04:39
wallyworldyou can always test with a number > than your current free disk space04:39
wallyworldjust to see04:39
wallyworldwithout having to fill the disk04:39
babbageclunkYeah, I found that in your tests. :)04:40
wallyworld*my* tests?04:40
babbageclunkI'm doing a manual test.04:40
wallyworldwhat a clever hack that was :-)04:40
kelvinliu_not very sure what's the difference.. sry04:40
wallyworldright, so a manual test, just compile a jujud with a bogus free space requirement04:40
babbageclunkwallyworld: https://github.com/juju/juju/commit/47b2bf184636b31076de11979429304ecd8a78a904:41
babbageclunkwallyworld: well, the other way is to really fill up the disk using fallocate, which is what I'm doing (on someone else's computer)04:42
anastasiamacbabbageclunk: that was 2015!! another era...04:42
babbageclunkIan had long hair04:42
wallyworldit is, but if it's too hard to do, is there a real benefit?04:42
wallyworlds/long hair/hair04:42
* wallyworld sobs quietly04:43
babbageclunkIt's not that it's hard to do, it's that my code doesn't seem to be working...04:43
wallyworldthanks for mentioning it :-(04:43
babbageclunk:)04:43
wallyworldso you can't just code it to require 100000000000000000000000000000GB of free space and watch it fail?04:44
babbageclunkAnd the code I cribbed from you (in that commit) uses Free, which from my testing still returns a big number even though df -h says I only have 68M available.04:44
anastasiamaci like the idea of asking for unrealistic number.. this could be codified...04:44
babbageclunkI can, but I'd rather test the real code or understand why it doesn't work04:44
babbageclunkYes, I'm doing that in tests, but without really testing it it's not obvious that the code is actually right.04:45
anastasiamacbabbageclunk:  ur very dedicated if u want to fill up someone else's machine just to test space check...(or u have exceptional friends!!)04:45
wallyworldbabbageclunk: why what doesn't work? if the bogus number fails in "prod" with a special jujud, that seems ok right?04:45
babbageclunkI mean, it's Jeff Bezos' machine, we're not super close.04:45
babbageclunkI have a machine with only 68M available space. Upgrading on it doesn't fail, I'd like to understand why.04:46
wallyworldah i see, so it does fail in prod04:48
wallyworldyeah that seems like a bug04:48
babbageclunkI'm worried that setting the number to a huge one (like I do in the tests) will work, even though it would still not prevent downloading the agent if it was the real value.04:48
wallyworldcan you debug it on your machine with only 68M04:48
babbageclunkyes04:48
babbageclunkThat's what I'm doing04:49
babbageclunkI'm trying to find out if anyone understands what the difference is between Bfree and Bavail, which seems to be the problem04:50
wallyworldNFI sorry04:51
anastasiamacbabbageclunk: the best i could find - https://community.hpe.com/t5/System-Administration/vxfs-jfs-bavail-vs-bfree/td-p/378680904:54
anastasiamacbabbageclunk: diff seems to b user based?...04:54
anastasiamacbabbageclunk: "bavail normally means blocks available to a non-superuser "04:55
babbageclunkyeah. That's what I've found in the docs too.04:56
babbageclunkhttps://linux.die.net/man/2/statfs04:56
anastasiamacbabbageclunk: yes, was reading this one too04:57
babbageclunkok, that gives me an idea04:57
anastasiamac\o/04:58
anastasiamacon a side note, isn't it funny that the manual was my last reference? i went to forums first :)04:58
babbageclunkThe manual is not very useful in this case - I've been trying to find more info for ages.04:59
veeberswallyworld: can you think of a better place than Unit SetStatus to do the 'if no container status and active ignore it' check? doing it there feels a bit off05:01
wallyworldyeah, we normally don't want to mess with the raw data model05:03
wallyworldit should be done in the apiserver layer05:03
wallyworld(or ideally a separate business logic layer we don't have yet)05:04
veebersack, thanks05:04
wallyworldbut unless we store the raw data, things will still mess up05:04
wallyworldso that's not really the answer either - it really is a presentation issue05:04
wallyworldwe need to store the raw unit and container status as set05:05
wallyworldand transform when we hand off to status or when storing history05:05
wallyworldhence we talked yesterday about that helper method05:05
wallyworldso in the Done() of the unit ops05:06
wallyworldit would use the same helper as FullStatus() does05:06
veeberswallyworld: aye, that's in place at the moment, but I have the charm setting active and that's done outside of the caas provisioner updateUnit ops bits05:07
wallyworldthat's fine - we need to store the raw info05:07
veebersIf I remove that part of the charm it should work, I think, but that does mean that if anyone sets active in their charm it'll be displayed wrong05:07
wallyworldwhy? we transform in FullStatus()05:08
wallyworldwe need to store the raw data as set by the various actors and transform as necessary to present05:08
wallyworldunit SetStatus() does call out to update history so we'll need to use the helper method there too05:09
wallyworldbut only for caas models05:09
veeberswallyworld: Unit.SetStatus will update history, we're not doing anything there05:09
veebershah right, being a bit slow at typing05:09
wallyworld:-)05:09
veeberswallyworld: so we need to tweak our helper function logic even more, if charm sets active then sets podspec, unit status is no longer active w/ default message. So unless the pod encounters an error it won't overwrite the unit status.05:11
wallyworldif the pod spec is updated, the deployment controller will do a rolling update; we should get events for that and can update the data model accordingly05:13
wallyworldwe can do that in another PR05:13
stickupkidmanadart: structs vs closures - do you mean pass back a struct or pass an argument as a struct14:20
stickupkidre: comment from #916314:21
manadartPass on struct instead of the 3 mock pointers.14:21
manadarts/on/one14:21
stickupkidfair14:21
pmatulishow does one know what zones are available to the juju client? for instance, 'juju bootstrap --to zone=eu-west-2 aws' doesn't work from north america16:20
externalrealitypmatulis, does juju clouds list zones16:21
externalreality`juju clouds`16:21
externalrealitypmatulis, well I guess it lists the default zone16:21
externalreality`juju regions` maybe16:22
externalrealityzone == region correct16:23
externalrealityor is zone a generalization of region?16:23
pmatulisexternalreality, i originally tried 'juju show-cloud aws'16:23
externalrealitypmatulis, juju regions aws16:24
pmatulisyeah, that gives the same as show-cloud except with less detail16:24
pmatulisit would be very useful if the client could somehow know what zones are available16:26
pmatulisseems it should be able to query AWS16:26
pmatulisfunny. i can't get *any* zones to work. even the one that gets used by default successfully16:30
externalrealitypmatulis, is what `juju regions` lists not compatible with the "zone" placement directive?16:31
rick_h_pmatulis: honestly we don't want folks knowing/dealing with zones16:32
rick_h_pmatulis: Juju automatically attempts to spread units across zones to make workloads resilient to outages16:32
pmatulisrick_h_, well we shouldn't say it works then16:32
rick_h_pmatulis: and having users custom-load them is a sign of a bad install or doing too heavy custom/snowflake stuff that's not portable across clouds16:32
pmatulis'cause it fails fast and hard16:33
rick_h_pmatulis: it can/does work but we don't go out of the way to put it in normal user commands like clouds/etc16:33
rick_h_externalreality: zone != region16:33
rick_h_externalreality: each region has several zones typically16:33
rick_h_externalreality: so when you deploy to us-east1 you can get units in various zones in that region16:33
rick_h_externalreality: think of zones as racks in a room, for instance16:33
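[Editor's sketch] To make "spread units across zones" concrete, here is one simple way such a policy could work. It is an illustration only and makes no claim about Juju's real placement algorithm: pickZone is a hypothetical helper, and the zone names are example values. Each new unit goes to the zone in the region that currently hosts the fewest units, with ties broken alphabetically.

```go
package main

import (
	"fmt"
	"sort"
)

// pickZone returns the zone hosting the fewest units, breaking ties
// alphabetically. It returns "" if no zones are known.
func pickZone(unitsPerZone map[string]int) string {
	zones := make([]string, 0, len(unitsPerZone))
	for z := range unitsPerZone {
		zones = append(zones, z)
	}
	sort.Strings(zones)

	best := ""
	for _, z := range zones {
		if best == "" || unitsPerZone[z] < unitsPerZone[best] {
			best = z
		}
	}
	return best
}

func main() {
	// One region with three zones (example names only).
	placed := map[string]int{"us-east-1a": 0, "us-east-1b": 0, "us-east-1c": 0}
	for i := 0; i < 4; i++ {
		z := pickZone(placed)
		placed[z]++
		fmt.Printf("unit/%d -> %s\n", i, z)
	}
}
```

With a policy like this the operator never names a zone; a manual zone placement directive bypasses it, which is why it requires knowing the zones of the specific region in use.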
externalrealityrick_h_, ack, I see, I just found this which was written by Andrew Wilkins https://awilkins.id.au/post/blogger/availability-zones-in-juju/16:37
rick_h_externalreality: cool, yea been a while.16:38
externalrealityrick_h_, pmatulis - I am with pmatulis in that how do you use the directive "zone" if you are unaware of any acceptable parameters16:38
externalrealityI guess juju doesn't want to make it easy for users to do the wrong thing - Ack - just make it possible.16:40
rick_h_externalreality: understand, and maybe we can document something but if you're going to use it you need to understand the cloud you're on, the region you're in (different regions have different zones) and what it means to manually place16:40
rick_h_externalreality: exactly16:40
rick_h_so I'm definitely -1 on making it "easy"16:40
pmatulisrick_h_, what's the point in showing the regions? in 2 commands even. is it for something else?16:42
rick_h_pmatulis: ? every time you create a model you have to know what region it's in. It could be in the US, in EU, in APAC. The region is very important to performance and geo-spreading workloads across the world.16:43
rick_h_pmatulis: I'm not sure what you mean in 2 commands?16:43
pmatuliscommands 'regions' and 'show-cloud'16:43
* pmatulis didn't even know about the 'regions' command16:43
rick_h_pmatulis: ah, ok. Well the regions available are part of cloud details but folks thought that list-regions made sense once you've bootstrapped, since you can only add-model to regions of the same cloud16:44
rick_h_pmatulis: so basically show-cloud is pre-bootstrap useful and list-regions is post-bootstrap (add-model) useful16:44
pmatulisbut just listing regions doesn't tell you what region your model is in. the show-model command does though16:46
rick_h_pmatulis: right, it's meant to help you pick the right region names available for add-model16:46
rick_h_pmatulis: once the model is added you're right, what region the model is in is about the model16:47
rick_h_pmatulis: so list-regions is more about what you can do given the controller you're on16:47
rick_h_pmatulis: if you switch controllers from aws, gce, openstack and run list-regions you'd get different answers16:47
pmatulisrick_h_, ok16:55
hmlsmall pr for review: https://github.com/juju/juju/pull/917819:47
hmlfixes to errors.Annotatef() with go test (using go 1.11)19:47
hmlbuild failures19:48
hmlcrackers, must be friday afternoon - need to check that the tests still pass. :-)19:48
rick_h_hml: ^ heads up19:53
rick_h_sorry, never mind19:53
rick_h_lol, somehow I read that hatch was submitting the pr19:54
hmlrick_h_: happy friday afternoon19:54
hml:-)19:54
* rick_h_ tries to halt checkout procedures19:54
