[00:52] <wallyworld> kelvinliu: babbageclunk: forgot to check, it shouldn't be too hard to support tag constraints in vsphere i hope. main thing is to be able to query the nodes for what tags they have i think?
[00:53] <wallyworld> ie as per the discourse discussion
[00:58] <kelvinliu> tags is marked as validator.RegisterUnsupported(unsupportedConstraints), i'm not sure why we decided to not support it
[00:59] <babbageclunk> wallyworld: just reading about tag handling in the vsphere api, I haven't seen it before
[01:05] <wallyworld> kelvinliu: i'm guessing because we didn't yet implement the api calls to ask vsphere about tags
[01:05] <wallyworld> i reckon we could do something (he says all handwavy)
[01:05] <kelvinliu> currently, we just fetch all instances then filter on tags in the metadata field client-side, https://github.com/juju/juju/blob/develop/provider/vsphere/environ_instance.go#L88
[01:06] <wallyworld> so we could do that server side too and support tag placement
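A rough Go sketch of the client-side filtering kelvinliu describes above: list all the VMs, then keep the ones whose metadata tags match. The vmInfo type and the tag key used here are illustrative assumptions, not the real provider types.

    // Hypothetical, simplified view of a vSphere VM with the juju-managed
    // tags pulled out of its metadata/extra-config fields.
    package main

    import "fmt"

    type vmInfo struct {
    	Name string
    	Tags map[string]string
    }

    // filterByTags keeps only the VMs whose metadata contains all the wanted
    // tags, mirroring the "fetch everything, filter on the client" approach.
    func filterByTags(vms []vmInfo, want map[string]string) []vmInfo {
    	var matched []vmInfo
    	for _, vm := range vms {
    		ok := true
    		for k, v := range want {
    			if vm.Tags[k] != v {
    				ok = false
    				break
    			}
    		}
    		if ok {
    			matched = append(matched, vm)
    		}
    	}
    	return matched
    }

    func main() {
    	vms := []vmInfo{
    		{Name: "juju-abc-0", Tags: map[string]string{"juju-model-uuid": "abc"}},
    		{Name: "other-vm", Tags: map[string]string{}},
    	}
    	fmt.Println(filterByTags(vms, map[string]string{"juju-model-uuid": "abc"}))
    }

Supporting tag placement server side, as wallyworld suggests, would amount to pushing the wanted tags into the vSphere query instead of filtering the returned list.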
[01:26] <wallyworld> hpidcock: in the for loop to handle the process cancel/kill/exit, i think we'd want to cap the number of times we retry kill without getting notification of process exit? with a suitable user-facing message surfaced
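A minimal sketch of the capped retry wallyworld suggests here: keep re-sending kill while waiting for the process-exit notification, but give up after a fixed number of attempts and surface a clear error. maxKillAttempts, killInterval and the exited channel are illustrative assumptions rather than the real worker API.

    package main

    import (
    	"errors"
    	"fmt"
    	"time"
    )

    const (
    	maxKillAttempts = 5
    	killInterval    = 2 * time.Second
    )

    // waitForExit retries kill until the exited channel fires or the attempt
    // cap is hit, at which point a user-facing error is returned.
    func waitForExit(kill func() error, exited <-chan struct{}) error {
    	for attempt := 1; attempt <= maxKillAttempts; attempt++ {
    		if err := kill(); err != nil {
    			return err
    		}
    		select {
    		case <-exited:
    			return nil
    		case <-time.After(killInterval):
    			// no exit notification yet; retry up to the cap
    		}
    	}
    	return errors.New("process did not exit after repeated kill attempts; giving up")
    }

    func main() {
    	exited := make(chan struct{})
    	go func() { time.Sleep(3 * time.Second); close(exited) }()
    	fmt.Println(waitForExit(func() error { return nil }, exited))
    }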
[03:49] <wallyworld> kelvinliu: tlm: hpidcock: i have a stateful set issue if any of you guys have time for a HO in standup
[03:49] <tlm> ok
[03:50] <kelvinliu> yep
[04:27] <kelvinliu> Updates to the volume claim template are not currently permitted. A feature request to permit this is open at #69041
[04:27] <mup> Bug #69041: Beagle: German translation incomplete <beagle (Ubuntu):Fix Released by ubuntu-l10n-de> <https://launchpad.net/bugs/69041>
[04:28] <kelvinliu> wallyworld: tlm https://github.com/kubernetes/kubernetes/issues/85955
[14:39] <skay> what do I do when my unit thinks there is a relation when there isn't? https://paste.ubuntu.com/p/w5b8Hw82tZ/
[14:40] <rick_h> skay:  um don't know? and it skipped it so all good? :)
[14:40] <skay> rick_h: no, juju status says that the agent is in error
[14:40] <skay> I'm trying to figure out why and that's a suspicious thing in the logs
[14:41] <skay> I restarted the service for that unit and the service for the machine, btw
[14:41] <rick_h> skay:  oh hmmm, what does juju status --relations show? and are there > 1 units (peer relation?)
[14:42] <skay> rick_h: there's only 1 postgresql unit. https://paste.ubuntu.com/p/t5kBzghkg2/
[14:43] <skay> I did recently remove a relation I no longer needed. Previously there was pgbouncer. I removed it and connected things to postgresql directly
[14:44] <skay> and since this is my test environment, I take down units that are connected to it willy-nilly and then spin up new units. sometimes apps
[14:45] <skay> I like how this unit is on machine 101. it is juju failed 101. I should learn some lessons from this
[14:47] <skay> if I had more ranks in dadjoke I would be able to make a good-worse joke than that
[14:54] <skay> rick_h: do you have any troubleshooting tips for this? I am at a loss.
[14:56] <rick_h> skay:  sorry, getting pulled in a few directions atm and we've got a bunch of folks out of the office today
[14:56] <rick_h> skay:  no, I mean I would mark it --resolved and try to see if you can get past it
[14:56] <rick_h> I'm not sure what is up with "skipping" but then an error
[14:57] <skay> rick_h: I tried marking it as resolved. I can ask again later when things are less hectic
[14:57] <skay> it's not in an error state, it's 'active' and 'failed'
[14:58] <skay> I just noticed the yaml status has a better message. 'message: resolver loop error'
[14:58] <rick_h> skay:  oh sorry, I thought you mentioned it was in an agent error
[14:58] <rick_h> skay:  hmmm, can you paste more of the unit log on there then please?
[15:00] <skay> brb standup
[15:00] <rick_h> k
[15:00] <skay> (the only thing I see in the log is the thing I pasted. I'm tailing it. I'll restart it to see if it has different output after)
[15:06] <rick_h> skay:  ok
[15:21] <skay> rick_h: I've been tailing the postgres unit's log for a while now and those two lines are the only things that show up.
[15:22] <rick_h> achilleasa:  do you have any ideas around this agent error skay is seeing? https://paste.ubuntu.com/p/w5b8Hw82tZ/
[15:26] <skay> achilleasa: here's a snippet from the status. the juju-status message is 'resolver loop error' https://paste.ubuntu.com/p/GxVq58pH3z/
[15:43] <achilleasa> skay: looking
[15:47] <achilleasa> skay: which juju version are you using on the controller?
[15:47] <skay> achilleasa: 2.6.10
[15:49] <skay> they will be upgrading the controller soon
[16:02] <achilleasa> skay rick_h: so there are a couple of places in (https://github.com/juju/juju/blob/2.6/worker/uniter/relation/relations.go) where this error is raised but there is not enough context to figure out which one it is (best guess is L383 or L400)
[16:03] <achilleasa> maybe you could try to remove-unit --force to get rid of the stuck unit and spin up a new one?
[16:06] <skay> achilleasa: ouch. that's my postgresql unit.  it's not extremely painful since i don't care about the database in this environment, but if it happens in a real environment it would be painful
[16:07] <achilleasa> skay: can you share a mongo dump with me? maybe I can track down which relation name is associated with the 303 ID
[16:11] <skay> achilleasa: I do not have access to the controller. would I need that? if I don't, then are there docs on how to get a dump?
[16:15] <achilleasa> skay: you might be able to use "juju create-backup" (see https://jaas.ai/docs/controller-backups)
[16:41] <hml> achilleasa: i’ve reviewed 11255 and added comments.  still have qa to do
[16:42] <hml> achilleasa:  one is more of an observation and question, rather than a request to change as it’s a set pattern in the code there.  :-/
[16:43] <achilleasa> hml: I keep messing up the import stanzas... :-(
[16:43] <hml> achilleasa:  doesn’t help that the static analysis job isn’t working correctly for imports either.
[16:44] <hml> mine get minimized, so i don’t see them until i push the code up to GH
[16:44] <achilleasa> I will clean up the commits and force-push the right version
[17:00] <achilleasa> hml: I think I fixed the stanza issues; can you take another look?
[17:04] <hml> achilleasa:  sure
[17:15] <hml> achilleasa:  the qa isn’t working for me.  ho?
[17:20] <achilleasa> hml: omw
[17:24] <hml> achilleasa:  https://pastebin.canonical.com/p/kS82HDydk7/ model errors
[17:25] <hml> achilleasa:  https://pastebin.canonical.com/p/BQxgqD7zqC/ controller errors
[18:03] <gnuoy> I'm after some advice if anyone has a sec. I have a charm with a hook erroring with:
[18:03] <gnuoy> 2020-02-27 17:49:12 ERROR juju.worker.uniter.operation runhook.go:132 hook "ceph-client-relation-changed" failed: could not write settings from "ceph-client-relation-changed" to relation 0: permission denied
[18:03] <gnuoy> if I resolve the hook it works
[18:03] <gnuoy> sorry, I meant:
[18:03] <gnuoy> if I resolve the hook using a debug-hook session the error goes away
[18:03] <rick_h> gnuoy:  ooh, I think achilleasa just fixed this one
[18:03] <gnuoy> if I do it without a debug-hook session it persists
[18:04] <rick_h> gnuoy:  single unit?
[18:04] <gnuoy> always the non-leader of a two unit deploy in my case
[18:04] <rick_h> gnuoy:  hmmm yea non-leaders can't write leader data. Sounds like a charm logic problem then
[18:05] <gnuoy> right, but why does the debug-hook session make a difference ?
[18:05] <rick_h> gnuoy:  the charm should be checking if it's the leader before trying to write the data
[18:05] <gnuoy> yep, it should, and I believe it is
[18:06] <rick_h> gnuoy:  ok so the question is why does it not do it with debug-hooks?
[18:08] <gnuoy> yes. Full disclosure I'm using the operator framework and the bug could lie in there. but it's hard to track down when using debug-hooks seems to make it go away
[18:09] <gnuoy> I'm ssh'd onto the unit and happily resolving the hook and reproducing the error
[18:10] <gnuoy> rick_h, I don't want to waste your time, the bug is almost certainly outside of juju. Just wondering if anything springs to mind about the difference in hook env when using debug-hooks?
[18:10] <rick_h> gnuoy:  thinking but confused tbh...if you're ssh'd to the unit I would think you'd not have hook context and have issues
[18:10] <gnuoy> rick_h, oh, I'm ssh'd in just to observe what's happening, not executing the hook from the ssh session
[18:10] <rick_h> debug-hooks sets up the hook context, but I can't think of why it would affect leader data type stuff...unless maybe it's working around a check somehow?
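A hedged sketch of the guard rick_h describes above: only the leader should write application-level relation data, so check is-leader before calling relation-set --app. This shells out to the hook tools from inside a hook; the relation id and the key/value pair are made up for illustration.

    package main

    import (
    	"encoding/json"
    	"fmt"
    	"os/exec"
    )

    // isLeader asks the is-leader hook tool whether this unit is the leader.
    func isLeader() (bool, error) {
    	out, err := exec.Command("is-leader", "--format=json").Output()
    	if err != nil {
    		return false, err
    	}
    	var leader bool
    	if err := json.Unmarshal(out, &leader); err != nil {
    		return false, err
    	}
    	return leader, nil
    }

    func main() {
    	leader, err := isLeader()
    	if err != nil {
    		fmt.Println("is-leader failed:", err)
    		return
    	}
    	if !leader {
    		// non-leaders skip the write instead of hitting "permission denied"
    		return
    	}
    	// hypothetical example key/value on the ceph-client relation
    	if err := exec.Command("relation-set", "--app", "-r", "ceph-client:0", "key=value").Run(); err != nil {
    		fmt.Println("relation-set failed:", err)
    	}
    }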
[18:13] <gnuoy> I wonder if this is a focal'ism
[18:18] <hml> gnuoy: which version of juju?
[18:19] <gnuoy> I tried with 2.8.1 and 2.7.3
[18:19] <hml> gnuoy:  there should be no difference between the context during debug-hook and the regular hook execute then.
[18:20] <hml> previously the only diff i know of was an env var not set with debug hook
[18:20] <gnuoy> hmm, ok, I must be doing something truly stupid
[18:20] <hml> but nothing to do with leadership
[18:26] <hml> gnuoy: it's always possible something is off when not using debug-hooks.  you definitely shouldn't be seeing a diff in hook execution
[18:27] <gnuoy> fwiw https://paste.ubuntu.com/p/5df6Yw6XFH/
[18:29] <hml> gnuoy:  are there any errors in the juju debug-log for that model?
[18:32] <gnuoy> hml, does juju just use the exit code of the hook to determine if the hook worked ?
[18:32] <hml> gnuoy: yes
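A quick sketch of the behaviour hml confirms above: run the hook and treat any non-zero exit code as a hook failure. The hook path and the (omitted) hook-context environment are simplified assumptions.

    package main

    import (
    	"fmt"
    	"os/exec"
    )

    // runHook executes the hook; any non-zero exit (or failure to start)
    // is reported as a hook failure.
    func runHook(path string) error {
    	cmd := exec.Command(path)
    	// a real uniter also wires up the hook context/env vars; omitted here
    	if err := cmd.Run(); err != nil {
    		return fmt.Errorf("hook %q failed: %w", path, err)
    	}
    	return nil
    }

    func main() {
    	if err := runHook("./hooks/ceph-client-relation-changed"); err != nil {
    		fmt.Println(err)
    	}
    }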
[18:34] <gnuoy> hml, I'm going to stop using up your time and go do some more digging, thanks for the ideas
[18:34] <hml> gnuoy:  have fun.  i need to lunch, but will be back later
[20:42] <tlm> morning
[20:42] <rick_h> morning tlm