[00:52] kelvinliu: babbageclunk: forgot to check, it shouldn't be too hard to support tag constraints in vsphere i hope. main thing is to be able to query the nodes for what tags they have i think?
[00:53] ie as per the discourse discussion
[00:58] tags is marked as validator.RegisterUnsupported(unsupportedConstraints), i'm not sure why we decided to not support it
[00:59] wallyworld: just reading about tag handling in the vsphere api, I haven't seen it before
[01:05] kelvinliu: i'm guessing because we didn't yet implement the api calls to ask vsphere about tags
[01:05] i reckon we could do something (he says all handwavy)
[01:05] currently, we just fetch all instances then filter tags in metadata field from client side, https://github.com/juju/juju/blob/develop/provider/vsphere/environ_instance.go#L88
[01:06] so we could do that server side too and support tag placement
[01:26] hpidcock: in the for loop to handle the process cancel/kill/exit, i think we'd want to cap the number of times we retry kill and not get notification of process exit? with a suitable user facing message surfaced
[03:49] kelvinliu: tlm: hpidcock: i have a stateful set issue if any of you guys have time for a HO in standup
[03:49] ok
[03:50] yep
[04:27] Updates to the volume claim template are not currently permitted. A feature request to permit this is open at #69041
[04:27] Bug #69041: Beagle: German translation incomplete
[04:28] wallyworld: tlm https://github.com/kubernetes/kubernetes/issues/85955
=== wgrant is now known as wgrant_
=== wgrant_ is now known as wgrant
[14:39] what do I do when my unit thinks there is a relation when there isn't? https://paste.ubuntu.com/p/w5b8Hw82tZ/
[14:40] skay: um don't know? and it skipped it so all good? :)
[14:40] rick_h: no, juju status says that the agent is in error
[14:40] I'm trying to figure out why and that's a suspicious thing in the logs
[14:41] I restarted the service for that unit and the service for the machine, btw
[14:41] skay: oh hmmm, what does juju status --relations show? and are there > 1 units (peer relation?)
[14:42] rick_h: there's only 1 postgresql unit. https://paste.ubuntu.com/p/t5kBzghkg2/
[14:43] I did recently remove a relation I no longer needed. Previously there was pgbouncer. I removed it and connected things to postgresql directly
[14:44] and since this is my test environment, I take down units that are connected to it willy-nilly and then spin up new units. sometimes apps
[14:45] I like how this unit is on machine 101. it is juju failed 101. I should learn some lessons from this
[14:47] if I had more ranks in dadjoke I would be able to make a good-worse joke than that
[14:54] rick_h: do you have any troubleshooting tips for this? I am at a loss.
[14:56] skay: sorry, getting pulled in a few directions atm and we've got a bunch of folks out of the office today
[14:56] skay: no, I mean I would mark it --resolved and try to see if you can get past it
[14:56] I'm not sure what is up with "skipping" but then an error
[14:57] rick_h: I tried marking it as resolved. I can ask again later when things are less hectic
[14:57] it's not in an error state, it's 'active' and 'failed'
[14:58] I just noticed the yaml status has a better message. 'message: resolver loop error'
[14:58] skay: oh sorry, I thought you mentioned it was in an agent error
[14:58] skay: hmmm, can you paste more of the unit log on there then please?
[15:00] brb standup
[15:00] k
[15:00] (the only thing I see in the log is the thing I pasted. I'm tailing it. I'll restart it to see if it has different output after)
[15:06] skay: ok
[15:21] rick_h: I've been tailing the postgres unit's log for a while now and those two lines are the only thing that shows up.
[15:22] achilleasa: do you have any ideas around this agent error skay is seeing? https://paste.ubuntu.com/p/w5b8Hw82tZ/
[15:26] achilleasa: here's a snippet from the status. the juju-status message is 'resolver loop error' https://paste.ubuntu.com/p/GxVq58pH3z/
[15:43] skay: looking
[15:47] skay: which juju version are you using on the controller?
[15:47] achilleasa: 2.6.10
[15:49] they will be upgrading the controller soon
[16:02] skay rick_h: so there are a couple of places in (https://github.com/juju/juju/blob/2.6/worker/uniter/relation/relations.go) where this error is raised but there is not enough context to figure out which one it is (best guess is L383 or L400)
[16:03] maybe you could try to remove-unit --force to get rid of the stuck unit and spin up a new one?
[16:06] achilleasa: ouch. that's my postgresql unit. it's not extremely painful since i don't care about the database in this environment, but if it happens in a real environment it would be painful
[16:07] skay: can you share a mongo dump with me? maybe I can track down which relation name is associated with the 303 ID
[16:11] achilleasa: I do not have access to the controller. would I need that? if I don't, then are there docs on how to get a dump?
[16:15] skay: you might be able to use "juju create-backup" (see https://jaas.ai/docs/controller-backups)
[16:41] achilleasa: i’ve reviewed 11255 and added comments. still have qa to do
[16:42] achilleasa: one is more of an observation and question, rather than a request to change as it’s a set pattern in the code there. :-/
[16:43] hml: I keep messing up the import stanzas... :-(
[16:43] achilleasa: doesn’t help that the static analysis job isn’t working correctly for imports either.
[16:44] mine get minimized, so i don’t see them until i push the code up to GH
[16:44] I will clean up the commits and force-push the right version
[17:00] hml: I think I fixed the stanza issues; can you take another look?
[17:04] achilleasa: sure
[17:15] achilleasa: the qa isn’t working for me. ho?
[17:20] hml: omw
[17:24] achilleasa: https://pastebin.canonical.com/p/kS82HDydk7/ model errors
[17:25] achilleasa: https://pastebin.canonical.com/p/BQxgqD7zqC/ controller errors
[18:03] I'm after some advice if anyone has a sec. I have a charm with a hook erroring with:
[18:03] 2020-02-27 17:49:12 ERROR juju.worker.uniter.operation runhook.go:132 hook "ceph-client-relation-changed" failed: could not write settings from "ceph-client-relation-changed" to relation 0: permission denied
[18:03] if I resolve the hook it works
[18:03] sorry, I meant:
[18:03] if I resolve the hook using a debug-hook session the error goes away
[18:03] gnuoy: ooh, I think achilleasa just fixed this one
[18:03] if I do it without a debug-hook session it persists
[18:04] gnuoy: single unit?
[18:04] always the non-leader of a two unit deploy in my case
[18:04] gnuoy: hmmm yea non-leaders can't write leader data. Sounds like a charm logic problem then
[18:05] right, but why does the debug-hook session make a difference?
[18:05] gnuoy: the charm should be checking if it's the leader before trying to write the data
[18:05] yep, it should, and I believe it is
[18:06] gnuoy: ok so the question is why does it not do it with debug-hooks?
[18:08] yes. Full disclosure I'm using the operator framework and the bug could lie in there. but it's hard to track down when using debug-hooks seems to make it go away
[18:09] I'm ssh'd onto the unit and happily resolving the hook and reproducing the error
[18:10] rick_h, I don't want to waste your time, the bug is almost certainly outside of juju. Just wonder if anything springs to mind about the difference in hook env when using debug-hooks?
[18:10] gnuoy: thinking but confused tbh...if you're ssh'd to the unit I would think you'd not have hook context and have issues
[18:10] rick_h, oh, I'm ssh'd in just to observe what's happening, not executing the hook from the ssh session
[18:10] debug-hooks sets up the hook context, but I can't think of why it would affect leader data type stuff...unless maybe it's working around a check somehow?
[18:13] I wonder if this is a focal'ism
[18:18] gnuoy: which version of juju?
[18:19] I tried with 2.8.1 and 2.7.3
[18:19] gnuoy: there should be no difference between the context during debug-hook and the regular hook execute then.
[18:20] previously the only diff i know of was an env var not set with debug hook
[18:20] hmm, ok, I must be doing something truly stupid
[18:20] but nothing to do with leadership
[18:26] gnuoy: it’s always possible something is off with not debug-hooks. definitely shouldn’t be seeing a diff in hook execute
[18:27] fwiw https://paste.ubuntu.com/p/5df6Yw6XFH/
[18:29] gnuoy: are there any errors in the juju debug-log for that model?
[18:32] hml, does juju just use the exit code of the hook to determine if the hook worked?
[18:32] gnuoy: yes
[18:34] hml, I'm going to stop using up your time and go do some more digging, thanks for the ideas
[18:34] gnuoy: have fun. i need to lunch, but will be back later
[20:42] morning
[20:42] morning tlm