[08:05] <bthomas> Morning all
[08:58] <bthomas> 𝐆𝐨𝐨𝐝 𝐌𝐨𝐫𝐧𝐢𝐧𝐠 Chipaca
[08:58] <bthomas>  
[08:58] <bthomas> Chipaca: Will harness.add_relation_unit trigger a relation changed event ?
[08:58] <Chipaca> bthomas: morning
[08:59] <Chipaca> bthomas: from that method's documentation,
[08:59] <Chipaca>         This will trigger a `relation_joined` event and a `relation_changed` event.
[08:59] <bthomas> hmm
[09:00] <Chipaca> bthomas: https://ops.rtfd.io/en/latest/#ops.testing.Harness.add_relation_unit
[09:01] <bthomas> Thanks for confirming. I think I am messing something then.
[09:17] <bthomas> Apologies Chipaca : I do not see what I am missing in this relation test which shows an empty relation data dictionary. https://pastebin.canonical.com/p/2ZxTXPvNtB/
[09:19] <Chipaca> bthomas: where can i pull this from?
[09:19] <bthomas> Chipaca: I have not pushed to github since it was not working but will do so now.
[09:20] <Chipaca> bthomas: thank you
[09:20] <bthomas> Chipaca: pushed
[09:21] <Chipaca> bthomas: git commit -a -m checkpoint && git push && git reset @^  and then next time you push, git push --force-with-lease
[09:21] <Chipaca> bthomas: ('git reset @^' does an "uncommit")
[09:21] <bthomas> Chipaca: yes I will do a force push when I fix the test but if you have pulled you will end up with conflict if you pull again
[09:21] <Chipaca> bthomas: pushed to where?
[09:22] <bthomas> Chipaca: Oops . Pushed here https://github.com/balbirthomas/prometheus-charm.git
[09:22] <Chipaca> bthomas: pulling
[09:27] <Chipaca> haha
[09:27] <Chipaca> heh
[09:27]  * bthomas hids
[09:27]  * bthomas hides
[09:28] <Chipaca> bthomas: nope, you good
[09:28] <bthomas> :)
[09:28] <Chipaca> bthomas: look at the last line of https://ops.readthedocs.io/en/latest/_modules/ops/testing.html#Harness.add_relation_unit
[09:29] <Chipaca> bthomas: notice how it fires relation-joined
[09:29] <Chipaca> bthomas: but not relation-changed
[09:29] <bthomas> Thank you. For spotting this.
[09:29] <Chipaca> i need to check with jam but I think that's a bug. I don't know if it's a bug in the docstring or the code though.
[09:30] <bthomas> Chipaca: Shouldn't relation-joined emit relation changed in turn ?
[09:30] <Chipaca> also need to check with jam about #395
[09:30] <mup> Issue #395: juju-info not supported <Created by stub42> <https://github.com/canonical/operator/issues/395>
[09:30] <Chipaca> bthomas: not like that
[09:30] <Chipaca> bthomas: i mean, that's why i need to check with jam, as i don't know if _juju_ does
[09:31] <Chipaca> bthomas: but the charm won't and shouldn't
[09:31] <bthomas> Chipaca: Ok I will park this test for now and continue with others
[09:31] <Chipaca> bthomas: if juju does, then the bug is in the code; if juju does not, then the bug is in the docstring
[09:31] <bthomas> Thank you for clarifying
[09:31] <Chipaca> thank you for spotting this
[10:36] <facubatista> ¡Muy buenos días a todos!
[10:37] <bthomas> नमस्ते facubatista
[10:37] <facubatista> hola bthomas
[10:53] <jam> Chipaca, when a unit joins a relation (once they complete the start hook), juju will always fire relation-joined and then relation-changed. I'm not sure if we want to encode that in the harness, vs waiting for the user to update_relation_data to trigger relation-changed
[10:58] <Chipaca> jam: I'm not sure either, but I'm sure the docstring calls it one way and the code another :)
[10:59] <jam> Chipaca, indeed, I saw that from Balbir's comment. Feel free to fix it in either way :)
[11:00] <jam> Chipaca, for juju-info, there is, indeed, an implicit endpoint for every application call 'juju-info'. It was unclear to me what stub wanted to do with it to make sure we cover his use case
[11:15] <mup> PR operator#401 opened: fix docstring on Harness.add_relation_unit WRT events <Created by chipaca> <https://github.com/canonical/operator/pull/401>
[11:17] <Chipaca> bthomas: circling back to your issue, it's working as intended and you should call update_relation_data to see the relation-changed event :)
[11:18]  * facubatista begs for reviews https://github.com/canonical/charmcraft/pull/154
[11:18] <mup> PR charmcraft#154: Improved short help messages <Created by facundobatista> <https://github.com/canonical/charmcraft/pull/154>
[11:18] <bthomas> Chipaca: will check. thank you
[11:19] <Chipaca> bthomas: also if you're feeling like reading something, feedback on https://docs.google.com/document/d/1jwFlPG_q5lr80NYI19L_3mErAv4Yk2-g6uzK113Qiks/edit welcome
[11:19] <bthomas> Chipaca: will do after lunch
[11:27] <Chipaca> lunch: a preposterous notion
[12:01] <bthomas> Alternatively I can fuel on wine too
[12:02] <bthomas> Chipaca: Added a few questions to google docs. It is entirely possible they are meaningless dribble if so ignore them but if they make sense ...
[12:06] <bthomas> facubatista: pull 154 approved with one comment
[12:06] <mup> Issue #154: [Question] Do we need a separate base class for interfaces? <Created by VariableDeclared> <Closed by VariableDeclared> <https://github.com/canonical/operator/issues/154>
[12:09] <facubatista> bthomas, thanks
[12:10] <facubatista> bthomas, jaja, yes, what a silly docstring I wrote there!
[12:10] <bthomas> :-)
[12:10] <bthomas> I think you will find stiff competition from me when it comes to being silly
[12:11] <Chipaca> facubatista: .i see no problem.
[12:13] <facubatista> :)
[12:20] <bthomas> Chipaca: I am setting relation data in the relation changed handler in my src/charm.py. So I want to test that my relation changed handler defined in src/charm.py is setting that data. To write such a test I am adding a relation unit then calling update_relation_data() with an empty dictionary as the argument. This produces an empty dictionary for the data. Is this the expected behaviour ?
[12:22] <Chipaca> bthomas: it … depends? if you mean that you get an empty dictionary, and never see the setting of the data that you do, then no
[12:23] <bthomas> Chipaca: ok then I have a problem and need to dig a bit before I get back to you. But looks like right now, my src/charm.py relation handler is not setting data or alternatively harness is not seeing it, or update_relation_data is overwriting it.
[12:24] <Chipaca> bthomas: share the code so i can take a peek?
[12:26] <bthomas> Chipaca: pushed to https://github.com/balbirthomas/prometheus-charm.git . The test in question is the only one skipped.
[12:29] <Chipaca> bthomas: self.unit is prometheus/0, not grafana/0
[12:31] <Chipaca> bthomas: so in your call to get_relation_data, you want prometheus/0 not grafana/0
[12:33] <bthomas> Chipaca: I was under the impression that when I set the relation data in one unit participating in the relation the other unit will see the change too. Looks like I was wrong. So I guess I need to set the data on event.unit i.e. event.relation.data[event.unit]['port'] = ... . Is this correct ?
[12:35] <bthomas> Yep. Changing self.unit to event.unit makes test pass.
[12:35] <bthomas> Thanks Chipaca
[12:36] <bthomas> oops spoke too soon
[12:36] <Chipaca> i don't think it works that way
[12:36] <Chipaca> you're setting your data in the other charm's data bag
[12:37] <Chipaca> the other unit's data bag*
[12:38] <Chipaca> bthomas: is that what you're wanting to do?
[12:40] <bthomas> Chipaca: I just changed self.unit to event.unit in on_grafana_changed() in src/charm.py. I am trying to get the test to pass now. I will comb the docs to make sure I am doing this right.
[13:13] <justinclark> Good morning o/
[13:16] <bthomas> Morning justinclark
[13:26] <Chipaca> justinclark: could you enter your PTO time in the team calendar?
[13:26] <justinclark> Oh yes. Will do right now.
[13:26] <Chipaca> ta
[13:29] <justinclark> Done
[13:33]  * bthomas resumes his literary career writing hook documenation
[13:35] <Chipaca> justinclark: 👍
[15:24] <Spads> Chipaca: https://code.launchpad.net/~nick-moffitt/charm-k8s-site24x7-exporter/+git/charm-k8s-site24x7-exporter/+merge/390565 is in the active reviews queue now, thanks!
[15:25] <Chipaca> Spads: thanks!
[15:56] <justinclark> bthomas: let me know whether you decide to use app data vs. unit data for the prometheus private-address/port/source-type since grafana will need to access that data. Right now grafana is looking at the prometheus unit data.
[15:57] <bthomas> justinclark: leave it as it is for now if it is working. I presume this will not be a big change. Correct me if I am wrong. We should talk to Jeremy if there will ever be the case when mutiple prometheus units in the same cluster will talk to grafana.
[15:59] <Chipaca> if you have multiple prometheuses (promethea?) for HA, don't you need a load balancer for grafana to dtrt?
[15:59] <justinclark> My understanding is that there might be multiple prometheus units, but there will be some sort load balancer between it and grafana - so there will really only be one URL for the multi-unit prometheus cluster that is sent to Grafana.
[15:59] <justinclark> Yes I think so Chipaca ^
[15:59] <Chipaca> ok
[16:01] <justinclark> And if we need prometheus to monitor totally different things, we will have a separate relation (i.e. separate relation data bags) for that - not separate units on the same relation.
[16:02] <bthomas> By separate relation are you referring to metadata.yaml ?
[16:04] <bthomas> If we collating data from multiple prometheus units through a load balancer (Service in the k8s world) we will need to ensure that those data streams are distinguishable in Grafana and test for this specifically.
[16:05] <justinclark> I don't know all the details, but I think the charm code would remain the same - we would just do `juju deploy prometheus <config_options_0>` and `juju deploy prometheus <config_options_1`
[16:07] <justinclark> Then I think add two relations for each prometheus deployment. I'll test that this works.
[16:08] <justinclark> I don't think Grafana has to worry about whether there are multiple units - only the leader Prometheus unit should be setting the relation data so there should only be one set of information passed to Grafana
[16:09] <bthomas> Maybe we can do this once we have the one to one relation working
[16:10] <bthomas> What you are suggesting is that the prometheus leader unit aggregate data and forward it to Grafana. I am fine with this. I would like to get some input from Jeremy with regard to the use cases we must support and the overally architecture.
[16:11] <justinclark> bthomas: agreed. We can dive deeper into this tomorrow.
[16:32] <bthomas> Chipaca: facubatista jam : Is it correct to say that after handing a config changed event a charm must set its unit/application status to ActiveStatus, if there is no on_start handler? Alternatively it should do so at the end of the on_start handler ?
[16:33] <jam> bthomas, it depends whether you feel you are ready at that point.
[16:33] <jam> bthomas, if you need relation data before you can be correctly configured, then you're not "Active"
[16:33] <jam> you're still Waiting
[16:33] <jam> if you don't have a relation that you depend upon, then you're "Blocked"
[16:34] <jam> bthomas, you don't *join* relations until after start completes, so there isn't much expectation that you'll be fully Active at that point.
[16:34] <jam> (you'll have relation-created to know the relation exists, but you won't have relation-joined to know the content of the relation information)
[16:36] <bthomas> jam : thanks for the clarification. I will paraphase this in the hook docs taking into consideration the caveats you highlight. How to manage status was another one of those things that does not come through existing docs very clearly (IMHO).
[16:49] <bthomas> jam : can I say that "status" set by a Charm is a "hint" to the Juju agent, and that the Juju is the final arbitrer of the reported status of a Charm and its units. I mean to exemplify this with the example of ActiveStatus which is set by the Charm, without which Juju well not report the unit as active, however despite it Juju may wait till pods are running before reporting the units as active.
[16:51] <jam> bthomas, I probably would phrase it differently, and just treat active as a special case, because all other cases it is exactly the current status as you specify it.
[16:52] <bthomas> jam: Ah ok. Thank you. So Juju honours every other status rigidly except ActiveStatus. Got it. Will amend my wording.
[17:20]  * bthomas EOD but maybe bbiab
[18:41]  * facubatista -> bbl
[19:09] <justinclark> jam, Chipaca, facubatista (when back) - I have a question about the "model.pod.set_spec()" method. I was really struggling to figure out why my Grafana configuration wasn't working but it turns out that the new pod was never actually running. After restarting the pod directly with kubectl, everything works as expected. However, I was kind of assuming "model.pod.set_spec()" would restart the pod with the new
[19:09] <justinclark> configuration.
[19:10] <justinclark> Is there something in addition to "set_spec()" that I need to do to make sure the new pod is actually started?
[19:12] <justinclark> Happy to jump on a quick call since it might be easier to just show what's happening
[19:15] <justinclark> bthomas: ^^ this implies that the prometheus charm is doing what it's supposed to do since I can create dashboards with prometheus data as soon as the updated pod spec is live.
[19:15] <bthomas> justinclark: good to hear. the pod.set_spec issue baffels me though.
[19:16] <bthomas> justinclark: what is the state of you pods without using kubectl to restart them.
[19:17] <justinclark> They're all running with the original configuration (before relating Prometheus)
[19:17] <bthomas> I was just going to say Ah never mind :) since I reread your post and realized just the same.
[19:18] <justinclark> My hacky solution was "microk8s.kubectl scale -n lma --replicas=0 statefulset.apps/grafana"
[19:18] <justinclark> followed by
[19:18] <justinclark> "microk8s.kubectl scale -n lma --replicas=1 statefulset.apps/grafana"
[19:19] <bthomas> My understanding is that if you do set_spec, the underlying k8s deployment should see a new spec and restart pods. Obviously this does not seem to be happening in your case.
[19:20] <justinclark> That is (was) my understanding as well.
[19:21] <bthomas> justinclark: what is your github id, so I can look at the code.
[19:21] <justinclark> justinmclark
[19:24] <justinclark> https://github.com/justinmclark/grafana-charm-base/blob/master/src/charm.py
[19:40] <bthomas> justinclark: As yet looking at source I am not seeing an obvious bug. Makes me feel chasing this may require looking at logs. I will look at the source once again.
[19:43] <justinclark> Yeah we could very well be missing something. It just seems weird that the new pod spec is correctly set and all newly created configuration files make it to the pod, but the pod itself is just never restarted.
[19:50] <bthomas> So I infer you actually checked, that before you did kubectl scale, the config file for grafana with the new data source for prometheus, had actually been written into the appropriate mount path.
[19:51] <justinclark> Yep - those files are all there before restarting. Checked with "microk8s.kubectl exec grafana-0 -n lma -- cat /etc/grafana/provisioning/datasources/datasources.yaml"
[19:51] <bthomas> Excellent. if there is a way to tell grafana to reload its config, without creating a new pod, does that work.
[19:52] <bthomas> your can try a kubectl exec some bash command to grafana (SIGHUP ?) or some python code from charm to nudge grafana maybe
[19:54] <bthomas> Also you can watch kubectl all to know if new pods were actually created when you did set_spec
[19:55] <justinclark> Let me check. I do know the Grafana docs say to restart Grafana before new config options are set. I think watching will also be helpful.
[19:59] <bthomas> Oh! "restart Grafana before new config" ... or "after new config" . Also don't forget to check juju debug-log for anything interesting
[19:59]  * bthomas goes to hammer another nail in his coffin -- cough cough
[20:04] <justinclark> Haha don't stay up too late for my sake. I'll hopefully get it figured out soon. The juju/ops wizards might have ideas as well.
[20:04]  * justinclark aspires to be a juju/ops wizard someday
[20:06]  * bthomas is back
[20:06] <bthomas> will be up for about 40 mins or so
[20:24] <justinclark> Looks like the only way to restart Grafana is to restart the container it's in
[20:27] <justinclark> I'm wondering if it is somehow related to the fact that only a mounted file is changing - rather than anything specific to the kubernetes pod spec. I've seen somewhere that pod_spec_set is idempotent so if there is no change to the spec, the pod won't be restarted.
[20:29] <justinclark> If the only change that is checked is the kubernetes pod spec (not the entire juju pod spec), then that might be the issue.
[20:30] <justinclark> Although it's more likely that the issue is on my end rather than on Juju's end
[20:45] <bthomas> justinclark: Chipaca: mentioned that if ConfigMaps changed then setting pod spec does not actually create new pods.
[20:46] <bthomas> I am begining to think that this is the problem you may be dealing with. You can confirm this by watching kubectl all and checking if new pods are actually created in response to adding a prometheus relation.
[20:48] <bthomas> If new pods are not being created and the likely cause is the ConfigMap issue, you may try a workaround where you create a new variable in the pods spec that is not part of the ConfigMap and just change that variable along with appending the new grafana config file to the files container.
[20:51] <justinclark> Ah yes that's probably exactly the issue. I can confirm that watching kubectl all does not change any of the pods/statefulsets after adding the prometheus relation.
[20:51] <justinclark> I'll try the variable workaround.
[21:06] <bthomas> justinclark: If kubectl all shows that new pods are not created then I am not aware of any reason that can cause this other than the ConfigMap issue. Hopefully we will soon have all such gotchas carefully documented so that it will no cause any one of us to stumble on such things. Good luck with the work around.
[21:06] <bthomas> I will call it a night. See you tomorrow.
[21:06] <justinclark> I appreciate the help. Thanks!
[21:33]  * facubatista is back
[21:39] <facubatista> justinclark, probably there should be a debug message somewhere that indicates what's going on...
[21:39] <facubatista> justinclark, I mean, I'm not saying that *there is*, but that there *should be*
[21:40] <justinclark> Agreed. I couldn't find anything to indicate that problem. But I just confirmed that this is the ConfigMap issue that bthomas talked about is in fact the issue.
[21:40]  * justinclark does celebratory dance
[21:41] <facubatista> justinclark, did you collect all logs for the sequence? so we can explore them with jam
[21:42] <facubatista> (at least for all of us to understand better the logs!)
[21:47] <justinclark> Yes - here are the logs:
[21:48] <justinclark> https://pastebin.canonical.com/p/PZkPvqGXXf/
[21:49] <justinclark> The join event happens right around 17:33:22 in the logs.
[21:52] <justinclark> Oh, to be clear, those ^^ are the logs with the new changes that fix the issue. I'll send the logs before getting the pod restart to trigger
[21:55] <justinclark> These are the logs before the issue was fixed: https://pastebin.canonical.com/p/CK5BsvWgp4/
[22:01] <justinclark> I see "controller-0: 17:33:31 DEBUG juju.worker.caasunitprovisioner deployment changed: true" in the fixed logs, but that is absent in the broken logs - not even something like "deployment changed: false"
[22:18] <facubatista> justinclark, wonderful, thanks! I'll check on those
[22:18] <facubatista> (monday, I'm eoding and eowing now)
[22:19] <justinclark> Sounds great. I appreciate it! Have a good weekend
[22:28] <facubatista> thanks!
[22:28]  * facubatista runs