[04:32] Issue operator#429 closed: Test harness set_leader erases relation data
[12:46] * Chipaca brb
[13:31] * facubatista didn't say hello today, shame on me
[13:32] facubatista: i've been grumpy for days now it seems
[13:32] facubatista: gooooooooooooooooooooooooooooood morning!
[13:32] Chipaca, :)
[13:55] * bthomas had IRC disruption due to 2.4ghz interference
[14:16] * Chipaca had coffee, instead
[14:29] :)
[14:38] * justinclark has a question
[14:38] Is there a way to create an event that is triggered when the app.status is Active? There's a configuration update that needs to take place after all k8s pods have settled and I'm struggling to find a good way to do that.
[14:39] I'm not tied to the app.status thing - really any way to have a piece of code run in a charm after all k8s pods are ready
[14:40] I've tried deferring events until that condition is true - and it kind of works. However, there could be several minutes before the deferred event re-fires.
[14:41] justinclark: How about tweaking the liveness probe or adding a readiness probe?
[14:42] justinclark: If Juju is forwarding requests to an application container that is not ready, then either we are not doing the lifecycle management correctly, or Juju has a bug, or some behaviour has not been documented properly.
[14:45] I will try tweaking the probes. Though it looks like Juju is sending requests to pods that haven't completely started up yet.
[14:49] justinclark: i think there is the intention of having juju events for that kind of thing but i don't know if it's planned yet
[14:49] ^ jam?
[14:50] also, aside, i'm going to make myself a lot of tea and have some paracetamol and lie down for a bit
[14:50] hope you get well soon
[14:53] @Chipaca, it is work for this cycle, though AIUI Mark is under the impression that Operator Framework could probably prototype it with the K8s integration already available inside the pod (by forking a process that would call back into juju-run to execute charm code)
[14:53] justinclark, bthomas ^
[14:53] The roadmap item is: https://docs.google.com/document/d/12aAYfhrKybaoWMV6HgTnJHP-sByqSurOj6PsNkdenRo/edit
[14:54] "lifecycle events as charm events"
[15:01] jam : With the current implementation can we assume Juju will set an application container's status as "active" only if both liveness and readiness probes pass?
[15:02] bthomas, "set application container's status": what concrete field are you thinking about? (iow, what data are you trying to read)
[15:02] the value in "juju status"
[15:02] the contents of "self.model.app.status"?
[15:02] some other field in the K8s API?
[15:08] jam : following up on justinclark's comment, I was looking for a way to ensure the load-balanced ingress address does not forward an HTTP API request to an application container that is not yet ready to service it. I am wondering if we could do this by adding a readiness probe or tweaking the existing liveness probe. justinclark is sending an API request in response to a new peer relation and seems to be finding that it is being
[15:08] forwarded to one of the new peers that are not yet ready to respond to it. I do not think the charm is doing any self.model.app.status check, and perhaps it does not need to, since it is sending the request to the ingress address. justinclark: please correct me if I have misunderstood.
[15:09] bthomas, I am not sure how Juju is configuring the Service abstraction in front of the pods.
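
A minimal sketch of the "defer until the workload is ready" pattern justinclark describes at 14:40, assuming an ops-based charm; _cluster_ready() and _apply_settings() are hypothetical placeholders for the workload health check and the post-settling configuration update, not framework APIs:

```python
from ops.charm import CharmBase
from ops.main import main
from ops.model import ActiveStatus, WaitingStatus


class MyCharm(CharmBase):
    def __init__(self, *args):
        super().__init__(*args)
        self.framework.observe(self.on.config_changed, self._on_config_changed)

    def _on_config_changed(self, event):
        if not self._cluster_ready():
            # Re-queue the event. As noted above, a deferred event is only
            # re-emitted at the start of the next hook, which may be minutes away.
            self.unit.status = WaitingStatus('waiting for all pods to settle')
            event.defer()
            return
        self._apply_settings()
        self.unit.status = ActiveStatus()

    def _cluster_ready(self):
        return True  # hypothetical: e.g. poll the workload's health endpoint

    def _apply_settings(self):
        pass  # hypothetical: the configuration update to run once pods settle


if __name__ == '__main__':
    main(MyCharm)
```
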
[15:09] https://kubernetes.io/docs/concepts/services-networking/service/ says that you should be using "readiness probes"
[15:09] so that K8s doesn't try to use a backend that isn't actually ready
[15:09] IIRC, *juju* can't know how to determine if an application is actually ready
[15:10] jam : agreed, this is certainly one of the current shortcomings of our charm.
[15:10] so it is the responsibility of the charm to define the readiness of the application
[15:10] yes indeed
[15:11] bthomas - that is correct. And it does seem as though reducing the timeoutSeconds/initialDelaySeconds in the readiness probe helped things. The error will pop up once, but it always seems to resolve itself.
[15:12] This is using the ES Python module btw.
[15:12] This will all be unnecessary in ES 7.x :)
[15:13] justinclark: awesome! if the readiness probe seems to work it saves us a lot of clunky code logic. IMHO it may be a good idea to choose a timeout that does not lead to unpleasant behavior.
[15:13] justinclark: what will be unnecessary in ES 7.x ?
[15:14] Setting "minimum_master_nodes" is no longer necessary. We will probably still set some dynamic settings, but this one in particular is the issue.
[15:15] got it
[15:15] bthomas, we get this error: "illegal value can't update [discovery.zen.minimum_master_nodes] from [1] to [2]" because we're trying to set the value to be higher than the total number of nodes in the cluster.
[15:16] Anyways, I can continue this and give a more complete description in the GitHub issue.
[15:16] justinclark: will read. Can catch up after standup.
[15:16] Thanks bthomas, Chipaca, jam for helping out
[16:12] bthomas, holding off on the PR for a bit. Investigating some options to eliminate the error completely.
[16:28] justinclark: ack. enjoy the deep think :-)
[18:59] PR operator#430 opened: Docstrings for storage, log and model modules
[22:12] * facubatista eods
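
A minimal sketch of declaring a readiness probe through the Juju pod spec, as suggested around 15:09-15:13, assuming a podspec-style K8s charm built on ops; the container name, image, health-check path, and timing values are illustrative assumptions, not the actual charm's settings:

```python
from ops.charm import CharmBase
from ops.main import main


class ElasticsearchCharm(CharmBase):
    def __init__(self, *args):
        super().__init__(*args)
        self.framework.observe(self.on.config_changed, self._on_config_changed)

    def _on_config_changed(self, event):
        if not self.unit.is_leader():
            return  # only the leader may set the pod spec
        self.model.pod.set_spec({
            'version': 3,
            'containers': [{
                'name': 'elasticsearch',
                'image': 'elasticsearch:6.8.12',
                'ports': [{'containerPort': 9200, 'protocol': 'TCP'}],
                'kubernetes': {
                    # Stop the Service from routing traffic to a pod until it
                    # answers the cluster health endpoint.
                    'readinessProbe': {
                        'httpGet': {'path': '/_cluster/health', 'port': 9200},
                        'initialDelaySeconds': 10,
                        'timeoutSeconds': 5,
                    },
                },
            }],
        })


if __name__ == '__main__':
    main(ElasticsearchCharm)
```
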