/srv/irclogs.ubuntu.com/2020/10/20/#smooth-operator.txt

mupIssue operator#429 closed: Test harness set_leader erases relation data <Created by stub42> <Closed by stub42> <https://github.com/canonical/operator/issues/429>04:32
* Chipaca brb12:46
* facubatista didn't say hello today, shame on me13:31
Chipacafacubatista: i've been grumpy for days now it seems13:32
Chipacafacubatista: gooooooooooooooooooooooooooooood  morning!13:32
facubatistaChipaca, :)13:32
* bthomas had IRC disruption due to 2.4GHz interference 13:55
* Chipaca had coffee, instead14:16
bthomas:)14:29
* justinclark has a question14:38
justinclarkIs there a way to create an event that is triggered when the app.status is Active? There's a configuration update that needs to take place after all k8s pods have settled and I'm struggling to find a good way to do that.14:38
justinclarkI'm not tied to the app.status thing - really any way to have a piece of code run in a charm after all k8s pods are ready14:39
justinclarkI've tried deferring events until that condition is true - and it kind of works. However, there could be several minutes before the deferred event re-fires.14:40
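A minimal sketch of the defer-until-ready pattern justinclark describes, using the ops framework; the config-changed hook and the _all_units_ready / _apply_cluster_settings helpers are placeholders for illustration, not the charm's actual code, and a deferred event only re-fires when some later hook runs:

    from ops.charm import CharmBase
    from ops.main import main
    from ops.model import ActiveStatus


    class MyCharm(CharmBase):
        def __init__(self, *args):
            super().__init__(*args)
            self.framework.observe(self.on.config_changed, self._on_config_changed)

        def _on_config_changed(self, event):
            if not self._all_units_ready():
                # Not every pod has settled yet; defer and retry when the next
                # event fires (which may be several minutes later).
                event.defer()
                return
            self._apply_cluster_settings()
            self.unit.status = ActiveStatus()

        def _all_units_ready(self):
            # Placeholder: a real charm would inspect peer relation data or the
            # workload API to decide whether every pod is up.
            return True

        def _apply_cluster_settings(self):
            # Placeholder for the configuration update that must run after the
            # pods have settled.
            pass


    if __name__ == "__main__":
        main(MyCharm)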
bthomasjustinclark: How about tweaking the liveness probe or adding a readiness probe?14:41
bthomasjustinclark: If Juju is forwarding requests to an application container that is not ready, then either we are not doing the lifecycle management correctly, or Juju has a bug, or some behaviour has not been documented properly.14:42
justinclarkI will try tweaking the probes. Though it looks like Juju is sending requests to pods that haven't completely started up yet.14:45
Chipacajustinclark: i think there is the intention of having juju events for that kind of thing but i don't know if it's planned yet14:49
Chipaca^ jam?14:49
Chipacaalso, aside, i'm going to make myself a lot of tea and have some paracetamol and lie down for a bit14:50
bthomashope you get well soon14:50
jam@Chipaca, it is work planned for this cycle, though AIUI Mark is under the impression that the Operator Framework could probably prototype it with the K8s integration already available inside the pod (by forking a process that would call back into juju-run to execute charm code)14:53
jamjustinclark, bthomas ^14:53
jamThe roadmap item is: https://docs.google.com/document/d/12aAYfhrKybaoWMV6HgTnJHP-sByqSurOj6PsNkdenRo/edit14:53
jam"lifecycle events as charm events"14:54
bthomasjam : With the current implementation can we assume Juju will set an application container's status to "active" only if both liveness and readiness probes pass?15:01
jambthomas, "set application container's status" what concrete field are you thinking about? (iow, what data are you trying to read)15:02
jamthe value in "juju status"15:02
jamthe contents of "self.model.app.status" ?15:02
jamsome other field in K8s API?15:02
bthomasjam : following up on justinclark's comment, I was looking for a way to ensure the load balanced ingress address does not forward an HTTP API request to an application container that was not yet ready to service it. I am thinking we could do this by adding a readiness probe or tweaking the existing liveness probe. justinclark is sending an API request in response to a new peer relation and seems to be finding that it is being15:08
bthomasforwarded to one of the new peers that is not yet ready to respond to it. I do not think the charm is doing any self.model.app.status check, and perhaps it does not need to since it is sending the request to the ingress address. justinclark : please correct me if I have misunderstood.15:08
jambthomas, I am not sure how Juju is configuring the Service abstraction in front of the pods.15:09
jamhttps://kubernetes.io/docs/concepts/services-networking/service/ says that you should be using "readiness probes"15:09
jamso that K8s doesn't try to use a backend that isn't actually ready15:09
jamIIRC, *juju* can't know how to determine if an application is actually ready15:09
bthomasjam : agreed this is certainly one of the current shortcomings of our charm.15:10
jamso it is the responsibility of the charm to define the readiness of the application15:10
bthomasyes indeed15:10
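A rough sketch of the readiness-probe approach bthomas and jam suggest, expressed as a pod spec built in charm code; the container name, image, port, and /_cluster/health endpoint are placeholders, and the kubernetes.readinessProbe field names are an assumption that should be checked against the Juju pod spec documentation:

    # Pod spec fragment with a readiness probe, so the Service only routes
    # requests to pods whose API actually answers.
    def build_pod_spec():
        return {
            "version": 3,
            "containers": [
                {
                    "name": "elasticsearch",
                    "imageDetails": {"imagePath": "elasticsearch:6.8.12"},
                    "ports": [{"containerPort": 9200, "protocol": "TCP"}],
                    "kubernetes": {
                        "readinessProbe": {
                            "httpGet": {"path": "/_cluster/health", "port": 9200},
                            "initialDelaySeconds": 10,
                            "periodSeconds": 10,
                        },
                    },
                }
            ],
        }

In an ops charm this would typically be applied from the leader unit with something like self.model.pod.set_spec(build_pod_spec()).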
justinclarkbthomas - that is correct. And it does seem as though reducing the timeoutSeconds/initialDelaySeconds in the readiness probe helped things. The error will pop up once, but it always seems to resolve itself.15:11
justinclarkThis is using the ES Python module btw.15:12
justinclarkThis will all be unnecessary in ES 7.x :)15:12
bthomasjustinclark: awesome! if the readiness probe works, it saves us a lot of clunky code logic. IMHO it may be a good idea to choose a timeout that does not lead to unpleasant behavior.15:13
bthomasjustinclark: what will be unnecessary in ES 7.x ?15:13
justinclarkSetting "minimum_master_nodes" is no longer necessary. We will probably still set some dynamic settings, but this one in particular is the issue.15:14
bthomasgot it15:15
justinclarkbthomas, we get this error: "illegal value can't update [discovery.zen.minimum_master_nodes] from [1] to [2]" because we're trying to set the value to be higher than the total number of nodes in the cluster.15:15
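A sketch of the kind of call being discussed, using the elasticsearch Python client; the ingress URL is hypothetical, and clamping the value to the live node count is just one way to avoid the illegal-value error, not necessarily what the charm ends up doing:

    from elasticsearch import Elasticsearch

    # Hypothetical ingress address; the real charm would use the service address.
    es = Elasticsearch(["http://es-ingress.example:9200"])

    # The setting cannot be raised above the number of nodes currently in the
    # cluster (the error above), so clamp it to the nodes that have joined.
    node_count = len(es.nodes.info()["nodes"])
    desired = 2  # e.g. quorum for a planned 3-node cluster
    safe_value = min(desired, node_count)

    es.cluster.put_settings(
        body={"persistent": {"discovery.zen.minimum_master_nodes": safe_value}}
    )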
justinclarkAnyways, I can continue this and give a more complete description in the Github issue.15:16
bthomasjustinclark: will read. Can catch up after standup.15:16
justinclarkThanks bthomas, Chipaca, jam for helping out15:16
justinclarkbthomas, holding off on the PR for a bit. Investigating some options to eliminate the error completely.16:12
bthomasjustinclark: ack. enjoy the deep think :-)16:28
mupPR operator#430 opened: Docstrings for storage, log and model modules <Created by facundobatista> <https://github.com/canonical/operator/pull/430>18:59
* facubatista eods22:12
