[09:44] Morning Chipaca [09:45] Chipaca: I was wondering how would rasing exceptions work for WaitingStatus where the common use case is to set waiting status, do something and then set Active status ? Also for ActiveStatus would it make sense to raise an exception, when the intention is to indicate an error free condition ? [09:45] bthomas: morning [09:48] bthomas: not for active status, no. And my idea is/was/had been that there'd be 2 ways: one with an exception, ie eg raise BlockedStatus("..."), and one with a context manager, ie eg with MaintenanceStatus("..."): … [09:51] Ah Ok. Contenxt managers would make sense of MaintenanceStatus. Personally I am inclined to "There should be one-- and preferably only one --obvious way to do it. Although that way may not be obvious at first unless you are Dutch" ... :-) [09:56] Chipaca: In the same vein as the request to clarify doc string of event.defer() shouldn't we make explicit in docstring for Status (especially BlockedStatus) that the event handler continues to execute even after setting the status unless an return call is made. [10:35] Chipaca: I need a 1:1 to show you the behaviour I mentioned in standup yesterday and get a second opinion . Can I add a 30 min session to your calendar ? [10:35] sure [10:37] done (made it 45 min just in case - sneeky :-) ) [10:53] bthomas: i commented on your #446 but (as i say there) i'd like jam's input [10:53] PR #446: Minor clarification for deferred event doc string [10:55] Chipaca: I really like what you have written. Kudos. [11:10] ¡Muy buenos días a todos! [11:14] Buenos días facubatista [11:16] hola bthomas [12:17] Buongiorno! [12:24] Chipaca: bthomas: so one piece is that only the handler that defers stays registered. eg if you have config-changed register 2 event handlers foo.on_config() and bar.on_config(); if foo defers, but bar does not, only foo will be called again. [12:31] jam: yeah i should make that explicit also [12:31] jam: it's kinda half there :) [12:31] Chipaca: I'm writing something up, I'll have you look at it. [12:31] (editing your version) [12:31] jam: awesomesauce [12:32] jam: meanwhile i've spent a fun hour stepping through logs. Is it possible that in k8s, juju does install → start → config-changed rather than install → config-changed → start? [12:32] jam: also, is it expected / supported / a feature, that setting app data causes the charm to let itself know that app data changed? 🙂 [12:33] (the mongodb uses this to trick juju into what is in effect a busy wait) [12:34] setting app data *to the same thing as it was before*, even [12:37] Chipaca: posted on the PR. I think the best text will be a hybrid of yours and mine. [12:38] jam: ta [12:38] Chipaca: If start is happening before the first config-changed, I would consider that a bug, because you shouldn't be ready to start before knowing your config. [12:38] Chipaca: changing your own application data will tell the other units of your application that there is new data. On K8s it is 1 application agent, but many unit agent threads. [12:38] So unit/0 (leader) setting new app data should wake up unit/1 and unit/2 to see the new data. [12:39] However, unit/1 and unit/2 should not be allowed to set app data as they are not the leader. [12:39] I do know there was some confusion, because there is only 1 application agent, so it needs to be telling the controller "on behalf of unit/0 set the leader data" [12:39] jam: unless I'm much mistaken, I'm seeing unit/0 setting app data and getting woken for it [12:39] bthomas: do you think you could pastebin the log? [12:40] bthomas: it shows both these things [12:40] Chipca will do now [12:40] ops Chipaca will do now [12:41] Chipaca: thinking through it, it is plausible that we aren't properly trapping it. (you have to record when unit/0 sets app data to rev 2, that unit/0 has now 'seen' rev 2 of the data, otherwise it would wake up again) [12:41] jam: mostly i don't want us to write a charm that uses this behaviour if it's a bug :) [12:41] Chipaca: so exploiting it will also wake up all the other units over and over. [12:42] it is very convenient though, in the absence of an app ready event [12:45] Chipaca: though it could be done with a cron job running juju-run, IIRC, and that would be intended to work :) [12:45] It does take effort to fix on our part, but I can't help but feel that we should. [12:45] * bthomas is struggling to copy and paste the entire terminal window :( . will get there eventually [12:47] Chipaca: jam : https://pastebin.canonical.com/p/pBtvTfc8YJ/ [12:47] bthomas: apt install pastebinit 🙂 [12:47] thanks [12:48] bthomas: i think you forgot the --replay? [12:48] let me fix that [12:51] Chipaca: jam : https://pastebin.canonical.com/p/jyvCYnwJvb/ should be fixed now [12:53] bthomas: Chipaca: line 342. why are we seeing 2 logs of Operator Framework [12:54] ah, because we have hooks/start as a symlink to dispatch [12:54] jam: yeah, calling itself [12:54] (and detected) [12:54] jam: from that log, the events seen are, in order: install, cluster-relation-created, leader-elected, config-changed, start, db-storage-attached, … [12:55] Chipaca: 175 we run install, 265 we run config-changed,332 we run start [12:55] hmm [12:55] I'm surprised storage attached is so late, but maybe we can't tell you about the storage on k8s until the pod is actually running. [12:55] jam: i/we must've gotten confused before, sorry for the noise :) [12:56] jam: yeah the storage-attached happens after config-changed has fired a pod-spec-set [12:58] jam: ah! and a config-changed after db-storage-attached [12:58] Chipaca: there is nothing that says config-changed won't be run later, I believe in that case it is because your IP address has become available. [12:58] (networking calling config-changed is one of the warts) [12:58] jam: gotcha [12:59] jam: i think we just missed the first config-changed when we were walking through it [12:59] looking at 821 "running cluster-relation-changed", it ran relation changed, and then gets ready to run it again [12:59] runs a bunch of "state-get, state-set" and relation-get, relation-list, but I don't see a relation-set [12:59] that would cause it to fire the next one. [13:00] jam: relation-set is called from the start event handler [13:00] Chipaca: but why would we fire app-relation-changed over and over if it isn't *changing* [13:01] jam: including the deferred one that is re-emitted from cluster-relation-changed waking up the charm [13:01] jam: i'm asking you 🙂 [13:01] no backsies [13:02] jam: firing app-relation-changed again and again is a busy loop hack that makes this charm initialize fast (afaiu) . Not my invention . [13:02] Chipaca: sure, it was more rhetorical [13:02] It seems more that we just aren't doing the right thing for peer application relation data, and we just fire forever, which is *definitely* a bug. [13:03] bthomas: do you think you could tear this down, patch ops so that ops.framework.StoredList.__repr__ returns repr(self._under), and deloy it again? [13:03] Chipaca: I can certainly try. [13:09] Chipaca: https://pastebin.canonical.com/p/bxBc2FFtRS/ is my change. Deploying and regenerating lows now [13:09] 👍 [13:10] I think I'll drop in a __repr__ for all of these [13:15] Chipaca: https://pastebin.canonical.com/p/V6wTgfbdMg/ Please check if I got this right and is what you required. [13:16] bthomas: if this is the new log, i suspect it hasn't packaged the modified ops in there [13:16] hmm [13:16] bthomas: look for replica_set_hosts= in that log [13:17] I just did a pip3 install in the operator git repo. git diff shows me the changes are there (not committed). pip3 list shows my that ops is installed. [13:24] Chipaca: ^ I just checked the file in my ~/.local/lib/python pip3 install location and the ops.framework.StoredList does have the __repr__ method added. [13:24] btw, I'm doing breakfast/family stuff for abit [13:27] facubatista: where does Charmcraft build pull operator framework from ? Is there way to tell it to pull from local pip repository ? [13:28] Chipaca: You were right . ^ [13:28] * bthomas goes to boil some pasta [13:28] bthomas, sure! by default it gets it from PyPI, but if you need to get it from your disk, just tell that to pip [13:29] bthomas, like putting this in your requirements.txt (untested): file:///your/absolute/path/to/the/ops/project/ [13:30] Thanks [13:31] bthomas, let me know if you can not make it work [13:33] will do [13:38] github down? [13:38] github down [13:42] facubatista: https://pastebin.canonical.com/p/6rVmZrBBgg/ [13:42] * bthomas is still boiling pasta [13:43] this is weird: ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/home/balbir/.local/lib/python3.8/site-packages/ops' [13:43] hmm [13:45] my charmcraft version is 0.6.0+10.g635bad0 [13:46] bthomas, how the '/home/balbir/.local/lib/python3.8/site-packages/ops' directory gets into play? are you located there? you have some environment variable to configure pip behaviour? [13:47] bthomas, which permission has that directory? [13:50] facubatista: "drwxrwxr-x 4 balbir balbir 4096 Nov 13 13:22 /home/balbir/.local/lib/python3.8/site-packages/ops" [13:50] * bthomas -> lunch [13:50] bthomas, let me know when you come back [14:14] * bthomas -> back [14:14] facubatista: directory has global read . I wonder can I manually copy a file into build directory and repack it. [14:16] bthomas, no, there's a more weird problem, let's find it [14:17] bthomas, what happens if you run pip3 install --target=/home/balbir/src/charms/mongodb-operator/build/venv --requirement=/home/balbir/src/charms/mongodb-operator/requirements.txt [14:17] trying now [14:20] facubatista: https://pastebin.canonical.com/p/9B5n9ZtC89/ [14:20] bthomas, what's the content of requirements.txt ? [14:21] facubatista: file:///home/balbir/.local/lib/python3.8/site-packages/ops [14:21] along with pymongo and git repo for OCI image lib [14:22] git+https://github.com/juju-solutions/resource-oci-image/@c5778285d332edf3d9a538f9d0c06154b7ec1b0b#egg=oci-image [14:22] [14:22] and pymongo [14:22] bthomas, ah, that's where that .local weird path is involved [14:22] bthomas, you should point to the project's directory [14:23] to the source code [14:23] you can not "install from installed" [14:23] ah ok let me try that [14:25] facubatista: that worked perfectly. Thank you. So much gets "Lost in Translation" . Checkout the movie if you haven't :-) [14:25] bthomas, I love Scarlett Johansson installing python packages [14:26] :-) [14:32] * Chipaca just did 'git pish' and giggled for a full minute before fixing the typo [14:33] PR operator#447 opened: give StoredFoo classes a useful __repr__ [14:35] Chipaca: https://pastebin.canonical.com/p/khMPWtx5pw/ Lunch is served :-) [14:35] bthomas: buen provecho [14:36] :-) [14:36] jam: in bthomas's pastebin above you can see it's relation-set'ing it to the same value every time, fwiw [14:37] bthomas: when you get back from lunch, can you deploy one with N nodes? [14:37] units [14:37] to see how chatty it gets [14:37] (probably: very) [14:42] Chipaca: why do some lines have: DEBUG unit.mongodb/0.juju-log Deferring . and some: DEBUG unit.mongodb/0.juju-log cluster:0: Deferring . [14:42] is it different versions of the charm? [14:42] I do see relation-set in those lines, I didn't find them before, but maybe I was just missing it? [14:44] jam: I don't know where the cluster:0 in a lot of those log lines is coming from [14:44] jam: i don't think it's 'us' [14:45] Chipaca: With 3 units https://pastebin.canonical.com/p/nfrp3wMDZk/ [14:45] that *looks* like it is relation-name:relation-id [14:46] jam: note it's also there for the 'ops up and running' line [14:47] jam: application-mongodb: 14:44:19 DEBUG unit.mongodb/1.juju-log cluster:0: Emitting Juju event cluster_relation_changed. [14:48] jam: not sure what it is [14:48] it seems to me that something hacked ops logging to inject a prefix. it is possible that Juju is doing it, but then I would expect it to be consistent. [14:52] Chipaca: so the original paste "https://pastebin.canonical.com/p/jyvCYnwJvb/" actually does show relation-set, just not in the early calls, it happened in the later ones. [15:43] bthomas, is there a canonical/mongodb-operator? Working on the Graylog docs and wanted to point to that repo. If not I'll make one quickly [15:43] * facubatista -> quick lunch before standup [15:44] justinclark: there is [15:44] hang on [15:44] justinclark: https://github.com/canonical/mongodb-operator However [15:45] It does not have the relation code and it does have the bug [15:45] that mongodb may not be fully initialized when graylog connects [15:45] I see. It's also a private repo. [15:45] oops [15:46] we can make it public. [15:46] But use that link in your README [15:46] Sounds good. I think we need to submit a request with IS to make it public. [15:47] I'm not sure how Jeremy did that for Elasticsearch and Graylog, though [15:47] justinclark: I thought we can just mark it public in the settings. I may be wrong. [15:47] However I prefer checking with jldev before we do [15:48] He mentioned the other day in the private channel that we shouldn't be making private repos so I'm sure it's true for this also. [15:48] Let me see if I can find the comment. [15:49] From November 5: "Yeah, and organization admin (IS) will have to change it. We shouldn't be making private repos to begin wtih." [15:51] It was a fairly quick change, though. I'm sure we can get it changed today no problem. [16:04] Just submitted a ticket with IS. [16:05] justinclark: can I leave that with you then. Alternatively we can delete the repository and I can just re-create a public one ? [16:06] oops can not delete repo either [16:07] Nah I don't think it's really a big deal. It won't take too long. It might even be better this way since we still need to merge your refactored branch and that's what we want to be public. [17:01] Chipaca: https://bugs.launchpad.net/juju/+bug/1904196 [17:01] I was able to reproduce that relation-set --app triggers a relation-changed for the current leader. [17:01] *however* setting it a second time did *not* trigger a second relation-changed event. [17:01] bthomas: ^ [17:03] jam: this in k8s? [17:04] jam: or is that the difference here [17:05] Chipaca: this is in my ubuntu-operator charm which is running in LXD. I would confirm that you really are setting identical data for the charm (which you could be, just confirm that it is the case) [17:05] jam: in the second of the pastebins you link we're logging the actual data that's set [17:06] jam: (bthomas patched ops with a __repr__ on the StoredList) [17:06] jam: so if you search for replica_set_hosts, that's the app data [17:06] jam: (or, if you'd ratehr, search for relation-set; the app data is two lines further down) [17:09] Chipaca: if you are talking this one, I do see it changing, though not always [17:09] https://pastebin.canonical.com/p/nfrp3wMDZk/ [17:10] jam: ah because i asked bthomas to spin up N [17:10] for extra chatty [17:10] ¯\_(ツ)_/¯ [17:11] jam: anything we can do to properly clarify this? [17:11] i guess write an ad-hoc charm [17:11] yeah, i'll do taht [17:11] Chipaca: hence why I updated ubuntu-operator that does nothing but the bit I add :) [17:12] 🙂 [17:12] Chipaca: from what I see, it looks like on_start is being deferred but *also* calling relation-set [17:12] yes, that's part of the mud in the charm that balbir will refactor [17:12] it's deferring it _twice_, just for funsies [17:13] ah, and then storage-attached runs the deferred start, causing it to call relation-set again [17:13] and then we get relation-changed (the first time), which causes us to run start, and ultimately causes us to call relation-set 2x [17:14] and the rest look like actual changes causing us to wake up again [17:14] Chipaca: so from my perspective, the relation-set calls are because on-start is calling it, and being deferred and thus woken up again, and there isn't a busy-wait loop [17:14] just a 'wake up every time I tell myself something' [17:15] hmm [17:15] jam: i'll try to isolate it 🙂 [17:21] Chipaca: I'm looking at trying to avoid waking yourself up, but it would be good to know if we're doing something even weirder with k8s. You probably can just edit the mongodb charm and have it do less 'other stuff' [18:13] fun fact, you can't log "-". [18:30] jam: 18:30:22 INFO juju.worker.uniter unit "mongodb/0" shutting down: %!s() [18:30] fwiw [18:31] Chipaca: already have a patch for that in 2.8.7 [18:31] 👍 [18:31] Chipaca: did you know that gullible isn't in the dictionary? [18:31] jam: it is not! mostly becuase it's in Spanish [18:32] Chipaca: though more directly, can you not 'logger.debug('-')' ? [18:32] it seems like a case of 'juju-log -- -' or some fashion to avoid having it be misinterpreted [18:32] logger.debug("--------") got very upset at me [18:34] jam: setting app data to the same value does not seem to trigger the event, in k8s, at least with this pruned back operator [18:35] no such option '------' [18:35] :) [18:35] and with that, EOW, have a good one, etc etc [18:35] see y'all sometime next week :-D [18:35] have a good one Chipaca [18:37] jam: you too! [21:21] * facubatista eods, and eows! [21:21] see you all on Monday