[01:28] would someone like to try and bootstrap develop on azure please? to check bug 1856739, which has the potential to be a weird apt issue
[01:28] Bug #1856739: Can't bootstrap juju-2.8-beta1 on azure
[01:34] here's a PR for someone who wants to understand multiwatchers: https://github.com/juju/juju/pull/10931
[01:34] hpidcock, babbageclunk ^^?
[01:56] thumper: uh, just trying to get my appdata stuff finished...
[01:56] babbageclunk: no worries
[01:59] I'll have a look in a sec
[02:24] thanks hpidcock
[03:07] wallyworld: ok, using json.Unmarshal seems to work! Although now I have a race where the watcher loop can be finished (so .Wait returns) while the watcher commonLoop is still running - so sometimes .Stop is called, and sometimes it's not
[03:08] I'm just going to poll the call count in the test until there are 5 and then do the assertion
[03:08] we do that elsewhere too, i.e. poll the call count
[03:18] wallyworld: actually, I changed the watcher to start the commonLoop with tomb.Go so it automatically waits for it to finish. It seems like the other watchers should do that too? Won't do it in this change though.
[03:19] thumper: added some questions to your PR
[03:24] thumper: I'm able to bootstrap to azure with 2.8-beta1
[03:36] babbageclunk: sounds reasonable i think, but would need to see the code
[03:46] wallyworld: it's up now: https://github.com/juju/juju/pull/11013/files#diff-b9ff1f17db1e956ba20a1a341ee1dfd1R444
[03:46] ok
[03:51] babbageclunk: so in all other watchers we just run commonLoop() outside the tomb, and that means the tomb doesn't wait for it, as you point out. i wonder why we have not seen issues before now
[03:52] what you did seems reasonable
[03:59] wallyworld: cool cool
[04:00] wallyworld: yeah, I guess it just hasn't caused any problems that the commonLoop might still be hanging around?
[04:00] guess not
[04:01] babbageclunk: what are the resource changes for? just getting to that bit
[04:06] babbageclunk: is it still draft? i gave a +1
[04:11] wallyworld: I just pushed a build7 of tjuju, it's still failing with the same error, i don't think they allowed tjuju at all
[04:12] kelvinliu: ah damn, ok. they told me they did. i reckon we try normal juju in latest/edge
[04:14] wallyworld: from Jamie's email, I think what he did was manually approve the build 6 that I pushed this Monday.
[04:14] ah bollocks, ok. let's just do edge and go from there
[04:15] wallyworld: so we just land my PR, and revert it if anything goes wrong?
[04:15] yup
[04:16] wallyworld: +1 plz https://github.com/juju/juju/pull/10857
[04:16] kelvinliu: looking
[04:17] lgtm ty
[04:18] thx
[04:21] hpidcock: I should have actually mentioned that most of the implementation has just been lifted from the state package
[04:21] I just moved the control of the goroutine to a real worker and hooked things up
[04:24] wallyworld: oh thanks! just finishing the featuretest
[04:24] ahh ok
[09:32] manadart, have you seen this one before? https://paste.ubuntu.com/p/99x2ttBMrv/
[09:35] No, but is snapd running?
[09:36] it would seem not, this is from a CI test
[09:38] this is REALLY interesting https://paste.ubuntu.com/p/8dJFv9dCHR/
[09:39] juju barked when attempting to deploy a basic bundle
[10:28] CR anyone? https://github.com/juju/juju/pull/11049
[11:01] stickupkid: Approved.
[11:02] My apt cacher's recent history says I've saved 40GB of downloads :) Killing it for repeated LXD controller spin-ups.
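(An aside on the apt cache manadart mentions at [11:02]: zeestrat asks about the setup just below, and the Discourse post linked there has the actual recipe. A minimal sketch only, with the cacher host as a placeholder and apt-cacher-ng's default port assumed:)

    sudo apt install apt-cacher-ng                    # the cache itself; listens on port 3142 by default
    juju model-defaults apt-http-proxy=http://<cacher-host>:3142 \
                        apt-https-proxy=http://<cacher-host>:3142
    # machines in newly created models then pull their packages through the cache,
    # which is what makes repeated LXD controller spin-ups so much faster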
[11:15] stickupkid: Boom. https://github.com/juju/juju/pull/11050
[11:16] manadart, wow! NICE
[11:16] manadart, let me look
[11:22] manadart: might I ask how you have set up your apt cacher? just apt-cacher-ng and bootstrapping with apt-http-proxy and apt-https-proxy set as model-defaults?
[11:24] zeestrat, we have a discourse post on it, one second
[11:24] zeestrat, https://discourse.jujucharms.com/t/lxd-bootstrap-apt-cache/558
[11:25] stickupkid: ah, great. thank you very much!
[16:06] Good evening Juju!
[16:13] Good Afternoon Juju!
[16:15] parlos: Afternoon :D
[16:27] Once a model is deployed, would it survive if the controller is removed (power-off style)?
[16:34] parlos, as long as it's not on the same machine, I would assume so, although I wouldn't advise it. The model would be in a broken state as it couldn't reach the controller, but the actual application you deployed would work.
[16:37] stickupkid, once the controller comes back would it reconnect to the deployed model(s)... or do they only exist in memory?
[16:37] parlos, it would come back up as long as the controller hasn't changed
[16:38] parlos, the data is stored in mongo, so all is good from that side
[16:38] parlos, I would recommend you set your controllers up as HA; that way you can restart controllers and still keep connection to a model
[16:41] stickupkid, eventually, once enough 'nodes' become available.
[16:45] stickupkid, have to figure out the 'best' way to upgrade a juju/maas deployment without breaking the deployed models. Any hints?
[16:46] parlos, you can upgrade a model and a controller if required
[16:46] parlos, we have some good documentation around it - https://jaas.ai/docs/upgrading
[16:46] parlos: sorry, what's the maas/controllers on?
[16:46] parlos: e.g. is this some hyperdense thingy?
[16:47] parlos: so the running models will just run. You could go to every machine and shut down jujud and they'll keep giving out services. As Simon notes there are some docs around how to upgrade (controller first, other models after)
[16:47] parlos: but if you think you've got a tricky situation it'd be good to understand what interesting bits you have before we give you bad advice
[16:49] stickupkid, I've upgraded juju controllers, that works fine, but the maas causes some issues. Afair I have issues with network spaces.
[16:49] parlos, any details on that?
[16:49] parlos, the issues?
[16:50] rick_h: the situation is simple, I've got maas deployed on a single small server; it serves juju, which has one controller deployed from it plus a couple of models.
[16:50] parlos: ok cool
[16:50] stickupkid, the spaces juju knows about do not match the spaces configured in maas.
[16:51] parlos: check out using reload-spaces to correct any changes in there
[16:51] rick_h, will do so.
[16:53] rick_h: reload-spaces runs, but does not update the spaces. :(
[16:53] parlos: what version of Juju?
[16:54] parlos: it'd be good to get a bug with what maas says, what reload-spaces --debug says, etc
[16:54] juju --version; 2.7.0-bionic-amd64
[16:55] parlos: ok yea, please let us know what you're getting in a bug. We did some work to improve reload-spaces a little bit in 2.7 and if it's not behaving it'd be good to know why it's not happy
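(For the bug rick_h asks for here, the kind of output that would help, sketched with the juju commands mentioned in the conversation; the grep filter is just an assumption about a convenient way to narrow the log:)

    juju spaces --format yaml            # what juju currently thinks the spaces are
    juju reload-spaces --debug           # what it fetches from MAAS and what it does with it
    juju debug-log -m controller --replay --no-tail | grep -i space   # controller-side view of the call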
[16:55] RPC connection died, seems bad. (connecting, dialed, established, ...)
[16:57] https://pastebin.com/B8U3Pde5
[16:58] parlos: yea, so it'd be good to see the controller side of that to see what it did and why
[16:58] maas controller?
[16:58] parlos: juju controller
[16:59] ok, so ssh to the controller and grab the log?
[16:59] parlos, juju debug-log --replay --no-tail -m controller
[17:02] stickupkid and rick_h, seems that the controller has a problem (Warning) with the cleanup of a model that was destroyed a couple of weeks ago. :( Can't see anything related to the reload call.
[17:10] ok, so I did a debug-log and dumped the output to a file, ran the reload-spaces, then another dump to a different file. Then a diff on those: no difference. It seems that reload-spaces did not generate any log entry...
[17:25] rick_h: parlos: reload-spaces was enhanced to add new information from maas spaces; it will not change existing data, nor delete it. those two items were for a later cycle
[17:28] hml: so if a space has changed on maas, you basically need to launch a new controller?
[17:35] parlos: adding a new model might work.
[17:35] parlos: if a space has changed with existing machines in the model, juju may lose connectivity to the units etc.
[17:36] parlos: therefore there is a lot of work on juju's side to make the changes and inform the user appropriately. afaik, it's on the radar of work to be done.
[17:39] hml: so is it correctly understood that when a model is deployed it grabs the 'current' network spaces from maas? I'd understand if they don't change as long as the model is deployed.
[17:40] hml: caused confusion, as the controller is just a 'model' which grabbed some spaces when it was deployed, and has restricted update capabilities.
[17:59] hml: can you take a look at https://github.com/juju/juju/pull/11051 when you are back?
[18:00] rick_h: if you have a few min can you please double-check the QA steps just in case I missed something?
[18:00] achilleasa: sure.
[18:23] achilleasa: definitely
[18:31] achilleasa: this may take a bit, it appears that only part of the original change was removed. makes comparisons interesting
[21:31] what do I do when juju status shows a machine down that nova lists as up? I ssh'd to it and restarted jujud-machine-38 already.
[21:32] skay: if juju shows it as down then it thinks the jujud agent isn't running on that machine
[21:32] skay: it'd be worth ssh'ing to it and seeing if there's some reason the service cannot start?
[21:34] rick_h: status does not show that service in a failure state
[21:35] skay: so when you say "machine down" I assume you mean status shows the machine agent as down and not talking
[21:36] rick_h: correct. and when I do `nova list` it shows the machine as active. and when I `ssh ubuntu@` I am able to. and I can then run sudo systemctl commands on it
[21:36] I restarted jujud-machine- and didn't get an error
[21:36] skay: that's good, then does it show back up in status?
[21:36] skay: if not, can you check the juju machine log for that machine?
[21:37] /var/log/juju/machine....
[21:37] rick_h: oh, it shows as inactive
[21:37] will do
[21:38] hmm, inactive seems an odd state
[21:39] rick_h: weird. here's the last bit of the log. unauthorized access. https://paste.ubuntu.com/p/PFkw3dnc8K/
[21:41] skay: :( that seems ungood. The password on the machine for authenticating isn't valid. Any interesting history on the model?
[21:41] was it migrated or upgraded or something else in the recent past?
[21:43] rick_h: I don't think so. only weird thing I can think of is that I first used it to deploy xenial machines, and now I'm using it to deploy bionic ones. which isn't really weird, right? I don't need to create a new model to deploy a different series, do I?
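(Roughly the sequence of checks rick_h and skay go through above for a machine agent that status reports as down; machine 38 as in this case, with the usual jujud unit and log file naming assumed:)

    juju ssh 38                                    # or ssh ubuntu@<machine-address>, as skay did
    sudo systemctl status jujud-machine-38         # running, failed, or merely inactive?
    sudo systemctl restart jujud-machine-38
    sudo tail -n 50 /var/log/juju/machine-38.log   # look for errors such as the "unauthorized access" seen here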
[21:44] wallyworld: straightforward merge 2.7 to develop? This is before the CMR/appdata one that I'm expecting to have some conflicts with the migration work. https://github.com/juju/juju/pull/11052
[21:45] the model is version 2.6.10
[21:46] babbageclunk: ok, looking
[21:46] thanks
[21:46] I am thinking of destroying the machine and trying the deployment over
[21:47] everything else worked
[21:48] skay: no, no need for a different model for a different series
[21:48] rick_h: do you want me to try anything else before I kill the machine?
[21:48] skay: if the machine can go down/back up let's see if that works. I can't think of a good reason why the login would fail like that
[21:52] (running mojo deploy manifest now... tick tock)
[21:53] looking good. thanks, rick_h
[23:45] wallyworld: and now the merge for the appdata-cmr worker stuff to develop? https://github.com/juju/juju/pull/11053
[23:50] sure
[23:51] babbageclunk: so this is a forward port of that one commit from 2.7 with conflicts resolved?
[23:53] wallyworld: yup
[23:53] thanks!
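(For reference, one way the 2.7-to-develop forward ports discussed above are commonly done; the remote and branch names here are assumptions, not taken from the log:)

    git fetch upstream
    git checkout -b 2.7-into-develop upstream/develop
    git merge upstream/2.7              # or git cherry-pick <sha> for a single commit
    # resolve any conflicts (e.g. against the migration work), commit, and open the PR against develop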