[00:15] wallyworld: easy review: https://github.com/juju/juju/pull/7516
[00:15] wallyworld: also, jam would prefer it if we did bulk merges from 2.2 to develop instead of targeted PRs for each forward port
=== menn0_ is now known as menn0
[00:30] Can anyone explain to me what's happening here when I try to upgrade juju 2.1.3 controllers to 2.2? https://pastebin.canonical.com/191338/
[00:31] It seems like 2.2 client -> 2.1.3 controller connections are fairly fundamentally broken.
=== bradm_ is now known as bradm
[00:57] blahdeblah: hmmm, that doesn't look good. let me do some digging
[00:58] thanks menn0
[00:59] menn0: confirmed on another controller which has been working fine up until now:
[00:59] [master*]paulgear@peleg:~$ juju status
[00:59] ERROR unable to connect to API: malformed HTTP response "\x15\x03\x01\x00\x02\x02\x16"
[00:59] ok
[00:59] seems like it's trying to do http to an https port or something
[01:00] I remember jam looking at something kinda similar with me recently, and I think it had something to do with some API requests being proxied and some going direct.
[01:04] blahdeblah: I'm setting up a repro. in case it is proxy related, do you have $http_proxy etc set on the client machine?
[01:05] menn0: yep
[01:05] https_proxy=http://127.0.0.1:3128/
[01:05] http_proxy=http://127.0.0.1:3128/
[01:05] no_proxy=localhost,127.0.0.0/8,::1
[01:05] ftp_proxy=http://127.0.0.1:3128/
[01:11] blahdeblah: works fine for me without a proxy
[01:11] blahdeblah: trying with a proxy in the mix
[01:17] menn0: No difference to me whether I do it with or without those proxy settings
[01:23] blahdeblah: very strange. I haven't been able to replicate yet.
[01:25] blahdeblah: brb
[01:28] blahdeblah: aha... if I have proxy settings set I get the same thing
[01:29] that's progress, I guess; strange that it affects me when I unset them, though
[01:31] blahdeblah: it took about 10mins from the connection attempt to the "malformed HTTP response" error though
[01:31] yeah - that's about how long it is for me
[01:31] blahdeblah: now that I look at the timestamps I see that you get that too
[01:31] yep
[01:31] blahdeblah: I'll do some more digging
[01:32] thanks - do you want a bug about this?
[01:33] blahdeblah: yes please
[01:33] blahdeblah: one likely cause is that we switched websocket libraries between 2.1 and 2.2
[01:34] that would seem a strong candidate
[01:34] blahdeblah: to fix various issues
[01:34] blahdeblah: perhaps there's a difference in proxy handling
[01:36] Seems like it :-)
[01:41] https://bugs.launchpad.net/juju/+bug/1698989 created
[01:41] Bug #1698989: Can't connect to controllers, juju status hangs for 10 minutes
[01:49] blahdeblah: thank you
[01:49] thank you! :-)
[01:50] blahdeblah: I can definitely repro it when https_proxy is set, and it goes away when https_proxy is unset
[01:50] blahdeblah: can you double check at your end?
[01:50] Already did, and mentioned that in the bug
[01:50] Doesn't matter whether I set or unset the vars, it doesn't work
[01:50] Which suggests to me that it's hiding a proxy setting somewhere else
[01:51] blahdeblah: you don't have transparent redirect to the proxy or something set up?
[01:51] nope
[01:51] Is there a way I can curl/wget the API endpoint directly to verify with/without the _proxy settings?
[01:54] I suspect you'll need a websocket client
[02:02] blahdeblah: hmmm... I just did some packet sniffing and it looks like the client is trying to send a proxy CONNECT straight to the Juju API server port
[02:02] LOL
[02:03] If we're using a proxy, ignore the proxy and try to use our API server as the proxy! :-)
[02:03] maybe something like that
[02:05] blahdeblah: it's still a mystery why disabling the proxy env vars doesn't work for you
[02:05] indeed
[02:05] but it should work with, regardless
[02:05] blahdeblah: for sure
[02:06] (and did work on 2.1.3...)
[02:06] blahdeblah: I'm just wondering if there's more to this
[02:06] blahdeblah: anyway, I'll do some digging through the gorilla websocket code
[02:06] enjoy
[04:39] jam: here's a PR to improve update status hook firing. sadly, having the uniter efficiently be able to listen to just changes to that value and pick up the new one is somewhat non-trivial, so not done for now https://github.com/juju/juju/pull/7519
[05:01] wallyworld: looking, found at least 1 typo so far, will finish review after standup
[05:01] ok, thanks
[05:04] jam: i fixed one, so refresh before looking again
[05:04] wallyworld: kk, 'greater' vs 'less than'
[05:04] yup
[05:04] sigh
[06:09] wallyworld: reviewed
[06:09] urty
[06:16] ?
[06:19] jam: sorry, typo
[06:19] i responded to some comments
[06:19] the listen for changes thing is a lot of work potentially
[06:21] so you need to use --config when bootstrapping or adding a model really
[06:22] at least with the random times, existing models still get benefit
[06:22] that was my thinking anyway
[06:22] in any case, we can land this and follow up with work to listen for changes if there's time
[06:23] but we only have 1 day or so till release
[06:23] wallyworld: do we have a way to change it during upgrade?
[06:23] whenever agents bounce, they will read any current value, but you can't set the value before upgrade because juju 2.2.0 doesn't know about the setting
=== fnordahl_ is now known as fnordahl
[06:41] wallyworld: juju upgrade-juju should take --config
[06:42] maybe, but it doesn't :-)
[06:42] would be nice though
[06:42] wallyworld: I think we're missing it on upgrade-charm as well, essentially we need all the 'deploy' flags for upgrade
[06:43] it definitely would be nice to add
[06:43] maybe something for the sprint
[07:19] jam: i think the issue last time with mandatory config was people using a newer client with an older juju?
[07:20] the trouble with accepting "" is that it hides config errors
[08:16] jam: i've added a second bit of bespoke default handling with a todo to remove later, and some testing for the timer value. could you PTAL? I've also pushed another PR to fix the monfig-config feature request for 2.2.1. off to soccer for a bit but back later
[08:17] k
[09:01] wallyworld, jam : https://bugs.launchpad.net/juju-core/+bug/1244841 is adding --config (and other deploy flags) to upgrade-charm. It also came up as part of storage, where if there are deployments of your charm you can never add mandatory storage because there would be no way to upgrade the existing deployments.
[09:01] Bug #1244841: support atomic upgrade-charm --config var=val ...
[10:02] blahdeblah, jam: I found the proxy bug
[10:07] blahdeblah, jam: updating the ticket
[10:25] menn0: I didn't see an update yet, so I might as well ask here what you found
[10:36] jam: it's there now
[10:36] thanks menn0
[10:36] I think we just kill the DNS cache
[10:37] jam: I'm beginning to agree
[10:37] jam: I was just reading your PR where you had reservations about the feature
[10:38] jam: the commentary seems to indicate the cache is important for cert validation
[10:39] is that actually true?
[10:39] menn0: my understanding was that they needed to change what we were doing because of cert validation
[10:41] jam: I'll continue with this tomorrow
[10:41] jam: too tired to think too hard about it now
[10:42] jam: it seems like "killing the cache" will have to be done carefully
[10:44] menn0: maybe, I'm happy to chat specifically about my understanding of it
[10:45] Rog's patch seems more about ultimately dealing with slow DNS but otherwise handling the fact that we were tracking addresses in a way that conflicted with them wanting to force hostnames for JAAS controllers
[10:47] jam: that's why I said "carefully". we have to not break the JAAS requirements when winding some of this feature back.
[10:48] menn0: yeah, my guess is that if we just don't do internal DNS caching, it will use existing DNS and just work
[10:48] but testing, etc.
[10:50] * menn0 nods
[10:50] jam: and this is quite important to fix. I almost marked the bug as Critical
[10:57] burton-aus: are you around?
[10:58] we just got a weird failure in CI trying to land a patch
[10:58] and it looks like a buggy script
[10:58] http://juju-ci.vapour.ws:8080/job/github-merge-juju/11160/artifact/artifacts/windows.log
[10:59] ['scp', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=120', '-oUserKnownHostsFile=/tmp/tmpTtb9OC', '-i', '/var/lib/jenkins/cloud-city/staging-juju-rsa', 'juju-core_2.3-alpha1.tar.gz', 'Administrator@developer-win-unit-tester.vapour.ws:\'\nimport tempfile\nprint tempfile.mkdtemp(prefix=\'"\'"\'workspace-runner-\'"\'"\')\nc:\\users\\administrator\\appdata\\local\\temp\\workspace-runner-r1g7mn/ci\'/']
[10:59] ' returned non-zero exit status 1
[10:59] why would you have "scp" .... "import tempfile\nprint"
[10:59] that looks like it is mixing 'ssh' and run python with 'scp' working files back out
[11:05] jam no idea, but I just triggered a re-run.
[11:05] jam nicholas was working on the windows machine.
[11:05] k, I was looking to file a bug in case
[11:05] burton-aus: offhand that looks like it interpreted a string wrong
[11:05] burton-aus: like maybe that was supposed to decide what the name of the temp directory was
[11:05] but instead it interpreted the literal string that was generating it
[11:05] as the path
[11:06] jam I don't have knowledge on the windows slave. Nicholas was updating ssh key or something in the past several days.
[11:07] jam and by looking at several previous jobs they were not involved in windows run.
[11:08] jam from the latest email I received, nicholas is working on migrating and updating windows slaves.
[11:27] burton-aus: rerun failed in the same way: http://juju-ci.vapour.ws:8080/job/github-merge-juju/11161/artifact/artifacts/windows.log
[11:28] jam maybe you want to draft an email to nicholas?
[11:28] jam he must know more details about that windows slave.
[11:28] k
[11:31] burton-aus: emailed
[11:32] jam got it.
=== dames is now known as thedac
[22:01] wallyworld: ping? I saw that bug about subordinates and upgrading - looks like anastasiamac's working on it? jam wanted me to do some stress testing on the status history pruning so I'll do that otherwise.
[22:02] babbageclunk: sgtm, ty. just in release call. we may need you to help with the subord thing depending on how it goes. first up though, could you pretty please do a review of a yucky cmr pr? https://github.com/juju/juju/pull/7522
[22:03] wallyworld: sure, looking now
[22:03] sorry in advance
[22:03] ugh
[22:03] ;)
[22:03] it's a lot of paper shuffling
[22:30] babbageclunk: assuming you get some scale testing done and your pr landed, could you take a look at this bug 1649936? it should be a straight logic error hopefully
[22:30] Bug #1649936: Resources are getting deleted when juju remove-unit is issued
[22:30] wallyworld: ok
[23:11] babbageclunk: easy review pls? https://github.com/juju/juju/pull/7524
[23:15] menn0: ok
[23:16] menn0: approved
[23:17] babbageclunk: thanks
[23:36] ugh...
[23:36] brain fuzz starting already
[23:58] thumper: insert more coffee
[23:58] I did