[00:05] Ooh, scale testing?
[01:02] wallyworld: back
[01:12] axw: standup HO?
[02:25] wallyworld: my testing has reproduced a number of issues we are seeing at scale
[02:25] well that is good
[02:25] finally
[02:25] 513 models, 142 apps, 2324 units
[02:25] no
[02:25] 3234 units
[02:25] restarting the app servers on 2.3.1 caused many issues
[02:26] massive db load
[02:26] txns.log lost iterators
[02:26] broken apiservers
[02:26] yay
[03:44] thumper: and your PR fixes the above I assume when you add in a custom jujud
[03:44] yeah, writing up test results now
[03:44] I'll make sure you get a copy
[03:46] axw: well, this is great. juju deploy is broken. the doc claims that it accepts --config foo=bar but it clearly doesn't and has been that way for a while
[03:47] it only accepts a yaml file for --config
[03:50] wallyworld: :/
[03:50] let the yak shaving begin
[03:50] wallyworld: that sounds like an EDA, it should be easy enough to add a functional test for it (probably tack onto the right place somewhere)
[03:54] yeah, sigh, we are awesome. the change to the code that added the config parsing looks like it was first done in 2012
[03:54] wow
[03:54] so clearly no one ever does juju deploy myapp --config foo=bar
[04:28] Oh.
[04:28] We all want to do that but assumed it was deliberately removed!
[04:28] I have heaps of YAML files around with like three lines to work around that
[04:29] thumper: Did the load return to normal with roughly the same profile as the post-upgrade graph?
[05:18] wgrant: yeah
[05:22] thumper: Very nice
[05:25] thumper: And +1 on 2.2.7, since we can't upgrade until at least 2.3.2 and by that point we're dangerously close to Christmas.
[05:25] wgrant: hmmm... wallyworld, jam, thoughts?
[05:25] All very good news, anyway. Hopefully this will make the remaining scaling issues a lot more obvious.
[05:27] I'll run my smaller controllers on 2.3.x over the break for validation, but don't particularly want to subject the big one to even 2.3.2 without running the small ones for a while -- despite the obvious wins from removing update-status DB writes, it's a big risk to a lot of teams and applications.
[05:28] And it's likely that your fix will pull the shared controller out of its current pit of despair without big risk.
[05:28] axw: here's that deploy fix https://github.com/juju/juju/pull/8206
[05:29] wgrant: i just did a PR to fix. i'll backport to 2.3 for the next release also
[05:29] thumper: i think a 2.2.7 would be great actually
[05:29] given the delay in 2.3.2
[05:31] Do we have an ETA for 2.3.2? No rush, just wondering.
[05:31] Also have the holes in the QA process been identified?
[05:31] wallyworld: -1 on treating file contents differently based on how many are specified
[05:31] axw: we have to
[05:31] if we want to retain compatibility
[05:31] and sensible behaviour
[05:32] and conform to semantics of --config
[05:32] there's competing concerns here
[05:32] wallyworld: why do we _have_ to? why can't we merge the values underneath the application name?
[05:33] because of how --config behaves - it can't distinguish between a yaml file with just key=values and one with the charm settings format
[05:33] it could guess but that would be bad IMO
[05:33] wallyworld: why r we doing something silly for the sake of compatibility?
i'd rather do the right thing
[05:34] we can't break compatibility
[05:34] and there's a reason for a single yaml file to have that format
[05:34] as it allows juju config to be redirected to a file and reused
[05:34] but that then conflicts with --config behaviour
[05:35] so we need to try and accommodate both
[05:35] wgrant: probably not until the new year based on extra tests we are getting in place, and the fact that many people aren't around
[05:35] wallyworld: sorry, I don't follow. can you point me at some code or docs or something?
[05:35] there will only ever be one yaml file with charm config format
[05:35] axw: hang out? will be easier
[05:35] thumper: Sounds entirely reasonable. Thanks for the info.
[05:35] wallyworld: sure
[05:35] see you in standup
[05:39] * thumper at EOD
[05:39] ciao
[06:20] axw: from what i can see, we allow for bundle storage from the CLI to override what is in the bundle itself, but as of now, there's no equivalent mechanism for charm config - you get what's in the bundle and that's it
[06:21] so i think we can defer doing that straight up
[06:21] and just support charms for now using --config
[06:22] wallyworld: sounds fine to me, if that's how it is now
[06:22] that's what it looks like
[06:22] we pass in bundleStorage to the bundle handler
[06:22] but not any config - that's just used in deployXCharm
[06:23] and all bundle config looks like it is just read from the bundle yaml
[06:23] wallyworld: yep, I can't see anywhere it's used either
[06:23] i still need to combine yaml and config name/value for v5 of the api though
[06:23] if both are specified
=== frankban|afk is now known as frankban
[07:16] axw: that PR is updated
[10:02] jam: thanks for the reviews mr. mylastname :o)
[10:03] :)
[10:07] jam: don't suppose you know what normally gets cleaned up on the CI machine when space runs out?
[10:07] axw: I do not, unfortunately
[10:08] axw: did you check /tmp or /var/log ?
[10:08] not much in either
[10:08] I would also potentially check 'ps' output in case there are file handles that are open but 'deleted'.
[10:08] axw: du --max-depth=3 / | sort -n
[10:08] has always been my friend
[10:08] maybe max-depth=4 if you're at '/'
[10:09] thanks
[10:13] gotta go, I'll bbl
[10:14] balloons: juju-core-slave has run out of disk. can you please let me know what you would normally delete, so I can fix it next time? there's some stuff in /var/lib/lxd.old, taking up 1.5GB of precious space
[10:14] unchanged since July 19 2016
[10:14] *probably* safe to remove...
[10:15] I dunno if these things are backed up though
[11:37] Hi All, one of our charms uploaded to the charm store is not getting listed on the portal. We have uploaded 6 charms in total and only 5 are getting listed. Having said that, we can see the missing charm by its exact url.
[11:38] List of the charms can be seen at: https://jujucharms.com/q/hyperscale
[11:41] anyone care to review a near-trivial change: https://github.com/juju/juju/pull/8208
[11:44] akshay_: I think you can ask on juju@lists.ubuntu.com
[12:01] Thanks @jam I have sent out a mail to the same, will wait for the response.
[12:13] jam: I don't understand this sentence: "you have to somehow sighup juju itself to allow it to reopen its file handles, without also restarting all of the 1000s of units that are actively communicating with juju"
[12:14] rogpeppe: most things that use an external log rotator restart the underlying process, or at least tell it to close the handles and reopen them, so they stop writing to the old location.
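(For the --config exchange above: a minimal sketch, not the code from PR 8206, of how key=value arguments and plain key/value YAML files could be folded into one settings map. The helper name parseConfigArgs and the use of gopkg.in/yaml.v2 are assumptions, and the sketch deliberately sidesteps the ambiguity wallyworld raises -- a single file in the charm-settings format produced by `juju config`, keyed by application name, looks the same as a plain key/value file, which is why a lone YAML file ends up being treated specially.)

    package main

    import (
        "fmt"
        "os"
        "strings"

        "gopkg.in/yaml.v2"
    )

    // parseConfigArgs folds a list of --config arguments into one settings map.
    // An argument containing "=" is treated as a key=value pair; anything else
    // is read as a YAML file of plain key: value settings. Later arguments
    // override earlier ones.
    func parseConfigArgs(args []string) (map[string]interface{}, error) {
        settings := make(map[string]interface{})
        for _, arg := range args {
            if i := strings.Index(arg, "="); i >= 0 {
                settings[arg[:i]] = arg[i+1:]
                continue
            }
            data, err := os.ReadFile(arg)
            if err != nil {
                return nil, err
            }
            fromFile := make(map[string]interface{})
            if err := yaml.Unmarshal(data, &fromFile); err != nil {
                return nil, err
            }
            for k, v := range fromFile {
                settings[k] = v
            }
        }
        return settings, nil
    }

    func main() {
        // roughly what `juju deploy myapp --config foo=bar --config threads=4` would pass in
        settings, err := parseConfigArgs([]string{"foo=bar", "threads=4"})
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        fmt.Println(settings)
    }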
[12:14] however, if you restart jujud, then everything connected to it (all the other agents) also loses its connection
[12:14] which seems very bad in a large deployment
[12:15] so I'd rather we just leave stdout/stderr to an unrotated file, but don't put much content there, and then rotate our logs without bouncing anything.
[12:15] jam: what do you mean by "external log rotator" there?
[12:16] rogpeppe: some process that isn't juju that rotates logs
[12:16] rogpeppe: eg 'apt install logrotate'
[12:16] jam: so which process are you talking about restarting?
[12:17] rogpeppe: if you just 'mv stdout.log stdout-123456.log' you then have to get the process that was writing to 'stdout.log' to close its file handle, and open a new one.
[12:17] the standard 'logrotate' mechanism is to bounce the agent (eg, postgresql)
[12:17] jam: i wasn't suggesting doing that
[12:18] jam: i was thinking of an external process that reads stdin and writes it to the current file, changing that file when it gets too big
[12:18] jam: then pipe jujud output to that
[12:19] jam: i guess the down side is that it's a tad less efficient because you're not writing directly to disk
[12:22] i'm less concerned with that than having it be confusing because of more moving parts
[12:27] jam: it would be fewer moving parts inside jujud itself (although I guess the scripts would need changing)
[12:28] jam: i'm concerned that having stack traces separate from the log messages would make it hard to work out exactly where/when the stack trace occurred, making it hard to post-mortem debug
[12:29] jam: also it means another log file to remember to gather up (and i guess that file would potentially need rotating too)
[12:29] jam: because stack traces can be enormous
[12:39] rogpeppe: what if we just rotated on every startup?
[12:40] jam: how would that work for long-running daemons?
[12:40] rogpeppe: you don't have panic stacktraces in long running daemons, because SIGQUIT kills the process
[12:40] jam: it's not improbable that a jujud might run for 6 months
[12:42] jam: the problem that originally prompted the issue to be created was when we had a long running daemon that produced a stack dump
[12:48] jam: another possibility would be to get juju to redirect its stderr to the rotated log file itself (by using syscall.Dup2)
[12:57] We are bootstrapping on openstack using https identity service ...
[12:58] endpoint is https://192.168.23.222:5000/v3
[12:58] but bootstrap is failing with following error ....
[12:58] INFO juju.provider.openstack provider.go:144 opening model "controller" 07:41:20 DEBUG juju.provider.openstack provider.go:798 authentication failed: authentication failed caused by: requesting token: failed executing the request https://192.168.23.222:5000/v3/auth/tokens caused by: Post https://192.168.23.222:5000/v3/auth/tokens: x509: cannot validate certificate for 192.168.23.222 because it doesn't contain any IP SANs ERROR au
[14:14] axw, back
[14:16] That slave is always close to full. I would remove old workspaces, but lxd data also sounds fine
[14:56] hello folks
[14:56] is there any ppa where I can still get juju 2.2.6 ?
[14:56] version 2.3.1 seems to have issues with LXD containers :)
=== frankban is now known as frankban|afk
[18:43] gsamfira, there isn't a ppa but you can still get it from a snap branch
[18:43] gsamfira, what issues with lxd containers are you having? are you talking about the ubuntu-fan bug?
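(A minimal sketch of the pipe idea rogpeppe floats in the 12:18 messages above: a small standalone program that copies its stdin to a "current" log file and starts a new one once a size limit is exceeded, so an agent's output could be piped through it. The file naming, the 10 MB limit, and the program itself are illustrative assumptions, not an existing juju tool.)

    package main

    import (
        "bufio"
        "fmt"
        "os"
        "time"
    )

    const maxSize = 10 << 20 // start a new file after roughly 10 MB

    func main() {
        var (
            out     *os.File
            written int64
        )
        // openNew closes the current file (if any) and starts a fresh one,
        // named by the time it was opened.
        openNew := func() error {
            if out != nil {
                out.Close()
            }
            f, err := os.OpenFile(fmt.Sprintf("jujud-%d.log", time.Now().Unix()),
                os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0640)
            if err != nil {
                return err
            }
            out, written = f, 0
            return nil
        }
        if err := openNew(); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        // Copy stdin line by line, rolling over to a new file when the
        // current one would grow past maxSize.
        scanner := bufio.NewScanner(os.Stdin)
        for scanner.Scan() {
            line := scanner.Text() + "\n"
            if written+int64(len(line)) > maxSize {
                if err := openNew(); err != nil {
                    fmt.Fprintln(os.Stderr, err)
                    os.Exit(1)
                }
            }
            n, _ := out.WriteString(line)
            written += int64(n)
        }
    }

(Wired up roughly as `jujud ... 2>&1 | logpipe`, with "logpipe" a made-up name; that extra process is exactly the moving part jam pushes back on, along with output no longer being written directly to disk.)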
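(And a minimal sketch of the syscall.Dup2 alternative from the 12:48 message: point the process's own stderr at the current log file so panic stack traces land next to the rotated log output. It assumes linux/amd64, where syscall.Dup2 is available (some newer ports expose Dup3 instead), and the log path is a made-up stand-in.)

    package main

    import (
        "fmt"
        "os"
        "syscall"
    )

    // redirectStderr points fd 2 (stderr) at the given file, so anything the
    // Go runtime writes there - panic stack traces in particular - ends up in
    // the same file that log rotation already manages.
    func redirectStderr(path string) error {
        f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0640)
        if err != nil {
            return err
        }
        // Dup2 replaces fd 2 with a duplicate of the log file's descriptor.
        return syscall.Dup2(int(f.Fd()), int(os.Stderr.Fd()))
    }

    func main() {
        if err := redirectStderr("/tmp/agent.log"); err != nil {
            fmt.Println("redirect failed:", err)
            return
        }
        panic("this stack trace now goes to /tmp/agent.log rather than the console")
    }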
[18:45] gsamfira, https://bugs.launchpad.net/juju/+bug/1737640
[18:45] Bug #1737640: /usr/sbin/fanctl: arithmetic expression: expecting primary | unconfigured interfaces cause ifup failures smb>
[18:45] gsamfira, snap install juju --channel=stable/2.2.6 --classic
[18:58] ok - https://jujucharms.com/docs/1.22/howto-proxies tells me that "juju set-env" should work, but it doesn't exist
[18:58] how do I set http_proxy and no_proxy now?
[19:00] juju bootstrap --config somehow?
[20:03] hallyn: you found an old version of the docs - it's listed as no longer supported.
[20:03] hallyn: here's a list of the model keys for config: https://jujucharms.com/docs/stable/models-config
[20:04] hallyn: juju bootstrap --config http-proxy=
[20:04] hallyn: you might want to look at --model-default as well with bootstrap
=== akhavr1 is now known as akhavr
[20:38] babbageclunk: ping
[20:49] hml: hey, sorry - was at the bank
[20:49] babbageclunk: have a few minutes for an HO?
[20:50] yup yup
[20:50] standup?
[20:50] babbageclunk: sure
[21:36] balloons: no call today?
[21:52] wallyworld, tim is out, and trying to stay productive :-)
[21:52] no worries
[21:52] wallyworld, but I thought we were going to cancel the calls for this week
[21:53] probs yeah
[22:13] balloons: yes, the fan bug it is! Thanks for the hint about the snaps
[22:24] <3 snaps
[22:34] babbageclunk: the config used was hiding in ProvisionerSuite.Model :-) now to clean up the test
[22:34] oh yay!