=== gnomethrower is now known as wings
=== DerRaiden is now known as DerRaiden`work
=== coreycb_ is now known as coreycb
=== davecore_ is now known as davecore
=== sadin999_ is now known as sadin999
=== jamespage_ is now known as jamespage
=== logan_ is now known as logan-
=== Pici` is now known as Pici
=== Greyztar- is now known as Greyztar
=== smoser1 is now known as smoser
[13:46] jamespage: sahid: the software-properties sru for train has landed in bionic-updates. so add-apt-repository cloud-archive:train(-proposed) will now work.
[13:57] coreycb: \o/
[14:32] Hey folks, I'm having latency issues between two Ubuntu machines on my (wifi) network and was wondering if anyone had some advice on how I could start debugging. The two machines on wifi are surprise and leopard; they both get good pings (1-3ms) to lively, which is plugged directly in to my router. However, the pings between them show the occasional 1-3ms ping but more often than not they're upwards
[14:32] of 200ms, all the way up to ~700ms in some cases. Any thoughts?
[14:32] (From one to the other: rtt min/avg/max/mdev = 1.385/167.298/676.538/146.793 ms)
[14:59] rbasak or sarnold: either of you alive for me to run something by you before I consider making a proposed 'fix' to a runtime raceish condition bug on nginx?
[15:08] OK, it looks like pinging _from_ leopard works absolutely fine, but pinging leopard from my router, for example, is bad.
[15:10] Odd_Bloke, can you pastebin a traceroute between the two slow ones?
[15:10] Preferably one each way
[15:28] lordcirth: https://paste.ubuntu.com/p/qPvx6652RF/
[15:30] Odd_Bloke, look at the interfaces with 'ip link' are there dropped packets?
[15:31] lordcirth: I'm not seeing anything about packets in the `ip link` output; what should I be looking for?
[15:34] Odd_Bloke, sorry, wrong tool. Try installing ethtool and running "ethtool -S ifacename"
[15:37] lordcirth: https://paste.ubuntu.com/p/mQYqT8RKY3/
[15:38] Ok, so not massive amounts of packet drops
[15:38] Odd_Bloke, what are the Ubuntu versions?
[15:41] teward: re LP: #1581864, AFAIK, this is reproducible in every release as long as you have 1 CPU
[15:41] Launchpad bug 1581864 in nginx (Ubuntu) "nginx.service: Failed to read PID from file /run/nginx.pid: Invalid argument" [Low,Confirmed] https://launchpad.net/bugs/1581864
[15:41] sdeziel: even 1 vCPU VMs couldn't reproduce on my end
[15:42] but if we could make a NOTE that it affects all releases on the bug so we can assign tasks that'd be great
[15:42] the 'proposed workaround' there is actually why i'm poking rbasak or sarnold xP
[15:43] teward: I'm less sure that's a nginx bug now to be honest. I vaguely remember seeing the same error with other daemons
[15:44] teward: it feels as if systemd expects the PID to be there just a little before the daemon gets to create it
[15:48] sdeziel: then that sounds like a SystemD bug
[15:48] but the nginx one we can still apply a workaround, if it makes sense to add a half-second delay I could
[15:48] but I want to 'avoid' breaking other things
[15:48] so it would need additional testing
[15:50] lordcirth: disco on leopard, eoan on surprise
[15:51] (Thanks for your help BTW!)
[15:53] Odd_Bloke: when the ping latency spikes are both wifi devices trying to send at the same time?
[15:56] TJ-: "send" in what sense? They're both running stuff, so I expect the network is in use at some level ~all the time.
[15:56] (And how can I check/measure what you're asking?)
[15:59] Odd_Bloke: I'm thinking they're causing transmit collisions with each other, which would induce spikes. The other possibility could be how the router handles ICMP. Is it only ping, or other protocols too?
[16:01] TJ-: arping exhibits a similar issue
[16:03] Odd_Bloke: the obvious difference is you've got 4 trips over wifi, whereas for lively, there's only 2
[16:05] Hmm, I wasn't seeing slowness router->lively before but I am now.
[16:05] And I'm seeing it to my Android phone too.
[16:06] OK, I think this is a router/wifi config issue, so I'll look at that later when I don't need Internet to work.
[16:06] TJ-: lordcirth: Thanks for the help. :)
[16:06] Oh, I didn't realize it was wifi. Yes, wifi is a single collision domain and can cause this sort of thing
[16:09] teward: agreed but the workaround is so ugly ;)
[16:15] sdeziel: when is a workaround NOT ugly >.>
[16:15] and besides who only uses 1cpu nowadays anyways lol
[16:15] *shot*
[16:18] teward: I have yet to see high enough load on nginx to warrant 2 CPUs. I was never slashdot'ed though
[16:18] usually not nginx that needs the extra CPU but $BACKEND that needs the extra power :P
[16:47] How can I have an xattr be set on all new home directories that are created?
[17:41] teward,sdeziel, that proposed workaround is really too ugly to ship. it's fine to suggest, but too ugly to give to other people
[17:42] that's what i had assumed
[17:42] sarnold: then we need systemd to investigate and fix :p
[17:43] because if sdeziel is correct this affects other daemons not just nginx
[17:43] and systemd is the race condition
[17:43] teward: if you've got the cycles and the enthusiasm, give it a try on newer releases, or fedora; if it still happens with one of the latest two or three systemd releases, then systemd upstream will be a *lot* more motivated to try to find a solution
[17:44] teward: but don't be surprised if they say it'd be best to convert to using socket activation or the systemd readiness API or remove the pidfile entirely etc
[17:44] teward: there was a related discussion in #systemd: https://paste.ubuntu.com/p/CFGsSkJJ4y/
[17:44] sarnold: sdeziel indicated it affects all latest :P
[17:44] (honestly given how almost nothing implements pid files correctly, removing them all would do the world a huge service)
[17:45] sarnold: yeah, bad PID handling is what the systemd person hinted at
[17:45] yeah, I think I agree completely with damjan in this case :)
[17:45] I don't know how much of it is due to the re-exec capability of nginx
[17:46] (service nginx upgrade)
[17:46] sdeziel: MY solution to the upgrade bits would just be a full service stop and start
[17:47] but that's not a friendly way
[17:47] because downtime
[17:47] teward: that's no solution then :P
[17:47] exactly :P
[17:48] that's one of the very nice advantages that NGINX has over Apache
[17:49] NGINX can also handle async uwsgi properly (Django)
[17:49] unlike Apache's WSGI plugin
[17:55] sdeziel: well in THEORY we could probably try and get by WITHOUT a pidfile but that'd torture a lot of things. But systemd IS nice for me running things where I don't need a pidfile, even like a simple application
[17:59] teward: I don't know if Debian wants to keep compat with non-systemd setups. And the init script is used for at least 2 things even when systemd is present: upgrade and rotate (the logrotate job uses it)
[18:00] teward: that said, I think it would still work to drop PIDFile and ExecStop from the systemd unit. NGINX would still create the PID for the other actions to make use of it
[18:04] teward: scratch that, just tested it and it works until you "service nginx upgrade" at which point systemd loses track of the main process
[18:04] yep
[18:05] sdeziel: well we're already EXTREMELY diverged from Debian
[18:05] with nginx anyways
[18:05] we could probably alter it further to do our needs to ONLY support SystemD based systems without many headaches...
[18:07] granted but maybe the best course of action would be to check with upstream if the PID handling could be improved as suggested by damjan (#systemd) and sarnold
[18:08] sdeziel: except they still support some older mechanisms and non-systemd things
[18:08] upstream
[18:08] so until EVERY distro they build/support is SystemD they can't remove it
[18:08] i know this because I Have this discussion with them regularly
[18:08] for other "supported" things that need to go die...
[18:09] teward: I'm not asking for support of Type=notify, just better PID handling
[18:09] I don't even know if Type=notify would work with NGINX through re-exec/upgrade
[18:09] *shrugs*
[18:11] sdeziel: i'd need details on explaining what they need to do to make handling work better. Right now I'm not feeling up to dealing with their upstream devs just now, but if you want to open the issue for discussion, email nginx-devel@nginx.org - i'm subbed there and can weigh in :p
[18:11] * teward is currently nursing an insufficient-quantity-of-caffeine headache
[18:13] teward: unfortunately, I don't know much about fork/exec/PID handling to provide an adequate bug report ... merely enough to have a vague feeling I understand some of it ;)
[18:14] :P
[18:14] i know a little about PIDfile handling... but that's mostly because of Python scripts and services that are cron-run every minute
[18:14] and sometimes the last process hasn't completed
[18:14] ... and also the worker processes for some Redis-driven API systems
[18:15] which are also SystemD-ified, but with less dependency on the pidfile
[18:15] which are also SystemD-ified, but with less dependency on the pidfile (and more for the worker to not run if there's more than one worker attached for that queue)
[18:15] (due to queue locking)
[18:51] teward: sdeziel I think I have a solution for your nginx issue; TimeoutStartSec=XXX --- according to "man systemd-system.conf" the default is 100ms, so increasing that would likely solve your issue cleanly
[18:51] TJ-: thanks trying now
[18:52] oops, sorry, misread, it's set to 90s ...DefaultRestartSec is 100ms
[18:58] TJ-: I tried RestartSec=1s and it didn't work
[19:02] actually, I found something strange here. All the docs talk about TimeoutStartSec but (at least for apache2 here on 18.04) that doesn't exist BUT TimeoutStartUSec does - implying its micro-seconds not seconds as the base unit
[19:07] IIRC, some version of systemd tried to make those timeout params uniformly named. I also think they all take unit suffixes (s, ms, us, etc)
[19:09] yes, seems like USec is only for DBus activations though
[19:11] on the face of it, if DefaultTimeoutStartSec=90s, then systemd shouldn't complain about the missing PIDFile until that expires even if the process has started (which begs the question what it considers to be 'started' )
[19:22] ahhh, I see. With Type=forking it's when the parent process exits and of course nginx is calling ngx_create_pidfile() from the forked process, not the parent, so may not have completed that call
[19:23] easiest solution would be to move the call to ngx_create_pidfile() to before the ngx_daemon() call
[19:30] TJ-: thanks for digging into the code, your conclusion is sound to me
[19:30] if you want a tester for a path, count me in ;)
[19:31] s/path/patch/
[19:36] TJ-: feel free to propose the patch upstream as well, but if your patch works we can quilt patch it for Ubuntu
[19:37] teward: I've attached it to the bug
[19:37] sdeziel: ^^
[19:39] teward: please let me know if you intend to provide a test build with TJ's patch, if not, I'll setup a local builder
[19:42] sdeziel: email server's fubar but i'll look once I'm home, and probably test-build in a PPA against Eoan and Bionic (current 'test' envs I have available)
[19:42] so I will have test builds... but in a few hours
[19:42] not now
[19:42] (can't get into home from work right now)
[19:42] teward: awesome, if you can post link to that PPA in the ticket, I'll be sure to test it out tomorrow
[19:52] how come i see "updated software is available for this computer; do you want to download"? i thought i set up auto install of unattended upgrades both security and not security. why am is eeing this?
[19:52] *seeing this
[19:55] TJ-: mind if I steal your name/email right from Launchpad? Or do you have a specific thing you want me to put in the DEP3 headers?
[19:55] teward: It should just be "Tj or
[19:56] arooni, look in /etc/apt/apt.conf.d/50unattended-upgrades. Is -updates repo enabled?
[19:56] teward: if it works I'll do a PR upstream
[19:57] lordcirth:yes under allowed-origins
[19:58] arooni, ok, so if you do " apt list --upgradeable" what repos are they from?
[19:59] lordcirth: grr i already ran the update so i dont get any output on that command
[19:59] lordcirth: it was stuff like neovim
[19:59] arooni: presumably you have PPAs added?
[19:59] oh i definitely do
[20:00] arooni: if I recall correctly unattended-upgrades only deals with Ubuntu archives, not PPAs
[20:00] well that would explain it
[20:02] Yeah, that list is a whitelist of repos
[20:02] I think that adding "LP-PPA-${PPA_NAME}:$DISTRO" to the allowed-origins is what's needed
[20:03] TJ-: ack. They'll want an hg-created patch though against their HG and submitted to nginx-devel
[20:03] they don't use git :\
[20:03] that's fine :)
[20:04] arooni: Bear in mind that a PPA owner could put any package in their PPA and you would have it installed via unattended-upgrades; this could break your system without any chance of you intervening.
[20:04] Odd_Bloke:so you're saying its not a goodddddddd idea
[20:04] (Obviously there's some danger of that regardless, but PPA packages don't have to go through the same process that Ubuntu packages do.)
[20:05] Yep, it's not something I would do.
[20:05] (Unless I specifically trusted the PPA owner a great deal.)
[20:05] ^^^^
[20:06] And when I say any package, I mean any package; someone could put a compromised browser, kernel or fundamental system library in there and you might not notice.
[20:08] wow didnt know that was a thing
[20:09] everyone knows everyone on the internet is to be 100 trusted
[20:14] TJ-: sdeziel: PPA at https://launchpad.net/~teward/+archive/ubuntu/nginx-lp1581864 - Eoan builds are all done except for s390x, builds queued for Xenial, Bionic, Cosmic, Disco.
[20:15] but it's the PPA builders soooooo
[20:17] put the thing in the bug as well for testers.
[20:18] i adjusted some security restrictions at work to let me out - advantage: IT Security guy :P
[20:19] TJ-: also when I did the DEP3 headers I used Tj and ubuntu@
[20:21] for the author part of the patch
[21:09] TJ-: teward: test failed here
[21:24] sdeziel: the patch definitely applied?
[21:25] TJ-: pretty sure as the PID handling was very different, see the bug update for the details
[21:27] sdeziel: was this for service start, or restart/reload ?
[21:27] TJ-: I only tried to start it
[21:28] oh of course! stupid of me
[21:28] looks like there is double forking
[21:28] PID will be parent not child, doh! Can you tell I've been up for 19 hours?
[21:29] Hmmm, did you try using the systemd directive GuessPID=true and leaving PIDFile= unset?
[21:31] sorry, GuessMainPID=
[21:31] see "man systemd.service"
[21:31] TJ-: I tested with PIDFile= (empty) without your patch, this didn't work
[21:32] it is the default for Type=forking
[21:32] TJ-: GuessMainPID defaults to yes when Type=forking so I think I tested what you asked, just not with your patch on
[21:41] I agree with you
[21:42] I can see why the patch didn't work, and I can see how it might be possible to make a different patch that would, but that has to be more invasive so I'm looking for the most straightforward, minimal, way to do it.
[21:45] I'll summarise: ngx_daemon() is the function that does the fork(). When calling fork() it returns the PID of the child to the parent, and 0 to the child. ngx_daemon() does an immediate exit(0) so never returns so we can't get at the child's PID in the parent before it exits. There are 2 options: 1) have the create_pidfile() call moved into ngx_daemon() [ not nice ] or 2) amend ngx_daemon() in some way
[21:45] so even in the parent it can return and bring with it the child PID
[22:04] sdeziel: TJ-: Removing the packages since they don't work - has the bug been updated that tests failed?
[22:04] teward: yes
[22:06] I almost have it :)
[22:06] TJ-: well you'll have to upload a new patch xD
[22:06] unrelated i'm starving, time to make dinner.
[22:25] I could do with some review of this one!
[22:26] http://paste.ubuntu.com/p/sbsSp5xwJ9/
[22:28] if you can follow the logic that'll help :) Basically, instead of ngx_daemon() doing exit() after the fork() it returns the child's PID. The calling code uses that (child_pid) to figure out whether to write the PID file. PIDs are always > 0 (NGX_OK) so the exit() is done after create_pidfile() only if pid_child > NGX_OK
[22:28] TJ-: wow that's subtle
[22:28] I've not done a compile test on this patch
[22:29] I've tried to make the code retain cross-platform usage so it won't break a WIN32 build
[22:30] is win32 code the part missing between hunks?
[22:30] sarnold: yes
[22:30] ah, good.
[22:31] well, those are separated by a lot :) but at least if it doesn't run, it won't have sideeffects from not being run twice :D
[22:31] that's why I set pid_child = NGX_BUSY (-3) so I can detect if it was set by a call to ngx_daemon()
[22:32] pid_child will be either NGX_ERROR (-1) if the fork failed, NGX_OK if this is the child process, or > 0 if it is the parent process
[22:32] TJ-: I think if it were my codebase, I'd want the first added line to be: ngx_int_t child_pid = NGX_BUSY;
[22:32] sigh firefox y u copy neewline
[22:32] TJ-: let me know when you want me to local-test the patch.
[22:33] ngx_int_t child_pid = NGX_BUSY; /* sentinel used for ngx_create_pidfile later */
[22:33] sarnold: good idea, I'll add that
[22:33] TJ-: and if they don't like extra comments, feel free to blame me :D
[22:34] * teward wants pizza, but has no money at the moment
[22:36] how about this? http://paste.ubuntu.com/p/gCfDk6YkXM/
[22:36] sarnold: yeah, I don't like code without comments!
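(Editor's sketch, not part of the log.) The pasted patches are not reproduced in this log, so the following is only a rough illustration of the idea described between 21:45 and 22:33: a daemonize helper that returns in both processes, a sentinel that records whether it ran at all, and a parent that writes the child's PID before it exits. Every `sk_`/`SK_` name is hypothetical; the numeric values merely mirror the NGX_OK (0), NGX_ERROR (-1) and NGX_BUSY (-3) values quoted in the chat. This is a single-fork sketch and does not address the double forking mentioned at 21:28.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical status codes; values mirror those quoted in the chat. */
enum { SK_OK = 0, SK_ERROR = -1, SK_BUSY = -3 };

/* Fork once and return in BOTH processes instead of exiting in the parent.
 * Parent gets the child's PID (> 0), child gets SK_OK, failure is SK_ERROR. */
long sk_daemon(void)
{
    pid_t pid = fork();

    if (pid == -1) {
        return SK_ERROR;
    }
    if (pid > 0) {
        return (long) pid;      /* parent: hand the child's PID back */
    }
    if (setsid() == -1) {       /* child: detach from the old session */
        return SK_ERROR;
    }
    return SK_OK;
}

/* Write "pid\n" to path. Returns 0 on success, -1 on failure. */
int sk_write_pidfile(const char *path, long pid)
{
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "w");
    if (f == NULL) {
        return -1;
    }
    if (fprintf(f, "%ld\n", pid) < 0) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Decide which PID this process should record, or 0 for "write nothing"
 * (daemonized child, or fork failure). */
long sk_pid_to_record(long child_pid, long self_pid)
{
    if (child_pid > SK_OK) {
        return child_pid;       /* parent records the child before exiting */
    }
    if (child_pid == SK_BUSY) {
        return self_pid;        /* sk_daemon() never ran: foreground mode */
    }
    return 0;
}

/* Caller sketch: returns 0 to keep running, 1 on error. After a successful
 * fork the parent exits here, once the pidfile is already on disk. */
int sk_start(const char *pidfile, int daemonize)
{
    long child_pid = SK_BUSY;   /* sentinel: sk_daemon() not called yet */
    long to_record;

    if (daemonize) {
        child_pid = sk_daemon();
        if (child_pid == SK_ERROR) {
            return 1;
        }
    }
    to_record = sk_pid_to_record(child_pid, (long) getpid());
    if (to_record > 0 && sk_write_pidfile(pidfile, to_record) != 0) {
        return 1;
    }
    if (child_pid > SK_OK) {
        exit(0);                /* parent leaves only now */
    }
    return 0;
}
```

Calling `sk_start("/run/example.pid", 1)` (the path is a placeholder) would leave the child running with its PID already in the file when the parent exits, which is the moment the 19:22 message identifies as "started" for a forking service. Whether this ordering is enough for nginx in practice is exactly what the testing discussed in the log is meant to find out.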
[22:37] I doubt upstream will like this, but it should solve the immediate issue
[22:37] sarnold: which ubuntu release are you primarily testing with this (so I can do a compile/build test here before posting the patch to the bug report) ?
[22:38] TJ-: I haven't done any testing at all
[22:38] ah, it was teward wasn't it?
[22:38] TJ-: your comment is much better :) nice. I think it'd read better if it had a newline in front of it
[22:38] TJ-: and sdeziel
[22:38] sarnold's better at code :)
[22:38] but i'm happy to do some testing
[22:38] I'm sooo tired my memory is like a goldfish right now
[22:38] * teward has a number of tests he runs locally
[22:39] * teward gives TJ- caffeine
[22:39] I've a Husky asleep at my feet; I want to swap places :)
[22:39] awwwww :)
[22:39] don't mess with the husky :P
[22:39] * sarnold awooooooooo
[22:40] newline just for you http://paste.ubuntu.com/p/Xk58SZr7bk/
[22:40] TJ-: there we go that's the good stuff :)
[22:40] teward: which ubu release should I compile/build-test with this?
[22:40] * TJ- needs to fire up a container
[22:40] TJ-: start with Eoan?
[22:40] teward: will do
[22:41] since that's where this will land first
[22:41] then Bionic after that since LTS is important :P
[22:50] TJ-: i'd also ask you test Xenial because that's still supported for a few years too so :P
[22:50] but at the least Eoan and Bionic are the two big ones, Eoan 'cause it'll land there faster than the rest, and Bionic because latest LTS
[22:51] teward: as long as the code hasn't changed too much so the patch doesn't break :)
[22:52] this feels like code that would have been written once 15 years ago
[22:52] good point. I don't think much has happened from the initialization part for some time
[22:52] I'm not even sure which branch I've got my git pointing at!
[22:52] and the oldest you'd have to go is 1.10.x which is 16.04.
[22:52] so bleh
[22:53] oh, its 'origin/ubuntu/devel
[22:53] I assume that matches latest for Eoan
[22:53] TJ-: cat the changelog?
[22:54] check what the latest entry is :P
[22:54] nginx (1.16.0-0ubuntu1) eoan;
[22:54] yep that's eoan
[22:54] I was forgetting this'll include the packaging
[22:54] :P
[22:55] I'm on a slow connection so I'll make a pot of coffee whilst the container gets ready
[23:01] hmmm, I may not be able to do this here now; out of space on both / and /var !
[23:03] uhoh. bad idea to deal with that while tired
[23:04] let's see if I can resize the /home/ LV to free up some extents... this could be fun whilst mounted!
[23:11] need to log-off whilst I do this
[23:44] Anyone know why the ubuntu server rpi3 image updates to rpi2?