=== cpaelzer__ is now known as cpaelzer | ||
doko | ddstreet: https://gcc.gnu.org/pipermail/gcc/2021-May/235967.html | 07:06 |
---|---|---|
-queuebot:#ubuntu-release- New: accepted pipewire [amd64] (impish-proposed) [0.3.26-1] | 07:11 | |
=== cpaelzer__ is now known as cpaelzer | ||
Laney | juliank: can you check download-results please, I think you semi-broke it | 08:08 |
Laney | and that is probably making the web frontend miss results | 08:08 |
juliank | Laney: sure | 08:08 |
juliank | Laney: yeah, it times out after 5s because publish-db holds the lock for ~1s | 08:12 |
juliank | Laney: We should switch the database from journal_mode=DELETE to journal_mode=WAL; then readers don't block writers | 08:15 |
juliank | see https://sqlite.org/wal.html | 08:16 |
juliank | Not sure how it works with the backup API, maybe waveform knows more if the backup DB will be journal_mode=DELETE still if the main db is journal_mode=WAL | 08:16 |
juliank | it seems to be | 08:18 |
juliank | Laney: Oh I guess it might hold the lock longer because it's slower, I can't say | 08:19 |
juliank | But WAL would avoid any locking, but then we can't publish a single db file anymore that way ugh | 08:19 |
juliank | I guess we can do pragma journal_mode=delete; at the end before committing it | 08:20 |
juliank | (committing being the rename here) | 08:20 |
juliank | Might just set pragma journal_mode=OFF on the db copy | 08:21 |
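Editor's sketch of the idea discussed above: keep the live database in WAL mode, copy it with the backup API, force the copy back to a rollback journal, then rename it into place. It uses only Python's standard `sqlite3` module. The file names, the `result` table layout, and the `publish_copy` helper are assumptions for illustration, not the actual publish-db code, and the sketch uses `journal_mode=DELETE` rather than the `OFF` variant also floated above.

```python
import os
import sqlite3
import tempfile


def publish_copy(live_path: str, public_path: str) -> None:
    """Snapshot the live WAL database into a single-file public copy."""
    tmp_path = public_path + ".new"
    if os.path.exists(tmp_path):
        os.remove(tmp_path)
    src = sqlite3.connect(live_path)
    dst = sqlite3.connect(tmp_path)
    try:
        src.backup(dst)  # online backup API; the writer keeps running
        # Force rollback-journal mode so the copy needs no -wal/-shm files.
        dst.execute("PRAGMA journal_mode=DELETE")
    finally:
        dst.close()
        src.close()
    os.replace(tmp_path, public_path)  # the rename is the "commit"


# Hypothetical paths and schema, for demonstration only.
workdir = tempfile.mkdtemp()
live = os.path.join(workdir, "autopkgtest.db")
public = os.path.join(workdir, "public.db")

writer = sqlite3.connect(live)
live_mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
writer.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, exitcode INTEGER)")
writer.execute("INSERT INTO result (exitcode) VALUES (0)")
writer.commit()

publish_copy(live, public)

check = sqlite3.connect(public)
public_mode = check.execute("PRAGMA journal_mode").fetchone()[0]
public_rows = check.execute("SELECT COUNT(*) FROM result").fetchone()[0]
check.close()
writer.close()
```

Setting the pragma explicitly on the copy sidesteps the open question of which journal mode the backup API leaves behind.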
juliank | Laney, waveform https://code.launchpad.net/~juliank/autopkgtest-cloud/+git/autopkgtest-cloud/+merge/402242 | 08:30 |
juliank | Laney: validated on staging | 08:34 |
Laney | juliank: It sounds sensible to me but I also think w_aveform would be a better reviewer | 08:35 |
Laney | btw once this is fixed you need to run download-all-results.service once to catch up | 08:35 |
dosaboy | hi, there are currently multiple uploads in the bionic unapproved queue for neutron whereby the most recent one can supersede all others | 08:37 |
dosaboy | https://launchpad.net/ubuntu/bionic/+queue?queue_state=1&queue_text=neutron | 08:37 |
dosaboy | would it be possible for someone with sru powers to go ahead and reject all prior to the most recent | 08:37 |
dosaboy | sorry about that, it resulted from accidental parallel uploads | 08:38 |
dosaboy | https://launchpad.net/ubuntu/bionic/+upload/26074856/+files/neutron_12.1.1-0ubuntu7.dsc is the one to keep | 08:38 |
-queuebot:#ubuntu-release- Unapproved: rejected linux-firmware-raspi2 [source] (focal-proposed) [4-0ubuntu0~20.04.1] | 08:54 | |
Laney | juliank: meh, let's try it, it's making me anxious seeing download-results fail all the time | 08:54 |
-queuebot:#ubuntu-release- Unapproved: rejected neutron [source] (bionic-proposed) [2:12.1.1-0ubuntu4.1] | 08:54 | |
-queuebot:#ubuntu-release- Unapproved: rejected neutron [source] (bionic-proposed) [2:12.1.1-0ubuntu7] | 08:54 | |
-queuebot:#ubuntu-release- Unapproved: rejected neutron [source] (bionic-proposed) [2:12.1.1-0ubuntu6] | 08:54 | |
juliank | Laney: OK, I can publish it :D | 08:55 |
Laney | aye | 08:55 |
juliank | Laney: it's live now | 08:58 |
juliank | Laney: Running d-a-r | 08:59 |
Laney | w00t | 09:00 |
Laney | keep a journalctl -fe -u download-results running | 09:00 |
Laney | oh sigh, I forgot I had a stupid conflict on download-all-results | 09:01 |
Laney | we should drop that, there's no need for it I think | 09:01 |
* Laney does quickly | 09:01 | |
waveform | juliank, journal_mode=WAL *would* be ideal (due to readers not blocking writers) but for the fact that read-only mode becomes more tricky with that (-wal and -shm must pre-exist, in other words it *must* be open already elsewhere, in order for a read-only open to succeed IIRC) | 09:02 |
waveform | but then if you're not really counting on read-only mode (it's just a safety measure) it might be useful to explore anyway | 09:03 |
juliank | waveform: I think it's fine, we set r-o mode in the reader that does not set the pragma, the pragma is only set in the writer | 09:04 |
juliank | waveform: And the reader runs from time to time, the writer runs consistently :D | 09:05 |
waveform | okay, that should be fine then | 09:05 |
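Editor's sketch of the split described above: the writer sets the WAL pragma and stays connected, while the reader opens the same file read-only and never sets the pragma. It uses only the standard `sqlite3` module; the path and the `result` table are assumptions for illustration. Whether a read-only open succeeds when no writer is connected depends on the SQLite version and file permissions, so this only demonstrates the case where the writer is running.

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "autopkgtest.db")  # hypothetical path

# Writer: sets the pragma and stays connected, so -wal and -shm exist.
writer = sqlite3.connect(db_path)
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, exitcode INTEGER)")
writer.execute("INSERT INTO result (exitcode) VALUES (0)")
writer.commit()

# Reader: read-only URI open, no pragma.
reader = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
rows_seen = reader.execute("SELECT COUNT(*) FROM result").fetchone()[0]

# The read-only handle acts as a safety net: writes are refused.
try:
    reader.execute("INSERT INTO result (exitcode) VALUES (1)")
    write_rejected = False
except sqlite3.OperationalError:
    write_rejected = True

reader.close()
writer.close()
```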
juliank | Laney: running on autopkgtest-web/0 at the moment, once that is done, autopkgtest-web/1; ETOOMANYTERMINALS :D | 09:06 |
juliank | download-all-results is *slow* | 09:07 |
Laney | yeah | 09:07 |
Laney | it's basically doing a diff between swift and the db | 09:07 |
waveform | hmmm, I found a few more places to optimize last night but I haven't looked at d-a-r yet; will have a look now | 09:08 |
Laney | that's only run as a recovery tool, I wouldn't bother too much | 09:08 |
Laney | or, well, not as a first priority, feel free of course | 09:09 |
juliank | 4-7 hours runtime it seems from journal | 09:09 |
Laney | no way | 09:09 |
waveform | o.O | 09:09 |
juliank | Apr 09 16:34:23 juju-4d1272-prod-proposed-migration-2 systemd[1]: Starting Download all results... | 09:09 |
juliank | Apr 09 23:57:48 juju-4d1272-prod-proposed-migration-2 systemd[1]: Finished Download all results. | 09:09 |
juliank | Apr 26 08:14:03 juju-4d1272-prod-proposed-migration-2 systemd[1]: Starting Download all results... | 09:09 |
juliank | Apr 26 11:51:14 juju-4d1272-prod-proposed-migration-2 systemd[1]: Finished Download all results. | 09:09 |
Laney | yeah where it had a lot of work to do when initialising | 09:10 |
Laney | not now though, I wouldn't have thought | 09:10 |
juliank | It finished xenial now and is on bionic | 09:10 |
juliank | maybe it should download results from swift in a thread pool or sth | 09:12 |
waveform | would be interesting to throw some more logging in there and see where the bottleneck really is | 09:12 |
Laney | polling the swift API is slow | 09:12 |
Laney | you have to fetch the contents in batches | 09:13 |
Laney | juliank: see sync-swift in the wendigo homedir, there's a thread pool in that one, if you really want to optimise this tool we shouldn't have to be running :p | 09:13 |
juliank | mice | 09:14 |
juliank | *nice | 09:14 |
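Editor's sketch of the thread-pool idea for download-all-results: diff the swift listing against what the database already has, then fetch the missing objects concurrently. `fetch_result` is a hypothetical stand-in for the real swift download, and the object names are made up; this is not the sync-swift script mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_result(name: str) -> bytes:
    """Hypothetical stand-in for downloading one object from swift."""
    return f"payload for {name}".encode()


def download_missing(in_swift, in_db, max_workers=8):
    """Fetch every object listed in swift that the database lacks."""
    missing = sorted(set(in_swift) - set(in_db))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order; each fetch runs on a pool thread.
        payloads = list(pool.map(fetch_result, missing))
    return dict(zip(missing, payloads))


fetched = download_missing(
    in_swift=["a/result.tar", "b/result.tar", "c/result.tar"],
    in_db=["b/result.tar"],
)
```

A pool only helps if the per-object fetch is the bottleneck; if listing the container in batches dominates, the extra logging suggested above would show that first.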
juliank | focal :D | 09:17 |
juliank | Laney: it failed, swift got unhappy | 09:20 |
juliank | http.client.RemoteDisconnected: Remote end closed connection without response | 09:21 |
juliank | gotta start it again :( | 09:21 |
Laney | sad | 09:21 |
juliank | Laney: started it on web/1 too now, though | 09:23 |
Laney | oh yeah there's all the 'damaged' results, we should maybe purge those out | 09:25 |
-queuebot:#ubuntu-release- Unapproved: accepted gnome-control-center [source] (focal-proposed) [1:3.36.5-0ubuntu2] | 09:25 | |
juliank | yeah | 09:25 |
juliank | Laney: With WAL, we don't actually need to use the r/o database in the web frontend anymore, so we could update it less often and make the web frontend use the r/w one | 09:39 |
juliank | So results would appear sooner that way; and we can save CPU cycles by syncing the public DB less often | 09:40 |
juliank | Currently it just runs back to back without a break | 09:40 |
juliank | Running it every 15 mins would work then I think | 09:40 |
juliank | Then the copy just exists for like britney speedup work | 09:42 |
Laney | sounds good | 09:43 |
juliank | Laney: have we considered setting up https://www.haproxy.com/de/blog/accelerate-your-apis-by-using-the-haproxy-cache/? | 09:44 |
waveform | oh - the only reason the r/o database existed was to relieve pressure on the r/w one? okay -- things are becoming clearer :) | 09:45 |
-queuebot:#ubuntu-release- Unapproved: accepted neutron [source] (bionic-proposed) [2:12.1.1-0ubuntu7] | 09:45 | |
juliank | waveform: Basically, https://bugs.launchpad.net/auto-package-testing/+bug/1639847 | 09:46 |
ubot3 | Launchpad bug 1639847 in Auto Package Testing "web results incomplete/error out on locked DB" [Medium, Fix Released] | 09:46 |
juliank | Laney: Ah, but we do use the extra fields publish-db creates in browse.cgi | 09:46 |
juliank | We'd have to do that bit on the r/w database then in a transaction | 09:47 |
Laney | that could be moved to the rw one, I'd have thought | 09:47 |
waveform | was just about to ask that :) | 09:47 |
Laney | that and to provide a non changing thing which people/programs can download | 09:47 |
waveform | indeed -- the "static" r/o copy makes sense to me; was just confused about why certain bits only existed in that and not the r/w "master", but it's all beginning to make some kind of sense | 09:49 |
* Laney nods | 09:50 | |
juliank | waveform: but if we have a big write transaction in there, it will block the other writes too, right, even if they use different tables? | 09:50 |
waveform | under the historical journal mode, yes -- I'll just go skim the sqlite WAL docs though as I'm less familiar with that | 09:51 |
juliank | waveform: I found https://sqlite.org/src/doc/begin-concurrent/doc/begin_concurrent.md#:~:text=Overview,system%20still%20serializes%20COMMIT%20commands. | 09:52 |
Laney | juliank: caching> sort of, but not much, I don't want to lose the shiny faster updates we just won ... | 09:52 |
juliank | But not sure how to make BEGIN CONCURRENT work, if it's enough to do it only for the large writer | 09:53 |
waveform | might be a silly question, but what sort of storage do the machines this is running on have? Just concerned that it's something that properly implements file-locks and not something like NFS | 09:54 |
juliank | ext4 on virtual block storage | 09:55 |
waveform | okay - sounds like that should be fine | 09:55 |
waveform | juliank, that BEGIN CONCURRENT doc looks a lot like RR in other systems; in other words if you're handling a write request you *have* to deal with the possibility of txn conflicts. Which is do-able but would probably mean a fair bit of upheaval in the code-base (basically means on any commit you have to catch busy-snapshot and retry the whole txn) | 09:56 |
juliank | waveform: Right, if we can only do it in publish-db which adds its own extra tables, and is the only one writing them, we'd be fine though | 09:56 |
juliank | waveform: Because there are no conflicts across tables | 09:57 |
juliank | Two transactions that write to different sets of tables never conflict, and that | 09:57 |
waveform | well, note it's not just which tables you *write* | 09:57 |
waveform | "yes, this transaction did not read or modify any data modified by any concurrent transaction" | 09:57 |
waveform | in other words, if that transaction reads any page of data which was modified by another transaction, then it conflicts | 09:58 |
waveform | (it's effectively RR isolation) | 09:58 |
juliank | ah | 09:58 |
juliank | That works for us :D | 09:58 |
waveform | yeah - if that still works, then \o/ - I'm just not that clear on all the things that happen simultaneously yet :) | 09:59 |
juliank | waveform: We can also rollback and retry those transactions publish-db currently does too, that's not a problem :D | 10:00 |
waveform | excellent | 10:00 |
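Editor's sketch of the rollback-and-retry pattern mentioned above. BEGIN CONCURRENT lives on a separate SQLite branch and is not available in stock builds, so this sketch substitutes BEGIN IMMEDIATE and retries on `sqlite3.OperationalError` (the "database is locked" case); the shape of the loop would be the same for a busy-snapshot conflict. The `run_with_retry` helper, the `current_version` table, and the simulated failure are assumptions for illustration.

```python
import sqlite3
import time


def run_with_retry(conn, work, attempts=5, delay=0.1):
    """Run work(conn) in one write transaction, retrying on lock errors."""
    for attempt in range(1, attempts + 1):
        try:
            conn.execute("BEGIN IMMEDIATE")
            work(conn)
            conn.execute("COMMIT")
            return attempt
        except sqlite3.OperationalError:
            # Any OperationalError is retried here; real code may want to
            # inspect the message and only retry lock/conflict errors.
            if conn.in_transaction:
                conn.execute("ROLLBACK")
            if attempt == attempts:
                raise
            time.sleep(delay)


# isolation_level=None: we issue BEGIN/COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE current_version (package TEXT PRIMARY KEY, version TEXT)"
)

calls = {"n": 0}


def flaky_publish(c):
    calls["n"] += 1
    c.execute(
        "INSERT OR REPLACE INTO current_version VALUES (?, ?)",
        ("neutron", "2:12.1.1-0ubuntu7"),
    )
    if calls["n"] < 3:
        # Simulate the conflict a concurrent writer would cause.
        raise sqlite3.OperationalError("database is locked")


attempts_used = run_with_retry(conn, flaky_publish, delay=0)
row_count = conn.execute("SELECT COUNT(*) FROM current_version").fetchone()[0]
```

The whole transaction body is re-run on each attempt, which is what makes the retry safe: nothing from a rolled-back attempt survives.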
waveform | bit of a shame the result table (in particular) isn't WITHOUT ROWID but that's something to look at another time | 10:07 |
juliank | I mostly wonder how we ended up with a working database so long | 10:18 |
juliank | like, we only started getting inconsistent database images yesterday and our copy without lock hack still did not seem to break much | 10:19 |
waveform | looking at how schema migration is handled ... erm ... yeah :) | 10:19 |
juliank | that too, yeah | 10:19 |
Laney | now you're getting used to autopkgtest-cloud | 10:20 |
juliank | Well we probably 500ed out before too, but did not have the haproxy declaring our servers dead | 10:20 |
Laney | we didn't have proposed-migration hitting the web backends before | 10:21 |
doko | finally, gcc-10 built on armhf after 11 days \o/ | 10:32 |
xnox | 🥳🥳🥳 | 10:34 |
juliank | Laney: On https://ubuntu-release.kpi.ubuntu.com/d/76Oe_0-Gz/autopkgtest?orgId=1&var-instance=production, I think it would be nicer if we could split the active/error worker graphs, and same for the normal/abnormal exit | 10:42 |
juliank | wondering if you can have those graphs accumulative, such that it shows total since start of period, instead of like bouncing between 1 and 0 | 10:43 |
Laney | mmm | 10:44 |
Laney | sometimes I want to like show one arch, sometimes all the errors | 10:44 |
Laney | I usually shift-click the labels to select what I want | 10:45 |
Laney | but yeah it could probably be better somehow | 10:45 |
Laney | juliank: let me add you to that, you can make a copy of the dashboard and play if you want | 10:46 |
Laney | check email | 10:47 |
Laney | would love to at least get the port quota thing fixed, but unsure how to further prioritise it :( | 10:49 |
juliank | Laney: maybe I just want one error panel which shows three graphs: errored jobs, worker in error states, relxd motes in error state | 10:56 |
juliank | Laney: One quick thing to glance at and see if things are behaving normal, not for digging further | 10:56 |
* Laney relxd motes | 10:57 | |
juliank | Laney: yeah, lol @ mosh messing up lxd remotes | 10:58 |
juliank | Laney: We need some email alerts for increased error rates | 10:58 |
juliank | We also need to capture 5xx server errors in haproxy and log them | 10:58 |
Laney | card that stuff | 10:58 |
juliank | "Something is off, please have a look" | 10:58 |
Laney | some 'stat' panels might be nice for what you're after | 10:59 |
Laney | not graphs, but here is the state now | 10:59 |
juliank | ACPM-15 and ACPM-16 | 10:59 |
Laney | juliank: some kind of notification support in the metrics project would be great | 10:59 |
Laney | then we could have like /notify #ubuntu-release ISO builds are failing too | 10:59 |
juliank | Yeah, email irc or mattermost pings | 11:00 |
Laney | I guess you should be able to write influx queries for this stuff | 11:00 |
Laney | so we need a way to hook it up to the notification side | 11:00 |
Laney | medium amount of work I guess | 11:01 |
-queuebot:#ubuntu-release- Unapproved: accepted udisks2 [source] (hirsute-proposed) [2.9.2-1ubuntu1] | 11:01 | |
-queuebot:#ubuntu-release- Unapproved: openssl (hirsute-proposed/main) [1.1.1j-1ubuntu3 => 1.1.1j-1ubuntu3.1] (core, i386-whitelist) | 11:57 | |
-queuebot:#ubuntu-release- Unapproved: openssl (focal-proposed/main) [1.1.1f-1ubuntu2.3 => 1.1.1f-1ubuntu2.4] (core, i386-whitelist) | 11:59 | |
-queuebot:#ubuntu-release- Unapproved: openssl (groovy-proposed/main) [1.1.1f-1ubuntu4.3 => 1.1.1f-1ubuntu4.4] (core, i386-whitelist) | 11:59 | |
Trevinho | rbasak: hey, I've re-uploaded the emoji package, this time with the leading line... Damned vim :) | 13:00 |
-queuebot:#ubuntu-release- Unapproved: fonts-noto-color-emoji (focal-proposed/main) [0~20200408-1 => 0~20200916-1~ubuntu20.04.1] (ubuntu-desktop) | 13:00 | |
Trevinho | oh, here it is :D | 13:00 |
rbasak | Thanks! | 13:01 |
-queuebot:#ubuntu-release- Unapproved: rejected fonts-noto-color-emoji [source] (focal-proposed) [0~20200916-1~ubuntu20.04.1] | 13:03 | |
ddstreet | rbasak any chance you have time to review the openssl uploads? it's been reviewed by the security team already as you can see in LP: #1926254 and the change for LP: #1927161 has no code change | 13:03 |
ubot3 | Launchpad bug 1926254 in openssl (Ubuntu Groovy) "x509 Certificate verification fails when basicConstraints=CA:FALSE,pathlen:0 on self-signed leaf certs" [Medium, In Progress] https://launchpad.net/bugs/1926254 | 13:04 |
ubot3 | Launchpad bug 1927161 in openssl (Ubuntu Impish) "dpkg-source: error: diff 'openssl/debian/patches/pr12272.patch' patches files multiple times; split the diff in multiple files or merge the hunks into a single one" [Low, In Progress] https://launchpad.net/bugs/1927161 | 13:04 |
rbasak | ddstreet: marked for attention. I'll try and get to it today, but feeling a bit swamped under all the SRUs in the queue at the moment! | 13:04 |
ddstreet | ack, thanks | 13:04 |
rbasak | It'll be the next one I start looking at - just trying to clear the list of things I've already started. | 13:05 |
-queuebot:#ubuntu-release- Unapproved: accepted fonts-noto-color-emoji [source] (focal-proposed) [0~20200916-1~ubuntu20.04.1] | 13:13 | |
-queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (impish-proposed/primary) [465.27-0ubuntu1] | 13:55 | |
-queuebot:#ubuntu-release- New source: fabric-manager-450 (impish-proposed/primary) [450.119.04-0ubuntu1] | 14:03 | |
-queuebot:#ubuntu-release- New source: fabric-manager-460 (impish-proposed/primary) [460.73.01-0ubuntu1] | 14:03 | |
-queuebot:#ubuntu-release- New source: libnvidia-nscq-460 (impish-proposed/primary) [460.73.01-0ubuntu1] | 14:06 | |
-queuebot:#ubuntu-release- New source: libnvidia-nscq-450 (impish-proposed/primary) [450.119.04-0ubuntu1] | 14:09 | |
juliank | Oh, Laney, waveform I figure we should add some random delay to the timer units; the way it works now both backends get the same peak loads at the same time, giving them even just 5s randomness should smoothen the load out for clients a bit | 14:18 |
* juliank still has htop running on both autopkgtest-web instances :D | 14:19 | |
juliank | We should set AccuracySec=1us and RandomizedDelaySec=30s so that systemd does not coalesce timers (Which it does in 1 minute groups) and makes them more random | 14:21 |
juliank | though I guess maybe it isn't timers after all | 14:21 |
juliank | publish-db, who runs that | 14:22 |
juliank | pushed to https://code.launchpad.net/~juliank/autopkgtest-cloud/+git/autopkgtest-cloud/+merge/402262 | 14:27 |
Laney | juliank: I would like FixedRandomDelay too, think focal's systemd might not have that though | 14:32 |
Laney | otherwise seems like a sane idea | 14:32 |
juliank | Laney: That means they might end up with the same delay though, without it, it's more likely to distribute better | 14:33 |
juliank | (IMO) | 14:33 |
juliank | Then you might just as well start one at :00, the next at :10, another at :20 | 14:36 |
Laney | Otherwise you get a different delay each time, I think that's worse | 14:46 |
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-460 (groovy-proposed/restricted) [460.73.01-0ubuntu0.20.10.1 => 460.73.01-0ubuntu0.20.10.2] (i386-whitelist) | 15:22 | |
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-460 (bionic-proposed/restricted) [460.73.01-0ubuntu0.18.04.1 => 460.73.01-0ubuntu0.18.04.2] (no packageset) | 15:22 | |
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-460 (focal-proposed/restricted) [460.73.01-0ubuntu0.20.04.1 => 460.73.01-0ubuntu0.20.04.2] (i386-whitelist) | 15:22 | |
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-450-server (groovy-proposed/restricted) [450.119.03-0ubuntu0.20.10.1 => 450.119.04-0ubuntu0.20.10.1] (core, i386-whitelist) | 15:23 | |
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-450-server (hirsute-proposed/restricted) [450.119.03-0ubuntu1 => 450.119.04-0ubuntu0.21.04.1] (core, i386-whitelist) | 15:24 | |
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-450-server (bionic-proposed/restricted) [450.119.03-0ubuntu0.18.04.1 => 450.119.04-0ubuntu0.18.04.1] (no packageset) | 15:24 | |
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-450-server (focal-proposed/restricted) [450.119.03-0ubuntu0.20.04.1 => 450.119.04-0ubuntu0.20.04.1] (i386-whitelist) | 15:24 | |
-queuebot:#ubuntu-release- Unapproved: nvidia-settings (bionic-proposed/main) [460.39-0ubuntu0.18.04.2 => 460.73.01-0ubuntu0.18.04.1] (ubuntu-desktop) | 15:35 | |
-queuebot:#ubuntu-release- Unapproved: nvidia-settings (groovy-proposed/main) [460.39-0ubuntu0.20.10.1 => 460.73.01-0ubuntu0.20.10.1] (i386-whitelist, ubuntu-desktop) | 15:35 | |
-queuebot:#ubuntu-release- Unapproved: nvidia-settings (focal-proposed/main) [460.39-0ubuntu0.20.04.1 => 460.73.01-0ubuntu0.20.04.1] (i386-whitelist, ubuntu-desktop) | 15:35 | |
-queuebot:#ubuntu-release- Unapproved: nvidia-settings (hirsute-proposed/main) [460.56-0ubuntu2 => 460.73.01-0ubuntu0.21.04.1] (i386-whitelist, ubuntu-desktop) | 15:35 | |
-queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (hirsute-proposed/primary) [465.27-0ubuntu0.21.04.1] | 15:44 | |
-queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (groovy-proposed/primary) [465.27-0ubuntu0.20.10.1] | 15:45 | |
-queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (focal-proposed/primary) [465.27-0ubuntu0.20.04.1] | 15:45 | |
-queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (bionic-proposed/primary) [465.27-0ubuntu0.18.04.1] | 15:46 | |
* blackboxsw SRU vanguard for tomorrow (sil2100). ubuntu-advantage-tools 27.0 cleared SRU verification for B, F, G and H series. If there is time to look at that tomorrow it would be great to get that release out and align w/ Xenial that has already released | 16:50 | |
* blackboxsw SRU vanguard for tomorrow (sil2100): ubuntu-advantage-tools 27.0 cleared SRU verification for B, F, G and H series. If there is time to look at that tomorrow it would be great to get that release out and align w/ Xenial that has already released. https://bugs.launchpad.net/ubuntu/+source/ubuntu-advantage-tools/+bug/1926361 | 16:51 | |
ubot3 | Launchpad bug 1926361 in ubuntu-advantage-tools (Ubuntu Hirsute) "[SRU] ubuntu-advantage-tools (10ubuntu0.16.04.1 -> 27.0) Xenial, Bionic, Focal, Hirsute" [Undecided, Fix Committed] | 16:51 |
=== mfo_ is now known as mfo | ||
ddstreet | bdmurray vorlon sorry to bug you about an sru, but systemd in bionic has been in -proposed for quite a long time and we're getting asked about why it's delayed, any chance you could release it to -updates? it's for lp: #1923115 | 20:31 |
ubot3 | Launchpad bug 1923115 in systemd (Ubuntu Bionic) "Networkd vs udev nic renaming race condition" [Undecided, Fix Committed] https://launchpad.net/bugs/1923115 | 20:31 |
* enyc meows | 21:29 | |
-queuebot:#ubuntu-release- New binary: python-xstatic-dagre [amd64] (impish-proposed/none) [0.6.4.0-1] (no packageset) | 23:15 |