=== cpaelzer__ is now known as cpaelzer
[07:06] ddstreet: https://gcc.gnu.org/pipermail/gcc/2021-May/235967.html
[07:11] -queuebot:#ubuntu-release- New: accepted pipewire [amd64] (impish-proposed) [0.3.26-1]
=== cpaelzer__ is now known as cpaelzer
[08:08] juliank: can you check download-results please, I think you semi-broke it
[08:08] and that is probably making the web frontend miss results
[08:08] Laney: sure
[08:12] Laney: yeah, it times out after 5s because publish-db holds the lock for ~1s
[08:15] Laney: We should switch the database from journal_mode=DELETE to journal_mode=WAL; then readers don't block writers
[08:16] see https://sqlite.org/wal.html
[08:16] Not sure how it works with the backup API, maybe waveform knows more if the backup DB will be journal_mode=DELETE still if the main db is journal_mode=WAL
[08:18] it seems to be
[08:19] Laney: Oh I guess it might hold the lock longer because it's slower, I can't say
[08:19] But WAL would avoid any logging, but then we can't publish a single db file anymore that way ugh
[08:20] I guess we can do pragma journal_mode=delete; at the end before committing it
[08:20] (committing being the rename here)
[08:21] Might just set pragma journal_mode=OFF on the db copy
[08:30] Laney, waveform https://code.launchpad.net/~juliank/autopkgtest-cloud/+git/autopkgtest-cloud/+merge/402242
[08:34] Laney: validated on staging
[08:35] juliank: It sounds sensible to me but I also think w_aveform would be a better reviewer
[08:35] btw once this is fixed you need to run download-all-results.service once to catch up
[08:37] hi, there are currently multiple uploads in the bionic unapproved queue for neutron whereby the most recent one can supersede all others
[08:37] https://launchpad.net/ubuntu/bionic/+queue?queue_state=1&queue_text=neutron
[08:37] would it be possible for someone with sru powers to go ahead and reject all prior to the most recent
[08:38] sorry about that, it resulted from accidental parallel uploads
[08:38] https://launchpad.net/ubuntu/bionic/+upload/26074856/+files/neutron_12.1.1-0ubuntu7.dsc is the one to keep
[08:54] -queuebot:#ubuntu-release- Unapproved: rejected linux-firmware-raspi2 [source] (focal-proposed) [4-0ubuntu0~20.04.1]
[08:54] juliank: meh, let's try it, it's making me anxious seeing download-results fail all the time
[08:54] -queuebot:#ubuntu-release- Unapproved: rejected neutron [source] (bionic-proposed) [2:12.1.1-0ubuntu4.1]
[08:54] -queuebot:#ubuntu-release- Unapproved: rejected neutron [source] (bionic-proposed) [2:12.1.1-0ubuntu7]
[08:54] -queuebot:#ubuntu-release- Unapproved: rejected neutron [source] (bionic-proposed) [2:12.1.1-0ubuntu6]
[08:55] Laney: OK, I can publish it :D
[08:55] aye
[08:58] Laney: it's live now
[08:59] Laney: Running d-a-r
[09:00] w00t
[09:00] keep a journalctl -fe -u download-results running
[09:01] oh sigh, I forgot I had a stupid conflict on download-all-results
[09:01] we should drop that, there's no need for it I think
[09:01] * Laney does quickly
[09:02] juliank, journal_mode=WAL *would* be ideal (due to readers not blocking writers) but for the fact that read-only mode becomes more tricky with that (-wal and -shm must pre-exist, in other words it *must* be open already elsewhere, in order for a read-only open to succeed IIRC)
[09:03] but then if you're not really counting on read-only mode (it's just a safety measure) it might be useful to explore anyway
[09:04] waveform: I think it's fine, we set r-o mode in the reader that does not set the pragma, the pragma is only set in the writer
[09:05] waveform: And the reader runs from time to time, the writer runs consistently :D
[09:05] okay, that should be fine then
[09:06] Laney: running on autopkgtest-web/0 at the moment, once that is done, autopkgtest-web/1; ETOOMANYTERMINALS :D
[09:07] download-all-results is *slow*
[09:07] yeah
[09:07] it's basically doing a diff between swift and the db
[09:08] hmmm, I found a few more places to optimize last night but I haven't looked at d-a-r yet; will have a look now
[09:08] that's only run as a recovery tool, I wouldn't bother too much
[09:09] or, well, not as a first priority, feel free of course
[09:09] 4-7 hours runtime it seems from journal
[09:09] no way
[09:09] o.O
[09:09] Apr 09 16:34:23 juju-4d1272-prod-proposed-migration-2 systemd[1]: Starting Download all results...
[09:09] Apr 09 23:57:48 juju-4d1272-prod-proposed-migration-2 systemd[1]: Finished Download all results.
[09:09] Apr 26 08:14:03 juju-4d1272-prod-proposed-migration-2 systemd[1]: Starting Download all results...
[09:09] Apr 26 11:51:14 juju-4d1272-prod-proposed-migration-2 systemd[1]: Finished Download all results.
[09:10] yeah where it had a lot of work to do when initialising
[09:10] not now though, I wouldn't have thought
[09:10] It finished xenial now and is on bionic
[09:12] maybe it should download results from swift in a thread pool or sth
[09:12] would be interesting to throw some more logging in there and see where the bottleneck really is
[09:12] polling the swift API is slow
[09:13] you have to fetch the contents in batches
[09:13] juliank: see sync-swift in the wendigo homedir, there's a thread pool in that one, if you really want to optimise this tool we shouldn't have to be running :p
[09:14] mice
[09:14] *nice
[09:17] focal :D
[09:20] Laney: it failed, swift got unhappy
[09:21] http.client.RemoteDisconnected: Remote end closed connection without response
[09:21] gotta start it again :(
[09:21] sad
[09:23] Laney: started it on web/1 too now, though
[09:25] oh yeah there's all the 'damaged' results, we should maybe purge those out
[09:25] -queuebot:#ubuntu-release- Unapproved: accepted gnome-control-center [source] (focal-proposed) [1:3.36.5-0ubuntu2]
[09:25] yeah
[09:39] Laney: With WAL, we don't actually need to use the r/o database in the web frontend anymore, so we could update it less often and make the web frontend use the r/w one
[09:40] So results would appear sooner that way; and we can save CPU cycles by syncing the public DB less often
[09:40] Currently it just runs back to back without a break
[09:40] Running it every 15 mins would work then I think
[09:42] Then the copy just exists for like britney speedup work
[09:43] sounds good
[09:44] Laney: have we considered setting up https://www.haproxy.com/de/blog/accelerate-your-apis-by-using-the-haproxy-cache/?
[09:45] oh - the only reason the r/o database existed was to relieve pressure on the r/w one? okay -- things are becoming clearer :)
[09:45] -queuebot:#ubuntu-release- Unapproved: accepted neutron [source] (bionic-proposed) [2:12.1.1-0ubuntu7]
[09:46] waveform: Basically, https://bugs.launchpad.net/auto-package-testing/+bug/1639847
[09:46] Launchpad bug 1639847 in Auto Package Testing "web results incomplete/error out on locked DB" [Medium, Fix Released]
[09:46] Laney: Ah, but we do use the extra fields publish-db creates in browse.cgi
[09:47] We'd have to do that bit on the r/w database then in a transaction
[09:47] that could be moved to the rw one, I'd have thought
[09:47] was just about to ask that :)
[09:47] that and to provide a non changing thing which people/programs can download
[09:49] indeed -- the "static" r/o copy makes sense to me; was just confused about why certain bits only existed in that and not the r/w "master", but it's all beginning to make some kind of sense
[09:50] * Laney nods
[09:50] waveform: but if we have a big write transaction in there, it will block the other writes too, right, even if they use different tables?
[09:51] under the historical journal mode, yes -- I'll just go skim the sqlite WAL docs though as I'm less familiar with that
[09:52] waveform: I found https://sqlite.org/src/doc/begin-concurrent/doc/begin_concurrent.md#:~:text=Overview,system%20still%20serializes%20COMMIT%20commands.
[09:52] juliank: caching> sort of, but not much, I don't want to lose the shiny faster updates we just won ...
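A minimal sketch of the journal_mode=WAL setup discussed above, assuming Python's stdlib `sqlite3` module and a hypothetical `result` table (the real autopkgtest-cloud schema and the publish-db code do not appear in this log). It only illustrates that, in WAL mode, a read-only connection sees the last committed snapshot while a write transaction is still open.

```python
import os
import sqlite3
import tempfile


def demo_wal_reader_during_write(db_path):
    """Return (journal_mode, rows_seen_by_reader) for a WAL database."""
    # Writer: the only connection that sets the pragma (autocommit mode,
    # so transactions are controlled explicitly below).
    writer = sqlite3.connect(db_path, isolation_level=None)
    mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # Hypothetical table, for illustration only.
    writer.execute(
        "CREATE TABLE IF NOT EXISTS result (id INTEGER PRIMARY KEY, status TEXT)"
    )
    writer.execute("INSERT INTO result (status) VALUES ('pass')")

    # Open a write transaction and leave it pending, standing in for a
    # slow writer that holds the write lock for a while.
    writer.execute("BEGIN IMMEDIATE")
    writer.execute("INSERT INTO result (status) VALUES ('fail')")

    # Reader: read-only URI open, no pragma. The writer already has the
    # database open, so the -wal and -shm files exist at this point.
    reader = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True, timeout=1)
    rows = reader.execute("SELECT COUNT(*) FROM result").fetchone()[0]
    reader.close()

    writer.execute("COMMIT")
    writer.close()
    return mode, rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_wal_reader_during_write(os.path.join(tmp, "demo.db")))
```

The pragma is set only on the writer connection, mirroring the split described at [09:04]; whether the real services are structured this way is an assumption.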
[09:53] But not sure how to make BEGIN CONCURRENT work, if it's enough to do it only for the large writer
[09:54] might be a silly question, but what sort of storage do the machines this is running on have? Just concerned that it's something that properly implements file-locks and not something like NFS
[09:55] ext4 on virtual block storage
[09:55] okay - sounds like that should be fine
[09:56] juliank, that BEGIN CONCURRENT doc looks a lot like RR in other systems; in other words if you're handling a write request you *have* to deal with the possibility of txn conflicts. Which is do-able but would probably mean a fair bit of upheaval in the code-base (basically means on any commit you have to catch busy-snapshot and retry the whole txn)
[09:56] waveform: Right, if we can only do it in publish-db which adds its own extra tables, and is the only one writing them, we'd be fine though
[09:57] waveform: Because there are no conflicts across tables
[09:57] Two transactions that write to different sets of tables never conflict, and that
[09:57] well, note it's not just which tables you *write*
[09:57] "yes, this transaction did not read or modify any data modified by any concurrent transaction"
[09:58] in other words, if that transaction reads any page of data which was modified by another transaction, then it conflicts
[09:58] (it's effectively RR isolation)
[09:58] ah
[09:58] That works for us :D
[09:59] yeah - if that still works, then \o/ - I'm just not that clear on all the things that happen simultaneously yet :)
[10:00] waveform: We can also rollback and retry those transactions publish-db currently does too, that's not a problem :D
[10:00] excellent
[10:07] bit of a shame the result table (in particular) isn't WITHOUT ROWID but that's something to look at another time
[10:18] I mostly wonder how we ended up with a working database so long
[10:19] like, we only started getting inconsistent database images yesterday and our copy without lock hack still did not seem to break much
[10:19] looking at how schema migration is handled ... erm ... yeah :)
[10:19] that too, yeah
[10:20] now you're getting used to autopkgtest-cloud
[10:20] Well we probably 500ed out before too, but did not have the haproxy declaring our servers dead
[10:21] we didn't have proposed-migration hitting the web backends before
[10:32] finally, gcc-10 built on armhf after 11 days \o/
[10:34] 🥳🥳🥳
[10:42] Laney: On https://ubuntu-release.kpi.ubuntu.com/d/76Oe_0-Gz/autopkgtest?orgId=1&var-instance=production, I think it would be nicer if we could split the active/error worker graphs, and same for the normal/abnormal exit
[10:43] wondering you can have those graphs accumulative, such that it shows total since start of period, instead of like bouncing between 1 and 0
[10:44] mmm
[10:44] sometimes I want to like show one arch, sometimes all the errors
[10:45] I usually shift-click the labels to select what I want
[10:45] but yeah it could probably be better somehow
[10:46] juliank: let me add you to that, you can make a copy of the dashboard and play if you want
[10:47] check email
[10:49] would love to at least get the port quota thing fixed, but unsure how to further prioritise it :(
[10:56] Laney: maybe I just want one error panel which shows three graphs: errored jobs, worker in error states, relxd motes in error state
[10:56] Laney: One quick thing to glance at and see if things are behaving normal, not for digging further
[10:57] * Laney relxd motes
[10:58] Laney: yeah, lol @ mosh messing up lxd remotes
[10:58] Laney: We need some email alerts for increased error rates
[10:58] We also need to capture 5xx server errors in haproxy and log them
[10:58] card that stuff
[10:58] "Something is off, please have a look"
[10:59] some 'stat' panels might be nice for what you're after
[10:59] not graphs, but here is the state now
[10:59] ACPM-15 and ACPM-16
[10:59] juliank: some kind of notification support in the metrics project would be great
[10:59] then we could have like /notify #ubuntu-release ISO builds are failing too
[11:00] Yeah, email irc or mattermost pings
[11:00] I guess you should be able to write influx queries for this stuff
[11:00] so we need a way to hook it up to the notification side
[11:01] medium amount of work I guess
[11:01] -queuebot:#ubuntu-release- Unapproved: accepted udisks2 [source] (hirsute-proposed) [2.9.2-1ubuntu1]
[11:57] -queuebot:#ubuntu-release- Unapproved: openssl (hirsute-proposed/main) [1.1.1j-1ubuntu3 => 1.1.1j-1ubuntu3.1] (core, i386-whitelist)
[11:59] -queuebot:#ubuntu-release- Unapproved: openssl (focal-proposed/main) [1.1.1f-1ubuntu2.3 => 1.1.1f-1ubuntu2.4] (core, i386-whitelist)
[11:59] -queuebot:#ubuntu-release- Unapproved: openssl (groovy-proposed/main) [1.1.1f-1ubuntu4.3 => 1.1.1f-1ubuntu4.4] (core, i386-whitelist)
[13:00] rbasak: hey, I've re-uploaded the emoji package, this time with the leading line... Damned vim :)
[13:00] -queuebot:#ubuntu-release- Unapproved: fonts-noto-color-emoji (focal-proposed/main) [0~20200408-1 => 0~20200916-1~ubuntu20.04.1] (ubuntu-desktop)
[13:00] oh, here it is :D
[13:01] Thanks!
[13:03] -queuebot:#ubuntu-release- Unapproved: rejected fonts-noto-color-emoji [source] (focal-proposed) [0~20200916-1~ubuntu20.04.1]
[13:03] rbasak any chance you have time to review the openssl uploads? it's been reviewed by the security team already as you can see in LP: #1926254 and the change for LP: #1927161 has no code change
[13:04] Launchpad bug 1926254 in openssl (Ubuntu Groovy) "x509 Certificate verification fails when basicConstraints=CA:FALSE,pathlen:0 on self-signed leaf certs" [Medium, In Progress] https://launchpad.net/bugs/1926254
[13:04] Launchpad bug 1927161 in openssl (Ubuntu Impish) "dpkg-source: error: diff 'openssl/debian/patches/pr12272.patch' patches files multiple times; split the diff in multiple files or merge the hunks into a single one" [Low, In Progress] https://launchpad.net/bugs/1927161
[13:04] ddstreet: marked for attention. I'll try and get to it today, but feeling a bit swamped under all the SRUs in the queue at the moment!
[13:04] ack, thanks
[13:05] It'll be the next one I start looking at - just trying to clear the list of things I've already started.
[13:13] -queuebot:#ubuntu-release- Unapproved: accepted fonts-noto-color-emoji [source] (focal-proposed) [0~20200916-1~ubuntu20.04.1]
[13:55] -queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (impish-proposed/primary) [465.27-0ubuntu1]
[14:03] -queuebot:#ubuntu-release- New source: fabric-manager-450 (impish-proposed/primary) [450.119.04-0ubuntu1]
[14:03] -queuebot:#ubuntu-release- New source: fabric-manager-460 (impish-proposed/primary) [460.73.01-0ubuntu1]
[14:06] -queuebot:#ubuntu-release- New source: libnvidia-nscq-460 (impish-proposed/primary) [460.73.01-0ubuntu1]
[14:09] -queuebot:#ubuntu-release- New source: libnvidia-nscq-450 (impish-proposed/primary) [450.119.04-0ubuntu1]
[14:18] Oh, Laney, waveform I figure we should add some random delay to the timer units; the way it works now both backends get the same peak loads at the same time, giving them even just 5s randomness should smoothen the load out for clients a bit
[14:19] * juliank still has htop running on both autopkgtest-web instances :D
[14:21] We should set AccuracySec=1us and RandomizedDelaySec=30s so that systemd does not coalesce timers (Which it does in 1 minute groups) and makes them more random
[14:21] though I guess maybe it isn't timers after all
[14:22] publish-db, who runs that
[14:27] pushed to https://code.launchpad.net/~juliank/autopkgtest-cloud/+git/autopkgtest-cloud/+merge/402262
[14:32] juliank: I would like FixedRandomDelay too, think focal's systemd might not have that though
[14:32] otherwise seems like a sane idea
[14:33] Laney: That means they might end up with the same delay though, without it, it's more likely to distribute better
[14:33] (IMO)
[14:36] Then you might just as well start one at :00, the next at :10, another at :20
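The trade-off between a fixed and a freshly drawn timer delay can be sketched in Python. This is not systemd's implementation: `fixed_delay`, `fresh_delay`, and the host and unit names are hypothetical, chosen only for illustration, and the hash-based offset merely imitates the idea of a delay that stays stable for a given machine and timer.

```python
import hashlib
import random


def fixed_delay(identity: str, max_delay_s: float) -> float:
    """Stable delay in [0, max_delay_s) derived from a hash of `identity`.

    The same identity always yields the same offset, so two hosts keep
    whatever spacing the hash happens to give them, including a small one.
    """
    digest = hashlib.sha256(identity.encode()).digest()
    fraction = int.from_bytes(digest[:4], "big") / 2**32
    return fraction * max_delay_s


def fresh_delay(max_delay_s: float, rng: random.Random) -> float:
    """New uniformly random delay on every call, so each run lands elsewhere."""
    return rng.uniform(0, max_delay_s)


if __name__ == "__main__":
    rng = random.Random()
    # Hypothetical host/unit identifiers.
    for host in ("web-0/publish-db.timer", "web-1/publish-db.timer"):
        print(host, "fixed:", round(fixed_delay(host, 30), 2),
              "fresh:", [round(fresh_delay(30, rng), 2) for _ in range(3)])
```

With the fixed variant the two backends never drift relative to each other; with the fresh variant they occasionally coincide but do not stay aligned, which is the disagreement in the messages above.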
[14:46] Otherwise you get a different delay each time, I think that's worse
[15:22] -queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-460 (groovy-proposed/restricted) [460.73.01-0ubuntu0.20.10.1 => 460.73.01-0ubuntu0.20.10.2] (i386-whitelist)
[15:22] -queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-460 (bionic-proposed/restricted) [460.73.01-0ubuntu0.18.04.1 => 460.73.01-0ubuntu0.18.04.2] (no packageset)
[15:22] -queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-460 (focal-proposed/restricted) [460.73.01-0ubuntu0.20.04.1 => 460.73.01-0ubuntu0.20.04.2] (i386-whitelist)
[15:23] -queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-450-server (groovy-proposed/restricted) [450.119.03-0ubuntu0.20.10.1 => 450.119.04-0ubuntu0.20.10.1] (core, i386-whitelist)
[15:24] -queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-450-server (hirsute-proposed/restricted) [450.119.03-0ubuntu1 => 450.119.04-0ubuntu0.21.04.1] (core, i386-whitelist)
[15:24] -queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-450-server (bionic-proposed/restricted) [450.119.03-0ubuntu0.18.04.1 => 450.119.04-0ubuntu0.18.04.1] (no packageset)
[15:24] -queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-450-server (focal-proposed/restricted) [450.119.03-0ubuntu0.20.04.1 => 450.119.04-0ubuntu0.20.04.1] (i386-whitelist)
[15:35] -queuebot:#ubuntu-release- Unapproved: nvidia-settings (bionic-proposed/main) [460.39-0ubuntu0.18.04.2 => 460.73.01-0ubuntu0.18.04.1] (ubuntu-desktop)
[15:35] -queuebot:#ubuntu-release- Unapproved: nvidia-settings (groovy-proposed/main) [460.39-0ubuntu0.20.10.1 => 460.73.01-0ubuntu0.20.10.1] (i386-whitelist, ubuntu-desktop)
[15:35] -queuebot:#ubuntu-release- Unapproved: nvidia-settings (focal-proposed/main) [460.39-0ubuntu0.20.04.1 => 460.73.01-0ubuntu0.20.04.1] (i386-whitelist, ubuntu-desktop)
[15:35] -queuebot:#ubuntu-release- Unapproved: nvidia-settings (hirsute-proposed/main) [460.56-0ubuntu2 => 460.73.01-0ubuntu0.21.04.1] (i386-whitelist, ubuntu-desktop)
[15:44] -queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (hirsute-proposed/primary) [465.27-0ubuntu0.21.04.1]
[15:45] -queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (groovy-proposed/primary) [465.27-0ubuntu0.20.10.1]
[15:45] -queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (focal-proposed/primary) [465.27-0ubuntu0.20.04.1]
[15:46] -queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (bionic-proposed/primary) [465.27-0ubuntu0.18.04.1]
[16:50] * blackboxsw SRU vanguard for tomorrow (sil2100). ubuntu-advantage-tools 27.0 cleared SRU verification for B, F, G and H series. If there is time to look at that tomorrow it would be great to get that release out and align w/ Xenial that has already released
[16:51] * blackboxsw SRU vanguard for tomorrow (sil2100): ubuntu-advantage-tools 27.0 cleared SRU verification for B, F, G and H series. If there is time to look at that tomorrow it would be great to get that release out and align w/ Xenial that has already released. https://bugs.launchpad.net/ubuntu/+source/ubuntu-advantage-tools/+bug/1926361
[16:51] Launchpad bug 1926361 in ubuntu-advantage-tools (Ubuntu Hirsute) "[SRU] ubuntu-advantage-tools (10ubuntu0.16.04.1 -> 27.0) Xenial, Bionic, Focal, Hirsute" [Undecided, Fix Committed]
=== mfo_ is now known as mfo
[20:31] bdmurray vorlon sorry to bug you about an sru, but systemd in bionic has been in -proposed for quite a long time and we're getting asked about why it's delayed, any chance you could release it to -updates? it's for lp: #1923115
[20:31] Launchpad bug 1923115 in systemd (Ubuntu Bionic) "Networkd vs udev nic renaming race condition" [Undecided, Fix Committed] https://launchpad.net/bugs/1923115
[21:29] * enyc meows
[23:15] -queuebot:#ubuntu-release- New binary: python-xstatic-dagre [amd64] (impish-proposed/none) [0.6.4.0-1] (no packageset)