/srv/irclogs.ubuntu.com/2021/05/05/#ubuntu-release.txt

=== cpaelzer__ is now known as cpaelzer
dokoddstreet: https://gcc.gnu.org/pipermail/gcc/2021-May/235967.html07:06
-queuebot:#ubuntu-release- New: accepted pipewire [amd64] (impish-proposed) [0.3.26-1]07:11
=== cpaelzer__ is now known as cpaelzer
Laneyjuliank: can you check download-results please, I think you semi-broke it08:08
Laneyand that is probably making the web frontend miss results08:08
juliankLaney: sure08:08
juliankLaney: yeah, it times out after 5s because publish-db holds the lock for ~1s08:12
juliankLaney: We should switch the database from journal_mode=DELETE to journal_mode=WAL; then readers don't block writers08:15
julianksee https://sqlite.org/wal.html08:16
juliankNot sure how it works with the backup API, maybe waveform knows more if the backup DB will be journal_mode=DELETE still if the main db is journal_mode=WAL08:16
juliankit seems to be08:18
juliankLaney: Oh I guess it might hold the lock longer because it's slower, I can't say08:19
juliankBut WAL would avoid any locking, but then we can't publish a single db file anymore that way ugh08:19
juliankI guess we can do pragma journal_mode=delete; at the end before committing it08:20
juliank(committing being the rename here)08:20
juliankMight just set pragma journal_mode=OFF on the db copy08:21
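The approach discussed above (WAL on the live database, plus a single-file copy that is published by rename) might look roughly like this in Python. It is a minimal sketch, not the code from the merge proposal: the function names `enable_wal` and `publish_copy` and the `.new` temporary suffix are hypothetical, and it assumes the copy is made with the standard `sqlite3.Connection.backup` method.

```python
import os
import sqlite3


def enable_wal(path):
    """Switch the database at `path` to WAL mode; return the mode now in effect."""
    con = sqlite3.connect(path)
    try:
        # The pragma returns the resulting journal mode as a single row.
        return con.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    finally:
        con.close()


def publish_copy(src_path, dest_path):
    """Copy src to dest as one self-contained file and replace dest atomically."""
    tmp_path = dest_path + ".new"  # hypothetical temporary name
    if os.path.exists(tmp_path):
        os.remove(tmp_path)

    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(tmp_path)
    try:
        # Online backup: readers and writers of src can keep working.
        src.backup(dst)
    finally:
        dst.close()
        src.close()

    # Reopen the copy and force rollback-journal mode, so that the published
    # file never depends on -wal/-shm side files, whatever mode it inherited.
    copy = sqlite3.connect(tmp_path)
    try:
        mode = copy.execute("PRAGMA journal_mode=DELETE").fetchone()[0]
    finally:
        copy.close()

    # "Committing" is the rename: consumers see the old or the new file.
    os.replace(tmp_path, dest_path)
    return mode
```

Setting the pragma explicitly on the copy sidesteps the open question of which journal mode a backup inherits from a WAL source.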
juliankLaney, waveform https://code.launchpad.net/~juliank/autopkgtest-cloud/+git/autopkgtest-cloud/+merge/40224208:30
juliankLaney: validated on staging08:34
Laneyjuliank: It sounds sensible to me but I also think w_aveform would be a better reviewer08:35
Laneybtw once this is fixed you need to run download-all-results.service once to catch up08:35
dosaboyhi, there are currently multiple uploads in the bionic unapproved queue for neutron whereby the most recent one can supersede all others08:37
dosaboyhttps://launchpad.net/ubuntu/bionic/+queue?queue_state=1&queue_text=neutron08:37
dosaboywould it be possible for someone with sru powers to go ahead and reject all prior to the most recent08:37
dosaboysorry about that, it resulted from accidental parallel uploads08:38
dosaboyhttps://launchpad.net/ubuntu/bionic/+upload/26074856/+files/neutron_12.1.1-0ubuntu7.dsc is the one to keep08:38
-queuebot:#ubuntu-release- Unapproved: rejected linux-firmware-raspi2 [source] (focal-proposed) [4-0ubuntu0~20.04.1]08:54
Laneyjuliank: meh, let's try it, it's making me anxious seeing download-results fail all the time08:54
-queuebot:#ubuntu-release- Unapproved: rejected neutron [source] (bionic-proposed) [2:12.1.1-0ubuntu4.1]08:54
-queuebot:#ubuntu-release- Unapproved: rejected neutron [source] (bionic-proposed) [2:12.1.1-0ubuntu7]08:54
-queuebot:#ubuntu-release- Unapproved: rejected neutron [source] (bionic-proposed) [2:12.1.1-0ubuntu6]08:54
juliankLaney: OK, I can publish it :D08:55
Laneyaye08:55
juliankLaney: it's live now08:58
juliankLaney: Running d-a-r08:59
Laneyw00t09:00
Laneykeep a journalctl -fe -u download-results running09:00
Laneyoh sigh, I forgot I had a stupid conflict on download-all-results09:01
Laneywe should drop that, there's no need for it I think09:01
* Laney does quickly09:01
waveformjuliank, journal_mode=WAL *would* be ideal (due to readers not blocking writers) but for the fact that read-only mode becomes more tricky with that (-wal and -shm must pre-exist, in other words it *must* be open already elsewhere, in order for a read-only open to succeed IIRC)09:02
waveformbut then if you're not really counting on read-only mode (it's just a safety measure) it might be useful to explore anyway09:03
juliankwaveform: I think it's fine, we set r-o mode in the reader that does not set the pragma, the pragma is only set in the writer09:04
juliankwaveform: And the reader runs from time to time, the writer runs consistently :D09:05
waveformokay, that should be fine then09:05
juliankLaney: running on autopkgtest-web/0 at the moment, once that is done, autopkgtest-web/1; ETOOMANYTERMINALS :D09:06
juliankdownload-all-results is *slow*09:07
Laneyyeah09:07
Laneyit's basically doing a diff between swift and the db09:07
waveformhmmm, I found a few more places to optimize last night but I haven't looked at d-a-r yet; will have a look now09:08
Laneythat's only run as a recovery tool, I wouldn't bother too much09:08
Laneyor, well, not as a first priority, feel free of course09:09
juliank4-7 hours runtime it seems from journal09:09
Laneyno way09:09
waveformo.O09:09
juliankApr 09 16:34:23 juju-4d1272-prod-proposed-migration-2 systemd[1]: Starting Download all results...09:09
juliankApr 09 23:57:48 juju-4d1272-prod-proposed-migration-2 systemd[1]: Finished Download all results.09:09
juliankApr 26 08:14:03 juju-4d1272-prod-proposed-migration-2 systemd[1]: Starting Download all results...09:09
juliankApr 26 11:51:14 juju-4d1272-prod-proposed-migration-2 systemd[1]: Finished Download all results.09:09
Laneyyeah where it had a lot of work to do when initialising09:10
Laneynot now though, I wouldn't have thought09:10
juliankIt finished xenial now and is on bionic09:10
juliankmaybe it should download results from swift in a thread pool or sth09:12
waveformwould be interesting to throw some more logging in there and see where the bottleneck really is09:12
Laneypolling the swift API is slow09:12
Laneyyou have to fetch the contents in batches09:13
Laneyjuliank: see sync-swift in the wendigo homedir, there's a thread pool in that one, if you really want to optimise this tool we shouldn't have to be running :p09:13
juliankmice09:14
juliank*nice09:14
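The thread pool idea mentioned above can be sketched as follows. This is only an illustration under stated assumptions: `download_all` and its `fetch` callable are hypothetical names, `fetch` stands in for whatever retrieves one result from swift, and this is not the code from sync-swift or download-all-results.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(names, fetch, max_workers=8):
    """Fetch every name concurrently; return (results, failures) dicts.

    `fetch` is a hypothetical callable taking one object name and returning
    its contents. Failures are collected instead of aborting the whole run,
    so one dropped connection does not force a restart from scratch.
    """
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the name it was submitted for.
        futures = {pool.submit(fetch, name): name for name in names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:  # e.g. a dropped HTTP connection
                failures[name] = exc
    return results, failures
```

Threads suit this kind of work because the time is spent waiting on network I/O rather than on CPU; the failed names can be retried in a second pass.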
juliankfocal :D09:17
juliankLaney: it failed, swift got unhappy09:20
juliankhttp.client.RemoteDisconnected: Remote end closed connection without response09:21
juliankgotta start it again :(09:21
Laneysad09:21
juliankLaney: started it on web/1 too now, though09:23
Laneyoh yeah there's all the 'damaged' results, we should maybe purge those out09:25
-queuebot:#ubuntu-release- Unapproved: accepted gnome-control-center [source] (focal-proposed) [1:3.36.5-0ubuntu2]09:25
juliankyeah09:25
juliankLaney: With WAL, we don't actually need to use the r/o database in the web frontend anymore, so we could update it less often and make the web frontend use the r/w one09:39
juliankSo results would appear sooner that way; and we can save CPU cycles by syncing the public DB less often09:40
juliankCurrently it just runs back to back without a break09:40
juliankRunning it every 15 mins would work then I think09:40
juliankThen the copy just exists for like britney speedup work09:42
Laneysounds good09:43
juliankLaney: have we considered setting up https://www.haproxy.com/de/blog/accelerate-your-apis-by-using-the-haproxy-cache/?09:44
waveformoh - the only reason the r/o database existed was to relieve pressure on the r/w one? okay -- things are becoming clearer :)09:45
-queuebot:#ubuntu-release- Unapproved: accepted neutron [source] (bionic-proposed) [2:12.1.1-0ubuntu7]09:45
juliankwaveform: Basically, https://bugs.launchpad.net/auto-package-testing/+bug/163984709:46
ubot3Launchpad bug 1639847 in Auto Package Testing "web results incomplete/error out on locked DB" [Medium, Fix Released]09:46
juliankLaney: Ah, but we do use the extra fields publish-db creates in browse.cgi09:46
juliankWe'd have to do that bit on the r/w database then in a transaction09:47
Laneythat could be moved to the rw one, I'd have thought09:47
waveformwas just about to ask that :)09:47
Laneythat and to provide a non changing thing which people/programs can download09:47
waveformindeed -- the "static" r/o copy makes sense to me; was just confused about why certain bits only existed in that and not the r/w "master", but it's all beginning to make some kind of sense09:49
* Laney nods09:50
juliankwaveform: but if we have a big write transaction in there, it will block the other writes too, right, even if they use different tables?09:50
waveformunder the historical journal mode, yes -- I'll just go skim the sqlite WAL docs though as I'm less familiar with that09:51
juliankwaveform: I found https://sqlite.org/src/doc/begin-concurrent/doc/begin_concurrent.md#:~:text=Overview,system%20still%20serializes%20COMMIT%20commands.09:52
Laneyjuliank: caching> sort of, but not much, I don't want to lose the shiny faster updates we just won ...09:52
juliankBut not sure how to make BEGIN CONCURRENT work, if it's enough to do it only for the large writer09:53
waveformmight be a silly question, but what sort of storage do the machines this is running on have? Just concerned that it's something that properly implements file-locks and not something like NFS09:54
juliankext4 on virtual block storage09:55
waveformokay - sounds like that should be fine09:55
waveformjuliank, that BEGIN CONCURRENT doc looks a lot like RR in other systems; in other words if you're handling a write request you *have* to deal with the possibility of txn conflicts. Which is do-able but would probably mean a fair bit of upheaval in the code-base (basically means on any commit you have to catch busy-snapshot and retry the whole txn)09:56
juliankwaveform: Right, if we can only do it in publish-db which adds its own extra tables, and is the only one writing them, we'd be fine though09:56
juliankwaveform: Because there are no conflicts across tables09:57
juliankTwo transactions that write to different sets of tables never conflict, and that09:57
waveformwell, note it's not just which tables you *write*09:57
waveform"yes, this transaction did not read or modify any data modified by any concurrent transaction"09:57
waveformin other words, if that transaction reads any page of data which was modified by another transaction, then it conflicts09:58
waveform(it's effectively RR isolation)09:58
juliankah09:58
juliankThat works for us :D09:58
waveformyeah - if that still works, then \o/ - I'm just not that clear on all the things that happen simultaneously yet :)09:59
juliankwaveform: We can also rollback and retry those transactions publish-db currently does too, that's not a problem :D10:00
waveformexcellent10:00
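The "roll back and retry the whole transaction" pattern discussed above can be sketched in Python. BEGIN CONCURRENT is not available in stock SQLite builds, so this sketch uses an ordinary `BEGIN IMMEDIATE` and retries when SQLite reports the database as busy or locked; the retry structure would be the same for a busy-snapshot error. The name `run_with_retry` and its parameters are hypothetical, and the connection is assumed to be opened with `isolation_level=None` so that transactions are managed explicitly.

```python
import sqlite3
import time


def run_with_retry(con, txn, attempts=5, delay=0.1):
    """Run txn(con) in one transaction, retrying it whole on lock conflicts.

    `txn` is a hypothetical callable that performs all reads and writes of
    the transaction; it must be safe to run again from the start.
    """
    for attempt in range(1, attempts + 1):
        try:
            con.execute("BEGIN IMMEDIATE")
            result = txn(con)
            con.execute("COMMIT")
            return result
        except sqlite3.OperationalError as exc:
            # Undo any partial work before deciding what to do next.
            if con.in_transaction:
                con.execute("ROLLBACK")
            message = str(exc).lower()
            retryable = "locked" in message or "busy" in message
            if not retryable or attempt == attempts:
                raise
            time.sleep(delay)
```

Because the whole callable is re-run, anything it read in the failed attempt is read again, which is the property the isolation discussion above depends on.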
waveformbit of a shame the result table (in particular) isn't WITHOUT ROWID but that's something to look at another time10:07
juliankI mostly wonder how we ended up with a working database so long10:18
julianklike, we only started getting inconsistent database images yesterday and our copy without lock hack still did not seem to break much10:19
waveformlooking at how schema migration is handled ... erm ... yeah :)10:19
juliankthat too, yeah10:19
Laneynow you're getting used to autopkgtest-cloud10:20
juliankWell we probably 500ed out before too, but did not have the haproxy declaring our servers dead10:20
Laneywe didn't have proposed-migration hitting the web backends before10:21
dokofinally, gcc-10 built on armhf after 11 days \o/10:32
xnox🥳🥳🥳10:34
juliankLaney: On https://ubuntu-release.kpi.ubuntu.com/d/76Oe_0-Gz/autopkgtest?orgId=1&var-instance=production, I think it would be nicer if we could split the active/error worker graphs, and same for the normal/abnormal exit10:42
juliankwondering if you can have those graphs cumulative, such that it shows total since start of period, instead of like bouncing between 1 and 010:43
Laneymmm10:44
Laneysometimes I want to like show one arch, sometimes all the errors10:44
LaneyI usually shift-click the labels to select what I want10:45
Laneybut yeah it could probably be better somehow10:45
Laneyjuliank: let me add you to that, you can make a copy of the dashboard and play if you want10:46
Laneycheck email10:47
Laneywould love to at least get the port quota thing fixed, but unsure how to further prioritise it :(10:49
juliankLaney: maybe I just want one error panel which shows three graphs: errored jobs, worker in error states, relxd motes in error state10:56
juliankLaney: One quick thing to glance at and see if things are behaving normal, not for digging further10:56
* Laney relxd motes10:57
juliankLaney: yeah, lol @ mosh messing up lxd remotes10:58
juliankLaney: We need some email alerts for increased error rates10:58
juliankWe also need to capture 5xx server errors in haproxy and log them10:58
Laneycard that stuff10:58
juliank"Something is off, please have a look"10:58
Laneysome 'stat' panels might be nice for what you're after10:59
Laneynot graphs, but here is the state now10:59
juliankACPM-15 and ACPM-1610:59
Laneyjuliank: some kind of notification support in the metrics project would be great10:59
Laneythen we could have like /notify #ubuntu-release ISO builds are failing too10:59
juliankYeah, email irc or mattermost pings11:00
LaneyI guess you should be able to write influx queries for this stuff11:00
Laneyso we need a way to hook it up to the notification side11:00
Laneymedium amount of work I guess11:01
-queuebot:#ubuntu-release- Unapproved: accepted udisks2 [source] (hirsute-proposed) [2.9.2-1ubuntu1]11:01
-queuebot:#ubuntu-release- Unapproved: openssl (hirsute-proposed/main) [1.1.1j-1ubuntu3 => 1.1.1j-1ubuntu3.1] (core, i386-whitelist)11:57
-queuebot:#ubuntu-release- Unapproved: openssl (focal-proposed/main) [1.1.1f-1ubuntu2.3 => 1.1.1f-1ubuntu2.4] (core, i386-whitelist)11:59
-queuebot:#ubuntu-release- Unapproved: openssl (groovy-proposed/main) [1.1.1f-1ubuntu4.3 => 1.1.1f-1ubuntu4.4] (core, i386-whitelist)11:59
Trevinhorbasak: hey, I've re-uploaded the emoji package, this time with the leading line... Damned vim :)13:00
-queuebot:#ubuntu-release- Unapproved: fonts-noto-color-emoji (focal-proposed/main) [0~20200408-1 => 0~20200916-1~ubuntu20.04.1] (ubuntu-desktop)13:00
Trevinhooh, here it is :D13:00
rbasakThanks!13:01
-queuebot:#ubuntu-release- Unapproved: rejected fonts-noto-color-emoji [source] (focal-proposed) [0~20200916-1~ubuntu20.04.1]13:03
ddstreetrbasak any chance you have time to review the openssl uploads? it's been reviewed by the security team already as you can see in LP: #1926254 and the change for LP: #1927161 has no code change13:03
ubot3Launchpad bug 1926254 in openssl (Ubuntu Groovy) "x509 Certificate verification fails when basicConstraints=CA:FALSE,pathlen:0 on self-signed leaf certs" [Medium, In Progress] https://launchpad.net/bugs/192625413:04
ubot3Launchpad bug 1927161 in openssl (Ubuntu Impish) "dpkg-source: error: diff 'openssl/debian/patches/pr12272.patch' patches files multiple times; split the diff in multiple files or merge the hunks into a single one" [Low, In Progress] https://launchpad.net/bugs/192716113:04
rbasakddstreet: marked for attention. I'll try and get to it today, but feeling a bit swamped under all the SRUs in the queue at the moment!13:04
ddstreetack, thanks13:04
rbasakIt'll be the next one I start looking at - just trying to clear the list of things I've already started.13:05
-queuebot:#ubuntu-release- Unapproved: accepted fonts-noto-color-emoji [source] (focal-proposed) [0~20200916-1~ubuntu20.04.1]13:13
-queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (impish-proposed/primary) [465.27-0ubuntu1]13:55
-queuebot:#ubuntu-release- New source: fabric-manager-450 (impish-proposed/primary) [450.119.04-0ubuntu1]14:03
-queuebot:#ubuntu-release- New source: fabric-manager-460 (impish-proposed/primary) [460.73.01-0ubuntu1]14:03
-queuebot:#ubuntu-release- New source: libnvidia-nscq-460 (impish-proposed/primary) [460.73.01-0ubuntu1]14:06
-queuebot:#ubuntu-release- New source: libnvidia-nscq-450 (impish-proposed/primary) [450.119.04-0ubuntu1]14:09
juliankOh, Laney, waveform I figure we should add some random delay to the timer units; the way it works now both backends get the same peak loads at the same time, giving them even just 5s randomness should smoothen the load out for clients a bit14:18
* juliank still has htop running on both autopkgtest-web instances :D14:19
juliankWe should set AccuracySec=1us and RandomizedDelaySec=30s so that systemd does not coalesce timers (Which it does in 1 minute groups) and makes them more random14:21
juliankthough I guess maybe it isn't timers after all14:21
juliankpublish-db, who runs that14:22
juliankpushed to https://code.launchpad.net/~juliank/autopkgtest-cloud/+git/autopkgtest-cloud/+merge/40226214:27
Laneyjuliank: I would like FixedRandomDelay too, think focal's systemd might not have that though14:32
Laneyotherwise seems like a sane idea14:32
juliankLaney: That means they might end up with the same delay though, without it, it's more likely to distribute better14:33
juliank(IMO)14:33
juliankThen you might just as well start one at :00, the next at :10, another at :2014:36
LaneyOtherwise you get a different delay each time, I think that's worse14:46
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-460 (groovy-proposed/restricted) [460.73.01-0ubuntu0.20.10.1 => 460.73.01-0ubuntu0.20.10.2] (i386-whitelist)15:22
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-460 (bionic-proposed/restricted) [460.73.01-0ubuntu0.18.04.1 => 460.73.01-0ubuntu0.18.04.2] (no packageset)15:22
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-460 (focal-proposed/restricted) [460.73.01-0ubuntu0.20.04.1 => 460.73.01-0ubuntu0.20.04.2] (i386-whitelist)15:22
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-450-server (groovy-proposed/restricted) [450.119.03-0ubuntu0.20.10.1 => 450.119.04-0ubuntu0.20.10.1] (core, i386-whitelist)15:23
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-450-server (hirsute-proposed/restricted) [450.119.03-0ubuntu1 => 450.119.04-0ubuntu0.21.04.1] (core, i386-whitelist)15:24
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-450-server (bionic-proposed/restricted) [450.119.03-0ubuntu0.18.04.1 => 450.119.04-0ubuntu0.18.04.1] (no packageset)15:24
-queuebot:#ubuntu-release- Unapproved: nvidia-graphics-drivers-450-server (focal-proposed/restricted) [450.119.03-0ubuntu0.20.04.1 => 450.119.04-0ubuntu0.20.04.1] (i386-whitelist)15:24
-queuebot:#ubuntu-release- Unapproved: nvidia-settings (bionic-proposed/main) [460.39-0ubuntu0.18.04.2 => 460.73.01-0ubuntu0.18.04.1] (ubuntu-desktop)15:35
-queuebot:#ubuntu-release- Unapproved: nvidia-settings (groovy-proposed/main) [460.39-0ubuntu0.20.10.1 => 460.73.01-0ubuntu0.20.10.1] (i386-whitelist, ubuntu-desktop)15:35
-queuebot:#ubuntu-release- Unapproved: nvidia-settings (focal-proposed/main) [460.39-0ubuntu0.20.04.1 => 460.73.01-0ubuntu0.20.04.1] (i386-whitelist, ubuntu-desktop)15:35
-queuebot:#ubuntu-release- Unapproved: nvidia-settings (hirsute-proposed/main) [460.56-0ubuntu2 => 460.73.01-0ubuntu0.21.04.1] (i386-whitelist, ubuntu-desktop)15:35
-queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (hirsute-proposed/primary) [465.27-0ubuntu0.21.04.1]15:44
-queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (groovy-proposed/primary) [465.27-0ubuntu0.20.10.1]15:45
-queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (focal-proposed/primary) [465.27-0ubuntu0.20.04.1]15:45
-queuebot:#ubuntu-release- New source: nvidia-graphics-drivers-465 (bionic-proposed/primary) [465.27-0ubuntu0.18.04.1]15:46
* blackboxsw SRU vanguard for tomorrow (sil2100): ubuntu-advantage-tools 27.0 cleared SRU verification for B, F, G and H series. If there is time to look at that tomorrow it would be great to get that release out and align w/ Xenial that has already released. https://bugs.launchpad.net/ubuntu/+source/ubuntu-advantage-tools/+bug/192636116:51
ubot3Launchpad bug 1926361 in ubuntu-advantage-tools (Ubuntu Hirsute) "[SRU] ubuntu-advantage-tools (10ubuntu0.16.04.1 -> 27.0) Xenial, Bionic, Focal, Hirsute" [Undecided, Fix Committed]16:51
=== mfo_ is now known as mfo
ddstreetbdmurray vorlon sorry to bug you about an sru, but systemd in bionic has been in -proposed for quite a long time and we're getting asked about why it's delayed, any chance you could release it to -updates? it's for lp: #192311520:31
ubot3Launchpad bug 1923115 in systemd (Ubuntu Bionic) "Networkd vs udev nic renaming race condition" [Undecided, Fix Committed] https://launchpad.net/bugs/192311520:31
* enyc meows21:29
-queuebot:#ubuntu-release- New binary: python-xstatic-dagre [amd64] (impish-proposed/none) [0.6.4.0-1] (no packageset)23:15
