[07:55] xnox: if there is time before the final ISO, my vote is that the /etc/init/oem-config.conf change be backed out - https://bugs.launchpad.net/ubuntu/+source/ubiquity/+bug/1239471
[07:55] Launchpad bug 1239471 in ubiquity (Ubuntu) "oem user gets deleted even before "Prepare for shipping to end user"" [Undecided,New]
[08:12] jderose: and be left with not removing oem user?
[08:12] jderose: let me run a test with it, and I'll wait for cjwatson to weigh in on this as well.
[08:19] xnox: yeah, because i have a (probably not very good) fix for removing the OEM user. it's more of a pain for the ISO to be broken for mastering... we can always update the golden image, add updated packages. but we're stuck with the ISO for 6 months :)
[08:21] xnox: also, the existing fix is wildly unusable... you can't get to logging in as the OEM user and clicking "Prepare for shipping to end user" in the first place. Sorry I didn't catch this before... I was testing from a VM with an existing OEM mode install
[08:29] right.
[08:32] xnox: also, why is it a "post-start" target instead of "post-stop"? (bear in mind I'm still kinda upstart dumb) does this upstart job continue to run even after the customer config completes?
[08:33] jderose: no it does not, as it's a "task" job.
[08:33] jderose: when the script stanza finishes, that's when it's "started", hence the post-start.
[08:33] hmm
[08:35] but /var/lib/oem-config/run doesn't get removed till after oem-config-first-run completes, right? and there still might be a process running as the oem user?
[08:35] jderose: Did you have any luck with my patch set to move it into oem-config-firstboot instead?
[08:36] cjwatson: no, it failed, although it wasn't clear why.
[08:37] (not clear to me anyway)
[10:40] xnox, hey ... got a list of bugs you are including in the partition busy issues ... so i can review them too
[10:40] apw: Feel free to bounce ideas around with xnox here for the partition rescan bug. I'm multitasking too hard to keep focus on it.
[10:41] apw: bug 1220165
[10:41] Launchpad bug 1220165 in ubiquity (Ubuntu) "Error informing the kernel about modificatons" [High,Confirmed] https://launchpad.net/bugs/1220165
[10:41] xnox, we did have a couple of people who could repro it and it went away for them when they retested
[10:41] apw: and dupes/comments from there.
[10:41] (in my team)
[10:41] apw: after i'm done with oem bug, i'll try to work on reproducing it.
[10:41] xnox, from what i can see that xfs error report is just scary, it is not clear it is an issue
[10:42] apw: same thoughts here, it's just that os-prober/partman & friends do try to mount things read-only a few times, but in the end all of that successfully quits and offers installation options (wipe and install)
[10:43] xnox, how did you determine that swap was correctly stopped
[10:45] given the syslog shows it being added
[10:45] apw: i don't see that it was stopped.
[10:45] xnox, ok then if it wasn't you won't be changing the partition table
[10:45] apw: slangasek seemed to have concluded that it was not in use.
[10:45] Oct 11 15:53:41 lubuntu os-prober: debug: /dev/sda3: is active swap
[10:46] one of the last things os-prober says is that it thinks it is active
[10:46] and i see the kernel saying it opened and started using it
[10:46] apw: do you remember if there was an installer / kernel cmdline or some such to disable swap activation?
[10:46] xnox: not afaik
[10:47] apw: when was it activated though? e.g. on some of them i see very early on in the syslog: Sep 3 09:45:18 lubuntu kernel: [ 94.421151] Adding 2439064k swap on /dev/sda4. Priority:-1 extents:1 across:2439064k FS
[10:47] partman activates swap it finds in its init.d sequence
[10:47] even if it wasn't activated before
[10:48] which is well ahead of ubiquity or anything else coming up. as if it gets activated in the initramfs/casper.
[10:48] sorry this is partman's init.d not /etc/init.d
[10:48] but ok, your syslog entry
[10:49] cjwatson: thus it should have been behaving the same way...... sans it being activated already. Right and it deactivates that swap if one decides to format that drive?
[10:50] Yeah, this could just be a case of partman needing to *de*activate all swap before writing partition tables.
[10:50] At least, all swap on drives it's about to write to. :P
[10:50] it does
[10:50] These syslogs don't seem to show that.
[10:50] ./lib/base.sh:1094:disable_swap () {
[10:50] ./commit.d/parted:15: disable_swap "$dev"
[10:50] Or does the kernel not helpfully report when swap is deactivated?
[10:51] my bet would be a udev race or something similarly annoying
[10:51] it does not helpfully report it i don't think
[10:51] cjwatson: re:oem-config patch, I am booted into oem config user. Interestingly enough oem-config stays in rc state with the only file left - /etc/init/oem-config.conf
[10:51] although parted even tries sleeping for a while
[10:51] Oh, the kernel totally doesn't say anything about swapoff. That's unhelpful.
[10:51] cjwatson: maybe we can remove oem user if oem-config-firstboot is not found?
[10:52] perhaps
[10:52] infinity, no indeed, it says nothing, though if you ask it to do so it seems to not return till it is successful
[10:53] so if you can show we did make an attempt we ought to be able to say it did if you continued
[10:53] I usually slam set -x into /lib/partman/lib/base.sh to debug this kind of thing
[10:57] "I'm suspicious of ubiquity-partman but that's just a guess" unhelpful bug comments of our time
[10:57] (Yes, it's just possible that the partitioner might be involved)
[10:59] I see a log from d-i there
[10:59] Which is good, means it's likely nothing to do with ubiquity's additions
[10:59] cjwatson: Yeah, I've personally tripped it in barebones netbooted d-i.
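Since the kernel logs swapon but says nothing on swapoff, one way to check whether swap on a disk is still active is to read /proc/swaps, which has a header line followed by one line per active swap area. A minimal sketch under that assumption; the name swap_active_on and its optional file argument are illustrative and not part of partman:

```shell
#!/bin/sh
# Hypothetical helper (not from partman's base.sh): succeed if any active
# swap area starts with the given disk path, so /dev/sda matches /dev/sda3.
# This is a crude prefix match; /dev/sda would also match /dev/sdaa.
# The second argument defaults to /proc/swaps and exists only so the
# function can be run against a saved copy of that file.
swap_active_on() {
    disk=$1
    swaps=${2:-/proc/swaps}
    # Skip the header line, then compare the start of the first column.
    awk -v disk="$disk" '
        NR > 1 && index($1, disk) == 1 { found = 1 }
        END { exit !found }
    ' "$swaps"
}
```

For example, `swap_active_on /dev/sda && echo "swap still active on sda"` run just before the partition table is committed would show whether a swapoff had actually taken effect.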
[10:59] cjwatson: Ran into it a bunch when doing midway installer work, and so has Rob Herring on highbank/midway.
[11:00] (At the time, no one else had been talking about it, so I assumed it might be specific to the platform... I was clearly wrong)
[11:00] infinity, it seems unlikely the exit path of the xfs warning can be to blame if it is not consistent and reproducible on all installs (to my mind)
[11:01] as we attempt that always, regardless
[11:01] Indeed, probably has nothing to do with XFS
[11:01] apw: The inconsistency makes it hard to blame anything, mind you. But if it happened on EVERY reinstall, we'd have a lot more bug reports. Unless all our serial reinstallers have left us for Mint and Gentoo.
[11:01] and i have never seen it on any of my installs, so
[11:02] I'll have a go at it following the recipe in bug 1239515 (only with installing over Ubuntu/Ubuntu, since the flavour won't matter)
[11:02] Launchpad bug 1239515 in ubiquity (Ubuntu) "Saucy ubiquity crashes with kernel error" [Undecided,New] https://launchpad.net/bugs/1239515
[11:05] Easier in qemu - I can snapshot and keep retrying
[11:12] If we're changing ubiquity we might want to sort out bug 1236233 BTW
[11:12] Launchpad bug 1236233 in OEM Priority Project precise "BIOS that SMIBIOS is not fuly supported by dmidecode causes image autmatic installation fail" [High,New] https://launchpad.net/bugs/1236233
[11:12] as a workaround
[11:12] dmidecode's behaviour is crappy but not sure I want to change that now
[11:49] cjwatson, on the dmidecode thing, i note the error it whines about is >2.7, but actually the current saucy dmidecode would emit >2.8 so i think it might be moot for saucy
[11:50] cjwatson, at least until they invent 2.8
[11:52] mm, ok
[11:53] can live with that
[11:58] cjwatson, it is probably something that should be on stderr, or suppressed by -q or both in the long term
[12:03] apw: cjwatson: re mount: it tried all the filesystems it knows about even to the point of trying xfs meaning that mounted tests from os-prober have failed to mount ext4 even though partition table was read to contain one. wouldn't that mean kernel is failing to read anything past the partition table?
[12:03] and then later when it tries to update the partition table (or do any write operation on the drive) that fails to.
[12:03] *too
[12:03] (surprising swap does get activated in those cases)
[12:04] infinity: ^
[12:04] xnox, how did you make that determination, in the syslog i read i saw it mount ext3 successfully
[12:05] xnox: That seems unconvincing since there were syslog indications of it reading the partition table
[12:05] * xnox goes to read it again.
[12:05] Oh, maybe not after repartitioning
[12:05] Not sure if those ioctls emit any logging though
[12:06] be good to have a timecode and url for the log, so we know we are in sync
[12:06] Sep 3 09:48:47 lubuntu kernel: [ 315.175165] EXT4-fs (sda6): write access unavailable, cannot proceed
[12:06] S
[12:06] os-prober is before the repartitioning, is it not?
[12:07] when requesting full-disk install, would mean that it needs to wipe sda6 at least (which is rootfs of the second installation, in a dual boot linux setup)
[12:07] xnox: That looks like grub-mount has failed for some reason
[12:07] ok.
[12:07] Normally the kernel code for this shouldn't even be touched
[12:07] xnox, which logfile is that, url pls
[12:07] https://launchpadlibrarian.net/149229337/UbiquitySyslog.txt
[12:07] https://launchpadlibrarian.net/149229336/UbiquityPartman.txt
[12:07] which is top level / original bug 1220165
[12:07] Launchpad bug 1220165 in partman-base (Ubuntu) "Error informing the kernel about modificatons" [Critical,Confirmed] https://launchpad.net/bugs/1220165
[12:07] So I would research *that* - it's very bad if we aren't using grub-mount correctly
[12:08] Sep 3 09:48:47 lubuntu kernel: [ 315.175148] EXT4-fs (sda6): INFO: recovery required on readonly filesystem
[12:08] Sep 3 09:48:47 lubuntu kernel: [ 315.175165] EXT4-fs (sda6): write access unavailable, cannot proceed
[12:08] or if grub-mount fails, then replace/reuse recipes would also try mount -o ro, as a fallback.
[12:08] xnox, so that says you asked for a mount of a dirty partition, and we cannot do it cause it needs fsck
[12:08] Yeah, that should be removed in T (I was going to remove it from os-prober following a conversation at Debconf but hadn't got round to it yet)
[12:08] apw: Which is exactly why we use grub-mount
[12:09] The kernel cannot do the right thing here and if we're trying to ask it to we've already lost
[12:09] indeed
[12:09] We aren't supposed to hit that path
[12:09] but that shouldn't prevent from wiping that thing.
[12:09] It might if udev got involved as a result
[12:09] Or udisks
[12:09] I can absolutely imagine this bug being a side-effect of a failure to use grub-mount
[12:09] i do see on the desktop/ubiquity side where udisks kicks in and mounts stuff (in some duplicate logs)
[12:10] but udisks wouldn't explain d-i install.
[12:10] Not on its own, but there are various of these things
[12:11] See also rant in http://www.chiark.greenend.org.uk/ucgi/~cjwatson/blosxom/ubuntu/2008-04-12-desktop-automount-pain.html
[12:11] yeah, and the fact that udisks2 doesn't have disk-inhibitor and pitti had to hack one up for us.
[12:12] * xnox s/had to/end up to/
[12:13] * apw goes see if a poweroff is the trigger here
[12:34] Oho
[12:34] Reproduced
[12:34] qemu: Ubuntu amd64, Ubuntu amd64 side-by-side, Ubuntu amd64 erase disk
[12:34] No funny business
[12:35] so a double install, and is there anything mounted or swap running
[12:36] no, no
[12:37] if you have a shell and have fuser, be interested if any of the partitions show open
[12:37] nope. pretty sure it'll be racy
[12:38] cjwatson, i guess we can prove that if you attempt to change the partition table now
[12:38] and it works
[12:39] The delay is suspiciously short
[12:39] I tried again and it still failed
[12:39] But I think it's a race between removing/adding the partition and the next thing in the installer that cares
[12:39] so the partition really is open somewhere in the kernel's mind
[12:40] ok
[12:41] Running udevadm monitor in the background makes it work
[12:42] So this is totally a race between parted and a udev rule
[12:42] heh ... well that must be indicative indeed
[12:42] Oh dear.
[12:42] so i wonder if a sleep 5 after the partitioning change would be enough
[12:43] to confirm it is a udev opens and closes things a lot on change thing
[12:43] It'd have to be internal to parted
[12:43] as certainly we emit events on the partition table change
[12:43] But parted already has such abominations internally, it just needs a few more
[12:43] Ideally it'd use event cookies to avoid having to need this
[12:43] But well
[12:43] or do a udevadm settle
[12:43] It actually does
[12:44] didn't we just add extra helpers to the udev.udeb to make it "work"
[12:44] and since those are executed, they now actually do something in a non-finite time.
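The settle-then-retry idea floated above can be illustrated in shell. This is only a sketch of the pattern: the log says a real fix would have to live inside parted, and the names retry_after_settle, SETTLE_CMD and RETRY_DELAY are made up for illustration. It papers over the race by waiting rather than removing it.

```shell
#!/bin/sh
# Sketch only, not parted's code: retry a command a few times, letting the
# udev event queue drain before each attempt.
# SETTLE_CMD and RETRY_DELAY are illustrative knobs, not real settings.
: "${SETTLE_CMD:=udevadm settle}"
: "${RETRY_DELAY:=1}"

retry_after_settle() {
    attempts=$1
    shift
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Wait for pending udev events; carry on even if settle itself fails.
        $SETTLE_CMD || true
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$RETRY_DELAY"
    done
    return 1
}
```

Usage would look like `retry_after_settle 5 some-partitioning-command`, where the command is whatever step is losing the race with the udev rule.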
[12:44] My test case is in ubiquity so udeb not involved
[12:44] ok.
[12:45] Anyway, sure, random permutations will affect this, but reverting the random permutations would only change the set of cases where the bug shows up
[12:45] You don't fix races that way :)
[12:47] Perhaps a settle between the removes and the adds would do it
[12:47] Let's see how reliably I can reproduce this first
[12:47] yeah we need to know if it's 1/10 or 9/10
[12:55] apw: 5/5 at least
[12:55] oh great indeed
[13:21] xnox, have you done any 'side-by-side' installs, i am looking at the 'slider' and it seems to have no legends
[13:40] cjwatson, to confirm i have managed to repro the issue as well with the same configuration two ubuntu 13.10s side by side, and try and replace them
[13:41] I'm building a test parted now with an extra udevadm settle inserted between remove/add
[13:41] Might slow committing down somewhat but I'll take that over breaking
[13:43] yeah, it's not like it is a simple thing one is doing
[13:43] in the old days we would have had to reboot, at least we don't have to do that
[13:43] so do we know about this popup about super+space being the hot key
[13:45] Does that happen on every boot, or just reboot after upgrade?
[13:45] I don't reboot often enough to notice...
[13:46] super+space? that occurs on the boot of the installer before its install window appears
[13:48] i don't recall seeing it anywhere else
[13:50] cjwatson, yeah this is 5/5 reproducible for me, if i can help test in any way let me know
[13:51] Description:Ubuntu 13.04
[13:51] bah, wrong channel
[13:52] apw: when I get something to build ...
[14:10] cjwatson: with your oem patch:
[14:10] + userdel --force --remove oem
[14:10] userdel: user oem is currently used by process 1207
[14:10] userdel: cannot open /etc/subuid
[14:10] cjwatson: 1207 ? S 0:00 dbus-launch --exit-with-session
[14:10] from ps output.
[14:11] ok, it was a Friday night special
[14:11] so..... maybe our removal of the user is absolutely fine, it's just dbus is left around.
[14:17] tried http://paste.ubuntu.com/6236071/, no luck
=== shadeslayer_ is now known as shadeslayer
[14:39] process accounting suggests that swapoff triggers udisks-part-id and modprobe, but the udevadm settle should account for both
[14:40] I'm tempted to loop the way parted does elsewhere, even though it's an utterly foul thing to do
[14:46] cjwatson, loop trying to replace each partition or the whole shebang
[14:46] each
[14:46] it already does that for removes, just not adds
[14:48] so very true indeed how vile
[15:01] OK, need more coffee before trying to cope with parted's exception handling code
[15:35] xnox: have you seen bug 1239471?
[15:35] Launchpad bug 1239471 in ubiquity (Ubuntu) "oem user gets deleted even before "Prepare for shipping to end user"" [Undecided,New] https://launchpad.net/bugs/1239471
[15:35] bdmurray: yes.
[15:36] bdmurray: it's reopen of the previous bad fix of "oem user is not removed"
[15:36] bdmurray: i am working on it.
[15:36] xnox: great, thanks
[16:54] cjwatson: bdmurray: infinity: fix for oem user removed too early (revert previous upload) and fix removal of oem user with cjwatson's paste from yesterday, with added processes cleanup to make userdel succeed.
[16:54] https://code.launchpad.net/~xnox/ubiquity/fix-oem-user/+merge/191005
[16:54] wubi rt #65238
[16:56] xnox: any chance of using pkill instead of killall? killall makes me itch
[16:56] (it has startlingly different behaviour on some systems - granted, none we actually care about)
[16:57] not a blocker though
[16:59] cjwatson: oh, and pkill is in minimal, vs killall which is only in standard
[16:59] cjwatson: "pkill -U oem || true" ?
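The cleanup being discussed, killing leftover processes so that userdel stops failing with "currently used by process", might look roughly like this. It is a sketch and not the content of the merge proposal: remove_login_user, run and DRY_RUN are hypothetical names, and the account name "oem" is taken from the log. pkill -u matches the effective user ID and -U the real user ID, so the sketch tries both.

```shell
#!/bin/sh
# Sketch only, not the actual ubiquity change.
# DRY_RUN is an illustrative switch: when set, commands are printed
# instead of executed, so the sequence can be inspected safely.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"
    else
        "$@"
    fi
}

remove_login_user() {
    user=$1
    # pkill exits non-zero when nothing matched; that is not an error here.
    run pkill -u "$user" || true   # processes by effective uid
    run pkill -U "$user" || true   # processes by real uid
    # Same userdel invocation as quoted in the log above.
    run userdel --force --remove "$user"
}
```

Running `DRY_RUN=1; remove_login_user oem` prints the three commands without touching the system.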
[17:00] ahh the good old 'killall really does killall on some systems'
[17:00] one of my favorite unexpected behaviors ;p
[17:01] xnox: you should depend on the appropriate package either way
[17:01] xnox: I usually use -u, but I guess match whichever of ruid or euid userdel is checking for
[17:02] pkill -> Depends: procps
[17:11] cjwatson: it checks all/both.
[17:11] effective then.
[17:12] * cjwatson -> dinner
[17:14] cjwatson, i managed to take an strace off parted_server when failing, and i don't even see it attempting an add the one and only ioctl it tries is a resize
[17:17] cjwatson, and if that fails which it quite reasonably could as we are expanding partition vda1 in my case (remove two small OSs and replace with one bigger one) and vda2 has not been removed at that point so is in the way
[17:18] cjwatson: updated. I can't upload now though. And I got to go.
[17:19] cjwatson, and that error seems to be just fatal when in fact next thing we could try is removing the other partitions, i bet if one hits ignore things are fine
[17:19] cjwatson, and the bug to my eye is that _blkpg_resize_partition is fatal when it is more of an optimisation
[19:20] apw: I worked it out with psusi on #ubuntu-devel
[19:21] cjwatson, ok great
[20:11] Not sure if it's just me, but on the latest rebuild, the live session fails to start: Bug #1239833.
[20:11] Launchpad bug 1239833 in ubiquity (Ubuntu) "Live session failed to start" [Undecided,New] https://launchpad.net/bugs/1239833
[20:16] ScottK: Huh. That's.. Weird.
[20:17] infinity: Just got it on a second machine.
[20:17] (using the same usb stick I booted a live image on that same machine on Sunday)
[20:18] Yeah, I'm not assuming it's the machine's fault.
[20:18] Having a hard time seeing a way it could be the software's fault either, mind you.
[20:19] * ScottK looks for another usb.
[20:19] Also, loving the logs that got attached.
[20:19] The Ubiquity syslog from oneiric is pure class.
[20:19] (PS: Wash your screen)
[20:20] I wouldn't have even bothered with ubuntu-bug, except I was trying to get the hardware info in the bug.
[20:24] OK, now I remember why I bought that USB stick. I lost all my other ones.
[21:05] cjwatson: infinity: xnox: i've been putting ubiquity 2.15.24 through its paces today... fix seems solid, haven't found any issues. thanks!
[21:05] Phew
[21:06] cjwatson: so will that build make the ISO?
[21:06] I believe it's on the list, yeah
[21:06] awesome
[21:06] Though need to actually upload it
[21:09] cjwatson: It's uploaded.
[21:09] Oh OK
[21:09] Then it'll be fine
[21:09] Just testing another go-around at ubiquity now
[21:09] I sponsored it (after a quick fix) earlier.
[21:10] Delayed by celebrating Canadian Thanksgiving (happy thanksgiving!)
[21:10] Everyone seems to be celebrating that except me. :P
[21:10] Last couple of years my wife has decided she wants to celebrate all three thanksgiving dates (Canada, US, Liberia)
[21:10] Excuse for a nice meal :)
[21:11] cjwatson: An excuse to make turkeys?
[21:11] Indeed
[21:12] Wikipedia lists a few more Thanksgivings, if she's keen. :P
[21:12] * cjwatson tries to decide whether or not to tell her
[21:13] I might burst
[21:13] It's a good way to go.
[21:13] Plus, pumpkin pie.
[21:31] infinity: *and* tarte au sucre
[23:10] question about the ubuntuone sign in... does it actually do anything with those credentials? seems like you still have to separately sign into ubuntuone anyway, so i'm not clear what the point of it is