[08:21] xnox: Don't know if you saw, but we had a tester show up overnight who can reproduce 1080701 reliably - can we make use of him later today?
[08:22] bug 1080701
[08:22] Launchpad bug 1080701 in ubiquity (Ubuntu Raring) "After 'Preparing to install Ubuntu' screen, raring installation hangs" [High,Confirmed] https://launchpad.net/bugs/1080701
[08:27] cjwatson: noticed. will think how to best use "remote hands" =) I also want to try kentb's reproducer case. I think it might be another case of phantom lvm metadata getting left over. Something similar to the previous bug 154086
[08:27] Launchpad bug 154086 in partman-auto-lvm (Debian) "Installing to HDD with previous ubuntu fails to create fresh LVM claiming group already in use" [Unknown,New] https://launchpad.net/bugs/154086
[08:29] Beware of possible multiple causes
[10:16] Hello :) I'm trying to preseed installation for 12.04 on machines which will dual boot with MS Windows for a classroom setting. I can't find any information about using/resizing existing partitions in d-i, but I've noticed ubiquity can do the "install alongside" stuff. Is that preseedable?
[10:22] jackweirdy: look into using partman-auto/init_automatically_partition select biggest_free. That will install into the largest free disk space. That assumes that e.g. you pre-shrink the Windows installs.
[10:22] awesome; I'll look into that :) Presumably I could use early_command to do the shrinking itself?
[10:22] you can try resize_use_free, but it can choke on resizing a Windows installation, and that would be harder to troubleshoot.
[10:23] (resize_use_free should resize Windows & then install into biggest free)
[10:24] jackweirdy: yes, you can do early-command as well, I'd recommend "partman/early_command" as that should have all the partitioning/fs utilities available to you.
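The preseed advice above (biggest_free plus partman/early_command) could be sketched roughly as follows. This is only an illustration, not a tested classroom setup: the output path, the file name and the placeholder echo command are assumptions, and the actual shrink step for the Windows partition is deliberately left out.

```shell
#!/bin/sh
# Write a hypothetical preseed fragment for the dual-boot case discussed above.
# The file name and location are assumptions made for this sketch.
out="${TMPDIR:-/tmp}/dualboot-preseed.cfg"

cat > "$out" <<'EOF'
# Install into the largest block of free space
# (assumes the Windows partition has already been shrunk).
d-i partman-auto/init_automatically_partition select biggest_free
# Placeholder hook, run before partitioning starts; replace the echo
# with whatever shrink step suits the machines in question.
d-i partman/early_command string echo "shrink the Windows partition here"
EOF

echo "wrote $out"
```

The fragment would then be merged into the full preseed file used for the classroom machines.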
[10:26] awesome, thanks for that :D
[13:45] Tracing through bug 1171185 by inserting 'set -x; exec 2>/tmp/foo' in /bin/partman et al
[13:45] Launchpad bug 1171185 in ubiquity (Ubuntu) "Ubuntu installer appears to hang on "Installation von Ubuntu wird vorbereitet" screen" [Undecided,Incomplete] https://launchpad.net/bugs/1171185
[13:46] dank: yep, that would be handy. Also when you boot the image, edit the boot parameters to have "debug-ubiquity" in them, that way all logs will be more verbose.
[13:47] dank: i'm trying to reproduce it here as well based on kentb-out's comments, but i have not been successful yet.
[13:48] hanging in /lib/partman/display.d/10initial_auto, adding set -x there
[13:49] now hanging in /lib/partman/automatically_partition/15reuse/choices
[13:49] dank: also interesting whether executing "os-prober" hangs. And what set of operating systems is installed.
[13:50] dank: what about the output of "mount" to see what's mounted where?
[13:50] (i guess the fact that grub-mount uses mount namespaces will not help much)
[13:50] os-prober does not hang
[13:52] mount sez:
[13:52] /cow on / type overlayfs (rw)
[13:52] proc on /proc type proc (rw,noexec,nosuid,nodev)
[13:52] sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
[13:52] udev on /dev type devtmpfs (rw,mode=0755)
[13:52] devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
[13:52] tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
[13:52] /dev/sr0 on /cdrom type iso9660 (ro,noatime)
[13:52] /dev/loop0 on /rofs type squashfs (ro,noatime)
[13:52] none on /sys/fs/cgroup type tmpfs (rw)
[13:52] none on /sys/fs/fuse/connections type fusectl (rw)
[13:52] none on /sys/kernel/debug type debugfs (rw)
[13:52] none on /sys/kernel/security type securityfs (rw)
[13:52] tmpfs on /tmp type tmpfs (rw,nosuid,nodev)
[13:52] none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
[13:52] none on /run/shm type tmpfs (rw,nosuid,nodev)
[13:52] none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
[13:52] tmpfs on /var/lib/polkit-1/localauthority/90-mandatory.d type tmpfs (rw)
[13:52] hmm, why does chatzilla think /cow should be italics? :-)
[13:53] right, nothing unusual. / is this italics for you? /
[13:53] yes
[13:53] =)
[13:53] dank: can you give me the output of os-prober?
[13:53] dank: and "parted -l"
[13:53] wait for it
[13:55] hmm, how does one edit boot parameters? f6 gives only a small set of choices.
[13:55] oh
[13:56] dank: also I'm now thinking a tarball of /var/lib/partman would be most useful to debug why reuse is hanging.
[13:56] before triggering the problem?
[13:58] dank: well it will not be populated yet. so get to at least the point where partman has started and /var/lib/partman/devices got populated with folders.
[13:58] os-prober outputs
[13:59] grr
[13:59] and then create a tarball of /var/lib/partman/devices and email it to me or attach it to a bug report.
[13:59] "/dev/sda1:Ubuntu 13.04 (13.04):Ubuntu:linux"
[13:59] what's your email?
=== kentb-out is now known as kentb
[13:59] dank: my-irc-nick@ubuntu.com
[14:00] dank: or see PM
[14:03] parted -l says:
[14:03] Model: ATA WDC WD5000AACS-0 (scsi)
[14:03] Disk /dev/sda: 500GB
[14:03] Sector size (logical/physical): 512B/512B
[14:03] Partition Table: msdos
[14:03] Number Start End Size Type File system Flags
[14:03] 1 1049kB 14.0GB 14.0GB primary ext4 boot
[14:03] 2 14.0GB 22.0GB 8000MB primary linux-swap(v1)
[14:03] 3 22.0GB 500GB 478GB extended
[14:03] 5 22.0GB 500GB 478GB logical ext4
[14:05] ok, I added set -x to /lib/partman/automatically_partition/*/choices
[14:05] dank: ok. Let me send you a couple of files and instructions on what to try. One moment please.
[14:06] and.... this time it didn't hang.
[14:07] dank: ok. still I'd like to send you something.
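The 'set -x' plus 'exec 2>file' trick used above can be tried in isolation with a small sketch like this one. It is a stand-alone demo, not one of the partman scripts; the log path and the two placeholder steps are assumptions made for illustration.

```shell
#!/bin/sh
# Demo of the tracing trick: send the shell's xtrace output to a file,
# so the last line of that file names the last command that was started.
trace_log="${TMPDIR:-/tmp}/partman-trace-demo.log"

(
    exec 2>"$trace_log"   # stderr, including the trace, now goes to the file
    set -x                # print each command before it runs
    echo "step one" >/dev/null
    echo "step two" >/dev/null
)

# If a real script were hanging, this would show the command it is stuck on.
tail -n 1 "$trace_log"
```

The subshell keeps the redirection and the tracing from leaking into the rest of the script; in a partman script the two lines would simply go near the top, as dank describes.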
[14:07] ok
[14:09] maybe debug-ubiquity makes it not hang
[14:11] dank: that's possible =) in that case reboot "normal", replace the reuse & replace choices with my patched scripts from people.canonical.com/~xnox/reuse and people.canonical.com/~xnox/replace
[14:11] e.g.:
[14:12] wget -O /lib/partman/automatically_partition/15reuse/choices http://people.canonical.com/~xnox/reuse/choices
[14:12] wget -O /lib/partman/automatically_partition/25replace/choices http://people.canonical.com/~xnox/replace/choices
[14:12] chmod +x /lib/partman/automatically_partition/*/choices
[14:16] http://paste.ubuntu.com/5592667/
[14:17] this is my current guess, that grub-mount rightfully gracefully exits with non-zero, but ro-mount subsequently chokes up completely =)
[14:17] I guess it would be useful to redirect grub-mount and mount output somewhere useful from above.....
[14:17] not sure where to though.
[14:21] I did the usual set -x and redirect inside partman. Now to run...
[14:22] no hang
[14:22] want the log?
[14:22] dank: yeah.
[14:22] /var/log/partman /var/log/syslog
[14:23] and the output from set -x, if it's not in /var/log/syslog (should be)
[14:27] dank: if you install the pastebinit package you can simply do " cat foo | pastebinit"
[14:27] sent
[14:28] oh, you kids, get off my lawn with this pastebin stuff :-)
[14:28] while you look at that I may do one more run with plain old files and judicious set -x to see if I can catch it hanging again
[14:30] awesome =)
[14:37] it hung
[14:37] I'll send you another batch of logs
[14:41] sent
[14:42] fun fact: ps shows 15reuse/choices still running, in S state
[14:45] thanks. Now that it is hanging, can you try with the replaced choices as above ^^^^ ? but please reboot. Once it's hanging, there is no clean way to go back to the original state, and most likely it will work if you kill partman/choices and restart the installer instead of rebooting.
[14:46] It's *usually* sufficient to kill all ubiquity/partman/parted* processes and rm -rf /var/lib/partman
[14:46] But it requires some care and it's probably best to avoid introducing new variables while debugging
[14:49] cjwatson: but that won't clean up anything which is already mounted by e.g. reuse/replace recipes which at this point may or may not have mounted something.
[14:49] True
[14:49] i guess one can also go and do $ ls /dev/sd* | xargs -L 1 umount
[14:51] Argh "Apr 22 14:35:47 ubuntu rsyslogd-2177: imuxsock begins to drop messages from pid 4575 due to rate-limiting"
[14:52] cjwatson: have you considered making the limits basically unlimited for partman/ubiquity, just in case one is trying to debug it =)
[14:55] hmm. How would I disable rate limiting for this run?
[14:56] maybe I'll just redirect stderr.
[14:56] dank:
[14:56] $SystemLogRateLimitInterval 0
[14:56] $SystemLogRateLimitBurst 0
[14:56] in
[14:57] /etc/rsyslog.conf
[14:57] and then restart it.
[14:59] k
[14:59] I've swapped out the two files, marked them executable, turned off rate limiting. Here we go...
[15:08] sent. Included output of ps augxw and lsof in files in /tmp
[15:08] and now I'm turning into a pumpkin. I can run more tests for you tomorrow.
[15:08] dank: did it still hang with my swapped in files?
[15:08] dank: thanks a lot for your help!
[15:09] well i guess i should see that in the logs...
[15:09] it did, sigh.
[15:39] xnox: It might be a plan to unlimit logging in casper, perhaps
[15:40] xnox: Are you going to be in the London office tomorrow?
[15:40] cjwatson: yeah =)
[15:40] Excellent, if all else fails we can pair-debug it there
[20:39] dank: still around?
[20:49] xnox: hi, so I see from logs that dank was able to send you some interesting debug output
[20:50] xnox: does this get us closer to having a reproducer? plars is trying to reproduce the bug, so maybe if the debug output is interesting you could share it with him
[20:51] yes, would love to find a way to reliably reproduce this
[21:12] plars: there wasn't much useful in it. my current hypothesis is that when ext4 needs to recover its journal or otherwise needs rw access to the hard drive, replace/reuse will hang. My other idea is to axe reuse/replace and simply use os-prober instead of that code.
[21:14] similarly dirty / hibernated / not-cleanly unmounted ntfs partitions could cause the same. But maybe not as much after the top crasher got fixed for it (bug 1019806)
[21:14] Launchpad bug 1019806 in ntfs-3g (Ubuntu) "ntfs-3g crashed with SIGABRT in get_node()" [Medium,Confirmed] https://launchpad.net/bugs/1019806
[21:15] I will run these past cjwatson tomorrow, and see what he thinks.
[21:16] I think my other hdd had that once, where I had to wipe it clean. It was: intel raid metadata -> full-disk lvm -> lvm volumes used for VMs and thus having -> (full disk ubuntu install, nested lvm installs, etc) But (a) I wouldn't want to reproduce such a setup (b) not sure we want to support anything like that.
[21:17] dank reported that there are no errors from os-prober, it just works fine.
[21:17] xnox: well, the hope is that plars would be able to get you a reproducer for this bug before tomorrow
[21:19] slangasek: my next step is to remaster a ubiquity cd which defaults to full debugging, set -x in partman and rate-limiting disabled, to hopefully lower the "debugging" skills required of those who hit this bug.
[21:42] xnox: Wouldn't it be quicker/easier to modify the files in an already created cd image than remaster it?
[21:43] GrueMaster: sure. are you affected by 1080701 ?
[21:43] bug 1080701
[21:43] Launchpad bug 1080701 in ubiquity (Ubuntu Raring) "After 'Preparing to install Ubuntu' screen, raring installation hangs" [Critical,Confirmed] https://launchpad.net/bugs/1080701
[21:44] so far I can't think of anything more than https://wiki.ubuntu.com/DebuggingUbiquity#Deeper_debugging_of_partman to help us find the problem
[21:44] No, I stopped testing the live images after 12.04.
[21:44] enotime.
[21:44] I understand.
[21:45] But I used to do this almost weekly when I was doing Arm QA.
[21:45] And before you say that is different, I used to verify issues I found on x86/amd64.
[21:46] :)
[21:53] Ubiquity is one of the ugliest programs I have seen to try to debug. the kernel parameters for adding debug output are largely ignored from what I have seen.
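For anyone else hitting this hang, the files requested earlier in the session (/var/lib/partman/devices, /var/log/partman, /var/log/syslog) could be bundled with a small helper along these lines. This is a sketch only: collect_logs and the archive name are hypothetical, and it assumes none of the paths contain spaces.

```shell
#!/bin/sh
# collect_logs ARCHIVE PATH...
# Bundle whichever of the given paths exist into one gzip-compressed tarball.
# (Hypothetical helper; not part of ubiquity or partman.)
collect_logs() {
    archive="$1"
    shift
    existing=""
    for p in "$@"; do
        # Skip paths that are missing, e.g. if partman has not started yet.
        [ -e "$p" ] && existing="$existing $p"
    done
    if [ -z "$existing" ]; then
        echo "nothing to collect" >&2
        return 1
    fi
    # Word splitting of $existing is intentional here.
    tar -czf "$archive" $existing
}

# Example call, to be run once partman has started and populated its state:
# collect_logs /tmp/partman-debug.tar.gz \
#     /var/lib/partman/devices /var/log/partman /var/log/syslog
```

The resulting tarball can then be emailed or attached to the bug report, as suggested in the session.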