[08:21] <cjwatson> xnox: Don't know if you saw, but we had a tester show up overnight who can reproduce 1080701 reliably - can we make use of him later today?
[08:22] <stgraber> bug 1080701
[08:22] <ubot2> Launchpad bug 1080701 in ubiquity (Ubuntu Raring) "After 'Preparing to install Ubuntu' screen, raring installation hangs" [High,Confirmed] https://launchpad.net/bugs/1080701
[08:27] <xnox> cjwatson: noticed. will think how to best use "remote hands" =) I also want to try kentb's reproducer case. I think it might be another case of phantom LVM metadata being left over, similar to the earlier bug 154086
[08:27] <ubot2> Launchpad bug 154086 in partman-auto-lvm (Debian) "Installing to HDD with previous ubuntu fails to create fresh LVM claiming group already in use" [Unknown,New] https://launchpad.net/bugs/154086
[08:29] <cjwatson> Beware of possible multiple causes
[10:16] <jackweirdy> Hello :) I'm trying to preseed installation for 12.04 on machines which will dual boot with MS Windows for a classroom setting. I can't find any information about using/resizing existing partitions in d-i, but I've noticed ubiquity can do the "install alongside" stuff. Is that preseedable?
[10:22] <xnox> jackweirdy: look into using partman-auto/init_automatically_partition select biggest_free. That installs into the largest free space on the disk, which assumes that you e.g. pre-shrink the Windows installs.
[10:22] <jackweirdy> awesome; I'll look into that :) Presumably I could use early_command to do the shrinking itself?
[10:22] <xnox> you can try resize_use_free, but it can choke on resizing the Windows installation, and that would be harder to troubleshoot.
[10:23] <xnox> (resize_use_free should resize windows & then install into biggest free)
[10:24] <xnox> jackweirdy: yes, you can do early-command as well, I'd recommend "partman/early_command" as that should have all the partitioning/fs utilities available to you.
[10:26] <jackweirdy> awesome, thanks for that :D
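The preseed approach xnox describes could look roughly like the sketch below, written as a small shell helper that emits a preseed fragment. The helper name, the output file name and the shrink script path are hypothetical; whether ubiquity on 12.04 honours these keys under the `d-i` owner is an assumption to verify against a test machine.

```shell
# Hypothetical helper: write a preseed fragment to the file named in $1.
write_preseed_fragment() {
    out=${1:-preseed-fragment.cfg}
    cat > "$out" <<'EOF'
# Install into the largest free space instead of resizing anything.
d-i partman-auto/init_automatically_partition select biggest_free
# Runs just before partman starts; the script path is a placeholder.
d-i partman/early_command string /cdrom/preseed/shrink-windows.sh
EOF
}
```

Calling `write_preseed_fragment my.cfg` produces a two-key fragment that can be merged into a full preseed file. The shrinking itself is left to the placeholder script, since resizing Windows partitions is the part most likely to need per-site testing.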
[13:45] <dank> Tracing through bug 1171185 by inserting 'set -x; exec 2>/tmp/foo' in /bin/partman et al
[13:45] <ubot2> Launchpad bug 1171185 in ubiquity (Ubuntu) "Ubuntu installer appears to hang on "Installation von Ubuntu wird vorbereitet" screen" [Undecided,Incomplete] https://launchpad.net/bugs/1171185
[13:46] <xnox> dank: yeap, that would be handy. Also when you boot the image, edit the boot parameters to include "debug-ubiquity", that way all logs will be more verbose.
[13:47] <xnox> dank: i'm trying to reproduce it here as well based on kentb-out's comments, but i have not been successful yet.
[13:48] <dank> hanging in /lib/partman/display.d/10initial_auto, adding set -x there
[13:49] <dank> now hanging in /lib/partman/automatically_partition/15reuse/choices
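The tracing trick dank is using (adding `set -x` plus a stderr redirect near the top of each partman script) can be sketched as a helper. The function name and log path are hypothetical, and the sketch assumes the first line of each script is its shebang. It appends with `>>` rather than `>` so that several traced scripts can share one log without truncating it.

```shell
# Hypothetical helper: insert a tracing line right after the first line
# (assumed to be the shebang) of each script given after the log path.
add_trace() {
    trace_log=$1
    shift
    for script in "$@"; do
        tmp=$(mktemp) || return 1
        {
            head -n 1 "$script"
            printf 'set -x; exec 2>>%s\n' "$trace_log"
            tail -n +2 "$script"
        } > "$tmp"
        # Overwrite in place so the original mode and ownership are kept.
        cat "$tmp" > "$script"
        rm -f "$tmp"
    done
}
```

For example, `add_trace /tmp/partman-trace.log /lib/partman/automatically_partition/*/choices` would trace every choices script. Note that tracing changes timing, so a hang that depends on a race may stop reproducing.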
[13:49] <xnox> dank: it would also be interesting to know whether executing "os-prober" hangs, and which operating systems are installed.
[13:50] <xnox> dank: what about output of "mount" to see what's mounted where?
[13:50] <xnox> (i guess the fact that grub-mount uses mount namespaces will not help much)
[13:50] <dank> os-prober does not hang
[13:52] <dank> mount sez:
[13:52] <dank> /cow on / type overlayfs (rw)
[13:52] <dank> proc on /proc type proc (rw,noexec,nosuid,nodev)
[13:52] <dank> sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
[13:52] <dank> udev on /dev type devtmpfs (rw,mode=0755)
[13:52] <dank> devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
[13:52] <dank> tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
[13:52] <dank> /dev/sr0 on /cdrom type iso9660 (ro,noatime)
[13:52] <dank> /dev/loop0 on /rofs type squashfs (ro,noatime)
[13:52] <dank> none on /sys/fs/cgroup type tmpfs (rw)
[13:52] <dank> none on /sys/fs/fuse/connections type fusectl (rw)
[13:52] <dank> none on /sys/kernel/debug type debugfs (rw)
[13:52] <dank> none on /sys/kernel/security type securityfs (rw)
[13:52] <dank> tmpfs on /tmp type tmpfs (rw,nosuid,nodev)
[13:52] <dank> none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
[13:52] <dank> none on /run/shm type tmpfs (rw,nosuid,nodev)
[13:52] <dank> none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
[13:52] <dank> tmpfs on /var/lib/polkit-1/localauthority/90-mandatory.d type tmpfs (rw)
[13:52] <dank> hmm, why does chatzilla think /cow should be italics?  :-)
[13:53] <xnox> right, nothing unusual. / is this italics for you? /
[13:53] <dank> yes
[13:53] <xnox> =)
[13:53] <xnox> dank: can you give me output of os-prober?
[13:53] <xnox> dank: and "parted -l"
[13:53] <dank> wait for it
[13:55] <dank> hmm, how does one edit boot parameters?  f6 gives only a small set of choices.
[13:55] <dank> oh
[13:56] <xnox> dank: also I'm now thinking a tarball of /var/lib/partman would be most useful to debug why reuse is hanging.
[13:56] <dank> before triggering the problem?
[13:58] <xnox> dank: well it will not be populated yet. so get to at least the point where partman has started and /var/lib/partman/devices has been populated with folders.
[13:58] <dank> os-prober outputs
[13:59] <dank> grr
[13:59] <xnox> and then create a tarball of /var/lib/partman/devices and email it to me or attach to a bug report.
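Creating the tarball xnox asks for can be sketched as follows. The function name and default output path are hypothetical. The check for the `devices` directory reflects the point made above that it only exists once partman has started.

```shell
# Hypothetical helper: archive <parent>/devices into a gzip'd tarball.
collect_partman_state() {
    src_parent=${1:-/var/lib/partman}
    out=${2:-/tmp/partman-devices.tar.gz}
    if [ ! -d "$src_parent/devices" ]; then
        echo "no devices directory under $src_parent; has partman started?" >&2
        return 1
    fi
    # -C keeps the paths inside the archive relative ("devices/...").
    tar -czf "$out" -C "$src_parent" devices
}
```

Run with no arguments on the live session, it writes `/tmp/partman-devices.tar.gz`, which can then be emailed or attached to the bug alongside the `os-prober` and `parted -l` output.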
[13:59] <dank> "/dev/sda1:Ubuntu 13.04 (13.04):Ubuntu:linux"
[13:59] <dank> what's your email?
[13:59] <xnox> dank: my-irc-nick@ubuntu.com
[14:00] <xnox> dank: or see PM
[14:03] <dank> parted -l says:
[14:03] <dank> Model: ATA WDC WD5000AACS-0 (scsi)
[14:03] <dank> Disk /dev/sda: 500GB
[14:03] <dank> Sector size (logical/physical): 512B/512B
[14:03] <dank> Partition Table: msdos
[14:03] <dank> Number  Start   End     Size    Type      File system     Flags
[14:03] <dank>  1      1049kB  14.0GB  14.0GB  primary   ext4            boot
[14:03] <dank>  2      14.0GB  22.0GB  8000MB  primary   linux-swap(v1)
[14:03] <dank>  3      22.0GB  500GB   478GB   extended
[14:03] <dank>  5      22.0GB  500GB   478GB   logical   ext4
[14:05] <dank> ok, I added set -x to /lib/partman/automatically_partition/*/choices
[14:05] <xnox> dank: ok. Let me send you a couple of files and instructions on what to try. One moment please.
[14:06] <dank> and.... this time it didn't hang.
[14:07] <xnox> dank: ok. still I'd like to send you something.
[14:07] <dank> ok
[14:09] <dank> maybe debug-ubiquity makes it not hang
[14:11] <xnox> dank: that's possible =) in that case reboot "normal", replace the reuse & replace choices with my patched scripts from people.canonical.com/~xnox/reuse and people.canonical.com/~xnox/replace
[14:11] <xnox> e.g.:
[14:12] <xnox> wget -O /lib/partman/automatically_partition/15reuse/choices http://people.canonical.com/~xnox/reuse/choices
[14:12] <xnox> wget -O /lib/partman/automatically_partition/25replace/choices http://people.canonical.com/~xnox/replace/choices
[14:12] <xnox> chmod +x /lib/partman/automatically_partition/*/choices
[14:16] <xnox> http://paste.ubuntu.com/5592667/
[14:17] <xnox> my current guess is that grub-mount rightfully and gracefully exits with non-zero, but ro-mount subsequently chokes up completely =)
[14:17] <xnox> I guess it would be useful to redirect grub-mount and mount output somewhere useful from above.....
[14:17] <xnox> not sure where to though.
[14:21] <dank> I did the usual set -x and redirect inside partman.  Now to run...
[14:22] <dank> no hang
[14:22] <dank> want the log?
[14:22] <xnox> dank: yeah.
[14:22] <xnox> /var/log/partman /var/log/syslog
[14:23] <xnox> and the output from set -x, if it's not in /var/log/syslog (should be)
[14:27] <xnox> dank: if you install pastebinit package you can simply do " cat foo | pastebinit"
[14:27] <dank> sent
[14:28] <dank> oh, you kids, get off my lawn with this pastebin stuff :-)
[14:28] <dank> while you look at that I may do one more run with plain old files and judicious set -x to see if I can catch it hanging again
[14:30] <xnox> awesome =)
[14:37] <dank> it hung
[14:37] <dank> I'll send you another batch of logs
[14:41] <dank> sent
[14:42] <dank> fun fact: ps shows 15reuse/choices still running, in S state
[14:45] <xnox> thanks. Now that it is hanging, can you try with the replaced choices as above ^^^^ ? But please reboot first. Once it's hanging, there is no clean way to go back to the original state, and most likely it will just work if you kill partman/choices and restart the installer instead of rebooting.
[14:46] <cjwatson> It's *usually* sufficient to kill all ubiquity/partman/parted* processes and rm -rf /var/lib/partman
[14:46] <cjwatson> But it requires some care and it's probably best to avoid introducing new variables while debugging
[14:49] <xnox> cjwatson: but that won't clean up anything which is already mounted by e.g. the reuse/replace recipes, which at this point may or may not have mounted something.
[14:49] <cjwatson> True
[14:49] <xnox> i guess one can also go and do $ ls /dev/sd* | xargs -L 1 umount
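A slightly more targeted variant of that cleanup is to unmount only what is actually mounted from `/dev/sd*`, by reading the mount table. This is a sketch only: the function name is hypothetical, and mount points containing spaces are not handled, since `/proc/mounts` escapes them as `\040`.

```shell
# Hypothetical helper: print the mount point of every /dev/sd* device
# listed in a mounts table (defaults to /proc/mounts).
list_sd_mountpoints() {
    mounts_file=${1:-/proc/mounts}
    awk '$1 ~ /^\/dev\/sd/ { print $2 }' "$mounts_file"
}
```

Something like `list_sd_mountpoints | while read -r mp; do umount "$mp"; done` would then do the unmounting. As cjwatson says, rebooting is still the safer way to get back to a known state while debugging.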
[14:51] <xnox> Argh "Apr 22 14:35:47 ubuntu rsyslogd-2177: imuxsock begins to drop messages from pid 4575 due to rate-limiting"
[14:52] <xnox> cjwatson: have you considered making the limits basically unlimited for partman/ubiquity, just in case one is trying to debug it =)
[14:55] <dank> hmm.  How would I disable rate limiting for this run?
[14:56] <dank> maybe I'll just redirect stderr.
[14:56] <xnox> dank:
[14:56] <xnox> $SystemLogRateLimitInterval 0
[14:56] <xnox> $SystemLogRateLimitBurst 0
[14:56] <xnox> in
[14:57] <xnox> /etc/rsyslog.conf
[14:57] <xnox> and then restart it.
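The two directives above can be added with a small idempotent helper. The function name is hypothetical. The sketch assumes the directives take effect when appended at the end of the file, i.e. after the line that loads imuxsock.

```shell
# Hypothetical helper: append the two rate-limit directives to an
# rsyslog config file unless they are already present verbatim.
disable_rate_limit() {
    conf=${1:-/etc/rsyslog.conf}
    for directive in '$SystemLogRateLimitInterval 0' '$SystemLogRateLimitBurst 0'; do
        grep -qxF -- "$directive" "$conf" || printf '%s\n' "$directive" >> "$conf"
    done
}
```

After running `disable_rate_limit` as root, rsyslog still needs a restart (for example `service rsyslog restart`) before the change applies.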
[14:59] <dank> k
[14:59] <dank> I've swapped out the two files, marked them executable, turned off rate limiting.  Here we go...
[15:08] <dank> sent.  Included output of ps augxw and lsof in files in /tmp
[15:08] <dank> and now I'm turning into a pumpkin.  I can run more tests for you tomorrow.
[15:08] <xnox> dank: did it still hang with my swapped in files?
[15:08] <xnox> dank: thanks a lot for your help!
[15:09] <xnox> well i guess i should see that in the logs...
[15:09] <xnox> it did, sigh.
[15:39] <cjwatson> xnox: It might be a plan to unlimit logging in casper, perhaps
[15:40] <cjwatson> xnox: Are you going to be in the London office tomorrow?
[15:40] <xnox> cjwatson: yeah =)
[15:40] <cjwatson> Excellent, if all else fails we can pair-debug it there
[20:39] <plars> dank: still around?
[20:49] <slangasek> xnox: hi, so I see from logs that dank was able to send you some interesting debug output
[20:50] <slangasek> xnox: does this get us closer to having a reproducer?  plars is trying to reproduce the bug, so maybe if the debug output is interesting you could share it with him
[20:51] <plars> yes, would love to find a way to reliably reproduce this
[21:12] <xnox> plars: there wasn't much useful in it. My current hypothesis is that when ext4 needs to recover its journal, or otherwise needs rw access to the hard drive, replace/reuse will hang. My other idea is to axe reuse/replace and simply use os-prober instead of that code.
[21:14] <xnox> Similarly, dirty / hibernated / not-cleanly mounted ntfs partitions could cause the same. But maybe not as much after the top crasher got fixed for it (bug 1019806)
[21:14] <ubot2> Launchpad bug 1019806 in ntfs-3g (Ubuntu) "ntfs-3g crashed with SIGABRT in get_node()" [Medium,Confirmed] https://launchpad.net/bugs/1019806
[21:15] <xnox> I will run these past cjwatson tomorrow, and see what he thinks.
[21:16] <xnox> I think my other HDD got into that state once, and I had to wipe it clean. It was: Intel RAID metadata -> full-disk LVM -> LVM volumes used for VMs, and thus containing (full-disk Ubuntu installs, nested LVM installs, etc.). But (a) I wouldn't want to reproduce such a setup, and (b) I'm not sure we want to support anything like that.
[21:17] <xnox> dank reported that there are no errors from os-prober, it just works fine.
[21:17] <slangasek> xnox: well, the hope is that plars would be able to get you a reproducer for this bug before tomorrow
[21:19] <xnox> slangasek: my next step is to remaster a ubiquity cd which defaults to full debugging, with set -x in partman and rate-limiting disabled, to hopefully lower the "debugging" skills needed by those who hit this bug.
[21:42] <GrueMaster> xnox: Wouldn't it be quicker/easier to modify the files in an already created cd image than remaster it?
[21:43] <xnox> GrueMaster: sure. are you affected by 1080701 ?
[21:43] <xnox> bug 1080701
[21:43] <ubot2> Launchpad bug 1080701 in ubiquity (Ubuntu Raring) "After 'Preparing to install Ubuntu' screen, raring installation hangs" [Critical,Confirmed] https://launchpad.net/bugs/1080701
[21:44] <xnox> so far I can't think of anything more than https://wiki.ubuntu.com/DebuggingUbiquity#Deeper_debugging_of_partman to help us find the problem
[21:44] <GrueMaster> No, I stopped testing the live images after 12.04.
[21:44] <GrueMaster> enotime.
[21:44] <xnox> I understand.
[21:45] <GrueMaster> But I used to do this almost weekly when I was doing Arm QA.
[21:45] <GrueMaster> And before you say that is different, I used to verify issues I found on x86/amd64.
[21:46] <GrueMaster> :)
[21:53] <GrueMaster> Ubiquity is one of the ugliest programs to debug that I have seen. The kernel parameters for adding debug output are largely ignored, from what I have seen.