cjwatson | xnox: Don't know if you saw, but we had a tester show up overnight who can reproduce 1080701 reliably - can we make use of him later today? | 08:21 |
---|---|---|
stgraber | bug 1080701 | 08:22 |
ubot2 | Launchpad bug 1080701 in ubiquity (Ubuntu Raring) "After 'Preparing to install Ubuntu' screen, raring installation hangs" [High,Confirmed] https://launchpad.net/bugs/1080701 | 08:22 |
xnox | cjwatson: noticed. will think how to best use "remote hands" =) I also want to try kentb's reproducer case. I think it might be another case of phantom lvm metadata getting left-over. Something similar to previous one bug 154086 | 08:27 |
ubot2 | Launchpad bug 154086 in partman-auto-lvm (Debian) "Installing to HDD with previous ubuntu fails to create fresh LVM claiming group already in use" [Unknown,New] https://launchpad.net/bugs/154086 | 08:27 |
cjwatson | Beware of possible multiple causes | 08:29 |
jackweirdy | Hello :) I'm trying to preseed installation for 12.04 on machines which will dual boot with MS Windows for a classroom setting. I can't find any information about using/resizing existing partitions in d-i, but I've noticed ubiquity can do the "install alongside" stuff. Is that preseedable? | 10:16 |
xnox | jackweirdy: look into using partman-auto/init_automatically_partition select biggest_free. That will install into largest free disk-space. That assumes that e.g. you pre-shrink windows installs. | 10:22 |
jackweirdy | awesome; I'll look into that :) Presumably I could use early_command to do the shrinking itself? | 10:22 |
xnox | you can try resize_use_free, but it can choke on resizing the Windows installation, and that would be harder to troubleshoot. | 10:22 |
xnox | (resize_use_free should resize windows & then install into biggest free) | 10:23 |
xnox | jackweirdy: yes, you can do early-command as well, I'd recommend "partman/early_command" as that should have all the partitioning/fs utilities available to you. | 10:24 |
jackweirdy | awesome, thanks for that :D | 10:26 |
dank | Tracing through bug 1171185 by inserting 'set -x; exec 2>/tmp/foo' in /bin/partman et al | 13:45 |
ubot2 | Launchpad bug 1171185 in ubiquity (Ubuntu) "Ubuntu installer appears to hang on "Installation von Ubuntu wird vorbereitet" screen" [Undecided,Incomplete] https://launchpad.net/bugs/1171185 | 13:45 |
xnox | dank: yeap, that would be handy. Also when you boot the image, edit the boot parameter to have "debug-ubiquity" in it, that way all logs will be more verbose. | 13:46 |
xnox | dank: i'm trying to reproduce it here as well based on kentb-out comments, but i have not been successful yet. | 13:47 |
dank | hanging in /lib/partman/display.d/10initial_auto, adding set -x there | 13:48 |
dank | now hanging in /lib/partman/automatically_partition/15reuse/choices | 13:49 |
xnox | dank: also interesting if executing "os-prober" hangs. And what sort of set of operating systems are installed. | 13:49 |
xnox | dank: what about output of "mount" to see what's mounted where? | 13:50 |
xnox | (i guess the fact that grub-mount uses mount namespaces will not help much) | 13:50 |
dank | os-prober does not hang | 13:50 |
dank | mount sez: | 13:52 |
dank | /cow on / type overlayfs (rw) | 13:52 |
dank | proc on /proc type proc (rw,noexec,nosuid,nodev) | 13:52 |
dank | sysfs on /sys type sysfs (rw,noexec,nosuid,nodev) | 13:52 |
dank | udev on /dev type devtmpfs (rw,mode=0755) | 13:52 |
dank | devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620) | 13:52 |
dank | tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755) | 13:52 |
dank | /dev/sr0 on /cdrom type iso9660 (ro,noatime) | 13:52 |
dank | /dev/loop0 on /rofs type squashfs (ro,noatime) | 13:52 |
dank | none on /sys/fs/cgroup type tmpfs (rw) | 13:52 |
dank | none on /sys/fs/fuse/connections type fusectl (rw) | 13:52 |
dank | none on /sys/kernel/debug type debugfs (rw) | 13:52 |
dank | none on /sys/kernel/security type securityfs (rw) | 13:52 |
dank | tmpfs on /tmp type tmpfs (rw,nosuid,nodev) | 13:52 |
dank | none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880) | 13:52 |
dank | none on /run/shm type tmpfs (rw,nosuid,nodev) | 13:52 |
dank | none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755) | 13:52 |
dank | tmpfs on /var/lib/polkit-1/localauthority/90-mandatory.d type tmpfs (rw) | 13:52 |
dank | hmm, why does chatzilla think /cow should be italics? :-) | 13:52 |
xnox | right, nothing unusual. / is this italics for you? / | 13:53 |
dank | yes | 13:53 |
xnox | =) | 13:53 |
xnox | dank: can you give me output of os-prober? | 13:53 |
xnox | dank: and "parted -l" | 13:53 |
dank | wait for it | 13:53 |
dank | hmm, how does one edit boot parameters? f6 gives only a small set of choices. | 13:55 |
dank | oh | 13:55 |
xnox | dank: also I'm now thinking a tarball of /var/lib/partman would be most useful to debug why reuse is hanging. | 13:56 |
dank | before triggering the problem? | 13:56 |
xnox | dank: well it will not be populated yet. so get to at least the point where partman has started and /var/lib/partman/devices got populated with folders. | 13:58 |
dank | os-prober outputs | 13:58 |
dank | grr | 13:59 |
xnox | and then create a tarball of /var/lib/partman/devices and email it to me or attach to a bug report. | 13:59 |
dank | "/dev/sda1:Ubuntu 13.04 (13.04):Ubuntu:linux" | 13:59 |
dank | what's your email? | 13:59 |
=== kentb-out is now known as kentb | ||
xnox | dank: my-irc-nick@ubuntu.com | 13:59 |
xnox | dank: or see PM | 14:00 |
dank | parted -l says: | 14:03 |
dank | Model: ATA WDC WD5000AACS-0 (scsi) | 14:03 |
dank | Disk /dev/sda: 500GB | 14:03 |
dank | Sector size (logical/physical): 512B/512B | 14:03 |
dank | Partition Table: msdos | 14:03 |
dank | Number Start End Size Type File system Flags | 14:03 |
dank | 1 1049kB 14.0GB 14.0GB primary ext4 boot | 14:03 |
dank | 2 14.0GB 22.0GB 8000MB primary linux-swap(v1) | 14:03 |
dank | 3 22.0GB 500GB 478GB extended | 14:03 |
dank | 5 22.0GB 500GB 478GB logical ext4 | 14:03 |
dank | ok, I added set -x to /lib/partman/automatically_partition/*/choices | 14:05 |
xnox | dank: ok. Let me send you a couple of files and instructions on what to try. One moment please. | 14:05 |
dank | and.... this time it didn't hang. | 14:06 |
xnox | dank: ok. still I'd like to send you something. | 14:07 |
dank | ok | 14:07 |
dank | maybe debug-ubiquity makes it not hang | 14:09 |
xnox | dank: that's possible =) in that case reboot "normal", replace the reuse & replace choices with my patched scripts from people.canonical.com/~xnox/reuse and people.canonical.com/~xnox/replace | 14:11 |
xnox | e.g.: | 14:11 |
xnox | wget -O /lib/partman/automatically_partition/15reuse/choices http://people.canonical.com/~xnox/reuse/choices | 14:12 |
xnox | wget -O /lib/partman/automatically_partition/25replace/choices http://people.canonical.com/~xnox/replace/choices | 14:12 |
xnox | chmod +x /lib/partman/automatically_partition/*/choices | 14:12 |
xnox | http://paste.ubuntu.com/5592667/ | 14:16 |
xnox | this is my current guess, that grub-mount rightfully gracefully exits with non-zero, but ro-mount subsequently chokes up completely =) | 14:17 |
xnox | I guess it would be useful to redirect grub-mount and mount output somewhere useful from above..... | 14:17 |
xnox | not sure where to though. | 14:17 |
dank | I did the usual set -x and redirect inside partman. Now to run... | 14:21 |
dank | no hang | 14:22 |
dank | want the log? | 14:22 |
xnox | dank: yeah. | 14:22 |
xnox | /var/log/partman /var/log/syslog | 14:22 |
xnox | and the output from set -x, if it's not in /var/log/syslog (should be) | 14:23 |
xnox | dank: if you install pastebinit package you can simply do " cat foo | pastebinit" | 14:27 |
dank | sent | 14:27 |
dank | oh, you kids, get off my lawn with this pastebin stuff :-) | 14:28 |
dank | while you look at that I may do one more run with plain old files and judicious set -x to see if I can catch it hanging again | 14:28 |
xnox | awesome =) | 14:30 |
dank | it hung | 14:37 |
dank | I'll send you another batch of logs | 14:37 |
dank | sent | 14:41 |
dank | fun fact: ps shows 15reuse/choices still running, in S state | 14:42 |
xnox | thanks. Now, that it is hanging, can you try with replaced choices as above ^^^^ ? but please reboot. Once it's hanging, there is no clean way to go back to original state, and most likely it will work if you kill partman/choices and restart the installer instead of rebooting. | 14:45 |
cjwatson | It's *usually* sufficient to kill all ubiquity/partman/parted* processes and rm -rf /var/lib/partman | 14:46 |
cjwatson | But it requires some care and it's probably best to avoid introducing new variables while debugging | 14:46 |
xnox | cjwatson: but that won't cleanup anything which is already mounted by e.g. reuse/replace recipes which at this point may or may not have mounted something. | 14:49 |
cjwatson | True | 14:49 |
xnox | i guess one can also go and do $ ls /dev/sd* | xargs -L 1 umount | 14:49 |
xnox | Argh "Apr 22 14:35:47 ubuntu rsyslogd-2177: imuxsock begins to drop messages from pid 4575 due to rate-limiting" | 14:51 |
xnox | cjwatson: have you considered making the limits basically unlimited for partman/ubiquity, just in case one is trying to debug it =) | 14:52 |
dank | hmm. How would I disable rate limiting for this run? | 14:55 |
dank | maybe I'll just redirect stderr. | 14:56 |
xnox | dank: | 14:56 |
xnox | $SystemLogRateLimitInterval 0 | 14:56 |
xnox | $SystemLogRateLimitBurst 0 | 14:56 |
xnox | in | 14:56 |
xnox | /etc/rsyslog.conf | 14:57 |
xnox | and then restart it. | 14:57 |
dank | k | 14:59 |
dank | I've swapped out the two files, marked them executable, turned off rate limiting. Here we go... | 14:59 |
dank | sent. Included output of ps augxw and lsof in files in /tmp | 15:08 |
dank | and now I'm turning into a pumpkin. I can run more tests for you tomorrow. | 15:08 |
xnox | dank: did it still hang with my swapped in files? | 15:08 |
xnox | dank: thanks a lot for your help! | 15:08 |
xnox | well i guess i should see that in the logs... | 15:09 |
xnox | it did, sigh. | 15:09 |
cjwatson | xnox: It might be a plan to unlimit logging in casper, perhaps | 15:39 |
cjwatson | xnox: Are you going to be in the London office tomorrow? | 15:40 |
xnox | cjwatson: yeah =) | 15:40 |
cjwatson | Excellent, if all else fails we can pair-debug it there | 15:40 |
plars | dank: still around? | 20:39 |
slangasek | xnox: hi, so I see from logs that dank was able to send you some interesting debug output | 20:49 |
slangasek | xnox: does this get us closer to having a reproducer? plars is trying to reproduce the bug, so maybe if the debug output is interesting you could share it with him | 20:50 |
plars | yes, would love to find a way to reliably reproduce this | 20:51 |
xnox | plars: there wasn't much useful in it. my current hypothesis is that when ext4 needs to recover its journal or otherwise have rw access to the hard drive, replace/reuse will hang. My other idea is to axe reuse/replace and simply use os-prober instead of that code. | 21:12 |
xnox | similarly dirty / hibernated / not-cleanly mounted ntfs partitions could cause the same. But maybe not as much after the top crasher got fixed for it (bug 1019806) | 21:14 |
ubot2 | Launchpad bug 1019806 in ntfs-3g (Ubuntu) "ntfs-3g crashed with SIGABRT in get_node()" [Medium,Confirmed] https://launchpad.net/bugs/1019806 | 21:14 |
xnox | I will run these past cjwatson tomorrow, and see what he thinks. | 21:15 |
xnox | I think my other hdd had that once where I had to wipe it clean it was: intel raid metadata -> full-disk lvm -> lvm volumes used for VMs and thus having -> (full disk ubuntu install, nested lvm installs, etc) But (a) I wouldn't want to reproduce such a setup (b) not sure we want to support anything like that. | 21:16 |
xnox | dank reported that there are no errors from os-prober, it just works fine. | 21:17 |
slangasek | xnox: well, the hope is that plars would be able to get you a reproducer for this bug before tomorrow | 21:17 |
xnox | slangasek: my next step is to remaster a ubiquity cd which defaults to full debugging, set -x in partman and rate-limiting disabled, so as to hopefully lower the "debugging" skills needed by those who hit this bug. | 21:19 |
GrueMaster | xnox: Wouldn't it be quicker/easier to modify the files in an already created cd image than remaster it? | 21:42 |
xnox | GrueMaster: sure. are you affected by 1080701 ? | 21:43 |
xnox | bug 1080701 | 21:43 |
ubot2 | Launchpad bug 1080701 in ubiquity (Ubuntu Raring) "After 'Preparing to install Ubuntu' screen, raring installation hangs" [Critical,Confirmed] https://launchpad.net/bugs/1080701 | 21:43 |
xnox | so far I can't think of anything more than https://wiki.ubuntu.com/DebuggingUbiquity#Deeper_debugging_of_partman to help us find the problem | 21:44 |
GrueMaster | No, I stopped testing the live images after 12.04. | 21:44 |
GrueMaster | enotime. | 21:44 |
xnox | I understand. | 21:44 |
GrueMaster | But I used to do this almost weekly when I was doing Arm QA. | 21:45 |
GrueMaster | And before you say that is different, I used to verify issues I found on x86/amd64. | 21:45 |
GrueMaster | :) | 21:46 |
GrueMaster | Ubiquity is one of the ugliest programs I have seen to try to debug. the kernel parameters for adding debug output are largely ignored from what I have seen. | 21:53 |
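
The manual steps discussed in this log (adding `set -x` to the partman `choices` scripts, lifting the rsyslog rate limit, and tarring up `/var/lib/partman/devices`) could be collected into a few shell helpers. This is only a rough sketch, not something anyone in the channel actually ran: the function names, the `# debug-trace` marker and the example log path are made up for illustration, and it assumes GNU sed and that imuxsock is already loaded earlier in the rsyslog config file.

```shell
#!/bin/sh
# Hypothetical helpers; every path is a parameter so they can be tried
# on a scratch copy before touching a live session.

# add_trace LOG SCRIPT...: insert "exec 2>>LOG; set -x" after line 1
# (the shebang) of each script. LOG must not contain spaces.
add_trace() {
    log=$1; shift
    for script in "$@"; do
        [ -f "$script" ] || continue
        # Skip scripts that were already instrumented.
        grep -q '# debug-trace$' "$script" && continue
        sed -i "1a exec 2>>$log; set -x  # debug-trace" "$script"
    done
}

# unlimit_syslog CONF: append the two rate-limit settings quoted in the
# log, unless the file already mentions them.
unlimit_syslog() {
    conf=$1
    grep -q '^\$SystemLogRateLimitInterval' "$conf" && return 0
    printf '%s\n' '$SystemLogRateLimitInterval 0' \
                  '$SystemLogRateLimitBurst 0' >> "$conf"
}

# snapshot_partman DIR TARBALL: archive DIR so it can be attached to a bug.
snapshot_partman() {
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Possible use in a live session, as root (rsyslog must be restarted
# afterwards for the new limits to apply):
#   add_trace /tmp/partman-trace.log /lib/partman/automatically_partition/*/choices
#   unlimit_syslog /etc/rsyslog.conf
#   snapshot_partman /var/lib/partman/devices /tmp/partman-devices.tar.gz
```

Running `add_trace` twice on the same script is meant to be harmless, since the marker comment is checked first. As noted in the log, the snapshot is only useful once partman has started and populated the devices directory.
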