=== fabbione [i=fabbione@gordian.fabbione.net] has joined #ubuntu-ports
=== tmarble [n=tmarble@192.18.101.5] has joined #ubuntu-ports
[04:03] hey tmarble
[04:03] fabbione: ciao
[04:04] so are you, or David, lacking hardware to find this bug?
[04:04] not that I can do anything about it... but I thought i'd ask
[04:06] what bug?
[04:07] it's not like we have just one :)
[04:07] let me rephrase that
[04:07] we have no bugs..
[04:07] the hw just doesn't work
[04:07] :P
[04:12] ah. so you're saying it's my problem ;-)
[04:13] eheh
[04:13] no seriously.. i have been hacking like mad today
[04:13] what bug are you talking about?
[04:14] BUG: soft lockup detected on CPU#2!
[04:15] i already fixed that
[04:15] it's pushed in git but it will take a few days to be uploaded
[04:15] and i found another bug in the reboot code for the kernel that gets trapped by the HV and generates a poweroff error
[04:16] hmmm yes.. having both those fixes will be nice
[04:18] i thought that the latter was caused by the former, but i have been proven wrong
[04:44] for the purpose of debugging, would it make sense to package up linux-image-2.6.17-10-sparc64-smp with these fixes such that I can try them?
[04:44] i.e. before it becomes a 2.6.17.n release?
[04:51] i already tested it, but i can upload an image if you want
[04:52] as I only have this box for little over a week I'd like to get going with it -- if it's not too much trouble
[04:53] there also would be enormous value in confirming the effectiveness of the fix
[04:53] if, for some reason, the fix(es) are not sufficient I assume it is better to know that early, right?
[04:54] So, for example, I don't know if it's possible for me to take the recent boot.img from yesterday at http://archive.ubuntu.com/ubuntu/dists/edgy/main/installer-sparc/current/images/sparc64/netboot/2.6/
[04:54] and then point at a non-production mirror (if you prefer)
[04:55] then we wouldn't have to wait the 6 hours for ubuntu dinstall, right?
[04:55] halt..
[04:55] you are confusing 2 things here
[04:55] one is the d-i error you saw
[04:56] that has been addressed yesterday
[04:56] k
[04:56] the new image will work just fine
[04:56] but there might be instability in the mirrors soon
[04:56] new image will be default in a couple of hours
[04:56] will it, indeed, be linux-image-2.6.17-10-sparc64-smp ?
[04:57] yes, but that image doesn't contain the fix for the CPU lockups yet
[04:57] so what i suggest to do is:
[04:57] right -- I confirm the d-i bug is fixed -- install completed nicely
[04:57] wait tomorrow or thursday for the new image
[04:57] install
[04:58] you will see also a login prompt on the console (that was broken when you did the first install)
[04:58] and then we can install the new kernel with the fix
[04:58] if the fix will not hit archive before that
[04:58] ah, so you are saying that linux-image-2.6.17-11-sparc64-smp *will* include the CPU lockup fix?
[04:58] it can also be -10- with a version bump
[04:59] ah - ok
[04:59] -10- or -9- indicates the ABI version of the kernel
[04:59] right, my bad
[04:59] nah that's ok
[04:59] a lot of people don't grok that easily
[04:59] I actually do understand upstream vs. debian versions -- just not facile with these package names yet
[05:00] yeah
[05:00] kernel is special in this regard
[05:00] but, your point is, by tomorrow or thursday we will have a kernel with the CPU lockup fix, right?
[05:03] no, my point is that by tomorrow or thursday you will get an image that will install and take you to a console
[05:03] the reason why you didn't get a console was because of a bug in the installer that was fixed today
[05:03] ok
[05:04] the kernel with the CPU lock fixed is on my niagara and i can give you a copy
[05:04] but am I still going to trip over CPU lockup?
[05:04] yes right
[05:04] that's not a big deal
[05:04] ah -- then I install manually from the console?
[05:04] exactly
[05:04] i can also explain to you how to fix your actual install
[05:04] are you going to give me a .deb
[05:04] and get a console
[05:04] yes i will give you a deb
[05:04] easy enough
[05:04] if you are bored.. netboot the machine with the installer
[05:05] get to the partitioner and
[05:05] get to the main menu
[05:05] choose ash
[05:05] ?
[05:05] "exit to a shell" or something similar
[05:05] mount your root somewhere
[05:05] it's already on /target isn't it?
[05:05] it's on target if you install
[05:06] but since you already installed, might as well fix the install
[05:06] ah -- ok
[05:06] so if you get to the partitioner, before reformatting anything
[05:06] get to the shell as i told you
[05:06] mount / somewhere
[05:06] in /etc/events.d
[05:06] add a file called ttyS0
[05:06] and slam this in it:
[05:07] start on runlevel-2
[05:07] start on runlevel-3
[05:07] start on runlevel-4
[05:07] start on runlevel-5
[05:07] stop on shutdown
[05:07] respawn /sbin/getty -L ttyS0 9600 vt102
[05:07] (actually it's /etc/event.d)
[05:07] reboot from there
[05:07] I assume this is under my mount -- not the installer real / , correct?
[05:07] yes right
[05:07] under the mount
[05:07] k
[05:07] that will give you console access
[05:07] ah.. good
[05:07] now i need to finish a couple of silo fixes
[05:07] then I can install your deb from that
[05:07] ok
[05:08] one more, different question please
[05:08] and i will upload the image on people.ubuntu.com/~fabbione/tmarble
[05:08] k
[05:08] my colleague wants to boot (from OBP) to a given partition (in his case, partition 4)
[05:08] I asked him to try this:
[05:09] ok boot /pci@780/pci@0/pci@9/scsi@0/disk@0,0:d
[05:09] BUT, it said "file is not executable"
[05:09] is there some other way to do this?
[05:09] did he install silo in the partition? or in the MBR?
[05:10] i'll ask, hold please
[05:11] the installer by default uses the MBR iirc
[05:11] ok...
[05:11] to install on the partition you need to do it manually
[05:11] IIRC with silo -f -t
[05:11] but he needs to check on the manpage
[05:11] i really don't remember
[05:11] I have to read more on silo, but is there any reason I can't add an entry in silo to jump to a partition (that happens to be running Solaris)?
[05:11] k
[05:11] yes you can add an entry to silo
[05:12] any thoughts about supporting grub?
[05:12] there are some people working on grub2
[05:12] supposedly grub understands Sun VTOC
[05:12] but i have no idea about the status
[05:12] k
[05:13] nikolay is not responding... please go ahead to work on your silo fixes...
[05:13] thanks for your help! let me know when you have a deb for me
[05:14] it won't take long for the deb
[05:14] i need to build the kernel but it takes only a few minutes on Niagara :)
[05:14] ok... i'll try to get the console thing fixed now
[05:15] btw.. tell Nikolay that i didn't forget about his gcc/linking issue
[05:15] i just had no time to work on it
[05:15] silo booting is beta blocker
[05:15] yeah -- I have promised to file a bug on that (and he has another kernel NFS bug I need to file too)
[05:46] /usr/bin/make -j512 EXTRAVERSION=-10-sparc64-smp ARCH=sparc64 \
[05:46] image
[05:46] almost there :)
[05:46] that -j512 makes things go *slightly* faster :)
[06:01] I had trouble on install components -- skipping to detect disks
[06:03] they are changing the kernel bits in the archive as we speak
[06:03] i figured that... jumped to ash too early .. /dev/sdb was not known
[06:03] eheh
[06:04] mounted
[06:05] nice
[06:05] kernel is almost ready
[06:05] testing one more fix
[06:05] don't have /etc/event.d
[06:05] shall i mkdir
[06:07] it has to be there
[06:07] are you chrooted in /target or using real /etc ?
[06:07] does this look right (sorry for the flood):
[06:07] /mnt/event.d # pwd
[06:07] /mnt/etc/event.d
[06:07] /mnt/event.d # cat ttyS0
[06:07] start on runlevel-2
[06:07] start on runlevel-3
[06:08] start on runlevel-4
[06:08] start on runlevel-5
[06:08] stop on shutdown
[06:08] respawn /sbin/getty -L ttyS0 9600 vt102
[06:08] /mnt/event.d #
[06:08] /mnt/event.d # mount
[06:08] none on /proc type proc (rw)
[06:08] tmpfs on /dev type tmpfs (rw)
[06:08] sysfs on /sys type sysfs (rw)
[06:08] tmpfs on /.dev type tmpfs (rw)
[06:08] /dev/sdb on /mnt type ext3 (rw,data=ordered)
[06:08] /mnt/event.d ??
[06:08] /mnt/event.d #
[06:08] /etc/event.d ?
[06:08] that's ash that is confused -- hence the pwd
[06:08] /mnt/etc/event.d
[06:09] that directory has to be there
[06:09] what's in there?
[06:09] is it empty?
[06:09] no, I just added ttyS0
[06:09] /mnt/event.d # ls -al
[06:09] drwxr-xr-x 2 root root 1024 Sep 26 16:04 .
[06:09] drwxr-xr-x 4 root root 1024 Sep 26 16:04 ..
[06:09] -rw-r--r-- 1 root root 138 Sep 26 16:06 ttyS0
[06:09] /mnt/event.d # cat ttyS0
[06:09] start on runlevel-2
[06:09] start on runlevel-3
[06:09] no
[06:09] start on runlevel-4
[06:09] start on runlevel-5
[06:09] stop on shutdown
[06:09] there is something wrong here
[06:09] respawn /sbin/getty -L ttyS0 9600 vt102
[06:09] /mnt/event.d #
[06:09] that dir is full of stuff
[06:10] ls
[06:10] control-alt-delete rc0 rc0-poweroff rc2 rc4 rc6 rcS sulogin tty2 tty4 tty6
[06:10] logd rc0-halt rc1 rc3 rc5 rc-default rcS-sulogin tty1 tty3 tty5 ttyS0
[06:10] do you have /mnt/etc/inittab ?
[06:10] /mnt/event.d # ls -l /mnt/etc/inittab
[06:10] ls: /mnt/etc/inittab: No such file or directory
[06:12] something is weird
[06:12] no
[06:12] it's all wrong
[06:12] do you have /etc ?
[06:12] or did you mount /boot by mistake?
[06:13] perhaps.. let me try again
[06:17] my bad
[06:17] /tmp/5/etc/event.d # mount | grep /dev/sdb5
[06:17] /dev/sdb5 on /tmp/5 type ext3 (rw,data=ordered)
[06:17] /tmp/5/etc/event.d # ls
[06:17] control-alt-delete rc2 sulogin
[06:17] logd rc3 tty1
[06:17] rc-default rc4 tty2
[06:17] rc0 rc5 tty3
[06:17] rc0-halt rc6 tty4
[06:17] rc0-poweroff rcS tty5
[06:17] rc1 rcS-sulogin tty6
[06:17] /tmp/5/etc/event.d #
[06:18] /tmp/5/etc/event.d # cat ttyS0
[06:18] start on runlevel-2
[06:18] start on runlevel-3
[06:18] start on runlevel-4
[06:18] start on runlevel-5
[06:18] stop on shutdown
[06:18] respawn /sbin/getty -L ttyS0 9600 vt102
[06:18] /tmp/5/etc/event.d #
[06:18] Correct?
[06:20] looks much better now :)
[06:20] ok
[06:20] now umount and reboot?
[06:20] yeps
[06:20] k
[06:21] i am booting the test kernel for you
=== tmarble steps away to get lunch before conference calls begin in 30 min
=== tmarble back
[06:37] tmarble: you eat too fast
[06:38] no.. i just got the food -- will eat during the ConCall :-(
[06:38] oh
[06:38] i am relieved i am not the only one :)
[06:40] i'm booting into solaris on sysdisk0 on reboots ....
[06:40] i thought I could set this in OBP:
[06:40] setenv auto-boot false
[06:40] but it doesn't like that/
[06:43] {7} ok setenv auto-boot? false
[06:43] auto-boot? = false
[06:45] there is already one there now?
[06:45] i didn't notice
[06:45] yeah but i am testing one with an extra fix
[06:45] ah
[06:45] that one is useable
[06:47] OK ...last time it stopped here, I pressed and it continued...... is it normal to stop here?
[06:47] Rebooting with command: boot disk1
[06:47] Boot device: /pci@780/pci@0/pci@9/scsi@0/disk@1 File and args:
[06:47] SILO Version 1.4.12
[06:47] boot: boot:
[06:48] it's normal.. but there is a timeout
[06:48] just press return now, then?
[06:48] it gives you time to choose the kernel or otherwise it will go by itself
[06:48] yeah
[06:50] Ubuntu edgy (development branch) blade220 ttyS0
[06:51] Linux blade220 2.6.17-9-sparc64-smp #2 SMP Fri Sep 22 04:57:24 UTC 2006 sparc64
[06:51] root@blade220:~#
[06:51] I'm ready
[06:51] yeah i am doing the last boot.. only a few minutes (hopefully)
[06:52] but you can start using that one to avoid CPU lockups
[06:52] no worries -- now I can at least repair this system
[06:52] i'll wait
[07:01] uploading the new version now
[07:01] cool
[07:10] tmarble: can you please get somebody to look at https://launchpad.net/distros/ubuntu/+source/linux-source-2.6.17/+bug/62485 ?
[07:10] it would be enough to know what the hell that message from the HV means
[07:10] hold on
[07:11] note that it doesn't happen with the old dapper kernel
[07:11] only on edgy
[07:11] and the kernel is on people
[07:26] So, I added the deb....
[07:26] it looks like silo is set for
[07:26] image=/vmlinuz
[07:27] and the link was updated
[07:27] vmlinuz -> boot/vmlinuz-2.6.17-10-sparc64-smp
[07:27] so, just reboot, right?
[07:28] yes but make sure you have vmlinux.old pointing to the old kernel
[07:28] just in case
[07:29] after that.. reboot
[07:29] it does
[07:29] so, in case of trouble do.... boot: vmlinux.old
[07:29] LinuxOLD
[07:29] there is an alias set in silo.conf
[07:29] right.. just saw that in silo.conf
[07:29] at silo: you can tab
[07:29] and see what images are available
[07:30] ok
[07:31] last reboot and i am off for today
[07:31] ok
[07:31] 15 hours in the day.. i am dead tired
[07:31] i can't understand???
[07:32] 15 hours of work today.. i am dead tired
[07:32] thanks so much -- I'm rebooting now
[07:32] no problem at all
[07:32] you might get that HV error
[07:32] as i did show you in the bug
[07:32] the machine might poweroff
[07:32] i should also get some new OBP for my T2000
[07:33] but that can wait tomorrow
[07:33] yes you should
[07:33] tmarble: can you send me a link with the latest crack?
[07:33] what should i do about the HV bug?
[07:33] sure
[07:33] i need to know what that error means
[07:33] it was in an e-mail i sent you
[07:33] ok
[07:34] checking in the emails
[07:34] ah no
[07:34] you gave it to me here on IRC
[07:34] and i did install that update
[07:34] i need to check if there are new ones
[07:35] 123482-02
[07:36] hmm
[07:36] no i can't find it
[07:36] in my url list i mean
[07:37] ok hold on
[07:37] i have it now
[07:38] it's years i don't do sysadm on solaris, but i still remember how to search on sun.com :)
[07:38] http://sunsolve.sun.com/search/document.do?assetkey=1-21-123482-02-1
[07:38] right
[07:38] Wed Sep 13 12:26:54 MDT 2006
[07:40] 6437802 JBI Fatal HV error should not happen when I/O protection is on
[07:40] that smells like it
[07:41] had to powercycle my box... still waiting
[07:41] yeah i told you
[07:41] it's annoying
[07:41] anyway i am off
[07:41] the kernel will boot fine
[07:41] thanks for the help
[07:41] I did init 0 from the older kernel
[07:41] no worries... take care... have a good rest
[07:41] cya tomorrow
[07:41] k
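
The serial-console fix worked out between [05:06] and [06:20] can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name install_serial_getty is hypothetical, the mount point is whatever you chose in the installer shell (/tmp/5 in the log), and the check for an existing tty1 job is simply a guard modelled on the directory listing pasted at [06:17], so that a wrongly mounted /boot is rejected instead of getting a freshly created etc/event.d. The job text itself is copied from the log.

```shell
#!/bin/sh
# Sketch only: install_serial_getty is a hypothetical helper name,
# not something shipped by the installer or by upstart.

install_serial_getty() {
    root="$1"                       # mount point of the installed root, e.g. /tmp/5
    jobdir="$root/etc/event.d"

    # Assumption: a real root already carries the stock jobs (tty1, rc2, ...).
    # If they are missing, the wrong partition (for example /boot) is probably
    # mounted, so refuse rather than mkdir the directory.
    if [ ! -d "$jobdir" ] || [ ! -e "$jobdir/tty1" ]; then
        echo "no populated $jobdir: is the right partition mounted?" >&2
        return 1
    fi

    # Job definition exactly as pasted in the log.
    cat > "$jobdir/ttyS0" <<'EOF'
start on runlevel-2
start on runlevel-3
start on runlevel-4
start on runlevel-5
stop on shutdown
respawn /sbin/getty -L ttyS0 9600 vt102
EOF
}

# Example call from the installer shell (path taken from the log):
#   install_serial_getty /tmp/5
```

After a successful call, umount the partition and reboot, as confirmed at [06:20].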