[00:08] davidm: pong, sorry eating
[00:09] pgraner, my laptop is now able to reproduce the bug very well 8 out of 10 boots.
[00:09] davidm: ok, did the workaround solve it?
[00:09] Do you want it? I can fedex it to you or tim since it now is in text mode boot
[00:09] workaround does solve it.
[00:10] davidm: its definitely a race, its just a matter of narrowing down what with the 3945 driver
[00:10] It also alwayst gets past USB boot sequence so you can use serial console if you need it.
[00:10] rtg: ping... see above
[00:10] I have about 20 minutes to drop off time here.
[00:11] davidm: does it have a real serial port?
[00:11] nope, just USB
[00:11] But USB is getting up before the hang from what I see.
[00:11] And as I say I'm in text mode so you can printk debug with it.
[00:12] usplash is hard off
[00:12] davidm: is it hard hung, i.e. does the caps lock toggle on and off?
[00:12] hard hung
[00:12] no caps lock, no alt/printscreen t or 9 all locked solid
[00:13] davidm: can you live without it for a few days? If so send it to rtg this way he has one that repros it
[00:13] I'm off on Friday so I have no issue giving you unit until Monday
[00:13] davidm: ok send it to rtg, you have the addr?
[00:13] I need address to do so
[00:14] davidm: via email
[00:14] OK
[00:14] I'll fedex for early delevery
[00:15] davidm: you have mail
[00:16] OK
[00:21] OK have address will try to get FedEx to take it. I'll be off line use phone to get me if need be
[01:04] what type of error would this be: FATAL error inserting battery /lib/modules/..../acpi/batery.ko): no such device.
[01:17] emgent, offhand, I'd say a bad one. It sounds like your kernel modules went and ran away
[01:20] er, emma
[01:20] That's trouble.
[01:21] To me it sounds like the module in question can't find the hardware its designed to work with.
[01:21] I got it from aptitude while installing xserver-xorg in the cli only mini.iso on a computer that's quite old.
[01:21] emma, is it a laptop?
[01:22] * NCommander notes that should have been the obvious first question
[01:22] The odd thing is, besides that fatal error, xserver-xorg runs.
[01:22] Well, the fatal error sounds like its being caused by modprobe and not xorg
[01:22] Nope not a laptop. This is an old dino-puter, I got it for fun and experimenting with linux. :) It's a Dell Optiplex GX1, Pentium III.
[01:23] At install it said it had to force aspci
[01:24] acpi that is.
=== emma is now known as joe-the-plumber
=== joe-the-plumber is now known as emma
[04:28] Hello?
[09:31] anyone online with an iwl3945 to debug?
[09:52] Sorry, iwlagn here :-/
[09:52] I might be able to dig one up mdz
[09:53] mdz, speaking of things to debug, do you know what is needed to get a kernel.u.c account?
[09:56] NCommander: it helps to be a regular contributor first before requesting an account.
[09:56] NCommander: what are you planning to work on?
[09:56] linux-ports kernel
[09:57] I want to start work on rebasing to 2.6.27 in a side tree so once jaunty is available, and a kernel team member can sign off on it, it can become the jaunty kernel. I'm sorta irked at the age of the -ports kernel
[09:58] NCommander: why can't that be done by just pulling the current ports tree and doing it locally? Once you have something to show, there won't be a reason _not_ to give you an account :)
[09:58] At the moment, the 2.6.25 tree is someone fracked
[09:58] *somewhat
[09:58] I dunno what happened to it, but git explodes during a rebasing attempt
[09:58] * NCommander did have one of the kernel gurus also try it to rule out user error ;-)
[09:59] so its more grab linus's tree, and start it again
[10:01] NCommander: 2.6.25 -> 26.27 is enough of a big jump that some attempts at rebase might fail because some patches become obsolete and cause conflicts. So we are better of discarding those patches. Trying 'git merge' to 2.6.27 final might get you there quicker. Then we can figure out the actual 'diff' and recreate a new tree.
[10:01] Well, normally I'd agree
[10:01] But its failed on trival patches as well
[10:02] ITs like git can't see that this patch applies cleanly
[10:02] (on the rebasing)
[10:02] I personally felt it might be easier to just take the clean 2.6.27 drop, drop debian on it, then pop off each patch individually (there are maybe 15-20 tops)
[10:02] (the debian folder that is)
[10:03] NCommander: i agree. 'git format-patch -o /tmp ' will help you export all ubuntu/ports patches on top of Linus' tree
[10:04] * NCommander nods
[10:04] Pretty much what I was thinking
[10:06] amitk, http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=497200 is there a way that we get that driver in before final ?
[10:06] NCommander: if you came back with such a tree, someone would have a easier time recommending you for a kernel.u.c account. :) But we will be glad to help while you do this...
[10:07] * NCommander wishes bzr-git existed
[10:07] Oh well
[10:07] NCommander, not only you :)
[10:07] I just wanted a place where I could upload my tree more or less
[10:08] I don't have a machine where I canpush a git repo to
[10:08] NCommander, easy to solve btw, you just need to convine linus to switch upstream :)
[10:08] Ouch
[10:09] I personally wish he switched to monotone
[10:09] * NCommander likes mtn
[10:13] ogra: one thing at a time... mobile kernel first :)
[10:13] amitk, indeed, that would only be for post freeze, no hurry :)
[10:14] ogra: when is freeze?
[10:14] amitk, well, I'm fairly new to kernel development, but the -ports kernel has seen no love pre-hardy when it was forked from the normal one, and I think if we're going to have usable ports, this needs to change :-/
[10:15] amitk, slangasek said something like "morning-ish UTC"
[10:15] NCommander: agreed. we discussed this in kernel team meeting on tuesday
[10:15] ogra: today?
[10:15] yes
[10:16] amitk, well, I'm a -ports user, REVU is on a sparc box.
Its really pathetic that all the ports die because no one can maintain the kernel :-/
[10:16] amitk, it obviously didnt happen yet
[10:16] amitk, whoops it happened right now when i was typing the above
[10:17] heh
[10:17] * amitk groans
[10:17] (see -devel)
[10:26] hi, in 2.6.27 (ubuntu 8.10, amd64) - compared to 8.04 on same box (acer laptop single core), CPU usage is awesome (98% in C3 when idle, was 1% on 8.04 bue to probably kernel problem)
[10:26] this rocks
[10:27] however, now one thing causes wakeups (70 per second)
[10:27] i915@pci:0000:00:02.0
[10:27] I didnt see that one on 8.04... any ideas?
[10:28] oh ok. this is only when desktop effects are active. I guess this is because it uses OpenGL mode...
[10:55] NCommander: if you have a 3945 system, debugging for bug 263059 would be greatly appreciated
[10:55] I'll see if I can dig one up
=== asac_ is now known as asac
=== mdz_ is now known as mdz
[14:52] Ng: did you catch this little factoid re e1000e corruption? http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-intrepid.git;a=commit;h=67d9b90a1c844bf1c6daaffd2c60561fc8c445f7
[14:52] rtg: yeah, I was gonna ask if we're going to take that :)
[14:53] I know basically nothing about ftrace, but it makes me wonder if it's possible it's causing other subtle weirdnesses
[14:53] Ng: test building now.
[14:54] Ng: its possible. I'm getting David M's laptop this morning 'cause it exhibits Jane's i3945 related crash. Its really weird and racy.
[15:52] rtg: 5 reboots without a hand on the 3945 now
[15:52] s/hand/hang
[15:53] mjg59: what happens if you access an I/O mapped register after pci_disable_device() ? I imagine its somewhat HW dependent.
[15:53] rtg: is your test build going to yield a handy .deb?
If so I could do more reboot testing of my weird boot crash/corruption tonight
[15:53] rtg: In principle, I don't think the device should decode it
[15:53] but that's purely me hoping and wishing that it will magically go away with ftrace ;)
[15:53] Ng: actually, I think its worthy of an upload.
[15:53] fair enough
[15:53] Ng: I'll have a .deb soon if you are on amd64-generic
[15:55] newp, i386
[15:55] I like my proprietary media to work ;)
[15:56] amitk: fwiw, I've got a similar issue with ipw2200, and I got 9 successful boots with the 10th a hang
[15:56] jdstrand: bug #?
[15:56] amitk: can you instrument your i3945 driver such that you can tell when the rf-kill handler is called?
[15:57] bug #284406
[15:57] mjg59: so you think it would wedge?
[15:57] i.e., stop the PCI bus.
[15:59] No, it should just cause an aborted write
[15:59] /shouldn't/ wedge it
[16:01] rtg: I don't seem to have a way to enable rf_kill. Fn+F5 will toggle it to 1, but it goes to 0 on reboot. I can disable it in the bios, but then ipw2200 isn't detected/loaded during boot
[16:01] BenC: ping
[16:01] rtg: mdz already sprinkled some printks around rfkill_init() this morning. The .ko is attached to the bug.
[16:01] * amitk back in 15 mins (dinner)
[16:01] amitk: I don't have hardware yet. It should get delivered soon.
[16:02] rtg: I have a kernel compile going with FTRACE disabled. I'll tell after that.
[16:02] amitk: pull from the repo. I've already committed taht stuff
[16:05] amitk: so you can reproduce the bug now?
[16:11] jdstrand: I have a T42 I can test, any hints on reproducing bug 284406? upgrading it to intrepid at the moment isn't practical, but I can test a CD
[16:12] mdz: not really-- just patience. I've gone as many as 9 consecutive successful reboots
[16:12] s/reboots/boots/
[16:13] mdz: I did have associate=0 in modprobe.d/ due to another bug that was fixed in 7.11, but removing that file makes no difference
[16:15] rtg: yes, I already pulled and building the kernel now.
Almost done
[16:16] mdz: yes, I can reproduce, but very infrequently. Once in 10 tries or so...
[16:34] rtg: I'm not having too much luck reproducing after adding debug to the kernel cmdline
[16:35] jdstrand: at what point did this regression get introduced?
[16:41] asac,all: is there any clarity on whether bug 259157 needs to be fixed in userland or kernel?
[16:42] mdz: I am trying to ascertain that. I *definitely* know I never saw this in hardy. I thought I saw it in 2.6.26-5.17 once this morning, but am trying to reproduce that currently
[16:43] mdz: unfortunately, this is not a machine I actively use, so I may have well been hitting good boots all along in the cycle
[16:43] (though I seem to remember at least one time before the latest kernel that I had a hang, but didn't have time to troubleshoot)
[16:45] BenC: what is the resolution to be for 246269 (uvesafb)?
[16:47] *sigh* 20 reboots now without a hang.
[16:47] rtg: did the hardware arrive?
[16:48] mdz: yep - cloning /home
[16:48] rtg: great
[16:48] rtg: did it hang when you booted it up?
[16:48] I'm wondering if maybe it's tickled by what sort of APs happen to be nearby, if any
[16:51] mdz: on conf call, I'll be done in a bit.
[16:51] mdz: there is no userland solution for atheros ... the previous we had used the wpasupplicant madwifi module
[16:51] which doesnt work anymore i think
[16:52] mdz: (thats for ath) ... for orinoco the solution could be user land, but the code changed considerably in NM and i havent received a single complained about it (and therefore no testers)
[16:53] mdz: so to answer your question: atheros -> fix in kernel (though at least some appear to work with latest drivers); orinoco -> maybe userland
[16:57] * jdstrand finally got smart and created /etc/rc2.d/S19rebootme so he doesn't have to hand hold his laptop through endless reboots
[17:08] mdz: we uploaded a new kernel with vesafb as default
[17:08] BenC: we reverted from uvesafb to vesafb?
[17:10] mdz: right
[17:18] BenC,rtg,amitk,pgraner,cjwatson,slangasek: targeted kernel bug review on its way to your mailbox
[17:18] mdz: if the rf-kill switch is enabled, then the AP should have no influence
[17:19] ohh ah, locked up.
[17:22] rtg: good point, several of us tried that
[17:23] mdz: on another note, I upload Intrepid LBM about 4 hours ago. need to get Steve to release it.
[17:24] mdz: hmm, fixing this bug it gonna be difficult if I can't _ever_ get David laptop to finish booting. 5 for 5 on boot hangs
[17:28] rtg: he's managed to get it very reliable
[17:28] rtg: his boot process is modified from stock (he's not waiting for udevtrigger to finish before continuing)
[17:29] init=/bin/sh should get you in of course
[17:29] mdz: how does that effect rf-kill processing? I assume its udev tha does that as a result of ACPI events.
[17:31] rtg: the normal boot process does essentially "udevtrigger; udevsettle" davidm's just runs udevtrigger and keeps going. so it would mean that there are other init scripts running in the background while udev is still loading modules
[17:32] it probably gets to runlevel 2 or so before it's done loading modules
=== paran_ is now known as paran
[17:32] 8 out of 10 will lock
[17:33] mdz i think you put udevsettle back
[17:33] davidm: I think I'm 10 for 10. init=/bin/sh never gives a prompt, I had to use break=top
[17:34] davidm: I did, but then you took it out again last I checked
[17:34] you may want to boot a cd
[17:34] nope i never touched it
[17:34] rtg: so it's possible that it's one of the startup scripts in rcS.d after S10udev which tickles the bug
[17:35] that made the lock happen earlier
[17:38] where have the linux-image-debug-* packages gone in intrepid?
[17:41] rtg: will wireless backports happen before release? if so, I'll skip bug #
[17:42] rtg: will wireless backports happen before release? if so, I'll skip bug #284354 for now.
[17:42] amitk: I'm not sure yet.
I've been to busy with other stuff to get back top it.
[17:43] I have committed the current compat-wireless, but it didn't seem to work, e.g., would not connect.
[17:51] mdz: fyi-- 31 consecutive successful boot with 2.6.26-5.17-generic. moving on to 2.6.27-2.3...
[17:56] jdstrand: don't tell me, put it in the bug :-)
[17:56] mdz: I did :)
[17:56] paran: bug 253904
[18:16] rtg: since you are having more luck than I at reproducing the 3945, could you add the following debugging lines: https://pastebin.canonical.com/10277/
[18:20] amitk: what's the dump_stack() gonna show you?
[18:21] mdz: thanks, so they are on ddebs? why? you really need kernel debug symbols to use things like oprofile or systemtap that are both in the normal archive :(
[18:23] rtg: the call stack, I got a freeze (only twice yet) right before that. I was hoping to get something out before the freeze
[18:25] amitk: gimme a bit. I rebuilt the module without rf-kill. its altered the problem enough that david's laptop appears to boot more reliably. no hnags after 4 tries.
[18:28] rtg: sure, take your time. I am about to put this on an automatic reboot loop and call it a day.
[18:28] amitk: later...
[18:59] hi, did you see that 2.6.27.1 hotfix?
[18:59] Kano: nobody in here follows upstream development
[19:00] you should have learned that by now
[19:00] well but that disables the root cause of the e1000 problem
[19:00] i would say it is very important
[19:01] i think an intel developer has submitted that patch to the kernel-team list
[19:01] disable CONFIG_DYNAMIC_FTRACE due to possible memory corruption on module unload
[19:02] you do not explicit need that patch,but at least disable it
[19:03] the patch would not hurt however
[19:04] alreay committed to the Intrepid repo
[19:04] *already
[19:04] is config changed?
[19:04] the Kconfig change is the patch
[19:04] ok
[19:05] will compile new snapshot then
[19:45] ugh, we really need something like hal-info for HDA quirks.
[19:49] Being able to set the mappings from userspace would help a great deal
[19:49] As would being able to parse the .inf files
[19:58] is there any hope of getting the elantech touchpad driver into intrepid?
[20:07] It always goes quiet when I say that. :/
[20:10] solarion: the ditro was frozen today and we release in 2 weeks. So nobody wants to add another driver at this late stage.
[20:10] *distro
[20:10] amitk: the request's been in since Hardy, and likely before
[20:11] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/123775
[20:13] yes, this bug is clearly over a year and a quarter old
[20:14] solarion: but clearly at that time it was an out of tree driver with no future. So it made it to the -mm tree later. What happened next? Do you know the story?
[20:14] I've no idea
[20:15] I just want to disable tap-to-click and enable scroll areas. :)
[20:20] solarion: I suggest you right to the author here: http://arjan.opmeer.net/elantech/ and ask him why the driver is still outside the tree after 3 major kernel versions.
[20:20] s/right/write
[20:21] amitk: good point
[20:25] there. Sendified. :)
[20:29] let us know what you hear back
[20:29] well, it took 36 tries (!) to get the boot hang (bug #284406) with 2.6.27-3.4, but ony 6 with 2.6.27-2.3... gotta love race conditions :(
[20:36] jdstrand: https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/263059/comments/148
[20:36] its likely they are the same hang.
[20:37] rtg: excellent-- is it worthwhile to retest 2.6.26, or do you have enough to go on at this point?
[20:37] jdstrand: I'm gnarling into that function, and I've solicited some advice from the wireless guys (no response yet)
[20:38] rtg: well, I've got this thing on a loop so I can let it churn away...
[20:39] jdstrand: instrument the ipw2200 driver so that it prints some locators, preferably just before and just after the call to ieee80211_register_hw()
[20:42] rtg: can you paste what you used for iwl3945?
[20:43] jdstrand: just do 'printk(KERN_INFO DRV_NAME " %d\n",__LINE__);' where DRV_NAME is something like 'ipw2200'
[20:43] rtg: ok, easy enough
[20:51] rtg: i've been looking at the link-order angle of this bug.
[20:51] but since 3945 is a module, that doesn't apply
[20:52] but is the order in which userspace triggers these module loads fixed? and if so, where is it?
[20:53] amitk: oh, defintely. David was able to get it more often by messing with udev settling
[20:54] where is keybuk when we need him?
[20:54] amitk: see /etc/init.d/udev. comment out the 'udevadm settle' clause in the restart option.
[20:56] amitk: looks like it might be in 2.6.28
[20:56] too late for intrepid, unfortunately
[20:56] solarion: first thing in jaunty though :) Did he mention why it took this long?
[20:57] amitk: no, but I didn't really ask that; I asked what was keeping it from going in-tree
[20:57] ok
[20:58] * solarion digs for the linux-kernel archive posting
[20:58] http://www.uwsg.iu.edu/hypermail/linux/kernel/0810.2/0404.html
[21:02] nice
[21:03] amitk: I've chased it as far as the rtnl_lock() in ieee80211_register_hw()
[21:11] rtg: the two times I did get a freeze, I saw all messages from ieee80211_init_rate_ctrl_alg()
[21:12] rtg: interestingly, https://bugs.edge.launchpad.net/ubuntu/+bug/263059/comments/128 finished pci_probe on every hang
[21:12] ^ using mdz's instrumented modules
[21:12] amitk: i just had it lock up _after_ the i3945 module init completed.
[21:14] ok. that makes more sense.
[21:14] or that is what I am seeing the bug reports
[21:26] rtg: comment out udevadm settle in restart or start option?
[21:40] amitk: mdz commented out the clause in restart (on David's machine).
[21:45] huh
[21:53] rtg: do you have then X60?
[22:20] amitk: T60
=== TheMuso_ is now known as TheMuso
=== laga_ is now known as laga
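
Editor's note on the reboot counts quoted in this log (9 good boots then a hang, 20 reboots without a hang, 31 consecutive successful boots, 36 tries to get a hang): with an intermittent race, a run of clean boots is weak evidence on its own. The sketch below is not from the channel. It assumes each boot hangs independently with a fixed probability, which a real race may well violate, and the function names are made up for illustration. The 0.1 rate is only a stand-in for the "once in 10 tries or so" figure mentioned above.

```python
import math


def clean_streak_probability(hang_rate: float, boots: int) -> float:
    """Chance that `boots` boots in a row all succeed, assuming every boot
    hangs independently with probability `hang_rate` (an assumption)."""
    return (1.0 - hang_rate) ** boots


def boots_needed(hang_rate: float, max_false_pass: float = 0.05) -> int:
    """Smallest clean-boot streak that a still-buggy kernel would survive
    by luck with probability at most `max_false_pass`."""
    if not 0.0 < hang_rate < 1.0:
        raise ValueError("hang_rate must be strictly between 0 and 1")
    if not 0.0 < max_false_pass < 1.0:
        raise ValueError("max_false_pass must be strictly between 0 and 1")
    return math.ceil(math.log(max_false_pass) / math.log(1.0 - hang_rate))


if __name__ == "__main__":
    rate = 0.1  # assumed: roughly one hang in ten boots
    for n in (5, 20, 31):
        p = clean_streak_probability(rate, n)
        print(f"{n} clean boots by luck: {p:.3f}")
    print("clean boots needed at the 5% level:", boots_needed(rate))
```

Under those assumptions, 20 clean boots in a row still happen by luck about 12% of the time at a one-in-ten hang rate, and it takes 29 consecutive clean boots before that chance drops below 5%. A lower hang rate pushes the required streak up quickly, which is one reason an automated reboot loop is worth setting up.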