[01:27] u/c
=== BenC1 is now known as BenC
[04:21] I filed a hibernation bug about a month ago and I've been unable to fix it myself (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/366264).
[04:21] Malone bug 366264 in linux "[Dell XPS m1530] Resume fails after hibernate/suspend" [Undecided,New]
[04:21] Is this the right place to ask for help in diagnosing and fixing this bug?
[04:30] is there somewhere else I should ask for help with this bug?
[04:59] colonelqubit: remove "splash" and "quiet" from your kernel boot line
[04:59] that should give you some more idea what's wrong
[04:59] johanbr: will do
[05:03] johanbr: should I edit menu.lst, or just temporarily change the boot params in the grub menu and then try to test suspend?
[05:09] colonelqubit: doesn't really matter
[05:10] it's mostly for debugging, so maybe the temporary way is best
[05:11] johanbr: what should I be looking for? I've been reading notes on the wiki (https://wiki.ubuntu.com/DebuggingKernelSuspendHibernateResume), but I'm not sure if there's specific info I should put on the bug.
[05:11] well, it looks like it locks up, right?
[05:11] just write down what the last few messages are
[05:13] johanbr: yes, it looks like it locks up. I'll append the longer kernel log to the bug
[05:14] alright
[05:14] gotta go... good luck
[05:14] johanbr: thanks for the help -- should I ping back to #ubuntu-kernel once I've done this?
[05:16] sure, that doesn't hurt
[05:16] although most of the kernel guys are probably asleep now :)
[05:16] okay, I'll try tomorrow morning
[05:17] alright
[05:17] have a good night
[05:18] same to you
[10:32] Which kernel will Karmic final use?
[10:46] Yingying_Zhao, we decided to use 2.6.31
[11:21] yep 2.6.31 is the aim
=== apw changed the topic of #ubuntu-kernel to: Karmic Kernel Plan: 2.6.31 -- Ubuntu Developer Summit this week.
info on http://summit.ubuntu.com - and in #ubuntu-devel-summit
[11:22] cking, we were musing about doing a small image for testing of new features on usb
[11:23] apw, jaunty or karmic?
[11:23] do you have any canned way of making those or is this something we need to think about
[11:23] i t
[11:23] i think the thought was making a karmic job with x-edgers, updated kernels and grub2
[11:23] and somehow making all that fit in 0MB (of course)
[11:24] so it would be trivial to download and test as a bootable image not installed
[11:25] apw: I just do a clean install to a 8GB USB pen drive and then remove the bloaty apps then shrink the partition using gparted
[11:25] how small do you think it might go?
[11:26] Mmm.. 2.2GB was everything, so possibly down to less than 2GB and compressed down to 600MB
[11:26] so not exactly tiny then
[11:28] my chroots which contain X are of the order of 550MB
[11:28] I've got a minimal gnome lpia installer which is ~450MB
[11:28] (compressed)
[11:29] thats not the 100MB i was hoping for ... though it was a naive wish
[11:29] 100MB is just not possible
[11:33] apw: would a 500MB image be reasonable?
[11:34] frankly i don't think it was me who cared how big it is
[11:34] i think the utility to both the user and us outweighs the size downsides
[11:45] apw: do you want this image to be produced automatically?
[11:47] cking, i am not seeing a need for that ...
[11:47] i can just update it and remake it etc pretty easy once i have a basic one
[11:48] I suggest downloading the latest image, installing it to a 8GB USB stick and removing all the bloaty apps and shrinking it with gparted and working with this as a basis.
[13:27] cking, yeah sounds like a reasonable plan indeed
[14:40] moo
[14:41] oink
[14:42] root@osiris:~# grep CPU_FREQ_DEBUG /boot/config-2.6.30-6-generic
[14:42] # CONFIG_CPU_FREQ_DEBUG is not set
[14:42] hrm
[14:43] ogra, trying to find something. But mjg59 if you happen to hang around.
Would you know a way to prevent a built-in acpi-cpufreq from getting selected?
[14:43] ogra, The whole of acpi debug is disabled by default
[14:43] yeah
[14:43] smb: No
[14:43] we should probably have it on during development
[14:44] Why would you want to?
[14:44] my system is constantly running the fan and the frequency doesnt go below 1GHz since the modules were included in the kernel by default
[14:44] mjg59, To try to check what driver would get used next. ogra has a case where he had a lower freq available before but not with the acpi driver
[14:44] i'd like to have my 800MHz default back :)
[14:45] What hardware is this?
[14:45] model name : Intel(R) Core(TM)2 Duo CPU T5550 @ 1.83GHz
[14:45] And now what's the lowest speed?
[14:46] mjg59, 1Ghz
[14:46] acpi-cpufreq is the only correct driver for that chip
[14:47] If it's missing a state now then that implies that there's been a code change rather than a driver change
[14:47] Unless you were somehow using p4-clockmod previously, in which case your life is now immeasurably better than it was then
[14:47] i'm pretty sure with the modules a different direver was used
[14:47] *driver
[14:47] not sure if it was the right one, but it definitely scaled to a lower speed by default
[14:48] Other than acpi-cpufreq, only p4-clockmod could possibly bind to that chip
[14:48] mjg59, Yeah, that was somehow my suspicion
[14:48] And that doesn't do voltage scaling
[14:48] yeah, might be that it was the p4 one
[14:48] So things are now better than they were previously
[14:48] my lappie is most of the time on AC ... what i care for is the constantly running fan
[14:49] It'll run coler at 1GHz with acpi-cpufreq than at 800MHz with p4-clockmod
[14:49] s/coler/cooler/
[14:49] hmm
[14:49] V^2/R and all that
[14:49] then why does it run constantly even though thermal_zone only shows 55°C ?
[14:49] Because the lowest trip point is below 55C?
[14:51] root@osiris:~# cat /proc/acpi/thermal_zone/TZ0/trip_points
[14:51] critical (S5): 91 C
[14:51] not really
[14:51] Oh, so it's all BIOS controlled anyway
[14:51] i have no BOIS option for it though
[14:51] *BIOS
[14:51] You typically won't
[14:53] well, it used to only spin up once an hour or so when the machine was idling
[14:53] ogra: got my mail ?
[14:53] until the whole stack of cpufreq/thermal moved into the kernel
[14:53] oops, wrong chan :) thought I was in #ltsp, sorry
[14:54] ogra: Shrug.
[14:54] ogra: If you were using p4-clockmod before then your machine was running hotter. That's a thermodynamic certainty.
[14:54] Probably with p4-clockmod the temp was slightly lower at 800MHz
[14:54] smb: No
[14:55] Ok, that would have been my only way to explain this
[14:55] smb: Power consumption is linear with frequency, but tends towards a V^2 relationship with voltage scaling
[14:56] Hm, and iirc the p4 driver did only freq scaling
[14:56] is there any way to tell acpi-cpufreq to use 800MHz ?
[14:57] ogra: It uses whatever frequencies your BIOS provides
[14:58] mjg59, Could a custom dsdt override the frequencies?
[14:58] Yes
[14:58] But if you're driving the chip at lower voltages than it's designed to, then you risk data corruption
[14:59] And just dropping the frequency won't save you anything
[14:59] It just means you spend less time in deep C states
[15:00] ogra, i am a little surprised you have no 800mhz given the model you have there
[15:00] my model, looks like a later version of yours and has 800 available
[15:00] The frequencies available are determined by the bios, not the CPU
[15:01] my system is a clevo tn120
[15:01] interesting ... i'd not thought of it that way
[15:01] ogra: Stick the output of acpidump somewhere?
[15:01] wow, thats huge
[15:01] Yes
[15:01] Don't pastebin it
[15:03] http://people.ubuntu.com/~ogra/acpidump.out
[15:07] Hm.
[15:08] mjg59, Trying to learn a bit more here. What would be the thing to look at?
[15:08] smb: You want the _PSS object
[15:08] It contains the set of ACPI performance states
[15:09] Ok, seems to be referenced in ssdt2 and dsdt
[15:09] Except it's not actually present in them
[15:09] There's probably a dynamically loaded table
[15:11] I see... hm
[15:14] mjg59, That _tss method in ssdt2 looks slightly like setting up something oneshot...
[15:14] _TSS is for T-states, not P-states
[15:21] mjg59, sigh, quite confusing. It walks through the p-states to build the t-states... or so it looks...
[15:21] Yeah, I'm not sure what it's doing there
[15:35] * ogra is back from his phonecall
[16:02] ogra, Unfortunately I seem to be far from understanding what your BIOS does... :(
[16:03] well, dont ask me about it either :)
[16:03] ogra, I won't :)
[16:23] Hi, I'm looking for a download location for the .ddeb containing debug information on 2.6.28-12-generic for jaunty. Do any mirrors exist or are the 2.6.30 versions on ddebs.ubuntu.com the only available versions?
[16:35] i believe ddebs.u.c is the place for those
[16:37] yes, do you know where to find older kernel versions as ddebs.u.c only has 2.6.30
[17:40] hey, anyone want to look at a dmesg dump from a laptop that slowly dies after resuming from suspend?
[17:40] 2.6.30rc7 from the mainline kernel ppa
[17:40] http://pastie.org/496676
[17:40] The process that triggers the first oops is varies, but it's always "BUG: unable to handle kernel paging request at "
[17:40] s/is//
[17:42] I have an ssh open to it, so I can paste additional information, although at this point not much actually runs (dmesg does for instance, new gnome-terminal windows can open, but new bash processes don't start)
[17:42] the disk doesn't come back to life properly?
[17:43] it kinda does
[17:43] it's weird
[17:43] I can resume and use it normally for a while
[17:43] and then things start breaking
[17:43] at which point, any access to the disk seems to hang the process that made the access
[17:44] under our .30 kernel, it hangs almost immediately on resume (on the order of a few seconds to a few minutes). on mainline, it works for several minutes to several hours
[17:45] https://bugs.launchpad.net/bugs/380807 has some info from the -generic kernel
[17:45] Malone bug 380807 in linux "[karmic, intel] Laptop locks up moments after resuming from suspend" [Undecided,New]
[17:48] apw: can you get the bug above [380807] into the Suspend/Resume queue?
[17:48] would ssh access to the box in this state be useful?
[17:48] cwillu: yes don't knock it down just yet
[17:49] it would be useful to know if you are able to login to it remotely
[17:49] pgraner, yep, it's still up, and I can reproduce it fairly easily
[17:49] apw: I think he had a ssh up prior to the oops
[17:49] (sec, attaching dmesg to that bug, and then I'll see if I can ssh in a second time)
[17:49] apw: looks like we have mem being mapped into the ether
[17:52] bah, ssh session just died
[17:52] pgraner, are we having an IRC meeting tomorrow?
[17:52] desktop session is still up though
[17:53] dmesg is attached to that bug now (same as the pastie)
[17:54] pgraner, if you had one command that you could see the result of, what would it be? (noting that it may just hang the last remaining terminal I have open)
[17:54] (can still run dmesg, but have no way to copy it elsewhere)
[17:55] any commands you run,
run in the background with &
[17:55] and then if they hang you'll likely keep your shell
[17:55] k
[17:56] wasn't sure if it would lock it up too
[17:56] it may...
[17:56] do you have KMS enabled?
[17:56] and... gnome-terminal just died
[17:57] (independently of any other commands, just stopped responding)
[17:57] damn
[17:57] bjf: we should be meeting, I need to look at the page and see who is chairing it
[17:58] and there goes compiz, although I can still move the mouse :p
[17:58] pgraner, I asked because I think I am chairing.
[17:58] cwillu, so this is triggered by a suspend/resume i assume ?
[17:58] apw, yes, every time
[17:58] bjf: heh, cool. I think just putting out a summary of kernel actions out of UDS would be good.
[17:58] cwillu: i might recommend using 'netconsole' on the next boot
[17:59] you don't have an mmc card inserted do you?
[17:59] nope
[17:59] desktop is locked up completely now (no mouse, etc)
[17:59] mouse too ... hmm
[17:59] sysrq-s hit the disk though
[17:59] cwillu: is this a laptop or desktop?
[17:59] laptop
[17:59] if you configure netconsole it'll dump all the kernel output to a udp port somewhere... like syslog
[17:59] suspend was rock stable from 7.04 up to 9.04
[18:00] Sam-I-Am, next reboot :)
[18:00] heh
[18:00] cwillu: 2.6.30 got an overhaul of the suspend/resume subsystem
[18:00] who was talking about locking issues in -rc6 for suspend? who told us about that
[18:01] apw: marcel was telling us at UDS
[18:01] apw, I saw a bug about that I think, I might have it open still
[18:01] cwillu, be good to have any pointers you have found already to save time
[18:02] apw: perhaps an apport-collect to make sure you have everything?
[18:02] pgraner, the attachments on my bug were from apport-collect
[18:03] http://patchwork.kernel.org/patch/1113/
[18:03] cwillu: cool
[18:03] that's old though
[18:03] january
[18:03] just noticed the date
[18:04] there's 3 vaguely similar bugs on launchpad that I saw, all against 2.6.28 though
[18:04] I'm going to reboot and get it reproducing again
[18:04] cwillu, looks interesting as i am sure that same routine appears in suspend
[18:05] and so far i've not seen that change as presented in the kernel
[18:05] do you want me on ubuntu's kernel or mainline's?
[18:05] both have the problem, ubuntu's crashes far quicker though (almost too quickly to do anything with)
[18:06] other interesting datapoints, I _think_ it worked fine while I was still on jaunty with 2.6.30rcX (needed it to work around the ext4 file deletion hanger)
[18:06] cwillu, so that was with a mainline kernel? your own build?
[18:06] although you should probably ignore that, I don't remember at all well enough
[18:06] apw, mainline ppa
[18:07] so that should still be on your machine and testable on karmic too
[18:07] apw, you misunderstood me :p
[18:07] apw, both the mainline ppa kernel and ubuntu's -generic do this
[18:07] that dmesg came from 2.6.30rc7 from the mainline ppa
[18:08] ok ...
[18:08] * cwillu reboots into 2.6.30-020630rc7-generic
[18:08] any major differences in userspace suspend between jaunty and karmic?
[18:08] the patch you pointed to on patchworks is somehow in karmic under a different patch id
[18:08] johanbr, outside the kernel no, inside some
[18:09] how do I verify that I'm not using kms?
[18:09] I played with plymouth during jaunty's release (as long as I was on 2.6.30 and all that)
[18:09] unless you have modeset mentioned on your grub command line
[18:09] thought I removed it completely
[18:09] or in your module options then you don't
[18:09] it is opt in currently
[18:10] k, should be clear then
[18:11] cwillu: how do you suspend?
[18:11] johanbr, normal suspend
[18:11] not uswsusp or anything like that
[18:25] cwillu, another good test point would be the tip of linus' tree: http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/2009-05-30/
[18:25] apw, k, downloading it right now
[18:25] there are some locking changes for suspend in there, though i think they are meant to be more theoretical to be fair
[18:26] is there an easy way to force a message so I can see if this netconsole is actually working?
[18:26] hrm
[18:26] sysrq help might do it
[18:26] i think the title is always emitted
[18:26] yep
[18:26] good
[18:26] (in the sense of 'it worked')
[18:26] also worth doing the suspend from vt1, as you may see errors there
[18:27] messages that wouldn't show up in /var/log/{messages/kern.log/syslog/dmesg}?
[18:28] depending on the nature of the hang they might not get to disk yes
[18:29] the oops that starts to bring things down doesn't necessarily occur right away
[18:30] I'll still try it, I'm just spewing everything I can think of that might be relevant
[18:30] 8 minutes until the download is complete, at which point I'll install, and then suspend on the current kernel
[18:30] or should I go straight to the daily and see if it reproduces first?
[18:32] * cwillu pokes apw with an inquisitive stick
[18:32] i think continue with your test on the current, and then the c-o-d kernel
[18:33] k
[18:39] installing right now
[18:47] suspending from console
[18:48] hard lock
[18:49] 9 segfaults are visible, laptop_mode mostly, and one cpufreq
[18:50] I think I saw another error flash by too
[18:50] * cwillu reboots and tries that again
[18:50] nothing showed up in the netconsole
[18:50] actually, that's a lie
[18:50] [drm:i915_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0
[18:50] showed up
[18:51] cwillu: could you try with the jaunty kernel?
[18:51] johanbr, sure, doing it now
[18:51] (noting that jaunty's kernel has ext4 issues that cause other lockups)
[18:52] (but I shouldn't run into them without deleting files)
[18:55] * cwillu suspends
[18:56] came back with no errors on the console
[18:56] 2.6.28-10-generic #33
[18:57] netconsole didn't show anything during the suspend, although it's still connected (I did a sysrq-sync before and after, both showed up in the netconsole)
[18:58] johanbr, apw, given my original dmesg, is there anything that jumps out at you as a way to check whether something has gone wrong, even before the system oops?
[18:59] something in /proc/ or /sys/ I could check?
[19:02] Comparison with the dmesg from the jaunty kernel might be interesting
[19:06] I'll post that in a minute
[19:06] just proving to myself that it's stable
[19:06] by playing flash games on casualcollective.com of course :p
[19:13] seems to be fine
[19:13] okay, dmesg
[19:14] http://pastebin.com/f3757f4a5
[19:14] was the original with the fault http://pastie.org/496676
[19:14] (bah, pastie was with fault, pastebin was jaunty's kernel that worked fine)
[19:15] * cwillu pokes johanbr and apw
[19:15] I'm rebooting into the daily 2.6.30 now
[19:18] * cwillu suspends current daily kernel
[19:20] dmesg from 2.6.30-999-generic (current daily): http://pastebin.com/f162d6f02
[19:20] didn't immediately crash, let me test it for a bit again
[19:22] oh, wait
[19:22] an error showed up a second after I pastebin'ed it
[19:23] [ 261.585748] EXT4-fs error (device sda1): ext4_lookup: deleted inode referenced: 12206095
[19:23] [ 261.585768] Aborting journal on device sda1:8.
[19:23] johanbr, apw: http://pastebin.com/f5c2ec258 has the latest error included
[19:25] opinions?
[19:26] I'm tempted to remount rw and see if there's any further instability,
[19:28] * cwillu reboots to fsck and repeat
[19:33] fsck finished, suspending again
[19:33] well, suspending after rebooting :/
[19:38] cwillu, hrm.
can we get all that info into the bug, and i'll have a look about. especially the version you are stable on
[19:38] apw, yep. I'm not completely convinced the last crash was related, but I'll post the dmesg's
[19:39] * apw has to pop out ...
[19:39] * cwillu tackles apw to hold him in the channel :p
[19:52] the daily hasn't done anything odd yet
[19:52] I'm going to keep poking it for a couple hours (longest I've ever had a suspend last and still crash)
[19:56] apw, updated the bug report with jaunty's dmesg
[20:22] * cwillu suspends again for kicks
[20:23] apw, it's still a little early to be sure, but I've suspended a couple times without rebooting on the daily kernel, and I haven't had any issues except for that first attempt with the ext4 mounting read-only, which may have been related to previous crashes more than the suspend
[20:38] cwillu, thats encouraging ... if you have an opportunity to do say 10 suspends and report back on the bug that would be handy
[20:39] definitely
[20:39] * cwillu suspends again
[20:41] [ 4022.240004] BUG: soft lockup - CPU#0 stuck for 61s! [python:8575]
[20:45] lots of b44: eth2: Error, poll already scheduled
[20:46] apw, how likely is it that a daily would have random crashers in it?
[21:22] apw, just completed 20 suspend cycles on the daily, 60 seconds between each resume and the next suspend. Nothing suspicious shows up in dmesg, I'm posting it to the bug report
[21:36] johanbr: hi again. I've had some progress in testing hibernation on my laptop.
[21:38] alright
[21:38] I'm about to head out, but go ahead...
[21:39] if I disable splash/quiet and enable no_console_suspend, and hibernate from vt1, and never try to go to vt7 (with X), then hibernation and resume works (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/366264)
[21:39] Malone bug 366264 in linux "[Dell XPS m1530] Resume fails after hibernate/suspend" [Undecided,New]
[21:40] I'm getting errors about btusb_intr_complete and btusb_bulk_complete.
They appear to have "failed to resubmit"
[21:49] that shouldn't matter very much
[21:49] okay
[22:06] johanbr: As I said, it appears that I can get hibernation to work properly in vt1, but I can't hibernate from within X or return to X once I've gone through a hibernation cycle. Any suggestions on what I should try testing next?
=== sconklin is now known as sconklin-gone
[22:37] Is it intentional that ddebs.ubuntu.com only provides karmic ddebs?
[22:38] maxb, I see a bunch of different releases in /dists/
[22:39] Sorry, for the kernel I mean
[22:39] pool/main/l/linux/ only seems to have karmic packages
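
A rough sketch of the point made in the 14:49-14:56 exchange, that power is linear in frequency but roughly quadratic in voltage, so 1GHz with voltage scaling can draw less than 800MHz of clock modulation at full voltage. It uses the textbook dynamic-power approximation P ~ C * V^2 * f and ignores leakage. The 1.83GHz nominal clock is from the log; the 0.85 relative voltage is purely an assumed, illustrative figure (the log gives no voltages), and `relative_dynamic_power` is a hypothetical helper name.

```python
def relative_dynamic_power(freq_ratio, voltage_ratio=1.0):
    """Dynamic power relative to full speed, using P ~ C * V^2 * f."""
    return freq_ratio * voltage_ratio ** 2


FULL_MHZ = 1830.0  # nominal clock, from the cpuinfo line quoted in the log

# Clock modulation (p4-clockmod style): effective 800 MHz, voltage unchanged.
clockmod = relative_dynamic_power(800 / FULL_MHZ)

# Frequency plus voltage scaling (acpi-cpufreq style): 1000 MHz at an ASSUMED
# 85% of full voltage. The 0.85 figure is illustrative, not taken from the log.
scaled = relative_dynamic_power(1000 / FULL_MHZ, voltage_ratio=0.85)

print(f"clock modulation @  800 MHz: {clockmod:.3f} of full power")
print(f"voltage scaling  @ 1000 MHz: {scaled:.3f} of full power")
```

Under that assumed voltage the scaled 1GHz state comes out lower; with a smaller voltage drop the comparison can go the other way, so the real answer depends on the voltages the BIOS's _PSS table actually specifies.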