/srv/irclogs.ubuntu.com/2009/06/01/#ubuntu-kernel.txt

TheMusou/c01:27
=== BenC1 is now known as BenC
colonelqubitI filed a hibernation bug about a month ago and I've been unable to fix it myself (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/366264).04:21
ubot3Malone bug 366264 in linux "[Dell XPS m1530] Resume fails after hibernate/suspend" [Undecided,New] 04:21
colonelqubitIs this the right place to ask for help in diagnosing and fixing this bug?04:21
colonelqubitis there somewhere else I should ask for help with this bug?04:30
johanbrcolonelqubit: remove "splash" and "quiet" from your kernel boot line04:59
johanbrthat should give you some more idea what's wrong04:59
colonelqubitjohanbr: will do04:59
colonelqubitjohanbr: should I edit menu.lst, or just temporarily change the boot params in the grub menu and then try to test suspend?05:03
johanbrcolonelqubit: doesn't really matter05:09
johanbrit's mostly for debugging, so maybe the temporary way is best05:10
colonelqubitjohanbr: what should I be looking for? I've been reading notes on the wiki (https://wiki.ubuntu.com/DebuggingKernelSuspendHibernateResume), but I'm not sure if there's specific info I should put on the bug.05:11
johanbrwell, it looks like it locks up, right?05:11
johanbrjust write down what the last few messages are05:11
colonelqubitjohanbr: yes, it looks like it locks up. I'll append the longer kernel log to the bug05:13
johanbralright05:14
johanbrgotta go... good luck05:14
colonelqubitjohanbr: thanks for the help -- should I ping back to #ubuntu-kernel once I've done this?05:14
johanbrsure, that doesn't hurt05:16
johanbralthough most of the kernel guys are probably asleep now :)05:16
colonelqubitokay, I'll try tomorrow morning05:16
johanbralright05:17
colonelqubithave a good night05:17
johanbrsame to you05:18
Yingying_ZhaoWhich kernel will Karmic final use?10:32
cooloneyYingying_Zhao, we decided to use 2.6.3110:46
apwyep 2.6.31 is the aim11:21
=== apw changed the topic of #ubuntu-kernel to: Karmic Kernel Plan: 2.6.31 -- Ubuntu Developer Summit this week. info on http://summit.ubuntu.com - and in #ubuntu-devel-summit
apwcking, we were musing about doing a small image for testing of new features on usb11:22
ckingapw, jaunty or karmic?11:23
apwdo you have any canned way of making those or is this something we need to think about11:23
apwi t11:23
apwi think the thought was making a karmic job with x-edgers, updated kernels and grub211:23
apwand somehow making all that fit in 0MB (of course)11:23
apwso it would be trivial to download and test as a bootable image not installed11:24
ckingapw: I just do a clean install to a 8GB USB pen drive and then remove the bloaty apps then shrink the partition using gparted 11:25
apwhow small do you think it might go?11:25
ckingMmm.. 2.2GB was everything, so possibly down to less than 2GB and compressed down to 600MB11:26
apwso not exactly tiny then11:26
apwmy chroots which contain X are of the order of 550MB11:28
ckingI've got a minimal gnome lpia installer which is ~450MB11:28
cking(compressed)11:28
apwthat's not the 100MB i was hoping for ... though it was a naive wish11:29
cking100MB is just not possible11:29
ckingapw: would a 500MB image be reasonable?11:33
apwfrankly i don't think it was me who cared how big it is11:34
apwi think the utility to both the user and us outweighs the size downsides11:34
ckingapw: do you want this image to be produced automatically?11:45
apwcking, i am not seeing a need for that ... 11:47
apwi can just update it and remake it etc pretty easy once i have a basic one11:47
ckingI suggest downloading the latest image, install it to a 8GB USB stick and removing all the bloaty apps and shrinking it with gparted and working with this as a basis.11:48
apwcking, yeah sounds like a reasonable plan indeed13:27
ogramoo14:40
Sam-I-Amoink14:41
ograroot@osiris:~# grep CPU_FREQ_DEBUG /boot/config-2.6.30-6-generic 14:42
ogra# CONFIG_CPU_FREQ_DEBUG is not set14:42
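A minimal sketch of the check ogra runs by hand above: looking up one option in the text of a kernel config file. The function name and the sample text are illustrative assumptions; only the "# CONFIG_CPU_FREQ_DEBUG is not set" line comes from the log.

```python
import re


def config_state(config_text, option):
    """Return the option's value, "not set", or None if it is not mentioned."""
    set_re = re.compile(r"^%s=(.*)$" % re.escape(option))
    unset_line = "# %s is not set" % option
    for line in config_text.splitlines():
        line = line.strip()
        if line == unset_line:
            return "not set"
        match = set_re.match(line)
        if match:
            return match.group(1)
    return None


# Sample text; the first line is a made-up neighbour, the second is from the log.
sample = "CONFIG_CPU_FREQ=y\n# CONFIG_CPU_FREQ_DEBUG is not set\n"
print(config_state(sample, "CONFIG_CPU_FREQ_DEBUG"))
```

To use it against a real system, one would pass in the contents of a file such as /boot/config-2.6.30-6-generic.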
ograhrm14:42
smbogra, trying to find something. But mjg59 if you happen to hang around. Would you know a way to prevent a built-in acpi-cpufreq from getting selected?14:43
smbogra, The whole of acpi debug is disabled by default14:43
ograyeah14:43
mjg59smb: No14:43
ograwe should probably have it on during development 14:43
mjg59Why would you want to?14:44
ogramy system is constantly running the fan and the frequency doesnt go below 1GHz since the modules were included in the kernel by default14:44
smbmjg59, To try to check what driver would get used next. ogra has a case where he had a lower freq available before but not with the acpi driver14:44
ograi'd like to have my 800MHz default back :)14:44
mjg59What hardware is this?14:45
ogramodel name: Intel(R) Core(TM)2 Duo CPU     T5550  @ 1.83GHz14:45
mjg59And now what's the lowest speed?14:45
smbmjg59, 1Ghz 14:46
mjg59acpi-cpufreq is the only correct driver for that chip14:46
mjg59If it's missing a state now then that implies that there's been a code change rather than a driver change14:47
mjg59Unless you were somehow using p4-clockmod previously, in which case your life is now immeasurably better than it was then14:47
ograi'm pretty sure with the modules a different direver was used14:47
ogra*driver14:47
ogranot sure if it was the right one, but it definitely scaled to a lower speed by default14:47
mjg59Other than acpi-cpufreq, only p4-clockmod could possibly bind to that chip14:48
smbmjg59, Yeah, that was somehow my suspicion14:48
mjg59And that doesn't do voltage scaling14:48
ograyeah, might be that it was the p4 one14:48
mjg59So things are now better than they were previously14:48
ogramy lappie is most of the time on AC ... what i care for is the constantly running fan14:48
mjg59It'll run coler at 1GHz with acpi-cpufreq than at 800MHz with p4-clockmod14:49
mjg59s/coler/cooler/14:49
ograhmm14:49
mjg59V^2/R and all that14:49
ograthen why does it run constantly even though thermal_zone only shows 55°C ?14:49
mjg59Because the lowest trip point is below 55C?14:49
ograroot@osiris:~# cat /proc/acpi/thermal_zone/TZ0/trip_points 14:51
ogracritical (S5):           91 C14:51
ogranot really14:51
mjg59Oh, so it's all BIOS controlled anyway14:51
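The trip-point question can be sketched as a small parser: read the trip_points text and list any trip point at or below the current temperature. This is only a sketch; the regular expression is an assumption based on the single "critical (S5): 91 C" line shown above, and other machines may print additional fields.

```python
import re

# Matches lines such as "critical (S5):           91 C".
TRIP_RE = re.compile(r"^([^\s:(]+)\s*(?:\([^)]*\))?:\s*(\d+)\s*C")


def parse_trip_points(text):
    """Map each trip point name to its temperature in degrees Celsius."""
    points = {}
    for line in text.splitlines():
        match = TRIP_RE.match(line.strip())
        if match:
            points[match.group(1)] = int(match.group(2))
    return points


def trips_at_or_below(points, temp_c):
    """Names of trip points that the given temperature has reached."""
    return sorted(name for name, trip in points.items() if trip <= temp_c)


points = parse_trip_points("critical (S5):           91 C\n")
print(points, trips_at_or_below(points, 55))
```

With only a critical trip point at 91 C, nothing is reached at 55 C, which is why the fan behaviour is attributed to the BIOS rather than to an ACPI trip point.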
ograi have no BOIS option for it though14:51
ogra*BIOS14:51
mjg59You typically won't14:51
ograwell, it used to only spin up once an hour or so when the machine was idling14:53
stgraberogra: got my mail ?14:53
ograuntil the whole stack of cpufreq/thermal moved into the kernel 14:53
stgraberoops, wrong chan :) thought I was in #ltsp, sorry14:53
mjg59ogra: Shrug.14:54
mjg59ogra: If you were using p4-clockmod before then your machine was running hotter. That's a thermodynamic certainty.14:54
smbProbably with p4-clockmod the temp was slightly lower at 800MHz 14:54
mjg59smb: No14:54
smbOk, that would have been my only way to explain this14:55
mjg59smb: Power consumption is linear with frequency, but tends towards a V^2 relationship with voltage scaling14:55
smbHm, and iirc the p4 driver did only freq scaling14:56
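mjg59's argument can be illustrated with the usual approximation that dynamic power scales with frequency times voltage squared. The sketch below compares frequency-only scaling to 800MHz against combined voltage and frequency scaling to 1GHz. The voltages are hypothetical placeholders, not values for the T5550; only the 800MHz, 1GHz and 1.83GHz figures come from the log.

```python
def relative_dynamic_power(freq_mhz, volts, ref_freq_mhz, ref_volts):
    """Dynamic power relative to a reference point, using P ~ f * V^2."""
    return (freq_mhz / ref_freq_mhz) * (volts / ref_volts) ** 2


# Hypothetical voltages, chosen only to illustrate the shape of the relation.
REF_FREQ_MHZ = 1830.0
REF_VOLTS = 1.25      # assumed voltage at full speed
LOW_VOLTS = 1.00      # assumed voltage at the 1GHz P-state

# Frequency-only scaling: 800MHz, voltage unchanged.
freq_only = relative_dynamic_power(800.0, REF_VOLTS, REF_FREQ_MHZ, REF_VOLTS)
# Voltage and frequency scaling: 1GHz at the lower voltage.
with_voltage = relative_dynamic_power(1000.0, LOW_VOLTS, REF_FREQ_MHZ, REF_VOLTS)

print("800MHz, same voltage:  %.2f of full power" % freq_only)
print("1GHz, reduced voltage: %.2f of full power" % with_voltage)
```

Under these assumed voltages the 1GHz state draws less power than 800MHz at full voltage, which is the point being made; the exact numbers depend entirely on the real voltage table.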
ograis there any way to tell acpi-cpufreq to use 800MHz ?14:56
mjg59ogra: It uses whatever frequencies your BIOS provides14:57
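To see which frequencies the driver was actually given, one can read the cpufreq sysfs file scaling_available_frequencies, which lists values in kHz. The parser below is a sketch; the sample string is invented for illustration and is not ogra's real output.

```python
# Usual location of the list on a system using acpi-cpufreq.
SYSFS_PATH = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies"


def parse_available_frequencies(text):
    """Convert a whitespace-separated list of kHz values to sorted MHz values."""
    return sorted(int(token) // 1000 for token in text.split())


def read_available_frequencies(path=SYSFS_PATH):
    """Read and parse the sysfs file; raises OSError if cpufreq is not present."""
    with open(path) as handle:
        return parse_available_frequencies(handle.read())


# Invented sample: a machine whose BIOS exposes no 800MHz state.
sample_mhz = parse_available_frequencies("1833000 1333000 1000000\n")
print(sample_mhz, 800 in sample_mhz)
```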
smbmjg59, Could a custom dsdt override the frequencies?14:58
mjg59Yes14:58
mjg59But if you're driving the chip at lower voltages than it's designed to, then you risk data corruption14:58
mjg59And just dropping the frequency won't save you anything14:59
mjg59It just means you spend less time in deep C states14:59
apwogra, i am a little surprised you have no 800mhz given the model you have there15:00
apwmy model, looks like a later version of yours and has 800 available15:00
mjg59The frequencies available are determined by the bios, not the CPU15:00
ogramy system is a clevo tn12015:01
apwinteresting ... i'd not thought of it that way15:01
mjg59ogra: Stick the output of acpidump somewhere?15:01
ograwow, thats huge15:01
mjg59Yes15:01
mjg59Don't pastebin it15:01
ograhttp://people.ubuntu.com/~ogra/acpidump.out15:03
mjg59Hm.15:07
smbmjg59, Trying to learn a bit more here. What would be the thing to look at?15:08
mjg59smb: You want the _PSS object15:08
mjg59It contains the set of ACPI performance states15:08
smbOk, seems to be referenced in ssdt2 and dsdt15:09
mjg59Except it's not actually present in them15:09
mjg59There's probably a dynamically loaded table15:09
smbI see... hm15:11
smbmjg59, That _tss method in ssdt2 looks slightly like setting up something oneshot...15:14
mjg59_TSS is for T-states, not P-states15:14
smbmjg59, sigh, quite confusing. It walks through the p-states to build the t-states... or so it looks...15:21
mjg59Yeah, I'm not sure what it's doing there15:21
* ogra is back from his phonecall15:35
smbogra, Unfortunately I seem to be far from understanding what your BIOS does... :(16:02
ograwell, dont ask me about it either :)16:03
smbogra, I won't :)16:03
snakieHi, I'm looking for a download location for the .ddeb containing debug information on 2.6.28-12-generic for jaunty. Do any mirrors exist or are the 2.6.30 versions on ddebs.ubuntu.com the only available versions?16:23
apwi believe ddebs.u.c is the place for those16:35
snakieyes, do you know where to find older kernel versions as ddebs.u.c only has 2.6.3016:37
cwilluhey, anyone want to look at a dmesg dump from a laptop that slowly dies after resuming from suspend?17:40
cwillu2.6.30rc7 from the mainline kernel ppa17:40
cwilluhttp://pastie.org/49667617:40
cwilluThe process that triggers the first oops is varies, but it's always "BUG: unable to handle kernel paging request at <address>"17:40
cwillus/is//17:40
cwilluI have an ssh open to it, so I can paste additional information, although at this point not much actually runs (dmesg does for instance, new gnome-terminal windows can open, but new bash processes don't start)17:42
johanbrthe disk doesn't come back to life properly?17:42
cwilluit kinda does17:43
cwilluit's weird17:43
cwilluI can resume and use it normally for a while17:43
cwilluand then things start breaking17:43
cwilluat which point, any access to the disk seems to hang the process that made the access17:43
cwilluunder our .30 kernel, it hangs almost immediately on resume (on the order of a few seconds to a few minutes).  on mainline, it works for several minutes to several hours17:44
cwilluhttps://bugs.launchpad.net/bugs/380807 has some info from the -generic kernel17:45
ubot3Malone bug 380807 in linux "[karmic, intel] Laptop locks up moments after resuming from suspend" [Undecided,New] 17:45
pgranerapw: can you get the bug above [380807] into the Suspend/Resume queue?17:48
cwilluwould ssh access to the box in this state be useful?17:48
pgranercwillu: yes don't knock it down just yet17:48
apwit would be useful to know if you are able to log in to it remotely17:49
cwillupgraner, yep, it's still up, and I can reproduce it fairly easily17:49
pgranerapw: I think he had a ssh up prior to the oops17:49
cwillu(sec, attaching dmesg to that bug, and then I'll see if I can ssh in a second time)17:49
pgranerapw: looks like we have mem being mapped into the ether17:49
cwillubah, ssh session just died17:52
bjfpgraner, are we having an IRC meeting tomorrow?17:52
cwilludesktop session is still up though17:52
cwilludmesg is attached to that bug now (same as the pastie)17:53
cwillupgraner, if you had one command that you could see the result of, what would it be?  (noting that it may just hang the last remaining terminal I have open)17:54
cwillu(can still run dmesg, but have no way to copy it elsewhere)17:54
apwany commands you run, run in the background with &17:55
apwand then if they hang you'll likely keep your shell17:55
cwilluk17:55
cwilluwasn't sure if it would lock it up too17:56
apwit may...17:56
apwdo you have KMS enabled?17:56
cwilluand... gnome-terminal just died17:56
cwillu(independently of any other commands, just stopped responding)17:57
apwdamn17:57
pgranerbjf: we should be meeting, I need to look at the page and see who is chairing it17:57
cwilluand there goes compiz, although I can still move the mouse :p17:58
bjfpgraner, I asked because I think I am chairing.17:58
apwcwillu, so this is triggered by a suspend/resume i assume ?17:58
cwilluapw, yes, every time17:58
pgranerbjf: heh, cool. I think just putting out a summary of kernel actions out of UDS would be good.17:58
Sam-I-Amcwillu: i might recommend using 'netconsole' on the next boot17:58
apwyou don't have an mmc card inserted do you?17:59
cwillunope17:59
cwilludesktop is locked up completely now (no mouse, etc)17:59
apwmouse too ... hmm17:59
cwillusysrq-s hit the disk though17:59
pgranercwillu: is this a laptop or desktop?17:59
cwillulaptop17:59
Sam-I-Amif you configure netconsole it'll dump all the kernel output to a udp port somewhere... like syslog17:59
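On the receiving machine, the netconsole output is just UDP datagrams, so a listener can be very small. This is a sketch under assumptions: port 6666 is used here as the target port and must match whatever the netconsole parameters on the laptop specify, and the function names are illustrative.

```python
import socket


def open_listener(port=6666, host="0.0.0.0"):
    """Bind a UDP socket on which netconsole datagrams can be received."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_messages(sock, count, timeout=5.0):
    """Collect up to `count` datagrams, stopping early if nothing arrives in time."""
    sock.settimeout(timeout)
    messages = []
    for _ in range(count):
        try:
            data, _addr = sock.recvfrom(65535)
        except socket.timeout:
            break
        messages.append(data.decode("utf-8", errors="replace"))
    return messages
```

A capture session would call open_listener() once and then loop over receive_messages(sock, 1), writing each message to a file so the log survives the laptop locking up.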
cwillususpend was rock stable from 7.04 up to 9.0417:59
cwilluSam-I-Am, next reboot :)18:00
Sam-I-Amheh18:00
pgranercwillu: 2.6.30 got an overhaul of the suspend/resume subsystem18:00
apwwho was talking about locking issues in -rc6 for suspend?  who told us about that18:00
pgranerapw: marcel was telling us at UDS18:01
cwilluapw, I saw a bug about that I think, I might have it open still18:01
apwcwillu, be good to have any pointers you have found already to save time18:01
pgranerapw: perhaps an apport-collect to make sure you have everything?18:02
cwillupgraner, the attachments on my bug were from apport-collect18:02
cwilluhttp://patchwork.kernel.org/patch/1113/18:03
pgranercwillu: cool18:03
cwilluthat's old though18:03
cwillujanuary18:03
cwillujust noticed the date18:03
cwilluthere's 3 vaguely similar bugs on launchpad that I saw, all against 2.6.28 though18:04
cwilluI'm going to reboot and get it reproducing again18:04
apwcwillu, looks interesting as i am sure that same routine appears in suspend18:04
apwand so far i've not seen that change as presented in the kernel18:05
cwilludo you want me on ubuntu's kernel or mainline's?18:05
cwilluboth have the problem, ubuntu's crashes far quicker though (almost too quickly to do anything with)18:05
cwilluother interesting datapoints, I _think_ it worked fine while I was still on jaunty with 2.6.30rcX (needed it to work around the ext4 file deletion hanger)18:06
apwcwillu, so that was with a mainline kernel?  your own build?18:06
cwillualthough you should probably ignore that, I don't remember at all well enough18:06
cwilluapw, mainline ppa18:06
apwso that should still be on your machine and testable on karmic too18:07
cwilluapw, you misunderstood me :p18:07
cwilluapw, both the mainline ppa kernel and ubuntu's -generic do this18:07
cwilluthat dmesg came from 2.6.30rc7 from the mainline ppa18:07
apwok ...18:08
* cwillu reboots into 2.6.30-020630rc7-generic18:08
johanbrany major differences in userspace suspend between jaunty and karmic?18:08
apwthe patch you pointed to on patchworks is somehow in karmic under a different patch id18:08
apwjohanbr, outside the kernel no, inside some18:08
cwilluhow do I verify that I'm not using kms?18:09
cwilluI played with plymouth during jaunty's release (as long as I was on 2.6.30 and all that)18:09
apwunless you have modeset mentioned on your grub command line18:09
cwilluthought I removed it completely18:09
apwor in your module options then you don't18:09
apwit is opt in currently18:09
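The first half of apw's check, whether modeset appears on the kernel command line, can be sketched as below. The sample command lines are invented, and the sketch does not cover the second half (module options), which would need a look through the modprobe configuration.

```python
def modeset_args(cmdline):
    """Return the kernel command line arguments that mention modeset."""
    return [token for token in cmdline.split() if "modeset" in token]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read()


# Invented examples of a boot line without and with a modeset option.
print(modeset_args("root=/dev/sda1 ro quiet splash"))
print(modeset_args("root=/dev/sda1 ro quiet splash i915.modeset=1"))
```

An empty list from modeset_args(read_cmdline()) suggests KMS was not requested at boot, subject to the module options caveat above.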
cwilluk, should be clear then18:10
johanbrcwillu: how do you suspend?18:11
cwillujohanbr, normal suspend18:11
cwillunot uswsusp or anything like that18:11
apwcwillu, another good test point would be the tip of linus' tree: http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/2009-05-30/18:25
cwilluapw, k, downloading it right now18:25
apwthere are some locking changes for suspend in there, though i think they are meant to be more theoretical to be fair18:25
cwilluis there an easy way to force a message so I can see if this netconsole is actually working?18:26
apwhrm18:26
apwsysrq help might do it18:26
apwi think the title is always emitted18:26
cwilluyep18:26
cwillugood18:26
cwillu(in the sense of 'it worked')18:26
apwalso worth doing the suspend from vt1, as you may see errors there18:26
cwillumessages that wouldn't show up in /var/log/{messages/kern.log/syslog/dmesg}?18:27
apwdepending on the nature of the hang they might not get to disk yes18:28
cwilluthe oops that starts to bring things down doesn't necessarily occur right away18:29
cwilluI'll still try it, I'm just spewing everything I can think of that might be relevant18:30
cwillu8 minutes until the download is complete, at which point I'll install,  and then suspend on the current kernel18:30
cwilluor should I go straight to the daily and see if it reproduces first?18:30
* cwillu pokes apw with an inquisitive stick18:32
apwi think continue with your test on the current, and then the c-o-d kernel18:32
cwilluk18:33
cwilluinstalling right now18:39
cwillususpending from console18:47
cwilluhard lock18:48
cwillu9 segfaults are visible, laptop_mode mostly, and one cpufreq18:49
cwilluI think I saw another error flash by too18:50
* cwillu reboots and tries that again18:50
cwillunothing showed up in the netconsole18:50
cwilluactually, that's a lie18:50
cwillu[drm:i915_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 018:50
cwillushow up18:50
johanbrcwillu: could you try with the jaunty kernel?18:51
cwillujohanbr, sure, doing it now18:51
cwillu(noting that jaunty's kernel has ext4 issues that causes other lockups)18:51
cwillu(but I shouldn't run into them without deleting files)18:52
* cwillu suspends18:55
cwillucame back with no errors on the console18:56
cwillu2.6.28-10-generic #3318:56
cwillunetconsole didn't show anything during the suspend, although it's still connected (I did a sysrq-sync before and after, both showed up in the netconsole)18:57
cwillujohanbr, apw, given my original dmesg, is there anything that jumps out at you as a way to check whether something has gone wrong, even before the system oops?18:58
cwillusomething in /proc/ or /sys/ I could check?18:59
johanbrComparison with the dmesg from the jaunty kernel might be interesting19:02
cwilluI'll post that in a minute19:06
cwillujust proving to myself that it's stable19:06
cwilluby playing flash games on casualcollective.com of course :p19:06
cwilluseems to be fine19:13
cwilluokay, dmesg19:13
cwilluhttp://pastebin.com/f3757f4a519:14
cwilluwas the original with the fault http://pastie.org/49667619:14
cwillu(bah, pastie was with fault, pastebin was jaunty's kernel that worked fine)19:14
* cwillu pokes johanbr and apw 19:15
cwilluI'm rebooting into the daily 2.6.30 now19:15
* cwillu suspends current daily kernel19:18
cwilludmesg from 2.6.30-999-generic (current daily):  http://pastebin.com/f162d6f0219:20
cwilludidn't immediately crash, let me test it for a bit again19:20
cwilluoh, wait19:22
cwilluan error showed up a second after I pastebin'ed it19:22
cwillu[  261.585748] EXT4-fs error (device sda1): ext4_lookup: deleted inode referenced: 1220609519:23
cwillu[  261.585768] Aborting journal on device sda1:8.19:23
cwillujohanbr, apw:  http://pastebin.com/f5c2ec258 has the latest error included19:23
cwilluopinions?19:25
cwilluI'm tempted to remount rw and see if there's any further instability, 19:26
* cwillu reboots to fsck and repeat19:28
cwillufsck finished, suspending again19:33
cwilluwell, suspending after rebooting :/19:33
apwcwillu, hrm. can we get all that info into the bug, and i'll have a look about.  especially the version you are stable on19:38
cwilluapw, yep.  I'm not completely convinced the last crash was related, but I'll post the dmesg's 19:38
* apw has to pop out ...19:39
* cwillu tackles apw to hold him in the channel :p19:39
cwilluthe daily hasn't done anything odd yet19:52
cwilluI'm going to keep poking it for a couple hours (longest I've ever had a suspend last and still crash)19:52
cwilluapw, updated the bug report with jaunty's dmesg19:56
* cwillu suspends again for kicks20:22
cwilluapw, it's still a little early to be sure, but I've suspended a couple times without rebooting on the daily kernel, and I haven't had any issues except for that first attempt with the ext4 mounting read-only, which may have been related to previous crashes more than the suspend20:23
apwcwillu, thats encouraging ... if you have an oppotunity to  do say 10 suspends and report back on the bug that would be handy20:38
cwilludefinitely20:39
* cwillu suspends again20:39
cwillu[ 4022.240004] BUG: soft lockup - CPU#0 stuck for 61s! [python:8575]20:41
cwillulots of b44: eth2: Error, poll already scheduled20:45
cwilluapw, how likely is it that a daily would have random crashers in it?20:46
cwilluapw, just completed 20 suspend cycles on the daily, 60 seconds between each resume and the next suspend.  Nothing suspicious shows up in dmesg, I'm posting it to the bug report21:22
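Checking dmesg for "anything suspicious" after each cycle can be sketched as a simple filter. The pattern list is an assumption assembled from the failures quoted earlier in this log (BUG:, soft lockup, EXT4-fs error, *ERROR*, segfault); it is not an exhaustive list of kernel error markers.

```python
import re

# Patterns drawn from the errors seen in this session; extend as needed.
SUSPICIOUS = [r"BUG:", r"Oops", r"soft lockup", r"EXT4-fs error",
              r"\*ERROR\*", r"segfault"]


def suspicious_lines(dmesg_text, patterns=SUSPICIOUS):
    """Return the dmesg lines that match any of the given patterns."""
    regex = re.compile("|".join(patterns))
    return [line for line in dmesg_text.splitlines() if regex.search(line)]


# The first two lines are quoted from the log; the third is a made-up benign line.
sample_dmesg = "\n".join([
    "[ 4022.240004] BUG: soft lockup - CPU#0 stuck for 61s! [python:8575]",
    "[  261.585748] EXT4-fs error (device sda1): ext4_lookup: deleted inode referenced: 12206095",
    "[  300.000000] PM: resume of devices complete",
])
for hit in suspicious_lines(sample_dmesg):
    print(hit)
```

Running the filter on the saved dmesg after each of the 20 cycles gives a quick pass or fail signal before attaching the full log to the bug.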
colonelqubitjohanbr: hi again. I've had some progress in testing hibernation on my laptop.21:36
johanbralright21:38
johanbrI'm about to head out, but go ahead...21:38
colonelqubitif I disable splash/quiet and enable no_console_suspend, and hibernate from vt1, and never try to go to vt7 (with X), then hibernation and resume works (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/366264)21:39
ubot3Malone bug 366264 in linux "[Dell XPS m1530] Resume fails after hibernate/suspend" [Undecided,New] 21:39
colonelqubitI'm getting errors about btusb_intr_complete and btusb_bulk_complete. They appear to have "failed to resubmit"21:40
johanbrthat shouldn't matter very much21:49
colonelqubitokay21:49
colonelqubitjohanbr: As I said, it appears that I can get hibernation to work properly in vt1, but I can't hibernate from within X or return to X once I've gone though a hibernation cycle. Any suggestions on what I should try testing next?22:06
=== sconklin is now known as sconklin-gone
maxbIt is intentional that ddebs.ubuntu.com only provides karmic ddebs?22:37
cwillumaxb, I see a bunch of different releases in /dists/22:38
maxbSorry, for the kernel I mean22:39
maxbpool/main/l/linux/ only seems to have karmic packages22:39
