=== panda is now known as Guest59538 | ||
=== _LibertyZero is now known as LibertyZero | ||
=== LibertyZero is now known as _LibertyZero | ||
=== Guest59538 is now known as panda|x201 | ||
htorque | hello everyone! i'm facing a significant boot time regression with the 2.6.38 kernels: http://img.xrmb2.net/images/274432.png (hybrid laptop) - is this something known? | 08:40 |
---|---|---|
apw | htorque, nope that isn't known | 08:51 |
apw | htorque, are those two graphs with the identical userspace? | 08:53 |
apw | htorque, although plymouthd is fingered, it is a process which simply waits | 08:54 |
=== fddfoo is now known as fdd | ||
htorque | apw, yes, identical userspace on all four | 08:59 |
htorque | apw, i'd like to open a bug report on this, but i don't know for which component - linux (upstream?), plymouth? | 09:00 |
apw | htorque, indeed tricky; plymouth is likely only blamed because it waits for the display to appear, so i am inclined not to blame it without seeing the rest of the images | 09:05 |
apw | can i see the whole boot charts in each of the two you pasted before | 09:06 |
apw | 37-12 against 38-1 | 09:06 |
apw | i think it was | 09:06 |
htorque | sure, wait a sec (38-3 it is) | 09:06 |
htorque | apw, intel: http://img.xrmb2.net/images/100556.png vs. http://img.xrmb2.net/images/217302.png - nvidia: http://img.xrmb2.net/images/306234.png vs. http://img.xrmb2.net/images/502469.png | 09:16 |
=== smb` is now known as smb | ||
apw | htorque, there doesn't seem to be anything specific occurring in userspace in the two slow images. the only apparent oddity is that there is a kworker running for most of the 'gap' period in both images ... i am suspicious it is related but it is very hard to tell from these | 09:22 |
apw | kworker/3:0 in the intel case and kworker/1:0 in the other are shown slightly coloured for about 8 or so seconds | 09:23 |
apw | unsure what the colour represents as there is nothing in the top graph then | 09:24 |
apw | htorque, is there anything in the dmesg of the slow version? can i see a dmesg off the intel one for the boot | 09:24 |
htorque | the light red should be unint. sleep, but i'm not sure - it's definitely not i/o | 09:25 |
apw | htm | 09:27 |
htorque | apw, http://paste.ubuntu.com/568091/ | 09:29 |
apw | htorque, this was -3.30 yes ? | 09:31 |
htorque | -38.3.17 | 09:31 |
apw | it is 3.30 ... ok | 09:32 |
htorque | ah, sorry, checked the wrong package :) | 09:35 |
htorque | apw, i have to leave for ~2h, will stay on though | 09:42 |
TeTeT | is linux-crashdump broken on Natty? I try to force a kernel crash dump (customer interest), following https://wiki.ubuntu.com/Kernel/CrashdumpRecipe but I don't get a crash image in /var/crash | 10:10 |
TeTeT | kernel in use is 2.6.38-3-generic, linux-crashdump is 2.6.38.3.17 | 10:10 |
smb | There seems to be some brokenness in every release | 10:11 |
smb | (and not enough time to play around with it to get it fixed) | 10:12 |
* smb thinks it's broken at least in Lucid as well | 10:12 | |
TeTeT | smb: any hints on how to get this fixed? | 10:16 |
smb | TeTeT, Since we rarely use it, it's not that high on our list. Would need someone to sit down and look at it and frankly we've got enough other things that we have to look at. :/ | 10:20 |
apw | TeTeT, i'd say mostly we aren't aware which releases it does and does not work in, which implies it is not a heavily used feature (no one screaming for it) and thus not a high priority | 10:20 |
TeTeT | smb+apw: ok, the query was motivated by a customer's employee that attended a linux debugging class last week, so it's not a high prio either - btw the class was run on opensuse, there the crashdump stuff seems to work | 10:22 |
TeTeT | smb: thanks for the detailed instructions on bug 539467, will give it a try right away | 10:23 |
ubot2 | Launchpad bug 539467 in linux "SATA link power management causes disk errors and corruption" [Medium,Confirmed] https://launchpad.net/bugs/539467 | 10:23 |
smb | TeTeT, ok thanks | 10:24 |
TREllis | I have a case where a certified system has suspend broken in Lucid, there is a SUSPEND_MODULES workaround and the bug (522998) is set to won't fix for Lucid | 10:56 |
TREllis | what are the options? | 10:57 |
htorque | apw, that kworker process is indeed in state D (i made pybootchartgui paint unint. sleep in a different color) - is there any way i can find out what it's doing? profile it somehow? | 12:40 |
tgardner | cking, AceLan_: need to bounce tangerine for SSL security update | 13:41 |
cking | ok | 13:42 |
cking | tgardner, I'm off | 13:42 |
tgardner | cking, I know :) | 13:42 |
cking | I just like filling this channel with pointless information ;-) | 13:43 |
apw | cking, but what are you off, your food ? | 13:59 |
apw | htorque, tricky as it doesn't continue after boot hrm | 14:00 |
tgardner | smb, are you pushing maverick meta? | 14:00 |
tgardner | nm, just saw the email | 14:00 |
smb | tgardner, Look into your latest copy of the inbox. :) | 14:00 |
smb | tgardner, Just did not get to it before the break | 14:01 |
apw | htorque, i have profiled a couple of my systems (intel) with the 37-12 and 38-3 kernels and cannot see any difference, certainly not that kworkerd lit up | 14:01 |
apw | tgardner, any idea which timezone jbarnes is in? | 14:05 |
tgardner | apwhe lives in pdx | 14:10 |
apw | tgardner, thought he might | 14:10 |
tgardner | apw: damn tab completion | 14:10 |
tgardner | apw: or at least, I think he works at Jones Farm last I knew | 14:11 |
apw | heh, normally for me it pings someone unrelated | 14:11 |
apw | at least you didn't do that :) | 14:11 |
tgardner | apw: well, if I didn't have to stare at my keyboard whilst typing... | 14:12 |
apw | tgardner, those pesky keys, they tend to move around i find | 14:12 |
tgardner | mmm, snarky | 14:13 |
htorque | apw, just compiled -rc5 (with ubuntu .config) and got some oopses at the time when plymouthd stops in those charts - they point to this bug https://bugzilla.kernel.org/show_bug.cgi?id=26232 | 14:20 |
ubot2 | bugzilla.kernel.org bug 26232 in Console/Framebuffers "Multiple framebuffer oops and sysfs attribute deadlock" [Normal,New] | 14:20 |
htorque | weird thing, i see no such messages with the ubuntu kernels | 14:20 |
htorque | could this be the problem (i'm currently re-compiling using those two patches)? | 14:20 |
apw | htorque, yep without the fixes we have for open/close of framebuffers you will hit all kinds of races | 14:21 |
apw | we get to framebuffer opens much earlier than most distros | 14:22 |
apw | i have fixes which i need to push up | 14:22 |
apw | htorque, there is a -4 kernel building in the archive you could test | 14:22 |
htorque | apw, ok, so that isn't the problem then (stopping the kernel build :-)) | 14:22 |
htorque | apw, yeah will try | 14:23 |
apw | htorque, nope, those are just fookages in mainline, clear bugs | 14:23 |
htorque | apw, no luck, same with -4 | 14:36 |
apw | htorque, ok... file a bug pls with the latest diagrams attached | 14:37 |
htorque | at LP? | 14:37 |
apw | sure | 14:37 |
htorque | ok, will do, thanks! | 14:37 |
apw | file it against linux for the time being as kworkerd is the most suspicious thing on there | 14:37 |
apw | plus you are only changing the kernel to get the behaviour | 14:38 |
apw | htorque, please include the info that there are intel and nvidia machines | 14:39 |
htorque | actually it's the same machine with both gpus (hybrid) | 14:39 |
apw | htorque, oh, not a mac is it .? | 14:39 |
htorque | no, it's a thinkpad t510 | 14:39 |
htorque | (they can boot with either gpu while the other one is really switched off - unlike some other hybrid laptops from acer, sony, etc.) | 14:41 |
htorque | apw, do you think it's worth it doing kernel bisecting? | 14:42 |
apw | htorque, probably the only way | 14:44 |
htorque | apw, never done this with ubuntu kernel sources - is ubuntu/ubuntu-natty.git the right one? | 14:48 |
apw | htorque, yeah but you will find it hard to bisect cause of the ubuntu delta | 14:48 |
apw | can you remind me, this issue appears in -1 yes ? | 14:48 |
apw | and can you see the kworker slowness on the mainline boot | 14:49 |
apw | even though it oopses ? | 14:49 |
apw | as it's easier to bisect at that level | 14:49 |
htorque | yes, started with -1 and yes, even with the oops i see the kworker thing | 14:50 |
htorque | *rc1 | 14:50 |
apw | htorque, ok then i have the infrastructure to do bisect builds between 2.6.37 and 2.6.38-rc1 | 14:50 |
apw | htorque, what's the bug number | 14:52 |
htorque | not done yet | 14:52 |
apw | so run ubuntu-bug linux when running -5 | 14:52 |
apw | and get me the bug number | 14:53 |
apw | and i'll get building you a bisect kernel | 14:53 |
htorque | -5? or -4? | 14:54 |
htorque | apw, i have a small kernel config for this machine which builds in ~15 minutes, if i see the bug with it i could certainly do it locally without a pain | 14:55 |
apw | htorque, ok then thats fine | 14:56 |
apw | -4 sorry missed | 14:56 |
apw | htorque, if you can do it in 15mins then it makes most sense for you to do it | 14:56 |
=== sconklin-gone is now known as sconklin | ||
=== Nafallo_ is now known as Nafallo | ||
=== herton is now known as herton_lunch | ||
htorque | apw, bug report will need some time as apport complains about -4 not being a genuine ubuntu package | 15:10 |
* apw SHOUTS AT LAUNCHPAD | 15:13 | |
apw | launchpad get the hell out of my way and let me do my job | 15:14 |
=== yofel_ is now known as yofel | ||
=== herton_lunch is now known as herton | ||
=== ayan_ is now known as ayan | ||
sforshee | I could use some advice debugging a hang on resume with a toshiba netbook with an Atom N450. The machine hangs for 5 minutes on resume (exactly how long it takes for the lower 32 bits of the hpet to wrap on this machine). The hang happens when executing the acpi WAK method. Booting with "nohz=off highres=off" makes the hang go away, and one difference is that the hpet is put in periodic mode instead of oneshot mode during resume. Booting with hpet=disable doesn't fix the hang during resume, in fact machine never resumes. But performance is much better with hpet=disable or nohz=off. Any suggestions? | 16:09 |
cking | sforshee, what model is it? | 16:13 |
sforshee | cking, nb 305 | 16:13 |
=== JanC_ is now known as JanC | ||
cking | sforshee, does "nohz=off highres=off noapic" work? | 16:18 |
cking | oops, you said it does. | 16:18 |
sforshee | cking, "nohz=off highres=off", I probably haven't tried with all three | 16:19 |
cking | sforshee, try "irqpoll" instead of "noapic" too | 16:19 |
mjg59 | Probably the same hpet polarity thing | 16:19 |
sforshee | cking, ack | 16:19 |
cking | sounds like it | 16:19 |
sforshee | mjg59, that's a strong possibility, although the workaround that worked on the AMD-based machines doesn't work here | 16:20 |
sforshee | the workaround being acpi_skip_timer_override | 16:20 |
smb | That could be if the interrupt the timer then ends up on has the wrong polarity set too | 16:21 |
sforshee | did a proper workaround for the hpet polarity issue ever turn up? | 16:22 |
smb | not that I saw one | 16:22 |
smb | And Andreas seems either to ignore me or being on vacation | 16:22 |
mjg59 | Although it's an Intel chipset rather than an AMD one | 16:23 |
mjg59 | So... | 16:23 |
smb | (one could suspect Toshiba may be doing it wrong even on Intel) but a fix for that is unlikely to come from Andreas | 16:25 |
sforshee | in general things just seem to be bad on this machine when the hpet is in oneshot mode, in periodic mode things seem much better | 16:27 |
tgardner | sconklin, bjf: I uploaded linux-meta for lucid/maverick for bug #720139 | 16:55 |
ubot2 | Launchpad bug 720139 in linux-meta "Kernel meta packages are built for wrong architectures." [Undecided,Fix committed] https://launchpad.net/bugs/720139 | 16:55 |
sconklin | tgardner: we should replace the one in proposed, right? | 16:55 |
bjf | tgardner, once they've built do you want us to coordinate with pitti or will you ? | 16:56 |
tgardner | sconklin, I uploaded straight to the archive and bypassed the c-k-t PPA | 16:56 |
tgardner | it should just go through the normal channels | 16:56 |
sconklin | I thought everything had to go through the ppa so it is built against -security packages. Although for meta I don't think there could be a conflict | 16:57 |
tgardner | sconklin, linux-meta has no dependencies so I think we're OK | 16:58 |
cking | mjg59, so is this polarity setting wrong because it's mis-configured in the MP configuration table? Or am I way off here? | 17:04 |
mjg59 | I'd guess | 17:04 |
mjg59 | Or wherever we get the default polarities from on APIC systems | 17:05 |
* cking gets reading the MP spec... | 17:05 | |
cking | Nothing is trivial in this domain | 17:05 |
sforshee | cking, mjg59, there is an override in the MP configuration table for the toshiba | 17:09 |
sforshee | [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge) | 17:10 |
sforshee | [ 0.000000] Int: type 0, pol 1, trig 1, bus 00, IRQ 00, APIC ID 2, APIC INT 02 | 17:10 |
sforshee | I'm preparing a test build to override what's in the MP table, any guesses what triggering I should use? | 17:12 |
cking | sforshee, well, it's a 50/50 guess isn't it? | 17:25 |
sforshee | cking, well, if polarity is the only problem | 17:26 |
sforshee | there's also edge vs. level | 17:26 |
sforshee | one option can be eliminated right away, though :) | 17:26 |
sforshee | i'm starting with only changing polarity | 17:27 |
=== sforshee is now known as sforshee-lunch | ||
jamespage | I'm hoping someone on channel can clear up some confusion re kernel names for ubuntu server | 17:47 |
jamespage | So (and this is all for lucid) for the minimal virtual install the linux-image-virtual package gets installed | 17:47 |
jamespage | this is nice and small (that's all OK) but uname -r returns a -server kernel name - is this correct? the current ISO test case says it should return -virtual | 17:48 |
tgardner | jamespage, for Lucid the virtual package was distilled from server binaries | 17:48 |
tgardner | so, there is no -virtual kernel per se | 17:49 |
jamespage | excellent :-); hggdh - can we get the test case updated for next time? | 17:49 |
jamespage | tgardner: has this changed in maverick or natty? | 17:49 |
tgardner | jamespage, note that maverick and subsequent releases _will_ have a -virtual kernel | 17:50 |
jamespage | tgardner: thanks for clearing that up for us | 17:50 |
hggdh | jamespage: I will update the test with a caveat on lucid | 17:51 |
jamespage | hggdh: marvellous! | 17:52 |
hggdh | OK, another question. the i386 ISO (at least on Lucid, probably valid onwards) defaults to always installing -pae. Is this correct? | 18:00 |
tgardner | hggdh, The ISO boots the non-pae kernel IIRC, then detects the platform capabilities and installs accordingly. cjwatson or ev could probably answer deep installer questions like that. | 18:02 |
hggdh | tgardner: thank you | 18:03 |
cking | sforshee-lunch, once you get some insight, please let me know via email. meanwhile I think I should dig into seeing if we can make this testable from fwts | 18:49 |
* tgardner --> lunch | 19:01 | |
JFo | <- headed to lunch | 19:01 |
=== sforshee-lunch is now known as sforshee | ||
cking | everyone is making me feel hungry | 19:11 |
apw | cking, its your dinner time, go away and don't come back | 19:12 |
cking | yessir! | 19:13 |
apw | i am watching | 19:13 |
cking | and... | 19:13 |
sforshee | cking, will do, so far I've tested edge-low triggering with no luck, moving on to level triggering now | 19:13 |
cking | sforshee, ta | 19:13 |
Darxus | "make install" doesn't work (doesn't update grub) in Maverick, right? It looks like it does in Natty? | 19:38 |
tgardner | Darxus, yep, I fixed the package for natty | 19:39 |
Darxus | tgardner: Nice, thanks. | 19:39 |
tgardner | Darxus, you can fix maverick by just copying /sbin/installkernel from Natty | 19:40 |
Darxus | tgardner: Yeah, looks like the only significant difference is the run-parts line? | 19:40 |
Darxus | Any chance of that getting into Maverick? | 19:40 |
tgardner | Darxus, IIRC | 19:40 |
tgardner | Darxus, unlikely, since it's really a developer tool. | 19:41 |
Darxus | Okay, thanks. | 19:42 |
GrueMaster | tgardner: On bug 720189, I don't believe marvell (or any ubuntu supported platforms) use it. It looks like the Amiga serial port driver. | 19:48 |
ubot2 | Launchpad bug 720189 in linux-lts-backport-maverick "CVE-2010-4076" [Undecided,In progress] https://launchpad.net/bugs/720189 | 19:48 |
GrueMaster | (and marvell kernel is on 2.6.32). | 19:50 |
tgardner | GrueMaster, the CVE description is a bit misleading. the patch actually cleans up a generic TTY vulnerability. | 19:51 |
GrueMaster | Ah. Because the bug was against amiserial.c. That's how I was confused. | 19:51 |
tgardner | GrueMaster, I'm still considering what to do with the mvl-dove regression. I'm considering just backing out stable updates and only doing CVEs (which is all that we are contractually obligated for) | 19:52 |
GrueMaster | That would be my suggestion. | 19:52 |
GrueMaster | Especially considering they are doing hardware changes that we can't follow. | 19:53 |
tgardner | GrueMaster, cool. I'll get started... | 19:53 |
* jjohansen -> lunch | 19:59 | |
=== Artir is now known as JoseLuisRicon | ||
=== JoseLuisRicon is now known as Artir | ||
htorque | apw, git bisect suggests http://git390.marist.edu/cgi-bin/gitweb.cgi?p=linux-2.6.git;a=commit;h=23d69b09b78c4876e134f104a3814c30747c53f1 - does this make any sense? (i'll re-compile the enclosing commits to confirm) | 20:42 |
skaet_ | rtg, can someone have a quick look at https://launchpad.net/bugs/720865 | 20:45 |
ubot2 | Ubuntu bug 720865 in linux "kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)" [Undecided,New] | 20:45 |
=== sconklin is now known as sconklin-afk | ||
skaet_ | tgardner, ^^ | 20:52 |
tgardner | skaet_, ack | 21:00 |
skaet_ | thanks! | 21:00 |
apw | htorque, it seems to be a merge, i am surprised it would tell you it was that one | 21:09 |
apw | though i am not surprised if one of the commits that was merged is the culprit | 21:10 |
apw | normally though i would expect it to try and select which of the merged commits is at fault | 21:10 |
htorque | apw, yeah, unfortunately git-bisect stopped at that point. anyways, that's what i'm planning to do after i double checked it's the right one. | 21:15 |
apw | yeah i guess start with the right hand side of the merge as 'bad' and probably pick something old as 'good' :) | 21:16 |
GrueMaster | tgardner: Out of curiosity, I am currently subscribed to linux-mrvl-dove, which has generated a recent spate of CVE spam. I assume you are running a script to push these? If so, you may want to modify it to automatically exclude natty for dove, as it is not supported at this time. | 21:21 |
tgardner | GrueMaster, you're assuming launchpad has that flexibility. besides, I'm having to do the whole damn thing by hand. | 21:24 |
GrueMaster | Oh, oops. :P | 21:24 |
GrueMaster | Wasn't trying to nit pick. Just thought it was semi-automated. No problems. | 21:24 |
tgardner | GrueMaster, just uploaded Lucid mvl-dove to https://launchpad.net/~canonical-kernel-team/+archive/ppa, so see if you can install it tomorrow (assuming it builds OK) | 21:29 |
GrueMaster | Will do. | 21:29 |
GrueMaster | Will it be the same package for Maverick (assuming since they are the same code base)? | 21:30 |
tgardner | GrueMaster, yep, identical code base, different compiler | 21:31 |
=== sconklin-afk is now known as sconklin | ||
GrueMaster | So, with all these recent patches for dove kernels, is anyone updating linux-image-imx51 (karmic, lucid), linux-image-omap (lucid, maverick) or linux-image-omap4 (maverick, natty)? | 21:46 |
=== _LibertyZero is now known as LibertyZero | ||
=== sconklin is now known as sconklin-gone |
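
Note on htorque's 12:40 question (what is a kworker stuck in state D actually doing?): one possible approach, not discussed in the channel, is to list the tasks in uninterruptible sleep and read their kernel stacks from /proc. The sketch below is only an illustration under assumptions: the helper names are made up here, /proc/<pid>/stack is only present on kernels built with stack trace support and normally needs root, and since apw notes the stall does not continue after boot, it would have to be run while the stall is in progress.

```python
import os


def parse_stat(stat_line):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc="/proc"):
    """Yield (pid, comm) for every task currently in state D."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # task exited while we were looking
        if state == "D":
            yield pid, comm


def kernel_stack(pid, proc="/proc"):
    """Return the task's kernel stack text, or None if it cannot be read."""
    try:
        with open(os.path.join(proc, str(pid), "stack")) as f:
            return f.read()
    except OSError:
        return None


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
        print(kernel_stack(pid) or "  (stack unavailable)")
```

Sampling this repeatedly during the gap would show whether the kworker sits in the same kernel function the whole time, which may narrow down where to look before or alongside a bisect.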
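
Note on sforshee's 16:09 remark that the 5 minute resume hang matches the time the lower 32 bits of the hpet take to wrap: the arithmetic can be checked with a few lines. The counter frequency is an assumption here (14.31818 MHz is a common HPET rate on Intel chipsets; the log does not state the rate for this netbook), and the function name is made up for the example.

```python
# Assumed HPET counter frequency in Hz; the real value for the machine
# in the log is not given and should be substituted if known.
HPET_HZ = 14_318_180


def wrap_seconds(counter_bits, hz=HPET_HZ):
    """Seconds until a free-running counter of the given width wraps."""
    return 2 ** counter_bits / hz


if __name__ == "__main__":
    secs = wrap_seconds(32)
    print(f"32-bit wrap: {secs:.1f} s ({secs / 60:.2f} min)")
```

At the assumed rate a 32-bit counter wraps after roughly 300 seconds, which is consistent with the reported hang: a oneshot comparator value that has just been missed would not match again until the counter comes back around.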