/srv/irclogs.ubuntu.com/2011/02/17/#ubuntu-kernel.txt

=== panda is now known as Guest59538
=== _LibertyZero is now known as LibertyZero
=== LibertyZero is now known as _LibertyZero
=== Guest59538 is now known as panda|x201
htorquehello everyone! i'm facing a significant boot time regression with the 2.6.38 kernels: http://img.xrmb2.net/images/274432.png (hybrid laptop) - is this something known?08:40
apwhtorque, nope that isn't known08:51
apwhtorque, are those two graphs with the identical userspace ?08:53
apwhtorque, although plymouthd is fingered, it is a process which simply waits08:54
=== fddfoo is now known as fdd
htorqueapw, yes, identical userspace on all four08:59
htorqueapw, i'd like to open a bug report on this, but i don't know for which component - linux (upstream?), plymouth?09:00
apwhtorque, indeed tricky, as plymouth is likely just blamed cause it waits for the display to appear i am inclined not to blame it without seeing the rest of the images09:05
apwcan i see the whole boot charts in each of the two you pasted before09:06
apw37-12 against 38-109:06
apwi think it was09:06
htorquesure, wait a sec (38-3 it is)09:06
htorqueapw, intel: http://img.xrmb2.net/images/100556.png vs. http://img.xrmb2.net/images/217302.png - nvidia: http://img.xrmb2.net/images/306234.png vs. http://img.xrmb2.net/images/502469.png09:16
=== smb` is now known as smb
apwhtorque, there doesn't seem to be anything specific occurring in the two slow images in userspace.  the only apparent oddity is there is a kworker running for most of the 'gap' period on both images ... i am suspicious it is related but it is very hard to tell from these09:22
apwkworker/3:0 in the intel case and kworker/1:0 is shown slightly coloured for about 8 or so seconds09:23
apwunsure what the colour represents as there is nothing in the top graph then09:24
apwhtorque, is there anything in the dmesg of the slow version?  can i see a dmesg off the intel one for the boot09:24
htorquethe light red should be unint. sleep, but i'm not sure - it's definitely not i/o09:25
apwhtm09:27
htorqueapw, http://paste.ubuntu.com/568091/09:29
apwhtorque, this was -3.30 yes ?09:31
htorque-38.3.1709:31
apwit is 3.30 ... ok09:32
htorqueah, sorry, checked the wrong package :)09:35
htorqueapw, i have to leave for ~2h, will stay on though09:42
TeTeTis linux-crashdump broken on Natty? I try to force a kernel crash dump (customer interest), following https://wiki.ubuntu.com/Kernel/CrashdumpRecipe but I don't get a crash image in /var/crash10:10
TeTeTkernel in use is 2.6.38-3-generic, linux-crashdump is 2.6.38.3.1710:10
smbThere seems to be some brokenness in every release10:11
smb(and not enough time to play around with it to get it fixed)10:12
* smb thinks it's broken at least in Lucid as well10:12
TeTeTsmb: any hints on how to get this fixed?10:16
smbTeTeT, Since we rarely use it it's not that high on our list. Would need someone to sit down and look at it and frankly we got enough other things that we have to look at. :/10:20
apwTeTeT, i'd say mostly we aren't aware which releases it does and does not work on, this implies it is not a heavily used feature (no one screaming for it) and thus not a high priority10:20
TeTeTsmb+apw: ok, the query was motivated by a customer's employee that attended a linux debugging class last week, so it's not a high prio either - btw the class was run on opensuse, there the crashdump stuff seems to work10:22
TeTeTsmb: thanks for the detailed instructions on bug 539467, will give it a try right away10:23
ubot2Launchpad bug 539467 in linux "SATA link power management causes disk errors and corruption" [Medium,Confirmed] https://launchpad.net/bugs/53946710:23
smbTeTeT, ok thanks10:24
TREllisI have a case where a certified system has suspend broken in Lucid, there is a SUSPEND_MODULES workaround and the bug (522998) is set to won't fix for Lucid10:56
TRElliswhat are the options?10:57
htorqueapw, that kworker process is indeed in state D (i made pybootchartgui paint unint. sleep in a different color) - is there any way i can find out what it's doing? profile it somehow?12:40
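One way to approach the question above is to catch the task while it is still in state D and note its pid, so that its kernel stack can be inspected. The following is a minimal sketch, not something anyone in the channel ran. It assumes the usual `/proc/<pid>/stat` layout (`pid (comm) state ...`), and the sample line in the usage note is invented for illustration. It would have to run during boot, while the kworker is still blocked.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren].strip())
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def tasks_in_state(wanted="D", proc_root="/proc"):
    """List (pid, comm) for every task currently in the given state."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a pid directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError):
            continue  # task exited while we were scanning
        if state == wanted:
            found.append((pid, comm))
    return found


# Invented sample line, only to show the parsing.
sample = "42 (kworker/3:0) D 2 0 0 0 -1 69238880 0"
print(parse_stat(sample))
```

With a pid in hand, `/proc/<pid>/stack` (readable by root on kernels built with stack trace support) or `echo w > /proc/sysrq-trigger` followed by `dmesg` should show where a blocked task is sleeping.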
tgardnercking, AceLan_: need to bounce tangerine for SSL security update13:41
ckingok13:42
ckingtgardner, I'm off13:42
tgardnercking, I know :)13:42
ckingI just like filling this channel with pointless information ;-)13:43
apwcking, but what are you off, your food ?13:59
apwhtorque, tricky as it doesn't continue after boot hrm14:00
tgardnersmb, are you pushing maverick meta?14:00
tgardnernm, just saw the email14:00
smbtgardner, Look into your latest copy of the inbox. :)14:00
smbtgardner, Just did not get to it before the break14:01
apwhtorque, i have profiled a couple of my systems (intel) with the 37-12 and 38-3 kernels and cannot see any difference, certainly not that kworkerd lit up14:01
apwtgardner, any idea which timezone jbarnes is in?14:05
tgardnerapwhe lives in pdx 14:10
apwtgardner, thought he might14:10
tgardnerapw: damn tab completion14:10
tgardnerapw: or at least, I think he works at Jones Farm last I knew14:11
apwheh, normally for me it pings someone unrelated14:11
apwat least you didn't do that :)14:11
tgardnerapw: well, if I didn't have to stare at my keyboard whilst typing...14:12
apwtgardner, those pesky keys, they tend to move around i find14:12
tgardnermmm, snarky14:13
htorqueapw, just compiled -rc5 (with ubuntu .config) and got some oopses at the time when plymouthd stops in those charts - they point to this bug https://bugzilla.kernel.org/show_bug.cgi?id=2623214:20
ubot2bugzilla.kernel.org bug 26232 in Console/Framebuffers "Multiple framebuffer oops and sysfs attribute deadlock" [Normal,New]14:20
htorqueweird thing, i see no such messages with the ubuntu kernels14:20
htorquecould this be the problem (i'm currently re-compiling using those two patches)?14:20
apwhtorque, yep without the fixes we have for open/close of framebuffers you will hit all kinds of races14:21
apwwe get to framebuffer opens much earlier than most distros14:22
apwi have fixes which i need to push up14:22
apwhtorque, there is a -4 kernel building in the archive you could test14:22
htorqueapw, ok, so that isn't the problem then (stopping the kernel build :-))14:22
htorqueapw, yeah will try14:23
apwhtorque, nope, those are just fookages in mainline, clear bugs14:23
htorqueapw, no luck, same with -414:36
apwhtorque, ok... file a bug pls with the latest diagrams attached14:37
htorqueat LP?14:37
apwsure14:37
htorqueok, will do, thanks!14:37
apwfile it against linux for the time being as kworkerd is the most suspicious thing on there14:37
apwplus you are only changing the kernel to get behaviour14:38
apwhtorque, please include the info that there are intel and nvidia machines14:39
htorqueactually it's the same machine with both gpus (hybrid)14:39
apwhtorque, oh, not a mac is it .?14:39
htorqueno, it's a thinkpad t51014:39
htorque(they can boot with either gpu while the other one is really switched off - unlike some other hybrid laptops from acer, sony, etc.)14:41
htorqueapw, do you think it's worth it doing kernel bisecting?14:42
apwhtorque, probably the only way14:44
htorqueapw, never done this with ubuntu kernel sources - is ubuntu/ubuntu-natty.git the right one?14:48
apwhtorque, yeah but you will find it hard to bisect cause of the ubuntu delta14:48
apwcan you remind me, this issue appears in -1 yes ?14:48
apwand can you see the kworker slowness on the mainline boot14:49
apweven though it oopses ?14:49
apwas its easier to bisect at that level14:49
htorqueyes, started with -1 and yes, even with the oops i see the kworker thing14:50
htorque*rc114:50
apwhtorque, ok then i have the infrastructure to do bisect builds between 2.6.37 and 2.6.38-rc114:50
apwhtorque, what's the bug number14:52
htorquenot done yet14:52
apwso run ubuntu-bug linux when running -514:52
apwand get me the bug number14:53
apwand i'll get building you a bisect kernel14:53
htorque-5? or -4?14:54
htorqueapw, i have a small kernel config for this machine which builds in ~15 minutes, if i see the bug with it i could certainly do it locally without a pain14:55
apwhtorque, ok then thats fine14:56
apw-4 sorry missed14:56
apwhtorque, if you can do it in 15mins then it makes most sense for you to do it14:56
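Why a 15 minute build makes local bisecting practical: bisection halves the suspect range on every test, so the number of builds grows only with the logarithm of the number of commits. The sketch below simulates that on a linear history. The history length and the culprit position are made up for illustration (the real commit count between 2.6.37 and 2.6.38-rc1 is not stated in this log), and real git history contains merges, which `git bisect` handles in its own way.

```python
import math

BUILD_MINUTES = 15  # per-build time htorque quotes for his small config


def bisect_first_bad(commits, is_bad):
    """Return (first_bad_commit, tests_run) for an ordered, linear history.

    commits[0] is assumed good and commits[-1] is assumed bad.
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        middle = (good + bad) // 2
        tests += 1  # one kernel build plus one boot test
        if is_bad(commits[middle]):
            bad = middle
        else:
            good = middle
    return commits[bad], tests


# Hypothetical history: 1001 commits, regression introduced at number 637.
history = list(range(1001))
culprit = 637
found, tests = bisect_first_bad(history, lambda commit: commit >= culprit)
print(found, tests, tests * BUILD_MINUTES, "minutes of building")
```

The worst case is `ceil(log2(N))` builds for `N` commits, so even a history ten times longer costs only three or four extra builds.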
=== sconklin-gone is now known as sconklin
=== Nafallo_ is now known as Nafallo
=== herton is now known as herton_lunch
htorqueapw, bug report will need some time as apport complains about -4 not being a genuine ubuntu package15:10
* apw SHOUTS AT LAUNCHPAD15:13
apwlaunchpad get the hell out of my way and let me do my job15:14
=== yofel_ is now known as yofel
=== herton_lunch is now known as herton
=== ayan_ is now known as ayan
sforsheeI could use some advice debugging a hang on resume with a toshiba netbook with an Atom N450. The machine hangs for 5 minutes on resume (exactly how long it takes for the lower 32 bits of the hpet to wrap on this machine). The hang happens when executing the acpi WAK method. Booting with "nohz=off highres=off" makes the hang go away, and one difference is that the hpet is put in periodic mode instead of oneshot mode during resume. Booting with hpet=disable doesn't fix the hang during resume, in fact machine never resumes. But performance is much better with hpet=disable or nohz=off. Any suggestions?16:09
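The "5 minutes" figure can be sanity-checked with a little arithmetic: a 32-bit counter wraps after 2^32 ticks, so the wrap time is 2^32 divided by the counter frequency. The sketch below assumes an HPET running at 14.318180 MHz, which is a common rate on Intel chipsets but is not stated in this log; the real value for this netbook would come from its HPET capabilities register.

```python
def counter_wrap_seconds(bits, frequency_hz):
    """Seconds until a free-running counter of the given width wraps."""
    return (2 ** bits) / frequency_hz


# Assumption: typical Intel HPET rate, not confirmed for the NB305.
ASSUMED_HPET_HZ = 14_318_180

wrap = counter_wrap_seconds(32, ASSUMED_HPET_HZ)
print(f"{wrap:.1f} s = {wrap / 60:.2f} min")
```

Under that assumption the lower 32 bits wrap after roughly 300 seconds, which is consistent with a hang that ends once a missed one-shot comparator value comes around again.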
ckingsforshee, what model is it?16:13
sforsheecking, nb 30516:13
=== JanC_ is now known as JanC
ckingsforshee, does "nohz=off highres=off noapic" work?16:18
ckingoops, you said it does.16:18
sforsheecking, "nohz=off highres=off", I probably haven't tried with all three16:19
ckingsforshee, try "irqpoll" instead of "noapic" too16:19
mjg59Probably the same hpet polarity thing16:19
sforsheecking, ack16:19
ckingsounds like it16:19
sforsheemjg59, that's a strong possibility, although the workaround that worked on the AMD-based machines doesn't work here16:20
sforsheethe workaround being acpi_skip_timer_override16:20
smbThat could be if the interrupt the timer then ends up on has the wrong polarity set too16:21
sforsheedid a proper workaround for the hpet polarity issue ever turn up?16:22
smbnot that I saw one16:22
smbAnd Andreas seems either to be ignoring me or to be on vacation16:22
mjg59Although it's an Intel chipset rather than an AMD one16:23
mjg59So...16:23
smb(one could suspect Toshiba may be doing it wrong even on Intel) but a fix for that is unlikely to come from Andreas, for sure16:25
sforsheein general things just seem to be bad on this machine when the hpet is on oneshot mode, in periodic mode things seem much better16:27
tgardnersconklin, bjf: I uploaded linux-meta for lucid/maverick for bug #72013916:55
ubot2Launchpad bug 720139 in linux-meta "Kernel meta packages are built for wrong architectures." [Undecided,Fix committed] https://launchpad.net/bugs/72013916:55
sconklintgardner: we should replace the one in proposed, right?16:55
bjftgardner, once they've built do you want us to coordinate with pitti or will you ?16:56
tgardnersconklin, I uploaded straight to the archive and bypassed the c-k-t PPA16:56
tgardnerit should just go through the normal channels16:56
sconklinI thought everything had to go through the ppa so it is built against -security packages. Although for meta I don't think there could be a conflict16:57
tgardnersconklin, linux-meta has no dependencies so I think we're OK16:58
ckingmjg59, so is this polarity setting wrong because it's mis-configured in the MP configuration table? Or am I way off here?17:04
mjg59I'd guess17:04
mjg59Or wherever we get the default polarities from on APIC systems17:05
* cking gets reading the MP spec...17:05
ckingNothing is trivial in this domain17:05
sforsheecking, mjg59, there is an override in the MP configuration table for the toshiba17:09
sforshee[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)17:10
sforshee[    0.000000] Int: type 0, pol 1, trig 1, bus 00, IRQ 00, APIC ID 2, APIC INT 0217:10
sforsheeI'm preparing a test build to override what's in the MP table, any guesses what triggering I should use?17:12
ckingsforshee, well, its a 50/50 guess isn't it?17:25
sforsheecking, well, if polarity is the only problem17:26
sforsheethere's also edge vs. level17:26
sforsheeone option can be eliminated right away, though :)17:26
sforsheei'm starting with only changing polarity17:27
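The search space sforshee describes is small enough to enumerate. The sketch below parses the `INT_SRC_OVR` line pasted above and lists the polarity/trigger combinations that differ from what the firmware reports. It is only an illustration: the regular expression is written against that one pasted line, and it considers just the explicit high/low and edge/level settings.

```python
import re
from itertools import product

OVERRIDE_RE = re.compile(
    r"ACPI: INT_SRC_OVR \(bus (\d+) bus_irq (\d+) global_irq (\d+) (\w+) (\w+)\)"
)


def parse_override(line):
    """Extract the fields of an INT_SRC_OVR dmesg line, or None if no match."""
    match = OVERRIDE_RE.search(line)
    if match is None:
        return None
    bus, bus_irq, global_irq, polarity, trigger = match.groups()
    return {
        "bus": int(bus),
        "bus_irq": int(bus_irq),
        "global_irq": int(global_irq),
        "polarity": polarity,
        "trigger": trigger,
    }


def untested_settings(current):
    """All polarity/trigger pairs except the one the firmware already uses."""
    every = product(("high", "low"), ("edge", "level"))
    in_use = (current["polarity"], current["trigger"])
    return [combo for combo in every if combo != in_use]


line = "[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)"
override = parse_override(line)
remaining = untested_settings(override)
print(override)
print(remaining)
```

Since "high edge" is what the table already specifies, that option is eliminated right away and three candidates are left for test builds.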
=== sforshee is now known as sforshee-lunch
jamespageI'm hoping someone on channel can clear up some confusion re kernel names for ubuntu server17:47
jamespageSo (and this is all for lucid) for the minimal virtual install the linux-image-virtual package gets installed17:47
jamespagethis is nice and small (thats all OK) but uname -r returns a -server kernel name - is this correct?  the current ISO test case says it should return -virtual17:48
tgardnerjamespage, for Lucid the virtual package was distilled from server binaries17:48
tgardnerso, there is no -virtual kernel per se17:49
jamespageexcellent :-); hggdh  - can we get the test case updated for next time?17:49
jamespagetgardner: has this changed in maverick or natty?17:49
tgardnerjamespage, note that maverick and subsequent releases _will_ have a -virtual kernel17:50
jamespagetgardner: thanks for clearing that up for us17:50
hggdhjamespage: I will update the test with a caveat on lucid17:51
jamespagehggdh: marvellous!17:52
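The caveat hggdh mentions could be expressed in a test helper along these lines. This is a hedged sketch only: the function names are hypothetical, it covers just the three series discussed here, and the lucid release string in the usage lines is an invented example of the `-server` naming rather than a value taken from this log.

```python
def kernel_flavour(release):
    """Return the flavour suffix of a `uname -r` string, e.g. 'generic'."""
    return release.rsplit("-", 1)[-1]


def expected_virtual_flavour(series):
    """Flavour that `uname -r` should report after a minimal virtual install."""
    if series == "lucid":
        # linux-image-virtual was distilled from the -server binaries
        return "server"
    if series in ("maverick", "natty"):
        return "virtual"
    raise ValueError(f"series not covered by this sketch: {series}")


print(kernel_flavour("2.6.38-3-generic"))
# Invented lucid release string, for illustration only.
print(kernel_flavour("2.6.32-28-server") == expected_virtual_flavour("lucid"))
```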
hggdhOK, another question. the i386 ISO (at least on Lucid, probably valid onwards) defaults to always installing -pae. Is this correct?18:00
tgardnerhggdh, The ISO boots the non-pae kernel IIRC, then detects the platform capabilities and installs accordingly IIRC. cjwatson or ev could probably answer deep installer questions like that.18:02
hggdhtgardner: thank you18:03
ckingsforshee-lunch, once you get some insight, please let me know via email. meanwhile I think I should dig into seeing if we can make this testable from fwts18:49
* tgardner --> lunch19:01
JFo<- headed to lunch19:01
=== sforshee-lunch is now known as sforshee
ckingeveryone is making me feel hungry19:11
apwcking, its your dinner time, go away and don't come back19:12
ckingyessir!19:13
apwi am watching19:13
ckingand...19:13
sforsheecking, will do, so far I've tested edge-low triggering with no luck, moving on to level triggering now19:13
ckingsforshee, ta19:13
Darxus"make install" doesn't work (doesn't update grub) in Maverick, right?  It looks like it does in Natty?19:38
tgardnerDarxus, yep, I fixed the package for natty19:39
Darxustgardner: Nice, thanks.19:39
tgardnerDarxus, you can fix maverick by just copying /sbin/installkernel from Natty19:40
Darxustgardner: Yeah, looks like the only significant difference is the run-parts line?19:40
DarxusAny chance of that getting into Maverick?19:40
tgardnerDarxus, IIRC19:40
tgardnerDarxus, unlikely, since its really a developer tool.19:41
DarxusOkay, thanks.19:42
GrueMastertgardner: On bug 720189, I don't believe Marvell (or any Ubuntu-supported platform) uses it.  It looks like the Amiga serial port driver.19:48
ubot2Launchpad bug 720189 in linux-lts-backport-maverick "CVE-2010-4076" [Undecided,In progress] https://launchpad.net/bugs/72018919:48
GrueMaster(and the Marvell kernel is on 2.6.32).19:50
tgardnerGrueMaster, the CVE description is a bit misleading. the patch actually cleans up a generic TTY vulnerability.19:51
GrueMasterAh.  Because the bug was against amiserial.c.  That's how I was confused.19:51
tgardnerGrueMaster, I'm still considering what to do with mvl-dove regression. I'm considering just backing out stable updates and just do CVEs (which is all that we are contractually obligated for)19:52
GrueMasterThat would be my suggestion.19:52
GrueMasterEspecially considering they are doing hardware changes that we can't follow.19:53
tgardnerGrueMaster, cool. I'll get started...19:53
* jjohansen -> lunch19:59
=== Artir is now known as JoseLuisRicon
=== JoseLuisRicon is now known as Artir
htorqueapw, git bisect suggests http://git390.marist.edu/cgi-bin/gitweb.cgi?p=linux-2.6.git;a=commit;h=23d69b09b78c4876e134f104a3814c30747c53f1 - does this make any sense? (i'll re-compile the enclosing commits to confirm)20:42
skaet_rtg,  can someone have a quick look at  https://launchpad.net/bugs/72086520:45
ubot2Ubuntu bug 720865 in linux "kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)" [Undecided,New]20:45
=== sconklin is now known as sconklin-afk
skaet_tgardner, ^^20:52
tgardnerskaet_, ack21:00
skaet_thanks!21:00
apwhtorque, it seems to be a merge, i am surprised it would tell you it was that one21:09
apwthough i am not surprised if one of the commits that was merged is the culprit21:10
apwnormally though i would expect it to try and select which of the merged commits is at fault21:10
htorqueapw, yeah, unfortunately git-bisect stopped at that point. anyways, that's what i'm planning to do after i double checked it's the right one.21:15
apwyeah i guess start with the 'right hand side' of the merge as a 'bad' and probably pick something old as good :)21:16
GrueMastertgardner: Out of curiosity, I am currently subscribed to linux-mrvl-dove, which has generated a recent spate of CVE spam.  I assume you are running a script to push these?  If so, you may want to modify it to automatically exclude natty for dove, as it is not supported at this time.21:21
tgardnerGrueMaster, you're assuming launchpad has that flexibility. besides, I'm having  to do the whole damn thing by hand.21:24
GrueMasterOh, oops.  :P21:24
GrueMasterWasn't trying to nit pick.  Just thought it was semi-automated.  No problems.21:24
tgardnerGrueMaster, just uploaded Lucid mvl-dove to https://launchpad.net/~canonical-kernel-team/+archive/ppa, so see if you can install it tomorrow (assuming it builds OK)21:29
GrueMasterWill do.21:29
GrueMasterWill it be the same package for Maverick (assuming since they are the same code base)?21:30
tgardnerGrueMaster, yep, identical code base, different compiler21:31
=== sconklin-afk is now known as sconklin
GrueMasterSo, with all these recent patches for dove kernels, is anyone updating linux-image-imx51 (karmic, lucid), linux-image-omap (lucid, maverick) or linux-image-omap4 (maverick, natty)?21:46
=== _LibertyZero is now known as LibertyZero
=== sconklin is now known as sconklin-gone
