/srv/irclogs.ubuntu.com/2015/03/02/#ubuntu-kernel.txt

=== gerald is now known as Guest65404
smb	tseliot, Hey, we had a look at bug 1426492 this morning as this has an ugly bus error in the logs. I just tried but could only confirm the problem of producing false crash reports, which actually seems to be a dkms one	11:07
ubot5	bug 1426492 in nvidia-graphics-drivers-340 (Ubuntu) "nvidia-340 340.76-0ubuntu1: nvidia-340 kernel module failed to build" [High,Confirmed] https://launchpad.net/bugs/1426492	11:07
smb	tseliot, Have you heard of any other "bus error" breakage, or might that way of failing be a one-off?	11:08
=== swordsmanz is now known as hugbot
tseliot	smb: no, I haven't heard of it but DKMS seems to fail at random, or rather report failing even when it doesn't (LP: #1268257)	11:38
ubot5	Launchpad bug 1268257 in nvidia-graphics-drivers-331-updates (Ubuntu) "nvidia-331-updates 331.38-0ubuntu3: nvidia-331-updates kernel module failed to build, with only error: "objdump: '... .tmp_nv.o': No such file"" [High,Triaged] https://launchpad.net/bugs/1268257	11:38
smb	tseliot, Right, that is the one I see often enough, and it looks to be dkms' fault for using exec in the middle of /etc/kernel/postinst.d/dkms	11:39
smb	Basically one attempt fails because the headers are not done yet. Then the next one succeeds and you get a crash report while everything is shiny	11:40
tseliot	oh	11:40
tseliot	smb: I think it's desirable to catch any failures though	11:42
smb	tseliot, Adam had spotted this. Well but exec ends the current script, doesn't it?	11:43
tseliot	yep	11:43
smb	What we want is the error message about headers missing and probably not error the postinst for that special case	11:43
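
[Editor's note] The point about exec ending the current script can be checked with a small sketch. This is not the dkms postinst itself, just two throwaway one-line shell snippets run from Python; the helper name run_sh is made up for illustration, and it assumes a POSIX sh with `false` on the PATH.

```python
import subprocess


def run_sh(script: str):
    """Run a snippet with sh and return (stdout, exit status). Hypothetical helper."""
    proc = subprocess.run(["sh", "-c", script], capture_output=True, text=True)
    return proc.stdout, proc.returncode


# Without exec: the failing command is just one step, and the script carries on.
plain = run_sh("echo before; false; echo after")

# With exec: the shell process is replaced by `false`, so "echo after" never
# runs and the exit status of `false` becomes the exit status of the script.
replaced = run_sh("echo before; exec false; echo after")

print(plain)
print(replaced)
```

In the second case nothing after the exec line gets a chance to inspect the failure, which is why a hook script that wants to special-case "headers missing" would have to run the command normally and look at its exit status instead.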
tseliot	yes, I've just noticed	11:43
* tseliot nods		11:44
smb	tseliot, We should use one of those nvidia dkms fail reports to actually fix the dkms problem (which I see even in Trusty). I just was not sure whether the one with the bus error was "suitable"	11:44
tseliot	smb: I think we have more than enough duplicates of 1268257 to make a point ;)	11:45
smb	tseliot, I bet. :) Any of those you would prefer? That other one you mentioned is for 331.113... that would be U or before.	11:48
smb	Maybe I should open a new report against dkms and we can dup the nvidia ones against that as we see fit	11:49
tseliot	smb: that would work too	11:49
smb	tseliot, ok, so here we go ... bug 1427175	11:56
ubot5	bug 1427175 in dkms (Ubuntu) "dkms postinst should handle missing headers" [Undecided,New] https://launchpad.net/bugs/1427175	11:56
tseliot	smb: great, thanks. I'll also talk to upstream about it	11:56
smb	tseliot, Ok, sounds good	11:57
=== hugbot is now known as swordsmanz
smb	bjf, sforshee, I think patch#3 which was sent for bug 1410852 really has some issue. Just did a backport myself which would not remove a function. Would it be enough to re-submit just that one or should I send all three again (which I just used for a test build)	15:51
ubot5	bug 1410852 in linux (Ubuntu Trusty) "restarting container with a vlan interface results in kernel stack trace" [Medium,In progress] https://launchpad.net/bugs/1410852	15:51
bjf	smb, do all 3 so it's all clear	15:52
smb	bjf, ok. ack	15:53
smb	bjf, sforshee, kamal, Ok, sent... of course with another (though rather minor) glitch. Maybe when you apply it you can do the OCD drop of that additional newline	16:15
bjf	apw, so just how broken _is_ ipv6?	16:16
bjf	apw, completely or only for certain cases?	16:16
apw	bjf, as broken as it was last time, exploding at random several times a day	16:16
apw	bjf, remember stgraber's issue?	16:16
bjf	apw, no	16:16
apw	we had a bad backport in stable of an ipv6 sizing patch, which was triggering bangs on mapped ipv4 addresses	16:17
bjf	apw, i'm trying to decide if it warrants a respin to fix this one issue or wait for the SRU cycle which starts today	16:17
apw	kamal, what did we do last time, did we respin for it then ?	16:17
apw	bjf, it's pretty serious if you have ipv6 enabled and you are a server	16:18
stgraber	last time I had to wait for two kernel updates (about a month) to get a fix	16:18
stgraber	which meant reverting to a kernel with a known security issue on public facing machines...	16:18
bjf	apw, sounds like a respin	16:18
apw	oh we clearly don't care about you :)	16:18
apw	stgraber, ugg	16:18
kamal	iirc, we fixed and released it promptly, once the problem had been identified	16:19
bjf	kamal, you did such a fantastic job last week ... :-)	16:20
kamal	... someone else doesn't want a turn?	16:20
bjf	kamal, i'd be happy to do it	16:21
apw	sforshee, anyhow i've written that up in the bug ... sigh ... i guess we need to apply it ... again	16:21
bjf	will have to respin hwe-trusty as well	16:21
kamal	bjf, I'm swamped (with the other IPv6 issue, among other things) ... if you want to take this one, that would be great.	16:22
bjf	kamal, not a problem. i'll take it	16:22
bjf	apw, i should just revert the second one and spin right?	16:22
stgraber	kamal: previous bug was reported on the 20th of December, fixed on the 6th of January and released to -updates on the 2nd of February, you and I have a different definition of "promptly" :)	16:23
apw	bjf, yeah that is the more correct version	16:23
apw	stgraber, i don't think we were engaged with it till much later, when we started looking at it, it was quick (for us) from there	16:23
apw	i blame those christmas things	16:23
bjf	stgraber, we would have loved to have gotten to it quicker but were forced to take shutdown days ... so sorry	16:24
stgraber	oh yeah, the whole 20th of December till early January was totally expected due to company shutdown :) It's just that to me a month doesn't really qualify as "promptly" :)	16:26
stgraber	anyway, enough complaining for today :)	16:26
bjf	sforshee, you want to do a quick respin of trusty and hwe-trusty? would be good practice :-)	16:26
sforshee	bjf: well want might be a strong word, but I'm willing ;-)	16:28
bjf	sforshee, it's yours to do.	16:32
bjf	sforshee, this should not be an ABI bump	16:34
sforshee	bjf: ack	16:37
=== adam_g_out is now known as adam_g
gchao_	Hi	19:19
gchao_	Is there someone who knows about kernel crashes? XD	19:20
apw	gchao_, if you have a kernel crash, file a bug as that will get the requisite info from your machine	19:21
gchao_	hi apw! thanks for the response	19:22
gchao_	so filing a bug is standard procedure even if I'm just troubleshooting?	19:22
apw	gchao_, if you want help from outside it is easiest to see things like the crash stack as they are large	19:24
apw	if you have one you could pastebin it too, and someone might have ideas	19:24
apw	as people are on varying timezones it is hard to keep abreast of the whole story if it isn't all in one place and the bug is a good place for that	19:25
gchao_	The thing is - I don't even get a crash stack (the one under /var/crash/ ?) I was trying to use crash to debug it but nothing is there. A real "crash" never occurs, but it gets stuck in an endless panic loop	19:26
gchao_	and seems to ignore the magic button.	19:26
gchao_	magic key*	19:27
apw	gchao_, that is tricky, can you get a photo of the errors perhaps when they start?	19:33
apw	some kind of hint might tell someone enough to help	19:34
pepee	I'd try using netconsole	19:35
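
[Editor's note] netconsole sends kernel log messages as UDP datagrams to another machine, which helps when the panicking box cannot write anything to disk. The receiving side can be as simple as the sketch below. The function names are made up for illustration, and port 6666 is assumed because it is the documented default target port; the sending machine still has to be configured separately through the netconsole module parameter, whose exact syntax is in the kernel's netconsole documentation.

```python
import socket


def open_listener(host: str = "0.0.0.0", port: int = 6666) -> socket.socket:
    """Bind a UDP socket for incoming netconsole datagrams (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def read_messages(sock: socket.socket, count: int):
    """Block until `count` datagrams arrive; return them as (sender_ip, text) tuples."""
    messages = []
    for _ in range(count):
        data, (sender_ip, _sender_port) = sock.recvfrom(65535)
        text = data.decode("utf-8", errors="replace").rstrip("\n")
        messages.append((sender_ip, text))
    return messages


def log_forever(sock: socket.socket) -> None:
    """Print every datagram as it arrives; stop with Ctrl-C."""
    while True:
        for sender_ip, text in read_messages(sock, 1):
            print(f"{sender_ip}: {text}", flush=True)
```

Calling log_forever(open_listener()) on a second machine and redirecting the output to a file would keep whatever the kernel manages to send before it hangs. UDP gives no delivery guarantee, so the last lines of a panic can still be lost.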
