/srv/irclogs.ubuntu.com/2011/07/20/#ubuntu-kernel.txt

=== jk-- is now known as jk-
mwhudson	hi, the ubuntu-natty kernel tree doesn't seem to include the 2.6.38-10.46 abi stuff at the Ubuntu-2.6.38-10.46 tag	01:26
mwhudson	is that to be expected?	01:26
mwhudson	i'm very new to this sort of thing :)	01:26
mwhudson	ah, it seems that the abi stuff gets added the commit after the tag	01:31
mwhudson	(that makes some sense i guess)	01:31
=== smb` is now known as smb
* apw waves		07:25
* smb finishes scrollback and waves back		07:28
apw	reading scrollback, how dedicated :)	07:35
smb	Just because someone caused botty to spew out a lot of stuff and some of it hit the "bell of interest" ;)	07:36
apw	smb, ahh yes the c-ve bell	08:11
ppisati	apw: can you take a look at my ubuntu-oneiric ti-omap4-next branch? if it's ok, i'll pull req	09:42
* ppisati -> out for lunch		10:52
=== _LibertyZero is now known as LibertyZero
apw	ppisati, will try and look it over after i've sorted this lucid kernel issue	12:06
Q-FUNK	It seems that the apport rules for reporting a kernel bug are excessively tight:	12:36
Q-FUNK	I was forced to reboot into an older kernel, because of a regression, all while leaving the current package pulled by linux-generic installed. now, apport refuses to let me file a bug using 'ubuntu-bug' because the currently running kernel is not the same as the latest.	12:36
Q-FUNK	using 'ubuntu-bug linux' that is.	12:37
ppisati	apw: k	12:37
apw	Q-FUNK, yep its a pain in the backside	12:38
Q-FUNK	apw: is there any way this could be fixed?	12:38
apw	Q-FUNK, it certainly can be fixed	12:39
Q-FUNK	I can understand apport complaining if there is no linux-generic installed at all, but refusing to let me file a bug just because I rebooted into an older kernel that is no longer in the repository seems excessive.	12:40
apw	Q-FUNK, indeed seems excessive if that is the criteria	12:40
apw	though the bug information will be poor if you cannot file it from the right kernel anyhow	12:40
apw	as that will indicate the bug is in your old kernel which is false	12:41
Q-FUNK	will it? or will the bug be filed against the current version of linux-generic that is installed?	12:41
apw	all of the live information will be taken from the running kernel	12:42
apw	so it would be even worse as mixture	12:42
Q-FUNK	right. shouldn't there be a way to instead attach the dmesg from the previous boot with the buggy kernel, instead?	12:42
apw	yeah there likely should, or a way to take all the information offline or something	12:44
Q-FUNK	that would work too.	12:45
apw	Q-FUNK, what happens if you ask it to file a bug against the specific binary kernel package which is broken	12:45
Q-FUNK	don't we already same the dmesg from the previous boot?	12:45
Q-FUNK	öö... save	12:46
apw	Q-FUNK, maybe so, not that i am aware of specifically, but maybe	12:46
Q-FUNK	trying that. just a sec.	12:46
Q-FUNK	yup. specifying the exact binary seems to work.	12:47
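
A minimal sketch of the workaround confirmed above: pointing `ubuntu-bug` at one exact binary kernel package rather than at the `linux` source package. It assumes the usual `linux-image-<release>` package naming; the helper name `kernel_image_pkg` and the version string in the comment are illustrative and do not come from the log.

```shell
#!/bin/sh
# Hypothetical helper: print the binary package name for a kernel release.
# With no argument it falls back to the running kernel (uname -r).
kernel_image_pkg() {
    release=${1:-$(uname -r)}
    printf 'linux-image-%s\n' "$release"
}

# Example invocation (not run here; the version is made up for illustration):
#   ubuntu-bug "$(kernel_image_pkg 3.0.0-5-generic)"
```

As apw notes, the live data apport collects still comes from whichever kernel is running at the time, so the report should say which kernel actually shows the regression.
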
apw	Q-FUNK, well thats something, not helpful particularly but ok ..	12:47
Q-FUNK	well, at least the hardware data remains valid. :)	12:49
apw	whats the regression and with which release	12:49
Q-FUNK	since 2.6.39 there's frequent kernel paging failures that make the kernel oops. it especially happens whenever running dpkg.	12:51
Q-FUNK	it's a non-fatal oops, but I end up having to run 'dpkg -a --configure' and re-start 'apt-get upgrade' several times in a row to complete the upgrade.	12:52
apw	so in lucid ?	12:52
Q-FUNK	oneiric	12:52
apw	well you must be unusual as i have 10 machines in that range and none have ever shown that kind of behaviour	12:53
apw	you are saying if you run 2.6.39 it doesn't occur ?	12:53
Q-FUNK	2.6.38 works fine, but 2.6.39 and more recent have the symptoms. with 2.6.39 it happened seldom, but with 3.0 it's nearly systematically when I do a daily package upgrade.	12:53
apw	32 or 64 bit ?	12:54
Q-FUNK	32-bit. Geode LX	12:54
apw	oh a geode, heh, bet its h/w specific	12:54
Q-FUNK	not really	12:55
apw	well its not occurring on any of my oneiric boxes	12:55
apw	so its certainly not general	12:55
Q-FUNK	however, this particular host has repeatedly shown to be a good trap for corner-case kernel bugs.	12:55
apw	Q-FUNK, as you have a pair of good/bad you could see if the corresponding mainline kernels have the issue	12:57
apw	and then use the -rc's to try and pare it down a bit	12:57
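
The narrowing-down apw suggests is a binary search over an ordered list of kernel builds. A rough sketch, assuming the list runs oldest to newest, that its first entry is known good and its last is known bad; the function names `first_bad` and `ask_user` are hypothetical, and in practice "testing" a version means booting it and trying to reproduce the oops by hand.

```shell
#!/bin/sh
# first_bad IS_BAD_CMD VERSION...
# Prints the first version for which IS_BAD_CMD succeeds, assuming the
# versions are ordered oldest to newest, the first one is known good and
# the last one is known bad. Only the versions in between get tested.
first_bad() {
    is_bad=$1; shift
    lo=1        # index of a known-good version
    hi=$#       # index of a known-bad version
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "candidate=\${$mid}"
        if "$is_bad" "$candidate"; then
            hi=$mid
        else
            lo=$mid
        fi
    done
    eval "printf '%s\n' \"\${$hi}\""
}

# Hypothetical interactive predicate: the tester boots the given kernel,
# tries to trigger the problem, then answers y (bad) or n (good).
ask_user() {
    printf 'Does %s show the oops? [y/n] ' "$1" >&2
    read -r answer
    [ "$answer" = y ]
}

# Example (not run here):
#   first_bad ask_user 2.6.38 2.6.39-rc1 2.6.39-rc2 2.6.39-rc3 2.6.39
```

With a handful of release candidates this needs only two or three boots instead of one per -rc.
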
Q-FUNK	tried that before, the last time I had a major regression on that host. in the end, while we found out around which commit the regression took place, it never got investigated.	12:58
apw	Q-FUNK, as you are the only one with the h/w i suspect unless you do it it'll stay broken	13:00
Q-FUNK	I don't mind testing every -rc in the vanilla kernel folder to narrow it down to one specific release, but unless actions are taken beyond that point, it gets ridiculous.	13:01
mjg59	geode is amazingly niche hardware, and lots of people seem enthusiastic about building businesses around it without actually being willing to put in the funding or effort for making sure that the software they depend on continues to work	13:02
Q-FUNK	for instance, the ext4 inode destroying bug I encountered on the same host from kernel 2.6.31 to 2.6.35 was never properly investigated. it was just marked as fixed the day I announced that 2.6.36 apparently fixes it.	13:03
apw	Q-FUNK, if we can figure out whats breaking you we'll try and fix it, but last time it basically appears to be a work around for broken cache coherency so ... its not easy to either find or fix	13:03
Q-FUNK	apw: it could be. I'm still baffled as to how the bug occurred on that particular geode box and not on another one with a different bios.	13:04
apw	Q-FUNK, isn't half of the geode instruction set implemented via SMI, at which point the BIOS is part of your processor from a semantic point of view	13:05
mjg59	apw: Not so much the instruction set. Just every time you think you're touching hardware.	13:06
apw	mjg59, ahh ok, so its part of the h/w then even after its handed off ... just as bad	13:06
Q-FUNK	IIRC the bios is mostly used to provide a traditional x86 abstraction (PCI bus, etc.) for a system that uses an entirely different bus architecture and there are various implementations of that abstraction layer.	13:06
mjg59	Right, any PCI accesses get handled by non-free firmware	13:07
mjg59	So who knows what it's doing?	13:07
apw	Q-FUNK, right but if the code is running via SMI after the kernel takes over, any bug in it ... can break things ... if they don't clean up right	13:07
Q-FUNK	free or non-free. coreboot works quite well on those, at least for a few known configurations.	13:07
mjg59	Anyone who knew anything about how it worked appears to have vanished in a set of freak accidents	13:08
Q-FUNK	mostly in a set of random AMD attritions and OLPC changes of mind.	13:08
apw	mjg59, not physical accidents i hope	13:08
mjg59	apw: Not to the best of my knowledge	13:09
Q-FUNK	AMD used to have an extremely knowledgeable and dedicated coder who handled Geode coding for the OLPC project. he even won employee awards for his efforts. then one day, after a particularly bad quarterly, he fell the victim of random attrition.	13:10
mjg59	I suspect that if it weren't for OLPC, everyone would just have given up pretending to support Geode	13:10
mjg59	It's certainly the only reason we care	13:10
mjg59	(And we don't for RHEL)	13:10
apw	ahh ... shame i don't know anyone who has one	13:10
Q-FUNK	that random attrition even left his immediate boss in usnly had to provide technical support and ongoing code development without his main guy.	13:11
Q-FUNK	argh. friggin kernel i/o stealing my keystrokes again	13:12
Q-FUNK	that random attrition even left his immediate boss in total limbo, because he suddenly had to provide technical support and ongoing code updates without his main guy.	13:13
Q-FUNK	I really hate how hard-disk access has the bad habit of momentarily halting the keyboard buffer.	13:14
apw	Q-FUNK, never see that either, keys may be delayed for me but not lost	13:15
apw	not that i want them delayed of course, but thats a separate gripe	13:15
Q-FUNK	delayed would be acceptable. half of the time, if the kernel starts swapping, I end up missing several words in the middle of a sentence.	13:16
ohsix	you could try changing your io wait method	13:18
apw	herton, we likely are going to have to shove a lucid kernel with a single fix in for the point release ... so hold off any lucid uploads to the PPA	13:19
herton	apw: we were avoiding uploading anything because of the point release too, so that's ok	13:21
apw	ok good stuff	13:21
Q-FUNK	ohsix: the scheduling algorithm, you mean?	13:21
ohsix	Q-FUNK: nevermind, was thinking of something else	13:22
apw	sconklin, i've pushed a temporary branch to lucid, master-point which is what i am proposing for the upload	14:24
sconklin	apw: ok, sounds good	14:30
* ogasawara back in 20		14:43
apw	sconklin, ok the decision from the release team is that they want a kernel spun, could you check that branch for me as the stable check script just talks crap	14:57
sconklin	heh, probably as soon as we finish the meeting	14:58
climbe2	Problem! If anyone is able to help.... running 10.04, recently upgraded to kernel 2.6.32-33, and can't boot up!! See http://ubuntuforums.org/showthread.php?t=1807978 for my ongoing thread. Any suggestions?!?!	15:33
ogasawara	tgardner: heading in, see ya in 15	15:33
tgardner	ogasawara, ack	15:34
=== tgardner is now known as tgardner-afk
* herton -> lunch		16:10
ppisati	mumble went belly up	16:18
kees	apw: say, did you see my email about a funky CVE in the hardy update?	16:21
apw	kees, not yet, which one	16:23
kees	oh, sorry, not hardy. 2010-4175 in linux-lts-backport-maverick	16:24
kees	the tracker shows "released 2.6.35-25.44~lucid1" but there is a changelog entry in 2.6.35-30.56~lucid1	16:24
apw	kees, seems to be applied twice	16:26
kees	??	16:26
apw	git log --oneline origin/lts-backport-maverick | grep 'rds: Integer overflow in RDS cmsg handling'	16:26
kees	was it a no-change cherry-pick or something?	16:26
apw	9a3798f rds: Integer overflow in RDS cmsg handling, CVE-2010-4175	16:26
ubot2	apw: Integer overflow in the rds_cmsg_rdma_args function (net/rds/rdma.c) in Linux kernel 2.6.35 allows local users to cause a denial of service (crash) and possibly trigger memory corruption via a crafted Reliable Datagram Sockets (RDS) request, a different vulnerability than CVE-2010-3865. (http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2010-4175)	16:26
apw	ae1d3ae rds: Integer overflow in RDS cmsg handling	16:26
apw	git describe --contains ae1d3ae	16:26
apw	Ubuntu-lts-2.6.35-25.44~33	16:26
apw	git describe --contains 9a3798f	16:26
apw	Ubuntu-lts-2.6.35-28.50~20	16:26
apw	so thats where the versions you mention come from. from whats in the tree i'd expect my tools to take the first version number there	16:27
apw	and i think thats what you are saying it did	16:27
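
The "take the first version number" rule can be sketched as follows. This is not apw's actual tooling, which the log does not show; it only assumes `git describe --contains` output of the form pasted above and a `sort` that supports `-V` (GNU coreutils). The function name `earliest_tag` is hypothetical.

```shell
#!/bin/sh
# Read `git describe --contains` results on stdin, one per line, drop the
# trailing ~N / ^N ancestry suffix (git tag names cannot contain ~ or ^),
# and print the lowest remaining tag in version order.
earliest_tag() {
    sed 's/[~^].*//' | sort -V | head -n 1
}

# Example with the two results pasted above (not run here):
#   printf '%s\n' 'Ubuntu-lts-2.6.35-25.44~33' 'Ubuntu-lts-2.6.35-28.50~20' | earliest_tag
```
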
kees	yeah, that's certainly the correct approach, but I guess I'm wondering if we need to look closer -- how did it get applied twice (did that have a bad effect?)	16:27
apw	kees, both commits seem to be complete	16:27
apw	which should be impossible	16:27
kees	right :P	16:28
sconklin	apw: did you test build this?	16:28
apw	sconklin, i built only generic on amd64/i386	16:28
sconklin	I'm going to test build before I shove it to the PPA. Less time in the long run if it fails	16:29
apw	sconklin, ack, thanks	16:29
apw	kees, will investigate and let you know	16:29
kees	apw: cool, thanks	16:29
apw	i can only assume the first one got reverted somehow	16:30
apw	just ... how i don't know	16:30
sconklin	apw: does this require an ec2 respin?	16:35
apw	sconklin, i would expect it might now you mention it, but i am confirming now with the release team	16:36
herton	ppisati: yeah, mumble can't login anymore here	16:41
sconklin	here either	16:41
apw	kees, ok this actually is fixed in the earlier one as advertised. however the relayout of the code since then makes it inobvious so someone has applied the same fix again to the containing routine in the second place	16:45
apw	kees, so its doubly fixed essentially	16:45
kees	apw: how strange, okay. thanks!	16:49
apw	sconklin, ok the basic answer is no we don't need -ec2	16:49
apw	kees, good spot though	16:49
sconklin	ok	16:49
apw	17:49:02 Daviey | it's a nice to have.. but we are only currently seeing this with euca. │ astraljava	16:50
Daviey	\o/	16:51
apw	Daviey, yep taking your name in vain	16:53
* apw screams about mumble		16:53
* jjohansen running an errand		17:14
sconklin	apw: package uploaded, It'll take 24 hours to complete, since it includes ARM arch	17:30
apw	sconklin, yeah	17:30
apw	sconklin, one reason we need it in as soon as	17:30
sconklin	apw: yep, and another reason that I did a test build (which completed OK)	17:31
apw	sconklin, yep all sensible and right	17:32
climbe2	Problem! If anyone is able to help.... running 10.04, recently upgraded to kernel 2.6.32-33, and can't boot up!! See http://ubuntuforums.org/showthread.php?t=1807978 for my ongoing thread. Any suggestions?!?!	17:38
apw	climbe2, does the previous kernel work	17:57
apw	as in selecting an older kernel from the grub menu	17:58
tgardner	apw, mumble seems to have reincarnated	18:03
climbe2	apw, no none will work anymore	18:04
apw	climbe2, ok thats indeed odd if you took a kernel update and all your old kernels stop working too	18:05
apw	which versions have you tried	18:05
climbe2	perhaps it was not the upgrade.... I was recently using an SDHC card through my HP printer... two different USB storage devices...	18:05
climbe2	all I know is that I can't boot up at all!	18:06
apw	/dev/disk/by-uuid/e3df952c-d462-4292-bab6-4965da1d567c does not exist. Dropping to a shell.	18:06
apw	so that implies your disk does not have the label that is expected	18:07
apw	at the busy box you can get the dmesg output and see if any of your disks were found	18:08
climbe2	ok, how do I do that?	18:08
climbe2	I am also on a different computer right now, so I will have to hand type it	18:09
apw	climbe2, 'dmesg | grep sd'	18:09
apw	that command sequence might give you some clues	18:09
apw	climbe2, you might also try 'blkid'	18:09
apw	if that is available it might tell you what disks it thinks it can see	18:10
apw	/dev/sda5: UUID="cf503727-25f2-4ecd-b0f3-2b894523bcba" TYPE="ext4"	18:10
climbe2	I can only get to the grub command line	18:10
climbe2	is there another command line I can use?>	18:10
apw	mine has a line like that ... which matches the UUID in the error ...	18:10
apw	climbe2, your post has the busybox prompt in it ?	18:10
apw	(initramfs)	18:11
apw	that one	18:11
climbe2	I've been a few different places from back then...let me see	18:11
climbe2	yes, I am there	18:12
apw	does blkid produce any output	18:13
apw	and does dmesg | grep sd	18:13
climbe2	dmesg | grep sd produces hundreds of lines it seems....blkid produces my drives	18:14
apw	ok, and does the UUID in the error line appear ?	18:14
climbe2	blkid: /dev/sda1, 2, 5, 6 /dev/sdab1	18:14
apw	climbe2, is that the exact output it produced ?	18:15
apw	i am expecting some UUID= segments	18:15
apw	/dev/disk/by-uuid/e3df952c-d462-4292-bab6-4965da1d567c	18:16
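
The check apw is asking for, whether the UUID from the boot error shows up in the `blkid` listing, can be sketched like this. It assumes `blkid` prints lines in the `UUID="..."` form quoted earlier and that the `grep` at hand supports `-q` and `-F`, which a cut-down busybox build may not. The function name `uuid_present` is hypothetical.

```shell
#!/bin/sh
# uuid_present UUID
# Succeeds if blkid-style output on stdin contains a filesystem UUID field
# with exactly this value. The leading space in the pattern keeps it from
# matching inside a longer field name such as PARTUUID="...".
uuid_present() {
    grep -qF " UUID=\"$1\""
}

# Example at the (initramfs) prompt (not run here):
#   blkid | uuid_present e3df952c-d462-4292-bab6-4965da1d567c && echo found
```

If nothing matches, the root= value the boot entry passes does not correspond to any filesystem the initramfs can see, which fits the "does not exist" error.
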
climbe2	there is an extra "\" in the error message than in the drive UUID	18:16
apw	an extra ?	18:16
apw	what ?	18:16
climbe2	wait..let me see	18:16
climbe2	in grub, i press 'e' to edit... it has /dev/disk/by-uuid/e3fd.....b\ab6-... instead of bab6	18:18
apw	well you could try removing that \	18:19
climbe2	rather, in the edit screen, 5th line down it says linux /boot/vmlinuz 2.6.32-33-generic root=UUID=e3fd.....b\ab6...	18:20
apw	though i am surprised to get those full names in your grub conf	18:21
climbe2	perhaps that is an end-of-line signal...? it occurs on another line as well	18:21
climbe2	when the line breaks to continue to the next	18:21
* apw has to run		18:22
climbe2	thanks for the help	18:23
climbe2	i'll see what i can do	18:23
pgraner	skaet, ping	18:26
pgraner	skaet, we don't need to pull the kernel re: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/788351	18:27
ubot2	Ubuntu bug 788351 in linux "xfs ioctl XFS_IOC_FSGEOMETRY_V1 clobbers kernel stack" [Undecided,New]	18:27
pgraner	skaet, the patch in question is in the current -proposed	18:27
pgraner	skaet, no need to do anything	18:27
skaet	pgraner, thanks - was pinging around about it.	18:27
pgraner	skaet, next time just drop in here and ask	18:27
skaet	pgraner, will do	18:28
=== Quintasan_ is now known as Quintasan
kees	apw: another "why is this 'released'?" question for 2011-0711. uct shows it as "released" for a kernel in -proposed (hardy)	18:59
* jjohansen lunch		19:26
* tgardner bounces tangerine for kernel update		20:37
=== yofel_ is now known as yofel
FireZen	!	21:15
=== kentb is now known as kentb-out
=== jeremy is now known as Guest4843
=== rsalveti` is now known as rsalveti
