/srv/irclogs.ubuntu.com/2011/07/20/#ubuntu-kernel.txt

=== jk-- is now known as jk-
mwhudsonhi, the ubuntu-natty kernel tree doesn't seem to include the 2.6.38-10.46 abi stuff at the Ubuntu-2.6.38-10.46 tag01:26
mwhudsonis that to be expected?01:26
mwhudsoni'm very new to this sort of thing :)01:26
mwhudsonah, it seems that the abi stuff gets added the commit *after* the tag01:31
mwhudson(that makes some sense i guess)01:31
=== smb` is now known as smb
* apw waves07:25
* smb finishes scrollback and waves back07:28
apwreading scrollback, how dedicated :)07:35
smbJust because someone caused botty to spew out a lot of stuff and some of it hit the "bell of interest" ;)07:36
apwsmb, ahh yes the c-ve bell08:11
ppisatiapw: can you take a look at my ubuntu-oneiric ti-omap4-next branch? if it's ok, i'll pull req09:42
* ppisati -> out for lunch10:52
=== _LibertyZero is now known as LibertyZero
apwppisati, will try and look it over after i've sorted this lucid kernel issue12:06
Q-FUNKIt seems that the apport rules for reporting a kernel bug are excessively tight:  12:36
Q-FUNKI was forced to reboot into an older kernel, because of a regression, all while leaving the current package pulled by linux-generic installed. now, apport refuses to let me file a bug using 'ubuntu-bug' because the currently running kernel is not the same as the latest.12:36
Q-FUNKusing 'ubuntu-bug linux' that is.12:37
ppisatiapw: k12:37
apwQ-FUNK, yep its a pain in the backside12:38
Q-FUNKapw: is there any way this could be fixed?12:38
apwQ-FUNK, it certainly can be fixed12:39
Q-FUNKI can understand apport complaining if there is no linux-generic installed at all, but refusing to let me file a bug just because I rebooted into an older kernel that is no longer in the repository seems excessive.12:40
apwQ-FUNK, indeed seems excessive if that is the criteria12:40
apwthough the bug information will be poor if you cannot file it from the right kernel anyhow12:40
apwas that will indicate the bug is in your old kernel which is false12:41
Q-FUNKwill it? or will the bug be filed against the current version of linux-generic that is installed?12:41
apwall of the live information will be taken from the running kernel12:42
apwso it would be even worse as a mixture12:42
Q-FUNKright.  shouldn't there be a way to instead attach the dmesg from the previous boot with the buggy kernel, instead?12:42
apwyeah there likely should, or a way to take all the information offline or something12:44
Q-FUNKthat would work too.12:45
apwQ-FUNK, what happens if you ask it to file a bug against the specific binary kernel package which is broken12:45
Q-FUNKdon't we already same the dmesg from the previous boot?12:45
Q-FUNKöö... save12:46
apwQ-FUNK, maybe so, not that i am aware of specifically, but maybe12:46
Q-FUNKtrying that. just a sec.12:46
Q-FUNKyup. specifying the exact binary seems to work.12:47
apwQ-FUNK, well thats something, not helpful particularly but ok ..12:47
Q-FUNKwell, at least the hardware data remains valid. :)12:49
apwwhats the regression and with which release12:49
Q-FUNKsince 2.6.39 there's frequent kernel paging failures that make the kernel oops.  it especially happens whenever running dpkg.12:51
Q-FUNKit's a non-fatal oops, but I end up having to run 'dpkg -a --configure' and re-start 'apt-get upgrade' several times in a row to complete the upgrade.12:52
apwso in lucid ?12:52
Q-FUNKoneiric12:52
apwwell you must be unusual as i have 10 machines in that range and none have ever shown that kind of behaviour12:53
apwyou are saying if you run 2.6.39 it doesn't occur ?12:53
Q-FUNK2.6.38 works fine, but 2.6.39 and more recent have the symptoms.  with 2.6.39 it happened seldom, but with 3.0 it's nearly systematically when I do a daily package upgrade.12:53
apw32 or 64 bit ?12:54
Q-FUNK32-bit.  Geode LX12:54
apwoh a geode, heh, bet its h/w specific12:54
Q-FUNKnot really12:55
apwwell its not occurring on any of my oneiric boxes12:55
apwso its certainly not general12:55
Q-FUNKhowever, this particular host has repeatedly shown to be a good trap for corner-case kernel bugs.12:55
apwQ-FUNK, as you have a pair of good/bad you could see if the corresponding mainline kernels have the issue12:57
apwand then use the -rc's to try and pare it down a bit12:57
Q-FUNKtried that before, the last time I had a major regression on that host.  in the end, while we found out around which commit the regression took place, it never got investigated.12:58
apwQ-FUNK, as you are the only one with the h/w i suspect unless you do it it'll stay broken13:00
Q-FUNKI don't mind testing every -rc in the vanilla kernel folder to narrow it down to one specific release, but unless actions are taken beyond that point, it gets ridiculous.13:01
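The narrowing-down apw suggests (a known good kernel, a known bad one, and the -rc builds in between) is a binary search. Below is a minimal Python sketch of that idea, not a tool anyone in the channel mentions. The candidate list and the boot_and_check stand-in are hypothetical, and the search assumes the bug stays present in every build after the one that introduced it.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first version for which is_bad() is true.

    Assumes versions are ordered oldest to newest, versions[0] is known
    good, versions[-1] is known bad, and the bug never disappears again
    once it has appeared.
    """
    lo, hi = 0, len(versions) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


# Hypothetical candidate list between the good and the bad release.
candidates = (
    ["2.6.38"]
    + [f"2.6.39-rc{n}" for n in range(1, 8)]
    + ["2.6.39"]
)

tested = []


def boot_and_check(version: str) -> bool:
    """Stand-in for booting the build and running dpkg by hand."""
    tested.append(version)
    # Pretend, purely for illustration, that -rc5 introduced the oops.
    return candidates.index(version) >= candidates.index("2.6.39-rc5")


print(first_bad(candidates, boot_and_check))
print("builds booted:", tested)
```

With nine candidates this needs three test boots rather than seven, which matters when each test means a reboot and a package upgrade on slow hardware.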
mjg59geode is amazingly niche hardware, and lots of people seem enthusiastic about building businesses around it without actually being willing to put in the funding or effort for making sure that the software they depend on continues to work13:02
Q-FUNKfor instance, the ext4 inode destroying bug I encountered on the same host from kernel 2.6.31 to 2.6.35 was never properly investigated. it was just marked as fixed the day I announced that 2.6.36 apparently fixes it.13:03
apwQ-FUNK, if we can figure out whats breaking you we'll try and fix it, but last time it basically appears to be a work around for broken cache coherency so ... its not easy to either find or fix13:03
Q-FUNKapw: it could be. I'm still baffled as to how the bug occurred on that particular geode box and not on another one with a different bios.13:04
apwQ-FUNK, isn't half of the geode instruction set implemented via SMI, at which point the BIOS is part of your processor from a semantic point of view13:05
mjg59apw: Not so much the instruction set. Just every time you think you're touching hardware.13:06
apwmjg59, ahh ok, so its part of the h/w then even after its handed off ... just as bad13:06
Q-FUNKIIRC the bios is mostly used to provide a traditional x86 abstraction (PCI bus, etc.) for a system that uses an entirely different bus architecture and there are various implementations of that abstraction layer.13:06
mjg59Right, any PCI accesses get handled by non-free firmware13:07
mjg59So who knows what it's doing?13:07
apwQ-FUNK, right but if the code is running via SMI after the kernel takes over, any bug in it ... can break things ... if they don't clean up right13:07
Q-FUNKfree or non-free.  coreboot works quite well on those, at least for a few known configurations.13:07
mjg59Anyone who knew anything about how it worked appears to have vanished in a set of freak accidents13:08
Q-FUNKmostly in a set of random AMD attritions and OLPC changes of mind.13:08
apwmjg59, not physical accidents i hope13:08
mjg59apw: Not to the best of my knowledge13:09
Q-FUNKAMD used to have an extremely knowledgeable and dedicated coder who handled Geode coding for the OLPC project.  he even won employee awards for his efforts.  then one day, after a particularly bad quarterly, he fell the victim of random attrition.13:10
mjg59I suspect that if it weren't for OLPC, everyone would just have given up pretending to support Geode13:10
mjg59It's certainly the only reason we care13:10
mjg59(And we don't for RHEL)13:10
apwahh ... shame i don't know anyone who has one13:10
Q-FUNKthat random attrition even left his immediate boss in usnly had to provide technical support and ongoing code development without his main guy.13:11
Q-FUNKargh.  friggin kernel i/o stealing my keystrokes again13:12
Q-FUNKthat random attrition even left his immediate boss in total limbo, because he suddenly had to provide technical support and ongoing code updates without his main guy.13:13
Q-FUNKI really hate how hard-disk access has the bad habit of momentarily halting the keyboard buffer.13:14
apwQ-FUNK, never see that either, keys may be delayed for me but not lost13:15
apwnot that i want them delayed of course, but thats a separate gripe13:15
Q-FUNKdelayed would be acceptable.  half of the time, if the kernel starts swapping, I end up missing several words in the middle of a sentence.13:16
ohsixyou could try changing your io wait method13:18
apwherton, we likely are going to have to shove a lucid kernel with a single fix in for the point release ... so hold off any lucid uploads to the PPA13:19
hertonapw: we were avoiding uploading anything because of the point release too, so that's ok13:21
apwok good stuff13:21
Q-FUNKohsix: the scheduling algorithm, you mean?13:21
ohsixQ-FUNK: nevermind, was thinking of something else13:22
apwsconklin, i've pushed a temporary branch to lucid, master-point which is what i am proposing for the upload14:24
sconklinapw: ok, sounds good14:30
* ogasawara back in 2014:43
apwsconklin, ok the decision from the release team is that they want a kernel spun, could you check that branch for me as the stable check script just talks crap14:57
sconklinheh, probably as soon as we finish the meeting14:58
climbe2Problem!  If anyone is able to help.... running 10.04, recently upgraded to kernel 2.6.32-33, and can't boot up!!  See http://ubuntuforums.org/showthread.php?t=1807978 for my ongoing thread.  Any suggestions?!?!15:33
ogasawaratgardner: heading in, see ya in 1515:33
tgardnerogasawara, ack15:34
=== tgardner is now known as tgardner-afk
* herton -> lunch16:10
ppisatimumble went belly up16:18
keesapw: say, did you see my email about a funky CVE in the hardy update?16:21
apwkees, not yet, which one16:23
keesoh, sorry, not hardy. 2010-4175 in linux-lts-backport-maverick16:24
keesthe tracker shows "released 2.6.35-25.44~lucid1" but there is a changelog entry in 2.6.35-30.56~lucid116:24
apwkees, seems to be applied twice16:26
kees??16:26
apw git log --oneline origin/lts-backport-maverick  | grep 'rds: Integer overflow in RDS cmsg handling'16:26
keeswas it a no-change cherry-pick or something?16:26
apw9a3798f rds: Integer overflow in RDS cmsg handling, CVE-2010-417516:26
ubot2apw: Integer overflow in the rds_cmsg_rdma_args function (net/rds/rdma.c) in Linux kernel 2.6.35 allows local users to cause a denial of service (crash) and possibly trigger memory corruption via a crafted Reliable Datagram Sockets (RDS) request, a different vulnerability than CVE-2010-3865. (http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2010-4175)16:26
apwae1d3ae rds: Integer overflow in RDS cmsg handling16:26
apwgit describe --contains ae1d3ae16:26
apwUbuntu-lts-2.6.35-25.44~3316:26
apwgit describe --contains 9a3798f16:26
apwUbuntu-lts-2.6.35-28.50~2016:26
apwso thats where the versions you mention come from.  from whats in the tree i'd expect my tools to take the first version number there16:27
apwand i think thats what you are saying it did16:27
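The rule apw describes, taking the first version that contains the fix, can be sketched in Python from the two `git describe --contains` results pasted above. This is an illustration only and not apw's actual tooling. The helper names are made up, and the digit-based sort key is a simplification that assumes the tags differ only in their numeric parts. It is not a full Debian version comparison.

```python
import re


def containing_tag(describe_output: str) -> str:
    """Strip the ~N / ^N suffix that `git describe --contains` appends.

    Git ref names cannot contain '~' or '^', so everything before the
    first such character is the tag name.
    """
    return re.split(r"[~^]", describe_output.strip(), maxsplit=1)[0]


def version_key(tag: str) -> tuple:
    """Numeric sort key built from every run of digits in the tag."""
    return tuple(int(n) for n in re.findall(r"\d+", tag))


def earliest_release(describe_outputs) -> str:
    """Return the lowest-versioned tag among the describe results."""
    tags = [containing_tag(s) for s in describe_outputs]
    return min(tags, key=version_key)


# The two results from the log, one per commit carrying the fix.
outputs = [
    "Ubuntu-lts-2.6.35-28.50~20",  # 9a3798f
    "Ubuntu-lts-2.6.35-25.44~33",  # ae1d3ae
]
print(earliest_release(outputs))
```

On these inputs the sketch picks the 25.44 tag, which agrees with the "released 2.6.35-25.44~lucid1" entry kees saw in the tracker.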
keesyeah, that's certainly the correct approach, but I guess I'm wondering if we need to look closer -- how did it get applied twice (did that have a bad effect?)16:27
apwkees, both commits seem to be complete16:27
apwwhich should be impossible16:27
keesright :P16:28
sconklinapw: did you test build this?16:28
apwsconklin, i built only generic on amd64/i38616:28
sconklinI'm going to test build before I shove it to the PPA. Less time in the long run if it fails16:29
apwsconklin, ack, thanks16:29
apwkees, will investigate and let you know16:29
keesapw: cool, thanks16:29
apwi can only assume the first one got reverted somehow16:30
apwjust ... how i don't know16:30
sconklinapw: does this require an ec2 respin?16:35
apwsconklin, i would expect it might now you mention it, but i am confirming now with the release team16:36
hertonppisati: yeah, mumble can't login anymore here16:41
sconklinhere either16:41
apwkees, ok this actually is fixed in the earlier one as advertised.  however the relayout of the code since then makes it inobvious so someone has applied the same fix again to the containing routine in the second place16:45
apwkees, so its doubly fixed essentially16:45
keesapw: how strange, okay. thanks!16:49
apwsconklin, ok the basic answer is no we don't need -ec216:49
apwkees, good spot though16:49
sconklinok16:49
apw17:49:02 Daviey | it's a nice to have.. but we are only currently seeing this with euca.16:50
Daviey\o/16:51
apwDaviey, yep taking your name in vain16:53
* apw screams about mumble16:53
* jjohansen running an errand17:14
sconklinapw: package uploaded, It'll take 24 hours to complete, since it includes ARM arch17:30
apwsconklin, yeah17:30
apwsconklin, one reason we need it in as soon as17:30
sconklinapw: yep, and another reason that I did a test build (which completed OK)17:31
apwsconklin, yep all sensible and right17:32
climbe2Problem!  If anyone is able to help.... running 10.04, recently upgraded to kernel 2.6.32-33, and can't boot up!!  See http://ubuntuforums.org/showthread.php?t=1807978 for my ongoing thread.  Any suggestions?!?!17:38
apwclimbe2, does the previous kernel work17:57
apwas in selecting an older kernel from the grub menu17:58
tgardnerapw, mumble seems to have reincarnated18:03
climbe2apw, no none will work anymore18:04
apwclimbe2, ok thats indeed odd if you took a kernel update and all your old kernels stop working too18:05
apwwhich versions have you tried18:05
climbe2perhaps it was not the upgrade.... I was recently using an SDHC card through my HP printer... two different USB storage devices...18:05
climbe2all I know is that I can't boot up at all!18:06
apw/dev/disk/by-uuid/e3df952c-d462-4292-bab6-4965da1d567c does not exist. Dropping to a shell. 18:06
apwso that implies your disk does not have the label that is expected18:07
apwat the busy box you can get the dmesg output and see if any of your disks were found18:08
climbe2ok, how do I do that?18:08
climbe2I am also on a different computer right now, so I will have to hand type it18:09
apwclimbe2, 'dmesg | grep sd'18:09
apwthat command sequence might give you some clues18:09
apwclimbe2, you might also try 'blkid'18:09
apwif that is available it might tell you what disks it thinks it can see18:10
apw/dev/sda5: UUID="cf503727-25f2-4ecd-b0f3-2b894523bcba" TYPE="ext4" 18:10
climbe2I can only get to the grub command line18:10
climbe2is there another command line I can use?>18:10
apwmine has a line like that ... which matches the UUID in the error ...18:10
apwclimbe2, your post has the busybox prompt in it ?18:10
apw(initramfs) 18:11
apwthat one18:11
climbe2I've been a few different places from back then...let me see18:11
climbe2yes, I am there18:12
apwdoes blkid produce any output18:13
apwand does dmesg | grep sd18:13
climbe2dmesg | grep sd produces hundreds of lines it seems....blkid produces my drives18:14
apwok, and does the UUID in the error line appear ?18:14
climbe2blkid:  /dev/sda1, 2, 5, 6    /dev/sdab118:14
apwclimbe2, is that the exact output it produced ?18:15
apwi am expecting some UUID= segments18:15
apw/dev/disk/by-uuid/e3df952c-d462-4292-bab6-4965da1d567c18:16
climbe2there is an extra "\" in the error message than in the drive UUID18:16
apwan extra ?18:16
apwwhat ?18:16
climbe2wait..let me see18:16
climbe2in grub, i press 'e' to edit... it has /dev/disk/by-uuid/e3fd.....b\ab6-... instead of bab618:18
apwwell you could try removing that \18:19
climbe2rather, in the edit screen, 5th line down it says linux /boot/vmlinuz 2.6.32-33-generic root=UUID=e3fd.....b\ab6...18:20
apwthough i am surprised to get those full names in your grub conf18:21
climbe2perhaps that is an end-of-line signal...?  it occurs on another line as well18:21
climbe2when the line breaks to continue to the next18:21
* apw has to run18:22
climbe2thanks for the help18:23
climbe2i'll see what i can do18:23
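The check apw was steering towards, whether the UUID on the kernel command line matches any UUID that blkid reports, is easy to get wrong when retyping by hand. Here is a minimal Python sketch of that comparison, assuming blkid prints lines in the form apw pasted. The function names and the second sample line are hypothetical, and the sketch does not settle whether the backslash seen in the grub editor is only a line-wrap marker.

```python
import re


def parse_blkid(text: str) -> dict:
    """Map device path -> filesystem UUID for lines such as
    /dev/sda5: UUID="cf50..." TYPE="ext4"
    """
    uuids = {}
    for line in text.splitlines():
        device, sep, rest = line.partition(":")
        # \b keeps PARTUUID="..." from being mistaken for UUID="...".
        match = re.search(r'\bUUID="([^"]+)"', rest)
        if sep and match:
            uuids[device.strip()] = match.group(1)
    return uuids


def device_for_root(root_arg: str, blkid_text: str):
    """Return the device matching a root=UUID=... argument, or None."""
    wanted = root_arg.split("UUID=", 1)[-1].strip().lower()
    for device, uuid in parse_blkid(blkid_text).items():
        if uuid.lower() == wanted:
            return device
    return None


# First line is the one apw pasted; the second is a made-up example.
sample = (
    '/dev/sda5: UUID="cf503727-25f2-4ecd-b0f3-2b894523bcba" TYPE="ext4"\n'
    '/dev/sda1: UUID="0123abcd-0000-4000-8000-000000000001" TYPE="swap"\n'
)

print(device_for_root("root=UUID=cf503727-25f2-4ecd-b0f3-2b894523bcba", sample))
print(device_for_root("root=UUID=e3df952c-d462-4292-bab6-4965da1d567c", sample))
```

A result of None for the UUID from the error message would mean that no visible partition carries it, which is the situation the "does not exist. Dropping to a shell." message describes.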
pgranerskaet, ping18:26
pgranerskaet, we don't need to pull the kernel re: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/78835118:27
ubot2Ubuntu bug 788351 in linux "xfs ioctl XFS_IOC_FSGEOMETRY_V1 clobbers kernel stack" [Undecided,New]18:27
pgranerskaet, the patch in question is in the current -proposed18:27
pgranerskaet, no need to do anything18:27
skaetpgraner,  thanks - was pinging around about it. 18:27
pgranerskaet, next time just drop in here and ask18:27
skaetpgraner, will do18:28
=== Quintasan_ is now known as Quintasan
keesapw: another "why is this 'released'?" question for 2011-0711. uct shows it as "released" for a kernel in -proposed (hardy)18:59
* jjohansen lunch19:26
* tgardner bounces tangerine for kernel update20:37
=== yofel_ is now known as yofel
FireZen!21:15
=== kentb is now known as kentb-out
=== jeremy is now known as Guest4843
=== rsalveti` is now known as rsalveti
