/srv/irclogs.ubuntu.com/2012/09/28/#ubuntu-kernel.txt

bjfjjohansen: about?03:13
jjohansenbjf: hey whats up04:37
bjfjjohansen: i'm trying to figure out if these apparmor qrt failures are due to missing pkgs04:37
bjfhttp://kernel.ubuntu.com/beta/testing/test-results/rizzo.2012-09-28_01-40-19/qrt/default/qrt.test-apparmor.py/debug/qrt.test-apparmor.py.DEBUG.html04:38
bjfjjohansen: ^04:38
jjohansenbjf: the python binding failures seem to be04:38
bjfjjohansen: i "think" i fixed those and have rerun the tests04:38
jjohansenokay, you have a longpath failure, which machine04:39
jjohansenso I can log in and look at it04:39
bjfjjohansen: rizzo04:39
bjfjjohansen: did you get your qa lab creds?04:39
jjohansenokay, longpath is not a python binding04:39
jjohansenI suspect it is something to do with mount/chroot04:39
jjohansenbjf: okay, I'll look into it tonight, I haven't been able to replicate yet and I have tried on the qa infrastructure04:40
bjfjjohansen: so you can get into the qa lab machines?04:41
jjohansenif you can leave rizzo as is, I can get to it in an hour or two, /me is trying to get kids to bed04:41
jjohansenbjf: yeah I should be able to I have been given vpn access and keys04:41
bjfjjohansen: no problem, i'm done for the night, won't be back to it til tomorrow a.m.04:41
jjohansenokay I should have something for you in the morning04:42
bjfjjohansen: you want me to check back with you in about an hour to see if you are getting in?04:42
jjohansensorry I got distracted by other problems like the weird stack failure, that apw was dealing with earlier in the week04:42
bjfnp04:42
bjfthat took priority04:42
jjohansenbjf: give me a couple minutes I'll try it now04:43
bjfack04:43
jjohansenbjf: its going to take /me more than a couple minutes04:45
bjfnp, i'll check back in about an hour .. if that's ok04:45
=== henrix_ is now known as henrix
=== smb` is now known as smb
smbmorning08:12
diwichmm, could it be that the mainline 3.6 kernel does not output snd_printk messages by default, but the ubuntu 3.5 kernel does?08:16
smb"Could be" could always be.  Though I *know* as much as you do there...08:19
ogra_hmm, do kernel builds use -Werror=maybe-uninitialized by default ?08:21
ppisatiogra_: if we used -Werror=maybe-uninitialized then NONE of our kernels would ever build :)08:29
ogra_ppisati, well, then i wonder where it comes from in the linux-ac100 build :)08:29
ogra_probably an upstream Makefile from friendly nvidia or so :)08:30
* ogra_ goes digging08:30
ogra_ogra@anubis:~/Devel/packages/linux-ac100-3.1.10$ grep -r Werror *|grep tegra|grep Makefile|wc -l08:33
ogra_608:33
ogra_ogra@anubis:~/Devel/packages/linux-ac100-3.1.10$ 08:33
ogra_yay, fun08:33
* ogra_ wouldnt mind that they set this *if* they would actually make sure there are no warnings ... tsk08:36
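The grep pipeline above can be approximated in Python. This is only a rough sketch: the function name and the `path_filter` parameter are hypothetical, and unlike the `grep ... | wc -l` pipeline (which counts matching lines) it lists matching Makefiles, so the count may differ from the 6 reported above.

```python
import os

def find_werror_makefiles(root, path_filter="tegra"):
    """List Makefiles under root (paths relative to root) that mention Werror.

    Only files whose relative path contains path_filter are considered.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Match "Makefile" as well as variants such as "Makefile.boot".
            if not name.startswith("Makefile"):
                continue
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            if path_filter not in rel:
                continue
            with open(full, errors="replace") as fh:
                if any("Werror" in line for line in fh):
                    hits.append(rel)
    return sorted(hits)
```

Called as `find_werror_makefiles(".")` from the top of a kernel tree, it would return the tegra-related Makefiles that set a `-Werror` flag.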
=== ivoks is now known as SiIverSpace
=== SiIverSpace is now known as ivoks
Nosophorushi12:08
* henrix -> SIGFOOD12:35
* ppisati goes out to run an errand, brb13:50
rtgogasawara, whaddya think of git://kernel.ubuntu.com/rtg/ubuntu-quantal.git ixgbe ?14:30
* ogasawara looks14:30
rtgogasawara, it brings ixgbe up to 3.6-rc714:31
rtghmm, though looking at the diff I think a doc patch touched a bunch of other stuff. better figure that out.14:32
ogasawarartg: ah, a full update rather than just the SR-IOV patch.  I could go for that.14:34
rtglooks like its 49ce9c2cda18f62b13055dc715e7b514157c2da8 which is benign documentation and cleanup14:34
rtgogasawara, I need a test bed for ixgbe, any thoughts ? I wonder if tangerine or gomeisa has this.14:35
ogasawarartg: I was just thinking the same thing14:35
rtgtangerine is igb14:35
rtgso is gomeisa14:35
ogasawarartg:  what about any of the boxes bjf has in the lab14:36
rtgogasawara, what is your shark bay ? unlikely since ixgbe is 10Gb IIRC14:36
ogasawarartg: I don't think it's ixgbe, but I'll double check14:37
rtgsconklin_, can you root around on your lab boxes to see if any are ixgbe ?14:37
sconklin_ack, odds are low14:38
rtgsconklin_, you've got a newer emerald class box I think. that is a possibility.14:39
sconklin_rtg: it's server-only, right? No sense checking commercial laptops?14:39
rtgsconklin_, right. ixgbe is 10Gb14:39
rtgyep, "Intel 10 Gigabit PCI Express Linux driver"14:40
sconklin_I don't have an emerald class box, the 'biggest' boxes I have are sandy bridge and ivy bridge desktops14:42
sconklin_they're both gigabit, I just checked14:42
bjfsconklin_: he's asking about machines in the lab14:42
sconklin_oh hell14:42
sconklin_ok, stand by14:42
bjfsconklin_: statler is igb14:44
rtgI've got plenty of igb boxes14:45
bjfsconklin_: rizzo is broadcom14:46
bjfrtg, so no, we don't have any14:46
rtgbjf, ok, thanks. maybe we'll just upload it and see if it sticks :) The intel guys are pretty good about ethernet driver testing.14:47
bjfrtg, sorry, one more system to check14:47
jjohansenbjf: so I hit a different bug last night, and haven't tracked down the aa problem yet14:47
ogasawarartg: if anything, I can ask yingying to test14:47
bjfjjohansen: heh, np. you want me to leave that system for you?14:47
jjohansenbjf: can I just reboot the system?14:47
rtgogasawara, you think she's got one ?14:47
bjfjjohansen: yup14:47
ogasawarartg: she should have access to one14:48
jjohansenbjf: so the bug I hit is that loop mounts aren't getting unmounted, something is holding a ref to them14:48
rtgogasawara, ok, I'll request that in the bug.14:48
rtgogasawara, can she build, or should I do it for her 14:49
rtg?14:49
ogasawarartg: you should build for her14:49
rtgack14:49
bjfrtg, nope, my other lab server is igb as well14:58
rtgbjf, ok, thanks for looking14:58
ogasawarasmb: I've opened up 100909815:02
smbogasawara, Ok cool, then as soon as people like my source package they can upload. 15:03
ogasawarasmb: sweet, thanks!15:03
smbDaviey, ^ FYI15:04
Davieysmb: rocking15:08
Davieysmb: shall we break some stuff?15:09
smbDaviey, Why not? Its Friday, close to release... perfect time to break things...15:09
Davieywfm15:10
smbzul, Or what ya say? :)15:10
rtgDaviey,  smb, I think you guys should have several beers first15:10
Daviey*glug*15:11
smbrtg, Oh right, need to wait until beer thirty... and log off irc...15:11
rtgsmb, and only then should you upload :)15:12
smb:D15:12
zulsmb: what are we breaking?15:12
Davieysmb: dpkg-source: error: Version number suggests Ubuntu changes, but Maintainer: does not have Ubuntu address15:12
rtgyou know ogasawara can fix it.15:12
smbzul, Only Xen15:12
Davieysmb: How did you build a source package?15:12
zulsmb:  ah 15:12
smbDaviey, Was just a warning for me15:13
Davieysmb: and you thought best to ignore it? :)15:13
smbDaviey, Of course! :) 15:13
smbcanonical ... ubuntu... who cares15:14
jjohansenbjf: okay so the failure is in the test suite, and is around the core dump pattern15:26
jgriffithsmb: So what can I do to help get to the bottom of this LVM problem?15:37
xnoxwhat LVM problem?15:37
jgriffithxnox: OpenStack LVM issue15:37
smbjgriffith, One thing would be to let me know whether the steps in the bug report will cause the problem for you15:38
jgriffithxnox: In a nut-shell dd if=/dev/zero to an LVM snapshot hangs the system15:38
jgriffithsmb: They did15:38
smbbug 102375515:38
jgriffithsmb: It's random however15:38
ubot2Launchpad bug 1023755 in linux "Precise kernel locks up while dd to /dev/mapper files > 1Gb (was: Unable to delete volume)" [Undecided,Confirmed] https://launchpad.net/bugs/102375515:38
smbjgriffith, Hm, so I should run that in a loop (that was not completely clear in the report)15:39
jgriffithsmb: Sorry... 15:39
ogasawarappisati: can I close bug 746137 for the linux task as well?  ie set Invalid15:39
ubot2Launchpad bug 746137 in linux "Page allocation failure on Pandaboard and Beagle XM" [High,Confirmed] https://launchpad.net/bugs/74613715:39
jgriffithsmb: Really even running in a loop probably won't do it for ya TBH15:39
smbjgriffith, The other thing is, would making the snapshot storage slightly bigger than the original volume be a work-around usable?15:40
* xnox is confused why would you zero out LVM snapshot, instead of asking lvm which blocks the snapshot used and zero those out.15:40
xnoxis this reproducible in quantal?15:40
smbxnox, That would be another way15:40
jgriffithsmb: I can try it,but I don't see how that would help15:40
jgriffithxnox: Not sure, I'll look at it today15:40
bjfjjohansen: ack. you'll see that the necessary fixes are applied to the qrt tests?15:40
smbxnox, jgriffith Unfortunately for that you need to use dmsetup commands (I believe)15:41
jgriffithxnox: I did pull the quantal kernel into 12.04 and couldn't reproduce that way15:41
jjohansenbjf: yep, working on it, and then I finally get to the yama test as well15:41
bjfjjohansen: very good. just let me know and i'll give it a spin.15:41
jgriffithsmb: What would require dmsetup commands?  Quantal?15:41
xnoxinteresting. precise has a very old lvm2 unfortunately (we missed opportunity to merge lvm2 on time)15:42
xnoxoh well.15:42
smbjgriffith, No converting the snapshot area into a linear volume15:42
jgriffithsmb: Oh, I see where you're going15:43
jgriffithSorry missed that above15:43
smbAs well as apparently a working lvconvert solution... seems to lock up in Precise sadly15:43
jgriffithsmb: Your lvconvert locks up?  Sounds familiar15:43
smbjgriffith, Similar but I am not sure it is the same. It seems to do things but then messing up the refcounting so the snapshot cannot be removed and any other lvm command locks up as well15:45
jgriffithsmb: So that's very similar behavior at any rate15:45
smbI did not dig down there because it was something else and the dd's were acting as they should be for me15:46
jgriffithsmb: yeah, the dd thing is interesting to try and reproduce15:46
jgriffithsmb: You'll notice I initially marked this bug as invalid/not reproducible for the same reason15:47
jgriffithsmb: But then I start hitting it more often, and for a while I'd hit it almost EVERY time I ran15:47
jgriffithsmb: Nothing different that I could determine on the systems15:47
jgriffithsmb: Using Vagrant and Virtual box so the steps/setup should be the same in every case15:47
smbjgriffith, Right, maybe the copy on write locks up rarely... Hm, when you hit it, can you in any way cause a crashdump15:48
jgriffithsmb: Well, that *ONE* time the system was still responsive15:49
jgriffithsmb: But typically, the entire system is locked up and I can't do anything except power it off15:49
jgriffithsmb: (virtually of course)15:49
smbjgriffith, not even sysrq keys?15:49
jgriffithsmb: Nope... nothing15:49
smbToo bad. For a Xen guest it would be simple to get a dump. I should probably investigate how / whether that can be done for KVM guests, too15:50
smb(of course the xen guest I tried on did not break)15:51
jgriffithsmb: Wonder if it would be worth the pain of running devstack to try and reproduce that way?15:51
smbjgriffith, Probably first would try to run the test more often to get it to happen with basic lvm commands.15:53
jgriffithsmb: Understood15:53
jgriffithSo the things I've found to be key are the size (>= 2G) and an LVM snap 15:54
jgriffithsmb: Neither LVM ever being written to or used15:55
smbjgriffith, Ok, yeah. I did set it up that way. There were a few modifications to the commands I had to do (noted in the report) but at least I am trying with a loopfile based VG as well15:56
jgriffithsmb: cool... that should do it15:57
smbHopefully. I will hack on a script to loop and then can let it run over some time over the weekend15:58
jgriffithsmb: great.. thanks!  Keep me posted, and anything I can do to help just say the word15:58
smbjgriffith, Sure I try to keep the bug report updated with any progress or questions15:59
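A looping reproducer of the kind smb describes might look roughly like the sketch below. It is not taken from the bug report: the volume group name, the LV names `orig` and `snap`, and the exact command sequence are all assumptions, based only on the conditions mentioned above (a volume of 2G or more, an LVM snapshot, and `dd if=/dev/zero` to the `/dev/mapper` node). It defaults to a dry run that only prints the commands; a real run needs root and an existing volume group. Writing the full volume size into an equally sized snapshot may overflow the snapshot store, which is why the size is left as a parameter.

```python
import subprocess

def snapshot_dd_commands(vg, size_gb=2, origin="orig", snap="snap"):
    """Build one iteration: create LV, snapshot it, zero the snapshot, clean up.

    Assumes vg and the LV names contain no hyphens, so the device-mapper
    node is simply /dev/mapper/<vg>-<lv>.
    """
    size = "%dG" % size_gb
    return [
        ["lvcreate", "-L", size, "-n", origin, vg],
        ["lvcreate", "-s", "-L", size, "-n", snap, "%s/%s" % (vg, origin)],
        ["dd", "if=/dev/zero", "of=/dev/mapper/%s-%s" % (vg, snap),
         "bs=1M", "count=%d" % (size_gb * 1024)],
        ["lvremove", "-f", "%s/%s" % (vg, snap)],
        ["lvremove", "-f", "%s/%s" % (vg, origin)],
    ]

def run_loop(vg, iterations, dry_run=True):
    """Repeat the sequence; return every command issued, in order."""
    issued = []
    for _ in range(iterations):
        for cmd in snapshot_dd_commands(vg):
            issued.append(cmd)
            if dry_run:
                print(" ".join(cmd))
            else:
                # Stop at the first failing command.
                subprocess.run(cmd, check=True)
    return issued
```

Since the hang is reported to be random, the iteration count would need to be large, and a hard lockup would simply stop the loop at whichever command was running.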
smbrtg, Btw, I reached a state with bug 1021471 where I cannot really say which of the oddities I see would be the real problem. Perhaps a bell rings for you (most of the details are in the upstream bug). Otherwise I fear this needs someone from upstream actually caring.16:06
ubot2Launchpad bug 1021471 in linux "clone() hang when creating new network namespace (dmesg show unregister_netdevice: waiting for lo to become free. Usage count = 2)" [High,Triaged] https://launchpad.net/bugs/102147116:06
rtgsmb, prolly. maybe serge ?16:07
smbrtg, Err him being expert in net/ipv4/* ?16:08
rtgsmb, no, I was thinking he's an expert with name spaces16:08
smbrtg, Ah ok. Well in that case it might only be indirectly related to name spaces. I have not really checked that but it could be that even in the initial one there would be dangling references. Just that in that case nobody cares16:09
smbSo far it looks like an odd entry in the route cache. But whether it should not be there or be there with different values or other flags I got no clue16:10
smbAnd of course upstream replaced the whole thing again16:11
* smb notes that percpu refcounting alone sounds scary enough16:12
sforsheebjf, I've got another test kernel for brcmsmac. I've drastically reduced the amount of buffering done by the driver, and I haven't seen the flush warning once.16:17
sforsheebjf, http://people.canonical.com/~sforshee/lp1046507/linux-3.6.0-030600rc7+wt201209271321/16:17
bjfsforshee: pulling ...16:18
rtgsforshee, do you think its a buffer bloat issue ?16:18
sforsheertg, I think that the bufferbloat guys definitely wouldn't like the amount of buffering brcmsmac does, but I don't think the problems we're seeing are caused by bufferbloat16:19
sforsheertg, currently brcmsmac can queue 200+ packets internally16:20
rtgsforshee, oh, thats _way_ too many16:21
sforsheertg, I think the timeouts during flush are just because it can't send all of those in the time it's allowing flush to complete16:21
rtgI think it shouldn't have more than 3-5 queued16:21
rtgwhat is the TX rate steps down to 1MB ? Thats an enormous amount of time....16:22
rtgwhat if*16:22
sforsheertg, with the way the broadcom chips appear to do AMPDU I think it has to queue at least one AMPDU's worth16:22
sforsheewhich is 1616:22
rtghow does it behave if you reduce the queue depth and disable aggregation ?16:22
sforsheeI'm planning to test without aggregation but haven't gotten around to it yet. It's a different code path through the driver though.16:24
rtgwhy does it flush ?16:24
sforsheeThe AMPDU code path explicitly tries to wait until enough packets are queued for an AMPDU16:24
sforsheemost of the flushes I see are before switching channels for background scanning16:25
rtgah16:25
rtgI think waiting is a bad idea. if there is sufficient congestion, then packets will tend to queue up naturally. 16:26
sforsheeThe broadcom chips require extra headers on them when they are handed off to the hardware, and the first and last AMPDU packets must be marked as such in this header.16:29
sforsheeSo it appears brcmsmac needs to know how many AMPDU packets it's handing off at the time it moves them to the DMA ring.16:29
rtgsforshee, maybe you could try that, e.g., only aggregate the packets that are in the queue when the TX is ready to run rather than waiting.16:30
sforsheeWhat the driver does is waits as long as there's a tx in progress. Once all pending tx finishes it starts moving packets to the DMA ring regardless of how many are queued up.16:31
sforsheeIt might make more sense if it just moved packets any time there was room in the DMA ring though16:31
rtgsforshee, how many does the DMA ring hold ? is that the 200 number ?16:31
sforsheeno, I don't recall the exact number but it is much smaller16:31
sforsheeI think the 200 number had something to do with their expectations when operating as an AP, but brcmsmac doesn't support AP mode right now anyway iirc16:32
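The "aggregate whatever is queued when TX is ready" idea from the exchange above can be illustrated with a toy model. This is not brcmsmac code: the function name, the flag names and the queue representation are hypothetical, and only the 16-packet AMPDU size and the first/last marking requirement come from the discussion.

```python
from collections import deque

AMPDU_MAX = 16  # packets per AMPDU, the figure quoted in the discussion

def fill_dma_ring(queue, ring_free_slots, ampdu_max=AMPDU_MAX):
    """Move what is queued right now into the ring as a single aggregate.

    The batch size is fixed before any packet is moved, so the first and
    last packets can be marked at hand-off time without waiting for more
    traffic to arrive.
    """
    count = min(len(queue), ring_free_slots, ampdu_max)
    batch = []
    for i in range(count):
        pkt = queue.popleft()
        flags = set()
        if i == 0:
            flags.add("first")
        if i == count - 1:
            flags.add("last")
        batch.append((pkt, flags))
    return batch
```

With five packets queued and three free ring slots, the call hands off three packets (the first marked `first`, the third marked `last`) and leaves two queued for the next opportunity; a lone packet carries both marks.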
* ppisati -> EOW16:42
* henrix -> EOD17:03
=== henrix is now known as henrix_
rtgjsalisbury, re: bug #1041883, Too much noise. TLDR. Did you narrow it down to CONFIG_X86_X32 ?17:28
ubot2Launchpad bug 1041883 in linux "Recent patch to asus-wmi module makes system unbootable" [High,In progress] https://launchpad.net/bugs/104188317:28
jsalisburyrtg, it looks like it.  Comment #97 is the original bug reporter.  He states that he no longer sees the panic, which was the original problem.17:32
jsalisburyrtg, I asked some of the other commenters to confirm, but who knows the kind of response we will get, based on the other comments17:33
rtgjsalisbury, ok, I think I'll go ahead and disable it, especially after chatting with cking and apw17:34
jsalisburyrtg, cool17:34
rtgjsalisbury, hmm, just _how_ do you disable that bugger ?17:47
jsalisburyrtg, I changed it to "# CONFIG_X86_X32 is not set" in ~ubuntu-quantal/debian.master/config/config.common.ubuntu17:49
rtgjsalisbury, did you check that it stuck after running updateconfigs ?17:51
rtggit diff17:51
=== yofel_ is now known as yofel
jsalisburyrtg, It looks like it from git log of the SHA1 generated:17:53
jsalisbury(quantal-amd64)jsalisbury@tangerine:~/bugs/lp1041883/ubuntu-quantal$ git show fc124a4fe9eda36d9509a4c5e15945ec45839d2217:53
jsalisburycommit fc124a4fe9eda36d9509a4c5e15945ec45839d2217:53
jsalisburyAuthor: Joseph Salisbury <joseph.salisbury@canonical.com>17:53
jsalisburyDate:   Wed Sep 26 17:36:06 2012 +010017:53
jsalisbury    Disabled config.common.ubuntu17:53
jsalisbury    17:53
jsalisbury    BugLink: http://bugs.launchpad.net/bugs/104188317:53
ubot2Launchpad bug 1041883 in linux "Recent patch to asus-wmi module makes system unbootable" [High,In progress]17:53
jsalisbury    17:53
jsalisbury    Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>17:53
jsalisburydiff --git a/debian.master/config/config.common.ubuntu b/debian.master/config/config.common.ubuntu17:53
jsalisburyindex 37cb4b8..e852eff 10064417:53
jsalisbury--- a/debian.master/config/config.common.ubuntu17:53
jsalisbury+++ b/debian.master/config/config.common.ubuntu17:53
jsalisbury@@ -6406,7 +6406,7 @@ CONFIG_X86_USE_PPRO_CHECKSUM=y17:53
jsalisbury CONFIG_X86_WANT_INTEL_MID=y17:53
jsalisbury CONFIG_X86_WP_WORKS_OK=y17:53
jsalisbury CONFIG_X86_X2APIC=y17:53
jsalisbury-CONFIG_X86_X32=y17:53
jsalisbury+# CONFIG_X86_X32 is not set17:53
jsalisbury CONFIG_X86_XADD=y17:53
jsalisbury CONFIG_XEN=y17:54
jsalisbury CONFIG_XENFS=m17:54
jsalisburyrtg, is there another way I can check while the kernel is booted?17:54
rtgjsalisbury, /boot/config*17:54
rtg/boot/config-`uname -r`17:54
jsalisburyrtg, ok.  I'll boot up the test kernel on a test machine and check. One minute17:55
rtgjsalisbury, you don't have to boot it up if the kernel is still installed.17:56
* rtg -> lunch. bbiab17:56
jsalisburyrtg, I don't have it installed, but I will now17:57
jsalisbury_rtg, It looks like the change stuck:18:09
jsalisbury_jsalisbury@samsung:~$ cat /boot/config-3.5.0-15-generic | grep CONFIG_X86_X3218:09
jsalisbury_# CONFIG_X86_X32 is not set18:09
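The same check can be done with an exact symbol match, which avoids mixing up similarly named options such as CONFIG_X86_32 and CONFIG_X86_X32. A minimal sketch, assuming the usual `SYMBOL=value` and `# SYMBOL is not set` line formats seen in the diff above; the function name is hypothetical.

```python
import re

def config_symbol_state(config_text, symbol):
    """Return the symbol's value ('y', 'm', ...), 'not set', or None if absent."""
    set_re = re.compile(r"^%s=(.*)$" % re.escape(symbol))
    unset_line = "# %s is not set" % symbol
    for line in config_text.splitlines():
        line = line.strip()
        if line == unset_line:
            return "not set"
        match = set_re.match(line)
        if match:
            return match.group(1)
    return None
```

For the running kernel, the text would come from something like `open("/boot/config-" + os.uname().release).read()`, which is the Python equivalent of rtg's `/boot/config-`uname -r``.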
infinityDoes anyone know the status of linux-armadaxp 3.5.0 in the PPA?  Is that version scrubbed of non-free badness, or am I still waiting on that?18:29
rtginfinity, hassle vanhoof. he's more likely to know, but I seem to remember that ike was gonna remove some stuff.18:32
infinityrtg: Yeah, I just mailed vanhoof, ike, and friends.18:33
rtgjsalisbury, I don't think CONFIG_X86_32 was _ever_ enabled. Reset to Ubuntu-3.5.0-11.11 and have a look at the config that is generated: '# CONFIG_X86_32 is not set'18:38
rtgjsalisbury, doh, Sarvatt pointed out that its CONFIG_X86_X32, not CONFIG_X86_3218:43
jsalisburyrtg, hmm, yeah, if I reset to Ubuntu-3.5.0-11.11 it isn't set.  The test kernel I built was a fresh clone of 3.5.0-1518:43
rtgjsalisbury, I led you astray, try CONFIG_X86_X3218:43
jsalisburyrtg, and I reset to Ubuntu-3.5.0-11.11 in the tree that I modified.  A fresh clone and reset to 3.5.0-11.11 does show it set18:45
rtgjsalisbury, right, I was chasing the wrong symbol18:45
jsalisburyrtg, ack18:46
* ogasawara lunch19:36
profiler1982has anyone tried kernel 3.5 on ubuntu 11.10? I have an amd APU c-6019:41
rtgogasawara, your shiny new kernel is accumulating patches at an alarming rate.20:05
rtgogasawara, have a look at https://blueprints.launchpad.net/ubuntu/+spec/powerpc-kernel-devel to consider for approval.20:23
rtgand with that, I'm outta here.20:24
* rtg -> EOW20:24
infinityogasawara: Want to whitelist me on canonical-kernel-team@lists?23:03
infinityogasawara: (Assuming you moderate it)23:03
