[05:42] <maxb> I have a weird issue where my Ubuntu machine has decided that it doesn't want to honour ICMP fragmentation needed for PMTU discovery, whilst other Linux machines on the same subnet connecting to the same destination are fine, can anyone think of any useful avenues to investigate?
[05:42] <maxb> (I'm assuming something this deep into the networking stack must be a kernel issue)
[07:28] <ppisati> moin
[07:52] <diwic> I got a question from upstream about how Ubuntu deals with the power_save parameter for HDA codecs. I don't think we do anything (i e, just follow upstream), but is there a way to verify?
[07:54] <RAOF> diwic: I guess check the module parameters in /etc/modprobe.d?
[07:55] <diwic> RAOF, nothing there, I was more thinking of kernel patches
[07:56] <RAOF> You'd find them in the kernel git tree, then.
[07:57] <RAOF> The kernel tree is based on whatever upstream commit is most recent, with all our patches on top of that, so ‘git log 3.8-rc3..’ should get you all the patches we apply.
[07:58] <diwic> RAOF, ok, thanks
[08:09] <apw> diwic, i don't recall anything specific any more for hda, but worth looking indeed
[08:10] <diwic> apw, ack. As a side note, I think we once had a larger buffer size for hda, but that patch must have been removed again
[08:12] <apw> maxb, pmtu discovery does not require fragmentation, indeed it requires you request no fragmentation
[08:13] <apw> maxb, on the expectation you get a 'fragmentation required' icmp in response to anything too large.  you should be able to see those in your network traces if things are right
[08:13] <maxb> apw: Yes indeed. My problem is that I can see ICMP fragmentation needed packets arriving, but Linux doesn't seem to be taking them into account. It continues trying to use a too-large MTU with DF set
[08:14] <apw> maxb, i wonder if they are being firewalled into the bit bucket
[08:15] <maxb> I can see them in tcpdump on the client host
[08:16] <apw> you would expect to see them in tcpdump even if they get dropped i think, as the copies for tracing are taken very early
[08:16] <apw> do you have iptables rules on this box
[08:16] <maxb> only one, a nat/POSTROUTING/MASQUERADE rule, and all the chain default policies are ACCEPT
[08:18] <apw> maxb, not that then
[08:18] <maxb> Indeed :-/
[08:18] <apw> maxb, i assume /proc/sys/net/ipv4/ip_no_pmtu_disc is 0
[08:19] <maxb> yes. Of course, disabling pmtu is a workaround for the original communications problems that set me to investigating this, but it's not ideal
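For reference, the setting apw asks about can also be read from a script. The following is a minimal sketch, assuming a Linux host where /proc/sys/net/ipv4/ip_no_pmtu_disc exists; the function name pmtu_discovery_enabled is hypothetical and chosen only for illustration. A value of 0 in that file means PMTU discovery is enabled.

```python
from pathlib import Path

# Path taken from the discussion above; it only exists on Linux.
SYSCTL_PATH = "/proc/sys/net/ipv4/ip_no_pmtu_disc"


def pmtu_discovery_enabled(path: str = SYSCTL_PATH) -> bool:
    """Return True when the sysctl file holds 0, i.e. PMTU discovery is on."""
    value = int(Path(path).read_text().strip())
    return value == 0


if __name__ == "__main__":
    try:
        print("PMTU discovery enabled:", pmtu_discovery_enabled())
    except OSError as exc:
        # Non-Linux hosts (or restricted containers) may lack the file.
        print("could not read sysctl:", exc)
```

The path is a parameter so the function can be pointed at a copy of the file when testing away from the affected machine.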
[08:21] <maxb> In theory this is a fairly average ubuntu workstation install running raring :-/
[08:21] <maxb> Perhaps I should boot a live USB and see if the problem persists
[08:22] <apw> maxb, well now we need to try and ascertain if this has always been this way or has regressed, so i would probably grab a live CD and see if that is affected, and then if so, grab a quantal kernel and boot that against the raring user space
[08:23] <apw> maxb, is it possible to test without the MASQ rule you mentioned too, as there was a case in 2.6.11 where loading rules there broke this
[08:24] <apw> maxb, finally can you tell me which host you are having the issue with server side (privately is fine) so perhaps i can test here and see with my raring system
[08:24] <maxb> I deleted that rule, no change. But I'll try without it from a clean reboot too
[08:24] <maxb> The problem host is on the other side of a private IPsec tunnel
[08:24] <apw> yeah it is 'having ipt_MASQUERADE loaded' which was the trigger, though it should be fixed in theory
[08:25] <apw> maxb, fair enough not going to be doing that then
[08:28] <apw> maxb, a quick survey of places i visit often i do not get any icmp-fragneeded packets, sigh
[08:30] <smb> morning
[08:30] <apw> smb, moin
[08:30] <smb> apw, insomnia?
[08:30] <apw> smb, sunny day and no curtains
[08:31] <smb> ah
[08:31] <smb> unexpected but seems there is something bright outside here too
[08:31]  * smb has curtains, though
[08:39] <maxb> Well, not loading the MASQUERADE rule doesn't seem to have changed matters
[08:45] <maxb> Hmm, but quantal kernel raring userspace works
[08:46] <apw> maxb, ok that implies a regression in v3.8
[08:46] <apw> maxb, so ... next we would normally ask you to try the v3.8 mainline, v3.7 mainline and v3.6 mainline kernels
[08:46] <apw> maxb, https://wiki.ubuntu.com/Kernel/MainlineBuilds
[08:47] <maxb> will do
[08:47] <apw> maxb, and ... get a bug filed and let jsalisbury know so he can help us get the bisect done
[08:49] <apw> (and let jo and me know the bug number, of course)
[08:53] <maxb> I'll file a bug later today once I have some kernel-ppa tests done
[09:10] <apw> maxb, ack
[09:26] <brendand> henrix, do you know why the certification-testing task is marked Invalid in the lucid tracking bug? https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1158939
[09:26] <ubot2> Launchpad bug 1158939 in kernel-sru-workflow/verification-testing "linux: 2.6.32-46.107 -proposed tracker" [Medium,In progress]
[09:27] <henrix> brendand: hmm... i'm not aware of any specific reason, so most likely a bot bug
[09:28] <henrix> brendand: you can go ahead and just change the state to 'New' i believe
[09:28] <henrix> brendand: i'll ping bjf later about that
[10:11] <ppisati> brb
[11:16] <maxb> My pmtud-related bisection has established the interval of v3.5.7.8-quantal .. v3.6-rc1-quantal :-/
[11:22] <apw> maxb, ok ... what is v3.5-foo like, as that is on the same mainline
[11:23] <maxb> Oh, as in determine whether a fix landed during 3.5.x ?
[11:33] <apw> maxb, and if not it is easier to bisect v3.5->v3.6-rc1 than from .8
[11:34] <maxb> Just doing a quick side trip into 3.9-rc4 to see if anything changes there, then I'll try out 3.5
[11:50]  * ppisati rushes out to get some food before the conf call
[12:07] <maxb> A colleague has just observed that 3.6 saw the removal of the IPv4 routing cache
[12:29] <maxb> Which would kind of be a good reason for this to have broken
[12:29] <maxb> Except, I've also discovered an additional wrinkle.
[12:31] <maxb> I'm connecting to several different sites via the same IPsec gateway. And some behave differently to others
[12:32] <maxb> Accessing some, my local machine just magically decides to operate an IP MTU of 1420, and I can't see any evidence why
[12:33] <rtg_> ppisati, I assume you want those 2 patches mentioned in your response to robher applied ?
[12:34] <rtg_> if so, please submit them on the public k-t list.
[12:42] <rtg_> apw, the kbuild test robot email re: 'lib/dynamic_debug.c:1059:6: warning: passing argument 7 of 'parse_args' from incompatible pointer type' looks legitimately broken. can you have a look ?
[12:44] <apw> rtg_, sure
[12:44] <apw> maxb, it is not making your life easy is it
[12:45] <maxb> I've just figured out that the ones where it works do so because a router in the remote site is doing MSS clamping
[12:47] <maxb> So I think as far as Ubuntu in general is concerned, the question is "did 3.6 break pmtud?"
[12:57] <apw> maxb, yep
[12:59] <maxb> This commit message is quite scary - https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=3c4cfadef6a1665d9cd02a543782d03d3e6740c6
[13:25] <apw> heh yep, it is :)
[13:29] <apw> rtg_, ok ... fixed and pushed
[13:30] <rtg_> apw, just going through the rest of them to see which are legit
[13:31] <apw> rtg_, great, any you want me to poke send 'em over
[13:31] <rtg_> ack
[14:25] <maswan> henrix: anything more we should do for 1111416 or is it all in hand now?
[14:26] <henrix> maswan: nope, everything's good. the next kernel to go to updates will now have CONFIG_NFS_V4_1 ;)
[14:27] <ppisati> rtg_: i'm building another kernel to grab a stack trace, then i'll send the email
[14:27] <rtg_> ppisati, ack
[14:29] <apw> ppisati, your "MUSB annotation can be dropped" comment, does that apply to CONFIG_USB_MUSB_OMAP2PLUS, CONFIG_TWL4030_USB and CONFIG_TWL6030_USB?
[14:30] <apw> rtg_, did i see you say you had the 8250_DMA thing in hand ?
[14:30] <maswan> henrix: excellent
[14:30] <rtg_> apw, yes, I think so
[14:30] <ppisati> apw: no, we need TWL4030 for booting
[14:30] <ppisati> apw: does it depend on MUSB?
[14:30] <henrix> maswan: enjoy ;)
[14:46] <apw> ppisati, those are both M in armhf-generic but marked as 'y' in our annotations
[14:46] <apw> ppisati, i suspect from your booting comment 'y' is correct
[14:47] <ppisati> apw: that's what i recalled, but if you tell me they are 'm' now (and everything works) we should just keep them as is
[14:47] <ppisati> apw: but let me check
[14:47] <apw> ppisati, ack
[14:47] <apw> (to letting you check)
[14:49] <maxb> So I *think* I've figured out what's going on now, it looks like in the refactorings in 3.6, support for fragmentation needed packets which don't supply next hop mtu information (i.e. set the field to zero) got dropped
[14:49] <maxb> Off to file an actual bug at this point
[14:50] <apw> maxb, great
[15:27] <maxb> The bug that I have been talking about for most of today is now bug 1160966
[15:27] <ubot2> Launchpad bug 1160966 in linux (Ubuntu) "PMTU discovery no longer works in Linux 3.6+ with routers that do not send next hop MTU information" [Undecided,New] https://launchpad.net/bugs/1160966
[15:39] <jsalisbury> rtg_, apw should we be building the ddeb packages for Precise?  bug 1160674
[15:39] <ubot2> Launchpad bug 1160674 in linux (Ubuntu) "ddeb package missing for 3.2.0-31-generic kernel (and 3.2.0-30 too)" [Medium,Confirmed] https://launchpad.net/bugs/1160674
[15:40] <rtg_> jsalisbury, they prolly _are_ getting built, but perhaps they aren't getting copied. bjf ?
[15:40] <ppisati> apw: ok so, booting from mmc doesn't work
[15:40] <ppisati> apw: let me check if making them =y fixes it
[15:40] <ppisati> apw: i recall it was mmc related
[15:48] <apw> maxb, if i am reading this commit correctly the issue is that there is a 6 year old router which does not support pmtu correctly in the channel?
[15:48] <rtg_> jjohansen, 'UBUNTU: SAUCE: apparmor: Add the ability to mediate mount' is broken in raring. the prototype for struct security_operations.sb_mount has changed. apparmor_sb_mount() needs to be changed accordingly. I wonder how this even works.
[15:49] <rtg_> jjohansen, oh , never mind. I was looking at the wrong function.
[15:50] <apw> rtg_, we may only use that support for lxc ?
[15:50] <rtg_> apw, its just the addition of 'const' to a couple of the parameters.
[15:54] <maxb> apw: It supports PMTU, it just uses the original RFC792 definition of what an ICMP fragmentation needed packet should look like
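The difference maxb describes comes down to one 16-bit field in the ICMP header. The following is a minimal sketch, assuming the layout from RFC 1191 (type 3, code 4, with the next-hop MTU in the last two bytes of the 8-byte ICMP header); a router following the original RFC 792 layout leaves those bytes as zero. The function name next_hop_mtu is hypothetical, and the checksum is deliberately not verified here.

```python
import struct
from typing import Optional

ICMP_DEST_UNREACH = 3  # ICMP type: destination unreachable
ICMP_FRAG_NEEDED = 4   # ICMP code: fragmentation needed and DF set


def next_hop_mtu(icmp_message: bytes) -> Optional[int]:
    """Return the advertised next-hop MTU, or None if the router sent none.

    Raises ValueError when the message is not a fragmentation-needed error.
    """
    if len(icmp_message) < 8:
        raise ValueError("ICMP header is 8 bytes; message too short")
    # type, code, checksum, unused, next-hop MTU (all network byte order)
    icmp_type, code, _checksum, _unused, mtu = struct.unpack(
        "!BBHHH", icmp_message[:8]
    )
    if (icmp_type, code) != (ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED):
        raise ValueError("not a fragmentation-needed message")
    # A zero field is the RFC 792 style case: no MTU hint was supplied.
    return mtu or None
```

A caller that gets None back has to guess a smaller MTU on its own, which is the case under discussion in this log.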
[16:08] <apw> maxb, i assume this means you can work round it by mss clamping at the source end
[16:08] <apw> maxb, ie at your linux box
[16:11] <apw> maxb, 792 isn't exactly helpful in defining the PMTU form clearly is it
[16:20] <apw> maxb, ok this is better described in rfc1191 which says
[16:20] <apw> "Hosts MUST be able to deal with Datagram Too Big messages that do not
[16:20] <apw>    include the next-hop MTU, since it is not feasible to upgrade all the
[16:20] <apw>    routers in the Internet in any finite time. "
[16:20] <apw> maxb, so you might want to add that to the two bugs, indicating we stopped being compliant there
[16:24] <apw> maxb, but i would not be too hopeful of upstream ever putting this back, they seem to think your router is too old to care about. is there a reason it is not upgraded to a later version of openbsd? there seems to be a bunch of later versions
[16:24] <apw> maxb, either way i would be interested if a simple mss clamp would sort you out
[16:25] <maxb> Reason for not upgrading is merely round tuits.
[16:25] <maxb> A MSS clamp should work.
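The arithmetic behind an MSS clamp is small enough to sketch. This is an illustration only, assuming IPv4 and TCP headers without options (20 bytes each) and using the 1420-byte MTU mentioned earlier in the log as an example value; the function name clamped_mss is hypothetical.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options


def clamped_mss(path_mtu: int, advertised_mss: int) -> int:
    """Return the MSS to advertise so that segments fit within path_mtu."""
    limit = path_mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES
    if limit <= 0:
        raise ValueError("path MTU too small to carry any TCP payload")
    # Clamping only ever lowers the MSS; it never raises it.
    return min(advertised_mss, limit)


# An Ethernet host would normally advertise 1460; a 1420-byte path
# leaves room for 1380 bytes of TCP payload per segment.
print(clamped_mss(1420, 1460))  # prints 1380
```

Because the clamp keeps TCP segments small from the start, the sender never depends on an ICMP fragmentation needed reply for those connections. It does nothing for non-TCP traffic.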
[16:26] <apw> lets see what they say on your bug, but i am expecting ... "heee, that would hurt" or something helpful
[16:26] <apw> jsalisbury, i don't think reverting that patch will fly on its own, i am expecting you will find it is part of a larger series you'll never unpick
[16:29] <jsalisbury> apw, ack.  
[16:36] <apw> maxb, that said it is not clear we could not just pass the 0 down and handle it as a 'mtu -= 16' or something until it works
[16:37] <maxb> The prior behaviour was to pick from a descending list of common MTU sizes
[16:38] <apw> well ... based on some random info in the packet
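A fallback of the kind maxb describes can be sketched as follows. This is not the kernel's code and not apw's test patch; it is a minimal illustration that assumes the plateau values listed in RFC 1191 (Table 7-1), and the function name next_lower_plateau is hypothetical.

```python
# Plateau values as listed in RFC 1191, Table 7-1, in descending order.
RFC1191_PLATEAUS = (65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68)


def next_lower_plateau(failed_size: int) -> int:
    """Return the largest plateau strictly below the size that was rejected.

    Used when a fragmentation needed message carries no next-hop MTU.
    The result never drops below 68, the minimum IPv4 MTU.
    """
    for plateau in RFC1191_PLATEAUS:
        if plateau < failed_size:
            return plateau
    return RFC1191_PLATEAUS[-1]


# A 1500-byte packet rejected without an MTU hint steps down to 1492,
# and a further rejection at 1492 steps down to 1006.
print(next_lower_plateau(1500))  # prints 1492
print(next_lower_plateau(1492))  # prints 1006
```

Each rejected attempt moves one step down the table, so the sender converges on a working size after a handful of retries instead of shrinking by a fixed small amount each time.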
[16:38] <rtg_> ppisati, I pushed your 2 highbank patches on raring master-next. please check that they are correct.
[16:38] <apw> the issue is that changing the mtu there is not allowed
[16:39] <apw> but ... it must be changed lower down
[16:43] <ppisati> apw: ok, i don't have any mmc-only installation anymore
[16:43] <ppisati> apw: drop the annotation and leave these as modules
[16:49] <apw> ppisati, thanks
[16:49] <apw> maxb, this is a raring box yes ?
[16:49] <maxb> yes, that is right
[16:49] <ppisati> rtg_: i think you lost part of robher cfg
[16:50] <ppisati> rtg_: let me do that
[16:50] <rtg_> ppisati, ack
[16:57] <apw> maxb, ok ..  i have had a go at just reinstating the most basic 'step down until it works'
[16:58] <apw> maxb, down in the bit where we normally update mtu anyhow, i'll get you a kernel to test
[17:04] <rtg_> ppisati, I'm off to grab a bite. ping me when you have those config options done. I need to upload this stuff today.
[17:04]  * rtg_ -> lunch
[17:04] <ppisati> rtg_: yep, i'm building another kernel
[17:29] <ppisati> rtg_: houston, we have a problem
[17:29] <ppisati> rtg_: i'm checking if it's our stuff or robher but
[17:29] <ppisati> rtg_: http://paste.ubuntu.com/5652876/
[17:29] <ppisati> rtg_: on highbank
[17:32] <robher> ppisati: you need the cpuidle disable by default patch.
[17:35] <ppisati> robher: was it part of your pull?
[17:38] <ppisati> robher: ok, saw what's missing
[17:38] <dobey> anyone around knowledgeable enough about intel/ivybridge to look at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1021924 perhaps?
[17:38] <ubot2> Launchpad bug 1021924 in linux "Multiple Displays not working on Core i7 3770S + Intel DQ77MK motherboard" [Medium,Confirmed]
[17:57] <apw> dobey, did you try any of the 3.9-rc kernels as yet ?
[18:07] <ppisati> rtg_: ok, all sent
[18:11]  * apw calls it a day
[18:13] <rtg_> ppisati, both batches of patches ?
[18:19] <rtg_> ppisati, pushed
[18:20] <ppisati> rtg_: k
[18:20]  * ppisati -> dinner
[18:35] <fragmede> Hi all; I'm not seeing a tag for Ubuntu-3.8.0-14.24 in git://kernel.ubuntu.com/ubuntu/ubuntu-raring.git.
[18:35] <rtg_> fragmede, oops, just re-pushed
[18:36] <fragmede> Great, thanks!
[18:44] <rtg_> ogasawara, ok, looks like I'll be able to upload raring pretty quick. I was beginning to wonder if this highbank stuff was gonna come together in time for the Beta freeze.
[18:44] <ogasawara> rtg_: ack
[18:46] <dobey> apw: i haven't. given the lack of comments on the upstream bug though, i doubt it will fix it if i do try one
[19:32]  * rtg_ -> EOD
[19:53] <FUF> Greetings.. I was just wondering how the linux-image-virtual kernel packages differ from the -generic images - what advantages do they offer for my VMs compared to generic?
[19:55] <FUF> should the differences I see when I diff their /boot/config* be enough to answer my question?
[19:57] <dobey> probably
[19:58]  * ogasawara lunch
[23:59] <apw> dobey, i'll get you some test kernels with this patch for tomorrow, and put a pointer in the bug