[10:53] Good morning all, I would like to apply a patch to the kernel package and would like some guidance, could someone please advise.
[10:55] to provide a bit more information I am looking to patch the kernel with grsecurity
[10:56] ok, ask ay specific questions you have and we'll try and answer
[10:56] *any
[10:57] How do i apply a patch to the kernel source package
[10:58] leon_pegg, to be honest if applying a patch is a new thing you are going to find applying a monster like grsecurity a huge effort
[10:59] leon_pegg, but ... i would start by checking out our git repository for the release to which you intend to apply it
[11:00] apw: I have applied patches to raw source before but never to packages, and have built custom kernels before, just not applied patches to deb packages
[11:00] apw: what's the git repo address?
[11:01] leon_pegg, was struggling to find it in our wiki *sigh* ... https://wiki.ubuntu.com/Kernel/SourceCode see the git bit there
[11:01] leon_pegg, i would then apply the patch in git and build the source package from that
[11:01] apw: thanks
[11:02] apw: fingers crossed it all goes well :D
[11:04] leon_pegg, also, didn't grsecurity stop producing patches in public anyhow ?
[11:05] apw: they still provide the testing patches to keep supporting arch-hardened
[11:06] leon_pegg, against what kernel versions though
[11:07] apw: kernel versions 3.1-4.3.4
[11:13] so you might have some luck on 4.2 in wily then, but xenial is too new
[11:19] apw: using wily on the desktops in the office so we should be good, we are still in discussions about applying grsec on the servers
[11:19] leon_pegg, that is a big commitment for sure, as you'll need to keep up to date with security release cadence
[11:24] apw: exactly it is a huge commitment, and keeping up with security updates adds additional work.
the plan at the moment is to trial on some of the desktops and progress from there after
[13:27] Hi, is there an escalation procedure we could follow to get some more attention on https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1398465 ?
[13:27] Launchpad bug 1398465 in linux (Ubuntu) "Memory allocation failure, presumably in FUSE" [Medium,Invalid]
[13:27] Oops wrong bug sorry
[13:28] Well actually same bug, but https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1505948 is the version we reported :-)
[13:28] Launchpad bug 1505948 in linux (Ubuntu Wily) "Memory allocation failure crashes kernel hard, presumably related to FUSE" [High,Confirmed]
[13:28] There is an out of bounds write somewhere in fuse_direct_io on 4.1 and higher that can panic the kernel (and reliably does, on our virtualization hosts)
[13:29] We're working around it by staying on 3.19, but that is coming uncomfortably close to being out of support we think
[13:29] MaikZ, that doesn't look like a memory allocation failure, more like a memory arena corruption, a double free or something
[13:29] apw: Yes
[13:29] We booted a test machine with slub_debug
[13:29] And it confirms that something in fuse_direct_io wrote zero bytes into the "poison" areas
[13:30] Without slub_debug, that makes the kernel panic on some future allocation or deallocation.
[13:30] is that info in the bug (save me reading it)
[13:31] MaikZ, also are you able to test the 4.4 kernel in xenial-proposed easily for this ?
[13:31] If there's a .deb that installs on trusty, we should be able to do that fairly quickly yes
[13:32] Not yet, our storage vendor was supposed to add it but apparently didn't. I'll attach it.
[13:32] MaikZ, well in principle those are installable there with lick
[13:32] luck
[13:33] MaikZ, and this won't affect "regular" consumers of ubuntu i assume, as you need to use fuse 3.0 to get async io right ?
[13:35] Not sure about the FUSE versioning, but I can confirm that the specific FUSE file system we use ships with its own, very recent libfuse.
[13:35] and presumably you do not see this without
[13:36] We have been able to reproduce the kernel crash using ntfs-3g because fuse-devel was unhappy about a closed-source userspace part
[13:36] But that was also a fresh build
[13:44] So for testing 4.4, we could try grabbing .debs from this build: https://launchpad.net/ubuntu/+source/linux/4.4.0-1.15/+build/8880824
[13:44] And they should sort of work on trusty because kernel packaging hasn't fundamentally changed?
[13:47] MaikZ, that would be my expectation, do get -extra if you need it
[13:59] Okay, one machine cleared of production workloads and 4.4 installed, let's see what happens.
[13:59] MaikZ, i think it will break just the same, but that's a good data point, i have found a suspicious looking async path
[14:01] g'day
[14:07] Madkiss, ?
[14:09] apw: MaikZ told me there's an interesting conversation going on here about a bug we reported several months ago, so I just thought i'd join and read :)
[14:10] MaikZ, assuming that machine also goes pop, i might have a guess as to the cause of this, if so then i might have a test kernel to try (soon)
[14:11] MaikZ, if so, 1) would you be able to test it, and 2) i would need to confirm it's not leaking instead of blowing up, so it'd need some monitoring
[14:12] We can install a custom build. What monitoring do you have in mind?
[14:12] There is sadly no metrics collection in that cluster.
[14:12] So I can't draw you pretty memory usage graphs.
[14:18] MaikZ, well if it is the right fix then the machine will not blow up and things will be fine, if it is not then the machine may work fine and leak 96 byte memory blocks for every async IO and lose memory over time
[14:18] MaikZ, so I guess I am saying we need to monitor that the slabs are not growing out of control in this case
[14:19] MaikZ, if your screenshot is accurate then like: while :; do grep kmalloc-96 /proc/slabinfo; sleep 5; done
[14:20] would be something to watch is similar to other machines
[14:21] MaikZ, i assume if i am wrong then it will literally lose an entry for every single IO, so it should be obvious and spectacular
[14:21] Spectacular failure is our core business!
[14:33] I'm so far failing to reproduce the previous issue (slub_debug is silent, also no panic), but instead I/O has totally frozen on the test VM.
[14:33] With the 4.4 kernel.
[14:34] Or looks like it...some ops are getting through according to storage metrics.
[14:42] any thoughts on disabling mtrr for xenial? https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1163587
[14:42] Launchpad bug 1163587 in linux (Ubuntu) "mtrr_cleanup: can not find optimal value, perhaps no longer needed?" [Medium,Triaged]
[15:33] MaikZ, ok, after some further reading it doesn't look like that suspicious thing is suspicious
[15:56] I will let the 4.4 tests run overnight, the kernel might panic after a couple of hours. It usually happens within minutes but not always.
[16:02] MaikZ, sounds good thanks
[16:02] MaikZ, put any result info in the bug and let us know too here
[16:04] Sure.
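
Editor's sketch, not part of the log: the slab check suggested at 14:19 (grepping kmalloc-96 in /proc/slabinfo every 5 seconds) could be extended to print the change in active objects between samples, which may make a leak easier to spot on a cluster without metrics collection. The function names slab_active_objs, slab_delta and watch_slab are hypothetical. The sketch assumes the usual slabinfo column order (name, active_objs, num_objs, objsize, ...) and that the reader runs it as root, since /proc/slabinfo is normally readable by root only.

```shell
#!/bin/sh
# Print the active object count (second column) for one slab cache,
# reading /proc/slabinfo-style text on stdin.
slab_active_objs() {
    awk -v cache="$1" '$1 == cache { print $2; exit }'
}

# Print "cache old new delta" for two samples of the active object count.
# Missing samples default to 0 (an assumption made to keep the arithmetic safe).
slab_delta() {
    cache=$1
    old=${2:-0}
    new=${3:-0}
    echo "$cache $old $new $((new - old))"
}

# Hypothetical monitoring loop: sample the cache every INTERVAL seconds
# and print a timestamped delta line until interrupted.
watch_slab() {
    cache=$1
    interval=${2:-5}
    prev=$(slab_active_objs "$cache" < /proc/slabinfo)
    while sleep "$interval"; do
        cur=$(slab_active_objs "$cache" < /proc/slabinfo)
        echo "$(date '+%H:%M:%S') $(slab_delta "$cache" "$prev" "$cur")"
        prev=$cur
    done
}
```

After sourcing the file, something like `watch_slab kmalloc-96 5` would mirror the interval used in the log. A delta that stays positive sample after sample under async I/O load would be consistent with the leak scenario described at 14:18, though comparing against an unaffected machine is still needed to judge what counts as normal.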