[00:58] <fallacy> Question.  The nouveau driver in Ubuntu 14.04 kernel (3.11.0-20) doesn't work on my system but works fine in the upstream kernel from a later version (3.14.1-031401).  Should I report that as a bug for Ubuntu 14.04?
[00:59] <fallacy> oops, wrong info.  It's broken in 3.13.0-24 (which is Ubuntu 14.04).  Works in 3.11.0-20 (from Ubuntu 13.10) and in upstream Ubuntu build 3.14.1-031401
[01:03] <fallacy> Having those upstream packages is really useful for debugging
[07:15] <niluje> I run nbd-client from initramfs (ubuntu 14.04) to create the device /dev/nbd0, on which the rootfs is mounted. When I shut down or reboot, I get a kernel panic because some I/O is still being done and the device is no longer accessible (http://pastebin.com/P4qWXUFC). I guess nbd-client is killed too early by init. Using systemd, the solution is described here
[07:15] <niluje> http://www.freedesktop.org/wiki/Software/systemd/RootStorageDaemons/, and renaming nbd-client to @nbd-client (or specifying the option -m in nbd-client 3.8 (https://github.com/yoe/nbd/blob/master/nbd-client.c#L532)) would probably solve the problem. Any idea how I should solve the issue?
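[Editor's note] A minimal sketch of the RootStorageDaemons convention niluje links to: systemd spares processes whose argv[0] starts with '@' from the final kill at shutdown. bash's `exec -a NAME` sets argv[0]; here `sleep` stands in for nbd-client (the name and timings are illustrative only):

```shell
#!/bin/bash
# Demonstrate setting argv[0] to "@nbd-client", the marker systemd's
# RootStorageDaemons convention uses to spare a storage daemon at
# shutdown. `sleep` is a stand-in for the real nbd-client.
bash -c 'exec -a @nbd-client sleep 60' &
pid=$!
sleep 0.2   # give the child a moment to exec
# argv[0] is the first NUL-terminated field of /proc/PID/cmdline
argv0=$(tr '\0' '\n' < "/proc/$pid/cmdline" | head -n1)
echo "$argv0"   # @nbd-client
kill "$pid" 2>/dev/null
```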
[07:18] <infinity> niluje: See /run/sendsigs.omit.d
[07:18] <infinity> niluje: Any PIDs in there will be bypassed by the "kill the world" bit on shutdown.
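[Editor's note] A sketch of infinity's suggestion, assuming Ubuntu's sysvinit layout: /etc/init.d/sendsigs reads every file under /run/sendsigs.omit.d/ and spares those PIDs when it kills remaining processes at shutdown. The helper name and the initramfs hook shown here are illustrative, not part of any shipped script:

```shell
#!/bin/sh
# Spare a daemon from the shutdown "kill the world" sweep by listing
# its PID in /run/sendsigs.omit.d/ (path overridable for testing).
OMIT_DIR=${OMIT_DIR:-/run/sendsigs.omit.d}

write_omit_file() {
    # $1 = name for the omit file, $2 = PID to spare
    mkdir -p "$OMIT_DIR"
    echo "$2" > "$OMIT_DIR/$1"
}

# e.g. from an initramfs script, right after starting nbd-client:
pid=$(pidof nbd-client 2>/dev/null || true)
if [ -n "$pid" ]; then
    write_omit_file nbd-client "$pid"
fi
```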
[07:19] <niluje> infinity: I'm taking a look
[07:26] <niluje> infinity: thanks a *LOT*
[07:26] <niluje> it works fine, no more kernel panic
[07:26] <niluje> however, when is the process killed?
[07:27] <niluje> the connection doesn't seem to be cleanly closed, I guess the nbd-client didn't send a shutdown(2) to the nbd-server
[07:35] <infinity> niluje: So, I'm not sure how best you'd mangle that.  S90{halt,reboot} in rc[06].d will fail if the root disappears.
[07:36] <infinity> niluje: S60umountroot should at least be making sure the disappearing client doesn't cause data loss.  Assuming nbd has a concept of read-only mounts.
[07:42] <niluje> infinity: I'm looking into /etc/rc6.d/S60umountroot and I don't really get what I could do here. Could you tell me a bit more?
[07:43] <infinity> niluje: Well, nothing there, really.  Just pointing out that that should be remounting your root readonly (if nbd supports such a thing).
[07:43] <infinity> niluje: There's a fundamental chicken/egg issue where we can *never* actually umount root completely, or the reboot/halt would fail, cause you'd no longer have binaries to run.
[07:44] <niluje> hm right
[07:45] <niluje> infinity: I tried a mount -t ext4 -o remount,ro /dev/nbd0 / yesterday and it works well, the file system is readonly, can't write anything on it
[07:46] <infinity> niluje: Right, and shutdown should be doing that for you, it's just never going to exit the nbd client.
[07:48] <infinity> niluje: There are certainly ways we could do this more cleverly, given a bunch of re-engineering (like copying everything we need to run shutdown(8) into a tmpfs, umounting /, and then rebooting), but that's a fair bit of fiddling to get right and has never really been an issue worth caring about, since settling a read-only FS should guarantee that an unclean shutdown doesn't really matter.
[07:50] <niluje> infinity: there's something I don't understand
[07:50] <niluje> You made me understand why the rootfs can't be unmounted
[07:51] <niluje> When I put the nbd-client PID in /run/sendsigs.omit.d, the nbd-client process will not be killed upon reboot
[07:51] <niluje> in /etc/rc6.d/S60umountroot, I see a [...] mount    $MOUNT_FORCE_OPT -n -o remount,ro -t dummytype dummydev [...]
[07:52] <niluje> which, if I understand correctly, will flush what needs to be flushed and assure me the filesystem will be "clean" once the reboot occurs (and the fs will be remounted rw)
[07:52] <niluje> am I wrong somewhere?
[07:54] <infinity> Remounted ro, you mean.
[07:55] <niluje> infinity: remounted rw
[07:55] <infinity> niluje: The whole point of that line is to remount ro.
[07:55] <niluje> I mean, the fs is first remounted ro, then when the system restarts, it is remounted rw
[07:55] <infinity> niluje: Oh, sure.  Yes.
[07:55] <infinity> niluje: rw when it comes back, obviously.
[07:55] <niluje> ok
[07:56] <niluje> and the "ro" part is to flush pending writes, right?
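[Editor's note] Yes: remounting read-only forces dirty pages and journal data to disk, so the fs stays consistent even if the machine then goes away without a final umount. A root-only fragment (not runnable as-is; the device name is taken from the discussion above):

```shell
# Requires root. Remount the root fs read-only, flushing pending writes.
mount -o remount,ro /dev/nbd0 /
# Verify: 'ro' should now appear in the mount options for /
awk '$2 == "/" {print $4}' /proc/mounts
```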
[08:01] <niluje> infinity: what I don't understand is, if the process isn't killed, why can't I reconnect on nbd-server upon reboot?
[08:01] <infinity> That, I don't know.  I've never used nbd.
[08:02] <niluje> thinking about it
[08:02] <niluje> it seems pretty normal, I guess the network is shut down too
[08:07] <niluje> I just added some logging in the initramfs to run ps just before launching nbd-client. What I do is: start the machine, put nbd-client PID in /run/sendsigs.omit.d/, reboot. Then, if the process hasn't been killed, shouldn't I see it listed? (it is not)
[08:09] <infinity> niluje: How would the process be listed after a reboot?
[08:15] <niluje> infinity: I'm lost :-(
[08:16] <niluje> I don't understand when the process is killed, when it's listed in /run/sendsigs.omit.d
[08:25] <infinity> niluje: It's not.  The machine reboots.
[08:25] <infinity> niluje: Which, of course, kills everything.  But not with a signal.  It just dies.  Cause the machine is dead. :P
[08:28] <niluje> infinity: ok
[08:28] <niluje> :p
[08:28] <niluje> so I don't understand why you can't unmount the rootfs
[08:30] <jk-> heya ikepanhc, do you know if Alex is around?
[08:30] <ikepanhc> jk-: hi Jeremy :D
[08:30] <ikepanhc> jk-: give me a sec
[08:30] <jk-> ikepanhc: thanks! :)
[08:31] <alexhung> jk-, ping
[08:32] <jk-> hi alexhung, just had a few fwts questions; is there a better channel to use?
[08:32] <infinity> niluje: Because if you unmount it, how can you call "halt" or "reboot"?
[08:32] <infinity> niluje: Or the shell script that is S90reboot...
[08:33] <niluje> hm right :p
[08:35] <niluje> infinity: is there anywhere a resource explaining step by step what happens during a reboot?
[08:35] <infinity> niluje: I doubt it.  It's mostly documented in code.
[08:35] <niluje> ok
[08:36] <niluje> wouldn't it be possible to mount a tmpfs in which I'd get something executed that'd kill the process at the very last moment?
[08:37] <infinity> niluje: I hinted above at that being the very complicated solution to trying to not need root to reboot.
[08:37] <infinity> niluje: But it's overkill for literally every use case I've ever run into.
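[Editor's note] A rough, root-only sketch of the "very complicated solution" infinity describes: stage a minimal userland in a tmpfs, pivot into it, release the real root, then reboot. Every path here is illustrative (a static busybox at /bin/busybox is assumed); real implementations are considerably more careful:

```shell
#!/bin/sh
# Illustrative only -- never run on a live system.
mkdir -p /run/newroot
mount -t tmpfs tmpfs /run/newroot
cp -a /bin/busybox /run/newroot/     # one static binary is enough
mkdir /run/newroot/oldroot
cd /run/newroot
# Swap the root mounts, then finish up from inside the tmpfs so the
# real root can be released before rebooting.
pivot_root . oldroot
exec chroot . /busybox sh -c '/busybox umount /oldroot; /busybox reboot -f'
```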
[08:37] <infinity> niluje: If nbd-server is exploding when a client disappears, that's a bug, IMO, and should be fixed, not worked around.
[08:40] <infinity> niluje: The maintainer of nbd (wouter) happens to be a very nice fellow.  You might ask him about your issues.
[08:40] <infinity> niluje: I will continue to maintain the opinion that if the server can't handle a client disappearing on reboot and trying to reconnect later with a new instance, it's fundamentally broken. :P
[08:41] <apw> yeah what happens when you lose power or the network for reasons outside your control, you end up in the same mess
[08:41] <niluje> yup
[08:42] <niluje> infinity: I'm going to dig a bit more before contacting the guys behind nbd.
[08:42] <niluje> thanks a lot for your time
[08:42] <infinity> niluje: Sure.  He happens to be a decent dude with access to lovely chocolate.
[08:42] <infinity> He tried to kill me with said chocolate last time I saw him.
[08:42] <infinity> Evil Belgians.
[08:43] <niluje> :D
[08:43] <smb> That's killing someone softly... :)
[08:43] <niluje> just wondering, are you an Ubuntu committer, a kernel committer, or something?
[08:43] <infinity> smb: That's wouter's style.
[08:44] <infinity> niluje: Ubuntu and Debian developer, and random hacker on several upstream projects.  Also, just a guy who can't sleep (but I'm going to go try).
[08:44] <niluje> aren't you in london? at 9am?!
[08:44] <infinity> niluje: I'm in Calgary.  It's 2:44am.
[08:44] <apw> heh ... very few people are in london
[08:45] <smb> apw, Beside all the inhabitants
[08:45] <infinity> apw: Hey, to be fair, I'm in London more than some people who claim to live there.
[08:45] <apw> true, but as you aren't, the irc contingent is severely reduced
[08:45] <niluje> ok :-) thanks a lot again
[08:47] <apw> fallacy (not), yes file a bug against it indeed
[08:50] <smb> apw, Bored of repeating that answer you are?
[08:50] <smb> :)
[08:51] <apw> heh
[08:52] <mlankhorst> ;D
[08:57] <smb> mlankhorst, So about my whining part #1. bug 1298517 has screenshots about it. ;)
[08:57] <ubot2> Launchpad bug 1298517 in mesa (Ubuntu) "Rendering issues in unity with xserver-modeset" [High,Triaged] https://launchpad.net/bugs/1298517
[08:58] <mlankhorst> modesetting hates you :P
[09:00] <smb> Well, it's very nearly working. It's just that unity developers obviously don't test in VMs and if they do, they don't use terminals. ;-P
[09:01] <mlankhorst> llvmpipe bugs have a low chance of getting fixed, though..
[09:02] <smb> Unless "he" stumbles over them trying to present an Ubuntu desktop running in THE CLOUD™
[09:03] <amitk> :)
[09:04] <smb> amitk, Oh look who is there. Hello stranger. ;)
[09:05] <amitk> smb: that should go on the quotes page, too bad I don't have access anymore ;)
[09:05] <smb> amitk, Not again. :-P
[09:08] <amitk> smb: for the record, I'm always lurking around this channel..
[09:09] <amitk> still use ubuntu core, none of the desktop stuff though
[09:09] <smb> amitk, Yeah, and I could tell if I cared to look at the list of people. But I consider you under the towel unless you say something. :)
[10:43] <apw> smb, i sometimes wonder why i even have that thing enabled (the list)
[10:45] <smb> apw, You got a point there
[10:46]  * apw idly wonders if it can be turned off in his interface .... hmm
[14:09] <apw> oh i hate that, spent 3 hours debugging a test suite failure "your kernel is broken" to ... of course ... find the test is broken
[14:16] <mlankhorst> solution: spend less time on testing. ;-)
[14:21] <sconklin> arges, fwiw that network namespace cleanup bug has been repro'd by someone else on 3.13.0-24 (upstream bug). I think we have an easy way to reproduce it, working on setting that up now.
[14:21] <arges> cool
[14:26] <sconklin> also interesting is that it's also been seen on bare metal
[14:32] <diwic> apw, that's what we call 'quality time'
[14:35] <apw> mlankhorst, you are right .... rm -rf tests ... better
[14:36] <stgraber> sconklin: is that the bug where the refcount appears to fail and the kernel thinks there's still something holding eth0 in an otherwise unused ns?
[14:36] <sconklin> stgraber: https://bugzilla.kernel.org/show_bug.cgi?id=65191
[14:36] <ubot2> bugzilla.kernel.org bug 65191 in Netfilter/Iptables "BUG in nf_nat_cleanup_conntrack" [Normal,New]
[14:37] <stgraber> ah, I haven't seen that one in a while :)
[14:38] <sconklin> we see it when we scale up and create a bunch of containers, then scale down and remove all of them
[14:50] <jsalisbury> **
[14:50] <jsalisbury> ** Ubuntu Kernel Team Meeting - Today @ 17:00 UTC - #ubuntu-meeting
[14:50] <jsalisbury> **
[14:55] <sconklin> stgraber: are there any multiple-container scaling tests in the LXC tests?
[15:01] <stgraber> sconklin: we have lxc-test-concurrent that tests multi-threaded LXC but the LXC tests are typically simple busybox so they don't do much
[15:03] <sconklin> ok. Thanks. I'll see whether we have anything that can run outside of our environment to stress this
[15:07] <arges> sconklin: in your testcase you'll also need to do something that is generating conntrack entries (and allowing them to timeout or get cleaned up)
[15:08] <sconklin> yeah, it's not trivial, and I'm not terribly optimistic
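[Editor's note] A hypothetical stress loop along the lines arges suggests (requires root; names and counts are made up, not the reporters' actual test): each namespace gets nat conntrack state, then everything is torn down to exercise the cleanup path from the bug.

```shell
#!/bin/sh
# Scale up: namespaces with nat/conntrack state.
N=100
i=1
while [ "$i" -le "$N" ]; do
    ip netns add "stress$i"
    ip netns exec "stress$i" ip link set lo up
    # loading a nat rule registers conntrack hooks in the namespace
    ip netns exec "stress$i" iptables -t nat -A POSTROUTING -j MASQUERADE
    # a little traffic so conntrack entries actually get created
    ip netns exec "stress$i" ping -c1 -W1 127.0.0.1 >/dev/null 2>&1
    i=$((i + 1))
done
# Scale back down; this is where the cleanup BUG was reported to fire.
i=1
while [ "$i" -le "$N" ]; do
    ip netns del "stress$i"
    i=$((i + 1))
done
```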
[15:22] <BenC> apw: I’m working on fixing powerpc64-emb right now. What’s my timeframe on getting it into the next upload?
[15:23] <bjf> BenC, you now have 3 weeks :-)
[15:24] <BenC> bjf: Not sure if that’s good or bad :)
[15:25] <bjf> BenC, you missed the friday cutoff. i'll be uploading new trusty bits today ... if you are talking utopic, that's a different story
[15:25] <BenC> bjf: utopic
[15:25] <bjf> BenC, ok, then apw and/or rtg can tell you that
[15:25] <BenC> I still keep reading that as micro-topic
[15:39] <apw> BenC, well we want to get an upload out soon, but if you have bits to fix power so we don't have to rip it out, that would be welcome
[15:39] <apw> BenC, how long do you think you'll be
[15:39] <BenC> apw: End of today
[15:42] <apw> as tim is away we should be ok :)
[15:59] <stgraber> sconklin: hey, so I just reproduced the conntrack panic here on metal with current trusty
[15:59] <stgraber> (accidentally, I was doing stress testing for another unrelated bug ;))
[15:59] <arges> stgraber: whoa. what test case did you use?
[16:00] <stgraber> arges: running a -j9 kernel build + one unprivileged container running chrome with a youtube video + another container with 500 sub-containers and around 30 with sub-sub-containers
[16:00] <sconklin> stgraber, arges - I'm looking at commit 0eba801b64cc8284d9024c7ece30415a2b981a72 which is in upstream but not stable, could be related - I have not tested upstream tip yet
[16:01] <stgraber> so I must have been having around 600 netns on that box, each with around 150 ipv6 routing table entries and 20-30 of those with iptables and ip6tables nat entries
[16:03] <arges> I could build a test kernel for stgraber if he can easily reproduce. sconklin is there a launchpad bug for this?
[16:03] <sconklin> arges: no
[16:04] <stgraber> arges: I'm not sure about "easily" yet but I can try again once that kernel is actually done building and I don't mind my work machine panicking on me :)
[16:07] <arges> sconklin: ok file a bug please : ). stgraber when you're ready probably be good to first try http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/ and then we can try patch guessing
[16:13] <sconklin> arges: fwiw the mainline build v3.15-rc2-trusty should have this patch in it
[16:13] <arges> sconklin: yup.
[16:14] <BenC> apw: Successfully linked. Still need to boot test it
[16:17] <apw> BenC, sounds hopeful
[16:36] <sconklin> arges: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1314274
[16:36] <ubot2> Launchpad bug 1314274 in linux (Ubuntu) "BUG in nf_nat_cleanup_conntrack" [Undecided,New]
[16:48] <xnox> excellent usage of buildds =) i like it
[16:54] <jsalisbury> ##
[16:54] <jsalisbury> ## Kernel team meeting in 5 minutes
[16:54] <jsalisbury> ##
[17:00] <jsalisbury> ##
[17:00] <jsalisbury> ## Meeting starting now - #ubuntu-meeting
[17:00] <jsalisbury> ##      agenda: https://wiki.ubuntu.com/KernelTeam/Meeting
[17:00] <jsalisbury> ##
[20:48] <sconklin> arges: that bug also happens with the mainline v3.15-rc2-trusty build, I put that in the bug.
[20:48] <sconklin> that means that the patch we talked about doesn't fix it, although that one may need to be in -stable also
[20:48] <sconklin> kamal: ^^