fallacy | Question. The nouveau driver in Ubuntu 14.04 kernel (3.11.0-20) doesn't work on my system but works fine in the upstream kernel from a later version (3.14.1-031401). Should I report that as a bug for Ubuntu 14.04? | 00:58 |
---|---|---|
fallacy | oops, wrong info. It's broken in 3.13.0-24 (which is Ubuntu 14.04). Works in 3.11.0-20 (from Ubuntu 13.10) and in upstream Ubuntu build 3.14.1-031401 | 00:59 |
fallacy | Having those upstream packages is really useful for debugging | 01:03 |
=== sz0 is now known as sz0` | ||
=== sz0` is now known as sz0 | ||
=== sz0 is now known as sz0` | ||
=== sz0` is now known as sz0 | ||
niluje | I run nbd-client from initramfs (ubuntu 14.04) to create the device /dev/nbd0, on which the rootfs is mounted. When I shut down or reboot, I get a kernel panic because some I/O is attempted while the device is no longer accessible (http://pastebin.com/P4qWXUFC). I guess nbd-client is killed too early by init. Using systemd, the solution is described here | 07:15 |
niluje | http://www.freedesktop.org/wiki/Software/systemd/RootStorageDaemons/, and renaming nbd-client to @nbd-client (or specifying the option -m in nbd-client 3.8, https://github.com/yoe/nbd/blob/master/nbd-client.c#L532) would probably solve the problem. Any idea how I should solve the issue? | 07:15 |
infinity | niluje: See /run/sendsigs.omit.d | 07:18 |
infinity | niluje: Any PIDs in there will be bypassed by the "kill the world" bit on shutdown. | 07:18 |
niluje | infinity: I'm taking a look | 07:19 |
niluje | infinity: thanks a *LOT* | 07:26 |
niluje | it works fine, no more kernel panic | 07:26 |
niluje | however, when is the process killed? | 07:26 |
niluje | the connection doesn't seem to be cleanly closed, I guess the nbd-client didn't send a shutdown(2) to the nbd-server | 07:27 |
infinity | niluje: So, I'm not sure how best you'd mangle that. S90{halt,reboot} in rc[06].d will fail if the root disappears. | 07:35 |
infinity | niluje: S60umountroot should at least be making sure the disappearing client doesn't cause data loss. Assuming nbd has a concept of read-only mounts. | 07:36 |
niluje | infinity: I'm looking into /etc/rc6.d/S60umountroot and I don't really get what I could do here. Could you tell me a bit more? | 07:42 |
infinity | niluje: Well, nothing there, really. Just pointing out that that should be remounting your root readonly (if nbd supports such a thing). | 07:43 |
infinity | niluje: There's a fundamental chicken/egg issue where we can *never* actually umount root completely, or the reboot/halt would fail, cause you'd no longer have binaries to run. | 07:43 |
niluje | hm right | 07:44 |
niluje | infinity: I tried a mount -t ext4 -o remount,ro /dev/nbd0 / yesterday and it works well, the file system is readonly, can't write anything on it | 07:45 |
infinity | niluje: Right, and shutdown should be doing that for you, it's just never going to exit the nbd client. | 07:46 |
infinity | niluje: There are certainly ways we could do this more cleverly, given a bunch of re-engineering (like copying everything we need to run shutdown(8) into a tmpfs, umounting /, and then rebooting), but that's a fair bit of fiddling to get right and has never really been an issue worth caring about, since settling a read-only FS should guarantee that an unclean shutdown doesn't really matter. | 07:48 |
niluje | infinity: there's something I don't understand | 07:50 |
niluje | You made me understand why the rootfs can't be unmounted | 07:50 |
niluje | When I put the nbd-client PID in /run/sendsigs.omit.d, the nbd-client process will not be killed upon reboot | 07:51 |
niluje | in /etc/rc6.d/S60umountroot, I see a [...] mount $MOUNT_FORCE_OPT -n -o remount,ro -t dummytype dummydev [...] | 07:51 |
niluje | which, if I understand correctly, will flush what needs to be flushed and ensure the filesystem is "clean" when the reboot occurs (and the fs will be remounted rw) | 07:52 |
niluje | am I wrong somewhere? | 07:52 |
infinity | Remounted ro, you mean. | 07:54 |
niluje | infinity: remounted rw | 07:55 |
infinity | niluje: The whole point of that line is to remount ro. | 07:55 |
niluje | I mean, the fs is first remounted ro, then when the system restarts, it is remounted rw | 07:55 |
infinity | niluje: Oh, sure. Yes. | 07:55 |
infinity | niluje: rw when it comes back, obviously. | 07:55 |
niluje | ok | 07:55 |
niluje | and the "ro" part is to flush pending writes, right? | 07:56 |
niluje | infinity: what I don't understand is, if the process isn't killed, why can't I reconnect on nbd-server upon reboot? | 08:01 |
infinity | That, I don't know. I've never used nbd. | 08:01 |
niluje | thinking about it | 08:02 |
niluje | it seems pretty normal, I guess the network it shutdown too | 08:02 |
niluje | is* | 08:02 |
niluje | I just added some logging in the initramfs to run ps just before launching nbd-client. What I do is: start the machine, put nbd-client PID in /run/sendsigs.omit.d/, reboot. Then, if the process hasn't been killed, shouldn't I see it listed? (it is not) | 08:07 |
infinity | niluje: How would the process be listed after a reboot? | 08:09 |
niluje | infinity: I'm lost :-( | 08:15 |
niluje | I don't understand when the process is killed, when listed in /run/sendsigs.omit.d | 08:16 |
infinity | niluje: It's not. The machine reboots. | 08:25 |
infinity | niluje: Which, of course, kills everything. But not with a signal. It just dies. Cause the machine is dead. :P | 08:25 |
niluje | infinity: ok | 08:28 |
niluje | :p | 08:28 |
niluje | so I don't understand why you can't unmount the rootfs | 08:28 |
jk- | heya ikepanhc, do you know if Alex is around? | 08:30 |
ikepanhc | jk-: hi Jeremy :D | 08:30 |
ikepanhc | jk-: give me a sec | 08:30 |
jk- | ikepanhc: thanks! :) | 08:30 |
alexhung | jk-, ping | 08:31 |
jk- | hi alexhung, just had a few fwts questions; is there a better channel to use? | 08:32 |
infinity | niluje: Because if you unmount it, how can you call "halt" or "reboot"? | 08:32 |
infinity | niluje: Or the shell script that is S90reboot... | 08:32 |
niluje | hm right :p | 08:33 |
niluje | infinity: is there a resource anywhere explaining step by step what happens during a reboot? | 08:35 |
infinity | niluje: I doubt it. It's mostly documented in code. | 08:35 |
niluje | ok | 08:35 |
niluje | wouldn't it be possible to mount a tmpfs in which I'd get something executed that'd kill the process at the very last moment? | 08:36 |
infinity | niluje: I hinted above at that being the very complicated solution to trying to not need root to reboot. | 08:37 |
infinity | niluje: But it's overkill for literally every use case I've ever run into. | 08:37 |
infinity | niluje: If nbd-server is exploding when a client disappears, that's a bug, IMO, and should be fixed, not worked around. | 08:37 |
infinity | niluje: The maintainer of nbd (wouter) happens to be a very nice fellow. You might ask him about your issues. | 08:40 |
infinity | niluje: I will continue to maintain the opinion that if the server can't handle a client disappearing on reboot and trying to reconnect later with a new instance, it's fundamentally broken. :P | 08:40 |
apw | yeah what happens when you lose power or the network for reasons outside your control, you end up in the same mess | 08:41 |
niluje | yup | 08:41 |
niluje | infinity: I'm going to dig a bit more before contacting the guys behind nbd. | 08:42 |
niluje | thanks a lot for your time | 08:42 |
infinity | niluje: Sure. He happens to be a decent dude with access to lovely chocolate. | 08:42 |
infinity | He tried to kill me with said chocolate last time I saw him. | 08:42 |
infinity | Evil Belgians. | 08:42 |
niluje | :D | 08:43 |
smb | That's killing someone softly... :) | 08:43 |
niluje | just wondering, are you a ubuntu committer, a kernel committer, or something? | 08:43 |
infinity | smb: That's wouter's style. | 08:43 |
infinity | niluje: Ubuntu and Debian developer, and random hacker on several upstream projects. Also, just a guy who can't sleep (but I'm going to go try). | 08:44 |
niluje | aren't you in london? at 9am?! | 08:44 |
infinity | niluje: I'm in Calgary. It's 2:44am. | 08:44 |
apw | heh ... very few people are in london | 08:44 |
smb | apw, Beside all the inhabitants | 08:45 |
infinity | apw: Hey, to be fair, I'm in London more than some people who claim to live there. | 08:45 |
apw | true, but as you aren't, the irc contingent is severely reduced | 08:45 |
niluje | ok :-) thanks a lot again | 08:45 |
apw | fallacy (not), yes file a bug against it indeed | 08:47 |
=== apw changed the topic of #ubuntu-kernel to: Home: https://wiki.ubuntu.com/Kernel/ || Ubuntu Kernel Team Meeting - Tues April 29th, 2014 - 17:00 UTC || If you have a question just ask, and do wait around for an answer! If the question is should I file a bug for something, likely you can assume yes. | ||
smb | apw, Bored of repeating that answer you are? | 08:50 |
smb | :) | 08:50 |
apw | heh | 08:51 |
mlankhorst | ;D | 08:52 |
smb | mlankhorst, So about my whining part #1. bug 1298517 has screenshots about it. ;) | 08:57 |
ubot2 | Launchpad bug 1298517 in mesa (Ubuntu) "Rendering issues in unity with xserver-modeset" [High,Triaged] https://launchpad.net/bugs/1298517 | 08:57 |
mlankhorst | modesetting hates you :P | 08:58 |
smb | Well, it's mostly nearly working. It's just that unity developers obviously don't test in VMs and if they do, they don't use terminals. ;-P | 09:00 |
mlankhorst | llvmpipe bugs have a low chance of getting fixed, though.. | 09:01 |
smb | Unless "he" stumbles over them trying to present a Ubuntu desktop running in THE CLOUD™ | 09:02 |
amitk | :) | 09:03 |
smb | amitk, Oh look who is there. Hello stranger. ;) | 09:04 |
amitk | smb: that should go on the quotes page, too bad I don't have access anymore ;) | 09:05 |
smb | amitk, Not again. :-P | 09:05 |
amitk | smb: for the record, I'm always lurking around this channel.. | 09:08 |
amitk | still use ubuntu core, none of the desktop stuff though | 09:09 |
smb | amitk, Yeah, and I could tell if I cared to look at the list of people. But I consider you under the towel unless you say something. :) | 09:09 |
=== sz0 is now known as sz0` | ||
=== sz0` is now known as sz0 | ||
=== sz0 is now known as sz0` | ||
=== sz0` is now known as sz0 | ||
apw | smb, i sometimes wonder why i even have that thing enabled (the list) | 10:43 |
smb | apw, You got a point there | 10:45 |
* apw idly wonders if it can be turned off in his interface .... hmm | 10:46 | |
=== deffrag__ is now known as deffrag_ | ||
apw | oh i hate that, spend 3 hours debugging a test suite failure "your kernel is broken" to ... of course ... find the test is broken | 14:09 |
mlankhorst | solution: spend less time on testing. ;-) | 14:16 |
sconklin | arges, fwiw that network namespace cleanup bug has been repro'd by someone else on 3.13.0-24 (upstream bug). I think we have an easy way to reproduce it, working on setting that up now. | 14:21 |
arges | cool | 14:21 |
sconklin | also interesting is that it's also been seen on bare metal | 14:26 |
diwic | apw, that's what we call 'quality time' | 14:32 |
apw | mlankhorst, you are right .... rm -rf tests ... better | 14:35 |
stgraber | sconklin: is that the bug where the refcount appears to fail and the kernel thinks there's still something holding eth0 in an otherwise unused ns? | 14:36 |
sconklin | stgraber: https://bugzilla.kernel.org/show_bug.cgi?id=65191 | 14:36 |
ubot2 | bugzilla.kernel.org bug 65191 in Netfilter/Iptables "BUG in nf_nat_cleanup_conntrack" [Normal,New] | 14:36 |
stgraber | ah, I haven't seen that one in a while :) | 14:37 |
sconklin | we see it when we scale up and create a bunch of containers, then scale down and remove all of them | 14:38 |
jsalisbury | ** | 14:50 |
jsalisbury | ** Ubuntu Kernel Team Meeting - Today @ 17:00 UTC - #ubuntu-meeting | 14:50 |
jsalisbury | ** | 14:50 |
sconklin | stgraber: are there any multiple-container scaling tests in the LXC tests? | 14:55 |
stgraber | sconklin: we have lxc-test-concurrent that tests multi-threaded LXC but the LXC tests are typically simple busybox so they don't do much | 15:01 |
sconklin | ok. Thanks. I'll see whether we have anything that can run outside of our environment to stress this | 15:03 |
arges | sconklin: in your testcase you'll also need to do something that is generating conntrack entries (and allowing them to timeout or get cleaned up) | 15:07 |
sconklin | yeah, it's not trivial, and I'm not terribly optimistic | 15:08 |
BenC | apw: I’m working on fixing powerpc64-emb right now. What’s my timeframe on getting it into the next upload? | 15:22 |
bjf | BenC, you now have 3 weeks :-) | 15:23 |
BenC | bjf: Not sure if that’s good or bad :) | 15:24 |
bjf | BenC, you missed the friday cutoff. i'll be uploading new trusty bits today ... if you are talking utopic, that's a different story | 15:25 |
BenC | bjf: utopic | 15:25 |
bjf | BenC, ok, then apw and/or rtg can tell you that | 15:25 |
BenC | I still keep reading that as micro-topic | 15:25 |
apw | BenC, well we want to get upload soon, but if you have bits to fix power so we don't have to rip rip, that would be welcome | 15:39 |
apw | BenC, how long do you think you'll be | 15:39 |
BenC | apw: End of today | 15:39 |
apw | as tim is away we should be ok :) | 15:42 |
stgraber | sconklin: hey, so I just reproduced the conntrack panic here on metal with current trusty | 15:59 |
stgraber | (accidentally, I was doing stress testing for another unrelated bug ;)) | 15:59 |
arges | stgraber: whoa. what test case did you use? | 15:59 |
stgraber | arges: running a -j9 kernel build + one unprivileged container running chrome with a youtube video + another container with 500 sub-containers and around 30 with sub-sub-containers | 16:00 |
sconklin | stgraber, arges - I'm looking at commit 0eba801b64cc8284d9024c7ece30415a2b981a72 which is in upstream but not stable, could be related - I have not tested upstream tip yet | 16:00 |
stgraber | so I must have been having around 600 netns on that box, each with around 150 ipv6 routing table entries and 20-30 of those with iptables and ip6tables nat entries | 16:01 |
arges | I could build a test kernel for stgraber if he can easily reproduce. sconklin is there a launchpad bug for this? | 16:03 |
sconklin | arges: no | 16:03 |
stgraber | arges: I'm not sure about "easily" yet but I can try again once that kernel is actually done building and I don't mind my work machine panicking on me :) | 16:04 |
arges | sconklin: ok file a bug please : ). stgraber when you're ready probably be good to first try http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/ and then we can try patch guessing | 16:07 |
sconklin | arges: fwiw the mainline build v3.15-rc2-trusty should have this patch in it | 16:13 |
arges | sconklin: yup. | 16:13 |
BenC | apw: Successfully linked. Still need to boot test it | 16:14 |
apw | BenC, sounds hopeful | 16:17 |
sconklin | arges: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1314274 | 16:36 |
ubot2 | Launchpad bug 1314274 in linux (Ubuntu) "BUG in nf_nat_cleanup_conntrack" [Undecided,New] | 16:36 |
xnox | excellent usage of buildds =) i like it | 16:48 |
jsalisbury | ## | 16:54 |
jsalisbury | ## Kernel team meeting in 5 minutes | 16:54 |
jsalisbury | ## | 16:54 |
jsalisbury | ## | 17:00 |
jsalisbury | ## Meeting starting now - #ubuntu-meeting | 17:00 |
jsalisbury | ## agenda: https://wiki.ubuntu.com/KernelTeam/Meeting | 17:00 |
jsalisbury | ## | 17:00 |
=== jsalisbury changed the topic of #ubuntu-kernel to: Home: https://wiki.ubuntu.com/Kernel/ || Ubuntu Kernel Team Meeting - Tues May 6th, 2014 - 17:00 UTC || If you have a question just ask, and do wait around for an answer! If the question is should I file a bug for something, likely you can assume yes. | ||
sconklin | arges: that bug also happens with the mainline v3.15-rc2-trusty build, I put that in the bug. | 20:48 |
sconklin | that means that the patch we talked about doesn't fix it, although that one may need to be in -stable also | 20:48 |
sconklin | kamal: ^^ | 20:48 |
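
The /run/sendsigs.omit.d suggestion from infinity in the nbd-client discussion can be sketched as a small shell helper. This is only a sketch under assumptions: the function name `register_omit_pid` and the file name `nbd-client` are hypothetical, and it assumes the sysvinit-style sendsigs step reads whitespace-separated PIDs from each file in that directory.

```sh
#!/bin/sh
# Hypothetical helper (not from the log): record a PID so that the
# "kill the world" step on shutdown skips that process.
# Usage: register_omit_pid PID NAME [DIR]
register_omit_pid() {
    pid=$1
    name=$2
    dir=${3:-/run/sendsigs.omit.d}   # directory named in the log

    mkdir -p "$dir" || return 1
    # Append one PID per line so several PIDs can share one file.
    printf '%s\n' "$pid" >> "$dir/$name"
}
```

It might be called as `register_omit_pid "$(pidof nbd-client)" nbd-client` once the client is running. As discussed in the log, this only keeps the process alive until the machine actually reboots; it does not make the client disconnect cleanly from the server.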
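
niluje's manual `mount -t ext4 -o remount,ro /dev/nbd0 /` test, and the remount that S60umountroot performs, can be checked by reading the options field in /proc/mounts. The helper below is a minimal sketch; the function name `mount_is_ro` is hypothetical, and the second argument exists only so the check can be run against a sample file.

```sh
#!/bin/sh
# Hypothetical helper (not from the log): succeed if the given mount
# point is currently mounted read-only.
# Usage: mount_is_ro MOUNTPOINT [MOUNTS_FILE]
mount_is_ro() {
    mnt=$1
    table=${2:-/proc/mounts}

    # Fields are: device, mount point, fs type, options, ...
    # Keep the last matching entry, since later mounts shadow earlier ones.
    opts=$(awk -v m="$mnt" '$2 == m { o = $4 } END { print o }' "$table")

    # Match "ro" only as a whole option, so errors=remount-ro does not count.
    case ",$opts," in
        *,ro,*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

For example, `mount_is_ro / && echo "root is read-only"` run just before the reboot step would show whether pending writes have been settled, which is the property infinity relies on to make an unclean nbd disconnect harmless.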