[01:02] hallyn, ack. thanks for the heads up
=== fmasi_afk is now known as fmasi
=== fmasi is now known as Guest32901
=== elmo__ is now known as elmo
[07:42] * apw yawns
[07:59] I guess we don't have debugsym builds for the lts backport kernels?
[07:59] *dbgsym
[08:03] tjaalton: We should do.
[08:04] tjaalton: http://ddebs.ubuntu.com/pool/main/l/linux-lts-raring/
[08:04] tjaalton: http://ddebs.ubuntu.com/pool/main/l/linux-lts-quantal/
[08:04] infinity: oh of course..
[08:05] different source package
[08:09] just to clarify, should this work when a dbgsym package is installed? http://stackoverflow.com/questions/6151538/addr2line-on-kernel-module
[08:09] what I'm trying is to get the line of code that matches an oops trace
[08:10] the gdb approach just says 'no debugging symbols'
[08:12] nevermind, used the wrong argument.. needed the one in /usr/lib/debug/lib/modules/..
=== yofel_ is now known as yofel
[12:14] * henrix -> lunch
=== Guest32901 is now known as fmasi
=== ayan_ is now known as ayan
[14:25] guess there are no dbgsym packages for mainline kernels?
[15:03] would be awesome to have those
[16:01] hey - is there a signalfd+epoll wizard here?
[16:01] If I run http://people.canonical.com/~serge/signalfd.c 10 times very quickly, one of the runs will end up hanging because main never gets an epoll event for the sigchild from the first cloned child
[16:02] the clone is done after the signalfd is set up, so I would expect epoll to fire for that child's sigchld
[16:02] (when it hangs, the cloned child and its own child both are <defunct>, waiting for parent to reap them...)
[16:06] hallyn, you know you cannot do IO in a signal handler right ?
[16:06] hallyn, or indeed very little else
[16:06] apw: but this isn't a real signal handler...
[16:06] despite the name :)
[16:07] it runs from main() in the epoll loop
[16:07] oh heh
[16:08] but i'm hoping someone can point out what i'm doing wrong there, before i go to lkml to present it as a possible bug...
somewhere in that stack...
[16:08] hallyn, so you clone the children then make the epoll thingy, why is that not racy
[16:09] oh i see you make the signalfd first
[16:09] oh yeah, lemme change that order,
[16:09] though i would expect that to behave right i think :)
[16:09] yeah, but maybe i need to make the epollfd and attach the signalfd first. Mind you, that *is* being done right in the code I'm trying to reproduce :)
[16:12] I also apparently inserted the clone block twice
[16:13] heh
[16:13] post up the fixed version when you have it :)
[16:13] well, the double clone seems to be the trigger
[16:14] updated http://people.canonical.com/~serge/signalfd.c
[16:18] hallyn, what the heck is the code trying to even do at the bottom
[16:18] it seems to be using the number of live fds reported to loop over the signal handler
[16:18] apw: it's waiting on an epoll event for the signalfd fd, and calling 'signal_handler' on the fd when it arrives.
[16:19] yeah but looping over 0 -> nfds and reading for each doesn't make sense
[16:19] as only one of them is a signafs
[16:19] apw: admittedly since i only have one fd the loop seems silly - the more generalized case is in src/lxc/mainloop.c
[16:19] signalfd anyhow, and the number returned is not related to the number of signals ... is it ?
[16:20] right, the number returned is # available fds,
[16:20] and yes that can only ever be 1
[16:20] so for you yes, but for them?
[16:20] if this code you have is cut down from theirs i mean
[16:20] right, there they also add console fds and such
[16:20] it seems to be reading the signalfd once per the number of fds they have awake
[16:20] that doesn't make sense really does it?
[16:21] no they call a different handler for each fd type
[16:21] (one sec, ...)
[16:22] apw: see mainloop.c in https://github.com/lxc/lxc/blob/staging/src/lxc/mainloop.c
[16:23] ahh that makes more sense
[16:24] apw: if you lxc-start a container with an init which just does exit(0), the mainloop handler often hangs, while init sits around defunct. i'm trying to reproduce that...
[16:25] i'll pull out the inner loop since it only obfuscates the problem
[16:26] hallyn, well the original code does have a case (i think) where it can hide an event coming in
[16:27] if the handler returns non-zero it will silently go back to sleep, it might be worth instrumenting when it does things like that
[16:30] apw: right, the handler returns 1 if the container init's sigchild was received, 0 otherwise
[16:30] apw: by 'to sleep' you mean waiting on an epoll event right?
[16:30] hallyn, yes to sleep == called epoll_wait and blocks again
[16:30] apw: I was wondering last night whether the epoll event came in for two signals, and i was only reading one - but trying another read doesn't seem to help,
[16:31] and near as I can tell epoll_wait should return again if i haven't consumed all the data
[16:31] well in theory if there was two, and you read one and poll again, it should wait again ... yes that
[16:31] right - unless i was doing edge triggered, but i'm not (EPOLL_ET or whatever)
[16:34] hallyn, so the difference in your code is you should really be doing
[16:34] you should be running 'all' the ones it returns and doing
[16:34] if (event[N].data == signal_handler)
[16:35] signal_handler()
[16:35] just in case it is returning something else
[16:35] if there was a bug in which it returned the wrong data, you wouldn't see it with your code as is
[16:36] well actually epoll_wait() seems to be always returning 0, even when it has data. (huh?)
[16:38] well if it returned 0 there should be no data
[16:38] you might want to seed the events with poison to confirm
[16:41] easy enough, when it hangs, just hit ctrl-c :)
[16:44] for me this code does behave odd
[16:44] oh this is the old old code
[16:44] hm, i see, 0 means don't wait, that's why i'm getting 0 often.
[16:45] * hallyn tries with timeout -1
[16:45] yeah thats non-blocking
[16:47] apw: verifying data ... actually may fix this! haven't reproduced so far
[16:47] updated http://people.canonical.com/~serge/signalfd.c
[16:47] nope, here we go
[16:48] how often do you run it to get a hang
[16:49] with the updated code, 100 times
[16:50] for me i got one where it says 'got an event' a lot over and over
[16:51] got an event (nfds 1)
[16:51] bad epoll event!
[16:52] right
[16:53] so why is there a bad epoll event. weird
[16:53] am i not initializing something before setting up?
[16:54] * apw pokes, at least this would explain the hang
[16:54] i think
[16:54] anyhow, poking away, as i can repro it as well
[16:55] doh
[16:56] apw: nm, that's a bogus one. i is never set :)
[16:56] before we check events[i].data.ptr
[16:56] hallyn, heh indeed, doh
[16:56] and i just got it to hang with that fixed
[16:56] double-doh, though - fixing it (new http://people.canonical.com/~serge/signalfd.c )
[16:57] yeah
[16:57] hangs immediately
[16:57] what's an ascii emoticon for tortured agonizing troll?
[17:21] apw: well, to my surprise, even replacing clone with fork eventually reproduces it. that's good i guess.
[17:22] hallyn, what are you testing on
[17:22] which release
[17:22] hallyn, as it seems reproducible on 3.8 (whatever this is)
[17:23] apw: yeah, raring. oh, snap, i thought my other vm was saucy, but no i've only tested on raring
[17:23] wait a sec, i'm marking nonblock wrongly, perhaps
[17:24] hallyn, same here
[17:26] are you marking anything nonblock at all?
[17:27] apw: yeah (refresh http://people.canonical.com/~serge/signalfd.c :)
[17:27] apw: i was doing it with fcntl for awhile, but then saw that signalfd takes a SFD_NONBLOCK flag
[17:27] anyway, doesn't fix it
[17:27] so the signal really really does seem to just get lost
[17:28] ok lemme try on saucy
[17:28] or, uh, precise
[17:29] happens same way on precise
[17:29] hallyn, so it always hangs when it gets the other exit first right?
[17:30] yeah
[17:30] apw: no
[17:31] sometimes it gets the other exit first and still succeeds
[17:31] but every time it fails it does first get the other exit
[17:31] * apw isn't seeing that, but it is hanging very quick for me
[17:31] apw: http://paste.ubuntu.com/5840986/ here is an example
[17:32] it does feel like it is behaving like it is ET
[17:33] yeah
[17:34] apw: but then i'd expect my looping of the read to fix it
[17:37] yep that we would
[17:39] apw: d'oh.
[17:39] hallyn, ?
[17:39] signalfd() does not do flags.
[17:39] wonder if signalfd() library call uses signalfd4() syscall
[17:40] oh
[17:40] yeah strace says it uses signalfd4
[17:40] and mine is seemingly non-b
[17:40] blocking
[17:40] so doesn't that mean it does work
[17:41] waiting ...
[17:41] got an event (nfds 1)
[17:41] got sig for 20171, init was 20173
[17:41] failed to read signal info errno=11
[17:41] failed to read signal info errno=11
[17:41] waiting ...
[17:41] for me it does that, on a hang, so it saw the wrong pid, read 3 times (two failed cause nothing there) and then we went back
[17:42] but why 2 in a row failures?
[17:42] it should only fail once, then return to the epoll-wait loop
[17:43] apw: that printf looks different from mine
[17:43] perror("read");
[17:43] printf("errno is %d\n", errno);
[17:43] if (errno == EAGAIN || errno == EWOULDBLOCK)
[17:43] return 0;
[17:43] printf("failed to read signal info");
[17:43] yeah it is mine indeed
[17:43] ok
[17:43] i'm tempted to ping mkerrisk
[17:45] mind you the corresponding lxc bug has been somewhat low priority for me, but i thought we were doing something more wrong than this... if it's a kernel or libc bug i think it's a bigger deal
[17:49] hallyn, i'll keep poking for a bit too
[17:51] hallyn, of course it might be a bug in signalfd not epoll, hmm
[17:56] apw: yup
[17:56] apw: that's why i was happy that fork could reproduce it - at least it doesn't involve CLONE_NEWPID as well :)
[17:57] hallyn, and it might be an alignment issue, as it is random and we have address space randomisation on
[17:57] as SIGCLD is one of the large ones in that structure
[17:58] so if the structure was too small, and aligned bad or something
[17:58] or too big or something
[17:58] struct signalfd_siginfo __user *siginfo;
[17:58] if that isn't the same size as the one in the kernel ... we might fail
[17:59] as if we dequeue it and then fail to return it ... we just lose it
=== fmasi is now known as fmasi_afk
[18:06] hallyn, what makes you think (i assume you have docs somewhere) that just because you are using this interface that you can guarantee to see all signals ?
[18:07] hallyn, as you say when it hangs you can see that you have pending children events
[18:07] hallyn, are you sure you can expect to see both signals as separate events with separate children ?
[18:07] hallyn, would i not expect you to actually ignore the siginfo and use wait to get the children
[18:07] hallyn, and there i think i would expect you to get both as they are there waiting to be waited for
[18:08] hallyn, ie reading signalfd is _not_ waiting for them nor reporting them
[18:10] hallyn, and you need to wait to clean them up regardless
[18:13] interesting
[18:13] so i have to waitpid() on the first to clear it up before i can get a sigfd for the second sigchld?
[18:14] lemme try
[18:14] it makes sense
[18:14] (and it briefly came to mind last night, but i figured "nah!" :)
[18:15] hallyn, no i am not saying you have to waitpid to clear it
[18:15] hallyn, i am saying that signalfd is reporting pending signals
[18:16] and pending signals with their 'latest' siginfo
[18:16] but if you get two signals of the same type only the last one will be reported
[18:16] that is normal for signals and siginfo
[18:16] i am saying when you get a signal, you should be looking on wait (with NOWAIT) and using
[18:16] i see
[18:16] the data it returns
[18:16] as the pids not anything in the siginfo.
[18:17] so i (and the original author) am (is) completely misunderstanding sigchld delivery
[18:17] as we see both as zombies i bet money that you will find two waits
[18:17] i did indeed assume i'd get N sigchlds
[18:17] that is not normal with signals at the process level for sure
[18:17] now signalfd could have either semantic and it does not say which
[18:18] but if we conjecture it has the same semantics as signals
[18:18] and the way POLLIN is calculated i suspect it is
[18:18] then you may not indeed, but a looping wait should always be sage
[18:18] safe, and i think you need to be waiting anyhow to clean up the carcasses
[18:18] so if i am right, a wait loop will repair things, if it is broken
[18:19] then it will only ever have one and make no difference
[18:23] apw: thanks - final signalfd.c testcase uploaded - this one doesn't seem to reproduce.
seems you were right!
[18:23] now i can fix lxc :)
[18:23] \o/
[18:23] thanks apw
[18:23] hallyn, heh... there is a point in having a kernel team after all :)
[18:23] hallyn, and i am glad we got to the bottom of it
[18:24] * hallyn goes to take a walk outside before writing the actual patches
[18:26] hallyn, i know the feeling, i think i need a beer after thinking that hard
[18:27] that sounds. good.
[18:29] hallyn, ok .. you need to break that loop for == 0 i think
[18:29] as it spins in there till they exit otherwise
[18:30] hallyn, and waitid() may be a more natural fit for your code base maybe
[18:30] hallyn, switching your loop from >= 0 to > 0 still seems to work
[18:30] and removes the repeated looping returning 0
[18:31] got sig for 2791, init was 2793
[18:31] waitpid ret=2791
[18:31] waitpid ret=2793
[18:31] got init's sigchld: done
[18:31] hallyn, and we see the pattern there which i conjectured .. coool
[18:31] beer it is
=== chiluk` is now known as chiluk
[19:08] apw: (drat, no beer for me) do you have a copy of your final one you can post?
[19:08] curious which loop you ended up going with
[19:09] apw: oh, right, i misread the == 0 return value case! lol
[19:09] "... in any state" well that does me no good :)
[19:10] waitid is too foreign to me :) i'll muck it up somehow
[19:15] hallyn, that version of yours looked ok once it was >= 0 :)
[19:15] > 0 even
[19:17] getting ready to test the lxc ase
[19:17] lxc case
[19:25] hm, not a slam dunk fix.
[19:27] odd
[19:36] hallyn, ?
[19:39] apw: added the waitpid loop into signal_handler in lxc, but lxc-start still ends up with defunct init
[19:39] so i'm missing something else that's going wrong there
[19:41] sounds like a tomorrow fun
[19:42] tomorrow's a holiday here - but ttyl
[19:43] hallyn, drop me a line if you have any other examples to play with and i can look at it tomorrow
[19:44] thx
[19:46] hey
[19:46] is there any issue with the saucy 3.10 kernel and screen light?
[19:47] changing it using the multimedia keys hangs my latitude laptop for a second
[19:47] where it works fine with 3.9
[19:50] seb128, nothing i have seen reported
[19:50] apw, hum, ok
[19:51] apw, is there any special info that would be useful in a bug report?
[19:52] seb128, h/w info which ubuntu-bug will collect
[19:52] ok
[19:52] that's a latitude e6410
[20:19] apw: well huh - particularly interesting in the lxc case is that after i call the parent task, the [init] task remains defunct, while parented by 1
[20:19] *that* probably shouldn't happen
[20:20] hallyn, if you reap a parent before its child then its children go to init don't they ?
[20:20] you could test that
[20:20] with a basic fork test
[20:23] apw: (not sure i was clear) i'm saying: root 1878 1 0 21:32 pts/2 00:00:00 [init]
[20:23] '[init]' is the container init, which has exited; it's now parented by the host init, but is still rockin' the zombie life
[20:23] right so no one has waited for it
[20:23] have you waited for its parent
[20:24] bc it's not sending a new sigchld?
[20:24] that is not what i mean
[20:24] i figured init would reap zombies regardless of whether it got a sigchld
[20:24] oh i see
[20:24] i killed its parent, and the parent is gone.
[20:24] right so yes it should jump to real init, and that should be reaping it
[20:25] that is its job
[20:26] all right i'd better step away a few hours. sorry if i pulled you away from that beer :)
[20:27] hallyn, that seems wrong, and in the first place i would be blaming init itself, which we cannot trace easily hmmm, i wonder if --verbose would give us any info
[20:29] apw: but as i say if the exiting task has already sent its sigchld, then gets reparented to init, then init won't get a sigchld -
[20:29] i'd actually expect the kernel at that point, instead of reparenting, to just reap ...
[20:29] (i guess that's nonsense-talk)
[20:30] so upstart would need to walk its pid list and check for zombies... which is not efficient.
still, yeah, cleaning up after bad users is its job...
[20:30] this should be trivial to write a test case for :) will do so later
=== fmasi_afk is now known as fmasi
=== fmasi is now known as fmasi_afk