[12:06] <soren> Keybuk: I've got a defunct process with ppid 1.. :( Anything you want me to do?
[12:14] <Keybuk> soren: ps output?
[12:15] <soren> All of it? Or just: soren    13177  0.1  0.0      0     0 ?        Zl   Nov23   5:31 [epiphany-browse] <defunct>
[12:16] <Keybuk> ps ajx for that pid
[12:18] <Keybuk> ps j -p 13177
[12:18] <Keybuk> should do the trick
[12:19] <soren> 13177  7364  7364 ?        Zl     5:31 [epiphany-browse] <defunct>
[12:19] <Keybuk> its ppid isn't 1 then
[12:19] <Keybuk> ?
[12:19] <soren> That's what ps -efl says.
[12:19] <Keybuk> ps -j ! :)
[12:20] <soren> ps -j -p gives these headers:
[12:20] <Keybuk> no
[12:20] <Keybuk> ps j
[12:20] <soren> Gah.
[12:20] <Keybuk> ps j -p 13177
[12:20] <Keybuk> ;)
[12:20] <soren> I'm an idiot.
[12:20] <soren>  PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
[12:20] <soren>     1 13177  7364  7364 ?           -1 Zl    1000   5:31 [epiphany-browse] <defunct>
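For readers who would rather not fight the ps command line: on Linux the same PPID/PGID/SID/STAT values are exposed in /proc/<pid>/stat. Below is a minimal sketch of reading them, assuming the field order documented in proc(5). The helper names `parse_proc_stat` and `read_proc_stat` are made up for illustration, and the `sample` line is reconstructed from the ps row above (trailing fields omitted), not captured from soren's machine.

```python
def parse_proc_stat(line):
    """Parse the leading fields of a /proc/<pid>/stat line (layout per proc(5))."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split around the LAST ')' rather than on whitespace.
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    fields = tail.split()
    return {
        "pid": int(pid_text),
        "comm": comm,
        "state": fields[0],      # 'Z' means zombie (defunct)
        "ppid": int(fields[1]),
        "pgid": int(fields[2]),
        "sid": int(fields[3]),
        "tpgid": int(fields[5]),  # fields[4] is tty_nr
    }


def read_proc_stat(pid):
    """Read and parse /proc/<pid>/stat for a live (or zombie) process. Linux only."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_stat(f.read())


# Hypothetical sample built from the ps output above; not real /proc data.
sample = "13177 (epiphany-browse) Z 1 7364 7364 0 -1"
info = parse_proc_stat(sample)
print(info["state"], info["ppid"])
```

A state of `Z` together with a ppid of 1 is the situation being debugged here: a zombie whose parent is init.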
[12:20] <Keybuk> and curse ps for having the most insanely bad command-line ever
[12:20] <Keybuk> hmm
[12:20] <Keybuk> if you kill -CHLD 1 does it go away?
[12:21] <soren> Wow.. No.
[12:21] <Keybuk> sudo initctl version
[12:21] <Keybuk> ?
[12:21] <Keybuk> (note: no --)
[12:21] <soren> Now init (upstart 0.3.9)
[12:22] <soren> Er.. -now :)
[12:22] <Keybuk> did it go away?
[12:22] <soren> No.
[12:22] <Keybuk> *shrug*
[12:22] <Keybuk> iz kernel boog
[12:22] <soren> What makes you say that?
[12:22] <soren> Surely, it's upstart's job to wait() for it?
[12:22] <Keybuk> yes
[12:23] <Keybuk> and you just did two things that guaranteed that upstart would have called wait() :P
[12:23] <soren> Eh? What was the other one? initctl version?
[12:23] <Keybuk> yeah
[12:23] <soren> *g*
[12:23] <soren> Why?
[12:23] <Keybuk> upstart always calls wait() in a loop inside its main loop
[12:23] <soren> Oh.
[12:23] <Keybuk> rather than in a signal handler
[12:24] <Keybuk> so you just did two things that would have definitely caused upstart to repeat its main loop
[12:24] <Keybuk> so definitely caused wait() to be called
[12:24] <Keybuk> which kernel version?
[12:24] <soren> Just once? Or until there aren't any more children?
[12:24] <Keybuk> until there aren't any more children
[12:24] <Keybuk> it calls waitid (..., WNOHANG) in a loop
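The reaping pattern Keybuk describes (a non-blocking wait repeated until nothing is left to collect, run from the main loop rather than from a signal handler) can be sketched roughly as follows. This is not upstart's code: it is an illustration in Python that uses `os.waitpid(-1, os.WNOHANG)` as a stand-in for `waitid(..., WNOHANG)`, and the function name `reap_children` is an assumption.

```python
import os


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, exit_code) pairs; exit_code is the negated
    signal number if the child was killed by a signal.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none is waiting to be reaped
        if os.WIFEXITED(status):
            code = os.WEXITSTATUS(status)
        else:
            code = -os.WTERMSIG(status)
        reaped.append((pid, code))
    return reaped
```

A main loop would call something like `reap_children()` on every iteration, so any event that wakes the loop (a SIGCHLD, or an `initctl` request) also triggers a full sweep. That is why a zombie that survives both is unlikely to be the parent's fault.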
[12:24] <soren> uname -a says: Linux butch 2.6.22-14-generic #1 SMP Sun Oct 14 21:45:15 GMT 2007 x86_64 GNU/Linux
[12:25] <soren> I may or may not have upgraded the kernel package since I booted, so that's the best I've got.
[12:25] <Keybuk> sudo stop tty1
[12:25] <Keybuk> sudo start tty1
[12:25] <soren> Same.
[12:25] <Keybuk> any getty processes now zombies?
[12:25] <soren> Nope
[12:26] <Keybuk> sudo status tty1
[12:26] <Keybuk> then kill that pid
[12:26] <Keybuk> does it respawn?
[12:26] <soren> Yup
[12:26] <Keybuk> any zombie gettys?
[12:26] <soren> Nope.
[12:26] <Keybuk> epiphany still there?
[12:27] <soren> Yup
[12:27] <Keybuk> *shrug*
[12:27] <Keybuk> not my fault then ;)
[12:27] <soren> Gah..
[12:27] <Keybuk> definitely a kernel issue
[12:27] <Keybuk> try sending SIGCONT to 13177 ?
[12:28] <soren> To a zombie? Er.. No. I can, though.
[12:28] <Keybuk> try ie
[12:28] <Keybuk> try it
[12:28] <soren> Same.
[12:28] <Keybuk> fair enough
[12:28] <soren> What effect could that possibly have?
[12:28] <Keybuk> -> #ubuntu-kernel :)
[12:28] <Keybuk> might have released an in-kernel lock
[12:28] <soren> Oh.
[12:29] <soren> The epiphany process has an npviewer.bin child process, but I doubt that can mess things up this badly.
[12:30] <Keybuk> well
[12:30] <Keybuk> that kinda proves the kernel is stuffed then
[12:30] <soren> That's a good point.
[12:30] <soren> :)
[12:32] <Keybuk> zombie processes can't, by definition, have children
[12:32] <soren> Precisely.
[12:33] <Keybuk> since when they exit, the children would have been reparented to init first
[12:33] <soren> I didn't think of that.
[12:33] <soren> Yeah.
[12:33] <soren> Ah... dmesg reveals trouble, too.
[12:33] <soren> E.g. "Fixing recursive fault but reboot is needed!"
[12:34] <soren> ...but I had the vmware server modules loaded, so I'm SOL.
[12:35] <soren> I'll just blame them. There. Problem solved.