/srv/irclogs.ubuntu.com/2009/07/13/#upstart.txt

=== h\h is now known as haraldh
rjbell4	Keybuk: I was wondering if you could elaborate, as ion mentioned yesterday.	13:19
Keybuk	rjbell4: what's up?	13:19
rjbell4	I asked the following yesterday: Is there any support in upstart for monitoring multiple child processes? I've found "expect fork" and "expect daemon", which seem to monitor a single child or grandchild, but what if there are several child processes, and if any of them fail then I want to take action?	13:19
Keybuk	that's planned	13:21
Keybuk	though obviously within certain limits	13:21
Keybuk	since you'd need to know beforehand how many children to expect	13:21
Keybuk	how long it typically takes for that number of children to appear	13:21
Keybuk	and wouldn't want to die if they double-fork()d along the way	13:21
Keybuk	etc.	13:21
ion	rjbell4: Please describe your exact use case.	13:22
rjbell4	Keybuk: Okay, but not yet implemented, I gather?	13:22
ion	When i said 0.10 yesterday, i meant the-next-major-version-still-in-planning.	13:23
Keybuk	correct	13:23
Keybuk	rjbell4: but without more details, I'm not sure that what's planned will actually do what you want	13:23
rjbell4	ion: We have replaced init with upstart on our product, and one of the services we run forks a few different processes to do related-but-separate tasks. If any of those processes fails (segfaults, whatever), the whole thing should be restarted.	13:24
Keybuk	how would you tell Upstart, in advance, what those related-but-separate tasks were?	13:24
Keybuk	what can Upstart use to distinguish them?	13:25
Keybuk	does the parent process that spawned them remain, or does that terminate?	13:25
Keybuk	do any of the processes daemonise, or fork?	13:25
rjbell4	I was actually intrigued by the "read-a-pid-a-file" approach, as I thought if the service had support, that might be a cheap way to support telling Upstart which processes to monitor. But that support appears to have been removed.	13:26
Keybuk	are the spawned processes the same executable, or are they exec() of other executables?	13:26
Keybuk	rjbell4: right, it's not very reliable	13:26
rjbell4	It might suffice to tell upstart to monitor the 3 children of the process that it kicks off, rather than just looking for 1 with "expect fork". I'm not positive it would work, but I think it might.	13:26
Keybuk	can you answer the other questions above?	13:27
rjbell4	Keybuk: In this case, I think the parent terminates, and the children keep running. I'd have to check to be certain, but I believe that's correct. It's possible that they should daemonize (by forking again), but don't.	13:28
ion	If one of the children dies, what should happen to the other children?	13:29
Keybuk	the problem there is that you're spawning more than three new processes then	13:29
Keybuk	so how does Upstart know it's terminating because it's daemonising	13:30
Keybuk	or terminating because of an error?	13:30
rjbell4	ion: I suspect that should be handled however a "stop" would normally be handled	13:30
rjbell4	Keybuk: How is that different from 'expect daemon'? For example, couldn't there be an 'expect 3 daemons'-like functionality? I might be missing the problem.	13:32
rjbell4	Oh, BTW, good job with upstart. I'm actually considering what some colleagues have done with Upstart and wondering if it couldn't be done better / easier.	13:33
Keybuk	because that just follows one line of processes	13:33
Keybuk	you need the mechanism to follow multiple children	13:33
Keybuk	and know when it's reached the stable end	13:33
Keybuk	I'm thinking, from the information you've given, that it's not a hard problem	13:33
Keybuk	I assume that if these children were to exit(0) that'd be ok?	13:33
rjbell4	Keybuk: I'd actually expect that to happen only as a result of a stop event. If they terminated any earlier it should be with an error. Terminating earlier with a 0 exit status "should never happen", so I wouldn't presume to describe what the result should be.	13:36
rjbell4	Keybuk: re: hard problem. I suspect it's just extra work to track things. For the "expect 3 daemons" case, you trace the primary process until it forks, then continue to trace the child process as it forks 3 times, each time adding the child process to the list of processes to monitor for that service.	13:37
rjbell4	I suppose an argument could be made that this crosses some line and something else should be monitoring more complex models like this, but Upstart just seems to be in the right place to do this.	13:39
Keybuk	no, you're not following	13:39
Keybuk	the problem is how do you keep count of the forks	13:40
Keybuk	a - b - c	13:40
Keybuk	    +- d	13:40
Keybuk	    +- e	13:40
Keybuk	    +- f	13:40
Keybuk	that's 5 forks, not 3 ;)	13:40
ion	How about this: keep track of all children. Whenever one of them exits with a zero exit status and it wasn’t the last process, just remove it from the child list and continue as usual. If one of them dies with a non-zero exit status or due to a signal, consider that an error and kill the other processes (if any). When the last process exits with a zero exit status, consider that a non-error exit.	13:43
Keybuk	ion: that's exactly what I was thinking ;-)	13:43
rjbell4	Keybuk: Right, you're not just keeping track of the forks, but which process forks. You ptrace a until it forks, then you continue to trace b until it forks 3 times (in this case, for c and d and e, but not f, because you only told Upstart to trace three children)	13:44
Keybuk	ion: that fits with the general model I was going for with netlink	13:44
rjbell4	Keybuk, ion: Sounds reasonable, but since the mechanism is ptrace, that unfortunately rules out running a debugging on any of those processes, right?	13:44
Keybuk	rather than having "the main process" we're in the "running" state with N processes	13:44
ion	Substitute zero exit status with the value in “normal exit”	13:44
rjbell4	^debugging^debugger	13:44
Keybuk	rjbell4: the mechanism is not going to be ptrace for long	13:45
Keybuk	ptrace doesn't work	13:45
Keybuk	ion: right - normal exit usually only implies 0 if it's a task	13:45
rjbell4	Keybuk: Oh, well that probably colors my responses then.	13:45
Keybuk	but it makes sense that normal exit implies 0 if it's a task or there are other processes still running	13:45
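
The bookkeeping ion and Keybuk describe above can be sketched roughly as follows. This is only an illustration of the rule as stated in the channel, not Upstart code: the function name `handle_exit`, its parameters, and the returned verdict strings are all hypothetical, and `normal_exit` stands in for the job's "normal exit" values.

```python
def handle_exit(children, pid, exit_status, signaled=False,
                normal_exit=frozenset({0})):
    """Apply the rule from the discussion to one child exit.

    children    -- mutable set of pids still tracked for the job (hypothetical)
    exit_status -- the child's exit status (ignored if signaled is True)
    Returns (verdict, pids_to_kill).
    """
    children.discard(pid)

    # A death by signal, or an exit status outside "normal exit", is an error:
    # the remaining processes of the job should be killed.
    if signaled or exit_status not in normal_exit:
        return "failed", sorted(children)

    # A normal exit while other processes remain just shrinks the list.
    if children:
        return "running", []

    # The last process exited normally: a non-error exit for the whole job.
    return "finished", []
```

For example, with three tracked children, a clean exit of one leaves the job "running", while a later exit with status 1 yields "failed" together with the pids that still need to be killed.
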
rjbell4	Keybuk: You guys know what you are doing better than I do (obviously); I'm just providing feedback from a consumer. :-)	13:46
Keybuk	rjbell4: I think we can definitely make this work for you	13:46
rjbell4	Keybuk: Out of curiosity, what's the new mechanism planned to be?	13:46
Keybuk	rjbell4: using a Linux feature called the "proc connector"	13:47
Keybuk	it's a netlink socket from which you receive messages for all fork(), exec(), setsid(), setuid(), etc. calls	13:47
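
For readers unfamiliar with the proc connector, here is a minimal sketch of the message layout involved, assuming the constants and struct layouts from `<linux/netlink.h>`, `<linux/connector.h>` and `<linux/cn_proc.h>`. The helper names `build_listen_request` and `parse_event` are made up for illustration, and only fork and exit events are decoded.

```python
import struct

# Constants as defined in the Linux UAPI headers named above.
NETLINK_CONNECTOR = 11
NLMSG_DONE = 3
CN_IDX_PROC = 1
CN_VAL_PROC = 1
PROC_CN_MCAST_LISTEN = 1
PROC_EVENT_FORK = 0x00000001
PROC_EVENT_EXIT = 0x80000000

NLMSGHDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
CN_MSG = struct.Struct("=IIIIHH")     # idx, val, seq, ack, len, flags
EVENT_HEAD = struct.Struct("=IIQ")    # what, cpu, timestamp_ns
FOUR_U32 = struct.Struct("=4I")       # payload of fork and exit events


def build_listen_request(own_pid):
    """Build the datagram that asks the kernel to start sending events."""
    op = struct.pack("=I", PROC_CN_MCAST_LISTEN)
    cn = CN_MSG.pack(CN_IDX_PROC, CN_VAL_PROC, 0, 0, len(op), 0)
    total = NLMSGHDR.size + len(cn) + len(op)
    return NLMSGHDR.pack(total, NLMSG_DONE, 0, 0, own_pid) + cn + op


def parse_event(datagram):
    """Decode one proc connector datagram into a small tuple."""
    body = datagram[NLMSGHDR.size + CN_MSG.size:]
    what, _cpu, _timestamp = EVENT_HEAD.unpack_from(body)
    if what == PROC_EVENT_FORK:
        parent_pid, _ptgid, child_pid, _ctgid = FOUR_U32.unpack_from(
            body, EVENT_HEAD.size)
        return ("fork", parent_pid, child_pid)
    if what == PROC_EVENT_EXIT:
        # exit_code is the kernel's raw value, not a decoded exit status.
        pid, _tgid, exit_code, _exit_signal = FOUR_U32.unpack_from(
            body, EVENT_HEAD.size)
        return ("exit", pid, exit_code)
    return ("other", what, None)
```

To receive real events, the request would be sent on a `socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM, NETLINK_CONNECTOR)` bound to the `CN_IDX_PROC` group, which normally requires root. That part is left out here.
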
rjbell4	Keybuk: Interesting, thanks for the info.	13:49
Keybuk	ptrace has an annoying race condition	13:50
Keybuk	when you get the TRAP for fork(), this actually happens in the parent after the child process is spawned	13:50
Keybuk	the child STOPs of course	13:50
Keybuk	but Upstart ignores that, because it didn't know the pid	13:51
ion	Say, Upstart receives a message from the proc connector saying ”process 1234 exited with status 1”. While Upstart begins to process that message, 1234’s child, 1235 already exited with exit status 1 and a new, unrelated process happened to start with pid 1235. Upstart happily kills 1235 and then continues to read further messages from proc connector (”process 1235 exited with status 1” etc). Is this remotely possible? (I haven’t looked at how proc connector behaves yet.)	13:57
ion	keybuk: Highlight	14:09
Keybuk	hmm	14:53
Keybuk	ah	14:53
Keybuk	no, you're forgetting one key detail	14:53
Keybuk	pids aren't reused until you wait() and clean them up	14:53
Keybuk	if the process is a direct child of Upstart, or a daemon that has been reparented to Upstart	14:53
Keybuk	(pulls up the notes he made about this on his iPhone)	14:53
Keybuk	right	14:54
Keybuk	receiving a "process 1234 exited" from the proc connector before the SIGCHLD means we store a flag	14:54
Keybuk	likewise receiving a SIGCHLD for process 1234 before we see the proc connector entry means we store the flag	14:54
Keybuk	then on the opposite one, we actually take action	14:54
Keybuk	in other words, a child of init is not considered dead until we've been told about it and seen the body	14:55
Keybuk	at that point, all of its children are implicitly reparented to init	14:55
Keybuk	so again, init would receive SIGCHLD on them as well as the proc connector event	14:55
Keybuk	so provided we wait for the notification and the body, we're safe	14:55
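
The "notification and the body" handshake Keybuk describes can be sketched as a small two-flag tracker. This is an illustrative assumption about how such bookkeeping might look, not Upstart's implementation. The class name `DeathTracker` and the source labels `"connector"` and `"sigchld"` are hypothetical.

```python
class DeathTracker:
    """Treat a pid as dead only after both sources have reported it."""

    SOURCES = frozenset({"connector", "sigchld"})

    def __init__(self):
        self._seen = {}  # pid -> set of sources that have reported so far

    def note(self, pid, source):
        """Record one report. Return True once both reports have arrived."""
        if source not in self.SOURCES:
            raise ValueError("unknown source: %r" % (source,))
        seen = self._seen.setdefault(pid, set())
        seen.add(source)
        if seen == self.SOURCES:
            # Both the proc connector event and the SIGCHLD were seen,
            # so it is now safe to act on the death and forget the pid.
            del self._seen[pid]
            return True
        return False
```

Whichever report arrives first only sets a flag, and the second one triggers the action, so the order of the proc connector message and the SIGCHLD does not matter.
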
Keybuk	now there's a race as you say, if the children aren't reparented	14:56
Keybuk	if the tree is	14:56
Keybuk	a	14:56
Keybuk	`-b	14:56
Keybuk	`-c	14:56
Keybuk	a is our child, we get SIGCHLD	14:56
Keybuk	but b isn't, it's a's child	14:56
Keybuk	in that case, I'm not sure that it's up to upstart to supervise b	15:08
Keybuk	that's b's job ;)	15:08
Keybuk	err, a's job	15:09
Keybuk	but if a dies, proc connector means we know about b and c	15:09
Keybuk	so know they're reparented to us	15:09
Keybuk	so Upstart will supervise them both	15:09
sadmac2	Keybuk: is it possible to get information about b from proc connector before a dies?	15:10
Keybuk	sadmac2: yes, proc connector tells us everything	15:10
Keybuk	we will know that a forked b	15:10
Keybuk	the exception is if either b or c call setsid(), to put themselves out of a's session	15:11
Keybuk	the only reason a process would call setsid() is if it needs to be the leader of a session	15:12
Keybuk	because it wishes to control a tty	15:12
Keybuk	e.g. ssh	15:12
Keybuk	in which case, we deliberately abandon it	15:12
Keybuk	because when a (sshd) dies, we don't want to just supervise b and c (ssh-login scott) in their place	15:13
Keybuk	we want to consider a dying a bad thing	15:13
Keybuk	and we don't want to kill b or c either	15:13
sadmac2	Keybuk: yes, I was happy when you figured out those bits	15:15
sadmac2	Keybuk: here's a question: how do things like gnome-session work in upstart-of-tomorrow? We don't really want another per-user session manager duplicating most of our effort, do we?	15:16
Keybuk	it's quite easy to just have upstart be the session manager	15:17
Keybuk	the question turns out to be whether we want pid #1 to be that session manager,	15:17
Keybuk	whether we want a middle-man session manager running as the user,	15:17
sadmac2	Keybuk: my thought was gnome-session isn't a process anymore. Its just a taskless state that all the session services depend on	15:17
Keybuk	(but everything still gets reparented to #1 anyway)	15:17
Keybuk	or whether we actually want a mini-init for user sessions	15:17
Keybuk	such that any user session daemon actually gets reparented to the user session manager	15:17
Keybuk	and make the process trees look pretty	15:17
sadmac2	Keybuk: can we actually manipulate the reparenting behavior right now?	15:18
Keybuk	not without a patch Lennart sent to lkml	15:19
Keybuk	http://lkml.org/lkml/2009/5/28/430	15:19
ion	keybuk: Alright	15:20
* sadmac2 now cringes every time exit.c is patched		15:20
Keybuk	lol, why?	15:21
sadmac2	Keybuk: its like playing Jenga	15:21
Keybuk	it's just the kernel	15:21
sadmac2	I really do have to find time to just rewrite that whole file	15:22
Keybuk	one of the simpler parts really	15:22
ion	http://www.youtube.com/watch?v=F9BmTmMEOhQ	15:22
sadmac2	Keybuk: its particularly poorly written IMHO. if (some return value) { // we have a spinlock } else { //we gave up the spinlock awhile ago } is never a good thing to see	15:23
sadmac2	ugly code is worse than broken code. Bugs are easier to fix than hideous.	15:24
sadmac2	Keybuk: so in order to take 0.6 in Fedora, we need to not regress on the state transfer thing.	15:30
Keybuk	does the patch apply?	15:31
sadmac2	Keybuk: and since 0.6 is hopefully forward compatible, that means we need a solution that you've had some architectural say in.	15:31
sadmac2	Keybuk: the patch might nearly apply (I'd imagine it needs some heavy reworking) but if you're going to do it differently when you do it, its probably best that we do it your way.	15:33
ion	keybuk: Btw, now that jobs are in /etc/init without any 0.6 namespacing, how do you plan to have 0.10 handle 0.6 jobs cleanly? :-) I’m still advocating a separate parser for 0.6 jobs that outputs 0.10 job objects.	15:35
=== Keybuk_ is now known as Keybuk
sadmac2	Keybuk: what's the last thing you got before you dropped?	15:36
Keybuk	ion: there's not that much difference in the syntax	15:36
Keybuk	sadmac2: can't remember, what was the last thing I said? :)	15:36
sadmac2	Keybuk: does the patch apply?	15:36
Keybuk	haven't tried	15:36
sadmac2	Keybuk: no, you were asking me	15:36
sadmac2	:)	15:36
sadmac2	10:35 < sadmac2> Keybuk: so in order to take 0.6 in Fedora, we need to not regress on the state transfer thing.	15:37
sadmac2	10:35 < Keybuk> does the patch apply?	15:37
sadmac2	10:35 < sadmac2> Keybuk: and since 0.6 is hopefully forward compatible, that means we need a solution that you've had some architectural say in.	15:37
sadmac2	10:38 < sadmac2> Keybuk: the patch might nearly apply (I'd imagine it needs some heavy reworking) but if you're going to do it differently when you do it, its probably best that we do it your way.	15:37
sadmac2	Keybuk: ^^ that's what led up to you dropping	15:37
Keybuk	sadmac2: the problem is I don't have a preferred way of doing it yet	15:59
Keybuk	and I glanced through your patch, and I don't see how it can possibly work	15:59
Keybuk	you don't transfer most of the state	15:59
sadmac2	Keybuk: its enough. It does depend on the configs lining up.	15:59
sadmac2	Keybuk: it fixed our broken TTYs anyway	16:00
Keybuk	it isn't enough though	16:00
Keybuk	what if a job is mid-starting?	16:00
sadmac2	Keybuk: bear in mind that I didn't write this. Just reformatted it :)	16:00
sadmac2	Keybuk: what are we missing for that. We have the state and the pid. The new process should get the signal and move it forward.	16:01
Keybuk	the event queue?	16:01
Keybuk	the attached events?	16:01
Keybuk	if there's a "start" command running, you don't transfer the D-Bus message structure from one instance to the other	16:02
Keybuk	(let alone the d-bus connections)	16:02
sadmac2	Keybuk: yes. that is true...	16:02
sadmac2	Keybuk: wouldn't it be better to just stop taking input and flush all those out?	16:04
sadmac2	well, the way blocking is done now you still need the event queue	16:04
sadmac2	no that won't work yet..	16:05
Keybuk	the problem is you need to be in a state where all services are running or waiting	16:05
Keybuk	and all tasks are waiting	16:05
Keybuk	it may not be possible to be in that state	16:05
sadmac2	yeah. I saw it captured the other states but didn't look at what it needed to advance out of them...	16:06
sadmac2	time to stop trusting patches from strangers	16:06
sadmac2	Keybuk: wb	16:10
ion	Instead of /etc/init/dbus-reconnect.conf, why not do telinit q in /etc/init.d/dbus, and then in /etc/init/dbus.conf whenever it’s moved over?	16:14
Keybuk	ion: this was a simpler hack	16:15
=== robbiew is now known as robbiew-afk
=== robbiew-afk is now known as robbiew
