[08:46] <urban> hi, how can i prevent the sshd service from starting more than once?
[06:10] <_ion> Well? :-)
[06:16] <Keybuk> still thinking
[06:16] <Keybuk> the typical problem can be defined as the "I just upgraded apache but haven't restarted it yet" one
[06:16] <Keybuk> you'd have an apache job, marked for deletion, that's still running
[06:17] <Keybuk> and a newer version of the apache job that's no longer running
[06:17] <Keybuk> you'd want "status apache" to show them both
[06:17] <Keybuk> e.g.
[06:17] <Keybuk> # status apache
[06:17] <Keybuk> apache (start; deleted) running, process 12345
[06:17] <Keybuk> apache (stop) waiting
[06:17] <Keybuk> #
[06:18] <_ion> Right.
[06:18] <Keybuk> in which case, that's identical to "list" :p
[06:19] <_ion> Hmm
[06:20] <_ion> Perhaps list should accept a search pattern, so 'list apache' as well as 'list pach' would list both of them, but 'status apache' would be guaranteed to list only zero or one entries; in that case, "apache (start; deleted) running, process 12345".
[06:20] <Keybuk> so what criterion would status use to determine which one of the multiple apache jobs to return?
[06:22] <_ion> The one that is running. :-) But that criterion fails if there are multiple instances running simultaneously...
[06:23] <_ion> In the case of a single instance and an updated config file, the "apache (stop) waiting" would be kind of redundant information from the user's point of view, as the previous line already says "deleted".
[06:24] <Keybuk> interesting
[06:25] <_ion> This is just thinking out loud, i definitely haven't thought the whole thing through. :-)
[06:25] <Keybuk> you can technically start the new apache without first stopping the old one
[06:26] <_ion> I'm not sure that should be allowed, if the conffile doesn't allow instances.
[06:27] <Keybuk> aye, not sure how to prevent that yet :p
[06:29] <_ion> apache (deleted) running, apache (stop) waiting: any goal changes apply to the former, *until* it reaches the 'waiting' state. Then it vanishes and goal changes apply to the latter.
[06:30] <Keybuk> what if the two have different event matches, and the event sequence for the newer one happens? :p
[06:30] <_ion> Good question. :-)
[06:31] <Keybuk> for instance jobs, this gets even more interesting :p
[06:31] <Keybuk> for those, you probably *do* want the newer one to be the one that's started, and not the older one
[06:31] <_ion> Yeah.
[06:34] <_ion> "what if the two have different event matches, and the event sequence for the newer one happens?" Intuitively, i'd expect upstart to ignore that, until the running, deleted job is gone. Or can you think of a reason that would be a bad thing?
[06:35] <Keybuk> instance jobs
[06:35] <_ion> Yeah, i specifically meant non-instance jobs.
[06:35] <Keybuk> not off-hand
[06:36] <_ion> The rules would be different for instance jobs.
[06:36] <Keybuk> defining a sensible set of rules is the tricky bit ;)
[06:36] <_ion> Another interesting question: what happens if a job's multi-instance flag is changed :-)
[06:36] <Keybuk> how can it be changed?
[06:37] <_ion> I mean, the deleted conffile says 'instance' and the new one doesn't, or vice versa.
[06:37] <Keybuk> the new conffile creates a new job
[06:37] <Keybuk> that's kinda the point :p
[06:37] <Keybuk> and why you end up with deleted jobs for the old conffile
[06:39] <_ion> Yes. I mean:
[06:39] <Keybuk> oh, you mean wrt instance behaviour?
[06:39] <Keybuk> yes, I see your point
[06:39] <_ion> If there's a rule that:
[06:40] <_ion>  not 'instance': the new one can't be started if the old one is still running
[06:40] <_ion>  'instance': new instances are started based on the new conffile's rules
[06:41] <_ion> what to do when the 'instance' setting is changed :-)
[06:41] <Keybuk> *thinks*
[06:41] <Keybuk> we could define the following rule
[06:41] <Keybuk>  * For any job name, there is one master job that holds that name.
[06:42] <Keybuk>  * That job may have multiple instances
[06:42] <Keybuk>  * That job may also have a potential replacement
[06:42] <Keybuk> -- 
[06:43] <Keybuk> parsing a new conffile would locate the existing job, destroy any existing replacement, and place itself as the replacement of that job
[06:43] <Keybuk> it would not be a "master" until the job it replaces has been deleted
[06:44] <Keybuk> instances are also not a "master" because they're an instance of another job
[06:44] <Keybuk> --
[06:44] <Keybuk> start, stop & status would all operate on the current master job
[06:44] <Keybuk> they may also, through specific job id, operate on an instance of the job
[06:45] <Keybuk> but may never operate on a replacement job while the master exists
[06:45] <_ion> would stopping the master job kill all the instances?
[06:45] <Keybuk> good question
[06:46] <Keybuk> the above would solve the instance/not-instance problem -- whatever flag is "current" applies, and applies until the current job is gone
[06:46] <_ion> Yeah.
[06:48] <_ion> From the point of view of a daemon's postinst script, it might be best if it's certain that 'stop foobard && start foobard' actually kills any instances of the old version. It might be a critical security update.
[06:52] <Keybuk> yeah
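
The master/replacement rule proposed above could be sketched roughly as follows. This is only an illustration in Python, not upstart's actual code: the names `Job`, `Registry`, `register_conffile` and `find_by_name` are hypothetical, and the sketch assumes a job name maps to exactly one master at a time.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Job:
    """Hypothetical job record; field names are assumptions, not upstart's."""
    name: str
    is_instance_job: bool = False            # conffile says 'instance'
    running: bool = False
    replacement: Optional["Job"] = None      # at most one pending replacement
    instances: List["Job"] = field(default_factory=list)


class Registry:
    """Maps each job name to the single master job holding that name."""

    def __init__(self) -> None:
        self.masters: Dict[str, Job] = {}

    def register_conffile(self, new_job: Job) -> None:
        """Called when a conffile has been parsed into new_job."""
        master = self.masters.get(new_job.name)
        if master is None:
            # Nothing holds the name yet, so the new job is the master.
            self.masters[new_job.name] = new_job
        else:
            # Any earlier replacement is discarded; the new job waits
            # behind the master until the master has been deleted.
            master.replacement = new_job

    def find_by_name(self, name: str) -> Optional[Job]:
        """start, stop and status resolve a name to the master only."""
        return self.masters.get(name)
```

With this shape, parsing two updated conffiles in a row while the old apache is still running leaves the old job as master and only the most recent parse as its replacement.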
[07:04] <Keybuk> should sysadmins be allowed to explicitly start a replacement job by id?
[07:06] <_ion> What would be a use case for that?
[07:06] <Keybuk> what would be the use case for not allowing that?
[07:07] <Keybuk> if they've identified it explicitly by id, should that be considered valid?
[07:08] <_ion> Sorry, i'm not exactly sure what you mean by starting a replacement job by id. Is that the id of an instance?
[07:08] <Keybuk> all jobs have a unique id
[07:08] <Keybuk> so you can do something like
[07:08] <Keybuk> # initctl list -v apache
[07:08] <Keybuk> apache [#1234]  (start) running, process 12345
[07:08] <Keybuk> apache [#1980]  (stop) waiting
[07:08] <Keybuk> # 
[07:09] <Keybuk> where the 1980 one is the "new" replacement for the other
[07:09] <Keybuk> if you then did
[07:09] <Keybuk> # start --id 1980
[07:09] <Keybuk> should it give you an "unknown job" error, because it's not yet replaced #1234, or should it just do it since you were explicit
[07:10] <_ion> I'm still thinking it should be enforced that no two apaches can be running simultaneously if it's not an 'instance' job.
[07:11] <Keybuk> even though the sysadmin was clearly trying?
[07:11] <Keybuk> it's easy enough to add some code to enforce that
[07:12] <_ion> Yes, even though she was clearly trying. :-) But that's only my personal opinion.
[07:13] <_ion> There could be an informational message, like "Please stop the deleted job before starting the new one".
[07:15] <_ion> If the job doesn't specifically say 'instance', i think it's a mistake from the sysadmin to try to force multiple instances to run.
[07:15] <_ion> E.g. apache: the second instance is going to try to listen on the same port, write to the same logfiles, etc.
[07:18] <Keybuk> yeah
[07:19] <Keybuk> ok
[07:19] <Keybuk> so how about the following:
[07:20] <Keybuk> a job is in the DELETED state if
[07:20] <Keybuk> - it's an instance which has finished
[07:20] <Keybuk> - it's a non-instance or instance-less job that has been replaced
[07:20] <Keybuk> when we move a job with a replacement into the DELETED state, we unmark the replacement so it becomes the new master
[07:21] <Keybuk> when finding a job by name, we never return instances or replacements
[07:21] <Keybuk> we don't allow spawning an instance, changing the goal or handling events for DELETED jobs or replacements
[07:22] <Keybuk> (technically we don't allow spawning an instance of instances either, but that's handled differently by redirection)
[07:23] <_ion> Sounds good.
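
A minimal sketch of the DELETED-state rules just listed, again in Python and purely illustrative. `State`, `Job`, `accepts_changes` and `mark_deleted` are hypothetical names, and the sketch leaves open exactly when a replaced job is moved into DELETED, since the discussion does not settle that.

```python
from enum import Enum, auto
from typing import Optional


class State(Enum):
    WAITING = auto()
    RUNNING = auto()
    DELETED = auto()


class Job:
    """Hypothetical job record used only for this sketch."""

    def __init__(self, name: str, is_replacement: bool = False) -> None:
        self.name = name
        self.state = State.WAITING
        self.is_replacement = is_replacement
        self.replacement: Optional["Job"] = None


def accepts_changes(job: Job) -> bool:
    """Goal changes, events and instance spawning are refused for
    DELETED jobs and for replacements that are not yet the master."""
    return job.state is not State.DELETED and not job.is_replacement


def mark_deleted(job: Job) -> Optional[Job]:
    """Move job into DELETED and unmark its replacement, if any,
    so that the replacement becomes the new master."""
    job.state = State.DELETED
    new_master = job.replacement
    if new_master is not None:
        new_master.is_replacement = False
        job.replacement = None
    return new_master
```

Under these assumptions an explicit "start --id" on a replacement would be rejected by `accepts_changes` until `mark_deleted` has run on the job it replaces.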
[07:24] <_ion> Hmm, perhaps present the job IDs to the user in base-36? :-) '1234' would be 'ya' and '1980' would be '1j0'. They'd be shorter and perhaps even easier to remember.
[07:25] <Keybuk> heh
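
The base-36 suggestion can be checked with a few lines of Python. The helper name `to_base36` is made up for this example; the digit alphabet 0-9 followed by a-z is the usual convention and matches what Python's built-in `int(text, 36)` parses.

```python
import string

# Digits 0-9 followed by a-z, the conventional base-36 alphabet.
DIGITS = string.digits + string.ascii_lowercase


def to_base36(n: int) -> str:
    """Render a non-negative job id in base 36."""
    if n < 0:
        raise ValueError("job ids are assumed to be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, remainder = divmod(n, 36)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


print(to_base36(1234))  # ya
print(to_base36(1980))  # 1j0
print(int("1j0", 36))   # 1980, converting back with the built-in parser
```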
[07:26] <Keybuk> so, "start apache" would only ever start the current version
[07:26] <Keybuk> it'd say unchanged if it was running, might spawn a new instance, or just start the real thing
[07:26] <Keybuk> "stop apache" would stop the current version
[07:26] <Keybuk> it'd say unchanged if it wasn't running, might stop all instances, or just stop the real thing
[07:27] <Keybuk> "status apache" would query the current version
[07:27] <Keybuk> it might return a single status, or a list of instances with the master first
[07:37] <_ion> Yeah.
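
The start/stop behaviour described above might look something like this in Python. It is a sketch under assumptions: the `Job` fields and the returned strings are invented for illustration, instances are tracked as plain numbers, and "stop" is assumed to report "unchanged" when there is nothing running to stop.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    """Hypothetical current (master) job for a given name."""
    name: str
    instance: bool = False                  # conffile says 'instance'
    running: bool = False
    instances: List[int] = field(default_factory=list)  # instance numbers


def start(job: Job) -> str:
    if job.instance:
        # Instance jobs spawn a fresh instance on every start.
        job.instances.append(len(job.instances) + 1)
        return "spawned instance"
    if job.running:
        return "unchanged"
    job.running = True
    return "started"


def stop(job: Job) -> str:
    if job.instance:
        if not job.instances:
            return "unchanged"
        # Stopping the master stops every instance of it.
        job.instances.clear()
        return "stopped all instances"
    if not job.running:
        return "unchanged"
    job.running = False
    return "stopped"
```

Both functions only ever receive the current master, so a replacement job is never touched while the master exists.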
[10:17] <Keybuk> dunno what next door are doing
[10:17] <Keybuk> lots of banging and drilling sounds