/srv/irclogs.ubuntu.com/2011/11/09/#upstart.txt

=== Md is now known as Guest18462
=== Md_ is now known as Md
twbFor lucid, is this the right way to do multiple instances?  http://paste.debian.net/144039/02:57
twbHow the hell do you get a list of all active instances of a job03:44
brodertwb: initctl list will print out all active instances03:44
twbAh, OK, and then grep my job name out of that03:47
twbYeah initctl list | sed -rn 's/^network-interface \((.*)\).*/\1/p'03:49
twbinitctl list | sed -rn 's/^network-interface \((.*)\).*/\1/p' | while read url; do stop streamripper URL=$url; done03:50
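
Editor's note: twb's pipeline can be wrapped in a small helper. This is a minimal sketch, assuming `initctl list` prints instances as `job (instance) goal/state`, as in the gunicorn line quoted later in this log. The function name `instances_of` is hypothetical and not part of Upstart.

```shell
# instances_of JOB: read `initctl list` output on stdin and print one
# instance name per line for the given job (hypothetical helper).
instances_of() {
    job=$1
    # Matching lines look like: "gunicorn (julkort/julkort:app) start/running"
    sed -n "s|^${job} (\(.*\)) .*|\1|p"
}

# Usage on a box running Upstart (left as a comment so the sketch stays inert):
#   initctl list | instances_of streamripper | while read -r url; do
#       stop streamripper URL="$url"
#   done
```

Jobs without instances (no parenthesised part) are skipped, which matches what the original sed expression does.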
lericsonGuys, seriously -- is "Job failed to start" the best diagnostics message I can get?10:15
lericsonI'll tell you what, it does not suffice.10:15
lericsonAlso, can I get the instance identifier? I.e. with `instance $X/$Y` I want a variable containing "$X/$Y", potentially named $INSTANCE. (I would TIAS but it's impossible to diagnose errors.)10:17
ionAs init(5) says, UPSTART_INSTANCE.10:18
ionAnything in syslog? If not, make your scripts log something.10:18
lericsonNov  9 10:12:02 localhost init: gunicorn (julkort/julkort:app) pre-start process (32452) terminated with status 110:18
lericsonIt says10:18
lericsonWhich is a little better I agree, but I want the stderr output10:19
ionpre-start script10:19
ion  exec >>/tmp/foo 2>&110:19
ion  …10:19
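
Editor's note: the effect of ion's `exec >>/tmp/foo 2>&1` line can be tried outside Upstart. The sketch below is only an illustration in plain shell. `run_logged` is a hypothetical helper, and the `set -x` tracing is an optional extra that ion did not mention.

```shell
# run_logged LOG CMD...: run CMD with stdout and stderr appended to LOG,
# the same redirection ion suggests for the top of a pre-start script.
run_logged() {
    log_file=$1
    shift
    (
        exec >>"$log_file" 2>&1   # from here on, all output goes to the log
        set -x                    # optional: trace each command into the log too
        "$@"
    )
}

# Example: run_logged /tmp/foo sh -c 'echo checking; ls /nonexistent'
```

The subshell keeps the redirection from leaking into the calling shell, and the function returns the exit status of the command, so a failing pre-start step is still visible to the caller.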
jhuntlericson: http://upstart.ubuntu.com/cookbook/#debugging10:20
jhuntlericson: Have you checked your job with init-checkconf (http://upstart.ubuntu.com/cookbook/#init-checkconf)? Also, you may be affected by "set -e". Read this snippet for details: http://upstart.ubuntu.com/cookbook/#debugging-a-script-which-appears-to-be-behaving-oddly10:23
lericsonAh, thanks -- seems my pre-start script wasn't to its liking, will look into later.10:27
lericsonBut now I get this weird behavior, initctl list says:10:27
lericsongunicorn (julkort/julkort:app) start/running10:27
lericsonYet the process is dead, and the syslog says "main process ended, respawning" twice with one second in between10:27
lericsonI'd expect it to respawn it 10 times within 15 seconds?10:27
lericsonOr does it somehow consider it running?10:28
lericson(It would seem to be the case given the list output.)10:28
ionAdd similar logging to the main process and see why it dies immediately.10:28
lericsonjhunt: And I don't have init-checkconf on this server, no idea how to get it either.10:29
jhuntlericson: sounds like you're running lucid?10:29
lericsonion: It dies immediately because I told it to, but I want to know why Upstart isn't doing as advertised (respawning is a major point)10:29
lericsonjhunt: Yes, LTS10:29
jhuntlericson: do this then - http://upstart.ubuntu.com/cookbook/#older-versions-of-upstart10:29
lericsonThank you! I give you two 5/5 in friendliness, expertise and professionalism-- feels like customer support.10:31
jhuntlericson: I'd guess you hadn't told it to respawn. If you have a service you want to respawn, you need to add the "respawn" stanza to your .conf file (see "man 5 init" for details).10:32
lericsonI like the idea of Upstart by the way, I was a Gentooist for the longest time but I think Upstart > OpenRC10:32
lericsonI did, jhunt10:32
ionPlease paste your job definition and the relevant lines from syslog.10:32
lericsonRelevant might be that the service daemonizes itself (I do have expect fork)10:32
ionAre you sure the service forks exactly once?10:33
jhuntlericson: when you say "daemonizes", do you mean that? If so, you need to specify "expect daemon" (2 forks) rather than "expect fork" (1 fork)10:33
jhuntlericson: this is a known problem - if you don't know how many times your app forks and mis-specify it, Upstart is unable to "track" the pid so you can get odd behaviour. We intend to fix this issue for the next LTS (it's a difficult problem to address).10:34
lericsonhttp://pb.lericson.se/p/SBiMze/10:34
lericsonI know, jhunt -- I haven't read gunicorn's source lately, so I don't know, but I would guess they don't daemonize properly.10:34
lericsonBut you're saying this is a symptom of it losing the PID?10:35
jhuntlericson: quite possibly. I'd recommend running "strace -fFv -o /tmp/strace.log /usr/bin/gunicorn ..." and grepping for fork/clone calls in /tmp/strace.log.10:36
lericsonAlso by the way, my pre-start script seems to check out fine.10:36
ionIt may be simpler just to tell the main process not to daemonize and drop the “expect” stanza.10:37
lericsonWell, interestingly http://gunicorn.org/faq.html#gunicorn-fails-to-start-with-upstart10:37
jhuntlericson: also, your "stop on" looks wrong - there is no standard "shutdown" event on Lucid. You generally specify "stop on runlevel [016]" (ie stop on halt/single-user mode/reboot).10:38
lericsonOh, I ripped that off of mysql10:38
lericsonNo, I didn't.10:39
lericsonI got it somewhere anyway10:39
lericsonSo now initctl stop blah just sits there, apparently doing nothing.10:45
lericsonsyslog is empty10:45
ionI take it initctl status says “stop/running” for the job?10:46
lericson^C'd it and it says gunicorn (julkort/julkort:app) stop/killed, process 3257210:46
ionah, stop/killed indeed.10:46
ionAnother symptom of lying to Upstart about the forking behavior of the main process. :-) Run workaround-upstart-snafu 32572. http://heh.fi/tmp/workaround-upstart-snafu10:46
jhuntI'd recommend: (1) killing the pid of your gunicorn process manually, then (2) retrying the "stop". Then, (3) comment out "respawn" and "expect fork" and start the job. If the reported pid is correct, gunicorn isn't forking at all and you can then just re-enable the respawn.10:50
jhuntIf the pid is wrong, stop the job again (kill + stop), and try adding "expect daemon".10:50
lericsonion: It forks until it gets the same pid and then dies? :|10:51
ionyeah10:51
jhuntThe maximum number of forks an app will do is 2 (there is no benefit doing more)10:51
lericsonjhunt: It's dead already, not even a zombie process10:51
lericsonOut of curiosity, how does it track forks?10:52
jhuntso does "gunicorn --daemon" coupled with "expect daemon" work for you?10:53
jhuntlericson: it uses ptrace10:53
lericsonjhunt: I don't know, I can't even get it to start again.10:53
lericsonHaven't tried ion's workaround (no ruby)10:53
jhuntlericson: We will be making some improvements to the tracking this cycle and may start using cgroups at a later stage. However, 99% of apps can be handled via ptrace.10:54
ionWhen status says stop/killed with an inexistent process, Upstart is in a confused state, perpetually expecting to receive a SIGCHLD for a process with that PID. The workaround provides such a process. :-P10:55
ionjhunt: cgroups, huh? Didn’t Keybuk come up with a proc connector implementation much superior to a cgroups implementation? There’s even a working prototype.10:56
lericsonCan't I restart the daemon? >_>10:56
jhuntion: Keybuk and I discussed both options recently. The proc connector has issues.10:56
ionok10:57
jhuntion: specifically wrt containerized environments10:57
ionIs there a fix for the cgroups issues?10:57
jhuntlericson: unfortunately, if the pid being reported by upstart no longer exists, nominally you'll have to reboot to clear Upstart's knowledge of that job. If you're on a dev box, just copy your .conf file to a new name and work with that newly named job for the meantime.10:58
ion(or use workaround-upstart-snafu)10:59
lericsonSo quite predictably, running a tool that is essentially a fork bomb fork bombed the machine11:02
ionIt’s not a fork bomb.11:02
jhuntlericson: seems that gunicorn doesn't fork at all by default.11:02
lericsonjhunt: No, but it wants to11:03
jhuntlericson: hence, you don't need "expect" at all.11:03
lericsonAs linked above, their FAQ explicitly says "use --daemon"11:03
jhuntlericson: have you checked your gunicorn config file (daemon=False|True)?11:04
ionThe script has no more than three processes at any time.11:04
lericsonjhunt: I did, and the reason Upstart got confused is me11:04
lericsonHowever I'm more focused on fixing the issue than placing blame :-)11:04
jhuntlericson: so, any joy with copying the .conf file and modifying it?11:05
lericsonI'm going with that approach yeah, thanks for the tip-- other question: $UPSTART_INSTANCE expands to $UPSTART_INSTANCE (or rather doesn't expand at all) in my exec line11:06
lericsonAh, mystery is solved; I had expect fork but Gunicorn does proper daemonization https://github.com/benoitc/gunicorn/blob/master/gunicorn/util.py#L28011:08
jhuntlericson: I thought we'd already come to that conclusion :)11:10
jhuntso you should be able to run gunicorn without --daemon and without the "expect" stanza, or with --daemon and "expect daemon". If one/both of these work, I'd appreciate it if you could let the gunicorn people know so they can update their faq (and maybe provide a "debian/gunicorn.upstart" in their distribution files)11:11
lericsonjhunt: I'll be contributing this upstart script to gunicorn most likely, and yes it does work as intended with --daemon and expect daemon11:13
lericsonWhich seems pretty obvious now that I say it, but hey11:13
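
Editor's note: lericson's actual job file is only linked, not quoted, so the following is a hypothetical reconstruction of the combination he reports as working (`--daemon` plus `expect daemon`). The instance variable `APP`, the file name and the helper `write_gunicorn_job` are assumptions for illustration. The `/usr/bin/gunicorn` path and the `stop on runlevel [016]` line are taken from jhunt's messages.

```shell
# write_gunicorn_job DIR: write a sketch of the job file into DIR
# (DIR would be /etc/init on a real system; use a scratch directory to experiment).
write_gunicorn_job() {
    cat > "$1/gunicorn.conf" <<'EOF'
# Hypothetical job file; APP is an illustrative instance variable.
description "gunicorn instance"

stop on runlevel [016]

instance $APP

respawn
# With --daemon gunicorn daemonizes itself, so Upstart must expect that.
expect daemon

# Keep variable expansion in exec: per the log, it did not expand in env lines.
exec /usr/bin/gunicorn --daemon $APP
EOF
}

# Example: write_gunicorn_job /tmp && cat /tmp/gunicorn.conf
```

An instance would then be started with something like `start gunicorn APP=julkort:app`. The other variant jhunt mentions is to drop both `--daemon` and the `expect` line, so that gunicorn stays in the foreground.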
jhuntlericson: great - thanks very much!!11:14
lericsonBTW, OpenRC has this /etc/init.d/foo zap command which is extremely useful in situations like these as I am sure you're aware, is this avoided on a philosophical basis or just not implemented?11:15
jhuntjhunt: I'll update the cookbook with a section on how to identify which stanzas you need for daemons of various types hopefully this week...11:15
jhuntlericson: well, historically "both", but we do intend to resolve this issue in the current cycle. Since Upstart now supports user jobs, being able to kill a mis-specified user job is particularly important as user jobs can be used as a testing ground for system jobs.11:17
lericsonISTM a lot of things can make Upstart get stuck, nginx has the same issue.11:28
lericsonSo the same freeze effect happens if the post-stop script fails?11:38
lericsonHey, nevermind11:39
lericsonTurns out it doesn't freeze11:40
jhuntlericson: it can be tricky to write a new .conf file, particularly when it isn't obvious what the daemon is actually doing. However, once Upstart knows how to establish the pid of your job, you're in good hands :)11:44
lericson(-:11:48
lericsonIt occurred to me that you can't really expand variables in the `exec` line, is this true?11:49
lericsonOr rather they need to be isolated somehow?11:50
jhuntlericson: expansion should work fine. Maybe you've somehow got some unicode dollar/underscores in the original version of your .conf file?11:58
lericsonjhunt: It was in the env lines13:06
jhuntlericson: aha!13:13
lericsonSo how does one restart a service with Upstart?13:40
lericsoninitctl restart <job>13:41
lericsonOk, thanks.13:41
lericsonDoes this first stop the job and then start it again, or how does it work?13:41
jYyes that's how it works13:44
lericsonSo, is there a way to add a reload event? I was thinking along the lines of initctl emit reload JOB=nginx14:15
lericsonHmmmm14:16
SpamapSlericson: restart stops and starts the job without reloading the job config14:33
SpamapSlericson: initctl reload will tell upstart to send SIGHUP to the tracked process14:33
lericsonSpamapS: The daemon I'm writing the script for cycles its PID on HUP.14:43
lericsonSo I have a task without an exec or script section, only post-start and post-stop. It is meant to start a set of services on boot14:44
lericsonThe problem is that it does nothing.14:45
lericsonIn fact starting it just makes initctl stop14:45
=== ion_ is now known as ion
SpamapSlericson: yeah, you have to use stop/start for that15:00
SpamapSI actually don't like restart for most operations. :-/15:01
lericsonI think it's a mistake to assume that all daemons reload on SIGHUP without changing PID, and also a mistake to assume that all daemons are best restarted via stop + start.15:01
codebeakerlericson: I think there's a missing paradigm [:start, :stop, :restart (stop, something else?, :start), :reload]15:02
codebeaker^ would fit the way daemons actually need to work15:03
lericsonon reload exec nginx -s reload15:04
lericsonmake it go!15:05
codebeaker^ lericson ?15:07
codebeakeris that valid "on reload …………"15:07
codebeaker?15:07
lericsonNope, but that's what I want.15:10
codebeaker^ right :)15:12
SpamapSYou can change the kill signal now... but I don't think you can change the reload signal. :-P15:14
SpamapSlericson: it's not meant to do all service orchestration for you. It's meant to facilitate starting and stopping of services. You can write a script to do complex interactions like that.15:15
SpamapSI agree that assuming behaviors of tracked daemons causes as many problems as it solves.15:16
codebeakerSpamapS: HUP certainly wasn't designed historically to tell a process to RELOAD15:16
SpamapSindeed, it just became a convention of nearly every daemon ;)15:23
SpamapSsince daemons, by definition, do not have a terminal, "the terminal hung up" isn't meaningful for them in its original context15:24
codebeakerthat's a point I hadn't considered15:31
codebeakerbut, what's stopping people inventing new signals ?15:31
codebeaker(theoretically?)15:31
lericsonStandardization across platforms15:42
codebeakersure, but upstart introduced an arbitrary signaling interface, did it not ?15:59
codebeakeror, "Eventing" rather15:59
=== ion_ is now known as ion
