[04:01] <StevenK> wgrant: https://code.launchpad.net/~stevenk/launchpad/no-completed-jobs-for-celery/+merge/168859 with http://pastebin.ubuntu.com/5760296/ to claw it back to net-negative if you approve
[04:04] <wgrant> StevenK: I'm still not entirely sure that that's the right place to put it
[04:05] <wgrant> And we also need to handle FAILED
[04:05] <wgrant> And RUNNING
[04:05] <wgrant> We handle that in the normal runner by only retrieving jobs that are WAITING
[04:05] <wgrant> Ideally the list of jobs from celery should be filtered at a similar level -- above runJob
[04:05] <StevenK> RUNNING should be handled by acquireLease
[04:08] <StevenK> wgrant: I wasn't able to figure out which code path pulls job off the queue, so we could make it skip and drop jobs that aren't WAITING
[04:14] <wgrant> StevenK: lazr.jobrunner.celerytask.RunJob.run is the relevant method.
[04:14] <wgrant> Difficult to override without just copying it
[04:16] <StevenK> It just calls job_source.get, we could make it return None if the job is not WAITING and then have run pike if the job is None
[04:16] <wgrant> That sounds less than ideal
[04:16] <StevenK> Right
[04:19] <wgrant> StevenK: I'd put a waitingness check between the get and acquireLease
[04:20] <wgrant> Probably just in a local override for now
[04:20] <StevenK> wgrant: But isn't that too high again? That isn't pulling stuff off the queue, it's running the job
[04:23] <wgrant> StevenK: That's the celery task
[04:23] <wgrant> It's a RunJob task that is given to celery
[04:23] <wgrant> It examines and runs the job
[04:24] <wgrant> It's the appropriate level
[04:24] <wgrant> And indeed the highest level that isn't inside celery
[04:24]  * StevenK stabs LP
[04:24] <nigelb> LP will stab you back... :p
[04:24] <StevenK> Now this script is causing an OOPS because it REALLY REALLY REALLY wants to send a mail
[04:25] <StevenK> nigelb: Duh. I have scars.
[04:25] <nigelb> Heh
[04:28] <StevenK> wgrant: lp.services.job.celeryjob.CeleryRunJob is a subclass of RunJob, we could check the status is WAITING pretty easily and then not call super() ?
[04:29] <wgrant> StevenK: Except that the superclass method is the bit that calls job.get
[04:29] <wgrant> I'd just override the method and we can push it upstream into that project that nobody else uses eventually
[04:30] <StevenK> wgrant: Or I could just fix it in lazr.jobrunner?
[04:31] <wgrant> StevenK: You could. I guess adding the extra bit to the interface isn't a problem, as there's only one implementation.
[04:49] <StevenK> wgrant: http://pastebin.ubuntu.com/5760341/
[04:50] <StevenK> wgrant: I couldn't work out how to get the enum into jobrunner
[04:54] <StevenK> wgrant: And I can't work out how to test it.
[04:57] <wgrant> StevenK: I'd consider 'if not job.canRun()', perhaps
[04:58] <StevenK> wgrant: Then it can go live in jobrunner if it's that.
[04:58] <wgrant> StevenK: That's the point :)
[05:12] <StevenK> wgrant: Then it turns into a two liner, because I can't work out how to get at the logger
[05:13] <wgrant> StevenK: :(
[05:39] <StevenK> wgrant: https://code.launchpad.net/~stevenk/lazr.jobrunner/only-run-if-canrun/+merge/169105
[05:41] <wgrant> StevenK: That doesn't break its tests?
[05:41]  * StevenK finds out
[05:46] <StevenK> FileJob\\\' object has no attribute \\\'canRun\
[05:46] <StevenK> Pity
[05:56] <StevenK> wgrant: This would be easier if r49 didn't fail tests locally.
[06:22] <StevenK> wgrant: From what I can see, both r49 and r51 with my changes fail locally in the same way.
[06:22] <wgrant> StevenK: Can you fix the test failure?
[06:24] <StevenK> wgrant: I don't get why r49 fails
[06:24] <StevenK> r49 being 'trunk'
[23:44] <StevenK> wgrant: So you're not happy wih https://code.launchpad.net/~stevenk/lazr.jobrunner/only-run-if-canrun/+merge/169105 ?
[23:48] <wgrant> StevenK: Ah, that's right, I got distracted by the test suite hanging and then other things
[23:53] <StevenK> wgrant: The test suite breaks for me locally, but the failures from trunk are identical at least
[23:54] <wgrant> StevenK: Right, now that you've fixed the test job implementation
[23:58] <StevenK> wgrant: Happy to build 0.12, stuff it into sourcedeps and then toss a branch that jumps to 0.12 at ec2
[23:58] <wgrant> StevenK: As long as you also add a canRun method to the LP job implementations...
[23:59] <StevenK> Oh, I thought that exists
[23:59] <StevenK> *existed
[23:59] <StevenK> Let me find the base class
[23:59] <wgrant> No
[23:59] <wgrant> There's actually already an is_pending method, which is WAITING/RUNNING/SUSPENDED
[23:59] <wgrant> Might be better to make canRun() is_runnable