StevenK | wgrant: https://code.launchpad.net/~stevenk/launchpad/no-completed-jobs-for-celery/+merge/168859 with http://pastebin.ubuntu.com/5760296/ to claw it back to net-negative if you approve | 04:01 |
---|---|---|
wgrant | StevenK: I'm still not entirely sure that that's the right place to put it | 04:04 |
wgrant | And we also need to handle FAILED | 04:05 |
wgrant | And RUNNING | 04:05 |
wgrant | We handle that in the normal runner by only retrieving jobs that are WAITING | 04:05 |
wgrant | Ideally the list of jobs from celery should be filtered at a similar level -- above runJob | 04:05 |
StevenK | RUNNING should be handled by acquireLease | 04:05 |
StevenK | wgrant: I wasn't able to figure out which code path pulls job off the queue, so we could make it skip and drop jobs that aren't WAITING | 04:08 |
wgrant | StevenK: lazr.jobrunner.celerytask.RunJob.run is the relevant method. | 04:14 |
wgrant | Difficult to override without just copying it | 04:14 |
StevenK | It just calls job_source.get, we could make it return None if the job is not WAITING and then have run pike if the job is None | 04:16 |
wgrant | That sounds less than ideal | 04:16 |
StevenK | Right | 04:16 |
wgrant | StevenK: I'd put a waitingness check between the get and acquireLease | 04:19 |
wgrant | Probably just in a local override for now | 04:20 |
StevenK | wgrant: But isn't that too high again? That isn't pulling stuff off the queue, it's running the job | 04:20 |
wgrant | StevenK: That's the celery task | 04:23 |
wgrant | It's a RunJob task that is given to celery | 04:23 |
wgrant | It examines and runs the job | 04:23 |
wgrant | It's the appropriate level | 04:24 |
wgrant | And indeed the highest level that isn't inside celery | 04:24 |
* StevenK stabs LP | 04:24 | |
nigelb | LP will stab you back... :p | 04:24 |
StevenK | Now this script is causing an OOPS because it REALLY REALLY REALLY wants to send a mail | 04:24 |
StevenK | nigelb: Duh. I have scars. | 04:25 |
nigelb | Heh | 04:25 |
StevenK | wgrant: lp.services.job.celeryjob.CeleryRunJob is a subclass of RunJob, we could check the status is WAITING pretty easily and then not call super() ? | 04:28 |
wgrant | StevenK: Except that the superclass method is the bit that calls job.get | 04:29 |
wgrant | I'd just override the method and we can push it upstream into that project that nobody else uses eventually | 04:29 |
StevenK | wgrant: Or I could just fix it in lazr.jobrunner? | 04:30 |
wgrant | StevenK: You could. I guess adding the extra bit to the interface isn't a problem, as there's only one implementation. | 04:31 |
StevenK | wgrant: http://pastebin.ubuntu.com/5760341/ | 04:49 |
StevenK | wgrant: I couldn't work out how to get the enum into jobrunner | 04:50 |
StevenK | wgrant: And I can't work out how to test it. | 04:54 |
wgrant | StevenK: I'd consider 'if not job.canRun()', perhaps | 04:57 |
StevenK | wgrant: Then it can go live in jobrunner if it's that. | 04:58 |
wgrant | StevenK: That's the point :) | 04:58 |
=== tasdomas_afk is now known as tasdomas | ||
StevenK | wgrant: Then it turns into a two liner, because I can't work out how to get at the logger | 05:12 |
wgrant | StevenK: :( | 05:13 |
StevenK | wgrant: https://code.launchpad.net/~stevenk/lazr.jobrunner/only-run-if-canrun/+merge/169105 | 05:39 |
wgrant | StevenK: That doesn't break its tests? | 05:41 |
* StevenK finds out | 05:41 | |
StevenK | FileJob\\\' object has no attribute \\\'canRun\ | 05:46 |
StevenK | Pity | 05:46 |
StevenK | wgrant: This would be easier if r49 didn't fail tests locally. | 05:56 |
StevenK | wgrant: From what I can see, both r49 and r51 with my changes fail locally in the same way. | 06:22 |
wgrant | StevenK: Can you fix the test failure? | 06:22 |
StevenK | wgrant: I don't get why r49 fails | 06:24 |
StevenK | r49 being 'trunk' | 06:24 |
=== tasdomas is now known as tasdomas_afk | ||
=== tasdomas_afk is now known as tasdomas | ||
=== wedgwood_away is now known as wedgwood | ||
=== tasdomas is now known as tasdomas_afk | ||
=== wedgwood is now known as wedgwood_away | ||
StevenK | wgrant: So you're not happy wih https://code.launchpad.net/~stevenk/lazr.jobrunner/only-run-if-canrun/+merge/169105 ? | 23:44 |
wgrant | StevenK: Ah, that's right, I got distracted by the test suite hanging and then other things | 23:48 |
StevenK | wgrant: The test suite breaks for me locally, but the failures from trunk are identical at least | 23:53 |
wgrant | StevenK: Right, now that you've fixed the test job implementation | 23:54 |
StevenK | wgrant: Happy to build 0.12, stuff it into sourcedeps and then toss a branch that jumps to 0.12 at ec2 | 23:58 |
wgrant | StevenK: As long as you also add a canRun method to the LP job implementations... | 23:58 |
StevenK | Oh, I thought that exists | 23:59 |
StevenK | *existed | 23:59 |
StevenK | Let me find the base class | 23:59 |
wgrant | No | 23:59 |
wgrant | There's actually already an is_pending method, which is WAITING/RUNNING/SUSPENDED | 23:59 |
wgrant | Might be better to make canRun() is_runnable | 23:59 |
Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!