/srv/irclogs.ubuntu.com/2010/03/13/#launchpad-dev.txt

peitschiewgrant: many thanks again :)00:00
wgrantnp00:00
wgrantHmmm.00:06
wgrantI just got a really foul OOPS that I both shouldn't have got, and is broken.00:06
wgrantThere is a self-closing <script> in the header, so the page is just about empty.00:07
wgrant(edge)00:07
jtvhi wgrant, thanks for saving us from that dpkg path traversal vulnerability :)01:35
wgrantjtv: Morning.01:38
jtvwgrant: I've been up all night playing with representing different kinds of buildfarmjobs in the UI01:52
jtvMy current experiment goes something like this: delegate representing the current job to the buildqueue, which in turn delegates to specific_job.01:53
jtvI introduce two new interfaces: IBuildFarmBuildJob ("an IBuildFarmJob that also refers to a Build") and IBuildFarmBranchJob ("an IBuildFarmJob that's also an IBranchJob").01:54
jtvThese have their own templates to render them on the builder pages.01:54
jtvThere's a fallback for "just an IBuildFarmJob without a Branch or a Build," but currently I don't think anything would use that.01:54
jtvSo everything is based on a hierarchy of interfaces, with TAL at various levels of refinement.01:55
jtvNow, the builder-index TAL only has the "there is a current job" case and the "there is no current job" case.01:57
jtvIf there is a job, the rendering mostly happens in a BuildQueue view (the permissions check moves in there as well so the TAL only needs to deal with a single condition).01:57
jtvThe TAL for the BuildQueue then incorporates the polymorphic rendering of the BranchJob.01:58
jtvSorry, the BuildFarmJob01:58
wgrantRight, this is sort of what I was thinking.01:58
wgrantBut then people started throwing around ideas about redesigning the model.01:58
jtvI think know how you feel: "yes that has to happen, but given the choice between discussing that some more and taking the first useful step..."  :)02:00
wgrantHeh.02:00
wgrantYes.02:00
jtvFunny thing is, from where I'm standing, the whole where-do-we-keep-history thing seems to be entirely independent of this.02:01
lifelessbzr02:08
lifelesskthankssolveitonce02:08
jtvlifeless: ??02:10
lifelessjtv: 'where to store history' :)02:44
jtvah, that part I was emphatically not planning to solve anytime soon  :)02:45
wgrantjml: Is having an open team as launchpad-web's owner a really great idea? It lets everyone in the world change the security contact, for example.07:05
jtvwgrant: I've been staring myself blind at how the buildd master keeps track of what build belongs to a buildfarmjob... does it simply keep the buildfarmjobbehavior in memory and just forget about the whole thing if it ever gets restarted before the slave is finished?08:30
wgrantjtv: No, it can be restarted fine.08:30
wgrantjtv: Isn't a Build a BuildFarmJob?08:30
wgrantBuilder.current_build_behavior looks at Builder.current_job and magically creates the behaviour from that.08:31
jtvBuild is not a BuildFarmJob, but in some of the base buildfarm code it's tacitly understood that a buildfarmjob _has_ a build.08:32
wgrantOh, is an IBuildFarmJob that sits between Job and (SourcePackageRecipe)Build?08:32
* wgrant could just check the source.08:33
jtvYes, pretty much...  generally the Job is the magic magnet that brings all the various parts of a slave job together.08:34
jtvNote that a SourcePackageRecipeBuild is not a Build.08:34
wgrantRight, it's a BuildBase.08:34
wgrantjtv: Don't the relevant implementations of IBuildFarmJob have a 'build' attribute?08:35
jtvTrue, they do.08:35
* jtv needs pen and paper08:35
wgrantI think we just needed a bigger whiteboard at the sprint.08:35
jtvWith this being BuildMaster's dirty little secret, it gets hard to keep track08:35
* wgrant blames that for all design flaws.08:36
jtvNice one08:36
wgrantIt's nothing to do with buildd-manager.08:36
jtvIsn't that the process that drives this though?08:36
wgrantIt is. But all the stuff that deals with jobs like that is all deep in the model.08:37
wgrantbuildd-manager knows only about BuildQueues and Builders.08:37
jtvRight.  So what I'm trying to figure out is if we can do away with the Build.id in the buildfarmjob names.08:39
jtvAnd instead, have a single implementation for generating/verifying all those names, which contains just (1) the BuildQueue.id and (2) a cookie.08:40
wgrantNames meaning the slave build ID?08:40
jtvYes08:40
jtv(confusingly referred to as "name" in IBuildFarmJob but "slave build ID" in IBuildFarmJobBehavior)08:40
wgrantI wasn't aware of the 'name' name.08:41
* wgrant checks.08:41
jtvIt's subtle: what I mean is it's the only sensible implementation for getName08:41
jtv...though it's actually up to the behavior class to compose the slave build id in that way or some other way.08:41
* wgrant isn't sure why it's getName() rather than just name.08:42
wgrantThat makes me a little suspicious.08:42
jtvYeah.  Right now, the conventional structure for the id is <Build.id>-<BuildQueue.id>08:42
wgrant(I added getTitle following that convention; I probably shouldn't have)08:43
wgrantRight.08:43
jtvBut recipe builds use SourcePackageRecipeBuild.id instead of Build.id08:43
wgrantThe content of it doesn't matter. As long as it's reasonably unique.08:43
jtvIt's not even used to find back the BuildQueue?08:43
wgrantWe use a method on BuildQueueSet for that.08:44
wgrantIt's deep in SOMEWHERE.08:44
jtvOh, that's getByJob's job isn't it08:44
wgrantI have branches that clean buildd-manager up and make it less completely ununderstandable.08:44
wgrantBuilddMaster.scanActiveBuilders is all that should need to retrieve a BuildQueue within the manager.08:45
jtvI can only cheer for you.  Given all this, let's stop pretending there's any structure to the slave build id at all, beyond what's needed to guarantee uniqueness.08:45
wgrantExcept for the thing that finds a new job to dispatch.08:45
wgrantRight.08:45
wgrantI mean, it could probably just be the BQ ID.08:46
wgrantAlthough it would be nice to be safe.08:46
jtvIt should include that, but shouldn't it also be resistant against a compromised slave guessing the id of another ongoing slave job?08:46
jtvWhich is why I was suggesting <BuildQueue.id>-<cookie>08:46
jtvBecause you obviously can't trust a slave who doesn't have his cookie.08:47
wgrantjtv: I thought so when looking at the code on the first night -- I got very scared.08:47
wgrantBut the master doesn't trust that.08:47
wgrantit only uses it to check against the DB record and potentially rescue.08:47
wgrantYou can't use it to hijack a build.08:47
jtvSo what does the master trust?08:48
wgrantThe DB.08:48
jtvBuildQueue.builder?08:48
jtvBuilder.currentjob?08:48
wgrantRight.08:48
jtvI'm sort of disoriented... is there any reason why we have this id at all now?08:49
wgrantWe need something to check if the builder needs to be rescued. A hijacked builder may be able to subvert that detection, but it's nice to be able to handle the non-malicious case.08:50
jtvThe thing about malicious cases is that they have a way of disguising themselves as whatever non-malicious cases you try to support.08:50
jtv:)08:50
wgrantThe worst they can do is prevent the builder from being aborted, thus hanging it, or cause inappropriate abortion, in which case they get aborted.08:51
wgrantAlso, they used to be able to break a log message or two, but that might have been fixed in the refactor.08:52
jtvYou're saying we've been very thoroughly checking for a complex, unlikely error condition that we've got no idea what to do with if we ever spotted it.08:53
jtvThis has all the hallmarks of government.08:53
wgrantNo, it has happened twice today.08:54
wgrantIt happens when builders drop of the network for a few minutes, for example.08:54
jtvI don't mean the rescue, I mean mismatched ids!08:54
wgrantThe mismatched string is useful. Detecting a mismatch between the build and buildqueue IDs has probably never been useful, right.08:55
jtvwaitwaitwhat08:55
jtv"The mismatched string" is useful?08:55
wgrantThere are two cases: the slave thinks it's building something, but the master thinks it's building something else.08:56
wgrantThe string mismatch will be noticed by the master, and it will solve everything.08:56
wgrantThe other case is where the slave thinks it's building something, but the master thinks it isn't building anything, and the something doesn't exist in the DB.08:56
jtvAh, ok, so matching the slave build id to the buildqueue id is useful, but further sanity-checking on the slave build id is not.08:57
wgrantLet me think.08:57
wgrantProbably not, no.08:57
wgrantSo, for the case where the master thinks it isn't building anything, it should probably just abort straight away.08:58
wgrantWhen the master does think it's building something, the slave just needs to possess some value that can be linked back to the buildqueue.08:58
jtvEven if we're not sure of the hazards, would you agree that we can do a more thorough job of the check with less work if we just use <buildqueue>-<cookie>?08:58
wgrantWe can't reliably detect compromised builders, so we should not try.08:58
wgrantRight.08:58
wgrantEven just <buildqueue> is probably fine, but who knows...08:58
jtvRight.  If we're not 100% sure, we can add the cookie with very little work—and still scratch all this &$%*@# complexity08:59
wgrantWell, it's not *that* complex.09:00
wgrantI mean, if I'm malicious I can easily steal the cookie from another build log.09:01
jtvI just had a recipe-build test fail because it passed a SourcePackageRecipeBuild.id instead of a Build.id to BuildFarmJobBehaviorBase.verifySlaveBuildID.09:01
wgrantHm? That should be fine.09:01
wgrantAs long as the correct behaviour class was selected.09:02
jtvThe correct behaviour class didn't have its own verifySlaveBuildID09:02
wgrantIt doesn't need it.09:02
wgrantDoesn't SourcePackageRecipeBuildFarmJob have a .build?09:03
jtvWell, I was simplifying verifySlaveBuildID and a slightly more thorough check turned out to be simpler.09:03
wgrantWhat was the change?09:03
jtvYes, it does, but it's a buildbase not a build.09:03
wgrantBoth Build and SourcePackageRecipeBuild have an 'id' attribute, and they are the only concretions of BuildBase.09:04
jtvIIRC the original checked the build id it got from the slave build id to self.buildfarmjob.build.id09:04
wgrantRight.09:04
wgrantThat was done in order to be generic.09:04
jtvI see.09:04
wgrantUgly, yes.09:04
jtvI think part of the problem there is that I genericized the lookups of Build.id, SourcePackageBuild.id, and BuildQueue.id to use a single helper function.09:05
jtvWhich I still think makes sense.09:05
jtvBut being so generic, of course it does some sanity checks against comparing ids from different classes.09:05
jtvExcept obviously I can't just go "are these two of the same class?"09:06
jtvSo instead, the helper takes the expected class as an argument and does a zope "isinstance" check.09:06
wgrantAhh.09:07
jtvAnd no matter which way we turn it, that work seems pointless—the real link is made by the buildqueue id (which by rights should come first in the slave build id) and the added "specialized extra id stuff" in there is just going to make accidental mismatches more likely and any stopping power for attacks less likely.09:08
wgrantRight.09:09
jtvabentley should be asleep :)09:09
wgrantremember this dates from the beggining of time.09:09
wgrantabentley: He pinged out a couple of minutes ago, so he probably is.09:09
jtvwgrant: you seem to be overestimating my age, which is not generally considered the safe side to err on in conversation.09:09
jtvI do not "remember" the beginning of time, thank you very much.09:09
jtvI'll ask barry though.09:10
wgrantHeh.09:10
jtvSo here's what I suggest, and can propose on the mailing list:09:11
jtv * We have a single "concoct an id for this buildfarmjob" method.  It produces <BuildQueue.id>-<pseudo-unique string>09:12
jtv * Nobody fucking messes with this.  Laziness is encouraged in the implementation classes.09:12
jtv * A single verification method is enough.09:12
wgrantHopefully Soyuz will agree that the pseudo-unique string can be dropped.09:13
wgrantBut yes, please propose.09:13
jtv * The entire, ready-made id gets passed to dispatchBuildToSlave.09:13
jtv * ...which probably doesn't have much left besides this to justify specialized implementations, but I digress.09:13
wgrantdispatchBuildToSlave has to cache files and calculate arguments and all that.09:14
jtv * Everybody agrees to bury the whole issue and forget the episode ever happened.09:14
jtvYeah, but these are separate jobs.  I'm not convinced we need a single "do all the household stuff you need here" method.09:15
wgrantRight.09:15
wgrantIt should all change if everything goes properly async, anyway.09:15
wgrantRecordingSlave's trickery makes me sad.09:16
wgrantBut it was the quickest way to make bits of it async.09:16
jtvBTW you may think it odd that I keep coming back to the cookie...  you probably have a much better understanding of whether it would be really needed; for me it's an insurance policy.  "If double-checking the slave build id is necessary, this does it better.  If it's not necessary, this is still less work than what we have now.  And future changes are more likely to strengthen these arguments than to weaken them."09:18
jtv-> no need to argue over the colour of the inside of the bikeshed.  :-)09:19
wgrantYep.09:19
jtvI'd much rather add that piece of code than get stuck in that discussion while maintaining the current code!09:20
wgrantBut cookie probably implies DB change.09:20
wgrantAnd DB changes make me sad.09:20
jtvah, there's the catch: is this id actually stored anywhere09:20
wgrantNo.09:21
wgrantIt is calculated by the master, and stored by the slave during the build.09:21
wgrantAt no point should it be possessed by the master outside dispatchBuildToSlave and rescueBuilderIfLost.09:21
jtvSo that does make the case against the cookie stronger, unless we can find some generic, reproducible, and less-than-predictable stuff to hash.09:22
jtvOr include in some way.09:22
wgrantLet's just ask Soyuz, I think.09:26
jtvYes.09:26
jtv(FWIW the exact Job.date_created would IMHO probably be a better "cookie" than anything we have now, though still not particularly good)09:27
wgrantAh, yes, that is a good idea actually.09:27
jtvCertainly better than using floating-point primary keys, as a friend's customer did.09:28
wgrantSince it's the particular build of the BQ.09:28
wgrantHaha.09:28
jtv(BTW I got a call from the hardcore windows guy who's trying out Ubuntu Server...  to my utter amazement he's wildly enthusiastic despite the teething problems you'd expect)09:29
jtvOkay, I'll write this up for the ML09:30
wgrantGreat! On both counts.09:31
jtvwgrant: message is on the way10:25
* wgrant reads.10:25
wgrantjtv: Build.id and BuildBase.id are the same thing.10:26
wgrantBuild inherits BuildBase.10:26
jtvThere isn't really a BuildBase.id though, is there?  I meant "the id of some other BuildBase-derived class."10:27
wgrantTrue.10:27
jtvWhich only goes to illustrate how messy the whole thing is, I suppose10:31
jtvwgrant: it's been about a 32-hour working day, so I think I'll go enjoy the tail of weekend now.  See you next week I hope!10:44
=== jtv is now known as jtv-afk
wgrantjtv-afk: Wow! Yes, go and have a weekend.10:45
wgrantSee you.10:45
jtv-afk:)10:45
=== Ursinha_ is now known as Ursinha
=== Ursinha is now known as Ursinha-afk

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!