/srv/irclogs.ubuntu.com/2014/03/18/#maas.txt

bigjoolsjtv: in src/maasserver/tests/test_api_node.py I am wondering if the tests that do things like start() are creating real celery tasks04:40
jtvbigjools: I recall seeing task code running (from tracebacks, unrelated) when the node/nodes API tests did things like accept a node.04:42
jtvOh, and I think one thing that happened was:04:42
jtvSome tests get run in scenarios for different kinds of user (regular vs. admin, sometimes I think anon as well).04:43
jtvSomething I hate, by the way.04:43
jtvAnyway, when the admin tests enlist a node through the API...04:43
jtv...they are of course auto-accepted...04:43
jtv...which of course sends them straight into commissioning...04:43
jtv...including a power action to start them.04:44
bigjoolsyeah so the power action is what I am getting at05:03
bigjoolsjtv: so we have a celery fixture as self.celery in the maasserver tests, right?05:05
bigjoolsI don't know whether it goes as far as issuing actual tasks05:05
jtvYes — it slows down the tests a little bit, but many tests seem to rely on it.05:05
jtvDoes the celery fixture actually stop celery tasks though?  Or does it ensure their delivery, just without a transport?05:06
jtvLooking at the docstring, I infer the latter.05:07
bigjoolsok right it does them sync05:07
bigjoolsaaaiieeeeeeeee05:07
bigjoolsfuck05:08
jtvAy caramba05:09
jtv¡Aiaiay!05:09
bigjools¡Merda!05:09
jtvMeanwhile, Blake's branch has failed to land 3 times — twice because of that weird expected-to-be-called-once test bug, once because of a new one.05:09
bigjoolssheesh05:09
jtvIt's mierda, with an ‘i’05:10
jtvBy “new one,” I don't necessarily mean one he introduced... looks like another one that was already lurking in the code.05:10
bigjoolsyeah05:10
bigjoolsthat one is very rare05:10
bigjoolsI suspect all of this is a consequence of those celery tasks we just talked about05:10
jtvAhhh, it's an ExternalProcessError from rndc — but with a traceback this time.  So arguably this is progress.05:11
bigjoolseek not updated my trusty machine in a week and..... about a million packages to get05:11
jtvYup.05:12
jtvBut if you had other machines updating in the background, many of them will be in cache already.05:12
bigjoolson the bright side the coffee I just made is excellent05:12
bigjoolsno, I had not updated any of my machines so the cache was very cold05:12
bigjoolsunlike this coffee, which is hawt05:12
jtvThen your machines get to enjoy full cache re-use.05:13
bigjoolsand all I am trying to do is get tycho's branch tested before I approve it05:13
jtvThe rndc error looks like a port-is-in-use.  :/05:13
bigjoolsgood old isolation errors!05:14
jtvThe real hello-old-friend here is a traceback going through a save() into a signal handler and from there into a celery task which then dives deeper into a callback chain.05:15
jtvIn this case though, it's all intentional.05:16
bigjoolswe need to limit these types of tests to a separate set of tests that are obviously integration tests and work hard to keep the integration/unit split05:22
jtvI see it as mostly a practices issue.  To me, “integration test” is a relative term — the test still generally isolates some part of the application.05:29
jtvJust a bigger part than a unit test.05:30
bigjoolsjtv: well what I mean is, let's make the celery fixture do nothing by default05:31
jtvAbsolutely, yes!05:31
bigjoolsand make the tests that need it explicitly turn it on05:31
jtvBut I'm warning you: it'll affect loadsatests.05:32
bigjoolsit will be painful to fix but better05:32
jtvYup.05:32
bigjoolsit's tech debt05:32
bigjoolssad but true05:32
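
The opt-in proposal discussed here can be sketched in a few lines. This is only an illustration under stated assumptions: `TaskQueue`, `submit`, and `power_on` are hypothetical names, not the actual MAAS celery fixture or any Celery API. The point is only the default: tasks are recorded but not run unless a test asks for it.

```python
class TaskQueue:
    """Hypothetical stand-in for a test fixture; not the MAAS celery fixture."""

    def __init__(self, eager=False):
        # eager=False is the proposed default: record tasks, run nothing.
        self.eager = eager
        self.recorded = []

    def submit(self, func, *args, **kwargs):
        """Record the call; execute it only if the test opted in."""
        self.recorded.append((func.__name__, args, kwargs))
        if self.eager:
            return func(*args, **kwargs)
        return None


def power_on(system_id):
    """Hypothetical task with a side effect we want kept out of unit tests."""
    return "powered on %s" % system_id


# Default: the task is recorded but never executed.
quiet = TaskQueue()
quiet_result = quiet.submit(power_on, "node-1")

# Opt-in: an integration-style test asks for synchronous execution.
eager = TaskQueue(eager=True)
eager_result = eager.submit(power_on, "node-1")
```

A unit test can still assert on `quiet.recorded` to check that the right task was requested, without paying for its side effects.
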
bigjoolsdamn I need an intern :)05:32
jtvI know it's been a hard day, Mr. President, but don't you want to wait until Hillary is out of the running?05:33
jtvI also wish we could just roll up our sleeves and replace all signals with explicit calls.  But the very essence of the problem is that those calls are spread out and implicit.05:33
bigjoolsand there's no other way sometimes I think?05:43
jtvThere are other things we could do: disable the signals by default during testing, or issue warnings and act later, or just live with the problem.05:46
bigjoolsyeah, I'd like signals disabled too.06:14
bigjoolsall of this stuff a) slows down tests, b) causes spurious side effects06:15
jtvAnd let's not forget: c) makes it harder to reason about what goes on06:24
jtv—just mentioning that because we are seeing cryptic test failures from time to time.06:24
bigjoolsjtv: well yes that's b) really06:42
bigjoolsrvba: what part of oleg's stuff cannot be landed in parallel to the existing import?06:54
bigjoolsI don't understand why06:54
rvbabigjools: have a look at https://code.launchpad.net/~strikov/maas/maas-new-metadata-format/+merge/210843 The changes to src/provisioningserver/pxe/config.py will break the package until the new script is used.06:55
rvbabigjools: same for the changes to src/provisioningserver/kernel_opts.py (we can probably improve what has been done here a bit btw)06:56
bigjoolsrvba: stuff like this seems grossly unnecessary:06:56
bigjools501-    root = String(if_missing="/var/lib/maas/tftp")06:56
bigjools502+    root = String(if_missing="/var/lib/maas/boot-resources/current/")06:56
rvbabigjools: I agree, but that's not what I'm talking about06:57
rvbabigjools: that's precisely why I didn't want to land the whole branch but just the raw script: so that we can improve things the way was want.06:57
rvbas/was/we/06:58
gmbG'moaning.06:58
jtvHi gmb06:58
rvbaHello gmb06:58
bigjoolsrvba: ok so the whole structure/naming of the images is different now?06:59
bigjoolsgreetings gmb06:59
rvbabigjools: we need to investigate that.  The only thing that seems to have changed in this regard is get_ephemeral_name().07:01
bigjoolsrvba: well I am referring to the stuff in pxe/config/py07:01
bigjoolsconfig.py07:01
bigjoolsdi-<thing> and boot-<thing>07:02
bigjoolsinstead of initrd and linux07:02
rvbaYeah, the name of the files clearly has changed.07:02
rvbaThe structure, I'm not sure.07:02
bigjoolsI don't know why they changed07:02
bigjoolsand the change in src/provisioningserver/pxe/tftppath.py is quite perplexing07:03
rvbaI think the two changes are linked.07:03
rvbadi-* is for install07:03
rvbaboot-* is for commissioning07:04
bigjoolsyes07:04
bigjoolscutting it as fine as ever with this stuff :/07:07
=== CyberJacob|Away is now known as CyberJacob
=== CyberJacob is now known as CyberJacob|Away
jtvYay!  Image migration, image import, commissioning, fast-path install, and classic install all work with the labels.09:47
jtvIt even works with Trusty, because the import script doesn't actually report the images as non-"release" yet.09:49
jtvrvba: question about the new import-script integration branch...10:28
jtvHow do the changes in the armhf templates work?10:29
jtvDo they just override a parameter to the template?10:29
jtvAnd if so, does this mean that we stop supporting the highbank subarch name?10:30
jtvDo we no longer need to support it because the simplestreams data gives us generic?10:30
jtvI'm asking because the overridden variable is for _newer_ releases than Raring, not older ones.10:31
rvbajtv: I'm told the new data will explicitly include subarches=['highbank', 'generic'] for the 'generic' boot resources.10:31
jtvAh that explains.  Thanks.10:31
rvbajtv: I'd like to see it for myself but we have to wait for the metadata to be published…10:34
jtvYeah.  It's a bit like boring an undersea tunnel and having to trust that you're going to meet up with the people working from the other end.10:36
jtvgmb: if you hadn't said anything about the "release" fallback, I wouldn't have worried about it.  But if it's not something that should happen in real life, I'd prefer not to return a sane label to an insane world.10:42
gmbjtv: Yeah, that's my feeling too. I'll fix the world.10:42
gmb(A bit)10:42
jtvNah, let the world stew in its own venom.  Just don't give it a label that will look (when debugging) as if there has to be an image.10:43
jtvIn fact, I wonder if it might be worth returning something recognisable as a hint...10:43
gmbjtv: Hmm... 'invalid-label' seems a good start. Either that or "FOAD". Maybe that's too subtle.10:45
jtvInsert joke at the expense of your favourite group here.10:46
jtvBut "invalid label" is the outcome of something more profound: no image!10:46
jtvThat, I should think, is what the user needs to know.10:46
jtvQuestion is, is that best expressed on the node's console or in the server logs?10:46
gmbClarity clarity clarity.10:46
jtvI'm not sure what you mean.  Could you explain?10:47
jtv<deadpan>10:47
gmbROFLYSST.10:47
jtv"at your silly statement"..?10:47
gmbRolling on Floor Laughing, Yet Somehow Still Typing.10:48
jtvAh!10:48
jtvUseful one.  Thanks.10:48
gmb(It's a Billy Bailey-ism)10:48
gmbBill*10:48
gmbANYROAD10:48
gmbjtv: I think it should probably go in the server's logs; I don't know how often people look at the node's console whilst it's booting (especially in large-scale deployments). That said, there's no harm in putting it in both.10:49
jtvFair enough...  Then I'd say pxeconfig should just raise an exception in this situation, so that both it and the cluster controller are in a position to log it.10:50
gmbjtv: Right.10:50
jtvAnd if the cluster controller wants, it can convey problems to the node's console through silly results, but...10:50
jtv...see your point above.10:51
gmbIndeed.10:51
gmbOn it.10:51
jtvAnd while you're there, can I just put in a good word for simple conditionals?10:51
jtv"if latest_image is not None" with an "else" makes for increased likelihood of future mistakes.10:52
jtvThere are two schools of thought about this: "most normal condition first" and "avoid unnecessary negation."10:52
gmbjtv: Yeah... In this case the latter seems the right path.10:52
gmbAnd an early escape from insanity.10:53
gmb("There's no latest image; fuck it, raise an exception.")10:53
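
The "early escape" shape agreed on here might look roughly like the following. It is a sketch only: `NoImageError` and `get_label` are hypothetical names, the image is assumed to be a plain dict with a `label` key, and none of this is the real pxeconfig code.

```python
class NoImageError(Exception):
    """Hypothetical exception meaning: there is no boot image to offer."""


def get_label(latest_image):
    """Return the image's label, or raise early if there is no image."""
    # Handle the abnormal case first and leave; no negated test, no else branch.
    if latest_image is None:
        raise NoImageError("No boot image is available for this node.")
    return latest_image["label"]


normal_label = get_label({"label": "release"})

try:
    get_label(None)
    error_message = None
except NoImageError as error:
    # The caller (for example the code serving the request) can log this.
    error_message = str(error)
```

Raising instead of returning a fallback label keeps the "no image" condition visible to whichever layer does the logging.
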
jtvHuzzah.  Labels have landed.10:56
rvbajtv: \o/.  Please update the integration branch.10:57
jtvWill do.10:58
rvbaTa.10:58
jtvBlake's branch failed to land (in this latest attempt) because an architecture in the registry has its pxealiases field set to None.  Does anyone know why that might be?11:56
jtvHmm... looks like ArchitectureRegistry expects that field to be iterable, and yet it defaults to None.11:59
jtvWhy isn't that breaking other branches?11:59
jtvUh-oh.  That's dependent on coincidental dict ordering, isn't it?12:00
rvbajtv: is it?  I'd say get_by_pxealias() is simply broken.12:01
jtvYes, but in a way that will pass tests sometimes.12:02
jtvDepends on the ordering of the dict iteration in get_by_pxealias().12:02
jtvThis is what happens when we skimp on negative tests.12:02
rvbapxealiases should probably be an empty tuple by default.  Instead of None.12:04
jtvEither will do.12:05
jtvJust not this.  :)12:05
jtvI'm fixing it.12:05
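
A small sketch of why a `None` default can pass tests "sometimes". The classes and functions below are illustrative assumptions, not the real `ArchitectureRegistry` or `get_by_pxealias()` code, and the architecture names are example values. Plain lists stand in for the registry so that the iteration order is explicit.

```python
class Architecture:
    """Hypothetical registry entry; pxealiases defaults to None as described."""

    def __init__(self, name, pxealiases=None):
        self.name = name
        self.pxealiases = pxealiases


def get_by_pxealias_fragile(archs, alias):
    """Assumes pxealiases is iterable; breaks when it reaches a None."""
    for arch in archs:
        if alias in arch.pxealiases:  # TypeError when pxealiases is None
            return arch
    return None


def get_by_pxealias(archs, alias):
    """Treats a missing alias list as empty, so ordering no longer matters."""
    for arch in archs:
        if alias in (arch.pxealiases or ()):
            return arch
    return None


with_alias = Architecture("i386/generic", pxealiases=("i386",))
without_alias = Architecture("armhf/highbank")

# Match visited first: the fragile version returns before touching the None.
lucky = get_by_pxealias_fragile([with_alias, without_alias], "i386")

# None visited first: the same lookup now fails.
try:
    get_by_pxealias_fragile([without_alias, with_alias], "i386")
    unlucky_failed = False
except TypeError:
    unlucky_failed = True

robust = get_by_pxealias([without_alias, with_alias], "i386")
```

Defaulting the field to an empty tuple, as suggested above, would remove the need for the `or ()` guard. A negative test that looks up an alias nobody declares would expose the fragile version regardless of ordering.
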
rvbajtv: the more I look into moving the config for the new import script into a new file, the more I think it's a rabbit hole… because of the presence of the boot/ephemeral/images_directory config.12:07
rvbaIt's used heavily in the provisioning server itself.12:07
rvbaYes, we could probably have that stored on the BootImages objects.12:07
rvbaBut it's a lot of work for very little gain.12:07
rvbajtv: what do you think?12:08
jtvLet me just finish what I'm doing here...12:09
jtvBRB12:09
gmbjtv: So, doing what we wanted with the exception from pxeconfig() is causing a raft of test failures that are proving *very* hard to fix. I'm going to grab some lunch then take one last stab at it. If that doesn't work I'll fall back to returning an obviously-wrong label for the time being and file a bug.12:21
* gmb lunches 12:21
gmb\12:21
jtvgmb: well at least that would give it a good _reason_, which is also good.  :)12:21
gmbjtv: Can you give https://code.launchpad.net/~gmb/maas/label-in-pxe-config/+merge/211480 a thumbs up so we can land it?13:38
jtvLooking13:41
jtvgmb: approved with comments.13:55
gmbjtv: You have given voice to the inner monologue that I had when I wrote the test... I got distracted by the no-such-image nonsense.13:56
gmb*sigh*13:56
gmbAlways listen to yourself.13:56
gmbUnless you tell yourself to kill people.13:56
jtvGlad to hear you were thinking along the same lines.14:00
jtvOtherwise I might have come across as a bit of a pain in the neck.14:00
gmbjtv: I find that writing code is like sculpture.14:01
gmbYou just remove all the bits that don't look like a horse.14:01
gmbExcept by committee.14:01
gmbSo sometimes you get a camel.14:01
jtvThere is of course one glaring and fatal flaw in your reasoning.14:01
jtvWhy would we need code to look like a horse?14:01
gmbYOU ARE NO FUN14:02
* gmb wants a pony14:02
=== cmagina is now known as cmagina-away
=== cmagina-away is now known as cmagina
=== cmagina is now known as cmagina-away
=== cmagina-away is now known as cmagina
=== cmagina is now known as cmagina-away
=== cmagina-away is now known as cmagina
=== matsubara is now known as matsubara-afk
JeffreyI have about 8 blade servers that I am in the process of getting MaaS to work with. I have all the blades enlisted with the MaaS Controller, and it boots into pxe boot and boots the image. But after installing the image it says "boot sector signature not found" and then drops me to a boot prompt. Can anyone help me troubleshoot this?17:21
=== matsubara-afk is now known as matsubara
=== CyberJacob|Away is now known as CyberJacob
=== cmagina is now known as cmagina-away
