[03:13] <JJ__> hi
[03:18] <JJ__> Hello?
[09:22] <rvba> bigjools: please have another look at my branch (https://code.launchpad.net/~rvb/maas/retry-power-changes/+merge/227884);  I've kept the hardcoded list of power types that support querying for now.  The way we're using the power type registry is a bit nasty: the objects it stores are JSON representation of the fields the power type requires, not proper objects that can be extended cleanly.  If we're going to
[09:22] <rvba> extend this, I think it's worth taking a step back and seeing if we can't refactor this a bit before we had more cruft on top of what we have now.
[09:22] <bigjools> rvba: sure
[09:23] <bigjools> gimme a few, just writing an email
[09:23] <rvba> No worries
[09:24] <rvba> bigjools: also, if you have time after that, I'd like to talk about error reporting with you.
[09:38] <bigjools> rvba: first comment - the test_change_power_state_marks_the_node_broken_if_exception test won't fail if there's no Failure.  You need to do d.addBoth() and check the result is a Failure.
[09:39] <bigjools> did you ever get the test to fail?  :)
[09:47] <bigjools> rvba: see self.assert_fails_with() which does what you want
[09:49] <rvba> bigjools: I actually *did* get the test to fail…
[09:49] <rvba> bigjools: if you apply http://paste.ubuntu.com/7846674/ the test fails.
[09:51] <bigjools> rvba: wrong kind of failure
[09:51] <bigjools> if there is no failure at all, the test will pass
[09:51]  * bigjools writes on MP
[09:51] <rvba> bigjools: I get http://paste.ubuntu.com/7846676/, that's what I'm expecting
[09:52] <rvba> i.e. the node not being marked as broken
[09:52] <bigjools> rvba: wait for my MP comments, the test needs a simple fix, that's all
[09:55] <rvba> bigjools: oh, I see what you mean now
[09:55] <bigjools> :)
[09:55] <bigjools> rvba: there is a test helper that does it all in one line
[09:55] <rvba> Right.
[09:59] <bigjools> rvba: see email
[10:04] <bigjools> also a useful nugget is that you can pass debug=True to make_factory() which gives you a bunch of info about Deferreds when debugging your tests (full stack trace of where they were created is tremendously useful)
[10:06] <rvba> bigjools: hum, looks like you need to "return" assert_fails_with()
[10:06] <bigjools> rvba: ah yes
[10:06] <bigjools> returns the deferred to the reactor
[10:08] <rvba>         d = power.change_power_state(
[10:08] <rvba>             system_id, power_type, power_change, context)
[10:08] <rvba>         d.addErrback(check)
[10:08] <bigjools> rvba: well what you can do, since you're adding an extra check is this (let me just write it down)
[10:08] <rvba>         return assert_fails_with(d, Exception)
[10:08] <rvba> bigjools: this ^ works.
[10:08] <rvba> And check is:
[10:08] <rvba>         def check(failure):
[10:08] <rvba>             # The node has been marked broken.
[10:08] <bigjools> rvba: you don't need the check func you can do a lambda
[10:08] <rvba>             self.assertThat(
[10:08] <rvba>                 client.mark_node_broken, MockCalledOnceWith(system_id))
[10:08] <rvba>             raise failure.value
[10:09] <bigjools> d.addErrback(self.assertThat, ...)
[10:09] <bigjools> not a lambda... heh
[10:10] <bigjools> rvba: ok?
[10:10] <rvba> Having a cleanly defined check() method is easier on the eyes.
[10:10] <rvba> I think.
[10:10] <bigjools> rvba: I massively disagree
[10:10] <bigjools> inline funcs are to be avoided at all costs
[10:10] <bigjools> well, not all, but they are hideous
[10:11] <rvba> Not in tests when they encapsulate a tiny bit of logic that is nice to see isolated from the rest of the code.
[10:12]  * bigjools continues to disagree
[10:12] <rvba> bigjools: in this case, I need to do two things in check(): the test itself and then re-raise the exception
[10:12] <bigjools> rvba: no you don't need to re-raise
[10:12] <bigjools> if you add the errbacks in the right order, anyway
[10:12] <rvba> bigjools: if I don't re-raise, the exception is swalled and assert_fails_with fails
[10:13] <rvba> oh, I see.
[10:13] <bigjools> rvba: d = self.assert_fails_with ...
[10:13] <rvba> assert_fails_with returns a deferred
[10:13] <rvba> bigjools: yeah
[10:13] <bigjools> d.addErrback(...
[10:17] <rvba> bigjools: so that's what you're suggesting then? http://paste.ubuntu.com/7846751/
[10:17] <bigjools> rvba: no need to re-assign d
[10:18] <bigjools> but basically, yeah
[10:18] <rvba> It's not failing properly when no exception is raised…
[10:19]  * rvba reads the documentation for d.addErrback…
[10:20] <rvba> bigjools: http://paste.ubuntu.com/7846769/ that's better
[10:21] <bigjools> ah yes sorry, forgot the extra arg :)
[10:21] <bigjools> normally it's not needed but because you have other callbacks ...
[10:23] <rvba> Right.
[10:43] <bigjools> rvba: that;s interesting.  The test func where you added the yield doesn't have inlinecallbacks
[10:43] <rvba> bigjools: ah!
[10:44] <bigjools> It's fine, but ...
[10:44] <bigjools> rvba: is assert_fails_with not on the testcase?
[10:44] <bigjools> self.assert_fails_with ... ?
[10:44] <rvba> No, it's not.
[10:45] <bigjools> huh, weird.  Someone must have patched it in on the LP tests then
[10:46] <bigjools> rvba: I reckon we should change our test cases to have it
[10:47] <rvba> bigjools: it's only used in two places now (including the one I'm adding) so the fix should be tiny.
[10:49] <rvba> bigjools: how do you explain "yield method()" works without the inlinecallbacks decorator?
[10:49] <bigjools> it's a generator...
[10:50] <rvba> But how come the test runner deals with that transparently?
[10:51] <bigjools> I suspect you don't need the run_tests_with
[10:51] <bigjools> it's never hitting the reactor
[10:53] <rvba> bigjools: when I call power.change_power_state() without explicitly passing a reactor, the default one gets used…
[10:53] <bigjools> I mean the tests are not relying on it
[10:55] <rvba> But when 'yield pause(waiting_time, reactor)' gets called, this should hit the reactor shouldn't it?
[11:08] <bigjools> rvba: not if inlinecallbacks is missing it's not :)
[11:09] <rvba> bigjools: but I've got other test methods with inlinecallbacks.  They pass with or without it.
[11:09] <bigjools> meh, the lack of an easy way to run a quick image import on canonistack is annoying
[11:09] <bigjools> need a sources file now
[11:09] <bigjools> rvba: right, because the tests are not relying on the reactor spinning, just the clock
[11:12] <rvba> bigjools: but the clock defaults to twisted.internet.reactor
[11:12] <bigjools> rvba: not the one you're relying on
[11:13] <rvba> bigjools: I'm talking about other tests which don't use Clock() at all.
[11:13] <bigjools> I'm not doing a good job of explaining, sorry
[12:27] <jtv> Who wants to review my database migration fix for trunk?  https://code.launchpad.net/~jtv/maas/bug-1347579/+merge/228099
[12:27] <bigjools> jtv: I would be delighted to
[12:27] <jtv> Thank you.
[13:40] <allenap> Please can a review get I? https://code.launchpad.net/~allenap/maas/rpc-get-preseed-data-helper/+merge/228086
[13:40] <rvba> allenap: I'll look at it in a sec
[13:40] <allenap> rvba: Thanks.
[13:52] <bigjools> allenap: pserv logging.  it's, erm, not doing a lot
[13:52] <bigjools> is the python logger wired up?
[13:53] <bigjools> to twisted's, I mean
[13:56] <allenap> bigjools: What’s happening specifically?
[14:08] <bigjools> allenap: two problems:
[14:09] <bigjools> 1. logger.info() is not appearing in the log
[14:09] <bigjools> 2. the log is swamped with RPC connection reporting which is annoying
[14:13] <roaksoax> bigjools: so I was thinking, would it be a good idea to start working on Unified Logging?
[14:13] <roaksoax> since most of the new logging that we will be doing in all of the work
[14:13] <roaksoax> will need to end up in a unified log
[14:13] <bigjools> roaksoax: that's a terrible idea right now, but when everything else in progress is done, it will be an excellent idea
[14:15] <roaksoax> bigjools: right, but we need to start consolidating logs in a single location anyway, instead of creating new logs for new stuff
[14:16] <allenap> bigjools: I shall look.
[14:16] <bigjools> roaksoax: there is nothing that I know of that is going to require a new log
[14:17] <allenap> roaksoax: Yeah, let’s just clean up what we have for now, get over the hump, no new development right now. In a month we can do something a bit more structured.
[14:18] <bigjools> allenap: thanks.
[14:23] <allenap> bigjools: logger.info() probably not appearing because we’re not configuring the standard lib's logging in pserv.
[14:23] <bigjools> allenap: I think we need to build some retrying backoff algorithm into getClient()
[14:23] <bigjools> allenap: yes that;s what I guessed at 14:52
[14:24] <bigjools> getClient fails hard and quick, and this is not a normal situation, we expect RPC to always be up.  It could be that we're trying to access too quickly after startup.
[14:25] <allenap> bigjools: We can wire up Twisted’s log to the standard lib’s logging with PythonLoggingObserver.
[14:25] <allenap> And that’s probably a good thing to do.
[14:25] <bigjools> yes
[14:27] <bigjools> allenap: how evil would it be to put inlineCallbacks on __init__ ? :)
[14:27] <bigjools> I need a startup pause for the image checking service because of this getClient thing
[14:30] <allenap> bigjools: We could add a queue to ClusterClientService for getClient requests.
[14:30] <allenap> bigjools: inlineCallbacks on __init__ won’t work.
[14:30] <bigjools> yeah I figured
[14:45] <bigjools> allenap: client(GetBootSources) is returning a string.  Is that right?
[15:03] <allenap> bigjools: Nope, it returns a {“sources”: {…lots of stuff…}}. See GetBootSources.response for the schema.
[15:03] <bigjools> allenap: I am getting a very very weird error
[15:03] <bigjools> let me paste
[15:04] <bigjools> allenap: http://paste.ubuntu.com/7848044/
[15:04] <bigjools> the returned data looks ok
[15:05] <bigjools> oh no.... do I really have to do returned_data['sources'] ...
[15:09] <allenap> bigjools: Yep, that’s it.
[15:09] <bigjools> allenap: so did that, got further: http://paste.ubuntu.com/7848059/
[15:09] <bigjools> I feel we're lacking in some integration tests
[15:11] <allenap> Yep :)
[15:12] <bigjools> allenap: seriously - there's too much connection mocking, we really need end-to-end tests
[15:12] <allenap> bigjools: I’m not sure that the response that GetBootSources gives was designed to exactly slot into existing code. It was designed to make sense first of all :)
[15:12] <rvba> bigjools: I know you're busy right now but here is the MP that implements what we talked about earlier: https://code.launchpad.net/~rvb/maas/improve-broken-state/+merge/228132
[15:13] <bigjools> roaksoax: ack
[15:13] <allenap> bigjools: You’re the one wiring it up for the first time, so you get to write the integration tests.
[15:13] <bigjools> rvba: ack
[15:13] <bigjools> ffs
[15:13] <bigjools> allenap:yay?
[15:18] <allenap> bigjools: Think of how good you’ll feel when it’s done :)
[15:19] <bigjools> allenap: I always enjoy finishing someone else's work ;)
[15:21] <allenap> bigjools: Great, I think I have some more for you :)
[15:23] <bigjools> allenap: arf!
[15:23] <bigjools> allenap: do we have any infra to set it up in a test?
[15:32] <allenap> bigjools: There’s ClusterRPCFixture for the other direction. You could replicate that for cluster->region.
[15:43] <bigjools> allenap: right
[15:45] <bigjools> allenap: so I see the problem
[15:45] <bigjools> the keyring data is returned in sources, but the importer code wants it in a file
[15:45] <bigjools> I love it when the rabbit hole deepens like this
[15:46] <bigjools> so I can't just pass that sources obj that rpc returns
[15:46] <bigjools> what I want to know is, how is the file written normally when the celery job runs
[15:49] <bigjools> argh
[15:50] <allenap> bigjools: That’s done in boot_resources.import_images, right? write_all_keyrings
[15:50] <bigjools> argh arfg argf
[15:52] <bigjools> allenap: the rpc code has a bug
[15:52] <bigjools> it's returning the data in a field with an incorrect name
[15:52] <bigjools> "keyring" is the filename, "keyring_data" is the, well, data
[15:52] <bigjools> but the call uses the former for the latter
[15:54] <bigjools> allenap:
[15:54] <bigjools>         keyring_data = source.pop("keyring_data")
[15:54] <bigjools>         source["keyring"] = b64decode(keyring_data)
[15:54] <bigjools> why!
[15:55] <bigjools> thinko or deliberate?
[15:59] <allenap> bigjools: Deliberate. BootSource.to_dict() is the one smoking crack for the benefit of Celery. The cluster never needs to know the filename. To the cluster the keyring is the keyring, it does not need to be qualified with a _data suffix. Once the Celery task is gone we can fix BootSource.to_dict().
[15:59] <bigjools> allenap: I'm changing it back to keyring_data, because it's the least path of resistance
[16:00] <allenap> (It’s smoking crack because the model is where the base-64 encoding is happening, instead of where the task is dispatched.)
[16:00] <bigjools> the code in import_resources relies on all of this
[16:00] <bigjools> it's not just celery
[16:00] <allenap> bigjools: Jumping off a cliff is a quick way to the bottom.
[16:00] <bigjools> \o/
[16:01] <bigjools> I prefer keyring_data tbh.  I doesn't leave me wondering if it's a filename or the data
[16:01] <bigjools> keyring on its own is ambiguous
[16:01] <bigjools> and in fact more confusing because the gpg executable uses --keyring=<filename>
[16:02] <allenap> bigjools: Either way, fix BootSource.to_dict() when you delete the import_boot_images task.
[16:03] <allenap> bigjools: Note that everything assumes that keyring_data is base-64 too. You’ll need to disabuse everything of that notion.
[16:05] <bigjools> allenap: sigh :(
[16:05] <bigjools> allenap: for now I am going to ensure that everything RPC is bytes and not b64
[16:05] <bigjools> it's a good cutpoint.
[16:06] <allenap> Yeah.
[16:43] <bigjools> allenap: I have a dilemma
[16:44] <bigjools> there's a time when the importer code needs to deal with both b64 and bytes for the keyring
[16:44] <bigjools> before both the scheduled job and manual import are migrated to rpc
[16:44] <allenap> The only way out is through.
[16:45]  * allenap is the font of aphorisms today.
[16:46] <allenap> bigjools: Well, you /could/ name one "keyring_data" and the other, oh, I don’t know, just “keyring”? :)
[16:46] <bigjools> allenap: fooey
[16:47] <bigjools> doesn't help anyway
[16:47] <bigjools> all the underlying code assumes b64
[16:47] <bigjools> can I detect which it is and DTRT
[16:51] <allenap> bigjools: You could try decoding base-64 and catch exceptions, but that’s not without risk.
[16:51] <bigjools> yeah
[16:51] <bigjools> I'll decode it when the task is called
[16:51] <allenap> Yeah, that works.
[16:51] <bigjools> not when pumped right into the underlying code :/
[16:51] <allenap> Perfick.
[16:52] <bigjools> then when the task gets removed eventually, all will be well
[16:59] <bigjools> allenap: so there is a slight flaw in the plan for detecting when to run a download :(
[16:59] <bigjools> although it's not the end of the world
[16:59] <bigjools> maas.meta only gets updated when there is something to download.  So once it's a week old, we run the downloads hourly :)
[17:11] <bigjools> allenap: bug 1271188 is getting more attention now, and I am also a little frustrated.  Let's stick it on the list while we're doing this work?
[17:18] <bigjools> allenap: if you want a gander I have a WIP MP, I'm offski now.  There's a bit more to do to stop that hourly checking.
[17:18] <bigjools> https://code.launchpad.net/~julian-edwards/maas/replace-celery-download-job-with-pserv/+merge/228168
[17:25] <allenap> bigjools: Ha! We could ensure that maas.meta is touched after each successful run, or use another reference file.
[17:26] <allenap> bigjools: I’ll be back online later, but I’m off now too. I’ll try to have a look at that wip.