=== Guest14971 is now known as wallyworld
=== CyberJacob|Away is now known as CyberJacob
[07:01] bigjools: did you QA the fix for https://bugs.launchpad.net/maas/+bug/1341001? (I can help with that if you didn't)
[07:01] Ubuntu bug 1341001 in MAAS 1.6 "maas.tgt not rewritten on 1.5 -> 1.6 cluster upgrade" [Critical,Triaged]
[07:01] rvba: I did not
[07:02] bit sidetracked by other stuff today
[07:02] Okay, I'll do it now.
=== CyberJacob is now known as CyberJacob|Away
[07:40] easy review karma: https://code.launchpad.net/~julian-edwards/maas/cluster-task--report-last-import-time/+merge/226782
[07:44] bigjools: approved
[07:44] rvba: thanks
[07:46] "make lint" always makes me chuckle
[08:18] allenap: hi, I see you've changed the RPC stuff so that exceptions are properly propagated.
[08:20] jtv: I'm still in the process of making sure it's repeatable but when I tried upgrading my 1.5.2 installation to trunk I got: http://paste.ubuntu.com/7797346/. Any idea where this is coming from?
[08:21] rvba: yes, that's a known bug — fix is in review.
[08:21] Ah okay. Ta.
[08:21] bigjools: I replied to your review.
[08:22] allenap: the error I'm getting on the cluster when a NoSuchNode is raised on the Region looks like this: 'Node with system_id=Node with system_id=unknown-system-id-dPaTUv could not be found. could not be found.'
[08:22] allenap: Looks like the code from paste.ubuntu.com/7797390/ is "applied" twice or something.
[08:22] rvba: Ah, that’ll be because you’ve customised __init__ to munge the message.
[08:23] allenap: I see, the error is "recreated" on the other side…
[08:23] Right?
[08:23] rvba: Yep.
[08:24] rvba: You could instead customise __str__ (and __unicode__).
[08:24] allenap: that's precisely what I'm trying now :).
[08:57] jtv: lp:~jtv/maas/bug-1340896 is the branch that fixes the upgrade problem right?
[08:58] Yes.
[08:58] See the bug.
[08:59] You might actually have insights useful for the review discussion.
[08:59] Okay, I'll look at it in a bit.
I'm in the middle of doing some QA now and I'd like to get that sorted first…
[09:04] jtv: got the same upgrade problem with the fixes from that branch :/
[09:06] Hrrrh!?
[09:07] It's possible that my installation is broken… I've been fighting with it all morning.
[09:07] Then maybe we need 3 migrations after all...
[09:07] But I did make it go through the migrations forwards and backwards with cluster interfaces present.
[09:07] (And I observed the changes in an SQL shell)
[09:09] * rvba switches to maas/1.6
[09:10] jtv: happy to do some more testing later (and have a look at that branch/bug) but right now, I just want to get through with the QA.
[09:10] * rvba backports fix.
[09:11] rvba: right 1.6 doesn't have those schema changes, so the problem won't happen there.
[09:11] Yep, that's why I'm using it.
[09:12] OK
[09:13] Would love to hear more if you run into that migration problem again — I did a similar experiment here and didn't have the problem.
[09:15] rvba: did you look at why CI is failing?
[09:15] bigjools: no; still busy QAing the fix for https://bugs.launchpad.net/maas/+bug/1341001
[09:15] Ubuntu bug 1341001 in MAAS 1.6 "maas.tgt not rewritten on 1.5 -> 1.6 cluster upgrade" [Critical,Triaged]
[09:16] bigjools: error installing: http://paste.ubuntu.com/7797564/
[09:17] rvba: looks like a dependent package is broken
[09:18] very hard to see which one though
=== CyberJacob|Away is now known as CyberJacob
[09:21] bigjools: job #215 failed with the error I pasted above. job #214 failed because the cluster didn't connect to the region (I'm assuming this is somehow related to the problems Jeroen is working on); but the dependent packages were exactly the same in the two runs.
[09:22] rvba: could have been an upload in the interim. Is this utopic or trusty? they both fail
[09:22] bigjools: utopic. Like I said, I retrieve the list of the packages that have been installed and there is no difference.
[09:23] rvba: same versions?
[09:23] Yes
[09:23] ok
[09:23] weird then
[09:23] very weird
[09:23] I'm diffing the lines:
[09:23] Get:1 http://archive.ubuntu.com/ubuntu/ utopic/main libc-bin amd64 2.19-4ubuntu1 [1169 kB]
[09:23] …
[09:23] k
[09:24] I'm running a test on 1.6 (topic-adt-maas-manual).
[09:26] bigjools: the failure of the trusty job is different. That job uses Trusty. Seems like revision 2290 of lp:maas/1.5 broke the build (unless it's spurious).
[09:27] bigjools: 2290 is the backport of the nonce stuff.
[09:27] rvba: fuck
[09:27] bigjools: I started another job, just to make sure the failure is consistent.
[09:28] ok thanks
[09:28] back it out if it fails again
[09:28] k
[09:30] bigjools: allenap: fwiw there is a card on the board (in the robustness lane) for the "Move power-related tasks over to twisted" task.
[09:30] And no, I'm not insisting ;).
[09:30] it'll need breaking down, but that's ok
[09:32] bigjools: QA ok for the fix for https://bugs.launchpad.net/maas/+bug/1341001
[09:32] Ubuntu bug 1341001 in MAAS 1.6 "maas.tgt not rewritten on 1.5 -> 1.6 cluster upgrade" [Critical,Triaged]
[09:37] bigjools: hum, the precise job also failed and the MAAS version didn't change. Something might be wrong with the lab itself.
[09:38] rvba: yeah, that would be my next suspect
[09:41] bigjools: I'm backporting the fix for https://bugs.launchpad.net/maas/+bug/1341001 to 1.6
[09:41] Ubuntu bug 1341001 in MAAS 1.6 "maas.tgt not rewritten on 1.5 -> 1.6 cluster upgrade" [Critical,Triaged]
[09:41] rvba: thank you
[09:41] you're too kind :)
[10:29] allenap: I still get the repeated exception message when I customize __str__/__unicode__ on my exception class.
[10:29] rvba: Can you show me the code?
[10:32] allenap: just one sec…
[10:34] allenap: http://paste.ubuntu.com/7797848/
[10:35] allenap: http://paste.ubuntu.com/7797873/ wqorks
[10:35] works*
[10:36] rvba: Intriguing. I like the alternative constructor pattern. (Though don’t inherit from BaseException.)
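The doubled message discussed above can be reproduced in a few lines. This is a minimal sketch in Python 3 syntax, not the code from the pastes (which are not reproduced in this log): the class names, `from_system_id`, and `recreate` are hypothetical, and `recreate` only stands in for an RPC layer that rebuilds an exception from its text on the calling side.

```python
class NoSuchNodeMunged(Exception):
    """Hypothetical: builds the message inside __init__."""

    def __init__(self, system_id):
        super().__init__(
            "Node with system_id=%s could not be found." % system_id)


class NoSuchNode(Exception):
    """Hypothetical: plain constructor plus an alternative constructor."""

    @classmethod
    def from_system_id(cls, system_id):
        # The message is built once, here, and passed through unchanged.
        return cls("Node with system_id=%s could not be found." % system_id)


def recreate(error):
    """Assumed stand-in for the RPC layer: rebuild the error from its text."""
    return type(error)(str(error))


# __init__ runs a second time on the already formatted text.
print(recreate(NoSuchNodeMunged("abc")))
# The plain constructor keeps the text as it was.
print(recreate(NoSuchNode.from_system_id("abc")))
```

With the message built in `__init__`, the rebuilt error wraps the text a second time; with the alternative constructor, the rebuilt error carries the original text.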
[10:36] allenap: right (I got rid of BaseException)
[10:39] allenap: where in the code does the exception recreation happen?
[10:42] rvba: See eb_massage_error in MAAS, or _massageError in amp.py.
[10:43] allenap: I'm confused, why is this in src/provisioningserver/rpc/testing/__init__.py ? i.e. in what seem to be a testing utility?
[10:43] seems*
[10:45] rvba: Because it’s a testing utility that avoids mimics a remote call without having to do messy stuff with sockets and suchlike.
[10:46] s/avoids /
[10:46] allenap: but outside of testing, the error is also recreated on the caller's side right?
[10:47] rvba: Yes. eb_massage_error is meant to make call_responder behave more like a real RPC call.
=== CyberJacob is now known as CyberJacob|Away
[10:48] allenap: all right. It feels a bit weird that we have to manually recreate that behavior in testing but okay..
[10:52] bigjools: the 1.5 (trusty) job passed. It was indeed a problem with a node, the nonce stuff is okay.
[10:53] cool
[10:53] thanks for checking it out
[10:53] I'm running all the other jobs now.
[10:54] A run with 1.6 is in progress now.
[10:54] If this one passes, it will be time to release another beta package.
=== SolutionL is now known as Solution-X
[11:21] let me know
=== CyberJacob|Away is now known as CyberJacob
[11:43] bigjools: test passed (revision 2534 on lp:maas/1.6).
[11:45] thanks
[11:47] Trusty CI is better now.
[11:47] Oh, that's 1.5. :(
[11:51] The failure has changed. Could it be that the CI needs to refresh its knowledge of the API?
[11:59] But then why did previous updates to the NGI succeed?
[12:01] Ahh, no, it didn't.
[12:02] jtv: looks like it's the parsing of MAAS' output that failed.
[12:03] jtv: no, I'm wrong, it's related to the recent addition of the NGI's name.
[12:06] Yes. Question is, how?
[12:06] The API doesn't expect values for all of the form fields, does it?
[12:07] Depends on the form that handles the data.
[12:08] The field is taken directly from the model, and it's blank=True null=False.
[12:09] The form's cleaning provides a value if it's blank.
[12:09] Or at least, that's the theory.
[12:11] The parsing fails because _set_up_dhcp needs to check manually for an error return from _run_maas_cli, and doesn't.
[12:11] It receives error output and assumes it gets usable json.
[12:12] Right.
[12:12] The only hint we get from the CLI is "name" — which we don't specify, so that's why I wonder if there's a problem with that.
[12:14] I'm going to see if I can reproduce the problem with a simple API/CLI update.
[12:14] Should do the trick.
=== CyberJacob is now known as CyberJacob|Away
[12:23] Seems to work on the API... :/
[12:31] class Action(Command):
[12:31]     uri = property(lambda self: self.handler["uri"])
[12:31]     def __call__(self, options):
[12:31]         uri = self.uri.format(**vars(options))
[12:32] ^ This is why we don't want code to be clever or "use the language to its full."
[12:32] That last assignment is the source of the error.
[12:32] Whoopee, right?
[12:34] What's 'options'?
[12:34] Parameter to __call__.
[12:34] No shit :)
[12:35] If you want to know more, and so do I, let's ask allenap...
[12:35] heh
[12:36] * allenap reads backwards
[12:37] allenap: the main thing is that bit of code (class Action(Command):) from the CLI.
[12:37] jtv: What’s going wrong?
[12:37] KeyError.
[12:38] In that self.uri.format line.
[12:39] jtv: Is it saying which key is missing?
[12:39] Yes. It's ‘name’.
[12:40] So that’s something in the URI template that Django/Piston generates.
[12:40] Where do they get the information for generating that URI template?
[12:41] It comes from the Piston handler class.
[12:41] resource_uri_template
[12:41] Then I think understanding dawns.
[12:41] See describe_handler in maasserver.apidoc.
[12:42] The CLI call isn't passing the name field, which is supposed to go in the URL path.
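The parsing failure traced earlier in this stretch (a caller that treats whatever the CLI prints as JSON) is usually avoided by checking the exit status before parsing. The following is only a sketch of that general pattern under stated assumptions: `run_cli_json` and `CLIError` are hypothetical names, and this is not the code of `_run_maas_cli` or `_set_up_dhcp`, which the log does not show.

```python
import json
import subprocess
import sys


class CLIError(Exception):
    """Hypothetical: raised when the command exits with a non-zero status."""


def run_cli_json(argv):
    """Run a command and parse its stdout as JSON, but only on success."""
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        # Report the command's own error text instead of trying to parse it.
        raise CLIError(
            "%s exited with status %d: %s"
            % (argv[0], result.returncode, result.stderr.strip()))
    return json.loads(result.stdout)
```

A caller written this way would report the CLI's error text directly, rather than a JSON decoding failure further down the line.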
[12:46] jtv: That implies that the CLI doesn’t know the name field is required. I think that parameters can be declared, but they need to be part of the URL path that a Django/Piston handler is registered with, or something like that.
[12:47] Which it should be :-/
[12:47] The CLI doesn't know that the name field is _not_ required. :(
[12:48] So the URI template says "give me here" and the CLI code tries to take from its parameters dict, which doesn't have it.
[12:49] jtv: What does the URL look like?
[12:50] http://localhost:5240/api/1.0/nodegroups/{uuid}/interfaces/{name}/
[12:50] jtv: Why is the name optional?
[12:50] What behaviour does that produce?
[12:52] Without the {name} filled in, I think that should resolve to the NodeGroupInterfacesHandler. With {name} it should resolve to NodeGroupInterfaceHandler, i.e. the singular.
[12:52] The URL used to have the network interface name in it. That's been changed to be the cluster interface name.
[12:52] So the request doesn't need to pass ‘name’ as a field, only as part of the URL.
[12:55] jtv: Can you paste the error for me? I don’t have the CI email any more :-/
[12:57] No point. You get less useful information than this.
[12:58] The problem is that the CLI takes the positional URL parameters by name instead of positionally.
[12:58] No idea how to fix that. :(
[13:00] jtv: Maybe the CI script needs to do `maas refresh` to get updated API descriptions?
[13:01] allenap: the CI use a new VM instance (to install MAAS) for each run.
[13:09] allenap: no, a refresh doesn't do it. The problem is that the CLI takes the positional parameters from the dict of keyword arguments — which I suppose also spells trouble for any case where an identifying field is editable.
[13:19] jtv: Do you have some instructions so that I can reproduce this?
[13:20] allenap: ./bin/maas node-group-interface update
[13:20] Produces: ./bin/maas: error: u'name'
[13:21] Debugging shows that it's a KeyError while attempting to format that URI.
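The KeyError described above follows from how `str.format` treats named placeholders: every `{placeholder}` in the template needs a matching keyword argument. A minimal sketch in Python 3 syntax, using the URL quoted in the log; `format_uri` and `missing_placeholder` are hypothetical helpers, and an `argparse.Namespace` is assumed as the `options` object.

```python
from argparse import Namespace

# URI template as quoted in the log.
URI_TEMPLATE = (
    "http://localhost:5240/api/1.0/nodegroups/{uuid}/interfaces/{name}/")


def format_uri(template, options):
    """Fill the template from the options, like the quoted format() line."""
    return template.format(**vars(options))


def missing_placeholder(template, options):
    """Return the first placeholder the options cannot fill, or None."""
    try:
        format_uri(template, options)
    except KeyError as error:
        return error.args[0]
    return None


# Options that carry only the cluster UUID: {name} has no value.
print(missing_placeholder(URI_TEMPLATE, Namespace(uuid="cluster-1")))
# Options that carry both placeholders format cleanly.
print(format_uri(URI_TEMPLATE, Namespace(uuid="cluster-1", name="eth0")))
```

The bare key name is all that a KeyError carries, which is consistent with the terse `error: u'name'` output reported by the Python 2 CLI.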
[13:23] To add to the fun, passing ‘name’ as a named parameter doesn't help.
[13:27] jtv: It also believes that the name is required for bin/maas node-group-interface read
[13:28] Duh.
[13:28] The problem is that the CLI code passes that parameter around by name, instead of positionally.
[13:30] jtv: NodeGroupInterfaceHandler.resource_uri is still returning ‘interface’ as a parameter. Not sure if that would explain everything, but it’s worth giving a go.
[13:30] * allenap gives it a go.
[13:31] Passing name and interface both doesn't change it for me. :(
[13:32] https://www.irccloud.com/pastebin/vlInRrAc
[13:32] jtv: ^ try applying that, restarting, refreshing the cli.
[13:43] allenap, jtv: fwiw, I filed https://bugs.launchpad.net/maas/+bug/1342117 for the cryptic error: u'name' failure
[13:43] Ubuntu bug 1342117 in MAAS "CLI command to set up node-group-interface fails with /usr/lib/python2.7/dist-packages/maascli/__main__.py: error: u'name'" [Critical,Triaged]
[13:43] Thanks matsubara. We're just looking into it.
[13:44] jtv, yep, I saw the backlog. Thanks. (And thanks for fixing the other nodegroup crash!)
[14:03] allenap: your paste (with a refresh) fixes the problem!
[14:04] I thought that part where these methods substitute a fixed string was only there for documentation..?
[14:05] jtv: There’s still an oddity where `… node-group-interfaces list` requires a `uuid` argument. I think NodeGroupInterfacesHandler.resource_uri needs adjusting.
[14:06] jtv: Nope, waste not want not: it was repurposed for driving the CLI too :)
[14:07] jtv: Actually, and I don’t know why, NodeGroupInterfaces.list() takes a uuid argument, which I think it probably shouldn’t.
=== CyberJacob|Away is now known as CyberJacob
[14:58] allenap: IIRC NodeGroupInterfaces.list lists the cluster interfaces for one cluster — and so you need to tell it which cluster.
[15:09] Exceedingly small branch for review: https://code.launchpad.net/~jtv/maas/ngi-name-ci-breakage/+merge/226859
=== lutostag_ is now known as lutostag
=== Solution-X is now known as Solution-X|AFK
=== sputnik1_ is now known as sputnik13net
=== roadmr is now known as roadmr_afk
=== CyberJacob is now known as CyberJacob|Away
=== roadmr_afk is now known as roadmr
=== CyberJacob|Away is now known as CyberJacob
=== matsubara is now known as matsubara-afk
=== CyberJacob is now known as CyberJacob|Away
=== matsubara-afk is now known as matsubara
=== CyberJacob|Away is now known as CyberJacob
[21:26] hi, I'm using maas + juju + virsh, installed as the doc said and it's working
[21:27] I have the nodes installed and when they boot they show as "enlisted-node-XXX"
[21:27] but I cant get juju to do the bootstrap
[21:28] because every time I run "juju boostrap" it starts a VM and the operating system installation runs again
[21:28] what I am doing wrong?
[21:42] hi out there, have a maas lxc related question can someone help??
[21:46] schegi_: nobody will know until you actually ask your question
[21:48] thought it was more juju related, bootstraped a maas environemt and was wondering how to define the bridge device which is used if an service is deployed in a container.
[21:48] asked in the juju channel
[22:11] why maas keeps reinstalling a node?
=== jfarschman is now known as MilesDenver
=== CyberJacob is now known as CyberJacob|Away