=== Guest14971 is now known as wallyworld | ||
=== CyberJacob|Away is now known as CyberJacob | ||
rvba | bigjools: did you QA the fix for https://bugs.launchpad.net/maas/+bug/1341001? (I can help with that if you didn't) | 07:01 |
---|---|---|
ubot5 | Ubuntu bug 1341001 in MAAS 1.6 "maas.tgt not rewritten on 1.5 -> 1.6 cluster upgrade" [Critical,Triaged] | 07:01 |
bigjools | rvba: I did not | 07:01 |
bigjools | bit sidetracked by other stuff today | 07:02 |
rvba | Okay, I'll do it now. | 07:02 |
=== CyberJacob is now known as CyberJacob|Away | ||
bigjools | easy review karma: https://code.launchpad.net/~julian-edwards/maas/cluster-task--report-last-import-time/+merge/226782 | 07:40 |
rvba | bigjools: approved | 07:44 |
bigjools | rvba: thanks | 07:44 |
bigjools | "make lint" always makes me chuckle | 07:46 |
rvba | allenap: hi, I see you've changed the RPC stuff so that exceptions are properly propagated. | 08:18 |
rvba | jtv: I'm still in the process of making sure it's repeatable but when I tried upgrading my 1.5.2 installation to trunk I got: http://paste.ubuntu.com/7797346/. Any idea where this is coming from? | 08:20 |
jtv | rvba: yes, that's a known bug — fix is in review. | 08:21 |
rvba | Ah okay. Ta. | 08:21 |
jtv | bigjools: I replied to your review. | 08:21 |
rvba | allenap: the error I'm getting on the cluster when a NoSuchNode is raised on the Region looks like this: 'Node with system_id=Node with system_id=unknown-system-id-dPaTUv could not be found. could not be found.' | 08:22 |
rvba | allenap: Looks like the code from paste.ubuntu.com/7797390/ is "applied" twice or something. | 08:22 |
allenap | rvba: Ah, that’ll be because you’ve customised __init__ to munge the message. | 08:22 |
rvba | allenap: I see, the error is "recreated" on the other side… | 08:23 |
rvba | Right? | 08:23 |
allenap | rvba: Yep. | 08:23 |
allenap | rvba: You could instead customise __str__ (and __unicode__). | 08:24 |
rvba | allenap: that's precisely what I'm trying now :). | 08:24 |
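
A minimal sketch of the duplication described above, assuming the receiving side rebuilds the exception by passing the original's string form back to the same class. `NoSuchNode` here is a stand-in rather than the MAAS class, and `recreate` is a hypothetical helper, not the real RPC code.

```python
class NoSuchNode(Exception):
    """Stand-in exception whose __init__ munges its argument into a message."""

    def __init__(self, system_id):
        super().__init__(
            "Node with system_id=%s could not be found." % system_id)


def recreate(error):
    """Hypothetical: mimic an RPC layer rebuilding an error from its text."""
    return type(error)(str(error))


original = NoSuchNode("abc123")
remote = recreate(original)

# The already-formatted message is fed through __init__ a second time.
print(str(original))
print(str(remote))
```

Under the same assumption, moving the formatting into `__str__` would not help either, because the rebuilt exception's `__str__` would format the already-formatted text again.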
rvba | jtv: lp:~jtv/maas/bug-1340896 is the branch that fixes the upgrade problem right? | 08:57 |
jtv | Yes. | 08:58 |
jtv | See the bug. | 08:58 |
jtv | You might actually have insights useful for the review discussion. | 08:59 |
rvba | Okay, I'll look at it in a bit. I'm in the middle of doing some QA now and I'd like to get that sorted first… | 08:59 |
rvba | jtv: got the same upgrade problem with the fixes from that branch :/ | 09:04 |
jtv | Hrrrh!? | 09:06 |
rvba | It's possible that my installation is broken… I've been fighting with it all morning. | 09:07 |
jtv | Then maybe we need 3 migrations after all... | 09:07 |
jtv | But I did make it go through the migrations forwards and backwards with cluster interfaces present. | 09:07 |
jtv | (And I observed the changes in an SQL shell) | 09:07 |
* rvba switches to maas/1.6 | 09:09 | |
rvba | jtv: happy to do some more testing later (and have a look at that branch/bug) but right now, I just want to get through with the QA. | 09:10 |
* rvba backports fix. | 09:10 | |
jtv | rvba: right 1.6 doesn't have those schema changes, so the problem won't happen there. | 09:11 |
rvba | Yep, that's why I'm using it. | 09:11 |
jtv | OK | 09:12 |
jtv | Would love to hear more if you run into that migration problem again — I did a similar experiment here and didn't have the problem. | 09:13 |
bigjools | rvba: did you look at why CI is failing? | 09:15 |
rvba | bigjools: no; still busy QAing the fix for https://bugs.launchpad.net/maas/+bug/1341001 | 09:15 |
ubot5 | Ubuntu bug 1341001 in MAAS 1.6 "maas.tgt not rewritten on 1.5 -> 1.6 cluster upgrade" [Critical,Triaged] | 09:15 |
rvba | bigjools: error installing: http://paste.ubuntu.com/7797564/ | 09:16 |
bigjools | rvba: looks like a dependent package is broken | 09:17 |
bigjools | very hard to see which one though | 09:18 |
=== CyberJacob|Away is now known as CyberJacob | ||
rvba | bigjools: job #215 failed with the error I pasted above. job #214 failed because the cluster didn't connect to the region (I'm assuming this is somehow related to the problems Jeroen is working on); but the dependent packages were exactly the same in the two runs. | 09:21 |
bigjools | rvba: could have been an upload in the interim. Is this utopic or trusty? they both fail | 09:22 |
rvba | bigjools: utopic. Like I said, I retrieved the list of the packages that have been installed and there is no difference. | 09:22 |
bigjools | rvba: same versions? | 09:23 |
rvba | Yes | 09:23 |
bigjools | ok | 09:23 |
bigjools | weird then | 09:23 |
bigjools | very weird | 09:23 |
rvba | I'm diffing the lines: | 09:23 |
rvba | Get:1 http://archive.ubuntu.com/ubuntu/ utopic/main libc-bin amd64 2.19-4ubuntu1 [1169 kB] | 09:23 |
rvba | … | 09:23 |
bigjools | k | 09:23 |
rvba | I'm running a test on 1.6 (topic-adt-maas-manual). | 09:24 |
rvba | bigjools: the failure of the trusty job is different. That job uses Trusty. Seems like revision 2290 of lp:maas/1.5 broke the build (unless it's spurious). | 09:26 |
rvba | bigjools: 2290 is the backport of the nonce stuff. | 09:27 |
bigjools | rvba: fuck | 09:27 |
rvba | bigjools: I started another job, just to make sure the failure is consistent. | 09:27 |
bigjools | ok thanks | 09:28 |
bigjools | back it out if it fails again | 09:28 |
rvba | k | 09:28 |
rvba | bigjools: allenap: fwiw there is a card on the board (in the robustness lane) for the "Move power-related tasks over to twisted" task. | 09:30 |
rvba | And no, I'm not insisting ;). | 09:30 |
bigjools | it'll need breaking down, but that's ok | 09:30 |
rvba | bigjools: QA ok for the fix for https://bugs.launchpad.net/maas/+bug/1341001 | 09:32 |
ubot5 | Ubuntu bug 1341001 in MAAS 1.6 "maas.tgt not rewritten on 1.5 -> 1.6 cluster upgrade" [Critical,Triaged] | 09:32 |
rvba | bigjools: hum, the precise job also failed and the MAAS version didn't change. Something might be wrong with the lab itself. | 09:37 |
bigjools | rvba: yeah, that would be my next suspect | 09:38 |
rvba | bigjools: I'm backporting the fix for https://bugs.launchpad.net/maas/+bug/1341001 to 1.6 | 09:41 |
ubot5 | Ubuntu bug 1341001 in MAAS 1.6 "maas.tgt not rewritten on 1.5 -> 1.6 cluster upgrade" [Critical,Triaged] | 09:41 |
bigjools | rvba: thank you | 09:41 |
bigjools | you're too kind :) | 09:41 |
rvba | allenap: I still get the repeated exception message when I customize __str__/__unicode__ on my exception class. | 10:29 |
allenap | rvba: Can you show me the code? | 10:29 |
rvba | allenap: just one sec… | 10:32 |
rvba | allenap: http://paste.ubuntu.com/7797848/ | 10:34 |
rvba | allenap: http://paste.ubuntu.com/7797873/ works | 10:35 |
allenap | rvba: Intriguing. I like the alternative constructor pattern. (Though don’t inherit from BaseException.) | 10:36 |
rvba | allenap: right (I got rid of BaseException) | 10:36 |
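
The pastes are not reproduced in this log, so the following is only a guess at the shape of the alternative constructor pattern mentioned above. It assumes the exception keeps the default `__init__` and builds its message in a classmethod, so that rebuilding it from its string form is harmless. `from_system_id` and `recreate` are hypothetical names.

```python
class NoSuchNode(Exception):
    """Stand-in exception that leaves __init__ untouched."""

    @classmethod
    def from_system_id(cls, system_id):
        # Format the message once, at the point where the error is raised.
        return cls("Node with system_id=%s could not be found." % system_id)


def recreate(error):
    """Hypothetical: mimic an RPC layer rebuilding an error from its text."""
    return type(error)(str(error))


original = NoSuchNode.from_system_id("abc123")
remote = recreate(original)

# Both sides now carry the same, singly formatted message.
print(str(remote))
```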
rvba | allenap: where in the code does the exception recreation happen? | 10:39 |
allenap | rvba: See eb_massage_error in MAAS, or _massageError in amp.py. | 10:42 |
rvba | allenap: I'm confused, why is this in src/provisioningserver/rpc/testing/__init__.py ? i.e. in what seems to be a testing utility? | 10:43 |
allenap | rvba: Because it’s a testing utility that mimics a remote call without having to do messy stuff with sockets and suchlike. | 10:45 |
rvba | allenap: but outside of testing, the error is also recreated on the caller's side right? | 10:46 |
allenap | rvba: Yes. eb_massage_error is meant to make call_responder behave more like a real RPC call. | 10:47 |
=== CyberJacob is now known as CyberJacob|Away | ||
rvba | allenap: all right. It feels a bit weird that we have to manually recreate that behavior in testing but okay.. | 10:48 |
rvba | bigjools: the 1.5 (trusty) job passed. It was indeed a problem with a node, the nonce stuff is okay. | 10:52 |
bigjools | cool | 10:53 |
bigjools | thanks for checking it out | 10:53 |
rvba | I'm running all the other jobs now. | 10:53 |
rvba | A run with 1.6 is in progress now. | 10:54 |
rvba | If this one passes, it will be time to release another beta package. | 10:54 |
=== SolutionL is now known as Solution-X | ||
bigjools | let me know | 11:21 |
=== CyberJacob|Away is now known as CyberJacob | ||
rvba | bigjools: test passed (revision 2534 on lp:maas/1.6). | 11:43 |
bigjools | thanks | 11:45 |
jtv | Trusty CI is better now. | 11:47 |
jtv | Oh, that's 1.5. :( | 11:47 |
jtv | The failure has changed. Could it be that the CI needs to refresh its knowledge of the API? | 11:51 |
jtv | But then why did previous updates to the NGI succeed? | 11:59 |
jtv | Ahh, no, it didn't. | 12:01 |
rvba | jtv: looks like it's the parsing of MAAS' output that failed. | 12:02 |
rvba | jtv: no, I'm wrong, it's related to the recent addition of the NGI's name. | 12:03 |
jtv | Yes. Question is, how? | 12:06 |
jtv | The API doesn't expect values for all of the form fields, does it? | 12:06 |
rvba | Depends on the form that handles the data. | 12:07 |
jtv | The field is taken directly from the model, and it's blank=True null=False. | 12:08 |
jtv | The form's cleaning provides a value if it's blank. | 12:09 |
jtv | Or at least, that's the theory. | 12:09 |
jtv | The parsing fails because _set_up_dhcp needs to check manually for an error return from _run_maas_cli, and doesn't. | 12:11 |
jtv | It receives error output and assumes it gets usable json. | 12:11 |
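
A rough sketch of the missing check described above, assuming the CLI runner hands back an exit status along with its output. The signatures of `_set_up_dhcp` and `_run_maas_cli` are not shown in this log, so `parse_cli_json` and `CLIError` are hypothetical names used only for illustration.

```python
import json


class CLIError(Exception):
    """Hypothetical error raised when the CLI reports a failure."""


def parse_cli_json(returncode, output):
    """Parse CLI output as JSON, but only after checking the exit status."""
    if returncode != 0:
        # Surface the CLI's own message instead of a confusing JSON error.
        raise CLIError(
            "CLI failed with status %d: %s" % (returncode, output.strip()))
    return json.loads(output)


print(parse_cli_json(0, '{"name": "eth0"}'))
```

With a check like this, a failing command would show up as a CLI failure carrying the CLI's message, rather than as a parsing failure further down.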
rvba | Right. | 12:12 |
jtv | The only hint we get from the CLI is "name" — which we don't specify, so that's why I wonder if there's a problem with that. | 12:12 |
jtv | I'm going to see if I can reproduce the problem with a simple API/CLI update. | 12:14 |
rvba | Should do the trick. | 12:14 |
=== CyberJacob is now known as CyberJacob|Away | ||
jtv | Seems to work on the API... :/ | 12:23 |
jtv | class Action(Command): | 12:31 |
jtv | uri = property(lambda self: self.handler["uri"]) | 12:31 |
jtv | def __call__(self, options): | 12:31 |
jtv | uri = self.uri.format(**vars(options)) | 12:31 |
jtv | ^ This is why we don't want code to be clever or "use the language to its full." | 12:32 |
jtv | That last assignment is the source of the error. | 12:32 |
jtv | Whoopee, right? | 12:32 |
rvba | What's 'options'? | 12:34 |
jtv | Parameter to __call__. | 12:34 |
rvba | No shit :) | 12:34 |
jtv | If you want to know more, and so do I, let's ask allenap... | 12:35 |
rvba | heh | 12:35 |
* allenap reads backwards | 12:36 | |
jtv | allenap: the main thing is that bit of code (class Action(Command):) from the CLI. | 12:37 |
allenap | jtv: What’s going wrong? | 12:37 |
jtv | KeyError. | 12:37 |
jtv | In that self.uri.format line. | 12:38 |
allenap | jtv: Is it saying which key is missing? | 12:39 |
jtv | Yes. It's ‘name’. | 12:39 |
allenap | So that’s something in the URI template that Django/Piston generates. | 12:40 |
jtv | Where do they get the information for generating that URI template? | 12:40 |
allenap | It comes from the Piston handler class. | 12:41 |
allenap | resource_uri_template | 12:41 |
jtv | Then I think understanding dawns. | 12:41 |
allenap | See describe_handler in maasserver.apidoc. | 12:41 |
jtv | The CLI call isn't passing the name field, which is supposed to go in the URL path. | 12:42 |
allenap | jtv: That implies that the CLI doesn’t know the name field is required. I think that parameters can be declared, but they need to be part of the URL path that a Django/Piston handler is registered with, or something like that. | 12:46 |
allenap | Which it should be :-/ | 12:47 |
jtv | The CLI doesn't know that the name field is _not_ required. :( | 12:47 |
jtv | So the URI template says "give me <name> here" and the CLI code tries to take <name> from its parameters dict, which doesn't have it. | 12:48 |
allenap | jtv: What does the URL look like? | 12:49 |
jtv | http://localhost:5240/api/1.0/nodegroups/{uuid}/interfaces/{name}/ | 12:50 |
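
The KeyError can be reproduced in isolation with plain `str.format`. This sketch assumes `options` is an `argparse.Namespace` that carries `uuid` but has no `name` attribute; the Namespace contents are made up for illustration.

```python
import argparse

# URI template as quoted in the conversation.
template = "http://localhost:5240/api/1.0/nodegroups/{uuid}/interfaces/{name}/"

# Assumed: the parsed options carry a uuid but no name.
options = argparse.Namespace(uuid="some-uuid")

missing = None
try:
    uri = template.format(**vars(options))
except KeyError as error:
    # str.format raises KeyError carrying the missing field name.
    missing = error.args[0]

print("missing template field:", missing)
```

Printing a bare KeyError gives only the quoted key, which would explain why the CLI's error message is just `u'name'`.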
allenap | jtv: Why is the name optional? | 12:50 |
allenap | What behaviour does that produce? | 12:50 |
allenap | Without the {name} filled in, I think that should resolve to the NodeGroupInterfacesHandler. With {name} it should resolve to NodeGroupInterfaceHandler, i.e. the singular. | 12:52 |
jtv | The URL used to have the network interface name in it. That's been changed to be the cluster interface name. | 12:52 |
jtv | So the request doesn't need to pass ‘name’ as a field, only as part of the URL. | 12:52 |
allenap | jtv: Can you paste the error for me? I don’t have the CI email any more :-/ | 12:55 |
jtv | No point. You get less useful information than this. | 12:57 |
jtv | The problem is that the CLI takes the positional URL parameters by name instead of positionally. | 12:58 |
jtv | No idea how to fix that. :( | 12:58 |
allenap | jtv: Maybe the CI script needs to do `maas refresh` to get updated API descriptions? | 13:00 |
rvba | allenap: the CI uses a new VM instance (to install MAAS) for each run. | 13:01 |
jtv | allenap: no, a refresh doesn't do it. The problem is that the CLI takes the positional parameters from the dict of keyword arguments — which I suppose also spells trouble for any case where an identifying field is editable. | 13:09 |
allenap | jtv: Do you have some instructions so that I can reproduce this? | 13:19 |
jtv | allenap: ./bin/maas <profile> node-group-interface update <uuid> <name> | 13:20 |
jtv | Produces: ./bin/maas: error: u'name' | 13:20 |
jtv | Debugging shows that it's a KeyError while attempting to format that URI. | 13:21 |
jtv | To add to the fun, passing ‘name’ as a named parameter doesn't help. | 13:23 |
allenap | jtv: It also believes that the name is required for bin/maas <profile> node-group-interface read | 13:27 |
jtv | Duh. | 13:28 |
jtv | The problem is that the CLI code passes that parameter around by name, instead of positionally. | 13:28 |
allenap | jtv: NodeGroupInterfaceHandler.resource_uri is still returning ‘interface’ as a parameter. Not sure if that would explain everything, but it’s worth giving a go. | 13:30 |
* allenap gives it a go. | 13:30 | |
jtv | Passing name and interface both doesn't change it for me. :( | 13:31 |
allenap | https://www.irccloud.com/pastebin/vlInRrAc | 13:32 |
allenap | jtv: ^ try applying that, restarting, refreshing the cli. | 13:32 |
matsubara | allenap, jtv: fwiw, I filed https://bugs.launchpad.net/maas/+bug/1342117 for the cryptic error: u'name' failure | 13:43 |
ubot5 | Ubuntu bug 1342117 in MAAS "CLI command to set up node-group-interface fails with /usr/lib/python2.7/dist-packages/maascli/__main__.py: error: u'name'" [Critical,Triaged] | 13:43 |
jtv | Thanks matsubara. We're just looking into it. | 13:43 |
matsubara | jtv, yep, I saw the backlog. Thanks. (And thanks for fixing the other nodegroup crash!) | 13:44 |
jtv | allenap: your paste (with a refresh) fixes the problem! | 14:03 |
jtv | I thought that part where these methods substitute a fixed string was only there for documentation..? | 14:04 |
allenap | jtv: There’s still an oddity where `… node-group-interfaces list` requires a `uuid` argument. I think NodeGroupInterfacesHandler.resource_uri needs adjusting. | 14:05 |
allenap | jtv: Nope, waste not want not: it was repurposed for driving the CLI too :) | 14:06 |
allenap | jtv: Actually, and I don’t know why, NodeGroupInterfaces.list() takes a uuid argument, which I think it probably shouldn’t. | 14:07 |
=== CyberJacob|Away is now known as CyberJacob | ||
jtv | allenap: IIRC NodeGroupInterfaces.list lists the cluster interfaces for one cluster — and so you need to tell it which cluster. | 14:58 |
jtv | Exceedingly small branch for review: https://code.launchpad.net/~jtv/maas/ngi-name-ci-breakage/+merge/226859 | 15:09 |
=== lutostag_ is now known as lutostag | ||
=== Solution-X is now known as Solution-X|AFK | ||
=== sputnik1_ is now known as sputnik13net | ||
=== roadmr is now known as roadmr_afk | ||
=== CyberJacob is now known as CyberJacob|Away | ||
=== roadmr_afk is now known as roadmr | ||
=== CyberJacob|Away is now known as CyberJacob | ||
=== matsubara is now known as matsubara-afk | ||
=== CyberJacob is now known as CyberJacob|Away | ||
=== matsubara-afk is now known as matsubara | ||
=== CyberJacob|Away is now known as CyberJacob | ||
dergrunepunkt | hi, I'm using maas + juju + virsh, installed as the doc said and it's working | 21:26 |
dergrunepunkt | I have the nodes installed and when they boot they show as "enlisted-node-XXX" | 21:27 |
dergrunepunkt | but I can't get juju to do the bootstrap | 21:27 |
dergrunepunkt | because every time I run "juju bootstrap" it starts a VM and the operating system installation runs again | 21:28 |
dergrunepunkt | what am I doing wrong? | 21:28 |
schegi_ | hi out there, have a maas lxc related question can someone help?? | 21:42 |
rbasak | schegi_: nobody will know until you actually ask your question | 21:46 |
schegi_ | thought it was more juju related, bootstrapped a maas environment and was wondering how to define the bridge device which is used if a service is deployed in a container. | 21:48 |
schegi_ | asked in the juju channel | 21:48 |
dergrunepunkt_ | why does maas keep reinstalling a node? | 22:11 |
=== jfarschman is now known as MilesDenver | ||
=== CyberJacob is now known as CyberJacob|Away |