=== CyberJacob|Away is now known as CyberJacob | ||
=== CyberJacob is now known as CyberJacob|Away | ||
rvba | gmb: question for you: when I'm performing a bulk operation on two nodes that are both being powered up, instead of the expected summary message (about the fact that the 2 operations can't be performed), I get only one message about one node. | 10:04 |
---|---|---|
rvba | gmb: is that expected? | 10:04 |
rvba | gmb: from the logs it seems there is only one failure happening, the 'abort commissioning' operation fails with the first node and the second operation (aborting the second node) isn't even tried. | 10:12 |
gmb | rvba: I don't know about expected, but remember that under-the-hood, although Node.objects.start_nodes() accepts a collection of Nodes to start, it will get called thus: | 10:22 |
gmb | Node.objects.start_nodes([node_1]) | 10:23 |
gmb | Node.objects.start_nodes([node_2]) | 10:23 |
gmb | That may be why you're only getting one error. | 10:23 |
gmb | Though I don't know the entire context, so I don't know why you'd expect to see both. | 10:23 |
gmb | rvba: Ah, although… because it's called one at a time, and each one raises an error, and the error handling middleware adds a message and then redirects… that's probably why you'd see only one message. | 10:24 |
rvba | gmb: I've done another experiment: | 10:25 |
rvba | I've got two nodes. I commission one of them and wait until the power command is executed. | 10:26 |
rvba | Then I commission another node and quickly after that (i.e. while the lock on the second node is still there) I try to abort the commissioning operation for the *2* nodes. | 10:26 |
rvba | gmb: guess what happened :) | 10:27 |
rvba | I got a message about the fact that it's impossible to cancel the commission for node 2… and node 1 didn't get touched! | 10:27 |
gmb | rvba: Both aborts aborted? | 10:27 |
gmb | Oh! | 10:27 |
gmb | Haha | 10:27 |
gmb | Yeah, that doesn't actually surprise me. Again, see above… Your request hit an error, so *everything* in that transaction got aborted | 10:28 |
gmb | (i.e. both aborts aborted) | 10:28 |
rvba | Right. I understand… but that's a bit surprising from a user's perspective. | 10:29 |
gmb | agreed | 10:29 |
gmb | rvba: This is a problem that allenap and I were aware of; that we're going to end up with these kinds of something-should-be-unwound-something-shouldn't-be states. But we punted on it at the time because things were getting complex enough as it was. | 10:29 |
rvba | Ok. I'll still file a bug about it. | 10:30 |
gmb | rvba: Please do! | 10:30 |
rvba | We should probably use subtransactions here. | 10:30 |
allenap | rvba: Yep. | 10:32 |
allenap | Agreed. | 10:32 |
rvba | gmb: allenap: actually, it's much worse than that: in my example, the first node (which was halfway through commissioning) got shut down (but was still left in the commissioning state because the DB transaction got rolled back)! | 10:40 |
rvba | gmb: allenap: this is because the power operation is not part of the transaction and didn't get rolled back :). | 10:42 |
gmb | Yep. | 10:44 |
allenap | rvba: For that I think we should commit the transaction before issuing the power command. It may be that commit-before-rpc — or commit-before-doing-it — should be the approach everywhere. | 10:45 |
gmb | allenap, rvba, jtv: Review for one of you, if you please: https://code.launchpad.net/~gmb/maas/add-alerts-for-disconnected-clusters-bug-1341121/+merge/237039 | 10:50 |
jtv | I'll take it. | 10:50 |
gmb | Ta | 10:50 |
rvba | allenap: gmb: https://bugs.launchpad.net/maas/+bug/1377099 | 10:56 |
ubot5 | Ubuntu bug 1377099 in MAAS "Bulk operation leaves nodes in inconsistent state" [Critical,Triaged] | 10:56 |
rvba | gmb: testing your branch now | 11:07 |
gmb | rvba: Ta | 11:08 |
rvba | gmb: looks nice :) | 11:09 |
jtv | Thanks Gav. | 11:09 |
rvba | gmb: not a proper review but I added a couple of minor comments to your branch. | 11:11 |
gmb | k | 11:11 |
jtv | One reason to dislike IPv6: the digit is right in the dead zone on the keyboard. | 11:13 |
gmb | rvba, jtv: Thankee | 11:13 |
allenap | rvba: I’ve commented. | 12:24 |
=== jfarschman is now known as MilesDenver | ||
=== roadmr is now known as roadmr_afk | ||
gmb | rvba: Thanks for the painful bugs list :) | 14:57 |
rvba | gmb: welcome :) | 14:57 |
rvba | gmb: some of these bugs are not only painful but also painful to reproduce unfortunately. | 14:57 |
gmb | Yeah. | 14:58 |
gmb | rvba, allenap: Are either of you working on those bugs? If so, which? Need help? | 14:58 |
rvba | I assume allenap is still working on the RPC stuff. | 14:59 |
rvba | gmb: I've worked on 1376782 but it isn't fixed completely yet. | 15:00 |
rvba | gmb: it's one of the bugs that is really hard to reproduce. | 15:00 |
gmb | Right | 15:02 |
gmb | rvba: Can you be more specific on the bug about why the problem might still occur? (I ask because if I pick it up I'll need to refer to it later, and my memory is a sieve today due to jet lag) | 15:03 |
* gmb -> jetlag break; bbiab | 15:23 | |
rvba | blake_r: btw, great job on the import boot images JS, I've tested it and it's great! | 15:33 |
blake_r | rvba: thanks | 15:43 |
blake_r | rvba: glad you like it | 15:44 |
blake_r | rvba: only works for Ubuntu images, if you want to extend a FFe I can get the others as well, ;-) | 15:44 |
=== roadmr_afk is now known as roadmr | ||
=== beisner- is now known as beisner | ||
=== matsubara is now known as matsubara-lunch | ||
=== matsubara-lunch is now known as matsubara | ||
=== CyberJacob|Away is now known as CyberJacob | ||
=== CyberJacob is now known as CyberJacob|Away | ||
=== CyberJacob|Away is now known as CyberJacob | ||
=== roadmr is now known as roadmr_afk | ||
=== CyberJacob is now known as CyberJacob|Away | ||
=== CyberJacob|Away is now known as CyberJacob | ||
=== jfarschman is now known as MilesDenver | ||
=== jfarschman is now known as MilesDenver | ||
=== roadmr_afk is now known as roadmr | ||
=== jfarschman is now known as MilesDenver | ||
=== sebas538_ is now known as sebas5384 | ||
=== jfarschman is now known as MilesDenver |
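
The two ideas raised in the discussion above (subtransactions for each node in a bulk operation, and committing before issuing the power command) can be sketched roughly as follows. This is a minimal illustration using SQLite savepoints from the Python standard library, not MAAS's actual code: the `node` table, `NodeLocked`, `abort_commissioning`, `bulk_abort`, and the `power_off` callback are all hypothetical names invented for this example. In a Django codebase, nested `transaction.atomic()` blocks would likely play the role of the explicit savepoints.

```python
import sqlite3


class NodeLocked(Exception):
    """Hypothetical error: the node is busy and cannot be changed right now."""


def make_demo_db():
    """Create an in-memory database with two nodes that are commissioning."""
    # isolation_level=None disables the module's implicit transactions,
    # so BEGIN/COMMIT/SAVEPOINT below are fully under our control.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO node (id, status) VALUES (?, ?)",
        [(1, "commissioning"), (2, "commissioning")],
    )
    return conn


def abort_commissioning(conn, node_id, locked_ids):
    """Hypothetical per-node operation that may fail after a partial write."""
    conn.execute("UPDATE node SET status = 'ready' WHERE id = ?", (node_id,))
    if node_id in locked_ids:
        # The write above must be undone for this node only.
        raise NodeLocked(f"node {node_id} is locked")


def bulk_abort(conn, node_ids, locked_ids, power_off):
    """Abort commissioning for each node inside its own savepoint."""
    done, failed = [], []
    conn.execute("BEGIN")
    for node_id in node_ids:
        conn.execute("SAVEPOINT per_node")
        try:
            abort_commissioning(conn, node_id, locked_ids)
        except NodeLocked as error:
            # Undo only this node's changes; earlier nodes are unaffected.
            conn.execute("ROLLBACK TO SAVEPOINT per_node")
            failed.append((node_id, str(error)))
        else:
            done.append(node_id)
        finally:
            conn.execute("RELEASE SAVEPOINT per_node")
    # Commit first, then perform the side effects that cannot be rolled back.
    conn.execute("COMMIT")
    for node_id in done:
        power_off(node_id)
    return done, failed
```

Called with two nodes where the second is locked, `bulk_abort` records a failure for node 2, leaves it in the commissioning state, and still commits and powers off node 1. Because the power command is issued only after the commit, a failed database update can no longer leave a node powered off while its record still says it is commissioning. The opposite gap remains: a power command that fails after the commit would need its own handling.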