[05:20] <designated> Does anyone have more comprehensive documentation for curtin?  Specifically with regards to being more granular in creating partitions and configuring raid.  Raid0, simple and simple_boot will not fit my needs.  Can this be done using in-target commands?  Is there somewhere I can find proper formatting of the OUTPUT_FSTAB file that gets created?
[10:50] <jtv> I wonder why CI hits a timeout error during sleep().
[10:51] <jtv> rvba, do you have any idea about ^ ?
[10:52] <rvba> jtv: the latest failure on Utopic (#70) is failing because a deployed node never comes online.
[10:53] <rvba> jtv: not sure what the reason is but I've got a manual run in progress right now (I wanted to test ~rvb/maas/retry-bug-1398082 before I realized the CI was red on trunk).
[10:59] <jtv> Yes, I was just trying to provide some clearer output there and found I was puzzled by what the story actually was.
[11:14] <jtv> rvba: your manual build failed as well.
[11:14] <ahasenack> g
[11:14] <jtv> rvba: I'm running my own manual test for a branch that makes these failures slightly easier to debug.
[11:17] <jtv> Okay, I see now why the sleep() times out: it's the timeout decorator.  But it uses signals to capture this event...  I wonder if we're not accidentally catching a timeout that's meant to end the sleep()?
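The behaviour jtv describes, a timeout decorator using signals that ends up interrupting a sleep(), can be illustrated with a minimal generic sketch (this is not the actual CI code, just a plain SIGALRM-based decorator showing how the alarm fires mid-sleep):

```python
import signal
import time

class Timeout(Exception):
    """Raised by the SIGALRM handler when the deadline passes."""

def timeout(seconds):
    """Generic signal-based timeout decorator (illustrative only)."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            def handler(signum, frame):
                raise Timeout("timed out after %ds" % seconds)
            old_handler = signal.signal(signal.SIGALRM, handler)
            signal.alarm(seconds)
            try:
                return func(*args, **kwargs)
            finally:
                # Always cancel the alarm and restore the previous handler.
                signal.alarm(0)
                signal.signal(signal.SIGALRM, old_handler)
        return wrapper
    return decorator

@timeout(1)
def slow():
    # SIGALRM arrives while we are blocked here; the handler's exception
    # propagates out of sleep(), which is exactly the symptom in the CI log.
    time.sleep(5)

try:
    slow()
    interrupted = False
except Timeout:
    interrupted = True
```

Because the exception is raised from inside sleep(), the traceback points at the sleep call even though the real cause is the surrounding decorator.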
[11:17] <jtv> allenap would know.
[11:21] <rvba> jtv: are you talking about the CI code or the MAAS code?
[11:22] <jtv> rvba: CI code, sorry.
[11:23] <rvba> jtv: it seems to me the timeout comes from Jenkins and not the CI code.
[11:24] <rvba> jtv: but let's wait for the current run to finish and we will investigate the failure if it fails.
[11:24] <jtv> Sure.  It's just that I just proposed a branch to improve the error output a bit.
[11:25] <jtv> Because the output was one of those things that I think kept us from debugging failures.
[11:27] <jtv> gmb, rvba: manual utopic run #22 is mine — I guess #23 is gmb's?
[11:27] <rvba> jtv: #23 is mine
[11:28] <jtv> OK.  Mine just went to a different "pending" state... not familiar.
[11:28] <rvba> jtv: I think the failure I got in #21 is spurious
[11:28] <jtv> It was the same one that we'd seen before though.
[11:28] <jtv> Timeout during sleep().  And no repeated calls to justify it.
[11:29] <jtv> *Could* be Jenkins I guess, but our code catches the signal and it's probably not really supposed to.
[11:30] <rvba> jtv: the failure in #21 is clear: TestMAASIntegration.test_check_nodes_ready failed because one of the nodes didn't make it to 'Ready'.
[11:31] <jtv> rvba: even so, the code would normally have retried the listing.
[11:31] <jtv> It looks to me as if it only tried once.
[11:31] <jtv> Or do "details" from later runs overwrite earlier ones?
[11:31] <rvba> jtv: the retry happened but the node was stuck
[11:32] <jtv> So then I guess the details are only shown for the last attempt to list nodes.
[11:32] <rvba> jtv: yes
[11:43] <gmb> jtv, rvba: That failure in TestMAASIntegration.test_check_nodes_ready is similar to the one that Gavin was banging his head against during Austin week; it turned out to be because of Piston's anonymous handlers, and really ought to be fixed… We'll see, depending on what the current build does.
[11:44] <jtv> OK
[12:10] <gmb> jtv, rvba: Utopic adt job is now green again.
[12:14] <jtv> \o/
[12:27]  * gmb -> dentist. Back later.
[13:35] <jtv> Jenkins would be so much nicer if it could show who requested a build...  I'm sure there's an option somewhere.
[15:15] <designated> Does anyone have more comprehensive documentation for curtin?  Specifically with regards to being more granular in creating partitions and configuring raid.  Raid0, simple and simple_boot will not fit my needs.  Can this be done using in-target commands?  Is there somewhere I can find proper formatting of the OUTPUT_FSTAB file that gets created?
[15:17] <designated> or do I have to go back to using preseeds?
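For anyone landing here with the same question: curtin's config does allow replacing the built-in partitioning ("simple", "simple_boot", "raid0") with your own commands, and running post-install steps via `curtin in-target`. The sketch below follows curtin's config format as of this era, but the script path is hypothetical and key behaviour should be verified against curtin's source or examples directory:

```yaml
# Hedged sketch of a curtin config override; /custom/partition.sh is a
# hypothetical placeholder for your own partitioning/RAID script.
partitioning_commands:
  builtin: ["/custom/partition.sh"]
late_commands:
  # Commands after "--" run inside the installed target filesystem.
  10_show_fstab: ["curtin", "in-target", "--", "cat", "/etc/fstab"]
```

With this approach the OUTPUT_FSTAB contents can be inspected from a late command rather than reverse-engineered from the source.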
[15:24] <roadmr> designated: I usually resort to reading curtin's source code :/ heheh
[16:33] <designated> roadmr: I'm not much of a programmer but I guess if that's the only way...time to learn python.
[16:36] <roadmr> designated: heheh ... sorry I couldn't offer a better solution
[16:40] <designated> roadmr: no worries.  thanks for the response.
[18:39] <designated> I'm trying to enlist a node in MAAS 1.7 that worked fine under MAAS 1.5.  I can see "maas-enlisting-node login:" on the console but it's just been sitting there for 40+ minutes.  Here is the output I see from /var/log/maas/maas.log http://pastebin.com/AzJjAwDb . Anyone have any ideas on where to start troubleshooting?
[18:45] <roadmr> designated: whoa, enlisting should take only a few minutes... if the node shows up on maas, I guess you've already tried deleting it and reenlisting it?
[18:48] <designated> roadmr: this node never shows up in MAAS, it PXE boots and then sits at the login prompt.  unfortunately I don't see any other logs regarding what could be wrong.  Am I going to have to add backdoor credentials to try and get into the device to check the local log?
[18:55] <roadmr> designated: odd, and you have other nodes that do work?
[18:56] <designated> roadmr: I successfully enlisted and commissioned another node earlier but I just tried another one and it's exhibiting the same behavior.
[18:56] <designated> I'm going to try and restart all maas services
[18:56] <roadmr> designated: cool, keep us updated
[19:11] <designated> roadmr: I can easily restart maas with "service apache2 restart" but how do I restart the associated services?  I'm not finding anything in the documentation.
[19:11] <roadmr> designated: look in /etc/init, there's a bunch of maas-* services, those are the names you'd have to restart
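roadmr's suggestion can be scripted. This sketch assumes MAAS 1.7's upstart job files live in `/etc/init` (true on Ubuntu 14.04); it prints the restart commands rather than running them, so drop the `echo` to actually restart the services:

```shell
# Enumerate the maas-* upstart jobs and build a restart command for each.
# Remove "echo" to execute the restarts for real (requires root).
for f in /etc/init/maas-*.conf; do
    svc=$(basename "$f" .conf)
    echo sudo service "$svc" restart
done
```

The exact set of `maas-*` jobs varies between MAAS versions, which is why listing the directory beats hard-coding service names.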
[19:26] <designated> roadmr: I restarted everything successfully.  The second node I booted up enlisted just fine, the original problem node is still not enlisting.  I just restarted the problem node to try one more time.
[19:27] <roadmr> designated: ok... maybe maas has some record of the node's mac address somewhere, so that's why it complains. I don't know much about maas internals or where it would store this data though :/ a database perhaps?