=== CyberJacob is now known as CyberJacob|Away | ||
=== MasterPieceF is now known as MasterPiece | ||
designated | Does anyone have more comprehensive documentation for curtin? Specifically with regards to being more granular in creating partitions and configuring raid. Raid0, simple and simple_boot will not fit my needs. Can this be done using in-target commands? Is there somewhere I can find proper formatting of the OUTPUT_FSTAB file that gets created? | 05:20 |
---|---|---|
jtv | I wonder why CI hits a timeout error during sleep(). | 10:50 |
jtv | rvba, do you have any idea about ^ ? | 10:51 |
rvba | jtv: the latest failure on Utopic (#70) is failing because a deployed node never comes online. | 10:52 |
rvba | jtv: not sure what the reason is but I've got a manual run in progress right now (I wanted to test ~rvb/maas/retry-bug-1398082 before I realized the CI was red on trunk). | 10:53 |
=== CyberJacob|Away is now known as CyberJacob | ||
jtv | Yes, I was just trying to provide some clearer output there and found I was puzzled by what the story actually was. | 10:59 |
jtv | rvba: your manual build failed as well. | 11:14 |
ahasenack | g | 11:14 |
jtv | rvba: I'm running my own manual test for a branch that makes these failures slightly easier to debug. | 11:14 |
jtv | Okay, I see now why the sleep() times out: it's the timeout decorator. But it uses signals to capture this event... I wonder if we're not accidentally catching a timeout that's meant to end the sleep()? | 11:17 |
jtv | allenap would know. | 11:17 |
rvba | jtv: are you talking about the CI code or the MAAS code? | 11:21 |
jtv | rvba: CI code, sorry. | 11:22 |
rvba | jtv: it seems to me the timeout comes from Jenkins and not the CI code. | 11:23 |
rvba | jtv: but let's wait for the current run to finish and we will investigate the failure if it fails. | 11:24 |
jtv | Sure. It's just that I just proposed a branch to improve the error output a bit. | 11:24 |
jtv | Because the output was one of those things that I think kept us from debugging failures. | 11:25 |
jtv | gmb, rvba: manual utopic run #22 is mine — I guess #23 is gmb's? | 11:27 |
rvba | jtv: #23 is mine | 11:27 |
jtv | OK. Mine just went to a different "pending" state... not familiar. | 11:28 |
rvba | jtv: I think the failure I got in #21 is spurious | 11:28 |
jtv | It was the same one that we'd seen before though. | 11:28 |
jtv | Timeout during sleep(). And no repeated calls to justify it. | 11:28 |
jtv | *Could* be Jenkins I guess, but our code catches the signal and it's probably not really supposed to. | 11:29 |
rvba | jtv: the failure in #21 is clear: TestMAASIntegration.test_check_nodes_ready failed because one of the nodes didn't make it to 'Ready'. | 11:30 |
jtv | rvba: even so, the code would normally have retried the listing. | 11:31 |
jtv | It looks to me as if it only tried once. | 11:31 |
jtv | Or do "details" from later runs overwrite earlier ones? | 11:31 |
rvba | jtv: the retry happened but the node was stuck | 11:31 |
jtv | So then I guess the details are only shown for the last attempt to list nodes. | 11:32 |
rvba | jtv: yes | 11:32 |
gmb | jtv, rvba: That failure in TestMAASIntegration.test_check_nodes_ready is similar to the one that Gavin was banging his head against during Austin week; turned out to be because of Piston's anonymous handlers, and really ought to be fixed… We'll see depending on what the current build does. | 11:43 |
jtv | OK | 11:44 |
gmb | jtv, rvba: Utopic adt job is now green again. | 12:10 |
jtv | \0/ | 12:14 |
* gmb -> dentist. Back later. | 12:27 | |
=== dimitern_ is now known as dimitern | ||
jtv | Jenkins would be so much nicer if it could show who requested a build... I'm sure there's an option somewhere. | 13:35 |
=== jfarschman is now known as MilesDenver | ||
designated | Does anyone have more comprehensive documentation for curtin? Specifically with regards to being more granular in creating partitions and configuring raid. Raid0, simple and simple_boot will not fit my needs. Can this be done using in-target commands? Is there somewhere I can find proper formatting of the OUTPUT_FSTAB file that gets created? | 15:15 |
designated | or do I have to go back to using preseeds? | 15:17 |
roadmr | designated: I usually resort to reading curtin's source code :/ heheh | 15:24 |
=== roadmr is now known as roadmr_afk | ||
=== caribou_ is now known as caribou | ||
=== roadmr_afk is now known as roadmr | ||
designated | roadmr: I'm not much of a programmer but I guess if that's the only way...time to learn python. | 16:33 |
roadmr | designated: heheh ... sorry I couldn't offer a better solution | 16:36 |
designated | roadmr: no worries. thanks for the response. | 16:40 |
=== Guest34941 is now known as mscheel | ||
designated | I'm trying to enlist a node in MAAS 1.7 that worked fine under MAAS 1.5, I can see "maas-enlisting-node login:" on the console but it's just been sitting there for 40+ minutes. Here is the output I see from /var/log/maas/maas.log http://pastebin.com/AzJjAwDb . Anyone have any ideas on where to start troubleshooting? | 18:39 |
roadmr | designated: whoa, enlisting should take only a few minutes... if the node shows up on maas, I guess you've already tried deleting it and reenlisting it? | 18:45 |
designated | roadmr: this node never shows up in MAAS, it PXE boots and then sits at the login prompt. unfortunately I don't see any other logs regarding what could be wrong. Am I going to have to add backdoor credentials to try and get into the device to check the local log? | 18:48 |
roadmr | designated: odd, and you have other nodes that do work? | 18:55 |
designated | roadmr: I successfully enlisted and commissioned another node earlier but I just tried another one and it's exhibiting the same behavior. | 18:56 |
designated | I'm going to try and restart all maas services | 18:56 |
roadmr | designated: cool, keep us updated | 18:56 |
designated | roadmr: I can easily restart maas with "service apache2 restart" but how do I restart the associated services? I'm not finding anything in the documentation. | 19:11 |
roadmr | designated: look in /etc/init, there's a bunch of maas-* services, those are the names you'd have to restart | 19:11 |
designated | roadmr: I restarted everything successfully. The second node I booted up enlisted just fine, the original problem node is still not enlisting. I just restarted the problem node to try one more time. | 19:26 |
roadmr | designated: ok... maybe maas has some record of the node's mac address somewhere, so that's why it complains. I don't know much about maas internals or where it would store this data though :/ a database perhaps? | 19:27 |
=== jfarschman is now known as MilesDenver | ||
=== jfarschman is now known as MilesDenver | ||
=== CyberJacob is now known as CyberJacob|Away | ||
=== jfarschman is now known as MilesDenver | ||
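
Note on the 11:17 exchange: jtv's observation that the timeout surfaces inside sleep() is what one would expect from a signal-based timeout decorator. The sketch below is not the MAAS CI code; `timeout`, `DeadlineExceeded` and `wait_for_node` are hypothetical names made up for illustration, and it assumes a Unix main thread where SIGALRM is available. It shows how an alarm that fires while a polling loop is sleeping raises the timeout exception from within the sleep() call itself.

```python
import functools
import signal
import time


class DeadlineExceeded(Exception):
    """Hypothetical exception raised when the decorated call runs too long."""


def timeout(seconds):
    """Hypothetical decorator: abort the wrapped call after `seconds`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            def on_alarm(signum, frame):
                # Runs in the main thread, interrupting whatever is executing,
                # which for a polling loop is usually time.sleep().
                raise DeadlineExceeded(
                    "%s exceeded %ss" % (func.__name__, seconds))

            previous = signal.signal(signal.SIGALRM, on_alarm)
            signal.setitimer(signal.ITIMER_REAL, seconds)
            try:
                return func(*args, **kwargs)
            finally:
                # Cancel the timer and restore the earlier handler.
                signal.setitimer(signal.ITIMER_REAL, 0)
                signal.signal(signal.SIGALRM, previous)
        return wrapper
    return decorator


@timeout(0.2)
def wait_for_node():
    # Stand-in for a loop that polls until a node reaches 'Ready'.
    while True:
        time.sleep(0.05)


def run_demo():
    try:
        wait_for_node()
    except DeadlineExceeded as error:
        return str(error)
    return None
```

Calling `run_demo()` returns the timeout message rather than hanging. In a traceback the innermost frame of the wrapped function would be the sleep() line, which matches what was seen in the CI output without implying that sleep() itself misbehaved.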