rick_h_ | so are preseeds a way I could suggest someone use ansible with maas w/o juju? | 01:11 |
---|---|---|
rick_h_ | basically a way to run some custom cloud-init when the node is brought up | 01:12 |
thetrav | trying to learn MAAS. Is there a command to re-image a node? | 01:35 |
thetrav | everything seems to work properly to get one up and running, but then I do my configuration, break it horrible, and want to start again | 01:36 |
thetrav | s/horrible/horribly | 01:36 |
thetrav | maybe I can commission it with a different release, then switch it back | 01:39 |
thetrav | seems a bit long winded though | 01:39 |
thetrav | trying to get MAAS to re-image a machine... so far it seems to just be re-booting it with the existing OS in place. Anyone know how to get it to start from scratch again? | 03:42 |
jtv | thetrav: if you just release it, it will be reinstalled on next boot. | 03:42 |
thetrav | how do I release it? | 03:43 |
thetrav | there's no button for that in the web ui :P | 03:43 |
thetrav | maas -h also doesn't lead me to any answers | 03:43 |
jtv | There should be, if it's currently allocated to you. | 03:43 |
jtv | Oh, you want to re-commission? | 03:43 |
thetrav | there is a "stop" button | 03:43 |
jtv | I fell into the middle of what you were saying earlier, so I'm missing some context. | 03:43 |
thetrav | I think what I was saying earlier is the same | 03:44 |
jtv | Ah yes, older version than what we're working with. Sorry. That "stop" button released the node. | 03:44 |
thetrav | or it was meant to be | 03:44 |
thetrav | I'm working with whatever is in the ubuntu package repo | 03:44 |
thetrav | I figured that would be "stable" | 03:44 |
jtv | We're developing the next version, and so we're more actively familiar with it. | 03:44 |
thetrav | right | 03:45 |
thetrav | makes sense | 03:45 |
thetrav | is there a way to do it in the old version? | 03:45 |
jtv | If you hit Stop, the node should then go into the Ready state. | 03:45 |
thetrav | yeah, it does do that | 03:45 |
jtv | Good. Now, if you re-allocate and restart it again (the Start button in your version), it should reinstall. | 03:45 |
thetrav | re-allocate? | 03:46 |
thetrav | if I click the start button it just powers up the machine. Still has the existing file system and ubuntu install | 03:46 |
jtv | Yeah. There's two steps to deploying a machine, which are more clearly distinct in the next version: first you allocate the machine, then you fire it up. | 03:46 |
jtv | If it went through Ready and is now in Allocated state, it should _not_ have the same install... | 03:46 |
jtv | If this doesn't boot you back into the installer, I suspect it's just not netbooting. | 03:48 |
thetrav | right | 03:49 |
jtv | Question is, if it's not netbooting, how did it install before? | 03:49 |
thetrav | so... one of the things I noticed | 03:49 |
thetrav | in the preseed there's a line that says "turn off PXE netboot" | 03:50 |
jtv | Yeah, that happens at the end when the node's installed and deploying. | 03:50 |
thetrav | and when I go look at the bios in CIMC, PXE is not in the Actual Boot Order | 03:50 |
thetrav | is MAAS supposed to be modifying the boot order? | 03:51 |
thetrav | when it shuts down? | 03:51 |
jtv | When the node gets released, it should set it to netboot again. | 03:51 |
thetrav | right | 03:51 |
thetrav | so I think that's not happening right | 03:51 |
jtv | Sounds like. | 03:51 |
thetrav | what mechanism is it using to do that? | 03:51 |
jtv | IIRC it's a parameter to the power command: "come up, and when you do, netboot." | 03:52 |
thetrav | right, so in this case I've configured it to IPMI v2 | 03:52 |
thetrav | I do recall seeing something about a bug with cisco integrated management controller and MAAS power management | 03:53 |
* thetrav searches | 03:53 | |
thetrav | so it may be that I just have to manually set it to netboot whenever i restart | 03:55 |
jtv | That'd be annoying. | 03:56 |
thetrav | yep | 03:56 |
thetrav | something like this: https://bugs.launchpad.net/maas/+bug/1300476 | 03:57 |
ubot5 | Ubuntu bug 1300476 in maas (Ubuntu) "Unable to setup BMC/UCS user on Cisco B200 M3" [Critical,Fix released] | 03:57 |
jtv | (You _can_ restart a node that you currently own without MAAS' involvement, of course, and that won't need the change. But still.) | 03:57 |
thetrav | although this is not a B200, it's a C240-M3S | 03:57 |
thetrav | yeah, so I told it to netboot using KVM and it's re-imaging now | 03:58 |
thetrav | would be nice if I didn't need KVM though | 03:58 |
jtv | I don't suppose this hardware supports the UCS power method? | 03:59 |
thetrav | I believe UCS requires expensive hardware that we haven't purchased | 04:00 |
thetrav | I don't really know though | 04:00 |
thetrav | I'm a software guy more than a hardware guy | 04:00 |
thetrav | this space is all pretty new to me | 04:00 |
thetrav | when you say UCS do you mean the cisco unified computing system dealy? | 04:01 |
thetrav | or is there another meaning for that acronym? | 04:01 |
jtv | I think that's the one... Unified Computing and Servers? | 04:04 |
thetrav | yeah, so everything I've read and been told (admittedly by cisco sales guys) about that is that I can't use it without a 25k fabric interconnect | 04:05 |
thetrav | or maybe it's the fabric extender | 04:06 |
thetrav | point is, it's some seriously expensive fabric | 04:06 |
thetrav | for my budget of $0 | 04:06 |
thetrav | that bug is not the issue btw | 04:07 |
thetrav | I checked and it has created a maas user for the IPMI stuff | 04:07 |
thetrav | it's just not adjusting the boot settings | 04:07 |
thetrav | at least on server shut down | 04:07 |
jtv | Yeah your problem is a different one from that bug. | 04:08 |
jtv | I'm not sure it should be adjusting on shutdown — I think it does that on power up. | 04:08 |
ram_ | how to validate if maas is installed correctly on the server? - I do not get the URL <ServerIP>/maas working in my setup | 11:00 |
rick_h_ | any maas folks around to help me get through this ssh problem. We're trying to qa updates to quickstart to enable maas support and having some fun | 15:18 |
roaksoax | rick_h_: how can we help you? | 16:00 |
rick_h_ | roaksoax: I've got maas running on 3 nucs, we thought we had everything good but keep having issues with ssh and juju from the maas controller to the two nucs it's controlling | 16:00 |
rick_h_ | roaksoax: I'm confused about how the amt/maas control stuff is meant to work so maybe things look setup right but aren't | 16:01 |
roaksoax | rick_h_: what are the issues? | 16:01 |
rick_h_ | roaksoax: juju is unable to ssh https://bugs.launchpad.net/juju-core/+bug/1314682 it looks a lot like that bug | 16:01 |
ubot5 | Ubuntu bug 1314682 in juju-core "Bootstrap fails because of virt-manager config" [High,Triaged] | 16:01 |
rick_h_ | roaksoax: so I think my amt control isn't 100% correct | 16:01 |
rick_h_ | roaksoax: so if I can fire a couple of questions maybe it'll lead to something | 16:01 |
rick_h_ | roaksoax: on the amt node, I started out with it set up for dhcp, but changed it to a static ip in an effort to make sure the node is always in the same place | 16:02 |
rick_h_ | roaksoax: I entered that into the maas power settings | 16:02 |
roaksoax | rick_h_: did you add AMT credentials for each of the nodes and confirm that MAAS powers them on during a juju bootstrap? | 16:02 |
rick_h_ | roaksoax: and when the machine is commissioned, it gets a different ip, is that ok? amt has 10.0.0.101 and the commissioned one gets 10.0.0.250? | 16:02 |
rick_h_ | roaksoax: well that's the thing. in maas it's 'start/stop' but that doesn't seem to really control power on or power off? | 16:03 |
roaksoax | rick_h_: what version of MAAS are you using? | 16:03 |
rick_h_ | roaksoax: I moved to the daily ppa last night trying to work around a different issue | 16:03 |
rick_h_ | 1.6.1+bzr2550+2551+295~ppa0~ubuntu14.04.1 | 16:04 |
rick_h_ | roaksoax: it's setup at maas.jujugui.org and happy to help give access if it helps in debugging | 16:04 |
roaksoax | rick_h_: can you test ppa:maas-maintainers/experimental ? | 16:04 |
rick_h_ | roaksoax: sure thing | 16:05 |
roaksoax | rick_h_: who is giving the IP address to AMT? I'd suggest you configure the IP manually for each AMT host and not on a range that MAAS manages | 16:05 |
rick_h_ | roaksoax: right, that's what I've done. I've hard coded the amt ip now on both nucs | 16:06 |
rick_h_ | roaksoax: and then maas gives the machine a dynamic space ip when it commissions | 16:06 |
roaksoax | rick_h_: yes that's fine. The static IP allocation means that MAAS picks an IP and assigns it to the node on *start* | 16:06 |
rick_h_ | roaksoax: ok | 16:06 |
rick_h_ | roaksoax: 1.7 installing now | 16:07 |
rick_h_ | roaksoax: hmm, change to maas_local_settings.py in upgrade there removing all rabbitmq? | 16:08 |
rick_h_ | roaksoax: sent diff in pm if that's expected? | 16:09 |
roaksoax | rick_h_: Y | 16:11 |
roaksoax | rick_h_: that's expected | 16:11 |
rick_h_ | roaksoax: ok cool thanks for sanity check | 16:11 |
rick_h_ | smaller one on pserv.yaml accepted as well | 16:12 |
roaksoax | rick_h_: i'll try to handle that automatically | 16:12 |
rick_h_ | ok, new ui loaded :) | 16:13 |
rick_h_ | "Boot image import process not started. Nodes will not be able to provision without boot images. Start the boot images import process to resolve this issue." warning | 16:13 |
roaksoax | rick_h_: ah yeah, go to the Images tab | 16:14 |
roaksoax | and import images again :) | 16:14 |
roaksoax | rick_h_: but the images should be there | 16:14 |
roaksoax | rick_h_: we just haven't migrated | 16:14 |
roaksoax | yet | 16:14 |
rick_h_ | ah cool yea got it | 16:14 |
rick_h_ | ok, see some more options on the nodes as well | 16:14 |
rick_h_ | roaksoax: so stop == shutdown? | 16:14 |
roaksoax | rick_h_: yes! please file bugs if you think that should be changed :) | 16:15 |
rick_h_ | roaksoax: so I could not start/stop the node with the error that the config didn't allow it | 16:18 |
rick_h_ | roaksoax: so I went in to edit the node and in the 'power type' I have a select list with no options | 16:18 |
rick_h_ | roaksoax: so it seems I lost my power info on the node in the upgrade and have no valid types to choose from now | 16:19 |
roaksoax | rick_h_: so what if you do: sudo service apache2 restart && sudo service maas-cluster-register restart && sudo service maas-cluster restart | 16:20 |
roaksoax | rick_h_: wait a bit | 16:20 |
roaksoax | rick_h_: and check under the 'Clusters' tab to see if the cluster is connected | 16:20 |
rick_h_ | roaksoax: running now | 16:20 |
roaksoax | rick_h_: once connected, you should get that back | 16:20 |
rick_h_ | roaksoax: rgr | 16:20 |
roaksoax | rick_h_: did it work this time? | 16:26 |
rick_h_ | roaksoax: so it grabbed some ips on the wrong network so editing the pserv.yaml and maas_cluster.conf to update those ips and restarting | 16:26 |
roaksoax | rick_h_: sudo dpkg-reconfigure maas-cluster-controller :) | 16:27 |
rick_h_ | roaksoax: good to know | 16:27 |
rick_h_ | roaksoax: ok, restarted and connected cluster now | 16:27 |
rick_h_ | roaksoax: ok, so my power config is back and set | 16:28 |
roaksoax | rick_h_: ok great, let's try to test this time | 16:28 |
rick_h_ | roaksoax: is the mac addr diff or the same from amt to the 'machine'? (I have them as the same but curious if I should be looking for a diff one) | 16:28 |
roaksoax | rick_h_: i think it might be the same | 16:28 |
rick_h_ | machine is off, trying to start it brings up same error "The action "Start selected nodes" could not be performed on 1 node because its state does not allow that action. | 16:28 |
roaksoax | rick_h_: I can't remember.. don't have a NUC in hand now unfortunately | 16:28 |
rick_h_ | roaksoax: all good | 16:29 |
roaksoax | rick_h_: Allocate machine first | 16:29 |
roaksoax | rick_h_: and then you can start it | 16:29 |
rick_h_ | allocate == commission? | 16:29 |
rick_h_ | it's showing as status of ready atm | 16:29 |
rick_h_ | ah, more details in the new edit details page | 16:29 |
rick_h_ | Failed to query node's BMC — Node could not be queried node-ee9f70b4-48aa-11e4-8a8c-eca86bffcfed (nuc1) amt failed with return code 2: Missing amttool (amtterm package) | 16:30 |
roaksoax | rick_h_: commissioning is the stage where MAAS learns about the machine | 16:30 |
roaksoax | rick_h_: there you go, sudo apt-get install amtterm :) | 16:30 |
roaksoax | rick_h_: good thing that MAAS shows what's going on nowadays :) | 16:30 |
rick_h_ | no kidding | 16:30 |
rick_h_ | never would have realized it didn't already come with that stuff ootb | 16:30 |
roaksoax | rick_h_: that's why mark was happy in nuremberg :) | 16:31 |
rick_h_ | oh yay color icon shows off now | 16:32 |
rick_h_ | roaksoax: ok, I've got a power status and the button on the edit to 'check power status' shows it's off correctly | 16:35 |
rick_h_ | roaksoax: but when I go to 'start' I get 'The action "Start selected nodes" could not be performed on 1 node because its state does not allow that action.' | 16:35 |
rick_h_ | roaksoax: from a 'ready' state currently | 16:35 |
rick_h_ | roaksoax: no new error in the edit view, just the same 6min old 'amtterm' missing | 16:35 |
roaksoax | rick_h_: yeah you need to *own* the machine first | 16:39 |
roaksoax | rick_h_: so *allocate* the machine first and then *start* | 16:39 |
rick_h_ | roaksoax: allocate == commission or acquire or ? | 16:40 |
* rick_h_ isn't seeing an allocate button and is feeling a bit like a dumb user | 16:40 | |
rick_h_ | roaksoax: ok yea so the 'acquire' gave me the status of 'allocated' so I'd give user feedback of making the terms consistent | 16:42 |
roaksoax | rick_h_: agreed, please do file bugs | 16:42 |
rick_h_ | ok, and now I can start and the machine is coming up woot! | 16:42 |
rick_h_ | roaksoax: will do ty | 16:42 |
rick_h_ | roaksoax: ok, more fun. So I managed to acquire and then start. Now I can't shut down or do anything else. It shows green started, and I'ge for tftp request node events | 17:07 |
roaksoax | rick_h_: what's the status of the node? Deploying? Deployed? | 17:07 |
rick_h_ | roaksoax: I've tried to "Abort operation" and "stop node" and I get: The action "Stop selected nodes" could not be performed on 1 node because its state does not allow that action. | 17:08 |
rick_h_ | deploying | 17:08 |
roaksoax | rick_h_: ok, I think that's expected since it is in the process of being deployed | 17:08 |
rick_h_ | but that failed, the machine came up, got a tftp timeout, and now is sitting at the "reboot or select proper boot device" prompt | 17:08 |
roaksoax | rick_h_: humm so it never really started? | 17:08 |
rick_h_ | roaksoax: correct | 17:08 |
rick_h_ | it 'turned on' but never got rolling due to the tftp timeout | 17:08 |
roaksoax | rick_h_: did you add an SSH key? can you show output of /var/log/maas/maas.log and /var/log/maas/pserv.log | 17:09 |
rick_h_ | roaksoax: https://pastebin.canonical.com/117948/ and https://pastebin.canonical.com/117948/ respectively | 17:11 |
rick_h_ | roaksoax: yes, ssh key is added to maas | 17:11 |
rick_h_ | roaksoax: tftp error was "PXE-E32: TFTP open timeout" | 17:12 |
roaksoax | rick_h_: yeah I don't see a PXE boot here: https://pastebin.canonical.com/117949/ | 17:13 |
roaksoax | rick_h_: is the cluster controller correctly configured to do DHCP/DNS? | 17:13 |
rick_h_ | roaksoax: as far as I'm aware. it's connected, managed interfaces: 1, nodes 1, images synced, with "Manage" set to DHCP & DNS | 17:14 |
rick_h_ | roaksoax: ok, I can release the node and now there's an error where amt disagreed that the machine was going to power off, yet it did | 17:15 |
rick_h_ | https://pastebin.canonical.com/117950/ but it did occur so at least it's not tied up on the broken deploying now | 17:16 |
roaksoax | rick_h_: weird... maybe AMT issues? | 17:16 |
rick_h_ | roaksoax: maybe, my first experience with it. | 17:16 |
roaksoax | rick_h_: the PXE boot issue i think it has been seen before but that's something we are investigating | 17:16 |
rick_h_ | ok, had to release it twice but now it's actually released, back in a powered off 'black' state | 17:17 |
rick_h_ | roaksoax: ok, cool. Well, this is farther than I was with better debug info. I'll add the second node and see if I can get it to work at all with juju now. | 17:18 |
rick_h_ | roaksoax: thanks for the help and pointer to the experimental stuff. | 17:18 |
roaksoax | rick_h_: ok, great | 17:18 |
rick_h_ | I'll file a few bugs on things based on the experience as feedback | 17:18 |
roaksoax | rick_h_: awesome! thanks | 17:19 |
rick_h_ | roaksoax: can't add my second node, does this look like something you've seen? https://pastebin.canonical.com/117958/ | 18:33 |
roaksoax | rick_h_: interesting... it seems like it would be trying to pxe boot a node but it doesn't find it | 18:34 |
roaksoax | rick_h_: can you please file a bug and point it to me? | 18:34 |
rick_h_ | roaksoax: will do | 18:34 |
rick_h_ | roaksoax: that's in the add node UI after entering the info to add a new one | 18:34 |
roaksoax | rick_h_: ah yes, we spotted a bug when adding via webui | 18:40 |
roaksoax | rick_h_: it is in the process of being fixed | 18:40 |
rick_h_ | roaksoax: ok cool | 18:40 |
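
Both conversations above run into the same message, "could not be performed on 1 node because its state does not allow that action". The lifecycle the participants describe (acquire a Ready node to get Allocated, start it to get Deploying, release it to get back to Ready) can be sketched as a small transition table. This is only a toy model of what is said in the log, not MAAS code: the names `State`, `TRANSITIONS`, `apply_action` and the `install_finished` action are hypothetical, and the real set of states and actions may well differ between MAAS versions.

```python
from enum import Enum


class State(Enum):
    READY = "Ready"
    ALLOCATED = "Allocated"
    DEPLOYING = "Deploying"
    DEPLOYED = "Deployed"


class ActionNotAllowed(Exception):
    """Raised when an action is not valid in the node's current state."""


# Hypothetical, simplified table: (current state, action) -> next state.
# Only the transitions mentioned in the conversation are modelled.
TRANSITIONS = {
    (State.READY, "acquire"): State.ALLOCATED,
    (State.ALLOCATED, "start"): State.DEPLOYING,
    (State.DEPLOYING, "install_finished"): State.DEPLOYED,
    (State.ALLOCATED, "release"): State.READY,
    (State.DEPLOYING, "release"): State.READY,
    (State.DEPLOYED, "release"): State.READY,
}


def apply_action(state: State, action: str) -> State:
    """Return the next state, or raise if the action is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ActionNotAllowed(
            f"action {action!r} is not allowed in state {state.value}"
        ) from None


if __name__ == "__main__":
    node = State.READY
    try:
        apply_action(node, "start")  # starting a Ready node is refused
    except ActionNotAllowed as err:
        print(err)
    node = apply_action(node, "acquire")  # Ready -> Allocated
    node = apply_action(node, "start")    # Allocated -> Deploying
    node = apply_action(node, "release")  # Deploying -> Ready
    print(node.value)
```

In this model "stop" is simply absent for a Deploying node, which matches what rick_h_ saw; releasing the node was the way out of the failed deployment.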
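
thetrav worked around the missing netboot by selecting the boot device over KVM. Since the node was configured for IPMI v2, one possible alternative is to request a PXE boot with `ipmitool` before powering the machine on. The sketch below only builds the command lines; it is a manual workaround under stated assumptions, not what MAAS itself runs. The host, user and password values are placeholders, the helper names are hypothetical, and whether the Cisco CIMC on a C240 honours the boot-device request is not established anywhere in the log.

```python
from typing import List


def ipmi_command(host: str, user: str, password: str, *args: str) -> List[str]:
    """Build an ipmitool argv list using the lanplus (IPMI v2.0) interface."""
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password, *args]


def netboot_then_power_on(host: str, user: str, password: str) -> List[List[str]]:
    """Commands to request a PXE boot and then power the node on."""
    return [
        # Ask the BMC to boot from the network (typically for the next boot only).
        ipmi_command(host, user, password, "chassis", "bootdev", "pxe"),
        # Power on the released (powered-off) node.
        ipmi_command(host, user, password, "chassis", "power", "on"),
    ]


if __name__ == "__main__":
    # Placeholder BMC address and credentials; replace with real values.
    for cmd in netboot_then_power_on("192.0.2.10", "maas", "secret"):
        print(" ".join(cmd))
```

To actually run the commands, each list could be passed to `subprocess.run(cmd, check=True)` on a machine that has `ipmitool` installed and can reach the BMC.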