mattrae | hi, when i destroy-environment and then redeploy to the same machines they aren't reinstalled. I have to delete the machines and re-enlist to get a fresh installation. | 22:00 |
---|---|---|
mattrae | i couldn't find a bug for this. sounds like a bug though | 22:00 |
mattrae | i think whenever i terminate a machine, it should be reinstalled when that hardware is used again | 22:06 |
bigjools | mattrae: I have seen one other person say this too, however I can't re-create it. Which version of maas are you using? | 22:11 |
mattrae | hi bigjools i'm using 1.2+bzr1360+dfsg-0ubuntu1~ppa1 from http://ppa.launchpad.net/maas-maintainers/stable/ubuntu/ | 22:13 |
bigjools | mattrae: which version of juju, and what do you see happening on the machines' consoles that's different to normal? | 22:19 |
mattrae | bigjools: i'm using juju 0.6.0.1+bzr618-0juju2~precise1. for example i do juju destroy-environment. then when i do juju bootstrap ubuntu is not reinstalled. i can do juju status and see my old environment | 22:21 |
bigjools | sounds like a bug in juju? | 22:21 |
mattrae | i see the machines being returned to 'ready' when i terminate the environment | 22:22 |
bigjools | can you see the destroy request in the maas log? | 22:22 |
mattrae | sure i'll check for the request in the log | 22:22 |
bigjools | so the machines are ready, yet when you do a status, you see them in use? | 22:28 |
mattrae | bigjools: i'm looking in maas.log and i see a number of these errors from around the time i destroyed the environment: "PermissionDenied: Not authenticated as a known node." | 22:28 |
bigjools | ew | 22:28 |
bigjools | can you paste the log somewhere for me please | 22:28 |
mattrae | the machines are ready, then i do juju bootstrap, the machine starts but doesn't reinstall. then i do juju status and i see my previous environment | 22:29 |
bigjools | what does it do instead of re-installing? | 22:29 |
mattrae | it appears to just boot the machine with whatever was on it previously | 22:30 |
bigjools | what state does maas show it in at that point? | 22:30 |
mattrae | since i see my old environment | 22:30 |
mattrae | at that point it's allocated | 22:30 |
bigjools | the machine will do a local boot if anything goes wrong with the pxe boot | 22:31 |
bigjools | so I suspect your PermissionDenied errors have got something to do with this | 22:31 |
mattrae | here's the error i see a few times in the log. yeah i don't know whether it is related or not http://pastebin.com/zu16ETvW | 22:32 |
mattrae | that error could be related to me trying to re-enlist the machines. re-enlisting doesn't seem to be working | 22:33 |
mattrae | only worked for one node :/ | 22:34 |
bigjools | yeah it's a metadata server error | 22:34 |
mattrae | sounds like i need to reinstall maas :/ | 22:34 |
bigjools | the metadata server IDs the requesting node so it knows which data to send it | 22:35 |
bigjools | don't re-install, let's investigate | 22:35 |
mattrae | ok cool :) | 22:35 |
bigjools | ok this will take a while, sorry, but can you remove all your nodes, shut down maas, wipe your logs and start again | 22:36 |
bigjools | then re-enlist, bootstrap | 22:36 |
mattrae | now it seems that even though i deleted the machines from maas, then rebooted them, they aren't re-enlisting. only one node re-enlisted.. the rest just booted up with whatever they had on them previously | 22:36 |
bigjools | send me the log | 22:36 |
mattrae | i'll try rebooting them again | 22:36 |
bigjools | then destroy-env and send me the log again | 22:36 |
mattrae | ok sounds good | 22:37 |
bigjools | this will get me a clean log | 22:38 |
mattrae | bigjools: hrm, so i deleted all nodes from maas and shut down the machines. i power on one machine and it boots up with what it had previously and I get that same "PermissionDenied: Not authenticated as a known node" error | 22:42 |
mattrae | the nodes never show up in the maas web interface | 22:43 |
bigjools | hmm | 22:43 |
bigjools | so there's 0 nodes registered? | 22:43 |
mattrae | yeah "0 nodes in maas" | 22:44 |
bigjools | ok | 22:44 |
bigjools | can you send me the pserv log | 22:44 |
mattrae | sure, want the whole thing? the most recent message in pserv.log is 40 min ago | 22:45 |
bigjools | oh | 22:46 |
bigjools | darn | 22:46 |
bigjools | that's odd | 22:46 |
bigjools | you don't have more than one dhcp server on your network do you? | 22:47 |
mattrae | would it make sense to try restarting the maas server? | 22:52 |
mattrae | or is there a way to check the health? | 22:52 |
mattrae | i see maas-pserv, maas-cluster-celery, maas-txlongpoll, maas-region-celery, and maas-dhcp-server are running | 22:52 |
mattrae | i can try reinstalling maas too and report if i see this issue again | 22:52 |
mattrae | normally deleting nodes works, so i wonder if something got corrupted | 22:52 |
mattrae | deleting/re-enlisting works i mean | 22:52 |
mattrae | nope | 22:53 |
bigjools | when you boot the node, can you see it pxe booting from maas? | 22:54 |
bigjools | or does it time out? | 22:54 |
bigjools | it looks to me like it is not pxe booting, and the previous installation boots and tries to contact the metadata server with predictable results | 22:54 |
mattrae | hmm i'll check | 22:54 |
bigjools | the most obvious reason for that is usually that there's another dhcp server | 22:57 |
mattrae | bigjools: ahh yeah looks like my vm's are set to boot from the hd. that is weird because i'm not sure how i would have got them deployed previously | 22:59 |
bigjools | aha | 22:59 |
mattrae | these are libvirt vms.. does maas set a node to not pxe once there is an installation? | 22:59 |
bigjools | yes | 23:00 |
bigjools | oh wait | 23:00 |
bigjools | no, sorry | 23:00 |
mattrae | or maybe it was pxeing because there was no installation | 23:00 |
mattrae | cool, i should at least be able to set these back to pxe | 23:01 |
bigjools | the tftp server gives a different config based on what state we think it's in | 23:01 |
bigjools | quite likely, yes | 23:01 |
bigjools | maas does not touch bios boot order | 23:01 |
bigjools | you need to make sure pxe is first | 23:01 |
bigjools | great, good luck | 23:01 |
mattrae | great, thanks for the help | 23:02 |
bigjools | welcome | 23:02 |
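
The fix the log arrives at is putting PXE first in the boot order of the libvirt VMs, since maas does not touch it. A minimal sketch of that change follows, assuming the VM definition uses libvirt's `<os><boot dev='...'/></os>` style of boot ordering (not per-device `<boot order='...'/>` elements). The function name `pxe_first` and the sample domain XML are hypothetical; neither comes from the conversation.

```python
import xml.etree.ElementTree as ET


def pxe_first(domain_xml: str) -> str:
    """Return the domain XML with 'network' (PXE) as the first boot device.

    Assumes boot order is expressed as <boot dev='...'/> children of <os>.
    """
    root = ET.fromstring(domain_xml)
    os_elem = root.find("os")
    if os_elem is None:
        raise ValueError("domain XML has no <os> element")

    boots = os_elem.findall("boot")
    devs = [b.get("dev") for b in boots]

    # Network goes first; every other device keeps its original relative order.
    ordered = ["network"] + [d for d in devs if d != "network"]

    # Drop the old <boot> entries and append them again in the new order.
    for b in boots:
        os_elem.remove(b)
    for d in ordered:
        ET.SubElement(os_elem, "boot", {"dev": d})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical VM that only boots from its hard disk.
    sample = (
        "<domain type='kvm'><name>node1</name>"
        "<os><type>hvm</type><boot dev='hd'/></os></domain>"
    )
    print(pxe_first(sample))
```

One way to use it would be to feed it the output of `virsh dumpxml` for a powered-off VM and load the result back with `virsh define`; physical machines need the same change made in their BIOS settings instead.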