[00:10] <bradm> anyone seen maas lie about power state? I've got some HP kit that maas says it powers on, the web ui says it's on, but the ilo says it's off
[00:10] <bradm> this only seems to happen when it's being deployed, the commissioning worked fine, so I know the power settings are right
[00:10] <bradm> and it's not consistently doing it, it's only since upgrading to 1.9.1+bzr4543-0ubuntu1
[00:27] <mup> Bug #1555864 opened:  [2.0a1] UI Nodes page shows 'ascii' codec can't decode byte <MAAS:New> <https://launchpad.net/bugs/1555864>
[02:31] <mup> Bug #1555901 opened: Number of regiond process is not determined <MAAS:Triaged> <https://launchpad.net/bugs/1555901>
[11:13] <BlackDex> i get the following error during boot in dmesg: [  132.794959] init: maas-regiond-worker (3) main process (4763) terminated with status 1
[11:14] <BlackDex> [  132.794979] init: maas-regiond-worker (3) main process ended, respawning
[11:14] <BlackDex> And that happens a lot
[13:32] <mup> Bug #1556085 opened: adding boot-source keyring_data fails silently <sts> <MAAS:New> <https://launchpad.net/bugs/1556085>
[14:49] <voidspace> roaksoax: ping
[14:50] <voidspace> roaksoax: when commissioning fails I can login - how do I disable poweroff?
[14:50] <voidspace> roaksoax: and what logfiles would be helpful, there's no console.log - there's cloud-init and cloud-init-output
[14:50] <voidspace> amongst other things
[14:52] <voidspace> cloud-init-output.log has the error 400s in it
[14:52] <voidspace> I'll attach those two to my bug
[14:53] <voidspace> done
[14:55] <roaksoax> voidspace: can you send me the link of your bug again please
[14:55] <roaksoax> voidspace: and if you are using 1.9, when you commission there's an option to disable power off
[14:55] <voidspace> roaksoax: yeah, I selected that... it still powers off
[14:56] <voidspace> roaksoax: I'll try again to confirm I did it right
[14:56] <voidspace> roaksoax: https://bugs.launchpad.net/maas/+bug/1555570
[14:56] <roaksoax> voidspace: interesting, then cloud-init might be doing something it shouldn't...
[14:56] <roaksoax> voidspace: oh, you are not commissioning, you are enlisting
[14:56] <voidspace> roaksoax: no, I'm commissioning
[14:56] <voidspace> roaksoax: enlisting works, it's commissioning that doesn't
[14:57] <voidspace> trying again, *definitely* selected disable power off
[14:58] <roaksoax> voidspace: failed to enlist system maas server
[14:58] <roaksoax> sleeping 60 seconds then poweroff
[14:58] <roaksoax> voidspace: the cloud-init-output you attached is not for commissioning, it is for enlistment
[14:58] <voidspace> well, I commission and then reboot the machine manually
[14:58] <voidspace> ah yes, indeed it says that
[14:59] <voidspace> but I'm *trying* to commission
[14:59] <roaksoax> voidspace: based on the log, that doesn't seem a commissioning
[14:59] <voidspace> so it seems the bug then is "maas doesn't commission but tries to re-enlist"
[15:00] <roaksoax> voidspace: questions:
[15:00] <roaksoax> 1. does the machine in MAAS have *all* mac addresses of the system?
[15:00] <roaksoax> 2. is the system trying to PXE boot from a mac address/interface that's not in MAAS ?
[15:01] <roaksoax> voidspace: i'd say: 1. delete the machine in maas. 2. let it auto-enlist. 3. once the machine is in 'New' state, try to commission and see what happens
[15:01] <voidspace> roaksoax: that machine has one interface (one mac) and is pxe booting from maas
[15:01] <voidspace> roaksoax: that is exactly what I've been doing, repeatedly
[15:01] <voidspace> roaksoax: I have deleted and re-enlisted multiple times with multiple fresh installs
[15:01] <roaksoax> voidspace: the only reason I can think of for the machine trying to enlist when it should be commissioning is that MAAS is detecting a different MAC address than the one it has stored
[15:01] <voidspace> roaksoax: I can provide maas logs
[15:02] <roaksoax> voidspace: please do
[15:04] <voidspace> roaksoax: regiond, rackd and maas logs attached
[15:05] <voidspace> roaksoax: this same setup behaves fine with maas 1.9
[15:05] <roaksoax> voidspace: thanks
[15:06] <roaksoax> voidspace: http://pastebin.ubuntu.com/15347930/
[15:07] <roaksoax> voidspace: can you show /etc/maas/regiond.conf and /etc/maas/rackd.conf ?
[15:07] <voidspace> ok
[15:08] <mup> Bug #1555570 opened: Problem commissioning nodes (2.0) <MAAS:New> <https://launchpad.net/bugs/1555570>
[15:08] <mup> Bug #1556138 opened: maas regiond upgrade from 1.8.2 to 1.9.1 silently failed <MAAS:New> <https://launchpad.net/bugs/1556138>
[15:09] <voidspace> roaksoax: done
[15:09] <voidspace> roaksoax: rackd.conf shows localhost as the url - which is what I get after a default install
[15:10] <voidspace> roaksoax: if I reconfigure maas-rack-controller and put in the url http://172.16.0.2:5240/MAAS then the rack controller reports it can't connect to the region
[15:10] <roaksoax> voidspace: what version of 1.2 ?
[15:10] <roaksoax> 2.0
[15:10] <roaksoax> err 2.0
[15:11] <voidspace> roaksoax: whatever is in next-proposed as of a couple of hours ago
[15:11] <roaksoax> voidspace: is 172.16.0.2 inside a network that the machines can communicate with ?
[15:12] <voidspace> yes
[15:13] <roaksoax> voidspace: are you willing to try something even more bleeding edge ?
[15:14] <voidspace> roaksoax: yes, but after I go collect my daughter from school
[15:14] <voidspace> roaksoax: if you pastebin instructions on how to install from source (or link to them) then I'll try after I get back
[15:15] <roaksoax> voidspace: ppa:maas-maintainers/experimental3
[15:15] <voidspace> roaksoax: I'm installing on disposable VMs
[15:15] <voidspace> roaksoax: ah, cool
[15:15] <voidspace> thanks
[15:26] <mup> Bug #1532935 opened: Nodes stuck at grub menu when attempting to Autopilot deploy <cdo-qa> <MAAS:Confirmed> <https://launchpad.net/bugs/1532935>
[15:38] <mup> Bug #1555570 changed: Problem commissioning nodes (2.0) <MAAS:New> <https://launchpad.net/bugs/1555570>
[15:38] <mup> Bug #1556153 opened: ERROR destroying instances: cannot release nodes: gomaasapi: got error back from server: 504 GATEWAY TIMEOUT (Unexpected exception: TimeoutError <oil> <MAAS:New> <https://launchpad.net/bugs/1556153>
[15:44] <roaksoax> voidspace: also, please attach the output of maas <maas-user> interfaces read <node-system-id> to your bug
[15:44] <roaksoax> voidspace: i think it is related to something else
[15:44] <voidspace> roaksoax: ok, cloning a vm right now
[15:47] <roaksoax> voidspace: https://bugs.launchpad.net/maas/+bug/1555570/comments/11
[15:57] <mup> Bug #1556158 opened: Spurious test failure in TestRegionProtocol_SendEvent.test_send_event_does_not_fail_if_unknown_type <tests> <MAAS:Triaged> <https://launchpad.net/bugs/1556158>
[16:39] <mup> Bug #1556185 opened: TypeError: 'Machine' object is not iterable <MAAS:New> <https://launchpad.net/bugs/1556185>
[16:39] <mup> Bug #1556188 opened: Spurious test failure in TestMachinePartitionListener.test__calls_handler_with_update_on_update <tests> <MAAS:Triaged> <https://launchpad.net/bugs/1556188>
[17:00] <Free99> Hey everyone, new to MaaS. I'm having an issue where I deploy 14.04 to my IPMI nodes, but I get "Deployment Failed"
[17:00] <Free99> I can't seem to find any details in maas.log, regiond.log or clusterd.log as to why this step fails
[17:01] <Free99> Interestingly, I think the system properly installs
[17:06] <Bofu2U> Free99: have you looked at the screen or watched it through IPMI when it's deploying?
[17:07] <Free99> Bofu2U, yeah, only thing that shows up is an sr0 error..
[17:07] <Free99> what would the CD drive have to do with this though?
[17:07] <Bofu2U> nothing that I can think of
[17:08] <Bofu2U> you're talking about a server you're trying to boot into maas through discovery, right?
[17:08] <Bofu2U> not the head node/master/whatever
[17:09] <Free99> right.. I got it registered properly with maas, it booted the tftp image.. but then the webui jumps to "deployment failed" after about 5-10 minutes
[17:09] <Free99> only complication here: DHCP is provided by my gateway
[17:09] <Bofu2U> does the image load/boot properly ?
[17:10] <Bofu2U> (in other words does it start booting through PXE, TFTP, etc)
[17:11] <Bofu2U> only times I've run into something like that was when the node couldn't access something at some point (yes, vague as hell) - I deleted it from maas entirely and rebooted it so it went back through discovery, etc
[17:14] <Free99> crud I hope I don't need to do that
[17:14] <Bofu2U> also note I'm talking about
[17:14] <Bofu2U> deleted the node from maas
[17:14] <Bofu2U> not maas in its entirety
[17:15] <Free99> no I know, but still... 10 nodes
[17:15] <Bofu2U> oh it's on all 10?
[17:15] <Bofu2U> err
[17:15] <Bofu2U> May want to wait around and see if anyone else has any ideas then :(
[17:18] <Free99> ok, so the one node I directly watched boot gets all the way to the login screen
[17:19] <Free99> but... maas still says "deploying"
[17:19] <Bofu2U> and is this on commissioning
[17:19] <Bofu2U> or deploying
[17:19] <Free99> just deploying
[17:19] <Bofu2U> on the prompt is the server name ubuntu
[17:19] <Bofu2U> or the correct name set in MAAS
[17:19] <Free99> commissioning worked, it figured out the disk layout and blah blah
[17:19] <Free99> correct name, node-7
[17:20] <Bofu2U> go back to your MAAS properties on that server, make sure the IP is set
[17:20] <Free99> can't modify it unless ready or broken
[17:20] <Bofu2U> is there an IP set at all?
[17:23] <Bofu2U> it sounds like it gets to the prompt and then can't connect back to the headnode to let it know it's finished deploying
[17:24] <Free99> seems like it, DHCP lease list on my gateway indicates the right FQDN has an ip, and it responds to ping
[17:24] <Free99> can't ssh in though in spite of the public key
[17:24] <Free99> *in spite of adding my public key
[17:24] <Bofu2U> but SSH does get through?
[17:24] <Bofu2U> aka it connects properly, but then fails due to auth
[17:25] <Free99> auth fails but I can ssh from the maas control server
[17:26] <Bofu2U> ok so it can talk then
[17:26] <Bofu2U> hm
[17:26] <Bofu2U> nothing else comes up on the login screen? like apt-get randomly or anything like that
[17:26] <Free99> nope, not even that sr0 error
[17:26] <Bofu2U> ok do you have any nodes still in "deploying" state
[17:26] <Bofu2U> aka haven't failed yet
[17:26] <Free99> no
[17:27] <Bofu2U> This is going to sound a bit ... weird but, sometimes it worked for me and I have absolutely no idea why
[17:27] <Free99> I only tried deploying to this one node which I have a display connected to... figure if I can get this one working I'll get all of them
[17:27] <Bofu2U> go through the process again with 1 node
[17:27] <Bofu2U> discovery, then commission
[17:27] <Bofu2U> then deploy
[17:27] <Bofu2U> every time you see it boot up, hit the F<whatever> key to forcibly select the boot sequence into PXE
[17:28] <Bofu2U> there's also ways to "backdoor" your image to put a user/pass so you can login but I wasn't able to make that work :-/
[17:28] <Free99> yeah I saw
[17:28] <Free99> sheesh... this software seems a little rough around the edges
[17:29] <Free99> can't add an ECDSA or ed25519 key
[17:29] <Bofu2U> heh
[17:29] <Bofu2U> yeah there's a few quirks that would be nice if they were different
[17:30] <Bofu2U> like not taking almost 2 weeks to figure out how to add centos images to it
[17:30] <Bofu2U> you know, small things :P
[17:30] <Free99> they mention windows image support, but no docs!
[17:31] <Free99> I'll write to docs, no problem, but I gotta get it to work at all
[17:32] <Bofu2U> I know the feeling
[17:32] <Bofu2U> :)
[17:39] <Free99> Bofu2U, another question: does the system install to local disk at all?
[17:39] <Bofu2U> yes
[17:39] <Free99> it doesn't seem to though
[17:39] <Bofu2U> that's the problem you're running into then
[17:39] <Free99> but how did it boot?
[17:39] <Bofu2U> PXE
[17:39] <Free99> it's just ram resident?
[17:39] <Bofu2U> yeah the curtin installer
[17:40] <Bofu2U> that's what the final reboot is on the deploy
[17:40] <Bofu2U> it hits PXE, PXE tells it to boot off local disk
[17:40] <Free99> how do I watch what curtin is doing?
[17:40] <Bofu2U> through the IPMI
[17:41] <Bofu2U> so, the first is the initial boot and info gathering
[17:41] <Bofu2U> that won't touch the disk, just gets it into MAAS. Doesn't get RAM/CPUs, but will pull IPMI specs
[17:41] <Bofu2U> then you commission and it gathers more information such as the RAM, CPU, etc.
[17:41] <Bofu2U> then deploy, and it writes to disk, does all of that, and then reboots and PXE tells it to boot from that disk
[17:42] <Bofu2U> hopefully that makes sense - just going off of what I remember from the process overall
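The three stages Bofu2U walks through above can be sketched as a small lookup table. This is only an illustration of the conversation, not MAAS code: the names `Stage`, `StageInfo`, `STAGES` and `next_stage` are hypothetical, and the per-stage details are taken from the chat rather than from MAAS documentation.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Stage(Enum):
    """Node lifecycle stages, in the order described in the chat."""
    ENLIST = "enlist"
    COMMISSION = "commission"
    DEPLOY = "deploy"


class StageInfo(NamedTuple):
    gathers: str              # what the stage collects, per the chat
    writes_os_to_disk: bool   # whether the stage installs an OS locally


# Hypothetical summary table; wording follows the conversation above.
STAGES = {
    Stage.ENLIST: StageInfo("initial boot info and IPMI details", False),
    Stage.COMMISSION: StageInfo("RAM, CPU and disk layout", False),
    Stage.DEPLOY: StageInfo("nothing new; installs, then reboots via PXE to local disk", True),
}


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after DEPLOY."""
    order = list(Stage)
    idx = order.index(current)
    return order[idx + 1] if idx + 1 < len(order) else None
```

A node that reaches the login prompt but never leaves "deploying" has finished the last row of this table on the machine side, so the missing piece is the report back to the MAAS server, not one of the stages themselves.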
[17:42] <Free99> sure does, I've gotten to the deploy stage.. and that's it
[17:42] <Bofu2U> yeah
[17:42] <Bofu2U> just because I think it would be an interesting test
[17:42] <Bofu2U> have you tried hitting the bios and disabling the CD ROM?
[17:43] <Free99> I'll try that if this deploy fails
[17:44] <Free99> got back to the login screen, correctly named node-7
[17:45] <Free99> latest event is PXE request - curtin install
[17:45] <Free99> but no visible disk activity
[17:49] <Free99> hmm... I did set to install with LVM, maybe I ought to revert to flat disk layout..
[17:49] <Bofu2U> worth trying
[17:52] <Free99> Bofu2U, I'm going to recommission this one node.. should I allow SSH? retain network?
[17:52] <Bofu2U> I did that just so I could try to test it
[17:52] <Bofu2U> I think the login is ubuntu/ubuntu
[17:52] <Free99> the network is DHCP, with dhcp registering hostnames in dns automatically
[17:53] <Free99> I love linux
[17:53] <Free99> and bsd too
[17:54] <Free99> sometimes the software is really cranky though
[17:57] <mup> Bug #1556219 opened: maas enlistment of power8 found ipmi 1.5 should do ipmi 2.0 <MAAS:New> <https://launchpad.net/bugs/1556219>
[18:00] <Free99> weird, it just denies my logging in due to publickey
[18:00] <Free99> doesn't even prompt for a pass :-/
[18:04] <Bofu2U> try from ipmi?
[18:06] <Free99> Bofu2U, I've never used SOL before. do I need to add a kernel line to redirect to com1?
[18:06] <Bofu2U> what kind of servers?
[18:06] <Free99> it's a dell with iDrac 5 I think
[18:06] <Free99> ipmi 2
[18:06] <Bofu2U> login to the web, and try to load ... usually called "virtual console"
[18:07] <Bofu2U> don't need actual SOL
[18:07] <Free99> think they added that webconsole thing in idrac 6
[18:07] <Bofu2U> ah crap
[18:10] <Free99> any way to increase verbosity on all this stuff?
[18:10] <Bofu2U> don't know :(
[18:10] <Bofu2U> sorry
[18:10] <Free99> ubuntu/ubuntu doesn't work as a login here
[18:19] <Free99> ah ha! with the key I added in the Maas dashboard, I have to login to the nodes with ubuntu@hostname and use the same key I added to my dashboard login
[18:20] <Bofu2U> ahhh ok
[18:23] <Free99> ok so check it: cloud-init-output.log says error encountered setting up postfix
[18:23] <Bofu2U> o.O
[18:26] <Free99> ok, what logs would help figure this out?
[18:26] <Free99> I've got em all
[18:32] <voidspace> roaksoax: so the version from that experimental ppa certainly behaves *differently*
[18:33] <voidspace> roaksoax: with that version the nodes don't enlist
[19:07] <Free99> why can't I set an FQDN?
[19:08] <Free99> http://paste.ubuntu.com/15349884/ <-- my setup fails because of this
[19:09] <Free99> I think there's a bug here folks
[19:34] <Free99> can anyone please help with this cloud-init issue?
[19:48] <mup> Bug #1556258 opened: boot source keyring data is sometimes outputted as memoryview object <MAAS:Triaged> <https://launchpad.net/bugs/1556258>
[19:55] <Free99> dang it :[
[19:55] <Free99> I wish I could figure out why maas 1.9.7 is adding an extraneous period to my postfix file which borks the whole deployment
[22:55] <mpontillo> Free99: can you file a bug? I think it's maybe a postfix bug TBH; that is a valid and proper FQDN
[22:59] <mpontillo> See http://tools.ietf.org/html/rfc1034 section 3.1
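For context on the trailing period mpontillo refers to: RFC 1034 section 3.1 treats a name ending in a dot as a complete (absolute) domain name. A minimal sketch of how a consumer that cannot handle the dot might normalise it follows; the helper names and the hostname `node-7.example.com` are hypothetical, and nothing here is MAAS, cloud-init or postfix code.

```python
def is_absolute(name: str) -> bool:
    """True if the name is written in absolute form (ends with the root dot)."""
    return name.endswith(".")


def strip_root_dot(name: str) -> str:
    """Drop one trailing root dot, leaving the bare root name "." untouched."""
    if len(name) > 1 and name.endswith("."):
        return name[:-1]
    return name


# Hypothetical hostname, used only for illustration.
fqdn = "node-7.example.com."
print(is_absolute(fqdn), strip_root_dot(fqdn))
```

Whether the dot should be stripped by the producer or tolerated by the consumer is the open question in the conversation; the sketch only shows that both forms name the same host.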
[23:49] <roaksoax> voidspace: i have managed to enlist machines with the one on experimental, however, I hit your issue