/srv/irclogs.ubuntu.com/2016/03/11/#maas.txt

[00:10] <bradm> anyone seen maas lie about power state?  I've got some HP kit that maas says it powers on, the web ui says it's on, but the ilo says it's off
[00:10] <bradm> this only seems to happen when it's being deployed, the commissioning worked fine, so I know the power settings are right
[00:10] <bradm> and it's not consistently doing it, it's only since upgrading to 1.9.1+bzr4543-0ubuntu1
[00:27] <mup> Bug #1555864 opened:  [2.0a1] UI Nodes page shows 'ascii' codec can't decode byte <MAAS:New> <https://launchpad.net/bugs/1555864>
[02:31] <mup> Bug #1555901 opened: Number of regiond process is not determined <MAAS:Triaged> <https://launchpad.net/bugs/1555901>
=== menn0 is now known as menn0-afk
=== menn0-afk is now known as menn0
[11:13] <BlackDex> i get the following error during boot in dmesg: [  132.794959] init: maas-regiond-worker (3) main process (4763) terminated with status 1
[11:14] <BlackDex> [  132.794979] init: maas-regiond-worker (3) main process ended, respawning
[11:14] <BlackDex> And that happens a lot
[13:32] <mup> Bug #1556085 opened: adding boot-source keyring_data fails silently <sts> <MAAS:New> <https://launchpad.net/bugs/1556085>
[14:49] <voidspace> roaksoax: ping
[14:50] <voidspace> roaksoax: when commissioning fails I can login - how do I disable poweroff?
[14:50] <voidspace> roaksoax: and what logfiles would be helpful, there's no console.log - there's cloud-init and cloud-init-output
[14:50] <voidspace> amongst other things
[14:52] <voidspace> cloud-init-output.log has the error 400s in it
[14:52] <voidspace> I'll attach those two to my bug
[14:53] <voidspace> done
[14:55] <roaksoax> voidspace: can you send me the link of your bug again please
[14:55] <roaksoax> voidspace: and if you are using 1.9, when you commission there's an option to disable power off
[14:55] <voidspace> roaksoax: yeah, I selected that... it still powers off
[14:56] <voidspace> roaksoax: I'll try again to confirm I did it right
[14:56] <voidspace> roaksoax: https://bugs.launchpad.net/maas/+bug/1555570
[14:56] <roaksoax> voidspace: interesting, then cloud-init might be doing something it should...
[14:56] <roaksoax> voidspace: oh, you are not commissioning, you are enlisting
[14:56] <voidspace> roaksoax: no, I'm commissioning
[14:56] <voidspace> roaksoax: enlisting works, it's commissioning that doesn't
[14:57] <voidspace> trying again, *definitely* selected disable power off
[14:58] <roaksoax> voidspace: failed to enlist system maas server
[14:58] <roaksoax> sleeping 60 seconds then poweroff
[14:58] <roaksoax> voidspace: the cloud-init-output you attached is not for commissioning, it is for enlistment
[14:58] <voidspace> well, I commission and then reboot the machine manually
[14:58] <voidspace> ah yes, indeed it says that
[14:59] <voidspace> but I'm *trying* to commission
[14:59] <roaksoax> voidspace: based on the log, that doesn't seem a commissioning
[14:59] <voidspace> so it seems the bug then is "maas doesn't commission but tries to re-enlist"
[15:00] <roaksoax> voidspace: questions:
[15:00] <roaksoax> 1. does the machine in MAAS have *all* mac addresses of the system?
[15:00] <roaksoax> 2. is the system trying to PXE boot from a mac address/interface that's not in MAAS ?
[15:01] <roaksoax> voidspace: i'd say: 1. delete the machine in maas. 2. let it auto-enlist. 3. once the machine is in 'New' state, try to commission and see what happens
[15:01] <voidspace> roaksoax: that machine has one interface (one mac) and is pxe booting from maas
[15:01] <voidspace> roaksoax: that is exactly what I've been doing, repeatedly
[15:01] <voidspace> roaksoax: I have deleted and re-enlisted multiple times with multiple fresh installs
[15:01] <roaksoax> voidspace: the only reason I can think of for the machine trying to enlist even though it should be commissioning is that MAAS is detecting a different MAC address than the one it has stored
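Editor's note: roaksoax's hypothesis above (a machine falls back to enlistment when the MAC it PXE boots from is not one MAAS has stored) can be sketched roughly as follows. This only illustrates the reasoning and is not MAAS code; `normalize_mac` and `boot_purpose` are hypothetical names, and the MAC addresses in any usage are made up.

```python
def normalize_mac(mac):
    """Lower-case a MAC address and use ':' separators so comparisons are stable."""
    return mac.strip().lower().replace("-", ":")


def boot_purpose(pxe_mac, known_macs):
    """Hypothetical decision: a stored MAC carries on commissioning, an unknown one enlists."""
    known = {normalize_mac(m) for m in known_macs}
    if normalize_mac(pxe_mac) in known:
        return "commission"
    return "enlist"
```

Under this assumption, a node with a single stored MAC that still enlists would point at MAAS seeing a different address than expected, which is why the output of the interfaces listing is worth attaching to the bug.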
[15:01] <voidspace> roaksoax: I can provide maas logs
[15:02] <roaksoax> voidspace: please do
[15:04] <voidspace> roaksoax: regiond, rackd and maas logs attached
[15:05] <voidspace> roaksoax: this same setup behaves fine with maas 1.9
[15:05] <roaksoax> voidspace: thanks
[15:06] <roaksoax> voidspace: http://pastebin.ubuntu.com/15347930/
[15:07] <roaksoax> voidspace: can you show /etc/maas/regiond.conf and /etc/maas/rackd.conf ?
[15:07] <voidspace> ok
[15:08] <mup> Bug #1555570 opened: Problem commissioning nodes (2.0) <MAAS:New> <https://launchpad.net/bugs/1555570>
[15:08] <mup> Bug #1556138 opened: maas regiond upgrade from 1.8.2 to 1.9.1 silently failed <MAAS:New> <https://launchpad.net/bugs/1556138>
[15:09] <voidspace> roaksoax: done
[15:09] <voidspace> roaksoax: rackd.conf shows localhost as the url - which is what I get after a default install
[15:10] <voidspace> roaksoax: if I reconfigure maas-rack-controller and put in the url http://172.16.0.2:5240/MAAS then the rack controller reports it can't connect to the region
[15:10] <roaksoax> voidspace: what version of 1.2 ?
[15:10] <roaksoax> 2.0
[15:10] <roaksoax> err 2.0
[15:11] <voidspace> roaksoax: whatever is in next-proposed as of a couple of hours ago
[15:11] <roaksoax> voidspace: is 172.16.0.2 inside a network that the machines can communicate with ?
[15:12] <voidspace> yes
[15:13] <roaksoax> voidspace: are you willing to try something even more bleeding edge ?
[15:14] <voidspace> roaksoax: yes, but after I go collect my daughter from school
[15:14] <voidspace> roaksoax: if you pastebin instructions on how to install from source (or link to them) then I'll try after I get back
[15:15] <roaksoax> voidspace: ppa:maas-maintainers/experimental3
[15:15] <voidspace> roaksoax: I'm installing on disposable VMs
[15:15] <voidspace> roaksoax: ah, cool
[15:15] <voidspace> thanks
[15:26] <mup> Bug #1532935 opened: Nodes stuck at grub menu when attempting to Autopilot deploy <cdo-qa> <MAAS:Confirmed> <https://launchpad.net/bugs/1532935>
[15:38] <mup> Bug #1555570 changed: Problem commissioning nodes (2.0) <MAAS:New> <https://launchpad.net/bugs/1555570>
[15:38] <mup> Bug #1556153 opened: ERROR destroying instances: cannot release nodes: gomaasapi: got error back from server: 504 GATEWAY TIMEOUT (Unexpected exception: TimeoutError <oil> <MAAS:New> <https://launchpad.net/bugs/1556153>
[15:44] <roaksoax> voidspace: also, please run maas <maas-user> interfaces read <node-system-id> and attach the output of that to your bug
[15:44] <roaksoax> voidspace: i think it is related to another thing
[15:44] <voidspace> roaksoax: ok, cloning a vm right now
[15:47] <roaksoax> voidspace: https://bugs.launchpad.net/maas/+bug/1555570/comments/11
[15:48] <mup> Bug #1556153 changed: ERROR destroying instances: cannot release nodes: gomaasapi: got error back from server: 504 GATEWAY TIMEOUT (Unexpected exception: TimeoutError <oil> <MAAS:New> <https://launchpad.net/bugs/1556153>
[15:57] <mup> Bug #1556158 opened: Spurious test failure in TestRegionProtocol_SendEvent.test_send_event_does_not_fail_if_unknown_type <tests> <MAAS:Triaged> <https://launchpad.net/bugs/1556158>
[16:00] <mup> Bug #1555570 changed: Problem commissioning nodes (2.0) <MAAS:New> <https://launchpad.net/bugs/1555570>
[16:00] <mup> Bug #1556158 changed: Spurious test failure in TestRegionProtocol_SendEvent.test_send_event_does_not_fail_if_unknown_type <tests> <MAAS:Triaged> <https://launchpad.net/bugs/1556158>
=== redelmann is now known as rudi|comida
[16:39] <mup> Bug #1556185 opened: TypeError: 'Machine' object is not iterable <MAAS:New> <https://launchpad.net/bugs/1556185>
[16:39] <mup> Bug #1556188 opened: Spurious test failure in TestMachinePartitionListener.test__calls_handler_with_update_on_update <tests> <MAAS:Triaged> <https://launchpad.net/bugs/1556188>
[17:00] <Free99> Hey everyone, new to MaaS. I'm having an issue where I deploy 14.04 to my IPMI nodes, but I get "Deployment Failed"
[17:00] <Free99> I can't seem to find any details in maas.log, regiond.log or clusterd.log as to why this step fails
[17:01] <Free99> Interestingly, I think the system properly installs
[17:06] <Bofu2U> Free99: have you looked at the screen or watched it through IPMI when it's deploying?
[17:07] <Free99> Bofu2U, yeah, only thing that shows up is an sr0 error..
[17:07] <Free99> what would the CD drive have to do with this though?
[17:07] <Bofu2U> nothing that I can think of
[17:08] <Bofu2U> you're talking about a server you're trying to boot into maas through discovery, right?
[17:08] <Bofu2U> not the head node/master/whatever
[17:09] <Free99> right.. I got it registered properly with maas, it booted the tftp image.. but then the webui jumps to "deployment failed" after about 5-10 minutes
[17:09] <Free99> only complication here: DHCP is provided by my gateway
[17:09] <Bofu2U> does the image load/boot properly ?
[17:10] <Bofu2U> (in other words does it start booting through PXE, TFTP, etc)
[17:11] <Bofu2U> only times I've run into something like that was when the node couldn't access something at some point (yes, vague as hell) - I deleted it from maas entirely and rebooted it so it went back through discovery, etc
[17:14] <Free99> crud I hope I don't need to do that
[17:14] <Bofu2U> also note I'm talking about
[17:14] <Bofu2U> deleted the node from maas
[17:14] <Bofu2U> not maas in its entirety
[17:15] <Free99> no I know, but still... 10 nodes
[17:15] <Bofu2U> oh it's on all 10?
[17:15] <Bofu2U> err
[17:15] <Bofu2U> May want to wait around and see if anyone else has any ideas then :(
[17:18] <Free99> ok, so the one node I directly watched boot gets all the way to the login screen
[17:19] <Free99> but... maas still says "deploying"
[17:19] <Bofu2U> and is this on commissioning
[17:19] <Bofu2U> or deploying
[17:19] <Free99> just deploying
[17:19] <Bofu2U> on the prompt is the server name ubuntu
[17:19] <Bofu2U> or the correct name set in MAAS
[17:19] <Free99> commissioning worked, it figured out the disk layout and blah blah
[17:19] <Free99> correct name, node-7
[17:20] <Bofu2U> go back to your MAAS properties on that server, make sure the IP is set
[17:20] <Free99> can't modify it unless ready or broken
[17:20] <Bofu2U> is there an IP set at all?
[17:23] <Bofu2U> it sounds like it gets to the prompt and then can't connect back to the headnode to let it know it's finished deploying
[17:24] <Free99> seems like it, DHCP lease list on my gateway indicates the right FQDN has an ip, and it responds to ping
[17:24] <Free99> can't ssh in though in spite of the public key
[17:24] <Free99> *in spite of adding my public key
[17:24] <Bofu2U> but SSH does get through?
[17:24] <Bofu2U> aka it connects properly, but then fails due to auth
[17:25] <Free99> auth fails but I can ssh from the maas control server
[17:26] <Bofu2U> ok so it can talk then
[17:26] <Bofu2U> hm
[17:26] <Bofu2U> nothing else comes up on the login screen? like apt-get randomly or anything like that
[17:26] <Free99> nope, not even that sr0 error
[17:26] <Bofu2U> ok do you have any nodes still in "deploying" state
[17:26] <Bofu2U> aka haven't failed yet
[17:26] <Free99> no
[17:27] <Bofu2U> This is going to sound a bit ... weird but, sometimes it worked for me and I have absolutely no idea why
[17:27] <Free99> I only tried deploying to this one node which I have a display connected to... figure if I can get this one working I'll get all of them
[17:27] <Bofu2U> go through the process again with 1 node
[17:27] <Bofu2U> discovery, then commission
[17:27] <Bofu2U> then deploy
[17:27] <Bofu2U> every time you see it boot up, hit the F<whatever> key to forcibly select the boot sequence into PXE
[17:28] <Bofu2U> there are also ways to "backdoor" your image to put a user/pass so you can login but I wasn't able to make that work :-/
[17:28] <Free99> yeah I saw
[17:28] <Free99> sheesh... this software seems a little rough around the edges
[17:29] <Free99> can't add an ECDSA or ed25519 key
[17:29] <Bofu2U> heh
[17:29] <Bofu2U> yeah there's a few quirks that would be nice if they were different
[17:30] <Bofu2U> like not taking almost 2 weeks to figure out how to add centos images to it
[17:30] <Bofu2U> you know, small things :P
[17:30] <Free99> they mention windows image support, but no docs!
[17:31] <Free99> I'll write to docs, no problem, but I gotta get it to work at all
[17:32] <Bofu2U> I know the feeling
[17:32] <Bofu2U> :)
=== rudi|comida is now known as redelmann
[17:39] <Free99> Bofu2U, another question: does the system install to local disk at all?
[17:39] <Bofu2U> yes
[17:39] <Free99> it doesn't seem to though
[17:39] <Bofu2U> that's the problem you're running into then
[17:39] <Free99> but how did it boot?
[17:39] <Bofu2U> PXE
[17:39] <Free99> it's just ram resident?
[17:39] <Bofu2U> yeah the curtin installer
[17:40] <Bofu2U> that's what the final reboot is on the deploy
[17:40] <Bofu2U> it hits PXE, PXE tells it to boot off local disk
[17:40] <Free99> how do I watch what curtin is doing?
[17:40] <Bofu2U> through the IPMI
[17:41] <Bofu2U> so, the first is the initial boot and info gathering
[17:41] <Bofu2U> that won't touch the disk, just gets it into MAAS. Doesn't get RAM/CPUs, but will pull IPMI specs
[17:41] <Bofu2U> then you commission and it gathers more information such as the RAM, CPU, etc.
[17:41] <Bofu2U> then deploy, and it writes to disk, does all of that, and then reboots and PXE tells it to boot from that disk
[17:42] <Bofu2U> hopefully that makes sense - just going off of what I remember from the process overall
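Editor's note: the three stages Bofu2U describes (initial boot into MAAS, then commission, then deploy) can be pictured as a small state machine. This is a minimal sketch under assumptions, not MAAS code: the names `Stage`, `TRANSITIONS`, and `advance` are hypothetical, "New" and "Ready" are states mentioned in this log, and "Deployed" is assumed as the name of the final state.

```python
from enum import Enum


class Stage(Enum):
    NEW = "new"            # initial boot: node is known to MAAS, disk untouched
    READY = "ready"        # commissioned: RAM, CPU, etc. have been gathered
    DEPLOYED = "deployed"  # OS written to disk; PXE tells the node to boot from it


# Allowed steps, in the order described in the conversation.
TRANSITIONS = {
    (Stage.NEW, "commission"): Stage.READY,
    (Stage.READY, "deploy"): Stage.DEPLOYED,
}


def advance(stage, action):
    """Return the next stage, or raise ValueError if the step is out of order."""
    try:
        return TRANSITIONS[(stage, action)]
    except KeyError:
        raise ValueError(f"cannot {action} a node in stage {stage.value}") from None
```

In this picture, a node that reaches the login prompt while MAAS still shows "deploying" has not been recorded as having made the last transition.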
[17:42] <Free99> sure does, I've gotten to the deploy stage.. and that's it
[17:42] <Bofu2U> yeah
[17:42] <Bofu2U> just because I think it would be an interesting test
[17:42] <Bofu2U> have you tried hitting the bios and disabling the CD ROM?
[17:43] <Free99> I'll try that if this deploy fails
[17:44] <Free99> got back to the login screen, correctly named node-7
[17:45] <Free99> latest event is PXE request - curtin install
[17:45] <Free99> but no visible disk activity
[17:49] <Free99> hmm... I did set to install with LVM, maybe I ought to revert to flat disk layout..
[17:49] <Bofu2U> worth trying
[17:52] <Free99> Bofu2U, I'm going to recommission this one node.. should I allow SSH? retain network?
[17:52] <Bofu2U> I did that just so I could try to test it
[17:52] <Bofu2U> I think the login is ubuntu/ubuntu
[17:52] <Free99> the network is DHCP, with dhcp registering hostnames in dns automatically
[17:53] <Free99> I love linux
[17:53] <Free99> and bsd too
[17:54] <Free99> sometimes the software is really cranky though
[17:57] <mup> Bug #1556219 opened: maas enlistment of power8 found ipmi 1.5 should do ipmi 2.0 <MAAS:New> <https://launchpad.net/bugs/1556219>
[18:00] <Free99> weird, it just denies my logging in due to publickey
[18:00] <Free99> doesn't even prompt for a pass :-/
[18:04] <Bofu2U> try from ipmi?
[18:06] <Free99> Bofu2U, I've never used SOL before. do I need to add a kernel line to redirect to com1?
[18:06] <Bofu2U> what kind of servers?
[18:06] <Free99> it's a dell with iDrac 5 I think
[18:06] <Free99> ipmi 2
[18:06] <Bofu2U> login to the web, and try to load ... usually called "virtual console"
[18:07] <Bofu2U> don't need actual SOL
[18:07] <Free99> think they added that webconsole thing in idrac 6
[18:07] <Bofu2U> ah crap
[18:10] <Free99> any way to increase verbosity on all this stuff?
[18:10] <Bofu2U> don't know :(
[18:10] <Bofu2U> sorry
[18:10] <Free99> ubuntu/ubuntu doesn't work as a login here
[18:19] <Free99> ah ha! with the key I added in the Maas dashboard, I have to login to the nodes with ubuntu@hostname and use the same key I added to my dashboard login
[18:20] <Bofu2U> ahhh ok
[18:23] <Free99> ok so check it: cloud-init-output.log says error encountered setting up postfix
[18:23] <Bofu2U> o.O
[18:26] <Free99> ok, what logs would help figure this out?
[18:26] <Free99> I've got em all
[18:32] <voidspace> roaksoax: so the version from that experimental ppa certainly behaves *differently*
[18:33] <voidspace> roaksoax: with that version the nodes don't enlist
[19:07] <Free99> why can't I set an FQDN?
[19:08] <Free99> http://paste.ubuntu.com/15349884/ <-- my setup fails because of this
[19:09] <Free99> I think there's a bug here folks
[19:34] <Free99> can anyone please help with this cloud-init issue?
[19:48] <mup> Bug #1556258 opened: boot source keyring data is sometimes outputted as memoryview object <MAAS:Triaged> <https://launchpad.net/bugs/1556258>
[19:55] <Free99> dang it :[
[19:55] <Free99> I wish I could figure out why maas 1.9.7 is adding an extraneous period to my postfix file which borks the whole deployment
[22:55] <mpontillo> Free99: can you file a bug? I think it's maybe a postfix bug TBH; that is a valid and proper FQDN
[22:59] <mpontillo> See http://tools.ietf.org/html/rfc1034 section 3.1
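Editor's note: RFC 1034 section 3.1 treats a name that ends with a dot as a complete (absolute) domain name, which is why the trailing period Free99 sees is still a valid FQDN. For a consumer that does not accept the trailing dot, one possible workaround is to strip it before writing the value out. The sketch below is illustrative only; `is_absolute` and `strip_root_dot` are hypothetical helper names, not part of MAAS, cloud-init, or postfix.

```python
def is_absolute(name):
    """True if the domain name is written in absolute form (ends with the root dot)."""
    return name.endswith(".")


def strip_root_dot(name):
    """Drop a single trailing root dot, leaving the bare root name '.' untouched."""
    if len(name) > 1 and name.endswith("."):
        return name[:-1]
    return name
```

For example, with a made-up domain, `strip_root_dot("node-7.example.com.")` gives `"node-7.example.com"`, while a name without the trailing dot is returned unchanged.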
=== CyberJacob is now known as zz_CyberJacob
[23:49] <roaksoax> voidspace: i have managed to enlist machines with the one on experimental, however, I hit your issue
