=== CyberJacob is now known as CyberJacob|Away | ||
designate | in the maas 1.7 changelog it says maas no longer uses squid-deb-proxy but for some reason it's still getting installed... | 00:25 |
---|---|---|
designate | does anyone even monitor this channel? | 01:18 |
roaksoax | designate: sudo apt-get dist-upgrade should remove it | 01:51 |
roaksoax | designate: if you did sudo apt-get upgrade it might not have | 01:51 |
designate | roaksoax: thank you, I will try that. | 01:54 |
roaksoax | designate: np! | 01:56 |
=== CyberJacob|Away is now known as CyberJacob | ||
=== CyberJacob is now known as CyberJacob|Away | ||
thebozz | Guys, I need some help. We're trying to commission a Dell R710 server into our newly installed MAAS cluster. However, it fails after turning on and off a couple of times, and the GUI only says "Failed to power on node — Timeout after 7 tries". What could be wrong, and where should we start looking to fix this issue? | 13:22 |
thebozz | The rest of our cluster is all R720 servers. My boss suspects there could be an issue because the R720s have iDRAC 7, while the R710 has iDRAC 6. | 13:23 |
thebozz | Anyone in here? | 14:54 |
jhobbs | thebozz: what version of MAAS are you using? | 14:55 |
jhobbs | sounds like 1.7 | 14:55 |
jhobbs | there is a button on the node page where you can check the power state of the node | 14:55 |
jhobbs | thebozz: can you try that button and see if it works for the 710 | 14:55 |
thebozz | It ran successfully, and detected the node as off. | 14:56 |
jhobbs | so MAAS can reach the node and has good credentials for it | 14:57 |
jhobbs | when you say it fails after turning on | 14:57 |
jhobbs | and off a couple of times, | 14:57 |
jhobbs | how are you turning it on and off there? | 14:57 |
thebozz | Clicking on "commission node". | 14:58 |
thebozz | Let me get the relevant logs, maybe there's some useful info in there. | 14:58 |
jhobbs | is there anything else talking to the BMCs? nagios or something like that | 14:59 |
jhobbs | or serial over lan | 15:00 |
thebozz | :/ actually, I have no idea what I'm looking at. I haven't been involved in the deployment other than helping here and there. Is there anything I should look at in the logs to help me debug this? | 15:01 |
jhobbs | well /var/log/maas/maas.log and /var/log/maas/maas-django.log might be useful, if you can post them | 15:01 |
jhobbs | if the node is powered off right now, what state is it in in MAAS? Ready? Failed Commissioning? | 15:02 |
thebozz | Failed Commissioning. Let me do some filtering on those files, I'll try to grab anything that seems relevant. | 15:03 |
jhobbs | can you try commissioning again, since the power check is working? | 15:04 |
jhobbs | if that doesn't work, you should try powering on the node manually via IPMI using MAAS's credentials | 15:05 |
thebozz | My boss insists he thinks it has to do with every other node having iDRAC 7 while this one has iDRAC 6. Is that relevant at all? | 15:10 |
jhobbs | it could be | 15:12 |
thebozz | Here are the logs: http://pastebin.com/aTf05ajh => maas.log ; http://pastebin.com/aQEAn3MR => pserv.log ; maas-django.log didn't have any references to the relevant MAC address. Is there anything else I can use to filter? | 15:13 |
thebozz | About the iDRAC thing... how is it relevant? I don't really understand that. | 15:13 |
jhobbs | there was a bug at one point that affected r710 | 15:14 |
jhobbs | https://bugs.launchpad.net/ubuntu/+source/maas/+bug/1287964 | 15:14 |
ubot5` | Launchpad bug 1287964 in MAAS "MAAS incorrectly detects / sets-up BMC information on Dell PowerEdge servers" [High,Fix released] | 15:14 |
jhobbs | IPMI is quirky - different versions of it react different ways to the same commands sometimes | 15:14 |
jhobbs | just minor differences in how the protocol is implemented | 15:14 |
jhobbs | r710 doesn't look to be ubuntu certified for 14.04 | 15:15 |
jhobbs | r720 is though | 15:15 |
thebozz | Huh. Then it's worth a shot to do it manually. Will MAAS be able to turn it on and off at will after commissioning? | 15:15 |
jhobbs | well that depends on why it's not working - if it's not working now, and nothing changes, i wouldn't expect it to change after commissioning | 15:16 |
jhobbs | oh, r710 is certified too, so it should be working | 15:16 |
thebozz | That only makes this even weirder :/ | 15:16 |
jhobbs | http://www.ubuntu.com/certification/hardware/201404-14939/ | 15:16 |
jhobbs | since it works sometimes and sometimes it doesn't, i would suspect either something else is talking to it via IPMI and using its sessions up, or maybe something at the network layer is bad - duplicate IP addresses maybe? | 15:18 |
jhobbs | or maybe the firmware on the bmc is out of date? | 15:18 |
thebozz | The cluster is on its own subnetwork, pretty much isolated ATM. So yeah, my guess would be network layer issues or firmware. | 15:20 |
=== jfarschman is now known as MilesDenver | ||
=== roadmr is now known as roadmr_afk | ||
=== roadmr_afk is now known as roadmr | ||
designate | I am trying to bootstrap an environment using maas/juju (latest stable versions of both) but I'm getting the following error: "401 OK (Authorization Error: 'Expired timestamp: given 1417774874 and now 1417800053 has a greater difference than threshold 300')" despite the fact that I have configured an NTP server in MAAS that is reachable by all servers. | 17:48 |
=== CyberJacob|Away is now known as CyberJacob | ||
=== roadmr is now known as roadmr_afk | ||
=== roadmr_afk is now known as roadmr | ||
designate | since images are now stored in the maas database, can anyone point me in the direction of modifying the ephemeral image? I need to add an NTP server because of clock differences causing oauth errors. | 22:41 |
=== CyberJacob is now known as CyberJacob|Away | ||
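
Editor's note on the 15:05 suggestion to power the node on manually via IPMI with MAAS's credentials: the sketch below shows one way that check might be scripted. It assumes `ipmitool` is installed on the machine running it and that the iDRAC accepts IPMI v2.0 over LAN (the `lanplus` interface); neither is confirmed in the log. The address, user name, and password are placeholders, and the function names are hypothetical.

```python
import subprocess

ALLOWED_ACTIONS = ("status", "on", "off")


def ipmi_power_command(host, user, password, action="status"):
    """Build the ipmitool argument list for a chassis power action."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    return [
        "ipmitool",
        "-I", "lanplus",  # assumption: the BMC speaks IPMI v2.0 over LAN
        "-H", host,       # BMC (iDRAC) address
        "-U", user,       # the power user MAAS has stored for the node
        "-P", password,
        "chassis", "power", action,
    ]


def run_ipmi_power(host, user, password, action="status", timeout=30):
    """Run ipmitool and return (exit code, combined stdout and stderr)."""
    result = subprocess.run(
        ipmi_power_command(host, user, password, action),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, (result.stdout + result.stderr).strip()


# Example call with placeholder values (requires ipmitool and a reachable BMC):
# run_ipmi_power("192.0.2.10", "maas", "secret", "status")
```

If the status query succeeds from here but a manual "on" times out or fails intermittently, that would point toward the BMC or the network rather than MAAS itself, which is the distinction jhobbs was trying to draw.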
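
Editor's note on the OAuth error quoted at 17:48: the two timestamps in the message are Unix epoch seconds, so the size of the clock difference can be read straight from them. The sketch below does that arithmetic. It assumes "given" is the timestamp sent by the client and "now" is the server's clock, which is how the message reads; the variable and function names are illustrative only.

```python
from datetime import datetime, timezone

THRESHOLD_SECONDS = 300  # threshold quoted in the error message


def clock_skew(given, now):
    """Return how far the 'given' timestamp lags behind 'now', in seconds."""
    return now - given


given = 1417774874  # timestamp in the rejected request
now = 1417800053    # clock of the machine that rejected it

skew = clock_skew(given, now)
within_threshold = abs(skew) <= THRESHOLD_SECONDS

print("skew in seconds:", skew)
print("skew in hours:", round(skew / 3600, 2))
print("given as UTC:", datetime.fromtimestamp(given, tz=timezone.utc).isoformat())
print("now as UTC:  ", datetime.fromtimestamp(now, tz=timezone.utc).isoformat())
print("within threshold:", within_threshold)
```

The difference is 25179 seconds, just under seven hours, against an allowed 300 seconds. A gap that large is well beyond ordinary drift, so it seems more likely that the node's clock was never synchronized when the request was made; that is an inference from the numbers, not something the log confirms.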