pmatulis | please take notes :) we need to capture everything in the docs | 00:00 |
---|---|---|
mimizone | all right, not completely over my long story with curtin... :) | 00:18 |
mimizone | the stuff specific to the node is taken properly, but the late_commands etc.... are not executed. | 00:19 |
mimizone | if I stuff them in the curtin_userdata, they are executed. | 00:19 |
mimizone | I suspect it is overridden or something since the userdata redefines the late_commands | 00:21 |
mimizone | ? | 00:21 |
mimizone | last reboot/pxe for today.... enough of that thing already :) | 00:28 |
mimizone | see you tomorrow gents. | 00:39 |
pmatulis | see you mimizone | 00:56 |
Budgie^Smore | OK I wish we were on IPv6 already!!! I keep running into transition problems! | 00:59 |
stormmore | This is really weird, I have installed maas a dozen times today and randomly it stops configuring the maas rack controller! | 02:17 |
cnf | aaaand morning | 09:29 |
cnf | how do I tell maas to not scan / ignore certain subnets? | 09:33 |
brendand | cnf, you mean for device discovery? | 09:34 |
cnf | yes | 09:34 |
brendand | cnf, and i guess you can't just delete the subnet? | 09:34 |
cnf | i don't know, can I? it was auto discovered | 09:34 |
brendand | cnf, yeah, of course | 09:35 |
cnf | k, did that | 09:35 |
brendand | cnf, it auto discovers everything as a 'convenience' | 09:35 |
cnf | ok | 09:36 |
brendand | the assumption is that most people will care about all visible subnets | 09:36 |
cnf | well, i don't care about the public subnet | 09:36 |
cnf | it's just in it to have internet access | 09:36 |
brendand | cnf, fyi you can disable per subnet too - uncheck 'Active mapping' in the subnet page | 09:37 |
cnf | yeah, it added those subnets again | 09:38 |
cnf | i deleted them, maas just added them back | 09:38 |
brendand | interesting | 09:38 |
brendand | roaksoax, mpontillo , intentional ^ | 09:38 |
brendand | ? | 09:38 |
cnf | also, IPMI doesn't seem to work on HP ilo 4 | 09:42 |
cnf | it calls the binary with the wrong arguments | 09:43 |
cnf | hmm | 09:44 |
brendand | cnf, hp ilo is a separate power management type | 09:45 |
cnf | yes | 09:45 |
cnf | and maas is calling ipmitool with the wrong arguments | 09:45 |
cnf | Failed to execute ('/usr/bin/ipmitool', '-I', 'lanplus', '-H', 'x.x.x.x', '-U', 'Administrator', '-P', '********', '94:18:82:03:AD:2E', 'power', 'status') for cartridge 94:18:82:03:AD:2E at x.x.x.x: Invalid command: 94:18:82:03:AD:2E | 09:46 |
cnf | if i remove the mac from the command, it works fine | 09:47 |
brendand | cnf, actually, what kind of servers are these? | 09:47 |
cnf | HP DL380 | 09:47 |
cnf | gen 9 | 09:47 |
brendand | cnf, i think the ilo entries are for moonshot | 09:47 |
brendand | cnf, which is not what you want | 09:47 |
cnf | so these are just not supported? | 09:48 |
brendand | cnf, i think you should be using just IPMI | 09:48 |
brendand | cnf, were the power params auto-detected or did you enter them yourself? | 09:48 |
cnf | entered them myself, maas so far can't see my server | 09:49 |
cnf | that's an entirely different mess | 09:49 |
cnf | k, you were right, using just ipmi works | 09:49 |
cnf | thanks | 09:49 |
cnf | (i'm trying to understand and fix several things about MaaS at once...) | 09:50 |
brendand | cnf, so are your machines pxe booting? | 09:50 |
cnf | no | 09:50 |
brendand | cnf, as in they aren't trying, or maas doesn't detect them when they do? | 09:51 |
cnf | they are trying | 09:51 |
cnf | but i have some problems getting it to work | 09:51 |
brendand | cnf, and is dhcp enabled on the subnet they are on? | 09:52 |
cnf | first, they are on a LAG | 09:52 |
brendand | cnf, ok | 09:52 |
cnf | figuring out how to PXE on a lag has been a chore | 09:52 |
cnf | now, i am seeing dhcp requests, but the machines are not responding to whatever the maas dhcpd is sending | 09:52 |
cnf | that's where i left yesterday | 09:52 |
cnf | 10:51:59.769027 14:02:ec:8a:83:85 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 389: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 14:02:ec:8a:83:85, length 347 | 09:53 |
cnf | 10:52:00.770859 00:50:56:94:e8:37 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 342: 172.20.20.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300 | 09:53 |
cnf | basically | 09:53 |
cnf | it's the right source mac | 09:53 |
brendand | cnf, is http://www.brocade.com/content/html/en/configuration-guide/fastiron-08030b-l2guide/GUID-DC254740-F2BF-4279-820A-794AFBB86999.html in any way relevant? | 09:56 |
brendand | "You can configure the member port of a dynamic LAG to be logically operational even when the dynamic LAG is not operating. This enables PXE boot support on this port." | 09:56 |
brendand | probably not the same switch, but maybe the principle applies? | 09:56 |
cnf | yeah, we looked at that | 09:56 |
cnf | it's a juniper QFX3500 | 09:56 |
cnf | and on juniper it's also called force-up | 09:56 |
cnf | atm, we disabled LACP, and are just using etherchannel | 09:57 |
cnf | the requests are coming in, so it should work | 09:57 |
cnf | brendand: http://www.juniper.net/techpubs/en_US/junos10.3/topics/reference/configuration-statement/force-up-edit-interfaces-ex-series.html is the juniper ref, btw | 10:00 |
brendand | cnf, so you're running dhcpdump on the maas server when booting the machine? | 10:00 |
cnf | yep | 10:00 |
cnf | and the MAAS server is a VM on a vsphere cluster, if that matters | 10:01 |
brendand | cnf, and you're sure dhcp is enabled on the correct vlan? | 10:04 |
cnf | brendand: well, it's replying to the requests :P | 10:04 |
cnf | also, for the MAAS vm, it's just an interface, it doesn't see the vlans | 10:05 |
brendand | cnf, ok i misunderstood. so it's responding but the BOOTPREPLY packets don't reach the machine | 10:06 |
brendand | then we can probably rule out maas as the problem | 10:07 |
brendand | cnf, what's happening on the switch itself | 10:07 |
brendand | ? | 10:07 |
cnf | sorry, was AFK changing some switch things | 10:24 |
cnf | added a copper port, so I can try with a laptop | 10:24 |
cnf | brb | 10:24 |
brendand | cnf, that's alright, i was afk making breakfast ;) | 10:25 |
cnf | :P | 10:30 |
cnf | ok, the server on copper worked | 10:57 |
brendand | cnf, good to know | 10:58 |
cnf | ok, but it doesn't auto detect servers it boots? | 10:58 |
brendand | cnf, maas should auto-detect any system that boots under its direction | 10:59 |
cnf | i can't find it, atm | 10:59 |
brendand | you should see a message like 'booting under maas direction' | 10:59 |
brendand | on the machine's console | 11:00 |
cnf | yes, it did that | 11:00 |
brendand | cnf, is the machine still on and doing things? | 11:00 |
cnf | but i can't find in maas | 11:00 |
cnf | it sits at the login prompt | 11:00 |
brendand | cnf, that didn't work properly then | 11:01 |
cnf | hmz | 11:01 |
brendand | cnf, is there anything in /var/log/maas/rsyslog? | 11:01 |
cnf | that is an empty directory | 11:01 |
cnf | i have /var/log/maas/maas.log | 11:02 |
cnf | i see it there | 11:02 |
cnf | so it discovers everything i don't care about, but not what I do care about :P | 11:03 |
cnf | it should show up on the dashboard, right? | 11:05 |
cnf | with the rest of the discovered devices? | 11:05 |
cnf | so when I go to the subnet, the ip the metal booted with is marked as "observed" | 11:06 |
cnf | but that's the only reference i can see to it | 11:06 |
cnf | hmz, am I doing something wrong? | 11:07 |
brendand | cnf, i'd have to see the serial console to know for sure | 11:07 |
cnf | serial of what? | 11:08 |
cnf | do I have to add my metal manually, maybe? | 11:08 |
brendand | cnf, the booting machine? | 11:09 |
cnf | oh | 11:09 |
cnf | i don't have access to that, atm | 11:09 |
brendand | cnf, you can use serial over lan | 11:09 |
cnf | yes, i have no access to that, atm | 11:10 |
cnf | recycled machine, someone changed the ilo password | 11:10 |
cnf | someone that isn't in today | 11:10 |
brendand | cnf, i see | 11:10 |
brendand | cnf, you need to watch it closely and see what happens after booting under maas direction | 11:11 |
brendand | cnf, it should transfer a grub/efi config and initrd | 11:11 |
brendand | cnf, if you're desperate you could video it? | 11:12 |
cnf | ok, it says "no datasource found" | 11:13 |
cnf | (also, i'm going to smack the person that changed the ilo pass) | 11:13 |
cnf | and then some python3 errors | 11:14 |
brendand | sounds like cloud-init | 11:14 |
brendand | any mention of cloud-init there? | 11:14 |
brendand | cnf, details on the stacktrace would be good too | 11:15 |
cnf | https://www.dropbox.com/s/bo40mqgdfqow3lv/2017-02-28%2012.12.48.jpg?dl=0 | 11:16 |
cnf | i feel like a caveman, posting pictures of screens :P | 11:16 |
brendand | likely bad things to come! | 11:16 |
brendand | you're telling me | 11:17 |
cnf | the new machines i have iLo console on | 11:17 |
cnf | this one i am using to get it on copper | 11:17 |
cnf | anyway | 11:17 |
cnf | what is this datasource i can't find? | 11:18 |
cnf | hmz | 11:22 |
brendand | cnf, is /var/log/cloud-init-output.log populated? | 11:22 |
cnf | no such file or directory | 11:23 |
brendand | cnf, and does /etc/maas/preseeds/enlist exist? | 11:23 |
cnf | on the maas server | 11:23 |
cnf | that last one does | 11:23 |
cnf | but /var/log/cloud-init-output.log does not exist | 11:23 |
cnf | i also don't have /etc/cloud, at all | 11:23 |
brendand | cnf, i guess it can't access the metadata | 11:24 |
brendand | cat /etc/maas/regiond.conf? | 11:24 |
cnf | oh, it's stupid | 11:27 |
cnf | it sees itself with the wrong ip | 11:27 |
* cnf facepalms | 11:27 | |
cnf | what do i need to restart after i edit that file? | 11:27 |
brendand | cnf, systemctl restart maas-regiond | 11:28 |
cnf | also, how can I tell maas to totally ignore certain subnets, already :( | 11:28 |
brendand | cnf, best you can do is tell it not to scan it looks like | 11:28 |
brendand | i think the rediscovering thing is a bug | 11:28 |
cnf | hmm | 11:31 |
cnf | ok, now it came up, it seems | 11:34 |
cnf | and i can ssh to it | 11:42 |
cnf | right, that's a first step | 11:42 |
cnf | ok! it doesn't see my storage, though | 11:45 |
cnf | interesting | 11:45 |
cnf | ok, so without LAG, things seem to be working now | 11:48 |
cnf | after lunch, i'll need to get the LAG case working | 11:48 |
cnf | ok! | 14:04 |
cnf | so, one machine working, i think | 14:04 |
brendand | cnf, cool | 14:06 |
cnf | that's on a single copper link | 14:06 |
cnf | i can't get the ones on the 10G LAG's working | 14:06 |
cnf | i have no idea why | 14:06 |
brendand | cnf, it's common i think to require a separate interface for pxe | 14:06 |
cnf | yeah, that's a bit of an issue | 14:07 |
cnf | our network is set up for 2 x 10G fiber LAGs | 14:07 |
cnf | hmm | 14:35 |
cnf | so we have metal that can _only_ boot from the 10G cards | 14:35 |
cnf | no one here experience with PXE booting on LAG interfaces? | 14:36 |
roaksoax | cnf: i dont, but i do know others have done successfully on 10g cards | 14:47 |
roaksoax | cnf: so whats the particular issue in your case ? | 14:47 |
cnf | roaksoax: if i take down the LACP, and set up a pure etherchannel, i see dhcp requests coming in, but the server isn't responding to the replies | 14:49 |
roaksoax | cnf: you mean, you see dhcp requests in the maas server, but the maas server isn't responding, or is it the other way around ? | 14:51 |
cnf | i see requests on the maas server, and i see responses on the maas server | 14:51 |
cnf | but my metal isn't reacting at all | 14:51 |
cnf | roaksoax: also, is there a way to have maas ignore certain ranges in discovery? | 14:57 |
cnf | it's sat in a few mixed ranges (iLo, and public) that it should just ignore, but i get a huge list of discovered devices in those ranges | 14:58 |
roaksoax | cnf: i wonder if this is something the switch is filtering out ? | 15:06 |
roaksoax | cnf: like stp enabled ? | 15:07 |
roaksoax | cnf: and no the discovery will happen on all subnets, you can turn off discovery altogether though | 15:07 |
cnf | hmm, that sound weird, should have an option to turn it off per subnet (or per interface) | 15:08 |
cnf | afaik stp isn't enabled on the qfabrics | 15:13 |
mup | Bug #1668650 opened: I have both hwe-16.04 and hwe-x minimum kernel options and they mean different things apparently <oil> <MAAS:New> <https://launchpad.net/bugs/1668650> | 15:13 |
cnf | any way to stop maas from trying to load a firmware in an endless loop? | 16:08 |
cnf | http://pastebin.com/qygUiHYT i keep seeing that over and over again, 2 or 3 times / second | 16:09 |
cnf | hmm, and It seems the DL380p's can't boot from the pci broadcom NIC | 16:34 |
cnf | :( | 16:34 |
cnf | k, firmware upgrade helped :P | 16:59 |
mup | Bug #1550081 changed: [2.0] No error message is displayed when failed to add a domain <error-surface> <MAAS:Fix Released> <MAAS 2.0:Won't Fix> <MAAS trunk:Fix Released> <https://launchpad.net/bugs/1550081> | 17:07 |
cnf | ok, time to go home | 17:20 |
mup | Bug #1668703 opened: Use external NTP servers only option has no effect <MAAS:New> <https://launchpad.net/bugs/1668703> | 17:39 |
mimizone | hi, is the API call "GET /MAAS/metadata/latest/by-id/agqa6n/?op=get_preseed" deprecated? it still works and is being used by MAAS but I don't see any documentation on it. | 18:04 |
mimizone | I'd like to know what are the other op values I can use to retrieve the other metadata files in curtin | 18:05 |
mimizone | specifically the curtin_userdata file | 18:05 |
vogelc | Anyone here have experience adding subnets and configuring DHCP? I can't add a subnet and enable DHCP if I don't have a host with an interface on that subnet. | 18:07 |
pmatulis | mimizone, hi. what version of maas are you using? | 18:21 |
mimizone | 2.1 | 18:21 |
mimizone | I've tried op=get_curtin_userdata but the curl returns an error and HTTP 400 | 18:22 |
pmatulis | mimizone, i don't see any mention of 'preseed' in any api version | 18:23 |
mimizone | pmatulis: I've seen that too... but it is used during the deployment process I can assure you. it's how the node retrieves the curtin files. | 18:23 |
mimizone | you can even see the log in the maas log file | 18:24 |
pmatulis | mimizone, hm, maybe it's meant for internal processes only | 18:24 |
pmatulis | vogelc, that makes sense IMO. how can a machine offer dhcp leases for a subnet it is not connected to? i suppose you can try using a dhcp relay | 18:25 |
mimizone | I just want to use it for debugging actually. to understand how all that curtin stuff works | 18:25 |
pmatulis | ohh | 18:25 |
mimizone | the cloud-init must make a call of some kind to retrieve the curtin_userdata file, I can't figure this one out. I haven't read the entire source code of maas :) | 18:26 |
mimizone | where is the code that runs in the node during the deployment by the way? I've looked only at the maasserver code so far | 18:28 |
pmatulis | there are scripts for enlistment, commissioning, deployment. i'm not sure of their location | 18:31 |
brendand | pmatulis, mimizone - /etc/maas/preseeds | 18:32 |
brendand | pmatulis, don't think that's quite what mimizone is looking for though | 18:33 |
mimizone | brendand: I mean the logic that gets those preseed files | 18:33 |
mimizone | those files are rendered/sent by the maasserver/preseed.py but I can't find what is making the call to the server. | 18:34 |
brendand | mimizone, can you check /etc/maas/regiond.conf? | 18:34 |
brendand | mimizone, is maas_url something the node can reach? | 18:34 |
mimizone | not much there. the maas_url, database information | 18:35 |
mimizone | in my experience, I see that the node deployment uses curtin, but the commands in my configuration files are not used, then on the second boot, cloud-init seems to retrieve the curtin_userdata file, and then it triggers the curtin commands in that file. I assume that logic is somewhere. | 18:38 |
mimizone | brendand: yes, no issue reaching the maas server. I am just interested in understanding the curtin/metadata stuff because I have to customize the deployment. | 18:42 |
mimizone | I also use this process to fix a current bug in maas when there are multiple static routes. | 18:42 |
vogelc | pamtulis, thats right we are planning on pointing the dhcp forwarders to the MAAS rack controllers. | 18:48 |
vogelc | pamtulis, really all we need to do is define the subnets in DHCP. Can we do it manually until a possible patch is in place? | 18:49 |
vogelc | pmatulis: sorry for spelling your id incorrectly. | 18:52 |
pmatulis | vogelc, the current devel version of maas has dhcp relay integration in the sense that maas will send the appropriate config to the active dhcp server (providing it is maas-managed) | 18:58 |
pmatulis | it does not provide the relay however | 18:58 |
vogelc | pmatulis: So I did upgrade to 2.2 but it still will not let me add DHCP to a VLAN because there are no controllers with that subnet defined on an interface. I was hoping what you mentioned would be our fix. | 19:01 |
pmatulis | vogelc, did you choose "Relay DHCP" for that VLAN? | 19:03 |
vogelc | pmatulis: totally missed that option. so if my NIC is on 10.1.1.0, it will relay it to 10.2.2.0 vlan/subnet? | 19:06 |
pmatulis | https://docs.ubuntu.com/maas/devel/en/installconfig-network-dhcp#dhcp-relay | 19:07 |
pmatulis | maas does not actually do the relaying. like i said above, it will send the dhcp config to the active dhcp server | 19:09 |
pmatulis | (which must be maas-managed obviously) | 19:09 |
vogelc | pmatulis: awesome!! Thanks, trying it out now. | 19:09 |
pmatulis | vogelc, i'm really interested in your feedback on this. especially re the documentation | 19:10 |
vogelc | pmatulis: For sure, I'll provide an update. I have to get the relay forwarder setup on the switch and then we should be good to go. | 19:12 |
stormmore | OK that is weird, something about the node I am trying to enlist is causing it not to enlist :-/ | 20:15 |
palmertime | I have installed a single MAAS server (2.1.3+bzr5573-0ubuntu1) on ubuntu 16.04. I have setup ssh keys, dhcp and images are in sync for 16.04. When i boot up a PXE host, it receives a DHCP address and boots the image. It then fails on "Starting cloud-init" and ends at the login screen. From the server point of view i see the dhcp lease but no device was discovered. Is there a default login to the boot image? where should i | 20:15 |
catbus1 | palmertime: when it fails on starting cloud-init, what's the error message? | 20:21 |
palmertime | catbus1: It scrolls by too fast in the console. | 20:22 |
catbus1 | I happened to encounter an enlist error yesterday and it also ends at a login screen. I found that the maas-region-controller ip wasn't set correctly, cloud-init was reporting failing to contact 169.254.169.254. | 20:23 |
catbus1 | palmertime: could you run sudo dpkg-reconfigure maas-region-controller to check if ip is set correctly? | 20:23 |
catbus1 | /nick catbus1-afk | 20:27 |
palmertime | catbus1: Thanks for the tip. I'm going to dig through the API and see what the current setting is | 20:27 |
stormmore | it is weird, I can create a VM manually and it will enlist, use vagrant to do it and it won't :-/ | 20:33 |
stormmore | I am also getting an inconsistent installing the maas "package". sometimes it doesn't register the rack controller with the region controller | 20:40 |
stormmore | "Select a valid choice. qpn7s6 is not one of the available choices." is what I get from time to time when I am setting up dhcp through the cli shortly after install maas | 20:45 |
stormmore | it makes no sense... doing apt-get install maas should not cause that error, should it? I am seeing it more frequently :-/ | 21:02 |
mup | Bug #1668759 opened: [2.2, trunk] Window width directive fails to remove event from window <MAAS:Triaged by ricgard> <https://launchpad.net/bugs/1668759> | 21:11 |
oviliz | Hi guys | 21:14 |
oviliz | "You need one small server for MAAS and at least one server which can be managed with a BMC. It is recommended to have the MAAS server provide DHCP and DNS on a network the managed machines are connected to." | 21:15 |
oviliz | Does that mean that the "small" server for MAAS is not going to be used sharing its CPU/RAM/drives? | 21:16 |
brendand | oviliz, yes | 21:20 |
brendand | oviliz, maas needs a place to exist | 21:20 |
stormmore | make no sense :-/ I can't see a difference between the 2 VMs but there has to be one :-/ | 21:25 |
oviliz | ty @brendand | 21:45 |
palmertime | How do i set the MAAS PXE/Provisioning network address from the maas command rather than dpkg-reconfigure? | 21:55 |
mup | Bug #1668774 opened: intermittent SSL connection <MAAS:Triaged> <https://launchpad.net/bugs/1668774> | 21:56 |
mup | Bug #1668774 changed: intermittent SSL connection <MAAS:Triaged> <https://launchpad.net/bugs/1668774> | 22:05 |
mup | Bug #1668774 opened: intermittent SSL connection <MAAS:Triaged> <https://launchpad.net/bugs/1668774> | 22:08 |
roaksoax | palmertime: /etc/maas/rackd.conf -> maas_url change that from localhost:5240/MAAS to <ip:5240>/MAAS | 22:33 |
roaksoax | palmertime: or: sudo maas-rack config --region-url http://<ip>:5240/MAAS | 22:36 |
roaksoax | palmertime: then restart maas-rackd | 22:36 |
mimizone | what would be a quick way via the cli/API to copy the same network configuration from one node to another (multiple interfaces/vlans), changing only the Ip addresses? | 23:12 |
palmertime | roaksoax: Perfect, Thanks for the info | 23:19 |
Budgie^Smore | roaksoax so I have narrowed down my enlisting problem a little bit - seem the way Vagrant creates VMs is to blame, trying to determine what it does / doesn't do differently from manually creating the VMs | 23:49 |
* Budgie^Smore is stormmore btw | 23:51 |
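
Two of the enlistment failures in this log (cnf at 11:27, palmertime at 22:33) came down to a `maas_url` that booting nodes could not use. The following is a minimal Python sketch of a sanity check for that value, not part of MAAS itself. It assumes the config file holds simple `key: value` lines; the function names `read_maas_url` and `maas_url_problem` and all sample addresses are hypothetical.

```python
import ipaddress
from urllib.parse import urlparse


def read_maas_url(conf_text):
    """Return the maas_url value from simple 'key: value' text, or None."""
    for line in conf_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "maas_url":
            return value.strip().strip("'\"")
    return None


def maas_url_problem(maas_url):
    """Describe an obvious problem with maas_url, or return None."""
    host = urlparse(maas_url).hostname
    if host is None:
        return "no host in URL"
    if host == "localhost":
        return "points at localhost"
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name; this sketch does not try to resolve it.
        return None
    if addr.is_loopback:
        return "points at a loopback address"
    if addr.is_link_local:
        return "points at a link-local address"
    return None


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    sample = "database_host: localhost\nmaas_url: http://localhost:5240/MAAS\n"
    url = read_maas_url(sample)
    print(url, "->", maas_url_problem(url))
```

A result of `None` only means nothing obviously wrong was found; it does not show that a node on the provisioning network can actually reach the address.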