[00:00] please take notes :) we need to capture everything in the docs
[00:18] all right, not completely over my long story with curtin... :)
[00:19] the stuff specific to the node is taken properly, but the late_commands etc.... are not executed.
[00:19] if I stuff them in the curtin_userdata, they are executed.
[00:21] I suspect it is overridden or something since the userdata redefines the late_commands
[00:21] ?
[00:28] last reboot/pxe for today.... enough of that thing already :)
[00:39] see you tomorrow gents.
[00:56] see you mimizone
[00:59] OK I wish we were on IPv6 already!!! I keep running into transition problems!
[02:17] This is really weird, I have installed maas a dozen times today and randomly it stops configuring the maas rack controller!
[09:29] aaaand morning
[09:33] how do I tell maas to not scan / ignore certain subnets?
[09:34] cnf, you mean for device discovery?
[09:34] yes
[09:34] cnf, and i guess you can't just delete the subnet?
[09:34] i don't know, can I? it was auto discovered
[09:35] cnf, yeah, of course
[09:35] k, did that
[09:35] cnf, it auto discovers everything as a 'convenience'
[09:36] ok
[09:36] the assumption is that most people will care about all visible subnets
[09:36] well, i don't care about the public subnet
[09:36] it's just in it to have internet access
[09:37] cnf, fyi you can disable per subnet too - uncheck 'Active mapping' in the subnet page
[09:38] yeah, it added those subnets again
[09:38] i deleted them, maas just added them back
[09:38] interesting
[09:38] roaksoax, mpontillo , intentional ^
[09:38] ?
[09:42] also, IPMI doesn't seem to work on HP ilo 4
[09:43] it calls the binary with the wrong arguments
[09:44] hmm
[09:45] cnf, hp ilo is a separate power management type
[09:45] yes
[09:45] and maas is calling ipmitool with the wrong arguments
[09:46] Failed to execute ('/usr/bin/ipmitool', '-I', 'lanplus', '-H', 'x.x.x.x', '-U', 'Administrator', '-P', '********', '94:18:82:03:AD:2E', 'power', 'status') for cartridge 94:18:82:03:AD:2E at x.x.x.x: Invalid command: 94:18:82:03:AD:2E
[09:47] if i remove the mac from the command, it works fine
[09:47] cnf, actually, what kind of servers are these?
[09:47] HP DL380
[09:47] gen 9
[09:47] cnf, i think the ilo entries are for moonshot
[09:47] cnf, which is not what you want
[09:48] so these are just not supported?
[09:48] cnf, i think you should be using just IPMI
[09:48] cnf, were the power params auto-detected or did you enter them yourself?
[09:49] entered them myself, maas so far can't see my server
[09:49] that's an entirely different mess
[09:49] k, you were right, using just ipmi works
[09:49] thanks
[09:50] (i'm trying to understand and fix several things about MaaS at once...)
[09:50] cnf, so are your machines pxe booting?
[09:50] no
[09:51] cnf, as in they aren't trying, or maas doesn't detect them when they do?
[09:51] they are trying
[09:51] but i have some problems getting it to work
[09:52] cnf, and is dhcp enabled on the subnet they are on?
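A minimal sketch of the ipmitool failure quoted above ([09:46]), assuming nothing about how MAAS builds the command internally. The function name `power_status_argv` and the `cartridge_mac` parameter are hypothetical; the flags and the binary path are the ones visible in the error message. It only shows that the extra MAC lands where ipmitool expects the subcommand, which matches the "Invalid command" error and the observation that removing the MAC makes the call work.

```python
from typing import List, Optional

# Path as it appears in the error message in the log.
IPMITOOL = "/usr/bin/ipmitool"


def power_status_argv(host: str, user: str, password: str,
                      cartridge_mac: Optional[str] = None) -> List[str]:
    """Build an ipmitool argv for a power status query.

    cartridge_mac is a hypothetical parameter used only to reproduce the
    failing call from the log: the MAC is placed as a positional argument
    in front of the 'power' subcommand.
    """
    argv = [IPMITOOL, "-I", "lanplus", "-H", host, "-U", user, "-P", password]
    if cartridge_mac is not None:
        # ipmitool reads the first positional argument as the command,
        # so a MAC here is rejected as an invalid command.
        argv.append(cartridge_mac)
    argv += ["power", "status"]
    return argv


if __name__ == "__main__":
    # Plain IPMI form (the one that worked once the MAC was removed).
    print(power_status_argv("x.x.x.x", "Administrator", "********"))
    # Form from the error message, with the MAC before 'power'.
    print(power_status_argv("x.x.x.x", "Administrator", "********",
                            cartridge_mac="94:18:82:03:AD:2E"))
```

Comparing the two printed lists makes the difference easy to spot when reading a similar error in the MAAS logs.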
[09:52] first, they are on a LAG
[09:52] cnf, ok
[09:52] figuring out how to PXE on a lag has been a chore
[09:52] now, i am seeing dhcp requests, but the machines are not responding to whatever the maas dhcpd is sending
[09:52] that's where i left yesterday
[09:53] 10:51:59.769027 14:02:ec:8a:83:85 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 389: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 14:02:ec:8a:83:85, length 347
[09:53] 10:52:00.770859 00:50:56:94:e8:37 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 342: 172.20.20.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
[09:53] basically
[09:53] it's the right source mac
[09:56] cnf, is http://www.brocade.com/content/html/en/configuration-guide/fastiron-08030b-l2guide/GUID-DC254740-F2BF-4279-820A-794AFBB86999.html in any way relevant?
[09:56] "You can configure the member port of a dynamic LAG to be logically operational even when the dynamic LAG is not operating. This enables PXE boot support on this port."
[09:56] probably not the same switch, but maybe the principle applies?
[09:56] yeah, we looked at that
[09:56] it's a juniper QFX3500
[09:56] and on juniper it's also called force-up
[09:57] atm, we disabled LACP, and are just using etherchannel
[09:57] the requests are coming in, so it should work
[10:00] brendand: http://www.juniper.net/techpubs/en_US/junos10.3/topics/reference/configuration-statement/force-up-edit-interfaces-ex-series.html is the juniper ref, btw
[10:00] cnf, so you're running dhcpdump on the maas server when booting the machine?
[10:00] yep
[10:01] and the MAAS server is a VM on a vsphere cluster, if that matters
[10:04] cnf, and you're sure dhcp is enabled on the correct vlan?
[10:04] brendand: well, it's replying to the requests :P
[10:05] also, for the MAAS vm, it's just an interface, it doesn't see the vlans
[10:06] cnf, ok i misunderstood.
so it's responding but the BOOTPREPLY packets don't reach the machine
[10:07] then we can probably rule out maas as the problem
[10:07] cnf, what's happening on the switch itself
[10:07] ?
[10:24] sorry, was AFK changing some switch things
[10:24] added a copper port, so I can try with a laptop
[10:24] brb
[10:25] cnf, that's alright, i was afk making breakfast ;)
[10:30] :P
[10:57] ok, the server on copper worked
[10:58] cnf, good to know
[10:58] ok, but it doesn't auto detect servers it boots?
[10:59] cnf, maas should auto-detect any system that boots under its direction
[10:59] i can't find it, atm
[10:59] you should see a message like 'booting under maas direction'
[11:00] on the machine's console
[11:00] yes, it did that
[11:00] cnf, is the machine still on and doing things?
[11:00] but i can't find in maas
[11:00] it sits at the login prompt
[11:01] cnf, that didn't work properly then
[11:01] hmz
[11:01] cnf, is there anything in /var/log/maas/rsyslog?
[11:01] that is an empty directory
[11:02] i have /var/log/maas/maas.log
[11:02] i see it there
[11:03] so it discovers everything i don't care about, but not what I do care about :P
[11:05] it should show up on the dashboard, right?
[11:05] with the rest of the discovered devices?
[11:06] so when I go to the subnet, the ip the metal booted with is marked as "observed"
[11:06] but that's the only reference i can see to it
[11:07] hmz, am I doing something wrong?
[11:07] cnf, i'd have to see the serial console to know for sure
[11:08] serial of what?
[11:08] do I have to add my metal manually, maybe?
[11:09] cnf, the booting machine?
[11:09] oh
[11:09] i don't have access to that, atm
[11:09] cnf, you can use serial over lan
[11:10] yes, i have no access to that, atm
[11:10] recycled machine, someone changed the ilo password
[11:10] someone that isn't in today
[11:10] cnf, i see
[11:11] cnf, you need to watch it closely and see what happens after booting under maas direction
[11:11] cnf, it should transfer a grub/efi config and initrd
[11:12] cnf, if you're desperate you could video it?
[11:13] ok, it says "no datasource found"
[11:13] (also, i'm going to smack the person that changed the ilo pass)
[11:14] and then some python3 errors
[11:14] sounds like cloud-init
[11:14] any mention of cloud-init there?
[11:15] cnf, details on the stacktrace would be good too
[11:16] https://www.dropbox.com/s/bo40mqgdfqow3lv/2017-02-28%2012.12.48.jpg?dl=0
[11:16] i feel like a caveman, posting pictures of screens :P
[11:16] likely bad things to come!
[11:17] you're telling me
[11:17] the new machines i have iLo console on
[11:17] this one i am using to get it on copper
[11:17] anyway
[11:18] what is this datasource i can't find?
[11:22] hmz
[11:22] cnf, is /var/log/cloud-init-output.log populated?
[11:23] no such file or directory
[11:23] cnf, and does /etc/maas/preseeds/enlist exist?
[11:23] on the maas server
[11:23] that last one does
[11:23] but /var/log/cloud-init-output.log does not exist
[11:23] i also don't have /etc/cloud, at all
[11:24] cnf, i guess it can't access the metadata
[11:24] cat /etc/maas/regiond.conf?
[11:27] oh, it's stupid
[11:27] it sees itself with the wrong ip
[11:27] * cnf facepalms
[11:27] what do i need to restart after i edit that file?
[11:28] cnf, systemctl restart maas-regiond
[11:28] also, how can I tell maas to totally ignore certain subnets, already :(
[11:28] cnf, best you can do is tell it not to scan it looks like
[11:28] i think the rediscovering thing is a bug
[11:31] hmm
[11:34] ok, now it came up, it seems
[11:42] and i can ssh to it
[11:42] right, that's a first step
[11:45] ok! it doesn't see my storage, though
[11:45] interesting
[11:48] ok, so without LAG, things seem to be working now
[11:48] after lunch, i'll need to get the LAG case working
[14:04] ok!
[14:04] so, one machine working, i think
[14:06] cnf, cool
[14:06] that's on a single copper link
[14:06] i can't get the ones on the 10G LAG's working
[14:06] i have no idea why
[14:06] cnf, it's common i think to require a separate interface for pxe
[14:07] yeah, that's a bit of an issue
[14:07] our network is set up for 2 x 10G fiber LAGs
[14:35] hmm
[14:35] so we have metal that can _only_ boot from the 10G cards
[14:36] no one here experience with PXE booting on LAG interfaces?
[14:47] cnf: i dont, but i do know others have done successfully on 10g cards
[14:47] cnf: so whats the particular issue in your case ?
[14:49] roaksoax: if i take down the LACP, and set up a pure etherchannel, i see dhcp requests coming in, but the server isn't responding to the replies
[14:51] cnf: you mean, you see dhcp requests in the maas server, but the maas server isn't responding, or is it the other way around ?
[14:51] i see requests on the maas server, and i see responses on the maas server
[14:51] but my metal isn't reacting at all
[14:57] roaksoax: also, is there a way to have maas ignore certain ranges in discovery?
[14:58] it's sat in a few mixed ranges (iLo, and public) that it should just ignore, but i get a huge list of discovered devices in those ranges
[15:06] cnf: i wonder if this is something the switch is filtering out ?
[15:07] cnf: like stp enabled ?
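A small sketch for the "requests and responses both visible on the maas server" situation described above. It parses capture lines in the format pasted at [09:53] and counts DHCP requests per client MAC and replies per server MAC. The names `parse_line`, `summarize` and `DhcpPacket` are hypothetical helpers, and the regular expression is only known to fit the two sample lines from the log; other tcpdump options may print a different layout.

```python
import re
from collections import Counter
from typing import Iterable, NamedTuple, Optional, Tuple

# Matches lines shaped like the two samples pasted in the log.
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<src_mac>[0-9a-f:]{17}) > (?P<dst_mac>[0-9a-f:]{17}), .*"
    r"BOOTP/DHCP, (?P<kind>Request|Reply)"
)


class DhcpPacket(NamedTuple):
    time: str
    src_mac: str
    dst_mac: str
    kind: str


def parse_line(line: str) -> Optional[DhcpPacket]:
    """Return a DhcpPacket for a matching capture line, else None."""
    match = LINE_RE.search(line.strip())
    if match is None:
        return None
    return DhcpPacket(**match.groupdict())


def summarize(lines: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count requests by client MAC and replies by server MAC."""
    requests: Counter = Counter()
    replies: Counter = Counter()
    for line in lines:
        packet = parse_line(line)
        if packet is None:
            continue
        if packet.kind == "Request":
            requests[packet.src_mac] += 1
        else:
            replies[packet.src_mac] += 1
    return requests, replies


SAMPLE = [
    "10:51:59.769027 14:02:ec:8a:83:85 > ff:ff:ff:ff:ff:ff, ethertype IPv4 "
    "(0x0800), length 389: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, "
    "Request from 14:02:ec:8a:83:85, length 347",
    "10:52:00.770859 00:50:56:94:e8:37 > ff:ff:ff:ff:ff:ff, ethertype IPv4 "
    "(0x0800), length 342: 172.20.20.1.67 > 255.255.255.255.68: BOOTP/DHCP, "
    "Reply, length 300",
]

if __name__ == "__main__":
    reqs, reps = summarize(SAMPLE)
    print("requests by client:", dict(reqs))
    print("replies by server:", dict(reps))
```

A capture on the MAAS side can only confirm that replies leave the server. Running the same count on a capture taken closer to the machine (for example on a mirrored switch port) would show whether the replies get lost on the LAG.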
[15:07] cnf: and no the discovery will happen on all subnets, you can turn off discovery altogether though
[15:08] hmm, that sounds weird, should have an option to turn it off per subnet (or per interface)
[15:13] afaik stp isn't enabled on the qfabrics
[15:13] Bug #1668650 opened: I have both hwe-16.04 and hwe-x minimum kernel options and they mean different things apparently
[16:08] any way to stop maas from trying to load a firmware in an endless loop?
[16:09] http://pastebin.com/qygUiHYT i keep seeing that over and over again, 2 or 3 times / second
[16:34] hmm, and It seems the DL380p's can't boot from the pci broadcom NIC
[16:34] :(
[16:59] k, firmware upgrade helped :P
[17:07] Bug #1550081 changed: [2.0] No error message is displayed when failed to add a domain
[17:20] ok, time to go home
[17:39] Bug #1668703 opened: Use external NTP servers only option has no effect
[18:04] hi, is the API call "GET /MAAS/metadata/latest/by-id/agqa6n/?op=get_preseed" deprecated? it still works and being used by MAAS but I don't see any documentation on it.
[18:05] I'd like to know what are the other op values I can use to retrieve the other metadata files in curtin
[18:05] specifically the curtin_userdata file
[18:07] Anyone here have experience adding subnets and configuring DHCP? I can't add a subnet and enable DHCP if I dont have a host with an interface on that subnet.
[18:21] mimizone, hi. what version of maas are you using?
[18:21] 2.1
[18:22] I've tried op=get_curtin_userdata but the curl returns an error and HTTP 400
[18:23] mimizone, i don't see any mention of 'preseed' in any api version
[18:23] pmatulis: I've seen that too... but it is used during the deployment process I can assure you. it's how the node retrieves the curtin files.
[18:24] you can even see the log in the maas log file
[18:24] mimizone, hm, maybe it's meant for internal processes only
[18:25] vogelc, that makes sense IMO. how can a machine offer dhcp leases for a subnet it is not connected to?
i suppose you can try using a dhcp relay
[18:25] I just want to use it for debugging actually. to understand how all that curtin stuff works
[18:25] ohh
[18:26] the cloud-init must make a call of some kind to retrieve the curtin_userdata file, I can't figure this one out. I haven't read the entire source code of maas :)
[18:28] where is the code that runs in the node during the deployment by the way? I've looked only at the maasserver code so far
[18:31] there are scripts for enlistment, commissioning, deployment. i'm not sure of their location
[18:32] pmatulis, mimizone - /etc/maas/preseeds
[18:33] pmatulis, don't think that's quite what mimizone is looking for though
[18:33] brendand: I mean the logic that gets those preseed files
[18:34] those files are rendered/sent by the maasserver/preseed.py but I can't find what is making the call to the server.
[18:34] mimizone, can you check /etc/maas/regiond.conf?
[18:34] mimizone, is maas_url something the node can reach?
[18:35] not much there. the maas_url, database information
[18:38] in my experience, I see that the node deployment uses curtin, but the commands in my configuration files are not used, then on the second boot, cloud-init seems to retrieve the curtin_userdata file, and then it triggers the curtin commands in that file. I assume that logic is somewhere.
[18:42] brendand: yes, no issue reaching the maas server. I am just interested in understanding the curtin/metadata stuff because I have to customize the deployment.
[18:42] I also use this process to fix a current bug in maas when there are multiple static routes.
[18:48] pamtulis, that's right we are planning on pointing the dhcp forwarders to the MAAS rack controllers.
[18:49] pamtulis, really all we need to do is define the subnets in DHCP. Can we do it manually until a possible patch is in place?
[18:52] pmatulis: sorry for spelling your id incorrectly.
[18:58] vogelc, the current devel version of maas has dhcp relay integration in the sense that maas will send the appropriate config to the active dhcp server (providing it is maas-managed)
[18:58] it does not provide the relay however
[19:01] pmatulis: So I did upgrade to 2.2 but it still will not let me add DHCP to a VLAN because there are no controllers with that subnet defined on an interface. I was hoping what you mentioned would be our fix.
[19:03] vogelc, did you choose "Relay DHCP" for that VLAN?
[19:06] pmatulis: totally missed that option. so if my NIC is on 10.1.1.0, it will relay it to 10.2.2.0 vlan/subnet?
[19:07] https://docs.ubuntu.com/maas/devel/en/installconfig-network-dhcp#dhcp-relay
[19:09] maas does not actually do the relaying. like i said above, it will send the dhcp config to the active dhcp server
[19:09] (which must be maas-managed obviously)
[19:09] pmatulis: awesome!! Thanks, trying it out now.
[19:10] vogelc, i'm really interested in your feedback on this. especially re the documentation
[19:12] pmatulis: For sure, I'll provide an update. I have to get the relay forwarder setup on the switch and then we should be good to go.
[20:15] OK that is weird, something about the node I am trying to enlist is causing it not to enlist :-/
[20:15] I have installed a single MAAS server (2.1.3+bzr5573-0ubuntu1) on ubuntu 16.04. I have setup ssh keys, dhcp and images are in sync for 16.04. When i boot up a PXE host, it receives a DHCP address and boots the image. It then fails on "Starting cloud-init" and ends at the login screen. From the server point of view i see the dhcp lease but no device was discovered. Is there a default login to the boot image? where should i
[20:21] palmertime: when it fails on starting cloud-init, what's the error message?
[20:22] catbus1: It scrolls by too fast in the console.
[20:23] I happened to encounter an enlist error yesterday and it also ends at a login screen.
I found that the maas-region-controller ip wasn't set correctly, cloud-init was reporting failing to contact 169.254.169.254.
[20:23] palmertime: could you run sudo dpkg-reconfigure maas-region-controller to check if ip is set correctly?
[20:27] /nick catbus1-afk
[20:27] catbus1: Thanks for the tip. I'm going to dig through the API and see what the current setting is
[20:33] it is weird, I can create a VM manually and it will enlist, use vagrant to do it and it won't :-/
[20:40] I am also getting inconsistent results installing the maas "package". sometimes it doesn't register the rack controller with the region controller
[20:45] "Select a valid choice. qpn7s6 is not one of the available choices." is what I get from time to time when I am setting up dhcp through the cli shortly after installing maas
[21:02] it makes no sense... doing apt-get install maas should not cause that error, should it? I am seeing it more frequently :-/
[21:11] Bug #1668759 opened: [2.2, trunk] Window width directive fails to remove event from window
[21:14] Hi guys
[21:15] "You need one small server for MAAS and at least one server which can be managed with a BMC. It is recommended to have the MAAS server provide DHCP and DNS on a network the managed machines are connected to."
[21:16] Does that mean that the "small" server for MAAS is not going to be used sharing its CPU/RAM/drives?
[21:20] oviliz, yes
[21:20] oviliz, maas needs a place to exist
[21:25] makes no sense :-/ I can't see a difference between the 2 VMs but there has to be one :-/
[21:45] ty @brendand
[21:55] How do i set the MAAS PXE/Provisioning network address from the maas command rather than dpkg-reconfigure?
[21:56] Bug #1668774 opened: intermittent SSL connection
[22:05] Bug #1668774 changed: intermittent SSL connection
[22:08] Bug #1668774 opened: intermittent SSL connection
[22:33] palmertime: /etc/maas/rackd.conf -> maas_url change that from localhost:5240/MAAS to /MAAS
[22:36] palmertime: or: sudo maas-rack config --region-url http://:5240/MAAS
[22:36] palmertime: then restart maas-rackd
[23:12] what would be a quick way via the cli/API to copy the same network configuration from one node to another (multiple interfaces/vlans), changing only the Ip addresses?
[23:19] roaksoax: Perfect, Thanks for the info
[23:49] roaksoax so I have narrowed down my enlisting problem a little bit - seems the way Vagrant creates VMs is to blame, trying to determine what it does / doesn't do differently from manually creating the VMs
[23:51] * Budgie^Smore is stormmore btw
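A rough sketch of the check behind the maas_url advice above: several enlistment failures in this log came down to a maas_url that nodes could not use (a wrong IP in regiond.conf, localhost in rackd.conf). The sketch assumes the config file is a flat `key: value` text file containing a `maas_url` entry, which is what the log suggests but is not verified here. `read_maas_url` and `maas_url_problem` are hypothetical helpers, not part of MAAS, and the address 10.0.0.5 in the usage lines is a made-up example.

```python
from typing import Optional
from urllib.parse import urlparse

# Hosts that only make sense on the MAAS server itself.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def read_maas_url(conf_text: str) -> Optional[str]:
    """Return the maas_url value from flat 'key: value' text, if present."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith("maas_url:"):
            return line.split(":", 1)[1].strip().strip("'\"")
    return None


def maas_url_problem(conf_text: str) -> Optional[str]:
    """Describe an obvious maas_url problem, or return None if none is seen."""
    url = read_maas_url(conf_text)
    if url is None:
        return "no maas_url entry found"
    if "://" not in url:
        # Values like 'localhost:5240/MAAS' need a scheme to parse cleanly.
        url = "http://" + url
    host = urlparse(url).hostname
    if host is None:
        return "maas_url has no host: " + url
    if host in LOOPBACK_HOSTS:
        return "maas_url points at " + host + ", which booting nodes cannot reach"
    return None


if __name__ == "__main__":
    print(maas_url_problem("maas_url: http://localhost:5240/MAAS"))
    # 10.0.0.5 is a made-up address for illustration only.
    print(maas_url_problem("maas_url: http://10.0.0.5:5240/MAAS"))
```

This only catches the obvious cases. An address that is syntactically fine but belongs to the wrong interface, as in the regiond.conf case earlier in the log, still has to be compared by hand against the network the nodes boot on.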