[00:00] <pmatulis> please take notes :) we need to capture everything in the docs
[00:18] <mimizone> all right, not completely over my long story with curtin... :)
[00:19] <mimizone> the stuff specific to the node is taken properly, but the late_commands etc.... are not executed.
[00:19] <mimizone> if I stuff them in the curtin_userdata, they are executed.
[00:21] <mimizone> I suspect it is overridden or something since the userdata redefines the late_commands
[00:21] <mimizone> ?
[00:28] <mimizone> last reboot/pxe for today.... enough of that thing already :)
[00:39] <mimizone> see you tomorrow gents.
[00:56] <pmatulis> see you mimizone
[00:59] <Budgie^Smore> OK I wish we were on IPv6 already!!! I keep running into transition problems!
[02:17] <stormmore> This is really weird, I have installed maas a dozen times today and randomly it stops configuring the maas rack controller!
[09:29] <cnf> aaaand morning
[09:33] <cnf> how do I tell maas to not scan / ignore certain subnets?
[09:34] <brendand> cnf, you mean for device discovery?
[09:34] <cnf> yes
[09:34] <brendand> cnf, and i guess you can't just delete the subnet?
[09:34] <cnf> i don't know, can I? it was auto discovered
[09:35] <brendand> cnf, yeah, of course
[09:35] <cnf> k, did that
[09:35] <brendand> cnf, it auto discovers everything as a 'convenience'
[09:36] <cnf> ok
[09:36] <brendand> the assumption is that most people will care about all visible subnets
[09:36] <cnf> well, i don't care about the public subnet
[09:36] <cnf> it's just in it to have internet access
[09:37] <brendand> cnf, fyi you can disable per subnet too - uncheck 'Active mapping' in the subnet page
[09:38] <cnf> yeah, it added those subnets again
[09:38] <cnf> i deleted them, maas just added them back
[09:38] <brendand> interesting
[09:38] <brendand> roaksoax, mpontillo , intentional ^
[09:38] <brendand> ?
[09:42] <cnf> also, IPMI doesn't seem to work on HP ilo 4
[09:43] <cnf> it calls the binary with the wrong arguments
[09:44] <cnf> hmm
[09:45] <brendand> cnf, hp ilo is a separate power management type
[09:45] <cnf> yes
[09:45] <cnf> and maas is calling ipmitool with the wrong arguments
[09:46] <cnf> Failed to execute ('/usr/bin/ipmitool', '-I', 'lanplus', '-H', 'x.x.x.x', '-U', 'Administrator', '-P', '********', '94:18:82:03:AD:2E', 'power', 'status') for cartridge 94:18:82:03:AD:2E at x.x.x.x: Invalid command: 94:18:82:03:AD:2E
[09:47] <cnf> if i remove the mac from the command, it works fine
[09:47] <brendand> cnf, actually, what kind of servers are these?
[09:47] <cnf> HP DL380
[09:47] <cnf> gen 9
[09:47] <brendand> cnf, i think the ilo entries are for moonshot
[09:47] <brendand> cnf, which is not what you want
[09:48] <cnf> so these are just not supported?
[09:48] <brendand> cnf, i think you should be using just IPMI
[09:48] <brendand> cnf, were the power params auto-detected or did you enter them yourself?
[09:49] <cnf> entered them myself, maas so far can't see my server
[09:49] <cnf> that's an entirely different mess
[09:49] <cnf> k, you were right, using just ipmi works
[09:49] <cnf> thanks
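The working invocation cnf describes (the failing command with the cartridge MAC dropped) can be sketched in Python. This only illustrates the argument list taken from the error message above. The function name `ipmi_power_status_cmd` and the placeholder host and credentials are made up for this example, and it is not MAAS's own power driver code.

```python
def ipmi_power_status_cmd(host, user, password):
    """Build the argv for a plain IPMI power-status query.

    Hypothetical helper: it mirrors the command from the error message,
    minus the cartridge MAC that ipmitool rejected as an invalid command.
    """
    return [
        "/usr/bin/ipmitool",
        "-I", "lanplus",   # IPMI v2.0 LAN interface, as in the failing command
        "-H", host,        # BMC / iLO address
        "-U", user,
        "-P", password,
        "power", "status",
    ]


if __name__ == "__main__":
    # Placeholder values; substitute the real BMC address and credentials.
    print(" ".join(ipmi_power_status_cmd("x.x.x.x", "Administrator", "********")))
```

The list could be handed to `subprocess.run` on a host that has ipmitool installed.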
[09:50] <cnf> (i'm trying to understand and fix several things about MaaS at once...)
[09:50] <brendand> cnf, so are your machines pxe booting?
[09:50] <cnf> no
[09:51] <brendand> cnf, as in they aren't trying, or maas doesn't detect them when they do?
[09:51] <cnf> they are trying
[09:51] <cnf> but i have some problems getting it to work
[09:52] <brendand> cnf, and is dhcp enabled on the subnet they are on?
[09:52] <cnf> first, they are on a LAG
[09:52] <brendand> cnf, ok
[09:52] <cnf> figuring out how to PXE on a lag has been a chore
[09:52] <cnf> now, i am seeing dhcp requests, but the machines are not responding to whatever the maas dhcpd is sending
[09:52] <cnf> that's where i left yesterday
[09:53] <cnf> 10:51:59.769027 14:02:ec:8a:83:85 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 389: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 14:02:ec:8a:83:85, length 347
[09:53] <cnf> 10:52:00.770859 00:50:56:94:e8:37 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 342: 172.20.20.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
[09:53] <cnf> basically
[09:53] <cnf> it's the right source mac
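To compare the request and reply MACs from capture lines like the two cnf pasted, a small parser along these lines might help. It is a rough sketch that assumes the exact line layout shown above (timestamp, source MAC, destination MAC, then a BOOTP/DHCP marker). The names `LINE_RE` and `parse_dhcp_line` are invented for this example.

```python
import re

# Matches the link-level header layout seen in the pasted capture lines.
LINE_RE = re.compile(
    r"^(?P<time>\S+) (?P<src>[0-9a-f:]{17}) > (?P<dst>[0-9a-f:]{17}), .*"
    r"BOOTP/DHCP, (?P<kind>Request|Reply)"
)


def parse_dhcp_line(line):
    """Return time, src, dst and kind for a BOOTP/DHCP capture line, else None."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    request = (
        "10:51:59.769027 14:02:ec:8a:83:85 > ff:ff:ff:ff:ff:ff, ethertype IPv4 "
        "(0x0800), length 389: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, "
        "Request from 14:02:ec:8a:83:85, length 347"
    )
    print(parse_dhcp_line(request))
```

Running the same parser on a capture taken at the switch or on the booting machine would show whether the broadcast reply actually arrives there.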
[09:56] <brendand> cnf, is http://www.brocade.com/content/html/en/configuration-guide/fastiron-08030b-l2guide/GUID-DC254740-F2BF-4279-820A-794AFBB86999.html in any way relevant?
[09:56] <brendand> "You can configure the member port of a dynamic LAG to be logically operational even when the dynamic LAG is not operating. This enables PXE boot support on this port."
[09:56] <brendand> probably not the same switch, but maybe the principle applies?
[09:56] <cnf> yeah, we looked at that
[09:56] <cnf> it's a juniper QFX3500
[09:56] <cnf> and on juniper it's also called force-up
[09:57] <cnf> atm, we disabled LACP, and are just using etherchannel
[09:57] <cnf> the requests are coming in, so it should work
[10:00] <cnf> brendand: http://www.juniper.net/techpubs/en_US/junos10.3/topics/reference/configuration-statement/force-up-edit-interfaces-ex-series.html is the juniper ref, btw
[10:00] <brendand> cnf, so you're running dhcpdump on the maas server when booting the machine?
[10:00] <cnf> yep
[10:01] <cnf> and the MAAS server is a VM on a vsphere cluster, if that matters
[10:04] <brendand> cnf, and you're sure dhcp is enabled on the correct vlan?
[10:04] <cnf> brendand: well, it's replying to the requests :P
[10:05] <cnf> also, for the MAAS vm, it's just an interface, it doesn't see the vlans
[10:06] <brendand> cnf, ok i misunderstood. so it's responding but the BOOTPREPLY packets don't reach the machine
[10:07] <brendand> then we can probably rule out maas as the problem
[10:07] <brendand> cnf, what's happening on the switch itself
[10:07] <brendand> ?
[10:24] <cnf> sorry, was AFK changing some switch things
[10:24] <cnf> added a copper port, so I can try with a laptop
[10:24] <cnf> brb
[10:25] <brendand> cnf, that's alright, i was afk making breakfast ;)
[10:30] <cnf> :P
[10:57] <cnf> ok, the server on copper worked
[10:58] <brendand> cnf, good to know
[10:58] <cnf> ok, but it doesn't auto detect servers it boots?
[10:59] <brendand> cnf, maas should auto-detect any system that boots under its direction
[10:59] <cnf> i can't find it, atm
[10:59] <brendand> you should see a message like 'booting under maas direction'
[11:00] <brendand> on the machine's console
[11:00] <cnf> yes, it did that
[11:00] <brendand> cnf, is the machine still on and doing things?
[11:00] <cnf> but i can't find in maas
[11:00] <cnf> it sits at the login prompt
[11:01] <brendand> cnf, that didn't work properly then
[11:01] <cnf> hmz
[11:01] <brendand> cnf, is there anything in /var/log/maas/rsyslog?
[11:01] <cnf> that is an empty directory
[11:02] <cnf> i have /var/log/maas/maas.log
[11:02] <cnf> i see it there
[11:03] <cnf> so it discovers everything i don't care about, but not what I do care about :P
[11:05] <cnf> it should show up on the dashboard, right?
[11:05] <cnf> with the rest of the discovered devices?
[11:06] <cnf> so when I go to the subnet, the ip the metal booted with is marked as "observed"
[11:06] <cnf> but that's the only reference i can see to it
[11:07] <cnf> hmz, am I doing something wrong?
[11:07] <brendand> cnf, i'd have to see the serial console to know for sure
[11:08] <cnf> serial of what?
[11:08] <cnf> do I have to add my metal manually, maybe?
[11:09] <brendand> cnf, the booting machine?
[11:09] <cnf> oh
[11:09] <cnf> i don't have access to that, atm
[11:09] <brendand> cnf, you can use serial over lan
[11:10] <cnf> yes, i have no access to that, atm
[11:10] <cnf> recycled machine, someone changed the ilo password
[11:10] <cnf> someone that isn't in today
[11:10] <brendand> cnf, i see
[11:11] <brendand> cnf, you need to watch it closely and see what happens after booting under maas direction
[11:11] <brendand> cnf, it should transfer a grub/efi config and initrd
[11:12] <brendand> cnf, if you're desperate you could video it?
[11:13] <cnf> ok, it says "no datasource found"
[11:13] <cnf> (also, i'm going to smack the person that changed the ilo pass)
[11:14] <cnf> and then some python3 errors
[11:14] <brendand> sounds like cloud-init
[11:14] <brendand> any mention of cloud-init there?
[11:15] <brendand> cnf, details on the stacktrace would be good too
[11:16] <cnf> https://www.dropbox.com/s/bo40mqgdfqow3lv/2017-02-28%2012.12.48.jpg?dl=0
[11:16] <cnf> i feel like a caveman, posting pictures of screens :P
[11:16] <brendand> likely bad things to come!
[11:17] <brendand> you're telling me
[11:17] <cnf> the new machines i have iLo console on
[11:17] <cnf> this one i am using to get it on copper
[11:17] <cnf> anyway
[11:18] <cnf> what is this datasource i can't find?
[11:22] <cnf> hmz
[11:22] <brendand> cnf, is /var/log/cloud-init-output.log populated?
[11:23] <cnf> no such file or directory
[11:23] <brendand> cnf, and does /etc/maas/preseeds/enlist exist?
[11:23] <cnf> on the maas server
[11:23] <cnf> that last one does
[11:23] <cnf> but /var/log/cloud-init-output.log does not exist
[11:23] <cnf> i also don't have /etc/cloud, at all
[11:24] <brendand> cnf, i guess it can't access the metadata
[11:24] <brendand> cat /etc/maas/regiond.conf?
[11:27] <cnf> oh, it's stupid
[11:27] <cnf> it sees itself with the wrong ip
[11:27]  * cnf facepalms
[11:27] <cnf> what do i need to restart after i edit that file?
[11:28] <brendand> cnf, systemctl restart maas-regiond
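A quick sanity check for the kind of mistake cnf found (the server advertising itself under an address booting machines cannot use) might look like the following. It is a sketch under assumptions: the function name `maas_url_problem` is made up, it only inspects the URL string, and it does not test whether a non-loopback address is actually routable from the nodes.

```python
import ipaddress
from urllib.parse import urlparse


def maas_url_problem(maas_url):
    """Return a short description of an obviously unusable maas_url, or None.

    Hypothetical helper: flags hosts that a PXE-booted machine could never
    reach (localhost, loopback, link-local). Hostnames are not resolved.
    """
    host = urlparse(maas_url).hostname
    if host is None:
        return "no host in URL"
    if host == "localhost":
        return "localhost is only valid on the MAAS server itself"
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return None  # a DNS name; resolution is not checked in this sketch
    if address.is_loopback:
        return "loopback address"
    if address.is_link_local:
        return "link-local address"
    return None


if __name__ == "__main__":
    print(maas_url_problem("http://localhost:5240/MAAS"))
```

After correcting the value in /etc/maas/regiond.conf, the restart brendand mentions is still needed for it to take effect.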
[11:28] <cnf> also, how can I tell maas to totally ignore certain subnets, already :(
[11:28] <brendand> cnf, best you can do is tell it not to scan it looks like
[11:28] <brendand> i think the rediscovering thing is a bug
[11:31] <cnf> hmm
[11:34] <cnf> ok, now it came up, it seems
[11:42] <cnf> and i can ssh to it
[11:42] <cnf> right, that's a first step
[11:45] <cnf> ok! it doesn't see my storage, though
[11:45] <cnf> interesting
[11:48] <cnf> ok, so without LAG, things seem to be working now
[11:48] <cnf> after lunch, i'll need to get the LAG case working
[14:04] <cnf> ok!
[14:04] <cnf> so, one machine working, i think
[14:06] <brendand> cnf, cool
[14:06] <cnf> that's on a single copper link
[14:06] <cnf> i can't get the ones on the 10G LAG's working
[14:06] <cnf> i have no idea why
[14:06] <brendand> cnf, it's common i think to require a separate interface for pxe
[14:07] <cnf> yeah, that's a bit of an issue
[14:07] <cnf> our network is set up for 2 x 10G fiber LAGs
[14:35] <cnf> hmm
[14:35] <cnf> so we have metal that can _only_ boot from the 10G cards
[14:36] <cnf> no one here with experience PXE booting on LAG interfaces?
[14:47] <roaksoax> cnf: i don't, but i do know others have done it successfully on 10g cards
[14:47] <roaksoax> cnf: so what's the particular issue in your case ?
[14:49] <cnf> roaksoax: if i take down the LACP, and set up a pure etherchannel, i see dhcp requests coming in, but the server isn't responding to the replies
[14:51] <roaksoax> cnf: you mean, you see dhcp requests in the maas server, but the maas server isn't responding, or is it the other way around ?
[14:51] <cnf> i see requests on the maas server, and i see responses on the maas server
[14:51] <cnf> but my metal isn't reacting at all
[14:57] <cnf> roaksoax: also, is there a way to have maas ignore certain ranges in discovery?
[14:58] <cnf> it's sat in a few mixed ranges (iLo, and public) that it should just ignore, but i get a huge list of discovered devices in those ranges
[15:06] <roaksoax> cnf: i wonder if this is something the switch is filtering out ?
[15:07] <roaksoax> cnf: like stp enabled ?
[15:07] <roaksoax> cnf: and no the discovery will happen on all subnets, you can turn off discovery altogether though
[15:08] <cnf> hmm, that sounds weird, should have an option to turn it off per subnet (or per interface)
[15:13] <cnf> afaik stp isn't enabled on the qfabrics
[15:13] <mup> Bug #1668650 opened: I have both hwe-16.04 and hwe-x minimum kernel options and they mean different things apparently <oil> <MAAS:New> <https://launchpad.net/bugs/1668650>
[16:08] <cnf> any way to stop maas from trying to load a firmware in an endless loop?
[16:09] <cnf> http://pastebin.com/qygUiHYT i keep seeing that over and over again, 2 or 3 times / second
[16:34] <cnf> hmm, and It seems the DL380p's can't boot from the pci broadcom NIC
[16:34] <cnf> :(
[16:59] <cnf> k, firmware upgrade helped :P
[17:07] <mup> Bug #1550081 changed: [2.0] No error message is displayed when failed to add a domain <error-surface> <MAAS:Fix Released> <MAAS 2.0:Won't Fix> <MAAS trunk:Fix Released> <https://launchpad.net/bugs/1550081>
[17:20] <cnf> ok, time to go home
[17:39] <mup> Bug #1668703 opened: Use external NTP servers only option has no effect <MAAS:New> <https://launchpad.net/bugs/1668703>
[18:04] <mimizone> hi, is the API call "GET /MAAS/metadata/latest/by-id/agqa6n/?op=get_preseed" deprecated? it still works and is being used by MAAS but I don't see any documentation on it.
[18:05] <mimizone> I'd like to know what are the other op values I can use to retrieve the other metadata files in curtin
[18:05] <mimizone> specifically the curtin_userdata file
[18:07] <vogelc> Anyone here have expierence adding subnets and configuring DHCP?  I can't add a subnet and enable DHCP if I dont have a host with an interface on that subnet.
[18:21] <pmatulis> mimizone, hi. what version of maas are you using?
[18:21] <mimizone> 2.1
[18:22] <mimizone> I've tried op=get_curtin_userdata but the curl returns an error and HTTP 400
[18:23] <pmatulis> mimizone, i don't see any mention of 'preseed' in any api version
[18:23] <mimizone> pmatulis: I've seen that too... but it is used during the deployment process I can assure you. it's how the node retrieves the curtin files.
[18:24] <mimizone> you can even see the log in the maas log file
[18:24] <pmatulis> mimizone, hm, maybe it's meant for internal processes only
[18:25] <pmatulis> vogelc, that makes sense IMO. how can a machine offer dhcp leases for a subnet it is not connected to? i suppose you can try using a dhcp relay
[18:25] <mimizone> I just want to use it for debugging actually. to understand how all that curtin stuff works
[18:25] <pmatulis> ohh
[18:26] <mimizone> the cloud-init must make a call of some kind to retrieve the curtin_userdata file, I can't figure this one out. I haven't read the entire source code of maas :)
[18:28] <mimizone> where is the code that runs in the node during the deployment by the way? I've looked only at the maasserver code so far
[18:31] <pmatulis> there are scripts for enlistment, commissioning, deployment. i'm not sure of their location
[18:32] <brendand> pmatulis, mimizone - /etc/maas/preseeds
[18:33] <brendand> pmatulis, don't think that's quite what mimizone is looking for though
[18:33] <mimizone> brendand: I mean the logic that gets those preseed files
[18:34] <mimizone> those files are rendered/sent by the maasserver/preseed.py but I can't find what is making the call to the server.
[18:34] <brendand> mimizone, can you check /etc/maas/regiond.conf?
[18:34] <brendand> mimizone, is maas_url something the node can reach?
[18:35] <mimizone> not much there. the maas_url, database information
[18:38] <mimizone> in my experience, I see that the node deployment uses curtin, but the commands in my configuration files are not used, then on the second boot, cloud-init seems to retrieve the curtin_userdata file, and then it triggers the curtin commands in that file. I assume that logic is somewhere.
[18:42] <mimizone> brendand: yes, no issue reaching the maas server. I am just interested in understanding the curtin/metadata stuff because I have to customize the deployment.
[18:42] <mimizone> I also use this process to fix a current bug in maas when there are multiple static routes.
[18:48] <vogelc> pamtulis, thats right we are planning on pointing the dhcp forwarders to the MAAS rack controllers.
[18:49] <vogelc> pamtulis, really all we need to do is define the subnets in DHCP.  Can we do it manually until a possible patch is in place?
[18:52] <vogelc> pmatulis: sorry for spelling your id incorrectly.
[18:58] <pmatulis> vogelc, the current devel version of maas has dhcp relay integration in the sense that maas will send the appropriate config to the active dhcp server (providing it is maas-managed)
[18:58] <pmatulis> it does not provide the relay however
[19:01] <vogelc> pmatulis:  So I did upgrade to 2.2 but it still will not let me add DHCP to a VLAN because there are no controllers with that subnet defined on an interface.  I was hoping what you mentioned would be our fix.
[19:03] <pmatulis> vogelc, did you choose "Relay DHCP" for that VLAN?
[19:06] <vogelc> pmatulis: totally missed that option.  so if my NIC is on 10.1.1.0, it will relay it to 10.2.2.0 vlan/subnet?
[19:07] <pmatulis> https://docs.ubuntu.com/maas/devel/en/installconfig-network-dhcp#dhcp-relay
[19:09] <pmatulis> maas does not actually do the relaying. like i said above, it will send the dhcp config to the active dhcp server
[19:09] <pmatulis> (which must be maas-managed obviously)
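The situation vogelc describes, a controller NIC on one subnet while DHCP is wanted on another, can be illustrated with Python's standard `ipaddress` module. This is a sketch and not MAAS logic: the function name `directly_connected`, the /24 prefix lengths and the `.5` host address are assumptions added to the 10.1.1.0 and 10.2.2.0 networks mentioned above.

```python
import ipaddress


def directly_connected(interface_cidrs, subnet_cidr):
    """True if any controller interface address falls inside the subnet.

    Hypothetical helper: when this is False, DHCP for that subnet would
    have to reach the DHCP server through a relay (for example on a switch).
    """
    subnet = ipaddress.ip_network(subnet_cidr)
    return any(
        ipaddress.ip_interface(cidr).ip in subnet for cidr in interface_cidrs
    )


if __name__ == "__main__":
    controller_interfaces = ["10.1.1.5/24"]  # assumed address and prefix
    for subnet in ("10.1.1.0/24", "10.2.2.0/24"):
        print(subnet, directly_connected(controller_interfaces, subnet))
```

As pmatulis notes, MAAS only supplies the DHCP configuration for the relayed VLAN; the relay itself still has to be provided elsewhere.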
[19:09] <vogelc> pmatulis:  awesome!!  Thanks,  trying it out now.
[19:10] <pmatulis> vogelc, i'm really interested in your feedback on this. especially re the documentation
[19:12] <vogelc> pmatulis:  For sure, I'll provide an update.  I have to get the relay forwarder setup on the switch and then we should be good to go.
[20:15] <stormmore> OK that is weird, something about the node I am trying to enlist is causing it not to enlist :-/
[20:15] <palmertime> I have installed a single MAAS server (2.1.3+bzr5573-0ubuntu1) on ubuntu 16.04. I have set up ssh keys and dhcp, and images are in sync for 16.04.  When i boot up a PXE host, it receives a DHCP address and boots the image.  It then fails on "Starting cloud-init" and ends at the login screen.  From the server point of view i see the dhcp lease but no device was discovered.  Is there a default login to the boot image?  where should i
[20:21] <catbus1> palmertime: when it fails on starting cloud-init, what's the error message?
[20:22] <palmertime> catbus1: It scrolls by too fast in the console.
[20:23] <catbus1> I happened to encounter an enlist error yesterday and it also ends at a login screen. I found that the maas-region-controller ip wasn't set correctly, cloud-init was reporting failing  to contact 169.254.169.254.
[20:23] <catbus1> palmertime: could you run sudo dpkg-reconfigure maas-region-controller to check if ip is set correctly?
[20:27] <catbus1>  /nick catbus1-afk
[20:27] <palmertime> catbus1: Thanks for the tip.  I'm going to dig through the API and see what the current setting is
[20:33] <stormmore> it is weird, I can create a VM manually and it will enlist, use vagrant to do it and it won't :-/
[20:40] <stormmore> I am also getting inconsistent results installing the maas "package". sometimes it doesn't register the rack controller with the region controller
[20:45] <stormmore> "Select a valid choice. qpn7s6 is not one of the available choices." is what I get from time to time when I am setting up dhcp through the cli shortly after installing maas
[21:02] <stormmore> it makes no sense... doing apt-get install maas should not cause that error, should it? I am seeing it more frequently :-/
[21:11] <mup> Bug #1668759 opened: [2.2, trunk] Window width directive fails to remove event from window <MAAS:Triaged by ricgard> <https://launchpad.net/bugs/1668759>
[21:14] <oviliz> Hi guys
[21:15] <oviliz> "You need one small server for MAAS and at least one server which can be managed with a BMC. It is recommended to have the MAAS server provide DHCP and DNS on a network the managed machines are connected to."
[21:16] <oviliz> Does that mean that the "small" server for MAAS is not going to be used sharing its CPU/RAM/drives?
[21:20] <brendand> oviliz, yes
[21:20] <brendand> oviliz, maas needs a place to exist
[21:25] <stormmore> makes no sense :-/ I can't see a difference between the 2 VMs but there has to be one :-/
[21:45] <oviliz> ty @brendand
[21:55] <palmertime> How do i set the MAAS PXE/Provisioning network address from the maas command rather than dpkg-reconfigure?
[21:56] <mup> Bug #1668774 opened: intermittent SSL connection <MAAS:Triaged> <https://launchpad.net/bugs/1668774>
[22:05] <mup> Bug #1668774 changed: intermittent SSL connection <MAAS:Triaged> <https://launchpad.net/bugs/1668774>
[22:08] <mup> Bug #1668774 opened: intermittent SSL connection <MAAS:Triaged> <https://launchpad.net/bugs/1668774>
[22:33] <roaksoax> palmertime: /etc/maas/rackd.conf -> maas_url change that from localhost:5240/MAAS to <ip:5240>/MAAS
[22:36] <roaksoax> palmertime: or: sudo maas-rack config --region-url http://<ip>:5240/MAAS
[22:36] <roaksoax> palmertime: then restart maas-rackd
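The manual edit roaksoax describes could be scripted roughly as follows. This sketch assumes rackd.conf holds a single `maas_url: ...` line in simple `key: value` form. The function name `set_maas_url` is made up, and it works on the file's text only, leaving reading, writing and the maas-rackd restart to the operator.

```python
import re


def set_maas_url(conf_text, region_ip, port=5240):
    """Return conf_text with its maas_url line pointing at region_ip.

    Hypothetical helper; assumes one 'maas_url: ...' line per file.
    """
    new_line = f"maas_url: http://{region_ip}:{port}/MAAS"
    # (?m) makes ^ and $ match at line boundaries inside the file text.
    updated, count = re.subn(r"(?m)^maas_url:.*$", lambda _m: new_line, conf_text)
    if count == 0:
        raise ValueError("no maas_url line found")
    return updated


if __name__ == "__main__":
    sample = "maas_url: http://localhost:5240/MAAS\n"
    # 10.0.0.2 is a placeholder for the region controller's real address.
    print(set_maas_url(sample, "10.0.0.2"), end="")
```

The `maas-rack config --region-url` command above does the same job without touching the file by hand.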
[23:12] <mimizone> what would be a quick way via the cli/API to copy the same network configuration from one node to another (multiple interfaces/vlans), changing only the Ip addresses?
[23:19] <palmertime> roaksoax: Perfect, Thanks for the info
[23:49] <Budgie^Smore> roaksoax so I have narrowed down my enlisting problem a little bit - seems the way Vagrant creates VMs is to blame, trying to determine what it does / doesn't do differently from manually creating the VMs
[23:51]  * Budgie^Smore is stormmore btw