[14:25] <hkominos> narinder
[14:25] <narinder> hkominos, hi
[14:27] <hkominos> Hi. There is a rumour that you demoed maas at the opnfv plugfest
[14:27] <hkominos> so I assume you have a deep understanding of Maas?
[14:28] <narinder> hkominos, not a rumour but i did talk about kubernetes and MAAS.
[14:28] <narinder> hkominos, what exactly are you looking for?
[14:30] <hkominos> I wanted to ask you about a bug (?) that I see in maas. Here in opnfv-armband at least. It all started with tftp being veeeeeeery slow. After considerable looking around I found that the tftp library which maas leverages is unable to understand the properties of the underlying link and forces blocksize to be small. I remember you had a similar issue with 9000 MTU ??
[14:31] <hkominos> If you set 9000 MTU, maas does not serve images anymore, but alav told me that you might have a hack in place to tell maas to use the proper block size?
[14:31] <narinder> for the pxe network we use 1500 MTU most of the time.
[14:33] <hkominos> we have the same. However the library forces blksize to be 1000. (if I force the blksize to be 1464 then it works perfectly and faster)
[14:34] <hkominos> Should i just open a bug ?
[14:40] <narinder> hkominos, please open it against maas and maas team can have a look.
[14:41] <hkominos> ok thx!
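The block-size arithmetic behind the exchange above can be sketched as follows. This is a minimal illustration and not MAAS or tftp-library code: it assumes IPv4 with no IP options (20-byte header), an 8-byte UDP header, and the 4-byte TFTP DATA header (opcode plus block number). The function name `max_tftp_blksize` is hypothetical; the 8..65464 bounds are the ones the TFTP blocksize option allows.

```python
IPV4_HEADER = 20       # bytes, assuming no IP options
UDP_HEADER = 8         # bytes
TFTP_DATA_HEADER = 4   # 2-byte opcode + 2-byte block number

MIN_BLKSIZE = 8        # lower bound of the TFTP blksize option
MAX_BLKSIZE = 65464    # upper bound of the TFTP blksize option


def max_tftp_blksize(mtu: int) -> int:
    """Largest blksize whose DATA packet fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER - UDP_HEADER - TFTP_DATA_HEADER
    if payload < MIN_BLKSIZE:
        raise ValueError(f"MTU {mtu} is too small for TFTP")
    return min(payload, MAX_BLKSIZE)


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(mtu, max_tftp_blksize(mtu))
```

Under these assumptions a 1500 MTU link leaves room for a blksize of up to 1468, so the 1464 value mentioned above fits in a single packet, while a blksize of 1000 leaves roughly a third of each packet unused.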
[17:55] <Karunamon> 'morning folks - I've got a fresh install of 2.3.0 on 16.04 that is refusing to let me enable DHCP on the one single vlan that was configured out of the box
[17:57] <Karunamon> specifically, I get a "The IPRange could not be created because the data didn't validate."
[18:02] <Karunamon> no idea where to even begin troubleshooting because this is literally step 6 on the install page
[18:05] <catbus> Karunamon: could it be the ip range for dhcp is not reachable via that interface?
[18:06] <catbus> Karunamon: what's the MAAS ip on the interface where you will provide DHCP and what's the IP range you configured for DHCP?
[18:08] <Karunamon> catbus: so the box has a single interface - 10.0.0.225, the DHCP range is only two addresses now, 10.0.0.246 through 10.0.0.247.
[18:08] <Karunamon> as configured the vlan is 10.0.0.0/24
[18:11] <Karunamon> I didn't actually add that vlan, it was automatically created after install, presumably from the IP settings on the machine prior to installing the 'maas' package
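The addresses described above can be sanity-checked with Python's standard `ipaddress` module. This is only a sketch of the address arithmetic, not MAAS's own validation logic; the helper name `range_fits_subnet` and its `reserved` parameter are hypothetical.

```python
import ipaddress


def range_fits_subnet(subnet, start, end, reserved=()):
    """Return True if start..end is a usable range inside subnet.

    The range must be ordered, lie inside the subnet, avoid the network
    and broadcast addresses, and not contain any address in `reserved`
    (for example the MAAS server's own IP).
    """
    net = ipaddress.ip_network(subnet)
    lo = ipaddress.ip_address(start)
    hi = ipaddress.ip_address(end)
    if lo > hi:
        return False
    if lo not in net or hi not in net:
        return False
    if lo == net.network_address or hi == net.broadcast_address:
        return False
    return not any(lo <= ipaddress.ip_address(r) <= hi for r in reserved)


if __name__ == "__main__":
    print(range_fits_subnet("10.0.0.0/24", "10.0.0.246", "10.0.0.247",
                            reserved=["10.0.0.225"]))
```

By this check the two-address range sits inside 10.0.0.0/24 and does not collide with the server at 10.0.0.225, which suggests the validation error is not caused by the address values themselves.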
[19:28] <roaksoax> mpontillo: "The IPRange could not be created because the data didn't validate."
[19:30] <roaksoax> Karunamon: can you file a bug with all the information about your networks, and what ranges you are trying to create, etc
[19:31] <mpontillo> Karunamon: are you creating the IP range in the UI or the API? how does the IP range you're trying to create correlate to any subnets on that VLAN?
[19:46] <dd__> Hi all, My network discovery is only finding 3 bare metal machines that already have OS's on them in the VLAN. When I PXE boot from Idrac, MaaS isn't discovering the machines. Any ideas on things to check?
[19:46] <Karunamon> roaksoax: I think one may already exist, but it's been dormant for months. https://bugs.launchpad.net/maas/+bug/1569960
[19:46] <Karunamon> in the meantime I'm evaluating auto deployment tools and the poor error messaging is... discouraging
[19:48] <Karunamon> roaksoax: The range was already created when I logged into the system for the first time. This error appears when I hit "provide DHCP" on the vlan
[19:52] <Karunamon> dd__: Do you have active discovery turned on in the vlan?
[19:52] <Karunamon> *subnet, rather
[19:53] <mpontillo> Karunamon: right, I triaged that but couldn't get the issue to occur; it wasn't clear what the cause of the problem was. if you can tell us how to get that to happen consistently, that would be great
[19:54] <mpontillo> dd__: what happens on the system console when it PXE boots?
[19:56] <dd__> The virtual console on idrac just says no media detected.
[19:57] <dd__> Or do you mean on the maas console?
[19:58] <Karunamon> mpontillo: I can give a tl;dr - it's a bog standard 16.04 install with ssh and basic ubuntu server installed, IPs as configured above, maas installed with 'apt install maas'
[19:58] <mpontillo> dd__: I was trying to confirm that the machine actually PXE booted from MAAS first. "no media detected" implies that it either isn't set to boot from the network, or MAAS isn't providing DHCP on the network and thus can't serve up the parameters for the PXE request
[19:59] <dd__> MaaS says it's providing dhcp.
[19:59] <Karunamon> the only thing I can think of is that it's getting upset that the maas server (rack and region controller same box) is living inside the vlan it's managing
[20:00] <mpontillo> Karunamon: the only unusual thing about that is probably the very small range - can you try making the DHCP range larger?
[20:01] <mpontillo> Karunamon: I'll check the code to see how we handle that
[20:01] <Karunamon> mpontillo: tried that. When I fill out the "provide DHCP" form, it suggests a range of 10.0.0.136 through 10.0.0.199
[20:02] <Karunamon> I get the same behavior even if I just accept the default
[20:03] <mpontillo> Karunamon: btw, it's fine (required even, since DHCP is UDP-based) for MAAS to have an IP address on the managed VLAN
[20:04] <Karunamon> mpontillo: heh, just spitballing
[20:04] <Karunamon> I tried creating a dynamic range manually under the subnet, that causes a "There is no room for any dynamic ranges on this subnet."
[20:06] <dd__> mpontillo: idrac says booting from PXE Device 1: Integrated NIC 1 Port 1 Partition 1 PXE: No media detected. Maas on the Controller summary says DHCPD is running.
[20:09] <mpontillo> dd__: can you confirm that the MAAS server is actually receiving the DHCP requests and replying?
[20:16] <dd__> mpontillo: how would I confirm that?
[20:16] <Karunamon> mpontillo: I think I worked past it. Reserved everything but the range I wanted used (so two regular reservations), and reserved the remainder as a dynamic range
[20:16] <Karunamon> then I was allowed to enable DHCP for the vlan
[20:18] <mpontillo> Karunamon: strange, I wouldn't expect that would have an effect. when you "provide DHCP" in the UI it should be equivalent to adding the IP range; seems there may be a bug in the "provide dhcp" path then
[20:18] <mpontillo> Karunamon: glad you got it working thouh
[20:18] <mpontillo> *though
[20:18] <mpontillo> dd__: you could install something like `dhcpdump` and run `sudo dhcpdump -i <interface>` on the MAAS rack controller, then any DHCP requests and replies would be printed on the console
[20:18] <Karunamon> mpontillo: The trick was putting the subnet into managed rather than unmanaged
[20:19] <Karunamon> I toggled it around a few times and that seems to be the magic
[20:20] <mpontillo> dd__: when I do that, I can see the DHCP request from the machine I'm booting, and then I can see the reply from MAAS that includes "FNAME: pxelinux.0", showing that MAAS has given the client the option of doing a PXE boot
[20:20] <mpontillo> Karunamon: ah. should have known. that part of MAAS can be a bit confusing; I actually have a blog post that explains a bit about the history of it here http://spectrum42.com/posts/ip-ranges-in-maas/
[20:22] <mpontillo> Karunamon: sounds like we need clearer error messages about attempts to enable DHCP on unmanaged subnets (unmanaged means we don't control DHCP, so I'm guessing that's why it was rejected)
[20:27] <Karunamon> mpontillo: Ah! Unmanaged means it doesn't even try to assign addresses to a dynamic range you create yourself?
[20:27] <Karunamon> (say, you've got an external DHCP server, you're basically telling maas to look for new nodes within that range?)
[20:29] <mpontillo> Karunamon: it's more for dual-homed MAAS environments; let's say your MAAS has two interfaces; it manages DHCP on a private network set up for MAAS, but it also connects to your office LAN with some other DHCP server. and maybe you've got a couple of ranges on that network that MAAS can allocate from. You configure those ranges that MAAS can manage as reserved, that way if you deploy a dual-homed machine on both networks, it can use what your IT
[20:29] <mpontillo> department (or whatever) has allocated for you on the network where you /don't/ manage DHCP.
[20:29] <roaksoax> Karunamon: unmanaged means "interface doesn't get configured", DHCP means "the interface gets configured for /dhcp/"
[20:29] <mpontillo> Karunamon: that way you don't make your IT department angry at you for running a DHCP server on their network ;-)
[20:29] <mpontillo> but you can still assign IPs
[20:36] <dd__> mpontillo: Thanks for that. It actually was getting DHCP requests, but in settings I hit enable active discovery and then told to scan every 10 mins. Passive discovery just wasn't doing it.
[20:37] <mpontillo> dd__: passive discovery is ARP based, so you won't see devices show up there unless they get an IP address from DHCP and start asking where things are
[20:37] <mpontillo> dd__: but to be clear, not being able to PXE boot is completely separate from device discovery
[20:37] <mpontillo> dd__: if a machine isn't enlisting into MAAS, that is generally an issue with DHCP enablement in MAAS on the VLAN you want to boot from
[20:38] <mpontillo> (that is, if it won't even PXE boot.)
[20:39] <mpontillo> dd__: if you don't control DHCP on the subnet, you'd have to have a network administrator configure it to PXE boot from MAAS, or enable DHCP on an alternate VLAN and relay it to MAAS. but MAAS works smoother if it controls DHCP itself
[20:40] <dd__> mpontillo: The dhcp is disabled on this vlan.
[20:41] <mpontillo> dd__: all right; if you are able to enable it, you should see PXE requests start to work - when you PXE boot the machine it should auto-enlist and add an IPMI password to allow MAAS to power-control the machine
[20:41] <dd__> mpontillo: I should say that our network's IP manager (infoblox) is not managing dhcp for that vlan.
[20:42] <mpontillo> dd__: good. so if there are no other DHCP servers on that network, I'd go ahead and try enabling DHCP in MAAS for that VLAN
[20:43] <dd__> mpontillo: DHCP is enabled for that maas vlan
[20:44] <mpontillo> dd__: you said you saw DHCP /requests/ in dhcpdump when you attempted the PXE boot but you didn't mention any /replies/; are you sure the settings are correct?
[20:45] <mpontillo> dd__: I would take a look at /var/lib/maas/dhcpd.conf on the rack controller and see if it looks sane
[20:51] <dd__> mpontillo: Nothing jumps out at me in /var/lib/maas/dhcpd.conf. I haven't looked at any dhcp snippets yet though. On the dhcpdump, I got a bunch of bootprequests and bootreplies over and over.
[20:53] <mpontillo> dd__: ok, so it sounds like MAAS is attempting to reply to the PXE request but the server is ignoring it? it could help if you pastebin the dhcpdump.
[20:54] <mpontillo> dd__: I would check the BIOS settings to make sure the system is set up to allow and prefer network boot
[20:55] <mpontillo> dd__: also check to be sure there is no firewall on your network that could be preventing the DHCP replies from reaching the server, or could be preventing the PXE boot (TFTP-based) from happening
[21:12] <dd__> mpontillo: https://pastebin.com/hHCnFJ97
[21:13] <dd__> mpontillo: Since maas is on the same vlan as the machines it's provisioning, there wouldn't be any firewall.
[22:03] <Karunamon> Great news - DHCP works :D
[22:03] <Karunamon> bad news: my machine picks up the initrd from maas and immediately reboots
[22:05] <mpontillo> dd__: yeah, just checking since some switches block DHCP replies unless they are from known authorized DHCP servers - like an L2 firewall
[22:07] <ltrager2> Karunamon: can you get the console output?
[22:07] <roaksoax> Karunamon: the initrd should launch an ephemeral image
[22:08] <roaksoax> Karunamon: so maybe it is just doing that?
[22:10] <mpontillo> Karunamon: the expected behavior for the first time it PXE boots is for it to load the kernel/initrd, then run a script to register itself with MAAS, then turn itself off
[22:13] <mpontillo> dd__: from the pastebin it kind of looks like the offer from MAAS is sent but never acted upon; the DHCPDISCOVER request is repeated a few seconds later, after the original DHCPOFFER was sent. I'd check for network problems between MAAS and the BMC; maybe it can send traffic but not receive it, or maybe it's a network filtering issue like what I mentioned before
[22:14] <mpontillo> dd__: it looks like MAAS offers up the possibility of an IP address and a PXE boot path four times, but the offer is never accepted
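The pattern described above (offers sent, never accepted, discover repeated) can be sketched as a small check over a sequence of DHCP message types. This is an illustration only: the input is a hypothetical list of message-type names, not the actual dhcpdump output format or the contents of the pastebin, and `offers_without_request` is a made-up helper name.

```python
def offers_without_request(message_types):
    """Count DHCPOFFERs that were never followed by a DHCPREQUEST.

    An offer counts as ignored if the client sends another DHCPDISCOVER
    (or the capture ends) before any DHCPREQUEST is seen.
    """
    pending = 0   # offers sent since the last discover/request
    ignored = 0   # offers the client evidently never acted on
    for msg in message_types:
        if msg == "DHCPOFFER":
            pending += 1
        elif msg == "DHCPREQUEST":
            pending = 0            # client accepted an offer
        elif msg == "DHCPDISCOVER":
            ignored += pending     # client is asking again
            pending = 0
    return ignored + pending


if __name__ == "__main__":
    # Illustrative sequence: four discover/offer pairs with no request.
    stuck = ["DHCPDISCOVER", "DHCPOFFER"] * 4
    healthy = ["DHCPDISCOVER", "DHCPOFFER", "DHCPREQUEST", "DHCPACK"]
    print(offers_without_request(stuck), offers_without_request(healthy))
```

A non-zero count on a capture taken at the MAAS rack controller points at the return path (for example filtering on a switch) rather than at the DHCP server configuration, since the server is demonstrably answering.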
[22:14] <mpontillo> Karunamon: btw, if it succeeds, you'll see the server in the machines listing in MAAS with a random name
[23:23] <dd__> mpontillo: It looks like something IS blocking it at the switch level. Thanks for your help today.
[23:24] <mpontillo> dd__: happy to be of service =)