/srv/irclogs.ubuntu.com/2017/04/21/#maas.txt

xygnalmpontillo: yes.  same interface.00:09
xygnaldont think we have a rack conte00:10
xygnalrack controller in each L2.  we planned00:11
xygnalto have one region two racks. thats it.00:12
xygnaldo we need a rack controller00:13
xygnalughvirtlv00:13
xygnalok sorry, virtual keyboard hell for a minute.  I believe they had planned to NOT have a rack controller in every L2 and wanted to share them across multiple00:18
xygnalkeep in mind the clients in this particular project demanded subnets of /27 size, matching up to the size of the physical infrastructure block.00:19
xygnalwe would need a ton of rack controllers in order to facilitate that00:19
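For a sense of scale, the /27 arithmetic xygnal describes can be sketched as below. The 10.20.0.0/22 block is a made-up example and does not come from the log; the point is only how quickly the number of L2 segments (and so, potentially, rack controllers) grows when every subnet is a /27.

```python
import ipaddress

# Hypothetical infrastructure block; this address range is an assumption.
block = ipaddress.ip_network("10.20.0.0/22")

# Split the block into the /27 subnets the clients asked for.
subnets = list(block.subnets(new_prefix=27))

# A /27 holds 32 addresses; subtract network and broadcast for usable hosts.
usable_hosts = subnets[0].num_addresses - 2

print(f"{len(subnets)} subnets of /27, {usable_hosts} usable hosts each")
```

With one rack controller per L2, even this small example block would already need dozens of them, which is the concern raised above.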
mupBug #1685105 opened: maas gui/cli tags lenght mismatch <MAAS:New> <https://launchpad.net/bugs/1685105>07:10
mupBug #1685108 opened: maas gui settings for proxy credentials should be reworked <MAAS:New> <https://launchpad.net/bugs/1685108>07:19
mupBug #1685108 changed: maas gui settings for proxy credentials should be reworked <MAAS:New> <https://launchpad.net/bugs/1685108>07:34
mupBug #1685108 opened: maas gui settings for proxy credentials should be reworked <MAAS:New> <https://launchpad.net/bugs/1685108>07:40
=== frankban|afk is now known as frankban
fabianscGood Morning everyone. I am trying to commission a virtual machine which runs on a host, which has been set up by MAAS Version 2.1.5+bzr5596-0ubuntu1 (16.04.1). After booting the VM it receives an IP address from MAAS. Afterwards the VM is placed within the "NODES" tab in MAAS. From now on, the VM does not receive any IP address.07:51
fabianscI placed the logs in a bug which refers to the missing DHCP service from MAAS (https://bugs.launchpad.net/maas/+bug/1660743). Does someone have some time to check with me what else might be wrong?07:52
=== bdx_ is now known as bdx
ikoniawin 414:40
xygnalmpontillo: let me know if you have a few minutes today to help finish verifying if our architecture would work15:32
xygnalmpontillo: since we are trying to do multiple L2s untagged over the same network interface, I am worried it will never work like that.15:37
xygnalmpontillo: we do not want to use vlan tagging on the deployed systems, if possible, which is why the VLAN approach doesnt work well for us15:38
gemahi, I have a question about the behaviour of the OS installed on the server expected by MAAS15:53
kklimondahmm, today some of my nodes have stopped installing - after I've logged to one of them (injecting login/password) and re-run cloud-init --debug init manually I've noticed that cloud-config-url is pointing to http://[maas]/metadata/... instead of http://[maas]/MAAS/metadata/...15:54
kklimondaonly a subset of servers is affected15:54
gemais maas expecting the servers to be always trying to boot from network when rebooted or is it expecting the installer to change uefi to boot from disk subsequently?15:54
kklimondahow can I debug where they are getting a wrong url from?15:54
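One way to triage the affected subset is to check each node's cloud-config-url for the /MAAS/ path prefix kklimonda mentions. This is a minimal sketch: the hostname maas.example and the helper name has_maas_prefix are hypothetical, and how the URLs are collected from the nodes is left out.

```python
from urllib.parse import urlparse


def has_maas_prefix(url: str) -> bool:
    """Return True if the URL path starts with /MAAS/metadata."""
    return urlparse(url).path.startswith("/MAAS/metadata")


# Hypothetical URLs mirroring the good and bad forms seen on the nodes.
good = "http://maas.example/MAAS/metadata/"
bad = "http://maas.example/metadata/"

for url in (good, bad):
    print(url, "ok" if has_maas_prefix(url) else "missing /MAAS prefix")
```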
gema(trying to build an image that does the right thing)15:54
pmatulishey gema15:59
pmatulisgema, a node should always netboot15:59
mpontilloxygnal: can you file a bug on the original issue you hit with multiple fabrics?16:00
gemapmatulis: thanks!16:04
xygnalmpontillo: yes. I believe that is what Corey was initially trying to do but I was worried that separate fabrics on the same untagged physical network interface was not going to work16:05
xygnalmpontillo: i'm also trying to get a diagram to attach so you understand the networking layout16:06
xygnalmpontillo: new bug opened with clear details.  old one can be closed out to avoid confusion.16:33
maticueHi everyone! I have a DNS question, lamont are you there?16:34
maticueI'd like to know if it is possible to create an A record for a subnetwork not managed by MAAS. Is there any way to do it?16:35
mpontilloxygnal: thanks.16:39
xygnalmpontillo: still adding some more details.  You want the dhcpd.conf as well? anything else?16:40
mpontillomaticue: yeah, you can do that starting with MAAS 2.0. https://bugs.launchpad.net/maas/+bug/1590021 I think a fix landed to make it easier in MAAS 2.1 (not sure if it has been released in a point release yet) and MAAS 2.216:40
mupBug #1681467 changed: [2.2 beta5] not able to create new tags - Conflict error. Try your request again, as it will most likely succeed. <cdo-qa-blocker> <oil> <MAAS:Invalid> <https://launchpad.net/bugs/1681467>16:40
mupBug #1685306 opened: [2.2] Subnet changes fabrics during deployment <MAAS:New> <https://launchpad.net/bugs/1685306>16:40
mpontilloxygnal: let me read through the bug and give it some thought, I'll let you know16:41
mupBug #1685306 changed: [2.2] Subnet changes fabrics during deployment <MAAS:New> <https://launchpad.net/bugs/1685306>16:49
mupBug #1681467 opened: [2.2 beta5] not able to create new tags - Conflict error. Try your request again, as it will most likely succeed. <cdo-qa-blocker> <oil> <MAAS:Invalid> <https://launchpad.net/bugs/1681467>16:49
xygnalmpontillo: just posted the latest info, and changed the title to match more closely what is actually happening16:49
xygnalthe final comments get into the nitty gritty16:50
xygnalall of this in the bug today is from talking in unison with my team mates this morning16:50
mpontilloxygnal: great, thanks for the info. I'll try to triage it today.16:50
mupBug #1681467 changed: [2.2 beta5] not able to create new tags - Conflict error. Try your request again, as it will most likely succeed. <cdo-qa-blocker> <oil> <MAAS:Invalid> <https://launchpad.net/bugs/1681467>16:55
mupBug #1685306 opened: [2.2] MAAS-UI loses networking config after deployment <MAAS:New> <https://launchpad.net/bugs/1685306>16:55
xygnalmpontillo: thanks much!16:59
kukaczhi, I'm facing issues my node attempting to boot kernel /ubuntu/amd64/hwe-t/xenial/... which obviously does not work (Trusty x Xenial)17:05
kukaczany ideas what might be wrong? (using maas 2.1.5+bzr5596)17:06
=== frankban is now known as frankban|afk
maticuempontillo: thanks, I see those commands are used when you already declared a subnetwork on MAAS. Is it secure to declare a subnet on MAAS that will only be used to declare DNS A records?17:42
maticuempontillo: What I want to know is whether it is good practice or not to use the MAAS API to manage DNS records17:43
maticuempontillo: for other devices that are not managed by MAAS17:43
mpontillomaticue: there should be no problem with adding a subnet that isn't MAAS managed. you can either add them as devices (and specify the MAC) or just reserve an IP address and associate the A record with that. you do not have to reserve the IP address in MAAS 2.2; not sure if we fixed that in 2.1.x18:24
mpontillomaticue: here's an example that works with the latest MAAS 2.2 RC http://paste.ubuntu.com/24428403/18:29
mpontilloxygnal: how is each node's interface configured? MAAS displays "Observed" for an IP address when it gets a notification from the DHCP server that a lease has been issued for it. so, do I assume correctly that they are set to DHCP?18:35
mpontillomaticue: ah nevermind I see that it's set to auto-assign. hmm18:36
mpontillomaticue: did you say you're deploying CentOS? the CentOS image doesn't support advanced networking configuration, so I would try deploying with it set to DHCP. that said, it's very strange that it would revert to "unconfigured"18:37
maticuempontillo: thanks! I think you're talking with someone else, I'm asking about DNS (and I had a really good answer from you) but I wasn't talking about auto-assign or CentOS18:38
maticuempontillo: So, that DNS example is great!! my question is... is this a good practice (creating a subnet that isn't MAAS managed, for MAAS 2.1.x)?18:39
mpontilloxygnal: sorry, the messages above for maticue were intended for you. ;-) also, I saw that you said you were using MAAS 2.2 RC1; can you give MAAS 2.2 RC2 a try? this might have been fixed18:39
mpontillomaticue: right, I can't think of any potential issues with modeling subnets in MAAS which MAAS doesn't manage. I would feel free to do it that way. perhaps one day MAAS will manage them ;-)18:41
mpontillomaticue: you mentioned security issues; the only thing I can think of is that MAAS will generate a squid proxy configuration which allows access to the subnets which it knows about18:42
mpontillomaticue: of course, you can mitigate that with an iptables rule if it's an issue18:42
mpontillomaticue: and yes, MAAS is intended to manage your DNS zones, so please give it a try; feedback is welcome!18:42
mupBug #1685337 opened: [1.9.5] unable to install <cdo-qa> <MAAS:New> <https://launchpad.net/bugs/1685337>18:44
maticuempontillo: thanks! I really appreciate your answers18:44
mpontilloltrager: any ideas on kukacz's issue above? kukacz, you should be able to set the kernel when you go to deploy, and you should be able to specify the minimum kernel on the global settings page. I assume you're trying to deploy Xenial and for whatever reason it's choosing the Trusty kernel?18:47
mpontillokukacz: what mechanism are you using to deploy? (API? UI? Juju?)18:47
mupBug #1685337 changed: [1.9.5] unable to install <cdo-qa> <MAAS:New> <https://launchpad.net/bugs/1685337>18:53
=== med_ is now known as Guest76746
vasey_hey folks, i'm using MAAS 2.1, and none of the servers i'm using show up in the Nodes page; i see their IPMI addresses all in the Device Discovery page. when i try adding the machines manually, they time out during the actual PXE boot process, though the IPMI power cycling and boot order control works fine. what could be going wrong here?18:55
mupBug #1685337 opened: [1.9.5] unable to install <cdo-qa> <MAAS:New> <https://launchpad.net/bugs/1685337>18:56
xygnalmpontillo: auto-assign is only Ubuntu? does that mean CentOS can only be dynamic leases that may change?19:04
mpontilloxygnal: unfortunately, I believe the CentOS image just grabs a DHCP address, yeah.19:06
ltragerkukacz: Is there anything in the install log or in the machine events?19:07
mpontillovasey_: where in the PXE process does it time out?19:07
mpontillovasey_: is DHCP enabled on that VLAN in MAAS?19:08
ltragerkukacz: I was able to deploy Trusty with hwe-x using MAAS 2.219:08
xygnalmpontillo: that brings another question. If we wanted to convert these to static ourselves after build, lets say.. using a reserved range.  would there be a way to release the DHCP ips back so we didnt need a DHCP range as big as the reserved range?19:11
mpontilloxygnal: well, you should be able to do that. MAAS's default lease time is pretty short since it's intended to be used mainly for enlisting and commissioning19:11
vasey_mpontillo: DHCP is enabled on the 'unassigned' VLAN in MAAS, i'm re-running it to see the specific message i get when the PXE boot is attempted19:12
mpontilloxygnal: I do worry about corner cases when changing around ranges (especially where overlap occurs); in general it should be okay but file bugs or enhancement requests if you see any issues19:13
mpontillovasey_: with IPMI nodes you generally don't need to add them manually, you just go ahead and boot them. if DHCP is configured in MAAS then it should PXE from MAAS, set the IPMI password, and automatically enlist the node (with a random hostname)19:14
mpontillovasey_: should be more reliable that way, too, since you don't need to worry if you manually entered the MACs correctly, etc19:14
xygnalmpontillo: so if I switch a nodes IP after build, and that old lease expires, MAAS wont do anything? (so long as the deployed node does not ask for DHCP lease again)19:15
mpontilloxygnal: you should see the Observed IP disappear from MAAS when the lease is gone19:17
mpontilloxygnal: but that's it19:17
vasey_mpontillo: tried just rebooting the server into the correct PXE interface, I get a ">>Start PXE over IPv4. \n PXE-E18: Server response timeout." message19:20
mpontillovasey_: go to the Nodes tab, select the tab that shows your controllers, then click on it and check that the services all have green checkmarks and dhcpd is enabled19:22
mpontillovasey_: then check /var/lib/maas/dhcpd.conf and see if the configuration looks correct19:22
vasey_mpontillo: the services are all green checks except dhcpd6 and ntp (since it's region managed)19:25
vasey_mpontillo: though one weird thing in that conf file is under networks i'm seeing "shared-network vlan-5001", does 5001 happen to refer to the unassigned VLAN or is this amiss?19:25
mpontillovasey_: ok, that sounds good. is any DHCP relay involved? if you do "sudo maas-rack observe-dhcp <expected-incoming-dhcp-interface>" on the MAAS rack, do you see the DHCP packets from the node that is trying to PXE boot?19:26
mpontillovasey_: that is just an internal identifier in MAAS for that VLAN, you should see it on the URL bar as well if you browse to that VLAN in the UI19:26
vasey_mpontillo: i do see packets from that command output19:28
vasey_mpontillo: there's a source mac address, which appears correct, but source and destination IP are 0.0.0.0 and 255.255.255.255 respectively19:29
mpontillovasey_: sounds correct for a DHCP request packet19:29
vasey_mpontillo: the destination mac address is ff:ff:ff:ff:ff:ff19:29
mpontillovasey_: but I take it there is no reply? can you confirm that the DHCP server is listening on the same interface on the MAAS server? such as by looking at the output of 'ps auxw | grep dhcp | grep maas'19:30
mpontillovasey_: you might also look at 'cat /var/log/syslog | grep dhcpd' to see what's going on.19:30
vasey_mpontillo: correct, there's no reply, and the DHCP server does appear to be listening on the correct interface based on that ps command19:31
mpontillovasey_: ok, let's find out what the syslog says then... you should at least see the DHCPDISCOVER packets coming in19:31
vasey_mpontillo: now there's something interesting, i'm seeing a "no free leases" command19:31
vasey_error message, anyhow19:32
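The syslog check mpontillo suggests can also be scripted. The sketch below filters dhcpd lines for the "no free leases" message vasey_ reports; the two sample lines, including the MAC addresses, interface name, and subnet, are illustrative assumptions rather than output from this deployment.

```python
def no_free_lease_lines(lines):
    """Yield dhcpd log lines that report an exhausted address pool."""
    for line in lines:
        if "dhcpd" in line and "no free leases" in line:
            yield line.rstrip("\n")


# Illustrative syslog lines; the values are made up for this example.
sample = [
    "Apr 21 19:31:02 maas dhcpd[1234]: DHCPDISCOVER from 52:54:00:aa:bb:cc "
    "via ens3: network 10.0.0.0/24: no free leases\n",
    "Apr 21 19:31:05 maas dhcpd[1234]: DHCPOFFER on 10.0.0.50 to "
    "52:54:00:dd:ee:ff via ens3\n",
]

hits = list(no_free_lease_lines(sample))
print(f"{len(hits)} line(s) report an exhausted pool")
```

On a real rack controller the same function could be fed an open file handle for /var/log/syslog instead of the sample list.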
mpontillovasey_: so, are there no free leases? ;-) how large is the dynamic range on the related subnet in MAAS? Can you add additional dynamic ranges, or extend the existing one?19:36
mpontillovasey_: you can try running this to check the lease database: dhcp-lease-list --lease /var/lib/maas/dhcp/dhcpd.leases19:36
mpontillovasey_: that will only show active leases by default19:36
xygnalmpontillo: just one more question I think.  Is there any reason to believe that node which is using DHCP mode will be able to keep its IP address indefinitely?19:38
xygnalif we did not convert them later to static, that is19:38
xygnalie: left them as DHCP19:38
xygnalmpontillo: I was also just told that we've had a couple nodes build successfully with auto-assign, centos nodes, how is that possible?19:39
mpontilloxygnal: that would be up to the DHCP server but if the node stayed online with connectivity to the DHCP server then its IP address should not change19:40
xygnalmpontillo: if the node was off the network for a couple hours (lets see, network link down, needs repair)19:41
mpontilloxygnal: check the deployed CentOS nodes to see what their network config looks like; my guess is that it's set to DHCP19:41
xygnalguaranteed lost IP?19:41
xygnalmpontillo: it was my understanding that auto-assign is still DHCP, it just sets a static lease instead of a dynamic one.  Is that not true?19:41
mpontilloxygnal: generally DHCP clients will try to request the same IP address, but I wouldn19:41
mpontilloxygnal: wouldn't ever guarantee it. if it goes offline and the dynamic range is exhausted, the DHCP server could certainly give it away19:42
mpontilloxygnal: ah, you are correct that auto-assign will provide a static DHCP lease; we actually do that for static IP addresses too. sorry, I had forgotten about that19:42
xygnalmpontillo: does that change the status of requiring DHCP mode? can we safely use auto-assign then?19:43
mpontilloxygnal: so the real question in your case is: why is the IP address becoming unconfigured? if we can figure out what's doing that, we'll likely solve your problem. is it possible that the observed IP address and MAC address from the DHCP lease overwrite what has been configured on the node? that would be my guess19:43
xygnalmpontillo: and as to using RC2, we had it blow up when we tried to upgrade, so we haven't been able to try RC2 yet19:43
mpontilloxygnal: ok, maybe best to wait for RC3 then =)19:44
xygnalmpontillo: what information do you want me to gather for you, to prove whats causing it to go unconfigured?19:44
mpontilloxygnal: let me run a test to see if observed IPs from DHCP overwrite a configured automatic address.. if so, that could be an easy fix for RC319:44
xygnalok19:45
vasey_mpontillo: so there were no active leases in that list, but i expanded the DHCP range, and now PXE is "booting under MAAS direction" :)19:49
mpontillovasey_: hm interesting, if you pass in --last it should show the most recent for each MAC, I wonder why dhcpd thought the range was full19:51
mpontilloxygnal: can you look at /var/log/maas/regiond.log and /var/log/maas/rackd.log and let me know if you see any tracebacks? (search for "Traceback".)19:54
vasey_mpontillo: so i may have forgotten the last 's' in the leases file name, but now just the one system that has actually connected is showing up in the list, even without the --last option19:54
vasey_mpontillo: the only difference between using --last and not is that without it the hostname is listed as 'maas-enlist', while it's '-NA-' with it19:55
vasey_mpontillo: this is looking to be working now. that said, the nodes are still showing up in the Device Discovery tab, and not in the Nodes tab19:57
mpontillovasey_: ok. honestly I'm not sure how much I trust that tool; it's undocumented, but it can be useful in situations like this19:57
mpontillovasey_: it also supports --all if you want to see every lease for the MAC, not sure if that would change things19:58
mpontilloxygnal: also, can you grep your /var/log/maas/maas.log for "Allocated automatic IP address " - check that you see that for each node you attempt to deploy?20:25
xygnal151 tracebacks20:28
mpontilloxygnal: are they all the same? can you pastebin one for me?20:28
xygnalI dont see any allocated lines in maas.log20:29
mpontilloxygnal: ok, I see lines like this when I have an automatic IP assigned and then go to deploy. Apr 21 20:10:32 maas maas.interface: [info] Allocated automatic IP address 192.168.0.206 for eno1 (physical) on nuc3.20:30
mpontilloxygnal: so that much is at least a clue20:30
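To check that every deployed node got such a line, the message mpontillo quotes can be parsed out of maas.log. This is a rough sketch: the regular expression is inferred from that single quoted line, so other MAAS versions or interface types may format it differently, and the helper name parse_allocation is hypothetical.

```python
import re

# Pattern inferred from the one log line quoted in the conversation.
ALLOCATED = re.compile(
    r"Allocated automatic IP address (\S+) for (\S+) \(\w+\) on (\S+?)\.?$"
)


def parse_allocation(line: str):
    """Return (ip, interface, hostname), or None if the line does not match."""
    match = ALLOCATED.search(line.strip())
    return match.groups() if match else None


line = (
    "Apr 21 20:10:32 maas maas.interface: [info] Allocated automatic "
    "IP address 192.168.0.206 for eno1 (physical) on nuc3."
)
print(parse_allocation(line))
```

Running it over the region controller's maas.log and comparing the hostnames against the nodes that were deployed would show which nodes never had an automatic IP allocated.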
vasey_mpontillo: now i've got three hosts successfully PXE booted into ubuntu 16.04.2 LTS, but nothing has happened after that. they still appear in the device discovery tab, and the nodes page lists 0 machines, 0 devices, and just the 1 controller (this is correct)20:31
mpontillovasey_: after they PXE boot they are supposed to contact MAAS to tell MAAS the IPMI credentials for commissioning. if that isn't happening, check that the URL is correct.20:32
mpontillovasey_: on the rack controller, run "sudo dpkg-reconfigure -plow maas-rack-controller" and ensure the URL is set to an IP address that the commissioning nodes will be able to reach after they DHCP20:33
mpontilloxygnal: if you can provide the most recent traceback in the log, that would be great20:33
xygnalnone of them appear to be valid to the problem.  tracebacks for poweroffs sometimes, tracebacks related to image downloads and connection resets.20:36
xygnalnothing that stood out20:36
xygnalmost recent for region is connection reset by peer,20:36
vasey_mpontillo: ah, trying that now. how will the nodes be able to tell MAAS the IPMI creds, as in how do they know them in the first place?20:37
xygnalmost recent rackd.log traceback is for power20:37
mpontilloxygnal: yeah, I would be more interested in anything in the regiond.log20:38
xygnalmpontillo: also if i zgrep through all of the maas logs with -i for automatic20:38
xygnalno entries20:38
catbus1vasey_: First time machine pxe boots under maas direction, maas will load an ephemeral image on the machine ramdisk, and run ipmi command to create a BMC user 'maas' with a password randomly generated, before the machine shuts down, it will contact maas with that info.20:40
xygnalwait a second. i've been grabbing off the rack server, not the region controller.20:40
mpontilloxygnal: ok, so the interface must be modified before the node goes to assign the automatic IPs during deployment. can you do an experiment for me? commission a node, then observe that the interface information is correct post-commissioning. then wait two minutes and check again, and see if it has changed20:41
mpontilloxygnal: ah ok, yeah, region logs please =) the region is what contacts the database and will be responsible for IP assignment, etc20:41
xygnalyes, yes.  maas.log has automatic entries now too20:42
xygnal73 such entries in maas.log20:42
mpontilloxygnal: all right, if you can send any logs (maas.log and/or regiond.log) from that system or just look for relevant info you can share (and/or tracebacks) that would be good20:42
xygnalmpontillo: no tracebacks in regiond.log that are not about image checksum not matching.  no tracebacks at all in rackd.log, obviously.20:45
xygnalcan I match the timing of the Allocated messages to look for clues? is that when it assigns?20:46
vasey_mpontillo: when my hosts PXE boot, they all run into an error during the cloud-init job; 'did not find a data source'20:47
mpontilloxygnal: yeah, when you go to deploy a node, MAAS will allocate that IP address, and then rewrite the DHCP config so that it becomes a static lease. then you should see the IP address in /var/lib/maas/dhcpd.conf -- but it sounds like something is subsequently clearing it out, and I wonder why20:48
mpontillovasey_: yeah, that's why I suggested checking the MAAS URL; if it can't find the data source, the most likely problem is that it is trying to reach an unreachable IP address. for example, is a gateway needed to reach it? if so, have you configured one on the subnet details page?20:49
kukaczmpontillo: ltrager: back to the boot kernel issue - I am using UI for the deployment. and I had previously had that node commissioned (using Xenial image) and deployed Trusty on it20:50
kukaczmpontillo: ltrager: the issue occurred when I attempted to commission the node again or do some other operation - like release or rescue20:51
mpontillokukacz: when you go to deploy, can you select a different option for the kernel? it's the rightmost dropdown box20:51
mpontillokukacz: can you check the "Commissioning" section of the settings page, and maybe try re-saving it?20:52
vasey_mpontillo: i have a gateway configured, but the MAAS IP is in the same subnet as the DHCP range, so they should all be able to talk to each other. the jumpbox machine i'm using can ping the maas ip, the gateway, and everything maas assigned at PXE boot20:52
kukaczmpontillo: yes, I can. but the issue occurs in other than deploy actions20:52
kukaczmpontillo: I've tried many times to change and save the default commissioning image and kernel, with no effect20:53
mpontilloxygnal: so *immediately* after deploy the interfaces change to unconfigured. hm. that's unexpected. the key to solving this will be to figure out why that is happening20:53
xygnalmpontillo: My team was setting up build demos for something we are showing off next week, so its possible todays logs are not a good example.  I will verify that.20:53
kukaczmpontillo: I've only found that the Trusty commissioning image appears broken as it attempts to do "lsblk -x" during commissioning, while that "-x" parameter was added in Trusty20:54
xygnalmpontillo: yes, it happens right away.20:54
kukaczsorry - Trusty=Xenial. basically that parameter was not present in Trusty age20:54
ltragerkukacz: can you commission with Xenial?21:00
kukaczltrager: yes, I can. but I have to delete the node and recreate it first21:01
kukaczltrager: seems like the boot image + kernel configuration is fixed somewhere until I delete the node (machine) completely21:02
kukaczltrager: at least I have found that workaround with deleting the machine first - I did not know it when I've originally asked for help here21:05
ltragerkukacz: I think I see whats happening. Commissioning with Trusty is broken but it fails in a weird way. The system turns off but the status is still 'commissioning'21:11
ltragerkukacz: So what I had to do was abort the commission operation, change commissioning to use Xenial, then recommission21:11
ltragerkukacz: can you use Xenial for commissioning or do you have a specific requirement for Trusty?21:12
kukaczltrager: I can safely commission with Xenial and then deploy Trusty, which is the release I need21:12
kukaczltrager: then if I need some change on that node (recommission, rescue, release ...), I need to delete it first and commission again with Xenial21:14
ltragerkukacz: you shouldn't have to delete it. But a node can't be in a deployed state to commission.21:15
mpontilloaw, vasey_ left. I was just going to point him to https://gist.github.com/mpontillo/6ee4c96d8aed4d0efde66a37aa6d5af9 << my script to test fetching the URLs provided by the enlistment configuration21:19
kukaczuntil I delete it, I seem to be stuck forever with system attempting to boot "hwe-t/xenial"21:19
kukaczltrager: regarding that deployed state - I usually start with "mark broken" and do the other actions then. all end up in that unbootable state, not just the commissioning one21:23
kukaczltrager: also I've found package "vlan" is missing in the Trusty deployment - prevents VLAN tagged interfaces from correctly starting with upstart21:30
mupBug #1685361 opened: MAAS unable to commission with Trusty <MAAS:Triaged> <https://launchpad.net/bugs/1685361>21:32
mupBug #1685361 changed: MAAS unable to commission with Trusty <MAAS:Triaged> <https://launchpad.net/bugs/1685361>21:38
mupBug #1685361 opened: MAAS unable to commission with Trusty <MAAS:Triaged> <https://launchpad.net/bugs/1685361>21:44