/srv/irclogs.ubuntu.com/2017/02/28/#maas.txt

[00:00] <pmatulis> please take notes :) we need to capture everything in the docs
[00:18] <mimizone> all right, not completely over my long story with curtin... :)
[00:19] <mimizone> the stuff specific to the node is taken properly, but the late_commands etc.... are not executed.
[00:19] <mimizone> if I stuff them in the curtin_userdata, they are executed.
[00:21] <mimizone> I suspect it is overridden or something since the userdata redefines the late_commands
[00:21] <mimizone> ?
[00:28] <mimizone> last reboot/pxe for today.... enough of that thing already :)
[00:39] <mimizone> see you tomorrow gents.
[00:56] <pmatulis> see you mimizone
[00:59] <Budgie^Smore> OK I wish we were on IPv6 already!!! I keep running into transition problems!
[02:17] <stormmore> This is really weird, I have installed maas a dozen times today and randomly it stops configuring the maas rack controller!
[09:29] <cnf> aaaand morning
[09:33] <cnf> how do I tell maas to not scan / ignore certain subnets?
[09:34] <brendand> cnf, you mean for device discovery?
[09:34] <cnf> yes
[09:34] <brendand> cnf, and i guess you can't just delete the subnet?
[09:34] <cnf> i don't know, can I? it was auto discovered
[09:35] <brendand> cnf, yeah, of course
[09:35] <cnf> k, did that
[09:35] <brendand> cnf, it auto discovers everything as a 'convenience'
[09:36] <cnf> ok
[09:36] <brendand> the assumption is that most people will care about all visible subnets
[09:36] <cnf> well, i don't care about the public subnet
[09:36] <cnf> it's just in it to have internet access
[09:37] <brendand> cnf, fyi you can disable per subnet too - uncheck 'Active mapping' in the subnet page
[09:38] <cnf> yeah, it added those subnets again
[09:38] <cnf> i deleted them, maas just added them back
[09:38] <brendand> interesting
[09:38] <brendand> roaksoax, mpontillo, intentional ^
[09:38] <brendand> ?
[09:42] <cnf> also, IPMI doesn't seem to work on HP ilo 4
[09:43] <cnf> it calls the binary with the wrong arguments
[09:44] <cnf> hmm
[09:45] <brendand> cnf, hp ilo is a separate power management type
[09:45] <cnf> yes
[09:45] <cnf> and maas is calling ipmitool with the wrong arguments
[09:46] <cnf> Failed to execute ('/usr/bin/ipmitool', '-I', 'lanplus', '-H', 'x.x.x.x', '-U', 'Administrator', '-P', '********', '94:18:82:03:AD:2E', 'power', 'status') for cartridge 94:18:82:03:AD:2E at x.x.x.x: Invalid command: 94:18:82:03:AD:2E
[09:47] <cnf> if i remove the mac from the command, it works fine
[09:47] <brendand> cnf, actually, what kind of servers are these?
[09:47] <cnf> HP DL380
[09:47] <cnf> gen 9
[09:47] <brendand> cnf, i think the ilo entries are for moonshot
[09:47] <brendand> cnf, which is not what you want
[09:48] <cnf> so these are just not supported?
[09:48] <brendand> cnf, i think you should be using just IPMI
[09:48] <brendand> cnf, were the power params auto-detected or did you enter them yourself?
[09:49] <cnf> entered them myself, maas so far can't see my server
[09:49] <cnf> that's an entirely different mess
[09:49] <cnf> k, you were right, using just ipmi works
[09:49] <cnf> thanks
[09:50] <cnf> (i'm trying to understand and fix several things about MaaS at once...)
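The failing call quoted by cnf differs from a working one only by the extra MAC ("cartridge") argument that the Moonshot power type adds. A minimal sketch of that difference, assuming nothing about MAAS internals: `drop_mac_arguments` is a hypothetical helper written for illustration, and the fix actually applied in the channel was to select the plain IPMI power type.

```python
import re

# Matches a colon-separated MAC address such as 94:18:82:03:AD:2E.
MAC_RE = re.compile(r"^(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$")


def drop_mac_arguments(argv):
    """Return argv without any argument that looks like a MAC address.

    Hypothetical helper: it only illustrates which argument ipmitool
    rejected; it is not how MAAS builds its power commands.
    """
    return [arg for arg in argv if not MAC_RE.match(arg)]


# The argument list from the error message in the log.
failing = [
    "/usr/bin/ipmitool", "-I", "lanplus", "-H", "x.x.x.x",
    "-U", "Administrator", "-P", "********",
    "94:18:82:03:AD:2E", "power", "status",
]

working = drop_mac_arguments(failing)
print(" ".join(working))
```

The resulting list is the command cnf reported as working once the MAC was removed by hand.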
[09:50] <brendand> cnf, so are your machines pxe booting?
[09:50] <cnf> no
[09:51] <brendand> cnf, as in they aren't trying, or maas doesn't detect them when they do?
[09:51] <cnf> they are trying
[09:51] <cnf> but i have some problems getting it to work
[09:52] <brendand> cnf, and is dhcp enabled on the subnet they are on?
[09:52] <cnf> first, they are on a LAG
[09:52] <brendand> cnf, ok
[09:52] <cnf> figuring out how to PXE on a lag has been a chore
[09:52] <cnf> now, i am seeing dhcp requests, but the machines are not responding to whatever the maas dhcpd is sending
[09:52] <cnf> that's where i left yesterday
[09:53] <cnf> 10:51:59.769027 14:02:ec:8a:83:85 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 389: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 14:02:ec:8a:83:85, length 347
[09:53] <cnf> 10:52:00.770859 00:50:56:94:e8:37 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 342: 172.20.20.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
[09:53] <cnf> basically
[09:53] <cnf> it's the right source mac
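The two capture lines pasted above can be pulled apart mechanically to confirm what cnf describes: a request from the booting NIC, then a reply from 172.20.20.1 sent to the Ethernet broadcast address. A minimal sketch, assuming the capture lines always have the shape shown in the log; `parse_dhcp_line` and the regular expression are illustrative and not part of any MAAS or tcpdump tooling.

```python
import re

# Pattern for the link-level capture lines pasted in the channel.
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<src_mac>[0-9a-f:]{17}) > (?P<dst_mac>[0-9a-f:]{17}), "
    r".*?: (?P<src>\S+) > (?P<dst>\S+): BOOTP/DHCP, (?P<kind>Request|Reply)"
)


def split_endpoint(endpoint):
    """Split 'a.b.c.d.port' into ('a.b.c.d', port)."""
    ip, port = endpoint.rsplit(".", 1)
    return ip, int(port)


def parse_dhcp_line(line):
    """Return a dict describing one capture line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    src_ip, src_port = split_endpoint(m.group("src"))
    dst_ip, dst_port = split_endpoint(m.group("dst"))
    return {
        "time": m.group("time"),
        "kind": m.group("kind"),
        "src_mac": m.group("src_mac"),
        "dst_mac": m.group("dst_mac"),
        "src_ip": src_ip,
        "src_port": src_port,
        "dst_ip": dst_ip,
        "dst_port": dst_port,
    }


# The two lines from the log, verbatim.
CAPTURE = [
    "10:51:59.769027 14:02:ec:8a:83:85 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), "
    "length 389: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, "
    "Request from 14:02:ec:8a:83:85, length 347",
    "10:52:00.770859 00:50:56:94:e8:37 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), "
    "length 342: 172.20.20.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300",
]

parsed = [parse_dhcp_line(line) for line in CAPTURE]
for packet in parsed:
    print(packet["time"], packet["kind"], packet["src_mac"], "->", packet["dst_mac"])
```

Both frames are visible on the MAAS side of the link, which is why the later discussion turns to the switch and the LAG rather than to the DHCP server.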
[09:56] <brendand> cnf, is http://www.brocade.com/content/html/en/configuration-guide/fastiron-08030b-l2guide/GUID-DC254740-F2BF-4279-820A-794AFBB86999.html in any way relevant?
[09:56] <brendand> "You can configure the member port of a dynamic LAG to be logically operational even when the dynamic LAG is not operating. This enables PXE boot support on this port."
[09:56] <brendand> probably not the same switch, but maybe the principle applies?
[09:56] <cnf> yeah, we looked at that
[09:56] <cnf> it's a juniper QFX3500
[09:56] <cnf> and on juniper it's also called force-up
[09:57] <cnf> atm, we disabled LACP, and are just using etherchannel
[09:57] <cnf> the requests are coming in, so it should work
[10:00] <cnf> brendand: http://www.juniper.net/techpubs/en_US/junos10.3/topics/reference/configuration-statement/force-up-edit-interfaces-ex-series.html is the juniper ref, btw
[10:00] <brendand> cnf, so you're running dhcpdump on the maas server when booting the machine?
[10:00] <cnf> yep
[10:01] <cnf> and the MAAS server is a VM on a vsphere cluster, if that matters
[10:04] <brendand> cnf, and you're sure dhcp is enabled on the correct vlan?
[10:04] <cnf> brendand: well, it's replying to the requests :P
[10:05] <cnf> also, for the MAAS vm, it's just an interface, it doesn't see the vlans
[10:06] <brendand> cnf, ok i misunderstood. so it's responding but the BOOTPREPLY packets don't reach the machine
[10:07] <brendand> then we can probably rule out maas as the problem
[10:07] <brendand> cnf, what's happening on the switch itself
[10:07] <brendand> ?
[10:24] <cnf> sorry, was AFK changing some switch things
[10:24] <cnf> added a copper port, so I can try with a laptop
[10:24] <cnf> brb
[10:25] <brendand> cnf, that's alright, i was afk making breakfast ;)
[10:30] <cnf> :P
[10:57] <cnf> ok, the server on copper worked
[10:58] <brendand> cnf, good to know
[10:58] <cnf> ok, but it doesn't auto detect servers it boots?
[10:59] <brendand> cnf, maas should auto-detect any system that boots under its direction
[10:59] <cnf> i can't find it, atm
[10:59] <brendand> you should see a message like 'booting under maas direction'
[11:00] <brendand> on the machine's console
[11:00] <cnf> yes, it did that
[11:00] <brendand> cnf, is the machine still on and doing things?
[11:00] <cnf> but i can't find it in maas
[11:00] <cnf> it sits at the login prompt
[11:01] <brendand> cnf, that didn't work properly then
[11:01] <cnf> hmz
[11:01] <brendand> cnf, is there anything in /var/log/maas/rsyslog?
[11:01] <cnf> that is an empty directory
[11:02] <cnf> i have /var/log/maas/maas.log
[11:02] <cnf> i see it there
[11:03] <cnf> so it discovers everything i don't care about, but not what I do care about :P
[11:05] <cnf> it should show up on the dashboard, right?
[11:05] <cnf> with the rest of the discovered devices?
[11:06] <cnf> so when I go to the subnet, the ip the metal booted with is marked as "observed"
[11:06] <cnf> but that's the only reference i can see to it
[11:07] <cnf> hmz, am I doing something wrong?
[11:07] <brendand> cnf, i'd have to see the serial console to know for sure
[11:08] <cnf> serial of what?
[11:08] <cnf> do I have to add my metal manually, maybe?
[11:09] <brendand> cnf, the booting machine?
[11:09] <cnf> oh
[11:09] <cnf> i don't have access to that, atm
[11:09] <brendand> cnf, you can use serial over lan
[11:10] <cnf> yes, i have no access to that, atm
[11:10] <cnf> recycled machine, someone changed the ilo password
[11:10] <cnf> someone that isn't in today
[11:10] <brendand> cnf, i see
[11:11] <brendand> cnf, you need to watch it closely and see what happens after booting under maas direction
[11:11] <brendand> cnf, it should transfer a grub/efi config and initrd
[11:12] <brendand> cnf, if you're desperate you could video it?
[11:13] <cnf> ok, it says "no datasource found"
[11:13] <cnf> (also, i'm going to smack the person that changed the ilo pass)
[11:14] <cnf> and then some python3 errors
[11:14] <brendand> sounds like cloud-init
[11:14] <brendand> any mention of cloud-init there?
[11:15] <brendand> cnf, details on the stacktrace would be good too
[11:16] <cnf> https://www.dropbox.com/s/bo40mqgdfqow3lv/2017-02-28%2012.12.48.jpg?dl=0
[11:16] <cnf> i feel like a caveman, posting pictures of screens :P
[11:16] <brendand> likely bad things to come!
[11:17] <brendand> you're telling me
[11:17] <cnf> the new machines i have iLo console on
[11:17] <cnf> this one i am using to get it on copper
[11:17] <cnf> anyway
[11:18] <cnf> what is this datasource i can't find?
[11:22] <cnf> hmz
[11:22] <brendand> cnf, is /var/log/cloud-init-output.log populated?
[11:23] <cnf> no such file or directory
[11:23] <brendand> cnf, and does /etc/maas/preseeds/enlist exist?
[11:23] <cnf> on the maas server
[11:23] <cnf> that last one does
[11:23] <cnf> but /var/log/cloud-init-output.log does not exist
[11:23] <cnf> i also don't have /etc/cloud, at all
[11:24] <brendand> cnf, i guess it can't access the metadata
[11:24] <brendand> cat /etc/maas/regiond.conf?
[11:27] <cnf> oh, it's stupid
[11:27] <cnf> it sees itself with the wrong ip
[11:27] * cnf facepalms
[11:27] <cnf> what do i need to restart after i edit that file?
[11:28] <brendand> cnf, systemctl restart maas-regiond
[11:28] <cnf> also, how can I tell maas to totally ignore certain subnets, already :(
[11:28] <brendand> cnf, best you can do is tell it not to scan, it looks like
[11:28] <brendand> i think the rediscovering thing is a bug
[11:31] <cnf> hmm
[11:34] <cnf> ok, now it came up, it seems
[11:42] <cnf> and i can ssh to it
[11:42] <cnf> right, that's a first step
[11:45] <cnf> ok! it doesn't see my storage, though
[11:45] <cnf> interesting
[11:48] <cnf> ok, so without LAG, things seem to be working now
[11:48] <cnf> after lunch, i'll need to get the LAG case working
[14:04] <cnf> ok!
[14:04] <cnf> so, one machine working, i think
[14:06] <brendand> cnf, cool
[14:06] <cnf> that's on a single copper link
[14:06] <cnf> i can't get the ones on the 10G LAGs working
[14:06] <cnf> i have no idea why
[14:06] <brendand> cnf, it's common i think to require a separate interface for pxe
[14:07] <cnf> yeah, that's a bit of an issue
[14:07] <cnf> our network is set up for 2 x 10G fiber LAGs
[14:35] <cnf> hmm
[14:35] <cnf> so we have metal that can _only_ boot from the 10G cards
[14:36] <cnf> no one here has experience with PXE booting on LAG interfaces?
[14:47] <roaksoax> cnf: i don't, but i do know others have done it successfully on 10g cards
[14:47] <roaksoax> cnf: so what's the particular issue in your case ?
[14:49] <cnf> roaksoax: if i take down the LACP, and set up a pure etherchannel, i see dhcp requests coming in, but the server isn't responding to the replies
[14:51] <roaksoax> cnf: you mean, you see dhcp requests in the maas server, but the maas server isn't responding, or is it the other way around ?
[14:51] <cnf> i see requests on the maas server, and i see responses on the maas server
[14:51] <cnf> but my metal isn't reacting at all
[14:57] <cnf> roaksoax: also, is there a way to have maas ignore certain ranges in discovery?
[14:58] <cnf> it's sat in a few mixed ranges (iLo, and public) that it should just ignore, but i get a huge list of discovered devices in those ranges
[15:06] <roaksoax> cnf: i wonder if this is something the switch is filtering out ?
[15:07] <roaksoax> cnf: like stp enabled ?
[15:07] <roaksoax> cnf: and no, the discovery will happen on all subnets, you can turn off discovery altogether though
[15:08] <cnf> hmm, that sounds weird, should have an option to turn it off per subnet (or per interface)
[15:13] <cnf> afaik stp isn't enabled on the qfabrics
[15:13] <mup> Bug #1668650 opened: I have both hwe-16.04 and hwe-x minimum kernel options and they mean different things apparently <oil> <MAAS:New> <https://launchpad.net/bugs/1668650>
[16:08] <cnf> any way to stop maas from trying to load a firmware in an endless loop?
[16:09] <cnf> http://pastebin.com/qygUiHYT i keep seeing that over and over again, 2 or 3 times / second
[16:34] <cnf> hmm, and it seems the DL380p's can't boot from the pci broadcom NIC
[16:34] <cnf> :(
[16:59] <cnf> k, firmware upgrade helped :P
[17:07] <mup> Bug #1550081 changed: [2.0] No error message is displayed when failed to add a domain <error-surface> <MAAS:Fix Released> <MAAS 2.0:Won't Fix> <MAAS trunk:Fix Released> <https://launchpad.net/bugs/1550081>
[17:20] <cnf> ok, time to go home
[17:39] <mup> Bug #1668703 opened: Use external NTP servers only option has no effect <MAAS:New> <https://launchpad.net/bugs/1668703>
[18:04] <mimizone> hi, is the API call "GET /MAAS/metadata/latest/by-id/agqa6n/?op=get_preseed" deprecated? it still works and is being used by MAAS but I don't see any documentation on it.
[18:05] <mimizone> I'd like to know what are the other op values I can use to retrieve the other metadata files in curtin
[18:05] <mimizone> specifically the curtin_userdata file
[18:07] <vogelc> Anyone here have experience adding subnets and configuring DHCP? I can't add a subnet and enable DHCP if I don't have a host with an interface on that subnet.
[18:21] <pmatulis> mimizone, hi. what version of maas are you using?
[18:21] <mimizone> 2.1
[18:22] <mimizone> I've tried op=get_curtin_userdata but the curl returns an error and HTTP 400
[18:23] <pmatulis> mimizone, i don't see any mention of 'preseed' in any api version
[18:23] <mimizone> pmatulis: I've seen that too... but it is used during the deployment process I can assure you. it's how the node retrieves the curtin files.
[18:24] <mimizone> you can even see the log in the maas log file
[18:24] <pmatulis> mimizone, hm, maybe it's meant for internal processes only
[18:25] <pmatulis> vogelc, that makes sense IMO. how can a machine offer dhcp leases for a subnet it is not connected to? i suppose you can try using a dhcp relay
[18:25] <mimizone> I just want to use it for debugging actually. to understand how all that curtin stuff works
[18:25] <pmatulis> ohh
[18:26] <mimizone> the cloud-init must make a call of some kind to retrieve the curtin_userdata file, I can't figure this one out. I haven't read the entire source code of maas :)
[18:28] <mimizone> where is the code that runs in the node during the deployment by the way? I've looked only at the maasserver code so far
[18:31] <pmatulis> there are scripts for enlistment, commissioning, deployment. i'm not sure of their location
[18:32] <brendand> pmatulis, mimizone - /etc/maas/preseeds
[18:33] <brendand> pmatulis, don't think that's quite what mimizone is looking for though
[18:33] <mimizone> brendand: I mean the logic that gets those preseed files
[18:34] <mimizone> those files are rendered/sent by the maasserver/preseed.py but I can't find what is making the call to the server.
[18:34] <brendand> mimizone, can you check /etc/maas/regiond.conf?
[18:34] <brendand> mimizone, is maas_url something the node can reach?
[18:35] <mimizone> not much there. the maas_url, database information
[18:38] <mimizone> in my experience, I see that the node deployment uses curtin, but the commands in my configuration files are not used, then on the second boot, cloud-init seems to retrieve the curtin_userdata file, and then it triggers the curtin commands in that file. I assume that logic is somewhere.
[18:42] <mimizone> brendand: yes, no issue reaching the maas server. I am just interested in understanding the curtin/metadata stuff because I have to customize the deployment.
[18:42] <mimizone> I also use this process to fix a current bug in maas when there are multiple static routes.
[18:48] <vogelc> pamtulis, that's right we are planning on pointing the dhcp forwarders to the MAAS rack controllers.
[18:49] <vogelc> pamtulis, really all we need to do is define the subnets in DHCP. Can we do it manually until a possible patch is in place?
[18:52] <vogelc> pmatulis: sorry for spelling your id incorrectly.
[18:58] <pmatulis> vogelc, the current devel version of maas has dhcp relay integration in the sense that maas will send the appropriate config to the active dhcp server (providing it is maas-managed)
[18:58] <pmatulis> it does not provide the relay however
[19:01] <vogelc> pmatulis: So I did upgrade to 2.2 but it still will not let me add DHCP to a VLAN because there are no controllers with that subnet defined on an interface. I was hoping what you mentioned would be our fix.
[19:03] <pmatulis> vogelc, did you choose "Relay DHCP" for that VLAN?
[19:06] <vogelc> pmatulis: totally missed that option. so if my NIC is on 10.1.1.0, it will relay it to 10.2.2.0 vlan/subnet?
[19:07] <pmatulis> https://docs.ubuntu.com/maas/devel/en/installconfig-network-dhcp#dhcp-relay
[19:09] <pmatulis> maas does not actually do the relaying. like i said above, it will send the dhcp config to the active dhcp server
[19:09] <pmatulis> (which must be maas-managed obviously)
[19:09] <vogelc> pmatulis: awesome!! Thanks, trying it out now.
[19:10] <pmatulis> vogelc, i'm really interested in your feedback on this. especially re the documentation
[19:12] <vogelc> pmatulis: For sure, I'll provide an update. I have to get the relay forwarder setup on the switch and then we should be good to go.
[20:15] <stormmore> OK that is weird, something about the node I am trying to enlist is causing it not to enlist :-/
[20:15] <palmertime> I have installed a single MAAS server (2.1.3+bzr5573-0ubuntu1) on ubuntu 16.04. I have setup ssh keys, dhcp and images are in sync for 16.04. When i boot up a PXE host, it receives a DHCP address and boots the image. It then fails on "Starting cloud-init" and ends at the login screen. From the server point of view i see the dhcp lease but no device was discovered. Is there a default login to the boot image? where should i
[20:21] <catbus1> palmertime: when it fails on starting cloud-init, what's the error message?
[20:22] <palmertime> catbus1: It scrolls by too fast in the console.
[20:23] <catbus1> I happened to encounter an enlist error yesterday and it also ends at a login screen. I found that the maas-region-controller ip wasn't set correctly, cloud-init was reporting failing to contact 169.254.169.254.
[20:23] <catbus1> palmertime: could you run sudo dpkg-reconfigure maas-region-controller to check if ip is set correctly?
[20:27] <catbus1> /nick catbus1-afk
[20:27] <palmertime> catbus1: Thanks for the tip. I'm going to dig through the API and see what the current setting is
[20:33] <stormmore> it is weird, I can create a VM manually and it will enlist, use vagrant to do it and it won't :-/
[20:40] <stormmore> I am also getting inconsistent results installing the maas "package". sometimes it doesn't register the rack controller with the region controller
[20:45] <stormmore> "Select a valid choice. qpn7s6 is not one of the available choices." is what I get from time to time when I am setting up dhcp through the cli shortly after installing maas
[21:02] <stormmore> it makes no sense... doing apt-get install maas should not cause that error, should it? I am seeing it more frequently :-/
[21:11] <mup> Bug #1668759 opened: [2.2, trunk] Window width directive fails to remove event from window <MAAS:Triaged by ricgard> <https://launchpad.net/bugs/1668759>
[21:14] <oviliz> Hi guys
[21:15] <oviliz> "You need one small server for MAAS and at least one server which can be managed with a BMC. It is recommended to have the MAAS server provide DHCP and DNS on a network the managed machines are connected to."
[21:16] <oviliz> Does that mean that the "small" server for MAAS is not going to be used sharing its CPU/RAM/drives?
[21:20] <brendand> oviliz, yes
[21:20] <brendand> oviliz, maas needs a place to exist
[21:25] <stormmore> makes no sense :-/ I can't see a difference between the 2 VMs but there has to be one :-/
[21:45] <oviliz> ty @brendand
[21:55] <palmertime> How do i set the MAAS PXE/Provisioning network address from the maas command rather than dpkg-reconfigure?
[21:56] <mup> Bug #1668774 opened: intermittent SSL connection <MAAS:Triaged> <https://launchpad.net/bugs/1668774>
[22:05] <mup> Bug #1668774 changed: intermittent SSL connection <MAAS:Triaged> <https://launchpad.net/bugs/1668774>
[22:08] <mup> Bug #1668774 opened: intermittent SSL connection <MAAS:Triaged> <https://launchpad.net/bugs/1668774>
[22:33] <roaksoax> palmertime: /etc/maas/rackd.conf -> maas_url change that from localhost:5240/MAAS to <ip:5240>/MAAS
[22:36] <roaksoax> palmertime: or: sudo maas-rack config --region-url http://<ip>:5240/MAAS
[22:36] <roaksoax> palmertime: then restart maas-rackd
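The manual edit roaksoax describes amounts to replacing one `maas_url` line. A minimal sketch of that substitution on the file's text, assuming `/etc/maas/rackd.conf` holds a plain `maas_url: <url>` line (the exact file layout is an assumption, not confirmed in the log); `set_maas_url` and the sample IP are hypothetical, and the `maas-rack config --region-url` command above stays the route given in the channel.

```python
import re


def set_maas_url(conf_text, region_ip, port=5240):
    """Return conf_text with its maas_url line pointing at region_ip.

    Hypothetical helper: assumes a single 'maas_url: ...' line and
    raises if none is found, rather than guessing where to add one.
    """
    new_line = f"maas_url: http://{region_ip}:{port}/MAAS"
    # A function replacement avoids any backslash interpretation in new_line.
    updated, count = re.subn(r"(?m)^maas_url:.*$", lambda _m: new_line, conf_text)
    if count == 0:
        raise ValueError("no maas_url line found")
    return updated


# Sample text standing in for the file's contents (illustrative only).
sample = "# other settings left untouched\nmaas_url: http://localhost:5240/MAAS\n"
result = set_maas_url(sample, "10.0.0.2")
print(result, end="")
```

After writing the changed text back, the rack controller service still needs the restart mentioned in the last message.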
[23:12] <mimizone> what would be a quick way via the cli/API to copy the same network configuration from one node to another (multiple interfaces/vlans), changing only the IP addresses?
[23:19] <palmertime> roaksoax: Perfect, Thanks for the info
[23:49] <Budgie^Smore> roaksoax so I have narrowed down my enlisting problem a little bit - seems the way Vagrant creates VMs is to blame, trying to determine what it does / doesn't do differently from manually creating the VMs
[23:51] * Budgie^Smore is stormmore btw
