/srv/irclogs.ubuntu.com/2017/06/01/#maas.txt

pmatulisbenlake, thank you00:25
BlackDexHello there. Do any of you know of any connectivity issues when using a Riverbed appliance in between a maas controller and the clients?07:47
BlackDexi'm having problems with maas, which tells the client during commissioning that the auth has failed with a 401 response09:42
roaksoaxBlackDex: maas deployed machines need to be able to contact the region controller for metadata service12:39
roaksoaxBlackDex: the IP they would contact is the one in rackd.conf12:39
roaksoaxBlackDex: so that's likely what's causing your issues12:39
roaksoaxusing something in between12:39
BlackDexroaksoax: well it can contact the server12:42
BlackDexthe region controller that is12:42
BlackDexbut i just get 401 access denied12:42
BlackDexeven though i have wireshark logs that show the oauth tokens being sent etc.12:43
BlackDexand i see that exactly what is sent is also received, without any modifications12:43
roaksoaxBlackDex: during commissioning, I believe once it says "i'm done commissioning" then some delayed messages may not make it due to that12:44
roaksoaxthat could be it12:45
BlackDexit doesn't even fetch the commissioning script :(12:45
BlackDex401 access denied12:45
roaksoaxdo you have sample logs ?12:45
BlackDexone moment, that should be in the rsyslog folder then?12:45
roaksoaxit should if cloud-init was able to send it back12:46
BlackDexroaksoax: just a moment, i'll try to create a new fresh log12:48
BlackDexroaksoax: the strange thing is that it worked for a long long time12:52
BlackDexand suddenly it stopped12:52
BlackDexroaksoax: http://paste.ubuntu.com/24737218/12:54
BlackDexi think it may be an ntp problem13:06
BlackDexit seems the ntp-sync on the region controller wasn't synced for a long time13:06
BlackDexalso, there are apparently too many hops between the clients and the region controller/external etc. for the ntp13:07
BlackDexso i now have a ntp server which should be reachable for all clients/servers within the network13:07
BlackDexlets see what that does13:07
BlackDexroaksoax: that doesn't work13:15
BlackDexat least the clock skew adjustment is gone now13:15
BlackDexroaksoax: http://paste.ubuntu.com/24737421/13:16
BlackDexthat is second attempt with the NTP fixed now13:17
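A side note on the NTP theory above: OAuth 1.0 requests carry a timestamp, so a large clock difference between a node and the server is one plausible cause of rejected requests (later in this log it turns out not to be the cause here). Below is a minimal sketch, assuming you can capture the Date header of any HTTP response from the region controller. The function names and the 300-second tolerance are made up for illustration and are not taken from MAAS or cloud-init.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(date_header, local_now=None):
    """Return server time minus local time, in seconds.

    date_header is an HTTP Date header value such as
    "Thu, 01 Jun 2017 13:15:00 GMT". HTTP dates are always GMT, so the
    parsed value is timezone-aware and can be compared with an aware
    local timestamp.
    """
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    server_time = parsedate_to_datetime(date_header)
    return (server_time - local_now).total_seconds()


def skew_is_suspicious(skew_seconds, tolerance_seconds=300):
    """Flag a skew larger than the tolerance (300 s is an assumed value)."""
    return abs(skew_seconds) > tolerance_seconds
```

A positive result means the server clock is ahead of the local clock. A small skew does not prove that auth is healthy; it only rules out one suspect.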
roaksoaxBlackDex: http://pastebin.ubuntu.com/24737462/13:21
roaksoaxBlackDex: that's the error13:21
BlackDexthat causes the 401 errors :S13:23
BlackDexoke, lets see what it does when i disable that13:24
BlackDexi didn't know that was enabled btw13:24
BlackDexgood spot!13:24
BlackDexlets see13:24
BlackDexi think it would be strange if that causes the 401 errors13:27
BlackDexbut lets see13:27
roaksoaxBlackDex: the thing is that cloud-init sends a message to MAAS and tells MAAS "commissioning has failed", so MAAS goes and says "ok, i'm gonna remove the nonce"13:28
roaksoaxBlackDex: hence maas can't authenticate13:28
BlackDexah13:28
BlackDexdarn13:28
BlackDexand it works13:28
benlakeI guess when it rains PPA errors it pours PPA errors :P13:30
BlackDexthanks!13:30
benlakeroaksoax: speaking of NTP, did you see my situation with IP selection?13:32
BlackDextrying the deploy now, seems to look good13:32
BlackDexstrange that this didn't have any impact on the other rack controllers :S13:41
BlackDexhmm13:42
BlackDexi think they were able to download the gpg key, and this specific site isn't13:42
xygnalroaksoax: how is grub config managed on install?  We need to do some tweaking to the defaults.14:18
piwi3910hey people, hope anyone can point me to a solution14:30
piwi3910i'm running maas 2.214:30
piwi3910any node i boot, physical or virtual always fails the boot with:14:31
piwi3910cloud-init can not apply stage final, no datasource found14:31
piwi3910any clues14:31
piwi3910fresh install14:31
benlakeI guess that’s my cue14:31
benlakehello piwi3910, you are me 5 days ago14:32
piwi3910hehhe cool, back to the future14:32
piwi3910i've installed maas before, never had this issue14:32
piwi3910no clue what's going on14:32
piwi3910so tell me your magic14:32
benlakelook at both /etc/maas/rackd.conf and /etc/maas/regiond.conf14:33
benlakeyou’ll probably say, “ah ha!” when looking at one of those.14:33
piwi3910regiond points to my public side of the maas14:34
piwi3910rackd to localhost14:34
benlakeand what subnet is the deployed node being asked to land on?14:36
benlakeand can said deployed node route to this “public side” you speak of.14:36
benlakes/./?/14:36
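The check benlake suggests (look at where maas_url points, then ask whether the node's subnet can reach it) can be sketched roughly as follows. This assumes the two files are flat "key: value" text containing a maas_url entry; the sample URLs, the subnet and the function names are hypothetical, and the result labels are only hints, not anything MAAS itself reports.

```python
import ipaddress
from urllib.parse import urlsplit


def parse_simple_conf(text):
    """Parse flat 'key: value' lines, skipping blanks and comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        conf[key.strip()] = value.strip().strip("'\"")
    return conf


def classify_maas_url(maas_url, node_subnet):
    """Give a rough hint about how a node on node_subnet would reach maas_url."""
    host = urlsplit(maas_url).hostname
    if host is None:
        return "unparseable"
    if host == "localhost":
        return "loopback"
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: the node must also be able to resolve it.
        return "hostname"
    if addr.is_loopback:
        return "loopback"
    if addr in ipaddress.ip_network(node_subnet):
        return "same-subnet"
    # The node needs a working default gateway (or another route) to get there.
    return "needs-route"
```

For example, classify_maas_url("http://203.0.113.10:5240/MAAS", "10.0.0.0/24") gives "needs-route", which is the situation where a missing gateway on the booting node would matter.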
piwi3910so this is what i noticed, in the dhcp the GW for the pxe network is filled in14:37
piwi3910but when the hosts boots, i can only ping it from the maas node14:37
piwi3910so the GW is not being taken14:37
benlakecareful, when you say it boots, you mean when it is booted into the ephemeral image, correct?14:38
piwi3910yep14:39
benlakeI tried troubleshooting network stuff from that image and it behaves very oddly.14:39
piwi3910it boots up, gets into ubuntu14:39
piwi3910i see the network being brought up14:39
piwi3910but i don't see the gw being set14:39
piwi3910another machine i tried in the same vlan14:39
piwi3910sure the GW works14:40
piwi3910it's the image not taking the GW from dhcp14:40
benlakedoes it have a default route?14:40
piwi3910or dhcp not providing it14:40
piwi3910nothing14:40
benlakeis that the answer to my default route question?14:40
piwi3910well i can do what you propose and run the stuff on the internal pxe network14:41
piwi3910so it doesn't go to the public side anymore14:41
piwi3910but that would only work for one pxe network14:42
benlakebacking up, the problem is the deploying/commissioning/enlisting node cannot speak to the rack controller14:42
piwi3910kinda fucks up the point of having multiple deploy networks14:42
benlakeI don’t understand what you mean, or your expectation of, “multiple deploy networks"14:43
piwi3910ok:14:43
piwi3910from what i can see, for some reason the node doesn't get the GW14:43
piwi3910because of that it cannot get to the rackserver14:44
piwi3910as that one has its config on the public side by default14:44
piwi3910so what i can do is edit the file14:44
piwi3910and put the pxe side in the rackd config14:44
benlakefor the subnet you enabled DHCP on, did you confirm a gateway is set?14:44
piwi3910yes dhcp is set and gw is defined14:45
benlakeare you trying to enlist or commission?14:45
piwi3910commission14:47
benlakeso you manually added the hardware?14:47
piwi3910yep14:49
piwi3910have a few dell server r610 i tried14:49
piwi3910and some vm's14:49
piwi3910all have the same issue14:49
piwi3910i'm gonna try another image14:49
benlakeand I’m guessing the motivation for that is because enlistment didn’t work? :)14:50
benlakewhat image are you trying? (I’m pretty sure it isn’t image related)14:50
piwi3910now the default 16.0414:51
benlakecan you screen cap the PXE boot when it acquires an address and poops out a helpful config line?14:51
benlakethat image is fine, that’s all I’ve been using.14:51
piwi3910ok i'll do a screencap and drop it on dropbox14:51
benlakethat’s how I discovered the rack IP when I had this issue.14:52
benlakethere sure is a lot of code to discover routing information...14:53
benlakemore precisely, attempt to discover what can route to what.14:54
* benlake looks at Gavin14:55
piwi3910ok very interesting14:55
piwi3910just got it fixed on the fault image14:55
piwi3910default image14:55
piwi3910only thing i did was change the kernel minimum to the hwe kernel14:55
benlakeerr, what’s a default image?14:55
piwi3910now all servers boot fine14:55
piwi3910the 16.0414:55
benlakeoh interesting. guess it was driver related14:56
piwi3910on every VM and physical server?14:56
piwi3910with different nics and all14:56
piwi3910that would be weird14:56
piwi3910any way, i'll do the screencap anyway14:57
benlakethe VMs, yeah, weird. But you only mentioned one bare metal server type14:57
benlakeand you’ve said nothing as to what your VMs are.14:57
benlakefor all I know they are using sr-iov and thus need more awareness of the underlying NIC.14:58
benlake“NTP servers, specified as IP addresses or hostnames delimited by commas and/or spaces, to be used as time references for MAAS itself, the machines MAAS deploys, and devices that make use of MAAS's DHCP services.”15:05
benlakeDo I understand “MAAS itself” to mean the region and rack controllers, correctly?15:06
roaksoaxbenlake: yes15:33
benlakealright. then my issue stands. NTP server IP being selected is non-optimal.15:34
roaksoaxbenlake: what version are you running ?15:34
benlake2.115:34
benlakeI’ll happily upgrade when it hits GA15:35
roaksoaxbenlake: 2.2 is ga already15:35
roaksoaxbenlake: which 2.1 ?15:35
roaksoax2.1.3 ?15:35
benlakewell, it's PPA GA right? I don’t see it in backports for xenial15:35
roaksoaxbenlake: the same version in PPA will hit xenial once the SRU process goes through15:35
benlake2.1.3+bzr5573-0ubuntu1 (16.04.1)15:36
benlakeright, waiting on SRU I suppose15:36
roaksoaxbenlake: we won't be doing any maintenance on 2.1 anymore15:36
benlakeI’m not stuck. I just ansibled the ntp server.15:36
benlakeunderstood.15:36
benlakeIf I remember to and see this in 2.2, I’m sure I’ll whine about it :D15:37
roaksoaxbenlake: cool, if you could file a bug then it would be great15:37
roaksoaxbenlake: provided that in 2.2 we fixed a but wrt15:37
roaksoaxbug*15:37
benlakeagain, it is dicey as to whether it is an actual flaw or just an awkward use case15:38
benlakeI saw again with regards to IP selection discussions in general15:38
benlakes/saw/say/15:39
roaksoaxbenlake: i do know that there's been some weird things in NTP15:39
roaksoaxbenlake: i can't remember if we backported that to a later 2.115:39
benlakethere is a lot of “route finding” code that seems to only be used by NTP at first glance15:39
benlakeso could definitely be isolated weirdness15:40
xygnalroaksoax hey?15:47
roaksoaxbenlake: it is indeed, as we try to find all rack controllers in the same vlan to have access to ntp15:47
roaksoaxxygnal: hey!15:47
xygnalroaksoax asked you a question a little while ago15:48
roaksoaxxygnal: that's done by curtin15:49
xygnalroaksoax:  i'll check curtin trunk docs for details about grub15:49
roaksoaxor the hooks effectively15:49
roaksoaxxygnal: any particular issues you've seen  ?15:50
xygnalroaksoax when does this apply?  We have a client who is applying grub changes in their user_data script15:50
xygnaland they are discovering that those settings are being over-written by MAAS15:50
xygnalroaksoax hm... the grub section does not even cover kernel options15:51
xygnalits the GRUB_CMDLINE_LINUX variable15:51
xygnalthat is being set15:51
roaksoaxxygnal: are you modifying this ti inject custom kernel options for the deployed machine?15:53
roaksoaxto*15:53
xygnalxygnal yes, such as console= settings we need15:53
xygnaland anything else that may come up15:53
roaksoaxxygnal: https://docs.ubuntu.com/maas/2.1/en/installconfig-nodes-kernel-boot-options15:54
roaksoaxxygnal: https://docs.ubuntu.com/maas/2.1/en/manage-cli-advanced#specify-kernel-boot-options-for-a-machine15:54
xygnalroaksoax looks like this can only be done via global UI, or per-host CLI? no API?15:56
roaksoaxxygnal: everything can be done via the api/cli. In the UI some stuff is missing indeed15:57
xygnalroaksoax I was digging through the API docs and did not see grub options15:57
xygnalmaybe i should look for kernel =p15:57
roaksoaxxygnal: the CLI is autogenerated from the API15:57
roaksoaxat least the current one15:57
xygnalkernel_opts?15:58
xygnalroaksoax and this is passed for Custom/CentOS as well?15:58
roaksoaxxygnal: i can't recall off the top of my head, but I think we do16:00
* roaksoax otp16:01
xygnalroaksoax testing that out now.  client also wants to change other GRUB settings,  are those hard-coded?16:13
xygnalroaksoax like GRUB_TIMEOUT= and others16:13
benlakebah! now what! Jun  1 16:11:27 fair-ewe cloud-init[983]: E: Malformed entry 1 in list file /etc/apt/sources.list.d/linbit-drbd9-stack_4.list (Component)16:16
benlakecould someone point me to docs regarding the proxy? specifically, I’d like to flush the cache16:26
xygnalroaksoax  hm... seems this setting is not taking.  either globally or per node it appears to be ignored?  When does it apply? client is noticing this during user_data script execution.16:30
benlakeI don’t know why this repo I added is causing commissioning issues (KVM guest). The repo I added has been enabled while deploying 3 bare metal nodes.16:30
benlake^ that completed successfully16:30
xygnalroaksoax during user_data execution, that line shows as ="" instead of with our custom tag settings or global settings16:33
benlaketotes a bug. this is what is ending up in /etc/apt/sources.list.d/linbit-drbd9-stack_4.list16:38
benlakedeb http://ppa.launchpad.net/linbit/linbit-drbd9-stack/ubuntu  main16:38
benlakeie. sans xenial16:38
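Why that line is rejected: a one-line apt source entry is "type uri suite component...", so with the release name missing, "main" gets read as the suite and no component is left. Below is a rough sketch of that field check; it is a simplified illustration with made-up result labels, not apt's actual parser, and the function name is hypothetical.

```python
def check_sources_line(line):
    """Loosely validate a one-line-style apt source entry."""
    fields = line.split()
    if not fields or fields[0].startswith("#"):
        return "ignored"
    if fields[0] not in ("deb", "deb-src"):
        return "malformed (Type)"
    rest = fields[1:]
    # Skip an optional "[ option=value ... ]" block before the URI.
    if rest and rest[0].startswith("["):
        while rest and not rest[0].endswith("]"):
            rest = rest[1:]
        rest = rest[1:]
    if len(rest) < 1:
        return "malformed (URI)"
    if len(rest) < 2:
        return "malformed (Suite)"
    suite, components = rest[1], rest[2:]
    # A suite ending in "/" is an exact path and takes no components.
    if suite.endswith("/"):
        return "ok" if not components else "malformed (Component)"
    if not components:
        return "malformed (Component)"
    return "ok"
```

Fed the entry quoted above, this returns "malformed (Component)"; with xenial in the suite position it returns "ok".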
benlakeenlist, commission fails. deploy succeeds. with the additional repo enabled.17:01
benlakedoesn’t affect releasing, that’s interesting.17:24
roaksoaxxygnal: i'd need to investigate. this could be due to how this is generally handled in ubuntu/debian vs how it is handled on centos, but IIRC, we would just copy those extra params and use them for the installed system as well17:47
xygnalroaksoax right now I am coding a curtin script to automatically nuke the added lines17:53
xygnalbut that is not ideal17:53
xygnalthe grub config has value settings PRIOR to maas modifying it17:53
xygnalbut maas puts its setting BELOW those17:53
xygnalwhich causes it to ignore the higher ones17:53
xygnalright now my script just rips those extra lines out during deploy to be sure they do not interfere.  this is not something i could easily do for different clients, so i'd like to bug track this to see if COS is really unsupported/get a bug/feature request going17:54
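The override xygnal describes follows from /etc/default/grub being read as shell variable assignments: when the same variable is assigned twice, the later assignment wins. Below is a small sketch of that last-one-wins behaviour, assuming plain KEY=VALUE lines with no variable expansion; the sample values in the usage note are hypothetical and the function name is made up.

```python
import shlex


def effective_settings(text):
    """Return the value each variable ends up with after all assignments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # shlex strips the surrounding quotes the way a shell would.
        parts = shlex.split(value)
        # A later assignment simply replaces the earlier one.
        settings[key.strip()] = parts[0] if parts else ""
    return settings
```

With GRUB_CMDLINE_LINUX="console=ttyS0,115200" on one line and GRUB_CMDLINE_LINUX="" appended further down, the effective value is the empty string, which matches the ="" seen during user_data execution.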
roaksoaxxygnal: You should file a bug and submit a patch :). That'd be awesome!17:56
xygnalroaksoax: I don't know python, so patching it would be quite a challenge.   I will help to debug it though, if you can give me some hints on how to verify18:15
roaksoaxxygnal: i'll need to investigate. Haven't looked at that code in ages. But I'd recommend you file a bug18:58
roaksoaxso it is tracked at least18:58
roaksoaxbenlake: https://bugs.launchpad.net/maas/+bug/169508320:47
roaksoaxbenlake: that's what you were hitting earlier today wrt NTP20:47
benlakehmm, perhaps. I did have a new fabric pop up, but can’t quite pin down timing20:51
benlakeinteresting, so it seems /etc/systemd/timesyncd.conf is never updated. Is that correct?21:01
mupBug #1695083 opened: [2.2] NTP misconfigured after the Rack discovered a new 'lxdbr0' interface <MAAS:Triaged> <MAAS 2.2:Triaged> <https://launchpad.net/bugs/1695083>21:05
roaksoaxbenlake: we don't update timesyncd.conf; we install ntpd22:07
roaksoaxbenlake: effectively it's your same issue22:07
benlakesure, but the bummer of not touching timesyncd means there is double duty AND I get to see it fail in the logs :P22:48
benlakebut I’ll ansible that away too I suppose.22:49
