/srv/irclogs.ubuntu.com/2018/07/26/#maas.txt

fallenour_o/02:03
fallenour_hey guys, I'm having a lot of issues with maas, and I'm super confused at this point02:03
fallenour_maas can resolve any system, even itself; machines deployed by maas can resolve any machine except maas; any containers deployed on said machines can ping any IP but can't resolve any systems by DNS, though they can ping 127.0.0.53, which is set in resolv.conf02:04
fallenour_all settings are DHCP02:04
=== frankban|afk is now known as frankban
=== mulbc is now known as ChrisNBlum
fallenour_hey is anyone else having issues with dns resolving with maas?13:09
roaksoaxfallenour: did you upgrade to 2.4.0 final rather than sticking with beta2 ?13:21
roaksoaxfallenour: on the dns side, are the containers IP addresses managed by MAAS ? (e.g. maas allows resolving against subnets it knows about)13:22
roaksoaxfallenour: e.g. https://bugs.launchpad.net/maas/+bug/1774206 13:23
=== Guest19794 is now known as kklimonda
robottalkHi all. Just having a bit of a time trying to get a private ssh key into our preseed configuration. Tried a bunch of late_command methods (ssh_keys, write_files, custom sh), following maas, cloud-init, preseed docs. Nothing working very well. Any thoughts on best practice here or suggestions for getting this to work? We need the private key to then access a private git repo via ssh and pull down some manifests... Thank you!16:20
roaksoaxrobottalk: did you do a late_command "in-target" ?16:37
robottalkyes, last night we left off with something like   ssh_key_copy: curtin in-target -- sh -c "/bin/cp --preserve=mode /home/conductor/.ssh/maas_deploy /target/root/.ssh/id_rsa; /bin/cp --preserve=mode /home/conductor/.ssh/maas_deploy.pub /target/root/.ssh/id_rsa.pub"16:39
robottalkbut i wasn't sure at that point if the maas server (host called conductor) mount points were accessible16:39
robottalkso that was a last effort16:39
robottalkwrite_files worked16:39
robottalkbut the key was broken because it didn't respect new lines16:40
roaksoaxyeah I was gonna suggest write_files would be better16:40
robottalkthat was like16:40
robottalkwrite_files:16:41
robottalk  f1:16:41
robottalk    path: /root/.ssh/maasdeploy16:41
robottalk    content: "-----BEGIN RSA PRIVATE KEY-----16:41
robottalk...16:41
robottalk-----END RSA PRIVATE KEY-----"16:41
robottalk    permissions: '0600'16:41
robottalkbut again the key seems broken since it's just a long run on string16:41
roaksoaxrobottalk: https://pastebin.ubuntu.com/p/2Fsw8CyrZN/16:42
roaksoaxi would do that16:42
robottalkthe pipe :-)16:42
robottalklemme try it16:42
robottalkthanks!16:43
roaksoaxrobottalk: for example, i did this myself: https://pastebin.ubuntu.com/p/vytRwCKd2x/16:43
roaksoaxrobottalk: that correctly wrote the script provided by content16:43
roaksoaxas yaml would do weird stuff with the quotes16:43
robottalkawesome thanks - testing now16:44
robottalkfingers crossed16:44
robottalkhehe16:44
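[editor's note] The pastebin contents are not part of this log, so the following is only a sketch of the "pipe" idea discussed above. It assumes the same write_files layout robottalk pasted (an entry named f1 under write_files) and uses a YAML literal block scalar (`content: |`) so the line breaks in the key survive. The helper name render_write_files_entry is hypothetical, and the key body stays elided as "...".

```python
import textwrap


def render_write_files_entry(name, path, content, permissions="0600"):
    """Render one write_files entry whose content is a YAML literal block."""
    # Indent every content line so it sits under "content: |"; the literal
    # block scalar keeps each newline exactly as written.
    body = textwrap.indent(content.rstrip("\n"), " " * 6)
    return (
        "write_files:\n"
        f"  {name}:\n"
        f"    path: {path}\n"
        f"    permissions: '{permissions}'\n"
        "    content: |\n"
        f"{body}\n"
    )


# Key body elided, as in the log.
key = "-----BEGIN RSA PRIVATE KEY-----\n...\n-----END RSA PRIVATE KEY-----\n"
snippet = render_write_files_entry("f1", "/root/.ssh/maasdeploy", key)
print(snippet)
```

With a double-quoted scalar, YAML folds the line breaks into spaces, which matches the "long run-on string" robottalk saw; the literal block avoids that.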
=== frankban is now known as frankban|afk
roaksoaxrobottalk: i guess it worked ?17:11
robottalkit just came up17:12
robottalkthe write_files with pipe worked17:12
robottalk:-)17:12
robottalkthanks so much!17:12
robottalkjust checking something in the cloud-init-output17:12
robottalkseeing a "Failed to start Apply the settings specified in cloud-config"17:13
robottalkand cloud-config.service isn't running17:13
robottalkbut i just started looking into it...17:13
robottalkjust this error which seems new ... but not sure of its impact just yet17:14
robottalkERROR: ld.so: object 'libeatmydata.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.17:14
roaksoaxthat libeatmydata.so shouldn't really be a concern I think17:19
robottalkroaksoax: it seems ok ... is the cloud-config.service needed once the machine is deployed? just looking around and that error and it doesn't seem critical but not sure if that service is required ... seems important?17:22
robottalkroaksoax: thanks for your help! going to move forward with this method - seems to do the trick - wish i had thought about the pipe yesterday! hahaha thanks again!17:35
roaksoaxrobottalk: no prob, glad it works18:36
xygnalroaksoax: we are having problems with commissions just 'stalling' and never finishing.18:39
xygnalroaksoax: just gathering some numbers but it feels like 40-50% of the time we submit a commission it fails.18:40
xygnaland it's not a per-model or per-serial problem, we'll repeat the same commission 4 times and the 4th time it will go through.18:41
xygnalcat /tee18:41
roaksoaxxygnal: what version?19:29
roaksoaxxygnal: and where does it get stuck ?19:29
roaksoaxxygnal: does it not expire after X minutes and gets marked failed commissioning?19:29
xygnal2.419:35
xygnaldoes not expire19:35
xygnalI am also seeing disk service times of up to 1.0s on tur19:35
xygnalon the local disk19:36
xygnalbut this is a regiond without db inside19:36
fallenour_roaksoax: Im not sure what you mean19:37
fallenour_roaksoax: im on 2.4.0-beta219:37
fallenour_if there is a more stable version of maas, id rather be on that, even if it means reverting back a version or two. I need maas to work, otherwise life will be quite miserable for me.19:38
xygnalroaksoax:  it's as if the commission jobs *die* and MAAS loses track of them/does not restart the jobs.19:40
xygnalThey remain in 'COMMISSIONING' status *until we abort*19:40
roaksoaxfallenour: ppa:maas/stable is where 2.4.0 final is available20:08
roaksoaxfallenour: you should upgrade20:09
roaksoaxxygnal: how long after is that you abort ?20:09
roaksoaxxygnal: and where do they get stuck ? e.g. do they pxe boot into the ephemeral environment ? do they get stuck running a script ?20:11
xygnal roaksoax:  hours to days20:32
roaksoaxxygnal: i would file a bug, but the important thing here is determining where it is getting stuck20:33
xygnalroaksoax: there appears to be a pxe boot element, as I believe we've seen it receive the DHCP offer but be told there was nothing for it to boot20:33
roaksoaxxygnal: how many machines are you booting at the same time ?20:34
xygnalroaksoax: doesnt matter if we do 1 at a time or 10 at a time, we see the same result20:35
roaksoaxxygnal: right, well again i would file a bug, attach logs and such and we can try to look and determine whats wrong20:35
xygnalI can verify that the actual load on the MAAS box itself is very low pretty much all the time, so it does not appear to be a performance bottleneck.  at least not outside of app code.20:35
xygnaljust /var/log/maas/ logs?20:36
roaksoaxxygnal: yeah, and the events for the given machine20:36
xygnalwell yes, a hostname and some chronological information20:36
xygnalas well :)20:36
xygnalon a lighter note, we just made a patch to your python code that handles code 64 SMART errors with a pass instead of a fail20:37
xygnal(means that the log had errors, but there are not active errors)20:37
xygnalwe run into it a lot with nodes that have FPDMA errors from bad cabling.  The SMART Log will never clear, so in order to get MAAS to pass commission we had to force over-ride each time in the past.20:38
xygnalapparently the Munin product had a similar patch/problem in the past20:38
xygnalit's like.. 2 lines of code change.  we could submit a PR to your code or.. with how small it is.. would you prefer a bug report + attached fix?20:39
roaksoaxbug report + attached fix is better to keep track of stuff20:40
roaksoaxor you can attach a diff to the bug report as well20:40
xygnalit's tiny so i'll just bug report + the lines + a linked article about why20:42
roaksoaxcool20:45
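[editor's note] xygnal's actual patch is not shown in the log. As a rough sketch of the idea only: smartctl reports its exit status as a bitmask, and bit 6 (value 64) means the device error log contains records of errors. The function below is hypothetical, not MAAS code; it treats a result as a pass when no bit other than the ignored one is set.

```python
# smartctl exit status bit 6: the device error log contains records of errors.
ERROR_LOG_HAS_ENTRIES = 1 << 6  # 64


def smart_check_passed(returncode, ignored_bits=ERROR_LOG_HAS_ENTRIES):
    """Return True when no bits other than the ignored ones are set."""
    return (returncode & ~ignored_bits) == 0


print(smart_check_passed(0))    # clean run
print(smart_check_passed(64))   # only past log entries
print(smart_check_passed(8))    # bit 3: SMART status reports the disk as failing
print(smart_check_passed(72))   # past log entries plus a failing disk
```

Masking the bit rather than comparing against exactly 64 keeps a failure visible when bit 6 is set together with another bit.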
mupBug #1783889 opened: COMMISSION S.M.A.R.T Tests fail unnecessarily  on code 64 (past log entries) <MAAS:New> <https://launchpad.net/bugs/1783889>20:57
xygnalroaksoax: both submitted.  the COMMISSION one is private as it contains logs with IPs inside.21:36
xygnalroaksoax: I think you are going to be disappointed with the info, as the logs show no problems between starting commission and our manual abort.21:37
roaksoaxxygnal: have the link for the commission on?21:42
roaksoaxone*21:42
xygnal1783892 21:43
roaksoaxxygnal: do you have the rackd.log from where this machine is to be pxe booting ?21:47
xygnalcan get that21:48
roaksoaxxygnal: and the events of the machine in the failed attempt21:51
roaksoaxmaas admin events query hostname=<machine>21:51
roaksoaxxygnal: also this would be helpful on a failed run /var/log/maas/rsyslog/<machine-name>/<date>/messages21:53
xygnalhm... i can't find the hostname in rackd log on any rack servers21:57
xygnalwould it not BE listed with its hostname in rackd?21:57
roaksoaxxygnal: by pxe mac21:58
roaksoaxor by pxe ip starting from 2.521:58
ronethQuestion: How to configure IP on the deployed OS?22:00
ronethJust deployed CentOS7 and the deployed OS was assigned an IP which seemed to be from one of the subnets defined in MAAS.22:01
ronethHow can it be deployed such that a desired IP is assigned to the deployed OS ?22:01
roaksoaxroneth: you mean you want the machine to have a specific ip rather than the auto-assigned ip ?22:02
ronethYes.22:02
xygnalyou have to set the network config for the node to Static IP instead of Auto-Assign22:03
xygnaland you have to put the IP you want in ahead of time, before deployment22:03
xygnalwhich means an admin has to do it :)22:03
ronethI am the admin. : ) Can you please elaborate on how I should be setting up the network config?22:03
roaksoaxroneth: for example, go to the UI, go to the specific machine, go to the interfaces section22:04
roaksoaxroneth: and edit the specific interface,22:05
roaksoaxchange it from 'Auto assign' to 'Static assign' and select the IP you want22:05
xygnalroaksoax: I dont see the mac address in question listed at all in rackd.log on our controllers.   using grep -i and :'s22:06
ronethAh! So, that is "commissioning stage"22:06
roaksoaxroneth: commissioning will obtain an IP from the MAAS run DHCP. Once the machine is 'Ready' you can change the ip you want for deployment22:06
roaksoaxxygnal: that's strange... that would mean the machine is not pxe booting ?22:07
xygnalit's correct that we have not seen them able to boot PXE.  We see them get DHCP, but the PXE reply seemed to be invalid22:07
ronethSo, after commissioning and after the machine becomes "Ready", I would have to edit the interface to be "static assign" --- correct?22:07
roaksoaxroneth: correct22:07
xygnalroneth: yep22:07
xygnalroneth: beware that you need to be running a recent version of the CentOS image if you need that IP to be static file in CentOS as opposed to static DHCP22:08
xygnalroneth: and if you only run with DHCP Static method, beware that if you bring rack controllers down for 10 minutes all of those boxes will go offline22:09
ronethxygnal: thanks a lot. A different Question on "subnet": How does MAAS determine what subnet to pick per commissioning ?22:09
roaksoaxroneth: whichever subnet you have connected to the vlan where the machine PXE boots22:10
roaksoaxroneth: and for which you have enabled DHCP by creating a dynamic DHCP range22:10
ronethah! So, that depends on the underlying real physical set up ?22:11
roaksoaxroneth: yes22:13
roaksoaxroneth: well, depends22:13
roaksoaxroneth: a vlan can have as many subnets as you want really22:13
roaksoaxyou can just add any subnet , the machine could get any ip, pxe boot, etc22:14
roaksoaxbut you will need a gateway to access the external network22:14
roaksoaxto get packages and stuff22:14
roaksoaxso that will be dependent on that22:14
xygnalunless you proxy them through maas22:14
xygnalno?22:14
xygnalmost of our rackd subnets are NOT internet accessible22:15
roaksoaxxygnal: yes, but it still has a gateway22:15
ronethSo, If I have 3 VLANs defined in MAAS, how can I tell MAAS what vlan to pick per commissioning or deployment?22:16
roaksoaxroneth: you will always need a physical network to do the pxe booting. After that you can configure pretty much anything you want22:17
ronethSo, it depends on the underlying VLAN then, I can't really tell MAAS what VLAN to pick. (?)22:19
xygnalroaksoax: internal gateway to traverse internal subnets, sure. my bad.22:20
xygnalroaksoax: let me know if we need to turn up any debugging on next commission22:20
roaksoaxxygnal: yup, will look at it tomorrow as i'm eod22:23
xygnalty same here22:23
roaksoaxroneth: so basically, maas will have an interface that's facing the machines for PXE boot right? which could be connected to any vlan configured on the switch port or a trunk vlan or a management vlan22:23
roaksoaxwhatever you may wanna call it22:23
roaksoaxlet's say that's eth1 - 10.10.10.2 in MAAS22:24
roaksoaxin the maas model that would be, say fabric-0 - untagged - 10.10.10.0/2422:24
roaksoaxmachines that PXE boot are connected to the same vlan22:24
roaksoaxso you would need to go to the 'untagged' vlan of fabric-0, enable DHCP and create a dynamic range on 10.10.10.0/24 so that machines that PXE boot get an IP from MAAS on that subnet22:25
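[editor's note] A small sanity check for the layout roaksoax describes, using only the addresses from the log (MAAS at 10.10.10.2 on 10.10.10.0/24). The dynamic range bounds below are made-up values for illustration, and valid_dynamic_range is a hypothetical helper, not part of MAAS.

```python
import ipaddress

subnet = ipaddress.ip_network("10.10.10.0/24")
maas_ip = ipaddress.ip_address("10.10.10.2")


def valid_dynamic_range(start, end, subnet, reserved):
    """Check that a DHCP range sits inside the subnet and avoids reserved IPs."""
    start = ipaddress.ip_address(start)
    end = ipaddress.ip_address(end)
    in_subnet = start in subnet and end in subnet
    ordered = start <= end
    # The range must not swallow addresses already in use, such as MAAS itself.
    clear = not any(start <= ip <= end for ip in reserved)
    return in_subnet and ordered and clear


# Hypothetical range for machines that PXE boot on the untagged vlan.
print(valid_dynamic_range("10.10.10.100", "10.10.10.200", subnet, [maas_ip]))
# A range that overlaps the MAAS interface address is rejected.
print(valid_dynamic_range("10.10.10.1", "10.10.10.50", subnet, [maas_ip]))
```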
ronethroaksoax: that makes sense. Thank you.22:28
ronethI have a script that configures the bonding and assigns an IP to the interface.... How can I pass the script to the deployed machine?22:30
ronethI tried "user_data" but it doesn't seem like it ever gets run.22:30
