/srv/irclogs.ubuntu.com/2017/05/23/#maas.txt

benlake Folks, I hate to bother, but I’ve been at this for a while. Attempting to commission my first node on a fresh 16.04.2 MAAS install using the 16.04 image, and cloud-init is failing with “no datasource found”. Here is a screen cap with the interesting points being at times 1:27, 1:32, and 1:44, https://www.screencast.com/t/iSyL4IAPiI 03:36
benlake I can’t quite tell if this issue is related: https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1648380 03:37
benlakePossibly relevant, at 0:59 mid screen you can see eth0 being renamed to enp1s0. I manually added the node (because this same issue prevented PXE based enlisting), and manually changed the interface name to enp1s0 in MAAS. Thinking that perhaps that was playing a part in preventing communication between controller and node.03:39
benlake1:06 the line “Invalid path for Logical Volume” when referring to the iSCSI LUN seems troubling too, but it then goes on to seemingly mount and read /scripts/, so I guess OK?03:46
benlakeThe next step I suppose will need to be backdooring the image to dig around locally.03:46
benlakeAttempting to poke around the “MAAS datasource” section on https://docs.ubuntu.com/maas/2.1/en/troubleshoot-faq led me to files that don’t exist on my controller. So I don’t know if that information is just old or if something didn’t get installed.03:50
benlakeany pointers are appreciated.03:51
=== frankban|afk is now known as frankban
cnfugh, silly maas networking :/09:37
=== rvba` is now known as rvba
mupBug #1681278 opened: bootstrap failure on MAAS <bootstrap> <maas> <juju:Incomplete> <juju 2.0:Won't Fix> <MAAS:New> <https://launchpad.net/bugs/1681278>13:10
jeevanhi13:48
benlakeHmm, so I don’t have any files denoted by the “backdoor steps”, ie. no /var/lib/maas/boot-resources/*/*/*/*/*/*/root-image14:01
benlakebut I think it just dawned on me that process is referring to images that actually manage to get deployed.14:01
benlakeI instead, am stuck at cloud init issues during PXE boot.14:01
benlake should the cloud-init package have been installed via the maas metapackage? 14:55
benlakeis there a way to get into the PXE booted image to check network config?15:26
benlakeI’m working under the assumption “no datasource found” might be a failed MAAS API call15:26
pmatulisbenlake, i cannot see your original screenshot. got some kind of java problem trying to view15:29
benlakethis? https://www.screencast.com/t/iSyL4IAPiI - it is a flash video15:29
benlakeI’ll snap pics at the times I noted.15:30
benlake iSCSI path issue (maybe?) https://www.screencast.com/t/d74xmHp8tt6L 15:35
benlake cloud-init init https://www.screencast.com/t/tJXY2WzefYl 15:35
benlake cloud-config apply no datasource https://www.screencast.com/t/rvKCTWh1xv7 15:35
benlake cloud-init stage final no datasource https://www.screencast.com/t/zZ3KnTDZAPV 15:36
benlakeI’ve been poking around the maas node to try and find/understand what config is being sent to cloud init15:37
pmatulisbenlake, can you explain your network topology and what the underlying machines are?15:41
benlakemaas controller has 2 interfaces, same wire. Untagged and VLAN 100 interfaces. Untagged is 10.128.1.128/25 the “provisioning subnet”. I set a dynamic range of 10.128.1.192-.254. I enabled maas DHCP on the untagged interface.15:45
benlakethe controller is a supermicro + ubu 16.04, and the node attempting to be deployed is the same spec supermicro15:45
benlakeboth with IPMI 2.0+15:45
benlakethe node I’m attempting to deploy properly PXE’s via maas15:46
benlake and DHCP is assigning 10.128.1.192 15:47
benlakesince enlistment didn’t work, I manually added. maas is properly managing IPMI to control power to the machine.15:47
benlake because I don’t want you to have to believe me, here is the PXE boot https://www.screencast.com/t/9vlNb5rfRM 15:50
benlakeoh balls.15:50
benlakeoh pmatulis, you rubber ducky you15:50
pmatulisbenlake, hm?15:50
benlakeI just noticed there is a config-url displayed in that last screenshot15:51
benlakeand it has the tagged interfaces IP!15:51
benlakebut the iscsi target IP is correct. why the hell is maas deciding to use both?15:51
benlakeanyone know where maas is pulling the config-url IP from?15:59
kikoblake_r, mpontillo, or maybe newell might know that one16:13
kikobenlake, those addresses do look suspicious16:14
kikobenlake, I assume the 74.x network is not accessible from the node itself16:14
benlakethe IPs are valid, just not what maas should be using here.16:14
blake_rbenlake: what is the maas_url= in the /etc/maas/regiond.conf16:14
benlakecorrect, 74. is not.16:14
blake_rbenlake: that needs to be the same IP address that the PXE booting nodes can reach the region controller16:15
benlake I just found this command, sudo maas-region local_config_set 16:15
blake_r benlake: you can use that command to change the value in /etc/maas/regiond.conf, or manually change it 16:16
benlakeand set it to the correct IP, so /etc/maas/regiond.conf has the correct IP. Not sure if it did before. Let me check history16:16
benlakeding, ok so that command prolly changed it16:16
blake_rbenlake: if you change it you need to restart16:16
blake_r systemctl restart maas-regiond 16:16
kikobenlake, this is a very common problem, it stems from us trying to guess the address during MAAS insteall16:16
blake_rto get the updated IP and then try to enlist16:16
kikoinstall16:16
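A rough way to sanity-check the point blake_r makes above: the maas_url host has to be an address the PXE booting nodes can reach. The sketch below only tests whether a literal IP in the URL sits inside the provisioning subnet, which is a heuristic (a routed address outside the subnet could also work). The function name, the 10.128.1.130 controller address, the 192.0.2.10 stand-in for the tagged interface, and the :5240/MAAS URL form are all assumptions for illustration; only 10.128.1.128/25 comes from the conversation.

```python
import ipaddress
from urllib.parse import urlparse


def maas_url_on_subnet(maas_url: str, provisioning_cidr: str) -> bool:
    """Return True if maas_url names a literal, non-loopback IP inside provisioning_cidr."""
    host = urlparse(maas_url).hostname
    try:
        addr = ipaddress.ip_address(host)
    except (ValueError, TypeError):
        # Hostnames such as "localhost" (or a missing host) cannot be judged here.
        return False
    if addr.is_loopback:
        return False
    return addr in ipaddress.ip_network(provisioning_cidr)


# Hypothetical values: the untagged interface would pass, the tagged one would not.
print(maas_url_on_subnet("http://10.128.1.130:5240/MAAS", "10.128.1.128/25"))
print(maas_url_on_subnet("http://192.0.2.10:5240/MAAS", "10.128.1.128/25"))
```

A False result does not prove the URL is wrong, it only flags it as worth checking from the node's side of the network.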
kikoblake_r, where did we end up with putting a debconf option to request upon installation?16:17
kikoblake_r, or alternatively, leaving it unconfigured until the user sets it explicitly16:17
kikoinstead of trying to guess16:17
kiko as when we guess wrong, which is almost always on a multi-homed regiond, the failure mode is horrible? 16:17
blake_rkiko: dpkg doesnt request16:17
kikoI had a bug filed on this since prehistoric times I think16:17
blake_rkiko: this is a change the snap has done, to make this process better16:17
blake_rkiko: the snap asks you when you configure16:18
kikothat's nice -- but dpkg could too if we wanted it to?16:18
blake_rkiko: another fix would be to proxy all requests through the rack, then that IP doesn't matter16:18
blake_rkiko: I think it doesn't because it breaks installation from the ISO16:18
blake_rkiko: but I don't fully remember why16:18
kikoI think it depends on the priority set in the ISO installer16:18
kikoindeed, that would work and is probably the right solution16:19
benlakeblake_r: restarted, testing. I concur with not guessing. Especially since MAAS most definitely is OK working with VLANs, so the controllers will likely always have multiple interfaces...16:19
kikoright16:19
kiko https://bugs.launchpad.net/maas/+bug/1418044 16:20
benlakewhy are the iscsi paths using the other IP?16:20
kikobecause they talk to the rack controller16:20
kikowhich is a separate component16:20
benlakeseems odd that discovery isn’t at least consistent16:20
benlakeoh, gotcha16:20
kikowell that is due to a design artifact:16:20
kikoa) region and rack are separate, for scalability reasons (you'll want many rack controllers)16:21
kikob) the nodes mostly talk to the rack, but for metadata requests currently talk to the region16:21
blake_rbenlake: you can also change the /etc/maas/rackd.conf maas_url16:21
kikoc) at install time it's unclear what the internal interface for the region controller actually is, and we guess16:21
blake_rbenlake: that is unique per rack controller16:21
benlakeblake_r: that’s currently localhost, so it’s OK16:22
blake_rbenlake: if its localhost, MAAS will use the IP set in regiond.conf16:22
benlakeoh wow, that’s opaque16:22
* benlake changes it16:23
blake_rbenlake: it tells the rack controller how to talk to the region controller16:23
kikoblake_r, wtf??16:23
blake_rbenlake: but the machines that PXE boot from that rack controller, must also be able to contact the region controller at that address16:23
benlakeyeah, I follow that, but detecting localhost and then using another config is a bit rough16:23
kikowhat benlake said16:24
benlakeif it says localhost, I expect localhost to be resolved.16:24
kikoanyway, it's really unclear that that config means "the IP which nodes trying to talk to me should use"16:24
kikowhich leads to putting localhost in be fine16:24
benlakeblake_r: yup, gotcha.16:24
blake_r really what should happen is that all communication should proxy through the rack controller 16:24
blake_rremoving the need for the nodes to use the maas_url in rackd.conf16:24
kikosorry, leads to people thinking that putting localhost in is fine16:24
kikoyeah16:25
blake_r putting localhost is fine 16:25
blake_rin a simple MAAS16:25
blake_rin complex networking things get more difficult16:25
blake_rbut proxy through rack would solve this problem16:25
benlakeone thing I did after manually adding this node, then marked it broken, was to change the interface name from eth0 to enp1s0 - is that necessary? is that interface name meaningful to cloud-init/deployment?16:26
benlakeI did this in my troubleshooting quest16:26
kikoit should not be necessary16:27
blake_rbenlake: that is not necessary16:27
blake_rbenlake: but when you deploy that interface will always get that name16:27
benlakegreat! trying commission again, if it moves along, I’ll blow everything away and try a raw enlist via pxe.16:27
benlake“get that name” - as in end up in /etc/network/interfaces?16:28
blake_rbenlake: yep, and udev rules to make sure it has that name16:29
benlakesweet, cloud init worked!16:29
benlakeblake_r: ah, ok then.16:29
cnfso MaaS won't allow a default gateway outside of the CIDR i define16:31
cnfbut the CIDR is a lot larger, i just have a small part of it assigned to me16:32
mpontillo cnf: MAAS expects the CIDR to match how it is defined on the network; if you only control a small part you can make it an "unmanaged" subnet in MAAS 2.2 (reserve a range for IP allocation). if you want MAAS to use DHCP on that subnet, define the entire subnet, and make sure to define it as a managed subnet with reserved ranges for the portions MAAS is not allowed to allocate from 16:33
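The gateway complaint comes down to a containment test: a gateway is only valid for a subnet if its address falls inside that subnet's CIDR. A minimal sketch of that test follows; it is not MAAS's own validation code, and the 192.0.2.x addresses are made-up documentation addresses standing in for cnf's real ranges.

```python
import ipaddress


def gateway_in_subnet(gateway: str, cidr: str) -> bool:
    """Return True if the gateway address lies inside the given CIDR."""
    return ipaddress.ip_address(gateway) in ipaddress.ip_network(cidr)


# Hypothetical layout: the gateway lives in the /24, not in the carved-out /29.
print(gateway_in_subnet("192.0.2.1", "192.0.2.64/29"))  # outside the small block
print(gateway_in_subnet("192.0.2.1", "192.0.2.0/24"))   # inside the real subnet
```

This is why defining only the small block fails: the shared gateway belongs to the larger subnet as it exists on the wire.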
cnfmpontillo: yeah, that's a pain, because i have a /29 in it and a /28 in it, used for different things16:34
mpontillocnf: worth noting is that MAAS is happier with non-overlapping subnets as well; users are allowed to model overlapping subnets but there might be edge cases, so I would recommend against it16:34
mpontillocnf: if it's just one bit, should be easy to mask off the unusable-to-MAAS portion with a reserved range?16:35
cnfyou can recommend against it, but i don't decide on the network ranges used16:35
cnf2 bits, i have 2 non- following parts of a /2416:35
cnfboth with the same gateway16:36
mpontillocnf: oh okay, so there are at least three overlapping subnets, a /24, /28, and /29?16:36
cnfyeah16:36
mpontillocnf: are you managing DHCP on the subnet?16:36
mpontillocnf: rather, do you expect MAAS to manage DHCP on your portion of the subnet?16:37
mpontillocnf: if yes, how is the traffic isolated from the larger subnet?16:37
cnfit's not DHCP, but juju needs to ask for IP's in it16:37
cnfmpontillo: which is used for IPs for containers16:38
cnflegacy, so much fun16:42
mpontillocnf: ok. so if I understand you correctly, you have /24, you don't manage DHCP on the subnet, and you want to carve out /28 and /29 networks for specific container-IP-assignment purposes?16:43
benlake blake_r: pmatulis: kiko: thanks for your help. The machine is now progressing. Some other things to tinker with, but those are likely with my setup. 16:43
kikobenlake, thanks, please chime in on the bug so it's not just me :-)16:43
cnfmpontillo:  yes16:43
benlakekiko: will do!16:44
cnfit's the only way i know to get juju the IP's for containers, when running on MaaS16:44
mpontillocnf: how do you plan to tell juju which subnet to use? (are you using spaces? if yes, MAAS 2.2 may actually break you, since spaces moved to be associated with VLANs instead of subnets)16:45
cnfi am using spaces16:46
cnfand ugh16:46
cnfwhy would you associate spaces with VLANs?16:46
cnfhow can you then tell juju what subnet to use?16:46
mpontillocnf: well, there was a significant debate about that. basically there is no perfect solution, but in order to deploy OpenStack in certain scenarios we needed to have a way to have an "empty" VLAN with a space, but no subnets assigned yet16:48
mpontillocnf: it was understood at the time that people weren't using spaces how you're using them =(16:48
blake_rcnf: there is no true isolation unless a space is a VLAN16:48
cnfi have been struggleing with openstack on MaaS / juju for a LONG time now16:48
cnflist of open bugs is growing...16:48
cnfi mean, if you want a vlan, define a vlan!16:50
cnfthe usefulnes of spaces was that you could put several subnets in a single space16:50
mpontillo cnf: right, so as blake_r implied, spaces were envisioned as a "color" for a vlan/subnet that defines its security properties. you might have a "red" space for your DMZ, "green" for your intranet, "purple" for your protected health care data, etc 16:50
mpontillo cnf: if you combine all those things onto the same VLAN it sort of defeats the purpose of spaces modeling the security properties of the network 16:50
blake_rcnf: you can have 2 vlans in the same space16:50
blake_rcnf: just a router between them16:51
mpontillo*can't I think you mean blake_r?16:51
blake_rmpontillo: can(16:51
blake_rcan*16:51
cnfthis is going to be a fun RFI16:51
blake_ras for the subnet you don't control16:51
blake_radd the whole subnet16:51
blake_rset it to unmanaged16:52
cnfso i can almost start from scratch16:52
cnfand it STILL doesn't fix my problems16:52
blake_rand define a range16:52
blake_rthen Juju will only use those IP's16:52
cnfblake_r: and then add the same subnet again?16:52
blake_rcnf: how does it not fix your problem?16:52
cnfand maas won't mind that?16:52
blake_ryou add the whole subnet, and set that subnet to unmanaged16:52
blake_rin that subnet you create an IP range16:53
blake_rMAAS will only use those IP's in that range16:53
cnfthere is a /24, out of which i have non - sequential a /29 and a /2816:53
cnfwith different purposes16:53
blake_rcnf: that is fine16:53
blake_rcnf: add the whole /2416:53
cnfso i need to add the same /24 twice in maas16:53
blake_r cnf: define the ranges you want your IP's to be assigned in, that fall within the /29 and /28 16:54
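To make blake_r's suggestion concrete: with the whole /24 added once, each carved-out block becomes a start/end IP range inside it, so the /24 never has to be added twice. The sketch below just computes those ranges. The function name and the 192.0.2.x blocks are hypothetical; cnf's real addresses are not given in the conversation. Because the blocks are slices of the /24 rather than subnets of their own, the first and last address of each block are treated as usable.

```python
import ipaddress


def allocation_ranges(parent_cidr, owned_cidrs):
    """Return (first_ip, last_ip) for each owned block inside the parent subnet."""
    parent = ipaddress.ip_network(parent_cidr)
    ranges = []
    for cidr in owned_cidrs:
        block = ipaddress.ip_network(cidr)
        if not block.subnet_of(parent):
            raise ValueError(f"{block} is not inside {parent}")
        # block[0] and block[-1] are the lowest and highest addresses in the block.
        ranges.append((str(block[0]), str(block[-1])))
    return ranges


# Hypothetical /29 and /28 carved non-sequentially out of one /24.
for start, end in allocation_ranges("192.0.2.0/24", ["192.0.2.64/29", "192.0.2.128/28"]):
    print(start, "-", end)
```

The two resulting ranges are what distinguishes the /29 from the /28 once they are defined on the single /24.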
cnfdefine them where? how do i distinguish between the 2?16:54
cnfblake_r: i have no idea what you mean16:56
blake_rcnf: what subnets do you have now?16:57
blake_rcnf: in MAAS16:57
cnfuhm, tons16:58
cnfi have the /2416:58
cnfthe other 2 are a problem16:58
blake_r did you add those manually? or were they discovered? 16:58
cnfi have discovery turned off16:59
cnfthat was just a mess16:59
blake_r"the other 2"? did you add them manually or did they just show up?16:59
cnfthey are not defined16:59
cnfas i don't know how to16:59
mpontillocnf: well, MAAS will "discover" subnets outside of "device discovery"; it will automatically add subnets it finds configured on rack controllers16:59
blake_rcnf: okay16:59
blake_rcnf: what MAAS version?16:59
cnfmpontillo: i have _all_ discovery turned off17:00
cnf2.1.3+bzr5573-0ubuntu1 (16.04.1)17:00
mpontillocnf: there is no option to disable discovery of subnets found on rack controllers17:00
blake_rmpontillo: he means device discovery17:01
blake_rcnf: with your setup you will want the unmanaged subnet feature17:02
blake_rcnf: since that is a subnet that MAAS doesn't manage17:02
cnfwhich i guess is in 2.217:02
cnfright?17:04
blake_r sudo add-apt-repository ppa:maas/next-proposed 17:04
blake_r sudo apt update && sudo apt upgrade 17:04
cnfyeah, but that would completely break my juju openstack17:05
blake_rwhy is that? thought it didn't work at all?>17:05
cnfit's running, just without the ip's from the /2917:06
cnfwhich are the routable ones17:06
cnfwhich means i need weird tunnels to access the openstack17:06
cnfbecause, hell, putting them behind a single ip isn't something that works atm with charms17:06
cnfanyway, it's 19:07, time to go home...17:07
blake_rcnf: you can set static IP addresses on nodes17:07
cnfblake_r: not on containers17:07
blake_rcnf: ah true17:07
cnf so you need a LOT of routable IP's just to have a workable charms openstack 17:08
cnfwhich is what the /29 is for17:08
cnfi won't even start on the long, long list of other bugs i have on juju etc :(17:08
cnfa lot of which come from assumptions of network layouts17:09
cnfanyway, 7 pm, i'm hungry17:09
cnftomorrow is another day17:09
cnfi'll have to look at how the new spaces work, i guess17:09
cnfthanks for the help17:09
=== frankban is now known as frankban|afk
cnfaaand home18:00
mpontillocnf: please let us know how things work out (or don't work out) for you; you can bring things up on the maas-devel list and/or the bug tracker (Launchpad) if you want more visibility for your use cases18:10
mupBug #1651316 changed: Disks are found but not shown <MAAS:Fix Released> <https://launchpad.net/bugs/1651316>19:13
benlakewhat does it mean to use the “retain network configuration” option when commissioning?19:41
kikobenlake, the network interfaces, do you want them reset back to unbonded, unvlanned etc?20:07
kiko benlake, also, c'mon https://bugs.launchpad.net/maas/+bug/1418044 20:07
benlakeI promise I am going to do that thing!20:08
kikocnf, I'd love you to share the current issues you're running into with juju, ivoks nobuto and I are tracking this closely20:08
benlake so network temporarily reset to a default state, ignoring any setup done in maas? 20:08
benlake*setup - any network configuration adjustments setup in maas20:09
kikocorrect20:09
benlakekk20:09
cnfkiko: https://bugs.launchpad.net/~cnf is a start :P20:24
cnfkiko: a very large part of them are proxy related20:25
kikocnf, you and ivoks would be best friends20:31
kikocnf, we actually started on a plan with jamespage to address that more widely, let me find it20:31
cnfkiko: i have been in contact with jamespage for most of it20:32
cnfalso, place i work at is launching an RFI for an openstack install20:32
cnfso i'll be using that  channel to add some weight to some issues20:32
kiko cnf, https://www.dropbox.com/s/qvhi0wyfj87tyxq/PROXY.txt 20:33
kikocnf, see if that matches what you think could work. there is backwards-compatibility problem thrown in the mix but it should be solvable20:33
benlakefinally have a fully deployed node. really surprised the base install is 9.5GB O.o20:34
* benlake checks if he installed Windows20:34
cnf kiko: you can add "openstack picks up http-proxy and ignores no-proxy" 20:34
kikobenlake, 9.5GB can't be right. fully deployed with what?20:34
benlakeubuntu xenial20:34
kikobenlake, uhhh that can't be right20:34
benlake /dev/mapper/vgroot-lvroot  219G  9.5G  199G   5% / 20:34
kikocnf, "openstack" as in the system or the charms or what?20:34
cnfkiko: openstack as installed with charms20:35
cnftotally ignores no-proxy settings20:35
cnfso _nothing_ can talk to keystone20:35
cnfbecause my proxy can't talk to keystone20:35
kikocnf, but my question is whether the charms themselves ignore no_proxy or whether the systems are configured without them or..?20:35
cnfoh, no, no-proxy envs are set20:36
cnfit's populated wherever i know to look?20:36
cnfbut openstack just seems to ignore it20:36
cnfwhich isn't a juju problem, of course20:36
kikohmm... that's weird.20:36
cnfbut it causes problems with other software, which i can't fix with juju20:36
kikoso how does http_proxy get used by openstack itself?20:37
cnf kiko: which is where https://bugs.launchpad.net/juju/+bug/1681495 came from 20:37
benlake well, I found the reason. /swap.img is 8GB 20:37
cnfkiko: yes20:37
kikodoes it pick up from env vars set when launching the control plane services?20:37
kikobenlake, that looks more like it20:37
benlakeI guess we’ve switched to file based swap!20:37
cnfkiko: it sure looks that way, yes20:37
cnf kiko: jamespage was involved with debugging this, btw 20:37
kikobenlake, maybe we do that if you don't define a swap partition? I'm surprised tbh, also because I'm not a big fan of file based swap for i/o path reasons20:38
benlakeslightly unfortunate side effect of that transition is that the swap space is hidden. guess I’ll get used to that20:38
kikobenlake, can't you just define a swap partition?20:38
benlakeI did nothing special, just pushed buttons to get a thing deployed. wanted to see a success before mucking around20:38
benlake I have an existing PXE+preseed environment. Is there somewhere I can use my existing preseed with MAAS? 20:39
benlakeor do some merging?20:39
benlakehonestly, I need to continue reading the docs. This is the farthest I’ve made it, and most of my time had been on reading setup and troubleshooting20:40
cnfkiko: i also have a need for the openstack loadbalancer charm20:40
cnfbut i understand resources are not available for that atm20:40
kikocnf, for lbaasv2?20:40
cnf kiko: http://specs.openstack.org/openstack/charm-specs/specs/pike/approved/openstack-load-balancer.html 20:41
kikocnf, oh, on the infra layer -- L3 HA basically21:19
cnfuhu21:19
cnfsolves HA, AND openstack services not being on routable networks21:19
kikowe are probably likely to have to work on this -- our telco customers all have these requirements21:20
cnfcount me as "a telco customer" :P21:20
kikoand for many of them we've done the setup manually21:20
cnfkiko: our contact at canonical is Richard Card, i believe21:21
kikocnf, oh are you an existing customer?21:21
cnfwell, we just launched an RFI21:21
cnfwe'll be doing an RFP end Q3, early Q4?21:22
kikoah, yes, I read it earlier this week21:22
cnfit under Telenet, or Liberty Global21:23
cnfi think it's under Telenet21:23
sanjayhi22:34
sanjayI have an issue for node deployment in maas22:34
sanjay the deployment of ubuntu os gets completed but at the last moment the system goes to grub rescue mode 22:35
sanjay and then doesn't move on to a successful deployment 22:35
benlakeinteresting. I’m configuring the firewall of the maas controller, and I’ve noticed that enlisting discovers the power type when the firewall is off, but is unable to discover it with the firewall on. What service is providing that discovery?23:55
benlakeI figured it’d all be happening on the server using local IPMI tools and then hitting the maas api with the deets23:56
benlakemaybe a round trip to the api, to trigger a reach out to the IPMI to confirm, but I’m not blocking outbound connections, so not sure why that would break.23:57
