/srv/irclogs.ubuntu.com/2020/03/12/#cloud-init.txt

=== vrubiolo1 is now known as vrubiolo
shaykerenhi15:05
shaykerenI'm trying to write to the root home directory ~/ from userdata (/bin/sh) on an EC2 instance (AWS), but I'm getting an error because the directory does not exist. I found that the HOME env variable is empty15:07
shaykerenany idea?15:07
Odd_Blokeshaykeren: Instead of using ~, you could just use /root/?15:10
shaykerensure, but I would like to know why it works when my userdata runs with #!/bin/bash, and also the HOME env var is not empty in that case,15:20
shaykerenwhy does it behave differently?15:21
anankesounds like the issue with shell, not cloud init. is it /bin/sh from busybox or some other trimmed shell?15:22
shaykerenim using ubuntu ami15:24
Odd_Blokeshaykeren: Can you pastebin your userdata, please?15:25
anankethey may be using dash, who knows how that behaves15:25
shaykerenwait..15:28
shaykerenuserdata:15:35
shaykeren#!/bin/shecho test>~/testecho $HOMEecho "[cs18][$(date +"%T.%3N")] start of chronos user-data";echo "[cs18][$(date +"%T.%3N")] end of chronos user-data";15:35
shaykerencloud-init-output:15:36
shaykerenCloud-init v. 18.3-9-g2e62cb8a-0ubuntu1~16.04.2 running 'modules:config' at Thu, 12 Mar 2020 15:34:10 +0000. Up 25.08 seconds./var/lib/cloud/instance/scripts/part-001: 2: /var/lib/cloud/instance/scripts/part-001: cannot create ~/test: Directory nonexistent[cs18][15:34:12.364] start of chronos user-data[cs18][15:34:12.366] end of chronos user-data15:36
shaykerenI also saw that in case the userdata has #!/bin/bash it is executed as /bin/bash [part-001 path] and in case of #!/bin/sh it is executed as /bin/bash [part-001 path]15:39
shaykerenI also figured out that if I run this script manually (inside the ec2 instance) as /bin/sh [part-001 path] it runs ok without any issue15:40
Odd_Blokeshaykeren: Sorry, when I say pastebin I mean something like https://paste.ubuntu.com/.  Could you paste that all there and then post the link in here?15:43
Odd_Bloke(It's much easier to figure out what's going on when newlines are intact. :)15:43
shaykerensure my fault15:46
shaykerenhttps://paste.ubuntu.com/p/Sxrf9727n7/15:46
shaykerencloud-init-output logs: https://paste.ubuntu.com/p/37sK5ggYTb/15:48
Odd_Blokeshaykeren: OK, so that's the failing case, right?  What does the passing case look like?15:49
shaykerenchanging the #!/bin/sh to #!/bin/bash in the userdata15:50
Odd_Blokeshaykeren: And what does the output look like, if you could paste that too?16:03
shaykerenok let me run it with /bin/bash16:04
shaykerencloud-init-output - https://paste.ubuntu.com/p/TbdhvB7gkb/ , file was created under home directory of the root user16:12
shaykerenuserdata - https://paste.ubuntu.com/p/jV9WBTGBGM/16:13
shaykerenin both cases the HOME env variable was empty but in case of #!/bin/bash the file was created16:14
Odd_Blokeshaykeren: So I'm not sure why there's a difference in behaviour there, but I don't believe that cloud-init treats the two files differently.  So you're probably seeing differences in behaviour between bash and dash with the environment that cloud-init executes them in.16:17
Odd_Blokeshaykeren: If you want to dig into this more, then please file a bug at https://bugs.launchpad.net/cloud-init/+filebug after reproducing on a more recent version of cloud-init (using the latest image in EC2 would do the trick).16:19
Odd_BlokeAnd attach the output of `cloud-init collect-logs` on the /bin/sh instance, too.16:19
shaykerenit is weird because if I run it manually it runs with no issue16:19
Odd_BlokeYeah, it'll be something to do with the execution environment that cloud-init uses.16:20
Odd_BlokeIf you run it from a logged-in shell then there'll be a lot more environment variables around, for example.16:20
Odd_BlokeAnd cloud-init also runs early enough in boot that some stuff may not yet be on disk, which can sometimes affect behaviour.  (I doubt it in this case, but you never know.)16:21
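Editorial note: a plausible explanation for the bash/dash difference discussed above is tilde expansion when HOME is unset. The bash manual documents a fallback to the executing user's home directory from the passwd database, while the "cannot create ~/test" error suggests that /bin/sh (dash on Ubuntu) left the tilde unexpanded. The following is a minimal Python sketch of that fallback, not cloud-init or shell code; `resolve_home`, `expand_tilde`, and the injectable `lookup` parameter are hypothetical names introduced for illustration.

```python
import os
import pwd


def resolve_home(env, lookup=None):
    """Return the directory to substitute for a leading "~".

    Uses HOME when it is present in env; otherwise falls back to the
    passwd entry of the current user, which is the behaviour the bash
    manual describes. A shell without this fallback would leave "~"
    unexpanded when HOME is unset.
    """
    if "HOME" in env:
        return env["HOME"]
    if lookup is None:
        # Default fallback: passwd database entry for the current uid.
        def lookup():
            return pwd.getpwuid(os.getuid()).pw_dir
    return lookup()


def expand_tilde(path, env, lookup=None):
    """Expand a leading "~" or "~/"; return any other path unchanged."""
    if path == "~" or path.startswith("~/"):
        return resolve_home(env, lookup) + path[1:]
    return path
```

In userdata itself, writing to /root/ explicitly (as suggested earlier in the log) avoids depending on either behaviour.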
shaykerenok I'll collect the logs and upload them16:21
shaykerenThanks!16:22
Odd_Bloke:)16:22
blackboxswOdd_Bloke: this is a very thoughtful concern you have about ordering https://github.com/canonical/cloud-init/pull/114#discussion_r391222054 . What I think this means is that cloud-init in network_config only (always) adds route information to the first static IP address listed on an interface. Which means that we could currently be adding a normalized route to an IPv6 address (internal net_config subnet) xenial17:42
blackboxswand on an IPV4 interface on Bionic+ due to python key dict ordering difference right?17:42
blackboxswsorry that's a weighty question out of left field I realize17:42
blackboxswI'm trying to wrap my head around where/if this is a bug/shortcoming in existing cloud-init net_config translation for interfaces with multiple static addresses on a single interface17:43
blackboxswahh I see now. that for loop on routes is actually adding all routes ipv4 and ipv6 to the first additional static subnet that we configure on an interface.17:51
blackboxswtestsimple_render_bond_v2_input_netplan seems to be the only test that exercises this route adding17:54
blackboxswahh and ok I get it. we are ok on ordering changes because cloud-init renders netplan with routes on the interface, not the specific address to which we are attaching all those routes.17:57
blackboxswhere, https://github.com/canonical/cloud-init/blob/master/tests/unittests/test_net.py#L2061-L209917:59
blackboxswok routes in initial yaml-v2 get assigned to the first IPv4 or IPv6 in the internal network_config addresses list, and then get bubbled up to netplan config output under network.bonds.bond0.routes instead of hanging off under network.bonds.bond0.addresses[0] as our internal code may have implied. I'll add a comment about this in the code so we don't have to dig into this again next time18:01
Goneriblackboxsw, https://github.com/canonical/cloud-init/pull/62 you can merge the patch. It's all good here.18:02
blackboxswexcellent Goneri, will you take a followup work item PR for *BSD to sort the package_command('upgrade') call?18:05
GoneriYes, I think, but first, I will rebase OpenBSD branch.18:06
blackboxsw+1 no rush on that, just checking whether you agree that is something that should eventually be tackled18:07
GoneriI would also like to take a look at the Ephemeral DHCP thing, it's a bit of a pain in the neck18:07
Goneriunlike Linux, a BSD image should come with (mostly) zero packages. So it's less critical.18:08
Gonerithe base system is not part of a packaging system18:08
blackboxswGoneri: does the description of your PR now look acceptable? I'll be using that for the squash merge commit message https://github.com/canonical/cloud-init/pull/6218:17
blackboxswif you have any changes/corrections to the PR description text. please update and let me know when you are done reviewing it18:18
Goneriyeah, it's fine. I've updated the list of the OS used during the tests18:19
Gonericould you avoid a squash merge? I tried to isolate each patch as much as I could.18:20
Goneri(well, ignore what I just said, the result is not that great. /me goes hide)18:20
blackboxswheh Goneri project-wise we squash merge the world in cloud-init18:29
GoneriRoger that.18:29
blackboxswwhich is why generally we want to keep PRs  smaller and more concise18:29
blackboxsw... where possible18:29
GoneriYeah, I know the feeling :-)18:31
blackboxswmerged Goneri  thanks again for that work18:37
Goneri\o/ I've started the branch 1 year ago + 4 days :-) \o/18:46
Gonerithanks all, I feel emotional now!18:47
blackboxswheh holy moly18:48
blackboxswshould've landed that on the anniversary 4 days ago18:49
blackboxsw!18:49
blackboxswit may be worth an update to https://cloudinit.readthedocs.io/en/latest/topics/availability.html#distributions18:49
blackboxswto add NetBSD in there too18:49
blackboxswrharper: Odd_Bloke https://github.com/canonical/cloud-init/pull/114/files#r391823711      reality check for me. I think dropping public-ipv4 check in ec2 still allows us to properly setup dhcp on primary/fallback nic always.18:51
rharperblackboxsw: without looking, what's the benefit to dropping it? what sort of LOC or execution time are we saving ?19:01
blackboxswrharper: I don't think there is a really a benefit, probably just a risk19:01
blackboxswyeah one key lookup isn't going to break anything. so we can leave that logic in place19:02
blackboxswor reproduce it in the new v2 config19:02
blackboxswrharper: ec2 also doesn't actually add public IP configuration details to the instance at all. a work item that we should discuss19:04
blackboxswat some point.19:04
rharperblackboxsw: please file a bug for that with details, that way it gets on the backlog19:05
blackboxswcurrently we rely on dhcp and any secondary local-ipv4s or ipv6s. no additional config added for reading ec2's public_ipv4s19:05
blackboxswrharper: will do19:06
rharperAFAIK, they only publish public DNS names19:06
blackboxswrharper: https://cloudinit.readthedocs.io/en/latest/topics/instancedata.html#example-output shows an example19:06
blackboxswand this PR 114 shows the updates as well with the new EC2 API version19:06
blackboxswhttps://bugs.launchpad.net/cloud-init/+bug/186719719:15
ubot5Ubuntu bug 1867197 in cloud-init "ec2: add support for rendering public IP addresses in network config" [Undecided,New]19:15
Goneriblackboxsw, https://github.com/canonical/cloud-init/pull/25019:19
Odd_BlokeGoneri: Congrats on the branch landing! \o/19:32
powersjblackboxsw, what's left on your ec2 secondary nics branch?19:33
blackboxswpowers working it actively right now. ahh and rharper my question about public-ipv4s handling is moot in this branch as that config in the network_config  v1 version of Ec2 only setup dhcp: true if we were fallback_nic, had local-ipv4s or had public-ipv4s metadata values.. in the new v2 network config for Ec2 per PR #114 we are setting dhcp4: true on all nics... I'm looking at trying to reconstitute only adding19:34
blackboxswdhcp4: true if fallback_nic public-ipv4s or local-ipv4s19:34
blackboxswpowersj: I think the only thing left is resolving what I dropped regarding public-ipv4s reading from the metadata.19:35
blackboxswat least so we have no risk of regression, even though I believe it doesn't regress anything in its current state, best to be sure19:35
rharperblackboxsw: I don't understand the previous logic;  we always want to dhcp4 on primary nic;  and optionally add static ips if present19:36
rharperIIUC, that logic came before we parsed network config from IMDS, no ?19:36
blackboxswrharper: the release  ec2 network_config would have added network-config information that didn't enable dhcp4 at all on a device if nic is not primary and doesn't have public or local ipv4 addrs19:37
rharperyes, correct19:37
blackboxswthe *released* ec2 network_config19:37
rharperyes, I see that19:37
blackboxswthe network_config in PR 114, was setting up dhcp4 on all nics19:37
rharperah, and you're fixing that19:37
blackboxswregardless of secondary nic with no local-ipv4 addrs19:37
blackboxswright19:37
blackboxswI think that's a gap in the switch that I left19:38
rharperyes19:38
blackboxswso logic should be this for ec2:19:38
rharperwe dhcp *only* on primary; and for all nics (primary included) if there are secondary ips, add them (v4 or v6)19:38
blackboxswif fallback_nic(primary nic): dhcp4: true19:38
blackboxswif secondary nic: dhcp4 true only if local-ipv4s or public-ipv4s is present19:39
rharperwait19:39
blackboxswyep stating case 3 then will wait19:39
rharperI've never seen ec2 say you can dhcp on secondary nics19:39
blackboxsw3: if any nic and len(local-ipv4s) > 1  or len(ipv6s) > 1 then add those secondary IPs to nic config19:40
blackboxswrharper: I just tested that on an ec2 instance with 2 nics19:40
blackboxswdhcp on both gets you the proper matching ipv4 addrs and routes that secondary config would have setup statically19:41
blackboxswbut maybe that's unsupported?19:41
rharperbut it is not required19:41
rharperdon't we already have the private ips ?19:41
rharperhttps://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-instance-addressing.html19:41
rharpersays, yes you can dhcp on interfaces with private IP19:41
rharperI'm not seeing them say dhcp on secondary interfaces should work; though I believe it did work for you19:42
blackboxswright "We allocate private IPv4 addresses to instances using DHCP"19:43
blackboxswbut, what about secondary nics with only public ip addresses (is that even a thing?)19:44
rharperpublic ips are 'elastic ips'19:44
rharperI doubt those are allocated via DHCP19:44
rharperand I'm not sure you're going to get additional private ips on the same interface19:44
blackboxswif we had nic2, only public-ipv4s, no local-ipv4s, would dhcp work. I think not as there is no private IP allocated to that nic19:45
rharpercan you confirm that your secondary nic dhcp response included more than one IP ?19:45
rharperyou always get a private ip (local-ipv4) no ?19:45
blackboxswrharper: yeah will setup one now19:45
rharperthats how you communicate nic to nic internally from instance to instance19:45
blackboxswI think we always get a private IP on any attached interface19:45
blackboxswright19:45
rharperyes19:45
blackboxswso I think my comment about "public-ipv4s" existing and no local-ipv4s is *not* a thing (not a viable vm network config)19:46
rharperso, yes you can DHCP on all nics if we wanted;  I'm just wondering if that's useful vs. just assigning static ips19:46
blackboxswI think a prerequisite of having the attached nic is that it *must* have a local-ipv4s addr19:46
blackboxswrharper: right, probably not as useful. we could avoid doing that. though if ec2 instruments custom dhcp options we'd miss out due to our static ip config on secondary nic19:47
rharperwell, let's see what dhcp response shows up on secondary nics; I suspect it's the same as the primary19:47
blackboxswbut if we add dhcp, on all nics ec2 vms also pay the cost of that dhcp roundtrip right19:47
blackboxswsounds good. will setup the instance now19:47
rharperwe can check what ec2net-utils does as well19:47
rharperand check with Fred19:48
blackboxswso powersj the branch is close, I think we are circling the drain on final implementation. it doesn't involve a whole lot of work either way and I'd like to see it landed if we can today or tomorrow so we can get it into the CPC image pipeline19:49
blackboxswfor focal19:49
powersjblackboxsw, how do you and rharper close on the remaining work?19:50
blackboxswfor this current ec2 branch?19:50
powersjyes19:50
rharperpowersj: it's not going to happen in the short term19:51
rharperwe need to step back and confirm how we want to do it19:51
blackboxswI think I have to spin up an instance, we need to check dhcp config output and confirm we aren't missing something interesting by using static addresses?19:51
rharperlet's look at the AmazonLinux package; even see what a multi-nic multi-ip AmazonLinux instance looks like19:51
rharperand then, I'd check with Fred to see if that's optimal, or if there are better things to do; and then we can make a decision19:52
rharpernow, if we wanted to "land it today"  I'd keep it with dhcp on primary, and then *static ips only*  if present on all other interfaces present19:52
blackboxswok fair. so sample configs on multi-nic ubuntu and multi-nic amazonlinux and suggest what's best in email to fred/ahnvo ?19:53
rharperblackboxsw: once we enable DHCP on all interfaces, we certainly need to do route-metric again19:53
rharperjust fred19:53
blackboxswrharper: why do we need route-metric if ec2 isn't using classless routes in dhcp?19:53
rharperAhn doesn't care about Ec2 networking I bet =)19:53
rharperif it has a gateway19:53
rharperyou don't want it to clobber primary route19:53
blackboxswahh I thought it was only gateway, plus classless static routes in dhcp that caused this concern19:54
rharpergateway is a route19:54
rharperthe default19:54
rharperclassless are additional routes (which may include a default route as well) in which you ignore the gateway value,19:54
rharperwe know they don't currently put in a classless static route set; but there may be a GATEWAY= value  in which case we still need route metrics to ensure that we don't route packets meant for the internet out of the secondary interfaces19:55
blackboxswok so short term potential of only enabling dhcp on primary interface and static for all the other  nics, would that get us into an upgrade pickle if we went dhcp after discussion with Fred?19:55
rharperit may not have a GATEWAY value, but it could show up (accidental or on purpose) so it's best to put a metric on secondary interfaces19:56
rharperyeah, unless we render network on every boot19:56
rharperwhich we should discuss, with Fred;  I believe we already do each boot but on Ec2 classic only19:56
blackboxswrharper: I think we also have CPC image magic that invalidates the cache on cloud images. but can confirm19:59
blackboxswso that'd be rendering network every boot, everywhere but that may also be limited to a specific ubuntu series19:59
rharperI don't think so20:01
rharperon Azure we added a dropin to rm the obj.pkl20:01
rharperonly on ec2-classic do we render every boot; due to MAC address on nic changes between stops/starts20:02
rharperon vpc, all is fine20:02
rharperthis reminds me of wanting a table on datasource capabilities (check_instance_id, network_config, update_event types)20:02
blackboxswrharper: https://paste.ubuntu.com/p/ZyFKkxPDCB/20:05
blackboxswso I'm on a vpc instance (non-classic) and I see cache invalid for Ec2Local ds detection across simple reboots20:06
rharper            # Non-VPC (aka Classic) Ec2 instances need to rewrite the20:06
rharper            # network config file every boot due to MAC address change.20:06
rharper            if self.is_classic_instance():20:06
rharper                self.update_events['network'].add(EventType.BOOT)20:06
rharperso I don't know what's going on in the image but I do know what code we wrote20:07
blackboxswright agreed on what that code does. there is just some drop in magic at play here I think in ec2 images20:07
rharperand we don't persist object.pkl on ec220:07
rharperas it doesn't implement check_instance_id()20:07
blackboxswI've wrapped myself around the axle at the moment on this. first I'll get that multi-nic instance up with PR 114 so we can dissect the dhcp response from networkd20:08
rharperblackboxsw: Odd_Bloke: interested in your thoughts on this: https://github.com/canonical/cloud-init/pull/238#issuecomment-59840858220:49
rharperwhen you have time to look20:49
blackboxswrharper: sorry here's dhcp info on dual-nic ec2 vm https://pastebin.ubuntu.com/p/tg2ZhZ3V6Z/20:50
rharperROUTER=172.31.32.120:50
rharperthat will put in a default route20:50
rharperso, we definitely want a dhcp-route-metric20:50
rharperif we go with dhcp on secondary interfaces20:50
blackboxswso metrics required in this case. ok20:51
blackboxswroute-metric rather.20:51
rharperdhcpX-overrides: {'route-metric': NNN}20:51
blackboxswand /me just removed it. sorry, a more concerted discussion was in order yesterday or the day before to make sure I was going down the right path.20:51
rharperhehe20:51
rharperblackboxsw: also, on your unittests there was a bunch of mac.lower() after you had capitalized one of the MAC values;  what was that about ?20:51
blackboxswrharper: I think that was earlier me making sure we exercised some of the internal logic in cloud-init which I know lower()'s  the mac addr we get from IMDS. I should have instead just added a specific unit test that validated uppercase and lowercase macs result in same rendered net config20:53
rharperblackboxsw: ah, ok, yeah; less splash damage to other  tests20:53
blackboxswyeah, and clear documentation of the intent20:53
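Editorial note: the dedicated test described above could look roughly like the sketch below. It assumes a stand-in renderer, `render_nic_config`, which is hypothetical and is not cloud-init's actual function; the MAC address is made up. The point is only that a single focused test documents the intent (MAC case must not affect the rendered config) without touching other tests.

```python
def render_nic_config(macs_metadata, local_macs):
    """Hypothetical renderer mapping metadata MACs onto local NIC names.

    macs_metadata: {mac: {...}} in the style of an IMDS "macs" dict.
    local_macs: {mac: nic_name} as discovered on the instance.
    """
    # Normalize both sides so the comparison is case-insensitive.
    by_mac = {mac.lower(): name for mac, name in local_macs.items()}
    config = {}
    for mac in sorted(macs_metadata):
        name = by_mac.get(mac.lower())
        if name is None:
            continue  # metadata describes a NIC that is not present locally
        config[name] = {"match": {"macaddress": mac.lower()}, "set-name": name}
    return config


def test_mac_case_does_not_change_rendered_config():
    """Upper- and lowercase MACs in metadata must render identically."""
    local = {"06:17:04:d7:26:09": "eth0"}
    lower = render_nic_config({"06:17:04:d7:26:09": {}}, local)
    upper = render_nic_config({"06:17:04:D7:26:09": {}}, local)
    assert lower == upper
```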
blackboxswrharper: ok, so what path do we want to go on for focal for ec2 secondary nics do you think?20:54
blackboxswstatic addr setup on secondary nics?20:54
blackboxswas it stands currently, it looks like published cloud-init on bionic only configs primary nic even on dual-nic boxes20:54
rharperlet's look at ec2utils and see what AmazonLinux does; if they dhcp + additional private ips; then I think we do the same20:54
blackboxswhrm, I see a bunch of primary actions (on eth0 only) https://github.com/aws/ec2-net-utils/blob/master/ec2net-functions21:08
blackboxswchecking around for stuff handling 2nd nic21:08
blackboxswwhich calls plug_interface for all interfaces and only activate_primary on each hotplug add21:10
blackboxswhttps://github.com/aws/ec2-net-utils/blob/master/ec2ifup21:10
blackboxswyeah seems in all cases rewrite_primary gets called which noops on !eth021:11
rharperno, it ignores eth021:11
rharperthat's code for all other interfaces21:12
rharperno ?21:12
rharperso they do dhcp on non-eth021:12
rharperand then ensure the rules for non-eth0 don't clobber eth0 (which I'm sure they already dhcp on eth0 )21:12
rharperblackboxsw: so, to me, that's equivalent to what we're suggesting now;  dhcp on all interfaces, add secondary ips on all interfaces that have them, and ensure non-primary interfaces have a route-metric (which is what they do with the route table bits)21:13
blackboxswhahah right complete misread21:13
blackboxswok will reconstitute the route-metric bits.21:14
blackboxswRTABLE=${INTERFACE#eth}21:15
blackboxswlet RTABLE+=1000021:15
blackboxswright21:15
blackboxsw10000 offset per nic21:15
blackboxswo k21:15
blackboxswso they add route-metric on nic >= 121:15
blackboxswis there a base route metric I wonder21:15
blackboxswon eth021:16
blackboxswwhich is what we do on Azure21:16
rharperah, its not a metric21:16
rharperits a different routing table altogether21:16
rharperbut it has the same effect in that the primary route table is consulted *first* before looking at higher value tables21:16
blackboxswrharper: I was curious if different routing table name/id is equivalent to different metric21:16
blackboxswright21:16
blackboxswij21:16
blackboxswok21:16
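Editorial note: the two shell lines quoted above (RTABLE=${INTERFACE#eth}; let RTABLE+=10000) strip the "eth" prefix from the interface name and add 10000 to get a per-interface routing table ID. A minimal Python equivalent is sketched below; the function name `ec2_route_table_id` is hypothetical, and unlike the shell version it raises on names that do not look like ethN.

```python
def ec2_route_table_id(interface, offset=10000):
    """Compute a routing table ID from an interface name such as 'eth1'.

    Equivalent in spirit to:
        RTABLE=${INTERFACE#eth}
        let RTABLE+=10000
    """
    prefix = "eth"
    if not interface.startswith(prefix):
        raise ValueError("expected an interface named ethN, got %r" % interface)
    # int() raises ValueError if the suffix is not a plain number.
    return int(interface[len(prefix):]) + offset
```

With the default offset, eth1 maps to table 10001 and eth2 to table 10002.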
blackboxswrharper: so plan of attack for cloud-init ec2 multi-nic, multi-ip21:21
blackboxswhttps://hackmd.io/rBjW9rjPRg6LYydxOgW8cQ?sync=&type=21:21
rharperright, we dhcp6 as well if interface has ipv6 right?  so the logic is the same for static ipv6 secondary ips as v421:23
blackboxswright. yes rharper21:26
rharpercool21:26
blackboxswdhcp6 only active ipv6s values21:26
blackboxswok I can correct this branch as I think all it needs is route-metrics at the moment21:27
rharperexcellent21:29
rharperDo we need a doc update to the Ec2 Datasource docs w.r.t network configuration ?21:29
rharperif not, I think it would be a good add to describe what we plan to configure in the multi-nic, multi-ip scenarios (v4, v6 and mixed)21:29
blackboxswrharper: can add that to the datasource since we are touching it w.r.t. secondary_nic config option21:30
blackboxswmakes sense21:30
blackboxswrharper: so route-metric: 0 for eth0?21:31
blackboxswor route-metric: 10021:31
rharperno, we did 10021:31
blackboxswfor eth021:31
blackboxswright ok21:31
blackboxswjust wanted to confirm21:31
rharper(index  + 1) * 10021:31
rharpersame as azure IIRC21:31
blackboxswagreed21:31
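Editorial note: the plan agreed above (DHCP on every NIC, static secondary addresses where present, and a route metric of (index + 1) * 100 per NIC) can be sketched as a netplan v2 style dict. This is an illustration under assumptions, not the code from PR 114: `build_ec2_network_config` and the input dict shape are hypothetical, the sample values are made up, and dhcp6 handling is omitted for brevity. The `dhcp4-overrides` / `route-metric` keys are netplan's.

```python
def build_ec2_network_config(nics):
    """Build a netplan-v2-style config dict for a list of NICs.

    nics: list of dicts in device order, each with 'name', 'mac', and an
    optional 'secondary_ips' list of CIDR strings (hypothetical shape).
    """
    ethernets = {}
    for index, nic in enumerate(nics):
        cfg = {
            "match": {"macaddress": nic["mac"].lower()},
            "set-name": nic["name"],
            "dhcp4": True,
            # Primary NIC gets 100, the next 200, and so on, so default
            # routes learned on secondary NICs do not override the primary.
            "dhcp4-overrides": {"route-metric": (index + 1) * 100},
        }
        secondary = list(nic.get("secondary_ips", []))
        if secondary:
            # Secondary IPs are not handed out by DHCP; add them statically.
            cfg["addresses"] = secondary
        ethernets[nic["name"]] = cfg
    return {"version": 2, "ethernets": ethernets}


example = build_ec2_network_config([
    {"name": "eth0", "mac": "06:17:04:D7:26:09"},
    {"name": "eth1", "mac": "06:17:04:d7:26:0a",
     "secondary_ips": ["172.31.45.70/20"]},
])
```

Here eth0 ends up with route-metric 100 and no static addresses, while eth1 gets route-metric 200 plus its one secondary address.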
=== ananke_ is now known as ananke
