/srv/irclogs.ubuntu.com/2020/08/24/#cloud-init.txt

makara1	what's a nice way to serve cloud-init configuration?	07:21
makara1	put 'python3 -m http.server' in a unit file?	07:22
=== hjensas__ is now known as hjensas|lunch
makara1	can I get cloud-init to download a git repo and kick off a build?	13:42
beantaxi	How can I do the following with cloud-init: 1. Have a cloud-config section that does once-per-instance setup 2. A bash script that continues that once-per-instance setup, and 3. A bash script that executes once every boot. I see in the docs how I could create a multipart mime which contains a cloud-config and 2 scripts, but not how to order them or to associate them with lifecycle. Thanks!	13:52
otubo	smoser, I'm reworking your code here: https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+ref/fix/1788915-vlan-sysconfig-rendering	14:02
otubo	smoser, should I add signed-off-by on the pull-request>	14:02
otubo	?	14:02
meena	https://github.com/canonical/cloud-init/pull/521#issuecomment-678476067 ⬅️ is this the use case for scapy? https://scapy.readthedocs.io/en/latest/introduction.html	14:09
=== hjensas|lunch is now known as hjensas
otubo	smoser, I'm gonna just leave a signed-off-by with your name, if you don't want it I can remove it. :)	14:28
rharper	morning, blackboxsw_ fyi the daily PPA builds for bionic/xenial are failing, https://launchpadlibrarian.net/494499147/buildlog.txt.gz https://launchpadlibrarian.net/494499148/buildlog.txt.gz	14:29
blackboxsw_	morning folks.	14:59
smoser	otubo: that's fine with me.	15:02
smoser	meena: we have a use-case for cloud-init to be able to do a dhcp request and read the response.	15:03
smoser	one path to that is through scapy (and a seemingly simple one)	15:04
smoser	but scapy's dependency stack is not small :-(	15:04
smoser	the difficulty we had when looking at a dhcp client in python is that all the example solutions do not use raw sockets.	15:07
smoser	and thus... have an ip address already.	15:07
smoser	s/have/assume/	15:07
=== waxfire7 is now known as waxfire
meena	@smoser scapy would also allow to do a lot of the network discovery work without heaps and heaps of guesswork	15:10
meena	smoser: what makes scapy's dependency stack so big? i thought it only depends on libpcap?	15:12
smoser	https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=907235	15:13
ubot5	Debian bug 907235 in python3-scapy "python3-scapy: Please provide package without large 'Recommends'" [Normal,Fixed]	15:13
meena	da fuck	15:14
smoser	that is fixed in debian (and i think in ubuntu as a result), but not in all versions of ubuntu and not in many other OSes	15:14
meena	who does that? why	15:14
meena	the way new packages get into old versions of debuntu remains a mystery	15:17
minimal	Hi folks, so what's the current ETA for 20.3?	15:31
meena	"when it's done" ;P	15:33
minimal	is anything ever done? ;-)	15:33
smoser	meena: they don't get into old versions of ubuntu.	15:55
smoser	by design. stable things are stable.	15:55
blackboxsw_	smoser: thanks for the review last week https://github.com/canonical/cloud-init/pull/543 in response. no longer handling non-decodable or non-gzipped content for user-data	16:03
smoser	ACK'd thanks	16:06
beantaxi_	Hello all ... the docs give this example of combining a cloud-config and a bash script in a single mime multipart: % cloud-init devel make-mime -a config.yaml:cloud-config -a script.sh:x-shellscript > user-data	16:24
beantaxi_	However, I'm not sure how that specifies the run order of cloud-config vs the shellscript, or how it specifies how often the shell script runs: per once, per instance, or per boot. Thanks!	16:25
smoser	x-shellscript run per-instance.	17:19
smoser	cloud-config is parsed before x-shellscript.	17:19
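
For reference, the make-mime command quoted above can be approximated with Python's standard email package. This is only a rough sketch, not cloud-init's own implementation: the helper name make_user_data and the sample cloud-config and script contents are made up for illustration.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def make_user_data(parts):
    """Build a MIME multipart document from (content, subtype, filename) tuples.

    make_user_data is a hypothetical helper name, not a cloud-init API.
    """
    msg = MIMEMultipart()
    for content, subtype, filename in parts:
        # subtype "cloud-config" yields Content-Type: text/cloud-config
        part = MIMEText(content, subtype, "utf-8")
        part.add_header("Content-Disposition", "attachment", filename=filename)
        msg.attach(part)
    return msg.as_string()


# Sample payloads, invented for this sketch.
cloud_config = "#cloud-config\npackages:\n  - git\n"
script = "#!/bin/bash\necho 'per-instance setup'\n"

user_data = make_user_data([
    (cloud_config, "cloud-config", "config.yaml"),
    (script, "x-shellscript", "script.sh"),
])
```

As smoser says, the order of the parts in the file does not change the lifecycle: the cloud-config part is parsed first and the x-shellscript part runs per instance.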
beantaxi_	smoser: Thanks. So cloud-config runs before shell scripts no matter what. That's fine for me, and I'm guessing if that's a showstopper then runcmd is your friend. Just checking.	18:40
meena	smoser: what about … fixing broken packages in older versions of Debuntu? like when somebody accidentally switched Recommends and Suggests?	18:41
smoser	runcmd runs in roughly the same time frame as the shell scripts.	18:42
smoser	actually... oddly, based on filename. those actually (iirc) get thrown in a dir, and then run with run-parts. and the runcmd script name is 'runCmd'. so it's weird.	18:42
beantaxi_	As for 'x-shellscript run per-instance' ... I'm afraid I'm not sure what you mean. I do see in the code that it looks like lifecycle gets controlled by the /var/lib/cloud/per-[once|instance|boot] folders, but I'm not sure how to use cloud-init to populate those folders. Thanks!	18:43
smoser	meena: it all kind of depends.	18:43
* meena is a big fan of "if broken it is, fix it you should"		18:43
smoser	if you really really wanted to get that change in, you have to convince someone on the SRU team that it is important enough.	18:43
smoser	i tend to be a fan of keeping behavior the same.	18:44
smoser	as the primary value of a platform is to put things on top of. it's harder to build on a platform if the platform is changing.	18:44
smoser	thus.... the value of a "stable platform" is almost entirely the stability	18:44
smoser	but... it's not me that has to be convinced, and, while ubuntu has guidelines on such things, they're not always 100% consistently applied.	18:45
meena	smoser: well, yes, but the behaviour did change, no?	18:45
smoser	it changed from one stable platform to the next. and that's fine. when you start to move your house to that next stable platform, then you deal with it.	18:46
smoser	i don't remember the exact values here, but let's just say:	18:47
smoser	16.04: little recommends	18:47
smoser	18.04: big recommends	18:47
smoser	20.04: little recommends	18:47
smoser	that's perfectly fine. it sucks that 18.04 has got this wart there, but people that are building on it expect the wart.	18:48
meena	aye.	18:49
smoser	but again... these are my feelings towards that sort of thing. the person who decides is a Stable Release Team member, and (in my experience) their individual opinions and behaviors differ.	18:49
meena	but yeah, i agree that having a dedicated python3-scapy only package without the kitchensink would be preferable… which of course won't be backported, eh?	18:49
smoser	and sometimes even differ based on the time.	18:49
smoser	they are humans, and as a result have a very hard time remaining consistent.	18:50
smoser	largely RHEL and SuSE have similar processes/values	18:50
meena	some people make horrible decisions if they didn't have breakfast, or if their underpants are too tight.	18:50
smoser	so to depend on scapy, i think for the near term, the thing we have to do is make it optional, with a fallback to the questionable decision that I made some years ago to use dhclient like we do	18:52
smoser	(see.... consistency hard for people!)	18:52
meena	smoser: see, if you had gone with scapy years ago, you could've kept it from ballooning in size in certain versions of debuntu… and we'd… have a very different networking refactor on our hands rn ;)	18:59
rick_h	minimal: meena sorry, didn't open this up yet today. We're waiting on one final PR and will start the release/SRU processes. falcojr has the PR up but needs review/etc so hopefully that goes through today and we'll start the release tomorrow.	19:10
meena	rick_h: oh, so it is an SRU release	19:11
meena	also, i forgot what that means again	19:11
powersj	stable release update, the exception to the "don't update stable releases" rule	19:14
powersj	for cloud-init we take the latest upstream code, push it to the devel release and then test it against the supported releases	19:15
powersj	in this case xenial, bionic, and focal	19:15
beantaxi_	smoser: I am quite confused ... but as best as I can tell: I can add bash scripts to my multi-part mime, and they will be run once per instance, unless I override scripts-user in my cloud-config to [scripts-user, always], in which case my bash scripts will execute on boot.	19:22
beantaxi_	As for my wanting to have one script execute per-instance, and one execute per-boot ... it looks like I could only achieve that by manually copying files to the /var/lib/cloud/per-boot and -instance folders. There's no way to do that otherwise. Is that all correct?	19:22
rharper	beantaxi_: I suggest that you use write_files and then emit them into the scripts directory for the frequency desired, /var/lib/cloud/scripts/per-{boot,instance,once}	19:29
smoser	yeah.	19:29
smoser	it does seem useful to have a way to put those in user-data. but as it is, writing files to /var/lib/cloud/scripts/per-boot is the only real way.	19:31
beantaxi_	Ah, so in a pinch write_files can be used to put whatever files I want, wherever I want them. That's useful.	19:36
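
A small sketch of rharper's write_files suggestion, generating the cloud-config from Python. The script names init-image.sh and init-vm.sh and the helper names are assumptions based on beantaxi_'s description; the output is JSON, which is relied on here as being valid YAML.

```python
import json

# Directory layout as given by rharper above.
SCRIPTS_DIR = "/var/lib/cloud/scripts"
FREQUENCIES = ("per-boot", "per-instance", "per-once")


def script_entry(name, frequency, body):
    """Return one write_files entry placing a script in the chosen frequency dir."""
    if frequency not in FREQUENCIES:
        raise ValueError("unknown frequency: %s" % frequency)
    return {
        "path": "%s/%s/%s" % (SCRIPTS_DIR, frequency, name),
        "permissions": "0755",  # scripts must be executable to be run
        "content": body,
    }


def render_cloud_config(entries):
    """Render a cloud-config document; JSON is emitted as a YAML-compatible form."""
    return "#cloud-config\n" + json.dumps({"write_files": entries}, indent=2) + "\n"


# Hypothetical scripts: one per-instance, one per-boot.
config_text = render_cloud_config([
    script_entry("init-image.sh", "per-instance", "#!/bin/bash\necho image setup\n"),
    script_entry("init-vm.sh", "per-boot", "#!/bin/bash\necho vm setup\n"),
])
```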
beantaxi_	Would it be technically possible to implement something a bit more like I'm describing, by eg allowing an optional header for a dest folder (or more indirectly specifying a 'per	19:42
beantaxi_	')? Or is the mime-parsing code handled by the cloud providers rather than by cloud-init	19:43
smoser	by cloud-init	19:43
smoser	we could absolutely improve cloud-init to do what you're after.	19:43
beantaxi_	I'm happy to take a lame swing at it ... but I'd want to make sure I'd set expectations sufficiently low	19:44
beantaxi_	In my specific use case, I have a little family of VM initialization scripts, where each VM has an init-image and an init-vm script. Naturally I found out about cloud-init about 5 minutes after I was done.	19:46
smoser	https://cloudinit.readthedocs.io/en/latest/topics/format.html#part-handler	19:46
smoser	you could use a part-handler to do what you're after.	19:47
beantaxi_	So I'd like to use the scripts as-is -- where init-image is per-instance and init-vm is per-boot, and also start transitioning my per-instance stuff to cloud-config, which I am sure over time I will prefer.	19:47
beantaxi_	Ok nice. I was not quite sure what those were for.	19:47
beantaxi_	K, this is making total sense	19:48
beantaxi_	The part-handler docs link to a blog post from 2011 from 'Foss Boss'. Is this info likely still current?	19:49
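
A rough sketch of what such a part-handler could look like, following the list_types()/handle_part() interface described in the format docs smoser linked. The MIME type text/x-per-boot-script and the write_script helper are invented for this example, and the handler has not been tested against a real cloud-init run.

```python
#part-handler
import os

# Target directory as named by rharper earlier in the log.
PER_BOOT_DIR = "/var/lib/cloud/scripts/per-boot"
# Hypothetical custom MIME type; cloud-init does not define it.
HANDLED_TYPE = "text/x-per-boot-script"


def list_types():
    """Tell cloud-init which MIME types this handler wants to receive."""
    return [HANDLED_TYPE]


def write_script(target_dir, filename, payload):
    """Write payload into target_dir as an executable file and return its path."""
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, os.path.basename(filename))
    mode = "wb" if isinstance(payload, bytes) else "w"
    with open(target, mode) as handle:
        handle.write(payload)
    os.chmod(target, 0o755)
    return target


def handle_part(data, ctype, filename, payload):
    """Called once with __begin__, once per matching part, and once with __end__."""
    if ctype in ("__begin__", "__end__"):
        return
    if ctype == HANDLED_TYPE:
        write_script(PER_BOOT_DIR, filename, payload)
```

With something like this, a per-boot script would be attached to the multipart user-data under the custom type, while ordinary x-shellscript parts keep running per instance.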
blackboxsw_	falcojr: did you have an azure instance up that exhibits https://github.com/canonical/cloud-init/pull/539?	19:58
* blackboxsw_ is trying to get that review complete so we can cut 20.3		19:58
falcojr	blackboxsw_ bah, sorry, didn't get the notification	20:36
falcojr	I have one with the fix currently	20:36
falcojr	I could spin one up quickly that shows the problem though	20:36
falcojr	it just needs to have accelerated networking (on the network tab if you're in the UI)	20:36
falcojr	I used a focal DS2 v2 instance	20:37
falcojr	s/focal/bionic	20:38
blackboxsw_	+1 I'll launch an instance. were you using amazon linux or ubuntu?	20:46
powersj	O.o amazon linux on azure?	20:47
blackboxsw_	hahah	20:47
blackboxsw_	oops brain fail	20:47
blackboxsw_	probably not a product that azure is currently selling :)	20:47
powersj	:D	20:48
falcojr	haha, it should happen on focal or bionic. I used bionic	20:50
blackboxsw_	falcojr: strange as I've added accelerated networking on a DS2v2 instance and not seeing the delay on bionic	21:00
blackboxsw_	ubuntu@test-b1:~$ systemd-analyze blame	21:00
blackboxsw_	8.553s cloud-init.service	21:00
blackboxsw_	6.857s cloud-init-local.service	21:00
falcojr	did you reboot?	21:07
falcojr	blackboxsw_ ^	21:07
falcojr	it doesn't happen on first boot	21:08
blackboxsw_	falcojr: rebooting now. sure enough 1min 31.262s cloud-init-local.service	21:48
blackboxsw_	thanks	21:48
rharper	blackboxsw_: if you're on a box now, I just posted some questions on that PR; would be nice to see requested info attached to the bug that PR addresses	21:49
blackboxsw_	+1 rharper	21:49
blackboxsw_	I'm also seeing the type of failure mode I think we were looking for on current bionic images, like a failed "rename3" interface lying around post reboot	21:50
blackboxsw_	with matching mac addr	21:50
rharper	yeah, I'm super interested to confirm whether there are two hv_net nics, or if there's a different sriov driver name (we already blacklist the mlx4_core)	21:51
rharper	blackboxsw_: I wonder if we just dropped including the driver value in the emitted netplan	22:00
blackboxsw_	rharper: checking now. we only blacklist mlx4_core on fallback network right?	22:00
blackboxsw_	on azure	22:00
rharper	yeah	22:00
* blackboxsw_ double checks		22:01
rharper	that can't be right	22:01
blackboxsw_	also responded first here: https://github.com/canonical/cloud-init/pull/539#discussion_r475918457	22:01
* blackboxsw_ tries to recall how to attach binary (cloud-init.logs.tar.gz) to github comments		22:01
blackboxsw_	if that's even possible	22:01
rharper	you can attach to the bug it references	22:02
rharper	mlx5_core	22:02
rharper	looks like we'll need to blacklist both 4 and 5 (and fix non-fallback path to include drivers)	22:03
rharper	do we know if IMDS reports the sriov interface at all ?	22:03
blackboxsw_	checking	22:15
blackboxsw_	rharper: nope ... http://paste.ubuntu.com/p/CFjtY7JHm4/	22:16
blackboxsw_	well "yes" in that it only reports one interface	22:16
rharper	yeah, I don't think it did	22:17
blackboxsw_	and both share the the same mac	22:17
blackboxsw_	:)	22:17
blackboxsw_	yeah	22:17
rharper	I recall asking in the past	22:17
rharper	blackboxsw_: I replied; I think the fix is good; but let's update the commit message; the issue here is that once we started rendering from IMDS, we regressed Accelerated Networking setups due to not including the hv_netvsc driver name in our match; when we were v1 rendering, that emitted a udev rule which matches the MAC and DRIVER != $blacklist	22:19
rharper	the fix for netplan is what's there (binding the emitted netplan to the hyperv driver nic)	22:19
rharper	on the fallback path, we need both mlx4 and mlx5 core (xenial running on AdvNet instances with mlx5 will likely have this hang as well)	22:20
blackboxsw_	rharper: adding https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1830740/comments/11 while I look at this	22:20
ubot5	Ubuntu bug 1830740 in linux-azure (Ubuntu) "[linux-azure] Delay during boot of some instance sizes" [Undecided,New]	22:20
rharper	I wonder if sysconfig rendered instances see failures, depends on what the udev rules look like	22:21
rharper	cool	22:21
blackboxsw_	why wouldn't we want to just blacklist on the mlx* IMDS-based network config	22:22
blackboxsw_	and on fallback config	22:22
blackboxsw_	ok confirmed eth0 is driver/module: module -> ../../../../module/hv_netvsc rename3 : module -> ../../../../module/mlx5_core	22:30
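
The driver check above can be sketched in Python by reading the sysfs driver symlink. This assumes the usual /sys/class/net/<iface>/device/driver layout; the function name device_driver and the sys_class_net parameter are illustrative, not cloud-init code.

```python
import os


def device_driver(iface, sys_class_net="/sys/class/net"):
    """Return the kernel driver name bound to iface, or None if it has none.

    Assumes <sys_class_net>/<iface>/device/driver is a symlink whose last
    path component is the driver name (e.g. hv_netvsc or mlx5_core).
    """
    link = os.path.join(sys_class_net, iface, "device", "driver")
    if not os.path.islink(link):
        return None  # virtual interfaces such as lo have no device/driver link
    return os.path.basename(os.readlink(link))
```

On the instance discussed here, one would expect this to report hv_netvsc for eth0 and mlx5_core for rename3.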
blackboxsw_	so yeah, just wondering if maybe we should make sure we blacklist both mlx5_core and mlx4_core in both fallback config and imds-based config we write	22:30
blackboxsw_	so we don't unintentionally ignore other device types.	22:31
blackboxsw_	or are there any other device types that would present "hv_netvsc" for one interface and bool(not any(['mlx4_core', 'mlx5_core']))	22:32
blackboxsw_	guess that's a question for both falcojr and rharper	22:32
rharper	blackboxsw_: so blacklist only works for eni and sysconfig which emit udev rules	22:33
blackboxsw_	ohhh doh	22:33
rharper	for netplan, the 50-cloud-init.yaml needs to use the match driver=hv	22:33
rharper	with the mac	22:33
blackboxsw_	ok ok, so drive matching is the inverse approach for netplan	22:33
blackboxsw_	driver* matching	22:33
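
To illustrate the netplan side, here is a sketch of a v2 config that binds an interface by both MAC and driver, as rharper describes. The function name and the MAC address are placeholders, and this is not the code from the PR under review.

```python
import json


def netplan_for_hv_nic(name, macaddress, driver="hv_netvsc"):
    """Build a netplan v2 dict matching one NIC by MAC address and driver.

    Matching on the driver keeps the SR-IOV device (mlx4_core/mlx5_core),
    which shares the same MAC, from also matching this stanza.
    """
    return {
        "network": {
            "version": 2,
            "ethernets": {
                name: {
                    "dhcp4": True,
                    "match": {"macaddress": macaddress, "driver": driver},
                    "set-name": name,
                }
            },
        }
    }


# Placeholder MAC address, not taken from the log.
config = netplan_for_hv_nic("eth0", "00:0d:3a:00:00:01")
rendered = json.dumps(config, indent=2)  # JSON emitted as a YAML-compatible form
```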
rharper	we know (for now) that any rendered interface in IMDS on Azure is hv_	22:34
rharper	blackboxsw_: well, we could emit a driver != I think ... let me confirm	22:34
rharper	yeah, we could emit: Driver=!mlx4_core\nDriver=!mlx5_core	22:35
rharper	however, I'm not sure if netplan likes that ...	22:35
blackboxsw_	right, or if it only honors the latter of the settings etc	22:37
* blackboxsw_ tries a generate		22:38
blackboxsw_	and sees what systemd thinks	22:38
rharper	quoted it accepts the !driver	22:38
rharper	but not sure about a list (or multiple)	22:38
rharper	I suppose we won't have both	22:38
rharper	well, that's not easily found	22:38
rharper	without finding the dup mac device (which may not be present)	22:38
rharper	I think the current code is safest	22:39
blackboxsw_	yeah I think so too at the moment, trying a clean and dirty upgrade and reboot with new code now	22:43
blackboxsw_	as the duplicate !driver matches may collide in systemd or netplan cfg with unintentional side-effects (or matching nothing)	22:44
blackboxsw_	like OR instead of AND	22:44
blackboxsw_	falcojr: at the moment, I'm still seeing 90-second times on reboot even with driver matching hv_netvsc	22:49
blackboxsw_	on dirty cloud-init upgrade/reboot	22:49
blackboxsw_	http://paste.ubuntu.com/p/7NjWHX2QTY/	22:49
blackboxsw_	interesting but 2nd reboot was fixed.	22:51
falcojr	I didn't see it after a clean and reboot	23:03
