/srv/irclogs.ubuntu.com/2018/04/25/#cloud-init.txt

blackboxswback00:35
rharperblackboxsw: ok; I've got 55 consecutive reboots with no issues00:40
blackboxswok rharper I misread your comment, I thought you did see the hang in europe00:41
rharperoh, I do00:41
rharperthis is with my fix applied00:41
rharperI wanted to make sure it wasn't a fluke;00:41
blackboxswahh good deal. so we may not need to write network names00:41
* blackboxsw is trying again in us-central100:42
rharpersmoser and I were chatting, and that is still probably the right thing to do anyhow00:42
rharperbut we can decide when to land that;00:42
rharperthis also needs maas qa before we push it too00:42
rharperI'm worried about all of the other places where we don't see this now; and we're not using the systemd-udev-settle.service; it's just not running anywhere except if you've got zfs installed or lvm enabled00:43
* rharper probes some vmtest runs on bionic with zfs and lvm to see what their journal says 00:43
blackboxswagain rharper you saw this just w/ cloud-init clean --logs and reboots right?00:45
rharperoh yeah00:45
blackboxswman, us-central1 still not reproducing it for me00:45
rharperfirst boot in europe-west1 with the 420 bionic image works fine00:45
rharperthen I rebooted00:45
rharpersaw it00:46
rharperrebooted, it came up00:46
blackboxswwill try switching to europewest again00:46
rharperso not always, since it's racy00:46
blackboxsws/again/for once/00:46
blackboxswof course this could be as well that my recent attempts had debug message checking and printing name_assign_type00:47
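For reference, the name_assign_type mentioned above is a per-interface sysfs attribute (/sys/class/net/<iface>/name_assign_type). The following is only a minimal sketch of the kind of debug check being described, not the code blackboxsw actually ran; the function name and the sys_class_net parameter are hypothetical. The numeric values come from the kernel's netdevice UAPI header: 1 means the kernel enumerated the name (e.g. eth0), 4 means it was renamed from userspace.

```python
import os

# Values from the kernel's netdevice UAPI header (NET_NAME_*).
NAME_ASSIGN_TYPES = {
    "1": "enum (kernel-assigned name such as eth0)",
    "2": "predictable",
    "3": "user",
    "4": "renamed",
}


def read_name_assign_types(sys_class_net="/sys/class/net"):
    """Return {iface: raw name_assign_type string, or None if unreadable}.

    Hypothetical helper for illustration only. Reading the attribute can
    fail (for example when the kernel reports the type as unknown, or when
    the entry is not a device directory), so failures map to None.
    """
    result = {}
    for iface in sorted(os.listdir(sys_class_net)):
        path = os.path.join(sys_class_net, iface, "name_assign_type")
        try:
            with open(path) as fh:
                result[iface] = fh.read().strip()
        except OSError:
            result[iface] = None
    return result


if __name__ == "__main__":
    for name, raw in read_name_assign_types().items():
        print(name, raw, NAME_ASSIGN_TYPES.get(raw, "unknown/unreadable"))
```

An interface that still reports 1 after udev should have processed it is the "non-renamed interface with a kname" case discussed later in this log.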
blackboxswheh, didn't realize our account instance view was shared00:52
blackboxswI see rharper-b1 now00:52
blackboxswsure enough first boot in europe-west1-b bricked00:53
blackboxswgeez man region-related for sure00:53
rharperyeah00:54
rharperblackboxsw: that's super interesting00:55
rharperw.r.t the region, I suspect it's load00:55
blackboxswcould be. trying to see if I can get a successful boot so I can add my debug deb on followup reboots00:56
rharperah, yeah, you have to get one good boot to set the root password00:57
blackboxswyeah or boot from my previous image snapshot. I'll try that00:58
smoserso .. we can reproduce fairly easily ?00:59
blackboxswlooks like europe-west1-b or europe-west1-d regions01:00
rharpersmoser: yeah01:00
smoserrharper: if nothing was running trigger....01:02
smoserer.. if nothing was running the trigger service, then what was doing the cold plug?01:02
rharpernot the trigger service01:02
rharperthat's always running01:02
smoser*something* has to be doing it. or we wouldn't have .link files respected at all01:02
rharperit's the settle service01:02
smoserso any possible 'udevadm settle' from anywhere would make it work01:03
rharperyeah01:03
rharperin xenial, the networking.service unit runs a pre command with udevadm settle in it01:03
rharperartful could show it, if things were fast enough; and even in bionic, it has to be in this one region where things run slightly odd01:04
rharperok, I need to step away01:05
smoserblackboxsw: logs of launch-softlayer needs combining with launch-ec2 too.01:08
smoserblackboxsw: this got missed.01:15
smoserhttps://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/34233401:15
smosernot terribly important01:15
smoserand then this one needs landing too01:16
smoser https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/34418901:16
blackboxswsmoser: landed https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/34418901:32
smoserblackboxsw: thanks01:41
=== mgerdts_ is now known as mgerdts
rharperblackboxsw: smoser:  if we wanted to be more targeted with the settle, we could, for example, trigger it within cloud-init-local if we detect non-renamed interfaces with knames (and ifnames=0 not in cmdline); that would then only impact systems which happen to have that early race between cloud-init-local and udev-trigger14:02
smoserright. and that would be in some ways safer14:06
smoserfrom the perspective of not changing boot14:06
smoseri'd like to have slangasek or xnox thoughts14:06
rharperyes14:06
smoseras i am apt to agree with you, that not having the settle service active in boot is ... well just wrong.14:06
rharperthere is a swarm of "why is my boot slow/ systemd-analyze blame shows udev-settle.service"14:07
smoseroh?14:07
rharperyes14:07
smoserso we did it as an optimization :)14:07
rharperbut it's because they have things like usb nics or other storage devices that take *time* to come up14:07
rharperno14:07
rharperI don't think so14:07
smoser(it was a joke)14:07
rharperit's not clear to me why it's not enabled by default14:07
rharperyet14:07
smoseri can make a system boot REALLY REALLY FAST14:07
smoserand sometimes even do what you want!14:07
rharperbut, lvm2 has a generator which forces it on, if lvm2 is needed in some situations14:07
rharperand zfs of course, Requires it14:07
rharpersince they need all of their devices up before they can mount or build a raid, etc14:08
rharperso, it *really* seems like it should just always be on14:08
rharperone ends up "Waiting" for rootfs anyhow14:08
smoseri agree. we should request slangasek and xnox review of your MP ?14:08
rharperwe've seen those "waiting for device ... foo to appear"14:08
rharpersmoser: or possibly add a systemd task and ask in the GCE bug14:08
smoserrharper: well, in my fast boot, sometimes / isn't there, but it boots really fast.14:08
rharperbut I would like foundation review14:08
rharpersmoser: lol!14:09
rharperI get (initramfs) prompt *so* fast those times14:09
smoserexactly.14:09
smoserand systemd-analyze does not blame udev!14:09
rharperI usually take the extra savings and then compile my own kernel, kexec into it to find my root14:09
rharperblackboxsw: interesting observation w.r.t zone and image;14:11
rharperI wonder if we can further dissect what's special about the 420 image in europe-west1 vs. current stuff14:11
rharpernone-the-less; it does make sense to do something to detect if we've raced and try to fix that in the case we do14:12
rharperI'm going to see if I can target the settle within cloud-init-local on the reproducer14:12
smoseri compile my kernels with -O4 and funroll-loops . its the best.14:12
smoser"it does make sense to do something to detect"14:13
smosermaybe14:13
smoserit only makes so much sense to determine when a system is broken... why didn't we just fix the system ?14:14
rharperthat's fair; for now I'm mostly interested in if we can detect it;14:14
rharperwhether we target a more narrow fix so as to not "udevadm settle" the world ; aka smoser's favorite alias to 'sleep 1'14:15
rharperneeds more discussion14:15
smoserhttps://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/34418914:19
smoserbah. bad link14:19
smoserhttps://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/34425514:20
smoserthat one.14:20
smoserthat is softlayer doc improvement.14:20
smoserrharper, blackboxsw, dpb1 ^14:20
rharperyeah, saw that14:20
rharper6 ways14:20
smoserfrom sunday14:21
rharperhehe14:23
rharper2018-04-25 15:44:49,632 - __init__.py[DEBUG]: WARK: found unstable device names: ['eth0']; calling udevadm settle15:45
rharper2018-04-25 15:44:49,968 - util.py[DEBUG]: WARK: Waiting for udev events to settle took 0.336 seconds15:45
rharpersmoser: we can detect, and "resolve" it more narrowly15:46
rharperif we want15:46
rharperI'll put up an alternative patch with this change15:46
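The two debug lines pasted above suggest a detect-then-settle flow: find interfaces that still carry kernel-assigned names, and only then call udevadm settle and time it. What follows is a rough sketch of that idea under stated assumptions, not the patch rharper put up. The function names are hypothetical, "ifnames=0" in the earlier message is assumed to mean the net.ifnames=0 kernel parameter, and "unstable" is assumed to mean a name_assign_type of 1. The run parameter exists only so the command can be stubbed out.

```python
import subprocess
import time


def find_unstable_names(assign_types, cmdline):
    """Return sorted interface names that still have kernel-assigned names.

    assign_types maps interface name -> raw name_assign_type string
    (or None when unreadable). If net.ifnames=0 is on the kernel command
    line, kernel names are intentional, so nothing is reported.
    """
    if "net.ifnames=0" in cmdline.split():
        return []
    return sorted(name for name, raw in assign_types.items() if raw == "1")


def settle_if_unstable(assign_types, cmdline, run=subprocess.run):
    """Call `udevadm settle` only when unstable names are found.

    Returns (unstable_names, seconds_spent_waiting).
    """
    unstable = find_unstable_names(assign_types, cmdline)
    if not unstable:
        return unstable, 0.0
    start = time.monotonic()
    # Blocks until the udev event queue is empty (or udevadm times out).
    run(["udevadm", "settle"], check=True)
    return unstable, time.monotonic() - start
```

The appeal of this shape, as discussed earlier in the log, is that systems which never hit the race pay nothing, while affected systems wait only as long as udev needs.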
smoserlink ?15:50
smoserphilroche: https://launchpad.net/~smoser/+archive/ubuntu/ibmcloud-test15:50
smosershould  be populated shortly with a test.15:50
blackboxswsmoser: reviewed https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/34425516:25
=== r-daneel_ is now known as r-daneel
rharpersmoser: blackboxsw: this is an alternative, more targeted settle, https://code.launchpad.net/~raharper/cloud-init/+git/cloud-init/+merge/34433917:16
jochablackboxsw: I think I pinged you about my updated merge request, but I don't think I received a response, I might've restarted the chat, anyways here it is again : https://code.launchpad.net/~jocha/cloud-init/+git/cloud-init/+merge/344192 :)18:19
blackboxswahh thanks jocha I'll give it a looksie today18:19
jochaawesome thanks!18:22
smoserblackboxsw: i responded to your https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/344255 .19:55
smoserreally just wanting to know if you think i cleared things up19:56
blackboxswsmoser: yes cleared. land at will, or I can19:57
smoserok. ill land19:57
blackboxswI'm camping in cloud-init hangout trying to get my IBMcloud setup up19:57
blackboxswnow that I'm approved19:57
blackboxswbut can't seem to find/create my API creds19:57
blackboxswgot it. and updating launch-softlayer script22:30
