[00:35] <blackboxsw> back
[00:40] <rharper> blackboxsw: ok; I've got 55 consecutive reboots with no issues
[00:41] <blackboxsw> ok rharper I misread your comment, I thought you did see the hang in europe
[00:41] <rharper> oh, I do
[00:41] <rharper> this is with my fix applied
[00:41] <rharper> I wanted to make sure it wasn't a fluke;
[00:41] <blackboxsw> ahh good deal. so we may not need to write network names
[00:42]  * blackboxsw is trying again in us-central1
[00:42] <rharper> smoser and I were chatting, and that is still probably the right thing to do anyhow
[00:42] <rharper> but we can decide when to land that;
[00:42] <rharper> this also needs maas qa before we push it too
[00:43] <rharper> I'm worried about all of the other places where we don't see this now; and we're not using the systemd-udev-settle.service; it's just not running anywhere except if you've got zfs installed or lvm enabled
[00:43]  * rharper probes some vmtest runs on bionic with zfs and lvm to see what their journal says 
[00:45] <blackboxsw> again rharper you saw this just w/ cloud-init clean --logs and reboots right?
[00:45] <rharper> oh yeah
[00:45] <blackboxsw> man, us-central1 still not reproducing it for me
[00:45] <rharper> first boot in europe-west1 with the 420 bionic image works fine
[00:45] <rharper> then I rebooted
[00:46] <rharper> saw it
[00:46] <rharper> rebooted, it came up
[00:46] <blackboxsw> will try switching to europewest again
[00:46] <rharper> so not always, since it's racy
[00:46] <blackboxsw> s/again/for once/
[00:47] <blackboxsw> of course this could be as well that my recent attempts had debug message checking and printing name_assign_type
[00:52] <blackboxsw> heh, didn't realize our account instance view was shared
[00:52] <blackboxsw> I see rharper-b1 now
[00:53] <blackboxsw> sure enough first boot in europe-west1-b bricked
[00:53] <blackboxsw> geez man region-related for sure
[00:54] <rharper> yeah
[00:55] <rharper> blackboxsw: that's super interesting
[00:55] <rharper> w.r.t the region, I suspect it's load
[00:56] <blackboxsw> could be. trying to see if I can get a successful boot so I can add my debug deb on followup reboots
[00:57] <rharper> ah, yeah, you have to get one good boot to set the root password
[00:58] <blackboxsw> yeah or boot from my previous image snapshot. I'll try that
[00:59] <smoser> so .. we can reproduce fairly easily ?
[01:00] <blackboxsw> looks like europe-west1-b or europe-west1-d regions
[01:00] <rharper> smoser: yeah
[01:02] <smoser> rharper: if nothing was running trigger....
[01:02] <smoser> er.. if nothing was running the trigger service, then what was doing the cold plug?
[01:02] <rharper> not the trigger service
[01:02] <rharper> that's always running
[01:02] <smoser> *something* has to be doing it. or we wouldn't have .link files respected at all
[01:02] <rharper> it's the settle service
[01:03] <smoser> so any possible 'udevadm settle' from anywhere would make it work
[01:03] <rharper> yeah
[01:03] <rharper> in xenial, the networking.service unit runs a pre command with udevadm settle in it
[01:04] <rharper> artful could show it, if things were fast enough; and even in bionic, it has to be in this one region where things run slightly odd
[01:05] <rharper> ok, I need to step away
[01:08] <smoser> blackboxsw: logs of launch-softlayer needs combining with launch-ec2 too.
[01:15] <smoser> blackboxsw: this got missed.
[01:15] <smoser> https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/342334
[01:15] <smoser> not terribly important
[01:16] <smoser> and then this one needs landing too
[01:16] <smoser>  https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/344189
[01:32] <blackboxsw> smoser: landed https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/344189
[01:41] <smoser> blackboxsw: thanks
[14:02] <rharper> blackboxsw: smoser:  if we wanted to be more targeted with the settle, we could for example, trigger it within cloud-init-local if we detect non-renamed interfaces with knames (and ifnames=0 not in cmdline); that would then only impact systems which happen to have that early race between cloud-init-local and udev-trigger
[14:06] <smoser> right. and that would be in some ways safer
[14:06] <smoser> from the perspective of not changing boot
[14:06] <smoser> i'd like to have slangasek or xnox thoughts
[14:06] <rharper> yes
[14:06] <smoser> as i am apt to agree with you, that not having the settle service active in boot is ... well just wrong.
[14:07] <rharper> there is a swarm of "why is my boot slow/ systemd-analyze blame shows udev-settle.service"
[14:07] <smoser> oh?
[14:07] <rharper> yes
[14:07] <smoser> so we did it as an optimization :)
[14:07] <rharper> but it's because they have things like usb nics or other storage devices that take *time* to come up
[14:07] <rharper> no
[14:07] <rharper> I don't think so
[14:07] <smoser> (it was a joke)
[14:07] <rharper> it's not clear to me why it's not enabled by default
[14:07] <rharper> yet
[14:07] <smoser> i can make a system boot REALLY REALLY FAST
[14:07] <smoser> and sometimes even do what you want!
[14:07] <rharper> but, lvm2 has a generator which forces it on, if lvm2 is needed in some situations
[14:07] <rharper> and zfs of course, Requires it
[14:08] <rharper> since they need all of their devices up before they can mount or build a raid, etc
[14:08] <rharper> so, it *really* seems like it should just always be on
[14:08] <rharper> one ends up "Waiting" for rootfs anyhow
[14:08] <smoser> i agree. we should request slangasek and xnox review of your MP ?
[14:08] <rharper> we've seen those "waiting for device ... foo to appear"
[14:08] <rharper> smoser: or possibly add a systemd task and ask in the GCE bug
[14:08] <smoser> rharper: well, in my fast boot, sometimes / isn't there, but it boots really fast.
[14:08] <rharper> but I would like foundation review
[14:09] <rharper> smoser: lol!
[14:09] <rharper> I get (initramfs) prompt *so* fast those times
[14:09] <smoser> exactly.
[14:09] <smoser> and systemd-analyze does not blame udev!
[14:09] <rharper> I usually take the extra savings and then compile my own kernel, kexec into it to find my root
[14:11] <rharper> blackboxsw: interesting observation w.r.t zone and image;
[14:11] <rharper> I wonder if we can further dissect what's special about the 420 image in europe-west1 vs. current stuff
[14:12] <rharper> none-the-less; it does make sense to do something to detect if we've raced and try to fix that in the case we do
[14:12] <rharper> I'm going to see if I can target the settle within cloud-init-local on the reproducer
[14:12] <smoser> i compile my kernels with -O4 and funroll-loops . its the best.
[14:13] <smoser> "it does make sense to do something to detect"
[14:13] <smoser> maybe
[14:14] <smoser> it only makes so much sense to determine when a system is broken... why didn't we just fix the system ?
[14:14] <rharper> that's fair; for now I'm mostly interested in if we can detect it;
[14:15] <rharper> whether we target a more narrow fix so as to not "udevadm settle" the world ; aka smoser's favorite alias to 'sleep 1'
[14:15] <rharper> needs more discussion
[14:19] <smoser> https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/344189
[14:19] <smoser> bah. bad link
[14:20] <smoser> https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/344255
[14:20] <smoser> that one.
[14:20] <smoser> that is softlayer doc improvement.
[14:20] <smoser> rharper, blackboxsw, dpb1 ^
[14:20] <rharper> yeah, saw that
[14:20] <rharper> 6 ways
[14:21] <smoser> from sunday
[14:23] <rharper> hehe
[15:45] <rharper> 2018-04-25 15:44:49,632 - __init__.py[DEBUG]: WARK: found unstable device names: ['eth0']; calling udevadm settle
[15:45] <rharper> 2018-04-25 15:44:49,968 - util.py[DEBUG]: WARK: Waiting for udev events to settle took 0.336 seconds
[15:46] <rharper> smoser: we can detect, and "resolve" it more narrowly
[15:46] <rharper> if we want
[15:46] <rharper> I'll put up an alternative patch with this change
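The targeted approach discussed above (detect interfaces that still carry kernel names, then call udevadm settle only in that case) could be sketched roughly as follows. This is a minimal illustration, not the code from the merge proposal: the function names, the choice to treat name_assign_type values 3 and 4 as "renamed", the decision to skip only `lo`, and the exact `net.ifnames=0` check are all assumptions made for the example.

```python
import os
import subprocess

# Assumption: kernel name_assign_type values 3 (NET_NAME_USER) and
# 4 (NET_NAME_RENAMED) mean the name was set from userspace (e.g. by udev).
RENAMED_TYPES = {"3", "4"}


def read_name_assign_type(devname, sys_net="/sys/class/net"):
    """Return the device's name_assign_type as a string, or None if unreadable."""
    path = os.path.join(sys_net, devname, "name_assign_type")
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def find_unstable_names(sys_net="/sys/class/net"):
    """List interfaces (other than lo) that do not look renamed yet."""
    unstable = []
    for devname in sorted(os.listdir(sys_net)):
        if devname == "lo":
            continue
        if read_name_assign_type(devname, sys_net) not in RENAMED_TYPES:
            unstable.append(devname)
    return unstable


def settle_if_needed(sys_net="/sys/class/net", cmdline_path="/proc/cmdline",
                     run=subprocess.check_call):
    """Run 'udevadm settle' only when unstable names are found.

    Hypothetical helper: skips the check when net.ifnames=0 is on the
    kernel command line, since kernel names are then expected.
    """
    with open(cmdline_path) as f:
        if "net.ifnames=0" in f.read().split():
            return []
    unstable = find_unstable_names(sys_net)
    if unstable:
        run(["udevadm", "settle"])
    return unstable
```

The `sys_net`, `cmdline_path` and `run` parameters exist only so the sketch can be exercised against a fake sysfs tree; a real implementation would also need to decide how to treat virtual devices, which this simplification flags as unstable.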
[15:50] <smoser> link ?
[15:50] <smoser> philroche: https://launchpad.net/~smoser/+archive/ubuntu/ibmcloud-test
[15:50] <smoser> should  be populated shortly with a test.
[16:25] <blackboxsw> smoser: reviewed https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/344255
[17:16] <rharper> smoser: blackboxsw: this is an alternative, more targeted settle, https://code.launchpad.net/~raharper/cloud-init/+git/cloud-init/+merge/344339
[18:19] <jocha> blackboxsw: I think I pinged you about my updated merge request, but I don't think I received a response, I might've restarted the chat, anyways here it is again : https://code.launchpad.net/~jocha/cloud-init/+git/cloud-init/+merge/344192 :)
[18:19] <blackboxsw> ahh thanks jocha I'll give it a looksie today
[18:22] <jocha> awesome thanks!
[19:55] <smoser> blackboxsw: i responded to your https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/344255 .
[19:56] <smoser> really just wanting to know if you think i cleared things up
[19:57] <blackboxsw> smoser: yes cleared. land at will, or I can
[19:57] <smoser> ok. ill land
[19:57] <blackboxsw> I'm camping in cloud-init hangout trying to get my IBMcloud setup up
[19:57] <blackboxsw> now that I'm approved
[19:57] <blackboxsw> but can't seem to find/create my API creds
[22:30] <blackboxsw> got it. and updating launch-softlayer script