[12:03] <ogra> xnox, yo ... i'm currently looking at the panic() function of the initrd ... on Ubuntu Core we set panic=-1 on the cmdline ... which the kernel typically handles fine ... now we have an issue with a customer where the initrd drops you into a console with the panic= value set ... according to the code it should sleep for $panic seconds and then reboot instead of dropping to a shell ... https://paste.ubuntu.com/p/mswD8Cd869/
[12:04] <ogra> now ... that "sleep -1" (resulting from the "panic=-1" on the cmdline) seems to cause us to still end up in a shell, seemingly the error from the sleep command makes us jump ahead to the /bin/sh at the bottom of that function
[12:04] <ogra> (i assume initrd scripts dont use set -e ?)
[12:06] <ddstreet> ogra that initrd script certainly doesn't match the kernel's panic usage
[12:07] <ogra> ddstreet, that might be ... it is what we have by default in the 16.04 initramfs-tools though (i'm working on Core16 which uses exactly the above code)
[12:07] <ddstreet> ah
[12:08] <ddstreet> ogra unrelated to that but re: panic=-1, i hope you are restoring /proc/sys/kernel/panic to 0 (or whatever the user's configured) after booting?  not leaving the system with panic=-1?
[12:08] <ogra> i need to avoid dropping to the shell (which i can surely do by simply setting panic=1; that should only delay the kernel panic a bit when not in initrd) ... but in general it looks like a bug in initramfs-tools
[12:09] <ogra> ddstreet, it is a constant default ... Core is mainly used in non-user setups ... headless remote managed devices etc ... so the panic value should cause an immediate reboot
[12:09] <ogra> (Core has a builtin rollback mechanism and the reboot will cause you to go back automatically to the last known good kernel)
[12:10] <xnox> ogra:  it would be interesting to see the full boot log with: set -x
[12:10] <xnox> ogra:  to see exactly what has happened.
[12:10] <ogra> the -1 is fine in all cases except the initrd panic
[12:10] <xnox> ogra:  maybe none of panic() is called at all?
[12:10] <xnox> ogra:  also, do you need sleep here?
[12:11] <ogra> xnox, well, it properly prints the "dropping to a shell yadda yadda" stuff and all ...
[12:11] <ogra> i dont need the sleep at all
[12:11] <ogra> what i need is no shell when the fs is corrupt on a device that has secureboot enabled
[12:11] <xnox> ogra:  you did see "Rebooting automatically due to panic=" right ?
[12:11] <ogra> nope
[12:11] <xnox> ogra:  also it is odd to have a variable named same as a function =)
[12:12] <ogra> it just drops into the initrd shell
[12:12] <xnox> ogra:  so ${panic} is not set?
[12:12] <ogra> (i dont have the exact logs from the customer)
[12:12] <ogra> panic is set to -1
[12:12] <ogra> $ LC_ALL=C sleep "-1"
[12:12] <ogra> sleep: invalid option -- '1'
[12:12] <ogra> Try 'sleep --help' for more information.
[12:12] <ogra> sleep thinks it is an option due to the minus
[12:13] <xnox> sleep -- -1
[12:13] <xnox> sleep: invalid time interval ‘-1’
[12:13] <xnox>   
[12:13] <xnox> but still kind of useless
[12:13] <ogra> yes
[12:13] <ogra> well, before i start hacking up my own panic() function for this customer project i thought we should also fix it upstream ;)
[12:13] <xnox> ogra:  so obviously just this snippet of code is not enough, without reproducer or full logs
[12:14] <xnox> ogra:  so do you have reproducer or full logs?
[12:14] <ogra> well, obviously that code snippet cant handle negative values and this should be fixed
[12:14] <ogra> regardless of logs/reproducer
[12:14] <xnox> ogra:  but also not critical for the customer, right?
[12:15] <ogra> well, given they sell secureboot protected devices, they dont really want people to gain root access to these systems just because of an FS corruption on potentially unrelated plugged in USB drives or whatever
[12:16] <ogra> so it is kind of critical :)
[12:16] <ogra> (i guess i'll get along with panic=1 for that particular case .... but i also think we need a fix in the existing upstream code ... this is why i pinged)
[12:19] <ogra> i assume the original sleep code there is in place so that you can actually see the console message before it reboots
[12:37] <xnox> ogra:  secureboot does not imply locked down device.
[12:38] <xnox> ogra:  i'm not sure what is the intention of -1
[12:38] <ogra> well ... its an expectation that you cant easily gain root access though
[12:38] <ogra> xnox, https://github.com/torvalds/linux/blob/v4.17/Documentation/admin-guide/kernel-parameters.txt#L2931
[12:38] <ogra> (as i just said in the internal channel, that initrd function doesnt take panic=0 into account either)
[12:39] <xnox> ogra:  right, so the code should handle that too.... ie what you mentioned earlier i.e. on zero call `sleep infinity` and on negative numbers, skip calling sleep
[12:39] <ogra> right
[12:40] <ogra> i think the original implementation of that function simply pre-dates that kernel feature ...
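The handling discussed above (zero waits forever, negative skips the sleep, positive sleeps first) can be sketched in POSIX sh. This is only an illustration under stated assumptions: `panic_plan` is a hypothetical helper, not the panic() function from initramfs-tools, and falling back to a shell for an empty or malformed value is an assumption, not something settled in this conversation. It prints the action rather than performing it so the decision logic can be checked on its own.

```shell
# Hypothetical helper (not initramfs-tools code): map a panic= value to an
# action, following the kernel's documented semantics for panic=.
panic_plan() {
    case "$1" in
        # Empty, a lone "-", non-numeric characters, or a "-" anywhere but
        # the first position: treated as unset (assumption: fall back to shell).
        ''|-|*[!0-9-]*|?*-*) echo "shell" ;;
        # Negative: reboot immediately, never call sleep.
        -*) echo "reboot-now" ;;
        # Zero: wait forever (e.g. "sleep infinity", if the sleep in the
        # initrd supports it).
        0) echo "wait-forever" ;;
        # Positive: sleep that many seconds, then reboot.
        *) echo "sleep $1" ;;
    esac
}

panic_plan -1   # reboot-now
panic_plan 0    # wait-forever
panic_plan 5    # sleep 5
```

Because the negative case never reaches `sleep`, the "invalid option" error shown earlier cannot occur on that path.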
[13:11] <ogra> xnox, oh my ... digging deeper i see that /usr/share/initramfs-tools/init actually filters out negative values completely for panic= ... (unless my regex understanding is wrong)
[14:25] <ogra> xnox, https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/1831252
[14:32] <xnox> ogra:  hehe. I did say "are you sure panic is even set?!"
[14:32] <ogra> yeah, now i'm sure it is not because of this :)
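For illustration, a cmdline parser that keeps negative values instead of filtering them out might look like the sketch below. `parse_panic` is a hypothetical function and the validation pattern is an assumption; it is not the code in /usr/share/initramfs-tools/init and not necessarily the fix proposed in the bug report.

```shell
# Hypothetical parser (not the initramfs-tools init script): extract the
# panic= value from a kernel command line string, accepting an optional
# leading minus sign followed by digits.
parse_panic() {
    panic=
    # Intentionally unquoted so the command line is split on whitespace.
    for arg in $1; do
        case "$arg" in
            panic=*)
                value="${arg#panic=}"
                case "$value" in
                    # Reject empty values, a lone "-", non-numeric
                    # characters, and a "-" anywhere but the first position.
                    ''|-|*[!0-9-]*|?*-*) panic= ;;
                    *) panic="$value" ;;
                esac
                ;;
        esac
    done
    printf '%s\n' "$panic"
}

parse_panic "quiet panic=-1 ro"   # -1
```

A value that fails validation leaves the result empty, which corresponds to "panic is not set" in the exchange above.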
[18:45] <ddstreet> vorlon xnox fyi, not sure if you noticed yet but it's not only systemd failing during test reboots, dbus test just failed as well: https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-disco/disco/amd64/d/dbus/20190531_164159_d3a1e@/log.gz
[19:01] <bdmurray> sarnold: Have you had a chance to test bug 1820676? Maybe you could test it with a new laptop install. ;-)
[19:02] <connor_k> lol
[19:03] <sarnold> bdmurray: it's funny you mention that, I just a few moments ago saw https://www.zdnet.com/article/dell-releases-more-high-end-ubuntu-linux-laptops/
[19:04] <sarnold> bdmurray: I tried a new image on the machine that was wedged and it *did* boot past that point -- I however forgot to test what happens if you hit the cancel button in the installer
[19:04] <bdmurray> sarnold: dpm
[19:04] <bdmurray> sarnold: don't you have a new laptop lying around somewhere...
[19:04] <sarnold> bdmurray: yes
[19:05] <bdmurray> sarnold: here's one more reason to play with it!
[19:05] <sarnold> yes :)
[19:10] <connor_k> sarnold, most expensive paperweight ever
[19:11] <sarnold> connor_k: you should meet my server..