[06:55] <joseogando> petn-randall, it is for NVIDIA systems, nothing to do with graphics, desktop PCs or anything of the sort. Most likely, unless you use NVIDIA servers, you won't need it.
[11:07] <lis> hi.  i'm working in the cockpit team at redhat.  our CI just caught a pretty bad regression in the kernel version in jammy-proposed (5.15.0.94.91) which wasn't present in the previous version
[11:07] <lis> i'd normally report that as a bug but i haven't logged into launchpad in half a decade and i lost my 2FA.  i wanted to report it here to give the best chance possible of the update not making it to stable
[11:09] <lis> analysis and a simple reproducer is in https://github.com/cockpit-project/bots/pull/5793 with a link to the patch that landed in the ubuntu kernel tree on Jan 5.  it comes down to the BLKPG_DEL_PARTITION ioctl() returning EINVAL on missing partitions when it used to return ENXIO, which breaks partprobe
[11:09] -ubottu:#ubuntu-kernel- Pull 5793 in cockpit-project/bots "Image refresh for ubuntu-2204" [Open]
[11:11] <lis> the last version our CI ran against (known-good) was 5.15.0.91.88
[11:16] <tjaalton> lis: fixed by 6f64f866aa1ae69 upstream?
[11:17] <lis> i'd guess not.  i think the problem is the GENHD_FL_NO_PART → EINVAL check on entrance to blkpg_do_ioctl() and this patch doesn't seem like it would change anything about that
[11:19] <lis> it doesn't look like there's much else that could return EINVAL around.  bdev_del_partition() gets called fairly directly and it can only return a few possible errnos, none of which are EINVAL
[11:19] <lis> there's one other check that could return EINVAL but it's unchanged from the known-good version of the kernel, so i guess it's not that one
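A minimal sketch of how the errno discussed above could be observed from userspace. This is not the reproducer from the linked pull request. The constants and struct layouts are written from memory of `<linux/blkpg.h>` and should be checked against the header; `delete_partition_errno` and `describe` are hypothetical helper names. The call most likely needs root, and it really deletes the in-kernel partition if that partition exists, so it should only be pointed at a scratch device with a partition number known to be absent.

```python
import ctypes
import errno
import fcntl
import os

# Values as recalled from <linux/blkpg.h>; verify against your headers.
BLKPG = 0x1269              # _IO(0x12, 105)
BLKPG_DEL_PARTITION = 2


class BlkpgPartition(ctypes.Structure):
    _fields_ = [
        ("start", ctypes.c_longlong),
        ("length", ctypes.c_longlong),
        ("pno", ctypes.c_int),
        ("devname", ctypes.c_char * 64),
        ("volname", ctypes.c_char * 64),
    ]


class BlkpgIoctlArg(ctypes.Structure):
    _fields_ = [
        ("op", ctypes.c_int),
        ("flags", ctypes.c_int),
        ("datalen", ctypes.c_int),
        ("data", ctypes.c_void_p),
    ]


def delete_partition_errno(disk_path, partno):
    """Issue BLKPG_DEL_PARTITION; return 0 on success, else the errno."""
    part = BlkpgPartition(pno=partno)
    arg = BlkpgIoctlArg(
        op=BLKPG_DEL_PARTITION,
        flags=0,
        datalen=ctypes.sizeof(part),
        data=ctypes.addressof(part),
    )
    fd = os.open(disk_path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BLKPG, arg)
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)
    return 0


def describe(err):
    """Map the result onto the two behaviours described in the chat."""
    if err == 0:
        return "deleted"
    if err == errno.ENXIO:
        return "ENXIO: missing partition (behaviour of the known-good kernel)"
    if err == errno.EINVAL:
        return "EINVAL: the reported behaviour change"
    return errno.errorcode.get(err, str(err))
```

Usage would look like `print(describe(delete_partition_errno("/dev/loop0", 15)))`, where the device path and partition number are placeholders for a scratch loop device.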
[11:25] <tjaalton> well, file it upstream then?
[11:26] <tjaalton> the patch came from upstream stable, and is applied to basically every stable tree
[11:31] <lis> that makes sense, but i still think ubuntu should stop this update from reaching stable
[11:37] <tjaalton> it's been there since v6.6
[11:37] <tjaalton> -rc1
[11:37] <lis> there's this upstream: https://marc.info/?l=linux-kernel&m=169753467305218&w=2
[11:38] <lis> someone already reported the issue, with a fix, but it was rejected since EINVAL is 'appropriate' (even though it's a behaviour change vs the previous version)
[11:58] <tjaalton> it should still be resolved upstream
[11:59] <tjaalton> and if it won't change, then I don't see why we should change things against upstream
[11:59] <tjaalton> but that's just my 2c
[12:21] <lis> https://lkml.org/lkml/2024/1/15/147 ← upstream report.  let's see what they say
[12:23] <lis> it's lunch time here.  take care!
[12:48] <DiogoConstantino> hi
[14:40] <petn-randall> joseogando: We're running NVIDIA DGX servers. Does it apply to those? What's the difference to the regular kernel?
[14:45] <joseogando> petn-randall, there is some newer hardware enabled there, although AFAIK it is mostly used with NVIDIA's custom distribution. Whether it makes sense or not likely depends on the hardware you've got. If all of the hardware that your server has is already enabled then it likely makes no sense. Let me think for a moment....
[14:47] <petn-randall> I guess I could diff the /boot/config-*, and also look at the difference in patches for both kernels, but I believe that info should really be in the package description.
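The config comparison mentioned above could be sketched roughly as follows. It assumes the usual kernel config format (`CONFIG_FOO=y` lines and `# CONFIG_FOO is not set` comments); `parse_config` and `diff_configs` are hypothetical helper names, not part of any existing tool.

```python
import re

# Matches the "# CONFIG_FOO is not set" form used for disabled options.
_NOT_SET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Parse kernel config text into {option: value}; unset options map to 'n'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        match = _NOT_SET.match(line)
        if match:
            opts[match.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            key, _, value = line.partition("=")
            opts[key] = value
    return opts


def diff_configs(a_text, b_text):
    """Return {option: (value_in_a, value_in_b)} for options that differ.

    An option absent from one file is reported as None on that side.
    """
    a, b = parse_config(a_text), parse_config(b_text)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }
```

To use it, read the two files under `/boot` (for example the generic and nvidia `config-*` files for the installed kernels) and pass their contents to `diff_configs`. This only shows build-option differences, not the patch differences between the two kernel trees.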
[14:50] <joseogando> That is definitely one way to do it. I was looking for more info but can't think of anything right now.
[14:50] <joseogando> I'll pass along your suggestion - I just did apt-cache show linux-nvidia and I see the description.
[15:08] <petn-randall> Yeah, given that nvidia produces a wide range of hardware for many use cases, it might make sense to make the description clearer about what it does and does not do.