/srv/irclogs.ubuntu.com/2018/04/04/#ubuntu-kernel.txt

s10gopalapw, ring ring07:41
s10gopalapw, can you please tell me how to do git bisection and build kernel from source ?07:42
s10gopalalready did s10gopal: "git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git" then "cd linux" and you're in the 'root' of the source tree07:42
apws10gopal, that is not a simple thing to do, there is a wiki page with the details somewhere; or it might be better if we get someone to feed you built kernels to test07:43
apws10gopal, do you h07:43
s10gopaland this too 'git checkout v4.12' then do 'git bisect start' 'git bisect good v4.12' 'git bisect bad v4.13'07:43
apws10gopal, do you have a 'good and bad' kernel pair07:43
s10gopal 4.12 is good and 4.13 is bad07:43
apwthen you might want to test the 4.13-rc1 mainline build, as that is 'kind of' in the middle of those two07:44
apwhttp://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc1/07:44
apwthat is already pre-built and will save you a bit of time07:44
s10gopalshould i test all ?4.13rc*07:45
s10gopaland what about 4.12.* line ?07:45
apwwell most of the change goes in in -rc1 so that is the nearest to the 'middle' in a bisection sense as we have built07:47
apwif you are lucky that one will be good then it is worth testing the other -rc's too, -rc4 or whatever and hone in on it, but test -rc1 and see if that is good/bad07:47
s10gopalwhat if all rc are bad ?07:48
apwwell if -rc1 is bad, then the issue is between 4.12 and 4.13-rc1 and that is the most likely outcome, but that is what we are trying to narrow in on07:50
s10gopalapw, thx testing it , and after getting good and bad pair , this bug can be fixed or i have to wait for years ?07:52
apws10gopal, once the bisect is complete we will know the exact commit which introduced the issue, then it needs looking at, sometimes it is obvious sometimes it needs work upstream07:53
apwso i cannot possibly answer your question07:53
s10gopalthx , going to test rc , bye07:53
apwat least there is hope, and it is within your control to move it along; consider how it would be with windows07:53
apwdijuremo, there may be some movement on at least some of the hangs we have been seeing, i wonder if you could try out the kernels at the bottom of LP: #175992007:59
ubot5`Launchpad bug 1759920 in linux (Ubuntu Artful) "intel-microcode 3.20180312.0 causes lockup at login screen(w/ linux-image-4.13.0-37-generic)" [High,Confirmed] https://launchpad.net/bugs/175992007:59
apwsee the last comment, it has the links to the kernels and hints what to collect07:59
apwiirc you were seeing issues with both 4.4 and 4.13 kernels08:00
=== Elimin8r is now known as Elimin8er
=== oSoMoN_ is now known as oSoMoN
=== shadeslayer_ is now known as shadeslayer
LocutusOfBorgseriously kernel folks, https://bugs.launchpad.net/ubuntu/+source/linux/+bug/176097311:19
dijuremoapw: It's done -> https://bugs.launchpad.net/ubuntu/+source/intel-microcode/+bug/1759920/comments/6811:19
ubot5`Ubuntu bug 1760973 in linux (Ubuntu) "virtualbox-dkms 5.2.8-dfsg-5: virtualbox kernel module failed to build [Makefile:976: "Cannot use CONFIG_STACK_VALIDATION=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel"]" [Critical,Confirmed]11:19
ubot5`Ubuntu bug 1759920 in linux (Ubuntu Artful) "intel-microcode 3.20180312.0 causes lockup at login screen(w/ linux-image-4.13.0-37-generic)" [High,Confirmed]11:19
apwLocutusOfBorg, what?  we broke the kernel in -proposed ?  is this no a crime ?11:20
apwnow11:20
apwdijuremo, do i take it it worked from all that :)11:21
apwsforshee, ^11:22
dijuremoso far so good... But this is the T3610 that was stable for a long time yesterday without noibpb11:23
apwdijuremo, well do let us know how you get on, the more the merrier testing wise11:23
dijuremoThe one where I was running prime95 on. I left it running and ran for many hours but today it was frozen with the *older* *unpatched* 4.13.0-3811:24
dijuremoSo this morning, force restarted it, applied the patches and now we will see... If I leave it running prime95 for a full day and does not crash, we have a winner...11:24
apwtyhicks, ^11:25
dijuremoapw: Do you guys plan to patch the 4.4.0 series as well to fix that one?11:30
dijuremoapw: What would you be your best estimate on a release of both patched 4.4.0 and 4.13.0 to the main Ubuntu channels?11:31
apwdijuremo, yes we would be planning on fixing any affected kernels, it would most likely be in the next cycle, we might expedite this fix it depends11:31
apwdijuremo, so about 3 weeks11:31
dijuremoI guess I will try to hold off on adding noibpb to my Ubuntu machines for now, only do it for those freezing, do not want to end up with all of them using noibpb and being vulnerable. Just do not have enough personnel and time for trying something that does config management, i.e. ansible, salt, etc... 11:33
s10gopalapw, rc1 is also bad , which next to test ?12:07
apws10gopal, then the next is a real bisect from v4.12 to v4.13-rc112:07
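The "real bisect" apw mentions is a binary search over the commits between the known-good and known-bad points. The sketch below only illustrates that idea on a made-up, strictly linear history; the commit names `c1` to `c6` and the `is_bad` callback are hypothetical, and real kernel history contains merges, which `git bisect` handles itself.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad (a simplification).
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid  # the culprit is after mid
    return commits[hi]


# Hypothetical history; "c5" stands in for the commit that introduced the bug.
history = ["v4.12", "c1", "c2", "c3", "c4", "c5", "c6", "v4.13-rc1"]
culprit = first_bad(history, lambda c: history.index(c) >= history.index("c5"))
print(culprit)
```

In the situation discussed here, each call to `is_bad` corresponds to installing one test kernel, leaving the laptop shut down, and checking whether the battery drained.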
apws10gopal, which it might make sense for someone to generate the kernels for you12:08
apwjsalisbury, ^ if you have time perhaps you could help out ?12:08
s10gopalshould i test 4.12.14 too ?12:08
s10gopalapw, 4.12.1 to 4.12.1412:10
=== Blub0\0 is now known as Blub\0
jsalisburyapw, ack12:31
s10gopal!ping12:32
ubot5`pong!12:32
jsalisburys10gopal, do you have a bug opened for this bisect?12:32
s10gopaljsalisbury, https://bugzilla.kernel.org/show_bug.cgi?id=19866512:32
ubot5`bugzilla.kernel.org bug 198665 in Power-Off "Battery drains when laptop is off (shutdown) . WOL disabled and no usb device connected." [High,Needinfo]12:32
s10gopaljsalisbury, and this too https://bugs.launchpad.net/ubuntu/+source/linux/+bug/174564612:33
ubot5`Ubuntu bug 1745646 in linux (Ubuntu) "Battery drains when laptop is off (shutdown)" [Medium,New]12:33
jsalisburys10gopal, I'll start a bisect between 4.12 and 4.13-rc1 and build a test kernel.  I'll post it to the bug shortly.12:34
s10gopaljsalisbury, i am having ssd and core i5 , can you please teach me how to do it ? i will build it and test 12:34
jsalisburys10gopal, if you'd like to try, there is a wiki that explains it.  Let me grab the link12:35
s10gopaljsalisbury, but i am unable to understand it12:35
jsalisburys10gopal, https://wiki.ubuntu.com/Kernel/KernelBisection12:36
s10gopaljsalisbury, but i am unable to understand it12:36
jsalisburys10gopal, Ok, I'll build the test kernels for you to test then.12:36
s10gopaljsalisbury, thx a lot12:37
s10gopaljsalisbury, where you are going to post them ?12:37
jsalisburys10gopal, I'll post a link in the bug and instructions.  It will be here when built: 12:38
jsalisburyhttp://kernel.ubuntu.com/~jsalisbury/lp1745646/12:38
=== ghostcube_ is now known as ghostcube
s10gopaljsalisbury, i need to install it by sudo dpkg -i linux-image-4.12.0-041200-generic_4.12.0-041200.201804041240_amd64.deb only ?14:15
xnoxs10gopal, use $ sudo apt install ./linux....deb14:18
xnoxs10gopal, note that './' is needed. If you have the full set of packages to install in a dir you can use $ sudo apt install ./*.deb14:18
s10gopali did cd Downloads14:18
xnoxs10gopal, that will tell you if you have all the right deps / if you forgot or are missing any downloaded packages.14:18
s10gopalthen  sudo dpkg -i linux-image-4.12.0-041200-generic_4.12.0-041200.201804041240_amd64.deb14:18
xnoxit is no longer advisable to ever use dpkg -i interactively.... using apt with downloaded debs is better.14:19
s10gopalxnox, can we install kernel by single file ? generally it is required to download 3 deb files14:19
xnoxs10gopal, why? generally, no....14:20
xnoxwhy would you want to install it by single file?14:20
xnoxit is not packaged like that... thus by not getting all the files, you will cause broken packages and broken dependencies on your system, preventing further installations of any packages if you force it14:21
s10gopalxnox, i downloaded a kernel from http://kernel.ubuntu.com/~jsalisbury/lp1745646/14:22
xnoxusing apt install ./path-to.deb ./path-to.other.deb -> safeguards you from breaking your system.14:22
xnoxs10gopal, if that installs fine using apt, it is fine.14:22
TJ-xnox: is there any documentation on using 'apt' eith deb files? I don't see anything in 'man apt'14:22
jsalisburys10gopal, Just be sure to reboot after installing the new kernel and select it from the GRUB menu.14:23
s10gopalTJ-, i did sudo dpkg -i linux-image-4.12.0-041200-generic_4.12.0-041200.201804041240_amd64.deb , any other command or file is required too or i should reboot now ?14:23
jsalisburys10gopal, after a reboot, run 'uname -a' to ensure you booted into it.14:23
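The check jsalisbury asks for can also be scripted. The following is a minimal sketch, assuming the usual Linux `uname -a` layout in which the kernel release is the third whitespace-separated field; the helper names are made up for illustration, and the sample line is the output s10gopal posts in this log.

```python
import platform


def release_from_uname(uname_line: str) -> str:
    """Return the kernel release from a Linux `uname -a` line (third field)."""
    return uname_line.split()[2]


def booted_into(expected_release: str, actual_release: str = "") -> bool:
    """Compare the expected release with the running one.

    If actual_release is empty, fall back to platform.release(), which
    reports the release of the kernel the interpreter is running on.
    """
    return (actual_release or platform.release()) == expected_release


sample = ("Linux gopal-HP-Notebook 4.12.0-041200-generic #201804041240 SMP "
          "Wed Apr 4 12:44:39 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux")
release = release_from_uname(sample)
print(release, booted_into("4.12.0-041200-generic", release))
```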
xnoxTJ-, that would need ping to juliank, but he is not in this channel14:23
* xnox summons juliank14:23
s10gopalok thx14:24
s10gopalrebooting14:24
juliankxnox: you called14:24
juliank:D14:24
xnox<TJ-> xnox: is there any documentation on using 'apt' eith deb files? I don't see anything in 'man apt'14:24
xnoxjuliank, ^14:24
xnoxs/eith/with/14:24
juliankhmm14:25
juliankmaybe not14:25
TJ-It would be nice to refer users to using apt that way14:25
xnoxjuliank, i do recall our extensive discussion of specifying things with ./*.deb at all.14:25
juliankyeah14:25
juliankWhat works is apt --with-source <changes/deb/maybe dir, idk> names..., and apt install /absolute/path ./relative/path/with/dot/in/front14:26
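For scripts that currently call dpkg -i, the path rule described above (an absolute path, or a relative path with a leading dot) can be applied mechanically. This is only a sketch: the helper name `apt_install_argv` is hypothetical, and it merely builds the argument list without running apt.

```python
import posixpath
from typing import Iterable, List


def apt_install_argv(deb_paths: Iterable[str]) -> List[str]:
    """Build an `apt install` command line for local .deb files.

    Bare relative names get a "./" prefix so that they are passed as paths
    rather than package names; absolute paths and paths that already start
    with "./" or "../" are passed through unchanged.
    """
    argv = ["sudo", "apt", "install"]
    for path in deb_paths:
        if posixpath.isabs(path) or path.startswith(("./", "../")):
            argv.append(path)
        else:
            argv.append("./" + path)
    return argv


# Hypothetical file names, for illustration only.
print(apt_install_argv(["linux-image.deb", "/tmp/linux-headers.deb"]))
```

The resulting list could be handed to `subprocess.run`, which is left out here so the sketch has no side effects.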
s10gopalLinux gopal-HP-Notebook 4.12.0-041200-generic #201804041240 SMP Wed Apr 4 12:44:39 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux14:26
juliank--with-source is documented (it adds e.g. a .changes or .deb file to the cache like a normal package set)14:27
xnoxTJ-, yeah in general these days, after building things i only ever use $ sudo apt install ./*.deb -> e.g. when testing new builds of systemd, etc.14:27
xnoxbecause there are tricky depends that dpkg -i cannot get right, if i forget a package, or list them out of order.14:27
juliankbut the other not14:27
juliankNice things for testing builds are14:27
TJ-xnox: right - I could do with apt instead of dpkg -i in some of my scripts14:27
juliankapt install --only-upgrade ../package_version_arch.changes14:28
juliankor without --only-upgrade if you need to install new stuff14:28
s10gopalTJ-, i am in jsalisbury kernel ?14:28
juliankand/or --with-source changes and running upgrade or something14:28
juliankI guess that makes a lot of sense14:28
juliankapt upgrade --with-source ../<built changes file> 14:28
* TJ- pipes juliank into 'man-db'14:29
juliankFor with-source at least, you can also generate a Packages file and install from that14:29
mamarleyYou can use APT to upgrade locally compiled packages by passing it only a changes file?  Where has this been all my life?14:30
xnoxI think juliank is overdue a blogpost on all the planets which just lists these commands and ends with14:30
xnox.... "guess what all of the above do?! Your welcome!"14:31
jsalisburys10gopal, post the output of uname -a from a terminal and we will know.14:31
s10gopalLinux gopal-HP-Notebook 4.12.0-041200-generic #201804041240 SMP Wed Apr 4 12:44:39 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux14:31
jsalisburys10gopal, great.  Now test for the bug. I'll build the next test kernel based on the results.14:32
jsalisburys10gopal, then will do the same over again about 10 or so times.14:32
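The "10 or so" figure follows from each test halving the remaining range. As a rough sketch (the function name and the commit counts are illustrative assumptions, not the real number of commits between v4.12 and v4.13-rc1):

```python
def bisect_steps(candidates: int) -> int:
    """Worst-case number of test builds needed to isolate one bad commit
    among `candidates` commits, i.e. ceil(log2(candidates))."""
    if candidates < 1:
        raise ValueError("need at least one candidate commit")
    return (candidates - 1).bit_length()


# Illustrative counts only.
for n in (2, 1024, 13000):
    print(n, bisect_steps(n))
```

A range of 1024 candidate commits needs 10 tests by this estimate, and doubling the range adds only one more.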
juliankxnox: I'd expect apt upgrade <path to changes> to do the wrong thing, BTW14:32
s10gopaljsalisbury, how much time i should turn off my laptop ? 14:32
jsalisburys10gopal, however long you did in the past to reproduce the bug.14:32
jsalisburys10gopal, we just want to know if this kernel exhibits the bug or not.14:33
juliankmamarley: it's only been around for 2 years14:33
s10gopalgoing to test it14:34
s10gopalthx 14:34
TJ-jsalisbury: you're going to love s10gopal :) Thanks for stepping in 14:39
xnoxjuliank, but at King's Cross you can take pictures on the platform 9¾ 14:42
xnoxwrong channel14:42
xnoxoh well14:42
jsalisburyTJ-, anytime :-)14:57
TJ-jsalisbury: did you see that s10gopal is suspecting the RTC sync on shutdown due to discovering an old Debian bug?14:58
jsalisburyTJ-, I didn't.  That's what is great about a bisect.  No guessing needed :-D15:00
jsalisburyTJ-, many times no thinking needed either, lol15:00
TJ-jsalisbury: I  think it's an ACPI issue15:01
jsalisburyTJ-, probably.  Once he tests a couple of kernels, the number of commits will come way down, so we may be able to just pick it out15:02
TJ-Yes, I hope so. It's a hell of a one to test though, because the PC needs leaving shutdown for quite a few hours to be sure of the battery drain15:03
=== aaa_ is now known as aaa_|away
jsalisburyTJ-, arg, well I imagine this bisect will take a while then.  I'll skim through the commits between v4.12 and v4.13-rc1 and see if anything sticks out.15:04
TJ-jsalisbury: I was going to prepare a script to generate all the kernels bisect would need to test ahead of time so he could just install them without waiting for bisect+build but he started 'clinging' to me so I had to back off15:06
jsalisburyTJ-, so he's a "Klingon" ?15:07
jsalisbury:)15:07
TJ-jsalisbury: desperately wants a solution but has very little ability to read documentation/wiki and transpose into what the specific commands he needs are15:08
jsalisburyTJ-, I'll help him along, he seems willing to test, so that's good.15:09
* TJ- nods15:09
dijuremoA quick question about the changes to the kernel per Ubuntu bug 1759920, Will you guys pass the fix upstream to Debian? Were their kernels also affected similarly with ibpb freezes?16:06
ubot5`Ubuntu bug 1759920 in linux (Ubuntu Artful) "intel-microcode 3.20180312.0 causes lockup at login screen(w/ linux-image-4.13.0-37-generic)" [High,Confirmed] https://launchpad.net/bugs/175992016:06
=== himcesjf_ is now known as him-cesjf
Gargravarrdijuremo: hi, i've been sent your way from #ubuntu-server about an intel-microcode issue, possibly related to bug 175992017:10
ubot5`bug 1759920 in linux (Ubuntu Artful) "intel-microcode 3.20180312.0 causes lockup at login screen(w/ linux-image-4.13.0-37-generic)" [High,Confirmed] https://launchpad.net/bugs/175992017:10
Gargravarri have some Kaby Lake laptops which are displaying the same symptoms, which seems to be closely related to sssd17:12
tyhicksdijuremo: the fix is already in the upstream linux kernel - we were using a slightly different patch that had come from a processor vendor17:20
tyhicksGargravarr: sssd is a good way to trigger it17:21
tyhicksGargravarr: if you have time, please follow the instructions in this comment: https://bugs.launchpad.net/ubuntu/+source/intel-microcode/+bug/1759920/comments/6717:21
ubot5`Ubuntu bug 1759920 in linux (Ubuntu Artful) "intel-microcode 3.20180312.0 causes lockup at login screen(w/ linux-image-4.13.0-37-generic)" [High,Confirmed]17:21
Gargravarrtyhicks: thanks for the link. is this known to affect 4.4 kernels as well? my laptop really does not like 4.1317:24
tyhicksGargravarr: it does affect 4.4 but I don't yet have a test kernel17:25
tyhicksGargravarr: keep an eye out as I'll have a 4.4 test kernel soon17:25
Gargravarrthanks. i do have an affected laptop installed with 4.13 but i'm currently testing with the 4.4 one17:25
dijuremoGargravarr: All our machines had sssd, so probably why we were seeing them freeze consistently.17:35
Gargravarrdijuremo: part of my job here has been migrating local users onto LDAP. dogfooding has meant i ran into the problem pretty frequently17:37
dijuremoGargravarr: I maintain RHEL 7.x and Ubuntu Linux desktops and laptops all tied with sssd to Active Directory...17:38
Gargravarryou have my sympathy ;)17:39
dijuremoGargravarr: honestly has worked very well, had some issues related to DNS and very slow lookups of things, but working well now.17:40
Gargravarri built an OpenLDAP cluster from scratch17:41
* waveform is stupid enough to run an openldap cluster at home17:41
=== aaa_|away is now known as YR3aG4hQ
Gargravarrthat's now nice and stable, but trying to do this all open-source has been really quite painful17:41
waveform(there are many raspberry pis ... that's my excuse)17:41
Gargravarrwaveform: i have more laptops than relatives :P the thought of running my own has crossed my mind17:42
=== YR3aG4hQ is now known as sMFts2gy
Gargravarrcan't deny, AD does make building and maintaining a domain significantly easier17:42
dijuremoGargravarr: Dont want to hijack this channel talking about sssd, so PM me if you like. I did run openldap back in the beginning of the 2000s, but it was even more challenging, compiling and running Openldap on Solaris and authenticate against kerberos to get rid of NIS. Was a fun project. But do not want to deal with ldap, etc....17:43
Gargravarri now speak fluent LDIF...17:43
Gargravarrtyhicks helpfully explained why this is a kernel issue in the other channel17:44
Gargravarrand why sssd is particularly good at hitting it17:44
Gargravarrtyhicks: okay, i've loaded this laptop up with your -38 kernel, here goes17:44
Gargravarrtyhicks: no, it froze again, immediately after successful auth17:46
dijuremoGargravarr:  You have just gave me some news, I was not aware that sssd would trigger the freeze, but now it makes sense that more and more of my systems began to freeze. I thought originally it was related to running X and the Nvidia driver since I could sometimes log in, but then after log out the system would freeze restarting lightdm17:47
dijuremoYou have just *given* me some news....17:47
Gargravarrwe all English good on IRC :)17:47
dijuremoNot a native english speaker, so I have an excuse, but try to make up for my mistakes when I see them ;)17:48
waveformyeah, definitely not related to nvidia drivers (that came up a lot on the duplicate bugs, but I've only got AMD/intel graphics here and still experienced the lockup)17:49
tyhicksGargravarr: could you boot the same kernel but add "noibpb" to the kernel command line in grub (so that you can fully boot) and then paste the output that I requested here? https://bugs.launchpad.net/ubuntu/+source/intel-microcode/+bug/1759920/comments/6717:49
ubot5`Ubuntu bug 1759920 in linux (Ubuntu Artful) "intel-microcode 3.20180312.0 causes lockup at login screen(w/ linux-image-4.13.0-37-generic)" [High,Confirmed]17:49
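Whether a boot actually used the "noibpb" option can be confirmed from the running system, since Linux exposes the boot command line in /proc/cmdline. A minimal sketch, with hypothetical helper names and a made-up sample command line:

```python
def has_kernel_flag(cmdline: str, flag: str) -> bool:
    """True if `flag` appears as a whole word on the kernel command line."""
    return flag in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as f:
        return f.read().strip()


# Made-up example; on a real machine use read_cmdline() instead.
sample = "BOOT_IMAGE=/boot/vmlinuz root=/dev/sda1 ro quiet splash noibpb"
print(has_kernel_flag(sample, "noibpb"))
```

Pairing this with a check of the running kernel release helps rule out having booted a different kernel than intended.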
=== sMFts2gy is now known as KM9K62TkUq
* Gargravarr is learning why netcat is such an invaluable tool17:50
Gargravarrtyhicks: it booted successfully without the noibpb flag, what does that do?17:52
tyhicksGargravarr: disables the problematic code path17:52
tyhicks(among other things)17:52
Gargravarrright, that explains it, version signature confirms i've booted the wrong kernel :)17:53
tyhickswoohoo! that's why I insisted on proof :)17:53
tyhicksI'm pretty confident that the fix is right17:53
tyhicksbut I really do appreciate your testing17:53
tyhicksso give it another go with the right kernel and keep your fingers crossed17:54
Gargravarrindeed. wasn't sure if i booted the right kernel in the first place (or whether i already had -38 installed), but it's bloody hard to see on these XPS13's - ours have frikkin' 4k screens17:56
Gargravarrtyhicks: so do you want me to boot -38 with or without that flag to test?18:00
Gargravarri'm guessing without18:03
=== KM9K62TkUq is now known as aaa_
tyhicksGargravarr: without18:03
Gargravarrokay, so everything is matching your output on comment #67 so far18:06
Gargravarrjust fired up sssd18:06
Gargravarrlet's see what happens...18:06
Gargravarrokay, it did an auth and failed on something else (my bad, quirk of our LDAP setup i think)18:08
Gargravarrmost importantly, it DIDN'T freeze18:08
tyhicksnice!18:10
Gargravarryuh, script crashed before it set up the PAM profile, lemme install it...18:10
tyhicksGargravarr: please paste the output of those commands so that I can double check your machine state18:11
tyhicksGargravarr: either in the bug report or via paste.ubuntu.com18:11
Gargravarrtyhicks: collecting it now18:13
tyhicksthanks18:13
Gargravarrhalle-freaking-lujah, logged in to desktop successfully18:14
Gargravarryour fix looks like the ticket18:15
tyhicks:)18:16
Gargravarrtyhicks: comment #7218:22
Gargravarrdoes the CPU generation make much difference on how badly affected the machine is? i notice our Skylake desktops haven't run into it yet, but the one machine i've checked in depth doesn't have intel-microcode installed. can't say for sure if the others do18:24
Gargravarrand all the ones that are affected are Kaby Lake18:24
tyhicksGargravarr: hmm, could you paste the output of 'sudo cat /sys/devices/system/cpu/cpu0/microcode/version' into this channel?18:24
tyhicksGargravarr: run that command on the machine that you left the bug comment about18:25
Gargravarrokay, lemme start it back up18:25
tyhicksas for CPU generation, I'm not sure... some of your machines may not have updated microcode18:25
Gargravarrindeed. i thought it was part of my build script18:25
Gargravarroh feck. that was stupid.18:26
tyhicksyou may not have the latest microcode on the machine that you left the comment about18:26
Gargravarrmy LDAP setup here involves storing the ecryptfs recovery key for a user directly in LDAP against their profile18:27
Gargravarri seem to have neglected to make an exception for root...18:27
Gargravarrtyhicks: 0x7c18:29
Gargravarrfor the XPS 13 running your -38 kernel18:29
Gargravarri7 7560U chip18:29
tyhicksGargravarr: what version of intel-microcode is installed?18:31
Gargravarr...it ain't18:32
Gargravarrinteresting. so i have (at some point) updated the BIOS (which includes microcode) to v2.5.0, released 18th Feb18:35
tyhicksGargravarr: and 'cat /proc/sys/kernel/ibpb_enabled' prints out '1'?18:35
Gargravarronly i can't find that version on Dell's site any more18:35
Gargravarrseems to have been entirely replaced by 2.5.1, which makes me think the .0 version is known broken18:36
tyhicksI think you should have revision 0x8418:37
tyhickssee https://newsroom.intel.com/wp-content/uploads/sites/11/2018/04/microcode-update-guidance.pdf for better info than I can provide18:37
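The revision comparison in this exchange can be sketched as follows, assuming the sysfs file prints a hexadecimal value such as 0x7c. The helper names are hypothetical, and 0x84 is only the revision tyhicks suggests for this particular CPU, not a general target.

```python
def parse_revision(text: str) -> int:
    """Parse a microcode revision such as "0x7c" into an integer."""
    return int(text.strip(), 16)


def is_older(current: str, expected: str) -> bool:
    """True if the current revision is lower than the expected one."""
    return parse_revision(current) < parse_revision(expected)


def read_revision(path: str = "/sys/devices/system/cpu/cpu0/microcode/version") -> str:
    """Read the loaded microcode revision of cpu0 (may require root)."""
    with open(path) as f:
        return f.read().strip()


# Values taken from this conversation; 0x84 is a suggestion for this CPU only.
print(is_older("0x7c", "0x84"))
```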
Gargravarrokay. i'll push the firmware up to the latest version while i'm at it18:38
tyhicksanyways, IBPB is available in your microcode so that's all that I needed to know18:38
tyhicksI need to get heads-down on these backports now18:38
tyhicksthanks again!18:38
Gargravarrtyhicks: okay, so updating the system firmware has indeed pushed the microcode up to 0x8418:46
tyhicksgood deal18:52
Gargravarrthis is where it gets confusing, with the BIOS and userspace stuff overlapping :)18:52
Gargravarrand it all meets in the kernel18:52
s10gopaljsalisbury, it is bad18:52
jsalisburys10gopal, ack.  thanks for testing. I'll build the next kernel and post it shortly.18:57
Gargravarrright, that's 3 hours past clocking-off time, time to go home :P18:58
