/srv/irclogs.ubuntu.com/2018/04/05/#ubuntu-kernel.txt

=== cyberzeus_ is now known as cyberzeus
s10gopaljsalisbury, it is also bad03:09
s10gopaljsalisbury, can you please teach me how to build kernel for test ?07:47
s10gopalapw, ^07:48
apws10gopal, this page has the broad methods: https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel07:50
apws10gopal, but it is different if you are building a mainline one, there you might well use make kpkg07:50
apws10gopal, but also unless you have a huge box, it takes many hours for a build07:51
apws10gopal, the one we build test kernels on has over 200 cores07:51
apws/cores/cpus07:51
TJ-apw: that's more cores than an apple tree!07:51
s10gopali have core i5 6th gen and 128gb ssd , how much time it is going to take ?07:51
s10gopal2 physical core07:52
apwi would be surprised if that is not in the range of 4-6 hours07:52
apwthat would count as "tiny" on the scale of a machine to build the kernel07:53
TJ-s10gopal: I believe jsalisbury was planning on building the bisected kernel packages for you, you just need to report back in the LP bug report each time as to whether the last test was successful or not07:53
apwwhich is why having j-salisbury help you07:53
apwshould be speeding things up07:53
s10gopali thought , if i build kernel myself i dont have to wait for him to come online07:53
TJ-s10gopal: if you use the LP bug report to leave comments you don't need to wait for him either, he'll build a new package and comment in the bug with a link to the download location when it is ready07:55
apws10gopal, indeed you would not though i am not convinced you could get a kernel built before he comes online anyhow07:56
s10gopalapw, everyone can access that super computer or just developers can ?08:08
apws10gopal, that is an internal system08:09
lorddoskiascompiling an upstream kernel with gcc-5 (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 results in thread_union type being erased from the resulting kernel image. When i try compiling with gcc-7 (Ubuntu 7.2.0-1ubuntu1~16.04) 7.2.0 thread_union is there? 09:56
apwlorddoskias, erased in what sense ?10:21
lorddoskiasapw: in the sense that if i open the resulting vmlinux with gdb and write "ptype union thread_union" i get "no such type" 10:22
lorddoskiasand as a result the "crash" utility is not working correctly as well 10:22
apwlorddoskias, that does sound like a bug (or limitation) in the older compiler10:28
lorddoskiaswell the older compiler is whatever is shipped as lts for ubuntu 16.04.0410:28
lorddoskiasso in fact it is the supported compiler 10:28
lorddoskiasso the 5.4 has retpoline support enabled, whereas the 7.2.0 doesn't  - the 16.04 backport is from 201710:29
apwlorddoskias, indeed, debugging with older compilers might be hard, though i assume you can dump that thing out10:30
apwas you do know the type it contains10:30
lorddoskiaswell given that the oldest kernel supported by the upstream kernel is 4.8 i won't call 5.4 old 10:31
lorddoskiastrue, there is gcc 8 release approaching but still 10:31
apwit builds it just fine, right10:31
lorddoskiasbuild wise it's ok 10:32
apwand you know which of those three things are in there, in your context10:32
apwso you can just dump the address as which ever it is10:32
apwi assume it is a lack of representation of unions, though i can find no direct evidence of that being improved in later compilers in a quick search10:33
lorddoskiasbut ptype hpet_lock works. and hpet_lock is defined as: arch/x86/kernel/hpet.c:union hpet_lock {10:34
lorddoskiasptype union ftrace_op_code_union also works 10:35
lorddoskiasso it's not being applied to all unions 10:35
apwlorddoskias, from your 7.2 print of the type, does it contain anything other than stack, ie are the other optional fields in there10:36
apwlorddoskias, i am wondering if it is because it is a union with only one member10:37
lorddoskiaslet me rebuild and see 10:37
apwas the others you list all definitely have more than one member10:37
apwit is possible there is only one member, so the type is optimised away10:38
lorddoskiasi think this is an invalid optimisation, no ?10:39
lorddoskiasand in fact here we are talking about debug info 10:39
apwdepends how you look at it, the type is technically redundant if it only has one member, the inner type can be promoted10:40
apwso if the optimiser did that, then there would be no references to the type in the emitted code10:40
lorddoskiasso how come the newer compiler with presumably more advanced optimisation logic omits this optimisation but the older one doesn't?10:40
apwand so emitting the type would be a waste of space in the debug info10:40
lorddoskiasbut if i specifically configured the kernel to produce debug info then presumably i don't care much about optimisations,esp of the debug info 10:41
lorddoskiasthat sounds ludicrous 10:41
lorddoskiastype = union thread_union {10:41
lorddoskias    struct task_struct task;10:41
lorddoskias    unsigned long stack[2048];10:41
lorddoskias}10:41
apwso not that then10:41
lorddoskiasso you are not right, because there are 2 members10:41
apwindeed10:41
apwit was a theory, you tested it, it failed10:42
apwthat is the nature of theories10:42
lorddoskiasalso i have : # CONFIG_DEBUG_INFO_REDUCED is not set10:42
lorddoskiasso the compiler should really be even more verbose10:42
apwthe compiler is still only going to include what it thinks it needs, and it is clearly wrong in this case10:43
apwas many files include headers with lots of types, which you don't use10:43
apwso including them all would be dumb, it obviously thinks it can remove that one, incorrectly10:43
lorddoskiasalso i've been compiling kernels with older version of the LTS compiler 10:43
apwbut also in this case, you really have a clear way to dump this10:43
lorddoskiasah, freenode, being free node 10:46
apwbut also in this case, you really have a clear way to dump this10:46
apwso i don't see a huge amount of value in actually caring10:46
lorddoskiasapw: what do you mean by having a clear way to dump this10:46
lorddoskiasapw: the amount of value caring is the fact that crash, the utility which is used to debug kernel crashes 10:46
apwby definition with a full union of this nature, you as dumper needs10:47
lorddoskiasrelies on thread_union being defined in order for it to adjust internal offsets and be  able to dump stacks 10:47
apwto know which of the N types it contains10:47
apwso you can dump it yourself no?10:47
lorddoskiaswhat do you mean by dump it? you keep insisting on this?10:47
apwit is either a thread_thing or it is the raw stack10:47
apwwhy do you care this type exists or not10:48
apwpresumably to display the contents10:48
lorddoskiasthe structure is needed so that programmatically the stack size can be deduced 10:48
lorddoskiasbecause crash adjusts its internal stack size and is able to dump tasks stack?10:48
apwthe stack size is also defined by STACK_SIZE no ?10:48
apws/STACK_SIZE/THREAD_SIZE10:48
apwand on all of the architectures we support stacks grow downwards10:49
lorddoskiasit is defined based on THREAD_SIZE / target arch size of ulong 10:49
apwwell it is defined as unsigned long foo[THREAD_SIZE/sizeof(long)] so it is exactly10:50
apwTHREAD_SIZE bytes long10:50
apwon the presumption that none of the other things are bigger than that of course10:50
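To illustrate the arithmetic apw describes, here is a minimal Python sketch. It assumes a 64-bit target where sizeof(unsigned long) is 8, and it takes the 2048-element stack array from the gdb ptype output pasted earlier in this log. The function name stack_bytes is made up for the example and is not part of crash or the kernel.

```python
def stack_bytes(elements: int, sizeof_long: int) -> int:
    """Size in bytes of an array declared as `unsigned long stack[elements]`."""
    return elements * sizeof_long


# Assumption: 64-bit target, so sizeof(unsigned long) == 8.
SIZEOF_LONG = 8
# Taken from the ptype output in this log: unsigned long stack[2048];
ELEMENTS = 2048

thread_size = stack_bytes(ELEMENTS, SIZEOF_LONG)
print(thread_size)  # 16384 bytes, i.e. 16 KiB
```

Under those assumptions the array length alone recovers THREAD_SIZE, which is why a tool that cannot see the union type loses a convenient way to deduce the stack size.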
lorddoskiashuhz 10:50
lorddoskiasso you are going to try everything than admit that the compiler is broken and basically you (i mean ubuntu, not you personally ) are not interested in fixing this? 10:50
lorddoskiasis that your position?10:51
apwno i am trying to help you get past this blocker to you10:51
apwi am sure the compiler is broken, if the newer one does the right things on the same code10:51
apwif you file a bug someone might be able to fix it, but the reality is10:51
lorddoskiasi can get past this blocker by hacking around crash but that's not the point, the point is that ubuntus LTS toolchain should be producing working debug info, no ?10:51
apwthat fixing the compiler contains risk10:51
apwand doing that for something which only affects debugging information is going to 10:52
apwbe a non-obvious risk balance10:52
lorddoskiaswhat if i prove an earlier version of the LTS compiler worked, and that the current one is a regression ?10:52
lorddoskiaswhat then?10:52
apwnot that that is my call as a kernel engineer10:52
apwthen that is a more compelling case to fix it10:53
apwbut overall it would be a decision for the compiler maintainer not myself, so i wouldn't10:53
apwread a whole heap into my opinion.  updating compilers always makes me afraid10:53
apwmy only care was to see if i could help you get past it without needing to fix it10:54
apwas that is not going to take 0 time10:54
apweven if they don't won't fix the bug10:54
apwlorddoskias, do either way file a bug against gcc-5, just don't expect it fixed in short order10:55
lorddoskiasapw: i have bitter experience with reporting bugs to ubuntu in that those bugs are rarely acted upon. for example 1605843 ...10:56
apwLP: #160584310:56
ubot5`Launchpad bug 1605843 in linux (Ubuntu) "Kernel crashes from time to time when using ftrace " [Medium,Confirmed] https://launchpad.net/bugs/160584310:56
lorddoskiaswhere i've provided asm dumps of what's wrong and even the commit that's supposed to fix it so it would have been a matter of cherry picking it up 10:56
apwa fix which if i am reading history right was applied and released in 4.4.0-67.8810:58
lorddoskiasdunno, i switched to the . releases of the lts so i'm currently using: 4.13.0-37-generic10:59
lorddoskiasso it works there, but at the time following multiple kernel releases the issue wasn't fixed10:59
apwit did get fixed, about a year back, though the bug does not appear to have been referenced11:00
lorddoskiasit could have even been closed 11:01
apwand i have done so, we have a very large volume of bugs, it is very easy to lose individual ones over time11:01
lorddoskiasso what's the suggested way to downgrade a package ?11:03
apwi have tended to get them from the launchpad librarian and install them with dpkg11:05
apwhttps://launchpad.net/ubuntu/+source/gcc-5/+publishinghistory11:05
lorddoskiaswell apt-cache madison seems to be helpful11:06
apwthat only shows you versions in the pool i assume11:07
lorddoskiashuhz, but then this barfs about unmet dependencies huhz 11:09
dijuremoapw: Arggg my coworker turned off the DELL T3610 where I was testing prime95 on the patched kernel 4.13.0-38.43+lp1759920 I am going to move it elsewhere and start again. As far as I can tell, he properly powered it off at around 3PM, so that means 8-ish hours without freezing.11:09
=== gpiccoli_ is now known as gpiccoli
s10gopalhow to extract .config from kernel?11:27
apws10gopal, our configs are in /boot alongside the binary kernel11:29
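As a small sketch of where that file lives: on Ubuntu the config for an installed kernel is normally named /boot/config-<release>, where <release> is what `uname -r` prints. The helper name boot_config_path below is hypothetical.

```python
import os
import platform


def boot_config_path(release: str) -> str:
    """Path of the config file shipped next to the kernel for `release`."""
    return f"/boot/config-{release}"


# platform.release() returns the same string as `uname -r`.
path = boot_config_path(platform.release())
print(path, "exists" if os.path.exists(path) else "not found")
```

Copying that file to .config in the source tree is a common starting point for a custom build, though this log does not say which config s10gopal ended up using.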
s10gopalthx11:30
s10gopal!ping11:38
ubot5`pong!11:38
s10gopalhow to fix it ? scripts/sign-file.c:25:30: fatal error: openssl/opensslv.h: No such file or directory12:00
s10gopal #include <openssl/opensslv.h>12:00
apwyou need to build with the build-dependencies for the kernel installed12:00
apwnormally you do that by building in a chroot with those installed into it12:00
s10gopalapw, can you please explain it ?12:01
s10gopalapw, i am using time make deb-pkg12:01
cascardos10gopal: man debuild12:07
bjfs10gopal, https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel12:08
s10gopalthx12:10
s10gopali am at   CC [M]  drivers/media/pci/ivtv/ivtvfb.o.  how much time it will take to finish ?12:47
apws10gopal, it builds in parallel, so hard to say, when i do a build it takes about 10m but they are scrolling past fast enough to be hard to read13:01
=== TJ_Remix is now known as TJ-
s10gopalhow to fix ./scripts/package/builddeb: line 33: dpkg-gencontrol: command not found                     make[1]: *** [deb-pkg] Error 127   make: *** [deb-pkg] Error 2 ?14:20
cascardos10gopal: do you have dpkg-dev installed?14:22
cascardos10gopal: apt install build-essential ; apt-get build-dep linux14:23
s10gopalcascardo, it was not , i need to do time make deb-pkg  again ?14:23
cascardos10gopal: if you are running make deb-pkg, you are doing it wrong to build an ubuntu kernel14:23
s10gopalcascardo, how to do it?14:24
cascardoif you want a deb package for a mainline kernel, then you might want to use make deb-pkg14:24
cascardos10gopal: did you read the documentation pointed at you?14:24
s10gopalcascardo, yes , but not complete14:25
s10gopalin place of make -j `getconf _NPROCESSORS_ONLN` deb-pkg LOCALVERSION=-custom i should use make -j4 `getconf _NPROCESSORS_ONLN` deb-pkg LOCALVERSION=-custom to use all my 4 core ?14:27
s10gopalcascardo, ^14:28
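On the -j question: make's -j option takes a single job count, and the backquoted getconf already supplies the number of online CPUs. Writing -j4 followed by the getconf output would hand make an extra bare number, which it would treat as a target name. The Python sketch below only assembles the command line for illustration; the helper name make_deb_pkg_command is hypothetical, and os.sysconf("SC_NPROCESSORS_ONLN") is assumed to be available, as it is on Linux.

```python
import os


def make_deb_pkg_command(jobs: int, localversion: str = "-custom") -> list:
    """Build the argument list for a parallel `make deb-pkg` run."""
    return ["make", f"-j{jobs}", "deb-pkg", f"LOCALVERSION={localversion}"]


# Same value that `getconf _NPROCESSORS_ONLN` prints on Linux.
jobs = os.sysconf("SC_NPROCESSORS_ONLN")
print(" ".join(make_deb_pkg_command(jobs)))
```

On a machine with four online CPUs this prints the same command as the original wiki form, with -j4 and nothing extra after it.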
cascardos10gopal: what documentation did you get that from?14:31
cascardocertainly not from https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel14:31
s10gopalhttps://wiki.ubuntu.com/KernelTeam/GitKernelBuild , it is easy i can understand them14:31
s10gopalit is wrong?14:32
cascardoit depends on what you want to achieve14:32
s10gopali want to find bad commit14:33
cascardos10gopal: from an ubuntu kernel or a mainline kernel?14:33
s10gopalbisect v4.12 and v4.13-rc114:33
cascardos10gopal: you only need to use make deb-pkg, then, if you want to install those kernels using deb packages14:34
s10gopalmainline , http://kernel.ubuntu.com/~kernel-ppa/mainline/14:34
cascardobut, yes, probably easier to just use make deb-pkg in that case14:34
s10gopali need to repeat the whole process again ?14:35
s10gopaltime make deb-pkg ? and again it is going to take 100 min or it will be faster this time ?14:35
cascardos10gopal: it's possible it's gonna just do the last steps, if you haven't run any clean command14:39
s10gopaland how to find the kernel which i made ? 14:39
cascardoshould be in the top dir14:40
s10gopalat $HOME ?14:42
TJ-deb-pkg should put the .deb files in ../ (parent directory)14:44
TJ-or if 'make bzImage' in ./arch/x86/boot/14:45
s10gopalTJ-, a user on ##linux taught me how to build kernel from git , how i can find ##linux log ?14:45
TJ-s10gopal: I have no idea, where did you put it?14:46
s10gopallike #ubuntu logs are at http://irclogs.ubuntu.com/14:47
s10gopalwhat is the difference between time make and make ?14:48
TJ-s10gopal: oh, thought you meant you'd kept a log locally of the commands. I don't know if ##linux logs, I think it's up to you to do it locally14:49
s10gopalthx14:50
s10gopaljsalisbury, i have learned how to make kernel , i should test kernel build by you or i can build and test ?14:51
=== himcesjf_ is now known as him-cesjf
dijuremoMay I pick your brains, kernel gurus? I am looking at some RHEL VMs, the hypervisor is patched and I manually updated the microcode to the 20180312 release, but the VM still shows as vulnerable. Does QEMU, etc need to be patched as well for the vulnerability to be mitigated?17:53
apwyes qemu needs to be updated to pass through the spec bits17:55
dijuremo--> runs to open RH support case....17:55
dijuremoLOL17:55
tyhicksdijuremo: they may have patched everything but you might be using a model which doesn't support IBRS18:09
tyhicksdijuremo: to see if such a model is available, you can run 'virsh cpu-models x86_64 | grep IBRS'18:10
tyhicks(IBRS support comes as a separate, unique model so that live migration wasn't broken with the addition of IBRS support to the CPU model)18:11
tyhicks(you can also use <cpu mode='host-passthrough'> in your domain xml if you don't plan on migrating VMs to another machine)18:12
dijuremo# virsh -r cpu-models x86_64 | grep IBRS 18:15
dijuremoNehalem-IBRS 18:15
dijuremoWestmere-IBRS 18:15
dijuremoSandyBridge-IBRS 18:15
dijuremoIvyBridge-IBRS 18:15
dijuremoHaswell-noTSX-IBRS 18:15
dijuremoHaswell-IBRS 18:15
dijuremoBroadwell-noTSX-IBRS 18:15
dijuremoBroadwell-IBRS 18:15
dijuremoSkylake-Client-IBRS18:15
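The grep above can be mirrored in a few lines of Python when the model list has already been captured as text. This is only a sketch: the sample input is a shortened, illustrative list and not real output from this host, and the function name ibrs_models is made up.

```python
def ibrs_models(virsh_output: str) -> list:
    """Return the CPU model names that contain 'IBRS', one per input line."""
    return [line.strip() for line in virsh_output.splitlines() if "IBRS" in line]


# Illustrative sample only; a real list comes from `virsh cpu-models x86_64`.
sample = """Nehalem
Nehalem-IBRS
Haswell-noTSX
Haswell-noTSX-IBRS
"""
print(ibrs_models(sample))  # ['Nehalem-IBRS', 'Haswell-noTSX-IBRS']
```

An empty result would suggest that the installed libvirt does not know the IBRS variants, while a non-empty one only shows they are available, not that any guest is configured to use them.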
dijuremoSo it seems it does support them, but I am not quite sure to force their use.18:16
dijuremoI do not manage the RHEV Manager machine and it is outdated, running 3.5 so that may be a problem. Perhaps they need to update that so that I can see Nehalem-IBRS since I only see the CPU types without -IBRS as available.18:16
dijuremoI am a proxy for this question, a friend asks related to the ibpb hangups that you guys fixed with the 4.13.0-38.43+lp1759920 patches in the Ubuntu kernel:  which debian kernels have, or will have the fix, or maybe what to look for in the upstream kernel changelog?18:24
tyhicksthe problematic patch that was sent around during the spectre embargo is "x86/mm: Only set IBPB when the new thread cannot ptrace current thread"18:32
tyhicksinstead of using that patch, this upstream patch should be used:18:33
tyhickshttps://git.kernel.org/linus/18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c718:33
dijuremotyhicks: Any idea how I can find out what the latest microcode version should be for a specific CPU, Intel(R) Xeon(R) CPU E5530? I tried expanding the intel-ucode folder on one of my RHEL servers, but after reboot it is not using IBPB/IBRS. dmesg reports: microcode: CPU0 sig=0x106a5, pf=0x1, revision=0x1919:02
tyhicksdijuremo: you'll find that info here: https://newsroom.intel.com/wp-content/uploads/sites/11/2018/04/microcode-update-guidance.pdf19:03
dijuremotyhicks: So I note there that they have the new microcode 0x1c but dmesg is reporting in this machine 0x19, likely old from the BIOS. Any idea where I may be able to find out how to force load the new microcode?19:08
dijuremoOh I think I know, I need to rebuild initrd, right?19:10
dijuremoNope... no luck... still showing revision=0x19 after reboot.19:14
s10gopaljsalisbury: can you please build a set of kernels so  i can test them without waiting for you to come online ?19:28
jsalisburys10gopal, The thing with a bisect is you need to know if the current test kernel is good or bad to determine which test kernel to build next.19:29
s10gopalsuppose we are at 10 so build a kernel for good and another for bad 19:30
s10gopalat commit 5 and at commit 1519:30
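The idea in this exchange can be sketched as a plain binary search over commit positions. The numbers follow s10gopal's example (currently testing commit 10, then 5 or 15), with a hypothetical range of 0 (known good) to 20 (known bad); real git bisect works on a commit graph rather than integers, so this is only an illustration.

```python
def midpoint(good: int, bad: int) -> int:
    """Commit position halfway between a known-good and a known-bad one."""
    return (good + bad) // 2


def next_candidates(good: int, bad: int) -> tuple:
    """Return (current test, next test if current is bad, next test if current is good)."""
    mid = midpoint(good, bad)
    if_bad = midpoint(good, mid)   # regression is at mid or earlier
    if_good = midpoint(mid, bad)   # regression is after mid
    return mid, if_bad, if_good


print(next_candidates(0, 20))  # (10, 5, 15)
```

Building both follow-up kernels ahead of time saves waiting, at the cost of one wasted build per step, and it makes it easier to report a result against the wrong kernel.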
s10gopaljsalisbury: on my laptop it takes 3.5 hour to make kernel from source , how i can speed it up ?19:31
s10gopali am already using -j519:31
s10gopal-j419:31
jsalisburys10gopal, I can build you the next two kernels.  We just have to be careful when testing and reporting test results.  It's easy to make a mistake.19:31
jsalisburys10gopal, you could get a faster computer ;-)19:32
jsalisburys10gopal, Seriously though, if it takes that long, tuning isn't going to speed it up much faster19:32
s10gopali am having core i5 6200u , 12gb ram and 128gb ssd19:33
s10gopalcan you please teach me how to remove modules from kernel , then it will compile fast ?19:34
jsalisburys10gopal, You would need to edit config options.19:35
jsalisburys10gopal, figuring all that out would probably take more time than just doing the bisect19:35
s10gopaljsalisbury: can i make kernel on remote machine ?19:37
jsalisburys10gopal, sure, if you have a remote machine.19:37
s10gopalit is possible to rent one online ? and how much it is going to cost ?19:38
jsalisburys10gopal, That I have no idea.  You could get an Amazon or Google cloud image, but I really don't think that would be worth it.19:39
s10gopaljsalisbury: it is good19:39
s10gopal4.12.0-041200-generic #20180405111819:39
jsalisburys10gopal, this bisect is going to tell us which commit introduced the regression.  We are then going to have to work with upstream to fix up that commit or revert it.19:40
jsalisburys10gopal, I can build the next kernel now19:41
s10gopaljsalisbury: ok i will test it19:41
tyhicksdijuremo: I just took a quick look and I suspect that Intel didn't fully update their slides19:47
tyhicksdijuremo: all the 106A5 CPUIDs have been marked RED, indicating that they're not going to update the microcode, except for yours19:48
tyhicksdijuremo: also, your processor isn't listed as receiving a microcode update in their release notes: https://downloadcenter.intel.com/download/27591/Linux-Processor-Microcode-Data-File?product=87319:49
tyhicksdijuremo: I suspect that they forgot to change your processor's row to RED in the slides :/19:49
tyhicksthe 106a5 microcode in their latest bundle matches what your kernel is reporting:19:50
tyhicks$ iucode-tool -L microcode.dat  | grep -i 106a519:50
tyhicks  01/129: sig 0x000106a5, pf mask 0x03, 2013-06-21, rev 0x0019, size 1024019:50
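The comparison tyhicks makes by eye can also be done mechanically. The sketch below parses the two lines quoted in this log, the iucode-tool listing entry and the dmesg microcode line, and compares the revisions. The regular expressions are assumptions based only on those two sample lines, and the function names are made up.

```python
import re


def bundle_revision(iucode_line: str) -> int:
    """Extract the microcode revision from an `iucode-tool -L` listing line."""
    match = re.search(r"rev (0x[0-9a-fA-F]+)", iucode_line)
    if match is None:
        raise ValueError("no 'rev 0x...' field found")
    return int(match.group(1), 16)


def running_revision(dmesg_line: str) -> int:
    """Extract the loaded microcode revision from a dmesg 'microcode:' line."""
    match = re.search(r"revision=(0x[0-9a-fA-F]+)", dmesg_line)
    if match is None:
        raise ValueError("no 'revision=0x...' field found")
    return int(match.group(1), 16)


# Both lines are copied from this log.
iucode_line = "01/129: sig 0x000106a5, pf mask 0x03, 2013-06-21, rev 0x0019, size 10240"
dmesg_line = "microcode: CPU0 sig=0x106a5, pf=0x1, revision=0x19"

print(bundle_revision(iucode_line) == running_revision(dmesg_line))  # True
```

Equal revisions mean the bundle holds nothing newer than what the CPU is already running, which matches the failed attempts to load an update earlier in the conversation.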
dijuremoWow, that is so screwed up... How is one going to figure that out when the slides show yellow and also a firmware 0x1c  and their download page says This download is valid for the product(s) listed below. and includes Intel® Xeon® Processor E5530 (8M Cache, 2.40 GHz, 5.86 GT/s Intel® QPI)19:52
ntdi have a few grievances with current 4.15, should be worked out before bionic rtm :)20:04
ntd"rtm"20:04
aaa_what is rtm?21:36
TJ-Release To Manufacturing21:37
aaa_ty22:39
aaa_:)22:39
ntdThere are some confirmed and reproducible issues with current 4.15 that really should be ironed out before bionic release (potential userbase won't be able to boot)23:23
ntdanyone i can talk to about it?23:23
