=== cyberzeus_ is now known as cyberzeus
[03:09] jsalisbury, it is also bad
[07:47] jsalisbury, can you please teach me how to build kernel for test ?
[07:48] apw, ^
[07:50] s10gopal, this page has the broad methods: https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel
[07:50] s10gopal, but it is different if you are building a mainline one, there you might well use make kpkg
[07:51] s10gopal, but also unless you have a huge box, it takes many hours for a build
[07:51] s10gopal, the one we build test kernels on has over 200 cores
[07:51] s/cores/cpus
[07:51] apw: that's more cores than an apple tree!
[07:51] i have core i5 6th gen and 128gb ssd , how much time is it going to take ?
[07:52] 2 physical cores
[07:52] i would be surprised if that is not in the range of 4-6 hours
[07:53] that would count as "tiny" on the scale of a machine to build the kernel
[07:53] s10gopal: I believe jsalisbury was planning on building the bisected kernel packages for you, you just need to report back in the LP bug report each time as to whether the last test was successful or not
[07:53] which is why having j-salisbury help you
[07:53] should be speeding things up
[07:53] i thought , if i build kernel myself i dont have to wait for him to come online
[07:55] s10gopal: if you use the LP bug report to leave comments you don't need to wait for him either, he'll build a new package and comment in the bug with a link to the download location when it is ready
[07:56] s10gopal, indeed you would not, though i am not convinced you could get a kernel built before he comes online anyhow
[08:08] apw, everyone can access that super computer or just developers can ?
[08:09] s10gopal, that is an internal system
[09:56] compiling an upstream kernel with gcc-5 (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 results in the thread_union type being erased from the resulting kernel image. When i try compiling with gcc-7 (Ubuntu 7.2.0-1ubuntu1~16.04) 7.2.0 the thread_union is there?
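A sketch of how the parallel job count for such a build is usually chosen (assumes a Linux host with `getconf`; `deb-pkg` is the mainline build target discussed on the wiki page above):

```shell
# Size the build's -j flag to the number of online CPUs
jobs=$(getconf _NPROCESSORS_ONLN)
echo "would run: make -j${jobs} deb-pkg"
# inside a kernel tree you would actually run:
#   make -j"${jobs}" deb-pkg
```

On the 2-core/4-thread i5 being discussed this prints `make -j4 deb-pkg`, which is why a multi-hour build time is expected.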
[10:21] lorddoskias, erased in what sense ?
[10:22] apw: in the sense that if i open the resulting vmlinux with gdb and write "ptype union thread_union" i get "no such type"
[10:22] and as a result the "crash" utility is not working correctly as well
[10:28] lorddoskias, that does sound like a bug (or limitation) in the older compiler
[10:28] well the older compiler is whatever is shipped as lts for ubuntu 16.04.04
[10:28] so in fact it is the supported compiler
[10:29] so the 5.4 has retpoline support enabled, whereas the 7.2.0 doesn't - the 16.04 backport is from 2017
[10:30] lorddoskias, indeed, debugging with older compilers might be hard, though i assume you can dump that thing out
[10:30] as you do know the type it contains
[10:31] well given that the oldest gcc supported by the upstream kernel is 4.8 i won't call 5.4 old
[10:31] true, there is the gcc 8 release approaching but still
[10:31] it builds it just fine, right
[10:32] build wise it's ok
[10:32] and you know which of those three things are in there, in your context
[10:32] so you can just dump the address as whichever it is
[10:33] i assume it is a lack of representation of unions, though i can find no direct evidence of that being improved in later compilers in a quick search
[10:34] but ptype hpet_lock works. and hpet_lock is defined as: arch/x86/kernel/hpet.c:union hpet_lock {
[10:35] ptype union ftrace_op_code_union also works
[10:35] so it's not being applied to all unions
[10:36] lorddoskias, from your 7.2 print of the type, does it contain anything other than stack, ie are the other optional fields in there
[10:37] lorddoskias, i am wondering if it is because it is a union with only one member
[10:37] let me rebuild and see
[10:37] as the others you list all definitely have more than one member
[10:38] it is possible there is only one member, so the type is optimised away
[10:39] i think this is an invalid optimisation, no ?
[10:39] and in fact here we are talking about debug info
[10:40] depends how you look at it, the type is technically redundant if it only has one member, the inner type can be promoted
[10:40] so if the optimiser did that, then there would be no references to the type in the emitted code
[10:40] so how come the newer compiler with presumably more advanced optimisation logic omits this optimisation but the older one doesn't?
[10:40] and so emitting the type would be a waste of space in the debug info
[10:41] but if i specifically configured the kernel to produce debug info then presumably i don't care much about optimisations, especially of the debug info
[10:41] that sounds ludicrous
[10:41] type = union thread_union {
[10:41]     struct task_struct task;
[10:41]     unsigned long stack[2048];
[10:41] }
[10:41] so not that then
[10:41] so you are not right, because there are 2 members
[10:41] indeed
[10:42] it was a theory, you tested it, it failed
[10:42] that is the nature of theories
[10:42] also i have : # CONFIG_DEBUG_INFO_REDUCED is not set
[10:42] so the compiler should really be even more verbose
[10:43] the compiler is still only going to include what it thinks it needs, and it is clearly wrong in this case
[10:43] as many files include headers with lots of types, which you don't use
[10:43] so including them all would be dumb, it obviously thinks it can remove that one, incorrectly
[10:43] also i've been compiling kernels with older versions of the LTS compiler
[10:43] but also in this case, you really have a clear way to dump this
[10:46] ah, freenode, being free node
[10:46] but also in this case, you really have a clear way to dump this
[10:46] so i don't see a huge amount of value in actually caring
[10:46] apw: what do you mean by having a clear way to dump this
[10:46] apw: the amount of value in caring is the fact that crash, the utility which is used to debug kernel crashes
[10:47] by definition with a full union of this nature, you as dumper need
[10:47] relies on thread_union being defined in order for it to adjust internal offsets and be able to dump stacks
[10:47] to know which of the N types it contains
[10:47] so you can dump it yourself no?
[10:47] what do you mean by dump it? you keep insisting on this?
[10:47] it is either a thread_thing or it is the raw stack
[10:48] why do you care this type exists or not
[10:48] presumably to display the contents
[10:48] the structure is needed so that programmatically the stack size can be deduced
[10:48] because crash adjusts its internal stack size and is able to dump tasks' stacks?
[10:48] the stack size is also defined by STACK_SIZE no ?
[10:48] s/STACK_SIZE/THREAD_SIZE
[10:49] and on all of the architectures we support stacks grow downwards
[10:49] it is defined based on THREAD_SIZE / target arch size of ulong
[10:50] well it is defined as unsigned long foo[THREAD_SIZE/sizeof(long)] so it is exactly
[10:50] THREAD_SIZE bytes long
[10:50] on the presumption that none of the other things are bigger than that of course
[10:50] huhz
[10:50] so you are going to try everything rather than admit that the compiler is broken and basically you (i mean ubuntu, not you personally ) are not interested in fixing this?
[10:51] is that your position?
[10:51] no i am trying to help you get past this blocker to you
[10:51] i am sure the compiler is broken, if the newer one does the right things on the same code
[10:51] if you file a bug someone might be able to fix it, but the reality is
[10:51] i can get past this blocker by hacking around crash but that's not the point, the point is that ubuntu's LTS toolchain should be producing working debug info, no ?
[10:51] that fixing the compiler contains risk
[10:52] and doing that for something which only affects debugging information is going to
[10:52] be a non-obvious risk balance
[10:52] what if i prove an earlier version of the LTS compiler worked, and that the current one is a regression ?
[10:52] what then?
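The THREAD_SIZE arithmetic above can be checked directly (a sketch; 16 KiB is an assumed THREAD_SIZE, the usual x86_64 value, and 8 is sizeof(unsigned long) on a 64-bit arch):

```shell
# stack[] is declared as unsigned long stack[THREAD_SIZE/sizeof(long)],
# so it spans exactly THREAD_SIZE bytes
THREAD_SIZE=$((4 * 4096))             # assumed x86_64 value: 16 KiB
LONG_SIZE=8                           # assumed sizeof(unsigned long) on 64-bit
elements=$((THREAD_SIZE / LONG_SIZE))
echo "stack[$elements] spans $((elements * LONG_SIZE)) bytes"
```

This yields stack[2048], matching the ptype output quoted earlier in the log.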
[10:52] not that that is my call as a kernel engineer
[10:53] then that is a more compelling case to fix it
[10:53] but overall it would be a decision for the compiler maintainer not myself, so i wouldn't
[10:53] read a whole heap into my opinion. updating compilers always makes me afraid
[10:54] my only care was to see if i could help you get past it without needing to fix it
[10:54] as that is not going to take 0 time
[10:54] even if they won't fix the bug
[10:55] lorddoskias, either way do file a bug against gcc-5, just don't expect it fixed in short order
[10:56] apw: i have bitter experience with reporting bugs to ubuntu in that those bugs are rarely acted upon. for example 1605843 ...
[10:56] LP: #1605843
[10:56] Launchpad bug 1605843 in linux (Ubuntu) "Kernel crashes from time to time when using ftrace " [Medium,Confirmed] https://launchpad.net/bugs/1605843
[10:56] where i've provided asm dumps of what's wrong and even the commit that's supposed to fix it so it would have been a matter of cherry-picking it
[10:58] a fix which if i am reading history right was applied and released in 4.4.0-67.88
[10:59] dunno, i switched to the . releases of the lts so i'm currently using: 4.13.0-37-generic
[10:59] so it works there, but at the time following multiple kernel releases the issue wasn't fixed
[11:00] it did get fixed, about a year back, though the bug does not appear to have been referenced
[11:01] it could have even been closed
[11:01] and i have done so, we have a very large volume of bugs, it is very easy to lose individual ones over time
[11:03] so what's the suggested way to downgrade a package ?
[11:05] i have tended to get them from the launchpad librarian and install them with dpkg
[11:05] https://launchpad.net/ubuntu/+source/gcc-5/+publishinghistory
[11:06] well apt-cache madison seems to be helpful
[11:07] that only shows you versions in the pool i assume
[11:09] huhz, but then this barfs about unmet dependencies huhz
[11:09] apw: Arggg my coworker turned off the DELL T3610 where I was testing prime95 on the patched kernel 4.13.0-38.43+lp1759920 I am going to move it elsewhere and start again. As far as I can tell, he properly powered it off at around 3PM, so that means 8-ish hours without freezing.
=== gpiccoli_ is now known as gpiccoli
[11:27] how to extract .config from kernel?
[11:29] s10gopal, our configs are in /boot alongside the binary kernel
[11:30] thx
[11:38] !ping
[11:38] pong!
[12:00] how to fix it ? scripts/sign-file.c:25:30: fatal error: openssl/opensslv.h: No such file or directory
[12:00] #include <openssl/opensslv.h>
[12:00] you need to build with the build-dependencies for the kernel installed
[12:00] normally you do that by building in a chroot with those installed into it
[12:01] apw, can you please explain it ?
[12:01] apw, i am using time make deb-pkg
[12:07] s10gopal: man debuild
[12:08] s10gopal, https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel
[12:10] thx
[12:47] i am at CC [M] drivers/media/pci/ivtv/ivtvfb.o. how much time will it take to finish ?
[13:01] s10gopal, it builds in parallel, so hard to say, when i do a build it takes about 10m but they are scrolling past fast enough to be hard to read
=== TJ_Remix is now known as TJ-
[14:20] how to fix ./scripts/package/builddeb: line 33: dpkg-gencontrol: command not found make[1]: *** [deb-pkg] Error 127 make: *** [deb-pkg] Error 2 ?
[14:22] s10gopal: do you have dpkg-dev installed?
[14:23] s10gopal: apt install build-essential ; apt-get build-dep linux
[14:23] cascardo, it was not , i need to do time make deb-pkg again ?
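The "configs are in /boot" answer above can be sketched as a short sequence (illustrative; the copy and `make olddefconfig` lines are what you would run inside a kernel source tree):

```shell
# Locate the config Ubuntu ships alongside the running kernel image
cfg="/boot/config-$(uname -r)"
echo "$cfg"
# to seed a kernel tree's build config from it, you would then run:
#   cp "$cfg" .config && make olddefconfig
```

`make olddefconfig` fills in defaults for any options newer than the copied config, which matters when reusing a distro config against a different kernel version.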
[14:23] s10gopal: if you are running make deb-pkg, you are doing it wrong to build an ubuntu kernel
[14:24] cascardo, how to do it?
[14:24] if you want a deb package for a mainline kernel, then you might want to use make deb-pkg
[14:24] s10gopal: did you read the documentation pointed at you?
[14:25] cascardo, yes , but not completely
[14:27] in place of make -j `getconf _NPROCESSORS_ONLN` deb-pkg LOCALVERSION=-custom i should use make -j4 `getconf _NPROCESSORS_ONLN` deb-pkg LOCALVERSION=-custom to use all my 4 cores ?
[14:28] cascardo, ^
[14:31] s10gopal: what documentation did you get that from?
[14:31] certainly not from https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel
[14:31] https://wiki.ubuntu.com/KernelTeam/GitKernelBuild , it is easy i can understand them
[14:32] is it wrong?
[14:32] it depends on what you want to achieve
[14:33] i want to find the bad commit
[14:33] s10gopal: from an ubuntu kernel or a mainline kernel?
[14:33] bisect v4.12 and v4.13-rc1
[14:34] s10gopal: you only need to use make deb-pkg, then, if you want to install those kernels using deb packages
[14:34] mainline , http://kernel.ubuntu.com/~kernel-ppa/mainline/
[14:34] but, yes, probably easier to just use make deb-pkg in that case
[14:35] i need to repeat the whole process again ?
[14:35] time make deb-pkg ? and again it is going to take 100 min or will it be faster this time ?
[14:39] s10gopal: it's possible it's gonna just do the last steps, if you haven't run any clean command
[14:39] and how to find the kernel which i made ?
[14:40] should be in the top dir
[14:42] at $HOME ?
[14:44] deb-pkg should put the .deb files in ../ (parent directory)
[14:45] or if 'make bzImage' in ./arch/x86/boot/
[14:45] TJ-, a user on ##linux taught me how to build kernel from git , how can i find the ##linux log ?
[14:46] s10gopal: I have no idea, where did you put it?
[14:47] like #ubuntu logs are at http://irclogs.ubuntu.com/
[14:48] what is the difference between time make and make ?
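The bisect between v4.12 and v4.13-rc1 being planned here would really be `git bisect start v4.13-rc1 v4.12` in a kernel tree; the mechanics can be sketched on a throwaway repo (everything below is illustrative, assuming git is installed):

```shell
# Toy bisect: five commits, where commit 4 "introduces the regression"
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email you@example.com
git config user.name you
for i in 1 2 3 4 5; do
  echo "$i" > state
  git add state
  git commit -qm "commit $i"
done
# mark the endpoints: HEAD is known bad, HEAD~4 (commit 1) known good
git bisect start HEAD HEAD~4
# automate the good/bad test: "good" means the regression is absent (state < 4)
git bisect run sh -c 'test "$(cat state)" -lt 4'
```

`git bisect run` reports "commit 4" as the first bad commit; for a kernel the test script would instead build and boot each candidate, which is why each bisect step costs a full build.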
[14:49] s10gopal: oh, thought you meant you'd kept a log locally of the commands. I don't know if ##linux logs, I think it's up to you to do it locally
[14:50] thx
[14:51] jsalisbury, i have learned how to make a kernel , should i test the kernel built by you or can i build and test ?
=== himcesjf_ is now known as him-cesjf
[17:53] May I pick your brains, kernel gurus? I am looking at some RHEL VMs, the hypervisor is patched and I manually updated the microcode to the 20180312 release, but the VM still shows as vulnerable. Does QEMU, etc need to be patched as well for the vulnerability to be mitigated?
[17:55] yes qemu needs to be updated to pass through the spec bits
[17:55] --> runs to open RH support case....
[17:55] LOL
[18:09] dijuremo: they may have patched everything but you might be using a model which doesn't support IBRS
[18:10] dijuremo: to see if such a model is available, you can run 'virsh cpu-models x86_64 | grep IBRS'
[18:11] (IBRS support comes as a separate, unique model so that live migration wasn't broken with the addition of IBRS support to the CPU model)
[18:12] (you can also use in your domain xml if you don't plan on migrating VMs to another machine)
[18:15] # virsh -r cpu-models x86_64 | grep IBRS
[18:15] Nehalem-IBRS
[18:15] Westmere-IBRS
[18:15] SandyBridge-IBRS
[18:15] IvyBridge-IBRS
[18:15] Haswell-noTSX-IBRS
[18:15] Haswell-IBRS
[18:15] Broadwell-noTSX-IBRS
[18:15] Broadwell-IBRS
[18:15] Skylake-Client-IBRS
[18:16] So it seems it does support them, but I am not quite sure how to force their use.
[18:16] I do not manage the RHEV Manager machine and it is outdated, running 3.5 so that may be a problem. Perhaps they need to update that so that I can see Nehalem-IBRS since I only see the CPU types without -IBRS as available.
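For reference, selecting one of the -IBRS models listed above would be done in the libvirt domain XML along these lines (a hypothetical fragment; the `custom`/`exact`/`forbid` values are illustrative and the model must appear in `virsh cpu-models` output for the host):

```xml
<!-- hypothetical libvirt <domain> fragment pinning an IBRS-capable CPU model -->
<cpu mode='custom' match='exact'>
  <model fallback='forbid'>Nehalem-IBRS</model>
</cpu>
```

In a RHEV/oVirt deployment the cluster CPU type set in the manager normally overrides this, which is why the outdated 3.5 manager mentioned above is a plausible blocker.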
[18:24] I am a proxy for this question, a friend asks related to the ibpb hangups that you guys fixed with the 4.13.0-38.43+lp1759920 patches in the Ubuntu kernel: which debian kernels have, or will have, the fix, or maybe what to look for in the upstream kernel changelog?
[18:32] the problematic patch that was sent around during the spectre embargo is "x86/mm: Only set IBPB when the new thread cannot ptrace current thread"
[18:33] instead of using that patch, this upstream patch should be used:
[18:33] https://git.kernel.org/linus/18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7
[19:02] tyhicks: Any idea how I can find out what the latest microcode version should be for a specific CPU, Intel(R) Xeon(R) CPU E5530? I tried expanding the intel-ucode folder on one of my RHEL servers, but after reboot it is not using IBPB/IBRS. dmesg reports: microcode: CPU0 sig=0x106a5, pf=0x1, revision=0x19
[19:03] dijuremo: you'll find that info here: https://newsroom.intel.com/wp-content/uploads/sites/11/2018/04/microcode-update-guidance.pdf
[19:08] tyhicks: So I note there that they have the new microcode 0x1c but dmesg is reporting in this machine 0x19, likely old from the BIOS. Any idea where I may be able to find out how to force load the new microcode?
[19:10] Oh I think I know, I need to rebuild the initrd, right?
[19:14] Nope... no luck... still showing revision=0x19 after reboot.
[19:28] jsalisbury: can you please build a set of kernels so i can test them without waiting for you to come online ?
[19:29] s10gopal, The thing with a bisect is you need to know if the current test kernel is good or bad to determine which test kernel to build next.
[19:30] suppose we are at 10 so build a kernel for good and another for bad
[19:30] at commit 5 and at commit 15
[19:31] jsalisbury: on my laptop it takes 3.5 hours to make a kernel from source , how can i speed it up ?
[19:31] i am already using -j5
[19:31] -j4
[19:31] s10gopal, I can build you the next two kernels. We just have to be careful when testing and reporting test results. It's easy to make a mistake.
[19:32] s10gopal, you could get a faster computer ;-)
[19:32] s10gopal, Seriously though, if it takes that long, tuning isn't going to speed it up much
[19:33] i have a core i5 6200u , 12gb ram and 128gb ssd
[19:34] can you please teach me how to remove modules from the kernel , then it will compile faster ?
[19:35] s10gopal, You would need to edit config options.
[19:35] s10gopal, figuring all that out would probably take more time than just doing the bisect
[19:37] jsalisbury: can i make the kernel on a remote machine ?
[19:37] s10gopal, sure, if you have a remote machine.
[19:38] is it possible to rent one online ? and how much is it going to cost ?
[19:39] s10gopal, That I have no idea. You could get an Amazon or Google cloud image, but I really don't think that would be worth it.
[19:39] jsalisbury: it is good
[19:39] 4.12.0-041200-generic #201804051118
[19:40] s10gopal, this bisect is going to tell us which commit introduced the regression. We are then going to have to work with upstream to fix up that commit or revert it.
[19:41] s10gopal, I can build the next kernel now
[19:41] jsalisbury: ok i will test it
[19:47] dijuremo: I just took a quick look and I suspect that Intel didn't fully update their slides
[19:48] dijuremo: all the 106A5 CPUIDs have been marked RED, indicating that they're not going to update the microcode, except for yours
[19:49] dijuremo: also, your processor isn't listed as receiving a microcode update in their release notes: https://downloadcenter.intel.com/download/27591/Linux-Processor-Microcode-Data-File?product=873
[19:49] dijuremo: I suspect that they forgot to change your processor's row to RED in the slides :/
[19:50] the 106a5 microcode in their latest bundle matches what your kernel is reporting:
[19:50] $ iucode-tool -L microcode.dat | grep -i 106a5
[19:50] 01/129: sig 0x000106a5, pf mask 0x03, 2013-06-21, rev 0x0019, size 10240
[19:52] Wow, that is so screwed up... How is one going to figure that out when the slides show yellow and also a firmware 0x1c and their download page says "This download is valid for the product(s) listed below." and includes Intel® Xeon® Processor E5530 (8M Cache, 2.40 GHz, 5.86 GT/s Intel® QPI)
[20:04] i have a few grievances with current 4.15, should be worked out before bionic rtm :)
[20:04] "rtm"
[21:36] what is rtm?
[21:37] Release To Manufacturing
[22:39] ty
[22:39] :)
[23:23] There are some confirmed and reproducible issues with current 4.15 that really should be ironed out before bionic release (potential userbase won't be able to boot)
[23:23] anyone i can talk to about it?