/srv/irclogs.ubuntu.com/2016/10/26/#ubuntu-kernel.txt

him-cesjfHello, I am running Kubuntu 16.10 on a Dell laptop. For some reason, the OS hangs/responds very slowly in between while typing or switching tasks or while doing any work. During this hang/slow behaviour, I notice the fan revving in high speed, delay in typing, network disconnection and  stuck digital clock seconds counter which restores to normal after the lag subsides but occurs every 10 seconds or so lasting for about 3-4 seconds. Is this 05:25
him-cesjfa kernel issue? How can I determine what is causing this?05:25
dzneti have a problem with the intel 7265 wireless adapter on my laptop. how can i solve this?11:33
dznethello!11:33
dzneti use linux mint 18. kernel 4.4.0-4511:35
ogra_by asking in a mint channel ?11:42
him-cesjfHello, I am running Kubuntu 16.10 on a Dell laptop. For some reason, the OS hangs/responds very slowly in between while typing or switching tasks or while doing any work. During this hang/slow behaviour, I notice the fan revving in high speed, delay in typing, network disconnection and  stuck digital clock seconds counter which restores to normal after the lag subsides but occurs every 10 seconds or so lasting for about 3-4 seconds. Is this 12:41
him-cesjfa kernel issue? How can I determine what is causing this?12:41
apwcould be a kernel issue, or a display driver issue, or ... some *stat commands are good for those kinds of things12:57
ckinghim-cesjf, i suggest using forkstat and fnotifystat to see what's up12:59
him-cesjfapw: Hi, could you please guide me on how to narrow down the exact cause?12:59
him-cesjfSysinfo for 'TuxStick': Running inside KDE Plasma 5.7.5 on Ubuntu 16.10 (Yakkety Yak) powered by Linux 4.8.0-26-generic, CPU: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz at 2099/2700 MHz, RAM: 7443/7902 MB, Storage: 26/57 GB, 283 procs, 65.76h up12:59
him-cesjfcking: Okay, I'll try12:59
ckinghim-cesjf, and maybe cpustat too13:00
apwcking, thanks13:01
him-cesjfInstalling forkstat13:01
him-cesjfcking: cpustat log - http://paste.ubuntu.com/23383540/13:21
him-cesjfcking: fnotifystat log - http://paste.ubuntu.com/23383534/13:21
him-cesjfcking: forstat log - https://paste.ubuntu.com/23383516/13:21
him-cesjfforkstat*13:21
ckingplasmashell is very busy13:22
ckingyou can get more stats on what it is doing using health-check, e.g. sudo health-check -p plasmashell13:22
him-cesjfnothing from other two log files?13:23
him-cesjfSure13:23
ckinghim-cesjf, let's drill down on the top offender first13:23
ckinghttp://kernel.ubuntu.com/~cking/health-check - sudo apt-get install health-check13:23
him-cesjfSure13:23
ckingcking, but I'm not going to be around this afternoon, so use that tool and paste the details in this channel and I'll pick it up from there hopefully tomorrow13:24
ckinghim-cesjf, ^13:24
him-cesjfOh, um okay. here is health-check file - http://paste.ubuntu.com/23383576/13:27
him-cesjfcking: ^13:27
ckinghim-cesjf, seems to be spinning on a poll(), I'd file a bug against that application and put that information above in the bug report 13:29
him-cesjfcking: Okay, filing bug report. Thanks for pointing it out!13:30
cking28967 plasmashell          poll                 4589.4564   112804        0   0.0 sec    0.0 sec    0.0 sec13:30
him-cesjfYes, noticed13:30
ckingthis shows that it's spinning on a zero timeout poll and that's bad :-(13:30
him-cesjfAnything else possible other than filing bug report to solve it?13:30
apwouch :)13:31
him-cesjfcking: Any other possible cause?13:35
ckinghim-cesjf, nothing else looks like the culprit to me13:43
him-cesjfThanks13:47
zygahey, I've reported https://bugs.launchpad.net/ubuntu/+source/linux/+bug/163684714:11
ubot5`Ubuntu bug 1636847 in linux (Ubuntu) "unexpectedly large memory usage of mounted snaps" [Undecided,Confirmed]14:11
zygait would be great if someone could look at that and see if the bug is in squashfs itself or in the particular config we use14:11
apwzyga, there are some pretty suspicious sources of memory "use" calculations in that14:14
apwzyga, anyhow thanks for the heads up will find someone to look at it14:15
zygaapw: can you be more specific?14:15
zygaapw: memory does quickly run out though, in simple real-world tests oom killer jumps in just a few mounts in14:15
zygaapw: (after mounting empty snaps)14:15
zygaapw: and my numbers agree with slabtop14:16
apwtotal slabs allocated doesn't tell us how much of them is in use 14:16
apwjust how much memory is in them currently14:16
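The distinction apw draws here (memory held by slabs versus memory actually in use inside them) can be sketched against the /proc/slabinfo layout, where each cache line starts with the name, active_objs, num_objs and objsize. This is only an illustrative sketch: the sample numbers below are made up and are not taken from zyga's traces, and `slab_usage` is a hypothetical helper, not part of analyze.py.

```python
# Hypothetical sample in /proc/slabinfo (version 2.1) layout; the numbers are invented.
SAMPLE = """\
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
kmalloc-4096        1000   1200   4096    8    8 : tunables    0    0    0 : slabdata    150    150      0
"""


def slab_usage(text):
    """Map cache name -> (bytes in active objects, bytes in all allocated objects)."""
    usage = {}
    for line in text.splitlines():
        # Skip the version line, the column header and blank lines.
        if line.startswith("slabinfo") or line.startswith("#") or not line.strip():
            continue
        fields = line.split()
        name = fields[0]
        active_objs, num_objs, objsize = (int(f) for f in fields[1:4])
        usage[name] = (active_objs * objsize, num_objs * objsize)
    return usage


usage = slab_usage(SAMPLE)
in_use, allocated = usage["kmalloc-4096"]
print(f"kmalloc-4096: {in_use} bytes in use of {allocated} bytes allocated")
```

On a real system the same function could be fed the contents of /proc/slabinfo (reading it normally needs root). The gap between the two numbers is the memory that is sitting in the cache without holding live objects.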
zygaapw: mounting 5 empty snaps crashes the box on 1GB VM14:16
zygaapw: in any case, I'd love feedback on how to improve this 14:17
zygaapw: the relevant code is https://github.com/zyga/mounted-fs-memory-checker/blob/master/analyze.py#L1914:17
apwyep14:17
him-cesjfcking: apw: Filed bug report - https://bugs.launchpad.net/ubuntu/+source/plasma-workspace/+bug/1636869 https://bugs.kde.org/show_bug.cgi?id=37171214:17
ubot5`Ubuntu bug 1636869 in plasma-workspace (Ubuntu) "Plasmashell polling on zero timeout" [Undecided,New]14:17
ubot5`KDE bug 371712 in DataEngines "Plasmashell polling on zero timeout" [Major,Unconfirmed]14:17
apwdo you have a nice empty snap ?14:18
zygaapw: and https://github.com/zyga/mounted-fs-memory-checker/blob/master/analyze.py#L4814:18
zygaapw: everything is in that repo14:18
zygaapw: just look around14:18
zygaapw: I also collected raw traces from various kernels and distros14:18
zygaapw: fork that repo and run the overview script, it just churns the data14:18
zygaapw: fake payload is in https://github.com/zyga/mounted-fs-memory-checker/tree/master/payload14:18
apwzyga, won't be me, will get someone to look at it though14:19
zygaapw: if you want to see the numbers this is what I get from the script now: http://paste.ubuntu.com/23383523/14:19
zygaapw: thank you, noted14:19
zygaapw: and do let them tell me if I counted this incorrectly14:19
apw+ ./analyze.py ubuntu 16.04 4.4.0-45-generic 4 size-1m.squashfs.xz.heavy14:20
apwzyga, what does the 4 mean14:20
zygaapw: for core system14:20
zygaapw: four*14:21
apwzyga, and you are just mounting it, right ? or are you looking at the contents14:22
zygaapw: just mounting14:22
zygaapw: the contents is a 0 sized file14:22
zygaapw: or in this case, a 1mb file14:22
zygaapw: (of urandom data)14:22
zygaapw: you can run those traces with the same file in vfat and ext4 for comparison14:22
zygaapw: there memory usage doesn't change at all (nearly)14:23
apwsquashfs, that well tested filesystem :<14:23
zygaapw: we're the only distro that uses this set of kernel config options14:23
zygaapw: my traces include kernel config from each system14:24
apwsnaps are literally about the only thing which uses squashfs14:24
zygaapw: it is worth looking into two things IMHO: 1) why are single cpu systems using so much more memory as compared to a four-cpu system14:24
zygaapw: and is the multi-threaded, per-cpu decompressor worth it (other distros use one single threaded decompressor)14:25
manjortg, apw any chance you will have 4.9 rc rebased ontop of unstable in the coming weeks ? 14:26
apwzyga, yeah, well i can intuit why that might trigger that behavior, i assume we are hitting pathological memory allocator behaviour by our memory patterns14:26
rtgmanjo, some chance14:26
apwzyga, and the majority of those pages have just one teensy bit of useful allocation in them14:26
zygaapw: all of the memory is in kmalloc-4096 slab14:26
apwzyga, can you point me at the config delta, as logically i should give you a test kernel with that changed to confirm it is tha14:26
apwtthat14:26
manjortg, before Nov 11 ? 14:27
zygaapw: well, not on one line but just grep for squashfs in https://github.com/zyga/mounted-fs-memory-checker/tree/master/traces/ubuntu/16.10/4.8.0-26-generic/ncpus-1 and https://github.com/zyga/mounted-fs-memory-checker/blob/master/traces/fedora/24/4.7.9-200.fc24.x86_64/ncpus-1/kernel.config14:28
zygaapw: I can do that if you want to but I'd rather have someone investigate deeer and just run those tests again14:28
zygadeeper*14:28
apwzyga, so there is no deadline to make any improvement here ...14:29
zygaapw: well, as soon as it starts to explode on us :/14:29
zygaapw: I think we don't want 130MB per snap 14:29
zygaon small systems14:29
apwright so making it go away is more important than why it is 14:29
zygayes14:29
apwbroken, so if we switch the config you test that14:29
apwand if it works we ship that, and find out _why_ later14:29
zygaworth a try14:30
apwCONFIG_SQUASHFS_DECOMP_SINGLE14:31
apwi assume it is those ones you want flipped here14:31
apwzyga, also what release are you testing in, so i make a test kernel in the right version14:32
zygaapw: this is all focused on snappy so currently that's a xenial kernel14:32
zygaapw: correct14:32
zygaapw: that seems to be the most plausible one14:32
apwi'll flip that _SINGLE and get you some kernels, can you test amd64 i assume so14:33
apwzyga, ^14:33
zygaI sure can, thanks14:33
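For comparing the kernel.config files zyga links above, a small sketch like the following could report which squashfs decompressor mode a config selects. The sample text is illustrative only and is not copied from the traces; `enabled_decompressors` is a hypothetical helper name. The option names are the upstream Kconfig symbols (CONFIG_SQUASHFS_DECOMP_SINGLE, _MULTI, _MULTI_PERCPU).

```python
# Illustrative config fragment; not copied from any of the trace directories.
SAMPLE_CONFIG = """\
CONFIG_SQUASHFS=m
# CONFIG_SQUASHFS_DECOMP_SINGLE is not set
# CONFIG_SQUASHFS_DECOMP_MULTI is not set
CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU=y
CONFIG_SQUASHFS_XZ=y
"""


def enabled_decompressors(config_text):
    """Return the CONFIG_SQUASHFS_DECOMP_* options that are set to y."""
    prefix = "CONFIG_SQUASHFS_DECOMP_"
    enabled = []
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as "# CONFIG_... is not set" and are skipped.
        if line.startswith(prefix) and line.endswith("=y"):
            enabled.append(line[: -len("=y")])
    return enabled


result = enabled_decompressors(SAMPLE_CONFIG)
print(result)
```

A test kernel with the option flipped, as apw proposes, would be expected to report CONFIG_SQUASHFS_DECOMP_SINGLE instead.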
rtgmanjo, 4.9 won't be released before Nov 1114:39
manjortg, yes I know it is still rc 214:40
apwzyga, ok ... people.canonical.com/~apw/lp1636847-xenial/ has test kernels with that flipped over ...15:09
apwzyga, ping me when you know if that is better15:10
zygaapw: thanks, downloading now15:10
zygaapw: just ran the numbers again, looking much better16:13
zygaapw: the 131 MB/snap is down to 416:13
zygaapw: data pushed back to the repo16:15
him-cesjfapw: Curious about polling in terms of software. I know about polling in microprocessor/hardware where it polls to check the status of a device, like in printer. What does polling mean in software, like how we noticed in plasmashell few minutes ago?16:17
apwhim-cesjf, the poll call is a way to avoid active polling, when done correctly16:19
apwzyga, ok, so i think we switch that up in the next sru kernel, and i'll have one of my engineers look at why it is broken in that other seemingly superior mode16:19
apwzyga, could you report that in the bug as well for me, helps with sru'ing it16:20
him-cesjfapw: I didn't follow why polling was done for plasmashell and what active polling means for it16:26
him-cesjfSorry if I am bothering with basic questions16:26
zygaapw: with pleasure, thank you for the kernel16:26
apwwell, poll is normally used for waiting for events from the mouse and the like, well, input in general, and responses from the display server. this should be a waiting poll, but if you do it wrong it will return immediately, for instance to tell you that you did it wrong, and then you can get into a cpu-consuming loop16:27
him-cesjfYes, but polling is usually for a device/hardware from what I kow taking classes in microprocessor in electronics, why plasmashell requires polling is what I am trying to understand16:29
him-cesjfknow from*16:29
apwit is a concept not related to hardware specifically16:31
apwthough in general unpredictable events are from hardware/users etc16:31
apwin this case though the name is a misnomer, it is used to avoid polling on files16:31
apwit specifically is used to "wait for anything to happen to any one of this set of file descriptors"16:32
apwand those usually are connected to your devices and display server in this kind of context16:32
apwit should sit there quietly and do nothing until something happens, but it showing up in the stats16:33
apwimplies it is not... and something is wrong in that application16:33
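What apw describes can be sketched with Python's select.poll on a Linux box. This is only an illustration of the difference between a zero-timeout poll and a waiting poll on an idle file descriptor; it is not plasmashell's actual code, and `count_polls` is a hypothetical helper.

```python
import os
import select
import time


def count_polls(timeout_ms, duration_s=0.2):
    """Count how often poll() returns within duration_s on a pipe nobody writes to."""
    read_fd, write_fd = os.pipe()
    poller = select.poll()
    poller.register(read_fd, select.POLLIN)
    calls = 0
    deadline = time.monotonic() + duration_s
    try:
        while time.monotonic() < deadline:
            poller.poll(timeout_ms)  # returns an empty list when nothing is ready
            calls += 1
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return calls


spinning = count_polls(0)   # zero timeout: returns at once, so the loop burns CPU
waiting = count_polls(50)   # 50 ms timeout: the process sleeps in the kernel between checks
print("zero-timeout calls:", spinning, "waiting calls:", waiting)
```

The zero-timeout loop typically makes thousands of calls in the same interval in which the waiting loop makes a handful, which is the kind of call rate the health-check line for plasmashell points at.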
zygaapw: done16:34
him-cesjfapw: Still around?17:30
apwvaguely17:30
him-cesjfhttps://bugs.kde.org/show_bug.cgi?id=371712 someone replied and thinks the lag due to plasmashell is not a problem17:32
ubot5`KDE bug 371712 in DataEngines "Plasmashell polling on zero timeout" [Major,Needsinfo: waitingforinfo]17:32
him-cesjfMaybe I didn't explain the problem well?17:33
him-cesjfapw^17:38
apwit sounds like they have suggested a reasonable course of action17:40
him-cesjfUmokay17:40
him-cesjfapw: Could you give me a one line explanation of this line so that I can explain the same in reply. I am not good at interpreting it?17:47
him-cesjf28967 plasmashell          poll                 4589.4564   112804        0   0.0 sec    0.0 sec    0.0 sec17:47
apwjust shove it in verbatim, they will know what it means17:47
apwif not they should not have asked for it17:47
him-cesjfSure17:51
Free99hello everyone, is there a bug where an application loading the CPU causes a full system freeze currently out for kernel 4.4.0-45? I can't seem to find one21:18
