mgedmin | laptop died a swappy death | 07:50 |
---|---|---|
mgedmin | I upgraded two laptops to ubuntu gnome 14.10 and both do this about once a week :( | 07:51 |
mgedmin | both run without swap because both have SSDs | 07:51 |
mgedmin | one has 8 gigs of ram, the other has 2 gigs | 07:51 |
darkxst | mgedmin, I am confused: how can a laptop die a swappy death when it's not using swap? | 07:53 |
darkxst | you shouldn't have any problems with 8gb and no swap, 2gb might be problematic | 07:54 |
mgedmin | HDD LED on, computer nonresponsive (caps lock takes 60 seconds between keypress and led being on), mouse movement limited to 1/2px per minute | 07:54 |
mgedmin | the onset is instant | 07:54 |
mgedmin | waiting 20 seconds for some help (OOM killer?) doesn't help | 07:54 |
mgedmin | I don't _know_ that it was OOM | 07:54 |
darkxst | it's booting into shell? | 07:55 |
mgedmin | what do you mean? | 07:55 |
darkxst | like you are in gnome-shell and it's doing this? | 07:55 |
mgedmin | yes | 07:55 |
mgedmin | I'm looking at atop logs | 07:56 |
darkxst | is it really using swap though, or something else flogging drives | 07:56 |
darkxst | mgedmin, or iotop | 07:56 |
mgedmin | it shows the disk being busy between 92 and 102%, doing mostly reads at 230 MB/s | 07:56 |
mgedmin | this is forensic study now, I had to alt-sysrq-s,u,b reboot | 07:56 |
mgedmin | atop writes a snapshot of the system state every 10 minutes to a binary log file | 07:57 |
mgedmin | when the laptop became non-responsive gnome-shell's clock said 09:14 | 07:57 |
mgedmin | I turned it off at about 09:37 | 07:57 |
mgedmin | well, I hit Alt-SysRq-K at that point to kill X | 07:58 |
mgedmin | the S,U,B was at 09:39 | 07:58 |
mgedmin | which is lucky, since atop wrote its last system snapshot at 09:38 | 07:58 |
darkxst | I hit a similar issue caused by media-scanner, but you shouldn't have that installed on Ubuntu-GNOME unless you also have ubuntu desktop installed | 07:58 |
mgedmin | let's look at the 09:28 snapshot: I have 4.1G in cache, 112M free, 0 swap, sda is busy 102%, reading at 250 MB/s | 07:59 |
darkxst | can you ssh in and find what is causing the reads? | 07:59 |
mgedmin | processes reading from disk include: skype (8%), VBoxHeadless (7%), chromium-browser (7%) kswapd0 (6%) | 07:59 |
mgedmin | basically everything is reading from disk | 07:59 |
mgedmin | whoa, the page scan rate is 1603e6 | 08:00 |
mgedmin | at 09:08:32 it was "PAG | scan 76411 |" | 08:00 |
mgedmin | 10 minutes later it was 2744e5 | 08:00 |
mgedmin | ten minutes later it was 1603e6 | 08:00 |
mgedmin | and ten minutes later it was 1630e6 | 08:01 |
mgedmin | my hypothesis: the kernel decides it needs to free some ram, so it starts discarding mapped executable pages | 08:01 |
mgedmin | and then all the running apps have to read them back in all the time | 08:01 |
mgedmin | which makes for 250 MB/s read rate and processes like Skype reading 13.5GB of data in a 10 minute window | 08:02 |
mgedmin | if I had some swap, maybe the kernel would push some dirty pages out | 08:02 |
darkxst | mgedmin, not too sure but you could try an older kernel and see if that helps | 08:02 |
darkxst | I have to cook dinner, then head out for the night, ping me tomorrow | 08:03 |
mgedmin | I want a system monitor applet in my gnome-shell | 08:11 |
mgedmin | one that shows the amount of free memory I have and doesn't block the main gnome-shell thread | 08:12 |
darkxst | mgedmin, my one is largely unmaintained now, and it does block unfortunately | 08:41 |
darkxst | stupid libgtop doesn't have an async api | 08:41 |
mgedmin | does gjs support threads? | 08:41 |
darkxst | nope | 08:41 |
darkxst | and it likely never will | 08:42 |
mgedmin | jay | 08:42 |
mgedmin | I used to use https://extensions.gnome.org/extension/120/system-monitor/ until I discovered that little gotcha with network filesystems going away | 08:42 |
darkxst | most of the real GNOME libraries have async APIs though | 08:42 |
mgedmin | I don't think the kernel has an async version of statvfs, does it? | 08:43 |
darkxst | heh, I just disabled network filesystems, since it was causing blocking on stale mounts | 08:43 |
darkxst | well not just, ages ago | 08:43 |
* mgedmin | files https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1390358 and has no hopes of this being looked at seriously | 08:49 |
ubot5 | Ubuntu bug 1390358 in linux (Ubuntu) "Computer unusable under memory pressure with no swap space" [Undecided,Confirmed] | 08:49 |
darkxst | mgedmin, did this just start happening? or you only just upgraded to 14.10 | 08:53 |
mgedmin | before 14.10 it was "basically never" on my main laptop | 08:55 |
mgedmin | after 14.10 it's about once a week | 08:55 |
mgedmin | on my second laptop (2gigs of ram, a thinkpad x200, used as a media center at home) actually the same | 08:55 |
mgedmin | it would run out of ram about once a week (gnome-shell memleak in 12.04), but I could recover with alt-f2 r | 08:56 |
mgedmin | hey, it's now running 14.04, not 14.10 | 08:56 |
mgedmin | and when it freezes this way I can't recover with alt-f2 r, I have to alt-sysrq-s,u,b | 08:56 |
mgedmin | so hm | 08:56 |
darkxst | it's pretty damn critical if you can't even switch to a VT | 08:57 |
mgedmin | the media laptop typically freezes when my wife opens a youtube video | 08:57 |
mgedmin | I can switch to a vt, if I'm patient enough | 08:57 |
mgedmin | I can't log in, because login times out after 60 seconds without giving me a chance to enter my password | 08:57 |
mgedmin | once, just once, I lived through this kind of disk storm | 08:58 |
mgedmin | vmstat was funny to look at | 08:58 |
darkxst | mgedmin, I bet, anyway I'm out now | 08:59 |
=== | ljcr is now known as dsfa | |
=== | dsfa is now known as ljcr | |
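
The page scan figures quoted from atop in the log above can be approximated live from the kernel's own counters. The following is a minimal sketch, assuming a Linux system where `/proc/vmstat` exposes cumulative `pgscan_kswapd*` and `pgscan_direct*` counters (the exact names vary between kernel versions, and whether atop computes its "scan" column exactly this way is an assumption). The helper names `parse_vmstat`, `pages_scanned`, `scan_rate` and `watch` are made up for illustration.

```python
import time


def parse_vmstat(text):
    """Parse the 'name value' lines of /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def pages_scanned(counters):
    """Sum the cumulative page-scan counters.

    Older kernels split these per zone (e.g. pgscan_kswapd_normal), newer
    ones use pgscan_kswapd / pgscan_direct. pgscan_direct_throttle counts
    throttling events rather than pages, so it is skipped.
    """
    total = 0
    for name, value in counters.items():
        if name == "pgscan_direct_throttle":
            continue
        if name.startswith(("pgscan_kswapd", "pgscan_direct")):
            total += value
    return total


def scan_rate(before, after, seconds):
    """Pages scanned per second between two snapshots."""
    return (pages_scanned(after) - pages_scanned(before)) / seconds


def read_vmstat(path="/proc/vmstat"):
    with open(path) as f:
        return parse_vmstat(f.read())


def watch(interval=5.0, count=3):
    """Print the scan rate `count` times, sampling every `interval` seconds."""
    prev = read_vmstat()
    for _ in range(count):
        time.sleep(interval)
        cur = read_vmstat()
        print(f"{scan_rate(prev, cur, interval):.0f} pages scanned/s")
        prev = cur
```

Calling `watch()` in a terminal while the machine is healthy gives a baseline; a rate that jumps by orders of magnitude, as in the atop snapshots quoted above, would be consistent with the reclaim-thrashing hypothesis, though it does not prove it.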
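
On the blocking `statvfs` problem discussed above: in an environment that does have threads, one common workaround is to make the call in a helper thread and give up waiting after a timeout. This is only a sketch in Python, not something that applies to gjs (which, per the log, has no threads); `statvfs_with_timeout` is a hypothetical helper, and the `call` parameter exists only so a slow filesystem can be simulated.

```python
import os
import threading


def statvfs_with_timeout(path, timeout=2.0, call=os.statvfs):
    """Run a blocking statvfs-style call in a daemon thread.

    Returns the call's result, or None if it has not finished within
    `timeout` seconds (for example because of a stale network mount).
    The stuck thread is not cancelled; it is simply abandoned, and being
    a daemon thread it will not keep the process alive at exit.
    """
    result = {}

    def worker():
        try:
            result["value"] = call(path)
        except OSError as exc:
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        return None  # still blocked; the caller can show "n/a" and move on
    if "error" in result:
        raise result["error"]
    return result["value"]
```

A monitor could call `statvfs_with_timeout("/mnt/nfs")` (the path is just an example) on each refresh and skip mounts that return `None`. The trade-off is that every timed-out call leaves one blocked thread behind, so a real implementation would want to avoid re-polling a mount that is already known to be stuck.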