/srv/irclogs.ubuntu.com/2013/12/05/#ubuntu-kernel.txt

=== lifelike_ is now known as lifelike
=== shengyao_afk is now known as shengyao
=== shengyao is now known as shengyao_afk
=== shuduo_afk is now known as shuduo
apb1963ubuntu 12.04.3  3.2.0-57-generic-pae #87-Ubuntu SMP Tue Nov 12 21:57:43 UTC 2013 i686 i686 i386 GNU/Linux   CRASH.  FREEZE....reboot.... FREEZE....reboot... FREEZE....reboot.... FREEZE03:58
apb1963:(04:01
apb1963I get about 30 minutes of uptime now... then it freezes.04:02
apb1963I don't see anything exciting in the logs.... but then I don't know what to look for.04:03
apb1963started happening yesterday04:04
=== shuduo is now known as shuduo_afk
apb1963or perhaps this morning... something came down in an update, I don't know what.04:05
apb1963AFAIK everything is current.04:06
=== shuduo_afk is now known as shuduo
=== shuduo is now known as shuduo_afk
=== shuduo_afk is now known as shuduo
apb1963And we have another freeze... and reboot.05:23
apb1963And another freeze.... and reboot06:56
apwapb1963, if this is sudden, then you want to try an older kernel and see if it does it there09:01
smbapw, Or gfx, whatever or whichever he got09:03
apwmoin09:03
smbapw, moin09:03
ckingyawn09:04
smbfwiw I was running that kernel for a bit but now moved on as I got proposed enabled09:04
apb1963apw Would the update manager have downloaded and installed a new kernel?09:13
apb1963The date here....3.2.0-57-generic-pae #87-Ubuntu SMP Tue Nov 12 21:57:43 UTC 2013 i686 i686 i386  isn't that the date the kernel was compiled?09:14
smbyes that is the compile date, but also yes update manager also does download and install kernel packages just like any other09:15
apb1963ok, so .... did something force a download within the last 24 hours?  Wouldn't I have gotten that kernel 3 weeks ago?09:16
apwyou can tell what has been downloaded in the last 24 hours in your apt logs09:16
apb1963cool.  let me check that.09:16
smbEven longer in /var/log/apt/history.log* 09:18
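A minimal sketch of the check apw and smb describe, assuming the usual apt history layout of blank-line-separated entries that each begin with a `Start-Date:` field. The function name `recent_apt_entries` and the package names in `SAMPLE_HISTORY` are made up for illustration; real input would come from /var/log/apt/history.log.

```python
import re
from datetime import datetime, timedelta

# Hypothetical sample in the apt history layout; package names are invented.
SAMPLE_HISTORY = """
Start-Date: 2013-11-14  08:02:11
Commandline: apt-get dist-upgrade
Upgrade: example-kernel-meta:i386 (1.0, 1.1)
End-Date: 2013-11-14  08:05:40

Start-Date: 2013-12-05  06:58:03
Commandline: apt-get install example-softphone
Upgrade: example-softphone:i386 (2.0, 2.1)
End-Date: 2013-12-05  06:58:30
"""


def recent_apt_entries(history_text, now, hours=24):
    """Return (start_time, fields) for entries started within `hours` of `now`."""
    cutoff = now - timedelta(hours=hours)
    recent = []
    # Entries are separated by blank lines.
    for block in re.split(r"\n\s*\n", history_text.strip()):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(": ")
            if sep:
                fields[key.strip()] = value.strip()
        start = fields.get("Start-Date")
        if start is None:
            continue
        # Collapse the run of spaces between date and time before parsing.
        started = datetime.strptime(" ".join(start.split()), "%Y-%m-%d %H:%M:%S")
        if started >= cutoff:
            recent.append((started, fields))
    return recent
```

To run it against the live log, read the file and pass `datetime.now()` as `now`. Rotated copies matched by `history.log*` are usually gzip-compressed, so they would need `gzip.open` rather than `open`.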
apb1963yeah that's what I was looking at... not a lot in there.  However, notice the time of my last comment before you guys woke up.09:20
apb1963it's been over 2 hours09:20
apb1963The only thing that really happened in that time is I updated sflphone - an application I use.  A softphone.  I went into that channel several hours ago... and noticed that the developer mentioned to me he thought he had fixed a bug I reported.  I told him of this freeze problem - he didn't respond, but another update came down shortly after... I installed that and now it's been 2 hours since the last freeze.09:22
apb1963That's the same app that was tickling a DBUS bug I reported not too long ago.09:23
apb1963It was crashing my system... so he changed his code... to accommodate.09:23
apb1963So many changes going on it's hard to know what's causing what.09:24
* apw nods09:24
apb1963And... as best as I can remember...  I thought I updated to the -57 version a few weeks ago... but i'm not 100% sure... hell, it could have been -5609:25
apb1963I can post the apt log if you wouldn't mind having a look.  I'm sure you're more familiar with the various things than I am.09:26
smbapb1963, Usually you have at least 2 (usually even more if you don't manually delete them) of the previous kernels09:26
apb1963I haven't deleted anything 09:26
apb1963although I'm not sure where to look for them09:27
smbIf you hit left shift during boot (or modify grub to always show up for a few seconds) you can select them09:27
smbls /boot09:27
apb1963Yeah....  I tried hitting both left & right shift... it just ignores it.09:28
smbThe timing is... trick09:28
smbtricky09:28
apb1963I held it down for the entire boot sequence... it ignored me09:28
smbI prefer to modify /etc/default/grub to comment out the two *HIDDEN* lines09:28
smbset the other timeout to 10 and run "sudo update-grub"09:29
apb1963heh.  I've got the last 10 kernels.09:29
apb1963I will do that09:29
apb1963so... this one...#GRUB_HIDDEN_TIMEOUT=009:30
smbright and there is another one09:31
smb#GRUB_HIDDEN_TIMEOUT=009:31
smb#GRUB_HIDDEN_TIMEOUT_QUIET=true09:31
smbGRUB_TIMEOUT=1009:31
apb1963GRUB_HIDDEN_TIMEOUT_QUIET=true09:31
apb1963so that was already uncommented09:31
smbapb1963, They have to be like I posted above09:32
smbBoth HIDDEN commented out and the GRUB_TIMEOUT set to whatever wait time you want09:32
smbIf hidden timeout is true, the hidden timeout value is used (which is 0 by default), which gives you no delay and you have to press shift exactly at the right time between BIOS screen after keyboard is active and before grub starts.09:34
smbNot really much time09:35
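The edit smb describes can be sketched as a text transformation, assuming the file uses plain `KEY=value` lines as quoted above. The function name `show_grub_menu` is hypothetical, and the sketch only rewrites text; it does not touch /etc/default/grub itself.

```python
import re


def show_grub_menu(config_text, timeout=10):
    """Comment out both GRUB_HIDDEN_* lines and set GRUB_TIMEOUT to `timeout`."""
    out = []
    timeout_set = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if re.match(r"GRUB_HIDDEN_TIMEOUT(_QUIET)?=", stripped):
            # Active hidden-timeout setting: comment it out.
            out.append("#" + stripped)
        elif stripped.startswith("GRUB_TIMEOUT="):
            # Keep a single GRUB_TIMEOUT line with the requested value.
            if not timeout_set:
                out.append(f"GRUB_TIMEOUT={timeout}")
                timeout_set = True
        else:
            # Everything else, including already commented lines, is kept as-is.
            out.append(line)
    if not timeout_set:
        out.append(f"GRUB_TIMEOUT={timeout}")
    return "\n".join(out) + "\n"
```

Writing the result back needs root, and the change only takes effect after running "sudo update-grub" as smb says.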
apb1963go it09:35
apb1963got it09:36
apb1963Yeah, I've been experiencing all kinds of weird stuff09:37
apb1963My network takes like 2 minutes to come up09:37
apb1963the network manager can't even see my network - not that I use network manager but still09:38
apb1963I have an approximate 7 second delay on SIP calls.  Nobody has a clue what that's all about.09:38
apb1963Stuff crashes randomly09:39
apb1963Various applications09:39
apb1963I get loads over 7 for no apparent reason09:39
apb1963Normal operation is like less than .509:39
apb1963I've been reporting bugs left & right in various apps.09:40
apb1963I'm frustrated :/09:40
apb1963well, at least my machine stopped freezing up for now.09:40
apwyou should try running development where that kind of experience is the norm09:41
apb1963Yeah well...... this is my "production" machine.  And I only have the one. 09:43
apb1963This machine was supposed to be an asterisk server.... then my only other machine died of hardware failure... and I was forced to install a desktop on this one...09:45
apb1963So I've been waiting for a hand-me-down from my brother for 3 weeks now, and I don't know if he's ever going to send it.09:45
apb1963In the meantime, I'm trying to get by on just this one machine running everything... 09:46
apb1963Don't ever be poor if you can avoid it.09:47
apb1963I can't decide if I should start some of the stuff I had running earlier to see if it triggers the freeze09:49
apb1963About 35 chrome windows, libreoffice with a half dozen or so files open, skype, Kontact, firefox....09:49
smbWell doing one at a time and wait some time in between at least would give some hints09:50
apb1963yeah09:50
apb1963I have a strong suspicion it was sflphone09:50
=== shuduo is now known as shuduo_afk
apb1963how an app could cause a kernel freeze though.... I don't know.09:51
smbIs it really a kernel freeze? Sometimes X locking up hard just looks the same09:51
apb1963I was afraid it might be bad RAM... is there a way to test RAM while the machine is booted?09:51
apb1963I couldn't switch to a VT09:51
apb1963When X locks up I can generally switch to a VT09:52
smbI had those too in the past. if X takes all the keypresses and ignores them. Hard to say then09:52
apb1963I mean my system gets a high load and it slows to a crawl... but eventually I can get a terminal and run top... it shows me a ridiculous load, I start killing a few chrome windows and it bounces back.09:53
apb1963I think flash is causing the load09:53
apb1963but this was different09:53
smbOnly chance would be a second computer on the network and trying to ping or ssh... If you get top running, I think 'c' sorts by cpu usage, so you would see which process is doing that09:53
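A rough sketch of the second-computer check smb suggests, run from another machine on the network. It only tries a TCP connection to the ssh port; the function name `host_responds` and the default port 22 are assumptions for illustration.

```python
import socket


def host_responds(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return False
```

If this returns True while the screen is frozen, the kernel is still answering on the network and the hang is more likely in X. A False result is less conclusive, since it also happens when the machine is fine but no ssh server is listening.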
apb1963Oh and the very first time this happened I got some kernel messages09:54
=== shuduo_afk is now known as shuduo
apb1963yeah09:54
apb1963I got like a "cut here" message09:54
apb1963I couldn't find it in the log09:54
apb1963but it was a definite kernel crash09:54
apwphotos are your main plan when that happens in case it isn't in the log09:54
apb1963after the crash it just kept freezing09:54
apb1963photos?09:55
apwif you can see it, so can your camera09:55
apb1963oh.  yeah well...  let me use the "poor" key word again09:55
apb1963sucks to be me09:55
apb1963oh well :)09:56
smbPen and paper? Yeah it is tedious. ;)09:56
apb1963well...  I just assumed it would be in the log09:56
apb1963so I didn't even consider that09:56
smbYeah, problem with crashes is that the kernel grinds to a halt. So is the part that writes to disk. 09:57
apb1963I don't know why, but I figured if it had time to write to the screen it had time to write to disk.  What can I say, I wasn't thinking.09:58
=== shuduo is now known as shuduo_afk
apb1963ok i'll startup skype09:59
apb1963hmmm... might have already been running.... came up too fast09:59
apb1963and now we call libreoffice to the stand....09:59
smbapb1963, If you can afford the time it would be good to wait about those 30m it took to freeze between additional applications10:01
smbok skype was running for that I suppose10:02
apb1963well, I can afford the time - but only because it's 2am and I need sleep10:03
apb1963so if it crashes or freezes, so be it.10:04
apb1963But let's say it does... then what do I do?10:04
smbIf you see crash messages on the screen, write them down. Then probably repeat with just the last app started 10:09
smbif you find a suspect, you could slowly go back to older kernel versions to see whether they are the same10:10
smbIf it looks to be the kernel start filing a bug by running "ubuntu-bug linux" from that machine.10:12
smbIf it keeps crashing even with older kernels and you find it's a certain application being started, you probably have to make the bug report manually. Hm, well try to replace linux with the appname. Not sure this works though10:14
apb1963hmm10:15
apb1963there's also the further complication that I'm using 6 different virtual desktops10:17
apb1963so who knows if that's interacting10:18
apb1963earlier I was sorting my apps into VD's... this last time I didn't bother.10:18
apb1963Would be nice if apps returned to their assigned VD's, but that's another story.10:18
smbI could not guarantee that but which VD an app is on should not matter for crashing. And no, we cannot help you with that placement problem here. ;)10:20
apb1963k10:20
apb1963KDE thing I guess10:20
apb1963well, thank you for the  help!  I'm gonna hit the hay before my head hits the keyboard :)10:22
apb1963'night10:22
smbThat may help a lot. Good night. :)10:23
=== gfrog is now known as gfrog_afk
=== ghostcube__ is now known as ghostcube
rtghenrix, hmm, v3.12.3 broke e1000e. have you heard anything about that ? I'll check that it really works on 3.12.2 first13:46
henrixrtg: no, i haven't found any e1000e patch but usually network patches aren't tagged for stable13:55
henrixrtg: davem queues them and sends them in a batch13:55
henrixrtg: let me check his queue...13:55
henrixrtg: a quick look here http://patchwork.ozlabs.org/bundle/davem/stable/?state=* doesn't show anything13:57
henrixand there's nothing obvious on 3.12.3 that would break this driver13:58
bjfrtg, how does the issue manifest itself? does it just not recognize the nic or does it spew chunks?13:58
rtgbjf, henrix: actually, this looks like AA denying dhclient. 3.12 kernel in a precise user space14:01
rtgI'll try booting with it disabled.14:02
rtgthat was it.14:06
rtgjjohansen1, can you think of a reason that a 3.12 kernel wouldn't work with a precise apparmor ? it's blocking dhclient.14:07
smbIt's a bit weird, my VM guest was ok with dhcp and a ... oh 3.11 kernel ... 14:09
rtgsmb, try a mailine 3.12.314:10
rtgmainline14:10
apwthat sounds familiar, it blocking dhclient i mean14:11
smbrtg, Yeah maybe later. Right now I use that for fiddling with some lts-s dkms14:11
apwwasn't that the error we had on the phones when part of apparmor3 was missing ?14:11
rtgapw, maybe14:11
apwor if userspace was out of sync with kernel, something like that14:11
apwsmb, test the 3.13 kernel if that works we can hide from it perhaps14:12
rtgapw,  a newer kernel should always work on an older user space14:12
smbapw, can do in a bit. but I suspect I might be too late then14:13
apwrtg, oh of course it should, i mean when the aa3 bits went in there was an order that was bad, i think kernel had to hit before userspace14:17
apwso we could have lost something in the kernel and broken ourselves with newer userspace14:17
rtgapw, that shouldn't be the case for precise, right ?14:17
apwoohhhh, this is lts-backport, erm, it just might, as in the new kernel is enforcing something which was specified but broken in older kernels14:18
apwso let's think... i think we had to have updated user profiles _first_ then the new kernel bits for aa3, which might explain this14:19
rtgapw, well, it's not really the LTS, it's just the trusty kernel (though it shouldn't make a difference)14:19
apwbut ... we need jjohansen1 rather than my rather unreliable memory14:19
apwright, this is the kernel with aa3 on precise ie old profiles, that might be an issue14:19
apwwe might have to update the profiles, which will work same on older kernels.  but jj would be the one to confirm for sure14:20
rtgapw, so I'm running a 3.13 kernel on a precise server14:20
apwand it works or does not work14:20
rtgworks fine14:21
apwusing dhcp or static network config14:21
apwi think it was dbus mediation which was the issue14:21
rtgapw, uses dhcp. lemme go get the bit of log where AA was denying the socket request.14:22
apwie nothing wrong with networking, just unable to request networking14:22
apwsomething odd like that, so ethernet might work, and wireless not or ... something14:22
rtgDec  5 23:58:24 gbyte kernel: [  644.311454] type=1400 audit(1386251904.862:93): apparmor="DENIED" operation="connect" parent=2101 profile="/usr/lib/NetworkManager/nm-dhcp-client.action" name="/run/dbus/system_bus_socket" pid=2103 comm="nm-dhcp-client." requested_mask="r" denied_mask="r" fsuid=0 ouid=014:23
rtgDe14:23
rtgapw, yep, the network driver seems fine14:23
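To make denial lines like the one rtg pasted easier to compare, here is a small sketch that splits the key=value pairs into a dict. The sample line is copied from the log above; the function name `parse_apparmor_line` is made up, and the regex assumes values are either double-quoted or contain no spaces.

```python
import re

# Audit line as pasted by rtg above.
SAMPLE_LINE = (
    'Dec  5 23:58:24 gbyte kernel: [  644.311454] type=1400 '
    'audit(1386251904.862:93): apparmor="DENIED" operation="connect" '
    'parent=2101 profile="/usr/lib/NetworkManager/nm-dhcp-client.action" '
    'name="/run/dbus/system_bus_socket" pid=2103 comm="nm-dhcp-client." '
    'requested_mask="r" denied_mask="r" fsuid=0 ouid=0'
)

# key="quoted value" or key=bare_value
_PAIR = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_apparmor_line(line):
    """Return the key=value pairs of an AppArmor audit line as a dict of strings."""
    fields = {}
    for key, quoted, bare in _PAIR.findall(line):
        # Exactly one of the two groups is non-empty for a given match.
        fields[key] = quoted if quoted else bare
    return fields
```

For the pasted line, the `profile` and `name` fields show which confined program was refused and that the refused object was the D-Bus system socket, which fits apw's recollection about dbus mediation.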
apwyeah that'd be my memory of the issue, that the dbus bits get mediated and prevented14:24
apwbecause the profiles specify it wrongly, but the older kernels don't implement it either14:25
rtgapw, yeah, it's also denying cupsd14:25
apwso it all works, and then we fix the kernel and it applies them and crunch14:25
apwso i think we will need a profile update for precise before this can work, but as i say14:25
apwjjohansen1, is the man to confirm my madness14:25
apwor we might decide to switch that bit off for the backport only14:25
rtgapw, just pushed Ubuntu-3.12.0-6.14 which we should smoke test just to be sure14:29
apwack14:29
apwrtg, i take it you didn't tag it14:31
apw(as yet)14:31
rtgnope, not until I'm done making changes14:31
rtgI generally tag things just before uploading14:31
apwmakes sense, just confirming i'd not gotten an old one14:34
apwrtg, i've just responded to your linux-tools thing, i don't think it's quite there ...14:40
rtgapw, good, I wondered if I missed anything. I had a splitting headache whilst I was doing that14:40
=== jdstrand_ is now known as jdstrand
apwrtg, you know, i wonder if we should be offering them the linux-tools-version[-flavour] and also offering them something based on the linux-image- meta package they have installed perhaps16:37
rtgapw, well, the meta package naming isn't totally consistent. I think this at least will give folks an idea that they might have to hunt down the _right_ meta package.16:38
apwyeah, i think we should get the -flavour bit right, shall i have a poke and see if i can improve16:40
rtgapw, have at it. Frankly, my attitude is that if someone is installing tools, then they likely know what they are doing (given a little nudging)16:40
apwrtg, it would be nice to not need that to have versions in there, i wonder if we could fix that somehow16:52
rtgapw, do you mean you wish perf was not ABI specific ?16:52
apwrtg, hey i also wish that.  i meant i wish i could incant at like dpkg to get the list of versions and packages16:54
apwso it would work on all releases for ever16:54
rtgapw, well, we've only got to worry about one more LTS kernel for 12.0416:55
rtgand we can probably predict the meta package name16:55
apwyeah16:59
* rtg -> lunch18:39
hallynapw: could you kick off a new build of the trusty -unstable kernel in ppa?18:52
jjohansen1apw, rtg: nothing new in 3.12 over 3.11 that I am aware of, however the aa3 bits in both of those do require some policy updates or dhclient etc will break19:06
rtgjjohansen1, which is exactly what is happening19:06
jjohansen1yep19:07
jjohansen1sorry just working through the backscroll19:07
bjfcking, i see this every once in a while in testing: "Timed out after 30 seconds doing mkdir() - possible eCryptfs hang" but if i just retry the test it doesn't happen again. do you want to see these?20:29
tyhicksbjf: looking at the test, that may be caused by a race condition which would explain why you don't always see it20:31
bjftyhicks, do you want to look at the system when it gets into that state?20:32
tyhicksbjf: yeah, that would be helpful20:32
bjftyhicks, ok, next time it happens and you're around i'll let you know. do you have your QA lab vpn creds?20:33
tyhicksbjf: I think so, but it has been a while so I'm not 100% sure20:33
bjftyhicks, ack20:33
tyhicksbjf: yep, I see it in my network manager menus20:33
tyhicksand it connects, so I should be ready when it happens again20:34
tyhicksbjf: what kernel version is this happening on? do you know what lower filesystem ecryptfs is mounted on when it happens?20:35
* rtg -> EOD21:15
