[03:58] <apb1963> ubuntu 12.04.3  3.2.0-57-generic-pae #87-Ubuntu SMP Tue Nov 12 21:57:43 UTC 2013 i686 i686 i386 GNU/Linux   CRASH.  FREEZE....reboot.... FREEZE....reboot... FREEZE....reboot.... FREEZE
[04:01] <apb1963> :(
[04:02] <apb1963> I get about 30 minutes of uptime now... then it freezes.
[04:03] <apb1963> I don't see anything exciting in the logs.... but then I don't know what to look for.
[04:04] <apb1963> started happening yesterday
[04:05] <apb1963> or perhaps this morning... something came down in an update, I don't know what.
[04:06] <apb1963> AFAIK everything is current.
[05:23] <apb1963> And we have another freeze... and reboot.
[06:56] <apb1963> And another freeze.... and reboot
[09:01] <apw> apb1963, if this is sudden, then you want to try an older kernel and see if it does it there
[09:03] <smb> apw, Or gfx, whatever or whichever he got
[09:03] <apw> moin
[09:03] <smb> apw, moin
[09:04] <cking> yawn
[09:04] <smb> fwiw I was running that kernel for a bit but now moved on as I got proposed enabled
[09:13] <apb1963> apw Would the update manager have downloaded and installed a new kernel?
[09:14] <apb1963> The date here....3.2.0-57-generic-pae #87-Ubuntu SMP Tue Nov 12 21:57:43 UTC 2013 i686 i686 i386  isn't that the date the kernel was compiled?
[09:15] <smb> yes that is the compile date, but also yes update manager also does download and install kernel packages just like any other
[09:16] <apb1963> ok, so .... did something force a download within the last 24 hours?  Wouldn't I have gotten that kernel 3 weeks ago?
[09:16] <apw> you can tell what has been downloaded in the last 24 hours in your apt logs
[09:16] <apb1963> cool.  let me check that.
[09:18] <smb> Even longer in /var/log/apt/history.log* 
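A minimal sketch of the check apw and smb describe: scan the text of /var/log/apt/history.log for transactions started in the last 24 hours. It assumes the usual stanza layout (blank-line separated blocks with `Start-Date:`, `Install:`, `Upgrade:` lines); the function name `recent_apt_entries` is made up for illustration.

```python
from datetime import datetime, timedelta


def recent_apt_entries(log_text, now, window=timedelta(hours=24)):
    """Return (start_time, action_lines) for stanzas started within `window` of `now`."""
    entries = []
    # Stanzas in the apt history log are assumed to be separated by blank lines.
    for stanza in log_text.strip().split("\n\n"):
        start = None
        actions = []
        for line in stanza.splitlines():
            key, _, value = line.partition(": ")
            if key == "Start-Date":
                # Collapse repeated spaces between the date and the time.
                start = datetime.strptime(" ".join(value.split()), "%Y-%m-%d %H:%M:%S")
            elif key in ("Install", "Upgrade", "Remove", "Purge"):
                actions.append(line)
        if start is not None and start <= now and now - start <= window:
            entries.append((start, actions))
    return entries
```

On a real system this could be called as `recent_apt_entries(open("/var/log/apt/history.log").read(), datetime.now())`; rotated logs (history.log.1.gz and so on) would need to be decompressed first.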
[09:20] <apb1963> yeah that's what I was looking at... not a lot in there.  However, notice the time of my last comment before you guys woke up.
[09:20] <apb1963> it's been over 2 hours
[09:22] <apb1963> The only thing that really happened in that time is I updated sflphone - an application I use.  A softphone.  I went into that channel several hours ago... and noticed that the developer mentioned to me he thought he had fixed a bug I reported.  I told him of this freeze problem - he didn't respond, but another update came down shortly after... I installed that and now it's been 2 hours since the last freeze.
[09:23] <apb1963> That's the same app that was tickling a DBUS bug I reported not too long ago.
[09:23] <apb1963> It was crashing my system... so he changed his code... to accommodate.
[09:24] <apb1963> So many changes going on it's hard to know what's causing what.
[09:24]  * apw nods
[09:25] <apb1963> And... as best as I can remember...  I thought I updated to the -57 version a few weeks ago... but i'm not 100% sure... hell, it could have been -56
[09:26] <apb1963> I can post the apt log if you wouldn't mind having a look.  I'm sure you're more familiar with the various things than I am.
[09:26] <smb> apb1963, Usually you have at least 2 (usually even more if you don't manually delete them) of the previous kernels
[09:26] <apb1963> I haven't deleted anything 
[09:27] <apb1963> although I'm not sure where to look for them
[09:27] <smb> If you hit left shift during boot (or modify grub to always show up for a few seconds) you can select them
[09:27] <smb> ls /boot
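As a rough companion to `ls /boot`, the sketch below picks the installed kernel versions out of a directory listing and orders them oldest first. It assumes kernel images are named `vmlinuz-<version>`, as on Ubuntu; the helper name `installed_kernels` is hypothetical.

```python
import re


def installed_kernels(filenames):
    """Return kernel version strings taken from vmlinuz-* names, oldest first."""
    prefix = "vmlinuz-"
    versions = [name[len(prefix):] for name in filenames if name.startswith(prefix)]

    def sort_key(version):
        # Compare the numeric parts only, e.g. 3.2.0-57 -> [3, 2, 0, 57].
        return [int(n) for n in re.findall(r"\d+", version)]

    return sorted(versions, key=sort_key)
```

For a live system, `installed_kernels(os.listdir("/boot"))` (after `import os`) would give the versions that should also appear in the grub menu.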
[09:28] <apb1963> Yeah....  I tried hitting both left & right shift... it just ignores it.
[09:28] <smb> The timing is... trick
[09:28] <smb> tricky
[09:28] <apb1963> I held it down for the entire boot sequence... it ignored me
[09:28] <smb> I prefer to modify /etc/default/grub to comment out the two *HIDDEN* lines
[09:29] <smb> set the other timeout to 10 and run "sudo update-grub"
[09:29] <apb1963> heh.  I've got the last 10 kernels.
[09:29] <apb1963> I will do that
[09:30] <apb1963> so... this one...#GRUB_HIDDEN_TIMEOUT=0
[09:31] <smb> right and there is another one
[09:31] <smb> #GRUB_HIDDEN_TIMEOUT=0
[09:31] <smb> #GRUB_HIDDEN_TIMEOUT_QUIET=true
[09:31] <smb> GRUB_TIMEOUT=10
[09:31] <apb1963> GRUB_HIDDEN_TIMEOUT_QUIET=true
[09:31] <apb1963> so that was already uncommented
[09:32] <smb> apb1963, They have to be like I posted above
[09:32] <smb> Both HIDDEN commented out and the GRUB_TIMEOUT set to whatever wait time you want
[09:34] <smb> If hidden timeout is true, the hidden timeout value is used (which is 0 by default), which gives you no delay and you have to press shift exactly at the right time between BIOS screen after keyboard is active and before grub starts.
[09:35] <smb> Not really much time
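A small sketch of smb's rule as a check on the contents of /etc/default/grub: the menu is expected to show only if both GRUB_HIDDEN_* lines are commented out and GRUB_TIMEOUT is non-zero. This models just the three settings named above, not GRUB's full logic, and `grub_menu_visible` is a made-up name.

```python
def grub_menu_visible(config_text):
    """Return True if the settings match the 'always show the menu' advice."""
    settings = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        # Commented-out and blank lines do not set anything.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip().strip('"')

    # Either HIDDEN setting being active brings the hidden timeout back into play.
    hidden_active = any(
        key in settings for key in ("GRUB_HIDDEN_TIMEOUT", "GRUB_HIDDEN_TIMEOUT_QUIET")
    )
    try:
        timeout = int(settings.get("GRUB_TIMEOUT", "0"))
    except ValueError:
        timeout = 0
    # A timeout of 0 boots immediately; any other value leaves time to pick an entry.
    return not hidden_active and timeout != 0
```

The file only takes effect after `sudo update-grub` has been run, so a True result here says nothing about whether the generated grub.cfg is already up to date.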
[09:35] <apb1963> go it
[09:36] <apb1963> got it
[09:37] <apb1963> Yeah, I've been experiencing all kinds of weird stuff
[09:37] <apb1963> My network takes like 2 minutes to come up
[09:38] <apb1963> the network manager can't even see my network - not that I use network manager but still
[09:38] <apb1963> I have an approximate 7 second delay on SIP calls.  Nobody has a clue what that's all about.
[09:39] <apb1963> Stuff crashes randomly
[09:39] <apb1963> Various applications
[09:39] <apb1963> I get loads over 7 for no apparent reason
[09:39] <apb1963> Normal operation is like less than .5
[09:40] <apb1963> I've been reporting bugs left & right in various apps.
[09:40] <apb1963> I'm frustrated :/
[09:40] <apb1963> well, at least my machine stopped freezing up for now.
[09:41] <apw> you should try running development where that kind of experience is the norm
[09:43] <apb1963> Yeah well...... this is my "production" machine.  And I only have the one. 
[09:45] <apb1963> This machine was supposed to be an asterisk server.... then my only other machine died of hardware failure... and I was forced to install a desktop on this one...
[09:45] <apb1963> So I'm waiting for a hand-me-down from my brother for 3 weeks now, and I don't know if he's ever going to send it.
[09:46] <apb1963> In the meantime, I'm trying to get by on just this one machine running everything... 
[09:47] <apb1963> Don't ever be poor if you can avoid it.
[09:49] <apb1963> I can't decide if I should start some of the stuff I had running earlier to see if it triggers the freeze
[09:49] <apb1963> About 35 chrome windows, libreoffice with a half dozen or so files open, skype, Kontact, firefox....
[09:50] <smb> Well doing one at a time and wait some time in between at least would give some hints
[09:50] <apb1963> yeah
[09:50] <apb1963> I have a strong suspicion it was sflphone
[09:51] <apb1963> how an app could cause a kernel freeze though.... I don't know.
[09:51] <smb> Is it really a kernel freeze? Sometimes X locking up hard just looks the same
[09:51] <apb1963> I was afraid it might be bad RAM... is there a way to test RAM while the machine is booted?
[09:51] <apb1963> I couldn't switch to a VT
[09:52] <apb1963> When X locks up I can generally switch to a VT
[09:52] <smb> I had those too in the past. if X takes all the keypresses and ignores them. Hard to say then
[09:53] <apb1963> I mean my system gets a high load that brings it to a crawl... but eventually I can get a terminal and run top... it shows me a ridiculous load, I start killing a few chrome windows and it bounces back.
[09:53] <apb1963> I think flash is causing the load
[09:53] <apb1963> but this was different
[09:53] <smb> Only chance would be a second computer on the network and trying to ping or ssh... If you get top running, I think 'P' (shift+p) sorts by cpu usage, so you would see which process is doing that
[09:54] <apb1963> Oh and the very first time this happened I got some kernel messages
[09:54] <apb1963> yeah
[09:54] <apb1963> I got like a "cut here" message
[09:54] <apb1963> I couldn't find it in the log
[09:54] <apb1963> but it was a definite kernel crash
[09:54] <apw> photos are your main plan when that happens in case it isn't in the log
[09:54] <apb1963> after the crash it just kept freezing
[09:55] <apb1963> photos?
[09:55] <apw> if you can see it, so can your camera
[09:55] <apb1963> oh.  yeah well...  let me use the "poor" key word again
[09:55] <apb1963> sucks to be me
[09:56] <apb1963> oh well :)
[09:56] <smb> Pen and paper? Yeah it is tedious. ;)
[09:56] <apb1963> well...  I just assumed it would be in the log
[09:56] <apb1963> so I didn't even consider that
[09:57] <smb> Yeah, problem with crashes is that the kernel grinds to a halt. So is the part that writes to disk. 
[09:58] <apb1963> I don't know why, but I figured if it had time to write to the screen it had time to write to disk.  What can I say, I wasn't thinking.
[09:59] <apb1963> ok i'll startup skype
[09:59] <apb1963> hmmm... might have already been running.... came up too fast
[09:59] <apb1963> and now we call libreoffice to the stand....
[10:01] <smb> apb1963, If you can afford the time it would be good to wait about those 30m it took to freeze between additional applications
[10:02] <smb> ok skype was running for that I suppose
[10:03] <apb1963> well, I can afford the time - but only because it's 2am and I need sleep
[10:04] <apb1963> so if it crashes or freezes, so be it.
[10:04] <apb1963> But let's say it does... then what do I do?
[10:09] <smb> If you see crash messages on the screen, write them down. Then probably repeat with just the last app started 
[10:10] <smb> if you find a suspect, you could slowly go back to older kernel versions to see whether they are the same
[10:12] <smb> If it looks to be the kernel start filing a bug by running "ubuntu-bug linux" from that machine.
[10:14] <smb> If it keeps crashing even with older kernels and you find it's a certain application being started, you probably have to make the bug report manually. Hm, well try to replace linux with the appname. Not sure this works though
[10:15] <apb1963> hmm
[10:17] <apb1963> there's also the further complication that I'm using 6 different virtual desktops
[10:18] <apb1963> so who knows if that's interacting
[10:18] <apb1963> earlier I was sorting my apps into VD's... this last time I didn't bother.
[10:18] <apb1963> Would be nice if apps returned to their assigned VD's, but that's another story.
[10:20] <smb> I could not guarantee that, but which VD an app is on should not matter for crashing. And no, we cannot help you with that placement problem here. ;)
[10:20] <apb1963> k
[10:20] <apb1963> KDE thing I guess
[10:22] <apb1963> well, thank you for the  help!  I'm gonna hit the hay before my head hits the keyboard :)
[10:22] <apb1963> 'night
[10:23] <smb> That may help a lot. Good night. :)
[13:46] <rtg> henrix, hmm, v3.12.3 broke e1000e. have you heard anything about that ? I'll check that it really works on 3.12.2 first
[13:55] <henrix> rtg: no, i haven't found any e1000e patch but usually network patches aren't tagged for stable
[13:55] <henrix> rtg: davem queues them and sends them in a batch
[13:55] <henrix> rtg: let me check his queue...
[13:57] <henrix> rtg: a quick look here http://patchwork.ozlabs.org/bundle/davem/stable/?state=* doesn't show anything
[13:58] <henrix> and there's nothing obvious on 3.12.3 that would break this driver
[13:58] <bjf> rtg, how does the issue manifest itself? does it just not recognize the nic or does it spew chunks?
[14:01] <rtg> bjf, henrix: actually, this looks like AA denying dhclient. 3.12 kernel in a precise user space
[14:02] <rtg> I'll try booting with it disabled.
[14:06] <rtg> that was it.
[14:07] <rtg> jjohansen1, can you think of a reason that a 3.12 kernel wouldn't work with a precise apparmor ? it's blocking dhclient.
[14:09] <smb> It's a bit weird my VM guest was ok with dhcp and a ... oh 3.11 kernel ... 
[14:10] <rtg> smb, try a mailine 3.12.3
[14:10] <rtg> mainline
[14:11] <apw> that sounds familiar, it blocking dhclient i mean
[14:11] <smb> rtg, Yeah maybe later. Right now I use that for fiddling with some lts-s dkms
[14:11] <apw> wasn't that the error we had on the phones when part of apparmor3 was missing ?
[14:11] <rtg> apw, maybe
[14:11] <apw> or if userspace was out of sync with kernel, something like that
[14:12] <apw> smb, test the 3.13 kernel if that works we can hide from it perhaps
[14:12] <rtg> apw,  a newer kernel should always work on an older user space
[14:13] <smb> apw, can do in a bit. but I suspect I might be too late then
[14:17] <apw> rtg, oh of course it should, i mean when the aa3 bits went in there was an order that was bad, i think kernel had to hit before userspace
[14:17] <apw> so we could have lost something in the kernel and broken ourselves with newer userspace
[14:17] <rtg> apw, that shouldn't be the case for precise, right ?
[14:18] <apw> oohhhh, this is lts-backport, erm, it just might, as in the new kernel is enforcing something which was specified but broken in older kernels
[14:19] <apw> so let's think... i think we had to have updated user profiles _first_ then the new kernel bits for aa3, which might explain this
[14:19] <rtg> apw, well, it's not really the LTS, it's just the trusty kernel (though it shouldn't make a difference)
[14:19] <apw> but ... we need jjohansen1 rather than my rather unreliable memory
[14:19] <apw> right, this is the kernel with aa3 on precise ie old profiles, that might be an issue
[14:20] <apw> we might have to update the profiles, which will work same on older kernels.  but jj would be the one to confirm for sure
[14:20] <rtg> apw, so I'm running a 3.13 kernel on a precise server
[14:20] <apw> and it works or does not work
[14:21] <rtg> works fine
[14:21] <apw> using dhcp or static network config
[14:21] <apw> i think it was dbus mediation which was the issue
[14:22] <rtg> apw, uses dhcp. lemme go get the bit of log where AA was denying the socket request.
[14:22] <apw> ie nothing wrong with networking, just unable to request networking
[14:22] <apw> something odd like that, so ethernet might work, and wireless not or ... something
[14:23] <rtg> Dec  5 23:58:24 gbyte kernel: [  644.311454] type=1400 audit(1386251904.862:93): apparmor="DENIED" operation="connect" parent=2101 profile="/usr/lib/NetworkManager/nm-dhcp-client.action" name="/run/dbus/system_bus_socket" pid=2103 comm="nm-dhcp-client." requested_mask="r" denied_mask="r" fsuid=0 ouid=0
[14:23] <rtg> De
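To make a denial like the one rtg pasted easier to read, here is a minimal sketch that splits an audit message into its key=value fields. The regular expression is an assumption based on this one log line, not a full parser for the audit format, and `parse_audit_fields` is a hypothetical name.

```python
import re


def parse_audit_fields(line):
    """Extract key=value and key="value" pairs from an audit log line."""
    # A value is either a double-quoted string or a run of non-space characters.
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    return {key: value.strip('"') for key, value in pairs}


# The message pasted in the channel, used here as sample input.
SAMPLE = (
    'Dec  5 23:58:24 gbyte kernel: [  644.311454] type=1400 '
    'audit(1386251904.862:93): apparmor="DENIED" operation="connect" parent=2101 '
    'profile="/usr/lib/NetworkManager/nm-dhcp-client.action" '
    'name="/run/dbus/system_bus_socket" pid=2103 comm="nm-dhcp-client." '
    'requested_mask="r" denied_mask="r" fsuid=0 ouid=0'
)

fields = parse_audit_fields(SAMPLE)
```

Here `fields["profile"]` and `fields["name"]` give the confined program and the object it was refused (the system D-Bus socket), which is the pairing apw's dbus mediation theory points at.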
[14:23] <rtg> apw, yep, the network driver seems fine
[14:24] <apw> yeah that'd be my memory of the issue, that the dbus bits get mediated and prevented
[14:25] <apw> because the profiles specify it wrongly, but the older kernels don't implement it either
[14:25] <rtg> apw, yeah, it's also denying cupsd
[14:25] <apw> so it all works, and then we fix the kernel and it applies them and crunch
[14:25] <apw> so i think we will need a profile update for precise before this can work, but as i say
[14:25] <apw> jjohansen1, is the man to confirm my madness
[14:25] <apw> or we might decide to switch that bit off for the backport only
[14:29] <rtg> apw, just pushed Ubuntu-3.12.0-6.14 which we should smoke test just to be sure
[14:29] <apw> ack
[14:31] <apw> rtg, i take it you didn't tag it
[14:31] <apw> (as yet)
[14:31] <rtg> nope, not until I'm done making changes
[14:31] <rtg> I generally tag things just before uploading
[14:34] <apw> makes sense, just confirming i'd not gotten an old one
[14:40] <apw> rtg, i've just responded to your linux-tools thing, i don't think it's quite there ...
[14:40] <rtg> apw, good, I wondered if I missed anything. I had a splitting headache whilst I was doing that
[16:37] <apw> rtg, you know, i wonder if we should be offering them the linux-tools-version[-flavour] and also offering them something based on the linux-image- meta package they have installed perhaps
[16:38] <rtg> apw, well, the meta package naming isn't totally consistent. I think this at least will give folks an idea that they might have to hunt down the _right_ meta package.
[16:40] <apw> yeah, i think we should get the -flavour bit right, shall i have a poke and see if i can improve
[16:40] <rtg> apw, have at it. Frankly, my attitude is that if someone is installing tools, then they likely know what they are doing (given a little nudging)
[16:52] <apw> rtg, it would be nice to not need that to have versions in there, i wonder if we could fix that somehow
[16:52] <rtg> apw, do you mean you wish perf was not ABI specific ?
[16:54] <apw> rtg, hey i also wish that.  i meant i wish i could incant at like dpkg to get the list of versions and packages
[16:54] <apw> so it would work on all releases for ever
[16:55] <rtg> apw, well, we've only got to worry about one more LTS kernel for 12.04
[16:55] <rtg> and we can probably predict the meta package name
[16:59] <apw> yeah
[18:39]  * rtg -> lunch
[18:52] <hallyn> apw: could you kick off a new build of the trusty -unstable kernel in ppa?
[19:06] <jjohansen1> apw, rtg: nothing new in 3.12 over 3.11 that I am aware of, however the aa3 bits in both of those do require some policy updates or dhclient etc will break
[19:06] <rtg> jjohansen1, which is exactly what is happening
[19:07] <jjohansen1> yep
[19:07] <jjohansen1> sorry just working through the backscroll
[20:29] <bjf> cking, i see this every once in a while in testing: "Timed out after 30 seconds doing mkdir() - possible eCryptfs hang" but if i just retry the test it doesn't happen again. do you want to see these?
[20:31] <tyhicks> bjf: looking at the test, that may be caused by a race condition which would explain why you don't always see it
[20:32] <bjf> tyhicks, do you want to look at the system when it gets into that state?
[20:32] <tyhicks> bjf: yeah, that would be helpful
[20:33] <bjf> tyhicks, ok, next time it happens and you're around i'll let you know. do you have your QA lab vpn creds?
[20:33] <tyhicks> bjf: I think so, but it has been a while so I'm not 100% sure
[20:33] <bjf> tyhicks, ack
[20:33] <tyhicks> bjf: yep, I see it in my network manager menus
[20:34] <tyhicks> and it connects, so I should be ready when it happens again
[20:35] <tyhicks> bjf: what kernel version is this happening on? do you know what lower filesystem ecryptfs is mounted on when it happens?
[21:15]  * rtg -> EOD