FernandoMiguel | nite | 00:00 |
---|---|---|
apw | RAOF, hey i have just had another of those X lockups, triggered by touching the edge of the screen to pull out the launcher | 08:45 |
apw | any suggested triage, before i start restarting bits? | 08:46 |
RAOF | apw: Still there? | 08:47 |
apw | RAOF, yep | 08:47 |
RAOF | apw: Has X crashed, or frozen? | 08:47 |
RAOF | If it's frozen, could you ssh in, attach gdb, and nab a backtrace? | 08:47 |
apw | xserver still there, it's in a futex wait | 08:47 |
apw | RAOF, no idea if it's X or compiz, or another | 08:48 |
RAOF | Oh, really. | 08:48 |
RAOF | Is compiz in D state by any chance? | 08:48 |
apw | compiz is Sl | 08:48 |
RAOF | Hm. | 08:50 |
apw | RAOF, X appears to be in an intel_drv.so thingy, called out of _CallCallbacks | 08:50 |
RAOF | Ah. Could you pastebin a backtrace please? | 08:50 |
apw | RAOF, http://paste.ubuntu.com/869540/ | 08:52 |
* RAOF is confused. | 08:53 | |
RAOF | Could you do the same, with debugging symbols installed? | 08:53 |
apw | RAOF, this is a relatively recent behaviour, and i am pretty sure bjf has also hinted at similar behaviour -- certainly this is the third of this behaviour i've seen | 08:53 |
apw | RAOF, where doth i get those | 08:53 |
RAOF | apt-get install xserver-xorg-core-dbg xserver-xorg-video-intel-dbg xserver-xorg-input-evdev-dbgsym | 08:54 |
RAOF | And let's throw in libdrm2-dbg libdrm-intel1-dbg | 08:54 |
RAOF | That looks an *awful* lot like a deadlock caused by a signal handler, doesn't it. | 08:54 |
apw | E: Unable to locate package xserver-xorg-input-evdev-dbgsym | 08:55 |
RAOF | Oh, *ARSE* | 08:55 |
RAOF | Ignore evdev, then. | 08:55 |
* RAOF has a sneaking suspicion that he knows what the problem is. | 08:56 | |
apw | something i can type to get confirmation for you | 08:56 |
RAOF | No, I don't think so. | 08:57 |
RAOF | But a backtrace with function names would help me to check. | 08:57 |
apw | http://paste.ubuntu.com/869546/ | 08:58 |
apw | RAOF, ^ | 08:58 |
RAOF | Had you closed a window shortly before this happened? | 09:00 |
apw | RAOF, i had just touched the left edge about to reel out the launcher | 09:00 |
apw | the odd shadow thing is still there i think | 09:00 |
jcristau | sending events from the sigio handler sounds wrong. | 09:01 |
RAOF | evdev... with a mouse, right? | 09:01 |
RAOF | jcristau: Yeah, I just noticed that. | 09:01 |
apw | RAOF, indeed mouse in motion | 09:01 |
jcristau | no wonder it deadlocks | 09:01 |
RAOF | jcristau: Hence my *ARSE* :/. | 09:01 |
RAOF | The wonder is that it deadlocks so infrequently. | 09:01 |
apw | are you allowed to do _anything_ in a signal handler? | 09:02 |
jcristau | apw: not much. | 09:02 |
jcristau | you're supposed to read the input event, queue it, and get the hell out. | 09:02 |
apw | i am surprised reading input is allowed, let alone any of the other malarkey | 09:02 |
RAOF | You can write() | 09:02 |
apw | so you can send yourself a hint | 09:03 |
jcristau | signal(7) has a list of functions you're allowed to call | 09:03 |
RAOF | *much* of writing events qualifies; probably something mallocs in a certain case. | 09:03 |
RAOF | I was just incautious and wasn't thinking that ConstrainCursorHarder was called from a signal context. | 09:04 |
jcristau | well in X sending events is not allowed from the sighandler, because it does more than write() | 09:04 |
RAOF | Yeah. | 09:04 |
* apw would be surprised if malloc was safe most of the time | 09:04 | |
jcristau | apw: it's not | 09:04 |
jcristau | malloc takes a lock | 09:05 |
jcristau | so if malloc gets interrupted by the signal and you call it again from the handler, it goes boom | 09:05 |
RAOF | Indeed. | 09:05 |
apw | RAOF, it's a good job Intel graphics aren't common ... sigh | 09:06 |
RAOF | In this case, probably not malloc; looks like drmIoctl takes out a lock. | 09:06 |
apw | RAOF, yeah you'd hope wouldn't you | 09:07 |
RAOF | Anyway, time to rework this sucker. | 09:07 |
apw | RAOF, need anything more from this hang, or shall i kill it off | 09:07 |
RAOF | Kill it with extreme prejudice. | 09:07 |
jcristau | RAOF: is that bug upstream too? | 09:08 |
RAOF | jcristau: No; it's in my code (a prototype of which is languishing on the list) | 09:08 |
jcristau | ok | 09:08 |
RAOF | Would have been nice for someone to point out that CursorConstrainCursor is called in a signal context, though :) | 09:09 |
RAOF | The input code does a scary amount of work in that context. | 09:09 |
jcristau | having annotations in the code about it could be helpful.. | 09:10 |
jcristau | but i guess it'd easily get stale | 09:10 |
RAOF | Maybe | 09:10 |
jcristau | maybe sparse could handle that | 09:11 |
RAOF | Well, easily get stale, but the flow of the sigio handler shouldn't change *that* much between releases. | 09:11 |
RAOF | How well does sparse handle calls chained through function pointers? | 09:11 |
jcristau | no idea | 09:11 |
jcristau | i should try and play with it some time | 09:12 |
RAOF | It would seem to be something amenable to static analysis, but as I say, there's a surprising amount of code there. Fun fact! It calls into the nvidia blob now. | 09:12 |
jcristau | read input calls into nvidia?? | 09:12 |
RAOF | Yes. | 09:12 |
apw | RAOF, i hope you don't mean the intel driver :) | 09:13 |
jcristau | for hw cursor, or something else? | 09:13 |
RAOF | Because read input wraps into ConstrainCursorHarder, and nvidia now has a ConstrainCursorHarder hook. | 09:13 |
jcristau | ah. | 09:13 |
RAOF | To handle the same thing as xrandr's ConstrainCursorHarder hook. | 09:13 |
RAOF | One might think that the xrandr/ subdirectory would be free of the sigio context; you'd be mistaken. | 09:14 |
apw | RAOF, ok just in case i took a gcore of it | 09:14 |
RAOF | apw: Well, I now know of at least one critical bug in my code. | 09:15 |
apw | RAOF, heh well something came out of my hang :) | 09:16 |
RAOF | And with that, dinner. | 09:16 |
=== yofel_ is now known as yofel | ||
ossguy | is this the right place to discuss regressions in X between Ubuntu 12.04 Alpha and Beta? | 16:05 |
ossguy | I did a dist-upgrade this morning (last one was a week ago, before Beta was released) and now it will only start in low graphics mode | 16:06 |
tjaalton | which driver do you use? | 16:09 |
ossguy | sorry, it's radeon | 16:09 |
tjaalton | so no fglrx? | 16:10 |
tjaalton | installed | 16:10 |
ossguy | correct | 16:10 |
ossguy | never installed | 16:10 |
tjaalton | pastebin /var/log/Xorg.0.log | 16:10 |
ossguy | yep, just a sec | 16:10 |
tjaalton | (pastebinit $foo) | 16:13 |
ossguy | sorry, I'm otherwise distracted currently :) | 16:13 |
ossguy | it's at http://ossguy.com/xorg/20120305/1010/Xorg.0.log now | 16:14 |
ossguy | and http://ossguy.com/xorg/20120305/1010/Xorg.0.log.old | 16:14 |
ossguy | (running 2 different kernel versions; I got the same issue with all of them (including -18, which I tried before these)) | 16:14 |
tjaalton | dualscreen? | 16:15 |
ossguy | yes | 16:16 |
ossguy | identical model screens | 16:16 |
tjaalton | so which version was the last working one? | 16:17 |
ossguy | neither of the 2 I posted are from working X sessions | 16:18 |
ossguy | I will try to find you a log from a working session | 16:18 |
tjaalton | just try an older kernel abi version | 16:18 |
tjaalton | if you have those installed | 16:18 |
ossguy | how much older do you mean? | 16:18 |
tjaalton | one that works | 16:19 |
ossguy | I have 3.2.0-12, -15, -17, and -18 | 16:19 |
tjaalton | so try -12 | 16:19 |
ossguy | I've tried the last 3 | 16:19 |
ossguy | ok | 16:19 |
ossguy | I will once I'm in a state where I can reboot the system I'm on | 16:19 |
tjaalton | actually, the logfile says it's using 19x12 | 16:23 |
ossguy | log from working sessions is here: http://ossguy.com/xorg/20120216/1628/Xorg.1.log | 16:23 |
ossguy | s/sessions/session/ | 16:23 |
tjaalton | [ 73.483] (II) RADEON(0): Output DisplayPort-0 using initial mode 1920x1200 | 16:23 |
tjaalton | [ 73.483] (II) RADEON(0): Output HDMI-0 using initial mode 1920x1200 | 16:23 |
ossguy | yes, I noticed the screen was the usual res | 16:24 |
ossguy | but I got the box saying it was running in low graphics mode | 16:24 |
ossguy | and there didn't seem to be an easy way to get around it | 16:24 |
tjaalton | huh? | 16:24 |
ossguy | after I did the dist-upgrade this morning, I rebooted the system and X started up with the "you're running in low graphics mode" message | 16:26 |
ossguy | but as far as I could tell, it was running at 19x12 (and the logs confirm this) | 16:26 |
tjaalton | ahah, so xdiagnose fired up for some reason | 16:26 |
ossguy | right (I guess :) ) | 16:27 |
tjaalton | well you didn't say it looked normal | 16:27 |
ossguy | also, the screens were in mirroring mode | 16:27 |
ossguy | not side-by-side, as I had configured | 16:27 |
tjaalton | it's a per-user setting | 16:28 |
ossguy | ah, right | 16:28 |
ossguy | ok, so we didn't even get there | 16:28 |
ossguy | any ideas why xdiagnose might have decided to run? | 16:28 |
tjaalton | no.. | 16:28 |
ossguy | or how to get it to not run? | 16:28 |
tjaalton | it runs on every boot? | 16:28 |
ossguy | yes | 16:28 |
tjaalton | bryceh: ^ :) | 16:28 |
ossguy | (or maybe a different venue where it's more appropriate to ask this question?) | 16:29 |
tjaalton | no this is fine | 16:29 |
ossguy | ok | 16:29 |
tjaalton | basically it would mean that lightdm had failed to start | 16:30 |
tjaalton | check /var/log/lightdm/* | 16:30 |
tjaalton | oh, dist-upgrade.. check that you have lightdm _installed_ | 16:31 |
ossguy | lightdm produced logfiles the last time I started Ubuntu | 16:31 |
ossguy | so I presume it is installed | 16:31 |
tjaalton | ok, check what's in them | 16:31 |
ossguy | [+0.67s] DEBUG: Failed to load session file /usr/share/xgreeters/.desktop: No such file or directory: | 16:32 |
ossguy | [+0.67s] DEBUG: Failed to start greeter | 16:32 |
ossguy | perhaps that is an issue? | 16:32 |
ossguy | I will post the full log; just a sec | 16:32 |
tjaalton | is unity-greeter installed? | 16:33 |
tjaalton | apt-cache policy unity-greeter | 16:33 |
ossguy | sorry, I'm not booted into Ubuntu right now (I'm in a different distro on the same machine) | 16:33 |
ossguy | I can check some apt-related files, though | 16:34 |
tjaalton | check the upgrade log | 16:34 |
tjaalton | dpkg.log | 16:34 |
ossguy | ok | 16:34 |
tjaalton | and you can always chroot into the partition :) | 16:35 |
ossguy | lightdm log here: http://ossguy.com/xorg/20120305/1010/lightdm.log | 16:35 |
ossguy | true; I'd rather not do anything funny there, though :) | 16:35 |
tjaalton | do you have /usr/share/xgreeters/unity-greeter.desktop | 16:36 |
ossguy | no, /usr/share/xgreeters does not exist | 16:36 |
tjaalton | so no /usr/sbin/unity-greeter either? | 16:36 |
tjaalton | grep unity-greeter /var/log/dpkg.log ? | 16:37 |
ossguy | it suggests unity-greeter was removed | 16:37 |
tjaalton | there you go | 16:37 |
tjaalton | never blindly run dist-upgrade | 16:38 |
ossguy | I presume something was unexpected here | 16:38 |
ossguy | ah, ok, good lesson I suppose :) | 16:38 |
ossguy | what's the best/easiest way to get back to working? | 16:38 |
tjaalton | but first upgrade and then see if dist-upgrade would remove stuff | 16:38 |
tjaalton | ubuntu-desktop got removed as well I guess, so install it | 16:39 |
ossguy | yep, looks like it was removed, too | 16:39 |
ossguy | so what was the way I should have done this upgrade? | 16:39 |
ossguy | I mean eventually I want to run "dist-upgrade" | 16:39 |
tjaalton | only when it's not doing silly things | 16:40 |
ossguy | so how should I determine when is the right time to do so? | 16:40 |
ossguy | ok | 16:40 |
ossguy | so in alpha/beta I should expect silly things | 16:40 |
tjaalton | a good indicator is when it's removing *-desktop.. | 16:40 |
tjaalton | yes | 16:40 |
ossguy | right :P | 16:40 |
tjaalton | transitions etc | 16:41 |
ossguy | tjaalton: thanks very much for all your help | 16:43 |
ossguy | and for bearing with a dumb user ;) | 16:43 |
tjaalton | yw | 16:45 |
=== Sinnerman is now known as Cobalt | ||
=== FernandoMiguel is now known as FM_fewd | ||
=== FM_fewd is now known as FernandoMiguel | ||
Sarvatt | hmm, haven't seen this input related crash come up before, just hit it http://paste.ubuntu.com/870647/ | 22:45 |
bryceh | RAOF, thoughts on bug 931967? | 22:52 |
ubottu | Launchpad bug 931967 in OEM Priority Project "Corrupted graphics after the login until the unity launcher appears" [Critical,Triaged] https://launchpad.net/bugs/931967 | 22:52 |
bryceh | I'm assuming it's just that when lightdm hands off to X, the framebuffer isn't zero'd out. something lightdm should be doing? | 22:53 |
RAOF | I started looking at that, then got pulled into more pressing issues. | 22:54 |
RAOF | What I had was: lightdm draws to the root window. | 22:54 |
RAOF | Or, rather *unity-greeter* draws to the root window. | 22:55 |
RAOF | This should mean that whatever unity-greeter leaves on the display stays there until something else draws; that something else should be compiz. | 22:55 |
RAOF | However, on *some* drivers that isn't happening; it doesn't happen on intel, it does on radeon and nouveau. | 22:56 |
RAOF | Which suggests it might be an EXA issue, or something. | 22:57 |
bryceh | fglrx and -nvidia too aiui | 22:57 |
bryceh | yep, just repro'd on -nvidia | 22:57 |
RAOF | I wonder if RetainPermanent is horribly unimplemented. | 23:04 |
RAOF | What I was going to try is to replace unity-greeter's cairo render-to-root-pixmap code with a straight X11 CopyArea from a pixmap to the root window. | 23:05 |
bryceh | hmm, wonder if unity-greeter has some better debugging output... | 23:07 |
RAOF | bryceh: Hey, you know those xlib-xcb asserts we've been seeing? Think http://article.gmane.org/gmane.comp.freedesktop.xorg.devel/29102 might have something to do with it? | 23:53 |
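
Note on the signal-handler discussion (09:01 to 09:09): jcristau's rule of thumb was "read the input event, queue it, and get the hell out", and RAOF pointed out that write() is allowed. The following is a minimal sketch of that pattern in C, not the X server's actual code. It assumes a single-threaded program; the names `input_event_stub`, `input_init`, `input_pop`, `QUEUE_SLOTS` and `wake_pipe` are hypothetical and chosen only for illustration. The handler calls nothing but read() and write(), both of which appear in the async-signal-safe list that signal(7) points to, and everything that might take a lock (malloc, ioctls, sending events) is left to the main loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* only needed by the usage example below */
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define QUEUE_SLOTS 64                 /* hypothetical queue size */

/* Hypothetical event record; a real server would carry device data. */
struct input_event_stub { int dx, dy; };

static struct input_event_stub queue[QUEUE_SLOTS];
static volatile sig_atomic_t q_head;   /* advanced only by the handler   */
static volatile sig_atomic_t q_tail;   /* advanced only by the main loop */
static int wake_pipe[2] = { -1, -1 };  /* self-pipe to wake the main loop */
static int input_fd = -1;              /* fd the handler reads events from */

/* Signal handler: read, queue, poke the pipe, return.
 * No malloc, no stdio, no locks, no ioctls. */
static void on_input_signal(int signo)
{
    int saved_errno = errno;           /* read/write may clobber errno */
    struct input_event_stub ev;
    ssize_t r;
    (void)signo;

    while (read(input_fd, &ev, sizeof ev) == (ssize_t)sizeof ev) {
        int next = (q_head + 1) % QUEUE_SLOTS;
        if (next == q_tail)
            break;                     /* queue full: drop, never block */
        queue[q_head] = ev;
        q_head = next;
    }
    r = write(wake_pipe[1], "x", 1);   /* a full pipe means "already awake" */
    (void)r;
    errno = saved_errno;
}

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    return flags < 0 ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Install the handler for signo, reading events from fd. 0 on success. */
static int input_init(int fd, int signo)
{
    struct sigaction sa;

    if (pipe(wake_pipe) < 0)
        return -1;
    if (set_nonblocking(wake_pipe[0]) < 0 ||
        set_nonblocking(wake_pipe[1]) < 0 ||
        set_nonblocking(fd) < 0)
        return -1;
    input_fd = fd;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_input_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}

/* Main-loop side, normal context: a real loop would poll() on
 * wake_pipe[0] first. Returns 1 and fills *out if an event was queued. */
static int input_pop(struct input_event_stub *out)
{
    char buf[64];

    while (read(wake_pipe[0], buf, sizeof buf) > 0)
        ;                              /* discard wake-up bytes */
    if (q_tail == q_head)
        return 0;
    *out = queue[q_tail];
    q_tail = (q_tail + 1) % QUEUE_SLOTS;
    return 1;
}
```

To try it, one could create a pipe, pass its read end to `input_init` with `SIGUSR1`, write one `struct input_event_stub` to the write end, call `raise(SIGUSR1)`, and then fetch the event with `input_pop`. The design choice is that the handler drops events when the queue is full instead of waiting, since waiting inside a handler is exactly how the deadlock in this log came about.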