=== wgrant_ is now known as wgrant
[02:01] RAOF, tjaalton: any concerns if I upload the mesa 8.0.3 merge to quantal now?
[02:01] I'm good with that.
[02:02] alrighty
[02:03] up it goes
[05:31] bryceh: yeah, thanks
[08:58] hm, does xorg 1.12 have the signal safe stuff?
[08:59] no
[08:59] iirc
[09:06] not even a point-release?
[09:06] hmm was it the security fix or something else?
[09:08] got it now, apparently still under testing
[09:18] but yeah updating my laptop now, might as well do a thorough sru of https://bugs.launchpad.net/ubuntu/precise/+source/xserver-xorg-input-synaptics/+bug/941953 and confirm it in quantal first :)
[09:18] Launchpad bug 941953 in xserver-xorg-input-synaptics (Ubuntu Precise) "Xorg crashed with SIGSEGV in WriteToClient() with buf = 0x100000000 from ProcXIGetProperty()" [High,Triaged]
[09:20] looks like my desk is getting too small already :)
[09:45] mlankhorst, hey
[09:52] hey
[09:54] mlankhorst, how are you?
[09:54] im good, you?
[09:54] mlankhorst, I'm good thanks
[09:54] mlankhorst, is there any reason you are not on #ubuntu-desktop? ;-)
[09:54] woops must have left at one point and never rejoined
[09:55] mlankhorst, ok, Laney is having login issues and we were looking for xorg expertise ;-)
[10:16] is Xephyr segfaulting on precise for others as well?
[10:18] how do you use it?
[10:18] tjaalton, Xephyr :1
[10:18] DISPLAY=:1 somecommand
[10:20] rather
[10:20] - Xephyr :1 as my user
[10:20] su testuser
[10:20] DISPLAY=:1 gnome-settings-daemon
[10:20] which leads to a
[10:20] Backtrace:
[10:20] 0: Xephyr (xorg_backtrace+0x37) [0xea1107]
[10:20] 1: Xephyr (0xd02000+0x1a2e8a) [0xea4e8a]
[10:20] 2: (vdso) (__kernel_rt_sigreturn+0x0) [0x25640c]
[10:21] every time
[10:21] get a debug build and use gdb?
[10:25] no xserver-xephyr-dbgsym and xserver-xorg-core-dbg doesn't include it :-(
[10:25] I guess I will need to rebuild xorg
[10:25] http://pastebin.ubuntu.com/1048865/ is the non debug bt
[10:27] I ran xterm inside xephyr, and exiting it caused a similar segfault
[10:29] I think we have several of these reported against xserver
[10:30] tjaalton, do you have a number I can track and maybe milestone for 12.04.1? ;-)
[10:31] bug 1009629
[10:31] Launchpad bug 1009629 in xorg-server (Ubuntu) "Xorg crashed with SIGSEGV in DeliverRawEvent()" [High,Confirmed] https://launchpad.net/bugs/1009629
[10:32] tjaalton, thanks
[10:35] possibly https://lists.debian.org/debian-x/2012/05/msg00240.html?
[10:36] sadly no bt in that bug though
[10:36] in the lp one? right
[10:36] but does look similar
[10:37] yeah looks the same
[10:37] there's a revert in 1.12-branch that fixes it
[10:37] oh
[10:37] 58dfb13953af71021317b9d85230b1163198f031
[10:37] I'll check it out
[10:39] uh, there was an sru to _add_ that code
[10:39] seb128: can you reproduce it with -0ubuntu10.1?
[10:39] if you can find it..
[10:40] tjaalton, let me try, do I need to downgrade only xserver-xephyr?
[10:40] seb128: maybe so, I think the code is builtin
[10:40] tjaalton, https://launchpad.net/ubuntu/+source/xorg-server/2:1.11.4-0ubuntu10.1 has the binaries if you want btw
[10:41] yeah I'll try it as well
[10:41] tjaalton, yeah, no segfault with that version
[10:42] confirmed
[10:43] bad cnd :)
[10:43] wonder if there was another fix upstream for the original bug
[10:43] cnd, no cookie for you!
[10:43] thanks jcristau for the pointer
[10:45] seb128: I added a note to the bug
[10:45] tjaalton, thanks
[10:46] tjaalton, who should be assigned to this bug? we want to make sure that regression is fixed before 12.04.1
[10:47] seb128: pick me?
[10:47] mlankhorst, done, thanks ;-)
[10:48] we'll probably move to 1.13 though
[10:48] not for precise
[10:49] That's *totally* SRUable. No new features at all!
[10:49] :)
[10:49] hehehe :P
[10:49] maybe the original bug would need to be reopened as well
[10:49] bug 968845
[10:49] Launchpad bug 968845 in xserver-xorg-input-synaptics (Ubuntu Quantal) "bcm5974 touchpad doesn't work after S3 on MacBookAir" [Medium,Fix released] https://launchpad.net/bugs/968845
[10:52] while you guys are around
[10:52] could somebody look at https://bugs.launchpad.net/ubuntu/+source/xorg-server/+bug/962892
[10:52] Launchpad bug 962892 in xorg-server (Ubuntu) "Xorg crashed with SIGABRT in __assert_fail_base() unless clear compiz/unity settings" [High,Triaged]
[10:52] it's getting quite some dups on launchpads and on errors.ubuntu.com
[10:57] what a messy bug
[10:58] looks like the duplicate bot isn't that helpful there
[10:58] some have fglrx loaded, mostly intel though
[10:58] tjaalton: the original report is a assert(0) in intel ddx
[10:58] so ignore the fglrx ones..
[10:58] right
[10:59] seb128: I'm going to take a look :)
[10:59] mlankhorst, thanks
[10:59] yeah, i'll reboot instead
[11:01] tjaalton, mlankhorst: you can check the reports on errors.ubuntu.com as well they might have useful infos
[11:01] open http://errors.ubuntu.com, entry xserver-xorg-core in the entry and select "month" in the combo
[11:02] it's the first bug listed, if you click on the function you have the list of individual reports
[11:02] 00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0116] (rev 09) (prog-if 00 [VGA controller])
[11:02] specific error seems to be assert(0) when it can't identify the generation of bufmgr_gem->pci_device
[11:10] but it should be recognising it as gen6 if I'm reading it right..
[11:29] it's also a hybrid system, like many on the bug
[11:29] of the dupes
[11:30] i noticed, but running updates now to test
[11:40] https://bugs.launchpad.net/ubuntu/+source/xorg-server/+bug/984189 is not hybrid though
[11:40] Launchpad bug 962892 in xorg-server (Ubuntu) "duplicate for #984189 Xorg crashed with SIGABRT in __assert_fail_base() unless clear compiz/unity settings" [High,Triaged]
[11:43] hm getting a different bug instead
[11:45] yeah not all dupes had hybrid
[11:46] some early corruption in xserver-xorg
[11:46] awesome :)
[11:50] i've unduped 984189
[11:50] looks like another snb crasher, probably fixed already
[11:57] damageRegionProcessPending
[11:59] you can repro it?
[12:01] I'm hitting another bug
[12:01] damageRegionProcessPending for some reason
[12:02] I'll see if I can attach valgrind again
[12:04] * mlankhorst looks for the signal patches again
[14:04] yay, worth it
[14:04] ==1651== Address 0xdfdfdfdfdfdfdff7 is not stack'd, malloc'd or (recently) free'd
[14:05] i sent three drm/i915 commits to stable@, makes at least my ivb stable :)
[14:05] so it seems my crash is caused by freed memory
[14:07] my X server is called 'exec /usr/bin/valgrind --leak-resolution=high --malloc-fill=ef --free-fill=df /usr/bin/Xorg "$@" &> /home/mlankhorst/nfs/vg' :)
[14:08] http://pastebin.com/xg2yfXKs
[14:11] tjaalton, seb128, mlankhorst: only Jeremy Huddleston saw the crash on OS X, so I didn't bother with reverting the added patch
[14:11] we will need to cherry-pick a couple of patches that fix it cleanly, IIRC
[14:11] cnd, it happens every time you use Xephyr and close a client
[14:12] fun
[14:12] weird.. I'll try upstart xf86-video-intel
[14:13] mlankhorst, did you chat with RAOF about synaptics?
[14:13] cnd: btw, is the precise xserver missing some input fixes from 1.12.x? can't recall what it maps to
[14:13] i mean _other_ fixes ;)
[14:13] tjaalton, it maps to 1.12.1 + the patches in debian/patches
[14:14] cnd: ugh some other issues popped up in between, I wanted to reproduce it on this laptop first with valgrind but it dies in a new way I haven't seen before
[14:14] for input
[14:14] oh right
[14:14] yeah
[14:14] I'll try the other one
[14:14] mlankhorst, do you have a macbook?
[14:14] nope
[14:15] mlankhorst, then there's a low chance you'll be able to reproduce it easily
[14:15] I can only reproduce it on a macbook air
[14:15] cnd: k
[14:16] which issue specifically?
[14:17] mlankhorst, when you close the lid, the screen interacts with the touchpad and causes many dancing touches
[14:18] and then the device is disabled for suspend
[14:18] the touches aren't disposed of properly
[14:18] so on resume, some touches may be "stuck" as active
[14:21] ah k
[14:23] I guess you can probably fake that playing with the lid contact on a normal laptop :p
[14:24] was thinking the same
[14:24] just echo mem > /sys/power/state
[14:24] wait.. why does irssi have tab completion for that?
[15:33] So yeah, you guys need to blacklist the AMD SI driver from Unity.
[15:33] No cayman
[16:05] hm, just running X with valgrind is providing plenty of amusement..
[16:51] bryceh/RAOF: If we decide to push x 1.12, can we upstream the signal safe patches too and force some testing with valgrind?
[16:52] mlankhorst, some patches can be upstreamed (and are on my todo list), but a few are not going to be acceptable upstream
[16:53] I mean, I was tracking why i915 was refusing to log in only to notice that upstream added another regression on top :s
[16:57] and seeing how many different bugs in x org are memory based it would be nice to have as feature..
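Note on the valgrind output at [14:04] to [14:07]: with --free-fill=df, valgrind overwrites freed blocks with 0xdf bytes, so a pointer loaded from freed memory reads as 0xdfdfdfdfdfdfdfdf, and a faulting address a small offset above that pattern suggests a field access through such a stale pointer. The following is a minimal sketch of that check, not anything valgrind or Xorg provides; the helper names (fill_pattern, offset_from_fill) and the 4096-byte window are assumptions made for illustration.

```python
# Hypothetical helpers for reading valgrind fill patterns; not part of valgrind.
FREE_FILL_BYTE = 0xDF    # matches --free-fill=df from the log
MALLOC_FILL_BYTE = 0xEF  # matches --malloc-fill=ef from the log


def fill_pattern(byte, width=8):
    """Return the integer formed by repeating `byte` across `width` bytes."""
    return int.from_bytes(bytes([byte]) * width, "big")


def offset_from_fill(address, byte, max_offset=4096, width=8):
    """Return address minus the fill pattern when that difference is a small
    non-negative offset (assumed window: max_offset bytes), else None."""
    delta = address - fill_pattern(byte, width)
    return delta if 0 <= delta <= max_offset else None


# Faulting address quoted at [14:04].
print(offset_from_fill(0xDFDFDFDFDFDFDFF7, FREE_FILL_BYTE))    # 24
print(offset_from_fill(0xDFDFDFDFDFDFDFF7, MALLOC_FILL_BYTE))  # None
```

An offset of 24 (0x18) from the free-fill pattern is consistent with the conclusion at [14:05] that the crash comes from freed memory; it does not by itself say which structure was freed.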
[16:59] how do you force testing with valgrind?-)
[17:00] create a shell script X2 with contents:
[17:00] exec /usr/bin/valgrind --leak-resolution=high --malloc-fill=ef --free-fill=df /usr/bin/Xorg "$@" &>> /home/mlankhorst/nfs/vg.$(hostname)
[17:00] look for any suspicious read/write errors or crashes
[17:21] bryceh: hm any thoughts on this ? http://pastebin.com/qFpTpMzx
[17:25] drawableDamage(pDrawable);
[17:25] hmm
[17:26] df is my valgrind free-fill
[17:30] but from what I can tell it should only nuke that drawable if refcount drops to zero, which it did.
[17:31] I wonder if it's caused by this cast:
[17:31] return (char *) (*privates) + key->offset;
[17:32] don't think so, it just looks like that's how it registers private data into it
[17:33] or something. the "Invalid read of size 8" errors are complaining about differences in variable sizes
[17:33] I think it's simply reffing the damage after the pixmap was freed, but I don't see how..
[17:35] mlankhorst, I'm not a valgrind expert but --leak-check=full might help to stop where it was freed
[17:35] mlankhorst, not sure if --leak-resolution=high does the same
[17:35] mlankhorst, I just know that the one we listed is on our standard set of flags for desktop debugging
[17:35] seb128: no that's on exit
[17:35] ok
[17:36] --leak-check= [default: summary]
[17:36] When enabled, search for memory leaks when the client program finishes. If set to summary, it says how many leaks occurred. If set
[17:36] to full or yes, it also gives details of each individual leak.
[17:36] I need to set --track-origins=yes though
[17:36] more slowdown :)
[17:38] mlankhorst, yeah not sure what's going on there. If you got a reproducible case, might chat with ickle
[17:39] bryceh: yeah seems to happen on upstream intel too
[17:41] it's annoying since it blocks login
[17:58] hm just to be sure I'll try without valgrind patch
[18:18] bryceh: hm, at this point I'm not even 100% sure it's intel specific, I'll cut down on other options
[18:34] https://bugs.freedesktop.org/show_bug.cgi?id=51240 added :)
[18:34] Freedesktop bug 51240 in Driver/intel "[i915] crash in damageRegionProcessPending on login" [Normal,New: ]
[20:28] hm was afraid of that, some changes between x 1.11 and x 1.12
[20:32] mlankhorst: i dont even see that commit in xserver at all
[20:33] Sarvatt: well with the x 1.12 I uploaded to x-staging ppa things work, so I guess there's probably some truth to it
[20:35] Sarvatt, hey I'm looking at the gpu lockup udev rule.
[20:35] currently we trigger on ERROR=1 from the kernel. there is also a RESET=1 which apparently happens later but still prior to the reset
[20:36] one of the Intel guys suggested moving to RESET=1 might eliminate the false gpu lockups
[20:53] sounds right to me, we did it on ERROR=1 before to grab an intel_gpu_dump of the actual crash before but its automatically captured in debugfs now
[20:53] right
[20:54] Sarvatt, ok... I'll test it on a couple systems but expect to have it in within the week; ping me if you spot anything wonky... I think you tend to notice gpu weirdness before anyone else here :-)
[20:59] night all :)
[21:12] * bryceh waves
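Note on the [17:00] advice to look through the valgrind log for suspicious read/write errors: one way to do that is to filter the log for Memcheck's "Invalid read of size N" and "Invalid write of size N" lines, the message form quoted at [17:33]. The following is a rough sketch under that assumption; the function name find_invalid_accesses and the sample log text are made up for illustration, and real logs contain other error kinds this filter ignores.

```python
import re

# Matches Memcheck lines such as "==1651== Invalid read of size 8".
INVALID_ACCESS = re.compile(r"^==(\d+)== Invalid (read|write) of size (\d+)")


def find_invalid_accesses(log_text):
    """Return (pid, kind, size) tuples for each invalid read/write line."""
    hits = []
    for line in log_text.splitlines():
        match = INVALID_ACCESS.match(line.strip())
        if match:
            pid, kind, size = match.groups()
            hits.append((int(pid), kind, int(size)))
    return hits


# Illustrative sample; only the "Address ..." line is taken from the log above.
sample_log = """\
==1651== Invalid read of size 8
==1651== Address 0xdfdfdfdfdfdfdff7 is not stack'd, malloc'd or (recently) free'd
"""
print(find_invalid_accesses(sample_log))  # [(1651, 'read', 8)]
```

To scan the file written by the wrapper script, the same function can be called on the contents of that file, for example `find_invalid_accesses(open(path).read())` with `path` pointing at the vg log.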