[00:00] jbarnes: did albert23's i915 debugfs stuff net any clues?
[00:00] it seemed ok but I've got to go through each instruction with the docs next to me
[00:02] are the docs online?
[00:02] yeah
[00:02] at intellinuxgraphics.org
[00:02] ok thanks, I'll link to them
[00:02] trying to refine my debug patch a little
[00:05] ogasawara is working on building a module with jbarnes' patch for the jaunty kernel
[00:05] I'll be posting a new one shortly
[00:06] jbarnes: should I just wait for the new one?
[00:06] yeah give me a few minutes please
[00:10] bryce_: did you have any luck reproducing the bug with the script?
[00:10] oh yeah let me try that first
[00:16] 1280 * 4: syntax error in expression (error token is "1280
[00:17] bryce_: it doesn't work under metacity
[00:17] it seems to configure workspaces differently, and I haven't bothered to make it work there
[00:18] aha
[00:18] I started it up, and *then* activated compiz
[00:18] it definitely is unhappy
[00:18] ok, froze
[00:18] ok, tooltime
[00:19] ogasawara: ok just posted an updated patch to the lp
[00:20] man my machine is failing hard
[00:20] jbarnes: ok cool
[00:25] unfortunately, I can't help. I have an nvidia card.
[00:27] mdeslaur: oh, sorry, you were a false positive in my search
[00:31] oops had a register address wrong in that last patch
[00:31] not harmful, and that reg wasn't very interesting anyway
[00:32] jbarnes: I can fix it up really quick if it'll help
[00:33] IPEIR should be 0x2064 rather than 0x2088
[00:33] jbarnes: np, I'll update it
[00:33] cool
[00:33] I'm checking the other ones now
[00:44] yay, got a freeze dump for you jbarnes
[00:44] oh yay :)
[00:44] it's after midnight here, I'm going to have to pick this up again tomorrow
[00:45] thanks so much to all of you for working this issue
[00:45] night mdz, thanks for your help as well
[00:45] please remember to update the bug report so that others can pick up where you leave off
[00:45] mdz: ok later, thanks
[00:46] jbarnes: http://launchpadlibrarian.net/25686110/freeze_dump.txt
[00:48] Buffer size too small in MI_STORE_DATA_INDEX (2 < 3)
[00:49] yeah
[00:49] interesting
[00:49] that might just be junk after the end of the ring
[00:49] the head pointer should indicate where the hang happened
[00:51] MI_NOOP ?
[00:51] yeah just padding
[00:51] lots of padding :)
[00:52] 3DSTATE_DRAWING_RECTANGLE sounds interesting
[00:54] bryce: see if you can get i915_interrupt too
[00:54] it's one of the debugfs files
[00:54] path?
[00:54] nevermind
[00:56] Interrupt enable: 00020053
[00:56] Interrupt identity: 00000000
[00:56] Interrupt mask: fffcdfac
[00:56] Pipe A stat: 00040000
[00:56] Pipe B stat: 00000206
[00:56] Interrupts received: 43680
[00:56] Current sequence: 517905
[00:56] Waiter sequence: 517991
[00:56] IRQ sequence: 517746
[00:57] jbarnes: uploaded tarball of all the files to the bug
[00:58] cool thanks
[01:03] bryce: those files are empty :p
[01:03] bah
[01:03] huh, you're right
[01:04] cp: reading `0/i915_batchbuffers': Cannot allocate memory
[01:05] bryce: wanna join #intel-gfx? anholt has questions
[01:14] bryce: hey
[01:14] bryce: i just upgraded mesa to 7.4-0ubuntu3
[01:14] bryce: is that the one thought to provide the 965+compiz fix?
[01:14] kirkland: yes
[01:14] bryce: fwiw, i upgraded to that, rebooted, reran mdz's repro.sh script; still hangs X
[01:15] ok
[01:15] bugger
[01:15] kirkland: there are now some additional debug tools I've packaged for ubuntu
[01:15] https://edge.launchpad.net/~ubuntu-x-swat/+archive/x-freeze-test
[01:15] requires installing a new kernel and new libdrm
[01:17] bryce: okay, what are we trying to get out of this?
[01:17] kirkland: register dumps that the intel guys can grok to figure out what's gone wrong
[01:17] I posted my stuff on bug 359392 just now
[01:17] Launchpad bug 359392 in xserver-xorg-video-intel "[i965] X freezes starting on April 3rd" [Critical,Triaged] https://launchpad.net/bugs/359392
[01:17] a second set might give extra insight
[01:18] bryce: okay, gimme a moment
[01:24] bryce: doesn't look like intel-gpu-tools - 1.0-0ubuntu1 has built yet ...
[01:24] god I hate ppas
[01:24] there was an older ~pre2 there which is identical
[01:25] I guess ppa supersedes packages at point of upload, rather than waiting until the new thingee has built?
[01:25] anyway, looks like amd64 has built, so presumably i386 is nearly done
[01:26] k, amd64 is what i need
[01:29] bryce: which of the libdrm debs do i need?
[01:30] kirkland: if you add the apt sources.list lines from the ppa into your /etc/apt/sources.list, and do apt-get install intel-gpu-tools it will pull in exactly what it needs
[01:30] bryce: perfect, thanks.
[01:33] bryce: rebooting into this jankey environment :-)
[01:33] :-)
[01:42] bryce: okay
[01:42] bryce: X hung again, but slightly differently
[01:43] bryce: i still have control of my mouse
[01:43] yeah that's typical
[01:43] bryce: but keyboard is unresponsive
[01:43] bryce: oh? hmm, i didn't remember that
[01:43] bryce: okay, where do you want the freeze_dump.txt ?
[01:43] bug 359392
[01:43] Launchpad bug 359392 in xserver-xorg-video-intel "[i965] X freezes starting on April 3rd" [Critical,Triaged] https://launchpad.net/bugs/359392
[01:44] bryce: k
[01:44] maybe snag also the files in /sys/kernel/debug/dri/0/i915* while you're at it
[01:47] bryce: sure
[01:49] bryce: done
=== rickspencer3 is now known as rickspencer3-afk
[01:49] kirkland: thanks!
[01:49] https://bugs.edge.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/359392/comments/102
[01:49] kirkland: go ahead and restore your system to normal, I think that'll do it for now.
[01:49] Ubuntu bug 359392 in xserver-xorg-video-intel "[i965] X freezes starting on April 3rd" [Critical,Triaged]
[01:49] https://bugs.edge.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/359392/comments/103
[01:50] god that bug's getting long :-)
[01:50] bryce: yeah, i'm not jealous, i promise
[01:50] * kirkland thinks bryce has one of the hardest jobs in ubuntu :-)
[01:51] how much of the -x team's work is writing code versus testing / filing reports for intel?
[01:51] seems like a lot of what goes on is wrangling patches from various sources
[01:52] kirkland: heh thanks
[01:52] pwnguin: pretty much
[01:52] pwnguin: most of the X coding is limited to reworking upstream patches to apply to our stuff, or little quirks or enable/disable options and stuff
[01:53] pwnguin: the bug load is just too high to be able to spend much time doing coding for any one particular issue
[01:53] unless it's either really important, or really trivial
[01:54] heh
[01:55] indeed, it kinda seems like the worst thing that can happen to a bug is to be triaged medium
[01:55] I have been doing a fair bit of scripting type coding, for support tools and so forth
[01:55] low priority bugs seem to have tiny fixes
[01:55] ok I have to go home now too.. I'll leave my machine running (still hasn't hung with mdz's test script even after suspend/resume)
[01:56] jbarnes: any leads from the info collected so far?
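For anyone repeating the "snag the files in /sys/kernel/debug/dri/0/i915*" step from 01:44, here is a minimal Python sketch. It reads each debugfs file explicitly and records read failures (such as the "Cannot allocate memory" error cp reported at 01:04) instead of silently leaving an empty copy. The function name `collect_debugfs` and the output directory are hypothetical, not part of any tool mentioned in the channel, and reading debugfs normally requires root.

```python
import glob
import os


def collect_debugfs(src_dir, dest_dir, pattern="i915*"):
    """Copy files matching `pattern` from src_dir into dest_dir.

    Returns (saved, failed): `saved` is a sorted list of file names that
    were read and written out; `failed` maps file name -> error text for
    files that could not be read.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved, failed = [], {}
    for path in sorted(glob.glob(os.path.join(src_dir, pattern))):
        if not os.path.isfile(path):
            continue  # skip directories and other non-regular entries
        name = os.path.basename(path)
        try:
            # Read the whole content ourselves rather than trusting the
            # size reported for the file.
            with open(path, "rb") as src:
                data = src.read()
        except OSError as exc:
            failed[name] = str(exc)  # keep going; note which file failed
            continue
        with open(os.path.join(dest_dir, name), "wb") as out:
            out.write(data)
        saved.append(name)
    return saved, failed
```

A call such as `collect_debugfs("/sys/kernel/debug/dri/0", "i915-debugfs-dump")` would then leave one copy per readable file, and the `failed` mapping says which ones to mention in the bug report as unreadable.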
[01:56] pwnguin: to be honest I mostly ignore the priorities on bugs
[01:56] my best guess is that we're actually looking at a few bugs here
[01:57] but until we can accurately reproduce things we won't know for sure
[01:57] hah, my gut was right
[01:57] jbarnes: yes I've sensed since the start that there are more than a couple bugs going on
=== eric__ is now known as LLStarks
[01:58] pwnguin: what I do is skim for bugs with patches, git commit ids, backtraces, fixed-upstream, needs-sponsoring, or other indications that the bug is "ripe". I'll pick those off regardless of priority set
[01:58] bryce what is this delicious new ppa you've created?
[01:58] LLStarks: https://edge.launchpad.net/~ubuntu-x-swat/+archive/x-freeze-test
[01:58] i see
[02:23] bryce, jbarnes: I've attempted testing with the updated patch, but my system is locking up hard
[02:23] bryce: could you give it a try and see if you get the same?
[02:23] bryce: I've posted a comment to the bug
[02:42] bryce: I gotta drop off, but will be checking back in later. feel free to email if you need anything else.
=== rickspencer3-afk is now known as rickspencer3
=== rickspencer3 is now known as rickspencer3-bbl
=== rickspencer3-bbl is now known as rickspencer3
[07:43] tjaalton: I wasn't able to crash with mem=1G either, ran rotate-forever for the night
[07:47] tjaalton: what about resolutions btw, I've 1400x900, do your and dustin's machines have the same screen?
[07:49] oh, I guess I shouldn't update compiz since now all 2a02:s are blacklisted
[08:16] Mirv: I have a 10x7 screen
=== Sarvatt_ is now known as Sarvatt
[08:27] ok, there is simply nothing to differentiate non-crashing 2a02 rev 0c:s from crashing ones :(
[08:27] or maybe if taking a hard look at lspci, but still
[08:29] I'll try the script now
[08:30] heh, dholbach has a X61s (very similar to X61) and no problems
[08:31] maybe it's the models they sell in the US that have problems :)
[08:33] maybe it's 110V that causes the crashes ;)
[08:33] then we could disable compiz based on country ;)
[08:33] ooh, must be.. power drainage
[08:37] sweet, now it crashed
[08:37] running repro.sh
[08:38] it got to the fourth workspace
[08:40] do the crashing ones have dual channel memory by any chance?
[08:41] this is a laptop, dunno
[08:41] sigh, for some reason reboot doesn't actually reboot, it'll just bounce back to runlevel 2
[08:41] I got frozen again
[08:42] this time I did it the old fashioned way, just used it for several hours
[08:42] this was after updating to today's mesa
[08:42] I expected it would freeze, but what was curious was that it took 4 hours to do
[08:43] anyway, I posted the intel_gpu_dump and etc. data onto the bug report
[08:43] it seems to be considerably different from the output I got when freezing with the repro.sh script
[08:43] so I wonder if that means they're independent bugs
[08:44] jbarnes: ^ is that a correct conclusion to draw?
[08:44] or can the same root bug cause differing intel_gpu_dump output?
[08:46] heh, after shutdown compiz refuses to work
[08:47] tjaalton: 2a02 was blacklisted...
[08:47] gah..
[08:47] there is a config thingee to force it
[08:47] but right, I had missed repro.sh, I have it running now though not getting a crash
[08:47] I just edited /usr/bin/compiz and commented out the 2a02 line
[08:48] sudo gedit .config/compiz/compiz-manager
[08:48] and add this line:
[08:48] SKIP_CHECKS=yes
[08:48] we should probably recommend testers enable compiz that way, rather than editing the script
[08:49] I edited the wrapper like Mirv
[08:49] last time it hung before finishing the first cycle, but not this time
[08:49] I'll let it run for a while
[08:50] should the repro.sh be visibly able to switch between ws 4 and 0? for me the visible thing is that it stays at workspace 0, but I just cannot change the workspace since it always rotates back
[08:53] oh, later comment, expects a 1x6 workspace
[08:53] yeah has to be 1x6
[08:54] are you two on the 2.6.30 kernel with the intel_gpu_tools stuff installed?
[08:54] you have to have 1x6 grid
[08:54] ok, hung again
[08:54] not yet
[08:54] this is stock jaunty
[08:54] what ppa was it again
[08:54] ?
[08:54] http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.30-rc2/
[08:54] https://edge.launchpad.net/~ubuntu-x-swat/+archive/x-freeze-test/
[08:55] and not yet
[08:55] oh right the intel_gpu_tools
[08:55] I've tried to make the directions fairly paint by number but let me know if I missed anything
[09:01] heh, intel_gpu_top is funny
[09:01] every task is 100% when it's hung
[09:01] well, makes sense I guess
[09:03] tjaalton: so you managed to crash easily with repro.sh? are you still using the mem=1G? I'm not getting a crash, but I'm also using the full 2G again now.
[09:04] Mirv: not using mem=1G and it crashes, yes
[09:04] hmmh
=== Sarvatt_ is now known as Sarvatt
[09:06] Mirv: are you running the stock mesa?
=== Sarvatt_ is now known as Sarvatt
[09:19] tjaalton: yes, ubuntu3 now, earlier was that ubuntu2~bug359392
[09:20] those two should be basically identical
[09:22] I've done a couple of suspends as well while running the script, no problems
[09:26] ok I got a dump
[09:29] next I'll try without patch 104
[09:37] no difference
[09:45] still running smoothly, ca. 10 suspend-resumes done as well
[09:49] next I'll build mesa 7.3 and try repro.sh with it
[11:09] morning folks
[11:10] morning mdz
[11:11] tjaalton: it appears the mesa patch was not to blame for this particular set of bugs
[11:12] mdz: no, but neither was the mesa upgrade ;)
[11:12] 7.3 froze immediately
[11:12] just finished testing it
[11:13] repro.sh is probably a bit too ruthless
[11:13] yes, it's useful to have a case which freezes, but I worry that it may not be the same bug which people encounter in normal usage
[11:14] right
[11:14] the visual artifacts are the same, though
[11:14] I think it's reasonably likely
[11:47] it sounds like there is one bug which is fixed by reverting the mesa patch, but it is not the one experienced by most people in the bug report
[11:48] mdz, i'm pretty sure i saw the issue way earlier
[11:48] like around the berlin sprint
[11:48] ogra: which one is "the" issue?
[11:49] we are certain at this point that there are multiple bugs which can cause X to freeze
[11:49] for me thats the alt-tab animation being delayed by about two seconds, workspace switching being very slow, freezes every two days
[11:50] when i had enabled UXA (which i did to overcome the first two) i could see my swapspace being eaten, after two/three days my system hit OOM
[11:51] ogra: those are at least two separate bugs
[11:51] right, and UXA isnt relevant atm anyway, i switched that off yesterday
[11:52] but i remember that my X exposed issues in berlin already
[11:52] and it didnt change much since
[11:58] the delayed animations one is just generic slowness with the -intel driver jaunty has
[11:59] and UXA has memleaks, probably affects EXA too but not as much
[12:02] yeah, i still see zero swap used since i switched off UXA
[12:03] running constantly my system used to use about 2G swap after a day before ...
[12:03] the generic slowness came together with the freezes here though
[12:04] that's weird
[12:04] sure it's not after turning UXA off?
[12:04] ogra: it's changed a lot since then actually, including a whole new upstream mesa
[12:04] i turned off UXA yesterday, i had my first freezes shortly after the sprint
[12:05] and turned on UXA about two to three weeks after the sprint
[12:07] * ogra turns off all important apps and runs the testscript now
[12:10] for what it's worth, i'm running mesa 7.5 (with only 03 patch in the debian/ubuntu series), xserver-xorg-video-intel 2.7.99/libdrm-2.4.9 on 2.6.30-rc2 with the latest drm-intel-next stuff from a few hours ago and UXA (with and without KMS) is still freezing with a lot of compositing activity on my 945GME. With the ubuntu packages I had hard freezes usually when alt-tabbing or opening a new gnome-terminal but with my packages the mouse is still moveable but the same things trigger it. updating just libdrm/-intel/kernel alone had the same exact crashes as ubuntu packages, the mesa upgrade is the only thing that changed it at all.
[12:14] hmm, glxgears eats a complete CPU core ... it feels like i have no GL accel at all and its software rendering
[12:15] and shouldnt it constantly switch workspaces ?
[12:16] it only did one run and left me on the first then
[12:17] you have a 1x6 grid?
[12:17] nope, all default
[12:17] 4 workspaces
[12:17] you need to have that
[12:18] one row
[12:21] yup, that helped
[12:22] * ogra_babbage is impressed how good pulse copes ... not the slightest coppyness with the video sound
[12:22] *choppyness
[12:41] With Virtual 2048 2048 in display subsection in xorg.conf I could run the test script without freezing for more than an hour
[12:41] After removing that setting X froze again in less than 2 minutes
[12:42] I originally used it to solve the slow cube rotation. Effect is that you get more EXA offscreen memory
[12:44] For me EXA offscreen memory goes up from 19 to 49 MB
[16:56] bryce: it's possible we're seeing different gpu dumps but still the same root cause
[17:06] bryce, mdz: ran the test script all night with stock jaunty bits from yesterday, no hang, even did a suspend/resume shortly after starting it
[17:07] unfortunately I didn't have the right number of virtual desktops so I've got to start it over this morning :/
[17:48] jbarnes: do you see any sense in albert23's suggestions of Virtual setting (increasing EXA offscreen memory)? at least that would be worth trying also by others who experience the hang easily
[17:48] albert23: if you don't mind, I quote you in the bug report?
[17:49] (oh well, these are publicly logged channels anyway)
[17:49] so quoted :)
[17:52] tell them also that sitting in front of your screen for too long while the script runs might harm your stomach
[17:53] :)
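Since the question of whether different dumps point at the same root cause keeps coming up, it may help to compare the i915_interrupt readouts numerically rather than by eye. Below is a minimal Python sketch that parses the readout pasted at 00:56 and reports how far the waiter sequence number is ahead of the current one. Two things are assumptions, not confirmed in the channel: that the register-style fields (Interrupt enable/identity/mask, Pipe A/B stat) are printed in hex while the counters are decimal, and that "waiter ahead of current" means something is still waiting on work the GPU has not reported as complete. The names `parse_interrupt_dump` and `seqno_gap` are hypothetical.

```python
import re

# Fields assumed (from their 8-digit form) to be hexadecimal registers.
HEX_FIELDS = {
    "Interrupt enable",
    "Interrupt identity",
    "Interrupt mask",
    "Pipe A stat",
    "Pipe B stat",
}

# The readout pasted in the channel at 00:56, timestamps removed.
SAMPLE = """\
Interrupt enable: 00020053
Interrupt identity: 00000000
Interrupt mask: fffcdfac
Pipe A stat: 00040000
Pipe B stat: 00000206
Interrupts received: 43680
Current sequence: 517905
Waiter sequence: 517991
IRQ sequence: 517746
"""


def parse_interrupt_dump(text):
    """Return {label: int} for every 'label: value' line that parses."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([^:]+):\s*(\S+)\s*$", line)
        if not match:
            continue
        label, value = match.group(1).strip(), match.group(2)
        base = 16 if label in HEX_FIELDS else 10
        try:
            fields[label] = int(value, base)
        except ValueError:
            continue  # ignore lines whose value is not a number
    return fields


def seqno_gap(fields):
    """Waiter sequence minus current sequence (positive = still waiting)."""
    return fields["Waiter sequence"] - fields["Current sequence"]


fields = parse_interrupt_dump(SAMPLE)
print(seqno_gap(fields))  # 86
```

Running the same two functions over readouts from different freezes gives comparable numbers to attach to the bug, which is easier to diff than the raw text.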