[00:51]  * robert_ancell -> lunch
[01:02] <RAOF> robert_ancell: When you get back we need to discuss how to proceed with Operation: Worst Thing Ever (ie: sideband pipe between usc and Xmir)
[01:12] <RAOF> duflu: You tracked down damage problems?
[01:14] <duflu> RAOF: Kind of, not really. I need to devote more hours to better learning XMir
[01:35] <duflu> kdub: You're back? Congratulations by the way
[01:50] <smspillaz> racarr: long shot, but still around ?
[01:55] <duflu> RAOF: ickle says he "pushed a patch" for the intel cache/lines issue with bypass. But I can't tell which commit(s) it was. Do you think... http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=47e718bf321f6fe80dc5f797f433b00bc8de91c7 ?
[01:56] <robert_ancell> RAOF, yes. I was also wondering how far from working the focus thing is - that would at least reduce the pressure for a less hacky solution
[02:12] <RAOF> duflu: It's all folded up into the single monolithic xmir commit, but yeah.
[02:13] <RAOF> duflu: It's a bit ugly; we should work out how to unuglify it - probably by telling the DDX when bypass is in effect, rather than having it guess.
[02:13] <duflu> RAOF: If you can get the bo flags then it's easy :/
[02:14] <RAOF> The bo-flags are userspace only
[02:15] <duflu> RAOF: Oh, actually that wouldn't help. Most large surfaces are now SCANOUT even when not bypassed
[02:16] <duflu> RAOF: A surface attribute? Still ugly
[02:17] <RAOF> A buffer attribute.
[02:17] <RAOF> As it has to be :)
[02:18] <RAOF> Because SCANOUT is already a possible pessimisation in the non-bypass case.
[02:35] <RAOF> robert_ancell: Oh, the focus thing is mostly ok. It works for single-head; I just need to work out what I do wrong in the multihead case.
[03:05] <duflu> RAOF: Sanity-check vladmir-upstreaming FTW?
[03:05] <RAOF> duflu: Yes.
[03:05] <RAOF> Although you'll want to pull the changes back to the ubuntu branch.
[03:06] <RAOF> Because vladmir-upstreaming is based on master, which is an ABI bump ahead of us.
[03:07] <duflu> RAOF: Argh. Maybe I'll just keep working with apt-get source on saucy to minimize impact on the system...
[03:08] <RAOF> duflu: vladmir-upstream-base is the tag you want; it's easy to reproduce xmir.patch
[03:08] <RAOF> duflu: git diff vladmir-upstream-base..vladmir-upstreaming > debian/patches/xmir.patch
[03:11] <duflu> RAOF: When will that be upstreamed? Post 13.10?
[03:11] <RAOF> Once it's broadly feature-complete.
[03:11] <RAOF> I think glx-bypass is the final piece of that.
[03:35] <duflu> RAOF: I can't see any guarantees about which portion of damage actually makes it to the swap_buffers. How can we be sure about the exact damage region that *really* got swapped?
[03:35] <duflu> ... without waiting
[03:35] <duflu> Oh, actually, maybe the answer is in the DDX...
[03:36] <RAOF> What do you mean? The drivers are responsible for passing a correct region into submit_rendering.
[03:36] <RAOF> And they're responsible for doing enough flushing that the fd submitted to Mir actually contains the rendering expected.
[03:36] <duflu> RAOF: Yes! OK so the answer is the region parameter to the submit. Ta.
[03:38] <RAOF> As a practical matter, the DDXen all pass the region that they get passed in.
[04:45]  * duflu -> lunch
[05:04] <RAOF> Ah, good. Now with significantly less password-leakage-across-VT-switch!
[05:25] <RAOF> robert_ancell: So, the XMir side of focus-thing works. The Mir side (drop focus on VT switch) hasn't landed yet as far as I can tell.
[05:52] <robert_ancell> RAOF, cool
[05:53] <tvoss_> robert_ancell, ping
[05:53] <tvoss_> robert_ancell, don't have broadband in the new house, yet
[05:53] <robert_ancell> tvoss_, ok, so not coming to meeting?
[05:54] <tvoss_> robert_ancell, trying, but don't know if it works, only 3G here :/
[05:58] <robert_ancell> hikiko, duflu, racarr, kdub, meeting
[06:18] <robert_ancell> duflu, should we close bug 1217262 invalid?
[06:18] <duflu> tvoss_: How did you go with Mir-native-OpenArena ?
[06:19] <robert_ancell> duflu, you may have been hitting the problem where u-s-c was left running due to the way it was launched with a shell script
[06:19] <duflu> robert_ancell: I think it might still be valid. We have something occasionally keeping DRM open and triggering bug 1206633
[06:20] <duflu> ... but it's racy. If you stop, pause, start, then it's OK
[06:20] <mlankhorst> yeah of course it's racy..
[06:20] <duflu> As mlankhorst keeps pointing out
[06:20] <robert_ancell> duflu, then is it better covered by that specific bug? The bug you filed is a bit vague and likely to attract random me toos
[06:20] <mlankhorst> unity-system-compositor needs to be completely dead before starting lightdm again :P
[06:21] <duflu> mlankhorst: That should surely be solvable in upstart
[06:21] <robert_ancell> mlankhorst, right, and lightdm does wait for that. But there was a bug where it was launched by a shell script without exec and lightdm was killing the shell script and not realizing u-s-c was still running
[06:21] <mlankhorst> or we should start usc WAY early.. and make plymouth a client to usc..
[06:21] <duflu> robert_ancell: Incomplete bugs should be left to expire naturally due to inactivity I say
[06:22] <duflu> "Expire of natural causes"
[06:22] <robert_ancell> duflu, well, you filed this bug... why not just close and re-open if you see it again?
[06:22] <duflu> robert_ancell: I will retest for it now...
[06:24] <duflu> robert_ancell: Reproduced. Now a duplicate of bug 1206633 :)
[06:24] <robert_ancell> duflu, awesome
[06:25] <robert_ancell> duflu, is bug 1216245 still applicable? afaik the demo shells don't need any special code for multi-monitor
[06:26] <duflu> robert_ancell: Yes it's an issue and nice to have for testing. Just set Wishlist
[06:33] <hikiko> duflu, are you working on the bug: #1216522?
[06:33] <duflu> robert_ancell: I'm not sure having a completely broken display is only Medium... bug 1218815
[06:34] <robert_ancell> duflu, change it :)
[06:34] <robert_ancell> I'm just doing a first cut for triaging, I'll re-look over high-med-critical and do them next
[06:35] <hikiko> question: I startx to have a 2nd window manager apart from unity and the events (like double click on desktop) are sent to unity (wrong vt?) is this a bug or something expected?
[06:35] <duflu> hikiko: Yes, RAOF is working on that
[06:36] <hikiko> plus programs like synergy dont work (but ok that's expected I guess)
[06:36] <duflu> robert_ancell: The hang bug is incomplete. I haven't stress-tested MM recently to retry
[06:36] <hikiko> duflu, and there's no way I use X without mir for the 2nd wm so that I can debug more easily the xmir? (eg attach a gdb etc)
[06:37] <duflu> hikiko: I don't think you can right now due to the input focus problems. RAOF will have it fixed soon. Meanwhile you probably need a second machine
[06:38] <hikiko> ouch :)
[06:38] <RAOF> hikiko: Yeah, you can; but until https://bugs.launchpad.net/ubuntu/+source/unity-system-compositor/+bug/1192843 is fixed all keyboard input will always go to XMir.
[06:38] <duflu> Alternatively, go old-school and move your desktop to a text VT :/
[06:38] <tvoss_> duflu, not sure what you mean?
[06:38] <duflu> tvoss_: You were going to benchmark OpenArena with the Mir-native-SDL ?
[06:38] <duflu> Or something
[06:38] <hikiko> and could you guys suggest me a bug that is important to fix and none of you is working on it atm?
[06:39] <duflu> hikiko: I think generally anything Critical in the mir or xmir projects
[06:39] <hikiko> because most of our critical bugs are either GPU specific (need ATI for example) or assigned from what I see
[06:39] <duflu> Hmm, maybe we should unassign those not in progress
[06:40] <hikiko> kevin suggested one that is a test failure (but it's a bug that only appears some times) that ^ you said is incomplete
[06:40] <hikiko> 3 others are ATI Specific
[06:41] <duflu> hikiko: Bug 1206633 is a killer. And close to code you have seen recently...
[06:41] <hikiko> cool :D
[06:41] <duflu> Though that might just be upstart scripting. I guess I should test
[06:41] <hikiko> I'll pick this one then!!
[06:42] <tvoss_> duflu, yup, got the numbers, need to post-process them
[06:42] <duflu> hikiko: It might not be a bug. Maybe just our upstart scripts need better dependency rules
[06:43] <hikiko> yes I saw your comment
[06:44] <hikiko> I'll give it a try though
[06:46] <duflu> hikiko: I also recall that exception wasn't clear which function it came from. Can also happen with errno==0. So maybe you should make the exception clearer first... and mention which function failed
[06:47] <hikiko> yes, I did that when I was testing for the nested mir: it was permission denied not success
[06:47] <hikiko> and my guess is
[06:48] <duflu> hikiko: I mean, it would be good to propose a branch with better exception details first
[06:48] <hikiko> that the problem is that we have that test (what interface version is used?) before we set the drm master
[06:48] <hikiko> oh :)
[06:48] <hikiko> true!
[06:48] <hikiko> will do!
[06:48] <hikiko> ok, cool!I'll start with this bug :D thank you duflu!
[06:52] <hikiko> final question: for as long as I only have my laptop for development
[06:52] <hikiko> is there any way to disable and enable xmir?
[06:53] <hikiko> +if I start xserver without -mir and then I start my compiled version of xmir (the branch I work) will I get a crash?
[06:53] <duflu> hikiko: Yes, comment out type=unity from /etc/lightdm/lightdm.conf.d/*
[06:54] <hikiko> duflu, if I do so, and compile/start xmir I wont have any issues isnt it?
[06:55] <hikiko> or I have to remember to enable it before I test?
[06:55] <hikiko> (enable the system's xmir)
[06:55] <robert_ancell> bye all
[06:55] <duflu> hikiko: That only affects the default system logon. You can still run any server manually
[06:55] <duflu> Bye robert_ancell
[06:55] <hikiko> cool that was my guess :)
[06:56] <hikiko> thanks duflu :)
[07:00] <dholbach> good morning
[07:14] <duflu> Morning dholbach
[07:15] <dholbach> hi duflu
[07:52] <duflu> RAOF: All root fragments share the same Drawable right?
[07:52] <RAOF> Correct.
[07:53] <RAOF> They're all associated with the root window, as seen by pScreen->root
[07:54] <duflu> RAOF: Looks like we don't filter damage reports to check if they overlap the fragment in question. Any damage effectively damages all fragments
[07:54] <RAOF> This is correct. *but* we take the intersection of the damage with the fragment's region before updating the damage regions.
[07:55] <duflu> RAOF: Still, that explains why on the Mir side I was seeing all outputs redraw even when only one should... ?
[07:55] <RAOF> No, I don't think so?
[07:55] <duflu> OK, forget I said that. It's tangential
[07:56] <RAOF> We *should* only be calling the driver's copyproc when xmir_window_is_dirty() returns true, which is iff RegionNotEmpty(xmir_window_get_dirty(xmir_win)) returns true.
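A minimal sketch of the clipping RAOF describes above: each damage report is intersected with a fragment's region before it is accumulated, and a fragment only counts as dirty when something survived the clip. `Rect`, `Fragment` and `report_damage` are hypothetical names for illustration; the real XMir code works on X server regions and `xmir_window_is_dirty()`, which this does not reproduce.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for an X region: one axis-aligned rectangle,
// half-open on both axes: [x1, x2) x [y1, y2).
struct Rect
{
    int x1, y1, x2, y2;
    bool empty() const { return x1 >= x2 || y1 >= y2; }
};

// Intersection of two rectangles; the result is empty() when they do not overlap.
Rect intersect(Rect const& a, Rect const& b)
{
    return Rect{std::max(a.x1, b.x1), std::max(a.y1, b.y1),
                std::min(a.x2, b.x2), std::min(a.y2, b.y2)};
}

// Hypothetical per-output fragment of the root window.
struct Fragment
{
    Rect region;              // area of the root window this fragment covers
    std::vector<Rect> dirty;  // accumulated damage, already clipped to region

    bool is_dirty() const { return !dirty.empty(); }
};

// Every fragment sees every damage report (they share one drawable),
// but only the part overlapping the fragment's own region is recorded.
void report_damage(std::vector<Fragment>& fragments, Rect const& damage)
{
    assert(!damage.empty());  // assumption of this sketch: callers pass real damage
    for (auto& fragment : fragments)
    {
        Rect const clipped = intersect(damage, fragment.region);
        if (!clipped.empty())
            fragment.dirty.push_back(clipped);
    }
}
```

With two side-by-side fragments, damage that falls entirely inside the first one leaves the second one clean, so a copy gated on `is_dirty()` would skip it.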
[08:25] <hikiko> duflu, still here?
[08:26] <duflu> hikiko: Yes, but about to vanish for a brief while... ?
[08:26] <hikiko> :)
[08:27] <hikiko> I have a question but ok, I'll find out! good evening :)
[08:27] <duflu> hikiko: I'll be back soon. What is the Q?
[08:30] <hikiko> well, I wonder how and when xmir starts working with the xserver (because I remember that when I was implementing the mir display for X I had to do a trick to tell the xserver not to handle the drm device that mir is using and I want to see if this is necessary in xmir as well)
[08:30] <hikiko> it might be related to the 1206633 bug
[08:31] <hikiko> bug #120633
[08:31] <hikiko> sorry
[08:31] <hikiko> bug #1206633
[08:58] <duflu> hikiko: I don't think there would be any remaining races for DRM devices in the X server. Because it is single threaded. More likely other processes racing, like maybe plymouth has not finished deactivating yet
[09:03] <hikiko> the only reason I suspect xserver is because I get the bug when I start a 2nd instance of lightdm which starts a second instance of xserver
[09:04] <hikiko> when it's only lightdm and plymouth I dont get the freeze
[09:04] <hikiko> but ok, I'll test a few other things first :)
[09:05] <duflu> hikiko: Try adding a sleep and retry. That will give you an idea of whether the race just lasts a split second
[09:07] <hikiko> I will :)
[09:52] <duflu> RAOF: That's fun. Each frame XMir is looping through my *unused* outputs as well as used
[09:52] <duflu> ... or something.... because I have 6 XMir wins to loop through
[09:56] <duflu> Hmm, sometimes 8. This is quite strange
[10:17] <davmor2> hey guys, I think I found an odd issue, if you hit the power button at any time in saucy the machine shuts down with no prompts, this includes if the machine is in lock/sleep mode, meaning someone could leave their machine and it sleeps and locks and someone else hits the power button, all their work is then lost
[10:18] <tvoss_> davmor2, did you try it without Mir?
[10:18] <tvoss_> davmor2, that is, without unity-system-compositor?
[10:19] <davmor2> tvoss_: nope but I can
[10:19] <duflu> davmor2: Yeah I thought it was a new feature. I like it. Has nothing to do with Mir though
[10:20] <davmor2> duflu: I'd of said it was a critical regression it should switch off in locked mode it should just popup the user password dialogue box
[10:20] <davmor2> shouldn't switch off even
[10:21] <duflu> davmor2: I suggest you raise it in #ubuntu-desktop. It's really unrelated to this channel
[10:22] <davmor2> duflu: will do thanks
[10:22] <duflu> davmor2: Though it's not a security issue. Anyone with physical access to the power button always has the ability to kill a machine. There's no change there
[10:23] <davmor2> tvoss_: so yeah happens with mir removed too
[10:23] <tvoss_> davmor2, then it's not an xmir issue
[10:25] <davmor2> duflu: indeed that is the case for a hard press but if someone thought the machine was hibernating and taps the power button to wake it like they do on windows instant shutdown may not be the desired result :)
[10:26] <duflu> I like it. I always hated having to click on a dialog after I've already pressed the power button.
[10:27] <duflu> And it's better than Windows' annoying behaviour of always suspending, IMHO
[10:28] <davmor2> duflu: I have no issue with it in an unlocked system, but if I lock my system I'd rather be forced to open the system and ensure there is nothing unsaved before my system shuts down on me :)
[10:29] <duflu> davmor2: Fair point. And Ubuntu design often makes contentious decisions. You should probably start by raising it in #ubuntu-desktop
[10:30] <davmor2> duflu: I am now :)
[11:40] <Mirv> I filed bug #1220652 now for my problem. the attachments may not be very useful since I'm running normal X again, but just in case someone would have a force-rebootable desktop at some point and could try to reproduce my problem
[11:41] <Mirv> and input devices disappearing might be a bit weird anyhow, not shown in the normal logs.
[11:42] <Mirv> oh, added the tidbit that writing in VT1 also has the text in VT7, but in lightdm itself I can't type or move mouse
[11:44] <tvoss_> Mirv, what would be the expected behavior from your pov when forcefully killing usc?
[11:47] <Mirv> tvoss_: to that matter, it restarting itself gracefully like unity does
[11:48] <tvoss_> Mirv, that would be an upstart task, wouldn't it?
[11:48] <Mirv> tvoss_: yes, I think
[11:49] <tvoss_> Mirv, might be good to mention that in the bug report, too
[11:49] <Mirv> added ", even after reboot" to the bug title to make it more clear that the killing is not the main issue
[11:49] <Mirv> behavior in kill situation might be worth a feature request in another bug, but that bug's intent is that I can't use xmir at the moment anymore, for some mysterious reason X
[11:50] <Mirv> I'll file another bug for the request to gracefully restart itself
[11:51] <Mirv> and that is bug #1220656
[13:06] <tkamppeter> I have serious multi-screen problems with the XMIR as of saucy release (not -proposed) should I report a bug or are there already fixes in the queue for which I should wait?
[13:08] <kgunn> tkamppeter: one minute while i dig
[13:08] <kgunn> we've tagged multimonitor here https://bugs.launchpad.net/xmir/+bugs?field.tag=multimonitor
[13:08] <kgunn> curious are you on intel? or ati/nvidia?
[13:09] <tkamppeter> kgunn, I am on Intel, Lenovo Thinkpad Twist with Core i& with its built-in GPU.
[13:11] <kgunn> tkamppeter: yeah, i'd skim thru the multimonitor tagged bugs...if you don't find a match/similar bug...please file
[13:27] <tkamppeter> kgunn, it seems to be bug 1216472, workaround (1) of this bug solves the problem.
[13:27] <kgunn> cool
[13:28] <tkamppeter> kgunn, I have marked the "Me too" now and subscribed to the bug.
[15:09] <hikiko> bye!
[15:59] <kdub> we need a way to change the client-side connection configuration in integration/acceptance tests
[16:01] <alan_g> kdub: which parts of the configuration can't you change?
[16:02] <kdub> if we use mir_connect, we can't put in a stub client-side graphics platform
[16:02] <kdub> was thinking of changing it a bit to use some test connection functions
[16:03] <kdub> but its just nice that the client's exec() in our test framework just uses our public api
[16:04] <alan_g> kdub: yes, we should have made the client API mockable
[16:07] <kdub> alan_g, would you be opposed to having integration and acceptance tests connect with something like mtd::mir_connect()?
[16:07] <kdub> (which allows us to put different client connection configurations in, depending on whether we want real graphics or not)
[16:10] <alan_g> kdub: If I were not starting from here I'd add a level of indirection so that all the API calls could be intercepted by test doubles.
[16:11] <kdub> with 'here' being 'lots of acceptance/integration tests already written' ?
[16:12] <alan_g> "here" being introducing function pointers breaking the ABI
[16:12] <kdub> ah, understood
[16:16] <alan_g> kdub: I think the better way is for mir_connect() to call through a function pointer  that defaults to the current implementation, but can be replaced for a test.
[16:17] <kdub> yeah
[16:18] <kdub> or even, maybe put that function into mir_client_library_debug.h
[16:19] <kdub> probably worth doing, two critical bugs about the tests have popped up related to this
[16:25] <alan_g> kdub: "that function" being the existing implementation and the function pointer?
[16:26] <kdub> well, i'm currently thinking the path of least resistance will be a mtf::test_client_connect() function that returns a MirConnection*
[16:27] <kdub> as opposed to adding a mir_connect(<current args>, MirClientFnPtrStruct*)
[16:28] <kdub> no touching the API, just a small inconvenience for tests
[16:29] <alan_g> kdub: I was thinking of a global pointer variable, not an extra parameter
[16:30] <alan_g> mir_connect() calls through (*mir_connect_impl)() which defaults to mir_connect_default_impl()
[16:31] <kdub> and that global pointer can be overwritten via a debug function?
[16:32] <kdub> rather, a function in mir_client_library_debug.h
[16:33] <alan_g> Or just declare it in mir_client_library_debug.h and allow it to be assigned to
[16:34] <kdub> yeah, i'll see which is nicer (i'll try to keep the struct containing the function pointers hidden)
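A minimal sketch of the indirection alan_g outlines above: the public entry point calls through a global function pointer that defaults to the existing implementation and can be reassigned by a test. All names here (`Connection`, `demo_connect`, `demo_connect_impl`, the stub) are hypothetical and are not Mir's real client API or headers.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a client connection object.
struct Connection
{
    std::string description;
};

// Signature shared by the default implementation and any test double.
using ConnectImpl = std::unique_ptr<Connection> (*)(char const* socket_path,
                                                    char const* app_name);

// Stands in for the existing, "real" connect logic.
std::unique_ptr<Connection> demo_connect_default_impl(char const* socket_path,
                                                      char const* app_name)
{
    return std::make_unique<Connection>(
        Connection{std::string("real:") + socket_path + ":" + app_name});
}

// The global pointer: in the discussion this would be declared in a debug
// header so that tests can assign to it. Defaults to the real implementation.
ConnectImpl demo_connect_impl = &demo_connect_default_impl;

// Public entry point: signature unchanged, one extra level of indirection.
std::unique_ptr<Connection> demo_connect(char const* socket_path, char const* app_name)
{
    assert(demo_connect_impl != nullptr);
    return demo_connect_impl(socket_path, app_name);
}

// A test double that never touches a real graphics platform.
std::unique_ptr<Connection> demo_connect_stub_impl(char const* /*socket_path*/,
                                                   char const* app_name)
{
    return std::make_unique<Connection>(Connection{std::string("stub:") + app_name});
}
```

A test would assign `demo_connect_impl = &demo_connect_stub_impl;` before exercising client code and restore the default afterwards. Callers keep using the same public function, which is the property kdub wanted to preserve.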
[16:46] <racarr> Morning...!
[16:47] <racarr> Zzzz. im not sure its worth it for me to go to the late night weekly
[16:47] <alan_g> Afternoon...!
[16:47] <racarr> because I try but this ends up happening every week
[16:48] <racarr> Hey alan_g ! How have you been?
[16:48] <alan_g> ;)
[16:48] <alan_g> racarr: I've been good.
[16:49] <alan_g> How was your vacation?
[16:52] <racarr> alan_g: It was great!
[16:53] <racarr> Left a significant portion of my stress in the desert :)
[16:53] <alan_g> Excellent. (Now let's see if we can fix that...)
[16:55] <alan_g> racarr: kdub: I've left a couple of MPs for you to review.
[16:56] <kdub> will look
[16:56]  * alan_g sneaks off 3 minutes early to mother-in-law's birthday...
[17:50] <kdub> fun gcc error: "The bug is not reproducible, so it is likely a hardware or OS problem."
[17:53] <racarr> kdub: Lol beautiful
[19:59] <ricmm> olli_: kgunn sorry guys chrome crashed
[21:11] <robert_ancell> racarr, "Renamed toggle_dpms to toggle_dpms_between_on_and_off" - love it :)
[21:31] <racarr> robert_ancell: :D I felt pretty happy about that too
[21:43] <racarr> ricmm: Can you tell me more about the screen blanking stuff in upowerd and the small hack you mentioned? I think our DPMS support API is pretty ready to land, so if the mechanism is simple
[21:43] <racarr> maybe I can just do the android impl...like right now!
[21:50] <kgunn> rsalveti: ^
[21:51] <rsalveti> cool
[21:52] <kgunn> rsalveti: also...did i see you mention it was easy to crash (or freeze) mir on N4 today ?
[21:52] <rsalveti> kgunn: yup
[21:53] <rsalveti> just installed today's mir image, and could easily crash mir when opening the browse-app, getting back to the shell and then killing the app
[21:53] <kgunn> rsalveti: any particular sequence? ( ive been testing regularly....but admit, i hadn't updated today)
[21:53] <kgunn> eeewww
[21:54] <rsalveti> let me reboot and give it another try
[21:54] <kgunn> rsalveti: one more....what are "input_tests" ?
[21:54] <kgunn> e.g. how to replicate what you see in terms of failed input handling
[21:54] <rsalveti> oh, it's just that currently powerd is listening for the input events directly
[21:55] <rsalveti> to support the power button
[21:55] <rsalveti> that's not the desired way when mir is in place, but it was the way we did this with SF
[21:55] <racarr> How should powerd get the power button
[21:55] <racarr> without a window, it doesnt seem it should get it from mir
[21:55] <rsalveti> what I tried to check is if we were able to get the input events via hal, which wasn't the case
[21:55] <racarr> so it needs to communicate with unity somehow
[21:56] <rsalveti> seems Mir was getting all the events somehow
[21:56] <rsalveti> well, powerd is not the right one to handle the power button
[21:56] <ricmm> racarr: not really, just exploring a simple interface via the HAL
[21:56] <ricmm> racarr: I can take a look at it
[21:56] <ricmm> the shell needs to implement policy, powerd execute it
[21:56] <rsalveti> as we discussed during that previous call, we need someone that is already handling the input events to handle that power button
[21:56] <ricmm> thats the decision we had made
[21:56] <rsalveti> and communicate with the system to let it suspend/resume
[21:56] <rsalveti> yeah
[21:57] <ricmm> or in this case, the other way around... powerd executes the policy by asking the shell (Mir) to blank the screen
[21:57] <ricmm> out via the android impl for DPMS, but all we really need is a screen on/off toggle
[21:57] <ricmm> no more
[21:57] <racarr> I don't understand where powerd helps us here, because
[21:57] <ricmm> rsalveti: do you know of a basic HAL interface for this?
[21:57]  * ricmm looks
[21:57] <rsalveti> right, but powerd is not the one listening for the power input events
[21:58] <racarr> the shell, needs to make the policy decision (no one else should see all the input events)
[21:58] <rsalveti> at least it shouldn't be
[21:58] <racarr> but, then if the shell uses powerd for the mechanism
[21:58] <racarr> mir in turn needs to respond to the screen turning off (explicitly, or through error conditions I guess)
[21:58] <ricmm> powerd will work the idle timeout
[21:58] <racarr> i.e. stop the compositor, stop handing out buffers
[21:58] <racarr> etc
[21:58] <ricmm> the shell will work the power button and ask powerd if it should suspend
[21:58] <racarr> so why not just have mir turn off the screen
[21:58] <racarr> For powerd to work the idle timeout
[21:59] <kgunn> racarr: i think they are saying the shell should own that policy
[21:59] <racarr> how will that work? i.e. it sees the stream of all input events?
[21:59] <racarr> Or does unity send it an event like "idle"
[21:59] <kgunn> for making decisions about whether or not to actually blank the screen
[21:59] <rsalveti> who is handling all the input, mir, right?
[21:59] <ricmm> want to jump in a hangout?
[21:59] <racarr> Yes
[21:59] <rsalveti> one thing is handling the power button input
[21:59] <ricmm> https://plus.google.com/hangouts/_/f4a766e9a7fc27ef894ab5ac8235856049fc75ab
[21:59] <rsalveti> the other is the timeout
[22:00] <racarr> kgunn: I am interpreting the opposite XD
[22:19] <racarr> Thanks :) will give screen blanking a shot in 10-15 min
[23:01] <robert_ancell> racarr, in src/server/shell/default_focus_mechanism.cpp, there is a reference held to the currently focussed surface that is never cleared until another surface has focus. Do you see any issues there if the surface is destroyed?
[23:02] <racarr> robert_ancell: Yes. That's the second part of the stress test race
[23:02] <racarr> robert_ancell: I tried to fix it with session-transactions but need to come up with another solution I guess
[23:02] <robert_ancell> racarr, do you have a branch that fixes that specific part of the code? Because I'm modifying that code to support dropping focus on vt switch
[23:02] <robert_ancell> ah
[23:02] <racarr> I think it may be resolvable with a weak reference
[23:03] <racarr> plus the appropriate logic when lock fails
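A minimal sketch of the weak-reference idea just mentioned: the focus mechanism remembers the last focused surface through a `std::weak_ptr`, so it does not keep a destroyed surface alive, and it handles the case where `lock()` fails. `Surface` and `FocusMechanism` here are hypothetical simplifications, not the classes in `default_focus_mechanism.cpp`. As racarr notes later in the log, this on its own does not cover the case where the underlying surface goes away between the check and the operation.

```cpp
#include <cassert>
#include <memory>

// Hypothetical, heavily simplified surface.
struct Surface
{
    bool focused = false;
};

class FocusMechanism
{
public:
    void set_focus_to(std::shared_ptr<Surface> const& next)
    {
        assert(next != nullptr);

        // lock() either yields a live shared_ptr or an empty one; there is
        // no window in which we hold a pointer to a destroyed Surface object.
        if (auto previous = last_focused.lock())
            previous->focused = false;
        // If lock() failed the previous surface is already gone: nothing to unfocus.

        next->focused = true;
        last_focused = next;  // weak: does not extend the surface's lifetime
    }

    bool has_live_focus() const { return !last_focused.expired(); }

private:
    std::weak_ptr<Surface> last_focused;
};
```

The point of the design is that dropping the last `shared_ptr` elsewhere really destroys the surface, and the next focus change simply skips the unfocus step.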
[23:03] <racarr> or the other idea I had was a different approach to session transactions where consumers of the surfaces in the shell store
[23:03] <racarr> mf::SurfaceId and there is an api like
[23:03] <racarr> session->with_surface_do(SurfaceId, [&](msh::Surface& surface) { stuff })
[23:04] <racarr> I think it is the cause of some of the phone crashes gerry is seeing so I am taking it on after DPMS
[23:04] <racarr> Screen blanking is blocking mir on phone though, so I am taking a minor diversion from GBM DPMS to hopefully quite quickly implement android screen blanking XD
[23:05] <racarr> robert_ancell: the thing is the surface will still be destroyed or whatever but exceptions may throw when you try to use it because the underlying surface is gone
[23:05] <robert_ancell> right
[23:05] <racarr> there is another race in the same section of code
[23:05] <racarr>     auto surface = focus_session->default_surface();
[23:05] <robert_ancell> racarr, is there an easy way to detect that without just ignoring exceptions when accessing it?
[23:06] <racarr> Then a few lines later:         surface->raise(surface_controller);
[23:06] <racarr> but the underlying surface could be destroyed
[23:06] <racarr> robert_ancell: Not really because without some sort of lock it can always happen inbetween
[23:06] <racarr> your detection and operations
[23:07] <racarr> so you need some way to do things atomically.
[23:07] <racarr> I guess a weak_ptr is not enough.
[23:07] <racarr> I like with_surface_do sort of. but im not sure it will land
[23:07] <racarr> let me figure out why session-transactions was rejected I dont remember
[23:08] <racarr> ok the recursive mutex.
[23:09] <racarr> hmm
[23:10] <racarr> the problem is, the idea of with_surface_do or do_transaction or whatever is that it holds a lock on the session that prevents destroy_surface, etc from running during the body of your function
[23:10] <racarr> but then certainly the body of the function should be able to call these methods
[23:10] <racarr> so you need a recursive mutex.
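A minimal sketch of the transaction shape racarr is describing: `with_surface_do` looks a surface up by id and runs the caller's body while holding the session lock, and the lock is a `std::recursive_mutex` so the body may call back into the session. The `Session` and `Surface` classes and the integer `SurfaceId` are hypothetical simplifications, not Mir's `mf::SurfaceId` or `msh::Surface`.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <mutex>

struct Surface
{
    int raised = 0;
    void raise() { ++raised; }
};

using SurfaceId = int;  // hypothetical; stands in for an opaque id type

class Session
{
public:
    SurfaceId create_surface()
    {
        std::lock_guard<std::recursive_mutex> lock{mutex};
        SurfaceId const id = next_id++;
        surfaces[id] = std::make_shared<Surface>();
        return id;
    }

    void destroy_surface(SurfaceId id)
    {
        std::lock_guard<std::recursive_mutex> lock{mutex};
        surfaces.erase(id);
    }

    // Runs body on the surface while holding the session lock, so no other
    // thread can destroy it mid-operation. Returns false if the id is unknown.
    bool with_surface_do(SurfaceId id, std::function<void(Surface&)> const& body)
    {
        assert(static_cast<bool>(body));
        std::lock_guard<std::recursive_mutex> lock{mutex};
        auto const it = surfaces.find(id);
        if (it == surfaces.end())
            return false;
        // Hold a strong reference so the object outlives the body even if
        // the body itself calls destroy_surface(id) on this thread.
        std::shared_ptr<Surface> const surface = it->second;
        body(*surface);
        return true;
    }

private:
    std::recursive_mutex mutex;  // recursive: the body may re-enter the session
    std::map<SurfaceId, std::shared_ptr<Surface>> surfaces;
    SurfaceId next_id = 1;
};
```

With a plain `std::mutex` the re-entrant call inside the body would be undefined behaviour (in practice usually a deadlock), which is the trade-off weighed in the rest of the discussion.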
[23:13] <racarr> robert_ancell: Ok I have an idea
[23:14] <racarr> the focus mechanism needs to track the last surface, so it can unfocus it when a new surface takes focus
[23:14] <racarr> so...um...
[23:14] <racarr> another way to do that? XD not sure what the pattern is
[23:16] <racarr> err, nvm I thought myself in a circle
[23:17] <racarr> I was thinking about, passing out token objects to the surfaces
[23:17] <racarr> that know how to generate unfocus events when they are destroyed
[23:17] <racarr> but it doesnt seem to make sense
[23:21] <racarr> Maybe the transaction API, but just don't use a recursive mutex
[23:21] <racarr> and make it clear that calling session functions in the body of this is a deadlock
[23:21] <racarr> with some API like
[23:22] <racarr> session->do_with_surface_while_freezing_requests(mf::SurfaceId, std::function)...
[23:23] <robert_ancell> racarr, but what do we do when the last surface is removed and then we try and access it afterwards?
[23:24] <racarr> Mm
[23:24] <racarr> Ok maybe remove the SurfaceId from the API and the session tracks
[23:24] <racarr> its last focused surface (as the default_surface I guess)
[23:24] <racarr> and there is API like
[23:24] <racarr> session->with_default_surface_do
[23:24] <racarr> std::function
[23:29] <racarr> I need to run some errands. Back in ~1 hour