[00:54] <dupingping86> who is main developer!
[00:55] <dupingping86> I'll try to become a main developer.
[00:55] <dupingping86> let's help each other.
[00:55] <dupingping86> so let's make mir wonderful.
[00:58] <prepangolin> Hello everybody.
[10:52] <alf_> alan_g: Do you want to take another look at mir-screencast-basic-client-api or shall I top-approve?
[10:53] <alan_g> alf_: If you've got other approvals I won't waste time
[10:53]  * alan_g is hinting a race condition in buffer management
[10:53] <alan_g> *hunting
[10:55] <alf_> alan_g: interesting!
[11:01] <alan_g> alf_: it is what we software developers live for. ;)
[11:03] <alan_g> alf_: can you take a peek in case it is obvious to you how this happens? https://bugs.launchpad.net/mir/+bug/1267323/comments/4
[11:04] <alf_> alan_g: sure
[11:31] <alf_> alan_g: can you try if calling surface_data->frame_posted(); before swapping the client buffers makes a difference in ms::BasicSurface::swap_buffers() ?
[11:31] <alan_g> alf_:  sure
[11:35] <alan_g> No effect.
[11:36] <alf_> alan_g: a problem in informing the compositor about changes to the scene would have the blocking effect (but that's just a guess in this case)
[11:37] <alf_> alan_g: hmm, although that would probably block all clients, not just the new ones...
[11:38] <alan_g> alf_: yeah, trawling through the logic now. Thanks for looking (was hoping different eyes would help)
[11:38] <alan_g> alf_: in lp:1274208 it is all clients
[11:39] <alan_g> For the moment I'm assuming that duflu is right about this being one problem. It is simpler that way. ;)
[11:58] <sil2100> Hi guys, do you know usually how long the mir unit-tests should run on armhf?
[11:59] <sil2100> Is it taking a long time normally?
[11:59] <alf_> sil2100: are you running with valgrind/memcheck?
[12:00] <sil2100> alf_: just what is run in a standard mir package build
[12:01] <sil2100> alf_: https://launchpad.net/~ci-train-ppa-service/+archive/landing-005/+build/5537029 <- it's like standing there for at least 20-25 minutes
[12:02] <alf_> sil2100: that's not normal, something is wrong
[12:04] <sil2100> alf_: it was fine yesterday, and suddenly it started to work like this in the morning
[12:06] <alf_> alf_: this a native armhf (not cross-compiled) build I gather?
[12:07] <alf_> sil2100: this a native armhf (not cross-compiled) build I gather?
[12:09] <alf_> sil2100: where can I get the details for this build, which branch was used etc?
[12:13] <sil2100> alf_: yes, it's a non-virtualized builder
[12:13] <sil2100> alf_: let me provide you all the details
[12:14] <sil2100> alf_: https://code.launchpad.net/~mir-team/mir/trunk-0.1.4/+merge/201707 <- this is the branch being built
[12:15] <sil2100> alf_: it's the branch that releases all mir development happening in the past
[12:16] <sil2100> alf_: nothing changed since yesterday it seems, and yesterday we had a green mir in the PPA - all other platforms build fine
[12:16] <sil2100> alf_: just armhf gets stuck here
[12:44] <alan_g> alf_: I've made some progress. It appears to be related to the topmost window overlaying other windows - not sure of the exact scenario but I've got a little movie (tries to remember how to access chinstrap)
[13:36] <sil2100> kgunn: hi! It seems we can't get mir building for armhf now
[13:36] <sil2100> kgunn: it was fine yesterday, now it hangs infinitely on unit-tests
[13:36] <sil2100> kgunn: and it's just for armhf
[13:37] <sil2100> kgunn: I can't block on mir any longer, so I'll free the mir landing silo, release all the queued platform-api bits in another landing and only then re-assign mir a silo
[13:49] <alf_> kgunn: @mir failing in silo, I am building the branch locally on Nexus 10, to see if I can reproduce the hang
[13:50] <alf_> alan_g: The first thing that comes to mind is the mc::OcclusionFilter failing somehow, but in that case I would expect the other windows not to be drawn at all...
[13:58] <alan_g> alf_: I assume we block occluded surfaces to prevent them repainting. That will block a frontend thread. And if we block enough of them (e.g. the only one) then frontend is dead.
[14:01] <alf_> alan_g: yes, that sounds right
[14:02] <alf_> alan_g: (we "block" them by not consuming their previous buffers)
[14:02] <alan_g> ack
[14:03]  * alan_g thinks it is easy to test - but not trivial to fix
[14:05] <anpok> can we discard them
[14:07] <alan_g> alf_: yes if OcclusionFilter::operator() is hacked to return false then we don't consume all the frontend threads.
[14:07] <alan_g> anpok: we intend to block the client, but not to exhaust frontend threads
[14:26] <alf_> kgunn: mir_unit_tests run fine locally, both when cross-compiled and when native compiled on N10 (using trunk-0.1.4 branch)
[14:45] <alan_g> alf_: anpok can you confirm my logic:
[14:45] <alan_g> 1. We shouldn't have blocking calls on frontend threads
[14:45] <alan_g> 2. as currently implemented SessionMediator::next_buffer() can block in SwitchingBundle::client_acquire()
[14:45] <alan_g> 3. If client_acquire() is changed to non-blocking then it needs to store a completion callback when it can't complete
[14:45] <alan_g> 4. If SwitchingBundle::compositor_release() finds a completion callback it must invoke and release it
[14:45] <alan_g> 5. That callback needs to call pack_protobuf_buffer() and done->Run()
[14:45] <alan_g> 6. This introduces some resource management issues - like the lifetime of the message response object
[14:46] <alan_g> 7. Not allowing blocking calls on frontend threads means we shouldn't need to unblock them on shutdown (we can delete code)
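Points 3 and 4 above can be illustrated with a minimal sketch. This is not Mir's SwitchingBundle: ToyBundle, BufferId and Completion are hypothetical names invented for illustration, and the real callback would do the pack_protobuf_buffer() and done->Run() work from point 5. The sketch only shows the shape of the idea: client_acquire() never blocks, and compositor_release() invokes a stored completion callback if one is waiting.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a buffer handle; not Mir's real type.
using BufferId = int;

// Toy bundle (assumed design, not Mir code): acquire never blocks the caller.
class ToyBundle
{
public:
    using Completion = std::function<void(BufferId)>;

    explicit ToyBundle(int nbuffers)
    {
        for (int i = 0; i < nbuffers; ++i) free_buffers.push_back(i);
    }

    // Point 3: complete immediately if a buffer is free, else store the callback.
    void client_acquire(Completion done)
    {
        BufferId id = 0;
        {
            std::lock_guard<std::mutex> lock{guard};
            if (free_buffers.empty())
            {
                pending.push_back(std::move(done));
                return;  // the frontend thread returns instead of waiting
            }
            id = free_buffers.front();
            free_buffers.pop_front();
        }
        done(id);  // invoked outside the lock
    }

    // Point 4: a released buffer goes straight to the oldest waiting client.
    void compositor_release(BufferId id)
    {
        Completion waiter;
        {
            std::lock_guard<std::mutex> lock{guard};
            if (pending.empty())
            {
                free_buffers.push_back(id);
                return;
            }
            waiter = std::move(pending.front());
            pending.pop_front();
        }
        waiter(id);  // in the real design this would pack the reply and run `done`
    }

    std::size_t waiting() const
    {
        std::lock_guard<std::mutex> lock{guard};
        return pending.size();
    }

private:
    mutable std::mutex guard;
    std::deque<BufferId> free_buffers;
    std::deque<Completion> pending;
};

// Usage: with one buffer, the second acquire is deferred until a release.
inline void example_usage()
{
    ToyBundle bundle{1};
    int first = -1, second = -1;
    bundle.client_acquire([&](BufferId id) { first = id; });
    bundle.client_acquire([&](BufferId id) { second = id; });
    assert(first == 0 && second == -1 && bundle.waiting() == 1);
    bundle.compositor_release(first);
    assert(second == 0 && bundle.waiting() == 0);
}
```

The lifetime concern from point 6 shows up here as whatever the stored callback captures: anything it references must outlive the deferred invocation, so a real implementation would likely hold the response object by shared ownership rather than by reference.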
[14:48] <kgunn> hey guys...dr appt longer than i thot
[14:48] <kgunn> alf_: thanks for trying...
[14:49] <alf_> alan_g: Looks sensible
[15:22] <kgunn> sil2100: so what's the current thot on mir in ci train ?
[15:24] <sil2100> kgunn: for now, until alf_ or anyone else figures out why the mir unit tests hang on armhf, we land platform-api pending changes separately - and then we re-enable mir landing
[15:26] <kgunn> sil2100: so locally they don't...for the ci train is this on calxeda where the unit test fails ??
[15:26] <sil2100> kgunn: I guess so, it's the kishin builders - it was failing on more than one of the builder machines so it's not only one-hardware issue I guess
[15:29] <kgunn> sil2100: ok...i'll try local as well...we may have to go down the road of debug on the calxeda
[15:32] <sil2100> kgunn: the strange thing is that yesterday all was fine
[15:33] <kgunn> sil2100: no kidding, curious...did platform-api (aka papi) start behaving all of a sudden too ?
[15:34] <sil2100> kgunn: no, this one got actually fixed - but I guess it was some cmake fix FWIK
[16:12] <anpok> alf_: async page flips?
[16:13] <alf_> anpok: yes
[16:16] <kgunn> sil2100: so is the "fixed" papi in proposed image ?
[16:16] <kgunn> kdub_: ^
[16:16] <sil2100> kgunn: not yet... having problems with cmake-related things
[16:18] <kdub_> sil2100, is there a patch to try?
[17:33] <anpok> alf_: so the issue is that compositing might happen while the page flip is still scheduled, instead we should wait for the completion of the page flip during the first draw call
[17:34] <alf_> anpok: yes, that is, keep what we have now, I don't think the optimization Daniel describes will work properly
[17:35] <anpok> but we dont have it at the next draw call, do we?
[17:45] <alf_> anpok: right now we schedule and wait for page flips before continuing
[20:30] <dandrader> is mir supposed to know anything or have any involvement with clipboard (copy/paste)?