[00:39] <AlbertA> too many threads.....too many mutexes....
[00:42] <AlbertA> so I think a combination of anpok_'s branch and just avoiding canceling the timer in TimeoutFrameDroppingPolicy
[00:43] <AlbertA> with appropriate checks of pending_swaps...should do it...
[00:49] <RAOF> AlbertA: I think one extra mutex should make everything work right :)
[00:51] <AlbertA> RAOF: heh...I think we also need an extra thread :)
=== chihchun is now known as chihchun_afk
=== alf is now known as 6JTAAQ1J9
=== kgunn is now known as Guest2511
[02:37] <duflu> Guest2511: Welcome to Mir :)
[02:38] <Guest2511> it's just me :) kgunn
[02:38] <Guest2511> night o/
=== chihchun_afk is now known as chihchun
[04:34] <RAOF> Is there really no way to tell before you try writing how much data you'll need to write to a socket before it'll block?
=== 6JTAAQ1J9 is now known as alf_
=== chihchun is now known as chihchun_afk
=== chihchun_afk is now known as chihchun
[08:23] <alan_g> anpok_: I know you've a wider solution in the pipe, but if you're happy with this can we top-approve to get a fix landed? https://code.launchpad.net/~vanvugt/mir/fix-1339700-alarm/+merge/226252
[09:46] <alf_> alan_g: http://paste.ubuntu.com/7779262/ , it's not a problem in the observer code per se, it's a race in our session shutdown code
[09:47] <alf_> alan_g: if when closing a session we focus another session that has a different display configuration, we queue a display configuration change in the action loop that eventually needs to acquire a write lock to remove surface observers when shutting the compositor down
[09:47] <alan_g> alf_: should I take your word for it or do you need a second set of eyes?
[09:48] <alf_> alan_g: thanks, the problem is clear to me so need to take up your time
[09:48] <alf_> alan_g: ...so no need...
[09:50] <alf_> alan_g: ... meanwhile we continue tearing down the session object which triggers a surface remove observer (under read lock) and which eventually wants to unregister an input fd from the main loop
[09:54] <alf_> alan_g: that's just FYI, you can ignore the above, just expressing the issue so that it becomes even clearer in mind
[09:54] <alan_g> alf_: understood. And it will help me with the review
[10:52] <alf_> alan_g: camako_: FYI, https://bugs.launchpad.net/mir/+bug/1340669
[10:52] <ubot5> Ubuntu bug 1340669 in Mir "Intermittent deadlock when swithcing to session with custom display configuration while closing other session" [High,New]
[10:53] <camako_> alf_, hmmm this is different from the other deadlocks?
[10:54] <alf_> camako_: yes
[10:54] <alf_> camako_: (not fixed/affected by Alan's branch)
[10:55] <camako_> alf_, ack... just as severe, and need to be included soon in a release, I guess.
[10:56] <alf_> camako_: right, I marked it "high" instead of "critical" since it only affects XMir session at the moment, but feel free to upgrade
[10:58] <alf_> camako_: unfortunately, although the problem is clear, the solution hasn't become clear yet
[10:59] <camako_> alf_, finding the bug is half of the work to fix it :-)...
[11:02] <alan_g> writing the test that demonstrates the bug is half of the work too
[11:02] <alan_g> ...so fixing it must be trivial
[11:04] <alan_g> camako_: any problems with the nested_lifecycle_events MP or just been busy?
[11:05] <camako_> alan_g, just back to it now.. should update soon
=== chihchun is now known as chihchun_afk
[11:54] <anpok_> alan_g: alf_: hey what about going all single threaded?
[11:55] <alan_g> alf_: what's the status of MesaDisplayTest? I thought there was a recent "intermittent failure" bug but can't seem to find it
[11:55] <alan_g> anpok_: where's the fun in that?
[11:55] <anpok_> I believe we should copy whatever the renderers need to composite a scene each frame, and have the rest happen in a single thread
[11:56] <anpok_> alan_g: there is none! but maybe when we get bored after a few months without a race or deadlock we would happily switch back?
[11:57] <alf_> alan_g: https://bugs.launchpad.net/mir/+bug/1336671 , waiting for fixed umockdev package to land in the archive (haven't checked today, perhaps it already did)
[11:57] <ubot5> Ubuntu bug 1336671 in Mir "Intermittent mir_unit_tests.MesaDisplayTest.drm_device_change_event_triggers_handler test failure: libumockdev isn't thread safe" [Medium,In progress]
[12:00] <alf_> anpok_: all the threads we have are there for a reason (performance), I am all for a single threaded structure, but I doubt we could get decent performance this way
[12:02] <alan_g> alf_: thanks.
[12:08] <anpok_> alf_: you mean both the ipc handling threads and the renderer threads?
=== alan_g is now known as alan_g|lunch
[12:09] <alf_> anpok_: and input
[12:10] <anpok_> ah right I forgot about input
[12:10] <anpok_> at least i try to :)
[12:10] * alf_ is reminded once again that "Writing correct shut-down code is hard" http://books.google.gr/books?id=_i6bDeoCQzsC&lpg=PT411&ots=eo2Pxn2b06&dq=writing%20correct%20shutdown%20code%20is%20hard%20martin&pg=PT411#v=onepage&q&f=false
[12:12] <alf_> at least I now have an acceptance/regression test for the deadlock...
[12:30] <camako_> alan_g, just pushed the fixed version of nested_lc
[13:00] <camako_> alan_g|lunch ^^
[13:12] <sil2100> kgunn: hi!
=== alan_g|lunch is now known as alan_g
[13:15] <kgunn> sil2100: hey
[13:15] <alan_g> camako_: looking
[13:15] * kgunn realizes this is going to be a 3 coffee day
[13:15] <kgunn> see i've included all the "e"s so i've already had 2
[13:15] <sil2100> kgunn: how's the progress on bug LP: #1339700 ? As I see many merges there and got a bit confused ;)
[13:15] <ubot5> Launchpad bug 1339700 in Mir "[regression] Device locks randomly on welcome screen" [High,In progress] https://launchpad.net/bugs/1339700
[13:16] <kgunn> sil2100: sorry...i meant to send a mail last night
[13:16] <kgunn> sil2100: lemme see one email in my pile real quick
[13:16] <kgunn> re this
[13:16] <sil2100> Ok :)
[13:18] <kgunn> sil2100: ok, bottom line, we thot we had a fix, got to talking realized we needed to do a little surgery vs band-aiding....
[13:18] <kgunn> sil2100: so we backed up, had lots of discussion on irc & launchpad....
[13:18] <kgunn> sil2100: i think we'll consolidate those thots into one patch today, test, promote on our devel, then propose to trunk in an isolated fashion...
[13:19] <kgunn> e.g. we'll do a 0.4.1mir with this bug fix only
[13:19] <sil2100> kgunn: that's a good plan
[13:19] <kgunn> camako_: ^ did i speak right ?
[13:19] <sil2100> kgunn: would be best to land this isolated, without pulling in all the other, risky bits
[13:19] <sil2100> Especially that we're trying to get a completely green image ;)
[13:19] <camako_> kgunn, yeah sounds right
[13:21] <sil2100> kgunn, camako_: thanks guys :)
[13:21] <sil2100> camako_: oh, and let me finish up my check-up on the lp:mir request you had yesterday
[13:22] <camako_> sil2100.. sure.. take ur time
[13:43] <alan_g> camako: another iteration of fixes
[13:45] <camako> alan_g ack
[14:11] * kdub_ is debating adding a mir::Fd type
[14:13] <alan_g> kdub_: do it
[14:15] <kdub_> yay, will help me fix this flummoxing resource problem where I have to dance around pod-int-fds
[14:21] * kdub_ wishes gmock was better with movable / not copyable types
[14:24] * alan_g too
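
For readers wondering what a dedicated fd type buys over a plain int, here is a minimal sketch of an owning, movable but not copyable file descriptor wrapper. `OwnedFd`, `invalid_fd` and `is_open` are hypothetical names chosen for illustration; this is not the `mir::Fd` that kdub_ went on to write, and it assumes a POSIX system.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <utility>

constexpr int invalid_fd = -1;

// Hypothetical owning wrapper (not Mir's mir::Fd): movable, not copyable,
// closes the descriptor exactly once when the last owner goes away.
class OwnedFd
{
public:
    OwnedFd() noexcept = default;
    explicit OwnedFd(int raw) noexcept : fd_{raw} {}

    OwnedFd(OwnedFd const&) = delete;
    OwnedFd& operator=(OwnedFd const&) = delete;

    // Moving transfers ownership and leaves the source empty.
    OwnedFd(OwnedFd&& other) noexcept : fd_{other.release()} {}
    OwnedFd& operator=(OwnedFd&& other) noexcept
    {
        if (this != &other) reset(other.release());
        return *this;
    }

    ~OwnedFd() { reset(); }

    int get() const noexcept { return fd_; }

    // Give up ownership without closing.
    int release() noexcept
    {
        int raw = fd_;
        fd_ = invalid_fd;
        return raw;
    }

    // Close the current descriptor (if any) and take ownership of raw.
    void reset(int raw = invalid_fd) noexcept
    {
        if (fd_ != invalid_fd) ::close(fd_);
        fd_ = raw;
    }

private:
    int fd_{invalid_fd};
};

// Helper for checking the sketch: true while fd refers to an open descriptor.
bool is_open(int fd) { return ::fcntl(fd, F_GETFD) != -1; }
```

Because the copy operations are deleted, such a type can only be passed by move or by reference, which is the kind of movable / not copyable type the gmock remarks above are about.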
=== alan_g is now known as alan_g|tea
=== alan_g|tea is now known as alan_g
=== chihchun_afk is now known as chihchun
=== chihchun is now known as chihchun_afk
[16:42] <AlbertA> If I could just std::lock all the three mutexes involved....we would be golden...
[16:49] <alan_g> AlbertA: assuming there is a canonical order to the locking
[16:51] <AlbertA> alan_g: right... it would mean maybe the Alarm would need to get a reference to the mutex its handler uses
[16:51] <AlbertA> so the handler can lock it in the same order
[16:54] <alan_g> Sounds messy - a mutex should only be protecting data in the associated object
[16:54] <AlbertA> maybe we could make it part of the handler interface
[16:54] <AlbertA> a lock method...
[16:55] <AlbertA> then you have an opportunity to implement the same locking order
[16:55] <AlbertA> and optional if the handler does not need to deal with such a thing
[16:57] <alan_g> Hmm. It is too late in the week for me to think about that.
[17:01] <AlbertA> it's still too brittle though, prone to error, because as soon as you break the lock hierarchy (inadvertently) you would still deadlock
[17:01] <AlbertA> very possible with the levels of indirection we have
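
As an illustration of the std::lock idea discussed above, the sketch below takes three mutexes in one call from two threads that name them in opposite orders. `Shared`, `bump_abc`, `bump_cba` and `run_contended` are made-up names, and the three mutexes merely stand in for the ones in Mir; the actual Mir code is not shown in the log.

```cpp
#include <mutex>
#include <thread>

// Three mutexes standing in for the "three mutexes involved" (hypothetical).
struct Shared
{
    std::mutex a, b, c;
    int value = 0;
};

// std::lock acquires all of its arguments with a deadlock-avoidance
// algorithm, so the order in which they are named does not matter.
void bump_abc(Shared& s)
{
    std::unique_lock<std::mutex> la{s.a, std::defer_lock};
    std::unique_lock<std::mutex> lb{s.b, std::defer_lock};
    std::unique_lock<std::mutex> lc{s.c, std::defer_lock};
    std::lock(la, lb, lc);
    ++s.value;
}

// Same work, mutexes named in the opposite order.
void bump_cba(Shared& s)
{
    std::unique_lock<std::mutex> la{s.a, std::defer_lock};
    std::unique_lock<std::mutex> lb{s.b, std::defer_lock};
    std::unique_lock<std::mutex> lc{s.c, std::defer_lock};
    std::lock(lc, lb, la);
    ++s.value;
}

// Run both variants concurrently; returns the final counter value.
int run_contended(int iterations)
{
    Shared s;
    std::thread t1{[&] { for (int i = 0; i != iterations; ++i) bump_abc(s); }};
    std::thread t2{[&] { for (int i = 0; i != iterations; ++i) bump_cba(s); }};
    t1.join();
    t2.join();
    return s.value;
}
```

This only helps when every path takes all the mutexes together in a single std::lock call. A path that already holds one mutex and then calls into code that locks another is outside that guarantee, which is the brittleness AlbertA points out.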
=== alan_g is now known as alan_g|EOW
=== renato is now known as Guest85133
=== Guest85133 is now known as renato___
[20:12] <AlbertA> my head hurts....but I think a simple solution is just to restore cancel to the way it was...and just make the destructor of AlarmImpl
[20:13] <AlbertA> use the synchronous wait for callback to complete version
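
The split AlbertA describes, a cancel that never waits and a destructor that does, can be sketched roughly as follows. `OneShotAlarm` is a hypothetical stand-in built on a plain thread and condition variable; it is not Mir's AlarmImpl, whose code is not shown in the log, and the helper function exists only to demonstrate the behaviour.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical one-shot alarm (not Mir's AlarmImpl).
class OneShotAlarm
{
public:
    OneShotAlarm(std::chrono::milliseconds delay, std::function<void()> callback)
        : callback_{std::move(callback)},
          worker_{[this, delay] { run(delay); }}
    {
    }

    // Synchronous: if the callback is already running, join() waits for it,
    // so the callback can never outlive the alarm object.
    ~OneShotAlarm()
    {
        cancel();
        worker_.join();
    }

    // Asynchronous: never waits for a running callback.
    // Returns true only if the alarm was still pending.
    bool cancel()
    {
        std::lock_guard<std::mutex> lock{mutex_};
        if (state_ != State::pending) return false;
        state_ = State::cancelled;
        changed_.notify_all();
        return true;
    }

private:
    enum class State { pending, running, done, cancelled };

    void run(std::chrono::milliseconds delay)
    {
        std::unique_lock<std::mutex> lock{mutex_};
        // wait_for returns true if the alarm was cancelled before the timeout.
        if (changed_.wait_for(lock, delay, [this] { return state_ == State::cancelled; }))
            return;
        state_ = State::running;
        lock.unlock();
        callback_();  // run the handler without holding the alarm's mutex
        lock.lock();
        state_ = State::done;
    }

    std::function<void()> callback_;
    std::mutex mutex_;
    std::condition_variable changed_;
    State state_{State::pending};
    std::thread worker_;  // declared last so it starts after the other members exist
};

// Demonstration helper: destroy the alarm while its callback is mid-flight.
bool destructor_waits_for_running_callback()
{
    std::atomic<bool> started{false};
    std::atomic<bool> finished{false};
    {
        OneShotAlarm alarm{std::chrono::milliseconds{0}, [&] {
            started = true;
            std::this_thread::sleep_for(std::chrono::milliseconds{50});
            finished = true;
        }};
        while (!started) std::this_thread::yield();
    }  // cancel() is too late here, so the destructor blocks in join()
    return finished;
}
```

In this sketch the handler runs with the alarm's own mutex released, so a handler that takes other locks does not nest them under it. Destroying the alarm from inside its own callback would still be an error, since the worker thread would be joining itself.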
[21:15] <AlbertA> kgunn: so I'm confident that https://code.launchpad.net/~mir-team/mir/fix-1339700-take2/+merge/226534
[21:15] <AlbertA> kgunn: with Daniel's just landed change should address the deadlock without side effects
[21:16] <AlbertA> kgunn: so this should not break ABI: https://code.launchpad.net/~mir-team/mir/fix-1339700-take2-in-0.4/+merge/226537
[21:24] <kgunn> AlbertA: you are awesome
[21:24] <kgunn> racarr__: ^ can you give that a quick review as well ?
[21:25] <kgunn> kdub_: ^ if you're gonna be on for a little
[21:25] <kgunn> AlbertA: so i assume you tested on n4 & n10 ?
[22:00] <AlbertA> kgunn: yes
