AlbertA | too many threads.....too many mutexes.... | 00:39 |
---|---|---|
AlbertA | so I think a combination of anpok_ branch and just avoiding canceling the timer in TimeoutFrameDroppingPolicy | 00:42 |
AlbertA | with appropiate checks of pending_swaps...should do it... | 00:43 |
RAOF | AlbertA: I think one extra mutex should make everything work right :) | 00:49 |
AlbertA | RAOF: heh...I think we also need an extra thread :) | 00:51 |
=== chihchun is now known as chihchun_afk | ||
=== alf is now known as 6JTAAQ1J9 | ||
=== kgunn is now known as Guest2511 | ||
duflu | Guest2511: Welcome to Mir :) | 02:37 |
Guest2511 | its just me :) kgunn | 02:38 |
Guest2511 | night o/ | 02:38 |
=== chihchun_afk is now known as chihchun | ||
RAOF | Is there really no way to tell before you try writing how much data you'll need to write to a socket before it'll block? | 04:34 |
=== 6JTAAQ1J9 is now known as alf_ | ||
=== chihchun is now known as chihchun_afk | ||
=== chihchun_afk is now known as chihchun | ||
alan_g | anpok_: I know you've a wider solution in the pipe, but if you're happy with this can we top-approve to get a fix landed? https://code.launchpad.net/~vanvugt/mir/fix-1339700-alarm/+merge/226252 | 08:23 |
alf_ | alan_g: http://paste.ubuntu.com/7779262/ , it's not a problem in the observer code per se, it's a race in our session shutdown code | 09:46 |
alf_ | alan_g: if when closing a session we focus another session that has a different display configuration, we queue a display configuration change in the action loop that eventually needs to a acquire a write lock to remove surface observers when shutting the compositor down | 09:47 |
alan_g | alf_: should I take your word for it or do you need a second set of eyes? | 09:47 |
alf_ | alan_g: thanks, the problem is clear to me so need to take up your time | 09:48 |
alf_ | alan_g: ...so no need... | 09:48 |
alf_ | alan_g: ... meanwhile we continue tearing down the session object which triggers a surface remove observer (under read lock) and which eventually wants to unregister an input fd from the main loop | 09:50 |
alf_ | alan_g: that's just FYI, you can ignore the above, just expressing the issue so that it becomes even clearer in mind | 09:54 |
alan_g | alf_: understood. And it will help me with the review | 09:54 |
alf_ | alan_g: camako_: FYI, https://bugs.launchpad.net/mir/+bug/1340669 | 10:52 |
ubot5 | Ubuntu bug 1340669 in Mir "Intermittent deadlock when swithcing to session with custom display configuration while closing other session" [High,New] | 10:52 |
camako_ | alf_, hmmm this is different from the other deadlocks? | 10:53 |
alf_ | camako_: yes | 10:54 |
alf_ | camako_: (not fixed/affected by Alan's branch) | 10:54 |
camako_ | alf_, ack... just as severe, and need to be included soon in a release, I guess. | 10:55 |
alf_ | camako_: right, I marked it "high" instead of "critical" since it only affects XMir session at the moment, but feel free to upgrade | 10:56 |
alf_ | camako_: unfortunately, although the problem is clear, the solution hasn't become clear yet | 10:58 |
camako_ | alf_, finding the bug is half of the work to fix it :-)... | 10:59 |
alan_g | writing the test that demonstrates the bug is half of the work too | 11:02 |
alan_g | ...so fixing it must be trivial | 11:02 |
alan_g | camako_: any problems with the nested_lifecycle_events MP or just been busy? | 11:04 |
camako_ | alan_g, just back to it now.. should update soon | 11:05 |
alan_g | cool | 11:05 |
=== chihchun is now known as chihchun_afk | ||
anpok_ | alan_g: alf_: hey what about going all single threaded? | 11:54 |
alan_g | alf_: what's the status of MesaDisplayTest? I thought there was a recent "intermittent failure" bug but can't seem to find it | 11:55 |
alan_g | anpok_: where's the fun in that? | 11:55 |
anpok_ | I believe we should copy whatever the renderers need to composite a scene each frame, and have the rest happen in a single thread | 11:55 |
anpok_ | alan_g: there is none! but maybe when we get bored after a few months without a race or deadlock we would happily switch back? | 11:56 |
alf_ | alan_g: https://bugs.launchpad.net/mir/+bug/1336671 , waiting for fixed umockdev package to land in the archive (haven't checked today, perhaps it already did) | 11:57 |
ubot5 | Ubuntu bug 1336671 in Mir "Intermittent mir_unit_tests.MesaDisplayTest.drm_device_change_event_triggers_handler test failure: libumockdev isn't thread safe" [Medium,In progress] | 11:57 |
alf_ | anpok_: all the threads we have are there for a reason (performance), I am all for a single threaded structure, but I doubt we could get decent performance this way | 12:00 |
alan_g | alf_: thanks. | 12:02 |
anpok_ | alf_: you mean both the ipc handling threads and the renderer threads? | 12:08 |
=== alan_g is now known as alan_g|lunch | ||
alf_ | anpok_: and input | 12:09 |
anpok_ | ah right I forgot about input | 12:10 |
anpok_ | at least i try to :) | 12:10 |
* alf_ is reminded once again that "Writing correct shut-down code is hard" http://books.google.gr/books?id=_i6bDeoCQzsC&lpg=PT411&ots=eo2Pxn2b06&dq=writing%20correct%20shutdown%20code%20is%20hard%20martin&pg=PT411#v=onepage&q&f=false | 12:10 | |
alf_ | at least I know have an acceptance/regression test for the deadlock... | 12:12 |
anpok_ | doodle? | 12:25 |
camako_ | alan_g, just pushed the fixed version of nested_lc | 12:30 |
camako_ | alan_g|lunch ^^ | 13:00 |
sil2100 | kgunn: hi! | 13:12 |
=== alan_g|lunch is now known as alan_g | ||
kgunn | sil2100: hey | 13:15 |
alan_g | camako_: looking | 13:15 |
* kgunn realizes this is going to be a 3 coffee day | 13:15 | |
kgunn | see i've included all the "e"s so i've already had 2 | 13:15 |
sil2100 | kgunn: how's the progress on bug LP: #1339700 ? As I see many merges there and got a bit confused ;) | 13:15 |
ubot5 | Launchpad bug 1339700 in Mir "[regression] Device locks randomly on welcome screen" [High,In progress] https://launchpad.net/bugs/1339700 | 13:15 |
kgunn | sil2100: sorry...i meant to send a mail last night | 13:16 |
kgunn | sil2100: lemme see one email in my pile real quick | 13:16 |
kgunn | re this | 13:16 |
sil2100 | Ok :) | 13:16 |
sil2100 | Thanks | 13:16 |
kgunn | sil2100: ok, bottom line, we thot we had a fix, got to talking realized we needed to do a little surgery vs band-aiding.... | 13:18 |
kgunn | sil2100: so we backed up, had lots of discussion on irc & launchpad.... | 13:18 |
kgunn | sil2100: i think we'll consolidate those thots into one patch today, test, promote on our devel, then propose to trunk in an isolated fashion... | 13:18 |
kgunn | e.g. we'll do a 0.4.1mir with this bug fix only | 13:19 |
sil2100 | kgunn: that's a good plan | 13:19 |
kgunn | camako_: ^ did i speak right ? | 13:19 |
sil2100 | kgunn: would be best to land this isolated, without pulling in all the other, risky bits | 13:19 |
sil2100 | Especially that we're trying to get a completely green image ;) | 13:19 |
camako_ | kgunn, yeah sounds right | 13:19 |
sil2100 | kgunn, camako_: thanks guys :) | 13:21 |
sil2100 | camako_: oh, and let me finish up my check-up on the lp:mir request you had yesterday | 13:21 |
camako_ | sil2100.. sure.. take ur time | 13:22 |
alan_g | camako: another iteration of fixes | 13:43 |
camako | alan_g ack | 13:45 |
* kdub_ is debating adding a mir::Fd type | 14:11 | |
alan_g | kdub_: do it | 14:13 |
kdub_ | yay, will help me fix this flummoxing resource problem where I have to dance around pod-int-fds | 14:15 |
* kdub_ wishes gmock was better with movable / not copyable types | 14:21 | |
* alan_g too | 14:24 | |
=== alan_g is now known as alan_g|tea | ||
=== alan_g|tea is now known as alan_g | ||
=== chihchun_afk is now known as chihchun | ||
=== chihchun is now known as chihchun_afk | ||
AlbertA | If I could just std::lock all the three mutexes involved....we would be golden... | 16:42 |
alan_g | AlbertA: assuming there is a canonical order to the locking | 16:49 |
AlbertA | alan_g: right... it would mean maybe the Alarm would need to get a reference to the mutex it's handler uses | 16:51 |
AlbertA | so the handler can lock it in the same order | 16:51 |
alan_g | Sounds messy - a mutex should only be protecting data in the associated object | 16:54 |
AlbertA | maybe we could make it part of the handler interface | 16:54 |
AlbertA | a lock method... | 16:54 |
AlbertA | then you have an opportunity to implement the same locking order | 16:55 |
AlbertA | and optional if the handler does not need to deal with such a thing | 16:55 |
alan_g | Hmm. It is too late in the week for me to think about that. | 16:57 |
AlbertA | its still too brittle though, prone to error, because as soon as you break the lock hierarchy ( inadvertently) you would still deadlock | 17:01 |
AlbertA | very possible with the levels of indirection we have | 17:01 |
=== alan_g is now known as alan_g|EOW | ||
=== renato is now known as Guest85133 | ||
=== Guest85133 is now known as renato___ | ||
AlbertA | my head hurts....but I think a simple solution is just to restore cancel to the way it was...and just make the destructor of AlarmImpl | 20:12 |
AlbertA | use the synchronous wait for callback to complete version | 20:13 |
AlbertA | kgunn: so I'm confident that https://code.launchpad.net/~mir-team/mir/fix-1339700-take2/+merge/226534 | 21:15 |
AlbertA | kgunn: with Daniel's just landed change should address the deadlock without sideeffects | 21:15 |
AlbertA | kgunn: so this should not break abi: https://code.launchpad.net/~mir-team/mir/fix-1339700-take2-in-0.4/+merge/226537 | 21:16 |
kgunn | AlbertA: you are awesome | 21:24 |
kgunn | thanks | 21:24 |
kgunn | racarr__: ^ can you give that a quick review as well ? | 21:24 |
kgunn | kdub_: ^ if you're gonna be on for a little | 21:25 |
kgunn | AlbertA: so i assume you tested on n4 & n10 ? | 21:25 |
AlbertA | kgunn: yes | 22:00 |
Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!