/srv/irclogs.ubuntu.com/2014/07/03/#ubuntu-mir.txt

[00:02] <bregma> well, I'm going to try again: desktop Unity 8 has been dead in the water for a while, segfaulting in the Intel driver on uninitialized buffers when the Mir server does a glClear()
[00:03] <bregma> currently logged as https://bugs.launchpad.net/ubuntu/+source/unity8-desktop-session/+bug/1336854
[00:03] <ubot5> Ubuntu bug 1336854 in unity8-desktop-session (Ubuntu) "Unity 8 fails to start, segfault in i965_dri.so" [Undecided,New]
[00:03] <bregma> with an attached stack trace
[00:03] <bregma> I'm looking for suggestions on what may have changed in the stack in the last week or two, and/or suggestions on how to chase down more information to help narrow the cause down
[00:03] <RAOF> bregma: Yeah, I noticed that yesterday.
[00:04] <RAOF> It's going to be a mesa problem.
[00:04] <bregma> unity-system-compositor continues to run fine, so it could be something wacky higher in the stack, it's under a lot of churn lately too
[00:05] <RAOF> usc is fine because it's not using the Mir EGL platform.
[00:05] <RAOF> I suspect that you first saw this on the 23rd?
[00:05] <RAOF> That's when Maarten uploaded the new mesa.
[00:05] <bregma> sounds suspicious
[00:06] <bregma> if I can perhaps revert to an older version, that would point the finger
[00:06] <RAOF> Give it a whirl.
[00:06] <RAOF> I'll poke around in the mesa code.
[00:34] <bregma> FTR, reverting libgl1-mesa-dri to 10.1.3-0ubuntu0.1 fixes the segfault
[00:42] <RAOF> Thanks. I'm installing a debug build, so it shouldn't be long before working out what's wrong and fixing it.
[01:34] <RAOF> Ah! There's the problem.
[01:54] <RAOF> bregma: Mesa 10
[01:55] <RAOF> bregma: Mesa 10.2.1-2ubuntu3 fixes your problem. Enjoy!
[02:57] <duflu> racarr: What's a prompt session?
[03:07] <racarr> duflu: trust sessions
[03:07] <racarr> ;)
[03:07] <racarr> https://wiki.ubuntu.com/Security/TrustStoreAndSessions
[03:08] <racarr> sort of explains how the name prompt session
[03:08] <racarr> comes about
[03:09] <duflu> racarr: OK then...
[06:25] <RAOF> Oh, whoops. We appear to call all manner of un-signal-safe functions in the emergency cleanup signal handler.
[06:25] <RAOF> Including allocations!
[06:48] <duflu> RAOF: Sounds a bit silly. Incidentally I mentioned we shouldn't ever do such a thing in a branch that's pending... https://code.launchpad.net/~vanvugt/mir/fatal-error/+merge/219471
[06:48] <RAOF> Yeah, reviewing that is what brought me to look at the emergency cleanup bit.
[06:49] <duflu> RAOF: Well the latest emergency cleanup stuff I don't think I reviewed either
[06:49] <duflu> The plate was piled too high in May
[06:51] <duflu> Hmm Chrome in Trusty has annoyingly bad tearing (diagonal/triangles)... I wonder if that's Chrome or Compiz
[06:52] <duflu> If it's Compiz then of course we don't care any more
[06:53] <RAOF> I think it's Chrome; it started with the Aura drop, IIRC.
[06:53] * duflu looks up Aura
[06:54] <duflu> Oh, nice one Google... they really should be able to do better than that
[07:12] <alf_> RAOF: duflu: where are we doing allocations in the emergency cleanup handlers?
[07:12] <duflu> No idea
[07:12] * duflu points to RAOF
[07:13] <RAOF> alf_: You assign to a vector; that's at least potentially doing allocation.
[07:13] <duflu> Oh that will do it
[07:13] <RAOF> Also, pthreads isn't threadsafe, right?
[07:13] <duflu> Depending on the method...
[07:13] <duflu> RAOF: Umm, what?
[07:13] <RAOF> Ahem.
[07:13] <RAOF> Signalsafe
[07:13] <RAOF> 'cause we also lock a mutex in there.
[07:14] <duflu> RAOF: It never used to be... Signals could arrive in _any_ thread but should arrive in just one. So unpredictable but safe I think
[07:14] <RAOF> Right, but what happens if pthread_lock gets interrupted by a signal?
[07:14] <duflu> If you want to determine the thread to use then pthread_kill
[07:15] <alf_> RAOF: duflu: at vector, right, that's easy to fix I will take a look
[07:15] <duflu> RAOF: "If a signal is delivered to a thread waiting for a mutex, upon return from the signal handler the thread resumes waiting for the mutex as if it was not interrupted."
[07:15] <duflu> [http://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_mutex_lock.html]
[07:16] <RAOF> I'm not sure that covers my concern.
[07:16] <RAOF> But signal-safety documentation isn't the best :)
[07:17] <duflu> RAOF: Well that's just the spec. Everyone has their own implementation
[07:17] <RAOF> No, I mean that it's not clear that documentation covers the case I'm concerned about.
[07:17] <alf_> RAOF: "The mutex functions are not async-signal safe. What this means is that they should not be called from a signal handler. In particular, calling pthread_mutex_lock or pthread_mutex_unlock from a signal handler may deadlock the calling thread."
[07:17] <alf_> RAOF: :)
[07:18] <RAOF> alf_: Ding!
[07:18] <duflu> Certainly, locking in signal handlers is also bad
[07:18] <duflu> Then again without such bugs I would have received more core files in my life time than I have
[07:18] <RAOF> Basically, the set of things you can _safely_ do in a signal handler is poke at memory addresses :)
[07:19] <RAOF> </hyperbole>
[07:19] <duflu> I remember plenty of stack traces from customers showing code hanging in a signal handler after a crash :/
[07:20] <RAOF> Yup.
[07:20] <RAOF> It's disturbingly easy to deadlock in one.
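
The "poke at memory addresses" rule discussed above can be sketched roughly as follows. This is a minimal illustration and not Mir's actual emergency cleanup code: the names `cleanup_requested`, `on_fatal_signal` and `install_handler` are hypothetical, and the only real APIs used are POSIX `sigaction` and `sigemptyset`. The handler takes no lock and allocates nothing; it only stores to a `volatile sig_atomic_t` flag.

```cpp
#include <signal.h>

// Written by the handler, read by ordinary code afterwards.
// volatile sig_atomic_t is the type that may be written from a handler.
volatile sig_atomic_t cleanup_requested = 0;

// Hypothetical handler: no mutex, no allocation, no stdio.
extern "C" void on_fatal_signal(int)
{
    cleanup_requested = 1;
}

// Hypothetical helper that installs the handler for one signal number.
bool install_handler(int signum)
{
    struct sigaction action {};
    action.sa_handler = on_fatal_signal;
    sigemptyset(&action.sa_mask);
    return sigaction(signum, &action, nullptr) == 0;
}
```

Anything heavier than setting the flag would, under this approach, be done later by normal code that checks `cleanup_requested`, which is only possible for signals the process survives.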
[07:21] <alf_> duflu: RAOF: We can drop the locks, making it clear that you are not to call the emergency cleanup while concurrently adding a handler
[07:21] <alf_> duflu: RAOF: plus the emergency cleanup is a best effort cleanup
[07:22] <duflu> alf_: If we can guarantee the signal handler is only called once then I guess reduced locking is OK
[07:22] <RAOF> Well, the signal handler isn't reentrant.
[07:22] <RAOF> (By default)
[07:22] <RAOF> But we should ensure that we don't do anything that's highly likely to deadlock there :)
[07:23] <duflu> Also easy to do by accident though... if someone else asks the signal() function for the existing handler (yours) and calls it
[07:23] <RAOF> ?!
[07:23] <RAOF> Why on earth would they do that?
[07:24] <alf_> duflu: RAOF: and if we get rid of the locking requirement we might as well drop the vector copy
[07:24] <duflu> RAOF: 3rd party libraries or general dealing with bad APIs where you need a signal handler
[07:24] <RAOF> duflu: Oh, as in wrapping a signal handler?
[07:25] <duflu> RAOF: I think that's one case
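
For readers unfamiliar with the "wrapping a signal handler" case, here is a hedged sketch of what such a third-party wrapper might look like. It is not taken from any library mentioned in the log; `previous_action`, `our_handler_runs`, `chained_handler` and `install_chained_handler` are hypothetical names. The point is that the earlier handler gets called from someone else's handler, so it cannot assume it is invoked only once or only by the kernel.

```cpp
#include <signal.h>

// Whatever was installed before us, saved by install_chained_handler().
struct sigaction previous_action;
volatile sig_atomic_t our_handler_runs = 0;

extern "C" void chained_handler(int signum)
{
    our_handler_runs = our_handler_runs + 1;

    // Sketch only: three-argument (SA_SIGINFO) handlers are not forwarded.
    if (previous_action.sa_flags & SA_SIGINFO)
        return;

    // Forward to the earlier handler if it is a real function.
    if (previous_action.sa_handler != SIG_DFL &&
        previous_action.sa_handler != SIG_IGN)
        previous_action.sa_handler(signum);
}

// Installs chained_handler and remembers the handler it replaces.
bool install_chained_handler(int signum)
{
    struct sigaction action {};
    action.sa_handler = chained_handler;
    sigemptyset(&action.sa_mask);
    return sigaction(signum, &action, &previous_action) == 0;
}
```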
[07:25] <duflu> alf_: If the vector never gets too big then an array is fine
[07:26] <RAOF> alf_: Yup. We could, of course, make it a signal-safe data structure, but given that it's basically write-once I think we can happily read without locking.
[07:28] <duflu> If it's write-once and that's guaranteed to happen before the read there is no race. And Helgrind etc will be happy without locking
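
The fixed-array, write-once idea from the last few messages might look something like the sketch below. It is an illustration under stated assumptions, not the change that later landed in Mir: `CleanupFn`, `max_handlers`, `add_cleanup_handler` and `run_cleanup_handlers` are hypothetical, registration is assumed to happen from one thread during start-up, and `std::atomic<std::size_t>` is assumed to be lock-free on the target.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

using CleanupFn = void (*)();

constexpr std::size_t max_handlers = 8;  // assumed upper bound
std::array<CleanupFn, max_handlers> handlers{};
std::atomic<std::size_t> handler_count{0};

// Call only during start-up, from a single thread, before any signal
// handler that runs the cleanup has been installed.
bool add_cleanup_handler(CleanupFn fn)
{
    std::size_t const n = handler_count.load(std::memory_order_relaxed);
    if (n == max_handlers)
        return false;
    handlers[n] = fn;
    // Publish the slot only after it has been written.
    handler_count.store(n + 1, std::memory_order_release);
    return true;
}

// No lock and no allocation, so this part is usable from a signal handler
// provided every registered function is itself async-signal-safe.
void run_cleanup_handlers()
{
    std::size_t const n = handler_count.load(std::memory_order_acquire);
    for (std::size_t i = 0; i < n; ++i)
        handlers[i]();
}
```

The fixed capacity trades flexibility for the guarantee that registration never reallocates storage the handler might be reading.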
[07:32] <alf_> duflu: RAOF: it says "The mutex functions are not *async-signal* safe", but the signals we handle with emergency cleanup are synchronous
[07:34] <RAOF> Unless someone sends us SIGsomething :)
[07:35] <duflu> Heh, bypass must be awesome if it's still doing my head in a year later
=== doko_ is now known as doko
=== vila_ is now known as vila
=== alan_g is now known as alan_g|lunch
=== alan_g|lunch is now known as alan_g
=== pete-woods is now known as pete-woods-lunch
=== alan_g is now known as alan_g|tea
=== alan_g|tea is now known as alan_g
=== greyback_ is now known as greyback|post
=== pete-woods-lunch is now known as pete-woods
=== greyback|post is now known as greyback
=== chihchun is now known as chihchun_afk
=== dandrader is now known as dandrader|lunch
=== alan_g is now known as alan_g|EOD
=== dandrader|lunch is now known as dandrader
[18:09] <dobey> who would be best to bug to maybe get a crash in libmirclient on application exit on the phone fixed asap today?
[18:10] <AlbertA> dobey: bug #?
[18:10] <AlbertA> dobey: which image# ?
[18:11] <AlbertA> racarr: can you take a look at https://code.launchpad.net/~albaguirre/unity-system-compositor/no-inactivity-handling-desktop/+merge/225537
[18:11] <AlbertA> racarr: it's fairly small :)
[18:11] <dobey> AlbertA: on 111, but looks like it's been happening for a little while. i haven't filed a bug yet, the reports on errors.u.c failed to retrace, so there's no "create a bug" link for them. trying to determine the best way to file the bug
=== renato_ is now known as Guest58058
[18:12] <AlbertA> dobey: so any application exiting crashes?
[18:12] <dobey> should i just file it with the top of the stack trace and a link to the errors?
[18:12] <AlbertA> dobey: what's the fastest way to reproduce?
[18:12] <dobey> AlbertA: any using the mir backend of qt/qml afaict. open clock app, wait a few seconds, and close it, and there should be a crash report in /var/crash/
[18:13] <dobey> https://errors.ubuntu.com/problem/6552ba4342afeb93d20e22711ac36f655cd885d8
[18:14] <dobey> or go to online accounts, then hit back and should result in a crash report too
[18:16] <AlbertA> dobey: I couldn't access that link
[18:16] <AlbertA> dobey: but I'll take a look, if you can submit a bug # that would be great
[18:17] <dobey> AlbertA: sure. should i just copy/paste the top of the stack trace in the bug?
[18:17] <AlbertA> dobey: sure
[18:18] <dobey> ok will do
[18:21] <racarr> AlbertA: Sure
[18:28] <dobey> AlbertA: https://bugs.launchpad.net/ubuntu/+source/mir/+bug/1337481
[18:28] <ubot5> Ubuntu bug 1337481 in mir (Ubuntu) "Crash in libmirclient on app exit on phone" [Undecided,New]
=== dandrader is now known as dandrader|afk
[18:59] <racarr> dandrader|afk: greyback: https://code.launchpad.net/~unity-team/platform-api/devel-for-qtmircompositor/+merge/225320 https://code.launchpad.net/~unity-team/platform-api/devel-for-qtmircompositor/+merge/225320
[18:59] <racarr> err whoops
[18:59] <racarr> but what is up with line 427
[19:01] <racarr> It looks like, state_before_hiding is being used to save the state across minimize state changes
[19:01] <racarr> but why initialize to MAXIMIZED? There must be a default or currentstate
[19:01] <greyback> racarr: indeed. I can't say. Hope dandrader|afk can reply
[19:02] <racarr> anyway good besides that
[19:03] <racarr> if he doesnt come back soon ill leave some comments
[19:03] <racarr> on launchpad
[19:03] <greyback> racarr: please do, and I'll try to fix (daniel off on hols at eod today)
[19:04] <racarr> :) sounds nice
[19:04] <racarr> racarr off on holidays in 51 days
[19:04] <racarr> lol
=== dandrader|afk is now known as dandrader
[21:43] <dandrader> racarr, right, I'm just initializing it to something
[21:44] <dandrader> racarr, the first time you call show(), the state will be set to this value
[21:46] <dandrader> racarr, which is what we want on phablet anyway
[21:46] <dandrader> racarr, "restored" windows make no sense there
[21:48] <dandrader> racarr, and the "beautiful" papi API is not expressive enough at the moment for the user to set the mir surface states properly. eg: "papi::window::show() == mir::surface::set_state(maximized) or mir::surface::set_state(restored)?"
[21:48] <dandrader> racarr, so I don't bother with this sad situation and just let it go maximized, which is what we want
[21:48] <dandrader> and now I'm repeating myself...
[21:49] <dandrader> racarr, so if you want to fix things you would have to rewrite that papi api to be a 1-to-1 mapping of mir
[21:49] <dandrader> racarr, which is a waste of developer time IMHO
[21:51] <dandrader> and would also require changing papi users: ie, qtubuntu
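
The show/hide behaviour dandrader describes could be modelled roughly as below. This is a guess at the shape of the logic, not the code under review in the merge proposal: `SurfaceState`, `Window`, `state()` and `current_state` are hypothetical stand-ins, and only the name `state_before_hiding` comes from the log. It shows why the initial value matters: the first `show()` applies whatever `state_before_hiding` was initialized to.

```cpp
// Hypothetical stand-ins; not the real platform-api or Mir types.
enum class SurfaceState { minimized, maximized, restored };

class Window
{
public:
    // show() has no state argument, so it re-applies the saved state.
    void show() { current_state = state_before_hiding; }

    // hide() remembers the visible state so a later show() can bring it back.
    void hide()
    {
        if (current_state != SurfaceState::minimized)
        {
            state_before_hiding = current_state;
            current_state = SurfaceState::minimized;
        }
    }

    SurfaceState state() const { return current_state; }

private:
    SurfaceState current_state = SurfaceState::minimized;
    // Initial value decides what the very first show() produces.
    SurfaceState state_before_hiding = SurfaceState::maximized;
};
```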
=== mterry is now known as 18VAAOQDS
=== 18VAAOQDS is now known as mterry
