/srv/irclogs.ubuntu.com/2014/07/03/#ubuntu-mir.txt

bregma	well, I'm going to try again: desktop Unity 8 has been dead in the water for a while, segfaulting in the Intel driver on uninitialized buffers when the Mir server does a glClear()	00:02
bregma	currently logged as https://bugs.launchpad.net/ubuntu/+source/unity8-desktop-session/+bug/1336854	00:03
ubot5	Ubuntu bug 1336854 in unity8-desktop-session (Ubuntu) "Unity 8 fails to start, segfault in i965_dri.so" [Undecided,New]	00:03
bregma	with an attached stack trace	00:03
bregma	I'm looking for suggestions on what may have changed in the stack in the last week or two, and/or suggestions on how to chase down more information to help narrow the cause down	00:03
RAOF	bregma: Yeah, I noticed that yesterday.	00:03
RAOF	It's going to be a mesa problem.	00:04
bregma	unity-system-compositor continues to run fine, so it could be something wacky higher in the stack, it's under a lot of churn lately too	00:04
RAOF	usc is fine because it's not using the Mir EGL platform.	00:05
RAOF	I suspect that you first saw this on the 23rd?	00:05
RAOF	That's when Maarten uploaded the new mesa.	00:05
bregma	sounds suspicious	00:05
bregma	if I can perhaps revert to an older version, that would point the finger	00:06
RAOF	Give it a whirl.	00:06
RAOF	I'll poke around in the mesa code.	00:06
bregma	FTR, reverting libgl1-mesa-dri to 10.1.3-0ubuntu0.1 fixes the segfault	00:34
RAOF	Thanks. I'm installing a debug build, so it shouldn't be long before working out what's wrong and fixing it.	00:42
RAOF	Ah! There's the problem.	01:34
RAOF	bregma: Mesa 10	01:54
RAOF	bregma: Mesa 10.2.1-2ubuntu3 fixes your problem. Enjoy!	01:55
duflu	racarr: What's a prompt session?	02:57
racarr	duflu: trust sessions	03:07
racarr	;)	03:07
racarr	https://wiki.ubuntu.com/Security/TrustStoreAndSessions	03:07
racarr	sort of explains how the name prompt session	03:08
racarr	comes about	03:08
duflu	racarr: OK then...	03:09
RAOF	Oh, whoops. We appear to call all manner of un-signal-safe functions in the emergency cleanup signal handler.	06:25
RAOF	Including allocations!	06:25
duflu	RAOF: Sounds a bit silly. Incidentally I mentioned we shouldn't ever do such a thing in a branch that's pending... https://code.launchpad.net/~vanvugt/mir/fatal-error/+merge/219471	06:48
RAOF	Yeah, reviewing that is what brought me to look at the emergency cleanup bit.	06:48
duflu	RAOF: Well the latest emergency cleanup stuff I don't think I reviewed either	06:49
duflu	The plate was piled too high in May	06:49
duflu	Hmm Chrome in Trusty has annoyingly bad tearing (diagonal/triangles)... I wonder if that's Chrome or Compiz	06:51
duflu	If it's Compiz then of course we don't care any more	06:52
RAOF	I think it's Chrome; it started with the Aura drop, IIRC.	06:53
* duflu looks up Aura		06:53
duflu	Oh, nice one Google... they really should be able to do better than that	06:54
alf_	RAOF: duflu: where are we doing allocations in the emergency cleanup handlers?	07:12
duflu	No idea	07:12
* duflu points to RAOF		07:12
RAOF	alf_: You assign to a vector; that's at least potentially doing allocation.	07:13
duflu	Oh that will do it	07:13
RAOF	Also, pthreads isn't threadsafe, right?	07:13
duflu	Depending on the method...	07:13
duflu	RAOF: Umm, what?	07:13
RAOF	Ahem.	07:13
RAOF	Signalsafe	07:13
RAOF	'cause we also lock a mutex in there.	07:13
duflu	RAOF: It never used to be... Signals could arrive in _any_ thread but should arrive in just one. So unpredictable but safe I think	07:14
RAOF	Right, but what happens if pthread_mutex_lock gets interrupted by a signal?	07:14
duflu	If you want to determine the thread to use then pthread_kill	07:14
alf_	RAOF: duflu: at vector, right, that's easy to fix I will take a look	07:15
duflu	RAOF: "If a signal is delivered to a thread waiting for a mutex, upon return from the signal handler the thread resumes waiting for the mutex as if it was not interrupted."	07:15
duflu	[http://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_mutex_lock.html]	07:15
RAOF	I'm not sure that covers my concern.	07:16
RAOF	But signal-safety documentation isn't the best :)	07:16
duflu	RAOF: Well that's just the spec. Everyone has their own implementation	07:17
RAOF	No, I mean that it's not clear that documentation covers the case I'm concerned about.	07:17
alf_	RAOF: "The mutex functions are not async-signal safe. What this means is that they should not be called from a signal handler. In particular, calling pthread_mutex_lock or pthread_mutex_unlock from a signal handler may deadlock the calling thread."	07:17
alf_	RAOF: :)	07:17
RAOF	alf_: Ding!	07:18
duflu	Certainly, locking in signal handlers is also bad	07:18
duflu	Then again without such bugs I would have received more core files in my life time than I have	07:18
RAOF	Basically, the set of things you can _safely_ do in a signal handler is poke at memory addresses :)	07:18
RAOF	</hyperbole>	07:19
duflu	I remember plenty of stack traces from customers showing code hanging in a signal handler after a crash :/	07:19
RAOF	Yup.	07:20
RAOF	It's disturbingly easy to deadlock in one.	07:20
alf_	duflu: RAOF: We can drop the locks, making it clear that you are not to call the emergency cleanup while concurrently adding a handler	07:21
alf_	duflu: RAOF: plus the emergency cleanup is a best effort cleanup	07:21
duflu	alf_: If we can guarantee the signal handler is only called once then I guess reduced locking is OK	07:22
RAOF	Well, the signal handler isn't reentrant.	07:22
RAOF	(By default)	07:22
RAOF	But we should ensure that we don't do anything that's highly likely to deadlock there :)	07:22
duflu	Also easy to do by accident though... if someone else asks the signal() function for the existing handler (yours) and calls it	07:23
RAOF	?!	07:23
RAOF	Why on earth would they do that?	07:23
alf_	duflu: RAOF: and if we get rid of the locking requirement we might as well drop the vector copy	07:24
duflu	RAOF: 3rd party libraries or general dealing with bad APIs where you need a signal handler	07:24
RAOF	duflu: Oh, as in wrapping a signal handler?	07:24
duflu	RAOF: I think that's one case	07:25
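
The wrapping case duflu describes can be sketched roughly as follows. This is only an illustration, not code from Mir or any third-party library: the names `sketch::wrap`, `sketch::wrapper`, `sketch::previous` and `sketch::wrapper_calls` are hypothetical, and the sketch assumes the earlier handler was installed without SA_SIGINFO. It shows how a handler can end up being invoked by someone else's handler rather than directly by the kernel.

```cpp
#include <signal.h>

// Hypothetical sketch: a later component wraps whatever handler was
// already installed and forwards the signal to it.
namespace sketch
{
struct sigaction previous{};               // the handler that was installed before ours
volatile sig_atomic_t wrapper_calls = 0;   // only here so the effect can be observed

extern "C" void wrapper(int sig)
{
    wrapper_calls = wrapper_calls + 1;

    // Assumption: handlers installed with SA_SIGINFO are not forwarded,
    // because they would have to be called through sa_sigaction instead.
    if (previous.sa_flags & SA_SIGINFO)
        return;

    // Forward only to a real function, not to SIG_DFL or SIG_IGN.
    if (previous.sa_handler != SIG_DFL && previous.sa_handler != SIG_IGN)
        previous.sa_handler(sig);
}

// Installs wrapper() for sig and remembers the old disposition in previous.
bool wrap(int sig)
{
    struct sigaction sa{};
    sa.sa_handler = wrapper;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, &previous) == 0;
}
}
```

After `sketch::wrap(SIGUSR1)`, a single `raise(SIGUSR1)` runs the wrapper and then the original handler, so the original handler cannot assume it was entered straight from signal delivery.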
duflu	alf_: If the vector never gets too big then an array is fine	07:25
RAOF	alf_: Yup. We could, of course, make it a signal-safe data structure, but given that it's basically write-once I think we can happily read without locking.	07:26
duflu	If it's write-once and that's guaranteed to happen before the read there is no race. And Helgrind etc will be happy without locking	07:28
alf_	duflu: RAOF: it says "The mutex functions are not async-signal safe", but the signals we handle with emergency cleanup are synchronous	07:32
RAOF	Unless someone sends us SIGsomething :)	07:34
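
A minimal sketch of the direction discussed above (no mutex, no vector copy, a write-once table that is filled before the handler is installed). This is not Mir's actual emergency cleanup code: every name in the `sketch` namespace is hypothetical, the table size of 8 is arbitrary, and it assumes all registration happens on one thread during startup, before `install()` is called.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <signal.h>

// Hypothetical sketch of a lock-free, allocation-free cleanup registry.
namespace sketch
{
using Handler = void (*)();                  // plain function pointers, no std::function
constexpr std::size_t max_handlers = 8;      // arbitrary fixed capacity

std::array<Handler, max_handlers> handlers{};  // fixed storage, never reallocates
std::size_t handler_count = 0;

// Assumption: called only during startup, from one thread, before install().
// That ordering is what makes the unlocked reads in run_handlers() race-free.
bool add_handler(Handler h)
{
    assert(h != nullptr);
    if (handler_count == max_handlers)
        return false;
    handlers[handler_count] = h;
    ++handler_count;
    return true;
}

// Takes no locks and allocates nothing. The registered handlers must
// themselves stick to async-signal-safe operations.
void run_handlers()
{
    for (std::size_t i = 0; i != handler_count; ++i)
        handlers[i]();
}

extern "C" void on_fatal_signal(int sig)
{
    run_handlers();
    // SA_RESETHAND (see install) has already restored the default action,
    // so re-raising lets the process terminate in the usual way.
    raise(sig);
}

bool install(int sig)
{
    struct sigaction sa{};
    sa.sa_handler = on_fatal_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;   // one-shot handler
    return sigaction(sig, &sa, nullptr) == 0;
}
}
```

Typical use would be to call `sketch::add_handler(...)` for each cleanup step during startup and then `sketch::install(SIGSEGV)` (and similar) once. As alf_ notes, the cleanup stays best effort either way.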
duflu	Heh, bypass must be awesome if it's still doing my head in a year later	07:35
=== doko_ is now known as doko
=== vila_ is now known as vila
=== alan_g is now known as alan_g\|lunch
=== alan_g\|lunch is now known as alan_g
=== pete-woods is now known as pete-woods-lunch
=== alan_g is now known as alan_g\|tea
=== alan_g\|tea is now known as alan_g
=== greyback_ is now known as greyback\|post
=== pete-woods-lunch is now known as pete-woods
=== greyback\|post is now known as greyback
=== chihchun is now known as chihchun_afk
=== dandrader is now known as dandrader\|lunch
=== alan_g is now known as alan_g\|EOD
=== dandrader\|lunch is now known as dandrader
dobey	who would be best to bug to maybe get a crash in libmirclient on application exit on the phone fixed asap today?	18:09
AlbertA	dobey: bug #?	18:10
AlbertA	dobey: which image# ?	18:10
AlbertA	racarr: can you take a look at https://code.launchpad.net/~albaguirre/unity-system-compositor/no-inactivity-handling-desktop/+merge/225537	18:11
AlbertA	racarr: it's fairly small :)	18:11
dobey	AlbertA: on 111, but looks like it's been happening for a little while. i haven't filed a bug yet, the reports on errors.u.c failed to retrace, so there's no "create a bug" link for them. trying to determine the best way to file the bug	18:11
=== renato_ is now known as Guest58058
AlbertA	dobey: so any application exiting crashes?	18:12
dobey	should i just file it with the top of the stack trace and a link to the errors?	18:12
AlbertA	dobey: what's the fastest way to reproduce?	18:12
dobey	AlbertA: any using the mir backend of qt/qml afaict. open clock app, wait a few seconds, and close it, and there should be a crash report in /var/crash/	18:12
dobey	https://errors.ubuntu.com/problem/6552ba4342afeb93d20e22711ac36f655cd885d8	18:13
dobey	or go to online accounts, then hit back and should result in a crash report too	18:14
AlbertA	dobey: I couldn't access that link	18:16
AlbertA	dobey: but I'll take a look, if you can submit a bug # that would be great	18:16
dobey	AlbertA: sure. should i just copy/paste the top of the stack trace in the bug?	18:17
AlbertA	dobey: sure	18:17
dobey	ok will do	18:18
racarr	AlbertA: Sure	18:21
dobey	AlbertA: https://bugs.launchpad.net/ubuntu/+source/mir/+bug/1337481	18:28
ubot5	Ubuntu bug 1337481 in mir (Ubuntu) "Crash in libmirclient on app exit on phone" [Undecided,New]	18:28
=== dandrader is now known as dandrader\|afk
racarr	dandrader\|afk: greyback: https://code.launchpad.net/~unity-team/platform-api/devel-for-qtmircompositor/+merge/225320https://code.launchpad.net/~unity-team/platform-api/devel-for-qtmircompositor/+merge/225320	18:59
racarr	err whoops	18:59
racarr	but what is up with line 427	18:59
racarr	It looks like, state_before_hiding is being used to save the state across minimize state changes	19:01
racarr	but why initialize to MAXIMIZED? There must be a default or currentstate	19:01
greyback	racarr: indeed. I can't say. Hope dandrader\|afk can reply	19:01
racarr	anyway good besides that	19:02
racarr	if he doesnt come back soon ill leave some comments	19:03
racarr	on launchpad	19:03
greyback	racarr: please do, and I'll try to fix (daniel off on hols at eod today)	19:03
racarr	:) sounds nice	19:04
racarr	racarr off on holidays in 51 days	19:04
racarr	lol	19:04
=== dandrader\|afk is now known as dandrader
dandrader	racarr, right, I'm just initializing it to something	21:43
dandrader	racarr, the first time you call show(), the state will be set to this value	21:44
dandrader	racarr, which is what we want on phablet anyway	21:46
dandrader	racarr, "restored" windows make no sense there	21:46
dandrader	racarr, and the "beautiful" papi API is not expressive enough at the moment for the user to set the mir surface states properly. eg: "papi::window::show() == mir::surface::set_state(maximized) or mir::surface::set_state(restored)?"	21:48
dandrader	racarr, so I don't bother with this sad situation and just let it go maximized, which is what we want	21:48
dandrader	and now I'm repeating myself...	21:48
dandrader	racarr, so if you want to fix things you would have to rewrite that papi api to be a 1-to-1 mapping of mir	21:49
dandrader	racarr, which is a waste of developer time IMHO	21:49
dandrader	and would also require changing papi users: ie, qtubuntu	21:51
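
The behaviour dandrader describes can be sketched like this. It is a hypothetical model, not the platform-api or Mir code under review: the `sketch::Window` class, its `State` enum and its method names are assumptions made for illustration. It shows why initializing the saved state to maximized means the first `show()` produces a maximized window.

```cpp
#include <cassert>

// Hypothetical model of saving a window state across hide()/show().
namespace sketch
{
enum class State { restored, minimized, maximized };  // assumed subset of states

class Window
{
public:
    // Leaving the hidden state brings back whatever was saved.
    // On the very first call that is the initial value, maximized.
    void show()
    {
        if (state_ == State::minimized)
            state_ = state_before_hiding_;
    }

    // Remember the visible state so a later show() can return to it.
    void hide()
    {
        if (state_ != State::minimized)
        {
            state_before_hiding_ = state_;
            state_ = State::minimized;
        }
    }

    void set_state(State s) { state_ = s; }
    State state() const { return state_; }

private:
    State state_{State::minimized};               // assumed: hidden until first show()
    State state_before_hiding_{State::maximized}; // "just initializing it to something"
};
}
```

In this model a fresh window goes to maximized on its first `show()`, while a window that was explicitly set to restored and then hidden comes back as restored.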
=== mterry is now known as 18VAAOQDS
=== 18VAAOQDS is now known as mterry
