/srv/irclogs.ubuntu.com/2013/09/10/#ubuntu-mir.txt

mhall119RAOF: hey, I was just going to ask if you had any indication from ickle about that patch rejection, other than what was in the git log and NEWS file00:00
RAOFmhall119: No, I've not.00:00
mhall119I really feel sorry for him, sounds like he's in an unpleasant spot00:00
RAOFI don't know; I don't have a feel for Intel-internal politics.00:01
RAOFrobert_ancell: https://code.launchpad.net/~raof/mir/prebump-abi-for-lifecycle-cookie/+merge/184703 ?00:52
robert_ancellRAOF, cool. I have the u-s-c change in lp:~robert-ancell/unity-system-compositor/app-lifecycle - still feels a bit clunky though00:57
jrrI tried out mir last night (installed unity-system-compositor , open source radeon driver), and the whole display had small black stripes (say, 5px normal, 5x black, 5px normal... across the whole thing)00:57
jrris this known, or should I re-repro and file a bug?00:57
robert_ancellRAOF, so the logic for XMir needs to be: don't grab input devices on startup until you get the lifecycle state set to "mir_lifecycle_state_resumed". Drop them on "mir_lifecycle_state_will_suspend"00:58
robert_ancelljrr, just file a bug and we'll mark it duplicate if there is already one00:58
jrralright, will do00:59
RAOFrobert_ancell: And then tell Mir that it's done, right?00:59
robert_ancellRAOF, yes, once we can do that00:59
robert_ancellRAOF, and if I don't hear from you in a reasonable amount of time I just drop your connection :)01:00
RAOF:)01:00
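
The XMir input rule robert_ancell describes above (no grab at startup, grab on "resumed", drop on "will_suspend") can be sketched as a small state handler. This is only an illustration: the class and method names are hypothetical and are not XMir or libmirclient API; only the two state strings come from the discussion.

```python
from enum import Enum


class LifecycleState(Enum):
    # State names as quoted in the discussion.
    WILL_SUSPEND = "mir_lifecycle_state_will_suspend"
    RESUMED = "mir_lifecycle_state_resumed"


class InputGrabber:
    """Hypothetical stand-in for XMir's input-device handling."""

    def __init__(self):
        # Nothing is grabbed at startup; wait for the first "resumed".
        self.grabbed = False

    def on_lifecycle_event(self, state):
        if state is LifecycleState.RESUMED:
            self.grabbed = True    # safe to take the input devices
        elif state is LifecycleState.WILL_SUSPEND:
            self.grabbed = False   # release them before being suspended
```
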
robert_ancellRAOF, how can I build a local XMir to test this?01:01
RAOF./autogen.sh --prefix=$HOME/.local --enable-xmir && make -j9 && make install01:01
robert_ancellRAOF, have you made the switch from surface focus to app lifecycle on master?01:02
RAOFNo, I haven't yet.01:02
* robert_ancell -> lunch01:02
* RAOF → coffee01:04
ricmmRAOF: no, thats not how the application model is defined01:08
ricmmapplications are given a long timeframe to save their state01:09
ricmmtheres no calling back into the server to signal state saving completion01:09
ricmmits a hard policy scenario, the client has no say in the time/resource constraints associated with its lifecycle01:09
RAOFMy reading of the application lifecycle doc was that the on_application_about_to_stop callback would “3. Wait for timeout or completion then SIGSTOP the process.”01:11
RAOFWhich implies to me that the unity API needs some way to signal completion?01:12
RAOFNot that *client code* would be calling it; we don't expect clients to use libmirclient directly anyway.01:12
RAOFricmm: ^^^01:14
tvoss__RAOF, nope, the client just receives the signal, and is granted a grace period for serialization01:18
tvoss__a.k.a. state preservation01:18
ricmmRAOF: if the document says that then it is wrong, will need to update it01:19
tvoss__ricmm, taking an AI to update01:19
ricmmthanks01:19
tvoss__ricmm, might be later my day, though01:20
tvoss__google docs are a bit difficult in this place01:20
ricmmno worries, about time to sleep over here01:21
ricmmRAOF: consider that 3. item broken, tvoss will update document, applications only get signaled of an about_to_suspend() transition01:21
ricmmor a resumed() transition if they are previously-stopped applications01:21
ricmmotherwise they start from 0, its up to applications to restore their state from their serialization targets01:22
ricmmthnks for taking an interest in clarifying the design guidelines01:23
* ricmm -> beer/sleep01:23
RAOFHm.01:44
RAOFI'd prefer if 3. *wasn't* broken, and I'm not sure why we're adamant that it is.01:44
robert_ancellRAOF, what's the document link?02:03
RAOFrobert_ancell: https://docs.google.com/a/canonical.com/document/d/186nT03Jyu_d-GMyJ--8Qp83o1Ey-O1EsWZtwrPGE2TQ/edit02:03
robert_ancellRAOF, hmm, I'm not seeing how this can be completed without the shell being signalled when a client has completed02:09
RAOFrobert_ancell: I think they're just going to not bother with the "until completed"02:13
RAOFrobert_ancell: So instead it's ‘SIGSTOP after $TIMEOUT’02:13
RAOFRather than SIGSTOP after $TIMEOUT or completion, whichever happens first.02:14
RAOFWhich I think is the trivially better behaviour, so I'm not sure why they're adamant that it should not be that way.02:14
robert_ancellRAOF, right, but the component that does the killing is the shell right? Which means it has to know when completion occurs to do the killing. Unless the platform API has a separate thread that does the killing if the function call doesn't return in a sufficient time02:15
robert_ancellI'm not sure how this is all implemented02:15
RAOFrobert_ancell: No, it can just assume that completion has occurred by $TIMEOUT.02:15
RAOFie: the shell sends the "about_to_suspend" signal, and then in 10 seconds SIGSTOPs the client.02:16
robert_ancellRAOF, oh, I guess the process will quit before 10s if it is successful, so the "complete" signal is the process termination02:17
robert_ancellRAOF, in our case where we don't actually want the process to quit, it doesn't work so well02:17
RAOFOh, no.02:17
RAOFThe process doesn't quit.02:17
robert_ancellor in the case of the shell, it might not be the process termination but the Mir client connection quitting02:17
RAOFIt just has no completion signal.02:17
robert_ancellRAOF, so the shell has 0 idea if the process is actually suspended?02:18
RAOFCorrect02:18
robert_ancellah02:18
RAOFWell, the shell knows, because it's sent SIGSTOP02:18
RAOFIt has no idea if the process has *saved its state*02:18
RAOFWhich I think is silly.02:19
robert_ancellyes02:19
RAOFSo I'm going to continue hacking on the cookie branch and then land it when they come to their senses.02:19
robert_ancell:)02:19
robert_ancellI don't see any reason not to send the signal back to the shell02:19
RAOFRight.02:19
robert_ancellThe apps won't ever know02:20
RAOFThe shell doesn't have to *wait* for the signal02:20
robert_ancelland it will make debugging a lot easier02:20
RAOFYes.02:20
robert_ancellrather than just guessing if an app actually suspended02:20
RAOFAnd, hell, we'll be able to send SIGSTOP *sooner*02:20
robert_ancellRAOF, does SIGSTOP free up any resources except for CPU?02:21
RAOFAnd give interesting metrics, like ‘your app took 5s to suspend, which is getting close to the 10s timeout…’02:21
RAOFrobert_ancell: No02:21
robert_ancellI suppose it allows the kernel to page out all the memory for that app02:21
RAOFYeah.02:21
RAOFBut the kernel could page out all but one page of the app anyway.02:21
RAOF(Assuming a reasonable app that's blocked when mir_swap_buffers is blocked)02:22
robert_ancelland all apps will be like that :)02:22
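
The behaviour RAOF and robert_ancell argue for ("SIGSTOP after $TIMEOUT or completion, whichever happens first") can be sketched roughly as below. It is a minimal sketch under assumptions: the function name and its callable parameters are hypothetical, the completion signal is modelled as a threading.Event, and stop_process stands in for something like os.kill(pid, signal.SIGSTOP). It is not how the shell or Mir implement suspension.

```python
import threading
import time


def suspend_client(notify_about_to_suspend, stop_process, completed, timeout):
    """Ask a client to save state, then stop it on completion or timeout.

    notify_about_to_suspend, stop_process: hypothetical callables supplied
    by the caller. completed: a threading.Event the client side sets once
    its state is saved. Returns (finished_early, seconds_waited).
    """
    notify_about_to_suspend()
    started = time.monotonic()
    # wait() returns True if the event was set, False if the timeout expired.
    finished_early = completed.wait(timeout)
    # The hard timeout still applies: the process is stopped either way.
    stop_process()
    return finished_early, time.monotonic() - started
```

The returned elapsed time is what would allow the kind of metric RAOF mentions (an app taking 5s against a 10s timeout), and a shell that never looks at the completion event simply degenerates to the timeout-only policy.
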
=== chihchun_afk is now known as chihchun
ricmmRAOF: consider it a no-op to the display server itself03:41
ricmmif you want I can rewrite in a way that the Mir protocol itself needs not understand what the passed message means03:42
RAOFI think it makes perfect sense for the Mir protocol to understand?03:43
RAOFI'm not sure why having a ‘Yo! I've finished this thing you asked me to do’ callback is a bad idea.03:43
ricmmwell I believe saying "until they come to our senses" is out of place03:43
ricmmits irrelevant, this was planned ages ago03:43
RAOFIt doesn't prevent us from having a hard timeout.03:43
ricmmfor functionality scheduled to land for 13.1003:44
ricmmand which has been in place since a long time ago03:44
ricmmMir being the new player in the game03:44
RAOFFor a little context: Robert and I would like for this callback to exist, because it's really useful for unity-system-compositor to know when XMir's done handling the "please suspend" message.03:45
ricmmI dont think you need to care about the timeout or not, the lifecycle policy itself is up to the shell to implement03:45
ricmmnot the display server03:45
ricmm*policy*03:45
RAOFBut the shell is a part of the display server. Not the display server policy, certainly, but the callback gives the shell a mechanism for better policy.03:46
ricmmhmmm03:47
ricmmI'm sorry but the display server is a library that the shell implements03:47
ricmmthe display server imposing non-display-server policy on the shell is wrong03:47
RAOFBut this isn't imposing policy?03:47
ricmmyes, the shell and therefore the system can decide what policy to implement regarding application management03:47
ricmmthe display server != application manager03:47
RAOFIf the shell wants to ignore the completion event that's well within its rights?03:48
ricmmthe shell *defines* the existence of an event at all03:48
ricmmmaybe the mistake was exposing it in Mir in terms of defined semantics03:48
ricmmit should've just been an opaque message passing over a bus03:49
RAOFAn extensible Mir protocol. Yeah, that would have been a good idea :)03:49
ricmmsure, but this is what we have due to lack of assigned man power03:50
ricmmif it doesnt fit the XMir world, extend it03:50
ricmmand make it fit, without breaking the touch world03:50
RAOFSure, that's what I intend to do.03:50
ricmmgreat, but dont extend the lifecycle states, feel free to add extra stuff to the protobuf message itself03:51
ricmmbecause the first interferes with the defined model03:51
RAOFBut the cleanest way to do that is to add a completion event to the lifecycle callback the client passes to libmirclient.03:52
RAOFThat doesn't extend the lifecycle states, nor does it interfere with the model at all.03:52
ricmmthat is not going to happen before post-October planning03:52
ricmmit demands extending a model that has been in place for many months now03:52
RAOFWhat?03:52
RAOFNo it doesn't!03:52
ricmmand it is not something that will be considered for development 4 weeks before delivery date03:52
RAOFI don't see how it extends the lifecycle model?03:53
ricmmwell first of all I dont see why the *display* *server* needs to care about an application having saved its state03:53
ricmmthat sounds like session/application management03:54
ricmmam I wrong to think that?03:54
RAOFNot particularly, but session/application management is also handled through Mir.03:55
ricmm*if* you are doing some sort of session management in Mir to support XMir itself then thats XMir's problem (which only runs legacy applications, afaik)03:55
=== chihchun is now known as chihchun_afk
RAOFThis is true.03:55
=== chihchun_afk is now known as chihchun
RAOFricmm: So, http://bazaar.launchpad.net/~raof/platform-api/update-for-lifecycle-cookie/revision/149 is what this would look like, platform-api side.04:28
=== duflu is now known as duflu|away
robert_ancellricmm, the session management is u-s-c managing its children (i.e. XMir), not the applications running underneath those (i.e. the X clients). It's the same logic as in the shell - when an application is no longer visible, then it should be triggered to suspend. In this case, when you switch sessions the old session is asked to "suspend" and it stops reading input (and potentially could do more if it wanted)05:39
robert_ancellgtg, bye all05:40
alf_RAOF: Hi! Any thoughts on https://lists.ubuntu.com/archives/mir-devel/2013-September/000376.html ?05:41
RAOFalf_: Not really, no.05:44
RAOFAlthough... hm.05:44
RAOFIt's possible that you're running afoul of the name caching that i965 does?05:44
alf_RAOF: ?05:49
RAOFIn intel_context.c, intel_process_dri2_buffer05:50
=== duflu|away is now known as duflu
alf_RAOF: ah, yes, we were trying various things in there with Marteen yesterday but unfortunately didn't fix the problem06:04
RAOFAh, ok.06:04
alf_RAOF: (not to say that the problem isn't there, perhaps we didn't try the right fix :))06:05
RAOF06:05
alf_RAOF: I am just saying that we haven't exhausted the investigation of what could be going wrong in intel_process_dri2_buffer with PRIME fds06:06
smspillazRAOF: does xmir copy the entire root window (as it lives in gpu memory) to a mir surface (which also lives on the gpu)?06:09
smspillazor does it copy and blit each subwindow directly06:09
RAOFEntire root window06:09
duflusmspillaz: Damage rects only06:09
smspillazah, that makes sense06:09
smspillazyep +106:10
RAOFI'd like to subwindow it, actually.06:10
RAOFBut that's a discussion for next week.06:10
smspillazRAOF: that would probably make sense for the noncomposited case06:10
duflusmspillaz: At least the intel DDX seems to loop through the rectangles and take as little as possible each frame06:10
RAOFduflu: The other DDXen are similar, but they copy the bounding-box of the damage06:11
RAOFsmspillaz: Indeed.06:11
dufluClose (and efficient) enough06:11
dufluRAOF: In fact, possibly _better_ for cache performance06:11
RAOFIndeed06:11
smspillazRAOF: I guess that's why some parts of xmir had to live in the ddx06:12
RAOFDoesn't Intel gate on the number of rectangles, though, and do the bounding box given sufficiently many rectangles?06:12
smspillazbecause the 2d accel parts are all different wherever you look06:12
RAOFsmspillaz: Right06:12
dufluRAOF: Not AFAIK... it copies the rects as they're given06:12
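
The difference between copying each damage rect and copying the bounding box of the damage, as discussed above, can be illustrated with a little arithmetic. This is a sketch only: rects are assumed to be non-overlapping (x1, y1, x2, y2) tuples with exclusive lower-right corners, and the helper names are made up for the example rather than taken from any DDX.

```python
def bounding_box(rects):
    """Smallest rectangle enclosing every (x1, y1, x2, y2) rect in rects."""
    return (
        min(r[0] for r in rects),
        min(r[1] for r in rects),
        max(r[2] for r in rects),
        max(r[3] for r in rects),
    )


def area(rect):
    x1, y1, x2, y2 = rect
    return (x2 - x1) * (y2 - y1)


def pixels_copied(rects, use_bounding_box):
    """Pixels touched per frame under each copy strategy."""
    if use_bounding_box:
        return area(bounding_box(rects))
    # Assumes the rects do not overlap, so areas can simply be summed.
    return sum(area(r) for r in rects)


# Two small damaged corners of a 100x100 window.
damage = [(0, 0, 10, 10), (90, 90, 100, 100)]
```

With this worst-case layout the per-rect copy touches far fewer pixels, while for tightly clustered damage the two strategies converge and the single larger copy may be friendlier to the cache, as duflu suggests.
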
smspillazcool06:12
smspillazI was just explaining to someone why xmir stuff had to live in the drivers and wanted to make sure I understood the code correctly06:13
dufluSpeaking of which, I must set up a saucy nouveau today. To figure out which bugs are truly common06:13
RAOFThere's a long-term intent to do a generic xf86-video-mir using Mir's EGL platform and Glamor, but that's a lot more effort and likely to be less performant.06:13
alf_duflu: @https://lists.ubuntu.com/archives/mir-devel/2013-September/000379.html, I think the OP is using fglrx?06:15
smspillazRAOF: as I thought06:16
smspillazRAOF: just to confirm, xwayland works by forcing all windows to live on the cpu right?06:16
smspillazor at least the root window06:16
smspillazerm06:17
smspillaznormal windows actually06:17
smspillazits rootless06:17
smspillaz(so that they can be used as a wl_buffer via shm)06:17
RAOFNo, XWayland also has DDX patches06:17
duflualf_: Possibly a good point, but not true for the existing reporters of that bug06:17
alf_duflu: sure, just for this particular instance06:18
smspillazRAOF: ah okay06:18
smspillazRAOF: I wonder what's to stop us from having ddx patches where all they do is "copy damaged bit from PixmapPtr to fd however you like"06:19
RAOFsmspillaz: It's just that there's exactly one maintained xwayland DDX patch - intel. I wrote the ati and nouveau patches a year ago, and they're somewhat out of date.06:19
duflualf_: Unless it _is_ possible to start Mir with radeon while fglrx kmod is loaded?06:19
RAOFsmspillaz: Well, xwayland *doesn't* copy from the PixmapPtr; it shares the backing BO with weston, and submits damage rects.06:20
alf_duflu: no idea if it's possible...06:21
smspillazRAOF: oh weird06:21
RAOFsmspillaz: Nah, it's perfectly sensible for a client-allocated model.06:21
smspillazI guess that makes sense actually06:22
smspillazbecause xserver was allocating anyways06:22
smspillazI wonder how hard it would be to beat the xserver into accepting some foreign buffer06:22
smspillazprobably quite hard06:22
RAOFNo, pretty easy actually.06:26
RAOFIf we single-buffered in Mir I could totally do that for XMir.06:26
smspillazRAOF: ah, right06:26
dufluRAOF: BTW, single buffering in SwitchingBundle recently became impossible in order to simplify the logic. But I could add it back in easily enough06:29
RAOFI don't think we particularly want to use single-buffering.06:29
dufluTrue06:30
dufluThat's a new one. Unity panel takes up a quarter of the screen height06:36
dufluRAOF: How do I disable i915 and let nvidia/optimus rule?06:47
RAOFduflu: In your bios?06:52
dufluRAOF: No such option. Either intel only, or both intel+nv with intel given control of all outputs except VGA :(06:52
RAOFduflu: You *may* have luck with /sys/kernel/debug/vgaswitcheroo06:53
RAOFBut it's also possible that the nvidia card is *only* hooked up to VGA06:53
dufluRAOF: It certainly looks like nvidia only talks to the VGA port. That's quite annoying and unexpected07:01
RAOFNot entirely unexpected. Hardware muxes cost money.07:02
RAOFAnd suck a bit anyway07:02
dufluNo wonder it cost me so little :/07:02
RAOF(Unless you do fancy things, like Apple do/did)07:02
dufluGreat. Then I still have only intel hardware for saucy/xmir testing.07:02
RAOFExcept for the vga port? :)07:03
dufluRAOF: I particularly needed dual monitor testing07:04
RAOFHm. Less useful.07:04
dufluOookay. Perhaps I need a second desktop and prerequisite electrical upgrades to the house :/07:13
mlankhorstalf_: oh btw are you sure it didn't help things? if so can you please do a strace of the failing process?07:15
mlankhorstwith patches applied07:15
mlankhorstI only care about the failing instance, maybe it will show a clue of what's wrong07:16
alf_mlankhorst: ok, just be sure (because I am applying the patches to a local tree), I only care about the intel changes from the diff, right?07:16
mlankhorstwhat other changes are there in that diff?07:17
alf_various other bits here and there e.g. in the gallium state tracker07:22
mlankhorstoh just some whitespace fixups07:22
alf_mlankhorst: @populating region.name, if we are dealing with prime fd buffers, won't all incoming buffers only have the .fd field populated? Setting the region flink name won't help us avoid recreating the region. Am I missing something?07:24
mlankhorstwhy wouldn't it?07:26
mlankhorstif 2 buffers are equal the flink name would be the same07:27
mlankhorstbut.. maybe strace will help find the issue07:28
dufluRAOF: Did you (or someone) hack xserver-xorg-video-intel to fix initial mode selection?07:29
dufluIt's *different* now07:29
alf_mlankhorst: sure, but the incoming buffers don't have the .name field set, just the .fd field, so they will always compare unequal to the region. Unless, that is, the buffer information is also updated somehow?07:31
mlankhorstalf_: oh like that..07:31
mlankhorstalf_: ugh how could that be the case for dri buffers? o.O07:33
mlankhorstoh right, mir backend is mapped to dri207:34
mlankhorstbah, I'll need to think about it some more.. grr:P07:35
RAOFduflu: Hm, not deliberately? :)07:37
dufluRAOF: Oh, one definite change seems to be that the intel DDX no longer accepts NullRegion (now crashes)07:39
RAOFYeah. We should never be passing in NullRegion, though.07:39
RAOFAre we?07:39
mlankhorstdun dun duuuun07:42
alf_mlankhorst: that sounds ominous :)07:42
dufluRAOF: No, but I may need to as a workaround. Unless I can figure out how to fix the intel code :P07:44
dufluRAOF: Did you investigate https://bugs.freedesktop.org/show_bug.cgi?id=68969 ?07:46
ubot5Freedesktop bug 68969 in Driver/intel "xf86-video-intel 2.99.901 + XMir + multimonitors = all displays black" [Normal,Resolved: notourbug]07:46
RAOFduflu: I did have a look, but didn't get as far as reproducing.07:47
RAOFduflu: I pulled the latest xmir patch from ickle's branch into the latest Ubuntu package, though, so it's some other change in the tree breaking it.07:50
alf_RAOF: https://github.com/RAOF/mesa/pull/407:56
RAOFalf_: Hm. What frees dri2_surf in that case?07:58
alf_RAOF: dri2_surf is just a cast of surf to dri2_egl_surface, they are the same thing08:00
RAOF...urgh. Quite true!08:00
mlankhorstalf_: hm i have a fix for i965, i think08:01
* alf_ is excited...08:01
mlankhorsthttp://paste.debian.net/37818/08:03
mlankhorstno idea if it works though or if the fd is correct ;;p08:05
alf_mlankhorst: thanks, will check08:06
dufluRAOF: It *looks* like the intel DDX is tracking its own damage per-pixmap, and XMir's multi-monitor optimization of only submitting outputs/pixmaps when dirty is confusing it. Any ideas? I'm going round in circles08:26
alf_mlankhorst: no luck, bug still occurs? Do you want me to get an strace?08:33
mlankhorstdefinitely08:33
alf_mlankhorst: btw, I tried another experiment: I also pass the GEM name with the incoming buffer (name provided by Mir), but use the fd to create the buffer, and setting the name in the region manually with flink (like the previous patches). I still get the bug, which indicates that the core problem may not be (only) in intel_process_dri2_buffer()08:38
mlankhorstalf_: fun :p08:39
mlankhorstalf_: oops, can you set singlesample_mt->region->handle = region->handle in intel_miptree_create_for_dri2_buffer ?08:52
alf_mlankhorst: sure08:52
mlankhorstit would appear I missed that part on importing bo's, so it was still 0 ;)09:00
alf_mlankhorst: https://github.com/afrantzis/mesa/tree/egl-platform-mir-egl-image-i965-experiment09:24
alf_mlankhorst: (https://github.com/afrantzis/mesa.git branch egl-platform-mir-egl-image-i965-experiment)09:24
mlankhorstalf_: do you close the original gbm bo afterwards?09:27
mlankhorstor at any point09:27
alf_mlankhorst: the BOs are closed only when the Mir surface is destroyed09:28
mlankhorstwhat about the pixmap created with dri2_create_image_khr_pixmap09:29
mlankhorstdo you ever close that one?09:29
alf_mlankhorst: also when the surfaces are destroyed (the bo and respective EGL images are created lazily when the compositor/clients needs them the first time).09:30
mlankhorstok but this definitely looks wrong here..09:30
mlankhorst+   dri2_img->dri_image =09:31
mlankhorst+      dri2_dpy->image->createImageFromName(dri2_dpy->dri_screen,09:31
mlankhorst+                                           width,09:31
mlankhorst+                                           height,09:31
mlankhorst+                                           format,09:31
mlankhorst+                                           flink_arg.name,09:31
mlankhorst+                                           stride / 4,09:31
mlankhorst+                                           NULL);09:31
alf_mlankhorst: ok, what is wrong with it?09:31
mlankhorstyou can't use FLINK internally09:32
mlankhorstyou'd need to use the dupimage call, assuming it works09:34
alf_mlankhorst: it doesn't...09:34
mlankhorstwhat happens when you try?09:35
alf_mlankhorst: the call succeeds but I still get errors further down when rendering, even when using GEM names09:35
alf_mlankhorst: let me paste...09:36
mlankhorstalf_: yes probably, but it's more correct than flinking09:36
alf_mlankhorst: here is the strace output with USE_DUP 1 , http://paste.ubuntu.com/6087181/09:39
mlankhorststill more correct09:40
alf_mlankhorst: btw, why can't I FLINK? Note that this is still happening at the EGL platform level, outside any driver specific context.09:40
mlankhorstalf_: because there is no refcounting in drm09:40
mlankhorstif you close 1 handle, everything is invalid. userspace has to explicitly keep track themselves09:40
alf_mlankhorst: I am only getting the global name, I am not opening/closing anything09:42
mlankhorstalf_: you are creating a new representation of the bo through createImage.09:42
mlankhorstalf_: but regardless in this case the problem didn't change.. can you add extra traces to libdrm/intel to the GEM_CLOSE calls?09:45
alf_mlankhorst: I guess that is drm_intel_gem_bo_free(), sure09:48
mlankhorstwith indication of which handle closed, I'm not good enough to understand that part from the ioctl numbers yet :P09:50
alf_mlankhorst: http://paste.ubuntu.com/6087259/, with USE_DUP = 110:01
dufluI give up10:09
=== dandrader is now known as dandrader|afk
mlankhorstalf_: what do close messages look like?10:11
alf_mlankhorst: "drm_intel_gem_bo_free: ..."10:12
mlankhorstalf_: because I see some handles that are being re-created after close10:13
mlankhorsthowever due to flushing they may not be killed right away after destroying10:13
mlankhorstalf_: I think mir needs to be smarter, and cache the fd's. check if it's seen them before or not and if so re-use them..10:17
mlankhorstor better yet10:17
mlankhorstallocate them on the client side and give them to mir for use10:17
alf_mlankhorst: so instead of sending the Prime FD every time, send it once and then send an another id back and forth?10:20
mlankhorstalf_: no, keep the fd cached on the client side, don't close it10:20
=== dandrader|afk is now known as dandrader
alf_mlankhorst: hmm, I think we are already doing this in the client10:30
alan_galf_: I may have missed the context, but we have two client processes in the chain - nested-mir and the application. Both get passed the fd10:36
RAOFYeah, we cache fds.10:39
alf_alan_g: This is about buffer fds used by the final application. (As you know) nested-mir creates the surface/buffers itself.10:43
alan_galf_: Ok then. Sorry for the noise10:44
=== chihchun is now known as chihchun_afk
=== hikiko is now known as hikiko|lunch
=== alan_g is now known as alan_g|lunch
=== hikiko|lunch is now known as hikiko
=== dandrader is now known as dandrader|lunch
kgunnRAOF: you and ricmm all ok ?12:17
kgunn:)12:17
=== dandrader|lunch is now known as dandrader
=== alan_g|lunch is now known as alan_g
alan_galf_: Are you OK with the updated https://code.launchpad.net/~alan-griffiths/mir/spike-nested-input/+merge/184351?13:15
alf_alan_g: approved13:20
alan_galf_: thanks13:20
mlankhorstalf_: anyway the final application is messing up here, nothing the nested mir could do would cause -ENOENT here unless they share the drm fd.. :P13:48
=== jono is now known as Guest88427
=== mzanetti is now known as mzanetti|meeting
alf_mlankhorst: the final application gets the drm fd so it can use it with mesa14:07
alf_mlankhorst: and handles everything through it (and libmirclient)14:07
alf_mlankhorst: through it == mesa14:08
mlankhorstalf_: yes, but is the same fd shared between mesa and nested mir?14:15
=== alan_g is now known as alan_g|tea
alf_mlankhorst: yes (of course not with the same fd number, but the same underlying file)14:21
alf_mlankhorst: host mir, nested mir and clients/mesa use dup()-ed fds that point to the same drm file instance14:24
=== mzanetti|meeting is now known as mzanetti
kgunndidrocks: ping14:30
didrockskgunn: pong14:31
kgunndidrocks: hey...been a while, hope time off was good14:31
kgunndidrocks: just wanting to know...are you ok with resolution here14:31
kgunnhttps://bugs.launchpad.net/xmir/+bug/122120914:31
ubot5Launchpad bug 1221209 in unity-system-compositor (Ubuntu) "need to establish upgrade action when xmir becomes default" [Undecided,New]14:31
didrockskgunn: was excellent, thanks! In a sprint in Boston this week (so investigating some part of holidays to travel ;))14:31
* didrocks looks14:31
kgunndidrocks: basically...robert recommending reboot on xmir distro roll out14:32
didrockskgunn: yeah, it sounds good (and the right way to implement it)14:33
kgunndidrocks: cool...just wanted to make sure we're ok (before the 11th hour gets here :)14:33
didrockskgunn: but I think people won't reboot every 4 hours for now (as we release every 4 hours)14:33
didrockskgunn: so, it comes back to the discussion about ABI stability, is it coming so that we can remove the "hack" for forcing rebuilds in both u-s-c and unity-mir?14:33
didrocksseems I was disconnected…14:34
didrocks10:33:11      didrocks | kgunn: yeah, it sounds good (and the right way to implement it)14:34
didrocks10:33:29      didrocks | kgunn: but I think people won't reboot every 4 hours for now (as we release every 4 hours)14:34
didrocks10:33:54      didrocks | kgunn: so, it comes back to the discussion about ABI stability, is it coming so that we can remove the "hack" for forcing14:34
didrocks                       | rebuilds in both u-s-c and unity-mir?14:34
didrockskgunn: ^14:34
kgunndidrocks: do you recall if there is a bug for that hack ? if not...i'll log one...and tag it for "make-xmir-default"14:35
kgunnhttps://bugs.launchpad.net/xmir/+bugs?field.tag=make-xmir-default14:35
didrockskgunn: indeed, let me log it as a bug, one sec14:35
=== alan_g|tea is now known as alan_g
kgunndidrocks: oh..thanks...yeah, i think its a seperate issue from the reboot discussion14:35
didrockskgunn: it's linked as because of this, it means that potentially, we force every 4 hours people to reboot14:37
didrocksas we rebuild u-s-c as soon as there is a commit in Mir14:38
kgunndidrocks: true...i see the link...just saying, it deserves its own bug14:38
didrockskgunn: https://bugs.launchpad.net/xmir/+bug/122339314:39
ubot5Launchpad bug 1223393 in XMir "ABI stability of libmirserver" [Undecided,New]14:39
didrockslet me add the tag14:39
didrockskgunn: TBH, I think ABI stability needs to happen in the incoming 2 weeks14:40
didrocksseeing how far we are, we need to try to stabilize now14:40
kgunndidrocks: no doubt14:40
=== dandrader is now known as dandrader|afk
=== dandrader|afk is now known as dandrader
=== tkamppeter__ is now known as tkamppeter
hikikobye15:12
mlankhorstalf_: ARRRRRRRGHHHHHH15:19
mlankhorstARGHH15:19
mlankhorstbad alf_15:19
mlankhorstbad!15:19
alf_mlankhorst: ?15:19
mlankhorstalf_: if any client closes the bo, they will be closed for all clients..15:20
mlankhorstyou can't dup the drm fd for that reason15:20
alf_mlankhorst: but open()-ing a new drm_fd works?15:23
mlankhorstyes15:23
alf_mlankhorst: and possibly sending it over a unix socket?15:23
mlankhorstit will, but the dup is why it fails15:24
mlankhorstif the nested mir closes their bo it was closed on the other one too :P15:24
mlankhorstcausing the -ENOENT out of nowhere15:24
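
mlankhorst's point rests on the difference between dup() and open(): a dup()-ed descriptor refers to the same open file as the original, so per-open-file state is shared, whereas a fresh open() gets its own. The sketch below shows that with an ordinary temporary file on a POSIX system, using the file offset as the visible shared state. It is an analogy only and does not touch DRM; that GEM handles behave as per-open-file state in the same way is taken from the discussion above.

```python
import os
import tempfile


def shared_state_demo():
    """Return (offset seen via dup'ed fd, offset seen via freshly opened fd)."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"0123456789")
        tmp.flush()

        fd = os.open(tmp.name, os.O_RDONLY)
        dup_fd = os.dup(fd)                        # same open file as fd
        fresh_fd = os.open(tmp.name, os.O_RDONLY)  # independent open file
        try:
            # Change state through the original descriptor only.
            os.lseek(fd, 5, os.SEEK_SET)
            dup_offset = os.lseek(dup_fd, 0, os.SEEK_CUR)
            fresh_offset = os.lseek(fresh_fd, 0, os.SEEK_CUR)
        finally:
            for descriptor in (fd, dup_fd, fresh_fd):
                os.close(descriptor)
    return dup_offset, fresh_offset
```

The dup()-ed descriptor observes the seek made through the original, while the separately opened one does not, which mirrors why a close in one holder of a dup()-ed drm fd could surface as -ENOENT in another.
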
alf_mlankhorst: ok, I will try to de-dup() and see if that helps. Note, though, that I don't think we are explicitly closing anything during rendering...15:26
mlankhorstmaybe, but it will at least nail the issue down to the process causing it15:26
alf_mlankhorst: ah, sorry for the false alarm, we actually fixed that long ago... we drmOpen() a new fd and send it to clients :/15:29
alf_mlankhorst: but hmm...15:33
alf_mlankhorst: I actually see a dup() in NestedMir... I will get rid of this and check what is going on15:35
alf_mlankhorst: removed stray dup() in nested mir, no change :/16:31
alan_gkgunn: nested input works on android drivers. (still needs more test coverage, but sort of there) - https://code.launchpad.net/~alan-griffiths/mir/missing-links-to-wire-up-input/+merge/18482416:48
kdubalan_g, yay16:49
alan_gkdub: let's forget about this mesa stack. ;)16:50
kgunnalan_g: wow!...i guess input was easier than render16:51
kgunnawesome!16:51
alan_gkgunn: all the input problems were in code we "own"16:52
kdubalan_g, we could table it! http://translate.google.com/#es/en/mesa16:52
alan_gkdub: sadly, in the UK "table it" means something different.16:53
kdubneed to find an idiom translator now...16:53
alan_ghttp://en.wikipedia.org/wiki/Table_(parliamentary_procedure)16:54
alan_gAnyway, a good point for...16:55
=== alan_g is now known as alan_g|EOD
kgunnkdub: so technically mterry could take your android updates+ alan_g|EOD 's input and try to integrate greeter against it16:59
kdubkgunn, sure, it would be easier if its the ubuntu touch shell/greeter (qml apps?) as opposed to lightdm and xmir though17:00
racarrI think I've come up with a way to rebuild message processor + socket session + session mediator17:38
racarrthat will fix this DPMS IPC issue but it's kind of invasive17:38
racarrwondering if I should run with it, or refocus on perhaps doing DPMS for XMir via some out of channel API for the time being17:39
kdubracarr, whats the plan?17:40
racarrkdub: So the essential difficulty is now, that the message reading loop runs like17:41
racarrStep -1: Begin read in constructor17:42
racarrStep 0: Respond to async read17:42
racarrStep 1: Allow the message processor and session mediator to fully handle the message (i.e. then say block on Surface::advance_buffer)17:42
tvoss_racarr, why is that? I thought we tried to avoid blocking event loops and be as parallel as possible?17:43
racarrStep 2: Schedule the next asynchronous read.17:43
racarrtvoss_: No, the way it's architected now we can't read a second message from a client until a first is fully processed17:43
tvoss_racarr, ?17:43
racarrtvoss_: Because things are written as in Steps 0-2 there, we don't17:44
racarrread a second message until we have finished processing the first one completely17:44
racarrwhich we use to keep messages in order17:44
racarrthe problem is not all messages are in the same "channel"17:44
kdubracarr, can't we have a special case on the client side though... "if you are the client that turned off the screen, you'll get an error if you try to swapbuffers before you turn the screen on"17:44
kdub(or did we already consider that?)17:44
racarrand the particular problem is, say you use DPMS to turn off the screen, then call advance buffer17:44
tvoss_racarr, which would be fine, too, if the message handling just handles stuff like dpms asynchronously17:45
racarrand you are17:45
racarrperpetually blocked on advance buffer17:45
racarrso you can never turn17:45
racarrthe screen back on17:45
racarrkdub: It could always be racy I think (because other people can turn on/off the screen...but it's a possibility)17:45
kdubracarr, even the server could turn on the screen, but then it would just be an event sent to clients17:46
tvoss_kdub, I would rather avoid such a hacky approach17:46
racarrkdub: The plan I was developing, was instead to read messages as fast as possible, then the SessionMediator uses a thread pool17:46
racarrto perform the actual operations and returns futures or whatever to the message processor17:46
racarrthe SessionMediator can use, multiple locks17:46
racarrto enforce the different channels17:46
racarri.e. there is a17:46
racarrSessionMediator::display_configuration_lock17:46
racarrand SessionMediator::surface_channel_lock17:46
racarrso, you can't execute resize_surface while swap is still executing17:47
racarrbut you can reconfigure the display, or say receive a display reconfiguration event17:47
racarryou know messages within a "sequence", i.e. all protected by the surface channel lock17:48
racarrwill all be in order, because you don't read the next message17:48
racarruntil you have actually received the std::future from the SessionMediator17:48
racarrat which point you know the lock from that channel is held17:48
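A minimal sketch of the per-channel idea racarr outlines above, under stated assumptions. It swaps in one serial worker thread per channel rather than the thread pool plus SessionMediator::display_configuration_lock and SessionMediator::surface_channel_lock he describes, since that is the shortest way to show the same property: messages within one channel stay in order, while a blocked swap no longer prevents a display reconfiguration. SerialChannel, demo, and the log strings are hypothetical names, not Mir APIs.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: runs posted tasks one at a time, in submission order.
class SerialChannel
{
public:
    SerialChannel() : worker{[this] { run(); }} {}

    ~SerialChannel()
    {
        {
            std::lock_guard<std::mutex> lock{mutex};
            stopping = true;
        }
        ready.notify_one();
        worker.join();
    }

    // Queue a task and hand back a future that completes when it has run.
    std::future<void> post(std::function<void()> task)
    {
        std::packaged_task<void()> packaged{std::move(task)};
        auto done = packaged.get_future();
        {
            std::lock_guard<std::mutex> lock{mutex};
            tasks.push_back(std::move(packaged));
        }
        ready.notify_one();
        return done;
    }

private:
    void run()
    {
        for (;;)
        {
            std::packaged_task<void()> task;
            {
                std::unique_lock<std::mutex> lock{mutex};
                ready.wait(lock, [this] { return stopping || !tasks.empty(); });
                if (tasks.empty()) return;  // stopping and nothing left to run
                task = std::move(tasks.front());
                tasks.pop_front();
            }
            task();  // run outside the lock so post() is never blocked by a task
        }
    }

    std::mutex mutex;
    std::condition_variable ready;
    std::deque<std::packaged_task<void()>> tasks;
    bool stopping = false;
    std::thread worker;  // declared last so the members above exist before it starts
};

// Hypothetical scenario: a swap that blocks until the display is back on,
// followed by a resize on the same channel, and a display request on another.
std::vector<std::string> demo()
{
    std::vector<std::string> log;
    std::mutex log_mutex;
    auto record = [&](std::string entry) {
        std::lock_guard<std::mutex> lock{log_mutex};
        log.push_back(std::move(entry));
    };

    std::promise<void> display_on;
    auto display_on_signal = display_on.get_future().share();

    SerialChannel surface_channel;
    SerialChannel display_channel;

    auto swap = surface_channel.post([&] { display_on_signal.wait(); record("swap"); });
    auto resize = surface_channel.post([&] { record("resize"); });
    auto configure = display_channel.post([&] { record("display on"); display_on.set_value(); });

    swap.get();
    resize.get();
    configure.get();
    return log;
}
```

With a single shared queue the "display on" request would sit behind the blocked swap forever; with two channels it runs, the swap then completes, and the resize still follows the swap.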
racarrI am pretty sure it works, but am nervous about doing it because it changes the entire server side threading model basically17:49
racarrand who knows what that does17:49
racarrI mean I guess hypothetically I should17:49
kdubracarr, yeah, thats why i keep thinking17:49
racarrbut I don't seem to be confident about that :p17:49
kdub'there's got to be some smarts we could put into the client side'17:49
racarrI think it's always a race on the client.17:50
racarralso this may show up other places17:51
racarri.e. two surface clients17:51
racarrif you have to wait until the server responds that you have swapped one buffer before you can begin swapping the next17:51
kdubit's a big change, because client requests go from in-order to out of order17:51
kdubor rather two (or more channels) and we have to do the sync logic in the server at some point17:51
racarrkdub: No, that's the thing with still not reading the next message until the SessionMediator takes some lock for you17:52
racarrit's just that there become separate channels, i.e. display-configuration, and surface17:52
kdubracarr, well, thats the sync i was talking about :) taking the lock17:52
racarrbut messages in the surface channel will still be processed in the order they are sent17:53
racarrah yeah17:53
racarryeah17:53
racarrand it's not super trivial with all the thread pools and such17:53
kdubwith swapbuffers, we currently say 'this might block!'17:54
kdubwe could just say like, 'after calling 'block server/turn screen off' the only thing you can do is some subset of the server functionality'17:55
racarrI think we understand that to mean though, that mir_client_swap_buffers_sync might not return immediately17:55
racarrnot that you can't call any mir_client_ functions17:55
racarrMm. I guess we could say that, especially perhaps to XMir17:56
racarrbut it seems really difficult if you are writing a multithreaded client17:56
kdubwell, the renderthread just knows the swap could wait an arbitrarily long time17:56
kdubit doesn't have to know why17:57
racarrand I worry we will end up with bugs that literally mean the screen turns off and can't come back on :p17:57
racarr?17:57
racarrif xmir accidentally swaps after turning off DPMS17:57
racarrthe swap thread will wait infinitely17:57
kdubracarr, if it has two threads17:57
kduba render thread, and display management thread, it's not a problem17:57
racarrYou mean, it's not a problem as long as the client17:58
racarrnever calls swap_buffers after turning off the display?17:58
racarrI am just not sure it's a good idea to put that requirement on the client when the failure case is17:58
racarrrestart the entire session17:58
kdubright, as long as the client remembers the well-known "swapbuffers can wait a really long time sometimes"17:59
racarr? How does that help you?17:59
racarrIt will wait forever17:59
racarrthere's no way for the system to recover17:59
racarreven if the client uses17:59
racarrasync swap buffers17:59
racarrand the client itself isn't blocking18:00
racarrit can still never turn the display back on.18:00
kdubah, the 'server thread is blocked' problem18:00
racarrYes :(18:00
kdubstill coming back into the problem :)18:00
racarrhaha no worries, it's confusing, I didn't realize this is what was happening on GBM18:01
racarr...for a long time haha18:01
kdubwell, if the client sends 'display off', then 'swapbuffers'18:02
kdubwe know when we receive the swapbuffers command18:02
kdubthat we cannot guarantee that we will ever be able to service the command18:02
kdubso perhaps an error?18:02
racarrkdub: I've been thinking about an error yeah18:04
racarrthe thing is how does the client know when to call swap buffers again18:04
racarrthe client then has to like18:04
racarrcall swap buffers, if error, see if the error was because the display was off18:04
kdubwhen it turns the screen back on, or sees that the screen has been turned back on18:04
racarrif so, update some flag so we watch the display configuration for the display to come back on to start our render loop18:04
racarrthe thing is there will be apps and stuff too who will try and swap buffers while the screen is off, not just the client who has to turn the screen back on18:05
racarrso it has to be a reasonable behavior for them too18:05
kdubwell, those can block18:05
kdubnormally18:05
racarrunless we throw an error :p18:05
racarrin which case they have to decide what to do with it, which can't just be call swap_buffers again because they need to be sleeping while18:06
racarrthe screen is off18:06
kdubjust the client that is turning the screen on and off is the problem one that we have to give an error to18:06
kdubso i think that approach would work, but there's also the reworking of the session mediator18:06
racarrhmm yeah maybe just errors to the one client...18:07
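A rough sketch of the "error to the one client" option discussed above, assuming the server remembers which session turned the display off. DisplayPowerTracker, SessionId, and SwapResult are hypothetical names for illustration, not part of Mir; other clients would still be queued (and block) as they do today.

```cpp
#include <cassert>
#include <optional>

using SessionId = int;  // hypothetical stand-in for a client session

enum class SwapResult
{
    queued,                          // normal path: may block until the display is on
    rejected_display_off_by_caller   // caller turned the display off itself
};

class DisplayPowerTracker
{
public:
    void set_power(SessionId requester, bool on)
    {
        if (on)
            turned_off_by.reset();       // anyone may turn the display back on
        else if (!turned_off_by)
            turned_off_by = requester;   // remember who turned it off first
    }

    SwapResult request_swap(SessionId requester) const
    {
        // Only the session that turned the display off gets an error; blocking it
        // could leave nobody able to turn the display back on.
        if (turned_off_by && *turned_off_by == requester)
            return SwapResult::rejected_display_off_by_caller;
        return SwapResult::queued;
    }

private:
    std::optional<SessionId> turned_off_by;
};
```

The open question raised in the discussion remains: the rejected client still needs some signal for when to resume its render loop, for example by watching the display configuration.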
kdubbecause, although i sorta like it the way it is :) if we have a new scenario we might have to improve it to fit the new scenario18:07
racarrI think it could be improved in general some18:07
kdublike... on new message, make a std::async to service the request18:08
racarrI think you should be able to process messages from the same client18:08
racarrabout different surfaces18:08
racarrconcurrently18:08
racarr(as an example of a general improvement which has kind of similar requirements to this)18:08
kdubracarr, right18:09
kdubthere's a few other ones i'm sure where that would be beneficial18:09
kdubracarr, i'm starting to lean more towards improving the server so that a new message is handled in a future18:12
racarrkdub: Yes I think it should work...I'm worried it might be too big to finish this week though18:14
racarrlots of test redoing, etc.18:14
racarrI just had an interesting idea for a "solution"18:14
racarrif you swap buffers while you turned off the display18:15
racarryou turn it back on XD18:15
kdubracarr, it is a fair amount of test reworking and restructuring18:15
racarrand then xmir tries to do the right thing18:15
racarrbut if it ever fails it's not catastrophic18:15
racarri.e. the session mediator implicitly turns it on for you18:15
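The implicit turn-on idea can be sketched as follows, assuming the mediator tracks the display state and the session that turned it off. ImplicitWakeMediator and SessionId are hypothetical names, and the real SessionMediator interface may look nothing like this.

```cpp
#include <cassert>
#include <optional>

using SessionId = int;  // hypothetical stand-in for a client session

class ImplicitWakeMediator
{
public:
    void turn_display_off(SessionId requester)
    {
        display_on = false;
        turned_off_by = requester;
    }

    // Returns true if this swap had to wake the display first.
    bool swap_buffers(SessionId requester)
    {
        bool const woke = !display_on && turned_off_by == requester;
        if (woke)
        {
            // The client that turned the display off is swapping again,
            // so turn the display back on instead of blocking it forever.
            display_on = true;
            turned_off_by.reset();
        }
        return woke;
    }

    bool is_display_on() const { return display_on; }

private:
    bool display_on = true;
    std::optional<SessionId> turned_off_by;
};
```

A stray swap from the client that turned the display off then costs a spurious wake-up rather than a session that can never turn the display back on.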
kdubracarr, not a bad solution :)18:15
racarrkdub: Mm.18:19
racarrOk thanks for talking through it with me :D I'm going to work on other stuff for a few hours18:19
racarrand then revisit and choose an approach18:19
racarrmaybe wait to sync with Alan in the morning, I bet he has an opinion :D18:20
kdubracarr, no problem... if you want, we could call a hangout on it for tomorrow morning18:21
racarrkdub: Mm good idea, I'll make sure to be up early enough and we can just do it after the standup18:23
mlankhorstalf_: meh new strace then :P18:59
mlankhorstwith the dup removed still18:59
=== seb128_ is now known as seb128
kgunnmornin robert_ancell21:03
robert_ancellkgunn, hello21:03
kgunnrobert_ancell: curious, any joy on fixin the sec bug/vt input shotgun style21:14
robert_ancellkgunn, we had a disagreement with ricmm/tvoss over using the app lifecycle api, so we need to do some more convincing there21:15
kgunnrobert_ancell: we're supposed to be  default on the 19th....i fear we're running out of runway21:15
robert_ancellkgunn, very much so21:15
robert_ancellkgunn, I think we're just going to have to make the API change - it shouldn't affect the shell team21:16
robert_ancelldidrocks, still around?21:19
didrocksrobert_ancell: yeah, I'm in Boston right now in a sprint21:19
robert_ancelldidrocks, oh, cool. All the autolanding is off for mir right? How do we push through releases now?21:20
didrocksrobert_ancell: we just worked around a dns issue that we have been having for the last 3 days21:20
didrocks(an infra issue)21:20
didrockswhen I say just, it's really *just*21:20
robert_ancelldidrocks, oh, they weren't intentionally blocked?21:21
didrocksso at least, we'll have build for dailies21:21
didrocks2 issues :)21:21
didrocks1. intentionally block21:21
didrocks2. dns issues keeping things from even building for the past 3 days21:21
didrockscount 2 as fixed21:21
didrocksfor 1, we need to ensure the current image is fine21:21
didrocksand then doing the unity8+mir transition21:21
didrocksis what is in mir trunk for that goal?21:22
didrocks(the thing you do want to release?)21:22
didrockssorry, but on holidays and then, just back on this sprint, so quite lost on all the things that happened :)21:22
didrocksrobert_ancell: still around?21:25
robert_ancelldidrocks, yep21:26
didrocksdoes my question make sense?21:26
robert_ancelldidrocks, mir trunk should still be going into saucy, otherwise we won't get any critical fixes or features for the phone21:27
robert_ancellbrb, need to restart modem21:29
robert_ancell_blew our 100G cap, just upgraded to 200G. The 64k throttled speed is so unusable...21:32
didrocksrobert_ancell_: ok, let's try to get mir, u-s-c, unity-mir and qtubuntu rebuilt for making the transition21:33
robert_ancell_didrocks, the mirclient3 transition?21:33
=== robert_ancell_ is now known as robert_ancell
robotfuelrobert_ancell_: https://bugs.launchpad.net/bugs/1218381 is happening on saucy, with additional black vertical lines (there is a bug for the black lines)21:34
ubot5Launchpad bug 1218381 in XMir "saucy has horizontal distortion on ati" [High,Incomplete]21:34
robert_ancellrobotfuel, ta21:35
didrocksrobert_ancell: urgh, transition?21:45
didrocksrobert_ancell: no transition right now please21:45
didrocksasac tries to get unity-mir on21:46
asac:)21:47
asacyay!!21:47
asacwe all try to get it in21:47
asacnot me :)21:47
RAOFTransition all the things!23:25
kgunnok23:29
RAOFrobert_ancell: Incidentally, http://bazaar.launchpad.net/~raof/platform-api/update-for-lifecycle-cookie/revision/149 is the platform-api update for our proposed change.23:32
robert_ancellRAOF, yes, saw that23:33
robert_ancellRAOF, any luck convincing ricmm?23:33
RAOFNot really.23:33
robert_ancellRAOF, ricmm, I was also thinking, if the correct behaviour when switching sessions is that the hidden session should suspend its apps the only way this can occur is if the system compositor can notify the shell. So it makes sense in the Unity 8 case as well as the XMir case23:34
robert_ancellRAOF, we should code it up without the cookie for now. That's at least an improvement on the current case. It will have the input overlap issue potentially23:34
RAOFYeah. I'll just push the branch so you can test.23:35
