[02:02] duflu_: https://code.launchpad.net/~raof/mir/dont-ask-downstreams-to-link-against-private-libraries/+merge/241475 is available for your FTBFS-fixing pleasure!
[02:02] RAOF: Ah cool
=== duflu_ is now known as duflu
[02:36] duflu: I think you must have a libmirprotobuf.so symlink lying around somewhere on your filesystem.
[02:36] RAOF: Yes, it was without deb packaging having deleted it
[02:36] RAOF: I'll manually test your branch. If it works still then awesome
[02:37] Feel free to verify!
[02:37] ... and that means something else fixed the bug
[02:37] Quite possibly.
[02:38] Or, alternatively, the bug is not actually a bug but was a local configuration error.
[02:40] RAOF: Doubt it. So far my methodology has raised build faults we otherwise don't discover till silos. I have documented it for Alan and Cemil. Maybe I should share wider
[02:41] Possibly.
[02:41] It's also clearly introduced build faults, so maybe we could improve it :)
[02:41] s/introduced/failed to detect/
[02:41] RAOF: Although yes, I didn't expect our packaging scripts to delete libraries
[02:41] My environment did not replicate that fact
[02:42] They don't delete libraries on the filesystem.
[02:42] Even if you remove the rm from debian/rules your change would still be incorrect, because we didn't *install* libmirprotobuf.so anywhere.
[02:43] (Although once you remove the rm you'd then fail the build because dh_install would tell you that you haven't installed libmirprotobuf.so anywhere, but we deliberately *don't* want to install libmirprotobuf.so)
[02:43] RAOF: Yep, I know. The question is why does it suddenly work going back to zero diff? I need to test
[02:43] * RAOF 's money is on the bug not actually being a bug :)
[02:43] The "completed separation of mirprotobuf" might be related
[02:44] Indeed it could.
[02:44] Oh, right.
[02:44] That's probably it, actually.
[02:44] Oh, no.
[02:44] Maybe.
[02:44] It's plausible :)
[02:47] RAOF: Well, my methodology is more plain.
If it works only because of something in the deb packaging then that's a bug for other distros
[02:47] * duflu is testing
[02:47] It's not that it only works because of something in the deb packaging, it's that it only works because you've got some detritus lying around in your build environment. That's a bug for everyone that's not you :)
[02:48] Ooooh, right.
[02:48] No, I see.
[02:48] RAOF: Unlikely. I'm very clean (always delete and start from fresh) with builds
[02:48] Ah, right.
[02:48] You install your build system-wide?
[02:49] RAOF: No, local installs
[02:49] Ah. So you're never going to pick up problems with the packaging. Fair chop.
[02:49] RAOF: No, I'm intentionally avoiding the slow packaging :)
[02:50] Hm.
[02:50] We could probably make the packaging much faster by supporting nodoc...
[02:55] RAOF: Yep, basic library lookup failure. I am now fixing my method :)
[02:55] Did you fail to set LD_LIBRARY_PATH=$HOME/.local/lib/x86_64-linux-gnu? :)
[02:57] RAOF: Apparently so. And the symbols in our libraries changed dramatically (no versioning to MIRCOMMON now)
[02:59] Always forget that implicit indirect symbol lookup step
[03:00] RAOF: BTW the x86_64-linux-gnu part vanished some time recently. That's nice, but intentional?
[03:00] NoL
[03:01] That would be a bug.
[03:01] Oh.
[03:01] No.
[03:01] RAOF: See: make install
[03:01] That would be an artefact of the way you're building mir :)
[03:01] RAOF: OK, fair point. We could make that the job of packaging
[03:01] Which makes more sense
[03:05] * duflu still likes being able to build Mir and all downstream projects in 10 minutes
[03:05] If only CI could do the same
[03:05] I think with a little tweaking you'll still be able to do that *and* do so in a clean sbuild environment.
[03:07] Yeah maybe. I like purity though. Meaning minimal dependency on tools.
Helps to understand how everything works (or doesn't)
[04:54] duflu: https://code.launchpad.net/~raof/mir/moar-timeouts/+merge/241499 addresses the CI failure, too.
[04:55] Because reliably fast CI infrastructure is for suckers!
[05:47] Hm. There doesn't seem to be any reason to not require setting a pixel format for a surface.
[05:47] There's no codepath where you're like “yeah, I don't care what BPP my surface is, whether it's got an alpha channel or not, and what order the pixels are in”...
[05:59] RAOF: It's a weird use case (e.g. changing video streams while keeping a single surface open), but on the other hand our logic supports switching on the fly perfectly, so why impose limitations?
[05:59] Imposing limitations actually makes it more complex
[05:59] No, that's not what I mean.
[05:59] Sounds familiar :)
[06:00] Even in that case you don't *not care* what the pixel format is.
[06:00] You just want to be able to change the pixel format from the first format you cared about to the second format you care about.
[06:00] * RAOF is perfectly happy to allow changing pixel formats.
[06:02] But you never create a surface going ‘I don't care at all what pixel format this surface has’. Because you're always going to render to it at least once :)
[06:02] Heh. Or maybe we should make the default pixel format 1bpp. That'd show em'!
[06:02] RAOF: Actually that is a use case: Configure a GL window with rough requirements... and then let the system tell you what the chosen pixel format is
[06:03] Except GL doesn't allow that.
[06:04] EGL even. Still... it's in the realm of possibility. And there's more likely a future use for it than a good reason to impose limitations needlessly
[06:04] yet
[06:05] Hm. I guess our EGL layer _might_ be able to do that eventually.
[06:05] I'm not entirely sure that's spec-compliant, though.
[06:05] RAOF: We're in charge of the /Mir/ spec.
The only requirement is that it should support everyone else's specs too
[06:05] Somehow and eventually
[06:06] Right, but if your use-case for ‘maybe we don't need to specify a pixel format’ is ‘it could be helpful for EGL’ and that behaviour is disallowed by the EGL spec, then it doesn't really seem like a reasonable use-case :)
[06:06] To say there will never be a toolkit/app that needs to do something is naive. Better to not close the door on future features until you need to
[06:07] Because it's worse to have to remove a limitation later, and admit the limitation was imposed without good reason in the first place
[06:08] But on the other hand you *cannot* impose a new limitation in future without breaking clients.
[06:08] It's much much less costly to loosen limitations that were initially too stringent than to tighten requirements that were originally too loose.
[06:09] RAOF: Sounds like the lazy man's way of saving and receiving free karma in future. But I think better if the API just works from day one
[06:09] Nobody thanks you when things *just work*. But it's still better that way
[06:11] Just like it's naive to say that there will never be a toolkit/app that needs to do something it's equally naive to say that a shell will never need to enforce some behaviour.
[06:11] And people thank you even less when you break their stuff :)
[06:14] And ‘but the shell can just disallow that behaviour’ is not a response. That means that the shell will not work with existing clients.
[06:16] RAOF: Yeah you can easily argue arbitrary limitations help for future platforms/drivers to fully implement the spec. But there still may never be a reason for the limitation to exist. You never know
[06:17] Yes, that's true.
[06:17] But the consequences for ‘oops, we restricted something too tightly’ are much better than the consequences for ‘oops, we allowed something that we shouldn't have’.
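Editor's sketch of the position argued above (a pixel format is required when a surface is created, but may still be changed afterwards). Every name here (`PixelFormat`, `Surface`, `set_pixel_format`) is hypothetical and chosen for illustration; none of it is taken from Mir's client API.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical names throughout; this is not Mir's API.
enum class PixelFormat { argb_8888, xrgb_8888, rgb_565 };

class Surface
{
public:
    // No default constructor: the caller has to state the first format it
    // cares about before it can render anything.
    explicit Surface(PixelFormat initial) : format_{initial} {}

    // Switching later stays allowed (e.g. changing video streams while
    // keeping one surface open).
    void set_pixel_format(PixelFormat next) { format_ = next; }

    PixelFormat pixel_format() const { return format_; }

private:
    PixelFormat format_;
};

// The "tight" requirement is enforced at compile time. Loosening it later
// (adding a default) would not break callers written against this version.
static_assert(!std::is_default_constructible<Surface>::value,
              "a Surface must be created with an explicit pixel format");

inline void self_check()
{
    Surface surface{PixelFormat::xrgb_8888};
    surface.set_pixel_format(PixelFormat::rgb_565);
    assert(surface.pixel_format() == PixelFormat::rgb_565);
}
```

Going the other way (shipping a default constructor first and removing it later) would stop existing callers from compiling, which is the asymmetry the argument rests on.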
[06:19] My philosophy is conservative in that way: given that we can only think of marginal, niche uses for a feature that will be easy to add later but that will be very expensive to remove if it turns out to be difficult, we should choose the easy to fix problem.
[06:23] RAOF: I would err on the side of effort -- Do you have to write any additional code to limit the functionality in question? If you can't immediately justify why you're doing it then ideally don't
[06:23] That's certainly a reasonable input to the decision, but I (obviously) think it's insufficient.
[06:24] RAOF: You also confuse future maintainers who then ask why the limitation (and test thereof) exists.
[06:24] Not if that's a consistent pattern throughout the code.
[06:24] The presence of a test suggests reason. So I guess if you do have the limitation, at least don't enforce it by test
[06:25] The reason is “we can't think of any reason someone would want to do this, and it might impose significant costs down the line if we allow it”.
[06:25] Then the limitation can be lifted and everything still passes unquestionably
[06:26] RAOF: Sure, just outline why the significant cost might exist
[06:26] I don't think there's much precedent for people wanting to make code less capable after it matures. Such things do happen, but usually only by accident
[06:27] I think there's *lots* of precedent for wishing you didn't let users do things.
[06:27] RAOF: On another issue, if Jenkins fails again I will land your branch by hand
[06:28] Oh, has it died again on the .pc thing?
[06:28] Oh, not yet :)
[06:29] RAOF: OTOH I'm a fan of limited sized arrays to make APIs simple and then worry about expanding them later as required :)
[06:29] * RAOF was hoping to be lazy, let that land, and then his other MPs no longer have extraneous diff :)
[06:29] There's more to debate if the limitation makes things less simple
[06:30] I think we also differ about what makes an API simple :)
[06:31] RAOF: It's mathematically unambiguous. Or so I thought. Then I met Sam who proposed his 100-character long function names made things easier to understand
[06:32] 100 character function names suggest that a refactoring might be necessary :)
[06:32] RAOF: That was a refactoring :S
[06:32] A conceptual refactoring.
[06:33] On the other hand, modern tooling makes the cost of 100 character function names much less. It's possible they're justified in some strange cases :)
[06:34] I think there's a definite difference between horizontal and vertical code-readers. Not sure if there's much literature on the subject. I prefer the mathematical approach of use all the symbols and then invent new ones
[06:34] That makes for a really hard codebase to dip into, obviously.
[06:34] Yeah, actually I prefer a compromise
[06:35] Just not function names that are longer than the width of the window
[06:35] Makes you laugh and cry all at once
[06:43] hi, the Mir Spec wiki page is outdated, e.g. QMir is deprecated. can you update the page to include all the projects associated with mir and give us a picture of the current Mir stack?
[06:45] sgx1: That was bound to happen... We have so far avoided mentioning much about downstream projects but the best place would be to document things in the source code, which then gets mirrored on the web automagically: http://unity.ubuntu.com/mir/
[06:45] That page is correct, except QMir is actually QtMir
[06:49] duflu: CI success!
[06:50] RAOF: Are you sure about QtMir?
I thought that was the server/shell only in there
[06:50] The docs refer to the client port
[06:51] I think?
[06:51] wiki.ubuntu.com/Mir/Spec refers to both client and server side stacks.
[06:51] And just as soon as the wiki lets me login I'll be updating that bit.
[06:54] You know what? It'd be nice if Launchpad updated the MP diff when the merge-into branch changed.
[07:06] RAOF: Yes, but doesn't scale. Our infrastructure can't keep up already
[08:24] Woo, a profiler that *works*
[08:25] (sudo apt-get install google-perftools)
[08:25] * duflu is still amazed there's no adequate open source solution...
[10:34] Argh. OK, so Google's profiler is excellent. However it doesn't understand std::function (since Google doesn't allow such things in their code they don't care) :P And most of Mir's time is in std::function::operator
[10:35] So the best profiler is still only partially useful
[10:35] Can you fix it?
[10:36] alan_g: I can only think of replacing suspicious functors with named functions
[10:36] But that's effort and it's already eating into dinner time
[10:37] is it really std::function (not lambdas) it doesn't understand?
[10:38] alan_g: More to the point it seems to lose caller info for them. Then it wouldn't matter
[10:38] Are we omitting frame pointers?
[10:39] That sounds like it ought to be a compiler option
[10:40] Yeah but we're not optimizing so they should be present
[10:40] * duflu forces it to be sure
[10:41] Google profiler is *useful* simply because it reports real time spent. Not CPU time used.
[10:45] ... or so some results indicate. If you take that literally then some functions that should be present are absent
[10:48] Ah yes, 69% in sleep_for()
[10:53] alan_g: Fun fact: The one consistently worrying result I get each time is that the compositing thread always spends around 30% of Mir's time destroying (releasing and replying about) TemporaryCompositorBuffer.
I can get an instant improvement moving that to a new thread, but it's ugly
[10:54] 20% is the responding with buffers to the client, 10% is wasted by high lock contention in the same code
[10:55] The client calls are typically blocking and completing on the compositing thread? Interesting.
[10:56] We can likely come up with a better design if we're sure that's a bottleneck.
[10:56] alan_g: Unexpected but that's correct. The last ref is released and the callback happens in the compositing thread, for each client
[10:56] 69% in sleep_for() seems rather low
[10:56] alan_g: Other functions sleep also
[10:56] Sure
[10:57] alan_g: The issue right now is, try to be even a little bit clever (like moving the response code into the thread pool) and the additional locking/context switching negates the benefit
[10:57] It's still slow
[10:58] I guessed that. We *could* put the responses into a lock-free queue and have a thread servicing that.
[11:00] Yeah. Only a quick and dirty std::move of the render list into a thread without locking is working. And that's terrible to start a new thread each frame
[11:00] BIAB
=== alan_g is now known as alan_g|afk
=== alan_g|afk is now known as alan_g
[11:24] alan_g: hey, I was digging into unity8 doing proper stack unwinding/emergency cleanup last night, and recalled a notion that you were reworking things to get rid of run_mir and just have server::run/stop. Is that still the case?
[11:25] greyback: yes. I have a spike here: https://code.launchpad.net/~alan-griffiths/qtmir/migrate-to-mir-Server-API/+merge/240451
[11:26] What you want is shown here: tests/acceptance-tests/server_signal_handling.cpp
[11:26] alan_g: ah nice
[11:27] (That's mir trunk)
[11:28] alan_g: yep, that's what I need, thanks
[11:28] yw
[11:28] alan_g: lp:mir is now the development branch?
[11:29] It is (at last)
[11:29] great
[11:29] * greyback deletes
[12:08] good morning
[12:09] I'm trying to figure out the reason for a regression/crash I've got in Ubuntu 14.10 with one of our products
[12:09] the backtrace is like this http://pastebin.com/EqJA0Dy4
[12:10] which is triggered from dl = dlopen ("libclutter-glx-1.0.so", RTLD_LAZY);
[12:10] this is coming from a GStreamer plugin, which is an .so file also opened via dlopen
[12:11] I've tried to reproduce it with no luck from a regular executable
[12:13] it seems similar to https://bugs.launchpad.net/mir/+bug/1358698
[12:13] Ubuntu bug 1358698 in mir (Ubuntu) "[regression] Test failure holding up all merge proposals: "File already exists in database: mir_protobuf_wire.proto"" [Critical,Fix released]
[12:13] and the issue isn't reproducible in Ubuntu 14.04
[12:13] ad-n770: you're on a channel for Mir (see topic). Why do you think your problem is related to Mir? It doesn't appear in your backtrace.
[12:14] /usr/lib/x86_64-linux-gnu/libmircommon.so.2
[12:14] from frame #5
[12:14] ad-n770: sorry, missed that
[12:16] do you know any trick to get debugging info for libmircommon and protobuf ?
[12:18] I'm here for suggestions on how to dig further on the issue
[12:20] mmm, I've to leave for a while now. I'll come back later on this
[12:20] ad-n770: are you intentionally using mir? It isn't the default
[12:38] ad-n770: the problem is happening in the protobuf support library and looks different to the bug you referenced. The latter was caused because protobuf errors if the same description is initialized twice - it doesn't segfault as your backtrace shows.
But I don't know why clutter loads libmircommon (I know gtk has some /optional/ support for mir, but I don't know any details without digging)
=== Trevinho_ is now known as Trevinho
=== dandrader is now known as dandrader|afk
[12:54] I'm not using mir intentionally
[12:55] the GStreamer plugin is Fluendo's VA decoder
[12:56] this is a plugin that does HW accelerated video decoding via XvBA/VDPAU/VAAPI
[12:56] it also provides, besides decoding, GStreamer features for rendering on an XWindow or into a Clutter Actor
[12:57] the plugin uses dlopen in order to enable the features at runtime according to the system capabilities
[12:58] the actual crash comes from this very early discovery when it tries to discover if clutter is available in the system
[12:59] the worst is that it causes nautilus crashes as per a customer report when it tries to generate thumbnails
=== alan_g is now known as alan_g|lunch
[13:02] alan_g|lunch: let's talk about it when you're back
=== dandrader|afk is now known as dandrader
=== alan_g|lunch is now known as alan_g
[14:15] alan_g: I'm back
[14:16] ad-n770: I don't have much for you really. Does uninstalling Mir help? (I'm not suggesting it as a solution, just for adding a data point.)
[14:17] the thing is that I haven't installed mir
[14:17] it's a freshly installed ubuntu 14.10 system
[14:18] and nothing installed on top of it apart from nvidia drivers
[14:19] Mir is installed - although I don't know why. I don't have a fresh 14.10 installation to figure why, but it shouldn't be needed by the default stack
[14:23] ad-n770: alan_g: Could it be that what you are seeing comes from trying to load EGL drivers?
[14:24] maybe, I couldn't track yet the reason for clutter going into mir
[14:25] if I try to sudo apt-get remove libmircommon2 a lot of packages are going to be removed
[14:28] ad-n770: For the debug packages https://wiki.ubuntu.com/DebuggingProgramCrash#Debug_Symbol_Packages
[14:29] * alan_g is curious which packages are installed by default with a dependency on libmircommon2
[14:30] ok, going to try get debug symbols and a better backtrace
[14:34] ad-n770: my bet is on EGL loading drivers to check what it can use, but let's see
[14:39] ad-n770: curious, did you happen to install unity8-desktop-session ??
[14:39] * kgunn missed some of the backscroll
[14:40] ad-n770: hmm, I was wrong, ldd shows that libclutter-glx-1.0.so.0 brings in libmirclient/common
[14:40] kgunn: http://pastebin.com/2ZSZ34EF
[14:41] ad-n770: not directly, though
[14:41] new backtrace with debug info http://pastebin.com/f8uWFJ3d
[14:41] frame 5 is suspicious
[14:41] this=0x0
[14:43] this=0x0 is always suspicious
[14:44] ad-n770: alan_g: so, libclutter-glx-1.0.so.0 depends on libgdk which depends on libmirclient, at least now we know why we load libmircommon
[14:45] ad-n770: alan_g: tells us nothing about the failure, though
[14:46] well I've debug symbols and source code on gdb now
[14:48] mmm, mir_protobuf_wire.pb.cc is generated
[14:48] I don't have source code for this one
[14:49] is there any way to get /build/buildd/mir-0.8.0+14.10.20141010/obj-x86_64-linux-gnu/src/common/protobuf/mir_protobuf_wire.pb.cc ?
[14:50] running sudo apt-get build-dep libmircommon2 now
[14:52] if it helps: generated_database_->Add(encoded_file_descriptor, size)
[14:52] generated_database is NULL
[14:57] ad-n770: That's all deep in protobuf code. It seems that InitGeneratedPool() (through InitGeneratedPoolOnce()) doesn't run, which implies that perhaps protobuf has been setup and torn down in the same process before? Not sure...
[14:58] ad-n770: Is there an easy way to reproduce this locally?
=== dandrader is now known as dandrader|lunch
[15:48] desrt: hey there, so we were talking about the concept of reparenting windows, understand that gtk supports this...but does anyone actually use it?
[15:50] kgunn, desrt is on vac this week
[15:50] just as a fyi
[15:55] ah
[16:01] kgunn: https://searchcode.com/?q=gtk_window_set_transient_for&loc=0&loc2=10000&lan=28&lan=16&lan=15
[16:01] whether those calls are all necessary or not, I don't know
[16:02] right
=== chihchun is now known as chihchun_afk
[16:33] AD-N770: I've managed to reproduce the crash on my intel machine
[16:33] good
[16:34] AD-N770: I will take a deeper look tomorrow
[16:34] ok, thanks
[16:34] in any case I will write a test case too
[16:34] AD-N770: that would be great
[16:36] bleh :P
[16:36] Xmir standalone thingy working
[16:38] kdub: you're not working on lp:1391923? (I need to fix that anyway)
[16:39] alan_g, I took a quick look, but have stopped
[16:39] so no
[16:39] * alan_g grabs it
[16:40] cool
[16:46] * alan_g wishes that C++ didn't think char const* -> bool is a better conversion than char const* -> std::string.
[16:47] Which ultimately, is because string should be a builtin type, not a UDT.
[16:48] +100
=== dandrader|lunch is now known as dandrader
=== chihchun_afk is now known as chihchun
[17:09] kdub: fixed - https://code.launchpad.net/~alan-griffiths/mir/fix-1391923/+merge/241585
[17:27] alan_g, alf_: the issue is reproducible with http://pastebin.com/qzfHa547
[17:27] at the end it seems that our product is doing dlopen twice
[17:27] I will refactor our code to avoid it
[17:27] AD-N770: Your example code also fails when trying to load only libmircommon.so.1 , so it's certainly something wrong there, either because we are not using protobuf right, or because protobuf has a bug.
[17:27] AD-N770: thanks!
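Editor's sketch of the overload-resolution complaint made at 16:46 (char const* preferring bool over std::string). The overload set and function names below are hypothetical and are not the code touched by lp:1391923; they only show the language rule.

```cpp
#include <cassert>
#include <string>

// Hypothetical overload set for illustration; these names are not from Mir.
inline std::string chosen(bool) { return "bool"; }
inline std::string chosen(std::string const&) { return "std::string"; }

// A string literal decays to char const*. Pointer-to-bool is a standard
// conversion, while char const* to std::string goes through a constructor
// (a user-defined conversion), so overload resolution ranks bool higher.
inline std::string from_literal() { return chosen("hello"); }

// Building the std::string explicitly selects the intended overload.
inline std::string from_string() { return chosen(std::string{"hello"}); }

inline void self_check()
{
    assert(from_literal() == "bool");
    assert(from_string() == "std::string");
}
```

One common way around this, when the overload set is under your control, is to add a third overload taking `char const*` that forwards to the `std::string` one, so a literal matches exactly instead of converting.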
[17:29] AD-N770: thanks
[17:34] alan_g: AD-N770: https://bugs.launchpad.net/mir/+bug/1391976
[17:34] Ubuntu bug 1391976 in Mir "Loading libmircommon.so twice leads to a segfault in protobuf code" [Undecided,New]
=== chihchun is now known as chihchun_afk
=== chihchun_afk is now known as chihchun
=== chihchun is now known as chihchun_afk
=== dandrader is now known as dandrader|afk
=== dandrader|afk is now known as dandrader
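Editor's sketch of the load pattern named in bug 1391976. This is not the pastebin test case, whose contents are not reproduced in the log. It assumes that "loading twice" means load, unload, then load again in one process, which is a guess based on the 14:57 remark about protobuf being set up and torn down. The helper name `load_unload_reload` is hypothetical; `dlopen`, `dlclose` and `RTLD_LAZY` are the standard `<dlfcn.h>` interface.

```cpp
#include <cassert>
#include <dlfcn.h>

// Hypothetical helper: load a shared library, unload it, and load it again.
// Returns true only if both loads succeed.
inline bool load_unload_reload(char const* soname)
{
    void* first = dlopen(soname, RTLD_LAZY);
    if (!first)
        return false;

    // If this drops the last reference, the library may really be unloaded
    // and its static destructors run.
    dlclose(first);

    // A fresh load runs the library's static initializers again. Code that
    // assumes one-time initialization per process can misbehave here.
    void* second = dlopen(soname, RTLD_LAZY);
    if (!second)
        return false;

    dlclose(second);
    return true;
}

inline void self_check()
{
    // A library that does not exist fails on the first load.
    assert(!load_unload_reload("libhypothetical-missing.so"));
}
```

On older glibc versions this needs linking with `-ldl`. Calling it with the library named in the bug title would be the experiment; whether it crashes depends on the installed Mir and protobuf versions.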