[03:52] <duflu> RAOF: I've landed as much as I safely can, manually. But there's reasonable evidence people stop doing reviews when Jenkins gives spurious "Needs fixing"... https://code.launchpad.net/mir/+activereviews
[03:52] <RAOF> duflu: Yeah, I'm on it.
[03:52] <duflu> Although presently the problem might not be spurious. It might be our fault
[04:20] <rsalveti> AlbertA: https://bugs.launchpad.net/ubuntu/+source/unity-system-compositor/+bug/1359530
[04:29] <RAOF> BAH!
[04:29] <RAOF> Commit before pushing, push before rerunning the tests!
[05:23]  * RAOF does his bit to ensure that the mako test devices stay nice and warm
[06:28]  * RAOF decides that actually landing privatise-all-the-things so that libmirserver25 is coinstallable with libmirserver24 is a better use of time than futzing with mir_install_packages.sh to work around that problem.
[08:53] <alan_g> duflu: @~vanvugt/mir/revert-to-r1831 we think alike - that was the next thing I was going to try. Bisecting via CI is going to be slow though.
[08:53] <duflu> alan_g: Faster than not doing it I guess
[08:55] <alan_g> yeah
[08:56] <duflu> Huh. We release Mir debs for arm64. Is there any hardware other than the build machine we could run these on?
[09:02] <alan_g> Not in my office but I get the impression that we default to releasing for any platform that exists in archive.
[09:11] <alan_g> duflu: I've worked out a way to replicate the symptoms. Just not sure how it can be affecting CI. Got headspace to listen?
[09:14] <alan_g> I start by removing the libmirplatformgraphics.so in lib
[09:14] <alan_g> Then when running e.g. mir_integration_tests the ambient version of libmirplatformgraphics.so is picked up.
[09:14] <alan_g> So mir_integration_tests links to a local libmircommon.so.2
[09:14] <alan_g> But the ambient version of libmirplatformgraphics.so links to a global libmircommon.so.1
[09:14] <alan_g> And two different libmircommon.so.* both try to register with libprotobuf
[09:15] <alan_g> And that gives us "libprotobuf ERROR..."
[09:16] <alan_g> The question therefore is: could we pick up the wrong libmirplatformgraphics.so in CI?
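A minimal sketch of the conflict alan_g describes, not Mir code: given a made-up map from each loaded binary to the sonames it links against, flag any library that shows up at more than one version. The function name `conflicting_sonames` and the sample data are hypothetical; only the library names come from the discussion above.

```python
import re
from collections import defaultdict

# Splits "libfoo.so.N" into its base name ("libfoo.so") and version ("N").
_SONAME = re.compile(r"^(?P<base>.+\.so)\.(?P<version>[0-9.]+)$")


def conflicting_sonames(links):
    """Return {base: sorted versions} for libraries present at >1 version.

    `links` maps a binary name to the list of sonames it links against.
    """
    versions = defaultdict(set)
    for sonames in links.values():
        for soname in sonames:
            match = _SONAME.match(soname)
            if match:
                versions[match.group("base")].add(match.group("version"))
    return {base: sorted(v) for base, v in versions.items() if len(v) > 1}


# Hypothetical data mirroring the scenario: the test binary links the locally
# built libmircommon, the system-installed graphics platform links the old one.
links = {
    "mir_integration_tests": ["libmircommon.so.2"],
    "libmirplatformgraphics.so (system)": ["libmircommon.so.1"],
}
print(conflicting_sonames(links))
```

In this toy model a non-empty result corresponds to the situation where both copies end up in one process and each tries to register with libprotobuf.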
[09:16] <duflu> alan_g: Sounds obvious then if there are two libmircommon.so*
[09:16] <duflu> alan_g: Oh, ambient means system installed?
[09:17] <alan_g> yea
[09:18] <duflu> alan_g: This sounds like a continuation of that "how did USC build when it should have failed" discussion. The CI system is probably using too much of the installed system over the new Mir being built and tested.
[09:18] <alan_g> I'm tempted to revert -c 1846 to see if that picks up the new libmircommon
[09:18] <duflu> Can we force the environment of phablet-test-thingy to look in the newly built lib/ first?
[09:18] <duflu> Because it really should
[09:19]  * duflu is distracted exploring the bleeding edge phone image
[09:19] <alan_g> I'm sure we can. (Might need to bully ci-eng)
[09:20] <duflu> alan_g: Similarly, teach them to set up PKG_CONFIG_PATH environment correctly for USC/qtmir silo builds
[09:20] <duflu> If that is them
[09:21]  * alan_g isn't sure if there is a "them" and not loads of well meaning passers by
[09:27]  * duflu assumes there are CI Gnomes somewhere who know how things work
[09:29]  * alan_g used to think that until he went spelunking through some build problems
[09:30] <alan_g> Anyway, I've got my day ahead to pursue this idea. Thanks for listening
[10:37]  * alan_g decides trying to dlopen "libmirplatformgraphics.so" is probably a bad idea when we ought to care about getting the one we're testing.
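One way to read that decision, sketched under assumptions: resolve the library to an absolute path inside the build tree and refuse to fall back, rather than handing a bare name to the loader and letting its search path decide. `platform_library_path` and the `lib` directory layout are hypothetical; an actual fix would live in Mir's C++ code.

```python
import tempfile
from pathlib import Path


def platform_library_path(build_dir, name="libmirplatformgraphics.so"):
    """Return the absolute path of `name` under `build_dir`/lib.

    Raises FileNotFoundError instead of falling back to a system copy, so a
    missing file or broken link fails loudly rather than loading another build.
    """
    candidate = (Path(build_dir) / "lib" / name).resolve()
    if not candidate.is_file():
        raise FileNotFoundError(f"{name} not found in build tree: {candidate}")
    return candidate


# Demo against a throwaway directory standing in for a build tree.
with tempfile.TemporaryDirectory() as build_dir:
    lib_dir = Path(build_dir) / "lib"
    lib_dir.mkdir()
    (lib_dir / "libmirplatformgraphics.so").touch()
    found = platform_library_path(build_dir)
    print(found.name, found.is_absolute())
```

The returned path could then be handed to the loader directly (for example `ctypes.CDLL(str(found))` in Python, or `dlopen` with the full path in C++), which skips the library search path entirely.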
[12:22] <anpok_> alan_g: i was just able to reproduce the issue
[12:22] <anpok_> .. well .. i did not try to ..
[12:25] <anpok_> and in my case it is caused by broken relative link support.. so it cannot pick libmirplatformgraphics.so so i guess it takes the older one in /usr/lib ..
[12:32] <anpok_> alan_g: just as you said: because it opens libmirplatformgraphics.so and fails to do so because of the broken link, it pulls in the system's libmircommon: http://paste.ubuntu.com/8105879/
[12:51] <alan_g> anpok_: That gives the same symptoms - I've yet to prove it is the same thing as happens in CI.
[12:52] <anpok_> hmm yes .. it could be because, as you said link against the system mircommon .. or like in the vm load the wrong platformgraphics.so hmm
[12:53] <alan_g> anpok_: it is a very plausible theory. (And I believe it.) But I'm awaiting proof from CI before claiming it as a solution.
[13:04] <alan_g> anpok_: this "broken relative link support" - is that specific to the emulator, or could it be affecting phone images?
[13:04] <alan_g> Specifically the makos in CI
[13:11] <anpok_> alan_g: specific to the 9p virtual file system.. it seems to fail exposing links to the virtual machine
[13:13] <alan_g> Thanks. I'm still trying to understand the exact failure mechanism in CI.
[14:53] <robotfuel> kgunn: can you have someone look at this mir bug, I am not sure if it's qtubuntu, mir, or platform api https://bugs.launchpad.net/mir/+bug/1358874
[14:55] <kgunn> AlbertA2: camako either of you wanna have some pointer tracking "fun" :)
[14:56] <kgunn> or anpok_ even ^
[14:56] <kgunn> volunteers ?
[14:57] <kgunn> robotfuel: so that error report says ubuntu-system-settings, but do we know anything more ? like what was happening? opening or closing or just using settings at the time ?
[14:58] <camako> kgunn, sure
[14:59] <kgunn> camako: looks like the startup phase of u-s-s....
[15:00] <robotfuel> kgunn: no I'll get video of it today
[15:00] <kgunn> robotfuel: cool...the trace suggests startup of settings...but video would be awesome
[15:14] <robotfuel> kgunn: I also have this one that needs someone assigned to it. I am not sure who it should be. https://bugs.launchpad.net/barajas/+bug/1359270
[15:17] <AlbertA2> robotfuel: I can take a look at that
[15:25] <robotfuel> AlbertA2: what's your launchpad id?
[15:32] <AlbertA2> albaguirre
[15:42] <mibofra> AlbertA2, hi. I think my setup is more or less like an unflipped image of utouch, so everything should work ok. At this point I think it's a driver issue.
[15:43] <AlbertA2> mibofra: umm any idea which driver call may be causing the segfault?
[15:43] <AlbertA2> or which opengl/egl call?
[15:45] <mibofra> no, I have to debug it AlbertA2. Keep in mind that I'm on cm11; as far as I can see in the porting guide, the android base used is essentially cm10.1
[15:46] <AlbertA2> mibofra: which android version do you have?
[15:46] <AlbertA2> 4.4.1? 4.2?
[15:46] <mibofra> 4.4.4
[15:47] <mibofra> yeah the 4.4.4 I've checked
[15:47] <AlbertA2> I wonder if they have changed the opengl dispatcher table again...
[15:48] <AlbertA2> there could be subtle TLS bugs
[15:48] <AlbertA2> when using gnulibc vs bionic
[15:52] <mibofra> AlbertA2, it's to be checked
[16:05] <kgunn> thanks AlbertA2 for diving in on that one
[16:47] <mibofra> AlbertA2, I've an idea, or well I think it was activated but It's deactivated every reboot of the phone
[16:48] <mibofra> the trace of opengl on logcat on the dev options
[16:53] <mibofra> ok AlbertA2 there is anything new
[17:07] <mibofra> ok AlbertA2 I've asked, maybe you're right, there's a chance the table was changed from cm10.1 to 11
[17:14] <mibofra> AlbertA2, (I'm checking the code on github) if the problem is the opengl dispatcher table, can you (the devs/engineers of the mir team) do anything, or can you do nothing because officially you're using cm10.1 as base?
[17:23] <anpok_> re
[17:24] <anpok_> kgunn: saw my nick hilighted, have I been volunteered?
[17:39] <AlbertA2> mibofra: we are officially using 4.4.1 now
[17:39] <mibofra> ok
[17:49] <mibofra> anyway AlbertA2 if the dispatcher table of opengl?
[17:49] <AlbertA2> the main thing there is the optimization
[17:49] <AlbertA2> where they use TLS to store the pointer and hardcode it
[17:50] <AlbertA2> you can check
[17:50] <AlbertA2> if the TLS is being corrupted
[17:51] <AlbertA2> but I doubt it, since you can get the driver working on the server side...
[17:51] <AlbertA2> but I don't know if there's some interplay in the client thread context with TLS that we are missing
[17:52] <camako> anpok, it was about https://bugs.launchpad.net/mir/+bug/1358874 ... I took it on. No action on your part.
[18:18] <mibofra> AlbertA2, I'm going to have dinner, cm10.1 https://github.com/CyanogenMod/android_frameworks_native/blob/cm-10.1/opengl/libs/EGL/egl_tls.h https://github.com/CyanogenMod/android_frameworks_native/blob/cm-10.1/opengl/libs/EGL/egl_tls.cpp , cm11 https://github.com/CyanogenMod/android_frameworks_native/blob/cm-11.0/opengl/libs/EGL/egl_tls.h https://github.com/CyanogenMod/android_frameworks_native/blob/cm-11.0/opengl/libs/EGL/egl_tls.cpp I've made
[18:18] <mibofra> 2 diffs: egl_tls.cpp diff http://paste.ubuntu.com/8108125/ egl_tls.h diff http://paste.ubuntu.com/8108127/ . See you after dinner.
[18:26] <kgunn> anpok_: nah, you're good
[18:30] <AlbertA2> mibofra: https://code-review.phablet.ubuntu.com/gitweb?p=aosp/platform/frameworks/native.git;a=blob;f=opengl/libs/GLES2/gl2.cpp;h=3134e568fe984315dae7bf3c1fd4714359b7b7ba;hb=refs/heads/phablet-4.4.1_r1
[18:31] <AlbertA2> mibofra: we are using aosp: https://code-review.phablet.ubuntu.com/#/admin/projects/
[18:33] <AlbertA2> mibofra: https://wiki.ubuntu.com/Touch/AndroidDevel
[18:33] <AlbertA2> basically this is the android base we use:
[18:34] <AlbertA2> "repo init -u https://code-review.phablet.ubuntu.com/p/aosp/platform/manifest.git -b phablet-4.4.2_r1"
[18:35] <AlbertA2> that's for the dev version of ubuntu touch
[19:10] <mibofra> AlbertA2, I have to compare. So maybe the different base is the problem
[19:18] <mibofra> AlbertA2, between the source you're using and cm11 I get only this diff: http://paste.ubuntu.com/8108505/
[19:18] <mibofra> for gl2.cpp
[19:19] <AlbertA2> interesting...
[19:20] <mibofra> let's try the egl_tls
[19:25] <mibofra> uhm AlbertA2 there are no differences for egl_tls