/srv/irclogs.ubuntu.com/2014/07/10/#ubuntu-mir.txt

=== chihchun_afk is now known as chihchun
=== chihchun is now known as chihchun_afk
=== chihchun_afk is now known as chihchun
=== chihchun is now known as chihchun_afk
=== chihchun_afk is now known as chihchun
=== chihchun is now known as chihchun_afk
=== chihchun_afk is now known as chihchun
RAOFBah! Why does my USB controller suddenly drop dead?05:24
RAOFI _like_ my external keyboard, damnit!05:24
dufluRAOF: Snap. Me too, on random systems05:30
anpok_it works06:41
anpok_now i need to add mir platform support.. hmm06:42
=== Mirv_ is now known as Mirv
dufluHmm why is bzr push suddenly so slow this week?08:19
dufluMy upload speed is unchanged08:19
duflucamako, alan_g: Priority regression fix needs review: https://code.launchpad.net/~vanvugt/mir/fix-1339700-alarm/+merge/22625208:30
alan_gduflu: looking...08:31
dufluI think the safety of dropping the lock early is easy to verify on inspection. Then you just have to be convinced that it's the same issue as shown in the stack traces08:32
dufluAgh, stupid slow uploads. What's wrong with the pipe?08:34
dufluHmm, actually it looks like it's uploading at theoretically max speed. So bzr has somehow been slowed down by the complexity of our branches?..08:35
alan_gduflu: I'm not convinced by "*easy* to verify on inspection" - if the cb can be removed between the unlock and invoking the copy then the called back code may touch objects that no longer exist.08:36
duflualan_g: You're assuming the callback touches its own alarm object. That's actually quite hard to do, and even if you did, you would be re-entering the alarm's mutex (which is not recursive), resulting in a crash/deadlock08:38
dufluActually C++ says "undefined" behaviour, which is sometimes crash or deadlock08:39
duflualan_g: Or rather, the callback is only made within the lifetime of the code that owns the alarm. So you do have confidence in when/how the callback will be made08:41
alan_gduflu: I'm saying that it isn't "easy to verify" that nothing (on any thread) can decide to cancel the alarm and destroy the handler between the unlock and the invoke.08:41
alan_gIt may well be *possible* to verify08:42
alan_gAm looking...08:42
duflualan_g: OK, I'm not sure now either. The right answer is for objects to never have internal locking except to protect threads they themselves have created. But that's a larger architectural change08:44
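[Editor's sketch] The "drop the lock early" pattern debated above can be illustrated roughly as follows. This is a minimal sketch, not the code from the merge proposal: `SketchAlarm` and `run_self_cancelling_alarm` are hypothetical names. It shows why invoking a copy of the callback outside the mutex avoids re-entering a non-recursive mutex. It does not address alan_g's concern that objects the callback touches could be destroyed between the unlock and the invoke.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical stand-in for an alarm; not Mir's actual class.
class SketchAlarm
{
public:
    explicit SketchAlarm(std::function<void()> cb) : callback{std::move(cb)} {}

    // Called by the timer: copy the callback under the lock, invoke it unlocked.
    void fire()
    {
        std::function<void()> copy;
        {
            std::lock_guard<std::mutex> lock{mutex};
            if (cancelled)
                return;
            copy = callback;
        }
        // The mutex is released here, so the callback may call cancel()
        // without re-entering the non-recursive mutex.
        copy();
    }

    void cancel()
    {
        std::lock_guard<std::mutex> lock{mutex};
        cancelled = true;
    }

private:
    std::mutex mutex;
    std::function<void()> callback;
    bool cancelled{false};
};

// Returns how many times the callback ran when it cancels its own alarm.
int run_self_cancelling_alarm()
{
    int calls = 0;
    SketchAlarm* self = nullptr;
    SketchAlarm alarm{[&] { ++calls; self->cancel(); }};
    self = &alarm;
    alarm.fire();  // runs the callback, which cancels the alarm
    alarm.fire();  // no-op: the alarm is already cancelled
    return calls;
}
```

Had `fire()` held the mutex while calling the callback, the `cancel()` inside the callback would lock the same `std::mutex` twice on one thread, which is undefined behaviour as duflu notes.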
alf_camako: alan_g: duflu: top-approving https://code.launchpad.net/~thomas-voss/mir/explicit-gcc-version/+merge/226140 unless someone objects soon08:50
duflualf_: *shrug*08:50
alan_galf_: no objection08:50
alf_@duflu's "Hmm why is bzr push suddenly so slow this week?" -> something changed and bzr is uploading an awful lot even for small branches: just uploaded 40M for a 6K diff :/09:05
alan_galf_: I *think* it is because we lost lp:mir and the uploads are diffs against that09:06
alf_alan_g: hmm, so lp is not using stacked branches for our new branches...09:07
alan_galf_: it is a guess based on the delays seen when we diverge from lp:mir not on research into how bzr works09:08
alan_galf_: camako as duflu has gone can we have another opinion on https://code.launchpad.net/~mir-team/mir/fix-1339700/+merge/226233 with a view to top-approving?09:10
camakoalan_g, sure...09:10
=== Eisbrecher_xnox is now known as xnox
alf_camako: ^^ and we also need to fix the bzr upload issue... we can't be pushing 40M for each new branch09:12
camakoalf_, okay09:13
greybackI think you can use bzr push --stacked-on=lp:something for now. But not having lp:mir is extremely confusing!09:19
alan_ggreyback: thanks, that makes sense.09:20
anpok_alan_g, alf, camako: I have a third solution09:23
anpok_but not ready yet09:24
alan_ganpok_: solution to which discussion?09:24
anpok_deadlock09:24
alan_ganpok_: lp:1339700?09:25
anpok_yes09:25
anpok_calling timer callback without a lock, and ensuring sequential ordering of timer callback execution and eventual canceling/reconfiguration09:25
anpok_havent worked on it since I do the qxl mesa/kernel stuff09:26
=== alan_g is now known as alan_g|tea
anpok_https://code.launchpad.net/~andreas-pokorny/mir/synchronous-cancel-of-alarms/+merge/22453009:32
anpok_just updated it09:32
=== alan_g|tea is now known as alan_g
alf_anpok_: Can't we make main_loop_thread atomic<> instead of locking it? As it is, the code may deadlock if e.g. a synchronous action from execute() calls AsioMainLoop::stop()09:45
anpok_hmmm09:48
anpok_hm09:49
anpok_ok then i would reset the main_loop_thread before stopping the io service09:50
anpok_to ensure that nobody queues in further handlers/actions that might not get executed09:50
anpok_the lock is more about the stop() than about the reset09:51
alf_anpok_: so, to make sure I understand correctly:09:55
alf_anpok_: 150+ if (data->state.compare_exchange_strong(expected_state, mir::time::Alarm::triggered)) , is what guards us from an alarm event that was enqueued asynchronously after cancel() was called?09:57
alf_anpok_: e.g. we call cancel and enqueue a cancellation handler, but meanwhile the alarm gets triggered and enqueues an alarm handler09:58
anpok_the cancel op is first10:05
anpok_it will remove the strong ref from the timer object10:05
anpok_when the alarm handler is called it will fail in getting a shared_ptr10:05
davmor2Hey guys I'm ready to start testing silo 006 the qt comp only I've been informed that the latest version didn't build is there anyone that can double check that before I waste my time trying to install and test it please?10:05
alan_ggreyback: were you dealing with the silo? ^10:07
davmor2greyback: looking at it, it might have built for arm and just failed elsewhere but I just wanted to be sure10:08
anpok_alf_: i would love to get rid of the thread id mutex - if we could guarantee that any outstanding operation is executed during stop - and an attempt to restart during the stop procedure is avoided (<-why did I think this is necessary?)10:08
anpok_during or before stop completes10:09
alf_anpok_: "the cancel op is first, it will remove the strong ref from the timer object", where does it do that?10:14
anpok_oh you are right10:16
anpok_I have seen too many versions of that part10:17
anpok_it just changes the state10:17
anpok_so cas will fail and no handler will be executed10:18
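[Editor's sketch] The guard alf_ and anpok_ settle on, where cancel only changes the state so that the handler's compare-and-swap fails, can be sketched like this. The names `State`, `AlarmData`, `request_cancel` and `on_timer` are assumptions made for illustration; only `compare_exchange_strong` and the idea of a `triggered` state come from the discussion.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical alarm states; the branch under review may use different ones.
enum class State { pending, cancelled, triggered };

struct AlarmData
{
    std::atomic<State> state{State::pending};
    int handler_runs{0};
};

// Cancel only flips pending -> cancelled; it drops no references.
bool request_cancel(AlarmData& data)
{
    State expected = State::pending;
    return data.state.compare_exchange_strong(expected, State::cancelled);
}

// Timer handler: runs the callback only if it wins pending -> triggered.
bool on_timer(AlarmData& data)
{
    State expected = State::pending;
    if (!data.state.compare_exchange_strong(expected, State::triggered))
        return false;  // already cancelled (or already triggered): do nothing
    ++data.handler_runs;
    return true;
}

// A handler enqueued before the cancel, but executed after it.
int handler_runs_after_cancel()
{
    AlarmData data;
    request_cancel(data);
    on_timer(data);
    return data.handler_runs;
}

// The same handler with no cancel in between.
int handler_runs_without_cancel()
{
    AlarmData data;
    on_timer(data);
    return data.handler_runs;
}
```

Whichever of the two operations performs its compare-and-swap first wins, so a late alarm handler becomes a no-op once the state has left `pending`.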
greybackdavmor2: silo is rebuilding, a recent unity8 release means silo6 is a little out of date10:20
davmor2greyback: awesome thanks,  Didn't want to waste half a day to realise I hadn't actually tested what needed testing :)10:21
greybackdavmor2: I'd love your testing feedback at some stage however. Can I ping you when silo ready?10:22
davmor2greyback: yeap sure I'm setup today for just testing this silo on manta mako and flo10:24
greybackdavmor2: magic. Silo6 does work currently, but you have to carefully specify packages with this http://pastebin.ubuntu.com/7774465/ - you might be better off waiting for the rebuild tho10:29
davmor2greyback: I can wait there is other stuff I need to get on with too :)10:30
greybackdavmor2: ack10:31
=== MacSlow is now known as MacSlow|lunch
davmor2greyback: hmm silobot tells me that the packages are built now ;)11:08
greybackdavmor2: huh, it didn't ping me11:09
davmor2greyback: see #ubuntu-ci-eng11:10
* greyback doubts his irc client now11:10
greybackaha there we are11:10
davmor2haha11:10
=== alan_g is now known as alan_g|lunch
=== ara is now known as Guest56630
=== chihchun is now known as chihchun_afk
alf_greyback: Trying out QtComp on N4, works well. Some notes that I am not sure if they are problems:11:56
greybackalf_: great, please share!11:57
alf_greyback: The icons in the launcher seem strange, at least slightly different from what I remember with previous unity8. e.g. some icons have the bottom left corner cut off11:58
alf_greyback: launcher => the bar you swipe in from the left11:58
greybackmzanetti: is that the new design^^11:58
mzanettigreyback: alf_: yes it is. It indicates which icons are pinned to the launcher11:59
mzanettirecent (unpinned) ones won't have the corner clipped11:59
mzanettiand yes, the whole launcher got a new design :)11:59
alf_greyback: mzanetti: also the ubuntu/home icon is now a full rectangle?11:59
greybackmzanetti: how are you doing that corner clip?12:00
mzanettiyeah... I'm not yet used to that either...12:00
mzanettigreyback: shadereffect12:00
greybackmzanetti: ok12:00
alf_greyback: I also only see the apps scope, is this normal, or have I messed up something?12:00
mzanettialf_: are you?12:01
mzanettialf_: note, scopes have a new header too :)12:01
mzanettialf_: tried swiping left/right?12:01
alf_mzanetti: nothing happens12:01
mzanettihmm... ok... *should* work12:01
greybackactually same here12:01
alf_mzanetti: I see the new header with search icon on the right12:01
mzanettiis this QtComp?12:02
greybackyeah12:02
mzanettiit was working here before when I tried the merge. lemme check12:02
alf_greyback: mzanetti: not related to qtcomp, more of a design issue, but I find the various different ways that you go "back" distracting12:05
alf_greyback: mzanetti: not much consistency there12:06
mzanettialf_: hmm... example?12:06
mzanettialf_: afaik we only have the one back button at the upper left corner (except for apps that haven't been updated yet)12:06
mzanettiwhich shouldn't be many any more12:07
mzanettigallery app is one of them12:07
alf_mzanetti: right, that's the one I was thinking of12:08
mzanettialf_: yeah... that's gallery app being outdated12:08
=== MacSlow|lunch is now known as MacSlow
alf_mzanetti: but on a related note, the upper left corner is not very easy to reach if you are holding the phone with one hand (the right)12:09
mzanettialf_: I agree. there has been a discussion on the ubuntu-phone mailing list12:10
mzanettialf_: seems that's the only place where users don't fail to find it12:10
mzanettidon't ask me why12:10
mzanettialf_: ogra_ even dropped his phone multiple times because of this :D12:11
alf_mzanetti: since I am play testing... have there been any discussions to reduce applications start up times, perhaps preload some common ones (e.g. waiting 2-3 seconds for the dialer to come up is painful)12:13
ogra_mzanetti, yeah !12:14
mzanettialf_: there has been a discussion although I don't know the state/outcome.12:14
ogra_there wasnt any12:14
ogra_design said "this is how we designed it" ... no further discussion happened12:14
mzanettiright, for the back button. yep, that's what it was mostly12:15
mzanettifor the preloading though I think architects had some thoughts about it12:15
anpok_well, we should look what takes so much time - i.e. it could be something trivial like shader compilation..12:23
mzanettilots of it is QML compilation12:25
mzanettithere have been thoughts about precompiling QML12:25
=== pete-woods is now known as pete-woods_lunch
=== alan_g|lunch is now known as alan_g
=== chihchun_afk is now known as chihchun
=== pete-woods_lunch is now known as ptet-woods
AlbertAalf_: so there is one scenario in TimeoutFrameDroppingPolicy14:43
AlbertAalf_: where an alarm can be rescheduled after it was cancelled14:43
AlbertAalf_: if Thread A calling swap_now_blocking is pre-empted right after if (pending_swaps++ == 0)14:44
AlbertAalf_: and Thread B calls swap_unblocked, cancels alarm and decrements pending_swaps14:45
AlbertAthen Thread A will reschedule the timer, which can potentially lead to the assert(pending_swaps.load() > 0); triggering14:45
AlbertAbut I think converting that assert into just a return should cover that...14:46
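[Editor's sketch] The interleaving AlbertA describes can be replayed deterministically on one thread to show what "converting that assert into just a return" would do. `PolicySketch` and `replay_race` are hypothetical; only `pending_swaps` and the shape of the check are taken from the discussion, and a boolean stands in for the real alarm.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the frame dropping policy.
struct PolicySketch
{
    std::atomic<int> pending_swaps{0};
    bool alarm_scheduled{false};
    int frames_dropped{0};

    // Alarm handler that tolerates a spurious timeout
    // instead of assert(pending_swaps.load() > 0).
    void on_timeout()
    {
        alarm_scheduled = false;
        if (pending_swaps.load() <= 0)
            return;  // nothing is blocked any more: ignore the timeout
        ++frames_dropped;
        if (--pending_swaps > 0)
            alarm_scheduled = true;  // more swaps waiting: re-arm
    }
};

// Replays the race step by step and returns the number of dropped frames.
int replay_race()
{
    PolicySketch policy;

    // Thread A: increments the counter, then is pre-empted.
    bool const first_swap = (policy.pending_swaps++ == 0);

    // Thread B: swap_unblocked cancels the alarm and decrements.
    policy.alarm_scheduled = false;
    --policy.pending_swaps;

    // Thread A resumes and (re)schedules the alarm.
    if (first_swap)
        policy.alarm_scheduled = true;

    // The alarm fires although no swap is pending.
    policy.on_timeout();
    return policy.frames_dropped;
}
```

With the early return the stray timeout drops no frame; with the original assert it would abort at this point. The check-then-act in `on_timeout` is itself not atomic, so the real code would still need care.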
dobeyAlbertA: for #1337481 i wonder if just doing a no-change rebuild in the archive might fix it? (though of course it won't prevent it from possibly recurring in the future)14:52
AlbertAdobey: it's something I wanted to try for sure...we are about to spin 0.4.1 though...so perhaps that would cover it14:56
AlbertAdobey: I hate these heisenbugs...14:56
davmor2kgunn: osk is playing up in Qtcomp trying to put my finger on why also I only see the apps scope I don't seem to be able to change14:58
dobeyAlbertA: yeah, i know what you mean14:58
greybackdavmor2: apps scope bug is our bug, we've a fix on the way14:59
davmor2greyback: right nice14:59
greybackdavmor2: osk bug?14:59
davmor2greyback: what I've discovered is that osk will randomly stop working in some apps.  Like messaging app I now can't get it to raise, The text box goes white but the cursor doesn't appear in the text field15:03
kgunndavmor2: might want to make doubly sure its qtcomp...it was acting up like hell on N7 for me in the virgin image (y'day)15:06
kgunnmessaging and phone app very very wonky specifically on n7 virgin image also15:07
=== mhall119_ is now known as mhall119
=== chihchun is now known as chihchun_afk
=== chihchun_afk is now known as chihchun
=== chihchun is now known as chihchun_afk
=== chihchun_afk is now known as chihchun
davmor2kgunn: okay so everything I can test seems to be working,  Only the keyboard issue that I've hit.  I'll start digging into that and see if I can get anything useful from logs etc.  I would need the fix for scopes to land to actually continue testing though.  Apps only currently :)16:43
kgunndavmor2: thanks for testing!16:44
=== alan_g is now known as alan_g|EOD
popeykgunn: looks like latest mir broke the music app (again). Unplug phone, start music app, press play, let phone go dark, it doesn't continue to the next track, but does when you wake the phone.17:26
popeybug 1292306 related17:26
ubot5bug 1292306 in mir (Ubuntu) "Qt render gets blocked on EGLSwapBuffers [fka Upon upgrading to Qt5.2 the music app no longer plays the next song if the screen is off]" [Critical,Fix released] https://launchpad.net/bugs/129230617:26
kgunnpopey: are you playing local music ?17:27
popeyyes17:27
kgunnhmmm....17:27
popeyahayzen: music dev discovered and I confirmed it17:27
kgunnmir hadn't changed since image 11017:27
kgunnpopey: are you sure its mir  ^17:28
popeywell it felt like *that* bug ☻17:28
ahayzenkgunn, how can we tell if it is/isn't mir?17:28
kgunnpopey: and we sure hadn't touched that particular part of the mechanism (....so....much....pain)17:28
popeyheh, i hear you!17:29
kgunnahayzen: when did it start happening ?17:29
kgunni assume this gets tested every image17:29
popeynot sure it does ⍨17:29
ahayzenkgunn, 'recently' ... no we don't have any automated testing on this :/17:29
popeyi have it on the devel image17:29
kgunnpopey: i do know that jhaddop and the boys/girls were changing stuff in their area related to this....but not sure what...i can go retro and test an image to see if it was mir...but nothings changed since 11017:32
kgunnbtw i need to run17:32
popeyok, thanks kgunn17:33
ahayzenthanks kgunn17:33
popeyahayzen: lets get a new bug filed for this, and get it on the radar17:33
* popey moves to -ci-eng17:33
ahayzenpopey, agreed17:33
anpokAlbertA: regarding the deadlock17:37
anpokdisplay needs to be off?17:37
anpokto experience it17:38
AlbertAanpok: it needs to be off, and then you need to hit the power key again to start the compositor17:38
anpokoh seems like I just experienced a different problem17:38
AlbertAso if you get lucky17:38
AlbertAthe timeout will be executing as the compositor starts and calls swap_unblocked17:38
AlbertAwhich will deadlock17:38
AlbertAanpok: oh yea?17:39
anpokn10 with qtcomp branch .. freezes17:39
AlbertAjust in normal use?17:39
anpokfor a few seconds then continues17:39
anpokhmm during animations17:39
anpokfrom app to shell17:40
anpokor on application startup17:40
AlbertAanpok: I see....17:40
anpokhm this is new .. n10 was working extremely fluid yesterday17:42
=== chihchun is now known as chihchun_afk
=== chihchun_afk is now known as chihchun
greybackanpok: yeah, I'm seeing it now too. First time I've ever experienced that, wtf is making everything just block?18:30
greybackhmm, wonder if the snapshotting is to blame18:34
greybackI see blocking in libdbus which I didn't expect either18:35
AlbertAanpok: so your branch https://code.launchpad.net/~andreas-pokorny/mir/synchronous-cancel-of-alarms/+merge/22453020:18
AlbertAanpok: would not resolve the deadlock for https://bugs.launchpad.net/mir/+bug/133970020:19
ubot5Ubuntu bug 1339700 in Mir 0.4 "[regression] Device locks randomly on welcome screen" [High,In progress]20:19
AlbertAanpok: since cancel is synchronous20:19
AlbertAi.e. Thread A (the one executing the ServerActionQueue.. may be executing the alarm handler for TimeoutFrameDroppingPolicy20:21
AlbertAthe policy callback then will try to acquire BufferQueue::guard20:22
AlbertAlet's say there's thread B, calling BufferQueue::compositor_release20:23
AlbertAwhich owns BufferQueue::guard20:23
AlbertAand attempts to cancel the alarm (due to framedrop_policy->swap_unblocked();)20:23
AlbertAwhich will wait indefinitely since it's synchronous and won't get executed until AsioMainLoop::process_server_actions returns from the alarm handler20:25
AlbertAso deadlock...20:25
AlbertAanpok: but I think if we expose the async_cancel api, we can make use of that to avoid this condition. The alarm handler in TimeoutFrameDroppingPolicy20:31
AlbertAcan deal with spurious calls....20:32
AlbertAmaybe....I need to think about it some more....20:32
=== chihchun is now known as chihchun_afk
anpokAlbertA: hm but it wont use the queue in that case20:50
AlbertAanpok: ? Thread B? but that's the compositor thread20:51
anpokah ok I have to read that again20:51
anpokAlbertA: yes you are right this was an attempt to keep up the synchronous api20:55
anpokbut it seems that is the actual mistake20:56
AlbertAanpok: so the reason for queuing them up for server action queue,20:56
AlbertAis due to timer.cancel() not guaranteeing that there will be no more handlers invoked?20:57
anpokyes20:57
AlbertAok20:57
anpoktimer.cancel tries to provide a synchronous api without the guarantees20:58
anpoki.e. cancel may return, and may destroy other related objects, previously referenced by the timer callback, and another thread is scheduled and executes the timer20:58
anpoki.e. happened in ~TimeoutFrameDroppingPolicy20:59
anpokgreyback: yes there are unity8 logs about snapshotting21:00
greybackanpok: actually I think it is a problem with dbus21:00
anpokbut only in the app -> phone shell switch cases..21:00
greybackanpok: connecting with strace, unity8 is continually polling for something, and when it tries to send a dbus message, blocks for 25seconds before timing out (then tries again I think)21:01
greybackdbus messages are sent for app focus changes21:01
anpokand that inside the rendering thread?21:02
anpokor the event thread block rendering again?21:02
greybackanpok: event thread blocks21:02
anpokyay!21:02
greybackwhy dbus is failing I don't understand21:02
anpokwas there a recent change?21:03
greybackbut I see in my unity8 log that it crashed the first time when trying to connect to the dbus socket21:03
anpokmy n4 image is a bit older than the n10 image21:03
anpokonly see it there21:03
greybackno relevant recent change actually21:03
greybackI only see this on N10, not N4/721:03
greybackperhaps a race somewhere21:03
anpokAlbertA: i am not sure - there are a few synchronisation points like destructors - there we need to have synchronous behavior. Apart from that queuing or working with completion handlers seems simpler.21:08
AlbertAalso it looks like the users of alarm need to protect it externally21:10
AlbertAi.e. like if one thread is doing reschedule_in and another trying to cancel21:10
AlbertAwell not in the traditional sense I suppose21:11
AlbertAalarm state itself will be fine21:11
=== Guest36942 is now known as renato__
AlbertAanpok: I think the branch looks fine, except for line 24421:17
AlbertAdata = std::make_shared<InternalState>(data->callback);21:17
AlbertAUSC will update the timer repeatedly to reset the inactivity timer during motion events21:17
AlbertAI'm concerned about the overhead21:17
kgunni love it when i type reboot in the wrong window21:18
AlbertAkgunn: I hate it...:)21:18
kgunnso annoying21:18
AlbertAI've gotten used to adb shell reboot instead.... <= workaround21:18
kgunntotally...21:19
kgunni had a moment of weakness :)21:19
anpokAlbertA: wait until your system exposes an adb service that can be used from your phone21:19
AlbertAanpok: ha21:19
anpokAlbertA: hm could be replaced by a different mechanism21:21
anpokmaybe a configuration counter? to differ between two pending states?21:21
anpokand the action inside the queue stores the expected pending counter?21:21
AlbertAlike adding a pending_state ?21:25
anpokhm there already is?21:27
anpoki meant something to detect inside the queue whether the alarm object is out of sync with the currently executed action21:27
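[Editor's sketch] One way to read anpok's "configuration counter" idea: each reschedule or cancel bumps a counter, and every queued action remembers the value it was created with, so a stale action can detect that the alarm has moved on. This is only a guess at the intent. `CountedAlarm`, `ActionQueue` and the helper functions are hypothetical, and the queue is a plain `std::deque` rather than Mir's server action queue.

```cpp
#include <atomic>
#include <cassert>
#include <deque>
#include <functional>

using ActionQueue = std::deque<std::function<void()>>;

// Hypothetical alarm that tags queued actions with a generation counter.
struct CountedAlarm
{
    std::atomic<unsigned> generation{0};
    int fired{0};

    // Each reschedule bumps the counter and queues an action that
    // remembers the value it expects to see when it finally runs.
    void reschedule(ActionQueue& queue)
    {
        unsigned const expected = ++generation;
        queue.push_back([this, expected]
        {
            if (generation.load() != expected)
                return;  // stale action: the alarm was reconfigured since
            ++fired;
        });
    }

    // Cancel also bumps the counter, which invalidates every queued action.
    void cancel() { ++generation; }
};

int fired_after_three_reschedules()
{
    ActionQueue queue;
    CountedAlarm alarm;
    alarm.reschedule(queue);
    alarm.reschedule(queue);
    alarm.reschedule(queue);
    for (auto& action : queue)
        action();
    return alarm.fired;
}

int fired_after_cancel()
{
    ActionQueue queue;
    CountedAlarm alarm;
    alarm.reschedule(queue);
    alarm.cancel();
    for (auto& action : queue)
        action();
    return alarm.fired;
}
```

Compared with allocating a fresh `InternalState` on every reschedule, this costs one atomic increment per reschedule, which may matter for the repeated inactivity-timer resets AlbertA mentions.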
AlbertAanpok: so the only reason I see for data being a shared_ptr is for lifetime issues...which should be now addressed with the synchronous cancel no?21:59
AlbertAdo we really need auto data = possible_data.lock();21:59
AlbertAanpok: I mean basically this comment22:01
AlbertAhttp://paste.ubuntu.com/7777346/22:01
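[Editor's sketch] For readers unfamiliar with the `possible_data.lock()` idiom AlbertA is questioning: a handler that holds only a `std::weak_ptr` has to promote it before use, and the promotion fails once the owner has released the state. The sketch below reuses the names `InternalState` and `possible_data` from the discussion; everything else is hypothetical and not the branch's actual code.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical shared state between an alarm and its queued handler.
struct InternalState
{
    int runs{0};
};

// The handler captures a weak reference, so it cannot keep the state alive.
std::function<bool()> make_handler(std::weak_ptr<InternalState> possible_data)
{
    return [possible_data]
    {
        auto data = possible_data.lock();  // empty if the owner is gone
        if (!data)
            return false;
        ++data->runs;
        return true;
    };
}

bool handler_runs_while_owner_alive()
{
    auto data = std::make_shared<InternalState>();
    auto handler = make_handler(data);
    return handler();
}

bool handler_runs_after_owner_gone()
{
    auto data = std::make_shared<InternalState>();
    auto handler = make_handler(data);
    data.reset();  // the alarm is destroyed before the handler runs
    return handler();
}
```

If a synchronous cancel really guarantees that no handler runs after the alarm is destroyed, the weak reference is redundant, which is the point AlbertA raises.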
=== chihchun_afk is now known as chihchun
